What time do you call that?

Published in Automated Trader Magazine Issue 18 Q3 2010

Sub-microsecond latency is only impressive if you’re sure all your server clocks are showing the right time. Victor Yodaiken, CEO of FSMLabs, brings us up to date with the latest developments in network-wide clock-watching.

Victor Yodaiken

Trading system computers may be loaded with extreme processing power, but they are not very good at keeping track of the time of day. The clocks on computers that run trading applications are actually quite unstable - the time they report can drift substantially during the course of even a few minutes.

Our personal PCs use clocks that are not particularly accurate and most reach out to networks for time checks, just as our cell phones do. But neither our PCs nor the application processors of our mobile phones need to track time accurately in increments of microseconds.

The servers that drive trading applications are also dependent on external sources for accurate timekeeping. But there the similarity with our stand-alone electronics ends. The networked environment of server farms works only when time is tightly synchronized across all servers. The complexity of delivering time information from its source - whether a Global Positioning System (GPS) clock or a local cell tower - adds up, because microsecond or nanosecond accuracy is affected by the routing of the data from time servers through networks to application servers and eventually to trading applications.

If the time is off, even by a little bit, there can be significant and sometimes startling consequences. The timestamps on incoming market data, especially when matched with test "pings" out to the data source, inform the trading systems of latency with venues where orders may have to be placed or canceled. Wrong estimates of latency can cause missed profit opportunities or, possibly worse, inability to cancel standing orders quickly enough when the market changes or the orders are filled elsewhere.
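To make the effect on latency estimates concrete, here is a minimal sketch. It assumes the venue's timestamps are correct and that the local clock runs ahead by a fixed amount; the function name and all figures are hypothetical.

```python
def measured_one_way_latency_us(send_ts_us, recv_ts_us):
    """One-way latency as seen by the receiver: local receive time minus venue send time."""
    return recv_ts_us - send_ts_us

# Hypothetical figures, chosen only for illustration.
true_latency_us = 150          # actual time the message spent in transit
local_clock_error_us = 400     # local clock runs this far ahead of reference time

send_ts_us = 1_000_000         # stamped by the venue (assumed correct)
recv_ts_us = send_ts_us + true_latency_us + local_clock_error_us  # stamped locally

measured = measured_one_way_latency_us(send_ts_us, recv_ts_us)
error = measured - true_latency_us  # the whole clock error lands in the estimate
```

In this sketch the clock error passes straight into the latency figure, so a venue that is 150 microseconds away appears to be 550 microseconds away.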

Beyond best-execution issues, faulty timekeeping can cause anomalies like high-speed trading events being recorded out of the order in which they occurred - a regulatory and litigation risk issue in audits.

Imagine if you had a wristwatch that kept drifting off time and you depended on a friend across town to send you the correct time by courier. When the message arrived, you'd have to try to figure out how long the courier had taken to carry it to you and correct for that. You could use the couriered information plus the estimate of courier travel time to periodically reset your faulty wristwatch and then use your wristwatch time between updates. Moreover, you could always approximate the correct time by correcting for the known error in your wristwatch.

For example, if you knew it had been 10 minutes since the last update and your watch gained a second every minute, subtracting 10 seconds from the current time on your wristwatch would provide a reasonable estimate. The passage of reference time data as it moves through wires, routers, into servers and then, at some point, as it is picked up by the application, is a similar scenario but with much, much, smaller time units.
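The wristwatch arithmetic can be written out as a short sketch. The function and parameter names are hypothetical, and the correction is the same simple approximation used in the example: elapsed minutes multiplied by the known gain per minute.

```python
def corrected_reading_s(watch_reading_s, minutes_since_sync, gain_s_per_min):
    """Approximate the true time by removing the drift accumulated since the last update."""
    accumulated_drift_s = minutes_since_sync * gain_s_per_min
    return watch_reading_s - accumulated_drift_s

# The watch shows 12:10:00 (in seconds since midnight), was last reset
# 10 minutes ago, and gains one second every minute.
watch_reading_s = 12 * 3600 + 10 * 60
corrected = corrected_reading_s(watch_reading_s, minutes_since_sync=10, gain_s_per_min=1.0)
```

A server clock would apply the same idea with drift rates measured in microseconds per second rather than seconds per minute.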

Drifting away from reference time is not merely likely, it is inevitable. These delays can easily add milliseconds of uncertainty and, if they are not corrected, can totally defeat systems that are designed to act on decision-making processes gauged in microseconds.


The technology behind timekeeping

For more than two decades, the standard approach to managing time in a networked environment has been software based on Network Time Protocol (NTP), which was developed to synchronize time over networks in the early years of the Internet when microsecond-level accuracy was barely imagined. NTP is now ubiquitous, often unnoticed, and has a number of major problems. It is complex to configure properly, tends to overcorrect and "lurch" when it is given reference time updates, is easily disrupted, and can be shown to be so slow to synchronize with reference time that most of a day can go by before downstream servers are locked to the reference time.

A new protocol, Precision Time Protocol (PTP), also called 1588 after its IEEE standard number, promises more precision based on emerging support from new time-enabled chips in routing equipment, but it does not make the problems that bedevil NTP just go away. If we go back to the wristwatch and couriered update scenario, PTP provides additional timing checks on the courier packet, such as the time the packet arrived at your building, how long the courier was stuck at lights, and so on. That information can improve your estimate of how long the package was in transit, but it's no magic bullet.
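The article does not spell out the arithmetic, but the core estimate can be sketched with the standard four-timestamp calculation used by NTP and by PTP's delay request-response exchange. The sketch assumes the network delay is the same in both directions; when it is not, the asymmetry shows up as an error in the offset. All names and values are illustrative.

```python
def offset_and_delay_ns(t1, t2, t3, t4):
    """Estimate clock offset and one-way delay from a two-way exchange.

    t1: time server sends its message     (server clock)
    t2: client receives it                (client clock)
    t3: client sends its reply            (client clock)
    t4: time server receives the reply    (server clock)
    Assumes the path delay is symmetric.
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2   # client clock minus server clock
    delay = ((t2 - t1) + (t4 - t3)) / 2    # mean one-way path delay
    return offset, delay

# Illustrative exchange: the client clock is 500 ns ahead and each leg takes 2,000 ns.
t1 = 1_000_000
t2 = t1 + 2_000 + 500      # arrival stamped by the fast client clock
t3 = 1_010_000
t4 = t3 - 500 + 2_000      # arrival stamped by the server clock
offset_ns, delay_ns = offset_and_delay_ns(t1, t2, t3, t4)
```

Hardware timestamping improves t1 to t4 themselves by taking them closer to the wire; it does not remove the symmetry assumption.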

Other approaches to pinning down the correct time have included locating GPS radios directly inside application servers, rather than depending on time distributed from a time server. But besides the cost involved in the additional hardware, there can be challenges in getting the time signal into the data center. For example, it may be necessary to run a special antenna from a basement data center to the roof of the building to capture a GPS signal. Cell towers can be accessed with smaller antennas, but the time can be off as much as 10 microseconds from true reference time. When it comes to attempting to locate time devices inside servers, additional problems such as power, cooling and space in server racks arise. And even when timing devices can be located within the application server, there is still the "last mile" problem of getting time to applications - especially when there are reasons not to modify application code.

Modern solutions for various objectives

Today the state of the art in timekeeping comes down to two fundamentally different approaches. One is focused on the passage of time data through the network. The other is focused on the accuracy of timekeeping at the application level. Each approach serves a different objective.

Network monitoring solutions today claim the highest degree of granularity - often low nanoseconds - in timestamping relative to source time. This is largely because they are fundamentally hardware-based solutions linking the most advanced GPS receivers with "packet sniffing" probes implemented at key points along the network to capture and timestamp time data being distributed to application servers.

The ability to compare the differences between true reference times at different points in the network provides fine-grained latency measurements that are reported to technical staff overseeing the trading systems. This analytical data may be used for optimizing the network or fine tuning automated execution systems.
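As a rough sketch of that comparison, suppose each probe records when it saw the same packet, and that all probes are locked to the same reference time. The probe names and timestamps below are hypothetical.

```python
def hop_latencies_ns(observations):
    """Return the latency of each hop from (probe_name, timestamp_ns) pairs in path order."""
    return [
        (f"{a_name}->{b_name}", b_ts - a_ts)
        for (a_name, a_ts), (b_name, b_ts) in zip(observations, observations[1:])
    ]

# One packet observed at three hypothetical probe points along its path.
packet_path = [
    ("time_server", 0),
    ("core_switch", 1_200),
    ("app_server_nic", 3_050),
]
hops = hop_latencies_ns(packet_path)
```

The subtraction is only meaningful because the probes share a reference time; any disagreement between probe clocks would appear as phantom latency on a hop.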

Application-oriented solutions are designed to enable maximum accuracy in timestamping within the application by allowing the application to request accurate time from the operating system. This focus on the "last mile" is essential if the application is to be able to use the time data immediately to feed its own business logic and decision-making processes - such as selecting optimal venues for trading based on real-time changes in communications latency.
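One way an application might act on such measurements is sketched below. It keeps a smoothed latency estimate per venue and picks the lowest. The smoothing method (an exponentially weighted moving average), the venue names and the sample values are all assumptions for illustration, not a description of any particular product.

```python
def update_latency(estimates, venue, sample_us, alpha=0.2):
    """Blend a new latency sample into the venue's running estimate (EWMA)."""
    previous = estimates.get(venue, sample_us)  # first sample seeds the estimate
    estimates[venue] = (1 - alpha) * previous + alpha * sample_us
    return estimates[venue]

def fastest_venue(estimates):
    """Return the venue with the lowest current latency estimate."""
    return min(estimates, key=estimates.get)

estimates = {}
samples = [
    ("VENUE_A", 180.0),
    ("VENUE_B", 150.0),
    ("VENUE_A", 120.0),
    ("VENUE_B", 400.0),   # a sudden slowdown at VENUE_B
]
for venue, sample_us in samples:
    update_latency(estimates, venue, sample_us)

choice = fastest_venue(estimates)
```

A ranking like this is only as good as the timestamps behind the samples, which is why accurate time inside the application matters.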

Delivering accurate time to the applications in networked servers is a challenge requiring intelligent analysis of the physics of network transmission and local clock drift, adapting the content of the time data to sensed changes in the quality of the network, and then calculating delays in the operating system and network stack. Currently, state-of-the-art methods for overlaying this moderated time on top of the software infrastructure of NTP or via PTP can produce accuracy in the single-digit microseconds.

The level of timekeeping accuracy is constantly being improved as the sensing algorithms are refined and as hardware support from the new time chip-enabled routers is added. The emerging use of Precision Time Protocol, along with tighter hardware linkages with advanced time server hardware, promises to bring in-application timestamping to the single microsecond range of accuracy or better in the near future. But the promise of a solution coming in the near future doesn't make this an issue you can safely overlook.

As development continues, the net result for the trading environment is faster and more intelligent order routing and much more advanced algorithms - more advanced not least because they are being provided with more accurate information about time. In fact, the quality of time data is a new frontier that is attracting the attention of the most aggressive and technologically advanced trading firms in the industry, because it makes a significant difference between success and failure when the speed of trading activity outstrips the computer clocks' ability to nail down the time accurately.

Victor Yodaiken is CEO of FSMLabs, which develops and markets real-time technology to enhance the performance of off-the-shelf operating systems, including TimeKeeper®, an application-centric time distribution software system currently in use by leading trading institutions.