Over the last year, the trading industry has witnessed an increase in the demand for Precision Time Protocol (PTP). This has been driven partly by compliance, but also by a desire for firms to obtain a common time reference from which they can accurately deduce the precise sequence of - and hence causality within - events occurring across their distributed systems.
Corvil is a vendor with many years of experience in the timestamping and analysis of network packets, primarily for performance monitoring, and has a keen interest and expertise in the use of clock synchronisation technologies. As our customers have rolled out PTP, we have been able to help them validate accuracy and to troubleshoot the various causes of jitter and other deployment issues they have faced. In this article, we are going to explore a number of the issues with time synchronisation that we have witnessed first-hand.
Figure 01: PTP distributing a GPS time signal to a variety of end-systems
Before we start looking at some specific examples, let's first review the typical architectures in which PTP is deployed and the mechanisms that PTP uses to distribute time.
Firstly, a deployment requires an accurate clock source and GPS is the approach chosen by most users. A GPS antenna installed on the roof of the building receives signals from a constellation of satellites that are broadcasting atomic time as part of their primary function of providing global positioning. As a side note, GPS is operated by the United States government, whereas GLONASS and Galileo are the equivalent Russian and European systems.
Figure 02: Ideal deployment of PTPv1 with PTP enabled on all network devices
The GPS hardware on site typically supplies atomic time to a grandmaster clock, which uses it to discipline its own internal clock. As part of this process, corrections should also be made in software to compensate for the propagation delay along the cable between the antenna and the grandmaster - roughly five nanoseconds per metre. Side note: atomic time differs slightly from UTC because the speed at which the earth rotates is changing; in fact it is slowing slightly, so periodically a leap second is added to UTC to correct for this effect. Currently TAI is 37 seconds ahead of UTC, the last correction having occurred on 31 December 2016.
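The cable compensation above is simple arithmetic. A minimal sketch, where the ~5 ns/m figure is from the article and the cable length is purely illustrative:

```python
# Sketch of the antenna-cable compensation described above; the ~5 ns/m
# figure is from the article, the 60 m cable length is illustrative.
NS_PER_METRE = 5  # propagation delay in coaxial antenna cable

def cable_delay_ns(cable_length_m):
    """Delay the grandmaster should subtract from the received GPS signal."""
    return cable_length_m * NS_PER_METRE

# A 60 m antenna run delays the GPS signal by 300 ns.
print(cable_delay_ns(60))  # 300
```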
The grandmaster's primary role is then to distribute accurate time to relevant downstream devices such as application servers, intrusion detection systems (IDSs), monitoring systems and network packet brokers, to name a few.
The mechanisms for distributing time from a grandmaster are typically Network Time Protocol (NTP), PTP or Pulse per Second (PPS). While NTP can synchronise systems reliably to 100μs or better, the way it exchanges timestamps is susceptible to jitter in the UDP/IP stack; this is a limiting factor in the standard implementations of NTP. In contrast, PTP uses hardware timestamping in end-systems that avoids this jitter, which allows it to achieve sub-microsecond accuracy. It is for this reason that most - but not all - implementations within electronic trading rely on PTP. PPS arguably provides the most accurate signal but, being a 1Hz square wave delivered over a dedicated coaxial cable, does not scale as an appropriate delivery mechanism beyond a handful of devices in local proximity.
Figure 03: Details of the exchange of PTP messages
What sort of accuracy can be expected from a combined GPS and PTP solution? GPS should provide accuracy to within 40 nanoseconds and, as illustrated in Figure 01, a good PTP deployment should be better than 100 nanoseconds when using PTP hardware-assisted switches.
Let's take a minute to briefly recap how PTP works and understand in more detail what happens within a PTP-aware network. We are going to focus on what is still the most prevalent deployment model, the boundary clock. It is worth mentioning that in 2008 the Institute of Electrical and Electronics Engineers (IEEE) standardised an expanded version of the protocol, known as PTPv2, which introduced support for transparent clocks, but to date the adoption of this approach by switch vendors has been limited.
Each pair of devices in a PTPv1 network establishes a master-slave relationship, with a master being responsible for forwarding its time signal to all of its slaves. Each master is in turn a slave to its upstream master, in a hierarchy terminating in the grandmaster, which is the ultimate source of the network's time signal. Each PTP message is sent in a User Datagram Protocol (UDP) multicast packet and is timestamped in hardware by both the sending and receiving interfaces. An important design consideration is that these are point-to-point connections between each PTP-enabled device, and there should be no intermediate hardware on these links. This ideal deployment mode is sketched in Figure 02.
Now let's consider what happens between a master and slave and how time is accurately propagated downstream. As shown in Figure 03, the upstream master first sends a 'Sync message' for which it records a timestamp of when the message is first serialised onto the link. If it is not able to write that timestamp into the Sync message as it is being sent, it will send a second 'Follow-Up' message that contains the timestamp of when the original Sync message was sent.
Figure 04: Location of the four timestamps used by the PTP slave to calculate its offset from the master
The slave receives the Sync message (and the optional Follow-Up message) thus learning the time from the upstream master. However, what now needs to happen is a correction for the propagation delay between the master and slave. This is achieved by the 'Delay Request' and 'Delay Response' messages, whose exchange is initiated by the slave.
After the slave has received the Delay Response message, it has the four timestamps pin-pointed in Figure 04 that it can use to calculate its offset from the upstream master and thus correctly align its clock to that master.
Master-to-slave difference = T2 - T1
Slave-to-master difference = T4 - T3
One-way latency = (master-to-slave difference + slave-to-master difference) / 2
Offset = master-to-slave difference - one-way latency
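The four-timestamp calculation can be sketched in a few lines. The timestamps here are illustrative nanosecond values, and the names follow this article rather than any particular PTP implementation:

```python
# Minimal sketch of the offset calculation above. Timestamps are
# illustrative nanosecond values; names follow the article, not any
# particular PTP stack.
def ptp_offset(t1, t2, t3, t4):
    """Slave's offset from its master, from the four PTP timestamps.

    t1: master sends Sync          (master clock)
    t2: slave receives Sync        (slave clock)
    t3: slave sends Delay Request  (slave clock)
    t4: master receives Delay Request (master clock)
    """
    master_to_slave = t2 - t1
    slave_to_master = t4 - t3
    one_way_latency = (master_to_slave + slave_to_master) / 2
    return master_to_slave - one_way_latency

# A slave running 500 ns ahead of its master over a 100 ns one-way path:
# a Sync sent at t1=0 arrives at master time 100, which the fast slave
# stamps as t2=600; the Delay Request sent at slave time t3=1000 arrives
# at master time t4=600.
print(ptp_offset(0, 600, 1000, 600))  # 500.0 -> slave is 500 ns ahead
```

Subtracting this offset aligns the slave's clock with the master, which is exactly the correction applied at every hop in the hierarchy.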
Implementing this approach across every point-to-point link from grandmaster to slave, as illustrated in Figure 05, is fundamental to achieving the sub-microsecond accuracy that PTP is capable of.
Although this architecture looks pretty straightforward, things can and do go wrong. Often the biggest challenge is actually knowing that you have a problem and that the time you are receiving and trusting is not as accurate as you had assumed. It is precisely for this reason that ESMA's RTS 25 (Regulatory Technical Standard) specifies traceability to UTC as a requirement for MiFID II compliance.
Figure 05: PTP hierarchy: PTP switches act as slave to upstream master and also master to downstream slave
When things go wrong
We now explore some factors that can affect the accuracy of a typical PTP deployment. We start upstream at the GPS and work our way down towards the client.
While GPS provides us with a high degree of accuracy, it is in practice quite fragile, being susceptible to physical and radio frequency (RF) disturbances.
The antenna itself needs to have a clear line of sight to satellites. Antennas installed in urban canyons not only have restricted line of sight but are also susceptible to interference from the GPS signals bouncing off buildings. Similarly, newly built neighbouring high-rise buildings, construction cranes, and nesting birds, to name but a few, can all have an impact. Anecdotally, we also learnt of an air-conditioning engineer working on the roof of a building who inadvertently damaged an antenna, rendering it unreliable.
Extreme weather and atmospheric conditions can attenuate signals, taking the system out of tolerance, and solar storms can also have a detrimental effect. Furthermore, because the GPS signal is low-power RF, there have been occurrences of sabotage where GPS signals are deliberately jammed. In principle, it is also possible to spoof a GPS signal and deliberately drift the time away from where it should be, but this would constitute a highly sophisticated and technical attack.
So, what would happen if your grandmaster lost its GPS signal? It really depends on how you have engineered your environment. PTP provides a Best Master Clock Algorithm (BMCA) that allows more than one GPS source and grandmaster to be deployed. BMCA takes care of the failover to ensure that downstream clients do not suffer from inaccurate time.
Figure 06: Network without PTP enabled - PTP packets are not forwarded
But what if you don't have redundant clocks and GPS is lost? In this case the grandmaster will switch into holdover mode, where its internal clock is now free-running, no longer being disciplined by the GPS. What happens next is really down to the quality of the clock in the grandmaster as that will determine the rate of drift. Suppose you have a relatively inexpensive grandmaster and the oscillator has been rated with a drift of 0.1 parts per million.
0.1 ppm = 100ns per second.
After 1 minute, the clock would have drifted by 6μs.
After half an hour, the clock would have an inaccuracy of 180μs.
If the GPS was lost during the night then by the next morning, the drift could be in the realm of milliseconds.
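This back-of-envelope arithmetic generalises easily. A minimal sketch, assuming the 0.1 ppm figure above and a constant drift rate (optimistic for a real free-running oscillator):

```python
# Back-of-envelope holdover drift for the free-running oscillator above.
# Assumes a constant drift rate, which is optimistic for a real oscillator.
def drift_us(drift_ppm, seconds):
    """Accumulated drift in microseconds: 1 ppm is 1 us per second."""
    return drift_ppm * seconds

print(drift_us(0.1, 1))         # 0.1 us, i.e. 100 ns per second
print(drift_us(0.1, 60))        # ~6 us after one minute
print(drift_us(0.1, 1800))      # ~180 us after half an hour
print(drift_us(0.1, 8 * 3600))  # ~2880 us, i.e. ~2.9 ms overnight
```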
It is also possible for bugs to occur in the GPS system. A prime example of this occurred on 26 January 2016, when a 13 millisecond error was introduced for up to 12 hours. It was caused by routine maintenance work when one of the older satellites, SVN-23, was 'retired'. Ground system software pushed an update to 15 other satellites in the constellation that introduced the error, and it took several hours to unwind the change. Interestingly, on that day, Corvil support received a number of calls from customers who reported their Corvil appliance logging 13ms timesync errors.
Figure 07: Multicast routing of PTP
Finally, in one deployment we witnessed a grandmaster not handling UTC correctly. The grandmaster should publish International Atomic Time (TAI) along with the current UTC offset, which takes account of leap seconds and allows the downstream client to make its own correction. Instead, we saw the grandmaster publishing UTC time rather than TAI, but with a zero offset. Of course, this achieves the same result for now, but what happens when the next leap second occurs? The client will experience a one second jump in what it thinks is atomic time. The impact of this is unknown and it is also a very difficult scenario to test for. If this grandmaster was providing time to your critical systems you would be right to be concerned.
Let's move downstream from the grandmaster and look at some scenarios where issues have been found within the distribution of PTP.
PTP Switches - Scenario 1
A scenario that we have encountered on a couple of occasions is where PTP is not configured on the Layer 3 switches between clients and the grandmaster. Initially, the clients correctly report that they are not synchronised because they are not receiving any Sync messages from the grandmaster, as shown in Figure 06.
The networking team, who are not specialists in PTP, are tasked with resolving the problem. Upon realising that PTP is sent as a multicast they go ahead and configure multicast routing for that group from the grandmaster out to the clients. The clients then report that they are receiving Sync messages and everything appears to be working correctly.
However, in this scenario, PTP is not correctly configured: the root cause of the problem is in fact that PTP is not enabled on the intermediate devices. Although the switches are now forwarding the PTP traffic as illustrated in Figure 07, the hardware-assisted boundary-clock function has been completely bypassed, with each switch behaving like a regular switch. The grandmaster's time signal is not corrected for any queuing delays that might occur on the switches' egress ports, and the accuracy of the PTP signal that the client receives is compromised.
The enabling of multicast routing is the wrong solution. The correct action is to leave the multicast routing alone and instead enable PTP on all intermediate devices. The multicast traffic should stay local and be used by the clients only to slave to the PTPv1 master on the adjacent PTP device.
This problem can be exacerbated by the fact that changes such as the deployment of PTP are usually done outside of production hours, during major change-control windows at the weekend. At such times, the traffic loads on the network are typically minimal and the time-sync problems that can arise from congestion on the network simply do not show up.
Figure 08: PTP in a network with multicast routing already enabled
PTP Switches - Scenario 2
A related scenario we have encountered is where multicast routing has already been enabled for some (possibly unrelated) reason. The PTP clients will never register a problem as they start to see Sync messages from the grandmaster as soon as they are enabled, as shown in Figure 08.
In this case, it can be even harder to track down the issue as there is no record of any network changes being made at the time of the PTP deployment. The root cause of the problem is that PTP is not enabled on all the network devices involved in handling the PTP traffic. As in the first scenario, the degradation of the PTP signal will probably only materialise during business hours when production traffic loads start to drive transient congestion in the network.
PTP Switches - Scenario 3
Another problematic scenario that we experienced is where a misconfiguration of PTP on the boundary network devices means that the path-delay communication does not work. Recall that the path-delay mechanism in PTP is an exchange of timing initiated by the slave which allows it to assess the network delay between itself and the master.
Figure 09 illustrates this scenario: Although the client is able to receive the time-signal from the master via the Sync messages, which are working correctly, it cannot correct for the effect of latencies. These could be propagation delay, which is of the order of 5ns per metre of cable, or transceiver delay, which can be up to 1μs for copper.
In some cases, the slave will report the problem, but in many cases the default action is for the slave to do the best it can and simply assume that the network delay is zero, silently introducing an incorrect offset into the slave clock.
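The size of that silent error is exactly the one-way latency. A minimal sketch with illustrative values (a slave that is truly 500 ns ahead of its master over a 100 ns one-way path):

```python
# Sketch of the error introduced when the path-delay exchange fails and
# the slave assumes zero network delay (values illustrative).
def offset_assuming_zero_delay(t1, t2):
    # With only the Sync timestamps available, the slave's best guess for
    # its offset is simply the raw master-to-slave difference.
    return t2 - t1

# A slave truly 500 ns ahead of its master over a 100 ns one-way path
# stamps a Sync sent at t1=0 with t2=600. The computed offset is wrong
# by exactly the one-way latency.
print(offset_assuming_zero_delay(0, 600))  # 600 instead of the true 500
```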
PTP Switches - Scenario 4
The final scenario involves a Layer 2 switch that is not PTP-aware being used in the network. In some ways this is very much like the first two scenarios, as the underlying cause is that PTP is not enabled on all devices that handle the PTP signals. The difference, shown in Figure 10, is that the problem is not caused by misconfiguration.
From a purely protocol level, it can be very hard to determine that Scenario 4 is happening, because the switch is completely transparent to the PTP protocol and any host or network PTP agents. A further complication arises when the device in question is overlooked because it is a low-latency switch, whose latency is rated in hundreds of nanoseconds. Of course, these figures refer to the constant switching time of the switch fabric and do not reflect the significant egress queuing that can occur on contended links. The ideal solution to this problem is to ensure all switches are PTP enabled so that the PTP time signals are appropriately compensated for any egress queuing that occurs.
This last scenario is even more important to understand when considering PTP over the Wide Area Network (WAN): PTP is not normally thought of as a WAN technology, but in principle there is nothing preventing it from operating successfully. However, to do so requires careful consideration of the physical network topology: If the WAN circuit consists solely of dark fiber, then lighting that fiber with PTP-enabled switches will allow reliable propagation of the time signal. If instead the WAN circuit is something like Multiprotocol Label Switching (MPLS) or even a MetroEthernet service, then all bets are off: Service providers will typically carry such circuits over a shared network that can introduce such large jitter as to obviate any benefit of attempting to use PTP in the first place.
Figure 09: PTP master sync messages are forwarded, but slave delay-requests are not
Miscellaneous problems detected downstream
Finally, when consuming PTP downstream and comparing it to a stable reference, it is possible to detect and then diagnose any unexpected behaviour. Here are just a few examples that we have seen:
Various grandmasters and switches introducing jitter of the order of a microsecond.
Bug: Delay Response messages missing, resulting in an incorrect offset.
Bug: PTP Follow-Up messages delayed by 5 seconds - a significant error can accumulate within a 5 second window.
A PTP switch with an inaccuracy of 5.5 microseconds, tracked down to a misconfiguration of a physical port (100 Mbps instead of 10 Gbps). The interface speed is used to compensate for the transceiver serialisation delay.
Third party PTP service providers not meeting their service-level agreement.
Bug: A particular PTP switch would run beautifully with very low jitter, but it appeared to sync to random offsets from UTC, for instance an offset of 18 minutes 19.5 seconds or 55 minutes. Each time the switch was reset it would sync at a new random offset, but each time it was very stable. Working with the switch vendor, we identified a bug where the offset was always an exact multiple of 240 nanoseconds!
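As an aside, a port-speed inaccuracy of roughly that magnitude is consistent with simple serialisation arithmetic. A sketch with an illustrative 70-byte frame size (the exact frame length depends on the message and encapsulation):

```python
# A ~5.5 us inaccuracy is consistent with compensating serialisation
# delay for a 100 Mbps port that is actually running at 10 Gbps.
# The 70-byte frame size is illustrative.
def serialisation_delay_us(frame_bytes, link_bps):
    """Time to serialise a frame onto the wire, in microseconds."""
    return frame_bytes * 8 / link_bps * 1e6

configured = serialisation_delay_us(70, 100e6)  # compensation applied: 5.6 us
actual = serialisation_delay_us(70, 10e9)       # real delay: 0.056 us
print(round(configured - actual, 3))  # 5.544 -> roughly the observed 5.5 us
```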
Figure 10: PTP messages jittered across a non PTP-aware switch
Clock synchronisation has become top-of-mind for many in the financial markets due to regulatory changes and a desire for transparency. Familiarity with the various hardware components and protocols is essential for successfully deploying a timestamping solution. As we have discussed, problems can be hard to spot and identifying their cause is often even more difficult.
A variety of complications can arise, including bugs, misconfigurations and hardware failures. It is wise to expect the unexpected, to carry out rigorous testing, implement ongoing monitoring and use third party tools, such as those from Corvil, in order to validate performance.