Many organisations, particularly in the securities industry, are turning to Precision Time Protocol (PTP) for its precision and accuracy. A well-designed PTP network can keep all of the nodes within a handful of microseconds of each other, even without special hardware. With special switches and with special Network Interface Controllers (NICs) - or kernel support for socket timestamping - a network may achieve the sub-microsecond range. Increasingly tight timing requirements on regulated industries make this level of performance mandatory. Not only are firms asked to have good synchronisation, they are asked to be able to prove it, and this is where PTP falls flat on its face.
Monitoring, auditing and real-time alerting tasks range from difficult to impossible, which seems surprising until one examines the protocol in detail. PTP is a 'time distribution scheme', with a hierarchy of nodes, each responsible for following the best time source available. The best time source is called the 'master'; its followers are called 'slaves'. This is a one-way communication path: masters provide information, slaves consume it. Proving the time on a master is fairly easy, but proving the time on a slave is not.
- PTP excels at distributing the time with high precision and accuracy.
- PTP cannot demonstrate that its nodes are synchronised.
- PTP's controlling standard lies at the heart of most monitoring problems.
It is useful to keep in mind the distinction between 'in-band techniques' (using the protocol itself) and 'out-of-band techniques' (using any other method) to measure compliance. Those transitioning from Network Time Protocol (NTP) or similar protocols are not used to worrying about the distinction, because a standard implementation of NTP is both a consumer and a producer of time information, thus allowing in-band monitoring even of machines that do not ordinarily provide the time to others.
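To make the in-band idea concrete, here is a minimal Python sketch of an NTP probe. It assumes the target node answers ordinary client-mode requests on UDP port 123, which depends on local configuration. The function names and the host passed to `query_offset` are illustrative and do not come from any particular product.

```python
import socket
import struct
import time

NTP_EPOCH_OFFSET = 2208988800  # seconds from 1900-01-01 (NTP era 0) to 1970-01-01


def build_ntp_request(version: int = 4) -> bytes:
    """Build a 48-byte client-mode request: LI=0, VN=version, Mode=3."""
    first_byte = (0 << 6) | (version << 3) | 3
    return bytes([first_byte]) + bytes(47)


def parse_timestamp(packet: bytes, start: int) -> float:
    """Convert the 64-bit NTP timestamp at `start` to Unix seconds."""
    seconds, fraction = struct.unpack("!II", packet[start:start + 8])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def clock_offset(t1: float, response: bytes, t4: float) -> float:
    """Estimate remote-minus-local offset from the four NTP timestamps.

    t1 = local send time, t2 = remote receive time (bytes 32-39),
    t3 = remote transmit time (bytes 40-47), t4 = local receive time.
    """
    if len(response) < 48:
        raise ValueError("NTP response too short")
    t2 = parse_timestamp(response, 32)
    t3 = parse_timestamp(response, 40)
    return ((t2 - t1) + (t3 - t4)) / 2


def query_offset(host: str, port: int = 123, timeout: float = 2.0) -> float:
    """Ask one node for its time and return the estimated offset in seconds."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(build_ntp_request(), (host, port))
        response, _ = sock.recvfrom(512)
        t4 = time.time()
    return clock_offset(t1, response, t4)
```

A monitoring station could call `query_offset` against each node in turn, including nodes that never serve time to anyone else. The node being audited answers the question itself, and that is what makes the technique in-band.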
PTP's in-band monitoring abilities are severely handicapped by the protocol specification itself. In particular, we will show why PTP messages cannot be used in a manner analogous to NTP's messages. PTP's design militates against the collection of compliance data. In order to show this, we must first review the protocol itself, including tedious detail about its messages and how they are delivered. If you are already familiar with how PTP operates and the kind of messages it uses, you may skip ahead to the section 'Node Identification'.
Precision Time Protocol version 2, commonly called 'PTP' (or PTPv2 when necessary to distinguish from other versions), is controlled by IEEE under the standard 1588-2008. PTPv1 (1588-2002) is not obsolete, but is not directly interoperable with v2 and has largely been abandoned. PTPv3 is still in the planning stage and does not appear poised to solve any of v2's inherent problems.
The present article is limited to PTPv2. Table numbers, section numbers or other references in this article pertain to labels as given in IEEE 1588-2008. From here on, these references are labelled with the prefix 'IEEE'. We provide these call-outs to the standard in order to avoid having to recapitulate every detail and to provide documentation of where the standard itself hinders monitoring. You do not need to have a copy at hand in order to follow this article.
Here we only address PTP communications using UDP at the transport layer of the Open Systems Interconnection (OSI) model, using IPv4 or IPv6 at the network layer. IEEE 1588-2008 can operate directly at the data link layer (normally 802.3). It can also use alternatives such as DeviceNet, ControlNet and PROFINET, but these options are outside the scope of this article.
Popular software implementations of PTP include Greyware's Domain Time for Windows, the SourceForge PTPd project (and its many derivatives, some proprietary, some open source) for Linux, TimeKeeper from FSMLabs, clients from Meinberg and dozens of others. Most are 1588-2008 compliant with respect to the defaults, but many have either proprietary extensions or idiosyncratic interpretations of the murkier aspects of the specification.
Glossary of terms
|BMC||Best Master Clock|
|CDMA||Code Division Multiple Access|
|CIDR||Classless Inter-Domain Routing|
|CPU||Central Processing Unit|
|DHCP||Dynamic Host Configuration Protocol|
|GNSS||Global Navigation Satellite System|
|GPS||Global Positioning System|
|IEEE||Institute of Electrical and Electronics Engineers|
|MPD||Mean Path Delay|
|NIC||Network Interface Controller|
|NTP||Network Time Protocol|
|OSI||Open Systems Interconnection|
|PHY||Physical layer (e.g. wire or glass)|
|PTP||Precision Time Protocol|
|RFC||Request for Comments|
|TAI||International Atomic Time|
|TCP||Transmission Control Protocol|
|TSC||Time Stamp Counter|
|UDP||User Datagram Protocol|
|UTC||Coordinated Universal Time|
PTP networks consist of nodes, of which only one is master and the rest are slaves, passive observers or specialised hardware devices for segmenting and distributing the master's time. A reference implementation using either of the default profiles (discussed later) requires all nodes to be potential masters and to use a very specific 'best master clock' (BMC) algorithm for determining which node should be master. When the current master goes offline or downgrades its quality, perhaps due to loss of GPS signal or other fault, then the other nodes will quickly hold an election to determine which of the remaining nodes has the best quality and switch allegiance.
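The election can be pictured as a sort over the qualities each candidate advertises. The Python sketch below is a simplification under stated assumptions: it compares the advertised fields in the order priority1, clockClass, clockAccuracy, offsetScaledLogVariance, priority2 and clockIdentity, with lower values winning, and it leaves out the topology tie-breaks (such as steps removed) that the full data set comparison in IEEE 1588-2008 also applies. The class name, function name and sample values are illustrative.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class AnnouncedClock:
    name: str                        # label for this sketch only, not a PTP field
    priority1: int
    clock_class: int
    clock_accuracy: int
    offset_scaled_log_variance: int
    priority2: int
    clock_identity: bytes            # 8-byte identity, used as the final tie-break

    def rank(self) -> Tuple:
        """Tuple ordered so that the smallest rank is the best master."""
        return (
            self.priority1,
            self.clock_class,
            self.clock_accuracy,
            self.offset_scaled_log_variance,
            self.priority2,
            self.clock_identity,
        )


def elect_master(candidates: Iterable[AnnouncedClock]) -> AnnouncedClock:
    """Pick the candidate with the best advertised quality."""
    return min(candidates, key=AnnouncedClock.rank)


# Illustrative values only.
appliance = AnnouncedClock("appliance", 128, 6, 0x21, 0x4E5D, 128,
                           bytes.fromhex("0011223344556601"))
backup = AnnouncedClock("backup", 128, 6, 0x22, 0x4E5D, 128,
                        bytes.fromhex("0011223344556602"))
server = AnnouncedClock("server", 128, 248, 0xFE, 0xFFFF, 128,
                        bytes.fromhex("0011223344556603"))

nodes = [server, backup, appliance]
print(elect_master(nodes).name)   # appliance

nodes.remove(appliance)           # the current master goes offline
print(elect_master(nodes).name)   # backup
```

The function sees only what each node advertises. Nothing in it measures whether those claims are true.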
An appliance billing itself as a 'grandmaster' is simply a PTP node that has access to a primary reference time, such as a Global Navigation Satellite System (GNSS) like the Global Positioning System (GPS), and that is configured in master-only mode. Grandmasters participate in the BMC, but instead of slaving to a better time source, they retreat to passive mode, ready to step in as master when needed. In practice, a robust PTP network consists of an appliance with excellent quality, a backup appliance and all other nodes configured as slave-only, but it is important to understand that PTP does not require this configuration.
The normal operation of a PTP network is not, by itself, a hindrance to monitoring. It can, however, be a problem for industries that require traceability to UTC, because the BMC algorithm does not require the selected master to have a primary time source. Advertisements of clock quality by potential masters are taken at face value and PTP is happy to let a network of nodes without any external reference auto-configure to follow the best claim, even if the selected master is wrong by seconds, minutes or days. Only monitoring can prove that a PTP network is operating correctly, within tolerance, and also traceable to UTC.
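The difference between agreeing with each other and agreeing with UTC can be shown with a small Python sketch. It assumes each node's offset has already been measured against an independent, UTC-traceable reference by some out-of-band method; obtaining such measurements is the hard part and is not shown. The function names, the node names and the 100-microsecond tolerance are hypothetical.

```python
from typing import Dict, List


def spread(offsets_s: Dict[str, float]) -> float:
    """Largest disagreement between any two nodes, in seconds."""
    values = list(offsets_s.values())
    return max(values) - min(values)


def out_of_tolerance(offsets_s: Dict[str, float], tolerance_s: float) -> List[str]:
    """Names of nodes whose absolute offset from the reference exceeds tolerance."""
    return sorted(
        name for name, offset in offsets_s.items() if abs(offset) > tolerance_s
    )


# Hypothetical measurements: seconds of offset from a UTC-traceable reference.
# The master has no external source and is 2.5 s wrong; the slaves follow it closely.
measured = {"master": 2.5, "slave-a": 2.500004, "slave-b": 2.499997}
tolerance = 100e-6

print(spread(measured) < tolerance)           # True: the nodes agree with each other
print(out_of_tolerance(measured, tolerance))  # ['master', 'slave-a', 'slave-b']
```

In this made-up case the network is tightly synchronised and still wrong. Only the comparison against the external reference reveals it.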
In the absence of an appliance, a software-based master can work to syntonise and synchronise the network, but will be limited to the accuracy and precision of its own time source, and the quality of the PTP timestamps will be only as good as its ability to syntonise itself with the source. If the source itself is an atomic clock or GNSS/GPS-connected unit, the distributed time will probably be within approximately half a millisecond of UTC. If the source is secondary or malfunctioning, the quality is unpredictable and traceability goes out the window.
Except for expense, this does not seem like a problem at first blush. You get an appliance or two, configure your other nodes and monitor the whole thing, right? Why is this complicated? To understand, we have to look at the individual messages PTP uses. We will call out the shortcomings of each as we examine them.