Market data connectivity

Published in Automated Trader Magazine Issue 05 April 2007

With:
- David Hann, general manager for EMEA at Interactive Data Real-Time Services
- Gregory Smith, CEO at Cicada Corporation
- Mark Akass, CTO at BT Radianz
- Alasdair Moore, global head of sales at Fixnetix
- Guy Tagliavia, president and CEO at Infodyne Corporation

What do you see as the most significant challenges facing market data managers tasked with maintaining point-to-point trading connectivity for their algorithmic trading teams, who now require API access to an ever widening range of markets?

Moore: The market data industry is currently facing a number of challenges including increases in feed sizes, the need to aggregate multiple feeds as the market fragments due to MiFID, and the constant requirement for hardware upgrades and increased power capacity due to the need to support more and more boxes internally. In order to combat these issues, there are three options available which an organisation should consider. The first is to write bespoke feed handlers for any markets in question; though writing feed handlers requires considerable internal resources, especially if an organisation has a large number of gateways over which they trade.

The second option is to buy feed handlers from an ISV, as this reduces the time to market. However, as feed sizes increase, so do the support implications. This can not only become extremely costly in terms of resources (one investment bank in London is currently running feed handlers from a well known ISV on over 100 servers), but often requires hardware upgrades, power capacity and rigid change control procedures.

The third option is to outsource the end-to-end delivery process to a market data telecommunications company. This type of organisation can provide a hosted solution that encompasses network provision, ticker plant technology, hardware, upgrades and support. They also have the ability to upgrade bandwidth as required with no infrastructure needing to be installed on site.

Akass: The key problems are connectivity, data volumes and of course latency. The data volumes are getting higher and higher, meaning capacity and scaling are becoming issues for the whole market. Another point is that the algorithmic trading model generates a higher volume of trades for the same volume of elements traded. That then generates a lot more data in the post-trade side.

If people are trading different markets, using ever more complex strategies, they are pulling together data from a lot of different sources, feeding it into their trading engines and making decisions on that, hence the connectivity challenges. What companies need is the delivery of all those different data sources down the same physical feed. That increases flexibility, because when companies change their business strategies, the change can be accommodated easily.

Hann: The combination of service and price with respect to return on investment is an issue that we've seen in the marketplace. There is a significant cost in bringing feeds into an organisation, not only in the price of the feed itself, but also in maintenance, accessibility, capacity management on volumes and which liquidity pools to go after. All of that coupled together leads to the question, are you getting a return?

Effectively, the growth of algorithmic trading is driving efficiencies in the market because the more people who put models into the market, the fewer opportunities there are, so margins are being squeezed.

Smith: The biggest challenges are, on the technical side, the compositing issue, where liquidity fragmentation is occurring because of MiFID in Europe and is already occurring because of Reg NMS and other similar regulations in the US. The new rules are having a profound effect on trading rules and information dissemination.

Liquidity is being fragmented via different trading venues and by trading instruments. In equities, the action is being drained away from the underlying cash market. You see all kinds of newly created derivatives, but the effect of that in relation to executing algorithmically is that you are going to have to form a composite book of each instrument complex for the entire market. Liquidity fragmentation will make trade execution for algo traders significantly more difficult.

Change management is the other major issue for point-to-point connections. A lot of the things that were hidden from an organisation in the data feed world will become its responsibility. Exchanges generate at least two to three changes in feeds per year, which direct-connect customers must manage. If you add the emerging new trading venues, change management issues, even if the protocols were all standardised, become a major expense and operating risk.

Tagliavia: The primary concern has to be standardisation. Unfortunately, exchanges do not follow standards for connectivity, APIs, or messaging.

By contrast, algorithmic trading applications want a common view of these venues for receiving market data, placing orders and trades, and receiving reports. The goal should be to base algorithms on a normalised model such that the burden for each new venue is reduced to solving the normalisation problem in one place, without having to change the heart of the system.
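As a rough illustration of the normalised model Tagliavia describes, the sketch below keeps venue-specific parsing in one adapter per venue and lets the trading logic depend only on a common record. The two raw formats, the `Quote` type and every function name are hypothetical, invented for this example; no real exchange protocol is being modelled.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Quote:
    """Venue-neutral quote: the only shape the trading logic ever sees."""
    venue: str
    symbol: str
    bid: float
    ask: float


# Hypothetical venue A: a dict with decimal prices held as strings.
def normalise_venue_a(raw: dict) -> Quote:
    return Quote("A", raw["sym"], float(raw["bid_px"]), float(raw["ask_px"]))


# Hypothetical venue B: "SYMBOL|bid|ask" with prices in integer hundredths.
def normalise_venue_b(raw: str) -> Quote:
    symbol, bid_ticks, ask_ticks = raw.split("|")
    return Quote("B", symbol, int(bid_ticks) / 100, int(ask_ticks) / 100)


# Adding a venue means adding one entry here; nothing downstream changes.
NORMALISERS: Dict[str, Callable] = {
    "A": normalise_venue_a,
    "B": normalise_venue_b,
}


def on_raw_message(venue: str, raw) -> Quote:
    """Single entry point that turns any venue's raw message into a Quote."""
    return NORMALISERS[venue](raw)


def spread(q: Quote) -> float:
    """Stand-in for the 'heart of the system': depends only on Quote."""
    return round(q.ask - q.bid, 10)
```

Under these assumptions, supporting a third venue is a matter of writing one more normaliser and registering it; `spread` and anything else built on `Quote` is untouched.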

Reducing data latency = ever increasing investment + ever diminishing returns?

Are attempts to reduce market data latency becoming a case of ever diminishing returns and ever increasing investment?

Tagliavia: If everyone is on a level playing field, then this is true. However, as long as there are competitive gains to be had, some would argue the investment is worthwhile. The bigger issue is not investment, but the return on investment. Investment should be made in first knowing what your latency is at any point in time, and then determining where and how to improve it. Most may find latency is occurring to a greater degree in areas they had not even suspected.
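One way to act on the advice to know your latency before spending on it is to timestamp a message at each stage it passes through and look at the gaps. The sketch below is a minimal, single-host version; the `HopTimer` class and the stage names are assumptions made for illustration, and measuring across machines would additionally need synchronised clocks, which this does not address.

```python
import time
from typing import Callable, Dict, List, Tuple


class HopTimer:
    """Records a timestamp per pipeline stage and reports the gaps."""

    def __init__(self, clock: Callable[[], int] = time.perf_counter_ns):
        self._clock = clock
        self._stamps: List[Tuple[str, int]] = []

    def stamp(self, stage: str) -> None:
        """Call once as the message reaches each stage."""
        self._stamps.append((stage, self._clock()))

    def breakdown_ns(self) -> Dict[str, int]:
        """Nanoseconds elapsed between each pair of consecutive stages."""
        return {
            f"{a}->{b}": t_b - t_a
            for (a, t_a), (b, t_b) in zip(self._stamps, self._stamps[1:])
        }

    def worst_hop(self) -> str:
        """Name of the stage transition that took longest."""
        gaps = self.breakdown_ns()
        return max(gaps, key=gaps.get)
```

In use, a feed handler might call `stamp("wire_in")`, `stamp("decoded")`, `stamp("normalised")` and `stamp("strategy_in")` for a sampled message, then log `breakdown_ns()`; the largest gap is where the first investment is probably best directed.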

Akass: At some point you do hit diminishing returns, but there are things you can do. Not all markets have optimised latency down to the absolute minimum. But at some point you get to the level where the network delivery across the infrastructure can't go much further, in some cases due to the simple speed of light.
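The speed-of-light floor Akass mentions can be put into numbers. The sketch below assumes light in optical fibre travels at the vacuum speed divided by a refractive index of about 1.47, a typical textbook figure rather than a measured value for any particular network; the route lengths are arbitrary examples.

```python
C_VACUUM_KM_PER_S = 299_792.458   # speed of light in vacuum
FIBRE_REFRACTIVE_INDEX = 1.47     # assumed typical value for optical fibre


def one_way_delay_ms(route_km: float,
                     refractive_index: float = FIBRE_REFRACTIVE_INDEX) -> float:
    """Minimum one-way propagation delay over a fibre route, in milliseconds.

    Ignores switching, queuing and serialisation; this is only the floor
    set by the physical length of the path.
    """
    speed_km_per_s = C_VACUUM_KM_PER_S / refractive_index
    return route_km / speed_km_per_s * 1000.0


if __name__ == "__main__":
    for km in (1, 100, 1000):
        print(f"{km:>5} km of fibre: at least {one_way_delay_ms(km):.3f} ms one way")
```

On these assumptions every 100 km of fibre costs roughly half a millisecond each way, which no amount of network tuning can remove; only a shorter path helps.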

At that point, more people look at their part of the trade process, so if you get the network delivery and connectivity optimised, the algorithms and trade routing engines have to be optimised as well.

If you're not participating in the lowest latency infrastructure you can get access to, you could be disadvantaged in the market.

Smith: If the world didn't change and if we didn't have fragmentation of liquidity, the answer would be yes, you are rapidly reaching the point of diminishing marginal returns. However, the world is changing dramatically over the next year to 18 months. There is a whole new wave of arbitrage about to spring into play. From temporal arbitrage, to instrument complex arbitrage, liquidity fragmentation is generating new algo trading opportunities.

Hann: There is an element of reducing market data latency in order to keep up. As low latency models get more efficient, they drive efficiency within the market itself, so people are scrambling for the same basis points on whatever trade they may be doing.

What we do know is that the financial markets are very good at creating new models, opportunities and new derivatives to trade. Because of this, I don't think it's necessarily going to be efficiency that drives up the complete market, as a lot of algo trading is still very much equity-focused. There is still excellent growth opportunity in other asset classes such as fixed income and foreign exchange, and there is room to reduce latency in these models.

Are data messaging formats an area where proprietary standards return real latency benefits? Or should common standards, such as FAST, be the preferred route?

Tagliavia: Common standards are the preferred route in many cases since implementations based on standards can be improved, allowing all solutions based on them to take advantage of the improvements. Any standard is good as long as it is fit for purpose and FAST has proven itself on numerous occasions.

Smith: There's a belief in the marketplace that says if we could just get everybody to standardise on a single protocol like FIX, everything would be great, but FIX is not standardised; FIX is a framework containing a lot of variations on a theme.

In some areas standardisation not only works but is very important, for example on order structure. An order is a very well-behaved set of content and you get big efficiency gains market-wide by having standardised order structure. What I think you will see continuing is standardisation around things that are well behaved, like orders, confirmations, indications, and things like that. What probably will not be successful is standardisation on actual feed protocols for content.

Although there are attempts to have standard market data feeds, I think for there to be a standard, it has to be so loose as to be not meaningful. So I think it is likely that you will not see standardisation in market data.

Hann: A common standard should have a number of parameters to handle wide ranging situations. It is inherent in this process that you are likely to add latency. If firms are looking to get real latency benefits, then yes, they are going to use proprietary standards. Firms will only have their own system parameters to match their proprietary software. This contributes to lower latency.

Akass: I think standards are a good thing and I think most participants look for them, but I don't think they are massively influential on latency.

Messaging formats don't necessarily help latency, but they do reduce implementation costs. In FX for instance there isn't necessarily conformance on data standards, as there is in equities, so having common standards for your messaging formats certainly helps a participant because you implement it once and apply it many times.

In what ways would you suggest firms might consider overhauling their market data distribution backbones to cope with ever increasing data volumes?

Smith: First, don't overhaul the backbone. The effort needs to be focused on determining the specific requirements for the applications that need the low latency and that must consume the full available volume of market data. Then construct a new infrastructure for this specific requirement, with the secondary requirement of feeding the legacy infrastructure, but only at the rate it can consume, and perhaps with reduced total content. In other words, you have to become a data vendor to yourself and optimise for your firm's own utilisation.
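Feeding legacy infrastructure "only at the rate it can consume" is commonly done by conflation, that is, keeping only the most recent update per instrument until the slow consumer is ready for it. Smith does not name a technique, so conflation is an assumption here, and the `ConflatingFeed` class is a hypothetical sketch rather than anyone's product.

```python
from typing import Dict, List, Tuple


class ConflatingFeed:
    """Sits between a full-rate feed and a consumer that polls slowly.

    Every tick overwrites the previous one for the same symbol, so the
    consumer sees the latest price per symbol but not every intermediate tick.
    """

    def __init__(self) -> None:
        self._latest: Dict[str, float] = {}
        self.ticks_in = 0

    def on_tick(self, symbol: str, price: float) -> None:
        """Called at full market rate by the low latency infrastructure."""
        self.ticks_in += 1
        self._latest[symbol] = price

    def drain(self) -> List[Tuple[str, float]]:
        """Called by the legacy consumer whenever it is ready for more."""
        snapshot = sorted(self._latest.items())
        self._latest.clear()
        return snapshot
```

The algorithmic engines would still read the unconflated stream directly; only the secondary path goes through something like this, which is what makes the firm "a data vendor to itself".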

Hann: Particularly within some of the larger organisations, distribution backbones have been in place for quite a while now. When they were put in place they were not necessarily built to deal with the volumes of data that are actually available in the marketplace today, and equally latency was not such a concern as it is now.

What firms have to do to their networks is to look at who needs the data and how best to distribute it. They have to adapt their existing backbones to handle the new volumes and the new data sources, and then decide where it's best to source the data from and whether it should be taken in directly via an API.

And on those network backbones, firms need to look at what is the best method of distribution, whether they change the distribution protocol from broadcast to multicast, or whether they employ some kind of compression techniques.

Ways to overhaul data distribution

Tagliavia: The first thing to do is to ensure your market data and distribution technology are network and application friendly. This includes making sure data is only put on the wire when requested, and that data is only received from the wire when asked for.

A well behaved distribution technology will include both traffic minimisation and traffic isolation characteristics such that the system is well behaved and not over-taxing everything in its path.

The next area to consider is a distribution architecture that can be configured and scaled to service different types of applications and yet retain a peaceful coexistence with the rest of the system.

Finally, if all else fails, firms should consider increasing network and machine throughput by adopting faster networking, such as InfiniBand, to gain additional headroom.
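Tagliavia's first point, that data should only go on the wire when someone has asked for it, amounts to interest-based publishing. The sketch below shows the idea in-process; the `InterestBasedPublisher` class and its counters are hypothetical, and a real distribution layer would apply the same filter before network transmission rather than before a function call.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str, float], None]


class InterestBasedPublisher:
    """Forwards an update only if at least one consumer subscribed to it."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Handler]] = defaultdict(list)
        self.suppressed = 0   # updates that never reached the wire

    def subscribe(self, symbol: str, handler: Handler) -> None:
        self._subs[symbol].append(handler)

    def publish(self, symbol: str, price: float) -> int:
        """Returns how many consumers received the update."""
        handlers = self._subs.get(symbol)   # .get() avoids creating empty entries
        if not handlers:
            self.suppressed += 1
            return 0
        for handler in handlers:
            handler(symbol, price)
        return len(handlers)
```

The `suppressed` counter gives a crude measure of the traffic minimisation achieved: every increment is an update that a push-everything feed handler would have sent to consumers who never asked for it.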

Akass: Common infrastructure providers can provide a very good solution for people delivering market data. Aggregators of data can do everything much more quickly. Whereas, if you're doing it yourself, the deployment costs, timelines and expense of a global infrastructure mean your ability to respond to the market is much diminished. So if data providers can move from proprietary standards to IP, common standards and common technology, it is easier for infrastructure providers to deliver and for clients to receive.

If people don't want to use common infrastructure, then the financial institutions looking to obtain this data will have to buy capacity, big fat pipes and bigger boxes, invest in resilience and diversity, and run a backbone network that runs uncongested, dimensioned for peak market hours.

Do you expect to see an increase in the use of data mitigation strategies by firms in an effort to address expanding data volumes or do these have significant constraints that continue to make them unpopular?

Smith: Within the domain of algorithmic trading the answer is no, because it would defeat the whole purpose. Outside of the domain of algorithmic trading, the answer is an emphatic yes, because most of the information is white noise to these secondary applications.

Tagliavia: Data mitigation is a double edged sword for most firms that can gain from the reduced traffic rates, but ultimately miss out on data events that may otherwise be important to their trading and/or compliance requirements.

Akass: I think some of the information providers are finding it very expensive to keep upgrading their capacity and their delivery pipes. It's quite expensive for clients to keep having their last miles upgraded; it's a real pain for everybody. There are moves to find ways of being more selective about what data gets delivered. But there's a mixed message. The providers are looking for options to give lower bandwidth pipe as an alternative to full tick, but some of our clients are very wary about not getting everything in case they miss something.

If you look at the OPRA feed in the States, that's projected to push past 400 to 600 MB of data. Very few clients can receive that comfortably; it's not easy to deliver. When you're getting into those sorts of bandwidths, it starts to get very difficult for everybody. But I don't see trends in the data dropping off just yet; it's still growing.

Hann: What you will find is the creation of a two tier structure where firms will bring the data inhouse to a certain point, then distribute it to areas that actually require it. If someone needs a full feed you can find a way to send that across the network just to those individuals, and then other individuals might have some kind of mitigated or delayed feed.

We expect firms to have some data mitigation strategies, particularly as volumes increase. There will be a trade-off internally between justifying the cost savings that may be attained by having a mitigation strategy, against the loss of trading opportunities by not having a full depth data feed.

What do you consider is the single biggest challenge in integrating multiple data feeds and delivering them with minimal latency?

Tagliavia: The easiest technical solution is to deploy simple feed handlers that push everything onto the wire. This is an unfriendly approach. The real challenge is in maintaining a network and application-friendly distribution infrastructure and performing resolution services that introduce no additional latency.

Smith: The biggest challenge is that not all aspects of the problem are under any one organisation's control. The second challenge is the laws of physics, together with the fact that connectivity-induced latency is hard to identify, and sometimes impossible to solve, because of the lack of control over all the components of a solution. After that is the issue of synchronisation in a fragmented world. At the time resolution level now considered the standard, 1 millisecond, there is significant temporal slippage (disconnect) between liquidity venues for a single instrument or instrument complex. After that there are the usual challenges associated with running high availability, low latency, high throughput data management systems.

Hann: The single biggest challenge we see is the ongoing maintenance of those feeds. Exchanges themselves are making changes to the feeds in a bid to be more competitive; for instance the NASDAQ announced 32 changes in 2006 alone. If you multiply this by say, 200 data sources, that's a lot of maintenance. The changes are a constant job.

Akass: If a client is doing it by themselves, they need to source all the relationships with all the different providers, and they would have multiple links coming into their infrastructure at different bandwidths. Each of these links would have to be capacity managed individually.

This is clearly where BT Radianz provides value. We consolidate data feeds from all the key information providers and can deliver it to client sites globally. To further eliminate latency we also allow clients to host their data feed equipment within our infrastructure.

How much increased demand for additional Level II data do you expect to see from trading firms seeking to be more effective in the execution of their algorithmic trading strategies and what impact might this have on the current provision of data feeds?

Tagliavia: As in the US markets, there will likely be a high demand for Level II data. Exchange venues will react to this demand and either introduce additional feeds or make wholesale changes to their existing ones. Some may iterate through this more often, causing a broader impact on downstream recipients. Ultimately this will create a surge in connectivity demands and solutions that can only be a good thing for network and system vendors.

Akass: It is tough to quantify the expected increase in demand for Level II data, but it is safe to assume that as algorithmic trading continues to expand globally, demand for Level II data across different execution venues will continue to increase. This is due to the simple fact that algorithmic trading cannot exist without low latency access to comprehensive market data products. More firms are extending their Level II offering by showing full depth of book and not just the top of book. The trend has been to offer Level II data from execution venues worldwide as algorithmic trading expands into other regions.

Hann: From our perspective, we are seeing a significant increase in the requirement for not just Level II data, but for full depth. Typically some of the exchanges have only offered full depth to member firms, but they are now opening that out. So there is definitely a significant shift and again it's going to add volume to the current data feeds.

Smith: Depth of market is the single most important information set now. In algorithmic trading the price feed is not particularly useful in isolation. Algo traders need to be able to see inside the book and then composite the book or books for an instrument across all liquidity venues for that instrument or instrument complex. In a perfect world this all happens in a submillisecond timeframe.
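To make the idea of compositing a book across liquidity venues concrete, the sketch below merges one side of several per-venue books into a single price-ordered view that remembers where each slice of size sits. The venue names, the data layout and the `composite_side` function are assumptions for illustration; a production book would typically key on integer ticks rather than floats and update incrementally instead of rebuilding.

```python
from typing import Dict, List, Tuple

Level = Tuple[float, int]                          # (price, size)
CompositeLevel = Tuple[float, int, Dict[str, int]]  # (price, total, per venue)


def composite_side(books: Dict[str, List[Level]],
                   is_bid: bool) -> List[CompositeLevel]:
    """Merge one side of several venue books into a single composite side.

    Sizes at the same price are summed across venues, and the per-venue
    split is kept so that an order router can see where the liquidity is.
    Bids are returned highest price first, offers lowest price first.
    """
    merged: Dict[float, Dict[str, int]] = {}
    for venue, levels in books.items():
        for price, size in levels:
            per_venue = merged.setdefault(price, {})
            per_venue[venue] = per_venue.get(venue, 0) + size
    ordered = sorted(merged.items(), key=lambda kv: kv[0], reverse=is_bid)
    return [(price, sum(split.values()), split) for price, split in ordered]
```

For example, with bids of 500 at 100.0 on a hypothetical venue X and 300 at 100.0 on venue Y, the composite best bid is 800 at 100.0, split 500/300. Doing this for every instrument at sub-millisecond speed, as Smith says a perfect world would require, is where the real engineering effort lies.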

Our customers are telling us that we must accommodate in excess of 50,000 positions on a side in the next 18 months. That many orders flowing into the book simultaneously implies data rates in excess of 250,000 messages per second per single instrument. To put that in context right now, it is more than the entire US options market disseminated via the OPRA feed.

There is an expectation that the market is within two to four years of seeing that kind of activity in a single instrument. It's all being generated by algorithmic trading. The only boundary condition is the trading venue.

Are there any recent generic technological advances that you feel have significantly assisted in data connectivity?

Hann: I think the growth in popularity of extranets and managed networks has assisted the connectivity requirements of firms; companies such as BT Radianz and TNS have been around for a period of time, but it seems that there is now a significant shift in momentum in terms of usage of these networks.

Smith: There's a big wave of compression and network hardware technologies that are emerging in the market that jack up speeds. The other big event is the release of multiprocessor chipsets. These multiprocessor chips and the amount of memory available on a single chip have a very significant impact on speed and latency.

Akass: A big advance that has really helped the delivery of market data has been the adoption of multicast technology. Multicast is a way of taking market data from a source, like the stock exchange, and delivering it efficiently as well as reliably, globally.

As more and more market data organisations have adopted multicast interfaces, companies such as ourselves have been able to deliver the service very efficiently for them.

What impact do you feel increasing use of colocation facilities will have on the market data space?

Moore: Colocation demand has increased purely as a function of competition. Once you look under the bonnet at the causes of latency, it is apparent that propagation delay (latency associated with distance), which has historically been a technical problem, is now purely a political issue.

In Europe, incumbent market data companies are beginning to look at colocation as the only way of solving the dichotomy of how to provide data streams to customers without having to constantly reengineer their network every six months. This is becoming more of a problem as data streams are increasing in bandwidth requirements by over 5% a month and being within the proximity of an exchange is no longer good enough. However, many companies have already left it too late as space is limited and in many cases, data centres that can provide colocation with an exchange are already at capacity.
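Moore's figure of 5% a month compounds faster than it may sound. The short calculation below takes exactly 5% as the monthly rate, which is the lower bound of what he describes, and assumes the rate stays constant; both function names are invented for this example.

```python
import math


def annual_growth_factor(monthly_rate: float) -> float:
    """Factor by which bandwidth grows over 12 months of compounding."""
    return (1.0 + monthly_rate) ** 12


def months_to_double(monthly_rate: float) -> float:
    """Months of compounding needed for bandwidth requirements to double."""
    return math.log(2.0) / math.log(1.0 + monthly_rate)


if __name__ == "__main__":
    rate = 0.05
    print(f"Growth over a year: x{annual_growth_factor(rate):.2f}")
    print(f"Doubling time: {months_to_double(rate):.1f} months")
```

At a steady 5% a month, requirements grow by roughly 80% in a year and double in a little over 14 months, which is consistent with the complaint about re-engineering networks every six months.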

Smith: Under the current market model, centralised colocation has a big effect, but I don't think it's going to survive. I think what you're going to see is what is already starting to happen, where exchanges will facilitate the colocation of trading instruments next to the trading platforms. This distributed colocation model will be the dominant solution in the next 18 months.

In effect, algorithmic trading will become a game in which an algo trader does not have a single engine receiving feeds at a colocation facility, but rather the trader will have multiple engines colocated at the trading venues, interacting with each other dynamically and controlled from a single location. The technology shift will be that algorithmic traders will have to devise technology solutions that allow them to, in effect, control the trading in an almost distributed concept.

Akass: Colocation has been a very big thing for us. We've had huge sales growth in colocation over the last 18 months. Colocation is more than taking the algorithmic service blackbox trading engine from broker dealers and putting it into our infrastructure; that alone does not add sufficient value. It is very much part of the low latency solution that clients need, but in itself, just colocation isn't enough.

What you need is the connectivity between the exchanges, the trade venues, the data sources and the participants. By providing that connectivity there is a whole proposition that creates the value. Radianz Proximity Solution provides connectivity and colocation for a complete service.

Hann: To a certain extent it's fashionable to be talking about colocation or proximity location in the financial space. Colocation has been around for years, since the dot-com boom. A number of colocation firms went out of business in the dot-com crash. There can be issues associated with trying to be the fastest by having proximity or being colocated. To ease the impact, we are providing proximity services via our new ultra-low latency direct exchange data service, DirectPlus. DirectPlus is a fully managed service that leverages our high performance ticker plant for delivering ultra-low latency data despite massive increases in data volume.

Which in your view has the greatest effect on data delivery performance - network infrastructure or data compression techniques?

Smith: Neither one. It's a combination. It's physical location, it's network infrastructure, it's data compression, it's also the software you use to process, and the chipsets involved. Just because you get the data, you still have to eat it quickly. Each time you make something go faster, a new bottleneck appears. You hack through that bottleneck, and a new one appears. It's never ending.

Tagliavia: Network infrastructure by far! Infrastructure is what it is all about; compression is just a component of it.

Moore: Both are important but by far the most effective in reducing latency and ensuring that data is not buffered, is the network. If compression was the answer, incumbent data vendors would have just compressed existing feeds without the need to reinvent themselves.

However, as always, there are a number of issues to consider. For every organisation and feed there is an inflection point beyond which the latency added by the compression process is outweighed by the performance benefit of reduced serialisation delay over distance. Therefore, in the example of colocation, data compression will add latency since the distance travelled is effectively zero. Equally, with the price of bandwidth falling, any organisation can achieve an improvement in performance by ordering more bandwidth.
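The inflection point Moore refers to can be sketched with a simple model. It assumes each store-and-forward link on the path costs one serialisation delay (message bits divided by link speed), that compression shrinks the message by a fixed ratio, and that compressing plus decompressing costs a fixed time; propagation delay is the same either way and is left out. The model, the parameter values and the function names are all assumptions for illustration, not measurements.

```python
def serialisation_s(size_bytes: float, link_bps: float, hops: int = 1) -> float:
    """Seconds spent clocking the message onto the wire across all hops."""
    return hops * size_bytes * 8 / link_bps


def compression_gain_s(size_bytes: float, link_bps: float, ratio: float,
                       codec_s: float, hops: int = 1) -> float:
    """Latency saved by compressing; a negative result means it costs time.

    ratio   -- original size divided by compressed size (2.0 halves it)
    codec_s -- combined compress and decompress time, in seconds
    """
    raw = serialisation_s(size_bytes, link_bps, hops)
    packed = codec_s + serialisation_s(size_bytes / ratio, link_bps, hops)
    return raw - packed


if __name__ == "__main__":
    # A 1,000-byte message, 2:1 compression, 0.5 ms codec cost (all assumed).
    for label, bps, hops in (("2 Mbps link", 2e6, 1),
                             ("1 Gbps link", 1e9, 1),
                             ("colocated, no link", 1e9, 0)):
        gain = compression_gain_s(1000, bps, 2.0, 0.0005, hops)
        print(f"{label:<20} gain = {gain * 1e3:+.3f} ms")
```

With these illustrative numbers compression saves time on the slow link and loses on the fast one, and in the colocated case the whole codec cost is pure added latency, which matches both of Moore's observations: compression hurts when the distance is effectively zero, and buying more bandwidth moves the break-even point further away.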

Hann: Definitely network infrastructure, because data compression techniques add latency as there is always a need to either wrap data up or unwrap it. With network infrastructure, if you're processing raw data, all you need to do is make sure you've got the right bandwidth and the right processing power built into your servers to process that more efficiently. Network infrastructure will always win, but again there is a cost equation there. Compression techniques may save you costs, but it will be at the cost of latency.

Akass: Network infrastructure! We've experimented with data compression. One of our big providers uses data compression on some of their legacy services. In the olden days when bandwidth was really difficult to get hold of, it was difficult to scale, so people used data compression as a way of keeping costs down to avoid bandwidth growth.

Unfortunately data compression adds delay and latency. It takes time to squash your data down into smaller bits and to unsquash it. It adds latency and there are performance limits. As the data volumes are moving into the megabytes and tens of meg and hundreds of meg, you've got quite significant performance issues to make your data compression technologies keep up to date with bandwidth growth.