Thinking Out Cloud - The Market Data Sweet Spot
from Xand : Joel York - 1st January 1970
The opinions expressed by this blogger and those providing comments are theirs alone and do not reflect the opinion of Automated Trader or any employee thereof. Automated Trader is not responsible for the accuracy of any of the information supplied by this article.
Market data providers and IT professionals have tough jobs. Every day financial markets spew out huge fountains of data that must be captured, routed, scrubbed, reconciled, stored and redistributed with dizzying speed and accuracy. The diversity of data is staggering, from low-latency pricing data for algorithmic trading to intermittent corporate actions such as stock splits, and from globally dispersed real-time currency exchange rates to aggregated end-of-day VWAP and NAV calculations. Optimizing and tuning the market data systems that keep this crucial information flowing smoothly and cost effectively is no easy task. What, if anything, can cloud computing offer to ease the challenge?
This is the second post in a series called "Thinking Out Cloud" with the aim of helping financial services and market data IT professionals charged with developing cloud computing strategies separate the cloud buzz from the cloud reality. This post explores the types of market data that naturally lend themselves to cloud computing (and those that do not) in order to identify the market data sweet spot for cloud computing.
First, it's important to recognize that it is not the market data that is being outsourced to the cloud, but the on-premise market data management infrastructure. Therefore, the market data sweet spot for cloud computing is the intersection of data management systems that offer little competitive advantage, but are costly and difficult to maintain in-house. Both relative competitive advantage and relative level of difficulty will vary from firm to firm by business focus and IT capability respectively; however, there are aspects of the market data itself that can contribute significantly to the cost and complexity of maintaining an in-house data management system.
Hard to Use Market Data
If the market data comes in original feed formats that are not well suited to the particular use of the data in the final application, then considerable effort must be expended to make the market data application-ready. For example, a real-time streaming exchange feed is great for creating a stock ticker, but not so great when the goal is to analyze historical tick data or simply get an ad-hoc real-time quote for a single symbol. In those cases, there can be a lot of programming involved to parse the feed, store the data, continuously refresh the database and create a data access layer that applications can easily use. Cloud computing is built on Web services that allow for multiple interfaces to the market data, so it is especially good at tailoring the data format to the specific needs of the application on the fly. For example, a Web service request can be for a single price, multiple prices, or simply for symbol validation against master data.
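As a rough illustration of the three request shapes mentioned above (single price, multiple prices, symbol validation), here is a minimal Python sketch. Everything in it is hypothetical: the function names, the symbols and the prices are invented for illustration, and the in-memory dictionary merely stands in for whatever store a real service would keep refreshed from the raw feed.

```python
from typing import Dict, Iterable, Optional

# Hypothetical snapshot of latest prices (invented symbols and values).
# A real service would keep this refreshed from the raw exchange feed.
LATEST_PRICES: Dict[str, float] = {"AAA": 101.25, "BBB": 47.10, "CCC": 12.00}


def is_valid_symbol(symbol: str) -> bool:
    """Symbol validation against the (hypothetical) master data."""
    return symbol.upper() in LATEST_PRICES


def get_price(symbol: str) -> Optional[float]:
    """Ad-hoc quote for a single symbol; None if the symbol is unknown."""
    return LATEST_PRICES.get(symbol.upper())


def get_prices(symbols: Iterable[str]) -> Dict[str, Optional[float]]:
    """Quotes for several symbols in one request."""
    return {s.upper(): get_price(s) for s in symbols}
```

The point of the sketch is that the consuming application asks for exactly the shape it needs and never touches the feed format itself.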
Hard to Maintain Market Data
If the market data in question is stored and refreshed often due to daily activity, such as historical time series and tick-by-tick data, then it can entail a complex update process that must be maintained and monitored. Quality tests must be put in place to ensure data quality, along with alerts to confirm that update processes run successfully to completion. As market data accumulates, regular backups, purges and capacity upgrades must be carried out to ensure efficient operation. Also, market data that is particularly complex, such as corporate actions, can consume significant resources for scrubbing, mapping and updating the data even if the volume is not as heavy as price data.
On the other hand, market data that is infrequently updated in large batches that replace the entire data set may be easier to receive and maintain internally as a simple flat file. Similarly, market data that is streamed for continuous presentation and immediately discarded should be relatively easy to handle in-house.
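To give a sense of the monitoring burden described above for frequently refreshed data, here is a minimal Python sketch of a post-update check on a daily price series. The function name, the alert wording and the 50% jump threshold are all assumptions made for illustration. A production system would need many more checks, for example trading calendars and corporate action adjustments.

```python
from datetime import date
from typing import Dict, List


def check_daily_update(series: Dict[date, float], expected_day: date,
                       max_jump: float = 0.5) -> List[str]:
    """Return alert messages for one day's update (empty list means OK).

    max_jump is a hypothetical threshold: the largest relative price
    change versus the previous stored day that is accepted without alert.
    """
    alerts: List[str] = []

    # Did the update process deliver a value for the expected day at all?
    if expected_day not in series:
        alerts.append(f"missing update for {expected_day}")
        return alerts

    price = series[expected_day]
    if price <= 0:
        alerts.append(f"non-positive price {price} on {expected_day}")

    # Compare against the most recent earlier day, if there is one.
    earlier = [d for d in series if d < expected_day]
    if earlier:
        prev = series[max(earlier)]
        if prev > 0 and abs(price - prev) / prev > max_jump:
            alerts.append(f"suspicious jump from {prev} to {price} on {expected_day}")

    return alerts
```

Even this toy version has to be scheduled, watched and kept in step with the data, which is the kind of ongoing cost that is being weighed against outsourcing.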
Hard to Access Market Data
Technical and geographic barriers can conspire to make certain kinds of market data extremely difficult to access, let alone store and use. Getting access to market data that is not in high demand in your geographic location, such as low-volume market niches, new products, or data created on the other side of the world, can be very difficult when local data providers do not support it. If data needs to be made available throughout a dispersed global organization, it can be quite costly to receive it centrally and build the necessary network and services infrastructure to distribute it globally to consuming applications. Cloud computing provides direct application access to market data over the Internet, so both the geographic and technical barriers to access are significantly reduced. Web services standards ensure that the technical hurdle is very low, so as long as sufficient Internet bandwidth is available, access is no more difficult than pointing a Web browser to a Web page.
The Cloud Computing Sour Spot for Market Data
If hard to access, hard to maintain, and hard to use define the market data cloud computing sweet spot, then what is the sour spot? When boiled down to its essence, all of the "hard to's" above are made "easy to's" by the Internet. So, the sour spot is found when the Internet has a significant negative impact on market data delivery. The three biggest limitations, in most likely order of importance, are:
The market data is unique and proprietary with significant competitive, security or privacy concerns that preclude storing or distributing it using a public network, even with encryption.
The latency requirements of the data are very strict and have a low tolerance for variation. Typically this means a requirement of less than about 50ms on average, which is about the best that can be consistently expected from a high performance Internet connection. In addition, Internet latency can vary significantly depending on the number of network hops between the cloud provider and the consuming application. Internet latency improves every year (back in 2000 we were talking seconds), but it is safe to say that latency requirements measured in microseconds or nanoseconds won't be a good fit for the cloud.
The market data volume involved implies a network transfer time that exceeds the requirements of the consuming application given the best expected performance over the Internet. This may be alleviated by getting a direct network link to the cloud provider if the cost of the additional bandwidth is justified by the overall cost savings from outsourcing the relevant data management infrastructure. In some cases, this can also be resolved by moving the application itself to the same cloud computing infrastructure. However, an in-house application that requires frequent updates with large volumes of data can quickly exceed the best performance currently available over the Internet.
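The latency and volume limitations above come down to simple arithmetic, which the following Python sketch illustrates. The function names and the example figures are hypothetical, and the 50ms default is simply the rough floor quoted in this post, not a measured value. Note the factor of 8: data volumes are usually quoted in megabytes, link speeds in megabits per second.

```python
def transfer_time_seconds(volume_mb: float, bandwidth_mbps: float) -> float:
    """Time to move volume_mb megabytes over a link of bandwidth_mbps
    megabits per second, ignoring protocol overhead and congestion."""
    return volume_mb * 8 / bandwidth_mbps


def fits_cloud(required_latency_ms: float, volume_mb: float,
               bandwidth_mbps: float, deadline_s: float,
               internet_floor_ms: float = 50.0) -> bool:
    """Rough screen for the latency and volume limitations.

    internet_floor_ms is an assumption taken from the ~50ms figure
    quoted in this post; substitute your own measurements.
    """
    # The application must be able to tolerate at least the Internet floor.
    latency_ok = required_latency_ms >= internet_floor_ms
    # The update must arrive before the application needs it.
    transfer_ok = transfer_time_seconds(volume_mb, bandwidth_mbps) <= deadline_s
    return latency_ok and transfer_ok
```

For example, a 500 MB update over a 100 Mbps link takes about 40 seconds under these idealized assumptions, so it passes a 60 second deadline, while a 5,000 MB update over the same link does not.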