Get a Grid!

Issue 07 October 2007
Automated Trader Magazine

In the Q3 issue’s Technology Forum, our panel of experts agreed that grid computing had not yet been fully harnessed to support automated trading. So Automated Trader asked Mike Stoltz, VP, Architecture and Strategy, Financial Services at Gemstone Systems, Inc, to explain how grid computing might support a theoretical automated trading strategy.

Mike Stoltz

The Scenario

A large proprietary trading operation wishes to develop a global, automated synthetic pairs trading programme across its offices in New York, London and Singapore. In its first phase, the objective is to identify baskets of instruments that can be traded as synthetics against individual real securities. In its second phase, the intention is to extend this strategy so that both legs of the pairs trades consist of synthetic instruments.

The intention is that the strategy will involve multiple security types denominated in various currencies and that it will also operate across twenty different timeframes (ranging from one minute to daily data). Combining these factors with the need to conduct in- and out-of-sample testing for cointegration means that the computational requirements will be very significant from the outset, but will expand substantially with the second phase of the project.
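To give a feel for how quickly the workload grows between the two phases, the following is a rough counting sketch rather than the firm's actual search procedure. The twenty timeframes and the in- and out-of-sample split come from the scenario. The universe size, the fixed basket size and the rule that a basket is tested only against securities outside it are illustrative assumptions.

```python
from math import comb

TIMEFRAMES = 20   # from the scenario: one minute up to daily data
SAMPLE_SPLITS = 2  # in-sample and out-of-sample


def phase_one_tests(n_instruments, basket_size, timeframes=TIMEFRAMES):
    """Count tests when a synthetic basket is paired with one real security.

    Assumption: every basket has exactly `basket_size` members and is
    tested against each security that is not in the basket.
    """
    baskets = comb(n_instruments, basket_size)
    targets = n_instruments - basket_size
    return baskets * targets * timeframes * SAMPLE_SPLITS


def phase_two_tests(n_instruments, basket_size, timeframes=TIMEFRAMES):
    """Count tests when both legs are synthetic baskets.

    Assumption: the two baskets are the same size and share no members.
    Choosing one basket and then the other counts each pair twice,
    so the product is halved.
    """
    ordered_pairs = comb(n_instruments, basket_size) * comb(
        n_instruments - basket_size, basket_size
    )
    return (ordered_pairs // 2) * timeframes * SAMPLE_SPLITS


if __name__ == "__main__":
    # Hypothetical toy universe of 10 instruments, baskets of 2.
    print(phase_one_tests(10, 2))  # 14400
    print(phase_two_tests(10, 2))  # 25200
```

Even in this toy universe the second phase nearly doubles the number of tests, and the gap widens rapidly as the universe or the basket size grows. A real search would prune candidates long before enumerating them all.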

Furthermore, the trading operation assumes that the tradable pairs arrived at after out-of-sample cointegration testing will not be stable (i.e. the cointegration relationship will probably decay), especially in short timeframes. Therefore, the decision has been made that development and testing must be continuous, so all calculations and testing on all possible pairs and timeframes will be updated in real time. For the shorter timeframes, this means that calculations will have to be performed during market hours. The intention is not to have a ‘Big Bang’ go-live, but a gradual ramp-up. Initial testing and trading will only involve a subset of the final instrument universe, but will still be conducted across all three locations.

Initial set-up

Due to the global nature of this business, it is likely that each of the three trading centres will have a replica of data from the other centres. However, keeping multiple data centres in sync is a classic data-management problem. A new generation of distributed data caches, or ‘enterprise data fabrics’, is well suited to addressing it. By partitioning data according to well-defined rules pertinent to the company’s business use case, the data fabric will help provide transactionality for updates, which are typically managed by regional resource managers. Inserts of new data do not require transaction management; the new records will only need a globally unique identifier.
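The distinction between updates and inserts can be sketched as follows. This is a minimal in-memory illustration, not the API of any particular data fabric product. The rule of partitioning by currency, the class names and the use of one lock per region as a stand-in for a regional resource manager are all assumptions made for the example.

```python
import threading
import uuid
from dataclasses import dataclass, field

# Hypothetical partitioning rule: each currency is owned by one region.
OWNER_BY_CURRENCY = {
    "USD": "new_york",
    "GBP": "london",
    "EUR": "london",
    "SGD": "singapore",
    "JPY": "singapore",
}


def owning_region(currency):
    """Return the region responsible for records in this currency."""
    return OWNER_BY_CURRENCY[currency]


@dataclass
class PriceRecord:
    symbol: str
    currency: str
    price: float
    # A globally unique identifier lets any site create records
    # without coordinating with the others.
    record_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class DataFabricSketch:
    """Toy stand-in for a partitioned data fabric (illustrative only)."""

    def __init__(self):
        regions = set(OWNER_BY_CURRENCY.values())
        self.partitions = {region: {} for region in regions}
        # One lock per region plays the role of a regional resource manager.
        self.locks = {region: threading.Lock() for region in regions}

    def insert(self, record):
        """Store a new record; its unique id means no locking is needed."""
        region = owning_region(record.currency)
        self.partitions[region][record.record_id] = record
        return region

    def update_price(self, record_id, currency, new_price):
        """Change an existing record under the owning region's lock."""
        region = owning_region(currency)
        with self.locks[region]:
            self.partitions[region][record_id].price = new_price
        return region
```

In this sketch an update to a sterling-denominated record is always serialised in London, whichever office requests it, while inserts from any office proceed independently.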

Load balancing

Once the data is present at all locations through the data fabric, it is a simple task to fire off computations at any location by utilising computing resources in regions that would otherwise be idle. This is a question of ‘moving the computations to the data’ rather than ‘moving the data to the computations’. Given the size and complexity of the kinds of data needed, the request for computation is an order of magnitude smaller than the data required to perform the calculation. Therefore, using a ‘command pattern’ can easily achieve a ‘10x speed-up’ by treating the resources in each location as a computation pool and making data available to ensure all pools are of equal value to the compute grid software that is managing the load balancing. ...
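The idea of moving the computation to the data can be illustrated with a small command-pattern sketch. It assumes every pool already holds a replica of the data and that load is a simple count of outstanding jobs. The class names, the operation registry and the least-loaded dispatch rule are hypothetical choices for the example, not a description of any specific compute grid product.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical registry of operations every pool knows how to run.
OPERATIONS = {"mean": mean, "max": max, "count": len}


@dataclass(frozen=True)
class Command:
    """A small request: which operation to run on which data set."""
    operation: str
    key: str


class ComputePool:
    """One location's computing resources plus its local data replica."""

    def __init__(self, name, local_data, load=0):
        self.name = name
        self.local_data = local_data
        self.load = load  # outstanding jobs, supplied by the caller here

    def execute(self, command):
        # The data never leaves the pool; only the result travels back.
        series = self.local_data[command.key]
        return OPERATIONS[command.operation](series)


def dispatch(pools, command):
    """Send the command to the least-loaded pool that holds the data."""
    candidates = [pool for pool in pools if command.key in pool.local_data]
    pool = min(candidates, key=lambda p: p.load)
    return pool.name, pool.execute(command)


if __name__ == "__main__":
    replica = {"spread_1min": [1.0, 2.0, 3.0]}
    pools = [
        ComputePool("new_york", dict(replica), load=5),
        ComputePool("london", dict(replica), load=0),
        ComputePool("singapore", dict(replica), load=2),
    ]
    print(dispatch(pools, Command("mean", "spread_1min")))
```

Because each pool holds the same data, the dispatcher is free to choose purely on load, which is what makes all pools of equal value to the load balancer.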
