
Plimsoll Takes the Plunge

Published in Automated Trader Magazine, Issue 07, October 2007

To offer greater diversification to investors, US-based Plimsoll Capital introduced Armada, a portfolio of automated strategies. Tom Parry, Director of Algorithmic Trading, tells AT how the FX-focused asset manager built the platform on which Armada now sails.

Why did Plimsoll adopt automated trading?

Plimsoll has been involved in the FX market since 2002, but we only really became involved in automated trading this year with the introduction of Armada, which is our portfolio of automated high-frequency, statistical arbitrage and market-making strategies. We designed Armada to be non-correlated with Headwind, our existing discretionary strategy, so we could offer the two strategies individually or collectively to investors. By allocating to both strategies, investors are able to achieve additional diversification between automated and discretionary investment styles. Over 80% of our investors are institutional managed accounts, and they are generally looking for pure alpha rather than a currency overlay programme.

We've started to make some of the FX-specific execution algorithms developed for Armada available to Headwind, as well as automating some of the trade signal generation and pattern identification processes that suggest potential trading opportunities for the Headwind programme. This new automated functionality doesn't actually generate a trade; instead, it causes an order entry window to 'pop up' on the portfolio manager's screen with all the necessary order entry fields populated and offers a choice of execution algorithms. The final decision to take the trade is still ultimately based on the trader's view of the market.

What has been the impact on the Headwind strategy?

This increased level of automation has greatly simplified Headwind's workflow for Randall Durie, the portfolio manager, and enabled him to effectively monitor more potential trading signals and price patterns. For example, a number of our clients have accounts with different FX trading platforms, so previously each time he needed to execute a trade, he had to do so on a point-and-click basis across four or five venues' GUIs. Obviously, this type of workflow greatly increases the probability of adverse price impact by the time you get the trade off on the last platform. Through automation we have been able to eliminate this risk and provide our clients with a more consistent level of execution regardless of where they decide to fund the account.

Randall DuRie (Principal and Portfolio Manager, Headwind), Thomas Parry (Director of Algorithmic Trading and Portfolio Manager, Armada), Daniel Bremmer (Director of Operations).

How different is developing automated trading models for FX compared with equities?

One of the biggest challenges in developing FX models is the lack of any single consolidated data source. Ideally, the market data that you're using to build your trading model should come from the same source where it will ultimately be traded, because of the small differences between data sources at the tick level. Although the vast majority of these differences are eliminated when data is aggregated into coarser time resolutions (e.g. 10-minute OHLC bars rather than ticks), at the tick level they are both extremely significant and relevant for those developing high-frequency or statistical arbitrage types of trading models. Additionally, you should always try to use market data that comes from firm (i.e. executable) prices, as opposed to indicative prices, whenever possible. Accordingly, we only use data from Currenex and HotSpotFXi, and have developed a number of proprietary algorithms that enable us to fully reconstruct the order book at a given point in time, so we can accurately estimate where we would have been filled, where we would have had to go up a couple of price levels, and the potential price impact of the trade.
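
Purely as an illustration of the book reconstruction idea Parry describes (the class, method names and pip scaling below are our own assumptions, not Plimsoll's code), a minimal Java sketch might rebuild one side of the book from depth updates and walk it to estimate the average fill price and impact of a marketable order:

```java
import java.util.TreeMap;

/**
 * Minimal sketch: rebuild one side of a limit order book from depth updates
 * and estimate the fill price / price impact of a marketable buy order.
 * Names and scaling are illustrative only.
 */
public class BookReconstruction {

    // Ask side: price -> displayed size, sorted from best (lowest) ask upwards.
    private final TreeMap<Double, Double> asks = new TreeMap<Double, Double>();

    /** Apply a depth update; a size of zero removes the level. */
    public void onAskUpdate(double price, double size) {
        if (size <= 0.0) {
            asks.remove(price);
        } else {
            asks.put(price, size);
        }
    }

    /** Walk the book to estimate the average price paid for a buy of 'quantity'. */
    public double estimateBuyFillPrice(double quantity) {
        double remaining = quantity;
        double cost = 0.0;
        for (java.util.Map.Entry<Double, Double> level : asks.entrySet()) {
            double take = Math.min(remaining, level.getValue());
            cost += take * level.getKey();
            remaining -= take;
            if (remaining <= 0.0) {
                break;
            }
        }
        if (remaining > 0.0) {
            throw new IllegalStateException("Not enough displayed liquidity");
        }
        return cost / quantity;
    }

    /** Estimated impact in pips versus the best ask (EUR/USD-style quoting assumed). */
    public double estimateImpactPips(double quantity) {
        double bestAsk = asks.firstKey();
        return (estimateBuyFillPrice(quantity) - bestAsk) * 10000.0;
    }

    public static void main(String[] args) {
        BookReconstruction book = new BookReconstruction();
        book.onAskUpdate(1.4201, 5000000);
        book.onAskUpdate(1.4202, 10000000);
        book.onAskUpdate(1.4203, 20000000);
        System.out.printf("Avg fill: %.5f, impact: %.2f pips%n",
                book.estimateBuyFillPrice(12000000),
                book.estimateImpactPips(12000000));
    }
}
```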

What impact does the lack of transactional (time-of-sales) data have on model development?

This presents FX system developers with a number of additional challenges if their models employ some type of passive execution strategy, such as using limit orders to try to avoid paying the spread and obtain some form of price improvement. The most obvious solution to this apparent lack of data is to create a model that estimates a synthetic order flow measure based on changes in the displayed quantities. For example, a number of the FX platforms, such as Currenex, HotspotFX and EBS, offer a 'ticker' that shows the date/time and direction of the last trade that took place.
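
A hedged sketch of what such a synthetic order flow measure could look like; the classification rules below are a common simplification and entirely our own assumption, not Plimsoll's model:

```java
/**
 * Minimal sketch of a synthetic order flow estimate built from changes in
 * displayed quantities at the top of the book. The sign conventions and
 * rules are illustrative assumptions, not a production model.
 */
public class SyntheticOrderFlow {

    private double lastBidPrice = Double.NaN;
    private double lastBidSize;
    private double lastAskPrice = Double.NaN;
    private double lastAskSize;
    private double netFlow; // + = estimated buying pressure, - = selling pressure

    /** Feed each top-of-book update; returns the running net flow estimate. */
    public double onQuote(double bidPrice, double bidSize, double askPrice, double askSize) {
        if (!Double.isNaN(lastBidPrice)) {
            if (bidPrice == lastBidPrice && bidSize < lastBidSize) {
                netFlow -= (lastBidSize - bidSize); // bid depleted at same price: infer sells
            }
            if (askPrice == lastAskPrice && askSize < lastAskSize) {
                netFlow += (lastAskSize - askSize); // offer depleted at same price: infer buys
            }
            if (bidPrice < lastBidPrice) {
                netFlow -= lastBidSize;             // best bid swept away: infer heavier selling
            }
            if (askPrice > lastAskPrice) {
                netFlow += lastAskSize;             // best offer lifted entirely: infer heavier buying
            }
        }
        lastBidPrice = bidPrice;
        lastBidSize = bidSize;
        lastAskPrice = askPrice;
        lastAskSize = askSize;
        return netFlow;
    }
}
```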

Although we've developed a number of different stochastic execution and sequential trade models that attempt to estimate the probability and resulting price impact of our limit order executions based on measures such as spread, size and volatility, in practice we prefer to take a more conservative approach and assume the worst-case scenario, i.e. paying market rates on everything. In addition to leaving more margin for error when backtesting, this approach also eliminates the possibility that some overlooked misspecification in your limit order execution model causes you to implement an unprofitable model.
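
That worst-case assumption is easy to express in a backtest's fill model; a trivial sketch, with names of our own choosing:

```java
/**
 * Sketch of the conservative fill assumption described above: in a backtest,
 * every order is assumed to cross the spread rather than earn it, regardless
 * of how it might actually have been worked as a limit order.
 */
public class WorstCaseFillModel {

    /** Returns the assumed fill price for a backtested order. */
    public double fillPrice(boolean isBuy, double bid, double ask) {
        // Worst case: buys lift the offer, sells hit the bid; no price
        // improvement and no credit for passive execution.
        return isBuy ? ask : bid;
    }
}
```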

How many sub-strategies underpin Armada and how are these managed?

Armada currently contains ten different underlying trading strategies. The strategies run the gamut in terms of style, trade duration and trading frequency. It is a well-documented fact that there are a variety of different types of participants in the FX market, each of which may have a variety of reasons for trading. In order to identify and exploit potentially non-random price movements caused by these various actors, we run our models at a number of different timeframes and also use a variety of activity-based proxies for time such as ticks.
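
One common activity-based proxy for time is the tick bar, where a bar closes after a fixed number of ticks rather than after a fixed number of seconds. A minimal sketch, assuming a simple OHLC aggregation of our own devising:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Minimal sketch of an activity-based clock: instead of sampling prices every
 * N seconds, a new bar is closed every N ticks. Purely illustrative.
 */
public class TickBarBuilder {

    public static class Bar {
        public double open, high, low, close;
    }

    private final int ticksPerBar;
    private int tickCount;
    private Bar current;
    private final List<Bar> bars = new ArrayList<Bar>();

    public TickBarBuilder(int ticksPerBar) {
        this.ticksPerBar = ticksPerBar;
    }

    /** Feed each tick (e.g. a mid price); a completed bar is stored every 'ticksPerBar' ticks. */
    public void onTick(double price) {
        if (current == null) {
            current = new Bar();
            current.open = current.high = current.low = current.close = price;
        }
        current.high = Math.max(current.high, price);
        current.low = Math.min(current.low, price);
        current.close = price;
        if (++tickCount == ticksPerBar) {
            bars.add(current);
            current = null;
            tickCount = 0;
        }
    }

    public List<Bar> completedBars() {
        return bars;
    }
}
```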

At the portfolio level, we use a series of proprietary filters that examine volatility, correlation and other performance-based statistics to turn the underlying models, as well as the actual currency pairs being traded by those models, 'on' and 'off' throughout the trading day. For example, certain volatility conditions create more favourable trading environments for some of our models, so when these conditions are met, some of our models will be activated while others will be shut off.
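
A hedged sketch of this kind of activation filter, assuming (our assumption, not Plimsoll's actual logic) that a strategy is switched on only while short-term realised volatility sits inside a configured band:

```java
import java.util.LinkedList;

/**
 * Sketch of a volatility-based activation filter: a strategy is active only
 * while realised volatility over a rolling window lies inside a band.
 * The window and thresholds are illustrative assumptions.
 */
public class VolatilityFilter {

    private final int window;
    private final double minVol, maxVol;
    private final LinkedList<Double> returns = new LinkedList<Double>();
    private double lastPrice = Double.NaN;

    public VolatilityFilter(int window, double minVol, double maxVol) {
        this.window = window;
        this.minVol = minVol;
        this.maxVol = maxVol;
    }

    /** Feed each new price; returns true while the strategy should be active. */
    public boolean onPrice(double price) {
        if (!Double.isNaN(lastPrice)) {
            returns.addLast(Math.log(price / lastPrice));
            if (returns.size() > window) {
                returns.removeFirst();
            }
        }
        lastPrice = price;
        if (returns.size() < window) {
            return false; // not enough data yet: stay off
        }
        double vol = stdDev(returns);
        return vol >= minVol && vol <= maxVol;
    }

    private static double stdDev(LinkedList<Double> xs) {
        double mean = 0.0;
        for (double x : xs) mean += x;
        mean /= xs.size();
        double var = 0.0;
        for (double x : xs) var += (x - mean) * (x - mean);
        return Math.sqrt(var / xs.size());
    }
}
```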

Tom Parry

How do the underlying strategies combine to deliver performance to investors?

At the highest level we have what we call a 'meta' or master strategy, which controls Armada in terms of risk, positions and exposure limits. It also allocates capital and opportunities between the various sub-strategies. For example, if two of our strategies both generated signals to buy 10m EUR/USD at the same time but there was only 10m available, the meta-strategy would allocate the available liquidity to the strategy with the higher expected risk/return ratio and then attempt to work the other 10m for the second strategy for the remainder of the signal's duration. The meta-strategy also has a number of other filters that look at correlations between the performance of the underlying strategies and currency pairs being traded because, from a diversification angle, there's no point in running ten different strategies if they're all buying EUR and selling JPY at the same time.
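
A simplified sketch of that allocation decision; the field names and the greedy best-ratio-first rule are assumed by us for illustration:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/**
 * Sketch: when several sub-strategies want the same liquidity, hand the
 * available amount to the signal with the highest expected risk/return ratio
 * first; whatever is left over must be worked. Illustrative only.
 */
public class MetaAllocator {

    public static class Signal {
        final String strategy;
        final double quantity;           // requested amount in base currency
        final double expectedRiskReturn; // expected return per unit of risk
        double allocated;

        Signal(String strategy, double quantity, double expectedRiskReturn) {
            this.strategy = strategy;
            this.quantity = quantity;
            this.expectedRiskReturn = expectedRiskReturn;
        }
    }

    /** Allocates the displayed liquidity to the best signals first. */
    public static void allocate(List<Signal> signals, double availableLiquidity) {
        Collections.sort(signals, new Comparator<Signal>() {
            public int compare(Signal a, Signal b) {
                return Double.compare(b.expectedRiskReturn, a.expectedRiskReturn);
            }
        });
        double remaining = availableLiquidity;
        for (Signal s : signals) {
            s.allocated = Math.min(s.quantity, remaining);
            remaining -= s.allocated;
        }
    }

    public static void main(String[] args) {
        Signal a = new Signal("stat-arb", 10e6, 1.8);
        Signal b = new Signal("momentum", 10e6, 1.2);
        allocate(Arrays.asList(a, b), 10e6);
        System.out.println(a.strategy + " gets " + a.allocated + ", "
                + b.strategy + " gets " + b.allocated + " (remainder to be worked)");
    }
}
```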

We've also set up our EMS with an internal crossing mechanism that allows us to leave resting limit orders on our servers and look for opportunities for natural crosses occurring within our internal order flow, as opposed to sending everything out to external execution venues. For example, if two strategies are simultaneously buying and selling the same currency pair, we can just meet at the mid point so neither strategy has to pay the spread, we don't have to worry about the market moving on us, and we save the commission.
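
A minimal sketch of such an internal crossing check; the order representation and matching rule are illustrative assumptions rather than the EMS's actual mechanism:

```java
/**
 * Sketch of an internal crossing check: before routing an order out, look for
 * a resting opposite-side order in the same pair from another internal
 * strategy and, if found, cross both at the mid price.
 */
public class InternalCross {

    public static class Order {
        final String strategy;
        final String pair;
        final boolean isBuy;
        final double quantity;

        Order(String strategy, String pair, boolean isBuy, double quantity) {
            this.strategy = strategy;
            this.pair = pair;
            this.isBuy = isBuy;
            this.quantity = quantity;
        }
    }

    /** Returns the cross price if the two orders can be matched internally, or NaN otherwise. */
    public static double tryCross(Order a, Order b, double bid, double ask) {
        boolean samePair = a.pair.equals(b.pair);
        boolean oppositeSides = a.isBuy != b.isBuy;
        if (samePair && oppositeSides) {
            return (bid + ask) / 2.0; // both sides fill at mid: neither pays the spread
        }
        return Double.NaN;            // no internal match: route to an external venue
    }
}
```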

What level of human direction is there in Armada's meta-strategy?

Although all of Armada's signal generation and execution is entirely automated and systematic, human discretion still plays a role in the strategy in that we always have someone monitoring the strategies to ensure that they are running properly. This monitoring is necessary to ensure that the actual execution and order placement is carried out effectively, or to override the signals and turn the models off in extraordinary circumstances or market conditions. For example, discretion may be used to veto a false signal triggered by some type of systemic liquidity shock or to improve execution in fast-moving or illiquid markets. However, under no circumstances is discretion ever used to enter trades or override the strategy's risk management parameters.

The only other form of human input to Armada's trading process comes in the form of daily updates to each strategy's risk allocation at the end of each trading day. If a strategy starts with a daily allocation of $1m, throughout the day it will get an update based on its P&L, and the next day's allocation is based on the previous day's performance. We've been looking into some new continuous optimisation techniques that would reoptimise the portfolio's allocations to each of its underlying strategies based on some kind of real-time performance metric, but have found that most of these techniques are quite computationally intensive, and we would rather spend the CPU cycles in other areas. This type of optimisation technique also introduces the risk of parameter misspecification errors, which could ultimately reduce performance. Finally, all of the actual ideas that the models are based on come from human input, as opposed to using some type of genetic algorithm to create strategies.
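
A trivial sketch of that end-of-day update, assuming (our assumption, not Plimsoll's exact rule) that the next day's allocation is simply today's allocation adjusted by the day's P&L:

```java
/**
 * Sketch of the end-of-day risk allocation update described above.
 * The adjustment rule and floor are illustrative assumptions.
 */
public class DailyAllocation {

    /** Next day's allocation: today's allocation plus today's P&L, never below zero. */
    public static double nextAllocation(double todayAllocation, double todayPnl) {
        return Math.max(0.0, todayAllocation + todayPnl);
    }

    public static void main(String[] args) {
        // A strategy that starts the day with $1m and makes $50k trades with $1.05m tomorrow.
        System.out.println(nextAllocation(1000000, 50000));
    }
}
```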

What are the key elements of Armada's technology platform?

All of Armada's strategies have been built on top of our recently installed execution management system from Aegis Software. We're using AthenaTrader as our front-end GUI, Athena Gateway Server as our FIX engine and Athena Pricing Server for market data. One of the biggest factors in our decision to go with Aegis was that they were one of the few vendors able to offer us a complete automated/algorithmic trading solution, so we were able to avoid delays related to having to integrate components from various third parties.

We also found AthenaTrader unique in that it provides an unusual balance of customisability and out-of-the-box functionality while still maintaining a very high level of usability. For example, the software offers enough ways to slice and dice an order to make the average trader dizzy, but the functionality is very easy to configure through the various GUIs they offer. Another big factor in choosing Aegis was development time; we felt we could roll out the strategies fastest with Athena. Ultimately, their system's performance was just better and more scalable than many of the other firms we were looking at, partly because Aegis are used to serving some of the highest-volume buy- and sell-side firms in the equities and derivatives markets.

The only other software we really use is a set of in-house backtesting and Monte Carlo simulation tools that we built in Matlab/Simulink and integrated into Athena via Matlab's Java API. We've had virtually no integration issues because we've been able to use pre-built plug-and-play connections from Athena into the banks and trading venues. In terms of hardware, we're running everything on 64-bit Red Hat Linux servers, each with two quad-core Pentiums. All of our routers and switches are from Cisco. All the Aegis components, as well as the actual trading strategies, are written in Java 1.5.

What technology developments are planned?

Looking down the road six months or so, the two technologies that we might consider adding to our current infrastructure would be some type of complex event processing or event stream processing engine, or some type of hardware-based acceleration such as FPGAs (field programmable gate arrays). Although both would be extremely exciting to work on and would likely have a significant effect on our ability to reduce latencies in our trading processes, we'll most likely spend the majority of our development resources further optimising and tuning what we have already built in AthenaTrader. At this point, I can't justify spending the time/money on technologies to save microseconds when there are still milliseconds that we can eliminate with more efficient coding practices or data structures. However, once we hit the wall with what we can do with code alone, we will be looking at these technologies very closely to see how and where they can help us to further reduce latency in our trading processes. Aegis also offers a lot of advanced QA simulation/testing tools that we've started looking at, which will allow us to perform some more extensive profiling and regression testing of our current systems.

What are the core principles you follow when developing automated models?

Above and beyond anything else, a trading model has to make sense on some kind of intuitive level, as opposed to just being able to generate statistically significant returns. If you can't explain the model without immediately resorting to numbers, there's a very good chance that you don't fully understand the market behaviour you're trying to exploit, and have probably also overlooked some key explanatory variables as well.

The only other real principle we try to follow is to steer clear of what other people in the market might be using. That's not to say you shouldn't be aware of what they're doing, why they're doing it and the implications of their actions. However, in order to create and maintain some type of edge that enables you to earn excess returns, you can't simply be following what other people are doing or have already done.

What are the key steps when developing a new model?

We generally start with a quick exploratory analysis of the data in Matlab. At this initial stage we're more concerned with how well our model fits the particular market behaviour we're trying to capture, or its ability to identify some non-random price pattern, than with profitability and risk/return characteristics. If the model looks like it does a good job of explaining/predicting the underlying data, we run some simple backtests to check if it's profitable. If we like what we see, we start looking at what value it adds at the portfolio level. If the model is adding significant value at the portfolio level, we'll perform some more intensive Monte Carlo simulations to see how robust it is. If the model gets to this stage and is still attractive, we'll start re-coding it in Java so it can be implemented in AthenaTrader. From there we run it in real time for a couple of weeks to a month to monitor performance before it's rolled out into a market. One of the other reasons we chose Athena is that within a single platform we can build a model, run it, then flick a switch from 'test' to 'real' destinations and the model is live.
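
As a rough illustration of the Monte Carlo step in that pipeline (the bootstrap-resampling scheme below is our own assumption, not Plimsoll's methodology), one might resample historical trade returns many times and examine the resulting distribution of maximum drawdowns:

```java
import java.util.Random;

/**
 * Sketch of a Monte Carlo robustness check: historical trade returns are
 * resampled with replacement to build many synthetic equity curves, and the
 * distribution of their maximum drawdowns is examined. Illustrative only.
 */
public class MonteCarloRobustness {

    /** Returns the max drawdown of each of 'paths' bootstrapped equity curves. */
    public static double[] drawdownDistribution(double[] tradeReturns, int paths, long seed) {
        Random rng = new Random(seed);
        double[] maxDrawdowns = new double[paths];
        for (int p = 0; p < paths; p++) {
            double equity = 1.0, peak = 1.0, maxDd = 0.0;
            for (int i = 0; i < tradeReturns.length; i++) {
                double r = tradeReturns[rng.nextInt(tradeReturns.length)];
                equity *= (1.0 + r);                       // compound the resampled return
                peak = Math.max(peak, equity);             // track the running equity peak
                maxDd = Math.max(maxDd, (peak - equity) / peak);
            }
            maxDrawdowns[p] = maxDd;
        }
        return maxDrawdowns;
    }
}
```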

We tend to stick with this development process as it allows us to explore a large number of potential ideas fairly quickly and does a good job of eliminating bad ideas before we waste too much time on them. This is extremely important since a number of high-frequency strategies are not particularly long lived; they might have a life span of 3-4 months, so it's essential to be able to quickly identify the non-random price action, build a model to exploit it, test it and ultimately deploy it before the inefficiency is discovered by other practitioners or the underlying trend or correlation starts to change or fade.

How long does it take to develop a strategy?

We generally try to keep the development time of new strategies to 45-60 days, but this can vary for a number of reasons. For example, in some cases it can be as short as a few days if all we're doing is combining some existing analytics or models into a new strategy. Perhaps the most important factor to consider when allocating development resources to a new strategy is its scalability (capacity). Obviously a strategy that has a capacity of $100 million should receive significantly more development time than a strategy with similar return characteristics that could only handle $10 million.

In terms of actual time, it usually takes about a week or two (40-80 man hours) to complete the initial exploratory analysis and then anywhere from one to four weeks (40-160 man hours) to recode everything in Java, depending on the amount of custom code and the complexity of the underlying algorithms. Once we have everything built and running on AthenaTrader, we generally do up to a month of live testing.

During this live testing phase, we're looking for any signs of potential failure before the model starts trading real money. The live testing phase also serves as a sanity check. At the end of the trading session, we re-run the model with the market data collected over the session to look for any deviations in behaviour or returns that would cause us to question the backtesting results. This check has become somewhat redundant for us as our backtesting environment is essentially an exact replica of our real-time trading environment, but I think this step is critical for anyone who builds/tests their model in a different application or environment than the one they will be trading it in.

What code do you use when developing models?

I realise that it's probably impossible to convince some people that Java will ever be as fast as something written in C/C++ (I was one of them until a few months ago), but I think they would be very surprised at how small the differences between Java and native code are getting, particularly given recent improvements such as JIT compilation and the new real-time specification for Java (RTSJ). These days, the latency and performance edge is determined more by the proficiency of the programmer and the development methods/tools being used than by the 3GL language itself. Another key factor is the ultimate development time; you might be shaving microseconds off performance here and there by using native code, but if you're taking twice as long to write it, the trading opportunity might have been arbitraged out of the market because other participants were able to implement their models faster.

What role do agent-based simulation techniques play in backtesting?

One way we've been able to circumvent the lack of order flow (time-of-sales) data is to use some of the techniques from agent-based simulation to create populations of hundreds of strategies that we then run against our market-making models to determine when - and in which direction - our market-making strategies' prices would be getting hit. A very simple application of this would be to create a population of agents that use a moving average crossover strategy to generate their trading signals, varying the type of moving average and the period length. We'd then create a simulated market environment populated with those agents and see how they would interact with our strategy (i.e. what the resulting order flow would be). The real challenge is to come up with an algorithm that matches the distribution of order flow in the market with your population of strategies. Then you can create populations of these agents (as synthetic traders) that are essentially trading at the same times and prices as the historically aggregated data. From there you can anticipate where, and from which direction, you're going to get hit on passive orders, as well as where other people's signals are going to trigger while you're running a market-making strategy. Also, by knowing what types of strategies are in the market and how they react to price, you can adapt your own strategy's behaviour based on where you anticipate other people's strategies are going to go, e.g. by raising your offer a couple of pips when people are looking to buy.
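
A heavily simplified sketch of such an agent population, assuming (our assumption) plain simple-moving-average crossover agents whose net signals are summed into a synthetic order flow:

```java
import java.util.LinkedList;

/**
 * Very simplified sketch of the agent population described above: each agent
 * trades a moving-average crossover with its own fast/slow periods, and the
 * sum of their trades at each tick gives a synthetic order flow estimate.
 * Parameter choices are illustrative.
 */
public class CrossoverAgent {

    private final int fastPeriod, slowPeriod;
    private final LinkedList<Double> prices = new LinkedList<Double>();
    private int position; // +1 long, -1 short, 0 flat

    public CrossoverAgent(int fastPeriod, int slowPeriod) {
        this.fastPeriod = fastPeriod;
        this.slowPeriod = slowPeriod;
    }

    /** Feed a price; returns +1 if the agent buys on this tick, -1 if it sells, 0 otherwise. */
    public int onPrice(double price) {
        prices.addLast(price);
        if (prices.size() > slowPeriod) {
            prices.removeFirst();
        }
        if (prices.size() < slowPeriod) {
            return 0; // not enough history yet
        }
        double fast = mean(prices, fastPeriod);
        double slow = mean(prices, slowPeriod);
        int desired = fast > slow ? 1 : -1;
        int trade = desired - position; // order flow generated by switching position
        position = desired;
        return Integer.signum(trade);
    }

    private double mean(LinkedList<Double> xs, int lastN) {
        double sum = 0.0;
        int skip = xs.size() - lastN;
        int i = 0;
        for (double x : xs) {
            if (i++ >= skip) sum += x;
        }
        return sum / lastN;
    }

    public static void main(String[] args) {
        // A tiny population: in practice hundreds of agents with varied parameters.
        CrossoverAgent[] agents = {
            new CrossoverAgent(5, 20), new CrossoverAgent(10, 50), new CrossoverAgent(20, 100)
        };
        double[] prices = { /* historical ticks would go here */ };
        for (double p : prices) {
            int netFlow = 0;
            for (CrossoverAgent a : agents) {
                netFlow += a.onPrice(p);
            }
            // netFlow approximates which side of a market-making strategy would be hit.
        }
    }
}
```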

There are so many different ways to come up with that population - we've typically been using genetic algorithms so far - but the real key is coming up with algorithms to select which strategies will mimic what's going to happen in the market. You can never be exact, but some of them are surprisingly close, accounting for 70-75 per cent of the estimated order flow from a population as small as 200 individual strategies.

How do you develop algorithms for different FX pairs?

As well as differences in volatility and liquidity, FX pairs have different time-of-day effects; there's more liquidity in the evenings (US Eastern Standard Time) for yen, for example, while trading in European pairs will thin out overnight. We need to incorporate these daily fluctuations into our algorithms, as well as any changes in the correlations between the pairs, so you need to figure out whether you can get into a cross more cheaply by trading each leg synthetically. Importantly, each ECN has different matching logic and rules. For some venues that allow hidden/iceberg orders, the hidden portion gets time priority based on when it was entered into the queue, unlike in the equity markets. Another key difference is that most of the FX ECNs have not implemented a 'modify' order type/option; they just allow you to send a cancel/replace, which means that if you want to change a portion of the trade, you lose time priority on everything. If you have an order for $10m you might be better off splitting it into ten $1m orders so that you can selectively go in and cancel three or four without losing your place in the queue for the others.
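
A minimal sketch of the order-splitting idea in that last sentence; the helper names and child size are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch: a $10m parent order is entered as ten $1m child orders so that a
 * few can later be cancelled without the remaining children losing their
 * place in the queue (a cancel/replace on one large order would forfeit
 * time priority on the full amount).
 */
public class ChildOrderSplitter {

    /** Splits 'parentQuantity' into equal child orders of at most 'childQuantity'. */
    public static List<Double> split(double parentQuantity, double childQuantity) {
        List<Double> children = new ArrayList<Double>();
        double remaining = parentQuantity;
        while (remaining > 0.0) {
            double child = Math.min(childQuantity, remaining);
            children.add(child);
            remaining -= child;
        }
        return children;
    }

    public static void main(String[] args) {
        List<Double> children = split(10000000, 1000000);
        // Cancelling three of the ten children reduces exposure by $3m while the
        // other seven keep their original queue positions.
        System.out.println(children.size() + " child orders");
    }
}
```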

You need to be explicitly aware of any differences in matching logic and order prioritisation rules employed by your execution venues, both multi- and single-bank platforms. For example, some banks allow you to submit limit orders and some don't; there are also a number of differences in the rules regarding the modification of resting limit orders (if they are too close to the market) and the actual triggering of trades on limit orders.

What is the next stage of development in Plimsoll's use of automated trading?

We'll continue to expand the use of execution algorithms in the Headwind strategy. And for Armada itself, we'll be looking at adding more specific market-making strategies for some of the different crosses outside the mainstream currencies, particularly emerging markets crosses. One of the main areas we're hoping to do some new work in next year is developing new data visualisation techniques for exploring ultra-high-frequency data, to help us identify possible trading opportunities or price patterns faster. Advanced data visualisation could be one of the next big trends we see in the industry. As systems continue to lose the ability to differentiate themselves on performance or latency, they will need some new method of attracting customers. As you use more and more data you need to be able to look at it in many different ways, but also consolidate it in a single chart. To me, 3-D visualisations on GUIs are potentially a good way for vendors to stand out from the crowd, but ease of use is also important; you don't want it to take six months for a new guy to learn how to use an interface.