
Me and My Machine: The Automated Trader Interview

Published in Automated Trader Magazine Issue 15 Q4 2009

For many in the world of automated trading, low latency is critical to their trading performance. By contrast, for Florida-based III, it’s just one component among many. Automated Trader’s Editor, William Essex, spoke to David Allen, III’s portfolio manager, about this. While they were at it, they also discussed “appropriate automation”, artificial intelligence, data interpolation, integrating fundamental data, data maintenance, synthetic markets, in-house versus outsourced development, and avoiding the herd - among other things…

III, comprising III Associates and their affiliate III Offshore Advisors, is an SEC-registered investment adviser, founded in 1982. III has over $1 billion in total assets under management with a primary focus on fixed income and credit related strategies. David Allen, portfolio manager, and Eugenio Perez, portfolio manager/trader for III, developed the III Futures Neural Network, a system which employs adaptive pattern recognition to trade liquid futures markets. Both Allen and Perez come from traditional physical commodity market backgrounds, having served as risk managers and traders for bank, fund and merchant utility desks, with 35 years of collective experience between them.

William Essex: Let's start with the history. David, how did your division evolve, and where do you fit in?

David Allen: III is primarily fixed-income oriented and has been for almost the entirety of its 27 years in business. Self-evidently, the principals gravitated towards the fixed income strategies that they feel most comfortable with. These are predominantly the interest-rate and credit arbitrage strategies where there are well-understood and time-tested relative-value relationships.

I joined in the summer of 2006 from Barclays with the mandate of building out a commodity capability across the firm. That meant adding the diversifying element of an uncorrelated asset class and expanding the firm's reach to investors looking for exposure to commodities like energy, grains, metals and softs. We also were tasked with adapting some of the firm's fixed income relative value trading concepts to commodity trading. This then evolved into research and development of sophisticated quantitative trading strategies.

William Essex: In doing this have you been able to leverage a lot of the existing internal resources and relationships at III?

David Allen: Indeed. III's long-standing relationships in the industry, proven infrastructure and systems, and the fact it has credit documentation in place with numerous counterparties undoubtedly facilitated our commodity-related activities. It means that we are now well-positioned to quickly provide very specific tailored solutions to the challenges that we or our clients may be facing.

William Essex: When you joined up from Barclays Capital, had you done much on trading-system automation? Had you already got an interest in that area?

David Allen: Eugenio Perez and I met while I was still at Barclays Capital, so by virtue of that I had already been exposed to the brand of systematic trading technology that we are using now. Part of the mandate when I moved here was to explore this concept of systematic trading using some proprietary technology that Eugenio had developed. Together we began applying this technology to futures markets in 2006.

My prior exposure to systematic trading also involved a significant technology element. When I started out in the early to mid nineties I was trading for what is now Goldman Sachs. I spent quite a bit of time there when floor technology, at the specialist level, was just starting to come in. So too were all of the ECNs that were routing equity and equity-derivative flow. This drew me into the technological side of trading rather than just execution.

William Essex: When you arrived at III, was your approach to look to systematise and automate as soon as possible, or was it to get some basic trading strategies working first?

David Allen: It was a combination of the two. We are a large enough group, with roughly 100 people, that we feel we have a big enough critical mass to develop a lot of our own applications and a lot of our own tools.

I felt pretty well equipped even in the first couple of months of being here. I could get anything that I needed, in terms of extracting historical data, plus live feeds for current tick data, as well as obtaining fundamental data, such as storage levels and weather data.

My objective was to take existing processes and make them faster and more profitable by wrapping them with decision-support technology. That could be anything from a good risk-visualisation tool, all the way up through much more complex code that automates routing of order flow. Given the strength of the technology function here, there was also the opportunity to improve existing processes. However, true automation only really happened after Eugenio and I on-boarded the systematic program that we've been trading for a couple of years now.

William Essex: David, can you tell us a little in general terms about the development of your trading strategies and their automation?

David Allen: The first point to emphasise is that there's no replacement for hard work and persistence. The second is that automation does not have to be all or none; there are degrees of automation that can be easily adopted and will help save time and/or generate revenue. It's important, especially in commodities markets, to see signal and execution automation as part of the larger infrastructure build. This helps with scalability and front-office tool design later on.

We prefer to focus on intelligence over speed when it comes to strategy design. There are some great pieces of technology out there and teams of people using them. This is nothing new; using automation to enable systematic trading has enjoyed a recent revival due to market conditions but it has been around for a long time. We think it should always be a component of a diversified approach.

William Essex: One thing I've noticed with a lot of systematic trading firms is that their emphasis frequently tends to be quantitative or technical, and they almost always use price or volume in some form as raw inputs. It is only comparatively recently, such as with machine-readable news, that they have really tried to incorporate more fundamental data into a systematic model. Were you thinking in that direction? Was that in the back of your mind?

David Allen: I've never been an enormous fan of the purely technical approach. I fully recognise that it has been successful for a number of fund managers and it is clear that, particularly during certain periods of time (like portions of last year), trend following for example works very well. But it feels to me that it is just one part of trying to model market movements.

Think about this: When you say "the market", sometimes what that really means is in essence just a bunch of people who are loosely connected electronically - nothing much more organised or formal or complex than that. Trying to wrap your brain around why things behave the way they do sometimes becomes very difficult because of the human involvement. It's fine to use a complex engine for decision support around trading, particularly if it's something you design yourself and you believe in. But it is not always necessarily the case that you need to make the process of using it overly complex. The CTAs who use trend following alone are probably thinking the same way, actually. The simplest solution can be the most elegant, sometimes.
Simple, though, does not mean incomplete. For instance, where the purely trend-following approach falls down is that there is no longer an information gap. It is no longer so hard to disseminate accurate price or volume data to many people at once. Market psychology and resultant technicals are important, but to ignore the fact that, for example, every Wednesday the world receives crude-oil and product inventory data is to miss out on a big portion of what moves prices. Omitting fundamental data on weather, conditions, sentiment, inventory and the like would leave one's analysis incomplete.

So we decided that we needed to be able to bring in fundamental data as well as market price and technical, to be able to complete the picture. Even if at times it looks like the markets are disconnected from their fundamentals, you still need to be capturing that data and trying to discern the signal content within it. We use technology to assist us in that process.

William Essex: When you do that, how do you approach the problem? Let's take the crude inventory number as an example. In terms of your modelling, do you tend to think: "The number we anticipated would be X, but it's Y, therefore our model will do Z"? Or do you tend to think the other way? i.e. "Here is the number. Everybody was expecting X. But Y has actually happened. You would therefore expect the market to respond by doing Z, but in fact it's doing A, which is not conventionally logical. Therefore, there is an additional piece of information there that I am interested in and our model will trade upon?"

David Allen: It's definitely more the latter. There are folks that develop predictive models for things like crude oil and product storage. They do well from that and you can see the results of those models on Bloomberg every week. One can trade on these projections/predictions. We are not really part of that process at all.

At III, survival of the fittest doesn't just apply to genetic algorithm populations...Eugenio Perez in action

We would rather do more of what you mentioned but with a little bit of a twist. In your example, you suggested that a number will come out and the market will have a consensus as to where they think it should be. Perhaps the actual, realised data is far higher than where the market predicted. So obviously you would expect that, with all other things being equal and with no other exogenous issues applying, the market should sell off because supply is greater than anticipated.

It's never that simple. We know that. Having stared at the market for fifteen years, our thinking is that very often, when the numbers first come out, traders' initial reaction is wrong. They end up taking the market substantially the other way, only to reverse a little later that day. Why? I think it is because these are not linear problems. These are very non-linear problems against the backdrop of time, movements in other asset classes (especially FX and equities nowadays), weather, geopolitics and changing dynamic relationships between all the contributory factors.

William Essex: That sort of non-linear problem would seem to suggest some sort of artificial intelligence technology is appropriate. Yet I've noticed over the years that this only appears to enjoy very intermittent adoption. What's your view on that?

David Allen: I think the idea of using such technology to automate some of these non-linear optimisations is hardly new, but its popularity does seem to ebb and flow. I think that is to some extent a product of human nature. When people go through a period of enormous P&L volatility and heavy losses when using established strategies in areas such as stat arb, convertible arbitrage or pairs trading, they resort to their worst human biases. They tend to make decisions based on extreme risk aversion as opposed to logical thought processes. People's expectations can impact their perception. We as humans also have a tendency to search for or interpret information in a way that helps confirm our preconceptions - and when these behaviours happen en masse, it causes a lot of trouble.

Traders then start casting around for something else and wonder why traders using, say, a systematic program seem to be faring better than them. Then they'll spend maybe a year unsuccessfully trying to get to grips with it. By which time they'll probably think it's appropriate to go back to using the techniques they were using in the first place… The fact is, though, that these systematic approaches have been around for a long time and at the very least don't suffer the same human biases mentioned above.

William Essex: With any AI techniques you're actually using, are you building the technology in-house, or are you using some sort of external components and building around them?

David Allen: Any of the technology that we are using that helps humans make better decisions or to solve problems systematically is developed in-house. I can't say that it isn't assisted by other frameworks, platforms and databases, but we tend to shy away from using anyone else's front end. We always try to control the application through an interface and then control that in turn through our own code.

We don't prefer one vendor over another; we are very agnostic in that way. But in my experience, you are making a deal with the devil when you buy a product that requires heavy maintenance and is based on a proprietary language or similar. You are then beholden to the vendor not only for upgrades and normal product maintenance but also even for minor changes. We want to avoid becoming caught in that loop.

It's far better for us to do all of our own custom reporting and for us to write the kind of code that takes control of execution algorithms. We're huge supporters of the technology vendors out there that are creating technology which is open enough, and has enough APIs written to it, that we can use it to suit ourselves. We can then customise it without having to go back to them every week for trivial changes.

William Essex: How do you deal with the task of data maintenance? While you can obviously automate some of this, it seems that over-automation can have fairly disastrous consequences?

David Allen: I suspect when people read an article like this they are looking for some blinding insight, so my saying: "we work extremely hard at cleaning our data and deploying our technology diligently and intelligently", is probably not what they want to read. However, it is the simple truth; for us, it is absolutely essential that we operate with the cleanest and most dense data - we are always keenly aware of the "garbage in/out" equation.

You can automate a lot of the process around feeding data to the trading engine, but it is vital to avoid the trap of over-automating the process of scrubbing, normalising, and de-seasonalising commodity data. For example, you might understandably code an automated cleaning rule that deletes all negative price points and interpolates a synthetic value between the two adjacent positive price points. Except that in markets such as US power an instrument can genuinely trade at a negative price for short periods of time. Introducing errors of that magnitude through blind over-reliance on technology has predictable and unfortunate consequences.
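As a rough illustration of the kind of market-aware cleaning rule Allen describes, the short Python sketch below flags suspect ticks as missing and interpolates between the surviving neighbours, but only treats negative prints as errors in markets where negative prices cannot genuinely occur. The market codes, thresholds and example values are illustrative assumptions for this sketch, not III's actual rules.

import numpy as np
import pandas as pd

# Markets where negative prints can be genuine (e.g. US power hubs), so a
# blanket "delete all negatives" rule would destroy real information.
# These market codes are illustrative placeholders.
NEGATIVE_PRICE_OK = {"PJM_RT_POWER", "ERCOT_RT_POWER"}

def clean_prices(prices: pd.Series, market: str) -> pd.Series:
    """Flag suspect points as missing, then interpolate between neighbours."""
    cleaned = prices.astype(float).copy()
    cleaned[cleaned == 0.0] = np.nan        # zero prints treated as missing everywhere
    if market not in NEGATIVE_PRICE_OK:     # only drop negatives where they cannot be real
        cleaned[cleaned < 0.0] = np.nan
    return cleaned.interpolate(method="linear", limit_direction="both")

ticks = pd.Series([52.1, 51.8, -3.0, 52.4])
print(clean_prices(ticks, market="PJM_RT_POWER"))   # the -3.0 print survives
print(clean_prices(ticks, market="WTI_CRUDE"))      # the -3.0 print is interpolated away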

We always try and present the data in a consistent fashion that is easy for our models to read because the majority of our models are complex in terms of their engines but not as regards their decision-making. We try to keep things simple.

No data source is always going to be perfect. You are going to have to deal with real world factors, such as data that is restated a week after initial release, or data points way out of range, or completely absent. Inevitably situations arise where the data set looks like Swiss cheese and you have to figure out how best to manage it. Within the confines of the models that we use for systematic/automated trading, we occasionally find that the data is so poor that we have to conclude that trading that market just isn't appropriate given our requirements for input data quality, acceptable return and risk management.

William Essex: Moving on from data to the systematic and automation side; is it purely futures markets where you would automate the execution process or are there others too?

David Allen: It isn't necessarily restricted to futures. We manage strategies that are futures only, but also strategies that include OTC swaps and options and other derivatives, though we don't currently trade physicals, nor do we intend to.

William Essex: Where do you stand on the latency and latency arb question?

David Allen: We certainly don't want to be slow, but latency is not necessarily the primary factor that drives whether a trade is attractive to us. More often than not it's the potential edge we can achieve in intelligence or in interpretation of signal that attracts us.

The battle over microseconds was never really something that attracted us a great deal. We have done that type of trading in the past and it was interesting while it was around, but while these brief dislocations had an opportunity window, that window may have closed - mostly through technology but also through liquidity and potentially regulation. The fight over latency arbitrage seems to me to be a short-lived one, and the rewards are restricted enough that it is not a scalable approach in commodity markets. I would rather build decision support around more intelligent trades than necessarily faster ones.

William Essex: So do you see synthetic pairs and structures as a major opportunity area?

David Allen: That is definitely an area that intrigues us, for sure. The idea of replicating or synthetically creating something out of similar, or the same, fungible components; we believe that kind of creativity has greater longevity than just low-latency arb. I'd rather be smarter about the design of a higher-order spread and beyond that, I'd rather be inventing good decision logic that goes into trying to recognise how relationships between a commodity's underlying factors change over time.

William Essex: With a synthetic approach, you invariably have to consider how long you will be able to enjoy a particular inefficiency before someone else spots it. You can of course make your synthetic so incredibly complex that no one else ever finds it, but then you run the risk that your execution costs obliterate any profit. And whichever route you take you still have to consider how the relationships in your synthetic instruments change over time, so whichever way you go, you seem to be chasing a cycle. How do you deal with that?

David Allen: This presents two very interesting issues for us. The first one relates to the more automated product where we are trying to leverage some of our own in-house technology to recognise patterns in data and trade an associated predictive and adaptive model. For the purpose of trying to find true signal in commodity market data, which has its own unique characteristics and market conventions, what we have found is that using evolution to develop signals is an advantage. The genetic algorithms we use need to understand that the world is constantly changing and that there are inter-relationships between commodities that are also constantly shifting.
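By way of illustration only, a minimal genetic-algorithm loop of the general kind Allen is describing might evolve a population of parameterised signal rules by selection and mutation, re-running the evolution as conditions change. The rule form (a simple moving-average crossover), the toy fitness measure and the parameters below are assumptions made for this sketch and are not a description of III's model.

import random
import statistics

def make_individual():
    # Gene: (fast moving-average window, slow moving-average window)
    fast = random.randint(2, 20)
    slow = random.randint(fast + 1, 100)
    return (fast, slow)

def mutate(ind, rate=0.3):
    fast, slow = ind
    if random.random() < rate:
        fast = max(2, fast + random.randint(-2, 2))
    if random.random() < rate:
        slow = slow + random.randint(-5, 5)
    return (fast, max(fast + 1, slow))

def fitness(ind, prices):
    """Toy fitness: mean over stdev of the returns of a crossover rule."""
    fast, slow = ind
    rets = []
    for t in range(slow, len(prices) - 1):
        fast_ma = sum(prices[t - fast:t]) / fast
        slow_ma = sum(prices[t - slow:t]) / slow
        position = 1 if fast_ma > slow_ma else -1
        rets.append(position * (prices[t + 1] - prices[t]))
    if len(rets) < 2 or statistics.pstdev(rets) == 0:
        return float("-inf")
    return statistics.mean(rets) / statistics.pstdev(rets)

def evolve(prices, pop_size=30, generations=20):
    population = [make_individual() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda ind: fitness(ind, prices), reverse=True)
        survivors = ranked[: pop_size // 3]                # selection
        children = [mutate(random.choice(survivors))       # mutation
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return max(population, key=lambda ind: fitness(ind, prices))

Re-running such a loop on fresh data, rather than fixing the surviving rule forever, is the "re-evolve rather than hard-code" point Allen returns to later in the interview.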

Being aware of the herd mentality is important, but more important to us are the changes that are going on outside our model in the real world. It's more important for us to be able to interpret how those changing rules are impacting the market. So using an adaptive, non-hard-coded but learned approach is much more important to us. It will avoid the commonplace mistake of creating a hard rule and trading it religiously for thirty years.

I like your example about chasing the cycle, having ever-more complex spreads to avoid everyone squeezing the last little bit of margin out of them. I don't see any way around that other than constantly trying to look at more, and different, and better. That could mean higher- and higher-order implied spreads with multiple legs, which gets a bit absurd because of the transaction cost, as you said. The other thing it could potentially mean is looking at executing them across many platforms. The more liquidity providers making markets in a product, the better it is for that style of trading.

William Essex: What is your philosophy on the time-frame of trading? If you are working with adaptive models, people tend to say, X relationship exists between, for example, one of the simple calendar commodity spreads. What they are implicitly telling you is that their claimed relationship exists on a daily chart. But in reality, you can find loads of different relationships in the same spread in different time frames. Take a mean-reverting pairs strategy. You can have so many different "means" in so many different time frames. You can have so many different trigger levels of standard deviation before they revert, or don't revert, or become something entirely different anyway. How do you deal with that?

David Allen: This goes back to relationships. It is definitely not enough, and indeed naïve, to have just noticed the correlation between two underlying factors behaving in a certain way over a certain period of time, and then to apply that simple correlation analysis to trading. These are multi-dimensional problems that do not lend themselves to simplistic analysis.

If you notice that crude and gasoline right now are trading at a relationship that looks relatively cheap or rich, depending on how you interpret the data, you need to ask yourself some questions. Is it richness relative to the rising price of crude, or richness based on the equity market? Is it richness or cheapness relative to competitor products, inputs or history? It's time sensitive too; these things need to be normalised so you can actually compare them on an apples-to-apples basis.
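One small, hypothetical example of the normalisation Allen mentions: rather than comparing raw price levels, express today's gasoline-versus-crude relationship as a rolling z-score against its own recent history. The units, column names and window length below are assumptions made for the sketch.

import pandas as pd

def rolling_richness(gasoline: pd.Series, crude: pd.Series, window: int = 60) -> pd.Series:
    """Z-score of the gasoline-minus-crude spread over a rolling window (both in $/bbl)."""
    spread = gasoline - crude
    mean = spread.rolling(window).mean()
    stdev = spread.rolling(window).std()
    return (spread - mean) / stdev    # roughly: +2 looks rich, -2 looks cheap

A reading near +2 only says "rich relative to its own recent history"; whether that richness is driven by crude itself, the equity market or competing products still needs the multi-dimensional treatment Allen describes next.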

This is already way too much for the human brain to be able to comprehend. Even with the assistance of good databases and analytic tools, I think it's nearly impossible for someone to look at those non-linear relationships across the match-up of multiple dimensions, and be able to assess: "well, 80% of the time when the equity market is rising and you also have a falling Eurodollar, and gold is rising relative to platinum, so interest rates are doing this, which means... "

In responding to this, it is very easy to fall into the traps of over-fitting and data-dredging. Even where a technique is adaptive, it tends to be over-thought and then over-fitted. You end up with some enormous polynomial that people have, either in their brain or in their code, which they're trying to fit to very complex relationships. This is a recipe for a disaster and it's something we guard against.

William Essex: So what do you do instead?

David Allen: We allow the adaptive and evolving process that we use to solve these equations for us. We would rather not hard-code any of the relationships, but try instead to be constantly listening. Embedded in our risk management is the ethos that every so often, despite our efforts to carefully manage systematic complexity, markets will go through such a radical change that you actually need to take your models and suspend them for a bit and then re-engineer or, better still, re-evolve. We use this technique in the main futures algo product that we trade; the basic thinking is that you need to be constantly listening for those changes. If you're not adaptive and dynamic, you're not going to catch these changes in inter-relationships. If it is not embedded in your risk management, it's quickly going to become a very discretionary process.

David Allen, Portfolio Manager (left) and Eugenio Perez, Portfolio Manager/Trader (right), III

William Essex: In view of that, how does your testing regime operate?

David Allen: In addition to the methodology I just mentioned, we also run statistical proofs on every idea or model we plan to trade. We want to make sure that any edge we have, not only in terms of trading signal creation but also regarding signal execution, is both real and supportable. Our advantage has to be statistically significant before we commit money to it.

We require that we are able to prove, on an in-sample/out-of-sample basis, a paper-trade basis and a real-money basis, that what we are creating is not the result of dumb luck. We demand a very large number of observations and real statistical confidence in what we do.
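As a hedged sketch of that kind of check (the choice of test, significance threshold and minimum sample size here are illustrative assumptions, not III's actual criteria), one might require the out-of-sample trade P&L to be statistically distinguishable from zero before committing capital:

import numpy as np
from scipy import stats

def edge_is_significant(oos_trade_pnl, alpha=0.05, min_obs=100):
    """One-sided t-test that mean out-of-sample trade P&L is greater than zero."""
    pnl = np.asarray(oos_trade_pnl, dtype=float)
    if len(pnl) < min_obs:        # demand a large number of observations
        return False
    t_stat, p_two_sided = stats.ttest_1samp(pnl, popmean=0.0)
    p_one_sided = p_two_sided / 2 if t_stat > 0 else 1.0
    return p_one_sided < alpha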

William Essex: With the multi-leg strategies that you undertake, it doesn't sound like you are particularly time sensitive. So presumably it's not a major issue for you to get all legs of a trade on near-simultaneously?

David Allen: Yes. I would much rather be designing more intelligent order-routing, or more intelligent signal generation that's not highly sensitive to the speed at which I get filled on a multi-legged spread.

The inter-relationship between different commodities is most important to us. The decision making that goes into that is job one. Once the relationship between different inter-connected inputs is established and understood, you can then look at those inputs across time and between exchanges and non-exchanges. I think that's where we have invested most of our time, rather than in addressing the legging issues.

We spend a lot of time creating synchronisation around different feeds. In commodities, especially further back in the curve, a price that you see right now may be a trade that was done ten minutes ago, or even an hour ago. That alignment of pricing issue poses a lot of questions that you and your technology have to answer: "It traded there ten minutes ago but the things around it have now traded here, how do I want to adjust that price?"

That's incredibly important and it's the antithesis of high frequency, which assumes that the data is clean, rapid-fire and available, to the extent that you can take advantage of it by firing your latency weapon. Here by contrast there is a need to step back and make sure that your engine for implying where the price should be across the whole curve, and more importantly across different products, is properly calibrated.
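A toy illustration of the adjustment question Allen poses above: one simple (and assumed) convention is to carry a stale back-month print forward by the move observed since then in a liquid reference contract, on the assumption that the spread between the two is locally stable over that interval. The function and the numbers are hypothetical.

def imply_current_price(stale_price: float,
                        ref_price_at_stale_time: float,
                        ref_price_now: float) -> float:
    """Carry a stale print forward by the liquid reference contract's move."""
    reference_move = ref_price_now - ref_price_at_stale_time
    return stale_price + reference_move

# A back-month print of 71.40 done when the front month was 69.10; the front
# month now trades 69.55, so the implied back-month price is roughly 71.85.
print(imply_current_price(71.40, 69.10, 69.55))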

William Essex: So rather than just calculating a fair value for your multi-legged synthetic spread, you additionally have to calculate, by inference from other instruments, a notional price for one leg of that spread before you can even think about working out how to trade it?

David Allen: True. I think this is a risk element that is currently understated. It is OK for the large sell-side shop if they don't have this optimised, because they will do so much flow throughout the day…

William Essex: … that it comes out in the wash?

David Allen: Exactly - so it never really gets highlighted. By contrast, on the buyside you don't generally have that luxury because it is so important that you understand the risks in granular detail. That gets back to how you are executing on the systematic side too, because your ability to execute systematically is hijacked by things like illiquidity affecting your ability to apply a price across a curve.

William Essex: How many commodity markets are you connected to for trading? Do they require different approaches in terms of, for example, connectivity and data gathering?

David Allen: Dozens. Not sure I have an exact number for you. We analyse and have traded or trade many different commodity markets, from energy and electricity to grains, softs and metals. Each of them has its own distinct market conventions, mechanisms for disseminating data and risk profile. As such, I do not think they can be treated as purely fungible from a data and technology standpoint. To do so would be dangerous - you won't know if you missed a nuance or a big-picture item.

William Essex: How sensitive are you to liquidity?

David Allen: Irrespective of the strength or quality of a trading signal, it's obviously critical. However, the legging issue is less of a concern to us; what's more of a concern is the long-term liquidity around a product. We're less concerned about how we're going to get out in three seconds, but pay attention to how we're going to get out in three weeks or months.

That's particularly true for the pattern-recognition strategy we run because our holding period is longer. When we find statistically significant proof that our alpha, or our edge in creating those pattern-recognised signals, is viable, we want it to actually be available for a while. We focus only on where liquidity is deepest, on products where there's a history of data that goes back more than just a couple of years. You need that density of time series data in order to be able to do proper analysis.

Even though right now is probably one of the worst times I can recall in the last fifteen years, you still have good liquidity across all the US exchange-traded energy markets, all the UK exchange-traded energy markets and most of continental Europe on the energy side. You also have the ability to trade metals in New York and London, grain in Chicago and with the marriage of Chicago and New York the ability to do that electronically as well.

From an Asia standpoint, we just don't find that we have the maturity of data and depth of liquidity to be able to do a lot there. Across the board though, the exchange clearing mechanism is helping add to overall futures market liquidity and hopefully transparency.

William Essex: Do you regard automation as an end to end process?

David Allen: We make a clear distinction between the automation of trading signal generation and automation of its execution. Part of our art is determining where full automation or partial automation is most appropriate. Elements of execution automation do require a set of human eyes, while others do not.

William Essex: Finally, what would you say was the core ethos of your group?

David Allen: Hard work and persistence, thoroughness and depth of thought. We like to feel we have a pragmatic approach to automation and trading markets. While many Automated Trader readers may be focused on maximum automation of trading processes, we believe that there is value in matching automation to processes - some are clearly better suited to it than others.

Some activities are best handled by human reasoning and thought - they require a sanity check if you will. But other things involving exposure management or market making may not justify the human costs and cry out for "some" automation.

That's an important theme for us: there is no one-size-fits-all approach. The key point is that you cannot afford to become so excited by your ability to leverage technology that it obscures your ability to see the big picture and thereby substantially impacts your P&L.

William Essex: David - thank you.