
No Signal

Published in Automated Trader Magazine

NO SIGNAL is a regular column where we examine various snafus in the trading world, particularly in automated trading. We look at errors in application logic, mistakes by overzealous co-workers, failures in technology and temporary losses of power to both infrastructure and craniums. These all make for good stories that everyone can either learn from or be amused by - or both. If you have a story that you think makes for a valuable lesson or is simply funny in a facepalm kind of way, please get in touch with us at no.signal@trader.news. Naturally, we treat all submissions with the highest confidentiality. We are only interested in the lesson value, or in some cases the humour value, and not in identifying involved parties.

What's the floating point?

Following on from previous stories about problems related to computational accuracy in financial algorithms, this too is a tale of numerical woe.

Fundamentally, there are two common number formats in computer science: integers and floating point numbers. There are other, less common ways of dealing with numbers inside a computer, but these are not of interest here. For an amusing story about what can go wrong when using integers, the interested reader can refer to the 'No Signal' column in Automated Trader, Issue 39.

Most applications use floating point numbers to represent real numbers. Nowadays, finance is all about squeezing a few basis points (0.0001) from a small rock. This differs somewhat from taking whole (integer) bars of gold, which is apparently what finance used to be. So we often find ourselves dealing with small numbers.

Floating point math comes with a large number of caveats. Few programmers really understand what can go wrong and what they should pay attention to. From rounding issues, to significant digits, to processor- and compiler-specific problems, it is a veritable minefield for those who are less numerically gifted. There is a standard that is supposed to guide us all - IEEE 754-2008 - but studying this is, in my experience, beyond the ability of most programmers. People just want to get on with calculating stuff, not think about the ins and outs of addition or division under the hood.

We'll leave aside the exact nature of rounding issues and problems related to loss of precision and instead focus on what happens when people don't understand the inexact nature of floating point maths.

A large, unnamed currency hedge fund generated all of its signals through a chunk of legacy code that hadn't changed much since the firm started over a decade ago. The fund was big, and accordingly its signals generated long-term trades.

Long-term signals are usually generated with - drum roll please! - long-term filters. These transform a time series of prices into a slow-moving time series of smoothed data. Common filters include moving averages, Kalman Filters and other, more esoteric menu items (unscented particle filters, anyone?). Quantitative traders are particularly fond of exponential moving averages because they can be cast as a very simple recursive formula:

        EMA_t(x, lambda) = lambda*x + (1-lambda)*previousEMAValue;
    

(For an interesting discussion of EMAs, have a look at Automated Trader, Issue 40, page 36).
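
As a minimal sketch of that recursion, here it is in Python (whose float is an IEEE 754 double, like the double used in this column's code fragments). The class name Ema and the choice to seed the average with the first observation are assumptions made for illustration, not details taken from the fund's code.

```python
class Ema:
    """Exponential moving average with smoothing factor lam in (0, 1]."""

    def __init__(self, lam):
        if not 0.0 < lam <= 1.0:
            raise ValueError("lam must be in (0, 1]")
        self.lam = lam
        self.value = None  # no observation seen yet

    def update(self, x):
        # Assumption: seed with the first observation; other seedings exist.
        if self.value is None:
            self.value = x
        else:
            self.value = self.lam * x + (1.0 - self.lam) * self.value
        return self.value


ema = Ema(0.5)
smoothed = [ema.update(x) for x in [1.0, 2.0, 3.0]]
print(smoothed)  # [1.0, 1.5, 2.25]
```

Keeping the previous value inside an object means that each smoothed series has its own state, which matters as soon as more than one series is smoothed at the same time.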

One application for exponential moving averages is the smoothing of price changes or returns to arrive at a short-, medium- or long-term estimate of volatility. Below is an exponentially smoothed volatility estimator which uses the recursive EMA function above:

        double emaOfSquares = EMA_t(Math.Pow(R[i], 2), lambda);
    
        double squaredEma = Math.Pow(EMA_t(R[i], lambda), 2);
    
        double volatilityEstimateEma = Math.Sqrt(emaOfSquares - squaredEma);
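
A self-contained Python sketch of the same estimator follows. The function name, the seeding with the first return and the clamp at zero are assumptions added for illustration. The clamp is there because rounding can push the difference of two nearly equal numbers slightly below zero, and the square root of a negative number is not a real number.

```python
import math


def ema_volatility(returns, lam):
    """Return the running volatility estimate sqrt(EMA(r^2) - EMA(r)^2)."""
    ema_r = None    # running EMA of the returns
    ema_r2 = None   # running EMA of the squared returns
    estimates = []
    for r in returns:
        if ema_r is None:
            # Assumption: seed both averages with the first observation.
            ema_r, ema_r2 = r, r * r
        else:
            ema_r = lam * r + (1.0 - lam) * ema_r
            ema_r2 = lam * r * r + (1.0 - lam) * ema_r2
        # Both terms are nearly equal when returns barely move, so the
        # subtraction can round to a tiny negative number; clamp it.
        variance = max(ema_r2 - ema_r * ema_r, 0.0)
        estimates.append(math.sqrt(variance))
    return estimates


tick = 1e-5  # hypothetical return size
vols = ema_volatility([tick, -tick] * 50, lam=0.5)
print(vols[-1])
```

Note that each of the two averages carries its own running state here, whereas the fragments above leave that detail implicit.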
    

The problem with these calculations is that we can end up with some pretty small numbers. We might have values on the order of 0.00001 to 0.0000001. How? If R is your difference from sample to sample in EUR/USD, it could be quite small: 0.00001. If it's JPY/USD (not USD/JPY!), the difference could be 0.0000001. And we are squaring these numbers. So you end up with (gotta be careful with the decimals here)... 0.00000000000001. That's 10^-14. Still easy to represent as a floating point number, but doing so accurately depends on the magnitude of whatever it is combined with. If the number stands alone close to zero, as 10^-14 does, then there is no problem. But if it is more like 1.00000000000001, the small part no longer fits into a single precision number, and the value becomes... 1. Double precision holds out a little longer, until the small part falls below roughly 10^-16, but long before that most of its significant digits are gone. You can verify this at home if you like.
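
For those who do want to verify it at home, here is a small Python sketch. Python's float is an IEEE 754 double; the helper to_single, which rounds a value to single precision via the standard struct module, is an illustrative name. math.log1p is shown as one common way of computing log(1 + x) for small x without forming 1 + x first; nothing in the story says the fund used it.

```python
import math
import struct


def to_single(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


small = 1e-14

# Near zero the small number is represented to full relative precision.
print(small)                          # 1e-14

# In single precision (about 7 significant digits) adding 1 swallows it.
print(to_single(1.0 + small) == 1.0)  # True

# A double (about 16 significant digits) still tells the sum apart from 1,
# but most digits of the small term are lost.
print(1.0 + small == 1.0)             # False

# Below roughly 1e-16 a double gives up as well, and the logarithm becomes 0.
tiny = 1e-17
print(1.0 + tiny == 1.0)              # True
print(math.log(1.0 + tiny))           # 0.0
print(math.log1p(tiny))               # 1e-17
```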

Why would you add 1 to this small number? I am not sure, but the next line in the code was:

        double logVolatility = Math.Log(volatilityEstimateEma + 1);
    

At this stage you have entered floating point hell, which will freeze over at fairly unpredictable times (read: the value of logVolatility will drop to zero, because it will be log(1)).

Indeed, the volatility estimator did regularly drop to zero on currency pairs with small price increments and little movement. And what is worse, due to the vagaries of how the compiler would order the specific instructions for the above code, you could end up with different estimates at different points in time, depending on the exact compiler configuration used. Two different developers can easily end up with two different volatility time series, depending on the exact set-up of the CPU and compiler that they used.
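
The sensitivity to instruction ordering comes from the fact that floating point addition is not associative. Here is a tiny Python sketch with made-up values: the same three numbers summed in two different orders give two different doubles. It only illustrates the arithmetic effect and does not reproduce what any particular compiler does.

```python
a, b, c = 1.0, 1e-16, 1e-16

left_to_right = (a + b) + c   # each tiny term is rounded away in turn
small_first = a + (b + c)     # the tiny terms add up before meeting 1.0

print(left_to_right)                 # 1.0
print(small_first)                   # 1.0000000000000002
print(left_to_right == small_first)  # False
```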

Ultimately, this would mean that the signals the trading system generated were not only wrong, but also that they would differ depending on how the entire trading system was compiled and what specific hardware it ran on.

Debugging this problem would turn out to be a very long and painstaking process that would take many months and require fixing a number of other issues first. Thus, some good came out of the Herculean effort of tracing the problem to its source.

What went wrong

  • Writing numerical code without a good understanding of the dynamics of floating point maths is just not a good idea.
  • No regression testing for trading signals.

What went right

  • To fix this problem, a whole bunch of other problems had to be solved first, so there were positive knock-on effects from the bug hunt.
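
A regression test for trading signals can be as simple as comparing freshly computed values against a stored reference series. The Python sketch below is one hypothetical shape for such a check; the function name, the tolerances and the sample numbers are all made up for illustration.

```python
import math


def find_signal_regressions(reference, candidate, rel_tol=1e-9, abs_tol=1e-12):
    """Return the indices at which candidate deviates from reference."""
    if len(reference) != len(candidate):
        raise ValueError("series lengths differ")
    return [
        i
        for i, (ref, new) in enumerate(zip(reference, candidate))
        if not math.isclose(ref, new, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# Hypothetical reference output from a trusted build, and a new build whose
# volatility estimate has collapsed to zero at one point.
reference = [0.0012, 0.0011, 0.0013, 0.0012]
candidate = [0.0012, 0.0011, 0.0, 0.0012]

print(find_signal_regressions(reference, candidate))  # [2]
```

Run against every new compiler configuration or hardware platform, a check along these lines would flag both the drops to zero and the build-to-build differences described above.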

Only change one line

Data management nightmares occur if the data gets corrupted by transforms along the way and nobody notices.

One of the bigger challenges in quantitative finance is the ingestion (and digestion) of data in an automated fashion. Today's trading institutions take in data from a large variety of sources. In some cases, it may be only a dozen sources. In other cases, it may literally be more than a thousand disparate providers that feed data to a firm globally.

Data sources change their formats and/or content with varying degrees of frequency. Staying on top of these changes is basically impossible. Nobody does this perfectly; all you can do is hope that you are on top of it enough that unexpected changes at the source do not propagate in a way that causes damage, and that if they do cause damage, it's not too much. This is a story about one of those changes propagating all the way to the front line and affecting production.

One of the stock exchanges produces a daily file recapping all trades that happen during the day and makes this file available to subscribers at the end of the day. Ingesting this file and using it to calculate many end-of-day analytics is a convenient way to prepare for the next trading day. You don't have to worry about any loss of data that your own feed handlers might have had during the day. Plus the format is pretty simple and straightforward and it doesn't change much over time. And when the format changes, it is pre-announced. These are all good things from a data management perspective.

In the bright summer of 2015, the format changed. As was the way, it was a pre-announced change and when the day rolled around, this particular trading firm's developers were ready for the change. The change was the inclusion of some additional flags specified as additional columns. The format was a fixed layout, so it went:

from this...

        0000694500AADD00
        0000694600AADD00
        0000694500AADD00

...to this

        07KJ000694600AADD00
        08KJ000694600AADD00
        05NJ000694600AADD00

There are a number of ways to read this format and retrieve the price of a trade. The way this was done at this particular firm was like this:

        decimal integerPart = decimal.Parse(line.Substring(1,5));
    
        decimal fractionalPart = decimal.Parse(line.Substring(6,4))/10000;
    

To clarify, for the first line of the above example this takes '69' and adds '0.4500' to it. With the new format, what was required was to shift everything to the right by three additional characters. However, in the rush to deploy everything, only the first line of code was changed correctly:

        decimal integerPart = decimal.Parse(line.Substring(4,5));
    
        decimal fractionalPart = decimal.Parse(line.Substring(6,4))/10000;
    

What happens now is that, for the first line of the new version, we get '69' for the integer part and '0.0694' for the fractional part. In other words, '69.0694'. The fractional part is now made up of the last three digits of the integer field plus the first decimal digit of the true price, so it is basically a constant that hardly ever changes. Every trade pretty much looks like '69.0694', '69.0694', '69.0694'. Eventually, if the integer part of the price changed, it would become something like '70.0700', and so on.
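
One way to make this kind of slip harder is to derive every field position from a single offset, so that a layout shift is one change, not several. The Python sketch below assumes the layouts shown above; the names parse_price, OLD_OFFSET and NEW_OFFSET are illustrative. Note that Python slices take a start and an end position, whereas the Substring calls above take a start and a length.

```python
from decimal import Decimal

INTEGER_DIGITS = 5     # width of the integer part of the price
FRACTION_DIGITS = 4    # width of the fractional part of the price

OLD_OFFSET = 1         # price field start in the old layout
NEW_OFFSET = 4         # price field start after the three extra characters


def parse_price(line, offset):
    """Parse the fixed-width price that starts at the given offset."""
    int_end = offset + INTEGER_DIGITS
    frac_end = int_end + FRACTION_DIGITS
    integer_part = Decimal(line[offset:int_end])
    fractional_part = Decimal(line[int_end:frac_end]) / (10 ** FRACTION_DIGITS)
    return integer_part + fractional_part


def parse_price_half_updated(line):
    """The half-updated version: new integer offset, old fraction offset."""
    return Decimal(line[4:9]) + Decimal(line[6:10]) / 10000


print(parse_price("0000694500AADD00", OLD_OFFSET))        # 69.45
print(parse_price("07KJ000694600AADD00", NEW_OFFSET))     # 69.46
print(parse_price_half_updated("07KJ000694600AADD00"))    # 69.0694
```

A single assertion on one known line per layout, run before deployment, would have been enough to expose the half-updated version.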

This data was fed into the analytics engines sitting further downstream. You can imagine what that does to any data derived from this raw data: volatility and correlation estimators, VWAPs, you name it. Everything is going to go awry. Volatilities will basically drop towards zero (occasionally disrupted by jumps when the price crosses into a new integer part). This data was used to drive a whole bunch of overnight processes, including relative value analysis which would lead to buy/sell decisions. Given this input data (which is now basically random, to a first-order approximation), you can imagine what the resulting trades looked like: they were also pretty random.

One would imagine that something like this would get spotted pretty swiftly. A few hours, or maybe a few days at most. But no, in a testament to the firm's highly resilient systems, they just kept trading. Not for days, not for weeks, but for a whopping six months, before an analyst finally noticed that some of the numbers seemed "really off". After another day of diagnosis, the problem was located in the incorrect parsing of the daily summary files. While this may seem ridiculous and massively damaging to the trading effort, it actually turned out it wasn't. Re-running the trading strategy in simulation with the correct data showed roughly the same P/L over the six-month period. So no opportunity was lost - although the value of the trading strategies is obviously called into question.

What went wrong

  • No inspection/run-time diagnostics of incoming data
  • No monitoring of strategy states
  • No unit test of file parser

What went right

  • No damage from incorrect trading signals
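
A run-time check on incoming data does not need to be sophisticated to catch a fault like this one. As a hedged Python sketch, the function below flags a batch of prices whose fractional parts take suspiciously few distinct values. The function names and the threshold are assumptions for illustration, and a real feed would need thresholds tuned per instrument.

```python
from decimal import Decimal


def fraction_diversity(prices):
    """Return distinct fractional parts divided by the number of prices."""
    if not prices:
        raise ValueError("no prices to inspect")
    fractions = {p % 1 for p in prices}
    return len(fractions) / len(prices)


def looks_stale(prices, min_diversity=0.2):
    """Flag a batch whose fractional parts hardly vary (hypothetical threshold)."""
    return fraction_diversity(prices) < min_diversity


healthy = [Decimal("69.4500"), Decimal("69.4600"), Decimal("69.4500"),
           Decimal("69.4700"), Decimal("69.4400"), Decimal("69.4800")]
corrupted = [Decimal("69.0694")] * 6

print(looks_stale(healthy))    # False
print(looks_stale(corrupted))  # True
```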