Using Trading Dynamics to Boost Strategy Performance

Published in Automated Trader Magazine Issue 07 October 2007

In the first part of a two-part article, David Aronson, President of Hood River Research, introduces the concept of performance boosting strategies and explains the selection process for their predictor inputs.

David Aronson

The objective of performance boosting is to increase the alpha of the buy and sell signals issued by an existing trading model. This process is premised on the notion that if a model is completely objective¹, it may be possible to predict the outcomes of its buy and sell signals to an economically meaningful degree. Alpha is boosted by taking larger than normal positions on trade signals that are predicted to have above-average outcomes and taking smaller than normal positions (or no position) on recommendations predicted to be below average.
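
The sizing idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the linear scaling rule and the cap on the multiplier are hypothetical and are not taken from the article.

```python
def position_size(base_size, predicted_return, average_return, max_multiplier=2.0):
    """Scale a normal-sized position by the predicted outcome of a signal.

    Hypothetical rule: size is proportional to predicted / average return,
    capped at max_multiplier, and zero when a loss is predicted.
    """
    if average_return <= 0:
        raise ValueError("average_return must be positive for this sketch")
    if predicted_return <= 0:
        return 0.0  # no position on signals predicted to lose money
    multiplier = min(predicted_return / average_return, max_multiplier)
    return base_size * multiplier


# Illustrative numbers only (returns in per cent, size in arbitrary units).
above_average = position_size(100, 1.8, 0.5)   # capped at 2x -> 200.0
below_average = position_size(100, 0.25, 0.5)  # half the normal size -> 50.0
predicted_loss = position_size(100, -0.3, 0.5) # no position -> 0.0
```

Any monotonic rule would serve the same purpose; the cap simply prevents a single optimistic forecast from dominating the portfolio.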

The predictions may take the form either of a forecast of the recommendation's return or of a probability that the recommendation will result in a profit. The predictions are based on a set of predictive variables or indicators that quantify the market's trading dynamics at the time a new position is signalled by the original model. We refer to these variables as 'trading-dynamics indicators'. The indicator values serve as input to the second-stage model. The model's output is a forecast of the original recommendation's outcome.

Predicting strategy returns

To explain how the performance-boosting model forecasts the outcome of a buy or sell signal, it is useful to consider the situation confronted by an investor who does not use a performance-boosting model. Suppose this investor is ranking stocks on a monthly basis and buying the stocks in the lowest PE decile. Assume that a backtest of the strategy has shown that stocks in the lowest PE decile earn an excess return of 0.50 per cent versus the universe over the one-month period following purchase.

Now consider another investor who follows the same strategy, but who is also using a performance-boosting model to predict the returns of recommendations generated by the low-PE strategy. His enhancement model contains only two predictor variables, which were discovered to contain information that helps predict the excess returns of low-PE stocks. The first is the RSI (relative strength index). The second is VOL_CHG (see Figure 1), which quantifies the recent rate of change in the stock's trading volume.

Assume that as of the date the stock is recommended, the RSI has a value of 15 and VOL_CHG has a value of -10. Note that in Figure 1 these two values represent a set of coordinates that denote a specific location in a two-dimensional space, where one dimension (axis) represents RSI while the other represents VOL_CHG. Also note that this location corresponds to a specific point on the grid surface in Figure 1, which has an altitude of +1.8 per cent with respect to a third dimension representing the predicted return for the recommendation.

The grid surface as a whole represents the relationship between just two predictor variables - RSI and VOL_CHG - and the variable we wish to predict, i.e. the return on the stock recommendation. However, in practice the performance-boosting model may contain numerous predictor variables (dimensions), and so the model surface would be a multi-dimensional generalisation of a surface, known as a hyper-surface, which cannot be illustrated. Thus, one can think of the model as a hyper-surface in an abstract mathematical space of 'n' dimensions, where n minus 1 of the dimensions represent predictor variables while the remaining dimension represents the return on the strategy's recommendations.

In Figure 1, the state of knowledge of the investor operating without an enhancement model is represented by the level flat orange surface at 0.5 per cent. By contrast, the state of knowledge of the investor using an enhancement model is represented by the grid surface. This investor's expectation of a strategy recommendation's excess return is conditional upon the values of the predictor variables (RSI and VOL_CHG) that characterise the stock being recommended.
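
The contrast between the two investors can be sketched as an unconditional mean versus a mean conditioned on the predictor values. The code below is a minimal illustration, not the author's model: it approximates the grid surface with coarse bins, and the function names, bin edges and the four historical signals are all hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def unconditional_mean(returns):
    """The 'flat surface': one expected return for every recommendation."""
    return sum(returns) / len(returns)


def conditional_means(rsi, vol_chg, returns, rsi_edges, vol_edges):
    """A crude 'grid surface': mean return per (RSI bin, VOL_CHG bin) cell."""
    cells = defaultdict(list)
    for r, v, ret in zip(rsi, vol_chg, returns):
        key = (bisect_right(rsi_edges, r), bisect_right(vol_edges, v))
        cells[key].append(ret)
    return {key: sum(vals) / len(vals) for key, vals in cells.items()}


# Hypothetical historical signals (excess returns in per cent).
rsi = [10, 20, 70, 80]
vol_chg = [-10, -5, 5, 10]
excess = [2.0, 1.0, -1.0, 0.0]

flat = unconditional_mean(excess)  # 0.5 for every signal
# One RSI edge at 50 and one VOL_CHG edge at 0 give a 2 x 2 grid.
grid = conditional_means(rsi, vol_chg, excess, [50], [0])
# grid == {(0, 0): 1.5, (1, 1): -0.5}
```

With these made-up numbers both investors agree on the average of 0.5 per cent, but only the second can tell that low-RSI, falling-volume signals have done better than the rest.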

Figure 1: Boosting model return prediction

Booster model development

The development of a boosting model involves discovering two things:

  • Predictor variables that are helpful in forecasting strategy returns. The selection of these variables - typically from a large set of candidates - is best conducted by a modelling algorithm (an automated process). Numerous studies² have shown that human intelligence is not well suited to this type of task, referred to as configural reasoning.
  • The shape of the surface that depicts the relationship between the selected predictor-variables and the variable to be predicted. A technique such as multiple linear regression assumes that the shape of the surface is flat and only the slope of the surface is left open to discovery. However, modern data modelling techniques are more flexible and relax this assumption, and so can discover the most appropriate shape for the model's hyper-surface.
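
The second point can be illustrated with a single predictor. In the sketch below, which assumes a deliberately U-shaped and entirely hypothetical relationship, an ordinary least-squares line finds no slope at all, while a simple piecewise-constant fit (standing in here for the more flexible modern techniques, and not a method named in the article) recovers the shape.

```python
from bisect import bisect_right


def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bin_means(x, y, edges):
    """Piecewise-constant fit: mean of y within each bin of x."""
    sums = {}
    for xi, yi in zip(x, y):
        b = bisect_right(edges, xi)
        total, count = sums.get(b, (0.0, 0))
        sums[b] = (total + yi, count + 1)
    return {b: total / count for b, (total, count) in sorted(sums.items())}


# Hypothetical U-shaped relationship between a predictor and signal outcome.
x = [-2, -1, 0, 1, 2]
y = [4, 1, 0, 1, 4]

slope, intercept = fit_line(x, y)    # slope 0.0, intercept 2.0: a flat forecast
means = bin_means(x, y, [-1.5, 1.5]) # high at both extremes, low in the middle
```

The linear fit predicts 2.0 everywhere, so it would add nothing to the unboosted strategy, whereas the binned fit distinguishes the extremes from the middle.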

Candidate predictors and pre-processing

The most important factor in the success of performance boosting is the set of candidate predictor variables proposed by a human expert for consideration by the automated modelling algorithm. Obviously, at least some of the proposed candidate predictors must contain information relevant to predicting signal outcomes. If they do not, then no matter how powerful the modelling technique a good result is impossible. If they do, then even a relatively simple modelling technique like multiple linear regression can often produce a useful prediction model.

Data pre-processing - transformation of raw financial market data prior to its submission to the modelling algorithm - is key to creating a useful list of candidate predictors (raw data, such as prices, are seldom useful as predictor variables).

Data pre-processing transformations range from simple operators (such as moving averages, RSI or average true range) to advanced forms of digital filtering. An example of an advanced transformation used in strategy performance boosting is the wavelet³ transformation, which is well suited to isolating the transitory non-periodic fluctuations that are typical of financial market data (see Figure 2).
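
Two of the simple operators mentioned above can be sketched as follows. The RSI shown is the simple-average variant computed over the most recent price changes; other variants (such as Wilder's smoothed version) exist, and the article does not say which one is used. The function names and the handling of a completely flat price series are assumptions made for this illustration.

```python
def simple_moving_average(values, window):
    """Mean of each run of `window` consecutive values."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def rsi(prices, period=14):
    """Simple-average RSI over the last `period` price changes (0 to 100)."""
    if len(prices) < period + 1:
        raise ValueError("need at least period + 1 prices")
    recent = prices[-(period + 1):]
    changes = [b - a for a, b in zip(recent[:-1], recent[1:])]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if gains == 0 and losses == 0:
        return 50.0  # flat prices: neutral value, a convention assumed here
    if losses == 0:
        return 100.0  # only gains in the window
    relative_strength = gains / losses
    return 100.0 - 100.0 / (1.0 + relative_strength)


# Illustrative prices only.
smoothed = simple_moving_average([1, 2, 3, 4, 5], 3)  # [2.0, 3.0, 4.0]
balanced = rsi([10, 11, 10, 11, 10], period=4)        # equal gains and losses -> 50.0
rising = rsi([1, 2, 3, 4, 5], period=4)               # only gains -> 100.0
```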

Figure 2: Pre-processing

Pre-processing serves several important functions:

  • It makes economical use of the limited number of predictor variables, such as the RSI and VOL_CHG mentioned earlier, that can be used to good effect in a performance-boosting model. Although in theory modelling algorithms can construct performance-boosting models comprising a nearly unlimited number of dimensions, as a practical matter the amount of historical data (the number of historical signals generated by the existing trading model) severely limits this number. Note that each predictor variable consumes one dimension of the model's space. When the number of variables is large relative to the number of historical signals, the data become too sparse within the model space. This is a problem because there is a minimum level of data density within the model space required to discover the correct shape of the model's prediction surface. This problem, known as the 'curse of dimensionality', tells us that as the number of predictor variables is increased, the number of historical observations needed to adequately populate the model space goes up at an exponential rate. Hence, if 100 observations provide adequate data density for a model with two predictors (ten per dimension), then 1000 observations are needed to provide the same level of data density for three predictors, 10,000 for four, 100,000 for five and so on.

    However, pre-processing can conserve these dimensions when, as a consequence of the analyst's expertise and insight, two or more raw variables can be combined into a single more potent predictor variable. For the strategy in question, suppose that the degree to which the price momentum of the stock was consistent with or divergent from the price momentum of the universe to which the stock belongs (i.e. the difference) would be a useful predictor variable in the performance-boosting model. Also suppose that both the stock's price momentum and the universe index's price momentum were supplied as candidate predictors. Eventually the modelling algorithm would discover that the two momentums used conjointly were useful, but this would consume two dimensions in the model. However, if the analyst proposed a predictor which quantified the degree of divergence between the two momentums, the modelling algorithm would have the opportunity to select just that one predictor and thus conserve a dimension.
  • It ensures that the predictor variables will be reasonably stationary - i.e. their statistical characteristics, such as mean and variance, remain stable over time. Without this, modelling algorithms are unable to discern any useful relationship between the predictor variables and the raw trading signal outcomes.
  • A third function of pre-processing is to reduce the noise and amplify the information of raw market data. The more effective this process, the more easily the modelling algorithm can glean the informative component.
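
The 'curse of dimensionality' arithmetic in the first point above can be checked with a short sketch. It assumes, as the article's example implies, that adequate density means about ten observations along each dimension; the function names and that default are illustrative only.

```python
def required_observations(n_predictors, points_per_axis=10):
    """Observations needed to keep the same density as predictors are added."""
    return points_per_axis ** n_predictors


def max_predictors(n_signals, points_per_axis=10):
    """Largest number of predictors that n_signals can populate adequately."""
    count = 0
    while required_observations(count + 1, points_per_axis) <= n_signals:
        count += 1
    return count


needed = [required_observations(d) for d in range(2, 6)]
# [100, 1000, 10000, 100000], matching the figures quoted in the text

affordable = max_predictors(5000)  # 3 predictors for 5,000 historical signals
```

Read the other way round, the same rule shows why each dimension saved by pre-processing matters: with 5,000 historical signals, a fourth predictor would already need 10,000.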

Predictors used in performance boosting

From practical experience (and, as mentioned above, subject to the number of historical signals available from the existing trading model), the number of pre-processed variables offered as candidate predictors to the modelling algorithm is typically in the range of 200 to 500. These candidate predictors fall into three general categories:

  • Price, volume and volatility measures that pertain to an individual stock. These predictors are derivatives of the stock's price, volume and volatility behaviour. In this respect, higher order derivatives (acceleration, change in acceleration, etc.) and higher order moments (skew and kurtosis) can offer valuable independent information when developing a set of candidate predictors. Moreover, other transformations that measure trend strength irrespective of trend direction, or the degree of order/disorder in the price and volume structure, can also be used.

  • Price, volume and volatility measures that pertain to the universe to which the stock belongs. The same transformations as for individual stocks are also applied to the index or universe average to which the stocks belong, which can provide valuable context information. For example, the strategy may work best when the universe is operating in a particular market regime or state.

  • Variables that quantify the divergence between the individual stock and its universe with respect to specific price, volume and volatility measures. This third set of predictors falls naturally out of the first two sets. It measures the difference (i.e. divergence) between the same transformation on the stock and on its universe and often provides useful additional information that is independent of that found in predictors based upon the individual stock and the universe.
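
The third category can be sketched using price momentum as the shared transformation. Momentum is defined here as a simple rate of change over a lookback period, which is an assumption made for illustration; the article does not specify the formula, and the function names and prices are hypothetical.

```python
def momentum(prices, lookback):
    """Rate of change of the latest price over `lookback` periods."""
    if len(prices) < lookback + 1:
        raise ValueError("need at least lookback + 1 prices")
    return prices[-1] / prices[-1 - lookback] - 1.0


def momentum_divergence(stock_prices, universe_prices, lookback):
    """Stock momentum minus universe momentum: one predictor instead of two."""
    return momentum(stock_prices, lookback) - momentum(universe_prices, lookback)


# Illustrative prices: the stock gains 10 per cent while its universe gains 5.
divergence = momentum_divergence([100.0, 110.0], [100.0, 105.0], lookback=1)
# approximately 0.05, i.e. the stock outpaced its universe by five points
```

The same pattern applies to any transformation computed on both the stock and its universe, such as volume change or volatility.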

One of the most attractive features of applying predictive modelling to the problem of performance boosting is the large number of observations that can be obtained by aggregating signals across an entire universe of securities.

However, because stocks vary in terms of their specific behavioural attributes, such as volatility, stability of trading volume, acceleration, etc., careful attention must be paid to normalising the predictor variables so they are comparable. For example, if a candidate predictor has a historical range of 40 to 60 for one security but 20 to 80 for another, the data from both stocks cannot be usefully aggregated without normalisation.
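
One possible normalisation, sketched below, rescales each value by the security's own historical range. This is only one option among several (standardising by mean and standard deviation is another), and the article does not say which method is used; the function name and sample histories are hypothetical.

```python
def normalise_by_range(value, history):
    """Map a predictor value onto 0..1 using the security's own history."""
    low, high = min(history), max(history)
    if high == low:
        raise ValueError("history has no range to normalise against")
    return (value - low) / (high - low)


# The two securities from the example: ranges of 40 to 60 and 20 to 80.
narrow = normalise_by_range(55, [40, 50, 60])  # 0.75
wide = normalise_by_range(65, [20, 50, 80])    # 0.75
```

Raw readings of 55 and 65 look different, yet each sits three-quarters of the way up its own security's range, so after normalisation the two observations can be pooled.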

Part two will examine the modelling techniques used in arriving at a valuable predictor set for boosting 'raw' trading model performance.

1 An objective model is one that can be reduced to a computerised algorithm and back-tested.
2 For a listing of studies, see Aronson, David, Evidence-Based Technical Analysis, Wiley & Sons, 2006: Chapter 2, endnotes 20-26, and Chapter 9, endnotes 33-43.
3 For an introduction to wavelets, see: