Models for daily and intra-day volume prediction

Published in Automated Trader Magazine Issue 44 Q1 2018

A practical approach to building an intuitive model for intra-session and all-day traded volumes. Multiple factors are combined in a straightforward manner to give robust and usable results.

About the authors

Vladimir Markov

Vladimir Markov is a senior quantitative analyst at Bloomberg where he focuses on trading alpha and advanced equity analytics. His expertise in financial market analysis was earned through roles working for buy- and sell-side firms since 2007. Vladimir holds a PhD degree in Theoretical Physics.

Olga Vilenskaia

Olga Vilenskaia is a quantitative analyst at Bloomberg where she focuses on trade execution analytics. She has ten years' experience in quantitative positions in finance and two Master's degrees, in Economics and in Mathematics with a major in probability theory and statistics.

Vlad Rashkovich

Vlad Rashkovich is the global head of Quantitative Trading Research at Bloomberg, where he focuses on building frameworks and quantitative tools that present new alpha frontiers to institutional traders, portfolio managers and analysts. Vlad publishes in various journals and frequently speaks at leading industry conferences around the world.

A practical model in finance is always a compromise between mathematical rigour, the underlying assumptions about the data and the practicalities of serving the end user's needs. Instead of building one complex model, we construct simple models that work together in an ensemble, which increases the transparency and interpretability of the resulting model.

There are a number of approaches to modelling intra-day volume. Despite their diversity, the models share several components: they combine a historical daily volume component with an intra-day component, and they may have seasonal and dynamic sub-components. The autoregressive nature of volume is captured by autoregressive moving average (ARMA) or autoregressive integrated moving average (ARIMA) models.

Here we give an overview of econometric models that can be used as building blocks for a generic volume prediction model. We start by discussing the log-normal approximation for traded volumes and the shortcomings of traditional statistical error metrics for calibrating volume prediction, and introduce the Asymmetrical Logarithmic Error (ALE), which overweights the risk of overestimation. The daily volume prediction can be improved by using a simple ARMA(1,1) model and making adjustments for special days (for example, days with a large overnight price gap or an earnings announcement). The dependence of the intra-day volume profile, the so-called "U-curve", on the overnight price gap can be modelled using a functional regression. Finally, Bayesian methods are used to optimally combine historical and current intra-day inputs while automatically taking into account the uncertainty of the model input components.

Figure 01: Fitting normal distribution for log(volume) for ticker JCP US

Log-Normal Distribution of Volume

A log-normal distribution provides a loose fit for the main part of the daily and intra-day volume distribution. This means that the logarithm of volume approximately follows a normal distribution, for which many analytical results are available. For this reason, we formulate all models in log-space.

In Figure 01, we show an example for a typical mid-cap stock. Alternatives to the log-normal distribution include the q-gamma and Weibull distributions. Although the tail behaviour of an empirical distribution may vary, its analytic tractability and many other desirable properties (which we discuss later) strongly favour the log-normal distribution as a base model for the volume distribution. Observations that deviate from normality can be filtered out with a Grubbs' filter.

Volume Prediction Model Metrics

The volume prediction model metrics have to take into account the asymmetric risk profile of execution and the fat tails of the volume distribution. For example, overestimation of the daily volume by a factor of two leads to the same increase in participation rate. If there is an obligation to complete the order and the target participation is 20 percent, the actual participation rate would then be 40 percent, leading to excessive market impact. Estimating future volume conservatively gives more freedom to an execution algorithm and leads to impact savings, especially for orders with slow alpha decay. Also, the log-normal distribution of volume possesses fat tails and thus the error metric has to be robust to handle large-volume days.

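To calibrate model parameters, we use the weighted Asymmetrical Logarithmic Error (ALE):

$$\mathrm{ALE} = \frac{1}{N}\sum_{i=1}^{N} w_i \left|\hat{X}_i - X_i\right| \tag{01}$$

where

$$w_i = \begin{cases} 2, & \hat{X}_i > X_i \ \text{(overestimation)} \\ 1, & \text{otherwise} \end{cases} \tag{02}$$

and Xi = log(Vi) is the realised log-volume, with X̂i its prediction. Thus, ALE is, in fact, an asymmetrical generalisation of the L1 norm in logarithmic space. With ALE, we use double weights for overestimation errors in order to take into account the asymmetric profile of the execution risk mentioned above.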
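A minimal sketch of such a metric is shown below. It assumes that the error is the absolute difference of log-volumes averaged over observations, and that overestimation carries a weight of two; the function name `ale` and the normalisation by the number of observations are illustrative assumptions rather than details taken from the article.

```python
import math

def ale(predicted, realised, over_weight=2.0):
    """Asymmetrical Logarithmic Error between predicted and realised volumes.

    Assumption: errors are absolute differences of log-volumes, averaged over
    observations; overestimation errors are multiplied by `over_weight`.
    """
    if len(predicted) != len(realised) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for v_hat, v in zip(predicted, realised):
        err = math.log(v_hat) - math.log(v)  # positive means overestimation
        weight = over_weight if err > 0 else 1.0
        total += weight * abs(err)
    return total / len(predicted)

# Overestimating by a factor of two is penalised twice as heavily
# as underestimating by a factor of two.
over = ale([2_000_000], [1_000_000])
under = ale([500_000], [1_000_000])
```

With these hypothetical inputs, `over` is twice `under`, which is the asymmetry the metric is designed to express.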

We note that the Root Mean Square Error (RMSE) is symmetric and can be dominated by tail days. The Mean Absolute Percentage Error (MAPE) is risk-symmetric as well. The R² metric has limited value outside the linear regression framework.

Daily Volume Prediction

Without any intra-day information, the historical daily volume and its averages are often used as a proxy for today's expected volume.

The 20-day arithmetic moving average is an industry standard for estimating liquidity of a stock. Moving averages have a tendency to overestimate volume due to retaining memory of large-volume days that happen quite frequently. Corporate news, earnings announcements, option expirations and index rebalancings all can lead to large-volume days. Formally, these large-volume days lead to the fat right tail of daily volume distributions.

There are multiple views on what an average is. Subjectively, it is the 'number in the middle' or a 'number that is balanced'. Another goal when applying the average is to understand a data set by using a single representative number (hence the term "summary statistic"). Mathematically speaking, for many types of distributions there is a 'native average'. In other words: the calculation of average depends on the distribution. The arithmetic mean of normal random variables is normal, the geometric mean of log-normal random variables is log-normal, while Cauchy-distributed random variables are closed under taking harmonic means.

As it is riskier to overestimate volume than to underestimate it, the geometric average performs better under the ALE metric than the arithmetic mean. Also, the geometric mean of a log-normal distribution equals its median, which represents a typical trading day's volume and is not influenced by outliers.

As a simple prior estimator of the logarithm of total daily volume, we take the average of the most recent N = 20 daily log-volume observations Xi = log(Vi):

$$\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$$

The daily volume prediction is then given by the geometric mean of volume:

$$\hat{V} = e^{\mu} = \Big(\prod_{i=1}^{N} V_i\Big)^{1/N} \tag{03}$$

We note that given a log-normal distribution with parameters µ and σ, the arithmetic mean is given by:

$$\mathrm{E}[V] = \exp\left(\mu + \frac{\sigma^2}{2}\right) \tag{04}$$

According to the well-known inequality between arithmetic and geometric means, for any set of positive numbers the arithmetic average is always greater than or equal to the geometric average.
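The effect of a single large-volume day on the two averages can be illustrated with a short sketch. The 20-day window below is made up for illustration, and the function names are hypothetical.

```python
import math

def geometric_mean_volume(volumes):
    """Geometric mean of volumes: the exponent of the average log-volume."""
    mu = sum(math.log(v) for v in volumes) / len(volumes)
    return math.exp(mu)

def arithmetic_mean_volume(volumes):
    """Plain arithmetic average of volumes."""
    return sum(volumes) / len(volumes)

# Hypothetical 20-day window: 19 ordinary days and one large-volume day.
window = [1_000_000] * 19 + [10_000_000]
geo = geometric_mean_volume(window)   # roughly 1.12 million shares
ari = arithmetic_mean_volume(window)  # 1.45 million shares
```

The single tenfold day lifts the arithmetic average by 45 percent but the geometric average by only about 12 percent, which is why the latter is the more conservative prior.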

Adjusting prediction with an ARMA component

Autoregressive moving average (ARMA) models provide a description of a stationary stochastic process in terms of two polynomials: one for the autoregressive part and the second for the moving average part. The notation ARMA(1,1) refers to a model with one autoregressive term and one moving-average term:

$$X_t - \mu_t = \varphi\,(X_{t-1} - \mu_{t-1}) + \theta\,\varepsilon_{t-1} + \varepsilon_t \tag{05}$$

where Xt is the logarithm of total daily trading volume for day t, εt is the forecast error on day t, and µt is the N = 20-day moving average:

$$\mu_t = \frac{1}{N}\sum_{i=1}^{N} X_{t-i}$$

In this section the subindex t refers to the index of a historical day. To make the volume series (quasi-)stationary, we subtract the moving average µt from the log-volume observations Xt. The coefficients φ and θ were estimated per stock by minimising the ALE metric between realised and estimated volume. The fitted parameters for US stocks vary slightly by stock, with typical values of φ ≈ 0.7 and θ ≈ -0.3. More complex models of the ARIMA(p,d,q) class do not give a significant improvement in the risk metric over the simple ARMA(1,1).
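The one-step-ahead recursion implied by an ARMA(1,1) model on excess log-volume can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the moving average for a day is taken over the preceding `window` days, the recursion starts from a zero predicted excess, and the coefficients are the typical values quoted in the text rather than values fitted by minimising ALE.

```python
import math

def arma11_forecast(volumes, phi=0.7, theta=-0.3, window=20):
    """One-step-ahead daily volume forecast from an ARMA(1,1) on excess log-volume.

    Assumptions: mu_t is the mean of the `window` log-volumes preceding day t,
    and the first forecastable day starts with zero predicted excess.
    """
    logs = [math.log(v) for v in volumes]
    if len(logs) < window + 1:
        raise ValueError("need more than `window` observations")
    y_hat = 0.0  # predicted excess log-volume for the current day
    for t in range(window, len(logs)):
        mu_t = sum(logs[t - window:t]) / window
        y_t = logs[t] - mu_t               # realised excess log-volume
        eps_t = y_t - y_hat                # forecast error on day t
        y_hat = phi * y_t + theta * eps_t  # predicted excess for day t + 1
    mu_next = sum(logs[-window:]) / window
    return math.exp(mu_next + y_hat)

# Hypothetical histories: a flat series, and the same series ending in a 2x day.
flat_forecast = arma11_forecast([1_000_000] * 21)
spike_forecast = arma11_forecast([1_000_000] * 20 + [2_000_000])
```

After the doubled-volume day the forecast rises to roughly 1.37 million shares rather than 2 million: the AR term carries part of the excess forward while the negative MA term pulls the prediction back.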

Figure 02: Arithmetic average, geometric average and ARMA predictions for IBM US equity in 2017

The model contains both an autoregressive (AR) and a moving-average (MA) component. It is useful to look at the MA(1) and the AR(1) processes separately to better understand their dynamics. For an AR(1) process xt = φxt-1 + εt, with |φ| < 1, the effect of a shock εt on subsequent values of x is as follows:

$$\frac{\partial x_{t+k}}{\partial \varepsilon_t} = \varphi^{k}, \qquad k = 0, 1, 2, \ldots \tag{06}$$

The AR(1) model describes autoregressive behaviour in which the next period's value should be predicted to be φ times as far away from the mean as the previous period's value.

For an MA(1) process xt = εt + θεt-1, the effect of a shock εt on x is:

$$\frac{\partial x_{t}}{\partial \varepsilon_t} = 1, \qquad \frac{\partial x_{t+1}}{\partial \varepsilon_t} = \theta, \qquad \frac{\partial x_{t+k}}{\partial \varepsilon_t} = 0 \ \ \text{for } k \ge 2 \tag{07}$$

The lagged values of the forecast errors are called moving-average (MA) terms. After a day with big volume, where the prediction error is positive, the MA component dampens the prediction for the next day because of the negative sign of θ.

In essence, the MA part models the response of the market to external shocks such as earnings announcements or significant corporate news. Endogenous trends of volume are modelled by the AR component.

An example of ARMA prediction versus the 20-day arithmetic average and the 20-day geometric average predictions is shown in Figure 02.

Adjusting prediction for special days

Special days - aforementioned earnings announcements, options expirations, index rebalancings, high overnight price gaps and so on - may be informative and often lead to higher trading volumes. To take this into account, a linear regression with ALE error metrics can be performed:

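$$y_t = X_t - \mu_t = \sum_{k=1}^{m} \beta_k\, x_{k,t} + \varepsilon_t \tag{08}$$

Xt is the logarithm of total daily trading volume for day t

µt is the N = 20-day moving average of the log-volume

m is the number of independent variables

xk,t is the k-th independent scalar predictor on day t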

The choice of dependent variable yt = Xt - µt allows easy interpretation of the regression coefficient β as a multiplier for the prior. If the only independent variable x1,t in the regression is the overnight price gap gt, then the gap multiplier ηgap = exp(βgt) for today's volume is given by:

$$\hat{V}_t = \eta_{gap}\, e^{\mu_t}, \qquad \eta_{gap} = e^{\beta g_t} \tag{09}$$

Figure 03: Cross-sectional regression of excess logarithmic volume on overnight price gap to volatility ratio for S&P 500 Index sample

Figure 03 shows a cross-sectional regression (Equation 08) of excess logarithmic volume on overnight price gap to volatility for a representative sample of S&P 500 stocks.
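As a sketch of how a gap multiplier might be obtained and applied, the example below fits a single coefficient through the origin and scales the prior. Ordinary least squares is used here as a simple stand-in for the ALE-minimising fit described in the text, and the function names and sample numbers are hypothetical.

```python
import math

def fit_beta_through_origin(gaps, excess_log_volumes):
    """Least-squares slope with no intercept (a stand-in for the ALE-based fit)."""
    num = sum(g * y for g, y in zip(gaps, excess_log_volumes))
    den = sum(g * g for g in gaps)
    return num / den

def gap_adjusted_volume(mu_t, beta, gap):
    """Scale the geometric-mean prior exp(mu_t) by the multiplier exp(beta * gap)."""
    return math.exp(beta * gap) * math.exp(mu_t)

# Hypothetical sample: gap-to-volatility ratios and excess log-volumes y = X - mu.
gaps = [1.0, 2.0, 3.0]
excess = [0.2, 0.4, 0.6]
beta = fit_beta_through_origin(gaps, excess)
adjusted = gap_adjusted_volume(math.log(1_000_000), beta, 3.0)
```

Because the dependent variable is the excess log-volume, the fitted coefficient translates directly into a multiplicative adjustment of the prior.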

Intra-day Volume Profile: U-curve

The separate modelling of the total daily volume level from its intra-day shape (U-curve) increases both stability and interpretability of the model. Moreover, the U-curve has its own value as it is used by VWAP-type algorithms.

For intra-day observations we consider the trading day as a sequence of 10-minute intervals and calculate the elapsed time in number of intervals since market open. We define the intra-day volume profile u(t) (the U-curve) as the fraction of the day's volume that has been traded during t-th interval, that is at time t. We call the cumulative sum of the U-curve the "C-curve": c(t). It represents the fraction of the day's volume that has been traded from market open until time t:

$$c(t) = \sum_{s \le t} u(s) = \frac{V(t)}{V(T)} \tag{10}$$

Here, V(T) is the total volume for the whole day and V(t) is the total volume traded up to time t. In this section, subindex t refers to the index of an intra-day interval. The volume curves c(t) and u(t) are known to us only after the close. So the estimated ĉ(t) and û(t) have to be used in making predictions.

The intra-day volume profile tends to be stable, and on average does not change significantly over time. A plain approach to obtain a U-curve estimator is to take an average U-curve estimated over a prolonged historic period, for instance the last 180 days.
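The definitions of the U-curve, the C-curve and the plain historical average can be written out in a few lines. The sketch assumes each day is supplied as a list of per-interval volumes of equal length; the bin counts and volumes are made up.

```python
def u_curve(bin_volumes):
    """Fraction of the day's volume traded in each interval."""
    total = sum(bin_volumes)
    return [v / total for v in bin_volumes]

def c_curve(bin_volumes):
    """Fraction of the day's volume traded from the open up to each interval."""
    total = sum(bin_volumes)
    running, out = 0.0, []
    for v in bin_volumes:
        running += v
        out.append(running / total)
    return out

def average_u_curve(days):
    """Average U-curve over several days (lists of bin volumes of equal length)."""
    curves = [u_curve(day) for day in days]
    n_bins = len(curves[0])
    return [sum(c[b] for c in curves) / len(curves) for b in range(n_bins)]

# Hypothetical four-interval days.
day_a = [30, 10, 10, 50]
day_b = [10, 10, 30, 50]
u_a = u_curve(day_a)
c_a = c_curve(day_a)
u_avg = average_u_curve([day_a, day_b])
```

In practice the average would run over a long history, such as the 180 days mentioned above, rather than two days.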

Figure 04: Cumulative volume profiles (C-curves) by overnight price gap to volatility for the S&P 500, S&P MidCap 400 and Russell 2000 indices samples

Functional regression for the U-curve

Functional Data Analysis (FDA) deals with the analysis and theory of data that come in the form of functions. In functional regression, responses or covariates are functional or vector data. In this section we model the dependence of the cumulative U-curve c(t) on the overnight price gap and the quantile level of daily volume:

$$c_i(t) = \beta_0(t) + \sum_{k=1}^{m} \beta_k(t)\, x_{k,i} + \varepsilon_i(t) \tag{11}$$

Here, i = 1,...,n indexes the n historical days available; t = t0,...,T indexes the intervals of the trading day; m is the number of independent variables; xk,i is the k-th independent scalar predictor on day i; βk(t) is the partial effect of predictor xk on the response at time t; and β0(t) is the baseline curve. For visualisation purposes, we perform two separate functional regressions: one for the overnight price gap and one for the percentile of total daily trading volume.

In the first regression we use the overnight price gap as an independent external parameter xk. The gap is defined as the relative difference between the opening price of the current day and the closing price of the previous day over the 20-day price volatility. According to Equation 11 the intra-day volume profile has higher values at the beginning of the day for days with higher ratios of overnight price gap to price volatility. This means that for days with high overnight price gaps, the intra-day volume profile changes from the familiar U-shape closer to a flipped J-shape. Figure 04 shows average cumulative intra-day volume profiles for different values of ratio of overnight price gap to price volatility for the S&P 500, S&P MidCap 400 and Russell 2000 indices representative samples.

Figure 05: Average coefficients of functional regression for the S&P 500 Index

Figure 05 shows average coefficients of the functional regression (Equation 11) for the S&P 500 Index representative sample on a single independent variable - the overnight price gap:

$$c_i(t) = \beta_0(t) + \beta_1(t)\, g_i + \varepsilon_i(t) \tag{12}$$

Here, gi is the ratio of the overnight price gap to average volatility on day i. Higher values of β1 (dotted line in Figure 05) at the beginning of the trading day mean that a higher overnight price gap results in a larger increase in trading volume for the first bins of the trading day than for the last ones.
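One simple way to estimate time-varying coefficients of this kind is to run a separate least-squares regression for every interval. The sketch below does this for a single scalar predictor; pointwise ordinary least squares is an assumption made for illustration and may differ from the estimation procedure the authors used, and the synthetic curves are made up.

```python
def pointwise_functional_regression(c_curves, gaps):
    """Fit c_i(t) = beta0(t) + beta1(t) * g_i by least squares, interval by interval."""
    n = len(gaps)
    g_mean = sum(gaps) / n
    s_gg = sum((g - g_mean) ** 2 for g in gaps)
    beta0, beta1 = [], []
    for t in range(len(c_curves[0])):
        c_t = [curve[t] for curve in c_curves]
        c_mean = sum(c_t) / n
        s_gc = sum((g - g_mean) * (c - c_mean) for g, c in zip(gaps, c_t))
        slope = s_gc / s_gg
        beta1.append(slope)
        beta0.append(c_mean - slope * g_mean)
    return beta0, beta1

# Synthetic C-curves on three intervals: a base curve shifted upwards early
# in the day in proportion to the gap-to-volatility ratio.
base = [0.2, 0.5, 1.0]
shift = [0.1, 0.05, 0.0]
gaps = [0.0, 1.0, 2.0]
curves = [[b + s * g for b, s in zip(base, shift)] for g in gaps]
beta0, beta1 = pointwise_functional_regression(curves, gaps)
```

On this noise-free data the regression recovers the base curve as the intercept and the early-day shift as the slope, with the slope falling to zero at the close where every C-curve equals one.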

Similar to the regression on the overnight price gap, the regression on the percentile of total daily trading volume (specifically the percentage of historical days for which total daily trading volume is lower than for the current day) indicates that for high-volume days the intra-day volume profile changes from U-shape closer to an inverted J-shape. Figure 06 shows average cumulative intra-day volume profiles for different values of percentiles of total daily trading volume for S&P 500, S&P MidCap 400 and Russell 2000 indices representative samples. The effect of high-volume days is more prominent for small-cap equities. The dependence of the shape of the U-curve on total daily volume can be used to update the shape of the U-curve intra-day based on total daily volume predictions.

Although we plot the aggregated results only, the regression coefficients from Equation 11 are often stock-dependent. For example, Figures 07 and 08 show an inverted J-shape of the U-curve for AAPL on a day with a high overnight price gap of about three units of daily volatility.

Intra-day Volume Prediction and Bayesian Inference

Figure 06: Cumulative volume profiles by total daily trading volume percentile for the S&P 500, S&P MidCap 400 and Russell 2000 indices

Financial data is inherently noisy and non-stationary, and often comes in small samples, leading to predictions characterised by large uncertainties. The Bayesian approach gives a quantitative framework for finding the best prediction despite that uncertainty by assigning each possible state of the world a probability and using the laws of probability to calculate the best prediction.

Formally, Bayesian inference is an application of Bayes' theorem to update the probability for a hypothesis as more evidence or information becomes available. In this framework volume V is a random variable that takes on a realised value v once observed. V is unobserved but described by some probability distribution that we want to derive from the actual data values v that we have. Denote by θ the parameters (such as the mean or the variance of a distribution) that characterise the probability model. The goal is to obtain estimates of the unknown parameters θ given the data v.

In Bayesian statistical inference, θ is random as well, possessing a probability distribution of its own that reflects our uncertainty about the true value of θ. Because both the observed data V and the parameters θ are assumed to be random, we can model the joint probability of the parameters and the data as a function of the conditional distribution of the data given the parameters and the prior distribution of the parameters.

$$p(\theta, v) = p(v \mid \theta)\, p(\theta) \tag{13}$$

This leads to the well-known Bayes' theorem:

$$p(\theta \mid v) = \frac{p(v \mid \theta)\, p(\theta)}{p(v)} \tag{14}$$

Here p(θ|v) is referred to as the posterior distribution of the parameters θ given the observed data v, p(v|θ) is the likelihood function and p(θ) is the prior. The normalisation factor p(v) does not depend on the parameters θ and is given by:

$$p(v) = \int p(v \mid \theta)\, p(\theta)\, d\theta \tag{15}$$

The posterior probability is a function of a prior probability and a likelihood function that defines a statistical model for the observed data. Equation 14 states that our uncertainty regarding the parameters of our model, as expressed by the prior distribution p(θ), is weighed by the actual data via the likelihood function p(v|θ), yielding an updated estimate of the model parameters as expressed in the posterior distribution p(θ|v).

Although Bayes' theorem is mathematically simple, its implementation can be computationally expensive. The difficulty lies in the normalising constant p(v), where the product of the prior and the likelihood function must be integrated over the valid domain of the parameters being estimated. One way to obtain a tractable solution is to choose pairs of likelihood functions and prior distributions with convenient mathematical properties, including tractable analytic solutions to the integral. Namely, if the posterior distribution p(θ|v) is in the same family as the prior probability distribution p(θ), the prior and posterior are called conjugate distributions, and the prior is called a conjugate prior for the likelihood function. Taking into account the log-normal approximation of the volume distribution, we use well-known results for conjugate priors and marginal distributions for normal random variables.

Figure 07: U-curve for AAPL US equity on 3 November 2017

Figure 08: Overnight price gap for AAPL US equity on 3 November 2017

Assuming that volume follows a log-normal distribution means that the logarithm of volume log(V)~N(µ,σ) follows a normal distribution. The problem is to estimate a posterior distribution of the mean µ and standard deviation σ of the expected daily volume given the finite number of intra-day bin volume observations and the prior information about past daily volumes.

Suppose that we are given data that is known to be independent and identically distributed (iid) and taken from a normal process with known or unknown variance and unknown mean. We wish to infer the mean of this process. There are two flavours of Bayesian inference in our case: the distribution has known variance σ2 but unknown mean µ; and the distribution has unknown variance σ2 and unknown mean µ.

Bayesian inference with unknown mean and known variance

Suppose observations D = {xt} have known variance σ² but unknown mean µ. The posterior distribution of the mean µ is normal, P(µ|D) = N(µ|µp,σp), with the posterior mean µp expressed as a weighted average of the sample mean x̄ and the prior mean µ0, where the weights are proportional to their precisions:

$$\mu_p = \frac{\mu_0/\sigma_0^2 + n\,\bar{x}/\sigma^2}{1/\sigma_0^2 + n/\sigma^2} \tag{16}$$

and posterior variance

$$\frac{1}{\sigma_p^2} = \frac{1}{\sigma_0^2} + \frac{n}{\sigma^2} \tag{17}$$

Here, σ0² is the variance of the prior and x̄ is the mean of the n observations:

$$\bar{x} = \frac{1}{n}\sum_{t=1}^{n} x_t$$

Each observation increases the precision of the posterior distribution by the precision λ = 1/σ² of one observation. The mean of the posterior µp is a convex combination of the prior µ0 and the maximum likelihood estimator x̄ of the current daily volume for Gaussian random variables xt, with weights proportional to the relative precisions.
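The precision-weighted update for a normal mean with known observation variance is a standard conjugate result and takes only a few lines. The function name and the sample numbers below are illustrative.

```python
def posterior_known_variance(mu0, var0, observations, var):
    """Posterior mean and variance of a normal mean with known observation variance.

    mu0, var0: mean and variance of the normal prior.
    var: known variance of a single observation.
    """
    n = len(observations)
    x_bar = sum(observations) / n
    precision = 1.0 / var0 + n / var  # prior precision plus n observation precisions
    mu_p = (mu0 / var0 + n * x_bar / var) / precision
    return mu_p, 1.0 / precision

# Hypothetical numbers: a prior centred at 0 and four observations equal to 1,
# all with unit variance.
mu_p, var_p = posterior_known_variance(0.0, 1.0, [1.0, 1.0, 1.0, 1.0], 1.0)
```

With four unit-precision observations against one unit of prior precision, the posterior mean sits four fifths of the way from the prior to the sample mean, and the posterior variance shrinks to one fifth.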

If we are interested only in inferences about the mean, and if the sample size is not too small, we can achieve a reasonable approximation of the posterior distribution by treating the standard deviation as known and equal to the sample standard deviation.

A more accurate representation of our knowledge should account for the unknown variance as below:

Bayesian inference with unknown mean and unknown variance

The conjugate prior for the mean µ and precision λ = 1/σ² is the normal-gamma distribution. The marginal distribution of the mean P(µ|D), given observations D = {xi}, is a Student's t-distribution, and the estimate of the mean is a weighted average of the prior µ0 and the average x̄ of the n observations:

$$\mu_n = \frac{k_0\,\mu_0 + n\,\bar{x}}{k_0 + n} \tag{18}$$

Here, the k0 parameter is the effective size of the prior sample. The natural values of k0 are k0 ∈ [0.3, 0.8] × Nprior, where Nprior is the number of observations of the historical daily volume (in our case Nprior = 20 and the interval size is ten minutes).

Volume model for liquid securities: intra-day bin model

For liquid securities, the intra-day prediction of the log-daily volume xt = x(t), based on the volume v(t) traded in bin t, is given by:

$$x(t) = \log\frac{v(t)}{\hat{u}(t)} \tag{19}$$

Note that we use the estimated U-curve û, not the true one, which becomes known only after the day closes. This shows the value of modelling the U-curve as discussed in the previous section.

At the beginning of the day, the variance of the intra-day observations is unknown and the log-daily volume after n bins is estimated using Equation 18:

$$\hat{X}(n) = \frac{k_0\,\mu_0 + \sum_{t=1}^{n} x(t)}{k_0 + n} \tag{20}$$

where the k0 parameter is the effective size of the prior sample. The optimal value of k0 is bin size-, market cap- and country-dependent. The natural values of k0 are within the range k0 ∈ [0.3, 0.8] × Nprior. For example, for liquid names in the US and 10-minute intervals, we use k0 = 0.5. Once there are enough observations to estimate the variance σ² of the intra-day observations x, Equation 16 is used and the estimated log-daily volume is given by:

$$\hat{X}(n) = \frac{\mu_0/\sigma_0^2 + n\,\bar{x}/\sigma^2}{1/\sigma_0^2 + n/\sigma^2} \tag{21}$$

Here, µ0 and σ0² are the mean and variance of the logarithmic prior, and x̄ and σ² are the mean and variance of the n current-day estimates x(t) defined by Equation 19.
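The two regimes of the intra-day bin model can be combined in one routine, sketched below. The switch-over threshold `min_bins_for_variance`, the use of the sample variance, and all numerical inputs are assumptions made for illustration; the article does not specify when there are "enough" observations.

```python
import math

def intraday_bin_estimate(bin_volumes, u_hat, mu0, var0, k0, min_bins_for_variance=6):
    """Estimate the log of total daily volume from the bins observed so far.

    bin_volumes: positive volumes traded in the observed bins.
    u_hat: estimated U-curve fractions for the same bins.
    min_bins_for_variance: hypothetical threshold for switching regimes.
    """
    xs = [math.log(v / u) for v, u in zip(bin_volumes, u_hat)]  # per-bin estimates
    n = len(xs)
    x_bar = sum(xs) / n
    if n < min_bins_for_variance:
        # Early in the day: unknown-variance update with effective prior size k0.
        return (k0 * mu0 + n * x_bar) / (k0 + n)
    var = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    if var == 0.0:
        return x_bar  # all bins agree exactly, so the observations dominate
    # Later in the day: precision-weighted update with the estimated variance.
    return (mu0 / var0 + n * x_bar / var) / (1.0 / var0 + n / var)

# Hypothetical day: the prior says 1 million shares, but the first two bins
# each imply a 2 million share day.
mu0 = math.log(1_000_000)
early = intraday_bin_estimate([200_000, 100_000], [0.1, 0.05], mu0, var0=0.25, k0=2)
```

With an effective prior size equal to the number of observed bins, the early estimate lands at the geometric midpoint between the prior and the intra-day evidence, about 1.41 million shares.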

Volume model for illiquid securities: historical cumulative model

There may be cases when calculating total daily volume estimators based on each particular bin is not optimal (for instance, when the data is sparse or not very stable).

Illiquid stocks often have intra-day bars with zero volume, making the U-curves erratic. In such cases, the use of the cumulative curve for forming intra-day observations looks more promising. Let us define z(t) as the log of daily volume based on the estimated cumulative U-curve ĉ(t) and the cumulative intra-day volume V(t) up to time t:

$$z(t) = \log\frac{V(t)}{\hat{c}(t)} \tag{22}$$

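The estimate of the log-daily volume in the historical cumulative model after n bins is given by:

$$\hat{X}(n) = \frac{\mu_0/\sigma_0^2 + z(n)/\Omega^2(n)}{1/\sigma_0^2 + 1/\Omega^2(n)} \tag{23}$$

Here, Ω²(n) is the dispersion of the daily prediction at time n using z(n) over the last M days:

$$\Omega^2(n) = \frac{1}{M}\sum_{I=1}^{M}\big(z_I(n) - X(I)\big)^2 \tag{24}$$

where zI(n) is the value of z(n) on day I and X(I) is the logarithm of the total daily volume of day I.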
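A sketch of the cumulative model follows. It assumes that the single cumulative observation is combined with the prior by precision weighting, and that the dispersion is the mean squared deviation of past cumulative estimates from the realised log daily volume; both are plausible readings of the description rather than confirmed details, and the names and numbers are hypothetical.

```python
import math

def dispersion(z_history, log_daily_volumes):
    """Mean squared deviation of past estimates z(n) from realised log daily volume."""
    m = len(z_history)
    return sum((z - x) ** 2 for z, x in zip(z_history, log_daily_volumes)) / m

def cumulative_estimate(cum_volume, c_hat_n, mu0, var0, omega2_n):
    """Precision-weighted blend of the prior and the cumulative observation z(n)."""
    z_n = math.log(cum_volume / c_hat_n)
    w_prior, w_obs = 1.0 / var0, 1.0 / omega2_n
    return (w_prior * mu0 + w_obs * z_n) / (w_prior + w_obs)

# Hypothetical inputs: two past days give a dispersion of 1.0; today 500,000
# shares have traded with half of the day's volume expected by now.
omega2 = dispersion([1.0, 3.0], [2.0, 2.0])
est = cumulative_estimate(500_000, 0.5, math.log(4_000_000), var0=1.0, omega2_n=omega2)
```

Because the prior and the observation carry equal precision here, the estimate is the geometric midpoint of the 4 million share prior and the 1 million share intra-day implication, that is about 2 million shares.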

Forming the final prediction

We estimate the remaining daily volume as:

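$$\hat{V}_R(t) = e^{\hat{X}}\,\big(1 - \hat{c}(t)\big) \tag{25}$$

where X̂ is the current estimate of the log-daily volume and ĉ(t) is the estimated C-curve.

Total daily volume VD is given by:

$$V_D = V(t) + \hat{V}_R(t) \tag{26}$$

where V(t) is the volume traded so far.

The estimated volume that is expected to be traded between time t1 and t2 is:

$$V(t_1, t_2) = e^{\hat{X}}\,\big(\hat{c}(t_2) - \hat{c}(t_1)\big) \tag{27}$$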

A human trader needs to know some practical numbers to set parameters of an execution algorithm, such as the expected urgency or the end time for their order.

Knowing the expected volume V(t1,t2) between t1 and t2 can help a trader select the expected participation rate ρ for a volume-participation algorithm:

$$\rho = \frac{S}{V(t_1, t_2)} \tag{28}$$

where S is the order size. Alternatively, the predictor allows us to estimate the end time t1 of the execution given a participation rate ρ:

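$$t_1 = \min\{\, t : \rho\, V(t_0, t) \ge S \,\} \tag{29}$$

where t0 is the start time of the execution.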
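The practical quantities discussed here can be sketched as follows. The example assumes that the expected volume in a window is the estimated daily volume multiplied by the change in the estimated C-curve over that window, which is one plausible reading of the text; the function names, the five-point C-curve and the order sizes are hypothetical.

```python
import math

def remaining_volume(log_daily_estimate, c_hat_t):
    """Expected volume still to trade, given the C-curve fraction traded so far."""
    return math.exp(log_daily_estimate) * (1.0 - c_hat_t)

def interval_volume(log_daily_estimate, c_hat, t1, t2):
    """Expected volume between intervals t1 and t2 of the estimated C-curve."""
    return math.exp(log_daily_estimate) * (c_hat[t2] - c_hat[t1])

def participation_rate(order_size, expected_volume):
    """Participation rate needed to complete the order within the window."""
    return order_size / expected_volume

def end_bin(order_size, rho, log_daily_estimate, c_hat, t0):
    """First interval by which trading at rate rho from t0 covers the order."""
    for t in range(t0 + 1, len(c_hat)):
        if rho * interval_volume(log_daily_estimate, c_hat, t0, t) >= order_size:
            return t
    return None  # the order cannot be completed today at this rate

# Hypothetical day of 1 million shares with a five-point C-curve.
x_hat = math.log(1_000_000)
c_hat = [0.1, 0.3, 0.5, 0.7, 1.0]
window_volume = interval_volume(x_hat, c_hat, 0, 2)
rate = participation_rate(40_000, window_volume)
finish = end_bin(50_000, 0.1, x_hat, c_hat, 0)
```

In this made-up example a 40,000-share order over the first window needs roughly 10 percent participation, and a 50,000-share order at 10 percent participation would be expected to finish in the fourth interval.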
Figure 09: Closing auction volume for AAPL US equity

Figure 10: Close auction volume versus daily volume for S&P 500 subsample for special days

Similar analytics can be provided to VWAP traders who are interested in U-curve analytics (how strictly the VWAP schedule should be followed) and urgency recommendations. The dispersion information about the U-curve can be shown as well to offer reliability when using VWAP for a given stock.

If the intra-day U-curve is stable, which is typical of liquid stocks, we can use a Bayesian model that takes the form of a weighted sum of the historical daily volume component and the intra-day interval volume observations. If the intra-day U-curve is unstable or noisy, which is typical of illiquid stocks, we use a model that takes the form of a weighted sum of the historical daily volume component and the intra-day cumulative volume observation. As the error metric, we use the Asymmetrical Logarithmic Error (ALE).

Closing Auction Volume Prediction

The volume transacted at the closing auction represents an important and significant fraction of the daily volume. Both absolute auction volume and the volume at the close measured as a proportion of the total volume of the day are highly volatile and hard to predict.

The closing auction price is the price that maximises the number of crossed shares. Provided that the order submitted by a trader is small relative to the typical auction volume (which minimises the risk of affecting the closing price) and the rest of the order flow is random, there is a high probability that the auction price will be close to the closing price of the continuous session. A number of traders use the closing price as a benchmark, and any deviation from it represents a risk for them.

The closing auction allocation has to be decided in advance and thus the simplest (and most robust) strategy is to submit a fixed percentage of the predicted volume.

It is recommended to follow a particular rule of thumb to minimise the price impact during the close auction: take the lesser of 12 percent of the predicted closing auction volume and 12 percent of an order and allocate that number of shares to the closing auction.
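The rule of thumb can be expressed directly in code. The function name and the sample sizes are illustrative; the 12 percent fraction is the one quoted above.

```python
def closing_auction_allocation(predicted_auction_volume, order_size, fraction=0.12):
    """Shares to send to the close: the lesser of `fraction` of the predicted
    auction volume and `fraction` of the order."""
    return min(fraction * predicted_auction_volume, fraction * order_size)

# Hypothetical cases: a deep auction, where the order size is the binding limit,
# and a thin auction, where the predicted auction volume is the binding limit.
deep = closing_auction_allocation(1_000_000, 200_000)
thin = closing_auction_allocation(50_000, 200_000)
```

Since the allocation scales with the predicted auction volume, an overestimated prediction translates directly into an oversized auction order, which is the risk the asymmetric error metric is meant to penalise.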

ALE metrics provide a trade-off between the quality of reasonable prediction and the risk of overestimating to match the objectives of a fixed percentage strategy.

As a base prediction of the closing auction volume, we take the 20-day geometric average. In some cases the ALE error can be slightly improved (within 5 percent) by an ARMA model. Unfortunately, the ARMA signal is weak and does not justify the increase in complexity of the model.

In general, the closing auction volume increases around major options and futures expirations and rolls, and on index rebalancing days.

Figure 11: AVAT estimator of AAPL US equity

In the US, the most noticeable spike of auction volume is seen during triple witching days. A triple witching day is the third Friday of every March, June, September and December. On those days, the market experiences the simultaneous expirations of stock market index futures, stock market index options and stock options.
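A dummy variable for these days can be generated from the calendar alone. The sketch below ignores exchange holidays, which is a simplifying assumption, and the function names are hypothetical.

```python
import datetime

def third_friday(year, month):
    """Date of the third Friday of the given month."""
    first = datetime.date(year, month, 1)
    # date.weekday(): Monday is 0 and Friday is 4.
    offset = (4 - first.weekday()) % 7
    return first + datetime.timedelta(days=offset + 14)

def is_triple_witching(day):
    """True for the third Friday of March, June, September and December."""
    return day.month in (3, 6, 9, 12) and day == third_friday(day.year, day.month)

december_2017 = third_friday(2017, 12)
march_2018 = third_friday(2018, 3)
```

The resulting boolean can be used as the expiration-day indicator when adjusting the closing auction prediction.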

Figure 09 gives a typical example of closing auction volumes.

To take the seasonality into account, we perform a linear regression:

$$y_t = a_t - \mu_t = \beta\, d_t + \varepsilon_t \tag{30}$$

Here, yt is the excess log-auction volume over the average; at is the log-auction volume; µt is the average log-auction volume over the previous 20 days; and dt is a dummy variable for the quarterly option expiration days. This allows us to calculate the option expiration multiplier ηa = exp(βdt) for today's auction volume:

$$\hat{V}^{\mathrm{auction}}_t = \eta_a\, e^{\mu_t} \tag{31}$$

For regular days, the behaviour of the close auction volume is erratic and almost uncorrelated with trading activity until 30 minutes before the close. For the special days specified above, there is a dependence between the total daily volume of the continuous session and the closing auction volume. In Figure 10 we plot the regression of the ratio of the closing auction volume to its geometric average on the ratio of the daily volume to its geometric average for the S&P 500 representative sample for special days.

Better accuracy for the auction volume can be achieved using real-time auction imbalance information. This information becomes available only a few minutes before the close and can be combined with geometric average prior using the Bayesian technique described above.

AVAT function in the Bloomberg terminal

The model discussed above was implemented in the Average Volume At Time (AVAT) function in the Bloomberg terminal.

AVAT forecasts a selected security's likely trading volume, so that one can develop an optimal order execution strategy. One can visualise volume forecasts, analyse volume for specific time intervals and test the accuracy of earlier forecasts to optimise the order execution strategy. One can also display forecasts based on Average Turnover At Time (ATAT) data. AVAT supports equities, equity indices, ETFs, warrants and funds. Figure 11 gives an example of AVAT estimation for AAPL US equity.

Conclusion

We presented a set of models relevant for predicting intra-day trading volume for equities. Instead of building a single complex model, we used multiple simple models for historical daily volume and intra-day U-curves. These were calibrated separately and merged together by Bayesian formulae that automatically take the uncertainty of the model input components into account. The models are calibrated using an asymmetric error metric that gives greater weight to overestimation errors.

To summarise, the full intra-day volume prediction model combines:

Total historical daily volume model that is based on the combination of the 20-day geometric average of the daily volume, an ARMA component and special days adjustments.

An intra-day volume curve (U-curve) model that is based on the deep history curve (180 days), curve shift based on the overnight gap and expected total daily volume (functional regression).

Close auction model that is based on geometrical average and the seasonal adjustments for options expiration days.