The Gateway to Algorithmic and Automated Trading

Regulators grappling with market data: a case study

Published in Automated Trader Magazine Issue 29 Q2 2013

Regulators and trading firms have at least one thing in common: they both find grappling with the torrent of data generated in today's markets a tall order. The CFTC is just beginning to wrestle with the flow of swaps data, while the Securities and Exchange Commission has a head start on the equities side. It has deployed MIDAS, the Market Information Data Analytics System, developed by Tradeworx. Automated Trader met Gregg Berman, senior advisor to the Director at the SEC's Division of Trading & Markets, to find out how the regulator hopes to develop the MIDAS touch.

(Editor's note: Automated Trader subscribers can also access a wide-ranging interview with the CEO of Tradeworx, the company that supplied MIDAS, here.)


AT: Could you describe what you're doing with this new system?

Gregg: First of all, what is the system? The system collects and gives us an analytical platform for the prop feeds that come from the equity exchanges. Each of the equity exchanges produces its own proprietary feed, and that contains full depth-of-book quotes. Some of them contain lots of detailed information - quote IDs and other fields - that we can use to construct and analyse how quote traffic moves and ties into trades. We also get the OPRA feed, which covers the options market. We get Tape A, Tape B and Tape C, which are the public consolidated tapes.

AT: So you have it for every equity platform?

Gregg: If it's public equity data, we get it. It's only the public feed, so we don't get any information on dark pool activity except for trades that occur in dark pools, which are already required to be reported publicly.

Before this, we would have been able to see the data from the … consolidated tape, which gives you the best bid/best offer from each exchange and all the trades that happen. That's readily available, but it's still a large volume of data, and we didn't necessarily have systems that were good at processing and analysing that information.

We did not even attempt to access on a regular basis the prop feeds, where you had the full depth of all the cancels, all the modifications, et cetera that were going on at the individual exchanges - and that was the detail that we wanted.

AT: Maybe describe the kind of analysis you want to do?

Gregg: So, there are three categories of analysis. The first category - and the one I think people tend to think about first - would be forensic analysis: looking at events that have happened and understanding them a lot better than we do. For the Flash Crash we analysed the entire day, the entire market. But there are also one-off events where we need to drill down - we need to look at the order books, look at how the trading works - to better understand exactly what happened.

AT: Is that where you'll get into situations with individual firms?

Gregg: You don't have firm identifiers. This is all public information, so these feeds do not contain information on the parties. To get information on the parties on a regular basis, we're going to have to wait for the consolidated audit trail - a new rule requiring us to produce a market-wide repository of all trades and all orders, including dark pools and customer IDs. That will take years to develop.

But even if we don't have the IDs of participants, we can still do a lot more than we can do today.

AT: Does that mean you have no visibility as to whether activity is by one firm or different firms?

Gregg: No attribution. You might try to glean something statistically, by looking at when the quotes came in. So if a bunch of buys come in that are tagged within the same microsecond, it's unlikely that those orders are from two participants, just because the probability that they come in from a hundred different participants at the same microsecond is small. You might be able to infer some things in the data, in the same way that an asset manager or a hedge fund might be able to infer it themselves. That's the art of trading, trying to figure out what other people are trying to do.
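As a toy illustration of the kind of inference Berman describes - and emphatically not the SEC's actual methodology - one could group orders by their microsecond timestamp and flag bursts that are unlikely to come from many independent participants. The data format and threshold below are assumptions for the sketch:

```python
from collections import Counter

def burst_timestamps(order_timestamps_us, min_orders=5):
    """Group orders by microsecond timestamp and flag bursts.

    A cluster of orders sharing the same microsecond is unlikely to
    come from many independent participants, so such bursts are
    candidate single-actor activity. This is statistical inference
    only -- the public feeds carry no participant IDs.
    """
    counts = Counter(order_timestamps_us)
    return {ts: n for ts, n in counts.items() if n >= min_orders}

# Hypothetical buy-order arrival times, microseconds since midnight
arrivals = [100, 100, 100, 100, 100, 101, 250, 250, 300]
print(burst_timestamps(arrivals))  # -> {100: 5}
```

The threshold is the whole art here: too low and everything is flagged, too high and coordinated activity spread across a few microseconds is missed.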


AT: What's the second area, after forensic analysis?

Gregg: The second is monitoring, to be able to be more timely about identifying issues that are in the market. So being able to look for aberrant behaviour, look for patterns et cetera that will inform us about market structure. And also, if we know about something sooner, then we can start looking at it and see if there's an issue that needs to be resolved. As opposed to being reactive - to hear about something from someone else - the idea is to be proactive. We should be the first to know.

AT: To do that you then need to devise systems for identifying aberrant behaviour.

Gregg: That's exactly right. We have to come up with our own math to determine: what is aberrant? And the best that we can do is just say, 'There's a possibility that there's some aberrant behaviour here.' The chance that there's something nefarious, or even interesting, going on is still small. So the first thing you do is whittle down the market, if it happens during the day, to 'Hey, look here, there might be something interesting', and then you look and say, 'Yes, that stock fell 5%, and that's because Google just released an article that said x, y and z.'
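The 'whittling down' step could be as simple as a first-pass outlier screen. The sketch below - a hypothetical illustration, not anything MIDAS is known to run - flags symbols whose daily return is extreme relative to the cross-section; the threshold and data are invented:

```python
import statistics

def flag_aberrant_moves(returns_by_symbol, z_threshold=2.5):
    """First-pass screen: flag symbols whose return is a cross-sectional
    outlier. A flag only means 'look here' -- most hits will turn out to
    have mundane explanations (news, earnings, index rebalances)."""
    vals = list(returns_by_symbol.values())
    mu = statistics.mean(vals)
    sigma = statistics.pstdev(vals)
    if sigma == 0:
        return []
    return [sym for sym, r in returns_by_symbol.items()
            if abs(r - mu) / sigma >= z_threshold]

# Nine quiet names and one 5% drop (hypothetical data)
returns = {f"SYM{i}": 0.0 for i in range(9)}
returns["XYZ"] = -0.05
print(flag_aberrant_moves(returns))  # -> ['XYZ']
```

In practice a regulator would screen on many dimensions at once (message rates, quote-to-trade ratios, spread behaviour), but the shape of the problem is the same: cheap filters first, human eyes last.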

AT: So for example, you in theory might be able to identify wash trades?

Gregg: That one in particular is a little tougher because we don't have IDs. But to the extent that wash trading tends to create a pattern… One of the reasons people try to do wash trading is when they're trying to create a print to signal that something interesting is going on in the market when in fact nothing interesting is happening.

If we saw the market moving in an aberrant way, that is something that would trigger, 'Hey, let's take a look at this.' But a good portion of this is still aspirational. Again, the system just went live this year, so there's a lot of work to be done in terms of building on these types of things.

AT: Forensic, monitoring - what was the third area?

Gregg: The third is fundamental market structure research. What is the nature of high-frequency trading? What is the nature of quoting, fast quoting, slow quoting, message traffic? Plus the many questions around those topics that continue to evolve. We now have some people - and we're continuing to add to that team - who will look at markets not necessarily to see whether someone is doing something illegal in a particular case, but just to better understand market structure. And the more we look, the more we recognise that almost everything out there is just factually incorrect. There's so much that people write, especially in the popular press: they'll throw out numbers and claims - some of them are actually real, but most of them are not. And I think our job is to fix that.

AT: This finding, is that a result of this tool?

Gregg: Think of simple things around depth of book, quotes: so when people say 'Size has dried up, there's no size any more in the market.' Okay, we get that, we all can watch the order size change at the NBBO (national best bid and offer). But does that mean that there's no depth of book outside of the NBBO? We don't know! Now, we can actually look and see the levels outside the NBBO: Have they also fallen away? Are they larger, are they smaller? Are they different for different types of stocks, are they different between stocks and ETFs? You can see that very quickly. You can ask a hundred questions. And now we're in a position to actually analyse that and address some of those questions.
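The depth-outside-the-NBBO question lends itself to a very small computation once you have a full depth-of-book feed. The sketch below uses an invented `(price, size)` list format - real proprietary feeds differ per exchange - and sums displayed size at the levels below the best bid:

```python
def depth_beyond_best(bids, levels=3):
    """Sum displayed size at the first `levels` price levels below the
    best bid -- a toy measure of liquidity outside the NBBO.

    `bids` is a list of (price, size) tuples, one entry per price level
    (illustrative format only; actual prop-feed layouts vary).
    """
    ranked = sorted(bids, key=lambda ps: ps[0], reverse=True)
    best_price = ranked[0][0]
    outside = [size for price, size in ranked if price < best_price]
    return sum(outside[:levels])

# Hypothetical bid side of one stock's book
book = [(10.00, 200), (9.99, 500), (9.98, 300),
        (9.97, 800), (9.96, 100)]
print(depth_beyond_best(book))  # 500 + 300 + 800 -> 1600
```

Comparing this number across stocks, across ETFs versus single names, or through time is exactly the sort of question Berman says the SEC is now positioned to answer.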