Trading that worx - Mike Beller, CTO of Tradeworx, talks about the ultra-highs and lows of trading and tech

Published in Automated Trader Magazine Issue 34 Q3 2014

Mike Beller hadn't expected to end up in financial services, having started in telecom early on in his career. Feeling that the research and analysis side of telecom was a bit of an echo chamber, Beller decided it was time for a switch. He eventually landed at Tradeworx, ultimately becoming CTO of the firm and its infrastructure affiliate, Thesys Technologies.

Mike Beller, CTO, Tradeworx

The hedge fund has been no stranger to controversy throughout the years. One of the issues is the provision of market surveillance technology via Tradeworx to US financial markets watchdog, the SEC, amid cries of conflict of interest from some HFT industry critics.

Mike Beller talks to Automated Trader about the evolution of Tradeworx and Thesys, the technical challenges of conducting and monitoring ultra-high frequency operations, and why he believes public-private partnerships are inevitable for successful market surveillance in the 21st century.

Automated Trader: What are some of the big changes you've seen throughout your time with Tradeworx?

Mike Beller: I thought I was joining a financial technology company because that is what we started, but we have gone through a lot of changes over the past 15 years. The financial tech business did well for three years, but then in the (dot.com) market crash and September 11 (2001 terrorist attacks), we were not able to continue that business model. So we rebooted as a hedge fund and had to redo a lot from scratch. Initially a lot of our edge came from advanced analytics and our ability to gather data and make decisions quickly. That has always been an edge for Tradeworx; but as the years went on, it became more and more important to evolve our trading technology to be able to efficiently trade a larger and more diversified book in a situation where the markets were getting more complicated. There was decimalisation, there was RegNMS, and the beginnings of colocation. We had to conform to these changes or we would have had unacceptable trading performance.

We put more and more resources into trading technology until that became a huge part of the technical effort of the company. In 2008, we launched a fully colocated platform, which immediately paid dividends in terms of improving execution by our hedge fund, but we also found that it was possible to do high frequency trading on our platform. So we began a high frequency proprietary trading effort, which was immediately successful. That was very gratifying - it showed that the work we had done on advanced trading technology had really paid off. But there was an issue with the secular trend: this was going to be a very expensive platform to run. We knew that there was going to be a constant need for investment, for adapting the latest advanced technologies and continually improving our systems in order to stay on the leading edge.

AT: Can you be more specific? How expensive exactly?

MB: Millions of dollars a year. At the time, it was an incremental thing. At first it was a modest investment and then it became more and more. You can go into a new colo, set up networking, get connectivity up and then there would be a newer lower latency this, or higher bandwidth that, and you would have to keep adapting.

We knew that continual investment could be a serious problem but we decided rather to make it into an opportunity. This was the moment we decided to get back into the technology business. If we can take this technology we developed, which is clearly among the fastest trading technology in the world, and make it available to other firms, we can essentially share the cost of this expensive infrastructure and we can make a cost centre into a profit centre. That was the genesis of our subsidiary, Thesys Technologies, which makes trading technology available to other trading firms. Our original product let customers access the markets with extreme low latency. Now we build matching engines - which let exchange and dark pool operators provide their services to the market with low latency and cost.

AT: How many markets and products do you trade in?

MB: Currently the platform supports US equities and US futures, and Canadian equities. Currently, about 5-6% of US equities goes through our trading platform every day and about 12% to 15% of Canadian equities goes through our platform. It has gone from 2008 when we created it, having just our flow, to now having quite a few other firms' flow going through it.

AT: How many firms? And are you planning expansion?

MB: It is dozens (of firms). We are always looking at new opportunities. There has been a focus from our customer base over recent years on increasing features and capabilities in the US, but we are definitely looking at global markets and sizing up other opportunities.

AT: Like Asia? Latin America? Europe?

MB: We have some opportunities in all those areas but nothing I can speak about.

AT: So what is next for technology for Tradeworx?

MB: Thesys started providing the trading platform and the market access platform. Then it added the matching engine capability and an algorithm testing capability. We've entered into a partnership with NASDAQ to make the simulator that is part of our quant trading platform available as a way for firms to test their algorithms before connecting them to the market - to make sure they are safe, and to make sure they perform in the way they want them to when they connect them to live markets. So that is another area that we've expanded into.

AT: You are also working with the SEC to improve their capabilities. How did that come about?

MB: The Securities and Exchange Commission had a problem, which was evident after the flash crash (May 6, 2010). The problem was always there but it became very evident when it took them months to analyse what had happened. There were some very sharp people at the SEC who wanted the ability to make quick and accurate analyses, but they were struggling to find the tools that they needed to make that happen and there weren't broadly commercially available tools.

Manoj (Narang, founder & CEO of Tradeworx) had been doing a certain amount of outreach to give industry and regulators and legislators a better understanding of what is actually going on in high frequency trading, and how trading the equity markets actually works. He had been talking to people at the SEC as part of that effort and the SEC was intrigued when he told them how we were able to analyse in just hours what had happened during the flash crash and come to conclusions very rapidly based on the toolset we had. That led to interest, and ultimately the SEC did an RFP to the industry to see who could provide them with the tools that would allow them to be able to make these types of analyses. We (thought) that it would be really good if the SEC had the tools necessary to be able to make data driven decisions as opposed to being at the mercy of everybody in the industry explaining what they think is going on. We responded to that RFP and we won.

AT: What do you supply to the SEC exactly?

MB: We now supply a platform to them which is basically data and tools. The data is the data we collect as part of the Thesys infrastructure that we provide to all of our customers - all the direct feeds and also the SIPs across the entire US market for all venues, US equities and options, which is what the SEC regulates, and a set of tools that make it possible to rapidly analyse that data because both of those things are necessary.

When you take all the direct feeds on US equities together, that is a lot of messages every day. Billions and billions of messages every day. And you need to collect that in a very accurate way using a low latency network or you don't have an accurate picture of what is going on. You need to organize all of that in a way that makes it easy to retrieve all the information, and then you need to have the right tools in order to be able to rapidly perform analyses across millions of messages. So we basically took tools and capabilities that we were providing to Thesys customers, plus tools that we had at Tradeworx, and made them available to the SEC in a separate system that only they can access and only they can use. They are making great use of it as far as we can tell from what they are publishing and stating in public statements now, which is great because we think it can only be good if the regulator has a better picture of what is going on in the market.

AT: Let's talk tech about your trading platform and matching engine. What are the specs?

MB: We deployed an extremely low latency trading platform. When you are trading US equities on the short term horizon, you are doing a market making strategy or an arbitrage strategy that is arbitraging very tightly correlated securities like an index to underlying securities, or two related indices to each other. The way US equities is right now, you have to be really fast to do that, and what I mean by fast is you need to react to what is going on in the markets within microseconds to be competitive with other parties who are doing similar things. So what distinguished, and distinguishes, Thesys' trading products is low latency. Its trading platform and its matching engine are extremely low latency capabilities. Our platform reacts within microseconds while processing millions of messages per second across thousands of different stocks and you have the flexibility to produce sophisticated trading strategy programmes, but with us providing all the framework to make that fast and able to handle the speed and volumes of data. And then, when we subsequently had the opportunity to do so, we entered the matching engine market. We now provide the matching engine for LeveL ATS, which is a dark pool, and we are now working on another one for a client that I can't name. It is the fastest commercially available matching engine but it is also extremely cost effective. It is just like our trading platform, where somebody just has to build their algos and doesn't have to invest in all the other infrastructure or figure out how to make feeds fast, or figure out how to make market access fast, or how to network these things with the minimum latency. We take care of all of that in our matching engine platform too. Customers just need to tell us what their order types are and what specific special sauce they want to present to their customers; we put that in and they have an extremely cost effective, fully hosted solution that is turnkey.

In the case of traders, they can focus on their alpha. In the case of dark pools, they can focus on growing their business and not worry about being very specialised in the very arcane world of making things fast. It is one of the hardest technical challenges that I have faced as an engineer and I have had a pretty long career at this point. There has been a lot of great engineering that has gone on over the years as we figured out how to make the networking lower latency, how to get a software programme to react immediately to something that comes into the bus of the computer. With each of these things, there are more and more challenges to getting it to a lower and lower latency.
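For readers unfamiliar with what a matching engine does, the core idea can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not a description of the Thesys product: it assumes plain limit orders, integer prices in ticks, and price-time priority (best price first, earliest arrival first at the same price). The `OrderBook` class and its `submit` method are hypothetical names invented for this sketch, and a production engine at the latencies discussed here would look very different.

```python
import heapq
from itertools import count


class OrderBook:
    """Toy limit order book with price-time priority (illustrative only)."""

    def __init__(self):
        # Heaps of (key, arrival_seq, order_id, remaining_qty).
        # Bids store the negated price so the highest bid sorts first.
        self._bids = []
        self._asks = []
        self._seq = count()

    def submit(self, order_id, side, price, qty):
        """Match an incoming limit order against resting orders.

        Returns a list of fills as (resting_id, incoming_id, price, qty).
        Any unfilled remainder rests in the book at its limit price.
        """
        fills = []
        is_buy = side == "buy"
        book, opposite = (self._bids, self._asks) if is_buy else (self._asks, self._bids)

        while qty > 0 and opposite:
            key, seq, resting_id, resting_qty = opposite[0]
            resting_price = key if is_buy else -key
            crosses = resting_price <= price if is_buy else resting_price >= price
            if not crosses:
                break
            traded = min(qty, resting_qty)
            # Trades execute at the resting order's price.
            fills.append((resting_id, order_id, resting_price, traded))
            qty -= traded
            if traded == resting_qty:
                heapq.heappop(opposite)
            else:
                # Same key and sequence number, so time priority is preserved.
                heapq.heapreplace(opposite, (key, seq, resting_id, resting_qty - traded))

        if qty > 0:
            key = -price if is_buy else price
            heapq.heappush(book, (key, next(self._seq), order_id, qty))
        return fills


book = OrderBook()
book.submit("a1", "sell", 101, 100)
book.submit("a2", "sell", 100, 50)
book.submit("a3", "sell", 100, 50)
first_fills = book.submit("b1", "buy", 100, 70)
second_fills = book.submit("b2", "buy", 101, 100)
```

In this example the first buy order fills against `a2` before `a3` because both rest at the same price and `a2` arrived earlier. The second buy order takes what is left of `a3` before moving up to the higher-priced `a1`. Custom order types of the kind Beller mentions would layer additional rules on top of this basic loop.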

AT: Can you be more specific? Like how fast is fast, and how cost effective is such a system?

MB: Our trading platform has a baseline tick to order latency of around nine microseconds, and our matching engine platform has baseline latency from order to acknowledgement of around 15 microseconds. Those numbers are extremely fast and they really provide a leading edge capability. A small firm that wants to get into high performance trading can do it for hundreds of thousands of dollars a year rather than millions of dollars a year.
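To make a figure like "tick to order latency" concrete, here is a minimal Python sketch of how such a number might be summarised from recorded timestamps. It assumes each event is a pair of nanosecond timestamps taken from the same clock: when the market data tick was received and when the resulting order was sent. The function names and the sample values are hypothetical, and the interview does not describe how Thesys actually measures its baseline.

```python
import math
import statistics


def tick_to_order_latencies_us(events):
    """Convert (tick_rx_ns, order_tx_ns) pairs into latencies in microseconds."""
    return [(order_tx_ns - tick_rx_ns) / 1_000 for tick_rx_ns, order_tx_ns in events]


def summarize(latencies_us):
    """Return min, median, 99th percentile (nearest rank) and max."""
    ordered = sorted(latencies_us)

    def percentile(p):
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]

    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p99": percentile(99),
        "max": ordered[-1],
    }


# Made-up sample: three fast reactions and one slow outlier.
sample_events = [
    (1_000, 10_000),
    (2_000, 11_500),
    (3_000, 12_200),
    (4_000, 40_000),
]
summary = summarize(tick_to_order_latencies_us(sample_events))
```

Reporting a tail percentile alongside the median is a common choice because a single slow reaction can matter as much as the typical case.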

The matching engine business is different because you can have anything from small to large installations and that can range a lot in cost. A few people in this industry know how much a trading engine is supposed to cost. What I can tell you is that it is less costly than the competitive offerings, and higher performance.

AT: What are some of the important technical details of low latency?

MB: We just kind of figured this stuff out by ourselves. There was no place to look to figure out how to make systems fast, and it was one of the great fun aspects of the challenge of making low latency systems: nobody could tell you how to do it. You could pick up hints of how to do pieces of it, but there was a lot of experimentation and it was really an applied research process.

It is beginning to get known what some of the elements are. Clearly, if you want to squeeze the last few microseconds out, you have to use C++ for your programming language. You have to use cut through switches, switches with deterministic latency. You have to use host adapters for the network, which do what is called "onload" - used to avoid having the operating system introduce variance and extra latency. Getting accurate measurements across colos and the synchronisation of clocks across colos using GPS technology and other technologies has also been a big part of being successful.

AT: How does that contrast with SEC technology?

MB: The SEC is the opposite challenge. There is a similar problem in that you have to be able to collect these billions of messages and accurately time stamp them. So, collect them over a low latency network accurately, receive them, transform them, time stamp them, record them and do that really well. But once you have that data it becomes stored, and that is a very different problem from the high performance trading problem. It starts with that same network, that same technology base that we use to collect all these billions of messages every day, but then it becomes a big data problem because now you are collecting hundreds of gigabytes of data every single day. You need to store it all, and you need to organise it in a way that somebody can rapidly get to the information they want, or analyse lots of the data in parallel and that is where this problem becomes more like what is called "big data".
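The idea of organising time-stamped messages so that a researcher can rapidly get to the information they want can be illustrated with a small Python sketch. It assumes messages are partitioned by trading day and symbol and kept sorted by timestamp, so a time-window query becomes two binary searches. The `TickStore` class is a hypothetical in-memory stand-in, not the storage layout used for the SEC system, which the interview does not describe.

```python
import bisect
from collections import defaultdict


class TickStore:
    """Toy store: messages partitioned by (day, symbol), sorted by timestamp."""

    def __init__(self):
        self._times = defaultdict(list)     # (day, symbol) -> sorted timestamps (ns)
        self._payloads = defaultdict(list)  # (day, symbol) -> payloads, same order

    def append(self, day, symbol, ts_ns, payload):
        """Insert a message, keeping the partition sorted even if it arrives late."""
        key = (day, symbol)
        pos = bisect.bisect_right(self._times[key], ts_ns)
        self._times[key].insert(pos, ts_ns)
        self._payloads[key].insert(pos, payload)

    def query(self, day, symbol, start_ns, end_ns):
        """Return (timestamp, payload) pairs with start_ns <= timestamp < end_ns."""
        key = (day, symbol)
        times = self._times.get(key, [])
        payloads = self._payloads.get(key, [])
        lo = bisect.bisect_left(times, start_ns)
        hi = bisect.bisect_left(times, end_ns)
        return list(zip(times[lo:hi], payloads[lo:hi]))


store = TickStore()
store.append("2010-05-06", "XYZ", 300, "trade")
store.append("2010-05-06", "XYZ", 100, "add")      # arrives out of order
store.append("2010-05-06", "XYZ", 200, "cancel")
store.append("2010-05-06", "ABC", 150, "add")
window = store.query("2010-05-06", "XYZ", 100, 300)
```

Partitioning by day and symbol means a query touches only the slice it needs. That is one common way to keep retrieval fast as volumes grow, and it also makes it natural to process partitions in parallel.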

The high trading performance problem is a niche problem, this problem of low latency engineering that only a few people really care about. But everybody is hearing about big data. Now that I have terabytes and terabytes of data collected, years and years of information, how do I pick the needle out of the haystack? The needle being the thing I am trying to figure out - some problem in the markets, or some research or analysis that as a researcher I need to find. And that is solved with a completely different tool set. Basically, take the data that is collected in the colo and put it up into some kind of scalable system. We chose to do it in the Amazon Web Services cloud in 2007 because we could scale out the analysis with more and more machines when we needed them, and spin them down when we didn't and only pay for what we use. The cloud gave us the ability to spin up 100 Linux servers, analyse terabytes and terabytes of data, get our answer, and then spin them back down. That was an early decision by us to begin working in the cloud for large scale research and it has served us extremely well. When the SEC came along we were able, by using the resources of that cloud, to make a private cloud system, make a separate area of the cloud that is only for the SEC, and provide the data and tools to them without having to go to a data centre and put it on a whole new bunch of computers and storage system. It allowed us to rapidly deploy the system. From the time we actually got the SEC contract, to the time there was a system stood up and they had the authority to operate was about six months. Which is very fast, considering that a lot of effort involves complying with many different government rules and security regulations.
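The scale-out pattern Beller describes (split the data, analyse the pieces on many machines, combine the answers) can be sketched on a single machine in Python. This example assumes each partition is a list of `(symbol, message_type)` pairs and counts cancel messages per symbol. It uses a thread pool purely as a stand-in for a fleet of cloud servers. The function names and sample data are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def scan_partition(partition):
    """Map step: count cancel messages per symbol within one partition."""
    counts = Counter()
    for symbol, msg_type in partition:
        if msg_type == "cancel":
            counts[symbol] += 1
    return counts


def scan_all(partitions, workers=4):
    """Run the map step on every partition in parallel, then merge the results."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(scan_partition, partitions):
            total.update(partial)
    return total


# Made-up data: each inner list stands for one day's worth of messages.
partitions = [
    [("XYZ", "add"), ("XYZ", "cancel"), ("ABC", "cancel")],
    [("XYZ", "cancel"), ("ABC", "trade")],
    [("ABC", "cancel"), ("XYZ", "cancel")],
]
cancel_counts = scan_all(partitions)
```

Because each partition is scanned independently and the partial counts are simply added together, the same structure works whether the workers are threads on one machine or servers that are spun up for the job and shut down afterwards.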

AT: There have been some very loud criticisms that your involvement with regulators presents a conflict of interest - how do you respond to that?

MB: The first thing I say is: we provide them data and tools, what they do with them is only their business. We only know what we read in the press and public statements. I am not sure where the conflict of interest comes in. The other point is, even setting that aside, our interest is that the SEC has a clear picture of what is going on in the markets. That they have all the capabilities they need to have a clear picture of what is going on in the markets, because they need to be making decisions from the data.

If you think of how the US markets are regulated, it is intrinsically a public-private partnership. The SEC regulates equities and options but other regulatory bodies are businesses. NASDAQ is a regulatory body, FINRA is a regulatory body - these are industry bodies. Public-private partnership is the way that America decides to regulate its financial markets and that is important because if you try to put up too big a wall between those, then there is going to be too huge a gap. The only way the SEC could get the level of capabilities to see the market the way a practitioner sees the market is to have the same capabilities as those practitioners.

AT: How do you see private public partnerships evolving?

MB: I think the next stage for US equity markets is the Consolidated Audit Trail. Regulators need the ability to see everything that is going on in the markets. MIDAS (Market Information Data Analytics System) gives them the ability to see public feeds, in other words, feeds that are publicly available to anyone who wants to pay the registered price; it is not secret data. It costs to get it because it is so expensive to produce, and these firms want to make money on their products and services. The point is that it is public information. There are a lot of other messages going on in the markets that are much harder for regulators to see quickly: all of the activity related to dark pools, and all the messages going back and forth between brokers and customers that are not on the lit markets. So the CAT will be a great way now for regulators to see an even more complete picture of what is going on in the markets. If you consider the MIDAS project to be a big data problem, then the Consolidated Audit Trail is a huge data problem. It is going to be tens of petabytes in scale and it is going to be a very sensitive database because it will be storing all the old orders and acknowledgements of individuals and companies on all US equities and options markets. It is a fat juicy target from a security perspective, so there is going to have to be a lot of work at making sure it is secure and that the tools organising it are scalable to operate on a database tens of petabytes in size. It is a huge technical challenge and it is one that Thesys Technologies is participating in by responding to the RFP. We hope we are selected to do it, but in any case we hope the market gets a good solution.