The Gateway to Algorithmic and Automated Trading

Evolutions and revolutions: how MTS keeps up with fast-changing technology

Published in Automated Trader Magazine Issue 26 Q3 2012

How does an electronic market make sound decisions when technology is changing at such a rapid pace? What are the trade-offs between price and performance? Andy Webb spends some time with MTS CTO Fabrizio Cazzulini and Fabrizio Testa, head of product development, to get the answers.

Fabrizio Cazzulini

Andy Webb: We'll start with investment in technology - how does MTS as an organisation think about investment in technology? Is there a board-level policy?

Fabrizio Cazzulini: The process essentially involves putting together a proposal for the board, based on the prevailing business requirements and the best available technologies to meet those requirements. Specifically, we look at the toolbox of available technical solutions and decide which is the best fit for the business requirements, also taking into account the cost of the various alternatives. For instance, in the continuous pursuit of increasing capacity and reducing latency and costs without upsetting the existing infrastructure, a decision was made to move towards a Red Hat Enterprise MRG (Linux) and InfiniBand infrastructure. We realise that, in the future, we may need to move to an FPGA-based technology which, at the moment, may entail very high costs and complexity.

Andy Webb: From a purely hypothetical perspective, say that you know a particular technology will not only be the best solution for this year or next year but will enable you to 'future proof' a lot of your development. In other words, it's got a nice development path over the next decade but it costs more than another technology. What's the sort of thinking at board level?

Fabrizio Cazzulini: It depends pretty much on the situation. If the argument and proof of concept can demonstrate clear benefits of a new technology, the board would take cognizance of that. At the end of the day, we are an electronic market and if a case can be made to support the use of a new technology, it would be in the best interests of the company for the board to approve investment in the technology. Given that we are running a mission-critical system, we are wary of being a 'guinea pig' for a new solution. We tend to adopt a second release of the product just to make sure that we are not facing possible issues.

Andy Webb: You mentioned FPGAs. Can you go into a little more detail on that point? A lot of our readers are using fairly low-level FPGAs, which are not very expensive at the moment. By 'too expensive' do you mean the implementation rather than the hardware?

Fabrizio Cazzulini: Yes, the hardware is relatively cheap. It's the effort that would be required to design a fully fledged solution, especially one with rich functionality. For instance, we have a platform like BondVision which is quite rich from a functionality perspective. In other words, it does more than the straightforward order matching found at most exchanges. We feel that trying to shift from the traditional paradigm of software running on normal servers to FPGA could be very expensive in terms of development effort as well as maintenance, because at the moment one of our main requirements is to continually evolve our products. In an FPGA scenario that would be very costly, as it would mean redoing everything every time. However, I think that FPGA may be very beneficial if we moved more intensively towards a standard protocol like FIX in the future. Also, purely from an R&D perspective, we are exploring technologies like CUDA, in terms of exploiting the speed of accessing the memory of the graphics processing unit. I've actually seen a working proof of concept. It's really like a toolbox which we try to keep updated and ready to use when there is a strong case for it.
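
Purely as an illustration of the sort of GPU experiment described (and not MTS's actual proof of concept), the sketch below times a host-to-device copy from pinned memory using the CUDA runtime API; the 64 MB buffer size is an arbitrary assumption and no kernel work is performed.

    // Illustrative sketch (not MTS's proof of concept): timing a host-to-GPU
    // copy from pinned (page-locked) memory with the CUDA runtime API, the
    // sort of "GPU memory access" experiment described above.
    #include <cuda_runtime.h>
    #include <cstdio>

    int main() {
        const size_t bytes = 64 * 1024 * 1024;       // 64 MB test buffer (assumption)
        void *host = nullptr, *device = nullptr;

        cudaMallocHost(&host, bytes);                // pinned host buffer
        cudaMalloc(&device, bytes);                  // device buffer

        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        cudaEventRecord(start);
        cudaMemcpy(device, host, bytes, cudaMemcpyHostToDevice);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        std::printf("Host-to-device copy: %.3f ms (%.2f GB/s)\n",
                    ms, (bytes / 1e9) / (ms / 1e3));

        cudaEventDestroy(start);
        cudaEventDestroy(stop);
        cudaFree(device);
        cudaFreeHost(host);
        return 0;
    }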

MTS system performance evolution

Andy Webb: It's very interesting that you mention CUDA. I've discussed FPGAs with a number of exchanges and their approach generally seems to be, yes, we think FPGAs are relevant but we don't think CUDA is relevant to us. I wouldn't say that they dismissed it but they don't see GPUs as the technology, they see FPGAs as the technology for them.

Fabrizio Cazzulini: CUDA can be seen as a very proprietary technology.

Andy Webb: Maybe we can move on to talk about this specific project, and more specifically the feedback from clients. I particularly wanted to hear about enriched functionality. Were clients being specific about what they wanted in terms of enriched functionality?

Fabrizio Testa: We try to gather intelligence from clients on a continuous basis, and I like to stress that those who are most active in providing feedback and suggestions, from the low-level solution to the front end, are, at this stage, the buy side - probably because they are getting more and more sophisticated in the way they carry out business electronically, and more functionality is required. They currently don't access an order book, but trade using a request for quote or an executable order, and therefore they are the ones coming up with suggestions to meet their business requirements.

Recently, across all our products, from money market to cash, and possibly swaps in the near future, they have either tweaked existing functionality or come up with new ideas. An area where we are particularly active is BondVision, and especially the dealer pages section. Dealer pages are a piece of real estate that we have made available to the market makers so that they can provide private prices to their investors. Then, of course, at the front-end level, we try to make life easier for buy-side clients by allowing them to see aggregated prices from all the dealers, to check requests for quotes as well as executable orders at once, and to trade multiple legs, up to 20. This, again, is what needs a lot of fine tuning on a regular basis.

Ultimately, what you want to achieve is to be in line with the regulators' requirements and to offer the fewest possible clicks to execute orders. This is achieved by integrating the clients' Order Management Systems with our front-end solution via the FIX protocol.
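
For readers unfamiliar with the protocol, the sketch below shows, in purely hypothetical form, the kind of order message a client's OMS might send over FIX: a minimal FIX 4.4 NewOrderSingle built as a tag=value string. The session identifiers, instrument code, price and quantity are invented placeholders, and a real integration would rely on a FIX engine and the venue's rules of engagement rather than hand-built strings.

    // Hypothetical illustration: a minimal FIX 4.4 NewOrderSingle (35=D)
    // built as a tag=value string. All field values are placeholders.
    #include <cstdio>
    #include <string>

    // Append one "tag=value" field followed by the SOH (0x01) delimiter.
    static void field(std::string& msg, int tag, const std::string& value) {
        msg += std::to_string(tag) + "=" + value + '\x01';
    }

    int main() {
        std::string body;
        field(body, 35, "D");                        // MsgType: NewOrderSingle
        field(body, 49, "BUYSIDE_OMS");              // SenderCompID (placeholder)
        field(body, 56, "VENUE");                    // TargetCompID (placeholder)
        field(body, 34, "42");                       // MsgSeqNum
        field(body, 52, "20120703-09:30:00.000");    // SendingTime
        field(body, 11, "ORD-0001");                 // ClOrdID
        field(body, 55, "IT0000000001");             // Symbol/ISIN (placeholder)
        field(body, 54, "1");                        // Side: Buy
        field(body, 38, "5000000");                  // OrderQty
        field(body, 40, "2");                        // OrdType: Limit
        field(body, 44, "101.25");                   // Price

        std::string msg;
        field(msg, 8, "FIX.4.4");                    // BeginString
        field(msg, 9, std::to_string(body.size()));  // BodyLength
        msg += body;

        unsigned sum = 0;                            // CheckSum (tag 10) is the sum
        for (unsigned char c : msg) sum += c;        // of all bytes so far, mod 256
        char checksum[4];
        std::snprintf(checksum, sizeof(checksum), "%03u", sum % 256);
        field(msg, 10, checksum);

        std::string display = msg;                   // show SOH as '|' for readability
        for (char& c : display) if (c == '\x01') c = '|';
        std::printf("%s\n", display.c_str());
        return 0;
    }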

On the money market side, for instance, in partnership with NewEdge we came up with a cash management facility which operates like an auction. Again, we are different from other solutions in the market that try to fit requests for competitive quotes into the money market space. We came up with a new solution that, twice a day, brings together cash providers and collateral providers and runs a very neat auction. The process has very distinct stages, which are very transparent but at the same time prevent front running. Ultimately, what it achieves is to bring together these two communities - and most importantly, the solution is managed by a tri-party repo agent, so it allows counterparty risk, and especially collateral management, to be in the hands of a trusted third party.

Again, this arises from client feedback about what they need in order to be able to trade repos. It is difficult for a buy-side client to enter the inter-dealer space for the repo market, and the requirements are very different from those of the sell side. A request for quote wasn't exactly the right type of tool for executing this type of transaction, whether you are long cash or long collateral. When NewEdge and MTS developed this solution, we did a lot of research, and in the end we came up with something different that was not available elsewhere.

Andy Webb: Just looking at your path of development here and the sorts of additions you have made over time, for example MTS Credit, MTS Hungary and so on, what I am interested in is your testing process before putting new functionality into live production. How do you manage that?

Fabrizio Cazzulini: In new markets, such as MTS Credit or MTS Czech Republic for instance, we are not so heavy on development because, from a functional standpoint, we listed new instruments and interfaced with downstream systems for the settlement and clearing of those instruments. For the new markets, the activity has mainly been making sure, for two or three months, that those instruments were listed in our test environment so that users could test the integration of their pricing and deal-capture systems with these instruments, and the focus was more on integration with downstream systems. In some cases we have longer periods of testing, but much depends on the content of the release and its complexity. For instance, we are about to launch a new product called MTS Live, which is a low-latency feed. Due to the sensitive nature of low latency and the type of tuning that a user needs to do, we started testing at the end of January and we are allowing five months in total for testing before going live.

Andy Webb: Would you say then that MTS Live has been one of the more demanding projects in that respect?

Fabrizio Cazzulini: It is certainly the product with the longest testing period before going live.

Andy Webb: And when stress-testing involves a third party, what form does that take? Does it involve firing simulated trades at the platform at an extremely high rate, or sudden bursts of activity?

Fabrizio Cazzulini: We do stress testing jointly with an independent third party, in addition to the technical vendor SIA, which is implementing the platform. We go through a list of test cases which simulate, for instance, an intense period of activity in quotations and orders, peaks of order matching, and peaks of connections to the market. So we are trying to stress each individual component as much as possible by using specific test cases.
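
By way of illustration only (this is not MTS's or SIA's actual harness), a stress-test case of the kind described might pace simulated order traffic at an assumed peak rate for a fixed burst; the rate, duration and the send stub below are all placeholders.

    // Illustrative burst generator: fires simulated order messages at an
    // assumed peak rate for a fixed duration. Not a real test harness.
    #include <chrono>
    #include <cstdio>
    #include <thread>

    // Placeholder: a real harness would encode and transmit a protocol message.
    static void send_simulated_order(long id) { (void)id; }

    int main() {
        const long rate_per_sec = 50000;   // assumed peak message rate
        const int  burst_secs   = 10;      // assumed burst duration
        const long per_ms       = rate_per_sec / 1000;

        const auto end = std::chrono::steady_clock::now()
                       + std::chrono::seconds(burst_secs);
        long sent = 0;

        // Send in 1 ms batches to approximate the target rate during the burst.
        while (std::chrono::steady_clock::now() < end) {
            for (long i = 0; i < per_ms; ++i) send_simulated_order(sent++);
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        std::printf("Sent %ld simulated orders over %d s\n", sent, burst_secs);
        return 0;
    }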

Andy Webb: The test cases you are using presumably involve volumes or frequencies several multiples in excess of anything you have previously seen in the live market?

Fabrizio Cazzulini: Indeed.

Andy Webb: You've mentioned Linux. Prior to Linux, what operating system did you use?

Fabrizio Cazzulini: We migrated to Linux at the beginning of 2006, so it has been a relatively long period. Before that, we were running on HP NonStop (Tandem) technology. We are using Red Hat Enterprise MRG, which is a version of Linux supporting messaging, real-time functions and grid capabilities. It has native support for technologies like remote direct memory access embedded directly in the operating system, exposed through a messaging layer based on AMQP. Essentially we can use this transparently with our InfiniBand infrastructure without having to deal with low-level details.
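
As a minimal sketch of what an AMQP messaging layer of this kind looks like to application code (MTS's own integration is handled by its technical vendor), the following uses the Apache Qpid C++ Messaging API that ships with MRG Messaging; the broker address, queue name and RDMA transport option are assumptions, not MTS configuration.

    // Minimal publish/consume sketch with the Apache Qpid C++ Messaging API
    // (the AMQP messaging component of Red Hat Enterprise MRG).
    #include <qpid/messaging/Connection.h>
    #include <qpid/messaging/Session.h>
    #include <qpid/messaging/Sender.h>
    #include <qpid/messaging/Receiver.h>
    #include <qpid/messaging/Message.h>
    #include <qpid/messaging/Duration.h>
    #include <iostream>

    using namespace qpid::messaging;

    int main() {
        // The transport option lets the same application code run over TCP
        // or, where the fabric supports it, RDMA (e.g. over InfiniBand).
        Connection connection("broker.example:5672", "{transport: rdma}");
        try {
            connection.open();
            Session session = connection.createSession();

            Sender sender = session.createSender("market.data; {create: always}");
            sender.send(Message("quote update"));            // publish

            Receiver receiver = session.createReceiver("market.data");
            Message msg = receiver.fetch(Duration::SECOND);  // consume
            std::cout << msg.getContent() << std::endl;
            session.acknowledge();

            connection.close();
        } catch (const std::exception& e) {
            std::cerr << e.what() << std::endl;
            connection.close();
            return 1;
        }
        return 0;
    }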

Andy Webb: So because the version that you're using has got messaging embedded in it, that makes your interface with InfiniBand easier to achieve?

Fabrizio Cazzulini: Much easier to handle, yes.

Andy Webb: When did you switch to InfiniBand?

Fabrizio Cazzulini: We switched to InfiniBand with the latest release of our CMF platform back in November 2011. [Before that] we were essentially using normal 1 gigabit connectivity across the servers. Now we have moved over to blade servers from HP and essentially everything is quite well embedded within the enclosure. So we have Red Hat Enterprise MRG, with InfiniBand providing good 'glue' and actually connecting all these blades.

Andy Webb: Plus your footprint goes down. When you are in the last stages of testing prior to going live and you've got clients running up tests on their side as well, is there a standard process you go through with each client individually, or does it vary from client to client?

Fabrizio Cazzulini: We tend, as much as possible, to leverage joint effort and activity with independent vendors, so with our current conformance process we often focus on whether the software we are delivering to the user is compliant with the platform and in line with the test cases and business expectations. Then, if a bank has a specific in-house development or modified component which may affect the conformance test being done on the solution, we run a case-by-case test with the client. In addition, we are currently reviewing our conformance process in order to make it even more effective.

Andy Webb: Perhaps we could move on to how you monitor future technology. Do you just read the right magazines or look at the right websites? You're obviously looking at FPGAs and GPUs already, but more generally is there a process for scanning the marketplace?

Fabrizio Cazzulini: I would say we don't have a structured process for scanning new technologies. I would say, rather, that it is a mix of continuous contact with technical vendors, gathering business intelligence from the industry, and essentially, case by case, depending on new technologies, there is continuous self-training.

Andy Webb: Would you say you have different skill sets in house, so that it's a case of people spotting different developments? How do they go about feeding that back so it can be considered for possible use in the future?

Fabrizio Cazzulini: It's pretty much a bottom-up process - for instance, I can use the example of our internal data warehouse. If one of the specialists in the team spots an interesting new technology or tool, we discuss it internally, we start weighing the possible benefits against the costs, and then we create a business case. Over the last three months, we essentially made a decision on the direction of our data warehouse cluster and of the system, plus the introduction of new Oracle tools, and built up a business case starting from the input of the specialists in the team.

Andy Webb: Interesting, I should have asked this before, but what programming languages are you using in-house, particularly with the core platform?

Fabrizio Cazzulini: For the core market platform, the actual coding is done by our main technical vendor, SIA. They use C++ as the programming language at this stage.

Andy Webb: Are you using C# for GUI components, things like that?

Fabrizio Cazzulini: We are using .NET for these components.

Andy Webb: Are any other languages used for other specialist activities?

Fabrizio Cazzulini: We have, for instance, components for data distribution which are built in Java. We have web applications for distributing data to regulators and these are built using Enterprise Java technologies mainly, and then we have other web applications internally which are built on C++. We tend to use a good mix actually!

Andy Webb: Whatever works for whatever job!

Fabrizio Cazzulini: Exactly. As I mentioned, it is pretty much like having a toolbox and choosing the best tool for the job on a case by case basis.

Andy Webb: Absolutely. When you are looking at a new technology - and maybe we can use FPGAs and GPUs as an example - it's obviously not just the cost of the hardware, it's the cost, as you said earlier, of the migration. Moving from a serial programming mindset to a parallel programming mindset means quite a lot of codebase to rewrite. But is there also the question of support and skill availability?

Fabrizio Cazzulini: Absolutely, and this skill set is still quite specialised... especially combined with experience of the financial industry and markets. That's why, in some cases, it's good to track a certain technology and wait for wider adoption and lower costs as well. Also, unless we have a specific need to leapfrog the competition or make a paradigm shift on latency, we tend just to track. The floor of the roundtrip time for our transactions is around 200 microseconds right now, even if the average is a bit higher. This floor indicates to us the level from which we could scale up horizontally on the current platform, which would be the easiest way to improve performance. If, at some stage, that level of latency is no longer adequate, we would automatically have a case for shifting and would start evaluating new technologies.

Andy Webb: This is in regard to scaling your platform to further reduce latency?

Fabrizio Cazzulini: Exactly. Just to provide a very low-level example, at the moment we have roughly 4,000 instruments listed on the cash market, and one way of adding capacity would be to dedicate a single CPU to each individual security. It may be the easiest way of scaling up. In any case, we don't need that at the moment, because we are improving the system by making the code more efficient, by evolving the hardware, et cetera. One possible direction for introducing more capacity and reducing latency would be a sort of 'break-even' analysis across all the possible solutions. We would end up selecting the best and most cost-effective way, actually, of improving the performance.
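
As a hypothetical sketch of that scaling idea, not MTS's implementation, the code below partitions instruments across worker threads pinned to individual cores so that any given order book is only ever touched by one core; the instrument identifier, hash-based routing and Linux-specific affinity call are illustrative assumptions.

    // Hypothetical sketch: partition instruments across worker threads, each
    // pinned to its own core. Linux-specific (pthread affinity).
    #include <pthread.h>
    #include <sched.h>
    #include <algorithm>
    #include <cstdio>
    #include <functional>
    #include <string>
    #include <thread>
    #include <vector>

    int main() {
        const unsigned workers = std::max(1u, std::thread::hardware_concurrency());
        std::vector<std::thread> pool;

        for (unsigned core = 0; core < workers; ++core) {
            pool.emplace_back([core] {
                // Pin this worker to its own core.
                cpu_set_t set;
                CPU_ZERO(&set);
                CPU_SET(core, &set);
                pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
                // In a real system the worker would own the order books of its
                // partition and process their events sequentially.
                std::printf("worker pinned to core %u\n", core);
            });
        }

        // Route an instrument to its owning worker by hashing its identifier.
        const std::string isin = "IT0000000001";   // placeholder identifier
        const unsigned owner = std::hash<std::string>{}(isin) % workers;
        std::printf("%s handled by worker %u\n", isin.c_str(), owner);

        for (auto& t : pool) t.join();
        return 0;
    }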

Andy Webb: You talked about scaling things in order to reduce latency. Presumably the switch to blade from conventional rack servers must have saved a lot of space. How about in terms of speed? So if you suddenly realised activity has picked up dramatically in an instrument and you needed to scale up - how quickly can you do that?

Fabrizio Cazzulini: The system is already quite distributed and modular. We always stress-test new software versions, configurations and set-ups in a proper test environment. The level of traffic simulated and supported is far higher than in normal market conditions. We have a stress-test environment which is an exact copy of the production environment. In any case, it would be a matter of a weekend, actually, to deploy the hardware and configure the system to distribute the load and the order books across additional hardware.

Andy Webb: You use identical hardware - where's your testing based?

Fabrizio Cazzulini: We use a Disaster Recovery Facility. We have a primary facility for production and we have a facility where we run the test environment plus disaster recovery.

Andy Webb: Assuming you want your disaster recovery to work, you've got to run three lots of hardware: one for production, one for disaster recovery and one for whatever it is you're testing that you might next put into production.

Fabrizio Cazzulini: Yes, that is one of the risks for investment, unfortunately.

Andy Webb: Not cheap!

Fabrizio Cazzulini: Not cheap at all. We are heavily regulated and audited so we need to show that we have a proper progression between production and tests, and we have to run periodic stress tests and disaster recovery tests in the Disaster Recovery Facility.

Andy Webb: So you literally have regulators showing up and asking that you demonstrate a hot recovery - or are there just standard tests you run anyway?

Fabrizio Cazzulini: We do standard tests anyway, but the regulators of our markets do annual audits with an external auditor, and we have to show and demonstrate the disaster recovery tests.

Also, we have internal policies which go in the very same direction. I mean, it's good practice at the end of the day, testing and making sure that the facility works.

Andy Webb: When it comes to selection of hardware, how does that work?

Fabrizio Cazzulini: For specific hardware selection, we rely extensively on our main technical partner, SIA, and for the foreseeable future I think we will use HP, unless there are dramatic innovations or improvements, or shifts from other vendors. SIA proposes solutions, and the acceptance and final sign-off rest with us.

Andy Webb: It's an extremely long relationship - you've been together for how long?

Fabrizio Cazzulini: The relationship between MTS and SIA dates back at least to the early '90s - over 20 years ago, before I joined MTS.

Andy Webb: When you're looking for a future technology, does it start from someone internal seeing something promising, or do you get input from SIA?

Fabrizio Cazzulini: Typically we have quite a cooperative approach with them and we carry out technical analysis jointly. From time to time, we carry out our due diligence autonomously and we also collect input from various technical vendors anyway.

Andy Webb: We've talked about FPGAs and GPUs - is there any other technology out there at the moment which you think has particular promise for MTS but is not quite ready now? What do you see when you peer over the horizon?

Fabrizio Cazzulini: We are using fairly modern mainstream technology based on Linux, InfiniBand and, of course, Intel processors. But apart from GPUs and FPGAs, we can't see huge paradigm shifts emerging. On specific aspects like memory, storage et cetera, there is continuous evolution, but I would say this is more 'evolution' than 'revolution'. In this respect, GPUs and FPGAs are revolutionary technologies. We constantly keep an eye on new types of storage, memory and network infrastructure, more from an evolutionary standpoint.

Andy Webb: Thank you both very much for your time, it's been great.