How CPUseful is that?

Published in Automated Trader Magazine Issue 25 Q2 2012

If it's field-programmable, it's more flexible than an ASIC. And if it's a post-crash, over-regulated, potentially lucrative but problematic market, it's going to take all the flexibility you can find (and probably more) to build an effective alpha-extracting solution for trading it. James Fitzgerald debates the contemporary case for FPGAs.

Yes, we all know that FPGAs have been around for a while. But in recent years - months, even, to judge from the Latest News inbox - they've begun to emerge as a major factor in competitive automated trading. According to a Linley Group report published in 2009, the FPGA technology market is expected to grow to 3.5 billion USD by 2013, whilst Automated Trader's own research has identified that over the next two to three years FPGA usage will grow to 19% of buy-side and 41% of sell-side firms. According to a recent conversation with the young apprentice charged with combing the Latest News inbox for significant trends, the FPGA-related press release is set to become a major factor in all our lives by, oh, about Q3 of this year.

Why? The design and programming of FPGAs is a lot simpler than it used to be, so they are being used more frequently to solve more and more complex trading problems. Also, with programmers well versed in hardware description languages (HDLs) becoming more and more common (albeit no less expensive), FPGAs are being implemented in a variety of trading strategies and solutions.

So what does this mean for the rest of us? First off, the use of FPGAs allows deterministic latency to be achieved via the pipelining of data (see box). Deterministic latency ensures that the speed at which orders are processed is ring-fenced from any flashes in the markets. Marc Battyani, CTO, NovaSparks, says: "Pipelining increases the throughput of a system when processing a stream of data. In terms of processing network data, a pipeline clocked at the same rate as incoming traffic can guarantee to process each byte of network data, even at 100% utilization."
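To make the "deterministic" part concrete, here is a minimal software sketch of a fixed-depth pipeline. It is a toy model, not how anyone's FPGA is actually programmed: the function name `run_pipeline`, the three stage functions and the sample inputs are all invented for illustration. The assumption is that every stage takes exactly one clock cycle, so each item leaves the pipeline a fixed number of cycles after it enters, whether the input is saturated or nearly idle.

```python
def run_pipeline(inputs, stages):
    """Clock `inputs` through `stages`, one item per cycle.

    `inputs` is a list with one entry per clock cycle; None means an idle cycle.
    Returns a list of (result, cycle_in, cycle_out) tuples.
    """
    depth = len(stages)
    regs = [None] * depth          # regs[i] holds the item that has completed stage i
    finished = []
    for cycle in range(len(inputs) + depth):
        # Whatever sits in the last register leaves the pipeline on this tick.
        if regs[-1] is not None:
            value, cycle_in = regs[-1]
            finished.append((value, cycle_in, cycle))
        # Every other item advances exactly one stage, back to front.
        for i in range(depth - 1, 0, -1):
            prev = regs[i - 1]
            regs[i] = None if prev is None else (stages[i](prev[0]), prev[1])
        # A new item (if any) enters the first stage.
        item = inputs[cycle] if cycle < len(inputs) else None
        regs[0] = None if item is None else (stages[0](item), cycle)
    return finished


# Three made-up one-cycle stages standing in for parse / transform / emit.
STAGES = [lambda b: b & 0x7F, lambda b: b + 1, lambda b: b * 2]

busy = run_pipeline(list(b"FPGA"), STAGES)          # a byte on every cycle
quiet = run_pipeline([1, None, None, 2], STAGES)    # mostly idle cycles

# Latency in cycles is the same for every item, whatever the load.
print([out - entered for _, entered, out in busy])
print([out - entered for _, entered, out in quiet])
```

In this model the latency always equals the pipeline depth (three cycles here), which is the property Battyani is describing: throughput of one item per clock, with no queueing when traffic spikes.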

Field-Programmable Gate Arrays (FPGAs)

• Re-programmable integrated circuit

• Can be altered in-field

• High transistor wastage in development

Application-Specific Integrated Circuits (ASICs)

• Integrated circuit meant for a specific application

• Cannot be altered once created

• Wastes very little material in development

Figure 1: FPGA vs ASIC; nearly a knock-down argument
Simon Garland

Key word there, I think, is "guarantee". For many users, and for my money in these troubled times, FPGAs even represent a preferable alternative to ASICs. Although a full-custom Application-Specific Integrated Circuit chip will provide greater speed once it's up and running, it will take a long time to produce, and any changes post-production are expensive and complex. (Figure 1 gives you the key arguments in the FPGA/ASIC debate.) Indeed, the defining feature of an FPGA chip is its in-field re-programmability, and this sets it up as the hardware of choice for many in automated trading, HFT in particular. FPGAs also have an advantage over CPUs in that they use considerably less power.

FPGAs for today

For some new-ish problems, only an old-ish solution will do. FPGAs have turned out to be useful in overcoming some very contemporary problems in HFT. Since the introduction of the SEC 15c3-5 naked access ban, for example, demand to reduce risk-check latency has been great. As a result, developers have looked to FPGAs for a way of meeting this need. In one case, xCelor has begun designing a risk-evaluation model (REM) using FPGAs to provide a low-latency solution, working with Arista Networks.

Marc Battyani

Interestingly, Arista recently announced the new 7124FX switch, containing 24 ports. Ports 1 through 16 are regular 1/10-Gbps SFP/SFP+ ports, while ports 17 through 24 are directly attached to an Altera Stratix V FPGA board. The switch is designed to be programmed specifically by the customer, and this in-field programming has the potential to reduce latency massively.

Discussing the use of the 7124FX switch to overcome 15c3-5 check-induced latency, Rob Walker, CTO, xCelor, says: "We reduce the 500ns latency when leaving the cabinet by writing and customising for a specific exchange. By doing this, we can reduce latency to 200ns, meaning that in the remaining 300ns we can do additional processing. Using the 7124FX Arista switch, we can do the 15c3-5 checks in under 300ns, and we can do the switching to the exchange in 200ns, which means that we're effectively carrying out the 15c3-5 checks in zero net latency."
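Walker's "zero net latency" claim is simple arithmetic, and it can be sketched in a few lines. The 500ns and 200ns figures come from his quote; the helper names and the assumption that the risk checks and the switching run back to back (so their latencies simply add) are mine, for illustration only.

```python
BASELINE_NS = 500        # latency of the generic path, per Walker's figures
CUSTOM_SWITCH_NS = 200   # exchange-specific switching, per Walker's figures


def headroom_ns(baseline_ns=BASELINE_NS, switch_ns=CUSTOM_SWITCH_NS):
    """Time freed up by the faster switching path."""
    return baseline_ns - switch_ns


def net_added_latency_ns(check_ns, baseline_ns=BASELINE_NS, switch_ns=CUSTOM_SWITCH_NS):
    """Extra latency versus the baseline, assuming checks and switching run back to back."""
    return max(0, switch_ns + check_ns - baseline_ns)


print(headroom_ns())               # budget available for the 15c3-5 checks
print(net_added_latency_ns(300))   # checks that just fit inside the budget
print(net_added_latency_ns(350))   # checks that overrun the budget
```

As long as the checks finish inside the 300ns of headroom, the order leaves no later than it would have on the unmodified 500ns path; only an overrun shows up as added latency.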

Stefan Gratzl

Walker gives the example of trading over an RF link between Chicago and New York. Such a link can be unreliable and expensive, but arbitrage between the two ends is a popular and lucrative strategy. Throw in a small bandwidth allowance (150 megabits) and reliability of 95-99%, depending on the weather, and on balance, you might just pack up and go home. But Walker's point is that you could use the 7124FX Arista switch to increase the reliability of trading over RF.

By placing a switch at either end of the RF link, and after applying a specific image to the FPGA, you can filter out symbols which aren't relevant. This symbol-filtering process allows you to reduce the amount of bandwidth being used. xCelor's Chief Architect, Stefan Gratzl, told me: "The main feature here is the reliability, because reliability of transfer is what gives you additional speed when putting through an order. Most people will sacrifice most of their symbols to get a handful of symbols as fast as they can".
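The symbol-filtering idea can be illustrated in software, with the loud caveat that the real thing runs as an FPGA image at wire speed and looks nothing like this. In the sketch below the message format (a symbol paired with a payload of bytes), the function names and the ticker symbols are all hypothetical; the point is only to show how dropping unwanted symbols before the link shrinks the traffic that has to cross it.

```python
def filter_feed(messages, wanted):
    """Keep only messages whose symbol is in the `wanted` set."""
    wanted = frozenset(wanted)
    return [(symbol, payload) for symbol, payload in messages if symbol in wanted]


def bytes_used(messages):
    """Total payload bytes that would have to cross the link."""
    return sum(len(payload) for _, payload in messages)


# Illustrative feed: four 40-byte updates, only two of which the strategy cares about.
feed = [
    ("ES", b"x" * 40),
    ("AAPL", b"x" * 40),
    ("SPY", b"x" * 40),
    ("MSFT", b"x" * 40),
]
kept = filter_feed(feed, {"ES", "SPY"})

print(bytes_used(feed), "bytes before filtering")
print(bytes_used(kept), "bytes after filtering")
```

This is Gratzl's trade-off in miniature: most symbols are sacrificed so that the handful that matter have the scarce bandwidth to themselves.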

Now the key word is "reliability", although again we're using the word to mean something like "speed by other means".

Accelerating popularity?

JP Morgan, working with Maxeler Technologies, has recently developed an application-led, high-performance computing system based on FPGA technology. The "headline" achievement of the new system has been to reduce the end-of-day risk process from an overall eight hours in 2008 (much of that time taken up with data management) to a current [intra-day] 238 seconds, featuring an FPGA time of 12 seconds. That was not a single, overnight reduction, but the culmination of a series of steps.

Rob Walker

Also significant, although perhaps less dramatic, is a gain in efficiency that is entirely attributable to the new system. A disadvantage of traditional CPUs is their considerable power consumption - and since JP Morgan has just under a million square feet of raised floor space in its data centres, power consumption and efficiency are significant issues. The switch to FPGA technology enabled a considerable power-usage reduction. After initial trials, which yielded the discovery that they could run calculations almost 30 times faster than before, JP Morgan's people set about building a supercomputer containing 80 FPGA boards.

Summing up the evolution of JP Morgan's thinking on FPGAs to date, the firm's Head of Market Strategies, Peter Cherasia, says: "Our core use-case for FPGAs is to accelerate the valuation and risk assessment of select structured derivative products that require complex calculations for a variety of scenarios over very large datasets. The FPGA solution reduced compute time by a factor of 200X, which now allows intraday assessments, as opposed to waiting until the next morning for a start of day exposure."

But they're not for everyone

Such examples prompt the question: why hasn't FPGA usage become widespread? Price is one reason. A standard rack of 10 FPGAs might cost you upwards of $100,000, and the 7124FX Arista switch will retail at $49,995. The difference between 10ns and 5ns latency becomes a moot point if purchasing the technology just isn't financially viable. Plus, of course, there are the added costs of mothballing whatever you bought when you last felt rich.

Matt Davey

And as so often in automated trading, there's a human factor. Programmers well versed in the subtle arts of talking to these particular machines are increasing in demand; keeping them interested can be a challenge. "Very good programmers who can programme FPGAs are certainly in enormous demand," says Simon Garland, Chief Strategist at Kx Systems, "so when a project goes into maintenance mode, they start calling their friends to find a new, interesting gig. Then you're left with this very clever, very fast code which is almost unmaintainable - it's a big risk for a firm to take".

Granted, it is now becoming possible to code onto FPGAs using C++, but that doesn't add up to an argument for taking a DIY approach. But don't panic. More programmers are taking on FPGA-related skills, as I said earlier, and anyway, as Simon Garland says: "A lot of people are still doing well running traditional software programs. Many of them are using shrink-wrapped solutions for their trading infrastructure, and they are the people who aren't doing anything with FPGAs, and probably won't be for a while." Garland adds: "As a result, I'm not sure how quickly we'll see this stuff getting cheap. Like with the compression/decompression boards, once there's a big demand for it, the price will drop way down."

Peter Cherasia

That's almost: if it's cheap, I need it. A valid provisioning strategy, perhaps? In some cases, an ultra-low latency approach simply may not be appropriate. Matt Davey, CTO, Lab49, doesn't see the JP Morgan case cited above as typical. Davey says: "If you look at the business solutions built by the various banks on Wall Street or in London, sometimes FPGAs may be appropriate, but some may also use CPU, cell processors, or GPUs, or even a combination of all four. It's all about understanding the value-add you're going to get and where the problem lies."

Davey makes a further pragmatic point that is easily overlooked. "There's only a certain class of problem that needs very low latency. For instance, if you've got an application on the desktop displaying charts, it's probably not worth putting an FPGA into that desktop to render them, as the human eye can only see a certain number of refreshes a second - the hardware's wasted."

But let's give the final word to Peter Cherasia. "FPGAs are already a key technology in the low latency and high frequency trading space. With changing regulations driving increased transparency and electronic trading expanding across more asset classes, the market opportunity for FPGAs will grow."