Guy Warren, ITRS Group
"The world doesn't want to hear that outages happen..."
At a recent conference, FINRA's CRO Carlo di Florio asked whether algos are governable. The speech attracted a lot of attention, not least because of the technical glitches consistently making headlines around the world.
Referring to the Knight Capital meltdown in 2012, di Florio said that these malfunctions "raise concern about the ability of firms and market to develop, implement and effectively supervise these systems".
The SEC, he added, took some important initial steps by passing the Market Access Rule, 15c3-5 - which requires that financial risk management controls and supervisory procedures be reasonably designed to systematically limit the financial exposure of broker-dealers arising as a result of market access - and proposing Reg SCI (Regulation Systems Compliance and Integrity).
Reg SCI, which is expected to be revisited this month, is intended to enforce written procedures and audits for safe systems in the financial markets. The idea is taken from ISO 9000 best practices, said Guy Warren, CEO of ITRS Group.
ITRS provides monitoring and alert notifications for clients such as global investment banks, which are often subject to a wide swathe of regulations from numerous jurisdictions. Reg SCI is just one of many incoming sets of guidelines meant to improve visibility into what is happening in the markets, he added.
As a case in point, a recent technical glitch at NYSE, which seemed to have been caused by a hardware failure, resulted in US market-wide problems publishing and receiving trades and quotes. "Hardware can fail, you can't stop that, but you need to have processes for failing," said Warren.
But can a complex network the likes of the global financial markets be expected to have a 100% success rate?
If an infrastructure is aiming for a 99.99% availability rate, that allows about 52 minutes of downtime per year. NYSE, Warren added, was down about 30 minutes, which is well within an acceptable range. Compare that to, for example, control systems for nuclear power stations with targets of 99.99995% availability rates - or about 16 seconds of downtime per year.
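The arithmetic behind availability targets like these can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and it assumes a 365-day year with the target measured over the whole year rather than over trading hours. Under those assumptions, 99.99% works out to roughly 52.6 minutes of downtime per year and 99.99995% to roughly 16 seconds.

```python
# Assumption: a 365-day year (leap years ignored), availability measured
# around the clock rather than over exchange trading hours only.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000


def annual_downtime_seconds(availability_percent: float) -> float:
    """Return the downtime per year permitted by an availability target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be a percentage between 0 and 100")
    return (1.0 - availability_percent / 100.0) * SECONDS_PER_YEAR


# The two targets mentioned in the article.
for target in (99.99, 99.99995):
    seconds = annual_downtime_seconds(target)
    print(f"{target}% -> {seconds / 60:.1f} minutes ({seconds:.0f} seconds) per year")
```

If availability were measured over trading hours only, the permitted downtime would shrink in proportion, which is one reason the measurement window matters in any discussion of what counts as acceptable.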
"When you build systems like that, you can't allow a human to make a decision that something should fail over (and) it is a lot more expensive," explained Warren, who started his career in such control systems.
"The world doesn't want to hear that outages happen or that 30 minutes (of downtime) might be OK but there should be a grown-up discussion between the regulator and these key forums, particularly exchanges, but also regulated central banks and equivalent, about what is an acceptable level of availability. You don't want knee-jerk reactions and histrionics."
The most modern approach for control systems in the current environment, given cost constraints, is to design infrastructure and applications to run "dual hot". This means that both the main and backup data centres are providing service all the time.
It would be best practice, and it should be in the regulation, Warren added, that a failover is performed once a month, meaning that if a standby is running, it is tested to ensure it can pick up when needed.
"You should fail over to (disaster recovery) once a month, or once every three months just so you know the DR site is fully operational and fully working," he said, adding that it's very easy for a DR site to drift out of sync with the production site. "You fix a bug, you change your configuration, but has that been done in the DR site?"
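One way to catch the kind of drift Warren describes between failover tests is to compare the settings deployed at each site. The following is a minimal sketch, not a description of any ITRS product: the function name, the setting names and the values are all hypothetical, and it assumes each site's configuration can be exported as a flat set of key-value pairs.

```python
ABSENT = "<absent>"  # marker for a setting that exists on only one site


def find_config_drift(production: dict, dr: dict) -> dict:
    """Return {setting: (production_value, dr_value)} for every mismatch."""
    drift = {}
    for key in sorted(production.keys() | dr.keys()):
        prod_value = production.get(key, ABSENT)
        dr_value = dr.get(key, ABSENT)
        if prod_value != dr_value:
            drift[key] = (prod_value, dr_value)
    return drift


# Hypothetical settings: a bug fix and a new timeout reached production only.
production = {"app_version": "2.4.1", "max_order_size": 10000, "feed_timeout_ms": 250}
dr_site = {"app_version": "2.4.0", "max_order_size": 10000}

for key, (prod_value, dr_value) in find_config_drift(production, dr_site).items():
    print(f"{key}: production={prod_value!r}, DR={dr_value!r}")
```

A check like this only shows that the two sites are configured alike; it does not replace an actual failover, which is what demonstrates that the DR site can carry the load.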
Meanwhile, as OTC products get a big regulatory push onto exchanges and through clearing, and as dark pools continue to fragment other markets, regulators should have a hand in mandating best practices for systems controls, Warren added.
"If exchanges are run as commercial businesses, there is a pressure not to spend all the money on change control, and testing and monitoring of complex architectures," he said.
"Regulators are trying to say, you have a responsibility to the market, it is not just you and your P&L. A lot of people rely on you being there as a key part of the financial infrastructure."