
What an idea!

Published in Automated Trader Magazine Issue 24 Q1 2012

If you were to list the key attributes of a 'future-proofed' enterprise, of whatever size, you would probably put sound trade ideas and effectively deployed technology somewhere towards the top. We certainly would. So, for the third of our series of conversations with key industry figures about ensuring the long-term survival of the enterprise, we spoke to Colin Berthoud and then Jason Larsen of TIM Group about - you guessed it - running a business focused on developing sound trade ideas and dependent on effectively deployed technology.
The two interviews were conducted on consecutive days by our founder and all-round techno-guru, Andy Webb, and, despite some degree of overlap, we have resisted the temptation to edit them into one continuous conversation. There are important issues being discussed here, and we were struck by the 'added value' of hearing two distinct perspectives.
Andy began by inviting Colin Berthoud to set the scene...


Andy Webb: Can you give us a bit of background on the company and what you're trying to do?

Colin Berthoud: The main focus of the company is on generating new revenue for our broker and administrator customers through the analysis and evaluation of their data and intellectual property. We then deliver that to investors who use it to support investment decisions. Through our TIM Ideas product, we have become the class leader in the distribution and measurement of trade ideas in equities. Our other product, TIM Funds, is all to do with portfolio management for funds of hedge funds. Portfolio management is more of a post-trade activity whereas trade ideas are more pre-trade. You might expect, in principle, slightly different technologies, but in practice we use the same systems.

Until our recent rebrand, we had been using the name youDevise because originally we did bespoke projects for clients - so it made sense. TIM Group represents the consolidated operation that we have become. Our team is dedicated, knowledgeable and up-to-date which enables us to provide a quality and timely response for our customers.

In both product areas, we decided early on to use open source. We chose that route partly because of cost - you end up not having to pay for licences and things - but also because that suited our developers. They liked the idea of doing something creative outside the Windows environment. The early days of our technology were MySQL, Hibernate and Java. That was our technology stack and we have largely stuck with it.

We also went fully web-interfaced so we had one of the early business SaaS-type environments. That was back in 2004. We were deciding on the technology stack, and we went for the web and a pure browser delivery. So there is no install, we deliver via the web, which is hugely helpful to our agile focus. We release new features into production every two weeks for both products. TIM Ideas has over 4,000 users, so such a frequency of releases is really quite something.

Colin Berthoud


What it really means for the user is that they see very small incremental enhancements every two weeks rather than having to get used to a whole year of enhancements all at once. We actually just brief the relevant people about the enhancements we have made rather than having to send out whole product announcements and having to retrain people for each new release.

Andy Webb: Is the idea behind using that agile technique primarily to give the end users a gradual change in experience rather than them having to cope with a massive change, or is it also from the project management side of things?

Colin Berthoud: It's actually more of the second. A side benefit is that the user gets these small incremental changes, but the primary benefit is that we can change direction very quickly. The typical request we get is: can we do a calculation a different way? For example, can we combine a win probability with a P&L? That's the kind of thing we can turn around in a two-week period if necessary, depending on priorities.
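
To make the example request concrete: one common way to combine a win probability with a P&L is a probability-weighted (expected) P&L. The sketch below is only an illustration under that assumption. The interview does not say which formula TIM Group uses, and the function name and inputs are hypothetical.

```python
def expected_pnl(win_probability: float, avg_win: float, avg_loss: float) -> float:
    """Probability-weighted P&L per idea.

    Hypothetical formula for illustration only (not taken from the interview).
    avg_win and avg_loss are in the same units (e.g. percent); avg_loss is negative.
    """
    if not 0.0 <= win_probability <= 1.0:
        raise ValueError("win_probability must be between 0 and 1")
    return win_probability * avg_win + (1.0 - win_probability) * avg_loss


# An author who is right half the time, gaining 4% on winners and losing 2% on losers.
example = expected_pnl(0.5, 4.0, -2.0)
print(f"Expected P&L per idea: {example:.2f}%")
```

A change of this size, a single pure function with no user-interface impact, is the sort of thing that fits comfortably inside a two-week release cycle.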


Andy Webb: Is there heavy number-crunching involved, in terms of the analytics that you may provide, or is it relatively low power?

Colin Berthoud: There are two types of analytics. There are the analytics within the product that we deliver; those are straightforward although there are quite a lot of them. I wouldn't call them heavyweight. For example, we benchmark to local country indices and sector indices and author allocated indices. None of that is complicated, but if you are then trying to measure across a portfolio against all those different types of benchmarks, that's a lot of number-crunching.
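
A rough sketch of why this adds up to a lot of number-crunching: each idea is compared against several benchmarks, and the results are then aggregated across a portfolio. The code below assumes simple arithmetic excess returns and equal weighting, neither of which is stated in the interview, and the benchmark labels and figures are made up for illustration.

```python
def relative_returns(idea_return: float, benchmark_returns: dict) -> dict:
    """Excess return of one idea over each of its benchmarks (simple difference)."""
    return {name: idea_return - bench for name, bench in benchmark_returns.items()}


def portfolio_relative_returns(ideas: list) -> dict:
    """Equal-weighted average excess return per benchmark type across all ideas."""
    totals = {}
    counts = {}
    for idea in ideas:
        excess_by_benchmark = relative_returns(idea["return"], idea["benchmarks"])
        for name, excess in excess_by_benchmark.items():
            totals[name] = totals.get(name, 0.0) + excess
            counts[name] = counts.get(name, 0) + 1
    return {name: totals[name] / counts[name] for name in totals}


# Illustrative data only: returns in percent, three benchmark types per idea.
ideas = [
    {"return": 5.0, "benchmarks": {"country": 2.0, "sector": 3.0, "author": 4.0}},
    {"return": -1.0, "benchmarks": {"country": 1.0, "sector": -2.0, "author": 0.0}},
]
print(portfolio_relative_returns(ideas))
```

Each individual subtraction is trivial; the cost comes from repeating it for every idea, every benchmark type and every portfolio a user asks about.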

That leads nicely into a newer technology that we have moved to, specifically for the number-crunching. One of the things agile development technology says you should do is write expressive code - that is, code that will be easy to read for someone new to it. With Java you can write expressive code, but for numbers, and particularly more complex numbers, we have started using Scala. We have actually rewritten all of our calculations using Scala.

Andy Webb: What prompted that and why particularly Scala?

Colin Berthoud: The developers find that Scala has the most concise language when it comes to expressing calculations. It's concise and easy to understand. In principle, you can read the code and understand the calculations even if you are not a developer. Our code maintenance goes down and our code agility goes up. We first wrote the calculations back in about 2005 and they have been continually added to. We took this opportunity to refactor all the calculations so they have all been rewritten in Scala.

Andy Webb: That must have been quite a project in its own right, given the number of calculations you have. Did you do it as a "big bang" project, or did you do a couple of calculations per release?

Colin Berthoud: As we did each calculation we released it into production. Because it all sits on a server and clients access it through the web it doesn't matter. It is transparent to them what the underlying function is. I'm pleased to say our users are unaware of the change.

Andy Webb: You must have got it right then!

Colin Berthoud: If you are releasing every two weeks, you have got to be able to test things automatically. So calculations are something we have always constantly tested. We actually run the tests every day on the whole system so it's not just at the release time.

Andy Webb: Where do you keep all your hardware for the web interface? Do you have a server farm or multiple server farms?

Colin Berthoud: The main site is a specialist hosting site that provides ultra-secure hosting and we also have a disaster recovery site in a separate location.

Andy Webb: What sort of hardware are you using?

Colin Berthoud: It is very generic, Linux-based hardware.

Andy Webb: Are you open-source all the way down to the operating system level?

Colin Berthoud: It is Linux and two distributions, Gentoo and Ubuntu.

Andy Webb: Are there any other aspects of open source that you are keeping an eye on because of potential in the future?

Colin Berthoud: There is one other thing that is quite current and we are just starting. One of the things we haven't done but we need to do is handle documents better. TIM has always been about sales people and measuring their performance. We are gradually moving more towards the analyst and research analyst end of things. There are a lot more documents from research analysts. One problem with documents and the metadata around documents is that it is all a bit unstructured. Each document is a little bit different.

We are going to be using a new database management system called MongoDB, which is open source. That is optimised for very fast data access for unstructured data.

Andy Webb: When you decided to use Mongo was it a no-brainer, that was the only thing out there, or did you have to go through multiple contenders to evaluate them as well?

Colin Berthoud: MySQL is fine in what it does for structured data and accessing that in a straightforward way. For us it was more: shall we use MySQL and some basic document management within that or shall we look for something else? I don't know if there were any other contenders other than Mongo. We wouldn't have gone for it unless it offered clear benefits over MySQL.

One other thing: for the first time, we are experimenting with distributed development. It's not terribly exciting but we have always felt it is helpful to have development teams in one place to help and train each other. We have set up in Boston recently because we had problems attracting enough new people.

We were growing at 50% a year and we wanted to increase that to 100%. We felt that having two development centres would allow us to grow faster. So this particular change is actually being handled out of our Boston office.

Andy Webb: Is there anything else you are looking at on the horizon?

Colin Berthoud: We are moving to a new Reuters feed. It's interesting in that most people get their pipe from a virtual line coming into a server in their machine room. We actually go back to Reuters over the internet and send a REST request or XML request into their system. So we are accessing their TRKD feed. It's pull rather than push. When someone puts an idea in we want to take a snapshot at that moment. When someone looks at a screen and says, I want to see the performance on these fifty ideas, we go back to Reuters at that moment for the latest price.

Andy Webb: So you end up with a less bunged-up network because you are not being shoved a load of data you don't need?

Colin Berthoud: Yes, but on the other hand if there hasn't been a price update we wouldn't be able to go back to the local server to get the latest price. It is swings and roundabouts but it's probably a lower load.
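
The pull model described above can be sketched as follows. This is a minimal illustration, not the real integration: `fetch_price` is a hypothetical stand-in for whatever request is made to the remote price service, and no actual Reuters or TRKD API calls are shown.

```python
from typing import Callable, Dict

# Hypothetical signature: given a symbol, return its latest price from the remote service.
PriceFetcher = Callable[[str], float]


class IdeaBook:
    """Records ideas and measures them using prices pulled on demand."""

    def __init__(self, fetch_price: PriceFetcher) -> None:
        self._fetch_price = fetch_price
        self._entry_prices: Dict[str, float] = {}

    def add_idea(self, symbol: str) -> None:
        # Snapshot the price at the moment the idea is entered.
        self._entry_prices[symbol] = self._fetch_price(symbol)

    def performance(self) -> Dict[str, float]:
        # Pull the latest price only when someone asks to see performance.
        return {
            symbol: self._fetch_price(symbol) / entry - 1.0
            for symbol, entry in self._entry_prices.items()
        }


# Stand-in for the remote service: a dict mutated to simulate a price move.
prices = {"ABC": 100.0}
book = IdeaBook(lambda symbol: prices[symbol])
book.add_idea("ABC")
prices["ABC"] = 110.0
for symbol, change in book.performance().items():
    print(f"{symbol}: {change:.1%}")
```

Nothing is held locally between requests, which is the trade-off Colin mentions: less data arrives unasked, but every view of performance costs a round trip.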

Andy Webb: Colin, thank you very much.

Jason Larsen


Our sequel…

The second interview, with Jason Larsen, was an opportunity to get behind the GUI and drill down into the technology with the man responsible for the technology deployment at TIM Group…

Andy Webb: What sort of database technology do you use?

Jason Larsen: We use MySQL for most things currently. At times, we have scale issues with MySQL which can be frustrating, but we manage to work through them. We do regularly look into other tools and trial them. For example, we are now building a separate service that uses MongoDB as its database. It's important to actually use and deploy a new tool, in this case, MongoDB, rather than just reading testimonials on the web, positive or otherwise. We can then introduce the tool into our processes and see how well it works, both operationally, and from a development perspective.

Andy Webb: So both of those things are important, in terms of selecting it?

Jason Larsen: Yes. We have a data store for the service we're now building, with a unique set of data, lots of files and other non-standard data. Equally, we have an upcoming project on the horizon that is likely to have a much higher volume of data than our current MySQL database. We thought that exploring other database options now was prudent.

Andy Webb: When I spoke to Colin, he mentioned free-format text as a reason you were particularly interested in MongoDB?

Jason Larsen: One of the types of data that we're bringing in is research documents. Some of it is text and some of it is a little more binary, like PDFs and Microsoft documents. Mongo supports storing these files very cleanly. It has its own file system implementation, which was another reason for choosing it over its competitors.

Andy Webb: So this is so that you can take in data in multiple formats from your sources of trade ideas, and then render them down to a common format that you can then store and manipulate easily?

Jason Larsen: Well, actually this is a bit before the trade idea itself. We store and process the research that can lead to an idea. I guess that's why I say it's somewhat separate, because it is new data that we haven't quite treated in this way. To answer your question, we are not really using it as a middle layer. We are actually storing a different set of data in this database.

Andy Webb: So you are talking here about the background material that would support a trade idea, or a particular view of the market, rather than simply buy or sell this, that or the other. It's the information behind that?

Jason Larsen: Yes, exactly. There is some metadata - the rating or the analysts involved - but we are actually storing the research itself.

Andy Webb: Are you receiving that research from the brokers in a common format? Or are they sending you all sorts of stuff and you have to sort it out?

Jason Larsen: It's rare to get everyone to agree on a format. There are some industry standards like RIXML etc. I imagine we'll try to get as many as possible to use a common standard but some will refuse. We'll have to do custom work for some.

Andy Webb: Actually, you're using MongoDB for reasons of scale and speed, rather than just to deal with text?

Jason Larsen: Because of its schema-less nature, MongoDB marries well with what we are doing. It's the continuous updating and continuous uptime on the system that gives us a lot more flexibility. It also helps developers think along those continuous uptime lines. We might have two versions up at the same time that might rely on subtly different schemas of data and we have to think about that.
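
The point about two live versions relying on subtly different schemas can be illustrated with a reader that tolerates both shapes. The sketch uses plain Python dictionaries as stand-ins for schema-less documents; it makes no MongoDB driver calls, and the field names (`analyst`, `analysts`, `rating`) are hypothetical.

```python
def analysts_of(document: dict) -> list:
    """Return the analysts on a research document, whichever version wrote it."""
    if "analysts" in document:
        # Newer shape (hypothetical): a list of analyst names.
        return list(document["analysts"])
    if "analyst" in document:
        # Older shape (hypothetical): a single analyst name.
        return [document["analyst"]]
    # Metadata missing entirely: treat as no analysts rather than failing.
    return []


old_doc = {"title": "Sector note", "rating": "buy", "analyst": "A. Smith"}
new_doc = {"title": "Company update", "rating": "hold", "analysts": ["A. Smith", "B. Jones"]}

for doc in (old_doc, new_doc):
    print(doc["title"], analysts_of(doc))
```

Because the store does not enforce one shape, documents written by the old release and the new release can sit side by side, but the reading code has to be written with both in mind.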


Andy Webb: Are you completely up and running with MongoDB or is it still a pilot?

Jason Larsen: It's still in pilot and the bulk of the main product is still using MySQL and that isn't changing. We aren't going to migrate the whole thing to Mongo. It's still in the pilot stage and we will go live with this aspect using Mongo.

Andy Webb: Assuming that Mongo does everything that you want it to, while you're running it in parallel with MySQL, do you anticipate that it could ever be a 100% replacement for MySQL?

Jason Larsen: It could be, but that's a way off. We will have to gather the data and get to a consensus on it, but yes, it has that potential. In the case of TIM Ideas we are dealing with high volumes of data and as the network grows that will become a problem. There are a lot of social networks where you can see their growth pattern, and they have run into this problem. We want to know the problem is coming and have solutions ready to use.

Andy Webb: In the context of thinking ahead, planning for the future, how do you analyse the technological aspects of your business? Are you thinking that, for example, current CPUs might become a barrier, so maybe GPUs might be useful in six months' time? How do you go about thinking about overall technology planning?

Jason Larsen: I think in the past we have tended towards flexibility. We use tools and techniques that allow us to change quickly. Operationally, we use tools like Puppet that easily manage our configuration and deploy on demand much more quickly. That's easier than requiring hours or days of downtime to make a slight upgrade to a database. We are migrating in that direction.

On the development side, we are tending towards continuous integration and continuous deployment. We want to think in small changes, rather than releasing something new every quarter. It is much easier to understand little changes that happen more frequently. When you trace back through the changes, if there is a problem or if conversely things go better than expected, you can attribute that to an individual change. If we decide Mongo is not the right choice, we can switch it out. It's not going to cause us months of delay.

Andy Webb: Does that mean that when you are considering future technology, or potential technology, one of the first things you look at is, how easily you can integrate it, and how flexible is it with your current business process?

Jason Larsen: Yes, absolutely. We have definitely made choices based on how easy it is to integrate. If it requires a Windows GUI tool to configure and we can't have our operations staff just write a script for it, then that will probably cause us not to choose it.

Andy Webb: As you have said, you are very much an open-source shop.

Jason Larsen: Largely, yes. But if the best tool we can find for the job is something closed-source then we will use it. Generally speaking, the open-source alternatives are good and best for our needs.

Andy Webb: Thinking about the way open-source projects seem to work, you have frequent releases and indeed alpha releases all over the place. In that sense, in terms of the flexibility you are seeking, I wonder whether open-source projects are a better fit with the way you think?

Jason Larsen: Some open source projects will iterate quickly. For the most part, open-source projects seem to respond to the community quicker and respond in a more frequently-updated fashion.

Andy Webb: We spoke to somebody recently who felt it was important to feed back into the open-source community and that was a big part of his message within his development team. What is your view on that?

Jason Larsen: We absolutely do try to give back. It is a hard balance at times to respond to client needs and give back to the community. We do have a small number of open-source projects that we internally foster and give to the community. Many of our developers are active in different independent projects, and use different tools and technologies. Where we can, we try to help foster young projects where we see potential or that resonate with us. We also provide feedback with bug reports and that sort of thing on a simpler level.

Andy Webb: Quite a few companies forbid that. They say: you are working for us now and you are not allowed to contribute to open-source projects. You seem to be a bit more flexible on that?

Jason Larsen: I guess there is a touch of pragmatism to it. If you are developing a competitor then that would violate a contract I'm sure. If you are writing a testing framework and you are not in the business of trying to sell testing frameworks, sure, give it away. If you convince the world that it is a better tool and everyone wants to develop on it and make it better, that only helps us.

Andy Webb: With your involvement in open source, in terms of giving back to the community, are you ever trying to steer projects so that they are beneficial to yourselves?

Jason Larsen: I imagine there is a bit of selfishness in there too. If we can foster projects that work well for us, that we use or depend on heavily, that's great. But really, it's more about altruism than us just getting our way. There are some projects that we are close friends with now, shall we say, and we can influence in that way. I hadn't really thought about it, but perhaps there is a bit of an ulterior motive.

Andy Webb: Are there any technologies of any type that you are keeping an eye on at the moment because you think they could be promising for you in the future?

Jason Larsen: I'm not sure I can answer that totally. We generally keep our eyes closely on the Java space, whether that be other languages on the JVM, like Clojure and Scala, or tools built on the JVM, like Lift. In general, we are watching for a tool that is in our sweet spot. There are even exotic options like NodeJS that could really take our scaling to another level. It isn't in our core Java/JVM competency, but we are keeping our minds open.

Andy Webb: Jason, thank you.