Building a rules engine for stocks and cryptocurrency trading
TLDR: I built a platform and marketplace to create, test and run trading strategies for stocks and cryptocurrencies. This is due to launch in closed beta early next year. Read on for the why, how and where to next.
I've had a deep interest in finance and APIs for at least the last decade, which led me to build startups like BVNK and products like the COVID19API. I also have an active interest in investing, and as a software engineer I'm interested in how technology can bring more value to a given task.
After reading a few books on trading strategies, I decided to try to implement them and see if they worked. There were a couple of options out there to do this, like Quantopian (RIP), TradingView, and the MetaTrader platform. However, these either had a steep learning curve and involved languages I wasn't familiar with (Python and MQL) or were too simplistic and only offered a subset of the features I needed.
For one particular trading strategy, I was interested in the following:
- The price of the stock should be below a certain value
- The market cap of the stock should be below a certain value
- There should be a certain number of insider trades reported on the stock within a given period
- The volume of stock traded should have increased by a given percentage over a given period
- The SMA20 should cross over the SMA50 within a given period
- The PE ratio should be below a certain value
No platform I found made this easy while supporting both back testing and daily running, so I decided to build one that does.
Note: while I refer only to stocks in this post, everything is equally relevant to cryptocurrencies, although of course some criteria (such as the PE ratio or insider trades) will not apply.
Let's Start, Again
I had previously attempted to build this project and got it working, but then coded myself into a hole. It became too complex to amend and was untestable without significant refactoring. I wasn't ready to give up on the idea, so I decided to start again and do it properly this time.
From day one I used Test Driven Development to ensure the code could be maintained moving forward. This has been an absolute life saver during the building of this application and surprisingly it didn't slow down development. I found that when I slipped out of the TDD habit ("let me just get this done quickly") the amount of time spent debugging and fixing was more than the time it would've taken to do TDD in the first place.
TDD does not automatically lead to good architecture though, so soon after starting development I spent time designing the overall architecture of the platform. As I worked through the desired outcome I realised this was way more complex than I had anticipated.
Hidden Complexity
The goal was as follows: be able to create a methodology with any number of entry criteria and any number of exit criteria. These criteria should all be of the same type internally, so I used Go's interfaces to build a good pattern around this.
Note: interfaces are crucial to using TDD in Go and ensuring you get good coverage, so I would recommend getting used to lots of interfaces and lots of mocking.
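To make the interface-and-mocking point concrete, here is a minimal sketch rather than the platform's actual code. The names Stock, Criterion, PriceBelow, MockCriterion and AllMet are all assumptions made up for illustration. It shows one interface shared by every criterion, one real implementation, and a hand-written mock that lets the calling code be tested in isolation.

```go
package main

import "fmt"

// Stock is a hypothetical snapshot of the data a criterion might inspect.
type Stock struct {
	Symbol string
	Price  float64
}

// Criterion is the behaviour shared by every entry and exit rule.
type Criterion interface {
	Evaluate(s Stock) (bool, error)
}

// PriceBelow is one concrete criterion: met when the price is under Limit.
type PriceBelow struct{ Limit float64 }

func (p PriceBelow) Evaluate(s Stock) (bool, error) {
	return s.Price < p.Limit, nil
}

// MockCriterion returns a canned result and counts how often it was called,
// so code that consumes criteria can be tested without real data.
type MockCriterion struct {
	Met   bool
	Err   error
	Calls int
}

func (m *MockCriterion) Evaluate(Stock) (bool, error) {
	m.Calls++
	return m.Met, m.Err
}

// AllMet reports whether every criterion is met, stopping at the first
// error or the first criterion that is not met.
func AllMet(cs []Criterion, s Stock) (bool, error) {
	for _, c := range cs {
		met, err := c.Evaluate(s)
		if err != nil {
			return false, err
		}
		if !met {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	mock := &MockCriterion{Met: true}
	ok, err := AllMet([]Criterion{PriceBelow{Limit: 5}, mock}, Stock{Symbol: "ABC", Price: 4.20})
	fmt.Println(ok, err, mock.Calls)
}
```

Because AllMet only depends on the interface, a test can swap in the mock to exercise the error path or the "not met" path without touching any market data.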
There were now several types that needed to be implemented: the criteria type ("what are you testing for"), the criteria status type ("what are you testing against"), and the direction ("above or below"). To get to this point took many hours, but after implementing this as the foundation I had something that a) worked and b) would be able to scale as I added more types across the board.
Examples of criteria types are: target, nominal increase, percentage increase and crossover.
Examples of criteria status types are: volume, price, PE ratio, indicator, insider trades, target, etc.
The best approach I found is to create a criteria object with the relevant fields and data points and then, during evaluation, use a switch statement to route the criteria to the function that finalises its evaluation.
If all of a criteria's points are met, that criteria is marked as Met. If all criteria in the methodology are Met, the relevant trade is executed (open or close).
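The criteria object and switch-based dispatch described above could be sketched roughly as follows. This is an illustration under assumptions, not the real implementation: the type names, constants, the Snapshot struct and the two evaluation functions are all hypothetical, and only two criteria types are shown.

```go
package main

import "fmt"

// CriteriaType is "what are you testing for".
type CriteriaType int

const (
	Target CriteriaType = iota
	PercentageIncrease
)

// StatusType is "what are you testing against".
type StatusType int

const (
	Price StatusType = iota
	Volume
)

// Direction is "above or below".
type Direction int

const (
	Above Direction = iota
	Below
)

// Criteria is a single rule; Value is a target level or a required % change.
type Criteria struct {
	Type      CriteriaType
	Status    StatusType
	Direction Direction
	Value     float64
}

// Snapshot holds current and previous readings keyed by status type.
type Snapshot struct {
	Current, Previous map[StatusType]float64
}

func compare(d Direction, got, want float64) bool {
	if d == Above {
		return got > want
	}
	return got < want
}

func evalTarget(c Criteria, s Snapshot) (bool, error) {
	cur, ok := s.Current[c.Status]
	if !ok {
		return false, fmt.Errorf("no current value for status type %d", c.Status)
	}
	return compare(c.Direction, cur, c.Value), nil
}

func evalPercentageIncrease(c Criteria, s Snapshot) (bool, error) {
	cur, okCur := s.Current[c.Status]
	prev, okPrev := s.Previous[c.Status]
	if !okCur || !okPrev || prev == 0 {
		return false, fmt.Errorf("cannot compute change for status type %d", c.Status)
	}
	return compare(c.Direction, (cur-prev)/prev*100, c.Value), nil
}

// Evaluate routes a criteria to the function that finalises its evaluation.
func Evaluate(c Criteria, s Snapshot) (bool, error) {
	switch c.Type {
	case Target:
		return evalTarget(c, s)
	case PercentageIncrease:
		return evalPercentageIncrease(c, s)
	default:
		return false, fmt.Errorf("unsupported criteria type %d", c.Type)
	}
}

func main() {
	snap := Snapshot{
		Current:  map[StatusType]float64{Price: 4.2, Volume: 1800000},
		Previous: map[StatusType]float64{Volume: 1000000},
	}
	priceBelow5 := Criteria{Type: Target, Status: Price, Direction: Below, Value: 5}
	volumeUp50 := Criteria{Type: PercentageIncrease, Status: Volume, Direction: Above, Value: 50}
	fmt.Println(Evaluate(priceBelow5, snap))
	fmt.Println(Evaluate(volumeUp50, snap))
}
```

Adding a new criteria type then means adding one constant, one evaluation function and one case in the switch, which is what lets the design scale.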
Dealing with indicators has been particularly difficult, primarily because each one has a different number of fields and different data types. One approach is to create each indicator as its own object in code, which is easier but results in a lot of code. The other approach is to create a more generic Indicator object that carries a type, with fields and functionality differing depending on that type. I have gone with the latter because the upfront design work will save many hours in the future (for example, if you want to add a field to all indicators, or change behaviour across all of them).
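As a rough sketch of the generic approach, assuming a simple moving average is the only indicator implemented so far: a single Indicator struct carries a type and its parameters, and one Compute method switches on the type. The names Indicator, IndicatorType, SMA and Compute are hypothetical.

```go
package main

import "fmt"

// IndicatorType names the calculation a generic Indicator performs.
type IndicatorType string

const SMA IndicatorType = "sma"

// Indicator is one generic object; behaviour depends on Type.
// Period is the only parameter needed by the SMA shown here.
type Indicator struct {
	Type   IndicatorType
	Period int
}

// Compute returns the indicator series for the given closes (oldest first).
func (i Indicator) Compute(closes []float64) ([]float64, error) {
	switch i.Type {
	case SMA:
		return sma(closes, i.Period)
	default:
		return nil, fmt.Errorf("unsupported indicator type %q", i.Type)
	}
}

// sma returns the simple moving average, one value per close starting at
// index period-1, using a running sum instead of re-adding each window.
func sma(closes []float64, period int) ([]float64, error) {
	if period <= 0 {
		return nil, fmt.Errorf("period must be positive, got %d", period)
	}
	if len(closes) < period {
		return nil, fmt.Errorf("need at least %d closes, got %d", period, len(closes))
	}
	out := make([]float64, 0, len(closes)-period+1)
	sum := 0.0
	for idx, c := range closes {
		sum += c
		if idx >= period {
			sum -= closes[idx-period] // drop the value that left the window
		}
		if idx >= period-1 {
			out = append(out, sum/float64(period))
		}
	}
	return out, nil
}

func main() {
	values, err := Indicator{Type: SMA, Period: 3}.Compute([]float64{1, 2, 3, 4, 5})
	fmt.Println(values, err)
}
```

A change that applies to every indicator, such as a new shared field, only touches the one struct, at the cost of a growing switch as more types are added.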
Decoupling
By developing through TDD you end up with beautiful, decoupled code. This makes changes far easier because you can be confident they won't have unintended effects (most of the time). However, it does cause some headaches when dealing with top-level, domain-type processes.
The domain process in my case is a Methodology. The methodology holds all the criteria and associated trades, as well as linking to a given user, and some more metadata. A few key elements are not included in the object: the stock itself, price data around the stock, and trades across runs of the methodology.
The idea is that stocks should be passed to the methodology and just run, and that the methodology shouldn't care or even know about the stock. But because the methodology doesn't know about the stock, the price is also not linked. For clarity, the flow is as follows:
- Methodology gets run and all criteria are checked
- The methodology returns the criteria status and, if it is Met, a trade is opened
- Because the criteria are flexible, and because of when we check the criteria, there is no pricing information available
You could overload the Methodology and add a pricing field and a stock field, but those are not part of the methodology itself; they are inputs that influence it. I avoid that kind of pollution like the plague because it always ends in tears.
Instead, to solve this problem, we fetch the relevant price for the stock at the moment the trade needs to be opened. I say "relevant" here because we also run back tests, so we can't just fetch the latest price; we need to fetch the price at a certain offset from the latest one.
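One way to picture this, as a sketch under assumptions rather than the actual code: the trade-opening step asks a price source for the price at a given offset, where offset 0 means the latest price (a live run) and a larger offset means that many periods back (a back test). PriceSource, Trade, OpenTrade and the in-memory memoryPrices type are all hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// PriceSource looks up a price for a symbol, `offset` periods before the
// latest one. Offset 0 is the latest price.
type PriceSource interface {
	PriceAt(symbol string, offset int) (float64, error)
}

// Trade is a minimal record of an opened position.
type Trade struct {
	Symbol     string
	EntryPrice float64
}

// OpenTrade fetches the relevant price only at the moment a trade is opened,
// so the methodology itself never needs to hold pricing data.
func OpenTrade(src PriceSource, symbol string, offset int) (Trade, error) {
	price, err := src.PriceAt(symbol, offset)
	if err != nil {
		return Trade{}, fmt.Errorf("open trade for %s: %w", symbol, err)
	}
	return Trade{Symbol: symbol, EntryPrice: price}, nil
}

// ErrNoPrice is returned when no price exists at the requested offset.
var ErrNoPrice = errors.New("no price for requested offset")

// memoryPrices is an in-memory PriceSource: closes per symbol, oldest first.
type memoryPrices map[string][]float64

func (m memoryPrices) PriceAt(symbol string, offset int) (float64, error) {
	closes := m[symbol]
	idx := len(closes) - 1 - offset
	if offset < 0 || idx < 0 {
		return 0, ErrNoPrice
	}
	return closes[idx], nil
}

func main() {
	src := memoryPrices{"ABC": {10, 11, 12, 13}}
	live, _ := OpenTrade(src, "ABC", 0)     // latest close
	backTest, _ := OpenTrade(src, "ABC", 2) // close from two periods earlier
	fmt.Println(live.EntryPrice, backTest.EntryPrice)
}
```

The same OpenTrade call serves both daily runs and back tests; only the offset changes.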
Data Management
I am using IEX cloud as a data source, and as anyone who has worked with financial data knows, the costs can add up quickly. I only want to fetch data as and when I need it, and only as much as I need.
This happens in two places: when running back tests and when requesting data. So, when either of these events happens, I do the following check:
- Get the latest record of data (price, company stats, insider trades, etc)
- If there is a record, check the date. If there isn't a record, set the date to UTC Zero
- Send the date to a function that returns how much data to fetch. If we are 5 days behind, fetch 1 month's data (the smallest increment); if we are 40 days behind, fetch 3 months' data, and so on. Also, make sure to check whether you are around a weekend and adjust accordingly
- Fetch the data and save it, then return the data
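The "how much data to fetch" step in the list above might look something like this. It is a sketch with assumed details: the range labels ("1m", "3m", "1y", "5y") and the day thresholds are illustrative choices, not values taken from the platform or from the data provider's API, and the weekend handling ignores market holidays and assumes the current weekday's data is already available.

```go
package main

import (
	"fmt"
	"time"
)

// day builds a UTC midnight timestamp; a small helper for readability.
func day(y int, m time.Month, d int) time.Time {
	return time.Date(y, m, d, 0, 0, 0, 0, time.UTC)
}

// lastTradingDay steps back from a weekend to the preceding Friday.
// Assumption: market holidays are ignored in this sketch.
func lastTradingDay(now time.Time) time.Time {
	now = now.UTC()
	d := day(now.Year(), now.Month(), now.Day())
	for d.Weekday() == time.Saturday || d.Weekday() == time.Sunday {
		d = d.AddDate(0, 0, -1)
	}
	return d
}

// RangeFor returns a label for how much history to request, or "" when the
// stored data is already up to date. latest is the date of the newest saved
// record; the zero time means nothing has been saved yet.
func RangeFor(latest, now time.Time) string {
	if latest.IsZero() {
		return "5y" // no record at all: fetch the largest range
	}
	target := lastTradingDay(now)
	if !latest.Before(target) {
		return "" // nothing newer to fetch, e.g. Friday's data on a Sunday
	}
	daysBehind := int(target.Sub(latest).Hours() / 24)
	switch {
	case daysBehind <= 30:
		return "1m" // smallest increment
	case daysBehind <= 90:
		return "3m"
	case daysBehind <= 365:
		return "1y"
	default:
		return "5y"
	}
}

func main() {
	now := day(2021, time.January, 8) // a Friday
	fmt.Printf("%q\n", RangeFor(day(2021, time.January, 4), now))
	fmt.Printf("%q\n", RangeFor(day(2020, time.November, 29), now))
	fmt.Printf("%q\n", RangeFor(time.Time{}, now))
}
```

Keeping this as a pure function of two timestamps makes it easy to test every boundary without calling the data provider.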
Because back tests use these same functions, the resulting slices have to be cut according to which data we want. If we want the data as it stood 60 periods ago, we need to return only the data up to that point.
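Cutting a series for a back test can be sketched in a few lines. Window is a hypothetical helper that assumes the series is ordered oldest first; it returns everything up to and including the point `offset` periods before the latest entry.

```go
package main

import "fmt"

// Window returns the series as it would have looked `offset` periods ago.
// The series must be ordered oldest first. Offset 0 returns all of it.
func Window(series []float64, offset int) ([]float64, error) {
	if offset < 0 || offset >= len(series) {
		return nil, fmt.Errorf("offset %d out of range for %d periods", offset, len(series))
	}
	end := len(series) - offset
	// The three-index slice caps capacity, so appending to the result
	// cannot overwrite the later values in the shared backing array.
	return series[:end:end], nil
}

func main() {
	closes := []float64{1, 2, 3, 4, 5}
	past, err := Window(closes, 2)
	fmt.Println(past, err)
}
```

With this in place, the same criteria code can run against a window from 60 periods ago exactly as it would against today's full series.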
The above data fetching also works when requesting data directly through the API and, surprisingly, does not take long at all. For example, when requesting 3 months' data we fetch the data from IEX, save it and return it in around 500ms. Subsequent calls are much quicker: <200ms for saved data and <100ms for cached data.
Current Status
I've spent the best part of six months on this (probably around 350 hours), written 23,000 lines of code and will end up at 90% code coverage. It's been a labour of love and hugely rewarding.
The application needs a few small tweaks and further testing before releasing to the public, but it is currently running most of the criteria of my methodology against ~3,000 stocks daily. I will be working on the frontend shortly and should have an initial version up by the end of January. I've also applied to Y Combinator as a late applicant, so let's see what happens.
The business model will be selling subscriptions to businesses, like Freetrade and Robinhood, so they can integrate this functionality directly into their platforms, with a white-labeled dashboard if they want one.
I've only touched on a few of the major points in this post. If you'd like to hear more please feel free to reach out to me.