Under the hood of Codat with Jason Dryhurst-Smith, Head of Engineering.
At Codat, our philosophy of keeping things simple and relying on ‘boring technology’ has meant we haven’t had to make many revisions to our tech stack since setting up shop. But that’s not to say that we haven’t made some changes along the way as the system matured. We’ve outlined a number of our key learnings below:
2016 – The Start 🎬
Using the Web Apps and WebJobs services from Azure, along with SQL, and Azure Storage, Codat was born. The bulk of the machinery was situated in a monolithic service called The Data API.
This service had a lot to do: it governed the mechanics of the system and housed the query engine for the standardized data. Each third party that we integrated with got its own set of services for doing things like authorization, fetching from the third-party API, and then mapping that data to the Codat model.
This pattern, set on day one for integrations, is still the pattern we use now. It means that any team adding features to an integration can work on it behind a standardized interface, and scale that system independently in line with usage.
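To make the pattern concrete, here is a minimal sketch in Python (our real services are .NET). Every name in it is hypothetical: the Integration interface, the StandardInvoice record and the FakeLedgerIntegration class are illustrations of the authorize, fetch and map steps described above, not our actual SDK.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class StandardInvoice:
    """Hypothetical standardized record; a real data model would be richer."""
    id: str
    total: float
    currency: str


class Integration(ABC):
    """Hypothetical interface that each integration service implements."""

    @abstractmethod
    def authorize(self, credentials: dict[str, str]) -> str:
        """Exchange credentials for an access token."""

    @abstractmethod
    def fetch(self, token: str) -> list[dict[str, Any]]:
        """Pull raw records from the third-party API."""

    @abstractmethod
    def map(self, raw: dict[str, Any]) -> StandardInvoice:
        """Translate one raw record into the standardized model."""

    def sync(self, credentials: dict[str, str]) -> list[StandardInvoice]:
        # Callers only ever see this method, so each integration can
        # change its internals without affecting the rest of the system.
        token = self.authorize(credentials)
        return [self.map(raw) for raw in self.fetch(token)]


class FakeLedgerIntegration(Integration):
    """Stand-in third party that returns canned data instead of calling HTTP."""

    def authorize(self, credentials: dict[str, str]) -> str:
        return "token-for-" + credentials["user"]

    def fetch(self, token: str) -> list[dict[str, Any]]:
        return [{"InvoiceNo": "A1", "Amount": "12.50", "Cur": "GBP"}]

    def map(self, raw: dict[str, Any]) -> StandardInvoice:
        return StandardInvoice(
            id=raw["InvoiceNo"],
            total=float(raw["Amount"]),
            currency=raw["Cur"],
        )


invoices = FakeLedgerIntegration().sync({"user": "demo"})
```

Because the caller depends only on sync, a team can rework how one integration authorizes or fetches without touching anything outside that service.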
2017 – Mitosis ➗
As we gained more clients, we started to see various bottlenecks in the performance of the system and in the team’s ability to work on one large codebase.
This meant splitting The Data API into a few pieces that we could work on, and scale the infrastructure for, independently. The biggest split was between the parts of the service that dealt with data Codat owned and needed to operate the system (we call this metadata) and the parts of the service that dealt with the standardized data (we call this contributed data). This led to the introduction of the clients API, along with a front-end API that handled user management and authorization. We had a distributed monolith.
At this point, the SQL database that housed all the contributed data also gained a set of serverless functions that rebuilt indexes nightly. If you read ahead, you will understand that this is foreshadowing.
2018 – Push and Refactor 📈
The team started to grow rapidly in 2018, enabling us to push the product out on more fronts.
Over the course of the year, a huge number of improvements were made to the internal libraries and SDKs that our engineers use every day.
We also started to refine our own high-performance .NET RPC over HTTP stack called ServiceClient. The SDK used by the integrations to handle the large volume of data needing to be mapped to the Codat standard was upgraded to reduce its memory footprint. When you work mostly with JSON in .NET, there can be a lot of string allocations if you’re not careful.
In the same year, we started building our prototype Sync product. Because it was built from the tools we already had available to us, it involved more whiteboards and head scratching than new tech.
Throughout the early part of this year, it became increasingly obvious that our SQL schema for the contributed data was no longer fit for purpose. We had made all the improvements we could using the patchwork of Entity Framework, SqlBulkCopy, and hand-rolled hashing for getting data into the database. A longer-term solution was required.
2019 – A Reckoning 🕑
After a lot of thought and review, we decided that maintaining the schema for contributed data twice, in both C# classes and in SQL tables, was a waste of engineering time. Rebuilding and maintaining all the indexes necessary to get good query performance out of a dynamic query engine was also really time consuming and expensive. So, we chose Cosmos DB as the new persistence tool for the contributed data stack. Our plan was to build the new system while running the two in parallel. How long could it take, right?
It took the best part of the year and we encountered a number of hurdles along the way with the technology never quite behaving as we had expected.
The final straw came after running a set of monitoring tests while paging through large volumes of data. It seemed that the paging engine was loading all the data and sliding a window over it to return the right page! This meant that asking for page 5 of a query fetched pages 1, 2, 3, and 4 before returning your data. This was slow and, since Cosmos DB is priced by the request units each query consumes, very expensive. It was obvious that, despite the sunk cost, this was not the technology for us.
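The cost of that sliding window is easy to see in a toy model. The sketch below is plain Python with a hypothetical in-memory CountingStore, not Cosmos DB or our query engine. It counts the rows read when page 5 is served by offset, and compares that with a keyset-style lookup that seeks past the last key already seen. The keyset approach is included only as a common point of comparison, not as a description of what we built.

```python
import bisect


class CountingStore:
    """In-memory stand-in for a data store that counts every row it reads."""

    def __init__(self, rows):
        self.rows = sorted(rows)
        self.rows_read = 0

    def page_by_offset(self, page, size):
        # Sliding window: read everything up to the end of the requested
        # page, then throw away the rows that belong to earlier pages.
        scanned = self.rows[: page * size]
        self.rows_read += len(scanned)
        return scanned[(page - 1) * size:]

    def page_after(self, last_key, size):
        # Keyset paging: seek straight to the first row greater than the
        # last key the caller saw, then read a single page. The bisect
        # call models an index seek, so it is not counted as a row read.
        start = 0 if last_key is None else bisect.bisect_right(self.rows, last_key)
        page = self.rows[start:start + size]
        self.rows_read += len(page)
        return page


offset_store = CountingStore(range(100))
offset_page = offset_store.page_by_offset(page=5, size=10)

keyset_store = CountingStore(range(100))
keyset_page = keyset_store.page_after(last_key=39, size=10)
```

Both calls return the same ten rows, but the offset version reads fifty rows to do it and the gap widens with every later page. When each read is billed, that difference shows up directly on the invoice.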
What followed was a serious review of the bottlenecks in our data processing pipeline and its schema. This resulted in a few changes, such as turning updates in the data cache into stored procedures using MERGE statements, and storing the row hashes rather than calculating them in .NET.
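As a rough illustration of why a stored hash helps, here is a small Python sketch that uses SQLite as a stand-in for our SQL database. The table layout, the row_hash and upsert helpers, and the choice of SHA-256 are all assumptions made for the example, and the real change lived in stored procedures with MERGE statements rather than in application code like this.

```python
import hashlib
import json
import sqlite3


def row_hash(payload: dict) -> str:
    """Hash a canonical JSON form so that key order does not change the hash."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def open_cache() -> sqlite3.Connection:
    """Create a hypothetical cache table that keeps each row's hash beside it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cache ("
        "id TEXT PRIMARY KEY, payload TEXT NOT NULL, row_hash TEXT NOT NULL)"
    )
    return conn


def upsert(conn: sqlite3.Connection, record_id: str, payload: dict) -> bool:
    """Write the record only if it is new or changed; return True if written."""
    new_hash = row_hash(payload)
    stored = conn.execute(
        "SELECT row_hash FROM cache WHERE id = ?", (record_id,)
    ).fetchone()
    if stored is not None and stored[0] == new_hash:
        # Unchanged: the stored hash matches, so there is nothing to write
        # and no need to load and re-hash the existing row.
        return False
    conn.execute(
        "INSERT OR REPLACE INTO cache (id, payload, row_hash) VALUES (?, ?, ?)",
        (record_id, json.dumps(payload, sort_keys=True), new_hash),
    )
    return True


conn = open_cache()
first_write = upsert(conn, "inv-1", {"total": 10, "currency": "GBP"})
same_again = upsert(conn, "inv-1", {"currency": "GBP", "total": 10})
changed = upsert(conn, "inv-1", {"total": 12, "currency": "GBP"})
```

The second call writes nothing even though the keys arrive in a different order, because the comparison is against the hash already stored with the row.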
What we ended up with was a much more mature application of the technology we were used to operating, and a much better understanding of the pricing. Never has a project highlighted more clearly to Codat the benefits of choosing ‘boring technology’.
2020 – The Conscious Decoupling 🚀
To keep changing rapidly and to grow the product faster as we added engineers, we had to start thinking seriously about any point where two or more domains had to be changed in order to release a new feature. This coupling incurred a cost that was pure waste. While our culture has never been one of efficiency above all else, a lot of changes required stop-the-world activities to release new versions of shared contracts, which was prone to error and incurred downtime for our users.
The biggest area of coupling was the interaction between controlling systems (such as workflow orchestrators) and the integration services. We decided to enforce a maxim that all services should publish events about their capability to any consumer and the consumer would use this to control their interaction with that service. For integrations this was simply an event that told the world exactly what the integration was capable of. This information is also really useful for developers that are building against our API. Now any team can build an integration using our integrations SDK and release it, and it will tell the rest of the system that it exists and what features it supports.
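Here is a minimal sketch of that maxim in Python, using an in-process event bus. The EventBus, CapabilityEvent and Orchestrator names are hypothetical and the real system publishes events between services, but the shape is the same: the integration announces what it supports, and the consumer uses only that announcement to decide what it can ask for.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityEvent:
    """Hypothetical event: an integration announcing what it supports."""
    integration: str
    supported: frozenset


class EventBus:
    """Tiny in-process stand-in for a message broker."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event):
        for handler in self._handlers[type(event)]:
            handler(event)


class Orchestrator:
    """Consumer that learns about integrations only from published events."""

    def __init__(self, bus: EventBus):
        self._capabilities = {}
        bus.subscribe(CapabilityEvent, self._on_capability)

    def _on_capability(self, event: CapabilityEvent):
        self._capabilities[event.integration] = event.supported

    def can_request(self, integration: str, data_type: str) -> bool:
        # No shared contract to redeploy: an unknown integration or data
        # type simply is not available until an event says otherwise.
        return data_type in self._capabilities.get(integration, frozenset())


bus = EventBus()
orchestrator = Orchestrator(bus)
known_before = orchestrator.can_request("fake-ledger", "invoices")

# A newly released integration tells the rest of the system what it can do.
bus.publish(CapabilityEvent("fake-ledger", frozenset({"invoices", "payments"})))
known_after = orchestrator.can_request("fake-ledger", "invoices")
```

Nothing in the orchestrator names a specific integration, so releasing a new one means publishing an event rather than coordinating a release across teams.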
I hope that this has been a useful glimpse into the evolution of the technology stack and engineering culture at Codat. This is by no means a complete picture, not because there is anything we do that we wouldn’t share, but because I have tried to pick representative anecdotes of the many small and large problems and successes that we have encountered along the way. There are more recent moves, such as moving all UI to a micro-frontend architecture, that I haven’t detailed yet either, because we can’t evaluate our decisions yet.
I have also not really gone into any detail about the organization and management of engineering teams and their work, and how that has changed over time. How you manage people and get ideas into specs, then into code, and then into operable systems running over the internet is arguably more important than any database technology, but we’ll save that for another time.
Jason Dryhurst-Smith, Head of Engineering
You can start building with Codat for free today. Sign up here for a free account or visit our docs to find out more about our data model.