
Poor Man's Solutions: Convenient in-memory cache

Poor Man's Solutions explores pragmatic, often imperfect, engineering choices made under tight deadlines. This introduction dives into balancing speed and quality.


Imagine we have an API that calculates carbon emissions based on activity. There are a couple of methodologies we can use to do that, and each transportation mode comes with its own. Each methodology has specific factors based on the car you drive or the route you take. These are accessed frequently, so they need to be available at any point in the request lifecycle to ensure response times remain reasonable.

The Beginning

We had a couple of methodologies to start with, and the fastest way to handle them was to keep everything in-memory. Most of them came in CSV form, so with a couple of trivial scripts, we could load them into memory and access them like this: Calculation::Flight::SmartMethodology::SOME_FACTOR.
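The loader can be sketched roughly like this. The CSV contents, column names, and the inline string standing in for the file are all illustrative, not our real schema:

```ruby
require "csv"

# Hypothetical sketch of the boot-time loader: each methodology CSV is
# parsed once and its rows become constants on a nested module. The
# inline string stands in for reading the actual CSV file.
module Calculation
  module Flight
    module SmartMethodology
      csv = "factor_key,factor_value\nSOME_FACTOR,0.254\n"
      CSV.parse(csv, headers: true).each do |row|
        # Each row turns into a constant, e.g. SOME_FACTOR = 0.254
        const_set(row["factor_key"], row["factor_value"].to_f)
      end
    end
  end
end

Calculation::Flight::SmartMethodology::SOME_FACTOR # => 0.254
```

Constants like this are resolved once at boot and are effectively free to read afterwards, which is what made the approach attractive.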

We introduced a bunch of helpers to make it convenient; it was simple and looked good. You might wonder about performance: the memory footprint was negligible, eating up maybe 10-20MB in the first year as we added more methodologies.
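To give a flavor of what "convenient" meant in practice, here is a minimal sketch of such a helper: a single lookup keyed by transport mode, methodology, and factor name. Every name and value here is made up for illustration:

```ruby
# Illustrative helper: fetch a factor without spelling out the full
# constant path. The registry contents are invented for this sketch.
module Factors
  REGISTRY = {
    flight: { smart: { "SOME_FACTOR" => 0.254 } },
  }.freeze

  # Raises KeyError on unknown mode/methodology/key, which surfaces
  # typos early instead of returning nil.
  def self.fetch(mode, methodology, key)
    REGISTRY.fetch(mode).fetch(methodology).fetch(key)
  end
end

Factors.fetch(:flight, :smart, "SOME_FACTOR") # => 0.254
```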

The Outlier

One day, we were asked to introduce a new methodology with a twist. Instead of a tiny CSV file, it came with a 40GB database of factors that we had to update weekly. That was obviously not going into memory.

We decided to put it into a completely separate container with a dedicated Postgres instance. That let us scale it based on demand and kept the heavy data from interfering with our main Postgres instance, so our primary connection pool stayed less busy.
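In container terms, the split looks roughly like this. Service names, images, and paths are illustrative, not our actual setup:

```yaml
# Sketch of the split: the factor service and its dedicated Postgres
# live apart from the main app and its database.
services:
  factor-service:
    build: ./factor_service   # hypothetical service wrapping the 40GB factor lookups
    depends_on: [factor-db]
  factor-db:
    image: postgres:16
    volumes:
      - factor-data:/var/lib/postgresql/data
volumes:
  factor-data:
```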

Things start to fall apart

A few months passed. We introduced more methodologies and improved the old ones. We also added versioning so users could calculate emissions for past trips using factors that fit the specific year of travel.
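The versioned lookup can be sketched like this, assuming factors keyed by year with a fallback to the most recent year we have. The years and values are invented:

```ruby
# Hedged sketch of versioned factors: pick the factor set matching the
# trip year; fall back to the most recent year when we have no data
# for the requested one. All data here is made up.
module VersionedFactors
  FACTORS = {
    2021 => { "SOME_FACTOR" => 0.247 },
    2022 => { "SOME_FACTOR" => 0.254 },
  }.freeze

  def self.for_year(year)
    FACTORS[year] || FACTORS[FACTORS.keys.max]
  end
end

VersionedFactors.for_year(2021)["SOME_FACTOR"] # => 0.247
VersionedFactors.for_year(2030)["SOME_FACTOR"] # => 0.254 (fallback)
```

The flip side of versioning, of course, is that every new version meant another set of CSVs read at boot.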

Suddenly, the app boot became sluggish. We were reading a bunch of CSVs into memory at startup, and boot time grew past our health check threshold, so the checks started failing and blocked our container swaps. Even though the memory footprint was still low, the startup time had become a bottleneck.

After a quick brainstorm, we decided to rebuild the pipeline for the app boot. We turned the CSVs into SQLite databases, which are significantly faster to read and frankly simpler to work with. That solved the boot time issues and kept our pipeline stable.

The takeaway

I used to have this mindset that I had to nail the solution from the get-go. If I couldn’t do that, I felt I wasn’t a good engineer. But the truth is, to move fast, you have to learn from your own mistakes. There is no single book or blog post that will make you a master of your craft instantly.

Don’t be shy about your imperfect solutions. Embrace them, learn from them, and share them! Nobody is perfect.
