Here’s something I’ve been mulling around. Often in programming you hit a situation where you have to choose two paths:
- The 15 minute fix
- The 30 day fix (+30 days of repercussions)
I know, I know, that’s a gross generalization but it’s related to the tension between speed and sustainability in software development. I’ll explain a bit further…
The 15 minute fix
The fifteen minute fix is the most direct way to solve the problem. A hard-coded value, installing a third-party library, adding another prop, or introducing a manual process to get the job done. You don’t have time to find the right abstraction, to write it yourself, think through all the edge cases, or automate the process. But deep in the back of your mind, you know that you made a compromise…
Sometimes the compromise is worth it. Meeting a deadline, not over-thinking code, avoiding a maintenance burden, a lot of “unknown unknowns”, or simply finding the relief of getting something off your plate. Life is full of compromises and codebases full of tradeoffs.
The 30 day fix
The 30 day fix is a more thoughtful abstraction. It takes a much larger chunk of time but —with the right amount of consideration and discussion— you stand a better chance at architecting and building a scalable solution in one shot. More robust, less glue code, less manual switch-flipping or knowledge transfer, a healthier baseline for the codebase going forward.
There’s a feeling of relief when the 30 day fix lands. Technical debt paid down, an organized kitchen pantry, and the codebase now reflects your current understanding of the problem. Adding a line of code doesn’t feel like the whole project will topple over.
But… the 30 day fix also has a hidden cost in the form of an extra 30 days of fixing the repercussions introduced by the fix itself. Breaking a test suite, an unintended side-effect, a broken layout, or a new API that leads to combing through the entire codebase to upgrade all the props to the new schema. There’s always a need to refactor a critical component to make it all work error-free.
30 days is a long time for a codebase. 60 days even more. Surely it’s cheaper to keep applying 15 minute fixes, right? Well… can I tell you a secret? Unless a small miracle happens (or your product never grows), you almost always end up needing the 30 day fix. The shortcut works for a short while.
Example: The case of the evolving API needs
I built an API to retrieve posts
from a database. A basic CRUD sort of thing.
GET /posts/ # index
GET /posts/:id # read
POST /posts/:id # create
PATCH /posts/:id # update
DELETE /posts/:id # destroy
But when I started using it, across the application I needed a speedier list view of post.title
and post.id
fields (for a dropdown). I cracked my knuckles and wrote a 15 minute fix by creating another endpoint to solve that problem.
GET /posts/ # index
GET /posts/list # index, but with less fields
GET /posts/:id # read
POST /posts/:id # create
PATCH /posts/:id # update
DELETE /posts/:id # destroy
Then we needed to have a category filter on the /posts/lists
endpoint. Another 15 minute fix added to the mix.
GET /posts/ # index
GET /posts/list/:category? # index, but with less fields (optional: filter by category)
GET /posts/:id # read
POST /posts/:id # create
PATCH /posts/:id # update
DELETE /posts/:id # destroy
This worked well for a couple months until I saw four of these /posts/list/:category
queries on a dashboard page, but we showed ~5 records in the UI, but were fetching potentially thousands. I added another 15 minute fix to allow a limit
as a query param.
GET /posts/ # index
GET /posts/list/:category?limit=num # index, but with less fields (optional: filter by category, limit number records returned)
GET /posts/:id # read
POST /posts/:id # create
PATCH /posts/:id # update
DELETE /posts/:id # destroy
Ahh.. Are we done yet? No. Absolutely not. We’re doing a lot of data-massaging on the client that would be better (and faster) on the server. Now we need orderBy
and groupBy
params for sorting, include
params for an ad-hoc JOIN
of related records, ability to sort JOIN
records; a little less rigidity in the design.
We — as predicted — are heading towards a 30 day fix situation. With all the added options, the /posts/list
endpoint is a mess — it may not even need to exist! — and we could solve a lot of problems by sitting down and thinking about the solution a little bit harder. But I actually don’t know if we made a mistake. We were running towards getting feedback. We don’t need a big infrastructure project when sandbags would fix the issue.
To build or over-build, that is the question
It’s a tough situation that comes down to estimating. Not estimating in a “choose some fake fibonacci numbers” sense, but in an overcoming personal biases sense. Are you over-estimating your need for the 30 day fix? Are you under-estimating how well the 15 minute fix is going to work? Are you operating emotionally out of a bad experience from a previous situation? Have you overcome your personal optimism bias? What’s the pressure from management like?
Often you don’t know the scope of the problem until you have users, so over-building at the beginning of the process is probably a mistake. The 15 minute fix gets you closer to actually learning what your codebase needs. But months down the line, you have a codebase full of 15 minute fixes that now need 30 day upgrades. Time-bombs. Could you have saved time by architecting the 30 day fix from the beginning? Or could you have saved a lot of money by chipping away at the problem with more and more 15 minute fixes?
And that’s what it comes down to: economics. A 15 minute fix and a 30 day fix have colossally different costs and impacts. Can your business afford that time? Can your users afford the delay? Is it a Monty Hall Rewrite problem where the fix could actually be worse once it’s landed? I struggle with this because I hate rework, but if you think of software as evolving, then rework and maintenance is part of the job.
How do you decide? I don’t know. The best advice I have is: Never go to the grocery store when you’re hungry… but y’know, for software.