Working with a codebase that suffers from stage 4 bit rot is far from fun. In fact, it’s demoralizing. Most developers don’t intend to write bad code, but the pressures of the business often force you to leave code in a decomposing legacy state. No need to fret, though - there are several ways you can turn the ship around with a little tenacity. Let’s take a look at how a codebase gets into this state, as well as how you can fix it and prevent it from happening again in the future.
Just like that, it happened. I didn’t enjoy working on our codebase anymore. I actually kind of hated it. No, I really hated it. How did this happen? Then I put my finger on it - every time I changed something I’d end up creating a bunch of regression bugs. Changing something simple would take hours when it should only take minutes. This codebase had become a living technical debt nightmare. It wasn’t always like this, though. Making changes used to be painless, fun and exciting, but that was when the codebase was small and manageable.
This happens to every developer at some point in their career - they end up hating their own codebase. Sometimes it’s their fault, sometimes it’s not. Either way, it needs to get fixed so they can get back to programming in a more zenful state of development.
The Graceful Art of Tripping Over Yourself
Before we figure out how to regain your zen state of programming, let’s review how we got to this dreadful state of development in the first place.
I work with startups a lot and there’s one very common thread that startups share - they have to move insanely fast. Hypotheses are formed, developers implement them, the company tests them in the market and then the company iterates on the results. You end up writing a lot of code really fast to get things done. Most of the time this code is garbage, and that’s OK.
Ugly code that ships and solves a problem is a 100 times better than beautiful code that doesn't ship.
— Donn Felker (@donnfelker) January 24, 2013
In case you don’t believe me, even Zuckerberg feels the same way:*
“Move Fast and Break Things” - Mark Zuckerberg
I have nothing against this and I think it’s good practice.** A startup is a business and a business has to prove its market and make money, otherwise it dies. Unfortunately though, if left untreated, your messy “move fast and break things” code turns into legacy technical debt. Most developers and engineering managers really do want to fix the issues and eliminate technical debt and legacy code. We all consciously tell ourselves -
“It’s ok, we’re almost to our alpha/beta/pilot release. We’ll fix this later - when we have time.”
The problem is, the time never shows up and eventually the team starts tripping over themselves because of all the old hacks that have been implemented.
They’re Called Symptoms for a Reason
Then it happens - your company closes that big round, lands a huge customer, or hits that inflection point of new users. Either way, the money is in the bank!
The team can now take the time to eliminate the technical debt that has amassed. Unfortunately though, the business has gotten accustomed to the pace of development and can’t afford to slow down any more. They actually need your team to move faster. That’s when it hits you -
“Wait a second … we’re still moving fast and breaking things, but we’re not fixing anything!”
You’re seeing the cracks in the foundation starting to form and you’re noticing something else that is concerning - the team’s morale is beginning to dip because working on the codebase is becoming frustrating and regression plagued. Will you be at work another 6 hours fixing regression bugs? If “probably” or “possibly” comes to mind then you may have crested the apex of your mountain of technical debt.
By this time the company has received some press and the pressure is on to deliver. Inter-employee tensions are high because the stakes are even higher. This is when testing and ensuring stability really matters, but the problem is - you have no tests in place to ensure quality, and the quality slips further down the slope of hopelessness with each commit. Each time a release is pushed the entire engineering team shuts down to perform manual testing on the application to make sure it passes the de facto “we didn’t break anything” acceptance test.
Trust me …
Something has to be done.
Steering The Ship Back On Course
Theoretically, curbing the technical debt is easy. In reality though, it’s usually quite painful. You’re dealing with legacy code and it’s hard to change anything without causing a bunch of regressions.
“The main thing that distinguishes legacy code from non-legacy code is tests, or rather a lack of tests.” - Michael Feathers, Working Effectively with Legacy Code
Code that’s even a year old and does not have tests is legacy code. Code you wrote last week, with no tests, is legacy code by this definition. This code is like a landmine, waiting to go off if you bump it the wrong way. It’s a bug waiting to happen.
One way to ensure your code does not become legacy and turn into technical debt is to implement some tests. I’m not saying you have to use TDD, BDD, etc - not at all (though these methods have been proven to work for folks***). Just creating some form of testable code is good. It could be anything from a simple end-to-end test to a full-blown test suite. I suggest starting small though, and then expanding from there.
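To make “starting small” a bit more concrete, here’s a minimal sketch of a characterization test, the kind of test that simply pins down what a piece of legacy code does today so you notice when a change alters it. Everything here is an assumption made for illustration: `legacy_shipping_cost` is a made-up stand-in for whatever untested function you’re afraid to touch, and the expected values were obtained by running the code, not from a spec.

```python
import unittest


def legacy_shipping_cost(weight_kg, express):
    """Hypothetical legacy function; nobody remembers why the rules are what they are."""
    cost = 5.0 + 1.5 * weight_kg
    if express:
        cost = cost * 2 + 3
    if weight_kg > 20:
        cost += 10
    return round(cost, 2)


class ShippingCostCharacterizationTest(unittest.TestCase):
    """Records what the code does today, not what it is supposed to do."""

    def test_standard_parcel(self):
        self.assertEqual(legacy_shipping_cost(2, express=False), 8.0)

    def test_express_parcel(self):
        self.assertEqual(legacy_shipping_cost(2, express=True), 19.0)

    def test_heavy_parcel_gets_surcharge(self):
        self.assertEqual(legacy_shipping_cost(25, express=False), 52.5)
```

Saved as a test module, this runs with `python -m unittest`. It won’t tell you whether the shipping rules are right, but it will tell you the moment a “harmless” change makes them different, which is usually the regression you were about to ship.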
A Blueprint for Avoiding Mass Technical Debt
Let’s get real for a second - there is no silver bullet for solving technical debt. Technical debt and legacy code will be around forever in one form or another. It will be around long after you’re gone, but there are a few things you can do now to lessen the pain in the future.
Implement a Test Coverage Plan
Any new code that is written from this day forward should require a corresponding test. Code reviewers should reject the code if it does not have a test associated with it. This will help ensure that no future issues are introduced. You can additionally push the code coverage bar further along by implementing 1-5 extra tests per week, per developer, that cover legacy code paths that currently do not have tests. An alternate method is to include a test for the legacy code component(s) each time a commit is made against the repository during a regular development sprint. In order to stay on track, organize the tests to be implemented by criticality of the component, and revisit the list weekly to revise as necessary.
Finally, once a feature has been validated and is baked into the application for the foreseeable future, tests should be implemented in the very next development sprint if the feature doesn’t have any yet.
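The “organize by criticality, revisit weekly” part of the plan can be as low-tech as a sorted list. Here’s a rough sketch of that idea; the `Component` fields, the 1-to-5 criticality scale and the `weekly_test_backlog` helper are all hypothetical names I’m using for illustration, not part of any tool.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    criticality: int  # assumed scale: 1 (nice to have) to 5 (business critical)
    has_tests: bool


def weekly_test_backlog(components, developers, tests_per_developer=2):
    """Return the untested components to cover this week, most critical first."""
    untested = [c for c in components if not c.has_tests]
    untested.sort(key=lambda c: c.criticality, reverse=True)
    # Cap the list at what the team has agreed to take on this week.
    return untested[: developers * tests_per_developer]


components = [
    Component("billing", 5, False),
    Component("signup", 4, False),
    Component("avatar_upload", 2, False),
    Component("search", 3, True),
    Component("email_digest", 1, False),
]

plan = weekly_test_backlog(components, developers=1, tests_per_developer=2)
print([c.name for c in plan])  # ['billing', 'signup']
```

When the team revisits the list each week, components that gained tests drop out and re-scored ones move up or down, so the backlog stays honest without much ceremony.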
Continuous Improvement
Set up a continuous integration server. This should be one of the first things a team does in order to help cultivate a culture of collective code ownership. You should be running your unit tests on every commit (or gated commit) and running the functional tests on a schedule (every 30 minutes, hour, etc - this depends on how long they take to run). The goal is to have a tight feedback loop which enables the team to identify issues earlier so they can be fixed as soon as they are found. Notifications of these failures should be sent to the entire team via email, Slack/Hipchat, etc.
There are various continuous integration servers you can use. The de facto champ in the self-hosted arena is Jenkins.
Alternatively, if you prefer hosted solutions there are many which are available:
- Circle CI
- Travis CI
- CodeShip
- Greenhouse CI
- Drone
- … and many more
Cultivating an Effective Team Culture
When I manage and train teams of developers working on a legacy system, one of the first things I do is buy everyone a copy of the book Working Effectively with Legacy Code by Michael Feathers. This seminal book helps team members realize what to do in situations where making changes is difficult and painful. Take a look at these chapter names to get a taste for what this book has to offer (yes, these are actual chapter names):
- “I Don’t Have Much Time and I Have to Change It”
- “It Takes Forever to Make a Change”
- “I Need to Make a Change, but I Don’t Know What Tests to Write”
- “I Don’t Understand the Code Well Enough to Change It”
- “We Feel Overwhelmed. It Isn’t Going to Get Any Better”
In addition to the book, I like to implement weekly team code reviews for the first few months while the team is adjusting to writing tests and eliminating legacy code. Each week for about an hour at most (the shorter the better, actually) the team gets together and reviews one piece of code from each contributing developer on the team (this includes engineering leads who code). The team should aim to ensure that consistent formatting, testing, architecture, etc are followed. This helps foster collective code ownership and brings more cohesion to the team.
Finding Zen
Not all work that we do as developers is enjoyable or highly engaging. However, we can make it easier by ensuring that we have the proper tooling and testing in place to protect against inadvertently introducing technical debt and legacy code into the codebase. Tests are a simple way to do this. Help yourself and your team out and go write a test or two!
* Zuckerberg has since changed the motto that Facebook operates under now that they’re no longer a small scrappy startup. Their new motto is “Move fast with stable infrastructure.”
** I consider this good practice when you need to iterate fast, prove a market, validate an MVP and iterate. This is not good practice in day to day coding.
*** There’s a great series by Martin Fowler, Kent Beck and DHH where they discuss whether TDD is dead. It’s definitely worth your time to watch these videos.