The Real Cost of Technical Debt
Reading time: 7 minutes
Ask a room of engineers what technical debt is and you’ll get a dozen different definitions. Hacky code. Missing tests. Outdated dependencies. “Everything the previous team wrote.” The word gets applied to so many different things that it’s lost most of its practical meaning.
The original definition, from Ward Cunningham in 1992, was more precise and more useful: technical debt is the gap between the design you shipped and the design you now know would be better. It’s the cost of decisions made under constraints — time, knowledge, resources — that need to be revisited as those constraints change.
By that definition, some technical debt is inevitable. Some of it is even correct.
The Myth: All Technical Debt Is Bad
There’s a version of engineering culture where “we have technical debt” is always a failure statement — an indictment of past decisions. This is wrong in a practical sense and counterproductive in a team sense.
Taking on technical debt deliberately, knowing you’re doing it, is a valid engineering decision. Shipping a working feature with a rough implementation to meet a deadline, with the intention of cleaning it up once the feature is validated — that’s a reasonable trade-off. The product gets to market, the feature gets tested with real users, and the team pays off the debt once they know it’s worth paying.
The problematic debt isn’t the kind you took on deliberately. It’s the kind you took on accidentally, or the kind you took on deliberately and then forgot to pay off.
The Quadrant
Martin Fowler's technical debt quadrant, which builds on Cunningham's metaphor, plots debt across two axes:
| | Prudent | Reckless |
|---|---|---|
| Deliberate | “We know we’re cutting a corner — we’ll fix it after launch.” | “We don’t have time to do this right.” |
| Inadvertent | “Now we understand the domain better, we’d design it differently.” | “What’s layering? What’s coupling?” |
The top-left quadrant is healthy engineering. You made a conscious decision, accepted a trade-off, and have a plan to revisit it.
The top-right quadrant is a process problem. Teams under chronic pressure accumulate reckless deliberate debt — they know they’re doing it wrong, but there’s no space to do it right.
The bottom-left quadrant is actually a sign of learning — you understand the domain better now than you did when you built it. That’s how software development works.
The bottom-right quadrant is the genuinely dangerous kind: technical debt that nobody knows exists because nobody understood what good looked like to begin with. It’s also the hardest to fix, because the people who need to fix it don’t know it needs fixing.
How to Actually Measure Debt
Test coverage percentages and lines of code are misleading proxies. A codebase can have 80% test coverage and be in terrible shape. A codebase with no tests can be clean and well-factored.
The metric that actually reflects technical debt is time-to-feature: how long does it take to add a new capability to the system? Specifically:
- How long does it take an experienced developer to understand the relevant part of the codebase before they can change it?
- How many unrelated files need to change when you modify one thing?
- How long does a typical PR stay in review because reviewers don’t understand the context?
- How often does fixing one bug introduce a new bug?
If these numbers grow even as the team gains experience and the system matures, that's the fingerprint of compounding debt. If they're stable or shrinking, the team is managing it well regardless of what the code looks like aesthetically.
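The second question — how many files change when you modify one thing — can be approximated directly from version control. A minimal sketch (the file names and history are illustrative; in practice you'd feed it commits parsed from `git log --name-only`):

```python
from collections import Counter
from itertools import combinations

def co_change_counts(commits):
    """Count how often each pair of files changes in the same commit.

    `commits` is a list of iterables of file paths, one per commit.
    Pairs that co-change frequently are a proxy for hidden coupling.
    """
    pairs = Counter()
    for files in commits:
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return pairs

# Hypothetical history: billing.py and invoice.py keep changing together.
history = [
    {"billing.py", "invoice.py"},
    {"billing.py", "invoice.py", "tax.py"},
    {"billing.py", "invoice.py"},
    {"report.py"},
]
hotspots = co_change_counts(history)
print(hotspots.most_common(1))  # the most tightly coupled pair of files
```

Pairs that sit near the top of this list, yet have no obvious domain relationship, are where the coupling-driven friction lives.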
The Compounding Effect
The financial debt metaphor is apt because debt compounds. A codebase with moderate coupling in one area tends to attract more coupling. Engineers working in a messy area write messy code — partly because messy code is harder to refactor incrementally, partly because “it’s already messy here” is a real psychological signal that reduces care.
The compounding shows up in specific ways:
Feature velocity drops. What took one sprint in year one takes three sprints in year three. The team isn’t slower — the codebase has accumulated friction.
Bugs cluster. When the same three files keep appearing in post-mortems, the debt is localised. That’s actually useful information — it tells you exactly where to focus remediation.
Onboarding time grows. A new engineer taking four weeks to make their first meaningful contribution indicates a knowledge transfer problem that’s usually rooted in accidental complexity — things being hard to understand that shouldn’t be.
Changes feel risky. When experienced engineers hesitate before touching an area of the codebase, that hesitation is measuring something real. Systems that are well-understood and well-structured feel safe to change. Systems laden with debt feel dangerous.
When to Pay It Off
Pay off technical debt when the signals above appear in combination:
- Feature velocity in a specific area is measurably slower than it was
- Bugs are clustering in the same components
- Onboarding to that component takes disproportionate time
- Engineers express reluctance to touch it
These aren’t aesthetic concerns — they’re business concerns with a cost you can estimate. If a feature takes three sprints now that would take one sprint in a refactored system, the debt is costing you two sprints per feature. That’s a concrete number you can use to justify the remediation work.
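The estimate above is simple enough to put in a spreadsheet, but a sketch makes the shape of the calculation explicit (the numbers are the illustrative ones from the paragraph, not measurements):

```python
def refactor_payback(cost_now, cost_after, refactor_cost, planned_features):
    """Net sprints saved by refactoring before the next batch of features.

    cost_now / cost_after: sprints per feature before and after the refactor.
    refactor_cost: sprints spent on the refactor itself.
    """
    without_refactor = cost_now * planned_features
    with_refactor = refactor_cost + cost_after * planned_features
    return without_refactor - with_refactor

# Three sprints per feature now, one sprint after a two-sprint refactor,
# five features planned: 3*5 - (2 + 1*5) = 8 sprints saved.
print(refactor_payback(3, 1, 2, 5))
```

If the result is negative or near zero for the work actually planned, that's the model telling you the debt isn't worth paying yet.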
The strongest case for paying debt is when the affected code is on the critical path for upcoming work. Refactor the module you’re about to extend, not the one that hasn’t been touched in two years.
When NOT to Pay It Off
When the code isn’t on the critical path. Code that works, is stable, and is never going to change doesn’t need to be beautiful. Touching it creates risk without corresponding benefit. Leave it alone.
When refactoring means rewriting. Full rewrites almost always create new debt while paying off the old. The system that exists, however messy, encodes a tremendous amount of learned behaviour, edge case handling, and implicit product knowledge. A rewrite starts from zero and typically rediscovers the same problems. The graveyard of failed big-bang rewrites is extensive.
When you don’t understand the problem well enough. Premature refactoring — cleaning up code whose domain you don’t fully understand — often makes things worse. Wait until you’ve worked in an area long enough to know what the right abstractions actually are.
When the business is changing faster than the technology. If you’re iterating on the product rapidly, paying off architectural debt before the product shape stabilises is often wasted work. Wait for the product to settle before hardening the architecture.
Refactor-as-You-Go vs Dedicated Sprints
Both approaches work. Neither works alone.
Refactor-as-you-go is the boy-scout rule: leave the code a little better than you found it. When you’re working in an area, clean up local issues — rename confusing variables, extract a function, add a test for the path you just changed. This prevents accumulation without requiring dedicated time.
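As an illustration of the scale involved (the code is hypothetical), a boy-scout pass might extract a magic expression into a named function, make the constant explicit, and cover the path you just touched:

```python
# Before: a magic expression buried in the handler you came to change.
#     total = sum(i["qty"] * i["price"] for i in d) * 1.2

# After: the same logic, extracted and named, with the rate made explicit.
VAT_RATE = 0.20

def order_total(items):
    """Total of an order's line items, including VAT."""
    net = sum(item["qty"] * item["price"] for item in items)
    return net * (1 + VAT_RATE)

# ...and a test for the path just changed:
assert round(order_total([{"qty": 2, "price": 10.0}]), 2) == 24.0
```

Nothing here needed a ticket or a planning conversation; that's the point of the rule.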
Dedicated sprints are appropriate for larger structural problems — architecture-level issues that can’t be fixed in passing. A module with the wrong abstraction boundary, a database schema that needs migration, an API contract that’s accumulated too many backwards-compatible hacks. These require focused effort and carry risk that should be planned for explicitly.
The mistake is treating them as alternatives. Teams that rely only on refactor-as-you-go never tackle structural problems. Teams that rely only on dedicated sprints never get business approval for the time, and small issues accumulate in the interim.
How to Talk About It With Non-Technical Stakeholders
“We need to pay off technical debt” lands poorly with people who don’t write code. It sounds like engineers wanting to clean up their mess instead of building features.
The translation that works: cost per change. “Right now, adding a feature to the billing module takes three weeks because of how it’s structured. If we spend two weeks restructuring it, the next five features each take one week instead of three. That’s a net saving of eight weeks on the next five features.”
This is not a fabricated argument — it’s usually accurate. The work of making the business case is just measuring the current friction and estimating the improvement. If you can’t make that case with numbers, the work might not be justified.
Dealing with an inherited codebase and not sure where to start? Write to us at hello@cimpleo.com — we can do a technical audit and tell you what’s worth fixing and what’s better left alone.