Engineering · 10 min read

Technical Debt: When to Pay It Down (and When to Ignore It)

Most technical debt advice is moralizing dressed up as engineering. Here is a real framework: which debt costs more to ignore than to fix, which is stable and can be left alone, and how to tell the difference.

Senior System Architect & Fractional CTO

'We need to pay down our technical debt' is the most overused phrase in engineering, and most of the time it means 'I do not like the code I inherited.' Real debt decisions are not about taste. They are about whether the cost of living with a piece of code is higher than the cost of changing it, and that is a math problem, not an aesthetic one.

Below is the framework I use when founders ask me whether to refactor, rewrite, or leave a system alone. It is built on Martin Fowler's debt quadrant, sharpened by about a hundred codebase audits, and it works in roughly 30 seconds per debt item once you internalize it.

The one-line test: cost of carry vs cost of change

Pay down debt when the cost of carrying it (engineering hours lost, outage risk, hiring drag) exceeds the cost of fixing it. Ignore debt when the cost of carry is roughly zero. That is the entire framework. Everything else is just learning to estimate the two costs honestly.

Cost of carry is concrete: hours per week the team loses to the debt, frequency and severity of incidents it causes, ramp-up time for new engineers because of it, opportunities the team cannot pursue because the area is too risky to touch. Cost of change is also concrete: engineer-weeks to fix, regression risk, opportunity cost of not shipping features during the fix, and the dreaded scope creep multiplier (1.5 to 3x for any non-trivial refactor).
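To make the two estimates concrete, here is a minimal Python sketch. The class name, field names, the 40-hour engineer-week, and the 13-week quarter are all assumptions for illustration; opportunity cost and regression risk are left out because they rarely reduce to a single number.

```python
from dataclasses import dataclass

HOURS_PER_ENGINEER_WEEK = 40  # assumption: one engineer-week is 40 hours
WEEKS_PER_QUARTER = 13        # assumption: a quarter is 13 weeks


@dataclass
class DebtEstimate:
    """Rough estimates for one debt item. Every field is a judgment call."""
    hours_lost_per_week: float           # team hours lost to the debt each week
    incident_hours_per_quarter: float    # time spent on incidents it causes
    onboarding_hours_per_quarter: float  # extra ramp-up time for new engineers
    fix_engineer_weeks: float            # raw estimate to fix it
    scope_creep: float = 1.5             # 1.5 to 3x for any non-trivial refactor

    def carry_hours_per_quarter(self) -> float:
        """Cost of carry: hours the debt costs the team each quarter."""
        return (self.hours_lost_per_week * WEEKS_PER_QUARTER
                + self.incident_hours_per_quarter
                + self.onboarding_hours_per_quarter)

    def change_hours(self) -> float:
        """Cost of change: fix estimate inflated by the scope-creep multiplier."""
        return self.fix_engineer_weeks * self.scope_creep * HOURS_PER_ENGINEER_WEEK


estimate = DebtEstimate(
    hours_lost_per_week=4,
    incident_hours_per_quarter=8,
    onboarding_hours_per_quarter=0,
    fix_engineer_weeks=2,
)
print(estimate.carry_hours_per_quarter(), estimate.change_hours())
```

With these made-up inputs the debt costs 60 hours a quarter to carry and 120 hours to fix, so the fix pays for itself in two quarters.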

Fowler's quadrant: which debt you actually have

Martin Fowler's debt quadrant splits debt across two axes: deliberate vs accidental, and prudent vs reckless. The diagnosis matters because the response is different for each.

| Quadrant | What it is | Common example | Right response |
|---|---|---|---|
| Deliberate Prudent | We know the right way; we ship the shortcut on purpose with a plan to fix | 'Ship a single-tenant for the first 10 customers, multi-tenant by month 6' | Track it, set a trigger to fix |
| Deliberate Reckless | We know the right way; we ship the shortcut and pretend we will fix it | 'No tests for v1, we will add them later' | Audit honestly; usually becomes accidental reckless |
| Accidental Prudent | We did not know the right way; we shipped, learned, and now we know | 'Turns out we needed an event bus, not direct calls' | Refactor at the seams when next touching the area |
| Accidental Reckless | We did not know there was a right way; we still do not | 'Why is everything tightly coupled to the auth module?' | The dangerous one: fix when you can spot it |

Fowler's technical debt quadrant. The accidental-reckless box is where startups die.
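If you keep debt items in a tracker, the quadrant can be encoded as a two-flag lookup. This is only a sketch: the function name and the dictionary are hypothetical, and the response strings are shortened from the quadrant descriptions above.

```python
# Keys are (deliberate, prudent); values are the recommended response.
RESPONSES = {
    (True, True): "Track it, set a trigger to fix",
    (True, False): "Audit honestly",
    (False, True): "Refactor at the seams when next touching the area",
    (False, False): "Fix when you can spot it",
}


def classify(deliberate: bool, prudent: bool) -> tuple[str, str]:
    """Return the quadrant name and the response for one debt item."""
    first = "Deliberate" if deliberate else "Accidental"
    second = "Prudent" if prudent else "Reckless"
    return f"{first} {second}", RESPONSES[(deliberate, prudent)]


print(classify(deliberate=False, prudent=False))
```

The hard part is not the lookup but answering the two questions honestly, especially "did we know there was a right way?"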

Before product-market fit (pre-PMF), take on deliberate prudent debt cheerfully: it is the price of speed. Pre-PMF deliberate reckless debt is also fine if you are honest with yourself about it (most teams are not). The killer is accidental reckless debt: by definition you do not know it is there, so you cannot price it, and it compounds. The single most valuable hour of an architecture audit is usually surfacing the accidental reckless debt nobody had named.

Signs you are losing the debt fight

Six measurable signals tell you debt is winning. If three or more are true, stop shipping features and fix the system. If five or more are true, you are probably already in a death spiral and need outside help.

  • Deploy time is over 30 minutes (or you deploy less often than weekly because deploys scare the team)
  • Local dev setup takes a new engineer more than half a day to get green
  • There is at least one module or service everyone refers to as 'the scary one'
  • Bug fixes regularly cause new bugs in unrelated areas
  • On-call pages have grown more than 50 percent in the last 6 months on flat traffic
  • A typical small feature (the kind that took 2 days a year ago) now takes 1 to 2 weeks
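
The six signals and the two thresholds can be turned into a simple checklist score. This is a sketch: the signal keys are hypothetical shorthand for the bullets above, and the "keep shipping" verdict for fewer than three signals is my assumption, since only the three-or-more and five-or-more cases are spelled out.

```python
# Hypothetical short keys, one per signal in the list above.
SIGNALS = (
    "deploy_slow_or_rare",      # deploy over 30 minutes, or less often than weekly
    "dev_setup_over_half_day",  # new engineer needs more than half a day to get green
    "has_scary_module",         # a module everyone calls 'the scary one'
    "fixes_cause_new_bugs",     # bug fixes break unrelated areas
    "pages_up_50_percent",      # on-call pages up over 50 percent in 6 months, flat traffic
    "small_features_slow",      # a 2-day feature now takes 1 to 2 weeks
)


def debt_verdict(observed: dict[str, bool]) -> str:
    """Count the signals that are true and apply the 3 and 5 thresholds."""
    count = sum(1 for signal in SIGNALS if observed.get(signal, False))
    if count >= 5:
        return "death spiral: get outside help"
    if count >= 3:
        return "stop shipping features and fix the system"
    return "keep shipping"  # assumption: below 3 is treated as healthy enough


print(debt_verdict({
    "has_scary_module": True,
    "fixes_cause_new_bugs": True,
    "small_features_slow": True,
}))
```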

Debt to ignore (yes, ignore)

The hardest discipline in engineering leadership is leaving ugly code alone. Three categories almost always belong in the ignore pile.

Stable, isolated, and rarely touched. Take a 600-line legacy module that handles a feature used by 4 percent of users, has not had a bug in a year, and that nobody needs to modify. Leave it alone, even if the style would not pass code review today. The cost of carry is approximately zero.

Code that offends your taste but works. 'I would have written it differently' is not a business case. 'It costs us 4 hours a week of debugging' is. If you cannot translate the debt to wall-clock cost, you are likely paying for taste, not value.

Old patterns that are not your bottleneck. A REST API that 'should be GraphQL' is rarely the thing slowing your team down. A monolith that 'should be microservices' is, more often than not, exactly the right architecture for a 5-engineer team. The only debt worth fixing is the debt blocking your team's actual current work — not the debt that bothers you in theory.

Pre-PMF vs post-PMF: the rules change

Pre-PMF, debt work should take only 5 to 10 percent of engineering time. The risk of failure is 'nobody wants this product,' not 'our codebase is messy.' Take shortcuts. Skip tests on throwaway code. Hardcode the 5 things that should be configurable. Ship the version of the system that lets you learn from real users this week. The codebase you will refactor later is the codebase your competitor never gets to write because they polished theirs to death.

Post-PMF, the math inverts. You now have customers who depend on the system not breaking, a team you want to keep, and a roadmap longer than 90 days. Allocate 15 to 25 percent of engineering capacity to debt and infrastructure work. Below 10 percent and you are accumulating; above 35 percent and you are over-engineering or working on the wrong debt.
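
As a rough check, the bands above can be written as a small function. The function name and labels are hypothetical. The gaps between the stated bands (10 to 15 and 25 to 35 percent) are not addressed above, so the sketch labels them "borderline" as an assumption.

```python
def debt_budget_status(debt_percent: float, post_pmf: bool) -> str:
    """Classify the share of engineering capacity spent on debt work."""
    if not post_pmf:
        # Pre-PMF guidance: 5 to 10 percent.
        if 5 <= debt_percent <= 10:
            return "in range for pre-PMF"
        return "outside the pre-PMF range"
    if debt_percent < 10:
        return "accumulating"
    if debt_percent > 35:
        return "over-engineering or working on the wrong debt"
    if 15 <= debt_percent <= 25:
        return "healthy"
    return "borderline"  # assumption: the text does not cover 10-15 or 25-35


print(debt_budget_status(20, post_pmf=True))
```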

The transition is brutal because the habits that got you to PMF are the wrong habits for scaling past it. Most teams notice this 6 to 12 months too late, after the first scaled hire ramps slowly and a recurring outage starts costing real customers. The signal that you are post-PMF and over-leveraged on debt is usually that the founder-engineer's productivity is still high but new hires take 2 to 3 months to ship anything meaningful.

A 5-step debt prioritization process

Once a quarter, run this process across your whole team. It takes 2 to 4 hours and produces a defensible prioritization. Most teams I work with try it once and then keep doing it every quarter.

  1. List every known piece of debt — whiteboard or doc, no judgement, include the items everyone complains about and the ones nobody talks about anymore
  2. For each, estimate cost of carry per quarter in engineer-hours (be honest: how much time is actually lost?) and assign a Fowler quadrant (deliberate or accidental, prudent or reckless)
  3. For each, estimate cost of change in engineer-weeks, with a 1.5x multiplier for any item over a week
  4. Compute payback in quarters: cost of change (converted to engineer-hours so the units match) divided by cost of carry per quarter. Anything paying back in under 2 quarters is a yes, over 4 is a no, and in between needs strategic context
  5. Pick the top 3 debts that fit inside this quarter's 15-25 percent debt budget and put them on the roadmap with explicit acceptance criteria
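
Steps 3 to 5 can be sketched in a few lines of Python. Everything named here is hypothetical: the `DebtItem` class, the 40-hour engineer-week used to convert units, the example items and their numbers, and the greedy pick in payback order. The budget is expressed in engineer-weeks, which you would derive from your own 15-25 percent allocation.

```python
from dataclasses import dataclass

HOURS_PER_ENGINEER_WEEK = 40  # assumption, used to convert weeks to hours


@dataclass
class DebtItem:
    name: str
    carry_hours_per_quarter: float  # step 2 estimate
    change_weeks: float             # step 3 raw estimate

    def adjusted_change_weeks(self) -> float:
        """Step 3: apply the 1.5x multiplier to any item over a week."""
        return self.change_weeks * 1.5 if self.change_weeks > 1 else self.change_weeks

    def payback_quarters(self) -> float:
        """Step 4: cost of change divided by cost of carry per quarter."""
        if self.carry_hours_per_quarter <= 0:
            return float("inf")  # zero carry never pays back
        change_hours = self.adjusted_change_weeks() * HOURS_PER_ENGINEER_WEEK
        return change_hours / self.carry_hours_per_quarter

    def verdict(self) -> str:
        payback = self.payback_quarters()
        if payback < 2:
            return "yes"
        if payback > 4:
            return "no"
        return "needs strategic context"


def prioritize(items: list[DebtItem], budget_weeks: float, max_items: int = 3) -> list[DebtItem]:
    """Step 5: pick up to max_items, fastest payback first, within the budget."""
    candidates = sorted(
        (item for item in items if item.verdict() != "no"),
        key=lambda item: item.payback_quarters(),
    )
    picked: list[DebtItem] = []
    used = 0.0
    for item in candidates:
        cost = item.adjusted_change_weeks()
        if len(picked) < max_items and used + cost <= budget_weeks:
            picked.append(item)
            used += cost
    return picked


backlog = [
    DebtItem("flaky deploy pipeline", carry_hours_per_quarter=120, change_weeks=2),
    DebtItem("legacy report module", carry_hours_per_quarter=5, change_weeks=1),
    DebtItem("auth coupling", carry_hours_per_quarter=80, change_weeks=4),
]
for item in prioritize(backlog, budget_weeks=10):
    print(item.name, item.payback_quarters(), item.verdict())
```

In this made-up backlog the deploy pipeline pays back in one quarter, the auth coupling in three (so it still needs a strategic argument), and the legacy report module in eight, which puts it in the ignore pile.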

When a rewrite is actually the right call

Rewrites are right roughly 5 percent of the time. The honest tests: the architecture is fundamentally wrong (multi-tenant requirement on a single-tenant codebase, real-time required on a polling codebase, a runtime that is end-of-life), the team has at least 3 months of dedicated runway for it, and there is a clear cutover plan that does not require freezing feature work for the duration.

Even then, prefer the strangler-fig pattern: stand up the new system alongside the old, route increasing percentages of traffic to it, retire the old system module by module. Big-bang rewrites have a roughly 50 percent failure rate at startups and a much higher 'shipped 9 months late' rate. The exception is rewrites under 2 weeks of work — those are usually fine because the scope is small enough not to drift.
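
The routing half of a strangler-fig migration can be sketched as a deterministic percentage split per module. This is an illustration under assumptions, not a production router: the module names, the `rollout` mapping, and hashing on a user ID are all hypothetical choices.

```python
import hashlib


def route(module: str, user_id: str, rollout: dict[str, int]) -> str:
    """Send a request to the 'new' or 'old' system for one module.

    rollout maps a module name to the percentage (0-100) of users served
    by the new system. Modules missing from the mapping stay on the old one.
    """
    percent = rollout.get(module, 0)
    digest = hashlib.sha256(f"{module}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return "new" if bucket < percent else "old"


# Hypothetical rollout state: billing fully migrated, reports at 25 percent.
rollout = {"billing": 100, "reports": 25}
print(route("billing", "user-42", rollout))
print(route("search", "user-42", rollout))
```

Hashing instead of picking randomly keeps each user on the same side between requests, and raising a module's percentage only moves users from old to new, never back. Once a module sits at 100 and has been stable, the old implementation of that module can be retired.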

What a debt-aware team looks like

Healthy teams treat debt the way good operators treat finances: explicit, tracked, paid down on a schedule, with a budget and a quarterly review. They have a debt log, they retire items off it every quarter, and they incur new ones consciously rather than by accident. They are not chasing zero debt — that goal would kill velocity. They are running a sustainable level of leverage.
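
A debt log does not need special tooling. Here is a minimal sketch of one with a quarterly review. The class names, the quarter label format, and the example entries are hypothetical, and a spreadsheet would do the same job.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DebtEntry:
    name: str
    incurred_quarter: str                  # e.g. "2025-Q1" (label format is an assumption)
    retired_quarter: Optional[str] = None  # None while the debt is still open


@dataclass
class DebtLog:
    entries: list[DebtEntry] = field(default_factory=list)

    def incur(self, name: str, quarter: str) -> None:
        """Record a debt taken on consciously."""
        self.entries.append(DebtEntry(name, quarter))

    def retire(self, name: str, quarter: str) -> None:
        """Mark an open debt as paid down."""
        for entry in self.entries:
            if entry.name == name and entry.retired_quarter is None:
                entry.retired_quarter = quarter
                return
        raise KeyError(f"no open debt named {name!r}")

    def review(self, quarter: str) -> dict:
        """Quarterly review: what was incurred, what was retired."""
        incurred = [e.name for e in self.entries if e.incurred_quarter == quarter]
        retired = [e.name for e in self.entries if e.retired_quarter == quarter]
        # Nothing retired this quarter means the team is accumulating.
        return {"incurred": incurred, "retired": retired, "accumulating": not retired}


log = DebtLog()
log.incur("single-tenant shortcut", "2025-Q1")
log.incur("no tests on importer", "2025-Q2")
log.retire("single-tenant shortcut", "2025-Q2")
print(log.review("2025-Q2"))
```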

If you are unsure whether your team is winning or losing the debt fight, that is exactly what an architecture audit is for (from $1,499) — I spend a few days in your code, your repo history, and your incident log, and produce a debt register with payback calculations on each item. Pairs well with the AI-native architecture guide if you are AI-heavy, the AWS bill cut playbook if cost-of-carry is high, and the cost-per-user calculator if you are trying to put a dollar number on engineering inefficiency. For ongoing oversight, the Fractional CTO engagement (from $2,999/month) covers running this process quarterly without you needing to lead it.

Frequently asked questions

How do I tell my CEO that we need to stop shipping features and pay down debt?

Do not pitch it as 'pay down debt.' Pitch it as 'we are losing X engineering hours per week to a problem that takes Y hours to fix.' Quantify: deploy time, time to add a typical feature, on-call hours, time to onboard a new engineer. If those numbers have doubled in 6 months, you have a business case. CEOs do not approve refactors. They approve velocity recovery.

Is technical debt always bad?

No. Deliberate prudent debt — taking a known shortcut to ship faster, with the team aware of the trade-off and a plan to fix it — is one of the most powerful tools in startup engineering. Pre-PMF, you should be taking on debt every week. The bad debt is accidental and reckless: shortcuts the team did not realize were shortcuts, that nobody is tracking, that compound silently.

What is a healthy debt-to-feature ratio?

Post-PMF, somewhere between 15 and 25 percent of engineering time on debt and infrastructure work is normal and healthy. Below 10 percent and you are accumulating; above 35 percent and you are probably over-engineering or paying for the wrong debt. Pre-PMF, the ratio can drop to 5 to 10 percent — speed of learning matters more than code quality at that stage.

Should we rewrite or refactor?

Refactor 95 percent of the time. Rewrites take 2 to 5x longer than estimated, regress features users depend on, and stop the business from shipping for months. The case for a rewrite: the architecture is wrong at a foundational level (e.g. a single-tenant codebase that needs to be multi-tenant, or a monolith that is genuinely impossible to slice), and you have at least 3 months of runway specifically for the rewrite. Otherwise, refactor at the seams.

How do I prevent debt from coming back after we pay it down?

Three habits: a 'Definition of Done' checklist that includes 'leaves the surrounding code at least as clean as you found it,' an explicit debt budget on the engineering roadmap (15-25 percent of capacity, not zero, not 50), and a quarterly debt review where you list new debts incurred and old debts retired. If you cannot point to debts retired in the last quarter, you are accumulating.
