How to Measure Technical Debt: Metrics, Formulas & Tools
Learn how to measure technical debt with proven metrics: technical debt ratio, code churn, complexity trends, plus tools to track them live.

The fastest way to lose a tech debt argument is to walk into the planning meeting armed with a line like "The payments service is really fragile." That line loses to "we're shipping this feature next quarter" every time, because one side has a number and the other has a feeling. Measuring technical debt is how you stop bringing adjectives to a numbers fight.
This guide is for engineers and engineering managers who need to put a defensible figure on debt: the metrics that actually correlate with pain, the one formula worth knowing, and a repeatable way to measure debt in the codebase rather than from memory. None of it requires a quarter-long audit.
The first reason to measure technical debt is that unmeasured debt is unfundable. Google's engineering-research team studied this directly and concluded that technical debt is "a prime example of an entangled human and technical problem," which is a careful way of saying the hard part isn't the code, it's getting people to agree the debt is real. Measurement is what ends the "engineering said / leadership said" standoff. When you can show a deprecated API used at 4,000 call sites and trending up and to the right, the conversation shifts from whether to pay it down to when.
The second reason is prioritization. Most codebases carry more debt than any team can service, so the point of measuring isn't a tidy total. It's separating the debt that charges real interest from the debt that just looks alarming in an audit. We covered the management side of that in our technical debt management guide; this post is the measurement layer underneath it.
Not measuring also carries a cost that teams rarely recoup. When debt is invisible, it gets priced into every estimate as a vague "things take longer here" tax that nobody can challenge or fix. Teams over-staff the fragile service, route around the scary module, and quietly slow down, all without a number anyone can point to. Measurement doesn't reduce the debt. It makes the drag legible, which is the precondition for doing anything about it.
The technical debt ratio (TDR) is one of the most-cited metrics in this space, and it's worth understanding even if you never adopt it wholesale. TDR expresses the cost of fixing a codebase as a percentage of the cost of having built it.
Technical Debt Ratio = (Remediation Cost / Development Cost) × 100

TDR divides remediation cost by development cost; a worked example lands at 1.6% — watch the slope, not just the threshold.
Remediation cost is the estimated effort to fix all the known issues. Development cost is the effort the codebase represents, usually approximated as lines of code multiplied by a cost-per-line constant. This is the SQALE-style debt-ratio model used by tools such as SonarQube.
A worked example makes the formula concrete.
Say a service is 50,000 lines of code, your configured cost model assumes 30 minutes (0.5 hours) of development effort per line, and the analyzer projects 400 hours to clear every flagged issue.
Development cost is 50,000 × 0.5 = 25,000 hours.
Remediation cost is 400 hours.
TDR is (400 ÷ 25,000) × 100 = 1.6%, a healthy number that most teams would be happy with.
Run the same math on a 5,000-line module with 300 hours of remediation, and the ratio jumps to 12%, which is the number that should pull the module to the front of the queue. The formula's value is exactly this comparison: it normalizes debt by size so a small, rotten module doesn't hide behind a large, healthy one.
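The two calculations above can be sketched in a few lines of Python. The function name, the module names, and the default of 0.5 hours per line are illustrative assumptions taken from the worked example, not values from any particular tool.

```python
def technical_debt_ratio(remediation_hours: float, lines_of_code: int,
                         hours_per_line: float = 0.5) -> float:
    """Return TDR as a percentage: remediation cost / development cost x 100."""
    development_hours = lines_of_code * hours_per_line
    if development_hours <= 0:
        raise ValueError("development cost must be positive")
    return 100 * remediation_hours / development_hours


# Hypothetical modules mirroring the worked example in the text.
modules = {
    "large-service": technical_debt_ratio(400, 50_000),
    "small-module": technical_debt_ratio(300, 5_000),
}

# Rank by ratio so the small, rotten module surfaces first.
for name, tdr in sorted(modules.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {tdr:.1f}%")
```

Sorting by ratio rather than by raw remediation hours is the point of the normalization: the 300-hour module outranks the 400-hour service.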
In Sonar/SQALE-style implementations, teams often treat a low single-digit TDR as healthy, but the threshold depends on the ruleset and cost model, so treat it as a tool heuristic rather than an industry law. The exact threshold matters less than the trend. A TDR holding at 8% across two quarters is a healthier signal than one at 4% and climbing a point a month. Treat the absolute number as a conversation starter and the slope as the real metric.
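One way to make "watch the slope" concrete is to fit a straight line through evenly spaced TDR readings. The sketch below uses an ordinary least-squares slope; the monthly readings are made-up numbers that mirror the stable-8% and climbing-4% cases described above.

```python
def slope_per_period(readings: list[float]) -> float:
    """Least-squares slope of evenly spaced readings, in points per period."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings to compute a slope")
    mean_x = (n - 1) / 2
    mean_y = sum(readings) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(readings))
    variance = sum((x - mean_x) ** 2 for x in range(n))
    return covariance / variance


# Hypothetical monthly TDR readings (percent) over two quarters.
stable = [8.0, 8.0, 8.0, 8.0, 8.0, 8.0]
climbing = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

print(f"stable:   {slope_per_period(stable):+.2f} points/month")
print(f"climbing: {slope_per_period(climbing):+.2f} points/month")
```

The stable series has a slope of zero, while the climbing one gains a point a month and has already passed 8% by month five, which is why the lower starting number is the worse signal.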
The honest limitation is that TDR rewards what its underlying ruleset can see. It captures complexity and rule violations well, but it captures architectural debt, knowledge debt, and "this whole service is the wrong shape" debt poorly. It's a floor, not a full accounting, which is why standards bodies like CISQ publish separate automated-measurement specifications rather than treating a single ratio as the answer.
TDR is one number. A measurement program needs a handful, because different debt charges interest in different ways. These eight cover most of what teams track:

Five code-level signals describe the code, the DORA pair describes delivery, and the deprecated-pattern count bridges the two.
The first five describe the code. The DORA pair describes delivery, translating debt into a language the business already tracks. The last one is the bridge between the two, and it's the metric most teams underuse.
Metrics are inert without a method. Here's a five-step loop that produces a defensible measurement without a heroic audit.
Debt is contextual. A 600-line function in a stable, well-tested billing module may be fine, while a 60-line one in a service that changes daily may not be. Before measuring, write down the patterns that count: deprecated libraries, banned APIs, and complexity thresholds. This list makes the measurement reproducible rather than subjective.
Turn each definition from step 1 into a query. "Every call to the legacy auth client" becomes a search you can run on demand. This is where Code Insights fits: it turns the codebase into a queryable database, so a pattern like lang:java @deprecated becomes a tracked count rather than a one-time grep. The advantage over a static scan is that the measurement re-runs whenever the code changes.
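To illustrate the idea of a definition becoming a rerunnable query, here is a minimal local stand-in written in Python. It is not how Code Insights works internally; it simply counts regex matches in a checked-out directory, and the function name and default file suffix are assumptions made for the sketch.

```python
import re
from pathlib import Path


def count_pattern(root: str, pattern: str, suffix: str = ".java") -> int:
    """Count regex matches across all files under root that end with suffix."""
    regex = re.compile(pattern)
    total = 0
    for path in Path(root).rglob(f"*{suffix}"):
        if path.is_file():
            # Ignore undecodable bytes so one odd file does not abort the scan.
            text = path.read_text(encoding="utf-8", errors="ignore")
            total += len(regex.findall(text))
    return total


if __name__ == "__main__":
    # Example: count Java @Deprecated annotations under the current directory.
    print(count_pattern(".", r"@Deprecated\b"))
```

Because the definition lives in one function call, the same count can be recomputed on every change rather than once per audit.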
Take the first reading and save it. A single number is a data point, not a measurement. The measurement is the line you draw across readings, and the line is what you bring to planning. A deprecated-API count of 4,000 means little on its own; "4,000, down from 5,200 last quarter" is a story leadership can fund.
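A baseline only helps if each reading is stored somewhere it can be compared later. The sketch below appends dated readings to a JSON file and phrases the change between two readings; the file layout, metric name, and function names are hypothetical.

```python
import json
from pathlib import Path


def record_reading(history_file: str, metric: str, value: int, day: str) -> list[dict]:
    """Append a dated reading to a JSON history file and return the full history."""
    path = Path(history_file)
    history = json.loads(path.read_text()) if path.exists() else []
    history.append({"metric": metric, "date": day, "value": value})
    path.write_text(json.dumps(history, indent=2))
    return history


def describe_trend(previous: int, current: int) -> str:
    """Phrase the change between two readings the way a planning slide would."""
    if previous <= 0:
        raise ValueError("previous reading must be positive")
    if current == previous:
        return f"{current:,}, unchanged"
    direction = "down" if current < previous else "up"
    change = 100 * (current - previous) / previous
    return f"{current:,}, {direction} from {previous:,} ({change:+.1f}%)"


print(describe_trend(5200, 4000))
```

With the numbers from the text, the output reads "4,000, down from 5,200 (-23.1%)", which is the sentence leadership can fund.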
A codebase-wide TDR hides the signal. Break metrics down by service and owner so the debt lands with the team that can act on it. Cross-repository search makes this practical even when the relevant code spans hundreds of repositories, since Code Search resolves the same pattern across every repo and branch at once.
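Segmenting can be as simple as mapping each match to an owner and counting. In this sketch the path prefixes, team names, and match list are all hypothetical; in practice the mapping might come from a CODEOWNERS-style file and the matches from a search.

```python
from collections import Counter

# Hypothetical mapping from path prefix to owning team.
OWNERS = {
    "services/payments/": "payments-team",
    "services/checkout/": "checkout-team",
}


def owner_for(path: str) -> str:
    """Return the owning team for a file path, or 'unowned' if no prefix matches."""
    for prefix, owner in OWNERS.items():
        if path.startswith(prefix):
            return owner
    return "unowned"


def debt_by_owner(match_paths: list[str]) -> Counter:
    """Count pattern matches per owning team."""
    return Counter(owner_for(path) for path in match_paths)


# Hypothetical file paths where a deprecated call was found.
matches = [
    "services/payments/Ledger.java",
    "services/payments/Refund.java",
    "services/checkout/Cart.java",
    "tools/migrate.py",
]

for owner, count in debt_by_owner(matches).most_common():
    print(f"{owner}: {count}")
```

An "unowned" bucket is worth keeping visible, since debt with no owner is the debt least likely to be paid down.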
The final step is translation into business terms. "Complexity is up 12%" persuades no one outside engineering, while "the checkout service now takes three days to change instead of one, and our change failure rate there doubled" persuades everyone. Tie the code-level metrics to the delivery metrics, and tie those to time and risk. That's the chain that turns a measurement into a budget.
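As a small illustration of the delivery side of that chain, change failure rate is the share of deployments that caused a failure needing remediation. The sketch below compares two periods; the service name and deployment counts are invented for the example.

```python
def change_failure_rate(deployments: int, failed: int) -> float:
    """Percentage of deployments that caused a failure needing remediation."""
    if deployments <= 0:
        raise ValueError("need at least one deployment")
    return 100 * failed / deployments


def delivery_summary(service: str, before: tuple[int, int], after: tuple[int, int]) -> str:
    """Summarize how the change failure rate moved between two periods."""
    rate_before = change_failure_rate(*before)
    rate_after = change_failure_rate(*after)
    summary = f"{service}: change failure rate went from {rate_before:.0f}% to {rate_after:.0f}%"
    if rate_before > 0:
        summary += f" ({rate_after / rate_before:.1f}x)"
    return summary


# Hypothetical (deployments, failed deployments) for two consecutive quarters.
print(delivery_summary("checkout", before=(40, 2), after=(40, 4)))
```

Pairing a sentence like this with the code-level reading for the same service is what connects a complexity number to risk.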
The discipline that makes this loop work is keeping the metric set small. It's tempting to dashboard everything the tooling can produce, but a measurement program with thirty metrics gets ignored as fast as one with none. Pick the two or three signals that map to your actual pain, usually a complexity-and-churn pair plus one delivery metric, and let the rest stay available for drill-down rather than front and center. The goal of the whole exercise is a number a tech lead can defend in a planning meeting, not a wall of gauges nobody reads.
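A complexity-and-churn pair can be collapsed into a short ranked list. The sketch below multiplies the two signals into a single score, which is one common heuristic and an assumption here rather than a formula from this guide; the file paths and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    path: str
    commits_last_90_days: int  # churn: how often the file changes
    cyclomatic_complexity: int  # complexity as reported by an analyzer


def hotspots(files: list[FileStats], top: int = 3) -> list[FileStats]:
    """Rank files by churn x complexity and return the highest-scoring ones."""
    return sorted(
        files,
        key=lambda f: f.commits_last_90_days * f.cyclomatic_complexity,
        reverse=True,
    )[:top]


# Hypothetical readings for four files.
stats = [
    FileStats("billing/invoice.py", commits_last_90_days=2, cyclomatic_complexity=60),
    FileStats("checkout/cart.py", commits_last_90_days=45, cyclomatic_complexity=38),
    FileStats("auth/legacy_client.py", commits_last_90_days=20, cyclomatic_complexity=25),
    FileStats("utils/strings.py", commits_last_90_days=30, cyclomatic_complexity=3),
]

for f in hotspots(stats, top=2):
    print(f.path, f.commits_last_90_days * f.cyclomatic_complexity)
```

The stable billing file scores low despite its complexity, which matches the earlier point that debt is contextual: complexity that nobody touches charges little interest.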
The tooling splits cleanly by what each tool measures:
Most serious programs use one tool from at least two of these rows, because no single tool measures code-level, delivery, and custom-pattern debt at once.
AI coding tools can change the measurement problem in one specific way: they can raise the rate at which new code, and therefore new debt, enters the codebase if review and measurement don't keep pace. A measurement cadence that worked at human review speed can fall behind when a meaningful share of commits is machine-drafted. The fix isn't a new metric; it's a faster loop. The deprecated-pattern count that you recompute on every change becomes more valuable, not less, when code volume climbs.
What doesn't change is the need for ground truth. A model can estimate how much debt a change adds; only a query against the actual codebase can tell you how many services still call the deprecated API. McKinsey's framing holds here, that the goal is to identify, value, and manage debt rather than chase a zero. Measurement is the "identify and value" half, and it's the half AI makes more important, not less.
Every reliable technical debt measurement shares one property: it comes from the code, not from a workshop estimate. TDR gives you a headline number, the eight metrics give you the texture, and segmenting by team turns the whole thing into something owners can act on. The teams that measure well don't run bigger audits; they run smaller queries more often.
If your current debt number lives in a spreadsheet that's already out of date, start by making it queryable. See how Code Insights turns any pattern in your codebase into a metric you can actually trend.
What is the 80/20 rule for technical debt? The observation that a small share of debt items causes most of the cost. In measurement terms, it's why you segment: a handful of high-churn, high-complexity files usually account for the bulk of defects and slowdowns, so measuring everything equally wastes the signal.
How do you monitor technical debt over time? Turn your debt definitions into queries, baseline them, and chart the readings on a dashboard reviewed at the same cadence as delivery metrics. The trend line, not the snapshot, is the monitor.
What are the 4 types of technical debt? Taxonomies vary, but most teams measure across code debt, architecture debt, testing debt, and documentation debt. Each needs different metrics, which is why a single TDR number underreports architecture debt and knowledge-related debt.
What is a good technical debt ratio? Under 5% is the common target, but the slope matters more than the threshold. A stable 8% beats a 4% that climbs a point a month.
