On owning a codebase, and why it may be the hardest job in software
AI coding agents are producing more code than ever, but the world still runs on massive, decades-old codebases. Why owning and understanding them may be the hardest job in software.

It may feel anachronistic to think about a codebase as anything more than a temporary artifact of a prompt.
And it feels positively uncool to care about the code as much as the product.
But this is the job that most of us actually need to re-learn to survive in this industry over the next 5 years. 72% of employment in the software industry is in companies with >500 employees. Few of our peers work at startups, and even fewer are solopreneur X influencers.
The software that makes our world work isn’t pretty, it isn’t new, it isn’t clean, and it may be aging like a 30-year-old freeway full of potholes. But it’s how our world works. It’s how your bank account will reject a transaction in real time when you hit your limit. It’s how your insurance carrier calculates the reimbursement rate on a policyholder with dual or triple coverage. It’s how a chip in a warehouse scanner allows Amazon to tell you when your new socks arrive. It’s how your Uber estimates an arrival time, how an airplane uses RADAR to adjust trajectory, how your paycheck gets issued twice a month, how your office air conditioner ticks on and off, how your favorite brand’s new logo gets designed and printed, how the bevel on the edge of your bedframe gets precision cut, how your local grocery store stays full, and on, and on, and on.
We are all completely, utterly dependent on massive and complex codebases built over decades.
And in a world full of AI tools that can spit out brand new products from scratch faster than we can keep up, these codebases are still standing, making the world work. We cannot afford to leave them in the dust.
“We don’t know how to track code quality anymore. Vibe coding leads to cruft and rapid obsolescence in there. We need rigor to be sure we’re not blowing up the shop.”
— Technology Leader, top 10 global bank
The job of owning a codebase is harder than it’s ever been
Let’s give some love to the humans who own these codebases. These are the engineering leaders I talk to every week, and the pressure they’re feeling is reaching a boiling point. The same coding agents that are helping these teams write more and better code, faster than before, are also creating a tidal wave of new code that is burying codebase owners.
Every one of us has felt the deluge of AI-written code that requires review, and felt exhausted by the change in the job. And from that pain, a new wave of AI code review agents has been born! They’re invaluable for a developer who needs a bigger sandbag to hold back the waves. I imagine that the distinction between code-writing agents and code-reviewing agents will disappear, the coding agents will get better and smarter, and the whole dev tools market will continue to work to solve the next roadblock in front of it, jumping from one local maximum to the next as LLMs reshape the ground under us.
But meanwhile, that massive codebase—hundreds of millions of lines of code, tens of thousands of repositories, decades of commits applied, and vastly too large to fit in context windows under current LLM architectures (or even to be cloned to a VM in a reasonable amount of time)—will be decaying. Tons of new features and patches applied at record speed, different coding standards and rules applied by different agents in different parts, with duplicated code proliferating, with cross-service dependencies becoming more brittle, and slight deviations in code and UX standards creating more and more insidious and hidden vulnerabilities.
“I want to give my team the best tools. But then I overheard a dev say ‘I don’t know what this code does, AI wrote it.’”
— Technology Leader, top 10 global car manufacturer
The volume of code is the problem. And agents are producing a tidal wave of new code, at a pace we’ve never seen before. The call is coming from inside the house! The very tools speeding us up are creating the conditions that will cause these codebases to fail.
The current landscape of agentic dev tools is not meeting the moment
Understanding a massive codebase is a hard problem. It’s not something that can be done on every engineer’s laptop. It’s an infrastructure problem as much as a harness problem.
Your coding agent wouldn’t accomplish much without grep. In fact, grep is most of what an agent does. LLMs simply love to search, and as with a new hire getting up to speed, there’s no better way to build context and a model of the world. After a year of industry-wide experimentation, it’s become quite clear that the repo architecture overviews you put into READMEs and AGENTS.md files aren’t actually having a big impact, and that your agent is really just going to grep its way to an understanding anyway.
But like most problems in the world of agents, understanding is a context problem. Understanding these massive-scale codebases that make our world work is not a grep problem. You can’t grep what you can’t see. The infrastructure to allow you to see and understand a 50k+ repo codebase simply isn’t being built or sold by OpenAI, Anthropic, or Cursor. Their agents are so impressively good at small-scale problems that people simply haven’t realized the problem building around them.
And thus, the massive codebase, carefully constructed by a team of architects over 10, 20, or 30 years, begins to degrade.
And quarter after quarter, the agent builders set new records and token consumption continues skyrocketing, because we are still in the early stages when it comes to building and adopting these tools. Why build the infrastructure layer to see and search and deeply understand the full codebase and the big picture, when you’re growing faster than any company in history?
“Sure, Claude Code can make this change. But I have 90 thousand repositories to make it in.”
— Technology Leader, top 10 US bank
The owners of the largest codebases in the world deserve better! The world needs tools that make this job easier.
These codebases aren’t going away
I harbor no illusions that product building hasn’t changed forever. AI gives developers superpowers, shortens the time to launch, and completely changes the build-vs-buy formula for a software procurement manager. But none of this means that we’ve moved past deterministic business logic.
I was recently speaking to a software executive at one of the largest health insurance companies in the world, and they described a claim reimbursement decision tree with over 160 stages, built meticulously in COBOL decades ago, dependent on ultra-high levels of data fidelity and thousands of database tables and data loader systems, maintained by an internal team of people that they can’t keep staffed. This is not a workflow that can be passed off to Claude; we won’t stop needing determinism, even after AGI.
But coding agents can write the code that modernizes these systems and empowers the developers and business owners of this software. They can, that is, if they have access to the right level of context infrastructure, the ability to deeply understand the codebase and the underlying business logic.
These codebases will change, and the way you build and maintain them has changed. But the future will only have more code, and bigger and more complex codebases for our agents to own.
Three cheers for the owners of the codebases that keep our world turning
The people who take on the challenge of owning and evolving a codebase in this world are heroes who deserve to be celebrated, the same way we celebrate our civic heroes, the people who keep bridges from collapsing, our power plants running, our government functioning, and beyond.
I’m proud of what we build and who we serve at Sourcegraph. If you’re feeling this pain, please reach out; we are here to help.
If you want to work on the challenge of making the largest, most complex codebases in the world easier to understand, oversee, and evolve, get in touch.
