Almost every practice we have for building software was designed around an assumption that’s quietly stopped being true. Test coverage budgets, code review queues, refactoring debt, junior hiring pipelines, the ratio of product managers to engineers, the number of services we maintain, the cost of saying no to a feature. Each of these is a knob, and where we set it was determined by a single underlying parameter: writing code was slow and expensive.
It is becoming neither. Not uniformly, not without footguns, and not for everyone. But for any team that has integrated agentic coding tools into their workflow with even modest discipline, the marginal cost of producing a working unit of code has dropped enough that we should be willing to ask the obvious follow-up question: if writing code is no longer the hard part, what is?
The most useful way I’ve found to think about this is in terms of one-way doors. For most of software’s history, we treated almost every decision as a one-way door, because the cost of changing it was bundled into the cost of rewriting code. Cheap code revealed that most of those doors were always two-way; we just couldn’t tell. The interesting question for an engineering leader in 2026 isn’t whether code got cheaper (it has) but which doors are still one-way.
I’ve been running an engineering team that ships production software in a regulated industry while AI coding tools have gone from interesting to load-bearing. This piece is me thinking out loud about which decisions I’m making differently, and which ones I haven’t figured out yet. I’m certain I’ll be wrong about some of this. Hopefully I’ll be wrong in useful ways.
The cost of code was always a proxy
For decades, “writing code” served as a proxy variable for a much larger set of costs that scaled with it: the coordination cost of more engineers building more things, the discovery cost of figuring out what to build, the design cost of getting interfaces and data shapes right, the mentorship cost of growing the next generation of engineers, the operational cost of running what got shipped. We rarely measured those costs directly. We measured them in story points and headcount and cycle time, all of which were ultimately measures of how much code we could write.
When you change the proxy, you don’t change the underlying costs. You just change which ones become visible. Andrew Ng said it bluntly in 2025: engineers are getting 10x faster, product managers aren’t, and now product managers look like the bottleneck. But he was describing a much more general phenomenon. Every cost that was previously bundled inside “writing code” is now exposed. Some of them are getting cheaper alongside code generation, but many are not. The challenge for anyone running an engineering organization is to understand which of these other costs are about to become the binding constraint.
I want to walk through four of them, in roughly the order I think they matter for engineering leaders right now: architecture, product discovery, quality, and the pipeline that produces the next generation of senior engineers. In each case, the question to ask is the same: which decisions in this domain are still one-way doors, and which were two-way doors all along?
I’ll be upfront that I find the architecture question by far the most intellectually interesting, and I think it’s also the one most CTOs are getting wrong. The product and pipeline questions are more pressing in the short term, but the architecture question is the one that will determine which companies can actually take advantage of cheap code over a five-year horizon. We’ll come back to that.
What “getting it right” actually meant
Brooks’ No Silver Bullet essay, forty years old this year, divided software complexity into two buckets. Accidental complexity is the cost of the tools, languages, and ceremonies we use to express what we want the computer to do. Essential complexity is the cost of figuring out what we want the computer to do in the first place. Brooks’ claim was that no tool, however good, could meaningfully reduce essential complexity, because the difficulty came from the problem itself.
Forty years of better languages, better frameworks, better IDEs, better deployment tooling, and so on have proven him right. None of those innovations made it meaningfully easier to model a domain correctly, get the boundaries between services right, or decide when consistency mattered more than availability. They just made it less painful to implement whatever decision you eventually arrived at.
LLMs are the most dramatic attack on accidental complexity in the history of the field. They make boilerplate genuinely free. They turn adapter code, glue code, refactoring within a module, test scaffolding, and the small mountain of mechanical work that surrounds any real engineering task into near-zero-marginal-cost activities. But they have not, and based on what I’ve seen will not in the near term, do anything meaningful to essential complexity. Models still cannot tell you whether your domain model is right. They can implement either choice with equal facility.
This matters because the conventional wisdom about “getting the architecture right” was always conflating these two things. When we said “getting the design right matters because rewrites are expensive,” we usually meant a mix of: rewriting the code is expensive, and getting the design wrong has compounding consequences across the system. When code rewrites are cheap, only the second half of that statement is still true. And it turns out the second half was always doing most of the work.
I’ve started thinking of this as a sharper version of Bezos’ two-way door distinction. In the cheap-code era, almost all internal decisions are two-way doors. The shape of a class, the implementation of a service, even the choice of language for a self-contained component. All of these are now things you can change in an afternoon if you decide you got them wrong. The doors that remain one-way are not new doors; they’re the same ones we always had, but they were obscured by all the other things that used to feel one-way and weren’t.
What’s actually still expensive to change is anything with an interface to something you don’t control. Data schemas with millions of rows in production. Public APIs with external consumers. Customer-facing UX patterns we’ve trained users on for years. The contract between two services owned by two teams that ship on different cadences. The on-call burden of an additional service in production. These didn’t get cheaper. In some cases AI made them more expensive, because the velocity of generating new ones outstrips our capacity to retire the old ones.
The practical implication is that architectural attention has to migrate. A year ago, the senior engineering review was largely about whether the implementation was clean enough that we wouldn’t regret it later. Today, that question is mostly answered by the fact that we can rewrite the implementation later, cheaply. The review has to shift to the things that are still expensive to change: the boundaries between systems, the shape of the data, the surface area of the public API, the operational footprint of one more thing running in production. These are the one-way doors. Experienced architects have always told us to focus on them; we just had a hard time telling them apart from all the things we had to treat as one-way because changing the code was expensive.
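One way to make this shift in review attention concrete is to pin the boundary with an explicit contract test while leaving the internals free to be rewritten. Here’s a minimal sketch in Python; the endpoint, field names, and handler are all hypothetical, and a real setup would lean on consumer-driven contract tooling rather than a hand-rolled check:

```python
def get_invoice(invoice_id: str) -> dict:
    """Implementation detail: freely rewritable tomorrow (a two-way door)."""
    return {"id": invoice_id, "amount_cents": 1250, "currency": "USD", "paid": False}

# The contract: the shape external consumers depend on. This is the
# one-way door, and the thing senior review should actually scrutinize.
INVOICE_CONTRACT = {
    "id": str,
    "amount_cents": int,   # integer cents, never floats
    "currency": str,
    "paid": bool,
}

def satisfies_contract(payload: dict, contract: dict) -> bool:
    """A payload may add fields (backward compatible) but must never
    drop or retype the ones consumers already rely on."""
    return all(
        key in payload and isinstance(payload[key], expected)
        for key, expected in contract.items()
    )

assert satisfies_contract(get_invoice("inv_42"), INVOICE_CONTRACT)
```

The asymmetry is the point: `get_invoice` can be rewritten by an agent every week without review, but any change that breaks `INVOICE_CONTRACT` is a door you can’t easily walk back through.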
I’d state it more strongly. “Getting it right” was never about the internals. We were always going to rewrite the internals. We just used to feel bad about it, and now we don’t have to.
The other bottlenecks: discovery and prioritization
If writing code becomes cheap, the next-most-binding constraint on shipping useful software is figuring out what to ship. That work is mostly a sequence of one-way doors: customer commitments, market positioning, brand promises, the strategic bets you can’t quietly walk back once you’ve shipped. None of them compress with AI. This is the framing Andrew Ng has been pushing publicly, and it tracks the experience of teams I’ve talked to that have aggressively adopted agentic coding.
The naive read is that the PM-to-engineer ratio is shifting, and we should all rebalance our hiring accordingly. I think the more interesting read is the one Drew Breunig laid out in August 2025: the product management function isn’t going to compress at the same rate as engineering, but it also isn’t going to grow proportionally. It’s going to bisect.
On one side, you get something that looks a lot like the Forward Deployed Engineer model: engineer-PM hybrids who sit close to customers, hold strong domain expertise, and ship working software in days. On the other, you get a smaller foundation function focused on platform decisions, compliance, security, and productizing whatever the application teams figure out works. The middle layer of “PM who writes specs, grooms a backlog, and runs sprint ceremonies” gets squeezed from both directions. LinkedIn’s CPO Tomer Cohen announced in August 2025 that they are restructuring their entry-level program into an Associate Product Builder track that explicitly trains generalists across product, design, and engineering.
What I find compelling about this framing is that it matches what’s already happened on my own team without anyone planning it. The engineers who have leaned hardest into AI tools have, almost without realizing it, started doing more product work. They’re not waiting for tickets; they’re proposing them, prototyping them, shipping the prototype to a customer for feedback, and iterating. The PMs they work with have correspondingly moved up a level: less time on requirements documents, more time on prioritization, market positioning, and the hard “should we even build this” calls. Neither group really asked for this; the economics pushed both of them in the same direction.
What I think this means for hiring, directionally: the value of a generalist who can hold both halves of the build/decide loop in their head is going up faster than I would have predicted a year ago. The value of a pure spec-writer is going down. The value of someone who can talk to a customer, understand a domain, and articulate a clear “what to build next” is going up faster still, and that skill is genuinely scarce. We have decades of practice training people to write code. We have very little practice training people to know what’s worth coding.
Tests are the objective function
The QA story is the one I’ve thought about the longest and have the least clean answer for. Here is the underlying tension: per the 2025 Stack Overflow Developer Survey, 84% of developers are now using or planning to use AI tools, and 51% of professionals use them daily. At the same time, CodeRabbit’s December 2025 analysis of 470 PRs found that AI-generated pull requests contain 1.7x more issues than human-authored ones. Logic and correctness issues are 75% more common; security issues up to 2.74x. The share of code an AI helped write is going up, the defect density is going up, and the number of humans available to review it is going up by zero.
The obvious response is “hire more QA, write more tests.” The second half of that is more correct than most people realize, but for an unexpected reason.
In a world where agents are writing the code, the role of tests changes. They become the objective function the agent is optimizing against. Anyone who has spent time in machine learning knows what happens with a poorly-specified objective: the model does not converge on the answer you actually wanted. It converges on whatever you actually told it to optimize. If the loss function is wrong, no amount of training fixes it.
The same dynamic now applies to writing software with agents. The agent will, with sometimes startling competence, satisfy whatever your tests describe. If those tests describe a thin slice of the actual problem, the agent produces code that handles that thin slice and drifts on everything else. The drift isn’t a bug. It’s the agent doing exactly what you specified.
To put it in this post’s frame: your tests are how you tell the agent which doors are one-way. Without that signal, every door looks two-way to it, and the agent walks through whichever one is closest.
Which means the most leveraged thing a QA function can do is design the objective function. Properties, contracts, invariants, edge cases that matter. The hard work moves up the stack: from writing assertions about specific known cases to expressing the actual constraints the system needs to honor. Mutation testing, property-based testing, fault injection, and contract testing all matter more now, because they are the frame inside which the agent operates. (I made an adjacent argument about open source a couple of months ago: when code is cheap, the spec is the artifact worth sharing. The QA equivalent is that the spec is the artifact worth enforcing.)
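The difference between asserting a known case and expressing a constraint is easy to see in miniature. A hand-rolled sketch in Python; `apply_discount` and its invariants are hypothetical, and a real suite would reach for a property-based library like Hypothesis rather than a raw loop:

```python
import random

def apply_discount(total_cents: int, percent: int) -> int:
    """Hypothetical function under test."""
    return total_cents - (total_cents * percent) // 100

# Example-based test: one known case. An agent can satisfy this
# while drifting everywhere else in the input space.
assert apply_discount(1000, 10) == 900

# Property-based test: the invariants the system must honor, checked
# across many random inputs. This is the objective function.
random.seed(0)
for _ in range(10_000):
    total = random.randint(0, 10**9)
    percent = random.randint(0, 100)
    discounted = apply_discount(total, percent)
    assert 0 <= discounted <= total           # never negative, never a markup
    assert apply_discount(total, 0) == total  # 0% discount is the identity
```

The example test constrains one point; the property constrains the whole surface. An agent optimizing against the first has enormous freedom to drift. Against the second, far less.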
The corollary is that TDD becomes more important under agentic development, which is the opposite of what most people assumed the AI era would mean for it. Writing the test first used to be a discipline argument: it forced you to clarify your thinking before coding. Now it’s also a substrate argument. The test is the prompt, in the most precise sense. Whatever you fail to capture in tests, the agent has license to drift away from. And the old objection has flipped: writing tests is no longer the slow part of the loop. Figuring out what the test should assert is.
I’ve watched this play out. Engineers who give an agent a vague goal and a passing test suite end up debugging mystifying drift a week later. Engineers who give an agent a goal, a sharp test, and a tight feedback loop get code that converges. Same models, same prompts, same stakes. The difference is the objective function.
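The shape of that tight feedback loop is simple enough to sketch. This is a deliberately abstract toy, not any particular tool’s API: `run_suite` and `propose_patch` stand in for whatever test runner and agent integration you actually use.

```python
def converge(run_suite, propose_patch, max_iters=5):
    """Tight agent feedback loop, sketched abstractly.

    run_suite() -> (passed: bool, report: str) evaluates the objective
    function (the test suite). propose_patch(report) asks the agent for
    a fix given the failure signal. Both are hypothetical stand-ins.
    """
    for attempt in range(1, max_iters + 1):
        passed, report = run_suite()
        if passed:
            return attempt       # converged: the tests are satisfied
        propose_patch(report)    # feed the failure signal back to the agent
    return None                  # never converged within the budget
```

The loop itself is trivial; the leverage is all in `run_suite`. Whatever the suite fails to assert is signal the agent never receives, which is exactly the drift described above.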
I suspect, but cannot prove, that we’re heading toward an architecture where some form of agentic QA layer continuously evaluates production behavior and proposes high-signal tests. Whether or not that materializes, the underlying skill the QA function needs to develop is specifying what good looks like with enough precision that an agent can optimize against it. Anyone who has tried to write a fair eval for a model knows how hard that is.
The pipeline problem nobody wants to solve
The thing that worries me most, and the thing I have the least confidence I’m getting right, is the junior engineer question.
Entry-level software engineering hiring was already in trouble before agentic coding tools went mainstream. SignalFire’s State of Tech Talent Report put new-grad hiring at the Magnificent Seven down more than half since 2022; new grads now account for just 7% of Big Tech hires, down over 50% from pre-pandemic levels. The justification was always macroeconomic, but the structural reason it has stayed down is that companies have rationally decided they don’t need as many people to do the work that juniors used to do. CRUD endpoints, simple bug fixes, boilerplate. All of that is now done, well, by a coding agent for the cost of a few cents in tokens.
Mark Russinovich and Scott Hanselman published a piece in Communications of the ACM in 2026, Redefining the Software Engineering Profession for AI, that put a useful name on the dynamic. They called it “AI lift” for senior engineers and “AI drag” for early-career ones. Senior engineers can steer, verify, and integrate AI output because they have judgment built up from years of being responsible for production systems. Junior engineers, almost by definition, do not. Hand them an agent and you don’t get a 10x junior engineer; you get a junior engineer producing more code than they can evaluate. The same tool that lifts the senior drags the junior: the same force, pointed in opposite directions.
This is also, I think, what people are gesturing at when they argue about whether the 10x engineer has become the 100x engineer. For the engineers who already had the judgment to steer, the answer is closer to yes than I would have believed a year ago. An engineer who deeply understands the system they’re working on, has strong taste, and can articulate a precise specification can now run several agents in parallel, ship features in hours that used to take weeks, and maintain quality. I have seen this happen. It is real.
But the mechanism that produced those engineers in the first place was that they were once junior engineers. They were paged at 2 AM, debugged something incomprehensible in a system they didn’t write, and slowly built up the felt sense of how production systems fail. No coding agent can give you that. And if we’ve decided we don’t need to hire the people who would otherwise go through that process, we have collectively made a bet that the supply of senior engineers in 2031 will be roughly the same as it is today, even though we’re not refilling the pipeline that produces them. This seems unwise.
It’s also the most explicit one-way door of the four. There is no afternoon-rewrite for the careers we don’t start. Whatever decisions companies make about junior hiring between now and 2028 are setting the senior engineer supply for the early 2030s, and that one is genuinely a Type 1 decision.
The argument I find most persuasive on this point is the selfish one, which a blog post I read recently put as: if you stop hiring juniors, your seniors own you. Senior engineering compensation has been climbing relative to junior compensation for a decade. In a world where every company has aggressively skipped the junior hiring class, that gap is going to widen substantially. The companies that bet on training juniors through this period (IBM, by their own announcement, tripled US entry-level hiring in software) are going to look prescient in three to five years. The ones that didn’t are going to be paying through the nose for a senior engineer pool they failed to produce.
I haven’t fully figured out what this means for hiring at a startup like ours, and I’d be lying if I said I had. I know that the answer isn’t to hire juniors and hope they figure it out, because handing a coding agent to someone without judgment is exactly the AI drag that Russinovich and Hanselman described. I think it has to involve something closer to a structured apprenticeship: pairing junior engineers with both a senior human and a coding agent, configuring the agent to coach rather than complete, and explicitly designing the work so that the human is doing the parts that build judgment, not the parts that are now free. If anyone reading this has a working version of that, I’d like to compare notes.
The democratization argument cuts both ways here. “Anyone can code” is genuinely true now in a way that it wasn’t before, and that’s unambiguously good for the supply of software in the world. More people building more things solves more problems. But “anyone can code” produces code, not engineers. The work of becoming an engineer (developing the judgment to know when to write code, when not to, how to structure a system, how to recover when it breaks) is harder to learn now, not easier, because the activities that used to teach it are the ones AI is automating. We need to be honest about which problem we’re solving when we celebrate this.
What I’m actually doing about all of this
A few directional adjustments, none of them confident, all of them wrong in some way I’ll discover later. The unifying theme is that they’re all about spending more attention on the doors that didn’t move.
I am paying more attention to interfaces than I used to, and less attention to internals. Code review on internal implementation details is a less productive use of my time than it was eighteen months ago, because the cost of getting those details wrong is genuinely lower. Code review on the boundary between two systems, on the shape of data we’re going to commit to in a contract with a customer, on the operational footprint of a new service, all of those have gone up in importance, because those costs didn’t compress.
I am hiring more generalists and fewer specialists at the senior level, and looking for engineer-PM hybrids more aggressively than I would have a year ago. The skill of knowing what to build is getting scarcer relative to the skill of building it, and the people who can do both are disproportionately valuable.
I am leaning harder on TDD than I have in years. With agents writing most of the code, the test suite is functioning as the spec they’re optimizing against, and what I worry about in review is whether the spec captures the right behavior.
I have not solved the junior pipeline problem. I am not going to pretend I have. I think the honest answer is that the industry as a whole has under-invested in apprenticeship infrastructure for so long that we don’t have a good template for what an AI-era junior engineering program looks like. Building one is on my list, and probably should be on most CTOs’ lists.
The doors that didn’t move
The one-line summary: cheap code reveals which doors were always one-way, and the answer turns out to be the things Brooks called essential complexity forty years ago. Knowing what to build. Knowing where the boundaries go. Knowing what’s worth testing. Knowing how to produce the next generation of people who can answer those questions.
None of these are new problems. They are the doors that didn’t move. They’ve always been one-way; we just couldn’t see them clearly through all the two-way doors we ourselves mistook for one-way. We have spent a generation training engineers to be very good at the parts of software development that AI is now automating, and very little time training them on the parts that AI cannot. The companies that figure out how to invert that ratio are going to compound their advantages quickly, because the old constraint (writing code) has loosened, and the new constraints (judgment, design, mentorship) reward the kind of long-term investments that don’t show up in this quarter’s velocity metrics.
I am not sure who wins this race. I am reasonably sure that the answer to “is the architecture right?” used to be a question about the internals and is now almost entirely a question about the boundaries. I am reasonably sure that “where are our next senior engineers coming from?” is the most important question on most CTOs’ desks, even though almost nobody is treating it that way. And I am reasonably sure that the engineering leaders who get the next decade right are the ones who learn fastest to spend their attention on the doors that didn’t move.
We were always going to find out what we were actually paying for. AI just made it happen faster.
Things I read while thinking about this
- No Silver Bullet: Essence and Accidents of Software Engineering, Fred Brooks, 1986
- 2015 Letter to Shareholders, Jeff Bezos, 2015
- Bottleneck or Bisect: AI-Assisted Coding Will Change Product Management, Drew Breunig, Aug 2025
- Product Management is AI’s New Bottleneck. Andrew Ng Explains What’s Next, Andrew Ng, 2025
- Bringing the Full Stack Builder to Life, Tomer Cohen, Aug 2025
- 2025 Stack Overflow Developer Survey: AI, Stack Overflow, 2025
- State of AI vs Human Code Generation Report, CodeRabbit, Dec 2025
- Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity, METR, Jul 2025
- Speed at the Cost of Quality: The Impact of LLM Agent Assistants on Software Development, 2025
- Redefining the Software Engineering Profession for AI, Mark Russinovich and Scott Hanselman, Communications of the ACM, 2026
- If You Stop Hiring Juniors, Your Senior Engineers Own You, eval(code), 2026
- IBM Plans to Triple Entry-Level Hiring in the US in 2026, Fortune, Feb 2026
- The SignalFire State of Tech Talent Report 2025, SignalFire, 2025
- Augmented Coding: Beyond the Vibes, Kent Beck, 2025
- Vibe Engineering, Simon Willison, Oct 2025