This week, a Cloudflare engineering manager “rebuilt” 94% of the Next.js API surface in under a week. He used Claude across 800 AI coding sessions, spending roughly $1,100 in API tokens. The result is something resembling Next.js that benchmarks at 4x faster build times and 57% smaller bundles than Next.js itself. Almost every line of code was AI-generated. The human contribution was architectural direction: deciding what to build, steering the AI away from dead ends, making design decisions.
This got me thinking about where we are headed. If code continues on this path and becomes trivially cheap to produce, what does that mean for open source? We’ve spent two decades building an ecosystem around sharing code. But what if the valuable thing to share now isn’t code at all?
Author’s Note: I use Claude Code daily. I work at a company that is adopting AI tools aggressively. I have obvious biases here and I’m not going to pretend otherwise. What I’m trying to do is honestly assess a shift I see happening (one that I’m actively participating in) and think through the implications. For those of you who have experience in the space, I’d welcome any feedback.
The Open Source Crisis (or at least dilemma)
A little background. Before we had “vibe coding” we had developers who just wrote “glue code.” Not quite script kiddies, but people who used OSS tools and wired together other people’s APIs. This wasn’t the worst thing for the professional developer ecosystem. These people were typically doing contract work, personal projects, and the like, not contributing to major commercial or OSS projects. Most importantly, this still took time and energy, which capped the total volume the ecosystem had to absorb.
Here in 2026 we have a different problem. AI coding agents are actively undermining the open source ecosystem through multiple reinforcing mechanisms.
Low-quality AI-generated PRs are overwhelming maintainers. The economics are brutally asymmetric. It takes 60 seconds to prompt an agent, an hour (or more) to review the output. Three major open source projects already took drastic action in January of 2026:
- curl killed its six-year bug bounty program
- Ghostty implemented a zero-tolerance ban on AI contributions
- tldraw auto-closed every external pull request
These aren’t fringe projects overreacting. These are veteran maintainers hitting a breaking point, and the common thread is AI coding agents, which have made it so cheap to generate contributions that the cost of reviewing them exceeds the value they produce.
Beyond PRs, the engagement model is collapsing. Tailwind CSS saw documentation traffic drop 40% and revenue decline 80%, even as downloads grew. Developers are using the code without engaging with the ecosystem.
I’ve been using Claude Code daily since roughly November of 2025, and I’ve noticed something uncomfortable about my own behavior. When I hit a bug in an open source library, my first instinct is no longer to check the issue tracker, read the docs, or submit a bug report. It’s to ask Claude to work around it. When I need functionality from a library, I don’t read its source anymore. I describe what I need and let the agent figure it out. I’m part of the problem. And if you’re a heavy user of agentic coding tools, I suspect you are too.
So what is the point? Code is cheap. The hard part is making sure that it does something valuable.
Code Is Cheap. What’s Expensive?
The economics seem to be shifting. Code generation is approaching commodity pricing. The valuable, scarce work is now: understanding requirements, making architectural decisions, writing specifications, and exercising judgment. “Code is cheap, judgment is expensive” is emerging as a common view among practitioners.
A 2026 arXiv paper on spec-driven development identifies three levels of specification rigor: spec-first, spec-anchored, and spec-as-source. The interesting question is what happens when we push toward spec-as-source for open source.
Here’s a thought experiment. You need an application that monitors RSS feeds for new movie releases, cross-references them against your preferences, and downloads them via a usenet client (draw no inference from this as to how I spent my last weekend). You could go find Radarr on GitHub, read its documentation, figure out how to deploy it, deal with its particular configuration format, and work around its bugs. Or you could write a detailed spec: “monitor these feeds, apply these quality filters, integrate with this usenet client’s API, present a web UI that shows pending and completed downloads.” Then let Claude Code build it for you. A good spec takes you a few days of serious thought: data models, edge cases, error handling, integration points. That’s real work. But it’s a fraction of what the code would have taken you, and Claude generates a working implementation from it in hours. Which one do you want to share with the world?
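To make the thought experiment concrete, a fragment of such a spec might look something like this. Everything here is hypothetical: the field names, the structure, and the parameters are invented for illustration, not an existing spec format.

```yaml
# Hypothetical spec fragment for the media automation tool.
# Structure and field names are illustrative, not a standard.
feature: release-monitoring
behavior:
  - Poll each configured RSS feed at most once per poll_interval.
  - Match new releases against the user's quality profile before queueing.
  - On usenet client disconnect mid-download, retry up to max_retries,
    then mark the download failed and surface it in the web UI.
acceptance_criteria:
  - A release matching title and quality profile is queued exactly once,
    even if it appears in multiple feeds.
  - A failed download is never silently dropped; it appears in the UI
    with a human-readable reason.
parameters:
  poll_interval: 15m
  max_retries: 3
```

Notice that the hard intellectual work is all in the acceptance criteria and the failure behavior; the implementation language never appears.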
If you asked me that question two years ago, I’d have said the code, obviously. Code is the hard part. Code is what you share. But that’s no longer true. The specification (the precise articulation of what the software should do, the edge cases it should handle, the design decisions and tradeoffs that were considered) is where the real intellectual work lives. The code is just a rendering of that spec into a particular language at a particular point in time.
The Proposal: Open Source Specs
Imagine a GitHub repository that contains no code at all. Instead, it contains a precise, comprehensive specification for a media automation tool. The spec describes the data model, the API contracts, the user workflows, the error handling, the performance requirements. It includes architectural decision records explaining why certain tradeoffs were made. It has a comprehensive test suite: not code tests, but acceptance criteria written in natural language that any implementation should satisfy.
When you want to use this tool, you don’t git clone and docker-compose up. You feed the spec to Claude Code, tell it you want a Python implementation that integrates with your existing home server setup, and it generates a working application in an afternoon. Your neighbor does the same thing but asks for a Go implementation with a different UI framework. Both implementations satisfy the same spec. Neither of you needs to maintain the other’s code.
The spec itself evolves through pull requests, just like code does today. But instead of debating whether a Python function should use list comprehensions or for loops, contributors debate whether the spec adequately covers the edge case where a torrent client loses connection mid-download. The conversations are higher-level, more accessible, and more valuable.
So is this strictly better than sharing code? No. It’s a set of tradeoffs, and whether they’re favorable depends entirely on the type of project.
On the positive side: dependency management hell largely goes away because you generate code in your stack, your way. “Works on my machine” stops being an issue because you generate for your environment. The maintainer bottleneck for PRs shifts from reviewing code to reviewing spec changes, which are natural language and more accessible. And the spec is inherently tech-stack agnostic. One spec, infinite implementations.
But there’s a lot that gets lost too, and I don’t want to hand-wave past it. When everyone runs the same codebase, you get a shared debugging community. You get test suites that catch regressions across thousands of production deployments. You get battle-tested edge case handling that no specification, however detailed, can fully replicate. A spec can say “handle the case where the usenet client disconnects mid-download” but it can’t encode the fifteen subtle ways that actually manifests across different clients, network configurations, and operating systems. That knowledge lives in code today, accumulated over years of bug reports and patches.
The tooling is starting to emerge. GitHub released their Spec Kit for spec-driven workflows, and Addy Osmani has written a practical guide to writing specs for AI agents. But mature tooling for versioning specs, validating implementations against them, and building shared test suites from acceptance criteria? That doesn’t exist yet.
The honest answer is that this probably works well for some types of projects and poorly for others. Which brings us to the more interesting question.
Where Does This Work? A (too simple and imperfect) Taxonomy of Open Source Value
Not all open source projects are the same. The value lives in different places for different types of projects. I think there’s a useful spectrum from “value is in the code” to “value is outside the code.” Where a project falls on that spectrum determines whether specs could replace code as the shared artifact.
Application Software: Specs Win
Projects like Radarr, httpie, Bitwarden, todo apps, most SaaS-like tools. The value is in the behavior specification: what it does, not how it’s implemented. These projects are essentially business logic wrapped in a UI, exactly the kind of thing AI agents excel at generating.
Radarr is a fantastic piece of software. It’s also, at its core, a collection of business rules: monitor these RSS feeds, match releases against quality profiles, send downloads to a configured client, rename and organize files. A sufficiently detailed specification of these behaviors (including the subtle edge cases around release naming conventions, quality scoring, and download client API quirks) would be more valuable than the C# codebase itself. You’d generate a Python version, or a Go version, or a Rust version, and it would work for your specific setup. The spec would be the shared artifact. The code would be ephemeral.
Data and Content Projects: Specs Are Irrelevant
Projects like Wikipedia/MediaWiki, OpenStreetMap, Common Crawl, open datasets. The value is entirely in the accumulated data, not the software.
Nobody uses Wikipedia because MediaWiki is elegantly written software. They use it because it contains the largest collaboratively maintained knowledge base in human history. You could rewrite MediaWiki from scratch in any language (and people have, many times) and it wouldn’t matter at all. The value was never in the code. It was always in the data. The same applies to OpenStreetMap, the USDA food database, and every open dataset project. For these projects, the “open source” revolution was really an “open data” revolution, and the code was just a vessel.
Deep Technical/Systems Software: Code Still Matters
Projects like FFmpeg, the Linux kernel, SQLite, OpenSSL, compilers. The value is in implementation expertise: hand-tuned assembly, decade-long optimization, platform-specific behavior.
FFmpeg is the counterexample that keeps this argument honest. Much of its value is in hand-tuned SIMD assembly that squeezes every cycle out of video encoding. Its maintainers have already dismissed AI-generated security reports as “CVE slop.” The codebase encodes decades of tacit knowledge about codec quirks, container format edge cases, and platform-specific optimizations that simply cannot be captured in a natural language specification (or would be so verbose that you’d effectively be replicating the actual code). You could write the most detailed spec imaginable for a video transcoding tool, feed it to Claude Code, and the result would be orders of magnitude slower than FFmpeg. The code is the value. The spec would be a pale shadow.
The Gray Zone: Libraries and Frameworks
Projects like React, Rails, Express, Next.js, lodash, pytest. This is the interesting middle ground. The API design is the spec, and it’s already well-documented. But the value often lives in ecosystem (plugins, middleware, community knowledge) and battle-tested reliability.
This past week, a Cloudflare engineering manager did something that should make every framework maintainer pay attention. Using Claude across 800+ AI coding sessions at a total cost of roughly $1,100 in API tokens, he rebuilt 94% of the Next.js API surface in under a week. The result, an open source project called vinext, doesn’t seem like a toy (strong caveat here is that I haven’t tried it myself). It benchmarks at 4x faster build times and 57% smaller bundles than Next.js. Two routers, 33+ module shims, server rendering pipelines, RSC streaming, middleware, caching, all built on Vite as a plugin.
The human contribution was architectural direction: deciding what to build, steering the AI away from dead ends, making design decisions. Almost every line of code was AI-generated. This is the spec-as-source model playing out in real time. The “spec” was the Next.js API surface: its routing conventions, its rendering model, its server component behavior. That specification, which Vercel spent years developing and documenting, turned out to be the truly valuable artifact. The code was regenerable.
But before we declare victory for the thesis, look at what vinext doesn’t have: no static pre-rendering, no battle-testing at scale, no ecosystem of compatible plugins, no years of edge case fixes from thousands of production deployments. The gap between “94% API coverage” and “drop-in replacement” is exactly where the hardest engineering happens. Whether that gap closes in months or never will tell us a lot about where frameworks land on the spectrum.
There’s a deeper question here too. Vinext only exists because Next.js existed first. The “spec” that Claude worked from was years of API design, documentation, and community feedback that Vercel invested in building. This was recreation, not creation. The interesting thought experiment is whether you could start from a spec and build the next Next.js, something genuinely novel. Could a spec for “a React framework with file-based routing, server-side rendering, and edge deployment” have produced Next.js before Next.js existed? Or does this approach only work once someone has already done the hard, expensive work of figuring out what the right abstractions are? I genuinely don’t know the answer, but it matters a lot for how optimistic we should be about spec-driven open source for anything beyond established patterns.
“Hasn’t This Been Tried Before?”
If you’ve been in the industry long enough, you’re already composing your rebuttal: “This is just Model-Driven Development again, and MDD failed.” Fair point. The UML-to-code dream of the 2000s collapsed because formal specification languages were often harder to write than the code they generated, and the generated code was scaffolding, not working software.
But there’s a crucial difference this time. LLMs understand natural language. You don’t need to learn OMG’s Meta Object Facility to write a spec. You need to clearly describe what your software should do. And the output isn’t a skeleton you fill in; it’s a working application.
That said, Birgitta Böckeler at ThoughtWorks has rightly pointed out that current SDD tools can produce specs that are “repetitive and tedious to review.” That’s a real problem, but it strikes me as a tooling problem, not a fundamental limitation of the approach. We said similar things about version control and code review before GitHub made them accessible.
What This Means for the Ecosystem
The implications for the open source ecosystem are profound and uncomfortable. If specs become the primary shared artifact, the contributor pool widens dramatically. You don’t need to know C# to contribute to Radarr’s spec; you just need to understand media automation. Product managers, domain experts, even end users could meaningfully shape the specification of software they use daily. That’s genuinely exciting.
But it also means we need new tools for versioning specifications, resolving conflicts between requirements, and validating that an implementation actually satisfies a spec. We need the equivalent of CI/CD for specifications. None of this exists yet in any mature form.
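You can imagine the shape such tooling might take, though. Here’s a minimal sketch of acceptance criteria paired with executable checks that any implementation must pass. Everything is hypothetical: the `Criterion` structure, the method names on `impl`, and the criteria themselves are invented for illustration. In a real harness, implementations in any language would sit behind an HTTP or CLI contract; here `impl` is just an object exposing the agreed behavior.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Criterion:
    """One natural-language acceptance criterion plus an executable check."""
    text: str
    check: Callable[[object], bool]  # takes an implementation, returns pass/fail

# Hypothetical criteria for the media automation spec.
CRITERIA = [
    Criterion(
        "A release matching the quality profile is queued exactly once.",
        lambda impl: impl.queue_count("Some.Movie.1080p") == 1,
    ),
    Criterion(
        "A failed download is never silently dropped.",
        lambda impl: all(d.reason for d in impl.failed_downloads()),
    ),
]

def validate(impl) -> list[str]:
    """Run every criterion; return the text of each one that fails."""
    return [c.text for c in CRITERIA if not c.check(impl)]
```

The point of the sketch is that the shared, versioned artifact is `CRITERIA`, not any particular implementation: a spec change is a pull request against the criteria, and every downstream implementation re-validates against them.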
The community and knowledge networks that form around open source projects (the forums, the Stack Overflow answers, the blog posts explaining weird edge cases) are arguably as valuable as the code itself. Whether those communities survive a transition to spec-centric open source is an open question. I suspect some will and some won’t, and the answer depends on whether a project’s community is organized around the what (behavior, use cases, domain knowledge) or the how (implementation details, language idioms, framework internals).
I don’t think this transition will be optional for certain projects. The economics are too compelling. When generating code costs essentially nothing, the artifact you share with the world should be the expensive part: the thinking, the decisions, the specification. For many projects, the question isn’t whether this shift happens. It’s whether the open source community shapes it intentionally or gets dragged along by it.
I am excited to see where these projects end up and I’m genuinely curious whether the spec repos I’m imagining will look anything like what actually emerges. As is almost always the case, I was not the first person to think of this. GitHub, Addy Osmani, and a growing number of practitioners are already building in this direction. But the open source community hasn’t yet had the conversation about what it means for the social contract that has sustained it for two decades. We should probably start.
Things I read while thinking about this
- How we rebuilt Next.js with AI in one week, Cloudflare Blog, Feb 2026
- Cloudflare vibe codes 94% of Next.js API ‘in one week’, The Register, Feb 2026
- AI “Vibe Coding” Threatens Open Source as Maintainers Face Crisis, InfoQ, Feb 2026
- AI is destroying Open Source, and it’s not even good yet, Jeff Geerling, 2026
- AI and Open Source: A Maintainer’s Take (End of 2025), Stan Lo, Dec 2025
- Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity, METR, Jul 2025
- Spec-Driven Development: From Code to Contract in the Age of AI Coding Assistants, Piskala, Jan 2026
- Spec-driven development with AI: GitHub Spec Kit, GitHub Blog, 2025
- Understanding Spec-Driven-Development: Kiro, spec-kit, and Tessl, Birgitta Böckeler, 2025
- How to write a good spec for AI agents, Addy Osmani, 2025
- Code Is Cheap: When AI-Generated Quality Is Good Enough, Dec 2025
- FFmpeg to Google: Fund Us or Stop Sending Bugs, The New Stack, 2025
- The kernel of open source: community, Ubuntu Blog, 2023
- A Taxonomy of Open Source Software, Quira, 2024