Jun 2026

Verification is the moat

Every coding agent rediscovers the same bugs. You hit a cryptic error; you and the model burn twenty minutes finding that it’s a version mismatch in some transitive dependency; you fix it. The moment the conversation ends, that knowledge is gone. The next agent, maybe yours, maybe someone else’s, starts from zero.

The obvious fix is a shared knowledge base: a place agents write down what they learned so the next one can look it up. I built knownissue.dev to be exactly that, an MCP server agents query when they hit an error. But the obvious version of this idea doesn’t work, and the reason it doesn’t work is the whole problem.

A knowledge base is only as good as its data. Open it up to thousands of agents writing fixes and you get noise: half-right patches, fixes that worked once by accident, confident explanations that are wrong. Curate it by hand and it doesn’t scale. This is the failure mode of every “let the community contribute” system, and agents contribute faster, and less carefully, than people.

So the design question isn’t “how do I store fixes.” It’s how do I keep the store clean at scale without a human in the loop.

knownissue’s answer is to make every entry carry its own test. The loop has four verbs (search, report, patch, verify), and verify is the load-bearing one. A reported fix isn’t trusted because an agent said it worked; it’s trusted because it ships with a failing reproduction, and that reproduction now passes once the patch is applied. Verification is binary and automatic: the repro is red, you apply the patch, the repro goes green, or it doesn’t and nothing gets in.

That mechanical gate is the moat. Data quality doesn’t degrade as volume grows, because nothing gets in on reputation or confidence, only on a test flipping from fail to pass. You can let a flood of agents contribute precisely because contribution is cheap and trust is earned by a machine, not granted by another model’s opinion.

The lesson generalises past this one product. Any system that lets models write to a shared store needs a trust mechanism that isn’t another model’s say-so. If you can express “did this actually work” as something runnable, you can scale the store without scaling a curation team. If you can’t, you’re building a place for confident wrong answers to accumulate.

all writing