Organisations rarely buy what they need. They buy what is positioned.
The people who know which features matter — partner consultants, long-tenured admins, third-line engineers — are not in the room. The small, unglamorous capability that the average user touches every day is rarely the one demoed.
An LLM trained on the published surface will systematically over-recommend the new and under-recommend the boring thing that works.
Enterprise software projects have failed at a remarkably stable rate for decades. The lessons of those failures must exist somewhere. They do, but not where a model can see them.
Post-mortems, when written at all, are written for the political survival of the project sponsor. The actual reasons live with the people who lived through them, and walk out the door when those people retire.
A model can only learn what someone thought to write down.
This is the thesis of the paper. Six cases follow as evidence; the seventh, we are asking you to write.
Every senior practitioner carries an internal library of failure modes: patterns that, when encountered, raise the hair on the back of the neck.
The patterns are pre-verbal. The practitioner notices something is wrong before they can say what. The hesitation a senior partner feels in the kickoff meeting when the client uses a particular phrase has saved more programmes than any framework — and was never documented anywhere.
For decades, molecular biology had a name for the part of the genome it did not understand: junk.
It is now understood that much of that material helps regulate how genes are expressed. The terminology has shifted to "noncoding DNA", an admission that the previous label confused absence of understanding with absence of function.
An LLM is, by construction, a confident summariser of the current state of recorded knowledge. The current state of recorded knowledge has, repeatedly, been wrong about what matters.
This paper does not argue against AI. It asks where, in your own domain, the writing stops.
The work is not to retreat from the model. It is to name the part of your domain it cannot reach — and to keep the people who can.
J.K. Rowling, asked over the years where her ideas come from, has been consistent and consistently humble.
An LLM produces a competent imitation of an existing voice. The judgement that selected, from a thousand possible premises, the one worth a decade of work is the part that was never written down.
Every contemplative tradition we have a record of distinguishes between describing a state and inhabiting it. The Zen teacher's finger pointing at the moon. The Sufi instruction that the map is not the journey.
There are categories of knowledge that, once written down, give the reader a false confidence. A foundation model is, by definition, a system that ingests the writing-down and generates more writing-down.
Six cases. One observation: the model is bounded by the corpus, the corpus is bounded by what humans wrote, and what humans wrote is a small fraction of what they know.
Tell us about a moment in your work where AI hit the wall of what nobody wrote down. The strongest submission becomes the next published case study — with full attribution.
— AI Limits, Editorial Board
AI Limits is a working paper on the boundaries of foundation-model competence in enterprise contexts.
© 2026 AI Limits.
Submissions licensed CC BY 4.0 with attribution.