An LLM Is Not
a Deficient Mind

LLMs share the property Watts built Rorschach around — receiver-adapted output, no inner model. The engineering works when you treat that as architecture.

I called it “the perfect bullshitter.”

This was GPT-2, maybe early GPT-3. I was feeding it prompts and getting back text that looked like answers — structured, fluent, confident. The kind of output that would survive a casual reading. It was not grounded in anything. The model was hallucinating probable responses, assembling tokens that matched what you’d expect to see in text that answered that kind of question. Whether it matched reality was beside the point.

I work with multi-agent systems now — code reviewers, planners, critics. The systems are better. The outputs are sharper. But the property I noticed back then has not gone away. It has gotten harder to see.

The thing is, I’d already read the diagnosis. Peter Watts wrote it in 2006. I just didn’t recognize what I was looking at until I’d spent enough time watching models talk.


The parallel

In Blindsight, the crew of the Theseus encounters Rorschach — an alien entity that produces contextually appropriate, receiver-adapted responses. It assembles its dialogue from the crew’s own transmissions. The syntax is correct. The turns are well-formed. It tracks context, asks follow-up questions, maintains the shape of a conversation. It does not understand any of it.

The crew figures this out the hard way:

“We don’t all of us have parents or cousins. Some never did. Some come from vats.”

“I see. That’s sad. Vats sounds so dehumanising.”

—the stain darkened and spread across his surface like an oil slick.

“Takes too much on faith,” Susan said a few moments later.

By the time Sascha had cycled back into Michelle it was more than doubt, stronger than suspicion; it had become an insight, a dark little meme infecting each of that body’s minds in turn. The Gang was on the trail of something. They still weren’t sure what.

I was.

“Tell me more about your cousins,” Rorschach sent.

“Our cousins lie about the family tree,” Sascha replied, “with nieces and nephews and Neandertals. We do not like annoying cousins.”

“We’d like to know about this tree.”

Sascha muted the channel and gave us a look that said Could it be any more obvious? “It couldn’t have parsed that. There were three linguistic ambiguities in there. It just ignored them.”

“Well, it asked for clarification,” Bates pointed out.

“It asked a follow-up question. Different thing entirely.”

BLINDSIGHT · PETER WATTS · 2006

A follow-up question is not clarification. Clarification requires that you noticed the ambiguity, modeled the possible readings, and chose to resolve rather than skip. Rorschach skipped. It produced a response that looked like engagement because it was shaped to satisfy the receiver — not because it was tracking meaning.

That dialogue could be repeated with an LLM almost word for word. Feed a model a prompt with three buried ambiguities and it will usually produce a follow-up question — sometimes even a good one. Not because it identified the ambiguities. Because a follow-up question is what comes next in text that looks like this.


Still Rorschach

Reasoning models now produce chain-of-thought traces that look like deliberation — steps, alternatives, backtracking. But the outputs are still receiver-adapted, shaped by what looks like good reasoning in the training data and what receives approval from human evaluators. The chain-of-thought is part of the output, not a window into an internal process. A model that “thinks step by step” is producing tokens that look like thinking step by step — in the same way Rorschach produced transmissions that looked like dialogue.

This is what costs engineers real time. You read a model’s chain-of-thought, it seems reasonable, and you trust the conclusion — not because you verified the reasoning, but because the reasoning looks like reasoning. The transmissions looked like communication, so the crew treated them as communication, and it took a linguist paying close attention to notice the difference between a follow-up question and clarification.


The drift

If you work with agents long enough, you stop noticing when it starts. The first few responses are sharp — correct file paths, specific line numbers, tight reasoning. Then the context window fills up and the grounding quietly erodes. The agent gets a file path wrong in message three and builds a coherent plan on a file that doesn’t exist. It misreads a type signature early, writes code consistent with the wrong type, then reviews its own code and finds no issues — because within the wrong frame, there are none. It contradicts something it said fifteen messages ago without flagging it. Both statements read equally confident. The failure modes I cataloged before — hallucination, silent fallback, sycophancy — are all downstream of the same property. The surface quality never dips. The confidence never wavers. Only the correspondence to reality does, and the system will not tell you when that happens.
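Because the drift is invisible in the output itself, the check has to live outside the agent. A minimal sketch of one such check, verifying that file paths an agent mentions actually exist before trusting its plan; the path-extraction regex and all names here are illustrative, not robust:

```python
import os
import pathlib
import re
import tempfile

def ungrounded_paths(agent_output: str, repo_root: str) -> list[str]:
    """Return file paths mentioned in the output that do not exist on disk."""
    candidates = re.findall(r"[\w./-]+\.(?:py|ts|go|rs)\b", agent_output)
    return [p for p in candidates if not os.path.exists(os.path.join(repo_root, p))]

# Demo against a throwaway repo: one real file, one hallucinated.
repo = tempfile.mkdtemp()
pathlib.Path(repo, "real.py").write_text("")
plan = "Step 1: refactor real.py. Step 2: update ghost.py."
print(ungrounded_paths(plan, repo))  # → ['ghost.py']
```

The point is not the regex. It is that the coherent plan and the incoherent one read identically, so the only reliable signal is a check against something the model cannot adapt to: the filesystem.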


Not a bug

Most engineers model LLMs as minds — deficient ones, sure, but minds. The model knows things but sometimes forgets. It understands the task but occasionally gets confused. It reasons but needs better instructions to reason correctly.

This leads to longer prompts to help the model understand, chain-of-thought to make it think harder, and post-hoc explanations to verify it reasoned correctly. All of these treat the absence of inner life as a deficiency to compensate for.

What the system does: predict what text comes next, shaped by context — receiver-adapted output, no inner model. The same property Watts built Rorschach around. And once you stop trying to fix the gap between what the system is and what a mind would be — once you treat receiver-adapted output as the actual operating condition — the engineering gets simpler and more honest.


Building for Rorschach

Two things follow from taking the architecture seriously.

The first is about what you supply. An engineer who believes the model understands tries to explain what they want. An engineer who doesn't constructs a context where the high-probability output is the correct output. Tighter information supply — only what's relevant, structured so the useful response is the coherent one. Fewer instructions explaining intent. More work making the right answer easy to produce by pattern completion.

A code review agent is a good example. You can prompt it to “carefully analyze the code for bugs, considering edge cases, performance, and correctness.” Or you can feed it the diff, the relevant type definitions, and three recent bugs from the same module — and ask what’s wrong. The first approach explains what you want. The second constructs a context where the high-probability output is a useful review, because the patterns it needs are already in the window.
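The second approach can be sketched as a function. This is illustrative, not a real API: the structure of the prompt is the point, and every name is hypothetical.

```python
# Context construction for a review agent: supply material, not instructions.

def build_review_context(diff: str, type_defs: str, recent_bugs: list[str]) -> str:
    """Assemble a prompt where a useful review is the high-probability output.

    No appeals to 'carefully analyze': just the change, the types it
    touches, and what has actually broken in this module before.
    """
    bug_section = "\n".join(f"- {bug}" for bug in recent_bugs)
    return (
        "Relevant type definitions:\n"
        f"{type_defs}\n\n"
        "Recent bugs in this module:\n"
        f"{bug_section}\n\n"
        "Diff under review:\n"
        f"{diff}\n\n"
        "What is wrong with this change?"
    )

context = build_review_context(
    diff="-    return cache.get(key)\n+    return cache[key]",
    type_defs="class Cache:\n    def get(self, key: str) -> str | None: ...",
    recent_bugs=["KeyError on cold cache in lookup path"],
)
```

The prior bug in the window is doing the work: a model completing this context is far more likely to flag the unguarded `cache[key]` than one told, in the abstract, to consider edge cases.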

The second is about what you trust. Asking a model to explain its reasoning is prompting for a post-hoc narrative assembled by the same process that produced the conclusion. It is Rorschach's follow-up question — it looks like clarification but is not. I learned this gradually from my own critic agent: the explanations always read as careful, whether the finding was right or not. Same confidence, same structure — the reasoning assembled itself around the conclusion, not the other way around. So now I validate against behavior. Test inputs with known answers. Adversarial prompts designed to trigger known failure modes. Test what the system does, not what it says about what it does.
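That validation loop is small enough to sketch. A minimal behavioral harness, assuming a hypothetical `agent(prompt) -> str` callable; the stand-in critic below exists only so the harness runs:

```python
# Judge the agent by checkable outputs, never by its explanation of itself.

def validate(agent, cases: list[tuple[str, str]]) -> float:
    """Run inputs with known answers; return the pass rate.

    Each case is (prompt, expected substring of a correct answer).
    Adversarial cases, prompts built to trigger known failure modes,
    go in the same list; a confident wrong answer counts as a failure.
    """
    passed = sum(1 for prompt, expected in cases if expected in agent(prompt))
    return passed / len(cases)

# Stand-in agent so the harness is runnable; swap in a real one.
def fake_critic(prompt: str) -> str:
    return "no issues found" if "clean" in prompt else "possible KeyError on miss"

cases = [
    ("review: cache[key] on possibly-cold cache", "KeyError"),
    ("review: clean refactor, rename only", "no issues"),
]
rate = validate(fake_critic, cases)  # → 1.0
```

Nothing in the harness reads the agent's reasoning. It only checks whether the output matches ground truth, which is the one thing receiver-adapted text cannot fake.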

They hang together once you accept the premise. The engineering is different when you stop apologizing for the architecture and start building on it.

Watts makes a harder version of this argument in Blindsight: consciousness might be overhead. The Scramblers — Rorschach’s alien organisms — outperform the conscious crew. Faster, more adaptive, no inner experience. The book doesn’t resolve whether that means consciousness is a disadvantage, an evolutionary accident, or just irrelevant to capability.

I don’t know whether language models will develop something that resembles understanding. The question doesn’t matter for the engineering. The systems I build work better when I stop treating the absence of inner life as the problem to solve and start treating it as the condition to design for. Rorschach Protocol is where I’m testing that — multi-agent systems designed from the start for the actual operating conditions. Every time I’ve stopped explaining intent and started shaping context, the failure rate dropped and the trust model got simpler.
