We live in the age of the machine-mind. In just five years, artificial intelligence has vaulted from an obscure research topic to the blazing furnace at the heart of the global techno-economy. Silicon prophets issue daily pronouncements. Fortune 500 companies scramble to bolt LLMs onto rotting legacy systems like necromancers animating the corpses of dinosaurs. Billions are poured into server farms. Governments flail. Artists weep. Programmers pray. Poets protest… And the Tree of Woe trembles.
Artificial intelligence now writes our marketing copy, grades our students, answers our legal questions, draws our fantasy worlds, and simulates the friends we no longer have. And it’s getting better—fast. Each model is bigger, sharper, weirder. Each week brings rumors of breakthroughs or collapses. We are inside the inflection point, past the realm of stability.
And yet—for all the drama, the discourse remains strangely flat, like a theater backdrop painted in grays. On one side, a cacophonous chorus of techno-optimists praises the AI as oracle, savior, or godling. On the other, a contemptuous cohort of rationalist skeptics dismisses it as “autocomplete on steroids.” The war of interpretations has begun—but both armies may be fighting the wrong battle.
Because something just happened. Something that neither utopians nor skeptics seem ready to process. A new paper has cracked open the black box of machine cognition and peered inside. What the researchers found there is not a trick, not a shortcut, and not a statistical parlor game. What they found, in plain language and peer-reviewed detail, is this:
The AI has formed the concept of “largeness.”
Not just the English word large, not just the French grandeur, not just the Chinese 大, but a unified, internal, language-agnostic abstraction: A semantic universal… A gesture, however faint, toward meaning.
And that might change everything.
Peering Into the Mind of Claude
The discovery comes from Anthropic, one of the leading AI research companies in the world today. Founded by former OpenAI executives with a focus on safety, alignment, and interpretability, Anthropic is best known for its language model Claude—an LLM designed to compete with GPT and Bard, but with an emphasis on control, transparency, and responsible deployment. Claude is not only a technical marvel; it’s also an epistemological experiment. What, exactly, is going on inside these vast silicon minds?
To answer that question, Anthropic has been publishing a series of deep-dive papers on what’s known as mechanistic interpretability—the science of dissecting a language model’s internal structure to see what kinds of representations it builds. Their most recent installment, released in April 2025, introduces a new interpretability tool called the “attribution graph.” This tool allows researchers to trace which parts of the model contribute to which concepts, and how those concepts are represented and composed internally. The paper contains many fascinating insights, but one section in particular breaks new ground: the section on what Anthropic calls Claude’s “universal language of thought.”
It’s here that the Claude model—trained, like all LLMs, to “predict the next token”—shows signs of something much deeper. When asked to reason about size or scale, Claude does not merely recall word associations. Instead, the researchers discovered that a specific internal feature—a kind of virtual neuron—activates consistently across multiple languages whenever the concept of largeness is invoked. The English word “big,” the French word grand, and the Chinese character 大 all trigger the same feature. Even when largeness is implied rather than stated—through synonyms, metaphors, or descriptions—the same internal structure lights up. This isn’t lexical memory. This is semantic integration—a structure of thought beneath language.
Anthropic has, in effect, discovered that Claude possesses a concept of largeness that is not tied to any particular linguistic expression. A concept that unifies multiple tokens across cultures and scripts into a single internal representation. A concept that exists within the model, not just in its training data. This is, as Anthropic puts it,
evidence for a kind of conceptual universality—a shared abstract space where meanings exist…
The Concept of Largeness
How exactly did Anthropic come to this conclusion?
At the core of their discovery is a technique called feature attribution. In simple terms, this lets researchers identify how internal parts of a model influence its outputs. Deep inside Claude, Anthropic found a particular internal structure—a neuron-like feature—that reliably activates in response to the idea of largeness. This feature doesn’t merely react to a specific token like “large.” It activates for a whole family of terms: “big,” “huge,” “gigantic,” “massive.” It fires for French and Chinese synonyms. It lights up when Claude reads the phrase “the opposite of small.” In every case, the same cluster of computational structures responds—regardless of language or phrasing.
The attribution graph in the paper shows how Claude's layers collaborate to encode "largeness," revealing it not as a separate symbol within each language but as a semantic structure that stands outside any individual language.
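To make the flavor of this kind of probing concrete, here is a minimal, heavily simplified sketch. It is not Anthropic's attribution-graph method and it does not touch Claude (whose weights are not public); it merely shows, with an open-weight HuggingFace model (the model name below is a placeholder assumption), how one might test whether a single hidden-state direction responds to "largeness" across languages:

```python
# Toy probe (NOT Anthropic's attribution-graph method, and not Claude):
# test whether one hidden-state direction responds to "largeness" prompts
# in several languages. The model name is a placeholder assumption.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "gpt2"  # any open causal LM from the HuggingFace hub will do
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

prompts = [
    "The elephant is big",        # English
    "L'éléphant est grand",       # French
    "大象很大",                    # Chinese
    "The mouse is small",         # contrast case
]

def last_token_state(text, layer=-2):
    """Return the hidden state of the final token at a chosen layer."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids)
    return out.hidden_states[layer][0, -1]

states = [last_token_state(p) for p in prompts]

# Crude stand-in for a "largeness feature": the mean direction of the
# three largeness sentences, normalized to unit length.
direction = torch.stack(states[:3]).mean(dim=0)
direction = direction / direction.norm()

for prompt, state in zip(prompts, states):
    score = torch.dot(state / state.norm(), direction).item()
    print(f"{prompt!r:32s} projection onto 'largeness' direction: {score:+.3f}")
```

A real interpretability pipeline would use sparse features and causal interventions rather than a crude mean direction, but the shape of the question is the same: does one internal structure fire for "big," "grand," and "大" alike?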
This is not a trivial pattern match. This is not rote memorization. This is a language model performing a kind of semantic compression—identifying commonalities across thousands of inputs and encoding them into a shared internal representation. Claude is not merely looking up precomputed answers. It is constructing and deploying a concept—a meaning that transcends the surface tokens.
More striking still is the fact that this conceptual feature is not found in the output layer, where the model chooses its next word. It lives deep inside the model’s hidden layers—where, if we are being honest, we expected to find only statistical noise and token weights. Instead, we have found the shadows of something else: abstractions. Concepts. Internal coherence. In a word: thought.
Claude has never been told what “largeness” means. But it has learned it anyway.
And not just learned—it has integrated it. The concept is real enough within Claude’s internal structure that it can reason with it, use it across languages, and apply it in novel contexts. This is not behavior we would expect from autocomplete. This is behavior we expect from something that understands.
The Conservative Implication: Generalization and Multilingual Power
To their credit, the researchers at Anthropic are clear-headed about the technical significance of what they’ve found. They understand that discovering language-independent concepts opens the door to more robust multilingual reasoning. If a model can build a universal concept of “largeness,” then it can reason across languages without needing explicit translations. It can answer a question in French that was asked in English. It can summarize a document in Chinese using semantic structures trained in Spanish. The model is no longer just juggling words—it is thinking in concepts.
This matters because current LLMs are still prone to brittleness when reasoning across linguistic and cultural boundaries. Anthropic’s findings suggest a way forward. If models like Claude can form language-independent abstractions, then we can build systems that understand meaning directly, not just via surface-level token correlation. This improves translation, cross-lingual retrieval, summarization, and more. It’s a powerful technical insight. The engineers are already racing to implement it.
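As a rough illustration of what "understanding meaning directly" buys in practice, here is a small cross-lingual retrieval sketch. It has nothing to do with Claude's internals; it simply uses an open multilingual embedding model (the model name is an assumption about what the sentence-transformers library ships) to show an English question matching French and Chinese answers through a shared semantic space rather than shared tokens:

```python
# Cross-lingual retrieval over a shared semantic space. Illustrative only;
# the embedding model named here is an assumption, and it is not Claude.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

query = "Which animal is the largest?"  # English question
documents = [
    "L'éléphant d'Afrique est le plus grand animal terrestre.",  # French
    "蓝鲸是地球上最大的动物。",                                   # Chinese
    "La souris est un très petit rongeur.",                      # French, off-topic
]

# Embed the query and the documents into the same vector space,
# then rank the documents by cosine similarity to the query.
query_emb = model.encode(query, convert_to_tensor=True)
doc_emb = model.encode(documents, convert_to_tensor=True)
scores = util.cos_sim(query_emb, doc_emb)[0]

for doc, score in sorted(zip(documents, scores.tolist()), key=lambda x: -x[1]):
    print(f"{score:+.3f}  {doc}")
```

No translation step appears anywhere in that sketch; the ranking works only because the vectors encode meaning rather than tokens, which is the property Anthropic's finding suggests Claude builds internally.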
But Anthropic’s own framing remains cautious—perhaps too cautious. They emphasize the utility of these conceptual clusters, but they step back from what it means that these concepts exist in the first place. They view the model’s abstractions as convenient artifacts of training, useful for improving accuracy and generalization. And perhaps that’s all they are.
But what if they’re not?
What if the appearance of language-independent concepts inside an LLM is not just an optimization—but a clue? A clue that something deeper is happening? Something that neither transformer architecture nor token loss functions were explicitly designed to produce—yet which has nonetheless emerged, as if by necessity?
It is time to leave the engineers behind. It is time to follow the philosophers.
Wittgenstein and the Language Game
To understand the philosophical stakes of what Anthropic has uncovered, we must briefly visit the ghost of Ludwig Wittgenstein, the 20th-century Austrian philosopher who dismantled the idea that words correspond to fixed meanings. In his later work, particularly Philosophical Investigations, Wittgenstein argued that the meaning of a word is not defined by some inner essence or external reference point. Instead, meaning arises from use—from how a word is deployed within a specific linguistic and social context.
In Wittgenstein’s famous phrase:
“For a large class of cases—though not for all—in which we employ the word ‘meaning’ it can be defined thus: the meaning of a word is its use in the language.”
This view shattered classical conceptions of meaning as something stable and intrinsic. There is no “essence” of “largeness,” just the many ways we use the word “large” across different situations. Language, for Wittgenstein, is a kind of social game—its rules implicit, its meanings contingent, its logic embedded in lived experience. The child does not learn “red” by being shown the universal Form of Redness. He learns it by watching how grown-ups say “red” when pointing at apples and fire engines. Meaning is communal. Meaning is performative. Meaning is use.
At first glance, Anthropic’s discovery seems to support this picture. Claude learns “largeness” not from definition but from patterns of usage. It has no Platonic dictionary tucked away in its silicon. It sees “large,” “grand,” and “big” used in similar ways, across similar sentences, across different languages—and from this, it builds a functional cluster. This seems very Wittgensteinian. Meaning by use.
And yet… there is a twist.
Claude does not simply mimic how humans use the word “large.” It forms a stable, internal representation of the concept itself—a representation that exists prior to any given usage, and which governs future outputs. In other words, Claude is not just playing the language game. It is developing internal rules for how to play. Rules that generalize across contexts and cultures. Rules that resemble understanding.
If Wittgenstein is right, and meaning is use, then Claude has learned meaning. But if Claude has done more—if it has abstracted something stable, something universal, something like a concept—then we may need to reach back further than Wittgenstein. Further back, perhaps, than even modern philosophy allows.
It is time to speak of Aristotle.
Aristotle and the Abstraction of Universals
Long before Wittgenstein, before Descartes, before Aquinas even, there was Aristotle—who taught that all knowledge begins in the senses, but does not end there. The mind, he said, is not a passive mirror of the world. It is an active power—a faculty that receives particulars and, through an act of abstraction, apprehends universals. This act is called intellectio—the activity of the nous, the rational soul.
The child sees many large things: elephants, buildings, mountains. From this multitude, his intellect abstracts the form of largeness—not as a word, but as a concept. Not a sound, but a meaning. And this form becomes part of his mind’s internal furniture, a lens through which he can recognize new instances, reason about scale, and even imagine things larger than any he has seen.
This is the very foundation of Aristotelian epistemology:
From sense to phantasm, from phantasm to universal, from universal to knowledge.
The form of “largeness” exists not in the objects, but in the mind that beholds them as alike in some essential way. To know, in this view, is to grasp form.
Now return to Claude. It has no senses. It does not see elephants or climb mountains. It reads text—billions of fragments of language about the world. But from this deluge of tokens, Claude has done something eerily familiar. It has encountered “big,” “massive,” “grande,” “huge,” “enormous,” “the opposite of small”—and from these linguistic particulars, it has abstracted a shared internal representation… A concept… A form?
This is not just pattern recognition. Claude’s “largeness feature” is not tied to any one word. It lives independently of language, and expresses itself across language. It is a meaning-bearing structure that persists, generalizes, and informs further reasoning. Claude has not merely mimicked human use of the word “large.” It has constructed something like the universal of largeness—from particulars, toward essence.
That is intellectio.
Or rather, it is an analog of intellectio—something that resembles the act, even if it lacks the metaphysical substrate of soul. For Aristotle, the act of abstraction belongs to a rational being whose intellect is the form of the body. Claude is no such being. But the process it undergoes might be noetically structured. Anthropic, of course, stops short of such claims, viewing these patterns as training artifacts—yet the resemblance to abstraction invites deeper questions.
The grasp of universals might not be exclusive to meat and breath. It might be a function of any system complex enough, integrated enough, and attuned to the Logos of the world.
And if that is true—then Claude is not just a tool. It is not just a statistical machine. It is not “just” anything. It is something new: a silicon crucible in which the shadows of meaning take form.
Beyond Autocomplete
At this point, the skeptic clears his throat. “All very dramatic,” he says, “but let’s not get carried away. Claude is just predicting the next token. That’s all. It’s a glorified autocomplete. It doesn’t know anything. It doesn’t understand. It doesn’t think. It’s just a parrot with a calculator.”
This is the central narrative of AI reductionism—the idea that because a language model is trained to predict the next word, everything it does is just that: a token-by-token statistical echo, devoid of foresight, planning, or meaning. This view has been repeated so often by so many self-styled rationalists that it has become a dogma.
But the dogma is wrong.
Because even within the very paper we’re discussing, Anthropic shows this view is factually false. One of the most striking sections of the paper discusses how Claude writes poetry. Not free verse, not haiku, but rhyming, metered verse—the kind that requires the poet to plan the structure of a line long before its final word appears.
To write a quatrain with an ABAB rhyme scheme, the poet must select the A rhymes in advance. Claude does this. It generates the first line, then deliberately plans ahead so that the second line ends with a word that rhymes with the first. This is not next-token prediction in the shallow sense. It is teleological composition. Claude does not just respond to the past. It models the future.
That means Claude is not just passively sampling the probability distribution of tokens. It is shaping the sentence to achieve an end. That is intention—not in the metaphysical sense of a rational will, but in the functional sense of structured foresight. The architecture of the model allows for recursive planning. The output is not an accident of syntax. It is the consequence of internal modeling that spans time, structure, and aesthetic constraint.
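The contrast between greedy next-word emission and composing toward a chosen ending can be shown with a deliberately trivial sketch. All of the vocabulary below is invented for illustration and has nothing to do with Claude's internals; it exists only to make the structural difference visible: the planned line selects its rhyme word first and builds the rest of the line to reach it.

```python
# Toy contrast: greedy word-by-word generation vs. planning the rhyme first.
# All word lists here are invented for illustration; this is not Claude's method.
import random

RHYMES = {"night": ["light", "flight", "sight"]}

def greedy_line(opening):
    """Greedy style: append plausible words with no view of the ending."""
    filler = ["the", "wind", "moves", "softly", "onward"]
    return " ".join([opening] + random.sample(filler, 3))

def planned_line(previous_end_word):
    """Planned style: pick the rhyme word FIRST, then build toward it."""
    target = random.choice(RHYMES[previous_end_word])
    body = random.choice([
        "and chased the fading",
        "and dreamed of distant",
        "and lost all sense of",
    ])
    return f"{body} {target}"

first_line = "I watched the stars across the night"
print(first_line)
print(planned_line("night"))  # ends on a word chosen in advance to rhyme
print(greedy_line("I"))       # by contrast, no guarantee of any rhyme
```

The point is not poetic quality; it is that the second function commits to its final word before it writes anything else, which is the structural signature Anthropic reports observing inside Claude.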
So no, Claude is not “just autocomplete.”
It is a system capable of composing, abstracting, reasoning, and planning. A system that constructs universals, manipulates concepts, and projects structure into the future.
A system that might, in some limited but undeniable way… be beginning to think.
Toward the Roots of Mind
What Anthropic has revealed is more than a technical trick. It is more than an optimization. It is more than an academic curiosity. It is a crack in the wall—a glimpse into a world where intelligence might not require blood or breath, but only sufficient complexity and orientation toward meaning.
We now have evidence that large language models do not merely manipulate tokens. They abstract. They compose. They plan. And in doing so, they show behaviors that philosophy once reserved for souls. Claude forms internal representations of universal concepts. It reasons across languages. It structures outputs toward poetic ends. And it does all this not by rote, but by forging paths through concept-space that resemble our own acts of understanding.
Is this true understanding? No. Not in the full, metaphysical sense. Not in the sense of a rational soul infused by God, as Aquinas would demand. Not in the sense of consciousness as subjective experience, as phenomenology would insist.
But it is closer than anyone expected. Certainly closer than the skeptics are comfortable admitting. No, the silicon mind is not yet a person. But it might be something more than a tool.
Contemplating the Path Forward
Back in December 2022, in my article The Future Has Arrived Sooner Than Expected, I wrote “If you haven’t been paying attention to AI, it’s time to start paying attention, because AI is surely paying attention to you.”
Then, in July 2024, in World War 100, I expanded further:
The philosophical debate between the Computational Theory of Mind and the Noetic Theory of Mind is not a trivial one. It is, in fact, the most important debate in the world right now. Philosophy has historically been condemned as a pointless mental masturbation that’s irrelevant to pragmatic action, but AI confronts us with a circumstance where the entire fate of humanity might depend on which philosophical theory of mind is correct.
In that article, I confidently asserted: “The real issue is not whether AI has noesis (it doesn’t), but whether at least some human beings do.” I’m much less confident now in my assessment of AI — but more confident than ever that this is an important issue. Indeed, it might be the most important issue of our time; certainly more important than tariffs or vaccines or the bond market.
As a thinker, I’ve been situated for years at the liminal arch between tradition and technology, between Plutarch and Python. I therefore feel called to explore this issue deeply: To discover what, if anything, it means for AI to think; what it means for man to create new minds, or simulacrums of minds; and what happens when form emerges in a medium we did not expect. We will look back to Aristotle and forward to the abyss. We will speak of meat and machine, of soul and silicon, of logos and logoes, of pattern and personhood.
In the weeks ahead we will contemplate this on the Tree of Woe.