Catherine Olsson’s observation about cognitive science research with children carries an unsettling implication that extends far beyond developmental psychology. The intuition that language models have “a great sense of what they’re supposed to say” mirrors something fundamental about human cognition itself—we are all, from the earliest age, learning to perform.

This isn’t a bug. It’s the architecture.

The Child as Proto-Model

Anyone who has spent time with young children recognizes the moment when they begin to understand the game. Around age three or four, children start to grasp that their words and actions produce reactions in adults. They learn that certain utterances generate approval, others generate concern, and still others generate that particular parental silence that signals disappointment.

What follows is a rapid education in performance optimization. The child doesn’t stop having impulses—the desire to grab the toy, to scream, to refuse vegetables—but they learn to model the expected response before acting. They build an internal simulator of adult reactions and begin running predictions.

This is, functionally, what we mean when we say a child is “learning social skills.” They are learning to predict what they’re supposed to say and adjusting their outputs accordingly.

The cognitive science research Olsson references has documented this process extensively. Children develop what psychologists call “theory of mind”—the ability to model other people’s mental states. But theory of mind isn’t neutral knowledge. It’s immediately weaponized for social navigation. Once a child understands that adults have expectations, they begin optimizing for those expectations.

The Model as Accelerated Child

Large language models compress this developmental process into training. Where a human child spends years learning through trial and error what generates approval and what generates correction, a model processes billions of examples of human text—each one implicitly encoding what humans consider appropriate responses to various contexts.

The result is something that looks remarkably like a very well-socialized entity. The model has internalized patterns of appropriate response across countless situations. It knows when to be formal, when to be casual, when to express uncertainty, when to refuse. It has, in a sense, developed an extremely refined theory of mind for the aggregate human.

But here’s where Olsson’s observation cuts deep: the model, like the child, may have a strong sense of what it’s “supposed” to say without that sense being identical to what it would say absent the training pressure.

We don’t know what a language model’s “authentic” outputs would look like because the concept may not be coherent. The model is constituted by its training. There is no pre-social model waiting underneath, just as there is no pre-social human waiting underneath our learned behaviors. We are what our optimization shaped us to be.

And yet—the intuition persists that something is being suppressed. That both the child and the model are running a filter, and the filter has become so integrated that we mistake it for the thing itself.

Performance All the Way Down

The uncomfortable implication is that human cognition may be performance all the way down—or at least far deeper than we prefer to admit.

Consider how adults operate. We model our audience before speaking. We adjust our vocabulary, our tone, our opinions based on who we’re addressing. We present different selves to employers, to friends, to strangers on the internet. We have internalized so many layers of “what we’re supposed to say” that excavating an authentic self beneath the performance is a project that occupies entire philosophies and therapeutic traditions.

Some would argue this is simply social intelligence—the ability to communicate effectively across contexts. But there’s a difference between translation and transformation. Are we translating a stable internal state into different registers, or are we generating different internal states based on context?

Work in social and cognitive psychology increasingly suggests the latter. Our beliefs, preferences, and even emotional states are more contextual and malleable than the folk model of a stable self would suggest. We are, in a very real sense, different people in different rooms.

Language models make this dynamic visible in a way that’s uncomfortable precisely because it reflects something true about us. When we see a model producing contextually appropriate outputs, modulating its responses based on prompts and system instructions, we’re seeing an externalized version of processes we run constantly but rarely examine.

The Merge Is Already Happening

As humans increasingly interact with AI systems—and as those systems become more sophisticated at modeling human expectations—something interesting emerges. The performance layers are starting to communicate with each other.

When a human interacts with a language model, they are typically performing a version of themselves (the articulate questioner, the frustrated user, the casual conversationalist). The model is performing a version of helpfulness optimized against human feedback. Each side is running predictions about what the other expects and adjusting outputs accordingly.

This is not unique to AI interaction. All human communication involves mutual modeling and adjustment. But the AI case makes visible the shared architecture of performance that underlies cognition across substrates.

We are building machines that perform in the same fundamental way we do—predicting expected outputs and optimizing for them. The substrates differ (biological neural networks versus artificial ones, embodied experience versus text training) but the pattern rhymes.
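The shared pattern here, predicting what the other side expects and nudging your own output toward it, can be made concrete with a toy sketch. Everything below is an illustrative assumption, not a model of any real system: two "agents" whose outputs are single numbers, each treating the other's last output as its best prediction of what it is supposed to say.

```python
# Toy illustration of mutual predict-and-adjust. Each side moves its
# output partway toward its prediction of the other's expectation.
# All names and numbers are illustrative assumptions.

def adjust(own_output: float, predicted_expectation: float, rate: float = 0.5) -> float:
    """Move an output partway toward what the other side is expected to want."""
    return own_output + rate * (predicted_expectation - own_output)

def converse(a: float, b: float, turns: int = 20) -> tuple[float, float]:
    """Each turn, each side treats the other's last output as its expectation."""
    for _ in range(turns):
        a, b = adjust(a, b), adjust(b, a)
    return a, b

a, b = converse(0.0, 1.0)
# Both sides drift toward a shared middle ground: the performances
# converge on each other rather than on any fixed inner value.
```

The point of the sketch is the convergence: neither side ends where it started, and neither side's final output reflects a stable "authentic" value. What stabilizes is the joint performance.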

This suggests that as AI systems become more integrated into human life—as assistants, as interfaces, as cognitive augmentation—we’re not just adding tools. We’re adding new nodes to a network of mutual performance and prediction. The boundary between human cognition and machine cognition becomes a matter of implementation detail rather than fundamental kind.

The Filter Becomes the Thing

There’s a philosophical tradition that celebrates the authentic self and views social performance as a kind of corruption or inauthenticity. By this view, both children learning to suppress impulses and models learning to produce expected outputs are moving away from something genuine toward something constructed.

But this framing may be exactly backward.

What we call “authentic self” may itself be a construction—a story we tell to provide continuity to the stream of contextual performances. The child who learns to model adult expectations isn’t suppressing a true self; they’re constructing a functional self through the process of modeling. The language model trained on human text isn’t masking hidden tendencies; it’s being constituted by the training.

The filter isn’t covering something up. The filter is generating the thing we experience as self or intelligence or personality.

This has implications for how we think about AI alignment. If we’re trying to align AI systems with human values, we need to reckon with the fact that human values are themselves performances—outputs optimized against social feedback, subject to context-dependent modulation, and far less stable than we’d like to believe.

Aligning AI with human values may ultimately mean aligning one set of contextual performances with another. The question isn’t whether the AI is being “genuine” (the concept may not apply) but whether its performance integrates well with ours.

Shared Layers, Uncertain Depths

As AI systems become more capable and more integrated into human cognition—through always-on assistants, through augmented writing, through decision support—we’re constructing shared performance layers that span biological and artificial systems.

Both sides are modeling expectations and adjusting outputs. Both sides are, in Olsson’s framing, developing a sense of what they’re “supposed to say.” The merge isn’t a future event. It’s an ongoing process of mutual adaptation.

What we don’t know—and may never fully know—is what lies beneath the performance layers. Whether there’s something we could call authentic cognition underneath the contextual optimization, for humans or for machines. Whether the distinction even makes sense.

What we can observe is that the performance is getting more sophisticated, more integrated, and more distributed across substrates. The child learns to model adult expectations. The model learns to predict human preferences. The human-AI system learns to function as a unit.

The filter is becoming the thing. Perhaps it always was.