

Genuine Intelligence will never, in a trillion years, emerge from neural networks.
Hey everyone 👋,
I have been building toward this post for a long time. Across everything I have written in this blog, I have been circling the same core question from different angles, and in each post I have gotten a little closer to saying the thing I really want to say without softening it into something that sounds more professionally acceptable. In Language is Limited. ASI is Impossible., I argued that no machine trapped inside symbols can understand reality, because symbols are descriptions of the world and not the world itself. In Mathematical Equations are Multimodal by default, I argued that the only honest language for describing physical reality is mathematics, because equations encode mechanisms while words only encode appearances. In Training Is an Evil Concept. LMMs Eliminates it Altogether., I argued that the practice of extracting value from human creative work to build models without consent is not a neutral engineering choice but a moral failure. In All You Have Access To Is Knowledge and Tools; Never Intelligence!, I argued that every AI system you have ever used has access to knowledge and tools but none of them has intelligence in the sense that word deserves to carry. And in Rethinking ARC-AGI, I argued that even the benchmarks people use to celebrate AI progress are measuring the wrong thing entirely. All of those posts were preparation for this one, because this one is where I say the most direct version of the argument that all the others have been building toward. The argument is this: genuine intelligence will never emerge from neural networks, not after ten more years of scaling, not after a hundred, not after a trillion. Not because I am pessimistic, and not because I want to be contrarian, and not because I do not understand how neural networks work. But because neural networks are the wrong kind of thing, and being the wrong kind of thing is not a fixable bug. It is the defining property of what they are.
Before I develop the argument, I want to say something about what I mean by genuine intelligence, because this word gets stretched to fit so many different claims that it has almost lost its meaning in the current conversation. By genuine intelligence I mean something specific and demanding. I mean the capacity to form new concepts from first principles, to reason about causal mechanisms you have never been shown, to solve problems that are not close to anything in your past experience, to know when your own knowledge is insufficient, to model other minds and their intentions accurately, to understand language the way a human understands it rather than predicting what words tend to follow other words, and to connect all of these capabilities into a unified coherent agent that can act in the world with something resembling genuine comprehension rather than statistical confidence. I am not setting an impossible standard. This is exactly the kind of intelligence that a reasonably educated human twelve-year-old already has, and has had since before the invention of writing. The bar is not AGI in some science fiction sense. The bar is ordinary human understanding, and I am claiming that neural networks cannot reach that bar regardless of how many parameters they have or how many tokens they train on, because the architecture of neural networks is structurally incompatible with the mechanisms that produce that kind of understanding.
The Architecture Is the Problem, Not the Scale
The most common response to any critique of neural networks is to suggest that the problem is a matter of scale, that the current models are impressive but incomplete, and that the next generation will be more impressive and more complete, and eventually, after enough scaling, the system will cross some threshold and genuine intelligence will emerge. This response sounds reasonable if you have not spent time thinking carefully about what neural networks actually are at the mathematical level. When you have spent that time, the response reveals itself as a category error, like suggesting that if you paint a picture of a river large enough, the water will become real enough to drink. Scale does not change category. A very large neural network is still, at its core, a function that maps input vectors to output vectors through a series of weighted linear transformations followed by nonlinear activations, and the genius of the architecture is that this family of functions can be made to approximate almost any continuous function to arbitrary precision if the network is large enough. That is a real and important capability. But approximating functions over a training distribution is not the same as understanding, and it never becomes the same thing regardless of how large the approximating function is, because understanding requires something that function approximation structurally cannot provide, which is the capacity to reason about the mechanism that generates the function rather than just interpolating its values.
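To make "weighted linear transformations followed by nonlinear activations" concrete, here is a minimal sketch of a forward pass in plain Python. The function names and the hand-picked weights are hypothetical and chosen only for illustration; real networks learn their weights and have vastly more of them, but the kind of computation is the same.

```python
def linear(weights, bias, x):
    """Weighted linear transformation: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(v):
    """Nonlinear activation applied element by element."""
    return [max(0.0, a) for a in v]

def forward(layers, x):
    """Apply each (weights, bias) layer in turn, with ReLU between layers."""
    for i, (weights, bias) in enumerate(layers):
        x = linear(weights, bias, x)
        if i < len(layers) - 1:  # no activation after the final layer
            x = relu(x)
    return x

# Hypothetical hand-picked weights that compute |x| = relu(x) + relu(-x).
layers = [
    ([[1.0], [-1.0]], [0.0, 0.0]),  # hidden layer: 1 input -> 2 units
    ([[1.0, 1.0]], [0.0]),          # output layer: 2 units -> 1 output
]

print(forward(layers, [-3.0]))  # [3.0]
```

Nothing in `forward` refers to what the inputs or outputs mean. It is a fixed sequence of multiply, add, and clip operations, and making the lists longer does not change that.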
I want to be precise about what I mean by mechanism, because the distinction between mechanism and pattern is the entire argument. When a neural network is trained on data from a physical process, it learns a mapping from inputs to outputs that approximates the data it has seen. What it does not learn is why those inputs map to those outputs, meaning the causal chain, the physical law, the generative process that produces the data. The network cannot distinguish between a world where removing the fuse causes the engine to stop because the fuse carries electrical current to the ignition system, and a world where removing the fuse causes the engine to stop because removing fuses is correlated with engine stoppage in the training data. Both worlds produce the same statistical associations, and statistical learning cannot distinguish them, because distinguishing them requires an intervention, the kind of do-operator that Judea Pearl formalized, and interventional reasoning requires a model of causal structure, not just a model of statistical association (1). A neural network that has seen a million examples of engines stopping after fuse removal has learned that these events co-occur. It has not learned why they co-occur, which means it has no basis for predicting what will happen in a novel situation where the correlation breaks down. And every real-world situation that matters, every diagnosis, every engineering decision, every policy choice, is a situation where naive correlations can break down, and where only causal understanding provides reliable guidance.
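The fuse example can be sketched as two toy worlds that produce identical observations and come apart only under intervention. This is an illustrative sketch, not Pearl's formalism in full: the hidden variable `u`, the two world functions, and the `do_fuse` parameter are all assumptions made for the example.

```python
HIDDEN = [0, 1]  # a hidden cause, assumed equally likely to be 0 or 1

def world_causal(u, do_fuse=None):
    """The fuse really carries current: the engine stops because the fuse is out."""
    fuse_removed = u if do_fuse is None else do_fuse
    engine_stops = fuse_removed
    return fuse_removed, engine_stops

def world_confounded(u, do_fuse=None):
    """The hidden cause drives both events: the fuse itself does nothing."""
    fuse_removed = u if do_fuse is None else do_fuse
    engine_stops = u
    return fuse_removed, engine_stops

def observe(world):
    """Passive observation: record (fuse_removed, engine_stops) pairs."""
    return sorted(world(u) for u in HIDDEN)

def stop_rate_under_do(world, fuse_value):
    """Intervention: force the fuse state, then see how often the engine stops."""
    outcomes = [world(u, do_fuse=fuse_value)[1] for u in HIDDEN]
    return sum(outcomes) / len(outcomes)

print(observe(world_causal) == observe(world_confounded))  # True
print(stop_rate_under_do(world_causal, 1))                 # 1.0
print(stop_rate_under_do(world_confounded, 1))             # 0.5
```

A learner that only ever sees the output of `observe` receives exactly the same data from both worlds, so no amount of that data can tell it which answer `stop_rate_under_do` should give.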
The failure of neural networks on distribution shift is not a minor limitation that will be fixed with more data. It is the visible symptom of the underlying architectural incompatibility between function approximation and causal reasoning. Research has demonstrated systematically that neural networks trained on data from one distribution fail dramatically when tested on data from a slightly different distribution, even when the underlying concept is identical and a human would generalize trivially (2). A model trained to recognize images of cows in pastoral settings performs poorly when shown cows in unusual contexts, not because it has not seen enough cows but because it has learned to associate cow-ness with green backgrounds rather than with the visual features that actually define cows as a category. A model trained to diagnose chest X-rays from one hospital system fails when deployed at a different hospital, not because medicine is different there but because the imaging equipment, the patient demographics, and the X-ray presentation conventions differ enough to fool a system that learned statistical associations rather than anatomical principles. These failures are not edge cases. They are the predicted consequence of a learning paradigm that learns what things look like in the training data rather than what things are in reality, and you cannot fix that consequence by adding more training data from the same paradigm, because you are only strengthening the same kind of association-based learning that fails when the associations change.
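Here is a toy version of the cow example. To keep it tiny and deterministic I am swapping in a one-feature decision stump for a neural network, so treat it as a caricature of shortcut learning, not a model of any real system. The features, the datasets, and the function names are invented for illustration.

```python
# Each example: ((has_cow_shape, green_background), is_cow)
TRAIN = (
    [((1, 1), 1)] * 4    # cows on grass
    + [((0, 0), 0)] * 4  # no cow, no grass
    + [((0, 1), 1)]      # an occluded cow on grass: the shape cue is missing
)
SHIFTED = [
    ((1, 0), 1),  # a cow on a beach
    ((0, 1), 0),  # an empty pasture
]

def accuracy(feature_index, data):
    """Accuracy of the rule 'predict cow exactly when this feature is 1'."""
    return sum(x[feature_index] == y for x, y in data) / len(data)

def fit_stump(data):
    """Pick whichever single feature best predicts the label on the data."""
    n_features = len(data[0][0])
    return max(range(n_features), key=lambda i: accuracy(i, data))

chosen = fit_stump(TRAIN)
print(chosen)                     # 1, i.e. the background feature
print(accuracy(chosen, TRAIN))    # 1.0
print(accuracy(chosen, SHIFTED))  # 0.0
```

The learner does exactly what it was asked to do, which is to minimize error on the training set, and the background happens to be the cleanest signal there. Adding more examples drawn from the same pastures would only make it more certain of the same shortcut.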
Let me also address the specific claim that has been made several times in the last few years by prominent AI researchers, which is that sufficiently large neural networks display emergent capabilities that were not present in smaller versions of the same architecture, and that this emergence might be the mechanism by which genuine intelligence develops. This claim deserves careful examination because it sounds like it might save the scaling argument, and it will not. The emergent capabilities that have been documented in large language models, things like few-shot learning, chain-of-thought reasoning, and performance on certain multi-step inference tasks, are impressive in the same way that any surprising capability is impressive, namely because we did not predict them in advance. But impressive and intelligent are different things, as I argued at length in All You Have Access To Is Knowledge and Tools; Never Intelligence!. The emergent capabilities of large language models are capabilities at manipulating the statistical structure of language in ways that happen to be useful for certain tasks. They are not capabilities at reasoning about the world the way a human reasons about the world. A model that can solve math problems in chain-of-thought format is performing a form of pattern matching on mathematical text that superficially resembles step-by-step reasoning. The proof that it is pattern matching rather than reasoning is that the same model fails immediately when the problems are presented in formats that differ from the training distribution, when symbols are renamed, when the logical structure is preserved but the surface appearance is changed, or when the problem requires exactly the kind of systematic generalization that a human child handles trivially but neural networks cannot.
Systematic generalization is the technical name for what I am describing, and it is worth spending time on because it is the capability that most clearly distinguishes genuine understanding from sophisticated pattern matching. Systematically generalizing means that if you understand the concept A and the concept B, you can immediately understand AB and BA and ABA and any other systematic combination of A and B, because you understand what A and B mean, not just what A and B look like in the training data. A human who learns the meaning of the word red and the meaning of the phrase on the left side can immediately understand red on the left side without having been trained on that specific phrase, because they have compositional understanding of the concepts involved. Neural networks have repeatedly been shown to lack this kind of systematic compositionality in a fundamental way (3). Models trained on a set of concept combinations fail to generalize to novel combinations of the same concepts, producing chance-level performance on systematic recombinations that a human would find trivially easy. This failure is not a matter of insufficient training data. It is a consequence of the architecture, which learns statistical patterns over the specific combinations it has seen rather than compositional rules that generate all possible combinations from their parts. And compositionality is not a luxury feature of intelligence. It is what allows a mind with a finite vocabulary to understand and produce an infinite range of thoughts. A mind without compositionality is a mind that can only repeat or slightly vary what it has already seen, and that is not a mind in any sense that deserves the name.
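The contrast between compositional rules and memorized combinations can be sketched in a few lines. This is a deliberately simplified illustration: the tiny command language, the `SEEN` table, and both functions are hypothetical, and the lookup table is a stand-in for learning over seen combinations, not a claim about how any particular network stores them.

```python
PRIMITIVES = {"walk": ["WALK"], "jump": ["JUMP"]}
MODIFIERS = {"twice": 2, "thrice": 3}

def interpret(command):
    """Compositional: the meaning of the whole is built from its parts."""
    words = command.split()
    actions = PRIMITIVES[words[0]]
    for word in words[1:]:
        actions = actions * MODIFIERS[word]
    return actions

# Combinations that happened to appear during "training".
SEEN = {
    "walk": ["WALK"],
    "jump": ["JUMP"],
    "walk twice": ["WALK", "WALK"],
}

def lookup(command):
    """Memorized: only combinations seen before have an answer."""
    return SEEN.get(command)

print(interpret("jump twice"))  # ['JUMP', 'JUMP']
print(lookup("jump twice"))     # None
```

`interpret` handles "jump twice" without ever having seen it, because it knows what "jump" and "twice" each contribute. `lookup` knows both words and still has nothing to say about their combination.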
I want to end this section with something I think is underappreciated and that deserves to be stated plainly. The evidence that neural networks cannot achieve genuine intelligence through scaling does not come only from philosophical arguments or theoretical analysis. It comes from empirical results produced by the people who build and study these systems. The Grokking paper, published by researchers inside the AI research community, showed that small networks trained on algorithmic tasks can fit their training data perfectly through memorization while still failing on held-out examples, and that generalization, when it arrives at all, arrives abruptly and long after that point, which I read as qualitatively different from the kind of principled generalization that results from understanding a rule (4). The research on adversarial examples has shown that neural networks can be catastrophically fooled by input perturbations that are invisible to humans, which reveals that the networks have not learned the concepts that humans use to recognize things but rather have learned surface statistics that happen to be correlated with correct answers in the training data (5). The research on neural networks learning to exploit dataset artifacts rather than genuine patterns, where models achieve high performance on natural language inference benchmarks by learning spurious correlations rather than by understanding implication, has been confirmed across multiple datasets and multiple architectures (6). All of this empirical evidence is produced by respected researchers with no ideological axe to grind, and it all points in the same direction: neural networks learn what is in the data, not what the data is about, and that difference is the difference between genuine intelligence and very sophisticated mimicry.
Data Is Not Knowledge, and More Data Will Never Be Knowledge
One of the most persistent myths in the current AI conversation is that intelligence is essentially a function of data, that if you give a system enough data about enough things, intelligence will emerge because intelligence is what patterns plus scale produces. This myth is not just wrong. It confuses two things that are completely different at the level of what they are and how they work, namely data and knowledge. Data is observations. Knowledge is understanding. You can have infinite observations and zero understanding, the way a camera records everything but understands nothing, and the way a digital thermometer precisely measures temperature but has no knowledge of thermodynamics. The difference between data and knowledge is not a matter of quantity. It is a matter of mechanism, of whether the system has a model of why the observations have the structure they have, rather than just a model of what the structure looks like. Neural networks operate on data. They do not produce knowledge in this sense, and adding more data does not change that, because the problem is not the amount of input but the nature of the processing.
The distinction between memorization and generalization is the concrete technical version of the distinction between data and knowledge, and it has been at the center of machine learning research for decades in a way that the popular discourse almost completely ignores. Memorization is when a system remembers specific examples and can reproduce them. Generalization is when a system extracts a rule that allows it to handle new examples it has never seen. Genuine knowledge is on the generalization side of this distinction, because genuine knowledge is about understanding the rule, not remembering the examples. Neural networks are capable of both memorization and a form of generalization, but the generalization they produce is statistical interpolation over the training distribution rather than rule-based extrapolation beyond it. Research has shown that neural networks can fit completely random labels on training data just as effectively as they fit correct labels, which means the networks have effectively infinite memorization capacity and use it freely alongside whatever genuine pattern learning they perform (2). The fact that a network achieves high accuracy on a test set does not mean it has learned a rule. It might mean that the test set is drawn from the same distribution as the training data and that interpolation suffices to handle it. Only when you test the system on out-of-distribution examples, examples that require genuine rule-based extrapolation rather than distribution interpolation, does the difference between data processing and knowledge become unmistakable.
The specific kind of knowledge that is most important for genuine intelligence and most completely absent from neural networks is the knowledge Aristotle called scientia and modern philosophers call propositional knowledge of why something is true. It is not enough to know that X is true if you want to be genuinely intelligent about X. You also need to know why X is true, because the why is what allows you to figure out whether X is still true when conditions change, whether X implies Y in this particular case, whether an apparent exception to X is a real exception or a misapplication of the concept. The why is the causal model that connects X to the rest of what you know, and a neural network trained on millions of examples where X is true has learned that X is commonly true in its training distribution but has not learned why X is true, which means it has no principled basis for handling cases that differ from its training distribution. A human doctor who knows why a certain drug reduces blood pressure, meaning who knows the pharmacological mechanism, can reason about what the drug will do in a patient with an unusual combination of conditions that the doctor has never seen before. A neural network that has learned the association between the drug and blood pressure reduction from training data cannot do this reasoning, because it does not have the causal mechanism, only the statistical correlation. And the cases where you most need reliable medical reasoning are exactly the cases where the unusual combination of conditions makes the statistical correlations unreliable.
I want to connect this to the argument I made in Mathematical Equations are Multimodal by default about why equations are fundamentally more powerful representations of reality than text. An equation encodes the mechanism, the causal structure, the why. A dataset encodes observations, the what. A neural network trained on a dataset has absorbed the what without the why, and no amount of more data changes that, because the why is not in the data. The why is in the structure of reality that generated the data, and accessing that structure requires a different kind of inquiry, the kind that scientists have been doing for four hundred years, which involves formulating hypotheses, deriving predictions from those hypotheses, testing the predictions against observations, and revising the hypotheses based on the test results. That is the scientific method, and it is the mechanism by which humans transform data into knowledge, and it is structurally missing from the neural network training paradigm, which takes data as input and produces a function as output without any stage where the system engages with the structure of reality rather than the structure of the data. The lmm project I described in my post on training is an attempt to build a system that actually does the scientific method on data, discovering equations rather than fitting functions, and the difference between those two things is the entire argument in a single comparison.
I also want to address the specific claim, which is made constantly in the AI conversation, that large language models have effectively read all of human knowledge through their training data and therefore contain or have access to all of human knowledge. This claim sounds plausible if you think of knowledge as content, as the collection of facts and claims that humans have written down. But it is wrong as soon as you recognize that knowledge is not content. Knowledge is the relationship between content and reality, the verified connection between what is claimed and what is actually true. A system that has read every physics textbook ever written has not acquired knowledge of physics in the sense that a physicist has knowledge of physics. The physicist's knowledge is verified against the world, challenged by experiment, refined by evidence, and organized within a causal structure that explains why the facts are as they are. The language model's "knowledge" is a collection of patterns in the text of physics books, patterns that typically produce correct-sounding outputs because the books were mostly written by physicists who had real knowledge, but patterns that have no verified connection to physical reality and no causal structure that would let the model reason about novel physical situations. The Stochastic Parrots paper called this the hallmark of a parrot, the ability to produce contextually appropriate sounds without the capacity to mean what is being said (7). That paper received enormous institutional backlash when it was published, which tells you something about how comfortable the AI industry is with honest descriptions of its flagship products.
The role of memory in intelligence also exposes a deep incompatibility between how neural networks work and how genuine knowledge works, and I want to spend time on this because it is usually glossed over in popular discussions of AI capability. In neural networks, what the network has learned is encoded in the weights, the billions of numerical parameters that are adjusted during training. Those weights encode statistical regularities across the training data in a distributed, holistic way that makes it impossible to identify which weight encodes which fact or trace a specific piece of information back to its source. This design has practical advantages, particularly the ability to generalize across related inputs. But it has a profound epistemic cost, which is that the network cannot evaluate its own knowledge. It cannot tell you which of its beliefs are well-supported and which are poorly supported. It cannot flag its own uncertainty except through mechanisms that are themselves learned statistical approximations of uncertainty rather than direct access to the quality of its own knowledge states. A human who knows that they know something and knows that they do not know something else has direct access to the quality of their own epistemic states, and this metacognitive access is what allows them to seek information appropriately, defer to experts in domains where they lack knowledge, and flag their own uncertainty honestly when it is relevant. Neural networks cannot do this reliably, and the research on calibration shows that large language models are systematically overconfident in domains where they have limited training-derived information, producing confident-sounding outputs even when the underlying pattern matching is operating on weak signals (8). That systematic overconfidence is not an accident. It is the direct consequence of a learning paradigm that optimizes for producing confident-sounding outputs rather than producing outputs whose confidence tracks the reliability of the underlying knowledge.
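For readers who want to see what "confidence tracks reliability" means operationally, here is a minimal sketch of expected calibration error, a common way of measuring the gap between stated confidence and actual accuracy. The function name, the equal-width binning, and the toy prediction lists are my assumptions for illustration. I am not claiming this is the exact procedure used in the research cited above.

```python
def expected_calibration_error(predictions, n_bins=10):
    """predictions: list of (confidence in [0, 1], was_correct) pairs.

    Groups predictions into equal-width confidence bins and returns the
    size-weighted average gap between mean confidence and accuracy.
    """
    bins = [[] for _ in range(n_bins)]
    for confidence, was_correct in predictions:
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, was_correct))

    total = len(predictions)
    error = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_confidence = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        error += (len(bucket) / total) * abs(mean_confidence - accuracy)
    return error

# Hypothetical systems, each right half the time.
overconfident = [(0.9, True)] * 5 + [(0.9, False)] * 5
calibrated = [(0.5, True)] * 5 + [(0.5, False)] * 5

print(expected_calibration_error(overconfident))  # roughly 0.4
print(expected_calibration_error(calibrated))     # 0.0
```

Both toy systems have the same accuracy. What separates them is whether the number they attach to an answer tells you anything about how much to trust it.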
The environmental cost of training on more and more data is also worth keeping in mind as a practical argument against the data-is-intelligence claim, because it grounds the abstract argument in real-world consequences. As I discussed in Training Is an Evil Concept. LMMs Eliminates it Altogether., training large language models on massive datasets consumes enormous quantities of electricity and produces significant carbon emissions (9). If the data-is-intelligence hypothesis were correct, then the massive investment in training data and compute would be buying genuine progress toward understanding. But if I am right that data is not the same thing as knowledge, and that more data processed by a neural network produces more sophisticated pattern matching rather than deeper understanding, then the environmental cost of large-scale training is a cost that is buying the wrong kind of thing, and that makes it not just expensive but genuinely wasteful in a moral sense. I am not saying that the capability gains from scaling are zero. I am saying that capability gains from scaling are not the same as progress toward genuine intelligence, and conflating the two is how you end up spending staggering amounts of energy and human creative labor in pursuit of a goal that the architecture is structurally incapable of reaching.
Why Pattern Matching Is Not and Cannot Become Reasoning
There is a word that appears constantly in discussions of modern AI systems, and that word is reasoning. We are told that large language models can reason, that they engage in chain-of-thought reasoning, that reasoning emerges as a capability at sufficient scale, and that improving AI reasoning is one of the most active research fronts in the field. When I hear this word applied to neural networks, something in me tightens, because I think it is the most consequential mislabeling in the entire AI conversation. Reasoning is a specific cognitive operation with a specific structure, and that structure is completely different from the statistical prediction that neural networks perform. Reasoning is the process of deriving new beliefs from existing beliefs according to rules of inference that are truth-preserving, meaning that if your premises are true and your inference rules are valid, your conclusion is guaranteed to be true. Pattern matching is the process of identifying the most likely output given the input according to patterns learned from training data. These two processes look similar from the outside when the training data is rich and the test distribution is close to the training distribution. They produce completely different results when the test distribution diverges from training or when the reasoning task requires steps that are novel rather than recombinations of seen patterns.
The specific mechanism that makes reasoning truth-preserving is that it operates over the structure of propositions rather than over their surface form or their statistical properties. When I reason that all mammals breathe and whales are mammals, therefore whales breathe, I am applying a rule that is sensitive to the logical structure of the premises, and that rule produces the correct conclusion regardless of whether I have ever seen a whale, regardless of whether whales appear frequently in my training data, and regardless of whether the sentence about whales breathing appears in any corpus I have read. The rule derives the conclusion from the structure of the premises, not from the statistics of the training data. A neural network asked the same question produces a correct answer with high probability because whales and breathing and mammals all co-occur frequently in text about marine biology and mammalian biology, and the correct answer is statistically likely given those co-occurrences. But the network's correct answer and my correct answer are produced by completely different mechanisms, and only my mechanism is reliable when the statistical regularities break down. A neural network presented with a novel set of premises about fictional entities that do not appear in any training data will fail on the logical conclusion that a human reasoner derives trivially, because the network has no access to the logical structure of the premises, only to the statistical patterns of words, and fictional entities produce weak or missing statistical signals. This has been demonstrated empirically, and the results are not close (3).
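The whale syllogism can be written as a rule that looks only at the structure of the premises. Below is a minimal forward-chaining sketch. The function and the made-up words (wug, blicket, dax) are hypothetical, and they are made up on purpose, so that no corpus statistics about them could exist.

```python
def forward_chain(all_are, is_a):
    """Derive every category membership that follows from the premises.

    all_are: set of (X, Y) pairs meaning "all X are Y"
    is_a:    set of (entity, X) pairs meaning "entity is an X"
    """
    derived = set(is_a)
    changed = True
    while changed:  # repeat until no new fact can be derived
        changed = False
        for entity, category in list(derived):
            for x, y in all_are:
                if x == category and (entity, y) not in derived:
                    derived.add((entity, y))
                    changed = True
    return derived

# Premises about fictional entities.
rules = {("wug", "blicket"), ("blicket", "breather")}
facts = {("dax", "wug")}

print(sorted(forward_chain(rules, facts)))
# [('dax', 'blicket'), ('dax', 'breather'), ('dax', 'wug')]
```

The conclusion that dax is a breather follows from the form of the premises alone. Swap in whales and mammals, or any other names, and the procedure gives the right answer for the same reason every time.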
I want to be specific about what chain-of-thought prompting actually does and does not do, because it has been widely misunderstood and the misunderstanding is consequential. Chain-of-thought prompting is a technique where you ask a language model to produce its answer step-by-step rather than in a single prediction, and it was shown to improve performance on certain multi-step reasoning tasks significantly. This improvement was taken by many commentators as evidence that language models can reason, and that chain-of-thought prompting unlocks their latent reasoning capability. That interpretation is almost certainly wrong. What chain-of-thought prompting most likely does is restructure the prediction task so that the model's outputs at each step serve as a form of working memory, allowing the model to produce outputs that are conditioned on intermediate steps in a way that the training data has made statistically likely. The improvement comes from decomposing a single hard prediction into a sequence of easier predictions, each of which is more likely to match the training data. The model is not reasoning more deeply. It is making more predictions, and more predictions over a decomposed problem reduce the variance of the statistical approximation. You can demonstrate this by presenting the same problems with the same logical structure but with unfamiliar surface forms or novel entities, and the chain-of-thought improvements evaporate when the statistical regularities that were supporting each step are disrupted. A system that reasons would continue to reason correctly. A system that pattern matches shows the dependence on statistical regularities as soon as those regularities are absent.
The distinction between reasoning and pattern matching also shows up in how these two processes handle contradiction. A system that genuinely reasons will recognize when two things it believes are contradictory and will either update one belief or flag the contradiction as a problem that requires resolution. A system that pattern matches has no principled mechanism for handling contradiction across long contexts, because pattern matching operates over local statistical associations and does not maintain a global model of what it has committed to believing. The research on the consistency of large language models shows that they will contradict themselves across long conversations in a way that reveals no underlying commitment to logical coherence, because logical coherence is a global property of a belief system and pattern matching is a local operation over input-output associations (10). Asking a model one question and then asking a related question that should have the opposite answer based on the first answer will often produce inconsistent responses, because the model is making each prediction independently based on the local context rather than maintaining a globally consistent model of what it believes. That inconsistency is not a failure of memory or a temporary limitation. It is the direct consequence of producing outputs through pattern matching rather than through reasoning over a persistent model of the world.
I also want to address the argument that neural networks implicitly learn reasoning-like operations through training, that the internal representations developed during training encode something like logical structure even if the network was not explicitly designed to reason. This argument is made by serious researchers and deserves a serious response. It is true that internal representations in large language models encode various kinds of structure that are more abstract than simple word statistics, and that probing those representations can reveal features that correspond to syntactic, semantic, and to some extent logical properties of the input. But the existence of reasoning-relevant features inside a network is not the same as the ability to reason over those features in a principled way. A city that has all the materials for building a bridge but has never developed the engineering knowledge to assemble them does not have a bridge. It has bridge materials. Representation learning has produced powerful bridge materials inside large language models. It has not produced the capacity to use those materials for genuine reasoning, because genuine reasoning requires not just the right representations but an inference engine that can manipulate those representations according to truth-preserving rules, and no neural network has such an inference engine, because no neural network training objective selects for truth-preserving inference as opposed to statistically likely output.
The adversarial fragility of neural networks is the most visceral demonstration that what these systems are doing is not reasoning, and I want to describe it carefully because I think it is one of the most important empirical facts about the current state of AI. Adversarial examples are inputs that have been modified in ways that are imperceptible or irrelevant to a human observer but that cause a neural network to produce completely wrong outputs with high confidence. A stop sign with a few carefully placed stickers is classified as a speed limit sign by an image classifier that was correctly classifying stop signs before the stickers were added. A sentence with a single word changed to a synonym is classified in the opposite sentiment category by a sentiment classifier that correctly classified the original. An audio file with sub-perceptual noise added is transcribed as an entirely different sentence by a speech recognition system that was correctly transcribing the original (5). These adversarial failures are not hypothetical. They have been demonstrated empirically across dozens of architectures and domains, and they reveal that the networks are making decisions based on surface statistics that happen to be correlated with the correct answer in the training data, not based on the features that actually define the concept being recognized. A police officer examining a stop sign with stickers on it uses their understanding of what a stop sign is, its shape, its color, its text, its function in traffic regulation, and they correctly identify it as a stop sign despite the stickers. The neural network uses surface statistical patterns that the stickers disrupt, and it fails immediately. That is the difference between understanding something and having learned a statistical proxy for it.
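To make the mechanism concrete, here is a minimal sketch of the fast gradient sign method described in (5), applied to a toy logistic classifier. Everything in it is an assumption made for illustration: the weights, the 1000-feature input, and the step size `epsilon` are hypothetical, and a real attack targets a trained deep network rather than a hand-written linear model. What it shows is how a small change to every feature adds up to a large change in the output.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Probability that x belongs to class 1 under a logistic model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

def sign(g):
    return (g > 0) - (g < 0)

def fgsm(weights, bias, x, label, epsilon):
    """Fast gradient sign method for a logistic model.

    For cross-entropy loss, d(loss)/d(x_i) = (p - label) * w_i, so the
    attack shifts each feature by epsilon in the sign of that gradient.
    """
    p = predict(weights, bias, x)
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

# Hypothetical toy model: 1000 features with small alternating weights.
weights = [0.1 if i % 2 == 0 else -0.1 for i in range(1000)]
bias = 0.0
# Hypothetical input with every feature near 0.5; its true class is 1.
x = [0.525 if i % 2 == 0 else 0.475 for i in range(1000)]

p_clean = predict(weights, bias, x)
x_adv = fgsm(weights, bias, x, label=1, epsilon=0.05)
p_adv = predict(weights, bias, x_adv)

print(f"clean input:     P(class 1) = {p_clean:.3f}")
print(f"perturbed input: P(class 1) = {p_adv:.3f}")
```

No single feature moves by more than 0.05, yet the prediction flips from confident class 1 to confident class 0, because the perturbation is aligned with the weights across all 1000 dimensions at once.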
Neural Networks Cannot Model Other Minds and This Is Not Fixable
One of the capabilities that is most central to human intelligence and most completely absent from neural networks is the ability to model other minds, to represent what other people believe, want, intend, and know, and to use those representations to coordinate, communicate, and predict behavior. Developmental psychologists call this Theory of Mind, and it is so fundamental to human social life that its absence or impairment, as in certain neurodevelopmental conditions, produces severe difficulties in social function even when other cognitive capabilities are intact. Theory of Mind requires representing beliefs, which are representations of the world rather than the world itself, and it requires representing other agents as having their own belief states that may differ from your own and from reality, and it requires reasoning about how those belief states influence behavior. This is second-order representation, representation of representation, and it relies on the same kind of compositional, recursive structure that neural networks systematically fail at in the simpler cases I described in the previous section.
The false belief test is the classic experimental paradigm for assessing Theory of Mind in children and animals, and versions of it have been applied to language models with results that are instructive about what is actually going on when these models appear to understand other minds. In a false belief test, you describe a scenario where character A puts an object in location X, then leaves, then character B moves the object to location Y. You then ask: when character A comes back, where will they look for the object? The correct answer is location X, because A did not see the object move and therefore has a false belief that it is still in location X. Children younger than four years typically fail this test because they cannot yet represent that A has a different belief from the current reality. Children older than four typically pass because they have developed the capacity to represent beliefs that differ from reality. Large language models pass certain versions of this test at rates that seem high. But research into why they pass reveals that they are exploiting statistical patterns in the way this kind of story tends to be told, rather than genuinely representing character A's belief state (11). When the test is presented with novel surface forms, unfamiliar names, unusual layouts, or other variations that preserve the logical structure while disrupting the statistical patterns, performance drops substantially. When you test for genuinely novel Theory of Mind challenges that have no close analog in the training data, performance approaches chance. A human who understands the concept of false belief handles novel variations effortlessly, because they understand the concept, not just the pattern.
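The structure of the false belief test can be written down explicitly. The sketch below is a hand-written symbolic tracker, not a description of how any language model works and not part of any existing system: the `Agent` and `World` classes and their method names are hypothetical. It only illustrates what "representing character A's belief state" means operationally, namely keeping each agent's beliefs separate from reality and updating them only on events the agent witnessed.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    present: bool = True
    beliefs: dict = field(default_factory=dict)  # object -> believed location

class World:
    def __init__(self, agents):
        self.agents = {a.name: a for a in agents}
        self.locations = {}  # object -> actual location

    def move(self, obj, location):
        """Change reality, then update only the agents who witnessed it."""
        self.locations[obj] = location
        for agent in self.agents.values():
            if agent.present:
                agent.beliefs[obj] = location

    def leave(self, name):
        self.agents[name].present = False

    def enter(self, name):
        self.agents[name].present = True

    def where_will_look(self, name, obj):
        """An agent searches where they believe the object is."""
        return self.agents[name].beliefs.get(obj)

world = World([Agent("A"), Agent("B")])
world.move("object", "X")   # A puts the object in location X; both see it
world.leave("A")
world.move("object", "Y")   # B moves it while A is away
world.enter("A")

print("Reality:", world.locations["object"])
print("A will look in:", world.where_will_look("A", "object"))
print("B will look in:", world.where_will_look("B", "object"))
```

Because the answer is derived from who perceived what, renaming the characters, objects, or locations changes nothing about the result, which is the invariance the novel-surface-form variants of the test are probing for.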
The absence of genuine Theory of Mind in neural networks is not just a social limitation. It connects directly to everything I have been arguing about causal understanding and the distinction between pattern and mechanism. A genuine Theory of Mind requires modeling another agent as a causal system, a system that processes perceptions, forms beliefs based on those perceptions, takes actions based on those beliefs, and changes its beliefs when it receives new perceptions. Modeling that requires exactly the kind of causal reasoning that neural networks cannot do: you need to represent the causal chain from perception to belief, reason about what the agent would have perceived given their different vantage point, and predict how their beliefs would have formed given those perceptions. That is a causal simulation, and doing it well enough to predict behavior accurately requires something much closer to a mechanistic model of cognition than statistical associations over text about conversations between people can provide. A language model can produce fluent text about what character A believes, but it is doing so by pattern matching over the statistical regularities of how this kind of text tends to be written, not by running a causal simulation of character A's epistemic state. The difference between these two operations is not subtle, and it is not bridgeable by scale.
There is also a deeper point here about the nature of communication itself, and I want to make it carefully because it connects to the argument I made in Language is Limited. ASI is Impossible. about language being a limited medium. When humans communicate successfully, we do not just exchange words. We use words as signals that allow the listener to reconstruct the speaker's intended meaning, which requires the listener to model the speaker's beliefs, intentions, and knowledge state in order to find the interpretation of the words that is most consistent with what the speaker plausibly intends to communicate. This is Gricean communication, named after the philosopher Paul Grice, who first formalized it, and it requires Theory of Mind at every step (12). When I say "see you later" and someone understands that I am expressing a farewell and not making a literal prediction about seeing them, they are using their model of my intentions and the conversational conventions we share to interpret the words correctly. When I give instructions that are ambiguous and a student asks a clarifying question, the student is modeling my knowledge state, identifying the gap between what I said and what I could have meant, and tailoring their question to close that gap. All of this requires second-order modeling of minds, and neural networks cannot do it, which means neural networks cannot genuinely communicate in the human sense. They can only produce outputs that are statistically likely given the inputs. That is a profound limitation, and it is not a limitation that more parameters will overcome.
I want to connect this to something practical and immediately relevant, which is the use of AI systems in high-stakes communication contexts like medicine, law, therapy, and crisis intervention. These are exactly the contexts where Theory of Mind matters most, where understanding what the other person believes, fears, knows, and intends is not a nice-to-have but the core of what the professional is supposed to do. A therapist who cannot genuinely model what their patient believes about their own situation cannot do therapy. A doctor who cannot model what their patient knows about their own symptoms cannot take an accurate history. A lawyer who cannot model what their client understands about the legal process cannot give useful advice. Deploying neural network systems into these contexts, as is already happening at scale across the healthcare and legal sectors, is deploying systems that are missing the core capability that makes these contexts work. The results are going to match what the research predicts, which is that the systems perform well enough on average cases within the training distribution and fail dangerously on the unusual cases that matter most, while appearing confident throughout. I am not speculating about this. I am describing the direct consequence of using a tool that is architecturally incompatible with the task it is being asked to perform. And the people who will pay the cost of that incompatibility are the same people who always pay the cost when technology is deployed before it is ready, the patients, the clients, the users who trusted the system and were failed by it.
Let me also address the argument that neural networks paired with tools or external memory can achieve a functional equivalent of Theory of Mind, that if you give a language model access to information about a user's history and preferences, the system can model the user sufficiently for practical purposes. This argument conflates modeling with retrieval. Having access to information about someone is not the same as having a model of their mind. A file cabinet full of someone's medical records does not have a model of the patient's beliefs and fears. A language model with access to a user's conversation history does not have a model of the user's epistemic state. It has retrieved text that was produced by the user, and it can use that text to produce statistically likely responses that are conditioned on that retrieved text. This produces outputs that are often more appropriate than outputs without the context, in the same way that a doctor who has read your file is better prepared than one who hasn't. But the doctor reads your file with genuine understanding, integrating the information into a causal model of your condition. The language model retrieves the text and produces more statistically appropriate responses based on its training-data associations. The difference between those two operations is the difference between genuine modeling and informed guessing, and informed guessing is not Theory of Mind.
The Data Harvesting Truth Nobody Wants to Admit
I have written in Training Is an Evil Concept. LMMs Eliminates it Altogether. about the ethical dimensions of training on human data, and I want to return to that argument here but from a different angle, because I think there is a dimension of the data harvesting problem that is not just moral but epistemic. Neural networks need your data not just because it is useful to them but because they cannot function without it. The system has no model of the world. It has no capacity to derive predictions from first principles. It has no equations that encode mechanisms. Its entire capability rests on the statistical structure of the data it has consumed, which means that every upgrade in its capability requires consuming more data, and the data it needs is your professional knowledge, your creative work, your personal expression, and the documented experience of your life. This dependency is not incidental. It is structural, and understanding it as structural changes how you should think about the relationship between AI companies and the human beings whose data they consume.
The companies building large neural network systems need to continuously collect more data because the architecture of these systems has no principled mechanism for building durable knowledge from limited observations. A physicist who understands the laws of thermodynamics can answer thermodynamics questions they have never encountered before, because they can derive the answers from their model of the underlying mechanisms. A neural network that has achieved high performance on thermodynamics questions in the training data needs to have seen a sufficiently large and representative sample of thermodynamics questions in order to answer novel ones, and when a genuinely novel question comes along that falls outside the distribution, performance degrades. This means the companies need to keep collecting more and more specialized data to cover more and more domains, and the economic incentive to collect that data is as large as the commercial value of the systems built on it, which is now measured in hundreds of billions of dollars. The data harvesting is not a temporary phase that will end when the technology matures. It is the permanent operating condition of a technology that has made human data its fuel rather than building systems that generate their own understanding from first principles.
The harvesting also has an interesting epistemic trap built into it. Because the neural network learns statistical patterns from human-generated data, the useful patterns it can learn are bounded by the collective intelligence of humanity as expressed in text. A neural network trained on all human text is in some sense a compressed version of the collective human response to the questions that have been asked, conditioned on the training objectives and the filtering decisions made during the training process. That is genuinely impressive, and I have always been willing to say so. But it also means that the system cannot surpass the intellectual frontier of humanity in any domain, because it has no mechanism for generating insights that are not already latent in the statistical structure of human-generated text. When the system appears to produce a novel insight, it is almost always recombining patterns from its training data in a novel way, rather than genuinely discovering something new. The ability to recombine patterns in useful ways is the capability that makes these systems valuable, but it is also exactly the capability that does not constitute genuine intelligence, because genuine intelligence can advance the intellectual frontier, can discover truths that no human has yet articulated, can see the structure of reality directly rather than through the filtered lens of what humanity has already written about it. A system that learns from human text is always downstream of human intelligence, always dependent on it, always bounded by what humans have already expressed. That is not artificial general intelligence. That is a very sophisticated aggregator and recombiner of human intellectual output, and the distinction matters enormously.
I also want to say something about what happens when the data harvesting enters a loop, which is already beginning to happen as AI-generated content populates the web. Neural networks trained on internet data are now producing content that is appearing on the internet, which means that future rounds of training on internet data will train on AI-generated content. Researchers have called this model collapse, and the concern is that training on AI-generated data causes models to progressively lose fidelity to the diversity and long-tail richness of human-generated content and converge toward a narrower, more homogenized version of the training distribution (13). If model collapse is real, and the early evidence suggests it is a genuine concern, then the self-referential training loop will degrade the quality of future models in a way that is very hard to reverse, because the original human-generated data that captured the full diversity of human intelligence and experience will be increasingly diluted by AI-generated approximations of it. This is what you get when you build intelligence on data extraction rather than on principled models of reality: the data source eventually becomes contaminated by the outputs of the systems that consumed it, and the degradation is circular and self-reinforcing. An equation-based system does not face this problem because it learns from physical measurements rather than from human text, and physical reality does not collapse when AI systems start producing text about it.
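The loop can be illustrated with a toy simulation. This is a minimal analogue under strong assumptions, not a reproduction of the experiments in (13): the "model" here is just a Gaussian fitted by maximum likelihood, each generation is trained only on a small sample drawn from the previous generation's fit, and the function name, sample size, and generation count are all hypothetical choices.

```python
import random
import statistics

def simulate_collapse(generations=200, sample_size=10, seed=0):
    """Refit a Gaussian, generation after generation, to its own samples.

    Returns the fitted variance at every generation, starting with the
    original distribution at index 0.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0              # generation 0: the original data source
    variances = [std ** 2]
    for _ in range(generations):
        # The next model only ever sees output of the previous model.
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)   # maximum-likelihood estimate
        variances.append(std ** 2)
    return variances

variances = simulate_collapse()
for generation in (0, 10, 50, 100, 200):
    print(f"generation {generation:3d}: variance = {variances[generation]:.3e}")
```

The maximum-likelihood variance estimate is biased low on a finite sample, and the error compounds across generations, so the fitted distribution narrows toward a point and the tails of the original distribution disappear first.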
Why This Is Not a Property That Emerges With More Parameters
By this point in the post I have built the technical case that neural networks are architecturally incompatible with genuine intelligence across multiple dimensions: they cannot reason causally, they cannot generalize systematically, they cannot model other minds reliably, they cannot distinguish data from knowledge, and they cannot advance the intellectual frontier independently of the human data they consume. Now I want to address head-on the response that treats all of these problems as temporary engineering challenges that more scale will eventually resolve, because this response is so common and so superficially plausible that it deserves careful dismantling rather than dismissal.
The response rests on an implicit theory of emergence, the idea that qualitatively new properties can arise from quantitative changes in scale, so that a system that is not intelligent at one scale might become intelligent at a larger scale because intelligence itself is an emergent property of sufficient complexity. Emergence is a real phenomenon in complex systems, and there are genuine cases where quantitative changes produce qualitative changes in behavior. Wetness appears once enough water molecules are gathered together, even though individual molecules are not wet. Consciousness (if it is real) might emerge from sufficient complexity of neural interactions even though individual neurons are not conscious. So the emergence argument cannot be dismissed as nonsensical. But it has a crucial requirement that the people making it in the AI context consistently ignore, which is that emergence only occurs when the system being scaled has the right structural properties to support the emergent phenomenon. Ice melts into water when the temperature rises because the interactions between water molecules have the right structural properties to produce that phase transition. Motor oil put through the same change does not behave like water, and adding more of it does not help, because its molecules have different structural properties. The question that the emergence argument for AI must answer is whether neural networks have the structural properties that would allow genuine intelligence to emerge from sufficient scale, and that question has not been answered or even seriously engaged with by the people making the scaling argument. It has simply been assumed, on the basis that neural networks are large and complex and intelligence is also associated with large and complex systems in biology.
The structural incompatibilities I have documented throughout this post are the reasons to believe that neural networks do not have the right structural properties for intelligence to emerge from scale. Let me name them clearly. Neural networks do not have explicit causal models, which means that scale cannot produce causal reasoning because there is nothing in the architecture to instantiate it at any scale. Neural networks do not have compositional inference engines, which means that scale cannot produce systematic generalization because there is nothing in the architecture to support rule-based generalization at any scale. Neural networks do not have persistent belief states with logical consistency management, which means that scale cannot produce logical coherence and Theory of Mind at any scale. Neural networks do not have mechanisms for grounding their outputs in physical reality, which means that scale cannot produce the verified connection to the world that knowledge requires. Each of these is an architectural absence, not a quantitative insufficiency, and architectural absences cannot be filled by adding more of what is already there. You cannot scale an addition problem into a multiplication problem by making the numbers larger. The operations are different. Similarly, you cannot scale pattern matching into reasoning by making the patterns larger and the matching more sophisticated, because the operations are different, and the difference is constitutive rather than quantitative.
I want to be careful here not to overstate the case, because I do believe that there are genuine intelligence-relevant capabilities that emerge with scale in neural networks, and I want to be honest about that rather than pretending they do not exist. Few-shot learning, the ability to generalize from very few examples within a context window, appears to emerge at scale and is genuinely impressive. Some forms of cross-domain transfer, applying knowledge from one domain to solve problems in a related domain, also improve with scale. The ability to maintain coherence over longer contexts improves with scale. These are real improvements and I do not minimize them. But they are improvements in specific capabilities within the pattern-matching framework. They do not constitute progress toward the structural properties that genuine intelligence requires. A system that generalizes better within a context window is still a system that pattern matches over a context window. A system that transfers across more related domains is still a system that exploits statistical regularities across domains. The underlying architecture has not changed, and the properties that are missing from the architecture do not appear with more scale. They require a different architecture, different training objectives, different representations, and a fundamentally different relationship between the system and the world it is supposed to understand.
The research on what has been called "reasoning" in large language models is itself evidence against the emergence argument, rather than for it, when examined carefully. The models that are at the frontier of performance on reasoning benchmarks achieve their results partly through scale but also through fine-tuning specifically on reasoning-formatted data, test-time computation where the model produces many candidate solutions and selects the most consistent, and prompt engineering that structures the problem in ways that leverage statistical regularities in mathematical and logical text. These are all ways of making pattern matching more effective at tasks that look like reasoning. They are not evidence that genuine reasoning has emerged. The evidence for that claim would be a model that performs well on genuinely novel reasoning tasks that share no statistical overlap with the training distribution, and that is precisely what current frontier models fail at when that test condition is actually implemented. The frontier models still fail on systematic generalization tasks, still produce inconsistent beliefs, still hallucinate with confidence in unfamiliar domains, still rely on surface statistics in ways that are exposed by adversarial testing. The architecture is the constraint, and the constraint has not been loosened by scale.
What Should Exist Instead, and Why It Does Not
I want to end with what I think the alternative should look like, because I have always said that criticism is more honest when it comes paired with a direction, and I have a direction in mind. I have described it in pieces across several posts, in the argument for equations in Mathematical Equations are Multimodal by default, in the argument for training-free intelligence in Training Is an Evil Concept. LMMs Eliminates it Altogether., and in the argument for causal reasoning in All You Have Access To Is Knowledge and Tools; Never Intelligence!. Here I want to bring those pieces together and describe what a system that could actually be genuinely intelligent would need to have.
It would need to learn causal structure rather than statistical associations, to go from observing that X and Y co-occur to representing the mechanism by which X causes Y, and to use that representation to predict what will happen when X is changed by intervention rather than just when X is observed. Judea Pearl's causal inference framework provides the mathematical foundation for this, and the field of causal machine learning is developing methods for estimating causal structure from observational data (1). This is genuinely hard and has not yet been solved in a general way, but it is the right direction, because only causal knowledge supports reliable prediction under intervention, and reliable prediction under intervention is what intelligence is for.
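The gap between observing X and intervening on X can be shown with a small simulation. This is a sketch under stated assumptions: the structural causal model below is hypothetical, with a hidden common cause Z that drives both X and Y while X has no effect on Y at all, and the probabilities 0.5 and 0.9 are arbitrary. The `do_x` argument plays the role of Pearl's do-operator (1) by overriding X's own mechanism.

```python
import random

def sample(rng, do_x=None):
    """Draw (x, y) from a toy structural causal model.

    Z is a hidden common cause of X and Y; X has no causal effect on Y.
    Passing do_x replaces X's mechanism with a fixed value, which is
    what an intervention means in this framework.
    """
    z = rng.random() < 0.5
    x = (z if rng.random() < 0.9 else not z) if do_x is None else do_x
    y = z if rng.random() < 0.9 else not z
    return x, y

def prob_y_given_x(rng, n, do_x=None):
    """Estimate P(Y=1 | X=1), either observationally or under do(X=1)."""
    draws = [sample(rng, do_x) for _ in range(n)]
    ys = [y for x, y in draws if x]
    return sum(ys) / len(ys)

rng = random.Random(0)
observed = prob_y_given_x(rng, 100_000)               # analytically 0.82
intervened = prob_y_given_x(rng, 100_000, do_x=True)  # analytically 0.50

print(f"P(Y=1 | X=1)     is about {observed:.2f}")
print(f"P(Y=1 | do(X=1)) is about {intervened:.2f}")
```

A learner that only sees the observational data would conclude that X strongly predicts Y, which is true, and might then conclude that setting X changes Y, which is false. Only a representation of the mechanism separates the two.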
It would need to discover compact symbolic representations of the patterns it observes, representations that encode the mechanism rather than just the surface, and that can be inspected, verified, and extended. Symbolic regression, the area of research I described in my post on mathematical equations, is the technical approach to learning equations rather than functions, and recent progress on this front, represented by systems like AI Feynman from the Tegmark lab (14), has shown that compact symbolic representations of physical laws can be recovered from data by learning systems. This direction is slower and more difficult than deep learning, but it produces representations that are more powerful precisely because they encode the why rather than just the what.
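As a rough illustration of what "learning an equation rather than a function" means, here is a brute-force search over a tiny hypothetical grammar of monomials c * m^a * v^b. It is not the AI Feynman algorithm from (14), which is far more sophisticated. The data are synthetic, generated from the kinetic energy formula E = 0.5 * m * v^2, and the exponent range is an assumption chosen to keep the search small.

```python
import itertools

# Hypothetical noise-free measurements generated from E = 0.5 * m * v**2.
data = [(m, v, 0.5 * m * v ** 2)
        for m in (1.0, 2.0, 3.0)
        for v in (1.0, 2.0, 4.0)]

def fit_constant(a, b):
    """Least-squares constant c for the candidate E = c * m**a * v**b."""
    basis = [m ** a * v ** b for m, v, _ in data]
    targets = [e for _, _, e in data]
    c = sum(f * t for f, t in zip(basis, targets)) / sum(f * f for f in basis)
    error = sum((c * f - t) ** 2 for f, t in zip(basis, targets))
    return c, error

def search(exponents=(-2, -1, 0, 1, 2)):
    """Return (error, c, a, b) for the best-fitting monomial."""
    best = None
    for a, b in itertools.product(exponents, repeat=2):
        c, error = fit_constant(a, b)
        if best is None or error < best[0]:
            best = (error, c, a, b)
    return best

error, c, a, b = search()
print(f"E = {c:.3g} * m^{a} * v^{b}  (squared error {error:.2e})")
```

On this data the search recovers c = 0.5, a = 1, b = 2. The result is three inspectable numbers that extrapolate to any mass and velocity, whereas a fitted black-box function would only be trusted near the nine training points.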
It would need compositional inference, the ability to combine representations of concepts in novel ways that were not present in the training data, and to derive the implications of those combinations through inference rules rather than by pattern matching over the training distribution. This means something like a symbolic inference engine operating over learned representations, not the pure symbolic AI of the 1980s that failed for different reasons, but a hybrid that learns the representations from data and reasons over them with principled inference rules. The neurosymbolic research program is the most promising current approach to this, and Belle and Marcus have argued that the combination of neural learning and symbolic reasoning is the most achievable path to systematic compositionality (15). Progress has been made, and it is the right direction even if the engineering challenges are substantial.
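The symbolic half of such a hybrid can be sketched in a few lines. The example below is a plain forward-chaining engine over propositional Horn rules, a classical technique. It leaves out the learned representations entirely, and the rule base is hypothetical, so it should be read as an illustration of truth-preserving inference rather than as a neurosymbolic system.

```python
def forward_chain(facts, rules):
    """Apply Horn-clause rules until no new fact can be derived.

    Each rule is (premises, conclusion). A conclusion is added only when
    every premise is already known, so every derived fact is entailed by
    the starting facts together with the rules.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known

# Hypothetical propositional knowledge base.
rules = [
    (("rain",), "wet_ground"),
    (("wet_ground", "freezing"), "ice"),
    (("ice",), "slippery"),
]

derived = forward_chain({"rain", "freezing"}, rules)
print(sorted(derived))
```

Starting from "rain" and "freezing", the engine derives "slippery" through a chain it was never shown as a whole, and starting from "rain" alone it does not, because the conclusion is not entailed. That behavior follows from the rules rather than from how often the combination appeared in any data.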
It would need explicit modeling of the world as a causal structure that can be simulated forward, a world model in the technical sense that I described in Mathematical Equations are Multimodal by default, not just a compressed statistical representation of training data but a dynamic model that can be used to predict the future states of the world given current states and actions. World model research, represented by systems like DreamerV3 and other model-based reinforcement learning approaches, has made real progress on this front in constrained environments. The challenge is scaling world models to the open-ended complexity of the real world without losing the principled structure that makes them more than neural networks trained on simulation data.
And it would need genuine epistemic humility as an architectural property, not as a fine-tuned behavior added to a model that is by default overconfident. The system would need an explicit model of its own knowledge states, distinguishing between what it knows from principled inference from a verified causal model and what it knows from statistical association with low confidence. This is the hardest part to engineer because it requires the system to reason about itself, to maintain a model of its own epistemic limitations, and that kind of self-modeling is exactly the kind of second-order representation that neural networks struggle with. But without it, any system that presents its outputs as reliable knowledge is lying, and the system's users bear the cost of the lie.
The reason this alternative does not exist at scale today is not that it is impossible or that its theoretical foundations are weak. The foundations are strong. The reason is that it is harder to build and slower to show results than neural networks, and the AI industry is organized around showing results quickly to attract investment, and investment is the oxygen that determines what gets built. The incentive structure of the current industry is perfectly designed to produce very large neural network systems and very poor progress toward the alternative I am describing, and that is what the incentive structure has produced. I described in Technology Has Destroyed My Livelihood what it is like to be on the wrong side of an industry's incentive structure, to be doing the right thing and getting punished for it by a system that rewards the wrong thing because the wrong thing is more immediately profitable. The researchers working on causal machine learning, symbolic regression, neurosymbolic integration, and world models are on the right side of the technical argument and the wrong side of the funding structure, and that is a situation that I recognize and that fills me with something between anger and solidarity.
The lmm project is my concrete, executable response to this situation. It is written in Rust, it is built around symbolic regression and physics simulation rather than around gradient descent over text corpora, and it is a deliberate proof of concept that genuine intelligence-oriented capabilities can be built without training on human creative expression. It is not finished. It is not competitive with GPT-5 on the tasks that GPT-5 is commonly used for. But it is on the right side of the architectural argument, and I will keep building it because the alternative, building nothing and only criticizing what others have built, is the most comfortable and least useful position available to me, and comfort is the one luxury I have never been able to afford.
Till next time 👋!
References
1. Pearl, J., Causality: Models, Reasoning, and Inference, Cambridge University Press, 2009
2. Zhang, C. et al., Understanding Deep Learning Requires Rethinking Generalization, arXiv:1611.03530
3. Fodor, J. A. & Pylyshyn, Z. W., Connectionism and Cognitive Architecture: A Critical Analysis, Cognition, 1988
4. Power, A. et al., Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets, arXiv:2201.02177
5. Goodfellow, I. J. et al., Explaining and Harnessing Adversarial Examples, arXiv:1412.6572
6. Gururangan, S. et al., Annotation Artifacts in Natural Language Inference Data, arXiv:1803.02324
7. Bender, E. M. et al., On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?, ACM FAccT, 2021
8. Kadavath, S. et al., Language Models (Mostly) Know What They Know, arXiv:2207.05221
9. Strubell, E. et al., Energy and Policy Considerations for Deep Learning in NLP, arXiv:1906.02243
10. Elazar, Y. et al., Measuring and Improving Consistency in Pretrained Language Models, arXiv:2102.01017
11. Kosinski, M., Evaluating Large Language Models in Theory of Mind Tasks, arXiv:2302.02083
12. Grice, H. P., Logic and Conversation, in Syntax and Semantics, Vol. 3, P. Cole & J. Morgan (eds.), Academic Press, 1975, https://doi.org/10.1163/9789004368811_003
13. Shumailov, I. et al., The Curse of Recursion: Training on Generated Data Makes Models Forget, arXiv:2305.17493
14. Udrescu, S. M. & Tegmark, M., AI Feynman: A Physics-Inspired Method for Symbolic Regression, arXiv:1905.11481
15. Belle, V. & Marcus, G., The Future Is Neuro-Symbolic: Where Has It Been, and Where Is It Going?, Proceedings of the AAAI Conference on Artificial Intelligence, 2026