Kant's Categorical Imperative Is a Misreading of Game Theory

Immanuel Kant gave us one of the most famous tests in moral philosophy: act only according to that maxim whereby you can at the same time will that it should become a universal law. Strip away the eighteenth-century German and the categorical imperative is asking a simple, intuitive question: "what if everyone did that?" If a maxim — lying to get out of trouble, breaking promises when convenient, free-riding on public goods — produces incoherence or catastrophe when universalized, then it is forbidden. The test feels rigorous. It feels like it derives morality from pure reason, with no appeal to consequences, no appeal to God, no appeal to feeling.

It is also wrong, and it is wrong in a specific and instructive way. The categorical imperative is a pre-scientific attempt to answer a question that game theory and evolutionary biology have since answered properly. Kant was reaching, with the only tools available in 1785, for results that require the Prisoner's Dilemma, the concept of a Nash equilibrium, and the theory of evolutionarily stable strategies to state correctly. Lacking those tools, he got the structure of the problem backwards.

The Question Behind the Question

"What if everyone did that?" is a real and important question. But notice what it smuggles in: it assumes the relevant alternatives are everyone cooperates versus everyone defects. The maxim is universalized — applied to all agents at once — and we check whether the all-cooperate world or the all-defect world is coherent and livable.

This is exactly the framing game theory teaches us to distrust. The choice an actual agent faces is never "do I want everyone to do this?" It is "given what everyone else is doing, what is my best move?" Those are different questions with different answers, and the gap between them is the whole subject of game theory. Kant collapsed the second question into the first and called the result the moral law.

The Optimum Is Not the Strategy

Here is the core confusion, stated plainly. The categorical imperative identifies the collectively optimal rule — the maxim that, if followed by all, produces the best world — and then declares that a rational agent is therefore bound to follow it. But "the rule that is best if everyone follows it" and "the rule a rational agent will actually follow" are not the same thing, and the entire point of the Prisoner's Dilemma is that they come apart.

In the Prisoner's Dilemma, mutual cooperation is collectively optimal. Every party prefers the all-cooperate world to the all-defect world. And yet defection is the dominant strategy: no matter what the other player does, each individual does better by defecting. The collectively optimal outcome is not an equilibrium. Rationality, applied individually, drives agents away from the very outcome the categorical imperative would universalize.
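
The divergence can be checked mechanically. What follows is a minimal sketch, not a canonical formulation: the payoff numbers (5, 3, 1, 0) are assumed for illustration and appear nowhere in Kant or in the argument above, and the helper names `best_response` and `is_nash` are hypothetical. Any values with T > R > P > S would give the same result.

```python
# Illustrative Prisoner's Dilemma payoffs (assumed values, chosen only so that
# T > R > P > S: temptation, reward, punishment, sucker's payoff).
T, R, P, S = 5, 3, 1, 0

# payoff[(my_move, their_move)] is my payoff; "C" = cooperate, "D" = defect.
payoff = {
    ("C", "C"): R,
    ("C", "D"): S,
    ("D", "C"): T,
    ("D", "D"): P,
}


def best_response(their_move):
    """Return the move that maximizes my payoff given the other player's move."""
    return max(("C", "D"), key=lambda mine: payoff[(mine, their_move)])


def is_nash(a, b):
    """A profile is a Nash equilibrium if each move is a best response to the other."""
    return best_response(b) == a and best_response(a) == b


# Defection is the best response to either move, so it is a dominant strategy.
defect_is_dominant = all(best_response(m) == "D" for m in ("C", "D"))

# Both players prefer mutual cooperation to mutual defection.
both_prefer_cooperation = payoff[("C", "C")] > payoff[("D", "D")]

print(defect_is_dominant, both_prefer_cooperation)
print(is_nash("C", "C"), is_nash("D", "D"))
```

With these assumed payoffs, defection is dominant, both players prefer mutual cooperation, and yet only mutual defection passes the equilibrium check.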

Kant would say: but you cannot will universal defection, because the world of universal defection is terrible, therefore defection is irrational and forbidden. Game theory replies: nobody is choosing universal defection. Each agent is choosing their own defection, against a background of others' choices, and that local choice is individually rational regardless of what it would mean if universalized. The categorical imperative answers a question — "which universal rule is best?" — that no agent is actually being asked. The question agents face — "what is my best response right now?" — has a different answer, and reason does not bridge the gap. This is not a failure of virtue. It is the structure of the game.

The Equilibrium Is a Mix, Not a Universal

Now the most important correction, and the one the categorical imperative cannot survive.

The universalizability test presupposes that the stable states of a society are the uniform ones: everyone cooperating, or everyone defecting. The whole rhetorical force of "what if everyone did that?" depends on this all-or-nothing framing. But the actual stable state of almost every real cooperation problem is neither. It is a mixed equilibrium: a population that is mostly cooperators with a persistent minority of defectors who survive precisely because they are a minority.

This is a central result of evolutionary game theory, and it demolishes the universalizability test at the root. Consider the Hawk-Dove game, or any model in which the payoff to cheating falls as cheaters become more common. When cheaters are rare, they are surrounded by cooperators to exploit, so cheating pays handsomely and the cheater frequency rises. When cheaters become common, they mostly run into each other, the exploitable cooperators are used up, and cheating stops paying, so the cheater frequency falls. The two forces balance at the intermediate frequency where cheating and cooperating pay equally. In the Hawk-Dove game that frequency is V/C, the value of the contested resource divided by the cost of losing a fight, whenever the cost exceeds the value. The equilibrium is not 0% defectors and it is not 100% defectors. It is, say, 5%, or 15%, whatever the payoff structure dictates. This is a mixed evolutionarily stable strategy: a mixture that, once reached, resists invasion from either direction.
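
The balancing act can be simulated. The sketch below runs a discrete replicator dynamic on the standard Hawk-Dove payoffs, with "hawk" standing in for the cheater. The specific numbers (V = 3, C = 20, a background fitness of 10) are assumptions picked so that the stable hawk share, V/C, comes out at 15%. The function names are hypothetical, and a real population would be noisier than this deterministic model.

```python
# Hawk-Dove with assumed payoffs: V = value of the contested resource,
# C = cost of losing an escalated fight (C > V). "Hawk" plays the cheater.
V, C = 3.0, 20.0
BASELINE = 10.0  # assumed background fitness so every fitness stays positive


def hawk_payoff(p):
    """Expected payoff to a hawk when a fraction p of the population are hawks."""
    return p * (V - C) / 2 + (1 - p) * V


def dove_payoff(p):
    """Expected payoff to a dove when a fraction p of the population are hawks."""
    return (1 - p) * V / 2


def next_generation(p):
    """One step of the discrete replicator dynamic for the hawk fraction."""
    w_hawk = BASELINE + hawk_payoff(p)
    w_dove = BASELINE + dove_payoff(p)
    mean_fitness = p * w_hawk + (1 - p) * w_dove
    # A strategy's share grows in proportion to its fitness relative to the mean.
    return p * w_hawk / mean_fitness


def settle(p0, generations=2000):
    """Iterate the dynamic from an initial hawk fraction p0."""
    p = p0
    for _ in range(generations):
        p = next_generation(p)
    return p


from_rare = settle(0.01)    # hawks start rare and increase
from_common = settle(0.90)  # hawks start common and decline
print(round(from_rare, 4), round(from_common, 4), V / C)
```

Under these assumptions both runs settle at the same hawk share, V/C = 0.15, whether hawks start as 1% or 90% of the population. That is the sense in which the mixture resists invasion from either direction.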

The defectors at equilibrium are not a bug to be moralized away. They are a structural feature. They exist because cooperation is common. They are parasites on a host of cooperators, and the host's very abundance is what feeds them. Defection is rare not because reason or conscience suppresses it, but because its own success destroys the conditions that made it profitable. The minority of cheats is held in place by the same mathematics that holds the majority of cooperators in place.

This is fatal to Kant for a simple reason. The categorical imperative asks you to imagine your maxim universalized, adopted by everyone. But wherever cheating pays less as it spreads, "everyone defects" is not a state the system ever settles into or tends toward. It is an off-equilibrium fantasy. The real social world is not poised between two pure worlds, one heavenly and one hellish, with reason commanding us toward the heavenly one. It sits, stably, at a mixture, and that mixture includes the defectors as a permanent fraction. Asking "what if everyone lied?" is asking about a counterfactual that the dynamics of the situation actively prevent from ever occurring. The defector does not need everyone to defect. He needs almost everyone not to. His strategy is parasitic by design, and universalizing it misdescribes what it even is.

Evolution Builds Strategies Against the Common Good

Why should we expect this mixed, partly-parasitic equilibrium rather than the clean universal cooperation Kant imagined? Because evolution — the process that actually produced both human behavior and human moral intuitions — does not optimize for the common good. It optimizes for relative fitness. A gene spreads if it does better than its alternatives in the current population, full stop. It does not matter whether the gene's spread makes the group worse off. It does not matter whether a world of such genes would be a catastrophe. Selection is blind to the universalized consequence and sensitive only to the local, marginal, individual advantage.
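
A small simulation makes the blindness concrete. The sketch below assumes a one-shot Prisoner's Dilemma with illustrative payoffs (5, 3, 1, 0), a background fitness of 1, and random pairing. It also assumes none of the reputation, punishment, or frequency-dependent brakes discussed elsewhere in this essay. The function names are hypothetical. It tracks the defector share alongside the average payoff of the whole population.

```python
T, R, P, S = 5.0, 3.0, 1.0, 0.0  # assumed one-shot Prisoner's Dilemma payoffs
BASELINE = 1.0                   # assumed background fitness, keeps fitness positive


def payoffs(x):
    """Expected payoffs (cooperator, defector) when a fraction x defects."""
    cooperator = (1 - x) * R + x * S
    defector = (1 - x) * T + x * P
    return cooperator, defector


def mean_payoff(x):
    """Average payoff across the whole population at defector fraction x."""
    cooperator, defector = payoffs(x)
    return (1 - x) * cooperator + x * defector


def next_generation(x):
    """One replicator step: defectors grow by their fitness relative to the mean."""
    _, defector = payoffs(x)
    return x * (BASELINE + defector) / (BASELINE + mean_payoff(x))


share = 0.01  # defectors start as 1% of the population
defector_share = [share]
for _ in range(60):
    share = next_generation(share)
    defector_share.append(share)

group_payoff = [mean_payoff(s) for s in defector_share]
print(round(defector_share[0], 2), round(defector_share[-1], 2))
print(round(group_payoff[0], 2), round(group_payoff[-1], 2))
```

In this stripped-down model the defector share climbs every generation because defectors always out-earn their cooperating neighbors, while the population's average payoff falls from roughly 3 toward 1. Selection tracks the first quantity and ignores the second.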

This is precisely the Prisoner's Dilemma logic instantiated in biology, and it runs in the opposite direction from the categorical imperative. Where Kant says adopt the maxim whose universalization is best, evolution says adopt the variant whose local advantage is greatest, and let the universalized consequence fall where it may. Selfishness, cheating, free-riding, deception — these are not moral failures that reason corrects. They are strategies that evolution actively builds and maintains, at whatever frequency the payoffs support, because each one pays for the individual that carries it even as it taxes everyone else.

And it is the same evolution that built our moral sense. Our intuitions of fairness, reciprocity, guilt, and indignation are not a faculty of pure reason peering at the moral law. They are an evolved toolkit for navigating repeated games — for rewarding cooperators, punishing defectors, tracking reputation, and managing the very mixed equilibrium described above. Kant mistook this toolkit for a window onto a priori truth. He felt the pull of his evolved cooperative instincts, noticed they had a roughly universal form, and concluded that pure reason had discovered a universal law. What he had actually discovered was the psychological machinery that evolution installed to keep us functioning as the cooperator majority in a mixed population — defectors and all.

What Survives

None of this means cooperation is doomed or that "be moral" is a confusion. It means morality is not what Kant thought it was. Cooperation is real, valuable, and achievable — but it is achieved through the mechanisms game theory actually identifies: repeated interaction, reputation, the ability to punish defectors, institutions that change the payoff matrix so that cooperation becomes individually rational rather than individually costly. You sustain cooperation not by willing a universal maxim but by engineering the game so that defection stops paying — by raising the cost of cheating until the parasitic minority shrinks.
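
Two of these mechanisms can be put in numbers. The sketch below again uses assumed Prisoner's Dilemma payoffs (5, 3, 1, 0). The `fine` parameter is a hypothetical penalty that some institution levies on every defection. `grim_trigger_threshold` applies the standard textbook condition for an infinitely repeated game with discount factor delta: permanent cooperation, worth R / (1 - delta), must be at least as good as defecting once and being punished forever, worth T + delta * P / (1 - delta). Both are illustrations under stated assumptions, not models of any particular institution.

```python
T, R, P, S = 5.0, 3.0, 1.0, 0.0  # assumed Prisoner's Dilemma payoffs


def payoff(mine, theirs, fine=0.0):
    """My payoff for one round; `fine` is a hypothetical penalty on defecting."""
    base = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}
    return base[(mine, theirs)] - (fine if mine == "D" else 0.0)


def cooperation_is_dominant(fine):
    """True if cooperating strictly beats defecting against either move."""
    return all(
        payoff("C", theirs, fine) > payoff("D", theirs, fine)
        for theirs in ("C", "D")
    )


def grim_trigger_threshold():
    """Smallest discount factor at which grim trigger sustains cooperation.

    Rearranging R / (1 - d) >= T + d * P / (1 - d) gives d >= (T - R) / (T - P).
    """
    return (T - R) / (T - P)


print(cooperation_is_dominant(0.0), cooperation_is_dominant(2.5))
print(grim_trigger_threshold())
```

With these numbers the script prints `False True` and then `0.5`. With no fine, defection stays dominant. Any fine above 2 makes cooperation the dominant move. With no fine at all, players who weight the next round at least half as heavily as the current one can sustain cooperation through the threat of retaliation. In both cases the outcome changes because the game changed, not because anyone's will did.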

That is the right reading of the question Kant was fumbling toward. "What if everyone did that?" is not a test a rational agent passes by introspection. It is a description of a cooperation problem whose solution is a matter of incentive design, enforcement, and the structure of repeated play. Kant asked a genuine question with the wrong tools and arrived at a confident answer that puts the burden in the wrong place: on the individual will rather than on the structure of the game. He took the collective optimum for the individual's obligation, took the uniform world for the stable one, and took his evolved instincts for pure reason.

Conclusion

The categorical imperative is a beautiful mistake. It senses, correctly, that there is something special about maxims that everyone could adopt — and there is: those maxims describe the cooperative optimum. But it then commits three errors that game theory and evolution expose. It confuses the collective optimum with the rational individual strategy, when the Prisoner's Dilemma shows these diverge. It assumes the relevant states are the uniform ones, when the real equilibrium is a stable mixture in which a defecting minority permanently free-rides on the cooperating majority. And it mistakes our evolved cooperative instincts for the deliverances of pure reason, when they are in fact the very machinery selection built to manage that mixed equilibrium. Morality is not a universal law legible to reason. It is the ongoing, never-finished engineering of incentives in a population that evolution stocked with both cooperators and the cheats who feed on them. Kant asked the right question. He just lived two centuries too early to get the answer right.