LLMs Play Mini-Mafia: A Game-Theoretic Benchmark for Social Intelligence
A new analytical framework, Mini-Mafia, simplifies the social deduction game Mafia to four players, with mafioso, detective, and villager roles and a fixed night phase, to model interactions between large language models. The mafia win rate p follows logit(p) = v × (m − d), where m, d, and v measure the mafioso's deception, the detective's disclosure, and the villager's detection, respectively. This yields the Mini-Mafia Benchmark, which applies Bayesian inference to gameplay data so that the evaluation of LLM social intelligence rests on a theoretical model rather than being purely empirical.
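To make the win-rate relation concrete, the following minimal Python sketch inverts logit(p) = v × (m − d) to recover the mafia win probability p. The function name `mafia_win_rate` and the parameter values are hypothetical, chosen only for illustration; they are not taken from the benchmark.

```python
import math


def mafia_win_rate(m: float, d: float, v: float) -> float:
    """Return p such that logit(p) = v * (m - d).

    m, d, v stand for deception, disclosure, and detection, as in the text.
    """
    z = v * (m - d)
    # Numerically stable logistic function (inverse of the logit).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical values, for illustration only.
balanced = mafia_win_rate(m=0.0, d=0.0, v=1.0)   # deception equals disclosure
favoured = mafia_win_rate(m=1.0, d=0.0, v=2.0)   # deception exceeds disclosure
print(balanced, favoured)
```

Under this relation, the mafia is favoured whenever m exceeds d, and v scales how strongly the gap between m and d moves the win rate.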
Key facts
- Mini-Mafia is a four-player simplification of the social deduction game Mafia.
- The game has mafioso, detective, and villager roles, and its night phase is fixed.
- Mafia win-rate formula: logit(p) = v × (m − d), where p is the mafia win rate.
- m represents the mafioso's deception capability.
- d represents the detective's disclosure capability.
- v represents the villager's detection capability.
- The framework forms the basis of the Mini-Mafia Benchmark.
- The benchmark uses Bayesian inference over gameplay data.
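As a rough illustration of Bayesian inference over gameplay data, the sketch below applies a generic Beta-Binomial update to a mafia win count and converts the posterior mean to the logit scale, where it would estimate v × (m − d). This is an assumption-laden stand-in, not the benchmark's actual inference procedure, which is not described here; the function names and the game counts are hypothetical.

```python
import math


def win_rate_posterior(wins: int, games: int,
                       prior_a: float = 1.0, prior_b: float = 1.0):
    """Beta-Binomial update for the mafia win rate.

    Returns the posterior mean and standard deviation under a
    Beta(prior_a, prior_b) prior (uniform by default).
    """
    a = prior_a + wins
    b = prior_b + (games - wins)
    mean = a / (a + b)
    var = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)


def logit(p: float) -> float:
    """Log-odds of p, the left-hand side of logit(p) = v * (m - d)."""
    return math.log(p / (1.0 - p))


# Hypothetical counts, for illustration only (not real benchmark data).
mean, std = win_rate_posterior(wins=30, games=100)
print(mean, std, logit(mean))
```

A single matchup only pins down the product v × (m − d); separating m, d, and v would need games across several model pairings.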
Entities
—