LLMs Show Variable Reasoning Compared to Humans in Conditional Inference Study

ai-technology · 2026-05-22

A new study on arXiv (2605.21299) examines whether large language models (LLMs) reason like humans when interpreting conditional statements. The researchers ran a population-matching experiment with 25 LLMs across four languages, comparing the models with an equal number of human participants in each language. Humans consistently enrich logical reasoning with pragmatic inferences, reading implied meanings beyond the literal statement. LLM behavior is more variable: some models follow truth-table logic perfectly but ignore pragmatic nuances, while others deviate from the truth tables and adhere to a single interpretation across contexts. The study highlights persistent gaps between LLM reasoning and human-like reasoning.
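To make the contrast between a truth-table reading and a pragmatically enriched reading concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the enrichment is the textbook "conditional perfection" reading, in which "if p then q" is also taken to imply "if not p then not q". The summary does not say which pragmatic inferences the study actually measured, and the function names below are hypothetical.

```python
from itertools import product


def material_conditional(p: bool, q: bool) -> bool:
    """Truth-table reading of 'if p then q': false only when p is true and q is false."""
    return (not p) or q


def perfected_conditional(p: bool, q: bool) -> bool:
    """Illustrative enriched reading (assumption, not from the study):
    'if p then q' is also taken to mean 'if not p then not q',
    which makes it equivalent to a biconditional."""
    return p == q


def compare_readings():
    """Return (p, q, literal, enriched) for every combination of truth values."""
    return [
        (p, q, material_conditional(p, q), perfected_conditional(p, q))
        for p, q in product([True, False], repeat=2)
    ]


def disagreements():
    """Return the (p, q) cases where the two readings give different verdicts."""
    return [(p, q) for p, q, literal, enriched in compare_readings() if literal != enriched]


if __name__ == "__main__":
    for p, q, literal, enriched in compare_readings():
        print(f"p={p!s:5} q={q!s:5} literal={literal!s:5} enriched={enriched!s:5}")
    print("Readings differ when (p, q) is:", disagreements())
```

Under these assumptions the two readings disagree only when the antecedent is false and the consequent is true. A responder that always answers according to the first function would match the truth table while missing the enriched interpretation that this sketch attributes to pragmatic reasoning.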

Key facts

Study published on arXiv with ID 2605.21299
Population-matching experiment with 25 LLMs and an equal number of human participants per language
Four languages tested
Humans use pragmatic inferences across languages
LLMs show variable reasoning: some follow truth tables, others do not
Some LLMs ignore pragmatic inferences entirely
Some LLMs adhere to a single interpretation across contexts
