GPT-4.5 Passes Turing Test in Landmark UC San Diego Study
A recent study from UC San Diego, published in the Proceedings of the National Academy of Sciences, reports that a modern AI system can pass the Turing test. When prompted to adopt a persona that mimicked human quirks, tone, and humor, GPT-4.5 was judged to be human 73% of the time. With the same kind of persona prompt, Meta's LLaMa-3.1-405B was judged human 56% of the time, a rate statistically indistinguishable from that of real people. Older systems such as ELIZA and GPT-4o were judged human only 23% and 21% of the time, respectively. Without persona prompts, GPT-4.5's rate dropped to 36% and LLaMa-3.1's to 38%. The study, which involved nearly 500 participants, was led by Cameron Jones of Stony Brook University and coauthored by Ben Bergen of UC San Diego.
Key facts
- GPT-4.5 was judged human 73% of the time with persona prompts.
- LLaMa-3.1-405B achieved a 56% human rating, statistically indistinguishable from humans.
- Without persona prompts, GPT-4.5 fell to 36% and LLaMa-3.1 to 38%.
- ELIZA and GPT-4o were selected as human only 23% and 21% of the time, respectively.
- The study was published in the Proceedings of the National Academy of Sciences.
- Nearly 500 participants took part in the experiments.
- Conversations lasted 5 minutes in the main study and 15 minutes in a replication.
- The researchers built an online interface at turingtest.live.
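To illustrate what "statistically indistinguishable from humans" can mean for a rate like 56%, here is a minimal sketch of an exact two-sided binomial test against a 50% chance rate. Everything in it is an assumption made for illustration: the 50% baseline, the function name `binom_two_sided_p`, and especially the count of 100 judgments per condition, which is hypothetical because this summary does not report per-condition sample sizes. It is not necessarily the analysis the researchers used.

```python
from math import comb


def binom_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probability of every outcome that is no more likely
    than the observed one under the null success rate p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Small tolerance so outcomes tied with the observed one are included.
    threshold = pmf[k] * (1 + 1e-9)
    return min(1.0, sum(q for q in pmf if q <= threshold))


# HYPOTHETICAL counts: 100 judgments per condition is an assumption,
# chosen only so the reported percentages map to whole numbers.
N_JUDGMENTS = 100
judged_human = {
    "GPT-4.5 (persona)": 73,
    "LLaMa-3.1-405B (persona)": 56,
    "GPT-4.5 (no persona)": 36,
}

for model, k in judged_human.items():
    p_value = binom_two_sided_p(k, N_JUDGMENTS)
    verdict = "differs from chance" if p_value < 0.05 else "consistent with chance"
    print(f"{model}: {k}/{N_JUDGMENTS} judged human, p = {p_value:.4g} ({verdict})")
```

Under these assumed counts, a 56% rate is consistent with judges guessing at random, while 73% and 36% are not. With a different number of judgments the conclusions could change, so the real sample sizes matter.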
Entities
Institutions
- University of California San Diego
- Proceedings of the National Academy of Sciences
- Stony Brook University
- Meta
- Prolific
- SONA system
Locations
- San Diego
- United States