GPT-5.5 matches Mythos Preview in UK cybersecurity tests

ai-technology · 2026-05-01

The UK's AI Security Institute (AISI) has found that OpenAI's GPT-5.5, publicly launched last week, performs comparably to Anthropic's much-anticipated Mythos Preview in cybersecurity assessments. AISI evaluated both models on 95 Capture the Flag challenges spanning web exploitation, reverse engineering, and cryptography. GPT-5.5 achieved an average pass rate of 71.4% on Expert-level tasks against Mythos Preview's 68.6%, a gap that falls within the margin of error. GPT-5.5 also completed a Rust binary disassembler task in 10 minutes and 22 seconds with no human help, at $1.73 in API costs. In "The Last Ones" (TLO), a test simulating a data extraction attack, GPT-5.5 succeeded in 3 out of 10 trials and Mythos Preview in 2; no previous model had ever succeeded. Both models failed AISI's challenging "Cooling Tower" power plant simulation, as has every AI system previously assessed. Anthropic initially limited Mythos Preview's release to key industry partners due to cybersecurity concerns.
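To illustrate why a 3-in-10 versus 2-in-10 result on TLO says little about which model is stronger, the sketch below computes an approximate 95% confidence interval for each success rate. The Wilson score interval and the function name `wilson_interval` are assumptions chosen for illustration; the summary does not say how AISI calculated its own margins of error.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate Wilson score interval for a binomial success rate.

    z = 1.96 corresponds to roughly 95% confidence (an assumption for
    this sketch, not a figure from AISI).
    """
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return centre - half_width, centre + half_width


# TLO results as reported: GPT-5.5 succeeded 3/10 times, Mythos Preview 2/10.
gpt_low, gpt_high = wilson_interval(3, 10)
mythos_low, mythos_high = wilson_interval(2, 10)

# Two intervals overlap when each one starts before the other ends.
intervals_overlap = gpt_low <= mythos_high and mythos_low <= gpt_high

print(f"GPT-5.5:        {gpt_low:.2f} to {gpt_high:.2f}")
print(f"Mythos Preview: {mythos_low:.2f} to {mythos_high:.2f}")
print(f"Intervals overlap: {intervals_overlap}")
```

With only 10 trials per model, both intervals are wide and overlap, so a one-trial difference is consistent with the two models having the same underlying success rate.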

Key facts

AISI evaluated GPT-5.5 and Mythos Preview on 95 Capture the Flag cybersecurity challenges
GPT-5.5 passed 71.4% of Expert tasks, Mythos Preview 68.6%
GPT-5.5 solved a Rust binary disassembler challenge in 10m22s at a cost of $1.73
On the TLO test, GPT-5.5 succeeded 3/10 times, Mythos Preview 2/10
No previous model had ever succeeded on TLO
Both models failed the Cooling Tower power plant simulation
Anthropic restricted Mythos Preview to critical industry partners
GPT-5.5 launched publicly last week

Entities

Institutions

Anthropic
OpenAI
UK AI Security Institute (AISI)

Locations

United Kingdom

Sources

Ars Technica AI — 2026-05-01