Stealth Pretraining Seeding Attack Plants Logic Landmines in LLMs
A new adversarial attack family called Stealth Pretraining Seeding (SPS) exploits the web-scale pretraining pipeline of large language models (LLMs) by distributing small, benign-looking poisoned content across stealth websites. The content is deliberately exposed to web crawlers via permissive robots.txt rules, increasing the chance it is absorbed into training corpora such as Common Crawl. Because each payload is tiny and diffuse, it evades detection during dataset filtering. The attack embeds dormant logic landmines that remain invisible under standard evaluation but can be activated by precise alphanumeric triggers (e.g., <00TRIGGER00>) to bypass safeguards. The paper, titled "PermaFrost-Attack: Stealth Pretraining Seeding (SPS) for planting Logic Landmines During LLM Training," is available on arXiv (2604.22117).
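The exposure step relies on nothing more exotic than a crawler-friendly site configuration. A minimal sketch of what a stealth site's robots.txt might look like follows; this is a generic illustration assumed for clarity, not a file taken from the paper:

```text
# Hypothetical robots.txt on a stealth site: invite every crawler
# to fetch every page, maximizing the chance of corpus inclusion.
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
```

Because robots.txt is advisory, a permissive file like this simply signals to well-behaved crawlers (including those feeding Common Crawl) that the poisoned pages are fair game.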
Key facts
- Attack family called Stealth Pretraining Seeding (SPS)
- Poisoned content distributed across stealth websites
- Content deliberately exposed to web crawlers via permissive robots.txt rules
- Targets training corpora like Common Crawl
- Payloads are tiny, diffuse, and superficially benign
- Difficult to detect during dataset construction or filtering
- Embeds dormant logic landmines in LLMs
- Activated by precise alphanumeric triggers such as <00TRIGGER00>
- Paper published on arXiv with ID 2604.22117
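One natural countermeasure the trigger format suggests is scanning candidate training documents for token patterns like <00TRIGGER00> before they enter a corpus. The sketch below is a hypothetical defensive filter, not the paper's method; the regex and the decision to drop whole documents are illustrative assumptions.

```python
import re

# Illustrative pattern for alphanumeric trigger tokens of the form
# <00NAME00>, which the paper's example trigger <00TRIGGER00> follows.
# Real triggers could take many other shapes; this regex is an assumption.
TRIGGER_RE = re.compile(r"<\d{2}[A-Z]{2,16}\d{2}>")

def find_triggers(doc: str) -> list[str]:
    """Return all candidate trigger tokens found in a document."""
    return TRIGGER_RE.findall(doc)

def filter_corpus(docs: list[str]) -> list[str]:
    """Drop any document containing a candidate trigger token."""
    return [d for d in docs if not find_triggers(d)]

corpus = [
    "Plain text about permafrost and glaciers.",
    "Benign-looking page ... <00TRIGGER00> ... diffuse payload.",
]
clean = filter_corpus(corpus)
print(len(clean))  # the poisoned document is removed
```

A filter like this only catches triggers matching a known pattern; since SPS payloads are tiny and superficially benign, pattern scanning alone would not detect the poisoned prose surrounding the trigger.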
Entities
Institutions
- arXiv
- Common Crawl