HTell: Data-Free Backdoor Detection via Head Random Probing

ai-technology · 2026-05-20

A new method called HTell detects backdoor attacks in deep neural networks (DNNs) without requiring clean data, model gradients, or trigger reconstruction. It feeds random latent probes into the model's prediction head and analyzes class-wise response statistics; backdoored models concentrate their responses abnormally on the attack's target class. The probes are generated in an architecture-aware way, and the method targets practical model-auditing scenarios. The authors evaluate HTell on a large-scale benchmark and report fast, lightweight detection.
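The probing idea can be illustrated with a toy sketch. This is not HTell's implementation: the linear head, the Gaussian probes, the inflated-bias "backdoor", and the median/MAD concentration score are all assumptions chosen to keep the example small, and every function name is hypothetical. In particular, the plain Gaussian probes ignore the architecture-aware probe generation that the summary mentions.

```python
import random
import statistics


def make_head(num_classes, dim, rng):
    """Toy linear prediction head: Gaussian weights, zero biases (hypothetical stand-in)."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_classes)]
    return weights, [0.0] * num_classes


def head_logits(head, z):
    """Logits of the linear head for one latent vector z."""
    weights, biases = head
    return [sum(w * x for w, x in zip(row, z)) + b for row, b in zip(weights, biases)]


def probe_class_shares(head, dim, num_probes, rng):
    """Feed random Gaussian latents into the head; return each class's share of argmax wins."""
    num_classes = len(head[1])
    wins = [0] * num_classes
    for _ in range(num_probes):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        logits = head_logits(head, z)
        wins[logits.index(max(logits))] += 1
    return [w / num_probes for w in wins]


def concentration_score(shares):
    """Assumed statistic: how far the top class's share sits above the median, in MAD units."""
    med = statistics.median(shares)
    mad = statistics.median([abs(s - med) for s in shares]) or 1e-12
    top = shares.index(max(shares))
    return top, (shares[top] - med) / mad


NUM_CLASSES, DIM, NUM_PROBES, TARGET = 10, 16, 2000, 3

clean_head = make_head(NUM_CLASSES, DIM, random.Random(0))
# Toy "backdoor" (an assumption, not a real attack): same weights,
# but the target class's bias is inflated so it dominates random inputs.
bd_biases = list(clean_head[1])
bd_biases[TARGET] = 20.0
backdoored_head = (clean_head[0], bd_biases)

# Both heads see the same probes, since both probe generators use the same seed.
clean_shares = probe_class_shares(clean_head, DIM, NUM_PROBES, random.Random(1))
bd_shares = probe_class_shares(backdoored_head, DIM, NUM_PROBES, random.Random(1))

clean_top, clean_score = concentration_score(clean_shares)
bd_top, bd_score = concentration_score(bd_shares)
print("clean:", clean_top, round(clean_score, 2))
print("backdoored:", bd_top, round(bd_score, 2))
```

In this toy setting the clean head spreads its wins across classes, while the tampered head sends almost every probe to the target class, so its concentration score is far larger. A real audit would need a calibrated threshold and probes matched to the model's latent space, neither of which this sketch attempts.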

Key facts

HTell is a backdoor detection method for DNNs.
It uses head random probing without data or gradients.
Backdoored models exhibit abnormal response concentration on the target class.
HTell generates architecture-aware random latent probes.
It analyzes class-wise response statistics.
No clean or surrogate data is needed.
No iterative trigger reconstruction is required.
It is evaluated on a large-scale benchmark.

Sources

arXiv cs.AI — 2026-05-20