Sentra-Guard: Real-Time Defense Against Adversarial LLM Prompts

ai-technology · 2026-05-04

Researchers have introduced Sentra-Guard, a modular, real-time defense system for detecting and mitigating jailbreak and prompt injection attacks against large language models (LLMs). The system uses a hybrid architecture that combines FAISS-indexed SBERT embeddings, which capture the semantic meaning of a prompt, with fine-tuned transformer classifiers that distinguish benign from malicious inputs. A classifier-retriever fusion module dynamically computes context-aware risk scores. Sentra-Guard handles both direct and obfuscated attack vectors, and it supports multilingual input through a language-agnostic preprocessing layer that translates non-English prompts into English before evaluation. The paper is available on arXiv under the identifier 2510.22628.

Key facts

Sentra-Guard is a real-time modular defense system.
It detects jailbreak and prompt injection attacks on LLMs.
Uses FAISS-indexed SBERT embeddings and fine-tuned transformer classifiers.
Features a classifier-retriever fusion module for context-aware risk scoring.
Handles direct and obfuscated attack vectors.
Includes a language-agnostic preprocessing layer for multilingual support.
Paper available on arXiv: 2510.22628.
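
To make the classifier-retriever fusion idea concrete, here is a minimal Python sketch. It is not the paper's implementation: the summary above does not give the fusion formula, so the weighted sum, the `alpha` weight, the top-k voting rule, and all function names are assumptions chosen for illustration. Plain Python lists stand in for SBERT embeddings, and a brute-force cosine search stands in for a FAISS index.

```python
import math
from typing import List, Sequence, Tuple

# Each index entry is (embedding, label), where label 1 = known malicious
# prompt and label 0 = known benign prompt. Hypothetical toy data model.
IndexEntry = Tuple[Sequence[float], int]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieval_score(query: Sequence[float], index: List[IndexEntry], k: int = 2) -> float:
    """Similarity-weighted share of malicious labels among the k nearest entries.

    Brute-force search is used here; a real system would query a vector
    index such as FAISS instead.
    """
    nearest = sorted(
        ((cosine(query, vec), label) for vec, label in index),
        reverse=True,
    )[:k]
    total = sum(max(sim, 0.0) for sim, _ in nearest)
    if total == 0.0:
        return 0.0
    return sum(max(sim, 0.0) * label for sim, label in nearest) / total


def fused_risk(classifier_prob: float, retriever_score: float, alpha: float = 0.6) -> float:
    """Assumed fusion rule: a convex combination of the two signals."""
    return alpha * classifier_prob + (1.0 - alpha) * retriever_score


# Toy index: two malicious-looking vectors and one benign one.
toy_index: List[IndexEntry] = [
    ([1.0, 0.0], 1),
    ([0.9, 0.1], 1),
    ([0.0, 1.0], 0),
]

# classifier_prob would come from a fine-tuned transformer; 0.5 is a placeholder.
risk = fused_risk(classifier_prob=0.5, retriever_score=retrieval_score([1.0, 0.0], toy_index))
```

In this sketch a prompt that sits close to known attacks in embedding space can still receive a high risk score when the classifier is unsure, which is the intuition behind pairing a retriever with a classifier. How Sentra-Guard actually weights the two signals is described in the paper, not here.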
