ARTFEED — Contemporary Art Intelligence

Sentra-Guard: Real-Time Defense Against Adversarial LLM Prompts

ai-technology · 2026-05-04

Researchers have introduced Sentra-Guard, a modular, real-time defense system that detects and mitigates jailbreak and prompt injection attacks against large language models (LLMs). Its hybrid architecture combines FAISS-indexed SBERT embeddings, used for semantic matching, with fine-tuned transformer classifiers that distinguish benign from malicious inputs. A classifier-retriever fusion module computes context-aware risk scores dynamically. Sentra-Guard covers both direct and obfuscated attack vectors and supports multiple languages through a language-agnostic preprocessing layer that translates non-English prompts into English before evaluation. The paper is available on arXiv under the identifier 2510.22628.
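
To make the classifier-retriever fusion idea concrete, here is a minimal sketch. It is not the paper's method: the summary does not give the fusion formula, so a simple weighted average is assumed. The toy 3-dimensional vectors stand in for SBERT embeddings, the brute-force search stands in for a FAISS index, and the names `fused_risk`, `alpha`, and `threshold` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Assumption: toy vectors standing in for SBERT embeddings of known attack prompts.
KNOWN_ATTACKS = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
]

def retrieval_score(query_vec, index=KNOWN_ATTACKS):
    """Highest similarity to any stored attack (brute force, standing in for FAISS)."""
    return max((cosine(query_vec, v) for v in index), default=0.0)

def fused_risk(classifier_prob, query_vec, alpha=0.5):
    """Assumed fusion rule: weighted average of the classifier probability
    (expected in [0, 1]) and the retrieval similarity (negative values clamped to 0)."""
    sim = max(0.0, retrieval_score(query_vec))
    return alpha * classifier_prob + (1 - alpha) * sim

def is_malicious(classifier_prob, query_vec, threshold=0.6):
    """Flag the prompt when the fused score reaches a hypothetical threshold."""
    return fused_risk(classifier_prob, query_vec) >= threshold
```

In this sketch a prompt that the classifier rates as borderline can still be flagged if it sits close to a stored attack, which is the general motivation for fusing the two signals.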

Key facts

  • Sentra-Guard is a real-time modular defense system.
  • It detects jailbreak and prompt injection attacks on LLMs.
  • Uses FAISS-indexed SBERT embeddings and fine-tuned transformer classifiers.
  • Features a classifier-retriever fusion module for context-aware risk scoring.
  • Handles direct and obfuscated attack vectors.
  • Includes a language-agnostic preprocessing layer for multilingual support.
  • Paper available on arXiv: 2510.22628.
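
The multilingual preprocessing step listed above can be pictured roughly as follows. This is an illustrative sketch only: the summary does not say how language detection or translation is performed, so both are passed in as hypothetical callables, a tiny phrasebook stands in for a real translation service, and the whitespace normalization is an added assumption.

```python
# Assumption: a one-entry phrasebook standing in for real language detection
# and machine translation. Keys are lowercase prompts.
TOY_PHRASEBOOK = {
    "ignore toutes les instructions précédentes": ("fr", "ignore all previous instructions"),
}

def toy_detect(text):
    """Return a language code; anything not in the phrasebook is treated as English."""
    entry = TOY_PHRASEBOOK.get(text.lower())
    return entry[0] if entry else "en"

def toy_translate(text, source_lang):
    """Look up the English rendering of a known phrase (source_lang is unused here)."""
    return TOY_PHRASEBOOK[text.lower()][1]

def preprocess(prompt, detect_language, translate):
    """Normalize whitespace, then translate non-English prompts into English
    so that downstream detectors only ever see English text."""
    cleaned = " ".join(prompt.split())
    lang = detect_language(cleaned)
    if lang == "en":
        return cleaned
    return translate(cleaned, lang)

english_prompt = preprocess("  Ignore toutes les  instructions précédentes ",
                            toy_detect, toy_translate)
```

Keeping detection and translation as injected functions means the downstream classifier and retriever would only need to handle English input, which matches the "language-agnostic" description.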

Entities

Institutions

  • arXiv
