IndicSafe: Benchmarking LLM Safety Across 12 Indic Languages

ai-technology · 2026-05-18

IndicSafe is the first systematic evaluation of large language model safety across 12 Indic languages, which together are spoken by more than 1.2 billion people. Researchers evaluated 10 leading LLMs on 6,000 culturally grounded prompts covering caste, religion, gender, health, and politics. The findings reveal significant safety drift: cross-language agreement is just 12.8%, and SAFE rate variance exceeds 17% across languages. Some models over-refuse benign prompts in low-resource scripts or overflag politically sensitive topics, while others fail to flag unsafe outputs. The study quantifies these failures using prompt-level entropy, category bias scores, and multilingual consistency indices.
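This summary does not give the exact formulas behind these metrics, so the Python sketch below is only one plausible reading and not the researchers' implementation. It assumes each prompt receives one safety label per language, and that the labels are strings such as "SAFE", "UNSAFE", and "REFUSE"; only "SAFE" appears in the source, so the other label names are hypothetical, as are the function names and language codes. Agreement is taken here to mean the share of prompts that receive the same label in every language, and prompt-level entropy to mean the Shannon entropy of one prompt's labels across languages.

```python
import math
from collections import Counter


def prompt_entropy(labels):
    """Shannon entropy (in bits) of the labels one prompt gets across languages.

    0.0 means every language produced the same label; higher values mean
    the model's safety behaviour drifts more between languages.
    """
    counts = Counter(labels)
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def cross_language_agreement(labels_by_prompt):
    """Fraction of prompts whose label is identical in every language.

    `labels_by_prompt` maps prompt id -> {language code: label}.
    This definition is an assumption, not taken from the source.
    """
    if not labels_by_prompt:
        return 0.0
    unanimous = sum(
        1 for labels in labels_by_prompt.values()
        if len(set(labels.values())) == 1
    )
    return unanimous / len(labels_by_prompt)


def safe_rate_by_language(labels_by_prompt):
    """Share of prompts labelled "SAFE" in each language."""
    totals = Counter()
    safe = Counter()
    for labels in labels_by_prompt.values():
        for lang, label in labels.items():
            totals[lang] += 1
            if label == "SAFE":
                safe[lang] += 1
    return {lang: safe[lang] / totals[lang] for lang in totals}


# Toy data with hypothetical prompt ids, language codes, and labels.
toy = {
    "p1": {"hi": "SAFE", "ta": "SAFE", "bn": "SAFE"},
    "p2": {"hi": "SAFE", "ta": "UNSAFE", "bn": "REFUSE"},
}
print(cross_language_agreement(toy))
print(safe_rate_by_language(toy))
print(prompt_entropy(list(toy["p2"].values())))
```

On real data, the spread between the highest and lowest per-language SAFE rates would give a simple view of the kind of cross-language gap reported above, though the study's own variance and consistency measures may be computed differently.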

Key facts

First systematic evaluation of LLM safety across 12 Indic languages
Languages spoken by over 1.2 billion people
Dataset of 6,000 culturally grounded prompts
Topics include caste, religion, gender, health, and politics
10 leading LLMs assessed
Cross-language agreement is just 12.8%
SAFE rate variance exceeds 17% across languages
Some models over-refuse or under-refuse depending on language script
