Machine Translation Preserves Moral Semantics Across Languages
A new preprint on arXiv reports that LLM-based translation can preserve moral cues across languages, using Polish as a test case with roughly 50k morally annotated social media posts. The authors validate the translations with a four-method pipeline: LaBSE embedding similarity, centered kernel alignment (CKA), LLM-as-judge, and classifier parity tests. The results indicate that direct translation retains subtle moral semantics, although slang and culturally loaded expressions remain difficult.
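Of the four checks, CKA is probably the least familiar. The sketch below shows one common formulation, linear CKA, which compares two sets of representations of the same posts (for example, features of the originals and of their translations). It is an illustration only: the summary does not say which CKA variant or which representations the study uses, and the function names and toy matrices here are hypothetical.

```python
import math


def center_columns(rows):
    """Subtract each column's mean so every feature has zero mean."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def cross_frobenius_sq(a, b):
    """Squared Frobenius norm of a^T b for two matrices with the same row count."""
    total = 0.0
    for i in range(len(a[0])):
        for j in range(len(b[0])):
            s = sum(a[k][i] * b[k][j] for k in range(len(a)))
            total += s * s
    return total


def linear_cka(x, y):
    """Linear CKA: ||X^T Y||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered X, Y."""
    if len(x) != len(y):
        raise ValueError("x and y must describe the same number of examples")
    xc, yc = center_columns(x), center_columns(y)
    numerator = cross_frobenius_sq(xc, yc)
    denominator = math.sqrt(cross_frobenius_sq(xc, xc) * cross_frobenius_sq(yc, yc))
    if denominator == 0.0:
        raise ValueError("CKA is undefined when a representation has no variance")
    return numerator / denominator


# Toy data (assumption): three posts with two features each.
source_feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 7.0]]
# Same geometry, rotated by 90 degrees and scaled by 2.
translated_feats = [[-2.0 * b, 2.0 * a] for a, b in source_feats]
cka_same_geometry = linear_cka(source_feats, translated_feats)
```

Linear CKA is invariant to rotations and uniform scaling of the feature space, so a score near 1 means the two representations arrange the posts in the same way even when individual coordinates differ.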
Key facts
- Study uses ~50k morally annotated social media posts
- Polish is the test language for cross-lingual transfer
- Four-method validation pipeline employed
- Direct translation preserves moral cues for machine learning
- Challenges include slang, vulgarity, and culturally loaded expressions
- Mean cosine similarity is used as a metric
- arXiv paper ID: 2605.22660
- LLM-based translation helps bridge the gap in moral corpora
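The mean cosine similarity listed above can be sketched as follows. This assumes each original post and its translation have already been turned into sentence embeddings (the summary mentions LaBSE, but embedding extraction is omitted here); the function names and toy vectors are hypothetical, and the study's exact procedure may differ.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def mean_cosine_similarity(source_vecs, translated_vecs):
    """Average cosine similarity over aligned (original, translation) pairs."""
    if not source_vecs or len(source_vecs) != len(translated_vecs):
        raise ValueError("need two non-empty lists of equal length")
    sims = [cosine_similarity(u, v) for u, v in zip(source_vecs, translated_vecs)]
    return sum(sims) / len(sims)


# Toy embeddings (assumption): two posts, two dimensions each.
polish_vecs = [[1.0, 0.0], [0.0, 2.0]]
english_vecs = [[1.0, 0.0], [1.0, 1.0]]
score = mean_cosine_similarity(polish_vecs, english_vecs)
```

A mean close to 1 suggests the translations stay near their originals in embedding space. The average can hide a tail of poorly translated posts, so inspecting the lowest-scoring pairs is a reasonable complement.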
Entities
Institutions
- arXiv