ARTFEED — Contemporary Art Intelligence

TildeOpen LLM: A 30B Parameter Model for 34 European Languages

ai-technology · 2026-04-30

Researchers have unveiled TildeOpen LLM, a 30-billion-parameter open-weight foundational model supporting 34 European languages, built to address linguistic disparity in large language models. Training combines dataset upsampling with a curriculum that alternates between a uniform distribution over languages and their natural, corpus-proportional distribution. TildeOpen outperforms current open-weight models on multilingual benchmarks, especially for Baltic, Finno-Ugric, and Slavic languages, while using fewer computing resources. Human assessments indicate up to a tenfold reduction in errors flagged by linguists for low-resource languages. The research paper is available on arXiv under ID 2603.08182.
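The alternating curriculum described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the corpus sizes, phase length, and function names below are all hypothetical, chosen only to show how a sampler might switch between a uniform distribution (which upsamples low-resource languages) and the natural, size-proportional distribution.

```python
import random

# Hypothetical corpus sizes (document counts) for a few of the 34 languages;
# the paper's actual data proportions are not reproduced here.
CORPUS_SIZES = {"lv": 1_000, "et": 800, "en": 50_000, "de": 30_000}

def language_distribution(sizes, phase):
    """Return (languages, probabilities) for the given curriculum phase.

    'uniform' weights every language equally, upsampling low-resource
    languages; 'natural' weights each language by its corpus size.
    """
    langs = list(sizes)
    if phase == "uniform":
        weights = [1.0] * len(langs)
    else:  # 'natural' phase
        weights = [float(sizes[lang]) for lang in langs]
    total = sum(weights)
    return langs, [w / total for w in weights]

def sample_batch(sizes, step, batch_size=8, phase_len=100, seed=None):
    """Draw a batch of language IDs, alternating phases every phase_len steps.

    Even-numbered phases use the uniform distribution, odd-numbered phases
    the natural one, mimicking a curriculum that switches between the two.
    """
    phase = "uniform" if (step // phase_len) % 2 == 0 else "natural"
    langs, probs = language_distribution(sizes, phase)
    rng = random.Random(seed)
    return [rng.choices(langs, weights=probs, k=1)[0] for _ in range(batch_size)]
```

Under the uniform phase, Latvian and Estonian are sampled as often as English despite having far smaller corpora; the natural phase restores the size-proportional mix so the model still sees high-resource languages at scale.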

Key facts

  • TildeOpen LLM is a 30-billion-parameter open-weight foundational model.
  • Trained for 34 European languages.
  • Uses curriculum learning that alternates between uniform and natural (corpus-proportional) language distributions.
  • Outperforms other multilingual LLMs on text generation and comprehension.
  • Particularly effective for Baltic, Finno-Ugric, and Slavic languages.
  • Human evaluations show up to a tenfold reduction in errors flagged by linguists.
  • Published on arXiv with ID 2603.08182.
  • Trained with significantly fewer computing resources than comparable models.

Entities

Institutions

  • arXiv

Sources