EngGPT2MoE-16B-A3B Benchmarked Against Italian and International LLMs
ENGINEERING Ingegneria Informatica S.p.A. has published a benchmark evaluation of EngGPT2MoE-16B-A3B, a Mixture-of-Experts (MoE) model with 16 billion total parameters, of which 3 billion are active per token. The model was tested on international benchmarks including ARC-Challenge, GSM8K, AIME24, AIME25, MMLU, HumanEval (HE), and RULER at a 32k context length. It matched or exceeded the Italian models FastwebMIIA-7B, Minerva-7B, Velvet-14B, and LLaMAntino-3-ANITA-8B on most of these evaluations, although Velvet-14B beat it on the Italian-language benchmark ITALIC. Among MoE models of comparable size, EngGPT2MoE-16B-A3B also outperformed DeepSeek-MoE-16B-Chat.
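The headline figures hinge on the MoE design, so here is a minimal sketch of top-k expert routing showing how a model with 16B total parameters can activate only about 3B per token. The layer sizes, expert count, and top-k value below are illustrative assumptions, not EngGPT2MoE-16B-A3B's published configuration.

```python
# Minimal sketch of top-k Mixture-of-Experts routing. All sizes are
# illustrative assumptions, not EngGPT2MoE's actual configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int, top_k: int):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        # Each expert is an independent feed-forward block; only top_k of
        # them run for any given token, so most parameters stay inactive.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Route each token to its top_k experts.
        gate_logits = self.router(x)
        weights, indices = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = indices[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

layer = MoELayer(d_model=512, d_ff=2048, n_experts=16, top_k=2)
tokens = torch.randn(4, 512)
print(layer(tokens).shape)  # torch.Size([4, 512]); only 2 of 16 experts ran per token
```

Because every token passes through only a fixed number of experts, the parameter count that matters for per-token compute is the active subset (here 2 of 16 experts), which is how a 16B-parameter model can run with roughly 3B active parameters.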
Key facts
- EngGPT2MoE-16B-A3B is a 16B parameter MoE model with 3B active parameters.
- Benchmarked against Italian models: FastwebMIIA-7B, Minerva-7B, Velvet-14B, LLaMAntino-3-ANITA-8B.
- Performed as well as or better than the Italian models on ARC-Challenge, GSM8K, AIME24, AIME25, MMLU, and HumanEval (see the evaluation sketch after this list).
- Achieved the best performance on the RULER benchmark at a 32k context length.
- Velvet-14B outperformed EngGPT2MoE-16B-A3B on the Italian benchmark ITALIC.
- Outperformed the similarly sized DeepSeek-MoE-16B-Chat.
- Report published by ENGINEERING Ingegneria Informatica S.p.A.
- The model is open source and comparable in size to other MoE models such as DeepSeek-MoE-16B-Chat.
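For readers who want to reproduce this kind of comparison, the sketch below shows how benchmarks such as ARC-Challenge and GSM8K are commonly run with EleutherAI's lm-evaluation-harness. Whether the report used this harness is an assumption, and the model identifier is a placeholder, not a published repository name.

```python
# Sketch of running standard benchmarks with EleutherAI's
# lm-evaluation-harness (pip install lm-eval). The harness choice is an
# assumption; "org/EngGPT2MoE-16B-A3B" is a placeholder identifier.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # evaluate a Hugging Face transformers checkpoint
    model_args="pretrained=org/EngGPT2MoE-16B-A3B,dtype=bfloat16",
    tasks=["arc_challenge", "gsm8k"],
    num_fewshot=5,
)
for task, metrics in results["results"].items():
    print(task, metrics)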
Entities
Institutions
- ENGINEERING Ingegneria Informatica S.p.A.