EngGPT2MoE-16B-A3B Benchmarked Against Italian and International LLMs
ENGINEERING Ingegneria Informatica S.p.A. has published a benchmark evaluation of EngGPT2MoE-16B-A3B, a Mixture-of-Experts (MoE) model with 16 billion total parameters, of which 3 billion are active per token. The model was tested on international benchmarks including ARC-Challenge, GSM8K, AIME24, AIME25, MMLU, HumanEval (HE), and RULER at a 32k context length. It matched or exceeded the Italian models FastwebMIIA-7B, Minerva-7B, Velvet-14B, and LLaMAntino-3-ANITA-8B on most of these evaluations, although Velvet-14B beat it on the Italian-language benchmark ITALIC. Among MoE models of comparable size, EngGPT2MoE-16B-A3B also outperformed DeepSeek-MoE-16B-Chat.
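The headline figures hinge on the MoE design, so here is a minimal sketch of top-k expert routing showing how a model with 16B total parameters can activate only about 3B per token. The layer sizes, expert count, and top-k value below are illustrative assumptions, not EngGPT2MoE-16B-A3B's published configuration.

```python
# Minimal sketch of top-k Mixture-of-Experts routing. All sizes are
# illustrative assumptions, not EngGPT2MoE's actual configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int, top_k: int):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        # Each expert is an independent feed-forward block; only top_k of
        # them run for any given token, so most parameters stay inactive.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Route each token to its top_k experts.
        gate_logits = self.router(x)
        weights, indices = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = indices[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

layer = MoELayer(d_model=512, d_ff=2048, n_experts=16, top_k=2)
tokens = torch.randn(4, 512)
print(layer(tokens).shape)  # torch.Size([4, 512]); only 2 of 16 experts ran per token
```

Because every token passes through only a fixed number of experts, the parameter count that matters for per-token compute is the active subset (here 2 of 16 experts), which is how a 16B-parameter model can run with roughly 3B active parameters.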
Key facts
- EngGPT2MoE-16B-A3B is a 16B parameter MoE model with 3B active parameters.
- Benchmarked against Italian models: FastwebMIIA-7B, Minerva-7B, Velvet-14B, LLaMAntino-3-ANITA-8B.
- Performed as well as or better than the Italian models on ARC-Challenge, GSM8K, AIME24, AIME25, MMLU, and HumanEval (see the evaluation sketch after this list).
- Achieved the best performance on the RULER benchmark at a 32k context length.
- Velvet-14B outperformed EngGPT2MoE-16B-A3B on the Italian benchmark ITALIC.
- Outperformed the similarly sized DeepSeek-MoE-16B-Chat.
- Report published by ENGINEERING Ingegneria Informatica S.p.A.
- The model is open source and comparable in size to other MoE models such as DeepSeek-MoE-16B-Chat.
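For readers who want to reproduce this kind of comparison, the sketch below shows how benchmarks such as ARC-Challenge and GSM8K are commonly run with EleutherAI's lm-evaluation-harness. Whether the report used this harness is an assumption, and the model identifier is a placeholder, not a published repository name.

```python
# Sketch of running standard benchmarks with EleutherAI's
# lm-evaluation-harness (pip install lm-eval). The harness choice is an
# assumption; "org/EngGPT2MoE-16B-A3B" is a placeholder identifier.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # evaluate a Hugging Face transformers checkpoint
    model_args="pretrained=org/EngGPT2MoE-16B-A3B,dtype=bfloat16",
    tasks=["arc_challenge", "gsm8k"],
    num_fewshot=5,
)
for task, metrics in results["results"].items():
    print(task, metrics)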
Entities
Institutions
- ENGINEERING Ingegneria Informatica S.p.A.