Laguna M.1 and XS.2: New Mixture-of-Experts Models for Agentic Coding
Laguna M.1 and Laguna XS.2 are two Mixture-of-Experts foundation models built for long-horizon, agentic coding tasks. M.1 has 225.8 billion total parameters, of which 23.4 billion are activated per token; XS.2 has 33.4 billion total parameters, of which 3 billion are activated per token. Both models were trained end-to-end from scratch inside the Model Factory, an integrated system of versioned data, training, evaluation, and inference components. The report describes the design principles and choices behind the Model Factory and walks through the full training process: pre-training data and architecture, post-training phases, evaluation, and quantization. On SWE-bench Verified, SWE-bench Multilingual, SWE-Bench Pro, and Terminal-Bench 2.0, both models perform competitively against leading open models in their respective parameter ranges.
Key facts
- Laguna M.1 has 225.8B total parameters (23.4B activated per token).
- Laguna XS.2 has 33.4B total parameters (3B activated per token).
- Both models are Mixture-of-Experts foundation models for agentic coding.
- Trained from scratch end-to-end inside the Model Factory system.
- Model Factory is a tightly integrated stack of versioned data, training, evaluation, and inference components.
- Report covers pre-training data, architecture, post-training, evaluation, and quantization.
- Benchmarks include SWE-bench Verified, SWE-bench Multilingual, SWE-Bench Pro, and Terminal-Bench 2.0.
- Competitive with state-of-the-art open models in respective parameter ranges.
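The parameter counts listed above imply how sparse each model is at inference time. The following is a minimal sketch of that arithmetic, not anything taken from the report: the `MoEModel` class and its field names are hypothetical, and only the four parameter counts come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoEModel:
    """Hypothetical container for the parameter counts quoted in the summary."""
    name: str
    total_params_b: float   # total parameters, in billions
    active_params_b: float  # parameters activated per token, in billions

    @property
    def active_fraction(self) -> float:
        # Share of the model's parameters that participate in each token's forward pass.
        return self.active_params_b / self.total_params_b


# Figures as stated in the summary; everything else here is illustrative.
MODELS = [
    MoEModel("Laguna M.1", total_params_b=225.8, active_params_b=23.4),
    MoEModel("Laguna XS.2", total_params_b=33.4, active_params_b=3.0),
]

for model in MODELS:
    print(f"{model.name}: {model.active_fraction:.1%} of parameters active per token")
```

Under these figures, roughly 10.4% of M.1's parameters and 9.0% of XS.2's parameters are active for any given token. The XS.2 value depends on the rounded "3B" figure, so it should be read as approximate.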