Engineering Challenges of On-Device Small Language Model Integration
This longitudinal practitioner case study examines the engineering obstacles encountered when integrating Small Language Models (SLMs) into Palabrita, an Android word-guessing game. Over a 5-day development sprint of 204 commits, the architecture shifted from having the model generate complete structured puzzles to a more pragmatic design: words come from curated lists, and the model produces only three short hints, backed by a deterministic fallback. The study identifies five categories of failure specific to on-device SLM integration, particularly concerning output format. The models used were Gemma 4 E2B (2.6B parameters) and Qwen3 0.6B (600M parameters). The paper is available on arXiv (ID 2604.24636).
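The "curated word plus three model hints, with a deterministic fallback" design described above can be illustrated with a minimal sketch. It is not the paper's implementation: the names `fallback_hints`, `parse_hints`, `hints_for`, and `generate`, the 80-character limit, and the specific validation rules are all assumptions made for illustration. The model call is passed in as a plain function so the sketch stays independent of any particular on-device runtime.

```python
import re
from typing import Callable, List, Optional

HINT_COUNT = 3        # the summary states the model produces three short hints
MAX_HINT_CHARS = 80   # assumed length limit, not taken from the paper

# Strips a leading "-", "*", "1." or "1)" marker that models often prepend.
_BULLET = re.compile(r"^\s*(?:[-*]|\d+[.)])\s*")


def fallback_hints(word: str) -> List[str]:
    """Deterministic hints computed from the word itself (illustrative only).

    Assumes `word` is a non-empty entry from a curated list.
    """
    return [
        f"The word has {len(word)} letters.",
        f"It starts with '{word[0]}'.",
        f"It ends with '{word[-1]}'.",
    ]


def parse_hints(raw: str, word: str) -> Optional[List[str]]:
    """Return exactly three usable hints, or None if the output is malformed."""
    hints = [_BULLET.sub("", line).strip() for line in raw.splitlines()]
    hints = [h for h in hints if h]  # drop blank lines
    if len(hints) != HINT_COUNT:
        return None
    for hint in hints:
        too_long = len(hint) > MAX_HINT_CHARS
        leaks_answer = word.lower() in hint.lower()
        if too_long or leaks_answer:
            return None
    return hints


def hints_for(word: str, generate: Callable[[str], str]) -> List[str]:
    """Try the model first; use the deterministic fallback on any failure."""
    try:
        raw = generate(word)  # hypothetical wrapper around the on-device model
    except Exception:
        return fallback_hints(word)
    parsed = parse_hints(raw, word)
    return parsed if parsed is not None else fallback_hints(word)
```

Because the word itself never comes from the model, a malformed or failed generation only degrades hint quality and cannot produce an unplayable puzzle. Calling `hints_for("gato", my_model_call)` always returns three strings, whether or not the model's output passes validation.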
Key facts
- The study is a longitudinal practitioner case study.
- SLMs were integrated into Palabrita, an Android word-guessing game.
- The development sprint lasted 5 days with 204 commits.
- The architecture shifted from generating complete structured puzzles to using curated word lists.
- In the final design, the LLM generates only three short hints, with a deterministic fallback.
- Five categories of failures specific to on-device SLM integration were identified.
- Models used: Gemma 4 E2B (2.6B parameters) and Qwen3 0.6B (600M parameters).
- The paper is published on arXiv with ID 2604.24636.
Entities
Institutions
- arXiv