Engineering Challenges of On-Device Small Language Model Integration

other · 2026-05-07

A longitudinal practitioner case study examines the engineering obstacles encountered when integrating Small Language Models (SLMs) into Palabrita, an Android word-guessing game. Over a 5-day development sprint spanning 204 commits, the architecture shifted from having the model generate complete structured puzzles to a more practical design built on curated word lists, in which the LLM produces only three short hints, backed by a deterministic fallback. The study identifies five categories of failure specific to on-device SLM integration, particularly concerning output format. The models used were Gemma 4 E2B (2.6B parameters) and Qwen3 0.6B (600M parameters). The paper is available on arXiv (ID 2604.24636).

Key facts

The study is a longitudinal practitioner case study.
SLMs were integrated into Palabrita, an Android word-guessing game.
The development sprint lasted 5 days with 204 commits.
The architecture shifted from generating complete structured puzzles to using curated word lists.
The LLM now generates only three short hints with a deterministic fallback.
Five categories of failures specific to on-device SLM integration were identified.
Models used: Gemma 4 E2B (2.6B parameters) and Qwen3 0.6B (600M parameters).
The paper is published on arXiv with ID 2604.24636.
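
The final architecture described above (curated word, three model-written hints, deterministic fallback) can be illustrated with a minimal sketch. This is not the code from Palabrita: the function names (`fallback_hints`, `parse_hints`, `hints_for`), the 80-character limit, the rule that a hint must not reveal the answer, and the specific fallback hints are all assumptions made for illustration. The `generate` callable stands in for whatever on-device model call the app uses.

```python
import re
from typing import Callable, List, Optional

HINT_COUNT = 3        # the summary states the LLM produces three short hints
MAX_HINT_CHARS = 80   # assumed limit; the source does not give one


def fallback_hints(word: str) -> List[str]:
    """Deterministic hints computed from the curated word alone (hypothetical)."""
    vowels = sum(1 for ch in word.lower() if ch in "aeiou")
    return [
        f"The word has {len(word)} letters.",
        f"It starts with '{word[0].upper()}'.",
        f"It contains {vowels} vowel(s).",
    ]


def parse_hints(raw: str, word: str) -> Optional[List[str]]:
    """Return exactly three usable hints, or None if the output is malformed."""
    hints = []
    for line in raw.splitlines():
        # Drop a leading bullet or "1." / "1)" style numbering, if present.
        text = re.sub(r"^\s*(?:[-*•]|\d+[.)])\s*", "", line).strip()
        if text:
            hints.append(text)
    if len(hints) != HINT_COUNT:
        return None
    for hint in hints:
        too_long = len(hint) > MAX_HINT_CHARS
        leaks_answer = word.lower() in hint.lower()
        if too_long or leaks_answer:
            return None
    return hints


def hints_for(word: str, generate: Callable[[str], str]) -> List[str]:
    """Try the model first; use the deterministic fallback on any failure."""
    try:
        raw = generate(word)
    except Exception:
        # Model unavailable, timed out, or crashed: do not block the game.
        return fallback_hints(word)
    hints = parse_hints(raw, word)
    return hints if hints is not None else fallback_hints(word)
```

In this sketch the word always comes from the curated list, so a malformed or missing model response only affects the quality of the hints and never whether a puzzle can be played.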
