OLLM: Options-based Large Language Models Introduce Discrete Latent Variables for Next-Token Prediction
Options LLM (OLLM) replaces the conventional single next-token prediction in large language models with a set of learned options indexed by a discrete latent variable. Rather than relying on temperature or sampling heuristics for diversity, OLLM models variation explicitly: a compact latent space parametrizes several plausible next-token choices, which a downstream policy can select among or explore. Architecturally, OLLM is a lightweight "plug-in" that inserts an encoder and a decoder before the output head, so almost any pretrained LLM can be converted with minimal additional parameters. Applied to a 1.7B-parameter backbone trained on OpenMathReasoning and evaluated on OmniMath, only 1.56% of parameters were trainable. State-of-the-art LoRA-adapted baselines peak at 51% final-answer accuracy, whereas OLLM's option set reaches roughly 70% under optimal late selection. The paper detailing OLLM is accessible on arXiv under the identifier 2604.19087v1.
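The plug-in described above can be sketched in a few lines. This is a hypothetical illustration under stated assumptions, not the paper's implementation: the class name `OptionPlugIn` and the weights `W_enc` (encoder producing a distribution over the discrete latent), `option_emb` (decoder code per latent value), and `W_out` (the frozen backbone output head) are all invented for the example, and the real method would train the encoder and decoder jointly against the backbone.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class OptionPlugIn:
    """Sketch of an OLLM-style head: a discrete latent z indexes K options,
    each yielding its own next-token distribution from the frozen output head.
    All weight names and shapes here are illustrative assumptions."""

    def __init__(self, d_model, n_options, vocab_size, seed=0):
        rng = np.random.default_rng(seed)
        # Encoder: scores the K discrete latent values given the hidden state.
        self.W_enc = rng.standard_normal((d_model, n_options)) * 0.02
        # Decoder: one learned code per latent value, added to the hidden state.
        self.option_emb = rng.standard_normal((n_options, d_model)) * 0.02
        # Frozen pretrained output head (stands in for the backbone's unembedding).
        self.W_out = rng.standard_normal((d_model, vocab_size)) * 0.02

    def forward(self, h):
        # h: (d_model,) backbone hidden state at the current position.
        prior = softmax(h @ self.W_enc)                   # p(z | h), shape (n_options,)
        h_z = h[None, :] + self.option_emb                # per-option decoded hidden states
        token_dists = softmax(h_z @ self.W_out, axis=-1)  # (n_options, vocab_size)
        return prior, token_dists
```

A downstream policy could then sample z from `prior` (or enumerate all K options) and decode from the corresponding row of `token_dists`, which is what makes the option set explorable rather than a single collapsed distribution.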
Key facts
- OLLM replaces single next-token prediction with learned options indexed by a discrete latent variable
- The method models variation explicitly through a small latent space parametrizing multiple plausible next-token options
- OLLM is architecturally a lightweight plug-in inserting encoder and decoder layers before the output head
- The approach allows almost any pretrained LLM to be converted with minimal additional parameters
- Applied to a 1.7B-parameter backbone with only 1.56% trainable parameters
- Trained on OpenMathReasoning and evaluated on OmniMath
- LoRA-adapted baselines peak at 51% final answer correctness
- OLLM enables up to approximately 70% correctness under optimal late selection
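The "optimal late selection" number in the last bullet is an oracle-style metric: a problem counts as solved if any of the K option-generated answers is correct. A minimal sketch of that scoring rule, with the function name and data layout invented for illustration:

```python
def late_selection_accuracy(option_answers, gold):
    """Oracle ('optimal late selection') accuracy: each problem contributes
    a hit if any of its K option-generated final answers matches the gold
    answer. `option_answers` is a list of per-problem answer lists."""
    hits = sum(any(a == g for a in opts)
               for opts, g in zip(option_answers, gold))
    return hits / len(gold)

def first_option_accuracy(option_answers, gold):
    """Baseline that keeps only one answer per problem, for comparison."""
    return sum(opts[0] == g for opts, g in zip(option_answers, gold)) / len(gold)
```

Under this metric, the gap between the 51% LoRA baseline and OLLM's roughly 70% reflects how often at least one option in the learned set hits the correct final answer, rather than the accuracy of any single fixed decoding.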