ARTFEED — Contemporary Art Intelligence

Digit Entropy Loss Improves LLM Number Prediction

ai-technology · 2026-05-22

A new paper proposes Digit Entropy Loss (DEL), a loss intended to improve numerical learning in large language models (LLMs). Number prediction underpins mathematical problem-solving and code generation, but standard maximum likelihood estimation (MLE) is not tailored to numbers. Existing penalty-driven approaches such as Number Token Loss and Discretized Distance Loss introduce an inductive bias, but they leave digit distributions either over-sharpened or over-flattened. The paper analyzes LLM numerical learning in depth and shows that current methods follow a criterion-distance formulation. DEL instead reformulates unsupervised entropy optimization through three key designs, using digit-level information to strengthen auto-regressive numerical learning. Its stated aim is to balance optimization against geometric priors for better number prediction.
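To make "over-sharpened" and "over-flattened" concrete, the sketch below computes the Shannon entropy of a predicted distribution over the ten digits. It is only an illustration of the quantity an entropy-based loss would act on, not the paper's DEL formulation, whose three designs are not described here. The logits, the target digit 7, and all function names are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; 0 means all mass sits on one digit."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def mle_loss(probs, target_digit):
    """Standard MLE (cross-entropy) loss for a single digit position."""
    return -math.log(probs[target_digit])

# Hypothetical logits over the digits 0..9; the true digit is assumed to be 7.
sharp_logits = [0.0] * 10
sharp_logits[7] = 12.0        # nearly all mass on one digit (over-sharpened)
flat_logits = [0.0] * 10      # uniform over all digits (over-flattened)

sharp = softmax(sharp_logits)
flat = softmax(flat_logits)

print("sharp entropy:", entropy(sharp))
print("flat entropy: ", entropy(flat))   # equals log(10), the maximum for 10 classes
print("flat MLE loss:", mle_loss(flat, 7))
```

The uniform case reaches the maximum possible entropy for ten classes, log(10), while the peaked case is close to zero. A loss that shapes digit distributions has to operate somewhere between these two extremes.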

Key facts

  • DEL stands for Digit Entropy Loss
  • Paper is on arXiv with ID 2605.20369
  • Number prediction is fundamental for LLMs in math and code
  • MLE is not tailored for number prediction
  • Number Token Loss and Discretized Distance Loss are existing methods
  • Existing methods cause over-sharpened or over-flattened digit distributions
  • DEL reformulates unsupervised entropy optimization
  • DEL uses three key designs to enhance auto-regressive numerical learning
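One way to see why MLE is not tailored to numbers: cross-entropy depends only on the probability assigned to the correct digit, so a near miss and a far miss cost the same. The sketch below contrasts it with a generic distance-aware penalty, the expected absolute digit error. This penalty is an assumption chosen for illustration. It is not claimed to be the exact form of Number Token Loss, Discretized Distance Loss, or DEL, and all names and values are hypothetical.

```python
import math

def cross_entropy(probs, target):
    """MLE loss: looks only at the probability of the correct digit."""
    return -math.log(probs[target])

def expected_digit_distance(probs, target):
    """Illustrative distance-aware penalty: expected |digit - target|."""
    return sum(p * abs(d - target) for d, p in enumerate(probs))

def two_point(digit_a, digit_b):
    """Hypothetical distribution with half the mass on each of two digits."""
    probs = [0.0] * 10
    probs[digit_a] += 0.5
    probs[digit_b] += 0.5
    return probs

target = 7
near_miss = two_point(7, 6)   # remaining mass on an adjacent digit
far_miss = two_point(7, 1)    # remaining mass on a distant digit

# Identical under MLE, different under the distance-aware penalty.
print(cross_entropy(near_miss, target), cross_entropy(far_miss, target))
print(expected_digit_distance(near_miss, target),
      expected_digit_distance(far_miss, target))
```

Both distributions give the same cross-entropy, log(2), but the distance-aware penalty separates them. This is the kind of geometric prior that penalty-driven methods add on top of MLE.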

Entities

Institutions

  • arXiv