ARTFEED — Contemporary Art Intelligence

MELT: Memory-Efficient Looped Transformer for Recurrent LLMs

ai-technology · 2026-05-11

arXiv paper 2605.07721 introduces MELT (Memory-Efficient Looped Transformer), an architecture for recurrent large language models that decouples reasoning depth from memory consumption. Recurrent models such as Ouro accumulate a standard Key-Value (KV) cache across iterations, so memory grows linearly with reasoning depth. MELT instead maintains a single KV cache per layer that is shared across reasoning loops and updated via a learnable gating mechanism, enabling stable and efficient multi-step computation without prohibitive memory usage. The approach addresses a key scalability limitation of recurrent LLMs, allowing deeper reasoning without proportional memory costs.
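
The paper gives the exact update rule; the sketch below is only a minimal PyTorch illustration of the general idea, a looped attention layer that keeps one fixed-size KV cache per layer and blends fresh key/value projections into it through a learned sigmoid gate instead of appending them. The class name, shapes, and the concatenation-based gating form are assumptions made for illustration, not MELT's published formulation.

    import torch
    import torch.nn as nn

    class GatedKVCacheLoop(nn.Module):
        """Illustrative looped attention layer with a single shared KV cache.

        Rather than appending fresh K/V tensors on every reasoning loop
        (which makes memory grow linearly with loop count), the cache is
        blended with the new projections through a learned sigmoid gate,
        so its size stays constant however many loops run.
        """

        def __init__(self, d_model: int, n_heads: int):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.qkv = nn.Linear(d_model, 3 * d_model)
            # Each gate sees the cached and the fresh projection and decides,
            # per feature, how much of the new value overwrites the cache.
            self.k_gate = nn.Linear(2 * d_model, d_model)
            self.v_gate = nn.Linear(2 * d_model, d_model)

        def forward(self, x: torch.Tensor, n_loops: int = 4) -> torch.Tensor:
            k_cache, v_cache = None, None
            h = x
            for _ in range(n_loops):
                q, k, v = self.qkv(h).chunk(3, dim=-1)
                if k_cache is None:
                    # First loop initialises the cache.
                    k_cache, v_cache = k, v
                else:
                    # Fixed-size gated update: no new cache entries are created.
                    gk = torch.sigmoid(self.k_gate(torch.cat([k_cache, k], dim=-1)))
                    gv = torch.sigmoid(self.v_gate(torch.cat([v_cache, v], dim=-1)))
                    k_cache = gk * k + (1 - gk) * k_cache
                    v_cache = gv * v + (1 - gv) * v_cache
                out, _ = self.attn(q, k_cache, v_cache)
                h = h + out  # residual connection carried across loops
            return h

    # Toy usage: raising n_loops adds compute but not cache memory.
    layer = GatedKVCacheLoop(d_model=64, n_heads=4)
    x = torch.randn(2, 16, 64)   # (batch, seq, d_model)
    y = layer(x, n_loops=8)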

Key facts

  • MELT decouples compute from memory in looped language models.
  • In standard recurrent LLMs such as Ouro, memory consumption grows linearly with reasoning depth.
  • MELT uses a single KV cache per layer shared across reasoning loops.
  • The KV cache is updated via a learnable gating mechanism.
  • The architecture enables stable and efficient multi-step computation.
  • The paper is on arXiv with ID 2605.07721.
  • The approach improves practical scalability of recurrent LLMs.
  • MELT allows reasoning iterations to be increased without prohibitive memory growth; a rough footprint comparison follows this list.
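
A back-of-the-envelope comparison makes the scaling difference concrete. All numbers below are illustrative assumptions, not figures from the paper: an appended cache grows with the number of loops, while a shared cache does not.

    def kv_cache_bytes(layers: int, seq_len: int, d_model: int,
                       n_loops: int, shared: bool, bytes_per_val: int = 2) -> int:
        """Crude KV-cache footprint: K and V tensors for every layer.

        shared=False models a cache re-accumulated on each loop (linear growth);
        shared=True models one cache per layer reused across all loops
        (constant in n_loops).
        """
        copies = 1 if shared else n_loops
        return 2 * layers * seq_len * d_model * copies * bytes_per_val

    # Hypothetical model configuration, chosen only for illustration.
    cfg = dict(layers=24, seq_len=4096, d_model=2048, n_loops=8)
    print(kv_cache_bytes(**cfg, shared=False) / 2**30)  # 6.0 GiB, accumulated
    print(kv_cache_bytes(**cfg, shared=True) / 2**30)   # 0.75 GiB, shared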

Entities

Institutions

  • arXiv

Sources