ARTFEED — Contemporary Art Intelligence

Efficient-DLM: Converting Autoregressive Models to Fast Diffusion Language Models

publication · 2026-05-01

A new study on arXiv (2512.14067) introduces Efficient-DLM, a method for converting pretrained autoregressive (AR) language models into efficient diffusion language models (dLMs) that generate text in parallel while preserving task accuracy. The researchers identify limitations in existing AR-to-dLM conversion methods, particularly in their attention patterns and training objectives. They propose a continuous pretraining scheme built around a block-wise attention pattern that stays causal across blocks but allows bidirectional attention within each block, thereby preserving the pretrained AR weight distributions. The approach aims to close the learning-efficiency gap that dLMs trained from scratch exhibit relative to AR models.
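The block-wise attention pattern is the core structural change. As a rough sketch (not the authors' code; the helper name and sizes are illustrative assumptions), a mask of this shape can be built in a few lines of PyTorch:

    import torch

    def block_wise_attention_mask(seq_len: int, block_size: int) -> torch.Tensor:
        # Block index of each position, e.g. block_size=4 -> 0,0,0,0,1,1,1,1,...
        block_ids = torch.arange(seq_len) // block_size
        # Query i may attend to key j iff j's block is not after i's block:
        # bidirectional inside a block, causal at block granularity across blocks.
        return block_ids.unsqueeze(1) >= block_ids.unsqueeze(0)

    mask = block_wise_attention_mask(seq_len=8, block_size=4)
    # Rows 0-3 (block 0) attend to all of block 0 and nothing later;
    # rows 4-7 (block 1) attend to blocks 0 and 1 in full.

Note that with block_size=1 this reduces to the ordinary causal mask, which suggests why the conversion can start from AR weights with little disruption.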

Key facts

  • arXiv paper 2512.14067 titled 'Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed'
  • Study focuses on converting pretrained AR models into efficient dLMs
  • Conversion aims to enable parallel non-autoregressive generation while preserving AR model accuracy (see the decoding sketch after this list)
  • Researchers identified limitations in attention patterns and objectives of existing AR-to-dLM methods
  • Proposed continuous pretraining scheme with block-wise attention pattern
  • Block-wise attention remains causal across blocks but bidirectional within blocks
  • Maintaining pretrained AR weight distributions is critical for effective conversion
  • Method addresses the learning-efficiency gap between dLMs trained from scratch and AR models
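As context for the parallel-generation claim, the following is a generic confidence-based block decoding loop of the kind block diffusion models use. It is a sketch under stated assumptions, not the paper's sampler: model stands for any network returning per-position logits, mask_id is the mask-token id, and the unmask schedule is illustrative:

    import torch

    @torch.no_grad()
    def decode_blockwise(model, prompt_ids, num_blocks=8, block_size=4,
                         steps_per_block=2, mask_id=0):
        # Blocks are produced left to right; tokens inside a block are
        # unmasked several at a time, which is where the speedup comes from.
        per_step = block_size // steps_per_block
        seq = prompt_ids.clone()                      # shape (1, prompt_len)
        for _ in range(num_blocks):
            block = torch.full((1, block_size), mask_id, dtype=seq.dtype)
            seq = torch.cat([seq, block], dim=1)      # append a masked block
            for _ in range(steps_per_block):
                logits = model(seq)                   # assumed (1, len, vocab)
                probs = logits[:, -block_size:].softmax(dim=-1)
                conf, pred = probs.max(dim=-1)        # each (1, block_size)
                masked = seq[:, -block_size:] == mask_id
                conf = conf.masked_fill(~masked, float("-inf"))
                top = conf[0].topk(per_step).indices  # best masked slots
                seq[0, seq.size(1) - block_size + top] = pred[0, top]
        return seq

Each inner step commits per_step tokens at once, so a block of 4 tokens costs 2 forward passes rather than the 4 an AR decoder would need.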

Entities

Institutions

  • arXiv

Sources

  • arXiv:2512.14067 · https://arxiv.org/abs/2512.14067