ELAS: Efficient Pre-Training of Low-Rank LLMs via 2:4 Activation Sparsity
A new framework called ELAS (Efficient Pre-training of Low-rank LLMs via 2:4 Activation Sparsity) is proposed to address computational bottlenecks in training large language models. The method combines low-rank training, which reduces memory usage, with 2:4 structured sparsity applied to activations (not weights) to exploit NVIDIA GPUs' hardware support for the 2:4 sparse format. Existing low-rank approaches keep activation matrices at full rank, so activations dominate memory consumption and limit throughput in large-batch training; applying sparsity directly to weights, on the other hand, often degrades model quality. By sparsifying activations specifically, ELAS aims to cut memory use and raise throughput without significant accuracy loss. The paper is published on arXiv under ID 2605.03667.
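To make the 2:4 pattern concrete, here is a minimal NumPy sketch of structured activation sparsification: in every contiguous group of 4 values along the last axis, the 2 largest-magnitude entries are kept and the other 2 are zeroed. This is an illustrative reconstruction of the general 2:4 scheme, not the paper's actual sparsifier; the function name and the magnitude-based selection rule are assumptions.

```python
import numpy as np

def sparsify_2_4(x):
    """Illustrative 2:4 structured sparsity along the last axis:
    in each contiguous group of 4 values, keep the 2 with the largest
    magnitude and zero the other 2 (magnitude rule is an assumption)."""
    groups = x.reshape(-1, 4)                       # one row per group of 4
    # indices of the 2 smallest-magnitude entries in each group
    drop = np.argsort(np.abs(groups), axis=1)[:, :2]
    out = groups.copy()
    np.put_along_axis(out, drop, 0.0, axis=1)
    return out.reshape(x.shape)

acts = np.array([[0.9, -0.1, 0.05, -1.2, 0.3, 0.7, -0.2, 0.0]])
sparse_acts = sparsify_2_4(acts)
# each group of 4 now contains exactly 2 zeros:
# [[0.9, 0.0, 0.0, -1.2, 0.3, 0.7, 0.0, 0.0]]
```

The resulting matrix matches the 2:4 pattern that NVIDIA's sparse Tensor Cores accelerate, which is why this particular sparsity ratio (rather than unstructured pruning) is attractive for training throughput.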
Key facts
- ELAS stands for Efficient Pre-training of Low-rank LLMs via 2:4 Activation Sparsity.
- The framework targets efficient pre-training of large language models.
- It combines low-rank training with 2:4 structured sparsity on activations.
- 2:4 structured sparsity is supported by NVIDIA GPUs.
- Existing low-rank methods keep activation matrices at full rank, causing high memory consumption.
- Direct weight sparsity leads to non-negligible performance degradation.
- ELAS aims to reduce memory and improve throughput during large-batch training.
- The paper is available on arXiv with ID 2605.03667.
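The combination described in the facts above can be sketched as a forward pass through a low-rank linear layer fed with 2:4-sparsified activations. All sizes, names, and the initialization below are hypothetical, and the inline sparsifier is the same magnitude-based stand-in as above; the sketch only illustrates how the two ideas compose, not ELAS's actual architecture.

```python
import numpy as np

def sparsify_2_4(x):
    # stand-in 2:4 rule: keep the 2 largest-magnitude values per group of 4
    g = x.reshape(-1, 4)
    drop = np.argsort(np.abs(g), axis=1)[:, :2]
    out = g.copy()
    np.put_along_axis(out, drop, 0.0, axis=1)
    return out.reshape(x.shape)

rng = np.random.default_rng(0)
batch, d, r = 4, 16, 4                    # hypothetical batch, width, rank

# low-rank factorization W ~ A @ B replaces a dense d x d weight
A = rng.standard_normal((d, r)) / np.sqrt(d)
B = rng.standard_normal((r, d)) / np.sqrt(r)

x = rng.standard_normal((batch, d))       # incoming activations
y = sparsify_2_4(x) @ A @ B               # sparse-activation, low-rank matmul

# low-rank stores 2*d*r parameters instead of d*d
print(2 * d * r, d * d)                   # 128 vs 256 here
```

On real hardware the `sparsify_2_4(x) @ A` product is where the 2:4 format pays off, since the sparse operand can be routed through the GPU's sparse matrix-multiply units while the low-rank factors keep the parameter and optimizer memory small.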
Entities
Institutions
- arXiv
- NVIDIA