ARTFEED — Contemporary Art Intelligence

PATCH: A Hybrid Sparsity Framework for Efficient LLMs

ai-technology · 2026-04-30

PATCH introduces a learnable tile-level hybrid sparsity framework for large language models (LLMs) that enables a continuous sparsity ratio between 0% and 50%. It partitions each weight matrix into tiles and assigns every tile as either dense or 2:4 sparse through a learnable mask-selection mechanism, giving fine-grained control over the accuracy-acceleration tradeoff and allowing non-uniform sparsity across layers. This bridges the gap between unstructured sparsity (accurate but irregular) and semi-structured 2:4 sparsity (hardware-friendly but rigid), and the authors report superior overall quality. The paper is available on arXiv under ID 2509.23410.
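The mechanism can be sketched as follows. This is a minimal illustration, not the paper's implementation: the tile size, the magnitude-based 2:4 pruning rule, and the `hybrid_sparsify` helper are all assumptions made for the example; in 2:4 sparsity, at most 2 of every 4 consecutive weights are nonzero, so pruning only a subset of tiles yields any overall ratio between 0% and 50%.

```python
import numpy as np

def mask_2to4(tile):
    """Keep the 2 largest-magnitude weights in every group of 4 (2:4 sparsity)."""
    flat = tile.reshape(-1, 4)
    order = np.argsort(np.abs(flat), axis=1)  # smallest magnitudes first
    mask = np.ones_like(flat)
    np.put_along_axis(mask, order[:, :2], 0.0, axis=1)  # zero the 2 smallest
    return (flat * mask).reshape(tile.shape)

def hybrid_sparsify(W, tile, sparse_tiles):
    """Hypothetical helper: apply 2:4 pruning only to the listed tiles;
    the rest stay dense. Overall sparsity is 0.5 * (#sparse tiles / #tiles),
    i.e. anywhere in [0%, 50%]."""
    out = W.copy()
    for (i, j) in sparse_tiles:
        r, c = i * tile, j * tile
        out[r:r + tile, c:c + tile] = mask_2to4(out[r:r + tile, c:c + tile])
    return out

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))           # 4 tiles of 4x4
pruned = hybrid_sparsify(W, tile=4, sparse_tiles=[(0, 0), (1, 1)])
print(np.mean(pruned == 0.0))         # 2 of 4 tiles at 50% -> 0.25 overall
```

Pruning two of the four tiles gives 25% overall sparsity, halfway between fully dense and fully 2:4, which is exactly the continuum the tile-level design exposes.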

Key facts

  • PATCH enables continuous sparsity ratio between 0% and 50%
  • Partitions weight matrices into tiles with learnable mask selection
  • Supports non-uniform sparsity across layers
  • Bridges unstructured and semi-structured 2:4 sparsity
  • Published on arXiv (ID 2509.23410)
  • Announce type: replace-cross

Entities

Institutions

  • arXiv

Sources