ARTFEED — Contemporary Art Intelligence

Xe-Forge: LLM-Powered Kernel Optimization for Intel GPU

ai-technology · 2026-05-27

Xe-Forge is a multi-stage, LLM-powered pipeline that automates kernel optimization for Intel GPUs. It addresses a manual bottleneck: applying low-level optimizations (quantization, memory access coalescing, tile size tuning, and architecture-specific workarounds) to Triton kernels by hand. The system applies up to nine optimization stages, including algorithmic restructuring, operator fusion, block pointer modernization, GPU-specific tuning, and open-ended discovery. Each stage is driven by a Chain-of-Verification-and-Refinement (CoVeR) agent that generates candidate kernels and validates them. The work is published on arXiv (2605.26118) and is aimed at easing the porting of deep learning algorithms to new hardware accelerators.
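The staged generate-and-validate flow described above can be pictured with a minimal Python sketch. This is not the Xe-Forge implementation or its API: the names `Candidate`, `is_correct`, `run_stage`, and `run_pipeline` are hypothetical, the "kernels" are plain Python functions standing in for Triton kernels, the LLM proposer is replaced by hard-coded stub stages, and the `cost` values are made-up stand-ins for measured runtimes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional

# Stand-in for a compiled kernel: a plain function over a list of floats.
Kernel = Callable[[List[float]], List[float]]


@dataclass
class Candidate:
    name: str
    kernel: Kernel
    cost: float  # hypothetical stand-in for measured runtime; lower is better


# A stage proposes candidates given the current best and optional feedback.
Stage = Callable[[Candidate, Optional[str]], Iterable[Candidate]]


def is_correct(cand: Candidate, reference: Kernel,
               inputs: List[float], tol: float = 1e-6) -> bool:
    """Verification step: compare candidate output with the reference output."""
    expected = reference(inputs)
    try:
        actual = cand.kernel(inputs)
    except Exception:
        return False  # a crashing candidate counts as a failed validation
    return len(actual) == len(expected) and all(
        abs(a - e) <= tol for a, e in zip(actual, expected))


def run_stage(current: Candidate, propose: Stage, reference: Kernel,
              inputs: List[float], max_rounds: int = 3) -> Candidate:
    """Generate, verify, and refine until a round brings no improvement."""
    best, feedback = current, None
    for _ in range(max_rounds):
        improved = False
        for cand in propose(best, feedback):
            if not is_correct(cand, reference, inputs):
                feedback = f"{cand.name}: output mismatch"  # fed to the next round
                continue
            if cand.cost < best.cost:
                best, improved = cand, True
        if not improved:
            break
    return best


def run_pipeline(baseline: Candidate, stages: List[Stage],
                 reference: Kernel, inputs: List[float]) -> Candidate:
    """Apply the stages in order, carrying the best verified kernel forward."""
    current = baseline
    for propose in stages:
        current = run_stage(current, propose, reference, inputs)
    return current


# --- Toy usage with stub stages in place of an LLM agent ---
def reference(xs: List[float]) -> List[float]:
    return [2.0 * x + 1.0 for x in xs]


def fusion_stage(best: Candidate, feedback: Optional[str]):
    yield Candidate("fused", lambda xs: [2.0 * x + 1.0 for x in xs], cost=6.0)
    # Faster on paper but numerically wrong, so verification must reject it.
    yield Candidate("fused_wrong", lambda xs: [2.0 * x for x in xs], cost=1.0)


def tuning_stage(best: Candidate, feedback: Optional[str]):
    for tile, cost in [(32, 5.0), (64, 4.0), (128, 7.0)]:
        yield Candidate(f"tile{tile}", best.kernel, cost)


baseline = Candidate("baseline", reference, cost=10.0)
result = run_pipeline(baseline, [fusion_stage, tuning_stage],
                      reference, [0.0, 1.0, 2.5])
print(result.name, result.cost)
```

In this toy run the incorrect `fused_wrong` candidate is discarded despite its low cost, which is the point of pairing generation with validation: a stage only ever hands a verified kernel to the next stage.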

Key facts

  • Xe-Forge automates kernel optimization for Intel GPUs
  • Applies up to nine optimization stages
  • Uses Chain-of-Verification-and-Refinement (CoVeR) agents
  • Targets Triton kernels
  • Optimizations include quantization, memory access coalescing, and tile size tuning
  • Published on arXiv with ID 2605.26118
  • Addresses the manual bottleneck in porting deep learning algorithms to new hardware accelerators
  • System performs algorithmic restructuring and operator fusion

Entities

Institutions

  • Intel
  • arXiv

Sources