ARTFEED — Contemporary Art Intelligence

LLM-Based Translation of Compiler Intermediate Representations

ai-technology · 2026-05-12

A new research paper introduces IRIS-14B, a 14-billion-parameter transformer model fine-tuned to translate between GCC's GIMPLE and LLVM's LLVM IR, two distinct intermediate representations (IRs) used by major compilers. The work targets cross-toolchain interoperability, which has long been limited by the semantic and structural differences between these IRs; traditional rule-based translators that bridge them have proven complex and costly to maintain. The authors instead propose a data-driven approach in which a Large Language Model (LLM) learns the mapping from paired examples. The paper is available on arXiv under identifier 2605.08247.
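To make the maintenance burden concrete, the sketch below is a toy, hypothetical rule-based translator (not taken from the paper) that maps a single GIMPLE-style three-address SSA statement such as `x_1 = a_2 + b_3;` to an LLVM-IR-style instruction. Even this trivial case needs per-operator rules and naming conventions; real IRs add type systems, control flow, and memory models, which is why hand-written translators grow large and brittle.

```python
import re

# Toy mapping from GIMPLE-style binary operators to LLVM IR opcodes.
# Illustrative only: it assumes 32-bit integers and ignores types,
# control flow, and memory, which dominate a real translator's complexity.
OP_TO_LLVM = {"+": "add", "-": "sub", "*": "mul"}

# Matches e.g. "x_1 = a_2 + b_3;" (GIMPLE writes SSA versions as name_N).
STMT = re.compile(r"(\w+)_(\d+)\s*=\s*(\w+)_(\d+)\s*([+\-*])\s*(\w+)_(\d+);")

def gimple_to_llvm(stmt: str) -> str:
    """Translate one GIMPLE-style SSA statement to an LLVM-IR-style line."""
    m = STMT.fullmatch(stmt.strip())
    if not m:
        raise ValueError(f"unsupported statement: {stmt!r}")
    dst, dv, lhs, lv, op, rhs, rv = m.groups()
    # LLVM IR names values with a % prefix instead of an _N SSA suffix.
    return f"%{dst}{dv} = {OP_TO_LLVM[op]} i32 %{lhs}{lv}, %{rhs}{rv}"

print(gimple_to_llvm("x_1 = a_2 + b_3;"))  # → %x1 = add i32 %a2, %b3
```

Every new operator, type, or statement form needs another hand-written rule like those above; the paper's data-driven approach instead learns such correspondences from example pairs.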

Key facts

  • IRIS-14B is a 14-billion-parameter transformer model.
  • It translates between GIMPLE (GCC) and LLVM IR.
  • The paper is on arXiv with ID 2605.08247.
  • LLMs offer a data-driven alternative to rule-based translators.
  • GCC and LLVM underpin much modern software infrastructure.
  • Cross-toolchain interaction is hindered by IR differences.
  • Rule-based translators have high complexity and maintenance cost.
  • The model is fine-tuned for compiler IR translation.

Entities

Institutions

  • arXiv

Sources