ARTFEED — Contemporary Art Intelligence

HLS-Seek: QoR-Aware Code Generation for High-Level Synthesis via Proxy Comparative Reward RL

ai-technology · 2026-05-14

HLS-Seek is a new framework that addresses a gap in current LLM-based High-Level Synthesis (HLS) methods, which optimize for functional correctness while overlooking Quality of Results (QoR) metrics such as latency and resource usage. Its key insight is that reinforcement learning for HLS can be driven by relative comparisons among candidates rather than absolute synthesis outcomes. HLS-Seek therefore replaces costly synthesis-in-the-loop RL with a comparative proxy reward model, attaining a Pareto-dominance accuracy of 99.53%. To mitigate reward hacking, it employs uncertainty-aware Monte Carlo dropout switching: real Vitis HLS synthesis is invoked only for low-confidence candidates, and the results are fed back to update the proxy online, yielding a self-improving reward system and a synthesis accuracy of 81.5%.
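The comparative idea can be made concrete with a small sketch. The snippet below is not HLS-Seek's actual reward model; it is a minimal illustration, with hypothetical metric tuples, of how a Pareto-dominance comparison between two candidates yields a relative reward signal without any absolute synthesis numbers:

```python
def pareto_dominates(a, b):
    """True if candidate a Pareto-dominates b: no worse on every QoR
    metric and strictly better on at least one. Metrics are tuples such
    as (latency, resource usage), lower is better for all entries."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def comparative_reward(a, b):
    """Relative reward in {+1, -1, 0}: only an ordering between two
    candidates is needed, never an absolute QoR score."""
    if pareto_dominates(a, b):
        return 1
    if pareto_dominates(b, a):
        return -1
    return 0  # incomparable: neither dominates the other

# Hypothetical (latency_cycles, LUT_count) measurements for two designs.
fast_small = (120, 800)
slow_big = (200, 1500)
print(comparative_reward(fast_small, slow_big))  # → 1
```

A learned proxy that only has to predict this ternary ordering faces a much easier task than predicting absolute latency or resource figures, which is what makes the 99.53% Pareto-dominance accuracy plausible as a training signal.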

Key facts

  • HLS-Seek is a QoR-aware NL-to-HLS framework.
  • Existing LLM-based HLS approaches ignore QoR.
  • RL for HLS only needs relative comparisons between candidates.
  • Proxy reward model achieves 99.53% Pareto-dominance accuracy.
  • Uncertainty-aware MC dropout switching prevents reward hacking.
  • Selectively invokes real Vitis HLS synthesis for low-confidence candidates.
  • Updates the proxy online, creating a self-improving reward system.
  • HLS-Seek achieves 81.5% synthesis accuracy.
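The switching mechanism in the list above can be sketched as follows. This is a mock, not the paper's implementation: the proxy model and the Vitis HLS run are stand-in functions, and the variance threshold and sample count are illustrative assumptions. It shows the control flow of MC-dropout switching: sample the stochastic proxy several times, and fall back to real synthesis (plus an online-update record) when the sample variance signals low confidence.

```python
import random
import statistics

def proxy_reward(candidate, dropout_seed):
    """Mock proxy reward model; the seed simulates one MC-dropout pass
    (in practice, a forward pass with dropout left enabled)."""
    rng = random.Random(hash(candidate) ^ dropout_seed)
    return rng.gauss(0.5, 0.05)

def real_synthesis_reward(candidate):
    """Placeholder for a real Vitis HLS synthesis-and-measure run."""
    return 0.5

def score_with_switch(candidate, n_samples=8, var_threshold=0.01):
    """Uncertainty-aware switching: average n stochastic proxy passes;
    if their variance is high, invoke real synthesis instead and return
    a (candidate, ground-truth) pair for an online proxy update."""
    samples = [proxy_reward(candidate, s) for s in range(n_samples)]
    mean, var = statistics.mean(samples), statistics.variance(samples)
    if var > var_threshold:
        truth = real_synthesis_reward(candidate)  # low confidence path
        return truth, "synthesis", (candidate, truth)
    return mean, "proxy", None  # high confidence: cheap proxy score

reward, source, update_pair = score_with_switch("loop_pipeline_v1")
```

Because high-variance candidates are exactly the ones most likely to exploit proxy blind spots, routing them to real synthesis both blocks reward hacking and supplies fresh ground-truth pairs for the proxy's online updates.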
