ARTFEED — Contemporary Art Intelligence

LAWS: Self-Certifying Cache Architecture for Neural Inference

ai-technology · 2026-05-07

LAWS (Learning from Actual Workloads Symbolically) is a new framework for self-certifying inference caching. The system accumulates expert functions from real deployment traffic, with each expert covering a region of input space that corresponds to a node in a Probabilistic Language Trie (PLT). A self-certification theorem guarantees that approximation errors stay within provable bounds. LAWS generalizes both Mixture-of-Experts and KV prefix caching, making it strictly more expressive than either fixed setup, and it provides a theoretical guarantee of monotone hit rates.
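
As a rough mental model, the sketch below shows how such a certified lookup could work. It is a minimal illustration assuming the structure described above: PLTNode, laws_lookup, the field names, and the fallback logic are all assumptions drawn from this summary, not the paper's actual interface.

    # Hypothetical sketch of a LAWS-style certified lookup. All names
    # (PLTNode, laws_lookup, field names) are illustrative assumptions.
    from dataclasses import dataclass, field
    from typing import Callable, Dict, Optional, Sequence

    @dataclass
    class PLTNode:
        # One region of input space in the Probabilistic Language Trie.
        children: Dict[str, "PLTNode"] = field(default_factory=dict)
        expert: Optional[Callable] = None   # cached expert for this region
        eps_fit: float = float("inf")       # ε_fit: expert's fitting error
        diameter: float = float("inf")      # C_E: embedding diameter of region

    def laws_lookup(root: PLTNode, tokens: Sequence[str],
                    model: Callable, lipschitz: float, tol: float):
        # Walk the trie along the input; remember the deepest node whose
        # certificate ε_fit + 2·Λ(W)·C_E clears the caller's tolerance.
        node, best = root, None
        for t in tokens:
            node = node.children.get(t)
            if node is None:
                break
            if node.expert is not None:
                bound = node.eps_fit + 2 * lipschitz * node.diameter
                if bound <= tol:            # self-certification check
                    best = node
        if best is not None:
            return best.expert(tokens)      # certified cache hit
        return model(tokens)                # miss: run full inference

The sketch prefers the deepest certified node on the assumption that deeper PLT nodes cover tighter regions and so carry sharper certificates.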

Key facts

  • LAWS stands for Learning from Actual Workloads Symbolically.
  • It is a self-certifying inference caching architecture.
  • Each expert covers a region defined by a node in the Probabilistic Language Trie (PLT).
  • The self-certification theorem bounds approximation error by ε_fit + 2·Λ(W)·C_E (see the toy check after this list).
  • ε_fit is the expert's fitting error on its region.
  • Λ(W) is the Lipschitz constant of the model.
  • C_E is the maximum embedding diameter.
  • LAWS generalizes Mixture-of-Experts and KV prefix caching.
  • It is strictly more expressive than any fixed-K MoE or finite cache.
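
To make the certificate concrete, here is a toy check of the bound; the numbers for ε_fit, Λ(W), C_E, and the tolerance are made up for illustration and do not come from the paper.

    # Toy numbers, assumed purely for illustration; none come from the paper.
    eps_fit = 0.02      # ε_fit: the expert's fitting error on its PLT region
    lip = 0.5           # Λ(W): the model's Lipschitz constant
    c_e = 0.06          # C_E: maximum embedding diameter of the region
    tol = 0.1           # caller's error tolerance

    bound = eps_fit + 2 * lip * c_e           # 0.02 + 2*0.5*0.06 = 0.08
    print(f"certified bound = {bound:.2f}")   # 0.08 <= tol: expert may serve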

Entities

Institutions

  • arXiv
