Intelligence per Watt: New Metric for Local AI Efficiency

ai-technology · 2026-05-23

A new research paper proposes "intelligence per watt" (IPW) as a unified metric for the capability and efficiency of local AI inference. The study evaluates more than 20 state-of-the-art small language models (≤20B active parameters) on power-constrained devices such as laptops, running on local accelerators such as the Apple M4 Max. The goal is to determine whether local inference can viably redistribute demand from centralized cloud infrastructure, which is struggling to keep pace with that demand's growth. IPW relates task accuracy to power consumption, so that different model-accelerator configurations can be compared on a single scale. The paper is available on arXiv under identifier 2511.07885.

Key facts

Paper proposes intelligence per watt (IPW) as a metric for local AI efficiency.
IPW equals task accuracy per unit of power.
Evaluates 20+ small language models with ≤20B active parameters.
Uses local accelerators like Apple M4 Max.
Targets power-constrained devices such as laptops.
Asks whether local inference can viably redistribute demand from centralized cloud infrastructure.
Centralized cloud infrastructure is struggling to keep pace with demand growth.
Paper available on arXiv: 2511.07885.
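
As a rough illustration of the "task accuracy per unit of power" definition above, the sketch below computes IPW for a few model-accelerator configurations and ranks them. It is a minimal sketch, not the paper's evaluation code: the Measurement type, the function names, and all numbers are hypothetical, and it assumes accuracy is a fraction between 0 and 1 and power is the mean draw in watts during inference. The paper's exact measurement procedure is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One model-accelerator configuration (hypothetical structure)."""
    model: str
    accelerator: str
    accuracy: float         # assumed: fraction of tasks solved, in [0, 1]
    avg_power_watts: float  # assumed: mean power draw during inference


def intelligence_per_watt(m: Measurement) -> float:
    """Task accuracy divided by power, per the definition in the key facts."""
    if m.avg_power_watts <= 0:
        raise ValueError("average power must be positive")
    return m.accuracy / m.avg_power_watts


def rank_by_ipw(measurements: list[Measurement]) -> list[Measurement]:
    """Order configurations from highest to lowest IPW."""
    return sorted(measurements, key=intelligence_per_watt, reverse=True)


# Illustrative, made-up numbers; not results from the paper.
configs = [
    Measurement("model-a", "laptop-accelerator", accuracy=0.60, avg_power_watts=30.0),
    Measurement("model-b", "laptop-accelerator", accuracy=0.75, avg_power_watts=50.0),
]

for m in rank_by_ipw(configs):
    print(f"{m.model} on {m.accelerator}: IPW = {intelligence_per_watt(m):.4f}")
```

With these made-up inputs, the less accurate model-a ranks first because it draws proportionally less power. That is the trade-off the metric is meant to expose.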
