QuIDE: New Metric Quantifies Neural Network Efficiency Trade-offs
Researchers have introduced QuIDE, a new metric for assessing quantized neural networks. It is built on the Intelligence Index I = (C x P)/log_2(T+1), which folds compression (C), accuracy (P), and latency (T) into a single score. Experiments across six scenarios (SimpleCNN on MNIST and CIFAR, ResNet-18 on ImageNet-1K, and Llama-3-8B) reveal a task-specific Pareto knee: 4-bit quantization is optimal for MNIST and large LLMs, while 8-bit is preferable for harder CNN tasks such as ResNet-18 on ImageNet, where 4-bit post-training quantization causes a catastrophic accuracy collapse. An accuracy-gated variant, I', flags non-viable configurations that the raw index I would otherwise endorse. QuIDE also supplies a reproducible evaluation protocol and a practical fitness function for mixed-precision search.
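To make the index concrete, here is a minimal sketch of computing I for a few configurations. The compression ratios, accuracies, and latencies below are hypothetical illustrations, not numbers from the paper; it assumes C is the compression ratio relative to fp32, P is top-1 accuracy, and T is latency in milliseconds.

```python
import math

def intelligence_index(compression: float, accuracy: float, latency_ms: float) -> float:
    """QuIDE Intelligence Index: I = (C x P) / log_2(T + 1)."""
    return (compression * accuracy) / math.log2(latency_ms + 1)

# Hypothetical configurations: (compression ratio, top-1 accuracy, latency in ms)
configs = {
    "fp32": (1.0, 0.76, 20.0),
    "int8": (4.0, 0.75, 8.0),
    "int4": (8.0, 0.40, 5.0),  # illustrating a PTQ accuracy collapse
}
for name, (c, p, t) in configs.items():
    print(f"{name}: I = {intelligence_index(c, p, t):.3f}")
```

Note that with these illustrative numbers the collapsed int4 configuration still scores highest on raw I, since the compression and latency gains outweigh the accuracy loss; this is exactly the failure mode the accuracy-gated variant I' is meant to catch.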
Key facts
- QuIDE is a new metric for quantized neural network efficiency.
- Intelligence Index I = (C x P)/log_2(T+1).
- Experiments include SimpleCNN (MNIST, CIFAR), ResNet-18 (ImageNet-1K), and Llama-3-8B.
- 4-bit quantization optimal for MNIST and large LLMs.
- 8-bit quantization optimal for complex CNN tasks like ResNet-18 on ImageNet.
- 4-bit PTQ causes catastrophic accuracy collapse on ResNet-18/ImageNet.
- Accuracy-gated variant I' flags non-viable configurations.
- QuIDE offers a reproducible evaluation protocol and fitness function for mixed-precision search.
Entities
Institutions
- arXiv