ARTFEED — Contemporary Art Intelligence

QuIDE: New Metric Quantifies Neural Network Efficiency Trade-offs

publication · 2026-05-13

Researchers have introduced QuIDE, a new metric for assessing quantized neural networks. It is built on an Intelligence Index, I = (C × P) / log₂(T + 1), which folds compression, accuracy, and latency into a single score. Experiments across six scenarios, including SimpleCNN on MNIST and CIFAR, ResNet-18 on ImageNet-1K, and Llama-3-8B, reveal a task-specific Pareto knee: 4-bit quantization wins for MNIST and large LLMs, while 8-bit is preferable for harder CNN tasks such as ResNet-18 on ImageNet, where 4-bit post-training quantization causes catastrophic accuracy collapse. An accuracy-gated variant, I', flags non-viable configurations that the base index would otherwise endorse. QuIDE also provides a reproducible evaluation protocol and a practical fitness function for mixed-precision search.

Key facts

  • QuIDE is a new metric for quantized neural network efficiency.
  • Intelligence Index: I = (C × P) / log₂(T + 1).
  • Experiments include SimpleCNN (MNIST, CIFAR), ResNet-18 (ImageNet-1K), and Llama-3-8B.
  • 4-bit quantization optimal for MNIST and large LLMs.
  • 8-bit quantization optimal for complex CNN tasks like ResNet-18 on ImageNet.
  • 4-bit PTQ causes catastrophic accuracy collapse on ResNet-18/ImageNet.
  • Accuracy-gated variant I' flags non-viable configurations.
  • QuIDE offers a reproducible evaluation protocol and fitness function for mixed-precision search.
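The Intelligence Index and its accuracy-gated variant can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the interpretation of C as compression ratio, P as accuracy, and T as latency follows the summary above, while the units and the gate threshold `p_min` are hypothetical placeholders, since the article does not specify them.

```python
import math

def intelligence_index(c: float, p: float, t: float) -> float:
    """Intelligence Index I = (C * P) / log2(T + 1).

    c: compression ratio (assumed, e.g. FP32 size / quantized size)
    p: accuracy (assumed, e.g. top-1 in [0, 1])
    t: latency (assumed units, e.g. milliseconds)
    """
    return (c * p) / math.log2(t + 1)

def gated_index(c: float, p: float, t: float, p_min: float = 0.5) -> float:
    """Accuracy-gated variant I': score collapses to 0 for non-viable configs.

    The floor p_min is a hypothetical placeholder; the actual gating rule
    is defined in the paper, not in this summary.
    """
    return 0.0 if p < p_min else intelligence_index(c, p, t)
```

Under such a gate, a 4-bit PTQ configuration whose accuracy has collapsed (as reported for ResNet-18 on ImageNet) scores zero regardless of how favorable its compression and latency are, which is how I' screens out configurations the base index might still rank highly.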

Entities

Institutions

  • arXiv

Sources