QuIDE: New Metric Quantifies Neural Network Efficiency Trade-offs
Researchers have introduced QuIDE, a new metric for assessing quantized neural networks. It is built on the Intelligence Index I = (C x P)/log_2(T+1), which folds compression (C), accuracy (P), and latency (T) into a single score. Experiments across six scenarios (SimpleCNN on MNIST and CIFAR, ResNet-18 on ImageNet-1K, and Llama-3-8B) reveal a task-specific Pareto knee: 4-bit quantization is optimal for MNIST and large LLMs, while 8-bit is preferable for harder CNN tasks such as ResNet-18 on ImageNet, where 4-bit post-training quantization causes a catastrophic accuracy collapse. An accuracy-gated variant, I', flags non-viable configurations that the raw index I would otherwise endorse. QuIDE also supplies a reproducible evaluation protocol and a practical fitness function for mixed-precision search.
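To make the index concrete, here is a minimal sketch of computing I for a few configurations. The compression ratios, accuracies, and latencies below are hypothetical illustrations, not numbers from the paper; it assumes C is the compression ratio relative to fp32, P is top-1 accuracy, and T is latency in milliseconds.

```python
import math

def intelligence_index(compression: float, accuracy: float, latency_ms: float) -> float:
    """QuIDE Intelligence Index: I = (C x P) / log_2(T + 1)."""
    return (compression * accuracy) / math.log2(latency_ms + 1)

# Hypothetical configurations: (compression ratio, top-1 accuracy, latency in ms)
configs = {
    "fp32": (1.0, 0.76, 20.0),
    "int8": (4.0, 0.75, 8.0),
    "int4": (8.0, 0.40, 5.0),  # illustrating a PTQ accuracy collapse
}
for name, (c, p, t) in configs.items():
    print(f"{name}: I = {intelligence_index(c, p, t):.3f}")
```

Note that with these illustrative numbers the collapsed int4 configuration still scores highest on raw I, since the compression and latency gains outweigh the accuracy loss; this is exactly the failure mode the accuracy-gated variant I' is meant to catch.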
Key facts
- QuIDE is a new metric for quantized neural network efficiency.
- Intelligence Index I = (C x P)/log_2(T+1).
- Experiments include SimpleCNN (MNIST, CIFAR), ResNet-18 (ImageNet-1K), and Llama-3-8B.
- 4-bit quantization optimal for MNIST and large LLMs.
- 8-bit quantization optimal for complex CNN tasks like ResNet-18 on ImageNet.
- 4-bit PTQ causes catastrophic accuracy collapse on ResNet-18/ImageNet.
- Accuracy-gated variant I' flags non-viable configurations.
- QuIDE offers a reproducible evaluation protocol and fitness function for mixed-precision search.
Entities
Institutions
- arXiv