Weight Pruning Amplifies Bias in LLMs for Edge AI
A recent investigation published on arXiv indicates that weight pruning, a common technique for deploying Large Language Models (LLMs) on resource-constrained IoT and edge devices, can substantially amplify model bias. The study presents a systematic empirical analysis of three instruction-tuned models (Gemma-2-9b-it, Mistral-7B-Instruct-v0.3, Phi-3.5-mini-instruct) and three pruning techniques (Random, Magnitude, Wanda) across four sparsity levels (10-70%), evaluated on 12,148 BBQ bias benchmark items with five random seeds, for a total of 2,368,860 inference records. The findings reveal a "Smart Pruning Paradox": activation-aware pruning (Wanda) best preserves language quality, with only a 3.5% perplexity rise at 50% sparsity for Mistral-7B, yet produces the largest bias increase. At 70% sparsity, the Stereotype Reliance Score jumps by 83.7%, and 47-59% of previously unbiased items develop new stereotypical behaviors. Random pruning, by contrast, destroys language capability outright, with perplexity exceeding 10^4. The research underscores an unintended consequence of efficiency-focused pruning: methods optimized for accuracy retention may quietly compromise fairness in edge AI applications.
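To make the contrast between the pruning criteria concrete, here is a minimal NumPy sketch (not the authors' code) of magnitude pruning versus Wanda-style activation-aware pruning. Wanda scores each weight by the product of its magnitude and the L2 norm of the corresponding input activation, pruning the lowest-scored weights per output row; the function names, matrix shapes, and the per-row comparison group here are illustrative assumptions.

```python
import numpy as np

def magnitude_prune(W, sparsity):
    """Zero out the smallest-magnitude weights globally in W.
    Ties at the threshold may prune slightly more than the target fraction."""
    k = int(W.size * sparsity)
    if k == 0:
        return W.copy()
    threshold = np.partition(np.abs(W).ravel(), k - 1)[k - 1]
    return np.where(np.abs(W) <= threshold, 0.0, W)

def wanda_prune(W, X, sparsity):
    """Wanda-style pruning: score_ij = |W_ij| * ||X_j||_2, pruning the
    lowest-scored weights within each output row (a common granularity)."""
    norms = np.linalg.norm(X, axis=0)       # per-input-feature activation norm
    scores = np.abs(W) * norms              # broadcasts over output rows
    k = int(W.shape[1] * sparsity)
    pruned = W.copy()
    for i in range(W.shape[0]):
        idx = np.argsort(scores[i])[:k]     # lowest-importance columns in row i
        pruned[i, idx] = 0.0
    return pruned

# Toy usage: W is (out_features x in_features), X is calibration
# activations (tokens x in_features); both are random stand-ins.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))
X = rng.normal(size=(32, 16))
W_mag = magnitude_prune(W, 0.5)
W_wanda = wanda_prune(W, X, 0.5)
```

The sketch also hints at why Wanda preserves perplexity better than the alternatives: small weights that feed high-norm activations are retained, so the pruned network keeps the connections that matter most for prediction quality, even as (per the study) its bias profile shifts.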
Key facts
- Study conducted on three instruction-tuned models: Gemma-2-9b-it, Mistral-7B-Instruct-v0.3, Phi-3.5-mini-instruct
- Three pruning methods tested: Random, Magnitude, Wanda
- Four sparsity levels: 10-70%
- Dataset: 12,148 BBQ bias benchmark items with 5 random seeds
- Total inference records: 2,368,860 (a breakdown is sketched after this list)
- Wanda pruning preserves perplexity (3.5% increase at 50% sparsity for Mistral-7B) but amplifies bias the most
- At 70% sparsity, Stereotype Reliance Score increases 83.7%
- 47-59% of previously unbiased items develop new stereotypical behaviors at 70% sparsity
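The reported record count is consistent with the stated design if each model is also evaluated once unpruned: 3 models x (3 methods x 4 sparsity levels + 1 baseline) = 39 configurations, each run over 12,148 items with 5 seeds. The unpruned baseline is an assumption not stated above; the check below reproduces the total under it.

```python
# Hypothetical breakdown reproducing the reported 2,368,860 records;
# the per-model unpruned baseline is an assumption.
models, methods, sparsities, baselines = 3, 3, 4, 1
configs = models * (methods * sparsities + baselines)  # 3 * 13 = 39
records = configs * 12_148 * 5                         # items x seeds
assert records == 2_368_860
```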
Entities
Source
- arXiv (preprint repository)