LoRP: Training-Free Depth Pruning for LLMs Using Representation Locality

ai-technology · 2026-05-28

Researchers have introduced Locality-Aware Redundancy Pruning (LoRP), a training-free framework for depth pruning of large language models. LoRP computes a Representation Locality Score (RLS) from the similarity of hidden states across layers to determine whether redundancy is localized or spread across the network. Using a small calibration set, it measures pairwise similarity between layers, groups the layers into clusters, and removes redundant layers within each cluster. Experiments on several LLM families report improved perplexity.

Key facts

arXiv:2605.27786v1
LoRP is training-free and one-shot
RLS measures representation locality
Uses small calibration set
Clusters layers by similarity
Prunes based on intra-cluster redundancy
Tested on diverse LLM families
Improves perplexity
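
The similarity, clustering, and pruning steps summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the summary does not give the RLS formula, the similarity metric, or the clustering rule, so the sketch assumes mean cosine similarity between per-token hidden states, a greedy grouping of consecutive layers against a fixed threshold, and a rule that keeps the first layer of each cluster. The function names `layer_similarity`, `cluster_contiguous`, and `select_layers_to_prune` are hypothetical, and the RLS itself is left out.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def layer_similarity(hidden_states):
    """Pairwise layer similarity on a calibration set.

    hidden_states[l] is a list of per-token vectors produced by layer l.
    Assumption: similarity is the mean cosine similarity over tokens.
    """
    n = len(hidden_states)
    sim = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            pairs = zip(hidden_states[i], hidden_states[j])
            score = sum(cosine(u, v) for u, v in pairs) / len(hidden_states[i])
            sim[i][j] = sim[j][i] = score
    return sim


def cluster_contiguous(sim, threshold):
    """Greedily group consecutive layers (assumed rule, not from the paper).

    A layer joins the current cluster if its similarity to the cluster's
    first layer is at least `threshold`; otherwise it starts a new cluster.
    """
    clusters = [[0]]
    for layer in range(1, len(sim)):
        anchor = clusters[-1][0]
        if sim[anchor][layer] >= threshold:
            clusters[-1].append(layer)
        else:
            clusters.append([layer])
    return clusters


def select_layers_to_prune(clusters, budget):
    """Pick up to `budget` layers to drop, largest clusters first.

    Assumption: the first layer of every cluster is always kept.
    """
    ranked = sorted(clusters, key=len, reverse=True)
    candidates = [layer for cluster in ranked for layer in cluster[1:]]
    return sorted(candidates[:budget])


# Toy calibration data: 4 layers, 2 tokens, 2-dimensional hidden states.
toy_states = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.1], [0.1, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
    [[0.1, 1.0], [1.0, 0.1]],
]
toy_sim = layer_similarity(toy_states)
toy_clusters = cluster_contiguous(toy_sim, threshold=0.9)
toy_pruned = select_layers_to_prune(toy_clusters, budget=1)
```

On this toy input, layers 0 and 1 fall into one cluster and layers 2 and 3 into another, and a budget of one removes layer 1. With a real model, the hidden states would come from a forward pass over the calibration set, and the threshold and budget would need tuning.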
