Decision Boundary-aware Generation for Long-tailed Learning

other · 2026-05-06

A new framework, Decision Boundary-aware Generation (DBG), addresses long-tailed data bias in machine learning. Long-tailed datasets skew decision boundaries toward head classes, reducing tail-class accuracy. Diffusion-based generative augmentation mitigates this by generating additional data, and head-to-tail transfer is used to reduce the generator's own bias. The transfer step, however, can cause latent non-local feature mixing, which leads to decision boundary overlap and a shift in the tail-class distribution. DBG identifies this boundary ambiguity problem and generates informative near-boundary samples to promote a more separable decision space. The framework rebalances long-tailed datasets and improves classifier performance, with consistent improvements reported on standard long-tailed benchmarks.
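
To make "rebalancing" concrete, the sketch below builds an exponentially decaying class-size profile and computes how many generated samples each class would need to match the largest class. It is only an illustration: the exponential profile, the function names, and the example sizes are assumptions chosen here, not details taken from the DBG paper.

```python
def long_tailed_counts(n_max, num_classes, imbalance_ratio):
    """Class sizes decaying exponentially from n_max (head) toward
    n_max / imbalance_ratio (tail). Hypothetical helper for illustration."""
    if num_classes == 1:
        return [n_max]
    return [
        int(n_max * (1.0 / imbalance_ratio) ** (i / (num_classes - 1)))
        for i in range(num_classes)
    ]


def samples_to_generate(counts):
    """Number of synthetic samples per class needed to reach the head-class size."""
    target = max(counts)
    return [target - c for c in counts]


# Assumed example: 10 classes, 5000 head samples, imbalance ratio 100.
counts = long_tailed_counts(5000, 10, 100)
deficit = samples_to_generate(counts)
for cls, (n, g) in enumerate(zip(counts, deficit)):
    print(f"class {cls}: {n} real samples, {g} to generate")
```

The tail classes carry almost the entire generation budget, which is why the quality and placement of generated tail samples matter so much for the final decision boundaries.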

Key facts

Long-tailed data biases decision boundaries toward head classes and degrades tail-class accuracy.
Diffusion-based generative augmentation addresses this by generating additional data.
Head-to-tail transfer mitigates generator bias but induces latent non-local feature mixing.
Feature mixing causes decision boundary overlap and tail class distribution shift.
The DBG framework promotes near-boundary representation learning.
DBG generates informative near-boundary samples.
DBG rebalances long-tailed datasets while yielding a more separable decision space.
DBG is evaluated on standard long-tailed benchmarks.
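
The summary does not say how DBG decides which samples count as near-boundary. One generic way to make the idea concrete, used here purely as an assumption and not as the paper's method, is the classifier's probability margin: the gap between the top two softmax probabilities. A small margin means the classifier is torn between two classes, so the sample sits close to a decision boundary. The function names and the threshold value below are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def boundary_margin(logits):
    """Top-1 minus top-2 probability; values near 0 indicate a near-boundary sample."""
    probs = sorted(softmax(logits), reverse=True)
    return probs[0] - probs[1]


def select_near_boundary(batch_logits, threshold=0.2):
    """Indices of samples whose margin falls below the (assumed) threshold."""
    return [
        i for i, logits in enumerate(batch_logits)
        if boundary_margin(logits) < threshold
    ]


# Three illustrative logit vectors: confident, ambiguous, fully tied.
batch = [[5.0, 0.0, 0.0], [1.0, 0.9, -2.0], [0.0, 0.0, 0.0]]
print(select_near_boundary(batch))
```

In this toy batch the first sample is classified confidently and is skipped, while the second and third have small margins and are selected.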

Sources

arXiv cs.AI — 2026-05-05