Random Node Sampling Matches Full-Graph GNN Training
A recent study published on arXiv (2605.22480) reports that Random Node Sampling (RNS), the simplest mini-batch training method for Graph Neural Networks (GNNs), matches or outperforms full-graph training on 8 of 10 datasets while using less wall-clock time and memory. Applying backward error analysis to graph mini-batch Stochastic Gradient Descent (SGD), the researchers show that it implicitly minimizes the sampled loss plus a regularizer proportional to the variance of the mini-batch gradient, a variance that the sampler shapes. Although RNS discards local structure, it yields mini-batches whose expected loss is closer to the full-graph loss. The result challenges previous work that focused on structure-aware samplers designed to preserve connectivity and reduce variance.
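To make the sampling step concrete, here is a minimal NumPy sketch of one way to draw an RNS mini-batch. It assumes that RNS means picking nodes uniformly at random and training on the subgraph they induce; the paper's exact procedure may differ. The function name `random_node_sample`, the `(2, E)` edge-list layout, and the toy ring graph are illustrative choices, not taken from the study.

```python
import numpy as np

def random_node_sample(edge_index, num_nodes, batch_size, rng):
    """Sample nodes uniformly at random and return the induced subgraph.

    Hypothetical helper for illustration only.
    edge_index: int array of shape (2, E), one column per directed edge.
    Returns (nodes, sub_edge_index), with edges relabeled to 0..batch_size-1.
    """
    # Uniform sampling without replacement; local structure is ignored.
    nodes = np.sort(rng.choice(num_nodes, size=batch_size, replace=False))

    # Keep only edges whose two endpoints were both sampled.
    in_batch = np.zeros(num_nodes, dtype=bool)
    in_batch[nodes] = True
    keep = in_batch[edge_index[0]] & in_batch[edge_index[1]]

    # Relabel the surviving endpoints to their positions within the batch.
    relabel = np.full(num_nodes, -1, dtype=np.int64)
    relabel[nodes] = np.arange(batch_size)
    sub_edge_index = relabel[edge_index[:, keep]]
    return nodes, sub_edge_index


# Toy usage: a 6-node directed ring, one batch of 4 nodes.
rng = np.random.default_rng(0)
ring = np.array([[0, 1, 2, 3, 4, 5],
                 [1, 2, 3, 4, 5, 0]])
nodes, sub_edges = random_node_sample(ring, num_nodes=6, batch_size=4, rng=rng)
```

Edges that cross the batch boundary are simply dropped, which is how this sketch discards local structure. A structure-aware sampler would instead choose nodes so that more of those edges survive.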
Key facts
- Random Node Sampling (RNS) matches or outperforms full-graph training on 8 of 10 datasets.
- RNS uses less wall-clock time and memory than full-graph training.
- Backward error analysis shows mini-batch SGD implicitly minimizes sampled loss plus a variance-based regularizer.
- The regularizer is proportional to mini-batch gradient variance, shaped by the sampler.
- RNS discards local structure but produces mini-batches with expected loss closer to full-graph loss.
- The study challenges the need for structure-aware samplers in GNN mini-batch training.
- The research is published on arXiv with ID 2605.22480.
- Mini-batch training of GNNs differs fundamentally from mini-batch training on i.i.d. data, because sampling alters the graph topology and introduces boundary effects.
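The implicit-regularization fact listed above can be written schematically as follows. This is a sketch of the general shape only, not the paper's formula. The notation is assumed for illustration: $S$ is the sampler, $B$ a mini-batch drawn from it, $L_B$ the loss computed on that mini-batch, and $\lambda > 0$ an unspecified proportionality constant.

```latex
\tilde{L}_S(\theta)
  \;=\;
  \underbrace{\mathbb{E}_{B \sim S}\big[ L_B(\theta) \big]}_{\text{sampled loss}}
  \;+\;
  \lambda \,
  \underbrace{\mathbb{E}_{B \sim S}\Big[ \big\| \nabla L_B(\theta)
      - \mathbb{E}_{B' \sim S}\big[ \nabla L_{B'}(\theta) \big] \big\|^2 \Big]}_{\text{mini-batch gradient variance}}
```

Read this way, the sampler enters both terms. It determines how far the expected mini-batch loss sits from the full-graph loss, and it sets the size of the variance penalty.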
Entities
Institutions
- arXiv