FreeScale: Reducing Training Costs for Recommendation Models
FreeScale is a new system that reduces the computational cost of training sequence recommendation models by tackling GPU under-utilization caused by stragglers and blocking communications. It combines load-balanced input sampling, prioritized overlapping of embedding communication with computation, and SM-Free (streaming-multiprocessor-free) techniques that resolve GPU resource competition during that overlap. Empirical evaluation reports up to a 90.3% reduction in computational bubbles.
Key facts
- FreeScale is introduced to mitigate straggler problems in training recommendation models.
- It uses load-balanced input samples to reduce stragglers.
- Prioritized embedding communications are overlapped with computations to minimize blocking.
- SM-Free techniques avoid competition for GPU streaming multiprocessors while communication overlaps with computation.
- Empirical evaluation shows up to 90.3% reduction in computational bubbles.
- The paper is available on arXiv with ID 2604.24073.
- The system targets modern industrial deep learning recommendation models.
- Heterogeneous data characteristics (such as varying sequence lengths across samples) cause the resource under-utilization FreeScale targets.
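The "load-balanced input samples" idea can be illustrated with a small sketch: since the cost of a sequence sample grows with its length, assigning the longest samples first to the currently lightest worker keeps per-GPU loads even and reduces stragglers. This is a minimal greedy (longest-processing-time) heuristic written for illustration; the function name and heuristic are assumptions, not FreeScale's actual algorithm.

```python
import heapq

def balance_samples(seq_lengths, num_workers):
    """Greedily assign samples (identified by index) to workers so the
    maximum per-worker total sequence length stays roughly balanced.
    Illustrative only; not FreeScale's actual load balancer."""
    # Min-heap of (current_load, worker_id): pop the lightest worker.
    heap = [(0, w) for w in range(num_workers)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_workers)]
    # Place the longest samples first on the lightest worker so far.
    for idx, length in sorted(enumerate(seq_lengths), key=lambda p: -p[1]):
        load, w = heapq.heappop(heap)
        assignment[w].append(idx)
        heapq.heappush(heap, (load + length, w))
    return assignment

lengths = [50, 10, 200, 40, 120, 30, 60, 90]
groups = balance_samples(lengths, 2)
loads = [sum(lengths[i] for i in g) for g in groups]
print(loads)  # → [300, 300]: both workers get equal work here
```

A naive round-robin split of the same list would leave one worker with noticeably more total sequence length, and that worker would become the straggler each step.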
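The overlap of embedding communication with computation can also be sketched: while batch i is being computed, the embeddings for batch i+1 are fetched in the background, so compute never blocks on a transfer that could have been started earlier. Here a plain worker thread stands in for a separate hardware communication stream, and `fetch_embeddings`/`compute` are hypothetical stand-ins for the all-to-all exchange and the model step; on real GPUs FreeScale additionally keeps this overlap from consuming streaming multiprocessors.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_embeddings(batch):   # stand-in for the embedding all-to-all
    return [x * 2 for x in batch]

def compute(embeds):           # stand-in for the forward/backward pass
    return sum(embeds)

def train(batches):
    results = []
    with ThreadPoolExecutor(max_workers=1) as comm:
        # Prefetch embeddings for the first batch.
        future = comm.submit(fetch_embeddings, batches[0])
        for i in range(len(batches)):
            embeds = future.result()           # blocks only if comm lags
            if i + 1 < len(batches):
                # Start the next transfer before computing, so it
                # overlaps with compute(embeds) below.
                future = comm.submit(fetch_embeddings, batches[i + 1])
            results.append(compute(embeds))
    return results

print(train([[1, 2], [3, 4], [5]]))  # → [6, 14, 10]
```

Without the early `submit`, each step would serialize transfer then compute, producing exactly the blocking "bubbles" the summary describes.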
Entities
Institutions
- arXiv