Dynamic Graph-Based Data Scheduling for LLM Training
A new framework called D$^3$ (Dynamic Directional graph-constrained Data scheduling) addresses a factor that is often overlooked in large language model (LLM) optimization: the interactions between training samples. Most existing data scheduling strategies adjust the overall data distribution but ignore the directional influences that samples exert on one another, even though those influences bear on the order in which data should be presented. D$^3$ models these interactions as a dynamic influence graph whose edges encode loss-based dependencies, then solves a constrained optimization problem over that graph to determine a training sequence that prioritizes the more influential train-units. The approach aims to improve learning efficiency by respecting sample dependencies. The paper is available on arXiv under identifier 2605.31164.
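To make the idea of ordering train-units over an influence graph concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the graph is given as a matrix where `influence[i][j]` estimates how much training on unit `i` lowers the loss on unit `j`, and it swaps in a simple greedy heuristic in place of the constrained optimization, whose exact form is not described here. The function names `greedy_order` and `forward_influence` are hypothetical.

```python
from typing import List


def greedy_order(influence: List[List[float]]) -> List[int]:
    """Order units so that strongly influential units tend to come first.

    Assumption: influence[i][j] is the estimated benefit to unit j of
    training on unit i beforehand. This greedy rule is an illustrative
    stand-in for a real constrained solver.
    """
    remaining = set(range(len(influence)))
    order: List[int] = []
    while remaining:
        def net(i: int) -> float:
            # Net outgoing influence of i over the units not yet scheduled.
            return sum(influence[i][j] - influence[j][i]
                       for j in remaining if j != i)

        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=net)
        order.append(best)
        remaining.remove(best)
    return order


def forward_influence(influence: List[List[float]], order: List[int]) -> float:
    """Total influence flowing from earlier units to later units in `order`."""
    pos = {unit: k for k, unit in enumerate(order)}
    n = len(influence)
    return sum(influence[i][j]
               for i in range(n) for j in range(n)
               if i != j and pos[i] < pos[j])


# Toy graph with three train-units; the values are made up for illustration.
toy_influence = [
    [0.0, 0.1, 0.0],
    [0.5, 0.0, 0.4],
    [0.2, 0.0, 0.0],
]
toy_order = greedy_order(toy_influence)
toy_score = forward_influence(toy_influence, toy_order)
```

In this toy graph, unit 1 helps both other units the most, so the heuristic schedules it first. In a dynamic setting the matrix would presumably be re-estimated as training progresses and the order recomputed, which this sketch leaves out.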
Key facts
- D$^3$ is a Dynamic Directional graph-constrained Data scheduling framework for LLM training.
- It formulates interactions among train-units as a dynamic influence graph with loss-based dependencies.
- The framework solves a constrained optimization problem over the graph to derive the training order.
- Existing data scheduling strategies neglect underlying interactions between samples.
- Real-world data samples exhibit directional influences on each other.
- Prioritizing train-units with greater influence is intended to improve learning efficiency.
- The paper is available on arXiv with ID 2605.31164.
- The approach is designed for large language model optimization.
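One way to picture a loss-based, directional dependency between train-units is sketched below. This is an assumption-laden toy, not the paper's definition: it uses a one-parameter linear model, takes a single gradient step on a source unit, and records how much the loss on each other unit drops as a result. The helper names (`unit_loss`, `unit_grad`, `influence_matrix`) and the learning rate are hypothetical.

```python
from typing import List, Tuple

Unit = List[Tuple[float, float]]  # a train-unit as a list of (x, y) pairs


def unit_loss(w: float, unit: Unit) -> float:
    """Mean squared error of the toy model y ~ w * x on one unit."""
    return sum((w * x - y) ** 2 for x, y in unit) / len(unit)


def unit_grad(w: float, unit: Unit) -> float:
    """Gradient of unit_loss with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in unit) / len(unit)


def influence_matrix(w: float, units: List[Unit], lr: float = 0.1) -> List[List[float]]:
    """infl[i][j] = loss drop on unit j after one gradient step on unit i.

    Positive values mean unit i helps unit j; negative values mean it hurts.
    The matrix depends on the current w, so it changes as training proceeds.
    """
    n = len(units)
    infl = [[0.0] * n for _ in range(n)]
    for i, src in enumerate(units):
        w_next = w - lr * unit_grad(w, src)  # hypothetical one-step update
        for j, dst in enumerate(units):
            if i != j:
                infl[i][j] = unit_loss(w, dst) - unit_loss(w_next, dst)
    return infl


# Units 0 and 1 agree (both fit w = 2); unit 2 conflicts (fits w = -2).
toy_units = [[(1.0, 2.0)], [(2.0, 4.0)], [(1.0, -2.0)]]
toy_infl = influence_matrix(0.0, toy_units)
```

Even in this toy, the influence is directional: a step on unit 0 lowers the loss on unit 1 by a different amount than a step on unit 1 lowers the loss on unit 0, and a step on unit 0 raises the loss on the conflicting unit 2.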
Entities
Institutions
- arXiv