Multi-Node Lookahead Prediction Enhances Neural Routing Policy Training
Multi-node Lookahead Prediction (MnLP) is a new training strategy that improves neural policies for vehicle routing problems. Current training methods focus on next-node prediction, which leads to myopic decision-making. MnLP extends supervised learning to predict multiple future nodes simultaneously, using causal, discardable modules that operate only during training. Because these modules are dropped after training, the approach preserves inference-time efficiency while giving the policy long-range contextual understanding. In the reported experiments, MnLP outperforms existing training methods.
Key facts
- MnLP is a novel training strategy for neural routing policies.
- Current training paradigms focus on next-node prediction, causing myopic decisions.
- MnLP predicts multiple future nodes simultaneously.
- Causal and discardable MnLP modules operate only during training.
- MnLP preserves inference-time efficiency.
- Multi-depth auxiliary supervision is incorporated into the loss function.
- MnLP equips neural policies with long-range contextual understanding.
- In experiments, MnLP outperforms existing training methods.
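The multi-depth auxiliary supervision listed above can be illustrated with a small sketch. The summary does not give the exact form of the loss, so everything here is an assumption made for illustration: one prediction head per lookahead depth, a cross-entropy term per head, and fixed per-depth weights. The names `lookahead_loss`, `head_logits`, and `aux_weights` are hypothetical and not taken from the source, and the causal structure of the MnLP modules is not modeled.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Negative log-probability assigned to the target node."""
    return -math.log(softmax(logits)[target])


def lookahead_loss(
    head_logits: Sequence[Sequence[float]],
    tour: Sequence[int],
    step: int,
    aux_weights: Sequence[float],
) -> float:
    """Weighted sum of per-depth cross-entropy terms (illustrative only).

    head_logits[d] holds logits over nodes for the node at tour position
    step + 1 + d. Head 0 is the ordinary next-node head; heads 1, 2, ...
    are the auxiliary lookahead heads, assumed here to be discarded after
    training. aux_weights[d - 1] is the (assumed) weight for depth d >= 1.
    """
    total = 0.0
    for depth, logits in enumerate(head_logits):
        position = step + 1 + depth
        if position >= len(tour):
            break  # the reference tour has no node this far ahead
        weight = 1.0 if depth == 0 else aux_weights[depth - 1]
        total += weight * cross_entropy(logits, tour[position])
    return total


# Usage: three heads with uniform logits over four nodes, one reference tour.
uniform_heads = [[0.0, 0.0, 0.0, 0.0] for _ in range(3)]
reference_tour = [0, 2, 1, 3]
loss_start = lookahead_loss(uniform_heads, reference_tour, step=0, aux_weights=[0.5, 0.25])
loss_late = lookahead_loss(uniform_heads, reference_tour, step=2, aux_weights=[0.5, 0.25])
```

In this sketch, only head 0 would be kept at inference time, which is one way to read the claim that the lookahead modules add no inference cost. Near the end of a tour the deeper heads simply have no target and contribute nothing, as `loss_late` shows.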