Exascale Graph Foundation Models Enable Billion-Scale Materials Discovery in Seconds
An exascale workflow for materials discovery has been introduced, built on atomistic graph foundation models based on HydraGNN. The system is trained on 16 publicly available first-principles datasets comprising more than 544 million structures spanning over 85 elements. A multi-task architecture with per-dataset heads, backed by a scalable ADIOS2/DDStore data pipeline, enabled six large-scale DeepHyper hyperparameter optimization campaigns run in FP64 precision on the Frontier supercomputer. The top-performing message-passing models were promoted to sustained training on 2,048 nodes, yielding a PaiNN-based lead model. That model supports billion-scale screening, evaluating 1.1 billion atomistic structures in 50 seconds, a workload that would otherwise demand years of first-principles computation. The workflow also supports data-scarce fine-tuning for downstream applications and demonstrates transferability across twelve chemically diverse tasks. The researchers further evaluated precision-performance tradeoffs among the BF16, FP32, and FP64 formats. Together, these results mark a substantial advance in high-performance computing for materials science.
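The multi-task design mentioned above pairs a shared model body with a separate output head per training dataset. A minimal sketch of that pattern, in pure Python: the toy `shared_trunk` features, head names, and placeholder weights are illustrative assumptions, not the actual HydraGNN architecture.

```python
# Sketch of a multi-task model with per-dataset output heads, in the
# spirit of the HydraGNN design described above. The trunk, head names,
# and weights below are illustrative stand-ins, not the real model.

def shared_trunk(atomic_numbers):
    """Stand-in for the shared message-passing layers: maps an
    atomistic structure (here, a list of atomic numbers) to a
    pooled feature vector."""
    n = len(atomic_numbers)
    return [sum(atomic_numbers) / n, float(n)]  # toy pooled features

def make_head(weights, bias):
    """Each training dataset gets its own small readout head."""
    def head(features):
        return sum(w * f for w, f in zip(weights, features)) + bias
    return head

# One head per dataset; every head reads the same shared features.
heads = {
    "dataset_A_energy": make_head([0.5, -0.1], 0.0),
    "dataset_B_energy": make_head([0.3, 0.2], -1.0),
}

structure = [8, 1, 1]  # e.g. atomic numbers for a water molecule
features = shared_trunk(structure)
predictions = {name: head(features) for name, head in heads.items()}
print(predictions)
```

The design choice this illustrates: heterogeneous datasets with incompatible labels can still share one trunk, because only the lightweight heads are dataset-specific.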
Key facts
- Exascale workflow uses atomistic graph foundation models built on HydraGNN
- Trained on 16 open first-principles datasets with 544+ million structures covering 85+ elements
- Multi-task architecture with per-dataset heads and scalable ADIOS2/DDStore data pipeline
- Six large-scale DeepHyper hyperparameter optimization campaigns executed in FP64 on Frontier
- Top models promoted to sustained 2,048-node training yielding PaiNN-based lead model
- Evaluates 1.1 billion atomistic structures in 50 seconds
- Compresses workload that would require years of first-principles computation
- Demonstrates transfer across twelve chemically diverse downstream tasks
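The throughput and time-compression figures above can be sanity-checked with back-of-envelope arithmetic. The assumed cost of one first-principles calculation (one core-minute) is an illustrative order-of-magnitude figure, not from the source.

```python
# Back-of-envelope check of the screening numbers quoted above.

structures = 1.1e9        # structures screened (from the source)
screen_seconds = 50.0     # wall time for the screen (from the source)

throughput = structures / screen_seconds  # structures per second
print(f"{throughput:.2e} structures/s")   # 2.20e+07

# Assumption: even a cheap first-principles calculation costs on the
# order of one core-minute per structure.
dft_seconds_per_structure = 60.0
serial_years = structures * dft_seconds_per_structure / (3600 * 24 * 365)
print(f"~{serial_years:.0f} core-years if run serially")
```

Roughly 22 million structures per second versus thousands of core-years of serial first-principles work, consistent with the "years of computation" claim.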