MolCHG: Multi-Level Self-Supervised Pretraining on Compositional Hierarchical Graphs
Researchers have introduced MolCHG, a multi-level self-supervised pretraining framework for molecular property prediction. The framework is built on a Compositional Hierarchical Graph that organizes molecular structure into four node types across three semantic levels, with a bond graph operating in parallel with the atom graph. This design elevates bond-level information to independently evolving node representations, so that fragment nodes aggregate atom-level and bond-level semantics on an equal footing. The authors also propose three level-specific pretraining objectives, one of which is an atom-bond cross-view contrastive task. The work addresses two limitations of existing methods: they operate at a single structural level, and they treat bond information only as auxiliary edge attributes.
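To make the idea of a bond graph running in parallel with the atom graph concrete, here is a minimal sketch. The summary does not say how MolCHG builds its bond graph, so this assumes one common construction, the line graph, in which every bond becomes a node and two bond nodes are linked when they share an atom. The function name `build_bond_graph` and the input format are hypothetical.

```python
from itertools import combinations


def build_bond_graph(bonds):
    """Return bond-graph edges: two bonds are linked if they share an atom.

    `bonds` is a list of (atom_i, atom_j) index pairs. The position of each
    pair in the list is used as the bond-node index.
    Assumption: a line-graph construction, not necessarily MolCHG's own.
    """
    # Map each atom to the indices of the bonds that touch it.
    bonds_at_atom = {}
    for bond_idx, (a, b) in enumerate(bonds):
        bonds_at_atom.setdefault(a, []).append(bond_idx)
        bonds_at_atom.setdefault(b, []).append(bond_idx)

    # Any two bonds incident to the same atom become neighbours.
    edges = set()
    for incident in bonds_at_atom.values():
        for u, v in combinations(sorted(incident), 2):
            edges.add((u, v))
    return sorted(edges)


# Heavy-atom skeleton of ethanol: C0-C1-O2, i.e. two bonds sharing atom 1.
ethanol_bonds = [(0, 1), (1, 2)]
print(build_bond_graph(ethanol_bonds))  # [(0, 1)]
```

Under this construction a bond node can carry its own features and be updated by message passing like any atom node, which is one way to read "independently evolving node representations".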
Key facts
- MolCHG is a multi-level self-supervised pretraining framework.
- It uses a Compositional Hierarchical Graph with four node types across three semantic levels.
- A bond graph operates in parallel with the atom graph.
- Bond-level information is elevated to independently evolving node representations.
- Fragment nodes aggregate atom-level and bond-level semantics on an equal footing.
- Three level-specific pretraining objectives are designed.
- One objective is an atom-bond cross-view contrastive task.
- The work appears on arXiv with ID 2605.16088.
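The atom-bond cross-view contrastive task listed above can be illustrated with a generic InfoNCE-style loss. The summary does not give the exact objective, so this is a sketch under stated assumptions: each molecule has one pooled embedding from the atom view and one from the bond view, matching rows are positives, and all other rows in the batch are negatives. The names `info_nce` and `temperature` are illustrative, not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(atom_view, bond_view, temperature=0.1):
    """Mean InfoNCE loss; row i of each view is assumed to be molecule i.

    Assumption: a generic contrastive loss, not necessarily MolCHG's own.
    """
    total = 0.0
    for i, anchor in enumerate(atom_view):
        # Similarity of this atom-view embedding to every bond-view embedding.
        logits = [cosine(anchor, other) / temperature for other in bond_view]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the matching molecule.
        total += log_denom - logits[i]
    return total / len(atom_view)


# Toy batch of two molecules with 2-dimensional embeddings.
atom_view = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(atom_view, [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce(atom_view, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)  # True
```

The loss is small when the two views of the same molecule agree and large when they are mismatched, which is the signal a cross-view objective uses to pull atom-level and bond-level representations together.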
Entities
Institutions
- arXiv