PWRules Framework Applies Protein Words to Predict Small Molecule Binding with Interpretability

ai-technology · 2026-04-22

The PWRules framework makes protein-small molecule binding predictions more interpretable by identifying privileged small molecule fragments and defining pairing rules between those fragments and protein words (semantic sequence units). Using binding affinity data, the framework ranks word-fragment rules with the PWScore function to prioritize active compounds. On benchmark datasets, PWScore performs competitively, on par with the physics-based model Glide and the deep learning model PSICHIC. The framework also applies to protein targets outside the training dataset, such as the SARS-CoV-2 main protease. PWScore captures complementary interaction information and integrates principles and heuristics of protein-ligand interactions, which reduces the dependence on opaque deep learning models in drug discovery. The research was posted on arXiv under identifier 2604.16550v1.

Key facts

PWRules framework improves interpretability of protein-small molecule binding predictions
Identifies privileged small molecule fragments using binding affinity data
Defines complementary pairing rules between fragments and protein words (semantic sequence units)
PWScore function ranks word-fragment rules to prioritize active compounds
Achieves competitive performance comparable to physics-based model Glide and deep learning model PSICHIC
Shows broad applicability for protein targets outside training dataset, e.g., SARS-CoV-2 main protease
Captures complementary interaction information
Research announced on arXiv as a cross-listing, under identifier 2604.16550v1
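
To make the idea of ranking compounds by word-fragment rules concrete, here is a minimal Python sketch. It is not the PWScore function from the paper, whose definition is not given in this summary. The rule table, the weights, the labels (`word_A`, `frag_x`, and so on), and the function names `toy_pw_score` and `rank_compounds` are all hypothetical, and the additive scoring is an assumption chosen for simplicity.

```python
from itertools import product

# Hypothetical rule table: (protein word, ligand fragment) -> weight.
# The weights are invented for illustration; in the described framework,
# rules are derived from binding affinity data.
RULES = {
    ("word_A", "frag_x"): 2.0,
    ("word_B", "frag_y"): 1.5,
    ("word_A", "frag_y"): -0.5,
}


def toy_pw_score(protein_words, fragments, rules=RULES):
    """Sum the weights of all matching word-fragment pairs.

    Pairs that have no rule contribute 0. This additive form is an
    assumption, not the published PWScore.
    """
    pairs = product(set(protein_words), set(fragments))
    return sum(rules.get(pair, 0.0) for pair in pairs)


def rank_compounds(protein_words, compounds, rules=RULES):
    """Return compound names ordered from highest to lowest toy score."""
    scores = {
        name: toy_pw_score(protein_words, frags, rules)
        for name, frags in compounds.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    target_words = ["word_A", "word_B"]
    library = {"cpd1": ["frag_x"], "cpd2": ["frag_y"], "cpd3": []}
    print(rank_compounds(target_words, library))
```

Because each score is a sum over named rules, the contribution of every word-fragment pair can be read off directly, which is the kind of interpretability the summary attributes to rule-based scoring.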
