ProtLiD²: Ligand-Conditioned Discrete Diffusion for Protein Design
ProtLiD² is a discrete diffusion model for co-designing protein sequences and structures under explicit ligand constraints. It jointly generates amino-acid sequences and discrete structure tokens, using geometry-aware cross-attention to incorporate both chemical and geometric information about the ligand. Trained on more than one million ligand-protein complexes, ProtLiD² extends masked discrete diffusion to ligand-aware protein design, addressing a gap in existing discrete diffusion protein language models, which lack direct conditioning on small molecules. The model is described in an arXiv preprint (2605.27413).
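To make "masked discrete diffusion" over joint sequence and structure tokens concrete, the following is a minimal sketch of a forward (noising) step only. It is not the authors' implementation: the `MASK` token, the linear `mask_rate` schedule, the independent masking of the two tracks, and the example structure-token ids are all assumptions chosen for illustration.

```python
import random

MASK = "<mask>"  # hypothetical mask token shared by both tracks


def mask_rate(t):
    """Assumed linear schedule: probability of masking a position at time t in [0, 1]."""
    return t


def corrupt(seq_tokens, struct_tokens, t, rng):
    """Forward step: replace each token with MASK independently with probability mask_rate(t)."""
    if len(seq_tokens) != len(struct_tokens):
        raise ValueError("sequence and structure tracks must have equal length")
    p = mask_rate(t)
    noisy_seq = [MASK if rng.random() < p else a for a in seq_tokens]
    noisy_struct = [MASK if rng.random() < p else s for s in struct_tokens]
    return noisy_seq, noisy_struct


rng = random.Random(0)
seq = list("MKTAYIAK")                 # amino-acid tokens
struct = [17, 4, 4, 230, 9, 9, 51, 3]  # made-up discrete structure-token ids
noisy_seq, noisy_struct = corrupt(seq, struct, t=0.5, rng=rng)
```

In a scheme of this kind, a denoising network would be trained to recover the masked tokens in both tracks, and in a ligand-conditioned model it would additionally receive the ligand as input. At `t=0` nothing is masked and at `t=1` every position is masked.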
Key facts
- ProtLiD² is a ligand-conditioned discrete diffusion model for protein sequence-structure co-design.
- It jointly generates amino-acid sequence and discrete structure tokens.
- It incorporates ligand chemical and geometric information through geometry-aware cross-attention.
- It was trained on over one million ligand-protein complexes.
- It extends masked discrete diffusion to ligand-aware protein design.
- It addresses a limitation of existing discrete diffusion protein language models: the lack of direct conditioning on small molecules.
- It is available as arXiv preprint 2605.27413.
- The model enables design of sequence-structure compatible proteins under explicit ligand constraints.
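The source does not specify how the geometry-aware cross-attention is built. As a rough illustration only, one common way to make attention geometry-aware is to add a distance-dependent bias to the attention logits, so that each residue attends more strongly to nearby ligand atoms. In the sketch below, the function name `cross_attend`, the Gaussian-style distance penalty, and the `length_scale` parameter are assumptions, and learned query/key/value projections are omitted for brevity.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(res_feats, res_xyz, lig_feats, lig_xyz, length_scale=4.0):
    """Each residue (query) attends over ligand atoms (keys and values).

    Logit = scaled feature dot product minus a penalty that grows with the
    squared residue-atom distance (an assumed form of geometric bias).
    """
    d = len(res_feats[0])
    out = []
    for q, q_pos in zip(res_feats, res_xyz):
        logits = []
        for k, k_pos in zip(lig_feats, lig_xyz):
            dist_sq = sum((a - b) ** 2 for a, b in zip(q_pos, k_pos))
            chem = sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            logits.append(chem - dist_sq / (2 * length_scale ** 2))
        weights = softmax(logits)
        # Weighted sum of ligand atom features (values equal keys here).
        out.append([sum(w * v[j] for w, v in zip(weights, lig_feats)) for j in range(d)])
    return out


# Two residues and two ligand atoms with toy features and coordinates.
res_feats = [[1.0, 0.0], [0.0, 1.0]]
res_xyz = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
lig_feats = [[1.0, 0.0], [0.0, 1.0]]
lig_xyz = [(1.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
ctx = cross_attend(res_feats, res_xyz, lig_feats, lig_xyz)
```

Here the first residue sits close to the first ligand atom and the second residue close to the second, so each residue's context vector is dominated by the features of its nearest atom.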
Entities
Institutions
- arXiv