SILO: Self-Improvement Imitation for Protein Design Under Oracle Budgets
SILO is a framework for optimizing protein sequences through trajectory-level self-improvement under strict oracle budgets. It uses a hierarchical edit policy that factorizes each mutation into two decisions: selecting a position, then choosing a residue at that position. In each active-learning round, the policy samples candidate trajectories without replacement using incremental stochastic beam search (SBS). Candidates with functionally significant edits are then selected for in silico oracle evaluation using a UCB-based proxy ensemble together with an alanine-scan fitness score (AFS). The policy is then updated by next-action cross-entropy imitation. The design targets two weaknesses of existing reinforcement learning and off-policy generative methods: sensitivity to surrogate noise, and indiscriminate mutation proposals that can disrupt critical residues.
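The two-step edit policy and sampling without replacement can be illustrated with a minimal sketch. It assumes the joint probability of an edit factorizes as p(position) * p(residue | position) and uses the Gumbel-top-k trick, the building block behind stochastic beam search, to draw distinct single edits. The names `edit_log_probs` and `gumbel_top_k` and the random logits are hypothetical. This is not the paper's incremental SBS over full trajectories, only a one-step illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def edit_log_probs(pos_logits, res_logits):
    """Joint log-probability of every (position, residue) edit.

    Assumed factorization: log p(i, r) = log p_pos(i) + log p_res(r | i).
    res_logits[i] holds the 20 residue logits for position i.
    """
    pos_lp = log_softmax(pos_logits)
    joint = {}
    for i, row in enumerate(res_logits):
        for residue, lp in zip(AMINO_ACIDS, log_softmax(row)):
            joint[(i, residue)] = pos_lp[i] + lp
    return joint


def gumbel_top_k(log_probs, k, rng):
    """Draw k distinct keys without replacement via the Gumbel-top-k trick."""
    perturbed = []
    for key, lp in log_probs.items():
        u = max(rng.random(), 1e-12)  # guard against log(0)
        gumbel = -math.log(-math.log(u))
        perturbed.append((lp + gumbel, key))
    perturbed.sort(reverse=True)
    return [key for _, key in perturbed[:k]]


# Hypothetical logits for a length-5 sequence; a real policy would be a network.
rng = random.Random(0)
seq_len = 5
pos_logits = [rng.gauss(0, 1) for _ in range(seq_len)]
res_logits = [[rng.gauss(0, 1) for _ in AMINO_ACIDS] for _ in range(seq_len)]

joint = edit_log_probs(pos_logits, res_logits)
edits = gumbel_top_k(joint, k=4, rng=rng)
```

Because the perturbed scores are ranked rather than resampled, no edit can be drawn twice, which avoids spending oracle queries on duplicate candidates.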
Key facts
- SILO is a self-improvement imitation framework for protein design.
- It uses a hierarchical edit policy with position and residue choices.
- Candidate trajectories are sampled without replacement via incremental stochastic beam search (SBS).
- A UCB-based proxy ensemble and an alanine-scan fitness score (AFS) select candidates for in silico oracle evaluation.
- The policy is updated by next-action cross-entropy imitation.
- It addresses surrogate noise and the disruption of critical residues by indiscriminate mutations.
- Published on arXiv with ID 2605.26690.
- Method targets tight oracle budgets in protein sequence optimization.
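The selection and imitation steps listed above can be sketched as follows. The sketch assumes the common UCB form of ensemble mean plus beta times ensemble standard deviation, and treats imitation as the mean negative log-likelihood of the chosen actions. The function names, `beta`, and the toy numbers are all hypothetical. The AFS term is omitted because its exact definition is not given here.

```python
import math
import statistics


def ucb_score(predictions, beta=1.0):
    """Upper confidence bound from an ensemble: mean + beta * std (assumed form)."""
    mean = statistics.fmean(predictions)
    std = statistics.pstdev(predictions)
    return mean + beta * std


def select_top(ensemble_preds, budget, beta=1.0):
    """Return the `budget` candidates with the highest UCB score.

    ensemble_preds maps a candidate sequence to its per-model predictions.
    """
    ranked = sorted(
        ensemble_preds,
        key=lambda cand: ucb_score(ensemble_preds[cand], beta),
        reverse=True,
    )
    return ranked[:budget]


def next_action_cross_entropy(step_log_probs, actions):
    """Mean negative log-likelihood of the taken actions along one trajectory.

    step_log_probs[t] maps each available action at step t to its log-probability;
    actions[t] is the action actually taken at step t.
    """
    nll = [-step_log_probs[t][a] for t, a in enumerate(actions)]
    return sum(nll) / len(nll)


# Toy ensemble predictions for three hypothetical candidates.
preds = {
    "ACDE": [0.9, 1.1, 1.0],  # confident, mean 1.0
    "ACDF": [0.2, 1.8, 1.0],  # uncertain, mean 1.0
    "ACDG": [0.1, 0.2, 0.3],  # confident, mean 0.2
}
chosen = select_top(preds, budget=2, beta=1.0)

# Toy two-step trajectory where each taken action had probability 0.5.
steps = [
    {(0, "A"): math.log(0.5), (1, "C"): math.log(0.5)},
    {(2, "D"): math.log(0.5), (3, "E"): math.log(0.5)},
]
loss = next_action_cross_entropy(steps, [(0, "A"), (3, "E")])
```

With equal means, the uncertain candidate ranks above the confident one, which is the exploration behavior a UCB rule is meant to provide under a limited budget.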
Entities
Institutions
- arXiv