SILO: Self-Improvement Imitation for Protein Design Under Oracle Budgets

other · 2026-05-27

SILO is a framework for optimizing protein sequences through trajectory-level self-improvement under strict oracle budgets. It uses a hierarchical edit policy that factors each mutation into two decisions: first a position, then a residue. In each active-learning round, the policy samples candidate trajectories without replacement using incremental stochastic beam search (SBS). Candidates whose edits are functionally significant are then selected for in silico oracle evaluation using a UCB-based proxy ensemble together with an alanine-scan fitness score (AFS). The policy is then updated by next-action cross-entropy imitation. The approach targets two weaknesses of existing reinforcement learning and off-policy generative methods: sensitivity to surrogate noise, and indiscriminate mutation proposals that can damage critical residues.
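
The two-step factorization and the imitation objective can be illustrated with a minimal sketch. It assumes the policy exposes one logit per position and, for each position, one logit per amino acid, and that imitation means minimizing the negative log-likelihood of the edits in a selected trajectory. The names `edit_log_prob` and `imitation_loss`, the 20-letter alphabet, and the toy uniform logits are illustrative assumptions, not the paper's implementation.

```python
import math

# Assumption: the 20 standard amino acids as the residue vocabulary.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def edit_log_prob(position_logits, residue_logits, position, residue):
    """Hierarchical edit: log p(edit) = log p(position) + log p(residue | position)."""
    log_p_pos = log_softmax(position_logits)[position]
    log_p_res = log_softmax(residue_logits[position])[AMINO_ACIDS.index(residue)]
    return log_p_pos + log_p_res


def imitation_loss(trajectory):
    """Mean next-action cross-entropy over the edits of one trajectory.

    Each step is a tuple (position_logits, residue_logits, position, residue).
    """
    nll = [-edit_log_prob(*step) for step in trajectory]
    return sum(nll) / len(nll)


# Toy example: a length-3 sequence with uniform (all-zero) logits.
seq_len = 3
pos_logits = [0.0] * seq_len
res_logits = [[0.0] * len(AMINO_ACIDS) for _ in range(seq_len)]
step = (pos_logits, res_logits, 1, "A")
loss = imitation_loss([step, step])
```

With uniform logits every edit has probability 1/(3 × 20), so the loss equals log 60. A trained policy would lower this value on the trajectories it imitates.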

Key facts

SILO is a self-improvement imitation framework for protein design.
It uses a hierarchical edit policy with position and residue choices.
Candidate trajectories are sampled via stochastic beam search without replacement.
Candidates are selected with a UCB-based proxy ensemble and an alanine-scan fitness score (AFS).
The policy is updated by next-action cross-entropy imitation.
It addresses surrogate noise and the disruption of critical residues.
Published on arXiv with ID 2605.26690.
The method targets tight oracle budgets in protein sequence optimization.
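
The sampling and selection steps listed above can be sketched in simplified form. The sketch uses the Gumbel-top-k trick, the idea underlying stochastic beam search, applied to a flat list of already enumerated candidates; it is not the incremental, trajectory-level procedure that SILO uses. The UCB score is assumed to be the ensemble mean plus `beta` times the ensemble standard deviation. The AFS term is omitted because the source does not say how it is combined with the UCB score. All function names and numbers are hypothetical.

```python
import math
import random
import statistics


def gumbel_top_k(log_probs, k, rng):
    """Sample k distinct indices without replacement.

    Adds independent Gumbel(0, 1) noise to each log-probability and keeps
    the k largest perturbed values (the Gumbel-top-k trick).
    """
    perturbed = []
    for i, lp in enumerate(log_probs):
        u = max(rng.random(), 1e-300)  # guard against log(0)
        gumbel = -math.log(-math.log(u))
        perturbed.append((lp + gumbel, i))
    perturbed.sort(reverse=True)
    return [i for _, i in perturbed[:k]]


def ucb_scores(ensemble_predictions, beta=1.0):
    """UCB per candidate: mean + beta * std across ensemble members.

    ensemble_predictions[m][i] is member m's prediction for candidate i.
    """
    n_candidates = len(ensemble_predictions[0])
    scores = []
    for i in range(n_candidates):
        preds = [member[i] for member in ensemble_predictions]
        scores.append(statistics.fmean(preds) + beta * statistics.pstdev(preds))
    return scores


# Toy example: four candidates, a two-member proxy ensemble.
rng = random.Random(0)
log_probs = [math.log(p) for p in (0.5, 0.2, 0.2, 0.1)]
sampled = gumbel_top_k(log_probs, k=3, rng=rng)

preds = [[1.0, 2.0, 0.5, 0.0],
         [1.0, 0.0, 0.5, 0.0]]
scores = ucb_scores(preds, beta=1.0)

# Pick the sampled candidate with the highest UCB score for oracle evaluation.
best = max(sampled, key=lambda i: scores[i])
```

In this toy case candidates 0 and 1 have the same ensemble mean, but candidate 1 receives the higher UCB score because the ensemble members disagree on it. This is how a UCB criterion directs a limited oracle budget toward uncertain but promising candidates.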
