Masked-Position JEPA Improves Protein Language Model Performance
A new study on arXiv (2605.07554) proposes masked-position MLM+JEPA, a training recipe that combines latent-space prediction with masked language modeling (MLM) for protein sequence encoders. The method predicts latent targets only at masked positions while retaining the MLM cross-entropy loss. Tested on ESM2 models (35M and 150M parameters) and randomly initialized encoders, it outperforms MLM-only training on downstream tasks: 10 wins/3 losses/3 ties for ESM2-35M and 11/2/3 for ESM2-150M on a 16-task suite comprising 15 frozen linear probes and SCOPe-40 zero-shot fold retrieval. Results for pretraining from scratch are mixed (6/8/2). The work suggests that latent prediction complements token-level objectives under matched wall-clock training budgets.
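A minimal PyTorch sketch of an objective of this shape: an MLM cross-entropy term plus a latent regression term computed only at masked positions. The module names (`encoder`, `target_encoder`, `predictor`, `lm_head`), the EMA target branch, the smooth-L1 regression loss, and the `lambda_jepa` weighting are illustrative assumptions, not details confirmed by the paper.

```python
# Sketch of a masked-position MLM+JEPA loss; names and loss choices are
# assumptions for illustration, not the paper's implementation.
import torch
import torch.nn.functional as F

def mlm_jepa_loss(encoder, target_encoder, predictor, lm_head,
                  tokens, masked_tokens, mask, lambda_jepa=1.0):
    """tokens: (B, L) original ids; masked_tokens: (B, L) ids after masking;
    mask: (B, L) bool, True at masked positions."""
    # Context encoder sees the corrupted sequence.
    h = encoder(masked_tokens)                       # (B, L, D)

    # Standard MLM cross-entropy, computed at masked positions.
    logits = lm_head(h)                              # (B, L, V)
    mlm = F.cross_entropy(logits[mask], tokens[mask])

    # Latent targets from a separate target encoder (assumed here to be an
    # EMA copy) on the clean sequence; no gradients flow into this branch.
    with torch.no_grad():
        z_target = target_encoder(tokens)            # (B, L, D)

    # JEPA term: predict the latent target, but only at masked positions.
    z_pred = predictor(h)                            # (B, L, D)
    jepa = F.smooth_l1_loss(z_pred[mask], z_target[mask])

    return mlm + lambda_jepa * jepa
```

In a setup like this, the target encoder's weights would typically be updated after each optimizer step as an exponential moving average of the context encoder, keeping the latent targets stable without a gradient path.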
Key facts
- ProteinJEPA combines latent prediction with MLM.
- Masked-position variant predicts latent targets only at masked positions.
- Tested on ESM2-35M and ESM2-150M models.
- Outperforms MLM-only on 10 of 16 tasks for ESM2-35M.
- Outperforms MLM-only on 11 of 16 tasks for ESM2-150M.
- Pretraining from scratch gives mixed results (6 wins/8 losses/2 ties).
- Downstream suite includes 15 frozen linear probes and SCOPe-40 zero-shot fold retrieval (see the probe sketch after this list).
- Study conducted under matched wall-clock budgets.
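For context on the evaluation protocol, a frozen linear probe trains only a linear head on fixed encoder embeddings. The sketch below is a generic illustration; the mean pooling, dimensions, and training settings are assumptions, not the paper's actual probe setup.

```python
# Generic frozen-linear-probe sketch (hypothetical names and dimensions);
# the paper's 15 probe tasks may differ in pooling, labels, and setup.
import torch
import torch.nn as nn

@torch.no_grad()
def embed(encoder, tokens):
    # Freeze the pretrained encoder and mean-pool per-residue embeddings
    # into one fixed-size vector per sequence (assumes (B, L, D) output).
    encoder.eval()
    return encoder(tokens).mean(dim=1)               # (B, D)

def train_probe(encoder, loader, dim=640, num_classes=10, epochs=5):
    probe = nn.Linear(dim, num_classes)              # the only trained module
    opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for tokens, labels in loader:
            loss = loss_fn(probe(embed(encoder, tokens)), labels)
            opt.zero_grad()
            loss.backward()                          # updates the probe only
            opt.step()
    return probe
```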