Masked-Position JEPA Improves Protein Language Model Performance
A new study on arXiv (2605.07554) proposes masked-position MLM+JEPA, a training recipe that combines latent-space prediction with masked language modeling (MLM) for protein sequence encoders. The method predicts latent targets only at masked positions while retaining the MLM cross-entropy loss. Tested on ESM2 models (35M and 150M parameters) and randomly initialized encoders, it outperforms MLM-only training on downstream tasks: 10 wins/3 losses/3 ties for ESM2-35M and 11/2/3 for ESM2-150M on a 16-task suite comprising 15 frozen linear probes and SCOPe-40 zero-shot fold retrieval. Results for pretraining from scratch are mixed (6/8/2). The work suggests that latent prediction complements token-level objectives under matched wall-clock training budgets.
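A minimal PyTorch sketch of an objective of this shape: an MLM cross-entropy term plus a latent regression term computed only at masked positions. The module names (`encoder`, `target_encoder`, `predictor`, `lm_head`), the EMA target branch, the smooth-L1 regression loss, and the `lambda_jepa` weighting are illustrative assumptions, not details confirmed by the paper.

```python
# Sketch of a masked-position MLM+JEPA loss; names and loss choices are
# assumptions for illustration, not the paper's implementation.
import torch
import torch.nn.functional as F

def mlm_jepa_loss(encoder, target_encoder, predictor, lm_head,
                  tokens, masked_tokens, mask, lambda_jepa=1.0):
    """tokens: (B, L) original ids; masked_tokens: (B, L) ids after masking;
    mask: (B, L) bool, True at masked positions."""
    # Context encoder sees the corrupted sequence.
    h = encoder(masked_tokens)                       # (B, L, D)

    # Standard MLM cross-entropy, computed at masked positions.
    logits = lm_head(h)                              # (B, L, V)
    mlm = F.cross_entropy(logits[mask], tokens[mask])

    # Latent targets from a separate target encoder (assumed here to be an
    # EMA copy) on the clean sequence; no gradients flow into this branch.
    with torch.no_grad():
        z_target = target_encoder(tokens)            # (B, L, D)

    # JEPA term: predict the latent target, but only at masked positions.
    z_pred = predictor(h)                            # (B, L, D)
    jepa = F.smooth_l1_loss(z_pred[mask], z_target[mask])

    return mlm + lambda_jepa * jepa
```

In a setup like this, the target encoder's weights would typically be updated after each optimizer step as an exponential moving average of the context encoder, keeping the latent targets stable without a gradient path.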
Key facts
- ProteinJEPA combines latent prediction with MLM.
- Masked-position variant predicts latent targets only at masked positions.
- Tested on ESM2-35M and ESM2-150M models.
- Outperforms MLM-only on 10 of 16 tasks for ESM2-35M.
- Outperforms MLM-only on 11 of 16 tasks for ESM2-150M.
- Pretraining from scratch gives mixed results (6 wins/8 losses/2 ties).
- Downstream suite includes 15 frozen linear probes and SCOPe-40 zero-shot fold retrieval (see the probe sketch after this list).
- Study conducted under matched wall-clock budgets.
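For context on the evaluation protocol, a frozen linear probe trains only a linear head on fixed encoder embeddings. The sketch below is a generic illustration; the mean pooling, dimensions, and training settings are assumptions, not the paper's actual probe setup.

```python
# Generic frozen-linear-probe sketch (hypothetical names and dimensions);
# the paper's 15 probe tasks may differ in pooling, labels, and setup.
import torch
import torch.nn as nn

@torch.no_grad()
def embed(encoder, tokens):
    # Freeze the pretrained encoder and mean-pool per-residue embeddings
    # into one fixed-size vector per sequence (assumes (B, L, D) output).
    encoder.eval()
    return encoder(tokens).mean(dim=1)               # (B, D)

def train_probe(encoder, loader, dim=640, num_classes=10, epochs=5):
    probe = nn.Linear(dim, num_classes)              # the only trained module
    opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for tokens, labels in loader:
            loss = loss_fn(probe(embed(encoder, tokens)), labels)
            opt.zero_grad()
            loss.backward()                          # updates the probe only
            opt.step()
    return probe
```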