ARTFEED — Contemporary Art Intelligence

scpFormer: Foundation Model for Single-Cell Proteomics Integration

other · 2026-04-24

Researchers have introduced scpFormer, a transformer-based foundation model for single-cell proteomics, pre-trained on more than 390 million cells. Instead of index-based tokenization, the model uses a continuous, sequence-anchored scheme: it combines Evolutionary Scale Modeling (ESM) with value-aware expression embeddings, so that variable antibody panels map into a shared semantic space without artificial discretization. The resulting global cell representations are reported to perform well in large-scale batch integration and unsupervised clustering. The open-vocabulary design also supports in silico panel expansion, which helps reconstruct biological manifolds in sparse clinical datasets. The study additionally examines the protein co-expression logic the model has learned, and it targets a core obstacle in single-cell proteomic data integration: fragmented, targeted antibody panels.
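
The sequence-anchored, value-aware tokenization described above can be illustrated with a minimal sketch. This is not scpFormer's code: `sequence_embedding` is a hypothetical stand-in that hashes the sequence instead of calling a real ESM model, `value_embedding` uses a sinusoidal encoding chosen only for illustration, and the amino-acid strings are toy placeholders rather than real protein sequences. The point it shows is that a token is built from the protein's sequence plus its continuous measured value, with no per-panel index table and no binning.

```python
import hashlib
import math

DIM = 8  # toy embedding width (assumption); real models use far wider vectors

# Toy placeholder sequences, NOT the real sequences of these proteins.
TOY_SEQUENCES = {
    "CD3": "MKTAYIAKQR",
    "CD4": "MNRGVPFRHL",
    "CD8": "MALPVTALLL",
}


def sequence_embedding(sequence: str, dim: int = DIM) -> list[float]:
    """Hypothetical stand-in for an ESM embedding.

    Maps an amino-acid string to a deterministic vector in [-1, 1], so the
    same protein gets the same anchor regardless of which panel measured it.
    """
    digest = hashlib.sha256(sequence.encode("ascii")).digest()
    return [digest[i] / 255.0 * 2.0 - 1.0 for i in range(dim)]


def value_embedding(value: float, dim: int = DIM) -> list[float]:
    """Encode an expression value continuously with sinusoids (no bins)."""
    x = math.log1p(max(value, 0.0))  # compress the dynamic range
    half = dim // 2
    out = []
    for i in range(half):
        freq = 1.0 / (10.0 ** (i / half))
        out.append(math.sin(x * freq))
        out.append(math.cos(x * freq))
    return out


def tokenize_cell(panel: dict[str, float],
                  sequences: dict[str, str]) -> list[list[float]]:
    """Build one token per measured protein: sequence anchor + value code."""
    tokens = []
    for name, value in panel.items():
        anchor = sequence_embedding(sequences[name])
        expression = value_embedding(value)
        tokens.append([a + e for a, e in zip(anchor, expression)])
    return tokens


# Two cells measured with different antibody panels that share CD3.
panel_a = {"CD3": 5.0, "CD4": 0.0}
panel_b = {"CD8": 2.5, "CD3": 5.0}
tokens_a = tokenize_cell(panel_a, TOY_SEQUENCES)
tokens_b = tokenize_cell(panel_b, TOY_SEQUENCES)
```

Because the token depends only on the sequence and the measured value, CD3 at the same expression level yields the same token in both panels, even though it sits at a different position in each.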

Key facts

  • scpFormer is a transformer-based foundation model for single-cell proteomics.
  • Pre-trained on over 390 million cells.
  • Uses continuous, sequence-anchored tokenization instead of index-based tokenization.
  • Combines Evolutionary Scale Modeling (ESM) with value-aware expression embeddings.
  • Maps variable antibody panels into a shared semantic space without artificial discretization.
  • Generates global cell representations for batch integration and clustering.
  • Open-vocabulary architecture facilitates in silico panel expansion.
  • Aids reconstruction of biological manifolds in sparse clinical datasets.
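
To make the idea of a global cell representation over variable panels concrete, here is a small sketch under stated assumptions. Mean pooling stands in for the transformer, which is a simplification and not the model's actual aggregation, and the token vectors are made-up numbers. It shows only that cells measured with panels of different sizes can end up as vectors of the same width, which is what lets them be compared, integrated across batches, or clustered together.

```python
import math


def pool_cell(tokens: list[list[float]]) -> list[float]:
    """Mean-pool per-protein token vectors into one fixed-width cell vector."""
    if not tokens:
        raise ValueError("a cell needs at least one measured protein")
    dim = len(tokens[0])
    return [sum(tok[i] for tok in tokens) / len(tokens) for i in range(dim)]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up token vectors: one cell from a 2-protein panel, one from a
# 3-protein panel. In a real pipeline these would come from the tokenizer.
cell_panel_a = [[1.0, 0.0, 0.5], [0.8, 0.2, 0.4]]
cell_panel_b = [[0.9, 0.1, 0.5], [0.7, 0.1, 0.3], [1.0, 0.0, 0.6]]

rep_a = pool_cell(cell_panel_a)
rep_b = pool_cell(cell_panel_b)
similarity = cosine(rep_a, rep_b)  # comparable despite different panel sizes
```

Adding a protein to a panel only adds a token to the input list; the width of the pooled vector does not change, which is the property that in silico panel expansion relies on in this simplified picture.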
