Siamese MLP Improves Operon Prediction Over DGEB Baseline

other · 2026-05-13

A new computational method for operon identification, Siamese Contrastive Operon Pair Embeddings (SCOPE), outperforms the DGEB baseline by training a Siamese MLP classifier over fused embedding spaces. Operon prediction is central to understanding prokaryotic gene regulation and supports regulatory network reconstruction and drug development. Experimental methods such as RT-PCR and RNA-seq are accurate but labor-intensive and limited to model organisms, which motivates scalable computational approaches. Earlier computational approaches used logistic regression and decision trees. DGEB embeds each sequence independently with a pre-trained protein language model and scores gene pairs by cosine similarity. SCOPE instead learns a classifier on fused embeddings, which improves classification performance over that baseline. The paper is available on arXiv (2605.11022) with the announcement type "cross".
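The cosine-similarity scoring attributed to DGEB above can be illustrated with a minimal sketch. The toy vectors, the `predict_same_operon` helper, and the 0.5 threshold are assumptions made for illustration; the summary does not say how DGEB turns similarity scores into decisions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)

def predict_same_operon(emb_a, emb_b, threshold=0.5):
    """Hypothetical decision rule: same operon if similarity >= threshold."""
    return cosine_similarity(emb_a, emb_b) >= threshold

# Toy 4-dimensional embeddings (made-up numbers, not real model output).
gene_a = [0.9, 0.1, 0.3, 0.0]
gene_b = [0.8, 0.2, 0.4, 0.1]
gene_c = [-0.1, 0.9, -0.5, 0.7]

print(predict_same_operon(gene_a, gene_b))
print(predict_same_operon(gene_a, gene_c))
```

Nothing is learned in this setup: each gene is embedded on its own and the pair score depends only on the angle between the two vectors.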

Key facts

SCOPE uses a Siamese MLP classifier over fused embedding spaces
DGEB benchmark uses independent embeddings with pairwise cosine similarity
Operon identification is fundamental for understanding prokaryotic gene regulation
Experimental methods like RT-PCR and RNA-seq are precise but laborious
Prior computational approaches used logistic regression and decision trees
The method was published on arXiv with ID 2605.11022
The announcement type is cross
SCOPE improves upon the DGEB baseline
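
As a rough illustration of the first key fact, a Siamese pair scorer over fused embeddings might look like the sketch below. It is not SCOPE's actual architecture: the fusion by concatenation, the layer sizes, the pair features (absolute difference and elementwise product), and the names `fuse` and `SiamesePairScorer` are all assumptions, and the weights are random because training (including any contrastive objective) is omitted.

```python
import math
import random

def fuse(*embeddings):
    """Assumed fusion: concatenate one gene's embeddings from several models."""
    return [x for emb in embeddings for x in emb]

def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (untrained)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, layer, relu=False):
    """Apply one dense layer, optionally followed by ReLU."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out

class SiamesePairScorer:
    """Hypothetical Siamese MLP: one shared encoder, one classification head."""

    def __init__(self, emb_dim, hidden_dim=8, seed=0):
        rng = random.Random(seed)
        self.encoder = make_layer(emb_dim, hidden_dim, rng)  # shared by both genes
        self.head = make_layer(2 * hidden_dim, 1, rng)

    def score(self, fused_a, fused_b):
        # The same encoder weights process both genes (the "Siamese" part).
        h_a = dense(fused_a, self.encoder, relu=True)
        h_b = dense(fused_b, self.encoder, relu=True)
        # Symmetric pair features, so score(a, b) equals score(b, a).
        pair = ([abs(x - y) for x, y in zip(h_a, h_b)]
                + [x * y for x, y in zip(h_a, h_b)])
        logit = dense(pair, self.head)[0]
        return 1.0 / (1.0 + math.exp(-logit))  # probability-like score

# Toy embeddings from two hypothetical models (3-d and 2-d), fused to 5-d.
gene_a = fuse([0.1, 0.2, 0.3], [0.4, 0.5])
gene_b = fuse([0.2, 0.1, 0.4], [0.3, 0.6])

scorer = SiamesePairScorer(emb_dim=5)
print(scorer.score(gene_a, gene_b))
```

Unlike fixed cosine similarity, the encoder and head here have parameters that could be fit to labeled operon pairs, which is the kind of learned comparison the summary credits for SCOPE's improvement.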
