AttnGen: Attention-Guided Framework Boosts Genomic Sequence Classification
A new training framework called AttnGen embeds interpretability directly into the optimization process for deep neural networks that classify genomic sequences. It computes nucleotide-level importance scores with an attention mechanism and progressively suppresses low-contribution positions during training, so that predictions concentrate on informative regions. On the demo_human_or_worm benchmark (binary classification over 200-nucleotide sequences), AttnGen with moderate masking reaches 96.73% validation accuracy, 0.90 percentage points above a conventional CNN baseline (95.83%), and it also converges faster. The work is described in arXiv:2605.14073.
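The suppression step can be illustrated with a minimal sketch. It is not the paper's implementation: the summary does not specify the attention architecture, the masking schedule, or what "moderate" masking means numerically. The function names, the linear schedule, the 0.4 maximum fraction, and the use of "N" as a mask symbol are all assumptions chosen for illustration, and the attention logits are made-up values rather than model outputs.

```python
import math

def softmax(logits):
    """Numerically stable softmax; turns raw scores into importance weights."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mask_fraction(epoch, total_epochs, max_fraction=0.4):
    """Hypothetical linear schedule: 0 at the first epoch, max_fraction at the last."""
    if total_epochs <= 1:
        return max_fraction
    return max_fraction * epoch / (total_epochs - 1)

def suppress_low_importance(sequence, logits, fraction, mask_char="N"):
    """Replace the lowest-importance positions of `sequence` with `mask_char`."""
    scores = softmax(logits)
    # Small epsilon guards against float round-off just below an integer.
    n_mask = math.floor(fraction * len(sequence) + 1e-9)
    # Positions ordered from least to most important; ties broken by index.
    order = sorted(range(len(sequence)), key=lambda i: (scores[i], i))
    masked = set(order[:n_mask])
    return "".join(mask_char if i in masked else base
                   for i, base in enumerate(sequence))

# Toy example: a 10-nucleotide sequence with illustrative attention logits.
sequence = "ACGTACGTAC"
logits = [0.1, 2.0, 0.3, 1.5, 0.2, 2.2, 0.4, 1.8, 0.0, 1.1]

for epoch in range(3):
    frac = mask_fraction(epoch, total_epochs=3)
    print(epoch, round(frac, 2), suppress_low_importance(sequence, logits, frac))
```

In a real training loop the logits would come from the model's attention layer and be recomputed each epoch, so the masked set would shift as the model learns. Here they are held fixed to keep the example small.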
Key facts
- AttnGen is an attention-guided training framework for genomic sequence classification.
- It computes nucleotide-level importance scores using an attention mechanism.
- Low-contribution positions are progressively suppressed during training.
- Evaluated on the demo_human_or_worm benchmark with 200-nucleotide sequences.
- Validation accuracy: 96.73% with moderate masking.
- CNN baseline accuracy: 95.83%.
- AttnGen shows faster convergence than the baseline.
- Described in arXiv:2605.14073.
Entities
Institutions
- arXiv