Soft-TransFormers for Continual Learning
Key facts
- Soft-TransFormers (Soft-TF) is a parameter-efficient framework for continual learning.
- It is inspired by the Well-initialized Lottery Ticket Hypothesis (WLTH).
- Soft-TF uses soft, real-valued subnetworks over a frozen pre-trained Transformer.
- It learns task-specific multiplicative masks applied to the query, key, value, and output projections in self-attention (see the first sketch after this list).
- Masks enable smooth and stable task adaptation while preserving shared representations.
- A lightweight dual-prompt mechanism helps retain prior knowledge and mitigate catastrophic forgetting (CF) (see the second sketch after this list).
- Soft-TF achieves state-of-the-art performance on multiple continual learning benchmarks.
- It outperforms prompt-based, adapter-based, and LoRA-style baselines with minimal additional parameters.
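To make the masking idea concrete, here is a minimal PyTorch sketch of soft multiplicative masks over a frozen self-attention block: the pre-trained query/key/value and output projection weights are frozen, and a learnable, real-valued per-task mask is multiplied into them elementwise. This is an illustration under stated assumptions, not the authors' implementation; the names (SoftMaskedAttention, in_masks, out_masks) are hypothetical, and the full-size per-task masks shown here are simply the most direct instantiation of the description above.

```python
# Hypothetical sketch of soft multiplicative masking over a frozen
# self-attention block; not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftMaskedAttention(nn.Module):
    """Frozen self-attention whose Q/K/V and output projections are
    modulated elementwise by learnable real-valued masks, one set per task."""

    def __init__(self, attn: nn.MultiheadAttention, num_tasks: int):
        super().__init__()
        self.attn = attn
        for p in self.attn.parameters():          # freeze pre-trained weights
            p.requires_grad_(False)
        d = attn.embed_dim
        # in_proj_weight stacks Q, K, V: shape (3*d, d). Masks start at 1,
        # so training begins exactly at the frozen pre-trained behavior.
        self.in_masks = nn.Parameter(torch.ones(num_tasks, 3 * d, d))
        self.out_masks = nn.Parameter(torch.ones(num_tasks, d, d))

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        # x: (seq_len, batch, embed_dim)
        a = self.attn
        in_w = a.in_proj_weight * self.in_masks[task_id]     # Hadamard product
        out_w = a.out_proj.weight * self.out_masks[task_id]
        y, _ = F.multi_head_attention_forward(
            x, x, x,
            embed_dim_to_check=a.embed_dim,
            num_heads=a.num_heads,
            in_proj_weight=in_w,
            in_proj_bias=a.in_proj_bias,
            bias_k=None, bias_v=None,
            add_zero_attn=False,
            dropout_p=a.dropout,
            out_proj_weight=out_w,
            out_proj_bias=a.out_proj.bias,
            training=self.training,
        )
        return y

# Usage: only the mask tensors (not the Transformer weights) are trainable,
# and in practice only the current task's slice would be optimized.
attn = nn.MultiheadAttention(embed_dim=64, num_heads=4)
layer = SoftMaskedAttention(attn, num_tasks=5)
out = layer(torch.randn(10, 2, 64), task_id=0)   # -> (10, 2, 64)
```

Initializing the masks at 1 means every task starts from the intact pre-trained network, which is one plausible way to realize the "smooth and stable task adaptation" claim above while the shared representations stay frozen.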
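The dual-prompt bullet is stated without detail, so the second sketch below assumes a DualPrompt-style design: one general prompt shared across tasks plus a per-task expert prompt, prepended to the token sequence. All names here (DualPromptPool, general_prompt, expert_prompts) are illustrative assumptions, not identifiers from the paper.

```python
# Hypothetical DualPrompt-style pool: a shared general prompt plus a
# per-task expert prompt, prepended along the sequence dimension.
import torch
import torch.nn as nn

class DualPromptPool(nn.Module):
    def __init__(self, num_tasks: int, prompt_len: int, embed_dim: int):
        super().__init__()
        # Small random init; prompts are the only parameters trained here.
        self.general_prompt = nn.Parameter(0.02 * torch.randn(prompt_len, embed_dim))
        self.expert_prompts = nn.Parameter(0.02 * torch.randn(num_tasks, prompt_len, embed_dim))

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        # x: (seq_len, batch, embed_dim); prompts are broadcast over the batch.
        b = x.size(1)
        g = self.general_prompt.unsqueeze(1).expand(-1, b, -1)
        e = self.expert_prompts[task_id].unsqueeze(1).expand(-1, b, -1)
        return torch.cat([g, e, x], dim=0)
```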