ARTFEED — Contemporary Art Intelligence

Skill Neologisms Enable Continual Learning in LLMs

ai-technology · 2026-05-07

A new arXiv preprint (2605.04970) introduces skill neologisms, a method for continual learning in large language models. Skill neologisms are soft tokens added to the model's vocabulary and optimized to improve performance on a specific skill without any weight updates. The researchers observed that pre-trained LLMs already contain tokens associated with procedural knowledge. They demonstrated that skill neologisms can be learned to enhance capabilities on target skills while remaining composable with out-of-distribution skills, and that independently trained neologisms can be combined. This approach addresses the limitations of both fine-tuning, which risks catastrophic forgetting, and context-based methods, which suffer from limited expressiveness and context-length constraints.
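
The mechanism resembles prompt tuning: a handful of new embedding vectors are optimized against skill data while every pre-trained weight stays frozen. Below is a minimal, hypothetical PyTorch sketch of that idea; the toy model, sizes, and training loop are illustrative assumptions, not the paper's actual setup.

```python
# Hypothetical sketch: learn a "skill neologism" as trainable soft
# embeddings prepended to the input, with all model weights frozen.
# The toy model, dimensions, and data here are illustrative only.
import torch
import torch.nn as nn

VOCAB, DIM, N_SOFT = 100, 32, 4  # toy sizes, not from the paper

class ToyLM(nn.Module):
    """A stand-in for a frozen pre-trained language model."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.body = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True), 2)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, embeds):  # accepts embeddings directly
        return self.head(self.body(embeds))

model = ToyLM()
for p in model.parameters():   # "without weight updates": freeze the LM
    p.requires_grad_(False)

# The "neologism": a few soft tokens, the only trainable parameters.
neologism = nn.Parameter(torch.randn(N_SOFT, DIM) * 0.02)
opt = torch.optim.Adam([neologism], lr=1e-3)

tokens = torch.randint(0, VOCAB, (8, 16))  # fake skill data (batch, seq)
targets = tokens.roll(-1, dims=1)          # toy next-token targets

for step in range(100):
    tok_emb = model.embed(tokens)
    soft = neologism.unsqueeze(0).expand(tokens.size(0), -1, -1)
    logits = model(torch.cat([soft, tok_emb], dim=1))[:, N_SOFT:]
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB), targets.reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```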

Key facts

  • arXiv preprint 2605.04970 introduces skill neologisms for continual learning in LLMs.
  • Skill neologisms are soft tokens integrated into the model's vocabulary.
  • They are optimized to improve capabilities on a specific skill without weight updates.
  • Off-the-shelf pre-trained LLMs already contain tokens associated with procedural knowledge.
  • They remain composable with out-of-distribution skills.
  • Independently trained skill neologisms can be combined; see the sketch after this list.
  • This method avoids the catastrophic forgetting of fine-tuning and the context constraints of prompt-based methods.
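
On composition, a plausible (assumed) operator is simply concatenating the soft tokens of two separately trained neologisms in front of the input; the preprint's exact composition scheme isn't detailed here. This continuation reuses model, VOCAB, DIM, and N_SOFT from the sketch above, with random stand-ins for trained neologisms.

```python
# Hypothetical composition: prepend two independently trained neologisms
# at once. Concatenation is an assumed operator, and the "trained" tensors
# below are random stand-ins. Reuses model, VOCAB, DIM, N_SOFT from above.
neo_math = torch.randn(N_SOFT, DIM)  # stand-in for a trained math-skill neologism
neo_code = torch.randn(N_SOFT, DIM)  # stand-in for a trained coding-skill neologism

combined = torch.cat([neo_math, neo_code], dim=0)  # (2 * N_SOFT, DIM)
prompt = torch.randint(0, VOCAB, (1, 16))          # toy prompt token ids
embeds = torch.cat([combined.unsqueeze(0), model.embed(prompt)], dim=1)
with torch.no_grad():
    logits = model(embeds)[:, combined.size(0):]   # drop soft positions
```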

Entities

Institutions

  • arXiv

Sources