Path-Lock Expert: Architecture-Level Separation for Hybrid Thinking in LLMs
Researchers have introduced Path-Lock Expert (PLE), an architecture-level approach to improving hybrid-thinking language models. Current hybrid models often suffer from reasoning leakage: outputs grow overly long even when the model is supposed to operate in no-think mode. PLE replaces the single MLP in each decoder layer with two experts, one for think mode and one for no-think mode, while keeping shared components such as attention and embeddings. A deterministic control-token router selects one expert for the entire sequence, preserving efficient computation and enabling more targeted, mode-pure updates during training. The approach has been evaluated on math and science tasks.
Key facts
- Path-Lock Expert (PLE) is an architecture-level solution for hybrid-thinking language models.
- PLE replaces the single MLP in each decoder layer with two semantically locked experts.
- One expert is for think mode, one for no-think mode.
- Attention, embeddings, normalization, and language-model head remain shared.
- A deterministic control-token router selects one expert path for the entire sequence.
- Inference preserves the dense model's per-token computation pattern.
- Each expert receives mode-pure updates during supervised fine-tuning.
- Evaluation is on math and science tasks.
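The two-expert layout and sequence-level routing described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the class and function names, the ReLU feed-forward shape, and the control-token ids are all assumptions made for the example.

```python
import numpy as np

THINK_TOKEN_ID = 1      # hypothetical control-token id (assumption)
NO_THINK_TOKEN_ID = 2   # hypothetical control-token id (assumption)

class PathLockMLP:
    """Two mode-locked experts in place of a single decoder-layer MLP."""

    def __init__(self, d_model: int, d_ff: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        # One (w1, w2) weight pair per mode; attention, embeddings,
        # normalization, and the LM head would remain shared.
        self.experts = {
            "think": (rng.standard_normal((d_model, d_ff)) * 0.02,
                      rng.standard_normal((d_ff, d_model)) * 0.02),
            "no_think": (rng.standard_normal((d_model, d_ff)) * 0.02,
                         rng.standard_normal((d_ff, d_model)) * 0.02),
        }

    def forward(self, x: np.ndarray, mode: str) -> np.ndarray:
        # Deterministic routing: a single expert serves the whole
        # sequence, so per-token compute matches the dense baseline.
        w1, w2 = self.experts[mode]
        h = np.maximum(x @ w1, 0.0)  # ReLU feed-forward
        return h @ w2

def route_mode(input_ids: np.ndarray) -> str:
    # Control-token router: a leading control token locks the path
    # for the entire sequence (token ids here are illustrative).
    return "think" if input_ids[0] == THINK_TOKEN_ID else "no_think"
```

Because routing is fixed per sequence rather than per token, supervised fine-tuning on mode-labeled data would update only the selected expert's weights, which is one way the "mode-pure updates" property could arise.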
Entities
—