SlimDT: Efficient Decision Transformer via RTG Injection
SlimDT is a new variant of the Decision Transformer (DT) that eliminates Return-to-Go (RTG) tokens from the autoregressive sequence. Instead, RTG information is injected into the state representations before sequential modeling, cutting sequence length by one-third and improving inference efficiency. On the D4RL benchmark, SlimDT outperforms the standard DT.
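The one-third figure follows from DT's token layout: each timestep contributes an (RTG, state, action) triple, so dropping RTG leaves two tokens per timestep. A minimal arithmetic sketch (the context length `K = 20` is an illustrative value, not from the source):

```python
# Token-count and self-attention-cost arithmetic for a context of K timesteps.
K = 20  # context length in timesteps (example value, an assumption)

dt_tokens = 3 * K    # standard DT: (RTG, state, action) per timestep
slim_tokens = 2 * K  # SlimDT: (state, action) per timestep

reduction = 1 - slim_tokens / dt_tokens     # one-third fewer tokens
attn_ratio = slim_tokens**2 / dt_tokens**2  # quadratic self-attention cost

print(reduction)   # ≈ 0.333
print(attn_ratio)  # ≈ 0.444
```

Because self-attention cost grows quadratically in sequence length, the one-third token reduction shrinks attention cost to roughly (2/3)² ≈ 44% of the original.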
Key facts
- Decision Transformer formulates offline reinforcement learning as autoregressive sequence modeling.
- RTG is a scalar summarizing future rewards, containing less information than state or action vectors.
- Including RTG as a separate token adds computational overhead due to quadratic self-attention cost.
- SlimDT removes RTG from the autoregressive sequence.
- RTG information is injected into state representations before sequential modeling.
- The Transformer processes only a compact (state, action) sequence.
- Sequence length is reduced by one-third.
- SlimDT surpasses standard DT on the D4RL benchmark.
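The injection step in the key facts above can be sketched as follows. This is a hedged NumPy illustration, not the paper's implementation: the dimensions, the linear maps `W_s`, `W_a`, `w_r`, and the additive injection scheme are all assumptions (the source does not specify how RTG is folded into the state representation).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the source)
K, state_dim, act_dim, d_model = 4, 11, 3, 16

states = rng.normal(size=(K, state_dim))
actions = rng.normal(size=(K, act_dim))
rtg = rng.normal(size=(K, 1))  # scalar return-to-go per timestep

# Hypothetical projection weights standing in for learned linear layers
W_s = rng.normal(size=(state_dim, d_model))
W_a = rng.normal(size=(act_dim, d_model))
w_r = rng.normal(size=(1, d_model))

# Standard DT: RTG, state, action each become a token -> 3K tokens,
# interleaved as (r_0, s_0, a_0, r_1, s_1, a_1, ...)
dt_seq = np.stack(
    [rtg @ w_r, states @ W_s, actions @ W_a], axis=1
).reshape(3 * K, d_model)

# SlimDT-style injection: fold RTG into the state embedding *before*
# sequence modeling, so the Transformer sees only (state, action) tokens.
# Additive injection is one plausible scheme among several.
state_tok = states @ W_s + rtg @ w_r
slim_seq = np.stack([state_tok, actions @ W_a], axis=1).reshape(2 * K, d_model)

assert dt_seq.shape == (3 * K, d_model)    # 3K tokens for standard DT
assert slim_seq.shape == (2 * K, d_model)  # 2K tokens for SlimDT
```

The Transformer downstream of this step is unchanged; only the token sequence it consumes is shorter, which is where the inference-efficiency gain comes from.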