mGRADE: A Hybrid-Memory Model for Efficient Multi-Timescale Sequence Modeling
A new lightweight sequence modeling architecture, mGRADE (minimally Gated Recurrent Architecture with Delay Embedding), has been proposed to address the challenge of capturing both local fast dynamics and global slow context under strict memory constraints typical of edge devices. Current state-of-the-art models with constant memory footprints struggle to balance long-range selectivity and high-precision modeling of fast dynamics. mGRADE integrates a convolution with learnable temporal spacings—theoretically equivalent to a delay embedding—with a lightweight gated recurrent component. This hybrid-memory system introduces inductive biases across timescales, enabling parameter-efficient reconstruction of partially-observed fast dynamics while selectively maintaining long-range context. The work is detailed in a paper on arXiv (2507.01829v2).
Key facts
- mGRADE stands for minimally Gated Recurrent Architecture with Delay Embedding.
- It is designed for multi-timescale sequence modeling on edge devices.
- The model uses a convolution with learnable temporal spacings.
- A convolution with learnable spacings is theoretically equivalent to a delay embedding.
- It combines convolution with a lightweight gated recurrent component.
- The approach addresses the trade-off between long-range selectivity and fast dynamics modeling.
- The paper is available on arXiv under ID 2507.01829v2.
- The model targets constant memory footprint scenarios.
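The hybrid design described above can be sketched in miniature. The toy below is an illustrative assumption, not the paper's implementation: the learnable temporal spacings are replaced by fixed integer delays for simplicity, and the gated recurrent component is a hypothetical one-unit minimal gate. It shows the division of labor the summary describes: a sparse multi-delay convolution captures fast local structure, and a constant-memory gated recurrence carries slow global context.

```python
import numpy as np

def delay_embedding_conv(x, taps, weights):
    """Convolution over a few delayed copies of the input:
    y[t] = sum_k weights[k] * x[t - taps[k]].
    mGRADE learns the spacings (taps); they are fixed here for simplicity."""
    T = len(x)
    y = np.zeros(T)
    for w, d in zip(weights, taps):
        y[d:] += w * x[:T - d]
    return y

def min_gated_recurrence(u, w_z, w_h):
    """Hypothetical minimal gated recurrence with a single hidden scalar:
    z[t] = sigmoid(w_z * u[t])            # gate: how much new info to admit
    h[t] = (1 - z[t]) * h[t-1] + z[t] * tanh(w_h * u[t])
    The state h is O(1) memory regardless of sequence length."""
    h = 0.0
    hs = []
    for u_t in u:
        z = 1.0 / (1.0 + np.exp(-w_z * u_t))
        h = (1.0 - z) * h + z * np.tanh(w_h * u_t)
        hs.append(h)
    return np.array(hs)

# Fast oscillation: the multi-delay convolution sees several time lags at once,
# while the gated recurrence selectively integrates them into slow context.
x = np.sin(np.linspace(0.0, 8.0 * np.pi, 200))
fast_features = delay_embedding_conv(x, taps=[0, 3, 9], weights=[0.5, 0.3, 0.2])
context = min_gated_recurrence(fast_features, w_z=1.0, w_h=1.0)
```

Because each step's state update is a convex combination of the previous state and a `tanh` term, the hidden state stays bounded, which is one reason gated recurrences are attractive under fixed memory budgets.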