Researchers propose SDLLM, a spike-driven large language model eliminating dense matrix multiplications
A new research paper introduces SDLLM, a spike-driven large language model that replaces dense matrix multiplications with sparse addition operations. Inspired by how the brain processes information, the approach integrates spike-driven characteristics into LLM inference.

Current large language models rely heavily on large-scale dense matrix multiplications. SDLLM instead uses a plug-and-play two-step spike encoding method, gamma-SQP, that aligns spike encoding with the quantization process, addressing the challenge of achieving spike-driven LLMs with billions of parameters using only sparse additions. Existing spike encoding schemes at the LLM scale have suffered from limited representational capacity and poor sparsity. Although previous work has combined Spiking Neural Networks (SNNs) with Transformers, the authors position this model as a significant advance for the SNN field.

The paper was announced on arXiv under the identifier 2604.16475v1. This cross-announcement explores a fundamental question: how to effectively integrate brain-like spiking mechanisms into language model architectures.
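The core idea of trading multiplications for additions can be illustrated with a small sketch (this is a generic demonstration of spike-driven computation, not the paper's implementation): when activations are binary spikes (0 or 1), a dense matrix-vector product reduces to summing the weight columns at the positions where a spike fired, so no multiplications are needed.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))      # weight matrix
spikes = rng.integers(0, 2, size=8)  # binary spike vector (0/1 events)

# Dense path: full matrix-vector multiplication
dense_out = W @ spikes

# Spike-driven path: accumulate only the columns where a spike occurred
# -- pure additions, and work scales with spike sparsity
spike_out = np.zeros(W.shape[0])
for i in np.flatnonzero(spikes):
    spike_out += W[:, i]

assert np.allclose(dense_out, spike_out)
```

Both paths give identical results; the spike-driven path simply skips the zero positions, which is why high sparsity in the spike encoding translates directly into saved compute.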
Key facts
- SDLLM is a spike-driven large language model
- It eliminates dense matrix multiplications through sparse addition operations
- The model uses gamma-SQP two-step spike encoding method
- Research addresses limited representational capacity in existing spike encoding schemes
- Paper explores integrating the brain's spike-driven characteristics into LLM inference
- Current LLMs are based on large-scale dense matrix multiplications
- Achieving spike-driven LLMs with billions of parameters remains a challenge
- Some works have attempted to combine Spiking Neural Networks with Transformers
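The summary says gamma-SQP is a two-step encoding aligned with quantization, but gives no further detail. A generic "quantize then spike" scheme, common in SNN-to-ANN conversion work, conveys the general idea (the function name and parameters below are hypothetical, not from the paper): first quantize an activation to an integer level, then represent that level as a count of unit spikes over T timesteps, so downstream layers only ever see binary events.

```python
import numpy as np

def quantize_then_spike(x, T=8, x_max=1.0):
    """Illustrative two-step encoding (not the paper's gamma-SQP):
    step 1 quantizes x in [0, x_max] to an integer level in [0, T];
    step 2 emits that many unit spikes across a train of length T."""
    level = int(round(np.clip(x, 0.0, x_max) / x_max * T))  # step 1: quantize
    train = np.zeros(T, dtype=np.int8)
    train[:level] = 1                                       # step 2: spike count = level
    return train

# The spike count recovers the quantized level: x = 0.5 maps to level 4 of 8
train = quantize_then_spike(0.5, T=8)
assert train.sum() == 4
```

Aligning the encoding with quantization in this way bounds the approximation error by the quantization step, which is one way limited representational capacity can be addressed; how gamma-SQP achieves this specifically is not described in the summary.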