LaplacianFormer: Linear Attention via Laplacian Kernel for Vision
LaplacianFormer introduces a Laplacian kernel as a theoretically grounded alternative to softmax attention in Transformers, addressing the quadratic complexity bottleneck for high-resolution vision tasks. The model employs a provably injective feature map to preserve fine-grained token interactions under low-rank approximations, and uses Nyström approximation with Newton–Schulz iteration for efficient kernel matrix computation, avoiding costly matrix inversion and SVD. Custom CUDA implementations are developed for both the kernel and solver. The work is published as arXiv preprint 2604.20368.
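The paper's exact formulation is not reproduced here, but the core idea can be sketched: replace the softmax similarity with a Laplacian kernel, i.e. an exponential of the negative L1 distance between tokens. The function names, the `sigma` bandwidth, and the row-normalization below are illustrative assumptions, not the paper's definitive implementation:

```python
import numpy as np

def laplacian_kernel(Q, K, sigma=1.0):
    """Pairwise Laplacian kernel exp(-||q - k||_1 / sigma) between token sets."""
    # (n, m) matrix of L1 distances between every query and key token
    dists = np.abs(Q[:, None, :] - K[None, :, :]).sum(axis=-1)
    return np.exp(-dists / sigma)

def laplacian_attention(Q, K, V, sigma=1.0):
    """Attention with the softmax similarity swapped for a Laplacian kernel."""
    A = laplacian_kernel(Q, K, sigma)
    A = A / A.sum(axis=1, keepdims=True)  # row-normalize, analogous to softmax
    return A @ V
```

Computed naively, this kernel matrix is still quadratic in the token count; the linear-complexity claim rests on approximating it with a low-rank (Nyström) factorization rather than forming it in full.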
Key facts
- LaplacianFormer uses a Laplacian kernel to replace softmax attention.
- Quadratic complexity of softmax attention is a major obstacle for high-resolution vision tasks.
- Existing linear attention variants often adopt Gaussian kernels without theoretical grounding.
- The Laplacian kernel is instead motivated by both empirical observation and theoretical analysis.
- A provably injective feature map retains fine-grained token information.
- Nyström approximation of the kernel matrix is used for efficient computation.
- Newton–Schulz iteration solves the resulting linear system iteratively, avoiding explicit matrix inversion and SVD.
- Custom CUDA implementations are developed for the kernel and solver.
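The Nyström and Newton–Schulz steps above can be sketched in a few lines. This is a minimal illustration, not the paper's CUDA implementation: the landmark selection, iteration count, ridge term, and scaled-transpose initialization are assumptions made for the sketch.

```python
import numpy as np

def newton_schulz_inverse(A, iters=20):
    """Approximate A^{-1} iteratively, with no explicit inversion or SVD."""
    # Scaled-transpose initialization guarantees ||I - A @ X0|| < 1,
    # so the iteration converges (quadratically, once the error is small).
    X = A.T / (np.abs(A).sum(axis=0).max() * np.abs(A).sum(axis=1).max())
    I = np.eye(A.shape[0])
    for _ in range(iters):
        X = X @ (2.0 * I - A @ X)
    return X

def nystrom_approx(K_full, landmarks, ridge=1e-6):
    """Nyström approximation K ~ C W^{-1} C^T from a subset of columns."""
    C = K_full[:, landmarks]                   # n x m landmark columns
    W = K_full[np.ix_(landmarks, landmarks)]   # m x m landmark block
    # Small ridge keeps the landmark block invertible (illustrative choice)
    W_inv = newton_schulz_inverse(W + ridge * np.eye(len(landmarks)))
    return C @ W_inv @ C.T
```

In a real linear-attention layer one would never materialize `K_full`; only the landmark columns `C` and block `W` would be computed, keeping cost linear in the token count.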
Entities
Institutions
- arXiv