Weierstrass Elliptic Positional Encoding Improves Vision Transformers

other · 2026-05-25

A recent arXiv preprint (2605.23719) introduces Weierstrass elliptic Positional Encoding (WePE) for Vision Transformers (ViTs). Standard ViTs rely on learnable one-dimensional positional encodings, which weaken the two-dimensional spatial arrangement of an image once its patches are flattened into a sequence. WePE addresses this by mapping normalized 2D patch coordinates onto the complex plane and building compact four-dimensional positional features from the Weierstrass elliptic function and its derivative. The function's double periodicity gives a principled representation of 2D positions and, according to the preprint, maintains a monotonic relationship between Euclidean spatial distances and sequential index distances. The aim is to improve ViTs' ability to exploit spatial proximity priors, which existing encodings often miss because they impose no geometric constraints.
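
For reference, the standard definitions behind this construction are given below for a lattice \Lambda = \{ m\omega_1 + n\omega_2 : m, n \in \mathbb{Z} \}. The first block is textbook material. The second block is only an assumed reading of the summary: the coordinate map z = x\omega_1 + y\omega_2 and the ordering of the four features are illustrative guesses, and the preprint's exact parametrization may differ.

```latex
\wp(z;\Lambda) = \frac{1}{z^{2}}
  + \sum_{\omega \in \Lambda \setminus \{0\}}
    \left( \frac{1}{(z-\omega)^{2}} - \frac{1}{\omega^{2}} \right),
\qquad
\wp'(z;\Lambda) = -2 \sum_{\omega \in \Lambda} \frac{1}{(z-\omega)^{3}},
\qquad
\wp(z+\omega) = \wp(z) \quad \text{for all } \omega \in \Lambda .

% Assumed feature layout (not confirmed by the summary):
z = x\,\omega_{1} + y\,\omega_{2},
\qquad
\mathrm{PE}(x, y) = \bigl( \operatorname{Re}\wp(z),\ \operatorname{Im}\wp(z),\
                           \operatorname{Re}\wp'(z),\ \operatorname{Im}\wp'(z) \bigr).
```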

Key facts

arXiv preprint 2605.23719 proposes Weierstrass elliptic Positional Encoding (WePE) for Vision Transformers.
Current ViTs use learnable one-dimensional positional encodings that weaken 2D spatial structure.
WePE maps normalized 2D patch coordinates onto the complex plane.
WePE constructs four-dimensional positional features using the Weierstrass elliptic function and its derivative.
Double periodicity provides a principled representation of 2D positions.
WePE maintains a monotonic relationship between Euclidean distances and sequential index distances.
Existing positional encodings lack geometric constraints and spatial proximity priors.
The method is mathematically grounded and motivated by periodicity in positional encoding.
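
The facts above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the preprint's implementation: the function names `weierstrass_p` and `wepe_features`, the square lattice with periods 1 and i, the use of patch centers as coordinates, and the truncated lattice sum are all choices made here for illustration. The summary does not say how the authors evaluate the function or scale its output.

```python
import numpy as np

def weierstrass_p(z, omega1=1.0, omega2=1.0j, n_terms=20):
    """Approximate p(z) and p'(z) by a truncated sum over the lattice
    {m*omega1 + n*omega2}, with m and n in [-n_terms, n_terms]."""
    z = np.asarray(z, dtype=np.complex128)
    idx = np.arange(-n_terms, n_terms + 1)
    m, n = np.meshgrid(idx, idx, indexing="ij")
    lattice = (m * omega1 + n * omega2).ravel()
    nonzero = lattice[lattice != 0]          # the omega = 0 term is handled separately
    d = z[..., None] - nonzero               # shape (..., number of lattice points)
    p = 1.0 / z**2 + np.sum(1.0 / d**2 - 1.0 / nonzero**2, axis=-1)
    dp = -2.0 / z**3 - 2.0 * np.sum(1.0 / d**3, axis=-1)
    return p, dp

def wepe_features(grid_h, grid_w, omega1=1.0, omega2=1.0j):
    """Hypothetical 4-D positional features for a grid_h x grid_w patch grid.
    Patch centers are normalized to (0, 1) so that no patch sits on a pole."""
    ys = (np.arange(grid_h) + 0.5) / grid_h
    xs = (np.arange(grid_w) + 0.5) / grid_w
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    z = (xx * omega1 + yy * omega2).ravel()  # 2D coordinates -> complex plane
    p, dp = weierstrass_p(z, omega1, omega2)
    # Assumed layout: real and imaginary parts of p and of its derivative.
    return np.stack([p.real, p.imag, dp.real, dp.imag], axis=-1)

features = wepe_features(14, 14)             # 14 x 14 patches, e.g. 224 px / 16 px
print(features.shape)
```

Because the function has poles at the lattice points, patches near the corners of the fundamental cell produce large values. A practical encoding would likely need some rescaling or squashing before the features reach the transformer, and the summary does not describe how the preprint handles this.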
