ARTFEED — Contemporary Art Intelligence

Weierstrass Elliptic Positional Encoding Improves Vision Transformers

other · 2026-05-25

A recent arXiv preprint (2605.23719) presents Weierstrass elliptic Positional Encoding (WePE) for Vision Transformers (ViTs). Standard ViTs rely on learnable one-dimensional positional encodings, which fail to preserve the two-dimensional spatial arrangement of an image once its patches are flattened into a sequence. WePE addresses this by projecting normalized 2D patch coordinates onto the complex plane and building compact four-dimensional positional features from the Weierstrass elliptic function and its derivative. The function's double periodicity gives a principled representation of 2D positions, and according to the preprint the encoding keeps a monotonic relationship between Euclidean spatial distances and sequential index distances. The aim of this mathematically grounded approach is to help ViTs exploit spatial proximity priors, which the preprint says existing encodings capture poorly because they lack geometric constraints.
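For readers unfamiliar with the function involved, the standard textbook definition is given below. The lattice generators \omega_1 and \omega_2 are generic symbols; the summary does not state which lattice the preprint uses. Reading the four feature dimensions as the real and imaginary parts of the function and its derivative is an assumption made here, not something the summary confirms.

```latex
% Lattice generated by two complex numbers that are linearly independent over the reals
\Lambda = \{\, m\,\omega_1 + n\,\omega_2 : m, n \in \mathbb{Z} \,\}

% Weierstrass elliptic function and its derivative
\wp(z) = \frac{1}{z^{2}}
       + \sum_{\omega \in \Lambda \setminus \{0\}}
         \left( \frac{1}{(z-\omega)^{2}} - \frac{1}{\omega^{2}} \right),
\qquad
\wp'(z) = -2 \sum_{\omega \in \Lambda} \frac{1}{(z-\omega)^{3}}

% Double periodicity: both functions repeat along both lattice directions
\wp(z + \omega) = \wp(z), \qquad \wp'(z + \omega) = \wp'(z)
\quad \text{for all } \omega \in \Lambda

% Assumed reading of the four-dimensional positional feature for a patch at z = x + i y
\mathrm{PE}(x, y) = \bigl( \operatorname{Re}\wp(z),\ \operatorname{Im}\wp(z),\
                          \operatorname{Re}\wp'(z),\ \operatorname{Im}\wp'(z) \bigr)
```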

Key facts

  • arXiv preprint 2605.23719 proposes Weierstrass elliptic Positional Encoding (WePE) for Vision Transformers.
  • Current ViTs use learnable one-dimensional positional encodings that weaken 2D spatial structure.
  • WePE maps normalized 2D patch coordinates onto the complex plane.
  • WePE constructs four-dimensional positional features using the Weierstrass elliptic function and its derivative.
  • Double periodicity provides a principled representation of 2D positions.
  • WePE maintains a monotonic relationship between Euclidean distances and sequential index distances.
  • Existing positional encodings lack geometric constraints and spatial proximity priors.
  • The method is mathematically grounded and motivated by periodicity in positional encoding.
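
The steps listed above can be sketched in Python with NumPy. This is a minimal illustration, not the preprint's implementation. The function names `weierstrass_p` and `wepe_features`, the square lattice with periods 1 and i, the truncated lattice sum, the use of patch centers as coordinates, and the choice of real and imaginary parts as the four features are all assumptions made for this example.

```python
import numpy as np

def weierstrass_p(z, period1=1.0, period2=1.0j, n_terms=8):
    """Approximate the Weierstrass function and its derivative at z.

    Uses a truncated sum over lattice points m*period1 + n*period2 with
    |m|, |n| <= n_terms. The periods and truncation are illustrative
    assumptions, not values taken from the preprint.
    """
    z = np.asarray(z, dtype=np.complex128)
    p = 1.0 / z**2           # lattice point 0 contributes the double pole
    dp = -2.0 / z**3
    for m in range(-n_terms, n_terms + 1):
        for n in range(-n_terms, n_terms + 1):
            if m == 0 and n == 0:
                continue
            w = m * period1 + n * period2
            p = p + 1.0 / (z - w) ** 2 - 1.0 / w**2
            dp = dp - 2.0 / (z - w) ** 3
    return p, dp

def wepe_features(grid_h, grid_w, n_terms=8):
    """Return a (grid_h * grid_w, 4) array of positional features.

    Patch centers are normalized to the open unit square, so no patch
    lands on a lattice point (where the function has a pole).
    """
    rows = (np.arange(grid_h) + 0.5) / grid_h   # normalized y of patch centers
    cols = (np.arange(grid_w) + 0.5) / grid_w   # normalized x of patch centers
    y, x = np.meshgrid(rows, cols, indexing="ij")
    z = (x + 1j * y).reshape(-1)                # row-major patch order
    p, dp = weierstrass_p(z, n_terms=n_terms)
    # Assumed feature layout: real and imaginary parts of p and its derivative.
    return np.stack([p.real, p.imag, dp.real, dp.imag], axis=-1)

if __name__ == "__main__":
    feats = wepe_features(14, 14)   # 14 x 14 patches, e.g. 224 px image, 16 px patches
    print(feats.shape)
```

Because the function has poles at the lattice points, the raw features grow large for patches near the corners of the normalized square. This sketch applies no rescaling; a practical encoding would likely need some normalization or a learned projection before the features are added to patch embeddings, and the summary does not say how the preprint handles this.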

Entities

Institutions

  • arXiv
