ARTFEED — Contemporary Art Intelligence

TaTok: Adaptive Image Tokenization Based on Information Entropy

other · 2026-05-20

A study published on arXiv presents TaTok, a theoretically grounded adaptive image tokenization framework that addresses a shortcoming of existing methods. Current tokenizers compress all image content at a fixed rate, regardless of how much information each region carries, which can produce redundant tokens in some cases and lose crucial detail in others. TaTok introduces global tokens to model the mutual information among patch tokens and applies a Dynamic Token Filtering (DTF) algorithm, based on cumulative conditional entropy, to remove redundant tokens. The reported experiments show a 1.3x improvement in gFID and an 8.7x inference speedup, with state-of-the-art performance in discrete image tokenization. By allocating tokens according to information content, the framework improves efficiency when processing long image sequences.

Key facts

  • TaTok is a theoretically grounded adaptive image tokenization framework.
  • Current methods compress all content at a fixed rate, causing redundancy or information loss.
  • TaTok introduces global tokens to model mutual information across patch tokens.
  • The Dynamic Token Filtering (DTF) algorithm uses cumulative conditional entropy to eliminate redundancy.
  • Experiments show 1.3x gFID improvement and 8.7x inference speedup.
  • TaTok achieves state-of-the-art performance in discrete image tokenization.
  • The framework allocates tokens according to information content.
  • The paper is published on arXiv with ID 2605.16384.
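
The idea of filtering tokens by cumulative conditional entropy can be illustrated with a minimal sketch. This summary does not describe the paper's actual DTF algorithm, so everything below is an assumption made for illustration: the function name tokens_to_keep, the coverage parameter, and the stopping rule (keep leading tokens until their running conditional entropy reaches a chosen fraction of the total). The per-token entropy values are made up as inputs.

```python
from typing import Sequence


def tokens_to_keep(cond_entropies: Sequence[float], coverage: float = 0.75) -> int:
    """Hypothetical filtering rule, not the paper's DTF algorithm.

    cond_entropies[i] stands for the conditional entropy of token i given
    the tokens before it. The function returns how many leading tokens are
    needed for the running sum to reach `coverage` of the total.
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")

    total = sum(cond_entropies)
    if total <= 0.0:
        # No measurable information: keep a single token if any exist.
        return min(1, len(cond_entropies))

    target = coverage * total
    running = 0.0
    for count, h in enumerate(cond_entropies, start=1):
        running += h
        if running >= target:
            return count
    # Fallback for floating-point round-off when coverage is 1.0.
    return len(cond_entropies)


# Made-up inputs: a nearly flat image versus a uniformly textured one.
flat_image = [0.5, 0.0, 0.0, 0.0]
textured_image = [2.0, 2.0, 2.0, 2.0]

print(tokens_to_keep(flat_image))      # 1
print(tokens_to_keep(textured_image))  # 3
```

Under this assumed rule, the flat image is represented by one token while the textured image keeps three, which mirrors the summary's point that token count should follow information content rather than a fixed rate.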

Entities

Institutions

  • arXiv
