Glauber Dynamics on Masked Language Models: Mixing Time Analysis
A recent theoretical study on arXiv (2605.16378) examines the global distributional behavior of masked language models (MLMs) under iterative generation. The authors model the iterative resampling of masked tokens as a Glauber dynamics Markov chain on token sequences. They introduce a rectangle test for detecting incompatibility among MLM conditionals, meaning that the learned conditionals need not arise from any single joint distribution, and verify empirically that such incompatibility occurs across modern MLMs. On the theoretical side, when cross-token influence is bounded, a high-temperature contraction result yields a mixing time of O(n log n), where n is the sequence length. Under a uniform local margin condition, by contrast, the chain exhibits different mixing behavior. The study addresses a central question: whether MLMs can be relied on as generative models.
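To make the resampling chain described above concrete, here is a minimal Python sketch of random-scan Glauber dynamics on token sequences. It is an illustration under stated assumptions, not the paper's implementation: `toy_conditional`, `VOCAB`, and `beta` are hypothetical stand-ins for a real MLM's predicted distribution at a masked position, and the step budget only reflects the n log n scale with constants omitted.

```python
import math
import random

VOCAB = 3  # hypothetical toy vocabulary size (assumption, not from the paper)


def toy_conditional(seq, i, beta=0.5):
    """Hypothetical stand-in for an MLM's p(x_i | x_{-i}).

    Favors tokens that agree with the left and right neighbors; a small
    beta keeps the influence of neighboring tokens weak.
    """
    logits = []
    for tok in range(VOCAB):
        agree = 0
        if i > 0 and seq[i - 1] == tok:
            agree += 1
        if i + 1 < len(seq) and seq[i + 1] == tok:
            agree += 1
        logits.append(beta * agree)
    z = sum(math.exp(l) for l in logits)
    return [math.exp(l) / z for l in logits]


def glauber_step(seq, conditional, rng):
    """One update: pick a uniform position, mask it, resample it."""
    i = rng.randrange(len(seq))
    probs = conditional(seq, i)
    new_tok = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    out = list(seq)
    out[i] = new_tok
    return out


def run_glauber(seq, conditional, steps, seed=0):
    """Run the chain for a fixed number of single-site updates."""
    rng = random.Random(seed)
    for _ in range(steps):
        seq = glauber_step(seq, conditional, rng)
    return seq


n = 8
budget = math.ceil(n * math.log(n))  # n log n scale, constants omitted
sample = run_glauber([0] * n, toy_conditional, steps=budget)
```

With a real MLM, `toy_conditional` would be replaced by a forward pass that masks position `i` and returns the softmax over the vocabulary at that position; the rest of the loop would stay the same.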
Key facts
- arXiv paper 2605.16378
- Models iterative masked-token resampling as Glauber dynamics Markov chain
- Introduces rectangle test for incompatibility of MLM conditionals
- Empirically verifies incompatibility across modern MLMs
- High-temperature contraction gives O(n log n) mixing time under bounded cross-token influence
- Uniform local margin condition leads to different mixing behavior
- Addresses global distributional behavior of MLMs
- Mixing time is expressed as a function of the sequence length n
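The summary does not spell out how the rectangle test listed above is defined. The sketch below therefore uses a standard four-cycle consistency check for two positions, which is an assumption about the spirit of the test and not the paper's exact procedure. If two conditionals come from a single positive joint distribution, the log-ratios accumulated around the rectangle (a,b) → (a',b) → (a',b') → (a,b') → (a,b) must sum to zero; a nonzero sum certifies incompatibility. The names `rectangle_gap` and `conditionals_from_joint` are hypothetical.

```python
import math


def conditionals_from_joint(joint):
    """Build both conditionals from a joint table joint[a][b].

    Returns p1[b][a] = p(x1=a | x2=b) and p2[a][b] = p(x2=b | x1=a).
    """
    n_a, n_b = len(joint), len(joint[0])
    col = [sum(joint[a][b] for a in range(n_a)) for b in range(n_b)]
    row = [sum(joint[a]) for a in range(n_a)]
    p1 = [[joint[a][b] / col[b] for a in range(n_a)] for b in range(n_b)]
    p2 = [[joint[a][b] / row[a] for b in range(n_b)] for a in range(n_a)]
    return p1, p2


def rectangle_gap(p1, p2, a, a2, b, b2):
    """Sum of log conditional ratios around one rectangle.

    Zero (up to rounding) whenever p1 and p2 share a positive joint.
    """
    return (
        math.log(p1[b][a2] / p1[b][a])      # (a, b)   -> (a2, b)
        + math.log(p2[a2][b2] / p2[a2][b])  # (a2, b)  -> (a2, b2)
        + math.log(p1[b2][a] / p1[b2][a2])  # (a2, b2) -> (a, b2)
        + math.log(p2[a][b] / p2[a][b2])    # (a, b2)  -> (a, b)
    )


# Compatible case: both conditionals derived from one joint table.
p1_ok, p2_ok = conditionals_from_joint([[0.1, 0.2], [0.3, 0.4]])
gap_compatible = rectangle_gap(p1_ok, p2_ok, 0, 1, 0, 1)

# Incompatible case: x1 ignores x2, yet x2 tends to copy x1.
p1_bad = [[0.5, 0.5], [0.5, 0.5]]
p2_bad = [[0.9, 0.1], [0.1, 0.9]]
gap_incompatible = rectangle_gap(p1_bad, p2_bad, 0, 1, 0, 1)
```

For an actual MLM, `p1` and `p2` would be read off the model's predictions at two masked positions with the remaining context held fixed, and the gap would be evaluated over many token pairs.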
Entities
Institutions
- arXiv