Transformers Trained as Universal Computers via MicroPy Programs
A study shows that a small transformer can be trained to act as a universal computer by learning to execute programs in MicroPy, a simplified but computationally universal programming language. Using PENCIL scaffolding, the transformer predicts execution one small step at a time, keeping memory use within a bounded context window. After training on randomly generated, meaningless MicroPy programs, the model generalizes to human-written programs, including bit manipulation, binary arithmetic, and SAT verification and solving. The trained model also generalizes out of distribution, correctly executing programs drawn from previously unseen distributions. Since MicroPy can express any computation, these results provide empirical evidence that a standard transformer can be trained to function as a universal computer. The paper is available on arXiv in the computer science and artificial intelligence sections.
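To make the space-saving idea concrete, here is a toy sketch (not the paper's code) of a PENCIL-style reduction rule. In PENCIL, a span of the form `C [CALL] T [SEP] A [RETURN]` collapses to `C A`: the intermediate reasoning `T` is erased from the context and only the answer `A` survives, which is what keeps long executions inside a bounded window. The token names and the `pencil_reduce` helper below are illustrative assumptions, not the paper's implementation.

```python
def pencil_reduce(tokens):
    """Toy PENCIL-style reduction: collapse C [CALL] T [SEP] A [RETURN] to C A.

    tokens: a list of trace tokens. The intermediate work T (between [CALL]
    and [SEP]) is discarded; the answer A (between [SEP] and [RETURN]) is kept.
    """
    while "[RETURN]" in tokens:
        r = tokens.index("[RETURN]")
        s = max(i for i in range(r) if tokens[i] == "[SEP]")
        c = max(i for i in range(s) if tokens[i] == "[CALL]")
        # keep the context before [CALL] plus the answer, drop everything else
        tokens = tokens[:c] + tokens[s + 1:r] + tokens[r + 1:]
    return tokens

trace = ["x=3", "[CALL]", "step:x+=1", "step:x+=1", "[SEP]", "x=5", "[RETURN]"]
print(pencil_reduce(trace))  # → ['x=3', 'x=5']
```

The reduction runs repeatedly, so nested calls collapse from the inside out; the live context at any moment stays short even when the full execution trace is long.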
Key facts
- Small transformer learns to execute MicroPy programs
- MicroPy is a simplified but computationally universal programming language
- PENCIL scaffolding enables space-efficient execution within bounded context window
- Training on randomly generated meaningless programs
- Generalizes to human-written programs: bit copying, flipping, binary addition, multiplication, SAT verification and solving
- Achieves out-of-distribution generalization
- Provides empirical evidence that transformers can act as universal computers
- Published on arXiv (2604.25166)
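The summary does not give MicroPy's actual syntax, but the "predicts execution in small steps" idea can be sketched with a hypothetical stepwise interpreter: each statement is one small step, and the visible variable state after each step is what a trace-predicting model would have to emit. The bit-flipping program is a nod to one of the human-written tasks listed above; everything here is an illustrative assumption, not the paper's language or code.

```python
def run_stepwise(statements):
    """Execute a tiny straight-line program one statement (one 'step') at a time.

    Illustrative only: real MicroPy syntax is not specified in this summary,
    so plain Python statements stand in for it.
    """
    env = {}
    for stmt in statements:
        exec(stmt, {}, env)          # advance execution by one small step
        print(stmt, "->", env)      # state after the step, as a trace line
    return env

# Bit flipping via XOR against an all-ones mask (one of the listed tasks):
program = ["x = 0b1011", "y = x ^ 0b1111"]
final = run_stepwise(program)
print(bin(final["y"]))  # → 0b100
```

A model trained on such traces learns the transition from one state line to the next, which is exactly the step-prediction framing described above.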