TowerMind: A New Benchmark for LLM Agents Using Tower Defense Games

ai-technology · 2026-05-27

Researchers have introduced TowerMind, an environment and benchmark for evaluating Large Language Models (LLMs) as agents, grounded in the tower defense (TD) subgenre of real-time strategy (RTS) games. Existing RTS game environments either have high computational demands or lack textual observations; TowerMind addresses both limitations by offering low computational requirements and a multimodal observation space. This makes it possible to assess LLMs' long-term planning and decision-making, capabilities that are crucial for adapting to diverse scenarios. The environment preserves the key evaluation strengths of RTS games while being more accessible for LLM testing.

Key facts

TowerMind is a new environment for LLM agents based on tower defense games.
It features low computational demands and a multimodal observation space.
Existing RTS environments have high computational demands or lack textual observations.
The benchmark evaluates LLMs' long-term planning and decision-making capabilities.
RTS games require macro-level strategic planning and micro-level tactical adaptation.
The environment is designed to benchmark LLMs as agents.
TowerMind is presented in arXiv paper 2601.05899.
The paper was announced on arXiv with the announce type "replace", which indicates a revised version of an earlier submission.
