ARTFEED — Contemporary Art Intelligence

TowerMind: A New Benchmark for LLM Agents Using Tower Defense Games

ai-technology · 2026-05-27

Researchers have introduced TowerMind, an environment and benchmark for evaluating large language models (LLMs) as agents, built on the tower defense (TD) subgenre of real-time strategy (RTS) games. Existing RTS environments either demand heavy computation or lack textual observations; TowerMind instead offers low computational requirements and a multimodal observation space. This makes it practical to assess LLMs' long-term planning and decision-making, capabilities that are crucial for adapting to diverse scenarios. The environment preserves the key evaluation strengths of RTS games while being more accessible for LLM testing.

Key facts

  • TowerMind is a new environment for LLM agents based on tower defense games.
  • It features low computational demands and a multimodal observation space.
  • Existing RTS environments have high computational demands or lack textual observations.
  • The benchmark assesses LLMs' long-term planning and decision-making capabilities.
  • RTS games require macro-level strategic planning and micro-level tactical adaptation.
  • The environment is designed to benchmark LLMs as agents.
  • TowerMind is presented in arXiv paper 2601.05899.
  • The paper was announced on arXiv as a replacement, that is, a revised version of an earlier submission.

Entities

Institutions

  • arXiv
