LLM-Based Robustness Testing of Microservice Applications
A recent study published on arXiv (2605.14202) explores using large language models (LLMs) to test the robustness of microservice applications. The researchers applied seven prompting strategies to three open-source LLMs (14B to 70B parameters) against two architecturally different systems: a monolingual Java system with six services and nine failure modes, and a polyglot system with 27 services and 14 failure modes, yielding 38 valid runs and 663 generated tests. A key finding is that the prompting strategy accounts for more of the variation in test diversity than model size; notably, a Structured prompt collapsed diversity entirely, while a single model...
Key facts
- arXiv paper 2605.14202
- 7 prompt strategies tested
- 3 open-source LLMs (14B-70B parameters)
- 2 microservice systems: Java monolingual (6 services, 9 failure modes) and polyglot (27 services, 14 failure modes)
- 38 valid runs and 663 generated tests
- Prompt strategy explains more variation in diversity than model size
- Structured prompt collapsed diversity entirely
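The summary does not specify how the paper measures test diversity. As a purely hypothetical illustration of the "collapsed diversity" finding, one simple proxy is the fraction of distinct failure modes covered by each strategy's generated tests (the function and field names below are assumptions, not the paper's actual metric):

```python
def diversity(tests_by_strategy):
    """For each prompting strategy, compute the fraction of distinct
    failure modes covered by its generated tests.
    Hypothetical metric; the paper's actual measure is not given here."""
    scores = {}
    for strategy, tests in tests_by_strategy.items():
        covered = {t["failure_mode"] for t in tests}  # unique failure modes hit
        total = len(tests)
        scores[strategy] = len(covered) / total if total else 0.0
    return scores

# Toy data: a "structured" prompt that always targets the same failure
# mode shows collapsed diversity compared with a more varied prompt.
runs = {
    "structured": [{"failure_mode": "timeout"}] * 5,
    "few-shot": [{"failure_mode": m} for m in
                 ("timeout", "crash", "net-partition", "disk-full", "timeout")],
}
print(diversity(runs))  # structured -> 0.2, few-shot -> 0.8
```

Under this toy metric, a prompt that steers every generation toward the same failure mode scores near zero regardless of which model produced the tests, which is consistent with the study's observation that strategy dominates model size.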