LLM-Generated Code Security Evaluation Across Prompting Methods

other · 2026-05-26

A study on arXiv (2605.24298) evaluates the security of code generated by five LLMs in Java, C++, C, and Python under several prompt engineering methods. The authors introduce a weaknesses-aware zero-shot chain-of-thought (WA-0CoT) strategy that incorporates CWE (Common Weakness Enumeration) mappings as security context. Chi-square tests show no statistically significant reduction in vulnerability frequency or density for any method, including WA-0CoT.
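To make the statistical claim concrete, here is a minimal sketch of a chi-square test of independence between prompting method and vulnerability outcome. The counts are hypothetical and are not taken from the paper. The two unnamed methods are placeholders, and the study's actual table layout and tooling are not described in this summary. The closed-form p-value used here is valid only for exactly 2 degrees of freedom, which is what a 3 x 2 table gives.

```python
import math


def chi_square_independence(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of method and outcome.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def p_value_dof2(stat):
    """Chi-square survival function; this closed form holds only for dof == 2."""
    return math.exp(-stat / 2.0)


# HYPOTHETICAL counts, not from the paper.
# Rows: prompting methods; columns: [vulnerable samples, non-vulnerable samples].
counts = [
    [30, 70],  # placeholder method A (assumed)
    [28, 72],  # placeholder method B (assumed)
    [27, 73],  # WA-0CoT
]

stat, dof = chi_square_independence(counts)
p = p_value_dof2(stat)
significant = p < 0.05  # conventional 5% threshold, assumed here

print(f"chi2={stat:.3f}, dof={dof}, p={p:.3f}, significant={significant}")
```

With counts this close across rows, the p-value is far above 0.05, so the null hypothesis that vulnerability rate is independent of prompting method is not rejected. That is the shape of result the summary reports.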

Key facts

arXiv paper 2605.24298 evaluates LLM-generated code security
Five LLMs tested across Java, C++, C, and Python
WA-0CoT prompting strategy uses CWE mappings
Chi-square tests found no statistically significant reduction in vulnerability frequency or density
Prompting methods compared include WA-0CoT, among others
