LLMs Generate UML Diagrams from Developer Queries
Researchers have introduced a query-driven method for generating UML diagrams, in which an LLM produces a diagram that directly answers a developer's natural-language question about code. The method fine-tunes Qwen2.5-Coder-14B on a curated dataset of code files, developer queries, and the corresponding diagrams, represented as structured JSON. Evaluation combines automatic detection of structural defects with human assessment of semantic relevance. The findings indicate that fine-tuning on a modest amount of manually corrected data yields marked improvements. The approach targets software documentation that is outdated or missing entirely, giving developers focused views of a codebase that contain only the elements relevant to the query.
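To make the idea of automatic structural defect detection on a JSON diagram more concrete, here is a minimal Python sketch. The schema (`classes`, `relationships`, `source`, `target`, `type`), the list of allowed relationship types, and the function name `find_structural_defects` are all assumptions made for illustration; this summary does not specify the paper's actual JSON format or which defects its checker looks for.

```python
import json

# Hypothetical set of relationship kinds; the paper's real schema is not given here.
ALLOWED_RELATION_TYPES = {
    "inheritance", "association", "aggregation", "composition", "dependency",
}


def find_structural_defects(diagram: dict) -> list[str]:
    """Return human-readable descriptions of structural defects in a diagram."""
    defects = []
    seen = set()

    # Every class needs a name, and names must be unique.
    for cls in diagram.get("classes", []):
        name = cls.get("name")
        if not name:
            defects.append("class with missing name")
        elif name in seen:
            defects.append(f"duplicate class: {name}")
        else:
            seen.add(name)

    # Every relationship must connect declared classes with a known type.
    for rel in diagram.get("relationships", []):
        for end in ("source", "target"):
            ref = rel.get(end)
            if ref not in seen:
                defects.append(
                    f"relationship {end} refers to undefined class: {ref}"
                )
        if rel.get("type") not in ALLOWED_RELATION_TYPES:
            defects.append(f"unknown relationship type: {rel.get('type')}")

    return defects


# Illustrative diagram, as a model might emit it for a query about order handling.
example = json.loads("""
{
  "classes": [
    {"name": "OrderService", "methods": ["place_order"]},
    {"name": "Order", "attributes": ["id", "total"]}
  ],
  "relationships": [
    {"source": "OrderService", "target": "Order", "type": "dependency"},
    {"source": "Order", "target": "PaymentGateway", "type": "association"}
  ]
}
""")

for defect in find_structural_defects(example):
    print(defect)
```

In this example the second relationship points at `PaymentGateway`, which is never declared as a class, so the checker reports a dangling reference. Checks of this kind only catch malformed structure; whether a diagram actually answers the query is the part the source assigns to human evaluation.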
Key facts
- Query-driven UML diagram generation uses LLMs to answer developer queries.
- Qwen2.5-Coder-14B is fine-tuned on a curated dataset of code files, developer queries, and diagrams in structured JSON.
- Evaluation includes automatic structural defect detection and human semantic relevance assessment.
- Fine-tuning on a modest amount of manually corrected data yields dramatic improvements.
- Addresses outdated or missing software documentation.
- Produces semantically focused diagrams with only relevant elements.
- Published on arXiv with ID 2604.23816.
- The approach differs from automated reverse-engineering tools, which produce overwhelming detail.
Entities
Institutions
- arXiv