MOSAIC: Distillation-Driven Code Generation for Scientific Workflows Without Test Cases
Researchers have introduced MOSAIC, a training-free multi-agent framework for scientific code generation that does not require I/O test cases. Conventional LLM-based code generation often relies on execution feedback from test cases, but scientific workflows typically lack them, because producing the expected outputs means solving the original problem first. MOSAIC addresses this gap with a student-teacher knowledge distillation approach that grounds generation in domain-specific examples and structured problem decomposition. To reduce hallucinations across interdependent subproblems, it adds a Consolidated Context Window (CCW) that keeps reasoning consistent across agents. Experiments on the SciCode benchmark show that MOSAIC improves accuracy, executability, and numerical precision over existing approaches.
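The summary does not describe how the Consolidated Context Window is built, so the following is only a minimal sketch of the general idea: a shared record of already-solved subproblems that is prepended to each agent's prompt, so later steps reuse earlier function names instead of inventing new ones. The names `ConsolidatedContext`, `build_prompt`, and `max_entries`, and the example signatures, are hypothetical and are not taken from MOSAIC.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ConsolidatedContext:
    """Hypothetical shared record of subproblems solved so far (not MOSAIC's actual API)."""
    entries: List[Tuple[str, str]] = field(default_factory=list)  # (signature, summary)
    max_entries: int = 5  # assumed cap that keeps the prompt short

    def add(self, signature: str, summary: str) -> None:
        """Record a solved subproblem and drop the oldest entries beyond the cap."""
        self.entries.append((signature, summary))
        self.entries = self.entries[-self.max_entries:]

    def render(self) -> str:
        """Format the retained entries as one line per function."""
        return "\n".join(f"{sig}  # {summ}" for sig, summ in self.entries)


def build_prompt(ctx: ConsolidatedContext, subproblem: str) -> str:
    """Assemble the text a coding agent would receive for the next subproblem."""
    if ctx.entries:
        header = "Previously defined functions:\n" + ctx.render()
    else:
        header = "No prior functions."
    return f"{header}\n\nNext subproblem:\n{subproblem}"


# Usage: three subproblems solved, but only the two most recent are retained.
ctx = ConsolidatedContext(max_entries=2)
ctx.add("def lattice_vectors(a)", "returns the lattice basis")
ctx.add("def reciprocal(basis)", "returns the reciprocal basis")
ctx.add("def k_path(recip, n)", "samples n points along a path")
prompt = build_prompt(ctx, "Compute band energies along the k-path.")
print(prompt)
```

Capping the record is one simple way to bound prompt length; a real system might instead summarize or rank older entries, which this sketch does not attempt.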
Key facts
- MOSAIC is a training-free multi-agent framework for scientific code generation.
- It does not require I/O test cases for execution feedback.
- It uses student-teacher knowledge distillation grounded in domain-specific examples.
- It employs a Consolidated Context Window (CCW) to keep reasoning consistent across agents.
- It was evaluated on the SciCode benchmark.
- It improves accuracy, executability, and numerical precision over existing approaches.
- It addresses the lack of test cases in scientific workflows.
- It is designed for multi-agent LLM systems.