AI Framework Enhances Educational Assessment Q-Matrices Through LLM and NeuralCDM Integration
A novel framework combines large language models with NeuralCDM-based evaluation to improve the Q-matrices used in educational assessment. These matrices map each test item to the knowledge components it demands; traditionally they are crafted by experts, a process that is subjective, time-consuming, and difficult to validate empirically. In the proposed approach, LLMs generate candidate Q-matrices via structured prompting, and NeuralCDM evaluates each candidate by how well it explains student response data. Applied to a thermodynamics assessment dataset, the method also benchmarks locally deployed LLMs against cloud-served models. Iteratively refined LLM-generated Q-matrices surpassed the expert baseline in model fit, achieving an AUC of 0.780 versus 0.717, and locally deployed models performed comparably to cloud-based alternatives, making the approach practical to run at scale. By pairing AI-generated candidates with data-driven evaluation, the framework addresses key limitations of expert-driven Q-matrix construction and illustrates how human-AI collaboration can strengthen learning analytics and assessment design. The research is published on arXiv under identifier 2604.16398v1, presenting a cross-disciplinary advancement in educational technology.
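To make the evaluation step concrete, the sketch below scores a candidate Q-matrix by how well a simple model explains student responses, reporting a held-out AUC. It is only a minimal illustration: a plain logistic model stands in for the paper's NeuralCDM, and all data, dimensions, and function names are assumptions invented for the example rather than details from the study. In the actual framework, student mastery is latent and estimated by the cognitive diagnosis model; here it is treated as observed purely to keep the toy example short.

```python
# Minimal sketch: score a candidate Q-matrix by how well a simple
# cognitive-diagnosis-style model explains student responses.
# NOTE: this logistic model is a simplified stand-in for NeuralCDM;
# all names, shapes, and data here are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_students, n_items, n_skills = 200, 10, 4

# Candidate Q-matrix: rows are items, columns are knowledge components,
# 1 means the item requires that component (randomly generated here).
q_matrix = rng.integers(0, 2, size=(n_items, n_skills))

# Toy latent student mastery and simulated dichotomous responses.
mastery = rng.random((n_students, n_skills))
logits = mastery @ q_matrix.T - q_matrix.sum(axis=1) / 2.0
responses = (rng.random((n_students, n_items)) < 1 / (1 + np.exp(-logits))).astype(int)

def score_q_matrix(q, responses, mastery):
    """Return held-out AUC of a model predicting each response from the
    mastery of the skills the Q-matrix says the item requires."""
    feats, labels = [], []
    for s in range(responses.shape[0]):
        for i in range(responses.shape[1]):
            feats.append(mastery[s] * q[i])   # mastery masked by the item's row of Q
            labels.append(responses[s, i])
    X, y = np.array(feats), np.array(labels)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

print(f"candidate Q-matrix AUC: {score_q_matrix(q_matrix, responses, mastery):.3f}")
```

In the full framework, this scoring step would be applied to each LLM-generated candidate, and the best-fitting candidates compared against the expert baseline and refined iteratively.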
Key facts
- Q-matrices are theory-driven tools for assessment and learning analytics
- Expert-crafted Q-matrices are subjective and difficult to validate empirically
- The framework uses LLMs to generate candidate Q-matrices via structured prompting (see the prompting sketch after this list)
- NeuralCDM evaluates each candidate by how well it explains student response data
- Applied to a thermodynamics assessment dataset
- LLM-generated Q-matrices achieved AUC 0.780 vs expert baseline 0.717
- Locally deployed LLMs benchmarked against cloud-served models
- Published on arXiv with identifier 2604.16398v1
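
As a rough illustration of the structured-prompting step, the sketch below builds a prompt asking an LLM to emit a candidate Q-matrix as JSON and parses the reply into a binary array. The prompt wording, item texts, and skill labels are invented for the example, and the LLM call itself is left abstract so that no particular local or cloud client API is assumed.

```python
# Sketch of the structured-prompting step: build a prompt asking an LLM
# for a candidate Q-matrix in JSON, then parse the reply into a 0/1 array.
# The prompt text, items, and skills are illustrative assumptions; the
# actual LLM call is left abstract (any local or cloud model could be used).
import json
import numpy as np

ITEMS = [
    "Compute the work done by an ideal gas in an isothermal expansion.",
    "Explain why entropy increases in an irreversible process.",
]
SKILLS = ["first law of thermodynamics", "entropy", "ideal gas law"]

def build_prompt(items, skills):
    """Structured prompt: constrain the LLM to a JSON list of 0/1 rows."""
    return (
        "You are labeling a Q-matrix for a thermodynamics test.\n"
        f"Knowledge components: {skills}\n"
        "For each item below, output a JSON list of 0/1 values, one per "
        "component, where 1 means the item requires that component.\n"
        "Return only a JSON list of lists, one inner list per item.\n\n"
        + "\n".join(f"Item {i + 1}: {text}" for i, text in enumerate(items))
    )

def parse_q_matrix(llm_reply, n_items, n_skills):
    """Parse the LLM's JSON reply and validate its shape and values."""
    q = np.array(json.loads(llm_reply), dtype=int)
    assert q.shape == (n_items, n_skills), "unexpected Q-matrix shape"
    assert set(np.unique(q)) <= {0, 1}, "Q-matrix entries must be 0 or 1"
    return q

prompt = build_prompt(ITEMS, SKILLS)
# llm_reply = call_your_llm(prompt)        # local or cloud model goes here
llm_reply = "[[1, 0, 1], [0, 1, 0]]"       # stand-in reply for the demo
print(parse_q_matrix(llm_reply, len(ITEMS), len(SKILLS)))
```

Each parsed candidate would then be scored with the evaluation step sketched earlier, closing the generate-evaluate-refine loop described in the paper.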
Entities
Institutions
- arXiv