AI Framework Enhances Educational Assessment Q-Matrices Through LLM and NeuralCDM Integration
A novel framework combines large language models with NeuralCDM-based evaluation to improve the Q-matrices used in educational assessment. These matrices map each test item to the knowledge components it demands; traditionally they are crafted by experts, a process that is subjective, time-consuming, and difficult to validate empirically. In the proposed approach, LLMs generate candidate Q-matrices via structured prompting, and NeuralCDM evaluates each candidate by how well it explains student response data. Applied to a thermodynamics assessment dataset, the method also benchmarks locally deployed LLMs against cloud-served models. Iteratively refined LLM-generated Q-matrices surpassed the expert baseline in model fit, achieving an AUC of 0.780 versus 0.717, and locally deployed models performed comparably to cloud-based alternatives, making the approach practical to run at scale. By pairing AI-generated candidates with data-driven evaluation, the framework addresses key limitations of expert-driven Q-matrix construction and illustrates how human-AI collaboration can strengthen learning analytics and assessment design. The research is published on arXiv under identifier 2604.16398v1, presenting a cross-disciplinary advancement in educational technology.
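To make the evaluation step concrete, the sketch below scores a candidate Q-matrix by how well a simple model explains student responses, reporting a held-out AUC. It is only a minimal illustration: a plain logistic model stands in for the paper's NeuralCDM, and all data, dimensions, and function names are assumptions invented for the example rather than details from the study. In the actual framework, student mastery is latent and estimated by the cognitive diagnosis model; here it is treated as observed purely to keep the toy example short.

```python
# Minimal sketch: score a candidate Q-matrix by how well a simple
# cognitive-diagnosis-style model explains student responses.
# NOTE: this logistic model is a simplified stand-in for NeuralCDM;
# all names, shapes, and data here are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_students, n_items, n_skills = 200, 10, 4

# Candidate Q-matrix: rows are items, columns are knowledge components,
# 1 means the item requires that component (randomly generated here).
q_matrix = rng.integers(0, 2, size=(n_items, n_skills))

# Toy latent student mastery and simulated dichotomous responses.
mastery = rng.random((n_students, n_skills))
logits = mastery @ q_matrix.T - q_matrix.sum(axis=1) / 2.0
responses = (rng.random((n_students, n_items)) < 1 / (1 + np.exp(-logits))).astype(int)

def score_q_matrix(q, responses, mastery):
    """Return held-out AUC of a model predicting each response from the
    mastery of the skills the Q-matrix says the item requires."""
    feats, labels = [], []
    for s in range(responses.shape[0]):
        for i in range(responses.shape[1]):
            feats.append(mastery[s] * q[i])   # mastery masked by the item's row of Q
            labels.append(responses[s, i])
    X, y = np.array(feats), np.array(labels)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

print(f"candidate Q-matrix AUC: {score_q_matrix(q_matrix, responses, mastery):.3f}")
```

In the full framework, this scoring step would be applied to each LLM-generated candidate, and the best-fitting candidates compared against the expert baseline and refined iteratively.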
Key facts
- Q-matrices are theory-driven tools for assessment and learning analytics
- Expert-crafted Q-matrices are subjective and difficult to validate empirically
- The framework uses LLMs to generate candidate Q-matrices via structured prompting (see the prompting sketch after this list)
- NeuralCDM evaluates each candidate by how well it explains student response data
- Applied to a thermodynamics assessment dataset
- LLM-generated Q-matrices achieved AUC 0.780 vs expert baseline 0.717
- Locally deployed LLMs benchmarked against cloud-served models
- Published on arXiv with identifier 2604.16398v1
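
As a rough illustration of the structured-prompting step, the sketch below builds a prompt asking an LLM to emit a candidate Q-matrix as JSON and parses the reply into a binary array. The prompt wording, item texts, and skill labels are invented for the example, and the LLM call itself is left abstract so that no particular local or cloud client API is assumed.

```python
# Sketch of the structured-prompting step: build a prompt asking an LLM
# for a candidate Q-matrix in JSON, then parse the reply into a 0/1 array.
# The prompt text, items, and skills are illustrative assumptions; the
# actual LLM call is left abstract (any local or cloud model could be used).
import json
import numpy as np

ITEMS = [
    "Compute the work done by an ideal gas in an isothermal expansion.",
    "Explain why entropy increases in an irreversible process.",
]
SKILLS = ["first law of thermodynamics", "entropy", "ideal gas law"]

def build_prompt(items, skills):
    """Structured prompt: constrain the LLM to a JSON list of 0/1 rows."""
    return (
        "You are labeling a Q-matrix for a thermodynamics test.\n"
        f"Knowledge components: {skills}\n"
        "For each item below, output a JSON list of 0/1 values, one per "
        "component, where 1 means the item requires that component.\n"
        "Return only a JSON list of lists, one inner list per item.\n\n"
        + "\n".join(f"Item {i + 1}: {text}" for i, text in enumerate(items))
    )

def parse_q_matrix(llm_reply, n_items, n_skills):
    """Parse the LLM's JSON reply and validate its shape and values."""
    q = np.array(json.loads(llm_reply), dtype=int)
    assert q.shape == (n_items, n_skills), "unexpected Q-matrix shape"
    assert set(np.unique(q)) <= {0, 1}, "Q-matrix entries must be 0 or 1"
    return q

prompt = build_prompt(ITEMS, SKILLS)
# llm_reply = call_your_llm(prompt)        # local or cloud model goes here
llm_reply = "[[1, 0, 1], [0, 1, 0]]"       # stand-in reply for the demo
print(parse_q_matrix(llm_reply, len(ITEMS), len(SKILLS)))
```

Each parsed candidate would then be scored with the evaluation step sketched earlier, closing the generate-evaluate-refine loop described in the paper.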
Entities
Institutions
- arXiv