ARTFEED — Contemporary Art Intelligence

Data-Driven Cognitive Profiling Improves MCQ Difficulty Prediction

ai-technology · 2026-05-20

Researchers have introduced a predictive framework for estimating the difficulty of multiple-choice questions. Rather than relying on theoretical ability sampling, the approach models learner diversity through data-driven cognitive profiling. Using the EEDI dataset, the researchers applied latent class analysis to identify distinct student personas, then conditioned a large language model to simulate each persona's response distribution. These simulated responses were combined with topic context as inputs to a Ridge Regression model that predicts each item's IRT difficulty parameter. Under five-fold cross-validation, mean squared error fell from 0.367 to 0.274 and R-squared rose from 0.525 to 0.686. The study is available on arXiv, identifier 2605.16290.
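The final stage of the pipeline described above, a ridge regression evaluated with five-fold cross-validation, can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the feature layout (one simulated accuracy per persona plus a one-hot topic), the topic-only baseline, the regularization strength, and all names such as `persona_acc` and `cross_validate` are assumptions made for the example. It uses NumPy's closed-form solve rather than any particular library's Ridge class.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the weights.
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w

def cross_validate(X, y, alpha=1.0, k=5, seed=0):
    """Return (MSE, R^2) computed on pooled out-of-fold predictions."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    preds = np.empty_like(y)
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        w, b = ridge_fit(X[train_idx], y[train_idx], alpha)
        preds[test_idx] = X[test_idx] @ w + b
    mse = float(np.mean((y - preds) ** 2))
    r2 = 1.0 - mse / float(np.var(y))
    return mse, r2

# Synthetic stand-in data (NOT the EEDI dataset; all values are made up).
rng = np.random.default_rng(42)
n_items, n_personas, n_topics = 200, 4, 5
# Hypothetical LLM-simulated probability of a correct answer, per persona.
persona_acc = rng.uniform(0.1, 0.9, size=(n_items, n_personas))
# Hypothetical one-hot topic context for each item.
topics = np.eye(n_topics)[rng.integers(0, n_topics, size=n_items)]
# Assumed ground truth: items that personas answer correctly less often are harder.
difficulty = -3.0 * (persona_acc.mean(axis=1) - 0.5) + rng.normal(0, 0.3, n_items)

# Hypothetical baseline (topic only) versus persona features plus topic.
mse_base, r2_base = cross_validate(topics, difficulty)
mse_full, r2_full = cross_validate(np.hstack([persona_acc, topics]), difficulty)
print(f"topic only:       MSE={mse_base:.3f}  R^2={r2_base:.3f}")
print(f"personas + topic: MSE={mse_full:.3f}  R^2={r2_full:.3f}")
```

Because the synthetic difficulty is generated from the persona accuracies, the second feature set is expected to score better here by construction; the numbers it prints say nothing about the study's actual results.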

Key facts

  • Framework uses data-driven cognitive profiling instead of theoretical ability sampling.
  • Student personas identified via latent class analysis on EEDI dataset.
  • LLM conditioned to simulate response distributions for each persona.
  • Ridge Regression model predicts IRT difficulty parameter.
  • Under five-fold cross-validation, MSE improves from 0.367 to 0.274.
  • R-squared improves from 0.525 to 0.686.
  • Personas are interpretable and explain item difficulty.
  • Published on arXiv with ID 2605.16290.
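
To put the reported metrics in proportion, the short calculation below derives the relative changes from the four figures listed above. Only those four numbers come from the article; the helper name `relative_change` is introduced here for illustration.

```python
def relative_change(before, after):
    """Signed change relative to the starting value."""
    return (after - before) / before

# Figures reported in the key facts above.
mse_before, mse_after = 0.367, 0.274
r2_before, r2_after = 0.525, 0.686

mse_reduction = -relative_change(mse_before, mse_after)  # fraction of MSE removed
r2_gain_abs = r2_after - r2_before                       # absolute R-squared gain
r2_gain_rel = relative_change(r2_before, r2_after)       # relative R-squared gain

print(f"MSE reduction: {mse_reduction:.1%}")        # about 25.3%
print(f"R-squared gain: {r2_gain_abs:.3f} absolute, {r2_gain_rel:.1%} relative")
```

In other words, the reported error drops by roughly a quarter, and the share of variance explained rises by 0.161.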
