Modular Pipeline for Educational Analogy Generation with LLMs

ai-technology · 2026-05-26

A new modular pipeline breaks educational analogy generation into four stages: source finding, sub-concept generation, explanation generation, and evaluation. Grounded in Structure Mapping Theory, the pipeline makes it possible to examine how model choice and input configuration affect analogy quality. The study evaluated 12 state-of-the-art LLMs from six model families on two datasets with structured sub-concept annotations (SCAR and ParallelPARC), along with seven embedding models for closed-setting retrieval. The findings indicate that sub-concepts significantly improve explanation quality and closed-setting retrieval precision, but offer limited benefit in open-ended settings.
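To make the four-stage structure concrete, the following is a minimal Python sketch of how such a modular pipeline could be wired together. It is not the authors' implementation: the names `AnalogyState`, `run_pipeline`, and the four stub stage functions are hypothetical, the stubs return hard-coded text where a real system would call an LLM, and the coverage score in `evaluate` is a toy stand-in rather than the evaluation used in the study.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class AnalogyState:
    """Data passed from one stage to the next (hypothetical structure)."""
    target: str
    source: str = ""
    sub_concepts: Tuple[str, ...] = ()
    explanation: str = ""
    score: float = 0.0


Stage = Callable[[AnalogyState], AnalogyState]


def run_pipeline(target: str, stages: List[Stage]) -> AnalogyState:
    """Apply each stage in order; any stage can be swapped independently."""
    state = AnalogyState(target=target)
    for stage in stages:
        state = stage(state)
    return state


# Stub stages: assumptions for illustration, standing in for model calls.
def find_source(state: AnalogyState) -> AnalogyState:
    return replace(state, source="water flowing through pipes")


def generate_sub_concepts(state: AnalogyState) -> AnalogyState:
    return replace(state, sub_concepts=("pressure", "flow rate"))


def generate_explanation(state: AnalogyState) -> AnalogyState:
    text = (
        f"{state.target} is like {state.source}: compare "
        + " and ".join(state.sub_concepts)
        + "."
    )
    return replace(state, explanation=text)


def evaluate(state: AnalogyState) -> AnalogyState:
    # Toy metric: fraction of sub-concepts mentioned in the explanation.
    if not state.sub_concepts:
        return replace(state, score=0.0)
    covered = sum(1 for c in state.sub_concepts if c in state.explanation)
    return replace(state, score=covered / len(state.sub_concepts))


result = run_pipeline(
    "electric circuit",
    [find_source, generate_sub_concepts, generate_explanation, evaluate],
)
```

Because each stage only reads and returns an `AnalogyState`, one stage can be replaced (for example, a different model for explanation generation) while the others stay fixed, which is the kind of controlled comparison a modular design allows.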

Key facts

Pipeline has four stages: source finding, sub-concept generation, explanation generation, evaluation
Grounded in Structure Mapping Theory
Evaluated 12 state-of-the-art LLMs across six model families
Used two datasets: SCAR and ParallelPARC
Also evaluated seven embedding models for closed-setting retrieval
Sub-concepts improve explanation quality and closed-setting retrieval precision
Sub-concepts provide limited benefit in open-ended settings
Published on arXiv with ID 2605.24211
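
Closed-setting retrieval with embedding models can be pictured as ranking a fixed pool of candidate source concepts by similarity to the target and then measuring precision over the top results. The sketch below assumes cosine similarity and precision at k, which are common choices but are not confirmed by this summary as the study's exact setup. The function names and the two-dimensional toy vectors are hypothetical, and the sketch does not reproduce any reported result.

```python
import math
from typing import Dict, List, Sequence, Set


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve_top_k(
    query: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    k: int,
) -> List[str]:
    """Rank the closed candidate pool by similarity to the query."""
    ranked = sorted(
        candidates,
        key=lambda name: cosine(query, candidates[name]),
        reverse=True,
    )
    return ranked[:k]


def precision_at_k(retrieved: List[str], relevant: Set[str]) -> float:
    """Fraction of retrieved items that are annotated as relevant."""
    if not retrieved:
        return 0.0
    hits = sum(1 for name in retrieved if name in relevant)
    return hits / len(retrieved)


# Toy embeddings (assumed values, not output of any real embedding model).
pool = {
    "water pipes": [1.0, 0.0],
    "family tree": [0.0, 1.0],
    "road traffic": [1.0, 1.0],
}
query_vec = [1.0, 0.1]  # stands in for an embedded target concept
top = retrieve_top_k(query_vec, pool, k=2)
p_at_2 = precision_at_k(top, relevant={"water pipes"})
```

In this framing, adding sub-concepts to the query changes the text that gets embedded, and therefore the query vector; comparing precision with and without them is one way to measure their effect on retrieval.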
