ARTFEED — Contemporary Art Intelligence

FeatEHR-LLM: LLM-Based Feature Engineering for EHR Data

ai-technology · 2026-04-27

A new framework called FeatEHR-LLM uses large language models (LLMs) to generate clinically meaningful features from irregularly sampled electronic health record (EHR) time series. The approach targets three properties of such data: irregular observation intervals, variable measurement frequencies, and structural sparsity. To protect patient privacy, the LLM sees only dataset schemas and task descriptions, never raw records. A tool-augmented generation mechanism lets the LLM produce executable feature-extraction code that handles uneven observation patterns and informative sparsity, where the absence of a measurement itself carries signal. The framework is presented in a preprint on arXiv (2604.22534).
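To make the idea of features for irregularly sampled series concrete, here is a minimal sketch of the kind of extraction code such a system might emit. It is not taken from the preprint: the function name `irregular_series_features`, its parameters, and the chosen features (observation count, gaps between observations, time since the last measurement, and an observed flag) are all illustrative assumptions.

```python
from statistics import mean


def irregular_series_features(times_h, values, horizon_h):
    """Summarize one irregularly sampled series (hypothetical helper).

    times_h:   observation times in hours since admission, ascending.
    values:    measurements aligned with times_h.
    horizon_h: prediction time in hours; later observations are ignored.
    """
    # Keep only observations made at or before the prediction time.
    obs = [(t, v) for t, v in zip(times_h, values) if t <= horizon_h]

    if not obs:
        # Informative sparsity: a test that was never ordered is itself a
        # signal, so it is recorded as a flag instead of being imputed.
        return {
            "observed": 0,
            "count": 0,
            "mean_gap_h": None,
            "max_gap_h": None,
            "hours_since_last": None,
            "last_value": None,
        }

    ts = [t for t, _ in obs]
    # Gaps between consecutive observations capture sampling irregularity.
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]

    return {
        "observed": 1,
        "count": len(obs),
        "mean_gap_h": mean(gaps) if gaps else None,
        "max_gap_h": max(gaps) if gaps else None,
        "hours_since_last": horizon_h - ts[-1],
        "last_value": obs[-1][1],
    }


if __name__ == "__main__":
    # A lab value measured at hours 0, 1, 4 and 10, with prediction at hour 6.
    print(irregular_series_features([0, 1, 4, 10], [1.0, 1.2, 2.0, 3.0], 6))
```

In this sketch the measurement at hour 10 is dropped because it falls after the prediction time, which avoids leaking future information into the features.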

Key facts

  • FeatEHR-LLM leverages LLMs for feature engineering in EHR data
  • Addresses irregular observation intervals and variable measurement frequencies
  • LLM operates only on schemas and task descriptions to protect privacy
  • Tool-augmented generation produces executable feature-extraction code
  • Handles uneven observation patterns and informative sparsity
  • Preprint available on arXiv with ID 2604.22534
  • Existing automated methods lack clinical awareness or assume clean inputs
  • Framework targets real-world EHR data, which is irregularly sampled and sparse
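The schema-only access pattern listed above can be sketched as a prompt builder that never touches patient rows. This is an assumption about how such a prompt could be assembled, not the preprint's actual interface: `build_schema_prompt`, the schema layout, and the requested function name `extract_features` are all hypothetical.

```python
def build_schema_prompt(schema, task_description):
    """Build an LLM prompt from column metadata only (hypothetical helper).

    schema: mapping of column name to a dict with "dtype" and "description".
    task_description: plain-language statement of the prediction task.
    No patient records are passed in, so none can appear in the prompt.
    """
    lines = [f"Task: {task_description}", "Columns:"]
    for name, meta in schema.items():
        lines.append(f"- {name} ({meta['dtype']}): {meta['description']}")
    # Ask for code, which can later be run locally against the real data.
    lines.append("Write a Python function extract_features(df) for this task.")
    return "\n".join(lines)


if __name__ == "__main__":
    example_schema = {
        "charttime": {"dtype": "datetime", "description": "time of measurement"},
        "lactate": {"dtype": "float", "description": "serum lactate, mmol/L"},
    }
    print(build_schema_prompt(example_schema, "predict in-hospital mortality"))
```

The design point is that the generated code, not the model, is what runs against raw records, so the data can stay inside the institution that holds it.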

Entities

Institutions

  • arXiv

Sources