ARTFEED — Contemporary Art Intelligence

AI Pipeline Transcribes Medieval English Legal Manuscripts

digital · 2026-05-06

A new open-source AI pipeline reaches 79% word accuracy when transcribing medieval English legal manuscripts written in abbreviated Latin. The dataset comprises 4,029 lines from 193 criminal and civil cases. The system uses R-Blla for line segmentation and a CNN+LSTM model with CTC decoding for handwriting recognition. Simple post-processing significantly boosts accuracy, even though the training set is small and the abbreviations must be expanded during transcription. The project aims to democratize access to the records of the first centuries of the Anglo-American legal system, records that only a few dozen scholars worldwide can currently read.
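For readers unfamiliar with CTC decoding, the following is a minimal sketch of the simplest variant, greedy (best-path) decoding: take the most likely label per time step, collapse consecutive repeats, then drop the blank symbol. The article does not say which decoding strategy the pipeline uses, so the greedy choice, the function name `ctc_greedy_decode`, the blank index, and the toy alphabet are all assumptions made for illustration.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical convention)


def ctc_greedy_decode(frame_ids, alphabet, blank=BLANK):
    """Turn per-frame label ids into text using best-path CTC decoding.

    frame_ids: the most likely label id at each time step (argmax of the
               recognizer's output), as a sequence of ints.
    alphabet:  maps label id -> character; the blank slot is unused.
    """
    # Step 1: collapse runs of identical consecutive labels.
    collapsed = [label for label, _ in groupby(frame_ids)]
    # Step 2: remove blanks, which only separate genuine repeats.
    return "".join(alphabet[i] for i in collapsed if i != blank)


# Toy alphabet for illustration only; index 0 is the blank.
alphabet = ["", "r", "e", "x"]
decoded = ctc_greedy_decode([1, 1, 0, 2, 2, 0, 0, 3], alphabet)
doubled = ctc_greedy_decode([1, 0, 1], alphabet)
```

Here `decoded` is "rex", and `doubled` is "rr" because the blank between the two identical labels keeps them from being merged.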

Key facts

  • Dataset of 4,029 lines from 193 medieval cases
  • Uses R-Blla and CNN+LSTM with CTC decoding
  • 79% word accuracy achieved
  • Post-processing significantly boosts accuracy
  • Manuscripts in abbreviated medieval Latin
  • Only a few dozen scholars can read them
  • Open-source end-to-end pipeline
  • Records of the first centuries of the Anglo-American legal system
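The 79% figure above is a word-level metric. The article does not define it, so the sketch below assumes one common convention: word accuracy as 1 minus the word error rate, where the error rate is the word-level edit distance divided by the number of reference words. The function names and the sample Latin strings are hypothetical and are not taken from the dataset.

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two word lists (substitute/insert/delete)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # delete a reference word
                            curr[j - 1] + 1,      # insert a hypothesis word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Assumed definition: 1 - (word edit distance / reference word count)."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    errors = word_edit_distance(ref_words, hyp_words)
    return max(0.0, 1.0 - errors / len(ref_words))


# Made-up strings for illustration: one misspelled word, one dropped word.
reference = "et predictus Johannes venit et dicit"
hypothesis = "et predictus Johanes venit dicit"
accuracy = word_accuracy(reference, hypothesis)
```

With six reference words and two errors, `accuracy` comes out at about 0.67 under this definition. Other definitions, such as the plain fraction of exactly matched words, would give different numbers for the same transcription.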
