Latin OCR

From The Digital Classicist Wiki

  • Antonia Karaisl
  • Nick White


From the project website (accessed 2016-09-22):

Latin OCR provides free software to convert scans of early modern Latin printed text into unicode text and PDF files that can be easily searched, copied, archived, and transformed. It uses Tesseract as an OCR engine with a specific training set based on the work of Ancient Greek OCR and Ryan Baumann's Latin OCR for Tesseract. The training set is developed by Rescribe Ltd and is specifically tailored to cater to the peculiarities of historic fonts and characters used in printing from 1500 to about 1800.
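The workflow described above (scan in, searchable text and PDF out) can be sketched as a small wrapper around the Tesseract command line. This is only an illustrative sketch, not the project's own tooling: the function names `build_tesseract_command` and `run_ocr` are hypothetical, and the language code `lat` and the optional training-data directory are assumptions about how the Latin training set is installed locally.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image, output_base, lang="lat", tessdata_dir=None):
    """Return the argument list for a Tesseract run that writes text and PDF.

    `lang` is assumed to match the name of the installed Latin training
    file (for example lat.traineddata); adjust it to your installation.
    """
    cmd = ["tesseract", str(image), str(output_base), "-l", lang]
    if tessdata_dir is not None:
        # Point Tesseract at a directory holding the custom training data.
        cmd += ["--tessdata-dir", str(tessdata_dir)]
    # "txt" and "pdf" are Tesseract output configs: plain text and
    # a searchable PDF, written as <output_base>.txt and <output_base>.pdf.
    cmd += ["txt", "pdf"]
    return cmd


def run_ocr(image, output_base, **kwargs):
    """Run Tesseract on one page image and return the expected output paths."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(build_tesseract_command(image, output_base, **kwargs), check=True)
    return Path(f"{output_base}.txt"), Path(f"{output_base}.pdf")
```

Calling `run_ocr("page.png", "page")` would then be expected to leave `page.txt` and `page.pdf` beside the scan, provided Tesseract and a Latin training file are installed.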