The Latin Macronizer

From The Digital Classicist Wiki

  • Johan Winge


From the project website (accessed 2020-09-29)

This automatic macronizer lets you quickly mark all the long vowels in a Latin text. The expected accuracy on an average classical text is estimated to be about 98% to 99%. Please review the resulting macrons with a critical eye!

The macronization is performed using a part-of-speech tagger (RFTagger) trained on the Latin Dependency Treebank, and with macrons provided by a customized version of the Morpheus morphological analyzer. An earlier version of this tool was the subject of my bachelor’s thesis in Language Technology, “Automatic annotation of Latin vowel length”.

If you want to run the macronizer locally, or develop it further, you may find the source code on GitHub.
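
The general idea described above, where a tagger first disambiguates each word and the vowel lengths are then looked up for that specific analysis, can be illustrated with a minimal Python sketch. This is not the macronizer's actual code and does not call RFTagger or Morpheus: the `LEXICON` table, the tag names, and the `macronize` function are all hypothetical, and the tags are supplied by hand rather than predicted.

```python
# Toy illustration only: the lexicon entries and tag names are hypothetical
# stand-ins, not output of RFTagger or Morpheus.

# (wordform, morphological tag) -> macronized form
LEXICON = {
    ("rosa", "noun.nom.sg"): "rosa",
    ("rosa", "noun.abl.sg"): "rosā",
    ("villa", "noun.nom.sg"): "vīlla",
    ("villa", "noun.abl.sg"): "vīllā",
    ("est", "verb.3sg"): "est",
    ("in", "prep"): "in",
}


def macronize(tagged_tokens):
    """Return the macronized form for each (wordform, tag) pair.

    Pairs missing from the lexicon are passed through unchanged.
    """
    return [LEXICON.get((word, tag), word) for word, tag in tagged_tokens]


# In a real pipeline the tags would come from a part-of-speech tagger;
# here they are written out by hand.
tagged = [
    ("rosa", "noun.nom.sg"),
    ("est", "verb.3sg"),
    ("in", "prep"),
    ("villa", "noun.abl.sg"),
]
print(" ".join(macronize(tagged)))  # rosa est in vīllā
```

The sketch shows why tagging matters: the same written form "rosa" receives a macron only when it is tagged as an ablative, so an error in the tag leads directly to an error in the macrons.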