OCR for ancient Greek: Difference between revisions

From The Digital Classicist Wiki
(add Antigrapheus)
(add Kraken)
Line 7:
** [[Lace: Greek OCR]] collects results of OCR processing with Rigaudon on public domain texts
** Initial reports on preliminary results of a survey of techniques: http://www.heml.org/RobertsonGreekOCR/
* A number of people have produced training files for specific Greek fonts in the [http://kraken.re/ Kraken] OCR engine:
** [https://github.com/pharos-alexandria/kraken-ocr-greek_cursive Greek Cursive, from an edition of John Chrysostom's works by Henry Savile]
** [https://github.com/ryanfb/kraken-gaza-iliad Greek from an edition of Theodorus Gaza's Attic paraphrase of the Iliad]
** [https://github.com/mittagessen/kraken-models Greek models in the Kraken models repo] (these are in the legacy pyrnn model format and may not work with the latest version of Kraken; see [https://github.com/mittagessen/kraken/issues/118 this issue])
* The [http://gamera.informatik.hsnr.de/ Gamera] toolkit for analysing and scanning complex texts includes some experiments with polytonic Greek
* Federico Boschetti did some earlier experimentation with adapting/training Google's OCR engine [http://code.google.com/p/tesseract-ocr/ tesseract] to ancient Greek texts: http://www.himeros.eu/ ([http://www.perseus.tufts.edu/~ababeu/ecdl2009-preprint.pdf related paper])

Revision as of 13:36, 12 July 2019

Tools and advice for the Optical Character Recognition (OCR) of Ancient Greek

Alternatives

  • AccessTEI is a manual keying service for members of the TEI; it can handle ancient Greek texts