Difference between revisions of "OCR for ancient Greek"

From The Digital Classicist Wiki
(title; removed out-of-date "External links")
(add Kraken)
 
(One intermediate revision by one user not shown)
Line 2:
  
 
* [http://ancientgreekocr.org Ancient Greek OCR] provides downloads and instructions for OCR using the [http://code.google.com/p/tesseract-ocr Tesseract] engine. Works on Windows, Linux, OSX & Android.
 
 +
* [https://dcthree.github.io/antigrapheus/ Antigrapheus] allows you to use the Ancient Greek OCR training file above to OCR documents in a web browser, using Tesseract.js.
 
* Bruce Robertson has created "Rigaudon", "a complete suite of scripts, python code and data required for producing polytonic Greek OCR"
 
 
** [https://github.com/brobertson/rigaudon Rigaudon GitHub page]
 
 
** [[Lace: Greek OCR]] collects results of OCR processing with Rigaudon on public domain texts
 
 
** Initial reports on preliminary results of a survey of techniques: http://www.heml.org/RobertsonGreekOCR/
 
 +
* A number of people have produced training files for specific Greek fonts in the [http://kraken.re/ Kraken] OCR engine:
 +
** [https://github.com/pharos-alexandria/kraken-ocr-greek_cursive Greek Cursive, from an edition of John Chrysostom's works by Henry Savile]
 +
** [https://github.com/ryanfb/kraken-gaza-iliad Greek from an edition of Theodorus Gaza's Attic paraphrase of the Iliad]
 +
** [https://github.com/mittagessen/kraken-models Greek models in the Kraken models repo] (these are in the legacy pyrnn model format and may not work with the latest version of Kraken; see [https://github.com/mittagessen/kraken/issues/118 this issue])
 
* The [http://gamera.informatik.hsnr.de/ Gamera] toolkit for analysing and scanning complex texts includes some experiments with polytonic Greek
 
 
 
* Federico Boschetti carried out earlier experiments in adapting and training Google's OCR engine [http://code.google.com/p/tesseract-ocr/ Tesseract] for ancient Greek texts: http://www.himeros.eu/ ([http://www.perseus.tufts.edu/~ababeu/ecdl2009-preprint.pdf related paper])
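
As a rough illustration of how the Tesseract-based options listed above are typically driven, the following Python sketch builds and runs a <code>tesseract IMAGE OUTPUTBASE -l LANG</code> command. It is only a sketch under stated assumptions: that a <code>tesseract</code> executable is on the PATH and that an Ancient Greek training file is installed under the language code <code>grc</code>. The function names <code>build_tesseract_command</code> and <code>ocr_page</code> are hypothetical helpers invented for this example and are not part of any of the projects linked here.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_tesseract_command(image: str, output_base: str,
                            lang: str = "grc",
                            tessdata_dir: Optional[str] = None) -> List[str]:
    """Build the argument list for: tesseract IMAGE OUTPUTBASE -l LANG.

    Assumption: the Ancient Greek training data is installed as 'grc'.
    """
    cmd = ["tesseract", image, output_base, "-l", lang]
    if tessdata_dir is not None:
        # Point Tesseract at a directory holding the .traineddata file.
        cmd += ["--tessdata-dir", tessdata_dir]
    return cmd


def ocr_page(image: str, output_base: str, **kwargs) -> Path:
    """Run Tesseract on one page image; return the path of the text output.

    Hypothetical helper: Tesseract writes plain text to OUTPUTBASE.txt.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(build_tesseract_command(image, output_base, **kwargs),
                   check=True)
    return Path(output_base + ".txt")
```

Calling <code>ocr_page("page.png", "page")</code> would, under these assumptions, leave the recognised text in <code>page.txt</code>. Building the argument list separately from running it makes it easy to inspect the command before anything is executed.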

Latest revision as of 14:36, 12 July 2019

Tools and advice for the Optical Character Recognition (OCR) of Ancient Greek

Alternatives

  • AccessTEI is a service for TEI members that provides manual keying of texts and can handle ancient Greek.