OCR for ancient Greek
Revision as of 11:06, 12 July 2019
Tools and advice for the Optical Character Recognition (OCR) of Ancient Greek
- Ancient Greek OCR (http://ancientgreekocr.org) provides downloads and instructions for OCR using the Tesseract engine (http://code.google.com/p/tesseract-ocr). It works on Windows, Linux, OS X, and Android.
- Bruce Robertson has created "Rigaudon", "a complete suite of scripts, python code and data required for producing polytonic Greek OCR"
- Rigaudon GitHub page
- Lace: Greek OCR collects results of OCR processing with Rigaudon on public domain texts
- Initial reports on preliminary results of a survey of techniques: http://www.heml.org/RobertsonGreekOCR/
- The Gamera toolkit for analysing and scanning complex texts includes some experiments with polytonic Greek
- Federico Boschetti experimented earlier with adapting and training Google's OCR engine Tesseract for ancient Greek texts: http://www.himeros.eu/ (related paper)
- The commercial OCR software Anagnostis (€585) can handle ancient Greek, though apparently poorly
- ABBYY FineReader can be made to work with ancient Greek with extensive training
- Google Docs can now run OCR on uploaded documents in a variety of languages. Specifying "Greek" and uploading a PDF gives some results (images seem not to work). Quality is roughly on the level of Google Books OCR of printed ancient Greek.
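As a rough illustration of the Tesseract route listed above, the sketch below calls the `tesseract` command-line tool from Python and normalises the result to Unicode NFC, which is useful for polytonic Greek because accents and breathings may come out as separate combining characters. It assumes Tesseract is installed and that Ancient Greek training data is available under the language code `grc`. The helper names and the `page.png` file name are hypothetical and do not come from any of the projects listed.

```python
import shutil
import subprocess
import unicodedata
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="grc"):
    """Return the argument list for one Tesseract run on a page image.

    Assumption: Ancient Greek training data is installed as "grc".
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def normalize_greek(text):
    """Compose base letters and combining diacritics into NFC form."""
    return unicodedata.normalize("NFC", text)


def ocr_page(image_path, output_base, lang="grc"):
    """Run Tesseract on one image and return the normalised text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(
        build_tesseract_command(image_path, output_base, lang),
        check=True,
    )
    # Tesseract writes plain-text output to "<output_base>.txt".
    text = Path(f"{output_base}.txt").read_text(encoding="utf-8")
    return normalize_greek(text)


if __name__ == "__main__":
    # Hypothetical input file; replace with a real scanned page.
    print(ocr_page("page.png", "page"))
```

Keeping the command construction in its own function makes it easy to try a different language code (for example modern Greek) without changing the rest of the script.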
Alternatives
- AccessTEI (http://accesstei.apexcovantage.com/) is a manual text-keying service for members of the TEI, and it can handle ancient Greek