Difference between revisions of "OCR for ancient Greek"

From The Digital Classicist Wiki
m (alternatives: capitalisation)
(title; removed out-of-date "External links")

Revision as of 12:06, 12 July 2019

Tools and advice for the Optical Character Recognition (OCR) of Ancient Greek

  • Ancient Greek OCR (http://ancientgreekocr.org) provides downloads and instructions for OCR using the Tesseract engine (http://code.google.com/p/tesseract-ocr). Works on Windows, Linux, OS X and Android.
  • Bruce Robertson has created "Rigaudon", "a complete suite of scripts, python code and data required for producing polytonic Greek OCR"
  • The Gamera toolkit for analysing and scanning complex texts includes some experiments with polytonic Greek
  • Federico Boschetti experimented earlier with adapting and training Google's OCR engine Tesseract for ancient Greek texts: http://www.himeros.eu/ (related paper)
  • The commercial OCR software Anagnostis (€585) can handle ancient Greek, though apparently poorly
  • ABBYY FineReader can be made to work with ancient Greek with extensive training
  • Google Docs can now run OCR on uploaded documents in a variety of languages. Specifying "Greek" and uploading a PDF gives some results (images seem not to work). Quality is roughly on par with Google Books OCR of printed ancient Greek.
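
As a rough illustration of the Tesseract-based route listed above, the sketch below wraps the Tesseract command-line program from Python. It assumes that Tesseract is installed and on the PATH, and that Ancient Greek training data is installed under the language code grc. The function names and file names are hypothetical and are not part of any of the projects listed here.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="grc"):
    """Return the argument list for one Tesseract command-line run.

    Assumption: "grc" is the language code of the installed Ancient Greek
    training data; change it to match the files you actually installed.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_page(image_path, output_base, lang="grc"):
    """Run Tesseract on a single page image and return the recognised text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    cmd = build_tesseract_command(image_path, output_base, lang)
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(cmd, check=True)
    # Tesseract writes plain-text output to "<output_base>.txt".
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

A call such as ocr_page("page_001.png", "page_001") would then return the recognised text of a (hypothetical) scanned page. Keeping the command construction in its own function makes it easy to inspect or log the exact command before running it on a large batch of images.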

Alternatives

  • AccessTEI (http://accesstei.apexcovantage.com/) is a manual keying service for members of the TEI; it can handle ancient Greek