TLG Beta Code vs. Unicode FAQ

From The Digital Classicist Wiki

Latest revision as of 21:17, 29 June 2026

Should I use TLG Beta Code or Unicode for polytonic classical Greek in my electronic publications?

Some practical considerations one hears quoted for both sides of this debate. (Thanks, Ross. All comments/additions welcome.)

Arguments one hears for coding polytonic classical Greek with TLG Beta Code even today in new electronic publications:

  1. Unicode conflates the idea of "character" and "glyph", treating an alpha+acute as a different letter from an alpha+grave, and a terminal sigma as different from a medial sigma.
  2. Morpheus (Perseus morphological parser, aka cruncher) needs Beta Code input.
  3. There are symbols defined in Beta Code but not yet defined in Unicode, and symbols defined in both, but with no font support in Unicode (but this is a problem either way).
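The first argument above can be made concrete with a minimal Python sketch. It only uses the standard `unicodedata` module; the helper name `base_letters` is hypothetical and introduced here for illustration. The sketch shows that precomposed alpha+acute and alpha+grave are distinct code points that share a base letter once canonically decomposed, and that final and medial sigma are separate code points that only case folding brings together.

```python
import unicodedata

# Precomposed polytonic code points: one number per letter+accent combination.
alpha_acute = "\u1F71"   # GREEK SMALL LETTER ALPHA WITH OXIA
alpha_grave = "\u1F70"   # GREEK SMALL LETTER ALPHA WITH VARIA
medial_sigma = "\u03C3"  # GREEK SMALL LETTER SIGMA
final_sigma = "\u03C2"   # GREEK SMALL LETTER FINAL SIGMA


def base_letters(text):
    """Hypothetical helper: drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# As stored, the two accented alphas compare as different strings ...
print(alpha_acute == alpha_grave)
# ... but they share the same base letter once the accents are stripped.
print(base_letters(alpha_acute) == base_letters(alpha_grave))
# The two sigmas are different code points; case folding maps both to the medial form.
print(final_sigma == medial_sigma, final_sigma.casefold() == medial_sigma)
```

In other words, software that wants to treat accent variants or sigma forms as "the same letter" has to normalise explicitly; Unicode does not do it by default.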

Arguments one hears for coding polytonic classical Greek with Unicode in new electronic publications:

  1. Unicode is an international standard.
  2. It sucks to have to implement a transcoder vel sim. in the already hairy process of setting up Tomcat/Cocoon or another on-the-fly publication framework.
  3. If you offer your XML source files for download, and the Greek is in TLG Beta Code, people can't easily read them without conversion.
  4. By virtue of the transcoder and other conversion methods out there, we can always go back to Beta Code, on the fly, when it is necessary.
  5. Beta Code uses punctuation marks in non-standard ways, so any tokenizer has to be rewritten (e.g. you can't count on ")" to follow the end of a word); in some cases this means extra programming.
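Points 4 and 5 can be illustrated with a short Python sketch. It is not the transcoder mentioned above, and it makes several assumptions: it uses lowercase Beta Code (official TLG Beta Code is uppercase), it covers only the plain lowercase letters, breathings, the three accents and iota subscript, and the names `to_beta`, `naive_tokens` and `beta_tokens` are hypothetical. Capitals (the `*` prefix), diaeresis and the many special symbols are left out.

```python
import re
import unicodedata

# Partial mapping, for illustration only: lowercase letters and common diacritics.
LETTERS = dict(zip("αβγδεζηθικλμνξοπρσςτυφχψω", "abgdezhqiklmncoprsstufxyw"))
MARKS = {
    "\u0313": ")",   # smooth breathing
    "\u0314": "(",   # rough breathing
    "\u0301": "/",   # acute
    "\u0300": "\\",  # grave
    "\u0342": "=",   # circumflex (perispomeni)
    "\u0345": "|",   # iota subscript
}


def to_beta(text):
    """Convert Unicode Greek to lowercase Beta Code; unmapped characters pass through."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(LETTERS.get(ch) or MARKS.get(ch) or ch for ch in decomposed)


def naive_tokens(text):
    """A generic tokenizer that treats punctuation as word boundaries."""
    return re.findall(r"\w+", text)


def beta_tokens(text):
    """A Beta-Code-aware tokenizer: diacritic punctuation counts as part of the word."""
    return re.findall(r"[A-Za-z*()/\\=+|]+", text)


beta = to_beta("μῆνιν ἄειδε θεά")
print(beta)                # mh=nin a)/eide qea/
print(naive_tokens(beta))  # ['mh', 'nin', 'a', 'eide', 'qea']
print(beta_tokens(beta))   # ['mh=nin', 'a)/eide', 'qea/']
```

Note that the Beta-Code-aware pattern also swallows genuine parentheses in the surrounding text, which is the kind of ambiguity that makes the extra programming necessary.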