Collations for Ancient Languages in XSLT and XQuery

From The Digital Classicist Wiki
Revision as of 20:54, 30 October 2014 by JoelKalvesmaki (talk | contribs) (clarification of newlines)

When working with XSLT style sheets and XQuery, classicists often need to alphabetize their material in specialized ways. For example, scholars working with Latin may wish to conflate i with j and u with v, and those working with Greek may wish to have ϙ (qoppa) collated in the alphabet, or to include characters that fall outside the Greek and Coptic and Greek Extended Unicode blocks.

The W3C recommendations on XSLT (1.0, 2.0, 3.0) and XQuery (1.0, 3.0) provide for collations through attributes such as @collation, but they leave to individual transformation engines the decisions on how to construct and retrieve specific collations.
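
As a rough illustration of the @collation attribute, the sketch below passes a collation URI to xsl:sort in an XSLT 3.0 style sheet. The URI is the Unicode Collation Algorithm collation described in the XPath 3.1 functions specification. Whether a given engine recognizes it, and how well its parameters suit polytonic Greek, are assumptions to check against that engine's documentation. The sample words are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Default entry point in XSLT 3.0; no source document is needed. -->
  <xsl:template name="xsl:initial-template">
    <!-- Hypothetical sample words. -->
    <xsl:for-each select="('ψυχή', 'ἀρχή', 'λόγος')">
      <!-- Assumption: the engine supports the UCA collation URI.
           lang=el requests Greek tailoring; strength=primary asks the
           collation to ignore accent and case differences. -->
      <xsl:sort select="."
                collation="http://www.w3.org/2013/collation/UCA?lang=el;strength=primary"/>
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Engines that do not recognize the URI may fall back to another collation or raise an error, which is one reason customized collations are often needed.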

XSLT-friendly collations that address the needs of those working with ancient languages are few and far between, so many projects require customized collations.

Examples of XSLT/XQuery Collations

Greek

Latin

Syriac

  • Syriac Reference Portal: romanized transliteration scheme: definition and application -- intended to work with the Saxon engine.

Alternatives

  • It is possible to use the fn:translate() function as a processor-independent collation method. This method maps each character to a specific Unicode code point, and then relies upon default sorting by code point to alphabetize. Here is an example of how to sort, case-insensitively, a sequence of Greek words stored in the variable $gr (a few line breaks have been introduced to improve display on the screen; when using this code, remove all line breaks between the opening and closing tags):

<sort select="translate($gr,'ἀἁἂἃἄἅἆἇἈἉἊἋἌἍἎἏὰάᾀᾁᾂᾃᾄᾅᾆᾇᾈᾉᾊᾋᾌᾍᾎᾏᾰᾱᾲᾳᾴᾶᾷᾸᾹᾺΆᾼΆΑάα ΒβϐΓγΔδἐἑἒἓἔἕἘἙἚἛἜἝὲέῈΈΈΕέεϵ϶ΖζἠἡἢἣἤἥἦἧἨἩἪἫἬἭἮἯὴήᾐᾑᾒᾓᾔᾕᾖᾗᾘᾙᾚᾛᾜᾝᾞᾟῂῃῄῆῇῊΉῌͰͱΉΗήη ΘθϑϴἰἱἲἳἴἵἶἷἸἹἺἻἼἽἾἿὶίῐῑῒΐῖῗῘῙῚΊΊΐΙΪίιϊϳΚκϏϗϰΛλΜμΝνΞξὀὁὂὃὄὅὈὉὊὋὌὍὸόῸΌΌΟοόΠπϺϻῤῥῬΡρϱϼ ΣςσϲϹϽϾϿΤτὐὑὒὓὔὕὖὗὙὛὝὟὺύῠῡῢΰῦῧῨῩῪΎΎΥΫΰυϋύϒϓϔΦφϕΧχΨψ ὠὡὢὣὤὥὦὧὨὩὪὫὬὭὮὯὼώᾠᾡᾢᾣᾤᾥᾦᾧᾨᾩᾪᾫᾬᾭᾮᾯῲῳῴῶῷῺΏῼΏΩωώϖ ϚϛϜϝϞϟϘϙͲͳϠϡϷϸϢϣϤϥϦϧϨϩϪϫϬϭϮϯ᾽ι᾿῀῁῍῎῏῝῞῟῭΅`´῾ʹ͵Ͷͷͺͻͼͽ;΄΅·', 'αααααααααααααααααααααααααααααααααααααααααααααααααα βββγγδδεεεεεεεεεεεεεεεεεεεεεεζζηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηη θθθθιιιιιιιιιιιιιιιιιιιιιιιιιιιιιιιιιιιικκκκκλλμμννξξοοοοοοοοοοοοοοοοοοοοππϻϻρρρρρρρ σσσσσσσσττυυυυυυυυυυυυυυυυυυυυυυυυυυυυυυυυυυφφφχχψψ ωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωω ϛϛϝϝϟϟϙϙϠϠϡϡϸϸϣϣϥϥϧϧϩϩϫϫϭϭϯϯ')"/>
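
The same technique can be sketched on a much smaller scale for Latin, conflating i with j and u with v. This is a minimal XSLT 2.0 example, not a complete collation: the word list, the variable name $words, and the template name main are hypothetical. It also names the Unicode codepoint collation explicitly, on the assumption that an engine's default collation may not be codepoint order.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Hypothetical sample data. -->
  <xsl:variable name="words" as="xs:string*"
      select="('uva', 'Iulius', 'jam', 'vir', 'iam', 'Venus')"/>

  <!-- Hypothetical named template, used as the entry point. -->
  <xsl:template name="main">
    <xsl:for-each select="$words">
      <!-- Sort key: lower-case the word, then map j to i and v to u.
           The codepoint collation is requested explicitly so that the
           result does not depend on the engine's default collation. -->
      <xsl:sort select="translate(lower-case(.), 'jv', 'iu')"
                collation="http://www.w3.org/2005/xpath-functions/collation/codepoint"/>
      <!-- Output the original, unmodified word. -->
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Only the sort key is altered; the words themselves are output unchanged. Here jam and iam share the key iam, and uva, vir, and Venus all sort under u. Note that translate() deletes any character in its second argument that has no counterpart in its third, which is how a longer map can strip free-standing diacritics before sorting.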