Collations for Ancient Languages in XSLT and XQuery

When writing XSLT style sheets and XQuery queries, classicists often need to alphabetize their material in specialized ways. For example, scholars working with Latin may wish to conflate i with j and u with v, and those working with Greek may wish to have ϙ (qoppa) collated in the alphabet, or to include characters that lie outside the Greek and Coptic and Greek Extended Unicode blocks.

The W3C recommendations on XSLT (1.0, 2.0, 3.0) and XQuery (1.0, 3.0) provide for collations through attributes such as @collation, but they leave to individual transformation engines the decisions on how to construct and retrieve specific collations.
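As a hedged illustration of the @collation attribute, the fragment below sorts hypothetical w elements using a collation URI from the Unicode Collation Algorithm family defined for XPath 3.1 and XSLT 3.0. The element name w and the parameters chosen are assumptions made for this sketch. Whether a processor recognizes the URI, and how closely it honors the requested tailoring, remains engine-dependent.

```xml
<!-- Sketch only: sorts hypothetical <w> children of the context node. -->
<xsl:for-each select="w" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- lang=el requests Greek tailoring; strength=primary compares base
       letters only, so accent and case differences are ignored. -->
  <xsl:sort select="."
            collation="http://www.w3.org/2013/collation/UCA?lang=el;strength=primary"/>
  <xsl:value-of select="."/>
</xsl:for-each>
```

Under XSLT 1.0 or 2.0, the same attribute would need a collation URI defined by the engine itself.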

Ready-made, XSLT-friendly collations that address the needs of those working with ancient languages are few and far between, so many projects require customized collations.

Syriac

 * Syriac Reference Portal: romanized transliteration scheme: definition and application -- intended to work with the Saxon engine.

Alternatives
 * It is possible to use the fn:translate function as a processor-independent collation method. This method maps each character to a specific Unicode code point and relies on sorting by code point to alphabetize. Here is an example of how to sort, case-insensitively, a sequence of Greek words stored in the variable $gr (line breaks have been introduced in places to improve display on the screen; when using this code, remove all line breaks within the select attribute):

<sort select="translate($gr,'ἀἁἂἃἄἅἆἇἈἉἊἋἌἍἎἏὰάᾀᾁᾂᾃᾄᾅᾆᾇᾈᾉᾊᾋᾌᾍᾎᾏᾰᾱᾲᾳᾴᾶᾷᾸᾹᾺΆᾼΆΑάαΒβϐΓγΔδἐἑἒἓἔ
ἕἘἙἚἛἜἝὲέῈΈΈΕέεϵ϶ΖζἠἡἢἣἤἥἦἧἨἩἪἫἬἭἮἯὴήᾐᾑᾒᾓᾔᾕᾖᾗᾘᾙᾚᾛᾜᾝᾞᾟῂῃῄῆῇῊΉῌͰͱΉΗήηΘθϑϴἰἱἲἳἴἵἶἷἸἹἺἻἼἽἾἿὶίῐῑ
ῒΐῖῗῘῙῚΊΊΐΙΪίιϊϳΚκϏϗϰΛλΜμΝνΞξὀὁὂὃὄὅὈὉὊὋὌὍὸόῸΌΌΟοόΠπϺϻῤῥῬΡρϱϼΣςσϲϹϽϾϿΤτὐὑὒὓὔὕὖὗὙὛὝὟὺύῠῡῢΰῦῧῨῩ
ῪΎΎΥΫΰυϋύϒϓϔΦφϕΧχΨψὠὡὢὣὤὥὦὧὨὩὪὫὬὭὮὯὼώᾠᾡᾢᾣᾤᾥᾦᾧᾨᾩᾪᾫᾬᾭᾮᾯῲῳῴῶῷῺΏῼΏΩωώϖϚϛϜϝϞϟϘϙͲͳϠϡϷϸϢϣϤϥϦϧϨϩϪϫϬϭ
Ϯϯ᾽ι᾿῀῁῍῎῏῝῞῟῭΅`´῾ʹ͵Ͷͷͺͻͼͽ;΄΅·',
'ααααααααααααααααααααααααααααααααααααααααααααααααααβββγγδδεεεεεεεεεεεεεεεεεεεεεεζζηηηηηηηηηη
ηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηηθθθθιιιιιιιιιιιιιιιιιιιιιιιιιιιιιιιιιιιικκκκκλλμμννξξο
οοοοοοοοοοοοοοοοοοοππϻϻρρρρρρρσσσσσσσσττυυυυυυυυυυυυυυυυυυυυυυυυυυυυυυυυυυφφφχχψψωωωωωωωωωωω
ωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωωω
ϛϛϝϝϟϟϙϙϠϠϡϡϸϸϣϣϥϥϧϧϩϩϫϫϭϭϯϯ')"/>

 * A simpler function, which should produce the same result, might be the following (i.e. normalize to decomposed Unicode, then strip out the combining diacritical characters):

lower-case(translate(normalize-unicode($gr,'NFD'), '&#x0300;&#x0301;&#x0308;&#x0313;&#x0314;&#x0342;&#x0345;',''))
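
The same fn:translate technique can be sketched for the Latin case, where i is conflated with j and u with v. The style sheet below is a minimal illustration and not a tested production collation. The input structure (a words element containing w children) is hypothetical. The sketch names the codepoint collation explicitly, because the default collation used by a sort is otherwise left to the processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: assumes hypothetical input such as
     <words><w>Julius</w><w>Iuno</w><w>uva</w><w>Venus</w><w>iam</w></words> -->
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>

  <xsl:template match="/words">
    <xsl:for-each select="w">
      <!-- Sort key: lower-case the word, then fold j into i and v into u.
           The codepoint collation is requested explicitly because the
           default collation of xsl:sort is implementation-defined. -->
      <xsl:sort select="translate(lower-case(.), 'jv', 'iu')"
                collation="http://www.w3.org/2005/xpath-functions/collation/codepoint"/>
      <!-- The original spelling is output; only the sort key is folded. -->
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input in the comment, the sort keys are iulius, iuno, uua, uenus and iam, so the words come out in the order iam, Julius, Iuno, Venus, uva. Because the folding applies only to the sort key, the text itself is left unchanged.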