Treebanking

From The Digital Classicist Wiki
Revision as of 17:53, 4 October 2016 by PolinaYordanova

"Treebanking" is the shorthand term for grammatically parsing digital texts of Ancient Greek, Latin and a number of other languages, and the creation of annotated morpho-syntactic trees.

Such linguistic annotation produces data that can be used for comprehensive and sophisticated quantitative research on various phenomena in ancient texts, with a high level of precision. Examples of questions that can be answered using the data created through treebanking include: which verbs are more often associated with masculine subjects than with feminine ones, and how complex the sentences of a given author or genre are relative to others.
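As a rough illustration of the first question, the sketch below counts the grammatical gender of the subjects attached to each verb lemma. The token fields (`id`, `lemma`, `pos`, `gender`, `head`, `rel`), the relation names, and the five tiny Latin sentences are all hypothetical and hand-made for this example; a real treebank would use its own file format and tagset.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-annotated toy data: one list of tokens per sentence.
# "head" is the id of the word a token depends on; 0 marks the root.
SENTENCES = [
    [  # puella cantat
        {"id": 1, "lemma": "puella", "pos": "noun", "gender": "f", "head": 2, "rel": "subject"},
        {"id": 2, "lemma": "canto", "pos": "verb", "gender": None, "head": 0, "rel": "root"},
    ],
    [  # poeta cantat
        {"id": 1, "lemma": "poeta", "pos": "noun", "gender": "m", "head": 2, "rel": "subject"},
        {"id": 2, "lemma": "canto", "pos": "verb", "gender": None, "head": 0, "rel": "root"},
    ],
    [  # miles pugnat
        {"id": 1, "lemma": "miles", "pos": "noun", "gender": "m", "head": 2, "rel": "subject"},
        {"id": 2, "lemma": "pugno", "pos": "verb", "gender": None, "head": 0, "rel": "root"},
    ],
    [  # vir pugnat
        {"id": 1, "lemma": "vir", "pos": "noun", "gender": "m", "head": 2, "rel": "subject"},
        {"id": 2, "lemma": "pugno", "pos": "verb", "gender": None, "head": 0, "rel": "root"},
    ],
    [  # regina cantat
        {"id": 1, "lemma": "regina", "pos": "noun", "gender": "f", "head": 2, "rel": "subject"},
        {"id": 2, "lemma": "canto", "pos": "verb", "gender": None, "head": 0, "rel": "root"},
    ],
]


def subject_genders_by_verb(sentences):
    """Map each verb lemma to a Counter of the genders of its subjects."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        by_id = {tok["id"]: tok for tok in sentence}
        for tok in sentence:
            if tok["rel"] != "subject" or not tok["gender"]:
                continue
            head = by_id.get(tok["head"])
            # Only count subjects whose head is actually a verb.
            if head is not None and head["pos"] == "verb":
                counts[head["lemma"]][tok["gender"]] += 1
    return counts


if __name__ == "__main__":
    for lemma, genders in sorted(subject_genders_by_verb(SENTENCES).items()):
        print(lemma, dict(genders))
```

On a real corpus the same counting logic would apply, but the numbers only become meaningful once the sample is large enough and the annotation is consistent.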

Treebanking has proved useful as a pedagogical tool in various ways. Trees that have already been created can be used in class to visualise sentences with more complex or unusual syntax, and by building trees themselves students get a form of online exercise in parsing grammar and syntax. The data created by the joint efforts of students and supervising teachers can then contribute to a corpus of morpho-syntactically annotated classical texts that can be queried in future research.

Treebanking both tests and improves students’ understanding of Greek or Latin vocabulary (identifying lemmata and parts of speech), grammar (morphological parsing of word forms), and syntax (the dependencies between words and phrases in a sentence). It can also be used to study writing style, and to cast interesting light on the parallels between translations or alternate language versions of a text. (It has been used to study Arabic translations of Greek scientific texts, for example.)


The downside is that the treebanking tools we use require an understanding of dependency grammar, which I still struggle with. Basically, rather than breaking sentences up into noun phrases, verb phrases, etc., dependency grammar links each individual word to another word in the sentence on which it depends; the main verb is the root, and the rest of the sentence hangs from it like a tree. It can be a bit of a conceptual struggle to get one's head around that, but it's still Greek and Latin grammar, so it doesn't change one's understanding of the language.
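The idea may be easier to see in a small data structure. The sketch below is only an illustration: the `Token` class, the `render` helper, and the relation labels are assumptions made up for this example and do not reproduce the format or tagset of any particular treebanking tool. Each word stores the id of the word it depends on, and the main verb depends on nothing (head `0`).

```python
from dataclasses import dataclass


@dataclass
class Token:
    id: int        # 1-based position of the word in the sentence
    form: str      # the word as it appears in the text
    head: int      # id of the word this one depends on; 0 means the root
    relation: str  # illustrative label, not from any real tagset


# "puella pulchra rosam amat" (the beautiful girl loves the rose)
SENTENCE = [
    Token(1, "puella", 4, "subject"),
    Token(2, "pulchra", 1, "attribute"),
    Token(3, "rosam", 4, "object"),
    Token(4, "amat", 0, "root"),
]


def render(tokens, head_id=0, depth=0):
    """Return the tree as indented lines, starting from the root."""
    lines = []
    for tok in tokens:
        if tok.head == head_id:
            lines.append("  " * depth + f"{tok.form} ({tok.relation})")
            # Recurse into the words that depend on this one.
            lines.extend(render(tokens, tok.id, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(render(SENTENCE)))
```

Printed out, the verb `amat` sits at the top, `puella` and `rosam` hang directly from it, and `pulchra` hangs from `puella`. There are no phrase nodes anywhere; every link runs from one word to another word.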