Treebanking: Difference between revisions

From The Digital Classicist Wiki
(Created page with ""Treebanking" is the shorthand term for grammatically parsing digital texts of Ancient Greek, Latin and a number of other languages, and the creation of annotated morpho-synta...")
 
(19 intermediate revisions by 4 users not shown)
==Description==
 
"'''Treebanking'''" is the shorthand term for the grammatical parsing of digital texts in Ancient Greek, Latin and a number of other languages, and for the creation of annotated morpho-syntactic trees.
 
===Application in research===


Such linguistic annotation makes it possible to collect information for comprehensive and sophisticated quantitative research on various phenomena in ancient texts, with a high level of precision. Examples of questions that can be answered using the data created through treebanking include: which verbs are more often associated with grammatically masculine subjects than with feminine ones, and how complex the sentences of a given author or genre are relative to others.
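
As a minimal sketch of the first kind of question, the Python snippet below counts the gender of the subjects attached to each verb. It assumes word elements carrying <code>id</code>, <code>lemma</code>, <code>postag</code>, <code>relation</code> and <code>head</code> attributes, loosely modelled on the Perseus dependency treebank XML. The three sentences, the function name <code>subject_genders_by_verb</code> and the assumed position of the gender code in the tag are illustrative only and should be checked against the guidelines of the treebank actually used.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical miniature treebank: the sentences are invented for illustration
# and the attribute layout is an assumption, not an excerpt from a real corpus.
SAMPLE = """
<treebank>
  <sentence id="1">
    <word id="1" form="puella" lemma="puella" postag="n-s---fn-" relation="SBJ" head="2"/>
    <word id="2" form="cantat" lemma="canto" postag="v3spia---" relation="PRED" head="0"/>
  </sentence>
  <sentence id="2">
    <word id="1" form="puer" lemma="puer" postag="n-s---mn-" relation="SBJ" head="2"/>
    <word id="2" form="currit" lemma="curro" postag="v3spia---" relation="PRED" head="0"/>
  </sentence>
  <sentence id="3">
    <word id="1" form="puella" lemma="puella" postag="n-s---fn-" relation="SBJ" head="2"/>
    <word id="2" form="currit" lemma="curro" postag="v3spia---" relation="PRED" head="0"/>
  </sentence>
</treebank>
"""

# Assumed zero-based position of the gender code in the 9-character postag.
GENDER_INDEX = 6


def subject_genders_by_verb(xml_text):
    """Map each verb lemma to a Counter of the gender codes of its subjects."""
    counts = defaultdict(Counter)
    root = ET.fromstring(xml_text.strip())
    for sentence in root.iter("sentence"):
        # Index the words of this sentence by id so heads can be looked up.
        words = {w.get("id"): w for w in sentence.iter("word")}
        for word in words.values():
            if word.get("relation") != "SBJ":
                continue
            head = words.get(word.get("head"))
            # Keep only subjects whose head is tagged as a verb.
            if head is None or not head.get("postag", "").startswith("v"):
                continue
            postag = word.get("postag", "")
            if len(postag) <= GENDER_INDEX:
                continue
            counts[head.get("lemma")][postag[GENDER_INDEX]] += 1
    return counts


counts = subject_genders_by_verb(SAMPLE)
for lemma in sorted(counts):
    print(lemma, dict(counts[lemma]))
```

On a real corpus the same loop would run over the downloaded treebank files rather than an inline string; the counting logic itself would not need to change as long as the attribute names match.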
===Pedagogical application===


Treebanking has proven useful as a pedagogical tool in various ways. Trees that have already been created can be used in class to visualise sentences with more complex or unusual syntax, and building the trees themselves gives students a form of online exercise in parsing grammar and syntax. The data created jointly by students and supervising teachers can then contribute to a corpus of morpho-syntactically annotated classical texts that can be queried in future research.


Treebanking both tests and improves students’ understanding of Greek or Latin vocabulary (identifying lemmata and parts of speech), grammar (morphological parsing of word forms), and syntax (the dependencies between words and phrases in a sentence). It can also be used to study writing style, and to cast light on the parallels between translations or alternate language versions of a text. (It has been used to study Arabic translations of Greek scientific texts, for example.)
 
==Treebanking platforms and databases==
 
Online platforms for treebanking include those built by the [http://www.clarino.uib.no/iness/page ''Infrastructure for the Exploration of Syntax and Semantics (INESS)''] and the [http://www.perseids.org ''Perseids Project'']. Both projects have treebanked databases that can be queried and used for research.
 
==Bibliography==
===Guidelines===
 
* Bamman, David et al. (2008). Guidelines for the Syntactic Annotation of Latin Treebanks (v. 1.3). Available: https://github.com/PerseusDL/treebank_data/blob/master/v1/latin/docs/guidelines.pdf (only pp. 3–21, 24, 26)
* Celano, Giuseppe G.A. (2014). Guidelines for the annotation of the Ancient Greek Dependency Treebank 2.0. Available: https://github.com/PerseusDL/treebank_data/edit/master/AGDT2/guidelines (only Chapter 3, including analysis of the hyperlinked examples)
 
===Studies===
 
* Celano, Giuseppe G.A. (2019). "The Dependency Treebanks for Ancient Greek and Latin." In Monica Berti (ed.), ''Digital Classical Philology: Ancient Greek and Latin in the Digital Revolution''. De Gruyter. Pp. 279–298. Available: https://doi.org/10.1515/9783110599572-016
* Gorman, Vanessa B. & Robert J. Gorman (2016). "Approaching Questions of Text Reuse in Ancient Greek Using Computational Syntactic Stylometry." ''Open Linguistics'' 2, 500–510. Available: https://doi.org/10.1515/opli-2016-0026
* Haug, Dag (2015). "Treebanks in historical linguistic research." In Carlotta Viti (ed.), ''Perspectives on Historical Syntax''. Benjamins. Pp. 188–202. Available: http://folk.uio.no/daghaug/historical-treebanks.pdf
* Mambrini, Francesco (2016). "The Ancient Greek Dependency Treebank: Linguistic Annotation in a Teaching Environment." In Bodard, G & Romanello, M (eds.), ''Digital Classics Outside the Echo-Chamber: Teaching, Knowledge Exchange & Public Engagement''. London: Ubiquity Press. Pp. 83–99. Available: http://dx.doi.org/10.5334/bat.f
* Mambrini, Francesco (2019). "Nominal vs Copular Clauses in a Diachronic Corpus of Ancient Greek Historians." ''Journal of Greek Linguistics'' 19, 90–113. Available: https://doi.org/10.1163/15699846-01901003
* Passarotti, Marco (2019). "The Project of the Index Thomisticus Treebank." In Monica Berti (ed.), ''Digital Classical Philology: Ancient Greek and Latin in the Digital Revolution''. De Gruyter. Pp. 299–320. Available: https://doi.org/10.1515/9783110599572-017
* Reggiani, Nicola (2017). "New Trends in Papyrology. Quantitative analysis of textual data: past and future of computational linguistics applied to papyrology." Chapter 7.1 in ''Digital Papyrology I: Methods, Tools and Trends''. De Gruyter. Pp. 178–189. Available: https://doi.org/10.1515/9783110547474-007
* Smith, Neel (2016). "Morphological Analysis of Historical Languages." ''Bulletin of the Institute of Classical Studies'' 59.2, 89–102. Available: https://onlinelibrary.wiley.com/doi/10.1111/j.2041-5370.2016.12040.x
* Vierros, Marja (2018). "Linguistic Annotation of the Digital Papyrological Corpus: Sematia." In Nicola Reggiani (ed.), ''Digital Papyrology II: Case Studies on the Digital Edition of Ancient Greek Papyri''. De Gruyter. Pp. 105–118. Available: https://doi.org/10.1515/9783110547450-006
 
===Presentations and tutorials (from Sunoikisis Digital Classics)===
 
* June 16, 2015: [https://www.youtube.com/watch?v=e0ZzS0ghOOI An Introduction to Treebanking] (Neven Jovanović) (YouTube)
* June 23, 2015: [https://www.youtube.com/watch?v=rbzFFb1ufac Comparing Trees: an Introduction to Treebanking Evaluation] (Giuseppe G.A. Celano)
* Feb 10, 2016: [https://github.com/SunoikisisDC/SunoikisisDC-2016/wiki/Introduction-to-Treebanking-Part-I-%28February-10%29 Introduction to Treebanking Part I] (Giuseppe G.A. Celano and Dag Haug)
* Feb 17, 2016: [http://www.youtube.com/watch?v=2z2oxu8OMqQ Introduction to Treebanking Part II] (Giuseppe G.A. Celano) (YouTube)
* Feb 9, 2017: [https://github.com/SunoikisisDC/SunoikisisDC-2016-2017/wiki/Annotating-treebanks Annotating treebanks] (Polina Yordanova and Marja Vierros)
* Feb 16, 2017: [https://github.com/SunoikisisDC/SunoikisisDC-2016-2017/wiki/Querying-treebanks-INESS-Regex Querying treebanks - INESS & Regex] (Dag Haug)
* Feb 23, 2017: [https://github.com/SunoikisisDC/SunoikisisDC-2016-2017/wiki/Querying-treebanks-XML,-XQuery,-XPath Querying treebanks - XML/XQuery/XPath] (Giuseppe G.A. Celano)
* March 1, 2018: [https://github.com/SunoikisisDC/SunoikisisDC-2017-2018/wiki/Treebanking-1:-morphosyntactic-annotation Treebanking 1] (Marja Vierros, Polina Yordanova)
* March 8, 2018: [https://github.com/SunoikisisDC/SunoikisisDC-2017-2018/wiki/Treebanking-2:-using-treebanks Treebanking 2] (Dag Haug, Francesco Mambrini)
* Feb 7, 2019: [https://github.com/SunoikisisDC/SunoikisisDC-2018-2019/wiki/ICS02:-5.-Introduction-to-Treebanking Introduction to Treebanking] (Marja Vierros & Polina Yordanova)
* Feb 21, 2019: [https://github.com/SunoikisisDC/SunoikisisDC-2018-2019/wiki/ICS02:-7.-Using-Treebanks Using treebanked corpora & Universal Dependencies] (Timo Korkiakangas & Marco Passarotti)
 
===See also===
 
* [[GLTreebank]] mailing list
* [[Ancient Greek and Latin Dependency Treebank]]
* [[PapyGreek]]
* [[Index Thomisticus Treebank]]




[[category:tools]]
[[category:pedagogy]]
[[category:linguistics]]
[[category:Syntactic analysis]]
[[category:stylistic analysis]]
[[category:language learning]]

The downside is that the treebanking tools we use require an understanding of dependency grammar (which I still struggle with). Basically, rather than breaking sentences up into noun-phrases, verb-phrases, etc., dependency grammar links each individual word to another word in the sentence on which it depends, with the whole sentence depending from the main verb, and the rest hanging like a tree from that root. It can be a bit of a conceptual struggle to get one's head around that, but it's still Greek and Latin grammar, so it doesn't change one's understanding of the language.
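
The head-pointer idea behind dependency grammar can be sketched in a few lines of Python. This is only an illustration: the sentence (''puella pulchram rosam amat'') and its analysis are invented for the example, the tuple layout and the function names <code>children_of</code>, <code>render</code> and <code>depth</code> are hypothetical, and real treebank tools store much more information per word. Each word records the id of the word it depends on, with 0 marking the root.

```python
# Each word is (id, form, head); head 0 marks the root (here the main verb).
# Invented example sentence, not taken from any published treebank.
SENTENCE = [
    (1, "puella", 4),    # subject, depends on the verb
    (2, "pulchram", 3),  # adjective, depends on the noun it modifies
    (3, "rosam", 4),     # object, depends on the verb
    (4, "amat", 0),      # main verb, root of the tree
]


def children_of(sentence):
    """Invert the head pointers: map each head id to the ids of its dependents."""
    children = {}
    for word_id, _form, head in sentence:
        children.setdefault(head, []).append(word_id)
    return children


def render(sentence):
    """Return the tree as indented text, one word per line, root first."""
    forms = {word_id: form for word_id, form, _head in sentence}
    children = children_of(sentence)
    lines = []

    def walk(word_id, level):
        lines.append("  " * level + forms[word_id])
        for child in children.get(word_id, []):
            walk(child, level + 1)

    for root in children.get(0, []):
        walk(root, 0)
    return "\n".join(lines)


def depth(sentence):
    """Number of levels in the tree, a crude proxy for syntactic complexity."""
    children = children_of(sentence)

    def below(word_id):
        return 1 + max((below(c) for c in children.get(word_id, [])), default=0)

    return max((below(r) for r in children.get(0, [])), default=0)


print(render(SENTENCE))
print("depth:", depth(SENTENCE))
```

The indented output shows the verb at the top with its dependents hanging beneath it, which is the "tree" the name treebanking refers to.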

Revision as of 16:07, 1 October 2019
