Citation in digital scholarship: Difference between revisions

From The Digital Classicist Wiki
* Example: <a class="citation" title="Herodotus, Histories, 1.78" href="http://www.perseus.tufts.edu/hopper/text?doc=Perseus:text:1999.01.0125:book%3D1:chapter%3D78">Herodotus (1.78)</a>
* Example with added RDFa: <a class="citation" typeof="dc:Location" rel="skos:definition" href="http://pleiades.stoa.org/places/599612">Ephesus</a>



Revision as of 22:32, 30 September 2010

This page suggests best practices for making citations in digital scholarship and documents a set of conventions intended to promote greater interoperability. It also points to tools for identifying, processing, and presenting citations in server-side and client-side environments, and it highlights resources that are creating stable URLs relevant to digital scholarship, with a focus on humanities disciplines.

The term 'citation' is meant very generally as the encoding of a reference to an external entity in support of, as illustration of, or otherwise in relationship to a work of digital scholarship. Scholars cite primary texts, contemporary scholarship, museum objects, people, places, and a wide range of other entities and categories of information. A set of robust, straightforward conventions that allow for local adaptation and extension will enable increased creation and recognition of links between scholarly works.


Requirements

This effort takes as its starting point that the conventions described for citations should:

  1. Be automatically parsable. Automatic agents should be able to recognize that a citation is being made, and to identify what is being cited.
  2. Encourage reuse of existing naming schemes. A consistently applied convention should allow distinct and independent citations of the same entity to be recognized by third parties. For all the examples below, but particularly for sites creating stable IDs (e.g. Pleiades), a concern is for a generic, interoperable, author-friendly convention for referring to those resources in ways that the sites themselves will recognize. "If you make a reference to Pleiades, how does Pleiades know that you've done so?"
  3. Support user interaction. Client-side operations, such as "show me a map of all geographic entities in a document" can be facilitated by a robust citation convention.
  4. Recognize that various standards already exist and not take unnecessary steps to interfere with the deployment of those standards.

The above list is based upon previous scholarship in the field of digital citation ([1], [2]).
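As an illustration of requirements 2 and 3, the following Python sketch shows how a site or a client-side tool might find the citations in a document that point at one host, for example to answer "which Pleiades places does this page cite?". It uses only the standard library. The class and function names are hypothetical, and the sample fragment is made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class CitationLinkCollector(HTMLParser):
    """Collect href values of elements whose class list contains 'citation'."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # A valueless attribute is reported as None, hence the "or ''".
        classes = (attributes.get("class") or "").split()
        if "citation" in classes and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def citations_to_host(html_text, host):
    """Return the cited URLs in html_text whose host name equals `host`."""
    collector = CitationLinkCollector()
    collector.feed(html_text)
    collector.close()
    return [h for h in collector.hrefs if urlparse(h).hostname == host]


# Made-up fragment: two citations and one ordinary navigation link.
sample = (
    '<p><a class="citation" href="http://pleiades.stoa.org/places/599612">Ephesus</a> '
    'and <a class="citation" href="http://numismatics.org/collection/1995.51.68">'
    'ANS 1995.51.68</a>; see the <a href="http://example.org/">home page</a>.</p>'
)
print(citations_to_host(sample, "pleiades.stoa.org"))
```

The navigation link is ignored because it carries no 'citation' class, and the coin is ignored because it lives on another host.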

Recommended Convention (@*="citation" paired with URI)

For xml-based documents, the following conventions are recommended.

  • Citations should be human readable.
  • Citations must be indicated by a containing element that has an attribute whose value is 'citation'. In (x)html that attribute is the 'class' global attribute. The only role of this attribute is to identify a span of text that is amenable to automatic processing to identify, describe or make actionable to a user the cited resource.
  • Citations should be qualified by an attribute giving an unabbreviated plain text version of the citation. This is unnecessary when the citation itself is not abbreviated.
  • The language of the citation should be indicated if it is distinct from the language of the host document.
  • Citations should link to stable online resources that make available, are surrogates for, or otherwise define the cited entity.
  • Citations may use an existing standard to indicate the nature of the entity being cited and to describe the relationship of the online resource being linked to the underlying concept that online resource describes.
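The conventions above can be sketched as a small Python helper that emits a conforming (x)html anchor. This is an illustrative sketch, not part of the convention: the function name `citation_anchor` and its parameters are hypothetical. It omits 'title' when the normalized form equals the visible text and adds 'lang' only when one is given.

```python
from html import escape


def citation_anchor(text, href, title=None, lang=None):
    """Build an <a> element that follows the class="citation" convention."""
    attrs = [("class", "citation")]
    if title and title != text:
        # The unabbreviated form is only needed when it differs from the text.
        attrs.append(("title", title))
    if lang:
        attrs.append(("lang", lang))
    attrs.append(("href", href))
    rendered = " ".join(f'{name}="{escape(value, quote=True)}"' for name, value in attrs)
    return f"<a {rendered}>{escape(text)}</a>"


print(citation_anchor(
    "TM23",
    "http://www.trismegistos.org/tm/detail.php?tm=23",
    title="Trismegistos Number 23",
))
```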

Examples

Simple Examples

Other examples with more markup

Here the citation is to a 'http://purl.org/dc/terms/Location' that is defined at the URL 'http://pleiades.stoa.org/places/599612' .
The @typeof will produce an RDFa triple indicating the resource at the URI is a Dublin Core Text.
The reference is to a text whose definition is at 'http://www.worldcat.org...'. Dublin Core makes no distinction between a "primary source" text and a "secondary" text; other ontologies do.

Use Cases

The '@*="citation"' pattern can be used in the following circumstances.

  • To distinguish those links which contribute to the intellectual argument of a document from those that implement a user-interface or indicate the immediate publishing environment of a document. For example, a link to the homepage of a website hosting a document should not be marked with 'citation'.
  • By authors of (x)html documents that are the archival version of digital scholarship.
  • As a presentation target for documents that are stored in a non-html format but presented as such on the web.

The Process of Digital Citation in Prose Works

Preliminary Notes

  • An xml environment, with examples implemented in (x)html and tei, is assumed.
  • While this page does assert categories, those are also up for discussion. What is the theoretical and practical difference between a "primary source" and "secondary scholarship"? It is reasonable to cite the 9th century scholar Photius as both.

1. Plain-text citations

Sample text: Herodotus (1.78) describes Babylon as the strongest and most famous city in Assyria. It is likely that this city was subsequently the mint from which Alexander issued a series of coins depicting eastern warriors on the obverse and an elephant on the reverse (e.g. ANS 1995.51.68). See discussion by Martin Price (1991).

Is it possible to establish a robust convention that allows unambiguous machine-recognizable linking to the cited text, to Alexander, to Babylon, to a description of the coin in the collection of the American Numismatic Society and to the article "Circulation at Babylon in 323 B.C."?

2. Indicating the Presence of a Citation (@*="citation")

HTML: <span class="citation">Herodotus (1.78)</span>

TEI: [example missing: the TEI <ref> markup was rendered as a wiki cite error]

In both these usages, the XPath selector "//*[@*='citation']" selects every citation in a text, whichever attribute carries the value. That makes the convention robust across markup vocabularies.
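A minimal Python sketch of the same selection follows. Python's standard xml.etree.ElementTree supports only a subset of XPath, without the '@*' wildcard, so the sketch filters attribute values directly instead. The `<ref type="citation">` element is a hypothetical TEI-style stand-in; the attribute name 'type' is an assumption made for the example.

```python
import xml.etree.ElementTree as ET


def find_citations(root):
    """Return elements with at least one attribute whose value is exactly 'citation'."""
    return [el for el in root.iter() if "citation" in el.attrib.values()]


doc = ET.fromstring(
    '<body>'
    '<span class="citation">Herodotus (1.78)</span> describes Babylon. '
    '<ref type="citation">Hdt. 1.78</ref>'  # hypothetical TEI-style element
    '<a href="/">Home</a>'
    '</body>'
)
for el in find_citations(doc):
    print(el.tag, el.text)
```

Like the XPath expression, this is an exact match on the whole attribute value.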

3. Normalizing the plain text citation

HTML: <span class="citation" lang="en" title="Herodotus Histories 1.78">Herodotus (1.78)</span>

TEI: [example missing: the TEI <ref> markup was rendered as a wiki cite error]

Normalization will assist tools that can automatically recognize plain text citations.

If the value of the 'title' attribute would be identical to the text representation of the element it is attached to, it can be left out.

Note: in the HTML5 spec, an element without @title takes the advisory title of its nearest ancestor that has @title. That inherited value should not be treated as the normalized form of a citation.
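The rule for reading the normalized form can be sketched in Python as follows: use the citation element's own 'title' when it has one, otherwise fall back to its text content, and never consult an ancestor's 'title'. The function name `normalized_citation` and the wrapper markup are hypothetical.

```python
import xml.etree.ElementTree as ET


def normalized_citation(element):
    """Return the element's own 'title' if present, else its text content."""
    title = element.get("title")
    if title:
        return title
    # Join all descendant text and collapse runs of whitespace.
    return " ".join("".join(element.itertext()).split())


wrapper = ET.fromstring(
    '<div title="Sample paragraph">'
    '<span class="citation" title="Herodotus Histories 1.78">Herodotus (1.78)</span>'
    '<span class="citation">Homer, Iliad 2.345</span>'
    '</div>'
)
for span in wrapper:
    print(normalized_citation(span))
```

The second citation has no 'title' of its own, so its text is used and the ancestor's "Sample paragraph" is ignored.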

4. Be explicit about language

Both "Herodotus Histories 1.78" and "Hdt. 1.78" can be considered English representations of the citation of that text. The German equivalent of the first is "Herodot Historien 1.78"; the Latin, still with Arabic numerals, is "Herodotus Historiae 1.78". If the language of the citation is the same as that of its prose context, it is not necessary to mark up the citation further. It is common practice in some disciplines to cite the title of a work in its original language or in a widely accepted academic language, such as Latin titles for Greek works in Classics.

HTML: Herodotus (<a class="citation" title="Herodotus Historiae 1.78" lang="la">Historiae 1.78</a>) describes...

The 'lang' attribute specifies the language of the element to which it is attached, which in HTML covers both its text content and its text-valued attributes such as 'title'. The citation text and its 'title' must therefore be in the same language.

5. Choosing a URL

Ideally, citations in digital scholarship are paired with a link to an online resource available at a persistent URI that has clear semantics. Such URIs do not always exist, which is one reason to put a plain-text reference in the 'title' attribute.

HTML: <a class="citation" title="Hdt. 1.78" href="http://www.perseus.tufts.edu/hopper/text?doc=Perseus:text:1999.01.0125:book%3D1:chapter%3D78">Herodotus (1.78)</a>

TEI: [example missing: the TEI <ref> markup was rendered as a wiki cite error]

HTML: <a class="citation" href="http://atlantides.org/batlas/babylon-91-f5">Babylon</a>
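To see what "clear semantics" can look like in practice, the Python sketch below decodes the Perseus URL used above with the standard urllib.parse module. Treating the 'doc' parameter as colon-separated segments is an assumption based on this one URL, not a documented Perseus interface, and the function name is hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def perseus_doc_segments(url):
    """Split the percent-decoded 'doc' query parameter of a URL on colons."""
    query = parse_qs(urlparse(url).query)
    return query["doc"][0].split(":")


url = (
    "http://www.perseus.tufts.edu/hopper/text"
    "?doc=Perseus:text:1999.01.0125:book%3D1:chapter%3D78"
)
print(perseus_doc_segments(url))
```

The last two segments decode to "book=1" and "chapter=78", which correspond to the plain-text citation "1.78".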

More complete markup

Extrapolating from the truncated steps above gives the following markup for the sample text:

HTML: <span><a class="citation" title="Herodotus Histories 1.78" href="http://www.perseus.tufts.edu/hopper/text?doc=Perseus:text:1999.01.0125:book%3D1:chapter%3D78">Herodotus (1.78)</a> describes <a class="citation" href="http://atlantides.org/batlas/babylon-91-f5">Babylon</a> as the strongest and most famous city in Assyria. It is likely that this city was subsequently the mint from which <a class="citation" title="Alexander III of Macedon" href="http://en.wikipedia.org/wiki/Alexander_the_Great">Alexander</a> issued a series of coins depicting eastern warriors on the obverse and an elephant on the reverse (e.g. <a class="citation" href="http://numismatics.org/collection/1995.51.68">ANS 1995.51.68</a>). See discussion by Martin Price (<a class="citation" title="Martin Price. 'Circulation at Babylon in 323 B.C.' in Mnemata : papers in memory of Nancy M. Waggoner" href="http://www.worldcat.org/title/mnemata-papers-in-memory-of-nancy-m-waggoner/oclc/24342025">1991</a>).</span>

Note: the reference to the M. Price article is insufficient. The link points to the WorldCat record for the volume that contains it, not to the article itself.
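A minimal Python sketch of what a tool could recover from markup of this kind follows. It parses an excerpt of the sample (the first two citations, wrapped in a closed span so that it is well-formed XML) and lists each citation's display text, normalized text, and URL. The function name `list_citations` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Excerpt of the sample markup: the first sentence only.
sample = (
    '<span>'
    '<a class="citation" title="Herodotus Histories 1.78" '
    'href="http://www.perseus.tufts.edu/hopper/text?doc=Perseus:text:1999.01.0125:book%3D1:chapter%3D78">'
    'Herodotus (1.78)</a> describes '
    '<a class="citation" href="http://atlantides.org/batlas/babylon-91-f5">Babylon</a> '
    'as the strongest and most famous city in Assyria.'
    '</span>'
)


def list_citations(xml_text):
    """Return (display text, normalized text, URL) for each citation element."""
    rows = []
    for el in ET.fromstring(xml_text).iter():
        if "citation" in el.get("class", "").split():
            text = "".join(el.itertext())
            # Without a 'title', the display text doubles as the normalized form.
            rows.append((text, el.get("title", text), el.get("href")))
    return rows


for row in list_citations(sample):
    print(row)
```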

TEI: to come.

Adding other markup schemes to conformant citations

The 'class="citation" title="<normalized plain text citation>"' html pattern is designed so that it can be easily used with other markup schemes. The global 'class' attribute in html is a space separated list so that other, unrelated values can be present without interfering with the identification of an element as a citation. The global 'title' attribute is directly suitable for the role envisioned here so shouldn't clash with other conforming uses.
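Because 'class' is a space-separated list, a processor that wants to tolerate extra class values has to test for the 'citation' token rather than compare the whole attribute value. An exact-match test such as "@*='citation'" would not match class="citation external". The Python sketch below shows the token test; the function name and the 'external' class are hypothetical.

```python
def is_citation(class_value):
    """Return True if 'citation' is one of the space-separated class tokens."""
    return "citation" in (class_value or "").split()


print(is_citation("citation"))           # the only value
print(is_citation("citation external"))  # alongside an unrelated class
print(is_citation("citations"))          # different token, must not match
print(is_citation(None))                 # element has no class attribute
```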

Content creators may choose to add additional markup. Links to guidelines for doing so are listed here.

OpenURL/Coins/Zotero

CTS + Microformats

RDFa

Categories of resources that can be cited

Note: the page Current practice in citation has been started.

Ancient Mediterranean Primary Texts

"Classics" has well-established abbreviations for primary texts. They are neither complete nor unambiguous, but they are well established.

  • Plain text: "Hom. Il. 2.345", "Homer, Iliad 2.345"

The following examples illustrate that the same text can appear in different places.

This example does not address the presence and/or capabilities of the Canonical Text Services (CTS) protocol and URN scheme under development at the Center for Hellenic Studies.

Geographic Entities

Within the Ancient Mediterranean, the Pleiades Project is establishing short URLs as identifiers for geographic entities (but see their own discussion for details). Geonames.org is a worldwide list of identifiers.

Bibliographic Data

WorldCat offers URLs for bibliographic records, but there may be licensing issues.

What is the relationship between citing a work and citing its bibliographic record? Is that a necessary distinction?

Museum Objects

Or any cataloged object with stable id?

HTML: <a class="citation" href="http://numismatics.org/collection/1968.34.40">ANS 1968.34.40</a>.

Egyptian Papyri

The sites http://papyri.info and http://trismegistos.org (e.g. http://www.trismegistos.org/tm/detail.php?tm=23 ) are islands of stability here.

HTML: <a class="citation" title="Trismegistos Number 23" href="http://www.trismegistos.org/tm/detail.php?tm=23">TM23</a>

Notes


References and Further Reading

  • Heath 2010: Sebastian Heath, "Diversity and Reuse of Digital Resources for Ancient Mediterranean Material Culture," in G. Bodard and S. Mahony, eds., Digital Research in the Study of Classical Antiquity. Farnham, UK: Ashgate, 2010, pp. 35-52. http://hdl.handle.net/2451/29797
  • Romanello 2007: Matteo Romanello, "A semantic linking system for canonical references to electronic corpora," in P. Zemanek, ed., International Conference on Electronic Corpora of Ancient Languages: proceedings of the international conference, Prague, November 16-17, 2007. Prague, 2007, pp. 107-120. http://eprints.rclis.org/16239/1/Romanello2008.pdf
  • Romanello 2008: Matteo Romanello, "A Semantic Linking Framework to Provide Critical Value-Added Services for E-Journals on Classics," in ELPUB 2008: Open Scholarship: Authority, Community, and Sustainability in the Age of Web 2.0 - Proceedings of the 12th International Conference on Electronic Publishing. http://elpub.scix.net/data/works/att/401_elpub2008.content.pdf
  1. Romanello 2008
  2. Smith 2009