Text Encoding (implications of using XML vs database)

From The Digital Classicist Wiki

Usually, the encoding of texts using XML is a practice orthogonal to the use of a database to store information about texts. In other words, one might well use both.


The term "database" typically refers to relational database management systems (RDBMS), in which data is stored in tables made up of rows and of named, typed columns (or fields). Tables can be linked through a piece or group of data (usually an id number) that they share. Databases are useful because they make certain types of information retrieval very easy. It is possible to store complete texts in a database, either by splitting them into pieces stored as rows in the database tables or by loading the whole text into a single field. The latter practice tends to negate some of the advantages of using a database, because the database can then no longer address the parts of the text individually.
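As a rough illustration of the relational model described here, the following sketch uses Python's built-in sqlite3 module. The table names (works, lines), the column names, and the helper find_lines are assumptions made up for this example and are not taken from any particular project. It shows two tables linked by a shared id number, with a text fragmented into one row per line.

```python
import sqlite3

# In-memory database; the schema below is purely illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE works (
    id     INTEGER PRIMARY KEY,
    author TEXT NOT NULL,
    title  TEXT NOT NULL
);
CREATE TABLE lines (
    work_id INTEGER NOT NULL REFERENCES works(id),  -- shared id links the tables
    line_no INTEGER NOT NULL,
    content TEXT NOT NULL
);
""")

conn.execute("INSERT INTO works VALUES (1, 'Vergil', 'Aeneid')")
# The text is fragmented: one row per verse line.
conn.executemany(
    "INSERT INTO lines VALUES (?, ?, ?)",
    [
        (1, 1, "Arma virumque cano, Troiae qui primus ab oris"),
        (1, 2, "Italiam fato profugus Laviniaque venit"),
    ],
)


def find_lines(conn, word):
    """Return (author, title, line_no) for every line containing `word`."""
    rows = conn.execute(
        """
        SELECT w.author, w.title, l.line_no
        FROM lines AS l
        JOIN works AS w ON w.id = l.work_id
        WHERE l.content LIKE ?
        ORDER BY l.line_no
        """,
        (f"%{word}%",),
    )
    return rows.fetchall()


matches = find_lines(conn, "cano")
print(matches)
```

Because each line is its own row, the query can report exactly where a word occurs. Had the whole text been loaded into a single field, the same query could only say that the work contains the word somewhere.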


XML, on the other hand, provides a text-based syntax for encoding the structure and semantics of the text. Once this has been done, other tools (including databases) can be used to leverage that encoding and perform various tasks on the texts, such as information retrieval and display in various formats.
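To make this concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The sample markup is an assumption for illustration: the element names l and placeName are borrowed loosely from TEI conventions, but the fragment is simplified, carries no namespace, and is not a valid TEI document. The same encoded text is used for two different tasks, retrieval of tagged place names and display as plain text.

```python
import xml.etree.ElementTree as ET

# Simplified, TEI-like sample; not a complete or valid TEI document.
SAMPLE = """<text>
  <l n="1">Arma virumque cano, <placeName>Troiae</placeName> qui primus ab oris</l>
  <l n="2"><placeName>Italiam</placeName> fato profugus Laviniaque venit</l>
</text>"""

root = ET.fromstring(SAMPLE)

# Information retrieval: every tagged place name with its line number.
places = [
    (line.get("n"), place.text)
    for line in root.iter("l")
    for place in line.iter("placeName")
]

# Display: the same lines as plain text, with the markup stripped
# and whitespace normalised.
plain = [" ".join("".join(line.itertext()).split()) for line in root.iter("l")]

print(places)
print(plain)
```

The encoding itself says nothing about how it will be used. A database, an indexer, or a stylesheet could consume the same file.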
