Difference between revisions of "Dataset Integration Hack"

From The Digital Classicist Wiki

Revision as of 18:20, 11 November 2010

The problem

How to integrate several distributed but Open Access and Open Licensed datasets so that they can be served via a metadata portal from a single web service.

The datasets: Open Access Classical Data

Platform

OAI-PMH server and DC metadata. (JN, MR, JMV: more info please?)

JOAI (http://www.dlese.org/dds/services/joai_software.jsp) is a Java implementation of an OAI-PMH data provider and harvester that might be used for a first proof-of-concept implementation.

Metadata

Extraction

Metadata will be extracted on a case-by-case basis from the source data, with additional global parameters provided from local knowledge as required. Ideally, and eventually, individual datasets would provide their own OAI service to expose this metadata. (We may try to illustrate this with IAph and IRT at some point.)
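As a rough illustration of combining extracted values with global parameters, the sketch below merges per-record fields over dataset-wide defaults. The function name build_metadata, the GLOBAL_DEFAULTS dictionary and all of its values are hypothetical placeholders, not part of any agreed workflow.

```python
# Hypothetical dataset-wide parameters supplied from local knowledge.
# All values here are placeholders for illustration only.
GLOBAL_DEFAULTS = {
    "publisher": "Example Project",
    "language": "en",
    "rights": "CC BY",
}


def build_metadata(extracted, defaults=GLOBAL_DEFAULTS):
    """Merge per-record extracted values over dataset-wide defaults.

    Extracted values that are missing, None, or empty strings fall back
    to the corresponding default (if one exists).
    """
    record = dict(defaults)  # copy so the defaults are never mutated
    for key, value in extracted.items():
        if value not in (None, ""):
            record[key] = value
    return record


if __name__ == "__main__":
    # "language" is empty in the source data, so the default is used.
    print(build_metadata({"title": "Inscription 1", "language": ""}))
```

Writing the per-dataset extraction as a script that feeds a function like this would keep the process reproducible, since the local knowledge lives in one explicit place.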

Harvesting

Each dataset will effectively become a data provider by exposing its extracted metadata in accordance with OAI-PMH.
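For orientation, a harvester talks to a data provider with plain HTTP requests carrying a verb and its arguments. The sketch below only builds such request URLs; the base URL http://example.org/oai and the helper name oai_request_url are hypothetical, and no real endpoint for these datasets is assumed.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH GET request URL for the given verb and arguments."""
    if verb not in OAI_VERBS:
        raise ValueError("unknown OAI-PMH verb: %s" % verb)
    query = {"verb": verb}
    query.update(arguments)
    return base_url + "?" + urlencode(query)


if __name__ == "__main__":
    # Hypothetical endpoint; asks for all records in unqualified Dublin Core.
    print(oai_request_url("http://example.org/oai", "ListRecords",
                          metadataPrefix="oai_dc"))
```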

Schema

Dublin Core (oai_dc) records served over OAI-PMH

Tag             How do we generate it?
dc:title        title of resource
dc:creator      harvest (or known?)
dc:subject      ??
dc:description  free prose, if any
dc:publisher    harvest
dc:contributor  harvest, if given
dc:date         harvest
dc:type         photograph | commentary | database | linked data | other
dc:format       file types?
dc:identifier   URI and/or URL?
dc:source       ??
dc:language     modern language
dc:relation     ??
dc:coverage     ??
dc:rights       license (in spreadsheet)
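To make the target format concrete, here is a minimal sketch that serialises one record's fields as an oai_dc XML fragment using the Python standard library. The function name to_oai_dc and the sample values are hypothetical; the two namespace URIs are the standard oai_dc and Dublin Core element namespaces. Fields with no value are simply omitted, since every element in oai_dc is optional and repeatable.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)

# The fifteen unqualified Dublin Core elements, as listed in the table.
DC_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]


def to_oai_dc(record):
    """Serialise a dict of DC fields as an <oai_dc:dc> XML string.

    Each value may be a single string or a list of strings (repeated
    elements). Keys that are not DC element names are ignored.
    """
    root = ET.Element("{%s}dc" % OAI_DC_NS)
    for name in DC_ELEMENTS:
        values = record.get(name, [])
        if isinstance(values, str):
            values = [values]
        for value in values:
            child = ET.SubElement(root, "{%s}%s" % (DC_NS, name))
            child.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical sample record, for illustration only.
    print(to_oai_dc({
        "title": "Sample inscription",
        "type": "photograph",
        "language": ["en", "la"],
    }))
```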

What's next?

  • Set up an OAI-PMH server.
  • Create sample metadata for each dataset (ideally by writing scripts, so that the process is reproducible).
  • Discuss the viability of CKAN for our purposes.
  • Schedule the next meeting.