OR09: Repository workflows: ICE-TheOREM, semantic infra for theses

Open Repositories 2009, day 2, session 6b.

Summary: great concept, convincing demonstration. Excellent stuff.

Part of the ICE project, a JISC-funded experiment with ORE.
Importance of ORE: “ORE is a really important protocol – it has been missing for the web for most of its life so far.” (DH: Amen!)

Motivations for TheOREM: check ORE. Is it applicable and useful? What are the different ways of using it? How do SWORD and ORE combine?
Practically: improving thesis visibility, with embargoes as an enabler.

Interesting: in the overall repository system, the management of embargoes is separated from the repository by design. A separate system serves resource maps for the unembargoed items, and the IR polls these regularly. This reflects the real-world political issues, and makes it easier to bring in quite radical changes.
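
(note DH: a rough sketch of what one polling pass on the IR side could look like. This is my own illustration, not the project's code. The `list_released` callable and the example URIs are hypothetical stand-ins for whatever the embargo system actually exposes.)

```python
def poll_embargo_service(list_released, already_ingested):
    """Run one polling pass and return resource map URIs new to the IR.

    list_released    -- hypothetical callable returning the URIs of resource
                        maps for items that are currently unembargoed
    already_ingested -- set of URIs the IR has picked up on earlier passes
                        (updated in place)
    """
    released = set(list_released())
    new_uris = sorted(released - already_ingested)
    already_ingested.update(new_uris)
    return new_uris


# Usage: the IR keeps its own record and only asks what has been released.
known = set()
first_pass = poll_embargo_service(lambda: ["urn:example:thesis-1"], known)
```

The point of the design is visible here: the IR never sees embargo dates or policy, only the list of what it is allowed to harvest.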

Demonstrator (with the Fascinator) with one thesis, with a reference to a data object: a molecule description in Chemical Markup Language (actual data).
Simple authoring environment in OpenOffice Writer (Word is also supported), with a stylesheet + convention based approach. When uploaded, the doc is taken apart into atomic XML objects in Fedora. The chemical data is a separate object with a relation to the doc, versioning etc.

Embargo metadata is written as text in the doc (on the title page; the date is noted using a convention, KISS approach), and a style (p-meta-date-embargo) is applied. The thesis is ingested again, and voilà, the part of the thesis under embargo is now hidden.
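
(note DH: to make the convention idea concrete, a minimal sketch of reading the embargo date back out of styled paragraphs. Only the style name p-meta-date-embargo comes from the talk. The (style, text) pair representation and the YYYY-MM-DD date format are my assumptions, since the actual date convention was not spelled out.)

```python
import re
from datetime import date

EMBARGO_STYLE = "p-meta-date-embargo"  # style name mentioned in the talk
# Assumed date convention (YYYY-MM-DD); the project's real convention may differ.
ISO_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def find_embargo_date(paragraphs):
    """Return the first date found in an embargo-styled paragraph, or None.

    paragraphs -- iterable of (style_name, text) pairs, a hypothetical
                  flattened view of the uploaded document
    """
    for style, text in paragraphs:
        if style != EMBARGO_STYLE:
            continue
        match = ISO_DATE.search(text)
        if match:
            year, month, day = map(int, match.groups())
            return date(year, month, day)
    return None


def is_embargoed(paragraphs, today):
    """True while today is before the embargo date written in the document."""
    embargo = find_embargo_date(paragraphs)
    return embargo is not None and today < embargo


title_page = [
    ("p-title", "A thesis about molecules"),
    ("p-meta-date-embargo", "Embargoed until 2010-01-01"),
]
hidden = is_embargoed(title_page, date(2009, 5, 19))
```

The author only types a line of text and picks a style; everything else is derived at ingest.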

This simple system also allows dialogue between student and tutor (remarks on the text) to be embedded in the document itself, hidden from the outside by default. It looks deceptively like Word's own comments, which I imagine will ease the uptake.

Sidenote: policy in this project is that only the submitter can ever change embargo data. So it is recommended to use OpenID rather than institutional logins, as PhD graduates tend to move on, and then nobody can change it anymore.

Q (from Les Carr): supervisors won’t like to have their interaction with students complicated by tech. What is their benefit?
A: automatic backup is a big benefit, as is the workflow (i.e. the comments in the document text). We *know* students appreciate it. Supervisors may not like it, but everyone else will, and then they'll have to.

(note DH: this is of course in the sciences; it will be an interesting challenge to get the humanities to adhere to stylesheet and microformatting conventions)

Q: can this workflow also generate the ‘authentic and blessed copy’ of the final thesis?
A: Not in project scope, we still produce the PDF for that. In theory this might be a more authentic copy, but they might scream at the sight of this tech.

