Thursday, March 06, 2014

IDCC14 notes, day 2: keynote Atul Butte

Part 2 in a series of notes on IDCC 2014, the 9th International Digital Curation Conference, held in San Francisco, 24-27 February.


Day two kicked off with a fantastic keynote by Atul Butte, Associate Professor in Medicine and Pediatrics, Stanford University School of Medicine: Translating a trillion points of data into therapies, diagnostics and new insights into disease [PDF] [Video on Youtube]. This one was well worth a separate blogpost.

Butte starts his presentation with some great examples of how the availability of a wealth of open data has already radically changed bio/medical research. Over one million datasets are now openly available in the GeneChip standardized format. A search for breast cancer samples in the NCBI GEO datasets database gives 40k results, more than the best lab will ever have in its stores. And PubChem has more samples than all pharma companies combined, completely open.

The availability of this data is leading to new developments. Butte cites a recent study that, by combining datasets, revealed 'overfitting': everybody does an experiment in exactly the same way, leading to reproducible results that are irrelevant to the real world.

But this is tame compared to the change in the whole science ecosystem with the advent of online marketplaces. Butte goes on to show a number of thriving ecommerce sites - “add to shopping cart!” - where samples can be bought for competitive prices. Conversant Bio is a marketplace for discarded samples from hospitals, with identifiers stripped off. Hospitals have limited freezer space, have biopsy samples that can be sold, and presto. What about the ethics? "Ethics is a regional thing. They can get away with a lot of stuff in Boston we can't do in Palo Alto." Now any lab can buy research samples for a low price and develop new blood marker tests. This is how a test was recently developed for preeclampsia, the disease now best known from Downton Abbey.

Marketplaces have also sprung up for services, such as AssayDepot.com. This is a clearinghouse for medical research services, including animal tests. Thousands of companies provide these worldwide. Butte stresses that it's not just a race to the bottom and to China, but that this also creates opportunities for new specialised research niches, such as a lab specializing in mouse colonoscopies. This makes it possible to do real double-blind tests by just buying more tests from different vendors (with different certifications, just to spread the risk). That makes it especially interesting to investigate other effects of tested and approved drugs. Which is a good thing, because the old way of research on new drugs is not sustainable when patents run out (the “pharma patent cliff of 2018”).

This new science ecosystem is built on top of the availability of open data sets, but there are questions to be solved for the sustainability. Butte sees two players here, funders and repositories themselves.
Incentives for sharing are lacking. Altmetrics are just beginning, and funders need to kick in. Secondary use grants are an interesting new development. Clinical trials may be the next big thing: the most expensive experiments in the world, costing $200 mln each. 50% fail and not even a paper is written about them... Butte expects funders to start requiring publications on negative trials and publishing of the raw data.
The international repositories are at the moment mostly government funded, and this may run out. Butte thinks that mirroring and distributing is the future. He also stresses that repositories need to bring the cost down - outsourcing! - and showcase real use cases that will inspire people. The repositories that will win are the ones that yield the best research.

Sunday, March 02, 2014

IDCC14 notes, day 1: 4c project workshop

Part 1 in a series of notes on IDCC 2014, the 9th International Digital Curation Conference, held in San Francisco, 24-27 February.

In stark contrast with the 'European' 2013 edition, held last year in my hometown Amsterdam, at this IDCC over 80% of the attendees were from the US. That’s what you get with a west coast location, and unfortunately it was not made up for by more delegates from Asia and down under. However, as the conference progressed it became clear that despite the vast differences in organisation and culture, we’re all running into the same problems.

IDCC 2014 Day 1: pre-conference workshop on 4C project

4cproject.eu is an EU-financed project to make an inventory of the costs of curation (archiving, stewardship, digital permanence etc.). With a two-year project span it’s relatively short. The main results will be a series of reports, a framework for analysis (based on risk management) and the ‘curation cost exchange’, a website where (anonymized) budgets can be compared.
The project held a one-day pre-conference workshop “4C - are we on the right track?” at which a roadmap and some intermediate results were presented, mixed with more interactive sessions for feedback from the participants. It didn’t always work (the schedule was tight) but still it was a day full of interesting discussions.
Neil Grindley noted that since the start of the project the goal has shifted from “just helping people calculate the cost” to a wider context: beyond the actual cost (model) of curation, also the context, risk management, benefits and ROI. ROI is especially important for influencing decision makers, given the limited resources.

D3.1 - Evaluation of cost models and needs/gaps analysis (draft)

Cost models

Cost models are difficult to compare and hard to do. Top topics of interest: risks/trustworthiness, sustainability and data protection issues. Some organizations are unable or unwilling to share budgets. Special praise was given to the Dutch Royal Library (KB) for being a very open organisation in disclosing its business costs.
The exponential drop of storage costs has stopped. The rate has fallen from 30-40% to at most 12%. It is impossible to calculate costs for indefinite storage. This led to a remark from the audience: "we're just as guilty as the researchers really, our horizon is the finish of the project.” We have to use different time scales - you have to have some short-term benefits, but also keep the long term in scope.
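A back-of-the-envelope sketch of why that slowdown matters (my own arithmetic, not from the workshop; it assumes the price decline holds forever and ignores media refresh, power, staff and discounting):

    def storage_endowment(first_year_cost, annual_price_decline):
        # Total cost of keeping a fixed volume of data "forever", assuming
        # storage prices keep falling by the same fraction r every year:
        # c0 + c0*(1-r) + c0*(1-r)^2 + ... = c0 / r   (for 0 < r < 1)
        r = annual_price_decline
        if not 0 < r < 1:
            raise ValueError("only converges for a decline rate between 0 and 1")
        return first_year_cost / r

    print(storage_endowment(1000, 0.35))  # ~2857: the old 30-40% decline
    print(storage_endowment(1000, 0.12))  # ~8333: the new ~12% decline

Roughly speaking, the slowdown from ~35% to 12% triples the endowment you would have to set aside per unit of data.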
However, costs are much more than storage. Rule of thumb: 1/2 ingest, 1/3 storage, 1/6 access. Preservation and access are not necessarily linked. An example is the LOC Twitter archive, which they keep on tape. Once (if) the legal issues currently prohibiting opening this archive are resolved, access might be possible via Amazon’s 'open data sets', where you pay for access by using EC2. The economics work because Amazon keeps it on non-persistent media and provides access, while the LOC keeps it on persistent media but provides no access.

Other misc notes

A detailed mockup of the cost exchange website was demoed and if all the functionality can be realized, this may be a very useful resource.

The workshop included a primer on professional risk management, based on the ISO 31000 standard. “Just read this standard, it's not very boring!” Originally from engineering, risk management is now considered mature for other fields as well.

The German Nestor project has really clear definitions of what a repository is; a useful resource, comparable to the JISC reports:
www.crl.edu/focus/article/394
www.langzeitarchivierung.de/Subsites/nestor/DE/Home/home_node.html

Open Planets Foundation - great tools.

CDL DataShare is online - a really nice, clean interface.


Wednesday, February 22, 2012

The EJME plugin: improving OJS for articles with data

The EJME project has wrapped up and delivered! To quote the press release from SURFfoundation: "Enhanced Publications now possible with Open Journal Systems - Research results published within tried-and-tested system using plug-ins". That's all great, and so is the documentation, but it's aimed at those already in the know. A little more explanation is needed.


Who is EJME for?
Any journal that uses OJS for publishing and that wants to make it possible to have data files attached to articles (and as of December 2011, that's 11,500 journals!).

What does it do?
Three things:
  1. improves the standard OJS handling of article 'attachments': files are available to editors and peers during the review process, and the submission process has been made (a little) easier; 
  2. plays nicely with external data repositories: an attachment can be a link to a file residing elsewhere (but works just like an internal OJS attachment in the review and publishing stages), and an internal attachment that an author has submitted with the article can also be submitted to a data repository, creating a 'one-stop-shop' experience for the author;
  3. on publication, it automagically creates machine-readable descriptions of an article and its data files (in tech-speak: these are small XML files, so-called Resource Maps, in the OAI-ORE standard; a minimal sketch of one follows below). These can be harvested by aggregators such as the Dutch site Narcis, which can then do more great and wonderful things with it, for example slick visualizations.
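To give a flavour of what such a Resource Map describes, here is a minimal sketch of my own using the rdflib library. The URLs are made up and this is not the EJME plug-in's actual output; it only shows the aggregation idea: one map that says "this article and this data file belong together".

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import DCTERMS, RDF

    ORE = Namespace("http://www.openarchives.org/ore/terms/")

    # All URLs below are made up, purely for illustration
    rem = URIRef("http://journal.example.org/article/123/resourcemap")
    agg = URIRef("http://journal.example.org/article/123/aggregation")
    article = URIRef("http://journal.example.org/article/view/123")
    datafile = URIRef("http://journal.example.org/article/123/data.csv")

    g = Graph()
    g.add((rem, RDF.type, ORE.ResourceMap))
    g.add((rem, ORE.describes, agg))
    g.add((agg, RDF.type, ORE.Aggregation))
    g.add((agg, ORE.aggregates, article))   # the published article
    g.add((agg, ORE.aggregates, datafile))  # the data file attached to it
    g.add((datafile, DCTERMS["format"], Literal("text/csv")))

    print(g.serialize(format="xml"))  # the resource map as RDF/XML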

Great, but I only want some of that!
That's perfectly possible. If you only want the improved attachment handling, it's included in the latest OJS version. The other two features are separate plug-ins; install only what you need. Though I do recommend installing the resource map plug-in: it won't require any work after installation.

What does it cost?
Just like OJS itself, the plug-in is open source and free of cost. Installation is as easy as most OJS plug-ins.

What does the journal have to do?
Of course, software is only a tool. The real question is deciding what to do with it. Does the journal want a mandatory Data Access policy? Is there a data repository in the field to cooperate with? Once these questions are answered, the journal policy and editorial guidelines will need to be changed to reflect them.

Why would my journal want data along with articles?
As science becomes more and more data-oriented (and that includes the humanities), publishing data along with articles becomes essential for the peer review system to function. There have been too many examples lately of data manipulation that would have been found out by reviewers if they had checked the data. And for that, they need access to the data. Reviewers of course won't change their habits suddenly once data is available to them, but it's a necessary first step.
(There are many other reasons, both carrots and sticks, for the greater good or the benefit of journal and author, but IMHO this is the pivotal point).

Why name it EJME, such a silly name?
Enhanced Journals Made Easy was a little optimistic, I admit. Enhanced Journals Made (A Little) Easier would have been better. You live and learn!


Want to know more about EJME? Get started with the documentation.

Saturday, July 02, 2011

OR11: Misc notes

The state of Texas
I like going to conferences alone; it’s much easier to meet new people from all over the world than when you’re with a group - groups tend to cling together. With a multitracking conference like OR11, however, the downside is that there’s so much to miss, especially since I like to check out sessions from fields I’m not familiar with. At OR11, I wanted to take the pulse of DSpace and Eprints, and not just faithfully stick with the Fedora talks.

In this entry, I focus on bits and bobs I found noteworthy, rather than give a complete description. I skip over sessions that were excellent but have already been widely covered elsewhere (for instance at library jester), such as Clifford Lynch’s closing plenary.


“Sheer Curation” of Experiments: data, process and provenance, Mark Hedges 
slides [pdf]

"Sheer curation" is meant to be lightweight, with curation quietly integrated in the normal workflow. The scientific process is complex with many intermediate steps that are discarded. The deposit at the end approach misses these. Goal of this JISC project is to capture provenance experimental structure. It follows up on Scarp (2007-2009).

I really liked the pragmatic approach (I've written this sentence often - I really like pragmatism!). As the researchers tend to work on a single machine and heavily use the file system hierarchy, they wrote a program that runs as a background process on the scientists’ computer. Quite a lot of metadata can be captured from log files, headers, filenames. Notably, it also helps that much work on metadata and vocabulary has already been done in the field in the form of limited practices and standards.

Being pragmatic also means discarding nice-to-haves such as persistent identifiers. That would require the researchers to standardise beyond their own computer and that’s asking too much.

The final lesson learned sounded familiar: it took more, much more time than anticipated to find out what it is the researchers really want.


SWORD v2

SWORD2: looks promising and useful, and actually rather simple. Keeping the S (for Simple) was a design constraint. Hey, otherwise we’d end up with Word, and one is more than enough!

Version 2 will do full Create/Read/Update/Delete (CRUD), though a service can always be configured to deny certain actions. It’s modelled on Google’s GData and makes elegant use of Resource Maps and dedicated action URLs.

CottageLabs, one of the design partners, made a really nice introduction video to Sword v2, demonstrating how it works:

It looks really useful and indeed still easy (as per Einstein's famous quip, as simple as possible but not simpler). If you’re a techie, dive into SwordApp.org. If you’re not, just add Sword compliance to your project requirements!
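To make that concrete, here is roughly what a SWORD v2 deposit looks like on the wire. This is a sketch only: the collection URL and credentials are made up, and the exact header details should be checked against the spec on SwordApp.org.

    import requests

    # Made-up endpoint and credentials, for illustration only
    col_iri = "https://repository.example.org/sword2/collection/articles"

    with open("article-package.zip", "rb") as pkg:
        resp = requests.post(
            col_iri,
            data=pkg,
            auth=("depositor", "secret"),
            headers={
                "Content-Type": "application/zip",
                "Content-Disposition": "attachment; filename=article-package.zip",
                "Packaging": "http://purl.org/net/sword/package/SimpleZip",
                "In-Progress": "true",  # more may follow before the deposit is complete
            },
        )

    # The deposit receipt tells the client where to PUT/POST/DELETE later -
    # that's the Update and Delete part of the CRUD mentioned above.
    print(resp.status_code, resp.headers.get("Location"))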


Ethicshare & Humbox, two sessions on community building

Two examples of successful subject-oriented communities that feature a repository, each with some good ideas to nick.

Ethicshare is a community repository that aggregates social features for bioethics:

  • one of the project partners is a computer scientist who studies social communities. Because of this mutual interest (for the programmer it’s more than just a job) they have had the resources to fine tune the site.
  • the field has a strong professional society that they closely work with.
  • glitches at the beginning were a strong deterrent to success - so yes, release early and often, but not with crippling bugs!
  • the most popular feature is a folder for gathering links, and many people choose to make them public (it’s private by default).
  • before offering it to the whole site, new features are tried out on a small, active group of around 30 testers.
  • for the next grant phase they needed more users quickly, so they bought ads. $300 of Facebook ads yielded 500 clickthroughs; $2000 of Google ads yielded 5000. This (likely) contributed to the number of unique visitors rising from 4k to 20k per month. Tentative conclusion: these ads cost relatively little and are effective for such a specialized subject, as the targeting is really quite good (see the quick arithmetic below).
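The quick arithmetic on those ad buys (my own calculation from the numbers above):

    # $300 of Facebook ads -> 500 clickthroughs; $2000 of Google ads -> 5000
    for network, cost, clicks in [("Facebook", 300, 500), ("Google", 2000, 5000)]:
        print(f"{network}: ${cost / clicks:.2f} per clickthrough")
    # Facebook: $0.60 per clickthrough
    # Google: $0.40 per clickthrough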


Lessons from the UK based Humbox project:

  • approach: analyse what scientists were doing already in real life, in paper and file cabinets, mimic it and extend it.
  • "the repository is not about research papers, it is about the people who write them": the profile page is the heart, putting the user at the centre. Like Facebook’s, it has two distinct views: an outside version about you (to show off), and internal version for you (with your interests). This reminds me of the success of the original, pre-yahoo delicious, which also cleverly put self-interest first with the social sharing as a side-effect.
  • Find a need that's not covered by existing systems: Humbox fills a need to share stuff, not just with students - for that the LCMS is the natural place to go to - but with colleagues, since the course-centric nature of LCMS’s tends to lock colleagues out.
  • Most feedback came from community workshops. Participants often became local evangelists.
  • Comments were often corrections. 60% of the authors changed a resource after a comment - and the 40% of comments not leading to a correction also include positive ones, so the attitude towards criticism was quite positive.
  • over 50% of users modified or augmented material from others, sometimes reuploading it to the site.
  • Humbox only takes Creative Commons licenses, with an educational side-effect: some users indicated they also started looking in other places (such as flickr) for cc material as a result.


The Learning Registry: “Social Networking for Metadata”
slides [google docs]

I just want to mention this for the sheer scope and size of this initiative. It’s [expletive] ambitious.

The aim is to gather all social networking metadata! To limit the scope, they won’t do normalising, or offer search or a query API; that's all left to the users of the gathered dataset. And by 'all', they mean everything on the net: data, metadata and paradata (by which I understand they mean the relationships with other data).

Agreements are in the works with major partners (see last slide). The big elephant in the room was Facebook (no surprise, sigh) which wasn’t mentioned at all. (as I'm writing this, Google+ has just been announced, there is some hope after all of the slightly creepy evil eventually triumphing over the even more evil).

They call their approach a do-ocracy. Very agile design principles. Real-time, everything in the open: all code and specs are written directly in Google Docs (the table of contents is a Google spreadsheet). NoSQL master-master storage system, well thought-out architecture; production will run on EC2. Everything will be open, except data harvested from commercial partners.

Something to keep an eye on: www.learningregistry.org.




Finally...


MODS is the new DC. In recent projects, MODS seems to have replaced Dublin Core as the baseline standard for metadata exchange. Interesting development.

Wednesday, June 29, 2011

OR11: New in EPrints 3.3: large scale research data, and the Bazaar.

As I mentioned in the overview, I was very impressed by what's happening in the Eprints community. The new features of the upcoming 3.3 are impressive, as they seem to strike the right balance between pragmatism and innovation. Thanks to an outstandingly enthusiastic and open developer community, they're giving DSpace (and to a lesser extent Duraspace) a run for their money.

Red Bull
"Energize":
 could've been the motto of the Eprints community

Support for research data repositories

The new large-scale research data support is also a hallmark of pragmatic simplicity. EPrints avoids getting very explicit about subject data classification and control, taking a generic approach that can be extended.

Research data can come in two container datatypes, ‘Dataset’ and ‘Experiment’. A Dataset is a standalone, one-off collection of data. The metadata reflects the collection. The object can contain one or more documents, and must also have a read-me file attached: a human-oriented manifest, since machine-oriented complex metadata is possible but would deter actual use.

The other datatype is Experiment. This describes a structured process that may result in many datasets. The metadata reflects the process and supports the Open Provenance Model.

Where the standard metadata don’t suffice, one of the data streams belonging to the object can be an XML file. If I understood correctly, XPath expressions can then be used for querying and browsing. Effectively this throws off the shackles of the standard metadata definitions and creates flexibility similar to Fedora. It's very similar to what we're trying to do in the FLUOR project with a Sakai plugin that acts as a GUI for a data repository in Fedora. Combining user-friendliness with configurable, flexible metadata schemes is a tough one to pull off; I'll certainly keep an eye out on the way EPrints accomplishes this.
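To illustrate what that buys you (this is not EPrints code, just a sketch in Python with lxml, and the element names are made up):

    from lxml import etree

    # A made-up XML file that a depositor attached to a Dataset object
    xml = """<experiment>
      <sample id="s1"><temperature unit="K">293</temperature></sample>
      <sample id="s2"><temperature unit="K">310</temperature></sample>
    </experiment>"""
    doc = etree.fromstring(xml)

    # Any field the depositor put into the XML is queryable via XPath,
    # without touching the repository's own metadata schema.
    print(doc.xpath("//sample/temperature[@unit='K']/text()"))  # ['293', '310']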

The Bazaar

The EPrints Bazaar is a plug-in management system and/or an ‘App Store’ for EPrints, inspired by Wordpress. For an administrator it's fully GUI-driven, versatile and pretty fool-proof. For developers it looks pretty easy to develop for (I had no trouble following the example with my rusty coding skills).

The primary design goal was that the repository, including its API, must always stay up. They’re clever bastards: they based the plug-in structure on the Debian package mechanism, including the tests for dependencies and conflicts, which makes it very stable. Internally, they’ve run it for six months without a single interruption. Now that’s eating your own dog food!

Country road
Off the beaten track

EPrints as a CRIS

The third major new functionality of 3.3 is CERIF import & export. Primarily this is meant to link EPrints repositories automatically to CRIS systems, but for smaller institutions that need to comply with reporting in CERIF format and don’t have a system yet, using EPrints itself may suffice, as pretty much all the necessary metadata is in there. The big question is whether the import/export would allow a full lossless round trip; as I joined this session halfway through (an enthusiastic tweet prompted me to change rooms), I might've missed that.

This sounds very appealing to me. Unfortunately, the situation in the Netherlands is very different, as a CRIS has been mandatory for decades for the Dutch universities. Right now we’re in the middle of a European tender for a new, nationwide system, and the only thing I can say is that it’s not without problems. How I’d love to experiment with this instead in my institution, but alas, that won't be possible politically.

The EPrints attitude

As Les Carr couldn’t make it stateside, he presented from the UK. The way this was set up was typical of the can-do attitude of the EPrints developers: Skyping in to a laptop which was put in front of a mike, and whenever the next slide was needed Les would cheerily call out ‘next slide please!’, after which his stateside companion theatrically reached out for the spacebar of the other laptop, connected to the projector. Avoid neat technology for technology’s sake and keep it simple and effective.


Wednesday, June 22, 2011

OR11: opening plenary

See also: OR11 overview

The opening session by Jim Jagielski, President of the Apache Software Foundation, focussed on how to make an open source development project viable, whether it produces code or concepts. As El Reg reports today, doing open source is hard. The ASF has unique experience in running open projects (see also 'Is Apache open by rule?'). Much nodding in agreement all around, as what he said made good sense, but it's hard to put into practice. Some choice advice:

Communication is all-important. Despite all the new media that come and go, the mailing list still is king. Any communication that happens elsewhere - wikis, IRC, blogs, twitter, FB, etc - needs to be (re)posted to the list before it officially exists and can be considered. A mailing list is a communication channel which is asynchronous and participants can control themselves, meaning read or skip it at their time of choice, not the time mandated by the medium. A searchable archive of the list is a must.

Software development needs a meritocracy. Merit is built up over time. It’s important that merit never expires, as many open source committers are volunteers who need to be able to take time off when life gets in the way (babies, job change, etc).

You need at least three active committers. Why three? So they can take a vote without getting stuck. You also need ‘enough eyeballs’ to go over a patch or proposal. A vote at the ASF needs at least three positive votes and no negatives.
To create a community, you also need a ‘shepherd’, someone who is knowledgeable yet approachable by newbies. It’s vital to keep a community open, so as not to let the talent pool become too small. To stay attractive, you need to find out what ‘itch’ your audience wants to scratch.

The more 'idealistic' software licenses (GPL and all) are "a boon foremost to lawyers", because the terms ‘share alike’ and ‘commercial use’ are not (yet) clear in a legal context. Choosing an idealistic license can limit the size of the community for projects where companies play a major role. A commenter added that this mirrors the problems of the Creative Commons licenses. In a way, the Apache license mirrors CC0, which CC created to tackle those.

Tuesday, June 21, 2011

Open Repositories 2011 overview

Open Repositories was great this year. Good atmosphere, lots of interesting news, good fun. It's hard to make a selection from 49k of notes (in raw utf8 txt!). This post is a general overview, more details (and specific topics) will follow later.

Bright lights, big state!
Texas State History Museum
My key points:

1. Focus on building healthy open source communities

The keynote by Jim Jagielski, President of the Apache Software Foundation, set the tone for much of what was to come. An interesting talk on how to create viable open source projects from a real expert. The points raised in this talk came back often in panel discussions, audience questions and presentations later.
More details here.

2. The Fedora frameworks are growing up

Both Hydra and Islandora now have a growing installed base, commercial support available, and a thriving ecosystem. They've had to learn the lessons on open source building the hard way, but they have their act together. Fez and Muradora were only mentioned in the context of migrating away.
Also, several Fedora projects that don't use Hydra still use the Hydra Content Model. If this trend of standardizing on a small number of de facto standard CMs continues, it would greatly ease mixing and moving between Fedora middleware layers.


3. Eprints’ pragmatic approach: surprisingly effective and versatile

Out of curiosity I attended several EPrints sessions, and I was pleasantly surprised, if not stunned, by what was shown. Especially the support for research data repositories looks to strike the right balance between supporting complex data and metadata types while keeping it simple and very usable out of the box. And also the Bazaar, which tops Wordpress in ease of maintenance and installation, but on a solid engineering base that's inspired by Debian's package manager. Very impressive!
More details here.
Longhorn
Texans take 'em by the horns!



Misc. notes
See part #3: Misc notes


Elsewhere on the web


OR11 Conference program, presentations.
Richard Davis, ULCC: #1 overview, #2 the Developers Challenge, #3 eprints vs. dspace.
Disruptive Library Technology Jester day 1, day 2, day 3.
Leslie Johnson - a good round-up with focus on practical solutions.
#or11 Tweet archive on twapperkeeper

Photosets: bigD, keitabando, yours truly, all Flickr images tagged with or11, Adrian Stevenson (warning: FB!).


Other observations

Unlike OR09, the audience was not very international. Italians and Belgians were relatively overrepresented, with three and six respectively. I spotted just one German, one Swede and one Swiss, and I was the lone Dutchman. The UK was the exception, though many of those were presenters of JISC-funded projects, which usually have budget allocated for knowledge dissemination.

As OR alternates between Europe and the US, the ratio of participants tends to be weighted towards the 'native continent' anyway. But the recession seems to be hitting travel budgets hard in Europe now.
As there were interesting presentations from Japan, Hong Kong and New Zealand, the rumour floating around that OR12 might be in Asia sounded attractive; I'd be very curious to hear more about what's going on there in repositories and open access. The location of OR12 should be announced within a month, let's see.

[updated June 27th, added more links to other writeups; updated June 28, added Hydra CM uptake]




Monday, June 20, 2011

Catching up on old news, I came across an interesting presentation at CNI this spring on the Data Management Plans initiative. Abstract, recording of the presentation on YouTube, slides.

DMP online is a great starting point (and one of the inspirations for CARDS), and this looks like the right group of partners to extend it into a truly generic resource. Also notable about the presentation are the sensible reasons outlined for collaboration between this quite large group of prestigious institutions. All in all, something to keep an eye on.

Tuesday, October 05, 2010

Don't panic! Or, further thoughts on the mobile challenge

Two weeks ago, I posted some notes on the CILIP executive briefing on 'the mobile challenge', where I presented the effort of my library, the quick-wins 'UBA Mobiel' project. Those notes concentrated on the talks on the day. Now that it's had time to simmer (and a quick autumn holiday), I want to add some reflection on the general theme.

Which basically boils down to Don't Panic (preferably in large, friendly letters on the cover).

Is there really such a thing as a 'mobile challenge' for libraries? Well, yes and no. Yes, the use of internet on mobile devices is growing fast, and is adding a new way of searching and using information for everyone, including library patrons. The potential of 'always on' is staggering. And it is a challenge.

However, it is also just another challenge. After twenty years of continuous disruption, starting with online databases, then web 1.0 and web 2.0, change is not new any more. Libraries are still gateways to information that is rare and/or expensive (the definition of expensive and rare depending on the context, and of course changing over time). And the potential of the paperless office may finally come to fruition with the advent of the iPad, but meanwhile printer makers are enjoying a boom selling ever more ink at ridiculous prices.

So, what to do?

There are three ways to adapt. On one side are the forerunners, with full focus on the new and shiny. Forerunners get the spotlight, and tend to be extroverts who give good presentations. However, not everyone can be in front - it would get pretty crowded. It takes resources, both money and a special kind of staff. Two prominent examples given at several of the Cilip talks were NCSU and DOK Delft. Kudos to them, they're each doing exciting stuff, but they are also the usual suspects, and that's no coincidence.

On the other extreme, there's not changing at all. For the institution, a certain road to obsolescence; for a number of library staff, the easy way to retirement. Fortunately, their number seems to be rapidly dwindling. Nevertheless, finding the right staff to fill the jobs at libraries or publishers while the descriptions of those jobs are in flux was a much-talked-about topic, both in the talks and in the breaks.

In practice, most libraries are performing a balancing act in between. And it is perfectly acceptable to be in the middle. Keep an eye on things. Stay informed. Make sure your staff gets some time to play with the toys that the customers are walking around with, and if they find out what's on offer in the library is out of sync, do something about it.

[from tuesday tech]
Which is pretty much what we did with UBA Mobiel. Nothing world-shattering, not breaking the bank. We're certainly not running in front, but we're making sure our most important content (according to the customers) is usable. This way, when the chance comes along to do Something Utterly Terrific (Birmingham) or merely a Next Step Forward (upgrading our CMS), we know what to focus on.

The response to our humble little project has been very positive. We may have hit a nerve, and I'm really glad to hear that it is inspiring others to get going. Go-Go Gadget Libraries!

Friday, September 17, 2010

Becoming upwardly mobile - a Cilip executive briefing

Cilip office
Cilip office in Bloomsbury, London

On September 15, Cilip (the UK Chartered Institute of Library and Information Professionals) and OCLC held a meeting on the challenge that mobile technology poses for libraries, called the 'Becoming Upwardly Mobile' Executive Briefing.

The attendees came from the British Isles (UK and Ireland). Some of the speakers however came from elsewhere. Representing The Netherlands, I presented the UBA Mobiel project as a case study, which went well.

The mere fact that I was asked to present our small low-key project - which in the end cost less than 1100 euro and 200 hours - as a case study alongside the new public library in Birmingham, with a budget of 179 million pounds sterling, shows how diverse the subject of 'the mobile challenge' is.

Thus the talks varied widely, and especially the panel discussion suffered from a lack of focus. It was interesting nevertheless.

Attendees were encouraged to turn their mobiles on and tweet away, and a fair number of them did. See Twitter archive for #mobexec at twapperkeeper.


1. Adam Blackwood, JISC RSC

A nice wide-ranging introduction in a pleasant presentation, using lots of Lego animation. In one word: convergence. To show what a modern smartphone can do, he emptied his pockets, then moved on to a big backpack, until the table in front of him was covered with equipment, a medical reference, an atlas and so on. "And one more thing…". The versatility of the devices coming at us means not only that current practices will be replaced, but also that they are going to merge in unexpected ways. Reading a textbook online is a different experience from reading it on paper, for instance. Augmented reality (in the broad sense of the word, not just the futuristic goggles) is a huge enabler that we should not block by sticking to old rules (such as asking to turn devices off in the library or during lectures).

As for the demos, it's a bit unfortunate that it always seems to be the same ones that get pointed to (NCSU, DOK), though they're still great. Using Widgetbox to quickly create mobile websites was new to me, worth checking out further (the example was ad-enabled; I hope they have a paid version, too).

All in all, a great rallying of the troops.

2. Brian Gambles, Birmingham

A talk about the new public library in Birmingham. An ambitious undertaking, inspired by, amongst others, the new Amsterdam public library. The new library should put Birmingham on the cultural map and itself become one of the major tourist attractions of the city when it opens in 2013. It's also meant to 'open up' the vast heritage collection (the largest collection of rare books and photography of any public library in Europe). And to pay for it, they'll have to monetize those as well.

A laudable goal, great looking plans, I wish them luck in these difficult times.

The library is not just the books (the new Kansas City library sends all the wrong messages). The mobile strategy follows from the general strategy: open up services and let others do the applications. Open data, etc. They are working with Apple to get on iTunes U, for instance (a partnership with the university). Get inspiration from the cultural sector: many interesting and much-downloaded apps have come from museums. Especially notable is the Streetmuseum of London (Flash-y website, direct iTunes app link).

Also, they can't afford to hire enough cataloguers for the special collections - so open this up as well, with crowdsourcing as a helpful tool. He was surprised that there are people who like to correct OCR'd texts, which he thinks is a dreadful chore. So let's use it.


3. Panel discussion.

This wasn't as good as it could have been unfortunately, due to the wide range of the topic. Still some interesting points:

Adrian Northover-Smith from Sony was of course very much pro e-ink devices and against the iPad. It's a cultural challenge for the company that their e-reader customers are female and older, while most of their wares are peddled to young males. In a way, not dissimilar to libraries adjusting to the new 'digital native' generations, especially those catering to students.

Q: mobile use for people with visual impairment? A: the EPUB format allows for more flexibility: larger letters, reading aloud. In some studies (art, fashion) up to 30% of students are dyslexic, and they're helped greatly by being able to present the content differently. (DH: this is yet another field in which rights are the big hurdle, given the skirmishes over audiobook vs text-to-speech rights...)

Simon Bell from the British Library talked about the challenge of mass digitization. The definition of availability is shifting, and born-digital data is especially volatile. Mobile access is just another form of presenting content; the content comes first now.

Jonathan Glasspool from Bloomsbury Academic talked about the publishing point of view. He presented a new platform for online publishing, using CC licenses to allow non-commercial use online. I'm curious how this compares to the European OApen project in which our uni participates.

In his view, the main challenge today is that the industry needs a new type of people. Bloomsbury has weekly voluntary 'elevenses' sessions, where staff can brief each other on new ideas and online uses they found, which seems to work well as a motivator.

Simon Bains and bevanpaul noted via tweets that there seems to be a big divide between those focussing on generating content and those interested in new platforms, and I agree. You can't have one without the other; it's a chicken & egg situation. On the other hand, the reality is that the problems are so big that to get anything done, focus is needed.

Brian Gambles mentioned that railway ticket machines were recently redesigned to deal with the visually impaired, resulting in a design that's much better for everyone. Better to incorporate it from the start: "accessibility should be in the DNA of new products".


4. Jeff Penka, OCLC worldwide

As I was preparing for my own talk, only a few notes. The main point of technology is barrier elimination for the user. We tend to think in systems, in details, jargon and acronyms: ILS, OPAC, SFX. The user just thinks a button should be "Get it". See also the importance of 'one-click' shopping in the Amazon and iTunes stores: such a seemingly small step is key to dominance.

The worldcat mobile interface is very 'beta' - every 2-3 days a new release, to try things out. Expected to stabilize in spring 2011 though. An interesting remark: OCLC believes that a mobile interface should not come as an extra, at a high cost. Rimshot! Too many vendors are trying to squeeze their clients by doing exactly that.

5. Driek Heesakkers on Uba Mobiel

Download the presentation (licensed under creative commons BY-NC-SA).

Then it was my turn. I presented our small 'agile' project. See the presentation. It will be described in more detail in the upcoming book 'Catalogue 2.0' - a little ironic, as one of the themes of the day was that the catalogue is much less important to the users than it is to library professionals.

To summarize: by giving space to enthusiastic early adopters amongst staff, in the form of a low-overhead, fast-moving project that focuses on possible quick wins, a library can bridge the gap for the current transition period. In the long term, vendors will come up with solutions that present content (whether a catalogue, website or digitized objects) equally well in a mobile context as in others. This will take a while though, and in the meanwhile we can't afford for our services to be (nearly) unusable on a mobile device.

Basically, the message is "just do it" - it will be easier than you think!


6. Benoit Maison on Pic2shop

A highly specialized topic. The pic2shop application offers an interesting way of merging functionality that web apps can't access (in this case, barcode scanning) with regular web apps. In the case of their WorldCat-enabled scanner, a user can scan a book (in a bookstore, I presume), the app then passes the code on to an external website which does something useful with it (looks it up in WorldCat), and the app displays the result from this website inside the app interface. To the user it's transparent; for the developer it's relatively lightweight.
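As I understand the flow, the web side can be as small as this. A sketch with Flask; the route and parameter name are my own guesses, not pic2shop's actual interface:

    from flask import Flask, redirect, request

    app = Flask(__name__)

    # The app scans a barcode in-store and calls something like
    # https://example.org/scan?barcode=9780262012102 (hypothetical URL),
    # then displays whatever this page returns inside its own interface.
    @app.route("/scan")
    def scan():
        isbn = request.args.get("barcode", "")
        # Hand the code off to a lookup service; here a WorldCat ISBN search
        return redirect(f"https://www.worldcat.org/isbn/{isbn}")

    if __name__ == "__main__":
        app.run()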

It's an elegant concept. Might be useful for other specific device functionality that can't be accessed via web apps as well, though there are currently no plans for that.

The day ended with a session on augmented reality by Lester Madden, who did a good job I heard. Unfortunately my flight connection was too tight to stay for this one. The flight experience was pretty bad anyway... next time Eurostar for me!

Finally, for a little balance: on the same day, Aaron Tay wrote A few heretical thoughts about library, which deals amongst other things with the relative unimportance of mobile use at the moment. To a certain extent he has a point. It's not bad to stop for a moment and check if you're just following the pack.

A quiet moment