Ideas Challenge presentations #or2017

Challenge to solve an existing problem with emerging technologies.

Data Pickle

Researchers wanted to upload data but didn’t know how to wrap it up. Cf. ThisToThat, which tells you how to glue thing A to thing B. Let’s make this for data.

For example: package Shapefiles for preservation. Click “PICKLE!” and it recommends a) the best practice and b) the minimum requirements.

Crowdsourced but curated information for various options.

Technology handshake to achieve Australasia PMC

Right now there are EuropePMC and CanadaPMC (child nodes of US PubMed Central, which has 27 million references). So create AustralasiaPMC so PMC can link to OA articles. PMC can be populated with clumsy markup, so a clever handshake technology is needed to make full text available in child and parent nodes simultaneously.


A museum is an interface for scientific information to the general public. But it takes too long for simplified explanations of science (from e.g. journals) to reach the general public, and journalists don’t always guard scientific integrity.

Want to do a better job of spreading info through social media. Natural language processing to create automated simplified summaries from technical abstracts; push notification to proposal pages so they can create or add to articles; Google translate for other languages.

Put it all together and you get communication immediately after acceptance, being picked up correctly by major news outlets.

(In Q&A: hard to contextualise. Audience notes researchers want to say ‘further research needed’ while lay people want to know what the answer is.)


The technology we’ll use in future repositories has already been written – GitHub is full of work in progress – some people know about it but not all of us. Pull code automatically from everywhere, put it together, throw data in, see if it works.

Plan A – artificial intelligence – most advanced AI right now is self-driving car, so jump in front of one with the repository and the car can evaluate it and then run you over.

Plan B – use humans

(In Q&A: Kim Shepherd suggests looking at the number of forks on GitHub projects: what percentage might be active, and what percentage should we have merged in.)
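The "what percentage might be active" question from the Q&A could be roughed out as below. This is only a sketch under assumptions: the function name `active_fork_share`, the one-year activity window, and the idea of using each fork's last push date are all illustrative choices, not anything the presenters specified, and the dates would have to come from wherever you collect fork metadata.

```python
from datetime import datetime, timedelta

def active_fork_share(last_pushed, now, window_days=365):
    """Return the percentage of forks pushed to within the window.

    last_pushed: list of datetimes, one per fork (hypothetical input).
    window_days: assumed definition of "active"; adjust to taste.
    """
    if not last_pushed:
        return 0.0
    cutoff = now - timedelta(days=window_days)
    active = sum(1 for pushed in last_pushed if pushed >= cutoff)
    return 100.0 * active / len(last_pushed)

# Made-up example dates for four forks of one project.
forks = [
    datetime(2017, 5, 1),
    datetime(2016, 1, 1),
    datetime(2017, 1, 15),
    datetime(2014, 3, 3),
]
share = active_fork_share(forks, now=datetime(2017, 6, 30))
```

Deciding which active forks "should have been merged in" is a human judgment call this sketch does not attempt.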

Global Connections

Deep learning for repository deposit – use existing repository PDFs and metadata to train AI to a) create structured metadata for unstructured content (i.e. articles), b) find relevant articles, c) add structured metadata.

Slice ‘n’ Dice: API-X + XProc-Z

XProc-Z is a simple web server framework: HTTP request -> -> -> HTTP response (especially useful for proxies).

API-X for plumbing together microservices.

GET request for info on resource – API-X intercepts/proxies, tweaks, and makes request to server, retrieves result, wraps in a header, tweaks and returns to user.
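The intercept, tweak, forward, wrap and return flow described above can be sketched in a few lines. To be clear, this is not the API-X or XProc-Z API (XProc pipelines are written in XML); it is a plain Python analogy, and every name here (`Request`, `Response`, `fake_repository`, `proxy`, the `/manifest` path) is a hypothetical stand-in with no real network calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class Request:
    method: str
    path: str
    headers: Dict[str, str] = field(default_factory=dict)

@dataclass
class Response:
    status: int
    body: str
    headers: Dict[str, str] = field(default_factory=dict)

Backend = Callable[[Request], Response]

def fake_repository(request: Request) -> Response:
    """Stand-in for the repository server (hypothetical)."""
    if request.method == "GET" and request.path == "/objects/1":
        return Response(200, "<title>Example object</title>")
    return Response(404, "not found")

def tweak_request(request: Request) -> Request:
    """First tweak: adjust the incoming request before forwarding it."""
    headers = dict(request.headers)
    headers["Accept"] = "application/xml"
    return Request(request.method, request.path, headers)

def wrap_response(request: Request, response: Response) -> Response:
    """Second tweak: add a Signposting-style Link header to the result."""
    headers = dict(response.headers)
    headers["Link"] = f'<{request.path}/manifest>; rel="describedby"'
    return Response(response.status, response.body, headers)

def proxy(request: Request, backend: Backend) -> Response:
    """Intercept the request, tweak it, call the backend, wrap the reply."""
    upstream_request = tweak_request(request)
    upstream_response = backend(upstream_request)
    return wrap_response(request, upstream_response)

result = proxy(Request("GET", "/objects/1"), fake_repository)
```

The backend never needs to know about the extra header, which is the appeal of doing this in a proxy layer rather than in the repository itself.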

You don’t need to develop code, just write a text file in the XProc language, so you can test out what it looks like without waiting for repository support. Example uses: Signposting; generating IIIF manifests; adding OAuth authentication; adding CSS.

Brisbane Declaration ON the Elimination Of Keywords (B-DONEOK)

Keywords can’t express the complexity of language the way full-text can. We spend time assigning them anyway. So let’s stop. Instead just use sophisticated full-text search and indexing. Sign on to the declaration at

(In Q&A audience asks if there’s evidence keywords aren’t useful; team asks in return if there’s evidence keywords are useful.)
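For readers unfamiliar with what full-text indexing involves, here is a minimal sketch of an inverted index with AND-style search. It is a toy under stated assumptions (crude lowercase tokenising, no stemming or ranking, made-up document texts), far short of the "sophisticated" search the declaration has in mind, but it shows how every word of the text becomes findable without anyone assigning keywords.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

# Made-up documents for illustration only.
docs = {
    "a": "Preserving Shapefiles in an institutional repository",
    "b": "Full-text search for repository content",
    "c": "Keywords and controlled vocabularies",
}
index = build_index(docs)
hits = search(index, "repository search")
```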

