Humanities Informatics #ndf2012

Humanities Informatics:
Ingrid Mason (@1n9r1d), Intersect Australia
The Humanities Networked Infrastructure project is a virtual laboratory project funded by the NeCTAR programme in Australia. The project has several significant scholarly humanities datasets to bring together and map across. The immediate goal is to enable researchers to explore and interpret the commonalities.
The initial design challenge is to select description schema and use linked data and controlled vocabularies for data to align the data. This approach tests the assumption that configuring and building on the knowledge of available schema, methods and datasets, will provide a standards based and curated foundation layer to support research requirements.
This ‘prefabricated’ approach has been the basis by which the digital humanities and GLAM sectors have provided access to data. Observing how researchers shape and use this prefabricated environment will inform the value of that approach and the architectural modelling, and inform next steps to building infrastructure where the ‘researcher query’ is the lens that defines the schema.

Anonymous quote: “Gah semantic web is frying my brain!”

Intersect Australia is an eResearch org. Working on virtual lab project in humanities informatics field. Talking and dreaming and living data… Have become conversant in RDF; even taking steps towards ontology development. Talking about linked data, data as graph. Interested in overlap between humanities informatics and GLAM digital cultural heritage.

Wants to provoke thinking – datasharing across GLAMs and scholarly datasets? Who has authority, truth, encoding consensus or contradiction? Doing something with HuNI data? etc

Digital Humanities sits within eResearch (which has been dominated by science). HuNI (@hunivl – Humanities Networked Infrastructure) is a distributed project that wants to explore commonalities/divergences in data. Bringing together datasets means dealing with multiple standards, so need to build an ontology. User-centred design.

Assumptions – they’re “prefabricating” but talking to researchers all the way through. Building foundation layer. Fascinated by idea of a researcher query. Work to help researchers ask the questions they need.

Project to integrate 28 cultural datasets (using linked open data) into a virtual laboratory. Want to break down barriers between disciplines. Want it to be available to all but licensing comes into it.

Data – AusStage, bonza, CAARP, AustLit, CircusOz, Australian Dictionary of Biography, PARADISEC, Australian Women’s Register…
Tools – eg Omeka, Neatline, LORE

Informatics, from Wikipedia: “studying how to design a system that delivers the right information, to the right person in the right place and time, in the right way”

(Skimming through – slides will be online.)

“Data” an ineffective word to describe all the kinds of data there are.

Linked Data on Wikipedia.

RDF – Resource Description Framework. Statements known as “triples” – subject, predicate, object. Can be serialised in different formats, eg RDF/XML, RDF/JSON.
SPARQL – query language for RDF

“Ingrid is a Kiwi. Conal is a Kiwi. But what is a Kiwi?”
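
A minimal sketch of the triple idea in plain Python (no RDF library), using the Kiwi example. The `ex:` identifiers are made up for illustration, and the `match` helper is hypothetical: it only mimics the simplest kind of pattern a SPARQL query would express, with `None` as a wildcard.

```python
# Each statement is a (subject, predicate, object) triple.
# The "ex:" names are invented for this sketch, not real identifiers.
triples = [
    ("ex:Ingrid", "rdf:type", "ex:Kiwi"),
    ("ex:Conal", "rdf:type", "ex:Kiwi"),
    ("ex:Kiwi", "rdfs:label", "Kiwi"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None matches anything."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "Who is a Kiwi?"
kiwis = sorted(t[0] for t in match(triples, p="rdf:type", o="ex:Kiwi"))
print(kiwis)

# "But what is a Kiwi?" Nothing in this graph says, so the result is empty.
print(match(triples, s="ex:Kiwi", p="rdf:type"))
```

The second query comes back empty until someone adds a statement saying what kind of thing `ex:Kiwi` is (a bird, a fruit, a nationality), which is roughly where an ontology comes in.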

Ontologies have concepts, relations, instances, and axioms. A set of entities within a domain are related by a concept.

Connections between people within Australian Biographies, and between a group of datasets.

Challenges:

  • Need to help researchers go from above the forest through the canopy into the trees and branches.
  • Unlock data, value in controlled vocabularies.

How will you commemorate the First World War centenary? #ndf2012

How will you commemorate the First World War centenary?
Virginia Gow @vexus_nexus and Douglas Campbell, WW100 and Auckland Museum

What is your organisation doing to commemorate the centenary of the First World War?
The First World War (1914–1918) was one of the most significant events of the twentieth century and had a seismic impact on New Zealand society. Ten percent of our then population of one million served overseas, of which more than 18,000 died and over 40,000 were wounded. Nearly every New Zealand family was affected.
In this session, join Virginia Gow and Douglas Campbell to get some pointers on preparing your organisation for WW100 – New Zealand’s First World War centenary commemorations. We’ll cover some of the activities already underway in the digital GLAM sphere, how you might contribute to national initiatives such as the Cenotaph redevelopment, and hold an open discussion on how we can support each other to be ready for WW100.

Virginia: Centenary of WWI coming up in 2014. Why are we commemorating it? Is there anything left to digitise?

Nearly half of NZ’s young men went to war. Events touched every family, community, school, workplace. Aim to tell stories, not sanitised. Create a comprehensive website of the WWI history http://www.firstworldwar.govt.nz. Aims: Public engagement, preservation of our heritage, creation of new interpretations of our history, international connections.

Funding opportunities available – applications close Nov 2012, May 2013. Have created symbol and official name for event (available on website). Programme office has no mandate or intention to organise everything. Providing support for things but mostly facilitating activities elsewhere.

What does the centenary mean for us as GLAM institutions?

Of note: photographs taken by NZers before 1944 are probably out of copyright.

Could be good to get together, figure out what we’ve got and what’s out there, then pull it together in meaningful ways. What story will we tell the future about this centenary? (eg people using Twibbons as people in the first Anzac Day commemoration wore hats?) An opportunity for the GLAM sector to shine, especially if we work together / collaborate.

Private mailing list available to discuss plans – contact the programme office for info.


Douglas: working on Cenotaph redevelopment. Cenotaph is a biographical database for NZers who served in war. Records for most of the 100,000 NZers who fought overseas and have died. Records may have details and photos, or may only have name, rank and serial number.

Will keep a page per soldier but jazz it up a bit and add other entry points – maps, battalions, battles. Could have much more content available out in the GLAM sector. GLAM could contribute; links could go both ways. Users could contribute info/photos about family. Crowdsource research, digitisation, transcriptions, stories both typed and audiovisual, corrections (eg bad machine data matching, mistakes in official records, soldiers giving wrong date of birth). Provide data (vocabularies, authoritative data, international data, linked data) back to institutions. Make databases available to academic research. Will be complicated so hope to partner with DigitalNZ.

Curly questions:

  • scope (which people, which wars?)
  • centralisation – should it all be on Cenotaph or should it link out?
  • ownership
  • provenance – how do we make sure we know which data is curated, which crowdsourced, etc?
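
One possible way to approach the provenance question, as a rough sketch only: store every claim as its own statement, tagged with where it came from, rather than overwriting a single field. The `Statement` class, the field names, the record id and the sample dates are all invented for illustration and are not Cenotaph's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    record_id: str  # permanent record identifier (hypothetical format)
    field: str      # which attribute the claim is about
    value: str      # the claimed value
    source: str     # who supplied the claim
    kind: str       # "curated" or "crowdsourced"


# Made-up sample data: two conflicting dates of birth for one record.
statements = [
    Statement("rec-1", "date_of_birth", "1896-03-01", "official record", "curated"),
    Statement("rec-1", "date_of_birth", "1898-03-01", "family member", "crowdsourced"),
]


def claims_for(statements, record_id, field):
    """Return every claim about one field, keeping contradictions visible."""
    return [
        (s.value, s.kind, s.source)
        for s in statements
        if s.record_id == record_id and s.field == field
    ]


for value, kind, source in claims_for(statements, "rec-1", "date_of_birth"):
    print(f"{value} ({kind}, from {source})")
```

Keeping both claims side by side means a correction from a family member never silently replaces the official record, and an interface can decide later which kinds of source to show.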

Note service numbers aren’t unique but can use Cenotaph number which should (hopefully!) be permanent.


Q: Data going to institutions and academics but back to users who contributed it. Will we see an Open API?
A: Hope so, but will be curly as we integrate data from various sources.

Q: How do we turn commemoration into something inclusive of all NZers including those whose ancestors fought on other side?
A: We’re just one project among many all around the world. There are other ways into the centenary than Cenotaph eg life a hundred years ago.

Q: Is there an index to conscientious objectors?
–Apparently there’s one in the Gazettes.

Q: Can you commit to the Cenotaph ID being permanent?
A: Yes, so commits.

The tales we can tell #ndf2012

The tales we can tell
Tim Sherratt and Chris McDowall
The growing proliferation of digital sources provides opportunities to view the past in different ways. We can analyse textual content of documents, extract and compare information from images, and build all manner of impressive graphs and visualisations to discern new patterns and insights. But this data has its origins in human activity. Behind each data point is a multitude of stories, as different as they are the same. By abstracting these experiences, the world of big data can become detached and alienating. How do we take advantage of quantitative techniques for contextualisation while holding on to the differences, the anomalies, the contradictions that continue to nourish and intrigue us?
Using examples drawn from a variety of collections and projects, Tim and Chris will investigate ways of bringing the two perspectives together. How can we construct interfaces that enable us to move freely across gulfs of scale and meaning? How can we present online narratives that embed multiple contexts? How can we use machine-readable data to frame and enrich our human-sized stories?

Tim: What happens when we bring stories and data together?

The excitement of linked open data is about making meaning. Explore, wonder, linger, sometimes stumble. The frustration of linked open data is that we talk as if it was all just engineering – a big industrial plumbing project. Can instead be a craft, created with love – or in anger. Linked open data will be a success not when we’ve linked everything to DBpedia, but when we’ve created thriving communities.

Western tradition equates knowledge with accumulation. Linked data promises Lots More Stuff. It’d be a tragedy if all we ended up with was a bigger database or better search engine. Want enriched stories, embedded meaning.

Did a presentation once adding triples – but presentation and triples were still separate. Want to create something not with a platform (“sneaky server-side stuff”), something anyone could do. Plain text, no markup. Hacked together JavaScript to work with text in document, get data from elsewhere, and: Live demo. Script inspects text onscreen and displays visible entities to the right. (The audience is audibly wowed.) Right now most data comes from within document, but sometimes only includes an identifier and pulls info from other sources. Rough demo and long to-do list – but gives ideas on how to create data-rich stories.

Just used HTML, RDFa, and some JavaScript libraries. Wanted it to be accessible. “Access” not just the power to consume but also the power to create. Doesn’t want to live in a world where data is something other people collect for us. Wants “slow data”. Not the giant global graph, but data artisans hand-crafting stories into a messy tapestry.
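
To give a feel for what "inspecting the text for entities" can mean, here is a small sketch in Python rather than the JavaScript used in the demo. It is not Tim's script: it only assumes entities are marked up with the RDFa attributes `typeof` and `resource`, and it is nowhere near a full RDFa processor. The sample HTML, names and URLs are invented.

```python
from html.parser import HTMLParser


class RDFaEntityFinder(HTMLParser):
    """Collect type, identifier and visible text for elements with a typeof attribute."""

    def __init__(self):
        super().__init__()
        self.entities = []
        self._stack = []  # (tag, entity-or-None) for each open element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        entity = None
        if "typeof" in attrs:
            entity = {"type": attrs["typeof"], "id": attrs.get("resource"), "text": ""}
            self.entities.append(entity)
        self._stack.append((tag, entity))

    def handle_endtag(self, tag):
        # Close back to the most recent matching open tag.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        # Text belongs to every annotated element that is currently open.
        for _tag, entity in self._stack:
            if entity is not None:
                entity["text"] += data


# Invented sample document; example.org URLs are placeholders.
html = (
    '<p><span typeof="schema:Person" resource="http://example.org/people/1">'
    "Jane Example</span> lived in "
    '<span typeof="schema:Place" resource="http://example.org/places/2">'
    "Exampletown</span>.</p>"
)

finder = RDFaEntityFinder()
finder.feed(html)
finder.close()
for e in finder.entities:
    print(e["type"], e["id"], e["text"])
```

The identifier in `resource` is what would let a script go and fetch more information from elsewhere, as in the demo where the document sometimes holds only an identifier.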


Chris: Showing DigitalNZ listing thumbnails which link to institutional landing pages. Thinks it’s great if you know what you’re looking for. Tells of being in museum – not looking for a specific thing but just exploring. When online, don’t want to look at a postage stamp.

On a screen there’s so little real estate. Most compelling part of an image is typically the face. So took images (all 21,000 of them) and passed through OpenCV to extract 16,500 faces. Started experimenting with tile placement algorithms.

Composited images into a single image (in five clusters eg the area of soldiers’ faces) displayed with a maptiler interface: can zoom out to full mosaic or zoom into individual image. Wants it online but first needs to add a metadata overlay and a clickthrough to source.
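
The notes don't record which tile placement algorithms Chris tried, so the following is only the simplest possible baseline, not his method: lay equal-sized square tiles row by row in a near-square grid. The 64-pixel tile size is an assumption, and the five clusters are ignored.

```python
import math


def grid_layout(n_tiles, tile_size):
    """Return (columns, rows, positions) for a row-by-row near-square grid.

    positions[i] is the (x, y) pixel offset of tile i's top-left corner.
    """
    cols = math.ceil(math.sqrt(n_tiles))
    rows = math.ceil(n_tiles / cols)
    positions = [
        ((i % cols) * tile_size, (i // cols) * tile_size)
        for i in range(n_tiles)
    ]
    return cols, rows, positions


# 16,500 faces at an assumed 64 px per tile.
cols, rows, positions = grid_layout(16500, 64)
print(cols, rows)                            # grid dimensions in tiles
print(cols * 64, rows * 64)                  # mosaic size in pixels
print(positions[0], positions[-1])           # first and last tile offsets
```

A grid like this gives each face a predictable position, which makes it easy to map a click on the mosaic back to a tile index and from there to the source image.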

Has questions: Is this useful? Would this scale? Does this automatic cropping respect the images?