Many aspects of open data – today focusing on research data, ie data created by research projects at an institution.
Research workflow is very complex but to really simplify: researchers start a project, get lots of data, and summarise results in journals. But it’s not the data – it’s a summary of the data with maybe a few key examples. The rest goes to places where only the researcher can access it.
Why do we care?
- for the good of all
- expensive to generate, so want to maximise use, eg validation, meta-analyses, reuse in different ways
- much funded by government therefore taxpayer – so they should be able to access it
Used to work in a group which shared greenhouse space, but no one had any idea what else was in there. Proposed sharing basic information about what was there and what to do in case of emergency – and was shocked when some said no. Supervisor said that'll happen, but don't let it stop you asking the question.
When requesting data, the odds of it still being extant decrease by 17% each year. (cite: Vines (2013) 10.1016/j.cub.2013.11.014)
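To get a feel for what a 17% yearly decline in odds means, here is a minimal sketch of the arithmetic. It assumes the decline compounds year on year, and the starting odds of 1.0 (an even chance) are a made-up illustration, not a figure from the talk or the paper; the function names are hypothetical too.

```python
def odds_after(initial_odds: float, years: int, yearly_decline: float = 0.17) -> float:
    """Odds that data is still extant after `years`, assuming the
    decline compounds: each year the odds are multiplied by (1 - decline)."""
    return initial_odds * (1 - yearly_decline) ** years


def odds_to_probability(odds: float) -> float:
    """Convert odds (p / (1 - p)) back to a probability p."""
    return odds / (1 + odds)


if __name__ == "__main__":
    start = 1.0  # hypothetical starting odds (even chance), for illustration only
    for year in (0, 1, 2, 5, 10):
        odds = odds_after(start, year)
        print(f"year {year:2d}: odds {odds:.3f}, probability {odds_to_probability(odds):.3f}")
```

Under that compounding assumption the odds roughly halve somewhere between three and four years, whatever the starting point.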
This is where academic libraries come in – getting the data off the USB drives. So need to understand why researchers might not want to share. Did interviews to inform survey construction, then ran the survey to get info from more people. 102 responses from researchers across 10 disciplines; 18 from librarians (about 20% response rate).
Do librarians and researchers agree on the major drivers that determine whether researchers choose to share their data?
Is data-sharing part of the research culture? Said common/essential: librarians 7%; researchers 26%
Factors influencing data-sharing
- agreement in some areas: ability to publish, inappropriate use, copyright and IP all rated pretty high; then resources, interest to others, system structure and data access
- differences: librarians thought institutional policy and system integration very important, and funder policy and system usability somewhat important – all very low for researchers. What mattered to researchers: ethics (>40%); culture, research quality (10-15%); data preservation, publisher policy (5-10%)
Are there differences across major disciplines in what those drivers are?
5 disciplines with 10+ responses: business; medicine/health; phys/chem/earth; life sci/bio; soc sci/education. Ethics important for most, but not a high-ranking factor for phys/chem/earth due to the nature of their data. Data preservation/archiving, on the other hand, is more important for them (and for med/health), somewhat important for life sci and soc sci, while business barely cared.
Take home
So consult with your community to find out what’s worrying them. Target those concerns in promotion and training. Eg we know system usability is important so definitely fix it – but don’t waste your communication opportunities talking about it when they’re worried about other things.