Monthly Archives: February 2014

Think social #vala14 #s20 #s21

Wendy Abbott, Jessie Donaghey, Joanna Hare and Peta Hopkins The perfect storm: the convergence of social, mobile and photo technologies in libraries

Looking at libraries’ use of Instagram and photosharing. Identified 74 libraries in April/May 2013 – seems to be early days compared to Facebook. Broke down to a few special libraries (eg Smithsonian) but mostly public (slight majority) and academic.

Survey sent to the 65 libraries they could find contacts for; 29 responded. 15 agreed to individualised followup and 10 in fact followed up. Also used Nitrogram to monitor 20 library Instagram accounts over 4 weeks. Took the ten most-liked images from Nitrogram – turned out that identity and affective images were more important than functional images.

Libraries don’t target specific groups – just anyone and everyone with Instagram account.
Issues: having trouble coming up with content to share. Some found it hard to share responsibility among staff since it’s a mobile platform; also issues editing images. Most libraries use staff personal equipment. Public libraries more likely to use employer equipment.
Most libraries share across multiple platforms. Found visual content got better engagement than verbal updates.
Less than 50% provided training – usually self-directed or in-house social media training.
Uncertainty of how much to follow/interact with students. Would be good if there were norms!

UCLA Powell image of tree that fell down, with Harry Potter spin because undergrads often refer to library as Hogwarts – very individualised to their population.
Emily Carr Uni library use same background for all images to create cohesive style
Public Libraries of Singapore – pets with books
Los Angeles County Public Library – connect with shared love of local sports team
Melbourne University Library – dolls in library
Some have very specific uses – eg educating re cuts to library budgets, or promoting maker space, or promoting photo archives.
“Library selfies (and shelfies)” – used to construct identity. Often want to construct friendly identity for library.

Thinking about goals:

  • what your message is
  • think about your target audience overall and how that might differ per image
  • how you want to engage your audience
  • how you’ll evaluate
  • how the images will be used and where they’ll appear

Data and paper online

Q: Any licensing issues?
A: Not an issue for us because creating own images. Used a Creative Commons image once – just add attribution over the top or underneath so not an issue.

Q: Would some places have issues with their PR office?
A: Didn’t cover in their research because only surveyed places that already had accounts.

Kathleen Smeaton and Kate Davis Is it Tweet-worthy? Privacy in a time of sharing

“Content forwarding” for retweeting without adding own content/analysis/critique, and for conference tweeting of the “Kate just said” variety.

52 participants completed survey, all in full. 32 consented to being followed via social media for a week – actually only chose 12. Respondents ranged from students and graduates to a deputy university librarian. Most had one account, a few had more than two. Most self-reported tweeting less than they actually did, but self-reported professional tweeting as higher than it actually was. 64% said would tweet on controversial topics. 85% identify profession in profile – important part of online identity. 22% identify org in profile and 50% identify in tweets.

Tweets on controversial topics almost always liberal. Are there few conservative librarians or are they just very quiet? OTOH most tweets about controversial topics were retweets, not original tweets – evidence of some tentativeness.

Approaches to tweeting can change over time, often more relaxed once involved in tweeting community. Work and life collide – unless deliberately separate identities they merge together. “Context collapse” can be a concern when associate yourself with organisation. Many tweet personal beliefs; many tweet for organisation on own personal account. What are the impacts on governance? Most tweeting librarians are wise to risks and take a commonsense response. Organisations need an appropriate flexible policy in place – loosen up and trust professionalism of staff.

Lots of livetweeting, forwarding content. Two thirds of professional content was content forwarding. 15 tweets from 4 participants gave a professional opinion on something. Unwilling to put forward a professional opinion even if willing to raise controversial non-library topics. Is it safer to talk about politics than library policy?

86% of tweets were replies to a conversation. Building relationships. Some only tweeted professionally with no tea-table banter. “Informers” share information with goal of cultivating followers and relationships, while “Me-formers” share info about themselves. Not everyone wants to indulge in disclosure about shoes and cats – but this is valuable for building relationships. Disclosure seems to be the main catalyst for conversation.

Useful professional tool, perhaps because of personal discussions.

A new kind of citizens’ library #vala14 #p3

Gene Tan A new kind of citizens’ library through the Singapore Memory Project

Starting by getting us to share with each other favourite memory of Australia. [Mine is coming as a kid to Great Keppel Island and one night our parents got us up really late to see Halley’s Comet – I don’t remember the comet, just getting to be up so late.]

Someone obsessed with playgrounds and documented all the playgrounds in Singapore – found someone taking photos of playgrounds at night. Facebook page hit 100,000 people in a few hours, started national crowdsourcing – people sharing photos and memories of playgrounds.

Traditionally collect internet content, digital content, physical content. We don’t so much collect hearts-and-mind content.

Focusing today on long-tail contributions – people who contribute only once but add them all together and it’s huge.

Project looks at different perspectives, for each event collect multiple perspectives – history seen through the eyes of Rashomon. Politicians asking if would be organised into types, he said he thought the project should be random – seemed offensive to reduce people to patterns. Project needs to remain messy.

Got students to do interviews with seniors. Exhibition getting stories of people with portraits of their hands. Opportunity to capture memories of years when Singapore gained independence.

Created site www.SingaporeMemory.sg to give every Singaporean a permanent memory account. Made three off-beat films – one about graffiti. Took proposals for a fund to create content – one proposal on coffee, one on capturing memories of first homes (most Singaporeans have moved half a dozen times).

A few years after started this, national papers mentioning “memories” 1500% as often as before. So started wondering if National Library should keep doing this now that everyone is talking about it.

“Library of me” – but connected to newspapers, encyclopaedia, digital content from previous exhibitions, books, manuscripts, archives, research. So personal footprint gets huge.

“Every citizen her book” – in 2016 want to create Singapore Memory Public Library – any memory you’ve recorded manifested as a physical book. Talking to politician – not very excited. “That’s just like a hallowed hall. What about a memorial in the open.” Which sounded like something dead. But thinking about how what people loved were places they’ve been. What if could replicate the library in the park: miniatures by artists of all the places that are gone – so can see not just stories and films about the place, but things that are unexpected to transport you into that place/time.

Not everything has to be a digital innovation – wants to take it back to the palpable.
Singapore Memory Project – “Giving your past a present”

Q: How do you get people who aren’t technically savvy to contribute?
A: A lot has to be done by going to them. School programme where train the entire cohort of students eg 15 years old and deploy them to housing estates. Going to bring out DIY clips on YouTube teaching how to do this (in various languages because many don’t speak English: English first language, also mother tongue, also dialect).

Q: State Library of Queensland project collecting teacups. Now embarking on project to tell World War One. Over four years, how to find performance outcomes/measures that will keep politicians and accountants happy?
A: Originally said would collect a million memories, but politicians said “Let’s make it 5 million memories”. Tried to hit it – and pure numbers pleased politicians but Gene realised weren’t getting the complexity, were just getting one-liners. So went back to politicians and said “Singapore is changing. No-one cares about the numbers except you. The public cares about things they can relate to. Instead of five million I’ll give you five thousand – memories that are well articulated, that others can make things out of.” Memories made it into textbooks. Could show all the different things one memory could create, ways they could be connected to much more. Reached a million – but not interested in numbers, interested in what the memories can become.

Q: An old session concluded “Facebook as path of least resistance” – how do we persuade user that Facebook isn’t enough?
A: Don’t actually try to persuade: it’s hard to change people’s minds. Use Facebook as a place for conversation, but keep working on main project. Not interested in capturing everything – use it to generate leads then follow up to get longer form memories. But opting out of discussion on whether Facebook is good/bad/enough/insufficient.

Q: Singaporean government has reputation for political control. Can you capture memories of opponents as well as ‘ordinary people’?
A: Took a stand at the start of the project that would capture everything. Wrote to opposition to get them involved. But not so interested in “big history”. Nothing censored (except swearing, people abusive to other races/religions).

Q: Personal histories can be traumatic. Once had customer find traumatic memory of her mother (indigenous family/history) and felt unprepared and unsupported. How do you deal with this?
A: Don’t have that level of trauma though some are hard. But hopes that every memory is surrounded by other memories – see this on Facebook: someone posts something and others come in with similar memories and also other memories that can ‘cushion’ them.

Q: [About leadership]
A: Most people come up with strategic statement then decide they’ve got it solved. Gene doesn’t have a strategic statement. Got some people in to ‘design the future’ product prototyping. “There are iPhones to be found in the library, we just need to stop strategising.”

Social media as an agent of socio-economic change #vala14 #p2

Johan Bollen Social media as an agent of socio-economic change: analytics and applications

World we live in increasingly about online connections. First computer had 1KB RAM and was programmable in BASIC. Now can wake up parents in Belgium by FaceTime. Data from 2012: 2.4 billion internet users worldwide (15.6% Africa to 78.6% North America, 67.6% Oceania/Australia). Amount of online content staggering.

Facebook, LiveJournal, Twitter… We’re not using these networks to broadcast – they’re to collaborate socially. Many-to-many. Generates content and establishes social relations — collaboratively.

Displays xkcd cartoon re ubiquity of phones and map of usage of Twitter and Flickr. Visualising languages spoken; what things are being downloaded. Using Twitter to map discussion of beer vs church. And using it to monitor outbreaks of flu.

Wikipedia using collaboration to create content. Estimize using it to predict markets.
“Prevailing pessimism about large groups collaborating in a productive manner, absent central authority, may not be justified.” From the “madness of crowds” (wacky ideas) to “the wisdom of crowds”. On “Who wants to be a millionaire”, asking an expert gets it right 65%, asking the audience 91% right. When you ask people questions they have to guesstimate an answer to, “the average of two guesses from one individual was more accurate than either guess alone”.

Galton (1907), Nature, 75(1949):450-451 – aggregating people’s judgements of the weight of a dressed ox got within 1% of the true weight.
Condorcet Jury Theorem (1785) – even if jurors are individually only slightly more likely than not to be right, with a majority vote the chance of being right approaches unity as the jury grows.
Collective intelligence – birds flocking, ants finding food.
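
To make the Condorcet Jury Theorem concrete, here is a minimal sketch (mine, not from the talk) that computes the chance a majority vote is right. It assumes jurors vote independently and each is right with the same probability p; the function name and the example value p = 0.55 are illustrative choices only.

```python
from math import comb


def majority_correct(n_jurors: int, p: float) -> float:
    """Probability that a strict majority of n independent jurors is right.

    Assumes every juror is right with the same probability p and that
    votes are independent (the textbook form of the theorem).
    """
    if n_jurors % 2 == 0:
        raise ValueError("use an odd number of jurors to avoid ties")
    needed = n_jurors // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_jurors, k) * p**k * (1 - p) ** (n_jurors - k)
        for k in range(needed, n_jurors + 1)
    )


# Jurors who are only slightly better than a coin flip (p = 0.55, assumed).
for n in (1, 11, 101):
    print(n, round(majority_correct(n, 0.55), 4))
```

The same function shows the flip side: with p below 0.5 the majority gets worse, not better, as the jury grows.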

We have telescopes to look at huge things, microscopes to look at tiny things – we need a macroscope to look at really complex things: this is computational social science studying data generated by social media. Network analysis. Natural language processing.

Epictetus “Men are disturbed, not by things, but by the principles and notions which they form concerning things”.

Sentiment analysis. eg “Affective Norms for English Words” rated along valence, arousal and dominance, OpinionFinder, SentiWordNet. We understand individual emotions well, not so much collective emotions. Diagram charting fluctuations in collective mood based on Twitter feeds; correlating with market fluctuations – discovered that the Twitter ‘calm’ mood correlated with increase in Dow three days in advance 85%. Other results have largely confirmed this using Google Trends, using dataset from LiveJournal posts.
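
A rough sketch of how lexicon-based mood scoring of this kind can work: look up each word's valence in a word list, average per message, then average per day. This is my own illustration, not the speaker's pipeline; the word list and its scores are made up (they are not real ANEW ratings) and the function names are hypothetical.

```python
import re
from statistics import mean

# Hypothetical valence scores on a 1-9 scale. These are NOT real ANEW ratings.
TOY_VALENCE = {
    "happy": 8.0,
    "calm": 7.0,
    "love": 8.5,
    "worried": 2.5,
    "sad": 2.0,
    "angry": 2.5,
}


def message_valence(text):
    """Mean valence of the lexicon words in one message, or None if none match."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [TOY_VALENCE[w] for w in words if w in TOY_VALENCE]
    return mean(scores) if scores else None


def daily_mood(messages):
    """Mean valence over all messages that contained at least one lexicon word."""
    scored = [v for v in map(message_valence, messages) if v is not None]
    return mean(scored) if scored else None


print(daily_mood(["So happy and calm today", "Worried about the markets"]))
```

A daily series built this way is the sort of thing that could then be compared against a market index, though any real study would need a proper lexicon and far more data.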

Where does collective emotion come from? Is it more than the sum of individual emotions? Do sad people flock together or do they make each other sad? Homophily (bird of a feather) prevalent in social networks. People connected to lots of people tend to be connected to other people who are connected to lots of people. (Ie the popular kids hang out with each other.) Image of political homophily on Twitter. So does mood act in the same way? Looked at reciprocal following on Twitter. Found small cluster of negative-emotion users, and larger cluster of positive-emotion users. (Don’t know where causation is.) The closer the friendship, the more reliable this was.

Application to bibliometrics: got rejected from journals so published on arXiv and got massively read and within a month cited. So looked at arXiv papers and found a weak correlation between Twitter mentions and early citations. But the problem with altmetrics: the biggest nodes are the media, big blogs etc. The number of mentions doesn’t matter as much as who is mentioning.

Radical proposal for funding science (developed over alcohol-fueled Christmas party grumps about writing funding proposals). (Motto: “What would the aliens say?”) Fund people not projects. Science as gift-economy. Encourage innovation. Change scholarly incentives for the better. Congress should give money to scientific community – every scientist gets an equal chunk, but you have to donate a certain percentage to anyone you want (who have to donate a percentage of what they’ve received). Would lead to an uneven “but fair” distribution. [My criticism: would be susceptible to issues of implicit bias against women, people of colour, etc. However don’t know if it’d be more or less susceptible to these problems than the current system is.] Ran a simulation using network data: when F=0.5 it matches the distribution by the NSF and NIH.
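
To see how the redistribution could play out, here is a toy sketch of the scheme as I understood it: everyone gets the same base amount each round and must pass on a fraction F of everything they received the round before. The three-person network, the amounts and the function name are all hypothetical; the speaker's own simulation used real network data.

```python
def simulate_funding(weights, base=100.0, fraction=0.5, rounds=50):
    """Return the amount each scientist keeps after the scheme settles down.

    weights[j][i] is the share of j's donations that goes to i; each row
    sums to 1 and nobody donates to themselves (assumed for this sketch).
    """
    n = len(weights)
    received = [base] * n
    for _ in range(rounds):
        # Everyone passes on a fixed fraction of what they received last round.
        donated = [fraction * r for r in received]
        received = [
            base + sum(donated[j] * weights[j][i] for j in range(n))
            for i in range(n)
        ]
    return [(1 - fraction) * r for r in received]


# Hypothetical community: scientists 0 and 1 both favour 2; 2 splits evenly.
weights = [
    [0.0, 0.2, 0.8],
    [0.2, 0.0, 0.8],
    [0.5, 0.5, 0.0],
]
print([round(k, 2) for k in simulate_funding(weights)])
```

The total kept stays equal to the total handed out, so the scheme only changes who ends up with the money: the scientist others favour keeps more than the base amount and the rest keep less.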

Q: Risk of feedback loops?
A: Yes – citing hacking of Twitter account to post about bombs in White House leading to massive market shorting – not just people getting freaked out, algorithms getting freaked out. Positive feedback loops bad news – hopefully can set up things so instead you’ll get negative feedback loops that lead to homeostasis. Can only mitigate problems by understanding how things work.

Innovate #vala14 #s13 #s14 #s15

Hue Thi Pham and Kerry Tanner Influences of technology on collaboration between academics and librarians

Interrelationships between collaboration, institutional structure, and technology.
Things like Google Apps tend to be used within departments – less use on smaller campuses because more casual face-to-face interaction. Level of use varies by discipline, faculty, campus.
Social technologies like Twitter used in lectures
Learning management system (eg Moodle) most important technology mentioned in interviews.
Institutional repository common space for depositing resources

Technology facilitating transition from traditional to digital library – more electronic resources, communicating over telephone, email, Skype. But purely online interaction means a reduced mutual understanding of partners’ contributions, and an old perception of librarians’ roles.

Divide between library system and learning management system leads to a divide between the two communities around these. Librarians complain they can’t do a workshop about an assignment without Moodle access to see the assignment. Academics say they think librarians could have a role but they don’t understand why they would need access or what they would do with it. Lack of coordination can be a problem – means LMS people and library people make decisions that each other isn’t aware of. Siloisation.

Library staff need to consider roles of interpersonal interaction with technology – value of tech, value of face-to-face interaction, importance of space design / architecture. Get automatic access to learning management system but avoid resulting workload. Need to find ways to integrate library management system with learning management system.

Audience comment: Involvement of librarian in discussion boards can be useful – some topics the academics are relieved to leave to librarian. But important to have awareness of mutual roles.

Lisa Ogle and Kai Jin Chen Just accept it! Increasing researcher input into the business of research outputs

Implementing Symplectic Elements at UoNewcastle. (37,000 students, 1000 academics plus 1500 professional staff) HERDC is reporting exercise to Australian government to secure funding – sounds similar to New Zealand’s PBRF. Work managed by research division but most data entry done by admin folk. Issues include duplicate data entry, variance in data quality, many publications never reported – funding missed out on. Library asked to assist from 2005 – centralised model addresses many issues.

Various identification mechanisms: scholarly databases, researchers, conference lists, uni website, library orders. All put manually into Endnote library, then manually copy/pasted into Callista database. Labour-intensive and would often be a 2-6 month delay for researchers, very frustrating.

Getting Elements. Loved harvesting from databases (based on search settings: “We think this is your publication, please log in to claim or reject it”). Originally not keen on opening up to researchers, but after demos got convinced researchers could add manual entry without compromising data quality as library/research staff can verify and lock it.

Benefits: database searches can be customised to minimise false positives/negatives. Can delegate others to act on researchers’ behalf. Publications appear on profile within 48 hours. Can upload Endnote libraries. Can include ‘in press’ publications without messing up workflow. Easily generate publication lists. Capture of bibliometric data. Pretty graphs on user’s dashboard.

Have been running 4 months; two thirds of publishing academics have logged in and interacted with system. (800 in first two weeks, and a lull over summer). 2900 publications in the system from current collection year (usually 3500).

Challenges: early adopter in Australian market. Development module took longer than expected – learned that everyone does HERDC differently.

Most negative feedback so far is from people who haven’t yet logged into the system. Someone complaining it was too hard – talked her through it over the phone and now fine.

Need to investigate further repository integration.

Malcolm Wolski and Joanna Richardson Terra Nova: a new land for librarians?
Big issues emerging around vast amounts of data and trying to connect it. Global connectedness another impact.

Researchers needing a “dry lab” to work with data instead of hands-on wet-lab. Seeing this in many areas.
Researchers can’t afford to work solo any more. Many infrastructure costs are beyond reach of individual researcher or individual centre. Problems are too much for one person.
Can get storage and computing power – but may need to work with data for ten years so need to be able to retain it and keep working on it through changing technology. Lots of outputs are governmental reports not journal articles.
Most large research projects these days involve communities – even incorporated bodies.
80% of papers in the EU are of people collaborating with people outside their institution.

NeCTAR have invested heavily in virtual laboratories because it’s not just about creating data but using it – of course this creates more data.
In theory nothing stops a researcher going to Research Data Storage Infrastructure for storage without their university knowing.
Various community solutions like Tropical Data Hub, Australian National Corpus – slide lists a pile and he points out that for each of these, some institution has put their hand up to take responsibility for maintenance.

Approach of institutions keeping their own data but having to share metadata. Requires lots of discussion around data schemas – what you expect to find in data descriptions. Eg Research Data Australia from 85 participating organisations and growing. Goal to get more data, better connected data, more findable/usable.

Two impacts around:
Research tools: New suite from NeCTAR and ANDS eg virtual laboratories, discipline-specific tools. Need to choose which we’ll support, which data collection schemes we’ll be involved in. May need to develop our own tools for specific disciplines.
Library/research collaboration: Moving more to a partnership model.

Libraries provide support for data management plans and citing data, but there’s huge demand for archiving/preserving data.

Impact on university libraries:

  • New jobs coming out for the “databrarian”.
  • Need research services to help develop common data structures
  • Participation in cross-disciplinary teams bringing librarian skills
  • Development of legal frameworks for acquiring, generating, storing and sharing data
  • Assisting with development of tools – lots of disciplines have different ways of exploring/analysing data so national collections/communities may have specific search (eg maps, chemical structure, vs facets) or visualisation tools.
  • Archiving and preservation services

Librarian support roles

  • Sourcing relevant data sets
  • Consultancy – identify faculty needs, refer back to experts
  • Targeted outreach services re data citation or data repositories
  • New support service tools and processes

Want to be able to offer a service to researchers and them not have to worry about where it’s stored, whether on campus or Amazon Web Services or whatever.

Cloud gazing #vala14 #s8 and #s9

Michelle McLean, Residing in the cloud: looking at the forecast now and into the future
Service models:
Software as a service (LibGuides, Office365, HathiTrust)
Platform as a service (eg Yahoo Pipes, OCLC Web Services, Google App Engine)
Infrastructure as a service (British Library, Library of Congress, My Kansas Library)

Deployment models:
Private cloud
community cloud
hybrid cloud
public cloud

Essential characteristics:
Resource pooling
rapid elasticity
on-demand self-service
measured service
broad network access

Pros

  • Scale and cost
  • Change management done for you – you don’t have to worry about upgrades
  • Choice and agility – if you want something new just pay and you get it
  • Next-generation architecture
  • IT isn’t a library core business – let the experts do it. Better security, better sustainability, better reliability

Cons

  • Security – when people leave need to remove their access right away because access through the web. All big companies have had failures
  • Lock-in. Need to be sure you can take your data with you if you leave
  • Lack of control. If the website is down where is the problem?
  • Financial savings mightn’t be as good as predicted.
  • You can shed your IT expertise if you outsource, but then you lose your first point of trouble-shooting.

Preparing for the cloud
Consider security, privacy, access, law, lock-in, whether it’s right for your business.
Cloud computing services are marginally more reliable than IT departments (99% vs 98% uptime). So make sure you have backup systems.
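
To put those uptime figures in perspective, a quick back-of-envelope calculation (my own arithmetic, assuming a 365-day year and downtime spread over all hours):

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours(uptime_percent: float) -> float:
    """Hours per year a service is unavailable at the given uptime percentage."""
    return HOURS_PER_YEAR * (100 - uptime_percent) / 100


for label, uptime in (("cloud service", 99.0), ("in-house IT", 98.0)):
    hours = downtime_hours(uptime)
    print(f"{label}: {uptime}% uptime is about {hours:.1f} hours down per year")
```

Even the better figure works out to several days of outage a year, which is why the backup advice matters either way.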

Derek Whitehead All on the ground: there is no cloud
Metaphor of cloud as fluffy, friendly, faraway – slideshows never show stormclouds!
Behind the metaphor nothing’s actually in the cloud, they’re in servers in a building on the ground in a legal jurisdiction (not always ours).

There are four basic perspectives on the cloud:

  • Technology
  • Content – “information located remotely” but information is rarely independent of computation
  • Personal – companies want us to locate our info elsewhere than our own computers so they can ‘develop a relationship’ with us [lovely euphemism there! -Deborah]
  • Legal – jurisdiction makes a difference though not quite as simple as “in Australia = free of PATRIOT Act”. Frequently mirrored, moved around, using redundancy to safeguard info. People mostly concerned about privacy legislation – strong in Australia and Europe.

Swinburne’s policy is to externally host/manage most where possible – “opportunistic vendor hosting”. Student email; HR; learning management system, library system, OJS, etc.

What do we want the cloud people to do for us? Vendor cloud hosting vs service aggregator provision. Huge range of hybrid or multisource options. But services have to be efficient, reliable, high quality, fast to access, and cost-effective.

Why would we do it? When a kid, generated own electricity – not a great way to live. Thinks IT will one day look back at the idea of having your own server in your basement in the same way. Cost minimisation, efficiency, economies of scale — all of these issues. Security is an issue because bigger targets for hackers, but also have bigger resources to defend against them.

Will need a realignment in skillsets. Getting ability to read/write/negotiate contracts is vital.
But libraries are leaders. Remember when we moved from print to CD-ROMs? (Okay, this was the wrong direction…)
Exit strategies where possible – harder in monopoly situations.
Helped by clear customer benefits and freeing up buildings. Libraries have access to economies of scale, we’re comfortable with automation, it benefits collaboration.

Q: What’s the customer experience of change to the cloud?
A: Infrastructure/management should be invisible to customers. But having info in the cloud brings huge benefits: eg huge increase in number of articles used by academics when they can get them from their desktop.

Q: What if things go wrong?
A: With an external host you’ll have remedies in the contract if things go wrong – no such remedy if you stuff up yourself!

Big data, little data, no data – Christine Borgman #vala14 #p1

Big data, little data, no data: scholarship in the networked world

Technological advances in mediated communication – have gone from writing to computers to social media and these are cumulative: we use all of these concurrently. And increasingly thinking of these in terms of data. Need to think about new infrastructures because this will determine what will be there for tomorrow’s students/librarians/archivists.

Australia notable for ANDS, and for movements to open access policies – only place she’s found where managing data is part of (ARC’s) Code for the Responsible Conduct of Research.

Book coming out late 2014/early 2015 – data and scholarship; case studies in data scholarship; data policy and practice. Organised around “provocations”:

  • How do rights, responsibilities, and risks around research data vary by disciplines and stakeholders?
  • How can data be exchanged across domains, contexts, time?
  • How do publication and data differ?
  • What are scholars’ motivations to share?
  • What expertise is needed to manage research data?
  • How can knowledge infrastructures adapt to the needs of scholars and demands of stakeholders?

Until the first journal in 17th century, scholars communicated by private letters. Journals were the beginning of peer review, of opening up knowledge beyond those privileged to exchange letters. –However things began much earlier: brick from 5th-6th century inscribed with Sutra on Dependent Origination. Now we have complete open access in PLOS One. (Shows If We Share Data, Will Anyone Use Them? Data Sharing and Reuse in the Long Tail of Science and Technology.) Lots of journals, preprint servers, institutional repositories to submit to.

Publishing (including peer review) serves to legitimise knowledge; to disseminate it; and to provide access, preservation and curation.

Open access means many things – uses Suber’s “digital, online, free of charge, and free of most copyright and licensing restrictions” definition.

ANDS model of “more Australian researchers reusing research data more often”. Moving from unmanaged, disconnected, invisible, single-use data to managed, connected, findable, reusable data.

Open data has even more definitions: Open Data Commons “free to use, reuse and redistribute”; Royal Society says “accessible, useable, assessable, intelligible”. OECD has 13 conditions. People don’t agree because data’s really messy!

Data aren’t publications
When data’s created it’s not clear who owns it – field researcher, funder, instrument, principal investigator?
Papers are arguments – data are evidence.
Few journals try to peer review data. Some repositories do but most just check the format.

Data aren’t natural objects
What are data? Most places list possibilities; few define what is and isn’t data. Marie Curie’s notebook? A mouse? A map or figure? An astronomical photo – which the public loves, but astronomers don’t agree on what the colours actually mean… 3D figure in PDF (if you have the exact right version of Adobe Acrobat). Social science data where even when specifically designed to share it’s full of footnotes telling you which appendices to read to understand how the questions/methods changed over time…

Data are representations
“Data are representations of observations, objects, or other entities used as evidence of phenomena for the purposes of research or scholarship.”

You think you have problems on catalogue interoperability, try looking at open ontologies intersecting different communities.

Data sharing and reuse depends on infrastructure
You don’t just build an infrastructure and you’re done. They’re complex, interact with communities. Huge amount of provenance important to make sense of data down the line.

Data management is difficult – scholars have a hard enough time managing it for their own reuse let alone someone else’s reuse. Need to think about provenance, property rights, different methods, different theoretical perspectives, “the wonderful thing about standards is there’s so many to choose from”.

Ways to release data:

  • contribute to archive
  • attach to journal article
  • post on local website
  • license on request
  • release on request

These last ones are very effective because people are talking to each other and can exchange tacit knowledge — but it doesn’t scale. The first scales but only works for well-structured and organised data.

So what are we trying to do? Reuse by investigator, collaborators, colleagues, unaffiliated others, future generations/millennia? These are very different purposes and commitments.

Traditional economics (1950s) was based on physical goods – supply and demand. But this doesn’t work with data. Public/private goods distinction doesn’t work with information. There’s no rivalry around the sunset or general knowledge in the way there is around a table or book. So concept of “common pool resources” – libraries, data archives – where goods must be governed.

                       Low subtractability/rivalry    High subtractability/rivalry
Exclusion difficult    public goods                   common pool resources
Exclusion easy         toll or club goods             private goods

While data are unstructured and hard to use they’re private goods. Are we investing to make them toll goods, common pool resources or public goods?

Need to make sustainability decisions – what to keep, why, how, how long, who will govern them, what expertise required?

Q: Health sciences doing well
A: Yes but representation issues. Attempt to outsource mammogram readings fell foul of huge amounts of tacit knowledge required. In genomics attempts to get scientists and drug companies to work together in the open, but complicated situation with journals who say that because the data is out there it’s prior publication when in fact the paper is explaining the science behind it; and issues around (misleading) partial release of data – recommends Goldacre’s Bad Pharma.

Q: Scientists want to know who they’re giving data to. But maybe data citation a way to get scientists on board?
A: Citing data as incentive is a hypothesis. Really sharing data is a gift – if you put it on a repository you don’t have it available to trade to collaborators, funders, new universities. Data as dowry: people getting hired because they have the data.
Agreeing on the citable unit is hard – some people would have a DOI on every cell, others would have a footnote “OECD”. Citation isn’t just about APA vs Blue Book, it’s about citable unit and who gets credit and….