
Huakina te whare ki te ao – Ariana Tikao, Catherine Amey, Anahera Morehu #open17

Ngā Upoko Tukutuku thesaurus created by looking at cataloguing worldview within Te Ao Māori framework. Classifying mātauranga Māori in a Library of Congress framework is pretty hard; but it was also about revitalising te reo. So Ngā Upoko Tukutuku aims to help cataloguers and archivists assign appropriate subject terms; and enable library users to find resources within a mātauranga Māori framework.

Kaupapa are preferred terms, each with a whakamārama (scope note); they are related to Reo-ā-iwi (dialectal variants) and sit within Tāhuhu (broader terms), Heke (narrower terms) etc.

Tukutuku panels made with a person on each side weaving threads back and forth; Ngā Upoko Tukutuku are made in the same way.

Example of frogs – in Māori worldview frogs aren’t part of an ‘amphibian’ category but rather part of aitanga pepeke (animals that jump) so added poraka there.
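
[The term structure described in these notes can be sketched as a small record type. This is only an illustration: the class name, field names and `link` helper below are hypothetical and are not the actual Ngā Upoko Tukutuku data model; the two sample terms come from the frog example above.]

```python
from dataclasses import dataclass, field


@dataclass
class Kaupapa:
    """A preferred term. Field names are hypothetical, for illustration only."""
    term: str
    whakamarama: str = ""                           # scope note (left empty here)
    reo_a_iwi: list = field(default_factory=list)   # dialectal variants
    tahuhu: list = field(default_factory=list)      # broader terms
    heke: list = field(default_factory=list)        # narrower terms


def link(broader: Kaupapa, narrower: Kaupapa) -> None:
    """Record the broader/narrower relationship on both terms."""
    if narrower.term not in broader.heke:
        broader.heke.append(narrower.term)
    if broader.term not in narrower.tahuhu:
        narrower.tahuhu.append(broader.term)


# Frogs sit under "animals that jump" rather than under an amphibian category.
aitanga_pepeke = Kaupapa("Aitanga pepeke")
poraka = Kaupapa("Poraka")
link(aitanga_pepeke, poraka)
```

Keeping the relationship on both records means a cataloguer can navigate up or down from either term without a separate lookup.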

Once had a request for a term for ‘environmental ethics’, but no term for this so added two terms, one for ethics, one for environment. Added scope notes.

Rakiraki – the specific readers inspiring the request were actually about family so suggested using whānau there. But also added rakiraki as it was suitable for other resources about ducks.

Manawaroa for resilience.

Trying to create scope notes that are easy for cataloguers/archivists with little knowledge of mātauranga Māori to understand.

Reo-ā-iwi – Hura kōhatu / Hura kōwhatu; kōkā / māmā / whaea

Opening up the data to the world eg http://miriamposner.com/msh; converting a subset into Linked Data

Feedback, questions, interest in collaboration to reo@dia.govt.nz

Te haerenga o Koha – Kathryn Tyree & Chris Cormack #open17

(If there are mistakes in these notes, the mistakes are mine!)

Koha began in 1999. Trouble was on the way: the year 2000 (Y2K). All the library software was going to break, and so Koha emerged. Today there are 15,000 libraries and 300 contributors.

Every year the contributors meet (Kohacon). France was the first venue, the USA the second and Aotearoa the third. This year the meeting is in the Philippines.

“He rau ringa e oti ai” (many hands make light work). The Koha community is an extended family.

“Nāku te rourou, nāu te rourou, ka ora ai te iwi” (with my food basket and your food basket, the people will thrive).

There is a taniwha in this story: where does our knowledge, and the knowledge of our community, live? In the USA, or in Aotearoa?

There were happy stories too: many small libraries have installed Koha. The tangata whenua of Aotearoa supported the indigenous people of Nunavut.

I asked a question about FOLIO; Catalyst is working on this project. At present Koha talks to Mahara, to Moodle and to other software as well. The hope for FOLIO is that Koha will talk to FOLIO, and FOLIO will talk to all the other software.

Open your arms (and mind) – Mojgan Sadhigi #open17

Open your arms (and mind): A practical approach to connecting libraries with their CALD communities – Best practice for creating programmes with, not for, communities.
CALD = culturally and linguistically diverse communities
Toolkit:
  1. Gather information
    Collect demographics, develop profiles, select who you’d like to work with, identify leaders and the best communication method (eg group meetings – a good idea to go to their meetings, but also invite to ours; community reps, focus groups, advisory team, volunteer programmes).
  2. Connect with partners (and back to #1)
    Approach partners, exchange stories, explore interests – respect the autonomy of their organisation, take time developing trust.
  3. Decide shared goals
    Work out what you each want; define success.
  4. Plan project together
    Choose plan, consult, outline roles, assess risks, be flexible – teamwork is key.
  5. Promote partnership
    If you don’t let communities know what you’re offering it’s all gone to waste. Use your connections to publicise (they might have radio, newsletters, TV station) but also put things in writing; use other events to distribute info, have a stand and staff there. Make sure you use plain English and suitable translation. Visit migration centres, daycares, language schools.
  6. Evaluate your project and what to do next (and back to #1/2)
    Capture stories as well as the number who attended. Look at what worked with the partnership as well as the project.

Examples: exhibitions and displays, celebration of key events, bilingual story telling, conversation classes, movie night, baking


How hard to find volunteers?
Always found one. Often worth asking the people who are busiest! Sometimes daunting to say “We need volunteers”, so can have morning tea to chat, and then can identify the keenest and say “I need your advice” and go from there.

How to get people from specific groups to the library when they’re working every day?
Some groups see library as ‘government’ building and wary of it; so started developing in shopping centres. Also tried a different language. Deliver in their space, gain trust, then can slowly move to the library. Also food usually helps.

 

Open your mind – Vinh Giang #open17

Hard to blog a magic show, but…

Magic is just a problem you can’t solve.

Perspective is power. When focused on a problem you get “change blindness”.  Perspective from a completely different field/pov is needed to not only solve the problem but also see the opportunities in it.

Importance of influence (especially from negative people in your life). “You’re the direct reflection of the top 5 people you spend time with.”

The first step to making something possible is to believe it’s possible. Beliefs dictate actions, so take the first step. You have to be on the journey before you can see step two.

The Dangerous Myth about Librarians – Laurinda Thomas #open17

Laurinda gave a talk at TEDxWellington in 2016, focusing not on the future of libraries but the present of libraries; that we get so caught up in the nostalgia of libraries that we’ve missed how crucial libraries are to society today.

Every day someone comes into the library who’s never been before – what will they think of it all? Librarianship is very old and very adaptable. Of course we’ll survive – but will we flourish? Myth of who we are plays into decisions we make, which affects experience newcomers have.

We take things you’d normally have to pay for and provide them for free. Social entrepreneurs since before there was a word for it. We have a bigger influence than we think and need to remember it.

Change is constant – in terms of the type of change as well as how much, eg financial, technological, societal. We’re already dealing with this change. But we’ve become used to the downwards trajectory of budget cuts; have become used to what we think we do and don’t do.

Wants us to put in not extra effort but extra intentionality – rethink what we do. To date we’ve added things to what we do but don’t really match how our users think and want to use things. Need to be more deliberate and think who are we really here for? what purposes are advanced by what we do?

Cf the UK – libraries didn’t waste away because they weren’t used. Were attacked by “austerity” cuts. Choosing where to cut funding isn’t a politically neutral act. Shows you what the people cutting the money value.

Libraries are powerful. We give people the means to apply for jobs, communicate with family. Easy to misuse power – both deliberately and accidentally. But important to use our power. Words have power – pay attention to the language we use.  eg “We need to remain relevant” ends up getting echoed back from others as “Are libraries still relevant?” How about striving for “responsive” or “customer-focused”? Similarly “Save our libraries” is echoed back as “Libraries are endangered / dying.” Need to use language in a way that spurs us forward instead of holding us back.

We’ve been having the same ‘relevance’ conversation for literally decades. How can we have better conversations? We need to have these conversations with the people who haven’t been in the library for a decade or more. We see every day how vital our services are; need to make other people see this too. To do that, remember we’re not all the same; some people don’t care about social good of library. Find out what they do care about and show them how libraries affect that. Both stories and quantitative numbers so stories don’t just get brushed off as anecdata.

Ask what we’re afraid to ask. And be open to the answers. Don’t need to do all the things – just honestly engage with them.

  • Stop misusing numbers (eg door stats – if 10 fewer people came in the door, we’re not less valuable).
  • Stop relying on how ‘obvious’ our value is
  • Stop being lazy about biculturalism. Have not made as much progress since the 80s as we should have.
  • Stop looking for a single ‘thing’ (especially technology) to save us.
  • Stop avoiding politics. Libraries are not ideologically neutral. We believe in things! We have values and strong views. Don’t be afraid of making enemies; need to own our values. Use our power, as private individuals if not in our professional role.

Value ourselves. The world is full of rules – but we can make new rules. Have courage – ie doing the things that need to be done. Be visible. Need to make our profession impossible to be ignored.


Q&A

How do we challenge budget cuts?
Focus on outcomes – not our traditional outcomes, but the outcomes that people holding the purse-strings care about. Highlight the impact of our skills on the community. Not a simple answer but need to keep having the conversations.

What do you think about the word ‘biculturalism’? (asked in te reo Māori)
Some libraries doing great stuff; a lot haven’t gone beyond some bilingual signs. 20 years ago would have thought we’d all be bilingual by now and we’re definitely not. Need to take responsibility for doing better.

Funding?
Overseas can look for funding from non-government bodies. Many other innovative ways of funding – have a book dedicated to you for a day. Trouble is in NZ with smaller population does the effort justify what you get out of it?

What if we work politically to get wellbeing back into the Local Government Act?
Depends on whether this will be useful influencing those with the purse strings.

Not just aligning with what funders want – but align with what we think they’ll want in future.
Pitch what we’re doing to what’s becoming important to them.

Checking out the Elsevier / U of Florida pilot

One of the papers at Open Repositories 2017 I couldn’t attend was:

Maximize the Visibility and Impact of Open Access and other Articles through integration of Publisher APIs
Letitia Mukherjee (Elsevier), Robert Phillips (University of Florida)

The University of Florida searched for solutions to expand access to university-authored journal articles thru institutional repository. UFL and Elsevier collaborated to automatically feed journal platform data and links to the IR through free APIs. The project enabled UFL to support university authors/researchers and compliance with US public access policies.

I wrote most of this blog post based on what I heard about the presentation at conference, and my own investigations a couple of days later (ie a month ago); I’ve made some small edits and am posting this now after seeing the presentation recording on YouTube.


I first read about this project a year ago in an Inside Higher Ed article (in which Alicia Wise is quoted with an infuriating “The nice thing about this pilot is it opens up the repository”. No, it doesn’t open the repository. The repository was already open. It also doesn’t open up Elsevier content, which remains completely closed) and in a more sceptical blog post (which describes it as turning the repository into “a de facto discovery layer”. From what I can tell, this is being extraordinarily generous: as a discovery layer it doesn’t even make a particularly good Amazon affiliate programme, because Amazon at least pays you a few cents for the privilege of linking to them.)

Before going further I want to make it clear that any and all scathing comments I make in this post are reflective of my opinions about Elsevier stinginess, not about the repository or its staff who are clearly just doing what Elsevier allows them to do. Also I’m writing about the system as it is right now (Phase I). [Phase II was briefly discussed starting about 18:55 in the video and in Q&A at the end of the presentation.]

While still at conference, I heard that Robust Discussion was had following the presentation (and this is captured in the video too). Among other questions, an audience member asked if Elsevier would offer all subscribers the ability to download final accepted manuscripts via API for example (21:59). The eventual answer (after some confusion and clarification) seems to be that it’s not currently available to all subscribers as they’re creating author manuscripts specifically for the pilot and need to work out whether this is scalable (24:44). [This raises the question to me of why. Why not just use the actual author manuscript instead of converting the author manuscript into the publisher manuscript and then apparently converting it back?]

In any case, when I asked the same question at the vendor stall, I was told that if they provided the pdf to repositories, they wouldn’t be able to track usage of it. The vendor also asked me why we’d want to. I talked about preservation, primarily because I foolishly assumed that the system they’ve got with Florida actually worked as advertised to provide ‘public access’ but a couple of days later, somewhat recovered from the exhaustion of conference, I had second thoughts. Because of course the other things that we want are full-text searching and access via Google Scholar. Also access for the general public, not just our own university. Also, well, access at all. I thought this went without saying until I actually began to test how it works in practice.

So the University of Florida’s repository is IR@UF. I ran a general search for {Elsevier} and turned up 32,987 results. I chose an early result that wasn’t from the Lancet because the Lancet is a special snowflake: “(1 1 0) and (1 0 0) Sidewall-oriented FinFETs: A performance and reliability investigation”. The result is plastered honestly with “Publisher version: Check access”.

Is it open access? I clicked on the title. Elsevier has made much of “embedding” the content in the repository. I think this is in fact intended for phase II but they’d managed to give the impression that it was already in place so at this point I expected to be taken to a repository page with a PDF embedded in an iframe or possibly some unholy Flash content. Instead, I was taken straight to the item pay-to-download page on Elsevier. Further exploration uncovered no additional ways to access the article. So there’s no access to the public: it’s not open access and it does absolutely nothing to support “compliance with US public access policies”.

Is it easily accessible to institution members? If I was a UFL student or staff member who happened to be off-campus (say, at a conference, or researching from home) there’s no visible way to login to access the article. I assume UFL has IP access to content in which case it’d work on campus or through a VPN, but that’s it.

Is it findable through full-text search? I dug up access through my own library to download the pdf so I could select a phrase early on in the full-text that didn’t appear in the title or abstract. But doing a full-text search in IR@UF for {“nMOS FinFET devices”} resulted in “Your search returned no results”.

(Just to be sure the full-text search was working, I also tried it with a phrase from the title, {“Sidewall-oriented FinFETs”}, which did bring up the desired article. The link from this result is broken, though, which is presumably a bug in the implementation of the scheme, since links for non-Elsevier results on similar full-text searches are fine.)

Is it findable via Google Scholar? Scholar lists 6 records for the article, none of which are via IR@UF. Not, at this point, that there’s any advantage to seeing the IR@UF version anyway, but the pilot is certainly not driving traffic to the repository.

Is it a discovery layer? Even aside from the lack of full-text search and the inability to get access off-campus, it only works for ScienceDirect articles by UFL authors, so no.

If I had to come up with an analogy for what it is and does, I guess I’d say it’s a bit like a public-facing RIMS or CRIS, except those would include more data sources and more reporting functionality.

So to answer the question as I could have if I’d realised how limited this functionality is: why do institutional repositories want to have the full text?

  • to make it discoverable via full-text searching
  • to provide easy access for our own institution’s members
  • to provide open access for the rest of the world
    • thereby increasing its impact (including but not limited to that measured in citations and altmetrics)
  • to ensure it’s preserved and accessible for the centuries to come
  • to bring traffic to our own repository and the rest of its valuable collections; and
  • to track usage.
    UFL’s repository can do this last one. Sort of. It’s got a page for “Views” (hits) and “Visits” (unique visitors). But it doesn’t tell us how many of these visitors actually succeeded in accessing the full-text. My suspicion is that this number would be much lower.

Phase II, if it works as advertised, may address some of these issues, but I’m not sure how many. I feel we’re getting conflicting messages of how it will actually function and at this point am not inclined to believe anything until I see it in action. For now it’s the same as any other vapourware.

Round-up of 16 #or2017-related sessions

For #or2017 I attended 16 sessions (including satellite events), most of which include 3 presentations or 8 lightning talks, so there’s a lot of information that’s gone into my head over the last week. (As a result I now have 26 new items on my To Do list, which range from “Check X is on Y’s radar” to “Found a new national conference”.)

Below is a summary with key points and highlighting of things I particularly want to remember for some reason [plus thoughts of my own].

Monday – CAUL Research Repository Community Day

  • Session 1: APCs not a solution so need to strengthen repositories. [This doesn’t entirely follow because there’s an excluded and oft-forgotten middle: gold OA journals that don’t charge APCs but are funded through other streams. But as it happens I do believe in repositories too or I probably wouldn’t be here.] Discussion about forming an Australasian (and/or New Zealand) formal consortium to make it easier to feed into COAR etc. NISO “Free_to_read” and “License_ref” tags [which I need to find out more about and how they work in the OAI context].
  • Session 2: ORCID developments; repository self-assessment and repository metadata output health-check with some suggested standards from the point of view of one aggregator (Trove); two views on dealing with non-traditional and creative works.
  • When the Australians started talking about REF, the Kiwis bailed. My suggestion to talk about PBRF was overwhelmingly voted down. We had a robust discussion about metadata instead.
  • I also squeezed in a visit to the State Library. I liked their coffee tables as advertisement: they were printed with a nice design listing the services they had on offer. And of course their Digital Futures space: the kinetic sand with a sensor/projector above that sensed the height of the sand and projected colours and contour lines accordingly; the tin can connected to wire flowers where you touch a flower and hear a random comment about the future from a previous visitor; and VR and touchscreens and stickers to put on a paper timeline mounted along the wall.

Tuesday

  • Getting started with Angular 2 and DSpace workshop: 2 parts: background on what Angular is [I understand it so much better now! This was a far better explanation than any of the ones I’d tried while struggling with Primo’s new UI] followed by a hands-on working through the exercises. [This was a little quick for me but I managed to catch up using the github code as reference, and only failed on step 2 because the presenters missed a step too. 🙂 So I came out feeling very accomplished… though I still hate dependencies.]
  • FOLIO presentation hosted by EBSCO – “a community collaboration to develop an open source Library Services Platform (LSP) designed for innovation”. Still very early stages but the “APIs all the way down” and responsiveness of the architecture is nice; worth keeping an eye on as it develops more modules.
  • Electronic poster presentation: researcher metrics dashboards; CORE Repository Dashboard; IIIF image framework; OA retrospective theses; increasing OA content in your repository; improving the DSpace workflow

Wednesday

  • Perverse incentives: how the reward structures of academia are getting in the way of scholarly communication and good science: basic introduction to scholarly communication and the need for OA from a mathematician’s perspective
  • Research and non-publications repositories, Open Science: 8 lightning presentations: an intro to IIIF but most of the rest were about research data, including RDM training; data paper publication best practices from a journal’s perspective; data management plan record
  • Scholarly workflows:
    • “Scholarly Tools…” looked beyond RDM and [where I usually think of managing/publishing code, methods] talked about research tools of which there are a bazillion – it more raised the scope of the issue than provided a firm path forward but that’s fair at this point!
    • “Research Offices As Vital Factors…” was an inspiring view from a research office that Gets RDM – might be a useful primer for other research offices.
  • Demonstrating impact: “A New Approach for Measuring Value…” mentioned the idea of value as beyond a simple dollar figure to basic/expected/desired/unanticipated value. By contrast “How to Speak Business Case” talked about how to get down to the kind of value that speaks to project managers.

Thursday

  • Repository admin and integration: 8 more lightning presentations.
    • “Mind the Gap!…” proposed ResourceSync as a replacement to OAI-PMH (retaining the latter for legacy purposes as appropriate) due to various advantages.
    • “Leading the Charge…” mentions the imminent “UK Scholarly Communications License” on the Harvard model which would be a great extension of precedent.
    • “Towards an Understanding…” talks of driver behind The Conversation to improve govt/public awareness/understanding of new research.
    • “Batch processes…” described a workflow for semi-automating identifying of low-hanging fruit to gather/import into IR. [I want to check with our workflows to see if this might help or if our workflows with Elements are about as efficient already.]
  • Extending DSpace:
    • “Archiving Sensitive Data” was awe-inspiring [albeit irrelevant to me].
    • “Full integration of Piwik analytics” was relevant to me [due especially to us I think stuffing up what analytics DSpace does give us – but probably a bit too technically challenging].
    • “The Request a Copy Button” suggested it’s possible to get it working sensibly if we ever decide it’s worth it for us.
  • Evaluation and assessment:
    • “Cambridge’s journey towards Open Access” is not that different from ours [which is heartening]. “Open Access policy 3 years in” at UniSA has a stronger mandate than us and still low deposit rate [ditto]; pre-population with CrossRef lookup on DOI is nice. [Probably replicates the functionality in Elements.]
    • “Self-Auditing as a Trusted Digital Repository” sounds like a pain in the proverbial though useful if you can bear to.
  • Integrating DSpace: “Harvesting a Rich Crop” on multi-tenancy DSpace. “DSpace in the centre” on Elements/DSpace integration. “DSpace for Cultural Heritage” introduces DSpace-GLAM with IIIF-compliant image viewer, audio-visual streaming, dataset visualisation.
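
[The “pre-population with CrossRef lookup on DOI” idea mentioned above can be sketched roughly as follows. This is not UniSA’s or Elements’ implementation: the target field names are hypothetical, the sample record is made up rather than real metadata, and the HTTP request itself is left out. The only thing taken from Crossref is the shape of its public REST API, where a record for one DOI lives under /works/ and the metadata sits in a “message” object.]

```python
from urllib.parse import quote

CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_url(doi: str) -> str:
    """Build the Crossref REST API URL for a single DOI."""
    return CROSSREF_WORKS + quote(doi, safe="/")


def to_deposit_fields(message: dict) -> dict:
    """Map a Crossref 'message' object to a few hypothetical deposit-form fields."""
    authors = [
        " ".join(part for part in (a.get("given"), a.get("family")) if part)
        for a in message.get("author", [])
    ]
    date_parts = message.get("issued", {}).get("date-parts", [[]])
    year = date_parts[0][0] if date_parts and date_parts[0] else None
    return {
        "doi": message.get("DOI"),
        "title": (message.get("title") or [None])[0],
        "journal": (message.get("container-title") or [None])[0],
        "authors": authors,
        "year": year,
    }


# Made-up sample in the shape of a Crossref response; not real metadata.
sample = {
    "DOI": "10.1234/example",
    "title": ["An example article"],
    "container-title": ["Journal of Examples"],
    "author": [{"given": "A.", "family": "Author"}],
    "issued": {"date-parts": [[2017, 6]]},
}
fields = to_deposit_fields(sample)
```

The mapping is deliberately forgiving about missing keys, since a depositor should still get a partly filled form when a record is incomplete.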

Friday:

  • Institutional Publications Repositories and beyond:
    • “Curating, But Still Not Mediating” on appreciative demanding of README files asap under the principle of “The best time to plant a tree is 20 years ago; the second best time is now”.
    • “Uniform metadata for Finnish repositories” was determined by a national working group. [This has now inspired discussions about doing the same in New Zealand. I approve the idea but mourn the cleanup I’ll have to do in our repository… or maybe just in our OAI crosswalk…]
    • “Isomorphic Pressures…” looks at difference in IR ecosystem in Japan cf the USA and specifically factors influencing this: regulatory/coercive pressures; cognitive/mimetic pressures; normative pressures. [I like big words for new ways to think of things.]
    • “The role of the repository…” spins the citation advantage concept to do an analysis of the altmetric advantage of depositing in a repository. They find one.
    • “Scholarly Identity and Author Rights…” on popularity of workshops on creating your researcher profile.
    • And I got a chocolate koala for finishing my own lightning presentation on time. 😀
  • Ideas Challenge:
    • “Data Pickle” modelled on ThisToThat.com should definitely be a thing.
    • “Global Connections” – I don’t know how well this would work in practice but having seen what machine learning does with Resene paint colours and Doctor Who titles I’d actually really like to see it generating metadata (and/or, per my question/suggestion, simply skipping a step and generating new research…)
    • “Brisbane Declaration ON the Elimination Of Keywords (B-DONEOK)” – if there is a mass global wave of cataloguers murdering institutional repository folk in the next week, you know why.
  • Beyond Repositories: From Resource-oriented towards Problem-solving-oriented: I didn’t blog this well, it was very dense and full of ideas that are simultaneously catching up with things I see and well ahead in others – especially well ahead in determination to grab hold of it all and go for it.

And finally, a photo from the gala at the museum:

me in a nice dress and gladiator helmet, with sword and shield

All dressed up to attack messy metadata.

 

Beyond Repositories: Problem-solving-oriented #or2017

Beyond Repositories: From Resource-oriented towards Problem-solving-oriented by Dr Xiaolin Zhang, National Science Library, Chinese Academy of Sciences

With the ubiquitous deployment of digital ecosystems, developing repositories to meet next generation needs and functions has become an imperative and an increasingly active effort. However, a paradigmatic shift may be needed to prepare repositories to go outside the resource-orientation box, as JISC report “The future of data-driven decision making” puts it, “[I]t is not sufficient simply to focus on exposing, collecting, storing, and sharing data in the raw. It is what you do with it (and when) that counts”.

The presentation first discusses the emerging digital ecosystems in research, learning, publishing, smart campus/cities, knowledge analytics, etc. where traditional content/repositories are just a small part of stories.

Then an exploration is made about making repositories embedded into, integrated with, and proactively contributing to user problem-solving workflows in digital ecosystems such as scholar hub, research informatics, open science, learning analytics, research management, and other situations.

Further effort is attempted to understand (admittedly preliminarily) strategies for repositories to be transformed into part of problem-solving-oriented services, including, but not limited to, 1) enhancing the interoperability to be re-usable to third-party “users”, 2) developing repositories into smart content with application contexts, and 3) developing smart contextualization capabilities to better serve multiple, varied, and dynamically integrating problem-solving processes.

[I’ve previously blogged a keynote by Dr Zhang at THETA 2015.] He has a new perspective since moving jobs two years ago.

104 research institutes, 55,000 researchers. Various repositories eg NSFC Repository for Basic Research, CALIS IR portal of 40+ universities. Research data sharing platform, and Chinese Academy of Science distributed research data management and integrative service platform.

  1. Changes in the digital ecosystems
    • Steady progress of repositories but numbers don’t tell the story – better to look at how users use it. Most still collection based and local applications are the main service. What if we move away from repository-based approach. Imagine new scenarios out in society. What do they need?
    • All media and content can be data (including processes, relations, IoT devices, tweets). Can be smart – and semantic publishing will be the new normal. Knowledge as a Service.
    • Transformation from subscription to open access. Born digital = born linkable.
    • eScience is the knowledge system – opening up data-intensive scientific discovery. Not just about access, it’s a different way of doing science
    • Open Science again more than open access, but open evaluation, open process, open collaboration. (Displays open science taxonomy). Even social science now incorporating computational methods.
    • eLearning creating a new knowledge ecosystem. Things changing quickly. In the classroom everyone (200 students) uploading content and system going down even though made plans for it only 2 years ago. Flipped classrooms where students do work before the class in digitally collaborative environments, multimedia-rich laboratories so students can interact with each other. Requires intelligent campus and services. eStudent Center where student’s whole learning life is together to be analysed; university center can look at trends etc
    • Knowledge analytics – converging data science, computer science, information science. Open source tools for data visualisation and analysis. Data analytics can become new infrastructure
    • Moving into the Machine Learning Age? 7.5 million university graduates every year in China
  2. Explorations to re-orient repositories
    • Towards working labs: Elsevier Knowledge Platform; WDCM
    • From resources to problem solving, eg digital healthcare needing knowledge from literature but also from wearables and other devices; eg intelligent cities with data, linking, analysis, to answer questions.
  3. Challenges in re-developing repositories
    • Re-purpose and reposition repositories? but outside the scholarly communication environment? Eg using big data in smart cities – scholarly knowledge plays a huge role here. Eg learning analytics where we combine data on students (grades, interactions on Moodle).
    • Cycle: environmental scanning -> idea/design/testing -> R&D -> data management -> Data analysis -> dissemination -> preservation/reuse -> evaluation -> environmental scanning
    • Interoperability cf W3C recommendations
    • Identify/select/develop/integrate value-added services (not all work together, but some aren’t meant to). How to turn content into computable data? How to develop rich and smart media resources? eg How to turn powerpoints into actionable data?
    • Working on automatic translation, domain interaction dynamics, scientometrics tools, social network metrics, automatic thesaurus/k-graph development. Hard for students to select a topic when there’s open-source tools already out there about it! Calculations and results become objects to be reused.
    • Representing knowledge with knowledge graphs, which can enable intelligent applications. Text analytics, RDF data management. eg SpringerNature SciGraph – turning all papers into semantic network of knowledge.
    • Too many vocabularies! Some used by many people, some very common (eg Schema.org) and general – but also very specific ones eg neuron ontology; Internet of Things developing their own. Ontology mapping tools? Cross-language linking of knowledge graphs and smart data eg Chinese/English Wikipedia pages.
    • What about when live machines join the integration and we put our data into real-life processes? Geospatial/temporal/event/methods/workflows-identifiable.

Are these real life scenarios really relevant to our repositories? If not we’ve got a problem! Is what we’re doing now getting us into these scenarios? Are we talking/collaborating with people in these scenarios? They’re not necessarily going to approach us! Time for us to think and act before it’s too late.

Ideas Challenge presentations #or2017

Challenge to solve an existing problem with emerging technologies.

Data Pickle

Researcher wanted to upload data but didn’t know how to wrap it up. So cf ThisToThat for gluing thing A to thing B. Let’s make this for data.

Package Shapefiles for Preservation. click “PICKLE!” and it recommends a) the best practice and b) the minimum requirements.

Crowdsourced but curated information for various options.

Technology handshake to achieve Australasia PMC

Right now have EuropePMC and CanadaPMC (child nodes of US PubMed Central which has 27 million references). So create AustralasiaPMC so PMC can link to OA articles. Can populate PMC with clumsy markup so need clever handshake technology to make full-text available in children and parent nodes simultaneously.

Simple

Museum is an interface for scientific information to general public. But takes too long for simplified explanations of science (from eg journals) to general public, and journalists don’t always guard scientific integrity.

Want to do a better job of spreading info through social media. Natural language processing to create automated simplified summaries from technical abstracts; push notification to simple.wikipedia.org proposal pages so they can create or add to articles; Google translate for other languages.

Put it all together and you get communication immediately after acceptance, being picked up correctly by major news outlets.

(In Q&A: hard to contextualise. Audience notes researchers want to say ‘further research needed’ while lay people want to know what the answer is.)

FuturEpa

The technology we’ll use in future repositories has already been written – GitHub is full of work in progress – some people know about it but not all of us. Pull code automatically from everywhere, put it together, throw data in, see if it works.

Plan A – artificial intelligence – most advanced AI right now is self-driving car, so jump in front of one with the repository and the car can evaluate it and then run you over.

Plan B – use humans

(In Q&A: Kim Shepherd suggests looking at the number of forks on GitHub projects – what percentage might be active, what percentage should we have merged in.)

Global Connections

Deep learning for repository deposit – use existing repository PDFs and metadata to train AI to a) create structured metadata for unstructured content (ie articles), find relevant articles, add structured metadata.

Slice ‘n’ Dice: API-X + XProc-Z

XProc-Z is a simple web server framework HTTP request -> -> -> HTTP response (especially useful for proxies)

API-X for plumbing together microservices.

GET request for info on resource – API-X intercepts/proxies, tweaks, and makes request to server, retrieves result, wraps in a header, tweaks and returns to user.

Don’t need to develop code, just write a text file in XProc language so you can test out what it looks like and you don’t need to wait for repository support. Signposting; generating IIIF manifests; add OAuth authenticating; adding CSS.
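The intercept, tweak, forward, wrap and return flow described above can be sketched roughly as follows. This is only an illustration in Python, not API-X or XProc-Z code (the real thing would be a pipeline written in XProc); the `backend` function, the header values and the `example.org` link are all made-up stand-ins.

```python
from typing import Callable, Dict, Tuple

Request = Dict[str, str]
Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)


def backend(request: Request) -> Response:
    """Hypothetical stand-in for the repository server behind the proxy."""
    return 200, {"Content-Type": "text/plain"}, f"resource at {request['path']}"


def tweak_request(request: Request) -> Request:
    """Adjust the incoming request before forwarding (here: default an Accept value)."""
    forwarded = dict(request)  # copy so the caller's request is left untouched
    forwarded.setdefault("accept", "text/plain")
    return forwarded


def tweak_response(response: Response) -> Response:
    """Wrap the server's result in an extra header before returning it to the user."""
    status, headers, body = response
    wrapped = dict(headers)
    # Illustrative signposting-style Link header; the URL is an assumption.
    wrapped["Link"] = '<https://example.org/about>; rel="describedby"'
    return status, wrapped, body


def proxy(request: Request, server: Callable[[Request], Response] = backend) -> Response:
    """Intercept a request, tweak it, call the server, tweak the result, return it."""
    return tweak_response(server(tweak_request(request)))


status, headers, body = proxy({"method": "GET", "path": "/objects/1"})
```

The point of the design is that the repository itself is never modified: everything extra happens in the two tweak steps on either side of the call to the server.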

Brisbane Declaration ON the Elimination Of Keywords (B-DONEOK)

Keywords can’t express the complexity of language the way full-text can. We spend time doing it anyway. So let’s stop. Instead just use sophisticated full-text search and indexing. Sign on to the declaration at

(In Q&A audience asks if there’s evidence keywords aren’t useful; team asks in return if there’s evidence keywords are useful.)

 

Institutional Publications Repositories and beyond #or2017

Abstracts

Curating, But Still Not Mediating by Jim Ottaviani, Amy Neeser

aka “don’t let anyone get away with 6accdœ13eff7i3l9n4o4qrr4s8t12ux” (Isaac Newton establishing priority on calculus in code)

Chinese proverb: “The best time to plant a tree is 20 years ago; the second best time is now.”

Curation starts immediately: don’t wait or people will forget. You have to open every data file and check it’s usable. They assume if it’s intended for humans it’s a “document” but if it’s intended for machines it’s “data”.

Acknowledge/thank the deposit (signing your name so they know you’re human not bot). Then you can ask for a README.txt or offer to help write it.

Home and Away: Exploring the use of metrics in Australia and the UK, with a focus on impact by Jo Lambert, Robin Burgess

Sydney measures metrics through: Atmire tools embedded in repository; PlumX integration altmetrics; ERA requirements; CAUL stats; exploring UK methods. Employing FAIR principles. Researchers provide context for impact.

JISC OA services support through article lifecycle of submission (SHERPA/RoMEO etc), Acceptance (Monitor UK, Jisc collections, OpenDOAR), Publication, etc. Stats collected via aggregation then available in COUNTER format; raw download data from UK IRs using DSpace, Fedora, PURE, ePrints…. Collaboration is important – working with OpenAIRE, concept of creating other IRUS instances eg IRUS-ANZ?

Set up Australian Repository Working Group. Looking at standards and collaboration. “We dream the same dream, we want the same thing” – Belinda Carlisle

Uniform metadata for Finnish repositories by Jyrki Ilva, Esa-Pekka Keskitalo, Päivi Rosenström, Tanja Vienonen, Samu Viita

Open Scientific Publishing project (Tajua). 60 orgs in Finland have an IR, mostly DSpace – some are shared so total of 17 IRs.

Challenges: heterogeneous metadata practices, ad hoc solutions, no general metadata guidelines so repository managers have to fend for themselves.

80 experts got together, formed a smaller working group including National Library experts. Goal-oriented approach to develop a “good-enough” metadata format, semantics prioritised over correct Dublin Core. Compiled most used metadata fields and suggesting fields closer to standard DC.

Spreadsheets collected into Google Drive, meetings held online and in person – then done! Final version published on the National Library’s public wiki: 62 core DC fields, and 11 extras if needed. 6 fields labelled important: title, author, date, persistent ID, rights (pref Creative Commons). Guidelines at: https://www.kiwi.fi/x/94R7B
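A repository manager could check records against the fields flagged as important with something like the sketch below. It only covers the five fields named in these notes, and the qualified field names (`dc.contributor.author`, `dc.identifier.uri` and so on) are assumptions borrowed from common DSpace usage, not taken from the Finnish guidelines themselves.

```python
from typing import Dict, List

# Assumed field names for the important fields named in the notes:
# title, author, date, persistent ID, rights.
IMPORTANT_FIELDS = [
    "dc.title",
    "dc.contributor.author",
    "dc.date.issued",
    "dc.identifier.uri",
    "dc.rights",
]


def missing_important_fields(record: Dict[str, List[str]]) -> List[str]:
    """Return the important fields that are absent or contain only blank values."""
    return [
        field
        for field in IMPORTANT_FIELDS
        if not any(value.strip() for value in record.get(field, []))
    ]


# Hypothetical record for illustration only.
record = {
    "dc.title": ["An example thesis"],
    "dc.contributor.author": ["Example, Author"],
    "dc.rights": [""],
}
gaps = missing_important_fields(record)
```

Treating blank values as missing matters here, since a field that exists but is empty is one of the more common leftovers of ad hoc metadata practice.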

Isomorphic Pressures on Institutional Repositories in Japan by Jennifer Beamer

Comparing US and Japanese repositories as interested in situation as institutions interact with repositories. In Japan repositories exploding from 2015-17 and wants to know why. More of a national push in Japan (whereas in the US it’s more grassroots). Previous work not looking at research in Japan especially on big picture scale.

Isomorphism – regulatory/coercive pressures; cognitive/mimetic pressures; normative pressures.

Collecting data from OpenDOAR and ROARMAP – content analysis of themes, mandates, core beliefs. Then interviews with SPARC in Tokyo, librarians, faculties. National Institute of Informatics has a shared cloud server with IR architecture so limited resources not a barrier. OA policies have started very recently, but librarians play a major role in getting deposits even though only in that role briefly and assist directly. You don’t have to be a PhD to work in faculty so tenure and promotion completely different – publishing isn’t connected to tenure.

The role of the repository in increasing the reach and influence of research by Belinda Tiffen, Kate Byrne

(Acknowledge work of Catherine Williams). Repository enables reporting and assessment but also shopfront to sell research to the world. These roles don’t always sit well together – hard to explain to researchers why we want two versions of their papers.

What role does repository play in sharing research? Data from last year: 2609 UTS publications from Scopus. 33% also in repository. Looked at Altmetrics for engagement. 1000 (of the 2609) have an Altmetric score. 47% in both Scopus and repository have an Altmetric score. 63% also in other repositories have an Altmetric score. But only 34% of outputs only in Scopus have an Altmetric score.
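To put those figures side by side, here is a rough bit of arithmetic using only the numbers quoted above. The group sizes behind the 47%, 63% and 34% figures weren't given, so this just works out the overall baseline and the gap between groups; the variable names are my own.

```python
def percent(part: float, whole: float) -> float:
    """Share of `part` in `whole`, as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


TOTAL_IN_SCOPUS = 2609        # UTS publications from Scopus
WITH_ALTMETRIC_SCORE = 1000   # of those, how many have an Altmetric score
SHARE_IN_REPOSITORY = 33      # percent also in the repository

# Overall baseline: share of all 2609 outputs with an Altmetric score.
overall_share = percent(WITH_ALTMETRIC_SCORE, TOTAL_IN_SCOPUS)

# Approximate number of outputs that are also in the repository.
in_repository = round(TOTAL_IN_SCOPUS * SHARE_IN_REPOSITORY / 100)

# Percentage-point gaps between the quoted groups and Scopus-only outputs.
gap_repository_vs_scopus_only = 47 - 34
gap_other_repos_vs_scopus_only = 63 - 34
```

So roughly 38% of all the outputs have a score, with repository outputs sitting above that baseline and Scopus-only outputs below it. This shows an association, not that depositing causes the extra attention.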

What will UTS do with this data? Have OA policy (since 2014) which has increased IR content but still only 35%. In 2015 rolled out new user interface. Active training to get authors to deposit. Want to find out which interventions are having an impact.

(In Q&A: redesigned theme done by part-time graphic designer in library, and in-house DSpace developer.)

Scholarly Identity and Author Rights: guiding scholars as they make choices with their scholarly identities in a messy world by Jen Green

Project to manage scholarly/research identity. Schol comm team with wider working group and work with research community to focus outreach efforts. Workshop attendees mostly faculty and postgrads – but this changed when they started talking about online identity.

Workshops on ORCID – in the absence of a repository seemed a good place to start. Short pop-up sessions with ice-cream worked well, chatted, they created an ORCID.

Thought they needed help creating ORCIDs. Learned they needed that plus managing professional identity online and helping their students too. Scholars have limited time and want to spend it on own goals which may not match institution’s.

“Your Research Identity” – covered Twitter, Facebook, etc – so they’d know their online identity exists whether they manage it or not, and here are tools to manage it. Started with Google search on their names, discussed results. When results come up with other people with same name bring up ORCID. Suggested creating one place everything else can link back to (eg her own website).

Outcomes: after this workshop, workshops began to fill up. Once accidentally sent invite to whole campus and 30 seats filled in 10 minutes. Found an audience they didn’t know existed: mostly support staff.

(In Q&A: many faculty had never googled themselves, or didn’t know different results with different IP addresses.)

The University of the Philippines Baguio Faculty Research Database: starting a university repository by Cristina Borja Villanueva, Jay Mendoza Mapalo

Cordillera region home to country’s second largest concentration of indigenous people with 7 major ethnolinguistic groups. At U of Philippines Baguio research is a priority, and library needs to collect and make outputs available to wider community.

Faculty Research Database started in June 2012, launched 2013, 500+ entries to document and disseminate outputs, increase citations, and advance knowledge. Use Joomla. Search author (by dropdown menu). Results page shows number of visits for each item – stats available to show most viewed. Item page may have full-text or may say available on request from author.

Has accomplished availability objectives. Hope to continue improving repository.

Crosswalks, mapping tables, and normalisation rules: when we don’t even share the same vocabulary for authority control by Deborah Fitchett

That’s me! So I didn’t summarise; instead see my full slides and notes.