
Resource sharing partner synchronisation #anzreg2018

Managing Resource Sharing Partners in Alma
Nishen Naidoo, Macquarie University

  • Used to use VDX – external system, not transparent to end-user. But good that partners were managed centrally.
  • Alma provided single system, no additional user system integration, user experience via Primo and much richer. But partner management is up to each institution.
  • Connection options: broker (all requests via intermediary which handles billing) vs peer-to-peer
  • managing partners – contact details, and suspension status. Tricky to do this automatically so most people updating manually based on LADD webpage (AU suspension info), ILRS (AU addresses), Te Puna csv (NZ contact details), mailing lists announcements (NZ suspension announcements)
  • part 1 designed harvester to scrape data from these sources and put it into a datastore in json. Also capture and store changes eg of inst name or contact email.
  • part 2 designed sync service (API) to get data from datastore and upload to Alma. Needs your NUC symbol, an API key with read/write to resource sharing, and a configuration in Elasticsearch Index. (There’s a substantial technology stack.) Then pulls partner data from your Alma institution, and sync service creates partner records, compares with existing, updates Alma with changes.
  • future – hope to host in AWS. Wanting to get LADD/Te Puna to release data through a proper API. Ideally Ex Libris would get data directly but at the moment can understand they wouldn’t want to touch it with a bargepole.
  • documentation and download at https://mqlibrary.github.io/resource-sharing-partners-harvest/ and https://mqlibrary.github.io/resource-sharing-partners-sync/
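The compare-and-update step of the sync service could look roughly like the sketch below. This is not the code from the linked repositories: the `Partner` fields, the `plan_sync` helper and the NUC symbols are all made up for illustration, and the actual upload to Alma is left out.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Partner:
    # Illustrative fields only, not Alma's partner schema.
    nuc: str          # NUC symbol identifying the institution
    name: str
    email: str
    suspended: bool


def plan_sync(harvested, existing):
    """Compare harvested partners with the ones already in Alma.

    Both arguments are dicts keyed by NUC symbol. Returns (to_create, to_update),
    where to_update maps NUC symbol -> {field: (old, new)} for changed fields.
    """
    to_create = [p for nuc, p in harvested.items() if nuc not in existing]
    to_update = {}
    for nuc, new in harvested.items():
        old = existing.get(nuc)
        if old is None or old == new:
            continue  # brand new (handled above) or unchanged
        old_fields, new_fields = asdict(old), asdict(new)
        to_update[nuc] = {
            field: (old_fields[field], new_fields[field])
            for field in new_fields
            if old_fields[field] != new_fields[field]
        }
    return to_create, to_update


# Made-up sample data: one new partner, one with a changed email and suspension.
harvested = {
    "XAAA": Partner("XAAA", "Example Library A", "ill@a.example", False),
    "XBBB": Partner("XBBB", "Example Library B", "docdel@b.example", True),
}
existing = {
    "XBBB": Partner("XBBB", "Example Library B", "ill@b.example", False),
}
to_create, to_update = plan_sync(harvested, existing)
print([p.nuc for p in to_create])
print(to_update)
```

Keeping the old and new values side by side in `to_update` also gives the change history the harvester is described as storing (eg a changed contact email).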

Authorities and identifiers in data sources #anzreg2018

The future of authorities and identifiers in national and international data sources; pros, cons & ROI.
Panel: Lynne Billington SLNSW, Libraries Australia representative, Catherine Amey NLNZ, Jenny Klingler Monash University, Ebe Kartus UNE

Libraries Australia syndicates data to Trove; headings to WorldCat; to VIAF (not clear how much identifiers are being used here but Wikidata grabs it; ISNI sends data to VIAF and ORCID has some relationship with ISNI)… Workflow never fully developed by LA due to lack of demand. Only 3 orgs regularly sending data for ingest to ANBD. Integrated in international identifier ecosystem and investing in staff training. RDA gives opportunity to enrich records – functionality not yet implemented by library systems. Advocates with vendors to ensure data can interoperate with national/international data ecosystem.

National Library of New Zealand – including iwi names under 373. Data goes to OCLC. Follow international standards except for Ngā Upoko Tukutuku. Recognised by LoC. Available as open dataset for download. Last year pilot project to convert to Linked Data format – trying to show reo-ā-iwi as concepts on an equal level.

Monash
Used to load ABN authority records to Voyager. Later aligned with LC authority records, automated from Validator – until the program stopped working. Bought and loaded weekly updates with Gary Strawn’s programs. Migrated to Alma where these programs didn’t work so joined NACO program. NACO authorities go to LC linked data and VIAF – LC authorities are in the Alma Community Zone. Can insert this into 024 field to hopefully enable linked data.
Staffing a major issue in metadata area – lack of support in this area with many staff retiring without being replaced. Tension between NACO headings and LA record bib headings. Time intensive, and delay of 2 weeks before get into CZ.

University of New England
frustrated that we’re worried about library data instead of being part of the semantic web. MARC Will Not Die – it’s an albatross around our neck.
Have tried redefining a few things eg $0 for a standard control number of a related authority or standard identifier; $1 for a URI that identifies an entity (which appears to generally include standard identifier).
Libraries need to be part of the web, not just on the web.
Risk of focusing on what authorities we can get in CZ because this will advantage big authorities and disadvantage local authorities that are important to our community.
Can’t put a triple into a relational database. How are we really going to start working toward a linked open data environment?
need to put in identifiers wherever possible and stop fussing about punctuation
return on investment – hard to show one way or another. We don’t have a system to show proof of concept. Need to take leap of faith, hopefully in partnership with a vendor.

 

Institutional repository in Alma Digital #anzreg2018

Optimising workflows and utilising Alma APIs to manage our Institutional Repository in Alma Digital
Kate Sergeant, University of South Australia

Used DigiTool as an interim solution with a long-term plan to move to Alma. For a while used the electronic resource component to manage metadata, with a local filestore. Last year finally moved everything properly into Alma Digital.

In early stage needed to generate handles and manage files. Phase 2 – development of templated emails to enable requesting outputs from researchers. Phase 3 last year – workflow management, data validation, author management, license management….

Get records submitted directly by researchers; harvest from Web of Science and Scopus APIs combined with Alma APIs for adding bib records. Land in Alma as suppressed records – often hundreds. Try to prioritise manually submitted stuff; and easier (eg journal articles) stuff. Make sets of incoming records.

Alma native interface doesn’t always show all data needed so use their own dashboard using the Alma APIs, which pulls out the things they care about (title, pub date, author, resource type, date added). Then have canned searches (eg Google title search, DOI resolver, DOI in Scopus, ISSN in Ulrichs, prepopulated interloan form…) . Look at metadata eg for authors/affiliations (links into Alma metadata editor for actual editing; links through to public profile). License information in Alma’s licence module. Shows archiving rights, version for archiving, embargo period – with links to copyright spreadsheet and to Sherpa/Romeo.
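As a rough idea of how such canned searches can be generated from a record, here is a minimal sketch. It is not UniSA's dashboard code: the `canned_links` helper and the record keys are hypothetical, the DOI is a placeholder, and only the two lookups with well-known public URL patterns (Google search and the doi.org resolver) are shown.

```python
from urllib.parse import quote, urlencode


def canned_links(record):
    """Build lookup links for one record; the dict keys are hypothetical."""
    links = {}
    if record.get("title"):
        # Exact-phrase Google search on the title.
        links["google_title"] = "https://www.google.com/search?" + urlencode(
            {"q": '"' + record["title"] + '"'}
        )
    if record.get("doi"):
        # doi.org redirects to the publisher's landing page.
        links["doi_resolver"] = "https://doi.org/" + quote(record["doi"], safe="/")
    return links


links = canned_links({"title": "Open data barriers", "doi": "10.1234/example.5678"})
for name, url in links.items():
    print(name, url)
```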

Would often forget to un-suppress the record – so added that ability at the point of minting the handle. The handle is then put into the record at the same time; and mint a DOI where relevant (eg data for ANDS).

Finally composes email based on templates to the researcher – eg post-print required – with a built-in delay so the email only goes out after the item has actually gone live, which often takes 6hrs.

Dashboard also includes exception reports etc; record enhancement facility with eg WoS data; publication/person lookup.

Primo out of the box #anzreg2018

Primo out of the box: Making the box work for you
Stacey Van Groll, UQ

Core philosophy – maintain out-of-the-box unless there’s a strong use case, user feedback, or bug. Focus on core high-use features like basic search (rather than browse) and search refinement (rather than my account). Stable and reliable discovery interface; quick and seamless resource access.

Said yes to:

  • UQ stylesheet – one search scope, one view, one tab, their own prefilters on library homepage (a drop-down menu – includes some Primo things like newspaper search, some non-Primo things)

Said no to:

  • Journals A-Z
  • Citation linker
  • Purchase requests
  • main menu
  • EBSCO API
  • Featured Results
  • Collection Discovery
  • Tags & Reviews
  • Database search (for now)
  • Newspaper search (for now)
  • Resource recommender (for now)

Dev work for some things – eg tweaked the log out functionality to address an issue; then Primo improved something, which broke their fix; fixed the fix; next release was okay; next release broke it again; so have reviewed and gone back to out-of-the-box. An example of the downsides to making tweaks.

But sometimes really need to make a change – consider the drivers, good use cases, who and how many people experience the problem, how much work it is to make/develop the change and how much work to maintain it? Is there existing functionality in the product or on the Roadmap? How do you measure success?

Does environmental scans – has bookmarks of other Primo NewUI sites to see what else other people do and how.

Data analysis – lots of bugs in My Account but also very low usage. So doesn’t put much work in, just submits a Salesforce case then forgets.

Evaluates new releases – likes to piggyback on these eg adding OA and peer-reviewed tags to institutional repository norm rules.

User feedback – classify by how common the complaint is and try to address most common.

Feedback:

  • first goes to Knowledge Centre Feedback feature and includes email address which forces a response
  • second listserv
  • third Salesforce, and then escalation channels if needed

Lessons learned: A good Salesforce case has a single problem, includes screenshots, and explains what behaviour you desire.

Ex Libris company / product updates #anzreg2018

Ex Libris company update
Bar Veinstein, President Ex Libris

  • in 85 of top 100 unis; 65 million API calls/month; percentage of new sales that are in cloud up from 16% in 2009 to 96% in 2017; 92% customer satisfaction
  • Pivot for exploration of funding/collaboration https://www.proquest.com/products-services/Pivot.html
  • aim to develop solutions sustainably so not a proliferation of systems for developing needs
  • looking at more AI to develop recommendation eg “high patron demand for 8 titles. review and purchase?”, “based on usage patterns, you should move 46 titles from closed stacks to open shelves?”, “your interloans rota needs load balancing, configure now?”, “you’ve got usage from vendors who provide SUSHI accounts you haven’t set up yet, do that now?”, algorithms around SUSHI vs usage.
  • serious about retaining Primo/Summon; shared content and metadata
  • Primo VE – realtime updates. Trying to reduce complexity of Primo Back Office (pipes etc – but unclear what replaces this when pipes are “all gone”)
  • RefWorks not just for end user but also aggregated analytics on cloud platform. Should this be connected/equal to eshelf on Primo?
  • Leganto – ‘wanting to get libraries closer to teaching and learning’ – tracking whether instructors are actually using it and big jumps between semesters.
  • developing app services (ux, workflow, collaboration, analytics, shared data) and infrastructure services (agile, multi-tenancy, open apis, metadata schemas, auth) on top of cloud platform – if you’ve got one thing with them very quick to implement another because they already know how you’re set up.
  • principles of openness: more transactions now via api than staff direct action.
  • https://trust.exlibrisgroup.com/
  • ProQuest issues – ExL & PQ passing the customer service buck, so aiming to align this. Eg being able to transfer support cases directly across between Salesforce instances.

Ex Libris product presentation
Oren Beit-Arie, Ex Libris Chief Strategy Officer

  • 1980s acquisitions not part of library systems -> integrated library systems
  • 2000s e-resource mgmt not part of ILS -> library services platform (‘unified resource mgmt system’)
  • now teaching/learning/research not part of LSPs -> … Ex Libris’s view of a cloud ‘higher education platform’
  • Leganto
    – course reading lists; copyright compliance; integration with Alma/Primo/learning management system
    – improve teaching and learning experience; student engagement; library efficiency; compliance; maximise use of library collections
    – Alma workflows, creation of OpenURLs…
  • Esploro
    – in dev
    – RIMs
    – planning – discovery and analysis – writing – publication – outreach – assessment
    – researchers (publish, publish, publish); librarians (provide research services); research office (increase research funding/impact)
    – [venn diagram] research admin systems [research master]; research data mgmt systems [figshare]; institutional repositories [dspace]; current research information systems [elements]
    – pain points for researchers: too many systems, overhead, lack of incentive, hard to keep public profile up to date
    – for research office – research output of the uni, lack of metrics, hard to track output and impact, risk of noncompliance
    – next gen research repository: all assets; automated capture (don’t expect all content to be in repository); enrichment of metadata
    – showcase research via discovery/portals; automated researcher profiles; research benchmarks/metrics
    – different assets including creative works, research data, activities
    – metadata curation and enrichment (whether direct deposit, mediated deposit, automatic capture) through partnerships with other parties (data then flows both ways, with consent)
    – guiding principles: not to change researchers’ habits; not to create more work for librarians; not to be another ‘point solution’ (interoperable)
    – parses pdf from upload for metadata (also checks against Primo etc). Keywords suggested based on researcher profile
    – deposit management, apc requests, dmp management etc in “Research” tab on Alma
    – allows analytics of eg journals in library containing articles published by faculty
    – tries to track relationships with datasets
    – public view essentially a discovery layer (it’s very Primo NewUI with bonus document viewer – possibly just an extra view) for research assets – colocates article with related dataset
    – however have essentially ruled research administration systems out of scope as starting where their strength is. Do have Pivot however.

EZproxy log monitoring with Splunk for security management #anzreg2018

Ingesting EZproxy logs into Splunk. Proactive security breach management and generating rich eResource metrics
Linda Farrall, Monash University

Use Alma analytics for usage, but also using EZproxy logs.

EZProxy is locally hosted and administered by library/IT. On- and off-campus access is through EZproxy where possible, and Monash has always used EZproxy logs to report on access statistics. (For some vendors it’s the only stats available.) Used a Python script to generate html and CSV files.

Maintenance hard, logs bigger so execution took longer, python libraries no longer supported, skewed statistics due to EZproxy misuse/compromised accounts. So moved to Splunk (already had enterprise version at university) to ingest logs; can then enrich with faculty data, and improve detection of compromised accounts.

EZproxy misuse – mostly excessive downloads, eg using script or browser plugin – related to study but the amount triggers vendor concerns (ie block all university access) – in this case check in with user to make sure it was them and sort out the issue. Or compromised accounts due to phishing. Have created a process to identify issues and block the account until ITS educates the user (because phishing emails will get sent to the same person who fell for it last time).

Pre-Splunk, it was time-consuming to monitor logs and investigate. Python script monitoring downloads no longer worked due to change of file size/number involved in typical download.

Most compromised accounts from Canada, US, Europe – in Splunk can look at reports where a user has bounced between a few countries within one week. Can look at total download size (file numbers, file size) – and can then join these two reports to look for accounts downloading a lot from a lot of countries.
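The logic of joining those two reports can be sketched outside Splunk. The Python below is not Monash's Splunk search; the event tuple layout, the `flag_accounts` name and both thresholds are assumptions, and each site would tune them to its own normal usage.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_accounts(events, min_countries=3, min_bytes=500_000_000,
                  window=timedelta(days=7)):
    """Flag users seen in many countries AND downloading a lot within one window.

    events: iterable of (username, timestamp, country, bytes_sent) tuples,
    eg parsed from EZproxy log lines after geolocating the IP address.
    """
    by_user = defaultdict(list)
    for user, ts, country, size in events:
        by_user[user].append((ts, country, size))

    flagged = {}
    for user, rows in by_user.items():
        rows.sort(key=lambda r: r[0])
        start = 0
        for end in range(len(rows)):
            # Slide the window start forward so it spans at most `window`.
            while rows[end][0] - rows[start][0] > window:
                start += 1
            chunk = rows[start:end + 1]
            countries = {c for _, c, _ in chunk}
            total = sum(s for _, _, s in chunk)
            if len(countries) >= min_countries and total >= min_bytes:
                flagged[user] = (sorted(countries), total)
                break
    return flagged


# Made-up sample events.
t0 = datetime(2018, 3, 1)
events = [
    ("alice", t0, "AU", 40_000_000),
    ("alice", t0 + timedelta(days=1), "AU", 60_000_000),
    ("bob", t0, "AU", 200_000_000),
    ("bob", t0 + timedelta(days=2), "CA", 250_000_000),
    ("bob", t0 + timedelta(days=3), "US", 300_000_000),
]
flagged = flag_accounts(events)
print(flagged)
```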

To investigate have to go into identity management accounts – but can then see all their private data. Once they integrate faculty information into Splunk they don’t have to look them up so can actually enhance privacy – can see downloading lots of engineering data but are actually in engineering faculty so probably okay.

In 2016 had 10 incidents with resources blocked by vendors for 26 days. In 2017 16 incidents (all before August when started using Splunk). In 2018, 0 incidents of blocking – because they’re staying on top of compromised accounts (identifying an average of 4 a week) and taking pre-emptive action (see an issue, block the account, notify the vendor). Also now have a very good relationship with IEEE! (Notes that when IEEE alerts you to an issue it’s always a compromised account, there’s never any other explanation.)

Typically account compromised; tested quietly over several days; then sold on and used heavily. If a university hasn’t been targeted yet, it will be. By detecting accounts downloading data, are also protecting the university from other damage they can cause to university systems.

Notes that each university will have different patterns of normal use: you get to know your own data.

Lots of vendors moving to SSO. Plan to do SSO through EZproxy – haven’t done it yet so not sure whether it’ll work, but testing it within a couple of months. ITS will implement SSO logging for the university, so hopefully they’ll pick up issues before it gets to EZproxy. Actively asking vendors to do it through IP recognition/EZproxy.

E-resource usage analytics in Alma #anzreg2018

Pillars in the Mist: Supporting Effective Decision-making with Statistical Analysis of SUSHI and COUNTER Usage Reports
Aleksandra Petrovic, University of Auckland

Increasing call for evidence-based decision making in combination with rising importance of e-resources (from 60% -> 87% of collection in last ten years), in context of decreasing budget and changes in user behaviour.

Options: EBSCO usage consolidations, Alma analytics or Journal Usage Statistics Portal (JUSP). Pros of Alma: no additional fees; part of existing system; no restrictions for historical records; could modify/enhance reports; could have input in future development. But does involve more work than other systems.

Workflow: harvest data by manual methods; automatic receipt of reports, mostly COUNTER; receipt by email. All go into Alma Analytics, then create reports, analyse, make subscription decisions.

Use the Pareto Principle eg 20% of vendors responsible for 80% of usage. Similarly 80% of project time spent in data gathering creates 20% of business value; 20% of time spent in analysis for 80% of value.
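To check how closely your own vendors follow that 80/20 pattern, something like the following works on any export of per-vendor usage totals. The `vendors_covering` helper and the figures are made up for illustration.

```python
def vendors_covering(usage, share=0.8):
    """Return the top vendors (by usage) whose combined usage reaches `share`."""
    total = sum(usage.values())
    covered = 0
    top = []
    for vendor, count in sorted(usage.items(), key=lambda kv: kv[1], reverse=True):
        top.append(vendor)
        covered += count
        if covered >= share * total:
            break
    return top


# Made-up usage counts for ten vendors (total 10,000).
usage = {
    "Vendor A": 5000, "Vendor B": 3200, "Vendor C": 500, "Vendor D": 400,
    "Vendor E": 300, "Vendor F": 200, "Vendor G": 150, "Vendor H": 120,
    "Vendor I": 80, "Vendor J": 50,
}
top = vendors_covering(usage)
print(top, f"{len(top) / len(usage):.0%} of vendors")
```

In this invented sample two of the ten vendors account for just over 80% of usage.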

Some vendors slow to respond (asking at renewal time increased their motivation….) Harvesting bugs eg issue with JR1. There were reporting failures (especially in move from http to https) and issues tracking the harvesting. Important to monitor what data is being harvested before basing decisions on it! Alma provides a “Missing data” view but can’t export into Excel to filter so created a similar report on Alma Analytics (which they’re willing to share).

So far have 106 SUSHI, 45 manual COUNTER vendors and 17 non-COUNTER vendors. Got stats from 85% of vendors.

Can see trends in open access usage. Can compare whether users are using recent vs older material – drives decisions around backfiles vs rolling embargos. Can look at usage for titles in package – eg one where only three titles had high usage so just bought those and cancelled package.

All reports in one place. Can be imported into Tableau for display/visualisation: a nice cherry on the top.

Cancelling low-use items / reducing duplication has saved money. Hope more vendors will use SUSHI to increase data available. If doing it again would:

  • use a generic contact email for gathering data
  • use the dashboard earlier in the project

Cost per use trickier to get out – especially with exchange rate issues but also sounds like reports don’t quite match up in Alma.
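The arithmetic itself is simple once the price and the usage line up; a minimal sketch, assuming you already have the invoice price, an exchange rate into local currency and a usage count for the same period (the function name and figures are made up, and this does nothing about reports that don’t match up):

```python
def cost_per_use(price, rate_to_local, uses):
    """price is in the invoice currency; rate_to_local converts it to local currency."""
    if uses <= 0:
        return None  # no recorded use, so cost per use is undefined
    return round(price * rate_to_local / uses, 2)


# Made-up figures: 1200.00 invoiced, 1.5 local units per invoice unit, 400 uses.
print(cost_per_use(1200.00, 1.5, 400))
```

Which rate to use (invoice date, payment date, annual average) is a policy choice that changes the answer, which is part of why this is trickier than it looks.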

Alma plus JUSP
Julie Wright, University of Adelaide

Moved from using Alma Analytics, to JUSP, to using both. Timeline:

  • Manual analysis of COUNTER: very time intensive: 2-3 weeks each time and wanted to do it monthly…
  • UStat better but only SUSHI, specific reports, and no integration with Alma Analytics
  • Alma Analytics better still but still needs monitoring (see above-mentioned https issues)
  • JUSP – only COUNTER/SUSHI, reports easy and good, but can’t make your own

Alma | JUSP
much work | easy
complex analyses available | only simple reports
only has 12 months data | data back to 2014
benchmarking | works with vendors on issues
 | quality control of data

JUSP also has its own SUSHI server – so can harvest from here into Alma. This causes issues with duplicate data when the publishers don’t match exactly. Eg JUSP shows “BioOne” when there are actually various publishers; or “Wiley” when Alma has “John Wiley and Sons”. Might need to delete all Alma data and use only JUSP data.
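One way to reduce the duplication short of deleting everything is to map publisher names to a canonical form before comparing rows. A sketch, assuming usage rows can be exported as (publisher, title, month, count) tuples; the alias table, function names and sample rows are made up, and this only helps with naming variants like the Wiley case, not with a platform name such as “BioOne” standing in for several publishers.

```python
# Hypothetical alias table: lowercased variant -> canonical publisher name.
ALIASES = {
    "wiley": "John Wiley and Sons",
    "john wiley & sons": "John Wiley and Sons",
}


def canonical_publisher(name):
    """Normalise case and whitespace, then look the name up in the alias table."""
    key = " ".join(name.lower().split())
    return ALIASES.get(key, name.strip())


def dedupe(rows):
    """Keep the first count seen per (canonical publisher, title, month)."""
    seen = {}
    for publisher, title, month, count in rows:
        key = (canonical_publisher(publisher), title, month)
        seen.setdefault(key, count)
    return seen


# Made-up rows: the same month of usage loaded once from JUSP, once directly.
rows = [
    ("Wiley", "Journal of Examples", "2018-01", 42),
    ("John Wiley and Sons", "Journal of Examples", "2018-01", 42),
]
deduped = dedupe(rows)
print(deduped)
```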

Round-up of LIANZA 2017 sessions #open17

Below is a summary-of-the-summary of some of the LIANZA 2017 sessions I attended (some others were too participatory to allow live-blogging, or I ran out of brain) with key points and highlighting of things I particularly want to remember for some reason; no value judgements to be implied by the lack thereof!

Sunday

  • The dangerous myth about librarians – libraries are powerful and words have power so stop with the ‘save our libraries’ rhetoric. Stop relying on how ‘obvious’ our value is; stop being lazy about biculturalism; value ourselves, have courage, be visible.

Monday

Tuesday

  • Huakina te whare ki te ao – background and examples of Ngā Upoko Tukutuku (Māori subject headings)
  • What’s going on with ebook usage? – public library context, did lots of work extracting usage data and combining with patron data, plus surveying satisfaction
  • Games for learning – focusing on the learning around making games rather than playing them, and particularly using the presenter’s Gamefroot platform
  • Opening up licensing agreements – the kinds of terms we should be clarifying with database vendors, and how we convey this to users (particularly in Alma – we could be doing this a bit on the journal level now, though not on the article level)
  • The Future of the Commons – looking at Creative Commons (and the commons in general) from the point of view of the social systems supporting the commons, and in relation to the state and the market.
  • Enhancing library services with a journey mapping approach – a user experience methodology with a focus on the user’s emotions. Looking at what the user does and how they feel at each stage of carrying out a particular task/heading towards a particular goal.

Wednesday

The Anthropologist’s Tale: A Caution – Donna Lanclos #open17

Anthropologists get to do the work they do because someone lets them in. Listen, collect, collate, interpret, and tell stories. Stories are data – ways of representing and interpreting reality. She studies the ‘village’ of academia, investigating the logic behind the behaviours in academia – students, academics, others.

Example of bowdlerised version of Chaucer’s “Wife’s Tale” when she was in high school – she wanted the real story. Also as a folklorist, very aware of different versions of stories. There’s meaning not just in the story but in the fact that there are different versions. Who tells the tale informs how it’s told.

Early anthropology work was literally a tool of the Man. Finding out more about a people in order to colonise and control them. Eg “The Nuer” by Evans-Pritchard. Franz Boas ‘the father of anthropology’ when native American groups were the object of study because people believed they were ‘disappearing’ (a framing that ignores the agency of colonisation). In WWII armchair anthropology by Ruth Benedict informed post-war occupation strategy of US in Japan. Margaret Mead worked in Samoa and other people in the Pacific – many issues around whose stories she told and why. But her purposes shifted from institutional control to understanding. Wanted to make the unfamiliar familiar and relatable. Also to make the familiar unfamiliar – so people can look at things they’ve always done and wonder why.

Moving to libraries. Andrew Carnegie (as a retirement project from his life as a robber baron) founded lots of libraries all over US, UK, NZ, basically everywhere – to impose his ideas of what communities should have. There was an application process – communities wanted to be associated with the respectability and power. Libraries as colonising structures. And assumption that if you don’t put a library there, don’t establish a colonial government, there won’t be anything. It ignores what’s already there. There were people long before there were libraries.

Colonising impulse in libraries:

  • When she presents on student behaviour (googling, citing Wikipedia, not putting materials in IR) she talks about motivations, conflicting messages people get around these, the ways these things make sense to people where they are. And gets the question “So how do we get them to change their behaviour?” Wants the idea of what’s “best” to fall away. Listen to what people say, understand why.
  • When she proposes open-ended investigations (eg day-in-the-life studies, geolocating emotions across various institutions, looking at the pattern of people’s lives) with no particular question or problem in mind, just wanting to know what it looks like, she often gets asked, “How will that help me solve [very specific problem]?” Exploratory research isn’t about solving problems, it’s about gaining insight.

You don’t do anthropology to shift how they do library things; you do it so the library can shift its practices. How do we listen? How do we change? Study people not to control, but to connect. We don’t want to be the colonising library! We may think we’re powerless, but have so much more power than our users, so have a responsibility to be careful.

Approaches beyond ‘solutionism’:

  1. Syncretism: cobbling together, where you can see the component parts. In libraries, users already have a fully formed set of practices. They’ll make room for new ones if they’re useful. We should expect to be taught by them, as we teach them, what libraries mean for them.
  2. Decolonisation – listen to users, make space so the definition of what a library is emerges from the community. (cf Linda Tuhiwai Smith’s “Decolonizing Methodologies”)
  3. Community – not just responsible to users but to the whole community. (Public libraries are good at this.) Anthropological approaches can help if moving away from colonialism.

“Trying to predict the future is a really neat way of avoiding talking about the present.”

Open data? Perceptions of barriers to research data-sharing – Jo Simons #open17

Many aspects of open data – today focusing on research data, ie created by research projects at an institution.

Research workflow is very complex but to really simplify: researchers start a project, get lots of data, and summarise results in journals.  But it’s not the data – it’s a summary of the data with maybe a few key examples. The rest goes to places where only the researcher can access it.

Why do we care?

  • for the good of all
  • expensive to generate so want to maximise use eg validate, meta-analyses, used in different ways
  • much funded by government therefore taxpayer – so they should be able to access it

Used to work in a group which shared greenhouse space but had no idea what else was in there. Proposed sharing basic information about what was there and what to do in case of emergency – and was shocked when some said no. Supervisor said don’t let it stop you asking the question but that’ll happen, yeah.

When requesting data, the odds of it being extant decrease 17% each year. (cite: Vines (2013) 10.1016/j.cub.2013.11.014)
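To get a feel for how quickly that compounds, here is the arithmetic as a small sketch. It assumes the 17% applies to the odds (not the probability) and compounds yearly; the function names are made up.

```python
def odds_multiplier(years, annual_drop=0.17):
    """Factor by which the odds of the data being extant shrink after `years`."""
    return (1 - annual_drop) ** years


def odds_to_probability(odds):
    """Convert odds (eg 4 means 4:1) to a probability."""
    return odds / (1 + odds)


for years in (1, 5, 10):
    print(years, round(odds_multiplier(years), 3))
```

After ten years the odds are down to roughly 15% of where they started.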

This is where academic libraries come in – getting the data off the USB drives. So need to understand why they might not want to share. Did interviews to inform survey construction to get info from more people. 102 responses from researchers across 10 disciplines; 18 from librarians (about 20% response rate).

Do librarians and researchers agree on the major drivers that determine whether researchers choose to share their data?

Is data-sharing part of the research culture? Librarians: 7% said common/essential; researchers 26%

Factors influencing data-sharing

  • agreement in some areas eg ability to publish, inappropriate use, copyright and IP pretty high; then resources, interest to others, system structure and data access
  • differences: librarians thought institutional policy, system integration very important; funder policy, system usability somewhat important – all very low for researchers. What was important for researchers were: ethics (>40%); culture, research quality (10-15%); data preservation, publisher policy (5-10%)

Are there differences across major disciplines in what those drivers are?

5 disciplines with 10+ responses: business, medicine/health, phys/chem/earth; life sci/bio; soc sci/education. Ethics important for most but not a high-ranking factor for phys/chem/earth due to nature of their data. Whereas data preservation/archiving is more important for them (and med/health), somewhat important for life sci and soc sci, while business barely cared.

Take home

So consult with your community to find out what’s worrying them. Target those concerns in promotion and training. Eg we know system usability is important so definitely fix it – but don’t waste your communication opportunities talking about it when they’re worried about other things.