Automation and integration with Agile and continuous development #anzreg2018

Automation and integration
Peter Brotherton, SLNSW

Agile

  • Idea: requirements and solutions evolve, not defined upfront – continual improvement process including of communication. Early and continuous delivery, welcoming changing requirements, communication and reflection with a view to tuning and adjusting. Working software is the primary measure of progress.
  • Challenges: risk-averse culture; documentation-heavy project management framework; hard to change mindsets. When they first tried to do agile they just ended up doing waterfall over and over and over again. Agile training workshops were helpful.

CI/CD: Continuous Integration/Continuous Delivery/Deployment

  • Continuous Integration – merging feature branches back into main branch frequently – requires test automation (unit tests and integration tests) to ensure quality.
  • Continuous Delivery – automated release process
  • Continuous Deployment – automated deployment to production
  • Unit-testing – testing units of source code – function, class, method
  • System-testing – testing integrated system, often through user interface
  • Docker is a lightweight containerisation technology that helps standardise application dependencies across environments, which makes dev setup and deployment easier.
  • Benefits: fewer bugs reach production and less time is spent on manual testing, despite releasing more frequently, so the team can be more responsive.
  • Use Bamboo, also considering Jenkins
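
As a minimal sketch of the unit-testing level described above (not taken from the talk), the Python below tests one small function in isolation with the standard unittest module. The function normalise_barcode and its behaviour are hypothetical, chosen only for illustration; a CI server such as Bamboo or Jenkins would typically run a suite like this on every merge.

```python
import unittest


def normalise_barcode(raw: str) -> str:
    """Hypothetical unit under test: trim whitespace and upper-case a barcode."""
    cleaned = raw.strip().upper()
    if not cleaned:
        raise ValueError("barcode is empty")
    return cleaned


class NormaliseBarcodeTest(unittest.TestCase):
    def test_strips_and_uppercases(self):
        self.assertEqual(normalise_barcode("  ab123 "), "AB123")

    def test_rejects_empty_input(self):
        with self.assertRaises(ValueError):
            normalise_barcode("   ")


def run_tests() -> bool:
    """Run the suite and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormaliseBarcodeTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

A system test differs in that it would drive the whole application (often through the browser) rather than calling one function directly.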

Eg Alma acceptance tests

  • Can’t write unit tests as don’t have source code, and can’t control when releases happen. But can do browser-based system tests.
  • Audited critical business processes in each area of the library. Documented step by step into Excel, and started manual testing on Sandbox release – super tedious. Now working on automating acceptance tests using Python Robot Framework (uses either DOM or xpath, possibly also coordinates), which is working well. (This auditing/documentation also highlighted efficiencies they could make in regular business processes.)
  • Change in UI did break the script once. Change in data hasn’t yet.

Analysing logs #anzreg2018

How to work with EZproxy logs in Splunk. Why; how; who
Linda Farrall, Monash University

Monash uses EZproxy for all access either on/off campus. Manage EZproxy themselves. Use logs for resource statistics and preventing unauthorised access. Splunk is a log-ingestion tool – could use anything.

Notes you can’t rely just on country changes (though these are important), as people use VPNs a lot. Eg people in China especially appear elsewhere; and people often use a US VPN to watch Netflix and then forget to turn it off. Similarly total downloads isn’t very important, as illegal downloads often happen bit by bit.

Number of events by sessionid can be an indicator; as can number of sessions per user. And then there’s suspicious referrers eg SciHub! But some users do a search on SciHub because it’s more user-friendly and then come to get the article legally through their EZproxy.
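
The indicators above can be sketched in a few lines of Python. This is an illustration only, not Monash's Splunk queries: it assumes log lines have already been parsed into dictionaries with hypothetical keys user, session and referrer (real EZproxy fields depend on the LogFormat in use), and the referrer host is a placeholder.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse

# Placeholder host for illustration; a real watch list would be maintained locally.
SUSPICIOUS_REFERRER_HOSTS = {"sci-hub.example"}


def events_per_session(events):
    """Count how many log events each session id produced."""
    return Counter(e["session"] for e in events)


def sessions_per_user(events):
    """Count distinct session ids seen for each user."""
    sessions = defaultdict(set)
    for e in events:
        sessions[e["user"]].add(e["session"])
    return {user: len(ids) for user, ids in sessions.items()}


def suspicious_referrals(events, hosts=SUSPICIOUS_REFERRER_HOSTS):
    """Return events whose referrer host is on the watch list."""
    flagged = []
    for e in events:
        host = urlparse(e.get("referrer", "")).hostname
        if host in hosts:
            flagged.append(e)
    return flagged
```

As the notes say, a flagged referrer is only a prompt for investigation, since some users search elsewhere and then fetch the article legitimately.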

https://github.com/prbutler/EZProxy_IP_Blacklist – doesn’t use this directly as doesn’t want to encourage them to just move to another IP.

A report of users who seem to be testing accounts with different databases.

Splunk can send alerts based on queries. Also is doing work with machine learning so could theoretically identify ‘normal’ behaviour and alert for abnormal behaviour.

But currently Monash does no automated blocking – investigates anything that looks unusual first.

 

Working with Tableau, Alma, Primo and Leganto
Sabrina Alvaro UNSW Megan Lee Monash University

Tableau server: self-hosted or Tableau-hosted (these two give you more security options to make reports private), and public (free) version.

Tableau desktop: similarly enterprise vs public.

UNSW using self-hosted server and enterprise desktop, with 9 dashboards (or ‘projects’)

For Alma/Primo can’t use the Ex Libris web data connector, so extract Analytics data manually (though this may be a server version issue).

Easy interface to create report and then share with link or embed code.

UNSW still learning. Want to join sources together, identify correlations, capture user stories.

Integration with the Alma Course API #anzreg2018

The Alma Course API – An Exercise in Course Integration
David Lewis

Alma Course Loader was inflexible – only runnable once a day, and doesn’t let you recover from errors. So wanted to write their own. Migrated to Alma when SOAP was available; later had to rewrite for REST API.  With the advent of Leganto the integration has become of even more importance.

Importance of API quotas and minimising frequency of calls. (Especially as the same API gateway is used by all Alma customers!) Course field mappings also important at the start. Another difficulty was course collapsing and parent-child course relationships (eg different cohorts within one course) which was important at their uni and was the hardest part to figure out. Ended up using course code for normal courses and parent course code for collapsed courses.
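
A rough sketch of the course-code rule described above, under assumptions: the Offering class and its field names are hypothetical, and the talk did not show its actual data model. It picks the parent code for collapsed courses and deduplicates, so each code is sent to the API once, which also helps with quotas.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Offering:
    """Hypothetical course offering from the student system."""
    course_code: str
    parent_code: Optional[str] = None  # set when collapsed into a parent course


def integration_code(offering: Offering) -> str:
    """Use the parent code for collapsed courses, else the course's own code."""
    return offering.parent_code or offering.course_code


def codes_to_sync(offerings):
    """Return each integration code once, in first-seen order."""
    seen = set()
    ordered = []
    for offering in offerings:
        code = integration_code(offering)
        if code not in seen:
            seen.add(code)
            ordered.append(code)
    return ordered
```

With two cohorts collapsed under one parent, only the parent code is synced, so one call replaces two.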

Discovered that even when they asked for JSON, error messages would come back as XML and crash their system – so ended up just writing their program to use XML instead of JSON.
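
An alternative to switching wholesale to XML is to parse defensively. The sketch below is not the presenter's code: it inspects the body and handles either format using only the standard library. The element name errorMessage is an assumption about the error payload and should be checked against real responses.

```python
import json
import xml.etree.ElementTree as ET


def parse_body(body: str):
    """Return ("xml", root) or ("json", data) depending on what actually arrived."""
    text = body.lstrip()
    if text.startswith("<"):
        return "xml", ET.fromstring(text)
    return "json", json.loads(text)


def error_messages(root):
    """Collect text from elements whose local name is 'errorMessage' (assumed name)."""
    messages = []
    for element in root.iter():
        local_name = element.tag.rsplit("}", 1)[-1]  # drop any XML namespace
        if local_name == "errorMessage" and element.text:
            messages.append(element.text.strip())
    return messages
```

Logging the raw body alongside the parsed result keeps an audit trail for the unexpected cases.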

Logging is a good debugging tool and audit trail and useful when raising jobs with Ex Libris.

Senior management often doesn’t value library contribution to course management – this is often political and requires a lot of awareness-raising among lecturers etc to get them to talk up the library to project managers.

Digital Strategy and Skills Development – A Balancing Act #anzreg2018

Digital Strategy and Skills Development – A Balancing Act
Masud Khokhar

“A short history of an ambitious team who curbed their enthusiasm for the larger good” / “of an ambitious team who told their evil overlord to shh and calm down”

Team works to enhance reach/impact/potential of digital and research – partnering with researchers which can lead to moments of optimism.

Key drivers – rapid tech changes, impact of machine learning, growth of digital scholarship, need for evidence-driven decision making, lack of general purpose digital skills and way of thinking among non-tech staff. Lancaster added ‘digitally innovative’ to its strategy; have a digital vision for university (digital research / digital teaching and learning / digital engagement).

So library needed to be digitally innovative, digitally fluent; diversity of thinking as core principle – formed innovation group to actively seek partnerships, build confidence, develop leadership, inspire creativity. Wanted to get insight into customer behaviour to develop data-driven services.

Most ideas actually turned out to be non-digital in nature – some required digital work, more required cultural change!

Ideas/projects

  • A Primo learning wizard for first-time users (but most people don’t log in so issues with them seeing it again and again).
  • Research data shared service – repository, preservation, reporting – collaboration with 15 institutions. Looking at a framework agreement/interoperability standard so variety of vendors can be on board – no matter what repository you use, it talks to a messaging layer which connects to aggregators, preservation services, reporting and analytics, institutional or external services.
  • Data Management Administration Online (sister to DMPonline) – about to be launched as a service – gives a birds eye view of all RDM/open science services at your institution. Can set KPIs, benchmark against similar institutions – has multiple views (DVC / librarian / data manager / IT manager etc). API driven including Tableau connector. Based on Jisc Research Data shared services and on messaging layer.
  • Mint – DOI minting tool (open source to work with PURE)
  • Library digitisation service / copyright compliance for content in Moodle. Reports on downloads and usage
  • Leganto implementation (migrated from Talis). Developed some Moodle integration: https://moodle.org/plugins/mod_leganto
  • Noise reporting – part of indoor mapping system – users can select where they are and give comments on noisiness – system provides heatmaps and helps detect common patterns. Can extend this for fault reporting, safety reporting.
  • Labs environment for quick-and-dirty eg library opening hours; research connections (extracting data from PURE, Scopus, SciVal, and Twitter APIs); preservation of research data – extracting from Pure into Archivematica (not in prod but possible); research data metadata (RDF based on Pure data); research outputs announcements (generated from Pure metadata for Twitter announcements; again not in prod but possible).

But when focused on learning machine learning etc and all the exciting stuff, it’s at the expense of real needs. So for snazzy stuff did learn and adopt Amazon infrastructure and a local caching infrastructure for Alma data, some IoT infrastructure (beacon based, sensor based eg noise and temperature, thermal imaging for people counting), natural language touch points eg messenger/Slack bots.

Have decided that every process will be reviewed with digital as part of it. Introducing more Excel skills with training; Alma analytics training; analytical thinking in general. Trying to embed digital team in all library processes

Looking at the Rapid Improvement Exercises model

Wrangling Primo Resource Recommender #anzreg2018

Resource wrangling : An implementation of Primo Resource Recommender service at State Library Victoria
Marcus Ferguson, SLV

Principles: find best way to present recommendations; control number of resources recommended; clearly identify subscription vs free. Include: databases, websites, research guides, custom types (collection pages, exhibition pages), ‘more to explore’ (originally things like library hours, now repurposed libguides for subjects) 488 resources – with 10,500+ tags. Maintaining this either through Back Office or spreadsheet upload was going to be difficult.

Built a spreadsheet with columns: list of keywords; database1..database5; website1..website3 etc; with dropdown menus to populate these. But then need to convert this. VLOOKUP wouldn’t work so needed custom function. Found a VBA function via Google. This operates on a new sheet to create a list of databases and all the tags used by it, plus a list of ‘other tags’ added manually for each one. Final sheet pulls it all together into the format Primo expects.
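
The conversion step (from one row per keyword to one row per resource with all its tags) is essentially an inversion of a mapping. The sketch below shows the idea in Python rather than the VBA actually used; the function name and input shapes are assumptions, and it stops short of the upload format Primo expects.

```python
from collections import defaultdict


def tags_by_resource(keyword_rows, other_tags=None):
    """Invert keyword -> resources into resource -> tags.

    keyword_rows: iterable of (keyword, [resource names]); empty cells are skipped.
    other_tags: optional {resource: [manually added tags]}.
    """
    tags = defaultdict(list)
    for keyword, resources in keyword_rows:
        for resource in resources:
            if resource and keyword not in tags[resource]:
                tags[resource].append(keyword)
    for resource, extras in (other_tags or {}).items():
        for tag in extras:
            if tag not in tags[resource]:
                tags[resource].append(tag)
    return dict(tags)
```

Keeping the manually added tags as a separate input mirrors the ‘other tags’ list in the spreadsheet, so regenerating from the keyword sheet does not overwrite them.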

Finally also assigned icons to improve visual effect – found from vendor branding pages; website; or social media. Looks bad if a resource has none, so assign a default logo in that case.

Subscription databases use ‘Login to access’ as URL text; free ones have title as URL text.

Added rr_database_icon_check as a keyword so can search in Primo for all of these and check that the icons are still valid – mostly they’re pretty stable. If that changes, will grab them and store locally.

Final step is VBA macro to save export version and backup.

Looking forward: need to assess impact of the May release “tag enrichment”; extend spreadsheet to include research guides; apply additional error checking; investigate ways to allow other librarians to work with the tags while managing change control.

Automating systematic reviews #anzreg2018

Automating systematic reviews with library systems: Are Primo and Alma APIs a pain reliever?
Peta Hopkins, Bond University

Systematic (literature) reviews especially in medical field – one example retrieved 40,000+ abstracts, screened to 1,356 full-text, and included 207 in the final review.

Were asked for process to reduce time involved down to two weeks. Developing toolset of elements to automate processes. Esp find/download full-text from subscriptions, batch-request from interloans.

  • Primo APIs to find/download? Not really (because actually it’s the Alma uResolver and even that can’t pull full-text).
  • Alma APIs to submit interloan requests? This has worked well – 95% success rate.

Old system searched Primo, clicked interloan link, tick copyright boxes, submit
Now upload Endnote file into system, click link to submit requests to Library, tick copyright boxes, submit (in bulk)

Dev wanted better documentation on APIs (eg encoding format); more helpful error messages; and in future want a way to find full-text and download.

Repositories at https://github.com/CREBP

Leganto implementations #anzreg2018

eReserve, Alma-D and Leganto: Working together
Anna Clatworthy, RMIT

Project to move all 14,000 Equella e-reserve items to Alma Digital in a format to suit Alma/Leganto copyright and digitisation workflows

All course readings at RMIT are digital; eReserve team in library accepts requests, scans items, uploads, sends a link back to use in CMS. Helps with copyright compliance. Mostly book extracts, some journal articles, Harvard Business Review

Lots of questions to consider: MARC or DC; multiple representation or single record; how to deal with CAL survey in middle of migration; how do records look in Primo and in Leganto (which they didn’t yet have live); what is copyright workflow and how to manage compliance?

DC records weren’t publishing correctly so migrated to MARC. (This may have been fixed now). Multiple portion representations on a single bib record – migration process quicker, chapter/portion info in 505_2$a. Custom 940 field with copyright info

Extracted parents as spreadsheet, extracted children as spreadsheet, script combined the two — then instead imported records from Libraries Australia with a norm rule for extra fields (505, 542 for extract and copyright info; 9XX for CAL information); trained non-library folk to use MDE and run norm rules.

eReserve in Alma has no custom fields. Creates confusion for non-eReserve staff (thinking they own the book so no need to buy it, though in fact only have 11 pages of ch.4 – looks like a book in Primo too!)
* DC doesn’t work in Analytics – only see title
* Determined best practices and process for migration; set up Alma-D collections config and display in Primo; created MARC RDA cataloguing template and training; Leganto training and pilot; configure Alma reading lists, copyright, Leganto set up, and more…..
* Would like enhancements:
– automatic fills in copyright workflow – only working for some fields
– search function in reading list view
– MARC deposit form
– digital viewer link – Share link doesn’t work, leads to ‘no permission’ page. (Users need to sign-in first but of course they don’t.)
* With Leganto, show-and-tells seem to be getting interest, as is word of mouth. Not actually live yet though due to IT delays.

Leganto at Macquarie University: impressions, adjustments and improvements
Kendall Kousek, Macquarie University

Macquarie had Equella for copyright collection. Teachers email list to library and list made searchable in Primo by unit code (via daily pipe). Move to Leganto to address some issues. Can search library for items or upload your own pdfs, images, etc.

Pilot with faculty of Arts to create reading lists for 9 courses. Next semester another 11; 1 person had done it before and confident enough to try their own. Next semester 3 departments; not many came to session but a few still created own reading list; total of 120 reading lists created.

Feedback – added survey as a citation to reading lists – not many respondents as end of semester. Later survey added to Moodle directly to capture those not using the reading list and find out why. Teachers liked how they could track how many used links and when (eg hour before class); ability to tag readings (eg literature review, assignment, extra); students like navigability and ability to suggest readings to teacher. Student satisfaction very high: clear layout, saved time chasing readings and can track reading in the week. Library staff liked layout, ease of learning/adding PCI records; Cite It! bookmarklet.

Improvements people wanted were better integration with Moodle (lots of clicks to get to article) and faster loading; students were also getting confused about whether discussions should be in Moodle or Leganto. Edge broke something so told students to use other browser. Want a ‘collapse all’ button for previous weeks to get straight to today’s: ExLibris are releasing this soon. Library staff want subsections functionality (ExL not going to do this, so using ‘notes’ feature instead).

Adjustments needed by
* students – easier to find readings in Primo – but not all are there (esp articles, chapter scans), Leganto is source of truth. So have created Resource Recommender record to link to Leganto.
* teachers – want them to create their own reading list instead of submitting it by email (or at least to include layout information in those emails). And get them to use more variety of resources.
* library staff – more collaboration, reading lists are never complete until end of semester so have to be on top of it.

Improvements
* teacher finding more engagement as students aware they can see usage! Another planning to be more ‘playful’ with reading lists. Appearance of Leganto makes students more aware of resources as resources instead of just a list. Feeling will plan their teaching through Leganto. One teacher saying “These are the questions for the week, what are the resources you’re using to answer them?”
* students can track which readings they’ve completed, can build own collection, can export in preferred referencing style.
* library staff have communication with teachers in Leganto; inclusion of all resource types (including web links using citation bookmarklet). Using public notes (eg trigger warnings)

4th stage of pilot will involve new departments, more volunteers by word of mouth. Need better communication/training eg presentations at dept meetings.

OER not currently dealt with – functionality maybe to come – can add CC license within a reading list but then depends on how widely you share that reading list!

Resource sharing partner synchronisation #anzreg2018

Managing Resource Sharing Partners in Alma
Nishen Naidoo, Macquarie University

  • Used to use VDX – external system, not transparent to end-user. But good that partners were managed centrally.
  • Alma provided single system, no additional user system integration, user experience via Primo and much richer. But partner management is up to each institution.
  • Connection options: broker (all requests via intermediary which handles billing) vs peer-to-peer
  • managing partners – contact details, and suspension status. Tricky to do this automatically so most people updating manually based on LADD webpage (AU suspension info), ILRS (AU addresses), Te Puna csv (NZ contact details), mailing lists announcements (NZ suspension announcements)
  • part 1 designed harvester to scrape data from these sources and put it into a datastore in json. Also capture and store changes eg of inst name or contact email.
  • part 2 designed sync service (API) to get data from datastore and upload to Alma. Needs your NUC symbol, an API key with read/write to resource sharing, and a configuration in Elasticsearch Index. (There’s a substantial technology stack.) Then pulls partner data from your Alma institution, and sync service creates partner records, compares with existing, updates Alma with changes.
  • future – hope to host in AWS. Wanting to get LADD/Te Puna to release data through proper API. Ideally Ex Libris would get data directly but at the moment can understand they wouldn’t want to touch it with a bargepole.
  • documentation and download at https://mqlibrary.github.io/resource-sharing-partners-harvest/ and https://mqlibrary.github.io/resource-sharing-partners-sync/
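
The compare-and-update step of the sync service can be sketched as a simple diff between harvested and existing partner records. This is an illustration under assumptions, not the project's code (which is at the links above): records are plain dictionaries keyed by a placeholder partner code, and the field names are hypothetical.

```python
def plan_sync(harvested, existing):
    """Compare harvested partner data with what the institution already holds.

    Both arguments map partner code -> dict of fields.
    Returns (to_create, to_update), where to_update holds only the changed fields.
    """
    to_create = {}
    to_update = {}
    for code, record in harvested.items():
        current = existing.get(code)
        if current is None:
            to_create[code] = record
            continue
        changed = {key: value for key, value in record.items() if current.get(key) != value}
        if changed:
            to_update[code] = changed
    return to_create, to_update
```

Sending only the partners that actually changed keeps the number of write calls down, and the same diff doubles as a change log.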

Authorities and identifiers in data sources #anzreg2018

The future of authorities and identifiers in national and international data sources; pros, cons & ROI.
Panel: Lynne Billington SLNSW, Libraries Australia representative, Catherine Amey NLNZ, Jenny Klingler Monash University, Ebe Kartus UNE

Libraries Australia syndicates data to Trove; headings to WorldCat; to VIAF (not clear how much identifiers are being used here but Wikidata grabs it; ISNI sends data to VIAF and ORCID has some relationship with ISNI)… Workflow never fully developed by LA due to lack of demand. Only 3 orgs regularly sending data for ingest to ANBD. Integrated in international identifier ecosystem and investing in staff training. RDA gives opportunity to enrich records – functionality not yet implemented by library systems. Advocates with vendors to ensure data can interoperate with national/international data ecosystem.

National Library of New Zealand – including iwi names under 373. Data goes to OCLC. Follow international standards except for Ngā Upoko Tukutuku. Recognised by LoC. Available as open dataset for download. Last year pilot project to convert to Linked Data format – trying to show reo-ā-iwi as concepts on an equal level.

Monash
Used to load ABN authority records to Voyager. Later aligned with LC authority records, automated from Validator – until the program stopped working. Bought and loaded weekly updates with Gary Strawn’s programs. Migrated to Alma where these programs didn’t work so joined NACO program. NACO authorities go to LC linked data and VIAF – LC authorities are in the Alma Community Zone. Can insert this into 024 field to hopefully enable linked data.
Staffing a major issue in metadata area – lack of support in this area with many staff retiring without being replaced. Tension between NACO headings and LA record bib headings. Time intensive, and delay of 2 weeks before get into CZ.

University of New England
frustrated that we’re worried about library data instead of being part of the semantic web. MARC Will Not Die – it’s an albatross around our neck.
Have tried redefining a few things eg $0 for a standard control number of a related authority or standard identifier; $1 for a URI that identifies an entity (which appears to generally include standard identifier).
Libraries need to be part of the web, not just on the web.
Risk of focusing on what authorities we can get in CZ because this will advantage big authorities and disadvantage local authorities that are important to our community.
Can’t put a triple into a relational database. How are we really going to start working toward a linked open data environment?
need to put in identifiers wherever possible and stop fussing about punctuation
return on investment – hard to show one way or another. We don’t have a system to show proof of concept. Need to take leap of faith, hopefully in partnership with a vendor.

 

Institutional repository in Alma Digital #anzreg2018

Optimising workflows and utilising Alma APIs to manage our Institutional Repository in Alma Digital
Kate Sergeant, University of South Australia

Used DigiTool as an interim solution with a long-term plan to move to Alma. For a while used the electronic resource component to manage metadata, with a local filestore. Last year finally moved everything properly into Alma Digital.

In early stage needed to generate handles and manage files. Phase 2 – development of templated emails to enable requesting outputs from researchers. Phase 3 last year – workflow management, data validation, author management, license management….

Get records submitted directly by researchers; harvest from Web of Science and Scopus APIs combined with Alma APIs for adding bib records. Land in Alma as suppressed records – often hundreds. Try to prioritise manually submitted stuff; and easier (eg journal articles) stuff. Make sets of incoming records.

Alma native interface doesn’t always show all data needed so use their own dashboard using the Alma APIs, which pulls out the things they care about (title, pub date, author, resource type, date added). Then have canned searches (eg Google title search, DOI resolver, DOI in Scopus, ISSN in Ulrichs, prepopulated interloan form…) . Look at metadata eg for authors/affiliations (links into Alma metadata editor for actual editing; links through to public profile). License information in Alma’s licence module. Shows archiving rights, version for archiving, embargo period – with links to copyright spreadsheet and to Sherpa/Romeo.
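
The ‘canned searches’ idea is essentially building links from a record's metadata. A minimal sketch, assuming a record is a dictionary with hypothetical title and doi keys; only the Google title search and DOI resolver links are shown, and this is not the dashboard's actual code.

```python
from urllib.parse import quote, urlencode


def canned_links(record):
    """Build ready-made lookup links from whatever metadata the record has."""
    links = {}
    title = record.get("title")
    if title:
        # Quote the title so the search engine treats it as a phrase.
        links["google_title"] = "https://www.google.com/search?" + urlencode({"q": f'"{title}"'})
    doi = record.get("doi")
    if doi:
        links["doi_resolver"] = "https://doi.org/" + quote(doi, safe="/")
    return links
```

Records missing a field simply get fewer links, which suits incoming records with patchy metadata.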

Would often forget to un-suppress the record – so added that ability at the point of minting the handle. The handle is then put into the record at the same time; and mint a DOI where relevant (eg data for ANDS).

Finally composes email to the researcher based on templates – eg post-print required – with a built-in delay for the email until after the item has actually gone live, which often takes 6hrs.

Dashboard also includes exception reports etc; record enhancement facility with eg WoS data; publication/person lookup.