
Automation and integration with Agile and continuous development #anzreg2018

Automation and integration
Peter Brotherton, SLNSW

Agile

  • Idea: requirements and solutions evolve rather than being defined upfront – a process of continual improvement, including of communication. Early and continuous delivery; welcoming changing requirements; communication and reflection with a view to tuning and adjusting. Working software is the primary measure of progress.
  • Challenges: a risk-averse culture; a documentation-heavy project management framework; mindsets that are hard to change. When they first tried to do Agile they just ended up doing waterfall over and over again. Agile training workshops were helpful.

CI/CD: Continuous Integration/Continuous Delivery/Deployment

  • Continuous Integration – merging feature branches back into the main branch frequently – requires test automation, both good unit tests and integration testing, to maintain quality.
  • Continuous Delivery – automated release process
  • Continuous Deployment – automated deployment to production
  • Unit-testing – testing units of source code (function, class, method) – see the sketch after this list
  • System-testing – testing integrated system, often through user interface
  • Docker is a lightweight containerisation technology that helps standardise application dependencies across environments, making dev setup and deployment easy.
  • Result: fewer bugs reach production and less time is spent on manual testing, despite releasing more frequently and so being more responsive.
  • They use Bamboo, and are also considering Jenkins.
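
Not from the talk, but a minimal sketch of the kind of unit test a CI server such as Bamboo or Jenkins would run on every merge; the function under test is hypothetical:

    # Minimal unit-test sketch. The CI server runs this on every merge to
    # the main branch and fails the build if any assertion fails.
    # normalise_call_number is a hypothetical unit under test.
    import unittest

    def normalise_call_number(raw: str) -> str:
        """Tidy a call number: trim, uppercase, collapse whitespace."""
        return " ".join(raw.strip().upper().split())

    class TestNormaliseCallNumber(unittest.TestCase):
        def test_strips_and_uppercases(self):
            self.assertEqual(normalise_call_number("  qa76.9 .d3 "), "QA76.9 .D3")

        def test_collapses_internal_whitespace(self):
            self.assertEqual(normalise_call_number("Z 699   .A1"), "Z 699 .A1")

    if __name__ == "__main__":
        unittest.main()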

Eg Alma acceptance tests

  • Can’t write unit tests (no access to the source code) and can’t control when releases happen – but can do browser-based system tests.
  • Audited critical business processes in each area of the library, documented them step by step in Excel, and started manual testing on each sandbox release – super tedious. Now working on automating the acceptance tests using Python Robot Framework (locating elements via the DOM or XPath, possibly also coordinates), which is working well – a rough sketch follows this list. (The auditing/documentation also highlighted efficiencies they could make in regular business processes.)
  • A UI change did break the script once; a data change hasn’t yet.
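
A rough sketch of what such a browser-based acceptance test looks like. SLNSW drive theirs through Robot Framework; for brevity this uses plain Selenium in Python, and the sandbox URL and locators are assumptions:

    # Browser-based system test sketch (assumed URL and locators; SLNSW's
    # real tests run under Robot Framework, which wraps Selenium).
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    try:
        # One audited business process: search for a known title and
        # assert that it appears in the results.
        driver.get("https://sandbox-institution.example.com/discovery")  # placeholder
        box = driver.find_element(By.XPATH, "//input[@id='searchBar']")  # assumed id
        box.send_keys("Picnic at Hanging Rock")
        box.submit()
        titles = driver.find_elements(By.XPATH, "//h3[contains(@class, 'item-title')]")  # assumed class
        assert any("Picnic at Hanging Rock" in t.text for t in titles), "title not found"
        print("acceptance test passed")
    finally:
        driver.quit()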

Analysing logs #anzreg2018

How to work with EZproxy logs in Splunk. Why; how; who
Linda Farrall, Monash University

Monash uses EZproxy for all access, both on and off campus, and manages EZproxy themselves. They use the logs for resource statistics and for preventing unauthorised access. Splunk is a log-ingestion tool – any equivalent would do.

Note: you can’t rely just on country changes, important though they are, because people use VPNs a lot – people in China especially appear to be elsewhere, and people often use a US VPN to watch Netflix and then forget to turn it off. Similarly, total downloads alone isn’t very telling, as illegal downloading often happens bit by bit.

The number of events per session ID can be an indicator, as can the number of sessions per user. And then there are suspicious referrers, eg SciHub! But some users do a search on SciHub because it’s more user-friendly, then come to get the article legally through their EZproxy.
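
A rough Python equivalent of those two indicators, for readers without Splunk (the field positions are assumptions – EZproxy's LogFormat is configurable):

    # Count events per session and sessions per user from an EZproxy log.
    # Field positions are assumptions: adjust to your LogFormat directive.
    from collections import Counter, defaultdict

    events_per_session = Counter()
    sessions_per_user = defaultdict(set)

    with open("ezproxy.log") as log:
        for line in log:
            fields = line.split()
            if len(fields) < 3:
                continue
            session, user = fields[1], fields[2]  # assumed positions
            events_per_session[session] += 1
            sessions_per_user[user].add(session)

    # Surface the heaviest sessions and the users with the most sessions.
    for session, n in events_per_session.most_common(10):
        print(f"{session}: {n} events")
    heavy_users = sorted(sessions_per_user.items(), key=lambda kv: -len(kv[1]))
    for user, sessions in heavy_users[:10]:
        print(f"{user}: {len(sessions)} sessions")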

https://github.com/prbutler/EZProxy_IP_Blacklist – Monash doesn’t use this directly, as it doesn’t want to encourage attackers to simply move to another IP.

They also run a report of users who appear to be testing accounts against different databases.

Splunk can send alerts based on queries. It is also doing work with machine learning, so it could theoretically learn ‘normal’ behaviour and alert on abnormal behaviour.
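
A toy illustration of that idea (nothing like Splunk's actual ML tooling): flag any user whose daily download count sits far outside the spread of the other users:

    # Toy 'learn normal, flag abnormal' sketch: leave-one-out z-score of
    # each user's daily downloads. Purely illustrative data and threshold.
    import statistics

    daily_downloads = {"alice": 40, "bob": 55, "carol": 38, "mallory": 900}

    for user, count in daily_downloads.items():
        others = [c for u, c in daily_downloads.items() if u != user]
        mean = statistics.mean(others)
        stdev = statistics.stdev(others) or 1.0  # guard against zero spread
        z = (count - mean) / stdev
        if z > 3:
            # Alert for investigation rather than auto-blocking.
            print(f"investigate {user}: {count} downloads (z={z:.1f})")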

But currently Monash does no automated blocking – it investigates anything that looks unusual first.


Working with Tableau, Alma, Primo and Leganto
Sabrina Alvaro (UNSW) and Megan Lee (Monash University)

Tableau Server: self-hosted or Tableau-hosted (these two give you more security options, eg to keep reports private), plus a public (free) version.

Tableau Desktop: similarly, enterprise vs public.

UNSW is using the self-hosted server and enterprise desktop, with 9 dashboards (or ‘projects’).

For Alma/Primo they can’t use the Ex Libris web data connector, so they extract Analytics data manually – it may be a server version issue.
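
One possible route around the manual extraction (not what UNSW described) is pulling a report straight from the Alma Analytics REST API; the host, report path and API key below are placeholders:

    # Sketch: fetch an Alma Analytics report via the REST API instead of
    # exporting it by hand. Host, report path and key are placeholders.
    import requests
    import xml.etree.ElementTree as ET

    resp = requests.get(
        "https://api-ap.hosted.exlibrisgroup.com/almaws/v1/analytics/reports",
        params={
            "path": "/shared/My University/Reports/Loans by month",  # placeholder
            "limit": 1000,
            "apikey": "l7xx...",  # placeholder
        },
        timeout=60,
    )
    resp.raise_for_status()

    # Rows come back in an OBIEE rowset namespace; count them as a sanity check.
    root = ET.fromstring(resp.content)
    ns = "{urn:schemas-microsoft-com:xml-analysis:rowset}"
    rows = root.findall(f".//{ns}Row")
    print(f"fetched {len(rows)} rows")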

Easy interface to create a report and then share it with a link or embed code.

UNSW is still learning. They want to join sources together, identify correlations, and capture user stories.

Integration with the Alma Course API #anzreg2018

The Alma Course API – An Exercise in Course Integration
David Lewis

The Alma Course Loader was inflexible – only runnable once a day, with no way to recover from errors – so they wanted to write their own. They migrated to Alma when SOAP was available; later they had to rewrite for the REST API. With the advent of Leganto the integration has become even more important.

API quotas matter, as does minimising the frequency of calls (especially as the same API gateway is used by all Alma customers!). Course field mappings are also important at the start. Another difficulty was course collapsing and parent-child course relationships (eg different cohorts within one course), which mattered at their university and was the hardest part to figure out. They ended up using the course code for normal courses and the parent course code for collapsed courses.
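
A minimal way to keep a bulk course sync inside a shared quota is to throttle calls client-side; this sketch (not the speaker's code, and the interval is an arbitrary example) caps the request rate:

    # Crude client-side throttle so a bulk course sync stays well under
    # the shared gateway's rate limit.
    import time

    MIN_INTERVAL = 0.5  # seconds between calls, ie at most 2 requests/sec
    _last_call = 0.0

    def throttled_get(session, url, **kwargs):
        """Wrap session.get, sleeping if the previous call was too recent."""
        global _last_call
        wait = MIN_INTERVAL - (time.monotonic() - _last_call)
        if wait > 0:
            time.sleep(wait)
        _last_call = time.monotonic()
        return session.get(url, **kwargs)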

They discovered that even when they asked for JSON, error messages would come back as XML and crash their system – so they ended up writing their program to use XML throughout instead of JSON.
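
A sketch of that workaround: request XML throughout so success and error responses parse the same way (host, query and API key are placeholders):

    # Talk to the Alma Courses API in XML throughout, since error bodies
    # may come back as XML even when JSON was requested.
    # Host, query and API key are placeholders.
    import requests
    import xml.etree.ElementTree as ET

    resp = requests.get(
        "https://api-ap.hosted.exlibrisgroup.com/almaws/v1/courses",
        params={"q": "code~HIST101", "apikey": "l7xx..."},  # placeholders
        headers={"Accept": "application/xml"},
        timeout=60,
    )

    root = ET.fromstring(resp.content)
    if resp.status_code != 200:
        # Error payloads carry errorMessage elements (namespace varies,
        # hence the endswith match); log and bail out.
        messages = [e.text for e in root.iter() if e.tag.endswith("errorMessage")]
        raise RuntimeError(f"Alma API error {resp.status_code}: {messages}")

    for course in root.iter("course"):
        print(course.findtext("code"), "-", course.findtext("name"))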

Logging is a good debugging tool and audit trail, and is useful when raising jobs with Ex Libris.

Senior management often doesn’t value the library’s contribution to course management – this is often political and requires a lot of awareness-raising among lecturers etc to get them to talk up the library to project managers.

Digital Strategy and Skills Development – A Balancing Act #anzreg2018

Digital Strategy and Skills Development – A Balancing Act
Masud Khokhar

“A short history of an ambitious team who curbed their enthusiasm for the larger good” / “of an ambitious team who told their evil overlord to shh and calm down”

The team works to enhance the reach, impact and potential of digital and research – partnering with researchers, which can lead to moments of optimism.

Key drivers: rapid tech changes, the impact of machine learning, the growth of digital scholarship, the need for evidence-driven decision making, and a lack of general-purpose digital skills and ways of thinking among non-tech staff. Lancaster added ‘digitally innovative’ to its strategy and has a digital vision for the university (digital research / digital teaching and learning / digital engagement).

So the library needed to be digitally innovative and digitally fluent, with diversity of thinking as a core principle – it formed an innovation group to actively seek partnerships, build confidence, develop leadership, and inspire creativity. It wanted insight into customer behaviour in order to develop data-driven services.

Most ideas actually turned out to be non-digital in nature – some required digital work, more required cultural change!

Ideas/projects

  • A Primo learning wizard for first-time users (but most people don’t log in, so there are issues with them seeing it again and again).
  • Research data shared service – repository, preservation, reporting – a collaboration with 15 institutions. Looking at a framework agreement/interoperability standard so a variety of vendors can be on board – no matter what repository you use, it talks to a messaging layer which connects to aggregators, preservation services, reporting and analytics, and institutional or external services.
  • Data Management Administration Online (sister to DMPonline) – about to be launched as a service – gives a bird’s-eye view of all RDM/open science services at your institution. Can set KPIs and benchmark against similar institutions; has multiple views (DVC / librarian / data manager / IT manager etc). API-driven, including a Tableau connector. Based on the Jisc Research Data shared services and on the messaging layer.
  • Mint – a DOI minting tool (open source, built to work with Pure)
  • Library digitisation service / copyright compliance for content in Moodle; reports on downloads and usage.
  • Leganto implementation (migrated from Talis). Developed some Moodle integration: https://moodle.org/plugins/mod_leganto
  • Noise reporting – part of an indoor mapping system – users select where they are and comment on noisiness; the system provides heatmaps and helps detect common patterns. This can be extended to fault reporting and safety reporting.
  • A labs environment for quick-and-dirty work, eg library opening hours; research connections (extracting data from the Pure, Scopus, SciVal and Twitter APIs); preservation of research data (extracting from Pure into Archivematica – not in prod but possible); research data metadata (RDF based on Pure data); research outputs announcements (generated from Pure metadata for Twitter announcements – again not in prod but possible).

But a focus on machine learning etc and all the exciting stuff comes at the expense of real needs. Still, on the snazzy side they did learn and adopt Amazon infrastructure, a local caching infrastructure for Alma data, some IoT infrastructure (beacon-based; sensor-based, eg noise and temperature; thermal imaging for people counting), and natural-language touch points, eg Messenger/Slack bots.

They have decided that every process will be reviewed with digital as part of it. Introducing more Excel skills through training; Alma Analytics training; analytical thinking in general. Trying to embed the digital team in all library processes.

Looking at the Rapid Improvement Exercises model.