One of the papers at Open Repositories 2017 I couldn’t attend was:
Maximize the Visibility and Impact of Open Access and other Articles through integration of Publisher APIs
Letitia Mukherjee (Elsevier), Robert Phillips (University of Florida)
The University of Florida searched for solutions to expand access to university-authored journal articles thru institutional repository. UFL and Elsevier collaborated to automatically feed journal platform data and links to the IR through free APIs. The project enabled UFL to support university authors/researchers and compliance with US public access policies.
I wrote most of this blog post based on what I heard about the presentation at conference, and my own investigations a couple of days later (i.e. a month ago); I’ve made some small edits and am posting this now after seeing the presentation recording on YouTube.
I first read about this project a year ago in an Inside Higher Ed article (in which Alicia Wise is quoted with an infuriating “The nice thing about this pilot is it opens up the repository”. No, it doesn’t open the repository. The repository was already open. It also doesn’t open up Elsevier content, which remains completely closed) and in a more sceptical blog post (which describes it as turning the repository into “a de facto discovery layer”. From what I can tell, this is being extraordinarily generous: as a discovery layer it doesn’t even make a particularly good Amazon affiliate programme, because Amazon at least pays you a few cents for the privilege of linking to them.)
Before going further I want to make it clear that any and all scathing comments I make in this post are reflective of my opinions about Elsevier stinginess, not about the repository or its staff who are clearly just doing what Elsevier allows them to do. Also I’m writing about the system as it is right now (Phase I). [Phase II was briefly discussed starting about 18:55 in the video and in Q&A at the end of the presentation.]
While still at conference, I heard that Robust Discussion was had following the presentation (and this is captured in the video too). Among other questions, an audience member asked if Elsevier would offer all subscribers the ability to download final accepted manuscripts via API for example (21:59). The eventual answer (after some confusion and clarification) seems to be that it’s not currently available to all subscribers as they’re creating author manuscripts specifically for the pilot and need to work out whether this is scalable (24:44). [This raises the question to me of why. Why not just use the actual author manuscript instead of converting the author manuscript into the publisher manuscript and then apparently converting it back?]
In any case, when I asked the same question at the vendor stall, I was told that if they provided the PDF to repositories, they wouldn’t be able to track usage of it. The vendor also asked me why we’d want the PDF. I talked about preservation, primarily because I foolishly assumed that the system they’ve got with Florida actually worked as advertised to provide ‘public access’, but a couple of days later, somewhat recovered from the exhaustion of conference, I had second thoughts. Because of course the other things that we want are full-text searching and access via Google Scholar. Also access for the general public, not just our own university. Also, well, access at all. I thought this went without saying until I actually began to test how it works in practice.
So the University of Florida’s repository is IR@UF. I ran a general search for {Elsevier} and turned up 32,987 results. I chose an early result that wasn’t from the Lancet because the Lancet is a special snowflake: “(1 1 0) and (1 0 0) Sidewall-oriented FinFETs: A performance and reliability investigation”. The result is plastered honestly with “Publisher version: Check access”.
Is it open access? I clicked on the title. Elsevier has made much of “embedding” the content in the repository. I think this is in fact intended for phase II, but they’d managed to give the impression that it was already in place, so at this point I expected to be taken to a repository page with a PDF embedded in an iframe or possibly some unholy Flash content. Instead, I was taken straight to the item’s pay-to-download page on Elsevier. Further exploration uncovered no additional ways to access the article. So there’s no access for the public: it’s not open access and it does absolutely nothing to support “compliance with US public access policies”.
Is it easily accessible to institution members? If I were a UFL student or staff member who happened to be off-campus (say, at a conference, or researching from home) there’s no visible way to log in to access the article. I assume UFL has IP access to content, in which case it’d work on campus or through a VPN, but that’s it.
Is it findable through full-text search? I dug up access through my own library to download the PDF so I could select a phrase early on in the full-text that didn’t appear in the title or abstract. But doing a full-text search in IR@UF for {“nMOS FinFET devices”} resulted in “Your search returned no results”.
(Just to be sure the full-text search was working, I also tried it with a phrase from the title, {“Sidewall-oriented FinFETs”}, which did bring up the desired article. The link from this result is broken, though, which is presumably a bug in the implementation of the scheme, since links for non-Elsevier results on similar full-text searches are fine.)
Is it findable via Google Scholar? Scholar lists 6 records for the article, none of which are via IR@UF. Not, at this point, that there’s any advantage to seeing the IR@UF version anyway, but the pilot is certainly not driving traffic to the repository.
Is it a discovery layer? Even aside from the lack of full-text search and the inability to get access off-campus, it only works for ScienceDirect articles by UFL authors, so no.
If I had to come up with an analogy for what it is and does, I guess I’d say it’s a bit like a public-facing RIMS or CRIS, except those would include more data sources and more reporting functionality.
So to answer the question as I could have if I’d realised how limited this functionality is: why do institutional repositories want to have the full text?
- to make it discoverable via full-text searching
- to provide easy access for our own institution’s members
- to provide open access for the rest of the world
- thereby increasing its impact (including but not limited to that measured in citations and altmetrics)
- to ensure it’s preserved and accessible for the centuries to come
- to bring traffic to our own repository and the rest of its valuable collections; and
- to track usage.
UFL’s repository can do this last one. Sort of. It’s got a page for “Views” (hits) and “Visits” (unique visitors). But it doesn’t tell us how many of these visitors actually succeeded in accessing the full-text. My suspicion is that this number would be much lower.
Phase II, if it works as advertised, may address some of these issues, but I’m not sure how many. I feel we’re getting conflicting messages of how it will actually function and at this point am not inclined to believe anything until I see it in action. For now it’s the same as any other vapourware.