Tag Archives: open access

Checking out the Elsevier / U of Florida pilot

One of the papers at Open Repositories 2017 I couldn’t attend was:

Maximize the Visibility and Impact of Open Access and other Articles through integration of Publisher APIs
Letitia Mukherjee (Elsevier), Robert Phillips (University of Florida)

The University of Florida searched for solutions to expand access to university-authored journal articles through the institutional repository. UFL and Elsevier collaborated to automatically feed journal platform data and links to the IR through free APIs. The project enabled UFL to support university authors/researchers and compliance with US public access policies.

I wrote most of this blog post based on what I heard about the presentation at conference, and my own investigations a couple of days later (ie a month ago); I’ve made some small edits and am posting this now after seeing the presentation recording on YouTube.


I first read about this project a year ago in an Inside Higher Ed article (in which Alicia Wise is quoted with an infuriating “The nice thing about this pilot is it opens up the repository”. No, it doesn’t open the repository. The repository was already open. It also doesn’t open up Elsevier content, which remains completely closed) and in a more sceptical blog post (which describes it as turning the repository into “a de facto discovery layer”. From what I can tell, this is being extraordinarily generous: as a discovery layer it doesn’t even make a particularly good Amazon affiliate programme, because Amazon at least pays you a few cents for the privilege of linking to them.)

Before going further I want to make it clear that any and all scathing comments I make in this post are reflective of my opinions about Elsevier stinginess, not about the repository or its staff who are clearly just doing what Elsevier allows them to do. Also I’m writing about the system as it is right now (Phase I). [Phase II was briefly discussed starting about 18:55 in the video and in Q&A at the end of the presentation.]

While still at conference, I heard that Robust Discussion was had following the presentation (and this is captured in the video too). Among other questions, an audience member asked if Elsevier would offer all subscribers the ability to download final accepted manuscripts via API for example (21:59). The eventual answer (after some confusion and clarification) seems to be that it’s not currently available to all subscribers as they’re creating author manuscripts specifically for the pilot and need to work out whether this is scalable (24:44). [This raises the question to me of why. Why not just use the actual author manuscript instead of converting the author manuscript into the publisher manuscript and then apparently converting it back?]

In any case, when I asked the same question at the vendor stall, I was told that if they provided the pdf to repositories, they wouldn’t be able to track usage of it. The vendor also asked me why we’d want to. I talked about preservation, primarily because I foolishly assumed that the system they’ve got with Florida actually worked as advertised to provide ‘public access’ but a couple of days later, somewhat recovered from the exhaustion of conference, I had second thoughts. Because of course the other things that we want are full-text searching and access via Google Scholar. Also access for the general public, not just our own university. Also, well, access at all. I thought this went without saying until I actually began to test how it works in practice.

So the University of Florida’s repository is IR@UF. I ran a general search for {Elsevier} and turned up 32,987 results. I chose an early result that wasn’t from the Lancet because the Lancet is a special snowflake: “(1 1 0) and (1 0 0) Sidewall-oriented FinFETs: A performance and reliability investigation”. The result is honestly plastered with “Publisher version: Check access”.

Is it open access? I clicked on the title. Elsevier has made much of “embedding” the content in the repository. I think this is in fact intended for phase II but they’d managed to give the impression that it was already in place so at this point I expected to be taken to a repository page with a PDF embedded in an iframe or possibly some unholy Flash content. Instead, I was taken straight to the item pay-to-download page on Elsevier. Further exploration uncovered no additional ways to access the article. So there’s no access to the public: it’s not open access and it does absolutely nothing to support “compliance with US public access policies”.

Is it easily accessible to institution members? If I was a UFL student or staff member who happened to be off-campus (say, at a conference, or researching from home) there’s no visible way to login to access the article. I assume UFL has IP access to content in which case it’d work on campus or through a VPN, but that’s it.

Is it findable through full-text search? I dug up access through my own library to download the pdf so I could select a phrase early on in the full-text that didn’t appear in the title or abstract. But doing a full-text search in IR@UF for {“nMOS FinFET devices”} resulted in “Your search returned no results”.

(Just to be sure the full-text search was working, I also tried it with a phrase from the title, {“Sidewall-oriented FinFETs”}, which did bring up the desired article. The link from this result is broken, though, which is presumably a bug in the implementation of the scheme, since links for non-Elsevier results on similar full-text searches are fine.)

Is it findable via Google Scholar? Scholar lists 6 records for the article, none of which are via IR@UF. Not, at this point, that there’s any advantage to seeing the IR@UF version anyway, but the pilot is certainly not driving traffic to the repository.

Is it a discovery layer? Even aside from the lack of full-text search and the inability to get access off-campus, it only works for ScienceDirect articles by UFL authors, so no.

If I had to come up with an analogy for what it is and does, I guess I’d say it’s a bit like a public-facing RIMS or CRIS, except those would include more data sources and more reporting functionality.

So, to answer the question as I could have done had I realised how limited this functionality is: why do institutional repositories want the full text?

  • to make it discoverable via full-text searching
  • to provide easy access for our own institution’s members
  • to provide open access for the rest of the world
    • thereby increasing its impact (including but not limited to that measured in citations and altmetrics)
  • to ensure it’s preserved and accessible for the centuries to come
  • to bring traffic to our own repository and the rest of its valuable collections; and
  • to track usage.
    UFL’s repository can do this last one. Sort of. It’s got a page for “Views” (hits) and “Visits” (unique visitors). But it doesn’t tell us how many of these visitors actually succeeded in accessing the full-text. My suspicion is that this number would be much lower.

Phase II, if it works as advertised, may address some of these issues, but I’m not sure how many. I feel we’re getting conflicting messages of how it will actually function and at this point am not inclined to believe anything until I see it in action. For now it’s the same as any other vapourware.

Perverse incentives and the reward structures of academia #or2017

Perverse incentives: how the reward structures of academia are getting in the way of scholarly communication and good science
by Sir Timothy Gowers

Abstract:

The internet has been widely used for the last 20 years and has revolutionized many aspects of our lives. It has been particularly useful for academics, allowing them to interact and exchange ideas far more rapidly and conveniently than they could in the past. However, much of the way that science proceeds has been affected far less by this development than one might have expected, and the basic method of communication of ideas — the journal article — is not much different from how it was in the seventeenth century.

It is easy to imagine new and better methods of dissemination, so what is stopping them from changing the way scientists communicate? Why has the journal system proved to be far more robust than, say, the music industry, in the face of the new methods of sharing information?

The dream: that all information is available on-tap, accessible through a few clicks. We’ve got a bit of that via Google, Wikipedia, YouTube, map sites, travel sites, news sites. But a lot of content is for subscribers only: you can find content in Google Books – but only a page or so before it’s cut off due to copyright.

Of course copyright holders need incentives to create, to cover costs, etc. Academics aren’t directly paid though – actually the barriers to content are the bigger problem. Covering costs? Maybe “if you insist on antiquated methods of publication”.

What could we share? The not-yet-complete idea, that others can build on. OTOH if everyone shared everything they thought it could end up a complete mess. But we can make order from chaos.

In maths the revolution has started: his library transitioning from painfully closed stacks to open stacks coincided unfortunately with not even needing to go to the library for content anyway. Wikipedia for basic concepts; arXiv.org for preprints (don’t need journals at all in maths); OEIS (database of sequences of whole numbers along with formulae for generating them); MathOverflow – for questions at the research level – usually get a useful answer within a few hours.

Traditional way of doing things is the “lone genius” model. But he thought it’d be interesting to solve a problem in public. So he posted some initial thoughts on his blog and invited contribution. Traditionally there’s a fear of getting scooped – but doing it completely in the open, timestamps mean no-one can take credit; in fact it rewards putting your comment up quickly before someone else can. Problem was solved in just 6 weeks.

Perverse incentives in maths:

  • personal ambition
  • reward for being first (not for being inspiration, or for being second but with a better solution)
  • primacy of journal article while expository and enabling activities are downplayed – when you start writing textbooks instead of journal articles this is seen as your career slowing down
  • little recognition for incomplete ideas

These are obstacles to efficiency.

Paradox of paywalls: mathematicians write, peer review, edit; dissemination costs almost nothing; almost all interesting recent content is on arXiv (which can include final accepted manuscript – and anyway it’s not much different from the preprint); and still libraries pay huge subscription fees. The problem is the internet came along very quickly while we’re still doing things the old way.

Some initiatives:

  • his blog post about personal Elsevier boycott which inspired someone to set up a pledge which thousands signed
  • Open Library of Humanities set up when the journal Lingua left Elsevier and became Glossa
  • Discrete Analysis (arXiv overlay journal) set up as proof of concept for cheap journal publication, with US$10 submission charge – and a nice user interface
  • No.Big.Deal – trying without success to get Cambridge and JISC to bargain better
  • Freedom of Information Act requests to UK universities for how much unis are paying Elsevier (contra confidentiality clauses)

Perverse incentives are held up by the whole network of publishers, editors, writers, readers (subdivides into people actually reading it, and people scanning it to judge the writer eg hiring committees), librarians (who have the power to cancel – but subject to academic criticism), scholarly societies (who often derive income from publishing journals), consortium negotiators, funders (in a good position to create mandates) – creating a situation where it’s very difficult to change things.

Feels like he’s had little effect, but it’s important to have lots of little initiatives which together build to pull the wall down.

Dataset published on access to conference proceedings – thank you!

Thanks to all who’ve helped —

(Andrea, apm, Catherine Fitchett, Sarah Gallagher, Alison Fields, KNB, Manja Pieters, Brendan Smith, Dave, Hadrian Taylor, Theresa Rielly, Jacinta Osman, Poppa-Bear, Richard White, Sierra de la Croix, Christina Pikas, Jo Simons, and Ruth Lewis, plus some anonymous benefactors)

— all the conferences I was investigating have been investigated. 🙂  I’ve since checked everything for consistency and link rot, added in a set of references that I had to research myself as I couldn’t anonymise them sufficiently in the initial run; deduplicated a few more times – conference names vary ridiculously – and finally ended up with a total of 1849 conferences which I’ve now published at https://dx.doi.org/10.6084/m9.figshare.3084727.v1

The immediately obvious stats from this dataset include:

Access to proceedings

  • 23.36% of conferences in the dataset had some form of free online proceedings – full-text papers, slides, or audiovisual recordings.
  • 21.85% had a non-free online proceedings
  • 30.72% had a physical proceedings available – printed book, CD/DVD, USB stick, etc, but not including generic references to proceedings having been given to delegates
  • 45.27% had no proceedings identifiable

(Percentages don’t add to 100% as some conferences had proceedings in multiple forms.)

Access to free online proceedings by year

This doesn’t seem to have varied much over the 6 years most of the conferences took place in:

2006: 39 / 173 = 22.54%
2007: 39 / 177 = 22.03%
2008: 62 / 258 = 24.03%
2009: 63 / 284 = 22.18%
2010: 105 / 428 = 24.53%
2011: 123 / 520 = 23.65%
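(If you want to sanity-check these rates yourself, they can be recomputed directly from the raw counts listed above – nothing here beyond the numbers already in the table:)

```python
# Free-online-proceedings counts per year, as listed above:
# year -> (conferences with free online proceedings, total conferences that year)
free_by_year = {
    2006: (39, 173),
    2007: (39, 177),
    2008: (62, 258),
    2009: (63, 284),
    2010: (105, 428),
    2011: (123, 520),
}

for year, (free, total) in sorted(free_by_year.items()):
    # .2% formats the ratio as a percentage with two decimal places
    print(f"{year}: {free} / {total} = {free / total:.2%}")
```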

Conferences attended by country

Conferences attended were held in 75 different countries; those with more than 20 conferences were:

New Zealand: 429
USA: 297
Australia: 286
UK: 130
Canada: 67
China: 66
Germany: 44
France: 41
Italy: 35
Portugal: 31
Japan: 29
Spain: 28
Netherlands: 27
Singapore: 25

I won’t break down access to proceedings here, because this data is inherently skewed by the nature of the sample: conferences attended by New Zealand researchers. This means that small conferences in or near New Zealand are much more likely to be included than small conferences in other parts of the world. If a small conference is less resourced to put together and maintain a free online proceedings – or conversely a large society conference is prone to more traditional (non-free) publication options – this variation by conference size/type could easily outweigh any actual variation by country. So I need to do some thinking and discussing with people to see if there’s any actual meaning that can be pulled from the data as it stands. If you’ve got any thoughts on this I’d love to hear from you!

Further analysis now continues….

Progress report on how you’ve helped my research

At this point at least 20 people have helped me look for conference proceedings (some haven’t left a name so it’s somewhere between 20 and 42), which is awesome: thank you all so much! Last week saw us pass the halfway mark, an exciting moment. As of this morning, statistics are:

  • 1187 out of 1958 conferences investigated = 61% done
  • 312 have proceedings free online (26%)
  • of those without free proceedings, 292 have non-free proceedings online
  • of those without any online proceedings, 109 have physical proceedings (especially books or CDs)
  • 472 have no identifiable proceedings (40%)

I’ve got locations for all 1958, pending some checking. Remember this is out of conferences that New Zealand researchers presented at and nominated for their 2012 PBRF portfolio.

The top countries are:
New Zealand    492
Australia    315
USA    304
UK    133
Canada    69
(with China close behind at 68)

In New Zealand, top cities are predictably:
Auckland    154
Wellington    98
Christchurch    53
Dunedin    38
Hamilton    35

Along the way I’ve noticed some things that make the search harder:

  • sometimes authors, or the people verifying their sources, made mistakes in the citation
  • or sometimes people cited the proceedings instead of the conference itself – this isn’t a mistake in the context of the original data entry but makes reconciling the year and the city difficult.
  • or sometimes their citation was perfectly clear, but my attempt to extract the data into tidy columns introduced… misunderstandings (aka terrible, terrible mistakes).
  • or we’ve ended up searching for the same conference a whole pile of times because various people call it the Annual Conference of X, the Annual X Conference, the X Annual Conference, the International Conference of X, the Annual Meeting of X, etc etc.

On the other hand I’ve also noticed some things that make the search easier – either for me:

  • having done so many, I’m starting to recognise titles, so I can search the spreadsheet and often copy/paste a line
  • when all else fails I have access to the source data, so I can look up the title of the paper if I need to figure out whether I’m trying to find the 2008 or 2009 conference.

And things that could be generally helpful:

  • if a conference makes any mention of ACM, whether in the title or as a sponsor, then chances are the proceedings are listed in http://dl.acm.org/proceedings.cfm
  • if it mentions IEEE, try http://ieeexplore.ieee.org/browse/conferences/title/ – if it’s there, then on the page for the appropriate year, scroll down and look on the right for the “Purchase print from partner” link. Chances are you’ll get a page with an ISBN for the print option, plus confirmation of the location, which is harder to find on IEEE Xplore itself.
  • if it’s about computer science in any way, shape or form, then http://dblp.uni-trier.de/search/ can probably point you to the source(s). This is the best way to find anything published as a Lecture Notes in Computer Science (LNCS) because Springer’s site doesn’t search for conferences very well.
  • if you do a web search and see a search result for www.conferencealerts.com, this will confirm the year/title/location of a conference, and give you an event website (which may or may not still be around, but it’s a start). Unfortunately I haven’t found a way to search the site directly for past conferences.
  • a search result for WorldCat will usually confirm year/title/location and (if you scroll down past the holding libraries) often give you the ISBN for the print proceedings.

And two things that have delighted me:

  • Finding some online proceedings in the form of a page listing all the papers’ DOIs – which resolve to the papers on Dropbox.
  • Two of the conferences in the dataset have no identifiable city/country – because they were held entirely online.

I am of course still eagerly soliciting help, if anyone has 10 minutes here or there over the next month (take a break from the silly season? 🙂). Check out my original post for more, or jump straight to the spreadsheet.

Help me research conference proceedings and open access

I’ve been interested for a while in the amount of scientific/academic knowledge that gets lost to the world due to conference proceedings not being open access / disappearing off the face of the internet. My main question at the moment is, just how much is lost and how much is still available?

Unfortunately googling 1,955 conferences will rapidly give me RSI, so I’m hoping I can convince you to do a few for me – in the interests of science!

Background: I’ve written elsewhere about Open Access to conference literature (short version: conferences are where a huge amount of research gets its first public airing, yet conference papers are notoriously hard to track down after the fact) and Open Access and the PBRF (short version: if conference papers were all OA, PBRF verification/auditing would become a lot easier). Here I’m wanting to quantify the situation.

The data: The original dataset was sourced from TEC, from the list of conference-related NROs (nominated research outputs) from the 2012 PBRF round. There are obvious and non-obvious limitations but basically I feel this makes it a fairly good listing of conferences between 2006-2011 that New Zealand academics presented at and felt that presentation was worthy of being included among their best work for the period. The original dataset is confidential, but I’ve received permission to post a derived, anonymised dataset publically for collaborative purposes, and in due course publish it on figshare.

How you can help:
(Note: by contributing to the spreadsheet you’re agreeing to licence your contribution under a Creative Commons Zero licence, meaning anyone can later reuse it in any way with or without attribution. (Though I’ll be attributing it in the first instance – see below.))

  1. Go to the spreadsheet containing the list of conferences
  2. Pick a conference that doesn’t have any URLs/notes/name-to-credit
  3. Search Google/DuckDuckGo/your search engine of choice for the conference name, year, and city to find a conference website. Assuming you find one:
  4. Correct any details that are wrong or missing: eg expand the acronym; add in missing locations; if the website says it’s the 23rd annual conference put “23” in the “No.” column, etc.
  5. Browse on the website for proceedings, list of papers, table of contents, etc. If you find:
    • a list of papers including links to the full text of each paper freely accessible, paste the URL in “Proceedings URL: free online”
    • a list of papers including links to the full text but requiring a login (including in a database or special journal issue), paste the URL in “Proceedings URL: non-free online”
    • information about offline proceedings eg a CD or book, paste the URL in “Proceedings URL/info re print/CD/etc”
    • none of the above, paste the URL of the conference website for that year in “Other URL: conference website”
  6. If you can’t find any conference website at all, write that in “Any notes” so others don’t try endlessly repeating the futile search!
  7. Sign with a “Name to credit” for your work. If you’d prefer to remain anonymous, put in n/a.
  8. If you like, return to step 2. 🙂
  9. Share this link around!

What I’ll do with it:
First I’ll check it all! And obviously I’ll pull it back into my research and finish that up. I’ll also publish the final checked dataset on figshare under Creative Commons Zero licence so others can use it in their research. I’ll acknowledge everyone who helps and provides a name, in the creation of the dataset and in the paper I’m working on. And if someone wants to do a whole pile and/or be otherwise involved in the research then talk to me about coauthorship!

Why don’t I just use…

  • Mechanical Turk: I’m boycotting Amazon, for various reasons. Plus I consider a fair price for the work would be at least US$0.50 a conference (possibly double that) and as that’s a bit harder to afford I feel more ethical being upfront about asking folk to do it for free.
  • Library assistants: I am doing this a bit, but only for the limited period before summer hours when things have got quiet enough that they have time.
  • Something else: Ask me, I may want to!

Other questions
Please comment or email me.

Innovations in publishing; giving control back to authors #theta2015

Innovations in publishing; giving control back to authors
Virginia Barbour, Executive Officer, Australian Open Access Support Group (ORCID)

Lovely slide comparing a title page for the 1665 Phil.Trans of the Royal Society vs a 2014 Royal Society Open Science article on the web including a YouTube movie of the subject seadragon.

What’s worked well and not-so-well? Online > free > data > attribution > authorship > open
(Difference between ‘free’ and ‘open’ is important!)

We’ve changed the philosophy. We’ve begun to understand what we can do with the web. We’ve seen an explosion of models – not just for open, but also for toll. We’ve begun to ‘harness collective intelligence’. We’ve got the technology and processes to do open access, so with Creative Commons we can clearly label what people can/can’t do with something.

So have we fixed publishing? Hmm.

We need new thinking in peer review. Example of CERN paper appearing to find faster-than-light results and putting it up on arXiv for peer review so that someone could figure out what they’d done wrong. But also post-publication peer review – ~”the terrifying thing of publishing OA is that if you’re wrong someone will tell you about it on Twitter five minutes later”. PubMed Commons

Claiming contributions and identity. Disambiguating multiple authors with the same name. Technology is catching up with this. Hugely empowering, especially for women whose names may change with marriage or divorce.

What’s the right version of an article? Can provide “CrossMark” telling you if there’s an update – even works on downloaded PDFs on your computer.

But most of the debate around open access is driven by publishers. How do authors get control? Knowledge.

Areas where she wants authors to have knowledge:

  • where to publish
  • understanding peer review and the black box of publishing
  • understanding how open something is and what can be done with it (eg data mining)

Susan L Janson “research is not finished until it’s published”
Authors need to care as much about publishing as about researching.

The confusing jargon of free

I’m constantly encountering confusion about whether something is in the public domain, or whether it’s open access. And it’s no wonder, because the terminology is inherently confusing.

If someone’s heard that material in the public domain is free for the taking, why shouldn’t they think that a blogpost or a tweeted photo — material on domains that are sometimes excruciatingly public — is included in that?

If publishers have heard about how great open access is, why shouldn’t they think that making some content openly accessible on their site is worthy of press releases vaunting how awesome they are?

(That one was a trick question. Publishers shouldn’t think that because it’s their job to be informed about this stuff. When I see a publisher talking about their “open access” site while their footer continues to be blazoned with “all rights reserved”, I don’t assume they just haven’t come across a proper definition before. I assume they’re wilfully taking advantage of the confusing terminology in order to intentionally deceive people while retaining plausible deniability, and they go on my list of Do Not Trust The Evil.)

The opposite of ‘public domain’ isn’t ‘private’; it’s ‘copyrighted’. This means:

  • Material created in the 19th century and earlier is mostly in the Public Domain (even if it’s in private ownership) because the copyright has expired.
  • Material created recently is generally not in the Public Domain (even if the copyright-holder has made it public by publishing it in a book, a newspaper, a webpage, a social media post, Times Square, and/or laser-writing on the moon) but is rather protected by copyright law. This means the copyright-holder — who is often but not always the author — holds the right to decide what other places the work can or can’t be published in.

The opposite of ‘open access’ isn’t ‘inaccessible’; it’s ‘all rights reserved’.

Something that’s inaccessible can’t be open access; this is true. But being accessible isn’t sufficient. Access has to be guaranteed, either by virtue of the material being in the public domain, or by means of the copyright-holder granting an appropriate license, aka permissions, to users of the material. This allows users to share/take over responsibility for making the material accessible if the copyright-holder can no longer, or no longer wants to, do it themselves.

This is abstract and therefore potentially confusing, so let’s look at a concrete example like Chris Hadfield’s cover of “Space Oddity”. Oh wait — we can’t look at it anymore, because while it was openly accessible for a year, it was never open access. David Bowie’s representatives gave permission for the song to be used for one year, so for one year the video was accessible. But no-one ever gave viewers permission to make and upload their own copies of it to guarantee perpetual access.

(Okay, so users have nevertheless made their own copies and uploaded them all over the place. This is because, firstly, the Internet is forever, and secondly, the video is fantastic. But every single one of these copies is illegal.)

People more familiar with the scholarly publishing landscape may notice I’m almost arguing that green open access and gold open access aren’t actually open access. And you know, I’m okay with saying that an open access article which disappears from the web because the only institutional repository allowed to store it goes down; or an open access journal which suddenly decides to shut all its previously accessible content behind a paywall — that these were never actually open access.

Open access means not just knowing that it’s accessible to everyone now, but knowing that it’s allowed to be accessible to everyone in the future too.

Loyalty cards for scholarly publishing

Two things I’ve come across recently which I don’t think I’ve seen before:

“Each article published in ACS journals during 2014 will qualify the corresponding author for ACS Author Rewards article credit. Credits issued under this program, at a total value of $1,500 per publication, may be used to offset article publishing charges and any ACS open access publishing services of the author’s choosing, and will be redeemable over the next three years (2015-2017).”
American Chemical Society extends new open access program designed to assist authors

“Under [IOP’s] new programme, referees will be offered a 10% credit towards the cost of publishing on a gold open access basis when they review an article.”
Changing the way referees are rewarded

(I’m presuming, though it’s not explicit, that these credits are additive, so if you published 2 toll-access articles with ACS you’d get $3,000 credit, and if you refereed 10 IOP articles you’d get to publish 1 article on a gold open access basis for free.)

I find this fascinating. The obvious catch for scientists is the same as any loyalty card: in order to use it you’ve got to keep shopping at the same company. It’s great psychology, because humans are notoriously reluctant to ignore the opportunity for a discount, so:

  • Someone who’s got credit owing will be less likely to publish in some other journal even if the final cost-to-author is equal and even if that other journal is a better fit for the particular article. (How much less likely I don’t know, but I do think it’d be a factor.)
  • Someone who’s got credit owing for OA publication would probably be more likely to pay the extra to publish OA rather than to publish toll-access for free but not get to use that tempting credit. (This might at least have a small side-effect of getting more people experience with the benefits of publishing open access.)

Both of these are obviously what the companies in question are banking on. I’m a bit concerned about what this pressure to publish with the same old big companies will mean for science – partly about competition, as in the world of supermarkets, but also partly the journals where articles should be finding their best fit. (Perhaps the whole ‘impact factor’ issue has meant that no-one’s ever considered only subject scope in that regard, but this definitely adds another confounding factor.) But given the clear financial benefits to the companies, I expect to be seeing more scholarly publishing reward cards popping up in future.

Open Access cookies

Creative Commons Aotearoa New Zealand are running a series of blogposts for Open Access Week, and I’ve contributed Levelling up to open research data.

I also, for Reasons, had an urge tonight to make Open Access biscuits. (I know my title says ‘cookies’, but the real word is of course ‘biscuits’, and I shall use it throughout the rest of this post along with real measurements and real temperatures. Google can convert for you, should you need it to.) The following instructions I hereby license as Creative Commons Zero, which should not be taken as a reflection on their calorie count.

First I started with a standard biscuit base recipe. You could use your own. I used the base for my family’s recipe for chocolate chip biscuits, which probably means it ultimately derives from Alison Holst, but I think I’ve modified it sufficiently that it’s okay to include here:

  1. Cream 125 grams of butter and 125 grams of sugar. The longer you beat it, the lighter and crisper the biscuits will be.
  2. Beat in 2 tablespoons sweetened condensed milk (or just milk will do, at a pinch) and 1 teaspoon vanilla essence.
  3. Sift in 1.5 cups of flour and 1 teaspoon of baking powder and mix to a dough.

Now we diverge from the chocolate chip recipe by not adding 90 grams of chocolate chips. We also divide the mixture in half, dyeing one half orange by using a few drops of red colouring and three times as many drops of yellow colouring:

Open Access biscuits step 1

The plain lot should then be divided into halves, each half rolled long and flat.
The orange lot should have just a small portion taken off and rolled into a fat spaghetto (a bit thinner than I did would be ideal), and the rest rolled into a large rectangle.

Then start rolling it together into our shape. The orange spaghetto gets rolled up into one of the plain rectangles. In this photo I’m doing two steps at once – most of the orange hasn’t been properly rolled out yet:

Open Access biscuits step 2

Then roll the rest of the orange around that with enough hanging off the top that you can fit some more plain stuff in to keep the lock open:

Open Access biscuits step 3

The ends will be raggedy. Don’t worry, this is all part of the plan.

At this point, put your roll of dough into the fridge to firm up a bit while you do the dishes. You could also consider feeding the cat, cooking dinner, etc. Or you can skip this step (or shorten it as I did) and it won’t hurt the biscuits, you’ll just have to do more shaping with your fingers because cutting the slices squashes them into rectangles:

Open Access biscuits step 4

These slices are about half a centimetre thick. I got about 38 off this roll, plus the raggedy ends. Remember I said those were part of the plan? Right, now – listen carefully, because this is very important – what you need to do is dispose of all the raggedy ends that won’t make pretty biscuits by eating the raw dough. I know, I know, but somebody’s got to do it.

The rest of the biscuits you put on a tray in the oven on a slightly low setting, say 150 Celsius, while you do the dishes that you missed last time because they were under things, and generally tidy up. 10 minutes or so, but whatever you do don’t go and start reading blogs because once these start to burn they burn quickly. Take them out when the ones in the hottest part of the oven are just starting to brown, and turn out onto a cooling rack.

Et voilà, open access biscuits:

Open Access biscuits step 5

Open access and peer review

We’re likely to be hearing about John Bohannon’s new article in Science, “Who’s afraid of peer review?” Essentially the author created 304 fake papers with bad science and submitted one to each of 304 ‘author-pays’ open access journals to test their peer review. 157 of the journals accepted it, 98 rejected it; the remaining journals were abandoned websites or still have/had the paper under review at time of analysis. (Some details are interesting. PLOS ONE provided some of the most rigorous peer review and rejected it; OA titles from Sage and Elsevier and some scholarly societies accepted it.)

Sounds pretty damning, except…

Peter Suber and Martin Eve each write a takedown of the study, both well worth reading. They list many problems with the methodology and conclusions. (For example, over two-thirds of open access journals listed on DOAJ aren’t “author-pays” so it’s odd to exclude them.)

But the key flaw is even more obvious than the flaws in the fake articles: his experiment was done without any kind of control. He only submitted to open access journals, not to traditionally-published journals, so we don’t know whether their peer review would have performed any better. As Mike Taylor and Michael Eisen point out, this isn’t the first paper with egregiously bad science that’s slipped through Science‘s peer review process either.