Tag Archives: citation styles

Launching Ref2RIS – convert your typed bibliography to Endnote format

Several months ago I blogged about Converting a plaintext bibliography to RIS format for Endnote. It’s not as painful a process as typing up hundreds/thousands of records, but it’s still painful.

This last week, I had to repeat the process. Eight painful work-hours later, I heard a colleague had something similar to do. And I thought there must be a way to automate it so one doesn’t have to do the endless typing every time.

Then I was home sick and got bored and ended up making a basic APA converter. Then (still sick and still bored) I got all fancy-schmancy and named and documented it and everything.

Thus, Ref2RIS.

Notes:

  • If you really can’t get access to MacOS or Linux to do this on and can’t get sed on Windows, email me – I’ll do it for a dollar or a good cause.
  • If you need a style other than APA, also email me. Whether/how quickly I get around to it will depend on a complex formula of many factors, but I think it’ll be quicker to make than the first one was, and right at the moment my motivation is high.
  • If you use it successfully, please let me know, spread the word, and/or if you’re really enthusiastic there’s a tipjar on the site.

Converting a plaintext bibliography to Endnote/RIS format with help from Linux/Terminal

[Update 16/7/2011: See my more recent post on the topic, Launching Ref2RIS – convert your typed bibliography to Endnote format, which makes things even easier.]

You won’t want to do this unless you’ve got literally hundreds of references. Any fewer, and these suggestions are way easier.

1. Format references so they’re each on their own line – no blank lines.

2. Use Word’s “Find Special” capabilities to replace a phrase in italics with {it}a phrase in italics{endit} and a phrase in bold with {b}a phrase in bold{endb}.  (Similarly if the citations contain underlines.)

3. Save as plaintext – say, source.txt.  Now the fun begins…  My own source text contains 600-odd lines in ACS style, like this:

Bamford, C. H.; Tipper, C. F. H. {it}Comprehensive Chemical Kinetics{endit}; Elsevier: New York, {b}1977{endb}. 
House, D. A.{it}Chem. Rev.{endit} {b}1962{endb}, {it}62{endit}, 185

4. Open up Terminal or some other Linux command line.

5. Endnote records are separated by a line

ER  - 

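– that’s two spaces before the hyphen and one after.  (All these details come from Endnote’s help pages.) This is the easy part: type in the following two lines exactly as shown (the backslash at the end of the first line is what puts ER on a line of its own)

sed -e 's/$/\
ER  - /' source.txt > source1.txt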

6. The start of each Endnote record tells you what kind of citation it is – eg a book, journal etc.  To mark every line that includes a colon (ie the colon separating the publisher from the city of publication) as a book, type in

sed -e 's/^\(.*:\)/TY  - BOOK@@\1/' source1.txt > source2.txt

Note 1: The “@@” is in there as a sign that you’ll need to replace this with a new line later; but we want to keep everything on one line for now.
Note 2: This is a good example of why this whole method is highly suspect, because it’ll also catch citations which have a colon in the article title or in a typo or whatever.  So if you can think of a better sign that a citation is a book then use that instead of the colon.

Alternatively, you could type in

sed -e 's/^\(.*{it}[0-9]*{endit}\)/TY  - JOUR@@\1/' source1.txt > source2.txt

to find every line that contains {it}[some number]{endit} which, in my source, is the best indicator that I’m dealing with a journal.  The same caveats apply – you’ll get both false positives and false negatives.

Anyway, keep doing what seems best given your source, and fix up the inevitable mistakes by hand until each citation line starts with TY  - something.  If you want to give up and just assume that everything that isn’t already assigned as something must be a journal then try

sed -e '/^TY  - /b' -e '/^ER  - /b' -e 's/^/TY  - JOUR@@/' source2.txt > source3.txt

(The two b commands skip any line that already starts with TY  - or ER  - ; every other line gets the JOUR prefix.)

I now have source looking like:

TY  - BOOK@@Bamford, C. H.; Tipper, C. F. H. {it}Comprehensive Chemical Kinetics{endit}; Elsevier: New York, {b}1977{endb}. 
ER  -
TY  - JOUR@@House, D. A.{it}Chem. Rev.{endit} {b}1962{endb}, {it}62{endit}, 185
ER  -
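As a rough sketch, steps 5 and 6 can also be run in a single pass. This assumes a POSIX sed and uses the two sample lines from step 3 as input; the /^TY  - /! guard is my own addition (not part of the commands above) so that a line is never typed twice.

```shell
# The two sample citations from step 3, fed through a pipe instead of source.txt.
book='Bamford, C. H.; Tipper, C. F. H. {it}Comprehensive Chemical Kinetics{endit}; Elsevier: New York, {b}1977{endb}.'
journal='House, D. A.{it}Chem. Rev.{endit} {b}1962{endb}, {it}62{endit}, 185'

# 1st expression: a colon marks the line as a book.
# 2nd expression: {it}digits{endit} marks a journal, unless the line is already typed.
# 3rd expression: append the ER record separator on a line of its own.
out=$(printf '%s\n' "$book" "$journal" | sed \
  -e '/^TY  - /!s/^\(.*:\)/TY  - BOOK@@\1/' \
  -e '/^TY  - /!s/^\(.*{it}[0-9]*{endit}\)/TY  - JOUR@@\1/' \
  -e 's/$/\
ER  - /')

printf '%s\n' "$out"
```

The same caveats apply as before: a colon in an article title will still be mistaken for a book.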

7. Now we keep playing with patterns.  (You may be able to do large chunks of this with regular find/replace, but for illustrative purposes I’ll keep using Terminal.)

For example, in my source the authors are nicely set off: they come after “@@” and before the first “{it}” (or “in {it}”), and if there’s more than one of them they’re separated by “;”.  So a few commands:

sed -e 's/@@\(.* in {it}\)/@@A1  - \1/' source3.txt > source4.txt
sed -e '/@@A1  - /!s/@@\(.* {it}\)/@@A1  - \1/' source4.txt > source5.txt
(The /@@A1  - /! at the front of the second command skips lines the first command has already tagged.)
sed -e 's/;\(.*;\)/@@A1  - \1/' source5.txt > source6.txt
(This one I had to repeat a few times, through to source8.txt, depending how many authors could be cited in one reference; there's supposed to be a way to do it globally but my unix fu is not strong.)
sed -e 's/;\(.*{it}\)/@@A1  - \1/' source8.txt > source9.txt
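One way to do the repeated author step in a single command is a sed loop: :a sets a label, and ta jumps back to it whenever the substitution changed something. A minimal sketch, assuming a POSIX sed and a made-up four-author line (the names are hypothetical):

```shell
# Hypothetical line as it might look once the first A1 tag has been added.
line='TY  - JOUR@@A1  - Smith, A.; Jones, B.; Brown, C.; Lee, D. {it}J. Chem.{endit}'

# :a marks the top of the loop; ta branches back to it after a successful
# substitution, so the command repeats until no ";...;" pair is left.
out=$(printf '%s\n' "$line" | sed -e ':a' -e 's/;\(.*;\)/@@A1  - \1/' -e 'ta')

printf '%s\n' "$out"
```

The last semicolon (the one before {it}) is deliberately left alone; the final command above deals with it.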

Journal titles:

sed -e 's/^\(TY  - JOUR.*\)\({it}.*{endit} {b}\)/\1@@JO  - \2/' source9.txt > source10.txt

Years:

sed -e 's/\({b}[0-9]*{endb}\)/@@Y1  - \1/' source10.txt > source11.txt
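The same approach extends to other fields. As a sketch only: VL and SP are the RIS tags for volume and start page (check them against Endnote’s help pages before relying on them), and the input below is roughly what the House citation looks like after the commands above. The pattern assumes the ACS layout ", {it}volume{endit}, page" and will need adjusting for other styles.

```shell
# The sample journal citation after the author, journal title and year steps.
line='TY  - JOUR@@A1  - House, D. A.@@JO  - {it}Chem. Rev.{endit} @@Y1  - {b}1962{endb}, {it}62{endit}, 185'

# Capture the italic volume number and the page number that follows it,
# and rewrite them as VL and SP fields.
out=$(printf '%s\n' "$line" | sed -e 's/, {it}\([0-9][0-9]*\){endit}, \([0-9][0-9]*\)/@@VL  - \1@@SP  - \2/')

printf '%s\n' "$out"
```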

And so forth.  You pretty soon start to see why the first suggestion on most lists of ways to convert plaintext citations into RIS format is always “Just type it in / search for it again by hand”.  The method above is really only suitable if you’ve got literally hundreds of citations. (I have 639, plus or minus.)

8. Eventually you’ll be at a point where you can do a simple find/replace to change @@ to a new line and nuke all the {it} and so forth.  This will be a great relief.
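If you’d rather stay in Terminal for this last step too, something like the following should do it. It’s a sketch assuming a POSIX sed and a hypothetical fully tagged line; the " *@@" expression is my own extra to trim spaces left behind in front of a field break.

```shell
# Hypothetical line in which every field has already been tagged.
line='TY  - JOUR@@A1  - House, D. A.@@JO  - {it}Chem. Rev.{endit} @@Y1  - {b}1962{endb}'

# Strip the formatting markers, trim spaces in front of @@,
# then turn each @@ into a real line break (backslash + newline).
out=$(printf '%s\n' "$line" | sed \
  -e 's/{it}//g' -e 's/{endit}//g' \
  -e 's/{b}//g' -e 's/{endb}//g' \
  -e 's/ *@@/@@/g' \
  -e 's/@@/\
/g')

printf '%s\n' "$out"
```

For a whole file, replace the printf pipe with the file name and redirect the output to the next source file, as in the earlier steps.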

9. Rename your final saved file from source12.txt to source12.ris and open with Endnote.

10. Bonus material:  if this was a bibliography to a paper using numbered citations in order using eg [1], then in that paper you can do a find/replace on [ -> { and ] -> }, then tell the Endnote plugin to format citations, and voila, the best magic ever.  (If the paper uses author/date citations then you’ll have to link them by hand, sorry.)

Links of Interest 4/11/09

Resources
New Zealand Electronic Text Centre has posted a list of online texts for current courses at VUW.

The Dept of Internal Affairs has launched Government datasets online, a directory of publicly-available NZ government datasets (especially but not exclusively machine-readable datasets).

Complementary Twitter accounts:

  • APStylebook (Sample: Election voting: Use figures for totals and separate the large totals with “to” instead of hyphen.)
  • FakeAPStylebook (Sample: To describe more than one octopus, use sixteentopus, twentyfourtopus, thirtytwotopus, and so on.)

Information Literacy
There was a lot of interest at and after LIANZA09 about the Cephalonia Method of library instruction (basically, handing out pre-written questions on cards to students to ask at appropriate times during the tutorial). A recent blogpost by a librarian worn out from too many tutorials wonders “what if the entire class session consisted of me asking students questions? What if I asked them to demonstrate searching the library catalog and databases?”

Scandal du jour
A document by Stephen Abram (SirsiDynix) on open source library management systems (pdf, 424KB) appeared on WikiLeaks. The biblioblogosphere saw this as evidence of SirsiDynix secretly spreading FUD (fear, uncertainty and doubt) against their open-source competition. Stephen Abram replied on his blog that it was never a secret paper and he’s not against open source software but it’s not ready for most libraries. Much discussion followed in his blog comments and on blogs elsewhere; Library Journal has also picked up the story.

For fun
Also at Library Journal, The Card Catalog Makes a Graceful Departure at the University of South Carolina – rather than just dumping it the library is hosting events such as a Catalog Card Boat Race and What Can You Make With Catalog Cards?

Things Librarians Fancy.

Deborah

Links of interest 25/9/09

News
LibLime, an organisation which sells support to the New Zealand-developed open-source library system Koha, has recently announced changes to their practices that are technically legal but many feel don’t abide by the spirit of the open-source license. Library Journal has a basic summary of events with links to key discussions.

A librarian gets a marriage proposal on Ask a Librarian.

Customer service
Being at the point of need discusses placing screencasts, chat widgets, and other tutorials in the catalogue, subject guides, and databases.

Chalk notes as a valid communication format is a library manager’s blogpost about her response to chalk-on-pavement comments about the library. Her follow-up on chalk notes addresses the issue of communication within the library about public responses like this.

Tracking ILL Requests is a “wouldn’t it be neat if” post about providing more information on ILL requests to users.

Resources
The APA has an APA Style Blog with all sorts of handy tips.

10 free Google Custom Search Engines for librarians

5 sites with free video lectures from top colleges

Links of interest 9/7/09

An essay on the serials review process.

The Global Legal Monitor, published by the Law Library of Congress in Washington, offers an RSS feed for updates for all news stories as well as RSS feeds broken down by topic and/or jurisdiction.

Make it Digital by DigitalNZ has guides, voting for what NZ resources should be digitised (the AJHRs are currently in the lead) and a place to ask and answer questions about digitisation.

Marketing

Added web functionality

Links of interest 5/5/09

“Links of interest” is an irregular series of posts I started making recently to MPOW’s internal blog, based on items culled from FriendFeed, Twitter, and Google Reader. I started thinking it was a shame not to have it available publicly, so here it is. NB Dates on future posts will be in dd/mm/yy format….

Lessons from the library booth at a local festival: or how not to engage customers

A blog post on New Citation Rules in the 7th Edition of the MLA Handbook.

Merck makes phony peer-review journal to promote a drug, published by Elsevier.

Google Maps adds historical maps of Japan which turn out to accidentally facilitate discrimination.

UCOL tweets that: “UCOL Library now has over 20 wireless laptops students can use anywhere on campus. You can borrow a laptop for up to 3 hours.”

National Library explains Twitter – they compare it to Personal Items columns in early 20th century newspapers, describe the feedback and interaction they’ve had for their account, and talk about how they do it.