Massive public domain catalog dump from Harvard

David Weinberger writes, "Harvard University has today put into the public domain (CC0) full bibliographic information about virtually all the 12M works in its 73 libraries. This is (I believe) the largest and most comprehensive such contribution. The metadata, in the standard MARC21 format, is available for bulk download from Harvard. The University also provided the data to the Digital Public Library of America’s prototype platform for programmatic access via an API. The aim is to make rich data about this cultural heritage openly available to the Web ecosystem so that developers can innovate, and so that other sites can draw upon it. This is part of Harvard’s new Open Metadata policy which is VERY COOL."


  1. I’m taking one of their online courses through Coursera. It’s free. Something like Coursera was posted here on Boing Boing, but I can’t find the link. Loving the class.

  2. WorldCat already passed the one-hundred-million-record mark back in March 2007 (with the book It’s A Horse’s Life), and I’d be surprised if most of Harvard’s catalogers haven’t been adding all new records to OCLC for a really long time. So the total contribution to that database will probably amount to little more than a drop in the bucket.

    But it is extremely cool of Harvard to make this information freely available, especially since not everyone has access to WorldCat.

  3. The killer part of this announcement is that the Harvard metadata is public domain, whereas WorldCat, while freely available to participating libraries, remains proprietary.
