The British Library has uploaded one million public domain scans from 17th- to 19th-century books to Flickr! They're embarking on an ambitious programme to crowdsource novel uses and navigation tools for the huge corpus. Already, the manifest of image descriptions is available through GitHub. This is a remarkable, public-spirited archival project, and the British Library is to be loudly applauded for it!
We plan to launch a crowdsourcing application at the beginning of next year, to help describe what the images portray. Our intention is to use this data to train automated classifiers that will run against the whole of the content. The data from this will be as openly licensed as is sensible (given the nature of crowdsourcing) and the code, as always, will be under an open licence.
The manifests of images, with descriptions of the works that they were taken from, are available on GitHub and are also released under a public-domain 'licence'. Putting this set of metadata on GitHub should indicate that we fully intend people to work with it, to adapt it, and to push back improvements that should help others work with this release.
There are very few datasets of this nature free for any use, and by putting it online we hope to stimulate and support research concerning printed illustrations, maps and other material not currently studied. The images are derived from just 65,000 volumes, while the library holds many millions of items.
If you need help or would like to collaborate with us, please contact us by email or on Twitter (or me personally, on any technical aspects).
A million first steps