The British Library has uploaded one million public domain scans from 17th- to 19th-century books to Flickr! They're embarking on an ambitious programme to crowdsource novel uses and navigation tools for the huge corpus. Already, the manifest of image descriptions is available through GitHub. This is a remarkable, public-spirited archival project, and the British Library is to be loudly applauded for it!
We plan to launch a crowdsourcing application at the beginning of next year, to help describe what the images portray. Our intention is to use this data to train automated classifiers that will run against the whole of the content. The data from this will be as openly licensed as is sensible (given the nature of crowdsourcing) and the code, as always, will be under an open licence.
The manifests of images, with descriptions of the works that they were taken from, are available on GitHub and are also released under a public-domain 'licence'. Putting this metadata on GitHub should indicate that we fully intend people to work with it, to adapt it, and to push back improvements that will help others work with this release.
There are very few datasets of this nature free for any use, and by putting it online we hope to stimulate and support research concerning printed illustrations, maps and other material not currently studied. Bear in mind that the images are derived from just 65,000 volumes, while the library holds many millions of items.
If you need help or would like to collaborate with us, please contact us by email or on Twitter (or me personally, on any technical aspects).
A million first steps