Open archive of 240,000 hours' worth of talk radio, including 2.8 billion words of machine-transcription

A group of MIT Media Lab researchers have published Radiotalk, a massive corpus of talk radio audio with machine-generated transcriptions, with a total of 240,000 hours' worth of speech, marked up with machine-readable metadata. Read the rest