Whistling Speech

I really love the research that they're doing over at Yale's Haskins Laboratories. Instead of studying speech perception and production in terms of faithfully replicating alllll of the sounds we make with our mouths (like the minute clicks, pops, and hisses of consonants), the team is proposing that all we need to understand speech is to track and re-create a few select resonances of the vocal tract. I like to think of speech production in this context as a series of bottles with varying levels of water in them: the mouth is one bottle whose resonant pitch changes as you open or close it, the nasal cavity is another, and so on throughout the vocal tract. It ends up sounding like a bunch of complicated melodies that combine into a complex microtonal harmony. In other words, we're all better at perceiving and making music than we think we are!

The examples below break it down into isolated sine-wave patterns that you can combine yourself to build a sentence. What do you think? How easily can you hear words emerge?


Tone combinations

Play Tone 1 alone | Play Tone 2 alone | Play Tone 3 alone | Play Tones 1 and 2 together

Play Tones 1 and 3 together | Play Tones 2 and 3 together
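
If you'd like to tinker with the idea offline, here's a rough Python sketch of what "combining tones" means: each tone is a single sine wave whose pitch glides over time, and a combination is just the tones added together. To be clear, this is my own illustration and not the Haskins team's method. The function names (synth_tone, mix, write_wav, demo) and the frequency values are made up for the example and are not measured from the recordings above.

```python
import math
import struct
import wave

SAMPLE_RATE = 16000  # samples per second; an arbitrary choice for this sketch


def synth_tone(freq_track, amp=0.3, duration=1.0, rate=SAMPLE_RATE):
    """Render one sine tone whose frequency glides through freq_track (in Hz).

    freq_track is a list of breakpoints spread evenly over the duration;
    the frequency is linearly interpolated between neighbouring breakpoints.
    """
    n = int(duration * rate)
    samples = []
    phase = 0.0
    for i in range(n):
        # Position of this sample along the breakpoint list.
        pos = i / (n - 1) * (len(freq_track) - 1) if n > 1 else 0.0
        lo = int(pos)
        hi = min(lo + 1, len(freq_track) - 1)
        frac = pos - lo
        freq = freq_track[lo] * (1 - frac) + freq_track[hi] * frac
        # Accumulate phase so the glide stays smooth (no clicks).
        phase += 2 * math.pi * freq / rate
        samples.append(amp * math.sin(phase))
    return samples


def mix(*tones):
    """Combine tones by adding them sample by sample."""
    return [sum(values) for values in zip(*tones)]


def write_wav(path, samples, rate=SAMPLE_RATE):
    """Save samples (floats in -1..1) as a 16-bit mono WAV file."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    frames = b"".join(struct.pack("<h", int(s * 32767)) for s in clipped)
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(rate)
        f.writeframes(frames)


def demo(path="tones_1_and_2.wav"):
    """Write a mix of two tones; the frequency tracks are hypothetical."""
    tone1 = synth_tone([500, 700, 400])     # hypothetical low resonance
    tone2 = synth_tone([1500, 1200, 1800])  # hypothetical middle resonance
    write_wav(path, mix(tone1, tone2))
```

Calling demo() writes a one-second file you can open in any audio player. Swapping in a third track, or leaving one out, mirrors the "alone" and "together" buttons above, though you shouldn't expect made-up tracks like these to sound like words.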

If you like this, you can go here for more interactive demonstrations, or check out this great sine-wave-synthesized Robert Frost poem.

Thanks to Robert E. Remez, as well as Phillip Rubin and Jennifer Pardo at Haskins Labs, for allowing me to embed their work here.

Coming up, I'll be writing about a cool ethnographic example of a language that actually uses something like this in practice!