Fun with sine-wave speech

Here are some voice recordings that have been tweaked almost beyond recognition, placed side-by-side with the unaltered recordings. The tweaked recordings are almost impossible to make out (though if you listen to them over and over, you can figure out what they are saying) but after you listen to the unaltered recordings, the tweaked recordings are very easy to understand.
Most naive listeners hear this as a set of simultaneous whistles, or science fiction sounds. However, for listeners that have previously heard the clear version of this sound, listening to the sine-wave speech sound again produces a very different percept of a fully intelligible spoken sentence. This dramatic change in perception is an example of "perceptual insight" or pop-out. We have argued that this form of pop-out is an example of a top-down perceptual process produced by higher-level knowledge and expectations concerning sounds that can potentially be heard as speech.
Here's one example: Sine Wave Speech | Clear Speech

It's almost, but not quite, as weird as the McCollough Effect optical illusion that lasts 24 hours or longer.

More here: Clear Speech

(Via Mind Hacks)
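
For anyone curious how a "set of simultaneous whistles" can be put together at all, here is a rough Python sketch. It only sums three gliding sine tones; the frequencies, amplitudes, and the names `glide` and `sum_of_whistles` are made up for illustration and are not taken from the recordings linked above. Real sine-wave speech, as generally described, gets its frequency tracks from the formants of an actual recording, which this sketch does not attempt.

```python
import math


def glide(start_hz, end_hz, n):
    """Return n frequencies moving linearly from start_hz to end_hz.

    A stand-in for a measured formant track (hypothetical values only).
    """
    if n == 1:
        return [float(start_hz)]
    return [start_hz + (end_hz - start_hz) * i / (n - 1) for i in range(n)]


def sum_of_whistles(freq_tracks, amplitudes, sample_rate=8000):
    """Add up one sine tone per frequency track and scale the peak to 1.0.

    Phase is accumulated sample by sample so that a changing frequency
    produces a smooth glide instead of clicks.
    """
    n = len(freq_tracks[0])
    out = [0.0] * n
    for track, amp in zip(freq_tracks, amplitudes):
        phase = 0.0
        for i, hz in enumerate(track):
            out[i] += amp * math.sin(phase)
            phase += 2.0 * math.pi * hz / sample_rate
    peak = max(abs(s) for s in out) or 1.0  # avoid dividing by zero on silence
    return [s / peak for s in out]


# Half a second at 8 kHz: three tones gliding in different directions.
n_samples = 4000
tracks = [
    glide(300, 700, n_samples),    # low tone rising
    glide(2200, 1200, n_samples),  # middle tone falling
    glide(2500, 2800, n_samples),  # high tone rising slightly
]
samples = sum_of_whistles(tracks, [1.0, 0.6, 0.3])
```

The resulting `samples` list holds floats between -1.0 and 1.0; to listen to it you would still need to scale the values to 16-bit integers and write them out, for example with Python's standard `wave` module.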


  1. Oops. I meant to say “alleged” Satanic lyrics.

    (BTW, if you can’t be bothered to watch Dr. Shermer’s full 13-minute presentation, the demo in question begins at the 9-minute mark of the video.)

  2. Remember when you didn’t know that the synthesizer in the Beastie Boys song was saying “intergalactic planetary, planetary intergalactic”? It sounded like nothing. Now you can’t not hear those words.

  3. I must be a dumb American because I could barely understand the first part of the “clear” version the first time I listened and I was able to understand the second half of both of them.

  4. It’s true I couldn’t understand the sine wave speech the first time but maybe that’s just because I was listening for the consonants instead of just the vowels.

  5. Sine-wave version sounded identical before and after hearing the clear version.

    Does that mean I don’t have top-down perceptual processes produced by higher-level knowledge and expectations? Figures.

  6. I heard “we always get spaghetti ~soothasoo~”. I had no idea what might fit for soothasoo, though.

    Kinda fun to try and sound it out on your own. I heard spaghetti first and that sort of dictated what I got from the rest of it.

  7. I could tell from the cadence and lilt in the sine wave version that it was derived from speech. Then after hearing the original, my remembrance of the sentence rode those waves. The z remains particularly vivid.

    The site with the other samples seems to be down.

  8. The McCollough effect was really impressive. I went back ten minutes later and it was still working for me. I’ll take a look in the morning. I was really fascinated by the idea that it may be a gauge of extroversion. Wacky!
    However, the sine wave speech…not so much. Even after listening to the regular speech samples multiple times, the sine wave samples only seemed to approximate the sound patterns – and then poorly for some of them. I never got close to actually understanding them as if they were speech – and I tried for a pretty long time. I totally accept that it works differently for different people, like these effects tend to, but yeah, I’m in the dark. Also wacky!
    I don’t remember what it’s called, which is bothering me – but there’s a really awesome effect that occurs when you watch video of someone speaking one sound synced to audio of them speaking another. You – and this one did work for me – end up very powerfully perceiving a kind of linguistic average of the two sounds. Anybody remember what this one’s called?

  9. I heard it as ‘the owl’s captain (or maybe chaplain) are going to the zoo’ the first time, and I still can if I sort of insist on it to myself.

  10. On the visual (McCollough Effect) thingy, I saw a green haze around the horizontal lines before I even ~looked~ at the coloured version.

  11. That is a very, very vivid distinction. I’m really surprised how clear it was after hearing the unaltered speech.

  12. What’s throwing most American listeners on the clear version is the unexpected British accent, I’d suspect.

    I was a dialect dork in college, so I’m confident I heard it correctly… “The camel was kept in the cage at the zoo.”

  13. I still found it unintelligible after the clear speech. Still, I’m not sure how this is cool exactly. It’s just priming. Psychologists use it all the time. You can alter someone’s perceptions to the point of statistical significance just by making them do a crossword for unpleasant, pleasant or (whatever) words.

  14. Hmmm, I didn’t see the McCollough effect at all.
    I’m an introvert, for what it’s worth.

    I did download the Praat software, looks interesting.

  15. Nice, I love all forms of illusion etc. I remember this one where there are many notes in a progression that appear to be constantly rising, but in fact they are the same 5-8 notes in order. Also the optical illusion was nice. There is another great effect where you view a color negative image and then it turns to black and white. As long as you keep your eyes still it stays color, but if you move you can only see the black and white. Or of course the waterfall effect where you stare at a waterfall or moving object for some time and then whatever you look at next appears to be moving. Again nice one. And I did hear some words the first time but they all jumped out at me after hearing the clear one.

  16. This is mindblowingly cool. I tried listening to the samples linked above (the second pair of a series of five pairs on the site itself). First the sine wave one – total lack of meaning. I listened to it twice – no words I could detect.
    Next the clear one. Okay, now the scrambled one. No surprise – now I understand it. But now for the test. From the site I downloaded the other four samples, sine versions. Listening to them I could understand all but a few words. After listening again, still just a few words I can’t get. All the rest, almost all the meaning, I could understand and write down to check against the clear versions. So, admittedly, not a double-blind, randomized rock solid experiment, but I stand behind the adjective mindblowing.

  17. Neat. It’s amazing how the brain, even one as faulty as mine, can rebuild speech from incomplete information. But it’s not the first time this has cropped up: try The Clangers.

    This was a 1970s children’s animation, produced at almost no cost at all, by Oliver Postgate and Peter Firmin in a shed. It went out just before the 6PM news on BBC 1, and all the Clangers spoke in the tooty tones of the Swanee Whistle. The odd thing was that Postgate and Firmin actually had a script to work from while they were making the toots, and you could still tell what the Clangers were saying.

    And they pre-dated our ecological way of thinking by more than 30 years.

    Everything these men did was completely knockout, especially Bagpuss.

  18. mzng, th pwr xpcttn hs vr prcptn. Mght vn cs n t prcv th n-wrd t b spkn drng rlly n s lrdy bsd gnst.

  19. I too heard “..we’re going to the zoo”. With the sound looping, I was very confident that was it, could make out every syllable crisply. But couldn’t make out the first – something like “now and then”.

    That made a difference when I heard the clear version – I could hear it in the sined version, but could also still hear, more clearly, the version I had thought I had heard.

  20. A better analogy is perhaps the “hidden cow” illusion found e.g. at (not a great example, but the good ones seem to be locked up in academic journals). For any given instance of this illusion, it initially looks like a collection of blobs, but once you notice the animal (or have it pointed out to you), you can put the image aside for days, come back to it, and you’ll see the animal instantly.

  21. If you listen to too much sine-wave speech your brain will start to interpret random noises such as rustling paper or splashing water as snatches of speech, if the sounds happen to be similar enough to words in the way the sine-wave speech is.

  22. Fascinating. Even after several listenings, I couldn’t work out the sine wave example that Mark Frauenfelder gives in this post. My best guess was “but now we’re actually going to the zoo”. But on trying a second example from Matt Davis’s collection I got it after two listenings, and the last three examples I got first time. So as Davis says, it’s not just a case of being prompted by the clear speech: it’s more a case of tuning in of a perceptual system.

  23. I recall reading about the McCollough effect about 3 years ago. I played with it for one day, then forgot about it.

    Today I looked at the bars again, and get this…

    They were still green hazed.

    I’m thoroughly concerned about permanent brain changes now!

  24. I took the files to work today, and although one of my coworkers had the same experience I had (first, no comprehension, then the nearly complete comprehension of all samples after hearing the first sample “clear”), my boss was able to make out the words of each sample right away, at least as well as I could (neither of us could make out “camel”, for instance).

  25. Someone showed me the McCollough Effect at Brown University back a few decades ago, and I think it’s still showing up occasionally — that is, unless *everybody* sees pink or green hazes around certain kinds of closely-packed diagonal lines.

    I’m not messing with that one anymore.