The system, Google’s second official generation of the technology, consists of two deep neural networks. The first network translates text into a spectrogram (pdf), a visual representation of audio frequencies over time. That spectrogram is then fed into WaveNet, a system from Alphabet’s AI research lab DeepMind, which reads the chart and generates the corresponding audio.
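To make the idea of a spectrogram concrete, here is a minimal Python sketch that turns a short test tone into a grid of frequency magnitudes over time. It is an illustration only and not Google's code: the function name `spectrogram`, the frame size, the hop length, and the 8 kHz sample rate are all assumptions chosen for the example, and it computes a plain magnitude spectrogram with a naive Fourier transform rather than whatever representation Tacotron 2 uses internally.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of time frames; each frame lists the magnitude
    of frequency bins 0..frame_size // 2 (hypothetical helper)."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        # Taper the frame with a Hann window to reduce spectral leakage.
        windowed = [
            samples[start + n]
            * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n in range(frame_size)
        ]
        # Naive discrete Fourier transform, kept simple for readability.
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                windowed[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Assumed test signal: a 1000 Hz sine wave sampled at 8000 Hz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(256)]

spec = spectrogram(tone)
# Find the strongest frequency bin in the first time frame.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / 64
print(len(spec), "frames,", len(spec[0]), "bins, peak near", peak_hz, "Hz")
```

Each row of `spec` is one slice of time and each column is one frequency band, which is the "chart" a second network such as WaveNet would read in order to produce audio.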
Tacotron 2 or Human?
In each of the following examples, one clip is generated by Tacotron 2 and one is a recording of a human. Which is which?
“That girl did a video about Star Wars lipstick.”
“She earned a doctorate in sociology at Columbia University.”
“George Washington was the first President of the United States.”