Google is restructuring to put machine learning at the core of all it does

Steven Levy is in characteristically excellent form in a long piece on Medium about the internal vogue for machine learning at Google. Drawing on the contacts he made while writing In the Plex, his must-read 2011 biography of the company, Levy paints a picture of a company that's being utterly remade around newly ascendant machine learning techniques.

Machine learning had humble beginnings at the company as a class given by and for engineers, which quickly captivated key technical staff around the world, blossoming into something like a full-fledged internal MOOC. Fast-forward to today: Google has put its head of machine learning in charge of Search, the company's flagship product. Machine learning is now "involved in every query" and affects the rankings "not in every query but in a lot of queries," making it the third-most important "signal" in how Google ranks its results.

The company even produces its own machine learning-optimized chip, the Tensor Processing Unit, which takes the place of the graphics cards that have been pressed into service across the industry for all kinds of parallel computation (from Bitcoin mining to AI), thanks to the thousands of small, independent processors incorporated into their designs.
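To make the hardware point concrete, here's a minimal numpy sketch of the workload in question (my illustration, nothing from Levy's piece): a big matrix multiply breaks into millions of independent multiply-accumulates, exactly the kind of arithmetic that thousands of small cores can chew through simultaneously.

    import numpy as np
    import time

    # Illustrative only: neural-network workloads are dominated by big
    # matrix multiplies like this one. Each of the roughly 2 * n**3
    # floating-point operations is independent of the rest, which is why
    # chips with thousands of small cores can run them all at once.
    n = 2048
    a = np.random.rand(n, n).astype(np.float32)
    b = np.random.rand(n, n).astype(np.float32)

    start = time.perf_counter()
    c = a @ b
    elapsed = time.perf_counter() - start

    print(f"{2 * n**3 / elapsed / 1e9:.1f} GFLOP/s on this machine")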

Though only a small proportion of the company's engineers specialize in machine learning, the rest of its engineering teams are being aggressively cycled through a machine learning boot camp, the "Machine Learning Ninja Program," which is intended to transfer a machine learning mindset and skills across the business.

Programming for machine learning isn't like traditional programming. While traditional programming can appeal to people who want "total control" over their code, machine learning systems produce probabilistic outcomes that require "a grasp of certain kinds of math and statistics, which many coders, even gonzo hackers who can zip off tight programs of brobdingnagian length, never bothered to learn."
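Here's a toy of my own devising to show the contrast, not an example from the article: where traditional code computes a definite answer, a trained model emits a probability, and deciding whether that probability is trustworthy is a statistics exercise run on data the model has never seen.

    import numpy as np

    # Toy contrast, invented for illustration: the model's output is a
    # probability, not a deterministic answer, and judging its quality
    # means doing statistics on held-out data.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 2))                   # two made-up features
    y = (X @ [1.5, -2.0] + rng.normal(size=1000) > 0).astype(float)
    Xtr, Xte, ytr, yte = X[:800], X[800:], y[:800], y[800:]

    # Logistic regression fit by plain gradient descent.
    w = np.zeros(2)
    for _ in range(2000):
        p = 1 / (1 + np.exp(-(Xtr @ w)))
        w -= 0.1 * Xtr.T @ (p - ytr) / len(ytr)

    test_p = 1 / (1 + np.exp(-(Xte @ w)))
    accuracy = ((test_p > 0.5) == yte).mean()
    print(f"P(class=1) for one input: {test_p[0]:.2f}")
    print(f"held-out accuracy: {accuracy:.1%}")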

This reminds me of David Byrne's book How Music Works, which touched on the idea that technological change made life harder for certain kinds of musicians (live performers, say) but gave a huge boost to others (technical virtuosos, for instance). Programming's been through this before: there was the shift from the chess-grandmasterly practice of batch programming (which punished programmers who failed to foresee a single misstep anywhere in their programs' futures) to interactive programming (which allowed programmers to quickly iterate toward more robust code); later there was the shift from procedural to object-oriented programming.

The early dividends from the machine learning approach are certainly exciting. Smart Reply, part of Google Inbox, composes a trio of replies to the messages that arrive in its users' inboxes, giving them the choice of using or tweaking one of them, or writing one from scratch. In many cases the replies are eerily good (though the model had an unfortunate tendency to say "I love you" in reply to messages it didn't understand).

Most exciting is the idea that Google's creating an AI that doesn't think like a human. Much of what makes humans capable of understanding things — the relationship between a border collie puppy and an adult dog, say — also prevents us from doing the same task over and over, perfectly, without getting bored or resentful. The traditional singularity narrative has imagined that we'll get to understanding by mimicking human intelligence in software, then multiplying that intelligence with fast clockspeeds and more processors. Instead, Google's machine learning produces systems that are able to do some human intelligence tasks — recognizing border collies — but without the associated overheads, foibles, and limitations of human intelligence. It may be that if we could model a human in software, then multiplying that human a millionfold wouldn't give you a system that could monotonously and perfectly sift through tens of millions of images looking for border collies — instead, you'd get a software agent that got bored a million times faster.

Machine learning also requires a degree of patience. "The machine learning model is not a static piece of code — you're constantly feeding it data," says Google's Christine Robson. "We are constantly updating the models and learning, adding more data and tweaking how we're going to make predictions. It feels like a living, breathing thing. It's a different kind of engineering."
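In caricature, that workflow might look like the loop below; everything in it is my own synthetic sketch, and real pipelines are vastly more elaborate, but the shape is the point: the weights are never done.

    import numpy as np

    # Caricature of the continuous-update workflow (my sketch, with
    # synthetic data): the weights are never final; each fresh batch of
    # examples nudges them, and quality is checked after every update.
    rng = np.random.default_rng(1)
    w = np.zeros(2)                  # the current, always-provisional model

    def predict(X, w):
        return 1 / (1 + np.exp(-(X @ w)))

    for batch_num in range(5):       # each pass stands in for a day of new data
        X = rng.normal(size=(200, 2))
        y = (X @ [1.0, -1.0] > 0).astype(float)

        for _ in range(200):         # incremental refit on the new batch
            w -= 0.1 * X.T @ (predict(X, w) - y) / len(y)

        acc = ((predict(X, w) > 0.5) == y).mean()
        print(f"after batch {batch_num}: accuracy {acc:.1%}")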

"It's a discipline really of doing experimentation with the different algorithms, or about which sets of training data work really well for your use case," says Giannandrea, who despite his new role as search czar still considers evangelizing machine learning internally as part of his job. "The computer science part doesn't go away. But there is more of a focus on mathematics and statistics and less of a focus on writing half a million lines of code."

As far as Google is concerned, this hurdle can be cleared by smart retraining. "At the end of the day the mathematics used in these models is not that sophisticated," says Google's Jeff Dean. "It's achievable for most software engineers we would hire at Google."


How Google is Remaking Itself as a "Machine Learning First" Company [Steven Levy/Backchannel]


(Image: Googleplex Pride Logo, Runner 1928, CC-BY-SA/Deep Dream Generator)

(via Beyond the Beyond)