What Hinton’s Google Move Says About the Future of Machine Learning

Earlier this week TechCrunch broke the news that Google had acquired Geoff Hinton’s recently founded deep learning startup. Soon thereafter, Geoff posted an announcement on his Google+ page confirming the news and his (part-time) departure from the University of Toronto to Google. From the details that have emerged so far, it appears that he will split his time between U of T and the Google offices in Toronto and Mountain View. What do Geoff’s move and other recent high-profile departures say about the future of machine learning research in academia? A lot, I think.

First, some context. While Geoff is undoubtedly a giant in the field, he is only the latest in a string of departures to Google. Sebastian Thrun left Stanford to head Google’s autonomous car project in 2011. Andrew Ng has led several projects at Google, including the recent high-profile deep learning work on very large-scale computer vision. And in late 2010, Matt Welsh made news when he left his tenured faculty position at Harvard to join Google. With the exception of Matt Welsh, this trend has centered on large-scale machine learning, and the question of why this is happening was recently put to Andrew Ng during the panel discussion at the BigVision workshop at NIPS. Andrew’s answer was that doing machine learning at large scale requires significant industrial engineering expertise, of a kind that does not exist in academic settings, Stanford included, and that a place like Google is simply much better equipped to carry out this function.

It’s easy to dismiss this as merely the commercialization and scaling of technology that was initially developed in academia. While this is in part true, it is also true that machine learning in general is becoming increasingly dependent on large-scale data sets. In particular, the recent successes in deep learning have all relied on access to massive data sets and massive computing power. I believe it will become increasingly difficult to explore the algorithmic space without such access, and without the concomitant engineering expertise it requires. The behavior of machine learning algorithms, particularly neural networks, depends in a nonlinear fashion on the amount of data and computing power used. An algorithm that appears to perform poorly with small data sets and short training times can begin to perform considerably better when these limitations are removed. This, in a nutshell, is the reason for the recent resurgence of neural networks. Even the pre-training breakthrough of 2006 appears not to be strictly necessary if enough data and computing power are thrown at the problem.
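
To make this concrete, here is a minimal sketch of the kind of scaling experiment I have in mind, written against scikit-learn with a synthetic data set; the model, the data, and the training-set sizes are purely illustrative assumptions, not drawn from any of the work mentioned above.

    # Illustrative only: the same network, trained on progressively larger
    # slices of data, often looks mediocre at small scale and much better
    # at large scale. Dataset, architecture, and sizes are arbitrary choices.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=60_000, n_features=50,
                               n_informative=30, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=10_000, random_state=0)

    for n in (500, 5_000, 50_000):  # increasing training-set sizes
        clf = MLPClassifier(hidden_layer_sizes=(100,), max_iter=200, random_state=0)
        clf.fit(X_train[:n], y_train[:n])
        print(n, round(clf.score(X_test, y_test), 3))  # accuracy tends to climb with n

The point is not the absolute numbers but the shape of the curve: a model that looks unimpressive on 500 examples can look very different on 50,000.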

All this suggests that, without access to significant computing power, academic machine learning research will find it increasingly difficult to stay relevant. Such a shift to industrial research would not be without precedent. The development of computing itself provides a valuable lesson. In this post, I will attempt to draw parallels between computing in the 20th century and what I see as the current trajectory of machine learning and artificial intelligence.

Computing developed in three broad phases. The first, running from around the 1930s to the 1950s, was one that I would describe as embryonic research. The work was extremely foundational, had little obvious commercial value, and was done primarily in the halls of academia, supported by government and military spending. The heroes of that era are people like Alan Turing and John von Neumann, brilliant minds who could see far into the future, but who had to toil in relative obscurity; their work was so far from its ultimate realization that the true magnitude of their contributions would only be recognized decades later.

The second phase, roughly from the 1960s to the 1980s, was dominated by industrial research. In this period the field reached sufficient maturity that the work was better done at companies than in academia. The science had become obviously useful, even if it was still far from reaching its full potential. There was no longer a question of whether there was something worth exploring, but it remained a science, with much research to be done. Because of its potential utility, companies that were not strictly in the business (think Bell Labs, Xerox, IBM) could nonetheless afford to spend large wads of money on the problem. In many ways this became the golden age of the field, with many interesting if not absolutely foundational problems to be solved, and with large companies willing to put their weight behind the problem and propel the field forward with great speed. Curiously, however, this era seemed to generate fewer famous personalities, perhaps because the work was done by lots of people distributed across many industrial labs; the singular genius was less important. This phase also particularly favored large established companies, ones that were able to throw significant amounts of money and expertise at the problem without expecting an immediate return on investment. Startups would have had a hard time becoming commercially viable by relying only on computing, because the business was not there yet.

Finally came the third phase, spanning the 1980s to the 2000s (I am excluding the internet and mobile): the breakout of the science into a true technology used by the masses. This phase was less interesting from a scientific perspective, as most of the intellectual heavy lifting had already been done, with the remaining technical work centering on scalability. On the other hand, this period was the most challenging and exciting entrepreneurially, as a new generation of founders took the technology mainstream, eclipsing the older companies that had served as incubators for the science. Its heroes are thus not scientists but corporate visionaries like Bill Gates and Steve Jobs, who became household names and effected broad societal change.

Which brings us back to machine learning. My claim is that the above is a sort of generic template for the development of any technology: first it starts as basic academic research, then it graduates into industrial research, and finally it becomes a real technology and is taken mainstream by new startups. If machine learning is viewed through the broader prism of artificial intelligence, as merely a stage in AI research, then what we are now witnessing is the transition of AI research from the first phase to the second. By AI I mean systems that are able not only to parameter-fit and self-learn input representations, but also to reason in a structured manner by exploring compositional models. From this perspective, the 1960s to 1980s were the embryonic period, in which a lot of foundational but obscure work was done and no one outside the field took it seriously. It was unclear whether there was anything there, whether there would be any ultimate payoff.

The second phase began in the 1990s and 2000s and will run to the 2020s or 2030s. It is still very much science (AI, not ML), but large companies like Google see the potential. To them the basic question of whether it has any utility has been answered in the affirmative, and so they can justify throwing money at the problem even though it is not strictly in their business (one could argue whether Google is really in the business of AI or just ML). Because companies are now taking it seriously, the center of mass will shift heavily from academia to industry, with most of the interesting AI research occurring in industrial laboratories. And these industrial labs will be a safe and welcoming place for researchers for around two decades.

The reason I think it will last that long is a very simple and naïve extrapolation. We are currently training deep neural nets with tens of millions of nodes and billions of parameters. The human brain has on the order of a hundred billion neurons and hundreds of trillions of synapses, and so, assuming Moore’s law holds and algorithmic developments keep up, we are about three orders of magnitude, or around 20 years, from getting a human brain (assuming a doubling of computing power every two years, and hence a 2^10, roughly thousandfold, increase over 20 years). By the 2030s the bulk of the science will be done, and in the succeeding decades the technology will be commercialized, by which time (2050s-2060s) we will have machine brains that are orders of magnitude smarter than human ones (!). This also suggests that the prime time for AI startups is not now, but some 20 years from now.
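
For the curious, here is the back-of-the-envelope arithmetic spelled out; the thousandfold capacity gap and the two-year doubling period are assumptions, nothing more.

    import math

    # The naive extrapolation above, spelled out. Both inputs are assumptions:
    capacity_gap = 1e3            # remaining gap: roughly three orders of magnitude
    doubling_period_years = 2     # Moore's-law-style doubling every two years

    years = doubling_period_years * math.log2(capacity_gap)
    print(f"~{years:.0f} years")  # 2 * log2(1000) is roughly 20 years, i.e. the 2030s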

I made enough wild predictions and speculations in this post that the rope is now long enough to hang myself several times over, and so I will stop here. My point is simple though. Geoff Hinton’s move is part of a much larger shift. ML research, real fundamental research and not just scaling numerical methods to get the SVD of gargantuan matrices, is moving permanently to the industrial lab.

Update: This post got picked up by Hacker News along with the requisite discussion here.

Update (12/9/13): Yann LeCun just announced that he’s joining Facebook.


13 comments

  1. Google could speed things along if they were to open up their index: unlimited API access to the top N returned results on any query. At this point, what do they have to lose? Is there seriously anyone out there who can match their infrastructure? Everyone is going to be dependent on their data for a long time to come. Might as well increase the number of researchers working on it.

    • Yes, I so wish they would do this. I feel like the data barrier is even bigger than the computing barrier. For computing there are always the national supercomputing labs, but Google’s data advantage is just unparalleled. I think if they do it correctly it could prove to be a win-win situation: they wouldn’t lose their competitive advantage and it could stimulate a lot of research.

  2. This chimes strongly with my experience of industrial machine learning research and development, where the volume of test data, together with the level of organization & efficiency of the automated testing regime that exploits the test data, is _the_ strongest indicator of success in a machine vision development program. (Far more significant, for example, than the skills of the individual engineers involved in the program).

  3. I’m not very optimistic about AI moving to industry. Aside from supervised ML, very few people even research AI in academia. Current general automated planning algorithms (which are required for an agent to make decisions from what it learns) in partially observable settings (usually represented as partially observable Markov decision processes) can only handle toy problems with very few states.

    AI is like many other sciences today: data rich, and theory poor.

    • There is TONS of academic research in the field of speech recognition. And what has academia managed to come up with? CMU Sphinx??? That is a piece of crap even compared to thirteen-year-old state-of-the-art commercial products. And the current state of the art in speech recognition from Google & Microsoft shows amazing improvements even compared to what was available 4 years ago.

      And there is no shortage of other commercial speech recognition engines that may not be as good as the top ones, but are far better than anything academia has managed to produce: Yahoo, Nuance (think of Apple & Samsung smartphones), Chinese iFlyTek, etc.

      But it is impossible to achieve such results without doing cutting-edge research in the field. Check the papers published by Google: there are hundreds of them in the field of ASR alone. But sure, some of the things that MS and Google are doing are proprietary, and many are covered by patent protection.

      Nevertheless, what is happening now is a very good thing for the progress of the field. NNs are being massively deployed for commercial use by the top players. That creates competition and an incentive for the chasing pack to try to catch up, while the top players try to stay ahead of the curve. As a result, more funding from industry becomes available to researchers, and that funding is spent more efficiently than government money, which enormously helps to move the whole field forward faster.

      • I’m not arguing that industry won’t advance narrow, specialized machine learning technology. Industry refines what academia produces, and is great at it. What I’m arguing is that industry is mostly interested only in the supervised machine learning subfield of AI, and even then, usually only in specific applications of ML. AI is a huge field, most of which is now being ignored by both industry and academia in favor of narrow applications.

        In my opinion, government needs to start spending money on AI much less efficiently 🙂 I.e. fund basic, risky, long-term, and theoretical research. It’s my understanding that, currently, it’s very hard to get funding for AI research, unless you’re working on a very specific application.

  4. I think you’re right about waiting 20 years to do AI. If we want machines to learn like humans do, shouldn’t we be able to do this without huge data sets? A child can recognize speech without hearing millions of audio samples to analyze. This seems like a brute-force method instead of trying to understand more about how the brain works. P.S. It’s looking like Google is becoming more like SkyNet every day 🙂

    • You would be amazed by how much exposure a child has to environmental stimuli before it is able to make such cognitive judgments, or how biased a child is by experience and cultural influences (in almost all aspects of cognition). I actually consider it pretty unfair that we keep comparing computers with grown-up humans.

  5. Pingback: Google Adds Another AI Academic to the Mix : Stephen E. Arnold @ Beyond Search

  6. Pingback: What Hinton’s Google Move Says About the Future of Machine Learning | Dotan Di Castro's Website

  7. Hello, I liked your post a lot. I don’t have a lot of experience to judge by, but following news like this, and Google and other major IT companies acquiring various robotics startups, I think AI and ML might be the next big thing and might change the world of the 2050s the way the internet changed the 2000s.

