Fast, Accurate Detection of 100,000 Object Classes on a Single Machine a prizewinning paper by Google Research scientists, describes a breakthrough in machine vision that can distinguish between a huge class of objects 20,000 times faster than before.
This so-called convolution operator is one of the key operations used in computer vision and, more broadly, all of signal processing. Unfortunately, it is computationally expensive and hence researchers use it sparingly or employ exotic SIMD hardware like GPUs and FPGAs to mitigate the computational cost. We turn things on their head by showing how one can use fast table lookup — a method called hashing — to trade time for space, replacing the computationally-expensive inner loop of the convolution operator — a sequence of multiplications and additions — required for performing millions of convolutions with a single table lookup.
We demonstrate the advantages of our approach by scaling object detection from the current state of the art involving several hundred or at most a few thousand of object categories to 100,000 categories requiring what would amount to more than a million convolutions. Moreover, our demonstration was carried out on a single commodity computer requiring only a few seconds for each image. The basic technology is used in several pieces of Google infrastructure and can be applied to problems outside of computer vision such as auditory signal processing.
Fast, Accurate Detection of 100,000 Object Classes on a Single Machine
(Image: Clutter, a Creative Commons Attribution Share-Alike (2.0) image from neofob's photostream)
Yahoo has released a machine-learning model called open_nsfw that is designed to distinguish not-safe-for-work images from worksafe ones. By tweaking the model and combining it with places-CNN, MIT’s scene-recognition model, Gabriel Goh created a bunch of machine-generated scenes that score high for both models — things that aren’t porn, but look porny.
Projection mapping is one of the most profound visual effects that computers can generate; themepark fans will have seen it in effect on the revamped opening scene to the Indiana Jones ride at Disneyland and in the night-time shows that involve painting the whole castle with light (projection mapping is also used to generate the […]
In Rich do not rise early: spatio-temporal patterns in the mobility networks of different socio-economic classes, a group of transportation engineers analyze an open data-set about the commutes of people in the Colombian cities of Medellín and Manizales, concluding that the rich and the poor commute the furthest distances, but that the rich have much […]
Nothing is more frustrating than needing to edit or sign a PDF and not having access to the original document. That’s why PDFpenPRO is a must-have app in our books.With this extremely useful app, you can merge, markup, and create PDF documents without ever having to convert your PDFs into word processor file formats. Type directly onto […]
From self-driving cars to stock market predicting software to the recommendations you get on Amazon and Netflix, machine learning is at the core of modern technology. You could find yourself building technology that is literally changing the world with the skills you’ll learn in The Complete Machine Learning Bundle. This bundle of 10 courses includes 406 lessons that will teach […]
This Python Mega Course will help you learn to code by teaching you to build 10 real-world apps that each highlight a unique use of Python.Job prospects for coders are still growing steadily—and with Python being one of the most popular coding languages out there today, it’s important for job seekers to demonstrate a widespread understanding of the […]