Ben Lorica, O'Reilly's chief data scientist, has posted slides and notes from his talk at last December's Strata Data Conference in Singapore, "We need to build machine learning tools to augment machine learning engineers."
Lorica describes a new job emerging in IT departments: "machine learning engineers," who adapt machine learning models for production environments. These new engineers run the risk of embedding algorithmic bias into their systems, which can unfairly discriminate, create liability, and reduce the quality of the recommendations the systems produce.
He presents a set of technical and procedural steps to minimize these risks, with links to the relevant papers and code. It's required reading for anyone implementing a machine learning system in a production environment.
Another example has to do with error: once we are satisfied with a certain error rate, aren’t we done and ready to deploy our model to production? Consider a scenario where you have a machine learning model used in health care: in the course of model building, your training data for millennials (in red) is quite large compared to the number of labeled examples from senior citizens (in blue). Since accuracy tends to be correlated with the size of your training set, chances are the error rate for senior citizens will be higher than for millennials.
For situations like this, a group of researchers introduced a concept called "equal opportunity" that can help alleviate disproportionate error rates and ensure the “true positive rate” is similar for the two groups. See their paper and accompanying interactive visualization.
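To make the "true positive rate" comparison concrete, here is a minimal Python sketch of how one might check it per group. It is not taken from Lorica's talk or the researchers' paper: the function names, the group labels, and the toy data are all made up for illustration, and it only measures the gap between groups rather than correcting it.

```python
from collections import defaultdict


def true_positive_rate_by_group(y_true, y_pred, groups):
    """Return {group: TPR}, where TPR = true positives / actual positives."""
    positives = defaultdict(int)       # actual positives seen per group
    true_positives = defaultdict(int)  # actual positives predicted positive
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            if pred == 1:
                true_positives[group] += 1
    # Groups with no actual positives are omitted (their TPR is undefined).
    return {g: true_positives[g] / positives[g] for g in positives}


def equal_opportunity_gap(y_true, y_pred, groups):
    """Largest difference in TPR between any two groups (0.0 means parity)."""
    rates = true_positive_rate_by_group(y_true, y_pred, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: more labeled examples for one group than the other.
y_true = [1, 1, 1, 1, 0, 0, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0, 0]
groups = ["millennial"] * 6 + ["senior"] * 3

rates = true_positive_rate_by_group(y_true, y_pred, groups)
gap = equal_opportunity_gap(y_true, y_pred, groups)
print(rates, gap)
```

A gap near zero suggests the model catches true cases at a similar rate in both groups; a large gap is a signal to look more closely before deploying.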
We need to build machine learning tools to augment machine learning engineers [Ben Lorica/O'Reilly]
(via 4 Short Links)