Ben Lorica, O'Reilly's chief data scientist, has posted slides and notes from his talk at last December's Strata Data Conference in Singapore, "We need to build machine learning tools to augment machine learning engineers."
Lorica describes a new job emerging in IT departments: "machine learning engineers," whose job is to adapt machine learning models for production environments. These new engineers run the risk of embedding algorithmic bias into their systems, which unfairly discriminates, creates liability, and reduces the quality of the recommendations the systems produce.
He presents a set of technical and procedural steps for minimizing these risks, with links to the relevant papers and code. It's required reading for anyone implementing a machine learning system in a production environment.
Another example has to do with error: once we are satisfied with a certain error rate, aren’t we done and ready to deploy our model to production? Consider a scenario where you have a machine learning model used in health care: in the course of model building, your training data for millennials (in red) is quite large compared to the number of labeled examples from senior citizens (in blue). Since accuracy tends to be correlated with the size of your training set, chances are the error rate for senior citizens will be higher than for millennials.
For situations like this, a group of researchers introduced a concept called “equal opportunity” that can help alleviate disproportionate error rates and ensure that the “true positive rate” is similar for the two groups. See their paper and accompanying interactive visualization.
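To make the "true positive rate" comparison concrete, here is a minimal sketch in Python. It is not code from Lorica's talk or from the researchers' paper: the function names and the toy labels are hypothetical, chosen only to show how the rate could be measured per group and how far apart two groups are.

```python
from collections import defaultdict


def true_positive_rate_by_group(y_true, y_pred, groups):
    """Return {group: TPR}, where TPR = true positives / actual positives."""
    true_pos = defaultdict(int)
    actual_pos = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:  # only actual positives count toward the TPR
            actual_pos[group] += 1
            if pred == 1:
                true_pos[group] += 1
    # Groups with no actual positives are left out (their TPR is undefined).
    return {g: true_pos[g] / actual_pos[g] for g in actual_pos}


def equal_opportunity_gap(tpr_by_group):
    """Largest difference in TPR between any two groups (0.0 means parity)."""
    rates = list(tpr_by_group.values())
    return max(rates) - min(rates)


# Hypothetical toy data: 1 = condition present, 0 = condition absent.
y_true = [1, 1, 1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0]
groups = ["millennial"] * 6 + ["senior"] * 4

rates = true_positive_rate_by_group(y_true, y_pred, groups)
gap = equal_opportunity_gap(rates)
print(rates)  # {'millennial': 0.75, 'senior': 0.5}
print(gap)    # 0.25
```

In this toy example the model catches three of four true cases among millennials but only one of two among senior citizens. A gap like that is what an equal-opportunity check is meant to surface before a model goes to production.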
We need to build machine learning tools to augment machine learning engineers [Ben Lorica/O'Reilly]
(via 4 Short Links)