Ben Lorica, O'Reilly's chief data scientist, has posted slides and notes from his talk at last December's Strata Data Conference in Singapore, "We need to build machine learning tools to augment machine learning engineers."
Lorica describes a new job emerging in IT departments: "machine learning engineers," whose job is to adapt machine learning models for production environments. These new engineers run the risk of embedding algorithmic bias into their systems, which unfairly discriminate, create liability, and reduces the quality of the recommendations the systems produce.
He presents a set of technical and procedural steps to take to minimize these risks, with links to the relevant papers and code. It's really required reading for anyone implementing a machine learning system in a production environment.
Another example has to do with error: once we are satisfied with a certain error rate, aren’t we done and ready to deploy our model to production? Consider a scenario where you have a machine learning model used in health care: in the course of model building, your training data for millenials (in red) is quite large compared to the number of labeled examples from senior citizens (in blue). Since accuracy tends to be correlated with the size of your training set, chances are the error rate for senior citizens will be higher than for millenials.
For situations like this, a group of researchers introduced a concept, called "equal opportunity", that can help alleviate disproportionate error rates and ensure the “true positive rate” for the two groups are similar. See their paper and accompanying interactive visualization.
We need to build machine learning tools to augment machine learning engineers [Ben Lorica/O'Reilly]
(via 4 Short Links)
Are you a PhD with interest in "the intersection of digital technology and public life, including experts in computer science, sociology, economics, law, political science, public policy, information studies, communication, and other related disciplines?" Princeton's CITP has three open job postings for 10-month residences starting Sept 1, 2019.
Ganbreeder uses a machine learning technique called Generative Adversarial Networks (GANs) to generate images that seem like photos, at least a first glance.
CIT computer scientist Milan Cvitkovic conducted 46 in-depth interviews with "scientists, engineers, and CEOs" and collated their machine learning research needs into an aptly named paper entitled "Some Requests for Machine Learning Research from the East African Tech Scene," which presents an illuminating look into the gaps in the current practice of machine learning, itself […]
For the true audio enthusiast, there’s a lot of difference between putting on some songs “for background music” and a true listening experience. For the latter, there’s nothing like a pair of sturdy headphones and the powerful speakers that come with them. And the wireless variety doesn’t get much more powerful than the TREBLAB Z2 […]
Digital or analog, there’s a path of least resistance for any project. Finding that path is what the Agile methodology is all about, which is why proficiency in it is a must for any project management position – and the paycheck that comes with it. And the quickest path to learning Agile? The Agile Project […]
Everybody’s flown a paper airplane. But what if you could fly on a paper airplane? Until we invent shrink-ray technology, the PowerUp X FPV Video Paper Airplane Kit will have to do – but it’s as fun as that sounds and more. The original version of this creative toy added drone tech to the old, […]