Facebook gave "as many as" 260 contractors at Wipro, Ltd in Hyderabad, India access to users' private messages and private Instagram posts so that the contractors could label them prior to their inclusion in an AI training-data set.
The posts are randomly selected, and each is shown to two contractors, who are asked to label its contents, its intent and the occasion it depicts; if their labels contradict each other, a third contractor breaks the tie. The posts are drawn from both public updates to Facebook and Instagram and private messages whose creators took an affirmative step to prevent them from being seen by people they hadn't personally vetted.
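For readers curious what that two-labelers-plus-tiebreaker scheme looks like in practice, here is a minimal sketch. It is not Facebook's or Wipro's actual tooling: the function name `resolve_label`, the `ask_third` callback and the example labels are all hypothetical, and the rule that a three-way disagreement leaves the post unresolved is an assumption the reporting does not cover.

```python
from collections import Counter
from typing import Callable, Optional


def resolve_label(first: str, second: str,
                  ask_third: Callable[[], str]) -> Optional[str]:
    """Settle one post's label from two labelers, with a third as tiebreaker."""
    # The first two labelers agree, so no third opinion is requested.
    if first == second:
        return first
    # They disagree: ask a third labeler and take the majority label.
    third = ask_third()
    label, votes = Counter([first, second, third]).most_common(1)[0]
    # Assumption (not from the reporting): if all three differ,
    # the post is left unresolved and None is returned.
    return label if votes >= 2 else None
```

Calling `resolve_label("birthday", "wedding", lambda: "wedding")` returns `"wedding"`, since the third labeler sides with the second.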
Facebook is hoping to use the labeling data to train a machine learning system that can identify potentially offensive posts so that it does not place ads alongside them.
The privacy flag in Facebook's and Instagram's interfaces is designed to assure users that, when they set it, they control who can see the content of their messages.
Reuters learned of the labeling project thanks to Wipro whistleblowers who spoke on condition of anonymity.
Facebook's provision of this private data to an outside contractor is radioactively illegal under Europe's GDPR.
In March, Facebook made a big announcement committing the company to a new era of respect for user privacy. In December, Facebook admitted that it had provided access to users' private messages to Spotify and Netflix. More than a year ago, Facebook promised to deliver a "clear history" feature to let users have more control over their data.
In 2004, Facebook founder Mark Zuckerberg IMed a friend to crow about the early Facebook users' provision of their personal information, saying, "i have over 4000 emails, pictures, addresses, sns… people just submitted it. i don't know why. they 'trust me.' dumb fucks"
Facebook users are not offered the chance to opt out of their data being labeled.
At Wipro, the posts being examined include not only public posts but also those that are shared privately to a limited set of a user's friends. That ensures the sample reflects the range of activity on Facebook and Instagram, said Karen Courington, director of product support operations at Facebook.
Facebook's data policy does not explicitly mention manual analysis.
"We provide information and content to vendors and service providers who support our business, such as by providing technical infrastructure services, analyzing how our products are used, providing customer service, facilitating payments or conducting surveys," the policy states.
Facebook 'labels' posts by hand, posing privacy questions [Munsif Vengattil and Paresh Dave/Reuters]
Facebook contractors categorize your private posts to train AI [Christine Fisher/Engadget]