Tech companies should do something about harassment, but not this

Online harassment is real, it's terrible, and tech companies can and should do more about it. But when the normally sensible Jessica Valenti wrote in the Guardian that tech companies could solve online harassment in a snap by implementing a system like YouTube's Content ID, she wasn't just wrong, she was dangerously wrong.

That's because Content ID doesn't actually work very well, despite having a much easier task than detecting and preventing harassment. Content ID's job is to match all the video it receives against a database of known copyrighted works, a task that it often fails at: first by overblocking (see, for example, the seven-hour science seminar that lost its entire video record because of some incidental music played by a DJ over the lunch hour), and also by underblocking (ask any of the studios or labels and they'll tell you that Content ID sucks, and doesn't do nearly enough to stop infringement, or even slow it down).

Content ID at least has something fixed to check against: its job is to match new uploads against a database of known copyrighted works. But there's no database of "things that constitute harassment under any circumstances." If profanity or taunts were included in such a database, then denouncing someone for calling you a nasty name would be blocked just as thoroughly as the original harassment.
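To make that failure concrete, here is a minimal sketch of what a database-matching filter for harassment might look like. Everything in it is an assumption for illustration: the `BLOCKLIST` contents, the `is_blocked` function, and the sample messages are hypothetical, and nothing here reflects how Content ID or any real platform is built.

```python
import re

# Hypothetical "database of harassment": a handful of insulting words.
BLOCKLIST = {"idiot", "loser"}


def is_blocked(message: str) -> bool:
    """Return True if any blocklisted word appears in the message."""
    words = re.findall(r"[a-z']+", message.lower())
    return any(word in BLOCKLIST for word in words)


# The original abuse, the target's complaint about it, and an unrelated post.
harassment = "You are an idiot and a loser."
report = "This account called me an idiot and a loser. Please make it stop."
unrelated = "Great talk at the seminar today."

for text in (harassment, report, unrelated):
    print(is_blocked(text), "-", text)
```

The matcher can't tell the abuse from the complaint that quotes it, so both get blocked. A smarter matcher would need to understand context, which is exactly what a lookup against a database doesn't provide.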

Sarah Jeong has written an excellent editorial explaining this. Valenti's approach is dangerously close to the security syllogism ("something must be done; there, I've done something"), and has enormous potential as a tool of rank, indiscriminate censorship.

Update: Jessica Valenti believes that I have misrepresented her position. I disagree, and have gone into more detail in the comments. She also points out that I neglected to include a link to her original piece, which was a mistake (I'd fat-fingered a tag in the post so the link didn't show up). Here is her original piece; I'm sorry I didn't link to it originally.

I would welcome clarity from Ms Valenti about what her article did intend to say, if not what I've said here and in my followup.

Can technology mitigate harassment? Can a change in a product also change the community that uses it? Absolutely. But blunt instruments patterned after Content ID are reactive responses bound to generate more problems rather than mitigating the problems that already exist. The basic premise of Content ID — matching content to a database — isn't one that can be simply diverted against harassment. And the process that follows after, which is designed to mitigate the bluntness of Content ID — that is, the DMCA takedown and the subsequent appeals process, specific to every individual instance of alleged infringement — isn't one that will benefit victims of harassment.

The response of social media companies to the problem of harassment has been lackluster, and they are certainly capable of doing better, but doing better still doesn't mean they can eliminate harassment tomorrow. It is tragic that they have prioritized intellectual property enforcement over user safety, but even its IP enforcement is considered unsatisfactory on many sides — whether for the content industry, for fair use advocates, or for users. A solution to harassment patterned after Content ID is likely to result in similar dissatisfaction for all.

Why it's so hard to stop online harassment [Sarah Jeong/The Verge]