Algorithms like YouTube's Content ID harm fair use, free speech, and creativity

The Electronic Frontier Foundation just released a new whitepaper titled Unfiltered: How YouTube's Content ID Discourages Fair Use and Dictates What We See Online, which examines the oppressive use of algorithms to police user-generated content in order to "defend" copyrights from perfectly legal things like fair use.

Because YouTube is the dominant player in the online video market, its choices dictate the norms of the whole industry. And unfortunately for independent creators, YouTube has proven to be more interested in appeasing large copyright holders than protecting free speech or promoting creativity. Through its automatic copyright filter, Content ID, YouTube has effectively replaced legal fair use of copyrighted material with its own rules.

These rules disproportionately affect audio, making virtually any use of music risky. Classical musicians worry about playing public domain music. Music criticism that includes the parts of songs being analyzed is rare. The rules only care about how much is being used, so reviewers and educators do not use the "best" examples of what they are discussing, they use the shortest ones, sacrificing clarity. The filter changes constantly, so videos that passed muster once (and always were fair use) constantly need to be re-edited. Money is taken away from independent artists who happen to use parts of copyrighted material, and deposited into the pockets of major media companies, despite the fact that they would never be able to claim that money in court.

YouTube has already essentially cornered the market for original video content. Because it is part of a large corporation, it will almost inevitably side with large corporations (read: "rightsholders") over the people who are actually creating the content from which YouTube profits. As EFF explains, it's like a perfect supervillain team-up between YouTube's Content ID and the Digital Millennium Copyright Act (DMCA). The algorithm doesn't care about the context, because it's not a human. As long as it identifies someone else's content, the algorithm flags it, even if that content is a fair use sample in a larger piece of criticism or a work of satire, both of which are fully legal. This usually results in the video being taken down, unless you actively choose to appeal through a byzantine process that probably won't get you anywhere anyway.

Hell, just a few months ago I posted about Instagram removing a video where I played music that I wrote and recorded for my own band; I've tried 3 different times to appeal this ruling, and I still haven't gotten anywhere, nor can I figure out how the hell to talk to a normal Live Human Being and explain why this takedown was so dumb.

But that was Instagram/Facebook, not YouTube. It's nothing compared to, say, this (emphasis added):

Matches may be made based on mere seconds of material. While YouTube itself does not say on its user support pages how much copyrighted material will trigger a Content ID match, anecdotal evidence puts the threshold under ten seconds. (A ten-hour video of white noise had less than a second claimed by a rightsholder.) Matches are also made against anything in the database, regardless of any deal made between a rightsholder and a video maker. So, even if a video creator has licensed music for a video—either has paid to use something or was granted permission to use something without paying—it will still trigger a match and thus incur a penalty.

And of course, all of this is skewed wildly in favor of already-powerful-and-wealthy entities that can afford the kind of lawyers who keep them in power:

A Content ID claim occurs when the automated algorithm that powers Content ID detects a match between a YouTuber's video and the database of material submitted by rightsholders. Only certain rightsholders are allowed to add content to the database: those who "own a substantial body of original material that is frequently uploaded by the YouTube creator community." This tilts the database and Content ID matches in favor of major movie and TV studios and music labels.

There are a lot more details like that in the full study. But the long and short of it is this: corporate entities are using legal strong-arming to overrule legal fair use of content, which discourages anything that's in conversation with anything else, and that kind of conversation is a fairly central part of how humans create and communicate.

Unfiltered: How YouTube's Content ID Discourages Fair Use and Dictates What We See Online [Katharine Trendacosta / Electronic Frontier Foundation]