Over on Hackernoon, data scientist and "language nerd" Jeff Kao has posted the results of a data analysis he did on Net Neutrality comments submitted to the FCC between April and October 2017. Using natural language processing techniques, he looked for suspicious patterns in the language used. What he found was alarming.
The first and largest cluster of pro-repeal documents was especially notable. Unlike the other clusters I found (which contained a lot of repetitive language) each of the comments here was unique; however, the tone, language, and meaning across each comment was largely uniform. The language was also a bit stilted. Curious to dig deeper, I used regular expressions to match up the words in the clustered comments:
It turns out that there are 1.3 million of these. Each sentence in the faked comments looks like it was generated by a computer program. A mail merge swapped in a synonym for each term to generate unique-sounding comments. It was like mad-libs, except for astroturf.
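To make the "mad-libs" idea concrete, here is a minimal Python sketch of how a mail merge with synonym slots can produce many unique-sounding sentences, and how a regular expression with one alternation group per slot can match them all. This is not Kao's code: the slot phrases, `SLOTS`, `generate_comments`, and `build_pattern` are all invented for illustration.

```python
import itertools
import re

# Hypothetical template slots; these phrases are made up for illustration
# and are not taken from the actual FCC comments.
SLOTS = [
    ["Americans", "Citizens", "People like me"],
    ["urge", "ask", "encourage"],
    ["the FCC", "the commission"],
    ["to repeal", "to undo", "to reverse"],
    ["the current rules", "the existing regulations"],
]


def generate_comments(slots):
    """Yield every sentence obtainable by picking one phrase per slot."""
    for combo in itertools.product(*slots):
        yield " ".join(combo) + "."


def build_pattern(slots):
    """Build a regex with one alternation group per slot."""
    groups = [
        "(" + "|".join(re.escape(phrase) for phrase in slot) + ")"
        for slot in slots
    ]
    return re.compile("^" + " ".join(groups) + r"\.$")


comments = list(generate_comments(SLOTS))
pattern = build_pattern(SLOTS)

# Every generated sentence is distinct, yet all of them fit one pattern.
print(len(comments), len(set(comments)))
print(all(pattern.match(c) for c in comments))
```

Even this toy template with five small slots yields over a hundred distinct sentences, which hints at how a longer template could scale to a very large number of "unique" comments.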
When laying just five of these side-by-side with highlighting, as above, it’s clear that there’s something fishy going on. But when the comments are scattered among 22+ million, often with vastly different wordings between comment pairs, I can see how it’s hard to catch. Semantic clustering techniques, and not typical string-matching techniques, did a great job at nabbing these.
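A rough Python sketch of why exact string matching misses these comments while grouping by meaning catches them. Real semantic clustering would typically rely on learned text representations; the hand-written synonym table below (`CANONICAL`) is only a crude stand-in for that, and every name and sample sentence here is hypothetical rather than drawn from Kao's analysis.

```python
from collections import defaultdict

# Hypothetical synonym table standing in for a real semantic model.
CANONICAL = {
    "citizens": "americans",
    "ask": "urge",
    "undo": "repeal",
    "reverse": "repeal",
    "regulations": "rules",
}


def normalize(comment):
    """Lowercase the comment and map each word to its canonical synonym."""
    words = comment.lower().strip(".").split()
    return " ".join(CANONICAL.get(word, word) for word in words)


def cluster(comments):
    """Group comments whose normalized forms are identical."""
    groups = defaultdict(list)
    for comment in comments:
        groups[normalize(comment)].append(comment)
    return dict(groups)


# Invented sample comments for illustration.
sample = [
    "Americans urge the FCC to repeal the rules.",
    "Citizens ask the FCC to undo the regulations.",
    "Americans ask the FCC to reverse the rules.",
    "Please keep net neutrality in place.",
]

# Exact string matching finds no duplicates at all.
exact_duplicates = len(sample) - len(set(sample))

# Grouping by normalized meaning puts the three templated comments together.
clusters = cluster(sample)
largest = max(len(members) for members in clusters.values())
print(exact_duplicates, len(clusters), largest)
```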
Finally, it was particularly chilling to see these spam comments all in one place, as they are exactly the type of policy arguments and language you expect to see in industry comments on the proposed repeal, or, these days, in the FCC Commissioner’s own statements lauding the repeal.
Oh, and guess what? Of the 800,000 comments that Kao determined likely to be "organic," 99+% of them were pro-Neutrality.