Over on Hackernoon, data scientist and "language nerd" Jeff Kao has posted the results of a data analysis he did on Net Neutrality comments submitted to the FCC between April and October 2017. Using natural language processing techniques, he looked for suspicious patterns in the language used. What he found was alarming. He writes:
The first and largest cluster of pro-repeal documents was especially notable. Unlike the other clusters I found (which contained a lot of repetitive language) each of the comments here was unique; however, the tone, language, and meaning across each comment was largely uniform. The language was also a bit stilted. Curious to dig deeper, I used regular expressions to match up the words in the clustered comments:
It turns out that there are 1.3 million of these. Each sentence in the faked comments looks like it was generated by a computer program. A mail merge swapped in a synonym for each term to generate unique-sounding comments. It was like mad-libs, except for astroturf.
When laying just five of these side-by-side with highlighting, as above, it’s clear that there’s something fishy going on. But when the comments are scattered among 22+ million, often with vastly different wordings between comment pairs, I can see how it’s hard to catch. Semantic clustering techniques, and not typical string-matching techniques, did a great job at nabbing these.
Finally, it was particularly chilling to see these spam comments all in one place, as they are exactly the type of policy arguments and language you expect to see in industry comments on the proposed repeal, or, these days, in the FCC Commissioner’s own statements lauding the repeal.
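To make the mad-libs idea concrete, here's a rough Python sketch of how a mail-merge template can spit out unique-looking comments, and how a regular expression with one alternation group per slot can match them all back up. This is not Kao's code: the `SLOTS` phrases are invented for illustration (they are not taken from the FCC docket), and `generate_comments` and `build_pattern` are hypothetical helper names.

```python
import itertools
import re

# Hypothetical template: each slot lists interchangeable phrasings.
# These phrases are invented for illustration, not taken from real comments.
SLOTS = [
    ["I urge", "I ask", "I request"],
    ["the FCC", "the commission"],
    ["to repeal", "to undo", "to reverse"],
    ["the current internet rules", "the existing broadband regulations"],
]


def generate_comments(slots):
    """Yield every comment the template can produce, one choice per slot."""
    for choice in itertools.product(*slots):
        yield " ".join(choice) + "."


def build_pattern(slots):
    """Build one regex whose alternation groups mirror the template slots."""
    groups = [
        "(" + "|".join(re.escape(option) for option in slot) + ")"
        for slot in slots
    ]
    return re.compile(r"\s+".join(groups) + r"\.")


comments = list(generate_comments(SLOTS))
pattern = build_pattern(SLOTS)

# Every generated comment is a distinct string, so an exact-duplicate
# check flags nothing, yet the single pattern matches all of them.
matched = [c for c in comments if pattern.fullmatch(c)]
print(len(comments), len(set(comments)), len(matched))  # 36 36 36
```

Four small slots already yield 36 distinct comments, and each added slot multiplies the total. Note that the regex only helps once you already suspect the template; finding the cluster in the first place is the job Kao credits to semantic clustering, which this sketch does not attempt.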
Oh, and guess what? Of the 800,000 comments that Kao judged likely to be "organic," more than 99% were pro-Neutrality.