Over on Hackernoon, data scientist and "language nerd" Jeff Kao has posted the results of a data analysis he did on Net Neutrality comments submitted to the FCC between April and October 2017. Using natural language processing techniques, he looked for suspicious patterns in the language used. What he found was alarming.
The first and largest cluster of pro-repeal documents was especially notable. Unlike the other clusters I found (which contained a lot of repetitive language) each of the comments here was unique; however, the tone, language, and meaning across each comment was largely uniform. The language was also a bit stilted. Curious to dig deeper, I used regular expressions to match up the words in the clustered comments:
It turns out that there are 1.3 million of these. Each sentence in the faked comments looks like it was generated by a computer program. A mail merge swapped in a synonym for each term to generate unique-sounding comments. It was like mad-libs, except for astroturf.
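To make the mad-libs mechanism concrete, here is a minimal Python sketch of how a synonym-swapping template could produce many unique comments, and how a single regular expression with one alternation group per slot could match all of them. This is not Kao's code or the actual template: the `SLOTS` phrases and the helper names `generate_comments` and `build_pattern` are hypothetical, invented purely for illustration.

```python
import itertools
import re

# Hypothetical template: each slot lists interchangeable phrases.
# These phrases are invented for illustration; they are NOT taken
# from the FCC docket or from Kao's analysis.
SLOTS = [
    ["I urge", "I ask", "I encourage"],
    ["the commission", "the FCC"],
    ["to repeal", "to undo", "to reverse"],
    ["the current rules", "the existing regulations"],
]


def generate_comments(slots):
    """Yield every comment the template can produce (one phrase per slot)."""
    for choice in itertools.product(*slots):
        yield " ".join(choice) + "."


def build_pattern(slots):
    """Compile a regex with one alternation group per template slot."""
    groups = [
        "(" + "|".join(re.escape(phrase) for phrase in slot) + ")"
        for slot in slots
    ]
    return re.compile("^" + " ".join(groups) + r"\.$")


comments = list(generate_comments(SLOTS))
pattern = build_pattern(SLOTS)

# Every generated comment is textually distinct, yet all fit one pattern.
matched = [c for c in comments if pattern.match(c)]
print(len(comments), len(set(comments)), len(matched))
```

The number of distinct comments is the product of the slot sizes, so even this toy template with four small slots yields dozens of variants; a template with a few dozen slots could plausibly account for well over a million unique-looking comments.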
When laying just five of these side-by-side with highlighting, as above, it’s clear that there’s something fishy going on. But when the comments are scattered among 22+ million, often with vastly different wordings between comment pairs, I can see how it’s hard to catch. Semantic clustering techniques, and not typical string-matching techniques, did a great job at nabbing these.
Finally, it was particularly chilling to see these spam comments all in one place, as they are exactly the type of policy arguments and language you expect to see in industry comments on the proposed repeal, or, these days, in the FCC Commissioner’s own statements lauding the repeal.
Oh, and guess what? Of the 800,000 comments that Kao determined likely to be "organic," 99+% of them were pro-Neutrality.