Peter Thiel's Palantir to IPO within weeks, start trading before election

Bloomberg News reported on Friday that the secretive big-data and surveillance technology firm Palantir is preparing to file an S-1 confidentially, and plans to go public in the coming weeks and start trading as early as fall 2020. Read the rest

A fascinating computer analysis of the linguistic context around the 2nd Amendment

The Second Amendment is perhaps the most controversial part of the U.S. Bill of Rights. But that's not just because of our grander cultural debate around gun rights and gun violence — it's 'cause the damn thing is such a grammatical clusterfuck.

A well regulated Militia, being necessary to the security of a free State, the right of the people to keep and bear Arms, shall not be infringed.

27 words in 4 dependent clauses with no clear anything to link them. It's not clear if the thing that shall not be infringed is the well-regulated militia, or the right of the people to keep and bear arms, or if it's all dependent upon what is or is not necessary to the security of a free State. And anyone can make any one of those arguments, and have evidence to back it up that can't be definitively refuted, either.

Over at The Atlantic, James C. Phillips, a Fellow with the Constitutional Law Center at Stanford University, and Josh Blackman, a constitutional law professor at the South Texas College of Law Houston, discuss a novel approach to figuring out what, exactly, the Founding Fathers were actually trying to say: creating and scanning through a massive database of more than a billion words culled from formal American and British texts from 1475 to 1800. They specifically searched for instances where phrases such as "bear arms" and "keep arms" were used, and noted the context and adjacent language that accompanied the phrases to better understand how these terms were actually being used at the time. Read the rest
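
The kind of search described above, finding a phrase and recording the words around it, is often called a keyword-in-context (concordance) lookup. Here is a minimal sketch in Python. It is not the researchers' actual tooling: the function name `keyword_in_context`, the `window` parameter, and the use of the amendment's own text as sample input are all assumptions made for illustration.

```python
import re

def keyword_in_context(text, phrase, window=5):
    """Return (left, match, right) word windows for each hit of `phrase`.

    Hypothetical helper for illustration; matching is case-insensitive
    and ignores punctuation.
    """
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    target = phrase.lower().split()
    n = len(target)
    hits = []
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == target:
            left = tokens[max(0, i - window):i]
            right = tokens[i + n:i + n + window]
            hits.append((" ".join(left), " ".join(target), " ".join(right)))
    return hits

# Sample input: the text of the amendment itself, standing in for a corpus.
SAMPLE = (
    "A well regulated Militia, being necessary to the security of a free "
    "State, the right of the people to keep and bear Arms, shall not be "
    "infringed."
)

hits = keyword_in_context(SAMPLE, "bear arms", window=4)
for left, match, right in hits:
    print(f"{left} [{match}] {right}")
```

Run over a real historical corpus, the left and right windows are what a reader would tally to see whether a phrase tends to show up in military or civilian settings.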

Scarves that look like those crazy-long CVS receipts

Big Data is now dapper. An enterprising Etsy seller is making scarves that look like oversized versions of CVS' impossibly long register receipts, coupons and all. These 59-inch-long fleece scarves are available from ReceiptScarves for $19.

Of course, I'm reminded of the IKEA rug that looks like an IKEA receipt.

(OddityMall) Read the rest

Not only is Google's auto-delete good for privacy, it's also good news for competition

Earlier this month, Google announced a new collection of auto-delete settings for your personal information that allows you to balance some of the conveniences of data-collection (for example, remembering recent locations in Maps so that they can be intelligently autocompleted when you type on a tiny, crappy mobile device keyboard) with the risks of long-term retention, like a future revelation that you visited an HIV clinic, or a political meeting, or were present at the same time and place as someone the police have decided to investigate by means of a sweeping "reverse warrant." Read the rest

Australia wants to kill consent requirement in proposed data-sharing legislation, calls it "nuance"

Australia has a pending, comprehensive "data sharing" law that regulates the dispersal of data collected by the Australian state; in a new government white-paper, the Australian state has proposed that the rules could gain "nuance" if the government were allowed to share data without obtaining consent from the people whose privacy is implicated in that sharing. Read the rest

Crowdfunding a picture book about resisting surveillance

Murray Hunter writes, "I'm a digital rights activist in South Africa - I've written and illustrated a silly, subversive kid's book about the Big Data industry, and a squiggly, wiggly robot sent out to track and profile all the babies. It's not an 'eat your vegetables' kind of book: all I wanted to do was tell a story that could delight young kids (ages 3-5) while also inviting them to imagine for the first time a secret and hidden world of data collection. I don't think it's been done yet, and - well, why not? I've just launched a crowdfunding campaign to publish it in hardcover and thought it might pique the interest of a few happy mutants. Read the rest

Collecting user data is a competitive disadvantage

Warren Buffett is famous for identifying the need for businesses to have "moats" and "walls" around their profit-centers to keep competitors out, and data-centric companies often cite their massive collections of user-data as "moats" that benefit from "network effects" to make their businesses good investments. Read the rest

To do in NYC next Sat, May 11: "The Bigot in the Machine," a panel on algorithmic bias from PEN and McSweeney's

Next weekend, PEN America is throwing its World Voices Festival, including a McSweeney's-sponsored panel on algorithmic bias called The Bigot in the Machine, featuring poet/media activist Malkia Cyril, and Equality Labs founder Thenmozhi Soundararajan, moderated by investigative journalist Adrianne Jeffries: it's on May 11 at 2:30 at Cooper Union's Frederick P. Rose Auditorium. Tickets are $20. Read the rest

Exclusive: "More Data": Negativland's video short about data privacy and surveillance

[I've been in love with Negativland since their legendary copyright battle with U2 and they've been a part of Boing Boing since 2001; it's a pleasure beyond words to be able to debut More Data, their characteristically trenchant video about data privacy and surveillance; see below for notes from Negativland. -Cory] Read the rest

Open dataset of 1.78b links from the public web, 2016-2019

GDELT, a digital news monitoring service backed by Google Jigsaw, has released a massive, open set of linking data, containing 1.78 billion links in CSV, with four fields for each link: "FromSite,ToSite,NumDays,NumLinks." Read the rest
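
A four-field CSV like this can be read with Python's standard `csv` module. The sketch below assumes the file starts with the header row quoted above and that `NumLinks` is an integer count; the sample rows and the helper name `inbound_link_totals` are made up for illustration and are not real GDELT data.

```python
import csv
import io
from collections import Counter

# Made-up sample rows in the four-field layout (not real GDELT data).
SAMPLE = """FromSite,ToSite,NumDays,NumLinks
example-news.com,example-wiki.org,12,40
example-news.com,example-blog.net,3,5
example-blog.net,example-wiki.org,7,9
"""

def inbound_link_totals(csv_text):
    """Sum NumLinks for each ToSite (hypothetical helper)."""
    totals = Counter()
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        # Assumption: NumLinks holds an integer count of links.
        totals[row["ToSite"]] += int(row["NumLinks"])
    return totals

totals = inbound_link_totals(SAMPLE)
for site, count in totals.most_common():
    print(site, count)
```

For the full 1.78 billion rows, the same loop could read from an open file handle instead of an in-memory string, so the data is streamed row by row instead of being loaded all at once.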

Big Data's "theory-free" analysis is a statistical malpractice

One of the premises of Big Data is that it can be "theory free": rather than starting with a hypothesis ("men at buffets eat more when women are present," "more people will click this button if I move it here," etc.) and then gathering data to validate your guess, you just gather a ton of data and look for patterns in it. Read the rest

20,000 Dear Abby letters analyzed in study of "American" anxieties

"30 Years of American Anxieties" is a report on what 20,000 letters to Dear Abby reveal about the alarming things in life— and a great data presentation. Read the rest

In U.S. prisons, women are disciplined at a higher rate than men

Even women in prison can’t escape the sexist stereotype of the “difficult woman.” Read the rest

Nonprofit will coordinate 30 global investigative journalists to report leaked stories of big data abuse

The Signals Network is a nonprofit that supports independent investigative journalism; they're financially supporting a consortium of five international media groups: Die Zeit (Germany), Mediapart (France), The Daily Telegraph (UK), The Intercept (US) and WikiTribune (Global), as they investigate misuse of "big data." Read the rest

The Gates Foundation spent $775m on a Big Data education project that was worse than useless

Kudos to the Gates Foundation, seriously: after spending $775m on the Intensive Partnerships for Effective Teaching, a Big Data initiative to improve education for poor and disadvantaged students, they hired outside auditors to evaluate the program's effectiveness, and published that report, even though it shows that the approach did no good on balance and arguably caused real harms to teachers and students. Read the rest

The most interesting thing about the "Thanksgiving Effect" study is what it tells us about the limits of data anonymization

Late last year, a pair of economists released an interesting paper that used mobile location data to estimate the likelihood that political polarization had shortened family Thanksgiving dinners in 2016. Read the rest

Syllabus for a course on Data Science Ethics

The University of Utah's Suresh Venkatasubramanian and Katie Shelef are teaching a course in "Ethics in Data Science" and they've published a comprehensive syllabus for it; it's a fantastic set of readings for anyone interested in understanding and developing ethical frameworks for computer science generally, and data science in particular. Read the rest
