Nathan Yau created this fun and fascinating name guessing algorithm. You select "male" or "female," the decade you were born, and then type in the first letter of your name. I tried more than a dozen times for people I know and it nailed it on the first letter about 80% of the time. On those that it screwed up after the first letter, it got it right after I entered a second letter. From the project description:
This is based on data from the Social Security Administration, up to 2018. It’s relatively comprehensive, but there are a few limitations. First, it’s data for the United States, so the numbers don’t really apply elsewhere. Second, the SSA doesn’t include names with fewer than five people in a year, so the chart doesn’t cover more unique names. Third, there were no Social Security Numbers before 1935, so the name counts are fuzzier for years before that.
But like I said, the data still has a wide range. I aggregated the annual data by decade and calculated percentages by dividing name counts by total number of Social Security Numbers provided.
Before you enter anything, the chart shows the most popular names for the given sex and decade. Then as you enter a name, the chart shows conditional probabilities. The more information you give it, the stronger the guess.
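To make the "conditional probabilities" idea concrete, here is a minimal sketch of guessing a name from its first letters. It is not Yau's code: the counts are invented toy numbers rather than SSA data, `guess_names` is a hypothetical helper, and it normalizes over the names listed rather than over the total number of Social Security Numbers issued, as the description says the real chart does.

```python
from collections import Counter

# Hypothetical toy counts for one sex and one decade (NOT real SSA figures).
name_counts = Counter(
    {"Michael": 500, "Matthew": 300, "Mark": 150, "Mary": 50, "David": 400}
)


def guess_names(counts, prefix="", top=3):
    """Rank names by P(name | name starts with prefix).

    With an empty prefix this is just the overall popularity ranking.
    """
    prefix = prefix.lower()
    # Keep only names consistent with the letters typed so far.
    matching = {n: c for n, c in counts.items() if n.lower().startswith(prefix)}
    total = sum(matching.values())
    if total == 0:
        return []  # no known name starts with this prefix
    ranked = sorted(matching.items(), key=lambda kv: kv[1], reverse=True)
    # Conditional probability = count / total count of matching names.
    return [(name, count / total) for name, count in ranked[:top]]


if __name__ == "__main__":
    for typed in ("", "M", "Ma"):
        print(repr(typed), guess_names(name_counts, typed))
```

Each extra letter shrinks the pool of matching names, which is why the top guess becomes more confident as you type.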
"Guessing Names Based on What They Start With" (FlowingData)
The U.S. Labor Department Bureau of Labor Statistics (BLS) today announced changes to BLS economic data “lockup” procedures that involve removing a number of legacy computers from its Washington newsroom, effective March 1. There has been controversy around whether the change initiated today by the federal government under Donald Trump may have been aimed at Michael Bloomberg, which BLS denies. It's complicated.
What data does your car gather about you? Where does it go? Who has access to it?
He calls his cat Number One.
I also dig Riker's 'Stringfellow Hawke' dock on the lake.
Warren Buffett is famous for identifying the need for businesses to have "moats" and "walls" around their profit-centers to keep competitors out, and data-centric companies often cite their massive collections of user-data as "moats" that benefit from "network effects" to make their businesses good investments.
From FlowingData, this is the redacted Mueller report in a "thumbnailed view for a sense of the redactions."
There's still plenty to read between the (black) lines.
(Thanks, Ted Weinstein!)
Today's FTC ruling impacts how the TikTok app works for users under the age of 13.
A couple of times a year, Apple plops out a report detailing all of the user data requests made by government and law enforcement agencies from around the world. In the latest bi-annual report, it looks like information requests have increased since the last reporting period.
According to the report, which covers the first half of this year, Apple received 32,342 demands for user data from governments -- up 9 percent from the previous period -- spanning access to 163,823 devices. Germany made the most requests (42 percent), the majority of which were due to "stolen devices investigations," issuing 13,704 requests for data on 26,160 devices.
The US was in second place with 4,570 requests for 14,911 devices. More than half of these requests (2,397) were for users' basic account information or content, revealed Apple. The US also asked for 918 financial identifiers -- which cover suspected fraudulent credit, debit, or gift card transactions -- attributing them to iTunes gift card fraud.
It used to be that the report was only offered as a dense, boring PDF. But Apple, in an attempt to boost their corporate transparency, has made their report numbers available to peruse via an interactive website that can be searched by country and the month that the user data was requested.
According to Engadget, Apple's report doesn't include the number of FISA requests made, as there is a legally binding six-month delay required on reportage of such requests.
If you're an Apple hardware or services user, it's worth taking a quick jaunt over to the company's transparency website to see what kind of user information your government has been trying to get their hands on.
"30 Years of American Anxieties" is a report on what 20,000 letters to Dear Abby reveal about the alarming things in life, and a great data presentation.
Deon is a project to create automated "ethics checklists" for data science projects; by default, running the code creates a comprehensive checklist covering data collection and storage, modeling and deployment: the checklist items aren't specific actions, they're "meant to provoke discussion among good-faith actors who take their ethical responsibilities seriously. Because of this, most of the items are framed as prompts to discuss or consider. Teams will want to document these discussions and decisions for posterity."
Facebook will not provide fraud protection for victims of its latest data breach, details of which were announced in a Friday news dump. It set up a page where you can check if your Facebook account was breached.
One analyst told the BBC the decision was "unconscionable" ... For the most severely impacted users - a group of around 14 million, Facebook said - the stolen data included "username, gender, locale/language, relationship status, religion, hometown, self-reported current city, birthdate, device types used to access Facebook, education, work, the last 10 places they checked into or were tagged in, website, people or pages they follow, and the 15 most recent searches".
Typically, companies affected by large data breaches - such as Target, in 2013 - provide access to credit protection agencies and other methods to lower the risk of identity theft. Other hacked companies, such as on the Playstation Network, and credit monitoring agency Equifax, offered similar solutions.
A Facebook spokeswoman told the BBC it would not be taking this step "at this time". Users would instead be directed to the website's help section.
They're done caring. If you're still using Facebook, you're done caring too.
Unless I'm in a cafe, hotel or staying at someone's home, I connect to the internet over a tethered connection to my smartphone. I've got an unlimited data plan, but only the first five gigabytes of information that I send or receive is at LTE speeds. After that, things turn slow as molasses flowing uphill in January. To try and keep my data usage under control and, thus, my speeds higher for as long as possible, I use an application called TripMode 2. It's available for macOS and Windows 10 and, priced at eight bucks, it's ridiculously inexpensive to purchase a copy.
Once installed, TripMode is stupid easy to use. Activate the app, locate it in your Menu Bar (macOS) and click it to get at its drop-down menu. There you'll see every piece of software on your computer that's begging for access to the interwebz. If you're not using an app you see on the list, de-select the check mark next to it. Boom, it's cut off from using your tethered device's data. You'll note that at the bottom of the list, you can see how much data you've used since you started your session, or during the course of a day, month or year. If you're on a plan with limited data, having that information is pure gold.
Best of all, when you're not using it, TripMode 2 can easily be shut off. It's right up there with Scrivener, ProtonMail Bridge and Adobe Lightroom as one of the most important bits of software that I use on a regular basis.
Inkoativ charted income per day against population and animated the "mountains" that result for each continent. Click through to watch the developing world, well, develop. [via Data Is Beautiful]
Vanessa Hill at BrainCraft got obsessed with tallying up how many times Arnold Schwarzenegger has appeared in scientific papers, but she wasn't prepared for the actual number of papers: over 15,000.
Redditor datacanbeuseful charted the wounding of Craigslist and the death of Backpage. After a political panic over sex trafficking, the latter's domain was seized by the government. Craigslist, to avoid the prospect of a similar fate, shut down all its "casual encounters" and similar categories overnight. Those categories turn out to have been a significant but not critical element of the site's traffic: about 25 percent, but only as inferred through Google Trends.
The figure is based on Google Trends data of search for terms "Craigslist" and "Backpage" before and after Fight Online Sex Trafficking Act (FOSTA). It largely reflects the actual traffic at both sites. Chart created using Excel.
Because of FOSTA and the shutdown of Craigslist's Personals section, Craigslist lost a whopping 1/4 to 1/3 of the web traffic. Backpage, while enjoying a short lived traffic uptick, was soon shut down by law enforcement.
Where can this much traffic go? Does it just evaporate? Does it flow elsewhere?
Journalists usually suppose "the dark web" but reality surely involves more pimps and streetcorners. [via]
Cambridge Analytica, the firm that consulted on Trump's 2016 campaign and mined the data of 87 million Facebook users without their permission, has shut its doors. Same goes for the company's UK counterpart SCL. From Wired:
The decision to close the company's doors internationally was announced to employees during a town hall meeting in the firm's New York City offices Wednesday. One source says that NYC employees were told to pack up and leave immediately....
Just yesterday, Cambridge Analytica's official Twitter account tweeted out a link to a website refuting the waves of bad press the company has received with the caption, "Get the Facts Behind the Facebook Story."
(image by Mark Frauenfelder)