Michael Rigley created this beautiful animation, titled "Network," for his BFA design thesis project at the California College of the Arts. It's about personal data captured by cell phone providers and is quite relevant this week.
More news from the embattled mayor of Toronto, Rob "Laughable Bumblefuck" Ford: after two of his senior staffers walked out on him following questioning by Toronto homicide detectives, it appears that someone illegally ordered the destruction of their archived city emails and call-records -- as well as the archived electronic communications of Ford's former chief of staff, whom Ford fired under mysterious circumstances.
The Star heard concerns at city hall Wednesday afternoon over the potential destruction or hiding of the records of three staffers who resigned or were fired during the ongoing crack cocaine scandal. Sources told the Star the records were in danger after city employees were directed to delete them.
The Star sent a request late Wednesday to the city asking for email and phone records of the three staffers in question for the time period during which the video at the heart of the scandal has been discussed.
Emails sent by city employees, including political staffers, are automatically preserved by the city, though emails related to “personal” business are exempt from freedom of information requests.
Two people familiar with the system said the emails of specific political staffers cannot be permanently erased from the system.
Rob Ford video scandal: Concerns raised over safety of email records
Carlo Zapponi created Bolides, a fantastic animated visualization of meteorites that have been seen hitting the Earth. The data source is the Nomenclature Committee of the Meteoritical Society's Meteorite Bulletin. "The word bolide comes from Greek βολίς bolis, which means missile. Astronomers tend to use bolide to identify an exceptionally bright fireball, particularly one that explodes." Bolides
It's not the work of aliens. Instead, you can chalk these crop circles up to humans + money + time. And, with the help of satellite imaging, you can watch as humans use money to change the desert over the course of almost 30 years.
Landsat is a United States satellite program that's been in operation since 1972. Eight different satellites (three of them still up there and functioning) have gathered images from all over the world for decades. This data is used to help scientists studying agriculture, geology, and forestry. It's also been used for surveillance and disaster relief.
Now, at Google, you can look at images taken from eight different sites between 1984 and 2012 and watch as people change the face of the planet. In one set of images, you can watch agriculture emerge from the deserts of Saudi Arabia — little green polka-dots of irrigation popping up against a vast swath of tan. In another set, you'll see the deforestation of the Amazon. A third, the growth of Las Vegas. It's a fascinating view of how we shape the world around us, in massive ways, over a relatively short period of time.
Alan sez, "Bloomberg got tired of waiting for the SEC to implement its own rule requiring disclosure of data on how many times the median salary the CEO makes for publicly traded companies so they did a little sleuthing of public data and a little averaging math and calculated the ratio for the top 250 of the S&P 500 companies.
The data are searchable and sortable and there's space for companies to comment, which quite a few have done.
To my surprise Oracle is not #1, though it is the only tech firm in the top 10."
Top CEO Pay Ratios
Robert McMillan explains what happens to the data generated and stored with Siri queries: "Once the voice recording is six months old, Apple “disassociates” your user number from the clip, deleting the number from the voice file. But it keeps these disassociated files for up to 18 more months for testing and product improvement purposes." [Wired] — Rob
Philip N Howard wonders if there are any countries that have, on balance, suffered as a result of the coming of the Internet -- say, because improved networks created so many opportunities for dictators to spy on dissidents that it swamped any free speech/free association benefits that the Internet delivered. So he scatter-plotted PolityIV’s democratization scores from 2002/2011, and cross-referenced them with World Bank/ITU data on internet users. The conclusion: by this method, no country experienced a decline in its overall level of democracy as it attained widespread Internet penetration, and many countries experienced a rise in democracy levels that correlated to a rise in Internet penetration.
Are there any countries with high internet diffusion rates, where the regime got more authoritarian? The countries that would satisfy this condition should appear in the top left of the graph. Alas, the only candidates that might satisfy these two conditions are Iran, Fiji, and Venezuela. Over the last decade, the regimes governing these countries have become dramatically more authoritarian. Unfortunately for this claim, their technology diffusion rates are not particularly high.
This was a quick sketch, and much more could be done with this data. Some researchers don’t like the PolityIV scores, and there are plenty of reasons to dislike the internet user numbers. Missing data could be imputed, and there may be more meaningful ways to compare over time. Some countries may have moved in one direction and then changed course, all within the last decade. Some only moved one or two points, and really just became slightly more or less democratic. But I’ve done that work too, without finding the cases Morozov wishes he had.
There are concerning stories of censorship and surveillance coming from many countries. Have the stories added up to dramatic authoritarian tendencies, or do they cancel out the benefits of having more and more civic engagement over digital media? Fancier graphic design might help bring home the punchline. There are still no good examples of countries with rapidly growing internet populations and increasingly authoritarian governments.
Are There Countries Whose Situations Worsened with the Arrival of the Internet?
Jill Filipovic wrote an opinion column for The Guardian yesterday, arguing against the practice of women taking their husbands' names when they get married. It ended up linked on Jezebel and found its way to my Facebook feed, where one particular statistic caught my eye. Filipovic claimed that 50% of Americans think a woman should be legally required to take her husband's name.
First, some quick clarification of my biases here. Although I write under a hyphenate, I never have legally changed my name. I've never had a desire to do so. In my private life, I'm just Maggie Koerth and always will be. That said, I personally take issue with the implication at the center of Filipovic's article — that women shouldn't change their names and that to do so makes you a bad feminist. For me, this is one of those personal decisions where I'm like, whatever. Make your own choice. Just because I don't get it doesn't mean you're wrong.
But just like I take objection to being all judgey about personal choices, I also take objection to legally mandating personal choices, and I was kind of blown away by the idea that 50% of my fellow Americans think my last name should be illegal.
So I looked into that statistic. And then I got really annoyed.
Read the rest
Kenneth Cukier was on NPR this morning talking about the new book he wrote with Viktor Mayer-Schonberger, "Big Data: A Revolution That Will Transform How We Live, Work and Think." It sounds fascinating and relevant to research I'm doing at Institute for the Future on newfound applications of systems thinking in what we're calling the "coming age of networked matter." Here are some choice bits from the interview:
On how Target identifies pregnant customers
"The example comes from Charles Duhigg, who's a reporter at The New York Times, and he's the one who uncovered the story. What Target was doing was they were trying to find out what customers were likely to be pregnant or not. So what they were able to do was to look at all the different things that couples were buying prior to the pregnancy — such as vitamins at one point, unscented lotion at another point, lots of hand towels at another point — and with that, make a prediction, score the likelihood that this person was pregnant, so that they could then send coupons to the people involved... there might be a coupon for a stroller or for diapers ..."
On how Google tracks the flu
"Google stores all of its searches. What they were able to do was go through the database of previous searches to identify what was the likely predictor that there was going to be a flu outbreak in certain regions of America. Now, keep in mind, we pay for the [Centers for Disease Control and Prevention] to look at the United States and find out where flu outbreaks are taking place for the seasonal flu. But the difference is that it takes the CDC about two weeks to report the data. Google does it in real time simply on search queries."
The 'Big Data' Revolution: How Number Crunchers Can Predict Our Lives (NPR)
Researchers have successfully stored information in synthetic DNA and then sequenced the DNA to read the data. Nick Goldman and his colleagues from the European Bioinformatics Institute (EBI) encoded all of Shakespeare's sonnets, an audio clip of Martin Luther King's "I have a dream" speech, Watson and Crick's paper on DNA's structure, a photo of the EBI, and an explanation of their data conversion technique. Last year, Harvard molecular geneticist George Church encoded a book he had written in DNA, but EBI's breakthroughs are in the way the data is encoded and its error-correction. From the abstract of their scientific paper published in Nature:
We encoded computer files totalling 739 kilobytes of hard-disk storage and with an estimated Shannon information of 5.2 × 10^6 bits into a DNA code, synthesized this DNA, sequenced it and reconstructed the original files with 100% accuracy. Theoretical analysis indicates that our DNA-based storage scheme could be scaled far beyond current global information volumes and offers a realistic technology for large-scale, long-term and infrequently accessed digital archiving. In fact, current trends in technological advances are reducing DNA synthesis costs at a pace that should make our scheme cost-effective for sub-50-year archiving within a decade.
"Synthetic double-helix faithfully stores Shakespeare's sonnets" (Thanks, Mike Pescovitz!)
Metadata is one of those things that is so important, it becomes easy to forget about. We often collect metadata without thinking about it. When we don't collect it — or if we collect it in a sloppy manner — we notice very quickly that something has gone wrong. But when someone says the word "metadata", a large number of us go, "the what now?" And start trying to remember what that word means before we make ourselves sound dumb in conversation.
Metadata is really just information about information — it helps us organize, find, and standardize the things we know and want to know. At the Information Culture blog, Bonnie Swoger offers some Christmas-themed examples that will help you remember what metadata is, help you understand why it's such a big deal, and improve your ability to do metadata right.
If you stumbled across this list on the web you might be able to guess what it was, but you couldn’t be sure. It would also be difficult to find this list again if you were looking for it. The list creator might find this pretty useful, but if he or she shared it with others, we would want some added information to help the new user understand what he or she was looking at: this is metadata.
Metadata for this data file:
Who created the data: Santa Claus, North Pole. An email address would be nice. This way we have some contact information in case we need clarification.
Title: “My List” isn’t a title that is conducive to finding the file again. While it might be tempting to just call this “Santa’s list” that won’t help other folks who see this file. The title should be descriptive of what the data file contains, and “Santa’s List” could be many things: Santa’s list of Reindeer? Santa’s list of toys that need to be made? A more descriptive title might be “Santa’s list of naughty and nice children.”
Date created: We don’t want to confuse this year’s list (2012) with last year’s list (2011). This could lead to all sorts of unfortunate events where nice kids get coal, naughty kids get presents, or infants (who weren’t around in 2011) get nothing at all.
Who created the data file: Perhaps Santa created the data, but then used an elf to input the data into a computer file. Many computer programs automatically record this information, although you may not realize this.
How the list was created: Behavioral scans? Parental surveys? Elf on the Shelf reports? All of the above? In order to reuse this data in future research projects, we need to know how it was collected, including collection instruments and methodologies.
Definitions of terms used: What is “naughty”? What is “nice”? How did Santa place a child into one category or another?
File type: What kind of file is it? The data here are pretty simple, but Santa has lots of different file formats to choose from: Excel, .csv, XML, etc. Knowing the file type helps end users determine if they can use the data.
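One way to keep the items above attached to a data file is a small machine-readable record saved next to it. Here is a minimal sketch in Python, assuming a JSON "sidecar" record. The field names, the email address, the exact day in 2012, and the placeholder definitions are all made up for illustration and do not come from Swoger's post.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record mirroring the items in the list above."""
    title: str
    data_creator: str
    contact_email: str
    date_created: str          # ISO 8601 date string
    file_creator: str
    collection_methods: list   # how the data were gathered
    definitions: dict          # what each term in the data means
    file_type: str


santa_list = DatasetMetadata(
    title="Santa's list of naughty and nice children",
    data_creator="Santa Claus, North Pole",
    contact_email="santa@example.org",            # hypothetical address
    date_created=date(2012, 12, 1).isoformat(),   # year from the example; day is assumed
    file_creator="An elf (data entry)",
    collection_methods=[
        "behavioral scans",
        "parental surveys",
        "Elf on the Shelf reports",
    ],
    definitions={
        "naughty": "placeholder: criteria to be documented by the list creator",
        "nice": "placeholder: criteria to be documented by the list creator",
    },
    file_type="csv",
)

# Serialize the record so it can be saved alongside the data file.
sidecar = json.dumps(asdict(santa_list), indent=2)
print(sidecar)
```

Writing `sidecar` to a file named after the data file (say, the same name with `.metadata.json` appended) is one simple convention. Whatever the format, the point is that the answers to "who, what, when, how" travel with the data.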
Read the full story and get more great examples
Short version: There is LOTS the FDA doesn't want to tell you about livestock antibiotic use. And that matters. As I reminded you yesterday, the antibiotics we use to keep ourselves alive and healthy are rapidly losing their effectiveness against a whole host of diseases. Antibiotic resistance is driven by overuse of antibiotics — both in humans and in animals. And there are lots of antibiotics being used on animals. The trouble is, public health researchers know very little about that use. Because the FDA refuses to release more than the bare minimum of data.
For added fun, last year, they stopped even trying to regulate antibiotic use on livestock — opting instead for voluntary self-control systems. — Maggie
Sitegeist is a free Android app from the Sunlight Foundation that "helps you to learn more about your surroundings in seconds. Sitegeist takes public data about the people, housing, history, environment and things to do for any U.S. location and presents it in easy-to-view infographics. Just scroll and swipe your way through the categories to get a feel for the area. Everything from age distributions to political contributions and median home values to record temperatures. It makes complex localized data easy to understand so you can get back to enjoying the neighborhood. The app incorporates publicly available data from a number of sources including the U.S. Census Bureau, InfluenceExplorer.com, the Dark Sky weather API and even Yelp and Foursquare. Sunlight will continue to add and improve on the app as more rich data becomes public."
If you're in London this weekend, you should know that the Wellcome Trust is sponsoring a two-day bioscience hackathon with prizes awarded for the best ideas in four categories: Open Me — collecting data on yourself and making it useful to yourself; Open Research — making biomedical data produced by professional scientists more accessible and useful to everybody; Open Data — creating apps and hardware that allow doctors to better follow what's really happening with their patients; and the idea that is most useful to the public at large. — Maggie