
Detect your pulse with your webcam


Thearn released a free/open program for detecting and monitoring your pulse using your webcam. The code is on GitHub for you to download, play with and modify. If this stuff takes your fancy, be sure to read Eulerian Video Magnification for Revealing Subtle Changes in the World, an inspiring paper describing the techniques Thearn uses in his code:

This application uses openCV (http://opencv.org/) to find the location of the user's face, then isolate the forehead region. Data is collected from this location over time to estimate the user's heartbeat frequency. This is done by measuring average optical intensity in the forehead location, in the subimage's green channel alone. Physiological data can be estimated this way thanks to the optical absorption characteristics of oxygenated hemoglobin.

With good lighting and minimal noise due to motion, a stable heartbeat should be isolated in about 15 seconds. Other physiological waveforms, such as Mayer waves (http://en.wikipedia.org/wiki/Mayer_waves), should also be visible in the raw data stream.
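The green-channel idea lends itself to a short sketch. To be clear, this is not Thearn's code: it skips the OpenCV face-tracking entirely and feeds a synthetic intensity trace through an FFT, picking the strongest frequency in a plausible heart-rate band. The function name, the 0.75-4 Hz band, and the synthetic signal are all assumptions for illustration.

```python
import numpy as np

def estimate_bpm(green_means, fps):
    """Estimate heart rate (BPM) from a time series of mean green-channel
    intensities sampled from the forehead region at `fps` frames/sec."""
    samples = np.array(green_means, dtype=float)
    samples -= samples.mean()                       # remove the DC offset
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / fps)
    # Only consider physiologically plausible heart rates (45-240 BPM).
    band = (freqs >= 0.75) & (freqs <= 4.0)
    peak = freqs[band][np.argmax(spectrum[band])]
    return peak * 60.0

# Synthetic stand-in for webcam data: a 1.2 Hz pulse (72 BPM) plus noise,
# over roughly the 15 seconds the post says a stable reading takes.
np.random.seed(0)
fps = 30.0
t = np.arange(0, 15, 1.0 / fps)
signal = 0.5 * np.sin(2 * np.pi * 1.2 * t) + 0.05 * np.random.randn(t.size)
print(round(estimate_bpm(signal, fps)))
```

With a clean trace the dominant spectral peak lands on the pulse frequency; on real webcam frames, motion noise is what makes the 15-second settling time necessary.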

Once the user's pulse signal has been isolated, temporal phase variation associated with the detected heartbeat frequency is also computed. This allows for the heartbeat frequency to be exaggerated in the post-process frame rendering, causing the highlighted forehead location to pulse in sync with the user's own heartbeat (in real time).

Support for pulse-detection on multiple simultaneous people in a camera's image stream is definitely possible, but at the moment only the information from one face is extracted for cardiac analysis.

thearn / webcam-pulse-detector (via O'Reilly Radar)

Solving classic NES games computationally

Dr. Tom Murphy VII presented a research paper called "The First Level of Super Mario Bros. is Easy with Lexicographic Orderings and Time Travel . . . after that it gets a little tricky" (PDF) (source code) at SIGBOVIK 2013, in which he sets out a computational method for solving classic NES games. He devised two libraries for this: learnfun (learning function) and playfun (playing function). In this accompanying video, he chronicles the steps and missteps he took getting to a pretty clever destination.
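The title's "lexicographic orderings" can be made concrete with a toy sketch. Murphy's learnfun derives, from a human playthrough, orderings over NES RAM locations that tend to increase as the game progresses; playfun then searches for inputs that make the memory "go up" under those orderings. The byte locations and orderings below are made up for illustration; the real system learns and weights many of them.

```python
def lex_less(mem_a, mem_b, ordering):
    """True if snapshot mem_a precedes mem_b under a lexicographic
    ordering over the given RAM byte locations."""
    for loc in ordering:
        if mem_a[loc] != mem_b[loc]:
            return mem_a[loc] < mem_b[loc]
    return False            # equal on every location: no improvement

def score(before, after, orderings):
    """Count how many learned orderings say the game state 'improved'."""
    return sum(lex_less(before, after, o) for o in orderings)

# Hypothetical 4-byte RAM snapshots; locations 0-1 might hold a two-byte
# score counter, location 2 a level number, location 3 a countdown timer.
before = [0x00, 0x45, 0x01, 0x99]
after  = [0x00, 0x50, 0x02, 0x80]
orderings = [(0, 1), (2,), (3,)]   # e.g. learned from a human playthrough
print(score(before, after, orderings))
```

Here two of the three orderings improve (score and level go up, the timer goes down), so a search over candidate inputs would prefer the `after` state.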

learnfun & playfun: A general technique for automating NES games (via O'Reilly Radar)

Algorithmically constructed news

In Wired, Steven Levy has a long profile of the fascinating field of algorithmic news-story generation. Levy focuses on Narrative Science, and its competitor Automated Insights, and discusses how the companies can turn "data rich" streams into credible news-stories whose style can be presented as anything from sarcastic blogger to dry market analyst. Narrative Science's cofounder, Kristian Hammond, claims that 90 percent of all news will soon be algorithmically generated, but that this won't be due to computers stealing journalists' jobs -- rather, it will be because automation will enable the creation of whole classes of news stories that don't exist today, such as detailed, breezy accounts of every little league game in the country.

Narrative Science’s writing engine requires several steps. First, it must amass high-quality data. That’s why finance and sports are such natural subjects: Both involve the fluctuations of numbers—earnings per share, stock swings, ERAs, RBI. And stats geeks are always creating new data that can enrich a story. Baseball fans, for instance, have created models that calculate the odds of a team’s victory in every situation as the game progresses. So if something happens during one at-bat that suddenly changes the odds of victory from say, 40 percent to 60 percent, the algorithm can be programmed to highlight that pivotal play as the most dramatic moment of the game thus far. Then the algorithms must fit that data into some broader understanding of the subject matter. (For instance, they must know that the team with the highest number of “runs” is declared the winner of a baseball game.) So Narrative Science’s engineers program a set of rules that govern each subject, be it corporate earnings or a sporting event. But how to turn that analysis into prose? The company has hired a team of “meta-writers,” trained journalists who have built a set of templates. They work with the engineers to coach the computers to identify various “angles” from the data. Who won the game? Was it a come-from-behind victory or a blowout? Did one player have a fantastic day at the plate? The algorithm considers context and information from other databases as well: Did a losing streak end?

Then comes the structure. Most news stories, particularly about subjects like sports or finance, hew to a pretty predictable formula, and so it’s a relatively simple matter for the meta-writers to create a framework for the articles. To construct sentences, the algorithms use vocabulary compiled by the meta-writers. (For baseball, the meta-writers seem to have relied heavily on famed early-20th-century sports columnist Ring Lardner. People are always whacking home runs, swiping bags, tallying runs, and stepping up to the dish.) The company calls its finished product “the narrative.”
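The data-to-angle-to-template pipeline Levy describes can be sketched in a few lines. This is a guess at the shape of such a system, not Narrative Science's actual engine; the angle tests, templates, and field names are all invented.

```python
# Each "angle" pairs a test on the game data with a sentence template,
# loosely mirroring the data -> angle -> template pipeline described above.
ANGLES = [
    (lambda g: g["winner_runs"] - g["loser_runs"] >= 5,
     "{winner} crushed {loser} {winner_runs}-{loser_runs} in a blowout."),
    (lambda g: g["came_from_behind"],
     "{winner} rallied late to edge {loser} {winner_runs}-{loser_runs}."),
    (lambda g: True,  # fallback angle when nothing more dramatic applies
     "{winner} beat {loser} {winner_runs}-{loser_runs}."),
]

def write_recap(game):
    """Pick the first angle whose test matches and fill its template."""
    for matches, template in ANGLES:
        if matches(game):
            return template.format(**game)

game = {"winner": "Cubs", "loser": "Mets", "winner_runs": 4,
        "loser_runs": 3, "came_from_behind": True}
print(write_recap(game))
```

The meta-writers' job, in this framing, is authoring the angle tests and the Lardner-flavored vocabulary that fills the templates.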

Both companies claim that they'll be able to make sense of less-quantifiable subjects in the future, and will be able to generate stories about them, too.

Can an Algorithm Write a Better News Story Than a Human Reporter?

Corrected notes on the feeding of yes to yes

This morning, I posted M Tang's funny experiment in feeding the Unix "yes" command to itself. Now, Seth David Schoen writes in to correct and expand upon the principles therein:

M. Tang's business about the Unix command

yes `yes no`

is based on a bit of a misconception. The problem is _not_ about combining one yes command with another yes command. Whenever you use the backtick syntax `, like in a hypothetical command

foo `bar`

the shell will first run the command bar (to completion) before it even tries to start foo. The shell will also save the complete output of bar in memory, and then present it as a set of command-line arguments to foo.

In this case, the shell is trying to run the command "yes no" to completion, saving its output in memory, before even starting the other yes command. Of course, "yes no" never finishes, but it does use up an arbitrarily large amount of memory.

To see that the problem is with the use of `yes` rather than with the combination of two yes commands, just try

echo `yes no`

or even

true `yes no`

Both of these forms have exactly the same memory-consumption problem as the original command, and for exactly the same reason! So, Tang is wrong to think that he is somehow creating a problem by combining multiple yesses. The problem is in asking the shell to remember an infinite amount of output.

As other people have mentioned in comments, the ` syntax is also not piping. Piping is done with |, while ` refers to substitution. The distinction is whether the output of program A appears as input to program B (piping) or as command-line arguments to program B (substitution). For example,

echo foo bar | wc -w

outputs the number 2 (that's the total number of words in the text "foo bar"), while

wc -w `echo foo bar`

counts the number of words in the files foo and bar.

Stupid Unix trick: why you shouldn't pipe yes into yes


Update: M Tang's explanation for this is wrong, but Seth Schoen sent us a great correction.


There's a GNU-coreutils program called yes whose function is to "output a string repeatedly until killed." M Tang tried piping the output of one yes command into another. It ended badly:

Taking a look at the source code for yes, it looks like the single argument is being stored in a char array, then, in a while(true) and for loop, each character is printed to the stdout, followed by a new line (\n) character.

So when we use the output of one yes command as the argument for another, the outer yes command fills up the computer’s memory with the output of the inner yes command. Then I have to restart my computer and feel stupid.
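The loop Tang describes is simple enough to mimic. Here's a minimal Python rendition of it, not the actual coreutils C source: print the argument followed by a newline, forever. A `times` cap is added here purely so the sketch terminates; the real program loops until killed.

```python
import io

def yes(arg="y", times=None, out=None):
    """Bounded rendition of the yes(1) loop described above: write the
    argument plus a newline, `times` times (or forever when times=None,
    as the real program does)."""
    out = out or io.StringIO()
    n = 0
    while times is None or n < times:
        out.write(arg + "\n")
        n += 1
    return out.getvalue()

print(yes("no", times=3), end="")
```

Run unbounded and captured by a backtick substitution, that output stream is exactly what fills the shell's memory.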

yes 'yes no' (via Hacker News)

Aaron Swartz's unfinished monograph on the "programmable Web"

Michael B. Morgan, CEO of Morgan & Claypool Publishers, writes:

In 2009, we invited Aaron Swartz to contribute a short work to our series on Web Engineering (now The Semantic Web: Theory and Technology). He produced a draft of about 40 pages -- a "first version" to be extended later -- which unfortunately never happened.

After his death in January, we decided (with his family's blessing) that it would be a good idea to publish this work so people could read his ideas about programming the Web, his ambivalence about different aspects of Semantic Web technology, his thoughts on Openness, and more.

As a tribute to Aaron, we have posted his work on our site as a free PDF download. It is licensed under a Creative Commons (CC-BY-SA-NC) license. The work stands as originally written, with only a few typographical errors corrected to improve readability.

Aaron Swartz's A Programmable Web: An Unfinished Work

Aaron Swartz’s A Programmable Web: An Unfinished Work (PDF)

(Thanks, Michael!)

Big Data: A Revolution That Will Transform How We Live, Work, and Think


Big Data is a new book from Viktor Mayer-Schonberger, a respected Internet governance theorist; and Kenneth Cukier, a long-time technology journalist who has been at the Economist for many years. As the title and pedigree imply, this is a business-oriented book about "Big Data," a computational approach to business, regulation, science and entertainment that uses data-mining applied to massive, Internet-connected data-sets to learn things that previous generations weren't able to see because their data was too thin and diffuse.

Big Data is an eminently practical and sensible book, but it's also an exciting and excitable text, one that conveys enormous enthusiasm for the field and its fruits. The authors use well-chosen examples to show how everything from shipping logistics to video-game design to healthcare stand to benefit from studying the whole data-set, rather than random samples. They even pose this as a simple way of thinking of big data versus "small data." Small data relies on statistical sampling, and emphasises the reliability and accuracy of each measurement. With big data, you sample the entire pool of activities -- all the books sold, all the operations performed -- and worry less about inaccuracies and anomalies in individual measurements, because these are drowned out by the huge numbers of observations performed.

As you'd expect, Big Data is particularly fascinating when it explores the business implications of all this: the changing leverage between firms that own data versus the firms that know how to make sense of it, and why sometimes data is best processed by unaffiliated third parties who can examine data from rival firms and find out things from which all parties stand to benefit, but which none of them could have discovered on their own. They also cover some of the bigger Big Data business blunders through history -- companies whose culture blinkered them to the opportunities in their data, which were exploited by clever rivals.

The last fifth of the book is dedicated to issues of governance, regulation, and public policy. This is some of the most interesting material in the book and probably needs to be expanded into its own volume. As it is, there's a real sense that the authors are just scraping the surface. For example, many of the stories told in the book have deep privacy implications, and the authors make a point of touching on these, cabining them with phrases like "so long as the data is anonymized" or "adhering to privacy policy, of course." But in the final third, the authors examine the transcendental difficulty of real-world anonymization, and the titanic business blunders committed by firms that believed they'd stripped out the personal information from the data, only to have the data "de-anonymized" and their customers' privacy invaded in small and large ways. These two facts -- that many of the opportunities require effective anonymization and that no one knows how to do anonymization -- are a pretty big stumbling block in the world of Big Data, but the authors don't explicitly acknowledge the conundrum.

While Big Data is an excellent primer on the opportunities of the field, it's thin on the risks, overall. For example, Big Data is rightly fascinated with stories about how we can look at data sets and find predictors of consequential things: for example, when Google mined its query-history and compared it with CDC data on flu outbreaks, it found that it could predict flu outbreaks ahead of the CDC, which is amazingly useful. However, all those search-strings were entered by people who didn't expect to have them mined for subsequent action. If searching for "scratchy throat" and "runny nose" gets your neighborhood quarantined (or gets it extra healthcare dollars), you might get all your friends to search on those terms over and over -- or not at all. Google knows this -- or it should -- because when it started measuring the number of links between sites to define the latent authority of different parts of the Internet, it got great results, but immediately triggered a whole scummy ecosystem of linkfarms and other SEO tricks that create links whose purpose is to produce more of the indicators Google is searching for.

Another important subject is looking at algorithmic prediction in domains where the outcome is punishment, instead of reward. British Airways may get great results from using an algorithm to pick out passengers for upgrades, trying to find potential frequent fliers. But we should be very cautious about applying the same algorithm to building the TSA's No-Fly list. If BA's algorithm fails 20% of the time, it just means that a few lucky people get to ride at the front of the plane. If the TSA has a 20% failure rate, it means that one in five "potential terrorists" is an innocent whose fundamental travel rights have been compromised by a secretive and unaccountable algorithm.

Secrecy and accountability are the third important area for examination in a Big Data world. Cukier and Mayer-Schonberger propose a kind of inspector-general for algorithms who'll make sure they're not corrupted to punish the undeserving or line someone's pockets unjustly. But they also talk about the fact that these algorithms are likely to be illegible -- the product of a continuously evolving machine-learning system -- and that no one will be able to tell you why a certain person was denied credit, refused insurance, kept out of a university, or blackballed for a choice job. And when you get into a world where you can't distinguish between an algorithm that gets it wrong because the math is unreliable (a "fair" wrong outcome) from an algorithm that gets it wrong because its creators set out to punish the innocent or enrich the undeserving, then we can't and won't have justice. We know that computers make mistakes, but when we combine the understandable enthusiasm for Big Data's remarkable, counterintuitive recommendations with the mysterious and oracular nature of the algorithms that produce those conclusions, then we're taking on a huge risk when we put these algorithms in charge of anything that matters.

Big Data: A Revolution That Will Transform How We Live, Work, and Think

Previously: Book about big data, predictive behavior, and decision making

How an algorithm came up with Amazon's KEEP CALM AND RAPE A LOT t-shirt


You may have heard that Amazon is selling a "KEEP CALM AND RAPE A LOT" t-shirt. How did such a thing come to pass? Well, as Pete Ashton explains, this is a weird outcome of an automated algorithm that just tries random variations on "KEEP CALM AND," offering them for sale in Amazon's third-party marketplace and printing them on demand if any of them manage to find a buyer.

The t-shirts are created by an algorithm. The word “algorithm” is a little scary to some people because they don’t know what it means. It’s basically a process automated by a computer programme, sometimes simple, sometimes complex as hell. Amazon’s recommendations are powered by an algorithm. They look at what you’ve been browsing and buying, find patterns in that behaviour and show you things the algorithm thinks you might like to buy. Amazon’s algorithms are very complex and powerful, which is why they work. The algorithm that creates these t-shirts is not complex or powerful. This is how I expect it works.

1) Start a sentence with the words KEEP CALM AND.

2) Pick a word from this long list of verbs. Any word will do. Don’t worry, I’m sure they’re all fine.

3) Finish the sentence with one of the following: OFF, THEM, IF, THEM or US.

4) Lay these words out in the classic Keep Calm style.

5) Create a mockup jpeg of a t-shirt.

6) Submit the design to Amazon using our boilerplate t-shirt description.

7) Go back to 1 and start again.
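Steps 1-3 of Ashton's guess are a few lines of code; steps 4-7 (layout, mockup, submission) are the part that touches Amazon. The sketch below uses tiny made-up word lists standing in for Solid Gold Bomb's; it shows how a dictionary-sized verb list times a handful of endings multiplies into half a million listings.

```python
import itertools

# Hypothetical, tiny stand-ins for the real word lists.
VERBS = ["DRINK", "DANCE", "CODE"]          # step 2: a verb, any verb
ENDINGS = ["OFF", "THEM", "ON", "US"]       # step 3: a closing word

def keep_calm_slogans():
    # Steps 1-3 of the process described above, as a generator; the real
    # pipeline would render each phrase and submit it (steps 4-7).
    for verb, ending in itertools.product(VERBS, ENDINGS):
        yield f"KEEP CALM AND {verb} {ending}"

slogans = list(keep_calm_slogans())
print(len(slogans), slogans[0])
```

Three verbs times four endings gives twelve shirts; a full dictionary of verbs gives you Solid Gold Bomb's catalog, rape jokes and all, with nobody ever reading a single one.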

There are currently 529,493 Solid Gold Bomb clothing items on Amazon. Assuming they survive this and don’t get shitcanned by Amazon I wouldn’t be at all surprised if they top a million in a few months.

It costs nothing to create the design, nothing to submit it to Amazon and nothing for Amazon to host the product. If no-one buys it then the total cost of the experiment is effectively zero. But if the algorithm stumbles upon something special, something that is both unique and funny and actually sells, then everyone makes money.

Dictionary + algorithm + PoD t-shirt printer + lucrative meme = rape t-shirts on Amazon

Profane commit-messages from GitHub

Commit Logs From Last Night: highlights funny, profane source-code commit-messages from GitHub, as bedraggled hackers find themselves leaving notes documenting their desperate situations. Some recent ones:

WHY THE GODDAMMIT WHY WHY WHY HAROGIHAROGIAHRGOIA FUCK ME

render testing I DREW SOME LINES! reverted render panel to grew (white looks shit)

Merge pull request #15 from ruvetia/font_awesome_is_fucking_awesome include font-awesome into the projcet

Commit Logs From Last Night (via JWZ)

Students get class-wide As by boycotting test, solving Prisoner's Dilemma

Johns Hopkins computer science prof Peter Fröhlich grades his students' tests on a curve -- the top-scoring student gets an A, and the rest of the students are graded relative to that brainiac. But last term, his students came up with an ingenious, cooperative solution to this system: they all boycotted the test, meaning that they all scored zero, and that zero was the top score, and so they all got As. The prof was surprisingly cool about it:

Fröhlich took a surprisingly philosophical view of his students' machinations, crediting their collaborative spirit. "The students learned that by coming together, they can achieve something that individually they could never have done," he said via e-mail. “At a school that is known (perhaps unjustly) for competitiveness I didn't expect that reaching such an agreement was possible.”

The story of the boycott is a sterling example of how computer networks solve collective action problems -- the students solved a prisoner's dilemma in a mutually optimal way without having to iterate, which is impressive:

“The students refused to come into the room and take the exam, so we sat there for a while: me on the inside, they on the outside,” Fröhlich said. “After about 20-30 minutes I would give up.... Then we all left.” The students waited outside the rooms to make sure that others honored the boycott, and were poised to go in if someone had. No one did, though.

Andrew Kelly, a student in Fröhlich’s Introduction to Programming class who was one of the boycott’s key organizers, explained the logic of the students' decision via e-mail: "Handing out 0's to your classmates will not improve your performance in this course," Kelly said.

"So if you can walk in with 100 percent confidence of answering every question correctly, then your payoff would be the same for either decision. Just consider the impact on your other exam performances if you studied for [the final] at the level required to guarantee yourself 100. Otherwise, it's best to work with your colleagues to ensure a 100 for all and a very pleasant start to the holidays."

Fröhlich has since changed the grading system -- but he's also now offering the students a final project instead of a final exam, should they choose.

Dangerous Curves [Zack Budryk/Inside Higher Ed]

Malware-Industrial Complex: how the trade in software bugs is weaponizing insecurity

Here's a must-read story from Tech Review about the thriving trade in "zero-day exploits" -- critical software bugs that are sold off to military contractors to be integrated into offensive malware, rather than reported to the manufacturer for repair. The stuff built with zero-days -- network appliances that can snoop on a whole country, even supposedly secure conversations; viruses that can hijack the camera and microphone on your phone or laptop; and more -- is the modern equivalent of landmines and cluster bombs: antipersonnel weapons that end up in the hands of criminals, thugs and dictators who use them to figure out whom to arrest, torture, and murder. The US government is encouraging this market by participating actively in it, even as it makes a lot of noise about "cyber-defense."

Exploits for mobile operating systems are particularly valued, says Soghoian, because unlike desktop computers, mobile systems are rarely updated. Apple sends updates to iPhone software a few times a year, meaning that a given flaw could be exploited for a long time. Sometimes the discoverer of a zero-day vulnerability receives a monthly payment as long as a flaw remains undiscovered. “As long as Apple or Microsoft has not fixed it you get paid,” says Soghoian.

No law directly regulates the sale of zero-days in the United States or elsewhere, so some traders pursue it quite openly. A Bangkok, Thailand-based security researcher who goes by the name “the Grugq” has spoken to the press about negotiating deals worth hundreds of thousands of dollars with government buyers from the United States and western Europe. In a discussion on Twitter last month, in which he was called an “arms dealer,” he tweeted that “exploits are not weapons,” and said that “an exploit is a component of a toolchain … the team that produces & maintains the toolchain is the weapon.”

The Grugq contacted MIT Technology Review to state that he has made no “public statement about exploit sales since the Forbes article.”

Some small companies are similarly up-front about their involvement in the trade. The French security company VUPEN states on its website that it “provides government-grade exploits specifically designed for the Intelligence community and national security agencies to help them achieve their offensive cyber security and lawful intercept missions.” Last year, employees of the company publicly demonstrated a zero-day flaw that compromised Google’s Chrome browser, but they turned down Google’s offer of a $60,000 reward if they would share how it worked. What happened to the exploit is unknown.

Welcome to the Malware-Industrial Complex [Tom Simonite/MIT Technology Review]

(via O'Reilly Radar)

Robots say the craziest things

This morning, while hurrying down the concourse at La Guardia Airport, I tried to dictate a text message to my Nexus 4 while wheeling my suitcase behind me. It got the dictation fine, but appended "kdkdkdkdkdkdkdkd" to the message -- this being its interpretation of the sound of my suitcase wheels on the tiles. Cory

Regular expressions crossword


On Coinheist.com, a crossword puzzle you solve by interpreting regular expressions.
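Checking a candidate fill against such a puzzle is a one-liner per clue. The toy 2x2 grid and clues below are invented, not from the Coin Heist puzzle itself: every row and every column must fully match its regular expression.

```python
import re

# Made-up clues for a 2x2 grid, in the spirit of the puzzle.
row_clues = [r"[AB]C", r"D[EF]"]
col_clues = [r"AD", r"C[EF]"]

def check(grid):
    """True if every row and every column fully matches its clue."""
    rows = ["".join(r) for r in grid]
    cols = ["".join(c) for c in zip(*grid)]   # transpose for the columns
    return (all(re.fullmatch(p, s) for p, s in zip(row_clues, rows)) and
            all(re.fullmatch(p, s) for p, s in zip(col_clues, cols)))

print(check([["A", "C"], ["D", "E"]]))
```

Solving by hand is the fun part; the constraint structure is the same one a solver would backtrack over.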

PDF download

Casino panopticon: a look at the CCTV room in the Vegas Aria


A fascinating article in The Verge looks at the history of casino cheating and talks to Ted Whiting, director of surveillance at the Aria casino in Vegas, who specced out a huge, showy CCTV room with feeds from more than 1,100 cameras. They use a lot of machine intelligence to bring potential cheating to the operators' attention.

Despite that, Whiting says facial recognition software hasn’t been of much use to him. It’s simply too unreliable when it comes to spotting people on the move, in crowds, and under variable lighting. Instead, he and his team rely on pictures shared from other casinos, as well as through the Biometrica and Griffin databases. (The Griffin database, which contains pictures and descriptions of various undesirables, used to go to subscribers as massive paper volumes.) But quite often, they’re not looking for specific people, but rather patterns of behavior. "Believe it or not, when you've done this long enough," he says, "you can tell when somebody's up to no good. It just doesn't feel right."

They keep a close eye on the tables, since that’s where cheating’s most likely to occur. With 1080p high-definition cameras, surveillance operators can read cards and count chips — a significant improvement over earlier cameras. And though facial recognition doesn’t yet work reliably enough to replace human operators, Whiting’s excited at the prospects of OCR. It’s already proven useful for identifying license plates. The next step, he says, is reading cards and automatically assessing a player’s strategy and skill level. In the future, maybe, the cameras will spot card counters and other advantage players without any operator intervention. (Whiting, a former advantage player himself, can often spot such players. Rather than kick them out, as some casinos did in the past, Aria simply limits their bets, making it economically disadvantageous to keep playing.)

With over a thousand cameras operating 24/7, the monitoring room creates tremendous amounts of data every day, most of which goes unseen. Six technicians watch about 40 monitors, but all the feeds are saved for later analysis. One day, as with OCR scanning, it might be possible to search all that data for suspicious activity. Say, a baccarat player who leaves his seat, disappears for a few minutes, and is replaced with another player who hits an impressive winning streak. An alert human might spot the collusion, but even better, video analytics might flag the scene for further review. The valuable trend in surveillance, Whiting says, is toward this data-driven analysis (even when much of the job still involves old-fashioned gumshoe work). "It's the data," he says, "And cameras now are data. So it's all data. It's just learning to understand that data is important."

One thing I wanted to see in this piece was some reflection on how the casino's level of surveillance, and the casino theory of justice (we spy on everyone to catch the guilty people), have become the new normal across the world.

Not in my house: how Vegas casinos wage a war on cheating [Jesse Hicks/The Verge]

(via Kottke)

Montreal comp sci student reports massive bug, is expelled and threatened with arrest for checking to see if it had been fixed

Ahmed Al-Khabaz was a 20-year-old computer science student at Dawson College in Montreal, until he discovered a big, glaring bug in Omnivox, software widely used by Quebec's junior college system. The bug exposed the personal information (social insurance number, home address, class schedule) of its users. When Al-Khabaz reported the bug to François Paradis, his college's Director of Information Services and Technology, he was congratulated. But when he checked a few days later to see if the bug had been fixed, he was threatened with arrest and made to sign a secret gag-order whose existence he wasn't allowed to disclose. Then, he was expelled:

“I was called into a meeting with the co–ordinator of my program, Ken Fogel, and the dean, Dianne Gauvin,” says Mr. Al-Khabaz. “They asked a lot of questions, mostly about who knew about the problems and who I had told. I got the sense that their primary concern was covering up the problem.”

Following this meeting, the fifteen professors in the computer science department were asked to vote on whether to expel Mr. Al-Khabaz, and fourteen voted in favour. Mr. Al-Khabaz argues that the process was flawed because he was never given a chance to explain his side of the story to the faculty. He appealed his expulsion to the academic dean and even director-general Richard Filion. Both denied the appeal, leaving him in academic limbo.

“I was acing all of my classes, but now I have zeros across the board. I can’t get into any other college because of these grades, and my permanent record shows that I was expelled for unprofessional conduct. I really want this degree, and now I won’t be able to get it. My academic career is completely ruined. In the wrong hands, this breach could have caused a disaster. Students could have been stalked, had their identities stolen, their lockers opened and who knows what else. I found a serious problem, and tried to help fix it. For that I was expelled.”

The thing that gets me, as a member of a computer science faculty, is how gutless his instructors were in their treatment of this promising student. They're sending a clear signal that you're better off publicly disclosing bugs without talking to faculty or IT than going through channels, because "responsible disclosure" means that bugs go unpatched, students go unprotected, and your own teachers will never, ever have your back.

Shame on them.

Youth expelled from Montreal college after finding ‘sloppy coding’ that compromised security of 250,000 students personal data [Ethan Cox/National Post]