Jeremy Kun, a mathematics PhD student at the University of Illinois at Chicago, has posted a wonderful primer on probability theory for programmers on his blog. It's a subject vital to machine learning and data mining, and it's at the heart of much of what's going on with Big Data. His primer is lucid and easy to follow, even for math ignoramuses like me.

For instance, suppose our probability space is $\Omega = \{1, 2, 3, 4, 5, 6\}$ and $f$ is defined by setting $f(x) = 1/6$ for all $x \in \Omega$ (here the “experiment” is rolling a single die). Then we are likely interested in more exquisite kinds of outcomes; instead of asking the probability that the outcome is 4, we might ask what is the probability that the outcome is *even*? This event would be the subset $\{2, 4, 6\}$, and if any of these are the outcome of the experiment, the event is said to *occur*. In this case we would expect the probability of the die roll being even to be 1/2 (but we have not yet formalized why this is the case).
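For programmers, the single-die example may be easier to see in code. The sketch below is not from the primer excerpt: it assumes the usual rule that the probability of an event is the sum of the masses of the outcomes it contains (the excerpt notes this has not yet been formalized), and the names `omega`, `pmf`, and `probability` are made up for illustration.

```python
from fractions import Fraction

# Sample space for one fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}

# Probability mass function: every outcome gets mass 1/6.
pmf = {outcome: Fraction(1, len(omega)) for outcome in omega}


def probability(event, pmf):
    """Assumed rule: add up the mass of every outcome in the event."""
    return sum((pmf[outcome] for outcome in event), Fraction(0))


# The event "the roll is even" is just a subset of the sample space.
even = {outcome for outcome in omega if outcome % 2 == 0}

print(probability(even, pmf))  # 1/2
```

Using `Fraction` instead of floats keeps the arithmetic exact, so the result prints as 1/2 rather than 0.5.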

As a quick exercise, the reader should formulate a two-dice experiment in terms of sets. What would the probability space consist of as a set? What would the probability mass function look like? What are some interesting events one might consider (if playing a game of craps)?
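One possible way to approach the exercise, offered as a sketch rather than the primer's own answer: treat each outcome as an ordered pair of faces, give all 36 pairs equal mass, and describe craps events as subsets. The names below are hypothetical, and the two events use the standard come-out roll rules (7 or 11 wins, 2, 3, or 12 loses).

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs (first die, second die), 36 in total.
faces = range(1, 7)
omega2 = set(product(faces, repeat=2))

# Both dice are assumed fair, so every pair gets mass 1/36.
pmf2 = {pair: Fraction(1, len(omega2)) for pair in omega2}


def probability(event, pmf):
    """Assumed rule: add up the mass of every outcome in the event."""
    return sum((pmf[outcome] for outcome in event), Fraction(0))


# Come-out roll events in craps, written as subsets of the sample space.
natural = {(a, b) for (a, b) in omega2 if a + b in (7, 11)}
craps = {(a, b) for (a, b) in omega2 if a + b in (2, 3, 12)}

print(probability(natural, pmf2))  # 2/9
print(probability(craps, pmf2))    # 1/9
```

Ordered pairs are used so that (1, 2) and (2, 1) count as distinct outcomes; that is what makes the uniform mass of 1/36 appropriate.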

Probability Theory — A Primer

(*Image: Dice, a Creative Commons Attribution (2.0) image from artbystevejohnson's photostream*)
