Original research from Princeton's Joe Calandrino, Ed Felten and Will Clarkson shows that machine analysis can make very accurate guesses about the identity of people who complete bubble-in forms. That is, there's something like a recognizable, individual "penmanship" for the small scribbles used to fill in the bubbles on machine-readable forms.
These individuals have visibly different stroke directions, suggesting a means of distinguishing between the two individuals. While variation between bubbles may be limited, stroke direction and other subtle features permit differentiation between respondents. If we can learn an individual's characteristic features, we may use those features to identify that individual's forms in the future.
To test the limits of our analysis approach, we obtained a set of 92 surveys and extracted 20 bubbles from each of those surveys. We set aside 8 bubbles per survey to test our identification accuracy and trained our model on the remaining 12 bubbles per survey. Using image processing techniques, we identified the unique characteristics of each training bubble and trained a classifier to distinguish between the surveys' respondents. We applied this classifier to the remaining test bubbles from a respondent. The classifier orders the candidate respondents based on the perceived likelihood that they created the test markings. We repeated this test for each of the 92 respondents, recording where the correct respondent fell in the classifier's ordered list of candidate respondents.
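The evaluation protocol described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions. The feature vectors are synthetic random data standing in for the image-processing output. The nearest-centroid ranking is a simple stand-in, not the classifier the researchers used. The function name `rank_of_true_respondent` and the feature dimension are hypothetical. Only the counts (92 surveys, 20 bubbles each, 12 for training and 8 for testing) come from the text.

```python
import numpy as np

def rank_of_true_respondent(train, test):
    """Rank candidate respondents for each respondent's held-out bubbles.

    train: array of shape (n_respondents, n_train_bubbles, n_features)
    test:  array of shape (n_respondents, n_test_bubbles, n_features)
    Returns the 1-based position of the correct respondent in each ordered list.
    NOTE: nearest-centroid scoring is an assumption made for illustration.
    """
    # One "profile" per respondent: the mean of that respondent's training bubbles.
    centroids = train.mean(axis=1)
    ranks = []
    for r in range(test.shape[0]):
        # Distance from every test bubble of respondent r to every profile.
        dists = np.linalg.norm(test[r][:, None, :] - centroids[None, :, :], axis=2)
        # Average over the test bubbles; a lower score means a more likely author.
        score = dists.mean(axis=0)
        order = np.argsort(score)
        # Where did the correct respondent land in the ordered candidate list?
        ranks.append(int(np.where(order == r)[0][0]) + 1)
    return np.array(ranks)

# Synthetic stand-in data: each respondent gets a random "style" plus per-bubble noise.
rng = np.random.default_rng(0)
n_resp, n_bubbles, n_train, n_features = 92, 20, 12, 16
styles = rng.normal(size=(n_resp, 1, n_features))
bubbles = styles + 0.5 * rng.normal(size=(n_resp, n_bubbles, n_features))

# Hold out 8 bubbles per survey for testing, train on the other 12.
train, test = bubbles[:, :n_train], bubbles[:, n_train:]
ranks = rank_of_true_respondent(train, test)
print("fraction ranked first:", np.mean(ranks == 1))
```

Recording the full rank rather than a plain hit or miss mirrors the description above: it shows not only how often the correct respondent comes first but also how far down the list the misses fall. Any numbers this sketch prints reflect the synthetic data only and say nothing about the study's actual results.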
New Research Result: Bubble Forms Not So Anonymous