Ed Felten and several colleagues have just finished a paper called "Fingerprinting Blank Paper Using Commodity Scanners" for the May, 2009 Proceedings of the IEEE Symposium on Security and Privacy. It details a mechanism for authenticating documents based on known characteristics of the paper stock and individual sheets they're printed on.

This paper presents a novel technique for authenticating physical documents based on random, naturally occurring imperfections in paper texture. We introduce a new method for measuring the three-dimensional surface of a page using only a commodity scanner and without modifying the document in any way. From this physical feature, we generate a concise fingerprint that uniquely identifies the document. Our technique is secure against counterfeiting and robust to harsh handling; it can be used even before any content is printed on a page. It has a wide range of applications, including detecting forged currency and tickets, authenticating passports, and halting counterfeit goods. Document identification could also be applied maliciously to de-anonymize printed surveys and to compromise the secrecy of paper ballots.
  1. So you’re saying boingboing is now the only way we can leave an anonymous note once all known paper has been scanned?

  2. The title is a misnomer. While this is an interesting idea and may indeed be able to positively identify individual pieces of paper, it is not fingerprinting. Fingerprinting is about the positive identification of individual fingers. In the interests of accuracy, the word fingerprint should not be used.

    1. I assume you’re joking, right?

      Ametaphoria: the silent epidemic. I mean, not really silent because diseases don’t literally make noises. And not really an epidemic because it’s not literally a contagious disease.

  3. People seemed to have no problem with DNA as “genetic fingerprinting”, the description got the idea across until DNA became a household word. Until they come up with a snappy acronym, “fingerprinting” will do.

    Except for that technique to raise fingerprints on paper using superglue fumes. You could be scanning that, and voila, instant debilitating confusion. Oh noes!

    Ok, that was silly. I’m going to get coffee, now.

  4. I’d like to sidestep the pedantic use-of-the-word-“fingerprinting” issue and focus on what concerns me most about this: the continued erosion of personal privacy. If this could be used to make anonymous ballots not-so-anonymous, or to identify what manufacturer and lot number a particular piece of paper came from, it will not be long before laws are passed requiring all paper printing to be registered and licensed with the government.

    And about a hop, skip, and a jump from there we’ll have legislation outlawing DIY paper making.

  5. I was expecting to read how I could be fingerprinted by my favorite corporations and government agencies when I write my crazy letters to them. Perhaps “tracking” would be a more accurate word here?

    Nevertheless, that last little bit about de-anonymizing documents still hits pretty close to my initial read of the headline.

    And about a hop, skip, and a jump from there we’ll have legislation outlawing DIY paper making.

    That’s about the same level of reasoning that allows people to say “If we start allowing gay marriage, then a hop, skip, and a jump from there we’ll be allowing people to marry their pet goats!”

    I think there are plenty of interesting privacy issues at stake, but there’s no need to go overboard. ;)

    Here’s a possible result: Currently, many printers surreptitiously mark all printed pages with tiny yellow dots that match each page to a specific printer and time of printing. (There was some mini-scandal about this; did any companies stop using it?) Imagine instead that a printer had a little scanner that grabbed a fingerprint of each page as it went out: same result. The fingerprint would go into some government database, and the time and place of printing would then be traceable.

    Now, I don’t expect that consumer printers are going to secretly install a high-powered, costly scanner in their printers and not tell the consumers about it. However, it could easily be a feature that government offices expressly buy. Then, next time there’s a leak or a whistle blowing or something, any paper printed by a gov’t printer could be traced.

    That said, all this is going to be obsolete pretty soon anyway. Who’s going to be using paper? Leaks go out by emails or text messages. Plane tickets can already be completely paperless: just use the image of the bar code on your cellphone’s screen. I don’t doubt that Ticketmaster is going to be using the same thing for their own tickets, if they aren’t already. Plenty of polling machines are electronic, if not the majority…

    They say you shouldn’t look for technical solutions to political problems. I think that also means you shouldn’t view technology as a political threat.

    The problem is not whether or not the technology is available to allow a government or company to spy on its citizens or employees. It’s a problem of whether they should be allowed to (and if so, to what extent).

    For instance, the example you conceived of, of this being used to track down whistle-blowers, is not a problem in my view. The problem then would be that you have a government that’s very interested in finding and silencing whistle-blowers. How they do it, is less relevant.

    That would not be an issue in, for example, Sweden, where investigating government leaks is not only not done, it’s _illegal_ (with the exception of national security matters, which are strictly defined). There is oversight in place to ensure that it doesn’t happen.

    The key here is a strong political will for personal privacy and transparency in government. And I think this attitude of automatically assuming the government will abuse their power if the ability exists, is counter-productive. Empirically, distrust of government doesn’t bring about less corrupt governments. It brings about cynicism and corruption.

    That isn’t to confuse trust of government with _blind_ trust. And that mature form of trust goes hand in hand with demanding transparency.

  8. I looked at the picture first, then misread the title, and thought it said “Fingerprinting blank sheets of paper by shredding them”

    …now that’s a useless technology.

  9. Easy to bypass with a simple trick we used to do as kids.

    We used to roll paper to get it really soft. You roll it into a ball between your palms and keep rolling it. Then you’d unravel it, roll it back up and repeat until the paper was as soft as toilet paper.

    This utterly changes the ‘fingerprint’ as it unravels the threads.

    This could be done more easily with a liquid bath without a deterioration of the paper. Just let the paper soak in water for a little bit and then bring it back out to dry on a line. To really break this method, combine the two: wad up the paper, roll it a bit, and then soak it.

    The fibres will unravel due to the wadding and rolling and warp due to the water.

  10. For the commenters bringing up privacy issues, I don’t really get it.

    If someone wanted to de-anonymize a survey, ballot, or whatever, it seems a whole lot easier to make the particular ballot (or whatever) unique via the text printed on the paper (or using invisible yellow dots) than to try to identify a specific sheet of paper using the paper’s texture.

  11. This has already been done by Light Signatures in the early 80’s. There is probably already a patent on it somewhere.
    They passed a paper under a line scan camera and generated a unique value that could be reproduced by rescanning the paper. They made coupons that were attached to jeans, records, and other products for anti-counterfeiting purposes.
    I worked on the project for the company that made the line scan camera and the image acquisition system.

  12. Can they identify individual pieces of paper, the ream the paper comes from, the batch produced by the factory, or just which factory the paper came from?

    If it could identify individual sheets, then it matters where the “fingerprint” is read from. If you scan the top-left corner, and then someone loads the sheet upside-down in the printer, so that the scanned part becomes the bottom-right corner, the computed value will be different. In my office, if we run out of A4, I’ll just take a ream of A3 and put it in the guillotine. Cutting the paper causes even more problems.

    Big brother is better off using those printer dots. They could also use deliberate imperfections, built into the rollers on the printer.

    The application for this will be more along the lines of *you* wanting to identify the paper you yourself sent out. Say you print a rebate coupon and give it to your customer, and you want to make sure your customers do not create their own coupons with their own computer printers. Or a university wants to make sure nobody manufactures fake certs. (Yes, those are lame examples, maybe someone can think of better ones).

    It would be a good many years before technology progresses far enough that we could replicate individual fibres on the paper.

  13. @ Xeno: It sounds like this is already much more robust than you’re giving it credit for. According to the paper, they tried submerging pages under water for five minutes, letting them dry on a line, and then even ironing them flat with a hot iron. After this mistreatment, they still achieved 100% accuracy in identifying pages.

    @ dainel: The paper describes how the area to be fingerprinted is calculated, which will be different for every page. It will calculate the same area even if you rotate the page. Naturally you’ll have to scan both sides, though, to identify a particular page. And I’d assume it can only identify individual pages — it can’t identify a ream of paper unless it fingerprints the paper every few inches.

  14. BB might want to reconsider the use of the image (or maybe just add a note about it), as it implies that it was created using the equipment described in the paper (‘a commodity scanner’). Rather, it’s an electron micrograph of a piece of paper, used in the paper for illustration purposes. While the current method is quite time consuming (four individual scans at various angles) and not a practical privacy concern, the possibility of automating the process does raise some issues. The ballot one is interesting: certainly it’s feasible that someone might use this method to identify individual ballots, but it’s relatively easily defeated. Just shuffle the ballots before handing them out. Still a pretty troubling potential threat to privacy, though. I sometimes wonder if our kids will view our privacy concerns in the same way we might view our forebears’ fear of aircraft or electricity.

  15. @7 – You use leak and whistle blowing in the same context there – you realize often times the latter is actually GOOD for people right?

  16. a scanning electron microscope is not a “commodity scanner.” what kind of “commodity scanner” would require 10000 electron volts and a vacuum environment to work?

  17. These guys are behind the times: Sherlock Holmes could tell all kinds of things about someone just by studying the paper he used. ;)

    Also–A “commodity scanner” is just that red laser gun thing that clerks use to scan bar codes, right? I never knew it had that kind of resolution.

  18. @ 19 – And your comment proves my point. The electron microscope wasn’t used to produce the method described in the paper (unless I’ve very much misunderstood it), rather they use the EM photo above as an illustration of the surface structure of paper as part of their explanation of their method. The juxtaposition in this blog entry makes the relationship between the technique and photo (and thus the equipment used) confusing.

  19. Keep in mind that in order to identify a piece of paper with this method, it first has to be scanned, currently a non-trivial undertaking. Further, a database of all those pages has to be maintained.

    Section 8 of the paper touches upon some of the privacy implications. Putting chemical markers on papers (or perhaps pre-printing them with yellow dots) has been possible for years, however these techniques modify the paper in detectable ways.

    With this method, there is absolutely no modification of the paper. So as an example, if a polling place used paper ballots, someone could scan the ballots surreptitiously, and if they are handed out in a known order and if the order of voters is tabulated, a corrupt official could identify the votes after the election.

    On the other hand, someone could claim that the ballots had been scanned, whether they were or not, and it would be extremely difficult for an honest official to prove that they had not been scanned.

  20. @ #3 I assume you do not now work, and have never worked, with fingerprints, right?

    and @#4 Good comment. BTW super glue vapor is not a good choice for locating latent fingerprints on paper. There are much better methods available.

  21. Reasonable usage: scan the paper as you’re printing a coupon or ticket for redemption, then print the ‘fingerprint’ code on the edge. You can then verify that it’s not an unauthorized copy (assuming no one cracks your scanning and hashing methods). Probably not worth the trouble, since you get 90% of the benefit by just storing 1 time use ID #s in a database somewhere.
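
The coupon scheme in the last comment can be sketched in a few lines. This is only an illustration under stated assumptions, not the method from the paper: it assumes the scanner pipeline already yields a fixed-length byte string per sheet, that a rescan of the same sheet differs in only a few bits, and that the issuer authenticates the printed code with an HMAC. The names `issue_code`, `verify_code`, `OneTimeIds` and the `max_bit_errors` threshold are all hypothetical. The second half shows the simpler alternative the commenter mentions, one-time-use IDs kept in a database (here just an in-memory set).

```python
import hashlib
import hmac
import secrets

TAG_LEN = 32  # length in bytes of an HMAC-SHA256 tag


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def issue_code(fingerprint: bytes, key: bytes) -> bytes:
    """Build the code printed on the coupon edge: fingerprint plus HMAC tag."""
    tag = hmac.new(key, fingerprint, hashlib.sha256).digest()
    return fingerprint + tag


def verify_code(rescanned: bytes, printed_code: bytes, key: bytes,
                max_bit_errors: int) -> bool:
    """Accept only if the printed code is authentic AND matches this sheet."""
    if len(printed_code) <= TAG_LEN:
        return False
    fingerprint, tag = printed_code[:-TAG_LEN], printed_code[-TAG_LEN:]
    expected = hmac.new(key, fingerprint, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False  # the printed code was not produced by the issuer
    if len(rescanned) != len(fingerprint):
        return False
    # Rescans are noisy, so tolerate a small number of differing bits
    # (assumed threshold; a real system would have to measure this).
    return hamming_distance(rescanned, fingerprint) <= max_bit_errors


class OneTimeIds:
    """The simpler alternative: random IDs that can each be redeemed once."""

    def __init__(self) -> None:
        self._unused: set[str] = set()

    def issue(self) -> str:
        coupon_id = secrets.token_hex(8)
        self._unused.add(coupon_id)
        return coupon_id

    def redeem(self, coupon_id: str) -> bool:
        if coupon_id in self._unused:
            self._unused.remove(coupon_id)
            return True
        return False


if __name__ == "__main__":
    key = secrets.token_bytes(32)
    sheet = bytes(range(64))                # stand-in for a real fingerprint
    code = issue_code(sheet, key)

    noisy = bytearray(sheet)
    noisy[0] ^= 0b111                       # a rescan with three flipped bits
    print(verify_code(bytes(noisy), code, key, max_bit_errors=40))

    copy_sheet = bytes(255 - b for b in sheet)  # a different sheet of paper
    print(verify_code(copy_sheet, code, key, max_bit_errors=40))
```

The difference between the two approaches is where the check happens: the signed fingerprint can be verified offline by anyone holding the key and a scanner, while the one-time ID needs a lookup against the issuer's records but no scanning at all.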

