Associated Press claims to have discovered magic anti-news-copying beans

A lot of copyfighters were mystified by the Associated Press's recent announcement (complete with a bonkers diagram straight off a bottle of Dr. Bronner's) that they had spent millions of dollars on a DRM system for news that would limit how you could paste the text you copied from your browser window.

This is a seeming impossibility, and while there will always be DRM vendors with impossible magic beans to sell to any panicked goofball media dinosaur who'll buy them, it just seemed too weird to think that no one at the AP had said, "Wait, what? This is dumb."

Now Ed Felten has delved into the details that can be gleaned about these magic beans and concludes that AP has made up a bunch of fictional things that their reasonably neat content-management system and microformat can do.

As far as I can tell, the underlying technology is based on hNews, a microformat for news, shown in the AP diagram, that was announced by AP and the Media Standards Trust two weeks before the recent AP announcement.

Unfortunately for AP, the hNews spec bears little resemblance to AP's claims about it. hNews is a handy way of annotating news stories with information about the author, dateline, and so on. But it doesn't "encapsulate" anything in a "wrapper", nor does it do much of anything to facilitate metering, monitoring, or paywalls.

AP also says that hNews "includes a digital permissions framework that lets publishers specify how their content is to be used online". This may sound like a restrictive DRM scheme, aimed at clawing back the rights copyright grants to users. But read the fine print. hNews does include a "rights" field that can be attached to an article, but the rights field uses ccREL, the Creative Commons Rights Expression Language, whose definition states unequivocally that it does not limit users' rights already granted by copyright and can only convey further rights to the user.

AP's DRM Announcement: Much Ado About Nothing
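To make Felten's point concrete, here is a minimal sketch of what "annotating" a story amounts to. The markup, class names, and helper names below are placeholders invented for illustration and are not copied from the hNews spec; the only real-world convention assumed is a `rel="license"` link pointing at a Creative Commons license URL. The sketch reads the annotations with Python's standard `html.parser`, then shows that dropping the markup leaves the full text behind, since nothing here wraps or locks it.

```python
from html.parser import HTMLParser

# Hypothetical annotated story. The class names are illustrative placeholders,
# not the actual hNews vocabulary.
SAMPLE = """
<div class="story">
  <h1 class="headline">City council approves budget</h1>
  <span class="author">Jane Example</span>
  <span class="dateline">SPRINGFIELD</span>
  <a rel="license" href="http://creativecommons.org/licenses/by/3.0/">Some rights granted</a>
  <p class="body">The council voted on Tuesday.</p>
</div>
"""


class AnnotationReader(HTMLParser):
    """Collects the text of elements whose class names one of FIELDS."""

    FIELDS = {"headline", "author", "dateline"}

    def __init__(self):
        super().__init__()
        self.found = {}
        self._current = None  # field whose text we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        for name in (attrs.get("class") or "").split():
            if name in self.FIELDS:
                self._current = name
        # A license link only points at a statement of granted rights.
        if tag == "a" and "license" in (attrs.get("rel") or "").split():
            self.found["license"] = attrs.get("href")

    def handle_data(self, data):
        if self._current and data.strip():
            self.found[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


class TextOnly(HTMLParser):
    """Keeps only the text nodes, discarding every tag and attribute."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())


def read_annotations(html):
    reader = AnnotationReader()
    reader.feed(html)
    return reader.found


def plain_text(html):
    stripper = TextOnly()
    stripper.feed(html)
    return " ".join(stripper.parts)


if __name__ == "__main__":
    print(read_annotations(SAMPLE))
    print(plain_text(SAMPLE))
```

The annotations are useful to a cooperating aggregator that chooses to read them, but any reader or scraper that ignores them still gets the whole story, which is why a format like this cannot by itself meter or restrict copying.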

Update: Don't miss Dequed's awesome and profane remix of the diagram


  1. I was wondering what they thought they had invented.

    My speculation was that it was some sort of “watermarking” of language choice or deliberate typos or something of that sort, which would at least flag unauthorized verbatim copying of articles. (Ideas can’t be copyrighted but instantiations can be, so to stay legal in picking up someone else’s story you need to either pay or rephrase in your own words.) If that was so, I expect they would prefer to keep the details of the watermark secret, to avoid having folks automate the task of rewriting just enough to avoid notice.

    Some years ago, I took Arthur C. Clarke’s phrase “any sufficiently advanced technology is indistinguishable from magic” and derived from it the corollary “any sufficiently digital playback is indistinguishable from copying”. That seems to sum up the rights management problem fairly well — it’s deuced hard to do anything to data that will limit or track its use without completely blocking its reuse. Especially when, as in the case of text, there just aren’t that many bits of subchannel to work with.

    (I do think the AP deserves some payment if you pick up one of their stories; I just don’t see any way they can enforce that except by going back to sending their feed only to subscribers and charging those subscribers enough to support themselves. Genies tend to object to being rebottled.)

  2. According to this the last thing they do is throw the consumer into a slightly alarmed garbage can. That seems about right.

  3. Well, I suppose without the details it may be premature to reach any conclusions, but the odds of the AP having solved the myriad problems surrounding such a scheme – where many competent and well-funded efforts have failed – seem incredibly low. Unbelievably low.

    What seems far more likely is that the AP, blinded by their desperation and/or mesmerized by ignorant or unethical technologists, were sold a digital “pig in a poke”, like a bumpkin at the county fair. Magic beans, indeed…

  4. Nice to see the dinosaurs rushing towards extinction. I can’t wait until these Mainstream thugs are just another part of the fossil record.

  5. OK, forgive the irony while I copy-paste Deet’s comment wholesale from the relevant Ars article. But s/he seems to have an interesting insight into the whole thing, and as you can’t link to Ars comments, pasted it is.

    Deet (comments)

    The AP is smarter than this unfortunate series of announcements makes them seem. The hNews project is a good example of how they’re working to embrace open standards, structured markup, and future-friendly distribution mechanisms. Ignore their hamfisted misuse of DRM jargon, and it’s clear they’re actually implementing real advancements in their content distribution system… they’ve never even had a consistent standard structure for online content distribution before.

    Look at this embarrassing DRM verbiage as a kind of sideshow for the old folks. It’s targeted to the broad segment of AP membership that may not know what DRM is or how it actually works, but are convinced they need it before they release a word of their content online. Remember, AP content is produced by AP members. Talking about DRM is less about threatening bloggers and more about getting the old stalwarts comfortable posting more stories and photos, ultimately restoring some relevance to the small newspaper industry in the online world they find so scary.


    The great difficulty in talking about the AP’s year-long push to stop “misappropriation” of its content is that it has never been quite clear what behavior AP is trying to stop. It’s not targeting Google, as it already has a deal in place with Google to provide full-text feeds of its news stories. It doesn’t appear to be going after small-time bloggers or those who use quotes from AP stories in their work.

    Correct, AP is not going after small-time bloggers or people who quote from AP stories, so all the comments in this thread about AP being stupid for not thinking of copy-and-paste are pretty ridiculous.

    The behavior they’re trying to stop is chiefly threefold: wholesale republishing of AP content by non-subscribers, subscriber misuse of AP content outside subscription provisions, and ‘leaking’ of subscription AP content by subscribers to non-subscribers.

    The wholesale republishing problem goes back many years. It is quite surprising how many sites exist solely for the purpose of scraping and re-publishing content, branding themselves as “local news aggregators” or the like. This activity actually shows up in the referrer logs of the original publishers via embedded photos, “email to a friend” links, etc. These sites are automated, they lift content from a wide range of publishers, and their operators are seedy and elusive. Web beacons are likely to catch more of this than you’d think, and their use here likely originates from successful independent tests by member publishers who found that you can learn a lot about bots by using web beacons.

    The subscriber-misuse problem has emerged in the past few years as AP has migrated from print-oriented subscription models to a more diverse cross-media structure. Many long-standing newspapers are still on the old plans, publishing both in their paper and on the web, using content licensed only for print. By embedding the licensing rules right into the content, however, publishers with complex subscription terms have a structured way to filter what should and should not appear on the web site. Frankly, these tags are a great tool from a publishing perspective, and they are long overdue.

    The leaking problem is where you have Google News, Digg, and all of that. And holy crap, have these sites ever been thoroughly lynched by AP members over the years.

    The deal with Google News came about because much of the content on Google News was in fact AP content scraped from member newspapers, many of whom were syndicating the same material. When a story hit, one random paper would get all the traffic for the same story that also ran in everyone else’s paper, which seemed unfair to most of AP’s membership and put Google News (who wasn’t paying anything at all) on AP’s shitlist. So, to level the field, AP and Google struck a deal directly whereby most* AP content on Google News now comes straight from AP. It restores the value of every member’s subscription, it unifies the experience for Google News users, and it gets the material out to the public effectively. Win win win.

    This new system extends that deal and makes more like it possible. Legitimate aggregators like Google News can now recognize AP content and handle it appropriately, hopefully turning “most” (above) into “all.” Terms and use conditions set by AP or its publishing members can be respected correctly, making everyone happy. If some old codger still doesn’t want to see his news about gator-huntin’ on Google (for whatever inexplicable reason), he can “turn it off” using automated tools. Another long-overdue feature now enabled by this system.

    Having worked with several folks at AP Digital, I can say that they’re as exasperated by DRM as any of us. For all the reasons described here, they’re well aware that actually controlling news content online is practically impossible. This announcement was definitely bungled by non-systems personnel. Nevertheless, the systems people are under a lot of often-conflicting mandates from within a diverse organization, and they’re working several angles, some with real success.

    AP’s hope seems to be that this new specification for online delivery of AP member content will slow, stop, or at least reveal the activities of the more blatant rippers-off, while giving useful tools to legitimate publishers for monitoring and controlling the use of their content, which is entirely within their prerogative. Obviously, and as with any security system, a sophisticated attacker can circumvent these measures. And the AP knows this. What’s great about the tagging system is, if you’re a legit publisher, the tags had better be there. If the tags are missing, well, be prepared to hear from AP.

    These measures beat the old free-for-all hands down, and they may well help news publishers regain some relevance by giving structured, managed access to distribution outside their own print and web readership, which if you ask me, is a great thing for news online.

  6. Or, they could just encapsulate their news stories in a binary blob, a la Flash–that’d show teh geeks!

  7. “encapsulate” XXX in a “wrapper” is usually management speak for something like XML in my experience.

  8. “would limit how you could paste the text you copied from your browser window.”

    What a waste of time and money. I guess they forgot about a screenshot, which in turn can be OCRed (optical character recognition) in most word processing software. Did they forget that someone could just retype the text as well?

  9. it just seemed too weird to think that no one at the AP had said, “Wait, what? This is dumb.”

    You don’t know the AP very well. They fired all the smart people they paid well, and the rest left for equal-paying jobs in the high-speed nourishment preparation sector, among others.

    (No, really. I ran into a former AP reporter – not a stringer or freelancer, but a full-time New Orleans staffer – working the Blockbuster counter as management last October. He left, voluntarily, because they offered better health insurance.)

  10. Thanks, Arkizzle, for bringing Deet’s comment to us. That was wonderfully helpful in understanding the issue.

  11. Would it kill you guys to link to a full size image any time there is a shrunk version in the post?

  12. If they do manage to procure these magic beans, they’ll probably end up screwing over disabled people who use assistive technology like screen reading software and braille displays.

  13. Several months ago, the political blogs I read started talking about how AP was going to charge blogs for quoting them (we’re talking about $200 for a sentence, if I recall correctly). Most blogs either refused to blog about AP stories (keeping eyeballs from the link) or would put in the post that they were charging AP for the traffic.

    First, AP was breaking the Underpants Gnome’s copyright on business models:
    1) Make your product useless
    2) ?
    3) Profit!

    Second, people will find a way to pay for product that is good. The Wall Street Journal has some of its content behind a subscription wall and used to give you a day pass to their site if you looked at a full-page ad. There are people who make a living going to the Middle East and reporting for tips from readers. It also gets them exposure to sell books and get signed to bigger outlets.

    Finally, does anyone else think Michael Scott is in charge of AP?

  14. Ya know, an open and free web that cannot be censored by any government or otherwise shut down as a conduit of information means that we don’t need reporters anymore. Anything that is worth knowing is always witnessed by at least a few, and any of them publishing directly on the web is better than any journalist’s secondhand interpretation.

  15. Salon used to charge thirty bucks–I was one of the people who proposed the business model of Watch an ad, read content; and I got a hearty F You at the time from the leadership there!

    Funny how times change!

    You can always grab a screen shot of AP junk (and that is what it is) and, so long as your purpose is intellectual and not-for-profit, call it FAIR USE/news….just don’t be piggy. The other option is to cut and paste the content, and then convert it to plain text. How’s a watermark gonna survive that?
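The plain-text point in the last comment can be sketched in a few lines. Nothing is publicly known about any actual AP watermark, so the scheme below is purely hypothetical: it hides bits in zero-width characters placed after spaces, and the function names are made up for this example. A crude "convert to plain text" step, modeled here as forcing the text to ASCII, removes the mark completely. A watermark based on word choice or deliberate typos, as speculated in comment 1, would not be removed this way.

```python
# Hypothetical watermark alphabet: two invisible characters stand for bits.
ZERO = "\u200b"  # zero-width space encodes bit 0
ONE = "\u200c"   # zero-width non-joiner encodes bit 1


def embed(text, bits):
    """Hide one bit after each space until the bit string is used up."""
    remaining = list(bits)
    out = []
    for ch in text:
        out.append(ch)
        if ch == " " and remaining:
            out.append(ONE if remaining.pop(0) == "1" else ZERO)
    return "".join(out)


def extract(text):
    """Recover the hidden bit string, or '' if no marker characters remain."""
    return "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))


def to_plain_text(text):
    """Crude plain-text conversion: drop everything outside ASCII.

    This also discards legitimate non-ASCII characters, which is acceptable
    for a sketch but not for real text processing.
    """
    return text.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    story = "AP news story text goes here"
    marked = embed(story, "1011")
    print(extract(marked))                  # hidden bits are recoverable
    print(extract(to_plain_text(marked)))   # nothing left after conversion
```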
