Tax returns for 6,461,326 tax-exempt organizations now indexable by search engines and available for free downloads, thanks to

Rogue archivist Carl Malamud sez,

If you want access to all the tax filings of US nonprofit corporations, the IRS will sell you sets of DVDs for $2580 per year of data. We acquired all of these filings from 2002 to the present, a set of DVDs weighing 98.7 pounds. I'm pleased to report that all 6,461,326 of those returns are now successfully extracted and available on our new bulk data feed.

This data really should be available directly from the IRS at no charge. Accordingly, we've drafted a deed of gift offering the system back to the government.

Until the .gov people do take it over, we're offering access to all 5 TBytes of data using the http, ftp, and rsync protocols. Our hope is that developers will come up with lots of new uses for this information. In order to make the database even more useful, we've started working with Captricity to extract data from the forms and make it available as computable data (e.g., CSV files instead of TIFF images!).

Once search engines such as Google finish indexing the data, the tax filings of nonprofits will show up in the search results. When you search for a nonprofit, the first thing you see ought to be their home page. But, the next thing you ought to see are things like how much they pay their CEO, how much revenue goes for fundraising, and if they spend money to lobby public officials.

Nonprofits in the US had $1.87 trillion in 2009 revenues and it is these periodic filings that make the nonprofit marketplace work properly, just like SEC EDGAR filings help make the corporate markets work properly.

Reports of Exempt Organizations (Thanks, Carl!)



  1. In terms of improvement in quality of oversight per dollar, not having the data freely available is a serious waste (and it isn’t as though seeding a bunch of torrents at modest bandwidth is particularly expensive, or that the data aren’t being stored digitally anyway, at least the new data coming in; back records may be on microfilm somewhere…).

    That said, I’m actually kind of surprised that the fee for such a relatively low-volume data dump is so low. ‘Products’ like that have a way of being priced so as to suggest that they really don’t want to sell any (see also: FOIA request processing fees).

  2. You can request Form 990 for your favorite nonprofit free from the IRS, but I requested one back in February and they still haven’t responded. I called last month and was told my request was entered into the system, and they weren’t sure why it hadn’t been processed. My guess is they’re understaffed – every president since Reagan has cut their budget.

  3. People have had access to this data from Guidestar for some time, so there have been ways to get that 990 data. Plus, nonprofits MUST provide this information to anyone who asks for it. So @boundegar, it’s possible to inquire directly. If they can’t or won’t give it to you, they are out of compliance (they may direct you to Guidestar, however, and that would be OK as I understand it).

    1. Not that they can’t, or won’t. They just haven’t… yet. Hey, what’s eight months among friends?

  4. So that’s about one tax exempt organization per fifty people in the United States? Doesn’t that seem a bit high? How many of these are creative tax shelters?

    1. More like a 1:200 ratio. There are around 1.5 million exempt organizations in the US, but some of them file what is known as an e-Postcard. And, some organizations are not active and don’t file. 

      Our database spans a dozen years, and for some organizations (like Harvard or the Rockefeller Foundation or the American Petroleum Institute), we’ll have a dozen returns.

  5. What’s new and exciting here is the data feed. PDF versions of nonprofit 990s have been available for well over a decade through Guidestar and for a shorter time through many state regulators (attorneys general or secretaries of state). But downloading PDFs is only the first step in researching a nonprofit; one must then painfully enter data into a spreadsheet, check for errors (and recheck …) and do this for at least three years to find meaningful patterns. With a data feed, one could potentially download several years worth of 990s from competing nonprofits and do the sort of financial analysis that, until now, has generally been reserved for Wall Street.

  6. At first read this looks like a great initiative but actually on closer inspection looks duplicative. GuideStar is already presenting this information on the web – admittedly for a fee – and has already digitized the data – so why pay Captricity to do the job again? On the other hand – maybe some competition for GuideStar might be a good idea?
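
The per-capita arithmetic debated in comment 4 and its reply can be checked with a quick sketch. The return count (6,461,326) and the figure of roughly 1.5 million exempt organizations come from the post and the thread; the US population of about 310 million is an assumption added here for illustration, and the helper name is hypothetical.

```python
# Rough check of the "one per fifty people" vs. "1:200" figures from the thread.

US_POPULATION = 310_000_000   # ASSUMPTION: approximate US population, not from the post
TOTAL_RETURNS = 6_461_326     # returns in the bulk feed (from the post)
EXEMPT_ORGS = 1_500_000       # "around 1.5 million exempt organizations" (from the reply)


def people_per(count: int, population: int = US_POPULATION) -> float:
    """Return how many people there are per item (return or organization)."""
    return population / count


# Counting every return as a separate organization gives roughly 1 per 50 people.
people_per_return = people_per(TOTAL_RETURNS)

# Counting organizations instead gives roughly 1 per 200 people.
people_per_org = people_per(EXEMPT_ORGS)

# The gap is explained by repeat filers: the average number of returns per organization.
returns_per_org = TOTAL_RETURNS / EXEMPT_ORGS

print(f"people per return:       {people_per_return:.0f}")
print(f"people per organization: {people_per_org:.0f}")
print(f"returns per organization: {returns_per_org:.1f}")
```

The difference between the two ratios comes from counting returns rather than organizations: the database spans about a dozen years, so long-lived filers appear many times, while others file only an e-Postcard or not at all.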
