What would it cost to store all of America's phone calls?

The Internet Archive's Brewster Kahle has done the math on building a data center that could hold all of America's voice calls, and concluded that it wouldn't quite fit within the $20M price tag reported for Prism, though it's not far off.

These estimates show only $27M in capital cost and $2M in electricity, and take less than 5,000 square feet of space to store and process all US phone calls made in a year. The NSA seems to be spending $1.7 billion on a 100k square foot datacenter that could easily handle this and much, much more. Therefore, money and technology would not hold back such a project; it would be held back if someone did not have the opportunity or will.

Another study concluded about 4x my data estimates; others have suggested the data could be compressed 10:1, and that the power bill would be lower in Utah.
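To see how much those two adjustments move the totals, here is a rough sketch in Python. It takes the $27M, $2M and 5,000 square foot figures quoted above as the baseline and assumes that cost, power and floor space all scale linearly with the amount of data stored. That linear scaling is my simplification, not something taken from Kahle's spreadsheet, and the function and scenario names are made up for illustration. The cheaper Utah power is not modeled.

```python
# Back-of-envelope scaling of the figures quoted above.
# Assumption (not from the spreadsheet): everything scales linearly with data volume.

BASE_CAPITAL_USD = 27e6   # capital cost quoted above
BASE_POWER_USD = 2e6      # electricity cost quoted above
BASE_FLOOR_SQFT = 5_000   # "less than 5,000 square feet", used as an upper bound


def scale_estimate(data_multiplier=1.0, compression_ratio=1.0):
    """Scale the baseline by a data multiplier and a compression ratio.

    data_multiplier:   how much more data than the baseline (e.g. 4 for "4x")
    compression_ratio: how much the data shrinks (e.g. 10 for "10:1")
    """
    factor = data_multiplier / compression_ratio
    return {
        "capital_usd": BASE_CAPITAL_USD * factor,
        "power_usd": BASE_POWER_USD * factor,
        "floor_sqft": BASE_FLOOR_SQFT * factor,
    }


# Hypothetical scenario labels; the multipliers come from the paragraph above.
scenarios = {
    "baseline": (1, 1),
    "4x the data": (4, 1),
    "10:1 compression": (1, 10),
    "4x the data, 10:1 compression": (4, 10),
}

for name, (multiplier, ratio) in scenarios.items():
    est = scale_estimate(multiplier, ratio)
    print(
        f"{name}: ${est['capital_usd'] / 1e6:.1f}M capital, "
        f"${est['power_usd'] / 1e6:.2f}M power, "
        f"{est['floor_sqft']:,.0f} sq ft"
    )
```

Under these assumptions the two corrections pull in opposite directions, so the order of magnitude of the estimate holds up either way.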

Here's a shared spreadsheet with Kahle's calculations.

Cost to Store All US Phonecalls Made in a Year in Cloud Storage so it could be Datamined


  1. ok, if recording all US calls comes in  at 30 mil, but they are building out a data center at 1.7 billion, that’s 60x the cost.  That means 60x the data.  300 million people times 60 is 18 billion.  Or, it’s 6 billion people, but 3x more data per person…  Their plans are quite a bit bigger than we are thinking…  Moar liek W0rLd D0m1n8t10n

    1. With the US population as the test case (English speaking), rather than a legally available cohort somewhere out in the world as the test case…  no matter how you slice it, they are operating extralegally.

  2. Ok, so… I would like to make about 30 billion phone calls, and say, “fuck you, government spies, douchebags and snoops, go mind your own business”.  Spy on this assholes. 

    How do I do that again? 

  3. The question I ask is: is that the cost of storing the content of all calls? Or of simply prescreening for ‘keywords’ and storing those voice calls, instead of “pick up milk” type calls and to/from calls?

    Or is it simply storing the ‘to/from’ of each call, without content unless the content raised a keyword flag?

    1. Want to be safe from NSA spying on your phone calls? 
      Just learn to talk like my brother-in-law…  Google voice typically only gets about 2 words correct in its transcription of his messages.

  4. Don’t forget that the reported budget might not equal the actual budget, and that there’s all sorts of “dark” money floating around in the security-industrial-complex appropriations.

  5. Brewster Kahle’s a pretty smart guy, but he doesn’t seem to have the first idea how defense procurement works.  $100,000 a petabyte!  That’s adorable.  Do you think they’ll just contract with Amazon?

    There’s so much wrong with the coverage of PRISM on BB.  Maybe if Cory sees what I wrote at https://plus.google.com/u/0/106481679227287369472/posts/Lh3GzYgPd9C he’ll address it.

  6.  … they’re not recording the audio. They’re doing real-time dumping to text and storing that. Been doing that since the late eighties. Schneier covered it on his blog ten years ago.

  7. Voice data can’t simply be “compressed 10:1” (barring tricks like converting it to text.)  Uncompressed telco voice is 64 kbps per direction, and you can use GSM codecs at speeds like 6.5 kbps, but his estimate started with 30 kbps Skype (which is already compressed; it’s probably some codec like iLBC at around 8-10 kbps, padded by the UDP, IP, and Ethernet headers you need to turn voice into VOIP.)

  8. Audio is usually captured at 16 bits/sample, rather than 1 byte/sample, so, to store all this audio uncompressed, you would need to double all these totals.

  9. *Giggle*  this assumes that you actually do this with a concern about costs.
    It’s a Government handout that no one dare challenge, else they be branded a terrorist, and we think they would stay on budget?

  10. If we have this excess computing power, why not store all disparate research on cancer and see if anything being done in one place might provide answers for work being done elsewhere. When you run the cost of lives saved, I think my way would win, hands down. 

  11. Drive by Ft. Gordon in Georgia and you’ll see the literal tip of the iceberg. They have a 100k sqft, 5-story underground facility for “overseas” communication monitoring.

    It’s been in place for five years. To assume the tools aren’t already in place is just plain silly.

  12. This assumes that the only data they are interested in and will access is telephone audio data. I’m pretty sure this is a bad assumption.

  13. Not to mention:  what are they going to DO with all that data?  Find a terrorist six months after the crime is committed?  That has been my question ever since this story broke:  they COLLECT all that data but can’t possibly, competently sift through it all.  Oh, wait:  private contractors.  Now THERE’S a good idea.

Comments are closed.