Open design for a 67 TB array for $7,867

Chris sez, "Online backup startup Backblaze, disgusted with the outrageously overpriced offerings from EMC, NetApp and the like, has released an open-source hardware design showing you how to build a 4U, rack-mounted, RAID-capable, Linux-based server from commodity parts that holds 67 terabytes at a material cost of $7,867. It's open-source hardware! Their blog states: 'Our hope is that by sharing, others can benefit and, ultimately, refine this concept and send improvements back to us.'"

Petabytes on a budget: How to build cheap cloud storage (Thanks, Chris!)



  1. I read “Backblaze iPod” and wondered just how much music a person would need to carry around with them. You could fit every movie ever made in 67 terabytes.

  2. I’d rather spring for a Coraid box – generic SATA drives, Supermicro chassis, and an ultra-fast Plan 9-based OS running fast, tight AOE instead of bloated, slow iSCSI, compatible with Linux, BSD, Windows, Mac… if you can write simple code you can build a GFS-based high-availability shared storage cluster across multiple OSes and sites… already built and tested for you.

  3. Uh, did you guys even read the post? Backblaze doesn’t sell this unit, they designed it.

    Backblaze is a low-cost ($5/month) unlimited online backup service. The point of the post was that they were able to cut the price of scalable drive space to $113,000 per petabyte by building these 67 TB pods.

  4. Anon3, aren’t Coraid boxes like an order of magnitude more expensive? Like it costs the same for a 6 TB RAID as for this 67 TB one?

  5. I agree, #3. Any Romulan NTSC that crosses the streams in a dilithium magneto-array won’t generate nearly the 1.21 gigawatts necessary to compete with AT-AT nacelle Bussard collectors in an OCP ED-209 configuration. You’d be better off sticking with your WOPR running bluetooth!

  6. DCulber, only if your time costs nothing. I buy the coraids empty and load ’em up with drives; it takes roughly ten minutes to have ’em racked up and ready for initializing. No software building, no assembly other than sticking SATA drives in slots.

    (disclaimer: I do not work for coraid, but I wrote some of the simpler FOSS code that people use with their coraids)

    I don’t want to take away from these guys’ accomplishment – it’s very nice work! – but for my purposes coraid products are quite cost effective.

    AOE is also a hell of a lot easier to work with than a raw HTTPS socket when you want to build a filesystem; I can init the drive with GFS for example without writing an HTTPS driver layer. That advantage might be obviated after these guys publish the software side of their solution, though.

    Nice, Day Vexx! Sorry about the geekologue, but you would find a three-page dissertation on the relationship between SATA hardware duty cycles and how that manifests in RAID volumes even worse…

  7. I thoroughly enjoyed this. I am a network engineer, but so far I have only walked past the SANs. I don’t know what else EMC and the others bring to the table (our networks used EMC at my old job), but a big kudos to Backblaze for making this work, then making it public. Truly open source. I wouldn’t even expect them to do the same with the code; that is something they genuinely need to keep secret. But this is a neat way to open up to customers, and maybe gain a few through the publicity.

  8. Bummer they don’t sell that as a kit. I’d love to put one of those on our network and ditch some of our EMC junk.

  9. I’ve always found the EMC and NetApp stuff to be way overpriced, but they are easy to implement if your company is willing to shell out the cash. For backup storage, where speed is not an issue, the home-grown route is the way to go over high-priced solutions.

  10. I can’t wait to revisit this post in 8-10 years, when 67 TB is barely enough storage for a preschooler’s Photoshop CS29 documents.

    I’m old enough to remember my first 286 with a 20 megabyte hard drive, the vast size of which inspired jealousy in friends. This is, what, 3.35 million times more storage?
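
    A quick back-of-the-envelope check of that ratio, using decimal units (a hypothetical one-liner, not anything from the post):

    ```python
    # 67 TB pod vs. a 20 MB hard drive, in decimal units
    pod_bytes = 67 * 10**12        # 67 TB
    old_drive_bytes = 20 * 10**6   # 20 MB
    print(pod_bytes / old_drive_bytes)  # 3350000.0 -> about 3.35 million times
    ```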

  11. What happens when a power supply pops and takes out 4 or 5 of the backplanes? You’ll have dead volumes for sure.

  12. It puzzles me why they chose to use JFS for this when far superior filesystems like ZFS exist and would work perfectly for this storage array. Pretty neat box though.

  13. Let’s not get too excited. True, it’s not that hard to stick a bunch of drives in a box, install Linux, and start serving up NFS shares.

    But enterprise storage, like what NetApp, EMC, HDS, HP, IBM, et al. make, is substantially different from its mere hardware components. For one, those systems are highly available. They have redundancy built into every piece of the pie. They offer DR/replication, application-aware snapshots, and deduplication. Purpose-built operating systems to manage everything. And thousands of engineers who work to ensure compatibility and resiliency.

    Cheap boxes like this may be great for some things. But if you have the business requirements that justify talking to one of the top tier Enterprise vendors, cheap DIY things like this just aren’t part of the conversation. It’s a false dichotomy.

    *I work for NetApp, so clearly I’m biased. :)

  14. One little problem…

    the Seagate ST31500341AS 1.5 TB Barracuda hard drives they used are prone to failure. They also freeze; combine that with a botched firmware release and there is a reason these drives have a 26% one-star rating on

  15. @15 If you read the full blog post, you’ll see that all the deduplication, redundancy, etc. is handled at a higher level. Each data server is “just” 3 x 23 TB data banks. Most likely they have front-end servers that handle client requests, pulling the data from one or more of these backend data banks as needed. I assume that any one piece of data is stored on two or more data servers.
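
    As an illustration of that kind of higher-level redundancy, here is a minimal sketch of placing each object on two pods by hashing its key; the pod names and placement policy are hypothetical, not Backblaze's actual design:

    ```python
    import hashlib

    PODS = ["pod-01", "pod-02", "pod-03", "pod-04"]  # hypothetical backend data servers

    def choose_pods(object_key, copies=2):
        """Pick `copies` distinct pods for an object by hashing its key."""
        digest = int(hashlib.sha1(object_key.encode()).hexdigest(), 16)
        start = digest % len(PODS)
        return [PODS[(start + i) % len(PODS)] for i in range(copies)]

    print(choose_pods("customer42/backup/2009-09-02.tar"))  # e.g. ['pod-03', 'pod-04']
    ```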

  16. @Cory – what’s funny about Day Vexx’s post? Following his post I’ve just put the finishing touches on my personal time machine warp gate. See you in the next quadrant!

  17. If you want to make this more Enterprise-like:

    Replace 4 drives with good SSDs for a backend FS write log (+$2,800). 8x the memory for read cache (+$400). Maybe a 10G card for replication (+$400). Buy two and mirror them using OSS cluster software (cost x2). Final cost is around $22k for redundant 61 TB that should push well over 100K IOPS (rough tally below). What is the price range for that redundancy and performance from NetApp or EMC?

    To complete the package, spend some time writing scripts to quiesce your db/exchange/other when taking snaps, write a nice BUI frontend to manage it, and become intimately familiar with the OS for when things go wrong.
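
    A rough tally of those numbers, taking the $7,867 pod cost from the post and the add-on prices above at face value (estimates, not quotes):

    ```python
    base_pod  = 7867   # material cost of one 67 TB pod, from the post
    ssd_log   = 2800   # four drives swapped for SSDs (write log)
    extra_ram = 400    # 8x memory for read cache
    nic_10g   = 400    # 10G card for replication

    one_node = base_pod + ssd_log + extra_ram + nic_10g   # 11,467
    pair     = 2 * one_node                               # 22,934 -> "around $22k"

    usable_tb = (45 - 4) * 1.5   # 41 remaining 1.5 TB drives -> 61.5 TB raw
    print(pair, usable_tb)
    ```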

  18. Besides which, everyone knows that AT-AT nacelles can’t be used in the OCP ED-209 configuration! They end up too degaussed.

  19. That looks a lot like a DIY version of our DDN (DataDirect Networks) box at work, minus the redundancy, fiber channel, and a LOT of the cost.

    For what it’s worth, heat is not a problem in the DDN, even though it’s got 60 drives per drawer (4U of rackspace).

  20. This is pretty cool, but it’s not enterprise-class storage by any means.

    One thing that immediately comes to mind is hot-swapping drives. That’s key to uptime. Commercial systems give you redundant, hot-swappable power supplies as well.

    Not to mention dual port drives with redundant controllers to keep the drives available if a controller fails.

    I have a home-built RAID system and I’ve been working in the RAID storage industry since 1999. There’s a big difference between this and enterprise storage; whether those differences are worth the premiums charged is debatable.

  21. If your operations costs (people fixing boxes that go down) are typical, and your cost per unit of downtime is typical, this box is not a good solution.

    Actually, redundant power supplies, hot swap, etc. all matter. Enough cooling airflow space around the drives matters too (there’s a reason the commercial top-access 4U disk boxes tend to have 42 drives, not 45 – leaving one row out gives much better airflow).

    I salute what they accomplished, but they’re going to be paying people a lot to maintain these going forward.
