Gonzo essay on the limits of chip design

The term "gonzo journalism" gets thrown around pretty loosely, generally referring to stuff that's kind of shouty or over-the-top, but really gonzo stuff is completely, totally bananas. Case in point is James Mickens's The Slow Winter [PDF], a wonderfully lunatic account of the limitations of chip design that will almost certainly delight you as much as it did me.

I think that it used to be fun to be a hardware architect. Anything that you invented would be amazing, and the laws of physics were actively trying to help you succeed. Your friend would say, “I wish that we could predict branches more accurately,” and you’d think, “maybe we can leverage three bits of state per branch to implement a simple saturating counter,” and you’d laugh and declare that such a stupid scheme would never work, but then you’d test it and it would be 94% accurate, and the branches would wake up the next morning and read their newspapers and the headlines would say OUR WORLD HAS BEEN SET ON FIRE. You’d give your buddy a high-five and go celebrate at the bar, and then you’d think, “I wonder if we can make branch predictors even more accurate,” and the next day you’d start XOR’ing the branch’s PC address with a shift register containing the branch’s recent branching history, because in those days, you could XOR anything with anything and get something useful, and you test the new branch predictor, and now you’re up to 96% accuracy, and the branches call you on the phone and say OK, WE GET IT, YOU DO NOT LIKE BRANCHES, but the phone call goes to your voicemail because you’re too busy driving the speed boats and wearing the monocles that you purchased after your promotion at work. You go to work hung-over, and you realize that, during a drunken conference call, you told your boss that your processor has 32 registers when it only has 8, but then you realize THAT YOU CAN TOTALLY LIE ABOUT THE NUMBER OF PHYSICAL REGISTERS, and you invent a crazy hardware mapping scheme from virtual registers to physical ones, and at this point, you start seducing the spouses of the compiler team, because it’s pretty clear that compilers are a thing of the past, and the next generation of processors will run English-level pseudocode directly.
Of course, pride precedes the fall, and at some point, you realize that to implement aggressive out-of-order execution, you need to fit more transistors into the same die size, but then a material science guy pops out of a birthday cake and says YEAH WE CAN DO THAT, and by now, you’re touring with Aerosmith and throwing Matisse paintings from hotel room windows, because when you order two Matisse paintings from room service and you get three, that equation is going to be balanced. It all goes so well, and the party keeps getting better. When you retire in 2003, your face is wrinkled from all of the smiles, and even though you’ve been sued by several pedestrians who suddenly acquired rare paintings as hats, you go out on top, the master of your domain. You look at your son John, who just joined Intel, and you rest well at night, knowing that he can look forward to a pliant universe and an easy life.
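
For the curious, the two branch-prediction tricks the excerpt jokes about are real, and small enough to sketch. What follows is a toy illustration, not anything taken from Mickens's essay: the class names, the table size and the starting counter value are all my own assumptions. The first class is a saturating counter (three bits, as in the excerpt). The second XORs the branch's PC with a shift register of recent outcomes to pick a counter, which resembles the scheme usually called gshare. I've assumed a single shared history register here, where the excerpt could just as well mean one per branch.

```python
class SaturatingCounter:
    """An n-bit counter that sticks at 0 and at its maximum instead of wrapping."""

    def __init__(self, bits=3):
        self.max = (1 << bits) - 1
        self.threshold = (self.max + 1) // 2
        self.value = self.threshold  # assumption: start out weakly "taken"

    def predict(self):
        # The upper half of the range means "predict taken".
        return self.value >= self.threshold

    def update(self, taken):
        # Nudge toward the observed outcome, saturating at both ends.
        if taken:
            self.value = min(self.value + 1, self.max)
        else:
            self.value = max(self.value - 1, 0)


class XorHistoryPredictor:
    """Hypothetical gshare-style predictor: index = (PC XOR history) mod table size."""

    def __init__(self, index_bits=10, counter_bits=3):
        self.mask = (1 << index_bits) - 1
        self.history = 0  # shift register of recent outcomes, 1 = taken
        self.table = [SaturatingCounter(counter_bits)
                      for _ in range(1 << index_bits)]

    def _index(self, pc):
        return (pc ^ self.history) & self.mask

    def predict(self, pc):
        return self.table[self._index(pc)].predict()

    def update(self, pc, taken):
        # Train the counter that made the prediction, then shift in the outcome.
        self.table[self._index(pc)].update(taken)
        self.history = ((self.history << 1) | int(taken)) & self.mask


def count_hits(predictor, pc, outcomes):
    """Feed a sequence of outcomes for one branch; return how many were predicted."""
    hits = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            hits += 1
        predictor.update(pc, taken)
    return hits
```

To try it, build a repeating pattern such as `[True, True, True, False] * 50`, make an `XorHistoryPredictor()`, and pass both to `count_hits` with any made-up PC value (say `0x4004F0`). A lone counter can only follow the majority outcome of a pattern like that, while the history-indexed table gets a separate counter for each position in the pattern.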

The Slow Winter [James Mickens/Usenix]

(via JWZ)

(Image: MYK78 Clipper Chip Lowres, a Creative Commons Attribution (2.0) image from travisgoodspeed's photostream)

Notable Replies

  1. That. Was. Awesome.

  2. best/funniest thing i've read all month.

  3. This single mother earns £340 a day at home overclocking her CPU using this weird old tip...

  4. OK - puts on ex-chip designer hat .... you guys are assuming this is fiction, I lived it, while it's tongue in cheek it's pretty close to reality ... suddenly small technology advances mean that you can do something you couldn't do before - you put all of a 32-bit cpu on a die, next generation (2 years) you put the caches on too, after that it's all downhill, you're competing to add tiny amounts of architectural speed, you lean on the process guys and they pull clock speedups out of their arses for a while but as stuff gets smaller RC delays and clock skew start to kill you.

    I built 2-d graphics accelerators for macs in the early 90s - for 3 generations of designs we made things 10 times faster each time (humans seem to have a bit of a logarithmic response - they seemed 'twice as fast') then we hit the wall - drams were getting bigger, not faster - while for us screens/framebuffers weren't really getting bigger that fast - we went through the whole process in 2 years.

    In reality we're starting to run up against the atomic level, you can count the width of wires in atoms, push too many electrons too fast through them and the atoms start to move (that limits how fast you can charge up the capacitance of gates and nets and build reliable hardware), gates are atoms thick, they start to get leaky - IMHO the whole Moore's law thing has played out, at least for traditional silicon processes - we need something different, that's stable at the atomic level (carbon nanotubes?) to grab those last few generations worth of performance - it's going to take a while
