Why are the data-formats in Star Wars such an awful mess? Because filmmakers make movies about filmmaking

Sarah Jeong's long, terrifyingly thorough analysis of the data-formats in the Star Wars universe is both hilarious and insightful, and illustrates the difference between the burgeoning technological realism of shows like Mr Robot and the long tradition in science fiction media of treating computers as plot devices, rather than things that audiences are familiar with.

Data formats are central to the Star Wars stories — the first movie and the most recent one especially — but there is precious little coherence in the implied technological underpinnings of these systems. That's not unique; all the technology in Star Wars is composed for its visual/acoustic interest, not its coherence (why doesn't R2D2 have a voice chip?). But given the centrality of data-handling and IT to the intrigue in the story, this incoherence looms larger than the rest of the technology dog's breakfast — imagine if, at the end of the Maltese Falcon, we learned that the statue wasn't just important, it could also grow to the size of a planet and was responsible for unpredictable episodes of mass spoon-bending, all without any explanation.


Jeong's piece is excellent and full of yucks. As thorough as it is, I'd like to propose another lens through which to view the franchise's handling of data formats — as a funhouse mirror for the history of the lived experience of the film industry with data.

In the mid 1970s, movies and computers were firmly separated. The VFX artists and directors of the day would have had limited experience of data handling, mostly through advertising content from the likes of Sperry Rand and from one another's cinematic interpretations of same. In other words, they were just making it up.

Gradually, and then in a rush, movies and computers came together. Data-storage is one of the steepest technology curves: the cost of storing data is in freefall, as are the bulkiness and unreliability of media. From floppies to hard drives to SSDs to RAIDs to cloud-based distributed filesystems, storing data is one of the most reliably advancing dimensions of technology.


But we have a persistent myth of the fragility of data-formats: think of the oft-repeated saw that books are more reliable than computers because old floppy disks and Zip cartridges are crumbling and no one can find a drive to read them with anymore. It's true that media goes corrupt and also true that old hardware is hard to find and hard to rehabilitate, but the problem of old floppies and Zips is one of the awkward adolescence of storage: a moment at which hard-drives and the systems that managed them were growing more slowly than the rate at which we were acquiring data.


In those days, even aggressive technology updaters (people who bought a new computer every year, say) would likely migrate to new systems that lacked the hard drive space to store all their files. Those people would necessarily store much of their data on fragile and obscure media (floppies, tapes, Zips) and, often as not, discover down the road that some of their data had succumbed to bitrot.

Gradually, computer drives caught up with even very aggressive users. 1TB SSDs got cheap, then cheaper, and then cheaper and smaller — and laptops started sporting multiple 7mm/2.5" drive bays, and the drives that could fit in those bays swelled to 2TB and beyond. At the same time, file-lockers and streaming services took much of the bulkiest data out of our laptops and moved it to the cloud. The price of mechanical drives plummeted in the face of competition from cheap, reliable and fast SSDs, bequeathing careful users a welter of cheap local backup options.

Drives in motion are much less susceptible to bitrot than drives on a shelf. The drive controllers are constantly scanning for and marking off bad sectors, copying endangered bytes to good sectors before they're lost. What's more, the rate of change in drive interfaces is much slower than for removable media: the eventual extinction of 2.5" drive ribbon-cable interfaces will be preceded by many years of co-existence with competing storage interfaces, and those interfaces will support more capacious drives, and the upgrade path from 2.5" to its successors will entail the quick transfer of the contents of your 2.5" drives to your new drive and the computer it rode in on. There will be outliers (someone drops dead suddenly and their laptop is tied up in probate for a decade before anyone can try to recover their data), but for most of us, the destiny of our data will be to move from live, self-healing media to live, self-healing media, without any time at rest in near-line or offline storage, the home of bitrot.

But this isn't the experience of the film industry: it is one of the few industries for which data is still a hard-to-solve problem. High resolution cameras with high framerates capture so much data, often on location, that the bulk of it is inevitably offline for some or all of the production process. And even the biggest internal servers struggle to store all the intermediate data of a finished film (all the test-renders and uncompressed raw footage and so on), meaning that filmmakers often depend on getting drives out of vaults and plugged into a server somewhere. The distributed nature of film-work, from location shooting to VFX partners in other cities or other continents, means that filmmakers routinely struggle with long file-transfer lags that the rest of us have been largely spared for the past decade or so.

It's not a coincidence that there are a lot of novels written about the frustrations of novelists, nor is it a coincidence that filmmakers' tales of dramatic technological struggle look a lot like the kinds of problem-solving that filmmakers go through all day long.


"But Sarah," you might say, "surely there's a difference between a restricted military archive and the Republic archive in Episode II." Are you trying to tell me that Obi-Wan, a leading member of a mystical paramilitary law enforcement organization, just flounces off to the public library while investigating an assassination attempt made on a sitting Senator? Doubtful. And don't try to tell me he doesn't have the security clearances to access a top secret archive similar to the Scarif facility. The Episode II archive is totally a CIA library, and while it is apparently run by a bunch of bozos who can't stop one lone Jedi from deleting an entire star system from its records, it makes the Scarif facility in comparison look like a deserted Blockbuster Video in the year 2016.

Why must the Death Star plans be stored on a data tape the size of four iPads stacked on top each other? Obi-Wan can carry a map of the entire galaxy in a glowing marble, and at the end of Episode II, Count Dooku absconds with a thumb drive or something that contains the Death Star plans.



From Tape Drives to Memory Orbs, the Data Formats of Star Wars Suck (Spoilers)
[Sarah Jeong/Motherboard]