Blinded by the Light
A probe hurtling towards Mercury stops talking. A member of the Messenger spacecraft team, Chris Krupiarz, recalls the fingernail-biting moments.
The probe sails through space, traveling the distance of the LA-to-Chicago red-eye in a minute. My fellow engineers and scientists who built the spacecraft are now some 99 million miles distant on a blue-tinged point of light. We call the probe Messenger.
This name identifies its scientific mission — MESSENGER stands for MErcury Surface, Space ENvironment, GEochemistry, and Ranging — and acknowledges its destination’s eponym, the Roman god of communication and much more. Operating beyond Earth’s atmosphere and without an engineering requirement for beauty, the spacecraft lacks the smooth lines of a Newey-designed F1 car slicing through the Mediterranean air of Monaco, or the splendor of Strauss’s Golden Gate Bridge against a gloomy June sky in San Francisco.
Messenger instead resembles a Lego model of nondescript blocks connected by a subway map of wires and cables. The spacecraft is designed for a singular purpose: to be the first probe to orbit Mercury.
We sent Messenger on its way August 3, 2004. Until Messenger completes its mission, the spacecraft is vulnerable to its surroundings. One bump along its journey remains a vivid reminder.
Deep space none
Traveling with Messenger to the planet closest to the Sun would have been a nightmare vacation. The trip takes years and then the weather on arrival is Phoenix-hot or McMurdo-cold, only far more severe.
For the craft, Mercury’s heat is a bigger problem than the cold. Messenger’s survival depends on an eight-foot by six-and-a-half-foot sunshade providing an SPF in the hundreds of thousands to protect against 700° F temperatures. The sun’s rays will eventually scorch the shade’s white ceramic-cloth surface to burnt-toast black, while behind the shield it stays room temperature.
The project scientists’ stand-ins hide within the shade: a suite of instruments ranging from the imager — the glamour photographer whose prints will grace the covers of Science and Discover — to the introvert of the group, the magnetometer. The magnetometer must stand apart from the crowd, attached to the end of a 12-foot boom, thus avoiding the chatter of the spacecraft’s own magnetic field. Twin solar panels bring electricity to the party; mirrors cover the panels to limit temperatures to a roast-ready 300° F.
During lulls in its internal chatter, Messenger sends instrument and spacecraft data to the three Deep Space Network stations on Earth: one over the hill from the Murrumbidgee River near Canberra, another just off Calle Blas de Otero in the mountains west of Madrid, and the third in middle-of-nowhere California, northeast on the I-15 from LA.
When Messenger lifted off from Cape Canaveral in 2004, it was a rhino’s weight, with much of its heft from fuel. It has been on a diet since, losing mass each time a thruster burps puffs of skin-burning hydrazine or the main engine fires to change course. An on-board computer directs these actions and others via a shopping list of commands sent from Earth. The cost and the limits of physics in fighting the harsh environment of space have slowed space processor evolution: the latest iPhone runs nearly 100 times faster than the Messenger computer.
I know that computer well: I’m responsible for the flight software that’s controlling the spacecraft during this mission phase. And as I sit in my office at that moment, I hope that computer is still in one piece. I just received an email from Eric Finnegan, who, in his role as the Messenger mission systems engineer, is the leader of the spacecraft team. He has sent this message to all the team members:
Subject: MSGR: Spontaneous Loss of Signal today during DSS-63
Andy just informed me that we lost signal from Messenger today during the DSS-63 contact. They are tracing the ground system now. I am heading over to the MOC to get more situational awareness. Please join me as soon as possible.
From the email, I know that Messenger had been talking with one of the Deep Space Network stations in Madrid, designated DSS-63. These periods of communication, called passes, are typically the length of a workday, about eight hours. Somewhere in the middle of that pass, Messenger had gone silent. Unless commanded to be quiet, spacecraft do not stop chatting: they constantly blabber like one of my sons after his first day of school. Eric calls us all to the mission operations center. Something is wrong.
I send my wife an email that begins, “It’s going to be a long day…” and start to pack my computer bag for the walk to mission operations. Just before I shut down my laptop, I receive one last email, this one from those at the Deep Space Network:
FLASH 1: Messenger in Spacecraft Emergency
Initial Incident Report:
On DOY 244/1633z, the Messenger Project declared Spacecraft Emergency. They stated lost contact with the spacecraft at 142640z.
We need to find out what happened.
This isn’t my first rodeo. The first spacecraft I worked on failed upon launch; the second failed in orbit. I don’t need a third strike. But at my workplace, the Johns Hopkins University Applied Physics Laboratory (APL), I’m surrounded by scientists and engineers who devote themselves to figuring out hard problems. As I descend the stairs on my way out of my home base, Building 7, I pass a sign that reminds me of our wonkiness and precision: “Down to First Floor for Exit Discharge.”
Deep space missions are marathons. The mission team, led by the principal investigator from the Carnegie Institution of Washington, Sean Solomon, twice proposed the mission to NASA before it was accepted. Once approved, Messenger took four years to build, and at the time I take this walk, it has two years left on its trip to Mercury.
On arriving at Building 13, which houses the mission control center, I swipe my badge on the security reader and enter. Along the stem of the L-shaped room sit computer terminals for the instrument teams, as well as the propulsion, power, navigation, communication, and flight software subsystems. In the front of the room, what would be the base of the L, sit autonomy, guidance and control, the mission systems engineer, the mission operations manager, and the flight controllers. Off to the side is a room with glass walls, where, during significant events, upper management nervously waits to see whether the next day’s press release will be a positive one.
Standing near the flight controllers, staring at dual data screens on the wall, are Eric Finnegan and Andy Calloway. Eric is lean and energetic, with dark hair, dark eyes, and a perpetual five-o’clock shadow. He is wearing a white shirt and tie and has his ever-present Blackberry on his belt. Andy, the head of operations, is compact and tennis-player fit. An APL security badge hangs around his neck and he is wearing a polo shirt and jeans — the closest that Lab employees come to a uniform.
“What’s the status?” I ask.
“We lost contact about 45 minutes ago,” says Andy. “We were loading a series of macros to EEPROM when the signal just vanished.”
EEPROM is a predecessor to the flash storage used in SD camera cards. As with flash memory, EEPROM data remains permanent once written, even without power. Macros are a series of commands designed to be executed in sequence. There’s a clue in there, but I need more information.
“Were we doing compression? Downlinking files? Collecting images?”
“No,” replies Andy.
This is a good sign. Those three software functions have caused issues in the past. I thank Andy for the information and then sit down next to Adrian Hill, the autonomy lead for Messenger and a calming influence on the team. He is also an NFL on-field official. Perhaps having Bill Belichick yelling in his ear builds antibodies for stress.
I ask Adrian, “They weren’t doing much other than loading macros to memory. Could it be a multi-bit error?”
Adrian says he was thinking the same thing. Space is unkind to electronics. The sun, distant supernovae, and other sources launch particles that drill through these components. These subatomic bullets may damage computer memory by flipping a bit from a one to a zero or vice versa.[1] Engineers have added a little math to computer memory that can automatically flip the bit back to the proper value.
However, if this happens more than once in a single computer word (the number of bits that a processor uses as its native basic unit), we call it a multi-bit error: the computer is not smart enough to determine which of the bits have flipped. To avoid reading corrupted data and making a bad decision, the computer reboots itself. This is an annoyance, but the computer would be doing what it is designed to do to fix the problem.
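For readers who want to see the "little math" in action, here is a minimal sketch of one common approach, an extended Hamming code that corrects a single flipped bit and detects a double flip. The article does not say which scheme Messenger's memory uses, so the 8-bit toy word, the function names `encode` and `decode`, and the status strings are all assumptions made for illustration.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit codeword (toy extended Hamming code)."""
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    code = [0] * 8                    # index 0 holds the overall parity bit
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]      # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]      # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]      # parity over positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2                # makes the total number of ones even
    return code


def decode(code):
    """Return (status, nibble); nibble is None when the word cannot be trusted."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos           # XOR of the positions that hold a one
    odd_parity = sum(code) % 2        # 1 means an odd number of bits flipped
    if syndrome == 0 and odd_parity == 0:
        status = "ok"
    elif odd_parity == 1:
        code[syndrome] ^= 1           # single flip: the syndrome names the bad bit
        status = "corrected"
    else:
        # Parity looks even but the syndrome is nonzero: two bits flipped,
        # and the code cannot tell which ones.
        return "multi-bit error", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


word = encode(0b1011)
one_flip = list(word)
one_flip[5] ^= 1
two_flips = list(one_flip)
two_flips[6] ^= 1
print(decode(word), decode(one_flip), decode(two_flips))
```

In this toy version the double flip simply returns an error status; on a real spacecraft the response to that status is a design decision, which in the story above is a reboot.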
We tell Eric and Andy our theory, although I am initially hesitant. Experience has shown that the more positive I am that I am right, the more likely it is I am wrong. I add the caveat that we will not know for sure until the spacecraft starts talking again.
I head across the room to my computer station. Already there at the adjoining terminal is the communication lead, Karl Fielhauer. A gray-haired, gregarious man, Karl is the type of guy who quickly lets you know where he stands. When I arrived in the mission operations center during a previous emergency, he summarized the situation with a succinct “we’re screwed.”
Today the conversation quickly diverts to our favorite subject: the Detroit Tigers. Despite the team being three and a half games up on the Twins and in first place in the American League Central, we still complain. We are both from Michigan and are lifelong Tiger fans; history dictates that a season implosion is likely just a few bizarre plays away.
Karl is relaxed because he has looked at his data and noticed that the signal from Messenger did not immediately drop away. Instead, it had slowly lost strength. Andy, Eric, and Karl conclude that Messenger is probably looking for us, like a child lost in the forest calling out for Mom and Dad.
Most spacecraft have multiple modes; we like to see Messenger in operational mode. That means the spacecraft is happily collecting data and all systems are working fine. Instead, Messenger may be in Earth-acquisition mode. The spacecraft would still be alive, slowly rotating to scan the sky while calling to its family on Earth. If that’s true, Messenger will take about three hours to rotate once and find us.
I log in to my terminal and send an email to Annette Mirantes, a colleague on the software team who recently moved overseas. For her, it is late afternoon in Gijón on the Bay of Biscay in northern Spain. Despite supposedly knowing much about computers, I find it magic that I can type on a laptop in Maryland and have that information almost immediately show up on the screen in Annette’s house in Europe. I summarize the situation and we determine what to examine once we get data back from Messenger. Annette replies:
If it's a multi bit error, we'll see it in 406:
HSK_MULTIBIT_ERR_TT will be TRUE
HSK_MEM_MULTI_BIT_ERR_CNT will be > 0
Annette is referring to data packet 406 and two data points within. A packet is like an envelope. Inside the envelope is a list of data points and their current values. Data packets are grouped in a spacecraft version of the Dewey Decimal System. The 400 to 499 range is spacecraft data; the 406 packet contains diagnostic data. If the data point HSK_MULTIBIT_ERR_TT is true, it will confirm our theory of a multi-bit error. This will still require a spacecraft recovery, but I can head home and monitor the data from there. If the value is false, I will soon wish mission operations had extra cots.
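The check Annette describes can be sketched in a few lines. This is only an illustration: it assumes packet 406 has already been decoded into a Python dictionary of data points, which is not how the article says the ground system presents it, and the function name `diagnose` and its return strings are hypothetical. Only the two data point names come from her email.

```python
def diagnose(packet_406):
    """Classify a decoded packet 406 (hypothetical dict of data point -> value)."""
    flag = packet_406.get("HSK_MULTIBIT_ERR_TT")
    count = packet_406.get("HSK_MEM_MULTI_BIT_ERR_CNT")
    if flag is None or count is None:
        return "no data yet"                  # one of the data points has not arrived
    if flag is True and count > 0:
        return "multi-bit error confirmed"    # both conditions from the email hold
    return "not a multi-bit error"            # keep looking for another cause


print(diagnose({"HSK_MULTIBIT_ERR_TT": True, "HSK_MEM_MULTI_BIT_ERR_CNT": 1}))
```

Requiring both values before drawing a conclusion mirrors the caveat in the story: nothing is known for sure until the spacecraft starts talking again.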
Spinning round and round
As the three-hour mark approaches, the chatter in the operations center stops. We gather at the front of the room to watch a screen showing a graph that looks like an EKG. The display represents the signal strength from Messenger. The signal remains flatlined at zero until, slowly, the values begin to inch up. The spacecraft is alive, but we do not know how sick it is.
The next step is to tell the spacecraft to stop rotating. To do this, mission operations must command the spacecraft at the precise moment on a subsequent rotation that it again comes into view of the Deep Space Network antennas. Performing this operation is like timing the windmill shot at the Rocky Gorge mini-golf course, except at a distance of 99 million miles.
But, given the skill of the mission operations team, they make the putt and we eventually see a steady stream of communication from Messenger. The data values we need are screaming through space at the speed of light, but at this distance they take eight minutes to arrive.
Andy calls out, “DSN reports the first frame is in. You should be seeing data in about 12 minutes.”
The additional delay is due to the downlink rate. Because Messenger is in an emergency mode, the spacecraft is sending data at only 10 bits per second, a millionth the rate of home broadband.
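The two delays can be checked with back-of-the-envelope arithmetic. The sketch below uses only numbers quoted in this story (99 million miles, 10 bits per second, about 12 minutes) plus a rounded speed of light; the function names are made up for illustration, and the story does not state how large a 406 packet actually is.

```python
SPEED_OF_LIGHT_MILES_PER_SEC = 186_282  # rounded value


def light_time_minutes(distance_miles):
    """One-way travel time for a radio signal, in minutes."""
    return distance_miles / SPEED_OF_LIGHT_MILES_PER_SEC / 60


def bits_delivered(rate_bits_per_sec, minutes):
    """How many bits arrive at a given downlink rate in the given time."""
    return rate_bits_per_sec * minutes * 60


print(round(light_time_minutes(99e6), 1))  # one-way light time in minutes
print(bits_delivered(10, 12))              # bits received in 12 minutes at 10 bps
print(10 * 1_000_000)                      # "a millionth" implies broadband in bps
```

With these rounded inputs the light time comes out a little under nine minutes, in the same ballpark as the figure quoted above, and 12 minutes at the emergency rate carries only 7,200 bits, or 900 bytes.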
“We have a 406 packet in the pipeline,” says Andy. That is the packet we need.
I wait. The spacecraft data points are displayed purple, indicating that the information is the same we saw many hours ago. Once they switch to green, I will know we have new data. I eye the screen. The key data point I am watching goes green.
HSK_MULTIBIT_ERR_TT = TRUE
It is a multi-bit error.
I email my wife: “I’ll be home for dinner.”
Messenger arrived safely in orbit on St. Patrick’s Day, 2011. Since then, the spacecraft has sent back the first complete map of Mercury, confirmed water ice in shadowed craters, and discovered an abundance of sulfur on the surface. Once Messenger’s fuel is depleted, Mercury’s gravity will slowly disrupt the probe’s orbit, lowering the spacecraft until it eventually hits the planet.
Photo credits: Artist’s rendering of Messenger, NASA/The Johns Hopkins University Applied Physics Laboratory/Carnegie Institution of Washington. Other photos from before launch via NASA. From top to bottom: Messenger's two solar arrays are inspected (photo KSC-04PD-1328); a worker checks wiring on the probe (photo KSC-04PD-1327); and the craft is prepared for launch (photo KSC-04PD-0443).
[1] A bit flip has been in the news twice in recent years. In 2010, Voyager 2’s flight data system computer experienced such a flip, which engineers were able to fix. In March 2013, Curiosity had a glitch that may have been caused by cosmic radiation. Redundancy in computers is a key to recovery.
Chris Krupiarz works as a spacecraft flight software engineer for the Johns Hopkins University Applied Physics Laboratory. Originally from Michigan, he now lives in Ellicott City, Maryland, with his wife and two kids. In his spare time he enjoys reading and writing, walking in the woods with the family beagles, and creating fictional sporting events with his sons.
You can read more articles of this sort in The Magazine, which has a seven-day free trial, and costs $1.99 a month for two issues (10 or more articles) and $19.99 a year for 26 issues (130+). Subscribe via the Web or its iOS app.