Sep 19 2008
Single point failures….
On NASAWatch, which is not a site I recommend, they noted that loss of a single component brought down the entire NASA email system (this just after they were down for nearly a week because the server goes through JSC only, so the hurricane affected all NASA mail). There are few things, technologically, more irksome, more certain to fail, and more embarrassing when they do so, than single point failures.
Failure tolerance is a good thing. It’s the best defense there is against Murphy’s law. At one time, NASA strived to meet the principle of Fail Operational, Fail Safe. That meant that any single failure did not affect function (or at least not critical function). A second failure might leave the hardware nonfunctional, but it would leave it safe (though if the function is critical, of course, nonfunctional is not safe). In other words, it took at least three failures to reach an unsafe condition.
Now, of course, that isn’t always possible. A nuclear reactor containment vessel is unlikely to be redundant. Extra wings on a plane in case one shears off are likely less than useful. Sometimes adding redundancy and additional paths adds so much complexity, weight, etc. that the design becomes ridiculously unwieldy.
But, in general, it’s a good thing. If 12 bolts can bear all the stress necessary, adding one or two extra bolts allows for failure without compromising the overall capability. (This is especially important in space endeavors, where a stripped or broken bolt may not be recoverable.) If one can accommodate redundancy, especially with unreliable equipment, it’s smart engineering practice to do so. This is even more true when hazards can result from failure.
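To put rough numbers on the bolt example, here is a minimal sketch in Python. It assumes every bolt fails independently with the same probability, which real hardware rarely honors, and the 1% failure rate is made up purely for illustration. The function name `survival_probability` is mine, not anything standard.

```python
from math import comb


def survival_probability(installed: int, required: int, p_fail: float) -> float:
    """Chance that at least `required` of `installed` bolts hold.

    Assumes each bolt fails independently with probability `p_fail`
    (an illustrative simplification, not a structural analysis).
    """
    p_ok = 1.0 - p_fail
    # Sum the binomial probabilities for every acceptable count of good bolts.
    return sum(
        comb(installed, good) * p_ok**good * p_fail ** (installed - good)
        for good in range(required, installed + 1)
    )


if __name__ == "__main__":
    # Hypothetical numbers: 12 bolts needed, 1% chance any one bolt fails.
    for installed in (12, 13, 14):
        chance = survival_probability(installed, required=12, p_fail=0.01)
        print(f"{installed} bolts installed: {chance:.6f} chance the joint holds")
```

With no spares, every one of the 12 bolts has to hold, so the joint is only as reliable as all 12 at once. Each extra bolt lets the joint ride through one more failure, and under these assumptions the chance of holding climbs accordingly.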
So, if a complex or key system fails because of a single component or a readily foreseeable circumstance, that argues poor engineering. Bad in real life…
But it’s damn useful in science fiction. You want to make an interesting story, twist the plot, add hardship, shake things up a bit? Have a critical component fail. You want to add tension, pressure, up the excitement? Make your critical item one of a kind: no spares, hard to come by, made of something scarce, and/or recoverable only by dealing with someone hostile. Do so, and suddenly things are hot and exciting.
Just remember not to use this little trick too often because, if you do, your engineers will look like total morons or it will become cliché. Not that it isn’t a little cliché already thanks to Star Trek and the like. But it’s still good for excitement if you can exercise a little restraint.

But in real life things go wrong…yet when art imitates that, it’s exciting. That’s why the idea of restraint works for me. When a one-shot chance has five things go wrong, making its likelihood one in 476 kagillion (I’m making that up), only stupid readers/viewers can suspend their disbelief so far. That’s why shows like “Murder, She Wrote” are lampooned now…what small town has that much murder? How can inexperienced astronauts possibly land on, drill into, and explode an asteroid when they land in the wrong place, lose all sorts of stuff, and have things go wrong over and over? No, the answer is not that one of them just has to be willing to die to get it done. That isn’t good enough.
In real life, when something is impossible, it’s simply impossible. Make it challenging, not stupid! (And no, I’m not saying you are doing that, Stephanie–I’m ranting rhetorically).
Nice post!
Things can go wrong over and over, but, truth to tell, a single thing can go wrong that can cause all kinds of problems, like one lost job or one misplaced candle.
I think it’s much more reasonable to have a single point failure one didn’t see coming and let that snowball on its own. Believe me, that can happen.
And, Shakespearemom, I KNEW you weren’t comparing me to the BS from Armageddon. We walked away from that movie with the same opinion.
It does suck when this sort of crap happens in reality, flit. But then, if it didn’t suck once in fiction, how dull it would get for the reader.
I’m now looking for someplace to buy some extra bolts for my entire life, so that failure doesn’t seem like such a disaster.
Technology has some good inherent metaphors.
Doesn’t it though?