Don’t Worry, Be Happy… Until One Day
Echoing the disclaimer in the two posts I refer to below: this is not a political post.
Gene Hughson has recently written on the US healthcare.gov project, in response to Uncle Bob’s post from November 12th.
This is not the first time that a software failure has caused severe damage to a mammoth project. Here’s a short quote from Wikipedia on the first launch of Ariane 5:
“Ariane 5’s first test flight (Ariane 5 Flight 501) on 4 June 1996 failed, with the rocket self-destructing 37 seconds after launch because of a malfunction in the control software. A data conversion from 64-bit floating point value to 16-bit signed integer value to be stored in a variable representing horizontal bias caused a processor trap (operand error) because the floating point value was too large to be represented by a 16-bit signed integer.”
The emphasis I have added points to a basic flaw in computer programming, often experienced by novice engineers. One would expect a high-profile aerospace project to hire better engineers than that, don’t you agree?
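To make the flaw concrete, here is a minimal sketch in Python of the kind of narrowing conversion the quote describes. It is an illustration only, not the Ariane flight software: the function names `to_int16_checked` and `to_int16_saturating` are hypothetical, and the choice between raising an error and clamping is my assumption about two reasonable ways to handle a value that does not fit.

```python
# Illustrative sketch only: names and error-handling policy are assumptions,
# not taken from the Ariane 5 software.

INT16_MIN = -2**15      # -32768
INT16_MAX = 2**15 - 1   #  32767


def to_int16_checked(value: float) -> int:
    """Truncate a float toward zero and refuse values outside the 16-bit signed range."""
    truncated = int(value)  # int() itself raises for NaN and infinity
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit in a 16-bit signed integer")
    return truncated


def to_int16_saturating(value: float) -> int:
    """Truncate a float toward zero and clamp it to the 16-bit signed range."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


if __name__ == "__main__":
    print(to_int16_checked(1234.9))      # fits, so the conversion succeeds
    print(to_int16_saturating(40000.0))  # too large, so it is clamped
    try:
        to_int16_checked(40000.0)        # too large, so it is rejected
    except OverflowError as err:
        print("conversion rejected:", err)
```

Either policy forces the programmer to decide, explicitly, what happens when the value is too large. The failure in the quote is what you get when nobody makes that decision.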
Uncle Bob Martin thinks so:
“[…] So, if I were in government right now, I’d be thinking about laws to regulate the Software Industry. I’d be thinking about what languages and processes we should force them to use, what auditing should be done, what schooling is necessary, etc. etc. I’d be thinking about passing laws to get this unruly and chaotic industry under some kind of control.
If I were the President right now, I might even be thinking about creating a new Czar or Cabinet position: The Secretary of Software Quality. Someone who could regulate this misbehaving industry upon which so much of our future depends.”
Moreover, Uncle Bob refers to another aerospace disaster, the Challenger explosion, and to the engineers’ responsibility for not stopping the launch:
“It’s easy to blame the managers. It’s appropriate to blame the managers. But it was the engineers who knew. On paper, the engineers did everything right. But they knew. They knew. And they failed to stop the launch. They failed to say: NO!, loud enough for the right people to hear.”
In response, Gene Hughson writes:
“Considering that all indications are that the laws and regulations around government purchasing and contracting contributed to this mess, I’m not sure how additional regulation is supposed to fix it.”
Sadly for our industry, I agree with Gene. Yes, engineering practice has, on the whole, a long, long way to go to become anywhere near excellent. I have a lot of respect for Uncle Bob for his huge contribution there.
But the Challenger disaster is first and foremost not an engineering failure. The disastrous potential of the problematic seal was known for a long time before it actually materialized, to everyone’s shock.
“The Rogers Commission found NASA’s organizational culture and decision-making processes had been key contributing factors to the accident. NASA managers had known contractor Morton Thiokol’s design of the SRBs contained a potentially catastrophic flaw in the O-rings since 1977, but failed to address it properly. They also disregarded warnings (an example of “go fever”) from engineers about the dangers of launching posed by the low temperatures of that morning and had failed in adequately reporting these technical concerns to their superiors.”
At the end of the day, it boils down to the fact that NASA’s leadership was operating under the false belief that the risk of the seal failing decreased with every launch of the shuttle, which runs completely contrary to common sense.
Mr. Larry Hirschhorn has an excellent description of this in his book The Workplace Within.
In such an atmosphere, when my managers, and their managers, are so indifferent to life-threatening flaws, heck, why should I exercise excellence in my mundane tasks? Why should I risk my own livelihood? After all, this is the culture here, in this workplace.
It is heartbreaking that the loss of the Columbia can be attributed to management pitfalls similar to those behind the loss of the Challenger:
“In a risk-management scenario similar to the Challenger disaster, NASA management failed to recognize the relevance of engineering concerns for safety for imaging to inspect possible damage, and failed to respond to engineer requests about the status of astronaut inspection of the left wing. Engineers made three separate requests for Department of Defense (DOD) imaging of the shuttle in orbit to more precisely determine damage.”
Coming back to Uncle Bob’s conclusions: in his talk, Do schools kill creativity?, Sir Ken Robinson points out that the school system, in its efforts to teach, is killing creativity in favor of grades. We can only assume that legislating computer engineering studies will, at best, not harm the existing engineering quality. More likely it will do worse and produce well-certified engineers with little ability or drive to excel.
This failure has little to do with teaching and certifications, and all too much to do with culture, professionalism, and plain simple awareness.
When managers practice such an “It will be OK” attitude, everyone does. By the sound of it, the healthcare.gov failure discussed here is not that far off.
In 1992, Prime Minister Yitzhak Rabin spoke at the Staff and Command school to prospective senior officers. Here’s what he had to say about “It will be OK”:
“One of our painful problems has a name. A given name and a surname. It is the combination of two words – ‘Yihyeh B’seder’ [“it will be OK”]. This combination of words, which many voice in the day to day life of the State of Israel, is unbearable.
Behind these two words is generally hidden everything which is not OK. The arrogance and sense of self confidence, strength and power which has no place.
The ‘Yihyeh B’seder’ has accompanied us already for a long time. For many years. And it is the hallmark of an atmosphere that borders on irresponsibility in many areas of our lives.
The ‘Yihyeh B’seder’, that same friendly slap on the shoulder, that wink, that ‘count on me’, is the hallmark of the lack of order; a lack of discipline and an absence of professionalism; the presence of negligence; an atmosphere of covering up; which to my great sorrow is the legacy of many public bodies in Israel – not just the IDF.
It is devouring us.
And we have already learned the hard and painful way that ‘Yihyeh B’seder’ means that very much is not OK.”
No, Uncle Bob, engineers are not to blame for this. Management must take responsibility for fostering a culture that allows such poor standards.