What might it mean when you ask someone “how are you?” and the answer is always hunky dory?
Gene Hughson in his latest post refers to The Daily WTF’s post on a system that never reports an error. This provokes several thoughts.
1. There is a fantasy to create fail-safe systems. There is a similar fantasy to create fail-safe organizations and teams. The truth is that we would much prefer systems (both software and organizational) that are safe-to-fail. Since most software systems and most organizations operate in complex environments which are impossible to predict, knowing what failed is paramount to the evolution of the system.
In an analogy, we would love that our children will have developed magical qualities to get top grades, be friends with everyone they wish, etc. However, we will become much better parents if we invested efforts to help them let us know when they need our help.
2. In the past, Windows, based on the x86 architecture, had a magical error that told us everything: General Protection Fault. Maybe the fantasy at Intel and/or Microsoft was that the x86 with Windows 3.1 is a super quality system, that fails on seldom occasions, so only GPF is required in such conditions (of course, this is a wild and unvalidated hypothesis for the sake of the argument only).
In practice, Even the infamous Dr. Watson, in “his” first versions, was not good enough to tell us what’s wrong, and additional tools were required.
Luckily today Windows combined with 3rd party tools is much better at telling us what went wrong.
Moreover, modern tools tell us what is going wrong now, and even what is about to go wrong.
3. Conways Law tells us that the architecture of the organization is a reflection of the product architecture (and vice versa).
Relating to B.M’s co-worker in Daily WTF’s post, rather than putting the responsibility on him/her, I wonder if and how their organization is structured to hide faults, and what does it mean to admit having made an error.
In organizational life, when your team members always keep telling you that everything is OK, a good advice is to explore how you contribute together to not telling when they could use your help. What are you collectively avoiding to address the real problem issues.
Parallelly, if you are getting frequent customer complaints that are undetectable before the product is released, a good advice is to explore how your architecture is contributing to hiding away such errors.