This is a neat little paper: How Complex Systems Fail. It is short, it is simple, and it is absolutely PACKED with insight. Here are some excerpts:
5. Complex systems run in degraded mode.
A corollary to the preceding point is that complex systems run as broken systems. The system continues to function because it contains so many redundancies and because people can make it function, despite the presence of many flaws. After accident reviews nearly always note that the system has a history of prior ‘proto-accidents’ that nearly generated catastrophe. Arguments that these degraded conditions should have been recognized before the overt accident are usually predicated on naïve notions of system performance. System operations are dynamic, with components (organizational, human, technical) failing and being replaced continuously.
7. Post-accident attribution to a ‘root cause’ is fundamentally wrong.
Because overt failure requires multiple faults, there is no isolated ‘cause’ of an accident… The evaluations based on such reasoning as ‘root cause’ do not reflect a technical understanding of the nature of failure but rather the social, cultural need to blame specific, localized forces or events for outcomes.
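To make those two excerpts concrete, here is a toy sketch of my own (nothing like it appears in the paper). It assumes a hypothetical system of four redundant components that keeps working as long as at least one of them is up, and it simply enumerates every possible combination of working and broken components. The names `classify` and `summarize` and the "at least one working" rule are assumptions made purely for illustration.

```python
from itertools import product


def classify(states, required=1):
    """Classify one system state.

    states   -- tuple of bools, True meaning that component is working
    required -- hypothetical minimum number of working components
                needed for the system as a whole to keep functioning
    """
    working = sum(states)
    if working == len(states):
        return "healthy"    # no faults at all
    if working >= required:
        return "degraded"   # faults present, yet the system still functions
    return "failed"         # too many simultaneous faults: overt failure


def summarize(n_components=4, required=1):
    """Count how many of the 2**n possible states fall into each class."""
    counts = {"healthy": 0, "degraded": 0, "failed": 0}
    for states in product([True, False], repeat=n_components):
        counts[classify(states, required)] += 1
    return counts


if __name__ == "__main__":
    print(summarize(4, 1))
```

Under these assumptions, 14 of the 16 possible states are "degraded", exactly one is fully healthy, and the single failed state is the one in which all four components are down at once. Even in this toy, running broken is the normal case, and the overt failure has four contributing faults and no single one to blame.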
One thing that strikes me about the paper is that the author (probably deliberately) does not try to define what a complex system is. In a sense the paper itself is the definition: complex systems are defined by how they fail. Or perhaps it is like pornography: you know it when you see it.
I can see two ways that a complex system can develop and operate: top-down or bottom-up. Bottom-up systems get to be much, much more complex, yet I would say that they are much less prone to failure. Perhaps that last sentence is saying the same thing twice.
I think of this in terms of risk management at insurance companies or banks. You can imagine that a weak grasp of how systems fail could be financially ruinous: for example, when an executive believes he or she has a better grasp of the ‘root cause’ of failures than is really possible.
Running a complex system perhaps requires humility in the face of something you simply cannot understand.