One of the best books this year, I would say, is The Black Swan, the book behind the black swan theory. You may wonder why the theory is called “the Black Swan”. According to Wikipedia:
The term black swan comes from the ancient Western conception that all swans were white. In that context, a black swan was a metaphor for something that could not exist. The 17th Century discovery of black swans in Australia metamorphosed the term to connote that the perceived impossibility actually came to pass.
So what are the attributes of a Black Swan event? You can read the first chapter of the book to find out, but here is an excerpt:
- First, it is an outlier, as it lies outside the realm of regular expectations, because nothing in the past can convincingly point to its possibility.
- Second, it carries an extreme impact.
- Third, in spite of its outlier status, human nature makes us concoct explanations for its occurrence after the fact, making it explainable and predictable.
And I have a real IT example to share as well …
Back in 2000, we developed a system to disseminate some important information online, alongside the offline channel (i.e. one could get the same information over the counter). The system went live as scheduled one summer day, but a tropical typhoon closed all the office counters that day. In other words, the web site was the only channel for getting that important information. As a result, thousands of people logged on to the web site at the same time and crashed the system. Even though we managed to fix the problem within an hour, we disappointed thousands of people.
It was a high-impact event (for those thousands of people who badly needed the information in a timely fashion). And of course, we never expected the online channel to be the only dissemination channel: an outlier …
As with many IT problems, we explained the cause only after the fact: an inadequate software firewall, inefficient traffic distribution across servers, underpowered servers, a slow database and so on.
Lesson learnt? We redesigned the whole infrastructure from the ground up and performed rigorous load tests, and from that point onward the web site could handle tens of thousands of users with ease. A very expensive lesson though …
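For readers who have never run a load test, here is a minimal sketch of the idea in Python. It is not what we used back then. The names `fake_request` and `run_load_test` are hypothetical, and the "server" is just a 10 ms sleep standing in for a real HTTP call. The point is only to show the shape of the exercise: fire many simulated users at once and look at the slowest responses.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(user_id: int) -> float:
    """Hypothetical stand-in for one user fetching the page.

    Returns the observed latency in seconds. Replace the sleep with a
    real HTTP call to test an actual site.
    """
    start = time.perf_counter()
    time.sleep(0.01)  # assumption: pretend the server needs about 10 ms
    return time.perf_counter() - start


def run_load_test(request_fn, users: int, concurrency: int) -> dict:
    """Send `users` requests, at most `concurrency` at a time."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(request_fn, range(users)))
    return {
        "users": users,
        "max_latency": latencies[-1],
        # Simple 95th percentile taken from the sorted latencies.
        "p95_latency": latencies[int(0.95 * (len(latencies) - 1))],
    }


if __name__ == "__main__":
    print(run_load_test(fake_request, users=200, concurrency=50))
```

In a real test you would raise `users` and `concurrency` well beyond the expected peak, which is exactly the scenario we failed to imagine on the day of the typhoon.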