Nearly half a million airplanes were lost during World War II. To put that into perspective, it’s estimated that only 25,000 planes are in service worldwide today. With the young bomber crews averaging a success rate of less than 30%, anything that could be done to reduce losses was of paramount importance.
Now imagine you’re a general looking to improve your odds in the war. Each day planes come limping home, their fuselages riddled with bullet holes, and you have to decide how to allocate your limited supply of metal to armour them.
So the question is, where do you put that extra armour?
Conventional wisdom says you would patch up the bullet holes and send the planes back out to fight. Congratulations, you did what the US Air Force very nearly chose to do.
This would have been a disastrous mistake, costing the lives of thousands of pilots and gunners.
It’s a phenomenon known as survivorship bias. By looking only at the planes that made it back, you ignore those that didn’t. The Statistical Research Group, a team of mathematicians assembled to aid the war office, realised this error and instead told the generals to armour the areas of the planes that were undamaged.
It sounds counter-intuitive, but the group’s logic was that anywhere that had been hit was obviously not critical, as the plane had made it back. The planes that didn’t return would, statistically, have been struck in other locations, and it was these that were the vulnerable areas.
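The group's reasoning can be illustrated with a small simulation. The sketch below is only that: the plane sections, the chance that a hit in each section downs the plane, and the number of hits per plane are all made-up assumptions, not historical figures. It spreads hits evenly across the plane, then compares where the hits appear on all planes with where they appear on the planes that returned.

```python
import random
from collections import Counter

# Hypothetical sections and loss probabilities, for illustration only.
SECTIONS = ["engine", "cockpit", "fuselage", "wings"]
# Assumed chance that a single hit in a section brings the plane down.
LOSS_PROBABILITY = {"engine": 0.6, "cockpit": 0.4, "fuselage": 0.05, "wings": 0.1}


def simulate(num_planes=10_000, hits_per_plane=3, seed=42):
    """Return hit counts per section for all planes and for survivors only."""
    rng = random.Random(seed)
    all_hits = Counter()
    survivor_hits = Counter()
    for _ in range(num_planes):
        # Every section is equally likely to be hit.
        hits = [rng.choice(SECTIONS) for _ in range(hits_per_plane)]
        all_hits.update(hits)
        # The plane returns only if it survives every individual hit.
        survived = all(rng.random() > LOSS_PROBABILITY[s] for s in hits)
        if survived:
            survivor_hits.update(hits)
    return all_hits, survivor_hits


def shares(counter):
    """Convert raw hit counts into each section's share of the total."""
    total = sum(counter.values())
    return {section: counter[section] / total for section in SECTIONS}


if __name__ == "__main__":
    all_hits, survivor_hits = simulate()
    all_shares = shares(all_hits)
    survivor_shares = shares(survivor_hits)
    for section in SECTIONS:
        print(f"{section:9} all planes: {all_shares[section]:.1%}  "
              f"returned planes: {survivor_shares[section]:.1%}")
```

Under these assumptions the returned planes show relatively few engine hits and plenty of fuselage hits, even though every section was hit equally often. Armouring where the survivors' holes are would protect the least critical areas.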
In the 21st century, we’re seeing a new trend in thinking about data, driven by the ability to not only collect huge amounts of data but also analyse it in ways that were not possible or not feasible before.
This abundance of data can be overwhelming at first, but there are amazing tools out there for making sense of it all. Interactive systems like Tableau have shown that business data can now be queried directly by those who need it most. But we are seeing the rise of a new danger: that of not asking the right questions or, more worryingly, not realising there are other questions worth asking.
Statisticians love counter-intuitive problems like survivorship bias because they show how easy it is to fall prey to trusting gut-feelings. Big data applied incorrectly only exacerbates these mistakes.
As such, alongside the rise of big data we are seeing an explosion in the role of data scientists and for good reason. A bad decision can not only harm the business but also erode trust in the very data collected.
Ensuring that the right questions are asked can be helped by putting in safeguards against these common mistakes, and the best safeguard is ultimately education. A data scientist can not only provide invaluable insight into a business' data but also guide those for whom statistics is a seemingly dark art.
That guidance can range from the position of a button on a page, as determined by a multivariate test, through to identifying the most valuable relationships with customers and how to activate that target cohort.
In this new era of big data, decisions can be made much faster and with a much higher degree of certainty. This lets even large organisations move more quickly than they could have only a few years ago. A data scientist can be the guide many businesses need to get them on their feet.
There's no better time than today to begin investigating the benefits available. Amazon and Google have eliminated the traditional capex cost of experimentation. An initial analysis of a vertical or horizontal slice of a business is typically all that’s required to demonstrate the hidden riches available.
Ultimately, our takeaways are twofold: embrace the rise in big data to identify trends that may otherwise go unnoticed, while raising the understanding and awareness of the risks of using assumptions over statistical models.
By Luke Lanchester, senior engineer at 383 Project