Sometimes, it’s the data you’re missing that’s the key to understanding something

Via John Naughton, a salutary tale for data fiends:
How Not to Be Wrong opens with an extremely interesting tale from World War II. As air warfare gained prominence, the challenge for the military was figuring out where, and in what amount, to apply protective armor to fighter planes and bombers. Apply too much armor and the planes become slower, less maneuverable, and use more fuel. Apply too little, or put it in the “wrong” places, and the planes run a higher risk of being brought down by enemy fire. To make these determinations, military leaders examined the amount and placement of bullet holes on damaged planes that returned to base after their missions.

The data showed almost twice as much damage to the fuselage as to other areas; the engine compartments, in particular, generally had little damage. This led the military leaders to conclude that more armor needed to be placed on the fuselage. But mathematician Abraham Wald examined the data and came to the opposite conclusion: the armor, Wald said, doesn’t go where the bullet holes are; it should go where the bullet holes aren’t, specifically on the engines.

The key insight came when Wald looked at the damaged planes that returned to base and asked where all the “missing” bullet holes to the engines were. The answer: the “missing” bullet holes were on the missing planes, i.e., the ones that didn’t make it back safely. Planes hit in the engines didn’t come back; those that sustained damage to the fuselage generally could make it safely back to base. The military put Wald’s recommendations into effect, and they stayed in place for decades.
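The logic of Wald’s inference, now a textbook case of survivorship bias, can be sketched in a few lines of simulation. All the numbers below are invented purely for illustration (not Wald’s actual data): the sketch just assumes engine hits are much more likely to down a plane than fuselage hits, and shows how engine damage then ends up under-represented among the planes you can still inspect.

```python
import random

random.seed(0)

# Hypothetical parameters, chosen only to illustrate the effect:
N = 100_000               # planes that take exactly one hit
P_ENGINE_HIT = 0.5        # assumed: half of all hits land on the engine
P_SURVIVE_ENGINE = 0.2    # assumed: engine hits usually down the plane
P_SURVIVE_FUSELAGE = 0.9  # assumed: fuselage hits are usually survivable

returned_engine = returned_fuselage = 0
for _ in range(N):
    if random.random() < P_ENGINE_HIT:
        # Engine hit: plane only shows up in the data if it survives
        if random.random() < P_SURVIVE_ENGINE:
            returned_engine += 1
    else:
        # Fuselage hit: much more likely to make it back to base
        if random.random() < P_SURVIVE_FUSELAGE:
            returned_fuselage += 1

total_returned = returned_engine + returned_fuselage
frac_engine_observed = returned_engine / total_returned
print(f"Engine hits were {P_ENGINE_HIT:.0%} of all hits,")
print(f"but only {frac_engine_observed:.0%} of hits visible on returning planes.")
```

With these made-up numbers, engine hits are half of all hits taken but under a fifth of the hits visible on surviving planes; counting bullet holes only on returners systematically hides exactly the damage that matters most, which is the military leaders’ mistake in the story.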