Benford's Law describes the frequency distribution of the leading digit in many real-life numerical data sets. Under this distribution, far more numbers start with a 1 (about 30%) than with a 9 (under 5%). By comparing a data set's leading digits against this distribution, we can assess whether the data may have been tampered with. Evidence based on this comparison is used to make claims about potential fraud and is robust enough to be considered admissible in court. While this observation is often applicable, it's important to remember that the underlying data needs to conform to some key assumptions (e.g. the data doesn't come from a sequential set of numbers).
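As a rough illustration of how such a check might look, the sketch below computes the expected leading-digit probabilities from the standard formula P(d) = log10(1 + 1/d) and compares them with the digits observed in a data set. The function names (`benford_expected`, `first_digit`, `observed_first_digits`, `chi_square_statistic`) are hypothetical, and the chi-square comparison is only one common choice of test, assumed here for simplicity rather than taken from any particular fraud-detection procedure.

```python
import math
from collections import Counter


def benford_expected():
    """Expected leading-digit probabilities: P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Return the first nonzero digit of a nonzero int or float."""
    # Strip leading zeros and the decimal point, e.g. "0.052" -> "52".
    s = str(abs(x)).lstrip("0.")
    return int(s[0])


def observed_first_digits(values):
    """Observed share of each leading digit, ignoring zeros."""
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    n = len(digits)
    return {d: counts.get(d, 0) / n for d in range(1, 10)}


def chi_square_statistic(values):
    """Pearson chi-square statistic against Benford's distribution."""
    nonzero = [v for v in values if v != 0]
    n = len(nonzero)
    counts = Counter(first_digit(v) for v in nonzero)
    expected = benford_expected()
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in expected.items()
    )


if __name__ == "__main__":
    # Powers of 2 span many orders of magnitude and track Benford closely.
    powers = [2 ** k for k in range(100)]
    # A sequential range violates the assumptions: its digits are uniform.
    sequential = list(range(100, 1000))

    for name, data in [("powers of 2", powers), ("sequential", sequential)]:
        print(name, round(chi_square_statistic(data), 2))
```

With 8 degrees of freedom, a statistic above roughly 15.51 would reject conformity at the 5% level. The sequential range lands far above that threshold even though nothing was tampered with, which is why the assumptions about the underlying data matter before drawing any conclusion.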
The simplicity of this observation is a good reminder to look for insights that may be right in front of us. Considering that people had been closely analyzing data long before 1881, it's curious that no one noted this pattern until then.