Benford's Law (1938) predicts that digit frequencies for many scientific, engineering, and business data sets will follow P(d) = log(1 + 1/d) for leading digit d. This law has been used by auditors since 1989 to detect errors and fraud in data sets. Benford also postulated a separate law for integer quantities. This little-known variant of the law is shown to be substantially correct, despite an error by Benford in its derivation. The integer variant is then shown to be extraordinarily common in everyday life, correctly predicting the distribution of footnotes per page in textbooks, sizes of groups walking in public parks and visiting restaurants, fatality counts in air crashes, repeat visits to service businesses, and purchase quantities for goods. The practical value of the integer variant of Benford's Law is illustrated with cases from the author's consulting experience, as a limit toward which a distribution will tend. A potential proof for a universal distribution law for...
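The leading-digit formula in the abstract can be sketched directly; this is a minimal illustration of the standard Benford probabilities (base-10 log, digits 1 through 9), not code from the paper itself:

```python
import math

def benford_p(d: int) -> float:
    """Benford's Law: probability that the leading digit of a value is d (1-9)."""
    return math.log10(1 + 1 / d)

# The nine probabilities sum to 1, since the product (1+1/d) for d=1..9 telescopes to 10.
for d in range(1, 10):
    print(d, round(benford_p(d), 3))
```

For example, the leading digit 1 is expected about 30.1% of the time and the digit 9 only about 4.6%, which is the asymmetry auditors exploit when screening data sets.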
Spencer-Brown advanced a provocative thesis in Probability and Scientific Inference (1957). From experiments with early “chance machines,” he argued that a fundamental flaw exists in our view of randomness. Long sequences of random digits generated by a variety of methods show long-term declines in the repetition of rare items or sequences. I revisit this neglected topic and show that a variety of modern random number generators (including pseudo-random algorithms) exhibit the same property. For simple schemes the decline is short and the system soon lapses into classic equipartition; for sufficiently complex schemes the decline continues indefinitely. Standard test suites such as DIEHARD do not detect this pattern. I suggest the principle of maximum entropy as the underlying cause. Similar decline patterns appear empirically in epidemiology, Web traffic, and other probabilistic settings.
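One simple way to probe the claimed effect is to count recurrences of a fixed rare pattern across successive windows of a pseudo-random digit stream. This is an illustrative sketch only, not Spencer-Brown's procedure or the paper's method; a classic equipartitioned generator such as Python's Mersenne Twister should show roughly flat counts from window to window:

```python
import random

def pattern_counts(pattern="777", n_windows=10, window=100_000, seed=1):
    """Count (non-overlapping) occurrences of `pattern` in successive
    windows of pseudo-random decimal digits. Flat counts indicate
    classic equipartition; a sustained decline would match the
    effect described above."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_windows):
        digits = "".join(str(rng.randrange(10)) for _ in range(window))
        counts.append(digits.count(pattern))
    return counts
```

With a 3-digit pattern and 100,000-digit windows, each count should hover near 100; detecting the subtler long-term decline the abstract describes would require far longer streams and a more careful statistic.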
Papers by Dean Brooks