Abstract
Zipf's law is the major regularity of statistical linguistics and has served as a prototype for rank-frequency relations and scaling laws in the natural sciences. Here we show that Zipf's law, together with its applicability to a single text and its generalizations to high and low frequencies (including hapax legomena), can be derived by assuming that words are drawn into the text with random probabilities. The a priori density of these probabilities relates, via Bayesian statistics, to the mental lexicon of the author who produced the text.
Received 21 December 2012
DOI: https://doi.org/10.1103/PhysRevE.88.062804
©2013 American Physical Society