Loops and Self-Reference in the Construction of Dictionaries

Featured in Physics
Open Access

David Levary, Jean-Pierre Eckmann, Elisha Moses, and Tsvi Tlusty

Phys. Rev. X 2, 031018 – Published 27 September 2012

Abstract

Dictionaries link a given word to a set of alternative words (the definition) which in turn point to further descendants. Iterating through definitions in this way, one typically finds that definitions loop back upon themselves. We demonstrate that such definitional loops are created in order to introduce new concepts into a language. In contrast to the expectations for a random lexical network, in graphs of the dictionary, meaningful loops are quite short, although they are often linked to form larger, strongly connected components. These components are found to represent distinct semantic ideas. This observation can be quantified by a singular value decomposition, which uncovers a set of conceptual relationships arising in the global structure of the dictionary. Finally, we use etymological data to show that elements of loops tend to be added to the English lexicon simultaneously and incorporate our results into a simple model for language evolution that falls within the “rich-get-richer” class of network growth.

Received 3 February 2012

DOI: https://doi.org/10.1103/PhysRevX.2.031018

This article is available under the terms of the Creative Commons Attribution 3.0 License. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.

Published by the American Physical Society

Synopsis

The Value of Circular Definitions

Published 27 September 2012

Methods from statistical physics and graph theory help uncover the structure of human language.

Authors & Affiliations

David Levary¹,², Jean-Pierre Eckmann³, Elisha Moses², and Tsvi Tlusty²,⁴

¹Department of Physics, Harvard University, 17 Oxford Street, Cambridge, Massachusetts 02138, USA
²Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot 76100, Israel
³Département de Physique Théorique and Section de Mathématiques, Université de Genève, CH-1211, Geneva 4, Switzerland
⁴Simons Center for Systems Biology, Institute for Advanced Study, Princeton, New Jersey 08540, USA

Popular Summary

Humans can communicate complex thoughts and ideas owing to a shared vocabulary and rules for stringing words together into meaningful sentences. These vocabulary words are often interrelated and evolve over time in response to the changing needs of communication. Such relations are embedded in the structure of dictionaries, in which language is fixed in time and words are used to define other words. By applying the tools of statistical physics and graph theory, we have discovered new features of word connections and, in particular, that some aspects of vocabularies, such as apparently circular definitions (or “definitional loops”), are not defects but essential to the growth of language.

In our paper, we report a quantitative analysis that treats a dictionary as a network of words: every word has links pointing to each of the words in its definition. In turn, each of those words points to the words in its own definition, and so on. Iterating through the whole dictionary in this way produces a complicated directed graph. The directed trajectories converge to a small set of fundamental words, strongly connected with a high density of loops, from which all others can in principle be defined.
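The network view described above can be illustrated with a minimal Python sketch. The toy dictionary and the function name `descendants` are hypothetical and chosen only for illustration; they are not taken from the paper or its data. The sketch stores each word with the words of its definition and then follows the links outward to collect everything a word ultimately depends on.

```python
from collections import deque

# Hypothetical toy dictionary (an assumption for illustration only):
# each headword maps to the words that appear in its definition.
TOY_DICTIONARY = {
    "puppy": ["young", "dog"],
    "dog": ["animal"],
    "animal": ["living", "thing"],
    "young": ["not", "old"],
    "old": ["not", "young"],
    "living": ["thing", "not", "dead"],
    "dead": ["not", "living"],
    "thing": ["object"],
    "object": ["thing"],
    "not": ["no"],
    "no": ["not"],
}


def descendants(graph, start):
    """Return every word reachable from `start` by following definitions.

    `start` itself is included only if some chain of definitions
    loops back to it.
    """
    seen = set()
    queue = deque([start])
    while queue:
        word = queue.popleft()
        for linked in graph.get(word, []):
            if linked not in seen:
                seen.add(linked)
                queue.append(linked)
    return seen


if __name__ == "__main__":
    print(sorted(descendants(TOY_DICTIONARY, "puppy")))
    print(sorted(descendants(TOY_DICTIONARY, "thing")))
```

In this toy example, "puppy" reaches every other word but is not reachable from itself, whereas "thing" sits inside a loop and therefore appears among its own descendants.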

The graph of the dictionary must be cyclic to some extent, given the requirement that every word have a definition. However, we show here that definitional loops arise not as simple artifacts of the dictionary’s construction but rather as the vehicle by which new concepts are introduced to language. When a totally new concept is created, the dictionary can accommodate it only by revising the topology of the graph.
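The claim that the graph must contain cycles follows from a counting argument: if every word has a definition made of words that are themselves in the dictionary, a walk that keeps following definitions can never stop, and with finitely many words it must eventually revisit one. A small Python sketch of that argument follows. The toy dictionary and the function `find_loop` are illustrative assumptions, and following the first word of each definition is an arbitrary choice made here for simplicity, not a procedure from the paper.

```python
# Hypothetical closed toy dictionary: every word used in a definition
# is itself a headword (this closure is the assumption that forces a loop).
TOY_DICTIONARY = {
    "big": ["large"],
    "large": ["great", "size"],
    "great": ["big"],
    "size": ["big"],
}


def find_loop(graph, start):
    """Follow the first word of each definition until a word repeats.

    Returns the loop that the walk falls into, as a list of words.
    Because the dictionary is finite and closed, the walk repeats a
    word after at most len(graph) steps.
    """
    path = []
    position = {}  # word -> index at which it first appeared in the walk
    word = start
    while word not in position:
        position[word] = len(path)
        path.append(word)
        word = graph[word][0]  # arbitrary choice: first word of the definition
    return path[position[word]:]


if __name__ == "__main__":
    print(find_loop(TOY_DICTIONARY, "size"))
```

Starting from "size", the walk passes through "big", "large", and "great" before returning to "big", so the loop it reports does not contain the starting word itself.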

Our analysis further reveals that short loops describe parts of homogeneous subjects, which then overlap to form larger components that represent in turn whole semantic ideas. Finally, we show that elements of loops tend to be added to the English lexicon simultaneously, and we incorporate our results into a simple model for language evolution.
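The way short loops overlap into larger components can be made concrete with strongly connected components: sets of words in which every word can reach every other by following definitions. The sketch below uses Tarjan's algorithm, a standard graph-theory technique that is swapped in here for illustration; the summary does not state which algorithm the authors used, and the toy dictionary is hypothetical.

```python
# Hypothetical toy dictionary: the two-word loops hot<->warm and
# warm<->heat share the word "warm" and so merge into one component.
TOY_DICTIONARY = {
    "hot": ["warm"],
    "warm": ["hot", "heat"],
    "heat": ["warm"],
    "kettle": ["hot", "water"],
    "water": ["liquid"],
    "liquid": ["water"],
}


def strongly_connected_components(graph):
    """Tarjan's algorithm; returns a list of sets of words."""
    index = {}       # discovery order of each word
    lowlink = {}     # lowest discovery index reachable from each word
    stack = []
    on_stack = set()
    components = []
    counter = 0

    def visit(word):
        nonlocal counter
        index[word] = lowlink[word] = counter
        counter += 1
        stack.append(word)
        on_stack.add(word)
        for linked in graph.get(word, []):
            if linked not in index:
                visit(linked)
                lowlink[word] = min(lowlink[word], lowlink[linked])
            elif linked in on_stack:
                lowlink[word] = min(lowlink[word], index[linked])
        if lowlink[word] == index[word]:
            # `word` is the root of a component: pop its members.
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == word:
                    break
            components.append(component)

    for word in graph:
        if word not in index:
            visit(word)
    return components


if __name__ == "__main__":
    for component in strongly_connected_components(TOY_DICTIONARY):
        if len(component) > 1:
            print(sorted(component))
```

The recursive form is adequate for a toy example; a real dictionary with tens of thousands of words would likely need an iterative version to stay within Python's recursion limit.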

Issue

Vol. 2, Iss. 3 (July–September 2012)

Physical Review X