Abstract
In the absence of mechanistic or phenomenological models of real-world systems, data-driven models become necessary. The discovery of various embedding theorems in the 1980s and 1990s motivated a powerful set of tools for analyzing deterministic dynamical systems via delay-coordinate embeddings of observations of their component states. However, in many branches of science, the condition of operational determinism is not satisfied, and stochastic models must be brought to bear. For such stochastic models, the tool set developed for delay-coordinate embedding is no longer appropriate, and a new toolkit must be developed. We present an information-theoretic criterion, the negative log-predictive likelihood, for selecting the embedding dimension for a predictively optimal data-driven model of a stochastic dynamical system. We develop a nonparametric estimator for the negative log-predictive likelihood and compare its performance to a recently proposed criterion based on active information storage. Finally, we show how the output of the model selection procedure can be used to compare candidate predictors for a stochastic system to an information-theoretic lower bound.
- Received 21 November 2017
DOI:https://doi.org/10.1103/PhysRevE.97.032206
Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.
Published by the American Physical Society