Fraudulent White Noise: Flat power spectra belie arbitrarily complex processes

Power spectral densities are a common, convenient, and powerful way to analyze signals, so much so that they are now broadly deployed across the sciences and engineering---from quantum physics to cosmology, and from crystallography to neuroscience to speech recognition. The features they reveal not only identify prominent signal-frequencies but also hint at mechanisms that generate correlation and lead to resonance. Despite their near-century-long run of successes in signal analysis, here we show that flat power spectra can be generated by highly complex processes, effectively hiding all inherent structure in complex signals. Historically, this circumstance has been widely misinterpreted, being taken as the renowned signature of "structureless" white noise---the benchmark of randomness. We argue, in contrast, that to the extent most real-world complex systems exhibit correlations beyond pairwise statistics, their structures evade power spectra and other pairwise statistical measures. To make these words of warning operational, we present constructive results that explore how this situation comes about and the high toll it takes in understanding complex mechanisms. First, we give the closed-form solution for the power spectrum of a very broad class of structurally-complex signal generators. Second, we demonstrate the close relationship between eigen-spectra of evolution operators and power spectra. Third, we characterize the minimal generative structure implied by any power spectrum. Fourth, we show how to construct arbitrarily complex processes with flat power spectra. Finally, leveraging this diagnosis of the problem, we point the way to developing more incisive tools for discovering structure in complex signals.

This conundrum holds especially in the analysis of signals from truly complex systems, as when analyzing data from multi-electrode arrays in brain tissue or social experiments. These systems are often said to be 'noisy' even though the so-called noise may be entirely functionally relevant, albeit in an unknown way. Such descriptions fall far short of a principled approach that explains all trends and correlational structure---one that would claim success only when all that remains unexplained in the signal is structureless white noise. Even this principled approach ultimately raises the central question, though: how do we test whether an apparently random signal is truly white noise?
The challenge of discovering structure in noisy signals is compounded manifold, as we demonstrate in the following, when our chosen observables hide arbitrary amounts of in-principle-predictable structure behind a familiar signature of white noise---the flat power spectrum. Said simply, observables can be completely devoid of pairwise correlation while still embodying structure in higher-order correlations. More precisely, structure can be hidden beyond any arbitrarily-large order-N correlation---appearing in neither pairwise, three-way, nor any n-way statistics, up to some arbitrarily large N [2]. Moreover, the hidden structure can be arbitrarily sophisticated. It can be used, for example, to embed messages while shifting (and so hiding) the messages' content beyond N-way correlation. Here, we explore the structures conveyed and hidden by power spectra, revealing a novel perspective on the interplay between structure and noise in Fourier analysis.
Section II discusses temporal structure and provides closed-form expressions for the power spectra from autonomous signal generators. It highlights the intimate connection between power spectra and eigen-spectra of a system's time-evolution generator. Section III then introduces a suite of results on structure that is hidden by power spectra. Notably, it introduces a general condition for fraudulent white noise processes-structured processes with a flat power spectrum-which applies very broadly, including to input-dependent processes with nonstationary high-order statistics. Taken altogether the results emphasize the power spectrum's shortcomings for the task of structure detection. In response, Sec. IV considers more sophisticated measures of structure, introducing the dependence function which identifies the presence of novel finite-range dependencies that contribute to total correlation. Section V concludes the development. Appendices present detailed derivations, as well as several generalizations, of the main results.

II. STRUCTURE IN SPACE AND TIME?
Pairwise correlations are encountered throughout the sciences and engineering, especially in statistical physics. They are assumed, estimated, relied on, designed with, and used for interpretation widely. The following explores several specific examples of pairwise correlation that arise in different fields. These will set the context for our development, particularly for experts in the associated fields. However, our general results should be accessible and relevant across disciplines, as they rely primarily on basic probability theory and linear algebra.
A well-studied lesson from statistical physics is that diverging correlation length heralds the emergence of new types of order. Remarkably, mechanistically-distinct physical systems share many universal behaviors near a critical point of emergent order, including the scaling of spatial pairwise correlation length [3]. More broadly, pairwise correlations are indicators of fundamental physical processes. For example, the fluctuation-dissipation theorem says that pairwise temporal correlations in equilibrium determine the friction encountered in transport processes. The Green-Kubo relations [4] make this explicit. Far from equilibrium, say in computing devices and biological systems composed of excitable media, temporal correlations are signatures of richly coordinated state-trajectories.
Pairwise correlations are directly viewed in the frequency domain via power spectral densities. Indeed, power spectra are employed as a basic data analysis tool in many scientific domains and have been key to major scientific discoveries. For example, when comparing alternative theoretical predictions for power spectra of incident electromagnetic radiation from locally-thermalized bodies, an unexpected discrepancy---the ultraviolet catastrophe---led to the acceptance of Planck's theory of quantized energies and the subsequent birth of quantum theory [5][6][7]. A contemporary example of the prominent role of power spectra is seen in the exquisitely detailed map of the cosmic microwave background (CMB)---a snapshot of the early universe's spatial correlations. In fact, models of the early universe are now benchmarked against their ability to replicate the CMB power spectrum [8].
In applied mathematics, power spectra played a key role in highlighting the defining features of the strange attractors of dynamical systems theory [9,10]. This led to the discovery of Ruelle-Pollicott resonances, where mixing and the decay of correlations in chaotic systems were related to the point spectrum of the Ruelle-Perron-Frobenius operator [11][12][13]. Indeed, the power spectra of chaotic systems are still actively used to analyze the behavior of everything from open quantum systems [14,15] to climate models [16].
Power spectra are regularly used to discover structure in materials science and biology, too. X-ray diffraction patterns---used to identify crystalline and molecular organization and central to discovering DNA's double helix [17][18][19][20]---are power spectra of scatterer densities, as we explain in App. A. Power spectra have been used to identify temporal correlations in single-neuron spike trains, refuting the Poissonian white-noise assumption common in theoretical and computational neuroscience [21][22][23][24]. This allows the possibility that temporal correlations in the spike train---rather than just the firing rate---can play an important role in the neural code [25,26]. On a much larger (mean-field) scale, brain wave activity in different frequency bands gives signatures of normal brain functioning, as well as pathological conditions. Rhythmic brain-wave activity is clinically assessed through real-time power spectra of electroencephalography (EEG) signals [27][28][29].
From the smallest to the largest scales in the universe, when probing both the inanimate and the animate, power spectra are a central diagnostic tool for discovering structure and validating scientific models. Their use is so important that special-purpose spectral and network analyzers are standard laboratory test equipment; they can be readily purchased from dozens of major manufacturers.
Power spectra report pairwise correlations in a signal. But how much of a system's structure is faithfully represented by pairwise correlation? Are there important types of order that evade power spectra completely? To answer these questions, we first consider the problem of hidden structure concretely through the lens of autocorrelation and power spectra. Only then, once the strengths and weaknesses of power spectra are clear, do we move along to more sophisticated measures of structure. Along the way we trace a path that begins to reveal what one can mean by "statistical dependency", "correlation", and "structure".

A. Correlation and Power Spectra
To provide a common ground, consider discrete-time processes described by an interdependent sequence . . . X_0 X_1 X_2 . . . of random variables X_t that at time t take on values x ∈ A within an alphabet assumed (for now) to be a subset of the complex numbers: A ⊂ C.
(For other kinds of stochastic process, t may represent spatial or angular coordinates. For concreteness here, we interpret t as indexing time.) An observed process may have a discrete domain, as with a classical discrete-time communication channel or a series of quantum measurements, or otherwise may be a regularly-sampled process evolving in continuous time.
A signal's power spectrum or, more properly, its power spectral density quantifies how its power is distributed across frequency [30,31]. For a discrete-domain process it is:

P(ω) ≡ lim_{N→∞} (1/N) ⟨ | Σ_{t=1}^{N} X_t e^{-iωt} |² ⟩ ,   (1)

where ω is the angular frequency and the angle brackets denote the expected value over the random variable chain X_1 X_2 X_3 . . . X_N. For wide-sense stationary stochastic processes the autocorrelation function:

γ(τ) ≡ ⟨ X̄_t X_{t+τ} ⟩   (2)

is independent of the global time shift t and depends only on the relative time-separation τ between observables [32]. The bar above X_t denotes its complex conjugate. Equation (2) makes plain the connection between pairwise statistics and the pairwise correlation function. For wide-sense stationary stochastic processes, the power spectrum is also determined by the signal's autocorrelation function γ(τ):

P(ω) = lim_{N→∞} Σ_{τ=-(N-1)}^{N-1} ((N - |τ|)/N) γ(τ) e^{-iωτ} .   (3)

The windowing function N - |τ| appearing in Eq. (3) is a direct consequence of Eq. (1); it is not imposed externally, as is common practice in signal analysis. (This factor is important for controlling convergence in our subsequent derivations.) Equation (3) suggests that the power spectrum is very nearly the Fourier transform of the autocorrelation function, except for the N - |τ| term. In fact, the Wiener-Khinchin theorem proves that the power spectrum is indeed equal to the Fourier transform of the autocorrelation function for wide-sense stationary processes [33,34]. Note, too, that the pairwise correlation function γ(τ) can be obtained via the inverse Fourier transform of the power spectrum P(ω).
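The identity behind Eq. (3) can be checked numerically. Below is a minimal sketch (all variable and function names are hypothetical, chosen for illustration) that estimates the power spectrum of a real-valued correlated test signal both directly from Eq. (1) and from the windowed Fourier sum of the empirical autocorrelation in Eq. (3); the two estimates agree to floating-point precision, since the windowing factor arises automatically.

```python
import numpy as np

rng = np.random.default_rng(0)

def power_spectrum_direct(xs, omegas):
    """Averaged periodogram, following Eq. (1) at finite N."""
    N = xs.shape[1]
    t = np.arange(1, N + 1)
    amps = np.exp(-1j * np.outer(omegas, t)) @ xs.T   # Fourier sums, one per trial
    return (np.abs(amps) ** 2).mean(axis=1) / N

def power_spectrum_autocorr(xs, omegas):
    """Windowed Fourier sum of the empirical autocorrelation, following Eq. (3)."""
    N = xs.shape[1]
    taus = np.arange(-(N - 1), N)
    # Empirical gamma(tau), averaged over trials (real-valued signal assumed).
    gamma = np.array([(xs[:, :N - abs(k)] * xs[:, abs(k):]).mean() for k in taus])
    window = (N - np.abs(taus)) / N
    return np.real(np.exp(-1j * np.outer(omegas, taus)) @ (window * gamma))

# Correlated test signal: AR(1) process X_t = 0.8 X_{t-1} + noise.
trials, N = 400, 256
xs = np.zeros((trials, N))
eta = rng.normal(size=(trials, N))
for t in range(1, N):
    xs[:, t] = 0.8 * xs[:, t - 1] + eta[:, t]

omegas = np.linspace(-np.pi, np.pi, 9)
P1 = power_spectrum_direct(xs, omegas)
P2 = power_spectrum_autocorr(xs, omegas)
assert np.allclose(P1, P2)   # the two routes agree to floating-point precision
```

For this AR(1) test signal the estimated spectrum is, as expected, largest at low frequency rather than flat.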

B. Temporal Structurelessness
Our goal is to understand temporal structure and to identify it in stochastic processes. To detect structure, even when hidden, we first must establish a baseline reference process that has no temporal structure: genuine white noise.
White noise processes are those for which each random variable X_t is statistically independent of all the others X_{t'}, t' ≠ t, and each is identically distributed according to the same probability density function (PDF) over the alphabet. That is, the random variables in the sequence are independent and identically distributed (IID).
Familiar examples include a sequence of coin flips or the sequence of sums when rolling a pair of dice. As an example from contemporary physics, consider the (classical) process that results from observing a sequence of Bell-pair quantum states [35]. For each Bell pair, one of the entangled particles is sent to Alice and the other sent to Bob. Alice makes a sequence of measurements (along any measurement axis). The measurement output sequence she observes is pure white noise, with each measurement outcome having equal and independent probability of being up or down along the measurement axis. In fact, more sophisticated deployments of Bell pairs are being developed to provide certifiable random number generation [36]. Experiments now concentrate on increasing the rate of generating white noise [37,38].
The most recognizable feature of all white noise processes is their flat power spectrum. For any IID process, it follows directly from Eq. (2) that γ(0) = ⟨|X_t|²⟩, whereas γ(τ) = |⟨X_t⟩|² for τ ≠ 0. From Eq. (3), this immediately yields the familiar flat power spectrum of white noise, together with a δ-function at zero frequency corresponding to the constant offset of the noise. For real-valued IID processes with zero mean, this simplifies further to γ(τ) = σ² δ_{0,τ} and so P(ω) = σ². In fact, the flat power spectrum has height equal to the variance σ² = ⟨X_t²⟩ - ⟨X_t⟩² of the white noise for any real-valued IID process. The flat power spectrum for IID processes indicates that any temporal structure in the generating source has such short memory that it vanishes within the short sampling time Δt between each observation.
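This flatness at height σ² is easy to confirm numerically for any IID draw; a minimal sketch (the choice of an exponential distribution is hypothetical---any distribution works):

```python
import numpy as np

rng = np.random.default_rng(1)
trials, N = 2000, 128
# Any IID draw works; use a (non-Gaussian) exponential distribution here.
x = rng.exponential(scale=1.0, size=(trials, N))
x -= x.mean()                       # remove the mean: suppresses the delta-peak at omega = 0
sigma2 = x.var()

t = np.arange(1, N + 1)
omegas = np.linspace(0.3, 3.0, 6)   # probe frequencies away from omega = 0
P = (np.abs(np.exp(-1j * np.outer(omegas, t)) @ x.T) ** 2).mean(axis=1) / N
assert np.allclose(P, sigma2, rtol=0.1)   # flat, at height equal to the variance
```

Swapping the exponential for Gaussian, Poisson, or Bernoulli samples leaves the (flat) result unchanged, up to the value of σ².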
Gaussian white noises tend to be the most commonly employed white noise processes and, usually, for good reason. By the central limit theorem, Gaussian white noise arises generically in systems whenever many events---with amplitude of finite variance and with rapidly decaying correlation (compared to the timescale between observations)---contribute additively to each individual observation. Suppose, for example, that the expected number of these contributions to each new observation is simply proportional to the time since the last observation. When sampled at interval Δt, the central limit theorem then tells us that each observation of the accumulated noise is IID and Gaussian distributed with variance σ²_η ∝ Δt. This immediately leads to the familiar standard deviation σ_η ∝ √Δt of the additive noise η(t) that appears when numerically integrating stochastic differential equations (e.g., Langevin equations); this, in turn, produces the trajectories of slower random variables [39].
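This √Δt scaling is exactly what an Euler-Maruyama integrator implements. A minimal sketch (the Ornstein-Uhlenbeck process and its parameters are hypothetical) checks that driving a Langevin equation with noise of standard deviation ∝ √Δt reproduces the correct stationary variance:

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical Ornstein-Uhlenbeck process: dx = -x dt + sqrt(2 D) dW
D, dt, steps = 0.5, 0.01, 200_000
x, samples = 0.0, np.empty(steps)
noise = np.sqrt(2 * D * dt) * rng.normal(size=steps)  # std dev proportional to sqrt(dt)
for i in range(steps):
    x += -x * dt + noise[i]
    samples[i] = x

var = samples[1000:].var()        # discard the initial transient
assert abs(var - D) / D < 0.1     # stationary variance of this OU process is D
```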
The memoryless nature of repetitive sampling from a distribution is apparent in the state machine shown in Fig. 1(a). The same Gaussian distribution is repeatedly sampled with probability 1 (as depicted by the self-transition probability there) for each observation, regardless of what happened previously.
Other "structureless" white noises are also possible. In fact, any of an uncountably infinite set of different IID processes---Gaussian, Poisson, Bernoulli, or any process that resamples a particular distribution at each timestep---all yield the flat power spectrum of white noise. Non-Gaussian noise can emerge from repetitive sampling of a system's (non-Gaussian) stationary distribution when the relaxation timescales are far shorter than the time elapsed between samples. Alternatively, non-Gaussian white noise can arise when only a few physical events contribute to each observation, in which case the non-Gaussianity may reveal features of the physical generative mechanism. Nevertheless, these processes possess no temporal structure on the timescale of observation and, in particular, generate absolutely no correlations in the sequence of observations.

FIG. 1. Genuine white noise processes have no memory: This is represented graphically by a machine with a single state that is repeatedly visited with each observation. The same probability density function, inscribed in the state, is sampled at each timestep. (a) Gaussian white noise memoryless stochastic process. (b) Another white noise process, although non-Gaussian. For each of (a) and (b), the flat power spectrum is given theoretically (thick gray), with height equal to the variance of the probability density function. We also display the numerically-obtained power spectrum (thin blue) for each. The class of all possible (not-necessarily Gaussian) memoryless white noises is identical with the class of processes generated by single-state machines. This class, in turn, is identical to that of all IID processes (spanning all possible probability density functions). These temporally structureless processes constitute all possible varieties of genuine white noise.

The hallmark of this structural paucity is the single state of the hidden Markov model (HMM) that describes all of these IID processes, as depicted in Fig. 1(b) [40]. The single state means that no influence from the past can affect the next or future samples. These are the genuine white noises.
In sharp contrast, we will consider stochastic processes with arbitrarily sophisticated temporal structure on the timescale of observation. The much more general class we next consider allows for a thorough investigation of temporally structured stochastic processes. One surprising feature is that these very structured processes, described by arbitrarily complicated transition dynamics within memoryful collections of internal states, can have the flat power spectrum of white noise.

C. Models of Temporal Structure
Structure arises over time from the interdependence between observables. To explicitly address structure in a broad class of temporally structured processes, we use hidden Markov models (HMMs) as our preferred representation for autonomous signal generators [41][42][43][44][45][46][47]. Later sections will introduce yet more sophisticated models with input dependence.
Despite Markovian state-to-state transitions, HMMs can generate temporally-structured non-Markovian stochastic processes-those with infinite Markov order. Processes generated by even finite-state HMMs, in fact, typically have infinite-range statistical dependencies between observables since simple state-transition motifs guarantee this feature [48]. In addition to this richness and their ability to compactly generate the exact temporal statistics of nonlinear dynamical systems, HMMs are attractive since they are amenable to linear operator techniques [49][50][51][52][53][54][55].
Let the 4-tuple M = (S, A, P, T) be a discrete-time HMM that generates the stationary stochastic process . . . X_{-2} X_{-1} X_0 X_1 X_2 . . . according to the following. S is the (finite) set of hidden states of the hidden Markov chain and A ⊆ C is the observable alphabet. S_t is the random variable for the hidden state at time t that takes on values s ∈ S. X_t is the random variable for the observation at time t that takes on values x ∈ A.
Given the hidden state at time t, the possible observations are distributed according to the conditional probability density functions: P = {p(X_t|S_t = s)}_{s∈S}. For each s ∈ S, p(X_t|S_t = s) may be abbreviated as p(X|s), since the probability density function in each state is assumed to not change over t. Similarly, we will write p(x|s) for p(X_t = x|S_t = s). Finally, the hidden-state-to-state stochastic transition matrix T has elements T_{s,s'} = Pr(S_{t+1} = s'|S_t = s), which give the probability of transitioning from hidden state s to s', where s, s' ∈ S. It is important for subsequent developments that Pr(·) denotes a probability, in contrast to p(·), which denotes a probability density.

FIG. 2. Simple 3-state HMM that generates a stochastic process according to the state-to-state transition dynamic T and the probability density functions (PDFs) {p(X|s)}_{s∈S} associated with each state. Theorem 1 asserts that its power spectrum is the same (modulo constant offset) as the power spectrum generated from an alternative process where each state's PDF is solely concentrated at the average value ⟨x⟩_{p(X|s)} of the original PDF associated with that state.
Epitomizing the class of processes considered, Fig. 2 presents a rather simple HMM with continuous observable alphabet A = R, whose samples are distributed according to the probability density function shown within each hidden state. As seen in the HMM's top-right state, both continuous probability density functions and discrete output probabilities can be accommodated in this framework: Finite probability of a particular observable is accomplished by an appropriately weighted Dirac δ-function in the probability density function. The memoryful structure in Fig. 2 should be contrasted with the completely memoryless process of sampled Gaussian white noise shown in Fig. 1. Figure 3's Bayes network depicts the structure of conditional independence among the random variables for these memoryful signal generators.
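A minimal sampler for this generator class might look as follows; the two-state example, its transition matrix, and its emission distributions are all hypothetical, chosen only to illustrate that both continuous PDFs and weighted δ-functions fit the same framework:

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_hmm(T, emit, s0, n, rng):
    """Generate n observations from an HMM: emit[s](rng) draws one sample of p(X|s)."""
    s, xs = s0, np.empty(n)
    for i in range(n):
        xs[i] = emit[s](rng)                 # observe from the current hidden state
        s = rng.choice(len(T), p=T[s])       # then take a hidden state-to-state step
    return xs

# Hypothetical 2-state example: a Gaussian state and a two-point (delta-function PDF) state.
T = np.array([[0.9, 0.1],
              [0.4, 0.6]])
emit = {0: lambda r: r.normal(0.0, 1.0),
        1: lambda r: r.choice([-1.0, 1.0])}
xs = sample_hmm(T, emit, 0, 10_000, rng)
```

Despite the Markovian state dynamic, the observation sequence itself is generally non-Markovian, as discussed above.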
For example, for a generic HMM, p(x_t | . . . x_{t-2} x_{t-1}) cannot be simplified, since even arbitrarily distant past observables can influence the probability of the current observable. However, when conditioning on hidden states, the situation can simplify markedly. For example: p(x_t | s_t, x_{t-1} x_{t-2} . . .) = p(x_t | s_t). The general properties of HMMs allow one to calculate any statistic about the generated process from the hidden-state-to-state transition matrix T and set P of conditional probability density functions. For simplicity in the following, assume a finite set of hidden states and a single attracting component. Then every transition matrix T admits a unique stationary distribution π. This is determined as T's left eigenvector associated with the eigenvalue of unity: ⟨π| T = ⟨π|. The eigenvector is normalized in probability: ⟨π|1⟩ = 1, where |1⟩ is the column vector of all ones. Note also that |1⟩ is the right eigenvector of T associated with the eigenvalue of unity: T |1⟩ = |1⟩. This property conserves state probability in hidden Markov chain evolution.
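Numerically, ⟨π| can be extracted from the transition matrix with standard linear algebra; a sketch with a hypothetical 3-state chain:

```python
import numpy as np

# Hypothetical row-stochastic transition matrix over three hidden states.
T = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.5, 0.5],
              [0.3, 0.0, 0.7]])

# The stationary distribution is T's left eigenvector at eigenvalue 1,
# i.e., a right eigenvector of T transposed.
vals, vecs = np.linalg.eig(T.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi /= pi.sum()                     # normalize in probability

assert np.allclose(pi @ T, pi)                    # left eigenvector at eigenvalue 1
assert np.allclose(T @ np.ones(3), np.ones(3))    # all-ones right eigenvector
assert (pi >= 0).all()
```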

We can now provide the correlation functions and power spectral density in general and in closed form for this entire class of stochastic process generated by HMMs. Helpfully, for particular HMMs, the expressions become analytic in the model parameters.
Appendix B shows that the autocorrelation function is given by:

γ(τ) = ⟨π| Ω̄ T^{τ} Ω |1⟩  for τ ≥ 1 ,   (4)

with γ(0) = ⟨|X_t|²⟩ and γ(-τ) = \overline{γ(τ)}, where Ω is the |S|-by-|S| matrix defined by:

Ω ≡ Σ_{s∈S} ⟨x⟩_{p(X|s)} |s⟩⟨s| .

We use the hidden-state basis in which |s⟩ is the column vector of all 0s except for a 1 at the index corresponding to state s; ⟨s| is simply its transpose. This yields a natural decomposition of the identity operator: I = Σ_{s∈S} |s⟩⟨s|. In the hidden-state basis, then, the Ω matrix simply places state-conditioned average outputs along its diagonal. The power spectrum is calculated starting from Eq. (3) together with Eq. (4), using the spectral decomposition techniques developed for nonnormal and nondiagonalizable operators in Ref. [55]. In the derivation it is important to treat individual eigenspaces separately, as our generalized framework naturally accommodates. Appendix C gives the derivation's full details. Qualitatively, the power spectrum decomposes naturally into a discrete part P_d(ω) (a weighted sum of Dirac δ-functions) and a continuous part P_c(ω) (a collection of diffuse peaks):

P(ω) = P_c(ω) + P_d(ω) .

For the power spectrum's continuous part, the end result (stated here for T whose only unit-circle eigenvalue is the stationary eigenvalue 1; App. C treats the general case) is:

P_c(ω) = γ(0) - |⟨X_t⟩|² + 2 Re ⟨π| Ω̄ [ (e^{iω} I - T)^{-1} T - (e^{iω} - 1)^{-1} |1⟩⟨π| ] Ω |1⟩ ,

where Re(·) denotes the real part of its argument.
Remarkably, all of the ω-dependence is in the apparently simple term (e^{iω} I - T)^{-1}. This is the resolvent of T, evaluated along the unit circle in the complex plane. However, and central to our main results, this frequency dependence is filtered through ⟨π| Ω̄ and Ω |1⟩. Notably, if the average-observation matrix Ω were proportional to the identity, then all frequency dependence would be lost, since Re ⟨π| (e^{iω} I - T)^{-1} |1⟩ is independent of ω. Frequency dependence of the power spectrum thus requires that different averages be associated with different states. Surprisingly though, none of the structure of the conditional probability density functions {p(X|s)}_{s∈S} matters for the power spectrum, except for the average value observed in each state. Structure beyond averages is simply not captured.
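The closed-form continuous part can be verified against simulation. The sketch below (a hypothetical two-state HMM with Gaussian emissions; assuming, as above, that 1 is T's only eigenvalue on the unit circle) evaluates the resolvent expression and compares it with averaged periodograms of generated signals:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical 2-state HMM: Gaussian emissions with state-conditioned means +1 and -1.
T = np.array([[0.9, 0.1],
              [0.4, 0.6]])
mu = np.array([1.0, -1.0])         # state-conditioned average outputs
sig = np.array([1.0, 1.0])         # emission widths (irrelevant for the spectral shape)

pi = np.array([0.8, 0.2])          # stationary distribution: pi @ T == pi
one = np.ones(2)
Omega = np.diag(mu)                # real here, so its conjugate equals itself

def Pc_closed(omega):
    """Continuous power spectrum via the resolvent of T (sketch)."""
    gamma0 = pi @ (mu**2 + sig**2)                    # gamma(0) = <|X_t|^2>
    mean = pi @ mu                                    # <X_t>
    z = np.exp(1j * omega)
    bracket = (np.linalg.inv(z * np.eye(2) - T) @ T
               - np.outer(one, pi) / (z - 1.0))       # resolvent*T minus the mode at 1
    return gamma0 - mean**2 + 2 * np.real(pi @ Omega @ bracket @ Omega @ one)

# Monte Carlo comparison against averaged periodograms.
trials, N = 500, 512
omegas = np.array([0.5, 1.0, 2.0])
t = np.arange(1, N + 1)
P_sim = np.zeros(len(omegas))
for _ in range(trials):
    s, xs = rng.choice(2, p=pi), np.empty(N)
    for i in range(N):
        xs[i] = rng.normal(mu[s], sig[s])
        s = rng.choice(2, p=T[s])
    P_sim += np.abs(np.exp(-1j * np.outer(omegas, t)) @ xs) ** 2 / N
P_sim /= trials
assert np.allclose(P_sim, [Pc_closed(w) for w in omegas], rtol=0.2)
```

Changing `sig` reshapes the emission PDFs but only shifts the spectrum's flat offset through γ(0), illustrating that structure beyond state-conditioned averages is not captured.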

D. Apparent Structure
To fully appreciate the structure that is captured by the power spectrum requires a spectral decomposition of the transition matrix. The set Λ_T of T's eigenvalues is calculated as usual. However, since transition matrices are generically nonnormal and often nondiagonalizable, the spectral projection operators associated with T deserve a brief review.
In particular, the spectral projection operator T_λ associated with eigenvalue λ can be defined as the residue of (zI - T)^{-1} as z → λ:

T_λ ≡ (1/2πi) ∮_{C_λ} (zI - T)^{-1} dz ,

where z ∈ C and C_λ is a small counterclockwise contour around the eigenvalue λ. Alternatively, the spectral projection operators can be constructed from all left eigenvectors, generalized left eigenvectors, right eigenvectors, and generalized right eigenvectors associated with λ. The construction is given explicitly in Ref. [55]. In the simple case where the eigenvalue is nondegenerate, the eigenprojector takes on the simple form:

T_λ = |λ⟩⟨λ| ,

where T |λ⟩ = λ |λ⟩, ⟨λ| T = λ ⟨λ|, and the eigenvectors are normalized so that ⟨λ|λ⟩ = 1. However, the left and right eigenvectors are not simply complex-conjugate transposes of each other, as they would be in the normal-operator case familiar from closed quantum systems and undirected networks. For example, the spectral projection operator associated with stationarity---T_1 = |1⟩⟨π|---can be interpreted as the classical version of a density matrix but, typically, the stationary distribution is not uniform and so ⟨π| is not proportional to the transpose of |1⟩.
We will also use the broader class of spectral companion operators:

T_{λ,m} ≡ T_λ (T - λI)^m .

They have the useful property that T_{λ,m} T_{ζ,n} = δ_{λ,ζ} T_{λ,m+n}. Clearly, the spectral projection operator is contained in this set, as T_{λ,0} = T_λ. Moreover, T_{λ,m} = 0 for all m ≥ ν_λ, the index of the eigenvalue λ---the size of the largest Jordan block associated with λ. One should keep in mind that the transition matrix can be represented as:

T = Σ_{λ∈Λ_T} ( λ T_λ + T_{λ,1} ) ,   (8)

while the resolvent has the general spectral decomposition:

(zI - T)^{-1} = Σ_{λ∈Λ_T} Σ_{m=0}^{ν_λ-1} T_{λ,m} / (z - λ)^{m+1} .   (9)

The spectral expansion of the resolvent given by Eq. (9) allows us to better interpret the qualitative shape of the power spectrum:

P_c(ω) = γ(0) - Σ_{λ∈Λ_T: |λ|=1} ⟨π| Ω̄ T_λ Ω |1⟩ + 2 Re Σ_{λ∈Λ_T: |λ|<1} Σ_{m=0}^{ν_λ-1} ⟨π| Ω̄ T T_{λ,m} Ω |1⟩ / (e^{iω} - λ)^{m+1} .   (10)

Note that ⟨π| Ω̄ T T_{λ,m} Ω |1⟩ is a complex-valued scalar and all of the frequency dependence now handily resides in the denominator. When T is diagonalizable, Eq. (10) reduces to:

P_c(ω) = γ(0) - Σ_{λ∈Λ_T: |λ|=1} ⟨π| Ω̄ T_λ Ω |1⟩ + 2 Re Σ_{λ∈Λ_T: |λ|<1} λ ⟨π| Ω̄ T_λ Ω |1⟩ / (e^{iω} - λ) .

Appendix E gives the corresponding results for continuous-time processes.
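For diagonalizable T, the projection operators and the resolvent's spectral decomposition are easy to verify numerically; a sketch with a hypothetical two-state transition matrix:

```python
import numpy as np

T = np.array([[0.9, 0.1],
              [0.4, 0.6]])         # diagonalizable hypothetical example

vals, R = np.linalg.eig(T)         # columns of R are right eigenvectors
L = np.linalg.inv(R)               # rows of L are the matching left eigenvectors

# Spectral projection operators built from right/left eigenvector pairs
# (nondegenerate eigenvalues, normalized so left . right = 1).
projectors = [np.outer(R[:, i], L[i, :]) for i in range(2)]

assert np.allclose(sum(projectors), np.eye(2))                      # completeness
assert np.allclose(sum(v * P for v, P in zip(vals, projectors)), T) # spectral rep. of T

# Resolvent decomposition: (zI - T)^{-1} = sum over eigenvalues of T_lam / (z - lam).
z = np.exp(0.7j)
lhs = np.linalg.inv(z * np.eye(2) - T)
rhs = sum(Pl / (z - v) for v, Pl in zip(vals, projectors))
assert np.allclose(lhs, rhs)
```

Because inv(R) supplies the left eigenvectors, the projectors are generally not Hermitian, reflecting T's nonnormality.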
The discrete (δ-function) portion of the power spectrum is:

P_d(ω) = 2π Σ_{λ∈Λ_T: |λ|=1} ⟨π| Ω̄ T_λ Ω |1⟩ δ(ω - ω_λ) ,  ω ∈ (-π, π] ,   (11)

where ω_λ is related to λ by λ = e^{iω_λ}. Equation (11) is valid even when T is nondiagonalizable: an extension of the Perron-Frobenius theorem guarantees that T's eigenvalues on the unit circle have index ν_λ = 1. Accordingly, the δ-function at zero frequency appears whenever the average observation is nonzero. When plotted as a function of the angular frequency ω around the unit circle, the power spectrum suggestively appears to emanate from the eigenvalues λ ∈ Λ_T of the hidden linear dynamic. These are the coronal spectrograms displayed in Figs. 4(c) and (d); they will be discussed after the general phenomenon is explained.
$T$'s eigenvalues on the unit circle yield Dirac $\delta$-functions in the power spectrum. $T$'s eigenvalues within the unit circle yield more diffuse line profiles, increasingly diffuse as the magnitude of the eigenvalue retreats toward the origin. Moreover, the integrated magnitude of each contribution is determined by the amplitude $\langle \pi | \Omega^\dagger T_\lambda \Omega | \mathbf{1} \rangle$. Finally, we note that nondiagonalizable eigenmodes yield qualitatively different line profiles.
The spectral decomposition of the power spectrum offers several insights into the minimal temporal structure required to generate the observed power spectrum. In particular, since (i) each local maximum in the power spectrum emanates from an eigenvalue of the hidden state-to-state transition matrix and (ii) the number of unique eigenvalues is upper bounded by the number of hidden states (i.e., $|\Lambda_T| \leq |S|$), we have the following result: Counting both diffuse peaks and $\delta$-functions, the number of observed peaks in the power spectrum (for $\omega \in (-\pi, \pi]$ in the discrete-time setting) puts a lower bound on the number of hidden states of any model capable of generating the observed stochastic process. Note further that all transition matrices must have an eigenvalue of unity and that this eigenvalue can only produce a $\delta$-function at $\omega = 0$, with no way to shape the power spectrum over other frequencies. An immediate consequence is that all single-state HMMs (i.e., all IID processes) have a flat power spectrum, as suggested earlier. In such cases, $\Lambda_T = \{1\}$, and there are no other eigenvalues to shape the power spectrum. Figure 4 shows the power spectrum of a particular parametrized family of stochastic processes. Figure 4(a) displays the HMM's skeleton with state-to-state transition probabilities parametrized by $b$. The mean values $\langle x \rangle_{p(X|s)}$ observed from each state are indicated as the blue numbers inside the states. The process generated depends on the actual PDFs and the transition parameter $b$, although (and this is one of our main points) the power spectrum is ignorant of the PDFs' details. The evolution of the eigenvalues $\Lambda_T$ of the transition dynamic among hidden states is shown from thick blue to thin red markers in Fig. 4(b), as we sweep the transition parameter $b$ from 1 to 0. A subset of the eigenvalues pass continuously but very quickly through the origin of the complex plane as $b$ passes through 1/2.
The continuity of this motion is not immediately apparent numerically, but can be revealed with a finer increment of $b$ near $b \approx 1/2$. Notice the persistent eigenvalue $\lambda = 1$, which is guaranteed by the Perron-Frobenius theorem.
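The eigenvalue-to-line-profile correspondence is easy to probe numerically. The following minimal sketch assumes the closed-form autocorrelation $\gamma(\tau) = \langle \pi | \Omega^\dagger T^\tau \Omega | \mathbf{1} \rangle$ for $\tau \geq 1$, with $\Omega$ the diagonal matrix of state means per Eq. (5); the 2-state chain and its parameters are our own toy choices, not the 11-state model of Fig. 4. It verifies that the diffuse part of the autocorrelation decays at exactly the subdominant eigenvalue of $T$:

```python
import numpy as np

# Toy 2-state hidden Markov chain; T's eigenvalues are {1, 0.7},
# so the diffuse part of gamma(tau) should decay as 0.7^tau.
T = np.array([[0.9, 0.1],
              [0.2, 0.8]])
xbar = np.array([1.0, -1.0])   # state-conditioned mean outputs <x>_{p(X|s)}
Omega = np.diag(xbar)          # assumed diagonal form of Eq. (5)
pi = np.array([2/3, 1/3])      # stationary distribution: pi = pi T

def gamma(tau):
    """Closed-form autocorrelation <pi| Omega^dag T^tau Omega |1>, tau >= 1."""
    return pi @ Omega.conj().T @ np.linalg.matrix_power(T, tau) @ Omega @ np.ones(2)

# Monte Carlo cross-check: simulate hidden states, emit each state's mean
rng = np.random.default_rng(0)
n = 200_000
u = rng.random(n)
s = np.empty(n, dtype=int)
s[0] = 0
for t in range(1, n):
    s[t] = int(u[t] < T[s[t - 1], 1])   # jump to state 1 with prob T[s, 1]
x = xbar[s]

# The decay rate of gamma(tau) toward |<x>|^2 is exactly the subdominant
# eigenvalue of T -- a "diffuse line profile" in the time domain.
m2 = abs(pi @ xbar) ** 2
ratio = (gamma(2) - m2) / (gamma(1) - m2)   # equals the eigenvalue 0.7
```

Eigenvalues closer to the unit circle would make this ratio approach 1, i.e., a slower decay and a sharper spectral peak.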
Using coronal spectrograms, introduced in Refs. [56] and [54], Figs. 4(c) and 4(d) illustrate how the observed power spectrum $P(\omega)$ emanates from the eigen-spectrum $\Lambda_T$ of the hidden linear state-dynamic. Specifically, in Fig. 4(c) and again, at another parameter setting, in Fig. 4(d), we show the continuous part of the power spectrum $P_c(\omega)$ (plotted around the unit circle in solid blue) and the eigen-spectrum $\Lambda_T$ (plotted as red dots on and within the unit circle) of the state-to-state transition matrix for the 11-state hidden Markov chain (Fig. 4(a)) that generates it. As anticipated from Eq. (10), the power spectrum has sharper peaks when the eigenvalues are closer to the unit circle. The integrated magnitude of each peak depends on $\langle \pi | \Omega^\dagger | \lambda \rangle \langle \lambda | \Omega | \mathbf{1} \rangle$.
Interestingly, our continuous spectrum (closely related to the continuous spectrum of unitary models of chaotic dynamics) is the shadow of the discrete spectrum of nonunitary dynamics. This suggests that resonances in various physics domains concerned with a continuous spectrum can be modeled as consequences of simpler nonunitary dynamics. Indeed, hints of this appear in the literature already [57][58][59].

III. HIDDEN STRUCTURE
Remarkably, the power spectrum generated by any hidden-Markov process with continuous random variables for the observables is the same as the power spectrum generated by a potentially much simpler one: a process that is a function of the same underlying Markov chain but instead emits the state-dependent expectation value of the observable within each state. Theorem 1. Let $\mathcal{P} = \{ p(X|s) \}_{s \in S}$ specify any state-paired collection of probability density functions over the domain $A \subseteq \mathbb{C}$. Let $\mathcal{B} = \{ \langle x \rangle_{p(X|s)} \}_{s \in S}$ and let $\mathcal{Q} = \{ \delta(x - \langle x \rangle_{p(X|s)}) \}_{s \in S}$. Then, the power spectrum generated by any hidden Markov model $M = (S, A, \mathcal{P}, T)$ differs at most by a constant offset from the power spectrum generated by the hidden Markov model $M' = (S, \mathcal{B}, \mathcal{Q}, T)$ that has the same hidden Markov chain but in any state $s \in S$ emits, with probability one, the average value $\langle x \rangle_{p(X|s)}$ of the state-conditioned probability density function $p(X|s) \in \mathcal{P}$ of $M$.
Thus, all HMMs sharing the same $T$ and $\{ \langle x \rangle_{p(X|s)} \}_{s \in S}$ have the same power spectrum $P(\omega) = P_c(\omega) + P_d(\omega)$, modulo a constant offset determined by differences in $\langle |x|^2 \rangle$. Figure 5 demonstrates Thm. 1 for the power spectrum of Fig. 4(c). Besides an overall constant offset of $\langle |x|^2 \rangle$, the power spectrum is insensitive to all details of the state-conditioned PDFs except for their averages. On top of the theoretical curve (thick gray) given by Eq. (6), we overlay horizontal offsets of the power spectra calculated numerically for stochastically-generated time series. The state-conditioned PDFs used to define the different stochastic processes are: (i) single $\delta$-functions, (ii) single Gaussians, (iii) two symmetrically spaced $\delta$-functions (with no support at the mean), and (iv) weighted $\delta$-functions with asymmetric spacing. For each, a time series of length $2^{18}$ was generated. The Welch method was used to calculate the average power spectrum for each process using FFTs of segments of length $2^9$. The inset shows the raw power spectrum for each process without the offset.
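Figure 5's comparison is easy to reproduce in miniature. The sketch below uses our own toy 2-state chain (not the 11-state model of the figures): it runs the same hidden chain twice, once emitting each state's mean exactly and once adding zero-mean Gaussian noise, and checks that the pairwise correlations agree for all $\tau \geq 1$ while $\gamma(0)$ differs by the added variance, i.e., the constant offset in $\langle |x|^2 \rangle$:

```python
import numpy as np

rng = np.random.default_rng(1)
T = np.array([[0.9, 0.1],
              [0.2, 0.8]])               # assumed toy 2-state chain
xbar = np.array([1.0, -1.0])             # shared state-conditioned means
sigma = 0.5                               # noise scale of the second process

# Simulate one hidden-state trajectory, shared by both processes
n = 200_000
pi = np.array([2/3, 1/3])                 # stationary distribution of T
u = rng.random(n)
s = np.empty(n, dtype=int)
s[0] = 0
for t in range(1, n):
    s[t] = int(u[t] < T[s[t - 1], 1])

x_delta = xbar[s]                                   # PDFs: delta at the mean
x_gauss = xbar[s] + sigma * rng.standard_normal(n)  # PDFs: Gaussian, same mean

def acf(x, tau):
    """Empirical autocorrelation at lag tau."""
    return np.mean(x[:-tau] * x[tau:])

# Same pairwise correlation for tau >= 1 ...
diffs = [abs(acf(x_delta, tau) - acf(x_gauss, tau)) for tau in range(1, 5)]

# ... while tau = 0 (hence the power-spectrum offset) differs by sigma^2
offset = np.mean(x_gauss**2) - np.mean(x_delta**2)
```

Per Thm. 1, replacing the Gaussians by any other distributions with the same state means would leave `diffs` near zero as well.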
One immediate consequence of Thm. 1 is the following: Corollary 1. Any hidden Markov chain with an arbitrary state-paired collection of zero-mean distributions, i.e., with
$$\langle x \rangle_{p(X|s)} = 0 \quad \text{for all } s \in S \; ,$$
generates a flat power spectrum indistinguishable from white noise.
Proof. This follows immediately from Thm. 1 and the fact that the all-zero sequence has a power spectrum that is zero everywhere. The corresponding power spectrum of the actual process is thus a flat (nonzero) power spectrum of uniform height $\langle |x|^2 \rangle$.

We can relax the corollary to include cases where the state-conditioned PDFs are all equal to some potentially-nonzero constant, although a $\delta$-function at zero frequency (of integrated magnitude equal to the square magnitude of the constant) will then also be observed in addition to the flat power spectrum.

The corollary's implications are striking. It is quite surprising, to consider one broad class of examples, that a power spectrum can be completely flat even when a ring of sequential states is visited that emit observables with probability density functions having no overlapping support. Figure 6 gives an example. In such a case, any cogent observer immediately detects the obvious structure in the mismatched supports (observed values are distinct) and the forbidden realizations. Yet the power spectrum remains silent about this structure, reporting only the featureless signature of white noise.

In other more challenging settings, structure is not always so obvious without some reliable aid. Indeed, structure becomes increasingly difficult to detect (by any means) when the state-conditioned probability density functions have overlapping support. This is the generic case of non-Markovian processes, in which the hidden states cannot be detected via casual inspection.

While they give a concrete sense of missing structure, these cases fall far short of telling the full story of how power spectra mask structure. The following sections, culminating in Thm. 2, address additional ways white noise appears without needing to meet the requirements of Cor. 1. (Reference [2] goes further still, showing how the structure can indeed be hidden much deeper.)

On the one hand, Thm. 1 and Cor. 1 should strongly suggest to data analysts to look beyond power spectra when attempting to extract a process's full architecture. On the other, whenever a process's power spectrum is structured, it is a direct fingerprint of the resolvent of the hidden linear dynamic. In short, the power spectrum is a filtered image of the resolvent along the unit circle.
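Corollary 1's construction can be made concrete with a toy version of the ring example discussed around Fig. 6 (the 4-state ring, amplitudes, and phase-locked dynamics below are our own illustrative assumptions, not the paper's exact model): every state emits a zero-mean observable with support disjoint from every other state's, so the autocorrelation vanishes at all lags even though the sequence of magnitudes is perfectly predictable and most transitions are forbidden.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 120_000
amps = np.array([1.0, 2.0, 3.0, 4.0])   # disjoint supports: {±1},{±2},{±3},{±4}

states = np.arange(n) % 4               # deterministic ring of hidden states
signs = rng.choice([-1.0, 1.0], size=n) # each state's emission is zero-mean
x = signs * amps[states]

# Pairwise (linear) statistics look like white noise ...
acfs = [np.mean(x[:-tau] * x[tau:]) / np.mean(x**2) for tau in range(1, 9)]

# ... yet the structure is glaring: |x| cycles 1,2,3,4 forever, so, e.g.,
# the magnitude 3 can never follow the magnitude 1.
trans = set(zip(np.abs(x[:-1]).astype(int), np.abs(x[1:]).astype(int)))
```

Only 4 of the 16 conceivable magnitude transitions ever occur, yet the normalized autocorrelation is statistically indistinguishable from zero at every lag.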

A. Nonlinear Pairwise Correlation
In a way, the structure of the stochastic process in Fig. 6 was hidden as shallowly as possible to evade the power spectrum. As mentioned, the structure should be trivial to detect by other means. Indeed, while the linear pairwise correlation $\gamma(\tau)$ vanished for all $\tau > 0$, there is still pairwise dependence between the generated random variables; it is simply nonlinear. This pairwise dependence can be teased out using the pairwise mutual information $I(X_0; X_\tau)$ between observables at different times. For the process of Fig. 6, if we take the limit of the narrow Gaussians in the state-conditioned PDFs to be pairs of $\delta$-functions, then the pairwise mutual information can be calculated exactly, as shown in App. F. In fact, $I(X_0; X_\tau)$ is unchanged for any set of four PDFs we could have chosen for the states of the example HMM, as long as the PDFs all have mutually exclusive support for the observable output. (This makes the hidden state a function of the instantaneous observable.) A concise summary of the pairwise mutual information is provided via Ref. [54]'s power-of-pairwise-information (POPI) spectrum $I(\omega)$, defined in terms of the Shannon entropy $H(\cdot)$ of its arguments. We generated plots of both the pairwise mutual information and the POPI spectrum for this example (shown in App. F) and find the decay of pairwise information to scale intuitively with the phase-slip parameter $p$. While the example of Fig. 6 has no linear correlation, it nevertheless has pairwise structure. Thus, the structure of the example process was hidden from power spectra, but not from the POPI spectrum.
The following sections continue investigating temporally-structured processes, but focus on those with no linear pairwise correlation (and so a flat power spectrum) and no pairwise mutual information (and so a flat POPI spectrum). These will lead us to introduce a general condition for flat power spectra. And, since power spectra fail so often to detect structure, we turn from criticizing them to being constructive: introducing ways to detect hidden structure.

B. Sophisticated Fraudulent White Noise
Theorem 1 established that the power spectrum from processes with continuous observable random variables is the same as the power spectrum from much simpler corresponding processes with discrete observable random variables. Accordingly, Thm. 1 motivates studying the power spectra of processes with discrete observable random variables to determine if there are further ways to achieve a flat power spectrum, beyond Cor. 1's possibilities. For observables that are discrete random variables, it is sufficient to consider their probability distributions rather than their probability density functions.
We begin this next step of the development by establishing the following simple lemma: Lemma 1. Any stochastic process (not necessarily stationary) with the Single-Condition-Independent Property (SCIP),
$$\Pr(X_t = x \mid X_{t'}) = \Pr(X_t = x)$$
for all $x \in A$ and all $t \neq t'$, generates a flat power spectrum, mimicking white noise.
Proof. For any such process, $\Pr(X_t)$ is the stationary distribution $\mu_X$ of the instantaneous observable under the stochastic dynamic. Moreover, SCIP means that the joint probability of any two observations decomposes:
$$\Pr(X_t = x, \, X_{t+\tau} = x') = \mu_X(x) \, \mu_X(x') \; .$$
Substituting $\mu_X(x') \mu_X(x)$ for $\Pr(X_t = x, X_{t+\tau} = x')$ in the autocorrelation definition of Eq. (2) immediately implies that SCIP processes have $\tau$-independent pairwise correlation $\gamma(\tau) = |\langle x \rangle|^2$ for $\tau \neq 0$. The power spectrum is thus flat over all frequencies, except possibly with a $\delta$-function at $\omega = 0$. SCIP processes not only have a flat power spectrum but also a flat POPI spectrum: SCIP implies $I(X_0; X_\tau) = 0$ for all $\tau \neq 0$ which, in turn, implies $I(\omega) = 0$. These processes completely lack any pairwise correlation, whether linear or nonlinear.
Notably, Lem. 1 is not covered by Cor. 1; nor is Cor. 1 subsumed by Lem. 1. Accordingly, the following develops a single simple condition (culminating in Thm. 2) that covers all of these cases of fraudulent white noise.
Crucially, the class of potentially-fraudulent-whitenoise processes suggested by Lem. 1 is nontrivial. In addition to IID processes, this class of processes includes non-Markovian processes that hide all of their structure beyond pairwise correlations.
The Random-Random-XOR (RRXOR) process, discussed at length in Ref. [54], is an example. Over blocks of length 3, the first two bits are generated randomly from a uniform distribution and the third bit is then the logical XOR of the preceding two. Explicitly: $X_{3n} = \text{XOR}(X_{3n-2}, X_{3n-1})$, with $X_{3n-2}$ and $X_{3n-1}$ drawn uniformly at random, for all $n \in \{1, 2, \dots\}$. As a SCIP process, the RRXOR process has a flat power spectrum although it does not fall under the purview of Cor. 1. Indeed, the RRXOR process has no pairwise correlation at all, since $I(X_0; X_\tau) = 0$ for all $\tau > 0$. Accordingly, the POPI spectrum is zero over all frequencies. The structure in this process is strictly three-way correlation. In Ref. [54], the phase $\phi$ itself is a random variable, and synchronizing to the phase is a surprisingly difficult task [60]. No matter: whether or not the phase $\phi$ is given, the process has no pairwise correlation (resulting in a flat power spectrum and a flat POPI spectrum) and only reveals correlation in its three-way structure.
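The RRXOR process is simple to simulate, and both claims (vanishing pairwise statistics, purely three-way structure) can be checked directly. A sketch follows; the plug-in mutual-information estimator is a standard construction of our own, not taken from Ref. [54]:

```python
import numpy as np

rng = np.random.default_rng(3)
n_blocks = 60_000

b1 = rng.integers(0, 2, n_blocks)
b2 = rng.integers(0, 2, n_blocks)
b3 = b1 ^ b2                      # X_{3n} = XOR(X_{3n-2}, X_{3n-1})
bits = np.stack([b1, b2, b3], axis=1).ravel()
x = 2.0 * bits - 1.0              # map {0,1} -> {-1,+1}

# Pairwise statistics are featureless: zero correlation at every lag ...
acfs = [np.mean(x[:-tau] * x[tau:]) for tau in range(1, 7)]

# ... and, even with a known read frame, the pairwise mutual information
# between the first and third bit of a block vanishes: the structure is
# strictly three-way, since b3 is a deterministic function of (b1, b2).
def mi_bits(a, b):
    """Plug-in estimate of I(a; b) in bits, for binary arrays."""
    mi = 0.0
    for va in (0, 1):
        for vb in (0, 1):
            pab = np.mean((a == va) & (b == vb))
            pa, pb = np.mean(a == va), np.mean(b == vb)
            if pab > 0:
                mi += pab * np.log2(pab / (pa * pb))
    return mi

mi13 = mi_bits(b1, b3)
three_way = np.all(b3 == (b1 ^ b2))
```

Every pair of symbols is jointly uniform, so no pairwise statistic, linear or nonlinear, can expose the deterministic three-way constraint.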
It is interesting to note that the related RRXNOR process, where $X_{3n} = \text{XNOR}(X_{3n-2}, X_{3n-1})$, also has a flat power spectrum. In fact, this suggests a new method to hide structure: embed a correlated message into a sequence of RRXOR and RRXNOR 3-bit blocks, which lifts all correlation beyond pairwise. Specifically, the original message is transformed into a sequence of choices about whether to use XOR or XNOR on the previous two random bits. As long as the read frame and the embedding mechanism are known, the message can be easily extracted. But, if it is not known that a message is embedded, it cannot be detected simply by looking for pairwise correlations.
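A minimal sketch of this embedding scheme (the message, framing, and helper names are hypothetical illustrations): encode each message bit as the choice between XOR and XNOR for a block's third bit, then recover it as the three-bit parity $X_{3n-2} \oplus X_{3n-1} \oplus X_{3n}$, while the emitted stream shows no pairwise correlation.

```python
import numpy as np

rng = np.random.default_rng(4)

def embed(message_bits, rng):
    """Hide each message bit as the XOR-vs-XNOR choice for one 3-bit block."""
    out = []
    for m in message_bits:
        r1, r2 = rng.integers(0, 2), rng.integers(0, 2)
        third = (r1 ^ r2) ^ m        # m = 0 -> XOR; m = 1 -> XNOR
        out.extend([r1, r2, third])
    return np.array(out)

def extract(stream):
    """Recover the message, given the read frame and the mechanism."""
    blocks = stream.reshape(-1, 3)
    return blocks[:, 0] ^ blocks[:, 1] ^ blocks[:, 2]

# An arbitrarily correlated (here: highly repetitive) hypothetical message
message = np.tile([1, 1, 1, 0], 2500)     # 10,000 bits
stream = embed(message, rng)
recovered = extract(stream)

# Despite carrying the message, the stream shows no pairwise correlation
x = 2.0 * stream - 1.0
acfs = [np.mean(x[:-tau] * x[tau:]) for tau in range(1, 7)]
```

The two fresh random bits per block randomize every pair of stream symbols, so the message's correlations survive only at the three-way level.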
Through similar construction, structure can be shifted up to arbitrarily-high orders of correlation [2]. Stochastic processes can be constructed with N -way correlation but no n-way correlation for all n < N . Moreover, an arbitrarily correlated message can be embedded within such a process, such that its structure is lifted beyond any desired order of correlation.

C. Content-preserving Whitening
Corollary 1 gave a method to construct an arbitrarily complex process with a truly flat power spectrum, so long as all latent states have the same average output. Here, we introduce an alternate method to construct arbitrarily complex processes with truly flat power spectra. These processes, in addition, are devoid of n-way correlation for all n < N .
1. Choose an embedding block length $N \geq 3$.
2. Choose any stochastic process ("Process A") with a binary output alphabet.
3. Construct "Process B" as follows:
• Whenever Process A would produce a 0, Process B samples a word uniformly from the set of all words of length $N$ with an even number of 1s.
• Whenever Process A would produce a 1, Process B samples a word uniformly from the set of all words of length $N$ with an odd number of 1s.
Any Process B constructed in this manner has a truly flat power spectrum. Process B is also devoid of $n$-way correlation for all $n < N$. Moreover, if A is a stationary process such that its statistical complexity $C_\mu(A)$ is well defined [61,62], then Process B is also a stationary process with $C_\mu(B) \geq C_\mu(A)$.
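The three-step recipe above can be sketched directly. Process A here is an assumed, maximally correlated period-2 process (any binary process would do): with $N = 3$, even-parity words encode a 0 and odd-parity words encode a 1, so block parity recovers Process A exactly while the whitened stream carries no pairwise correlation.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(5)
N = 3                                     # embedding block length

# All length-N binary words, split by the parity of their number of 1s
words = np.array(list(product([0, 1], repeat=N)))
even = words[words.sum(axis=1) % 2 == 0]  # encodes a 0 from Process A
odd = words[words.sum(axis=1) % 2 == 1]   # encodes a 1 from Process A

# Process A (assumed): a maximally correlated period-2 process 0101...
n_a = 30_000
a = np.arange(n_a) % 2

# Process B: one uniformly chosen parity-class word per Process-A symbol
blocks = np.where(
    a[:, None] == 0,
    even[rng.integers(0, len(even), n_a)],
    odd[rng.integers(0, len(odd), n_a)],
)
b = blocks.ravel()

# Process A is recoverable from block parities (content preserved) ...
decoded = blocks.sum(axis=1) % 2

# ... yet Process B's pairwise statistics are flat (whitened)
x = 2.0 * b - 1.0
acfs = [np.mean(x[:-tau] * x[tau:]) for tau in range(1, 7)]
```

Within a parity class, every bit and every pair of bits is uniformly distributed, which is why no pairwise statistic survives; only the $N$-way parity carries Process A's content.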
This also works for "infinity-structured" processes, those with divergent statistical complexity. Choose any binary Process-A family with $C_\mu \to \infty$. This can be, for example, Ref. [63]'s Heavy-Tailed Periodic Mixture Process, which has infinite past-future mutual information: $\mathbf{E} \to \infty$. Then add structure, via content-preserving whitening, to obtain a binary Process-B family with $C_\mu \to \infty$ and a truly flat power spectrum.
Similar constructions can also be developed for processes with larger alphabets. Further examples will be given in Ref. [2].
Through the lens of pairwise correlation, such structure is simply missed. However, before moving on to consider more advanced methods to detect such structure, we finish our investigation of flat power spectra from structured processes. The next section addresses a broad class of possibly-input-dependent process generators and we give a very general condition for when a flat power spectrum results.

D. Input-dependent Generators and Fraudulent White Noise
Probing fraudulent white noise more broadly, consider an arbitrarily correlated message $\vec{m}$ and an input-dependent generator $M(\vec{m})$ of an observable output process $\{X_t\}_{t \in \mathcal{T}}$. The lengths of the inputs and outputs need not be commensurate, and the input and output alphabets may also be distinct. The generator is fully specified by the tuple $M(\vec{m}) = (S, A, \mathcal{P}, \{T_t(\vec{m})\}_t, \mu_1)$. That is, the internal states $S$, output alphabet $A$, and state-dependent PDFs $\mathcal{P}$ are static. However, the hidden-state-to-state transition matrix $T_t(\vec{m})$ at time $t$ is potentially a function of the full input $\vec{m}$. Since stationarity is no longer assumed, the initial distribution $\mu_1$ over hidden states must be specified for the statistics of the output process to be well defined.
Figure 7 (hidden message and embedding protocol) shows the relevant Bayes network for this general type of input-dependent generator. Contrast this with Fig. 3, which showed the Bayes network of autonomous HMM generators. Autonomous HMMs can be seen as a special case of these possibly-input-dependent generators when the generator $M(\vec{m}) = M$ is input-independent and the initial distribution $\mu_1 = \pi$ is taken to be the stationary distribution $\langle \pi | = \langle \pi | T$ of the time-independent transition matrix $T_t(\vec{m}) = T$.
The memoryful input-dependent generators we now consider also generalize the memoryful transducers considered in Ref. [64] to use continuous-variable outputs and allow the lengths of input and output to be incommensurate. Via any of the above models, very general message-embedding schemes can be developed that produce sophisticated fraudulent white noise.
Even with all these generalizations, we can determine the autocorrelation and power spectra. Similar to the derivation for HMMs, we find that if the process is wide-sense stationary then (for $\tau \geq 1$) the autocorrelation is
$$\gamma(\tau) = \langle \mu_1 | \, T_{1:t}(\vec{m}) \, \Omega^\dagger \, T_{t:t+\tau}(\vec{m}) \, \Omega \, | \mathbf{1} \rangle \; ,$$
which must overall be $t$-independent (so long as $t \geq 1$). Here $T_{a:b}(\vec{m})$ denotes the ordered product of transition matrices from time $a$ up to time $b$, and $\Omega$ is again given by Eq. (5).
(Notice that $T_{a:a+\tau}(\vec{m}) = T^\tau$ for the special case of autonomous HMMs.) Thus, the autocorrelation can be calculated as $\gamma(\tau) = \langle \mu_1 | \Omega^\dagger T_{1:1+\tau}(\vec{m}) \, \Omega | \mathbf{1} \rangle$, assuming that the pairwise statistics are stationary. This can also be written as
$$\gamma(\tau) = \Bigl\langle \langle x \rangle_{p(X|S_t)}^{*} \, \langle x \rangle_{p(X|S_{t+\tau})} \Bigr\rangle \; ,$$
where we treat $\langle x \rangle_{p(X|S_t)}$ as a random variable that depends on $S_t$; the whole expression becomes $t$-independent assuming stationary pairwise statistics. Accordingly, the autocorrelation function is constant and the power spectrum is flat whenever
$$\Pr(S_{t+\tau} \mid S_t = s) = \Pr(S_{t+\tau})$$
for all $\tau$, for all $t \in \mathcal{T}$, and for all $s \in S$. However, this requirement is too strict to cover all cases of interest. For example, it does not yet imply the flat power spectrum of the RRXOR process. More generally, constant autocorrelation and flat power spectra can be guaranteed by an even weaker condition.
To appreciate this, define the set $\Xi$ of average outputs exhibited by the states: $\Xi \equiv \bigcup_{s \in S} \{ \langle x \rangle_{p(X|s)} \}$. Furthermore, define $S_\xi \subseteq S$ as the set of states that all exhibit the same average output $\xi \in \Xi$. Explicitly, $S_\xi \equiv \{ s \in S : \langle x \rangle_{p(X|s)} = \xi \}$. Using these quantities, we can state our result more precisely as the following theorem.
Theorem 2. Let $\{X_t\}_t$ be a stochastic process generated by any of the hidden-state models $M(\vec{m})$ discussed above, including autonomous HMMs and input-dependent generators, with $X_t$ the random variable for the observable at time $t$ and $S_t$ the random variable for the hidden state at time $t$. Such processes have constant autocorrelation and a flat power spectrum if
$$\bigl\langle \langle x \rangle_{p(X|S_{t+\tau})} \bigm| S_t \in S_\xi \bigr\rangle = \bigl\langle \langle x \rangle_{p(X|S_{t+\tau})} \bigm| S_t \in S_{\xi'} \bigr\rangle$$
for all separations $\tau > 0$, for all $t \in \mathcal{T}$, and for all $\xi, \xi' \in \Xi$.
Proof. Starting from Eq. (13), we find the autocorrelation for all such processes:
$$\gamma(\tau) = \sum_{\xi \in \Xi} \Pr(S_t \in S_\xi) \, \xi^{*} \, \bigl\langle \langle x \rangle_{p(X|S_{t+\tau})} \bigm| S_t \in S_\xi \bigr\rangle \; .$$
Combining Eq. (14) and Eq. (15), we see that
$$\gamma(\tau) = \langle x \rangle^{*} \, \bigl\langle \langle x \rangle_{p(X|S_{t+\tau})} \bigr\rangle \; ,$$
which is a constant. Thus, the power spectrum is flat if Eq. (14) holds. Theorem 2 says that a flat power spectrum results whenever the average output of the future latent state is independent of the average output of the current latent state.
This generalized condition for flat power spectra covers the special case of HMMs as well as fraudulent white noise from message-embedding schemes with stationary pairwise statistics but nonstationary higher-order statistics. Appendix G shows that a modified version of Thm. 2 also applies to another class of generators that can be more natural for measured quantum systems and systems with computational dependencies. Theorem 2 subsumes Cor. 1 as well as Lem. 1 and offers the most general guarantee yet of constant autocorrelation and a flat power spectrum.
By way of contrast, consider the following. While zero pairwise mutual information is always a sufficient condition for a flat power spectrum, it is not a necessary condition. Here, in Thm. 2, we found a very general condition for a flat power spectrum. Appendix H establishes a related theorem (Thm. 4) that further generalizes the condition for flat power spectra, allowing for time-dependent PDFs associated with each state. Moreover, Thm. 2 and Thm. 4 constructively suggest how to design such processes. Notably, these generalized conditions do not require a stationary dynamic over the hidden states of the observation-generating mechanism, which furthermore allows messages to ride undetected aboard fraudulent white noise.
This final result emphasizes the main argument's generality: power spectra are mute when detecting a broad range of observable structure. Whether observing physical, biological, or social systems, we seek structure that reveals mechanism and begets predictability. Through the lens of power spectra, or pairwise correlation more generally, much structure is simply missed. The challenge then is to look for structure beyond pairwise.

IV. STRUCTURE IN NOISE?
One systematic method for exploring beyond-pairwise correlations in stationary stochastic processes is through the sequence of myopic entropy rates [50,53,54,65-67]:
$$h_L \equiv H(X_L \mid X_1, X_2, \dots, X_{L-1}) \; ,$$
with $h_1 = H(X_1)$. For example, the RRXOR process has $h_1 = h_2 = \log |A| = 1$ bit/symbol; it appears as random as possible when considering symbols individually or in pairs. Structure is unveiled, though, for $L \geq 3$, when $h_L < 1$. That is, progressively longer Markov-order-$L$ approximations of the infinite-Markov-order process reveal progressively more of its hidden structure. In fact, $h_L$'s convergence reflects how structure is hidden in the stochastic process [67]. As $L \to \infty$, $h_L$ approaches the process's Shannon entropy rate $h$, the irreducible randomness per symbol after all orders of correlation have been taken into account. Notably, the accumulated excess myopic entropy, $\sum_{L=1}^{\infty} (h_L - h) = \mathbf{E}$, the excess entropy, quantifies the total mutual information between the past and future of a process: $\mathbf{E} = I(\dots, X_{-1}, X_0 ; X_1, X_2, \dots)$. So, while $I(X_0; X_\tau) = 0$ for all $\tau > 0$ for the RRXOR process, the past and future are nevertheless correlated, since $\mathbf{E} > 0$. And the convergence to predictability can be viewed in the frequency domain through the excess-entropy spectrum introduced in Ref. [54]. Taken together, this suggests that myopic entropy rates serve well to identify hidden structure beyond pairwise correlation. They show how predictability improves as progressively longer historical context is used.
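The myopic entropy rates are straightforward to estimate from word frequencies of a single long realization. For RRXOR sampled with a sliding window (which implicitly randomizes the read frame), direct enumeration of the length-3 word probabilities gives $h_3 = H(3) - H(2) \approx 0.918$ bits (a value we compute here, not quoted from the text), while $h_1 = h_2 = 1$ bit. The sketch below, with an assumed realization length, checks this:

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(6)

# A long RRXOR realization; random phase is induced implicitly by
# estimating word probabilities from every sliding-window position
n_blocks = 100_000
b1 = rng.integers(0, 2, n_blocks)
b2 = rng.integers(0, 2, n_blocks)
seq = np.stack([b1, b2, b1 ^ b2], axis=1).ravel()

def block_entropy(seq, L):
    """Shannon entropy (bits) of the empirical length-L word distribution."""
    counts = Counter(tuple(seq[i:i + L]) for i in range(len(seq) - L + 1))
    p = np.array(list(counts.values()), dtype=float)
    p /= p.sum()
    return -np.sum(p * np.log2(p))

H = {L: block_entropy(seq, L) for L in range(1, 4)}
h1 = H[1]            # expected: 1 bit
h2 = H[2] - H[1]     # expected: 1 bit (pairs look fully random)
h3 = H[3] - H[2]     # expected: < 1 bit (three-way structure unveiled)
```

The drop at $L = 3$ is the first quantitative sign of the hidden structure that the power spectrum and POPI spectrum both miss.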
However, correlations are not always restricted to contiguous blocks. There can therefore be pairwise correlations among distant observables even while $h_2 = h_1$. Moreover, the myopic entropy rates as defined above are restricted to stationary processes. Consequently, despite their utility, myopic entropies are not ideal for directly indicating $L$-way correlation in the most general setting.
A more direct indicator of $L$-way correlation is found in the dependence function $D_L$, which quantifies the maximal uniquely-$L$-way correlation that exists in a process. We say a set $\chi$ of random variables is fully correlated if all constituent random variables inform all of the others; that is, if $I(X; X') > 0$ for all distinct $X, X' \in \chi$. A process is then $L$-way correlated if it has a set of $L$ random variables that are fully correlated. This $L$-way correlation is quantified through the dependence function $D_L$, defined here only for $L \geq 2$. $L$-way dependence is nonzero if and only if there are novel $L$-way contributions to a process's total correlation. Note that dependence can be applied to nonstationary processes and to processes of finite duration. Consider, as a simple example of noncontiguous dependencies, a process consisting of two interlaced RRXOR processes with unambiguous phase. Explicitly: $X_{6n} = \text{XOR}(X_{6n-4}, X_{6n-2})$ and $X_{6n-1} = \text{XOR}(X_{6n-5}, X_{6n-3})$, whereas $X_{6n-5}$, $X_{6n-4}$, $X_{6n-3}$, and $X_{6n-2}$ are all generated from a uniform distribution, for all $n \in \{1, 2, \dots\}$. Joint probabilities over contiguous variables are completely uncorrelated and as random as possible, up until a block length of 5. Let's treat the example as a stationary process, calculating probabilities from word frequencies in a single realization, with the implicit assumption of stationarity effectively inducing random phase. Then we find full randomness in the myopic entropy rates up to block length 5: $h_\ell = \log |A| = 1$ bit for $1 \leq \ell < 5$. Then, finally, a reduction in apparent entropy occurs at $h_5$, after which $h_\ell < h_{\ell-1}$ for $\ell \geq 5$. Notably, $h_3$ reflects maximal randomness within its purview of inspection, whereas the process actually has three-way, but no lower-order, dependencies. This yields $D_2 = 0$ and $D_3 > 0$. With known phase, we would have $D_3 = 1$ bit. However, when the process is unknown and only a single realization is available for analysis, probabilities can be inferred only from motifs of random-variable clusters.
For example, estimating $\Pr(X_{t-2}, X_t, X_{t+2})$ as if the process were stationary leads to finding $0 < \widehat{D}_3 < 1$, where $\widehat{D}_L$ denotes the approximation of the dependence function obtained by assuming stationarity and testing a limited set of motifs. Usefully, $\widehat{D}_L$ sets a lower bound on $D_L$, so nonzero $\widehat{D}_L$ implies $L$-way dependence. Curiously, the assumption of stationarity induces $\widehat{D}_L > 0$ for all $L \geq 3$, reminiscent of how $h_{L-1} - h_L > 0$ for all $L \geq 3$ for the RRXOR process with ambiguous phase. In each case, these higher-order correlations correspond to the observer's resolving phase ambiguity.
The dependence function seems to fulfill its desired role of identifying high-order correlations that cannot be explained by lower-order phenomena. Taking a step back, though, we might question the whole endeavor. Can a single model-free signal-analysis method ever reliably detect information processing and thus complex structure in the world around us? We have clearly ousted power spectra for this task. Nevertheless, our arguments here lend support to an affirmative answer, but at the cost of more nuanced and computationally intensive techniques. What is the range of validity of the informational measures discussed above? Can they be entrusted with finding structure in the noise?
First, it should be noted that Shannon entropy is only fully justifiable for alphabets $A$ of countable cardinality. So, apparently continuous observables must be partitioned into measurable sets to apply informational measures like the myopic entropy rates and the dependencies $D_L$. Nevertheless, quantum physics suggests that even very large and apparently continuous systems can, in principle, always be represented in a countable basis. Practically, too, measurement devices have only finite precision, so observations are discretized in practice anyway. Therefore, Shannon entropies (like the myopic entropy rates and the dependencies) can be applied in principle.
However, a second and likely more-severe challenge arises from limited data: reliable estimates of probabilities are not always available. Model building offers the strongest response to this challenge. Generative models inferred from low-order statistics sometimes encapsulate predictions of rare events. And, at least, they give a prediction for high-order statistics. Testing these predictions against observation allows refining one's model and discovering new structure.

V. CONCLUSION
Our investigation began with the modest task of showing how to calculate the correlation function and power spectrum given a signal's generator. To this end, we briefly introduced hidden Markov models as signal generators and then used the linear-operator techniques of Ref. [55] to calculate their autocorrelation and power spectra in closed form. This led to several lessons. First, we saw that the power spectrum is a direct fingerprint of the resolvent of the model's time-evolution operator, analyzed along the unit circle. Second, spectrally decomposing the not-necessarily-diagonalizable time-evolution operator, we discovered the range of qualitative behaviors that can be exhibited by autocorrelation functions and power spectra. Third, contributions from eigenvalues on the unit circle had to be extracted and dealt with separately; they correspond to Dirac $\delta$-functions, the analog of Bragg peaks in diffraction, whereas eigen-contributions from inside the unit circle correspond to diffuse peaks, which become sharper for eigenvalues closer to the unit circle. Finally, we found that nondiagonalizable eigenmodes yield qualitatively different line profiles than their diagonalizable counterparts.
These first results incisively answer the challenges raised by Ruelle-Pollicott resonance theory about the possible relationship between complex eigenvalues of time-evolution operators and the correlation and power spectra of observables [11][12][13]. In short, we provided the exact relationship between the time-evolution operator and the correlation functions and power spectra, as well as the possible behavior modes of each. The result is a deeper theoretical understanding and constructive calculational methods that complement early investigations that experimentally delivered meromorphic power spectra from chaotic dynamical systems [9,10].
Accordingly, our findings are relevant to modern applications of Ruelle-Pollicott resonance theory. These applications are leading, for example, to better understanding of sensitivities in climate models [16] and of the dynamics of open quantum systems via their correspondence to classical chaotic dynamical systems [14,15]. Our results provide full analytical correspondence between observed correlation and the spectral properties of nonunitary models. Our approach also bears on Koopman operator theory and its applications, which have received a new wave of attention due to the success of recent data-driven algorithms [68]. However, our results also clarify that resonances discovered via pairwise correlation are generically an insufficient representation of the spectral features of such nonnormal dynamics. This emphasizes that the full spectral representation of the effective nonnormal dynamics [55], generically inaccessible via pairwise correlation, is worth pursuing. Success in this will immediately yield predictions about many complex systems of interest.
The most surprising and more immediate finding, though, is that temporal structure can fully evade detection by power spectra. Arbitrarily sophisticated processes can have exactly flat power spectra and so masquerade as white noise. Accordingly, we called such processes fraudulent white noise processes. Theorem 1 and Cor. 1 characterized the many ways that structure can be hidden from power spectra. And, ultimately, Thm. 2 addressed the more general condition for fraudulent white noise, in which the generated time series could be input-dependent and nonstationary.
We started out noting that, on the one hand, divergent correlation length often heralds the emergence of new types of order and, on the other, that pairwise correlation is generically identified as the structure in random systems. However, we showed that there is often rich structure even in the absence of pairwise correlations. What types of order are we failing to predict due to a historical emphasis on pairwise correlations? Complex systems surely exhibit emergent structure beyond the reach of pairwise statistics. There is almost surely more functionally relevant brain activity available in EEGs than is reported in their power spectra. Perhaps we should consider beyond-pairwise structure even for simple generators of structure. For example, cosmological models could be more thoroughly tested against structure in the CMB beyond what is contained in the two-point angular correlation functions.
Having diagnosed the structures inaccessible via power spectra, we briefly discussed how to detect beyond-pairwise structure in general, introducing the dependence function to detect any L-way correlations for any L. We also stressed the importance of model building whenever possible. In particular, it can help anticipate and perhaps avoid never-yet-encountered catastrophes, which are often a byproduct of the high interconnectivity of complex socio-economic systems [69]. Model building also allows us to discover new mechanisms in nature. This all said, nature can still keep us in the dark. We showed that the correlations in a message can be shifted to arbitrarily high orders of correlation. The result is that, for finite-length messages, statistical inference can be made effectively impossible regardless of one's sophistication [2]. Nature herself employs this technique whenever we observe an increase in entropy, giving the impression of randomness generated when it is only ever structure hidden in inaccessibly obscure high-order correlations. In a sense, having pulled the wool over our eyes, Mother Nature lulls us into complacency with a soothing hiss of fraudulent white noise. Waking up to the true hues of reality, prying open the black box and dispelling apparent white noise, continues to require new theory and new experimentation.

ACKNOWLEDGMENTS
The authors thank Alec Boyd and Dowman Varn for insightful discussions. JPC thanks the Santa Fe Institute and the California Institute of Technology and the authors together thank the Telluride Science Research Center for their hospitality during visits. This material is based upon work supported by, or in part by, the U. S. Army Research Laboratory and the U. S. Army Research Office under contracts W911NF-13-1-0390 and W911NF-18-1-0028.
Appendix A: Diffraction patterns as power spectra

Diffraction patterns are used extensively to infer material structure from the scattering of, for example, an incident X-ray beam [70-74]. Generally, consider $\vec{r} \in \mathbb{R}^n$ to be a vector in $n$-dimensional real space. The spatial arrangement of elastic scatterers is given by the scatterers' density $f(\vec{r})$. Ideally, we wish to recover $f(\vec{r})$ from our diffraction experiments, which provide measured intensities. However, far-field patterns of diffracted intensity yield only $I_{\text{diff}}(\vec{\omega}) = c\,|F(\vec{\omega})|^2$, where $F(\vec{\omega}) = \int_{\mathbb{R}^n} f(\vec{r})\, e^{-i \vec{\omega} \cdot \vec{r}} \, d^n r$ is the $n$-dimensional Fourier transform of $f(\vec{r})$ and $c$ is a constant. In other words, $F(\vec{\omega})$'s phase information is lost when only intensity is measured. The X-ray beam's expected diffracted intensity is proportional to $|F(\vec{\omega})|^2$, which is the $n$-dimensional generalization of a power spectrum.

However, it is also interesting to relate the $n$-dimensional diffraction pattern, along a curve in reciprocal space, to the more familiar one-dimensional power spectrum. For a given directional frequency $\vec{\omega}$, decompose $\vec{r} = \vec{r}_\parallel + \vec{r}_\perp$, where $\vec{r}_\parallel \equiv (\vec{r} \cdot \hat{\omega})\, \hat{\omega}$ and $\hat{\omega} = \vec{\omega}/|\vec{\omega}|$. Then, let $\mu_\perp(\vec{r}_\parallel)$ be the accumulated density within the entire cross-sectional plane perpendicular to, and uniquely identified by, $\vec{r}_\parallel$; i.e., $\mu_\perp(\vec{r}_\parallel) \equiv \int_{\mathbb{R}^{n-1}} f(\vec{r}_\parallel + \vec{r}_\perp) \, d^{n-1} r_\perp$. We then find that, in general,
$$I_{\text{diff}}(\omega \hat{\omega}) = c \left| \int_{\mathbb{R}} \mu_\perp(r_\parallel \hat{\omega})\, e^{-i \omega r_\parallel} \, dr_\parallel \right|^2 . \tag{A1}$$
In particular, we see that the diffraction pattern along any line $\vec{\omega} = \omega \hat{\omega}$ (with varying $\omega$ but fixed $\hat{\omega}$) is the power spectrum of the net magnitude of scatterers within sequential cross sections of real space perpendicular to $\hat{\omega}$.

For molecular or crystalline structures, when the net scatterer density is well approximated by a superposition of more elementary densities $f(\vec{r}) = \sum_j f_j(\vec{r} - \vec{r}_j)$, we obtain the alternative expression (A2). There, the layer-scattering factors are defined for cross-sectional layers of depth $d$, where $r_{\parallel, j} = \vec{r}_j \cdot \hat{\omega}$, in terms of $F_j(\vec{\omega})$, the $n$-dimensional Fourier transform of $f_j(\vec{r})$.
Again, the diffraction pattern (along some line in reciprocal space passing through the origin) appears as a one-dimensional power spectrum. This time, however, the diffraction pattern is the power spectrum of the complex-valued layer-scattering factors $F^{(\ell)}(\vec{\omega})$ over a discrete spatial domain. The frequency dependence of $F^{(\ell)}(\vec{\omega})$ is often factored out to 'correct' the diffraction pattern, so that only the structure of interest remains [75,76]. The corrected diffraction pattern is thus a standard power spectrum, with the form of Eq. (1). Equation (A2) is an especially useful expression for layered structures, as demonstrated in Ref. [56].
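The slice-projection relationship behind Eq. (A1) can be checked numerically on a grid. The following sketch is our own illustration (the density, grid size, and use of NumPy's discrete FFTs are invented for the demonstration): the $k_y = 0$ line of a two-dimensional diffraction pattern coincides with the one-dimensional power spectrum of the density accumulated over perpendicular cross sections.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy 2-D scatterer density on a grid (nonnegative, arbitrary structure).
f = rng.random((64, 64))

# Full 2-D "diffraction pattern": squared magnitude of the 2-D Fourier transform.
F2 = np.fft.fft2(f)
I_diff = np.abs(F2) ** 2

# Projection-slice check: the pattern along the k_y = 0 line equals the 1-D
# power spectrum of the density accumulated over cross sections (rows summed).
mu_perp = f.sum(axis=0)               # integrate out the perpendicular coordinate
P_1d = np.abs(np.fft.fft(mu_perp)) ** 2

print(np.allclose(I_diff[0, :], P_1d))
```

The check holds exactly (up to floating-point error) because summing over one grid axis before a 1-D FFT is identical to evaluating the 2-D FFT at zero frequency along that axis.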

Appendix B: Autocorrelation for processes generated by autonomous HMMs
Let's derive the autocorrelation function in general and in closed form for the class of autonomous HMMs introduced in the main text. Helpfully, for particular models, the expressions become analytic in terms of the model parameters.
Directly calculating, we find the autocorrelation function, for τ > 0, for any such HMM, where the integrals are written in a form meant to be easily accessible but should generally be interpreted as Lebesgue integrals. In the derivation, note that the conditional-probability step holds by definition of conditional probability. The decomposition for τ ≠ 0 follows from the conditional independence in the relevant Bayesian network shown in Fig. 3. Moreover, the marginalization step can be derived by summing over all possible intervening state sequences. We work in the hidden-state basis, where $|s\rangle$ is the column vector of all 0s except for a 1 at the index corresponding to state s, while $\langle s|$ is simply its transpose. This yields a natural decomposition of the identity operator: $I = \sum_{s \in \mathcal{S}} |s\rangle\langle s|$.
Since the autocorrelation is a Hermitian function, i.e., $\gamma(-\tau) = \overline{\gamma(\tau)}$, we find the full autocorrelation function in terms of the $|\mathcal{S}|$-by-$|\mathcal{S}|$ matrix $\Omega \equiv \sum_{s \in \mathcal{S}} \langle X \rangle_{p(X|s)} \, |s\rangle\langle s|$. The Ω matrix simply places the state-conditioned average outputs along its diagonal.
To better understand the range of possible behaviors of the autocorrelation, we can go a step further. In particular, we employ the general spectral decomposition of $T^\tau$ derived in Ref. [55] for nonnormal and potentially nondiagonalizable operators,
$$T^\tau = \sum_{\lambda \in \Lambda_T} \sum_{m=0}^{\nu_\lambda - 1} \binom{\tau}{m} \lambda^{\tau - m} \, T_{\lambda, m} \, , \tag{B2}$$
where $\binom{\tau}{m}$ is the generalized binomial coefficient, with $\binom{\tau}{0} = 1$. As briefly summarized in Sec. II D, $\Lambda_T$ is the set of T's eigenvalues, while $T_\lambda$ is the spectral projection operator associated with the eigenvalue λ. Recall that $\nu_\lambda$ is the index of the eigenvalue λ, i.e., the size of the largest Jordan block associated with λ, and that $T_{\lambda,m} = T_\lambda (T - \lambda I)^m$. Substituting Eq. (B2) into Eq. (B1) yields the autocorrelation for τ > 0. It is significant that the zero eigenvalue contributes a qualitatively distinct ephemeral behavior to the autocorrelation while $|\tau| < \nu_0$. All other eigenmodes contribute products of polynomials and decaying exponentials in τ. When T is diagonalizable, the autocorrelation is simply a sum of decaying exponentials.
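As a numerical sanity check on this decomposition, the sketch below (a three-state chain whose transition matrix and output means are invented for illustration) verifies that the closed-form autocorrelation $\langle \pi | \Omega\, T^\tau \Omega | 1 \rangle$ agrees with the eigen-expansion of $T^\tau$ for a diagonalizable T.

```python
import numpy as np

# Small 3-state Moore-type HMM; parameters invented for illustration.
T = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.2, 0.3, 0.5]])
Omega = np.diag([1.0, -0.5, 2.0])   # state-conditioned average outputs

# Stationary distribution pi: left eigenvector of T with eigenvalue 1.
w, V = np.linalg.eig(T.T)
pi = np.real(V[:, np.argmin(np.abs(w - 1))])
pi /= pi.sum()
one = np.ones(3)

# Closed-form autocorrelation <pi| Omega T^tau Omega |1> for tau > 0.
def gamma(tau):
    return pi @ Omega @ np.linalg.matrix_power(T, tau) @ Omega @ one

# Spectral route: for diagonalizable T, T^tau = sum_i lambda_i^tau |r_i><l_i|.
lam, R = np.linalg.eig(T)
L = np.linalg.inv(R)                # rows are the matching left eigenvectors
def gamma_spectral(tau):
    Ttau = sum(lam[i]**tau * np.outer(R[:, i], L[i, :]) for i in range(3))
    return np.real(pi @ Omega @ Ttau @ Omega @ one)

ok = all(np.isclose(gamma(t), gamma_spectral(t)) for t in range(1, 8))
print(ok)
```

Here the eigenvalues are {1, 0.6, 0.4}, so the autocorrelation is a sum of decaying exponentials plus the constant mode from eigenvalue 1, exactly as the text describes.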

Appendix C: Analytical power spectra
The following derives both the continuous and discrete parts of the power spectrum for HMM-generated processes. The development parallels that in Ref. [56], although that derivation was restricted to the special case of diffraction patterns from Mealy (i.e., edge-emitting) HMMs with countable alphabets. In contrast, the following derives analytical expressions for the power spectrum of any stochastic process generated by an HMM. Notably, it also allows uncountably infinite alphabets. Also, it is developed for Moore (i.e., state-emitting) HMMs, although Mealy and Moore HMMs are class-equivalent and can easily be transformed into one another.

Diffuse Spectra
Recall Eq. (3) and Eq. (4)'s explicit expression for the correlation function. From these, we can rewrite the power spectrum directly in terms of the generating HMM's transition matrix, using the fact that $z + \overline{z} = 2\,\mathrm{Re}(z)$ for any $z \in \mathbb{C}$. For convenience, we introduce the variable $z \equiv e^{-i\omega}$ and note that the summation splits. For positive integer N, two finite geometric-sum identities always hold; hence, whenever $I - zT$ is invertible (i.e., whenever $e^{i\omega} \notin \Lambda_T$), the sums take closed form. Noting that $(z^{-1} I - T)^{-1} = (e^{i\omega} I - T)^{-1}$, this yields the continuous (i.e., diffuse) part of the power spectrum in closed form. Equation (C3) is the principal result. However, it is also worth noting that Eq. (C2) (without the N → ∞ limit yet being taken) provides the exact result for the expected periodogram from finite-length-N samples.
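The geometric-series step underlying this closed form can be verified numerically. In the sketch below (a mixing three-state chain with invented parameters and zero-mean state outputs, so that no δ-peak arises at ω = 0), the truncated lag sum $\sum_{\tau \ge 1} z^\tau \gamma(\tau)$ matches the resolvent expression $\langle \pi | \Omega\, T (e^{i\omega} I - T)^{-1} \Omega | 1 \rangle$.

```python
import numpy as np

# Illustrative 3-state chain; all parameters are invented for this sketch.
T = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.3, 0.2, 0.5]])
w, V = np.linalg.eig(T.T)
pi = np.real(V[:, np.argmin(np.abs(w - 1))])
pi /= pi.sum()
vals = np.array([1.0, -1.0, 0.5])
vals -= pi @ vals                 # zero overall mean: no delta-peak at omega = 0
Omega = np.diag(vals)
one = np.ones(3)

def gamma(tau):                   # <pi| Omega T^tau Omega |1>
    return pi @ Omega @ np.linalg.matrix_power(T, tau) @ Omega @ one

def S_sum(omega, K=500):          # truncated sum over positive lags
    z = np.exp(-1j * omega)
    return sum(z**t * gamma(t) for t in range(1, K))

def S_resolvent(omega):           # closed form via the resolvent of T
    R = np.linalg.inv(np.exp(1j * omega) * np.eye(3) - T)
    return pi @ Omega @ T @ R @ Omega @ one

err = max(abs(S_sum(om) - S_resolvent(om)) for om in np.linspace(0.2, 3.0, 8))
print(err)
```

The truncation converges geometrically at the modulus of the subdominant eigenvalues (about 0.32 here), so five hundred lags leave the two routes equal to machine precision.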

Discrete Spectra
The transition dynamic's eigenvalues on the unit circle, $\Lambda_{\rho(T)} = \{ \lambda \in \Lambda_T : |\lambda| = 1 \}$, are responsible for a power spectrum's Dirac δ-functions. In the physical context of diffraction patterns, these δ-functions are the familiar Bragg peaks. For finite length-N samples, eigenvalues on the unit circle give rise to Dirichlet kernels. As N → ∞, the analysis simplifies, since the Dirichlet kernels converge to δ-functions.
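The finite-N signature of such a peak is easy to see directly. In this sketch (a period-2 emitter with additive noise; all parameters invented for illustration), the periodogram evaluated at ω = π grows linearly with N, which is how a Dirichlet kernel converging to a δ-function appears in finite data.

```python
import numpy as np

rng = np.random.default_rng(2)

def periodogram_at_pi(N):
    # Period-2 hidden chain: state s_t = t mod 2 emits mean (-1)^{s_t} plus noise.
    t = np.arange(N)
    x = (-1.0) ** t + 0.1 * rng.standard_normal(N)
    X = np.sum(x * np.exp(-1j * np.pi * t))
    return np.abs(X) ** 2 / N

p1, p2 = periodogram_at_pi(1000), periodogram_at_pi(4000)
print(p2 / p1)   # grows roughly linearly with N
```

A diffuse peak, by contrast, would saturate: its periodogram height approaches a finite limit as N grows.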
The following derives the exact form of the δ-function contributions, showing how their presence and integrated magnitude can be calculated directly from the stochastic transition dynamic. Recall that the spectral projection operator $T_{\lambda,0}$ associated with the eigenvalue λ can be defined as the residue of $(zI - T)^{-1}$ as $z \to \lambda$. The spectral companion operators have the useful properties that $T_{\lambda,m} T_{\zeta,n} = \delta_{\lambda,\zeta}\, T_{\lambda,m+n}$ and $T_{\lambda,m} = 0$ for $m \geq \nu_\lambda$. The index $\nu_\lambda$ of the eigenvalue λ is the size of the largest Jordan block associated with λ. The Perron-Frobenius theorem guarantees that all eigenvalues on the unit circle have index one, i.e., $\nu_\lambda = 1$ for all $\lambda \in \Lambda_{\rho(T)}$. This means that the algebraic and geometric multiplicities of these eigenvalues coincide and that they are all associated with diagonalizable subspaces.
Taking advantage of the index-one nature of the eigenvalues on the unit circle, and using the shorthand $T_\lambda \equiv T_{\lambda,0}$ for the spectral projection operators, we define the unit-circle component Θ. We then consider how the spectral decomposition of $T^\tau$ splits into contributions from these two independent components. From Ref. [55], and employing the simplifying notation $0^{\tau - m} = \delta_{\tau - m, 0}$, we obtain the split decomposition, where $\binom{\tau}{m}$ is the generalized binomial coefficient.
As the sequence length N → ∞, the summation over τ in Eq. (C1), divided by the sequence length, converges to Eq. (C4). In Eq. (C4), only the summation involving Θ is capable of contributing δ-functions. Expanding that sum yields Eq. (C5), where $\omega_\lambda$ is related to λ by $\lambda = e^{i\omega_\lambda}$. The last line is obtained using well-known properties of the discrete-time Fourier transform [77].
From Eqs. (C1), (C4), and (C5), we find that the potential δ-function at $\omega_\lambda$ (and its 2π-periodic offsets) has integrated magnitude given by Eq. (C6). Finally, from Eq. (C6) and the 2π-periodicity of the power spectrum, we obtain the full discrete (i.e., δ-function) contribution to the power spectrum.

Appendix D: Cross-correlation and spectral densities

Cross-correlation and cross-spectral densities are often important in applications [78,79]. They may be especially useful when analyzing input-output processes, to characterize the correlation between input and output or between different aspects of the output. Our results can easily be extended to address these quantities.
Using an HMM that describes the joint stochastic process of two observables $(x, y) \in \mathcal{A}$, it is straightforward to generalize our developments to the cross-correlation $\gamma_{XY}(\tau)$ (rather than necessarily the autocorrelation $\gamma = \gamma_{XX}$) and the associated cross-spectral density $P_{XY}(\omega)$ of distinct observables $x \in \mathcal{X}$ and $y \in \mathcal{Y}$. The individual stochastic process for each observable by itself can be obtained simply by marginalizing over the other observable.
Explicitly, the expressions take the same form as before, with the relevant matrix elements built from quantities of the form $\langle \pi | s \rangle \, \langle x y \rangle_{p(X,Y|s)}$.
Moreover, the continuous part of the cross-spectral density is given by: And so on.

Appendix E: Continuous-time processes
For simplicity and generality, the main development addressed discrete-time dynamics. (Indeed, discrete-time dynamics are, in a sense, more general than continuous-time dynamics, since continuous-time dynamics can be obtained as the limiting behavior of discrete-time dynamics.) However, correlation functions and power spectra are often applied to continuous-time processes, and so the continuous-time setting is often of direct interest. The following makes a more explicit connection to continuous-time processes.
First, we note that a continuous-time process is typically observed only intermittently, at some sampling frequency $f_0$. The duration $\tau_0 = 1/f_0$ between observations thus induces a discrete-time transition operator $T_{\tau_0}$ between states in that time interval. In such cases, the discrete-time transition matrix is related to the continuous-time generator G of time evolution by $T_{\tau_0} = e^{\tau_0 G}$. Accordingly, the continuous-time generator can be obtained from the discrete-time dynamic as $G = f_0 \ln T_{\tau_0}$ [80]. The relationship between discrete and continuous time is the same relationship that yields the well-known conformal mapping of the interior of the unit circle in the complex plane to the left half of the complex plane, which also relates z-transforms and Laplace transforms. Of most direct relevance here, the eigenvalues of $T_{\tau_0}$ and G are simply related by $\Lambda_{T_{\tau_0}} = \{ e^{\tau_0 \zeta} : \zeta \in \Lambda_G \}$.
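This correspondence can be checked with standard matrix functions. In the sketch below the generator G is invented for illustration; `scipy.linalg.expm` and `scipy.linalg.logm` compute the matrix exponential and the principal matrix logarithm.

```python
import numpy as np
from scipy.linalg import expm, logm

# Invented continuous-time generator G: nonnegative off-diagonal rates and
# zero row sums, matching the row-stochastic convention used for T.
G = np.array([[-0.5, 0.3, 0.2],
              [0.1, -0.4, 0.3],
              [0.2, 0.2, -0.4]])
tau0 = 0.7
T = expm(tau0 * G)                  # discrete-time transition matrix T_{tau0}

row_stochastic = np.allclose(T.sum(axis=1), 1.0)

# Eigenvalue correspondence: Lambda_T = { e^{tau0 zeta} : zeta in Lambda_G }.
lamT = np.sort_complex(np.linalg.eigvals(T))
lamG = np.sort_complex(np.exp(tau0 * np.linalg.eigvals(G)))
eigs_match = np.allclose(lamT, lamG)

# Recovering the generator: G = f0 ln T_{tau0}, here via the principal log.
recovered = np.allclose(logm(T) / tau0, G)

print(row_stochastic, eigs_match, recovered)
```

The principal-branch recovery works here because the eigenvalues of $\tau_0 G$ have imaginary parts well inside $(-\pi, \pi)$; for faster sampling or larger rotation rates the branch choice in the matrix logarithm must be handled with care.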
Continuous-time representations can also be analyzed directly. Consider the generic case of a continuous-time dynamic over a hidden state space, with two types of example in mind. These settings all have the same expressions for the autocorrelation and power spectrum, which we now give in closed form. For τ > 0, we find the autocorrelation and, from it, derive the continuous part of the power spectrum $P_c(\omega)$ with respect to angular frequency $\omega = 2\pi f \in \mathbb{R}$. Appealing to the resolvent's spectral expansion again allows us to better understand the possible shapes of the power spectrum. Since all of the frequency dependence is isolated in the denominator, and since $\langle \pi | \Omega \, G_{\lambda,m} \, \Omega | 1 \rangle$ is a frequency-independent complex-valued constant, peaks in $P_c(\omega)$ arise only via contributions of the form $\mathrm{Re}\!\left[ c / (i\omega - \lambda)^n \right]$ for $c \in \mathbb{C}$, $\omega \in \mathbb{R}$, $\lambda \in \Lambda_G$, and $n \in \mathbb{Z}^+$. This provides a rich starting point for applications and further theoretical investigation. For example, Eq. (E1) helps explain the shapes of power spectra of nonlinear dynamical systems, as have appeared, e.g., in Ref. [10]. Furthermore, it suggests an approach to the inverse problem of inferring the spectrum of the hidden linear dynamic from power spectra.
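The continuous-time closed form can be tested against a brute-force Fourier integral. This sketch uses an invented generator and zero-mean state outputs (so only the diffuse part remains), comparing $P_c(\omega) = 2\,\mathrm{Re}\,\langle \pi | \Omega (i\omega I - G)^{-1} \Omega | 1 \rangle$ against direct numerical integration of $\gamma(\tau) = \langle \pi | \Omega\, e^{G\tau} \Omega | 1 \rangle$.

```python
import numpy as np
from scipy.linalg import expm

# Invented generator; zero-mean outputs remove the delta-peak at omega = 0.
G = np.array([[-0.5, 0.3, 0.2],
              [0.1, -0.4, 0.3],
              [0.2, 0.2, -0.4]])
w, V = np.linalg.eig(G.T)
pi = np.real(V[:, np.argmin(np.abs(w))])   # stationary distribution: pi G = 0
pi /= pi.sum()
vals = np.array([2.0, -1.0, 0.0])
vals -= pi @ vals                          # center the state outputs
Omega = np.diag(vals)
one = np.ones(3)

def gamma(tau):                            # <pi| Omega e^{G tau} Omega |1>
    return pi @ Omega @ expm(G * tau) @ Omega @ one

def P_resolvent(omega):                    # diffuse part via the resolvent of G
    R = np.linalg.inv(1j * omega * np.eye(3) - G)
    return 2 * np.real(pi @ Omega @ R @ Omega @ one)

dt = 0.01
taus = np.arange(0.0, 60.0, dt)
g = np.array([gamma(t) for t in taus])     # autocorrelation on a fine grid

def P_numeric(omega):                      # trapezoid-rule Fourier integral
    y = g * np.exp(-1j * omega * taus)
    return 2 * np.real((np.sum(y) - 0.5 * (y[0] + y[-1])) * dt)

err = max(abs(P_resolvent(om) - P_numeric(om)) for om in (0.3, 1.0, 2.5))
print(err)
```

The agreement reflects the identity $\int_0^\infty e^{-i\omega\tau} e^{G\tau}\, d\tau = (i\omega I - G)^{-1}$ on the decaying subspace, with the eigenvalue-zero mode contributing nothing because the outputs are centered.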
Appendix F: Pairwise mutual information example

For the process generated by the HMM given in Fig. 6, taking the limit of ever-narrower Gaussians in the state-conditioned PDFs (so that we work with pairs of δ-functions), the process becomes Markovian and the pairwise mutual information can be calculated exactly. Continuing, $\langle s | T^\tau | s' \rangle$ can be calculated via T's spectral decomposition. Since T is diagonalizable and nondegenerate for all values of the transition parameter p, we find:

Moreover, $\langle s | T_1 | s' \rangle = \langle s | 1 \rangle \langle \pi | s' \rangle = \pi(s')$, so $\langle s | T^\tau | s' \rangle$ simplifies somewhat. In fact, Eq. (F1) is valid for any set of four PDFs we could have chosen for the example HMM's states, as long as the PDFs all have mutually exclusive support for the observable output, since this makes the hidden state a function of the instantaneous observable.
Using the linear algebra of Eq. (F1), we calculate the pairwise mutual information and the POPI spectrum numerically. The pairwise mutual informations are shown for p ∈ {0.1, 0.5, 0.9} in Fig. 8. Reasonably, the loss of information is monotonic in temporal distance. More surprisingly, the decay of pairwise mutual information is very nearly exponential, as made clear in the inset logarithmic plot.
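The qualitative behavior has a simple closed form in an even smaller Markovian example. For a symmetric two-state chain with stay-probability p (an invented stand-in for the four-state example, in which the hidden state is likewise readable from the instantaneous output), the same-state probability after τ steps is $(1 + (2p-1)^\tau)/2$, and the pairwise mutual information follows directly.

```python
import numpy as np

# Symmetric two-state Markov chain with stay-probability p (invented example).
def pairwise_mi(p, tau):
    # Same-state probability after tau steps: (1 + (2p - 1)^tau) / 2.
    q = (1 + (2 * p - 1) ** tau) / 2
    # Joint distribution of (X_t, X_{t+tau}) under the uniform stationary law.
    joint = np.array([[q / 2, (1 - q) / 2],
                      [(1 - q) / 2, q / 2]])
    mask = joint > 0
    # Mutual information in bits; both marginals are uniform (1/2, 1/2).
    return float(np.sum(joint[mask] * np.log2(joint[mask] / 0.25)))

mis = [pairwise_mi(0.9, tau) for tau in range(1, 20)]
monotone = all(a > b > 0 for a, b in zip(mis, mis[1:]))
print(monotone)
```

The decay is monotone and asymptotically exponential at rate $(2p-1)^2$ per step, mirroring the near-exponential decay seen in the inset of Fig. 8.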
The POPI spectrum, which for a wide-sense stationary process can be rewritten as a sum of modulated pairwise mutual informations, is shown for these same p-values in Fig. 9. The POPI spectrum was approximated by truncating the summation at a sufficiently large separation of τ = 2000.

For a given t and x, the matrix elements $\langle s | T^{(x)}_t(\vec{m}) | s' \rangle$ provide the probability density of transitioning from state s to s' while emitting the observable x, where $p_{\vec{m}}$ is the probability density (induced by $\vec{m}$) of the labeled transition. The symbol-labeled transition matrices $\{ T^{(x)}_t(\vec{m}) \}_{t \in \mathcal{T}, x \in \mathcal{A}}$ yield the net state-to-state transition probabilities when marginalizing over all possible observations: $\langle s | T_t(\vec{m}) | s' \rangle = \Pr_{\vec{m}}(S_{t+1} = s' \,|\, S_t = s)$.
Hidden message and embedding protocol . . .

Accordingly, panel (a) suggests that the transited edge determines the probability of the observable, whereas the decomposition of the bottom panel (b) suggests that the observation determines the probability of the latent-state transition. That both decompositions are valid implies, perhaps surprisingly, that the interpretations have no physical distinction. The interpretation of causality is ambiguous, although each calculus of conditional dependencies is reliable.
The measurement feedback models may initially appear rather restrictive when considering the possibilities of, say, measuring a quantum system in different bases and with different instruments. However, in principle, the different measurement choices are incorporated through the different transformations T t ( m), both through any pre-determined measurement choices in m and through dynamic-determination via feedback of the measurement outcomes themselves.
Reference [81]'s process tensors can also be used to model classical observable processes generated by general quantum dynamics. Although unnecessarily elaborate for most purposes, process tensors are appealing since they rigorously incorporate general quantum measurements. Ultimately, though, they, together with a set of "experiments" $\vec{m}$, could be mapped onto the rather simpler alternative models proposed here, if the goal is only to model the observable classical output process.

Theorem 2 for Measurement Feedback
The MFM's average-observation matrices are notably no longer diagonal in the hidden-state basis. Rather, they assign to each matrix element the average observation associated with that transition, multiplied by the probability of the edge being traversed, conditioned on occupying the outgoing state. If the process is wide-sense stationary, then for τ > 0 the autocorrelation takes a form which must be t-independent.
For input-independent processes with time-independent transition dynamics, where $T^{(x)}_t(\vec{m}) = T^{(x)}$ and $\mu_1 = \pi$, this simplifies to the autonomous Mealy-type HMMs with continuous PDFs for the observable associated with each hidden-state-to-state transition.

The autocorrelation function (for τ ≥ 1) then reduces to $\gamma(\tau) = \langle \pi | \Omega \, T^{\tau - 1} \Omega | 1 \rangle$, with a corresponding expression for the power spectrum's continuous part. Note that this expression lacks one factor of T, the transition dynamic, when compared to Eq. (6). This follows since Ω itself induces a transition for these Mealy-type HMMs, reducing the number of subsequent transitions by one.

Let's return to the general setting for autocorrelation given by Eq. (G2), for processes generated by possibly-input-dependent models. Developing the analog of Thm. 2 requires recognizing that the average observation on each edge now matters, rather than, as previously, the average observation from each state. For MFMs, constant autocorrelation and a flat power spectrum can again be guaranteed by a rather weak condition: the average output of the current edge does not by itself influence the average output of a future edge.
More explicitly, consider the set of all edges $E^{(t)}$: transitions between hidden states that can be traversed at time t with positive probability. Since outputs occur during edge transitions, we redefine Ξ as the set of average outputs exhibited by the edges. Equation (G1) indicates that the desired average output of an edge $(s, s')$ is the ratio $\langle s | \Omega_t | s' \rangle / \langle s | T_t(\vec{m}) | s' \rangle$.
Furthermore, we define $E_t$ to be the random variable for the edge traversed at time t; i.e., $E_t$ is the joint random variable $E_t = (S_t, S_{t+1})$. And we define $E^{(t)}_\xi \subset E^{(t)}$ as the set of edges (at time t) with average output $\xi \in \Xi$: $E^{(t)}_\xi \equiv \{ (s, s') \in E^{(t)} : \langle s | \Omega_t | s' \rangle / \langle s | T_t(\vec{m}) | s' \rangle = \xi \}$.
With these in hand, we can state the theorem analogous to Thm. 2.
and there exists a constant $c \in \mathbb{C}$ such that the stated condition holds for all separations τ > 0, $t \in \mathcal{T}$, and $\xi, \xi' \in \Xi$.
Proof. Starting from Eq. (G2), we find the autocorrelation for all such processes by calculating the appropriate sum of edge-average outputs weighted by $\Pr(E_t = (s, s'),\, E_{t+\tau} = (s'', s'''))$. Now, suppose that the stated stationarity condition holds and that there exists some constant $c \in \mathbb{C}$ such that the edge-conditioned expectation condition holds for all separations τ > 0, $t \in \mathcal{T}$, and $\xi, \xi' \in \Xi$. Then we find that the autocorrelation is a constant for all separations τ > 0, $t \in \mathcal{T}$, and $\xi, \xi' \in \Xi$. Finally, a process with stationary low-order statistics and a flat autocorrelation has a flat power spectrum, as an immediate consequence of Eq. (3). This proves Thm. 3.
For the special case of an autonomous HMM that generates observations during hidden-state-to-state transitions, this condition simplifies significantly. Specifically, $\Omega_t \to \Omega$ and $T_t(\vec{m}) \to T$ become t-independent, which furthermore means that $E^{(t)}_\xi \to E_\xi$ becomes t-independent. For autonomous wide-sense stationary processes, we have $\Pr(E_t) = \Pr(E_{t+\tau})$ for all separations τ > 0 and for all $t \in \mathcal{T}$. It then trivially follows that $\sum_{\xi \in \Xi} \xi \Pr(E_t \in E_\xi)$ is constant for all $t \in \mathcal{T}$. So, the only requirement for an autonomous edge-emitting HMM to produce fraudulent white noise is that it satisfies the condition $\Pr(E_{t+\tau} \in E_{\xi'} \,|\, E_t \in E_\xi) = \Pr(E_{t+\tau} \in E_{\xi'})$ for all separations τ > 0, $t \in \mathcal{T}$, and $\xi, \xi' \in \Xi$.
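The condition holds trivially in the extreme case where Ξ is a singleton: if every edge has the same average output, the process is fraudulent white noise regardless of how much structure hides in higher-order statistics. The following sketch (an invented two-state, edge-emitting example in which edges leaving state 0 emit ±1 and edges leaving state 1 emit ±3, so every edge's average output is zero) simulates such a process: its autocorrelation is flat while |X_t| remains strongly correlated.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000
p_stay = 0.95

# Persistent two-state hidden chain; every edge's average output is 0,
# so Xi = {0} and the flat-spectrum condition is satisfied trivially.
s = np.zeros(N, dtype=int)
flips = rng.random(N) < 1 - p_stay
for t in range(1, N):
    s[t] = s[t - 1] ^ int(flips[t])
x = rng.choice([-1.0, 1.0], size=N) * np.where(s == 0, 1.0, 3.0)

def acorr(v, tau):
    v = v - v.mean()
    return float(np.mean(v[:-tau] * v[tau:]))

flat = [acorr(x, t) for t in (1, 2, 5, 10)]            # near 0 at every lag
hidden = [acorr(np.abs(x), t) for t in (1, 2, 5, 10)]  # decays slowly instead
print(flat)
print(hidden)
```

The signal itself is indistinguishable from white noise by any pairwise statistic, yet the simple nonlinear observable |X_t| immediately exposes the hidden state's slow dynamics.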
Theorem 3 provides a very general condition for flat power spectra from measurement feedback models.
Appendix H: Theorem 2 for time-dependent PDFs

Thm. 3 suggests how Thm. 2 generalizes even further, to possibly-input-dependent hidden-state models with time-dependent PDFs associated with each state. We will call these morphing hidden models (MHMs), $M_{\text{MHM}}(\vec{m})$. MHMs include, as special cases, all models (Moore-type HMMs and input-dependent generators) considered in the main text. We employ methods similar to those used in § G 2.
An MHM is a possibly-input-dependent generator of an observable output process $\{X_t\}_{t \in \mathcal{T}}$. The output is generated via $M_{\text{MHM}}(\vec{m}) = \big( \mathcal{S}, \mathcal{A}, \{P_t(\vec{m})\}_t, \{T_t(\vec{m})\}_t, \mu_1 \big)$. Here, again, the lengths and alphabets of the inputs and outputs need not be commensurate. That is, the internal states $\mathcal{S}$ and output alphabet $\mathcal{A}$ are static. However, the hidden-state-to-state transition matrix $T_t(\vec{m})$, as well as the set of state-dependent PDFs $P_t(\vec{m})$, is time-dependent, such that its value at time t is potentially a function of the full input vector $\vec{m}$. More specifically, $P_t(\vec{m})$ is the set of hidden-state-conditioned probability density functions $p_{\vec{m}}(X_t | s)$ at time t. As before, $\mu_1$ specifies the initial distribution over hidden states: $S_1 \sim \mu_1$.
For such cases, set $\Omega_t \equiv \sum_{s \in \mathcal{S}} \langle x \rangle_{p_{\vec{m}}(X_t | s)} \, |s\rangle\langle s|$. The $\Omega_t$ matrix is time-dependent, with the state-conditioned expected outputs along its diagonal.
Since the average state output now varies in time, we must generalize Ξ from its more restricted use in the main text. Specifically, we redefine Ξ as the set of state-conditioned average outputs generated throughout time. Using these, we can state the following theorem, which generalizes Thm. 2; its conditions must hold for all separations τ > 0, $t \in \mathcal{T}$, and $\xi, \xi' \in \Xi$.
Proof. For the processes under consideration, we find the linear pairwise correlation (for τ ≥ 1), $\langle X_t X_{t+\tau} \rangle_{p_{\vec{m}}(X_t, X_{t+\tau})}$, to be constant for all $t \in \mathcal{T}$ and all $\xi, \xi' \in \Xi$. That $\langle X_t X_{t+\tau} \rangle_{p_{\vec{m}}(X_t, X_{t+\tau})}$ is constant verifies that the autocorrelation does not depend on the overall time shift of the process, so $\langle X_t X_{t+\tau} \rangle_{p_{\vec{m}}(X_t, X_{t+\tau})} = \gamma(\tau)$. Moreover, γ(τ) is constant. Finally, a process with constant autocorrelation has a flat power spectrum, as an immediate consequence of Eq. (3). This proves Thm. 4.