Fibonacci turbulence

Never is the difference between thermal equilibrium and turbulence so dramatic, as when a quadratic invariant makes the equilibrium statistics exactly Gaussian with independently fluctuating modes. That happens in two very different yet deeply connected classes of systems: incompressible hydrodynamics and resonantly interacting waves. This work presents the first case of a detailed information-theoretic analysis of turbulence in such strongly interacting systems. The analysis elucidates the fundamental roles of space and time in setting the cascade direction and the changes of the statistics along it. We introduce a beautifully simple yet rich family of discrete models with neighboring triplet interactions and show that it has families of quadratic conservation laws defined by the Fibonacci numbers. Depending on the single model parameter, three types of turbulence were found: single direct cascade, double cascade, and the first ever case of a single inverse cascade. We describe quantitatively how deviation from thermal equilibrium all the way to turbulent cascades makes statistics increasingly non-Gaussian and find the self-similar form of the one-mode probability distribution. We reveal where the information (entropy deficit) is encoded and disentangle the communication channels between modes, as quantified by the mutual information in pairs and the interaction information inside triplets.


I. INTRODUCTION
The existence of quadratic invariants and the Gaussianity of equilibrium in a strongly interacting system may seem exceptional. Indeed, generic systems have no invariants except the Hamiltonian. Strongly interacting systems have non-quadratic Hamiltonians, so that the equilibrium Gibbs distribution (the exponential of the Hamiltonian) is generally non-Gaussian. And yet two very distinct wide classes of physical systems have quadratic invariants and Gaussian statistics at thermal equilibrium. The first class is the family of hydrodynamic models, starting from the celebrated hydrodynamic Euler equation and including many equations for geophysical, astrophysical and magnetohydrodynamic flows. The second class, as will be described in this paper, contains systems of resonantly interacting waves. We show that the discretized models of the first class exactly correspond to the second one. We shall consider one particular (arguably the simplest) family of such models and describe far-from-equilibrium (turbulent) states of such systems.
One calls turbulence a state of any system in which many degrees of freedom deviate far from thermal equilibrium. Therefore, studies of turbulence encompass a wide variety of phenomena in nature and industry, from pipe flows to ripples on a puddle. It can be studied from the viewpoint of a mathematician, an engineer or a physicist. Here we employ the perspective of statistical physics, which is interested in the fundamental principles that determine statistical distributions in turbulence and thermal equilibrium. We shall use both the traditional viewpoint of cascades and the relatively recent viewpoint of information theory, that is, we address both the energy and the entropy of turbulence. So far, the statistical physics approach to turbulence has been to a large extent devoted to two quite distinct classes: systems of interacting waves, like those on the surface of the ocean or a puddle, and incompressible vortical flows where no waves are possible. Here we build a bridge between these two classes and show that discrete models of a certain kind can describe both.
On the one hand, the vorticity, $\omega = \nabla \times v$, of an isentropic flow of incompressible fluid satisfies the Euler equation: $\partial\omega/\partial t = \nabla \times (v \times \omega)$. Quite similar are two-dimensional hydrodynamic models, where a scalar field $a$ (vorticity, temperature, potential) is linearly related to the stream function $\psi$ of the velocity carrying the field: $\partial a/\partial t = -(v \cdot \nabla)a$, $v = (\partial\psi/\partial y, -\partial\psi/\partial x)$, $\psi(r) = \int dr'\, |r - r'|^{m-2} a(r')$. For the 2D Euler equation, $m = 2$. Other cases include surface geostrophic ($m = 1$), rotating shallow fluid or magnetized plasma ($m = -2$), etc. After Fourier transform, all such equations have quadratic nonlinearity and quadratic invariants. Then it was suggested [1] to model different cases of fluid turbulence by the chains of ODEs having a quadratic invariant $g_{ij} u_i u_j$ and these properties: On the other hand, consider resonantly interacting waves with the general Hamiltonian, where $V_{l,ij} \neq 0$ only if $\omega_i + \omega_j = \omega_l$. By the gauge transformation, $a_i = b_i \exp(\imath\omega_i t)$, we can turn the equations of motion, $\imath\dot b_i = \partial H_w/\partial b_i^*$, into a system of the type (1,2): This means that the quadratic and cubic parts of the Hamiltonian are conserved separately. If such a system is brought into contact with a thermostat, it is straightforward to show that the statistics is Gaussian: $\ln P\{a_i\} \propto -\sum_i \omega_i |a_i|^2$.
Our interest in resonances is connected to that in nonequilibrium. Thermal equilibrium does not distinguish between resonant and non-resonant interactions because of the detailed balance: whatever correlations can be built over time between resonantly interacting modes, the reverse process destroying these correlations is equally probable. This is not so away from thermal equilibrium, especially in turbulence.
Neglecting non-resonant interactions and accounting only for resonant ones is the standard approach to weakly interacting systems, even though the weak nonlinearity assumption breaks down for resonant modes. Weak turbulence theory gets around this by considering a continuous distribution and integrating over resonances to get the kinetic wave equation, which describes nonlinear evolution that is slow compared to linear oscillations with wave frequencies [2][3][4][5]. There is a tendency in theoretical statistical physics to restrict consideration to two opposite limits: either treat few modes or infinitely many. That preference is even stronger in the studies of non-equilibrium. And yet not only do most real-world phenomena fall in between these limits, but, as we show here, one learns some fundamental lessons by comparing equilibrium and non-equilibrium states of systems with a finite number of degrees of freedom, where phase coherence can play a prominent role. Condensed matter physics taught us a similar lesson by discovering the world of mesoscopic phenomena, where the system size was made smaller than the phase coherence length.
The previous treatment of mode discreteness was focused on the sparseness of resonances for the particular cases when the resonant surfaces $\omega_k + \omega_q = \omega_{|k+q|}$ did not pass through the integer lattice determined by a box [5,6]. Yet in many cases resonance surfaces lie in the lattice. For example, in a quite generic case of quadratic dispersion relation, $\omega_k \propto k^2$, the Pythagorean theorem makes the resonance surface for three-wave interactions just perpendicular to any wavevector, so that in any rectangular box resonantly interacting triads fill the lattice of the box eigenmodes.
The class of models (1,2,4) is ideally suited for the comparative analysis of thermal equilibrium and turbulence. We show here that such analysis sheds light on the most fundamental aspects of turbulence, particularly the roles of spatial and temporal scales in determining cascade directions and the build-up of intermittency. We consider the particular sub-class of models that allow only neighboring interactions, and find it the most versatile tool to date to study turbulence as an ultimate far-from-equilibrium state. We carry out here such a detailed study of the known types of direct-only and double cascades with unprecedented numerical resolution. Even more important, our models allow for an inverse-only cascade never encountered before.

II. FIBONACCI TURBULENCE
We consider a sub-class of the models (1,2,4) which is Hamiltonian with a local interaction: The equations of motion $\imath\dot a_i = \partial H/\partial a_i^*$ are as follows: This family of models (each characterized by $V_i$) can have numerous classical and quantum applications, since $i$ can denote real-space sites, spectral modes, masses of particles, the number of monomers in a polymer, etc. The Hamiltonian describes, in particular, decay and coalescence of waves or quantum particles, breakdown and coagulation of particles or polymerization of polymers, etc., when interactions of comparable entities are dominant. In particular, the model describes the resonant interaction of waves whose frequencies are the Fibonacci numbers. Indeed, such waves are described by the Hamiltonian The first term corresponds to the linear terms in the equations of motion, while the second term represents the only possible resonant interactions, since no nonconsecutive Fibonacci numbers sum into another Fibonacci number (Zeckendorf theorem). For any real $t$, the Hamiltonian (7) is invariant under the [...] The transformation (to the wave envelopes) reduces the equation of motion $\imath\dot a_i = \partial H_0/\partial a_i^*$ to (6). If $i$ are spectral parameters, they are usually understood as shell numbers. That means that one can define wave numbers as $F_i$. Here $\phi = (1+\sqrt{5})/2$ is the golden mean. It plays the role of an intershell ratio, since asymptotically at $|i| \gg 1$ the wave number depends exponentially on the mode number: $F_i \propto \phi^{|i|}$. The model (6) thus belongs to the class of the so-called shell models [7], that is (2) with neighboring interactions. Coefficients of shell models are chosen to have one or two quadratic integrals of motion. In particular, the Sabra shell model [8,9] for a particular choice of coefficients (unsurprisingly, connected by the golden ratio) coincides with (6), which is Hamiltonian and has the cubic integral of motion (5).
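The statement that only neighboring Fibonacci frequencies can be in three-wave resonance is easy to check by brute force. The following Python sketch is an illustration, not part of the original analysis: the function names are hypothetical, the frequencies are taken from $F_2$ onward so that all of them are distinct (since $F_1 = F_2 = 1$), and only sums of two different modes are considered. It also checks that the ratio of consecutive Fibonacci numbers approaches the golden mean.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden mean


def fib_list(n):
    """Return [F_1, ..., F_n] with F_1 = F_2 = 1."""
    fibs = [1, 1]
    while len(fibs) < n:
        fibs.append(fibs[-1] + fibs[-2])
    return fibs[:n]


def three_wave_resonances(n):
    """Index triples (i, j, l), 2 <= i < j, with F_i + F_j = F_l among F_2..F_n."""
    fibs = fib_list(n)
    # Map each distinct frequency value to its index (starting from F_2).
    index_of = {fibs[m - 1]: m for m in range(2, n + 1)}
    triples = []
    for i in range(2, n + 1):
        for j in range(i + 1, n + 1):
            l = index_of.get(fibs[i - 1] + fibs[j - 1])
            if l is not None:
                triples.append((i, j, l))
    return triples


fibs = fib_list(40)
triples = three_wave_resonances(40)
# Every resonant triple found should be of the neighboring form (i, i+1, i+2).
only_neighbors = all(j == i + 1 and l == i + 2 for i, j, l in triples)
# Ratio of consecutive Fibonacci numbers, to be compared with PHI.
ratio = fibs[-1] / fibs[-2]
```

In this sketch `only_neighbors` is the quantity of interest: it is true when every resonance found among the first 40 frequencies couples three consecutive modes.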
It is straightforward to show that for arbitrary $V_i$, the dynamical equations (6) conserve a one-parameter family of quadratic invariants (generalizations of the Manley-Rowe invariants for three-wave interactions): $\mathcal{F}_k = \sum_i F_{i+k-1}|a_i|^2$, where $k$ could be of either sign if we define negative Fibonacci numbers: $F_{-j} = (-1)^{j+1}F_j$. All invariants can be obtained as linear combinations of any two of them. For example, the first two integrals are positive, independent, and in involution: In a closed system, the microcanonical equilibrium is [...]. We now add dissipation and white-in-time pumping: Here $\langle\xi_i a_j^*\rangle = \delta_{ij}P_i/2$. It is straightforward to show, also in a general case (3,4), that such forcing on average does not change the cubic Hamiltonian, since [...], which must be zero in a steady state. At least when all sums $\gamma_i + \gamma_{i-1} + \gamma_{i-2}$ are the same, $\sum_i H_i = H = 0$ (one can probably imagine exotic cases where separate $H_i \neq 0$, but we shall not consider them). If pumping and damping are in a detailed balance, so that $\sum_k \alpha_k F_{i+k-1} = \gamma_i/P_i$ for every $i$, the thermal equilibrium distribution is Gaussian: $P = \exp(-\sum_k \alpha_k \mathcal{F}_k)$; it is a steady solution of the Fokker-Planck equation: That solution realizes maximum entropy for given values of the invariants. The distribution is exactly Gaussian despite the system being described by a cubic Hamiltonian and thus strongly interacting. The only restriction on the numbers $\alpha_k$ is normalization. In particular, when only $\alpha_1 = 1/2T$ is nonzero, we get the equilibrium equipartition with the occupation numbers $n_i \propto 1/F_i$. In a turbulent cascade, the fluxes of the quadratic invariants can be expressed via the third cumulant. Gauge invariance and the Zeckendorf theorem ensure that the triple cumulants are nonzero only for consecutive modes in the inertial range: The right-hand side is the discrete divergence of the flux The third-order cumulants are zero in equilibrium, but in turbulence they are nonzero to carry the flux.
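The conservation of the whole family of quadratic invariants can be checked numerically. The sketch below is only an illustration under stated assumptions: the explicit form of the equations of motion is not reproduced in the text above, so the code assumes $\imath\dot a_i = V_{i-2}a_{i-2}a_{i-1} + V_{i-1}a^*_{i-1}a_{i+1} + V_i a^*_{i+1}a_{i+2}$ on an open chain, which is the form that follows from a real cubic Hamiltonian coupling each triplet $(i, i+1, i+2)$. The invariants are taken as $\sum_i F_{i+k-1}|a_i|^2$ with sites numbered from 1. All function names are hypothetical.

```python
import random


def fib(n):
    """Fibonacci number F_n for any integer n, with F_{-j} = (-1)**(j+1) * F_j."""
    if n < 0:
        return (-1) ** (-n + 1) * fib(-n)
    a, b = 0, 1  # F_0, F_1
    for _ in range(n):
        a, b = b, a + b
    return a


def rhs(a, V):
    """da_i/dt for the ASSUMED equations of motion on an open chain.

    Assumed form: i da_i/dt = V_{i-2} a_{i-2} a_{i-1}
                              + V_{i-1} conj(a_{i-1}) a_{i+1}
                              + V_i conj(a_{i+1}) a_{i+2}.
    Terms that refer to modes outside the chain are dropped.
    """
    N = len(a)
    out = []
    for i in range(N):
        s = 0j
        if i >= 2:
            s += V[i - 2] * a[i - 2] * a[i - 1]
        if 1 <= i <= N - 2:
            s += V[i - 1] * a[i - 1].conjugate() * a[i + 1]
        if i <= N - 3:
            s += V[i] * a[i + 1].conjugate() * a[i + 2]
        out.append(-1j * s)
    return out


def invariant_rate(a, V, k):
    """Time derivative of sum_i F_{i+k-1} |a_i|^2, with sites numbered from 1."""
    da = rhs(a, V)
    return sum(
        fib(site + k - 1) * 2.0 * (a[site - 1].conjugate() * da[site - 1]).real
        for site in range(1, len(a) + 1)
    )


random.seed(0)
N = 12
a = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]
V = [random.uniform(0.5, 2.0) for _ in range(N)]  # arbitrary real couplings
# Rates of change of the invariants for several k of either sign.
rates = {k: invariant_rate(a, V, k) for k in range(-3, 4)}
```

Under the assumed equations every entry of `rates` vanishes to rounding accuracy for arbitrary amplitudes and couplings, because each triplet contributes with the weight $F_{j+2} - F_{j+1} - F_j = 0$.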
In the inertial interval, the flux must be constant and its divergence zero. For our class of models, we are able to find analytically the form of the third cumulant (the analog of Kolmogorov's 4/5-law for fluid turbulence): where the real constant $C$ and the integer $M$ can be of either sign. Let us substitute (14) into (13) and show that all the fluxes are non-zero constants independent of $m$: The last equality follows from the Cassini identity: $F_mF_n + F_{m-1}F_{n-1} = F_{m+n-1}$. All the fluxes have the same sign for any $k$, that is, all the integrals $\mathcal{F}_k$ flow in the same direction for such solutions. We shall show in the next section what kind of fine-tuning is needed to get a double cascade when both cascades carry the same integrals. In [8], the (quartic) spectral flux of the (cubic) Hamiltonian was also defined, but pumping does not produce it, so that $H = 0$ in a steady turbulent state, as well as in thermal equilibrium.
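The Fibonacci identity used in the last step can be verified directly, including for the negative indices defined by $F_{-j} = (-1)^{j+1}F_j$. The following Python check is a small illustration with hypothetical function names; it uses exact integer arithmetic, so no tolerance is needed.

```python
def fib(n):
    """Fibonacci number F_n for any integer n, with F_{-j} = (-1)**(j+1) * F_j."""
    if n < 0:
        return (-1) ** (-n + 1) * fib(-n)
    a, b = 0, 1  # F_0, F_1
    for _ in range(n):
        a, b = b, a + b
    return a


def identity_holds(m, n):
    """Check F_m F_n + F_{m-1} F_{n-1} == F_{m+n-1} for integers m, n."""
    return fib(m) * fib(n) + fib(m - 1) * fib(n - 1) == fib(m + n - 1)


# Scan a square of indices of either sign.
all_ok = all(identity_holds(m, n) for m in range(-15, 16) for n in range(-15, 16))
```

The identity holds for indices of either sign because the extension to negative indices preserves the recurrence $F_{n+1} = F_n + F_{n-1}$.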
Every model of our family is completely characterized by specifying the dependence of $V_i$ on $i$. While thermal equilibrium does not depend on $V_i$ and is universal for the whole family, turbulence depends on $V_i$, as is clear from (14). In what follows, we shall consider the power-law dependence $V_i = F_i^\alpha$, which turns into the exponential dependence $V_i \propto \phi^{i\alpha}$ for $i \gg 1$. Therefore, the single real parameter $\alpha$ determines the model. Our choice of particular values for $\alpha$ below will make the connection between wave and hydrodynamical turbulence through the Fibonacci model more explicit.

III. CASCADE DIRECTION
To get an analytic insight into our turbulence, particularly to understand the flux direction, consider an invariant subspace of solutions with purely imaginary $a_k = \imath\rho_k$ for all $k$, which evolve according to Eq. (16). In this case, $H \equiv 0$. The invariant subspace owes its existence to the invariance of (6) with respect to the symmetry $a \to -a^*$. Consider the chain running between some integers $M$ and $N$, either positive or negative, and assume $V_i/V_{i-1} = \phi^\alpha$. Then for $\rho_i = A\phi^{i\beta}$ and $M+1 < i < N-1$ we obtain Eq. (17). The right-hand side of (17) vanishes for $\beta = -(1+\alpha)/3$, which defines a steady solution $\rho_i = \phi^{-i(1+\alpha)/3}$ (also with the replacement $\phi \to -1/\phi$). This solution can describe either a direct or an inverse cascade, since the symmetry $\rho \to -\rho$, $t \to -t$ means that one reverses the flux by changing the sign of $\rho$ in this case. Indeed, consider the evolution from the initial state where all amplitudes are zero except the first two, $\rho_M$ and $\rho_{M+1}$. The first term in (16) then will produce $\rho_{M+2}$ of the same sign as $V_M\rho_M\rho_{M+1}$, which makes the flux positive, as it should be for a direct cascade. Alternatively, by pumping the last two modes, the last term of (16) produces a negative flux. Which cascade can be realized in reality: direct, inverse or both? Physically it is clear that the sign of the flux must be determined by the only parameter $\alpha$, that is, by how the mode interaction depends on the mode number. Indeed, for $\alpha = 1/2$, the scaling of the steady flux solution coincides with that of the thermal equilibrium: $\rho_i^2 = \phi^{-i} \propto 1/F_i$. Such a state can be excited, for instance, by an imaginary pumping acting on every mode in detailed balance with dissipation. Physical common sense suggests that the cascade must carry the conserved quantity from excess to scarcity [3,10]. For $\alpha > 1/2$ the steady solution $\rho_i^2 = \phi^{-2(1+\alpha)i/3}$ decays with $i$ faster than the equipartition $\rho_i^2 \propto 1/F_i \propto \phi^{-i}$, so that it must correspond to a direct cascade. By the same token, we must have an inverse cascade for $\alpha < 1/2$.
Of course, such a consideration is a plausible argument, not a rigorous proof of the cascade sign. Getting a little ahead of ourselves, we mention here that we observe a double-cascade turbulence exactly at $\alpha = 1/2$.
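The steady power-law solution and the comparison of its slope with equipartition can be illustrated with a few lines of Python. This is a sketch under an explicit assumption: the equation for the purely imaginary amplitudes is not reproduced above, so the code assumes the form $\dot\rho_i = V_{i-2}\rho_{i-2}\rho_{i-1} - V_{i-1}\rho_{i-1}\rho_{i+1} - V_i\rho_{i+1}\rho_{i+2}$, which is consistent with the description of its first and last terms in the text. With $V_i \propto \phi^{i\alpha}$ and $\rho_i = A\phi^{i\beta}$ the right-hand side is proportional to the bracket computed by the hypothetical function `steady_residual`.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden mean


def steady_residual(alpha, beta):
    """Bracket multiplying phi**(i*(alpha + 2*beta)) in the ASSUMED rho-equation.

    It comes from substituting V_i ~ phi**(i*alpha), rho_i ~ phi**(i*beta)
    into V_{i-2} rho_{i-2} rho_{i-1} - V_{i-1} rho_{i-1} rho_{i+1}
    - V_i rho_{i+1} rho_{i+2}.
    """
    return PHI ** (-2 * alpha - 3 * beta) - PHI ** (-alpha) - PHI ** (3 * beta)


def cascade_type(alpha, tol=1e-12):
    """Classify by comparing the steady-solution slope with equipartition.

    rho_i^2 ~ phi**(slope * i) with slope = -2(1 + alpha)/3, while
    equipartition has slope -1. This encodes the plausible argument
    of the text, not a proof.
    """
    slope = 2 * (-(1 + alpha) / 3)
    if abs(slope + 1) < tol:
        return "double"
    return "direct" if slope < -1 else "inverse"


# The residual should vanish at beta = -(1 + alpha)/3 for any alpha.
residuals = [steady_residual(al, -(1 + al) / 3) for al in (0.0, 0.5, 1.0, 2.3)]
types = [cascade_type(al) for al in (0.0, 0.5, 1.0)]
```

The cancellation at $\beta = -(1+\alpha)/3$ relies on the golden-mean property $1 + 1/\phi = \phi$; the labels returned by `cascade_type` only restate the steeper-or-shallower-than-equipartition criterion.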
In a general complex case, arguing that the cascade changes direction when α crosses 1/2 is even less straightforward. The flux constancy determines the third moment, which only bounds the product of the second and fourth moments (the claim that it bounds the square root of the products of three second moments made in [11] is incorrect). Yet a plausible argument can be made as follows. The input rate of F k is equal to Π = P F p+k−1 where p is the position of the pumping. The input rate must be equal to the dissipation rate Π = 2γ d F d+k−1 n d for any choice of γ d taken at the dissipation position d.
In order for $n_d$ to smoothly match the cascade, one must choose $\gamma_d$ comparable to the inverse nonlinear interaction time: [...]. Such reasoning can be applied to every $i$, which in turn gives the estimate for the spectrum of occupation numbers: $n_i \propto (\Pi/V_iF_i)^{2/3}$. Since the direction of the flux is toward the occupation numbers that are lower than thermal equilibrium, $n_i \propto F_i^{-1}$, we see again that the flux changes direction when $V_i \propto F_i^{1/2}$. The dimensionless degree of non-Gaussianity on such a spectrum, $\xi$, must be independent of $i$. For the spectrum close to equilibrium, $\xi \propto F$ [...]. Figures 1 and 2 confirm these predictions. We place the pumping at a single mode, $i = p$, between two dissipation regions on the ends, letting the system choose the cascade direction. The system (10) with pumping and damping has been evolved numerically using the LSODE solver [12]. At each step, random Gaussian noise of power $P$ is applied to the pumping-connected mode, injecting the flux $\Pi_p = PF_p$. Damping with $\gamma_L$ and $\gamma_R$ is applied to the two left-most and two right-most modes, respectively. For $\alpha = 1/2$ ($V_i = \sqrt{F_i}$), the system is weakly distorted from equilibrium, with a constant flux on each side of the pumping. For $\alpha \neq 1/2$ we find that the invariants are absorbed only on one end of the spectrum. For $\alpha > 1/2$ ($V_i = F_i$), we have a thermal equilibrium to the left of the pumping and the direct cascade (18) with a constant $\xi$ to the right. In the opposite case ($\alpha < 1/2$, $V_i = \mathrm{const}$), we find an inverse cascade (18) with constant $\xi$ to the left and equilibrium equipartition to the right of the pumping. In both cases, the damping on the flux side is carefully selected to avoid build-up in the spectrum (the damping on the equilibrium side can then be set to zero to establish cleaner scaling). We have chosen $V_i = F_i$ and $V_i = \mathrm{const}$ because they qualitatively correspond to the Kolmogorov scaling of the direct energy cascade in incompressible turbulence and to the inverse wave action cascade in deep water turbulence, respectively.
Thermal equilibrium at the scales exceeding the pumping scale together with a direct cascade at smaller scales has been predicted and observed [13]. To the best of our knowledge, nobody has seen before an inverse-only cascade together with a thermal equilibrium on the other side of the pumping, either in hydrodynamic-type systems or in wave turbulence or shell models. Inverse cascades play a prominent role in geophysics and astrophysics, from the creation of planetary jets to Jupiter's Great Red Spot and stormy seas. In all known cases inverse cascades appear in systems with at least two conserved quantities that scale differently. All our conserved quantities (8) scale the same in the limit $i \gg 1$. Probably closest to our findings are the results of Tom and Ray [14], who observed an inverse cascade in the limiting case of a shell model with two invariants having the same scaling. Their inverse cascade had normal scaling and ran from fast to slow modes; the direct cascade was not resolved, but was likely present.
Our observation poses the question: can one find another class of systems with a single conservation law and a turbulent spectrum less steep than equilibrium? In weak wave turbulence, this requires the sum of the space dimensionality and the scaling exponent of the three-wave interaction to be less than the frequency scaling exponent [3]. We do not know such a physical system, nor are we aware of any fundamental law that forbids its existence. Note that the connection between the cascade direction, its stability and its steepness relative to equipartition has been firmly established in the weak turbulence theory [3,10]. In all known examples, the formal turbulent solution with a wrong flux sign is not realized; the system chooses instead to stay close to equipartition with a slight deviation that provides for the flux in the right direction [3,15]. Similarly, when we place pumping and damping at the "wrong" ends of a finite chain, our system heats up, staying close to thermal equilibrium. It is important that our system is a one-dimensional chain, as are shell models, so that there is no space and consequently no distinction in the phase volume (number of modes) between the infrared and ultraviolet parts of the spectrum. The directions along the chain are only distinguished temporally, i.e. in terms of growth or decay of the typical interaction time. The same combination $V_i^2/F_i \propto \phi^{(2\alpha-1)i}$ determines the $i$-dependence of the inverse interaction time both for the equilibrium, [...]. As the above consideration shows, the cascade proceeds from slow modes to fast modes in Fibonacci turbulence. Similarly, in shell models [11,16,17] (albeit with parameters and conservation laws distinct from our model), a cascade proceeding from fast modes to slow modes was never observed. It was argued that this is because the fast modes act like thermal noise on the slow ones, which must lead to equilibrium [16].
That this cannot be generally true follows from the existence of the inverse energy cascade in 2D incompressible turbulence and from numerous examples in weak wave turbulence where non-linear interaction time either grows or decays along the cascade. Moreover, the formation of the cascade spectrum proceeds from fast to slow modes (and not necessarily from pumping to damping), according to the information-theory argument [18].
Why is the flux direction unambiguously related to the cascade acceleration in shell models in general and in our model in particular, in distinction from other cases? The argument can be made by considering capacity, a measure that tells at which end the conserved quantity is stored; perturbations are known to run towards that end [3]. For example, the power-law energy density spectrum $\epsilon_k \propto k^{-s}$ in $d$ dimensions has the total energy $\int \epsilon_k\, d^dk$; at which end it diverges is determined by the sign of $d - s$. This is generally unrelated to the direction of the energy cascade, determined by the sign of $s$, which tells whether the spectrum is more or less steep than the equipartition. However, in shell models the exponential character of the $i$-dependencies makes the total energy $\sum_i F_i|a_i|^2$ determined by either the last or the first term of the sum, which solely depends on whether $F_i|a_i|^2$ is steeper than equipartition or not, that is, on the sign of the flux.
Which direction then does the cascade go in the symmetric case, $V_i = \sqrt{F_i}$? Now the naive cascade solution (18) coincides with thermal equipartition, $F_in_i = \mathrm{const}$, and the interaction time is independent of the mode number for such $n_i$. If we start from thermal equilibrium and apply pumping to some intermediate mode, the system develops cascades in both directions. The left panel of Figure 1 shows that the pumping at site $p$ inside the interval $(1, N)$ generates left and right fluxes in the proportion $\Pi_L/\Pi_R \approx (N-p)/p$. This seems natural, as in the shorter interval the steeper spectrum falls away from the pumping, which must correspond to a larger flux. This means that if we want to keep the flux constant while increasing $p$ or $N - p$, we need to keep the ratio $(N-p)/p$ constant.
We end this section with a general remark. The Fibonacci Hamiltonian is not symmetric with respect to reversing the order of modes: it sets the preferred direction, which is physically meaningful since the frequencies of two lower modes sum into the frequency of a higher one. Yet, as we see in the case $V_iF_i^{-1/2} = \mathrm{const}$, direct and inverse cascades are pretty symmetric. So, it is natural to conclude that indeed the $i$-dependence of $V_iF_i^{-1/2}$ determines which way the cascade goes.

IV. ALONG THE CASCADES AND AWAY FROM EQUILIBRIUM
As we have seen, thermal equilibrium statistics is exactly Gaussian with no correlation between modes, despite strong interaction (which actually establishes equipartition). The reason for the absence of correlation is apparently the detailed balance that cancels them. We do not expect such cancelations in non-equilibrium states. In all cases of strong turbulence known before, the degree of non-Gaussianity increases along a direct cascade and stays constant along an inverse cascade [19,20]. As we shall show now, non-Gaussianity always increases along the cascades in our one-dimensional chains.
We present first the symmetric case, where the system is close to the equilibrium equipartition with the temperature set by pumping and slowly changing with the mode number: $n_iF_i \approx (PF_p)^{2/3} f(i)$. The slow function $f(i)$ can be suggested by the analogy with the 2D enstrophy cascade [21,22] as $f(i) \propto \ln^{2/3}F_i \propto i^{2/3}$, counting from the damping region. This gives the dimensionless cumulant $\xi \propto 1/i$. This hypothesis is supported by the right panel of Figure 1, which shows that $\xi$ grows along both cascades by a power law in $i$ rather than exponentially. Let us stress that the count always starts from the dissipation region, where we have the balance condition $\Pi =$ [...] according to the dynamical estimate. This sets the nonlinearity parameter of order unity at the damping region and decaying towards the pumping; the longer the interval, the smaller is $\xi$ at any fixed distance from the pumping region. The limit of long intervals may then be amenable to an analytical treatment. Indeed, Figure 3 demonstrates that as the interval increases, the higher cumulants remain small over longer and longer intervals starting from the pumping. Despite the model having ultra-local interactions (every mode participates in only three adjacent interacting triplets), the cascade formation is very nonlocal. It is somewhat similar to thermal conduction: if we keep the flux but increase the distance, the distribution gets closer to the thermal equilibrium at every point.
Turning to the asymmetric (one-cascade) cases, we see the cumulants higher than third growing with $F_i$ by a power law instead of logarithmically. Rather than look for scaling in the mode number $i$, we find it more natural to use $F_i$ (playing the role of frequency); at large $i$ one has $F_i \propto \phi^i$, where $\phi$ is the golden mean. The traditional study of turbulence in general and shell models in particular was focused on the single-mode moments (the analog of structure functions), $\langle|a_i|^q\rangle \propto F_i^{-\zeta_q}$, whose anomalous scaling exponents, $\Delta(q) = q\zeta_3/3 - \zeta_q$, give particular measures of how non-Gaussianity grows along the cascade. For $V_i = F_i^\alpha$, the flux law gives $J_i \propto \Pi/V_iF_i$, that is $\zeta_3 = \alpha + 1$. The anomalous scaling is observable in numerics for the single-cascade cases $\alpha = 0$ and $\alpha = 1$, as shown in the right panel of Figure 6. This seems to be the first case of an anomalous scaling in an inverse cascade, with the anomalous dimensions having the opposite signs to those in direct cascades. The exponents start fairly small but grow fast with $q$. The anomalous exponents, $\Delta(q)$, can be related to the statistical Lagrangian conservation laws [23,24] in fluid turbulence; no comparable physical picture was developed for shell models. Without physical guidance, the set of the anomalous exponents is not very informative, all the more so since they characterize only the one-mode distribution.
Here we suggest a complementary set of three information-theoretic measures, which shed a new light on the turbulent statistics emerging along the cascade. The main distinction of any non-equilibrium state is that it has lower entropy than the thermal equilibrium at the same energy. Turbulence has the entropy that is much lower, which means that a lot of information is processed to excite the turbulence state. We pose the question: where is the information that distinguishes turbulence from equilibrium encoded?

V. WHERE IS THE INFORMATION ENCODED?
First, the information is encoded in the single-mode statistics, which is getting more non-Gaussian deeper in the cascade. This must be reflected in the decay of the one-mode entropy, $S_i = S(x_i) = S(|a_i|/\sqrt{n_i})$, with the growth of $|i - p|$. This can be computed using the multifractal formalism: the moments $\langle x_i^q\rangle \propto F_i^{-\zeta_q + q\zeta_2/2}$ in the limit of large $|i - p|$ correspond to the multifractal distribution, where $f(h) = \min_q(\zeta_q - q\zeta_2/2 - qh)$, that is, $f(h)$ is the Legendre transform of $\zeta(q)$. The entropy is then [...]. This decay is logarithmic in frequency $F_i$, that is, linear in $i$, as indeed can be seen in Figure 6, where $i$ is counted from the pumping. Noticing that $\Delta_1 \approx \Delta_2$ and assuming quadratic dependence for $q \leq 3$, we estimate $\Delta'(0) \approx 3\Delta_1/2$ and observe that the dashed lines in the right panel of Figure 6 with the slopes $\Delta_1\ln\phi$ represent, by the order of magnitude, the entropy decay in the inertial interval in both direct and inverse cascades.
Second, the information is encoded in the correlations of different modes. It is natural to assume that correlations are strongest for modes in interacting triplets, $a_i$, $a_{i+1}$, $a_{i+2}$. Disentangling the encoded information can be done by using structured groupings [25][26][27]: $I_n(a_1, \ldots, a_n) = \sum_i S(a_i) - \sum_{i<j} S(a_i, a_j) + \sum_{j<k<l} S(a_j, a_k, a_l) + \ldots + (-1)^{n+1}S(a_1, \ldots, a_n)$.
For $n = 1$, this gives the one-mode entropy $S_i$, which measures the total amount of information one can obtain by measuring or computing one-mode statistics. While the entropy itself depends on the units or parametrization, all the quantities (22) for $n > 1$ are independent of units and invariant with respect to simultaneous re-parametrization of every single variable. For $n = 2$, we have the widely used mutual information, which measures the amount of information one can learn about one mode by measuring another, that is, characterizes the correlation between two modes. It is interesting that all pairs in the triplet have comparable mutual information in the direct cascade ($V_i = F_i$), while $I_{i,i+1}$ noticeably exceeds $I_{i,i+2}$ in the inverse cascade ($V_i = 1$), see the upper right panel in Figure 8. One can also define the total (multi-mode) mutual information as the relative entropy between the true joint distribution and the product distribution: $I(a_1, \ldots, a_k) = \sum_{i=1}^k S(a_i) - S(a_1, \ldots, a_k)$. It is positive and monotonically decreases upon averaging over any of its arguments. As we see from Figure 8, the changes along the cascade in the one-mode entropy and in the two-mode and three-mode mutual information are comparable, that is, one obtains comparable amounts of information about turbulence from these quantities.
To see how much more information one gets by measuring or computing the three modes simultaneously compared to separately by pairs, one needs to use the measure of the irreducible information encoded in triplets, as given by the third member of the hierarchy (22): $II(a_1, a_2, a_3) = S(a_1) + S(a_2) + S(a_3) - S(a_1, a_2) - S(a_1, a_3) - S(a_2, a_3) + S(a_1, a_2, a_3)$. It is called interaction information in classical statistics and topological entanglement entropy in quantum statistics [25,28]. Interaction information measures the influence of the third variable on the amount of information shared between the other two and could be of either sign. Positive $II(X, Y, Z)$ measures the redundancy in the information about $Y$ obtained by measuring $X$ and $Z$ separately, while a negative one measures synergy, which is the extra information about $Y$ received by knowing $X$ and $Z$ together. While we cannot prove it mathematically, it seems physically plausible that systems with three-mode interaction must demonstrate synergy. Indeed, one finds a strong synergy in weak turbulence: it was shown that $I_{123} \gg I_{12} + I_{23} + I_{13}$ [18], so that $II < 0$ and much more information is encoded in three modes than in the pairs separately. Here we find that the same is true for the cascades close to thermal equilibrium at $V_i = \sqrt{F_i}$, as seen in Figure 7. Indeed, the two-mode mutual information is much smaller than both the one-mode entropy and the absolute value of the interaction information, which is negative.
Figure 7: Deviation of entropies from equilibrium, mutual information, and interaction information for $\alpha = 1/2$ and center pumping for a set of $5 \cdot 10^7$ data points. The same values of entropy were obtained for a set of $2 \cdot 10^7$ data points, that is, $S_i$ is saturated. Both $I$ and $II$ show a slight decrease in absolute values with the increase of the ensemble size from $2 \cdot 10^7$ to $5 \cdot 10^7$.
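The sign convention for the interaction information can be illustrated on two textbook discrete examples that are not taken from the turbulence data: three bits related by XOR (pure synergy) and three identical copies of one bit (pure redundancy). The Python sketch below uses hypothetical function names and computes $II$ as the alternating sum of one-, two- and three-variable entropies in bits.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(samples, idx):
    """Shannon entropy (bits) of the marginal over coordinates idx.

    samples is a list of equally weighted tuples.
    """
    counts = Counter(tuple(s[i] for i in idx) for s in samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())


def interaction_information(samples):
    """II = sum of S_i - sum of S_ij + S_123 for three variables."""
    def S(*idx):
        return entropy(samples, idx)
    return (S(0) + S(1) + S(2)) - (S(0, 1) + S(0, 2) + S(1, 2)) + S(0, 1, 2)


def total_mutual_information(samples):
    """I_123 = sum of S_i - S_123."""
    return sum(entropy(samples, (i,)) for i in range(3)) - entropy(samples, (0, 1, 2))


# Z = X xor Y with independent fair bits X, Y: no pair is correlated,
# yet any two variables determine the third (synergy).
xor_samples = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]
# X = Y = Z: every pair already carries all shared information (redundancy).
copy_samples = [(x, x, x) for x in (0, 1)]

ii_xor = interaction_information(xor_samples)
ii_copy = interaction_information(copy_samples)
```

With this convention the XOR triple gives a negative value (synergy) and the copied bit gives a positive value (redundancy), matching the signs described in the text.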
Let us stress that both the mutual information and the interaction information are symmetric, that is they measure the degree of correlation rather than causal relationship or cascade direction.
We compute the entropies and mutual information as follows. First, we obtain the probability distribution in the 4D space $(x_{i-2}^2, x_{i-1}^2, x_i^2, \theta_i)$ and integrate it to get the corresponding 1D and 2D distributions. Here, [...] average. Mutual information and interaction information are computed directly from the entropies, $S = -\sum P \log_2 P$, obtained for these distributions, since all normalization factors cancel out in subtraction. The entropy for an individual mode, however, is presented relative to the Gaussian entropy based on the average occupation number obtained for the binned, staircase distribution for $x_i^2$. We use the bin sizes $\Delta x_i^2 = 1$ for $\alpha = 0$ and $\alpha = 1$, and $\Delta x_i^2 = 1/2$ for $\alpha = 1/2$. In all cases $\Delta\theta = 2\pi/32$.
Far from equilibrium, we find synergy for the modes close to the pumping and redundancy for those close to the damping; see the last panel of Figure 8. That means that the interaction information passes through zero in the inertial interval. There even seems to be a tendency to stick to zero in the inertial interval, but this requires further studies with a number of modes exceeding our present abilities. (Our computations are done with a record number of modes, up to 80, while previous studies were mostly done for 20-30. The interaction times decrease exponentially with the mode number, which imposes heavy requirements on the computational time step. On top of that, one needs very long runs to collect enough statistics to reliably represent the three-mode probability distribution in four-dimensional space.) With the present set of data we can suggest that most of the information about the three-mode correlation is in the sum of the pair correlations in the triplet. This is more pronounced in the direct cascade than in the inverse cascade.
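The binning procedure described above (histogram the samples, integrate out coordinates, and subtract entropies) can be sketched in a few lines. This is only an illustration under stated assumptions: the samples are synthetic Gaussian triples rather than the $(x^2, \theta)$ data of the paper, the bin widths are arbitrary, and the function names are hypothetical. The entropy of a binned variable differs from the differential entropy by the logarithm of the bin volume, and this offset cancels in $S_i + S_j - S_{ij}$.

```python
import math
import random
from collections import Counter

def binned_entropy_bits(samples, widths, keep):
    """Entropy (bits) of the histogram of the coordinates in `keep`."""
    counts = Counter(
        tuple(math.floor(point[i] / widths[i]) for i in keep) for point in samples
    )
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def binned_mutual_information(samples, widths, i, j):
    """I = S_i + S_j - S_ij; the bin-size offsets of the three entropies cancel."""
    return (binned_entropy_bits(samples, widths, (i,))
            + binned_entropy_bits(samples, widths, (j,))
            - binned_entropy_bits(samples, widths, (i, j)))

# Synthetic data (assumption): y is correlated with x, z is independent of both.
rng = random.Random(0)
samples = []
for _ in range(20000):
    x = rng.gauss(0.0, 1.0)
    y = x + 0.5 * rng.gauss(0.0, 1.0)
    z = rng.gauss(0.0, 1.0)
    samples.append((x, y, z))

widths = (0.5, 0.5, 0.5)
mi_xy = binned_mutual_information(samples, widths, 0, 1)  # clearly positive
mi_xz = binned_mutual_information(samples, widths, 0, 2)  # small finite-sample bias only
print(mi_xy, mi_xz)
```

The small positive value obtained for the independent pair is a finite-sample bias of the histogram estimate. It shrinks as the number of samples grows, which is one reason to compare results for data sets of different sizes.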
Since the requirements on statistics grow exponentially with the dimensionality, the suggestion that one can get most of the information (or at least a large part of it) from lower-dimensional probability distributions is great news for turbulence measurements and modeling. To put it simply, comparable amounts of information can be obtained from one-mode and from three-mode measurements in direct and inverse cascades; most of that information can be inferred from two-mode measurements. It remains to be seen to what degree this property of small (asymptotically zero?) interaction information is a universal feature of strong turbulence.
Insets in Figures 4 and 5 show the probability distribution of the relative phase $\theta_i$, which is closely related to the flux (skewness), proportional to $|a_i a_{i-1} a_{i-2}| \sin\theta_i$. The probability maximum is then at $\pm\pi/2$ for direct and inverse cascades, respectively. Also, the $i$-dependence of the phase distributions is in accordance with the changes in skewness along $i$. In the two-cascade symmetric case, the distribution is flat (the phases are random) near the pumping, and the phase correlations appear along the cascades, as can be seen by comparing the last panel of Figure 1 with the inset in the right panel of Figure 4. In the one-cascade cases, both the skewness and the form of the spectrum are practically independent of the mode number, as seen from Figures 2 and 5.
The fact that the deviations from Gaussianity grow along our inverse cascade, in distinction from all the inverse cascades known before, calls for reflection. We used to think about the anomalous scaling and intermittency in spatial terms: direct cascades proceed inside the force correlation radius, which imposes non-locality, while in inverse cascades one effectively averages over many small-scale fluctuations, which brings scale invariance [19,20]. The emphasis on the spatial features was reinforced by the success of Kraichnan's model of passive tracer turbulence, where it has been shown that the spatial (rather than temporal) structure of the velocity field is responsible for the anomalous scaling and intermittency of the tracer. There is no space in our case, so apparently it is all about time. Indeed, as we have seen, all our cascades propagate from slow to fast modes, which leads to the build-up of non-Gaussianity and correlations. As a result, the entropy of every mode decreases and the inter-mode information grows along the cascade. This diminishes the overall entropy compared to the entropy of the same number of modes in thermal equilibrium with the same total energy.
Figure 8: Deviation of the entropy from equilibrium, the mutual information, and the interaction information (all in bits) for $\alpha = 0$ and $\alpha = 1$ and center pumping. Number of data points: $2 \cdot 10^8$ for $\alpha = 0$, 80 modes; $6 \cdot 10^7$ for $\alpha = 0$, 60 modes; and $10^8$ for $\alpha = 1$. For the bin size selected, all quantities agree with those obtained in a half-reduced data set.
Despite the qualitative similarity, there are quantitative differences between our direct and inverse cascades. Figures 5 and 6 show that the one-mode statistics and its moments deviate from Gaussian faster as one proceeds along the inverse cascade than along the direct one. And yet one can see from Figures 6 and 7 that the one-mode entropy is essentially the same in both cascades, as are the mutual information between two neighboring modes and the three-mode mutual information. The mutual information between non-neighboring modes, $I_{13}$, is about half as large, as seen in Figure 8. This difference can probably be related to the dynamics, which in our system is the coalescence of two neighboring modes into the next one and the inverse process of decay of one into two. In the dynamical equation (16), only one (first) term is responsible for the direct process (and the direct cascade), while two terms are responsible for the inverse process (and the inverse cascade).
An important distinction between double-cascade and single-cascade turbulence in our system is the dependence on the system size. The degree of non-Gaussianity of the complex amplitudes is fixed in the dissipation regions of the double cascade, so that in the thermodynamic limit the statistics is Gaussian in the inertial intervals. On the contrary, the statistics of the amplitudes is fixed at the forcing scale for a single cascade, and it deviates more and more from Gaussianity as one goes along the cascade.
We end this section with a short remark on the production balance of the total entropy $S = -\langle \ln \rho(a_1, \ldots, a_N) \rangle$. Here $\rho(a_1, \ldots, a_N)$ is the full $N$-mode PDF. Since wave interaction does not change the total entropy, the entropy absorption by the dissipation must be equal to the entropy production by the pumping [18,29]; this balance is expressed by (23). For the single-cascade cases ($V_i = 1$ and $V_i = F_i$), the energy balance $P F_p = 2\gamma F_d n_d$ means that the left-hand side of (23) must be much larger than the Gaussian estimate $P/n_p$ [18]. This may seem to contradict our numerical finding that the pumping-connected mode $a_p$ has one-mode statistics close to Gaussian. Of course, there are a nonzero triple correlation and mutual information with the two neighboring modes in the direction of the cascade. Yet since $\xi \simeq 1$, the triple moment is $J_p \simeq n_p^{3/2}$ in both direct and inverse cascades, so that its contribution to the left-hand side of (23) is comparable with $P/n_p$. We conclude then that even the pumping-connected mode must have strong correlations with many other modes. Since the triple correlation functions of nonadjacent modes are zero, such correlations must be encoded in higher cumulants. That deserves further study.

VI. KOLMOGOROV MULTIPLIERS AND SELF-SIMILARITY
The unbounded decrease of entropy along a single cascade prompts one to ask whether the total entropy of turbulence is extensive (that is, proportional to the number of modes) or grows slower than linearly with the number of modes, so that there could be some "area law of turbulence" (as for the entropy of black holes). This question can be answered with the help of the so-called Kolmogorov multipliers, $\sigma_i = \ln|a_i/a_{i-1}|$ [30]. Figure 9 shows that in our cascades the multipliers have universal statistics independent of $i$, similar to shell models [31][32][33][34]. One consequence of the scale invariance of the statistics of the multipliers is that the entropy of the system is extensive, that is, proportional to the number of modes. Of course, the entropy depends on the representation. From the information theory viewpoint, the Kolmogorov multipliers realize a representation by (almost) independent components, that is, they allow for maximal entropy. In other words, computing or measuring turbulence in terms of multipliers gives maximal information per measurement (the absolute maximum is achieved by using the flat distribution, that is, the variable $u(\sigma)$ defined by $du = P(\sigma)\,d\sigma$).
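For concreteness, the multipliers and the flat variable $u(\sigma)$ can be obtained from data as sketched below. The four complex amplitudes are a made-up snapshot, the function names are hypothetical, and the rank transform is one simple empirical stand-in for $du = P(\sigma)\,d\sigma$ when $P(\sigma)$ is known only through samples.

```python
import math

def kolmogorov_multipliers(amplitudes):
    """sigma_i = ln|a_i / a_{i-1}| for consecutive complex amplitudes."""
    return [math.log(abs(a) / abs(b)) for b, a in zip(amplitudes, amplitudes[1:])]

def flatten(values):
    """Empirical version of du = P(sigma) d sigma: map each value to its mid-rank in (0, 1)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    u = [0.0] * len(values)
    for rank, index in enumerate(order):
        u[index] = (rank + 0.5) / len(values)
    return u

# Hypothetical snapshot of four neighboring modes (an assumption for illustration).
amplitudes = [1 + 0j, 2j, -4 + 0j, 8j]
sigmas = kolmogorov_multipliers(amplitudes)

# By construction the multipliers telescope: their sum is ln|a_last| - ln|a_first|.
telescoped = math.log(abs(amplitudes[-1])) - math.log(abs(amplitudes[0]))

u_values = flatten([0.3, -1.0, 2.0])
print(sigmas, sum(sigmas), telescoped, u_values)
```

The multipliers depend only on the moduli, so the phases of the amplitudes drop out, and the flattened variable is uniformly spread over the unit interval whatever the shape of $P(\sigma)$.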
The logarithms of the amplitudes are expressed via the multipliers as a sum whose first term is due to the pumping-connected mode, which correlates weakly with $\sigma_i$ in the inertial interval. As shown below, the correlation between multipliers decays fast with the distance between them. That suggests that the statistics of the amplitude logarithm at large $k$ must have asymptotically a large-deviation form: $\ln P(X_k) \simeq -k H(X_k/k)$ (24). Indeed, the three upper curves in the top row of Figure 5 collapse in these variables, as shown in the bottom row of Figure 9. The self-similar distribution of the logarithm of the amplitude, (24), is a dramatic simplification in comparison with the general multi-fractal form (20). Technically, it means that $g(x_k/F_k^h) = g(e^{X_k - kh\ln\phi})$ is such a sharp function that the integral in (20) is determined by the single $X_k$-dependent value, $h(X_k) = X_k/(k\ln\phi)$. We then identify $f = -H/\ln\phi$.
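For orientation, the relations used in this paragraph can be written out explicitly. The sketch below assumes $X_k = \ln x_k$, a pumping-connected mode $p$ with $k > p$, the Binet asymptotics $F_k \approx \phi^k/\sqrt{5}$ for the Fibonacci numbers (with the constant factor dropped), and a function $g$ that is sharply peaked where its argument is of order unity. The first line is simply the telescoping identity implied by the definition of $\sigma_i$ and may differ in normalization from the expression used by the authors.

```latex
\begin{align}
\ln|a_k| &= \ln|a_p| + \sum_{i=p+1}^{k} \sigma_i ,
  \qquad \sigma_i = \ln\left|\frac{a_i}{a_{i-1}}\right| , \\
F_k^{h} &\propto \phi^{kh} = e^{kh\ln\phi}
  \quad\Longrightarrow\quad
  \frac{x_k}{F_k^{h}} \propto e^{X_k - kh\ln\phi} , \\
X_k - kh\ln\phi &= 0
  \quad\Longrightarrow\quad
  h(X_k) = \frac{X_k}{k\ln\phi} .
\end{align}
```

If the terms $\sigma_i$ in the first line are only weakly dependent, their sum over many steps is the natural object for a large-deviation description, which is why the logarithm of the amplitude, rather than the amplitude itself, has the self-similar distribution.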
The self-similarity of the amplitude distribution (plus the independence of the phase distribution of the mode number) is great news, since it allows one to predict the statistics of long cascades (at higher Reynolds number) from the study of shorter ones. In our case, Figure 9 shows that the 28th mode already has a form close to the asymptotic one. Self-similarity and a finite correlation radius of the Kolmogorov multipliers have also been established experimentally for Navier-Stokes turbulence [35]. To avoid misunderstanding, let us stress that the self-similarity is found for the probability distribution of the logarithm of the amplitude, which does not contradict the anomalous scaling of the amplitude moments with the exponents $\zeta_q$ determined by the Legendre transform of $f$ or $H$.
If the multipliers were statistically independent, one would compute $\ln P(X) = -kH(X/k)$ or $\zeta_q$ proceeding from $P(\sigma)$ by the standard large-deviation formalism: $H(y) = \max_z [zy - G(z)]$, where $G(z) = \ln \int d\sigma\, e^{z\sigma} P(\sigma)$. Such a derivation would express $\langle |a_k|^q \rangle$ via $\langle e^{q\sigma} \rangle^k$, which is impossible since the former moments exist for all $q$, while the latter do not because of the exponential tails of $P(\sigma)$; see also [35,36].
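The large-deviation recipe for independent steps, and the reason it fails here, can be illustrated numerically. The sketch below uses the standard Cramér form, in which the extremum over $z$ is a maximum, and evaluates it on a finite grid. As an illustration only, it also takes the multiplier built from two independent complex Gaussian amplitudes, for which $\langle e^{q\sigma} \rangle = \Gamma(1+q/2)\,\Gamma(1-q/2)$ is finite only for $|q| < 2$; this equilibrium form is an assumption of the example, not the turbulent $P(\sigma)$, and all function names are hypothetical.

```python
import math

def rate_function(G, y, z_grid):
    """H(y) = max_z [z*y - G(z)], with the maximum taken over a finite grid of z."""
    return max(z * y - G(z) for z in z_grid)

def G_gauss(z):
    """Cumulant generating function of a zero-mean, unit-variance Gaussian."""
    return 0.5 * z * z

def moment_equilibrium(q):
    """<exp(q*sigma)> for sigma = ln|a_i/a_{i-1}| with independent complex Gaussian a's.

    Equals Gamma(1 + q/2) * Gamma(1 - q/2); it is finite only for |q| < 2
    because P(sigma) has exponential tails.
    """
    if abs(q) >= 2.0:
        return math.inf
    return math.gamma(1.0 + 0.5 * q) * math.gamma(1.0 - 0.5 * q)

def G_equilibrium(z):
    """Cumulant generating function of the equilibrium multiplier (infinite for |z| >= 2)."""
    return math.log(moment_equilibrium(z))

z_grid = [0.001 * i for i in range(-10000, 10001)]

h_gauss = rate_function(G_gauss, 1.5, z_grid)            # Gaussian check: y**2 / 2
h_eq_zero = rate_function(G_equilibrium, 0.0, z_grid)    # rate function vanishes at the mean
m_one = moment_equilibrium(1.0)                          # equals pi / 2
m_two = moment_equilibrium(2.0)                          # diverges
print(h_gauss, h_eq_zero, m_one, m_two)
```

The divergence of the exponential moment at $|q| = 2$ is the numerical counterpart of the statement that moments of $e^{q\sigma}$ do not exist for all $q$, whereas the amplitude moments do.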
Therefore, to describe properly the scaling of the amplitudes, one needs to study correlations between multipliers. Physically, it is quite natural that the law of the distribution change along the cascade must be encoded in correlations between the steps of the cascade. Indeed, we find that the neighboring multipliers are dependent, albeit weakly, as expressed in their mutual information (the traditionally used pair correlation function [32,33,35] is not a proper measure of correlation for non-Gaussian statistics). We find that for the inverse cascade, [...] $-0.08$. No discernible $I(\sigma_i, \sigma_{i+k})$ was found for $k > 1$. While $\sigma_i$ and $\sigma_{i+2}$ are practically uncorrelated, there is some small synergy in a triplet.
To appreciate these numbers, let us present for comparison the statistics of the Kolmogorov multipliers in thermal equilibrium. Normalized for zero mean and unit variance, we have [...]. That gives $I(\sigma_i, \sigma_{i+1}) = \ln 2 - 1/2 \approx 0.19$. Figure 9 shows that the equilibrium Gaussian statistics of independent amplitudes perfectly represents the statistics of a single multiplier. The joint PDFs $P(\sigma_i, \sigma_{i+1})$ are shown in Figure 10 for thermal equilibrium and for two cascades. Again, the Gaussian statistics represents turbulence remarkably well. The differences between the three cases are most pronounced around the peak at the origin, while the distant contours are hardly distinguishable. In plain words, the probabilities of strong fluctuations of the multipliers are the same in thermal equilibrium as in turbulent cascades. This is remarkably different from the statistics of the complex amplitudes, which demonstrate the most difference between the three cases for strong fluctuations and for high moments. There seems to be a certain duality between fluctuations of the amplitudes and multipliers: strong fluctuations of the multipliers correspond to weakly correlated amplitudes, while strong fluctuations of the amplitudes may require their strong correlations and thus correspond to multipliers close to their mean values. Whether this duality can be exploited for an analytic treatment remains to be seen. The information about the anomalous scaling exponents of the amplitudes in turbulence must be encoded in the correlations between multipliers. Note that the mutual information $I(\sigma_i, \sigma_{i+1})$ for both cascades ($I = 0.23$ and $I = 0.30$) is not that much higher than in thermal equilibrium ($I = 0.19$ bits). Physicists tend to be very excited about any broken symmetry; it is refreshing to notice that relatively little information is needed to encode the broken scale invariance in turbulence.
How to decode this information from the joint statistics of multipliers remains a task for the future.
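The equilibrium value $\ln 2 - 1/2$ quoted above can be checked by a short Monte Carlo computation. The sketch assumes independent, exponentially distributed intensities $|a_i|^2$ (that is, independent complex Gaussian amplitudes) and works with the intensity ratios $r = |a_i|^2/|a_{i-1}|^2$ and $s = |a_{i+1}|^2/|a_i|^2$, which carry the same mutual information as $\sigma_i$ and $\sigma_{i+1}$ because mutual information is invariant under reparametrization of each variable. Equal mean intensities are taken without loss of generality for the same reason. The densities in the docstring follow from these assumptions by a change of variables; the sample size and seed are arbitrary.

```python
import math
import random

def log_density_ratio(r, s):
    """ln[ p(r, s) / (p(r) p(s)) ] for neighboring intensity ratios in equilibrium.

    Assumes p(r, s) = 2 r / (1 + r + r s)**3 and p(r) = 1 / (1 + r)**2, which
    follow from independent exponentially distributed |a_i|^2.
    """
    return (math.log(2.0 * r) - 3.0 * math.log(1.0 + r + r * s)
            + 2.0 * math.log(1.0 + r) + 2.0 * math.log(1.0 + s))

rng = random.Random(1)
n_samples = 200000
total = 0.0
for _ in range(n_samples):
    # Intensities of three neighboring modes in equilibrium.
    e0, e1, e2 = (rng.expovariate(1.0) for _ in range(3))
    total += log_density_ratio(e1 / e0, e2 / e1)

mi_estimate = total / n_samples      # sample mean of the log ratio, in nats
mi_exact = math.log(2.0) - 0.5       # closed form quoted in the text
print(mi_estimate, mi_exact)
```

The sample mean is an estimate in natural-logarithm units, as is the closed form $\ln 2 - 1/2 \approx 0.193$.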

VII. DISCUSSION
The most surprising finding of our work is the existence of an inverse-only cascade and its anomalous scaling. In all cases known before, an inverse cascade appears only as an outlet for an extra invariant that cannot be transferred along the direct cascade with other invariant(s). In a truly weak turbulence, when the whole statistics is close to Gaussian, an inverse-only cascade is indeed impossible, since it would require an environment that provides rather than extracts entropy, which contradicts the second law of thermodynamics [18,29]. Here we have shown that an inverse-only cascade is possible in a strong turbulence. As far as an anomalous scaling is concerned, we relate it to the change of the interaction time along the cascade. All the inverse cascades known before run from fast to slow modes and have a normal scaling. In our case, as in all shell models, cascades always proceed from slow to fast modes. Apparently, this is the reason that non-Gaussianity increases along all our cascades, and an anomalous scaling takes place in both single inverse and single direct cascades. Indeed, proceeding from fast to slow modes (in inverse cascades known before) involves an effective averaging over fast degrees of freedom, which diminishes intermittency. On the contrary, our cascades build up intermittency as they proceed.
Another unexpected conclusion follows from the entropy production balance in a steady turbulent state: even though the marginal statistics of the pumping-connected mode (averaged over all other modes) can be close to Gaussian, the correlations of that mode with other modes cannot be weak.
Most of the present work was devoted to disentangling the information encoded in strong turbulence. It was predicted that in weak turbulence most of the information is encoded in the three-mode statistics [18], and Figure 7 confirms this prediction. Yet in strong turbulence, we find that as much information is encoded in one-mode as in two-mode statistics, while three-mode statistics does not add much. This could be of practical importance for turbulence studies, since it is much more difficult to collect, store, and analyze statistics for three-mode and multi-mode distributions. Another important lesson is that measuring or computing mode amplitudes (or velocity structure functions) brings diminishing returns, that is, less and less information, as one goes deep into the cascade. The maximal information is encoded in the statistics of the Kolmogorov multipliers. Most of that information is encoded in the statistics of a single multiplier; less than 10% is encoded in the correlation of neighbors. How to decode it is a task for the future.
We wish to thank Yotam Shapira for helpful discussions. The work was supported by the Scientific Excellence Center and Ariane de Rothschild Women Doctoral Program at WIS, grant 662962 of the Simons foundation, grant 075-15-2019-1893 by the Russian Ministry of Science, grant 873028 of the EU Horizon 2020 programme, and grants of ISF, BSF and Minerva. NV was in part supported by NSF grant number DMS-1814619. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by NSF grant number ACI-1548562, allocation DMS-140028.