Information-Thermodynamic Bound on Information Flow in Turbulent Cascade

We investigate the nature of information flow in turbulence from an information-thermodynamic viewpoint. For the fully developed three-dimensional fluid turbulence described by the fluctuating Navier-Stokes equation, we prove that information of large-scale eddies is transferred to small scales along with the energy cascade. We numerically illustrate our findings using a shell model and further show that in the inertial range, the intensity of the information flow is nearly constant and can be scaled by the large-eddy turnover time. Our numerical results also suggest that the corresponding information-thermodynamic efficiency is quite low compared to other typical information processing systems such as Maxwell's demon. These findings provide a new perspective on how universality and intermittency of turbulent fluctuations emerge at small scales.


I. INTRODUCTION
Turbulence is characterized by the interference of fluctuations between disparate space-time scales. Despite its seemingly complicated and unpredictable nature, universal laws are hidden behind the disordered fluid motion. For example, in fully developed three-dimensional turbulence, the energy spectrum exhibits the Kolmogorov spectrum E(k) ∝ k^{−5/3} at scales much smaller than the energy injection scale [1-5]. The Kolmogorov spectrum is universal in the sense that it is independent of the details of the large scales, such as the boundary conditions or the mechanism of the external stirring. Furthermore, in addition to the energy spectrum, which corresponds to the second-order moments of the velocity field, the higher-order moments also exhibit universal scaling laws [4,6]. Such remarkable universality of turbulent fluctuations at small scales is believed to be induced by the energy cascade process, in which energy is transferred conservatively from large to small scales. More specifically, there is a common intuitive picture that the universal statistical properties emerge at small scales because "information" about the details of the large scales is lost in the chaotic stepwise cascade process [5,7,8].
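The −5/3 exponent quoted above follows from a standard dimensional argument (a textbook sketch for orientation, not reproduced from this paper): in the inertial range the spectrum can depend only on ε and k.

```latex
% Kolmogorov 1941 dimensional analysis:
% [E(k)] = L^3 T^{-2}, \quad [\varepsilon] = L^2 T^{-3}, \quad [k] = L^{-1}.
E(k) = C_K\, \varepsilon^{a} k^{b}
\;\Rightarrow\;
L^3 T^{-2} = \left(L^2 T^{-3}\right)^{a} L^{-b}
\;\Rightarrow\; a = \tfrac{2}{3},\; b = -\tfrac{5}{3},
\qquad
E(k) = C_K\, \varepsilon^{2/3} k^{-5/3}.
```

Matching powers of T gives a = 2/3, and matching powers of L then forces b = −5/3, with C_K a dimensionless (Kolmogorov) constant.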
Somewhat contrary to this intuitive picture, some numerical and experimental observations suggest that the small-scale eddies do not "forget" the large scales. For example, it is known that fluctuations of small-scale quantities (e.g., the energy dissipation rate) follow those of large-scale quantities (e.g., the energy injection rate) with a time lag along with the energy cascade [9-11]. This time lag is on the order of the large-eddy turnover time, which is the characteristic time scale for the largest eddies to be stretched into smaller eddies. Another example is the chaos synchronization of small-scale motions induced by the energy cascade, where the small-scale velocity field is slaved to the chaotic dynamics of the large-scale velocity field [12-16]. Moreover, small-scale intermittency can also be regarded as such an example because it implies that the turbulent fluctuations grow in each cascade step and thus "remember" the large scales [4,7,8]. These phenomena suggest that information about the large-scale fluctuations is not lost in the cascade process, but rather is transferred to small scales.
In order to deepen our understanding of the generation mechanism of universality and intermittency of turbulent fluctuations, it is thus desirable to reveal the nature of the information transfer across scales associated with the energy cascade. As a first step toward this end, here we aim to prove that information of turbulent fluctuations is transferred from large to small scales in fully developed three-dimensional fluid turbulence. While turbulence has been studied in various contexts from information-theoretic perspectives in recent decades [17-28], no previous study has theoretically shown that information flows across scales along with the turbulent cascade.
For this purpose, we employ information thermodynamics, which is a thermodynamic framework for information flow between interacting subsystems [29-31]. While information thermodynamics has its origins in the thought experiment of Maxwell's demon, it has recently been applied to information processing at the cellular level in biological systems [32-36] and even to deterministic chemical reaction networks [37]. By applying information thermodynamics to turbulence, we can clearly define the concept of "information" as a quantity closely related to thermodynamic quantities and obtain universal constraints on the flow of information. Note that information thermodynamics requires us to use a thermodynamically consistent model that includes thermal fluctuations, i.e., fluctuating hydrodynamic equations [38,39]. To put it another way, this approach also enables us to investigate the effects of thermal fluctuations on turbulence dynamics, which have recently been intensively investigated [40-46].
In this paper, we prove that information of turbulent fluctuations is transferred from large to small scales along with the energy cascade. We emphasize that our main results [Ineqs. (25) and (26)] are exact and universal relations, independent of the details of the flow under consideration. While we derive these relations for the fluctuating Navier-Stokes equation, our results are valid for various turbulence models, including shell models. We numerically illustrate our findings using the Sabra shell model and further show that in the inertial range, the intensity of the information flow is nearly constant and can be scaled by the large-eddy turnover time [Eq. (54)]. This observation suggests that the information of large-scale turbulent fluctuations is transferred to small scales with nearly constant intensity by the energy cascade process. Thus, our results challenge the conventional intuitive picture of how universality emerges at small scales. Moreover, our numerical results suggest that the corresponding information-thermodynamic efficiency is quite low compared to other typical information processing systems such as Maxwell's demon. This implies that transferring information from large to small scales involves enormous thermodynamic costs, indicating the poor performance of turbulence as an information processing system.
This paper is organized as follows. In Sec. II, we introduce the fluctuating Navier-Stokes equation and its corresponding Fokker-Planck equation. In Sec. III, we briefly review some basic properties of fully developed turbulence described by the fluctuating Navier-Stokes equation. In Sec. IV, we introduce the two information-theoretic quantities that are important in describing our main result: mutual information and information flow. Then, in Sec. V, we explain our main result on the information flow in turbulence and its derivation. In the derivation, we first formulate the second law of thermodynamics for the fluctuating Navier-Stokes equation and then derive the second law of information thermodynamics. Section VI presents a numerical demonstration of our main result. By introducing the concept of information-thermodynamic efficiency, we show that it is quite low in turbulence compared to other typical information processing systems. We further show that in the inertial range, the intensity of the information flow is nearly constant and can be scaled by the large-eddy turnover time. In Sec. VII, we summarize our findings with some remarks and future perspectives. The appendices contain details of the derivations and numerical simulations.

II. SETUP
While our results are valid for various thermodynamically consistent turbulence models, we focus on the fluctuating Navier-Stokes equation except in Sec. VI, where we numerically illustrate our main result by using a shell model. We consider an incompressible fluid with constant mass density ρ, temperature T, and kinematic viscosity ν, confined in a cube with periodic boundary conditions, Ω = L𝕋³. Let u(x, t) = (u_x(x, t), u_y(x, t), u_z(x, t)) be the fluid velocity at position x ∈ Ω and time t ∈ ℝ. Hereafter, we often omit the argument t to simplify the notation. The time evolution of the velocity field u is described by the fluctuating Navier-Stokes equation [38,39,42,43]:

∂_t u + (u · ∇)u = −∇p + ν∇²u + f + ∇ · s,   (1)

with the incompressibility condition ∇ · u = 0, where p denotes the kinematic pressure and f represents the external force per unit mass, which, without loss of generality, is assumed to be divergence-free. We further assume that f acts only at large scales, i.e., it is supported in Fourier space at low wave numbers ∼ k_f. In the last term on the right-hand side of (1), s denotes a thermal fluctuating stress prescribed as a zero-mean Gaussian random field that satisfies

⟨s_ab(x, t) s_cd(x′, t′)⟩ = (2νk_B T/ρ)(δ_ac δ_bd + δ_ad δ_bc) δ(x − x′) δ(t − t′),   (2)

where a, b, c, d ∈ {x, y, z} and δ_ab denotes the Kronecker delta, which is 1 if a = b and zero otherwise. Here, the prefactor 2νk_B T/ρ, where k_B denotes the Boltzmann constant, is chosen according to the fluctuation-dissipation relation of the second kind [47,48] so that the model (1) is thermodynamically consistent [49].
Let û_k be the Fourier mode of the velocity field with wave vector k ∈ (2π/L)ℤ³, defined as

û_k := (1/V) ∫_Ω d³x u(x, t) e^{−ik·x},   (3)

where V := L³ denotes the volume of the fluid. Here, we note that fluctuating hydrodynamics describes fluid motions at the mesoscopic level [38,39]. In other words, there is a cutoff wave number Λ such that ℓ⁻¹_macro ≪ Λ ≪ ℓ⁻¹_micro, where ℓ_macro denotes the macroscopic length scale characterizing the macroscopic behaviors and ℓ_micro denotes the microscopic length scale, such as the molecular size, the interaction length, or the mean free path [43]. In the following, we assume that any summation over wave vectors k runs only up to the cutoff, |k| < Λ. Because of the reality condition û*_k = û_{−k}, only the modes û_k whose wave vector lies in the half-set K⁺, defined by (4), are independent (for a schematic of this set, see Fig. 1 in Sec. IV). Then, the fluctuating Navier-Stokes equation (1) can be rewritten as the stochastic differential equations (5) and (6) for û := {û_k | k ∈ K⁺} and the complex-conjugate variables û* := {û*_k | k ∈ K⁺} (see Appendix A for the derivation). Here, B_k(û, û*) denotes the nonlinear term (7), where k := |k|, and ξ_k denotes the zero-mean white Gaussian noise that satisfies ξ*_k = ξ_{−k}, k · ξ_k = 0, and the correlation (8). Note that the wave vectors p and q, which are summed over in (7), may not belong to the half-set K⁺; if p ∉ K⁺, then û_p should be interpreted instead as û*_{−p}. We also remark that the noise intensity 2νk²k_B T/ρ becomes large in the high-wave-number region to balance the viscous damping.
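The reality condition û*_k = û_{−k}, which halves the number of independent modes, is easy to check numerically; a minimal sketch with NumPy's FFT (a generic illustration, unrelated to the paper's solver):

```python
import numpy as np

rng = np.random.default_rng(0)
L = 8  # grid points per dimension (tiny, for illustration)
u = rng.standard_normal((L, L, L))  # one real velocity component u(x)

uk = np.fft.fftn(u)  # Fourier modes on the discrete wave-vector grid

# Check \hat{u}_{-k} = \hat{u}_k^* for every wave vector k.
# On the FFT grid, -k corresponds to reversing each index modulo L:
uk_minus = np.roll(uk[::-1, ::-1, ::-1], 1, axis=(0, 1, 2))
assert np.allclose(uk_minus, np.conj(uk))

# Consequently, of the L^3 complex modes only about half carry independent
# information: the other half is fixed by complex conjugation.
```

This is exactly why the summations in the main text can be restricted to the half-set K⁺.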
Let p_t(û, û*) be the probability density of the total independent Fourier modes {û, û*} at time t. The time evolution of p_t(û, û*) is governed by the Fokker-Planck equation (9), equivalent to the stochastic differential equations (5) and (6), where the summation over k is restricted to the half-set K⁺. Here, J_k(û, û*) denotes the probability current associated with a Fourier mode û_k, given by (10) in terms of a drift vector A_k(û, û*), defined by (11), and the identity matrix I. Note that, from the incompressibility condition ∇ · u = 0 and the definitions of f and B_k(û, û*), we have k · û_k = 0, k · f̂_k = 0, and k · B_k(û, û*) = 0. Below, we assume that the system eventually reaches a statistically steady state with a stationary distribution p_ss(û, û*) after a sufficiently long time.

III. BASIC PROPERTIES
In this section, we briefly review some basic properties of fully developed turbulence described by the fluctuating Navier-Stokes equation (1).

A. Energy balance
We first consider the time evolution of the mean kinetic energy per unit mass ⟨|u|²⟩/2, where ⟨·⟩ denotes the average with respect to p_t(û, û*) (hereafter, we omit "per unit mass" for brevity). Importantly, the nonlinear term B_k(û, û*) conserves the total kinetic energy, satisfying the relation (12). Then, from the Fokker-Planck equation (9), or equivalently from the stochastic differential equations (5) and (6) with stochastic calculus [50], we find the energy balance equation (13), where we have introduced the energy dissipation rate

ε := ν⟨|∇u|²⟩.   (14)

The second and third terms on the right-hand side of (13) denote the energy injection rates due to the external force and the internal thermal noise, respectively. In the steady state, the energy dissipation rate balances the injection rate,

ε ≈ ⟨f · u⟩.   (15)

Here, we have ignored the energy injection due to the thermal noise by noting that the kinetic energy is much larger than the thermal energy over a wide range of scales in standard cases [42,43].
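The role of the fluctuation-dissipation prefactor in this balance can be seen in a single-mode caricature: a damped mode driven by noise of matched strength settles at its equipartition value, with injection and dissipation balancing on average. A toy sketch (all parameter values hypothetical, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
gamma = 1.0   # damping rate, playing the role of nu k^2
D = 0.5       # target equipartition energy, playing the role of k_B T / rho
dt, nsteps = 1e-2, 200_000

# Ornstein-Uhlenbeck mode: dx = -gamma x dt + sqrt(2 gamma D) dW.
# The noise amplitude sqrt(2 gamma D) is the FDR choice: it guarantees that
# noise injection and viscous-like damping balance at <x^2> = D.
x, xs = 0.0, []
for _ in range(nsteps):
    x += -gamma * x * dt + np.sqrt(2 * gamma * D * dt) * rng.standard_normal()
    xs.append(x)

var = np.var(xs[nsteps // 10:])  # discard the initial transient
assert abs(var - D) < 0.15 * D   # steady state sits at equipartition
```

Turbulence adds the nonlinear term on top of this balance, which redistributes (but does not create or destroy) energy across the modes.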

B. Energy cascade
While the nonlinear term B_k(û, û*) does not contribute to the energy balance equation (13), it redistributes the energy over a wide range of scales. To see this, we investigate the energy exchange across scales by considering the time evolution equation (16) of the large-scale energy, which is defined as the total energy up to an arbitrary wave number K, where Σ_{k≤K} denotes the summation over all k that satisfy k ≤ K. Here, Π_K denotes the scale-to-scale energy flux from large to small scales, defined by (17). Now, we suppose that the system reaches a steady state and that the energy dissipation rate ε remains finite in the inviscid limit ν → 0 [4]. Because the viscous dissipation is negligible at scales much larger than the Kolmogorov dissipation scale η ≡ k_ν⁻¹ := ν^{3/4} ε^{−1/4}, the second and third terms on the right-hand side of (16) can be ignored in the range K ≪ k_ν. Similarly, by noting the relation (15), the last term on the right-hand side of (16) can be approximated by the energy dissipation rate ε in the range K ≫ k_f. Therefore, we obtain

Π_K ≈ ε  for  k_f ≪ K ≪ k_ν.   (18)

The energy is thus transferred conservatively from large to small scales within the inertial range. This energy cascade process underlies various unique properties of turbulence [4,5,8] and is also essential for deriving our main results.
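The budget behind Eq. (18) can be summarized schematically as follows (our paraphrase of the large-scale energy balance, not the paper's numbered equations):

```latex
\frac{d}{dt}\sum_{k\le K}\frac{\langle|\hat u_k|^2\rangle}{2}
\;=\;
\underbrace{\text{injection}}_{\to\,\varepsilon\ \ (K\gg k_f)}
\;-\;\Pi_K
\;-\;\underbrace{\text{viscous dissipation at }k\le K}_{\to\,0\ \ (K\ll k_\nu)} .
```

Setting the left-hand side to zero in the steady state and taking K deep inside the inertial range leaves Π_K ≃ ε: whatever is injected at large scales must flow, scale by scale, toward the dissipation range.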

IV. INFORMATION-THEORETIC QUANTITIES
In this section, we introduce the two information-theoretic quantities that are important in describing our main result: mutual information and information flow. Since we are interested in the information transfer across scales, we first divide the set of independent Fourier modes {û, û*} into two parts at an arbitrary intermediate scale K (see Fig. 1):

U^<_K := {û_k, û*_k | k ∈ K⁺, k ≤ K},  U^>_K := {û_k, û*_k | k ∈ K⁺, k > K},   (19)

which denote the large-scale and small-scale modes, respectively.
The strength of the correlation between the large-scale modes U^<_K and the small-scale modes U^>_K at time t is quantified by the mutual information [51]:

I[U^<_K ; U^>_K] := ⟨ln [ p_t(U^<_K, U^>_K) / ( p^<_t(U^<_K) p^>_t(U^>_K) ) ]⟩,   (20)

where ⟨·⟩ denotes the average with respect to the joint probability distribution p_t(U^<_K, U^>_K), and p^<_t(U^<_K) and p^>_t(U^>_K) are the marginal distributions for the large-scale and small-scale modes, respectively. Note that the joint probability distribution p_t(U^<_K, U^>_K) is nothing but the probability density of the total independent Fourier modes, p_t(û, û*), governed by the Fokker-Planck equation (9). The mutual information is nonnegative and is equal to zero if and only if U^<_K and U^>_K are statistically independent.
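For jointly Gaussian variables the definition (20) has a closed form, which is handy both for intuition and for testing estimators later on; a minimal sketch (generic, not specific to turbulence):

```python
import numpy as np

def gaussian_mi(cov):
    """Mutual information (in nats) between the two components of a
    zero-mean bivariate Gaussian: I = 0.5 * ln(C_xx C_yy / det C)."""
    return 0.5 * np.log(cov[0, 0] * cov[1, 1] / np.linalg.det(cov))

# For unit-variance x, y with correlation coefficient r: I = -0.5 ln(1 - r^2)
for r in (0.0, 0.5, 0.9):
    cov = np.array([[1.0, r], [r, 1.0]])
    assert np.isclose(gaussian_mi(cov), -0.5 * np.log(1 - r**2))

# MI is nonnegative and diverges as the variables become deterministic copies
assert gaussian_mi(np.array([[1.0, 0.99], [0.99, 1.0]])) > 1.9
```

Independent variables (r = 0) give exactly zero, matching the if-and-only-if statement above.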
FIG. 1. The light blue shaded half-space represents the set of wave vectors K⁺, defined by (4), associated with the independent Fourier modes {û, û*}. We divide the independent Fourier modes into two parts at an arbitrary wave number K. The dark blue shaded hemisphere denotes the set of wave vectors associated with the large-scale modes U^<_K. If the information flow İ_K is positive, then the small-scale modes U^>_K are gaining information about the large-scale modes U^<_K, as shown by the thick blue arrow pointing outward from the hemisphere.

Because the mutual information is symmetric between the two variables, it cannot quantify the directional flow of information from one variable to the other. The directional flow of information can be quantified in terms of the information flow, which is also called the learning rate [32,35,52,53]. The information flow that characterizes the rate at which U^<_K acquires information about U^>_K is defined as

İ^<_K := lim_{dt→0} ( I[U^<_K(t+dt); U^>_K(t)] − I[U^<_K(t); U^>_K(t)] ) / dt.   (21)

Similarly, the information flow associated with U^>_K is defined by

İ^>_K := lim_{dt→0} ( I[U^<_K(t); U^>_K(t+dt)] − I[U^<_K(t); U^>_K(t)] ) / dt,   (22)

and we write İ_K := İ^>_K. The direction of the information transfer can be read off from the sign of İ_K. If İ_K > 0 (İ_K < 0), then the small-scale modes U^>_K are gaining (destroying) information about the large-scale modes U^<_K. In other words, the positivity (negativity) of the information flow indicates that information about the large-scale (small-scale) modes is being transferred to small (large) scales (see Fig. 1).
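The finite-increment structure of this definition can be illustrated on a two-variable linear Langevin model in which y continuously "measures" x, so the learning rate of y is positive. A toy sketch with hypothetical dynamics dx = −x dt + √2 dW₁, dy = (x − y) dt + √2 dW₂, whose stationary Gaussian statistics are known exactly (Var x = 1, Var y = 3/2, Cov = 1/2, learning rate = 1/5):

```python
import numpy as np

rng = np.random.default_rng(2)
N, dt = 200_000, 0.02

def gauss_mi(x, y):
    """Gaussian mutual information from the sample correlation coefficient."""
    r = np.corrcoef(x, y)[0, 1]
    return -0.5 * np.log(1.0 - r**2)

# Draw N samples from the exact stationary distribution of the model above
cov = np.array([[1.0, 0.5], [0.5, 1.5]])
x, y = rng.multivariate_normal([0, 0], cov, size=N).T

# One Euler-Maruyama step for y only: y keeps "measuring" x ...
y_next = y + (x - y) * dt + np.sqrt(2 * dt) * rng.standard_normal(N)

# ... and a finite-difference estimate of the learning rate of y,
# mirroring the definition of the information flow:
flow_y = (gauss_mi(x, y_next) - gauss_mi(x, y)) / dt

assert flow_y > 0                  # y is gaining information about x
assert abs(flow_y - 0.2) < 0.05    # exact value is 1/5 for this model
```

In the turbulence setting, U^>_K plays the role of y and U^<_K the role of x, with the coupling supplied by the nonlinear cascade rather than a prescribed measurement term.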
We finally provide a few remarks on our definitions of the mutual information and the information flow. Since the small-scale modes U^>_K include Fourier modes significantly affected by the viscous damping and the thermal noise, the mutual information (20) and the information flows (21) and (22) can in principle depend on the viscosity ν and the temperature T even for K within the inertial range. Similarly, since the large-scale modes U^<_K include Fourier modes directly affected by the external force f, these information-theoretic quantities can also depend on f. These points will be discussed in Sec. VI, where we present numerical results suggesting that these dependencies are weak in the inertial range.

V. INFORMATION-THERMODYNAMIC BOUND ON INFORMATION FLOW IN TURBULENCE
In this section, we first present our main result on the information flow in turbulence in Sec. V A. Then, we provide a detailed derivation of this result in Sec. V B.

A. Main result
We now state our first main result: in the steady state, for any K within the inertial range k_f ≪ K ≪ k_ν, the information flow (24) is always nonnegative:

İ_K ≥ 0.   (25)

This inequality states that information of large-scale eddies is transferred to small scales along with the energy cascade (see Fig. 2). In other words, the small-scale modes U^>_K are "learning" about the large-scale modes U^<_K while receiving kinetic energy from large scales. Furthermore, there is an upper bound on the information flow determined by the energy dissipation rate and the temperature of the fluid:

İ_K ≤ ρVε/(k_B T),   (26)

which is the second main result of this paper. Before proving these relations, we provide several remarks. First, no ad hoc assumptions are used in deriving the inequalities (25) and (26); these relations are based only on the second law of information thermodynamics and the energy cascade property (18). Second, these inequalities hold independently of the details of the flow under consideration, such as the mechanism of the external forcing. That is, (25) and (26) are universal relations, valid for all types of flow described by the fluctuating Navier-Stokes equation that exhibit the energy cascade (18). Furthermore, we can prove the same relations even for other turbulence models, such as shell models. Here, we note that the information flow İ_K itself may not be universal, even if K lies in the inertial range, as pointed out at the end of the previous section. Nevertheless, our numerical simulation results suggest that the magnitude of the information flow is also universal in the inertial range (see Eq. (54)). Third, these relations hold for arbitrary temperature T, including the limit T → 0, which formally corresponds to the deterministic case. While the second inequality (26) becomes trivial in the limit T → 0, the first inequality (25) still provides a meaningful bound on the information flow.

B. Derivation of the main result
The derivation of the main result is based on the second law of information thermodynamics for bipartite systems [54]. Below, we first formulate the second law of thermodynamics [Ineq. (38)] and then derive the second law of information thermodynamics [Ineqs. (42) and (43)]. Finally, from the inequalities (42) and (43), we derive the main result.

Formulation of the second law of stochastic thermodynamics
First, we formulate the standard second law of thermodynamics. From a thermodynamic point of view, the fluctuating Navier-Stokes equations (5) and (6) consist of two parts: the system and the thermal environment [55]. Here, by the system, we mean the independent Fourier modes {û, û*}, and by the thermal environment, we mean the fast degrees of freedom associated with the microscopic molecular motion, which induce the viscous damping and the thermal noise. In this paper, we treat entropy as dimensionless by dividing it by the Boltzmann constant k_B.
Let S[û, û*] := −ln p_t(û, û*) be the stochastic entropy of the system, whose average is the Shannon entropy (27). Its average rate of change can be decomposed as in (28); in the third equality there, we have used the Fokker-Planck equation (9) and the fact that ∫dû dû* ∂_t p_t(û, û*) = d_t ∫dû dû* p_t(û, û*) = 0. In the last equality, we have introduced Ṡ_k[û, û*], given by (29), where the over-dot denotes a rate of change of an observable that is not the time derivative of a state function.
We identify the entropy change in the environment according to Sekimoto's argument [55]. Since the model satisfies the fluctuation-dissipation relation of the second kind, the thermal environment is ensured to be always in equilibrium at temperature T. Then, by noting that −νk²û_k + √(2νk²k_B T/ρ) ξ_k can be interpreted as a force exerted by the environment on the system, the entropy change in the environment is identified as the work done by the system on the environment per unit time divided by k_B T, as in (30), where c.c. denotes the complex-conjugate term and Ṡ^env_k denotes the entropy change in the environment associated with a wave vector k ∈ K⁺, given by (31). Here, the symbol "∘" denotes multiplication in the sense of Stratonovich [50]. We remark that this identification is consistent with the local detailed balance condition (see Appendix C).
The total entropy production rate, which we denote by σ, is identified as the sum of the average rate of change of the system entropy (28) and the entropy change in the environment (30), as in (32). We now show that σ ≥ 0, which is a manifestation of the second law of thermodynamics and is sometimes called the second law of stochastic thermodynamics [49]. To this end, it is convenient to decompose the probability current (10) into two parts, J_k(û, û*) = J^ir_k(û, û*) + J^rev_k(û, û*), where J^ir_k(û, û*) denotes the irreversible probability current, defined by (33), and J^rev_k(û, û*) denotes the reversible probability current, defined by (34). We then rewrite the average rate of change of the system entropy (28) as (35), where we have used the identity (36) and the fact that J^ir_k(û, û*) is orthogonal to k. We also rewrite the entropy change in the environment (30) as (37); in the second line there, we have used the relation between the Stratonovich and Ito integrals [50], so that the inner product "·" should be interpreted as multiplication in the sense of Ito. Then, by combining (35) and (37), we can confirm that the total entropy production rate is nonnegative, as expressed in (38).

Derivation of the second law of information thermodynamics
Now, we derive the second law of information thermodynamics. Let σ_k := ⟨Ṡ_k[û, û*]⟩ + Ṡ^env_k be the partial entropy production rate [56] associated with a wave vector k ∈ K⁺, so that σ = Σ_{k∈K⁺} σ_k. As is clear from the expression (38), σ_k is also nonnegative for each wave vector k:

σ_k ≥ 0.   (39)

From this relation, we can derive the second law of information thermodynamics for the two sets of Fourier modes U^<_K and U^>_K. We first note that the information flow İ^<_K associated with the large-scale modes U^<_K can be rewritten as in (40); for the derivation, see Appendix B. By using this relation, we obtain (41), where S^<[U^<_K] denotes the Shannon entropy of the large-scale modes U^<_K. Then, by summing (39) over all k that satisfy k ∈ K⁺ and k ≤ K and by using (41), we obtain the second law of information thermodynamics for the large-scale modes, Ineq. (42), where Ṡ^<_env := Σ_{k∈K⁺, k≤K} Ṡ^env_k denotes the entropy change in the environment due to the large-scale modes. Similarly, we can obtain the second law of information thermodynamics for the small-scale modes, Ineq. (43). Note that the first two terms on the right-hand side of (42) and (43) can be interpreted as the total entropy production rate associated with the large-scale and small-scale modes, respectively. Then, (42) and (43) state that the total entropy production associated with each set of modes is not necessarily nonnegative but is bounded by the information flow. In particular, if U^<_K and U^>_K are statistically independent, then İ^<_K = İ^>_K = 0, and the standard second law of thermodynamics holds for each set of modes. In contrast, if they are correlated, the inequalities (42) and (43) give nontrivial bounds on the information flow in terms of the entropy production.

Derivation of the main result
We now derive the main results (25) and (26) from the second law of information thermodynamics, (42) and (43). We assume that the system is in the steady state. Then, by noting that İ_K = İ^>_K = −İ^<_K, (42) and (43) can be rewritten as (44) and (45), respectively. We set K to be within the inertial range k_f ≪ K ≪ k_ν. Then, Ṡ^<_env can be expressed in terms of the energy flux (17), as in (46), where we have used the steady-state energy balance for the large-scale modes. Similarly, Ṡ^>_env can be expressed as in (47), where we have used the property of the nonlinear term (12). By substituting these expressions into (44) and (45) and noting that Π_K → ε as K/k_ν → 0, we arrive at the main results (25) and (26).
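In the steady state, this chain of estimates can be summarized schematically as follows (our paraphrase, with the convention that heat Q dumped into the environment at temperature T produces entropy Q/k_B T):

```latex
\dot S^{<}_{\rm env} \;\propto\; \frac{\rho V\,(\varepsilon - \Pi_K)}{k_B T}
\;\longrightarrow\; 0 ,
\qquad
\dot S^{>}_{\rm env} \;\simeq\; \frac{\rho V\,\Pi_K}{k_B T}
\;\longrightarrow\; \frac{\rho V \varepsilon}{k_B T} .
```

The large-scale modes pass on essentially all the energy they receive, so they dump almost no heat into the environment, while the small-scale modes dissipate the full cascade flux. The steady-state bounds −Ṡ^<_env ≤ İ_K ≤ Ṡ^>_env then collapse to 0 ≤ İ_K ≤ ρVε/(k_B T), i.e., Ineqs. (25) and (26).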

VI. NUMERICAL SIMULATION
Here we numerically illustrate the main results by estimating the information flow İ_K. Since estimating the information flow for the fluctuating Navier-Stokes equation requires an enormous computational cost, we instead use a fluctuating shell model, which is a simplified caricature of the fluctuating Navier-Stokes equation in wave number space. Even for the fluctuating shell model, we can easily confirm that the main results (25) and (26) remain valid. In the following, we first introduce the fluctuating shell model in Sec. VI A. Next, we explain the setup of the numerical simulation in Sec. VI B. The numerical simulation results are presented in Sec. VI C.

A. Model
We consider the Sabra shell model with thermal noise [42,43,57]. Let u_n(t) ∈ ℂ be the "velocity" at time t with wave number k_n = k_0 2ⁿ (n = 0, 1, ..., N). The time evolution of the complex shell variables u := {u_n} is given by the Langevin equation (48), with the scale-local nonlinear interactions given by (49), where we set u_{−2} = u_{−1} = u_{N+1} = u_{N+2} = 0. Here, ν > 0 represents the kinematic viscosity, f_n ∈ ℂ denotes the external body force that acts only at large scales, i.e., f_n = 0 for n > n_f, and ξ_n ∈ ℂ is the zero-mean white Gaussian noise that satisfies ⟨ξ_n(t)ξ*_{n′}(t′)⟩ = 2δ_{nn′}δ(t − t′). The specific form of the thermal noise term satisfies the fluctuation-dissipation relation of the second kind, where T denotes the absolute temperature, k_B the Boltzmann constant, and ρ the mass "density".
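A minimal time-stepper for a model of this type can be sketched as follows. All parameter values here are hypothetical, and we use the conventional Sabra coefficients (1, −1/2, −1/2); the paper's actual parameters are those of its Table I.

```python
import numpy as np

rng = np.random.default_rng(3)

N = 20                   # number of shells (hypothetical)
k0, nu = 0.1, 1e-6       # k_n = k0 * 2^n; viscosity (hypothetical)
kBT_rho = 1e-12          # k_B T / rho, thermal-noise strength (hypothetical)
kf = k0 * 2.0 ** np.arange(-2.0, N + 2)  # padded wave numbers, kf[i] = k_{i-2}
k = kf[2:N + 2]

def sabra(u):
    """Sabra nonlinear term with coefficients (a, b, c) = (1, -1/2, -1/2);
    the boundary shells are enforced by zero padding."""
    v = np.zeros(N + 4, dtype=complex)
    v[2:N + 2] = u
    return 1j * (kf[3:N + 3] * np.conj(v[3:N + 3]) * v[4:N + 4]
                 - 0.5 * kf[2:N + 2] * np.conj(v[1:N + 1]) * v[3:N + 3]
                 + 0.5 * kf[1:N + 1] * v[1:N + 1] * v[0:N])

def flux(u, n):
    """Energy flux out of shells 0..n: minus the nonlinear energy input there."""
    return -np.sum(np.real(np.conj(u[:n + 1]) * sabra(u)[:n + 1]))

# The nonlinearity conserves energy exactly: sum_n Re(u_n^* B_n) = 0.
u = 1e-3 * k ** (-1.0 / 3.0) * np.exp(2j * np.pi * rng.random(N))
transfer = np.sum(np.real(np.conj(u) * sabra(u)))
assert abs(transfer) < 1e-12

# Euler-Maruyama steps with forcing on shell 0 and FDR-matched thermal noise
f = np.zeros(N, dtype=complex); f[0] = 5e-3 * (1 + 1j)  # hypothetical forcing
dt = 1e-5
for _ in range(200):
    noise = rng.standard_normal(N) + 1j * rng.standard_normal(N)
    u = u + (sabra(u) - nu * k**2 * u + f) * dt \
          + np.sqrt(nu * k**2 * kBT_rho * dt) * noise
assert np.all(np.isfinite(u))
```

The helper `flux` computes Π_n = −Σ_{m≤n} Re(u*_m B_m), whose inertial-range plateau at ε is the shell-model analogue of Eq. (18).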
Although the shell model has a much simpler form than the Navier-Stokes equation, it exhibits rich temporal and multiscale statistics that are similar to those observed in real turbulent flows [58,59]. In particular, the energy cascade property (18) is satisfied even for this model. Then, we can easily confirm that the main results (25) and (26) remain valid for K within the inertial range:

0 ≤ İ_K ≤ ρε/(k_B T).   (51)

Note that, in contrast to (26), the volume of the fluid V does not appear in (51) because ρ has units of mass in the shell model.

B. Setup of the numerical simulation
To investigate the Reynolds-number and temperature dependence of the information flow, we consider three different cases, as listed in Table I. In Case I, we set N = 22 and n_f = 1 to ensure that the external force acts only on the 0th and 1st shells of the total 23 shells. In choosing the parameter values, we note that the presence of thermal fluctuations introduces another dimensionless quantity, θ_η := k_B T/(ρu_η²), in addition to the Reynolds number Re. Here, u_η := (εν)^{1/4} denotes the characteristic velocity at the Kolmogorov dissipation scale, and thus the dimensionless temperature θ_η is the ratio of the thermal energy to the kinetic energy at the Kolmogorov dissipation scale. Then, the values of the external force and the other parameters are chosen following Refs. [42,43] so that the achieved Reynolds number Re and the dimensionless temperature θ_η are both comparable to typical values in the atmospheric boundary layer, i.e., Re ∼ 10⁶ and θ_η ∼ 10⁻⁸. In Case II, we set N = 19 so that the achieved Reynolds number is lowered to Re ∼ 10⁵ while leaving the other parameter values unchanged. In Case III, we consider the standard deterministic case by setting T = 0 (θ_η = 0) while leaving the other parameter values unchanged. In all three cases, we have used N_samp = 3 × 10⁵ samples in the following averaging and estimation.
To estimate the mutual information, we first note that the naive binning approach is not feasible because it requires estimating the 2(N + 1)-dimensional probability density p_t(U^<_K, U^>_K). Instead, we use the Kraskov-Stögbauer-Grassberger (KSG) estimator [60-62], which has the advantage that it does not require estimation of the underlying probability density. The KSG estimator uses the distances to the κ-th nearest neighbors of the sample points to detect the structure of the underlying probability distribution. While we set κ = 4 here, following Ref. [60], essentially the same results are obtained for other values of κ. Because the KSG estimator is based on a local-uniformity assumption on the probability density, the estimated value approaches the true value as N_samp → ∞ when this assumption is satisfied. The information flow İ_K can then be estimated by applying the KSG estimator to finite increments of the mutual information. Note that this procedure requires high accuracy in the estimation of the mutual information because the information flow is defined through infinitesimal increments of the mutual information. Because it is not feasible to increase the number of samples indefinitely, we instead take the approach of using the largest possible time increment ∆t; that is, we define the estimated information flow Î_K by (53). Because we are interested in K within the inertial range, we choose ∆t such that it is smaller than the smallest time scale in the inertial range. Therefore, we set ∆t =  for Cases I and III.

In all three cases, the energy spectrum is consistent with the Kolmogorov spectrum in the inertial range, E_n ∝ k_n^{−2/3}. In the dissipation range, the spectrum exhibits a stretched-exponential decay in Case III, while it exhibits equipartition of energy, E_n = k_B T/ρ (E_n/u_η² = θ_η in dimensionless form), in Cases I and II. We also note that while the thermal-fluctuation effects are negligible in the inertial range, they become relevant already at kη ∼ 10, as pointed out in Refs. [42,43].
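A compact implementation of the KSG estimator (variant 1 of Ref. [60]) can be written with a k-d tree; this is a generic sketch for two scalar variables, not the code used in the paper:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def ksg_mi(x, y, kappa=4):
    """KSG estimate (algorithm 1) of I(x; y) in nats for 1-D samples:
    I = psi(kappa) + psi(N) - < psi(n_x + 1) + psi(n_y + 1) >,
    where n_x, n_y count neighbors strictly inside the max-norm ball whose
    radius is the distance to the kappa-th neighbor in the joint space."""
    N = len(x)
    z = np.column_stack([x, y])
    eps = cKDTree(z).query(z, k=kappa + 1, p=np.inf)[0][:, -1]
    tx, ty = cKDTree(x[:, None]), cKDTree(y[:, None])
    nx = np.array([len(tx.query_ball_point([xi], ei - 1e-12)) - 1
                   for xi, ei in zip(x, eps)])
    ny = np.array([len(ty.query_ball_point([yi], ei - 1e-12)) - 1
                   for yi, ei in zip(y, eps)])
    return (digamma(kappa) + digamma(N)
            - np.mean(digamma(nx + 1) + digamma(ny + 1)))

# Sanity check against the Gaussian closed form I = -0.5 ln(1 - r^2)
rng = np.random.default_rng(4)
r = 0.6
x, y = rng.multivariate_normal([0, 0], [[1, r], [r, 1]], size=2000).T
assert abs(ksg_mi(x, y) - (-0.5 * np.log(1 - r**2))) < 0.1
```

Because no density is ever estimated, the same construction extends to the high-dimensional blocks U^<_K and U^>_K, which is why it is practical here where binning is not.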

Mutual information
The standard deviation of the estimated mutual information is estimated to be ∼10⁻³ by subsampling [62] (see Appendix D 3), which lies within the marker size. Notably, the mutual information is almost independent of K in the inertial range. Furthermore, Fig. 3(b) implies that the mutual information is also independent of Re and T in the inertial range. In other words, if we divide the total shell variables into large-scale and small-scale modes at an arbitrary wave number K within the inertial range, then the correlation between the large-scale and small-scale modes is not affected by Re or T. In the energy-injection and dissipation ranges, however, the mutual information depends significantly on Re and T. In particular, the mutual information becomes zero in the dissipation range for Cases I and II, while it remains finite for Case III. This is because the thermal fluctuations destroy the correlation, so that the large-scale and small-scale modes become statistically independent.
To quantify the thermodynamic cost of the information transfer, we introduce the information-thermodynamic efficiency η_eff, defined as the ratio of the information flow to its thermodynamic bound. From the main result (25) and the second law of information thermodynamics in the steady state (45), it immediately follows that 0 ≤ η_eff ≤ 1. This efficiency quantifies how efficiently the small-scale modes U^>_K gain information about the large-scale modes U^<_K relative to the energy dissipation, i.e., the thermodynamic cost. The numerical result then states that η_eff ≪ 1, which suggests that the small-scale eddies acquire information about the large-scale eddies at a relatively high thermodynamic cost. This property is in contrast to other typical information processing systems, such as Maxwell's demon [35,53,54], and thus characterizes turbulence dynamics.
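An order-of-magnitude version of this statement can be written down directly, with all numbers purely illustrative (chosen to mimic the atmospheric-boundary-layer value θ_η ∼ 10⁻⁸ quoted in Sec. VI B, not taken from the paper's data):

```python
import numpy as np

# Illustrative magnitudes (hypothetical):
eps = 1.0      # energy dissipation rate per unit mass [m^2/s^3]
nu = 1e-5      # kinematic viscosity [m^2/s]
theta = 1e-8   # dimensionless temperature k_B T / (rho u_eta^2)
tau_L = 1.0    # large-eddy turnover time [s]
C = 1.0        # O(1) constant in the scaling I_dot ~ C / tau_L

u_eta2 = np.sqrt(eps * nu)   # u_eta^2 = (eps * nu)^(1/2)
kBT_rho = theta * u_eta2     # k_B T / rho implied by theta

info_flow = C / tau_L        # nats/s, Eq. (54)-type scaling
bound = eps / kBT_rho        # shell-model bound rho * eps / (k_B T), Eq. (51)
eta_eff = info_flow / bound  # information-thermodynamic efficiency

assert 0 < eta_eff < 1e-6    # many orders of magnitude below 1
```

With these magnitudes η_eff ∼ 10⁻¹¹, consistent with the statement that turbulence is a thermodynamically very inefficient information processor.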
Furthermore, Fig. 3(c) suggests that the information flow may be scaled as İK ≃ C/τ_L in the inertial range, where C is a dimensionless constant that is almost independent of Re, K, and T, and τ_L is the large-eddy turnover time. By noting that τ_L can be interpreted as the characteristic time scale for the largest eddies to be stretched into smaller eddies, this result implies that the information of large-scale eddies is transferred to small scales by the energy cascade process with nearly constant intensity. It also implies that, although thermal fluctuations are crucial in deriving the main results (25) and (26), the information flow itself is governed mainly by the large-scale dynamics rather than by the thermal fluctuations.
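As an illustration of the finite-difference estimate of the information flow, the sketch below uses a Gaussian mutual-information formula as a stand-in for the KSG estimator, and adopts one common convention (the rate at which the small-scale modes gain information about the large-scale modes held fixed at time t); the paper's exact definition may differ in its decomposition, and all names here are our own.

```python
import numpy as np

def gaussian_mi(a, b):
    """Mutual information of two scalar signals under a Gaussian
    approximation (a stand-in for the KSG estimator of the main text)."""
    rho = np.corrcoef(a, b)[0, 1]
    return -0.5 * np.log(1.0 - rho**2)

def information_flow(u_large_t, u_small_t, u_small_t_dt, dt):
    """Finite-difference estimate of the information flow across scale K:
    the rate at which the small-scale modes gain information about the
    large-scale modes held fixed at time t (one common convention)."""
    return (gaussian_mi(u_large_t, u_small_t_dt)
            - gaussian_mi(u_large_t, u_small_t)) / dt
```

In this convention, a positive value means that the small-scale modes are learning about the large-scale modes over the increment ∆t, matching the interpretation of positive İK in the main text.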

VII. CONCLUDING REMARKS
In summary, we have proved that, in fully developed three-dimensional fluid turbulence, information of turbulent fluctuations is transferred from large to small scales along with the energy cascade. Our main results (25) and (26) are a direct consequence of the second law of information thermodynamics, and thus they are exact and universal relations, independent of the details of the flow under consideration. Furthermore, our numerical simulation using a shell model suggests that the intensity of the information flow is nearly constant in the inertial range and that the rate of information transfer is characterized by the large-eddy turnover time [Eq. (54)]. This observation indicates that the information of large-scale turbulent fluctuations is transferred to small scales by the energy cascade process with nearly constant intensity. Our results thus challenge the conventional intuitive picture that universal statistical properties emerge at small scales because information about the details of the large scales is lost in the cascade process. Moreover, we have found that the information-thermodynamic efficiency is quite low compared to other typical information processing systems such as Maxwell's demon. This implies that transferring information from large to small scales involves enormous thermodynamic costs, indicating the poor performance of turbulence as an information processing system.
We now provide some technical remarks on the estimation of the information flow. Although the KSG estimator used here is asymptotically unbiased as N_samp → ∞, in general there are both a sample-size-dependent bias and a κ-dependent bias for finite N_samp [62]. In our case, we have found that the magnitude of Î(κ)KSG[U<K : U>K] depends on κ. This may be because the probability distribution p_t(U<K, U>K) is skewed and has heavy tails, thus violating the local uniformity condition [62]. Nevertheless, we have confirmed that the sign of ÎK does not depend on the choice of κ. See Appendix D for more details on these subtle points. It should also be noted that the number of samples N_samp used here is not sufficient for a highly accurate estimation of the information flow, because the standard deviation of the estimated mutual information is comparable to its increment. In other words, if we naively estimate the error bar of the information flow ÎK from the estimated standard deviation of the mutual information, it is of the same order as ÎK itself. It is therefore desirable to perform the numerical calculations with higher accuracy while taking the bias into account.
Our study opens several possible directions for future research. The first direction concerns the origin of the universality of turbulent fluctuations at small scales. As mentioned above, our result is somewhat contrary to the common intuitive picture of how universality emerges at small scales. It is therefore natural to ask how universality emerges at small scales under the influence of the information flow from large scales. We conjecture that the coexistence of universality and information flow can be explained by a stepwise "information cascade" process in which "irrelevant information" is "deamplified" as the cascade develops. The role of various energy cascade mechanisms [10,63] in this process would be an interesting question to investigate. Note that this cascade picture is analogous to that proposed by Wilson in the context of critical phenomena [64]. The second direction concerns intermittency. Since there is an information flow of turbulent fluctuations from large to small scales, the small-scale intermittency must be affected by the information flow. Indeed, intermittency implies that the turbulent fluctuations grow in each cascade step and thus "remember" the large scales [4,7,8]. We therefore expect that there are universal relations between intermittency and the information flow that restrict the possible values of the structure function exponent ζ_p. Finally, because the turbulent cascade is a ubiquitous phenomenon found in quantum fluids [65-68], supercritical fluids near a critical point [69], elastic bodies [70,71], and even spin systems [72-74], it would be an interesting research direction to investigate the nature of the information flow in these various systems. We hope that our work opens up a new research area, "information hydrodynamics," which would provide a theoretical framework to elucidate and control the dynamics of complicated hydrodynamic phenomena.
With these expressions, we arrive at (40). Similarly, for the information flow associated with the small-scale modes İ>K, we obtain (B4). By combining (B3) and (B4), we can easily confirm that (23) holds.

Here, we show that the identification of the entropy change in the environment (30) is consistent with the local detailed balance (LDB). Note that the LDB is essentially equivalent to the fluctuation-dissipation relation of the second kind [47,75]. Therefore, if the expression (30) is thermodynamically consistent, then it should also be consistent with the LDB.
Appendix D: Details of the numerical simulation

In this appendix, we explain the details of the numerical simulation. After describing the setup, we explain the details of the KSG estimator. In particular, we describe the method used to estimate the variance and bias of the KSG estimator.

Setup
To evaluate the inertial range straightforwardly, we first nondimensionalize equation (48). We use a slaved 3/2-strong-order Ito-Taylor scheme [77] with the time step δt := 10⁻⁵, which is smaller than the viscous time scale at the highest wave number, τ_vis ∼ 10⁻⁴. We consider three different cases, as listed in Table I in Sec. VI. In Case I, the parameter values are set to the same values used in Refs. [42,43], which are consistent with the typical values in the atmospheric boundary layer. Specifically, the range of shell numbers is chosen as n = −15, …, 7 so that the achieved Reynolds number is comparable to the typical value in the atmospheric boundary layer, Re ∼ 10⁶. Similarly, the dimensionless temperature is chosen as θ_η = 2.328 × 10⁻⁸. For the external force, we set n_f = −14 to ensure that the external force acts only on the 0th and 1st shells of the total 23 shells. The values of the external forces are adjusted such that û_rms :=
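As a rough illustration of such a simulation (not the paper's scheme or parameter values), the sketch below integrates a Sabra-type shell model with thermal noise using the simpler Euler-Maruyama method. The coefficients, forcing, and noise amplitude (chosen via a fluctuation-dissipation-type relation) are our own illustrative assumptions; the paper instead uses a slaved 3/2-strong-order Ito-Taylor scheme for its equation (48).

```python
import numpy as np

# Illustrative parameters only; the paper's Table I values differ.
N_SHELLS = 12
k = 2.0 ** np.arange(N_SHELLS)        # shell wave numbers k_n = 2^n
nu = 1e-3                             # dimensionless viscosity
theta = 1e-8                          # dimensionless temperature
a, b, c = 1.0, -0.5, -0.5             # Sabra coefficients, a + b + c = 0
f = np.zeros(N_SHELLS, dtype=complex)
f[0] = 5e-3                           # force only the largest shell

def drift(u):
    """Deterministic part: Sabra-type nonlinear term (one common
    convention), viscous damping, and external forcing."""
    up1, up2 = np.roll(u, -1), np.roll(u, -2)
    um1, um2 = np.roll(u, 1), np.roll(u, 2)
    up1[-1] = 0; up2[-2:] = 0; um1[0] = 0; um2[:2] = 0  # boundary shells
    kp1, km1 = np.roll(k, -1), np.roll(k, 1)
    nonlin = 1j * (a * kp1 * np.conj(up1) * up2
                   + b * k * np.conj(um1) * up1
                   + c * km1 * um1 * um2)
    return nonlin - nu * k**2 * u + f

def euler_maruyama_step(u, dt, rng):
    """Euler-Maruyama update with complex thermal noise whose amplitude
    follows a fluctuation-dissipation-type relation, sigma_n^2 ~ nu k_n^2 theta."""
    noise = np.sqrt(nu * k**2 * theta * dt) * (
        rng.normal(size=N_SHELLS) + 1j * rng.normal(size=N_SHELLS))
    return u + drift(u) * dt + noise

rng = np.random.default_rng(0)
u = np.zeros(N_SHELLS, dtype=complex)
for _ in range(2000):
    u = euler_maruyama_step(u, dt=1e-4, rng=rng)
energy = 0.5 * np.abs(u) ** 2         # shell energies E_n = |u_n|^2 / 2
```

The time step must resolve the fastest viscous scale, dt < 2/(ν k_max²), which is why the sketch uses far fewer shells and a larger viscosity than the paper.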

FIG. 1. Schematic of the information flow across scales. The light blue shaded half-space represents the set of wave vectors K+, defined by (4), associated with the independent Fourier modes {û, û*}. We divide the independent Fourier modes into two parts at an arbitrary wave number K: {û, û*} = U<K ∪ U>K. The dark blue shaded hemisphere denotes the set of wave vectors associated with the large-scale modes U<K. If the information flow İK is positive, then the small-scale modes U>K are gaining information about the large-scale modes U<K, as shown by the thick blue arrow pointing outward from the hemisphere.
FIG. 2. Schematic of the information flow in the energy cascade.

FIG. 3(a). Energy spectrum E_n := ⟨|u_n|²⟩_ss/2 in the steady state. The achieved Reynolds numbers are Re ≃ 9.25 × 10⁴ for Case II and 1.46 × 10⁶ for Cases I and III. In all three cases, the spectrum is consistent with the Kolmogorov spectrum in the inertial range, E_n ∝ k_n^{−2/3}.

FIG. 3(b). Scale dependence of the estimated mutual information Î(κ)KSG[U<K(t) : U>K(t)]. The standard deviation, estimated by subsampling [62] (see Appendix D 3), is ∼ 10⁻³ and lies within the marker size.

TABLE I. The largest shell number N, the achieved Reynolds number Re, and the dimensionless temperature θ_η of the three different cases.

Here, τ_η := η/u_η denotes the typical time scale at the Kolmogorov dissipation scale. Note that ∆t is different from the time step δt := 10⁻⁵ τ_η used in solving (48) numerically. Further details of the numerical simulation are given in Appendix D, including the details of the KSG estimator.