Mutual information as a measure of mixing efficiency in viscous fluids

Because of the kinematic reversibility of the Stokes equation, fluid mixing at the microscale requires an interplay between advection and diffusion. Here we introduce mutual information between particle positions before and after mixing as a measure of mixing efficiency. We demonstrate its application in a Couette flow in an annulus and show that non-uniform rotation sequences can lead to more efficient mixing. We also determine mutual information from Brownian dynamics simulations using data compression algorithms. Our results show that mutual information provides a universal and assumption-free measure of mixing efficiency in microscale flows.

Designing protocols that force an out-of-equilibrium system into equilibrium faster than the natural relaxation rate is a pertinent topic in a number of classical [1,2] and quantum systems [3]. A prime example of forced equilibration is the mixing of fluids if the initial state contains a nonequilibrium concentration or temperature distribution. Fluid mixing at the microscale is of paramount importance in biological organisms and in artificial systems. Examples range from the uptake of oxygen, nutrients or chemical signals in aquatic organisms to microreactors and "lab on a chip" applications [4-8]. In biology, mixing is frequently accomplished by cilia, which drive long-range flows, but also create localized regions of chaotic advection [9-14]. A particular challenge to microscale mixing is posed by the time-reversibility of flows at low Reynolds numbers [15,16]. Mixing therefore requires an interplay between advection (stirring) and diffusion [17,18]. Although most examples deal with the mixing of two distinct fluids, the formalisms that are used apply equally to other scalar quantities like temperature [17].
The measures that have been proposed to quantify the mixing efficiency can broadly be classified as global and local [8,17]. Global measures typically start by imposing a pattern, e.g., by distributing the solute in a part of the fluid volume. After mixing, the distribution of the solute can be characterized by the intensity of segregation, defined as the variance of the solute concentration (the L2 norm) [19,20], its entropy [21-24], the mean distance to the closest particle from the other population [25] or Sobolev norms [20,26]. The same global mixing measures can also be applied to quantify unmixing in cases of spontaneous phase segregation [23]. The limitation of these measures is that the result depends on the choice of the initial distribution. Local measures typically characterize the amount of stretching or the Lyapunov exponents [16,27,28]. In this paper, we introduce mutual information as a universal measure of mixing in fluids at low Reynolds numbers with a strong interplay between advection and diffusion. As a simple model system, we test our method on a 2D Couette flow in an annulus geometry and show that the mixing efficiency depends in a non-trivial way on the time sequence of rotation (see Fig. 1).
Because the theoretically smallest compressed size of a dataset is given by its Shannon entropy, lossless data compression algorithms can be used to estimate the entropy of a distribution [29-31]. Order in two-dimensional systems can be estimated with lossless image compression algorithms like PNG [32] or GIF with LZW compression [33]. However, it has also been argued that compression algorithms lead to poor estimates in two-dimensional systems with long-range correlations [34,35]. We estimate mutual information with different data compression algorithms and show that the neural-network based PAQ8PX [36] gives good accuracy.
The motion of the fluid at low Reynolds number is governed by the Stokes equation µ∆v = ∇p and the incompressibility condition ∇ • v = 0. Here, v is the fluid velocity, p is the pressure and µ is the fluid viscosity. We assume that the shape of the fluid volume does not change during the mixing process, although an extension to shape-changing compartments is straightforward. Therefore, the normal component of the fluid velocity has to vanish at the boundary, n • v = 0. Because there are no inertial terms in the Stokes equation, the fluid motion at any time t is fully determined by the instantaneous (tangential) velocities at the boundaries at the same time. The particles in the fluid to be mixed (we assume that they are all equivalent) are subject to Brownian motion in addition to the advection caused by the fluid flow. The time evolution of the probability density P(x, t) to find a particle at position x at time t is determined by the advection-diffusion equation (equivalent to a Fokker-Planck equation)

∂P/∂t + ∇ • (vP) = D∆P (1)

with the diffusion constant D. The zero-flux condition further implies n • ∇P(x, t) = 0 at the boundary. The conditional probability P(x, t|x0) is obtained by solving Eq. (1) with the initial condition x = x0 at t = 0. Because the flow is divergence-free, the stationary solution is always given by a constant density, P(x, t) ≡ 1/V. We quantify the efficiency of fluid mixing using the mutual information between the initial and the final position of a particle in the fluid. Mutual information provides the strictest possible measure of mixing: zero mutual information means that the final position of a particle is unrelated to its initial position and therefore also to the position of any other particle in the fluid. Unlike many other criteria, it does not require any assumptions about the initial spatial distribution of the fluid components to be mixed. Mutual information is defined as the sum of the entropies of the initial and final position distributions, reduced by the joint entropy of the initial and final position together [37]:

I(t) = S[x0] + S[xt] − S[x0, xt]. (2)

Here, S[x0] is the entropy of the initial position variable, S[xt] is the entropy of the position variable at time t, and S[x0, xt] is the joint entropy of these two position variables. The final distribution at time t can also be expressed as P(x, t) = ∫ P(x, t|x0) P(x0) dx0. Equivalently to Eq. (2), the mutual information can be expressed in terms of the conditional entropy as

I(t) = S[xt] − S[xt|x0], (3)

where S[xt] = −∫ P(x, t) log P(x, t) dx and S[xt|x0] = −∫ P(x0) ∫ P(x, t|x0) log P(x, t|x0) dx dx0 is the conditional entropy with the knowledge of the initial position.
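The decay of the mutual information defined in Eq. (2) can be reproduced in a minimal discrete model. In the sketch below (our illustration, not part of the original analysis), an unbiased random walk on a ring stands in for the propagator P(x, t|x0), with a uniform initial weight as in the text:

```python
import numpy as np

# Toy model: an unbiased random walk on a ring stands in for the
# advection-diffusion propagator P(x, t|x0) (our illustration).
def mutual_information(P_cond, p0):
    """I = S[x0] + S[xt] - S[x0, xt] for the joint P(x0, xt) = p0[x0] * P_cond[x0, xt]."""
    joint = p0[:, None] * P_cond
    pt = joint.sum(axis=0)
    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log(p))
    return entropy(p0) + entropy(pt) - entropy(joint.ravel())

n = 64
T = np.zeros((n, n))                 # one-step transition matrix
for i in range(n):
    T[i, i] = 0.5
    T[i, (i - 1) % n] = 0.25
    T[i, (i + 1) % n] = 0.25

p0 = np.ones(n) / n                  # uniform initial weight, as in the text
P = np.eye(n)                        # propagator at t = 0
I = []
for _ in range(200):
    I.append(mutual_information(P, p0))
    P = P @ T

assert abs(I[0] - np.log(n)) < 1e-9  # perfect knowledge of the initial position
# as in any Markovian process, information is only ever lost
assert all(I[k + 1] <= I[k] + 1e-12 for k in range(len(I) - 1))
```

The monotone decay of I in this toy model mirrors the behavior discussed below for the annulus.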
Mutual information, as we define it, depends on the distribution of initial positions P(x0). This distribution determines the statistical weight that is given to the mixing efficiency in different regions of the fluid. It should not be confused with any imposed pattern in the fluid to be "erased" by mixing, as frequently used in other mixing efficiency criteria. In the following, we assume a homogeneous weight P(x0) = 1/V. As this is the stationary solution, it also leads to P(x, t) = 1/V at all later times. We note, however, that as a possible alternative, one could also use an initial distribution that maximizes I(t), following the spirit of Shannon's channel capacity theorem, which would provide a quantitatively stricter measure.
Interestingly, the mixing efficiency is invariant upon time-reversal of the mixing sequence, defined by v̄(x, t) = −v(x, tF − t), where tF is the duration of the mixing process. This follows from a general property of the Fokker-Planck equation with divergence-free flux densities [38], for which the propagator of the reversed sequence satisfies P̄(x, t|x0) = P(x0, t|x). With a uniform distribution P(x0), we have S[xt] = S[x0] and from Eq. (3) it follows immediately that Ī(tF) = I(tF).
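The time-reversal symmetry can be checked numerically in the same discrete spirit. In the sketch below (our toy construction), a time-dependent biased walk on a ring plays the role of the divergence-free flow; reversing the protocol amounts to applying the drift-inverted (here: transposed) step matrices in reverse order:

```python
import numpy as np

# Toy check: the mixing efficiency of a protocol equals that of its
# time-reversed, drift-inverted counterpart (our discrete illustration).
def mi(P, n):
    """Mutual information of propagator P with a uniform initial weight 1/n."""
    joint = P / n
    p0 = joint.sum(axis=1)
    pt = joint.sum(axis=0)
    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log(p))
    return entropy(p0) + entropy(pt) - entropy(joint.ravel())

n = 32
def biased_step(q):
    """Circulant step: hop right with probability q, left with 1 - q (doubly stochastic)."""
    T = np.zeros((n, n))
    for i in range(n):
        T[i, (i + 1) % n] = q
        T[i, (i - 1) % n] = 1 - q
    return T

T1, T2 = biased_step(0.9), biased_step(0.6)   # time-varying "rotation"
P_fwd = T1 @ T2                               # forward protocol
P_rev = T2.T @ T1.T                           # reversed order, inverted drift

assert abs(mi(P_fwd, n) - mi(P_rev, n)) < 1e-12
```

Since P_rev = (P_fwd)ᵀ and both marginals are uniform, the two mutual informations agree exactly, illustrating Ī(tF) = I(tF).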
We now demonstrate the application of mutual information as a measure of mixing efficiency in a 2D Couette flow. The outer circular boundary with radius Rout is stationary and the inner boundary with radius Rin rotates with an angular velocity Ω(t), as shown in Fig. 1(a). In the following we introduce a characteristic length scale L = (Rin + Rout)/2 and time scale T = L²/D and use them to non-dimensionalize the variables as t̃ = t/T, r̃ = r/L and Ω̃ = ΩT. In these variables, the 2D Stokes equation in the annulus is solved by ṽ = ω̃(r̃) r̃ êθ with [39]

ω̃(r̃) = Ω̃ R̃in²/(R̃out² − R̃in²) (R̃out²/r̃² − 1). (4)

In order to numerically determine the conditional entropy S[xt|x0] in Eq. (3), we need to calculate P(x, t|x0) for any initial position x0. For this purpose, we express the advection-diffusion equation [Eq. (1)] in polar coordinates,

∂P̃/∂t̃ + ω̃(r̃) ∂P̃/∂θ = (1/r̃) ∂/∂r̃ (r̃ ∂P̃/∂r̃) + (1/r̃²) ∂²P̃/∂θ², (5)

along with the boundary condition ∂P̃/∂r̃ = 0 at R̃in = 0.6 and R̃out = 1.4. We solve Eq. (5) numerically with a spectral method. Two examples of the solution for P(x, t|x0), one with pure diffusion and one with uniform rotation of the inner boundary, are shown in Fig. 1(b).
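Assuming the standard annular Couette solution (inner wall rotating, outer wall at rest), the angular velocity profile and its no-slip boundary conditions can be checked in a few lines; the radii match those in the text, and the function name is ours:

```python
import numpy as np

# Annular Couette flow: inner wall (radius R_in) rotates at Omega, the
# outer wall (R_out) is at rest; omega(r) is the standard solution (sketch).
R_in, R_out, Omega = 0.6, 1.4, 1.0

def omega(r):
    """Angular velocity profile, with v_theta(r) = omega(r) * r."""
    return Omega * R_in**2 / (R_out**2 - R_in**2) * (R_out**2 / r**2 - 1.0)

# no-slip boundary conditions at both walls
assert abs(omega(R_in) - Omega) < 1e-12  # co-rotates with the inner wall
assert abs(omega(R_out)) < 1e-12         # outer wall at rest
# the angular velocity decays monotonically between the walls
r = np.linspace(R_in, R_out, 100)
assert np.all(np.diff(omega(r)) < 0)
```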
The conditional entropy S[xt|x0], needed to determine the mutual information I(t), is finally obtained by integrating −P(x, t|x0) log P(x, t|x0) over r̃0, r̃ and θ, making use of the rotational symmetry of the system with regard to θ0. The decay of mutual information at different constant rotation rates is shown in Fig. 2. In all cases I(t) decreases monotonically with time (as in any Markovian process, information can only be lost, never recovered), but the decay is accelerated by increasing the rotation rate, showing that the interplay between advection and diffusion speeds up the mixing process as quantified by the loss of mutual information. A more conventional way of quantifying mixing consists of studying the dissolution of an initial pattern [19]. For example, the initial state can consist of two different fluids. A comparison between the decay of the concentration variance for three such patterns and the mutual information I(t) is shown in Figs. A2 and A3. It shows that the slowest of the patterns has the same final decay rate as I(t) while the others decay faster, in line with the argument that mutual information provides the strictest possible measure of mixing efficiency.
At short times, the effect of advection is small and the particle dynamics can be approximated as free diffusion. With a starting position x0 sufficiently far from the walls, the evolution of the conditional entropy can be approximated as S[xt|x0] = log(4πDt) + 1 [18]. Inserting the conditional entropy into Eq. (3) and switching to nondimensional units gives an approximation for the mutual information, I(t̃) = −log t̃ − log(4π) − 1 + log Ã, where Ã is the dimensionless surface area. The result is shown by the dashed line in Fig. 2. While giving the correct slope, it slightly underestimates the value of I, which we attribute to the effect of the reflective boundaries.
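The free-diffusion entropy S[xt|x0] = log(4πDt) + 1 entering this approximation can be verified against a direct numerical integration of the 2D Gaussian propagator (a quick consistency check of ours; the values of D and t are illustrative):

```python
import numpy as np

# Check S = log(4*pi*D*t) + 1 for a 2D Gaussian with variance 2*D*t per
# axis by integrating -P log P on a grid (D and t are illustrative).
D, t = 1.0, 0.01
var = 2 * D * t

x = np.linspace(-0.8, 0.8, 1201)     # window covers > 5 standard deviations
dx = x[1] - x[0]
X, Y = np.meshgrid(x, x)
P = np.exp(-(X**2 + Y**2) / (2 * var)) / (2 * np.pi * var)
S_num = -np.sum(P * np.log(P)) * dx**2
S_exact = np.log(4 * np.pi * D * t) + 1

assert abs(S_num - S_exact) < 1e-3   # grid sum reproduces the closed form
```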
Because mixing in viscous fluids requires an interplay between advection and diffusion, we expect the mixing efficiency to depend on the time sequence of rotation Ω̃(t̃) and not just on the total angle ∫0^t̃F Ω̃ dt̃ [17]. For example, rotation at the very beginning or at the end of the time interval merely reorders the positions, but does not reduce the mutual information. We also know, as shown above for any mixing process, that I is invariant if we reverse the mixing sequence in time, Ω̃(t̃) → Ω̃(t̃F − t̃).
We investigated the mixing efficiency of time-dependent advection under two different conditions: (i) for a fixed total rotation ∫0^t̃F Ω̃ dt̃ with the additional condition that the sense of rotation be constant (Ω̃ > 0) and (ii) for a fixed total viscous dissipation, which is proportional to ∫0^t̃F Ω̃² dt̃. In each of the settings, we compared the mixing efficiency of uniform rotation (Ω̃ = const.) with a Gaussian profile centered around the middle of the time interval, Ω̃ ∝ exp[−(t̃ − t̃F/2)²/2σ²], as well as with the globally optimal sequence. All sequences were discretized with an odd number of time intervals (25 or 49). We determined the optimal sequence using a global optimizer [GlobalSearch in MATLAB (MathWorks Inc.)], with the velocities in the individual intervals as optimization variables. For a fixed total rotation, the dependence on the width of the Gaussian is usually monotonic [Fig. 3(a)] and optimal mixing is achieved by a sharp, discrete rotation at midtime [Fig. 3(b)]. For certain parameter ranges, especially for large total rotation angles and long times, the dependence becomes non-monotonic [Fig. 3(c)]. In such cases, the optimal sequence consists of two symmetrically arranged peaks [blue dashed line in Fig. 3(d)]. If the total viscous dissipation is kept constant, mixing sequences with a strong non-uniformity naturally become less efficient [Fig. 3(e)]. The optimal velocity profile then becomes approximately parabolic, with a maximum in the middle, dropping to zero at the beginning and the end of the interval. Overall, the analysis shows that the velocity profiles that maximize the mixing efficiency are always non-uniform, regardless of whether the total rotation or the total viscous dissipation is kept constant. The symmetry of the mixing efficiency upon time reversal is also reflected in the symmetry of the optimal profiles. The exact dependence, however, is complex and can be non-monotonic in certain parameter regions.
As an alternative approach, mutual information can also be estimated using lossless data compression algorithms. The (information) entropy of a dataset in principle gives a lower bound on its compressed size. By creating a file with the initial and final positions of particles (x0, xt) and compressing it, we obtain an upper bound on the joint entropy S[x0, xt] and, consequently, a lower bound on the mutual information I(t) via Eq. (2). As each position is a two-component vector, the problem is equivalent to computing the entropy of a distribution in a 4-dimensional space, larger than in most uses of compression algorithms for entropy evaluation reported to date.
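The principle can be illustrated with an off-the-shelf compressor. The sketch below is our illustration only: it uses LZMA rather than PAQ8PX and synthetic 1D byte data rather than the annulus simulation. A file of correlated (x0, xt) pairs compresses to a smaller size than an uncorrelated reference file, which yields a positive lower bound on the mutual information:

```python
import lzma
import numpy as np

# Compression-based bound (sketch): compressing (x0, xt) pairs bounds the
# joint entropy from above, so C_inf - C_t bounds the mutual information
# from below. Data are synthetic 8-bit positions, not the annulus set.
rng = np.random.default_rng(0)
N = 100_000

x0 = rng.integers(0, 256, N, dtype=np.uint8)
# correlated final positions: small random displacement (high mutual information)
xt = (x0 + rng.integers(-4, 5, N)).astype(np.uint8)
# uncorrelated reference: independent final positions
xr = rng.integers(0, 256, N, dtype=np.uint8)

def compressed_bits(a, b):
    """Compressed size, in bits, of the interleaved (a, b) pairs."""
    data = np.column_stack([a, b]).tobytes()
    return 8 * len(lzma.compress(data, preset=9))

C_t = compressed_bits(x0, xt)      # correlated file
C_inf = compressed_bits(x0, xr)    # uncorrelated file

# lower bound on the mutual information per sample, converted to nats
I_est = np.log(2) * (C_inf - C_t) / N
assert I_est > 0  # the correlated file compresses to a smaller size
```

General-purpose compressors capture only part of the correlation, so I_est here is a loose lower bound, consistent with the poor performance of standard algorithms reported below.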
We generate the dataset by randomly selecting an initial position x0 with uniform density inside the fluid area. For each initial position, we simulate the Brownian trajectory with the Euler-Maruyama method, where each next position is determined as

x(t + ∆t) = R(ω(rt)∆t) x(t) + ξt. (6)

Here R(∆θ) denotes a rotation matrix with the rotation angle ∆θ and ξt is a vector where each component is a Gaussian-distributed random variable with mean zero and standard deviation √(2D∆t). The combination of rotational motion and noise in Cartesian coordinates is chosen to avoid the spurious drift caused by the integration procedure. Positions outside the radial range [Rmin, Rmax] were reflected at the boundary. Both positions were expressed in polar coordinates and the quantities r0² − R²min, θ0, rF² − R²min, θF were normalized, converted to 8-bit unsigned integers and written to a file in this order. Using r² as a variable ensures a homogeneous radial distribution. The simulation is repeated with N = 10⁵, 10⁶ or 10⁷ starting points, giving a file size of 400 kB, 4 MB or 40 MB, respectively. The optimally compressed size of this file (Ct) is a measure of the joint entropy S[x0, xt] up to a constant that depends on the level of coarse graining. Likewise, the compressed size of an equivalent file with uncorrelated random initial and final positions (C∞) is a measure of the sum of the entropies of the initial and the final distributions, S[x0] + S[xt]. Following Eq. (2), the mutual information can be estimated as

I(t) ≈ ln 2 (C∞ − Ct)/N, (7)

with both file sizes measured in bits.
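The Euler-Maruyama step described above (rotation by the local flow angle, Cartesian Gaussian noise, radial reflection at the walls) can be sketched as follows; the parameter values and the Couette profile used here are illustrative, not the production settings:

```python
import numpy as np

# One Euler-Maruyama step: rotate by the local flow, add Cartesian
# Gaussian noise, reflect the radius at the walls (illustrative sketch).
rng = np.random.default_rng(1)
R_min, R_max = 0.6, 1.4
D, dt, Omega = 1e-3, 1e-3, 10.0

def omega(r):
    """Standard annular Couette profile (inner wall rotating, outer at rest)."""
    return Omega * R_min**2 / (R_max**2 - R_min**2) * (R_max**2 / r**2 - 1.0)

def step(x):
    r = np.hypot(x[0], x[1])
    dtheta = omega(r) * dt                                # advection angle
    c, s = np.cos(dtheta), np.sin(dtheta)
    x = np.array([c * x[0] - s * x[1], s * x[0] + c * x[1]])
    x = x + rng.normal(0.0, np.sqrt(2 * D * dt), size=2)  # diffusion
    r = np.hypot(x[0], x[1])
    if r < R_min:                                         # reflect radially
        r_new = 2 * R_min - r
    elif r > R_max:
        r_new = 2 * R_max - r
    else:
        r_new = r
    return x * (r_new / r)

x = np.array([1.0, 0.0])
for _ in range(10_000):
    x = step(x)
assert R_min <= np.hypot(x[0], x[1]) <= R_max  # trajectory stays in the annulus
```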
For data compression, we use three different programs: the commonly used Lempel-Ziv-Markov chain algorithm (LZMA) and bzip2, as well as the experimental, neural-network based algorithm PAQ8PX [36,40,41]. A simple test case comprising a 1-dimensional drift-diffusion process shows that all three algorithms qualitatively give the right dependence when 10-bit integers are used, but PAQ8PX consistently gave errors ≲ 0.1 with large samples (Fig. A1). Figure 4 shows the mutual information I(t̃) of our mixing process estimated using the three compression algorithms and different dataset sizes, in comparison with the numerical results obtained with the spectral solution. The comparison is shown for a diffusive process [Fig. 4(a)] and for constant advection with Ω̃ = 20π [Fig. 4(b)]. Whereas the standard algorithms practically failed to detect any mutual information, PAQ8PX consistently gave results with an absolute error ≪ 1. With the largest size N, the errors are consistently ≲ 0.3.

In conclusion, we have introduced mutual information between the initial and the final state as a universal measure of mixing efficiency in microfluidic setups with a strong interplay between advection and diffusion. We have shown that under this measure, the mixing efficiency is symmetric upon time reversal of the actuation sequence. Among all sequences with the same rotation angle, the ones with optimal mixing consist of a fast rotation in the middle of the time interval, or in some cases of two symmetrically arranged peaks. We have also demonstrated that advanced neural-network based compression algorithms can be applied to estimate mutual information with high accuracy. The latter can prove useful in more complex flows in which a full solution of the advection-diffusion equation may not be tractable. We also stress that the Couette flow that we chose as a demonstration is far from optimal and that several works have investigated sequences with optimal kinematics under given mixing norms
[42-45]. Finding the mixing pattern without geometric restrictions that minimizes mutual information remains a task for the future. Furthermore, we expect that our formalism will also be applicable to more complex mixing situations, for example mixing by active swimmers [46-48], natural or artificial cilia [11-14, 49, 50] or in active materials.

FIG. 1. (a) 2D Couette flow in an annulus geometry. The outer boundary is stationary and the inner one rotates with angular velocity Ω̃. (b) Time evolution of the probability density P(x, t|x0), with an initial position x0 in the middle of the annulus. The top row shows the process with pure diffusion (Ω̃ = 0) and the bottom row with both diffusion and uniform advection (Ω̃ = 15π).

FIG. 2. Temporal decay of mutual information between the particle positions in the initial state and the state after time t̃ for different rotation rates. The dashed line shows the approximation for diffusive motion, I = −log t̃ + const. The inset shows the time constant of the final relaxation, where I ∝ exp(−t̃/τ).

FIG. 3. Efficiency of non-uniform mixing sequences Ω̃(t̃). Panels (a) and (c) compare sequences with a fixed total rotation angle and panel (e) with a fixed dissipation. (a) Mixing efficiency of the rotation sequence with a Gaussian dependence Ω̃ ∝ exp[−(t̃ − t̃F/2)²/2σ²], discretized with 25 segments, for ∫Ω̃ dt̃ = π, t̃F = 0.2 (red line). The blue dashed line shows the optimal and the gray dashed line the uniform sequence. (b) The sequences used in panel (a). (c) As in (a), but with ∫Ω̃ dt̃ = 4π, t̃F = 0.4. (d) The sequences used in panel (c). The optimal sequence becomes bimodal. (e) Mixing efficiency of Gaussian sequences, discretized with 49 segments, with equal total dissipation ∫Ω̃² dt̃ = 5π² and t̃F = 0.2. (f) The sequences used in (e).

FIG. 4. Mutual information between the initial state and the state after time t̃, estimated by means of three different compression algorithms (color). Different symbol shapes mark different dataset sizes N. The red line shows the numerical result using the spectral solution. (a) Process with pure diffusion, Ω̃ = 0. (b) Process with both diffusion and advection, Ω̃ = 20π.
FIG. A2. Time evolution of the concentration c(x, t) with three different initial patterns. In patterns 1, 2, 3, the annulus is divided into 2, 4, 6 equal parts, respectively. All processes involve both diffusion and uniform advection (Ω̃ = 20π).