Ott-Antonsen ansatz is the only admissible truncation of a circular cumulant series

The cumulant representation is common in classical statistical physics for variables on the real line, and the issue of closures of cumulant expansions is well elaborated. The case of phase variables differs significantly from the case of linear ones; the relevant order parameters are the Kuramoto-Daido ones rather than the conventional moments. One can formally introduce 'circular' cumulants for Kuramoto-Daido order parameters, similar to the conventional cumulants for moments. The circular cumulant expansions allow one to advance beyond the Ott-Antonsen theory and consider populations of real oscillators. First, we show that truncation of circular cumulant expansions, except for the Ott-Antonsen case, is forbidden. Second, we compare this situation to the case of the Gaussian distribution of a linear variable, where the second cumulant is nonzero and all the higher cumulants are zero, and elucidate why keeping up to the second cumulant is admissible for a linear variable, but forbidden for circular cumulants. Third, we discuss the implication of this truncation issue for populations of quadratic integrate-and-fire neurons [E. Montbrió, D. Pazó, A. Roxin, Phys. Rev. X, vol. 5, 021028 (2015)], where, within the framework of macroscopic description, the firing rate diverges for any finite truncation of the cumulant series, and discuss how one should handle these situations. Fourth, we consider the cumulant-based low-dimensional reductions for macroscopic population dynamics in the context of this truncation issue. These reductions are applicable where the cumulants decay exponentially with the cumulant order, i.e., they form a geometric progression hierarchy. Fifth, we demonstrate the formation of this hierarchy for generic distributions on the circle and experimental data for coupled biological and electrochemical oscillators.


I. INTRODUCTION
The problem of cumulant representation and closure of cumulant expansions is likely one of the most generic problems in non-equilibrium statistical physics. For instance, starting with the Boltzmann kinetic equation governing dynamics of the one-particle probability density function, one can find an infinite chain of equations for moments of the probability density with respect to velocity: mass density, momentum density, the second moment, etc. [1]. For an isotropic medium, one deals with the convolution of the second moment, which is the net mechanical energy. Equations for these moments are nothing else but the conservation laws for mass, momentum, and net mechanical energy. Moreover, it is conventional to deduct the macroscopic velocity contributions from the second moment and obtain the energy conservation equation in terms of the internal energy; mathematically, this deducting corresponds to switching from moments to cumulants. Closure of the equation chain is a trivial task: for incompressible flows one makes this closure by assuming a constant density and discarding the energy equation (the second cumulant), for compressible ones the third cumulant is neglected and an algebraic relation between pressure and internal energy is adopted. These closures are correct for weakly nonequilibrium systems (mathematically, the limit of small Knudsen number), which is relevant for nearly all macroscopic processes on the Earth's surface with excellent accuracy. Thus, the equations of continuous media mechanics are actually an example of the cumulant expansion and its closure.
The two-cumulant representation corresponding to the case of continuous media mechanics is actually a Gaussian approximation for the microscopic velocity distribution. However, in statistical physics, one can address the problem of macroscopic description for exotic systems, where particles are not actual molecules or atoms, but macroscopic objects: grains, stones, asteroids, etc. For these systems not only may the limit of small Knudsen number not be relevant, but also the reversibility of interparticle collisions is lost, leading to essentially non-Gaussian distributions of the microscopic velocity [2]. In this case, one has to deal with higher-order cumulants and look for non-Gaussian closures.
The problem of adiabatic velocity elimination for Brownian particles belongs to the same class of problems [3,4]. Indeed, the derivation of the enhanced Smoluchowski equation for the probability density can be conducted as constructing an expansion in cumulants with respect to velocity and closure for higher-order cumulants within the limit of small inertia.
At the same time, one has to be careful with high-order cumulant approximations. The only physically meaningful case with a finite number of nonzero cumulants is the Gaussian distribution, where the first and second cumulants can be nonzero. With any nonzero cumulant of higher order, the series of nonzero cumulants becomes infinite [9]. Furthermore, one can derive the Fokker-Planck equation for white Gaussian noise [4], but an analogous equation for any other finite number of nonzero cumulants exhibits unphysical dynamics. Nonetheless, it is possible, and quite common, to benefit from accounting for the impact of the third and fourth cumulants on the dynamics of the first and second ones, while the corrections due to the higher-order cumulants are neglected.
Classical statistical physics deals with variables on the real line. Studies on self-organization in active media and control theory, however, revealed the practical and theoretical importance of phase variables defined on the circle [10-12]. With a phase variable $\phi$, the conventional moments are poor representatives of the macroscopic order, while the Kuramoto-Daido order parameters $Z_m = \langle e^{im\phi} \rangle$ [13] become a natural measure of the order.
A significant breakthrough in the theory of collective phenomena in populations of phase elements (which can be limit-cycle oscillators or excitable elements) was based on the Ott-Antonsen (OA) theory [14,15], which is related to an important particular case of the Watanabe-Strogatz theory [16-19]. The OA theory provided the closure $Z_m = (Z_1)^m$ for the infinite equation chain for the Kuramoto-Daido order parameters and allowed one to obtain a self-contained dynamical equation for $Z_1$. This closure can be referred to as the Ott-Antonsen ansatz; the issue of the attractivity of the manifold $Z_m = (Z_1)^m$ was also elaborated within the theory. Recently [20], it was proposed to treat the order parameters $Z_m$ as moments and deal with the formally corresponding 'circular cumulants'. The circular cumulant approach made it possible to go beyond the OA ansatz [20-23] and handle cases where the genuine OA theory is inapplicable [20,21]. Within the framework of the circular cumulant approach, the OA theory turned out to be the case where only the first circular cumulant is nonzero. In [20,21], corrections due to the second cumulant yielded accurate results where the OA ansatz was significantly inaccurate. The cumulant approach can also be applied to a theoretical analysis of non-OA situations, e.g., in [24-29].
The circular cumulant approach can be also employed in statistical physics problems of directional systems [30]: magnetic nanoparticle ensembles, liquid crystals, active Brownian particles, optomechanical systems, etc.
The experience of statistical physics with linear variables suggests that progress in the theory of collective phenomena and self-organization in oscillatory and excitable media can be closely interwoven with the implementation of circular cumulant representations. The practical application of the circular cumulant approach and closures with a finite number of cumulants raises the questions of (i) physically admissible truncations of the cumulant series and (ii) dealing with closures where a finite number of circular cumulants is kept.
In this paper, we show that, in contrast to the linear variable case, the only physically meaningful truncation is the single-cumulant one, which corresponds to the wrapped Cauchy distribution of phases and lies at the basis of the Ott-Antonsen ansatz. With more than one nonzero circular cumulant, the cumulant series of physically meaningful distributions must be infinite. Nonetheless, similarly to the case of linear variables, where for non-Gaussian cases one can benefit from corrections due to a finite number of higher-order cumulants, closures with more than one circular cumulant are useful. We also show that in some physical systems the macroscopic variables driving collective dynamics (e.g., neuron firing rate [28,31-35]) can depend on the Kuramoto-Daido order parameters in such a way that a careless representation of these variables in terms of circular cumulants always diverges. Further, we discuss how approximations with a finite number of circular cumulants should be handled in a regular way in the cases where the cumulants form a decaying geometric progression. We report the presence of this progression for important generic distributions and demonstrate it with experimental data for coupled biological and electrochemical oscillators.

II. TRUNCATED CIRCULAR CUMULANT SERIES
A. Ott-Antonsen ansatz as a one-cumulant truncation
In this subsection we give a brief introduction to the Ott-Antonsen theory and reformulate it in terms of circular cumulants. Basically, it is formulated for populations of identical phase elements governed by equations
$$\dot{\phi}_j = \omega(t) + \mathrm{Im}\big(2h(t)e^{-i\phi_j}\big), \qquad (1)$$
where $\omega(t)$ and $h(t)$ are arbitrary real- and complex-valued functions of time, respectively. The theory is valid in the thermodynamic limit of an infinitely large population, where the system state is naturally represented by the probability density function of phases $w(\phi, t)$. The master equation for $w(\phi, t)$ reads
$$\frac{\partial w}{\partial t} + \frac{\partial}{\partial\phi}\Big[\big(\omega(t) - i h(t) e^{-i\phi} + i h^*(t) e^{i\phi}\big)\, w\Big] = 0. \qquad (2)$$
In Fourier space, where
$$w(\phi, t) = \frac{1}{2\pi}\sum_{m=-\infty}^{+\infty} Z_m(t)\, e^{-im\phi}, \qquad (3)$$
master equation (2) takes the form
$$\dot{Z}_m = im\omega Z_m + m h Z_{m-1} - m h^* Z_{m+1}, \qquad (4)$$
where $Z_0 = 1$ and $Z_{-m} = Z_m^*$ by definition. Ott and Antonsen [14] noticed that Eq. (4) admits the solution $Z_m = (Z_1)^m$ with the order parameter $Z_1 = \langle e^{i\phi} \rangle$ obeying a simple self-contained equation:
$$\dot{Z}_1 = i\omega Z_1 + h - h^* Z_1^2. \qquad (5)$$
The OA manifold $Z_m = (Z_1)^m$ is neutrally stable for perfectly identical population elements, but becomes weakly attracting for typical cases of imperfect parameter identity, where the parameter distribution is continuous [15,36,37]. Eq. (5) is an exact result, which provides a closed equation for the dynamics of the order parameter $Z_1$ and laid the ground for significant advances in various studies on collective phenomena.
For $Z_m = (Z_1)^m$ with $Z_1 = R e^{i\psi}$, $R = |Z_1|$, one can calculate the probability density (3) and find that it is a wrapped Cauchy distribution
$$w(\phi) = \frac{1 - R^2}{2\pi\,[1 - 2R\cos(\phi - \psi) + R^2]}.$$
Let us now consider $Z_m$ as moments of $e^{i\phi}$ and formally introduce the corresponding cumulants [20]. The latter quantities are not conventional cumulants of the original variable $\phi$; hence, we are free to choose the normalization for them and will refer to them as 'circular cumulants'. With the moment generating function
$$F(k) = \langle \exp(k e^{i\phi}) \rangle = \sum_{m=0}^{\infty} Z_m \frac{k^m}{m!},$$
we define circular cumulants $\kappa_m$ via the generating function
$$\Psi(k) = k\frac{\partial}{\partial k}\ln F(k) = \sum_{m=1}^{\infty} \kappa_m k^m.$$
For instance, the first three circular cumulants are
$$\kappa_1 = Z_1, \qquad \kappa_2 = Z_2 - Z_1^2, \qquad \kappa_3 = \tfrac{1}{2}\big(Z_3 - 3 Z_2 Z_1 + 2 Z_1^3\big).$$
In terms of circular cumulants, the OA manifold $Z_m = (Z_1)^m$ acquires a simple form:
$$\kappa_1 = Z_1, \qquad \kappa_{m \ge 2} = 0.$$
Thus, the Ott-Antonsen ansatz, or the case of a wrapped Cauchy distribution of phases, can be considered as the one-cumulant truncation of a circular cumulant series. Equation system (4) turns into
$$\dot{\kappa}_m = im\omega \kappa_m + h\,\delta_{1m} - h^*\Big(m^2 \kappa_{m+1} + m \sum_{j=1}^{m} \kappa_{m-j+1}\kappa_j\Big), \qquad (10)$$
where $\delta_{1m} = 1$ for $m = 1$ and 0 otherwise [20]. One can employ Eqs. (10) for studying the population dynamics beyond the OA ansatz and derive analytically solvable extensions of the OA solution [22]. For the systems where the OA form (1) of equations is violated, within the framework of the circular cumulant approach, one can derive modified versions of equation system (10) and low-dimensional equation systems for order parameters (e.g., [20,28,29]).
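The relation between order parameters and circular cumulants can be checked numerically. The sketch below is not taken from the paper's derivation: it assumes the recursion $\kappa_m = Z_m/(m-1)! - \sum_{j=1}^{m-1} \kappa_j Z_{m-j}/(m-j)!$, which follows from equating powers of $k$ in $F\,\Psi = k\,\partial F/\partial k$. The function name `circular_cumulants` and the sample value of $Z_1$ are illustrative.

```python
import cmath
from math import factorial


def circular_cumulants(Z):
    """Circular cumulants from order parameters Z[0..M], with Z[0] = 1.

    Uses the recursion obtained from F * Psi = k * dF/dk:
    kappa_m = Z_m/(m-1)! - sum_{j=1}^{m-1} kappa_j * Z_{m-j}/(m-j)!.
    The entry kappa[0] is a placeholder and is not used.
    """
    M = len(Z) - 1
    kappa = [0j] * (M + 1)
    for m in range(1, M + 1):
        acc = Z[m] / factorial(m - 1)
        for j in range(1, m):
            acc -= kappa[j] * Z[m - j] / factorial(m - j)
        kappa[m] = acc
    return kappa


# Ott-Antonsen manifold: Z_m = (Z_1)^m (illustrative value of Z_1)
Z1 = 0.6 * cmath.exp(0.3j)
Z_oa = [Z1 ** m for m in range(9)]
kappa_oa = circular_cumulants(Z_oa)

for m in range(1, 9):
    print(m, abs(kappa_oa[m]))
```

On the OA manifold all printed magnitudes beyond the first should vanish to rounding error, in line with the one-cumulant form of the ansatz.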

B. Two-cumulant truncation
It is instructive to start the analysis with the truncation where only the first two circular cumulants are nonzero. A detailed step-by-step derivation is given in Appendix A; here we outline its principal points and present the results.
With only the first two $\kappa_m$ nonzero, the circular cumulant generating function is $\Psi(k) = \kappa_1 k + \kappa_2 k^2$, and $\ln F = \kappa_1 k + \kappa_2 k^2/2$, so that
$$F(k) = \exp(\kappa_1 k)\,\exp\big(\kappa_2 k^2/2\big). \qquad (11)$$
Gathering the terms with $k^m$ in product (11), one finds
$$Z_m = m! \sum_{j=0}^{\mathrm{int}(m/2)} \frac{\kappa_1^{m-2j}\,(\kappa_2/2)^{j}}{(m-2j)!\,j!} \equiv \sum_{j=0}^{\mathrm{int}(m/2)} s_{m;j}, \qquad (12)$$
where int(·) returns the integer part of a number and $s_{m;j}$ is a brief notation for the sum terms. For arbitrary nonzero $\kappa_1$ and $\kappa_2$, one can specify a sufficiently large $m$ such that the dominating contribution to the sum is made by summands $s_{m;j}$ which are far from the sum edges. Moreover, the absolute value of the summands $|s_{m;j}|$ is modulated by a Gaussian function of the index $j$. For large $m$, $j$, and $(m - 2j)$, Stirling's approximation, $n! \approx \sqrt{2\pi n}\,(n/e)^n$, can be employed for calculation of the summands.
Calculations take the simplest form for In this case, one finds the summand with the maximal absolute value at $j = l$: The absolute value and for the neighboring terms, one can calculate where As the Gaussian function in Eq. (15) is localized on the scale $(ma)^{1/4} \gg 1$, the summand magnitude slowly varies with the index $r$, and one can assess the order of magnitude of the sum in $Z_m$ as an integral; Although the transition from a sum to an integral introduces inaccuracy for finite $\Theta$, the results with an integral match the exact sum (12) surprisingly well (see Fig. 6 in Appendix). Substituting $|s_{m;l}|$ from Eq. (14), one obtains From Eq. (16), which is valid for $m \gg M_3$, one can see that for $m > e/|\kappa_2|$, the absolute value $|Z_m|$ becomes larger than 1. However, this is not possible for the average value $\langle e^{im\phi} \rangle$ of a phase $\phi$ on the circle. Technically, for the distribution density $w(\phi)$, the condition $\big|\int_0^{2\pi} w(\phi)\, e^{im\phi}\, d\phi\big| > 1$ under the normalization condition $\int_0^{2\pi} w(\phi)\, d\phi = 1$ requires negativity of $w(\phi)$ for some $\phi$.
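The violation of $|Z_m| \le 1$ can also be seen by direct evaluation of the finite sum for a two-cumulant truncation, without any asymptotics. The sketch below assumes the sum $Z_m = m! \sum_j \kappa_1^{m-2j} (\kappa_2/2)^j / [(m-2j)!\, j!]$, which follows from $F = \exp(\kappa_1 k + \kappa_2 k^2/2)$; the function name and the values of $\kappa_1$, $\kappa_2$ are illustrative choices, not values used in the paper.

```python
from math import factorial


def moments_two_cumulant(kappa1, kappa2, m_max):
    """Order parameters Z_0..Z_{m_max} when only kappa1 and kappa2 are nonzero.

    Z_m is m! times the coefficient of k^m in exp(kappa1*k + kappa2*k^2/2).
    """
    Z = [1 + 0j]
    for m in range(1, m_max + 1):
        total = 0j
        for j in range(m // 2 + 1):
            weight = factorial(m) / (factorial(m - 2 * j) * factorial(j))
            total += weight * kappa1 ** (m - 2 * j) * (kappa2 / 2) ** j
        Z.append(total)
    return Z


# Illustrative (hypothetical) cumulant values
Z = moments_two_cumulant(0.5, 0.3, 40)
first_violation = next(m for m, z in enumerate(Z) if abs(z) > 1)
print("first m with |Z_m| > 1:", first_violation)
print("|Z_40| =", abs(Z[40]))
```

For any nonzero $\kappa_2$ the same behaviour is expected at sufficiently large $m$; smaller $|\kappa_2|$ only postpones the order at which the bound is broken.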

C. Truncation of $\kappa_m$ for $m > N$
Similarly to the previous subsection, a detailed step-by-step derivation is provided in Appendix B; here we outline its principal points and present the results. For nonzero $\kappa_m$, $m = 1, 2, ..., N$, one has $\ln F = \sum_{n=1}^{N} \kappa_n k^n/n$; therefore,
$$Z_m = m! \sum_{j_1 + 2j_2 + \dots + N j_N = m}\ \prod_{n=1}^{N} \frac{(\kappa_n/n)^{j_n}}{j_n!} \equiv \sum s_{m;j_1 j_2 \dots j_N}.$$
For sufficiently large $m$, the principal contributions come from the summands which are far from the boundaries of the summation domain in the index space. For large $m$, $j_1$, $j_2$, ..., $j_N$, one can use Stirling's formula and replace the summation with integration over the hyperplane $j_1 + 2j_2 + 3j_3 + \dots + N j_N = m$; where The summand magnitude is likewise modulated by a Gaussian function of the indices. One can find the maximum of $|s_{m;j_1 j_2 \dots j_N}|$ on the hyperplane by means of the method of Lagrange multipliers. To the leading order, one finds the maximum position and For the neighboring terms, one can calculate where $\Theta_n = \psi_n - (n/N)\psi_N$. Recasting Eq. (18) as and evaluating integrals, one finds

III. COMPARISON TO THE CASE OF A VARIABLE ON THE LINE
Let us compare the circular cumulant expansions for variables on the circle with the conventional cumulant expansions for variables on the line.
The distribution of a variable $x$ on the line can be fully characterized by its real-valued cumulants $K_m$, $m = 1, 2, ...$, or moments $\mu_m = \langle x^m \rangle$. The case of $K_1 \ne 0$, $K_{m>1} = 0$ corresponds to a δ-function probability density $w(x) = \delta(x - K_1)$ (notice that it is sufficient to assume $K_2 = 0$, which dictates all $K_{m>2} = 0$). The case of the first two cumulants being nonzero (the mean value $K_1 \ne 0$ and the variance $K_2 \ne 0$) with $K_{m>2} = 0$ corresponds to the Gaussian distribution $w(x) = (2\pi K_2)^{-1/2} \exp[-(x - K_1)^2/(2K_2)]$. Thus, one can deal with one- and two-cumulant truncations of the random variable representation. Importantly, no higher-order truncations are admitted [9]. For instance, assuming $K_3 \ne 0$ (which also enforces a nonzero variance $K_2$), one cannot set $K_{m>3} = 0$; physically sensible distributions will require at least some higher-order cumulants to be nonzero (even though they can be small and rapidly decay as the order $m$ increases).
For circular cumulants of a phase variable $\phi$ on the circle, the situation is different; one must have either only one nonzero circular cumulant [48] or an infinite series of nonzero cumulants. This dissimilarity requires explanation, since the algebraic relations between cumulants $K_m$ and moments $\mu_m$ are equivalent to the algebraic relations between circular cumulants $\kappa_m$ and order parameters $Z_m$ (up to a multiplier, $\kappa_m \leftrightarrow K_m/(m-1)!$). We derived that having two nonzero circular cumulants results in large values of high-order $Z_m$, $|Z_m| > 1$, which is not admitted by the physical meaning of $Z_m$. One observes the same growth of moments for the Gaussian distribution with arbitrary nonzero variance $K_2$. Indeed, for $K_2 > 0$, there is always a finite probability of $|x| > 1$; therefore, for large enough $m = 2n$, $\langle x^{2n} \rangle > 1$ (the case of odd $m$ also yields $|\langle x^m \rangle| > 1$ if $K_1 \ne 0$). However, $|\mu_m| > 1$ is admitted for a line variable. Thus, the fundamental reasons forbidding finite-number truncations for circular cumulants essentially differ from those for conventional cumulants of line variables.
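The growth of Gaussian moments mentioned above is easy to make concrete. The following sketch uses the standard closed form $\langle x^{2n} \rangle = (2n-1)!!\,K_2^{\,n}$ for a zero-mean Gaussian; the variance value is an arbitrary illustrative choice.

```python
def gaussian_even_moment(variance, n):
    """<x^(2n)> for a zero-mean Gaussian: (2n-1)!! * variance**n."""
    double_factorial = 1
    for k in range(1, 2 * n, 2):
        double_factorial *= k
    return double_factorial * variance ** n


# Illustrative variance K_2 = 0.25 (standard deviation 0.5)
moments = [gaussian_even_moment(0.25, n) for n in range(1, 9)]
for n, mu in enumerate(moments, start=1):
    print(f"<x^{2 * n}> = {mu:.4f}")
```

Even for this narrow distribution the even moments eventually exceed 1, which is harmless on the line; on the circle, where $|Z_m| \le 1$ must hold, the analogous growth rules the truncation out.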
To summarize, the only admitted truncation for a phase variable is the one-cumulant one and it corresponds to the wrapped Cauchy distribution (or Ott-Antonsen ansatz), while for a line variable, the only admitted nontrivial truncation is the two-cumulant one and it corresponds to the Gaussian distribution.
Further, we compare the characterization of a distribution by the cumulants for admitted truncations and the meaning of higher-order cumulants. For a line variable, the cumulants are real-valued and the first two cumulants quantify the centering of the distribution, $K_1$, and its width, $\sqrt{K_2}$ (see Fig. 1a). The asymmetry of the distribution is quantified by $K_3$, and its kurtosis $K_4$ measures the deviation of tails from the Gaussian law. For a phase variable, the circular cumulants are complex-valued; the argument of the first cumulant, $\arg\kappa_1$, characterizes the distribution centering and the absolute value $|\kappa_1|$ characterizes the distribution width (see Fig. 1b). In particular, $|\kappa_1| = 1$ for a δ-function distribution and $|\kappa_1| = 0$ for a uniform distribution. The second complex-valued circular cumulant $\kappa_2$ quantifies the distribution asymmetry with $(\arg\kappa_2 - 2\arg\kappa_1)$ and the deformation of tails with $|\kappa_2|$. Thus, in both cases, the reference distribution is characterized by two primary quantities and the principal correction to it has to be characterized by another pair of quantities. However, these quantities involve different numbers of real- and complex-valued cumulants.

IV. MACROSCOPIC VARIABLES FOR POPULATIONS OF QUADRATIC INTEGRATE-AND-FIRE NEURONS
While a plain discarding of higher-order cumulants in cumulant equation chains can frequently be a reasonable approximation, one can encounter situations which require a more subtle treatment. In these situations, a formal adopting of finite cumulant truncations can lead to the divergence of the mean fields mediating the interaction between population elements. Below we consider an example of such a system.
The population of quadratic integrate-and-fire neurons (QIFs) (e.g., [32-34]) obeys
$$\dot{V}_j = V_j^2 + I_j, \qquad I_j = \eta_j + J s(t) + I(t), \qquad (23)$$
where $V_j$ and $I_j$ represent a neuron's membrane potential and an input current, respectively, $\eta_j$ and $I(t)$ are individual and common parts of the input current, respectively, $s(t)$ is a common field proportional to the firing rate $r(t)$, and $J$ is the synaptic weight. One can introduce a phase variable $\phi_j$, $V_j = \tan(\phi_j/2)$, and rewrite Eq. (23) in its terms:
$$\dot{\phi}_j = (1 - \cos\phi_j) + (1 + \cos\phi_j)\,[\eta_j + J s(t) + I(t)]. \qquad (24)$$
For this system the variable is important. Indeed, one can show [32] that the voltage mean-field and the firing rate $r(t)$ in terms of $\phi$ is Thus, the population dynamics is essentially controlled by $s(t) \propto r(t) = \mathrm{Re}(W)/\pi$, and $\mathrm{Im}(W)$ is an important macroscopic observable. The variable $r(t)$ is generally important for populations of pulse-coupled oscillators [31-35].
On the Ott-Antonsen manifold $Z_m = (Z_1)^m$, the sum in (26) can be readily calculated:

A. Divergence of firing rate for any two-cumulant truncation
If only the first two cumulants $\kappa_1$ and $\kappa_2$ are nonzero, the circular cumulant generating function is $\Psi(k) = \kappa_1 k + \kappa_2 k^2$, and, see Eq. (11), the moment generating function Here, the exponential $\exp(\hat{Q})$ of an operator $\hat{Q}$ is defined as $1 + \hat{Q} + \frac{1}{2!}\hat{Q}^2 + \dots$ One can calculate ∂ n ∂κ n and Eq. (28) yields a series The series (30) diverges for arbitrary $\kappa_2$. Indeed, for $m > |1 + \kappa_1|^2/(2|\kappa_2|)$, the sum terms grow with $m$.

C. Finite-N cumulant approximations
In this section we address the question of whether one can use finite-number cumulant approximations in applications. An approximation requires a small parameter. Refs. [20,21,29,38] theoretically reveal the importance and persistence of the case where circular cumulants obey the hierarchy $\kappa_n \propto \varepsilon^{n-1}$ with a small number $\varepsilon$. Below, in Sec. V, we will discuss this hierarchy and report it to be highly relevant for experimental data as well.
The hierarchy $\kappa_n \propto \varepsilon^{n-1}$ is essential for a rigorous approach to constructing approximations. One should construct an expansion with respect to the small parameter $\varepsilon$, which means that if one introduces $\kappa_2^2$-corrections, then $\kappa_3$ should also be taken into account in order to achieve the accuracy $o(\kappa_2^2)$, etc. In particular, up to the first-order corrections, Eq. (31) yields up to the second-order corrections, (33) Quite often in applications, one adopts a certain approximation instead of constructing a rigorous expansion. Such an approximation has a rigorously guaranteed order of accuracy. Additionally, the approximation can not only ease analytical calculations but also effectively yield a much higher accuracy than the rigorously guaranteed one. In the case of the circular cumulant representation, one should be careful with such approximations; while one can introduce a finite number of corrections related to higher cumulants, one cannot adopt approximations which correspond to exact formal expressions for a finite number of nonzero circular cumulants. For instance, in Sec. IV A, for two nonzero cumulants, the macroscopic variable $W$ exactly corresponding to these cumulants diverges for arbitrary nonzero $\kappa_2$. In detail, the first correction with the $\kappa_2$-term is accurate; the further $\kappa_2^n$-contributions for moderate $n$ introduce smaller corrections which do not enhance the accuracy for a specific distribution $w(\phi)$; for large $n$, these excessive corrections start to diverge.

A. Wrapped Gaussian and von Mises distributions
Let us consider two important particular distributions: (i) the wrapped Gaussian distribution of half-width $\sigma$ around $\psi$,
$$w(\phi) = \frac{1}{\sqrt{2\pi}\,\sigma} \sum_{n=-\infty}^{+\infty} \exp\Big[-\frac{(\phi - \psi + 2\pi n)^2}{2\sigma^2}\Big], \qquad (34)$$
which is relevant for some phase ensembles [6-8,39]; (ii) the von Mises distribution of half-width $\sigma$ around $\psi$,
$$w(\phi) = \frac{\exp[\sigma^{-2}\cos(\phi - \psi)]}{2\pi I_0(\sigma^{-2})}, \qquad (35)$$
where $I_n(\cdot)$ is the $n$-th order modified Bessel function of the first kind. The von Mises distribution describes the steady states of ensembles of identical phase elements subject to additive intrinsic noise and a common static field (e.g., see [40] or [21]). With these distributions one can calculate the macroscopic variable $W$. For the wrapped Gaussian distribution (34), $Z_n = e^{-\sigma^2 n^2/2} e^{in\psi}$, and series (26) obviously possesses good convergence properties: For the von Mises distribution (35), with the Jacobi-Anger expansion $e^{a\cos(\phi - \psi)} = \sum_{n=-\infty}^{+\infty} I_n(a)\, e^{in(\phi - \psi)}$, one finds $Z_n = [I_n(\sigma^{-2})/I_0(\sigma^{-2})]\, e^{in\psi}$, and series (26) reads In Fig. 2, one can see that the series $|Z_n|$ decay fast for both distributions, providing a good convergence of $W$. At the same time, circular cumulants form a clearly pronounced geometric progression even for moderate values of the parameter $\sigma$. Thus, for these two distributions the macroscopic variable $W$ obviously converges, while the calculations with a finite number of cumulants have to be performed as discussed in Sec. IV C. In Appendix C, the formulae for $W$ (36), (37), (39) are confirmed with an alternative approach to the calculation of $r$.
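The decay of circular cumulants for the wrapped Gaussian distribution can be reproduced from the order parameters $Z_n = e^{-\sigma^2 n^2/2} e^{in\psi}$ quoted above. The sketch below is illustrative: it assumes the moment-to-cumulant recursion that follows from $F\,\Psi = k\,\partial F/\partial k$, and the parameter values and helper names are hypothetical choices rather than those behind Fig. 2.

```python
import cmath
from math import exp, factorial


def circular_cumulants(Z):
    """kappa_m = Z_m/(m-1)! - sum_{j<m} kappa_j Z_{m-j}/(m-j)!, with Z[0] = 1."""
    kappa = [0j] * len(Z)
    for m in range(1, len(Z)):
        acc = Z[m] / factorial(m - 1)
        for j in range(1, m):
            acc -= kappa[j] * Z[m - j] / factorial(m - j)
        kappa[m] = acc
    return kappa


def wrapped_gaussian_moments(sigma, psi, n_max):
    """Order parameters Z_0..Z_{n_max} of a wrapped Gaussian distribution."""
    return [exp(-0.5 * sigma ** 2 * n ** 2) * cmath.exp(1j * n * psi)
            for n in range(n_max + 1)]


sigma, psi = 0.5, 0.0  # illustrative parameters
kappa = circular_cumulants(wrapped_gaussian_moments(sigma, psi, 6))
magnitudes = [abs(k) for k in kappa[1:]]
for n, value in enumerate(magnitudes, start=1):
    print(f"|kappa_{n}| = {value:.3e}")
```

For small $\sigma$ the leading terms are $\kappa_2 \approx -\sigma^2$ and $\kappa_3 \approx 3\sigma^4/2$, so $\sigma^2$ plays the role of the small parameter $\varepsilon$ of the hierarchy.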

B. Wrapped non-Cauchy distributions with heavy tails
The phenomenon of synchronization by common noise is an important case where the Cauchy distribution forms in populations of general limit-cycle oscillators subject to a weak intrinsic noise in addition to the common noise driving (cf. Eq. (16) in [41]). Notably, the distribution of phase deviations in high-synchrony regimes is Cauchy even though both the common and intrinsic noises are Gaussian. When the common-noise synchronization mechanism is affected by global coupling, the distribution changes to [42]), where $\theta$ is the phase deviation from the cluster center; the distribution half-width $\sigma$ is proportional to the intrinsic noise strength and $\sigma \propto (-\lambda)^{-1/2}$, $\lambda$ is the Lyapunov exponent of an oscillator without intrinsic noise; $\mu$ = [coupling strength]$/(-2\lambda)$. Perfect synchrony of identical oscillators without intrinsic noise occurs for $\mu > -1/2$. For non-high synchrony, non-small deviations of $\theta$ not only make the distribution $w(\theta)$ wrapped within the interval $(-\pi, \pi]$, but also affect the distribution shape due to nonlinearities. Nonetheless, one can consider wrapped non-Cauchy distributions (38) as generic ones for synchronization by common noise where it interplays with the coupling between oscillators.
For the wrapped non-Cauchy distribution (38) one can calculate order parameters where $K_n(\cdot)$ is the modified Bessel function of the second kind. For $\mu > -1/2$, $Z_n$ decay with $n$ exponentially (in detail, for $n\sigma \gg 1 + \mu^2$, $(n\sigma)^{\frac{1}{2}+\mu} K_{\frac{1}{2}+\mu}(n\sigma) \approx \sqrt{\pi/2}\,(n\sigma)^{\mu} e^{-n\sigma}$), and the sum (26) possesses good convergence properties. Circular cumulants can now be calculated from $Z_n$. For $\mu > 0$, cumulants always form a geometric progression (the right panel in Fig. 3); for $-1/2 < \mu < 0$, the geometric progression is somewhat distorted where cumulants pass through zero and change their signs (cusps in the left panel of Fig. 3), but their reference order of magnitude obeys the same geometric progression hierarchy. The calculations with a finite number of cumulants have to be performed as discussed in Sec. IV C.

C. Networks of coupled biological oscillators
Abel et al. [43] report hourly experimental data on resynchronization of cells in several biologically distinct mammalian suprachiasmatic nucleus (SCN) explants; measurements are performed with single-cell resolution. For illustration, we analyze the bioluminescence oscillation data of individual cells after application of tetrodotoxin for temporary inhibition of intercellular couplings. In Fig. 4, the results of analysis of the experimental data for the first (SCN1) and second (SCN2) data sets (publicly available online at https://github.com/JohnAbel/scn-resynchronizationdata-2016) from [43] are presented. Protophase is calculated via the Hilbert transform of the individual cell signal. The phases are calculated on the basis of the assumption of an identical functional relation between the protophase and the phase for all cells. The distribution of phases of all oscillators averaged over the integer number of revolutions should be uniform; hence, the corresponding distribution of protophases yields the functional relation between the protophase and the genuine phase [44].
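The processing chain described above (signal, protophase via the Hilbert transform, then a phase whose distribution is uniform) can be sketched as follows. This is a simplified stand-in and not the procedure of [44] itself: the analytic signal is built with a plain FFT, the signal mean is subtracted before the transform (an assumption), and the protophase-to-phase map is approximated by the empirical cumulative distribution of pooled protophase samples. All function names are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def analytic_signal(x):
    """FFT-based analytic signal x + i*H[x] (discrete Hilbert transform)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)


def protophase(x):
    """Protophase in [0, 2*pi) from a mean-subtracted oscillatory signal."""
    x = np.asarray(x, dtype=float)
    return np.angle(analytic_signal(x - x.mean())) % (2 * np.pi)


def protophase_to_phase(theta_samples):
    """Map pooled protophase samples to phases that are uniform on the circle.

    Uses the empirical CDF (ranks) as the functional relation; this assumes
    the same relation holds for all cells, as stated in the text.
    """
    theta = np.asarray(theta_samples, dtype=float)
    ranks = np.argsort(np.argsort(theta))
    return 2 * np.pi * (ranks + 0.5) / theta.size


# Synthetic test signal: 5 full periods of a cosine (illustrative only)
t = np.arange(200) / 200
theta = protophase(np.cos(2 * np.pi * 5 * t))
phi = protophase_to_phase(theta)
```

With real recordings one would pool protophases over an integer number of revolutions before building the map, and then evaluate the order parameters $Z_m$ as averages of $e^{im\phi}$ over the cells at each time instant.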
One can see that the circular cumulants form geometric progressions and decay quite rapidly with order $n$. During transitions between different regimes of collective behavior, a defect of the progression multiplier propagates along the series from low- to high-order cumulants (see the series for SCN1 at t = 268 in Fig. 4c).

D. Networks of coupled electrochemical oscillators
As another example, let us consider experimental data for electrochemical oscillators [45,46]. For illustration, we intentionally analyze data of a different type than in the previous example. Instead of using raw measurement data, we digitize the shadowgraph in Fig. 1c of [46], which shows the oscillation pattern of a population of 46 coupled oscillators, where grayscale indicates the instantaneous current of an individual oscillator. The color is a one-to-one function of the current; therefore, such data are sufficient for calculation of the protophase (via the Hilbert transform), the genuine phase, and the order parameters. In Fig. 5, the reconstructed signals, the order parameter $|Z_1(t)|$, and circular cumulant series are presented. The circular cumulant series clearly exhibit rapidly decaying geometric progressions.

VI. CONCLUSION
For a variable on the circle we have derived that the circular cumulant series has either one nonzero element or an infinite number of them. The former corresponds to the wrapped Cauchy distribution or the Ott-Antonsen ansatz. With two or a larger but finite number of nonzero elements, the high-order Kuramoto-Daido order parameters $\langle e^{im\phi} \rangle$ tend to infinity, while their absolute value is not allowed to exceed 1. This should be taken into account when one deals with systems governed by non-trivial macroscopic variables like the firing rate in a network of quadratic integrate-and-fire neurons [32]. Specifically, this firing rate [Eqs. (27), (26)] possesses good convergence properties on the Ott-Antonsen manifold and for all considered generic distributions on the circle, while it always diverges for any finite number (greater than one) of nonzero circular cumulants. One can compare this situation with the case of a variable on the line, where the cumulant series has either only the first two cumulants nonzero or an infinite number of them. The case of two nonzero cumulants corresponds to the Gaussian distribution. This apparent dissimilarity between linear and phase variables actually preserves a concordance between them: in both cases, the admitted truncation is characterized by two quantities. On the line, the first and second real-valued cumulants determine the centering and the width of a distribution, respectively, and on the circle, the argument and the absolute value of the first complex-valued circular cumulant determine the same characteristics of a distribution. On the line, the third and fourth cumulants quantify the distribution asymmetry and the deviation of tails from a reference law (kurtosis); on the circle, the argument and the absolute value of the second cumulant do the same.
For linear variables, in some cases, one is strictly bound to the Gaussian reduction; e.g., the Fokker-Planck equation corresponds to white Gaussian noise, and the analogues of this equation for a white noise with a finite number of nonzero higher cumulants exhibit unphysical behavior. Meanwhile, in a wide range of problems, one can construct approximations accounting for corrections due to a finite number of higher cumulants and benefit from them, or quantify the deviation from a Gaussian distribution with the third and fourth cumulants (e.g., [47]). Similarly, for phase variables, in some cases, no finite circular cumulant truncation except the Ott-Antonsen ansatz is allowed; see the example of neuron firing rate. At the same time, in a range of problems, the approximations accounting for higher cumulant contributions yield accurate solutions where the Ott-Antonsen ansatz fails [20,21]. Strictly speaking, the example of neuron firing rate does not match the example of the Fokker-Planck equation and its analogues for a finite number of higher cumulants. With the latter, the issue cannot be handled, while the former case can be handled in a regular way for practical situations. For the case of a geometric progression, $|\kappa_n| \propto \varepsilon^{n-1}$, $\varepsilon \ll 1$, one should not use the exact formulae for a finite number $N$ of circular cumulants, but use expansions with terms up to $\kappa_n^{(N-1)/(n-1)}$ and $n \le N$.
We have examined the relevance of the geometric progression allowing for a regular approach to constructing approximations of a prescribed accuracy or with a predefined number of circular cumulants. This progression is always present for wrapped Gaussian, von Mises, and non-Cauchy heavy-tail distributions (38), and is typical in experiments, as demonstrated with data for coupled biological [43] and electrochemical oscillators [45,46].

(A2) For m = 2n + 1, (A3) For m ≫ |κ 1 |/ |κ 2 |, the second term in sums (A2) and (A3) is large compared to the first one. For m ≫ |κ 2 |/|κ 1 |, the term ahead of the last term in these sums is large compared to the last one. Hence, for , the leading contributions to the sum come from the terms which are far from the sum edges. For large $n$, $j$, and $(n - j)$, one can employ Stirling's approximation, $n! \approx \sqrt{2\pi n}\,(n/e)^n$, and find m!
The largest contribution to $Z_m$ is made by the summand $s_{m;l}$, for which $\frac{d}{dj}\ln|s_{m;j}| = 0$: Recall, these evaluations are valid for $m \gg a^{-1/2} + a^{1/2}$, where we introduce notation For simplicity of calculations, we conduct further consideration for even larger which allows one to rigorously neglect as many contributions in expansions as possible.
For large $m$, one can solve Eq. (A6) iteratively. At the first iteration, one can assume the logarithm argument to be 1 and find Substitution of $l = l^{(1)}$ into the second and third terms of Eq. (A6) yields Further, For $(1 + \varepsilon)^N$, one can employ Employing expansion (A8), one can find $s_{m;l+r}$ where we used that the logarithm argument in Eq. (A6) is close to 1, and introduced notation With Eq. (A10), one can assess the order of magnitude of the sum in $Z_m$ as an integral; Substituting $|s_{m;l}|$ from Eq. (A9), one finally finds The transition from a sum to an integral introduces inaccuracy. However, one can see in Fig. 6 that, for large $m$, the results with an integral match the exact sum satisfactorily not only for $\Theta = 0$ (where the accuracy is high already for non-large $m$), but also for $\Theta$ as large as $\sim \pi$. From Eq. (A11), which is valid for $m \gg M_3$, one can see that for $m > e/|\kappa_2|$, the absolute value $|Z_m|$ becomes larger than 1. However, this is not possible for the average value $\langle e^{im\phi} \rangle$ of a phase $\phi$ on the circle.

(B1) Again, for sufficiently large $m$, the principal contributions come from the summands which are far from the boundaries of the summation domain in the index space. In this section we assume $m$ to be large compared to any reference value for it. For large $m$, $j_1$, $j_2$, ..., $j_N$, one can use Stirling's formula and replace the summation with integration over the hyperplane $j_1 + 2j_2 + 3j_3 + \dots + N j_N = m$. Thus, $Z_m \approx \int dj_2\, dj_3 \cdots dj_N\, s_{m;j_1 j_2 \dots j_N}$, One can find the maximum of $|s_{m;j_1 j_2 \dots j_N}|$ on the hyperplane by means of the method of Lagrange multipliers.