Dispersion relations alone cannot guarantee causality

We show that linear superpositions of plane waves involving a single-valued, covariantly stable dispersion relation $\omega(k)$ always propagate outside the lightcone, unless $\omega(k) =a+b k$. This implies that there is no notion of causality for individual dispersion relations, since no mathematical condition on the function $\omega(k)$ (such as the front velocity or the asymptotic group velocity conditions) can serve as a sufficient condition for subluminal propagation in dispersive media. Instead, causality can only emerge from a careful cancellation that occurs when one superimposes all the excitation branches of a physical model. This is shown to happen automatically in local theories of matter that are covariantly stable. Hence, we find that the need for nonhydrodynamic modes in relativistic fluid mechanics is analogous to the need for antiparticles in relativistic quantum mechanics.


INTRODUCTION
The "practical" definition of relativistic causality is universally accepted: it is impossible to transmit information faster than the vacuum speed of light [1][2][3]. The question is how to translate this principle into mathematical constraints that our physical theories must obey. In some cases, this question has an unambiguous answer. In classical field theory, causality demands that the characteristics of the field equations lie inside or upon the lightcone [4][5][6][7][8]. In quantum field theory, the commutator of spacelike-separated observables must vanish [9][10][11]. In other contexts, the mathematical nature of causality is less understood. Consider a homogeneous system in thermodynamic equilibrium, and let $\omega(k)$ be the eigenfrequency of one of its (linear) excitations, which is a function [fn. 1] of the wavenumber $k$. Under which conditions is such a dispersion relation compatible with causality? Most attempted answers revolve around imposing inequalities on the phase velocity $(\mathrm{Re}\,\omega)/k$, or on the group velocity $d(\mathrm{Re}\,\omega)/dk$ [12][13][14]. However, no fully consistent and universally reliable criterion has been found. The most widely accepted constraint is that the "front velocity" [15,16], or the "asymptotic group velocity" [17], which usually coincide by L'Hôpital's rule, should not exceed the speed of light $c$ ($=1$, in our units), namely

$$ v_f := \lim_{k\to\infty} \frac{\mathrm{Re}\,\omega}{k} \in [-1,1] . \tag{1} $$

Unfortunately, this condition is far from satisfactory, as many famous acausal equations in physics fulfill (1) even though the theory of partial differential equations tells us that they are acausal. Three notable examples are the diffusion equation, the Euclidean wave equation, and the linearized Benjamin-Bona-Mahony (BBM) [18] equation, respectively:

$$ \partial_t\varphi = D\,\partial_x^2\varphi , \qquad \partial_t^2\varphi + \partial_x^2\varphi = 0 , \qquad \partial_t\varphi + \partial_x\varphi - \partial_t\partial_x^2\varphi = 0 . \tag{2} $$

All these equations have $v_f = 0$. Furthermore, their phase and group velocities are (sub)luminal for all $k$. Nevertheless, these three models are strongly acausal. The BBM equation is particularly striking because one cannot attribute the causality violation to the imaginary part of $\omega$, given that $\omega$ is real for real $k$. Yet, it is acausal, as the lines $t = \text{const}$ are spacelike characteristics [fn. 2].

[fn. 1] For anisotropic systems, the dispersion relation is $\omega(k_x, k_y, k_z)$. In this case, we align the $x$-axis along a direction of interest and define $\omega(k) := \omega(k, 0, 0)$. Then, our results apply to waves propagating in $x$.
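The statement that such models pass the usual velocity tests can be checked numerically. The sketch below is a minimal illustration for the linearized BBM branch only, assuming the dispersion relation $\omega(k) = k/(1+k^2)$ that follows from inserting a plane wave $e^{i(kx-\omega t)}$ into $\partial_t\varphi + \partial_x\varphi - \partial_t\partial_x^2\varphi = 0$. The function names and the sampling grid are illustrative choices, not part of the original analysis.

```python
import numpy as np

def omega_bbm(k):
    """Dispersion relation of the linearized BBM equation (assumed form)."""
    return k / (1.0 + k**2)

def velocity_report(k):
    """Return max |phase velocity|, max |group velocity|, and omega/k at the largest sampled k."""
    w = omega_bbm(k)
    phase = w / k
    group = (1.0 - k**2) / (1.0 + k**2) ** 2  # analytic derivative d(omega)/dk
    front_estimate = w[-1] / k[-1]            # finite-k proxy for the front velocity
    return np.max(np.abs(phase)), np.max(np.abs(group)), front_estimate

# Real wavenumbers from near zero up to a large cutoff (illustrative values).
k = np.linspace(1e-3, 1e4, 200_001)
max_phase, max_group, front = velocity_report(k)
print(max_phase, max_group, front)
```

Passing these checks says nothing about causality: the same branch fails the covariant stability bound at complex $k$, which is the point of the discussion above.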
This Letter shows that the limitations of (1) are manifestations of a fundamental impossibility. Namely, unless $\omega(k) = a+bk$ for all $k$ (with $a$, $b$ constant), a single dispersion relation $\omega(k)$ cannot be causal. Rather, "causality" is a collective property of the system, which describes how all the excitation branches $\omega_n(k)$ combine when the full initial value problem is set up. Therefore, apart from $\omega(k) = a+bk$, it is impossible to formulate a sufficient condition for causality in the form of an inequality that $\omega(k)$ should obey. This is why, given a causality criterion like (1), one can always find models that fulfill it and are acausal, such as (2).
Nevertheless, we also show that one can overcome these difficulties by appealing to specific structures present in many (but not all) physical theories, which guarantee that the dispersion branches combine "correctly" to ensure causality. In particular, if the operator governing the dynamics is local and the system is covariantly stable (in precise senses defined below), all superluminal tails cancel out; see Theorem 1 for a precise statement.
Our analysis relies on the following inequality, which must hold for all dispersion branches describing disturbances around the equilibrium state of a stable system in relativity [19,20]:

$$ \mathrm{Im}\,\omega(k) \le |\mathrm{Im}\,k| \qquad \text{for all } k \in \mathbb{C} . \tag{3} $$

This covariant bound can be derived from the study of retarded causal correlators of stable phases of matter, and it is textbook material [21]; its importance in constraining transport properties of matter was demonstrated in Ref. [19]. Independently of the principle of causality, (3) expresses the physical requirement that a stable system should be simultaneously stable in every inertial frame of reference [22]. In fact, if (3) were violated, namely if there were some $k \in \mathbb{C}$ for which $\mathrm{Im}\,\omega > |\mathrm{Im}\,k|$, then a boost with velocity $v = \mathrm{Im}\,k/\mathrm{Im}\,\omega$ would lead us to a new reference frame where $\mathrm{Im}\,\omega' > 0$ and $\mathrm{Im}\,k' = 0$ [20]. This would imply that there is an observer who can detect a growing Fourier mode, signaling an instability [23][24][25]. For this reason, we assume that (3) holds as a basic stability property of the system.
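The bound $\mathrm{Im}\,\omega \le |\mathrm{Im}\,k|$ is easy to probe on a grid of complex wavenumbers. The following sketch is only a sampled check, not a proof. It assumes the standard plane-wave dispersion relations of the diffusion equation ($\omega = -ik^2$, with unit diffusivity), the Euclidean wave equation (branch $\omega = ik$), the linearized BBM equation ($\omega = k/(1+k^2)$), and the free relativistic particle ($\omega = \pm\sqrt{1+k^2}$, unit mass). The helper `violates_bound`, the dictionary keys, and the grid are hypothetical names and values chosen for illustration.

```python
import numpy as np

def violates_bound(omega, re_k, im_k, tol=1e-9):
    """True if Im(omega(k)) > |Im(k)| at some point of the sampled complex-k grid."""
    K = re_k[None, :] + 1j * im_k[:, None]
    W = omega(K)
    return bool(np.any(W.imag > np.abs(K.imag) + tol))

# Grid in the complex k plane; the even number of Re(k) samples avoids Re(k) = 0,
# so the BBM poles at k = +i and k = -i are never hit exactly.
re_k = np.linspace(-5.0, 5.0, 200)
im_k = np.linspace(-5.0, 5.0, 201)

models = {
    "diffusion": lambda k: -1j * k**2,                        # unit diffusivity (assumed)
    "euclidean_wave": lambda k: 1j * k,                       # one of the two branches
    "bbm": lambda k: k / (1.0 + k**2),
    "klein_gordon_plus": lambda k: np.sqrt(1.0 + k**2 + 0j),  # unit mass, principal branch
    "klein_gordon_minus": lambda k: -np.sqrt(1.0 + k**2 + 0j),
}

checks = {name: violates_bound(f, re_k, im_k) for name, f in models.items()}
for name, bad in checks.items():
    print(f"{name}: {'violates' if bad else 'respects'} the bound on this grid")
```

On this grid the three acausal models violate the bound while both free-particle branches respect it, in line with the role the inequality plays in the rest of the argument.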

SINGLE DISPERSION BRANCHES ARE SUPERLUMINAL
Fix some level of description of matter, which may be, e.g., quantum field theory, kinetic theory, or hydrodynamics. Using established techniques [26][27][28][29], one can compute all the (possibly infinite) dispersion branches predicted by such a theory. Choose one of interest, $\omega(k)$. According to conventional wisdom [12][13][14][15][16][17], the relation $\omega(k)$ determines how the corresponding excitation "propagates", and there should be some causality criterion for $\omega(k)$, e.g., (1), which guarantees that the excitation propagates subluminally. Now we prove that this intuitive interpretation can be consistently maintained only in the trivial case $\omega(k) = a + bk$. In dispersive media, causality can never be argued from $\omega(k)$ alone.
First, let us make the above (incorrect) intuition about the causality of $\omega(k)$ more precise. Let $\varphi(x^\mu) \in \mathbb{C}$ be the linear perturbation to a local observable of interest. For example, $\varphi(x^\mu)$ may be the local energy density fluctuation. Then, consider a 1+1 dimensional profile $\varphi(t, x)$ that is constructed by superimposing plane waves all belonging solely to the selected excitation branch $\omega(k)$, i.e.,

$$ \varphi(t,x) = \int_{-\infty}^{+\infty} \frac{dk}{2\pi}\, \varphi(k)\, e^{i[kx - \omega(k)t]} . \tag{4} $$

By setting $t = 0$, we find that $\varphi(k)$ is the Fourier transform of the initial data, $\varphi(0, x)$. The straightforward definition of "causal dispersion relation" is the following: If $\varphi(0, x)$ has support in a set $R$, then the support of $\varphi(t, x)$ at later times should be contained inside the future lightcone of $R$, see Figure 1. As a consequence, if $\varphi(0, x)$ has compact spatial support, one should find that $\varphi(t, x)$ has compact spatial support for each fixed $t > 0$. Now we will show that, in practice, this is never the case for $\varphi$ given by (4). On the contrary, single-branched excitations of the form (4) always "travel" at infinite speed unless $\omega = a+bk$ (i.e., when the medium is not dispersive).

A simple argument
If $\varphi(t, x)$ vanishes outside the future lightcone of a compact set $R$, then $\partial_t \varphi(t, x)$ must also vanish there. Hence, to prove that $\varphi(t, x)$ exits the lightcone, it suffices to show that, for some $t_0 > 0$, the spatial profiles of $\varphi(t_0, x)$ and $\partial_t \varphi(t_0, x)$ cannot both have compact support simultaneously. We assume that $\varphi(t, x)$ is smooth, but the argument can be generalized. Fix $t_0 > 0$. From (4) and the uniqueness of the Fourier transform, we have that the spatial Fourier transform of $\varphi(t_0, x)$ is given by $\varphi(t_0, k) = \varphi(k) e^{-i\omega(k)t_0}$. Now, suppose that $\varphi(t_0, x)$ is compactly supported. Then, $\varphi(t_0, k)$ extends to an entire function of $k \in \mathbb{C}$ [30]. Under our assumptions, we can bring time derivatives under the integral to conclude that $\partial_t \varphi(t_0, x)$ has spatial Fourier transform $\dot{\varphi}(t_0, k) = -i\omega(k)\varphi(t_0, k)$. Corollary 1.1 of [19] tells us that, if $\omega(k)$ obeys (3), then it cannot be an entire function (unless $\omega = a+bk$). Therefore, $\dot{\varphi}(t_0, k)$ is the product of an entire function with a function that is not entire. Such a product can be entire only in the remote eventuality in which the discrete zeroes of $\varphi(k)$ happen to cancel the singularities of $\omega(k)$. This requires a perfect fine-tuning of the initial data, and it does not happen in general [fn. 3]. Furthermore, according to Theorem 2 of [19], if $\omega(k)$ satisfies (3), then its singularities are never poles or essential singularities. Instead, they are expected to be branch points, which cannot be erased by multiplying $\omega(k)$ with an entire non-zero function. Thus, $\dot{\varphi}(t_0, k)$ is not an entire function and, therefore, $\partial_t \varphi(t_0, x)$ cannot have compact support, as desired.
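The conclusion can be illustrated numerically. The sketch below is a minimal check under stated assumptions: the branch is taken to be $\omega(k) = \sqrt{m^2+k^2}$ with $m = 1$ (an illustrative choice that obeys the stability bound), the initial data is a smooth bump supported in $[-1, 1]$, and the evolution is done with an FFT on a periodic box much larger than the region of interest. The names `bump`, `leak_outside_lightcone`, and all grid parameters are hypothetical. The single-branch factor $e^{-i\omega t}$ is compared with the two-branch combination $\cos(\omega t)$, which is the Klein-Gordon solution with vanishing initial time derivative.

```python
import numpy as np

def bump(x):
    """Smooth initial profile with compact support in [-1, 1]."""
    out = np.zeros_like(x)
    inside = np.abs(x) < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))
    return out

def leak_outside_lightcone(t=1.0, m=1.0, single_branch=True, n=2**15, box=40.0):
    """Max |phi(t, x)| over |x| > 1 + t (plus a small margin), using FFT evolution."""
    dx = box / n
    x = (np.arange(n) - n // 2) * dx
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)
    omega = np.sqrt(m**2 + k**2)
    # Single branch: e^{-i omega t}. Both branches with zero initial d_t(phi): cos(omega t).
    factor = np.exp(-1j * omega * t) if single_branch else np.cos(omega * t)
    phi_t = np.fft.ifft(np.fft.fft(bump(x)) * factor)
    outside = np.abs(x) > 1.0 + t + 0.05
    return float(np.max(np.abs(phi_t[outside])))

leak_single = leak_outside_lightcone(single_branch=True)
leak_two = leak_outside_lightcone(single_branch=False)
print("single branch, max |phi| outside the lightcone:", leak_single)
print("two branches,  max |phi| outside the lightcone:", leak_two)
```

A finite grid cannot demonstrate exact vanishing, so the two-branch result should be read as "zero up to discretization and rounding error", while the single-branch tail is many orders of magnitude larger.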

Application 1: Hegerfeldt paradox
The above argument is a generalization of the well-known result (due to Hegerfeldt [31]) that relativistic single-particle wavefunctions of the form

$$ \varphi(t,x) = \int_{-\infty}^{+\infty} \frac{dk}{2\pi}\, \varphi(k)\, e^{i\left[kx - \sqrt{m^2+k^2}\, t\right]} \tag{5} $$

must propagate outside the lightcone [9]. Indeed, it can be easily verified that the dispersion relation of the free particle, $\omega = \sqrt{m^2+k^2}$, obeys (3), and this forces it to be non-analytic, as testified by the square root. Hence, the support of (5) expands at infinite speed [32], even if the group velocity, $d\omega/dk = k/\sqrt{m^2+k^2}$, is subluminal for all $k$.

Application 2: Necessity of non-hydrodynamic modes
An immediate corollary of our analysis is that the retarded Green's function of any theory for diffusion having only one dispersion relation, $\omega(k) = -iDk^2 + O(k^3)$, always exits the lightcone. Thus, to build a subluminal Green's function, we need at least two dispersion relations (see [33] §7.4). This explains why an additional (usually gapped) mode is needed for causality [19].

Explanation
The superluminal behavior of (4) in causal matter seems absurd, but there is a simple explanation: excitations of the form (4) cannot be truly localized, unless $\omega = a+bk$. They may seem to have compact support, if $\varphi(0, x)$ is supported in $R$, but, in principle, an observer can detect the excitation from outside $R$ already at $t = 0$ by measuring some other observable. In fact, we recall that the dispersion relation $\omega(k)$ is derived from some underlying physical theory (e.g., quantum field theory, kinetic theory, or hydrodynamics), which may possess several other local observables besides $\varphi$. The fact that the initial profile $\varphi(0, x)$ has support inside $R$ does not imply that all the measurable fields affected by the excitation are unperturbed outside $R$. Instead, it may be the case that, due to this excitation, the perturbation to a second observable $\psi(x^\mu)$ of the theory already has unbounded support at $t = 0$. This is how, in principle, $\varphi$ can propagate outside the lightcone without necessarily violating the principle of causality in the full theory: there is no superluminal propagation of information if such information was already accessible through the measurement of $\psi(0, x)$ outside $R$. Indeed, below we prove that if the perturbations to all the observables are initially supported inside a compact region $R$ (i.e., the excitation is truly localized), and the dynamics is governed by a local operator, then $\varphi(t, x)$ cannot be expressed in the form (4), and it must always combine at least two dispersion branches, unless $\omega = a+bk$.

COMPACTLY-SUPPORTED EXCITATIONS
We assume that the state of the system at a given time can be characterized, in the linear regime, by a collection of smooth perturbation fields $\Psi(x^\mu) \in \mathbb{C}^D$, which all vanish at equilibrium. We assume that $D$ is finite, although it can be as large as the number of particles in a material volume element. In most physical theories currently available (e.g., electrodynamics, elasticity theory, or hydrodynamics), the 1+1 dimensional equation of motion of the system takes the form [fn. 4]

$$ \partial_t \Psi = L(\partial_x)\Psi , \tag{6} $$

where $L(\partial_x)$ is a polynomial of finite degree in $\partial_x$, i.e., $L(\partial_x) = \sum_{j=0}^{M} A_j \partial_x^j$, where $A_j$ are constant $D \times D$ matrices, and $M \in \mathbb{N}$. This is what we mean by a "local operator". In fact, operators involving an infinite series of derivatives can produce non-localities and causality violations. For example, if we set $L(\partial_x) = e^{a\partial_x}$, equation (6) becomes $\partial_t \Psi(t, x) = \Psi(t, x+a)$, which is clearly a non-local theory. Indeed, the main reason why the BBM equation in (2) is acausal is that its dynamical operator, $L(\partial_x) = -(1-\partial_x^2)^{-1}\partial_x$, is not a polynomial in $\partial_x$ and is therefore non-local.

The general formal solution to (6) reads

$$ \Psi(t,x) = \int_{-\infty}^{+\infty} \frac{dk}{2\pi}\, e^{ikx}\, e^{L(ik)t}\, \Psi(k) , \tag{7} $$

where $\Psi(k)$ is the Fourier transform of the initial data $\Psi(0, x)$. Now, the field $\varphi(t, x)$, being a linearized local observable, is a local linear functional of the degrees of freedom, namely $\varphi = V(\partial_x)\Psi$, where $V$ is also a polynomial of finite degree in $\partial_x$, i.e., $V(\partial_x) = \sum_{j=0}^{N} B_j \partial_x^j$. Here, $B_j$ are constant row vectors of length $D$, and $N \in \mathbb{N}$. Therefore, we have the following formula:

$$ \varphi(t,x) = \int_{-\infty}^{+\infty} \frac{dk}{2\pi}\, e^{ikx}\, V(ik)\, e^{L(ik)t}\, \Psi(k) . \tag{8} $$

Assume that the excitation is initially supported inside $R$. Then, all the components of $\Psi(0, x)$ are compactly supported, and all the components of $\Psi(k)$ are entire functions. Furthermore, $L(ik)$ and $V(ik)$ are entire in $k \in \mathbb{C}$, being polynomials. Also, the matrix exponential is an analytic function of the components of the matrix in the exponent, so that $e^{L(ik)t}$ is entire in $k$. Combining these results, we can conclude that the integrand of (8) is an entire function of $k$ for all $t$. For this reason, it cannot coincide with (4), unless $\omega(k) = a + bk$ (i.e., in dispersion-free systems). This shows that, when we construct the state (4) in a dispersive medium, we implicitly allow some component of $\Psi$ to have unbounded support already at $t = 0$, which is what we wanted to prove. In the Supplementary Material, we analyze the explicit example of the Klein-Gordon equation.
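The contrast between the entire matrix exponential and a single non-analytic branch can be made concrete. The sketch below is a minimal illustration, assuming the Klein-Gordon equation written as a first-order system for $\Psi = (\varphi, \partial_t\varphi)$, so that $L(ik)$ has rows $(0, 1)$ and $(-(m^2+k^2), 0)$, with $m = 1$. It evaluates $e^{L(ik)t}$ and the single-branch factor $e^{-i\omega(k)t}$ on both sides of the branch cut of $\sqrt{m^2+k^2}$ on the imaginary $k$ axis. The helper names and the truncated Taylor series used for the matrix exponential are illustrative choices.

```python
import numpy as np

def L_kg(k, m=1.0):
    """L(ik) for Klein-Gordon as a first-order system in (phi, d_t phi) (assumed form)."""
    return np.array([[0.0, 1.0], [-(m**2 + k**2), 0.0]], dtype=complex)

def expm_taylor(A, terms=40):
    """Matrix exponential via a truncated Taylor series (adequate for small norms)."""
    out = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for j in range(1, terms):
        term = term @ A / j
        out = out + term
    return out

def single_branch(k, t, m=1.0):
    """Single-branch factor exp(-i omega t), principal branch of the square root."""
    return np.exp(-1j * np.sqrt(m**2 + k**2 + 0j) * t)

t, eps = 1.0, 1e-8
# Two nearby points straddling the cut Re(k) = 0, Im(k) > m.
k_left, k_right = -eps + 1.5j, eps + 1.5j

jump_matrix = np.max(np.abs(expm_taylor(L_kg(k_right) * t) - expm_taylor(L_kg(k_left) * t)))
jump_branch = abs(single_branch(k_right, t) - single_branch(k_left, t))
print("jump of exp(L t) across the cut:     ", jump_matrix)
print("jump of exp(-i omega t) across the cut:", jump_branch)
```

The matrix exponential only involves $\cos(\omega t)$ and $\sin(\omega t)/\omega$, which are even in $\omega$ and hence insensitive to the sign ambiguity of the square root, whereas the single branch jumps by an amount of order one.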

CAUSALITY CRITERION FOR STABLE MATTER
The above analysis suggests that if the dynamics of the system is governed by a local operator $L$, then all the dispersion branches will automatically combine in such a way as to cancel the infinite tails of the individual excitations (4). This intuition can be made rigorous through the following theorem, according to which, schematically,

$$ \text{Local equations} + \text{Stability in all frames} \implies \text{Relativistic causality} . $$

More precisely:
Theorem 1. If $L(ik)$ and $V(ik)$ are polynomials of finite degree in $ik$, and the eigenvalues $\omega_n(k)$ of $iL(ik)$ obey the stability requirement (3) for all $k \in \mathbb{C}$, then all smooth linear excitations propagate subluminally, in the sense that the support of $\varphi(t, x)$, as given by (8), is contained within the future lightcone of the support of $\Psi(0, x)$.
Proof. We will focus on the case where $\Psi(0, x)$ has support inside the interval $[-1, 0]$. We will verify that $\varphi(t, x)$ vanishes for $x > t \ge 0$. More general cases can be recovered from here by invoking linearity, translation invariance, and closure of the solution space. Consider the complex integral

$$ I = \oint_\Gamma \frac{dk}{2\pi}\, e^{ikx}\, V(ik)\, e^{L(ik)t}\, \Psi(k) , \tag{9} $$

where $\Gamma$ is the closed loop in complex $k$ space in Fig. 2, in the limit of large $R$. Since the integrand is entire, $I = 0$.
Let us now show that, if $x > t \ge 0$, then the contribution coming from the upper semicircle decays to zero as $R \to +\infty$, so that $0 = I = \varphi(t, x)$, see equation (8).
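To this end, we first note that, according to the Jordan-Chevalley decomposition theorem, the matrix $L(ik)$ can be expressed as

$$ L(ik) = -i\sum_n \omega_n(k)\, P_n + N , \tag{10} $$

where $P_n$ are complementary eigenprojectors (so that $P_m P_n = \delta_{mn} P_n$, $\sum_n P_n = \mathbb{I}$), and $N$ is a nilpotent matrix ($N^a = 0$ for some $a \in \mathbb{N}$) which commutes with all $P_n$. Thus, the integrand in (9) can be rewritten as

$$ e^{ikx}\, V(ik)\, e^{L(ik)t}\, \Psi(k) = \sum_n \sum_{j=0}^{a-1} \frac{t^j}{j!}\, e^{i(kx - \omega_n t)}\, V(ik)\, N^j P_n\, \Psi(k) . \tag{11} $$

The matrix elements of $N^j$ and $P_n$ grow at most like powers of $|k|$. This follows from [35, Chapter 2, eqs. (1.21) and (1.26)], applied to the matrix $(ik)^{-M} L(ik)$ regarded as a polynomial in $(ik)^{-1} \to 0$, combined with the fact that $(ik)^{-M} L(ik)$ and $L(ik)$ have the same invariant subspaces. On the other hand, if $\mathrm{Im}\,k \ge 0$, and $x > t \ge 0$, we have the following estimates:

$$ \begin{aligned} \left| e^{i(kx - \omega_n t)} \right| &= e^{-x\,\mathrm{Im}\,k + t\,\mathrm{Im}\,\omega_n} \le e^{-(x-t)\,\mathrm{Im}\,k} , \\ \left| \Psi(k) \right| &\le \int_{-1}^{0} \left| \Psi(0,x') \right| e^{x'\,\mathrm{Im}\,k}\, dx' \le \| \Psi(0) \|_{L^1} . \end{aligned} \tag{12} $$

In the first line, we have invoked the inequality (3). In the second line, we have used the fact that $e^{x'\,\mathrm{Im}\,k} \le 1$ for $x'$ inside the interval $[-1, 0]$. Note that $\Psi(0, x)$, being continuous and compactly supported, has finite $L^1$ norm. From the estimates (12), we can conclude that (11) decays exponentially to zero when $\mathrm{Im}\,k \to +\infty$. Furthermore, since $\Psi(\mathrm{Re}\,k + i\,\mathrm{Im}\,k)$, regarded as a function of $\mathrm{Re}\,k$, is the Fourier transform of the Schwartz function $e^{x\,\mathrm{Im}\,k}\Psi(0, x)$, it is itself a Schwartz function [30], meaning that (11) decays to zero faster than any power also when $\mathrm{Re}\,k \to \infty$. It follows that, as $R^2 = (\mathrm{Re}\,k)^2 + (\mathrm{Im}\,k)^2 \to +\infty$, the integral over the semicircle converges to zero, since the integrand decays faster than any power of $R$.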
Most derivations of (1) rely on the assumption that $\omega \approx v_f k$ for large $k \in \mathbb{C}$, so that (1) is a direct consequence of (3). However, (3) is a much more stringent condition, as it automatically rules out the acausal equations (2). Indeed, the apparent success of (1) in many situations can be traced back to (3) through the following theorem, proven below:
Theorem 2. If (6) is a hyperbolic first-order system, with $L(\partial_x) = -\Xi - M\partial_x$ for constant $D \times D$ matrices $\Xi$ and $M$, then (3) implies (1), and the characteristic velocities coincide with the front velocities.
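Proof. The ratios $\omega_n/k$ are eigenvalues of the matrix

$$ \frac{iL(ik)}{k} = M + (ik)^{-1}\, \Xi . \tag{13} $$

If we regard the right-hand side as a polynomial in $(ik)^{-1}$, we can take the limit as $(ik)^{-1} \to 0$ and apply the continuity property of eigenvalues [35] to conclude that $\omega_n/k$ must converge to eigenvalues of $M$ for large $k \in \mathbb{C}$. But the eigenvalues of $M$ are the characteristic velocities of the system, and they are real (by hyperbolicity), so that

$$ \lim_{k\to\infty,\; k\in\mathbb{C}} \frac{\omega_n(k)}{k} = v_n \in \mathbb{R} , \tag{14} $$

where $v_n$ denotes the corresponding eigenvalue of $M$. Restricting the above limit to real $k$, and using the continuity of $\mathrm{Re}$, we find that the characteristic velocities coincide with the front velocities. Restricting the limit to imaginary $k = i\kappa$, we have $\mathrm{Im}\,\omega_n \approx v_n \kappa$ for large $|\kappa|$, so (3) requires $v_n \kappa \le |\kappa|$ for both signs of $\kappa$, i.e., $|v_n| \le 1$. Hence, (3) implies causality.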
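The convergence of $\omega_n/k$ to the characteristic velocities can be checked on a small example. The sketch below assumes, purely for illustration, a Cattaneo-type system $\partial_t n + \partial_x J = 0$, $\tau\,\partial_t J + J + D\,\partial_x n = 0$ with $\Psi = (n, J)$, which has the form $\partial_t\Psi = -\Xi\Psi - M\partial_x\Psi$. This model, the parameter values, and the helper name `frequency_ratios` are assumptions introduced here and do not appear in the text.

```python
import numpy as np

def frequency_ratios(k, Xi, M):
    """Eigenvalues of iL(ik)/k = M + Xi/(ik), sorted by real part."""
    ratios = np.linalg.eigvals(M + Xi / (1j * k))
    return np.sort_complex(ratios)

# Illustrative parameters: relaxation time tau and diffusivity D (assumed values).
tau, D = 1.0, 0.5
Xi = np.array([[0.0, 0.0], [0.0, 1.0 / tau]])
M = np.array([[0.0, 1.0], [D / tau, 0.0]])

# Characteristic velocities: eigenvalues of M, here +/- sqrt(D / tau).
v_char = np.sort(np.linalg.eigvals(M).real)

ratios_real_k = frequency_ratios(1e6, Xi, M)    # large real k
ratios_imag_k = frequency_ratios(1e6j, Xi, M)   # large imaginary k
print("characteristic velocities:", v_char)
print("omega/k at large real k:  ", ratios_real_k)
print("omega/k at large imag k:  ", ratios_imag_k)
```

With these values the characteristic velocities are $\pm\sqrt{D/\tau}$, which lie inside $[-1, 1]$, so this toy system is consistent with the bound in the large-$k$ regime probed here.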
This completes our proof.
While the above analysis was restricted to classical initial value problems, its broad implications may also be extrapolated to quantum systems. For example, some conformal field theories are known to be acausal [36]. Given that such theories are local, we can "apply" our Theorem 1 to conclude that such theories are not covariantly stable and violate the bound (3), in agreement with Section III.A of [22].

CORRELATORS IN QFT
Causality requires multiple dispersion relations also in QFT. Given a local observable operator $\varphi(x^\mu)$, the correlator $G(x^\mu) = \langle [\varphi(x^\mu), \varphi(0)] \rangle$ has support inside the lightcone [9]. But since the slices of the lightcone at constant time are compact spheres, the spatial Fourier transform $G(t, k)$ must be entire in $k$ for all $t$ [30]. This is why introducing momentum cutoffs or "patching" correlators in momentum space leads to causality violations [37]: it breaks analyticity. Furthermore, if $G(t, k)$ can be expressed as a superposition of modes of the form $e^{-i\omega_n(k)t}$ (see, e.g., [38]), then we know that all the non-analyticities of the individual frequencies $\omega_n(k)$ must cancel out.

FINAL REMARKS
Consider the following puzzle: All solutions of the relativistic Schrödinger equation $i\partial_t \varphi = \sqrt{m^2 - \partial_x^2}\,\varphi$ are also solutions of the Klein-Gordon equation $-\partial_t^2 \varphi = (m^2 - \partial_x^2)\varphi$. Nevertheless, the former is notoriously acausal [9], while the latter is causal. This defies the intuition of causality as a statement about the propagation speed of $\varphi$. How can the same function $\varphi(t, x)$ be superluminal when viewed as a solution of one equation and subluminal when viewed as a solution of another equation?
Here, we solved this puzzle by showing that causality is not an intrinsic property of the fields themselves. Rather, it is a property of how we "attach information" to the fields by defining the physical state. The existence of faster-than-light motion does not result in causality violation if the motion carries no new information about the state. Indeed, relativistic Schrödinger and Klein-Gordon differ by the way they define the physical state at a given time: $\{\varphi(x)\}$ in the former, and $\{\varphi(x), \partial_t \varphi(x)\}$ in the latter. The puzzle arises because compactly supported field states within relativistic Schrödinger (i.e., localized $\varphi$ profiles) must have unbounded support within Klein-Gordon (i.e., cannot be localized in $\partial_t \varphi$), see Supplementary Material.
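The resolution of the puzzle can be seen directly at $t = 0$. The sketch below is a minimal numerical illustration, assuming a smooth bump supported in $[-1, 1]$ as the localized relativistic Schrödinger state, unit mass, and an FFT on a large periodic box. It evaluates the Klein-Gordon datum $\partial_t\varphi(0, x) = -i\sqrt{m^2 - \partial_x^2}\,\varphi(0, x)$ implied by the single-branch evolution. The function names and grid parameters are hypothetical.

```python
import numpy as np

def bump(x):
    """Smooth profile with compact support in [-1, 1]."""
    out = np.zeros_like(x)
    inside = np.abs(x) < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))
    return out

def initial_data(n=2**15, box=40.0, m=1.0):
    """Return x, phi(0, x), and d_t phi(0, x) = -i sqrt(m^2 - d_x^2) phi(0, x)."""
    dx = box / n
    x = (np.arange(n) - n // 2) * dx
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)
    omega = np.sqrt(m**2 + k**2)
    phi0 = bump(x)
    dphi0 = np.fft.ifft(-1j * omega * np.fft.fft(phi0))
    return x, phi0, dphi0

x, phi0, dphi0 = initial_data()
outside = np.abs(x) > 1.2  # strictly outside the support of phi(0, x)
tail_phi = float(np.max(np.abs(phi0[outside])))
tail_dphi = float(np.max(np.abs(dphi0[outside])))
print("max |phi(0, x)| outside the support:    ", tail_phi)
print("max |d_t phi(0, x)| outside the support:", tail_dphi)
```

The field itself vanishes identically outside the bump, while its time derivative does not, which is the sense in which the state is not localized as a Klein-Gordon state.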
Starting from this intuition, we showed that nonhydrodynamic modes become necessary for relativistic viscous hydrodynamics for the same reason that antiparticles are necessary for relativistic quantum mechanics: defining a notion of locality in dispersive systems requires at least two dispersion relations.

FIG. 1. The principle of causality. If a perturbation has initial support inside a region of space $R$ (blue segment), then it cannot propagate outside the set $J^+(R)$ called the "causal future of $R$" [6], or "future lightcone of $R$" [5] (red region).

FIG. 2. Path of integration for the proof of Theorem 1.