Spin Hall effects and the localization of massless spinning particles

The spin Hall effects of light represent a diverse class of polarization-dependent physical phenomena involving the dynamics of electromagnetic wave packets. In a medium with an inhomogeneous refractive index, wave packets can be effectively described by massless spinning particles following polarization-dependent trajectories. Similarly, in curved spacetime the gravitational spin Hall effect of light is represented by polarization-dependent deviations from null geodesics. In this paper, we analyze the equations of motion describing the gravitational spin Hall effect of light. We show that these equations are a special case of the Mathisson-Papapetrou equations for spinning objects in general relativity. This allows us to use several known results for the Mathisson-Papapetrou equations, and apply them to the study of electromagnetic wave packets. We derive conservation laws, we discuss the limits of validity of the spin Hall equations, and we study how the energy centroids of wave packets, effectively described as massless spinning particles, depend on the external choice of a timelike vector field, representing a family of observers. In flat spacetime, the relativistic Hall effect and the Wigner(-Souriau) translations are recovered, while our equations also provide a generalization of these effects in arbitrary spacetimes. We construct a large class of wave packets that can be described by the spin Hall equations, but also find its limits by giving examples of wave packets which are more general and are not described by the spin Hall equations. Lastly, we examine the assumption that electromagnetic wave packets are massless. While this is approximately true in many contexts, it is not exact. We show that failing to carefully account for the limitations of the massless approximation results in the appearance of unphysical centroids which are nowhere near the wave packet itself.

I. INTRODUCTION
Many observations of the physical world, particularly in astrophysical contexts, involve measurements of electromagnetic and (more recently) gravitational radiation. Interpreting this radiation requires a theoretical model for its propagation. In the case of electromagnetic waves, one might begin with Maxwell's equations. In the case of gravitational waves, one might instead use the Einstein field equation. Regardless, exact solutions are rarely available and the geometric optics approximation is typically applied in order to make progress. This assumes that wavelengths are small compared with all other relevant length scales, and forms the basis for most of the theory of gravitational lensing [1][2][3][4][5].
Mathematically, the geometric optics approximation allows the field equations, which are partial differential equations, to be approximated by a set of ordinary differential equations. The problem of solving partial differential equations is thereby reduced to the much simpler problem of solving ordinary differential equations. More specifically, this process shows that the amplitudes and polarization states of high-frequency electromagnetic and gravitational waves propagate along null geodesics. The resulting field acts, in this approximation, as though it were formed from a collection of noninteracting massless particles.
It is the purpose of this paper to investigate what happens beyond geometric optics, when wavelengths are small but not completely ignorable. More specifically, how do corrections to geometric optics affect propagation directions? While the equations which govern small corrections to geometric optics were derived long ago [6,7] for electromagnetic and gravitational waves propagating through curved spacetimes, their consequences have not been thoroughly explored. It is nevertheless known that all reasonable definitions for the local "propagation direction" agree in geometric optics: The direction of the electromagnetic momentum density is identical for all observers, and that coincides with the direction of the local phase gradient, the direction along which "information" propagates, and the (necessarily degenerate) principal null direction of the electromagnetic field. Beyond geometric optics, different notions of propagation direction no longer agree: The direction of the 4-momentum density can be different for different observers, there can be two principal null directions, and phase gradients can depend on a choice of basis [8]. Moreover, amplitude and polarization states no longer propagate independently along each ray. Instead, there is a transport of information between neighboring rays as well as along them. This means that beyond leading order, there is no well-defined direction which can be associated with "information flow" in a high-frequency field.
This complexity requires that we be precise about what exactly it is whose propagation we would like to understand. In this paper, we focus on the "bulk" propagation of small electromagnetic pulses in curved spacetimes. We choose a "center" for each pulse and ask how that center evolves in time. Pulses in geometric optics are simple: With reasonable assumptions, they travel along null geodesics [9]. Like their constituent rays, the centers of high-frequency pulses behave, at leading order, like massless monopolar particles. One order beyond geometric optics, the motion depends on a pulse's angular momentum. More subtly, it also depends on precisely which definition is used to describe the pulse's center. Regardless, there is a sense in which otherwise-identical wave packets with opposite circular polarizations can be deflected with respect to one another. This behavior may be summarized by stating that one order beyond geometric optics, the bulk motion is equivalent to that of a massless dipole.
In the literature on flat-spacetime optics in nontrivial materials, spin-dependent corrections to the propagation of electromagnetic fields are sometimes described as spin Hall effects. There are in fact a number of different spin Hall effects which have been discussed theoretically, some of which have also been observed experimentally [10][11][12]. Some spin Hall effects are induced by, e.g., gradients in the refractive index [13][14][15][16][17][18][19][20][21][22]. Others arise even without any material inhomogeneities: The geometric spin Hall effect [23][24][25] and the related relativistic Hall effect [26] and Wigner(-Souriau) translations [27][28][29] all arise in vacuum and in flat spacetime. These three effects may be shown to be associated with differing definitions for the "center" of a given wave packet. More precisely, the relativistic Hall effect and the Wigner translations are related to differences between three-dimensional centroids which would naturally be associated with different observers. They are essentially the same as (unnamed) effects which have long been known for massive objects [30][31][32][33]. The geometric spin Hall effect is somewhat different, being instead concerned with differences between centroids which are defined on different two-dimensional cross sections.
Still other approaches have not made any direct contact with an underlying field theory, but have instead claimed that the motion of a wave packet could be described using massless versions of the Mathisson-Papapetrou (MP) equations [18,51-57], equations which are known to describe classical spinning objects in curved spacetimes. This paper focuses on the gravitational spin Hall effect of light, as described in Ref. [42]. While the derivation there was based on a high-frequency approximation, it differs from other high-frequency approaches by being applicable in arbitrary spacetimes and by avoiding specific 3 + 1 foliations. This paper endeavors to better understand the meaning, the domain of applicability, and the limitations of the gravitational spin Hall equations. It also unifies those equations with others which have appeared in different contexts: The gravitational spin Hall equations are shown to be a special case of the MP equations, and the flat-spacetime gradient-index spin Hall effect, the relativistic Hall effect, and the Wigner(-Souriau) translations are all shown to be special cases of the gravitational spin Hall equations.
But before any equations of motion can be sensibly discussed, it is necessary to first explain what exactly those equations describe. This leads us to consider what can be meant by the centroid of an extended wave packet. One definition which has appeared in the literature is shown to be untenable. A large number of others remain, however, and we show that the set of all such possible centroids is unbounded for massless (but not massive) objects. While this might at first appear to be a failure of the definitions, it is in fact a failure of masslessness. We show that wave packets with nonzero angular momentum cannot be massless, and this is essential to their localizability. While the massless approximation can be useful for many purposes, ignoring its limitations can result in qualitatively incorrect conclusions.
The paper begins in Sec. II by reviewing the gravitational spin Hall effect of light, as presented in Ref. [42]. For comparison with the spin Hall effect of light in flat-spacetime optics [11], we emphasize the role of the Berry phase and the Berry connection in describing polarization, as well as the role of the Berry curvature in the gravitational spin Hall equations.
Section III shows that with particular initial and spin supplementary conditions, the gravitational spin Hall equations emerge as a special case of the MP equations. This relation allows us to use the well-developed theory associated with the MP equations to clarify the meanings of the worldline and the momentum which arise in the spin Hall equations. It also allows us to write down conservation laws for those equations and to discuss their regimes of validity.
The spin Hall equations involve an arbitrary choice of timelike vector field, and we show in Sec. IV that this parametrizes different definitions for the centroid of an extended wave packet. Our main result regarding these centroids is that although massive spinning objects can be localized, massless ones cannot. Section V examines whether or not the initial conditions which reduce the MP equations to the gravitational spin Hall equations are in fact realized by reasonable wave packets. We use a high-frequency approximation to explicitly construct a large class of electromagnetic wave packets. For many members of this class, the appropriate conditions are indeed satisfied. However, we also find wave packets which do not have the expected properties. This implies that there are nontrivial assumptions on the nature of the wave packet which have been implicitly (and unknowingly) imposed in the prior literature. We also show that our approximate wave packets can fail to satisfy the dominant energy condition. This is an unphysical artifact of the high-frequency approximation, and is what leads to the apparent delocalization of spinning wave packets discussed in Sec. IV.
In Sec. VI, we discuss an analogy between light propagation through an optical medium and light propagation through vacuum but in an effective optical metric. Using a standard optical metric, we recover the spin Hall effect of light in an inhomogeneous medium from the gravitational spin Hall equations.
Finally, the Appendix demonstrates that at least in flat spacetime, the spin of any massless object which satisfies the dominant energy condition must vanish. It follows that, e.g., electromagnetic wave packets with nonzero spin cannot be exactly massless.
Notation and conventions: We work on an arbitrary smooth Lorentzian manifold $(M, g_{\alpha\beta})$, where the metric tensor $g_{\alpha\beta}$ has signature $(-+++)$. Greek letters are used for spacetime indices and run from 0 to 3. We use bold symbols to denote 3-vectors, and their components are labeled by Latin letters from the middle of the alphabet, $(i, j, k, \ldots)$, that run from 1 to 3. Units are used in which $G = c = 1$, the Einstein summation convention is assumed, and we use the notation $a_\alpha b^\alpha = a \cdot b$, $a_\alpha a^\alpha = a \cdot a = a^2$. The Riemann tensor is defined such that $2\nabla_{[\alpha}\nabla_{\beta]}\omega_\gamma = R_{\alpha\beta\gamma}{}^{\lambda}\omega_\lambda$ for any $\omega_\gamma$. When working with tensors $T$ defined at two different spacetime points of $M$, we use the usual notation, $T^{\beta}{}_{\alpha}(x)$, when the tensor is defined at the first point $x^\alpha$, while we use primed indices, $T^{\beta'}{}_{\alpha'}$, for tensors defined at the second point.

II. GRAVITATIONAL SPIN HALL EFFECT OF LIGHT
The equations of motion which describe the spin Hall effect for electromagnetic waves propagating through curved spacetimes were derived in Ref. [42]. They were obtained by performing a covariant high-frequency analysis of the vacuum Maxwell equations. A similar approach was used in Ref. [45] to describe the spin Hall effect for gravitational waves propagating on curved backgrounds. For electromagnetic waves, the derivation starts with the WKB ansatz for the electromagnetic potential, $A_\alpha = \mathrm{Re}\{\epsilon[\psi a_\alpha + \epsilon\psi^{(1)}_\alpha + O(\epsilon^2)]e^{iu/\epsilon}\}$, where $\epsilon$ is a small parameter related to the wavelength, $u$ is a real phase function, $\psi$ is a real scalar amplitude, and $a_\alpha$ is a complex polarization vector normalized such that $a_\alpha\bar{a}^\alpha = 1$. Higher-order terms, such as $\psi^{(1)}_\alpha$, do not play any role in this section. The overall factor of $\epsilon$ is for convenience and ensures that the field strength $F_{\alpha\beta} = 2\nabla_{[\alpha}A_{\beta]}$ is nontrivial and finite in the $\epsilon \to 0$ limit. If $u$ increases with time, it is convenient to define the future-directed wave vector $k_\alpha = -\nabla_\alpha u$, and in terms of that, a timelike observer with 4-velocity $t^\alpha$ will measure the wave frequency $\omega = -(k \cdot t)/\epsilon$. The derivation of the spin Hall effect in Ref. [42] relies on an analysis of the overall phase factor of the field, which consists of $u$ at the lowest order in $\epsilon$, together with a higher-order phase factor, referred to as the Berry phase, which comes from the polarization vector $a_\alpha$. Using the WKB ansatz above, together with the Maxwell equation and the Lorenz gauge condition $\nabla_\alpha A^\alpha = 0$, the wave vector $k_\alpha$ must be null and orthogonal to the polarization vector, $k \cdot k = 0$ and $k \cdot a = 0$ (2.4). Additionally, the scalar amplitude $\psi$ must satisfy the transport equation $\nabla_\alpha(k^\alpha\psi^2) = 0$ (2.5), and the polarization vector $a_\alpha$ must be parallel transported, $k^\beta\nabla_\beta a_\alpha = 0$ (2.6). These are the usual equations of geometric optics. The null geodesic rays of geometric optics are integral curves of $k^\alpha$.
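As a brief sketch of how the geometric optics conditions quoted above arise, one can insert only the leading term of the WKB ansatz into the Lorenz gauge condition and into the vacuum Maxwell equations, and keep the most singular power of $\epsilon$ in each. The wave-equation form of Maxwell's equations in Lorenz gauge used below is a standard result that is assumed here rather than taken from the text, and all subleading terms are dropped.

```latex
% Leading-order sketch only. Assumes A_\alpha \simeq \mathrm{Re}[\epsilon \psi a_\alpha e^{iu/\epsilon}]
% and k_\alpha = -\nabla_\alpha u, with \psi real and nonzero.
\begin{align}
  \nabla^\alpha A_\alpha
    &= \mathrm{Re}\big\{ [\, -i \psi\, (k \cdot a) + O(\epsilon) \,]\, e^{iu/\epsilon} \big\} = 0
    \;\Rightarrow\; k \cdot a = 0, \\
  \nabla^\beta \nabla_\beta A_\alpha - R_\alpha{}^\beta A_\beta
    &= \mathrm{Re}\big\{ [\, -\epsilon^{-1} \psi\, (k \cdot k)\, a_\alpha + O(\epsilon^0) \,]\, e^{iu/\epsilon} \big\} = 0
    \;\Rightarrow\; k \cdot k = 0, \\
  % k_\alpha is a gradient, so \nabla_\beta k_\alpha = \nabla_\alpha k_\beta
  k^\beta \nabla_\beta k_\alpha
    &= k^\beta \nabla_\alpha k_\beta = \tfrac{1}{2} \nabla_\alpha (k \cdot k) = 0 .
\end{align}
```

The last line is the statement that the integral curves of $k^\alpha$ are affinely parametrized null geodesics. The Ricci term only enters at subleading order, so it plays no role in these leading-order conditions.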
It is convenient to expand the polarization vector in terms of a tetrad $\{k_\alpha, t_\alpha, m_\alpha, \bar{m}_\alpha\}$, where the real covector $t_\alpha$ is timelike, $k_\alpha$, $m_\alpha$, and its complex conjugate $\bar{m}_\alpha$ are null, $t \cdot m = 0$, and $m \cdot \bar{m} = 1$. Given Eq. (2.4), there must exist complex scalars $z_1$, $z_2$, and $z_3$ such that $a_\alpha = z_1 m_\alpha + z_2 \bar{m}_\alpha + z_3 k_\alpha$ (2.7). The complex covectors $m_\alpha$ and $\bar{m}_\alpha$ form a circular polarization basis, and the considered electromagnetic wave is circularly polarized when $z_1 = 0$ or $z_2 = 0$. The term proportional to $k_\alpha$ is pure gauge, not fixed by the Lorenz gauge condition, and it will not play any role in what follows.
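The tetrad expansion of the polarization vector can be checked in two lines. The sketch below assumes, in addition to the tetrad properties listed above, that $k \cdot m = k \cdot \bar{m} = 0$ and $k \cdot t \neq 0$; these are natural for a polarization basis but are not stated explicitly in the text. The coefficient $z_0$ is a hypothetical name introduced only for this check.

```latex
% Assumptions (labeled): k \cdot m = k \cdot \bar m = 0 and k \cdot t \neq 0.
% z_0 is an illustrative coefficient of t_\alpha in a general tetrad expansion.
\begin{align}
  a_\alpha &= z_0\, t_\alpha + z_1 m_\alpha + z_2 \bar m_\alpha + z_3 k_\alpha
    \;\Rightarrow\; k \cdot a = z_0\, (k \cdot t) = 0
    \;\Rightarrow\; z_0 = 0, \\
  % m \cdot m = \bar m \cdot \bar m = k \cdot k = 0 and m \cdot \bar m = 1
  a \cdot \bar a &= (z_1 m + z_2 \bar m + z_3 k) \cdot (\bar z_1 \bar m + \bar z_2 m + \bar z_3 k)
    = |z_1|^2 + |z_2|^2 = 1 .
\end{align}
```

Under these assumptions the gauge coefficient $z_3$ drops out of the normalization, which is consistent with the statement that the term proportional to $k_\alpha$ plays no role.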
The only element of the tetrad which is interpreted as being fixed by the field is $k_\alpha$. Supplementing that with $t_\alpha$ fixes the 2-plane spanned by $m_\alpha$ and $\bar{m}_\alpha$. And within that plane, there is still an additional freedom associated with the spin rotations $m_\alpha \to e^{i\phi}m_\alpha$, where $\phi$ is any real scalar. Any change in $t_\alpha$ will result in a shift with the form $m_\alpha \to m_\alpha + c k_\alpha$, which can only affect $z_3$ in (2.7). Although that is interpreted as a gauge transformation here, changes in $t_\alpha$ will act nontrivially and play an important role in the spin Hall equations. Spin rotations instead affect the values of $z_1$ and $z_2$. Nevertheless, they do not affect the spin Hall equations.
We can now obtain a transport equation for $z_1$ and $z_2$ along the rays. Viewing $m_\alpha$ as a covector-valued field over the cotangent bundle, depending on both position and on $k_\alpha$, the parallel transport equation (2.6) implies a transport equation for $z_1$ and $z_2$, Eq. (2.8), which is written in terms of the Berry connection. The operator in brackets in the Berry connection may be seen to be a horizontal covariant derivative on the cotangent bundle. Regardless, the transport equation (2.8) can be integrated along each ray, and the result is expressed in terms of the Berry phase $\gamma$. The Berry phase represents a higher-order correction to the overall phase of the WKB potential, and is generally responsible for the spin Hall effect of light [16,21]. For circularly polarized electromagnetic waves, the leading-order field takes the form $A_\alpha = \mathrm{Re}[\epsilon\psi m_\alpha e^{i(u+\epsilon\gamma)/\epsilon}]$ or $A_\alpha = \mathrm{Re}[\epsilon\psi\bar{m}_\alpha e^{i(u-\epsilon\gamma)/\epsilon}]$, depending on the handedness of the circular polarization state. The total phase at this order is therefore proportional to $u + \epsilon s\gamma$, where $s = \pm 1$. Note that both the Berry connection and the Berry phase depend on spin rotations. In geometric optics, the dispersion relation $k \cdot k = 0$ may be viewed as a Hamilton-Jacobi equation for the phase function $u$. If that is solved using the method of characteristics, one recovers the null geodesic rays of geometric optics. In Ref. [42], the strategy was to generalize this procedure, deriving the spin Hall effect by looking for an effective dispersion relation involving the gradient of the corrected phase function $u + \epsilon s\gamma$. Letting $K_\alpha = -\nabla_\alpha(u + \epsilon s\gamma)$, we use $k \cdot k = 0$ and the definition of the Berry phase $\gamma$ to arrive at an effective dispersion relation for $K_\alpha$. The associated ray equations are then rewritten using a coordinate transformation (see, e.g., Ref. [58]). The spin Hall equations which result from the use of these coordinates, which describe the polarization-dependent propagation of circularly polarized light, can be written as in Ref. [42], Eq. (2.13), where $x^\alpha$ denotes a position, $p_\alpha$ a momentum, the dot an ordinary (noncovariant) derivative $d/d\tau$, and $s = \pm 1$ depending on the handedness of the circular polarization state.
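For orientation, the geometric optics step mentioned above (solving the Hamilton-Jacobi equation $k \cdot k = 0$ by the method of characteristics) can be sketched as follows. This covers only the $\epsilon \to 0$ limit, not the spin Hall corrections, and the particular Hamiltonian $H$ is an illustrative choice rather than a quantity defined in the text.

```latex
% Geometric optics limit only. H is an assumed, illustrative Hamiltonian
% whose zero set reproduces the dispersion relation p \cdot p = 0.
\begin{align}
  H(x, p) &= \tfrac{1}{2}\, g^{\alpha\beta}(x)\, p_\alpha p_\beta = 0, \\
  \dot x^\alpha &= \frac{\partial H}{\partial p_\alpha} = g^{\alpha\beta} p_\beta, \qquad
  \dot p_\alpha = -\frac{\partial H}{\partial x^\alpha}
    = -\tfrac{1}{2}\, (\partial_\alpha g^{\beta\gamma})\, p_\beta p_\gamma, \\
  % using \partial_\alpha g^{\beta\gamma} p_\beta p_\gamma = -\partial_\alpha g_{\beta\gamma} p^\beta p^\gamma
  \dot p_\alpha &= \Gamma^\gamma{}_{\beta\alpha}\, \dot x^\beta p_\gamma
    \;\Longleftrightarrow\; \dot x^\beta \nabla_\beta p_\alpha = 0 .
\end{align}
```

The dots here are ordinary derivatives $d/d\tau$, matching the convention used for the spin Hall equations. The last line shows that these characteristics are null geodesics with parallel-transported momentum.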
The same equations, but with $s = \pm 2$ instead, describe the spin Hall effect for circularly polarized gravitational waves [45]. They are understood to be valid up to terms of order $\epsilon^2$. The spin Hall equations are expressed in terms of the components of the Berry curvature, which are invariant with respect to spin rotations. The procedure leading to the spin Hall equations has removed any dependence on spin rotations. (A similar approach is also used for the description of charged particles in an external electromagnetic field, where a coordinate transformation is used to rewrite the equations of motion in terms of the gauge-invariant Faraday tensor instead of the gauge-dependent vector potential.) However, it has introduced a physical dependence on the timelike covector $t_\alpha$. This can be made explicit by calculating the components of the Berry curvature using the properties of the tetrad $\{p_\alpha, t_\alpha, m_\alpha, \bar{m}_\alpha\}$, which results in Eq. (2.15) [42, Appendix C]. (Here, $p_\alpha$ has replaced the $k_\alpha$ which appeared in the above discussion; $m_\alpha$ and $\bar{m}_\alpha$ are now viewed as functions of position and of $p_\alpha$.) Each term in Eq. (2.15) is linear in a real bivector $\Sigma^{\alpha\beta}$ which is invariant under spin rotations and is uniquely determined by $p_\alpha$ and $t_\alpha$. We shall see in Sec. III that $\Sigma^{\alpha\beta}$ is proportional to the angular momentum tensor of the wave packet. Regardless, substituting Eq. (2.15) into (2.13) shows that the spin Hall equations can be written in the more compact form of Eq. (2.17), where $D/d\tau = \dot{x}^\alpha\nabla_\alpha$ denotes the covariant derivative along the worldline. It is now manifest that the only external choice relevant to these equations is the timelike vector field $t^\alpha$. We shall see below that this choice parametrizes the definition for $x^\alpha$. Unlike the integral curves of $k^\alpha$, which are interpreted as rays within (say) a wave packet, the position $x^\alpha$ which appears in the spin Hall equations (2.17) is interpreted as describing the position of the wave packet as a whole: its "centroid." Similarly, the momentum $p_\alpha$ is interpreted as the net momentum of the wave packet, not as a momentum density within that wave packet. One consequence of the spin Hall equations is that the worldline is not necessarily tangent to the momentum. Systems with this feature are sometimes referred to as having hidden momentum [33,59-62] or anomalous velocity [15,48,63-65]. Also note that inspection of Eq. (2.17) shows that the affine parameter $\tau$ is dimensionless and that $\epsilon$ has units of (length)$^2$. Examination of the spin Hall equations shows that $\tau$ has been chosen such that $\dot{x} \cdot t = p \cdot t$. It is straightforward to see from Eq. (2.17b) that if $p_\alpha$ is initially null, it remains null for all time. Equation (2.17a) implies that when the momentum is null, so too is the worldline: $\dot{x} \cdot \dot{x} = O(\epsilon^2)$. These equations are assumed to be used only for initial data in which $p_\alpha$, and therefore $\dot{x}^\alpha$, are indeed null.
There is not always a hidden momentum. As a particular case, suppose that $t_\alpha$ is parallel transported in the sense that $\dot{x}^\beta\nabla_\beta t_\alpha = O(\epsilon)$ (2.18). With this choice, $\dot{x}^\alpha = p^\alpha + O(\epsilon^2)$ and Eq. (2.17) reduces to the polarization-dependent ray equations obtained by Frolov in Ref. [44]. In this sense, the spin optics approximation in that paper describes a particular case of the gravitational spin Hall equations obtained in [42]. However, since one of our goals is to understand the role of $t_\alpha$ in Eq. (2.17), we do not assume any special choices for it in the remainder of this paper. It was not clear in the derivation of the spin Hall equations precisely what $x^\alpha$, $p_\alpha$, or $t_\alpha$ are, what types of wave packets these equations describe, or what sorts of approximations are implicit in them (beyond the assumption of high frequencies). These issues will be addressed below.

III. MATHISSON-PAPAPETROU EQUATIONS AND THEIR IMPLICATIONS FOR THE SPIN HALL EQUATIONS
The spin Hall equations (2.17) are interpreted as describing the motion of circularly polarized electromagnetic wave packets. However, it is known from separate arguments that the motion of any sufficiently compact spinning object is governed by the Mathisson-Papapetrou (MP) equations (3.1) (footnote 6). These equations evolve an object's linear momentum $p_\alpha$ and its angular momentum $S^{\alpha\beta} = S^{[\alpha\beta]}$ along a specified worldline. They are very general: As long as the quadrupole and higher-order multipole moments of an object's stress-energy tensor can be ignored, the MP equations hold for all sufficiently compact objects with conserved stress-energy tensors [69,71]. In particular, although much of the literature on these equations assumes that $p_\alpha$ is timelike, their derivation makes no use of that condition; null momenta are also admissible. Whether or not the momentum is null depends only on the nature of the underlying stress-energy tensor. The generality of the MP equations can be understood, in part, from the fact that they are essentially kinematic. They arise as consequences of attempting to maintain Poincaré invariance as much as possible along the given worldline [72]. Indeed, they imply the presence of ten conserved quantities along that worldline, which correspond locally to the four translations, three rotations, and three boosts of a four-dimensional Minkowski spacetime (even when the actual spacetime is not Minkowski). The nontrivial physics which enters into this is that corrections to the MP equations (deviations due to the breakdown of Poincaré invariance) depend only on an object's quadrupole and higher-order moments. It is expected from the equivalence principle that "sufficiently compact" objects should behave, at least locally, as though the spacetime is flat, and a calculation shows that the breakdown of the flat-spacetime conservation laws first occurs at quadrupolar order.
Footnote 6: These are variously referred to as the Papapetrou, Mathisson-Papapetrou, and Mathisson-Papapetrou-Dixon equations. The same labels are commonly applied also to more general equations which involve the quadrupole and higher-order moments of the relevant object. As recounted in [66], Mathisson [67] appears to have been the first to obtain the pole-dipole equations (3.1). He did so before Papapetrou [68] and using a superior method. Mathisson also derived some of the quadrupole terms which are not included here. Dixon [69] derived all quadrupole and higher-order terms, developing a full theory of multipole moments to all orders. This was later generalized to also allow for self-interaction [70,71]. Here we refer to the test body pole-dipole equations (without quadrupole or higher-order moments) as the MP equations.
What is relevant here is that electromagnetic wave packets are associated with conserved stress-energy tensors. Their bulk motion can therefore be described not only by the spin Hall equations, but also by the MP equations. We show in Sec. III A that there is a precise sense in which the spin Hall equations arise as a special case of the MP equations. Section III B exploits this connection between the spin Hall equations and the MP equations to relate quantities in the spin Hall equations to an underlying stress-energy tensor. Section III C uses known results for the MP equations to write down previously unknown conservation laws associated with the spin Hall equations. Finally, Sec. III D explores the approximations used in the spin Hall equations and discusses when those approximations hold.
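To illustrate the kind of conservation law that the MP equations support, the following sketch considers the simplest case, in which the spacetime admits a genuine Killing vector field $\xi^\alpha$. The pole-dipole form of the MP equations written in the first line is the standard one in the curvature convention of this paper; it is stated here as an assumption, and the symbol $P_\xi$ is introduced only for this illustration.

```latex
% Assumed standard pole-dipole form of the MP equations, with
% 2 \nabla_{[\alpha} \nabla_{\beta]} \omega_\gamma = R_{\alpha\beta\gamma}{}^\lambda \omega_\lambda.
% \xi^\alpha is a Killing vector; P_\xi is an illustrative name.
\begin{align}
  \frac{D p_\alpha}{d\tau} &= -\tfrac{1}{2} R_{\alpha\beta\gamma\delta}\, \dot x^\beta S^{\gamma\delta}, \qquad
  \frac{D S^{\alpha\beta}}{d\tau} = 2 p^{[\alpha} \dot x^{\beta]}, \\
  P_\xi &= p_\alpha \xi^\alpha + \tfrac{1}{2} S^{\alpha\beta} \nabla_\alpha \xi_\beta, \qquad
  \nabla_{(\alpha} \xi_{\beta)} = 0, \qquad
  \nabla_\gamma \nabla_\alpha \xi_\beta = - R_{\alpha\beta\gamma\delta}\, \xi^\delta, \\
  % the two curvature terms cancel by the pair symmetry of the Riemann tensor
  \frac{d P_\xi}{d\tau} &= p^\alpha \dot x^\beta (\nabla_\beta \xi_\alpha + \nabla_\alpha \xi_\beta)
    - \tfrac{1}{2} R_{\alpha\beta\gamma\delta}\, \xi^\alpha \dot x^\beta S^{\gamma\delta}
    - \tfrac{1}{2} R_{\alpha\beta\gamma\delta}\, S^{\alpha\beta} \dot x^\gamma \xi^\delta = 0 .
\end{align}
```

No spin supplementary condition and no assumption on the sign of $p \cdot p$ is used in this check, so under the stated assumptions it applies equally to null momenta.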

A. Spin Hall equations from MP equations
Our first task is to show that the spin Hall equations are a special case of the MP equations. We now show that the spin Hall equations arise after choosing appropriate initial data for the MP equations, fixing an appropriate definition for the centroid of an extended wave packet, and imposing a particular parameterization for the worldline of that centroid.
A priori, it may appear that the spin Hall and MP equations do not even describe the same physical quantities. The spin Hall equations evolve $x^\alpha$ and $p_\alpha$ while $t^\alpha$ is specified independently. By contrast, the MP equations evolve $p_\alpha$ and $S^{\alpha\beta}$ while $x^\alpha$ is specified independently (footnote 7). This discrepancy is resolved by showing that in the present context, i) the angular momentum equation (3.1b) can be trivially solved, and ii) the specification of $t^\alpha$ is equivalent to the specification of $x^\alpha$. To summarize our result, given any future-directed timelike vector field $t^\alpha$ and any constant "spin parameter" $s\epsilon$, the MP equations reduce to the spin Hall equations, at least up to terms of order $\epsilon^2$, when:
1. The worldline parameter $\tau$ is chosen such that $\dot{x} \cdot t = p \cdot t$ (3.2) for all time.
2. The momentum $p_\alpha$ is at least initially null.
3. The angular momentum satisfies $S^{\alpha\beta}p_\beta = 0$ (3.3) at least initially, and $S^{\alpha\beta}t_\beta = 0$ (3.4) for all time.
4. The magnitude of the angular momentum is at least initially fixed by the spin parameter $s\epsilon$, Eq. (3.5).
The MP equations are reparametrization-invariant, so no generality is lost by imposing condition 1 for all time, at least so long as the worldline is not orthogonal to $t^\alpha$. Equation (3.2) serves merely to use the object's energy to nondimensionalize the time parameter.
Footnote 7: The worldline which appears in the MP equations is to be interpreted as a choice of origin for a multipole expansion. It does not necessarily have any interpretation as a centroid. That interpretation arises only when additional conditions are imposed on the worldline. However, except in maximally symmetric spacetimes, ignoring the quadrupole and higher-order moments cannot be justified unless there is some sense in which the worldline lies near an object's "center."
The interpretation of $S^{\alpha\beta}t_\beta = 0$, also known as the Corinaldesi-Papapetrou spin supplementary condition [33,73], is more substantial. As discussed in more detail in Sec. IV A below, it is an implicit definition for $x^\alpha$. The angular momentum of any object depends on the choice of origin, and certain components of $S^{\alpha\beta}$ can always be eliminated by an appropriate choice of origin. Here, $x^\alpha$ is chosen to eliminate $S^{\alpha\beta}t_\beta$, which is proportional to the body's mass dipole moment with respect to an observer whose 4-velocity is tangent to $t^\alpha$. This definition allows $x^\alpha$ to be interpreted as a kind of centroid. However, that centroid clearly depends on $t^\alpha$. Different choices for $t^\alpha$ generically result in different centroids, and each of these is in principle observable. The different centroids represent slightly different notions of "center" for an extended object. Relations between them are discussed in Secs. IV B and IV C below.
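The statement that the components $S^{\alpha\beta}t_\beta$ can be removed by a choice of origin can be made concrete in flat spacetime. The sketch below assumes the standard integral definitions of the momentum and of the angular momentum about an origin $z^\alpha$, a constant covector $t_\alpha$, and $p \cdot t \neq 0$; the hypersurface $\Sigma$ and the shift $\delta^\alpha$ are names introduced only for this illustration, and the overall sign convention for $S^{\alpha\beta}$ is an assumption.

```latex
% Flat spacetime, constant t_\alpha, p \cdot t \neq 0.
% The integral definitions and the sign convention for S are assumed.
\begin{align}
  p^\alpha &= \int_\Sigma T^{\alpha\gamma}\, d\Sigma_\gamma, \qquad
  S^{\alpha\beta}(z) = 2 \int_\Sigma (x - z)^{[\alpha}\, T^{\beta]\gamma}\, d\Sigma_\gamma, \\
  % shifting the origin by a constant \delta^\alpha
  S^{\alpha\beta}(z + \delta) &= S^{\alpha\beta}(z) - 2\, \delta^{[\alpha} p^{\beta]}, \\
  % \delta \cdot t = 0 follows from the antisymmetry of S
  \delta^\alpha = \frac{S^{\alpha\beta}(z)\, t_\beta}{p \cdot t}
  \;&\Rightarrow\;
  S^{\alpha\beta}(z + \delta)\, t_\beta
    = S^{\alpha\beta}(z)\, t_\beta - \delta^\alpha (p \cdot t) + p^\alpha (\delta \cdot t) = 0 .
\end{align}
```

The required shift depends explicitly on $t_\alpha$, which is one way to see that different choices of $t_\alpha$ single out different centroids, and it becomes large as $p \cdot t$ becomes small.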
A particular worldline can be fixed by choosing a particular $t^\alpha$. There should therefore exist an evolution equation for the tangent vector $\dot{x}^\alpha$ to that worldline. To derive that evolution equation, first combine (3.1b), (3.2), and (3.4) to obtain Eq. (3.6). Rearranging then results in the momentum-velocity relation $\dot{x}^\alpha = p^\alpha + (p \cdot t)^{-1}S^{\alpha\beta}\dot{x}^\gamma\nabla_\gamma t_\beta$ (3.7). It follows that $\dot{x}^\alpha$ and $p^\alpha$ are not necessarily collinear.
As long as the operator $\delta^\alpha_\beta - (p \cdot t)^{-1}S^{\alpha\gamma}\nabla_\beta t_\gamma$ can be inverted, (3.7) determines $\dot{x}^\alpha$ uniquely in terms of $p_\alpha$, $S^{\alpha\beta}$, and $t^\alpha$. In the small-angular momentum context considered here, the invertibility requirement is trivially satisfied as long as $p \cdot t$ is not too small. In particular, $t^\alpha$ cannot be null and proportional to $p^\alpha$. This is discussed further in Sec. IV A below.
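The steps leading to the momentum-velocity relation can be sketched explicitly. The derivation below assumes that the angular momentum equation (3.1b) has the standard form $DS^{\alpha\beta}/d\tau = 2p^{[\alpha}\dot{x}^{\beta]}$, which is not displayed in the text above, and then uses only the centroid condition $S^{\alpha\beta}t_\beta = 0$ and the parametrization $\dot{x} \cdot t = p \cdot t$.

```latex
% Assumes Eq. (3.1b) reads D S^{\alpha\beta}/d\tau = 2 p^{[\alpha} \dot x^{\beta]}.
\begin{align}
  0 = \frac{D}{d\tau} \big( S^{\alpha\beta} t_\beta \big)
    &= p^\alpha (\dot x \cdot t) - \dot x^\alpha (p \cdot t)
       + S^{\alpha\beta}\, \dot x^\gamma \nabla_\gamma t_\beta, \\
  % impose \dot x \cdot t = p \cdot t and divide by p \cdot t
  \big[ \delta^\alpha_\beta - (p \cdot t)^{-1} S^{\alpha\gamma} \nabla_\beta t_\gamma \big]\, \dot x^\beta
    &= p^\alpha, \\
  % invert perturbatively for small angular momentum
  \dot x^\alpha &= p^\alpha + (p \cdot t)^{-1} S^{\alpha\beta}\, p^\gamma \nabla_\gamma t_\beta + O(S^2) .
\end{align}
```

The second line reproduces the operator quoted above, which serves as a consistency check on the assumed form of (3.1b). The last line is the first term of a Neumann series for the inverse operator, which is why invertibility is automatic when $S^{\alpha\beta} = O(\epsilon)$ and $p \cdot t$ is not too small.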
With the centroid fixed by (3.4), the superficially similar spin constraint (3.3) plays a very different role: It is interpreted as a genuine physical restriction on the types of systems which can be described by the spin Hall equations. Combining (3.3) and (3.4) with (3.5) shows that the angular momentum is at least initially given by Eq. (3.8). (The given constraints determine $S^{\alpha\beta}$ only up to an overall sign. Here we fix the sign in order for the MP and spin Hall equations to agree, which may also be viewed as fixing the definition for the sign of $s$.) The spin constraint $S^{\alpha\beta}p_\beta = 0$ therefore amounts to a particular choice of initial condition for the angular momentum tensor. It implies that the spin is purely longitudinal. As explained in Sec. V E below, this is consistent with a wide class of high-frequency electromagnetic wave packets. However, it is also shown there that there are reasonable high-frequency wave packets which are not consistent with (3.8). It is a genuine physical restriction on the types of wave packets which can be described by the spin Hall equations. Our next task is to show that if $S^{\alpha\beta}$ is initially given by (3.8), it retains that form for all time. This can be demonstrated by showing that the constraints (3.3) and (3.5), as well as the null character of $p_\alpha$, are preserved under time evolution. First consider the null character of $p_\alpha$. If we assume that $S^{\alpha\beta} = O(\epsilon)$ for all time, (3.1a) and (3.7) immediately imply that $p_\alpha p^\alpha$ changes at a rate which is at most of order $\epsilon^2$, Eq. (3.9). If $p_\alpha p^\alpha$ is initially zero, it can therefore grow to be at most of order $\epsilon^2$. Again applying the MP equations and the momentum-velocity relation yields Eq. (3.10) for the evolution of $S^{\alpha\beta}p_\beta$. If $S^{\alpha\beta}p_\beta$ is initially zero, as is assumed in condition 3 above, it therefore remains zero up to terms of order $\epsilon^2$. Equation (3.3) is thus preserved under time evolution. Lastly, use of these results together with (3.1b) shows that the spin magnitude is constant at this order. This implies that the initial spin magnitude (3.5) is preserved under time evolution. Combining these results shows that $S^{\alpha\beta}$ retains the form (3.8) for all time, at least up to terms of order $\epsilon^2$. The spin Hall equations (2.17) now follow, up to terms of order $\epsilon^2$, by substituting (3.8) into (3.1a) and (3.7). They may be viewed as the MP equations (3.1) specialized to conditions 1-4 above. This result can also be established by directly showing that (3.8) is a solution to (3.1b) and then using that to deduce the momentum-velocity relation (3.7) [35, Sec. 2.4.4]. Regardless, the spin Hall equations of motion are equivalent to the equations of motion satisfied by a massless dipolar particle.

B. The meaning of the momentum
Now that we have established that the spin Hall equations follow from the MP equations, results known for the latter may be applied to the former. It is natural to ask what exactly is meant by the $p^\alpha$ and the $x^\alpha$ which appear in the spin Hall equations. The fundamental object in classical electromagnetism is the electromagnetic field, so there must be a relation between that field and (say) the momentum. Such a relation is not necessarily clear from the derivation of the spin Hall equations in Ref. [42]. However, the MP equations can be derived by first defining $p^\alpha$ and $S^{\alpha\beta}$ as integrals over an object's stress-energy tensor and then using stress-energy conservation to deduce the evolution equations for those quantities [69,71,74]. Imposition of a centroid condition then provides a definition for $x^\alpha$ in terms of the underlying stress-energy tensor. To summarize, the field can be used to construct the stress-energy tensor, which can in turn be used to construct the momenta and the centroid.
As the spin Hall equations are special cases of the MP equations, we may identify momenta in the former with momenta in the latter. There are, however, subtleties. In particular, different definitions for the momenta may satisfy formally identical evolution equations. This is especially clear when the definitions differ by terms which are considered "higher order." However, it can also occur in other cases. For example, if the triple $(x^\alpha, p^\alpha, S^{\alpha\beta})$ satisfies the MP equations together with an appropriate centroid condition, so does $(x^\alpha, c\,p^\alpha, c\,S^{\alpha\beta})$, where $c$ is any nonzero constant. It follows that, at best, the momenta in the two frameworks can be identified only up to an overall constant.
Despite this, we choose to interpret the momenta in the spin Hall equations to be exactly those which are typically used in derivations of the MP equations and their generalizations: if the object of interest has stress-energy tensor $T^{\alpha\beta}$, and if that object's worldtube is foliated by the one-parameter family of hypersurfaces $\Sigma_\tau$, the linear and angular momenta at time $\tau$ are given by (3.12) [74, Eqs. (5.1) and (5.2)]. Unprimed indices here are associated with $x^\alpha(\tau)$, which is assumed to lie in $\Sigma_\tau$. Primed indices are associated with the integration point $x'$. The bitensors which appear in (3.12) are Jacobi propagators; they can be used to form a basis for solutions to the geodesic deviation (or Jacobi) equation along the geodesic segment which connects $x$ to $x'$. The Jacobi propagators can be computed explicitly [74] using derivatives of Synge's world function $\sigma(x, x')$, which is defined to be one half of the squared geodesic distance between its arguments [75,76]. In terms of this world function, with $\sigma_\alpha = \nabla_\alpha \sigma$, the propagators are given by (3.13), where $[\ldots]^{-1}$ denotes an inverse operation.
In flat spacetime and in inertial coordinates, the above propagators reduce to the expressions (3.14), where $\eta_{\alpha\alpha'} = \mathrm{diag}(-1, 1, 1, 1)$ is the usual Minkowski metric. Substituting these expressions into (3.12) recovers the standard special-relativistic definitions [1,32] for the linear and angular momenta. Even in curved spacetimes, the special-relativistic expressions for the momenta remain good approximations to the exact expressions (3.12) if the coordinates are taken to be Riemann normal coordinates with origin $x^\alpha$.

C. Conservation laws
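The specific choice of propagators appearing in the momenta (3.12) may appear to be obscure. They were chosen, in part, so that if $\kappa^\alpha$ is Killing, the integral over $\Sigma_\tau$ of the stress-energy tensor contracted with $\kappa^\alpha$ equals $p^\alpha \kappa_\alpha + \tfrac{1}{2} S^{\alpha\beta} \nabla_\alpha \kappa_\beta$; this is (3.15). This relation is exact. It is useful because the integral on the left-hand side, which is now identified as a linear combination of the linear and angular momenta, is conserved.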
In fact, if $\kappa^\alpha$ is interpreted not as an ordinary Killing field, but as a generalized Killing field,¹⁰ the momentum definitions (3.12) ensure that (3.15) remains valid in any spacetime; cf. Refs. [71,72]. The space of generalized Killing fields is always ten dimensional in four spacetime dimensions. It also includes all ordinary Killing fields which may exist. If a generalized Killing field $\kappa^\alpha$ is not an ordinary Killing field, $p^\alpha \kappa_\alpha + \tfrac{1}{2} S^{\alpha\beta} \nabla_\alpha \kappa_\beta$ is not necessarily conserved, at least exactly. However, that quantity is approximately conserved in the pole-dipole context in which the quadrupole and higher-order moments of a body are neglected. That is the approximation in which the MP equations hold; indeed, those equations are equivalent to the statement (3.16) that this quantity is constant along the worldline for all generalized Killing fields $\kappa^\alpha$. There are ten independent constants associated with the ten generalized Killing fields, and these completely determine the four components of $p^\alpha$ and the six components of $S^{\alpha\beta}$. The coupling of the linear and angular momenta in the MP equations (3.1) is interpreted as a consequence of the fact that, e.g., a local rotation about one point on the worldline is equivalent to both a rotation and translation when viewed at another point on the worldline.

Regardless, (3.16) is most useful when $\kappa^\alpha$ is an ordinary Killing field. In that case, the associated conservation law can be simplified in the spin Hall context. There, the angular momentum is given by Eq. (3.8), so the conserved quantity takes the form (3.17). A particular component of the linear momentum is therefore conserved. Precisely which component is conserved generically depends on both the spin magnitude $\epsilon s$ and on the choice of $t^\alpha$. There are, however, cases where the $s$-dependent terms vanish in Eq. (3.17). First, the Killing field may be covariantly constant. This occurs for all translations in Minkowski spacetime, and also for null translations along the direction of gravitational wave propagation in pp-wave spacetimes.
For different reasons, there can be no spin correction to the conservation of energy in static spacetimes, at least when $t^\alpha$ and $\kappa^\alpha$ are both identified with the static Killing field. The spin-dependent term in (3.17) will then be proportional to $t_{[\alpha} \nabla_\beta t_{\gamma]}$, which vanishes on account of the spacelike hypersurface which is orthogonal to $t^\alpha$ in that context. Statements of energy conservation in the Schwarzschild spacetime are therefore independent of spin in the spin Hall context. This is not the case in Kerr spacetimes with nonzero angular momentum, which are stationary but not static.

We have only discussed conservation laws associated with Killing vector fields. Other conserved quantities, associated with the existence of Killing-Yano tensors, are known (at least approximately) for the massive MP equations coupled to appropriate centroid conditions [77-82]. However, we have not investigated whether or not these laws also hold for the massless case of interest here.

¹⁰ A generalized Killing field requires for its construction a choice of worldline and a foliation [71,72]. Each generalized Killing field is exactly Killing on the worldline in the sense that $\mathcal{L}_\kappa g_{ab} = \nabla_a \mathcal{L}_\kappa g_{bc} = 0$ there. Away from the worldline, the generalized Killing fields are exact symmetries for separation vectors (defined via the exponential map) rather than for the metric itself. Although the generalized Killing fields do not necessarily satisfy Killing's equation there, they do satisfy certain projections of it.
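The statement that $p^\alpha \kappa_\alpha + \tfrac{1}{2} S^{\alpha\beta} \nabla_\alpha \kappa_\beta$ is a genuine constant can be checked in the simplest setting: flat spacetime, inertial coordinates, and a toy "object" made of a few free point particles. The sketch below is an illustration under those assumptions, not the construction used in the paper. It takes the flat-spacetime Killing fields in the form $\kappa_\alpha = T_\alpha + B_{\alpha\beta} x^\beta$ with $B_{\alpha\beta}$ antisymmetric, and assumes the convention $S^{\alpha\beta}(x) = \sum [(r - x)^\alpha q^\beta - (r - x)^\beta q^\alpha]$ for particles at positions $r^\alpha$ with momenta $q^\alpha$. All function names and numerical values are hypothetical.

```python
def momenta(particles, x):
    """Total p^a and S^{ab} about the point x for a set of point particles.

    particles: iterable of (r, q) pairs with position r^a and momentum q^a.
    """
    p = [sum(q[a] for _, q in particles) for a in range(4)]
    S = [[sum((r[a] - x[a]) * q[b] - (r[b] - x[b]) * q[a]
              for r, q in particles)
          for b in range(4)] for a in range(4)]
    return p, S

def killing_charge(p, S, x, T, B):
    """Evaluate p^a kappa_a + (1/2) S^{ab} d_a kappa_b at the point x.

    kappa_a = T_a + B_{ab} x^b, so that d_a kappa_b = B_{ba}.
    """
    kappa = [T[a] + sum(B[a][b] * x[b] for b in range(4)) for a in range(4)]
    p_kappa = sum(p[a] * kappa[a] for a in range(4))
    spin_term = 0.5 * sum(S[a][b] * B[b][a]
                          for a in range(4) for b in range(4))
    return p_kappa + spin_term

# Two particles with arbitrary illustrative positions and momenta.
particles = [((0.0, 1.0, 0.0, 0.0), (1.0, 0.0, 1.0, 0.0)),
             ((0.0, -1.0, 0.0, 0.0), (1.0, 0.0, -0.5, 0.2))]

# Killing field: a time translation plus a rotation in the x-y plane.
T = (-1.0, 0.0, 0.0, 0.0)
B = [[0.0] * 4 for _ in range(4)]
B[1][2], B[2][1] = 1.0, -1.0

x_a = (0.0, 0.0, 0.0, 0.0)
x_b = (0.3, 2.0, -1.0, 0.5)
charge_a = killing_charge(*momenta(particles, x_a), x_a, T, B)
charge_b = killing_charge(*momenta(particles, x_b), x_b, T, B)
print(charge_a, charge_b)  # equal up to rounding
```

The two evaluations agree because shifting the reference point changes $\kappa_\alpha$ and $S^{\alpha\beta}$ in compensating ways, which is the flat-spacetime counterpart of the coupling between linear and angular momenta noted above.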

D. Neglected terms
The spin Hall equations (2.17) are expected to be valid only through first order in the "small" (although dimensionful) parameter $\epsilon$. Terms nonlinear in the spin have been ignored, as have any contributions from the quadrupole and higher-order moments of a wave packet's stress-energy tensor. Dixon has, however, found all multipolar corrections to the MP equations [69], and using his results, the neglected terms in the spin Hall equations can be estimated. We now discuss how to perform these estimates and under which conditions the spin Hall equations can be justified.
Dixon's laws of motion may be found in, e.g., [69, Eqs. (13.7) and (13.8)]. See also [66, Eqs. (283), (284), and (290)] for a version of those laws which is truncated at quadrupolar order. Inspecting them shows that the MP evolution equation (3.1a) for $p^\alpha$ is corrected by a force term (3.18) in which the gradient of the Riemann tensor is contracted with $J^{\beta\gamma\lambda\rho}$, where $J^{\beta\gamma\lambda\rho}$ denotes the quadrupole moment of the wave packet's stress-energy tensor. The evolution equation (3.1b) for $S^{\alpha\beta}$ is corrected as well, acquiring a torque term (3.19) which couples the quadrupole moment to the Riemann tensor itself. There are further forces and torques which couple a wave packet's octupole and higher-order moments to higher-order derivatives of the Riemann tensor, and together, these corrections provide a complete description for the evolution of the linear and angular momentum along a given worldline. If the worldline is fixed by adopting the centroid condition (3.4), there is, in addition, an evolution equation for $x^\alpha$ which generalizes the spin Hall equation (2.17a). That generalization differs from its spin Hall counterpart due to the presence of terms involving the quadrupole and higher-order moments, as well as terms which are nonlinear in $S^{\alpha\beta}$.

To begin to estimate the consequences of neglecting all of these corrections to the spin Hall equations, it is first necessary to introduce a number of scales. To begin, the background spacetime is assumed to be characterized by a radius of curvature $\ell_R$ and a scale $\ell_{\nabla R}$ over which that radius varies, as quantified in (3.20). These estimates, and all the similar ones below, are assumed to hold in a locally inertial frame which is instantaneously at rest with respect to $t^\alpha$, the vector field chosen to fix the centroid. Another important scale is $\ell_t$, which characterizes variations in $t^\alpha$ and is defined in (3.21). The length scales $\ell_R$, $\ell_{\nabla R}$, and $\ell_t$ characterize the external environment. We now introduce several additional scales which characterize the wave packet itself.
The first of these is the energy $E$, defined in (3.22). In terms of a wave packet's characteristic frequency $\omega$, it is suggested by (2.2) [and by (5.27) below] that $E = \epsilon\omega$. Another relevant scale is provided by the angular momentum. Using (3.5), this is given by (3.23). While the derivation of the spin Hall equations in [42] suggested that $s = \pm 1$ for circularly polarized electromagnetic wave packets, we shall see in Sec. V below that this is true only when the wave packet has a relatively simple structure; it must have "spin angular momentum" but not "orbital angular momentum." More generally, it follows from the integral expression (3.12b) for $S^{\alpha\beta}$ that a rough bound is given by $|s| < \omega \ell_w$, where $\ell_w$ denotes a characteristic width for the wave packet. Below, we assume that $|s|$ remains well below this bound in order not to violate the high-frequency approximation. But even so, it may still be that $|s| \gg 1$.
Our final estimate involves the quadrupole moment $J^{\alpha\beta\gamma\lambda}$. Unlike $p^\alpha$ and $S^{\alpha\beta}$, this rescales under changes in the worldline parameter. Using the dimensionless parameter $\tau$ which is associated with the normalization condition (3.2), it may be shown that the quadrupole moment has dimension (length)$^4$. Its generic order of magnitude is given in (3.24). Magnitudes of the higher-order moments are essentially the same except for the involvement of higher powers of $\ell_w$. Regardless, these scalings imply that a wave packet can be characterized by $s$, $\omega$, $E$, and $\ell_w$. The approximations inherent in the spin Hall equations may now be summarized as follows.

1. Terms nonlinear in the spin can be neglected in the momentum-velocity relation: $\ell_t \gg |s|/\omega$.

2. The instantaneous quadrupole force can be neglected in comparison with the spin-curvature contribution to the linear momentum evolution.

3. The instantaneous quadrupole torque can be neglected in comparison with the $p^{[\alpha} \dot{x}^{\beta]}$ contribution to the angular momentum evolution.

4. The quadrupole torque negligibly affects the spin over the dimensionless integration timescale $\Delta\tau$.

These constraints are all related to terms which are neglected when going from Dixon's laws of motion to the MP equations, and finally to the spin Hall equations. Separately, it is also necessary to assume the following.

5. The wave packet is large compared with its wavelength and it does not have nontrivial structure on very small scales: $\omega \ell_w \gg |s|$.
This is required for the approximate validity of geometric optics, which was used in the derivation of the spin Hall equations in [42]. Alternatively, if the MP equations are used as a starting point, geometric optics must be used to motivate the initial data considered here, for example the null character of $p^\alpha$. This viewpoint is discussed further in Sec. V below. Regardless, condition 5 must be imposed in order for a pulse to maintain its structure. If it were violated, a wave packet would rapidly diffract away.
Let us now examine the consequences of assumptions 1-5. The first of these implies that $t^\alpha$ must vary sufficiently slowly that $\ell_t$ is much larger than $|s|$ wavelengths. Assumptions 1, 2, 3, and 5 imply that the wave packet must be small compared to the curvature scales, $\ell_w \ll \min(\ell_R, \ell_{\nabla R})$ (3.25). This guarantees that, e.g., the octupole terms in the laws of motion are negligible compared with quadrupole terms. It is therefore unnecessary to impose that restriction separately from the ones above. Regardless, (3.25) does not exhaust the content of the first three assumptions. When $|s| \sim 1$, for example, they imply much stricter bounds on $\ell_w$. Writing those bounds in dimensionless form while also incorporating assumption 5 gives (3.26). In this sense, $\ell_w$ cannot be either too large or too small when compared with one wavelength.

Assumptions 1, 2, and 3 arise from comparing the instantaneous magnitudes of different terms in the equations of motion. Assumption 4 tells us how long those equations can be reliably integrated. If $\ell_w$ is approximately constant, it implies that the integrations remain valid over (dimensionful) timescales, or equivalently distances, of the order given in (3.27). Note that the allowable integration time here is much smaller when $|s| \sim 1$ than it is for a maximally spinning wave packet. This is because smaller spin effects are more easily overwhelmed by quadrupole corrections.

One subtlety in this discussion is that it is not necessarily justified to assume that $\ell_w$ remains constant over an integration timescale. Electromagnetic wave packets almost¹¹ invariably diffract and spread out as they propagate. This differs from the behavior of (some) solids and strongly self-gravitating fluids, which can, at least approximately, maintain their dimension over long timescales. We estimate the divergence of an electromagnetic wave packet by analogy with a Laguerre-Gauss beam in flat spacetime.
If such a beam has minimum width $\ell_w^*$, its width at a distance $L \gg \ell_w^*$ away from where that minimum occurs is of order [83] $\ell_w \sim L|s|/(\omega \ell_w^*)$ (3.28). We assume that this relation holds not only for Laguerre-Gauss beams, but generically. Further assuming that the equations of motion are integrated beginning near the point where the beam has attained its minimum width, so $L \sim \Delta t$, substituting (3.28) into (3.26) while also using (3.27) yields (3.29). Increasing $\omega$ at fixed $\ell_w^*$ therefore increases the upper bound on the integration time. This is consistent with what might have been expected from improving the high-frequency approximation. However, it is not possible to increase $\Delta t$ indefinitely: as shown by, e.g., (3.26) in the $|s| \sim 1$ case, increasing $\omega$ results in a decreasing upper bound on $\ell_w^*$. Also note that the maximum allowable integration time decreases when $|s| \gg 1$.
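To get a feel for the spreading estimate (3.28), it can be evaluated for an optical beam. The sketch below is only an order-of-magnitude illustration: it uses units with $c = 1$, so that $\omega$ is an inverse length equal to $2\pi$ divided by the wavelength, and the function name and the sample values (a 500 nm beam with a 1 mm waist propagating over 1 km) are hypothetical choices made here.

```python
import math

def beam_width(distance, s, omega, min_width):
    """Order-of-magnitude beam width from (3.28), in units with c = 1.

    distance  : propagation distance L from the waist (L >> min_width)
    s         : dimensionless angular momentum parameter
    omega     : angular frequency, expressed as an inverse length
    min_width : minimum beam width at the waist
    """
    return distance * abs(s) / (omega * min_width)

wavelength = 500e-9                 # metres
omega = 2.0 * math.pi / wavelength  # inverse metres
width = beam_width(distance=1.0e3, s=1, omega=omega, min_width=1.0e-3)
print(width)  # roughly 0.08 m
```

At fixed waist, the estimated width grows linearly with both the propagation distance and $|s|$, which is why the allowable integration time shrinks for strongly spinning wave packets.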
One example which may be considered is that of a wave packet propagating at a distance $r$ from a static gravitating object of mass $M$. In this case, the curvature scales are $\ell_R \sim r(r/M)^{1/2}$ and $\ell_{\nabla R} \sim r$. Furthermore, if $t^\alpha$ is chosen to be parallel to the static Killing field, $\ell_t \sim r(r/M)$. This implies that $\ell_t \gg \ell_R \gg \ell_{\nabla R}$ in an approximately Newtonian regime. In that regime, it follows from assumptions 2, 3, and 5 that the minimum beam width is bounded as in (3.30). It also follows from (3.29) that, if $r$ does not change too much over the integration time, the bound (3.31) holds. Both bounds together imply that $\Delta t \sim L \ll r$, which significantly limits the applicability of the spin Hall equations in astrophysical systems.

Except for assumption 5 above, our discussion has focused only on neglected terms in the spin Hall equations of motion. However, there are separate errors incurred by using inaccurate initial data in those equations. As discussed in Sec. III A, it is assumed in the spin Hall context that $p^\alpha$ is null, $S_{\alpha\beta} S^{\alpha\beta} > 0$, and $S^{\alpha\beta} p_\beta = 0$. However, we show in the Appendix that in fact, there does not exist any exact wave packet with these properties. An electromagnetic field with nonzero angular momentum must have a timelike momentum, not a null one. Nevertheless, there are large classes of electromagnetic fields for which the spin Hall initial data is approximately valid, and it is in that context that the spin Hall equations should be understood. We do not, however, attempt to estimate the errors incurred by this aspect of the approximation.
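The hierarchy of environmental scales quoted for a static mass can be made concrete with a short calculation. The following sketch simply evaluates $\ell_R \sim r(r/M)^{1/2}$, $\ell_{\nabla R} \sim r$, and $\ell_t \sim r(r/M)$ in geometric units, dropping all numerical factors of order unity. The function name is hypothetical, and the sample values (a ray grazing the Sun, with $GM/c^2 \approx 1.477$ km and $r \approx 6.957 \times 10^5$ km) are chosen here purely as an illustration.

```python
def static_mass_scales(r, M):
    """Order-of-magnitude length scales near a static mass (geometric units).

    Returns (l_R, l_gradR, l_t): the curvature radius, the scale over which
    the curvature varies, and the scale over which the static observer
    field t varies. Numerical factors of order unity are dropped.
    """
    l_R = r * (r / M) ** 0.5
    l_gradR = r
    l_t = r * (r / M)
    return l_R, l_gradR, l_t

M_sun_km = 1.477      # G M / c^2 for the Sun, in kilometres
r_sun_km = 6.957e5    # solar radius, in kilometres

l_R, l_gradR, l_t = static_mass_scales(r_sun_km, M_sun_km)
print(l_R, l_gradR, l_t)  # l_t >> l_R >> l_gradR
```

In this weak-field example the shortest environmental scale is $\ell_{\nabla R} \sim r$, consistent with the conclusion that the integration distance is limited to $L \ll r$.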

IV. THE MANY CENTROIDS OF EXTENDED OBJECTS
Whether an extended object is composed of "ordinary" matter, electromagnetic fields, or anything else, it is not possible to fully describe its location using only a single worldline. There are nevertheless situations in which it is useful to use a single worldline to describe the "averaged" location of an extended object. This is the role of a centroid. However, unlike in Newtonian mechanics, there are many centroids which might reasonably be associated with relativistic systems. One of these centroids might be more useful in one context, while another might be more useful in another context. The various centroids may be interpreted as a particular class of observables.
The centroids considered here are associated with timelike vector fields. As stated in Sec. III A above, any such vector field may be associated with a centroid by requiring that $S^{\alpha\beta} t_\beta = 0$. This interpretation is verified in Sec. IV A below. Distinctions between the various centroids and the implications of those distinctions are discussed in Secs. IV B-IV D.
Many of the ideas described in this section were introduced long ago by, e.g., Pryce [30] and Møller [31]. Some of those ideas have been rediscovered more recently by different communities, who have introduced different terminologies and interpretations. For example, displacements between different centroids have, in certain contexts, been described as relativistic Hall effects [26] and also as Wigner [27] or Wigner-Souriau [28,29] translations. Regardless, properties of different centroids are well understood for massive objects, where $p^\alpha$ is timelike.
What has not been so carefully explored in the literature is the massless case, where $p^\alpha$ is null. This section discusses both the massive and massless cases together. Our main new finding is concerned with the maximum possible separation between different centroids associated with the same physical object. For massive bodies, we recover the classical result [31-33, 84] that all possible centroids are confined to a disk with finite radius. The set of all centroids therefore localizes a massive object to a finite region, providing some reassurance that the centroid definition is a reasonable one. The massless case is different, however. We find that massless spinning objects cannot be localized in this way; they possess centroids separated by arbitrarily large distances. This is potentially problematic, and resolving it involves examining certain subtleties of the approximations used to describe, e.g., electromagnetic wave packets. Our conclusion is that the delocalization of massless objects is not physically relevant because a wave packet cannot truly be massless.
The strategy taken in this section is to first discuss all issues in flat spacetime and in inertial coordinates. All arguments are then straightforward and all results are exact. There are no subtleties involving neglected higher-order terms in the laws of motion. Later, in Sec. IV D, we discuss how, with appropriate caveats, the same results carry over for sufficiently small objects in curved spacetime.

A. Defining a centroid
Our first task is to show that, as claimed above, the choice of $t^\alpha$ is equivalent to a choice of worldline. For simplicity, we work in flat spacetime and use inertial coordinates.
To begin, recall that the definitions (3.12) for an object's linear and angular momentum supposed that a particular worldline had been fixed and that $p^\alpha$ and $S^{\alpha\beta}$ depended only upon a parameter $\tau$ which had been associated with that worldline. Those definitions are easily generalized to avoid the introduction of any particular worldline. Instead, if the hypersurfaces $\Sigma_\tau$ are replaced by $\Sigma_x$, where $x^\alpha$ is now an arbitrary point (not yet associated with any particular worldline), $p^\alpha$ and $S^{\alpha\beta}$ may be viewed as functions of that point. With this redefinition in mind, as long as the $\Sigma_x$ foliate the support of the stress-energy tensor, stress-energy conservation implies that the left-hand side of (3.15) must be independent of $x^\alpha$ for each Killing field $\kappa^\alpha$. The quantities $p^\alpha \kappa_\alpha + \tfrac{1}{2} S^{\alpha\beta} \nabla_\alpha \kappa_\beta$ are therefore conserved in the sense that they are independent of $x^\alpha$. Using this together with the fact that the flat spacetime Killing fields can be written as $\kappa_\alpha = T_\alpha + B_{\alpha\beta} x^\beta$, where the translation $T_\alpha$ and the rotation or boost $B_{\alpha\beta} = B_{[\alpha\beta]}$ are arbitrary constants, the linear and angular momenta associated with two different points, $x$ and $\tilde{x}$, must be related via (4.1). These relations could also have been derived straightforwardly from the momentum definitions (3.12) as well as (3.14). Defining $\tilde{S}^{\alpha\beta} \equiv S^{\alpha\beta}(\tilde{x})$ together with the deviation vector $\xi^\alpha \equiv \tilde{x}^\alpha - x^\alpha$, (4.1b) implies (4.2). The claimed centroid condition (3.4) now amounts to the vanishing of the left-hand side of that equation. And no matter how $\tilde{x}^\alpha$ has been chosen or what form $\tilde{S}^{\alpha\beta}$ may have, that can be arranged by choosing $x^\alpha$ as in (4.3), where $T$ is an arbitrary parameter. Varying over all possible values of this parameter recovers a worldline: what we call the centroid associated with $t^\alpha$. This is true regardless of whether $p^\alpha$ is timelike or null.
It may also be seen that if $\nabla_\beta t_\alpha = 0$, the centroid is tangent to $p^\alpha$ and $T$ may be identified with the worldline parameter $\tau$ which is associated with the normalization condition (3.2). Both of these statements can fail when $t^\alpha$ is not constant.

Next, we verify that the centroid is deserving of its name. First, recall that stress-energy conservation implies that $S^{\alpha\beta}(x)$ does not depend on the hypersurface $\Sigma_x$, as long as all fields fall off sufficiently rapidly and all relevant hypersurfaces completely cut through the support of the stress-energy tensor. We may therefore choose $\Sigma_x$ to be the hyperplane which is orthogonal to $t^\alpha$ at $x^\alpha$. Doing so, while temporarily adopting inertial coordinates which are comoving with $t^\alpha$, use of (3.12) and (3.14) shows that $S^{\alpha\beta} t_\beta = 0$ holds only when the centroid position satisfies (4.4), in which the normalizing factor (4.5) is the energy (3.22). This is the standard nonrelativistic center of mass definition, but with the nonrelativistic mass density replaced by the relativistic energy density $T_{\alpha\beta} t^\alpha t^\beta$. It follows that as long as $T_{\alpha\beta} t^\alpha t^\beta \geq 0$, the centroid must lie inside the convex hull of the spatial support of the stress-energy tensor.

That the energy density should not be negative could be viewed as a consequence of, e.g., the dominant energy condition. That condition is satisfied by essentially all standard classical fields, including electromagnetic ones [85]. One subtlety which does not appear to have been recognized before is that although the dominant energy condition is satisfied by exact electromagnetic field configurations, it is not necessarily satisfied by the approximate fields which might be used to describe high-frequency wave packets. As discussed further in Sec. V E below, there can exist timelike $t^\alpha$ for which the approximate energy density is positive in some regions and negative in others. This has a dramatic consequence: the centroid of an approximate wave packet can appear to lie arbitrarily far from the wave packet itself.
Those centroids are of course spurious. They are a consequence of neglecting higher-order terms in the stress-energy tensor. See further discussion in Secs. IV C and V E below.
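The role played by the sign of the energy density can be seen in a one-dimensional toy model. The sketch below treats the centroid as an energy-weighted average of sample positions, in the spirit of the center-of-mass form described above; the function name and the sample densities are hypothetical, and the negative weight merely stands in for the negative regions that can appear in an approximate stress-energy tensor.

```python
def energy_centroid(positions, energy_densities):
    """Energy-weighted mean position on a one-dimensional grid.

    positions        : sample points along one spatial axis
    energy_densities : energy density samples at those points
    """
    total_energy = sum(energy_densities)
    weighted = sum(x * e for x, e in zip(positions, energy_densities))
    return weighted / total_energy

# Non-negative energy density: the centroid stays inside the support.
inside = energy_centroid([-1.0, 0.0, 1.0], [1.0, 2.0, 1.0])

# Mixed-sign "approximate" energy density supported on [0, 1]: the
# centroid is pushed far outside the support.
outside = energy_centroid([0.0, 1.0], [1.0, -0.9])

print(inside, outside)  # 0.0 and approximately -9.0
```

With non-negative weights the result is a convex combination of the sample points and so cannot leave their hull. Once the weights have mixed signs and nearly cancel, the total energy in the denominator becomes small and the "centroid" can be displaced by an arbitrarily large amount.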
Another comment which can be made is concerned with the fact that it is common in the literature [31,69,71,86,87] to use $S^{\alpha\beta} p_\beta = 0$ as a centroid condition instead of $S^{\alpha\beta} t_\beta = 0$, particularly (but not exclusively [53, 88-90]) for objects with timelike momenta. This has the apparent advantage that the results do not depend on extraneous choices such as that of $t^\alpha$. And in the massive case, there is nothing wrong with this; Eq. (3.7) remains valid with $t^\alpha = p^\alpha$. But this fails for massless objects. In that case, (4.2) remains valid, so replacing $t^\alpha$ there by $p^\alpha$ shows that $x$ must be a solution to (4.6). If the left-hand side of that equation is nonzero and not proportional to $p^\alpha$, no such solution exists. If the left-hand side is instead proportional to $p^\alpha$, any $x^\alpha$ which satisfies $(\tilde{x} - x) \cdot p = \mathrm{const}$ will do. That restricts the centroid only to a three-dimensional null hypersurface, not a worldline. In either case, $S^{\alpha\beta} p_\beta = 0$ cannot be interpreted as a centroid condition for massless objects. While this has been noted before, details were scant [39,64].

In some of the literature which does attempt to use $S^{\alpha\beta} p_\beta = 0$ as a centroid condition for massless objects [52, 88-90], there is a relation derived between the momentum and the velocity which suggests that a centroid does indeed exist. However, that relation involves a ratio whose denominator (in a curved spacetime) is $R_{\alpha\beta\gamma\lambda} S^{\alpha\beta} S^{\gamma\lambda}$. The momentum-velocity relation therefore fails in the flat spacetime context of our present discussion. It also fails at least somewhere on many worldlines which might be considered in more general spacetimes. It does not appear to us to be viable to attempt to impose a condition which fails to be robust or to have reasonable limits. In particular, the lack of a viable flat-spacetime limit implies that even when $S^{\alpha\beta} p_\beta = 0$ does result in a unique worldline, it will not describe a centroid in the sense of (4.4).

B. Displacements between different centroids
As there are many different centroids which may be used to describe an extended object, it is natural to ask how these are related to one another. The answer has long been known for massive objects, as described in, e.g., Refs. [32,33,84,91,92]. There, a canonical centroid was defined via $S^{\alpha\beta} p_\beta = 0$ and separations were derived between this centroid and others. As noted above, a canonical centroid cannot be defined in this way when considering massless objects. Nevertheless, only minor changes are needed to consider the differences between arbitrary reference centroids. We discuss both the massless and massive cases below. For simplicity, we also continue to work in flat spacetime and to use inertial coordinates.
Consider two future-directed timelike vector fields $t^\alpha$ and $\tilde{t}^\alpha$ and the corresponding centroid conditions (4.7). These define two worldlines, the points on which may be denoted by $x^\alpha$ and $\tilde{x}^\alpha$. Finding a unique displacement $\xi^\alpha = \tilde{x}^\alpha - x^\alpha$ between them requires that points on each worldline be identified in a particular way. It is convenient to do so by supposing that $\xi \cdot t = 0$ (4.8), in which case (4.2) and (4.7) imply that $\xi \cdot \tilde{t} = 0$ and that the displacement is given by (4.9). This displacement vector is exact in flat spacetime, is valid for both massive and massless objects, and there is no constraint on the nature of the angular momentum. One immediate consequence is that all centroids coincide for nonspinning objects.

Given Eq. (4.7), it is always possible to introduce a spin vector $S^\alpha$ satisfying (4.10), which is unique only up to arbitrary multiples of $t^\alpha$. In terms of any such spin vector, the displacement (4.9) can be written in the form (4.11). This is a spacelike vector orthogonal to $t^\alpha$, $\tilde{t}^\alpha$, and $S^\alpha$. It may be used to relate any two centroids to one another. It still does not make any assumptions regarding the nature of the spin or the object's mass.
If we now specialize to the spin Hall case where $S^{\alpha\beta}$ is given by Eq. (3.8) and $p^\alpha$ is null, the spin vector may be identified with the expression (4.12). As this is proportional to $p^\alpha$, it may be described as a "longitudinal spin." The displacements in this case are given by (4.13), which are transverse to the momentum.
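The orthogonality properties of the displacement can be verified with a small flat-spacetime computation. The sketch below assumes one particular set of conventions, which may differ from the paper's by overall signs: the metric signature is $(-,+,+,+)$, the angular momentum about a point shifts as $\tilde{S}^{\alpha\beta} = S^{\alpha\beta} - \xi^\alpha p^\beta + \xi^\beta p^\alpha$, and consequently $\xi^\alpha = S^{\alpha\beta} \tilde{t}_\beta / (p \cdot \tilde{t})$ when $\xi \cdot t = 0$. The function names and numerical values are hypothetical.

```python
ETA = (-1.0, 1.0, 1.0, 1.0)  # Minkowski metric, signature (-,+,+,+)

def dot(u, v):
    """Lorentzian inner product of two contravariant 4-vectors."""
    return sum(ETA[a] * u[a] * v[a] for a in range(4))

def centroid_displacement(S, p, t_new):
    """Displacement xi^a = S^{ab} t_new_b / (p . t_new).

    S is the angular momentum about the reference centroid, i.e. the one
    for which S^{ab} t_b = 0; t_new labels the second family of observers.
    """
    t_low = [ETA[b] * t_new[b] for b in range(4)]
    pt = dot(p, t_new)
    return [sum(S[a][b] * t_low[b] for b in range(4)) / pt
            for a in range(4)]

# Reference observer at rest; null momentum along z with energy E.
E, spin = 2.0, 1e-3
t = [1.0, 0.0, 0.0, 0.0]
p = [E, 0.0, 0.0, E]

# Longitudinal spin along z: only S^{xy} = -S^{yx} is nonzero,
# so that S^{ab} t_b = 0 and S^{ab} p_b = 0.
S = [[0.0] * 4 for _ in range(4)]
S[1][2], S[2][1] = spin, -spin

# Second observer boosted along x (orthogonal to the momentum) at speed V.
V = 0.6
gamma = 1.0 / (1.0 - V * V) ** 0.5
t_new = [gamma, gamma * V, 0.0, 0.0]

xi = centroid_displacement(S, p, t_new)
print(xi)  # purely along y, with magnitude (spin / E) * V
```

In this example the displacement is orthogonal to $t^\alpha$, $\tilde{t}^\alpha$, and $p^\alpha$, and its magnitude is $(S/E)V$, as expected for a boost orthogonal to the direction of propagation.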

C. Localization of extended objects
As the choice of $t^\alpha$ is essentially arbitrary, one might hope that the centroids associated with different vector fields are not too different. In particular, it is natural to ask if they are all confined to a finite region, perhaps within the convex hull of the spacelike support of the object's stress-energy tensor. As noted above, this does indeed follow from (4.3) when $T_{\alpha\beta} t^\alpha t^\beta \geq 0$. It is also possible to show, without using $T^{\alpha\beta}$, that the set of all possible centroids is localized whenever $p^\alpha$ is timelike; cf. [91, Sec. 6.3] or [32, Sec. 3.1b]. We now discuss both the massive and the massless cases and show that in the latter context, some "centroids" can be arbitrarily distant from one another.
Assume that some future-directed timelike $t^\alpha$ has been fixed and measure all deviations as being with respect to the centroid for which $S^{\alpha\beta} t_\beta = 0$. If points on the centroids associated with $\tilde{t}^\alpha$ and $t^\alpha$ are identified using (4.8), it follows from (4.11) that the square of the proper distance between those points is given by (4.14), where the operator defined in (4.15) projects vectors into the space orthogonal to both $t^\alpha$ and $\tilde{t}^\alpha$ (when those vectors are not parallel).
To discuss the implications of this in the massive case, it is convenient to now choose $t^\alpha = p^\alpha$ so all deviations are measured with respect to the centroid defined by $S^{\alpha\beta} p_\beta = 0$. Then (4.14) implies (4.16), where $S \equiv (\tfrac{1}{2} S_{\alpha\beta} S^{\alpha\beta})^{1/2}$ characterizes the magnitude of the spin, $m \equiv (-p^2)^{1/2}$ is the mass, $V$ [defined in (4.17)] is the relative speed between $t^\alpha$ ($= p^\alpha$) and $\tilde{t}^\alpha$, and $\theta \in [0, \pi]$ is the angle between $\tilde{t}^\alpha$ and $S^\alpha$ which would be measured by an observer whose 4-velocity is tangent to $t^\alpha$. It is evident from (4.16) that the magnitude of the centroid displacement can be no larger than the Møller radius $S/m$. All centroids are therefore confined to a disk with that radius. Unless energy conditions are violated, there is a sense in which the disk of centroids must be smaller than the object itself.

The massless case is more subtle. For simplicity, we do not discuss the most general massless case, but only the spin Hall case in which the angular momentum is restricted via $S^{\alpha\beta} p_\beta = 0$. As it is not possible to choose $t^\alpha$ to be proportional to $p^\alpha$ in this context, we assume that $t^\alpha$ and its associated centroid have been fixed in some other way and that all other centroids are measured with respect to it. The distance between the centroid determined by $t^\alpha$ and the one determined by $\tilde{t}^\alpha$ is then found by substituting (4.12) into (4.14). This yields (4.18) for a massless object, where $E$ is the energy (3.22), and $S$, $V$, and $\theta$ have the same meanings as in (4.16). In this case, it also follows from (3.5) that $S = \epsilon|s|$. The prefactors are essentially the same in the massless displacement (4.18) and its massive counterpart (4.16); the energy $E$ which appears in the massless case is simply replaced by the mass $m$, which is of course the energy in the zero-momentum frame. Up to this replacement, both displacements coincide when $V \ll 1$. Indeed, all centroids determined by "nearly comoving" observers satisfy $\xi \leq (S/E)V$. In both the massless and the massive cases, they are contained within the "generalized Møller radius" $S/E$.
If the magnitude of $V$ is not restricted, the massless and massive displacements still coincide when $\tilde t^\alpha$ is aligned, antialigned, or orthogonal to $S^\alpha$ in a frame comoving with $t^\alpha$. In the aligned and antialigned cases, there is no effect at all: $\xi = 0$. In the orthogonal case where $\theta = \pi/2$, we have instead that the proper distance between two centroids is $\xi = (S/E)V$. Since $V < 1$, this is again bounded by the generalized Møller radius. In the massless case and for a high-frequency wave packet, (5.27) below shows that this bound can be written as (4.19), where $\omega$ denotes the angular frequency of the field in the frame comoving with $t^\alpha$. The displacement is therefore less than approximately $|s|$ wavelengths. It is in agreement with discussions of the relativistic Hall effect [26] and the Wigner translations [27], where the energy centroid of a beam with nonzero angular momentum was shown to experience a similar shift after applying a boost orthogonal to the direction of propagation.
What does not appear to have been noticed before is that the maximum displacement in the massless case does not occur at $\theta = \pi/2$ (except in the $V \ll 1$ limit). For fixed $V$, the angle which maximizes $\xi$ in (4.18) is instead given by (4.20). Using that, the maximum displacement between massless centroids can be evaluated explicitly, and it diverges as $V \to 1$. Unlike in the massive case, the set of all possible centroids is not bounded for a massless spinning body. Arbitrarily large displacements can occur between the centroids associated with $t^\alpha$ and $\tilde t^\alpha$ when those vectors differ by ultrarelativistic boosts which are almost, but not quite, parallel to the momentum.
This presents an apparent problem for the formalism. One interpretation is simply that massless spinning objects, whatever those may be, cannot be localized. However, this is unacceptable if we interpret certain electromagnetic wave packets as examples of massless spinning objects. Physically realizable wave packets clearly can be localized, and any worldlines which fail to lie near the support of their stress-energy tensors hardly deserve to be called "centroids". Therefore, either our centroid definition is inappropriate or there is something wrong with our interpretation of electromagnetic wave packets as spinning objects with null momenta. The first possibility can be discounted by recalling the discussion following (4.3).
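The contrast between the bounded massive case and the unbounded massless case can be explored numerically. The sketch below is illustrative only: the displacement equations themselves are not reproduced in this text, so it assumes the forms $\xi = (S/m) V \sin\theta$ for the massive case and $\xi = (S/E) V \sin\theta/(1 - V\cos\theta)$ for the massless case. These assumed forms are chosen because they reproduce every property stated above (agreement for $V \ll 1$, vanishing displacement in the aligned and antialigned cases, and $\xi = (S/E)V$ at $\theta = \pi/2$); the function names and the grid search are likewise hypothetical.

```python
import numpy as np

def xi_massive(V, theta, S=1.0, m=1.0):
    """ASSUMED massive displacement: xi = (S/m) * V * sin(theta)."""
    return (S / m) * V * np.sin(theta)

def xi_massless(V, theta, S=1.0, E=1.0):
    """ASSUMED massless displacement: xi = (S/E) * V * sin(theta) / (1 - V cos(theta))."""
    return (S / E) * V * np.sin(theta) / (1.0 - V * np.cos(theta))

def massless_maximum(V, S=1.0, E=1.0, samples=200001):
    """Grid search over theta in [0, pi]; returns (theta_max, xi_max)."""
    theta = np.linspace(0.0, np.pi, samples)
    xi = xi_massless(V, theta, S, E)
    i = int(np.argmax(xi))
    return theta[i], xi[i]

if __name__ == "__main__":
    for V in (0.1, 0.9, 0.999):
        theta_max, xi_max = massless_maximum(V)
        gamma = 1.0 / np.sqrt(1.0 - V**2)
        # Under the assumed form, the maximum sits at cos(theta) = V
        # and has magnitude (S/E) * gamma * V, which grows without bound.
        print(f"V={V}: cos(theta_max)={np.cos(theta_max):.4f}, "
              f"xi_max={xi_max:.4f}, gamma*V={gamma * V:.4f}")
```

Under these assumptions, the maximizing angle approaches zero as $V \to 1$, which matches the statement that the largest displacements arise for boosts that are nearly parallel to the momentum.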
The resolution is that the momentum of a spinning electromagnetic wave packet is not actually null. It must be timelike. We have been assuming above that the momentum is null, and given reasonable assumptions, this is approximately true for a high-frequency wave packet. Indeed, it is true through leading and subleading orders in a high-frequency approximation, and that is all that the spin Hall equations can describe (as they omit terms of order $\epsilon^2$). However, it is demonstrated in Sec. V D below, using an explicit family of wave packets, that the momentum is always timelike when going to one higher order. The mass is found to be of order $\epsilon/\ell_w$, where $\ell_w$ is again a characteristic width for the wave packet. Using Eq. (4.16), this implies that the maximum deviation between centroids is of order $\ell_w$. Although that is the intuitively expected result, establishing it requires that calculations be performed to a relatively high order. Truncating the approximation too early results in a conclusion which is not even qualitatively correct.
At lower orders in the high-frequency approximation, one can say only that the momentum is approximately null. Mathematically, we are considering 1-parameter families of wave packets in which, e.g., $\lim_{\epsilon \to 0} p_\alpha p^\alpha = 0$. However, what is physically interesting is an example of such a family at a particular ("small") value of $\epsilon$. In that context, a vector can be "approximately" null only with respect to some restricted class of observers. If a vector is actually timelike, for example, there clearly exist some observers for whom it appears to be stationary and some observers for whom it appears to be "nearly null." There is therefore a sense in which the high-frequency approximation implicitly selects a kind of rest frame. It is reliable only in frames which are not too highly boosted with respect to that rest frame.

D. Centroids in curved spacetimes
The main results obtained thus far in this section are that i) the displacements between different centroids are given by Eq. (4.11), ii) the maximum magnitude of the displacement is given by Eq. (4.16) when $p_\alpha$ is timelike, and iii) the displacement is unbounded when $p_\alpha$ is null. These results were derived in flat spacetime, and in that context, they are exact. There are no corrections due to higher-order spin effects, quadrupole moments, or anything else. We now discuss the sense in which our results remain at least approximately valid for sufficiently small objects in generic spacetimes. One would expect from the equivalence principle that everything remains at least approximately valid even in curved spacetimes, and indeed it does.
We begin by obtaining a curved spacetime form for the transformation law (4.1b) between the angular momentum evaluated about different points $x^\alpha$ and $\tilde x^\alpha$. It is useful to first use the exponential map to define a deviation vector $\xi^\alpha$ between those points, so that $\tilde x = \exp_x \xi$. In a Riemann normal coordinate system with origin $x^\alpha$, this takes the standard form $\xi^\alpha = \tilde x^\alpha - x^\alpha$. Regardless of the coordinate system, the displacement vector is the negative gradient of Synge's world function: $\xi^\alpha = -\sigma^\alpha(x, \tilde x)$. We can use this to find covariant Taylor expansions in the style of, e.g., [76, Sec. 6]. Letting primed indices be associated with $\tilde x^\alpha$ and unprimed ones with $x^\alpha$, the relevant expansion for the angular momentum tensor involves $g^{\alpha'}{}_{\alpha}$, the bitensor which parallel propagates vectors from $x^\alpha$ to $\tilde x^\alpha$ along the geodesic segment which connects those points. A similar expansion may also be used to relate the linear momentum at $x^\alpha$ to the linear momentum at $\tilde x^\alpha$.
Regardless, continuing requires that we compute the gradients of $p_\alpha$ and $S^{\alpha\beta}$. While the argument can be generalized, consider for simplicity displacements $x^\alpha \to \tilde x^\alpha$ which lie entirely within the same "constant-time" hypersurface, so $\Sigma_x = \Sigma_{\tilde x}$. Then, associating double-primed indices with an integration point $x''$, those gradients follow from (3.12), with all bitensors evaluated at $(x, x'')$. If the maximum distance, within $\Sigma_x$, between $x$ and any integration point where $T^{\beta''\alpha''} \neq 0$ is of order $\ell_w$, standard coincidence limits for the world function [76] constrain those bitensors; in particular, $\nabla_\gamma H^{\alpha'}{}_{\alpha}$ and $\nabla_\gamma K^{\alpha'}{}_{\alpha}$ are both of order $\ell_w/\ell_R^2$, where $\ell_R$ denotes the curvature length scale introduced in Sec. III D. Moreover, using the energy (3.22) to estimate the error terms, it follows that through the first order in $\xi^\alpha$, the linear and angular momenta transform as in (4.25). In flat spacetime, the error terms there are exactly zero; cf. (4.1b).
We would now like to use (4.7) to associate one centroid $\tilde x^\alpha$ with the timelike vector field $\tilde t^{\alpha'}$ and another centroid $x^\alpha$ with the timelike vector field $t^\alpha$. Repeating the same steps as in Secs. IV A and IV B, it is again convenient to identify points on both worldlines using $\xi \cdot t = 0$, which we now assume to be compatible with the assumption that $\Sigma_x = \Sigma_{\tilde x}$. Equation (4.25) can then be shown to imply that $g_{\alpha\alpha'} \xi^\alpha \tilde t^{\alpha'} = O(\xi \ell_w^2/\ell_R^2)$. It also follows that (4.9) generalizes to (4.26). This assumes that $\xi$ is not so large that terms of order $\xi^2$ become important in (4.25). More generally, the basic special relativistic form for this expression remains valid when $\ell_w$ and $\xi$ are both much smaller than $\ell_R$. Most of the above special-relativistic results remain valid in this context. Technically, however, one can no longer conclude that there are massless centroids which are arbitrarily distant from one another, as then $\xi$ must be large. It is nevertheless clear from the flat spacetime limit that massless objects must still be problematic in curved spacetime.

E. The irrelevance of centroid conditions
One important consequence of Eq. (4.26) is that the centroid depends only (quasi)locally on the timelike vector field used to define it. If $t^\alpha$ and $\tilde t^\alpha$ coincide in, say, neighborhoods of emission and observation points, whatever they do in between the emitter and the observer is irrelevant: The associated centroids will coincide at both the beginnings and ends of their journeys. This is illustrated schematically in Fig. 1. Physical meaning can be attributed to $t^\alpha$ only in the neighborhoods of the emission and the observation events. What it does elsewhere is essentially irrelevant.
Figure 1. A massless spinning object, as described by two different families of timelike observers, $t^\alpha$ and $\tilde t^\alpha$. The displacement between two points on the worldlines $x^\alpha$ and $\tilde x^\alpha$ is described by the shift vector $\xi^\alpha$. Since $t^\alpha = \tilde t^\alpha$ near the emitter and the receiver, we have $\xi^\alpha = 0$ in these regions, and the two worldlines coincide, up to relative error terms of order $(\ell_w/\ell_R)^2$.
As a consequence, the effects of different spin supplementary conditions can be understood without repeatedly solving the equations of motion with different conditions. Solving the Mathisson-Papapetrou equations, whether massive or massless, with the spin supplementary condition $S_{\alpha\beta} t^\beta = 0$ requires an apparently extraneous specification of the vector field $t^\alpha$. Indeed, that vector field must be specified not only at a point, but in a neighborhood of the entire trajectory. The observation of the previous paragraph shows that all that matters is the specification of the vector field near the emission and the observation points, at least if the displacement never gets too large. It may also be noted that it is only near the emitter and the observer that there necessarily exists a natural choice for $t^\alpha$: It may be identified near those points with the 4-velocities of the emitter and the observer.
The transformations (4.25) and (4.26) may also be interpreted as a way to generate new solutions to the equations of motion. Given a triple $(x^\alpha, p_\alpha, S_{\alpha\beta})$ which satisfies the MP equations with the centroid condition $S_{\alpha\beta} t^\beta = 0$, the triple $(\exp_x \xi, \tilde p_\alpha, \tilde S_{\alpha\beta})$ satisfies those same equations but with the centroid condition $\tilde S_{\alpha\beta} \tilde t^\beta = 0$. This statement is exact in flat spacetime and approximate more generally.

V. MOMENTA OF CIRCULARLY POLARIZED WAVE PACKETS
As discussed in Sec. III, there are essentially only two conditions required for the validity of the spin Hall equations in the form considered here. First, the effect of the quadrupole and higher-order moments must be negligible. When this occurs can be estimated using the arguments in Sec. III D. Second, the spin Hall equations require for their validity that $p_\alpha$ be null and that $S^{\alpha\beta}$ have the form (3.8). These may be viewed as restrictions on the initial data for the MP equations. The claim has been that such conditions model an electromagnetic wave packet. However, this connection appears in the literature as an unsubstantiated (and usually unstated) hypothesis. It is implied by results in the Appendix that the spin Hall initial data cannot hold exactly, at least when $s \neq 0$. The purpose of this section is to understand if there is an appropriate approximate sense in which the spin Hall initial data is actually associated with electromagnetic wave packets.
We show that to the expected orders, there are indeed generic wave packets which are compatible with the spin Hall initial data. Nevertheless, we show that those data inevitably break down at one higher order; the momentum becomes timelike, for example. We also show that there are reasonable wave packets which are not even approximately described by the spin Hall initial data. For them, $S^{\alpha\beta} p_\beta$ is nonzero even at the lowest nontrivial order. Said differently, the spin is not purely longitudinal. The existence of these exceptions emphasizes that in applications, the connection between the "microscopic" (the electromagnetic field structure) and the "macroscopic" (the spin Hall equations or generalizations) is nontrivial and must be considered on a case-by-case basis.
All calculations in this section are performed in flat spacetime and in inertial coordinates. However, as we are concerned only with finding initial linear and angular momenta for the MP equations, all calculations for sufficiently small wave packets are confined to small regions in spacetime. Flat spacetime calculations therefore remain excellent approximations for sufficiently compact wave packets even in curved spacetimes, as long as the inertial Minkowski coordinates are reinterpreted as an appropriate system of Riemann normal coordinates.

A. A family of wave packets
Our first task is to construct a sufficiently general class of approximate electromagnetic wave packets. We work in a high-frequency approximation and consider a family of vector potentials $A_\alpha$. These vector potentials are assumed to be given by the asymptotic series (5.1), where the amplitudes $\psi^{(n)}_\alpha$ and the eikonal $u$ are independent of the small parameter $\tilde\epsilon > 0$. Note that the $\tilde\epsilon$ which appears here is related to, but generically distinct from, the $\epsilon$ which appears in the spin Hall equations [and in (2.1)]. If $c$ is any constant, $A_\alpha/\epsilon$ is invariant, at leading order, under all transformations where $u \to cu$ and $\epsilon \to c\epsilon$. It is convenient for now to avail of this ambiguity by allowing $\tilde\epsilon$ to differ from $\epsilon$ by a convenient constant. Then $u$ is not restricted to have units $(\text{length})^2$.
Again defining the leading-order wave vector $k_\alpha \equiv -\nabla_\alpha u$, Maxwell's equations and the Lorenz gauge condition imply that $u$ must be a solution to the eikonal equation $k_\alpha k^\alpha = 0$. Maxwell's equations and the gauge condition also imply that, for all $n \geq 0$, the amplitudes must satisfy the transport equations (5.3) and the constraint equations (5.4) [39,42]. These equations are hierarchical. A solution to the $n = 0$ equation is required to solve the $n = 1$ equation, an $n = 1$ solution is required to solve the $n = 2$ equation, etc.
We now specialize to fields which are, at least at the leading order, plane fronted$^{12}$ and traveling in the $+z$ direction in the inertial coordinate system $(t, x, y, z)$. This can be represented mathematically by choosing the eikonal to be linear in $t - z$. It is then convenient to solve the transport and constraint equations by interpreting $u$ as a null coordinate and by defining the complementary coordinates $v$, $\zeta$, and $\bar\zeta$. The four scalars $(u, v, \zeta, \bar\zeta)$ form a null coordinate system in which the Minkowski line element takes a simple form. These coordinates can be associated with a complex null tetrad $(k_\alpha, n_\alpha, m_\alpha, \bar m_\alpha)$ via (5.8). The only nonvanishing inner products among this tetrad are $m \cdot \bar m = 1$ and $k \cdot n = -1$. In terms of it, the metric is $g_{\alpha\beta} = 2[m_{(\alpha} \bar m_{\beta)} - k_{(\alpha} n_{\beta)}]$. Note that unlike in Sec. II, the $m_\alpha$ and $\bar m_\alpha$ here are ordinary fields on spacetime. They are not defined over the cotangent bundle.
We now restrict to waves which are not only plane fronted but also circularly polarized at leading order. Mathematically, this is taken to mean that $\psi^{(0)}_\alpha$ is assumed to be null (and nonzero). The constraint equation (5.4) then implies that the leading-order amplitude must be proportional either to $m_\alpha + \chi k_\alpha$ or to $\bar m_\alpha + \chi k_\alpha$, where $\chi$ is any scalar [see (2.7)]. Terms proportional to $k_\alpha$ are pure gauge at leading order, although not necessarily at higher orders$^{13}$. Regardless, we set $\chi = 0$ for simplicity. Circularly polarized fields are then described by $\psi^{(0)}_\alpha = \psi m_\alpha$ [Eq. (5.9)] or by $\psi^{(0)}_\alpha = \psi \bar m_\alpha$, depending on the handedness of the field. Unlike in Sec. II, we do not assume that the $\psi$ which appears here is necessarily real. Use of Eq. (5.3) shows that it is independent of $v$ but otherwise arbitrary: $\psi = \psi(u, \zeta, \bar\zeta)$.
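The stated tetrad properties can be checked numerically. The following sketch is illustrative rather than a reproduction of (5.8): it assumes signature $(-,+,+,+)$, a particular normalization in which $k^\alpha$ and $n^\alpha$ are null vectors along $\pm z$ scaled by $1/\sqrt{2}$, and $m^\alpha = (\partial_x + i\partial_y)/\sqrt{2}$. The variable and function names are hypothetical, and the assignment of $m_\alpha$ versus $\bar m_\alpha$ to a given handedness may differ from the convention used in the text.

```python
import numpy as np

# Minkowski metric in inertial coordinates (t, x, y, z), signature (-,+,+,+).
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

# ASSUMED tetrad components (contravariant); normalization is a choice.
c = 1.0 / np.sqrt(2.0)
k_up = c * np.array([1, 0, 0, 1], dtype=complex)    # null, along +z
n_up = c * np.array([1, 0, 0, -1], dtype=complex)   # null, along -z
m_up = c * np.array([0, 1, 1j, 0], dtype=complex)   # complex transverse vector
mbar_up = m_up.conj()

def lower(v):
    """Lower an index with the Minkowski metric."""
    return eta @ v

def dot(a, b):
    """Bilinear inner product a_alpha b^alpha (no complex conjugation)."""
    return a @ eta @ b

def metric_from_tetrad():
    """Rebuild g_{ab} = 2[m_(a mbar_b) - k_(a n_b)] from the tetrad."""
    k, n, m, mb = (lower(v) for v in (k_up, n_up, m_up, mbar_up))
    return (np.outer(m, mb) + np.outer(mb, m)
            - np.outer(k, n) - np.outer(n, k))

if __name__ == "__main__":
    print("k.n =", dot(k_up, n_up).real)        # expected -1
    print("m.mbar =", dot(m_up, mbar_up).real)  # expected +1
    print("metric recovered:", np.allclose(metric_from_tetrad(), eta))
```

With this choice, $k_\alpha = -\nabla_\alpha u$ corresponds to $u = (t - z)/\sqrt{2}$, which is one eikonal consistent with a plane-fronted wave traveling in the $+z$ direction.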
We may now substitute the leading-order amplitude (5.9) into the transport equation (5.3) and the constraint equation (5.4) in order to derive the subleading amplitude $\psi^{(1)}_\alpha$. One solution is given by (5.10). Two orders beyond geometric optics, we find (5.11). Together, (5.1), (5.9), (5.10), and (5.11) describe an approximate family of circularly polarized electromagnetic fields in flat spacetime. That family is parametrized by the geometric-optics scalar amplitude $\psi$ and by the constant $\tilde\epsilon$, which is related to the inverse frequency of the field. Its properties are explored in the remainder of this section. If desired, fields with the opposite helicity can be considered by swapping $m_\alpha$ and $\bar m_\alpha$ in the amplitudes (5.9), (5.10), and (5.11).
Vector potentials are not directly measurable. More interesting are the field strengths $F_{\alpha\beta}$, and for any vector potential with the form (5.1), a direct calculation gives these as (5.12). For the amplitudes constructed above, this evaluates to (5.13). At leading (geometric optics) order, $F_{\alpha\beta}$ is a linear combination of $k_{[\alpha} m_{\beta]}$ and $k_{[\alpha} \bar m_{\beta]}$. At subleading order, its tensorial structure changes; it acquires terms proportional to $k_{[\alpha} n_{\beta]}$ and $m_{[\alpha} \bar m_{\beta]}$. These corrections may be interpreted as modifying the apparent "polarization state" at higher orders. More broadly, the tensorial structure of the electromagnetic field may alternatively be understood by noting that the Newman-Penrose scalars associated with the tetrad $(k_\alpha, n_\alpha, m_\alpha, \bar m_\alpha)$ satisfy $\Phi_i = O(\epsilon^{2-i})$ for all $i = 0, 1, 2$. At leading order, there is only $\Phi_2$. At subleading order, there is also $\Phi_1$. At two orders beyond geometric optics, there is also $\Phi_0$. This is a special case of the peeling result for high-frequency fields which was obtained in [39].

B. Stress-energy tensors
Our goal is to compute linear and angular momenta, which are determined by integrals of stress-energy tensors. The electromagnetic stress-energy tensor is given by (5.14), and through leading and subleading orders, substitution of Eq. (5.13) into this expression results in (5.15). In regions where $|\psi| \neq 0$, this can be written more suggestively as (5.16), where the covector $\tilde k_\alpha$ which appears there can be interpreted as a modified wave vector. Like the leading-order wave vector, this is null to the order considered. At leading order, $T_{\alpha\beta} \propto k_\alpha k_\beta + O(\epsilon)$, and it is somewhat remarkable that at one order higher, this is modified only to $T_{\alpha\beta} \propto \tilde k_\alpha \tilde k_\beta + O(\epsilon^2)$. All observers therefore agree on the direction of momentum density, even at the subleading order.
Similar factorizations of the stress-energy tensor$^{14}$ have been discussed in more general contexts in Ref. [39]. A result like the one found here was shown to arise whenever $k_\alpha$ is shear-free. If there is shear in the leading-order rays, different observers generically disagree on the direction of the subleading momentum density. Moreover, even in the shear-free case, observers typically disagree on the direction of the momentum density once terms of order $\tilde\epsilon^2$ are included; stress-energy tensors at that order are generically more complicated.

C. Linear and angular momenta
The stress-energy tensor in Eq. (5.16) may be used to compute the net linear momentum on a $t = \text{const.}$ hypersurface. Using Eq. (3.12), and putting primes on integration variables and on the objects which depend on the integration variables, one obtains (5.18). It follows from Eq. (5.17) that $p_\alpha p^\alpha = O(\epsilon^2)$; the momentum is null through leading and subleading orders. However, it is not necessarily true that $p_\alpha$ is proportional to $k_\alpha$ beyond leading order.
More interesting is the angular momentum $S^{\alpha\beta}$, which we compute about a point $x^\alpha = (t, x^i)$ which lies within the hypersurface of integration. There are two interesting sets of components. The first follows from Eqs. (3.12) and (5.16) and is given by (5.19). It can be interpreted as the dipole moment of the energy density with respect to a static observer at $x^\alpha$. The other relevant components of the angular momentum tensor are given by (5.20).

D. Vanishing phase gradients
The linear and angular momenta simplify considerably when the complex phase of $\psi$ is constant. Looking first at the linear momentum, if $\nabla_\alpha \arg \psi = 0$ and if $|\psi|$ decays to zero sufficiently rapidly at large transverse distances, Eq. (5.18) reduces to (5.21). The subleading contribution to the linear momentum vanishes and $p_\alpha$ is seen to be proportional to the constant leading-order wave vector $k_\alpha$. It is important to note, however, that the net momentum is in general distinct from the momentum density. The former is proportional to $k_\alpha$ while the latter is proportional to $\tilde k_\alpha$. An observer with a high-resolution detector might therefore ascribe an apparent "direction of propagation" which differs, at $O(\epsilon)$, from the direction of $p_\alpha$. More than this, the direction of the momentum density varies slightly across the wave packet.
When the phase gradient vanishes, Eq. (5.19) simplifies to (5.22). Here too the subleading contribution vanishes. Furthermore, Eq. (5.20) reduces to (5.23). One particularly simple choice for $x^i$ arises by enforcing the centroid condition (3.4). Temporarily assume that $t^\alpha = (1, 0, 0, 0)$ so that condition reduces to $S^{i0} = 0$. It then follows from Eq. (5.22) that this centroid condition fixes the centroid location as in (5.24), where $E$ is again given by Eq. (3.22). In terms of the bivector $\Sigma^{\alpha\beta}$ which is defined by Eq. (2.16), the angular momentum takes the form (5.25). The angular momentum therefore satisfies $S^{\alpha\beta} p_\beta = O(\epsilon^2)$. An angular momentum which differs from this only by a sign can be obtained by considering an otherwise-identical field with opposite helicity, which is accomplished by replacing the $m_\alpha$ which appears in Eq. (5.9) with $\bar m_\alpha$. Equation (5.25) matches the form (3.8) for $S^{\alpha\beta}$, assuming that
$$ s = 1, \qquad \epsilon = \tilde\epsilon E. \tag{5.26} $$
The two small parameters we have introduced are therefore proportional to one another. Physically, either one can be interpreted as related to the leading-order angular frequency which would be seen by an observer with 4-velocity $t^\alpha$, as in (5.27).
To summarize, we have found sufficient conditions for the physical picture suggested in Sec. III, namely that $p_\alpha$ is null, $s = \pm 1$, and $S^{\alpha\beta} p_\beta = 0$. This is valid, up to terms of order $\tilde\epsilon^2$, at least for all decaying, circularly polarized electromagnetic wave packets with planar wavefronts and vanishing phase gradients. It is shown in Sec. V E below that this picture can change when $\psi$ has a nontrivial phase gradient.

Violation of energy conditions
It was shown in Sec. IV above that if $p_\alpha$ is null and $S^{\alpha\beta}$ has the form (3.8), the set of all possible centroids determined by $S_{\alpha\beta} t^\beta = 0$ is not bounded in spacelike directions (when varying over all timelike $t^\alpha$). This suggests that the worldlines we refer to as centroids are perhaps poorly named; it may be that some of them are nowhere near the wave packet of interest. It is well known that this type of situation can occur for the Newtonian center of mass if the mass density switches sign$^{15}$. Relativistically, avoiding this kind of pathology involves requiring that the stress-energy tensor satisfy appropriate energy conditions. And in an exact context, electromagnetic stress-energy tensors with the form (5.14) are known to satisfy all standard energy conditions [85]. However, it is not necessarily true that the approximate electromagnetic stress-energy tensors of interest here also satisfy those energy conditions. We now show that they do not. It is this violation of energy conditions which is behind the peculiarly distant centroids associated with massless spinning objects.
Suppose that $\tilde t^\alpha$ has a form which maximizes the centroid displacements in Sec. IV C. Using our null tetrad (5.8), one possibility which is compatible with (4.20) is given by (5.28), where $V \in [0, 1)$ denotes the relative speed between $\tilde t^\alpha$ and $(1, 0, 0, 0)$. In the case of interest here, where there is no phase gradient, (5.16) and (5.17) determine the energy density $T^{\alpha\beta} \tilde t_\alpha \tilde t_\beta$ when $O(\epsilon^2)$ terms are ignored. For any nontrivial bounded wave packet, there will be some regions in which $\partial_y |\psi|^2$ is negative and other regions in which it is positive. Furthermore, the first term in that energy density can always be made negligible compared to the second by choosing $V$ sufficiently close to 1. It follows that if $O(\epsilon^2)$ terms are ignored, there are timelike vectors $\tilde t^\alpha$ for which $T^{\alpha\beta} \tilde t_\alpha \tilde t_\beta$ is negative in some parts of the wave packet and positive in others; the energy density switches sign. This amounts to a violation of the weak, strong, and dominant energy conditions.
As noted above, this violation is an artifact of our approximation. All exact electromagnetic stress-energy tensors satisfy the weak, strong, and dominant energy conditions. That there is a problem with our approximation is not difficult to see in this context, as we are finding a "subleading" term which dominates over the "leading" term. It is not particularly surprising that in such a scenario, terms of even higher order might not be negligible. It is less clear, however, that simply assuming that a wave packet is both massless and spinning is enough for it to be associated with spurious, arbitrarily distant centroids. That is, however, a consequence of the fact that the high-frequency approximation breaks down for certain highly boosted observers.

Momentum is timelike, not null
It is shown in the Appendix that it is impossible for a truly massless wave packet to have nonzero spin. It is however clear that an electromagnetic wave packet can have nonzero spin. The conclusion is that spinning electromagnetic wave packets cannot truly be null. We now show that our wave packets are timelike once we include terms two orders beyond those in geometric optics.
To establish this, first note that the electromagnetic field (5.13) can be used to compute the stress-energy tensor to one higher order than shown in Eq. (5.16). That may in turn be used to compute $p_\alpha$. The full stress-energy tensor is complicated, however. The calculation can be considerably simplified by noting that all we need to determine the causal character of $p_\alpha$ is the $O(\epsilon^2)$ contribution to $k \cdot p$. Evaluating only that contribution yields an expression for $p_\alpha p^\alpha$ in terms of $E$, which again denotes the energy seen by a stationary observer. Note that because $\partial_v \psi = 0$, this is always negative; every localized electromagnetic wave packet of the given form has a small nonzero rest mass. If $\nabla\psi \sim \psi/\ell_w$ for some length scale $\ell_w$, this suggests that $m \sim \tilde\epsilon E/\ell_w = \epsilon/\ell_w$. Results on the localization of massive objects which were reviewed in Sec. IV C therefore imply that all centroids associated with these wave packets are in fact confined to a disk whose radius is of order $S/m \sim \ell_w$. This is the expected result. We emphasize, however, that it cannot be established, even qualitatively, in the massless approximation.
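The scaling argument above can be made concrete with a short order-of-magnitude estimate. This is a sketch under stated assumptions, not part of the derivation: it takes $\tilde\epsilon \sim 1/\omega = \lambda/2\pi$ in units with $c = 1$ (ignoring normalization factors of order unity), uses $\epsilon = \tilde\epsilon E$ from (5.26), $S = \epsilon|s|$, and the estimate $m \sim \tilde\epsilon E/\ell_w$. The function name and the sample wavelength and width are hypothetical.

```python
import math

def centroid_disk_estimate(wavelength, width, energy=1.0, s=1):
    """Order-of-magnitude estimate of m/E and of the Moller radius S/m.

    ASSUMPTIONS: eps_tilde ~ wavelength / (2 pi) with c = 1, and all
    factors of order unity are dropped.
    """
    eps_tilde = wavelength / (2.0 * math.pi)  # ~ inverse angular frequency
    eps = eps_tilde * energy                  # relation (5.26)
    spin = eps * abs(s)                       # S = eps |s|
    mass = eps_tilde * energy / width         # m ~ eps_tilde E / l_w
    return {"m_over_E": mass / energy, "moller_radius": spin / mass}

if __name__ == "__main__":
    # Hypothetical optical wave packet: 500 nm wavelength, 1 mm wide.
    est = centroid_disk_estimate(wavelength=500e-9, width=1e-3)
    print(f"m/E ~ {est['m_over_E']:.2e}")
    print(f"S/m ~ {est['moller_radius']:.2e} m")
```

Both $\tilde\epsilon$ and $E$ cancel in the ratio $S/m$, so the estimated disk of centroids has a radius of about $|s| \ell_w$ regardless of frequency, while $m/E$ is of order the wavelength divided by the width.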

E. Nonvanishing phase gradients and the limitations of the spin Hall framework
If the phase gradient of $\psi$ does not vanish, the physical picture associated with the spin Hall equations might not hold. First, it is not necessarily true that $s = \pm 1$ for a circularly polarized electromagnetic wave packet. Additional contributions to this parameter can arise. This is referred to as orbital (as opposed to spin) angular momentum in the optics literature [93,94]. While the derivation of the spin Hall equations in [42] did not allow for the possibility of orbital angular momentum, their derivation as a special case of the MP equations makes it clear that orbital angular momentum requires no essential changes; $s$ merely takes on different integer values in the spin Hall equations$^{16}$. A more dramatic consequence of allowing nontrivial phase gradients is that it becomes possible to construct wave packets with spin vectors which are not longitudinal: $S^{\alpha\beta} p_\beta \neq 0$ even at the leading nontrivial order. In these cases, the form (3.8) for the angular momentum is incomplete and the spin Hall equations are no longer valid. Even then, however, the MP equations can still be applied.
We first consider the possibility of nonlongitudinal spin. If the centroid condition is imposed with $t^\alpha = (1, 0, 0, 0)$, it follows from (5.20) that the transverse angular momentum is given by (5.33). A nontrivial transverse angular momentum therefore requires that $\bar\psi \partial_{\bar\zeta} \psi$ have a nontrivial moment along the optical axis. That this is possible can be illustrated by examples. Suppose that $\psi$ has the form (5.34), where $\ell_\perp > 0$ is a parameter. Also assume (5.35), so the $z$ component of the centroid lies at $z = t + O(\epsilon)$. Substitution into (5.33) then yields a result which is nonzero for any nontrivial $\psi$. It follows that for this class of wave packets, $S^{\alpha\beta}$ is not in the spin Hall form (3.8). Equivalently, $S^\alpha$ cannot be parallel to $p^\alpha$; it must have a nonzero $y$ component. It is not possible to use the spin Hall equations to understand the motion of such a wave packet. However, there is no obstacle to using the MP equations in their more general form (3.1). The conclusion here is that $S^{\alpha\beta} p_\beta = 0$ is a physical restriction; it is not inevitable. The derivation of the spin Hall equations appears to have implicitly assumed that the wave packets have, e.g., vanishing phase gradients.
Now consider a wave packet with no transverse angular momentum, but with potentially large amounts of longitudinal angular momentum. This can be produced by introducing polar coordinates $(r, \theta)$ in the $xy$ plane and then supposing that $\psi$ carries an azimuthal phase dependence labeled by an integer $n$. Additionally, suppose for simplicity that $|\psi|$ depends only on $r$ and $u$ and that it satisfies (5.35). It then follows from (5.20) that when evaluated at the centroid, the angular momentum is proportional to the bivector $\Sigma^{\alpha\beta}$, which is again given by (2.16). If $\epsilon$ is again related to $\tilde\epsilon$ via (5.26), comparison with (3.8) shows that for these wave packets, $s = n + 1$. If a wave packet with the opposite helicity had been considered, we would have found instead that $s = n - 1$. Regardless, it is clear that $s$ is not necessarily equal to $\pm 1$. Much larger amounts of angular momentum are possible than had been supposed in, e.g., [42]. Formally, $n$ can be arbitrarily large here. However, the high-frequency analysis breaks down when $\psi$ varies on the same scale as $e^{iu/\epsilon}$. If the spatial extent of the wave packet is of order $\ell_w$, this implies that our equations can be trusted only when $n \ll \ell_w/\epsilon = \omega \ell_w$.

VI. SPIN HALL EFFECT OF LIGHT IN AN INHOMOGENEOUS MEDIUM
As a final application, we now show how the ray equations describing the spin Hall effect of light in an inhomogeneous medium [13-18, 21, 96, 97] can be recovered from the gravitational spin Hall equations (2.17). The main tool used here is the well-known analogy between electromagnetic waves propagating inside a dielectric medium and electromagnetic waves propagating through vacuum but in an effective metric [98-104]. More precisely, consider a background metric $\tilde g_{\alpha\beta}$ and a dielectric medium with a varying refractive index $n$ and a 4-velocity $u^\alpha$. It has then been shown in Refs. [75,99] (see also Ref. [104]) that the combined effect of the background spacetime and the dielectric medium on light rays can be studied by considering vacuum propagation in the optical metric (6.1), where the indices on the 4-velocities have been lowered using $\tilde g_{\alpha\beta}$.
To describe the spin Hall effect of light in an inhomogeneous medium, we take the background metric to be the Minkowski one in inertial coordinates $(t, x, y, z)$, so $\tilde g_{\alpha\beta} = \eta_{\alpha\beta}$. We also suppose that the medium is stationary in these coordinates, so $\partial_t n = 0$ and $u^\alpha = (1, 0, 0, 0)$. The spin Hall equations (2.17) additionally require the choice of a timelike vector field $t^\alpha$ to fix the centroid definition, and this may be identified here with $u^\alpha$. A calculation then shows that with the effective metric $g_{\alpha\beta}$, $\Sigma^{\alpha\beta} R_{\alpha\beta\gamma}{}^{\lambda} = 0$. The spin Hall equations in this metric therefore reduce to (6.2). As $u^\alpha$ is Killing and the effective metric is static, the spin-dependent terms in the conservation law (3.17) vanish, so $E = -p_\alpha u^\alpha = \text{const}$. Introducing a 3-vector notation, the momentum must be null with respect to $g_{\alpha\beta}$. Energy conservation in the effective metric therefore implies that $E = p/n$ is constant. It is only the direction of $\mathbf{p}$ which must be determined from the equations of motion. A calculation shows that the nontrivial components of (6.2) reduce to
\begin{align}
\frac{dt}{d\tau} &= np = n^2 E, \tag{6.4a} \\
\frac{d\mathbf{x}}{d\tau} &= \mathbf{p} + \frac{\epsilon s}{p^3} \frac{d\mathbf{p}}{d\tau} \times \mathbf{p}, \tag{6.4b} \\
\frac{d\mathbf{p}}{d\tau} &= \frac{p^2}{n} \nabla n = \frac{1}{2} \nabla (nE)^2. \tag{6.4c}
\end{align}
To obtain the same form of the ray equations as in the optics literature, we can reparametrize everything in terms of $t$ instead of $\tau$. Doing so,
\begin{align}
\frac{d\mathbf{x}}{dt} &= \frac{\mathbf{p}}{np} + \frac{\epsilon s}{p^3} \frac{d\mathbf{p}}{dt} \times \mathbf{p}, \tag{6.5a} \\
\frac{d\mathbf{p}}{dt} &= \frac{p}{n^2} \nabla n. \tag{6.5b}
\end{align}
These are the ray equations describing the spin Hall effect of light in an inhomogeneous medium, as obtained in Refs. [15,97,105]. They can be rewritten in the form presented in Refs. [16,17,21,22] by rescaling the momentum and time, as mentioned in Ref. [97] (see also Ref. [18]).
Deriving the spin Hall effect of light in an inhomogeneous medium from (2.17) is important for several reasons. First, it establishes that the gravitational spin Hall equations really are related to the spin Hall effects described in flat-spacetime optics; the gravitational spin Hall equations thus have been given an appropriate name. Second, the ray equations usually used in the optical literature implicitly fix $t^\alpha$ at the outset and do not allow it to vary. Beginning instead with the gravitational spin Hall equations, where $t^\alpha$ is arbitrary, allows a unified description of the spin Hall effect of light, determined by the gradient of $n$, and the relativistic Hall effect [26,27], determined by changes in $t^\alpha$. Lastly, the spin Hall effect of light, as described by Eqs. (6.5), has been confirmed experimentally in Refs. [19,21]. The present connection between Eqs. (2.17) and Eqs. (6.5) gives some level of confidence in the theoretical predictions of Eqs. (2.17) and in the existence of a genuinely gravitational spin Hall effect of light.
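The ray equations (6.5) are straightforward to integrate numerically. The sketch below is a minimal illustration, not the method of any cited reference: it assumes a hypothetical linear index profile $n = 1 + a x$, arbitrary parameter values, and a fixed-step fourth-order Runge-Kutta integrator. It shows the polarization-dependent transverse shift (opposite in sign for $s = \pm 1$, absent for $s = 0$) and checks that $E = p/n$ stays constant along the ray.

```python
import numpy as np

def refractive_index(x, a=0.1):
    """HYPOTHETICAL index profile n = 1 + a*x (x = first Cartesian coordinate)."""
    return 1.0 + a * x[0]

def grad_n(x, a=0.1):
    """Gradient of the hypothetical profile above."""
    return np.array([a, 0.0, 0.0])

def rhs(state, eps, s):
    """Right-hand side of the ray equations (6.5) for state = (x, p)."""
    x, p = state[:3], state[3:]
    n = refractive_index(x)
    pmag = np.linalg.norm(p)
    dp = (pmag / n**2) * grad_n(x)                                # (6.5b)
    dx = p / (n * pmag) + (eps * s / pmag**3) * np.cross(dp, p)   # (6.5a)
    return np.concatenate([dx, dp])

def integrate(x0, p0, eps, s, t_end=5.0, steps=2000):
    """Fixed-step RK4 integration; returns the final (x, p)."""
    state = np.concatenate([x0, p0]).astype(float)
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(state, eps, s)
        k2 = rhs(state + 0.5 * h * k1, eps, s)
        k3 = rhs(state + 0.5 * h * k2, eps, s)
        k4 = rhs(state + h * k3, eps, s)
        state = state + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return state[:3], state[3:]

if __name__ == "__main__":
    x0 = np.zeros(3)
    p0 = np.array([0.0, 0.0, 1.0])   # ray initially along z
    for s in (+1, 0, -1):
        x, p = integrate(x0, p0, eps=1e-3, s=s)
        E = np.linalg.norm(p) / refractive_index(x)
        print(f"s={s:+d}: transverse shift y={x[1]:+.3e}, E={E:.10f}")
```

Because the gradient of $n$ points along $x$ and the ray starts along $z$, the spin-dependent term in (6.5a) displaces the ray along $y$, perpendicular to both.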
As another application of the type of analysis presented in this section, one might consider the propagation of light in a plasma which is in a curved spacetime, perhaps near a black hole. In some regimes, the plasma can be expected to have an effective refractive index [106-110]. The spin Hall equations (2.17) together with the optical metric (6.1) could then be used to derive polarization-dependent corrections to the propagation of electromagnetic pulses in the presence of an astrophysical plasma.

VII. CONCLUSIONS
This paper has investigated the implications, properties, and limitations of the gravitational spin Hall equations derived in Refs. [42,45]. In the electromagnetic case of interest here, these equations describe the motion of high-frequency circularly polarized electromagnetic pulses which propagate in vacuum but in arbitrary background spacetimes. In this context, the spin Hall effect refers to the transverse deflection of a pulse due to its spin.
Our first class of results concerns the meanings of the position and the momentum which appear in the gravitational spin Hall equations. In general, a spinning wave packet must be extended and the adoption of any equation of motion must be associated with a particular choice of centroid. We have found that the position appearing in the gravitational spin Hall equations is a centroid whose definition is parametrized by the timelike vector field $t^\alpha$ which appears in those equations. That position may be interpreted as being fixed by a spin supplementary condition, chosen to ensure that the angular momentum satisfies $S^{\alpha\beta} t_\beta = 0$. More physically, the centroid is the center of energy of the wave packet in a frame which is instantaneously at rest with respect to $t^\alpha$.
Different choices for $t^\alpha$ are in general associated with different centroids, and we have computed the shifts between those centroids. For massive objects with timelike momentum, we recover the known result that these shifts are always bounded: at any fixed time, all centroids lie within a finite disk. However, we have shown that this is no longer true in the (massless) null case. Massless spinning objects have arbitrarily distant centroids. In this sense, they cannot be localized.
Although this might appear to be problematic, we show that there is no spinning electromagnetic field configuration which is in fact massless. More generally, no massless object of any composition can have spin unless it violates the dominant energy condition. Nevertheless, there is a sense in which high-frequency electromagnetic wave packets can be approximately null. Truncating the high-frequency approximation at subleading order results in an approximate stress-energy tensor which violates the dominant energy condition. There is a large amount of both positive and negative energy density in certain highly boosted frames, and it is this negative energy density which makes it appear as though there are centroids far outside of the wave packet itself. These energy densities, and the associated distant centroids, are not real. They are unphysical artifacts of the high-frequency approximation. If higher-order terms are included, a spinning electromagnetic wave packet would be seen to satisfy all standard energy conditions and to have a positive rest mass. This rest mass guarantees that all centroids remain in a finite region; real electromagnetic wave packets can be localized.
When working in the massless approximation associated with the spin Hall equations, one must be careful about the limitations of that approximation. The concept of something being "approximately null" can make sense only in a class of frames, and the high-frequency approximation breaks down in very different frames. In particular, some weak restrictions must be placed on the $t^\alpha$ appearing in the gravitational spin Hall equations in order to avoid regimes where those equations are no longer valid.
We have also addressed other aspects of the approximations inherent in the gravitational spin Hall equations. First, there is the question of whether or not the initial data assumed in the gravitational spin Hall equations does indeed describe reasonable high-frequency wave packets. We argue that it does, to the expected degree of accuracy, at least when there are negligible phase gradients across the wavefronts. Some cases of nontrivial phase gradients can still be described by the spin Hall equations, just with larger amounts of angular momentum. In other cases with significant phase gradients, $S^{\alpha\beta} p_\beta \neq 0$, so the angular momentum is no longer longitudinal and the spin Hall equations cannot be applied. Nevertheless, those cases can still be described by the Mathisson-Papapetrou equations, which are more general than the spin Hall equations.
Besides the approximations involved in the initial conditions used in the equations of motion, there are also neglected terms in the equations of motion themselves. For example, the quadrupole moment of the wave packet is neglected. We have provided a detailed discussion of when such terms can be neglected and when they cannot. This is subtler than for the nearly rigid massive objects whose quadrupole moments are more commonly considered, as electromagnetic fields do not hold themselves together as they propagate. Electromagnetic wave packets generically spread out over time, increasing the quadrupole and higher-order moments and eventually invalidating the equations of motion.
Another theme in this paper has been to relate the gravitational spin Hall equations to other equations which have also been proposed in the literature to describe the motion of spinning electromagnetic wave packets, sometimes in quite different contexts. First, we have shown that the gravitational spin Hall equations are special cases of the Mathisson-Papapetrou (MP) equations, which govern the motion of generic (not necessarily electromagnetic) spinning objects in curved spacetimes. The gravitational spin Hall equations arise from the MP equations with a particular choice of spin supplementary (or centroid) condition, a particular type of initial data, and a particular worldline parameterization. Second, we have shown that the spin Hall effect of light in an inhomogeneous medium can be obtained from the gravitational spin Hall equations with the use of an effective optical metric. This provides a connection between the gravitational spin Hall and the MP frameworks, and an effect which has been experimentally observed [19, 21].
Lastly, we have shown that the observer dependence of the gravitational spin Hall equations is directly related to the relativistic Hall effect [26] and the Wigner(-Souriau) translations [27-29]. While these effects are exactly recovered (as previously discussed) in Minkowski spacetime, the discussion here generalizes them to arbitrary curved spacetimes. We have also pointed out that this effect has long been known in the relativistic theory of motion as applied to massive objects [30-33], and the approximately massless electromagnetic case is not significantly different (except in the aforementioned unboundedness of the set of all massless centroids).
Our analysis of different centroids and their properties has been purely classical. In a quantum mechanical context, there are various results which state that massless particles cannot be localized when their spins are greater than 1/2 [111-121]. While this appears to be at least qualitatively related to our result that massless classical objects with finite spin cannot be localized, the meanings of "particle" and "localization" differ between the two contexts. It would nevertheless be interesting to better understand the connections between these results.