Synchrotron radiation representation in phase space

The notion of brightness is efficiently conveyed in geometric optics as the density of rays in phase space. Wigner introduced his famous distribution in quantum mechanics as a quasi-probability density of a quantum system in phase space. Naturally, the same formalism can be used to represent light, including all wave phenomena. It provides a natural framework for radiation propagation and optics matching by transferring the familiar "baggage" of accelerator physics (beta-function, emittance, phase space transforms, etc.) to synchrotron radiation. This paper details many of the properties of the Wigner distribution and provides examples of how its use enables a physically insightful description of partially coherent synchrotron radiation in phase space.


I. INTRODUCTION
The concept of phase space plays an important role in accelerator physics. Useful tools such as Twiss parameters, emittance, and phase space propagation have long been in use in the accelerator community. The extension of the classical phase space concept to synchrotron radiation is straightforward for geometric optics, applicable to incoherent radiation. The Wigner distribution, or Wigner distribution function (WDF), was recognized to be a general framework to represent quantum [1] and therefore wave phenomena in phase space [2,3]. The approach allows characterization of light of arbitrary degree of coherence [4] and polarization [5] in phase space, though its application by the accelerator community has so far been mostly limited to the simplest cases of Gaussian or Gauss-Schell beams [6,7]. This provides a set of useful analytical expressions for quick estimates of the performance of modern x-ray sources with improved coherence properties. In particular, the concepts of the diffraction limit and brightness have been extended to cover partially coherent radiation cases following the notion of Gaussian distributions in phase space for both the synchrotron radiation of the undulator central cone and the density of electrons. A more detailed approach to coherent or partially coherent sources inevitably calls for a physically rigorous wave description of the radiation, either through the cross-spectral density [8] or the Wigner distribution. In particular, since neither the undulator radiation nor the electron distribution in phase space needs to be Gaussian, the general framework becomes essential for describing the performance of x-ray sources with improved coherence. The Wigner distribution function provides a natural and elegant description of the properties of light, and can serve as a useful tool in accelerator and x-ray beamline design, including electron to x-ray beam matching and light propagation, fully accounting for arbitrary polarization and coherence properties of radiation.
The intuitive picture provided by the WDF, being the phase space density of light or generalized brightness, is particularly appealing to the accelerator community trained to view many aspects of the beam dynamics in phase space. Not only can the WDF be readily computed from the first principles, the first measurement of x-ray Wigner distribution has been reported in the literature [9]. The knowledge of the Wigner distribution represents the entirety of what can be known about the radiation and its importance will only increase with advent of more coherent x-ray sources.
The purpose of this paper is to review many of the useful properties of the Wigner distribution and demonstrate that the WDF can be used with physical insight to describe partially coherent synchrotron radiation. In what follows, the Wigner distribution properties are first reviewed in Section II using the language of quantum mechanics. Various examples illustrate the physical meaning of the WDF for both pure and mixed quantum states. The case of synchrotron radiation, discussed in Section III, is then viewed as a natural extension of the quantum mechanical treatment. Coherence and dispersion properties of light as conveniently conveyed by the WDF are emphasized. Special attention is given to light polarization, being an important characteristic of synchrotron radiation. Practical matters of computing the WDF are covered in Section IV, which outlines the general procedure for obtaining the Wigner distribution first for a single electron, and then extends the result to include electron bunches of an Energy Recovery Linac as an example. Since neither the synchrotron radiation nor the electron beam in this case has a Gaussian phase space density, some consideration is given to generalizing the concepts of emittance and brightness to describe non-Gaussian distributions.

II. WIGNER DISTRIBUTION IN QUANTUM MECHANICS
The Wigner distribution, initially introduced to account for quantum phenomena in statistical mechanics [1], provides a convenient description of a quantum mechanical system in phase space. The Wigner distribution itself does not possess any new information not already contained in the quantum state itself, which is fully described (together with its complete time evolution through the Hamiltonian $\hat{H}$) either by a pure state $\psi$ or, more generally for a mixed state, by its density matrix $\hat\rho = \sum_j p_j |\psi_j\rangle\langle\psi_j|$, with state weights $\sum_j p_j = 1$. The utility of the Wigner distribution is in a convenient and visual representation of the quantum system (and by extension wave optics phenomena) in terms of a quasi-probability of having both phase space quantities (e.g. $x$, $p$). Such a characterization, being very familiar to accelerator physicists, is a natural framework for a unified description of phenomena of both classical and wave nature, reusing many of the concepts from the accelerator field (emittance, β-function, phase space propagation, brightness, etc.). Quasi-probability refers to the fact that while the Wigner distribution is normalized to 1 and is used to compute averages of various quantities as expected for a probability density function, the function can take on local negative values. This deviation from non-negativity is essential for general quantum or wave phenomena where position and momentum operators do not commute and the uncertainty principle must hold, contrary to the classical description. Nevertheless, this non-positivity does not preclude measurement of the Wigner distribution using tomography techniques [9,10].
The properties of the Wigner distribution have been studied extensively in the context of quantum mechanics [11,12], wave optics [4,5,13] and signal processing [14-16]. To provide a suitable context, the properties of the WDF are reviewed in this section. For simplicity, we limit our consideration here to a 1D scalar wavefunction, $\psi(x)$. Extension to higher dimensions and polarization as required for synchrotron radiation is detailed in Section IV.

A. Pure quantum state
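First, we consider a pure quantum state $\psi$. A particularly insightful definition of the Wigner distribution can be given in Dirac notation:
$$W(x,p) = \int \langle\psi|x+\tfrac{x'}{2}\rangle\,\langle x+\tfrac{x'}{2}|p\rangle\,\langle p|x-\tfrac{x'}{2}\rangle\,\langle x-\tfrac{x'}{2}|\psi\rangle\,dx'. \tag{1}$$
(The integration here and elsewhere in this paper is taken over the entire range −∞ to +∞ unless stated otherwise.) The integrand is the quantum equivalent of a classical phase-space trajectory as seen by reading the Dirac brackets from right to left: (1) the probability amplitude for a particle in state $\psi$ to have a position $(x - \tfrac{x'}{2})$; (2) the amplitude for a particle with position $(x - \tfrac{x'}{2})$ to have momentum $p$; (3) the amplitude for a particle with momentum $p$ to have position $(x + \tfrac{x'}{2})$; and finally (4) the amplitude for a particle with position $(x + \tfrac{x'}{2})$ to (still) be in the state $\psi$. The integration over the entire space $x'$ therefore creates a superposition of all possible quantum trajectories of state $\psi$, which interfere constructively and destructively, providing a quasi-probability distribution in phase space [17]. Using the well-known identity (with $h = 2\pi\hbar$ the Planck constant)
$$\langle x|p\rangle = \frac{1}{\sqrt{h}}\,e^{ipx/\hbar},$$
Eq. 1 assumes its most frequently quoted form
$$W(x,p) = \frac{1}{h}\int \psi^*(x+\tfrac{x'}{2})\,\psi(x-\tfrac{x'}{2})\,e^{ipx'/\hbar}\,dx'. \tag{2}$$
In the same spirit, Eq. 1 can be rewritten in terms of integration over the entire momentum space,
$$W(x,p) = \int \langle\psi|p+\tfrac{p'}{2}\rangle\,\langle p+\tfrac{p'}{2}|x\rangle\,\langle x|p-\tfrac{p'}{2}\rangle\,\langle p-\tfrac{p'}{2}|\psi\rangle\,dp', \tag{3}$$
leading to an equivalent definition of the Wigner distribution function, now in terms of the momentum representation of the state $\Psi(p) \equiv \langle p|\psi\rangle$:
$$W(x,p) = \frac{1}{h}\int \Psi^*(p+\tfrac{p'}{2})\,\Psi(p-\tfrac{p'}{2})\,e^{-ixp'/\hbar}\,dp'. \tag{4}$$
The momentum and position representations, $\psi(x)$ and $\Psi(p)$, are related via the Fourier transform
$$\psi(x) = \frac{1}{\sqrt{h}}\int \Psi(p)\,e^{ipx/\hbar}\,dp, \qquad \Psi(p) = \frac{1}{\sqrt{h}}\int \psi(x)\,e^{-ipx/\hbar}\,dx.$$
A summary of the main properties of the Wigner distribution function is given below.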
Properties that are revisited later for a more general case of a mixed state are denoted by an asterisk (*).

Property 1 (Realness)
The Wigner distribution is a real function of $x$ and $p$. This property follows from $W^*(x, p) = W(x, p)$.

Property 2 (Normalization and Marginals*)
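The WDF is normalized to 1, with its projections (or marginals) corresponding to nonnegative probability densities in either position or momentum:
$$\iint W(x,p)\,dx\,dp = 1, \qquad \int W(x,p)\,dp = |\psi(x)|^2, \qquad \int W(x,p)\,dx = |\Psi(p)|^2. \tag{7}$$
The proof is by substitution of the Wigner definition into Eqs. 7 and then using the identity
$$\int e^{ipx/\hbar}\,dp = h\,\delta(x). \tag{8}$$

Property 3 (Bounds)
The WDF is bounded according to
$$-\frac{2}{h} \le W(x,p) \le \frac{2}{h}.$$
This property can be proven using the Cauchy-Schwarz inequality on the definition of the Wigner function.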
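It is illustrative to consider when the WDF assumes the $\pm 2/h$ extrema. An arbitrary wavefunction $\psi(x)$ can be written in terms of its even, $\psi_e(-x) = \psi_e(x)$, and odd, $\psi_o(-x) = -\psi_o(x)$, parts as $\psi(x) = \psi_e(x) + \psi_o(x)$. Then, the WDF at the origin becomes
$$W(0,0) = \frac{1}{h}\int \psi^*(\tfrac{x'}{2})\,\psi(-\tfrac{x'}{2})\,dx' = \frac{2}{h}\int \psi^*(x)\,\psi(-x)\,dx.$$
This can be written in terms of the wavefunction's even and odd parts:
$$W(0,0) = \frac{2}{h}\left[\int |\psi_e(x)|^2\,dx - \int |\psi_o(x)|^2\,dx\right]. \tag{10}$$
As can be seen from Eq. 10 and the wavefunction normalization, $\int |\psi_e|^2\,dx + \int |\psi_o|^2\,dx = 1$, the WDF at the origin equals $-2/h$ when the wavefunction is odd and $+2/h$ when it is even [12].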
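The normalization, marginal and extremum properties above can be checked numerically. The following is a minimal sketch, assuming natural units (ħ = 1) and a direct Riemann-sum evaluation of the position-space definition with x' = 2y; the helper name `wigner`, the grids and the two test states (harmonic oscillator ground and first excited states) are illustrative choices, not part of the original text.

```python
import numpy as np

def wigner(psi, x, p, hbar=1.0):
    """Return W[i, j] = W(x[i], p[j]) for a wavefunction sampled on a uniform grid x.

    Evaluates W = 1/(pi*hbar) * Int psi*(x+y) psi(x-y) exp(2ipy/hbar) dy
    as a Riemann sum; psi is taken to be zero outside the grid.
    """
    n = len(x)
    dx = x[1] - x[0]
    half = n // 2
    k = np.arange(-half, half + 1)                  # shifts y = k*dx
    padded = np.pad(np.asarray(psi, dtype=complex), half)
    i = np.arange(n)[:, None] + half                # index of x[i] in padded array
    corr = np.conj(padded[i + k]) * padded[i - k]   # psi*(x+y) psi(x-y)
    phase = np.exp(2j * np.outer(k * dx, p) / hbar)
    return dx / (np.pi * hbar) * np.real(corr @ phase)

x = np.linspace(-8.0, 8.0, 257)
p = np.linspace(-6.0, 6.0, 193)
dx, dp = x[1] - x[0], p[1] - p[0]

psi_even = np.pi**-0.25 * np.exp(-x**2 / 2)         # even state
psi_odd = np.sqrt(2.0) * x * psi_even               # odd state

W_even = wigner(psi_even, x, p)
W_odd = wigner(psi_odd, x, p)

i0, j0 = len(x) // 2, len(p) // 2                   # grid indices of x = 0, p = 0
total = W_even.sum() * dx * dp                      # normalization
marginal_x = W_even.sum(axis=1) * dp                # projection onto x

print("integral of W:", total)
print("W(0,0) * pi for even and odd states:",
      W_even[i0, j0] * np.pi, W_odd[i0, j0] * np.pi)
```

With ħ = 1 the bound 2/h equals 1/π, so the last line prints the origin values in units of that bound.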

Property 4 (Expectation values)
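The expectation value of an operator $\hat{A}$ can be found from its phase-space representation function $A(x, p)$ according to
$$\langle\hat{A}\rangle = \iint A(x,p)\,W(x,p)\,dx\,dp,$$
where $W(x, p)$ acts as a phase-space probability density. The function $A(x, p)$ and the operator $\hat{A}$ satisfy the following relationships [11]:
$$A(x,p) = \int \langle x-\tfrac{x'}{2}|\hat{A}|x+\tfrac{x'}{2}\rangle\,e^{ipx'/\hbar}\,dx', \tag{12}$$
$$\langle x-\tfrac{x'}{2}|\hat{A}|x+\tfrac{x'}{2}\rangle = \frac{1}{h}\int A(x,p)\,e^{-ipx'/\hbar}\,dp. \tag{13}$$
The pair of Eqs. 12 and 13 is referred to as the Wigner-Weyl transformation [18].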
Refer to [11] for proof. A practical significance of this property is that any linear combination of functions of only position operators or only momentum operators corresponds to the classical phase-space representation given by Eq. 12, where one replaces $\hat{x} \to x$ and $\hat{p} \to p$. In particular, the $n$-th moments of the distribution are readily obtained using
$$\langle\hat{p}^n\rangle = \iint p^n\,W(x,p)\,dx\,dp, \qquad \langle\hat{x}^n\rangle = \iint x^n\,W(x,p)\,dx\,dp.$$
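The quantum mechanical correspondence to the classical correlation expectation of position-momentum is more involved due to the non-commuting nature of the operators. Indeed, the operator $\hat{x}\hat{p}$ is not Hermitian, i.e. $(\hat{x}\hat{p})^\dagger = \hat{p}^\dagger\hat{x}^\dagger = \hat{p}\hat{x} \ne \hat{x}\hat{p}$ since $[\hat{x},\hat{p}] = i\hbar \ne 0$. As a result, the expectation value $\langle\hat{x}\hat{p}\rangle$ is generally complex and $\langle\hat{x}\hat{p}\rangle \ne \langle\hat{p}\hat{x}\rangle$. A solution is to write $\tfrac{1}{2}\langle\hat{x}\hat{p}+\hat{p}\hat{x}\rangle = \mathrm{Re}\langle\hat{x}\hat{p}\rangle = \mathrm{Re}\langle\hat{p}\hat{x}\rangle$, where the symmetric operator is now Hermitian and its corresponding phase space function is found from Eq. 12 to be $\tfrac{1}{2}(\hat{x}\hat{p}+\hat{p}\hat{x}) \to xp$. Therefore,
$$\tfrac{1}{2}\langle\hat{x}\hat{p}+\hat{p}\hat{x}\rangle = \iint xp\,W(x,p)\,dx\,dp.$$
The above equations allow us to compute the Σ-matrix of the quantum phase-space distribution familiar to accelerator physicists, with the usual meaning of emittance $\epsilon = \sqrt{\det\Sigma}$ and Twiss parameters satisfying $\det T = 1$.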
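As an illustration of how the second moments of a WDF map onto accelerator quantities, the sketch below builds the Σ-matrix from a gridded distribution and extracts the rms emittance and Twiss parameters. It assumes ħ = 1, central moments, and the convention Σ = ε [[β, −α], [−α, γ]]; the function names and the chirped-Gaussian test distribution are hypothetical choices made only for this example.

```python
import numpy as np

def sigma_matrix(W, x, p):
    """Second central moments of a phase-space density W[i, j] = W(x[i], p[j])."""
    dx, dp = x[1] - x[0], p[1] - p[0]
    X, P = np.meshgrid(x, p, indexing="ij")
    norm = W.sum() * dx * dp

    def mean(f):
        return (f * W).sum() * dx * dp / norm

    mx, mp = mean(X), mean(P)
    sxx = mean((X - mx) ** 2)
    spp = mean((P - mp) ** 2)
    sxp = mean((X - mx) * (P - mp))
    return np.array([[sxx, sxp], [sxp, spp]])

def emittance_and_twiss(W, x, p):
    """rms emittance sqrt(det Sigma) and Twiss (alpha, beta, gamma), assumed convention."""
    S = sigma_matrix(W, x, p)
    eps = np.sqrt(np.linalg.det(S))
    beta, gamma, alpha = S[0, 0] / eps, S[1, 1] / eps, -S[0, 1] / eps
    return eps, alpha, beta, gamma

# Test distribution: Gaussian ground state with a linear x-p correlation (chirp) c.
x = np.linspace(-7.0, 7.0, 281)
p = np.linspace(-14.0, 14.0, 561)
X, P = np.meshgrid(x, p, indexing="ij")
c = 1.2
W = np.exp(-X**2 - (P - c * X) ** 2) / np.pi

eps, alpha, beta, gamma = emittance_and_twiss(W, x, p)
print("emittance:", eps)
print("alpha, beta, gamma:", alpha, beta, gamma)
```

The chirp shears the distribution without changing its phase-space area, so the emittance stays at the minimum-uncertainty value of 1/2 in these units while α becomes nonzero.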
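The Heisenberg uncertainty principle can then be written as
$$\epsilon \ge \frac{\hbar}{2}.$$

Property 5 (Time evolution)
For a time-independent Hamiltonian $\hat{H} = \hat{p}^2/2m + V(x)$, the time evolution of the Wigner distribution $W$ is governed by
$$\frac{\partial W}{\partial t} = -\frac{p}{m}\frac{\partial W}{\partial x} + \sum_{s=0}^{\infty}\frac{(-\hbar^2)^s}{(2s+1)!}\left(\frac{1}{2}\right)^{2s}\frac{\partial^{2s+1}V}{\partial x^{2s+1}}\,\frac{\partial^{2s+1}W}{\partial p^{2s+1}}. \tag{16}$$
The proof is straightforward using the time-dependent Schrödinger equation. Refer to [12] for details. In particular, for a linear force $F(x) = F_0 - kx$ with a potential energy $V(x) = V_0 - F_0 x + \tfrac{1}{2}kx^2$ ($F_0$, $k$, and $V_0$ are arbitrary constants), $\hbar$ drops out from Eq. 16 and we recover the classical Liouville evolution of the phase-space distribution
$$\frac{\partial W}{\partial t} + \frac{p}{m}\frac{\partial W}{\partial x} + F(x)\frac{\partial W}{\partial p} = 0.$$
This property further illustrates the connection to the classical concept of phase space. In particular, classical invariants and transformation rules directly carry over to the quantum phase space density in the case of no or linear forces.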
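
Property 6 (State cross-correlation*)
The cross-correlation of the wavefunction can be recovered from the WDF of a pure state via a Fourier transform:
$$\psi^*(x+\tfrac{x'}{2})\,\psi(x-\tfrac{x'}{2}) = \int W(x,p)\,e^{-ipx'/\hbar}\,dp, \qquad \Psi^*(p+\tfrac{p'}{2})\,\Psi(p-\tfrac{p'}{2}) = \int W(x,p)\,e^{ixp'/\hbar}\,dx. \tag{18}$$
This property is proven by substituting the WDF definition and using the identity (8).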
Note the similarity to Weyl's relationship, the Eq. 13.
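It should be noted that the cross-correlation function of Eq. 18 is just the density matrix of a pure state $\psi$ in either the position or the momentum basis:
$$\langle x-\tfrac{x'}{2}|\hat\rho|x+\tfrac{x'}{2}\rangle = \psi(x-\tfrac{x'}{2})\,\psi^*(x+\tfrac{x'}{2}), \qquad \langle p-\tfrac{p'}{2}|\hat\rho|p+\tfrac{p'}{2}\rangle = \Psi(p-\tfrac{p'}{2})\,\Psi^*(p+\tfrac{p'}{2}).$$
This connection of the Wigner distribution to the density operator matrix will continue for mixed states as discussed later.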

Property 7 (State recovery*)
Property 6 allows one to recover the wavefunction from the WDF modulo a complex constant:
$$\psi(x) = \frac{1}{\psi^*(0)}\int W(\tfrac{x}{2},p)\,e^{ipx/\hbar}\,dp.$$

Property 8 (Integrated product)
For two WDFs corresponding to pure states $\psi$ and $\chi$, the integrated (overlapping) product is related to the scalar state product according to
$$h\iint W^{(\psi)}(x,p)\,W^{(\chi)}(x,p)\,dx\,dp = |\langle\psi|\chi\rangle|^2. \tag{21}$$
The proof of this property again involves a substitution of the WDF definition into Eq. 21, the use of identity (8) and a change of integration variables.
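
Property 9 (Generalized integrated product)
To account for the Wigner distribution of a superposition of quantum states, we introduce a generalized Wigner distribution (now generally complex) using
$$W^{(\psi,\chi)}(x,p) = \frac{1}{h}\int \psi^*(x+\tfrac{x'}{2})\,\chi(x-\tfrac{x'}{2})\,e^{ipx'/\hbar}\,dx'.$$
Then the integrated overlapping product of two generalized WDFs becomes
$$h\iint W^{(\psi_1,\chi_1)}(x,p)\,W^{(\psi_2,\chi_2)}(x,p)\,dx\,dp = \langle\psi_1|\chi_2\rangle\,\langle\psi_2|\chi_1\rangle. \tag{23}$$

Property 10 (Superposition of states)
Consider a superposition of states $|\psi\rangle = \sum_n \alpha_n|\varphi_n\rangle$ (either a finite or an infinite sum). The states $\varphi_n$ need not be orthogonal, but all states are assumed normalized: $\langle\psi|\psi\rangle = 1$ and $\langle\varphi_n|\varphi_n\rangle = 1$. The Wigner distribution can then be written as
$$W^{(\psi)} = \sum_n |\alpha_n|^2\,W^{(\varphi_n)} + \sum_{n\ne m}\alpha_n^*\alpha_m\,W^{(\varphi_n,\varphi_m)}. \tag{24}$$
Note that the superposition of states generally leads to the appearance of cross-terms $W^{(\varphi_n,\varphi_m)}$ in the Wigner distribution. Realness of $W^{(\psi)}$ is readily verified by noting that the off-diagonal terms in Eq. 24 are complex conjugates of each other, $(\alpha_n^*\alpha_m W^{(\varphi_n,\varphi_m)})^* = \alpha_m^*\alpha_n W^{(\varphi_m,\varphi_n)} = \alpha_m^*\alpha_n W^{*(\varphi_n,\varphi_m)}$, and therefore their sum must be real. For example, consider a stationary Hamiltonian $\hat{H}$ producing a complete orthogonal basis $|n\rangle$ with corresponding energy eigenvalues $E_n$:
$$\hat{H}|n\rangle = E_n|n\rangle.$$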
Then, the time evolution of an arbitrary state characterized by the initial vector $|\psi_0\rangle = |\psi(t=0)\rangle$ adopts the familiar form
$$|\psi(t)\rangle = \sum_n a_n\,e^{-iE_n t/\hbar}\,|n\rangle, \tag{26}$$
where the expansion coefficients are found in terms of projections of the initial state on the eigenbasis, $a_n = \langle n|\psi_0\rangle$, and must satisfy the normalization requirement $\sum_n |a_n|^2 = 1$.
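The time evolution of the WDF for $|\psi(t)\rangle$ is given by
$$W(x,p;t) = \sum_n |a_n|^2\,W^{(n)}(x,p) + \sum_{n\ne m} a_n^* a_m\,e^{i(E_n-E_m)t/\hbar}\,W^{(n,m)}(x,p). \tag{27}$$
Eq. 27 can also be rewritten using (23) in terms of integrated products of the Wigner distributions, since $a_n^* a_m = h\iint W^{(\psi_0)}(x',p')\,W^{(m,n)}(x',p')\,dx'\,dp'$. Note that the decomposition of the Wigner distribution for a pure state into an orthonormal basis generally requires the presence of non-vanishing interference cross-terms. We shall revisit this subject when considering incoherent addition of states.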

Property 11 (Gaussian state)
A positive WDF can only be realized for a wavefunction of the form
$$\psi(x) = \exp[-(ax^2 + bx + c)]$$
(with complex constants $a$, $b$, $c$ and $\mathrm{Re}\,a > 0$), leading to a joint Gaussian WDF in position and momentum [12]. We also note in passing the well-known fact that a Gaussian state yields the phase-space probability density with the smallest rms spread (quantum emittance) $\epsilon = \hbar/2$.
This property is introduced in [19] and is mentioned here for completeness. It should be noted that the distribution function so obtained is no longer the quasi-probability suitable for finding the expectation values of the original state. For further implications of this property, including it being a possible mathematical tool of the concept of measurement in quantum mechanics, the reader is directed to the discussion in [19].

Example: wave packet time evolution in 1D potential
Next, we consider several examples that illustrate the concept of the Wigner distribution.
The first example shows the phase-space motion of an electron in a 1D potential (the position $x$ is in nm). The initial wave packet is described by a Gaussian $\psi(x, t = 0)$ with $\sigma_x = 0.3$. Fig. 1 shows $\psi(x, t = 0)$ and the first 40 energy eigenstates $\psi_n(x)$. The initial quantum state is then evolved according to Eq. 26. The motion in phase space of a classical particle with an initial position $x = 3$ and zero velocity is shown for comparison; the projections $|\langle x|\psi\rangle|^2$ and $|\langle p|\psi\rangle|^2$ are also shown. In addition, the rms emittance of the quantum phase space distribution is shown. As can be seen from Fig. 3, the emittance of the initial wave packet is $\hbar/2$, increasing when the wave packet reaches the hard reflective potential boundary (and more generally when discontinuities in the potential are encountered). Here and in all subsequent plots of the Wigner distribution we use the same color map: blue and red colors correspond to negative and positive values respectively, and white corresponds to zero.
It should be noted that despite increase in the phase space area, the mode obviously remains pure at all times.

Example: eigenstates of a simple harmonic oscillator
Another example we consider is the WDF of energy eigenstates for a SHO [20], which adopts the same math as Hermite-Gaussian modes in optics (at a waist and in a single coordinate). The well-known eigenstates of the Hamiltonian $\hat{H} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{1}{2}m\omega^2x^2$ have Wigner distributions that, in natural units ($\hbar, m, \omega \to 1$), depend only on the radius $r$ in phase space with $r^2 = x^2 + p^2$:
$$W_n(r) = \frac{(-1)^n}{\pi}\,e^{-r^2}\,L_n(2r^2). \tag{32}$$
Here $L_n$ are Laguerre polynomials. The first three states along with $W(r)$ are shown in Fig. 4. We make a couple of observations regarding Eq. 32:
• The WDF is the maximum possible value at the origin for even $n$ and the minimum possible for odd $n$ in accordance with Eq. 10: $W(0) = (-1)^n/\pi$, or in the regular units $W(0) = (-1)^n\,2/h$.
• The uncertainty in position and momentum, or emittance, for mode $n$ is $2n + 1$ times $1/2$ in the natural units or $\hbar/2$ in regular units. In optics, the quantity $\epsilon/(\hbar/2)$ is known as the $M^2$-parameter, i.e. $M^2 = 2n + 1$ for mode $n$.
In the natural units used in this example we have
$$\epsilon = \frac{M^2}{2}. \tag{35}$$
Thus, the ground Gaussian state has the smallest possible uncertainty of $1/2$ (or $\hbar/2$), whereas each subsequent excitation adds an additional node to the wavefunction and the radial Wigner distribution, and increases the emittance by 1 (or $\hbar$).
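The two observations above can be verified numerically. The sketch below assumes natural units and the radially symmetric form $W_n(r) = (-1)^n e^{-r^2} L_n(2r^2)/\pi$; the function name `wigner_sho` and the radial grid are illustrative. Because the distribution depends only on $r$, phase-space integrals reduce to radial integrals with the measure $2\pi r\,dr$, and $\langle x^2\rangle = \langle p^2\rangle = \langle r^2\rangle/2$ with $\langle xp\rangle = 0$.

```python
import numpy as np
from numpy.polynomial.laguerre import lagval

def wigner_sho(n, r):
    """Radial Wigner distribution of the n-th harmonic oscillator eigenstate (hbar = 1)."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0                      # selects the single Laguerre polynomial L_n
    return (-1) ** n / np.pi * np.exp(-r**2) * lagval(2 * r**2, coeffs)

r = np.linspace(0.0, 10.0, 20001)
dr = r[1] - r[0]

norms, emittances, origin_values = [], [], []
for n in range(4):
    W = wigner_sho(n, r)
    norm = (W * 2 * np.pi * r).sum() * dr            # integral of W over phase space
    r2_mean = (W * r**2 * 2 * np.pi * r).sum() * dr  # <x^2 + p^2>
    norms.append(norm)
    emittances.append(r2_mean / 2)                   # sqrt(<x^2><p^2>) = <r^2>/2
    origin_values.append(W[0])
    print(n, norm, r2_mean / 2, W[0] * np.pi)
```

Each printed row lists the mode number, the normalization, the emittance, and the origin value in units of $1/\pi$.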

B. Mixed quantum state
In general, real quantum systems cannot be described as a superposition of pure modes (which is itself a pure mode); instead, one adopts a statistical language to describe an incoherent mixture of pure states, characterized via the density operator
$$\hat\rho = \sum_j p_j\,|\psi_j\rangle\langle\psi_j|$$
with state probabilities $p_j$ adding up to one. When only one coefficient $p_j = 1$ for some $j$ is present, the formalism is reduced to that of a pure state. Recall that the expectation value of an operator $\hat{A}$ is given in terms of a trace,
$$\langle\hat{A}\rangle = \mathrm{Tr}(\hat\rho\hat{A}).$$
Other standard properties are
$$\mathrm{Tr}(\hat\rho) = 1, \qquad \mathrm{Tr}(\hat\rho^2) \le 1, \tag{39}$$
where the equal sign in Eq. 39 is for the pure state case and the trace is less than 1 otherwise.
The definitions Eq. 2 and 4 are now replaced with
$$W(x,p) = \frac{1}{h}\int \langle x-\tfrac{x'}{2}|\hat\rho|x+\tfrac{x'}{2}\rangle\,e^{ipx'/\hbar}\,dx' = \frac{1}{h}\int \langle p-\tfrac{p'}{2}|\hat\rho|p+\tfrac{p'}{2}\rangle\,e^{-ixp'/\hbar}\,dp'.$$
In other words, the WDF of a mixed state is a weighted sum of the WDFs corresponding to the individual pure states of the density matrix:
$$W(x,p) = \sum_j p_j\,W^{(\psi_j)}(x,p). \tag{41}$$
Since the Wigner distribution is a quadratic (intensity-like) function of the state, addition of individual WDFs corresponds to an incoherent addition, which is to be contrasted with the coherent superposition of Eq. 24. This language of a pure state vs. a mixed state, and of incoherent addition vs. coherent superposition, carries over directly to optics and allows characterization of partially coherent sources.
Next, we present a new property that relates the WDF to modal purity. We then revisit some of the properties introduced earlier to extend them for the mixed state case. Most of the properties discussed previously, namely Properties 1 through 7 directly carry over to the mixed state case after the necessary modifications to Properties 2, 6, and 7.

Property 13 (Measure of modal purity)
The integrated WDF squared is a measure of modal purity:
$$h\iint W^2(x,p)\,dx\,dp = \mathrm{Tr}(\hat\rho^2) \le 1. \tag{42}$$
The equal sign in Eq. 42 is for a pure state.
The proof of this property follows directly from Eq. 41 and Property 8.
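A short numerical check of the purity measure may be helpful here. The sketch assumes natural units (so that $h = 2\pi$) and uses the analytic WDFs of the ground and first excited harmonic oscillator states, $W_0 = e^{-r^2}/\pi$ and $W_1 = (2r^2 - 1)e^{-r^2}/\pi$, as stand-ins for two orthogonal pure states; the mixture weights are arbitrary illustrative values.

```python
import numpy as np

x = np.linspace(-7.0, 7.0, 561)
p = np.linspace(-7.0, 7.0, 561)
dx, dp = x[1] - x[0], p[1] - p[0]
X, P = np.meshgrid(x, p, indexing="ij")
r2 = X**2 + P**2

W0 = np.exp(-r2) / np.pi                   # ground state (pure)
W1 = (2 * r2 - 1) * np.exp(-r2) / np.pi    # first excited state (pure)

def purity(W, hbar=1.0):
    """h times the phase-space integral of W squared, i.e. Tr(rho^2)."""
    return 2 * np.pi * hbar * (W**2).sum() * dx * dp

weights = (0.7, 0.3)                        # illustrative state probabilities
W_mixed = weights[0] * W0 + weights[1] * W1 # incoherent (weighted) sum of WDFs

print("pure states:", purity(W0), purity(W1))
print("mixed state:", purity(W_mixed), "expected", weights[0]**2 + weights[1]**2)
```

Because the two states are orthogonal, their integrated product vanishes (Property 8) and the purity of the mixture reduces to the sum of the squared weights.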

Property 14 (Marginals -Property 2 revisited)
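The marginals of the mixed-state WDF are weighted sums of the probability densities of the individual pure states, $\int W(x,p)\,dp = \sum_j p_j|\psi_j(x)|^2$ and $\int W(x,p)\,dx = \sum_j p_j|\Psi_j(p)|^2$, with the overall normalization $\iint W(x,p)\,dx\,dp = 1$. Again, this property reinforces the notion of simply adding the intensities (here probability densities) for mixed states.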

Property 15 (Density matrix - Property 6 revisited)
The density matrix is related to the Wigner distribution via a Fourier transform,
$$\langle x-\tfrac{x'}{2}|\hat\rho|x+\tfrac{x'}{2}\rangle = \int W(x,p)\,e^{-ipx'/\hbar}\,dp, \tag{44}$$
$$\langle p-\tfrac{p'}{2}|\hat\rho|p+\tfrac{p'}{2}\rangle = \int W(x,p)\,e^{ixp'/\hbar}\,dx,$$
which are simply the inverse of the Wigner function definitions.

Property 16 (Mode decomposition - Property 7 revisited)
Property 7, which inverts the wavefunction from its WDF, is only applicable for a pure state, returning a meaningless "wavefunction" otherwise. Since the density matrix is a positive-semidefinite Hermitian operator with unit trace, it has an orthonormal basis of eigenstates $\varphi_n$ whose corresponding real eigenvalues satisfy $\lambda_n \ge 0$ and $\sum_n \lambda_n = 1$ [21]:
$$\hat\rho = \sum_n \lambda_n\,|\varphi_n\rangle\langle\varphi_n|. \tag{46}$$
Therefore, from Eq. 45, the WDF can also be written as an incoherent sum of orthogonal pure modes:
$$W(x,p) = \sum_n \lambda_n\,W^{(\varphi_n)}(x,p). \tag{47}$$
The knowledge of the Wigner function or the corresponding density matrix $\hat\rho$ allows finding the modes and their weights via the standard eigenvector and eigenvalue problem, i.e.
$$\hat\rho\,|\varphi_m\rangle = \lambda_m\,|\varphi_m\rangle, \tag{48}$$
as seen from multiplying both sides of Eq. 46 by $|\varphi_m\rangle$ and using the orthonormality condition. Any convenient complete orthonormal basis can be chosen to represent the density matrix $\hat\rho$ and its eigenstates $\varphi_n$ [22].
Some additional comments about this property are in order. The decomposition given by Eq. 47 is distinct from Eq. 41 in that the decomposition yields orthogonal states, which is generally not the case for the way the mixed state was originally set up. As a result, the modes of the decomposition, Eq. 46, may bear little resemblance to the original preparation states of the density matrix (i.e. different mixtures may correspond to the same density operator). For the case of a mixed state with orthogonal preparation states, Eq. 41, the decomposition recovers the modes and their weights exactly. Small negative eigenvalues $\lambda_n$ usually indicate an experimental error in arriving at the density matrix (or the Wigner distribution) [22] and can serve as a diagnostic and a self-consistency check. Finally, as is generally the case for pure states, the modes $\varphi_n$ need not be simple in the sense that they do not necessarily have a small momentum-position uncertainty (i.e. one can have $M^2 \gg 1$).
As an example, consider the relatively complicated mode of Fig. 3 after one or more oscillation periods, which, despite having a large dispersion, is still a pure (fully coherent) mode with $\mathrm{Tr}(\hat\rho^2) = 1$. The use of Property 16 recovers just that mode, which itself may have a very rich spectrum in some other basis.
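The mode decomposition can be sketched numerically by diagonalizing a density matrix sampled on a position grid. The example below is a minimal illustration under assumed parameters: two equally weighted, non-orthogonal Gaussian preparation states displaced by ±a (natural units), for which the nonzero eigenvalues are $(1 \pm s)/2$ with overlap $s = e^{-a^2}$. The grid, the displacement and the variable names are hypothetical.

```python
import numpy as np

x = np.linspace(-8.0, 8.0, 401)
dx = x[1] - x[0]

def gauss(x0):
    """Normalized Gaussian wave packet centered at x0 (hbar = m = omega = 1)."""
    return np.pi**-0.25 * np.exp(-(x - x0) ** 2 / 2)

a = 0.8
psi1, psi2 = gauss(+a), gauss(-a)

# Density matrix rho(x1, x2) of an equal-weight mixture of the two preparation states.
rho = 0.5 * np.outer(psi1, psi1.conj()) + 0.5 * np.outer(psi2, psi2.conj())

# Discretized eigenproblem: Int rho(x1, x2) phi(x2) dx2 = lambda * phi(x1).
eigvals, eigvecs = np.linalg.eigh(rho * dx)
order = np.argsort(eigvals)[::-1]            # sort by decreasing weight
lam = eigvals[order]
modes = eigvecs[:, order[:2]] / np.sqrt(dx)  # two significant modes, unit norm on the grid

s = np.exp(-a**2)                            # overlap of the preparation states
print("mode weights:", lam[:2], "expected", (1 + s) / 2, (1 - s) / 2)
print("purity Tr(rho^2):", (lam**2).sum())
```

The recovered modes are the even and odd combinations of the two packets rather than the packets themselves, which illustrates that different preparations can share the same density operator.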

Example: superposition of two states
In this example we demonstrate the difference between a coherent superposition and an incoherent mixture of two Gaussian states. A numerical example of a mixed mode decomposition further demonstrates the use of Property 16.
First, let us consider another property useful in this example.
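Multiplying the wavefunction by linear and quadratic phase factors, $\tilde\psi(x) = \psi(x)\exp[i(p_0x + cx^2/2)/\hbar]$ with real constants $p_0$ and $c$, leads to the WDF $W^{(\tilde\psi)}$ related to the original $W^{(\psi)}$ via the linear momentum transformation
$$W^{(\tilde\psi)}(x,p) = W^{(\psi)}(x,\,p - p_0 - cx).$$
In other words, multiplying the wavefunction by a linear phase factor amounts to a shift in momentum, whereas the quadratic phase shift adds a linear correlation (chirp) to the momentum vs. position. In this example, the position is in nm and the velocity is in nm/fs (an electron is assumed).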
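The coherent superposition is of two Gaussian wave packets with equal weights (the Gaussians are nearly orthogonal, as seen from the fact that their Wigner distributions do not overlap and from Property 8).
Consider a superposition of two states $|\psi\rangle \propto |\psi_1\rangle + |\psi_2\rangle$. Using Property 10, the Wigner distribution has 3 terms,
$$W^{(\psi)} \propto W^{(\psi_1)} + W^{(\psi_2)} + W_i,$$
where the interference term is $W_i = W^{(\psi_1,\psi_2)} + W^{(\psi_2,\psi_1)} = 2\,\mathrm{Re}\,W^{(\psi_1,\psi_2)}$. The interference term is responsible for the oscillations seen in Fig. 5, and it is easy to show that $W_i$ for two orthogonal states carries no energy:
$$\iint W_i\,dx\,dp = 2\,\mathrm{Re}\,\langle\psi_1|\psi_2\rangle = 0.$$
On the other hand, a mixed state $\hat\rho \propto |\psi_1\rangle\langle\psi_1| + |\psi_2\rangle\langle\psi_2|$ has only 2 terms, one from each individual state, in its Wigner distribution without the interference term.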
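The difference between the two cases can be reproduced with a direct numerical evaluation of the WDF. The following sketch assumes natural units, two Gaussian packets displaced by ±3, and a Riemann-sum implementation of the position-space definition; the helper `wigner`, the grids and the packet parameters are illustrative and are not those used for Fig. 5.

```python
import numpy as np

def wigner(psi, x, p, hbar=1.0):
    """W[i, j] = W(x[i], p[j]) from a sampled wavefunction (zero outside the grid)."""
    n = len(x)
    dx = x[1] - x[0]
    half = n // 2
    k = np.arange(-half, half + 1)                  # shifts y = k*dx, with x' = 2y
    padded = np.pad(np.asarray(psi, dtype=complex), half)
    i = np.arange(n)[:, None] + half
    corr = np.conj(padded[i + k]) * padded[i - k]   # psi*(x+y) psi(x-y)
    phase = np.exp(2j * np.outer(k * dx, p) / hbar)
    return dx / (np.pi * hbar) * np.real(corr @ phase)

x = np.linspace(-10.0, 10.0, 321)
p = np.linspace(-6.0, 6.0, 193)
dx, dp = x[1] - x[0], p[1] - p[0]

a = 3.0
psi1 = np.pi**-0.25 * np.exp(-(x - a) ** 2 / 2)
psi2 = np.pi**-0.25 * np.exp(-(x + a) ** 2 / 2)

# Coherent superposition: add amplitudes, then normalize on the grid.
psi_sum = psi1 + psi2
psi_sum = psi_sum / np.sqrt((np.abs(psi_sum) ** 2).sum() * dx)
W_coherent = wigner(psi_sum, x, p)

# Incoherent mixture: add the individual WDFs with equal weights.
W_mixed = 0.5 * wigner(psi1, x, p) + 0.5 * wigner(psi2, x, p)

def purity(W):
    return 2 * np.pi * (W**2).sum() * dx * dp       # h * Int W^2, with hbar = 1

i0, j0 = len(x) // 2, len(p) // 2                   # indices of x = 0, p = 0
print("W at origin (coherent, mixed):", W_coherent[i0, j0], W_mixed[i0, j0])
print("minimum of W (coherent, mixed):", W_coherent.min(), W_mixed.min())
print("purity (coherent, mixed):", purity(W_coherent), purity(W_mixed))
```

Only the coherent superposition shows the oscillating interference fringes between the two packets, including negative values; the mixture is simply the sum of two positive Gaussians with roughly half the purity.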

Example: Gauss-Schell model
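Now we consider a quantum mechanical analog of what is known as a Gauss-Schell model in optics [23]. Using the same natural units of the SHO with $\hbar, m, \omega \to 1$, we rewrite the Wigner distribution similar to Eq. 32 in a generalized Gaussian form
$$W(r) = \frac{1}{\pi M^2}\exp\left(-\frac{r^2}{M^2}\right),$$
where as previously $r^2 = x^2 + p^2$. Setting $M^2 \to 1$ recovers a pure Gaussian ground state, whereas $M^2 > 1$ corresponds to a mixed state. As previously, Eq. 35 applies for our choice of units: $\epsilon = M^2/2$. Next, we use Eq. 44 to recover the density matrix
$$\langle x-\tfrac{x'}{2}|\hat\rho|x+\tfrac{x'}{2}\rangle = \frac{1}{\sqrt{\pi M^2}}\exp\left(-\frac{x^2}{M^2} - \frac{M^2x'^2}{4}\right).$$
A more common form of presenting the density matrix is as a Schell-model source,
$$\rho(x_1,x_2) = \sqrt{P(x_1)P(x_2)}\;\mu(x_1 - x_2),$$
where the probability density and the degree of spatial coherence are
$$P(x) = \frac{1}{\sqrt{\pi M^2}}\exp\left(-\frac{x^2}{M^2}\right), \qquad \mu(\Delta x) = \exp\left(-\frac{\Delta x^2}{2\sigma_\mu^2}\right), \qquad \sigma_\mu^2 = \frac{2M^2}{M^4 - 1}.$$
The function $\mu(\Delta x)$ is bound, $0 \le |\mu(\Delta x)| \le 1$, with 1 or 0 corresponding to a perfect or no phase correlation respectively. $\sigma_\mu$ is known as the coherence length in optics. E.g. $M^2 \to 1$ yields $\sigma_\mu \to \infty$ (a perfect phase correlation or pure state), whereas $M^2 \to \infty$ gives $\sigma_\mu \to 0$ (no phase correlation in the state).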
The decomposition eigenproblem (48) can be rewritten as
$$\int \rho(x_1,x_2)\,\varphi_n(x_2)\,dx_2 = \lambda_n\,\varphi_n(x_1).$$
This Fredholm integral equation yields the following spectrum of eigenvalues and eigenfunctions for the density matrix of Eq. 55 [23,24]:
$$\lambda_n = \frac{2}{M^2+1}\left(\frac{M^2-1}{M^2+1}\right)^n, \qquad \varphi_n(x) = \frac{1}{\sqrt{2^n n!\sqrt{\pi}}}\,H_n(x)\,e^{-x^2/2},$$
where $H_n$ are Hermite polynomials. Thus, the Gauss-Schell model adopts a particularly simple mode decomposition, which consists of the pure states of a simple harmonic oscillator. Additionally, it can be checked that
$$\mathrm{Tr}(\hat\rho^2) = \sum_n \lambda_n^2 = \frac{1}{M^2}. \tag{64}$$
Eq. 64 can be a source of confusion in that one may be tempted to equate $M^2$ (phase-space area, or emittance, or dispersion) directly to the spectral purity $\mathrm{Tr}(\hat\rho^2)$ for an arbitrary mixed state. This temptation should be resisted since quantum-mechanically the two concepts, the phase-space uncertainty $M^2$ and the mode purity $\mathrm{Tr}(\hat\rho^2)$, are distinct as argued previously. To the extent that the Gauss-Schell model is applicable to a quantum or optical system, such a blurred interpretation of $M^2$ simultaneously being a measure of dispersion and coherence may be justified. A notable exception is when $M^2 \to 1$, which corresponds to both perfect coherence and the minimum uncertainty of a pure Gaussian state. We shall see later, however, that the synchrotron radiation from an undulator source by a single electron is far from Gaussian and, therefore, such a dual interpretation of $M^2$ needs to be rejected for diffraction-limited electron beams. Similarly, the use of the Gauss-Schell model on a distinctly non-Gaussian phase space distribution function has little merit.
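The statement that a Gauss-Schell state is an incoherent sum of harmonic oscillator modes can be checked numerically. The sketch below works in natural units and assumes geometric mode weights $\lambda_n = (1-q)q^n$ with $q = (M^2-1)/(M^2+1)$, which is the standard thermal-state form and is taken here as an assumption rather than as the exact expression of [23,24]. It sums the radial mode WDFs $(-1)^n e^{-r^2}L_n(2r^2)/\pi$ and compares the result with the generalized Gaussian $\exp(-r^2/M^2)/(\pi M^2)$; function names and the value of $M^2$ are illustrative.

```python
import numpy as np

def gsm_weights(M2, nmax):
    """Assumed geometric mode weights lambda_n = (1 - q) q^n for a given M^2."""
    q = (M2 - 1.0) / (M2 + 1.0)
    return (1.0 - q) * q ** np.arange(nmax + 1)

def wigner_from_modes(lam, r):
    """Sum_n lam[n] * W_n(r), with Laguerre polynomials from the three-term recurrence."""
    xarg = 2.0 * r**2
    L_prev = np.ones_like(r)          # L_0
    L_curr = 1.0 - xarg               # L_1
    total = lam[0] * L_prev
    if len(lam) > 1:
        total = total - lam[1] * L_curr
    for k in range(1, len(lam) - 1):
        # (k + 1) L_{k+1} = (2k + 1 - x) L_k - k L_{k-1}
        L_next = ((2 * k + 1 - xarg) * L_curr - k * L_prev) / (k + 1)
        L_prev, L_curr = L_curr, L_next
        total = total + (-1) ** (k + 1) * lam[k + 1] * L_curr
    return total * np.exp(-r**2) / np.pi

M2 = 3.0
lam = gsm_weights(M2, nmax=80)
r = np.linspace(0.0, 4.0, 401)

W_modes = wigner_from_modes(lam, r)
W_gauss = np.exp(-r**2 / M2) / (np.pi * M2)

print("max deviation:", np.max(np.abs(W_modes - W_gauss)))
print("sum of weights:", lam.sum(), " purity:", (lam**2).sum(), " 1/M^2:", 1 / M2)
```

For this model only, the purity equals $1/M^2$; the check says nothing about non-Gaussian distributions, for which the two quantities are distinct.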

III. WIGNER DISTRIBUTION FOR SYNCHROTRON RADIATION
The connection of the Wigner distribution to describing partially coherent sources is usually made through the cross-spectral density function
$$\Gamma(\mathbf{r}_1,\mathbf{r}_2;\omega) = \langle E(\mathbf{r}_1;\omega)\,E^*(\mathbf{r}_2;\omega)\rangle. \tag{65}$$
Here $E(\mathbf{r}, \omega)$ is the frequency representation of the electric field, which is assumed for now to be a scalar function (e.g. linearly polarized light) of the 2D transverse coordinates $\mathbf{r} = (x, y)$ (e.g. in the detector plane), and $\langle\ldots\rangle$ means an ensemble average (e.g. over electron bunches for synchrotron radiation). For Eq. 65 to fully describe coherence properties, the source needs to be stationary in that all ensemble averages do not vary with respect to time (or at least the first and second moments are time-independent, which is a requirement for wide-sense stationary processes). Synchrotron radiation with its pulsed bunch structure is generally non-stationary. However, as argued in [8], one can use the cross-spectral density in the form of Eq. 65 if individual synchrotron radiation pulses last much longer than their coherence time (the time scale of short-term field fluctuations, inversely related to the source bandwidth), or $\sigma_t \gg N_u/\omega_0$ for an undulator source with $N_u$ undulator periods, resonant (radiation) frequency $\omega_0$, and electron bunches of duration $\sigma_t$. This condition is usually well satisfied (though an extension of the formalism can be straightforwardly made to describe nearly transform-limited sources in time-frequency domains). One also typically defines the spectral degree of coherence [25]
$$\mu(\mathbf{r}_1,\mathbf{r}_2;\omega) = \frac{\Gamma(\mathbf{r}_1,\mathbf{r}_2;\omega)}{\sqrt{\Gamma(\mathbf{r}_1,\mathbf{r}_1;\omega)\,\Gamma(\mathbf{r}_2,\mathbf{r}_2;\omega)}}.$$
The modulus of the spectral degree of coherence ranges from 0 to 1 for incoherent to fully coherent sources, $0 \le |\mu| \le 1$. For fully coherent radiation, $|\mu| = 1$ everywhere. This quantity is directly related to the fringe visibility in interference experiments.
A fully equivalent characterization can of course be made in time domain [26]. In what follows, we restrict our treatment to frequency domain, being a more natural choice for x-rays. Therefore, the frequency dependence for the functions will be understood while the symbol itself will usually be omitted from the expressions, e.g. E(r) ≡ E(r; ω).
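The Wigner distribution for optics is then given by [2]
$$W(\mathbf{r},\boldsymbol\theta) = \frac{1}{\lambda^2}\int \Gamma(\mathbf{r}-\tfrac{\mathbf{r}'}{2},\,\mathbf{r}+\tfrac{\mathbf{r}'}{2})\,e^{ik\mathbf{r}'\cdot\boldsymbol\theta}\,d^2r',$$
where the transverse position $\mathbf{r}$ and angle $\boldsymbol\theta = (\theta_x, \theta_y)$ form a conjugate pair similar to position-momentum in quantum mechanics (the small angle approximation is used throughout). The cross-spectral densities in the position and angular representations are defined according to
$$\Gamma(\mathbf{r}_1,\mathbf{r}_2) = \langle E(\mathbf{r}_1)\,E^*(\mathbf{r}_2)\rangle, \qquad \Gamma(\boldsymbol\theta_1,\boldsymbol\theta_2) = \langle E(\boldsymbol\theta_1)\,E^*(\boldsymbol\theta_2)\rangle,$$
where the angular representation $E(\boldsymbol\theta)$ of the radiation (far field) is related to its spatial representation $E(\mathbf{r})$ via the Fourier transform pair
$$E(\boldsymbol\theta) = \frac{1}{\lambda}\int E(\mathbf{r})\,e^{-ik\mathbf{r}\cdot\boldsymbol\theta}\,d^2r, \qquad E(\mathbf{r}) = \frac{1}{\lambda}\int E(\boldsymbol\theta)\,e^{ik\mathbf{r}\cdot\boldsymbol\theta}\,d^2\theta.$$
The radiation wavenumber $k$ above is given by $k = 2\pi/\lambda = \omega/c$ in terms of the wavelength $\lambda$, frequency $\omega$, and the speed of light $c$.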
The connection to quantum mechanics now becomes obvious. Refer to Table I. All the properties introduced in the previous section have their counterparts in optics. In particular, we note that the radiation wavelength $\lambda$ in wave optics plays the role of the Planck constant $h$ in quantum mechanics. E.g. geometric optics is recovered in the limit $\lambda \to 0$ just as the classical behavior can be obtained through $h \to 0$. Also, the overall degree of coherence $\mu_g^2$, which is directly equivalent to $\mathrm{Tr}(\hat\rho^2)$ of the density matrix $\hat\rho$ in quantum mechanics, can be expressed in terms of the Wigner distribution function as
$$\mu_g^2 = \frac{\lambda^2\int W^2(\mathbf{r},\boldsymbol\theta)\,d^2r\,d^2\theta}{\left(\int W(\mathbf{r},\boldsymbol\theta)\,d^2r\,d^2\theta\right)^2},$$
where the denominator, the total flux squared, plays a normalization role so that $0 \le \mu_g^2 \le 1$.

A. Polarized light
Treatment of polarized light [27] is directly analogous to the WDF of a spin-1/2 quantum particle [28], which requires a 2-component spinor to characterize a state. The reason that a spin-1 particle (photon) can be described by a 2-component spinor (as opposed to 3) is well known: only the $\pm\hbar$ spin projections along the direction of propagation (helicity) are realized for a massless particle.
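The Wigner distribution now becomes a 2 × 2 matrix $\mathbf{W}(\mathbf{r}, \boldsymbol\theta)$ (complex for off-diagonal elements) with components defined according to
$$W_{ab}(\mathbf{r},\boldsymbol\theta) = \frac{1}{\lambda^2}\int \langle E_a(\mathbf{r}-\tfrac{\mathbf{r}'}{2})\,E_b^*(\mathbf{r}+\tfrac{\mathbf{r}'}{2})\rangle\,e^{ik\mathbf{r}'\cdot\boldsymbol\theta}\,d^2r', \qquad a, b \in \{x, y\}.$$
Generalizing the formalism of polarized light [27,29], $\mathbf{W}(\mathbf{r}, \boldsymbol\theta)$ can be represented as a scalar function on the Poincaré sphere built from the generalized Stokes parameters $S_j(\mathbf{r}, \boldsymbol\theta)$, $j = 0, \ldots, 3$, which are found from
$$S_j(\mathbf{r},\boldsymbol\theta) = \mathrm{Tr}[\sigma_j\,\mathbf{W}(\mathbf{r},\boldsymbol\theta)].$$
Here $\sigma_j$ are the 2 × 2 Pauli matrices with $\sigma_0$ being the identity matrix, and $\boldsymbol\Omega$ is a vector mapping the Stokes parameters onto the Poincaré sphere with polar $\chi$ and azimuthal $\phi$ angles. The generalized Stokes parameters, which now play the role of a 4-component phase space distribution, can be written explicitly in terms of the WDF components, e.g. $S_0(\mathbf{r}, \boldsymbol\theta) = W_{xx}(\mathbf{r}, \boldsymbol\theta) + W_{yy}(\mathbf{r}, \boldsymbol\theta)$, $S_1 = W_{xy} + W_{yx}$, and $S_3 = W_{xx} - W_{yy}$, with $S_2$ given by the imaginary part of the off-diagonal components. In what follows we may occasionally refer to the generalized Stokes parameters as simply Stokes parameters. $S_0(\mathbf{r}, \boldsymbol\theta)$ represents the total intensity in phase space, $S_1(\mathbf{r}, \boldsymbol\theta)$ represents +45°/−45° linearly polarized light (for +/− respectively), $S_2(\mathbf{r}, \boldsymbol\theta)$ corresponds to right/left-hand circular polarization, and $S_3(\mathbf{r}, \boldsymbol\theta)$ to x/y-linear polarization. We note that the exact signs here apply only to the intensity projections $s_j(\mathbf{r})$, since the WDF ($S_0$) is allowed to take on negative values while its projections (marginals) are guaranteed to be positive. For example, a Gaussian mode with x-polarization will have $S_0(\mathbf{r}, \boldsymbol\theta) = S_3(\mathbf{r}, \boldsymbol\theta) > 0$ with the other Stokes parameters being 0, or for left-hand circular polarization $S_2(\mathbf{r}, \boldsymbol\theta) = -S_0(\mathbf{r}, \boldsymbol\theta)$. Whereas fully polarized light satisfies $s_0^2 = s_1^2 + s_2^2 + s_3^2$ and partial polarization manifests itself as $s_0^2 > s_1^2 + s_2^2 + s_3^2$, the generalized Stokes parameter $S_0$ can take on local negative values and deviate from these expressions.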
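As shown in [27], the overall degree of coherence for vectorial waves can be written as an integral over phase space and the Poincaré sphere, where $d^2\Omega = \sin\chi\,d\chi\,d\phi$. Equivalently, in terms of the generalized Stokes parameters,
$$\mu_g^2 = \frac{\lambda^2\int \tfrac{1}{2}\left(S_0^2 + S_1^2 + S_2^2 + S_3^2\right)d^2r\,d^2\theta}{\left(\int S_0\,d^2r\,d^2\theta\right)^2},$$
and explicitly in terms of the WDF components as
$$\mu_g^2 = \frac{\lambda^2\int \left(W_{xx}^2 + W_{yy}^2 + 2|W_{xy}|^2\right)d^2r\,d^2\theta}{\left(\int (W_{xx} + W_{yy})\,d^2r\,d^2\theta\right)^2}.$$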

B. Wigner distribution projections
One of practical limitations of the Wigner distribution is that generally one needs to employ four-dimensional arrays as a function of light frequency (and possibly time if the temporal structure inside an individual synchrotron pulse is important) times 4 for Stokes parameters to represent the radiation fully. In addition to large memory requirements, one typically prefers to visualize two-dimensional projections rather than the entire phase space, much as it is done in accelerator physics for particle tracking. Here, we mention some of the properties of such projected WDFs, limiting our discussion to linearly polarized light for simplicity. Important 2D projections of the Wigner distribution are intensity I(x, y), the far field (angular) intensity I(θ x , θ y ), x − θ x and y − θ y phase space projections, B x (x, θ x ) and B y (y, θ y ).
If the radiation modes are separable, i.e. can be written in the form $E(x, y) = \phi_x(x)\phi_y(y)$ (for example Hermite-Gaussian modes), then all the properties discussed in Section II for the two-dimensional WDF in quantum mechanics apply to the Wigner 2D projections after the normalization $W_{x,y} = B_{x,y}/F$, where $F = \int I(x, y)\,dx\,dy$ is the total (spectral) flux. This includes the interpretation of $\lambda \int W_x^2(x, \theta_x)\,dx\,d\theta_x$, and of the similar expression for the y-plane, as a measure of coherence $\mu_{gx,y}^2$ (the analog of $\mathrm{Tr}(\rho^2)$ in quantum mechanics). On the other hand, for non-separable radiation fields (e.g. general radially symmetric modes), the same interpretation $\lambda \int W_x^2(x, \theta_x)\,dx\,d\theta_x = \mu_{gx}^2$ cannot be made. Nevertheless, for simple linear optics without coupling of the x, y-planes (drifts and lenses), the WDF projections can be propagated in the same way as the full four-dimensional Wigner distribution. We also note that for a pure mode with symmetric fields $E(-x, -y) = E(x, y)$, which are of practical importance to synchrotron radiation, the on-axis 2D brightness takes on the maximum possible value $B_{0x} \equiv B_x(0, 0) = (2/\lambda)F$, where $F$ is the total spectral flux contained in the mode.
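The coherence measure $\lambda \int W_x^2\,dx\,d\theta_x$ can be checked numerically for a simple case. The sketch below assumes a normalized Gaussian 2D phase-space density; the wavelength, the source size, and the function names are illustrative choices, not values from this work. For a Gaussian with emittance $\epsilon = \sigma_x\sigma_\theta$ the measure evaluates to $\lambda/(4\pi\epsilon)$, so it equals 1 at $\epsilon = \lambda/4\pi$, where the peak value also reaches $2/\lambda$.

```python
import numpy as np


def gaussian_wdf(x, theta, sigma_x, sigma_theta):
    """Normalized Gaussian density on the grid x (rows) by theta (columns)."""
    X, T = np.meshgrid(x, theta, indexing="ij")
    norm = 2.0 * np.pi * sigma_x * sigma_theta
    return np.exp(-0.5 * (X / sigma_x) ** 2 - 0.5 * (T / sigma_theta) ** 2) / norm


def coherence_measure(W, x, theta, wavelength):
    """Riemann-sum estimate of lambda * integral of W^2 dx dtheta."""
    dx, dth = x[1] - x[0], theta[1] - theta[0]
    return wavelength * np.sum(W ** 2) * dx * dth


wavelength = 1.55e-10                  # m, illustrative value
eps_dl = wavelength / (4.0 * np.pi)    # emittance of a coherent Gaussian mode
sigma_x = 10e-6                        # m, illustrative source size

results = {}
for label, eps in [("coherent", eps_dl), ("partially coherent", 2.0 * eps_dl)]:
    sigma_theta = eps / sigma_x
    x = np.linspace(-6 * sigma_x, 6 * sigma_x, 401)
    theta = np.linspace(-6 * sigma_theta, 6 * sigma_theta, 401)
    W = gaussian_wdf(x, theta, sigma_x, sigma_theta)
    mu2 = coherence_measure(W, x, theta, wavelength)
    peak_ratio = W.max() * wavelength / 2.0   # peak relative to the 2/lambda bound
    results[label] = (mu2, peak_ratio)
    print(f"{label}: mu^2 = {mu2:.4f}, peak / (2/lambda) = {peak_ratio:.4f}")
```

Doubling the emittance halves both the coherence measure and the peak value relative to the $2/\lambda$ bound.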

C. Light propagation
One of the strong appeals of the Wigner distribution function is its natural propagation for linear optics, which is entirely similar to the classical phase space evolution. As a result, the formalism developed in accelerator physics for classical phase space distributions can be directly carried over to (partially) coherent synchrotron radiation. In analogy to Property 5, the local values of the WDF stay constant on phase space trajectories subject to the classical transformation in drifts and lenses along the longitudinal position $z$, $W(\mathbf{r}(z_2), \boldsymbol{\theta}(z_2)) = W(\mathbf{r}(z_1), \boldsymbol{\theta}(z_1))$, where the coordinates at $z_1$ and $z_2$ are related by the classical linear transfer map. Similarly, for transport that is decoupled in the x, y-planes, the 2D projections of the WDF follow the classical transformation $B_y(y(z_2), \theta_y(z_2)) = B_y(y(z_1), \theta_y(z_1))$.
Drift or lens transformations lead to a rotated or sheared WDF, and since the projections of the WDF are accessible for measurement, this allows a reconstruction of the Wigner distribution through tomography, similar to the use of tomography in phase space reconstruction in accelerators.
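The shearing action of drifts and thin lenses on a 2D projection can be sketched in a few lines. The example below assumes the projection is available as a callable `B(x, theta)`; the function names and the dimensionless Gaussian test case are illustrative assumptions. A drift of length $L$ maps $x \to x + L\theta$, so the propagated density is the original one evaluated at the back-transformed point.

```python
import numpy as np


def drift(B1, L):
    """Projection after a drift of length L: B2(x, th) = B1(x - L*th, th)."""
    return lambda x, th: B1(x - L * th, th)


def thin_lens(B1, f):
    """Projection after a thin lens of focal length f: B2(x, th) = B1(x, th + x/f)."""
    return lambda x, th: B1(x, th + x / f)


def gaussian(x, th, sx=1.0, sth=1.0):
    """Upright Gaussian density (a beam at its waist), dimensionless units."""
    return np.exp(-0.5 * (x / sx) ** 2 - 0.5 * (th / sth) ** 2) / (2 * np.pi * sx * sth)


L = 2.0
B2 = drift(gaussian, L)

x = np.linspace(-22.0, 22.0, 881)
th = np.linspace(-8.0, 8.0, 321)
X, T = np.meshgrid(x, th, indexing="ij")
vals = B2(X, T)

# The second moment grows as sx^2 + L^2 * sth^2, while local values are preserved.
mean_x2 = np.sum(X ** 2 * vals) / np.sum(vals)
print("<x^2> after drift:", mean_x2)
print("peak before/after:", gaussian(0.0, 0.0), B2(0.0, 0.0))
```

Because the maps compose, a beamline of drifts and lenses can be represented by nesting these calls, e.g. `drift(thin_lens(drift(B, L1), f), L2)`.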
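The introduction of spatial filters (e.g. a pinhole or a slit) naturally leads to diffraction phenomena, and the Wigner distribution is altered in a non-trivial way. Whereas the electric field after an aperture with transmission $t(\mathbf{r})$ is simply $E(\mathbf{r}) \to E(\mathbf{r})t(\mathbf{r})$, the Wigner distribution is given by the convolution over the angular variables $\boldsymbol{\theta}$ of the input Wigner function with that of the spatial filter [5], $W(\mathbf{r}, \boldsymbol{\theta}) \to \int W(\mathbf{r}, \boldsymbol{\theta}')\,W_t(\mathbf{r}, \boldsymbol{\theta} - \boldsymbol{\theta}')\,d^2\theta'$, where $W_t(\mathbf{r}, \boldsymbol{\theta}) = \frac{1}{\lambda^2}\int t^*(\mathbf{r} + \mathbf{r}'/2)\,t(\mathbf{r} - \mathbf{r}'/2)\,e^{ik\mathbf{r}'\cdot\boldsymbol{\theta}}\,d^2r'$.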

IV. NUMERICAL EVALUATION
In this section we discuss practical matters pertaining to computing the Wigner distribution for undulator radiation. As we shall see, as long as the effects of the cosh-dependence of the undulator fields can be ignored, the synchrotron radiation in phase space can be obtained by a convolution of the WDF from a single electron with the phase space distribution of the entire electron beam. This is a consequence of the well-known fact that the electrons in a single bunch do not interfere with each other unless there is a microbunching structure on the wavelength scale. In that case, the computation of the radiation fields proceeds differently. In what follows, we limit our examples to the "electron only interferes with itself" scenario, as applicable for non-free-electron-laser (non-FEL) emission regimes.

A. Radiation field generation
Calculation of radiation fields is well established, e.g. see [30,31]. The frequency representation of the electric field is given by Eq. 96 for an observer at $\mathbf{r}$, with the vector from the electron to the observer $\mathbf{R} = \mathbf{r} - \mathbf{r}_e$, where $\mathbf{r}_e(\tau)$ is the electron's trajectory as a function of time $\tau$, the velocity $\boldsymbol{\beta} = c^{-1}\,d\mathbf{r}_e/d\tau$, and the unit vector $\mathbf{n} = \mathbf{R}/R$ with $R = |\mathbf{R}|$. Eq. 96 is exact and is convenient for numerical evaluation in that, once the trajectory $\mathbf{r}_e(\tau)$ is found, the integral evaluation is direct. We assume transversality of the field, i.e. $\mathbf{E} \approx (E_x, E_y, 0)$. An expression with the paraxial approximation for the field can be obtained [32]; however, it offers little advantage over the exact expression for numerical work. A simulation tool has been developed that solves for the electron trajectory in an arbitrary field configuration and evaluates the radiation integral, Eq. 96.
The undulator magnetic fields are taken to be of the usual form $B_x = B_{0x}\sin(k_u z)\cosh(k_u x)$, $B_y = B_{0y}\cos(k_u z)\cosh(k_u y)$. Here $\lambda_u = 2\pi/k_u$ is the undulator period, and $B_{0x,y}$ are the maximum magnetic fields in the two planes, with $B_{0x} = 0$ for a conventionally oriented planar undulator. The total undulator length is taken to be $L_u = N_u\lambda_u$, and the relation to the undulator K parameter is via the usual $K_{x,y} = eB_{0x,y}\lambda_u/(2\pi m_e c)$. To ensure on-axis orbits with no net deflection, the undulator fields are 1/4 and 3/4 of their nominal values for the first and second period halves on either undulator end.
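For quick estimates, the relation $K = eB_0\lambda_u/(2\pi m_e c)$ can be evaluated directly. In the sketch below the function names and the 2 cm period are illustrative assumptions; the value K = 0.696 is the one quoted for the helical undulator example in this paper.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
M_E = 9.1093837015e-31       # electron mass, kg
C_LIGHT = 299792458.0        # speed of light, m/s


def undulator_K(B0, lambda_u):
    """Deflection parameter K for peak field B0 (T) and period lambda_u (m)."""
    return E_CHARGE * B0 * lambda_u / (2.0 * math.pi * M_E * C_LIGHT)


def peak_field_for_K(K, lambda_u):
    """Peak field (T) needed to reach a given K at period lambda_u (m)."""
    return 2.0 * math.pi * M_E * C_LIGHT * K / (E_CHARGE * lambda_u)


# Familiar rule of thumb: K is about 0.934 for B0 = 1 T and lambda_u = 1 cm.
K_ref = undulator_K(1.0, 0.01)

# Peak field for K = 0.696 with a hypothetical 2 cm period.
B0_example = peak_field_for_K(0.696, 0.02)

print(f"K(1 T, 1 cm) = {K_ref:.4f}")
print(f"B0 for K = 0.696 at 2 cm period: {B0_example:.4f} T")
```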
For numerical evaluation of the Wigner distribution function, the Fourier transform of Eqs. 87, 88 is replaced with its discrete analog. A detector is placed at an arbitrary position $z$ downstream of the undulator, and the electric field is evaluated on a transverse grid of positions $\mathbf{r}_{kl} = (x_k, y_l, z)$. The phase space distribution is then typically back-propagated to the undulator center using the usual transforms. It should be noted that the discrete Fourier transform can suffer from aliasing and, in order to avoid this problem, the maximum angular extent of the radiation must be within $\pi/(k\Delta_{x,y})$, where $\Delta_{x,y}$ is the grid spacing of the radiation field sampling. To avoid very small grid spacings, it is convenient to use Property 17 to first remove the quadratic phase present in the radiation pattern. This is equivalent to the introduction of a perfect thin lens, which is subsequently removed after the WDF is evaluated but prior to phase-space propagation to a point of interest.
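The aliasing criterion is easy to evaluate before committing to a grid. The sketch below computes the bound $\pi/(k\Delta) = \lambda/(2\Delta)$ and compares it with the geometric angle $x/z$ subtended by the grid edge, which is the angle the quadratic phase imprints on the field at a distant detector. The combination of numbers (8 keV photons, a 3 mm wide grid with 1024 points, a detector at 50 m) is assembled here purely for illustration, and the function names are hypothetical.

```python
import math

HC_EV_M = 1.239841984e-6   # h*c in eV*m


def max_unaliased_angle(wavelength, grid_spacing):
    """Largest angle representable without aliasing, pi / (k * Delta)."""
    k = 2.0 * math.pi / wavelength
    return math.pi / (k * grid_spacing)


photon_energy_ev = 8000.0        # illustrative photon energy
grid_width = 3e-3                # m, full width of the sampling grid
n_points = 1024
detector_distance = 50.0         # m

wavelength = HC_EV_M / photon_energy_ev
spacing = grid_width / n_points
theta_max = max_unaliased_angle(wavelength, spacing)

# Angle of the spherical wavefront at the grid edge (quadratic phase term).
theta_edge = 0.5 * grid_width / detector_distance

print(f"aliasing limit : {theta_max * 1e6:.2f} urad")
print(f"grid-edge angle: {theta_edge * 1e6:.2f} urad")
print("quadratic phase should be removed first:", theta_edge > theta_max)
```

With these illustrative numbers the wavefront curvature alone exceeds the aliasing limit at the grid edge, which is the situation the quadratic-phase removal is meant to address.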

B. Electron bunch effect
The effect of adding radiation from many electrons is equivalent to the earlier considered example of a superposition of two quantum states. For any two electrons in the bunch, the electric field will differ by a phase factor $e^{i\omega t_j}$, where $t_j$ represents the time of the electron inside the bunch. It is easy to see that the interference term of Eq. 51 averages out to 0, since it contains essentially random phase factors $e^{\pm i\omega(t_j - t_k)}$ inside the averaging brackets. In other words, the uncertain phase relationship between the two electrons on the optical scale leads to an analogue of the density matrix case, where the interference term drops out and the WDF is simply an incoherent sum over all the electrons.
Therefore, if the Wigner distribution function of a single electron does not change its shape, but simply shifts in position and angle, as one would expect for an undulator in which the trajectories remain linear with small position and angle offsets, the overall radiation pattern is just a convolution (summation) of the electron distribution in phase space with the WDF of a single electron. Additional effects arise for a segmented undulator (with focusing between the segments), from the vertical focusing of a planar (horizontally deflecting) undulator, or from the larger off-axis field, which has a cosh-like dependence in the vertical plane for the planar undulator or in both planes for a helical undulator.
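In other words, the most general form of the Wigner distribution obtained from incoherent addition of radiation from individual electrons is given by Eq. 98. Here $N_e$ is the total number of electrons inside a bunch described by the probability density $P$. When the single-electron WDF $W_0$ only shifts with the electron's position and angle, the integral of Eq. 98 is replaced with a convolution integral $W(\mathbf{r}, \boldsymbol{\theta}) = N_e \int W_0(\mathbf{r} - \mathbf{r}_e, \boldsymbol{\theta} - \boldsymbol{\theta}_e)\,P(\mathbf{r}_e, \boldsymbol{\theta}_e)\,d^2r_e\,d^2\theta_e$.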
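For a 2D projection, the convolution of the single-electron WDF with the electron phase-space density can be sketched with FFTs. The example below is a minimal illustration, not the code used in this work: both arrays are assumed to share a grid whose origin sits at index `N // 2`, the Gaussian shapes and their widths are arbitrary test inputs, and the FFT product implements a circular convolution, so both distributions must decay well inside the grid.

```python
import numpy as np


def convolve_with_beam(W0, P, dx, dth, n_electrons=1.0):
    """Convolve a single-electron projection W0 with the electron density P.

    Both arrays are sampled on the same (x, theta) grid with the origin at
    index N // 2 along each axis.
    """
    kernel = np.fft.ifftshift(P)   # move the origin to index 0
    conv = np.fft.ifft2(np.fft.fft2(W0) * np.fft.fft2(kernel)).real
    return n_electrons * conv * dx * dth


def gaussian(X, T, sx, sth):
    return np.exp(-0.5 * (X / sx) ** 2 - 0.5 * (T / sth) ** 2) / (2 * np.pi * sx * sth)


n = 256
dx, dth = 0.25, 0.1                      # arbitrary units
x = (np.arange(n) - n // 2) * dx
th = (np.arange(n) - n // 2) * dth
X, T = np.meshgrid(x, th, indexing="ij")

W0 = gaussian(X, T, 2.0, 1.0)            # stand-in for the single-electron WDF
P = gaussian(X, T, 3.0, 0.5)             # stand-in for the electron distribution

W = convolve_with_beam(W0, P, dx, dth)

flux = W.sum() * dx * dth
var_x = (X ** 2 * W).sum() * dx * dth / flux
var_th = (T ** 2 * W).sum() * dx * dth / flux
print(f"flux = {flux:.6f}, <x^2> = {var_x:.4f}, <theta^2> = {var_th:.4f}")
```

For Gaussian inputs the variances simply add (here 4 + 9 and 1 + 0.25), which gives a convenient sanity check before feeding in a non-Gaussian single-electron WDF.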
The effect of energy spread in the electron beam can be quite significant, and most generally it is accounted for by extending the integration variable $V_e$ to also include the energy. E.g. the effect of a small energy spread $\delta_e \equiv \Delta\gamma_e/\gamma_e \ll 1$, where $\gamma_e$ is the normalized electron energy (later denoted as simply $\gamma$), leads to Eq. 101. Note that for a small energy change $\delta_e \sim 1/N_u$ with a large number of undulator periods $N_u \gg 1$, the effect on the radiation pattern at a given frequency $\omega_0$ is identical to that of the on-energy particle ($\delta_e = 0$) while tuning the radiation frequency off the resonance by $\Delta\omega/\omega_0 = -2\delta_e$. The evaluation of Eq. 101 can be quite involved in terms of the computational resources required, even if it is straightforward in all other respects. However, if the electron distribution $P(V_e)$ is separable, i.e. $P(V_e) = P_x(x_e, \theta_{ex})P_y(y_e, \theta_{ey})P_\gamma(\delta_e)$, then the 2D projections of the WDF, $B_x(x, \theta_x)$ and $B_y(y, \theta_y)$, can be easily computed.
It is instructive to consider the requirements for when Eq. 99 is applicable in case of a planar undulator (horizontally deflecting). As mentioned previously, two effects can change the shape of the WDF depending on the (small) electron trajectory offsets in position and angle in vertical plane. One is the cosh-dependence of the vertical field, whereas the other is the natural undulator focusing.
The equation of motion for the average vertical position $y_{av}$ when $B_{0x} = 0$ for the undulator, Eqs. 97, can be written as [33] $d^2y_{av}/dz^2 = -k_{\beta y}^2\,y_{av}$, where the vertical focusing strength is given by $k_{\beta y} = eB_{0y}/(\sqrt{2}\gamma m_e c)$, or in terms of the period of oscillations due to focusing $L_{\beta y} = 2\pi/k_{\beta y} = \sqrt{2}\gamma\lambda_u/K_y$, where $K_y$ is the undulator K value and $\gamma = E/m_e c^2$ is the normalized energy of the electron. Typically, $L_{\beta y} \gg L_u$, i.e. the slow oscillation phase increment due to the focusing is $\ll 2\pi$ in undulators. Nevertheless, in order to be able to treat vertically offset trajectories as simple copies of each other, we require that the slow sine-like oscillations due to focusing produce a change in the electron trajectory's angle over the length of the undulator that is much smaller than the natural cone of the radiation, $\sqrt{\lambda/L_u}$ [6]. Integrating Eq. 102 for a typical vertical size $\sigma_y$, the angle change of the electron trajectory is of the order $\sigma_y k_{\beta y}^2 L_u$, which leads to the following requirement $\sigma_y k_{\beta y}^2 L_u \ll \sqrt{\lambda/L_u}$. Similarly, the vertical dependence of the magnetic field in the undulator, $B_y \propto \cosh(k_u y)$, causes vertically offset trajectories to effectively sample a larger $K_y$ value.
Therefore, to enable the simpler treatment, we require that $\Delta K_y/K_y \approx (k_u y)^2/2$ produces a change in the undulator wavelength $\lambda = \frac{\lambda_u}{2\gamma^2}(1 + K_y^2/2)$ which is much smaller than the natural undulator bandwidth $\Delta\lambda/\lambda \sim 1/N_u$. This leads to another requirement for the electron beam size $\frac{K_y^2}{1 + K_y^2/2}\,\frac{(k_u\sigma_y)^2}{2} \ll \frac{1}{N_u}$. Electrons coming with a vertical angle into a planar undulator generally sample a more complicated magnetic field pattern, such as shown in Fig. 7. Therefore, the following requirement can be imposed on the vertical angular size $\sigma_{y'} L_u \ll \sigma_y \ll \lambda_u/2\pi$. In summary, if the requirements 103, 104, and 106 are satisfied, the simple convolution of a single-electron radiation pattern with that of the electron bunch phase-space distribution, Eq. 100 or Eq. 101, can be used. Otherwise, the more general integral, Eq. 98, needs to be evaluated. We note that the potential complications discussed here apply only to the vertical plane for a planar undulator with exact translational symmetry of the fields in the x-direction.
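The size of these effects can be estimated with a few lines of arithmetic. The sketch below uses $L_{\beta y} = \sqrt{2}\gamma\lambda_u/K_y$ from the text and the expansion $\cosh(k_u y) \approx 1 + (k_u y)^2/2$; the relative wavelength shift $\Delta\lambda/\lambda = \frac{K_y^2}{1 + K_y^2/2}\,\Delta K_y/K_y$ is obtained here by differentiating the resonance condition. The 5 GeV energy and the 25 m, 1250-period undulator are taken from the example later in the paper, whereas $K_y = 1$ and $\sigma_y = 10$ µm are hypothetical values chosen only for illustration.

```python
import math

M_E_C2_EV = 0.51099895e6   # electron rest energy, eV


def focusing_period(gamma, lambda_u, K_y):
    """Period of the slow vertical oscillation, L_beta = sqrt(2)*gamma*lambda_u/K_y."""
    return math.sqrt(2.0) * gamma * lambda_u / K_y


def relative_wavelength_shift(y, lambda_u, K_y):
    """Relative resonance shift for a vertical offset y via the cosh field profile."""
    k_u = 2.0 * math.pi / lambda_u
    dK_over_K = 0.5 * (k_u * y) ** 2
    return K_y ** 2 / (1.0 + 0.5 * K_y ** 2) * dK_over_K


energy_ev = 5e9            # electron energy (from the 25 m undulator example)
n_u = 1250                 # number of periods (same example)
l_u = 25.0                 # undulator length, m
lambda_u = l_u / n_u       # 2 cm period
K_y = 1.0                  # hypothetical K value
sigma_y = 10e-6            # hypothetical vertical rms beam size, m

gamma = energy_ev / M_E_C2_EV
L_beta = focusing_period(gamma, lambda_u, K_y)
k_beta = 2.0 * math.pi / L_beta
angle_kick = sigma_y * k_beta ** 2 * l_u
shift = relative_wavelength_shift(sigma_y, lambda_u, K_y)

print(f"L_beta = {L_beta:.1f} m ({L_beta / l_u:.1f} undulator lengths)")
print(f"angle change over the undulator ~ {angle_kick * 1e6:.3f} urad")
print(f"wavelength shift x N_u = {shift * n_u:.2e} (should be << 1)")
```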

C. Revisiting emittance definition
Rms emittance, Eq. 14, is widely used as a measure of beam quality in accelerator physics.
While this definition is attractive due to the fact that it can be applied to a variety of different distributions, the connection of the rms emittance to phase space density or brightness available in the beam is generally distribution dependent. Whereas equilibrium processes (e.g. radiation damping in storage rings, equilibrium beam in a focusing channel under the influence of space charge [34], etc.) lead to a Gaussian distribution in phase space, beams in linear accelerators are rarely in equilibrium. As a result a meaningful characterization of the phase space of electron beams or, as we shall see later, the synchrotron radiation, needs a more flexible metric than the rms emittance alone. Short of the complete knowledge of the actual phase space distribution, a useful way to reduce and represent the information is to extend the concept of the rms emittance to the so-called brightness curve or rms emittance vs. beam fraction [35]. As we will see, a wide class of practical phase space distributions can be effectively characterized by such a curve as the beam fraction is varied from 0 to 100%.
Three parameters, the usual rms emittance ($\epsilon = \epsilon(100\%)$, with 100% denoting that the entire beam is included in the emittance calculation), the core emittance $\epsilon_c$, and the core fraction $f_c$, can convey information not only about the second moments of the beam distribution, but also about the peak brightness and what fraction of the beam effectively contributes to this brightness. The situation is somewhat analogous to how the peak height and the full width at half maximum complement the rms width information for arbitrary (unimodal and finite integrable in the second moment sense) pulses.
Below is one prescription for obtaining the emittance vs. fraction curve. Here we only consider the case of a two-dimensional phase space, $\mathbf{x} = (x, p)$, where $x$ is the transverse coordinate and $p$ can represent (normalized) transverse momentum or angle. The phase space distribution function $P(x, p)$ is assumed to be normalized, $\int P(x, p)\,dx\,dp = 1$. One can apply the following procedure:
1) For an ellipse of a fixed area $\pi a$, choose the Twiss parameters $T$ of the ellipse (c.f. Eq. 14) that maximize the beam fraction contained therein: $f(a) = \max_T \int_{D(a)} P(x, p)\,dx\,dp$, where $D(a)$ is the region enclosed by the ellipse.
2) Obtain the rms emittance $\epsilon(a)$ for $\mathbf{x} \in D(a)$ of Eq. 107. The parametric curve $(f(a), \epsilon(a))$ is the emittance vs. fraction curve, $\epsilon(f)$.
3) Define the core emittance, $\epsilon_c$, and the core fraction, $f_c$, according to $\epsilon_c \equiv d\epsilon/df\,|_{f \to 0}$ and $\epsilon(f_c) = \epsilon_c$.
We have assumed that each individual ellipse $D(a)$ remains centered around the origin, as does the corresponding centroid of the beam fraction. Generalization to the case when this does not hold is straightforward by allowing the clipping ellipse to shift. This procedure for obtaining the emittance vs. fraction curve is meaningful for distributions which are unimodal (i.e. with a single hump) and finite integrable (for second moments).
It is easy to show that the core emittance is directly related to the peak phase space density or brightness $P_0 = \max\{P(x, p)\}$: $\epsilon_c = 1/(4\pi P_0)$. To see this, one simply needs to note that a small-area clipping ellipse in the limit $a \to 0$ cuts out a uniform slice containing the beam fraction $\pi a P_0$ and having the rms emittance of $a/4$. It is interesting to note that because of Property 3, which states that the maximum of the Wigner distribution is $2/h$ for any even pure state, with the corresponding 2D equivalent in optics of $2/\lambda$, the minimum core emittance (the diffraction limit) is therefore $\epsilon_{c,\min} = \lambda/8\pi$, and it can only be larger for a symmetric mode when the coherence $\mu_g^2 < 1$. Thus, the core emittance (or peak brightness) is a more general indicator of whether the radiation is coherent than the rms emittance. This is because the rms emittance minimum is attained only for a Gaussian coherent mode, whereas the minimum core emittance is realized for any symmetric coherent mode.
In the comparison of a Gaussian with a uniform distribution, each distribution is normalized to have $\epsilon = \sigma_x = \sigma_p = 1$ in these natural units. As seen, the core emittance conveniently captures the fact that the peak brightness of a Gaussian is ×2 larger than that of the uniform distribution of the same rms width, as well as the fact that the core fraction in the Gaussian is smaller (0.715 vs. 1 for the uniform). Another example, Fig. 9, compares a Gaussian distribution $P(r) = \frac{1}{2\pi}e^{-r^2/2}$ and the two-component distribution $P(r) = \frac{p}{2\pi\epsilon_1}e^{-r^2/2\epsilon_1} + \frac{1-p}{2\pi\epsilon_2}e^{-r^2/2\epsilon_2}$, with $p = 0.5$, $\epsilon_1 = 1/5$, and $\epsilon_2 = 9/5$. The total emittance in this case is $\epsilon = p\epsilon_1 + (1 - p)\epsilon_2 = 1$. Once again, the information about the peak brightness is lost when only the rms emittance is quoted, but is conveyed conveniently with the three parameters $\{\epsilon, \epsilon_c, f_c\}$. A practical measure of the available beam brightness can be defined as $f_c/\epsilon_c$, a subject that we explore further below.
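The emittance vs. fraction procedure can be sketched numerically for round distributions in normalized coordinates, where the optimal clipping ellipse is assumed to be a circle of radius $R$ with $a = R^2$, so no search over Twiss parameters is needed. This simplification, the function names, and the grid settings are assumptions of the sketch; it uses the relations $\epsilon_c = 1/(4\pi P_0)$ and $\epsilon(f_c) = \epsilon_c$, and the two test distributions are the Gaussian and the two-component Gaussian quoted in the text.

```python
import numpy as np


def emittance_vs_fraction(P, r_max=15.0, n=150001):
    """Return (f, eps) for a round, normalized density P(r) clipped at radius r."""
    r = np.linspace(0.0, r_max, n)
    dr = r[1] - r[0]
    w = 2.0 * np.pi * r * P(r)      # beam fraction per unit radius
    m = w * r ** 2                  # second-moment integrand
    # Cumulative trapezoidal integrals of the fraction and of r^2.
    f = np.concatenate(([0.0], np.cumsum(0.5 * (w[1:] + w[:-1]) * dr)))
    r2 = np.concatenate(([0.0], np.cumsum(0.5 * (m[1:] + m[:-1]) * dr)))
    eps = np.zeros_like(f)
    eps[1:] = 0.5 * r2[1:] / f[1:]  # rms emittance = <x^2> = <r^2>/2 of the clipped beam
    return f, eps


def core_parameters(P):
    """Return (total rms emittance, core emittance, core fraction)."""
    f, eps = emittance_vs_fraction(P)
    eps_c = 1.0 / (4.0 * np.pi * P(0.0))
    f_c = np.interp(eps_c, eps, f)  # solve eps(f_c) = eps_c on the curve
    return eps[-1], eps_c, f_c


def gaussian(r):
    return np.exp(-0.5 * r ** 2) / (2.0 * np.pi)


def two_gaussians(r, p=0.5, e1=0.2, e2=1.8):
    core = p / (2.0 * np.pi * e1) * np.exp(-0.5 * r ** 2 / e1)
    halo = (1.0 - p) / (2.0 * np.pi * e2) * np.exp(-0.5 * r ** 2 / e2)
    return core + halo


gauss_result = core_parameters(gaussian)
double_result = core_parameters(two_gaussians)
for name, res in [("Gaussian", gauss_result), ("two Gaussians", double_result)]:
    print(f"{name}: eps = {res[0]:.3f}, eps_c = {res[1]:.3f}, f_c = {res[2]:.3f}")
```

Both distributions have the same rms emittance, while the core emittance and core fraction differ, which is the information the three-parameter description is meant to retain.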

D. Possible definitions of brightness
As a phase space quasi-probability, the WDF is the generalized brightness (also known as microscopic brightness [35]), B(r, θ) ≡ W (r, θ). It is convenient, however, to be able to reduce the information to a single parameter, which, for example, can facilitate comparison of various partially coherent sources. Here we revisit several of the definitions that can be useful for this purpose remembering that no single reduced parameter or a figure of merit can suit all the practical purposes.
1) The following definition, which we denote as classical, can be written (modulo a prefactor that generally depends on the actual distribution shape) as $B \propto F/(\epsilon_x\epsilon_y)$, where $F$ is the overall (spectral) flux. In the definition above we have assumed that the 4D emittance can be represented as a product of two 2D emittances. This definition, which gives a positive quantity, is easy to compute and can serve as a measure of brightness. One drawback is the use of the rms emittance, which, as discussed previously, fails to capture the peak brightness available in the beam and tends to exaggerate the importance of tails when non-Gaussian distributions are encountered. A possible modification to the definition of Eq. 113 is to write the effective brightness as $B \propto F f_{cx} f_{cy}/(\epsilon_{cx}\epsilon_{cy})$, where in place of the rms emittances as a measure of the effective phase space area we use the core emittances $\epsilon_{cx,y}$, while at the same time reducing the participating flux by the product of the core fractions in each plane, $f_{cx,y}$. All the necessary quantities in Eq. 114 can be found from the WDF as discussed previously.
These classical definitions, however, fail to capture the concept of coherence. A mode with a large dispersion (emittance) but perfectly coherent is indistinguishable from its incoherent analog of the same emittance.
2) As we have seen, the WDF contains the information about the density matrix, which, to the overall flux factor leads to the following natural definition for brightness (denoted as average brightness) This definition is discussed in a classical context in [36], though its genuine justification becomes clear from the connection to quantum or wave phenomena. The brightness of Eq. 115 is higher for more coherent radiation, even though the dispersion or emittance no longer comes into this definition. In particular, as pointed out previously, a pure mode, no matter how dispersed it gets, would have the same B av provided the flux remains unchanged.
3) Another definition is simply to quote the on-axis peak brightness $B_0 = W(\mathbf{0}, \mathbf{0})$. An obvious drawback of this definition is that the WDF is not guaranteed to be positive. However, as previously discussed, the on-axis WDF is always positive for symmetric (even) modes, which are of most practical interest for synchrotron radiation. Additionally, the peak brightness, due to the boundedness property (Property 3), can serve as a measure of coherence, because for any pure and symmetric (even) mode $B_0$ is guaranteed to be related to the total (coherent) spectral flux according to $B_0 = (2/\lambda)^2 F$. As pointed out previously, the core emittance is inversely related to the peak brightness.
Finally, for the purpose of the numerical examples below, it will be convenient to consider 2D projections of the WDF which are easy to visualize. The extension of the above definitions to 2D is straightforward and the equivalent meaning remains intact only when the mode is separable in x, y-planes. In particular, the Eq. 115 in 2D becomes and equations Eqs. 116 and 117 with equivalent expressions for y-plane. We are going to use Eq. 118 even when the mode is not separable as a measure of effective average brightness in one plane.
where $I$ is the average beam current (non-FEL process is assumed), and $\epsilon_0$ is the vacuum permittivity.
with fine-structure constant α, and function F n (K) = K 2 n 2 /(1 +  Refer to text for other parameters.
flux, Eq. 122, in terms of the computed fields via Eq. 121. Fig. 10b compares on-resonance spectral flux with the analytical result where Q n (K) = (1 + K 2 /2)F n (K)/n. In what follows, we denote the spectral flux by simply F 0 implicitly assuming the usual 0.1% bandwidth scaling. To find the total flux, the detector in simulations is placed 50 m away from the undulator center and the electric field is computed on a 1024 × 1024 3 mm square grid.
To check that the code correctly computes the on-axis (peak) brightness for the 4D and 2D WDF computed from the fields, we use Eq. 123 with Eqs. 117 and 120, which relate the total flux to the peak brightness of any symmetric coherent mode according to $B_0 = (2/\lambda)^2 F_0$ and $B_{0x} = (2/\lambda)F_0$. The results of this cross-check are shown in Fig. 10c and 10d. While the planar undulator radiation on-axis is fully horizontally polarized, Fig. 11 shows the WDF for a helical undulator at its first harmonic ($N_u = 250$, $K_x = K_y = 0.696$, $\hbar\omega = 8$ keV). The WDF is obtained at the detector plane placed 50 m away from the undulator center, with subsequent back-propagation of the radiation phase space to the center of the undulator. As discussed previously, the case of a (nearly) pure circularly polarized wave leads to $|S_0| = |S_2|$ with the other generalized Stokes parameters being approximately zero, as seen in Fig. 11.
The ×-like shape of the x-ray phase space is persistent throughout all the examples. The explanation behind it is simple: the undulator, being an extended source, has radiation emitted from its beginning and its end, which must propagate different distances to reach the observer, or when (back)propagated to the undulator center. This results in the ×-like shape, with the two branches corresponding to the undulator ends.
In the remainder of this section, we limit our numerical examples to planar undulators, investigating the x-ray phase space for radiation on and off resonance, the segmented undulator with quadrupole focusing in between, and a 25-m long undulator including electron emittance and energy spread effects.
This example illustrates the x-ray phase space for radiation at the undulator resonance, Fig. 12, along with the emittance vs. fraction curve. It is seen that $M^2 > 1$, i.e. the rms emittance is not the minimum possible for a fully coherent mode. On the other hand, the core emittance is at its minimum possible value, as discussed previously. Also note that the value of the β-function, or Rayleigh range, is somewhat different from the $L_u/2$ or $L_u/2\pi$ values commonly quoted in the literature. Additionally, the full beam and its core have different β-function values. Therefore, proper matching with the electron beam depends on whether one maximizes the peak brightness or minimizes the overall rms emittance of light.
3. Example: segmented undulator with quad focusing
Next, we consider a segmented undulator with quadrupole focusing in between the two segments. Fig. 13 shows trajectories for different horizontal offsets of electrons going into the undulator.

Example: radiation off undulator resonance
Here we consider the radiation off the undulator resonance. This is of interest not only for practical cases of detuning or selecting the photon energy in a monochromator, but also when considering off-energy electrons (electron beams with energy spread). This is because for undulators with a large number of periods, the effect of tuning off resonance is identical to keeping the radiation frequency $\omega_0$ the same but changing the electron energy according to $\Delta\omega/\omega_0 = -2\Delta\gamma/\gamma$. Fig. 17 shows the emittance and β-function of light when the radiation frequency is scanned around the resonance. As shown previously, the core emittance is $\lambda/8\pi$ in all cases, whereas the rms emittance is minimal (though with $M^2 > 1$) around the resonance.
Finally, Fig. 18 shows the effective 2D average brightness. The deviation of $\mu_{gx}^2$ from 1 is due to the fact that the radiation mode is not separable, even though the radiation is fully transversely coherent in this case and therefore the full 4D $\mu_g^2 = 1$. The peak 2D brightness, which is not shown, simply follows the trend of Fig. 15 since it is related to the flux according to $B_{0x} = (2/\lambda)F$.

Example: including emittance and energy spread of electrons
Here we provide an example of including emittance and energy spread in the calculated WDF. For simplicity, we continue to limit ourselves to the 2D projection of the Wigner distribution function and treat the electron phase space probability distribution function as separable, $P(\mathbf{r}_e, \boldsymbol{\theta}_e, \delta_e) = P_x(x_e, \theta_{ex})P_y(y_e, \theta_{ey})P_\delta(\delta_e)$. Fig. 19 shows the horizontal phase space at 5 GeV obtained from the simulations of the photoinjector for 77 pC per bunch and 1.3 GHz repetition rate (average current of 100 mA), including the effects of the merger and the linear accelerator [37]. The energy spread of the electron beam is $\sigma_{\delta e} = 2 \times 10^{-4}$. To illustrate its effect, we consider a 25-m long undulator with $N_u = 1250$ periods. Table II summarizes the parameters used in this example. As seen, the radiation is computed slightly below the resonance, where the flux is roughly doubled.
To provide more optimal matching for the core of the beam, $\beta_x$ is chosen to be $\beta_x = 4$ m, close to the Rayleigh range of the core of the radiation from a pencil (zero emittance) beam. Fig. 20b shows the effect of the energy spread for an otherwise ideal (zero emittance) beam. Some degradation of $B_{avx}$ can be seen.
space with frequency and time where the timing structure is important. Though straightforward, such a description is rather challenging in terms of computational requirements, even though a sampled approach similar to particle tracking in accelerator physics can be employed to represent the radiation in the entire 6D phase space (the microscopic brightness is allowed to take on negative values). When the x-ray optics beamline consists of drifts and perfect lenses without clipping apertures, this description is complete and allows one to fully account for the light properties following geometric optics transformation rules. Introduction of apertures in the beam, however, requires the convolution of the transmissive mask's WDF with that of the beam. In this case, it might be more efficient to consider a decomposition of the partially coherent light into orthogonal mutually incoherent modes and to include the diffraction effects on each mode separately.
Nevertheless, the Wigner distribution function is demonstrated in this paper to be a rigorous and insightful way to describe the coherence and other properties of the synchrotron radiation. Its use will grow in importance as synchrotron x-ray sources with higher coherence become more prevalent.

VI. ACKNOWLEDGEMENTS
I would like to acknowledge stimulating discussions with Keith Nugent, who pointed out his work on the Wigner distribution measurements for partially coherent x-rays and the subsequent spatial mode decomposition. Andrew Gasbarro has assisted in tests and design of various MATLAB scripts used in this work. David Sagan is acknowledged for initial discussions on the synchrotron radiation calculation approaches.