Phase tracking for sub-shot-noise-limited receivers

Non-conventional receivers for phase-coherent states based on non-Gaussian measurements such as photon counting surpass the sensitivity limits of shot-noise-limited coherent receivers, the quantum noise limit (QNL). These non-Gaussian receivers can have a significant impact in future coherent communication technologies. However, random phase changes in realistic communication channels, such as optical fibers, present serious challenges for extracting the information encoded in coherent states. While there are methods for correcting random phase noise with conventional heterodyne detection, phase-tracking for non-Gaussian receivers surpassing the QNL is still an open problem. Here we demonstrate phase tracking for non-Gaussian receivers to correct for time-varying phase noise while allowing for decoding beyond the QNL. The phase-tracking method performs real-time parameter estimation and correction of phase drifts using the data from the non-Gaussian discrimination measurement, without relying on phase reference pilot fields. This method enables non-Gaussian receivers to achieve higher sensitivities and rates of information transfer than ideal coherent receivers in realistic channels with time-varying phase noise. This demonstration makes sub-QNL receivers a more robust, feasible, and practical quantum technology for classical and quantum communications.


I. INTRODUCTION
Optical communication with coherent states can achieve the highest rate of information transfer through lossy and noisy channels [1][2][3]. Coherent optical communications encode information in the coherent properties of the electromagnetic field, enabling high-spectral-efficiency modulation and high-sensitivity coherent detection [4,5]. Efficient coherent modulation and detection can dramatically increase the rate of information transfer beyond the reach of intensity encodings [5][6][7]. Moreover, the intrinsic nonorthogonality of coherent states can enable quantum communications [8][9][10], including quantum key distribution [11][12][13][14][15][16] for secure communications over optical networks [17,18]. However, coherent encodings are highly susceptible to phase noise and random phase variations in real-world devices and communication channels [5,6]. To preserve the expected advantage of coherent communications over intensity modulation and direct detection, communication protocols require efficient methods for phase estimation and phase tracking that correct for random phase changes induced by the channel [5][6][7] while remaining compatible with existing communication technologies. Moreover, practical scenarios in low-power and quantum communications require phase tracking based only on the transmitted signal state, without relying on transmissions of strong pilot phase reference pulses [19][20][21][22][23][24][25].
Conventional coherent receivers that realize Gaussian measurements, such as heterodyne receivers, can perform phase tracking based on signal post-processing in the digital domain with diverse and efficient methods for channel and phase estimation [26][27][28][29][30]. These methods have renewed interest in coherent communications for increasing information transfer and have made coherent communications more practical for future realizations of high-capacity communication networks [5,31,32].
Further developments in optical communication will seek to approach the ultimate limits of information transfer in realistic communication channels. Quantum information science (QIS) provides the basis for approaching the fundamental limits in receiver sensitivities [33] and information transfer in communications [1][2][3]. Receivers based on Gaussian measurements, Gaussian operations, and local operations and classical communication have been investigated for information processing, phase estimation, and state discrimination [34]. The optimal Gaussian receiver for the discrimination of two nonorthogonal coherent states is the simple homodyne receiver [35]. Furthermore, measurements based on adaptive homodyne detection can provide advantages for single-shot phase estimation of coherent states [36][37][38]. However, the ultimate limits of receivers based on Gaussian operations for state discrimination are still under investigation [34,39]. Among technologies enabled by QIS, nonconventional receivers, termed quantum receivers, use optimized non-Gaussian measurements based on photon counting [35,40-58] to provide sensitivities surpassing the quantum noise limit (QNL) of coherent receivers [5] and to approach the true quantum-mechanical limit, the Helstrom bound [33]. Moreover, non-Gaussian receivers performing joint measurements over coherent-state codewords hold promise to bridge the gap between the Shannon and the Holevo limits in capacity [1,59]. However, making non-Gaussian receivers practical for coherent communications in realistic channels will require novel approaches for performing efficient phase tracking.
These approaches will be fundamentally different from those based on conventional heterodyne detection using digital signal processing post-measurement [26][27][28][29][30], and will require realizing active phase estimation [54,60] and correction in real time, while ensuring performance beyond the QNL.
Here we demonstrate a phase-tracking method for non-Gaussian receivers for quadrature phase-shift-keyed (QPSK) coherent states [53] based on coherent displacement, adaptive measurements, and photon counting. The phase-tracking method performs phase estimation and correction in real time using the data collected from the non-Gaussian discrimination measurement [53] without relying on strong phase-reference pilot pulses. This method enables the non-Gaussian receiver to overcome random phase variations encountered in realistic communication channels, while allowing the receiver to perform decoding measurements with sensitivities beyond the QNL, the shot-noise limit of conventional coherent receivers. This demonstration makes non-Gaussian receivers a more robust, feasible, and practical quantum technology for optical communications, and represents a significant advance for realizing low-power communications approaching the quantum limits in realistic communication channels.

FIG. 1: Phase tracking for non-Gaussian receivers surpassing the QNL. (a) A sender (Alice) prepares a coherent state |α_k⟩ with a phase θ_k ∈ {0, π/2, π, 3π/2} by phase modulation of lasers. The pulses propagate through a channel inducing random phase drifts φ_off. The receiver (Bob) uses a local oscillator (LO) to perform optimized non-Gaussian discrimination measurements [53]. The mismatch between Alice's and Bob's phase reference frames caused by the channel increases the discrimination error for decoding information. (b) Probability of error for the adaptive non-Gaussian state discrimination measurement from Ref. [53] as a function of the phase offset φ_off between the signal and LO, for signal mean photon numbers |α|² = 2.0, 5.0, and 10.0, together with the ideal heterodyne limit (Het.) for each mean photon number. (c) Flowchart of the algorithm followed for phase tracking. The discrimination measurement provides samples for phase estimation containing detection results d_j for the relative phase between the input state and the LO, δ_j ∈ {0, π/2, π, 3π/2}. After 500 channel transmissions, the pairs {d_j, δ_j} are used to generate two estimates, φ̂_c and φ̂_s. A weighted average φ̂_i combines these estimates to increase accuracy. The final estimate φ̂_est is the average over N_avg phase estimates {φ̂_i} multiplied by a gain function g(⟨φ̂_i⟩). The phase-tracking method feeds forward the estimate φ̂_est to the LO at a rate of f_PT = f_expt/(500 N_avg) Hz for real-time phase tracking, with experimental repetition rate f_expt ≈ 12 kHz. (d) Expected phase estimate as a function of applied phase for |α|² = 5.0 for the Sin-Cos estimator φ̂_est without (red line) and with (green line) the optimized gain function g(⟨φ̂_i⟩), for a Bayesian estimator (blue line), and for the corrected Bayesian estimator (black line). The solid lines represent the mean of 100 Monte Carlo samples and the shaded regions correspond to one standard deviation.

Figure 1(a) shows the concept of phase tracking for a non-Gaussian measurement surpassing the QNL over a channel inducing random phase variations. The sender (Alice) uses laser pulses to encode information in four coherent states with phases θ_k ∈ {0, π/2, π, 3π/2}. The pulses propagate through the channel, which induces random phase shifts. The receiver (Bob) uses a laser as a local oscillator (LO) phase reference and performs a non-Gaussian discrimination measurement that surpasses the QNL for decoding the information [53]. The finite linewidths of the lasers in the transmitter and the receiver, together with the random channel phase variations, cause a mismatch between the phase-space reference frames of Alice and Bob. These random phase drifts severely affect the expected performance of the state discrimination measurement.
Figure 1(b) shows the probability of error for the adaptive non-Gaussian measurement for discriminating four nonorthogonal states |α_k⟩ ∈ {|α⟩, |iα⟩, |−α⟩, |−iα⟩} below the heterodyne limit [53], the QNL (see Appendix A), as a function of the phase offset φ_off between the input state |α_k⟩ and the receiver's LO, for mean photon numbers |α|² = 2.0, 5.0, and 10.0. While the discrimination strategy demonstrated in Ref. [53] can tolerate small phase errors φ_off without significant degradation, moderate values of φ_off severely limit its performance, preventing discrimination below the QNL. To preserve the expected performance benefit of the non-Gaussian measurement over the QNL, the receiver needs to perform phase tracking to correct for phase drifts induced by the channel. While phase tracking based on heterodyne measurements can be realized with digital signal processing after the measurement [22,23,30], non-Gaussian receivers require active phase tracking and correction in real time to maintain performance below the QNL [49,53,55]. Here we demonstrate a method for actively tracking and correcting for time-varying random phases for non-Gaussian receivers to enable sensitivities beyond the ideal heterodyne limit in channels inducing random phase variations.

II. PHASE TRACKING FOR NON-GAUSSIAN RECEIVERS
The phase-tracking method for non-Gaussian receivers builds on a strategy for discriminating a state |α_k⟩ ∈ {|α⟩, |iα⟩, |−α⟩, |−iα⟩} by implementing N adaptive measurements with photon-number resolution [53]. During each adaptive measurement j = 1, 2, ..., N, the receiver's LO performs hypothesis testing of the input state |α_k⟩ by adjusting its phase θ_j ∈ {0, π/2, π, 3π/2} according to a Bayesian discrimination strategy [53]. After N adaptive measurements, the receiver provides an answer θ_disc to the state discrimination problem, that is, its estimate of the phase of the input state |α_k⟩ (see Appendix B). Assuming that this answer θ_disc is correct, the data collected during the N adaptive measurements can be used as samples for estimating the phase φ_off induced by the channel. For a discrimination measurement, these data consist of N photon-counting detections {d_1, d_2, ..., d_N}, together with the LO phases during each adaptive measurement {θ_1, θ_2, ..., θ_N}. Since θ_disc is a very good estimate of the phase of the input state, it can be used to estimate the relative phase δ_j between the input state and the LO in each adaptive measurement as δ_j = θ_j − θ_disc. The data for estimating φ_off then consist of the pairs {d_j, δ_j}. Accumulating data during a moderate number of channel transmissions allows for estimating φ_off in real time and performing phase tracking simultaneously with the state discrimination measurement [53]. This method enables the receiver to use the data from the discrimination measurement to estimate and correct for random phase excursions.
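The bookkeeping that turns one discrimination measurement into estimation samples can be sketched in a few lines of Python. This is only an illustration: the function names, the integer encoding of the phases as multiples of π/2, and the example record are assumptions introduced here, and reducing δ_j modulo 2π is inferred from the stated alphabet δ_j ∈ {0, π/2, π, 3π/2}.

```python
import math

# QPSK phase alphabet; index m stands for the phase m*pi/2.
PHASES = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]


def relative_phase_index(lo_index: int, disc_index: int) -> int:
    """Index of delta_j = theta_j - theta_disc, reduced modulo 2*pi."""
    return (lo_index - disc_index) % 4


def build_samples(counts, lo_indices, disc_index):
    """Pair every photon count d_j with the index of its relative phase delta_j."""
    if len(counts) != len(lo_indices):
        raise ValueError("one LO phase is needed per adaptive measurement")
    return [(d, relative_phase_index(t, disc_index))
            for d, t in zip(counts, lo_indices)]


# Hypothetical record of one transmission with N = 7 adaptive measurements.
counts = [0, 2, 0, 0, 1, 0, 0]        # photon counts d_j
lo_indices = [0, 0, 1, 1, 1, 1, 1]    # LO phases theta_j as multiples of pi/2
samples = build_samples(counts, lo_indices, disc_index=1)
deltas = [PHASES[m] for _, m in samples]
```

In this made-up record the LO first tests the phase 0, registers photons, and then settles on π/2, so the first two samples are assigned δ = 3π/2 and the remaining five δ = 0.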
To obtain an estimate of φ_off, the phase-tracking method uses the data {d_j, δ_j} collected from discrimination measurements over 500 channel transmissions. These data consist of photon-counting samples of the interference between the input state and the LO for relative phases δ ∈ {0, π/2, π, 3π/2}. In principle, different estimators can produce an estimate from the pairs {d_j, δ_j} (see Appendix C for two possible estimators). However, phase tracking for non-Gaussian receivers requires a simple estimator that can be calculated efficiently in real time while being robust to the unavoidable errors from the discrimination measurement. A simple estimator can be obtained by using {d_j, δ_j} to generate four photon-number (Poisson) distributions P_0(n_k|δ = 0), P_π/2(n_k|δ = π/2), P_π(n_k|δ = π), and P_3π/2(n_k|δ = 3π/2) for δ ∈ {0, π/2, π, 3π/2} [see Fig. 1(c)]. Here n_k is the number of detected photons in each distribution. By calculating the differences between the means ⟨n⟩_δ of these distributions, we can form estimates of φ_off as
\[ \hat{\phi}_c = \arccos\!\left[\frac{\langle n\rangle_{\pi} - \langle n\rangle_{0}}{A}\right], \qquad \hat{\phi}_s = \arcsin\!\left[\frac{\langle n\rangle_{3\pi/2} - \langle n\rangle_{\pi/2}}{A}\right], \]
with
\[ A = f(|\alpha|^2)\,\frac{4|\alpha|^2\eta\xi}{N}, \]
where η is the detection efficiency, ξ is the interference visibility, and φ̂_c,s are the phase estimates. f(|α|²) is a factor that is used to reduce a bias in the phase estimates arising from the nonzero probability of error of the state discrimination strategy (see Appendix C). Errors in the state discrimination measurement (P_E ≠ 0) cause errors in populating the distributions P_δ and thus in their mean values ⟨n⟩_δ. These errors cause φ̂_c to be biased away from zero by making ⟨n⟩_π − ⟨n⟩_0 < 4|α|²ηξ/N. The function f(|α|²) corrects these biases by making ⟨n⟩_π − ⟨n⟩_0 ≈ f(|α|²) × 4|α|²ηξ/N, thus reducing the effects of discrimination errors in the phase estimation procedure. These errors also produce biases in φ̂_s by reducing ⟨n⟩_3π/2 − ⟨n⟩_π/2.
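A noise-free numerical check helps make the difference-of-means estimator concrete. The sketch below assumes that the mean count per adaptive measurement follows ⟨n⟩_δ = (2η|α|²/N)[1 − ξ cos(φ_off − δ)], a standard displacement model that reproduces the normalization 4|α|²ηξ/N quoted in the text; the sign convention for φ̂_s, the function names, and the clipping step are assumptions made here. The values η = 0.72, ξ = 0.998, and N = 7 are those reported for the experiment.

```python
import math


def model_mean(delta, phi, alpha2, eta, xi, n_adaptive):
    """Assumed mean count per adaptive measurement for relative phase delta."""
    return (2.0 * eta * alpha2 / n_adaptive) * (1.0 - xi * math.cos(phi - delta))


def clip_unit(x):
    """Keep the argument of asin/acos inside [-1, 1] despite noise."""
    return max(-1.0, min(1.0, x))


def sin_cos_estimates(means, alpha2, eta, xi, n_adaptive, f=1.0):
    """Return (phi_c, phi_s) from mean counts; means[m] belongs to delta = m*pi/2."""
    scale = f * 4.0 * alpha2 * eta * xi / n_adaptive
    phi_c = math.acos(clip_unit((means[2] - means[0]) / scale))
    phi_s = math.asin(clip_unit((means[3] - means[1]) / scale))
    return phi_c, phi_s


# Noise-free check: exact model means in, true phase offset out.
params = dict(alpha2=5.0, eta=0.72, xi=0.998, n_adaptive=7)
phi_true = 0.3
means = [model_mean(m * math.pi / 2, phi_true, **params) for m in range(4)]
phi_c, phi_s = sin_cos_estimates(means, **params)

# For a negative offset the arccos estimate loses the sign, the arcsin one keeps it.
means_neg = [model_mean(m * math.pi / 2, -phi_true, **params) for m in range(4)]
phi_c_neg, phi_s_neg = sin_cos_estimates(means_neg, **params)
```

With exact means and f = 1, both estimates return the applied offset. In practice the means come from finite Poisson samples populated by an imperfect discrimination result, which is where the bias factor f(|α|²) enters.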
Appendix C 1 describes the procedure for obtaining the optimal values of f(|α|²) that reduce the effects of discrimination errors. As a second step, the phase-tracking method uses a weighted average of the estimates φ̂_c and φ̂_s with relative weight r to obtain an estimate φ̂_i of φ_off within 500 transmissions [see Fig. 1(c)]. The weight r, determined from Monte Carlo simulations, reduces the difference between φ̂_i and φ_off at the end points of the capture range for phase tracking (±0.6 rad in our experimental demonstration). The final estimate φ̂_est is obtained by averaging over N_avg estimates φ̂_i and multiplying by a gain factor g(⟨φ̂_i⟩) that depends on the N_avg estimates {φ̂_i} (see Appendix D). This gain factor reduces the difference between the applied phase φ_off and the estimated phases {φ̂_i}. Figure 1(d) shows the result of Monte Carlo simulations of the final phase estimate φ̂_est with g(⟨φ̂_i⟩) = 1 (red line) and with g(⟨φ̂_i⟩) optimized to approach the true phase φ_off (green line). This final estimate φ̂_est is fed forward to the receiver's LO every 500 × N_avg channel transmissions, at a rate f_PT, for phase-drift correction. This method enables real-time phase tracking for correcting time-varying phases while enabling the non-Gaussian receiver to surpass the QNL.

III. EXPERIMENTAL CONFIGURATION
Figure 2 shows the experimental configuration for the demonstration of phase tracking of adaptive non-Gaussian state discrimination measurements for QPSK states {|α⟩, |iα⟩, |−α⟩, |−iα⟩}. The measurement strategy consists of N = 7 adaptive measurements via feedback in the phase of the LO [53]. A helium-neon (HeNe) laser at 633 nm and an acousto-optic modulator (AOM) prepare 35-µs coherent-state pulses at a rate of f_expt ≈ 12 kHz. The light pulses enter an unbalanced Mach-Zehnder interferometer through a 50/50 beam splitter. We prepare the phases of the input signal state and the LO with two 4:1 multiplexers (MUX) and two phase modulators (PM). The input states and LO interfere in a 99/1 beam splitter, which implements the displacement operation of the input state [61]. A field-programmable gate array (FPGA), FPGA1 in Fig. 2, implements the discrimination strategy based on adaptive measurements and photon-number-resolving (PNR) detection described in Ref. [53]. FPGA1 controls the timing of the experiment, processes photon detections, and updates the phase of the LO for each adaptive measurement. The overall detection efficiency of the experiment is η = 72%, with interference visibility ξ = 99.8%.
We use a second FPGA (FPGA2) to perform active phase tracking using the data pairs {d_j, δ_j} generated by the state discrimination measurement and sent from FPGA1, as described above. FPGA2 performs real-time phase estimation to obtain an estimate of φ_off and feeds this information forward to the receiver, which adjusts the LO phase for phase tracking and correction. Controllable phase offsets and phase noise in the input state are prepared with an arbitrary waveform function generator (FG) [58] for investigating the phase-tracking method in channels inducing random phase variations. We use an 8-bit digital-to-analog converter (DAC) to feed forward the estimated phase offset to the LO. We chose a finite capture range for the feed-forward phase of the LO, R = [−0.6, +0.6] rad. This choice results in a phase resolution of about 1.2 rad/2^8 ≈ 5 mrad for phase tracking while keeping the capture range sufficiently large.
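The trade-off between capture range and phase resolution follows from simple arithmetic, illustrated below. This is a sketch rather than the FPGA implementation: the function names and the round-to-nearest mapping onto DAC codes are assumptions, while the 8-bit depth and the range R = [−0.6, +0.6] rad come from the text.

```python
def dac_step(span_rad: float, bits: int) -> float:
    """Phase step of a DAC that spreads 2**bits codes over span_rad."""
    return span_rad / (2 ** bits)


def quantize_phase(phi: float, lo: float = -0.6, hi: float = 0.6, bits: int = 8) -> float:
    """Clamp phi to the capture range and round it to the nearest DAC level."""
    levels = 2 ** bits
    step = dac_step(hi - lo, bits)
    clamped = max(lo, min(hi, phi))
    code = min(levels - 1, int(round((clamped - lo) / step)))
    return lo + code * step


step_8bit = dac_step(1.2, 8)          # about 4.7 mrad, i.e. roughly 5 mrad
in_range = quantize_phase(0.1)        # nearest representable phase to 0.1 rad
out_of_range = quantize_phase(1.0)    # clamped to the top DAC level
```

An estimate outside the capture range is clamped to the nearest end of R, which is the behavior that limits tracking of large phase excursions in the results that follow.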
The absolute power of the input state is calibrated using a photodiode-based light-trapping (TRAP) detector with a 0.05% uncertainty tied to an absolute spectral response scale [62]. This TRAP detector was used to calibrate a series of attenuators to lower the power of a power-stabilized 633 nm laser to the single-photon level with a combined 1σ uncertainty of 1.8%, and the transmission of the optical elements from where the state is prepared to where it is detected with transmittance T = 92.5(2)%. This results in a total uncertainty for the calibration of the absolute average photon number per pulse of σ ≈ 2%. The FPGAs used for implementing the state discrimination strategy and phase tracking were both Altera Cyclone II FPGAs, model EP2C5T144C8 with 4608 logical elements, base clock of 48 MHz, and 158 digital I/O pins.

IV. RESULTS
We investigate the performance of the phase-tracking method under different scenarios. In the first scenario, the input state experiences a sudden constant phase offset and the phase tracking method needs to estimate and correct for large phase offsets. The second scenario aims to simulate a realistic channel inducing time-varying phase noise, where the input state experiences Gaussian random walks in phase at different diffusion rates. This scenario allows us to investigate phase-tracking of random phase drifts in the channel and the impact of tracking bandwidth on the performance of non-Gaussian receivers.
A. Phase tracking under constant phase offsets
Figure 3 shows the performance of the phase-tracking method under sudden constant phase offsets of φ_app = {±0.1, ±0.2, ±0.3, ±0.4, ±0.5} rad of the input state with mean photon number |α|² = 5.0. Figure 3(a) shows the probability of error P_E calculated for time bins of about 0.5 s. Figure 3(b) shows the phase estimates φ̂_est generated by the phase-tracking method as a function of time. Thick lines represent the average over 5 independent experimental runs, each time bin corresponds to about 5 × 10^3 (≈ f_expt × 0.5 s) independent experiments, and shaded regions represent one standard deviation. Green (blue) lines correspond to positive (negative) applied phase offsets φ_app. The phase-tracking method here uses N_avg = 20, so that each estimate φ̂_est was obtained at a phase-tracking rate f_PT = 1/T_PT ≈ 1/0.85 s [see Fig. 1(c)]. For this investigation, the relative phase of the signal and LO was locked between each experimental trial, similar to Ref. [53], and the phase offset φ_app was applied during each state discrimination measurement.
From t = 0 to 2 s, we verify that, with no applied phase offset (φ_app = 0), the receiver performs 3.4 dB below the heterodyne limit (red line). At t = 2 s, we apply a constant phase offset φ_app to the input state, producing sudden jumps in the probability of error P_E that depend on the value of φ_app. At t = 4 s, the phase-tracking method is turned on. After an estimation cycle T_PT ≈ 0.85 s, the phase tracking produces an estimate φ̂_est and corrects for φ_app, allowing the receiver to perform below the heterodyne limit for all phase offsets. Figure 3(b) shows the phase estimates φ̂_est as a function of time, demonstrating that the phase-tracking method accurately identifies and corrects for large phase offsets. We observe that this method enables the non-Gaussian receiver to keep its expected advantage of 3.4 dB over an ideal heterodyne measurement.
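The estimation cycle quoted above follows from the relation f_PT = f_expt/(500 N_avg) given in the caption of Fig. 1. The short sketch below evaluates it for the nominal repetition rate of 12 kHz; the function name is an assumption, and since the text only states f_expt ≈ 12 kHz, the computed period differs slightly from the quoted T_PT ≈ 0.85 s.

```python
def tracking_rate(f_expt_hz: float, n_avg: int, block: int = 500) -> float:
    """Phase-tracking rate f_PT = f_expt / (block * N_avg), in Hz."""
    return f_expt_hz / (block * n_avg)


rate = tracking_rate(12_000.0, n_avg=20)   # nominal f_expt, N_avg = 20
period = 1.0 / rate                        # estimation cycle T_PT in seconds
```

With these nominal numbers the rate is 1.2 Hz and the cycle is about 0.83 s, consistent with the reported value for a repetition rate slightly below 12 kHz.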
B. Phase tracking of random walks in phase

Phase tracking with different noise strengths
In coherent optical communications, the receiver is usually required to decode information encoded in coherent-state signals in the presence of time-dependent random variations in phase, which severely limit the receiver's ability to recover the information. We investigate the phase-tracking method for situations where the input state experiences Gaussian phase noise [28], which could include the effects of phase noise in the LO and the transmitter laser or random fluctuations arising from different processes [27,28,63-66]. Gaussian random walks in phase have been broadly used as an acceptable model for phase noise in optical communications and for phase drift between the sender and receiver in classical [5,27,67,68] and quantum communications [22,30]. We note that the algorithm used for phase tracking is independent of the choice of phase noise model, as it makes no assumptions about the dynamics of the noise and the noise model is not used to obtain the phase estimate (see Sec. II).
For this study, we do not stabilize the relative phase between the input signal state and the LO. Under these conditions, the receiver experiences the drift of the experimental setup and induced random walks in phase. This situation is analogous to having a LO whose phase is constantly drifting and a channel that induces phase noise. This experimental configuration aims to mimic more realistic situations where the signal and LO are generated from different lasers [28]. In this investigation, the phase noise in the signal is implemented by preparing discrete Gaussian random walks in phase with L=6500 steps, each distributed according to a zero-mean Gaussian distribution with standard deviation σ 1 [28].
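A discrete Gaussian random walk of this kind is straightforward to generate, as the following sketch shows. The function names and the seed are assumptions, and the step size σ_1 = 5 mrad is borrowed from the value quoted later for the study with different input powers, purely for illustration.

```python
import math
import random


def gaussian_phase_walk(steps: int, sigma1: float, seed=None):
    """Cumulative sum of zero-mean Gaussian phase steps with std sigma1 (rad)."""
    rng = random.Random(seed)
    phase, walk = 0.0, []
    for _ in range(steps):
        phase += rng.gauss(0.0, sigma1)
        walk.append(phase)
    return walk


def total_std(steps: int, sigma1: float) -> float:
    """Standard deviation after all steps: sigma_L = sqrt(L) * sigma1."""
    return math.sqrt(steps) * sigma1


walk = gaussian_phase_walk(steps=6500, sigma1=0.005, seed=1)
sigma_L = total_std(6500, 0.005)   # about 0.40 rad
```

Because the steps are independent, their variances add, so the spread of the walk after L steps is √L σ_1. This is the quantity that is compared with the capture range R to classify the noise as small, moderate, or large.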
FIG. 4: [...] with total variances σ_L² = Lσ_1² for each case. Here, L = 6500 corresponds to the total number of steps in the random walks, and R = [−0.6, 0.6] rad is the capture range of phase tracking in our experiment. Bold lines show the average and shaded regions the spread for these walks with (blue) and without (green) phase tracking. (d)-(f) Applied phases φ_app during Gaussian random walks and the phase estimates φ̂^ff_est = φ̂_est generated by the phase-tracking method for cases (a)-(c), respectively. The phase estimates φ̂^ff_est used to feed forward to the LO for phase correction are bounded by the experimental capture range R, as can be seen in (e) and (f).
We observe that, in general, Gaussian random walks in phase severely degrade the performance of the non-Gaussian receiver, precluding any advantage over the heterodyne limit. However, in situations with small (2σ_L < R) and moderate (2σ_L ≈ R) levels of noise in Figs. 4(a) and 4(b), respectively, the phase-tracking method accurately estimates and corrects for phase noise, enabling the receiver to maintain its performance 3.4 dB below the heterodyne limit. For small phase noise in Fig. 4(d), with 2σ_L < R, the applied phase φ_app is smaller than the phase drifts in the experiment. The estimated phase φ̂^ff_est captures the contributions of φ_app and of these drifts, and therefore shows a larger variance than φ_app. For situations with moderate phase noise, with 2σ_L ≈ R, the applied walks in phase φ_app include walks that exceed the capture range R at some point in time, as shown in Fig. 4(e). After this point, the estimated phases φ̂_est for these walks are clamped at |R| to generate φ̂^ff_est to feed forward to the LO. This procedure produces a slight increase in P_E after 50 s, as shown in Fig. 4(b). For situations with large phase noise in Fig. 4(c), for which 2σ_L ≫ R, the receiver's performance degrades above the heterodyne limit within a short time. The phase-tracking method can reduce the effects of phase noise. However, since the applied phases φ_app rapidly exceed the capture range R, the estimated phases φ̂^ff_est that are fed forward to the LO are clamped at R in many cases, as can be seen in Fig. 4(f). This procedure limits the performance of phase tracking in our current implementation. However, increasing the resolution of the electronic controller and of the DAC used to feed forward to the LO phase to 10 bits would allow for increasing the capture range to R = ±π rad while maintaining a good phase resolution of ≈ 6 mrad for phase tracking. In this case, whenever φ_app reaches the edge of this range, the estimate φ̂^ff_est would wrap around from ±π to ∓π.
This procedure would allow for tracking phase walks exceeding R and maintaining the receiver's performance below the heterodyne limit under any level of phase noise.
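The difference between the clamped feed-forward used in the present implementation and the proposed wrap-around can be illustrated as follows. Both helper functions are hypothetical and only show the arithmetic; the 10-bit resolution over a 2π range is the value proposed in the text.

```python
import math


def clamp_phase(phi: float, limit: float = 0.6) -> float:
    """Current behavior: estimates beyond the capture range saturate at +/-limit."""
    return max(-limit, min(limit, phi))


def wrap_phase(phi: float) -> float:
    """Proposed behavior: map any phase onto the interval [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi


step_10bit = 2.0 * math.pi / 2 ** 10       # about 6.1 mrad
saturated = clamp_phase(0.9)               # stays at the range edge
wrapped = wrap_phase(math.pi + 0.1)        # re-enters from the opposite edge
```

With clamping, any drift beyond ±0.6 rad leaves an uncorrected residual phase, whereas with a full 2π range the feed-forward phase remains equivalent to the applied phase modulo 2π.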

Phase tracking with different input powers
The performance of the phase-tracking method for non-Gaussian receivers critically depends on the performance of the state discrimination measurement. The information used for phase estimation and tracking, {d_j, δ_j}, assumes the answer θ_disc to the discrimination problem to be correct, which is true only with probability P_C = 1 − P_E. Since P_E depends strongly on the mean photon number |α|² of the input state [53], the receiver's ability to perform phase tracking will also depend on |α|². Larger input powers |α|² result in lower error probabilities P_E and can allow the phase-tracking method to perform phase estimation with higher accuracy, achieve higher tracking rates f_PT, and correct for phase noise with higher bandwidths f_RW. On the other hand, for low powers |α|² the performance of the phase-tracking method is degraded by the higher P_E. However, achieving phase tracking in these two power regimes is required for both low-power classical [6,27,28,63] and quantum [13,22,23] communications. Figure 5 shows the performance of the phase-tracking method for (a) |α|² = 10.0 and (b) |α|² = 2.0. Phase tracking in these high- and low-input-power regimes can be implemented at different rates f_PT to correct for noise with different strengths and bandwidths.

FIG. 5: [...] [53]. Note that the phase-tracking method enables the receiver to perform below the heterodyne limit under Gaussian phase noise in the high- and low-input-power regimes.

In the two plots in Fig. 5, the phase-tracking parameters and Gaussian phase noise are chosen to satisfy the condition 2σ_L = 2√L σ_1 = 2√(f_RW T) σ_1 ≈ R, so that these situations are analogous to the one shown in Fig. 4(b) for |α|² = 5.0. Here σ_1 = 5 mrad, and T is the displayed period: T = 13 s for |α|² = 10.0, and T = 120 s for |α|² = 2.0. For |α|² = 10.0, phase tracking achieves higher estimation accuracy and can be reliably implemented with N_avg = 4, enabling phase-tracking rates of f_PT ≈ 5.8 Hz, five times faster than for |α|² = 5.0. As a result, the receiver can track and correct for random Gaussian phase noise with a rate of f_RW = 500 Hz while performing below the heterodyne limit [see Fig. 5(a)]. Phase tracking for |α|² = 2.0 requires more samples to obtain accurate phase estimates, and can be implemented reliably with N_avg = 40 with a tracking rate f_PT ≈ 0.5 Hz. In this case, the receiver can track and correct for phase noise at a rate of f_RW = 50 Hz while performing below the heterodyne limit [see Fig. 5(b)].
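The noise condition used for Fig. 5 can be checked numerically from the quoted parameters. The sketch below assumes that the number of random-walk steps is L = f_RW T, with one step per noise update, and uses hypothetical function names.

```python
import math


def walk_steps(f_rw_hz: float, duration_s: float) -> int:
    """Number of random-walk steps L = f_RW * T."""
    return round(f_rw_hz * duration_s)


def two_sigma_total(f_rw_hz: float, duration_s: float, sigma1_rad: float) -> float:
    """Spread 2*sigma_L = 2*sqrt(L)*sigma_1 of the walk at the end of the period."""
    return 2.0 * math.sqrt(walk_steps(f_rw_hz, duration_s)) * sigma1_rad


high_power = two_sigma_total(500.0, 13.0, 0.005)   # |alpha|^2 = 10.0 case
low_power = two_sigma_total(50.0, 120.0, 0.005)    # |alpha|^2 = 2.0 case
```

Both cases give 2σ_L of roughly 0.8 rad (6500 and 6000 steps, respectively), which is comparable to the ±0.6 rad capture range and therefore corresponds to the moderate-noise regime.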

Phase tracking of noise with different bandwidths
The performance of the phase-tracking method depends strongly on the noise bandwidth present in the communication channel and on how it compares to the rate at which phase tracking can be implemented [28]. We have studied the performance of the phase-tracking method for non-Gaussian receivers for tracking random phase noise with different noise bandwidths. This study is described in Appendix E. We observe that, for a fixed f_PT, the phase-tracking method can correct for noise with different bandwidths f_RW. We note, however, that f_PT has to be high enough to keep the receiver's performance below the heterodyne limit for extended periods of time. As one example, we observe that for a non-Gaussian receiver with |α|² = 5.0 and f_PT = 1.2 Hz, reliable phase tracking of random noise can be performed for noise bandwidths f_RW = 100 Hz, and sub-QNL sensitivity can be kept for f_RW = 500 Hz for ≈ 10 s. Tracking noise with higher bandwidths can be achieved with higher experimental rates f_expt > 11 kHz to increase f_PT or with larger mean photon numbers |α|² to generate more accurate phase estimates φ̂_est.

V. DISCUSSION
Receivers with sensitivities surpassing the QNL of ideal conventional receivers have a large potential for enabling efficient and reliable low-power communications at the single- and few-photon levels. The phase-tracking method demonstrated here for non-Gaussian receivers allows for tracking random phase variations and noise with different strengths and bandwidths. This method provides the much-needed robustness to enable non-Gaussian receivers to perform below the QNL in channels with phase noise for a wide range of powers. We note that the phase tracking in our proof-of-principle demonstration was implemented at low rates because of experimental constraints, and used a single laser shared between transmitter and receiver. However, using an estimator with higher estimation accuracy, such as the Bayesian estimator, combined with high-bandwidth electronics and efficient single-photon detectors [69], would allow the receiver's measurement and phase tracking to be realized at much higher bandwidths. This in turn would enable non-Gaussian receivers to overcome realistic noise in communication channels with independent lasers at the receiver and transmitter [6], while outperforming ideal shot-noise-limited coherent receivers [5].
We note that transmissions of high-power pulses interleaved with the input states could be used with heterodyne detection for phase tracking [20,23], without relying on knowledge of the power of the input coherent states to be discriminated and the visibility of the interference with the LO. The phase-tracking method presented here uses only the data directly collected from the non-Gaussian measurement, which assumes a known intensity and visibility. However, the data from the state discrimination strategy could in principle be used to estimate the input intensity and visibility in addition to the phase offset, and allow for tracking of multiple time-varying parameters without the need for dedicated light pulses for estimation and tracking. We also note that it may be possible to split the power of the input state and use a fraction of the light to perform phase estimation with a heterodyne measurement. However, the estimation precision of these split-and-estimate methods for phase tracking will depend on the fraction of power used for phase estimation, and there will be an increase in the probability of error in the state discrimination due to the reduced power entering the sub-shot-noise receiver.
In the future, enabling coherent communication technologies that can approach the quantum limits in sensitivity and information transfer in realistic channels at low powers will require the ability to track other impairments in the channel including polarization rotation, background noise, and power variations. While we demonstrated a method for tracking to correct phase drifts induced by a channel, we believe that the data from the state discrimination measurement that are used for phase-tracking can be leveraged for estimation and tracking of other sources of noise in the channel, such as amplitude noise (see Appendix E). Moreover, we anticipate that this technique for phase-tracking can be applied to optimized non-Gaussian measurements surpassing the QNL in the single-photon regime [55]. This possibility can enable phase-tracking in quantum key distribution for secure communications at very low powers without requiring strong phase reference pilot pulses [22].

VI. CONCLUSIONS
We demonstrate a phase-tracking method for non-Gaussian receivers [53] for phase-encoded coherent states surpassing the sensitivity limits of shot-noise-limited coherent receivers: the quantum noise limit (QNL) [5]. The phase-tracking method performs phase estimation and correction in real time using the data from the non-Gaussian discrimination measurement [53], without continuously relying on phase reference pilot fields from the transmitter. Our experimental demonstration shows that the phase-tracking method provides non-Gaussian receivers with the required robustness to overcome random phase noise encountered in realistic communication channels, and enables the receiver to perform measurements beyond the QNL under diverse conditions with different noise strengths and bandwidths. Moreover, since the phase-tracking method uses the data from a measurement surpassing the QNL at very low power levels, this method is well suited for assisting quantum communication protocols based on weak coherent states for efficient [10,70] and secure [8,9,11-16] communications. Our demonstration of phase-tracking for non-Gaussian receivers makes sub-shot-noise-limited receivers a more robust, feasible, and practical quantum technology for low-power communications based on coherent states for approaching the quantum limits in realistic communication channels.
The quantum noise limit (QNL) for the discrimination of coherent states in a given encoding scheme is obtained from the probability of error in discrimination:

$$P_E = 1 - \sum_{k} P(\alpha_k)\,P(\alpha_k|\alpha_k), \tag{A1}$$

where the sum runs over the $M$ states in the alphabet, and $P(\alpha_k)$ is the prior probability of state $|\alpha_k\rangle$, which is equal to $1/M$ for equiprobable states. $P(\alpha_k|\alpha_k)$ is the probability of guessing state $|\alpha_k\rangle$ given that state $|\alpha_k\rangle$ was sent, i.e., the probability of correct discrimination.
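For the discrimination of two coherent states in the binary phase-shift keying (BPSK) format, $|\alpha_k\rangle \in \{|{\pm}\alpha\rangle\}$, the homodyne measurement along the x quadrature is the optimal Gaussian measurement [71]. The probability of error for the homodyne measurement corresponds to the QNL for the BPSK alphabet. $P(\alpha_k|\alpha_k)$ for a homodyne measurement is [72]:

$$P(\alpha_k|\alpha_k) = \int_R |\langle x|\alpha_k\rangle|^2\,dx = \frac{1}{2}\left[1 + \mathrm{erf}\left(\sqrt{2}\,|\alpha|\right)\right], \tag{A2}$$

where $R$ is the region where $|\alpha_k\rangle$ is the most likely state, and $\mathrm{erf}(y)$ is the error function. Using Eq. (A1), the total probability of error is:

$$P_E^{\mathrm{Hom}} = \frac{1}{2}\left[1 - \mathrm{erf}\left(\sqrt{2}\,|\alpha|\right)\right]. \tag{A3}$$

For QPSK states $|\alpha_k\rangle \in \{|\alpha e^{ik\pi/2}\rangle\}$, where $k \in \{0, 1, 2, 3\}$, the QNL corresponds to the probability of error of an ideal heterodyne measurement [34], which performs a projection onto coherent states and measures both quadratures of the input state simultaneously. The probability of correct discrimination of state $|\alpha_k\rangle$ is given by [72]:

$$P(\alpha_k|\alpha_k) = \frac{1}{\pi}\int_R |\langle \beta|\alpha_k\rangle|^2\,d^2\beta = \frac{1}{4}\left[1 + \mathrm{erf}\left(\frac{|\alpha|}{\sqrt{2}}\right)\right]^2. \tag{A4}$$

Then, the QNL for QPSK states is [5,34]:

$$P_E^{\mathrm{Het}} = 1 - \frac{1}{4}\left[1 + \mathrm{erf}\left(\frac{|\alpha|}{\sqrt{2}}\right)\right]^2. \tag{A5}$$

While the homodyne measurement is known to be the optimal Gaussian measurement for the discrimination of two coherent states, the ultimate Gaussian limit for coherent multistate discrimination is not known. Therefore, there may be strategies based on Gaussian operations and measurements [34] that provide advantages over the heterodyne measurement [37].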
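As a numerical illustration, the two limits can be evaluated with a few lines of Python. This is a minimal sketch that assumes the standard closed forms for ideal homodyne detection of BPSK states and ideal heterodyne detection of QPSK states with unit efficiency. The function names `qnl_bpsk` and `qnl_qpsk` are hypothetical and introduced only for this example.

```python
import math


def qnl_bpsk(mean_photon_number: float) -> float:
    """Error probability of ideal homodyne detection for BPSK states |+a>, |-a>.

    Assumes the closed form P_E = (1/2) * [1 - erf(sqrt(2) * |a|)].
    """
    amplitude = math.sqrt(mean_photon_number)
    return 0.5 * (1.0 - math.erf(math.sqrt(2.0) * amplitude))


def qnl_qpsk(mean_photon_number: float) -> float:
    """Error probability of ideal heterodyne detection for QPSK states.

    Assumes P_E = 1 - (1/4) * [1 + erf(|a| / sqrt(2))]^2.
    """
    amplitude = math.sqrt(mean_photon_number)
    p_correct = 0.25 * (1.0 + math.erf(amplitude / math.sqrt(2.0))) ** 2
    return 1.0 - p_correct


if __name__ == "__main__":
    for n in (2.0, 5.0, 10.0):
        print(f"|alpha|^2 = {n}: BPSK {qnl_bpsk(n):.3e}, QPSK {qnl_qpsk(n):.3e}")
```

With zero input power both expressions reduce to random guessing (1/2 for two states, 3/4 for four states), which is a quick sanity check of the formulas.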

Appendix B: State discrimination strategy
The phase-tracking method builds on the adaptive measurement strategy for QPSK states with PNR detection described in detail in Ref. [53]. In this strategy, the receiver performs $N = 7$ adaptive measurements on the input state $|\alpha_k\rangle \in \{|\alpha\rangle, |i\alpha\rangle, |{-}\alpha\rangle, |{-}i\alpha\rangle\}$. In each adaptive measurement $j$ ($j = 1, 2, ..., N$), the LO is prepared in a state hypothesis $|\beta_j\rangle$ and displaces the input state $|\alpha_k\rangle$ to $\hat{D}(-\beta_j)|\alpha_k\rangle$. Note that for a correct hypothesis $\beta_j = \alpha_k$, the input state $|\alpha_k\rangle$ is displaced to the vacuum state $|0\rangle$. The displaced state $\hat{D}(-\beta_j)|\alpha_k\rangle$ is then detected with a PNR detector with number resolution $m$, ideally described by the operators $\hat{\Pi}_n = |n\rangle\langle n|$ for $n = 0, 1, ..., m-1$ and $\hat{\Pi}_m = \hat{I} - \sum_{i=0}^{m-1}\hat{\Pi}_i$. The strategy uses a maximum a posteriori probability (MAP) criterion and recursive Bayesian updating [53]. Given a photon number detection $d_j$ and the hypothesis $\beta_j$ in adaptive measurement $j$, the strategy estimates the posterior Bayesian probabilities for the input states and the most likely state. In subsequent adaptive measurements, the LO tests this most likely state, and prior probabilities are updated according to Bayes' theorem. Recursive application of this method during all adaptive measurements results in a final estimate $\theta_{\mathrm{disc}}$ of the input state, which corresponds to the most likely state at the end of the last adaptive measurement $N$, $\beta_{N+1}$. This most likely state is the answer to the state discrimination problem, and the discrimination strategy allows for surpassing the QNL. After a discrimination measurement, the data collected during the $N$ adaptive measurements consist of $N$ photon counting detections $\{d_1, d_2, ..., d_N\}$, together with the phases of the most likely states $\beta_j$ in each adaptive measurement, $\{\theta_1, \theta_2, ..., \theta_N\}$.
Assuming that the answer to the state discrimination problem is correct, the phase $\theta_{\mathrm{disc}} = \arg\{\beta_{N+1}\}$ then corresponds to the phase of the input state, so that $\delta_j = \theta_j - \theta_{\mathrm{disc}}$ are the relative phases of the input state and the LO during each adaptive measurement. The pairs $\{d_j, \delta_j\}$ correspond to samples of phase space that can be used to estimate the phase offset caused by the channel for performing phase-tracking.
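The recursive Bayesian updating and the construction of the pairs $\{d_j, \delta_j\}$ can be sketched in a short simulation. This is an illustrative sketch, not the implementation of Ref. [53]. It assumes that each adaptive measurement receives a fraction $1/N$ of the input power, that the displaced state has mean photon number $(2\eta|\alpha|^2/N)[1 - \xi\cos\delta] + \nu$, that counts are Poissonian and saturate at the resolution $m$, and that the first hypothesis is state $k = 0$. All function names are hypothetical.

```python
import math
import random

PHASES = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]  # QPSK phases, k = 0..3


def mean_photons(alpha2, n_steps, delta, eta=1.0, xi=1.0, nu=0.0):
    """Assumed mean photon number after displacement with relative phase delta."""
    return 2.0 * eta * alpha2 / n_steps * (1.0 - xi * math.cos(delta)) + nu


def pnr_likelihood(d, mean, m):
    """Probability of outcome d for a Poisson mean with a PNR(m) detector."""
    def pmf(n):
        return math.exp(-mean) * mean ** n / math.factorial(n)
    if d < m:
        return pmf(d)
    return 1.0 - sum(pmf(n) for n in range(m))  # outcome "m or more"


def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def discriminate(true_k, alpha2, n_steps=7, m=3, rng=None, **det):
    """Return (decision index, list of (d_j, delta_j)) for one input state."""
    rng = rng or random.Random()
    posterior = [0.25] * 4
    hyp = 0  # assumed initial hypothesis
    record = []
    for _ in range(n_steps):
        mu = mean_photons(alpha2, n_steps, PHASES[true_k] - PHASES[hyp], **det)
        d = min(sample_poisson(mu, rng), m)
        # Bayes' theorem: weight each candidate state by its likelihood.
        weights = [
            posterior[s] * pnr_likelihood(
                d, mean_photons(alpha2, n_steps, PHASES[s] - PHASES[hyp], **det), m)
            for s in range(4)
        ]
        total = sum(weights)
        posterior = [w / total for w in weights]
        record.append((d, hyp))
        hyp = max(range(4), key=lambda s: posterior[s])  # MAP state, tested next
    decision = hyp
    pairs = [(d, (PHASES[h] - PHASES[decision]) % (2 * math.pi)) for d, h in record]
    return decision, pairs
```

Calling `discriminate(2, 5.0, rng=random.Random(0))` returns the decision and the seven pairs $(d_j, \delta_j)$ that a phase estimator would accumulate over many channel transmissions.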

Appendix C: Phase estimator and performance
The method for phase-tracking for non-Gaussian receivers uses the data collected from the state discrimination measurement, consisting of the pairs $\{d_j, \delta_j\}$, to estimate and correct for the relative phase between the input state and the local oscillator (LO) in real time. This method works in conjunction with the state discrimination strategy and requires no extra resources such as strong phase reference pilot pulses or additional measurements for phase estimation. For the adaptive non-Gaussian discrimination measurement in Ref. [53] with photon number resolution (PNR) of 3, PNR(3), the receiver samples four photon number distributions $P_0(n_k|\delta = 0)$, $P_{\pi/2}(n_k|\delta = \pi/2)$, $P_{\pi}(n_k|\delta = \pi)$, and $P_{3\pi/2}(n_k|\delta = 3\pi/2)$ for $\delta \in \{0, \pi/2, \pi, 3\pi/2\}$. These photon number distributions can be used to obtain different estimators for the phase offset $\phi_{\mathrm{off}}$ (or the applied phase $\phi_{\mathrm{app}}$ in the experimental investigation described in the main manuscript). We note that the photon number distributions $P_\delta(n_k|\delta)$ can represent the rows of a 4 × 4 matrix. In general, these distributions can be arranged as rows of a (PNR+1)×M detection matrix for M-ary shift-keyed states and PNR of the measurement. Below, we describe two estimators: one based on the differences of the mean photon numbers $\langle n\rangle_\delta$ of these distributions, referred to as the "sine-cosine estimator", which is implemented in our demonstration, and one that is a Bayesian estimator.

Sine-cosine estimator
The sine-cosine estimator, as implemented in the experimental demonstration described in the main text, uses the differences of the averages of detected photon numbers $\langle n\rangle_\delta$ to obtain a final estimate $\hat{\phi}_{\mathrm{est}}$ of the phase offset $\phi_{\mathrm{off}}$ (or the applied phase $\phi_{\mathrm{app}}$ in the actual experiments) based on the collected data from $N_{\mathrm{avg}} \times 500$ channel transmissions.
As a first step, the estimator obtains two initial estimates, $\hat{\phi}_c$ and $\hat{\phi}_s$. These estimates are obtained from the photon number distributions $P_\delta(n_k|\delta)$ of the observed data from the state discrimination measurement for the relative phases between the input state and LO, $\delta \in \{0, \pi/2, \pi, 3\pi/2\}$ (see Sec. II of the manuscript). For a phase offset $\phi_{\mathrm{off}}$, interference visibility $\xi$, dark count rate $\nu$, and $N = 7$, the mean photon numbers $\langle n\rangle_\delta$ of the distributions $P_\delta(n_k|\delta)$ are given by Eq. (C1). Combining the expressions for $\langle n\rangle_0$ and $\langle n\rangle_\pi$ in Eq. (C1), we can obtain samples of the quantity $\cos(\phi_{\mathrm{off}})$ in terms of $\langle n\rangle$, $\eta$, $|\alpha|^2$, $\xi$, $\nu$, and $N$. In a similar way, samples of $\sin(\phi_{\mathrm{off}})$ can be obtained from the expressions for $\langle n\rangle_{\pi/2}$ and $\langle n\rangle_{3\pi/2}$. These samples can be used to estimate the expected values from the average over 500 channel transmissions. For a large number of data samples, we expect the average of $\cos(\phi_{\mathrm{off}})$ over these channel transmissions to approach the cosine of the average $\bar{\phi}_{\mathrm{off}}$ of the actual phase offset $\phi_{\mathrm{off}}$ over these channel transmissions, $\cos(\bar{\phi}_{\mathrm{off}})$. We define this average $\bar{\phi}_{\mathrm{off}}$ obtained from the cosine function as the estimate $\hat{\phi}_c$. Similarly, the estimate $\hat{\phi}_s$ is obtained from the samples of $\sin(\phi_{\mathrm{off}})$. These estimates $\hat{\phi}_c$ and $\hat{\phi}_s$ can be expressed in terms of the estimates of the mean photon numbers $\langle n\rangle_\delta$ for 500 channel transmissions:

$$\hat{\phi}_c = \arccos\left[\frac{\langle n\rangle_\pi - \langle n\rangle_0}{C(|\alpha|^2)}\right] \tag{C2}$$

and

$$\langle n\rangle_{3\pi/2} - \langle n\rangle_{\pi/2} = C(|\alpha|^2)\,\sin(\hat{\phi}_s), \tag{C3}$$

where the normalization $C(|\alpha|^2)$ includes $f(|\alpha|^2)$, a factor arising from the non-zero probability of error of the discrimination strategy. Here, $N = 7$ is the number of adaptive measurements, $\eta$ is the detection efficiency, and $\xi$ is the visibility of the displacement operation by interference.
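The inversion from mean photon numbers to the two initial estimates can be illustrated with a short sketch. It assumes a simple displacement model in which the mean photon number per adaptive measurement is $(2\eta|\alpha|^2/N)[1 - \xi\cos(\delta - \phi_{\mathrm{off}})] + \nu$, so that $\langle n\rangle_\pi - \langle n\rangle_0 = (4\eta\xi|\alpha|^2/N)\cos\phi_{\mathrm{off}}$. It also assumes that the factor `f` enters the normalization multiplicatively. Both are assumptions made for this example, and the function names are hypothetical.

```python
import math


def mean_photons_with_offset(delta, phi_off, alpha2, n_steps=7,
                             eta=1.0, xi=1.0, nu=0.0):
    """Assumed mean photon number for LO relative phase delta and offset phi_off."""
    return (2.0 * eta * alpha2 / n_steps
            * (1.0 - xi * math.cos(delta - phi_off)) + nu)


def sine_cosine_estimates(n_0, n_half_pi, n_pi, n_three_half_pi,
                          alpha2, n_steps=7, eta=1.0, xi=1.0, f=1.0):
    """Return (phi_c, phi_s) from the four mean photon numbers.

    The normalization c = 4 * eta * xi * alpha2 * f / n_steps is an assumption;
    with f = 1 it equals the error-free difference at zero phase offset.
    """
    c = 4.0 * eta * xi * alpha2 * f / n_steps

    def clip(x):
        # Guard against statistical fluctuations pushing the ratio outside [-1, 1].
        return max(-1.0, min(1.0, x))

    phi_c = math.acos(clip((n_pi - n_0) / c))
    phi_s = math.asin(clip((n_three_half_pi - n_half_pi) / c))
    return phi_c, phi_s


if __name__ == "__main__":
    for phi in (0.3, -0.3):
        means = [mean_photons_with_offset(k * math.pi / 2, phi, 5.0, xi=0.99, nu=1e-3)
                 for k in range(4)]
        print(phi, sine_cosine_estimates(*means, alpha2=5.0, xi=0.99))
```

In this model the dark counts cancel in both differences, the cosine estimate returns the magnitude of the offset for either sign, and the sine estimate carries the sign.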
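As a second step, the two initial estimates $(\hat{\phi}_c, \hat{\phi}_s)$ are combined in a weighted average, with weight factor $r(|\alpha|^2)$, to form a phase estimate $\hat{\phi}_i$ every 500 pulses. The weight factor $r(|\alpha|^2)$ is used to increase the linearity of the final estimate $\hat{\phi}_{\mathrm{est}}$ as a function of $\phi_{\mathrm{off}}$, while reducing its variance near the edges of the capture range, which in our experiment is $R = \pm 0.6$ rad. As a final step, the final estimate $\hat{\phi}_{\mathrm{est}}$ is obtained from the average of $N_{\mathrm{avg}}$ estimates $\hat{\phi}_i$ with a gain factor $g(\langle\hat{\phi}_i\rangle)$:

$$\hat{\phi}_{\mathrm{est}} = g(\langle\hat{\phi}_i\rangle)\,\langle\hat{\phi}_i\rangle, \qquad \langle\hat{\phi}_i\rangle = \frac{1}{N_{\mathrm{avg}}}\sum_{i=1}^{N_{\mathrm{avg}}}\hat{\phi}_i. \tag{C6}$$

The gain factor $g(\langle\hat{\phi}_i\rangle)$ depends on the average of the $N_{\mathrm{avg}}$ estimates $\{\hat{\phi}\} = \{\hat{\phi}_1, \hat{\phi}_2, ..., \hat{\phi}_{N_{\mathrm{avg}}}\}$ and is used to further increase the linearity with respect to the actual phase offset $\phi_{\mathrm{off}}$, as described below.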
To obtain the final phase estimate $\hat{\phi}_{\mathrm{est}}$ of $\phi_{\mathrm{off}}$, the estimator aims to find the optimal values for the factors $r(|\alpha|^2)$, $f(|\alpha|^2)$, and $g(\langle\hat{\phi}_i\rangle)$, which depend on the input mean photon number $|\alpha|^2$, the estimates $\hat{\phi}_i$, and the experimental detection efficiency $\eta$, visibility $\xi$, and dark counts. We find the optimal values of $r(|\alpha|^2)$, $f(|\alpha|^2)$, and $g(\langle\hat{\phi}_i\rangle)$ using numerical approaches based on Monte Carlo simulations of the experiment with the following steps:

1.-Find the optimal value of $r(|\alpha|^2)$, $r_{\mathrm{opt}}(|\alpha|^2)$, that minimizes the difference $|\hat{\phi}_i - \phi_{\mathrm{off}}|$ at the extreme points of the capture range $R = \pm 0.6$ rad.
This procedure reduces the bias of the final estimate $\hat{\phi}_{\mathrm{est}}$ with respect to the true value of the phase offset $\phi_{\mathrm{off}}$. The estimates $\hat{\phi}_c$ and $\hat{\phi}_s$ are initially biased. $\hat{\phi}_s$ is biased towards zero phase as the magnitude of $\phi_{\mathrm{off}}$ increases. $\hat{\phi}_c$ cannot provide information about the sign of the phase offset $\phi_{\mathrm{off}}$, and biases the estimates towards positive values. These biases result in a bias of the combined estimator $\hat{\phi}_i$, and $r(|\alpha|^2)$ aims to reduce this bias. The final estimate $\hat{\phi}_{\mathrm{est}}$ is also biased for large values of the phase offset, as can be seen in Fig. 6, but unbiased for phase offsets near zero. The optimization of the gain function $g(\langle\hat{\phi}_i\rangle)$ allows for minimizing the overall bias of $\hat{\phi}_{\mathrm{est}}$, which can be mostly suppressed for $|\alpha|^2 \geq 5$.

Step 1.-Optimal value of $r(|\alpha|^2)$
The parameter $r(|\alpha|^2)$ is a weight factor for the contributions of the initial estimates $\hat{\phi}_c$ and $\hat{\phi}_s$ to the estimate $\hat{\phi}_i$, and its optimal value is chosen to reduce the variance of the final estimate $\hat{\phi}_{\mathrm{est}}$ near the end points of the experimental capture range $R = \pm 0.6$ rad. The data $\{d_j, \delta_j\}$ collected during the discrimination measurement provide samples for phase estimation that mostly have $\delta_j = 0$. This is because the discrimination strategy is based on hypothesis testing by displacements to the vacuum state [53], and the displaced state spends most of the time in the vacuum state. This means that the distribution $P_0(n_k|\delta = 0)$ is populated at a higher rate than $P_{\pi/2}$, $P_\pi$, and $P_{3\pi/2}$, and provides more data for the initial estimate $\hat{\phi}_c$ than for $\hat{\phi}_s$ [see Eqs. (C2) and (C3)]. As a result, $\hat{\phi}_c$ gives a much better estimate with smaller variance. However, $\hat{\phi}_c$ does not give any sign information about the applied phase $\phi_{\mathrm{off}}$, and is less sensitive to small phase offsets around $\phi_{\mathrm{off}} = 0$. In contrast, $\hat{\phi}_s$ is more sensitive to small phase offsets and contains the sign information of $\phi_{\mathrm{off}}$, but is a worse estimate with a much larger variance.
The optimal value of $r(|\alpha|^2)$ seeks to balance the contribution of these two initial estimates to minimize $|\hat{\phi}_i - \phi_{\mathrm{off}}|$ at $R = \pm 0.6$ rad. This optimization has the overall effect of reducing the variance of the final estimate $\hat{\phi}_{\mathrm{est}}$ at these points, where this estimator shows the greatest variance.
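To find the optimal value of $r(|\alpha|^2)$ [$r_{\mathrm{opt}}(|\alpha|^2)$] we fix the mean photon number $|\alpha|^2$ and $N_{\mathrm{avg}}$. We use Monte Carlo simulations to obtain the weighted average $\hat{\phi}_i$ with $f(|\alpha|^2) = 1$ as a function of $\phi_{\mathrm{off}}$ for different values of $r(|\alpha|^2)$. We then obtain a final average for the phase estimate, $\langle\hat{\phi}_i\rangle = \sum_i \hat{\phi}_i / N_{\mathrm{avg}}$, after $N_{\mathrm{avg}}$ realizations for applied phases $\phi_{\mathrm{off}}$ within the range $R = \pm 0.6$ rad. We observe that $\langle\hat{\phi}_i\rangle$ is in general a non-linear function of $\phi_{\mathrm{off}}$. The optimal $r_{\mathrm{opt}}(|\alpha|^2)$ is obtained by finding the value of $r(|\alpha|^2)$ that minimizes the average of $|\langle\hat{\phi}_i\rangle - \phi_{\mathrm{off}}|$ at $\phi_{\mathrm{off}} = \pm 0.6$ rad. This condition increases the linearity of the final phase estimate $\hat{\phi}_{\mathrm{est}}$ with respect to the applied phase $\phi_{\mathrm{off}}$ and reduces its variance.
We note that $\hat{\phi}_{\mathrm{est}}$ is obtained by multiplying $\langle\hat{\phi}_i\rangle$ by a gain function $g(\langle\hat{\phi}_i\rangle)$. As described in Step 3, $g(\langle\hat{\phi}_i\rangle)$ is obtained by inverting the relation of $\langle\hat{\phi}_i\rangle$ as a function of $\phi_{\mathrm{off}}$. Minimizing $|\langle\hat{\phi}_i\rangle - \phi_{\mathrm{off}}|$ at $\phi_{\mathrm{off}} = \pm 0.6$ rad prevents a large value of the gain $g(\langle\hat{\phi}_i\rangle)$ at these points, which would greatly increase the variance of the final estimate $\hat{\phi}_{\mathrm{est}}$. Therefore, determining the $r_{\mathrm{opt}}(|\alpha|^2)$ that minimizes $|\langle\hat{\phi}_i\rangle - \phi_{\mathrm{off}}|$ at $\pm 0.6$ rad results in a gain $g(\langle\hat{\phi}_i\rangle) \approx 1$, reducing the variance of the final estimate $\hat{\phi}_{\mathrm{est}}$.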
Step 2.-Optimal value of $f(|\alpha|^2)$
We note that the distribution $P_\pi(n_k|\delta = \pi)$ can only be incorrectly populated with samples from the other three distributions $P_0$, $P_{\pi/2}$, and $P_{3\pi/2}$, which have smaller mean photon numbers. As a result, any discrimination error causes the estimated $\langle n\rangle_\pi$ to be smaller than the true value on average. Similarly, the distribution $P_0(n_k|\delta = 0)$ can only be incorrectly populated with samples from the distributions $P_{\pi/2}$, $P_\pi$, and $P_{3\pi/2}$, which have larger mean photon numbers. Then, $P_E \neq 0$ causes the estimated $\langle n\rangle_0$ to increase on average. The overall effect of having $P_E \neq 0$ is to reduce the difference $\langle n\rangle_\pi - \langle n\rangle_0$, such that $\langle n\rangle_\pi - \langle n\rangle_0 < 4|\alpha|^2\eta\xi/N$. As a result, the estimate $\hat{\phi}_c$ in Eq. (C2) with $f(|\alpha|^2) = 1$ will have a non-zero value when $\phi_{\mathrm{off}} = 0$ that depends on the probability of error $P_E$.
The effect of having $P_E \neq 0$ can be reduced by finding the value of the parameter $f(|\alpha|^2)$ that makes the estimates $\hat{\phi}_c$ and $\langle\hat{\phi}_i\rangle$ close to zero when $\phi_{\mathrm{off}} = 0$. The procedure to find the optimal value $f_{\mathrm{opt}}(|\alpha|^2)$ consists of using Monte Carlo simulations for different values of $f(|\alpha|^2)$ with the $r_{\mathrm{opt}}(|\alpha|^2)$ found in Step 1. The optimal value $f_{\mathrm{opt}}(|\alpha|^2)$ is the one that minimizes the $\chi^2$ between $\langle\hat{\phi}_i\rangle$ and $\phi_{\mathrm{off}}$, where a linear dependence is expected. To verify this optimal value we use $\approx 10^6$ Monte Carlo runs with $\phi_{\mathrm{off}} = 0$ to obtain the expected difference $E[\langle n\rangle_\pi - \langle n\rangle_0\,|\,\phi_{\mathrm{off}} = 0]$. The value $f_{\mathrm{opt}}(|\alpha|^2)$ should then be approximately set by the ratio of this expected difference to its error-free value $4|\alpha|^2\eta\xi/N$. Table (S1) shows examples of the optimal values of $r(|\alpha|^2)$ and $f(|\alpha|^2)$ for different mean photon numbers of the input state $|\alpha|^2$ and $N_{\mathrm{avg}}$.
Step 3.-Gain function $g(\langle\hat{\phi}_i\rangle)$
The final estimate $\hat{\phi}_{\mathrm{est}}$ is obtained from the product of the average of $N_{\mathrm{avg}}$ estimates $\hat{\phi}_i$ and a gain factor $g(\langle\hat{\phi}_i\rangle)$, as shown in Eq. (C6). The gain function $g(\langle\hat{\phi}_i\rangle)$ is solely a function of the average phase estimate $\langle\hat{\phi}_i\rangle$, and ideally maps $\langle\hat{\phi}_i\rangle$ onto the phase offset $\phi_{\mathrm{off}}$ with a linear dependence with unit slope. We obtain the gain function $g(\langle\hat{\phi}_i\rangle)$ by using Monte Carlo simulations with the optimal values $r_{\mathrm{opt}}(|\alpha|^2)$ and $f_{\mathrm{opt}}(|\alpha|^2)$ to obtain the dependence of the quantity $\phi_{\mathrm{off}}/\langle\hat{\phi}_i\rangle$ as a function of $\langle\hat{\phi}_i\rangle$. The quantity $\phi_{\mathrm{off}}/\langle\hat{\phi}_i\rangle$ shows in general a nonlinear dependence on $\langle\hat{\phi}_i\rangle$, and this dependence corresponds to $g(\langle\hat{\phi}_i\rangle)$. We fit the quantity $\phi_{\mathrm{off}}/\langle\hat{\phi}_i\rangle$ using a smoothing spline, and this spline is defined as the gain function $g(\langle\hat{\phi}_i\rangle)$.
The gain function $g(\langle\hat{\phi}_i\rangle)$ obtained in this way linearizes the phase estimator with respect to known applied phase offsets $\phi_{\mathrm{off}}$. Figure 6 shows $\hat{\phi}_{\mathrm{est}}$ for input mean photon numbers $|\alpha|^2 = 2.0$, 5.0, and 10.0 with optimal values of $g(\langle\hat{\phi}_i\rangle)$ ($\neq 1$) in green, and with $g(\langle\hat{\phi}_i\rangle) = 1$ in orange. In our experimental demonstration, the method for phase-tracking is set to generate estimates every $500 \times N_{\mathrm{avg}}$ transmissions through the channel, and subsequently applies a phase correction to the local oscillator every $500 \times N_{\mathrm{avg}}$ shots of the experiment, which allows phase-tracking and phase correction at a rate of $f_{\mathrm{PT}} \approx 23/N_{\mathrm{avg}}$ Hz.
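The role of the gain function in the final estimate can be sketched as follows. This is a simplified illustration, not the procedure used in the experiment: it replaces the smoothing spline with piecewise-linear interpolation of calibration data, and the calibration pairs would come from Monte Carlo simulations that are not reproduced here. The names `make_gain` and `final_estimate` are hypothetical.

```python
from bisect import bisect_left


def make_gain(avg_estimates, true_offsets):
    """Build g(<phi_i>) = phi_off / <phi_i> from calibration pairs.

    Piecewise-linear interpolation stands in for the smoothing spline.
    Points with <phi_i> = 0 are skipped because the ratio is undefined there.
    Calibration abscissas are assumed to be distinct.
    """
    points = sorted((x, t / x) for x, t in zip(avg_estimates, true_offsets) if x != 0.0)
    xs = [p[0] for p in points]
    gs = [p[1] for p in points]

    def gain(x):
        if x <= xs[0]:
            return gs[0]
        if x >= xs[-1]:
            return gs[-1]
        i = bisect_left(xs, x)
        w = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
        return gs[i - 1] + w * (gs[i] - gs[i - 1])

    return gain


def final_estimate(estimates, gain):
    """Average the N_avg estimates phi_i and apply the gain to the average."""
    avg = sum(estimates) / len(estimates)
    return gain(avg) * avg


if __name__ == "__main__":
    # Hypothetical calibration in which the averaged estimate is compressed by 20%.
    offsets = [-0.6, -0.3, 0.3, 0.6]
    averages = [0.8 * t for t in offsets]
    g = make_gain(averages, offsets)
    print(final_estimate([0.38, 0.42, 0.40], g))
```

A gain close to one leaves the variance of the averaged estimate essentially unchanged, which is the motivation given above for choosing $r_{\mathrm{opt}}$ before fitting the gain.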

Bayesian estimator
A second possible estimator of $\phi_{\mathrm{off}}$ based on the collected data $\{d_j, \delta_j\}$ from the state discrimination measurement is the Bayesian estimator. For a Bayes estimator, the photon number distributions $P_\delta(n_k|\delta) = P(n)$ are converted into distributions over the phase $P(\phi|n)$ through Bayes' theorem,

$$P(\phi|n) = \frac{L(n|\phi)\,P(\phi)}{P(n)},$$

where $L(n|\phi)$ are likelihood functions and $P(\phi)$ is a prior phase distribution. Given the collected data from the state discrimination measurement $\{\mathrm{data}\} = \{d_j, \delta_j\}$ and assuming some prior distribution $P(\phi)$, a phase estimate can be obtained by forming the posterior probability distribution over phase given by:

$$P(\phi|\{\mathrm{data}\}) = \mathcal{N}\,P(\phi)\prod_{n,m} L(n|\phi - \delta_m)^{N_{n,m}},$$

where $\mathcal{N}$ is a normalization factor and $N_{n,m}$ is the number of times that $n$ photons were detected given that $\delta_m = \theta_m - \theta_{\mathrm{disc}} = m\pi/2$. Here $\theta_{\mathrm{disc}}$ is the answer to the state discrimination problem about the input state. The values of $N_{n,m}$ correspond to elements of the matrix of the photon detections for given $\{\delta_m\}$.
The likelihood function $L(n|\phi - \delta_m)$ depends on $\eta$, $\xi$, and $\nu$, the detection efficiency, interference visibility, and dark counts, respectively. The phase estimate $\hat{\phi}_B$ for the Bayesian estimator is then obtained from this posterior distribution. The Bayesian estimate $\hat{\phi}_B$ provides a more precise estimate of the phase offset $\phi_{\mathrm{off}}$, with smaller variance, than the sine-cosine estimate $\hat{\phi}_{\mathrm{est}}$, as shown in Fig. 6. However, this estimator is far more computationally demanding to implement experimentally in real time. While such a complex estimator cannot be implemented in our current experimental setup, the sine-cosine estimator described above produces similar results for estimating $\phi_{\mathrm{off}}$ while remaining computationally inexpensive, and allows for real-time estimation and implementation of the phase-tracking method.
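A grid-based version of such a Bayesian estimator can be sketched as follows. This is an illustration under stated assumptions rather than the estimator evaluated in Fig. 6: the likelihood is taken to be Poissonian with mean $(2\eta|\alpha|^2/N)[1 - \xi\cos(\phi - \delta_m)] + \nu$ and a saturating PNR(3) outcome, the prior is flat on $[-\pi/2, \pi/2]$, and the estimate is the maximum of the posterior on the grid. The function names and the indexing `counts[m][n]` are hypothetical.

```python
import math


def model_mean(phi, delta, alpha2, n_steps=7, eta=1.0, xi=0.99, nu=1e-3):
    """Assumed mean photon number for phase offset phi and LO relative phase delta."""
    return 2.0 * eta * alpha2 / n_steps * (1.0 - xi * math.cos(phi - delta)) + nu


def outcome_probs(mean, m=3):
    """Probabilities of outcomes 0, 1, ..., m-1 and 'm or more' for a Poisson mean."""
    pmf = [math.exp(-mean) * mean ** n / math.factorial(n) for n in range(m)]
    return pmf + [max(1.0 - sum(pmf), 0.0)]


def bayes_phase_estimate(counts, alpha2, grid=721, **det):
    """Maximum of the posterior over a phase grid with a flat prior.

    counts[m][n] is the number of times outcome n was observed for
    delta_m = m * pi / 2 (m = 0..3, n = 0..3).
    """
    best_phi, best_logp = 0.0, -math.inf
    for i in range(grid):
        phi = -math.pi / 2 + math.pi * i / (grid - 1)
        logp = 0.0
        for m_idx in range(4):
            probs = outcome_probs(model_mean(phi, m_idx * math.pi / 2, alpha2, **det))
            for n, times in enumerate(counts[m_idx]):
                if times:
                    # Log-likelihood: exponent N_{n,m} becomes a multiplicative weight.
                    logp += times * math.log(max(probs[n], 1e-300))
        if logp > best_logp:
            best_phi, best_logp = phi, logp
    return best_phi
```

The nested loops over the phase grid and the detection matrix indicate why this estimator costs more per update than the sine-cosine estimator, which only needs four averages and two inverse trigonometric functions.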
Appendix D: Estimator performance as a function of $N_{\mathrm{avg}}$
The performance of the phase-tracking method critically depends on the variance of the estimates $\hat{\phi}_{\mathrm{est}}$. In general, increasing the number of samples $N_{\mathrm{avg}}$ used to obtain an estimate of the applied phase $\phi_{\mathrm{app}}$ [73] improves (reduces) the variance of the estimates $\hat{\phi}_{\mathrm{est}}$. However, increasing $N_{\mathrm{avg}}$ also increases the time required to obtain such an estimate, thus reducing the phase-tracking bandwidth $f_{\mathrm{PT}}$. In situations with random Gaussian phase noise with bandwidth $f_{\mathrm{RW}} > f_{\mathrm{PT}}$, this reduction in $f_{\mathrm{PT}}$ can significantly increase the probability of error $P_E$ in the state discrimination measurement, and produce estimates that are far less accurate. As a result, there is a trade-off in the performance of the estimator as a function of $N_{\mathrm{avg}}$. While larger values of $N_{\mathrm{avg}}$ provide better estimates for constant phases, in situations where the phase is not constant these estimates may not be accurate, limiting the performance of the phase-tracking method. To investigate this trade-off, we experimentally study the estimator variance as a function of $N_{\mathrm{avg}}$ in situations with Gaussian-distributed random phase noise. Figures 7(a)-(c) show the estimates from the experiment using the phase-tracking method as a function of time from $t = 0$ to 60 s with zero applied phase ($\phi_{\mathrm{app}} = 0$) for $|\alpha|^2 = 5.0$, for cases with (a) $N_{\mathrm{avg}} = 2$, (b) $N_{\mathrm{avg}} = 15$, and (c) $N_{\mathrm{avg}} = 40$, and the corresponding histograms of estimates. We observe that while smaller $N_{\mathrm{avg}}$ increases the phase-tracking bandwidth $f_{\mathrm{PT}}$, the variance of the estimates for $\phi_{\mathrm{app}} = 0$, denoted as $\sigma_0^2$, also increases. On the other hand, larger $N_{\mathrm{avg}}$ reduces $f_{\mathrm{PT}}$, but also reduces $\sigma_0^2$. Figure 7(d) shows the variance $\sigma_0^2$ for estimators with different $N_{\mathrm{avg}}$ (blue points) for five experimental runs with $\phi_{\mathrm{app}} = 0$.
The inset (i) shows the variance $\sigma_0^2$ on a log-log scale with a best-fit line showing good agreement with a $1/N_{\mathrm{avg}}$ scaling, which is consistent with the statistical uncertainty for a process with random noise.
For situations with random Gaussian phase noise $\phi_{\mathrm{RW}}$ with variance $\sigma_{\mathrm{RW}}^2$, the total variance of the estimates will contain contributions from $\sigma_0$ and $\sigma_{\mathrm{RW}}$. The solid black line in Fig. 7(d) shows the expected accumulated variance $\sigma_{\mathrm{RW}}^2$ for Gaussian random walks between the times to obtain phase estimates, which is proportional to $N_{\mathrm{avg}}$. Then, the total expected variance $\sigma_{\mathrm{Etot}}^2$ for situations with Gaussian phase noise will be approximately the sum in quadrature of $\sigma_0$ (related to the variance of the estimator with different $N_{\mathrm{avg}}$) and $\sigma_{\mathrm{RW}}$ [74]. Figure 7(e) shows the performance of the estimator for a given applied Gaussian random walk in phase (black line) for $N_{\mathrm{avg}} = 2$, 15, and 40, with $\sigma_1 = 5$ mrad ($\sigma_1$ as defined in the main text) from $t = 0$ to 60 s. Figure 7(f) shows the difference between the estimate $\hat{\phi}_{\mathrm{est}}$ and the applied phase $\phi_{\mathrm{RW}}$, $\Delta = \hat{\phi}_{\mathrm{est}} - \phi_{\mathrm{RW}}$, from Fig. 7(e). The variance of $\Delta$ will now contain two contributions: one from the estimator with different $N_{\mathrm{avg}}$, ideally given by $\sigma_0^2$, and one due to the random walks in phase with variance $\sigma_{\mathrm{RW}}^2$. For $N_{\mathrm{avg}} = 2$ we expect a large variance of the estimates $\hat{\phi}_{\mathrm{est}}$, as shown in Fig. 7(a), and due to the relatively high tracking bandwidth $f_{\mathrm{PT}}$, the effect of the random walks is relatively small. On the other hand, for $N_{\mathrm{avg}} = 40$, the variance of the estimates $\hat{\phi}_{\mathrm{est}}$ is expected to be small, as seen in Fig. 7(c), but due to the low $f_{\mathrm{PT}}$ relative to $f_{\mathrm{RW}}$ ($f_{\mathrm{RW}} = 100$ Hz), the accumulated variance from random walks becomes dominant, resulting in larger deviations $\Delta$. Figure 7(g) shows the total variance $\sigma_\Delta^2$ of $\Delta = \hat{\phi}_{\mathrm{est}} - \phi_{\mathrm{RW}}$ as a function of $N_{\mathrm{avg}}$. Error bars represent one standard deviation over five different random walks. The red solid line shows the expected total variance $\sigma_{\mathrm{Etot}}^2 = \sigma_0^2 + \sigma_{\mathrm{RW}}^2$, which contains the contributions from the estimator $\sigma_0^2$ and the applied random walks $\sigma_{\mathrm{RW}}^2$. The good agreement between $\sigma_\Delta^2$ and $\sigma_{\mathrm{Etot}}^2$ indicates that drifts in the experiment are not significant.
We observe that there is an optimal value $N_{\mathrm{avg}} \approx 10$ that minimizes the total variance $\sigma_\Delta^2$. This optimal value of $N_{\mathrm{avg}}$ provides a good phase estimate $\hat{\phi}_{\mathrm{est}}$ by increasing the number of samples for parameter estimation, while reducing the effects of errors in state discrimination caused by drifts in phase due to the random walks. We note that optimal values for $N_{\mathrm{avg}}$ are larger for smaller values of $f_{\mathrm{RW}}$, and vice versa.
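The origin of an optimal $N_{\mathrm{avg}}$ can be made explicit with a two-term model. The sketch below assumes that the estimator variance scales as $a/N_{\mathrm{avg}}$ and that the accumulated random-walk variance scales as $b\,N_{\mathrm{avg}}$, consistent with the scalings described above. The coefficients `a` and `b` are placeholders for illustration and are not values from the experiment.

```python
import math


def total_variance(n_avg, a, b):
    """Sum of estimator variance (a / n_avg) and random-walk variance (b * n_avg)."""
    return a / n_avg + b * n_avg


def optimal_n_avg(a, b):
    """Minimizer of a / n + b * n over real n > 0, obtained by setting d/dn to zero."""
    return math.sqrt(a / b)


if __name__ == "__main__":
    a, b = 4e-3, 1e-4  # placeholder coefficients in rad^2
    n_star = optimal_n_avg(a, b)
    print(f"optimal N_avg ~ {n_star:.2f}, variance {total_variance(n_star, a, b):.2e} rad^2")
```

In this model a faster random walk (larger `b`) moves the optimum to smaller $N_{\mathrm{avg}}$, in line with the observation that optimal values of $N_{\mathrm{avg}}$ are larger for smaller $f_{\mathrm{RW}}$.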
Appendix E: Phase-tracking with different noise bandwidths
Methods for phase-tracking should be able to track phase noise with different bandwidths. In general, the performance of any phase-tracking method in tracking fast noise depends on how fast reliable phase estimates can be generated and how fast corrections can be applied [28], which defines the phase-tracking bandwidth $f_{\mathrm{PT}}$. We investigate the performance of the phase-tracking method for the sub-QNL non-Gaussian receiver in tracking time-varying random phase noise with different bandwidths $f_{\mathrm{RW}}$ for $|\alpha|^2 = 5.0$, while keeping the same phase-tracking bandwidth $f_{\mathrm{PT}} = f_{\mathrm{exp}}/(500 N_{\mathrm{avg}}) \approx 1.2$ Hz, with $f_{\mathrm{exp}} \approx 12$ kHz and $N_{\mathrm{avg}} = 20$. Figure 8(a) shows the probability of error $P_E$ as a function of time from $t = 0$ to 14 s when applying random walks in phase with noise bandwidths $f_{\mathrm{RW}} = 100$ Hz (blue) and $f_{\mathrm{RW}} = 500$ Hz (green) for two different realizations of 50 random walks. Thick lines show the averages and shaded regions show the spread in $P_E$ over the 50 random walks. In this study, the random walks in phase and the phase-tracking are enabled at $t = 2$ s in both cases [see Fig. 8(b) for the applied phase $\phi_{\mathrm{app}}$ and phase estimates $\hat{\phi}_{\mathrm{est}}$ for $f_{\mathrm{RW}} = 500$ Hz]. We observe that in the presence of phase noise with $f_{\mathrm{RW}} = 100$ Hz, phase-tracking allows the receiver to maintain a 3.4 dB advantage over the ideal heterodyne limit (Het.), which corresponds to the expected performance without phase noise [53] at $|\alpha|^2 = 5.0$. On the other hand, for phase noise with bandwidth $f_{\mathrm{RW}} = 500$ Hz, the average probability of error increases from this ideal case, showing a much larger spread in $P_E$ compared to the case with $f_{\mathrm{RW}} = 100$ Hz. Performing phase-tracking for non-Gaussian receivers under phase noise with higher bandwidths $f_{\mathrm{RW}}$ requires achieving higher $f_{\mathrm{PT}}$ to generate accurate phase estimates $\hat{\phi}_{\mathrm{est}}$ at a sufficiently high rate compared to $f_{\mathrm{RW}}$. This can be achieved by increasing $f_{\mathrm{exp}}$, resulting in higher sampling rates for phase estimation. Alternatively, increasing $|\alpha|^2$ would reduce errors in state discrimination and increase the accuracy of $\hat{\phi}_{\mathrm{est}}$, effectively reducing the effects of low phase-tracking bandwidths $f_{\mathrm{PT}}$. A combination of higher rates $f_{\mathrm{PT}}$ and low discrimination errors would make phase-tracking reliable for enabling the receiver to maintain discrimination below the heterodyne limit under phase noise with high noise bandwidths.

FIG. 8. (a) Probability of error $P_E$ as a function of time for $|\alpha|^2 = 5.0$ for noise bandwidths $f_{\mathrm{RW}} = 100$ Hz (blue) and $f_{\mathrm{RW}} = 500$ Hz (green) with different random walks. Thick lines show the averages and shaded regions show the spread in $P_E$ over 50 Gaussian random walks. For $f_{\mathrm{RW}} = 100$ Hz, the phase-tracking method allows the non-Gaussian receiver to perform 3.4 dB below the heterodyne limit (Het.), which is the expected performance in the absence of phase noise (black line). Noise with $f_{\mathrm{RW}} = 500$ Hz causes $P_E$ to increase compared to $f_{\mathrm{RW}} = 100$ Hz. However, phase-tracking allows the receiver to perform below Het. for 14 s. (b) Applied random walks in phase for $f_{\mathrm{RW}} = 500$ Hz (lower) and the phase-tracking estimates (upper). The applied phase $\phi_{\mathrm{app}} = \phi_{\mathrm{RW}}$ and estimates $\hat{\phi}_{\mathrm{est}}$ for $f_{\mathrm{RW}} = 100$ Hz are shown in Fig. 4(e) in the main manuscript over 60 s.
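The relation between the experimental repetition rate and the phase-tracking bandwidth is simple arithmetic, shown here as a small helper. The formula $f_{\mathrm{PT}} = f_{\mathrm{exp}}/(500 N_{\mathrm{avg}})$ and the values $f_{\mathrm{exp}} \approx 12$ kHz and $N_{\mathrm{avg}} = 20$ are taken from the text; the function name and the `block` parameter are introduced only for this example.

```python
def tracking_bandwidth(f_exp_hz: float, n_avg: int, block: int = 500) -> float:
    """Rate of phase corrections: one correction per block * n_avg transmissions."""
    return f_exp_hz / (block * n_avg)


if __name__ == "__main__":
    f_pt = tracking_bandwidth(12e3, 20)
    print(f"f_PT = {f_pt:.2f} Hz")
    for f_rw in (100.0, 500.0):
        # Ratio of noise bandwidth to correction rate for the two cases studied.
        print(f"f_RW = {f_rw:.0f} Hz -> f_RW / f_PT = {f_rw / f_pt:.0f}")
```

For the stated parameters the receiver applies about 1.2 corrections per second, so both noise bandwidths studied are much larger than $f_{\mathrm{PT}}$, and the ratio is five times larger for $f_{\mathrm{RW}} = 500$ Hz than for 100 Hz.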

Appendix F: Amplitude Noise
In addition to phase noise, random amplitude fluctuations may occur in communication channels. Amplitude fluctuations will affect the discrimination strategy and will result in increased discrimination errors if the noise is relatively large. Table II shows the effects of amplitude fluctuations, characterized by the standard deviation of the relative amplitude noise $\sigma_{\mathrm{amp}}$, for $|\alpha|^2 = 5$ as a case study. The noise is assumed to be Gaussian such that $|\alpha|^2 \rightarrow \mathcal{N}(1, \sigma_{\mathrm{amp}}) \times |\alpha|^2$. The increase in the probability of error for state discrimination causes higher errors in populating the probability distributions $P_\delta(n_k|\delta)$, from which the phase estimator $\hat{\phi}_{\mathrm{est}}$ is formed, and affects the phase-tracking performance, which can be characterized by the estimator variances. Table II shows the variances for the sine-cosine and Bayesian estimators for different levels of amplitude noise. We observe that the presence of amplitude noise with levels from $\sigma_{\mathrm{amp}} = 5\%$ to $25\%$ has very moderate effects on the error of state discrimination and on the variances of the estimators. This highlights the robustness of the phase-tracking method to amplitude noise.
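The amplitude-noise model $|\alpha|^2 \rightarrow \mathcal{N}(1, \sigma_{\mathrm{amp}}) \times |\alpha|^2$ can be sampled as sketched below. This is a minimal sketch of the noise model only, not of the simulations behind Table II. It interprets $\sigma_{\mathrm{amp}}$ as the standard deviation of the multiplicative factor, and clipping negative values to zero is an assumption added here because a mean photon number cannot be negative. The function name is hypothetical.

```python
import random


def noisy_mean_photon_numbers(alpha2, sigma_amp, n_samples, rng=None):
    """Draw mean photon numbers alpha2 * N(1, sigma_amp), clipped at zero."""
    rng = rng or random.Random()
    return [max(0.0, alpha2 * rng.gauss(1.0, sigma_amp)) for _ in range(n_samples)]


if __name__ == "__main__":
    rng = random.Random(7)
    for sigma in (0.05, 0.15, 0.25):
        samples = noisy_mean_photon_numbers(5.0, sigma, 10000, rng)
        mean = sum(samples) / len(samples)
        print(f"sigma_amp = {sigma:.2f}: sample mean of |alpha|^2 = {mean:.3f}")
```

Each sampled value would replace the nominal $|\alpha|^2$ as the true input power of one channel transmission, while the receiver keeps assuming the nominal value.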
For situations with slow amplitude noise, it may be possible to perform amplitude estimation and tracking based on the data collected from the state discrimination measurement. Specifically, the information contained in the photon number distributions $P_\delta(n_k|\delta)$ may be sufficient to estimate both phase and amplitude, which can eventually be used for phase and amplitude tracking. The receiver could use the Bayesian estimator for a two-dimensional estimation yielding a simultaneous estimate of the phase offset and amplitude of the input state. Alternatively, the receiver could perform phase estimation with the method described here, and sequentially realize amplitude estimation based on Eq. (B1), or vice versa.