Quantum Receiver for Phase-Shift Keying at the Single-Photon Level

Quantum-enhanced receivers are endowed with resources to achieve higher sensitivities than conventional technologies. For application in optical communications, they provide improved discriminatory capabilities for multiple nonorthogonal quantum states. In this work, we propose and experimentally demonstrate a new decoding scheme for quadrature phase-shift encoded signals. Our receiver surpasses the standard quantum limit and outperforms all previously known nonadaptive detectors at low input powers. Unlike existing approaches, the receiver only exploits linear optical elements and on-off photodetection. It thereby circumvents the need for challenging feed-forward operations, which limit communication transmission rates, and can be readily implemented with current technology.


I. INTRODUCTION
Quantum mechanics places strict fundamental limits on our ability to discriminate nonorthogonal quantum states [1,2]. This is a deep-rooted property of quantum mechanics that, on one hand, fuels numerous applications in quantum information science such as quantum computing and quantum key distribution [3][4][5], and on the other hand, limits the performance of other protocols such as sensing, metrology [6][7][8][9], and communication. The mathematical framework around state discrimination is based on the theory of quantum detection [1], and it has been applied to study the discrimination of various quantum states [10][11][12].
Of particular importance is the efficient discrimination of weak coherent states. Coherent states are endowed with an intrinsic resilience to loss and, given their immediate availability, have become indispensable information carriers in the optical realization of classical [13] and quantum information protocols [14,15]. An alphabet of coherent states with very small amplitudes (down to the single-photon level) possesses large state overlaps, and thus exhibits strong quantum features. Such a small-amplitude alphabet occurs often in quantum communication protocols and in classical communication schemes that aim to enhance channel capacities [16]. More specifically, the optimal discrimination of weak coherent states can be used to enhance the secure key rate in quantum key distribution, improve the success rate in entanglement distillation, and increase the distance of deep-space communication [17,18].
This work focuses on the discrimination of four weak coherent states with equal amplitude and equidistant phase separations, $\{|\alpha\rangle, |i\alpha\rangle, |{-\alpha}\rangle, |{-i\alpha}\rangle\}$, chosen with equal prior probabilities, where the amplitude $\alpha$ is real valued and positive. This ensemble is referred to as quadrature phase-shift keying (QPSK) and is commonplace in fiber networks [19]. It offers efficient encoding of two bits of information in one mode of the electromagnetic field. Efficient readout of the encoded information can be accomplished by measuring conjugate quadratures via heterodyne detection [13,20]. However, the optimal bound on the discrimination error (that is, the minimum average error in discriminating QPSK coherent states), known as the Helstrom bound, is significantly lower than that attainable through heterodyne detection [21,22]. A practical setup for discriminating the QPSK coherent states at the exact Helstrom limit is unknown. However, it is possible to surpass the heterodyne limit and approach the Helstrom bound using different decoding strategies. These schemes generally use a combination of linear optics, photodetection, and globally optimized displacement operations to distinguish coherent states through conditional signal nulling.
While receivers based on adaptive feedback exhibit superior performance, e.g., outperforming the heterodyne detection limit for all amplitudes, they are technically challenging to implement and may impose practical limitations on optical communication. In particular, the communication bandwidth is intrinsically limited by the feedback mechanism, since the feedback delay significantly degrades the receiver performance [28]. Alternatively, adaptive receivers can be realized by spatially dividing a signal state into multiple modes [34,35], but this significantly increases the complexity of the receiver. It is therefore essential to devise a detection system that beats the heterodyne detection limit without the use of feedback techniques [36]. It has been shown that, for large coherent state amplitudes ($\alpha \gtrsim 2$), this is possible by solely using linear optics and photodetection without the adoption of feedback [24,37]. However, obtaining the same advantage for weak coherent states was, up to now, an open question.
In this paper, we introduce, characterize, and experimentally demonstrate a new decoding strategy for QPSK states, composed of linear optics and on-off photodetection. Notably, we do not make use of adaptive measurements, feed forward, or photon-number resolution. We show that adaptive feedback is not necessary to beat the conventional heterodyne decoding limit in the fully quantum, weak coherent amplitude regime ($\alpha \lesssim 0.5$). We experimentally realize the receiver and find strong agreement with theoretical predictions that account for the system efficiency. This work demonstrates a fundamental advance towards suboptimal optical receivers, and provides an immediate, practical strategy to surpass the heterodyne detection limit with currently available technology. Our strategy is compatible with photon-number-resolving detection, which can increase the robustness of the receiver against noise and extend the performance of our scheme to higher input intensities [25,27,38].
The theoretical framework of this paper is presented in Secs. II and III. In Sec. IV, we present our new receiver for QPSK decoding. We demonstrate that our receiver outperforms previous decoding strategies in the weak amplitude regime, and present an experimental demonstration of this in Sec. V. Conclusions are summarized in Sec. VI.

II. THEORETICAL FRAMEWORK
Consider the problem of identifying a quantum state $\rho_x$ drawn from a known finite set $\{\rho_1, \rho_2, \ldots, \rho_n\}$ with prior probabilities $\{p_1, p_2, \ldots, p_n\}$ [4,39]. We focus on a single-shot scenario where only a single instance of the state is available. When the states $\rho_x$ are mutually orthogonal, detectors placed along the orthogonal directions can perform perfect state discrimination. However, perfect discrimination of nonorthogonal states is not possible in a single-shot experiment, and finding an optimal optical receiver is generally a difficult task. We consider this problem within the framework of ambiguous state discrimination, i.e., we allow for a finite probability of error that we aim to minimize.
Consider a general scheme for structured detection where the unknown state $\rho$ is mixed with a known ancillary state $\sigma$ through a unitary transformation $U$. The two output systems are then measured by applying a given measurement $M$, which is characterized by the positive operator-valued measure elements $M_y$, with $y \in \{1, \ldots, m\}$. This is shown schematically in the inset of Fig. 1. While the measurement is fixed, the ancillary state $\sigma$ and the unitary $U$ can be chosen within given sets, respectively denoted by $\mathcal{S}$ and $\mathcal{U}$.
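We now determine the optimal unitary $U$ and ancilla $\sigma$ that maximize the average probability of successful discrimination. For given $\rho_x$, $U$, and $\sigma$, the Born rule gives the probability of obtaining the measurement outcome $y$ as
$$p_{U,\sigma}(y|x) = \mathrm{Tr}\left[ M_y \, U (\rho_x \otimes \sigma) U^\dagger \right]. \tag{1}$$
The Bayes rule allows us to compute the probability of input $x$ given output $y$:
$$p_{U,\sigma}(x|y) = \frac{p_x \, p_{U,\sigma}(y|x)}{\sum_{x'} p_{x'} \, p_{U,\sigma}(y|x')}. \tag{2}$$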
FIG. 1. A scheme for discrimination of multiple coherent states using linear optics and photon detection. The input ensemble is made of $s$-mode coherent states of unknown amplitudes. The scheme uses $t$ ancillary modes prepared in coherent states of known amplitudes. The final measurement is realized as $s + t$ independent photon detectors. Inset: a general ancilla-assisted discrimination scheme, with $\rho$ an unknown state, $\sigma$ a known ancillary state, $U_{\mathrm{LO}}$ a unitary transformation constructed from linear optics (LO), and $M$ a measurement.
The best guess for $x$, given the measurement output $y$, is the one that maximizes the conditional probability,
$$\hat{x}(y) = \arg\max_x \, p_{U,\sigma}(x|y), \tag{3}$$
where $p_{U,\sigma}(\hat{x}(y)|y)$ is the probability of successfully identifying the input state given the output measurement $y$.
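The average success probability is then given by
$$p_{U,\sigma} = \sum_y p_{U,\sigma}(y) \max_x p_{U,\sigma}(x|y) = \sum_y \max_x \, p_x \, p_{U,\sigma}(y|x), \tag{4}$$
where $p_{U,\sigma}(y) = \sum_x p_x \, p_{U,\sigma}(y|x)$ is the total probability of outcome $y$. The optimization routine consists of finding the ancillary state $\sigma \in \mathcal{S}$ and the unitary $U \in \mathcal{U}$ that maximize $p_{U,\sigma}$. This yields the optimized success probability
$$p(\mathcal{U}, \mathcal{S}) = \max_{U \in \mathcal{U},\, \sigma \in \mathcal{S}} p_{U,\sigma}. \tag{5}$$
Note that this quantity is a function of the sets $\mathcal{U}$ and $\mathcal{S}$ only, in addition to the input ensemble and measurement $M$. In the following section, we apply this approach to the problem of discriminating a quaternary coherent state alphabet using the linear optics toolbox.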
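The maximum-posterior guess and the resulting average success probability can be illustrated numerically. The following is a minimal Python sketch, assuming the likelihoods $p(y|x)$ are already available as a table; the function names and the toy numbers are hypothetical and serve only as an illustration.

```python
from typing import Sequence


def best_guess(priors: Sequence[float],
               likelihood: Sequence[Sequence[float]],
               y: int) -> int:
    """Index x that maximizes the posterior p(x|y).

    likelihood[x][y] holds p(y|x). The normalization of the Bayes rule does
    not depend on x, so maximizing p_x * p(y|x) is sufficient.
    """
    return max(range(len(priors)), key=lambda x: priors[x] * likelihood[x][y])


def average_success_probability(priors: Sequence[float],
                                likelihood: Sequence[Sequence[float]]) -> float:
    """Sum over outcomes y of max_x p_x * p(y|x)."""
    n_outcomes = len(likelihood[0])
    return sum(
        max(p_x * row[y] for p_x, row in zip(priors, likelihood))
        for y in range(n_outcomes)
    )


# Toy example with made-up numbers: two equiprobable states, two outcomes.
toy_priors = [0.5, 0.5]
toy_likelihood = [[0.8, 0.2],   # p(y|x=0)
                  [0.3, 0.7]]   # p(y|x=1)
toy_success = average_success_probability(toy_priors, toy_likelihood)
```

In this toy case the receiver guesses state 0 on outcome 0 and state 1 on outcome 1, and the success probability is the sum of the two winning joint probabilities.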

III. COHERENT STATE DISCRIMINATION WITH THE LINEAR OPTICS TOOLBOX
Coherent state discrimination represents a concrete example of quantum state discrimination and has many applications in quantum optics. Here we focus on the problem of ambiguous discrimination of coherent states using linear optics and on-off photon detection. For this, the unknown state $\rho$ is a coherent state over $s$ optical modes, $\sigma$ is a known coherent state over $t$ modes, and the measurement $M$ is modewise on-off photon detection. The unitary $U_{\mathrm{LO}}$ is chosen from the set of linear optical (LO) transformations over $N = s + t$ modes. While the following framework can be extended to nonclassical ancillary states, we focus on coherent states given their availability and widespread use in quantum information.
It is known that linear $N$-mode unitaries can be constructed from passive linear optical transformations followed by modewise phase-displacement operations [40,41]. In turn, passive linear optical transformations are realized using specific arrangements of beam splitters and phase shifters [42,43].
Consider an unknown coherent state of amplitude $\alpha_x$ ($s = 1$) and $N - 1$ auxiliary coherent states of amplitudes $\beta_j \in \mathbb{C}$ for $j \in \{1, \ldots, N - 1\}$. Mixing these coherent states through an $N$-mode passive linear optical unitary $U$, followed by modewise displacements $\delta_j$, yields as output on mode $j$ a coherent state with amplitude
$$\gamma_j = U_{j1} \alpha_x + \sum_{k=2}^{N} U_{jk} \beta_{k-1} + \delta_j. \tag{6}$$
The ancillary state amplitudes, passive linear unitary, and the displacements must be chosen to maximize the average success probability of state discrimination. The complexity of this optimization scales quadratically with the number of modes $N$ due to the decomposition of the unitary [42,43]. We significantly reduce this complexity by noting that the amplitudes in Eq. (6) are also attained if the signal $\alpha_x$ is instead mixed with $N - 1$ vacuum modes at the same unitary, with displacements $\Delta_j = \sum_{k=2}^{N} U_{jk} \beta_{k-1} + \delta_j$ on the $j$th mode. The original optimization is then equivalent to requiring a general displacement $\Delta_j \in \mathbb{C}$ such that $\gamma_j = U_{j1} \alpha_x + \Delta_j$. Note that our use of ancillary vacuum modes renders only the first column of the unitary important, which reduces the scaling of the optimization from quadratic to linear in $N$. Hence, we write
$$\gamma_j = u_j \alpha_x + \Delta_j, \tag{7}$$
where $u = (u_1, u_2, \ldots, u_N)$ is a unit vector. The objective is then to determine the optimal choice of $u$ and $\Delta = (\Delta_1, \Delta_2, \ldots, \Delta_N)$, collectively referred to as optimization parameters, to maximize the success probability of state discrimination.
We now define the optimized success probability for coherent state discrimination. Each mode is subject to on-off photon detection. Denote a photon detection on the $j$th mode by $y_j = 1$, and a no-detection event by $y_j = 0$. These mutually exclusive events occur with probabilities
$$p(y_j = 1|x) = 1 - e^{-|\gamma_j|^2}, \tag{8}$$
$$p(y_j = 0|x) = e^{-|\gamma_j|^2}, \tag{9}$$
respectively. The overall output of the $N$-mode measurement is represented by the binary vector $y = (y_1, y_2, \ldots, y_N)$, and the average success probability in Eq. (4) reads
$$p(u, \Delta) = \sum_{y} \max_x \, p_x \prod_{j=1}^{N} p(y_j|x). \tag{10}$$
The latter is the objective function that we maximize by finding the optimal choice for the parameters $u$ and $\Delta$.

Note that, without loss of generality, $u$ can be assumed real given the objective function is a function of the modulus of $\gamma_j$ alone. The objective function in Eq. (10) is then optimized over the parameters $\Delta_j \in \mathbb{C}$ and $u_j \in \mathbb{R}$ for all $j \in \{1, 2, \ldots, N\}$ with $\sum_j u_j^2 = 1$. This corresponds to an optimization over a total of $3N - 1$ real parameters, which scales linearly with the number of modes. We implement a nonlinear constrained global optimization of the success probability. The constraints ensure that the vector $u$ has unit norm. The numerical optimizer takes advantage of primitive implementations of gradient-based and direct search algorithms for finding constrained local maxima. We have implemented these routines in Mathematica [46].
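The objective function described above can be evaluated directly by enumerating all click patterns. The sketch below is written in Python rather than the Mathematica routines mentioned in the text, and is an illustration only: the function name `onoff_success_probability` is hypothetical, ideal detectors (unit efficiency, no dark counts) are assumed, and the example parameters are the fixed two-mode displacements $(i+1)/2$ and $(i-1)/2$ behind a balanced beam splitter, with the sign of $u_2$ chosen positive as an assumption.

```python
import itertools
import math


def onoff_success_probability(signals, u, delta, priors=None):
    """Average success probability of an N-mode displacement / on-off receiver.

    signals: complex amplitudes alpha_x of the candidate coherent states
    u:       real unit vector (first column of the passive unitary)
    delta:   complex displacement applied to each output mode
    priors:  prior probabilities p_x (uniform if omitted)
    """
    if priors is None:
        priors = [1.0 / len(signals)] * len(signals)
    # No-click probability exp(-|gamma_j|^2) with gamma_j = u_j alpha_x + Delta_j.
    no_click = [
        [math.exp(-abs(u_j * a + d_j) ** 2) for u_j, d_j in zip(u, delta)]
        for a in signals
    ]
    total = 0.0
    for y in itertools.product((0, 1), repeat=len(u)):
        weighted = []
        for p_x, row in zip(priors, no_click):
            lik = 1.0
            for y_j, q in zip(y, row):
                lik *= (1.0 - q) if y_j else q
            weighted.append(p_x * lik)
        total += max(weighted)  # maximum-posterior guess for this click pattern
    return total


# Example: QPSK alphabet at alpha = 0.3 with the fixed two-mode parameters.
alpha = 0.3
qpsk = [alpha * 1j ** x for x in range(4)]
u = (1 / math.sqrt(2), 1 / math.sqrt(2))   # balanced beam splitter (assumed signs)
delta = ((1 + 1j) / 2, (-1 + 1j) / 2)
qpsk_value = onoff_success_probability(qpsk, u, delta)
```

The enumeration visits $2^N$ click patterns, which is inexpensive for the small $N$ considered here; a numerical optimizer over $u$ and the displacements could wrap this function.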

IV. QUADRATURE PHASE-SHIFT KEYING
In QPSK, the unknown states are coherent states, $\rho_x \equiv |\alpha_x\rangle = |i^x \alpha\rangle$ with $x \in \{0, 1, 2, 3\}$, given with equal probability, and $\alpha > 0$ [19], where $|\alpha|^2$ is the mean photon number. A practical measurement scheme to distinguish these states is heterodyne detection, which, on average, is successful with probability
$$p_{\mathrm{het}} = \frac{1}{4}\left[1 + \mathrm{erf}\left(\frac{\alpha}{\sqrt{2}}\right)\right]^2. \tag{11}$$
To apply our optimization routine in Sec. III to QPSK, we first fix the number of ancillary modes. We illustrate the achievable average error probability of distinguishing the QPSK alphabet with different numbers of ancillary modes in Fig. 2. The number of modes required to minimize the error probability depends on the amplitude of the coherent states. Specifically, our numerical optimization suggests that a decoder equipped with one ancillary coherent mode is optimal in the weak amplitude regime $\alpha \lesssim 0.5$. For larger amplitudes $\alpha \gtrsim 0.75$, two ancillary modes are the optimal choice. Additional ancillary modes increase the complexity of implementation, while the decoding improvements over two ancillary modes are negligible.
In the following, we concentrate on the weak amplitude regime and will hence consider only one ancillary mode (i.e., $N = 2$). We also find an analytical solution that is an excellent approximation of the numerical optimum in the regime of weak amplitudes, with $N = 2$. Furthermore, this solution does not require the parameters to be tuned to specific values of $\alpha$. This near-optimal receiver is attained through
$$|u_1| = |u_2| = \frac{1}{\sqrt{2}}, \qquad \Delta_1 = \frac{i + 1}{2}, \qquad \Delta_2 = \frac{i - 1}{2}. \tag{12}$$
Physically, this is realized by mixing the input and ancilla modes on a 50% beam splitter. The two modes are then displaced by $(i + 1)/2$ and $(i - 1)/2$, respectively. This is shown in Fig. 3, where $D(\Delta)$ denotes the phase-space displacement of amplitude $\Delta$. With this, the computational overheads are greatly reduced, and an experimental implementation to discriminate QPSK states below the standard quantum limit can be easily performed for weak signal amplitudes. For this near-optimal choice of parameters, we obtain the following analytical expression for the average success probability (see Appendix A):
$$p = \frac{1}{4}\left[1 + 2 e^{-(1 + \alpha^2)/2} \sinh\left(\frac{\alpha}{\sqrt{2}}\right)\right]^2. \tag{13}$$
This is close to the numerically optimized success probability. Note that the independence of the near-optimal parameters from $|\alpha|$ is only valid in the weak amplitude regime. Intuitively, this is due to a fundamental scale in phase space that is given by the shot noise, which in our units is equal to 1. The optimal parameters change substantially when $\alpha \gtrsim 1$. In the weak amplitude regime, where $\alpha < 1$, these parameters remain fairly constant. We benchmark the success probability of our optimal receiver scheme against the Helstrom bound and heterodyne detection in Fig. 4. Our scheme outperforms both heterodyne detection and the hybrid scheme of Müller et al. [33] in the small amplitude regime. Unlike the receiver of Müller et al., our scheme is implemented using only linear optics and photon counting and does not rely on adaptive conditioning of the optimization parameters.
FIG. 2. For larger photon numbers ($|\alpha|^2 \gtrsim 1$), three modes are optimal and sufficient to outperform heterodyne measurements; using more than three modes does not make any noticeable difference. The magnification on the right-hand side illustrates this clearly.
FIG. 3. An optimal receiver for QPSK discrimination. The unknown coherent state is first mixed with a vacuum state at a beam splitter (BS) with transmissivity $\eta$ and phase $\phi$. Each mode is then independently displaced in phase space by $\Delta_1$, $\Delta_2$, before being detected using bucket detectors. This scheme is easy to implement given its independence of ancillary states and adaptive strategies.
While we have considered QPSK alphabets, our optimization routine can be applied to an arbitrary constellation of coherent states. In Appendix B, we consider the application of our detection scheme for an $M$-ary phase-shift keying code.
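The comparison between the fixed two-mode receiver and heterodyne detection in the weak amplitude regime can be checked with a few lines of Python. This is a sketch under stated assumptions: ideal detection efficiency, shot-noise variance equal to 1, and closed-form expressions derived here from the click probabilities of a balanced beam splitter followed by displacements $(i+1)/2$ and $(i-1)/2$. The function names are hypothetical.

```python
import math


def heterodyne_success(alpha: float) -> float:
    """QPSK success probability of ideal heterodyne detection (unit shot noise)."""
    return 0.25 * (1.0 + math.erf(alpha / math.sqrt(2.0))) ** 2


def fixed_receiver_success(alpha: float) -> float:
    """Closed form for the balanced-BS receiver with displacements (1+i)/2, (i-1)/2."""
    envelope = math.exp(-(1.0 + alpha ** 2) / 2.0)
    return 0.25 * (1.0 + 2.0 * envelope * math.sinh(alpha / math.sqrt(2.0))) ** 2


def crossover_amplitude(lo: float = 0.1, hi: float = 1.0, tol: float = 1e-9) -> float:
    """Bisection for the amplitude at which both receivers perform equally.

    Assumes the fixed receiver wins at `lo` and heterodyne wins at `hi`.
    """
    def gap(a: float) -> float:
        return fixed_receiver_success(a) - heterodyne_success(a)

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


crossover = crossover_amplitude()
```

Under these idealized assumptions the crossover falls between $\alpha = 0.5$ and $\alpha = 0.6$, in line with the statement that the fixed receiver is advantageous only for weak amplitudes.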

V. EXPERIMENTAL DEMONSTRATION
We experimentally demonstrate our QPSK decoder in a temporal mode representation [34,35,47]. In the spatial mode representation used thus far, beam splitters divide the signal coherent state into multiple modes while maintaining the encoded phase information. This multimode splitting can equivalently be accomplished in the temporal domain by splitting the signal coherent state into multiple time bins. The splitting ratio of the beam splitters corresponds to the ratio of the widths of each time bin. The displacement operations performed individually on each mode in the spatial mode version can be implemented in the temporal mode version by instantly updating the displacement operation in time. The multiple single-photon detectors in the spatial version are replaced with one single-photon detector, which must have sufficiently high time resolution to detect photons in each time bin.
FIG. 5. Average success probability of distinguishing QPSK coherent states as a function of the signal amplitude $\alpha$ for different receivers. The theoretical success probability of our scheme in the experimental condition is shown with a solid red line, together with experimental data points. The success probabilities from heterodyne [from Eq. (11)] measurements accounting for 66% and 100% detection efficiencies are shown with dashed black and dotted gray lines, respectively. The blue dotted line is for a conventional nulling Kennedy receiver [44], shown together with experimental data points. The green dash-dot line is for the Kennedy receiver with optimized displacement amplitudes [45], also shown with experimental data. Error bars denote one standard deviation from five realizations.
FIG. 6. Experimental setup. FC is the fiber coupler, SW is the optical switch, PM is the phase modulator, PZT is the piezo transducer, VA is the variable attenuator, PC is the polarization controller, DAC is the digital-to-analog converter, AMP is the amplifier, and SSPD is the superconducting nanowire single-photon detector.
Our experimental setup is shown in Fig. 6. We use a continuous wave laser at 1550 nm, which is split into two optical paths in order to individually prepare the signal coherent state and the auxiliary coherent state for the displacement operation. A variable attenuator and a piezo transducer respectively control the amplitude and the phase of the signal state. A phase modulator on the auxiliary coherent state path controls the phase of the displacement operation with a maximum frequency of 1 MHz. Since the temporal width of the signal state is defined to be 100 μs, the displacement phase can be changed to the desired condition with little adverse effect from the finite bandwidth of the phase modulation. The signal state is combined with the auxiliary state at the 99/1 fiber coupler, which physically implements the displacement operation. Using an optical switch, the interfered beam is guided to either a photodetector for the purpose of stabilizing the relative phase between signal and auxiliary states, or a superconducting nanowire single-photon detector (SSPD) for data acquisition. Because a conventional photodetector cannot measure laser power that is attenuated to the single-photon level, the laser power is also switched between high and low by an optical switch after the laser source. A field-programmable gate array (FPGA) collects the electrical signals from the SSPD and generates the signal driving the phase modulator. We achieve a total system efficiency of about 66%, where the transmission efficiency from before the 99/1 fiber coupler to the SSPD is approximately 90%. The detection efficiency of the SSPD is measured to be approximately 73% and the dark count noise around 25 Hz, which corresponds to $2.5 \times 10^{-3}$ counts per signal [28]. The dark count noise, as well as the nonunit visibility, are critical experimental imperfections that limit the performance of receivers based on displacement and photon detection [25,27]. Nevertheless, since we demonstrate our strategy with very weak coherent amplitudes, the error probability is so high that the contribution of the errors induced by the dark count noise and the visibility imperfection is negligibly small.
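The quoted system figures follow from simple arithmetic on the stated component values. The short Python check below reproduces them; the variable names are hypothetical, and all input numbers are the approximate values given in the text.

```python
transmission = 0.90             # coupler-to-SSPD transmission efficiency
detector_efficiency = 0.73      # SSPD detection efficiency
dark_count_rate_hz = 25.0       # SSPD dark count rate
signal_width_s = 100e-6         # temporal width of one signal state
modulator_bandwidth_hz = 1e6    # maximum phase-modulation frequency

# Total system efficiency is the product of the two efficiencies (about 66%).
system_efficiency = transmission * detector_efficiency

# Expected dark counts within one signal window (about 2.5e-3).
dark_counts_per_signal = dark_count_rate_hz * signal_width_s

# Number of modulator periods that fit into one signal window.
modulator_periods_per_signal = modulator_bandwidth_hz * signal_width_s
```

The last quantity, about 100 modulator periods per signal, indicates why the finite modulation bandwidth has little effect when the displacement phase is switched within one signal window.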
We experimentally investigate the performance of three types of two-mode receivers based on displacement operations and photon detection. The three schemes are illustrated on phase-space diagrams in Fig. 7, and the experimentally obtained performances of the receivers are depicted in Fig. 5 [48].
(1) A conventional nulling Kennedy receiver that implements the displacement operations such that one of the QPSK signals is displaced to the vacuum state, i.e., $|\Delta_1| = |\Delta_2| = |\alpha|/\sqrt{2}$ with $\eta = 1/2$ [44] [Fig. 7(a)]. The mean and error bars of the success probabilities are evaluated from five independent procedures with $4 \times 10^4$ data points for each procedure. The signal amplitude is calibrated from the observed photon count rate by blocking the auxiliary state path, and the error bars are evaluated from ten independent procedures.
(2) The Kennedy receiver with optimized displacement amplitude [45] [Fig. 7(b)]. Our numerical analysis indicates that, for both nulling and displacement-amplitude-optimized receivers, the displacement phases for modes 1 and 2 should be set to the same value to maximize the average success probability in the very weak amplitude case. For the displacement-amplitude-optimized receiver, the near-optimal performance for the weak coherent signal amplitude is obtained with $|\Delta_1| = |\Delta_2| = 1/2$ and $\eta = 1/2$. The conventional approaches are unable to beat the heterodyne limit, given by the black dash-dot line, in the weak coherent amplitude range.
(3) Our strategy of optimizing the phase of the displacement operations [Fig. 7(c)], implemented with the near-optimal parameters, provides an improved performance that overcomes the heterodyne limit. Since our system has a finite detection efficiency of 66%, the success probability is degraded and the performances in the actual experimental condition are plotted for the optimal receiver by a dashed curve and open circles for theory and experiment, respectively.
FIG. 7. Illustration of the three receiver schemes implemented in the experiment for discriminating between the four input QPSK states $|\alpha_0\rangle$, $|\alpha_1\rangle$, $|\alpha_2\rangle$, $|\alpha_3\rangle$. (a) In the nulling Kennedy receiver [44], both BS outputs are displaced such that the $|\alpha_2\rangle$ state is shifted to the phase-space origin. (b) The amplitude-optimized Kennedy receiver [45] is similar to the nulling receiver, but by displacing $|\alpha_2\rangle$ further past the origin, the success probabilities for weak amplitudes are significantly improved. (c) Our optimal receiver also optimizes the phases of the displacements on the two BS outputs, leading to further improvements.
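The ordering of the three receivers relative to heterodyne detection can be reproduced in an idealized model. The Python sketch below assumes unit detection efficiency, no dark counts, perfect visibility, a balanced beam splitter with equal-sign outputs, and real, equal-phase displacements for the two Kennedy variants; the helper name `two_mode_success` is hypothetical. It is an illustration of the comparison, not a model of the experimental data.

```python
import itertools
import math


def two_mode_success(alpha, delta):
    """Success probability of a two-mode on-off receiver behind a 50:50 BS.

    delta = (Delta_1, Delta_2) are the displacements on the two BS outputs.
    """
    states = [alpha * 1j ** x for x in range(4)]   # QPSK amplitudes i^x alpha
    total = 0.0
    for y in itertools.product((0, 1), repeat=2):
        best = 0.0
        for a in states:
            lik = 1.0
            for y_j, d_j in zip(y, delta):
                p_off = math.exp(-abs(a / math.sqrt(2) + d_j) ** 2)
                lik *= (1.0 - p_off) if y_j else p_off
            best = max(best, 0.25 * lik)   # equal priors of 1/4
        total += best
    return total


alpha = 0.3
# (1) Nulling: displace the state |-alpha> to the origin on both outputs.
nulling = two_mode_success(alpha, (alpha / math.sqrt(2), alpha / math.sqrt(2)))
# (2) Amplitude-optimized: |Delta_1| = |Delta_2| = 1/2 with equal phases.
amplitude_optimized = two_mode_success(alpha, (0.5, 0.5))
# (3) Phase-optimized: displacements (1+i)/2 and (i-1)/2.
phase_optimized = two_mode_success(alpha, ((1 + 1j) / 2, (-1 + 1j) / 2))
# Ideal heterodyne benchmark with unit shot-noise variance.
heterodyne = 0.25 * (1.0 + math.erf(alpha / math.sqrt(2))) ** 2
```

At $\alpha = 0.3$ this idealized model places the two Kennedy variants below the heterodyne benchmark and the phase-optimized receiver above it, which matches the qualitative ordering described in the text.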

VI. CONCLUSIONS AND DISCUSSIONS
Nonorthogonal quantum states are the building blocks of quantum communication protocols. Weak coherent states are commonly used in these applications given their nonorthogonality, relative ease of generation in laboratories, and resilience to loss. Motivated by this, we have looked at designing practical receivers to discriminate coherent states. In particular, we have focused on the optimal receivers that can be obtained by combining linear optics (including passive linear optics and phase-space displacements) and on-off photodetectors.
The natural decoders for discriminating coherent states are homodyne and heterodyne detections. These detectors have the advantage of being already commonly employed in standard telecommunications. However, they are limited by the shot noise. Several works have focused on a design of the structured receiver that could beat the shot noise limit. In particular, for the problem of decoding QPSK, the only known sub-shot-noise strategies in the low amplitude regime exploit feed forward [30,31,33]. Although feasible in principle, approaches based on feed forward remain technically demanding.
Furthermore, the delay due to practical factors associated with the feedback, such as the signal propagation time, the speed of signal processing after photon detection and the response time of the single-photon detector, and the delay of updating the displacement operation, may significantly degrade the error probability [28]. The delay becomes more critical if the temporal extent of the signal carrier becomes shorter and therefore the bandwidth of the feedback would limit the possible bandwidth of the optical communication. On the other hand, since static strategies without feedback control do not require real-time signal processing, our receiver is compatible with the high-speed optical communication where the temporal width of the signal carrier is intrinsically short.
Here, we develop a novel sub-shot-noise QPSK decoding scheme. Our scheme employs linear optics and on-off photodetectors, without feed-forward operations. We demonstrate that it outperforms heterodyne detection, as well as all previous nonadaptive detectors, in the weak pulse regime. Experimental implementation of our receiver demonstrates results consistent with our theoretical analysis. Going beyond QPSK, we have shown that our scheme allows us to beat heterodyne detection for 3-PSK decoding, but not for 5-PSK and 6-PSK decoding (Appendix B). This suggests that our scheme can beat heterodyne in $M$-PSK detection only for $M \le 4$.
An interesting experimental platform to test some of this work is the hybrid spatiotemporal architecture for universal linear optics [49]. This scheme would be useful to implement the optimized unitary receivers that we construct based on the design by Reck et al. [42]. It would be interesting to see how this proposal compares with the experimental minimum error measurements proposed by Solís-Prosser et al. [2].
Our results pave the way for a number of research questions. First, photon-number-resolving detectors may further improve our decoding strategy, especially in the region of higher signal amplitudes. Second, the effects of nonclassical ancillary states are also expected to deliver improvements to the attainable success probability [50]. The extent of improvement to our scheme is an interesting line of future work. Third, though we have focused on ambiguous state discrimination, our approach may also be useful for unambiguous discrimination of coherent states [51]. Finally, it may be combined with error-correcting codes and exploited to demonstrate the phenomenon of superadditivity in quantum communication [52].

APPENDIX A: ALMOST OPTIMAL DISCRIMINATION OF QPSK
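Consider the receiver in Fig. 3, where the BS has 50% transmissivity and zero phase, and the complex displacements on the two modes have amplitudes $\Delta_1 = (i + 1)/2$ and $\Delta_2 = (i - 1)/2$. For a given $x$, the action of the BS and the displacements maps the input and ancillary coherent states to two coherent states of amplitudes
$$\gamma_1 = \frac{\alpha_x}{\sqrt{2}} + \frac{i + 1}{2}, \qquad \gamma_2 = \frac{\alpha_x}{\sqrt{2}} + \frac{i - 1}{2}. \tag{A1}$$
Defining $p(y|x)$ as the probability of obtaining the measurement outcome $y$ given the input state $\alpha_x$, the probability that an individual detector clicks is given by
$$p(y_j = 1|x) = 1 - e^{-|\gamma_j|^2}. \tag{A2}$$
With $\alpha_x = i^x \alpha$, the mean photon numbers $|\gamma_j|^2$ take only the two values $n_\pm = (\alpha^2 \pm \sqrt{2}\alpha + 1)/2$. In conclusion, we tabulate the click probabilities for each input state, with $q_\pm = 1 - e^{-n_\pm}$:

x    p(y_1 = 1|x)    p(y_2 = 1|x)
0    q_+             q_-
1    q_+             q_+
2    q_-             q_+
3    q_-             q_-            (A6)

From this table, we compute the maximum likelihood estimation for each combination of detection events, and the associated average probability of successful state discrimination:
$$p = \frac{1}{4}\left(q_+ + e^{-n_-}\right)^2 = \frac{1}{4}\left[1 + 2 e^{-(1 + \alpha^2)/2} \sinh\left(\frac{\alpha}{\sqrt{2}}\right)\right]^2. \tag{A7}$$
Hence, we have found a fixed receiver setup, where the displacements are independent of $|\alpha|$. Note that this independence of the near-optimal parameters from $|\alpha|$ is only valid in the weak amplitude regime.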
In Fig. 8 we illustrate the difference between the numerically optimized success probability and the approximate analytic solution in Eq. (A7). Note that the difference is negligible for weak amplitudes and increases as the signal amplitudes approach the classical regime ($\alpha \gtrsim 1$), where the optimal receiver is no longer independent of $\alpha$. Specifically, in this regime, vacuum fluctuations have minimal effect and any change to the signal amplitude causes large changes in the attainable success probability. This highlights that it is possible to implement a signal-amplitude-independent receiver and still obtain a good approximation to the fully optimized receiver, provided $|\alpha| \lesssim 1$.
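The table of detection probabilities and the resulting maximum-likelihood decision rule can be generated numerically. The Python sketch below assumes ideal on-off detectors and a balanced beam splitter whose two outputs carry the signal with the same sign (an assumption, since the sign convention of the BS only relabels the outcomes); the function names are hypothetical.

```python
import itertools
import math


def click_table(alpha):
    """Return {x: {(y1, y2): p(y|x)}} for the fixed two-mode receiver."""
    u = (1 / math.sqrt(2), 1 / math.sqrt(2))     # assumed BS sign convention
    delta = ((1 + 1j) / 2, (-1 + 1j) / 2)        # fixed displacements
    table = {}
    for x in range(4):
        a = alpha * 1j ** x
        # No-click probability exp(-|gamma_j|^2) on each mode.
        p_off = [math.exp(-abs(u_j * a + d_j) ** 2) for u_j, d_j in zip(u, delta)]
        table[x] = {
            y: math.prod((1 - q) if y_j else q for y_j, q in zip(y, p_off))
            for y in itertools.product((0, 1), repeat=2)
        }
    return table


def decision_rule(table):
    """Map each click pattern to the most likely input state (equal priors)."""
    return {y: max(table, key=lambda x: table[x][y]) for y in table[0]}


table = click_table(0.3)
rule = decision_rule(table)
success = sum(0.25 * table[x][y] for y, x in rule.items())
```

With this convention each of the four click patterns is assigned to a different input state, so the two binary detector outcomes jointly decode the two encoded bits.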

APPENDIX B: M -PSK RECEIVERS
In this section, we explore the application of our detection scheme for an $M$-PSK ($M$-ary phase-shift keying) code. For arbitrary $M$, this code is defined by $M$ equally likely coherent states with amplitudes $\alpha_k = \alpha e^{i 2\pi k/M}$ for $k \in \{1, 2, \ldots, M\}$. We compare our scheme with heterodyne detection, whose average success probability for $M$-PSK is derived in Appendix C. We have considered the cases in which $M = 3, 5, 6$, and optimized our detection scheme using a variable number of modes $N$. In Fig. 10 we illustrate the effect of increasing the number of ancillary modes on the average error probability of the optimal schemes. The average error probability of discriminating an $M$-PSK alphabet using heterodyne detection is also illustrated for comparison. The discrimination capability of heterodyne detection increases with increasing $M$. In general, there is an advantage in using more ancillary modes. However, the average error probability quickly saturates, especially when the number of coherent states, $M$, is small. Specifically, for $M = 3$, there is no noticeable improvement in using more than one ancilla for all signal amplitudes. Similarly, for $M = 4$, there is no improvement in using more than two ancillas for all signal amplitudes (see Fig. 2). The use of more ancillas appears to be more beneficial as $M$ increases. However, our scheme does not beat heterodyne detection for $M = 5, 6$. This suggests that the standard quantum limit can be exceeded only for $M = 3, 4$ when using linear passive optics, without feedback.
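The heterodyne benchmark for $M$-PSK can be evaluated numerically. The Python sketch below assumes a Gaussian outcome density with unit shot-noise variance centered at $x_0 = \sqrt{2}\alpha$, and a decision region given by the angular sector of half-width $\pi/M$ around the state. The radial integral is carried out analytically here (a step worked out for this sketch), and the remaining angular integral uses a composite Simpson rule. The function name is hypothetical.

```python
import math


def heterodyne_mpsk_success(alpha: float, m: int, steps: int = 2000) -> float:
    """Success probability of ideal heterodyne detection for M-PSK.

    Integrates the Gaussian outcome density over the sector |theta| <= pi/M.
    `steps` must be even (composite Simpson rule).
    """
    x0 = math.sqrt(2.0) * alpha
    half_width = math.pi / m

    def angular_density(theta: float) -> float:
        # Radial integral of r * exp(-(r^2 - 2 r c + x0^2) / 2) / (2 pi), c = x0 cos(theta).
        c = x0 * math.cos(theta)
        radial = 1.0 + c * math.sqrt(math.pi / 2.0) * math.exp(c * c / 2.0) * (
            1.0 + math.erf(c / math.sqrt(2.0))
        )
        return math.exp(-x0 * x0 / 2.0) * radial / (2.0 * math.pi)

    h = 2.0 * half_width / steps
    total = angular_density(-half_width) + angular_density(half_width)
    for k in range(1, steps):
        weight = 4.0 if k % 2 else 2.0
        total += weight * angular_density(-half_width + k * h)
    return total * h / 3.0


qpsk_value = heterodyne_mpsk_success(0.5, 4)
```

For $M = 4$ the result can be compared against the product form $\frac{1}{4}[1 + \mathrm{erf}(\alpha/\sqrt{2})]^2$, and for $\alpha = 0$ it reduces to random guessing with probability $1/M$.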

APPENDIX C: M -PSK AVERAGE SUCCESS PROBABILITY OF HETERODYNE DETECTION
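In this section, we derive an expression for the success probability of distinguishing coherent states from the $M$-PSK alphabet using heterodyne detection. This is a set of $M$ coherent states with amplitudes $\alpha_k = \alpha e^{i 2\pi k/M}$. The $M$-PSK coherent states are represented in phase space in Fig. 9. Without loss of generality, we assume that $\alpha$ is real and set $\alpha = x_0/\sqrt{2}$. Measuring by heterodyne detection a single coherent state with amplitude $\alpha = x_0/\sqrt{2}$ leads to an outcome $(x, y)$ with probability density distribution
$$p(x, y) = \mathcal{N} \exp\left[-\frac{(x - x_0)^2 + y^2}{2\sigma^2}\right], \tag{C1}$$
where $\mathcal{N} = (2\pi\sigma^2)^{-1}$, and $\sigma^2 = 1$ is the shot noise (variance of the vacuum). We need to express this in polar coordinates, $x = r\cos\theta$ and $y = r\sin\theta$, and integrate over the angular sector $|\theta| \le \pi/M$ associated with the state, which gives the success probability
$$p_{\mathrm{het}} = \mathcal{N} \int_{-\pi/M}^{\pi/M} d\theta \int_0^\infty dr \, r \exp\left[-\frac{r^2 - 2 r x_0 \cos\theta + x_0^2}{2\sigma^2}\right]. \tag{C4}$$
This corresponds to the integral over the colored region in Fig. 9. We can simplify the integral with the change of variable
$$\rho := \frac{r^2 - 2 r x_0 \cos\theta}{2\sigma^2}. \tag{C5}$$
We have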