Identifying nonclassicality from experimental data using artificial neural networks

The fast and accessible verification of nonclassical resources is an indispensable step towards a broad utilization of continuous-variable quantum technologies. Here, we use machine learning methods for the identification of nonclassicality of quantum states of light by processing experimental data obtained via homodyne detection. For this purpose, we train an artificial neural network to classify classical and nonclassical states from their quadrature-measurement distributions. We demonstrate that the network is able to correctly identify classical and nonclassical features from real experimental quadrature data for different states of light. Furthermore, we show that nonclassicality of some states that were not used in the training phase is also recognized. Circumventing the requirement of the large sample sizes needed to perform homodyne tomography, our approach presents a promising alternative for the identification of nonclassicality for small sample sizes, indicating applicability for fast sorting or direct monitoring of experimental data.


I. INTRODUCTION
Quantum technologies promise various advantages over classical technologies. By employing different features of quantum systems that are not present in classical systems, one can, e.g., perform more precise measurements, speed up computations, or share information in a more secure way. These nonclassical properties create possibilities to optimally exploit physical systems for many technological challenges. Light fields, described as continuous-variable systems, play a key role for the transmission and manipulation of quantum information [1]. Due to their infinite dimensions and an accessible control by means of linear optical elements and homodyne detection, they are widely considered for quantum technological applications. In the case of single-mode continuous-variable quantum systems, the central quantum resource is nonclassicality [2,3]. Directly related to the negativities [4,5] of the Glauber-Sudarshan P representation of the quantum state [6,7], nonclassicality manifests itself in different observable characteristics such as photon antibunching [8][9][10], sub-Poissonian photon-number statistics [11,12], and quadrature squeezing [13][14][15][16][17], and can be transformed into other quantum resources such as entanglement [18,19]. The fundamental nature of nonclassicality is exploited for the investigation of the roots of quantum phenomena and for several quantum technological tasks such as precision measurements.
Due to its crucial importance for quantum technologies, a fast and reliable identification of nonclassicality from experimental observations of the quantum state is an essential step toward the practical use of this resource. In continuous-variable systems, one of the most common measurement methods is homodyne detection [20]. Advanced state tomography techniques based on this type of measurement have been developed [21,22]. However, nonclassicality certification based on homodyne tomography usually requires many different quadrature measurements and involved analysis tools. A different approach is nonclassicality certification via negativities of reconstructed quasiprobabilities [23] (particularly, the Glauber-Sudarshan P function [24] and the Wigner function [25][26][27][28]). Methods that involve regularizations of quasiprobabilities have been implemented for the single-mode and multimode scenarios [29,30], and more recently, phase-space inequalities have been proposed and tested experimentally [31][32][33]. Finally, a direct nonclassicality estimation without the need for quantum state tomography was proposed in Ref. [34]. There, the nonclassicality of phase-randomized states was classified via semidefinite programming. In all of the above approaches, to guarantee the detection of nonclassicality with a high statistical significance, extensive measurements must be performed (using different measurement settings or sampling different moments), after which advanced postprocessing is required (estimation of pattern functions, reconstruction of quasiprobabilities, and semidefinite programming, among others). Consequently, these methods are often complex and time consuming. A direct access to nonclassicality identifiers from unprocessed and finite homodyne-detection data is therefore desirable.
In this paper, we use machine learning (ML) techniques to identify nonclassicality of single-mode states based on a finite number of quadrature measurements recorded via balanced homodyne detection. For this purpose, we employ a dense artificial neural network (NN) and train it with supervised learning on simulated homodyne detection data from several noisy classical and nonclassical states. We demonstrate the successful performance of the NN nonclassicality prediction on real experimental data and compare the results with established nonclassicality identification methods. Furthermore, we test the performance of the network for experimentally generated states which were not used in the training procedure and show that the NN can identify different nonclassical features at once. We conclude that the ML approach offers an accessible alternative for the classification of single-mode nonclassicality, and, particularly, due to its performance on small sample sizes, the presented approach constitutes a powerful tool for preselecting and sorting data and for on-site real-time monitoring of experiments. Our result represents an approach to train NNs for identifying nonclassicality of single-mode phase-sensitive states, here measured by homodyne detection.
The paper is structured as follows. In Sec. II, we briefly recall the technique of single-mode balanced homodyne detection. In Sec. III, we describe in detail the training of the NN and the resulting nonclassicality identifier. In Sec. IV, we apply the NN to experimental homodyne measurement data and then analyze its performance on untrained data in Sec. V B. We summarize and conclude in Sec. VI.

II. BALANCED HOMODYNE MEASUREMENT AND NONCLASSICAL STATES
Any direct experimental investigation of light is based on photodetection. Depending on which information on the quantum statistics of the measured light is required, different measurement schemes need to be implemented. For example, photon-counting measurements are not sensitive to the phase of the sensed field. To get information about the phase, interferometric methods have to be applied. In these methods, the field is mixed with a reference beam, the so-called local oscillator (LO). The mixing takes place just before the intensity measurements [20,22]. The scheme of balanced homodyne detection is shown in Fig. 1. It consists of the signal field $\hat{\rho}$, the LO, a 50:50 beam splitter (BS), two proportional photodetectors, and the electronics used to subtract and amplify the photocurrents. Homodyning with an intense coherent LO gives the phase sensitivity necessary to measure the quadrature variances [56][57][58].

This kind of interferometric approach is necessary for the reconstruction of the quasiprobabilities of bosonic states. In principle, all normally ordered moments can be determined from this measurement scheme, including the ones which contain different numbers of creation and annihilation operators. Thus, homodyne detection drastically enlarges our measuring capabilities in a simple way.
The key to quasiprobability estimation is to perform measurements for a large set of quadrature phases, which ultimately leads to a proper state reconstruction. Balanced homodyne detection and the subsequent reconstruction of the Wigner function have become a standard measuring technique in quantum systems such as quantum light, molecules, and trapped atoms [25][26][27][28].
Although experimentally accessible, phase-space function reconstructions and moment-based nonclassicality criteria require significant amounts of measurement data, computational power, and postprocessing time. Here, we propose a shortcut to this process. Using NNs, we can do an on-the-fly nonclassicality identification with few measurements.

A. Setup of the network
The input vector of the network consists of a normalized histogram (relative frequencies) of homodyne-detection data which is collected along a fixed phase setting. To generate the histogram from simulated or experimentally generated data (produced from quadrature-measurement outcomes x), we bin the data into 160 equally sized intervals which cover the interval [−8, 8] [59]. Since the histogram is normalized, input vectors constructed from arbitrary numbers of detection events can be used for the same network.
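The construction of the input vector can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `quadrature_histogram` is hypothetical, and discarding outcomes outside [−8, 8] and assigning x = 8 to the last bin are assumptions, since the text does not specify how edge cases are treated.

```python
def quadrature_histogram(samples, n_bins=160, x_min=-8.0, x_max=8.0):
    """Return relative frequencies of `samples` over n_bins equal-width bins."""
    width = (x_max - x_min) / n_bins
    counts = [0] * n_bins
    for x in samples:
        if x < x_min or x > x_max:
            continue  # assumption: outcomes outside the range are discarded
        # assumption: an outcome exactly at x_max is counted in the last bin
        index = min(int((x - x_min) / width), n_bins - 1)
        counts[index] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples inside the histogram range")
    # Normalizing makes the vector independent of the number of events.
    return [c / total for c in counts]
```

Because the result is normalized, data sets with 800 or 16000 events both map onto a vector of length 160 that can be fed to the same network.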
We use a fully connected artificial NN with an input layer of size 160, an output layer of size 2, and three hidden layers with sizes 64, 32, and 16. The hidden layers are activated with the rectified linear unit, and the output layer is activated with a softmax function. These parameters were chosen for a good performance in discriminating between classical and nonclassical states. The simulated data consisting of $2 \times 10^4$ input vectors per training family (see below) are split into training data (80%) and validation data (20%). The network is trained until the validation error stops decreasing for more than 10 training cycles.
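To make the layer structure concrete, the following sketch implements only the forward pass of a 160-64-32-16-2 network with rectified-linear hidden layers and a softmax output. The weights are random placeholders, not trained values. The text does not state the framework, loss function, or optimizer, so none is implied here; all function names are hypothetical.

```python
import math
import random

LAYER_SIZES = [160, 64, 32, 16, 2]  # input, three hidden layers, output

def init_params(sizes, seed=0):
    """Random (untrained) weights and zero biases, standing in for trained values."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = math.sqrt(2.0 / n_in)  # assumption: a common initialization scale
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        params.append((weights, [0.0] * n_out))
    return params

def softmax(z):
    """Numerically stable softmax."""
    shift = max(z)
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [v / total for v in exps]

def forward(params, x):
    """Apply ReLU on the hidden layers and softmax on the output layer."""
    a = list(x)
    last = len(params) - 1
    for i, (weights, biases) in enumerate(params):
        z = [sum(w * v for w, v in zip(row, a)) + b for row, b in zip(weights, biases)]
        a = softmax(z) if i == last else [max(0.0, v) for v in z]
    return a

params = init_params(LAYER_SIZES)
output = forward(params, [1.0 / 160] * 160)  # flat dummy histogram as input
```

The two output components are non-negative and sum to one, so either of them can be read as a single number between 0 and 1.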
Considering the experimental data on which we want to test the network's prediction later, we simulate 16000 detection events to generate each training input vector.
We train the NN with data generated from Fock, squeezed-coherent, and single-photon-added coherent states (SPACS) as states that show nonclassical signatures and with coherent, thermal, and mixtures of coherent states as states showing classical characteristics; see Appendix A for a discussion of this choice. All families of states used in the training are summarized together with their parameters in Appendix B. To account for realistic (imperfect) scenarios, we chose an overall efficiency of the homodyne measurement of η = 0.6 [33]. Note that the quantum efficiency, which represents external limitations such as channel or detector efficiencies, can equivalently be used to describe noisy quantum states. Thus, we train the network with data that correspond to the detection of realistic, lossy quantum states.
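As an illustration of how lossy training data could be simulated, the sketch below draws quadrature outcomes for the two simplest classical families. It assumes the standard Gaussian quadrature statistics with a vacuum variance of 1/4: a coherent state of real amplitude α keeps the vacuum variance and has mean √η α cos φ, while a thermal state with mean photon number n̄ has zero mean and variance (2ηn̄ + 1)/4. The function names are hypothetical, and the non-Gaussian families (Fock states, SPACS) are not covered.

```python
import math
import random

VACUUM_VARIANCE = 0.25  # convention used in the text

def sample_coherent(alpha, n_events, eta=0.6, phi=0.0, rng=random):
    """Quadrature outcomes of a coherent state (real alpha) after loss eta."""
    mean = math.sqrt(eta) * alpha * math.cos(phi)  # loss rescales the amplitude
    sigma = math.sqrt(VACUUM_VARIANCE)             # variance stays at vacuum level
    return [rng.gauss(mean, sigma) for _ in range(n_events)]

def sample_thermal(n_mean, n_events, eta=0.6, rng=random):
    """Quadrature outcomes of a thermal state after loss eta (phase insensitive)."""
    sigma = math.sqrt(VACUUM_VARIANCE * (2.0 * eta * n_mean + 1.0))
    return [rng.gauss(0.0, sigma) for _ in range(n_events)]
```

Each call with `n_events=16000` yields the raw data for one training input vector, which is then binned into the normalized histogram described above.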

B. Identification of nonclassicality
In the training process, we assign the value 0 to all classical quadrature data and the value 1 to nonclassical data. The output of the NN is a value r between 0 and 1 that provides a way to discriminate classical and nonclassical data. A high output value (close to 1) indicates the nonclassical character of the tested quadrature data. We choose a threshold value t above which we say that the NN identifies nonclassicality. As our goal is to faithfully identify nonclassicality, we set t = 0.9. This means that, for r > t = 0.9, we conclude that the NN identifies nonclassicality. In this way, we might reject some nonclassical states to be recognized as such, but we minimize the risk of falsely recognizing classical states as nonclassical ones. Note that depending on the specific requirements and the choice of trained and studied states, the value of t can be adapted.
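The decision rule itself is a strict threshold comparison, sketched here with a hypothetical function name. How r is read off from the two-component network output is not spelled out in the text and is left outside this sketch.

```python
def identifies_nonclassicality(r, threshold=0.9):
    """Return True only if the network output r exceeds the threshold t."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("network output r must lie in [0, 1]")
    return r > threshold  # strict inequality, as in r > t = 0.9

# Example outputs: only values strictly above 0.9 count as nonclassical.
decisions = [identifies_nonclassicality(r) for r in (0.05, 0.9, 0.97)]
```

Raising the threshold lowers the risk of labeling classical data as nonclassical, at the price of missing more nonclassical states.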
In this context, it is important to stress that the result of the NN can only be an indication for nonclassical states; cf. also Ref. [50]. A certification of nonclassicality requires full analysis including the evaluation of a nonclassicality test (witness) and a proper treatment of errors. While such an analysis can be rather involved, the proposed NN approach allows one to implement an easy and fast identification of nonclassicality. Therefore, it provides a useful tool for pre-selecting and sorting of data or the online, in-laboratory monitoring of experiments.
FIG. 2. Nonclassicality prediction of the neural network (NN) on the training states [coherent, thermal and mixed coherent states as classical ones; Fock, squeezed-coherent and single-photon-added coherent states (SPACS) as nonclassical ones], each in its corresponding state-parameter domain. α is the coherent amplitude, n is the number of photons, and $\bar{n}$ is the mean number of photons. The gray horizontal line corresponds to the nonclassicality threshold t = 0.9. Note that for the squeezed-coherent states, the squeezing parameter ξ is chosen randomly in ξ ∈ [0.5, 1] and is not shown in this plot. For each Fock state, the NN prediction is tested for four different simulations of the quadrature measurements. For details on the state parameters, see Appendix B.

C. Performance of the network on trained states
In Fig. 2, we show the output r of the network for the different families of training states in their corresponding parameter ranges. All training families are correctly and consistently recognized to be classical or nonclassical. This holds for the total parameter regions of the considered states (cf. Appendix B), indicating that the training of the NN is successful in the sense that the network learned to correctly classify the states from the training set into classical and nonclassical ones.

IV. APPLICATION TO EXPERIMENTAL DATA
Here, we will use the trained NN for the identification of nonclassicality from experimental quadrature data. We analyze data from two different families of states: single-mode squeezed states and SPACS. This analysis will demonstrate the strength of the network approach as a fast and easy-to-implement characterization tool for experimental data.

A. Squeezed vacuum states
The first nonclassical experimental state we consider is a squeezed vacuum state. The vacuum state is squeezed along the real axis of the coherent plane. Details on the experimental realization can be found in Ref. [60].
In the measurements, the homodyne phase setting is changed continuously within the interval φ ∈ [0, 2π]. The resulting measurement data are then divided into 125 bins of size ∆φ = 2π/125, such that ∼ 16000 detection events are grouped together to constitute an input vector of the NN. For our analysis, the amount of squeezing $|\xi_{\mathrm{exp}}|$ and the quantum efficiency $\eta_{\mathrm{exp}}$ of the detectors do not have to be known, which highlights the practicability of the NN prediction.
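The grouping of a continuous phase scan into 125 phase bins can be sketched as below. The function name and the data layout (one recorded phase per quadrature outcome) are assumptions made for illustration; the experimental data format is not described in the text.

```python
import math

def group_by_phase(phases, quadratures, n_phase_bins=125):
    """Group quadrature outcomes into equal-width phase bins on [0, 2*pi)."""
    width = 2.0 * math.pi / n_phase_bins
    groups = [[] for _ in range(n_phase_bins)]
    for phi, x in zip(phases, quadratures):
        # Wrap the phase into [0, 2*pi) and guard against rounding at the edge.
        index = int((phi % (2.0 * math.pi)) / width) % n_phase_bins
        groups[index].append(x)
    return groups
```

Each of the resulting groups is then treated as data recorded at a fixed phase setting and converted into one input histogram.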
In Fig. 3 (bottom), we show the prediction of the network for the nonclassicality of the squeezed state with respect to the homodyne phase setting together with the variance of the measured quadrature distribution. Additionally, the quadrature distributions p(x) for φ = 0 and φ = π/2 (solid) compared with the vacuum quadrature distribution (dashed) are displayed (top). It is known that nonclassicality in quadrature data can be verified by observing single-mode quadrature squeezing, see, e.g., Ref. [61]. That is, if the quadrature variance Var[x(φ)] is below the vacuum noise for some values of φ, Var[x(φ)] < 1/4, nonclassicality is detected. We see that the domain of nonclassicality classification of the network coincides well with the domain of nonclassicality detection by sub-shot-noise variance.
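For comparison, the variance criterion used as a reference here amounts to a few lines. The sketch below uses hypothetical function names and synthetic Gaussian samples in place of measured data; it omits the statistical error analysis that a proper nonclassicality test would require.

```python
import random

VACUUM_VARIANCE = 0.25  # vacuum noise level in the convention of the text

def sample_variance(samples):
    """Unbiased sample variance of the quadrature outcomes."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(samples) / n
    return sum((x - mean) ** 2 for x in samples) / (n - 1)

def below_shot_noise(samples, vacuum_variance=VACUUM_VARIANCE):
    """True if the estimated variance lies below the vacuum noise."""
    return sample_variance(samples) < vacuum_variance

# Synthetic stand-ins for a squeezed and an antisqueezed quadrature.
rng = random.Random(0)
squeezed_like = [rng.gauss(0.0, 0.4) for _ in range(16000)]      # variance 0.16
antisqueezed_like = [rng.gauss(0.0, 0.7) for _ in range(16000)]  # variance 0.49
```

Applied to each phase bin, this test flags the phase settings for which Var[x(φ)] < 1/4.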
In short, we confirm that the NN learns the standard nonclassicality classifier of sub-shot-noise variance. If one is simply interested in the detection of squeezing, measuring the variance of the quadrature distribution remains sufficient. However, as discussed below, in contrast to a mere variance classifier, the NN can learn how to identify further nonclassicality features. It is more flexible than the squeezing condition which recognizes only one specific nonclassical feature, and it can be advantageous in scenarios where the underlying quantum state is not known and cannot be captured by a simple variance condition.

B. SPACS
Let us now analyze the prediction of the network for experimentally generated SPACS, which are the result of the single application of the photon creation operator onto a coherent state. In principle, such states are always nonclassical, independent of the input coherent state; however, they present an evident Wigner negativity and resemble single-photon Fock states only for small coherent state amplitudes. On the other hand, for intermediate amplitudes, they also present quadrature squeezing. Exhibiting a variety of different quantum features in different parameter regions, SPACS are therefore particularly interesting candidates for testing the performance of the NN. The experimental data consist of quadrature values, measured via homodyne detection, for the states $N\hat{a}^\dagger|\alpha\rangle$ ($N$ is a normalization constant) with 14 different values of $\alpha \in \mathbb{R}^+$. To experimentally generate such optical states, we injected the signal mode of a parametric down conversion crystal with coherent states obtained from the 786 nm emission of a Ti:Sa mode-locked laser [62]. When the same crystal is pumped with an intense ultraviolet beam, obtained by frequency doubling the same laser, the detection of an idler photon heralds the addition of a single photon onto the seed coherent state. In other words, each idler detection event announces the presence of SPACS along the signal mode. Performing heralded homodyne detection on this mode, we measured the quadrature distributions along 11 different quadrature angles φ for each value of α [62]. Mode mismatch between the seed coherent states and the pump and LO beams, optical losses, electronic noise, and limited detector quantum efficiency in the homodyne measurement setup are the main causes for a non-unit overall efficiency of $\eta_{\mathrm{exp}} \approx 0.6$ in the experiment. For each state, 15963 detection events are used to construct the network input vector.
In Fig. 4(a), we show the (binary) prediction of the network for the experimental SPACS data, together with exemplary quadrature distributions p(x) for different combinations of α and φ. We observe that the ability of the NN to identify nonclassicality depends crucially on the homodyne phase setting. For sin φ ≈ 0, SPACS are identified as nonclassical in a wide range of α; cf. Fig. 4(b) for the detailed NN predictions for this case. On the other hand, for suboptimal directions, SPACS are rarely recognized as nonclassical by the NN (except for small α). Also, for large α, SPACS are generally classified as classical in all directions. As a comparison, we show the NN prediction for experimental homodyne data generated by coherent states in Fig. 4(c) for the same parameters as used in Fig. 4(b). The network correctly recognizes coherent states as classical.
The phase-dependent behavior of the NN output for the experimental SPACS can be explained by the fact that, for sin φ ≈ 0, the quadrature distributions differ significantly from the one produced by a coherent state, while for other directions, the corresponding quadrature distributions resemble closely the ones of coherent states [62][63][64]. For small α < 0.5, SPACS resemble single-photon states and are thus recognized as nonclassical at all quadrature angles [see p(x) for α = 0.32 in Fig. 4]. On the other hand, for large α, the quadrature distribution of the SPACS approaches the one of coherent states also in the optimal direction (φ = 0), and therefore, the NN eventually does not indicate nonclassicality anymore. In this regime, it is known that SPACS can be a good approximation of a coherent state of a larger amplitude [65]. The similarity of the SPACS quadrature distribution p(x) for large α and the distribution from a coherent state explains the difficulty for the NN to classify SPACS as nonclassical in this regime.
To summarize, for an optimal homodyne phase setting, SPACS are identified as nonclassical in a wide range of parameters. The NN thus provides a direct and simple method for testing nonclassicality of SPACS based only on quadrature distributions. As we discuss below, this identification is successful even in a parameter regime where the homodyne distribution shows neither sub-shot-noise variance nor similarity to Fock states. Therefore, the NN prediction proves operational for several different states and nonclassicality features.

V. INFLUENCE OF THE TRAINING SET AND APPLICATION TO UNTRAINED DATA
In this section, we first discuss the ability of the NN to recognize different features of nonclassicality at the same time. Then, we test its performance to recognize nonclassicality of states that were not seen in the training phase and of measurement data consisting of varying sample sizes.

A. Beyond single-feature recognition
To get some insights into which features are learned by the NN, we examine the performance in recognizing simulated SPACS of a network trained without SPACS; see Fig. 5. We observe that a network which is not trained with SPACS recognizes the latter only in specific parameter regions (teal dots). For |α| ∈ [0, 0.5], SPACS are recognized as nonclassical states due to their similarity to single-photon states. On the other hand, in the parameter domain |α| ∈ [1,2], their nonclassicality is recognized because the variance of the quadrature distribution is significantly smaller than the vacuum variance. Outside these regions, the distribution does not resemble Fock states and has a large quadrature variance and is, therefore, not classified as nonclassical. For |α| > 3, the variance approaches the vacuum variance, making a correct classification as nonclassical impossible. In total, we see that the network can identify some SPACS even if they were not part of the training set. The network effectively identifies similarity to Fock states and sub-shot-noise variances. This is one example of the general fact that common features can lead to the identification of untrained data. In comparison, a NN that also used SPACS for its training can only achieve its performance (cf. Fig. 2) by learning how to recognize similarity to SPACS where they do not resemble Fock or squeezed states. Therefore, we conclude that the network is sensitive to different nonclassical features at the same time and, thus, identifies nonclassicality beyond single features. Hence, a properly trained network can be advantageous, as it can recognize different nonclassical features for which one would otherwise need to implement different test conditions. This is particularly useful if the nonclassical features of the state to be tested are unknown. As we have just seen, a state need not be part of the training set to be recognized by the network.
The above analysis also indicates the necessity to train a deep NN to perform this task since simple baseline models such as a sub-shot-noise variance test only provide single-feature recognition.

B. Application to untrained data
Now we discuss the performance of the NN on states which are not used in the training. We apply the network to the family of so-called (odd) cat states $|\alpha_-\rangle = N_-(|\alpha\rangle - |-\alpha\rangle)$, where $N_-$ is a normalization constant and $\alpha \in \mathbb{R}^+$. As all states in this family consist of a coherent superposition of coherent states, they are all nonclassical. In Fig. 6, we show the α-dependent prediction r of the NN for quadrature measurements simulated for $|\alpha_-\rangle$. We use quadrature angles (a) φ = π/2 and (b) φ = π/4. For each subfigure, we additionally display the quadrature distribution p(x) for different values of α (solid) compared with p(x) for the same parameters but using a quantum efficiency η = 1 (dashed). For both quadrature angles, the network correctly classifies the states as nonclassical in a significant range of α. Thus, this example shows that the NN can identify nonclassicality also of untrained states. For larger α, cat states are not identified as nonclassical. This behavior can be explained as follows. For small α, the cat state resembles a single-photon Fock state and can therefore be identified as nonclassical. For larger α and measured along φ = π/2 with unit quantum efficiency η = 1, the quadrature distribution develops a nonclassical interference pattern (a, dashed). However, for a realistic efficiency η = 0.6, this interference is smoothed away (a, solid) such that the states are eventually classified as classical. Surprisingly, by choosing a different quadrature angle of, e.g., φ = π/4, the cat states are classified as nonclassical in a wider range of α [Fig. 6(b)]. This is because the quadrature distribution still resembles a Fock state in this direction. Note that the performance of the NN prediction for cat states can be increased by including this family in the training process.
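For orientation, the interference pattern in the lossless case can be written out explicitly. The following is a sketch under stated assumptions, not a formula taken from this paper: it assumes the standard odd cat state proportional to $|\alpha\rangle - |-\alpha\rangle$ with real α, unit efficiency η = 1, and the vacuum variance of 1/4 used throughout. Along φ = π/2, the two coherent components then contribute opposite phase factors $e^{\mp 2i\alpha x}$ on the same Gaussian envelope, which gives

```latex
% Assumptions: odd cat state ~ |alpha> - |-alpha>, real alpha, eta = 1,
% vacuum quadrature variance 1/4.
\[
  p(x;\varphi=\pi/2)
  = \frac{2\sqrt{2/\pi}\; e^{-2x^{2}} \sin^{2}(2\alpha x)}{1 - e^{-2\alpha^{2}}} ,
  \qquad
  \lim_{\alpha\to 0} p(x;\varphi=\pi/2) = 4\sqrt{2/\pi}\; x^{2} e^{-2x^{2}} .
\]
```

The fringe spacing scales as 1/α, so larger cat states produce finer fringes that are more easily washed out by loss. The small-α limit coincides with the quadrature distribution of a single-photon Fock state in the same convention, in line with the explanation given above.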
In summary, the NN is able to identify the nonclassicality also for states that were not used in the training process. However, for an optimized performance, it remains practical to adapt classes of states and parameter ranges in the training, see Appendix A.
C. Influence of the sample size
Finally, we want to discuss the prediction of the NN if it is given measurement data with a smaller sample size than that used in the training phase. In Fig. 7, we show the NN nonclassicality prediction r for experimental quadrature data of a SPACS (yellow) and a coherent state (teal) for α = 0.32 and φ = 0. We observe that a NN trained with sample sizes of 16000 (dashed line) can correctly classify these two states for measurement data starting from sample sizes as low as ∼ 800. Decreasing the sample size even further results in false classification of coherent states as nonclassical and vice versa, which renders the NN prediction unreliable in this regime.
This analysis shows the flexibility of the NN even once it has been trained. Importantly, the NN can provide conclusive predictions based on comparably very small sample sizes, which opens the possibility of online classification during measurements or fast (pre-)classification of data. Note that the performance of the NN for small sample sizes can also be improved by training it with the corresponding sample size.
FIG. 7. Prediction r of the neural network (NN) for experimental data from single-photon-added coherent states (SPACS; yellow) and coherent states (teal) for α = 0.32 and φ = 0 as a function of the measurement sample size. The dashed vertical line indicates the sample size 16000 that was used in the training phase.
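A sample-size scan of this kind can be organized as in the sketch below. It is an illustration only: `predict` is a hypothetical callable that maps a list of quadrature outcomes to the network output r (histogramming included), and drawing random subsets without replacement is an assumption, since the text does not say how the smaller data sets were selected.

```python
import random

def prediction_vs_sample_size(samples, sizes, predict, seed=0):
    """Evaluate `predict` on random subsets of the recorded data.

    Returns a list of (sample_size, prediction) pairs.
    """
    rng = random.Random(seed)
    results = []
    for size in sizes:
        if size > len(samples):
            raise ValueError("requested more events than were recorded")
        subset = rng.sample(samples, size)  # drawn without replacement
        results.append((size, predict(subset)))
    return results
```

Repeating the scan with several seeds would additionally show how strongly the prediction fluctuates at a given sample size.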

VI. CONCLUSIONS
In this paper, we introduced an artificial NN-based nonclassicality identifier for single-mode quantum states of light measured with balanced homodyne detection. We trained the network using simulated homodyne detection data for realistic noisy measurements of different classical and nonclassical states. We observed that the trained network can correctly classify different classical and nonclassical states, i.e., coherent states, squeezed states, and SPACS, from real experimental data. Furthermore, the network recognizes certain nonclassical states that were not used in the training phase of the network. Compared with typical nonclassicality conditions based on homodyne tomography or other more complex nonclassicality tests, the strength of our approach lies in its simple implementation and the fact that only a small amount of data is required. We would like to emphasize that the NN nonclassicality prediction cannot certify nonclassicality and, if necessary, should be complemented by an error-proof nonclassicality witness.
The ML-based classification offers a fast and accessible method to sort and preselect experimental data, considering that it circumvents the need to first perform homodyne tomography or the calculation of complex test conditions and, as we showed, performs well also on small sample sizes. It is furthermore easy to implement and applicable in multiple experimental settings. ML has been used before for the detection of quantum effects [50][51][52]. In this context, it is important to highlight that the presented approach can detect phase-sensitive nonclassical features, which was not possible with previous results [50].
Further, the network approach can be used to search for interesting experimental parameter regimes, especially if the production rate of detection events is small. To maximize the accuracy of the NN prediction in experiments, any specific information about possible states and noise (such as phase or amplitude noise) should be included in the training phase. Finally, note that the presented approach can be generalized to multi-mode scenarios and might be adapted to the identification of entanglement in a similar fashion. Also, additional ML methods such as convolutional layers or regularizations can be considered to optimize the performance of the NN nonclassicality prediction and make it more applicable to untrained data.
states have to be included in the training because, otherwise, training only on classical states with single-peaked quadrature distributions, the NN might interpret double- or multi-peak structures as features of nonclassicality. However, the choice of which classical state to use here is not unique. For instance, a different classical state that occurs typically in experiments is a phase-averaged coherent state, $\rho_{\mathrm{av}} = \int_0^{2\pi} d\phi\, |\alpha e^{i\phi}\rangle\langle\alpha e^{i\phi}| / 2\pi$. Using $\rho_{\mathrm{av}}$ instead of $\rho_{\mathrm{mix}}$ in the training results in a similar performance of the NN as in the main text, with the exception that, for larger α, $\rho_{\mathrm{mix}}$ (and therefore also cat states measured along φ = 0) are classified as nonclassical.
This points to an important caveat of the NN classification of nonclassicality: as mentioned in the main text, the different states used in the training phase must be chosen carefully, given the experimental conditions. Training with more families of classical states decreases the probability of false identification of nonclassicality for states that were not seen in the training. At the same time, it makes it harder for the NN to learn nonclassicality features of the corresponding nonclassical training states. This discussion shows that the NN nonclassicality classification, while representing a simple and fast nonclassicality identification if possible input states are known, is not universal and does not yield a strict nonclassicality certification.
Appendix B: Parameters and probabilities used for the simulation of quadrature measurement data
Here, we specify the state-dependent quadrature probability distributions and the corresponding parameters used in the simulation of quadrature measurement data for the different states in the main text. In Table I, we list the different states together with the corresponding parameter regions used in the simulations and the quadrature distribution along the quadrature angle φ. Note that we use a vacuum variance of 1/4.
For the simulation of the training data, we fixed a quadrature angle φ = 0. For thermal and Fock states, this restriction does not influence the distribution, as these states are phase insensitive. For coherent states with amplitude α, the distribution along a nonzero φ is equivalent to the one of a coherent state with amplitude α cos φ, measured along a zero quadrature angle. For squeezed coherent states and SPACS, this choice assures that only quadrature distributions which show nonclassical features are used in the training.
As noted in Ref. [59], the different parameter limits are chosen such that the probability for an event outside the considered measurement range, |x| > 8, is small ($< 10^{-6}$). Note that, for SPACS, we further restrict the parameters (|α| ≤ 3) to a domain where the network is able to separate them clearly from the classical states. For the simulation of the squeezed states, the squeezing parameter is chosen uniformly in ξ ∈ [0.5, 1].