Quantum state smoothing cannot be assumed classical even when the filtering and retrofiltering are classical

State smoothing is a technique to estimate a state at a particular time, conditioned on information obtained both before (past) and after (future) that time. For a classical system, the smoothed state is a normalized product of the $\textit{filtered state}$ (a state conditioned only on the past measurement information and the initial preparation) and the $\textit{retrofiltered effect}$ (depending only on the future measurement information). For the quantum case, whilst there are well-established analogues of the filtered state ($\rho_{\rm F}$) and retrofiltered effect ($\hat E_{\rm R}$), their product does not, in general, provide a valid quantum state for smoothing. However, this procedure does seem to work when $\rho_{\rm F}$ and $\hat E_{\rm R}$ are mutually diagonalizable. This fact has been used to obtain smoothed quantum states -- more pure than the filtered states -- in a number of experiments on continuously monitored quantum systems, in cavity QED and atomic systems. In this paper we show that there is an implicit assumption underlying this technique: that if all the information were known to the observer, the true system state would be one of the diagonal basis states. This assumption does not necessarily hold, as the missing information is quantum information. It could be known to the observer only if it were turned into a classical measurement record, but then its nature depends on the choice of measurement. We show by a simple model that, depending on that measurement choice, the smoothed quantum state can: agree with that from the classical method; disagree with it but still be co-diagonal with it; or not even be co-diagonal with it. That is, just because filtering and retrofiltering appear classical does not mean classical smoothing theory is applicable in quantum experiments.

The true state of a stochastically evolving system will typically be unknown, even if the system is continuously monitored through time, because of information that the observer is missing. In this situation, the true state at time $t$ could be estimated by filtering, using the record $\overleftarrow{O}_t$ before $t$, or by retrofiltering, using the record $\overrightarrow{O}_t$ after $t$. However, the best estimate comes from smoothing, using the entire record, $\overleftrightarrow{O}$. For classical systems, these estimates are, most generally, expressed as probability distributions $\wp$ involving the event $x = x_{\rm T}$, the unknown true state, and the smoothed distribution is simply related to the others:
$$\wp_{\rm S}(x; t) \propto E_{\rm R}(x; t)\,\wp_{\rm F}(x; t). \qquad (1)$$
Here $\wp_{\rm F}(x; t) := \wp(x; t|\overleftarrow{O}_t)$ is called the filtered state and $E_{\rm R}(x; t) \propto \wp(\overrightarrow{O}_t|x, t)$ is the retrofiltered effect. In quantum state estimation, there are well-established analogues: the filtered state $\rho_{\rm F}(t)$ and the retrofiltered effect (POVM element) $\hat E_{\rm R}(t)$; we have borrowed this terminology for the classical case for consistency. However, quantum smoothing is not so straightforward.
If $\rho_{\rm F}$ is not pure, that implies that there is information missing to the observer. Thus, intuitively, it should be possible to obtain a better estimate by smoothing, by using $\overleftrightarrow{O}$ instead of just $\overleftarrow{O}_t$. Naively following the classical form (1) and multiplying $\rho_{\rm F}(t)$ and $\hat E_{\rm R}(t)$ does not, in general, lead to a valid quantum state [1]. However, there is one case where it does: when $\rho_{\rm F}(t)$ and $\hat E_{\rm R}(t)$ are co-diagonal in some basis. Then multiplying these quantities replicates exactly the classical smoothing calculation (1), with the diagonal elements in these matrices acting like classical probabilities for the true state.
The applicability of classical smoothing theory for quantum systems where both the filtering and retrofiltering have this classical description seems quite reasonable. It has been used in a number of experiments on quantum systems where the diagonal states are atom-field dressed states [2,3], atomic energy levels [4], and photon number states [5]. But can it really be justified?
In this paper we show that classical smoothing gives the best estimate for the true quantum state only with an extra assumption: that the missing information is such that, if it were known, the state would be in one of the diagonal basis states. Whilst this assumption may seem plausible, it is not entailed by the dynamics. If the system is quantum then the missing information is also quantum.
For it to be knowable, in principle, it must be turned into a classical measurement record, a second record alongside the observer's record $\overleftrightarrow{O}$. Then the smoothed quantum state [1,6-9], denoted $\rho_{\rm S}$, can be defined as the optimal estimate [10,11] of the state conditioned on both measurement records (i.e., the true state, denoted $\rho_{\rm T}$) by the observer who knows $\overleftrightarrow{O}$ but to whom the second record is unknown. Crucially, the nature of this second measurement record depends on the type of detector which is assumed to create that record from the missing quantum information in the system's environment.
We consider three different ways to perform the second measurement, to prove various results relevant to our argument. In all cases, we use the same simple system, and the primary observer performs the same type of measurement, for which $\rho_{\rm F}$ and $\hat E_{\rm R}$ are co-diagonal. For the first measurement choice, the true state $\rho_{\rm T}$ is a pure state in the co-diagonal basis of $\rho_{\rm F}$ and $\hat E_{\rm R}$, and $\rho_{\rm S}$ reproduces the classical smoothed state, as expected. For the second choice, $\rho_{\rm T}$ is not in this diagonal basis, and a different $\rho_{\rm S}$ is obtained, albeit one which is still diagonal in this basis. For the third choice, $\rho_{\rm T}$ is again not in this diagonal basis, but this time the smoothed state $\rho_{\rm S}$ is also not co-diagonal with the classically obtained smoothed state. We further show that, contrary to what one might expect, the classically obtained smoothed state does not even give the lowest expected value of the cost function which defines the smoothed quantum state. Our results highlight and delineate the limitations of applying classical estimation techniques to quantum systems even when they seem adequate.

A. Outline
In Sec. II, we review state smoothing in the classical setting as well as how it has been applied to quantum systems. We then discuss some experimental examples which have made use of the classical smoothing theory, before introducing the quantum state smoothing theory. Following this, in Sec. III, we summarize the three main results of this paper and provide a rough outline of the arguments that lead to them.
Next, in Sec. IV, we introduce the physical system that will serve to prove our results. In Sec. IV A, we specify the type of measurement (by Alice, say) on the system outputs that will yield the observed records on which all of the state estimates are conditioned. Alice's measurements are chosen such that the filtered state and retrofiltered effect are co-diagonal, the situation in which classical smoothing theory would seem appropriate. In Sec. IV B, we consider the system outputs that are not accessible to Alice. We specify a measurement (by Bob, say) on these such that the optimal estimate by Alice, using quantum state smoothing [10,11], coincides with that obtained by classical smoothing. Before moving on, we compare this smoothed quantum state to the filtered state, to attain a deeper understanding of the physical system in this regime. We also resolve what may seem a puzzling feature of the expected cost functions that define these optimal states, to lay the groundwork for later results on this subject in this paper.
The subsequent three sections in this paper are dedicated to proving its three main results. In Sec. V, by changing Bob's measurement scheme to a homodyne measurement, we show that the smoothed quantum state does not necessarily reduce to the classically smoothed state independent of the choice of secondary measurement (Result 1). In Sec. VI, we again change Bob's measurement strategy, this time to an adaptive measurement scheme, to show that the smoothed quantum state is not even necessarily co-diagonal with the filtered state and retrofiltered effect (Result 2). Next, in Sec. VII, we determine which of these three cases yields the lowest expected cost function, and hence could be the most desirable choice of assumed measurement for Bob, showing that for the majority of the time the classical case performs the worst (Result 3). Lastly, in Sec. VIII, we conclude the paper and pose some questions for future research.

A. Classical State Smoothing
For ease of presentation, we will consider classical systems that can be described by a countable number of discrete-valued parameters, collected in a vector $x \in \mathbb{X}$, called the configuration. The characteristic of classical systems is that there exists a true or objective configuration $x_{\rm T}(t)$, defining definite values for these parameters. It is only through one's lack of knowledge about the initial configuration of the system and the environment with which it interacts that this value is obscured. As such, the best description one can give about the parameters is through the state $\wp(x; t)$, a non-negative distribution over possible true configurations normalized such that $\sum_{x\in\mathbb{X}} \wp(x; t) = 1$. This state in general has non-zero entropy; to use quantum terminology it is a mixed state, unlike the pure state $\wp_{\rm T}(x; t) = \delta_{x,x_{\rm T}(t)}$ corresponding to the true configuration.
Often, the mixedness of a state will increase over time due to interaction with the environment. However, through measurement, which we will take to be a continuous-in-time measurement, one can gain information about the hidden true state. Note that, since we are only considering discrete classical systems, the dynamics of the configuration are restricted to transitions between the points in $\mathbb{X}$. We will further restrict to Markovian systems; that is, both the measurement results at time $t$ and the state of the system at time $t + dt$ are governed only by the state at the current time, without the need to specify any of the states prior to that. This model of classical systems is commonly referred to as a hidden Markov model, as the true state underlying the observed measurement results is hidden.
Given information about the system, in the form of a measurement record $O$, one can ask: what is the best estimate $\check\wp(x; t)$ of the true state using that information? Here we use a check (rather than the more common hat) to denote an estimate in order to prevent confusion with quantum operators, for which we will use hats. In order to define this optimal estimate one needs a measure of closeness between the estimated and true states, henceforth referred to as the cost function. In this work, we will consider the sum-square deviation cost function,
$$\sum_{x\in\mathbb{X}} \left[\check\wp(x; t) - \wp_{\rm T}(x; t)\right]^2. \qquad (2)$$
This cost function is the state analogue of the square error cost function that is often employed when directly estimating the configuration instead of the state.
The above cost function depends on the true state which is, of course, unknown, since it is what one is trying to estimate. Thus one must consider a way to turn the cost function into a value (to be minimized) that is independent of the true state. The simplest way is the Bayesian approach to estimation [12-15], in which one aims to minimize an expected cost function given the available measurement record, defined as
$$E_{\wp_{\rm T}(x;t)|O_{\rm C}}\Big\{\sum_{x\in\mathbb{X}} \left[\check\wp(x; t) - \wp_{\rm T}(x; t)\right]^2\Big\}. \qquad (3)$$
Here, $E_{X|Y}\{Z\}$ denotes an ensemble average of $Z$ over $X$ given $Y$, where as a convention $X$ will be omitted when $X = Z$. We have introduced a dummy subscript 'C' (standing for conditioning) on $O$ to denote what part of the measurement record is available. For example, one has a filtered conditioning ${\rm C} = {\rm F}$ if only the past measurement record, i.e., $O_{\rm F} = \overleftarrow{O}_t = \{y(\tau) : \tau \in [t_0, t)\}$, is available. Here $y(\tau)$ is the measurement outcome (detector click, photocurrent, etc.) at time $\tau$ and $t_0$ is the initial time. Similarly, to have a smoothed conditioning ${\rm C} = {\rm S}$, we must have access to the past-future measurement record, $O_{\rm S} = \overleftrightarrow{O}$.
It is easy to show [11] that the optimal estimator for a sum-square deviation cost function is
$$\check\wp^{\rm opt}_{\rm C}(x; t) = E_{|O_{\rm C}}\{\wp_{\rm T}(x; t)\}. \qquad (4)$$
By substituting in the true state $\wp_{\rm T}(x; t) = \delta_{x,x_{\rm T}(t)}$ we can simplify the computation of Eq. (4) drastically, obtaining $\check\wp^{\rm opt}_{\rm C}(x; t) = \wp(x; t|O_{\rm C})$. Thus, the optimal filtered estimate of the true state is defined as
$$\wp_{\rm F}(x; t) = \wp(x; t|\overleftarrow{O}_t). \qquad (5)$$
Similarly, the optimal smoothed estimate is defined as
$$\wp_{\rm S}(x; t) = \wp(x; t|\overleftrightarrow{O}) = \frac{E_{\rm R}(x; t)\,\wp_{\rm F}(x; t)}{\sum_{x'\in\mathbb{X}} E_{\rm R}(x'; t)\,\wp_{\rm F}(x'; t)}, \qquad (6)$$
where
$$E_{\rm R}(x; t) = \wp(\overrightarrow{O}_t|x, t)$$
is also a non-negative function over possible configurations. Borrowing terminology from quantum measurement theory, we call $E_{\rm R}(x; t)$ an effect. Specifically, because it involves the measurement record over the whole future, we call it a retrofiltered (R) effect. Eq. (6) can be derived by applying Bayes' theorem and remembering that the system is Markovian. Thus the optimal smoothed state involves both the optimal filtering and the Bayesian retrofiltering.
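The filtering, retrofiltering and smoothing relations above can be illustrated with a minimal numerical sketch. It assumes a discrete-time hidden Markov model as a stand-in for the continuous-in-time model of the text; the function name `smooth_hmm`, the transition matrix `T`, the likelihood matrix `L` and the example record are all hypothetical choices made only for illustration.

```python
import numpy as np

def smooth_hmm(p0, T, L, record):
    """Filtered states, retrofiltered effects and smoothed states of a toy HMM.

    p0     : (d,) prior over configurations at the initial time
    T      : (d, d) transition matrix, T[i, j] = P(x_{k+1} = i | x_k = j)
    L      : (m, d) likelihoods, L[y, j] = P(outcome y | x_k = j)
    record : observed outcomes y_0, ..., y_{n-1}
    """
    n, d = len(record), len(p0)
    p_f = np.zeros((n + 1, d))   # p_f[k]: conditioned on y_0 .. y_{k-1} only
    e_r = np.ones((n + 1, d))    # e_r[k]: proportional to P(y_k .. y_{n-1} | x_k)
    p_f[0] = p0
    # Forward pass (filtering): Bayes update with y_k, then propagate one step.
    for k, y in enumerate(record):
        unnorm = T @ (L[y] * p_f[k])
        p_f[k + 1] = unnorm / unnorm.sum()
    # Backward pass (retrofiltering), starting from an uninformative effect.
    for k in range(n - 1, -1, -1):
        unnorm = L[record[k]] * (T.T @ e_r[k + 1])
        e_r[k] = unnorm / unnorm.sum()   # the norm of the effect is irrelevant
    # Smoothing: normalized product of effect and filtered state.
    p_s = e_r * p_f
    p_s /= p_s.sum(axis=1, keepdims=True)
    return p_f, e_r, p_s

# Hypothetical two-configuration example.
T = np.array([[0.9, 0.2], [0.1, 0.8]])
L = np.array([[0.8, 0.3], [0.2, 0.7]])
p0 = np.array([0.5, 0.5])
record = [0, 0, 1, 1, 0]
p_f, e_r, p_s = smooth_hmm(p0, T, L, record)
print(np.round(p_s, 3))
```

At the final time the retrofiltered effect is uninformative, so the smoothed and filtered states coincide there; at earlier times the smoothed state also uses the later outcomes.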

B. Quantizing State Estimation
In open quantum systems a similar idea to classical state estimation exists, where one instead aims to give an estimated quantum state, $\check\rho(t)$, based on the outcomes of a continuous-in-time measurement. In fact, the ideas of both filtering and retrofiltering translate quite naturally to quantum systems. Quantum filtering [16,17], also known as quantum trajectory theory [18,19], is concerned with determining the conditional evolution of the quantum state of an open quantum system where the environment is subject to a continuous-in-time measurement. This conditional state is called the filtered quantum state $\rho_{\rm F}(t) := \rho_{\overleftarrow{O}_t}$, the analogue of $\wp_{\rm F}(x; t)$, as it conditions the state of the system on all measurement outcomes up until the estimation time $t$. This analogy is actually very close; the dynamical map in quantum trajectory theory is an obvious quantization of the dynamical map in classical filtering [19].
As for retrofiltering, the likelihood function $\wp(\overrightarrow{O}_t|x, t)$ is replaced by a quantum effect, i.e., a POVM (positive operator-valued measure) element. In quantum measurement theory [19-21] a POVM is a set of positive Hermitian operators $\{\hat E_o\}$ whose elements have an expectation value equal to the probability of outcome $o$ occurring, i.e., ${\rm Tr}[\hat E_o \rho] = \wp(o|\rho)$. From this, it is easy to see that the quantum analogue of the classical retrofiltered effect is the operator $\hat E_{\rm R}(t)$ defined such that
$${\rm Tr}[\hat E_{\rm R}(t)\rho(t)] = \wp(\overrightarrow{O}_t|\rho(t)).$$
With both filtering and retrofiltering having a direct quantum analogue, one would be forgiven for assuming that the smoothing technique also has some trivial quantization. Based on the classical formula for smoothing in Eq. (6), one might define the smoothed quantum state as
$$\rho_{\rm SWV}(t) = \frac{\rho_{\rm F}(t) \circ \hat E_{\rm R}(t)}{{\rm Tr}[\rho_{\rm F}(t)\hat E_{\rm R}(t)]}, \qquad (8)$$
where the Jordan product, defined as $A \circ B = \frac{1}{2}(AB + BA)$, is used to ensure that the operator is Hermitian, and the denominator is for normalization. The subscript 'SWV' will be made clear shortly. However, because the filtered state and retrofiltered effect do not, in general, commute, this construction does not always yield a valid (i.e., positive semi-definite) quantum state. As such, one cannot use it as the general definition of the smoothed quantum state. In fact, this operator has a close connection to weak values (see Ref. [11] for an in-depth discussion), and we refer to it, following Refs. [11,22], as the smoothed weak-value (SWV) state.
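A small numerical sketch shows how the Jordan-product construction can fail when the filtered state and retrofiltered effect do not commute. The specific qubit operators below are hypothetical, chosen only to make the failure visible, and `swv_state` is an illustrative helper name.

```python
import numpy as np

def swv_state(rho_f, e_r):
    """Normalized Jordan product of a filtered state and a retrofiltered effect."""
    jordan = 0.5 * (rho_f @ e_r + e_r @ rho_f)
    return jordan / np.trace(rho_f @ e_r).real

# Hypothetical filtered state: mixed and diagonal in the sigma_z basis.
rho_f = np.diag([0.9, 0.1]).astype(complex)

# Hypothetical retrofiltered effect: projector onto (|0> + |1>)/sqrt(2),
# which does not commute with rho_f.
plus = np.array([1.0, 1.0]) / np.sqrt(2)
e_r = np.outer(plus, plus.conj()).astype(complex)

rho_swv = swv_state(rho_f, e_r)
eigs = np.linalg.eigvalsh(rho_swv)

print("trace:", np.trace(rho_swv).real)
print("eigenvalues:", eigs)  # one eigenvalue is negative: not a valid state
```

The result is Hermitian with unit trace, but its determinant is negative, so one eigenvalue is negative and the operator is not a valid quantum state.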

C. Quantum Experiments Involving Smoothing of the State
Whilst the naïve quantization (8) of the smoothed quantum state does not generally yield a valid estimate, state smoothing has been applied with apparent success in several quantum experiments. All of them were done in cavity QED, a platform known for high-efficiency continuous measurements and superb control of individual quantum systems [23]. They all succeeded in obtaining a valid quantum state when applying classical smoothing techniques to the quantum system. Here we summarize these experiments. Note that it is not essential for the reader to understand these details in order to grasp the key results of this paper.
The first experiment is that reported in Refs. [2,3]. This experiment involved dropping Caesium atoms through a driven optical cavity containing a small number (10-100) of photons. Its aim was to experimentally witness the bistability [24] of the atom-field dressed state; that is, the tendency of the joint state of a single atom and field, in the strong coupling limit, to rapidly relax to two very-nearly orthogonal states, switching between them as in a so-called random telegraph signal [25,26]. To demonstrate this phenomenon, the authors estimated the phase quadrature of the intracavity field using the measurement record arising from homodyne detection of the cavity output beam. To obtain the best estimate, the authors used the entire (past-future) record. Specifically, inspired by Ref. [24], they applied classical smoothing to a simplified hidden Markov model with three states, each corresponding to a different conditional expectation value of the phase quadrature of the intracavity field. The positive and negative values correspond to the two different dressed states of the atom and field, while the state with a zero value for the phase quadrature corresponds to the case (not considered in Ref. [24]) where no atom was present in the cavity. Another simplification they made was not to calculate the full classical smoothed state in Eq. (6). Instead, they plotted just the most likely of the three states given the past-future record.
In the second experiment we consider, Ref. [4], a single Caesium atom was trapped within a high-finesse optical cavity, in the strong coupling regime. The aim of the experiment was to estimate the occupation probabilities of two possible energy eigenstates of the Caesium atom, as well as the transition rates between these states. Similarly to the previous experiment, the authors applied classical state smoothing, where the hidden Markov model assumed in this case comprises the two possible energy eigenstates and a third state introduced to help model the additional energy levels of the atom. To obtain this estimate, they probed the cavity with a weak laser, resonant with the empty cavity, contributing a small number (less than 1) of photons. The output beam was measured by a single-photon detector, with the sequence of detection times forming the measurement record. The authors processed this measurement record using classical state smoothing, Eq. (6), to obtain estimates of the occupation probabilities of the energy eigenstates.
The final experiment, Ref. [5], also involves a high-finesse cavity and atoms, but the relevant cavity mode is at microwave frequency, and the focus of attention is on the dynamics of its state. The atoms, which are Rubidium atoms in circular Rydberg states, are used just to probe the number of photons in the cavity. Specifically, the Rydberg atom is prepared in a superposition of circular Rydberg eigenstates $|+\rangle = (|50\rangle + |51\rangle)/\sqrt{2}$, where $|50(51)\rangle$ is a circular state with principal quantum number 50(51), through an interaction with a separate low-finesse microwave cavity. These atoms then interact with the intracavity field, causing a photon-number-dependent relative phase shift $\phi$ between $|50\rangle$ and $|51\rangle$. As the atom exits the cavity, it interacts with a second low-finesse cavity which rotates the atomic superposition with some particular relative phase $\phi$ to $|50\rangle$ (and the orthogonal state to $|51\rangle$), after which the atomic state is measured projectively. By continuously (at least to a good approximation) probing the system with Rydberg atoms, one forms the measurement record from the outcomes of the projective measurement. However, due to the periodicity of the relative phase shift, this measurement can only detect the photon number modulo $k$, where for this experiment $k = 8$. The authors used this measurement record to estimate the photon number probabilities via the classical smoothing theory in Eq. (6), and showed an improvement (higher purity) relative to the estimate obtained using classical filtering theory, Eq. (5). Their hidden Markov model is equivalent to assuming that the intracavity field is in some Fock state $|n\rangle$.

D. Why does classical smoothing seem to work in these experiments?
In all of the above experiments, the authors apply classical smoothing theory to systems that are, undoubtedly, quantum, while still obtaining sensible results. Although none of these papers explicitly writes down their estimate as a quantum state, there is a trivial relationship between the smoothed estimates of the probabilities, $\wp_{\rm S}(x; t)$, and the corresponding estimate of the state: $\check\rho(t) = \sum_x \wp_{\rm S}(x; t)|\psi_x\rangle\langle\psi_x|$. Here the set of pure states $\{|\psi_x\rangle\}$ are those assumed in their respective hidden Markov models: dressed atom-field states [2,3], atomic energy states [4], and Fock states [5]. We now examine in detail what makes classical smoothing applicable in these cases.
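In all these experiments the quantum states corresponding to the $d$ discrete states in the hidden Markov model are orthogonal, or very nearly so. This means that, in the orthonormal basis $\{|\psi_x\rangle : \langle\psi_x|\psi_{x'}\rangle = \delta_{x,x'}\}_{x=1}^{d}$, the 'true' quantum state of the system at any given time is
$$\rho_{\rm T}(t) = \sum_x \wp_{\rm T}(x; t)|\psi_x\rangle\langle\psi_x|, \qquad (9)$$
with $\wp_{\rm T}(x; t) = \delta_{x,x_{\rm T}(t)}$. The important point is that the true state will always be diagonal in this basis, which means that so will the filtered state and retrofiltered effect (see App. A for the proofs). That is,
$$\rho_{\rm F}(t) = \sum_x \wp_{\rm F}(x; t)|\psi_x\rangle\langle\psi_x|, \qquad (10)$$
$$\hat E_{\rm R}(t) = \sum_x E_{\rm R}(x; t)|\psi_x\rangle\langle\psi_x|, \qquad (11)$$
which trivially commute, $[\rho_{\rm F}(t), \hat E_{\rm R}(t)] = 0$. Since the problem of the smoothed weak-value state becoming indefinite only occurs when $[\rho_{\rm F}(t), \hat E_{\rm R}(t)] \neq 0$, the orthogonality assumption removes any possibility of $\rho_{\rm SWV}(t)$ becoming physically invalid at any point in the evolution.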
In this case, the SWV state becomes
$$\rho^{\rm cl}_{\rm S}(t) = \sum_x \wp_{\rm S}(x; t)|\psi_x\rangle\langle\psi_x|, \qquad (12)$$
where $\wp_{\rm S}(x; t)$ is defined in Eq. (6). From this point forward, we will refer to this as the 'classical regime', hence the superscript 'cl'. While the assumption in Eqs. (10) and (11) facilitates the usage of the SWV state as a smoothed estimate of the quantum state, it has also effectively removed the 'quantumness' from the system, as it assumes that all the information that was missed by the observer was classical in nature. Moreover, there is an obvious tension in this semi-classical treatment. On the one hand, a portion of the information in the environment is treated as quantum information, i.e., that portion captured by the observer, whose choice of measurement affects how it is converted into classical information. On the other hand, the remaining portion left in the environment is treated as purely classical, revealing which of the orthogonal basis states the system is in. To obtain a more consistent treatment of quantum systems, all the information in the environment should be treated on the same footing, being subject to different measurement choices ("unravellings"). But to deal with this generalization requires the quantum state smoothing theory of Ref. [1].
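As a quick check of the classical regime described here, the sketch below builds a co-diagonal filtered state and retrofiltered effect and confirms that the normalized Jordan product reduces to the classical product rule on the diagonal. The probabilities, effect values and the rotation used to define the common basis are hypothetical illustrative choices.

```python
import numpy as np

# Hypothetical classical filtered probabilities and retrofiltered effect values.
p_f = np.array([0.5, 0.3, 0.2])
e_r = np.array([0.1, 0.6, 0.9])

# Hypothetical orthonormal basis {|psi_x>}: columns of a real rotation matrix.
theta = 0.7
U = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

# Co-diagonal filtered state and retrofiltered effect.
rho_f = U @ np.diag(p_f) @ U.T
E_r = U @ np.diag(e_r) @ U.T

# Normalized Jordan product (the SWV construction).
jordan = 0.5 * (rho_f @ E_r + E_r @ rho_f)
rho_cl = jordan / np.trace(rho_f @ E_r)

# Classical smoothed probabilities: normalized product of effect and state.
p_s = p_f * e_r / np.sum(p_f * e_r)

commutator = rho_f @ E_r - E_r @ rho_f
print("commutator norm:", np.linalg.norm(commutator))
print("diagonal of rho_cl in the psi basis:", np.diag(U.T @ rho_cl @ U))
print("classical smoothed probabilities:   ", p_s)
```

Because the two operators commute, the result is automatically positive semi-definite, with eigenvalues equal to the classical smoothed probabilities.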

E. Quantum State Smoothing
Similar to the classical state smoothing in Sec. II A, the quantum state smoothing theory [1] assumes a hidden Markov model with possible true (unknown) states, $\rho_{\rm T}(t)$, consisting of only valid quantum states of the system. Importantly, these possible true states are not restricted to only orthogonal basis states as in the classical-like quantum smoothing. The goal is to obtain an estimated state $\check\rho(t)$ that is closest to the possible true states. As a quantum analogue of the sum-square deviation considered for classical systems, we consider a trace-square deviation expected cost function
$$E_{\rho_{\rm T}(t)|O_{\rm C}}\left\{{\rm Tr}\left[\left(\check\rho(t) - \rho_{\rm T}(t)\right)^2\right]\right\},$$
where the average is taken over the set of possible true states $\mathbb{T}_t$. The estimator that minimizes this expected cost is [10]
$$\rho_{\rm C}(t) = E_{|O_{\rm C}}\{\rho_{\rm T}(t)\},$$
attaining the minimum value
$$E_{\rho_{\rm T}(t)|O_{\rm C}}\left\{{\rm Tr}[\rho_{\rm T}(t)^2]\right\} - {\rm Tr}[\rho_{\rm C}(t)^2].$$
When the conditioning is on the past record only, the optimal estimate is $\rho_{\rm F}(t) = E_{|\overleftarrow{O}_t}\{\rho_{\rm T}(t)\}$. This is identical to the filtered state defined in standard quantum trajectory theory. When the conditioning is on both past and future records, $\overleftrightarrow{O}$, the optimal estimate of the true state is
$$\rho_{\rm S}(t) = E_{|\overleftrightarrow{O}}\{\rho_{\rm T}(t)\}.$$
This is the definition of the smoothed quantum state, which, unlike the SWV state, is guaranteed to be a valid quantum state as it is a convex combination of valid quantum states.
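The role of the conditional mean as the optimal estimator can be checked numerically on a toy ensemble. In the sketch below, the three pure qubit states and their weights are hypothetical stand-ins for the possible true states and their probabilities given a record; in an actual application these would come from the measurement records, not from a fixed list.

```python
import numpy as np

def ket_to_rho(ket):
    """Projector onto a (normalized) pure state."""
    ket = np.asarray(ket, dtype=complex)
    ket = ket / np.linalg.norm(ket)
    return np.outer(ket, ket.conj())

def expected_cost(rho_est, states, weights):
    """Expected trace-square deviation of an estimate from the true states."""
    return sum(w * np.trace((rho_est - r) @ (rho_est - r)).real
               for r, w in zip(states, weights))

# Hypothetical, non-orthogonal possible true states and their probabilities.
states = [ket_to_rho([1, 0]), ket_to_rho([1, 1]), ket_to_rho([1, 1j])]
weights = np.array([0.5, 0.3, 0.2])

# Candidate optimal estimate: the weighted mean, a convex combination of states.
rho_mean = sum(w * r for r, w in zip(states, weights))

cost_mean = expected_cost(rho_mean, states, weights)
# Average purity of the true states minus the purity of the estimate.
purity_gap = (sum(w * np.trace(r @ r).real for r, w in zip(states, weights))
              - np.trace(rho_mean @ rho_mean).real)

print("expected cost of the mean:", cost_mean)
print("average purity minus purity of the mean:", purity_gap)
for r in states:
    print("cost of guessing a single pure state:", expected_cost(r, states, weights))
```

The mean state has a strictly lower expected cost than any of the individual pure states, and its cost equals the gap between the average purity of the true states and the purity of the estimate.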
It should be clear that, in general, $\rho_{\rm S}(t) \neq \rho^{\rm cl}_{\rm S}$, due to the non-orthogonality of the set of true states $\mathbb{T}_t$. However, in the event that the possible true states are pure and mutually orthogonal, i.e., when $\mathbb{T}_t = \{|\psi_x\rangle\langle\psi_x|\}_{x=1}^{d}$ with $\langle\psi_x|\psi_{x'}\rangle = \delta_{x,x'}$, the smoothed quantum state reduces to the classical one, $\rho_{\rm S}(t) = \rho^{\rm cl}_{\rm S}(t)$ [7]. Note that this is only a sufficient condition for a quantum-classical equivalence. However, at this point we have yet to specify how one can obtain the set of possible true states. Unlike in classical state estimation, where possible true states can be directly associated with true configurations $x_{\rm T}(t)$ of the system, in quantum state estimation the set of possible true states $\mathbb{T}_t$ is more subtle to define.
It is almost always the case that the observer, henceforth referred to as Alice, does not have complete access to the environment into which the system is leaking information. (If she did then the filtered state would, typically, be a pure state and no decrease in uncertainty from smoothing would be possible while maintaining a valid quantum state.) Thus we consider a partition of the system's environment or baths into two parts. This applies even if, by conventional accounting, there is only one bath, if Alice's detection has some non-unit efficiency $\eta$.
From this fraction of the bath, Alice's choice of measurement, $M_{\rm A}$, yields the measurement record $O$, the 'observed' record. As for the remaining fraction, $1 - \eta$ [27], this quantum information is unobserved by Alice. However, as it propagates away from the system, the information will encounter more complex environments which induce effectively irreversible decoherence. This defines a preferred basis [28,29] which can be regarded as a choice of measurement, $M_{\rm B}$, of the information, yielding a second measurement record $U$ that is unobserved by Alice, called the 'unobserved' record. We anthropomorphize this process by calling this Bob's measurement and record.
With both of these measurement records now defined, it is easy to see that the true state of the system is given by $\rho_{\overleftarrow{O}_t\overleftarrow{U}_t}$, as together the observed and unobserved measurement records contain the maximum amount of information about the system. This true state can be computed with standard quantum trajectory theory with measurements on multiple baths [18,19]. The set of possible true states is then determined by the possible measurement records that could have occurred for Alice and Bob, given their respective measurement choices, i.e., it is the set of states $\rho_{\overleftarrow{O}_t\overleftarrow{U}_t}$ over all such records. It is with this set that the smoothed quantum state has been computed in previous works [1,7,10,11].
One important thing to notice is that this definition for the set of true states explicitly depends on both Alice's and Bob's choice of measurement, through the records $\overleftarrow{O}_t$ and $\overleftarrow{U}_t$ that they generate. This dependence is expected, as we are dealing with quantum information, and it raises the question of how Bob's choice is determined. As stated above, we may consider Bob to be an anthropomorphic representation of the environment, rather than a real agent. In that case it is not possible to 'ask' Bob what measurement he performed. Rather, one would have to infer the best model for how, ultimately, the quantum information from the system is turned into classical information by decoherence in the environment. However, for experimentally testing the theory, it would be necessary to have a 'real' Bob: an agent who can potentially make different measurement choices $M_{\rm B}$, and reveal them to Alice (even while keeping the results from her). This is the situation we will consider in the remainder of the paper. We stress again that whilst the choice of unravelling for Bob has no impact on the filtered quantum state, this choice can cause drastically different dynamics for the smoothed state [1,8,30].

III. CLASSICAL-LIKE FILTERING AND RETROFILTERING DOES NOT GUARANTEE CLASSICAL-LIKE SMOOTHING
With the necessary background covered, we can move on to the main question this paper addresses. Say that for a given observed record $\overleftrightarrow{O}$, Alice's filtered state and retrofiltered effect commute at all times, as in the experiments discussed above (see Sec. II D). Given only this, is it correct for Alice to use the classically smoothed estimate as her optimal smoothed estimate? If this proved true it would open an entire regime where quantum state smoothing can be implemented without the need to know the measurement $M_{\rm B}$ which Bob performed. In particular, it might justify the approach already applied in the experiments detailed in Sec. II C. However, in the following sections we answer this question, and related questions, definitively in the negative. We address this in three stages, from which we obtain the following three results:
Result 1: The commutativity of the filtered state and retrofiltered effect is not sufficient for the smoothed quantum state to reduce to its classical counterpart at time $t$.
Result 2: The commutativity of the filtered state and the retrofiltered effect is not sufficient for the smoothed state to mutually commute with both the filtered state and retrofiltered effect.
Result 3: The classical smoothed quantum state does not even always have the smallest expected cost function when compared to the smoothed state resulting from other unravellings for the environment.
We prove all of these results using a single, very simple example open quantum system: a single two-level system (qubit) coupled to bosonic baths, the details of which are given in Sec. IV. We then fix Alice's measurement choice (Sec. IV A) so that the filtered state and retrofiltered effect commute over a known time interval $[t_1, t_2)$. This is to ensure that we are always operating within the regime of interest for the question. As for Bob's measurement, we investigate three potential choices. The first choice (Sec. IV B) gives the classical regime, where his measurement is such that, over the interval $(t_1, t_2)$, $\rho_{\rm T}(t) \in \{|e\rangle\langle e|, |g\rangle\langle g|\}$, with $\langle e|g\rangle = 0$, causing $\rho_{\rm S}(t)$ to equal $\rho^{\rm cl}_{\rm S}(t)$. This case is considered first to provide a comparison for the following regimes. In the second (Sec. V) and third (Sec. VI) cases, Bob's measurement is chosen so that $\rho_{\rm T}(t)$ is not restricted to an orthogonal set over the interval $(t_1, t_2)$. The second case considers a homodyne measurement of the output and enables us to prove Result 1. The third case considers an adaptive interferometric measurement strategy on the output bosonic field, and enables us to prove Result 2. As for the final result, in Sec. VII, we investigate the expected cost functions of the smoothed quantum state in all three cases and show that, for the majority of the time, the classical case yields a larger (and hence worse) expected cost when compared to the other two cases.

IV. MODEL: SINGLE QUBIT COUPLED TO EMISSION AND ABSORPTION CHANNELS
In this section we introduce the example system that will be used throughout the remainder of the paper to prove our results. Also in this section we introduce the measurement which Alice makes on her portion of the environment, chosen so that the resulting filtered state and retrofiltered effect are mutually diagonal over the given time frame. This is to ensure that we are always operating within the regime where the filtered state and retrofiltered effect are describable by a classical discrete hidden Markov model. Finally, in this section, we introduce the first choice for Bob's measurement, the one that explicitly realises this hidden Markov model and so makes $\rho_{\rm S}(t) = \rho^{\rm cl}_{\rm S}(t)$. The two other, more complicated, measurement choices for Bob will be explained in the subsequent sections.
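The physical system we consider is a qubit that is coupled to three decoherence channels: two emission channels and one absorption channel. The dynamics of the quantum state, under no observation, is specified by a vector of Lindblad operators,
$$\hat{\bf c} = \left(\sqrt{\delta}\,\hat\sigma_-, \sqrt{\gamma}\,\hat\sigma_-, \sqrt{\epsilon}\,\hat\sigma_+\right)^{\top}. \qquad (16)$$
This defines the Lindblad master equation
$$\dot\rho = \mathcal{D}[\hat{\bf c}]\rho \equiv \sum_k \left[\hat c_k \rho \hat c_k^\dagger - \tfrac{1}{2}\left(\hat c_k^\dagger \hat c_k \rho + \rho\, \hat c_k^\dagger \hat c_k\right)\right], \qquad (17)$$
as we are working in a frame where the system Hamiltonian disappears. In Eq. (16) we are using the Pauli ladder operators $\hat\sigma_\pm \equiv (\hat\sigma_x \pm i\hat\sigma_y)/2$, where $\hat\sigma_{x,y,z}$ are the usual Pauli operators. We denote the eigenstates of $\hat\sigma_z$ by $\hat\sigma_z|e\rangle = |e\rangle$, $\hat\sigma_z|g\rangle = -|g\rangle$. The master equation (17) is extremely simple in this basis, looking like a classical bit stochastically transitioning between $|e\rangle$ and $|g\rangle$, which is described by the classical master equation for the ground state probability, $\wp(g; t) = \langle g|\rho(t)|g\rangle$,
$$\dot\wp(g; t) = (\delta + \gamma)\wp(e; t) - \epsilon\,\wp(g; t), \qquad (18)$$
with the probability of being in the excited state being $\wp(e; t) = 1 - \wp(g; t)$.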
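The reduction of the qubit master equation to a classical rate equation for the populations can be verified numerically. The sketch below assumes the three Lindblad operators are lowering operators at rates `delta` and `gamma` (the two emission channels) and a raising operator at an absorption rate called `eps` here; the numerical values of the rates and of the test population are hypothetical.

```python
import numpy as np

# Hypothetical rates: two emission channels and one absorption channel.
delta, gamma, eps = 1.0, 0.5, 0.3

# Basis ordering (|e>, |g>), so sigma_z = diag(1, -1).
sm = np.array([[0, 0], [1, 0]], dtype=complex)  # sigma_- = |g><e|
sp = sm.conj().T                                # sigma_+ = |e><g|
c_ops = [np.sqrt(delta) * sm, np.sqrt(gamma) * sm, np.sqrt(eps) * sp]

def lindblad_rhs(rho):
    """Right-hand side of the Lindblad master equation with no Hamiltonian."""
    out = np.zeros_like(rho)
    for c in c_ops:
        cdc = c.conj().T @ c
        out += c @ rho @ c.conj().T - 0.5 * (cdc @ rho + rho @ cdc)
    return out

def rate_rhs(p_g):
    """Classical rate equation for the ground-state probability."""
    return (delta + gamma) * (1.0 - p_g) - eps * p_g

# Compare the two descriptions for a diagonal state.
p_g = 0.35
rho = np.diag([1.0 - p_g, p_g]).astype(complex)
drho = lindblad_rhs(rho)
print("d/dt <g|rho|g> from the Lindblad equation:", drho[1, 1].real)
print("classical rate equation:                  ", rate_rhs(p_g))

# Steady state of the rate equation.
p_g_ss = (delta + gamma) / (delta + gamma + eps)
print("steady-state ground probability:", p_g_ss)
```

A state that starts diagonal in the $\hat\sigma_z$ basis stays diagonal, which is why the populations alone suffice to describe the unconditioned dynamics.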

A. Codiagonal Filtering and Retrofiltering: Alice uses photon detection
Throughout, we consider the case where Alice perfectly monitors the first channel, corresponding to emission at rate $\delta$. Since there is a second emission channel, with rate $\gamma$, one can equivalently imagine a single emission channel that Alice monitors with efficiency $\delta/(\delta + \gamma)$. To ensure that Alice's filtered state and retrofiltered effect commute over some time interval $(t_1, t_2)$, it is sufficient for Alice to use photon detection to monitor her channel. Such a measurement, as will be shown shortly, causes both the filtered state and retrofiltered effect to be diagonal in the $\hat\sigma_z$-basis between any two detection events (jumps).
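This measurement results in the following stochastic master equation for the filtered state [19],
$$d\rho_{\rm F} = dN_{\rm o}\,\mathcal{G}[\hat c_{\rm o}]\rho_{\rm F} - dt\,\mathcal{H}[\tfrac{1}{2}\hat c_{\rm o}^\dagger \hat c_{\rm o}]\rho_{\rm F} + dt\,\mathcal{D}[\hat{\bf c}_{\rm u}]\rho_{\rm F}, \qquad (19)$$
with $\hat c_{\rm o} = \sqrt{\delta}\,\hat\sigma_-$ and $\hat{\bf c}_{\rm u} = (\sqrt{\gamma}\,\hat\sigma_-, \sqrt{\epsilon}\,\hat\sigma_+)^\top$. The superoperators in Eq. (19) are $\mathcal{G}[\hat a]\rho = \hat a\rho\hat a^\dagger/{\rm Tr}[\hat a\rho\hat a^\dagger] - \rho$ and $\mathcal{H}[\hat a]\rho = \hat a\rho + \rho\hat a^\dagger - {\rm Tr}[\hat a\rho + \rho\hat a^\dagger]\rho$, and the stochastic increment characterising the jump, $dN_{\rm o}$, satisfies $dN_{\rm o}^2 = dN_{\rm o}$ and ${\rm E}[dN_{\rm o}] = {\rm Tr}[\hat c_{\rm o}^\dagger \hat c_{\rm o}\rho_{\rm F}]\,dt$. Here, we have introduced the subscripts 'o' and 'u' to distinguish the channels observed and unobserved, respectively, by Alice.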
To see that this measurement for Alice does result in a filtered state that is diagonal in the $\hat\sigma_z$-basis between any two consecutive jumps, we can solve Eq. (19) with a ground-state initial condition set by the first jump at time $t_1$, i.e., $\rho_{\rm F}(t_1^+) = |g\rangle\langle g|$ with $t_1^+ = t_1 + dt$, under a no-jump evolution ($dN_{\rm o}(t) = 0$). This results in $\rho_{\rm F}(t) = \wp_{\rm F}(e;t)|e\rangle\langle e| + \wp_{\rm F}(g;t)|g\rangle\langle g|$, with $\wp_{\rm F}(g;t)$ satisfying the differential equation
$$\dot\wp_{\rm F}(g;t) = \gamma\,\wp_{\rm F}(e;t) - \epsilon\,\wp_{\rm F}(g;t) + \delta\,\wp_{\rm F}(e;t)\,\wp_{\rm F}(g;t), \qquad (21)$$
and $\wp_{\rm F}(e;t) = 1 - \wp_{\rm F}(g;t)$.
As for the retrofiltered effect, the stochastic differential equation governing its evolution, Eq. (22) [1,31], is evolved backwards in time from the final condition $\hat E_{\rm R}(T) \propto \hat 1$, representing a final uninformative state. In Eq. (22), $\mathcal{G}[\hat a]\rho = \hat a\rho\hat a^\dagger - \rho$ and $\mathcal{H}[\hat a]\rho = \sum_k (\hat a_k\rho + \rho\hat a_k^\dagger)$ are the linear versions of the nonlinear superoperators introduced above, while $\zeta$ is an arbitrary positive constant which affects only the norm of $\hat E_{\rm R}$. It is related to a so-called ostensible probability distribution for jumps; see Ref. [31]. Thus, strictly, $\hat E_{\rm R}$ is only proportional to the effect, but all the expressions for the smoothed state are normalized, so the norm of $\hat E_{\rm R}$ for a fixed $\overrightarrow{O}$ does not matter.
Now we show that the retrofiltered effect is also diagonal in the $\hat\sigma_z$-basis between any two consecutive observed jumps. The final condition at the time just prior to the second jump at $t_2$ is $\hat E_{\rm R}(t_2^-) \propto |e\rangle\langle e|$, where $t_2^- = t_2 - dt$. To see this, one can use the quantum map form for the evolution of the retrofiltered effect, Eq. (23) (see App. A for an explanation). Since, at $t_2$, a photon is detected in the $\delta$-channel, the map corresponding to the observed measurement takes the form $\hat M_{\rm o}(y_{\rm o}; t_2) \propto \hat c_{\rm o} \propto |g\rangle\langle e|$. Computing Eq. (23) with this map yields the final condition for the effect. Computing the evolution of the retrofiltered effect given this final condition at $t_2^-$ and under a no-jump evolution ($dN_{\rm o} = 0$), we obtain the solution $\hat E_{\rm R}(t) = E_{\rm R}(e;t)|e\rangle\langle e| + E_{\rm R}(g;t)|g\rangle\langle g|$, where the coefficients satisfy Eq. (24). Thus, by Alice choosing photon detection, the quantum system can, from her perspective, be completely described by a classical hidden Markov model between any two detection events. In fact, this means that the entire evolution has this nature, apart from potentially the evolution prior to the first jump if the initial state is not diagonal in the $\hat\sigma_z$-basis.
FIG. 1. A qubit undergoing three separate dynamical processes. The first two processes are photon emission processes, one with rate $\delta$ that is monitored by Alice (blue arrow) using photon detection and one with rate $\gamma$ monitored by Bob, again using photon detection. The final process is an absorption process with rate $\epsilon$ (dashed arrow), modelled by a continuous Raman driving (black arrow) to a virtual state that immediately decays to the excited state via photon emission, which is observed by Bob via photon detection.
As Bob monitors all the channels that Alice's measurement missed, this leaves him to perfectly monitor the $\gamma$- and $\epsilon$-channels. One might wonder how it is possible for Bob to measure an absorptive channel, since up until now we have not said how the absorptive channel could arise physically. We consider the qubit to be under continuous driving of a Raman transition [32-35] to a virtual state which immediately decays to the excited state by emitting a (detectable) photon; see Fig. 1. Such a scheme can be made equivalent to an absorptive channel with rate $\epsilon$, while emitting a photon of a different frequency to those in the emission channels, so that it can be monitored, similarly to the emission channels, by detecting this photon. With this technical point taken care of, we next turn to Bob's first choice of measurement.

B. Classical Smoothed State: Bob uses photon detection
In this section we want Bob's measurement to be such that $\mathbb{T}_t = \{|g\rangle\langle g|, |e\rangle\langle e|\}$ for all $t \in (t_1, t_2)$, with $\rho_{\rm T}(t_1^+) = |g\rangle\langle g|$. Just as for Alice, above, this is easy to achieve by having Bob monitor the remaining two channels perfectly using photon detection. For this measurement scheme, the stochastic master equation for the true state is given by Eq. (25) [19], where $dN_{{\rm u},k}$ are the stochastic increments describing the detection of a photon in the corresponding channel.
To see that this measurement choice of Bob's does indeed result in the true state being in either the ground or excited state between any two consecutive observed jumps, we can look at each term in Eq. (25) individually. Firstly, as we are considering the evolution between two consecutive observed jumps, we know that initially the true state will be in the ground state and that $dN_{\rm o}(t) = 0$ until the following jump, removing the first term in Eq. (25). Leaving the terms of order $dt$ for last, the remaining unobserved jump terms project the true state into either the ground or excited state if a photon is detected in either the $\gamma$- or $\epsilon$-channel, respectively. Finally, the terms of order $dt$ are equal to zero when computed with the system in the ground state. This means that the true state will remain in the ground state until a photon is detected in the $\epsilon$-channel, which projects it into the excited state. The $dt$ terms also give zero when the true state is in the excited state, so it remains unchanged until it is projected into the ground state via a photon emission in the $\gamma$-channel. Thus, under this monitoring by Bob, the true state between any two observed jumps will only be in either the ground state $|g\rangle\langle g|$ or the excited state $|e\rangle\langle e|$, as claimed.
With $\mathbb{T}_t$ obtained, all that remains is to compute the smoothed quantum state, which in this case is a fairly simple task. Beginning with Eq. (15), we arrive at Eq. (26), where Bayes' theorem is used in the second line and, in the final line, $\wp(m;t|\overleftarrow{O}_t)$ is recognized as exactly the coefficient of the filtered state in Eq. (21). After normalization this gives $\rho_{\rm S}(t) = \wp_{\rm S}(e;t)|e\rangle\langle e| + \wp_{\rm S}(g;t)|g\rangle\langle g| = \rho^{\rm cl}_{\rm S}(t)$, as expected. For a simpler analysis, we henceforth consider the case where $\delta \to 0^+$. In this limit, Alice very rarely observes a jump, so $(t_1, t_2)$ is typically long enough for both the filtered state and retrofiltered effect to reach their respective stationary solutions between any two consecutive jumps. As a result, the filtered state and smoothed quantum state differ only for a time of the order of the equilibration time of the system, in this case $1/(\gamma + \epsilon)$, prior to the final jump. To see why this is the case, we need to look at the retrofiltered effect. From Eq. (24), the steady-state solution is $E^{\rm ss}_{\rm R}(e) = E^{\rm ss}_{\rm R}(g) = \lambda(t)$, where $\lambda(t)$ is half the norm of $\hat E_{\rm R}$. (Recall that the norm does not affect any calculated quantities, and so can be time-dependent even in steady state.) This means that whenever the retrofiltered effect is in steady state, the smoothed quantum state in Eq. (26), when normalized at time $t$, is equal to the filtered state. In the limit $\delta \to 0^+$, the retrofiltered effect is in steady state over the entire evolution except for a time of order $1/(\gamma + \epsilon)$ before the jump. Thus we only need to consider the evolution of the state on this time scale prior to an observed jump.
To gain some physical understanding of $\rho^{\rm cl}_{\rm S}(t)$ in this system, let us compare it to the filtered state. We can see in Fig. 2(a) that, unsurprisingly, prior to the jump occurring at $t = 0$, the filtered state (solid blue line) remains in its steady state, $\wp^{\rm ss}_{\rm F}(e) = \epsilon/(\gamma + \epsilon)$, until the jump, whereupon it is projected into the ground state. The smoothed quantum state (dashed green line), on the other hand, begins to diverge from the filtered state as the jump approaches, and just prior to the jump reaches the excited state before it is projected to the ground state. This divergence is expected, as the smoothed state 'knows' that a jump is about to occur because it is conditioned on the future measurement record. Furthermore, since the true state can only be in either the ground or excited state, for a jump to occur the system must have been in the excited state.
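The behaviour just described can be reproduced with a short calculation. The sketch below is an illustration under stated assumptions, not the authors' code: it takes the limit $\delta \to 0^+$, assumes the filtered state sits at its stationary value $\epsilon/(\gamma+\epsilon)$, and assumes the no-jump retrofiltered coefficients obey the classical backward equations $dE_e/ds = \gamma(E_g - E_e)$ and $dE_g/ds = \epsilon(E_e - E_g)$, where $s$ is the time remaining before the observed jump and the final condition is $\hat E_{\rm R} \propto |e\rangle\langle e|$. All function names are hypothetical.

```python
import math

def retrofiltered_effect(s, gamma, eps):
    """(E_e, E_g) a time s before the jump, starting from E_e = 1, E_g = 0 at s = 0.

    Closed-form solution of the assumed backward equations
    dE_e/ds = gamma * (E_g - E_e),  dE_g/ds = eps * (E_e - E_g).
    """
    decay = math.exp(-(gamma + eps) * s)
    e_e = (eps + gamma * decay) / (gamma + eps)
    e_g = eps * (1.0 - decay) / (gamma + eps)
    return e_e, e_g

def smoothed_excited(s, gamma, eps):
    """Classically smoothed excited-state probability: normalized product of
    the stationary filtered probabilities and the retrofiltered effect."""
    p_e = eps / (gamma + eps)
    p_g = 1.0 - p_e
    e_e, e_g = retrofiltered_effect(s, gamma, eps)
    w_e, w_g = p_e * e_e, p_g * e_g
    return w_e / (w_e + w_g)

def purity(p_e):
    """Purity of a state diagonal in the sigma_z basis."""
    return p_e ** 2 + (1.0 - p_e) ** 2

if __name__ == "__main__":
    gamma, eps = 1.0, 0.05
    for s in (5.0, 2.0, 1.0, 0.5, 0.1, 0.0):
        p = smoothed_excited(s, gamma, eps)
        print(f"s = {s:4.1f}  p_S(e) = {p:.4f}  purity = {purity(p):.4f}")
```

Under these assumptions the smoothed excited-state probability rises from the filtered value far from the jump to one at the jump, passing through the maximally mixed state when $s = \ln[2\gamma/(\gamma-\epsilon)]/(\gamma+\epsilon)$.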
If we look at the purity of the filtered and smoothed quantum states in Fig. 2(b), we see that as the smoothed quantum state begins to deviate from the filtered state, the purity begins to drop rapidly. Such a drop is not surprising: in order for the smoothed state to reach the excited state it must pass through the maximally mixed state, resulting in the smoothed quantum state having minimal purity prior to the jump. However, this brings up an interesting point. From Eq. (14), we know that the expected cost function for the optimal state is equal to the average difference between the purity of the true state and that of the conditioned state. For this system, the true state is pure irrespective of the unobserved measurement record, so Eq. (14) reduces to the impurity of the conditioned state. This means that, since the purity of the smoothed state decreases below that of the filtered state just prior to the jump, the filtered state seemingly gives a lower expected cost function and would be the optimal estimator of the true state, not the smoothed quantum state. However, this is not true, as stated earlier and proved in Ref. [10].
The issue lies in how the expected cost function is calculated for the filtered state. Using Eq. (14) for the filtered state assumes that one only has access to Alice's past measurement record, whereas the smoothed quantum state assumes that the past-future record is available. Thus, to compare the expected cost function of the filtered state to that of the smoothed state, one must take the future measurement record into account and compute the cost given in Eq. (27). Note that this argument also holds for classical systems, with the appropriate analogues of the states and cost function. When computing the expected cost function of the filtered state in this way, we see, in Fig. 2(c), that the smoothed quantum state has the lower expected cost function.

V. PROOF OF RESULT 1: BOB USES X-HOMODYNE MEASUREMENTS
In this section we prove, by considering a different measurement choice for Bob, that the smoothed quantum state need not equal the classically smoothed quantum state. Here, instead of photon detection, Bob (perfectly) monitors each of the unobserved channels with a homodyne measurement. That is, the output light of the system is combined with a local oscillator of phase $\phi$ on a 50:50 beam splitter, and both outputs of the beam splitter are measured with photon detectors. The two signals are then subtracted, yielding quadrature information about the system; see Fig. 3. Specifically, a measurement current is obtained for each channel [19,31], labelled by $k = \gamma, \epsilon$, where $\phi_k$ is the local oscillator phase for the $k$-channel and $dW_k$ is the innovation, an infinitesimal Wiener increment satisfying ${\rm E}[dW_k] = 0$, $dW_k^2 = dt$, and $dW_\gamma\,dW_\epsilon = 0$. Such a monitoring causes the true state to evolve according to the stochastic master equation (30) [19], in which $\hat{\bf c}_\phi = [\hat c_{{\rm u},\gamma} e^{i\phi_\gamma}, \hat c_{{\rm u},\epsilon} e^{i\phi_\epsilon}]^\top$ and $d{\bf W} = [dW_\gamma, dW_\epsilon]^\top$. From this point forward, we will consider the case where $\phi_\gamma = \phi_\epsilon = 0$, i.e., X-homodyne measurements on both channels. Note, this is called an X-homodyne measurement since the measurement current only contains information about the $x$-component of the Bloch vector ${\bf r} = (x, y, z)^\top$ that characterizes the state of the qubit, where $\rho = \tfrac{1}{2}(\hat 1 + {\bf r}\cdot\hat{\boldsymbol\sigma})$ and $\hat{\boldsymbol\sigma} = (\hat\sigma_x, \hat\sigma_y, \hat\sigma_z)^\top$. The analysis that follows would also hold, in this system, for any choice of homodyne phase $\phi_\gamma = \phi_\epsilon = \phi$.
FIG. 2 (caption fragment). ... Eq. (27), and smoothed, using Eq. (14), states. The smoothed estimate of the state is seen to be a better estimator than the filtered estimate owing to its smaller expected cost before the jump. In all cases, we have taken $\epsilon = 0.05\gamma$ and the limit $\delta \to 0^+$.
FIG. 3. The same physical system as in Fig. 1, with Bob now measuring his $\gamma$ and $\epsilon$ channels both (independently) using X-homodyne measurements (with local oscillator phase $\phi = 0$). Here BBS stands for balanced beam splitter.
The important part about this measurement scheme for Bob is that, between two observed jumps, $\rho_{\rm T}$ is not restricted to the set $\{|g\rangle\langle g|, |e\rangle\langle e|\}$, but instead can be in any pure state on the x-z great circle of the Bloch sphere. The true state is pure between jumps because it starts in the (pure) ground state and remains pure since the system is perfectly monitored by both Alice and Bob. The pure state is confined to the x-z great circle of the Bloch sphere because, without Bob's measurement, the quantum state is confined to the z-axis of the Bloch sphere, and conditioning on an X-homodyne measurement will only give information about the $x$-component of the Bloch vector. Hence, the true state will, typically, have a non-zero $x$-component, whilst the $y$-component will remain zero.
To prove this more rigorously, one can obtain the stochastic differential equations for the Bloch vector of the true state from Eq. (30), subject to the initial condition $\rho_{\rm T}(t_1^+) = |g\rangle\langle g|$ and a no-jump record $dN_{\rm o}(t) = 0$. We will again take the limit $\delta \to 0^+$ to simplify the computation. For this measurement scenario, the three Bloch components obey coupled stochastic differential equations with initial conditions $z(t_1^+) = -1$ and $y(t_1^+) = x(t_1^+) = 0$. From these equations we see that, owing to the initial condition, the $y$-component will remain zero until the next observed jump.
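For concreteness, the following worked equations sketch what such Bloch-vector equations look like. They are a reconstruction under stated assumptions rather than a quotation of the source: we assume the limit $\delta \to 0^+$, X-homodyne detection ($\phi_\gamma = \phi_\epsilon = 0$) of both unobserved channels, and the standard diffusive unravelling $d\rho_{\rm T} = \mathcal{D}[\hat{\bf c}_{\rm u}]\rho_{\rm T}\,dt + \sum_k \mathcal{H}[\hat c_{{\rm u},k}]\rho_{\rm T}\,dW_k$ with the nonlinear $\mathcal{H}$ defined earlier.

```latex
\begin{align}
dx &= -\tfrac{1}{2}(\gamma+\epsilon)\,x\,dt
      + \sqrt{\gamma}\,(1+z-x^2)\,dW_\gamma
      + \sqrt{\epsilon}\,(1-z-x^2)\,dW_\epsilon, \\
dy &= -\tfrac{1}{2}(\gamma+\epsilon)\,y\,dt
      - x\,y\left(\sqrt{\gamma}\,dW_\gamma + \sqrt{\epsilon}\,dW_\epsilon\right), \\
dz &= \left[-\gamma(1+z) + \epsilon(1-z)\right]dt
      - \sqrt{\gamma}\,x\,(1+z)\,dW_\gamma
      + \sqrt{\epsilon}\,x\,(1-z)\,dW_\epsilon .
\end{align}
% With x = \sin\theta, z = \cos\theta (and y = 0), the Ito formula then gives
\begin{equation}
d\theta = \tfrac{1}{2}\sin\theta\left[\gamma(2+\cos\theta) - \epsilon(2-\cos\theta)\right]dt
        + \sqrt{\gamma}\,(1+\cos\theta)\,dW_\gamma
        - \sqrt{\epsilon}\,(1-\cos\theta)\,dW_\epsilon .
\end{equation}
```

Every term in the $dy$ equation is proportional to $y$, so $y = 0$ is preserved, which is the property used in the text. The sign conventions for $\theta$ here are an assumption and may differ from those of the source.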
Since the true state is restricted to the x-z great circle, we can reparametrize the true state by the angle $\theta$ of y-rotation from the positive z-axis instead of the Bloch vector. That is, defining $\theta$ via $x = \sin\theta$ and $z = \cos\theta$, we have $\rho_{\rm T}(\theta) = \tfrac{1}{2}(\hat 1 + \sin\theta\,\hat\sigma_x + \cos\theta\,\hat\sigma_z)$. With this $\theta$-parametrization we can reduce the evolution of the true state between two observed jumps to a single stochastic differential equation for $\theta$. Using the Itô formula [19,36] to move from a differential equation in $\cos\theta$ (or $\sin\theta$) to one in $\theta$, we obtain Eq. (36).
We can now begin to compute the smoothed quantum state for this case, starting from Eq. (15). As we are only considering the evolution between two observed jumps, we only need to find $\wp(\theta;t|\overleftarrow{O}_t) = \wp(\theta;t|{\rm N.J.})$, as the retrofiltered effect can be computed via Eq. (22). Here the conditioning N.J. stands for a no-jump record. Since Eq. (36) is in the form of a Langevin equation, it can be mapped to a Fokker-Planck equation [11,19,36,37] describing the evolution of the probability density of $\theta$ for unknown Wiener processes. However, since Eq. (36) assumes a no-jump observed record, the probability density the Fokker-Planck equation describes is in fact $\wp(\theta;t|{\rm N.J.})$. The corresponding Fokker-Planck equation, in which $\partial_x \equiv \partial/\partial x$, is solved with the initial condition $\wp(\theta;t_1^+|{\rm N.J.}) = \delta(\theta - \pi)$, corresponding to the ground state, and the periodic boundary condition $\wp(0;t|{\rm N.J.}) = \wp(2\pi;t|{\rm N.J.})$. We solve this Fokker-Planck equation numerically using Mathematica's NDSolve function with a Gaussian initial condition $g(\theta; \mu = \pi, V = 0.01\gamma) \approx \delta(\theta - \pi)$, where $\mu$ is the mean of the Gaussian and $V$ its variance. With the probability density found, we can now compute the smoothed quantum state in the X-homodyne case for comparison to the photon detection case.
FIG. 4. Comparing the conditional average of the $z$-component of the Bloch vector when Alice's measurement record is obtained by photon detection and Bob's measurement record is obtained by either photon detection (green dashed line) or an X-homodyne (red line) measurement. Here, we have taken $\delta \to 0^+$ and $\epsilon = 0.05\gamma$, and an observed detection (jump) occurs at time $t = 0$. The $z$-component for the smoothed state using an X-homodyne measurement for Bob differs from the classically smoothed state despite the filtered state being diagonal in the $\hat\sigma_z$-basis, proving result 1.
As an aside, in general, calculating the smoothed quantum state requires using an ensemble of unnormalized true states generated according to an ostensible probability distribution; see, for example, Refs. [11,31]. That is, one must calculate an extra stochastic variable, the norm of each possible true state. This applies, in general, even in the current simple system, where we can generate the ensemble of true states by solving a Fokker-Planck equation. Specifically, the Fokker-Planck equation must be modified to describe the joint probability of $\theta$ and the normalization [11,37]. In the present case, however, since we are considering the limit $\delta \to 0^+$, the amount of information Alice gains from a no-detection event is negligible, as this is almost always the result, causing the equation of the true state to reduce to just the unobserved evolution. Since the actual probabilities, for this case, can be computed easily from the distributions for $dW_\gamma$ and $dW_\epsilon$, the normalized equation for the true state can be used.
We can now begin to compute the smoothed quantum state. We can see that, in this homodyne case, the smoothed quantum state will be diagonal in the $\hat\sigma_z$-basis because of a symmetry in the dynamics. Specifically, since there are no unitary dynamics driving the system in a particular way, for any unobserved measurement record that causes the true state to rotate clockwise on the x-z great circle, there is an equally likely record of the opposite sign that causes the state to evolve in exactly the same way in the counter-clockwise direction. Importantly, it is equally likely even given the future record (the jump) that Alice observes, because the excited state probability is the same for both directions of rotation. Thus, when Alice averages over the possible unobserved records, each true state can be paired with its mirror image about the z-axis, cancelling the $x$-component of the Bloch vector.
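The mirror-symmetry argument above can be illustrated numerically. The following is a sketch only, assuming Bloch-vector equations for X-homodyne monitoring of both channels in the limit $\delta \to 0^+$ (these equations are a reconstruction from the standard diffusive stochastic master equation, not quoted from the source) and a simple Euler-Maruyama step with renormalization onto the unit circle as a numerical convenience. All names and parameter values are hypothetical.

```python
import math
import random

def simulate_xz(increments, gamma, eps, dt):
    """Propagate (x, z) from the ground state for a given list of
    (dW_gamma, dW_eps) Wiener increments, using Euler-Maruyama steps."""
    x, z = 0.0, -1.0
    sg, se = math.sqrt(gamma), math.sqrt(eps)
    for dw_g, dw_e in increments:
        dx = (-0.5 * (gamma + eps) * x * dt
              + sg * (1 + z - x * x) * dw_g
              + se * (1 - z - x * x) * dw_e)
        dz = ((-gamma * (1 + z) + eps * (1 - z)) * dt
              - sg * x * (1 + z) * dw_g
              + se * x * (1 - z) * dw_e)
        x, z = x + dx, z + dz
        # The true state is pure, so project back onto the unit circle to
        # remove the small norm drift introduced by the finite time step.
        r = math.hypot(x, z)
        x, z = x / r, z / r
    return x, z

def wiener_increments(n, dt, seed=0):
    """n pairs of independent Gaussian increments with variance dt."""
    rng = random.Random(seed)
    sd = math.sqrt(dt)
    return [(rng.gauss(0.0, sd), rng.gauss(0.0, sd)) for _ in range(n)]

if __name__ == "__main__":
    gamma, eps, dt = 1.0, 0.05, 1e-3
    record = wiener_increments(2000, dt, seed=1)
    mirrored = [(-a, -b) for a, b in record]  # record of the opposite sign
    print(simulate_xz(record, gamma, eps, dt))
    print(simulate_xz(mirrored, gamma, eps, dt))
```

Flipping the sign of every increment reflects the trajectory about the z-axis ($x \to -x$, $z$ unchanged), so the two records lead to the same excited-state population and their $x$-components cancel on averaging.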
We can thus easily compare the X-homodyne smoothed quantum state with the classically smoothed quantum state by only looking at their $z$-components. As Fig. 4 shows, there is a clear difference between them. (As explained in Sec. IV B, in the limit $\delta \to 0^+$ we need be concerned only with the smoothed quantum state in a time of order $1/(\gamma + \epsilon)$ before the final jump.) Thus, by this example, we have proven our first result: the commuting of the filtered state and retrofiltered effect is not sufficient for $\rho_{\rm S}(t)$ to equal $\rho^{\rm cl}_{\rm S}(t)$. In this example, the obvious difference between the smoothed state obtained classically (when Alice assumes an orthogonal basis for the true state) and that obtained when Bob performs a homodyne measurement (when Alice takes this into account) is the value of $z$ for the smoothed state immediately prior to the jump. We can intuitively understand this as follows. Since the true state in the latter case can be in a superposition of the ground and excited states, as opposed to just being in one of these states, a jump can occur even when the system is not in the excited state. Thus, when Alice is estimating the true state using smoothing, she cannot be certain that the true state was in the excited state just prior to the transition, unlike in the classical (photon detection) case where she is certain. As such, her smoothed estimate in the homodyne case only moves somewhat closer to the excited state as the jump approaches.

VI. PROOF OF RESULT 2: BOB USES AN ADAPTIVE MEASUREMENT
In the preceding section, the non-classical smoothed state was still diagonal in the same basis as the classical smoothed state, and therefore diagonal in the same basis as the filtered state and retrofiltered effect (whose co-diagonality defines the scenario we are investigating). In this section we show the stronger result that the smoothed quantum state is not necessarily even co-diagonal with $\rho_{\rm F}$ and $\hat E_{\rm R}$. As we saw in the X-homodyne example, the smoothed quantum state was diagonal in the $\hat\sigma_z$-basis because Bob's measurement gave the set of possible true states a symmetry about the z-axis of the Bloch sphere. To rule out reasoning of this sort in this final case, Bob's measurement is chosen to break this symmetry in the set of possible true states.
To achieve an asymmetric distribution of true states, we allow Bob to use an adaptive measurement involving finite-strength local oscillators on the unobserved channels; see Fig. 5. Here, "finite strength" means that the local oscillator intensity is comparable to the intensity of light emitted by the system, so the detection still resolves individual photons (unlike the strong local oscillator case of a homodyne measurement). The measurement is "adaptive" in the sense that after every detection event, the amplitude (strength and phase) of the local oscillators can be changed, depending on the current settings of these amplitudes and on the type of photodetection that occurs (if more than one detector is used, which is the case for our system here, with two unobserved channels). Measurements of this kind have been studied theoretically for many decades [38-45].
A non-trivial property of adaptive measurements of the sort described is that they can constrain the stochastic evolution of the true state of the system to a finite and time-independent set $\mathbb{T}$. Note the lack of a subscript $t$. Together with the corresponding stationary probabilities of each state in $\mathbb{T}$, this comprises a so-called physically realizable ensemble (PRE) [41,46]. The significance of "physically realizable" here is as follows. Consider a Markovian (in the strongest sense [47]) open quantum system whose unconditional evolution is described by a Lindblad master equation, where, in the long-time limit, $t \geq t_{\rm ss}$, the system reaches a unique stationary solution that is mixed. Owing to the fact that a mixed quantum state can be decomposed into a weighted ensemble of pure states in infinitely many ways, there are different interpretations of how the underlying pure state dynamics of the system unfold. However, only some of these pure state ensembles are physically realizable, meaning that there exists a way to continuously monitor the environment (without affecting the unconditional evolution) such that the conditioned state at times after $t_{\rm ss}$ is confined to $\mathbb{T}$, with the corresponding probabilities being realized in the ergodic sense.
The classical ensemble of Sec. IV B, with $\mathbb{T} = \{|g\rangle\langle g|, |e\rangle\langle e|\}$, is an example of such a PRE, but in general the states in $\mathbb{T}$ need not be orthogonal [40,41,43-45]. This is essential for finding an asymmetric (under application of $\hat\sigma_z$) steady-state ensemble. (In our case, this steady state means a long time, compared to $1/(\gamma + \epsilon)$, after Alice's last jump, which is always the limit we can consider when Alice's jump rate $\delta \to 0^+$; see more below.) For our system, the particular adaptive measurement scheme is chosen so that $\mathbb{T}$ comprises three states, with no symmetry under application of $\hat\sigma_z$. The conditioned dynamics cause the true state to cyclically transition between these states. This is possible only in certain parameter regimes, which is why we chose $\epsilon = 0.05\gamma$.
A detailed discussion of PREs in the context of the photon emission and absorption master equation is given in Ref. [44], §4.4.2. In our case, the scenario is modified slightly, as it is Bob's measurement record which causes Alice's filtered state (as opposed to the unconditioned state) to become pure upon conditioning. That is, we make the substitution $\rho \to \rho_{\rm F}$, and it is Bob's measurement that causes the true state to become pure when deriving the PRE for the system. Note, there is a subtlety regarding the filtered state that one should be aware of when making this substitution. In the standard PRE scenario, it is necessary for the unconditioned state to have reached a unique stationary solution. However, the filtered state is a stochastic quantity and in general will not reach a unique stationary solution in the long-time limit. As such, there are only select cases, i.e., when the filtered state (or some deterministic property of the state [7]) reaches a unique steady state, in which we can apply the PRE theory in this manner.
In the example that we consider, such a steady state will usually exist for the filtered state, provided the time between consecutive jumps, $\tau$, is typically much longer than the time in which the system equilibrates. More formally, we require $\tau = (\delta\langle e|\rho|e\rangle)^{-1} \gg (\gamma + \epsilon + \delta)^{-1}$. Since, after the first jump, $\rho(t_1) = |g\rangle\langle g|$, it will always be the case that $\langle e|\rho|e\rangle \leq \langle e|\rho_{\rm ss}|e\rangle = \epsilon(\gamma + \epsilon + \delta)^{-1}$. Thus, the filtered state is likely to reach steady state between consecutive jumps if we operate in the parameter regime
$$\delta \ll \frac{(\gamma + \epsilon + \delta)^2}{\epsilon} \approx \frac{(\gamma + \epsilon)^2}{\epsilon},$$
where, for the final approximation, we have assumed $\delta \ll \gamma + \epsilon$. Note, for the parameter regime we have been considering thus far ($\epsilon = 0.05\gamma$), we require that $\delta \ll 22\gamma$. As has been the case for the other measurement scenarios, we will, for simplicity, take the limit $\delta \to 0^+$. For the stationary solution of the filtered state in this example, the cyclic physically realizable ensemble $\{\rho(\theta), \wp_\theta\}_{\theta \in \{\alpha, \beta, \phi\}}$ chosen, from the possible valid ensembles, is displayed in Fig. 6 with the angles and corresponding probability weights. It should be emphasized that the implemented measurement strategy [44] only causes the true state to undergo the cyclical dynamics depicted in Fig. 6 when the filtered state is in steady state. Outside of this regime the dynamics of the true state may be more complicated, but when averaged they still result in the transient dynamics of the filtered state. The reader is referred to App. B for the local oscillator settings that achieve the PRE shown in Fig. 6.
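The arithmetic behind the quoted bound on $\delta$ can be checked directly. This is a minimal sketch, assuming the bound takes the form $\delta \ll (\gamma+\epsilon)^2/\epsilon$ that follows from requiring the shortest mean waiting time between observed jumps to exceed the equilibration time; the function names are hypothetical.

```python
def observed_rate_bound(gamma, eps):
    """Upper scale for Alice's jump rate delta, valid when delta << gamma + eps."""
    return (gamma + eps) ** 2 / eps

def waiting_time_ratio(gamma, eps, delta):
    """Ratio of the shortest mean waiting time between observed jumps,
    1 / (delta * p_ss(e)), to the equilibration time 1 / (gamma + eps + delta)."""
    rate = gamma + eps + delta
    p_ss_excited = eps / rate
    return rate / (delta * p_ss_excited)

if __name__ == "__main__":
    gamma, eps = 1.0, 0.05
    print(observed_rate_bound(gamma, eps))        # in units of gamma
    print(waiting_time_ratio(gamma, eps, 1e-3))   # much larger than 1
```

With $\epsilon = 0.05\gamma$ the bound evaluates to about $22\gamma$, matching the figure quoted in the text.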
With the measurement scheme and dynamics of the true quantum state covered, we can now begin to compute the smoothed quantum state for this case. As in the previous two cases, the smoothed quantum state differs from the filtered state only for a time of order $1/(\gamma + \epsilon)$ prior to the second observed jump. As such, we only need to compute the smoothed quantum state when the retrofiltered effect is outside of steady state. Since we are working in the limit $\delta \to 0^+$, it will typically (in the strict sense) be the case that enough time has passed for the filtered state to have reached steady state well before the next jump. As such, over the time prior to the second jump that we are interested in, the true state will be cyclically jumping between the three pure states in the PRE. Thus the smoothed quantum state over this time region can be computed via Eq. (15), with the sum restricted to the three PRE states.
FIG. 6. The x-z great circle of the qubit's Bloch sphere, showing the cyclic physically realizable ensemble chosen for the example qubit system subject to the measurement scheme in Fig. 5. The three black points on the circle are the three pure states in the ensemble, with their corresponding angles and occupation probabilities given in the top right corner (see Ref. [44] for how these are calculated), and the arrows connecting them show the cyclic dynamics that result from the adaptive measurement strategy. The shaded region is where both the filtered and smoothed quantum states must reside at times after $t_{\rm ss}$. Lastly, the square marker indicates $\rho^{\rm ss}_{\rm F}$. See Fig. 4 for the parameter details.
All that remains to compute the smoothed quantum state is to determine the probability distribution $\wp(x|\overleftarrow{O}_t)$. This distribution is, by definition, given by the occupation probabilities of the PRE states, i.e., $\wp(\theta;t|\overleftarrow{O}_t) = \wp_\theta$. To check whether the smoothed quantum state is non-diagonal in the $\hat\sigma_z$-basis, we only need to look at the $x$-component of its Bloch vector, as the $y$-component is always zero for this adaptive local oscillator scheme. In Fig. 7, the $x$-component (right-hand axis) and the length of the Bloch vector (left-hand axis), defined as $r = \sqrt{\langle\hat\sigma_x\rangle^2 + \langle\hat\sigma_y\rangle^2 + \langle\hat\sigma_z\rangle^2}$, for the smoothed quantum state are shown for this scheme (cyan). There are a couple of interesting features in this example, the most important being that the Bloch vector of the smoothed quantum state prior to the jump has a non-zero $x$-component. Thus, the smoothed quantum state is not diagonal in the $\hat\sigma_z$-basis, the shared basis of the filtered state and retrofiltered effect, proving our second result.
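The mechanism can be illustrated with a toy calculation. The sketch below uses a hypothetical three-state ensemble on the x-z great circle; the angles and weights are illustrative assumptions, not the PRE of Fig. 6, and are chosen only so that the prior mixture has zero $x$-component without being mirror symmetric. It reweights each state by its overlap with a diagonal retrofiltered effect, ${\rm Tr}[\hat E_{\rm R}\rho(\theta)] = E_e(1+\cos\theta)/2 + E_g(1-\cos\theta)/2$, and renormalizes.

```python
import math

def smoothed_ensemble(angles, weights, e_e, e_g):
    """Reweight pure states rho(theta) on the x-z circle by Tr[E_R rho(theta)]
    for a diagonal effect E_R = e_e |e><e| + e_g |g><g|, then return the
    mean Bloch components (x, z) and the updated probabilities."""
    unnormalized = []
    for theta, w in zip(angles, weights):
        z = math.cos(theta)
        likelihood = e_e * (1 + z) / 2 + e_g * (1 - z) / 2
        unnormalized.append(w * likelihood)
    norm = sum(unnormalized)
    probs = [u / norm for u in unnormalized]
    x = sum(p * math.sin(t) for p, t in zip(probs, angles))
    z = sum(p * math.cos(t) for p, t in zip(probs, angles))
    return x, z, probs

def hypothetical_ensemble():
    """Illustrative (not the paper's) asymmetric ensemble: one state with
    positive x near the ground state and two states with negative x."""
    angles = [0.9 * math.pi, 1.3 * math.pi, 1.7 * math.pi]
    x_a, x_b = math.sin(angles[0]), math.sin(angles[1])
    p_a = -x_b / (x_a - x_b)   # makes the prior mean x-component vanish
    p_c = 0.05                 # arbitrary small weight for the third state
    return angles, [p_a, 1.0 - p_a - p_c, p_c]

if __name__ == "__main__":
    angles, weights = hypothetical_ensemble()
    print(smoothed_ensemble(angles, weights, 1.0, 1.0)[:2])  # uninformative effect
    print(smoothed_ensemble(angles, weights, 1.0, 0.0)[:2])  # just before a jump
```

With an uninformative effect the mixture stays on the z-axis, whereas an effect proportional to $|e\rangle\langle e|$ shifts weight toward the states with larger excited-state overlap and produces a non-zero $x$-component, mirroring the qualitative behaviour described in the text.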
The reason the smoothed quantum state becomes non-diagonal is that Alice is able to infer from $\overrightarrow{O}_t$, specifically the upcoming jump, that the two states in the PRE with a negative $x$-component are more likely to have been occupied than they otherwise would be. This is because they have a larger overlap with the excited state than the single state with a positive $x$-component. This breaks the reflection symmetry about the z-axis. This is evident in Fig. 8, where we can see the clear increase in the probability of the true state being in the PRE state on the far left and a substantial decrease in the probability of being in the PRE state closest to the ground state.
FIG. 7. The length (left-hand axis) and the $x$-component (right-hand axis) of the Bloch vector for the smoothed quantum state when Alice's measurement record is obtained by photon detection and Bob's record by either photodetection (green dashed line), X-homodyne (red solid line), or an adaptive weak local oscillator (cyan dot-dashed line). The $x$-components of the smoothed state when Bob implements photodetection or an X-homodyne measurement are not shown, as they are zero at all times.
It should also be apparent, since the smoothed quantum state will be a mixture of the states in the PRE over the interval of interest, that the Bloch vector will lie within the triangle formed by the PRE states (the grey shaded area in Fig. 6). As a result, the smoothed quantum state will not pass through the maximally mixed state (the centre of the circle) as the observed jump approaches. This is not to say that the length of the Bloch vector does not decrease, just that it does not go to zero. This can be seen in Fig. 7.

VII. PROOF OF RESULT 3: COMPARING THE EXPECTED COST FUNCTIONS
As we have just seen in the last two examples, the commutativity of the filtered state and retrofiltered effect imposes no apparent constraints on the dynamics of the optimal smoothed quantum state. However, the fact that the optimal estimates in these cases are different gives rise to a new question. Alice cannot compute her optimal smoothed quantum state without knowledge of the nature of Bob's measurement, as this determines the set of possible true states. But what is the 'best' measurement unravelling for Bob to perform, from Alice's point of view? Since we already have a metric, the expected cost function, whose minimization defines the optimal estimate ($\rho_{\rm F}$ or $\rho_{\rm S}$), it seems natural to use that expectation value as a measure of how good Alice's estimate is. For our particular cost function, this is equivalent to asking which measurement results in the purest smoothed state. Now, intuitively, the measurement for Bob that would give the greatest purity would be the one that causes $\rho_{\rm S}(t) = \rho^{\rm cl}_{\rm S}(t)$, as Alice then only has to distinguish between perfectly distinguishable true states. Applying this type of logic to the three cases that we have already considered, one would guess that after the case where Bob uses photon detection (having two perfectly distinguishable true states), the next best case would be the adaptive scheme (with three non-orthogonal true states), followed by the X-homodyne case (with a continuous infinity of non-orthogonal states to distinguish). Comparing the expected cost functions for each estimate in Fig. 9, we can see that the above intuition is incorrect. In fact, the complete opposite holds for the majority of the time. This is despite the fact that the classically smoothed quantum state $\rho^{\rm cl}_{\rm S}(t)$ is pure immediately prior to the jump occurring. Actually, we have already discussed how this is linked with a prior decrease in purity in the classical case, in Sec. IV B. With only two states, with the steady state being relatively close to the ground state, and with the state just prior to the jump being the excited state, the classical smoothed state must pass through the maximally mixed state, at which point it must be the worst (highest expected cost) estimator. This leads us to our final result, that the classically smoothed quantum state does not necessarily yield the lowest expected cost function.
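The link between the expected cost and the purity used in this argument can be made explicit. The following is a short sketch, assuming that every possible true state $\rho_{\rm T}$ is pure, that the cost is the trace-square deviation, and that the smoothed state is the mean $\rho_{\rm S} = \sum_{\rho_{\rm T}} \wp_{\rm S}(\rho_{\rm T})\,\rho_{\rm T}$ over a discrete set of true states. The symbols $\mathcal{C}$ and $\wp_{\rm S}$ are notation chosen here for illustration.

```latex
\begin{align}
\mathcal{C}[\rho_{\rm S}]
  &= \sum_{\rho_{\rm T}} \wp_{\rm S}(\rho_{\rm T})\,
     {\rm Tr}\!\left[(\rho_{\rm S} - \rho_{\rm T})^2\right] \\
  &= {\rm Tr}[\rho_{\rm S}^2]
     - 2\,{\rm Tr}\!\Big[\rho_{\rm S} \sum_{\rho_{\rm T}} \wp_{\rm S}(\rho_{\rm T})\,\rho_{\rm T}\Big]
     + \sum_{\rho_{\rm T}} \wp_{\rm S}(\rho_{\rm T})\,{\rm Tr}[\rho_{\rm T}^2] \\
  &= {\rm Tr}[\rho_{\rm S}^2] - 2\,{\rm Tr}[\rho_{\rm S}^2] + 1 \\
  &= 1 - {\rm Tr}[\rho_{\rm S}^2] .
\end{align}
```

Under these assumptions a pure smoothed state has zero expected cost, and for a qubit the maximally mixed state gives the largest possible value, $1/2$.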
We can gain a more general intuition as to why increasing the number of possible true states yields purer smoothed estimates by analyzing the expression for the purity. That is,

$$P[\rho_{\rm S}] = {\rm Tr}[\rho_{\rm S}^2] = \sum_{\rho_{\rm T},\,\rho'_{\rm T} \in \mathbb{T}_t} \wp_{\rm S}(\rho_{\rm T})\,\wp_{\rm S}(\rho'_{\rm T})\,{\rm Tr}[\rho_{\rm T}\,\rho'_{\rm T}],$$

where $\wp_{\rm S}(\rho_{\rm T})$ is the probability of the true state $\rho_{\rm T}$ given Alice's entire record and, for ease of illustration, we have assumed a discrete set of possible true states. We see that the purity is a weighted sum of the overlaps between possible true states. Thus, with only two orthogonal true states in $\mathbb{T}_t$, the only terms that contribute are those with $\rho_{\rm T} = \rho'_{\rm T}$. However, as the number of possible true states increases, additional terms that result from non-orthogonal states will also contribute, increasing the sum. Following this intuition, we arrive at the ordering that is observed in Fig. 9, over the great bulk of times. The exception is when the jump is imminent, where the ordering flips. (From Fig. 9, this flipping might appear to happen at an instant, but zooming in one finds that the three lines do not intersect at the same point in time.) For the photon detection case the reason for the reversal is clear; as discussed above, just before Alice's jump the state is pure and so the expected cost is zero. Something similar, but less dramatic, happens with the adaptive jump case. Just before Alice's jump, one of the three possible true states becomes much more likely than the others, as discussed in Sec. VI. We refer the reader to Fig. 8 again to appreciate the difference from the homodyne case.
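The statement that the purity is a weighted sum of overlaps between possible true states can be illustrated numerically. The following is a minimal sketch, assuming two hypothetical ensembles with equal weights (the states and weights are illustrative choices, not the ensembles or smoothed probabilities of the paper). It evaluates the double sum of overlaps, compares it with the trace of the squared mixture, and shows that non-orthogonal states add cross terms.

```python
import numpy as np

def pure_state(theta):
    """Pure qubit state whose Bloch vector is (sin theta, 0, cos theta)."""
    psi = np.array([np.cos(theta / 2.0), np.sin(theta / 2.0)])
    return np.outer(psi, psi)

def mixture(probs, states):
    """Probability-weighted average of the possible true states."""
    return sum(p * rho for p, rho in zip(probs, states))

def purity_from_overlaps(probs, states):
    """Double sum of p_i * p_j * Tr[rho_i rho_j] over all pairs of states."""
    return sum(
        p * q * np.trace(a @ b)
        for p, a in zip(probs, states)
        for q, b in zip(probs, states)
    )

# Hypothetical ensembles (illustrative only): two orthogonal states,
# and three non-orthogonal states clustered near one pole.
two_orthogonal = [pure_state(t) for t in np.deg2rad([0.0, 180.0])]
three_nonorthogonal = [pure_state(t) for t in np.deg2rad([140.0, 180.0, 220.0])]

probs_two = [0.5, 0.5]
probs_three = [1.0 / 3.0] * 3

purity_two = purity_from_overlaps(probs_two, two_orthogonal)
purity_three = purity_from_overlaps(probs_three, three_nonorthogonal)

# Cross-check against the direct definition Tr[rho^2].
rho_three = mixture(probs_three, three_nonorthogonal)
direct_purity_three = np.trace(rho_three @ rho_three)

print(f"two orthogonal states:       purity = {purity_two:.3f}")
print(f"three non-orthogonal states: purity = {purity_three:.3f}")
```

In the orthogonal ensemble only the diagonal terms survive, whereas the non-orthogonal ensemble picks up positive cross terms. The actual weights in an experiment depend on Alice's record, so this comparison is only indicative of the mechanism.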
As an aside, in Ref. [10] it was shown that the trace-square deviation from the true state is not the only cost function that has the smoothed state as its optimal estimator; the relative entropy does too. One might then ask, does the classically smoothed state give the lowest expected relative entropy? The simple answer is no; in fact, the ordering will not change. This is because, as shown in Ref. [10], when the true state is pure, the expected relative entropy between the smoothed state and the true state reduces to the von Neumann entropy. Since, for qubits, the von Neumann entropy is a monotonic function of the purity, the ordering will remain.
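The monotonic relation between von Neumann entropy and purity for a qubit can be verified directly. The following is a minimal sketch; the function name is a hypothetical helper introduced here, and it uses only the standard fact that a qubit with purity $P$ has eigenvalues $(1 \pm \sqrt{2P - 1})/2$. It does not test the reduction of the relative entropy itself, which is taken from Ref. [10].

```python
import numpy as np

def qubit_entropy_from_purity(purity):
    """Von Neumann entropy (in bits) of a qubit state with the given purity."""
    # Bloch-vector length r satisfies purity = (1 + r**2) / 2.
    r = np.sqrt(np.clip(2.0 * purity - 1.0, 0.0, 1.0))
    eigenvalues = np.array([(1.0 + r) / 2.0, (1.0 - r) / 2.0])
    # Drop zero eigenvalues, using the convention 0 * log(0) = 0.
    nonzero = eigenvalues[eigenvalues > 0.0]
    return float(-(nonzero * np.log2(nonzero)).sum())

# Purity of a qubit ranges from 1/2 (maximally mixed) to 1 (pure).
purities = np.linspace(0.5, 1.0, 51)
entropies = np.array([qubit_entropy_from_purity(p) for p in purities])

# Entropy decreases strictly as purity increases, so any ordering of
# estimates by purity is reversed, but preserved, when ordered by entropy.
is_monotonic = bool(np.all(np.diff(entropies) < 0.0))
print("entropy strictly decreasing in purity:", is_monotonic)
```

Because the map is strictly decreasing, ranking Bob's unravellings by the purity of the smoothed state and ranking them by its entropy give the same ordering of estimators.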

VIII. CONCLUSION
In this paper we have proved three results regarding quantum state smoothing. These three results were motivated by the application of classical smoothing techniques in quantum experiments. These experiments assumed the unobserved information in the environment was made classical in a way that removed the 'quantumness' from the system. Specifically, the quantum system was effectively replaced by a classical system, always in one of a set of orthogonal states, with dynamics described by a hidden Markov model.
One might say that these assumptions were physically reasonable in the context of these experiments. However, we argue that it is important to state an assumption like this explicitly, because in its absence one cannot justify the application of classical smoothing techniques for a quantum state. Our argument is established by our first result, and is backed up by two even stronger negative results regarding the applicability or usefulness of classical smoothing techniques. All three of the results in our paper were proven using a simple qubit system (atom) coupled to three decoherence channels. The measurement by the observer (Alice) has a classical description, describable in terms of transitions between the ground and excited states, and the master equation has no terms that generate coherence between these states.
FIG. 9. The expected cost of the smoothed state when Alice's measurement record is obtained by photon detection and Bob's record is obtained by photon detection (green dashed line), X-homodyne detection (red line), or an adaptive strategy based on weak local oscillators (cyan dot-dashed line) (see Sec. VI). The X-homodyne and adaptive weak-local-oscillator cases both have a lower expected cost than the photon detection (classical) case for most times, because the latter has to pass through the maximally mixed state to reach the (pure) excited state. See Fig. 4 for the parameter details.

Our first result was to show that the optimal smoothed estimate for this system in this situation depends on how the information unobserved by Alice in the environment is assumed to become classical. In particular, we here considered two potential measurement unravellings for the environment, which for convenience we call Bob's measurement schemes. For photon detection, which is equivalent to assuming an orthogonal hidden Markov model for this system, the classical smoothed state (probability distribution over the two orthogonal states) is recovered. But for homodyne detection a different smoothed state is found, albeit still a mixture of the states in the classical model.
Our second result was that one cannot even make a statement on the diagonal basis of the smoothed quantum state without knowing the unravelling of the unobserved information. For this we considered a third type of measurement for Bob: an adaptive strategy using local oscillators and photon counters. Under this scheme, despite the apparent classicality of Alice's measurement, the smoothed quantum state is not, in general, a mixture of ground and excited states.
Our third and final result is the most subtle. In the face of the first two results, one might still hope that the classical smoothing technique would be better than the other techniques; that the simplicity of the photon detection scheme for Bob, relative to the others (homodyne and adaptive), would be reflected in Alice's ability to estimate the resulting unknown record and hence estimate the associated true state. (This ability is quantified by the trace-square deviation between the true state and the state estimate, which is the cost function whose minimization defines the optimal estimate.) Contrary to this intuition, we showed in our example that, most of the time, the expected cost is higher for the classical smoothing technique than for the quantum smoothing technique appropriate for the other unravellings.
Our results raise several questions for future research. Whilst we have shown that a classical description of Alice's filtering and retrofiltering is not sufficient for classical smoothing to be applicable in general, and that a classical description of the true state is sufficient, we do not know whether the latter condition is necessary, or whether some weaker condition would be sufficient. Another question is whether, given a classical description of Alice's filtering and retrofiltering, there could be a different cost function such that the classically smoothed quantum state in Eq. (11) would be the optimal estimator for quantum systems. Another way that one might try to apply classical smoothing would be to diagonalize the filtered state at each instant in time and then compute new weightings for those eigenstates using classical smoothing. Lastly, as mentioned in Ref. [30], it would be interesting to investigate what would happen if Alice assumed that Bob's unravelling had a classical description when in fact it does not. How poorly does this 'wrong-guessed' smoothed state perform in terms of the trace-square distance, and could we find unravellings for Bob (or even Alice) that minimize the deviation from the optimal expected cost?

FIG. 2. (a) The filtered and smoothed probability of the system being in the excited state. Before the jump (at time t = 0) the smoothed probability begins to increase until it reaches unity right before the jump. (b) The purity for the filtered (blue line) and smoothed (green dashed line) states. The purity of the smoothed state begins to decrease before the jump due to the state having access to the future measurement record. (c) The past-future expected cost function for the filtered state, using Eq. (27), and the smoothed state, using Eq. (14). The smoothed estimate is seen to be a better estimator than the filtered estimate, owing to its smaller expected cost before the jump. In all cases, we have taken = 0.05γ and the limit δ → 0 + .

FIG. 5. The same physical system as in Fig. 1, with Bob using the adaptive measurement scheme in App. B. In this figure, LRBS stands for low-reflectivity beam splitter and LM stands for light modulator.

FIG. 8. (a) and (b) show the x-z great circles of the Bloch sphere plotted at two different times prior to the observed jump. On the surface of the circle we have plotted the probability distributions for the case where Bob measures the environment using photodetection (green), an X-homodyne (red), and an adaptive scheme (cyan) (all of which have a zero y-component). The relative area of the bars indicates the occupation probability for that state. Note that the total areas of the green and cyan bars are equal, but the area under the red curve has been scaled by a factor of 3 for clarity. The markers (circle, square and triangle) on the interior correspond to the smoothed mean using the correspondingly coloured distribution. See Fig. 4 for the parameter details.