Approximate quantum non-demolition measurements

With the advent of gravitational wave detectors employing squeezed light, quantum waveform estimation (estimating a time-dependent signal by means of a quantum-mechanical probe) is of increasing importance. As is well known, backaction of quantum measurement limits the precision with which the waveform can be estimated, though these limits can in principle be overcome by "quantum nondemolition" (QND) measurement setups found in the literature. Strictly speaking, however, their implementation would require infinite energy, as their mathematical description involves Hamiltonians unbounded from below. This raises the question of how well one may approximate nondemolition setups with finite energy or finite-dimensional realizations. Here we consider a finite-dimensional waveform estimation setup based on the "quasi-ideal clock" and show that the estimation errors due to approximating the QND condition decrease slowly, as a power law, with increasing dimension. As a result, we find that good QND approximations require large energy or dimensionality. We argue that this result can be expected to also hold for setups based on truncated oscillators or spin systems.

The general problem of waveform estimation is to estimate a classical time-dependent signal $x(t)$ by coupling it to a probe system and repeatedly measuring the probe. The difficulty in using quantum probe systems is that measurement causes backaction on the probe, limiting the overall precision of the scheme [1]. This is relevant not only to very small probes, e.g. optomechanical systems [2,3], but also to very large ones, such as LIGO [4], which has recently begun an observing run employing squeezed light [5] as suggested by Caves [6].
To circumvent these limitations, a specific class of measurements, known as "quantum nondemolition" measurements, was identified [7] and explored, particularly with application to gravitational wave detectors (see, e.g. [8,9] and references therein). An observable $\hat{O}(t)$ (regarded in the Heisenberg picture) is quantum nondemolition if $[\hat{O}(t), \hat{O}(t')] = 0$ for all $t$ and $t'$. When this condition only holds at discrete times, the observable is termed stroboscopic, otherwise continuous.
A static observable is a simple case of a QND observable in which $\hat{O}(t) = \hat{O}(t')$, either for all times (continuous) or periodically spaced times (stroboscopic). A prominent example useful for metrology is the "backaction evading measurement" of the co-rotating position quadrature of a harmonic oscillator [9,10]. More recently, building on Koopman's formulation of classical mechanics in Hilbert space [11], Tsang and Caves showed how appropriate coupling of several quantum systems enables one to construct a collection of continuous QND observables which satisfy any desired classical equations of motion [12]. For instance, the center of mass $\tfrac{1}{2}(\hat{q}_1 + \hat{q}_2)$ and relative momentum $\hat{p}_1 - \hat{p}_2$ of two uncoupled oscillators, one of mass $m$ and the other of negative mass $-m$, are QND observables of the position and momentum of a classical oscillator [13].
Unfortunately, as in this example, the mathematical description of non-static QND observables relies on unphysical Hamiltonians whose implementation would require infinite energy [14]. The mathematical issues are similar to those first raised by Pauli, of whether or not a time observable can exist in quantum theory [15,16]. Of course, one need only approximate the QND condition by finite-dimensional or finite-energy truncations. For instance, to emulate the negative mass oscillator, one can employ symmetric red and blue sidebands of a carrier frequency [12] or use spin systems [17]. The latter have already been applied to magnetometry [18] and position measurement [19]; possible applications to LIGO were recently examined by Khalili and Polzik [20]. The question then becomes how the approximation limits the estimation scheme. These limitations may bear (among others) upon the properties of the waveform x (t) to be estimated, or the frequency at which one is allowed to perform measurements. It is an interesting question of principle, and potentially of practical relevance in the near future, to determine how stringent these restrictions are for a given dimension or energy constraint. For instance, does the approximate quantum nondemolition setup approach the exact one exponentially fast as a function of dimension/energy, or with a slower convergence? One may regard the approximation as especially forgiving in the former case, as only a very modest investment would be needed to obtain excellent performance.
Another possible approximate QND system that we explore here involves the "quasi-ideal clock" of [21], a finite-dimensional approximation of an idealized clock governed by the (unbounded) Hamiltonian $\hat{H} = v\hat{p}$, for some constant velocity $v$. The dynamics of the idealized clock are just pure translation, $\hat{q}(t) = \hat{q}(0) + vt$ and $\hat{p}(t) = \hat{p}(0)$, as for the case of a classical free particle, and so its position records the time. Indeed, the idealized clock can be viewed as an instance of Tsang and Caves's construction, since the center of mass and relative momentum of two free (quantum) particles with opposite masses also satisfy the equations of motion of the classical free particle (with $v = (\hat{p}_1 - \hat{p}_2)/m$ in this case).
The quasi-ideal clock is a particularly simple QND system, as its free dynamics approximate the idealized case exponentially well in the dimension $d$ [21]. Nevertheless, as we show in this Letter, when subject to repeated measurement for waveform estimation, the quasi-ideal clock only approximates the idealized nondemolition setting with errors decaying as a power law in the dimension. Namely, backaction limits the minimum achievable measurement precision as well as the minimum detectable signal strength to scale as $d^{-1/4}$ and $d^{-1/2}$, respectively, which translates into an energy scaling of $E^{-1/4}$ and $E^{-1/2}$. We also discuss the other options and argue that they may be expected to have a similar power law scaling.
Description of quantum measurement.-For context, we begin by reviewing the general framework of quantum measurement, following [1, Chapter 5]. Given a quantum system prepared in a state $|\Psi\rangle$, consider $n$ sequential measurements corresponding to the observables $\hat{q}_1, \ldots, \hat{q}_n$.
The (non-normalized) post-measurement state, given that one observed the outcomes $\tilde{q}_1, \ldots, \tilde{q}_n$, is given by $|\Psi'\rangle = \hat{\Omega}_n(\tilde{q}_n) \cdots \hat{\Omega}_1(\tilde{q}_1) |\Psi\rangle$, where $\hat{\Omega}_j(\tilde{q}_j)$ is the Kraus operator corresponding to outcome $\tilde{q}_j$ of the $j$th measurement. The joint probability distribution of the outcomes $\tilde{q}_1, \ldots, \tilde{q}_n$ is obtained from the norm squared of $|\Psi'\rangle$. Typically, the Kraus operator $\hat{\Omega}_j(\tilde{q}_j)$ is constructed by "smearing" the projector onto the eigenspace of $\hat{q}_j$ associated to the eigenvalue $\tilde{q}_j$, e.g. by a Gaussian. Physically, this corresponds to making an imprecise measurement of $\hat{q}_j$. We denote the imprecision by $\sigma_m$, and assume it is the same for all $j$. An interesting particular case is when the observables $\hat{q}_1, \ldots, \hat{q}_n$ are given by a Heisenberg picture operator $\hat{q}$ evaluated at different times: $\hat{q}_j := \hat{q}(t_j)$ ($t_1 \leq \ldots \leq t_n$), which amounts to considering a measurement of a fixed observable $\hat{q}$ at times $t_1, \ldots, t_n$.
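The Gaussian "smearing" described above can be illustrated with a minimal numerical sketch. The toy spectrum, the outcome grid and the function names `gaussian_kraus` and `povm_integral` are illustrative assumptions and are not taken from [1]; the sketch only checks that Kraus operators of the assumed diagonal Gaussian form resolve the identity when $\hat{\Omega}^\dagger\hat{\Omega}$ is integrated over all outcomes.

```python
import numpy as np

def gaussian_kraus(outcome, eigenvalues, sigma_m):
    """Diagonal Kraus operator for an imprecise measurement with result `outcome`.

    Assumption: the observable is diagonal with the given eigenvalues and each
    projector is weighted by a Gaussian amplitude of imprecision sigma_m.
    """
    weights = (2 * np.pi * sigma_m**2) ** -0.25 * np.exp(
        -(outcome - eigenvalues) ** 2 / (4 * sigma_m**2)
    )
    return np.diag(weights)

def povm_integral(eigenvalues, sigma_m, grid):
    """Riemann-sum approximation of the integral of Omega^dagger Omega over outcomes."""
    step = grid[1] - grid[0]
    total = np.zeros((len(eigenvalues), len(eigenvalues)))
    for outcome in grid:
        omega = gaussian_kraus(outcome, eigenvalues, sigma_m)
        total += (omega.T @ omega) * step   # omega is real, so .T is the adjoint
    return total

eigenvalues = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])  # toy spectrum (assumption)
grid = np.linspace(-12.0, 12.0, 4801)                # outcome grid (assumption)
completeness = povm_integral(eigenvalues, 0.5, grid)
```

Under these assumptions `completeness` should be numerically indistinguishable from the identity matrix, which is the normalization required of any set of Kraus operators.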
This gives a prescription for computing the joint probability distribution of the measurement outcomes for any series of measurements. The moments of this distribution enjoy particularly simple expressions when the measurements are linear, meaning the corresponding observables commute up to a scalar: $[\hat{q}_j, \hat{q}_l] = i c_{jl}$, for $c_{jl} \in \mathbb{C}$. Linear measurements cover several elementary quantum systems including the harmonic oscillator and free particle, which are of special interest to metrology. Here, measurement backaction does not show up in the first moments, as $\langle \tilde{q}_j \rangle = \langle\Psi|\hat{q}_j|\Psi\rangle$. It does however show up in the second moments. Letting $B_{jl} = \langle (\tilde{q}_j - \langle\tilde{q}_j\rangle)(\tilde{q}_l - \langle\tilde{q}_l\rangle) \rangle$ and $B^{\mathrm{init}}_{jl}$ the symmetrized correlation evaluated using $|\Psi\rangle$, [1, Chapter 5] shows that $B_{jl}$ is a sum of three terms, Eq. (1). The first term describes the contributions from the wavefunction, the second term the imprecision of measurement, and the third term the quantum backaction. Precise measurements make the third term large, and with it the variance of the measurement result.
The quasi-ideal clock.-Consider an odd $d$-dimensional Hilbert space, whose basis elements $|k\rangle$ we label using the integers $\mathbb{Z}_d$, ranging from $-\frac{d-1}{2}$ to $\frac{d-1}{2}$. The Hamiltonian of the quasi-ideal clock is simply $\hat{H}_d = \frac{2\pi}{\sqrt{d}} \sum_{k\in\mathbb{Z}_d} k\, |k\rangle\langle k|$. The discrete Fourier transform of the energy eigenstates defines the "time eigenstates" $|\theta_j\rangle := \frac{1}{\sqrt{d}} \sum_{k\in\mathbb{Z}_d} e^{-2\pi i jk/d} |k\rangle$, which are eigenstates of the "time operator" $\hat{T}_d := \sum_{j\in\mathbb{Z}_d} j\, |\theta_j\rangle\langle\theta_j|$. The time eigenstates have the property that $|\theta_j\rangle$ is transformed to $|\theta_{j+1}\rangle$ under evolution by time $1/\sqrt{d}$, meaning that the system stroboscopically emulates the idealized case of pure translation [22]. Remarkably, this feature persists for all evolution times, up to an exponentially small error (as a function of the dimension or energy), provided the state is restricted to the "quasi-ideal states" of [21] (and not necessarily otherwise [23]). Essentially, these states consist of a Gaussian superposition of time eigenstates, with mean energy $E \propto d$ above the ground state and width growing as $d^{\lambda}$ for some $\lambda \in (0, 1)$.
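The stroboscopic translation property stated above can be checked directly in small dimension. The following is a minimal sketch; the function names and the choice $d = 7$ are illustrative assumptions, while the Hamiltonian and time eigenstates follow the definitions in the text.

```python
import numpy as np

def quasi_ideal_clock(d):
    """Return the labels, Hamiltonian and time eigenstates for odd dimension d."""
    if d % 2 == 0:
        raise ValueError("d must be odd")
    labels = np.arange(d) - (d - 1) // 2            # integers -(d-1)/2 ... (d-1)/2
    hamiltonian = 2 * np.pi / np.sqrt(d) * np.diag(labels)
    # Column c holds |theta_j> with j = labels[c], written in the energy basis.
    thetas = np.exp(-2j * np.pi * np.outer(labels, labels) / d) / np.sqrt(d)
    return labels, hamiltonian, thetas

def evolve(hamiltonian, time):
    """Free evolution exp(-i H t); H is diagonal, so exponentiate its diagonal."""
    return np.diag(np.exp(-1j * np.diag(hamiltonian) * time))

d = 7
labels, H, thetas = quasi_ideal_clock(d)
U = evolve(H, 1 / np.sqrt(d))
shifted = U @ thetas                      # evolve every time eigenstate at once
expected = np.roll(thetas, -1, axis=1)    # column c now holds |theta_{j+1}>
```

Here `shifted` and `expected` agree, including at the wrap-around from $j = \frac{d-1}{2}$ back to $j = -\frac{d-1}{2}$, which reflects the periodicity of the labels in $\mathbb{Z}_d$.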
Consider, now, the quasi-ideal clock coupled to a classical waveform through the time-dependent Hamiltonian $(1 + x(t))\hat{H}_d$. We eschew the question of how to engineer such a coupling, as our focus is on in-principle limits. For the idealized clock, $\hat{q}(t) = \hat{q}(0) + t + \int_0^t d\tau\, x(\tau)$, so the waveform can in principle be reconstructed from $q(t)$. The finite-dimensional analog of the position operator $\hat{q}$ is the rescaled time operator $\hat{\xi}_j := \hat{T}_d(t_j)/\sqrt{d}$, as evolution by time $1/\sqrt{d}$ advances the clock value by precisely this amount when $x = 0$. Hence, setting $t_j = j/\sqrt{d}$ for integer $j$, one finds that $\hat{\xi}_j - \hat{\xi}_{j-1}$ furnishes an estimate of $\int_{t_{j-1}}^{t_j} d\tau\, (1 + x(\tau))$. Observe that the resulting measurements are not linear.
We take the initial state to be a quasi-ideal state of variance $\sigma_s^2 \frac{d}{4\pi}$, where $\frac{1}{d} \ll \sigma_s^2 \ll d$, and the measurement precision to be given by $\sigma_m$. The evolution of the clock state between $t_{j-1}$ and $t_j$ is given by a unitary generated by $(1 + x(t))\hat{H}_d$. Equivalently, one may regard this as a measurement of a freely evolving quasi-ideal clock at successive time intervals $\frac{\Delta t_1}{\sqrt{d}}, \ldots, \frac{\Delta t_n}{\sqrt{d}}$.

Backaction scaling.-It would be desirable to compute the lowest-order moments $\langle \xi_j \rangle$ and $\langle \xi_j \xi_l \rangle$, but this is technically awkward due to periodic boundary conditions. Instead, for integers $\ell$ and $m$, investigating the expectation of complex exponentials of the outcomes (such as $e^{2\pi i \xi_j/\sqrt{d}}$) leads to a much more tractable problem and nonetheless allows for a nice analogy with the theory of linear measurements. These quantities carry information about both the expected values of the measurements and their correlations. For illustration, the random variable $X \sim \mathcal{N}(\mu, \sigma^2)$ with $\alpha \in \mathbb{R}$ yields $\langle e^{i\alpha X} \rangle = e^{i\alpha\mu} e^{-\frac{1}{2}\alpha^2\sigma^2}$.
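The Gaussian illustration above, in which the phase of $\langle e^{i\alpha X}\rangle$ encodes the mean and its modulus encodes the variance, can be checked by sampling. This is a small sketch with arbitrarily chosen parameter values (assumptions made only for the example).

```python
import numpy as np

def empirical_characteristic(samples, alpha):
    """Monte Carlo estimate of <exp(i alpha X)> from samples of X."""
    return np.mean(np.exp(1j * alpha * samples))

def gaussian_characteristic(mu, sigma, alpha):
    """Closed form exp(i alpha mu) exp(-alpha^2 sigma^2 / 2) for X ~ N(mu, sigma^2)."""
    return np.exp(1j * alpha * mu) * np.exp(-0.5 * alpha**2 * sigma**2)

rng = np.random.default_rng(0)
mu, sigma, alpha = 0.3, 0.8, 1.5          # illustrative values (assumption)
samples = rng.normal(mu, sigma, size=200_000)
estimate = empirical_characteristic(samples, alpha)
exact = gaussian_characteristic(mu, sigma, alpha)
```

The two values agree up to the statistical error of the sample mean, of order $1/\sqrt{N}$ for $N$ samples.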
In general, the expectation of the phase $e^{2\pi i \xi_j/\sqrt{d}}$ can be divided into three contributions $C_1$, $C_2$ and $C_3$, together with a phase factor, where $C_1$ depends only on $\sigma_s$, $C_2$ only on $\sigma_m$, and $C_3$ on both $\sigma_s$ and $\sigma_m$ as well as $\{\Delta t_j\}_j$. The first contribution, consisting of $C_1$ and the phase factor, is roughly analogous to the contribution of the wavefunction to the first and second moments in the linear case: it turns out that $C_1$ behaves as $e^{-\pi\sigma_s^2/(2d)}$ for large $d$. The second contribution, $C_2$, becomes trivial, i.e. equal to 1, when $\sigma_m \to 0$. For this reason one may regard it as analogous to the second term in (1). Since we have identified the analogs of all the factors appearing in (1) except the term coming from the quantum backaction, one may by default regard the $C_3$ contribution as analogous to the backaction. There is also a more positive argument supporting this conclusion, as one can show that $C_3 = 1$ if all the $\Delta t_j$ are integers. This corresponds to commuting $\hat{T}_d$ operators, i.e. the case of no backaction.
It turns out that $C_3$ has a simple form related to a random walk. Here we give the general picture; the precise details are spelled out in §B 3 and hold for all $\ell$ and $m$. The walk takes place on the discrete ring $\mathbb{Z}_d$, and the step size varies according to a roughly Gaussian distribution of zero mean and variance $\frac{d}{4\pi\sigma_m^2}$. Nontrivial contributions to $C_3$ occur whenever the walk lands on $\frac{d-1}{2}$. Calling $Z_j$ the position at step $j$, $C_3$ is given by an expectation over the walk, Eq. (3), where $\mathbb{E}_{z_1}$ denotes the expectation for a walk starting at $z_1$ and the distribution $P$ of $z_1$ depends on $\sigma_s$.
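To build intuition for the random-walk picture, the following sketch simulates a walk on the ring and records how often it sits on the site $\frac{d-1}{2}$. It is only a caricature under stated assumptions: the steps are taken to be Gaussian with variance $d/(4\pi\sigma_m^2)$ and rounded to integers, the walk starts at 0 rather than from the distribution $P$, and the function name `boundary_visit_fraction` is hypothetical. It does not evaluate $C_3$ itself.

```python
import numpy as np

def boundary_visit_fraction(d, sigma_m_sq, n_steps, start=0, seed=0):
    """Fraction of steps at which a walk on Z_d sits on the site (d-1)/2.

    Assumption: steps are Gaussian with variance d / (4 pi sigma_m^2), rounded
    to the nearest integer; the text only says the step law is roughly Gaussian.
    """
    rng = np.random.default_rng(seed)
    half = (d - 1) // 2
    step_std = np.sqrt(d / (4 * np.pi * sigma_m_sq))
    steps = np.rint(rng.normal(0.0, step_std, size=n_steps)).astype(int)
    # Cumulative position, wrapped back onto the labels -half ... half.
    positions = (start + np.cumsum(steps) + half) % d - half
    return np.mean(positions == half)

d = 101
sharp = boundary_visit_fraction(d, sigma_m_sq=1e-3, n_steps=20_000)  # precise measurement
weak = boundary_visit_fraction(d, sigma_m_sq=1e4, n_steps=20_000)    # imprecise measurement
```

With a precise measurement the steps are large and the walk spreads over the whole ring, so it visits the last site a fraction of roughly $1/d$ of the time. With an imprecise measurement the walk barely moves and never reaches it, which matches the qualitative statement that nontrivial contributions to $C_3$ require the walk to reach $\frac{d-1}{2}$.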
To proceed further, we must specify the $\Delta t_j$ (or equivalently, $x(t)$). As previously mentioned, the case of integer $\Delta t_j$ gives $C_3 = 1$. Heuristically, the half-integer choice, e.g. $\Delta t_j = \frac{1}{2}$, can be expected to generate the largest backaction on the system, as this is the "furthest away" (for a comparable spacing of measurements) from the case of no backaction. Now let the number of measurements scale with $d$ as $n = 2t\sqrt{d}$, so that the total measurement time is fixed at $\sum_{j=1}^{n} \frac{\Delta t_j}{\sqrt{d}} = t$, independently of $d$. The behavior of $C_3$ in terms of the scaling of $\sigma_m^2$ with respect to $d$ is worked out in detail in §B 3 b, but the results can be appreciated from the form of (3). The walk will spend a significant amount of time on the last position when the variance after $n$ steps is the size of the ring: $n \frac{d}{\sigma_m^2} \gg d^2$, which for the given choice of $n$ gives the condition $\sigma_m^2 \ll 1/\sqrt{d}$ (up to a factor depending on $t$). The detailed calculation shows that in this case $C_3$ is bounded away from 1 by $2t/\sqrt{d}$, while $C_3 \approx 1$ up to an exponentially small deviation in the opposite regime. A numerical simulation of the difference of the two cases is illustrated in Figure 1.
Combining the scaling behavior of $C_2$ and $C_3$, it is apparent that one cannot achieve a variance smaller than $1/\sqrt{d}$ on the measurement of $\xi_n$. Due to the form of the Hamiltonian, this is essentially $1/\sqrt{E}$, for $E$ the mean energy of the quasi-ideal clock. This scaling can be approached by taking $\sigma_m^2 \propto d^{-\frac{1}{2}+\varepsilon}$, with $\varepsilon > 0$ small. If, however, $\sigma_m^2 \ll 1/\sqrt{d}$, the variance on $\xi_n$ is at least $\sqrt{d}$. Although these results were derived for the case of half-integer $\Delta t_j$, they generalize in a straightforward way to the case of a random waveform consisting of white noise of variance $\sigma^2$: $x(t) = \sigma \frac{dW(t)}{dt}$, as shown in §B 4 c (a standard way of benchmarking a statistical estimator; see the Cramér-Rao bound in [24]). In case the waveform is completely general, it is still possible to show that one may achieve a variance as small as $d^{-\frac{1}{2}+\varepsilon}$ (for all fixed $\varepsilon > 0$) on the measurement of $\xi_n$, but we have no clear proof that this scaling is optimal.
Waveform estimation.-It remains to be seen whether one can perform efficient waveform estimation given the above constraints on $\xi_j$. Returning to the case of continuous $x$, given that typical errors of $\xi_j - \xi_{j-1}$ will be roughly of size $d^{-1/4}$ and from this quantity we aim to estimate […] measurements are too sharp for how frequently they are occurring, and we should either contemplate weaker measurements at the same frequency or less frequent measurements. Suppose that the number of measurements $n$ in fixed time $t$ scales as $d^{\gamma}$ for $0 \leq \gamma \leq \frac{1}{2}$. The condition on the variance is now $\sigma_m^2 \gg d^{\gamma-1}$, so that the condition on the waveform becomes $|x| \gg d^{\frac{3}{2}\gamma - \frac{1}{2}}$. As a result, provided $\gamma < \frac{1}{3}$, the smallest detectable signal is allowed to vanish as $d \to \infty$, though strictly slower than $d^{-1/2}$. Of course, measuring less frequently impacts the useful bandwidth of the procedure, and the maximum detectable frequency $f_{\max}$ is bounded by $f_{\max} < d^{\gamma}$. For example, choosing $\gamma = \frac{1}{6}$ and working in terms of $E$ gives $|x| \gg E^{-1/4}$ and $f_{\max} < E^{1/6}$.

Improved scalings can be obtained by more sophisticated estimation procedures, but likely not an exponential improvement. For instance, one might like to employ smoothing, as formulated in the quantum case by Tsang [25-27], or indeed any other estimation technique designed for continuous signals. To do so requires a formulation of the measurement process in a suitable limit as a continuous-time process. Note that our setup is outside the usual limiting procedure of ever weaker measurements made ever more often [1,28-30]. In contrast, here we call for stronger measurements made ever more often on an ever larger system, for particular scaling of the former two as a function of the latter.
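The bookkeeping of exponents for $n \propto d^{\gamma}$ measurements can be summarized in a few lines. The sketch below simply evaluates the three scalings quoted in the text ($\sigma_m^2 \gg d^{\gamma-1}$, $|x| \gg d^{\frac{3}{2}\gamma-\frac{1}{2}}$, $f_{\max} < d^{\gamma}$) with exact rational arithmetic; the function name and dictionary keys are illustrative assumptions. One way to read the signal exponent is that a per-measurement error of order $\sigma_m \gg d^{(\gamma-1)/2}$ must be resolved over an interval of length of order $d^{-\gamma}$.

```python
from fractions import Fraction

def scaling_exponents(gamma):
    """Exponents of d in the scalings quoted in the text, for n ~ d**gamma."""
    gamma = Fraction(gamma)
    if not 0 <= gamma <= Fraction(1, 2):
        raise ValueError("gamma must lie in [0, 1/2]")
    return {
        "measurement_variance": gamma - 1,                      # sigma_m^2 >> d^(gamma - 1)
        "min_signal": Fraction(3, 2) * gamma - Fraction(1, 2),  # |x| >> d^(3 gamma/2 - 1/2)
        "max_frequency": gamma,                                 # f_max < d^gamma
    }

example = scaling_exponents(Fraction(1, 6))
threshold = scaling_exponents(Fraction(1, 3))
```

For $\gamma = \frac{1}{6}$ the signal exponent evaluates to $-\frac{1}{4}$ and the bandwidth exponent to $\frac{1}{6}$; at $\gamma = \frac{1}{3}$ the signal exponent reaches zero, which is the threshold beyond which the smallest detectable signal no longer vanishes as $d \to \infty$.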
Using our results and techniques from [31, sections 7 and 13], it can be shown that as $d \to \infty$, the measurement process converges weakly to a well-defined continuous-time process in certain cases: to a deterministic motion if the measurement is moderately sharp ($1/\sqrt{d} \ll \sigma_m^2 \ll 1$) and to a Cauchy process with drift for sharp measurements ($\sigma_m = 0$). The former case is precisely the behavior we expect for an idealized clock, namely zero error, which reinforces our conclusions above. A different limit procedure is needed to construct continuous-time estimators for finite $d$, but studying the speed of convergence of this limit may be useful in this regard. Finally, to underscore the relatively crude nature of our estimator, we note that estimating the position of a forced oscillator, as considered e.g. in [32], using finite differences does not seem to work, as their variance diverges in the limit, as detailed in §A.
Other approximate QND systems.-The oscillators in the QND construction of [12] can be approximated by truncated oscillators or by spin systems via the Holstein-Primakoff approximation. Both have free evolution that approximates the idealized case well. Indeed, the time evolution of spin coherent states exactly emulates the idealized case, since a rotation of a spin coherent state by $J_z$ produces another spin coherent state. The truncated oscillator behaves similarly to the quasi-ideal clock in that the free evolution is exponentially good, provided the wavefunction is taken much wider than $1/\sqrt{d}$ but much narrower than $\sqrt{d}$, where $d$ is the truncated dimension, and the energy scales as the dimension. However, it appears from numerical investigation that the accuracy of waveform estimation scales less favorably with the dimension. Coupled with the fact that two oscillators are needed for the QND setup of [12], it is not unreasonable to expect a worse scaling in estimator accuracy with energy.
Conclusion.-The polynomial scaling of the error in waveform estimation in dimension or energy of the quasi-ideal clock echoes similar error scalings when it is used for timekeeping [33] or covariant quantum error correction [34]. Indeed, in [34] the bound achieved via the quasi-ideal clock is proven to be the optimally achievable rate permissible by quantum mechanics [35], suggesting that the scaling derived in this Letter may also be optimal. The simple structure of the quasi-ideal clock enables relatively straightforward mathematical analysis, compared with the double oscillator systems; though in light of their practical application [12,13,20], it would be interesting and useful to more thoroughly characterize the error scaling in those cases. To enable a more sophisticated error analysis, it would also be useful to formulate a continuous limit. Perhaps, unlike our considerations above, one can fix $d$ and scale $\sigma_m$ and $n$ to obtain a useful limit.
Acknowledgments.-We acknowledge useful discussions with Carlton M. Caves. This work was supported by the Swiss National Science Foundation (SNSF) via the National Centre of Competence in Research "QSIT". M.P.W. acknowledges funding from the SNSF (AMBIZIONE Fellowship PZ00P2_179914).

APPENDIX
Here we present detailed proofs and more extensive discussions of the main results of the paper. In section A, we treat the repeatedly measured quantum harmonic oscillator coupled to a classical time-dependent force; we show that owing to quantum backaction, one may not precisely estimate the latter by means of continuous measurements (hence the motivation behind non-demolition measurement). We then establish (section B 3) the main result of the paper, involving a waveform estimation setup consisting of a continuously measured quasi-ideal clock coupled to the time-dependent signal. Before moving to the main matter, we prove some instructive preliminary estimates on the freely evolving (i.e. without measurement) quasi-ideal clock. Finally, section C gathers all the mathematical results used in the body of this document.
Appendix A: Measured quantum harmonic oscillator coupled to a time-dependent classical force

The case of the driven harmonic oscillator, described by a time-dependent Hamiltonian, is of special interest to metrology. In this setup, generally speaking, the purpose is to infer something about the time-dependent force $x$ by monitoring the position of the oscillator (or more generally any quadrature, the angle of the quadrature being possibly time-dependent). The two-time commutators of the Heisenberg-picture $\hat{q}$ and $\hat{p}$ operators for this system are manifestly independent of the classical force, in particular for the position. Now, it is interesting to figure out what one may learn about $x$ from measuring the position $q$. It may be tempting (at least conceptually) to think of continuously measuring $q$, yielding some continuous time series $q(t)$, and then to use $q''(t) + q(t)$ as an estimator for $x(t)$ (here and in the following, primes indicate derivatives with respect to $t$). Since the formalism described up to now deals with discrete instead of continuous measurements, one should start from a discrete setting and approach the continuum one by a limiting process where the scaling of the free parameters (essentially the $\Delta q$) is to be specified. Let then $\tau > 0$ denote a "unit time step". This means that one measures the positions at times $t_1 = 0, t_2 = \tau, \ldots, t_n = (n-1)\tau$ and that we approximate the second derivative in the idealized estimator $q''(t) + q(t)$ by a finite difference. In other words, our estimate for $x(t_j)$ replaces $q''$ by the centered second difference of the measured positions. One may now compute the variance of this estimator.
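For orientation, the driven oscillator described above can be written out explicitly under assumed conventions. The following is a sketch, not a transcription of the original displayed equations: it assumes unit mass and frequency, $\hbar = 1$, and a coupling $-x(t)\hat{q}$, which is the choice consistent with the estimator $q'' + q$ for $x$; the notation $\tilde{x}_j$ and $\tilde{q}_j$ for the estimate and the measured positions is likewise an assumption.

```latex
\begin{align}
  \hat{H}(t) &= \tfrac{1}{2}\hat{p}^2 + \tfrac{1}{2}\hat{q}^2 - x(t)\,\hat{q}, \\
  \hat{q}(t) &= \hat{q}\cos t + \hat{p}\sin t + \int_0^t dt'\, x(t')\sin(t-t'), \\
  \hat{p}(t) &= -\hat{q}\sin t + \hat{p}\cos t + \int_0^t dt'\, x(t')\cos(t-t'), \\
  [\hat{q}(t), \hat{q}(t')] &= i\,\sin(t'-t), \\
  \tilde{x}_j &= \frac{\tilde{q}_{j+1} - 2\tilde{q}_j + \tilde{q}_{j-1}}{\tau^2} + \tilde{q}_j .
\end{align}
```

With these conventions the Heisenberg equations give $\hat{q}'' + \hat{q} = x$, the commutator is independent of $x$ as stated in the text, and the last line is the finite-difference version of the idealized estimator.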
We will now focus on the contribution of the backaction terms. Furthermore, we will assume that the Kraus operator is Gaussian, so that $\Delta q_j$ is minimized given $\sigma_j$. This leads to an expression containing the terms $\cdots \sum_{1\le n\le j-1} k_{j,n}k_{j-1,n}\sigma_n^2 + 2(\tau^2-2)\sum_{1\le n\le j} k_{j,n}k_{j+1,n}\sigma_n^2 + 2\sum_{1\le n\le j-1} k_{j-1,n}k_{j+1,n}\sigma_n^2 \cdots$. One now rewrites the terms in $k$ explicitly to exhibit their scaling in $\tau$, Eq. (A20). But assuming $\tau \le 1$ (which is reasonable since one wants $\tau \downarrow 0$ in the end) and therefore using $(\tau-2)^2 \ge 1$ and $\sin(\tau) \ge \frac{2}{\pi}\tau$, […] and this minimum is achieved only if $\sigma_j^2$ scales as $\frac{1}{\tau}$. One can check that the other terms in equation (A10) do not grow more rapidly as $\tau \downarrow 0$. This means that for a given force $x$, one may certainly not let $\tau \downarrow 0$ if one wants to get any information at all about the force! To compute the expectation of the estimator (therefore for finite $\tau$), let us remark that equations (A2), (A3) yield an explicit expression for it. Provided $x$ is regular enough and $\int_0^t dt'\, x(t')\sin(t-t')$ is bounded by some constant when $t \ge 0$, the expectation of this quantity approximates $x(t)$ with an error $O(\tau^2)$.
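The $O(\tau^2)$ bias of the finite-difference estimator can be illustrated on the mean trajectory alone, ignoring all quantum noise and backaction. The sketch below assumes unit mass and frequency, an illustrative force $x(t) = \sin(2t)$, and initial conditions $q(0) = q'(0) = 0$; these choices and the function names are assumptions made for the example.

```python
import numpy as np

def force(t):
    """Illustrative force x(t) = sin(2t) (an assumption for this example)."""
    return np.sin(2 * t)

def mean_position(t):
    """Solution of q'' + q = sin(2t) with q(0) = q'(0) = 0."""
    return (2 / 3) * np.sin(t) - (1 / 3) * np.sin(2 * t)

def finite_difference_estimate(t, tau):
    """Centered second difference plus q: the discrete analog of q'' + q."""
    second_diff = (
        mean_position(t + tau) - 2 * mean_position(t) + mean_position(t - tau)
    ) / tau**2
    return second_diff + mean_position(t)

t = 1.0
err_coarse = abs(finite_difference_estimate(t, 0.1) - force(t))
err_fine = abs(finite_difference_estimate(t, 0.05) - force(t))
ratio = err_coarse / err_fine   # close to 4 for an error of order tau^2
```

Halving $\tau$ reduces the bias by about a factor of four, as expected for a second-order scheme. The point of the appendix is that the variance behaves in the opposite way, so the bias cannot be removed by letting $\tau \downarrow 0$.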
Appendix B: Quasi-ideal clock: general properties and our setup

In this part, which will lead to the proof of our main results, we will focus on the so-called "quasi-ideal clock". This system, studied extensively in [1,2], rests upon a discretization of the phase space making extensive use of the discrete Fourier transform. This allows for more tractable exact computations or estimates than with other discretization schemes such as the Holstein-Primakoff approximation or the mere truncation of the infinite-dimensional annihilation operator. An example of application [1] is the proof that the accuracy $R \in [1, \infty)$ of quantum clocks may scale with dimension $d$ as $d^{2-\eta}$ for arbitrarily small fixed $\eta \in (0, 2)$, while $d$ is an upper bound for classical stochastic clocks. In this section, we will be concerned with the incorporation of measurement into the quasi-ideal clock, keeping as close as possible to the general framework presented in the main text and in greater detail in [3, Chapter 5]. As our setting and definitions slightly differ from those of the aforementioned papers, we will review them first. Then, we will apply the definitions and results thus introduced to show that the quasi-ideal clock approximates the QND condition up to exponentially decaying terms in the dimension. Finally, we will move on to the analysis of the measurement; an interesting result will be the emergence of an error decreasing as a power law in $d$ (as opposed to exponentially without measurement) compared to the idealized infinite-dimensional system.

General setting
The setting considered here is that of the quasi-ideal clock, described in detail in [1,2]. We recall in this paragraph the elements essential to the understanding of the following.
The $d$-dimensional Hilbert space of the quasi-ideal clock (take $d$ odd for simplicity) is $\mathrm{span}\{|n\rangle\}$ […]. Note that it remains true for all $t \in \mathbb{R}$ and all sequences […] that […] forms an orthonormal set.
We now describe the states on which we will study the dynamics of the clock. Here, we still follow [1] in essence, namely we use a Gaussian superposition of time eigenstates. However, instead of using actually Gaussian weights, we resort to Jacobi $\theta$ functions which, in essence, are periodized Gaussians. These Jacobi $\theta$ functions are better adapted to periodic systems and to the use of the discrete Fourier transform. Some important properties of these functions are spelled out in section C 2. We will consider an initial state $|\Psi_0\rangle$ of the form […], where $n_0$ is an integer and $\xi > 0$ parametrizes the width of the state. This width will scale with $d$ […]. We also introduce, following the notations of the aforementioned papers, a parameter $\alpha_0 \in [0, 1]$ which measures the distance of $n_0$ from its extreme possible values $-\frac{d-1}{2}$ and $\frac{d-1}{2}$ […]. This state is normalized according to the results of section C 2. Furthermore, the discrete Fourier transform $\tilde{\psi}$ of $\psi$ is given by […]. Using one of the two expressions of $N'$ stated above, recalling that both $\frac{d}{\xi^2}$ and $\xi^2 d$ go to infinity (following some power law in $d$) given the scaling for $\xi^2$ we chose, and using the transformation property (C6), one obtains […]. The following lemma, combined with Poisson's summation formula, will be very useful to derive various estimates concerning the quasi-ideal clock.
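The statement that Jacobi $\theta$ functions are periodized Gaussians, and the role of Poisson's summation formula, can be illustrated with the standard identity $\sum_{n\in\mathbb{Z}} e^{-\pi t (x+n)^2} = t^{-1/2} \sum_{k\in\mathbb{Z}} e^{-\pi k^2/t} e^{2\pi i k x}$. The sketch below checks it numerically; the normalization used here is the textbook one and is an assumption, since the conventions of section C 2 are not reproduced in this excerpt.

```python
import numpy as np

def periodized_gaussian(x, t, cutoff=50):
    """Sum over integer translates of the Gaussian exp(-pi t y^2), truncated."""
    n = np.arange(-cutoff, cutoff + 1)
    return np.sum(np.exp(-np.pi * t * (x + n) ** 2))

def theta_fourier_series(x, t, cutoff=50):
    """Fourier-series form of the same function, obtained by Poisson summation."""
    k = np.arange(-cutoff, cutoff + 1)
    series = np.sum(np.exp(-np.pi * k**2 / t) * np.exp(2j * np.pi * k * x))
    return series.real / np.sqrt(t)

x = 0.3
narrow = (periodized_gaussian(x, 2.0), theta_fourier_series(x, 2.0))
wide = (periodized_gaussian(x, 0.5), theta_fourier_series(x, 0.5))
```

The two representations converge fast in complementary regimes: the sum over translates when the Gaussian is narrow (large $t$), the Fourier series when it is wide (small $t$). This is the practical reason such transformation properties are useful for estimates on periodic systems.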
Now, using the bound on $f$ as well as proposition 12, one can upper bound the line (B16) as […]. Following similar steps to the proof of lemma 1, one can prove:

Lemma 2. Let $f$ denote a complex-valued function defined on $\mathbb{R}$ such that there exist $c, \beta \geq 0$ for which […]. Let $z$ lie between $-\frac{1}{2}$ and $\frac{1}{2}$, $\xi > 0$ and $d \in \mathbb{N}$ odd. Then the estimate (B20) holds.

Approximation of the QND condition by the quasi-ideal clock
In this section, we will be concerned with how well the time operator $\hat{t}_d$ of the quasi-ideal clock approximates the QND condition with respect to the Hamiltonian $\hat{H}_d$. More precisely, we will derive a bound for how close $[\hat{t}_d(t_1), \hat{t}_d(t_0)] |\Psi_0\rangle$ is to 0 given two times $t_0$, $t_1$ and some quasi-ideal state $|\Psi_0\rangle$.

a. With a "linear" time operator
We now want to show that the commutator $[\hat{t}_d(t_1), \hat{t}_d(t_0)]$, when applied to a Gaussian state, gives a vector whose magnitude decays exponentially with the dimension. It will be convenient to introduce parameters $\eta_0$, $\eta_1$ for $t_0$, $t_1$ which measure their distance to the points $\pm\frac{d-1}{2}$, playing the same role as $\alpha_0$ with respect to $n_0$ (equation B9) […]. From now on, we assume $|t_0|, |t_1| < \frac{d-1}{2}$ and hence $0 \leq \eta_0, \eta_1 < 1$. Namely, we will show […].

Proof. To start with, we first express this commutator in the time eigenbasis […]. By lemma 1 (applied with $c = 1$, $\beta = 0$, $z = -\frac{n_0}{d}$), one may first perform the summation in $r$ […]. The leading term above can be cast as a $\theta$ function […]. We will now use lemma 1 again to perform the summation over $k_0$ […]. The leading term above can be written as the derivative of a $\theta$ function […]. Therefore, we have established […]. One may now perform the summation in $q$ using lemma 2 […]. The leading term above can again be written as the derivative of a $\theta$ function […]. One now carries out the summation over $k_1$ […].

The leading terms can be rewritten […]. All in all, after recalling the scalings for $\xi^2$ and $N'$ we enforced or established, one obtains […]. Note that the leading term does not depend at all on $t_0$, $t_1$, so in particular not on their ordering. (However, $t_0$, $t_1$ do contribute to the errors; this essentially says that for the latter to be under control, the wavefunction should not have moved too close to the "boundary times" […].)

In this section, we will essentially repeat the calculation we have just performed in the previous subsection, except that we will replace the time operator by a $d$-periodized version. More precisely, given fixed integers $m$, $n$ […]. Precisely, we will prove […].

Proof. One has […]. Applying this operator to the initial state $|\Psi_0\rangle$ and projecting onto $|\theta_k\rangle$ […]. Since the times $t_1$, $t_0$ are not necessarily integers, the summand is a priori not invariant in $p$, $q$ modulo $d$. Therefore, one must distinguish between the case where $|m + n + r|, |m + r| \leq \frac{d-1}{2}$, and the case where at least one of these conditions is violated. One can write […], where […], which is indeed small provided $m$, $n$ are of order unity. More precisely, assume for simplicity $|m| \vee |m + n| \leq n_0 \leq \frac{d+1}{2} - |m + n| \vee |m|$. Then […]. One then rewrites each $\theta_3$ in the form […]. Finally, applying proposition 12 to sum over $k$ yields […].

Measured quasi-ideal clock
In this section, we will incorporate measurement into the quasi-ideal clock. As the quasi-ideal clock is a finite-dimensional system (for which, in particular, one cannot implement exact canonical commutation relations), the analysis of measurement will not lend itself to the methods in [3, Chapter 5]. Before moving to the general setting, we will pause to describe in detail a specific subcase which will serve as a reference for what follows afterwards.

a. Measurement in the time basis
In this section, we consider the degenerate case where the initial state is a time eigenstate and one repeatedly measures the clock in this same basis. Therefore, after each measurement, the state of the clock collapses to a time eigenstate and the state of the system at any given time is completely described by the measurement results one has obtained up to this time.
Recall from section B 1 the definitions of the Hamiltonian $\hat{H}_d$ and time operator $\hat{t}_d$, which will allow us to exhibit in a convenient form the scaling of the measurement statistics as $d \to \infty$ […]. Here the "time" operator has eigenvalues ranging from […]; therefore, roughly speaking, time becomes continuous and unbounded as $d \to \infty$, which is what one would expect from taking the infinite-dimensional limit. Now, suppose the clock is initially prepared in the state $|\theta_k\rangle$ and is left to evolve freely for a time $\frac{\delta}{\sqrt{d}}$ before being measured in the time eigenbasis. Then the probability to collapse to the state $|\theta_l\rangle$ is given by the squared modulus of the corresponding transition amplitude. One may regard this expression as the coefficient of a Markov transition matrix $M$ ($M_{lk}$ giving the probability of transitioning from $k$ to $l$). It is clear that it can be diagonalized by the discrete Fourier transform […]. Also, for all $I \geq 0$: […]. As an example, if one starts the clock in the state $|\theta_k\rangle$ before performing $I$ measurements on it at time intervals $\frac{\delta}{\sqrt{d}}$, the expectation of the phase $e^{2\pi i m \cdot/d}$ evaluated on the result of the $I$th measurement is the product of two factors. The factor $e^{2\pi i m(k+I\delta)/d}$ in the result indicates that the "expected angle" of the clock after $I$ measurements is essentially $k + I\delta$, as one might have anticipated. The factor $\left[1 - \left(1 - e^{2\pi i \delta\, \mathrm{sign}(-m)}\right)\frac{|m|}{d}\right]^I$ conveys information about the "dispersion" of this angle: in the limit where the angle is certain, it is 1; in the limit where it is completely uncertain, it is 0. Reassuringly, in case $\delta$ is an integer, this factor is manifestly always 1: the measured time is certain. This can be generalized to […], where $I_0 := 0$. Now, if one wants to consider the limit of "continuous measurement" and see how the expectation above scales as $d \to \infty$, one may set $I := \lceil \frac{\tau}{\delta}\sqrt{d} \rceil$ (where $\tau > 0$ is fixed and corresponds to the "continuous time interval" during which one measures) so that to leading order in $d$, $I\frac{\delta}{\sqrt{d}}$ is independent of both $\delta$ and $d$ as $d \to \infty$.
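The Markov-chain description of repeated sharp measurements in the time basis can be checked numerically in small dimension. The sketch below uses the main-text definitions of $\hat{H}_d$ and $|\theta_j\rangle$ (which, as noted, may differ slightly from the appendix conventions), builds the transition matrix from the amplitudes $\langle\theta_l| e^{-i\hat{H}_d \delta/\sqrt{d}} |\theta_k\rangle$, and compares the simulated expectation of $e^{2\pi i m l/d}$ after $I$ measurements with the product of the two factors discussed above. Function names and parameter values are illustrative assumptions.

```python
import numpy as np

def transition_matrix(d, delta):
    """Markov matrix M[l, k] = |<theta_l| exp(-i H delta/sqrt(d)) |theta_k>|^2."""
    labels = np.arange(d) - (d - 1) // 2
    shift = labels[:, None] - labels[None, :] - delta        # l - k - delta
    phases = np.exp(2j * np.pi * shift[:, :, None] * labels[None, None, :] / d)
    amplitudes = phases.sum(axis=2) / d                      # sum over energy labels
    return labels, np.abs(amplitudes) ** 2

def simulated_expectation(d, delta, k, m, n_meas):
    """<exp(2 pi i m l / d)> for the outcome l of the last of n_meas measurements."""
    labels, M = transition_matrix(d, delta)
    prob = (labels == k).astype(float)                       # start in |theta_k>
    for _ in range(n_meas):
        prob = M @ prob
    return np.sum(np.exp(2j * np.pi * m * labels / d) * prob)

def closed_form(d, delta, k, m, n_meas):
    """Product of the 'expected angle' phase and the 'dispersion' factor."""
    damping = 1 - (1 - np.exp(2j * np.pi * delta * np.sign(-m))) * abs(m) / d
    return np.exp(2j * np.pi * m * (k + n_meas * delta) / d) * damping**n_meas

d, delta, k, m, n_meas = 11, 0.5, 2, 3, 4
_, M = transition_matrix(d, delta)
simulated = simulated_expectation(d, delta, k, m, n_meas)
predicted = closed_form(d, delta, k, m, n_meas)
```

For half-integer $\delta$ the dispersion factor equals $(1 - 2|m|/d)^I$, the strongest damping, whereas for integer $\delta$ it equals 1 and the chain reduces to a deterministic shift.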
Taking $k = 0$ for simplicity, the above expectation behaves as follows as $d \to \infty$:

Measured quasi-ideal clock with pseudo-Gaussian Kraus operators and states: derivation of formulae

In this section, we describe a more general treatment of measurement for the quasi-ideal clock. Namely, we allow for more or less sharp measurements (instead of the sharp time measurement of the previous section) and quasi-ideal states for the initial state (instead of a time eigenstate).
A common approach for the treatment of measurement in infinite-dimensional systems is to choose Kraus operators that are Gaussian in the measured observable. It is also common to use Gaussian states for the initial state of the system. A natural transposition of this setting to the quasi-ideal clock is to use Kraus operators that are Jacobi $\theta$ functions in the time eigenbasis and, similarly, initial states which are quasi-ideal states. More precisely, following the notation of the main text, we define the Kraus operators as follows:

The rationale behind this definition is that we interpret two consecutive time eigenstates $|\theta_k\rangle$, $|\theta_{k+1}\rangle$ as representing two times ξ = k ξ. As for the parameter $\sigma_m^2$, it controls the precision of the measurement; more precisely, $\sigma_m$ is exactly the precision with which one measures $\xi$. Therefore, if one wants to keep measuring $\xi$ with a fixed precision in the limit $d \to \infty$, $\sigma_m$ must remain constant in this limit. Concerning the initial state, we keep using the quasi-ideal state $|\Psi_0\rangle$ defined in equations (B5) and (B10).
We start by showing that the Kraus operators above indeed define a normalized POVM: Proof.
One then transforms θ 3 according to the first equation of proposition 9: Therefore, the second term of the sum cancels when integrated over ξ ∈ − As for the first term, Therefore, and the result follows.
To perform the computations to come, one will systematically have to compute integrals of the following form:

Lemma 4. Let $k, k', n$ denote integers. Then the following holds:

Proof. The integral to be evaluated is Similarly to what was done to prove the normalization of the POVM in the proof of lemma 3, one writes: One can now use the same parity arguments as in the proof of the normalization of the POVM. For even $n$, we therefore need to evaluate: Therefore: For odd $n$, we need to evaluate: It follows:

Having established these lemmas, one can now derive an expression for the measurement statistics. We adopt the general description of measurement developed in the main text. We consider a sequence of $J \geq 2$ measurements such that the $j$th measurement ($j \geq 1$) is separated from the $(j-1)$th by a time interval $\delta_j/\sqrt{d}$. The outcomes of the measurements are denoted by $(\xi_j)_{1 \leq j \leq J}$ and their joint probability distribution $f$ is given by:

We are interested in finding an expression for the moments of order at most 2 of this distribution. More precisely, given integers $m, n$, we will compute:

These quantities constitute a natural transposition of two-time correlation functions to the setting of the quasi-ideal clock.
One can then perform the summations over $k'_1, \ldots, k'_{J-1}$, which amount to discrete Fourier transforms of $\theta_3$ functions (we therefore use equations (C11) and (C10)). For example: One can now write:

As one can see from the expression above, it is now easy to perform the summation over $k_1, \ldots, k_J$. For convenience, we introduce the notation (pseudo-Kronecker delta): One obtains: Now, one may simplify the product of pseudo-Kronecker deltas as follows:

One should be careful, though, before substituting (for instance) $p_j \to p'_j - m - n$ ($1 \leq j \leq I$) in the summand. Indeed, unless all the $\delta_j$ are integers, $\exp\left(\frac{2\pi i}{d} \sum_{1 \leq j \leq J} \delta_j (p_j - p'_j)\right)$ is not invariant in the variables $p_j, p'_j$ modulo $d$. However, if for the sake of simplicity one assumes $|m + n| \leq d$, one knows that for $m + n \geq 0$ (respectively $m + n \leq 0$), In the former case, the solution $p_j$ to whereas it is simply $p_j = p'_j - m - n$ in the latter situation. All in all, it is legitimate to write: Therefore, the original integral becomes:

Next, one wants to "normalize" the $\theta_3$ functions appearing in the sum so that they add up to 1 when summed over one of the $p'_j$ appearing in their first argument. To achieve this, we use the formula (derived from equations (C11) and (C10)): Therefore, by plugging equation (B145) into (B144), we "normalize" the $\theta_3$ functions. Performing this step and then renaming the indices $p' \to p$ for simplicity, one ends up with:

Finally, one will normalize the wavefunction part in the sum so that summing constant × $\psi(p_1 - m - n)\,\psi(p_1)^*$ over , one needs to compute for all integer $r$: where we used in the last line that $d$ is odd. For $r = 0$, this is simply $1/N'^2$. Therefore, the sought normalized expression of the integral under consideration is:

The sum now has a clear probabilistic interpretation as an expectation computed over a random walk on a ring. $p_1, \ldots, p_J$ are to be interpreted as the steps of this random walk and the probability distribution for the amplitude of a jump is for all jumps except for jump $I$ (leading from position $p_I$ to position $p_{I+1}$), where it is given by: Roughly speaking, this means that the jumps $j \neq I$ have expectation 0, with a typical standard deviation $\sqrt{d}/\sigma_m$. As for the jump $I$, it has an expectation $-\frac{m}{2}$ and the same standard deviation. Finally, the initial distribution of the random walk is prescribed by the initial state as follows:

With this interpretation in mind, the sum can be rewritten as a probabilistic expectation: where we used the standard notation $E_\mu[\cdot]$ for the expectation given an initial distribution $\mu$. Note that the expectation is trivially bounded by 1 in modulus, as the random variable is. This implies that the parametrization of the Kraus operators (through $\sigma_m$) and that of the initial state (through $\xi^2$) alone enforce bounds on the modulus of , as specified in the following two propositions.
The proposition below provides a bound on $\langle \exp(2\pi i n\, \xi_J/\sqrt{d}) \rangle$ based on the scaling of $\xi$ with respect to $d$.
Proposition 3. Let $r$ be an integer, $c > 0$ and $-1 < \alpha < 1$. Suppose $\xi^2 = c d^{\alpha}$. Then as $d \to \infty$, the following estimate holds: Proof. Let us insert the stated scaling for $\xi^2$ into the above expression.
Let us now restrict ourselves to the case of even $r$. This means that one can replace $\frac{r}{2} \to 0$ in the first arguments of the $\theta_3$ functions (but not $\frac{r}{2d} \to 0$!). The above then reduces to: Combining the estimates in equation (C8) and proposition 10, the above is found to behave as follows as $d \to \infty$: The case of odd $r$ is very similar and we omit it.
The following proposition provides a bound on $\langle \exp(2\pi i n\, \xi_J/\sqrt{d}) \rangle$ while for $\beta < 1$,

Proof. This results from straightforward analysis, after rewriting

The last two propositions say that if one wants to make the prefactors of the probabilistic expectation as close to 1 as possible, one has to make both the width of the Kraus operator and that of the initial state decrease quickly with $d$. Therefore, if one were to ignore finite-dimensional effects, and hence the contribution of the probabilistic expectation (assuming it to be 1), one might think there is a way to tune the parameters $\xi^2$, $\sigma_m^2$ so as to make $\langle \exp(2\pi i n\, \xi_J/\sqrt{d}) \rangle$ arbitrarily close to 1. Unfortunately, we show in the next section that this is not possible: for an exceedingly sharp measurement, the probabilistic expectation systematically exhibits a poor scaling that will ruin the improvement of the

Measured quasi-ideal clock with pseudo-Gaussian Kraus operators and states: scalings
Having derived a general formula for the pseudo-correlation function $\langle \exp(2\pi i n\, \xi_J/\sqrt{d}) \exp(2\pi i m\, \xi_I/\sqrt{d}) \rangle$, we restrict ourselves in this section to the case $m = -1$, $n = 0$. In other words, we consider the pseudo-variance One remarks that in this particular case, all jumps of the random walk have identical distribution.
a. Case δ = 1/2

One will now specialize the analysis further by assuming $\delta_j = \frac{1}{2}$ for all $j$. (This is in some sense the simplest non-trivial case because if all $\delta_j$ are integers, $1 - e^{-2\pi i \delta_j} = 0$ and the probabilistic expectation is 1.) In the formula for the pseudo-variance above, one will therefore substitute $1 - e^{-2\pi i \delta_j} \to 2$. Actually, for reasons that will become clearer later, it is convenient to make the substitution $1 - e^{-2\pi i \delta_j} \to c_3$ instead, where $c_3$ may be any constant in $[0, 2]$ (though in the present case it will simply be 2). The general idea will be to first estimate for all $p_1$ (more or less precisely depending on the range of $p_1$ and the scaling of $\sigma_m^2$) and to deduce an approximation for $\langle \exp(2\pi i m\, \xi_I/\sqrt{d}) \rangle$ from these estimates.
The analysis breaks into two cases according to whether $\sigma_m^2$ is "big" or "small" with respect to $d$. The final result for the first case is corollary 1; the final result for the second case is corollary 2.
Let us then start by controlling $E_{p_1}[\cdot]$. In the case where $\sigma_m^2$ is "big", this simply involves controlling the marginal distribution of each of the steps of the random walk and applying a union bound to show that the expectation is 1 up to an exponentially small error in $d$. We start with a lemma that bounds the probability of reaching $\frac{d-1}{2}$ after $j$ steps, provided the starting point $p_1$ of the walk is bounded away from d:

. Then for all $\eta \in (0, 1)$, $\zeta > 0$, for all integer $j \in [2, t\sqrt{d}]$: provided A weaker form (i.e. bounding the probability by a power-decreasing function of $d$ instead of an exponentially decreasing one) is: which holds if one has furthermore where:

Proof. From proposition 11, one may write: Let us substitute the scaling $\sigma_m^2 := c_2 d^{\beta}$ as well as the inequality $2 \leq j \leq t\sqrt{d}$ into the bound for $\varepsilon$: Now, assume for some $\eta > 0$, which implies ≤ 1 + η. Furthermore, one uses equation (C71) to bound: This yields: Now, provided one chooses one has $2(1 + \eta)\,t\sqrt{d} \leq \frac{\pi}{16 c_2} d^{1-\beta}$ and the inequality becomes Then, according to equation (C64), if one chooses

It now remains to control the term where we used proposition 10 to obtain the penultimate line. Now, provided (B193) can be bounded by: Finally, provided (B196) can be bounded by: As for θ 3 p d + 1 2d + 1 2 , i 2ξ 2 d , one may bound it in a cruder way: One now uses the modular transformation properties of $\theta$ as well as equation (C31) to treat the factors θ 3 1 2d , i One may now use 2 γd ≤ 1 2 and $d \geq 2$ to obtain a slightly simplified bound: Now, suppose: One may then further simplify the bound: In the case $\alpha < 0$, choosing One has indeed

The following lemma allows one to precisely lower-bound the denominator appearing in the expression for the initial distribution.
Lemma 7. Let $c_1 > 0$, $-1 < \alpha < 1$ and $\xi^2 := c_1 d^{\alpha}$. Then the following lower bound holds:

Proof. Since all the $\theta$ functions are positive, it suffices to lower bound each of them. First, Secondly, Thirdly, Fourthly, Recalling $-1 < \alpha < 1$, only the first two terms are relevant and one can then write:

Putting all the previous lemmas together, one can now precisely bound the probabilistic expectation in the case where $\sigma_m$ is "big enough" with respect to $d$. Doing so results in the following proposition.
Proposition 5. Let $c_1, c_2 > 0$, $-1 < \alpha < 1$, $-\frac{1}{2} < \beta < 0$. Suppose as usual that $\xi^2$ and $\sigma_m^2$ scale with $d$ according to

Proof. Assume for simplicity $n_0 > 0$. Choose $\gamma \in \left(\frac{1}{2}, 1\right)$ such that $n_0 < \frac{1}{2}(\gamma d - 1)$. One will estimate by distinguishing the $p_1$ according to whether $|p_1| \leq \frac{1}{2}(\gamma d - 1)$ or $|p_1| > \frac{1}{2}(\gamma d - 1)$. In the former case, lemma 5 tells us that the probability of reaching $\frac{d-1}{2}$ at step $j$ is small for every $j$, so that $E_{p_1}[\ldots]$ is also small by a union bound. In the latter case, we only use that the expectation is bounded by 1; however, according to lemma 6, one is now considering the tail of the initial distribution, so this contribution is also small.
We have therefore just proved that if $\sigma_m^2$ scales with $d$ such that it vanishes as $d \to \infty$, but not faster than $\frac{1}{\sqrt{d}}$, then the pseudo-variance in equation (B254) behaves essentially as Therefore, it deviates from its "infinite-dimensional value" 1 (cf. discussion of quantum measurements in infinite dimension in the main text) by an error which scales as a power of $d$. This error contains a contribution both from the dispersion of the initial state (exponent $\alpha$) and from the imprecision of the measurement (exponent $\beta$). To connect it to the general theory of measurement presented in the main text (and in greater detail in [3, Chapter 5]), one may say that the backaction contribution is exponentially suppressed. By the conditions of application of the last proposition, one is allowed to make the error contributed by $\alpha$ as small as $\frac{1}{d^2}$, but one cannot make the term depending on $\beta$ scale better than $\frac{1}{d^{3/2}}$ since one assumed $\beta > -\frac{1}{2}$.

One may then wonder whether one could obtain a better scaling by choosing $\beta < -\frac{1}{2}$. The purpose of the following is to show that doing this will yield an error essentially as bad (i.e. of order $\frac{1}{\sqrt{d}}$) as the one obtained from measuring the clock in the time basis (a limiting case which was studied in section B 3 a and formally corresponds to letting $\sigma_m^2 \downarrow 0$ in the more general framework considered here). Roughly speaking, the key idea is that the variance for a jump of the random walk under study is approximately $d^2 \cdot \frac{1}{2\sigma_m^2 d} = \frac{1}{2c_2} d^{1-\beta}$. Therefore, one expects the variance of the marginal distribution of the $j$th step to scale as $j d^{1-\beta}$. If now $j$ scales as $\sqrt{d}$, this becomes $d^{3/2-\beta}$. One therefore sees that whenever $\beta < -\frac{1}{2}$, this grows faster than $d^2$ and one therefore expects the marginal distribution of the step to be close to uniform. This is the first ingredient of the proof, treated in the first lemma.
However, this is not sufficient since although the steps may all be close to uniformly distributed beyond a certain number of iterations, they are not independent, precluding a priori an evaluation of the expectation. This point will be addressed in the second lemma.
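The variance heuristic above can be written out as a short chain of estimates. This is only a sketch of the scaling argument: constants are not tracked carefully, the jumps are treated as independent with identical variance, wrap-around on the ring is ignored, and $t > 0$ denotes a fixed constant (an assumption of this sketch) such that $j = t\sqrt{d}$.

```latex
\begin{align*}
\operatorname{Var}(\text{one jump})
  &\approx d^2 \cdot \frac{1}{2\sigma_m^2 d}
   = \frac{1}{2c_2}\, d^{1-\beta}
   && (\sigma_m^2 = c_2 d^{\beta}), \\
\operatorname{Var}(p_j)
  &\approx \frac{j}{2c_2}\, d^{1-\beta}
   = \frac{t}{2c_2}\, d^{3/2-\beta}
   && (j = t\sqrt{d}), \\
\frac{\operatorname{Var}(p_j)}{d^2}
  &\approx \frac{t}{2c_2}\, d^{-1/2-\beta}
   \xrightarrow[d \to \infty]{} \infty
   && \text{if and only if } \beta < -\tfrac{1}{2}.
\end{align*}
```

In other words, for $\beta < -\frac{1}{2}$ the spread of the walk after order $\sqrt{d}$ steps exceeds the size $d$ of the ring, which is why one expects a marginal distribution close to uniform.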
For the following two proofs, it will be particularly helpful to write the distribution for a jump of the random walk under consideration as a discrete Fourier transform: where we defined $\bar{\theta}_3(z, i\tau) := \frac{\theta_3(z, i\tau)}{\theta_3(0, i\tau)}$ (therefore, $\bar{\theta}_3(0, i\tau) = 1$ and $\bar{\theta}_3(z, i\tau) < 1$ for $z \neq 0$) [1]. We therefore start by showing that above a certain number of iterations scaling as $\sqrt{d}$, the random walk becomes (exponentially) close to completely mixed.

Lemma 8. Let $\beta < -\frac{1}{2}$ ($\sigma_m^2 := c_2 d^{\beta}$) and $j \geq 1$. Then the variation distance between the marginal distribution for the $j$th step of the random walk under consideration and the uniform distribution is upper bounded by: In particular, if $j \geq t\sqrt{d}$ with $t > 0$, this yields:

Proof. Let $j \geq 1$. To show that the distribution of $p_j$ is close to uniform, we will bound its variation distance (denoted here by $\|\cdot\|$) to the uniform distribution. To achieve this, we use the following bound for the variation distance between two probability distributions $P, Q$ on $\mathbb{Z}_d$ (for a very general exposition, including more general finite groups than $\mathbb{Z}_d$, see [4]): where $\hat{P}, \hat{Q}$ denote the discrete Fourier transforms of $P, Q$. This yields: A convenient way to bound the summand is by way of the Jacobi triple product formula (C9). The latter implies: One may now use the estimate $\cos(2\pi x) \leq \exp\left(-\frac{(2\pi x)^2}{2}\right)$, valid for all $x \in \left[-\frac{1}{2}, \frac{1}{2}\right]$, to bound the above as: One can now use the inequality (C79) to write As for the prefactor containing $s$, one may use the crude bound since 1 − e −x ≥ x 2 for x ∈ 0, 1 2 for instance (which implies that the above holds for $d \geq 7$). Therefore, This entails: In particular, if $j = t\sqrt{d}$ where $t > 0$, the upper bound becomes: which indeed vanishes exponentially as $d \to \infty$ since $\beta < -\frac{1}{2}$.

Having established this result on the marginal distributions of the steps of the random walk, one will now prove an important lemma that will allow us to control a family of expectations involving many $p_j$.

Lemma 9. Let $I \geq 1$ denote an integer. Let $c > 0$. Then the following bounds hold:

Proof. One first rewrites the expectation under consideration by expanding the product: where we use the convention that for $j = 0$, the to equal 1. Then, note that for every fixed $2 \leq i_1 < \ldots < i_j$, one may use the Fourier expansion (B258) of the probability distribution for one step of the random walk to obtain: where we have set $i_0 := 0$ for convenience. Therefore: For $-1 < \beta < -\frac{1}{2}$, one may estimate it as: Therefore, the positivity condition 1 − c

This lemma being established, one is now ready to prove the main proposition concerning the behavior of the probabilistic expectation when $\beta < -\frac{1}{2}$.
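The mixing estimate of lemma 8 relies on a Fourier bound on the variation distance to the uniform distribution. As an illustration, the sketch below checks the standard form of this bound on $\mathbb{Z}_d$, $\|P^{*j} - U\| \leq \frac{1}{2}\big(\sum_{s \neq 0} |\hat{P}(s)|^{2j}\big)^{1/2}$, which follows from the Cauchy-Schwarz inequality and Plancherel's identity. The jump law used here is an assumption: a Gaussian wrapped onto the ring, standing in for the $\theta_3$-based distribution of the text, with variance $d^{1-\beta}/(2c_2)$ as in the heuristic discussed above. All function names and parameter values are hypothetical.

```python
import numpy as np

def wrapped_gaussian(d, var):
    """Gaussian of variance `var` on the integers, folded onto Z_d (assumed jump law)."""
    r = int(np.ceil(8 * np.sqrt(var))) + d
    y = np.arange(-r, r + 1)
    w = np.exp(-y**2 / (2 * var))
    p = np.bincount(y % d, weights=w, minlength=d)
    return p / p.sum()

def j_step_marginal(p, j):
    """Distribution of the sum of j independent jumps (j-fold cyclic convolution)."""
    return np.real(np.fft.ifft(np.fft.fft(p) ** j))

def tv_to_uniform(p):
    """Variation distance to the uniform distribution on Z_d."""
    return 0.5 * np.abs(p - 1.0 / len(p)).sum()

def fourier_bound(p, j):
    """Upper bound (1/2) * sqrt(sum over s != 0 of |P_hat(s)|^(2j))."""
    p_hat = np.fft.fft(p)              # p_hat[s] = sum_x p[x] exp(-2 pi i s x / d)
    return 0.5 * np.sqrt(np.sum(np.abs(p_hat[1:]) ** (2 * j)))

d, beta, c2 = 101, -0.75, 1.0          # illustrative values with beta < -1/2
p = wrapped_gaussian(d, d ** (1 - beta) / (2 * c2))
steps = (1, 2, 3, 5)
distances = [tv_to_uniform(j_step_marginal(p, j)) for j in steps]
bounds = [fourier_bound(p, j) for j in steps]
for j, tv, b in zip(steps, distances, bounds):
    print(j, tv, b)
```

With these parameters the distance to uniform drops quickly with the number of steps, and the Fourier bound tracks it from above.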
Proposition 6. Let $t > 0$ and $I := \lceil t\sqrt{d} \rceil$. Assuming the usual scalings for $\xi^2$ and $\sigma_m^2$, there exists some constant $c_5$ (depending on $c_2$ and $c_3$) such that for all $\varepsilon > 0$, the following holds as $d \to \infty$:

Proof. Fix $I := t\sqrt{d}$ and any integer $p_1 \in \left[-\frac{d-1}{2}, \frac{d-1}{2}\right]$. One wants to consider: Expand the expectand as: Now, consider the expectation of one term of the sum:

d. Application to waveform estimation

Keeping the setting introduced in the last section, one may now wonder what can be learnt about $x(\cdot)$ from monitoring the time $\xi$ of the clock. An estimate for $X_j$ from the measurements of $\xi_{j-1}, \xi_j$ is

From what we identified as the optimal scaling in $d$ from corollary 1, the variance of each measurement of $\xi/\sqrt{d}$ cannot decrease to 0 (with $d$) faster than $d^{-3/2}$. This implies an error increasing at least as $\left(d^2 \times d^{-3/2}\right)^{1/2} = d^{1/4}$ on the estimation of each $X_j$. Now, the typical magnitude of $X_j$ is given by $\sqrt{d}\left(\sigma^2 \cdot \frac{1}{\sqrt{d}}\right)^{1/2} = \sigma d^{1/4}$. Therefore, $\sigma$ should grow faster than 1 (that is, diverge) for the standard deviation of the estimate of $X_j$ to be negligible with respect to $X_j$.

Now, assume one no longer measures the clock every $\frac{1}{\sqrt{d}}$ time step, but approximately every $d^{-\varepsilon}$ time step instead ($0 \leq \varepsilon < \frac{1}{2}$). The integral corresponding to white noise, rescaled by $\sqrt{d}$, along such an interval has typical magnitude $\sqrt{d}\left(\sigma^2 d^{-\varepsilon}\right)^{1/2} = \sigma d^{(1-\varepsilon)/2}$. For the standard deviation of the estimator of $X_j$ to be negligible with respect to $X_j$, one now needs $d^{1/4} \ll \sigma d^{(1-\varepsilon)/2}$, that is $\sigma \gg d^{\varepsilon/2 - 1/4}$. If one considers the limit $\varepsilon \to 0$, this means that $\sigma$ needs to grow faster than $d^{-1/4}$.
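The exponent bookkeeping in this argument is easy to get wrong, so here is a minimal sketch that redoes it with exact fractions. It only encodes the two power counts stated above (estimation error $d^{1/4}$ and signal magnitude $\sigma d^{(1-\varepsilon)/2}$); the function name is hypothetical.

```python
from fractions import Fraction

def sigma_exponent_threshold(eps):
    """Return a such that sigma >> d**a is required when measuring every d**(-eps)."""
    eps = Fraction(eps)
    # estimation error: (d^2 * d^(-3/2))^(1/2) = d^(1/4)
    error_exp = (Fraction(2) - Fraction(3, 2)) / 2
    # signal magnitude without sigma: sqrt(d) * (d^(-eps))^(1/2) = d^((1 - eps)/2)
    signal_exp = Fraction(1, 2) - eps / 2
    # need d^error_exp << sigma * d^signal_exp, i.e. sigma >> d^(error_exp - signal_exp)
    return error_exp - signal_exp

for eps in (Fraction(1, 2), Fraction(1, 4), Fraction(0)):
    print(eps, sigma_exponent_threshold(eps))
```

Measuring every $\frac{1}{\sqrt{d}}$ corresponds to $\varepsilon = \frac{1}{2}$ and gives a threshold exponent of 0 (so $\sigma$ must diverge), while $\varepsilon \to 0$ gives $-\frac{1}{4}$, in agreement with the text.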

Existence of continuum limit
In sections B 3 b and B 4, given a quasi-ideal clock of dimension $d$ measured with some precision $\sigma_m$, we constructed and characterized a discrete-time random process $\xi^{(d)}_1, \xi^{(d)}_2, \ldots$ Here, we added the superscript $d$ to account for the dependence of this random process on the dimension; in the construction, the parameters $c_2, c_3, \alpha, \beta$ which specify $\xi^2$ and $\sigma_m^2$ for all $d$ are implicitly fixed; we also always take $\delta_j := \frac{1}{2}$ for all $j$. One may then wonder, loosely speaking, whether this discrete-time process admits a sensible "continuum limit" as $d \to \infty$. We will address this question in two interesting limiting cases: on the one hand, $\sigma_m^2 \propto d^{\beta}$ with $-\frac{1}{2} < \beta < 0$; on the other hand, measurement in the time eigenbasis as described in section B 3 a (formally $\sigma_m^2 = 0$). Generally speaking, showing the convergence in law of a sequence of stochastic processes $\{(X^n_t)_{t \geq 0}\}_{n \geq 0}$ to some stochastic process $(X_t)$ requires showing the weak convergence of the finite-dimensional distributions as well as verifying a tightness condition; see [5, theorems 7.5-13.5] for specific criteria and [5, theorems 8.1-8.2] for a simple application (weak convergence of a random walk of finite variance to Brownian motion).
We now give a sketch of how these results can be applied to the problem under consideration. First, the weak convergence of finite-dimensional distributions may be conveniently derived from the pointwise convergence of the characteristic functions. Indeed, the latter can be recovered (say in the 2-dimensional case for definiteness) from our expression for the pseudo-correlation function ($m, n \in \mathbb{Z}$) by taking $m := \lfloor \chi_0 \sqrt{d} \rfloor$, $n := \lfloor \chi_1 \sqrt{d} \rfloor$ where $\chi_0, \chi_1 \in \mathbb{R}$ are fixed. This differs from the case treated above, in which $m, n$ are constants; fortunately, at least in the cases $\sigma_m^2 \gg d^{-1/2}$ and $\sigma_m^2 = 0$ (measurement in the time eigenbasis), one can show that our estimates are robust against this scaling.
Proof. This follows from [5, theorem 13.5], using the previous two lemmas.
The $\theta$ functions play for the discrete Fourier transform a role similar to that of Gaussians for the continuous Fourier transform. Roughly speaking, the discrete Fourier transform of a $\theta_3$ function of a given width is a $\theta_3$ function of "inverse" width. More precisely, given a positive integer $N$ and $\xi > 0$, the following relations hold [9]:

In the sequel, we will be exclusively interested in $\theta_3$ functions whose second argument is purely imaginary (with positive imaginary part) and we will therefore frequently use the notation $\theta_3(z, i\tau)$ where $\tau > 0$ (instead of $\tau \in \mathbb{H}$ as in the initial definitions). An interesting property is that if one restricts the summation in the definition (C3) of the $\theta_3$ function to the integers that are congruent to some $r$ modulo $N$ ($N > 0$, $0 \leq r < N$), the result is still a $\theta$ function. Precisely:

$$\sum_{p \in \mathbb{Z}} \exp\left(-\pi\tau (r + Np)^2 + 2\pi i (r + Np) z\right) = \exp\left(-\pi r^2 \tau + 2\pi i r z\right) \theta_3\left(zN + i r N \tau,\ i N^2 \tau\right) \qquad \text{(C12)}$$

In our study of the quasi-ideal clock, we will use the $\theta_3$ function as wavefunction coefficients. It will therefore frequently be necessary (e.g. to compute scalar products) to rewrite products of $\theta$ functions as a linear combination of such functions. The following proposition gives a general formula for this purpose.
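The residue-class identity (C12) can be checked numerically with truncated sums. The sketch below assumes the convention $\theta_3(z, \tau) = \sum_{n \in \mathbb{Z}} \exp(i\pi\tau n^2 + 2\pi i n z)$, which is the one implied by (C12) itself since the definition (C3) is not reproduced here; the function names, the truncation order and the test values are arbitrary choices made for illustration.

```python
import numpy as np

def theta3(z, tau, terms=50):
    """Truncated theta_3(z, tau) = sum_n exp(i pi tau n^2 + 2 pi i n z) (assumed convention)."""
    n = np.arange(-terms, terms + 1)
    return np.sum(np.exp(1j * np.pi * tau * n**2 + 2j * np.pi * n * z))

def residue_class_sum(z, tau, r, N, terms=50):
    """Left-hand side of (C12): the theta_3(z, i tau) series restricted to n = r mod N."""
    q = r + N * np.arange(-terms, terms + 1)
    return np.sum(np.exp(-np.pi * tau * q**2 + 2j * np.pi * q * z))

def c12_rhs(z, tau, r, N, terms=50):
    """Right-hand side of (C12): a prefactor times a theta_3 with rescaled arguments."""
    prefactor = np.exp(-np.pi * r**2 * tau + 2j * np.pi * r * z)
    return prefactor * theta3(z * N + 1j * r * N * tau, 1j * N**2 * tau, terms)

z, tau, r, N = 0.37, 0.3, 2, 5
lhs = residue_class_sum(z, tau, r, N)
rhs = c12_rhs(z, tau, r, N)
print(abs(lhs - rhs))

# Summing the residue classes r = 0..N-1 recovers the full theta_3(z, i tau).
total = sum(residue_class_sum(z, tau, s, N) for s in range(N))
print(abs(total - theta3(z, 1j * tau, terms=300)))
```

Both printed differences should be at the level of floating-point rounding, since the series converge very quickly for $\tau > 0$.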

Proposition 14.
For all x > 0, Proof. Let us first prove the lower bound. First, observe d