Automatic Post-selection by Ancillae Thermalisation

Tasks such as classification of data and determining the groundstate of a Hamiltonian cannot be carried out through purely unitary quantum evolution. Instead, the inherent non-unitarity of the measurement process must be harnessed. Post-selection and its extensions provide a way to do this. However, they make inefficient use of time resources -- a typical computation might require $O(2^m)$ measurements over $m$ qubits to reach a desired accuracy. We propose a method inspired by the eigenstate thermalisation hypothesis that harnesses the induced non-linearity of measurement on a subsystem. Post-selection on $m$ ancillae qubits is replaced with tracing out $O(\log\epsilon / \log(1-p))$ (where $p$ is the probability of a successful measurement) to attain the same accuracy as the post-selection circuit. We demonstrate this scheme on the quantum perceptron and phase estimation algorithm. This method is particularly advantageous on current quantum computers involving superconducting circuits.


I. INTRODUCTION
Algorithms for classification of data, optimising the energy to find the groundstate properties of a Hamiltonian (and indeed optimising classifiers for a given data set) require the use of non-linear operations that cannot be achieved solely through unitary quantum evolution. When carrying out these tasks on a quantum computer we must use the non-unitarity of the measurement process. There are several ways in which to do this depending upon the relative abundance of resources, quantified by measures such as the number of qubits, coherence times and gate fidelities. Post-selection and the related repeat-until-success algorithms are popular choices.
However, post-selection makes inefficient use of time resources: a typical computation requires $O(2^m)$ measurements over $m$ qubits to reach a desired accuracy. Nevertheless, it is a frequently used tool in atomic contexts where coherence times are long and manipulation timescales short, so that time is not the limiting resource. For superconducting circuits, where coherence times are much shorter and where measurement of a subset of qubits is not possible while maintaining the coherent evolution of the remainder, this is more problematic.
Curiously, post-selection in fact uses classical non-linearity, through a yes or no decision based upon a measurement on ancillae qubits. This suggests how a more time-efficient scheme might be developed. Fundamentally, the non-linearity of the classical world is induced by observation of only a portion of a larger quantum world. It is possible then to replace post-selection with a scheme where explicit measurement of ancillae qubits is not required, i.e. where they are traced out or simply ignored.
* lewis.wright@kcl.ac.uk
Eigenstate thermalization gives a clue as to how this can be achieved. Coupling a small system at high temperatures to a large, low temperature bath allows us to cool the small system. Eigenstate thermalization extends this notion to closed quantum systems. Coupling a large number of ancillae qubits in a low entropy state (e.g. $|00000\ldots\rangle$) to a small system and evolving the total system under some unitary evolution allows entropy to flow from the small system of interest to the ancillae.

II. RESULTS
Inspired by this, we replace the classical yes or no non-linearity of post-selection with a non-linearity attained by tracing out ancillae. In the following, we demonstrate the application of ancillae thermalization to the quantum perceptron, quantum gearbox and a groundstate preparation algorithm. Robustness to noise is demonstrated for the quantum gearbox in simulation for two different noise models and on the 'ibmq_vigo' quantum device.

Our scheme is a type of amplitude amplification, a generalisation of Grover search [1] introduced by Brassard et al. [2]. Amplitude amplification comes in various flavours depending upon whether the state of the target qubits or the probability of success at each iteration is known. For a direct comparison with our method, we focus upon an implementation that requires knowledge of neither and decreases the error monotonically: π/3 fixed-point search [3]. Used in the context of amplitude amplification, this algorithm, which we call π/3 fixed-point oblivious amplitude amplification (FP OAA), is equivalent to the optimal 'Fixed-point quantum search' from Yoder et al. in the regime of large initial success probabilities [4]. In this regime we find ancillae thermalization more robust to the effects of gate errors, with a similar cost in time and more qubits. In addition, the different structure of our method allows implementation of a wider class of transformations whilst requiring no knowledge of the target state, including non-unitary transformations, which OAA algorithms alone do not allow [5]. We demonstrate this using a non-optimal procedure for groundstate preparation shown in Fig. 3. Table I and Fig. 4 show, respectively, the trade-offs in resources between the various approaches for unitary transformations and their robustness to gate noise, in the large success probability regime.
A. From post-selection to ancillae thermalization
A post-select unitary $U$ acting on $m$ ancillae and the target qubits $|\psi\rangle$ achieves the rotation $R$ on $|\psi\rangle$ and the state $|0\rangle^{\otimes m}$ on the ancillary qubits with probability $p_0$. The states $E_k|\psi\rangle$ correspond to incorrect transformations of the target qubits. If the procedure fails, all qubits are reset and the process is repeated. The probability of failure after $N$ iterations of the algorithm is $\sim (1-p_0)^N$; the transformation $R$ is implemented exactly, but only with a finite probability.
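As a quick numerical illustration of the failure probability quoted above, the following minimal sketch evaluates $(1-p_0)^N$ and the number of post-selection attempts needed to push it below a tolerance. The function names and the tolerance `eps` are illustrative choices rather than notation from the paper, and independent attempts are assumed.

```python
import math


def failure_probability(p0: float, attempts: int) -> float:
    """Probability that all `attempts` independent post-selection rounds fail."""
    return (1.0 - p0) ** attempts


def attempts_needed(p0: float, eps: float) -> int:
    """Smallest N with (1 - p0)**N <= eps; assumes 0 < p0 < 1 and 0 < eps < 1."""
    return math.ceil(math.log(eps) / math.log(1.0 - p0))
```

For example, with $p_0 = 1/2$ and a tolerance of $10^{-3}$, ten attempts are needed, since $2^{-10} \approx 9.8\times10^{-4}$ while $2^{-9} \approx 2.0\times10^{-3}$.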
Ancillae thermalization, and more generally amplitude amplification, take a different philosophy. The output from $U$ on the target qubits is interpreted as a superposition of correctly transformed ($R|\psi\rangle$) and incorrectly transformed ($E|\psi\rangle$) states. Tracing out the ancillary qubits prepares the target qubits in a mixed state $\rho$ that is approximately correct. The density matrix $\rho$ has an overlap $\langle\psi|R^\dagger\rho R|\psi\rangle = 1-\epsilon$ with $R|\psi\rangle$. For ancillae thermalization, this fidelity with the target state is obtained with an exponential reduction in measurements for an increased cost in ancillary qubits. This is achieved by iteratively entangling fresh ancillary qubits with the target qubits via the unitary $V = UW$, where $W$ is a reset gate that transforms $E_k|\psi\rangle \to |\psi\rangle$ for all $k$. The ancillae conditionally entangle with the incorrectly transformed parts of the system's wavefunction through control gates on the previous ancilla at each iteration. The circuit to achieve this is shown in Fig. 1b). The overlap between the target qubits and $R|\psi\rangle$ approaches unity exponentially with the number of iterations applied (see Appendix B for full details).

TABLE I. Computational resources (measurements, qubits and operations) of post-selection, ancillae thermalization and π/3 fixed-point oblivious amplitude amplification (FP OAA) [3] for guaranteeing unitary transformations in the large initial success probability regime. We define $Q(O_N)$ as the number of operations required to implement the $N$-qubit gate $O$, and $p_0$ as the initial probability of a successful measurement. In the asymptotic limit the number of operations required to implement controls on $U_{n+m}$ is constant. Additionally, $Q(W_n) \sim O(1)$ and can be ignored in practice. Although ancillae thermalization has more operations, we observe lower susceptibility to gate errors. We suspect this is due to the exponentially fewer operations acting on each ancilla in ancillae thermalization, exposing them to less gate noise. Justification for these values can be found in Appendix A, whilst a comparison of noise robustness can be found in section III.

B. Quantum Perceptron
The quantum perceptron [6] is the first explicit example of a quantum circuit fulfilling the requirements for a meaningful quantum neural network. It was introduced by Schuld et al. [7]. It is able to simulate a classical perceptron whilst taking advantage of quantum properties such as processing input data as a superposition. In general, quantum neural networks struggle to construct a non-linear activation function due to their linear dynamics. The quantum perceptron uses the post-select circuit shown in Fig. 2a) to achieve this non-linearity. This circuit implements the transformation $|\psi\rangle \to \exp(-iq(\theta)Y)|\psi\rangle$ on a target qubit with probability $p(\theta) \sim O(1/2)$. The angle of rotation $q(\theta) = \arctan(\tan^2(\theta))$ is sigmoidal in shape and can be used to capture the non-linear properties found in classical neural networks in a quantum setting.
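The activation angle and success probability can be evaluated directly. The sketch below takes $q(\theta) = \arctan(\tan^2\theta)$ from the text and $p(\theta) = \cos^4\theta + \sin^4\theta$ from the caption of Fig. 2; the function names are illustrative.

```python
import math


def activation_angle(theta: float) -> float:
    """q(theta) = arctan(tan^2(theta)), the rotation applied on success."""
    return math.atan(math.tan(theta) ** 2)


def success_probability(theta: float) -> float:
    """p(theta) = cos^4(theta) + sin^4(theta), the chance of measuring |0> on the ancilla."""
    return math.cos(theta) ** 4 + math.sin(theta) ** 4
```

On the interval $0 \le \theta < \pi/2$ the angle $q(\theta)$ rises monotonically from $0$ towards $\pi/2$ and passes through $\pi/4$ at $\theta = \pi/4$, which is also where $p(\theta)$ takes its minimum value of $1/2$.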
Ancillae thermalization removes the need to post-select in order to implement the quantum perceptron. The circuit shown in Fig. 2b) achieves the same level of accuracy as $O(N)$ attempts of the post-selection circuit, with $N = 2$. To achieve a total overlap with the desired state $\exp(-iq(\theta)Y)|\psi\rangle$ within additive error $\epsilon$, the process of applying $V$ to fresh ancillae and the target qubit must be repeated $O(\log(1/\epsilon))$ times. This achieves a fidelity between the finalised target qubit and the desired state given by, where $\delta = 1 - p(\theta)$ and $\epsilon$ has been re-scaled. The fidelity approaches unity exponentially with the number of iterations.
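The exponential approach to unit fidelity can be checked with a small density-matrix model of the target qubit. This is a sketch under stated assumptions, not the circuit of Fig. 2: it assumes the post-select gadget is Ry(2θ) on the ancilla, a controlled $-iY$ on the target, then Ry(−2θ) on the ancilla, a sequence chosen because it reproduces the $q(\theta)$, $p(\theta)$ and Ry(π/2) reset quoted in the paper. It also assumes a real input state, and all names are illustrative. Because the ancilla records of different branches are orthogonal, tracing them out leaves an incoherent sum of branches.

```python
import numpy as np

I2 = np.eye(2)
G = np.array([[0.0, -1.0], [1.0, 0.0]])  # the real matrix -iY (assumed controlled gate)


def ry(angle):
    """Single-qubit rotation Ry(angle) = exp(-i * angle * Y / 2)."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])


def kraus_operators(theta):
    """Operators on the target for ancilla outcomes 0 (success) and 1 (failure)."""
    c, s = np.cos(theta), np.sin(theta)
    k_success = c**2 * I2 + s**2 * G  # proportional to exp(-i q(theta) Y)
    k_failure = s * c * (G - I2)      # proportional to Ry(-pi/2)
    return k_success, k_failure


def thermalised_state(theta, psi, extra_iterations):
    """Target density matrix after U and `extra_iterations` controlled repeats, ancillae traced out."""
    k_success, k_failure = kraus_operators(theta)
    reset = ry(np.pi / 2)  # W: undoes the failure rotation
    rho = np.zeros((2, 2))
    pending = np.asarray(psi, dtype=float)  # unnormalised branch still awaiting a success
    for step in range(extra_iterations + 1):
        if step > 0:
            pending = reset @ pending  # reset acts only on the failed branch
        done = k_success @ pending
        rho += np.outer(done, done)
        pending = k_failure @ pending
    rho += np.outer(pending, pending)  # branch in which every attempt failed
    return rho


def fidelity_with_target(theta, psi, extra_iterations):
    """Overlap of the traced-out state with exp(-i q(theta) Y)|psi>."""
    q = np.arctan(np.tan(theta) ** 2)
    target = ry(2 * q) @ np.asarray(psi, dtype=float)
    rho = thermalised_state(theta, psi, extra_iterations)
    return float(target @ rho @ target)
```

With θ = π/4 and $|\psi\rangle = |0\rangle$ the success probability is 1/2 and the failed branch is orthogonal to the target, so in this model the fidelity after $P$ extra iterations is $1 - 2^{-(P+1)}$.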
Results of applying ancillae thermalization to the quantum perceptron, obtained from IBMQ's 'qasm-simulator' (a simulator with no noise) and the 'ibmq_ourense' quantum machine, are shown in Fig. 2c). As in other applications of NISQ devices, there is an optimum circuit depth that balances the theoretical advantages of deeper circuits against the effects of noise. The quantum perceptron displays an increase in accuracy with increasing iterations up to a threshold, beyond which further operations increase exposure to finite gate fidelity and the accuracy decreases. This point is emphasised in the sub-figure, which shows a lower fidelity for a higher number of iterations.

C. Phase Estimation
Next, we apply our procedure to a groundstate preparation algorithm. Although more efficient state preparation algorithms exist, see [8][9][10], this setting is still of interest since it reveals the role of the ancillae as an effective low-temperature bath in addition to demonstrating a FP OAA scheme for non-unitary transformations.
The quantum phase estimation algorithm shown in Fig. 3a) computes the eigenvalue $\theta$ satisfying $A|A\rangle = \exp(2\pi i\theta)|A\rangle$. Post-selecting on the precision qubit register can prepare the target qubits in the groundstate of an $n$-qubit Hamiltonian. Ancillae thermalization achieves the same effect by tracing out ancillae qubits: the ancillae effectively provide a low entropy reservoir into which the excess energy of the target state can be transferred. The circuit shown in Fig. 3b) achieves the same level of accuracy as $O(N)$ attempts of phase estimation with post-selection, with $N = 2$. In a similar manner to the quantum perceptron, a total overlap with the groundstate within additive error $\epsilon$ is achieved by applying $V$ to fresh ancillae and the target qubits $O(2^m\log(1/\epsilon))$ times. After tracing out the ancillae qubits, the fidelity between the finalised mixed target state $\rho$ and the groundstate is given by, where $J = (N-1)/N$, $N = 2^n$ is the number of eigenstates and $m$ is the minimum number of precision qubits required to distinguish between all energy values without imperfections. Therefore the precision, i.e. the number of ancillae used in the phase estimation circuit, dictates an upper bound on the fidelity. Ancillae thermalization shows an exponential increase in fidelity as the number of iterations increases compared to the fidelity attained with the same number of attempts of the post-select circuit. We assume that the value of the groundstate energy is known up to precision $2^{-m}$. Additionally, we assume that a pre-processing procedure has shifted all energy values by this amount such that correctly preparing the groundstate is indicated by measuring $|0\rangle^{\otimes m}$.

Results of applying ancillae thermalization to groundstate preparation, obtained from IBMQ's 'qasm-simulator' with the addition of simulated noise, are shown in Fig. 3c). The fidelity was computed between the finalised target qubits and the groundstates of the one-qubit Hamiltonian $H_1 = 0|0\rangle\langle 0| - \frac{3\pi}{2}|1\rangle\langle 1|$ and the two-qubit Hamiltonian $H_2 = \sum_{i,j=1}^{4} a_{ij}|i\rangle\langle j|$ (a random set of parameters $a_{ij}$ was chosen in the latter case, as described in Appendix C) for different numbers of iterations of the ancillae thermalization circuit. As in the case of the quantum perceptron, the increase in fidelity with the number of iterations reaches an upper bound when the circuit depth leads to too great an exposure to gate noise. The fidelity is lower in the two-qubit case than predicted analytically. This is due to an approximation made on the initialization and scrambling operations on the target qubits.

FIG. 2. Quantum Perceptron/non-linear activation function $q(\theta)$. a) Post-select circuit for implementing angle $q(\theta)$ in the quantum perceptron, acting on an ancilla and the target qubit. A successful transformation of $\exp(-iq(\theta)Y)|\psi\rangle$ corresponds to measuring $|0\rangle$ on the ancilla with probability $p(\theta) = \cos^4(\theta) + \sin^4(\theta)$. Upon failure, when $|1\rangle$ is measured on the ancilla, the target qubit is guaranteed to transform as $\exp(-i\pi/4)|\psi\rangle$. As a result, the target qubit can be reset by the rotation $R_y(\pi/2)$ and the circuit is repeated. b) Ancillae thermalization circuit for an equivalent $O(N)$ attempts of post-selection, with $N = 2$ applications of the circuit in a). The reset and second instance, which acts on a new ancilla, are both conditioned by the state of the first ancilla. c) Angle $q(\theta)$ obtained by ancillae thermalization for different numbers of iterations and $\theta$. The values were obtained from IBMQ's 'qasm-simulator' (symbols) (no noise) and 'ibmq_ourense' quantum computer (rings). The subfigure shows the fidelity from 'ibmq_ourense' between the finalised target qubits and groundstate for different numbers of iterations.
Furthermore, due to the inclusion of Toffoli gates in the NOR gate, the ancillae thermalization modification of post-selection is highly sensitive to noise. This is exacerbated for larger Hamiltonians due to the increase in the number of gates required to act on the target qubits and the slower convergence of fidelity with the number of iterations. A more detailed discussion of these effects and of the parameter values used in the simulations can be found in Appendix C.

OAA schemes alone cannot deterministically implement non-unitary transformations. However, recent developments in block-encoding [11] and quantum signal processing [12] allow us to embed an approximate $n$-qubit projector in the upper-left corner of an $n+k$ qubit unitary, where $k$ is the number of ancillae needed for the encoding. Amplitude amplification is then used to deterministically implement this approximate projector onto the target qubits [10]. A further comparison of resource costs between these methods can be found in table II in Appendix A. State of the art algorithms for groundstate preparation assume the initial target qubit state has a non-trivial overlap with the groundstate. For ease of demonstration we initialize the target state in an equal superposition of all eigenstates and construct a reset gate, i.e. the scrambling gate, which scrambles every eigenstate such that the output has an equal overlap with all other eigenstates. In order for ancillae thermalization to have a competitive groundstate preparation algorithm, a more sophisticated initialization and reset procedure must be implemented.

FIG. 3. [...] The NOR gate compiles conditions from the ancillae whilst the reset gate, $S$, redistributes the weights of the incorrectly prepared states of the target qubits onto all bit strings. Each iteration acts on the scrambled states of the target qubits and is controlled by the output of the last NOR ancilla. A complete description of the NOR and scrambling gates can be found in Appendix B. c) Fidelity between the finalised target qubits and groundstate of the Hamiltonian using ancillae thermalization. We show results for different numbers of iterations. The results were obtained from IBMQ's 'qasm-simulator' for $H_1 = 0|0\rangle\langle 0| - \frac{3\pi}{2}|1\rangle\langle 1|$ and $H_2 = \sum_{i,j=1}^{4} a_{ij}|i\rangle\langle j|$. We simulate different levels of noise based upon a model of thermal relaxation between the qubits and their environment. The range of data points for the noise based simulations was restricted due to computational limitations. Additionally, an approximation was made on the initialization and scrambling operations in the non-diagonal two-qubit case. Further details of these experiments can be found in Appendix B.

III. ROBUSTNESS TO NOISE
Resource costs such as the number of qubits and gate operations are a good indication of an algorithm's efficiency. On near-term quantum devices, however, an algorithm's robustness to noise is a much more practical measure. In the regime of large initial success probability studied here, we find ancillae thermalization to be more robust to gate noise than π/3 FP OAA.
Demonstrating robustness to noise: Fig. 4 c)/d) demonstrates ancillae thermalization's robustness to noise compared with π/3 FP OAA for the quantum gearbox, an extension of the quantum perceptron with two ancillary qubits [13]. The circuits for the quantum gearbox, ancillae thermalization and π/3 FP OAA are given in Fig. 4a, Fig. 4b and Fig. 6 in Appendix C. The multi-control gates in both circuits were implemented without additional ancillae. Specifically, the 3-qubit control gate was implemented using 6 C-NOT gates and 7 single-qubit control gates, whilst the 4-qubit control gate was implemented using 2 single-qubit control gates and three 3-qubit control gates [14]. Simplified multi-qubit Toffoli gates were used to further reduce operational cost [15]. Full state tomography was performed on the target qubit with 24576 shots for both circuits. The fidelity was then computed between the target qubit and the desired transformation, $R = e^{-iY\theta}$, where $\sin(\theta) = \sin(\theta_1)\sin(\theta_2)$. Note that in practice full state tomography is not required, as the target qubit will be assumed to have a sufficiently large fidelity with the desired state. Additionally, the number of single-qubit and C-NOT operations was measured per iteration of each algorithm. In addition to the demonstration on IBMQ's quantum machine, simulations of both circuits were performed with depolarising and thermal relaxation noise. The depolarising error parameter was $\lambda = 0.001$ and $\lambda = 0.01$ for all single-qubit and two-qubit gates, respectively. Details of the latter noise model are given by the 'high' noise level in table III in Appendix C. All circuits run on IBMQ's quantum machine were compiled using the OpenQASM backend [16] without additional error mitigation techniques.

FIG. 4. [...] [13]. This circuit is a generalisation of the quantum perceptron found in Fig. 2, which has $m = 1$ ancilla. Measuring $|0\rangle^{\otimes m}$ on the ancillae with probability $p(\theta) = \cos^4(\theta) + \sin^4(\theta)$ corresponds to the successful transformation of $\exp(-iq(\theta)Y)$ on the target qubit, where $q(\theta) = \arctan(\tan^2(\theta))$ and $\sin(\theta) = \sin(\theta_1)\ldots\sin(\theta_m)$. Measuring $|1\rangle$ on any ancilla corresponds to applying $R_y(-\pi/2)$ onto the target qubits, and thus can always be reset by applying $R_y(\pi/2)$. For the simulation $\theta_1 = \theta_2 = \pi/4$. b) Ancillae thermalization circuit for an equivalent $O(N)$ attempts of post-selection with $N = 2$ applications of the circuit in a). Unitary $V$, which acts on the fresh ancillae and incorrect states of the target qubit, is conditioned by the state of all ancillae from the previous iteration. c/d) Fidelity between the finalised target qubit and desired state, $\exp(-iq(\theta)Y)|\psi\rangle$, using ancillae thermalization and π/3 FP OAA for different numbers of operations, i.e. exposure to gate noise. Both dephasing and depolarising noise models were simulated in c), whilst d) was run on the 'ibmq_vigo' quantum device, in addition to a noiseless simulation.
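The fidelity estimate described above can be sketched for a single target qubit as follows. The sketch assumes the tomography data have already been reduced to measurement counts in the X, Y and Z bases and uses plain linear inversion; the paper does not state which reconstruction method was used, and the function names are illustrative.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def expectation_from_counts(counts_plus: int, counts_minus: int) -> float:
    """Pauli expectation value from the +1 and -1 outcome counts in one basis."""
    return (counts_plus - counts_minus) / (counts_plus + counts_minus)


def density_from_paulis(ex: float, ey: float, ez: float) -> np.ndarray:
    """Linear-inversion estimate rho = (I + <X> X + <Y> Y + <Z> Z) / 2."""
    return 0.5 * (np.eye(2, dtype=complex) + ex * X + ey * Y + ez * Z)


def state_fidelity(rho: np.ndarray, target) -> float:
    """Fidelity <target| rho |target> with a normalised pure target state."""
    target = np.asarray(target, dtype=complex)
    return float(np.real(np.conj(target) @ rho @ target))
```

With finite shot counts the linear-inversion estimate need not be positive semi-definite, so a constrained fit may be preferable when the state is close to pure.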
Origin of Noise Robustness: We believe that the robustness to noise of ancillae thermalization arises because of the intrinsic robustness of thermalization to changes in the coupling of the system to the bath. In the special case of the quantum perceptron, some of the controls, which are the proxy for the system-bath interaction, can be removed entirely without any detriment to the performance. This can be seen in Fig. 2 b), where no controls are placed upon the $R_y(2\theta)$ rotations of the fresh ancillae. We have preliminary evidence of robustness to reducing controls in other circumstances. This suggests that ancillae thermalization inherits robustness to noise from its independence of the details of the bath-system interaction. A thorough analysis will be the subject of a future work.
An additional consideration in the comparison between ancillae thermalization and π/3 FP OAA is the number of operations acting on each ancilla qubit. This number is fixed in ancillae thermalization by the depth of the post-select unitary, regardless of the finalised accuracy, exposing ancillae to less gate noise.
We expect our intuition to be applicable to a variety of circuit and quantum machine archetypes. The fixed-point quantum search proposed by Yoder et al. has been shown to have an exponential decrease in query complexity over π/3 fixed-point quantum search in the regime of small initial success probabilities. Currently, it is unknown whether fixed-point quantum search for OAA has an increased robustness to gate noise compared to ancillae thermalization in this regime. In the large initial success probability regime, however, it is known that the operational costs of fixed-point quantum search coincide with π/3 FP OAA for an equivalent finalised success probability. Therefore, although a direct comparison has not been made, we expect ancillae thermalization to have the highest overall robustness to gate noise within the large initial success probability regime.
Mitigating qubit costs and control complexity: One drawback of ancillae thermalization is the use of resource intensive control gates. However, the same considerations that suggest robustness to gate noise also motivate ways to mitigate control costs and complexity. Inspired by the fact that a system thermalizes when only a subset of its modes are coupled to a heatbath, we have preliminary evidence that the number of control qubits and complexity of controls can be reduced by conditioning on a subset of factors of the unitary. In the case of thermalization, the coupling to the bath can be simple providing that the Hamiltonian is sufficiently scrambled. The scrambling transfers energy to the bath-coupled modes where it is dissipated. We find that it is sufficient to control simple factors of the unitary with the more complex factors playing the role of scrambling. In the context of repeat until success, such a mitigation scheme would correspond to controlling on an imperfect error flag. The result would no longer be guaranteed a success. This is less useful than its application in ancillae thermalization where even an imperfect correction of error increases the amplitude for success exponentially in time. The speedup from conditioning on a subset of operations does not change the linear-in-time scaling of the number of qubits, only the pre-factor to this scaling. Moreover, the time to reach the desired accuracy increases, reflecting the increased time to thermalise if the coupling to the bath is weakened. Nevertheless, this ability to tension these costs against one another will likely prove useful in near-term applications.

IV. DISCUSSION
The non-linearity of the classical world can be understood by the observation of a minor part of a quantum system -the unobserved part of the system acting as an environment. The environment can be interpreted as a heatbath extracting entropy from our system, or equivalently an entanglement bath which gradually and selectively entangles with a subset of our system. A simple and effective model of a heatbath is to assume no back reaction so that each mode of the heatbath interacts exactly once with the system of interest. It forms such a small fraction of the overall size of the bath that the bath distribution is unaltered. At the same time, the fact that the system never interacts with this mode again means that the back reaction effects are not felt. We have used these two ideas to allow a set of ancillae qubits initialised in some low entropy state to extract entropy from our system. The free evolution of our ancillae is with a zero energy Hamiltonian -ensuring that entropy only flows from the system of interest to the ancillae and each ancilla interacts only once with the system corresponding to a no back reaction condition. The resulting algorithm is a type of FP OAA for unitary and non-unitary transformations, achieving non-linearity by tracing out auxiliary degrees of freedom. Its structure is rather different from its counterparts, which require fewer qubits. The π/3 FP OAA scheme achieves an optimal amplitude amplification through a cunning cancellation of phases. It is a fundamental observation of statistical mechanics however that the nature of a heat bath does not determine the thermal equilibrium state (provided suitably weak coupling). Our scheme effectively harnesses this universality to obtain a degree of robustness to gate infidelity in addition to deterministically implementing a wider class of transformations without knowledge of the target state. 
Moreover it seems possible to reduce the qubit cost and control gate complexity at the expense of longer times. This gives additional freedom to operate within the NISQ constraints of qubit count and gate fidelity. Which scheme is optimal is contingent upon the particular system to which the algorithm is applied. Still, it is gratifying that there exists a regime where a simple physically motivated scheme such as the one we present can outperform other methods.

V. ACKNOWLEDGEMENT
We acknowledge support from the EPSRC: LW and FB through EP/L015854/1 , FB and AGG through EP/S005021/1, JD through EP/S021582/1 and GHB through support from the Royal Society via a University Research Fellowship, as well as funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (Grant Agreement No. 759063)

Appendix A: Resource Scaling
The entries in Table I of the main paper compare the resources required to implement ancillae thermalization with those of the other techniques discussed in this paper. Here we give more detail on how these entries are obtained.

Unitary Transformation
Measurements: For a post-select unitary $U$ acting on $m$ ancillae and target qubits $|\psi\rangle$, up to $O(2^m)$ measurements on the ancillae are required to implement $R$ on $|\psi\rangle$. This assumes all measurements are independent of each other. On the other hand, no measurements are required in π/3 FP OAA and ancillae thermalization.

Number of Qubits:
All qubits can be reused upon failure in post-select circuits since the ancillae state collapses upon measurement. The state $E|\psi\rangle$ is independent of $R$ and can always be reset. Therefore $U$ acting on $m$ ancillae and $n$ target qubits requires $n+m$ qubits in total. Using an additional $m-1$ qubits in OAA allows for a linear scaling in $Q(S_m(\pi/3))$. To achieve a fidelity $\langle\psi|R^\dagger\rho R|\psi\rangle = 1-\epsilon$, ancillae thermalization requires $O\left(\log(1/\epsilon)/\log(1/(1-p_0))\right)$ applications of $V$. Insertions of $m$ new ancillae and $m-1$ NOR ancillae qubits are needed at each application. Therefore a total of $O\left(n + \frac{\log(1/\epsilon)}{\log(1/(1-p_0))}\,m\right)$ qubits are required for ancillae thermalization.
Operations: The most important resource in the comparison of ancillae thermalization and OAA is the number of gate operations required. This is computed in the case of post-selection, OAA and ancillae thermalization as follows:

i. Post-selection. We define $Q(O_N)$ as the number of single-qubit and C-NOT gates required to implement the $N$-qubit gate $O$. Therefore, $U$ requires a total of $Q(U_{n+m})$ operations for $n$ target and $m$ ancillary qubits. No further coherence time or operations are needed, since the ancillae become disentangled from the target qubits once measured and the circuit can be reset.

ii. π/3 FP OAA. A detailed derivation of the resource costs can be found in Ref. [17]. We summarise the key results here. An upper bound on the error for the $k$'th nested iteration is given by $(1-p)^{3^k} \le \epsilon$. Rearranging we find,
$$k \ge \log_3\left(\frac{\log(1/\epsilon)}{\log(1/(1-p))}\right). \qquad \text{(A1)}$$
The number of operations at each iteration can be computed using the recursive relation $Q(A_{n,m}(j)) = Q(A_{n,m}(j-1)) + 2Q(S_m(\pi/3))$. Consequently its closed form can be written as, where the initial conditions are given in Eqs. D2 and D3. The function $Q(S_m(\pi/3))$ is the number of operations required to implement the controlled phase shift. Assuming access to additional ancillae, $S_m(\pi/3)$ can be constructed in a similar way to the NOR gate such that the number of operations scales linearly with $m$, for $m > 2$. Using our expression for $k$ from Eq. A1 and $Q(S_m(\pi/3))$, the result for the total number of operations in π/3 FP OAA follows.

iii. Ancillae thermalization. Ancillae thermalization produces an overlap $\langle\psi|R^\dagger\rho R|\psi\rangle = 1-(1-p)^{P+1}\left(1-|\langle\psi|R\psi\rangle|^2\right)$ between the finalised target qubits and the desired state for $P$ iterations of $V$. To achieve an overlap $1-\epsilon$, $O(\log(1/\epsilon))$ implementations of $V$ are required. Each $V$ consists of $Q(U_{n+m}) + Q(R_n)$ operations, of which controls need only be implemented on gates acting on the target qubits. The result for the number of operations required in ancillae thermalization follows.
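The two error bounds quoted above can be compared numerically. The following minimal sketch finds the smallest nesting depth $k$ satisfying $(1-p)^{3^k} \le \epsilon$ and the smallest number of iterations $P$ satisfying $(1-p)^{P+1}(1-|\langle\psi|R\psi\rangle|^2) \le \epsilon$. It counts iterations only, not gate operations; the function names are illustrative, and `overlap_sq` stands for $|\langle\psi|R\psi\rangle|^2$ with the worst case 0 assumed by default.

```python
def oaa_nesting_depth(p: float, eps: float) -> int:
    """Smallest k with (1 - p)**(3**k) <= eps; assumes 0 < p <= 1 and eps > 0."""
    k = 0
    while (1.0 - p) ** (3**k) > eps:
        k += 1
    return k


def thermalisation_iterations(p: float, eps: float, overlap_sq: float = 0.0) -> int:
    """Smallest P with (1 - p)**(P + 1) * (1 - overlap_sq) <= eps; assumes 0 < p <= 1 and eps > 0."""
    iterations = 0
    while (1.0 - p) ** (iterations + 1) * (1.0 - overlap_sq) > eps:
        iterations += 1
    return iterations
```

For $p = 1/2$ and $\epsilon = 10^{-3}$ this gives $k = 3$ and $P = 9$. The integer search is used instead of the closed-form logarithm to avoid rounding issues when the bound is met exactly.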

Groundstate Preparation
In this paper we use groundstate preparation as a demonstration to deterministically apply a non-unitary operation onto target qubits using ancillae thermalization. We also mention using OAA in addition to linear combination of unitaries to achieve the same task.
Here we give a comparison of resources between the two schemes.

a. Phase Estimation
The phase estimation algorithm (PEA) implements an exact projector onto the groundstate of a Hamiltonian with $p = 1-\epsilon$. Error is intrinsic to PEA and originates from the binary approximation of the eigenvalues. A higher precision is required to ensure these 'imperfections' do not affect the computation. We assume a pre-processing shift has occurred such that the groundstate energy value is 0 and initialise the target state in an equal superposition of all eigenstates.
Measurements: For an initial target state $|\psi\rangle$ given as an equal superposition of eigenstates, an average of $O(p_0^{-1}) = 2^n$ measurements, for $p_0 = 2^{-n}$, are required for groundstate preparation. No measurements are required in ancillae thermalization nor in block encoding + OAA.

Number of Qubits:
Post-selection and the PEA act upon an $n$-qubit input state and $m$ ancillae qubits. Here, the number of ancillary qubits is dependent on the required precision of the eigenvalue, $m \sim O(n/2 + \log(1/\epsilon) + \log(1/\Delta))$, where $\Delta$ is the lower bound on the spectral gap [8]. Qubits can be reused upon failure. Ancillae thermalization with the PEA requires a total of $O\left(m\log(1/\epsilon)/\log(1/(1-p_0))\right)$ qubits, the result follows with
ii. Ancillae thermalization. The number of operations for the NOR gate scales linearly with $m$, whilst two additional operations are needed per ancilla to implement a control. We require all qubits that have been acted on to remain coherent throughout the computation and assume $Q(W_n) \sim O(1)$ or $\sim O(\log(n))$ [19]. The total number of operations is $O\left(\log(1/\epsilon)/\log(1/(1-p_0))\,Q(U_{n+m})\right)$; the result follows.

TABLE II. Computational resources (measurements, qubits and gates) for groundstate preparation by post-selection and ancillae thermalization using phase estimation (PEA), as well as ancillae thermalization using linear combination of unitaries (LCU) via the results found in Ref. [8]. We also show the results for the state-of-the-art quantum groundstate preparation algorithm [10]. This algorithm uses block-encoding in addition to amplitude amplification to approximately project the target qubits onto the groundstate, where $k$ is the number of qubits required in the block-encoding. $\tilde{O}$ denotes scaling up to polylogarithmic factors. Here we assume the number of calls to $H$, $\Lambda \sim d$, where $d$ is the sparsity of the Hamiltonian, i.e. the maximum number of non-zero elements in each row of the Hamiltonian, and $\Delta$ is the lower bound on the spectral gap. Note that we assume the input target state is given as an equal superposition of all eigenstates. Various schemes may exponentially reduce the number of ancillary qubits required for ancillae thermalization, e.g. by conditioning on only a subset of elements of a factorisation of the unitary $U$. Indeed, a heat bath does not need to couple to all elements of a system to effectively cool.

b. Linear combination of unitaries
Linear combination of unitaries (LCU) can be used to construct a truncated Taylor series of the time-evolution operator. This approximately projects the target qubits onto the groundstate of the Hamiltonian. The implementation of this unitary is non-deterministic, so either oblivious amplitude amplification (OAA) or ancillae thermalization can be used to amplify the probability of success. A detailed derivation of the resource costs of LCU for groundstate preparation can be found in [8]. We summarise the key results here.

Number of Qubits:
The number of ancillary qubits, i.e. the precision, required for LCU is less than for the PEA: $m \sim O(\log(1/\Delta) + \log\log(2^n/\epsilon))$ qubits are needed to achieve the same accuracy. The probability of correctly implementing the groundstate projector by LCU is $p_0 \sim O(2^{-n})$, assuming $|\psi\rangle$ is in an equal superposition of eigenstates. We ignore any qubits required for the Hamiltonian simulation. Table II for ancillae thermalization.
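To give a feel for the difference between the two ancilla counts, the following sketch evaluates both scaling expressions with every hidden constant set to one and base-2 logarithms. Both choices are assumptions made for illustration only, as is reading the LCU term as $\log\log(2^n/\epsilon)$. The function names are hypothetical.

```python
import math


def ancillae_pea(n, epsilon, gap):
    """PEA precision-ancilla scaling: n/2 + log(1/eps) + log(1/gap).

    Constants are set to one and logs are base 2 (assumptions).
    """
    return n / 2 + math.log2(1.0 / epsilon) + math.log2(1.0 / gap)


def ancillae_lcu(n, epsilon, gap):
    """LCU ancilla scaling: log(1/gap) + log log(2**n / eps).

    log2(2**n / eps) is written as n + log2(1/eps) to avoid overflow.
    """
    inner = n + math.log2(1.0 / epsilon)
    return math.log2(1.0 / gap) + math.log2(inner)
```

Under these assumptions, for instance with `n=10`, `epsilon=0.01` and `gap=0.1`, the LCU expression is noticeably smaller than the PEA expression, in line with the statement above.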

Appendix B: Methods
We provide a detailed description of the methods and results found in the main text.

Ancillae Coupling and Reset Gates
To construct the ancillae thermalization circuit we use the methods discussed in Section II A to initially apply $U$, followed by iterations of the controlled unitary $V$. For groundstate preparation, $V$ includes a reset gate and the unitary $U$ of the phase estimation circuit, acting on excited states in the wavefunction of the target qubits. It is constructed as follows. Firstly, we only apply $V$ if the ancillae from the previous iteration are not in the state $|0\rangle^{\otimes t}$. This state corresponds to the preparation of the target qubits in the groundstate. The condition is checked through use of a NOT-OR (NOR) gate. This logic gate acts upon all ancillae from the previous iteration and an additional $m-1$ NOR ancillae qubits. The result of whether the previous ancillae correspond to the preparation of the groundstate is output onto the last NOR ancilla. Secondly, a reset gate consisting of a scrambling operation acts upon the target qubits, conditioned on the last NOR ancilla. The purpose of the scrambling operation is to redistribute the probability of an eigenstate to all other eigenstates equally. A full description of the NOR and scrambling gates can be found in appendices B 2 and B 3 respectively. Finally, another application of the phase estimation unitary $U$, controlled by the last NOR ancillary qubit, acts upon the target qubits and a batch of $m$ fresh ancillae.

NOR Gate
The quantum NOT-OR (NOR) gate shown in Fig. 5 is a quantum logic gate. Its purpose is to compile the controls on all excited states represented by the ancillae onto a single qubit. The gate acts on the $m$ precision ancillae from each iteration in addition to $m-1$ NOR ancillae in the initial state $|0\rangle^{\otimes m-1}$. The state of the last NOR ancilla is $|0\rangle$ if and only if the precision ancillae are in $|0\rangle^{\otimes m}$; otherwise the NOR qubit is in state $|1\rangle$. The scrambling gate $W$ and $U$ in the next iteration of $V$ are both controlled by the last NOR ancilla. If the groundstate energy $E_G \neq 0$, then a pre-processing procedure needs to be implemented to shift all the energies by a constant such that $|0\rangle^{\otimes m}$ corresponds to the preparation of the groundstate of an arbitrary Hamiltonian on the target qubits.

Fig. 5 (caption fragment): We can see numerically that the unitary increases the overlap between the states to $\sim 1/N$, becoming more accurate as the system size increases. This indicates that the application of a scrambling gate gives the desired result of redistributing the probability weights of each state equally.
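The logical action of the NOR gate on computational-basis states can be sketched classically. The example below assumes a simple chain layout in which each of the $m-1$ NOR ancillae accumulates the OR of the precision bits seen so far. This layout is an assumption for illustration; the actual circuit is the one in Fig. 5, which is not reproduced here. The output convention follows the text: the last ancilla is 0 if and only if all precision bits are 0.

```python
def nor_ancilla_chain(precision_bits):
    """Classical action of an assumed chain of m-1 NOR ancillae.

    precision_bits: list of 0/1 values for the m precision ancillae.
    Returns the m-1 ancilla values; the last one is 0 iff every
    precision bit is 0 (the convention used in the text).
    """
    m = len(precision_bits)
    if m < 2:
        raise ValueError("need at least two precision ancillae")
    running = precision_bits[0] | precision_bits[1]
    ancillae = [running]
    for bit in precision_bits[2:]:
        running |= bit          # accumulate the OR of all bits so far
        ancillae.append(running)
    return ancillae


def apply_next_iteration(precision_bits):
    """True when the scrambling gate and U should act in the next iteration."""
    return nor_ancilla_chain(precision_bits)[-1] == 1
```

Only the last entry is used as the control; the intermediate entries are the working ancillae of the assumed chain.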

Scrambling Gate
The purpose of the scrambling gate is to redistribute the weight of each eigenstate equally amongst all other eigenstates on the target qubits. We assume that an equal distribution of eigenstates corresponds to a maximally mixed set of bit strings. This approximation is discussed further below. Fig. 5b shows that for a randomly chosen eigenstate of $U$, applications of arbitrary local unitaries will increase its overlap with another eigenstate. As $N$ increases, the mean overlap between the transformed eigenstate and a perpendicular state converges to $1/N$. To verify this result numerically, an $N$-qubit state $|\lambda_1\rangle$ and its perpendicular state $|\lambda_1^{\perp}\rangle$ were chosen uniformly with the Haar measure. The overlap between $|\lambda_1^{\perp}\rangle$ and $|\lambda_1\rangle$ acted on by local Hadamard gates, i.e. $H^{\otimes N}$, was computed and the process was repeated 1000 times. The simulation was also repeated with local $X$ gates, which gave the same result.

Scrambling Approximation: The eigenstates of a diagonal unitary correspond to single bit strings. Consequently a maximally mixed state of bit strings corresponds to an equal distribution of eigenstates, where application of a Hadamard gate on a bit string scrambles the state entirely. This is not true for non-diagonal unitaries, where eigenstates correspond to a linear combination of bit strings. However, as $N$ increases so does the number of eigenstates which have an average overlap $1/N$ with each bit string. Therefore the assumption that a maximally mixed set of bit strings can approximate an equal distribution of eigenstates for an arbitrary unitary becomes more accurate as the number of target qubits increases.
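The numerical check described in this section can be sketched with NumPy as follows. This is a minimal illustration, not the code used for the paper: the helper names are hypothetical, Haar-random states are drawn by normalising complex Gaussian vectors, and the overlap is taken to be the squared modulus of the inner product. The sketch also reads the $1/N$ limit as one over the Hilbert-space dimension $2^n$ for $n$ qubits, which is an assumption about the notation.

```python
import numpy as np


def haar_state(dim, rng):
    """Haar-random pure state: a normalised complex Gaussian vector."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)


def perpendicular_state(state, rng):
    """Random state orthogonal to `state` (project out, then renormalise)."""
    v = haar_state(state.size, rng)
    v = v - np.vdot(state, v) * state
    return v / np.linalg.norm(v)


def hadamard_layer(n_qubits):
    """Dense matrix of a Hadamard gate on every qubit."""
    h = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
    out = np.array([[1.0]])
    for _ in range(n_qubits):
        out = np.kron(out, h)
    return out


def mean_overlap(n_qubits, samples=1000, seed=0):
    """Mean of |<lambda_perp| H^(x)n |lambda>|^2 over random state pairs."""
    rng = np.random.default_rng(seed)
    dim = 2 ** n_qubits
    w = hadamard_layer(n_qubits)
    total = 0.0
    for _ in range(samples):
        lam = haar_state(dim, rng)
        perp = perpendicular_state(lam, rng)
        total += abs(np.vdot(perp, w @ lam)) ** 2
    return total / samples
```

Calling `mean_overlap(n)` for increasing `n` and comparing with `1 / 2**n` reproduces the trend described above; swapping the Hadamard layer for a layer of $X$ gates is a small change to `hadamard_layer`.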

Fidelity Calculations
In this section we derive the fidelity between the final state of the target qubits produced by the tracing-out method and the desired state, for the Quantum Perceptron and groundstate preparation algorithms.

a. Quantum Perceptron
Applying the Quantum Perceptron post-select unitary $U$ to the state $|0\rangle|\psi\rangle$ produces,

where $|\psi'\rangle = R(\theta)|\psi\rangle$. The reset gate $W$ transforms $E|\psi\rangle \to |\psi\rangle$ and a new ancilla is inserted. $U$, conditioned on the state of the previous ancilla, acts on the new ancilla and target qubits to give,

Resetting, inserting new ancillae and applying $U$ conditioned on the previous iteration's ancilla $P$ times leads to the state,

where $|k\rangle = |1\rangle^{\otimes k}|0\rangle^{\otimes(P-k)}$ and $|P\rangle = |1\rangle^{\otimes P}$. The density matrix after $P$ steps is given by,

where $|A\rangle = \sum_{k=0}^{P-1}\sqrt{p(\theta)(1-p(\theta))^{k}}\,|k\rangle$ and $|B\rangle = (1-p(\theta))^{P/2}|P\rangle$. Using $\langle k|k'\rangle = 0$ for all $k \neq k'$, a partial trace is performed on the ancillae to obtain,

Using the equation above, the fidelity between the finalised target qubits and desired state, $F = \mathrm{Tr}(\rho^{P}_{\mathrm{target}}|\psi'\rangle\langle\psi'|)$, can be written as,

As the quantum perceptron is an example of a single-ancilla repeat-until-success circuit, the result can easily be generalised to the $m$ ancillae case.
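The chain of steps in this derivation can be sketched as follows. This is a reconstruction under stated assumptions rather than the paper's own equations: it assumes that the post-select unitary has the standard repeat-until-success form with success flagged by the ancilla state $|0\rangle$, that the failure state $E|\psi\rangle$ is normalised, and that the success probability $p(\theta)$ is the same in every round because the reset gate restores $|\psi\rangle$.

```latex
\begin{align}
% Assumed action of the post-select unitary on a fresh ancilla
U|0\rangle|\psi\rangle
  &= \sqrt{p(\theta)}\,|0\rangle|\psi'\rangle
   + \sqrt{1-p(\theta)}\,|1\rangle E|\psi\rangle ,
  \qquad |\psi'\rangle = R(\theta)|\psi\rangle , \\
% State after P rounds, written with |A> and |B> as defined in the text
|\Psi_P\rangle
  &= |A\rangle|\psi'\rangle + |B\rangle E|\psi\rangle , \\
% Partial trace over the ancillae; cross terms vanish since <k|P> = 0
\rho^{P}_{\mathrm{target}}
  &= \langle A|A\rangle\,|\psi'\rangle\langle\psi'|
   + \langle B|B\rangle\,E|\psi\rangle\langle\psi|E^{\dagger} \nonumber \\
  &= \bigl[1-(1-p(\theta))^{P}\bigr]\,|\psi'\rangle\langle\psi'|
   + (1-p(\theta))^{P}\,E|\psi\rangle\langle\psi|E^{\dagger} , \\
% Fidelity with the desired state
F &= \mathrm{Tr}\bigl(\rho^{P}_{\mathrm{target}}|\psi'\rangle\langle\psi'|\bigr)
   = 1-(1-p(\theta))^{P}\bigl(1-|\langle\psi'|E|\psi\rangle|^{2}\bigr) .
\end{align}
```

Here $\langle A|A\rangle = \sum_{k=0}^{P-1} p(\theta)(1-p(\theta))^{k} = 1-(1-p(\theta))^{P}$ is a geometric sum. Under these assumptions $F \geq 1-(1-p(\theta))^{P}$, so requiring an error of at most $\epsilon$ gives $P \geq \log\epsilon/\log(1-p(\theta))$, consistent with the scaling quoted in the abstract.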

b. Groundstate Preparation
To prepare the target qubits in the groundstate of a specified Hamiltonian we utilise quantum phase estimation. This algorithm computes $0 \leq \theta \leq 1$ which satisfies $A|\lambda\rangle = e^{2\pi i\theta}|\lambda\rangle$, up to a finite precision for $m$ ancillae qubits in the first register. In other words it computes the binary value $\tilde{\theta} = 0.\theta_1\theta_2\ldots\theta_m$ with $\theta_i \in \{0, 1\}$. The unitary $A$ can always be constructed from a Hermitian matrix $H$ such that $A|\lambda\rangle = e^{-iE\tau}|\lambda\rangle \Rightarrow E = -2\pi\theta/\tau$, where $E$ is the energy corresponding to eigenstate $|\lambda\rangle$. Note that in all of our experiments $\tau = 1$.
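The relationship between the phase, its $m$-bit binary expansion and the energy can be sketched in a few lines. The helper names are hypothetical, and the sketch truncates $\theta$ to $m$ bits, which is a simplifying assumption: phase estimation returns an $m$-bit estimate probabilistically rather than by exact truncation.

```python
import math


def phase_to_bits(theta, m):
    """First m binary digits theta_1 ... theta_m of theta = 0.theta_1 theta_2 ..."""
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must satisfy 0 <= theta < 1")
    bits = []
    frac = theta
    for _ in range(m):
        frac *= 2.0
        bit = int(frac)      # integer part is the next binary digit
        bits.append(bit)
        frac -= bit
    return bits


def bits_to_phase(bits):
    """Value of the binary fraction 0.theta_1 theta_2 ... theta_m."""
    return sum(b / 2 ** (i + 1) for i, b in enumerate(bits))


def energy_from_phase(theta, tau=1.0):
    """Energy from E = -2 * pi * theta / tau (tau = 1 in the experiments)."""
    return -2.0 * math.pi * theta / tau
```

With $m$ bits the truncated value differs from $\theta$ by less than $2^{-m}$, which is the finite precision referred to above.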
The $n$ target qubits are initialised to an equal superposition of all eigenstates $|\lambda_i\rangle$. Implementing the phase estimation unitary $U$ onto $|0\rangle^{\otimes m}|\psi_n\rangle$ gives,

The false positive in the prepared groundstate originates from the finite precision on the eigenvalues. A scrambling operation $S$ is performed on the incorrectly prepared eigenstates by placing a condition on the ancillae. This operation produces an overlap $\langle\lambda_i|S\lambda\rangle \sim \frac{1}{N}$ for all $i = 1, \ldots, N$. Details of $S$ and the conditioning on the ancillae can be found in Appendix B. A batch of $m$ new ancillae is inserted. $U$ acts on the new ancillae and target qubits, conditioned on the previous ancillae, to transform the state as,

where $|k\rangle = \sum_{i=N^*+1}^{N}|\theta_i\rangle$ and a conditioned $S$ has been applied to the target qubits. After $P$ iterations of inserting ancillae, applying $U$ and $S$, the state is given by,

By expanding $|\psi_N\rangle$ it can be shown that scrambling the state increases the overlap with the groundstate,

where $|\lambda_G\rangle = \sum_{j=1}^{N^*}|\lambda_j\rangle$ and $|\lambda_E\rangle = \sum_{j=N^*+1}^{N}|\lambda_j\rangle$. The density matrix after $P$ steps is given by,

where,

Performing a partial trace on the ancillae, the density matrix of the target qubits is given by,

Using the equation above, the fidelity between the finalised target qubits and groundstate can be written as,

The fidelity is bounded by the distinguishability of the eigenstates. In the main paper we choose $m$ such that $N^* = 1$.

… Fig. 4a. The circuit consists of repetitions of the post-select unitary $U$ and controlled phase gate $S_m(\pi/3)$ for $m = 2$, which performs a phase shift on the $|0\rangle$ state of the top ancillary qubit. The quantum gearbox with $\pi/3$ FP OAA was simulated with depolarizing and thermal relaxation noise, in addition to being run on IBMQ's quantum device. Fidelity between the finalised target and desired state was computed for different numbers of nested iterations and compared against ancillae thermalization. Figs. 4(c) and 4(d) show that, in all three noise models, ancillae thermalization has an increased robustness to gate noise as a function of circuit depth.

… circuits into the universal gate set consisting of arbitrary single-qubit rotations and C-NOT gates. We use a total of 8192 shots for each data point in both examples and chose not to display error bars since they are statistically negligible. Fig. 6 shows the $\pi/3$ FP OAA circuit for the Quantum Gearbox, implemented using Qiskit. The code used for the experiments in this paper can be found at [21].

Hamiltonians
The Hamiltonians used in the experiments are given by,

Orthonormalization of $V$ is ensured by the Gram-Schmidt process. The motivation behind $H_2$ comes from the accuracy of the scrambling-gate approximation discussed in appendix B 3. As the number of target qubits increases, the space of applicable Hamiltonians increases and this approximation becomes more accurate.
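As an illustration of the orthonormalization step, the sketch below applies the (modified) Gram-Schmidt process to the columns of a complex matrix. The matrix passed in is a hypothetical stand-in: how $V$ is generated for the experiments is not specified in this section, so a random complex matrix is assumed in the usage note.

```python
import numpy as np


def gram_schmidt(columns):
    """Orthonormalise the columns of a complex matrix (modified Gram-Schmidt)."""
    columns = np.asarray(columns, dtype=complex)
    q = np.zeros_like(columns)
    for j in range(columns.shape[1]):
        v = columns[:, j].copy()
        for i in range(j):
            # remove the component along each previously fixed column
            v = v - np.vdot(q[:, i], v) * q[:, i]
        norm = np.linalg.norm(v)
        if norm < 1e-12:
            raise ValueError("columns are linearly dependent")
        q[:, j] = v / norm
    return q
```

For a square input such as `rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))`, the output `V` satisfies `V.conj().T @ V` equal to the identity up to rounding, so it can serve as a change of basis for a Hermitian matrix.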

Simulated Noise
The simulated noise for groundstate preparation represents the thermal relaxation between each qubit and its environment. The thermal relaxation noise model provided by Qiskit was used in the groundstate preparation experiment of $H_2$. This model is parameterised by the thermal relaxation time $T_1$, the dephasing constant $T_2$ and the implementation time of the CC-A, C-A, C-NOT and single-qubit gates. Table III shows the range of parameter values used in the groundstate preparation experiments with different levels of noise. The noise was computed by decomposing each gate into C-NOT and single-qubit gates, for which the noise is given explicitly. The gates C-A and CC-A were an exception to this decomposition; custom gate noises were computed for them using the values below. $T_1$ and $T_2$ were sampled for each qubit from normal distributions with means $\mu_1$ and $\mu_2$ respectively and shared variance $\sigma$.

… a bias $b$. This computes the input signal to the perceptron, $\theta = x_1 w_1 + x_2 w_2 + \ldots + b$. The second part maps $\theta$ onto the activation function $a(\theta) \in [0, 1]$. This is known as the state of a perceptron and is used either as an input for the next perceptron or as an output of a neural network. Within the quantum perceptron the latter of the two processes is represented by an angle of rotation upon a target qubit as a function of $\theta$. The challenge is to overcome the innate linearity of quantum dynamics to find a realisation of this non-linear function.
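The two parts of the classical perceptron described above can be written out directly. The text does not fix a particular activation function, so the logistic sigmoid used below is an assumption chosen only because it maps onto $[0, 1]$. The function names are illustrative.

```python
import math


def perceptron_input(x, w, b):
    """Input signal theta = x_1*w_1 + x_2*w_2 + ... + b."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    return sum(xi * wi for xi, wi in zip(x, w)) + b


def activation(theta):
    """Assumed activation a(theta) in [0, 1]: the logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-theta))


def perceptron_state(x, w, b):
    """State of the perceptron: the activation of the input signal."""
    return activation(perceptron_input(x, w, b))
```

The non-linearity sits entirely in `activation`; this is the step that the quantum perceptron has to realise as a rotation angle on the target qubit.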

Oblivious Amplitude Amplification
Oblivious Amplitude Amplification (OAA) replaces post-selection with tracing out ancillary qubits to guarantee a specified unitary transformation, without knowledge of the target state [5]. The allocation of resources for OAA differs from that of our proposed method. We focus upon an implementation that monotonically decreases the error of implementing the specified unitary transformation in the regime of large initial success probabilities, $\pi/3$ FP OAA [3]. Repeating Eq. (1) of the main text here for clarity, we seek a transformation $U$ that achieves a desired unitary transformation $R$ of a target set of qubits with some probability $p_0$:

In essence, $\pi/3$ FP OAA 'boosts' the final success probability from $p_0 = 1-\epsilon$ to $1-\epsilon^{3}$ using the equality $(1-e^{i\pi/3}) = e^{-i\pi/3}$. This is done by replacing $U$ with $A_1$ given by

where $S(\pi/3) = I_m - (1-e^{i\pi/3})|0_m\rangle\langle 0_m|$ is a controlled $\pi/3$ phase shift applied to the ancillary qubits. $A_k$ concatenates this procedure $k$ times to obtain a final success probability $p_{\mathrm{final}} = 1-\epsilon^{3^{k}}$. Each recursion increases $p_{\mathrm{final}}$ super-exponentially at the cost of an exponential number of operators. The larger number of gate operations acting on each ancillary qubit in $\pi/3$ FP OAA, as discussed in Sec. III, may lead to a reduced robustness to noise when compared with ancillae thermalization.
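The quantities quoted in this section are easy to check numerically. The sketch below builds the matrix $S(\pi/3)$ on $m$ ancillae from its definition and evaluates $p_{\mathrm{final}} = 1-\epsilon^{3^k}$. The function names are illustrative. The count of calls to $U$ assumes that each recursion level invokes the previous level three times, which is the usual structure of the $\pi/3$ fixed-point construction but is stated here as an assumption since the expression for $A_1$ is not reproduced above.

```python
import cmath
import math

import numpy as np


def phase_shift(m):
    """S(pi/3) = I_m - (1 - e^{i pi/3}) |0_m><0_m| as a dense matrix."""
    dim = 2 ** m
    s = np.eye(dim, dtype=complex)
    s[0, 0] -= 1.0 - cmath.exp(1j * math.pi / 3.0)
    return s


def final_success_probability(epsilon, k):
    """p_final = 1 - epsilon**(3**k) after k nested recursions."""
    return 1.0 - epsilon ** (3 ** k)


def calls_to_u(k):
    """Number of calls to U (or its inverse), assuming 3 calls per level."""
    return 3 ** k
```

The identity $1-e^{i\pi/3} = e^{-i\pi/3}$ can be confirmed with `cmath.exp`, and `phase_shift(m)` is unitary with the single non-trivial entry $e^{i\pi/3}$ acting on $|0_m\rangle$.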