Autonomous Temporal Probability Concentration: Clockworks and the Second Law of Thermodynamics

According to thermodynamics, the inevitable increase of entropy allows the past to be distinguished from the future. From this perspective, any clock must incorporate an irreversible process that allows this flow of entropy to be tracked. In addition, an integral part of a clock is a clockwork, that is, a system whose purpose is to temporally concentrate the irreversible events that drive this entropic flow, thereby increasing the accuracy of the resulting clock ticks compared to counting purely random equilibration events. In this article, we formalise the task of autonomous temporal probability concentration as the inherent goal of any clockwork based on thermal gradients. Within this framework, we show that a perfect clockwork can be approximated arbitrarily well by increasing its complexity. Furthermore, we combine such an idealised clockwork model, comprised of many qubits, with an irreversible decay mechanism to showcase the ultimate thermodynamic limits to the measurement of time.


I. INTRODUCTION
Time plays a special role in quantum physics. While other physical quantities of interest are represented as Hermitian operators, there is no observable corresponding to time itself. That is, it is not possible to find an operator conjugate to the Hamiltonian (representing energy) that may serve as a 'time observable' in the same way as is done for position and momentum [1] (see e.g. [2] for some caveats to this statement). Time thus plays the role of a parameter in the equations of motion. Consequently, estimating the passage of time requires a reference system: a clock. By tracking the dynamical evolution of (observable quantities related to) such a clock system, it is possible to extract information about the flow of time; see, e.g., [3][4][5][6]. But what makes a specific system useful as a clock?
To address this question, we consider time to be a continuously elapsing parameter t ('Schrödinger time') whose value is estimated by a clock in terms of discrete increments ('ticks'). According to quantum theory, the evolution of any closed system is time-reversal symmetric, and therefore any complete description of an instrument that measures time inevitably requires an irreversible part that breaks this symmetry. By definition, the equilibrium state of any system features no non-trivial evolution in time. Thus, the first necessary ingredient for building a clock is an out-of-equilibrium system, such that the clock can harness the irreversible transition to higher entropy to produce ticks.
Entropy-increasing processes are fundamentally stochastic. Consequently, individual events resulting from such a process provide little information about t and thus make for rather bad clocks. While one could, in principle, use any equilibrating system as a clock (such as a hot coffee mug cooling down on your desk), its ticks, e.g. the spontaneous emissions of thermal photons (which exhibit super-Poissonian statistics), come at highly irregular intervals with respect to Schrödinger time. Structuring this irregular entropy flow into a series of ticks to allow for a precise synchronisation of events is exactly the purpose of a clock. In this article we formalise the task of timekeeping by splitting it into two stages: (i) an irreversible process that follows the second law of thermodynamics, i.e. an out-of-equilibrium system moving towards equilibrium by means of discrete and stochastic events, (ii) an internal clockwork that temporally concentrates the probability of an irreversible event occurring, thereby regularising the interval between equilibration events.
As we will see, the particular choice of (i) provides the context for evaluating clock performance because it represents a basic form of clock itself, while at the same time limiting the performance of a clock for any given clockwork. Stage (ii) gives rise to a clearly defined mathematical task that we will refer to as autonomous temporal probability concentration (ATPC).
To practically describe the performance of a clock, we use two quantities: accuracy and resolution. The accuracy N is the number of ticks until the clock is, on average, off by one tick with respect to Schrödinger time. The resolution R = 1/⟨t⟩ is the average tick frequency with respect to Schrödinger time, i.e. the inverse of the average time ⟨t⟩ between ticks.
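These two figures of merit can be made operational as simple estimators. For i.i.d. tick intervals with mean μ and variance σ², the clock drifts by about σ√n after n ticks, so it is off by one average interval μ after N = μ²/σ² ticks, while R = 1/μ. The sketch below is our own illustration (function and variable names are ours, not from this article):

```python
import numpy as np

def clock_figures_of_merit(tick_times):
    """Estimate (accuracy N, resolution R) from a record of tick times.

    With mu and sigma^2 the mean and variance of the waiting time
    between consecutive ticks, the clock drifts by ~ sigma * sqrt(n)
    after n ticks, so it is off by one average interval after
    N = mu^2 / sigma^2 ticks; the resolution is R = 1 / mu.
    """
    waits = np.diff(np.asarray(tick_times, dtype=float))
    mu = waits.mean()
    var = waits.var()
    return mu**2 / var, 1.0 / mu

# A purely random (Poissonian) clock has N ~ 1, however fast it ticks.
rng = np.random.default_rng(0)
poisson_ticks = np.cumsum(rng.exponential(scale=2.0, size=200_000))
N, R = clock_figures_of_merit(poisson_ticks)
```

For exponential waiting times, mean and standard deviation coincide, so the estimator returns N close to 1: random equilibration events alone make a poor clock regardless of their rate.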
That there is a trade-off relation between accuracy and resolution, and that there is a proportional relation between the entropy dissipated in the process and the clock performance, was first noticed in a model of an autonomous quantum clock as an open quantum system in [7] and recently corroborated in a mesoscopic experiment in [8].
Here, we combine these aspects and provide a detailed investigation of the trade-offs between accuracy, resolution, and entropy production for given energy and complexity within the framework of autonomous quantum clocks [9,10]. A central tenet for providing these trade-offs is the separation of timekeeping into the two separate processes mentioned above: (i) the irreversible out-of-equilibrium transitions of the clockwork via interaction with an environment, resulting in distinguishable events registered as 'ticks', which we model with a decay mechanism, and (ii) the internal closed-system (unitary) dynamics that provide a clockwork and temporally concentrate the population of states from which an irreversible transition can emerge. That is, the clockwork ensures that the circumstances that allow for a tick to happen (e.g., a specific energy level resonant with the out-of-equilibrium dynamics being highly populated) occur only within a very narrow time window. Here, we study this task and the resources required to achieve it. In particular, as a main result we present an analytical trade-off between the maximal probability amplitude and temporal variance for ATPC, on the one hand, and the complexity of the respective autonomous clockwork achieving it, on the other. In conjunction with the generic example of a memoryless out-of-equilibrium process, namely exponential decay, this allows us to describe trade-offs between accuracy, resolution, entropy production, and clockwork complexity.
The specific clock model that we consider here consists of (1) external heat baths as out-of-equilibrium resources, (2) a quantum system representing the 'clockwork', and (3) an external field that the clockwork can emit energy ('ticks', e.g., photons) into. In Sec. II, we first discuss the role and choice of the clockwork, and formalise the task of ATPC. In Sec. III, we then discuss mechanisms for coupling the clockwork to an equilibrating process to produce ticks. In Sec. IV we combine the two, to showcase the limitations set by the irreversible process and how the complexity of a clockwork can be utilised to reach the maximal potential of a clock. We continue in Sec. V with a discussion of the implications and the relation to other literature on clocks and end with a short conclusion in Sec. VI.

II. THERMAL MACHINES AND THE CLOCKWORK
Let us now consider a clockwork in the sense discussed above, that is, a device that contains a target subsystem, which is to be prepared for an out-of-equilibrium transition, thus resulting in a 'tick'. From a thermodynamic perspective, such a preparation requires work to be performed on the target, which can be achieved by a quantum thermal machine. Operating such a machine in turn requires an out-of-equilibrium resource, which we here consider to be provided by thermal baths at different temperatures, i.e. a thermal gradient. More specifically, we assume that two independent baths are available, a hot bath and a cold bath, at temperatures T_H and T_C, respectively, where the latter represents the environment. This setup is depicted in Fig. 1.

[Fig. 1: A hot and a cold bath can be utilised by the clockwork in order to prepare (a subsystem of) the clockwork for an irreversible process. The signal thus produced can be registered as a 'tick' that serves as a time reference.]
This choice is motivated, first, by the general availability of heat baths. Second, because systems are usually expected to thermalise (eventually) without external agency, preparing such heat baths does not require any temporal information or external control. Consequently, heat baths allow for transparent bookkeeping of the relevant resources, i.e. of the average amount of entropy dissipated by the clockwork for each tick.
A specific focus of the analysis performed here lies on the identification of trade-offs between different figures of merit for the clock performance for fixed energy input and clock complexity. In principle, the performance of a given clock also depends on the (difference between the) temperatures T C and T H . However, since we are primarily interested in upper bounds on the relevant figures of merit, we will often concentrate on the case where the environment temperature is T C = 0. For the sake of completeness, calculations for general T C can be found in Appendices B and C.
Our clockwork model then consists of two parts: a d-dimensional 'ladder' target system (in the simplest case, a qubit, d = 2) and a machine, which itself has some substructure and couples to the ladder via a unitary interaction. This interaction supplies work (which the machine draws from its coupling to the heat baths) to the ladder, driving it to its excited states. The ladder in turn couples to an external field, and thus these excitations eventually result in ticks (i.e. energy emitted into the field). Here, we consider a model where only a non-zero population P_top(t) of the 'top level' - the most highly excited state of the ladder - can lead to a tick. As a consequence, the quality of the clockwork depends on the properties of the particular probability distribution P_top(t) as a function of Schrödinger time t. In particular, an ideal clockwork would produce a top-level probability that is arbitrarily concentrated in time [Eq. (1)]. While one would expect a perfect clockwork to be capable of producing this distribution, it is also clear that it is not always desirable in conjunction with an irreversible mechanism: if the probability is arbitrarily concentrated, it does not leave sufficient room for the stochastic equilibration event to occur, worsening actual clock performance. Nonetheless, an ideal clockwork should be capable of approximating this ideal distribution to the desired precision set by the irreversible mechanism. Arguably, it seems implausible that a heat engine itself, which intrinsically also harnesses the stochastic flow of energy from a hot to a cold bath, should be able to produce such a perfect signal. However, it may be reasonable to expect that a sufficiently complex clockwork, itself driven by a heat engine, could approximate the ideal ATPC of Eq. (1). In the following, we therefore investigate the role of the complexity of the internal structure of the machine in approximating the ideal ATPC.
In order to do so, we decompose the machine into a set of elementary few-qubit machines, each realising an effective virtual qubit [11]. This allows the number of (elementary) machines to be used as a proxy for the complexity of the clockwork's microscopic structure. In terms of these quantifiers, i.e. the dimension d of the target system and the number M(d − 1) of virtual-qubit machines, a central result on autonomous probability concentration that we derive in this paper can be phrased as follows:

Result 1: Autonomous temporal probability concentration of qubit machines
Driving a d-dimensional target system at temperature T_C = 0, with M virtual-qubit machines per transition between neighbouring levels, autonomously allows a top-level probability of P_top(t) = (1 − Z_H^(−M))^(d−1) sin^(2(d−1))(gt) to be reached. Here, Z_H is the partition function of a qubit coupled to the hot bath, and can thus take values between 1 and 2.
In other words, we show in the following that the behaviour of an ideal clockwork [i.e. Eq. (1)] can be approximated arbitrarily well by increasing the complexity of the clockwork, that is, by increasing M and d.

[Fig. 2: The green arrows indicate the transition where the ladder gets excited. The yellow arrows show the reverse transition. Coupling the qubit with the biggest energy gap to the hot bath introduces a bias towards the transition indicated by the green arrows.]

II.1. Two-Qubit Machine
We begin by considering the simplest possible heat-engine-driven clockwork: a two-dimensional ladder coupled to a 'cold' bath (the environment) and to a two-qubit machine, i.e. d = 2 and M = 1. In terms of Hilbert space dimension, this is the smallest possible thermal machine [11], consisting of a 'cold' qubit and a 'hot' qubit, in contact with the 'cold' environment and a hot bath, respectively, as illustrated in Fig. 2.
Before the machine is activated, the qubits only interact with their respective baths. Under the assumption of weak coupling between the qubits and the baths, each qubit thermalises to the corresponding bath temperature. Denoting the energy gaps of the hot, cold and ladder qubits as E_H, E_C and E_L, respectively, the reduced states of the qubits can be represented by the thermal states ρ_i = e^(−β_i H_i)/Z_i with i = H, C, L, where Z_i = 1 + e^(−β_i E_i) are the respective partition functions and H_i the corresponding free Hamiltonians with eigenstates |0_i⟩ and |1_i⟩. The total initial state of the clockwork - the machine and the ladder - thus takes the form ρ_0 = ρ_H ⊗ ρ_C ⊗ ρ_L. We further assume that the timescale of the interaction between the machine and target qubits is much shorter than that of their thermalisation with the respective baths. Consequently, the relevant dynamics of the clockwork are well described by energy-conserving unitary processes on the clockwork Hilbert space H = H_H ⊗ H_C ⊗ H_L. At the same time, the purpose of the machine is to transfer energy to the target system. We are therefore interested in designing the internal structure of the clockwork, namely the energy levels of the free Hamiltonian H_0 = H_H + H_C + H_L, such that E_H = E_C + E_L, rendering the states |1_H 0_C 0_L⟩ and |0_H 1_C 1_L⟩ degenerate in energy. This, in turn, allows us to define an interaction Hamiltonian that acts non-trivially only within the degenerate subspace, given by H_int = g(|1_H 0_C 0_L⟩⟨0_H 1_C 1_L| + h.c.), where g ∈ ℝ is a coupling constant. The unitary dynamics generated by the total Hamiltonian H = H_0 + H_int hence conserves the total energy of the clockwork, since [H_0, H_int] = 0. However, since [H_L, H_int] ≠ 0, the interaction, once activated, can perform work on the ladder.
The resulting dynamics leads to an increase of the population of the top energy level |1_L⟩ of the ladder, which (in units where ħ = 1) is given by P_top(t) = ⟨1_L| Tr_H,C [e^(−iHt) ρ_0 e^(iHt)] |1_L⟩ (5). The maximally reachable population depends on the temperatures of the baths, as well as the energy gaps of the machine qubits [11]. The top-level probability in Eq. (5) evaluates to an expression derived in Appendix A. For T_C = 0, this simplifies to P_top(t) = (1 − 1/Z_H) sin^2(gt) (7). Thus, even when T_C = 0, this function is far away from the ideal shape in Eq. (1), both in terms of its maximal value and the width of the distribution around its peak. Even in the limit T_H → ∞, the maximal value, reached at t = π/(2g), is only 1/2. Moreover, this top-level population could also have been achieved by directly coupling the ladder to the hot bath. Thus, the two-qubit machine does not provide the desired ATPC by itself. However, in the following we present a generalisation of this framework which allows arbitrarily precise ATPC, and hence an ideal clockwork to be approximated to within any given error.
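The dynamics just described can be checked numerically. The sketch below is our own minimal model: it assumes the three-body swap H_int = g(|1_H 0_C 0_L⟩⟨0_H 1_C 1_L| + h.c.) as the resonant interaction, verifies that it conserves total energy while doing work on the ladder, and reproduces a sinusoidal top-level population with maximum 1 − 1/Z_H = 1/2 for T_H → ∞ and T_C = 0:

```python
import numpy as np

# Single-qubit building blocks in the basis (|0>, |1>)
sp = np.array([[0.0, 0.0], [1.0, 0.0]])   # |1><0|
num = np.array([[0.0, 0.0], [0.0, 1.0]])  # |1><1|
I2 = np.eye(2)

def kron3(a, b, c):
    return np.kron(np.kron(a, b), c)

E_C, E_L = 1.0, 2.0
E_H = E_C + E_L   # resonance condition E_H = E_C + E_L
g = 0.3

# Free Hamiltonian on H_H x H_C x H_L and the assumed three-body swap
H0 = E_H * kron3(num, I2, I2) + E_C * kron3(I2, num, I2) + E_L * kron3(I2, I2, num)
Hint = g * (kron3(sp, sp.T, sp.T) + kron3(sp.T, sp, sp))  # |100><011| + h.c.
H = H0 + Hint

# Energy conservation, yet work is done on the ladder:
HL = E_L * kron3(I2, I2, num)
assert np.allclose(H0 @ Hint, Hint @ H0)       # [H0, Hint] = 0
assert not np.allclose(HL @ Hint, Hint @ HL)   # [H_L, Hint] != 0

# Initial state: hot qubit maximally mixed (T_H -> inf), cold qubit and
# ladder in their ground states (T_C = 0).
ground = np.diag([1.0, 0.0])
rho0 = kron3(I2 / 2.0, ground, ground)

w, V = np.linalg.eigh(H)
def p_top(t):
    """Population of the ladder's excited level at time t."""
    U = V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T
    rho_t = U @ rho0 @ U.conj().T
    return float(np.real(np.trace(rho_t @ kron3(I2, I2, num))))
```

Evaluating `p_top` at t = π/(2g) gives 1/2, the maximal value quoted in the text for T_H → ∞.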

II.2. Generalised Machines
In the following we present a generalised clockwork model that allows both the 'sharpness' and the amplitude of P_top(t) to be controlled, while keeping track of all the relevant resources. This can be achieved by two qualitatively different but compatible extensions that we refer to as 'horizontal' and 'vertical' extensions, as illustrated in Fig. 3. The horizontal extension allows the amplitude of P_top(t) to be increased, while the vertical extension allows the width of the peak of P_top(t) to be decreased, thus increasing its 'sharpness'. Specifically, we add more levels to the target ladder and with them more two-qubit machines, interacting with each successive transition (vertical extension); to a given transition we add more machines (horizontal extension). We start by collecting all interactions along a vertical column (see Fig. 3) of machines interacting with the ladder into a term H_1. This vertically extends the interaction of a single two-qubit machine, Eq. (4), along all ladder states [Eq. (8) for the first vertical column], where M_i^j denotes the Hilbert space of the j-th two-qubit machine acting on the i-th ladder transition. We then add another term H_2, which does the same for the second vertical column, albeit with an additional projector onto the orthogonal subspace of H_1 to ensure commutativity of H_1 and H_2. This continues for M vertical columns, always projecting onto the orthogonal subspace of all previously used machines. Using M^(i) to denote the Hilbert space of the vertical group of the i-th machine, i.e. M^(i) := ⊗_{j=1}^{d−1} M_j^i, we can then write our generalisation of the interaction Hamiltonian from the previous section in a compact notation, Eq. (9), defined in terms of projectors and an operator whose action involves the states |n_{M^(k)}⟩ of Eq. (12). That is, the state |n_{M^(k)}⟩ can be considered to be the n-th excited state of the k-th vertical group M^(k), in the sense that the first n machines M_j^k, for j = 1, . . ., n, are in the 'used' state.

In the following we will briefly discuss the horizontal and vertical extensions separately to outline their physical impact.

Horizontal extension
As shown in Appendix B, for T_C = 0, the interaction Hamiltonian in Eq. (9) for a two-level ladder (d = 2) modifies the top-level probability from Eq. (7) to P_top(t) = (1 − Z_H^(−M)) sin^2(gt) (13). For finite T_C, there are additional contributions to P_top(t) whose weight relative to the sinusoidal behaviour apparent in Eq. (13) increases with increasing T_C (see Appendix B). From Eq. (13) we see that the maximal value of P_top(t) increases with increasing M, and total population inversion can be achieved in the limit M → ∞. However, in order to achieve ATPC, increasing only the magnitude of P_top(t) is not sufficient, since this neglects the temporal concentration. In the next section we therefore introduce the 'vertical' extension, which allows us to temporally concentrate P_top(t), leading to sharper peaks.

Vertical extension
For the vertical extension, we generalise the ladder to a non-degenerate system with d evenly spaced energy eigenstates, with the gap between neighbouring states equal to E_L. To each of the d − 1 pairs of neighbouring energy levels of the vertically extended ladder, a two-qubit machine can be coupled in the way described in the previous section. In total, the vertically extended clockwork thus consists of a d-dimensional ladder and d − 1 two-qubit machines, as illustrated in Fig. 3. The resulting top-level probability for T_C = 0 becomes P_top(t) = (1 − 1/Z_H)^(d−1) sin^(2(d−1))(gt) (14). We thus see that vertically extending the machine makes the temporal distribution sharper, but it also decreases the achievable top-level population.

General extended clockwork
Finally, by combining the horizontal and vertical extensions, we can combine the advantages of both, i.e. simultaneously increase the top-level population and the sharpness of the temporal distribution. A straightforward calculation of the top-level probability for T_C = 0 (shown in detail in Appendix C) yields P_top(t) = (1 − Z_H^(−M))^(d−1) sin^(2(d−1))(gt) (15). For non-zero T_C, the corresponding top-level probability smoothly approaches Eq. (15) as T_C → 0 (see Appendix C).
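To make the separate roles of M and d concrete, the following numerical sketch assumes a product form for the T_C = 0 top-level probability, P_top(t) = (1 − Z_H^(−M))^(d−1) sin^(2(d−1))(gt) (our reading of the combined extension, consistent with the M → ∞ limit sin^(2(d−1))(gt) quoted in Sec. III), and computes the peak height and the full width at half maximum:

```python
import numpy as np

def p_top(t, M, d, g=1.0, Z_H=2.0):
    """Assumed T_C = 0 top-level probability of the extended clockwork.

    For M = 1, d = 2 this reduces to (1 - 1/Z_H) sin^2(gt); for
    M -> infinity it reduces to sin^(2(d-1))(gt).
    """
    amp = (1.0 - Z_H ** (-M)) ** (d - 1)
    return amp * np.sin(g * t) ** (2 * (d - 1))

t = np.linspace(0.0, np.pi, 20001)  # one period of the peak

def peak_and_fwhm(M, d):
    p = p_top(t, M, d)
    peak = p.max()
    above = t[p >= peak / 2.0]      # the peak is unimodal on [0, pi]
    return peak, above[-1] - above[0]

peak_1, width_1 = peak_and_fwhm(M=1, d=2)     # small clockwork
peak_hi_M, _ = peak_and_fwhm(M=10, d=2)       # horizontal extension
_, width_hi_d = peak_and_fwhm(M=10, d=20)     # + vertical extension
```

Increasing M pushes the peak height towards 1 (here from 0.5 to about 0.999), while increasing d shrinks the width of the peak, illustrating the two sub-tasks of ATPC.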
A direct consequence of the particular form of the top-level probability in Eq. (15) is that the amplitude and temporal variance ('sharpness') of P_top(t) can be optimised to within any desired error by controlling the number of machines (M) per neighbouring pair of energy levels in the horizontal extension and the dimension of the ladder (d) in the vertical extension, respectively.
Since we restricted the clockwork to consist of qubit machines that have up to M(d−1)-body interactions, and the Hilbert space of the machine is 4^(M(d−1))-dimensional, the most reasonable quantifier of complexity should be related simply to M and d. We therefore focus on elucidating the role of M and d separately. What we can now see is that, in order to decrease the temporal variance (by increasing d) while increasing the amplitude (by increasing M), the complexity necessarily has to increase. Thus, for a fixed complexity there exists a trade-off between temporal variance and probability amplitude.
In the following sections we will include the irreversible decay mechanism to numerically analyse how accuracy and resolution of clocks are influenced by changes in M and d.

III. IRREVERSIBILITY AND CLOCK TICKS
Any autonomous quantum clock (or indeed any clock) inevitably produces entropy in order to tick [7], as it needs to be subject to an irreversible evolution. While the internal clockwork produces a temporally well-concentrated and repeating distribution, there needs to be an irreversible process that turns this into a measurable signal. For this to happen, a system must be driven out of equilibrium in order to relax back to equilibrium while producing a tick. In our case, the system that is driven out of equilibrium is the ladder. As an example, we will assume that the system with respect to which the ladder is out of equilibrium is a photon field at the environment temperature T_C, and that its coupling to the ladder is such that the top level is unstable and decays to the ground state, emitting a photon of energy E_γ = (d − 1)E_L. Since any irreversible process can be viewed as a reversible process on a larger Hilbert space, the presence of such a decay channel in principle must also allow for the reverse process of exciting the ladder while absorbing a photon of the field. However, the probability for this to happen can be made arbitrarily small by demanding that the background temperature of the field satisfies E_γ/(k_B T_C) ≫ 1. The number of possible decay processes is vast. However, since our aim is to capture all resources that are necessary to operate a clock, allowing for decay processes that require memory would defeat the purpose, since the required resources are not clearly defined for them. We therefore require the photon field to be memoryless, i.e. that correlations with the ladder are diluted very quickly and are thus negligible. The resulting dynamics are governed by the law of exponential decay, and thus constitute an ideal case, giving an effective upper bound to the clock performance and allowing us to keep track of the resources that are invested.
In particular, the probability density for a tick occurring at time t is given by (see Appendix D) P_tick(t) = c P_top(t) exp(−c ∫_0^t P_top(t′) dt′), where c is the coupling strength of the photon field with the top level of the ladder.
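As a sketch of how such memoryless ticks can be simulated, one may treat the decay as firing with instantaneous rate c·P_top(t) and sample tick times by discrete-time thinning. This rate-based reading and the sampler below are our own illustration, consistent with the exponential-decay model:

```python
import numpy as np

def sample_tick(rng, p_top, c, dt=0.01, t_max=1e4):
    """Draw one tick time for a detector firing at rate c * p_top(t).

    Simple thinning: at each step of size dt, fire with probability
    c * p_top(t) * dt (valid when c * p_top * dt << 1).
    """
    t = 0.0
    while t < t_max:
        if rng.random() < c * p_top(t) * dt:
            return t
        t += dt
    return t_max

# Sanity check: with P_top(t) = 1 the waiting time is exponential with
# rate c, i.e. a purely random clock with accuracy mu^2 / sigma^2 ~ 1.
rng = np.random.default_rng(1)
c = 0.5
ticks = np.array([sample_tick(rng, lambda t: 1.0, c) for _ in range(5000)])
mu, var = ticks.mean(), ticks.var()
```

Replacing the constant rate by a sharply peaked `p_top` (as produced by the extended clockwork) makes the sampled intervals far more regular, which is precisely the purpose of ATPC.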
Let us now consider the energetic resources required to run the clock. We first note that, taking T_C = 0, as the clockwork state evolves, each branch of its superposition in which the ladder's top level is excited corresponds to a transition of d − 1 machines belonging to the k-th vertical group, where the value of k differs between branches, and we recall that the states |n_{M^(k)}⟩ were defined in Eq. (12). Thus, regardless of which branch is realised, if the ladder's top level is excited, then the heat flow from the hot bath into the total system is given by Q_in = (d − 1)E_H. This heat flow does the work W = (d − 1)E_L of driving the ladder from its ground state to the top level. After the clock has ticked and the cold qubits of the machines re-thermalise, Q_out = (d − 1)E_C of heat will be dissipated into the cold bath.
Since E_H = E_L + E_C, we thus have the usual relation for a thermal machine, i.e. Q_in = W + Q_out, and the thermal efficiency of the process is η_th := W/Q_in = E_L/(E_L + E_C). From this, one can see that as E_C/E_L decreases, we approach the Carnot efficiency bound η_th ≤ 1.
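The per-tick bookkeeping above can be summarised in a few lines (the example energy values are ours; the relations E_H = E_L + E_C and Q_in = W + Q_out are from the text):

```python
# Per-tick energy bookkeeping of the clockwork at T_C = 0.
d = 5                    # ladder dimension (example value)
E_L, E_C = 2.0, 0.5      # ladder gap and cold-qubit gap (example values)
E_H = E_L + E_C          # resonance condition

Q_in = (d - 1) * E_H     # heat drawn from the hot bath per tick
Q_out = (d - 1) * E_C    # heat dissipated into the cold bath per tick
W = Q_in - Q_out         # work performed on the ladder: (d - 1) * E_L

eta_th = W / Q_in        # thermal efficiency E_L / (E_L + E_C)
```

With these numbers, η_th = 0.8; shrinking E_C relative to E_L pushes the efficiency towards 1, at the price of a weaker bias of the machine towards the 'forward' transition.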
Curiously, for T_C = 0 and M → ∞, the top-level population is just P_top(t) = sin^(2(d−1))(gt). If interpreted as a heat engine whose purpose is to charge a battery (the ladder), then one can indeed reach an efficiency of η_th ≈ 1 and still charge the battery in finite time τ = π/(2g). Even the task of ATPC can be achieved to arbitrary precision at perfect efficiency. One can interpret this as sufficient clockwork complexity permitting perfect efficiency at finite power.
In any case, the efficacy of ATPC and the resulting clock dynamics are essentially determined by the ladder dimension d and the number of driving machines M, which together correspond to a simple notion of clockwork complexity. In order to investigate how these affect the quality of the clock, we quantify this quality using two notions introduced in [7]. These are the accuracy, which is the average number of ticks until the next tick is off by the average time between two ticks, i.e. N = μ²/σ², and the resolution, which is the inverse average time between two ticks, i.e. R = 1/μ, where μ and σ² denote the mean and variance of the time between two successive ticks.
In the following section, we present numerical calculations of the accuracy as a function of the resolution, the clockwork complexity and the energy dissipated per tick.
For comparison, let us take as a baseline an example where no qubit machines are employed, and the ladder simply begins in equilibrium with the hot bath and emits this energy into the cold bath via the irreversible process, i.e. there is no ATPC. In that case, the top-level probability is constant in time, equal to the thermal population of the top level at temperature T_H, and thus the resolution is essentially determined by the decay rate c and the population of the decaying level. The accuracy is simply N = 1. This highlights the main purpose of a clockwork: an individual event resulting from pure thermalisation would result in an accuracy of 1 and come at a work cost of (d − 1)E_L, whereas the clockwork can increase the accuracy while keeping the work cost of one tick constant.

IV. NUMERICAL RESULTS
Since we are interested in upper bounds on the clock quality, for the following results we assume the temperature of the hot bath to be infinite, i.e. T H → ∞, as well as T C = 0.
First, we analyse the relation between the 'sharpness' of the peak of P_top(t) and the clock accuracy. Recalling the discussion in Sec. II, we note that the 'sharpness' of P_top(t) increases with increasing d, and the latter may therefore be used as a measure of the 'sharpness' of P_top(t). In Fig. 4, the behaviour of the accuracy as d increases is depicted for different coupling strengths c and different values of M. For small d, we see that the accuracy increases linearly. However, increasing the 'sharpness' beyond a certain point leads to a decrease in accuracy. The value of c therefore puts a bound on the maximally achievable accuracy for all potential clocks. The same limiting behaviour is apparent if we fix c and vary g instead (see Appendix F), for reasons that we discuss below.
In order to analyse both accuracy and resolution with respect to the clockwork complexity, in Fig. 5 we compare those two quantities for different fixed values of M while varying d. Increasing M allows us to reach a higher maximal accuracy, while increasing d (which increases from right to left in Fig. 5) allows us to trade resolution for accuracy up to the optimal point, after which the accuracy reduces again. We further observe that all clocks with the same c and g lie under a curve defined by the case of M → ∞. Increasing c allows for clocks of higher quality, i.e. clocks that have higher accuracy and resolution. Furthermore, the position of the maximum depends on the value of g. Here g was chosen to be equal to 1/E_C. Increasing this value shifts the peak to the right, i.e. to higher resolutions (see Appendix F).
Finally, in order to analyse the effect of the energy dissipation rate ε = Q_out/⟨t⟩ on the clock accuracy with respect to the complexity of the clockwork, in Fig. 6 we plot the accuracy over the energy dissipation rate ε for clockworks of different complexity. In particular, we compare different values of M while varying d. What we can see in Fig. 6 is that for fixed M (at fixed c and g), increasing the energy dissipation rate (which is achieved by increasing d) increases the accuracy at a certain slope until a maximum is reached. Increasing d further decreases the accuracy. Furthermore, for a given c, increasing M leads to a lower slope (approaching the slope of M → ∞) while allowing for higher maximal accuracy, which suggests that a greater maximal accuracy can be achieved at the cost of a greater energy dissipation rate.

[Figure: accuracy as a function of d. We have chosen g = 1/E_C in all cases.]

V. DISCUSSION
Our results have two general implications. The first concerns the task of autonomous temporal probability concentration. We show that, in principle, sufficiently increasing the clockwork complexity alone is enough to concentrate the temporal probability arbitrarily well. In particular, this task can again be split into two conceptually different sub-tasks: maximising the achievable population and improving the temporal 'sharpness' with which this (maximal) population can be reached. By splitting the clockwork into a target ladder and virtual-qubit machines coupled to the different ladder transitions, we were able to analyse how more complex clockworks can help to achieve the two respective sub-tasks. While we have worked with equally spaced ladder systems, the same result (qualitatively) also holds for arbitrarily spaced target Hamiltonians, simply by redefining the respective coupling strengths (the g's) of the interaction Hamiltonians. We have equipped our clockwork with a particular tensor-product structure, the division into two-qubit machines, for the sake of keeping track of its complexity. Our machine operates optimally within the framework set by this division, but whether more general machines could also achieve the same performance with smaller overall size remains an open question.
In our analysis, we have optimised the internal structure of the clock, i.e. the clockwork, to concentrate the probability in a fashion that most closely resembles the temporal distribution of an ideal clockwork. For given c, this amplifies the clock quality only up to a limit, which we showcase in Fig. 4. This can intuitively be understood by considering the two key timescales of the clock, namely that of the clockwork's dynamics, and that of the irreversible decay. Increasing d while keeping c fixed, one eventually reaches a point where P top (t) is so well concentrated temporally that the comparative slowness of the decay mechanism reduces the probability that the clock will tick. In other words, it becomes more likely that the decay mechanism will skip that peak. We see the inverse of this behaviour if instead of c, we consider curves of fixed g (see Appendix F), as increasing g speeds up the clockwork, effectively making the limit imposed by c more restrictive.
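This peak-skipping can be quantified: for a memoryless decay of rate c·P_top(t) (our rate-based reading of the decay model), the probability of a tick during a single peak is 1 − exp(−c ∫ P_top dt), which shrinks as the peak narrows at fixed c. A minimal numerical sketch, using the M → ∞, T_C = 0 form P_top(t) = sin^(2(d−1))(gt):

```python
import numpy as np

def fire_probability(d, c, g=1.0, n=200001):
    """Probability that the decay fires during one peak of P_top(t)."""
    t = np.linspace(0.0, np.pi / g, n)           # one peak of P_top
    p = np.sin(g * t) ** (2 * (d - 1))           # M -> inf, T_C = 0
    # Trapezoid rule for the integrated top-level probability
    integral = ((p[:-1] + p[1:]) / 2.0 * np.diff(t)).sum()
    return 1.0 - np.exp(-c * integral)

# Sharper peaks (larger d) carry less integrated probability, so at
# fixed c the clock becomes more likely to skip a peak entirely.
probs = [fire_probability(d, c=1.0) for d in (2, 10, 50)]
```

The monotone decrease of `probs` with d is the mechanism behind the turnover in Fig. 4: beyond the optimum, further sharpening starves the decay channel instead of improving regularity.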
This brings us to the second main conclusion that can be drawn from our work: The irreversible mechanism, in our case characterised by the parameter c, puts an absolute upper bound on the achievable combinations of resolution and accuracy, i.e. the clock quality, and thus determines the potential for how well a particular physical process can be used as the basis for a clock.
The question of how well this upper bound can be approximated brings us to the role of our two extensions. First of all, the horizontal extension, i.e. the coupling of multiple elementary machines to a single transition between neighbouring ladder levels, primarily serves the purpose of increasing the possible population inversion and with it the achievable top-level population. As we see in Eq. (16), c always appears multiplied by the prefactor of the sine in Eq. (15), resulting in an effective coupling C_M [Eq. (19)]. From this we can see that increasing the horizontal extension M is physically equivalent to increasing the coupling c, and this is why they play the same role in Fig. 4 (though we note that C_M is bounded with respect to M but not with respect to c). One cannot make a similar statement relating c to the vertical extension d, since the exponent of the sine in Eq. (15) will vary as d does.
As noted above, d sharpens the temporal distribution, thus increasing the accuracy of the clock as long as the limit set by c and M (via Eq. (19)) is not surpassed, as demonstrated in Fig. 4.
In the regime where the accuracy grows linearly with the 'sharpness' (Fig. 4), which is determined by d (see Appendix F), there exists a trade-off relation between accuracy and resolution. To see this, note that the resolution decreases monotonically with d. For fixed c and g, the case of M → ∞ represents an upper bound on the clockwork quality, i.e. on the achievable combinations of accuracy and resolution, which is illustrated in Fig. 5.
In our computations, we have focused on the limit T_C → 0. State-of-the-art atomic clocks operate at optical frequencies [12], and at even higher frequencies in novel proposals [13]. At room temperature, any optical-frequency mode of the electromagnetic field has a ground-state population of ≈ 1, and its thermal state is thus virtually indistinguishable from temperature 0. For clocks operating at much lower frequencies, such as those based on microwave transitions, cooling the environmental degrees of freedom into which the irreversible mechanism dissipates heat would be necessary in order to approach the fundamental limits we derive here.
It is nonetheless important to stress that we are interested in fundamental limits of timekeeping and the associated complexity and cost. For the practical purpose of building clocks for everyday use, atomic clocks can require as little as 30 mW [14], which is many orders of magnitude above the scale defined by the system energy and the timescale of the relevant processes, but still insignificant for global energy use. The majority of that cost, however, comes from the fact that at some point a single tick event needs to be amplified and registered by a measurement apparatus, whose inherently irreversible operation is also thermodynamic in nature and comes with its own costs and limitations [15]. Conversely, the inevitable imperfections of clocks and the associated costs also limit the achievable quality of measurements and, consequently, of all estimation procedures, e.g. of work itself [16].
Our limits have more practical relevance for the autonomous control of quantum systems by a quantum clock [17,18]. Here, a small quantum system is envisioned to be controlled by an autonomous quantum clock. This is important for any type of unitary process requiring precise timing, from small machines operating in cycles [19] to general repeating unitary processes such as circuit-based cooling models [20][21][22][23][24].
Coming back to the actual energy cost, there are a number of interesting observations that follow from our clock model. First of all, the horizontal extension always comes at a finite cost and dissipation for any clock. The vertical extension, on the other hand, linearly increases the energy cost and dissipation and, as long as the limit imposed by c and M is not exceeded, also linearly increases the accuracy. While we thus observe that N ∝ ∆S, i.e. that a clock's accuracy is essentially determined by the entropy it dissipates (which seems to be a prevalent feature in all classical and quantum clocks [8]), we can also pinpoint which resources allow us to maintain this linear regime, and we can identify the proportionality factor.
Finally, let us put our clockwork model in context with recent literature on quantum clocks. In Refs. [25][26][27][28][29], the relationship between achievable clock accuracy as a function of 'clock dimension' is studied by means of repeated applications of maps from a clock system to a register. These works provide fundamental bounds for clock accuracy for fixed system dimension d, showing that the accuracy of classical (incoherent) clocks can at best scale linearly in d, whereas a quantum clock's accuracy (with states featuring coherence) may scale as d^2. The clock system considered in these works is exactly what we here refer to as the ladder system. Meanwhile, the map that Refs. [25][26][27][28][29] refer to as being responsible for creating a tick event in the register subsumes the interactions between the ladder, the qubit machines, and the heat baths, and also includes the irreversible mechanism and the subsequent read-out. In other words, our work provides a concrete physical realisation of the maps that effect the transfer of ticks to the register. In Fig. 4, we see that, in the regime of the clockwork not exceeding the clock potential dictated by the irreversible mechanism, the accuracy scales linearly with the ladder dimension d, which is already the optimal achievable scaling [28].

VI. CONCLUSION
In this article, we have put forward a framework for studying fundamental limits of timekeeping. The conceptual split of any such task into a clockwork, which creates a temporally concentrated probability distribution, and a mechanism for irreversibility, allowed us to derive an analytic formula for the achievable temporal probability concentration of the clockwork. The irreversible mechanism provides a context for the operation of the clock by allowing the passage of time to be tracked in the first place. Meanwhile, the chosen irreversible mechanism sets the reference timescale that ultimately constrains the potential of any clockwork that harnesses this mechanism to form a clock. But it is the clockwork that needs to be appropriately tuned to achieve maximal performance given these constraints. By composing the clockwork of the smallest possible thermal machines, we were further able to conceptually split the task of autonomous probability concentration into two sub-tasks. First, by having more machines interact with a single transition, we can increase the maximum top-level probability and with it, the effective coupling to the irreversible mechanism. Second, by concatenating multiple transitions of this kind, we are able to sharpen the temporal distribution. This reveals the intricate ways in which the complexity of the clockwork determines its performance.
In the future, it could be interesting to study more exotic irreversible mechanisms beyond exponential decay and whether they could be harnessed to further improve the clock quality. Nevertheless, our results consolidate the fact that perfect clocks are practically impossible within quantum mechanics and that significant resources have to be invested to reach the potential of any physical system to act as a clock.

APPENDICES
In these appendices, we provide detailed derivations and background information for the results presented in the main text. In Appendix A we first present a derivation of the top-level probability for the two-qubit machine of Sec. II.1. In Appendix B, we then present the derivation of the top-level probability for the horizontal extension of the clockwork for arbitrary temperatures T H and T C . In Appendix C, we then again focus on the case T C = 0, for which we derive the top-level probability in the full horizontal and vertical extension. Appendix D contains the derivation of the tick probability. Appendix E presents the details for the numerical computation of accuracy and resolution. Appendix F discusses the behaviour of clocks with changing ladder dimension d as well as changing coupling constant g.

Appendix A: Top-level probability of a two-qubit clockwork
Here, we present a derivation of the top-level probability P_top(t) from Eq. (7). That is, we consider the minimal clockwork discussed in Sec. II.1, which consists of a hot qubit (coupling to the hot bath at temperature T_H), as well as a cold qubit and a ladder (both coupling to a cold bath at temperature T_C). The derivation presented here is a special case (M = 1 and T_C = 0) of the more general derivation of the top-level probability within the horizontal extension that we will present in Appendix B (where M > 1 and both T_H and T_C can take on arbitrary values). Nevertheless, we will first go through the much simpler derivation for M = 1 and T_C = 0 here, which will serve as a guiding example for the much more involved general calculation that is to follow.
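Before the derivation, the expected form of P_top(t) for this minimal clockwork can be checked by direct numerical simulation. The sketch below (pure Python) assumes a resonant three-body swap interaction H_int = g(|1_H 0_C 0_L⟩⟨0_H 1_C 1_L| + h.c.) in the interaction picture, with T_C = 0 and the hot qubit initially thermal; the specific interaction form and the labels are illustrative assumptions, not taken verbatim from the text.

```python
import math

DIM = 8  # basis |h c l>, index = 4h + 2c + l (hot qubit, cold qubit, ladder)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]

def expm_minus_iHt(H, t, terms=40):
    # Taylor series for exp(-i H t); converges rapidly for the g*t used here
    U = [[complex(i == j) for j in range(DIM)] for i in range(DIM)]
    term = [row[:] for row in U]
    step = [[-1j * t * H[i][j] for j in range(DIM)] for i in range(DIM)]
    for n in range(1, terms):
        term = matmul(term, [[step[i][j] / n for j in range(DIM)] for i in range(DIM)])
        U = [[U[i][j] + term[i][j] for j in range(DIM)] for i in range(DIM)]
    return U

def p_top(g, beta_EH, t):
    # ASSUMED interaction (interaction picture, resonant):
    #   H_int = g (|1_H 0_C 0_L><0_H 1_C 1_L| + h.c.)
    H = [[0.0] * DIM for _ in range(DIM)]
    a, b = 0b100, 0b011
    H[a][b] = H[b][a] = g
    U = expm_minus_iHt(H, t)
    p1 = math.exp(-beta_EH) / (1.0 + math.exp(-beta_EH))  # hot-qubit excitation
    # initial mixture at T_C = 0: |000> with prob 1 - p1, |100> with prob p1
    prob = 0.0
    for state, weight in ((0b000, 1.0 - p1), (0b100, p1)):
        # sum the population of all states with the ladder excited (bit 0 set)
        prob += weight * sum(abs(U[i][state]) ** 2 for i in range(DIM) if i & 1)
    return prob

# agrees with P_top(t) = (Z_H - 1)/Z_H * sin^2(g t)
g, bEH = 1.0, 0.5
for t in (0.3, 0.9, math.pi / 2):
    expected = (math.exp(-bEH) / (1 + math.exp(-bEH))) * math.sin(g * t) ** 2
    assert abs(p_top(g, bEH, t) - expected) < 1e-9
```

The thermal prefactor (Z_H − 1)/Z_H = e^(−β_H E_H)/Z_H is simply the initial excitation probability of the hot qubit, matching the factor that appears in the T_C → 0 limit of Appendix B.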
Assuming that the systems have thermalised with their respective baths, the initial state of the clockwork is given by the corresponding product of thermal states, and the top-level probability [Eq. (A2)] is governed by a matrix T_0 encoding the transition amplitude between the ground state and the excited state of the ladder. Now, because H_int^2 is proportional to the identity on its support, even powers of H_int do not contribute to T_0. For odd powers, however, we have H_int^(2k+1) = g^(2k) H_int, with which we can evaluate the transition matrix T_0.
Inserting the result into Eq. (A2), we finally arrive at the top-level probability of Eq. (7).

Appendix B: The horizontal extension

In this appendix, we present more technical details of the horizontal extension of the autonomous clockwork from a single (M = 1) to multiple (M > 1) two-qubit machines interacting with the same two-level (d = 2) transition of the ladder system. In particular, we derive the top-level probability of the horizontal extension for arbitrary temperatures T_C and T_H.
Following a similar approach as in Eq. (A2), we define transition operators T_n for n = 0, 1, where the last equality follows from the fact that the interaction terms H_k in Eq. (9) have mutually disjoint support, i.e. H_k H_k' = 0 for k ≠ k'. Before we calculate these transition operators, we note that H_k satisfies a cyclic property for l, m, p, q = 0, 1. Now, for the transition operator T_0, we note that only odd powers of H_k can map |0⟩_L to |1⟩_L, and therefore only these contribute. We can calculate T_1 similarly, noting that only even powers of H_k contain the factor |1⟩⟨1|_L.
where Π̃ is a projection. In order to evaluate the top-level probability, let us briefly inspect the initial state ρ_0 in this situation. We split the initial state into two parts: one where the ladder is initially excited and one where it is not.
The top-level probability can then be seen to split into two separate contributions, corresponding to the two terms in Eq. (B8). We will first consider the part of ρ_0 where the ladder is initially in the ground state. Considering the tensor-product structure of Eq. (B4), we see that for each k there are k − 1 machines that are acted upon only by identities, meaning the partial trace over each of these machines simply contributes a factor 1. There are M − k machines that are acted upon by an operator Π_{M_j}, each leading to a corresponding factor in Tr[T_0 Tr_L(ρ_0) T_0†]. In addition, there is always exactly one machine which is acted upon by σ⁻_{M_k}, contributing a factor (Z_H − 1)/(Z_H Z_C) to Tr[T_0 Tr_L(ρ_0) T_0†]. The first term of P_top(t) is thus given by the sum over all k ∈ {1, 2, ..., M} [see Eq. (B4)], multiplied by the initial population of the ladder ground state.

The second part of P_top(t) can be calculated in the same way. Comparing Eq. (B5) with Eq. (B4), one sees that (aside from the oscillating scalar factors) the first term of T_1 differs from T_0 by replacing σ⁻_{M_k} in the latter with |1_C 0_H⟩⟨1_C 0_H|_{M_k}, and the corresponding factor (Z_H − 1)/(Z_H Z_C) is thus replaced accordingly. The transition operator T_1 additionally contains the static term Π̃, which means that P_top(t) contains an additional time-independent term. Collecting these contributions yields the total top-level probability of the horizontal extension, which simplifies further upon taking the limit T_C → 0.

Appendix C: The horizontal and vertical extension

Inserting Eq. (C7) into Eq. (C6), we obtain a block-diagonal decomposition in terms of projectors Π̃^[k] and operators Ũ^[k]. With this, we are now in a position to provide a compact expression for the top-level probability P_top(t), where we have used the assumption that the initial state ρ_0 is diagonal with respect to the joint eigenbasis of the orthogonal projectors Π̃^[k], which is the case here because the ladder and all machine qubits are initially thermal with respect to either the cold or the hot bath.
For the first term in P_top, we can then use Eq. (C3) to calculate the required factors.
Inserting Eqs. (C14) and (C13) into Eq. (C12) and evaluating the sum over k, we obtain the first contribution to P_top. Turning to the second term of P_top in Eq. (C11), we express the individual terms in the sum over k accordingly, separating the terms corresponding to projectors onto the kernel and the support of the operator J_{M^(k)L} in the second step. With this, we can simplify the first factor appearing on the right-hand side of Eq. (C16). Reinserting the first term appearing in the last step of Eq. (C18) back into Eq. (C16), and evaluating the sum over k in Eq. (C11), we obtain another [i.e. in addition to that in Eq. (C15)] time-independent contribution to the top-level probability. For the second term appearing in the last step of Eq. (C18), we note that, since J_{M^(k)L} corresponds to the spin-j representation (with j = (d−1)/2) of the generator of rotations around the y-axis on the subspace spanned by the vectors |n_{M^(k)}, n_L⟩ for n = 0, 1, ..., d−1, the matrix elements ⟨(d−1)_{M^(k)}, (d−1)_L| e^(−i J_{M^(k)L} t) |n_{M^(k)}, n_L⟩ coincide with the elements of the Wigner (small) d-matrix d^j_{µ,m}(β) := ⟨j, µ| e^(−iβ J_y) |j, m⟩ for µ = j, m = n − j, and β = 2gt, see, e.g., [30] or [31]. In particular, Eq. (B7) in [30, p. 485] lets us write these matrix elements in closed form. The prefactors of these sinusoidal contributions are then obtained by combining the second term in the last step of Eq. (C18) with Eq. (C16), and evaluating the sum over k in Eq. (C11). Finally, we can collect Eqs. (C20) and (C21), and combine them with the time-independent terms in P_top to arrive at Eq. (C22), with the coefficient f(M, d, β_C, β_H) given by Eq. (C23). The expression in Eq. (C22) holds for arbitrary temperatures T_C and T_H > T_C, and includes the desired term proportional to sin^(2(d−1))(gt) in the sum for n = 0. In particular, this term is the only term in P_top(t) that remains when taking the limit T_C → 0, as stated in Eq. (15) of the main text.
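The limiting form P_top(t) = sin^(2(d−1))(gt) makes the role of d as a sharpness parameter explicit. A short numerical check (pure Python, g = 1 assumed for illustration) confirms that the full width at half maximum of the peak at t = π/(2g) shrinks as ~1/√(d−1):

```python
import math

def fwhm(d, g=1.0, n=100001):
    """Full width at half maximum of P_top(t) = sin^(2(d-1))(g t)
    around its peak at t = pi/(2 g), found by a simple grid scan."""
    peak = math.pi / (2 * g)
    for i in range(n):
        t = peak * i / (n - 1)                  # scan the rising edge
        if math.sin(g * t) ** (2 * (d - 1)) >= 0.5:
            return 2 * (peak - t)               # the peak is symmetric
    raise ValueError("no half-maximum point found")

w5, w17, w65 = fwhm(5), fwhm(17), fwhm(65)
assert w5 > w17 > w65
# quadrupling d - 1 halves the width: near the peak,
# sin^(2(d-1))(g t) ~ exp(-(d-1) (g t - pi/2)^2)
assert 1.9 < w5 / w17 < 2.1 and 1.9 < w17 / w65 < 2.1
```

This is the quantitative sense in which increasing the vertical extension d sharpens the temporal distribution.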
To see that small deviations from the ideal case T_C = 0 still allow P_top(t) to be close to its value in the ideal case, i.e. to show the stability of our approach to ATPC, we analyse the behaviour of P_top(t) in the limits M → ∞ and d → ∞ at finite temperatures. To this end, we first inspect Eq. (C23) and note that the term raised to the power M is smaller than 1. To see this, we define x := (Z_H − 1)/(Z_H Z_C) and y := (Z_C − 1)/(Z_H Z_C), which satisfy 0 ≤ y < x ≤ 1/2. The expression on the right-hand side of Eq. (C25) is less than or equal to 1 if x − x^d ≥ y − y^d, which is the case if x − x^d is monotonically increasing on the interval [0, 1/2], as inspecting the derivative 1 − d x^(d−1) confirms. Since we know that P_top(t) must lie in [0, 1], showing that the first term of Eq. (C22) (n = 0) remains close to 1 when M and d go to infinity is sufficient to show that our approach is stable with respect to deviations from T_H → ∞ and T_C → 0. The value of the expression in Eq. (C27) for t = π/(2g) remains close to 1 at finite temperatures when β_H E_H < β_C E_C, E_H > E_C, for β_H close to 0, and β_C ≫ β_H.

Appendix E: Numerical computation of accuracy and resolution

Exploiting the structure of the tick probability, we can write the desired first and second moments of the tick distribution in terms of integrals Ĩ_j. In this way, only Ĩ_j needs to be calculated numerically for j ∈ {0, 1, 2}, which greatly reduces the computational cost.
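The moment-based evaluation of Ĩ_j can be sketched as follows; the waiting-time density w(t) used below is a hypothetical stand-in (a simple exponential, not the paper's tick distribution), chosen so that the result N = μ²/σ² = 1 is known in closed form:

```python
import math

def moments(w, t_max, n=200000):
    """Numerically evaluate I_j = Int_0^t_max t^j w(t) dt for j = 0, 1, 2
    in a single pass (midpoint rule)."""
    dt = t_max / n
    I0 = I1 = I2 = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        wt = w(t)
        I0 += wt * dt
        I1 += wt * t * dt
        I2 += wt * t * t * dt
    return I0, I1, I2

# Hypothetical example: exponential waiting-time density w(t) = c e^(-c t),
# for which mu = 1/c, sigma^2 = 1/c^2, and hence N = mu^2/sigma^2 = 1.
c = 2.0
I0, I1, I2 = moments(lambda t: c * math.exp(-c * t), t_max=40.0)
mu = I1 / I0
var = I2 / I0 - mu ** 2
N = mu ** 2 / var
assert abs(I0 - 1.0) < 1e-6 and abs(mu - 1 / c) < 1e-4 and abs(N - 1.0) < 1e-3
```

Once the three integrals are in hand, the mean, variance, and accuracy of the tick distribution follow by simple arithmetic, which is what makes this reduction computationally cheap.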
Appendix F: How the 'sharpness' of P_top(t) influences accuracy and resolution

The aim of this appendix is to give further insight into the behaviour of clocks with changing ladder dimension d as well as changing coupling constant g, in particular with respect to Figs. 4, 5 and 6 in Sec. IV.
First, let us discuss the relationship between accuracy and 'sharpness' of P_top(t). The intuition is that clockworks capable of producing a very 'sharp' temporal probability distribution should have the potential to give rise to highly accurate clocks, given a suitable irreversible process for the 'tick' production. Since the maximal amplitude of P_top(t) approaches 1 very quickly as M increases, we may assume that M is chosen large enough for the maximal amplitude to lie within a desired distance of 1; the only parameter left that influences the 'sharpness' of the probability distribution is then the ladder dimension d, which we can therefore use as a proxy for 'sharpness'. We then proceed by numerically calculating the accuracy in this situation for given values of c and g. The results are shown in Fig. 7(a) and indicate that the accuracy grows linearly with d. In this regime, the 'sharpness' therefore determines the accuracy up to a constant factor. However, one should note that this linear relationship only holds in a regime where the decay process happens fast enough, i.e. assuming a sufficiently large value of c (or a small enough value of g). If P_top(t) is too 'sharp' compared to the time scale of the decay process, increasing the ladder dimension leads to a reduction of the accuracy [as seen in Figs. 7(a) and 4]. This implies that for a given combination of c and g there are certain choices of d that lead to sub-optimal clocks. Considering the resolution as a function of d [Fig. 7(b)] in the limit M → ∞, we do not observe an optimal configuration: the resolution simply decreases with increasing d, indicating a trade-off between accuracy and resolution in the regime where the accuracy increases linearly with d. Plotting accuracy against resolution thus reveals the trade-off relation depicted in Fig. 5.
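This non-monotonic behaviour can be reproduced in a toy model. As a simplified stand-in for the tick mechanism (not the exact model of Appendix D), assume the waiting-time density w(t) = c P_top(t) exp(−c ∫₀ᵗ P_top(t') dt'), i.e. a decay at rate c gated by the top-level population, with P_top(t) = sin^(2(d−1))(gt) as in the M → ∞ limit; the accuracy is N = μ²/σ² of this distribution:

```python
import math

def accuracy(d, c, g=1.0, max_periods=2000, steps=2000):
    """Accuracy N = mu^2/sigma^2 of the toy tick waiting-time density
    w(t) = c * P(t) * exp(-c * Int_0^t P),  P(t) = sin^(2(d-1))(g t)."""
    dt = math.pi / (g * steps)      # P(t) has period pi/g
    cum = 0.0                       # running value of c * Int_0^t P
    m0 = m1 = m2 = 0.0              # zeroth to second moments of w
    t = 0.0
    for _ in range(max_periods * steps):
        P = math.sin(g * t) ** (2 * (d - 1))
        w = c * P * math.exp(-cum)
        m0 += w * dt
        m1 += w * t * dt
        m2 += w * t * t * dt
        cum += c * P * dt
        t += dt
        if cum > 15.0:              # remaining tick probability < e^-15
            break
    mu = m1 / m0
    var = m2 / m0 - mu * mu
    return mu * mu / var

N = {d: accuracy(d, c=10.0) for d in (2, 4, 8, 16)}
# sharpening the peak first increases the accuracy ...
assert N[2] < N[4] < N[8]
# ... until the peak is so sharp that the decay starts skipping it
assert N[16] < N[8]
```

For fixed c (here c = 10 in units of g), the accuracy rises with d and then drops once the peak becomes too sharp for the decay to resolve, mirroring the behaviour described above.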
However, for finite M the resolution reaches a point at which it starts dropping to zero quickly. The reason for this can again be found in the amplitude of P_top(t), which goes to zero for large enough d and fixed M. Thus, not only the accuracy (see Fig. 4), but also the resolution is bounded from above by the corresponding value obtained for M → ∞, where c and g determine this upper bound. The reduced amplitude leads to an additional reduction in resolution, initiating a drop of the resolution towards 0 with increasing d. Increasing g shifts the peak to the right, i.e. to higher resolutions, while decreasing the maximum accuracy.