Alternative experimental ways to access entropy production

We theoretically derive and experimentally compare several different ways to access entropy production in a quantum process under feedback control. We focus on a bipartite quantum system realizing an autonomous Maxwell's demon scheme reported by Najera-Santos et al. [Phys. Rev. Research 2, 032025(R) (2020)], where information encoded in a demon is consumed to transfer heat from a cold qubit to a hot cavity. By measuring individual quantum trajectories of the joint demon-cavity-qubit system, we compute the entropy production with six distinct expressions derived from different approaches to the system description and its evolution. Each method uses a specific set of trajectories and data processing. Our results provide a unified view on the various meanings of irreversibility in quantum systems and pave the way to the measurement of entropy production beyond thermal frameworks.


I. INTRODUCTION
Entropy production (EP) is a key physical concept that quantifies the irreversibility of a given process: the larger the EP, the more irreversible the process. It was born from very practical considerations, since irreversibility fundamentally limits the performance of heat engines and fridges [1]. It eventually became the fundamental concept used to phrase the second law of thermodynamics (SLT): the EP of a physical process can never be negative. As a typical example, spontaneous heat flow from cold to hot bodies is forbidden by the SLT, as it would give rise to a negative EP.
Pioneering expressions of EP were established at the outset of macroscopic thermodynamics and generally applied to specific irreversible processes such as the thermalization of systems or, conversely, their driving out of thermal equilibrium. Such processes involve heat dissipation into reservoirs of well-defined temperatures, which makes heat and temperature two essential quantities to define EP. Later on, the ability to monitor and control the evolution of microscopic systems at the level of single realizations gave rise to the so-called stochastic EP [2-4], which provided a renewed perspective on irreversibility. At this level of description, irreversibility results from random perturbations exerted on the system dynamics by external reservoirs, thus preventing an external operator from rewinding the protocol. In this view, EP fundamentally captures the lack of control over microscopic systems, a concept that broadens the notion of EP to a much wider range of situations. Moreover, stochastic thermodynamics is agnostic to the type of noise and reservoirs which cause irreversibility. Its conceptual tools can be adapted to any kind of random perturbation, holding the promise to quantify irreversibility of quantum nature, e.g., stemming from decoherence or any source of quantum noise [5,6].
Microscopic systems undergoing feedback-controlled dynamics provide a first example of extension beyond open systems interacting with thermal environments. In such processes, information on the system's microstate is used to set its subsequent evolution. In the past decades, it became possible to quantify the EP of these processes, evidencing a novel place for information within thermodynamics. Treated as a correlation between the controlled system and the memory of the feedback loop, information was shown to be an essential component of EP in experiments inspired by the Maxwell's demon paradox [7-9].
From an experimental perspective, EP in its various senses has been measured on a handful of platforms. Without feedback control, experiments at the ensemble average level have been performed in a nuclear magnetic resonance (NMR) setup [10], in a micromechanical resonator [11] and in a Bose-Einstein condensate [11]. For feedback-controlled protocols, EP has been accessed at the average level in an NMR setup [12] and at the trajectory level with a superconducting circuit [13] and single-electron transistors [14]. The latter case provides an example of an autonomous Maxwell's demon, where information is encoded on a quantum system and is never processed at the classical level. The device operated as a fridge, consuming information to transfer heat from a cold to a hot reservoir. In this spirit, we have recently implemented a fully closed version of such a device where the cold and hot bodies, as well as the demon, are quantum systems evolving unitarily [15]. This situation is a minimalistic model of a closed, information-powered fridge.
The ability to theoretically describe and experimentally realize a wide range of irreversible processes involving an increasing number of parties has given rise to an equivalent variety of expressions of EP. This calls for the development of a unified perspective, serving both as a consistency check for the various definitions and as a testbed for their respective sensitivity to measurement errors. This is the purpose of the present article, where we theoretically and experimentally study the EP of the model system recalled above [15]. Namely, we derive and compare six alternative methods to measure the entropy produced by this system, chosen to cover and illustrate a large variety of equivalent approaches to characterize the EP. They differ by the way we analyze the system's state (ensemble average or quantum trajectories), theoretically describe the control (external or autonomous) and experimentally access the system evolution (single unitary evolution or a cyclic implementation incorporating the time reversal of the basic evolution). Each choice provides us with a different view onto the EP and its definition, allowing us to acquire a deeper understanding of the physical nature and the experimental meaning of EP obtained with different measurements. Despite being equivalent in the ideal case, these expressions show different sensitivities to experimental errors. This observation is confirmed by a thorough modelling of our experiment, providing a practical benchmark that can be used to adapt the measurement strategy of EP to a particular quantum system.
arXiv:2012.13640v2 [quant-ph] 7 May 2021

A. Review of general methods
We first review the measures of irreversibility established within the so-called quantum Jarzynski protocol [3,16-21], schematically presented in Fig. 1(a). After having thermalized with a heat reservoir at temperature $T$, a quantum system starts in the thermal equilibrium state $\zeta$ with inverse temperature $\beta = (k_B T)^{-1}$, also known as the thermodynamic beta, where $k_B$ is the Boltzmann constant. The system is first driven out of equilibrium through a unitary operation $U$, to the nonequilibrium state $\rho_f$ (here and in the following the subscript $f$ labels quantities at the end of the evolution). To be treated as a unitary, $U$ is assumed to be performed swiftly compared to the system relaxation. Then, the system relaxes back to the thermal state $\zeta$. This last step causes the whole process to be irreversible. The entropy production $\Sigma$ is proportional to the amount of heat $Q$ dissipated by the system along its thermalization, $\Sigma = \beta Q$. It is shown to equal the relative entropy $D(\rho_f \| \zeta)$, also known as quantum divergence, quantifying the non-negative distance between the two states. For states $\rho$ and $\sigma$, it is defined as $D(\rho \| \sigma) = -\mathrm{Tr}[\rho \ln \sigma] - S(\rho)$, where $S(\rho)$ is the von Neumann entropy of state $\rho$ [22]. This provides a first intuitive flavour for the EP: the farther the system is brought away from equilibrium, the larger the entropy production.
FIG. 1. The concept of the forward and backward system evolution used to access the entropy production of the thermalization process. (a) Starting in the equilibrium state $\zeta$, the system is unitarily driven out of equilibrium into state $\rho_f$. The irreversible thermalization with the external heat reservoir produces entropy $\Sigma$ by bringing the system back to $\zeta$. The backward evolution $\tilde{U}$ implements the time-reversed forward evolution. In the presence of the thermalization, the backward evolution cannot bring the system back into its initial state, thus revealing the irreversibility. (b) The overall scheme can be extended to a feedback-controlled evolution, where the unitary $U^{(k)}$ depends on the result $k$ of a control measurement (readout R) of the system state by a feedback controller.
Another meaning of the EP is obtained by attempting to reverse the forward evolution $U$. For this purpose we complete the above protocol with the time-reversed unitary operation $\tilde{U}$. In general, unitary operations are considered reversible: from an operational point of view this presupposes the ability to generate the backward evolution $\tilde{U} = U^\dagger$ (here and in the following the symbol $\sim$ denotes backward quantities). In the absence of the intermediate thermalization, this backward evolution would bring the system back to its initial state. The presence of the intermediate irreversible thermalization is the reason why the process cannot be reversed. EP is shown to equal the relative entropy $D(\zeta \| \tilde{\rho}_f)$ of the initial thermal state with respect to the final state $\tilde{\rho}_f$ of this backward evolution. This is also intuitive: the lower our ability to time-reverse the evolution, the larger the entropy production.
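The two average-level statements above, $\Sigma = \beta Q = D(\rho_f \| \zeta)$ and $\Sigma = D(\zeta \| \tilde{\rho}_f)$, can be checked numerically on a toy example. The following is a minimal sketch, assuming a single qubit with unit level spacing, $k_B = 1$, and a real rotation as the drive $U$; the parameter values `beta` and `theta` are hypothetical and not taken from the experiment.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr[rho ln rho], skipping numerically zero eigenvalues."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

def relative_entropy(rho, sigma):
    """D(rho||sigma) = -Tr[rho ln sigma] - S(rho), for a full-rank sigma."""
    w, v = np.linalg.eigh(sigma)
    log_sigma = (v * np.log(w)) @ v.conj().T
    return float(-np.trace(rho @ log_sigma).real) - von_neumann_entropy(rho)

# Hypothetical toy parameters (assumptions for illustration only).
beta = 1.3                          # inverse temperature, k_B = 1
H = np.diag([0.0, 1.0])             # qubit Hamiltonian with unit level spacing
zeta = np.diag(np.exp(-beta * np.diag(H)))
zeta = zeta / np.trace(zeta)        # thermal state

theta = 0.7                         # rotation angle of the drive U
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

rho_f = U @ zeta @ U.conj().T       # state after the forward drive
rho_b = U.conj().T @ zeta @ U       # thermal state sent through the backward drive

heat = float(np.trace(H @ (rho_f - zeta)).real)  # heat released while relaxing to zeta
sigma_heat = beta * heat
sigma_forward = relative_entropy(rho_f, zeta)
sigma_backward = relative_entropy(zeta, rho_b)

print(sigma_heat, sigma_forward, sigma_backward)
```

In this sketch the three printed numbers agree up to rounding, which reflects the invariance of the relative entropy under a common unitary transformation of both arguments.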
Finally, the concept of EP can be extended to the level of single realizations, which correspond to two-point quantum trajectories in the present quantum Jarzynski protocol. Each trajectory $\gamma$ is defined by the outcomes of energy measurements performed at the beginning and at the end of the forward protocol, while $\tilde{\gamma}$ stands for its time-reversed counterpart, as introduced in the pioneering two-point energy measurement (TPEM) scheme [2-4]. The stochastic EP is defined as $\sigma[\gamma] = \ln\left( p(\gamma) / \tilde{p}(\tilde{\gamma}) \right)$ and compares the probability $p(\gamma)$ for $\gamma$ to be realized in the forward protocol and the probability $\tilde{p}(\tilde{\gamma})$ for the corresponding $\tilde{\gamma}$ in the backward protocol [5]. This expression provides us with another intuitive way to quantify irreversibility at the level of single trajectories. Although $\sigma[\gamma]$ can take negative values, its average $\Sigma$ over all possible trajectories is non-negative by convexity of the exponential, in agreement with the SLT.
From this brief review, it appears that EP can be captured with various operational resources, in particular the ability to access average or stochastic physical quantities and the ability to run an evolution forward and backward. In what follows, we systematically apply these different approaches to EP to a basic protocol of an information-powered fridge. More specifically, the system is measured by a controller (readout R) and its further unitary evolution $U^{(k)}$ is set by the readout outcome $k$, thus leading to several different evolution branches of the feedback-controlled system, see Fig. 1(b). Along with the two measurements forming the TPEM scheme, the readout outcome also contributes to the definition of the quantum trajectories. For the sake of clarity, we first detail the non-autonomous description of the protocol, where the feedback uses information encoded on a classical memory of the external controller. Then, we focus on the autonomous description, i.e., for a fully closed system as reported in Ref. [15].
B. Average evolution in the non-autonomous description
Figure 2 illustrates a non-autonomous description of the Maxwell's demon experiment studied in this paper. We consider a qubit Q and a cavity C. Their interaction is controlled by a third system, further dubbed demon and denoted by D. In this description, D is a classical entity, performing a local projective measurement on Q in its energy basis and storing its result in a classical memory. The two measurement outcomes are then exploited in the feedback loop (readout followed by feedback), which conditionally acts on the joint QC system. Namely, they trigger a unitary system evolution $U^{(1)} = V$ for $k = 1$ and no interaction, i.e., the identity $U^{(0)} = I$, for $k = 0$. All the EP expressions derived in this and the next section are also valid for more general settings, with two arbitrary systems Q and C.
We first consider the average evolution of the joint QC system and use the density matrix approach to describe its state. Initially, Q and C start in the local thermal equilibrium state $\rho_i^{QC} = \rho_i^{Q} \otimes \rho_i^{C}$, where $\rho_i^{j} = e^{-\beta_j (H_j - F_j)}$ are the Gibbs states. For each system $j \in \{Q, C\}$, $H_j$ is the local Hamiltonian and $F_j = -(1/\beta_j) \ln \mathrm{Tr}\, e^{-\beta_j H_j}$ is the equilibrium free energy. The internal energy of system $j$ in state $\rho^j$ is given by $U_j = \mathrm{Tr}_j[H_j \rho^j]$. Next, D performs a projective measurement (i.e., the demon readout) on the system. The measurement outcome $k$ projects QC onto the state $\rho_i^{QC,k}$ with probability $p(k)$. Then, D stores the outcome $k$ and induces the unitary feedback operation $U^{(k)}$ between Q and C depending on $k$. There are thus several distinct branches, labelled by $k$, of the possible unitary evolution of the system. The final QC states and their average over all measurement outcomes read $\rho_f^{QC,k} = U^{(k)} \rho_i^{QC,k} U^{(k)\dagger}$ and $\rho_f^{QC} = \sum_k p(k)\, \rho_f^{QC,k}$, respectively. The relaxation of the non-equilibrium state $\rho_f^{QC}$ towards the initial thermal product state gives rise to the entropy production. The demon's memory, on the other hand, does not relax and hence does not produce entropy. This leads to our first expression of EP:
$$\Sigma_1 = \Delta\beta\, Q_C + I, \tag{1}$$
where $\Delta\beta = \beta_C - \beta_Q$, $Q_C = \sum_k p(k)\, \Delta U_{C,k}$ is the heat absorbed by C, $\Delta U_{C,k}$ is the energy change of C during the feedback operation in branch $k$, and $I = H[p(k)] = -\sum_k p(k) \ln p(k)$ is the Shannon entropy of the readout measurement. We use the fact that $Q_Q = -Q_C$ for a closed system and an energy-preserving readout, see Appendix A. If there was no feedback action ($U^{(1)} = I$), $\Sigma_1$ would reduce to the well-known classical expression $\Sigma = \Delta\beta\, Q$ [5], quantifying the entropic counterpart of the heat exchanged between two systems. In addition to this exchange term, Eq. (1) explicitly involves an informational contribution. This is in agreement with the pioneering expressions of the SLT in the presence of a feedback control that were obtained by explicitly taking the demon's physical memory into account [23-25].
An alternative, second expression for the EP can be obtained starting from the following identity for an arbitrary state $\rho$ of the QC system:
$$D(\rho \| \rho_i^{QC}) = \sum_{j \in \{Q,C\}} \beta_j \left( U_j - F_j \right) - S(\rho).$$
Writing the heat in (1) in terms of the quantum divergence we obtain
$$\Sigma_2 = \sum_k p(k)\, D(\rho_f^{QC,k} \| \rho_i^{QC}), \tag{2}$$
where $\rho_f^{QC,k}$ is the final QC state conditioned on $k$, see Appendix A. This expression can be interpreted as follows. The entropy production for a thermalization process is known to be given by the quantum divergence of the initial state with respect to the final thermal one [26]. For a given $k$, the entropy produced during the thermalization of the conditional state $\rho_f^{QC,k}$ is thus $D(\rho_f^{QC,k} \| \rho_i^{QC})$. The total EP is, therefore, the average of such a conditional EP, associated to each branch $k$, over all readout outcomes.
The expressions $\Sigma_1$ and $\Sigma_2$ rely on the physical quantities provided by the forward protocol only. A third expression, which also contains information from the backward protocol, can be obtained as well. For each branch $k$, the backward process is defined by the application of the time-reversed unitary $\tilde{U}^{(k)} = [U^{(k)}]^\dagger$ on the state after the thermalization, while the demon's memory remains unchanged. Thus, the probability of applying $\tilde{U}^{(k)}$ is given by the probability $p(k)$ of ending up in the forward branch $k$. Starting from (2) we show in Appendix A that
$$\Sigma_3 = \sum_k p(k)\, D(\rho_i^{QC,k} \| \tilde{\rho}_f^{QC,k}),$$
where $\tilde{\rho}_f^{QC,k}$ is the QC state of the backward protocol of the branch $k$ after the backward evolution $\tilde{U}^{(k)}$. This expression for the EP also comes in the form of an average over the outcomes of the readout measurement. It is a generalization of the equation obtained in Ref. [10], where no feedback control was considered.
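The agreement of the three average-level expressions (heat plus information, branch-averaged divergence in the forward protocol, and branch-averaged divergence involving the backward protocol) can be illustrated on a deliberately small model. The sketch below is an assumption-laden toy, not the experimental system: C is truncated to two levels, Q and C share the same unit level spacing, all states are diagonal and stored as population arrays `pop[n_Q, n_C]`, the readout is ideal with $k = n_Q$, and the feedback $V$ is taken to be a swap of $|1_Q, 0_C\rangle$ and $|0_Q, 1_C\rangle$. The function names and temperatures are hypothetical.

```python
import numpy as np

def gibbs(beta, energies):
    """Boltzmann populations for the given energy levels."""
    w = np.exp(-beta * np.asarray(energies, dtype=float))
    return w / w.sum()

def kl(p, q):
    """Relative entropy sum_n p_n ln(p_n / q_n) for diagonal states (0 ln 0 = 0)."""
    p, q = np.ravel(p), np.ravel(q)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def shannon(p):
    """Shannon entropy -sum p ln p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def swap(pop):
    """Toy feedback V: exchange the populations of |1_Q,0_C> and |0_Q,1_C>."""
    out = pop.copy()
    out[1, 0], out[0, 1] = pop[0, 1], pop[1, 0]
    return out

def identity(pop):
    return pop.copy()

beta_Q, beta_C = 2.0, 0.5            # cold qubit, hot two-level "cavity" (toy values)
levels = np.array([0.0, 1.0])
p_Q, p_C = gibbs(beta_Q, levels), gibbs(beta_C, levels)
rho_i = np.outer(p_Q, p_C)           # thermal product state, rho_i[n_Q, n_C]

p_k = p_Q                            # ideal readout: p(k) = p(n_Q = k)
Q_C = sigma_2 = sigma_3 = 0.0
for k, feedback in enumerate((identity, swap)):
    rho_i_k = np.zeros((2, 2))
    rho_i_k[k, :] = p_C              # QC state projected on readout outcome k
    rho_f_k = feedback(rho_i_k)      # forward branch k
    rho_b_k = feedback(rho_i)        # backward branch k (the swap is its own inverse)
    dU_C = float((rho_f_k - rho_i_k).sum(axis=0) @ levels)
    Q_C += p_k[k] * dU_C
    sigma_2 += p_k[k] * kl(rho_f_k, rho_i)
    sigma_3 += p_k[k] * kl(rho_i_k, rho_b_k)

sigma_1 = (beta_C - beta_Q) * Q_C + shannon(p_k)
print(sigma_1, sigma_2, sigma_3)
```

The equal level spacing is what makes the swap energy conserving, so that $Q_Q = -Q_C$ holds in this toy as assumed in the text.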

C. Stochastic evolution in the non-autonomous description
The system evolution can also be described stochastically by means of individual quantum trajectories. All thermodynamic quantities become trajectory-dependent, providing a finer description of the system dynamics. In the spirit of the TPEM scheme, the definition of our quantum trajectories involves the initial and final energy states of the joint QC system, respectively denoted $|n_Q, n_C\rangle$ and $|m_Q, m_C\rangle$. In the present case where the dynamics generates no coherence in the energy basis, these states can be accessed by two energy measurements $M_1$ and $M_2$, performed at the beginning and at the end of the feedback loop, with respective outcomes $\{n_Q, n_C\}$ and $\{m_Q, m_C\}$. The probability of the measurement outcomes $\{n_Q, n_C\}$ is $p(n_Q, n_C) = p(n_Q)\, p(n_C)$, where the probabilities $p(n_j)$ are the Boltzmann weights of the initial uncorrelated thermal states. The demon readout R of outcome $k$ conditions the QC evolution $U^{(k)}$ between $M_1$ and $M_2$. For the ideal readout, the state $|n_Q\rangle$ deterministically sets the value of $k$. In a more general case, we can consider a conditional probability $p(k|n_Q)$ of the readout outcome $k$, accounting for possible readout limitations, such as non-projective measurement or detection errors. Eventually, the trajectory $\gamma$ is defined by a unique set of the initial states $|n_Q, n_C\rangle$, the system evolution branch $k$, and the final state $|m_Q, m_C\rangle$. The forward trajectory probability distribution $p(\gamma)$ is given by the probability of getting the set of outcomes $\gamma = \{n_Q, k, n_C, m_Q, m_C\}$ and it explicitly reads
$$p(\gamma) = p(n_Q)\, p(n_C)\, p(k|n_Q)\, p(m_Q, m_C | n_Q, k, n_C). \tag{4}$$
The total number of all possible trajectories is $d_Q^2\, d_C^2\, d_Q$, where $d_j$ is the size of the Hilbert space of the system $j$. The last contribution, $d_Q$, comes from the fact that the external controller (demon) must contain $d_Q$ distinguishable states to encode the measurement outcomes.
We now turn our attention to the distribution of the backward trajectories. As mentioned above, the backward process in each reverse branch $k$ is generated by the time-reversed unitary $\tilde{U}^{(k)}$. The backward trajectory $\tilde{\gamma}$ is defined by the set of parameters $\{n_Q, k, n_C, m_Q, m_C\}$ and is the counterpart of the forward trajectory $\gamma$ labelled by the same indices. Similarly to the forward protocol, the probability distribution of the backward trajectories is given by
$$\tilde{p}(\tilde{\gamma}) = p_b(m_Q)\, p_b(m_C)\, p(k)\, p_b(n_Q, n_C | m_Q, k, m_C),$$
where $p_b(n_Q, n_C | m_Q, k, m_C)$ is the conditional probability of the final backward state. The initial backward probabilities $p_b(m_Q)$ and $p_b(m_C)$ are determined from the corresponding initial Gibbs states. The probability $p(k)$ of the backward branch $k$ equals the probability of the readout outcome $k$ in the forward protocol.
Given the probabilities of the forward trajectory $\gamma$ and of the corresponding backward trajectory $\tilde{\gamma}$, the stochastic EP is defined as $\sigma[\gamma] = \ln\left( p(\gamma) / \tilde{p}(\tilde{\gamma}) \right)$. The average EP computed over all $\gamma$'s is then given by [5]
$$\Sigma_4 = \sum_\gamma p(\gamma)\, \sigma[\gamma] = D\left( p(\gamma) \| \tilde{p}(\tilde{\gamma}) \right),$$
where the relative entropy $D$ is computed between the probability distributions $p(\gamma)$ and $\tilde{p}(\tilde{\gamma})$. Note that for classical distributions $p_n$ and $q_n$, $D$ is defined by the Kullback-Leibler divergence, $D(p \| q) = \sum_n p_n \ln(p_n / q_n)$ [27], which is equivalent to the quantum divergence between two states whose density matrices are diagonal in the same basis. The expression $\Sigma_4$ quantifies the irreversibility by comparing the stochastic trajectories of the forward and backward protocols. Notably, its computation requires no knowledge of the actual physical states defining the trajectories; it only requires the ability to distinguish different trajectories in order to properly access their probabilities.
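We define
$$p(\sigma) = \sum_\gamma p(\gamma)\, \delta_{\sigma, \sigma[\gamma]}, \qquad p_b(\sigma) = \sum_\gamma \tilde{p}(\tilde{\gamma})\, \delta_{\sigma, \sigma[\gamma]}$$
as the total probability of the forward and backward trajectories, respectively, contributing to the value $\sigma$ of EP, where $\delta_{a,b}$ is the Kronecker delta. With these two probability distributions one can easily show the detailed fluctuation relation: $\exp(\sigma) = p(\sigma) / p_b(\sigma)$. By averaging over all possible values of $\sigma$ we obtain
$$\Sigma_5 = \sum_\sigma p(\sigma) \ln \frac{p(\sigma)}{p_b(\sigma)} = D\left( p(\sigma) \| p_b(\sigma) \right).$$
The stochastic entropy production $\sigma[\gamma]$ for each trajectory $\gamma$, required for computing $\Sigma_5$, can be obtained as
$$\sigma[\gamma] = \beta_Q Q_Q[\gamma] + \beta_C Q_C[\gamma] + I[\gamma],$$
see Appendix A. Here, $Q_j[\gamma]$ is the stochastic heat received by the system $j \in \{Q, C\}$ and $I[\gamma] = -\ln p(k)$ is the stochastic information extracted from the readout measurement. In contrast to $\Sigma_4$, the expression $\Sigma_5$ compares the forward and backward probability contributions to each value of the EP, even if $\sigma[\gamma]$ is degenerate for some trajectories. Thus, despite the obvious similarity of the mathematical expressions, $\Sigma_4$ and $\Sigma_5$ differ appreciably with respect to the information required for their computation.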
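The difference between the trajectory-resolved average and the average over binned EP values can be made concrete with a small enumeration. The sketch below is a toy under stated assumptions rather than the experimental analysis: C is truncated to two levels, the readout is ideal ($k = n_Q$), and the feedback for $k = 1$ deterministically swaps $|1_Q, 0_C\rangle$ and $|0_Q, 1_C\rangle$. Trajectories are keyed as `(n_Q, k, n_C, m_Q, m_C)`; the temperatures and the helper `feedback` are hypothetical.

```python
from collections import defaultdict
import numpy as np

def gibbs(beta, energies):
    """Boltzmann populations for the given energy levels."""
    w = np.exp(-beta * np.asarray(energies, dtype=float))
    return w / w.sum()

beta_Q, beta_C = 2.0, 0.5                 # toy inverse temperatures
levels = (0.0, 1.0)
p_Q, p_C = gibbs(beta_Q, levels), gibbs(beta_C, levels)

def feedback(k, n_Q, n_C):
    """Deterministic toy feedback: branch k = 1 swaps |1,0> and |0,1>."""
    if k == 1 and (n_Q, n_C) == (1, 0):
        return 0, 1
    if k == 1 and (n_Q, n_C) == (0, 1):
        return 1, 0
    return n_Q, n_C

forward, backward = {}, {}
for n_Q in (0, 1):
    for n_C in (0, 1):
        k = n_Q                            # ideal readout, p(k|n_Q) = delta
        m_Q, m_C = feedback(k, n_Q, n_C)
        gamma = (n_Q, k, n_C, m_Q, m_C)
        forward[gamma] = p_Q[n_Q] * p_C[n_C]
        # Backward: Gibbs weights of (m_Q, m_C), branch k drawn with p(k) = p_Q[k],
        # and the reversed swap returns to (n_Q, n_C) with certainty.
        backward[gamma] = p_Q[m_Q] * p_C[m_C] * p_Q[k]

# Trajectory-level EP and its average over forward trajectories.
sigma = {g: float(np.log(forward[g] / backward[g])) for g in forward}
sigma_4 = sum(forward[g] * sigma[g] for g in forward)

# Lump trajectories sharing the same EP value, then compare the binned distributions.
p_fwd, p_bwd = defaultdict(float), defaultdict(float)
for g, s in sigma.items():
    key = round(s, 10)
    p_fwd[key] += forward[g]
    p_bwd[key] += backward[g]
sigma_5 = sum(p * np.log(p / p_bwd[s]) for s, p in p_fwd.items())

print(len(forward), len(p_fwd), sigma_4, sigma_5)
```

In this toy the two trajectories starting with the qubit in its ground state share the same EP value, so the binned distribution has fewer entries than the trajectory distribution while the two averages still coincide.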

D. The autonomous description
In the non-autonomous description the demon D has been treated as a classical feedback loop, involving a measurement and a conditional action on the QC system. The demon's influence has been taken into account through the measurement outcome probability $p(k)$ quantifying the information extracted by D from the system Q. Alternatively, we can also consider a global QDC system incorporating D, with the demon feedback action being part of a global unitary evolution of this closed system. For the remainder of this section we apply to our experiment the autonomous demon description reported in Ref. [15]. The equivalent quantum circuit for the forward protocol corresponding to a two-level (qubit) system Q is depicted in Fig. 4. The demon, also assumed to be a qubit without loss of generality, starts in the pure reference state $|1_D\rangle$. Both the readout and the feedback operations are dynamically implemented by means of global unitaries on the total QDC system. The projective readout in the energy basis of Q is replaced by a controlled NOT (CNOT) gate. It transforms the initial state of D into $|0_D\rangle$ if the state of Q is $|0_Q\rangle$. After the CNOT gate, a controlled unitary operation between Q and C is performed to appropriately implement the feedback action. As expected, the reduced QC state after this operation is $\sum_k p(k)\, \rho_f^{QC,k}$, which is the average final state $\rho_f^{QC}$ of the protocol described in Sec. II C.
Our final expression for the EP comes from the analysis of the closed QDC system with the demon D explicitly included as a third quantum system. We show in Appendix A that
$$\Delta\beta\, Q_C = D(\rho_f^{QC} \| \rho_i^{QC}) + \Delta_{\rm fb} I^{QC:D}, \tag{9}$$
where $I^{QC:D} = D(\rho^{QDC} \| \rho^{QC} \otimes \rho^{D})$ is the mutual information between QC and D, while $\Delta_{\rm fb}$ denotes the information change during the feedback step, i.e., before and after the controlled unitary gate. We consider here the ideal readout, i.e., $p(k|n_Q) = \delta_{k,n_Q}$. This relation is a generalization to our current protocol of the expression first derived in Ref. [15]. It comes directly from the entropy conservation of the global QDC system for the closed evolution depicted in Fig. 4. Since the correlations before the feedback step are given by the Shannon entropy $H[p(k)]$, substituting (9) into (1) we obtain
$$\Sigma_6 = D(\rho_f^{QC} \| \rho_i^{QC}) + I_f^{QC:D}.$$
It clearly shows that the EP has two contributions. The divergence quantifies the entropy produced in the thermalization process for the QC system starting in the state $\rho_f^{QC}$. The final mutual information $I_f^{QC:D}$ between the two subsystems QC and D quantifies the amount of entropy produced by erasing all correlations between them, due to the thermalization. This result evidences that there is an entropic cost for erasing correlations [5,28].

E. Summary of all expressions
Table I summarizes the six alternative expressions for the entropy production along with additional information on the underlying protocols and the statistical nature of the required physical quantities. Expressions $\Sigma_1$, $\Sigma_2$ and $\Sigma_6$ are based on the data extracted exclusively from the forward protocol. The other expressions require the execution and analysis of the backward protocol as well. We can distinguish three types of physical quantities showing up in different expressions: expressions $\Sigma_1$ and $\Sigma_6$ can be computed using information on the average initial and final states of the system ("averaged"). Expressions $\Sigma_2$ and $\Sigma_3$ are based on data averaged over different readout outcomes, thus requiring the discrimination of different evolution branches ("branched"). Finally, to compute expressions $\Sigma_4$ and $\Sigma_5$ we have to resolve individual trajectories ("stochastic").
TABLE I.
Expression    Protocol                Quantities
$\Sigma_1$    forward                 averaged
$\Sigma_2$    forward                 branched
$\Sigma_3$    forward and backward    branched
$\Sigma_4$    forward and backward    stochastic
$\Sigma_5$    forward and backward    stochastic
$\Sigma_6$    forward                 averaged
Figure caption: The Maxwell's demon system is realized with a microwave cavity (C) and flying circular Rydberg atoms (blue toroid for the qubit-demon atom, magenta toroids for QND probe atoms). See text and Ref. [15] for details.
Describing the same physical quantity, all these expressions are equivalent under the restriction of ideal unitary evolutions and ideal projective measurements, which have been used for their derivation. In the presence of realistic deviations from the idealized scenario, they start to differ, as will be shown in the next Section. For diagonal states, as considered here, only the expressions $\Sigma_2$ and $\Sigma_6$ stay mathematically identical and provide the same EP value irrespective of the evolution imperfections, see Appendix B.
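The statement that the branch-averaged divergence and the autonomous expression (divergence of the averaged final state plus final QC:D mutual information) coincide for diagonal states, whatever the branch evolution, can be probed numerically. The sketch below is a minimal check under assumptions that are not from the source: the branch-conditioned final populations are drawn at random to mimic an arbitrary (possibly imperfect) evolution, the demon is assumed to hold a perfect classical record of $k$, and the reference populations `rho_i` are a random full-support stand-in for the thermal state. All numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def shannon(p):
    """Shannon entropy -sum p ln p of a (possibly multi-dimensional) distribution."""
    p = np.ravel(p)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def kl(p, q):
    """Relative entropy sum_n p_n ln(p_n / q_n) for diagonal states."""
    p, q = np.ravel(p), np.ravel(q)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

d_QC = 8                                         # e.g. a qubit times four photon-number states
rho_i = rng.dirichlet(np.ones(d_QC))             # stand-in reference populations (full support)
p_k = np.array([0.7, 0.3])                       # readout probabilities (toy values)
rho_f_k = rng.dirichlet(np.ones(d_QC), size=2)   # arbitrary branch-conditioned final populations

# Branch-averaged divergence.
sigma_2 = sum(p_k[k] * kl(rho_f_k[k], rho_i) for k in range(2))

# Autonomous form: averaged final state plus final mutual information between QC and D.
rho_f = p_k @ rho_f_k                            # average final QC populations
joint = p_k[:, None] * rho_f_k                   # joint populations of (D = k, QC state)
mutual_info = shannon(rho_f) + shannon(p_k) - shannon(joint)
sigma_6 = kl(rho_f, rho_i) + mutual_info

print(sigma_2, sigma_6)
```

The agreement here is an algebraic identity for classically correlated diagonal states, which is why it does not depend on how the random branch populations were generated.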

A. Maxwell's demon system
We measure the entropy production in the Maxwell's demon system described by the quantum circuit in Fig. 4 and realized in a cavity QED setup [15]. Qubit Q and two-level demon D are simultaneously encoded into three adjacent circular Rydberg states of a single rubidium atom A (with principal quantum numbers 49, 50 and 51 corresponding to atomic states $|f\rangle$, $|g\rangle$ and $|e\rangle$, respectively). The mapping between the logical states of the QD system and the physical states of A is the following: $|1_Q, 1_D\rangle = |e\rangle$, $|0_Q, 1_D\rangle = |g\rangle$, and $|0_Q, 0_D\rangle = |f\rangle$. According to the Maxwell's demon circuit, see Fig. 4, the state $|1_Q, 0_D\rangle$ is never populated and does not need to be encoded in a particular physical state of A. The system C is realized with a high-quality superconducting microwave cavity resonant with the atomic $|g\rangle$-$|e\rangle$ transition at 51 GHz and far detuned from the $|g\rangle$-$|f\rangle$ transition at 54 GHz.
The basic experimental setup is schematically presented in Fig. 5. Individual flying Rydberg atoms exit the preparation zone B in the state $|g\rangle$. To prepare the state $|e\rangle$, a resonant microwave pulse is applied in $R_Q$ by means of the microwave source $S_{eg}$. Its amplitude and duration are adjusted to realize a Rabi $\pi$-pulse between $|g\rangle$ and $|e\rangle$. The demon readout is implemented by deterministically flipping the atomic states $|g\rangle$ and $|f\rangle$ before A enters C. This operation is induced by the microwave source $S_{gf}$ resonant with the $|g\rangle$-$|f\rangle$ transition and adjusted to maximize the atomic population transfer.
The atom-cavity interaction is controlled by an electric field applied across C by the voltage source V, which Stark-tunes the atomic frequency. The demon feedback is implemented by a resonant interaction between A and C based on the adiabatic passage technique. It allows for an efficient population transfer between the AC states $|e, n\rangle$ and $|g, n+1\rangle$, independent of the cavity photon number $n$. Energy conservation prevents the coupling of the joint ground state $|g, 0\rangle$ to other states.
The atomic states are directly measured by a field-ionisation detector M, providing us with the final qubit and demon states, $|m_Q\rangle$ and $|m_D = k\rangle$, respectively. The cavity photon-number state $|m_C\rangle$ is probed by a sequence of several tens of atoms interacting with C in the dispersive regime and performing a quantum non-demolition (QND) measurement of its photon number [29].

B. Experimental sequences
Each of the six EP expressions can be experimentally accessed by running the Maxwell's demon circuit and measuring the physical quantities entering these expressions using measurement strategies properly adapted to each quantity. For instance, the cavity energy change $Q_C$, required to compute $\Sigma_1$, can be obtained by comparing the average initial and final photon number in the cavity without the need to resolve different photon numbers. However, in order to significantly reduce the overall data acquisition time and to address all expressions at once, we have decided to record the complete statistics of individual trajectories in the forward and backward protocols of the Maxwell's demon circuit. Knowing the initial and final states of each trajectory as well as their occurrence probability allows us to compute any physical quantity appearing in the EP expressions, as will be shown below.
In order to get the EP for any initial temperature of Q and C without increasing the overall experimental time, we have decided to replace the combination of the initial thermal state preparation and the first projective measurement of the TPEM scheme with the direct preparation of the QDC system in the pure energy eigenstate $|n_Q, 1_D, n_C\rangle$. The different thermal states are then taken into account by using the corresponding theoretical probability distributions $p(n_Q)$ and $p(n_C)$. We have shown in Ref. [15] that the Gibbs states of Q and C at given inverse temperatures $\beta_Q$ and $\beta_C$ can be experimentally prepared and measured to be in good agreement with the theoretical distributions $p(n_Q)$ and $p(n_C)$.
Summarizing, in our basic experimental sequence we initially prepare the QDC system in the pure energy eigenstate $|n_Q, 1_D, n_C\rangle$ and measure the probability of its final state $|m_Q, k, m_C\rangle$ after the feedback evolution.
In this way we obtain the conditional probability $p(m_Q, k, m_C | n_Q, n_C)$ of the trajectory $\gamma = \{n_Q, k, n_C, m_Q, m_C\}$. Note that the initial demon state is always $|1_D\rangle$ and thus does not enter into the trajectory definition. Finally, for any inverse temperatures $\beta_Q$ and $\beta_C$ with the corresponding $p(n_Q)$ and $p(n_C)$, we compute $p(\gamma)$ using Eq. (4).
In this work we consider, without loss of generality, a constant cavity temperature of 2.8 K and a qubit temperature varying such that the relative inverse temperature $\delta\beta = 1 - \beta_Q/\beta_C \in [-6, 6]$. Since the populations of the photon-number states larger than 3 are negligible for this temperature (the mean thermal photon number is 0.71), we restrict $n_C$ to values from 0 to 3 only.
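The quoted mean thermal photon number follows from the Bose-Einstein distribution of a single cavity mode. The sketch below evaluates it together with the weight left outside the truncation $n_C \le 3$, assuming the cavity frequency is the rounded 51 GHz given for the $|g\rangle$-$|e\rangle$ transition; the function names are hypothetical, and the physical constants are the exact SI values.

```python
import numpy as np

H_PLANCK = 6.62607015e-34   # Planck constant in J s (exact SI value)
K_BOLTZ = 1.380649e-23      # Boltzmann constant in J/K (exact SI value)

def mean_thermal_photons(freq_hz, temp_k):
    """Bose-Einstein mean occupation 1 / (exp(h f / k_B T) - 1) of one mode."""
    x = H_PLANCK * freq_hz / (K_BOLTZ * temp_k)
    return 1.0 / np.expm1(x)

def photon_number_distribution(freq_hz, temp_k, n_max):
    """Thermal populations p(n) = (1 - e^{-x}) e^{-n x} for n = 0..n_max."""
    x = H_PLANCK * freq_hz / (K_BOLTZ * temp_k)
    n = np.arange(n_max + 1)
    return (1.0 - np.exp(-x)) * np.exp(-n * x)

freq = 51e9          # assumption: rounded transition frequency from the text
temperature = 2.8    # cavity temperature in kelvin, from the text

n_bar = mean_thermal_photons(freq, temperature)
p_n = photon_number_distribution(freq, temperature, n_max=3)
tail = 1.0 - p_n.sum()   # total population of photon numbers above 3

print(n_bar, p_n, tail)
```

With the rounded frequency, `n_bar` should come out close to the 0.71 quoted in the text, and `tail` gives the population discarded by restricting $n_C$ to 0 through 3.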
The vacuum state $|n_C = 0\rangle$ of C is prepared by sending through its mode a beam of resonant atoms in state $|g\rangle$. They absorb all photons from C, thus cooling it into the vacuum. The state $|n_C = 1\rangle$ with one photon is excited from the vacuum by using one atom in state $|e\rangle$ and forcing it to resonantly emit a photon into C. The preparation of larger photon-number states is realized by the QND projection of a small coherent field [29]. We first inject into C a coherent field with about 3 photons on average. Then, we perform the QND measurement, randomly resulting in different photon-number states. Finally, we post-select and sort all trajectories with the initial projected states $|n_C = 2\rangle$ and $|n_C = 3\rangle$. The final QDC state is measured independently on each ensemble of quantum trajectories with the same initial state $|n_Q, n_C\rangle$. The final detection of A gives us the conditional probability $p(m_Q, k | n_Q, n_C)$. The cavity photon-number probability is reconstructed on the ensemble of trajectories [30] with the same initial and final QD state. In this way we obtain the conditional distribution $p(m_C | n_Q, n_C, m_Q, k)$ and compute $p(m_Q, k, m_C | n_Q, n_C) = p(m_C | n_Q, n_C, m_Q, k)\, p(m_Q, k | n_Q, n_C)$. The procedure of the state preparation and detection, along with all measured probabilities, is presented in detail in Appendix D. The probability of the trajectory $\gamma = \{n_Q, n_C, m_Q, k, m_C\}$ for each $\beta_Q$ then equals $p(\gamma) = p(m_Q, k, m_C | n_Q, n_C)\, p(n_Q)\, p(n_C)$. A similar procedure is realized to obtain the probability distribution $\tilde{p}(\tilde{\gamma})$ of the backward trajectories. The sets of probabilities $\{p(\gamma)\}$ and $\{\tilde{p}(\tilde{\gamma})\}$ are used in the following to compute all expressions for the entropy production, as explained below for each EP expression.
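The reweighting step, in which a single table of conditional probabilities is combined with theoretical Boltzmann weights to obtain $p(\gamma)$ at any qubit temperature, can be sketched as follows. This is an illustration only: the conditional table is a hypothetical ideal one (no errors), in which a qubit in $|0_Q\rangle$ gives $k = 0$ and leaves the cavity untouched, while a qubit in $|1_Q\rangle$ gives $k = 1$ and emits one photon; the array layout `[n_Q, n_C, m_Q, k, m_C]`, the unit level spacing `omega`, the value of `beta_C`, and the function names are all assumptions.

```python
import numpy as np

def boltzmann(beta, energies):
    """Normalized Boltzmann weights on a (truncated) set of energy levels."""
    w = np.exp(-beta * np.asarray(energies, dtype=float))
    return w / w.sum()

def trajectory_probabilities(cond, p_nQ, p_nC):
    """p(gamma) = p(m_Q, k, m_C | n_Q, n_C) p(n_Q) p(n_C).

    cond has axes [n_Q, n_C, m_Q, k, m_C].
    """
    return cond * p_nQ[:, None, None, None, None] * p_nC[None, :, None, None, None]

# Hypothetical ideal conditional table for initial photon numbers n_C = 0..3.
n_levels = 4
cond = np.zeros((2, n_levels, 2, 2, n_levels + 1))
for n_C in range(n_levels):
    cond[0, n_C, 0, 0, n_C] = 1.0        # Q in |0>: k = 0, no interaction
    cond[1, n_C, 0, 1, n_C + 1] = 1.0    # Q in |1>: k = 1, one photon emitted into C

omega = 1.0                               # common level spacing (toy units)
beta_C = 1.0                              # toy inverse temperature of the cavity
p_nC = boltzmann(beta_C, omega * np.arange(n_levels))   # truncated and renormalized

# The same conditional table is reused for every qubit temperature.
p_gamma = {}
for delta_beta in (-6.0, 0.0, 6.0):
    beta_Q = (1.0 - delta_beta) * beta_C  # from delta_beta = 1 - beta_Q / beta_C
    p_nQ = boltzmann(beta_Q, [0.0, omega])
    p_gamma[delta_beta] = trajectory_probabilities(cond, p_nQ, p_nC)

for delta_beta, table in p_gamma.items():
    print(delta_beta, table.sum())
```

In an actual analysis the array `cond` would hold the measured conditional probabilities instead of the ideal zeros and ones used here.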
C. Measurement of entropy production
Figure 6 shows the temperature dependence of the entropy production Σ computed from the six expressions. Dotted lines correspond to the theoretical values for the ideal QDC system as presented by the quantum circuit in Fig. 4. As expected, they coincide for all expressions, showing their fundamental equivalence. For large negative δβ (i.e., the qubit state close to |0_Q⟩), the probability for Q to be in |1_Q⟩ is small, making the QC interaction after the demon readout unlikely. In this limit, the QC state stays almost unchanged, reducing the entropy production to zero. For large positive δβ (i.e., the qubit state close to |1_Q⟩), Σ increases linearly with δβ, see Appendix C. Since Q is mostly in |1_Q⟩, the QC interaction is extremely likely, pushing the QC state further away from the initial thermal one and, consequently, producing more entropy.
The solid lines in Fig. 6 are computed from the experimental results using the expressions Σ_1 to Σ_6 for panels (a) to (f), respectively. The deviation from the ideal curves is due to experimental imperfections. The most significant ones are the preparation error ϵ_prep of the initial atomic states, the errors of the readout (ϵ_read) and feedback (ϵ_feed) operations, and the discrimination error ϵ_meas of the atomic state measurement; see Appendix E for details. The errors ϵ_read and ϵ_feed modify the system evolution. They change the entropy production and influence all experimentally obtained Σ equally. Namely, the imperfect readout allows for a non-negligible QC interaction even for Q prepared in |0_Q⟩, resulting in a non-zero Σ for δβ ≪ 0. On the other hand, the imperfect feedback reduces the probability of the QC interaction for Q prepared in |1_Q⟩, thus decreasing Σ for δβ ≫ 0. The errors ϵ_prep and ϵ_meas mix the labels of the detected quantum trajectories. Since different expressions are based on different combinations of experimental data, these errors have, in general, a different effect on the different expressions of Σ. Other imperfections, like atom and cavity relaxations, have a minor effect on the TPEM scheme and are listed in Appendix F.
The computation of the first expression, Σ_1, given by (1) and presented in Fig. 6(a), starts by computing the stochastic heat change Q_Q[γ] of Q for each trajectory γ. By averaging over all trajectories we get ⟨Q_Q⟩. The probability p(k = 1) is the probability to finally detect D in the state |1_D⟩ and equals the sum of p(γ) over all trajectories with k = 1. Here, we have used only the data from the atomic state detection of the forward protocol (i.e., no information on the cavity state is required). It is also noteworthy that the measured Σ_1 is higher than Σ based on the other expressions. Ideally, the Shannon entropy I goes to zero for large negative and positive δβ, when the demon state after the readout is a pure quantum state, |1_D⟩ or |0_D⟩, respectively. However, due to the imperfect atomic state measurement ϵ_meas, I is bounded from below by H[ϵ_meas], thus shifting Σ_1 up, as seen in Fig. 6(a).
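The lower bound on the measured Shannon entropy can be illustrated with a minimal sketch. It assumes a single symmetric misassignment probability for the two demon outcomes, which is a simplification: the measured detection errors listed in Appendix E are state dependent. The value 0.05 and the function names are illustrative only.

```python
import math

def binary_entropy(p):
    """Shannon entropy (in nats) of a binary distribution {p, 1 - p}."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)

def detected_probability(p_true, eps):
    """Probability to detect k = 1 when each outcome is flipped with probability eps."""
    return p_true * (1.0 - eps) + (1.0 - p_true) * eps

EPS = 0.05  # hypothetical symmetric detection error

# Scan the true probability p(k = 1) from a pure |0_D> to a pure |1_D> state.
grid = [i / 100.0 for i in range(101)]
measured_entropy = [binary_entropy(detected_probability(p, EPS)) for p in grid]

floor = binary_entropy(EPS)
print(f"ideal entropy at p = 0: {binary_entropy(0.0):.4f}")
print(f"measured entropy at p = 0: {measured_entropy[0]:.4f} (floor H[eps] = {floor:.4f})")
```

In this simplified model the detected probability never leaves the interval [eps, 1 - eps], so the measured entropy cannot fall below H[eps] even when the true demon state is pure.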
The state ρ^{QC,k}_f in the expression Σ_2 is obtained from the final probability distribution p(m_Q, k, m_C). The product Gibbs state ζ^Q_{β_Q} ⊗ ζ^C_{β_C} is set by the inverse temperatures β_Q and β_C. The probability p(k) is obtained in the same way as for Σ_1. Therefore, this expression is based solely on the forward protocol after averaging the quantum trajectories into ρ^{QC,k}_f. The experimental temperature dependence of Σ_2 is shown in Fig. 6(b).
Figure 6(c) presents the expression Σ_3 based on the analysis of the backward protocol with two branches, k = 0 and k = 1. It mainly relies on the backward trajectories, except for the value of p(k) for the demon state, which is extracted from the forward protocol. This expression shows the largest deviation from the ideal case for δβ ≫ 0, which can be explained by the use of the backward trajectories and the divergence properties of D. The relative entropy D(ρ||σ) is very sensitive to the smallest state variations if the support of the matrix σ does not include the support of ρ, hence the second name "divergence" for D. In the expression Σ_2 the support of the reference state ζ^Q_{β_Q} ⊗ ζ^C_{β_C} is the whole Hilbert space of the QC system, making this expression less sensitive to small state variations. For the expression Σ_3, however, the situation is radically different: both states appearing in the function D have limited supports, making its evaluation more sensitive to most experimental imperfections than all other expressions (see Appendix E for details).
The expression Σ_4 in (6) is directly computed from the sets {p(γ)} and {p̃(γ)} and is shown in Fig. 6(d). It is the only expression based on all data measured in the forward and backward protocols with no additional transformation or averaging.
Figure 6(e) shows the relative entropy Σ_5 obtained from (7). We first compute, for each trajectory γ, the stochastic entropy production σ[γ] from its initial and final states using (8). Then, we calculate the probabilities p(σ) and p_b(σ) from the set of all values of σ detected in the forward and backward protocols and obtain Σ_5. This expression uses all experimental data after grouping trajectories with the same σ.
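The two trajectory-based expressions can be illustrated with a toy sketch. It makes several assumptions that are not stated in the text: both expressions are taken to be relative entropies between forward and backward distributions, σ[γ] is taken as ln[p(γ)/p̃(γ)] rather than computed from Eq. (8), and both histograms are binned at the forward value of σ. The trajectory labels and probabilities are hypothetical.

```python
import math
from collections import defaultdict

def kl_divergence(p, q):
    """Relative entropy D(p||q) in nats; requires q > 0 wherever p > 0."""
    return sum(pv * math.log(pv / q[key]) for key, pv in p.items() if pv > 0.0)

def group_by_sigma(p_fwd, p_bwd, digits=12):
    """Histogram forward and backward probabilities over sigma = ln(p / p_bwd)."""
    hist_fwd = defaultdict(float)
    hist_bwd = defaultdict(float)
    for gamma, pf in p_fwd.items():
        # Rounding merges values of sigma that differ only by float noise.
        sigma = round(math.log(pf / p_bwd[gamma]), digits)
        hist_fwd[sigma] += pf
        hist_bwd[sigma] += p_bwd[gamma]
    return dict(hist_fwd), dict(hist_bwd)

# Hypothetical forward and backward trajectory probabilities.
p_forward = {"a": 0.5, "b": 0.3, "c": 0.2}
p_backward = {"a": 0.25, "b": 0.15, "c": 0.6}

sigma_4 = kl_divergence(p_forward, p_backward)
p_sigma, p_sigma_bwd = group_by_sigma(p_forward, p_backward)
sigma_5 = kl_divergence(p_sigma, p_sigma_bwd)

print(f"trajectory-level value: {sigma_4:.6f}")
print(f"value after grouping by sigma: {sigma_5:.6f}")
```

Under these assumptions the grouping loses no information, because all trajectories in one bin share the same probability ratio, so the two values coincide for ideal data. With real data the labels are mixed by detection and preparation errors, which is where the two procedures can start to differ.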
Finally, the expression Σ 6 defined in (10) is shown in Fig. 6(f). The state ρ QC f is computed from the joint QDC state ρ QDC f , based on the distribution p(m Q , k, m C ), by tracing out D. The mutual information between QC and D is computed directly on ρ QDC f . Remarkably, the value of Σ 6 perfectly coincides with that of Σ 2 . We show in Appendix B that these two expressions are mathematically identical and are based on the same set of the experimentally obtained physical quantities.
The dashed lines in Fig. 6 are the entropy productions computed from simulated data obtained by taking into account all the experimental imperfections mentioned above. The good agreement between the measurement and the simulation allows us to test and confirm the influence of the various system errors on the different ways to experimentally access the entropy production Σ. Some errors perturb quantum trajectories only in particular temperature ranges. For instance, ϵ_read manifests itself for δβ ≪ 0, while ϵ_prep and ϵ_feed are noticeable mainly for δβ ≫ 0. The detection error ϵ_meas is influential for qubit temperatures with very different populations in |0_Q⟩ and |1_Q⟩, i.e., for |δβ| ≫ 1. The influence of other sources of errors on the discrepancy between the ideal and realistic cases depends on the particular expression for Σ and on the way it is measured experimentally. In general, the errors increase Σ for δβ ≪ 0 and decrease it for δβ ≫ 0 relative to the ideal case.

IV. CONCLUSION
Our results help clarify the meaning of entropy production. Beyond its usual meaning as a quantifier of irreversibility, it relates to an experimental lack of control over a quantum system: the larger the entropy production, the smaller the control.
In this spirit, we have presented several alternative ways to address and describe an ultimate information-powered quantum fridge, providing us with different operational expressions for entropy production. Our cavity QED setup has allowed us to formulate theoretically and to access experimentally several expressions for Σ, each of them having its own physical interpretation. Their computation is based on different data and requires different data processing. However, since they describe the same physical quantity, they provide equivalent strategies to measure Σ. Along the same lines, similar sets of entropy production expressions can be derived for any other system under investigation and characterization. The experimentalist's final choice is set by the features and imperfections of a particular setup, which perturb the measured data and thus the different Σ expressions in different ways.
In the current work the state analysis has been restricted to the populations of energy states, which is sufficient for accessing the entropy production of thermalization. To study other types of environments, it might be necessary to access quantum information stored, e.g., in the system's coherence or in the entanglement between its parts. Our experimental setup allows for complete quantum state tomography [30], providing access to quantum information and its transformation. We plan to use this ability to implement dephasing and decorrelating environments in the forward-reservoir-backward protocol in order to reveal how different types of information erasure induce irreversibility.
The derivation of the expressions Σ_1 and Σ_5 is based on the definition of the stochastic entropy production given by Eq. (8) of the main paper. Here we derive this equation from Eqs. (4) and (5). Since the systems Q and C start in thermal states, the initial probabilities for the energy measurement are given by p(n_Q) = exp(−β_Q E^Q_{n_Q})/Z_Q and p(n_C) = exp(−β_C E^C_{n_C})/Z_C, where Z_Q and Z_C are the partition functions and E^Q_{n_Q} and E^C_{n_C} are the energy eigenvalues of H_Q and H_C, respectively. The readout is performed in the energy basis of Q. We assume at this point that there are no readout errors, i.e., p(k|n_Q) = δ_{k,n_Q}. The remaining conditional probability in Eq. (4) reads where |E^Q_{n_Q}⟩ and |E^C_{n_C}⟩ are the energy eigenstates of Q and C, respectively. If k ≠ n_Q, the conditional probability p(k|n_Q) makes the whole trajectory probability p(γ) equal to zero and the corresponding stochastic entropy production does not contribute to the average. Hence, it only makes sense to compute p(γ) for k = n_Q. Note that we still keep the two indices separate, as is done in the main text.
For the backward trajectory probability p(γ) in Eq. (5), the branch probability is given by p(k) = p(n Q ), while the initial probabilities are The remaining conditional probability in Eq. (5) is given by (A2) By computing the stochastic entropy production from the ratio of p(γ) to p(γ) and considering the ideal measure-ment case, we obtain (A3) Since there is no source of work in this dynamics we iden- n C as the stochastic heat absorbed by Q and C, respectively. Defining I[γ] = − ln p(k), we arrive at Eq. (8).

Expression Σ2
We present the derivation of the expression Σ 2 , given in Eq. (2), starting from Eq. (1). The total heat absorbed by Q and C can be rewritten in terms of the states ρ QC,k i and ρ QC,k f as Q Q = k p(k)∆U Q,k and Q C = k p(k)∆U C,k , is the energy change of Q for the branch k, and similarly for ∆U C,k . Now, writing these energy changes for each branch and employing the divergence property we obtain and j ∈ {Q, C}. Next, we add Eq. (A4) for Q and C, and use the following two identities: where we denote ζ = ζ Q β Q ⊗ ζ C β C for simplicity, and (A7) Since S(ρ QC,k f ) = S(ρ QC,k i ) due to the unitary feedback operations and since S(ρ QC,k i ) = S(ζ C β C ) because the feedback measurement is projective, we end up with For an ideal readout, S(ζ Q β Q ) = I . Averaging (A8) over p(k) we finally obtain the equivalence between Eqs. (1) and (2).

Expression Σ3
We derive the expression Σ 3 of Eq. (3) starting from Eq. (2). Since the divergence is invariant under unitary transformations, i.e., D(U ρU † ||U σU † ) = D(ρ||σ), using the definition of ρ QC,k f we obtain The very last state is the definition ofρ QC,k f , i.e., the state of the kth branch of the backward protocol after performing the unitaryŨ QC,(k) . By averaging Eq. (A9) we get Eq. (3).

Expression Σ6
We start to derive the expression Σ_6 given in Eq. (10) by directly substituting Eq. (9) into (1) and obtain Since I is the mutual information just before the implementation of the feedback [15],
The entropy production expressions Σ_2 and Σ_6 are based on the same data treatment and are thus mathematically equivalent. In order to show this, we first recall several basic definitions from information theory. The conditional entropy is defined as H(A|B) = −Σ_{a,b} p(a, b) ln p(a|b), where A and B are two random variables with the probability distributions p(a) and p(b), respectively. The probability p(a, b) is the joint one, defined through the conditional probability p(a|b) as p(a, b) = p(a|b) p(b). Finally, for the diagonal density operators used in the current paper, the complete density matrix ρ^{QC}_f is computed from the branched ones, ρ^{QC,k}_f, as ρ^{QC}_f = Σ_k p(k) ρ^{QC,k}_f. We start by computing the difference of the two expressions:
The main imperfections of our experimental setup affecting the measured entropy production are the following:
• the nonideal purity of the initial atomic state prepared for the TPEM protocols (error ϵ_prep),
• the limited readout efficiency (error ϵ_read),
• the limited feedback efficiency (error ϵ_feed),
• the imprecision of the final state measurement (error ϵ_meas).
The errors ϵ_read and ϵ_feed modify the QDC system evolution and, thus, the entropy production. Therefore, they influence Σ computed from all expressions in the same way. On the other hand, the errors ϵ_prep and ϵ_meas limit our ability to resolve different quantum trajectories of the system. They can have different effects on different expressions, depending on how exactly these are computed. Below we comment on each of these errors. In the next Section we present other minor sources of experimental errors.
The preparation of the initial atomic state |g⟩ is exact. The excited state |e⟩, however, is prepared with the population 1 − ϵ_prep = 0.9, where ϵ_prep = 0.1 is the probability for the atom to be left in its ground state |g⟩ due to the imperfect excitation pulse in R_Q. Therefore, this error source is relevant for the temperature range δβ ≫ 0 and is negligible for small qubit temperatures.
The demon readout is realized by a microwave π-pulse resonant with the atomic transition between the levels |g⟩ and |f⟩. Being imperfect, this operation leaves the atom in its initial state |g⟩ with the undesirable probability ϵ_read = 0.11. The readout does not affect an atom in the state |e⟩. Therefore, this error source is relevant for low qubit temperatures (δβ ≪ 0) and is negligible otherwise.
The feedback operation is a resonant population transfer between the atom and the cavity realized by means of the adiabatic passage technique. The population transfer has a failure probability of ϵ_feed = 0.03. Contrary to ϵ_read, the error ϵ_feed affects the system for δβ ≫ 0 and is negligible otherwise.
The error of the final state measurement originates from the limited state resolution of our field-ionization detector in combination with the atomic relaxation during the atom flight from the cavity to the detector. The probabilities ϵ^{a,b}_meas to erroneously detect a state |b⟩ as the state |a⟩, with a, b ∈ {e, g, f}, have been independently measured to be: ϵ^{e,f}_meas = 0.01, ϵ^{f,e}_meas = 0, ϵ^{e,g}_meas = 0.05, ϵ^{g,e}_meas = 0.02, ϵ^{g,f}_meas = 0.05, and ϵ^{f,g}_meas = 0.02. The influence of these errors is different for different temperatures and different Σ expressions.
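One way to see how these detection errors mix the trajectory labels is to arrange them into a detection matrix and apply it to a true state distribution. The sketch below uses the six measured values quoted above; the assumption that each diagonal entry equals one minus the sum of the misassignment probabilities of that true state, as well as all names, are illustrative and not taken from the simulation code of the experiment.

```python
STATES = ("e", "g", "f")

# ERRORS[(a, b)]: probability to detect the true state b as the state a.
ERRORS = {
    ("e", "f"): 0.01, ("f", "e"): 0.00,
    ("e", "g"): 0.05, ("g", "e"): 0.02,
    ("g", "f"): 0.05, ("f", "g"): 0.02,
}

def detection_matrix():
    """Return M[(a, b)] = p(detect a | true b), with columns normalized to 1."""
    matrix = {}
    for true_state in STATES:
        wrong = sum(ERRORS[(a, true_state)] for a in STATES if a != true_state)
        for detected in STATES:
            if detected == true_state:
                matrix[(detected, true_state)] = 1.0 - wrong  # assumed diagonal
            else:
                matrix[(detected, true_state)] = ERRORS[(detected, true_state)]
    return matrix

def apply_detection(true_probs, matrix):
    """Detected distribution for a given true distribution over STATES."""
    return {a: sum(matrix[(a, b)] * true_probs[b] for b in STATES) for a in STATES}

matrix = detection_matrix()
detected = apply_detection({"e": 0.0, "g": 1.0, "f": 0.0}, matrix)
print(detected)
```

For an atom that is truly in |g⟩, this model assigns a small weight to the outcomes |e⟩ and |f⟩, which is how trajectories end up with the wrong label.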
To illustrate the effect of the major error sources on the measured values of Σ, we have performed a series of numerical simulations with only one of the error sources activated. Figures 7 to 10 present four simulations with the individual errors ϵ_prep, ϵ_read, ϵ_feed and ϵ_meas, respectively. The panels from (a) to (f) correspond to the expressions from Σ_1 to Σ_6. The line types are the same as in Fig. 6. The dotted lines are the direct computation of the ideal quantum circuit in Fig. 4. The solid lines are the experimental results (given here for reference). The dashed lines are the simulation results with only one error source activated.
The large difference between the ideal and real cases for the expression Σ_3, seen in panels (c) of Figs. 7, 9 and 10, can be explained by the use of the imperfect backward trajectories and the divergence properties of D. We recall that the computation of Σ_3 is based on the comparison between the state before the forward evolution and the state after the backward evolution, for the two branches. One important feature of the backward protocol is that it can populate states that cannot be reached by the forward protocol in the same branch. Moreover, the relative entropy D, or divergence, comparing two probability distributions, is very sensitive to minor changes in the underlying distributions when their supports are different, i.e., when there are states populated in one distribution but not in the other. Consequently, since the backward protocol in the presence of experimental imperfections populates states which do not appear in the forward protocol, the divergence based on the backward states becomes much more sensitive to these imperfections than any other expression of Σ.
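The sensitivity of the relative entropy to nearly unpopulated states can be made concrete with a two-level toy example. The distributions below are hypothetical and are not related to the measured states; the sketch only shows how D(p||q) grows when a state populated in p carries a vanishing weight δ in q.

```python
import math

def relative_entropy(p, q):
    """D(p||q) in nats for two discrete distributions given as sequences."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # terms with p_i = 0 do not contribute
        if q_i == 0.0:
            return math.inf  # support of q does not include support of p
        total += p_i * math.log(p_i / q_i)
    return total

reference = [0.5, 0.5]

# Second distribution with a weight delta on the second state.
values = {
    delta: relative_entropy(reference, [1.0 - delta, delta])
    for delta in (1e-1, 1e-3, 1e-6)
}

for delta, value in values.items():
    print(f"delta = {delta:g}: D = {value:.3f}")
```

The divergence grows like −(1/2) ln δ in this example, so a small absolute change of a nearly empty population changes D by a large amount, and D becomes infinite when that population vanishes.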

APPENDIX F: Minor experimental imperfections
Besides the four major sources of system imperfections listed in the previous Section, there are several minor effects influencing the system evolution. We consider here the limited purity of the initial cavity state for the TPEM protocol, the atom and cavity relaxations and the possible presence of a second undetected atom in the main atom sample. By simulating the experimental sequence, we have found that the influence of these errors on the measured entropy production Σ is relatively small, but still noticeable. Below we give the typical values of the corresponding errors.
The preparation of the cavity vacuum state |n = 0⟩ is exact. The preparation of the one-photon state by the resonant injection of a photon from a single excited atom results in residual populations of 0.08 in |0⟩ and 0.16 in |2⟩. The former is mainly due to the limited injection efficiency, while the latter originates from the possible presence of a second atom in the resonant atomic sample. The preparation of the two- and three-photon states is based on the photon-number measurement of a coherent state in the cavity [29]. Being limited in time, this measurement leaves non-zero populations in the two neighbouring states: for the two-photon preparation (|2⟩) the cavity has a residual probability of 0.15 to still contain one photon (|1⟩) and of 0.10 to contain three photons (|3⟩). For the three-photon state |3⟩, the residual populations in |2⟩ and |4⟩ are equal to 0.17 and 0.10, respectively.
The lifetime of the circular Rydberg states used in the present experiment (principal quantum numbers 49, 50 and 51) is of the order of 30 ms. The cavity lifetime at a 1.5-K temperature of the cryostat is 25 ms. The duration of the main experimental sequence from the initial state preparation to the atom detection is less than 1 ms. This limits the relaxation probability in the system to less than 0.04. The atom and cavity relaxations are represented in the simulations with the corresponding master equations [31,32].
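The quoted bound on the relaxation probability can be checked with a short estimate. The sketch assumes simple exponential decay over the full 1 ms upper bound of the sequence duration and treats the atom and the cavity separately; it does not replace the master-equation simulations mentioned above.

```python
import math

def relaxation_probability(duration_ms, lifetime_ms):
    """Probability of at least one decay event for exponential relaxation."""
    return 1.0 - math.exp(-duration_ms / lifetime_ms)

SEQUENCE_MS = 1.0          # upper bound on the sequence duration
ATOM_LIFETIME_MS = 30.0    # circular Rydberg state lifetime
CAVITY_LIFETIME_MS = 25.0  # cavity lifetime

p_atom_decay = relaxation_probability(SEQUENCE_MS, ATOM_LIFETIME_MS)
p_cavity_decay = relaxation_probability(SEQUENCE_MS, CAVITY_LIFETIME_MS)

print(f"atom relaxation probability: {p_atom_decay:.3f}")
print(f"cavity relaxation probability: {p_cavity_decay:.3f}")
```

Both values stay below 0.04 under these assumptions, with the cavity giving the larger of the two.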
The number n_a of atoms present in an atomic sample is random and obeys the Poisson probability distribution P_a(n_a). In the current work we have set the average atom number to n̄_a = 0.22 by adjusting the efficiency of the Rydberg state excitation of ground state atoms. The overall detection efficiency (i.e., the probability to detect an atom) is ε = 0.5. For the data analysis we select trajectories with exactly one detected atom. The conditional probability to have two atoms in a trajectory containing only one detected atom is
P(2|1 detected) = 2ε(1 − ε)P_a(2) / [εP_a(1) + 2ε(1 − ε)P_a(2)]. (F1)
In our case, P(2|1 detected) = 0.10.
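Equation (F1) can be evaluated directly from the quoted parameters. The sketch below follows the formula as written, which keeps only samples with one or two atoms; the function names are illustrative.

```python
import math

def poisson(n, mean):
    """Poisson probability P_a(n) for a given mean atom number."""
    return math.exp(-mean) * mean**n / math.factorial(n)

def p_two_given_one_detected(mean_atoms, efficiency):
    """Eq. (F1): probability of two atoms given exactly one detected atom."""
    one_atom_detected = efficiency * poisson(1, mean_atoms)
    # Two atoms present, exactly one of them detected (two possible choices).
    two_atoms_one_detected = 2.0 * efficiency * (1.0 - efficiency) * poisson(2, mean_atoms)
    return two_atoms_one_detected / (one_atom_detected + two_atoms_one_detected)

result = p_two_given_one_detected(mean_atoms=0.22, efficiency=0.5)
print(f"P(2|1 detected) = {result:.3f}")
```

With a mean atom number of 0.22 and a detection efficiency of 0.5 this reproduces the value of about 0.10 quoted above.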