An electronic Maxwell demon in the coherent strong-coupling regime

We consider a feedback control loop rectifying particle transport through a single quantum dot that is coupled to two electronic leads. While monitoring the occupation of the dot, we apply conditional control operations by changing the tunneling rates between the dot and its reservoirs, which can be interpreted as the action of a Maxwell demon opening or closing a shutter. This can generate a current at equilibrium or even against a potential bias, producing electric power from information. While this interpretation is well-explored in the weak-coupling limit, we can address the strong-coupling regime with a fermionic reaction-coordinate mapping, which maps the system into a serial triple quantum dot coupled to two leads. There, we find that a continuous projective measurement of the central dot would lead to a complete suppression of electronic transport due to the quantum Zeno effect. In contrast, a microscopic model for the quantum point contact detector implements a weak measurement, which allows for closure of the control loop without inducing transport blockade. In the weak-coupling regime between the central dot and its leads, the energy flows associated with the feedback loop are negligible, and the information gained in the measurement induces a bound for the generated electric power. In contrast, in the strong-coupling limit, the protocol may require more energy for opening and closing the shutter than the electric power it produces, such that the device is no longer information-dominated and thus cannot be interpreted as a Maxwell demon.


I. INTRODUCTION
In the famous thought experiment, Maxwell's demon is an intelligent being that measures the direction and speed of particles in a box with two compartments. By suitably opening or closing a shutter between the compartments, the demon can sort the initially thermally distributed particles into cold and hot fractions. This thermal gradient can be used to extract work. Effectively, the feedback loop implemented by the demon leads to a local reduction of entropy, ideally without any energetic cost, only using the information from the measurement. The possibility of converting information into work has inspired generations of researchers to investigate the role of information in thermodynamics 1 .
With today's rapid improvement of competing experimental approaches, it has become possible to implement different versions of a Maxwell demon in real-world scenarios. These approaches include electronic 2,3, qubit-qubit 4, qubit-cavity 5, and photonic 6 implementations. Beyond Maxwell's demon, which is concerned with the control of average currents, feedback schemes proposing the control of even higher moments 7 have been experimentally implemented 8 in electronic transport setups. With such advanced experimental abilities, there is great interest in exploring quantum implications of a Maxwell demon.
Generally, it should be noted that in the theoretical discussion of quantum feedback control devices, two fundamentally different approaches exist. In an autonomous (also termed coherent or all-inclusive) feedback loop, the original quantum system is supplemented by another auxiliary quantum system, which modifies the dynamics to reach a specified objective. This all-inclusive approach has the advantage of simpler balance equations for the joint entropy of system and controller. However, such systems are hard to design for arbitrary feedback loops (both theoretically and experimentally) and are not very flexible, as the feedback loop and thus the desired function is hard-wired in the device. Alternatively, one can implement an external feedback loop by performing measurements on the quantum system, classically processing the obtained information, and performing conditional control operations on the quantum system just as in the original thought experiment. By changing the classical control protocol, i.e., choosing different control actions, the scheme can be modified to achieve different objectives. In contrast to classical systems, which ideally remain unaltered by the measurement alone, the dynamics of quantum systems is modified already by a measurement, which can have drastic consequences such as the quantum Zeno effect. Thus, while this second approach appears closer to the original setup and may be more flexible, it has the disadvantage that both its theoretical discussion and its experimental implementation are demanding: the former must include the quantum measurement process, and the latter requires high fidelity of the measurement and control steps. We note that it has been possible to relate the entropy balances of autonomous 9,10 and external 11,12 Maxwell demons with each other. Furthermore, in electronic transport setups, both autonomous and external versions of Maxwell's demon have been experimentally implemented 2,3.
Being introduced within the framework of classical physics, models discussing Maxwell's demon theoretically typically employ the weak-coupling limit between the controlled system and its reservoirs 11,13-15. Here, the energy contained in the interaction between them is negligible. By contrast, in the strong-coupling limit, it is known that this interaction energy is no longer negligible 16-22, and even the partition into system and reservoir components becomes less obvious. In our model, this interaction energy is directly related to the energetic cost associated with opening or closing the shutter. Therefore, it is an intriguing question how Maxwell's demon performs when the interactions between the controlled system and its reservoirs become strong.

arXiv:1711.00706v2 [cond-mat.stat-mech] 27 Mar 2018
In this paper, we will attempt to discuss this case for an electronic external feedback loop. Particularly, we aim at generalizing the setup in Ref. 11 to the strong-coupling regime. The generalization of the corresponding autonomous setup in Ref. 9 will be discussed in a companion paper 23, but see also Ref. 24. On the technical side, we will employ a fermionic generalization of a reaction-coordinate mapping, which is frequently employed in bosonic systems 19,25-27 to treat the strong-coupling and non-Markovian limit. We will see that projective measurements will imply Zeno-related modifications 28 to the Maxwell demon dynamics, which requires a generalized discussion of the control loop including weak measurements 29. Since these methods are partially new, we will explicitly present them in the following. Below in Sec. II, we briefly review the underlying model, discuss the fermionic reaction-coordinate mapping to a triple quantum dot, and show how to set up the propagator for single feedback cycles in case of strong (projective) and weak measurements of the central dot's occupation. Afterwards, in Sec. III, we discuss the thermodynamics by defining the heat currents and the energy injected by the measurement as well as the energy injected by the control. We discuss the performance of the device in Sec. IV before concluding.

II. MODEL
In this section, we first briefly review the original model system in presence of projective measurements and piecewise-constant feedback control in Sec. II A. Then, we show how to map it to an equivalent triple quantum dot model in Sec. II B, where a Markovian embedding in an extended space allows us to treat non-Markovian and strong-coupling effects in the original system. Afterwards, we discuss the effect of a projective (strong) measurement on such a triple quantum dot in Sec. II C and show that weak measurements must be introduced to avoid Zeno blocking. Finally, we discuss the weak-measurement implications of a microscopic detector model in Sec. II D, with technical details given in App. B, and close the feedback loop in Sec. II E. For orientation, we depict the setup and feedback cycle we have in mind in Fig. 1.

A. SQD with projective measurements
FIG. 1. Top: Sketch of the considered electronic transport setup with some relevant parameters. Through the bottom circuit, the rates $\Gamma_\alpha$ allow for a current flowing between the left and right leads characterized by the inverse temperatures $\beta$ and the chemical potentials $\mu_\alpha$. Via monitoring the occupation $n_{\rm dot}$ of the central dot with a nearby quantum point contact (QPC), the QPC current signal (obtained by measuring $n_{\rm QPC}$ charges in time interval $\Delta\tau$) can be fed back into the system by conditioning the tunneling rates on the measured dot occupation. We aim at the strong-coupling limit by including collective reservoir degrees of freedom into the system (blue region). Bottom: Schedule of the feedback cycles composed from repeated but alternating application of measurement phases (orange) of duration $\Delta\tau$, followed by control phases (green) of duration $\Delta t - \Delta\tau$. In this paper, we consider the limit $\Delta\tau \to 0$, keeping however the effect of the measurement ($\gamma\Delta\tau$) finite. During control (implemented by conditional rates $\Gamma_\alpha^{E/F}$), the QPC detector is formally decoupled and the system Hamiltonian is conditioned on the preceding measurement. During measurement, the system Hamiltonian is fixed to $H_S$ (implemented by fixed rates $\Gamma_\alpha$), and the QPC interaction dominates the central dot dynamics.

The system we are aiming to control is a single electron transistor, where a single quantum dot (SQD) is tunnel-coupled to two leads,
$$H = \epsilon\, d^\dagger d + \sum_{k\alpha}\left(t_{k\alpha}\, d\, c_{k\alpha}^\dagger + {\rm h.c.}\right) + \sum_{k\alpha} \epsilon_{k\alpha}\, c_{k\alpha}^\dagger c_{k\alpha}.$$
Here, $\epsilon$ denotes the dot level and $d$ the annihilation operator of an electron in the dot, which can tunnel via the amplitudes $t_{k\alpha}$ into the left or right reservoirs $\alpha \in \{L, R\}$, described by non-interacting modes with annihilation operators $c_{k\alpha}$ and energies $\epsilon_{k\alpha}$. Spin effects are not considered throughout this work (spin-polarized electrons). In absence of feedback, the single electron transistor is exactly solvable for arbitrary coupling strengths, see e.g. Ref. 30. However, in the weak-coupling limit a simple rate matrix $W = W_L + W_R$ is sufficient to describe the evolution $\dot{P} = W P$ of the probabilities $P = (p_E, p_F)^T$ for the empty and filled dot state,
$$W_\alpha = \Gamma_\alpha^{(0)}(\epsilon) \begin{pmatrix} -f_\alpha(\epsilon) & +[1 - f_\alpha(\epsilon)] \\ +f_\alpha(\epsilon) & -[1 - f_\alpha(\epsilon)] \end{pmatrix}.$$
Here, $f_\alpha(\epsilon) = \left[e^{\beta_\alpha(\epsilon - \mu_\alpha)} + 1\right]^{-1}$ denotes the Fermi function of reservoir $\alpha \in \{L, R\}$ with chemical potential $\mu_\alpha$ and inverse temperature $\beta_\alpha$, and the overall prefactor is determined by the spectral coupling density (SD)
$$\Gamma_\alpha^{(0)}(\omega) = 2\pi \sum_k |t_{k\alpha}|^2\, \delta(\omega - \epsilon_{k\alpha}),$$
which contains the non-thermal reservoir properties such as its level distribution $\epsilon_{k\alpha}$ and the coupling strengths $t_{k\alpha}$ of individual reservoir modes $k$ to the system. By keeping the reservoirs at different temperatures and chemical potentials, the total system can be interpreted as a thermoelectric generator 31, where a thermal gradient can be harnessed to drive electronic transport against a voltage gradient. Now, when additionally placing a quantum point contact (QPC) near the quantum dot, it is possible to read out the time-dependent dot occupation with high precision 32. The current signal can be processed and fed back into the system by changing the tunneling rates in a time-dependent fashion. In a simplified treatment, this can be modeled as a sequence of instantaneous projective measurements, each followed by a period of conditional piecewise-constant evolution, where $\Gamma_\alpha^{(0)}(\epsilon) \to \Gamma_\alpha^{E/F}$ depending on the measurement outcome empty/filled, respectively.
After averaging over all outcomes and considering the limit of continuous measurements 33, one obtains an effective rate matrix under feedback which breaks the conventional local detailed balance relation. Even in absence of a temperature gradient this can be used to generate electric power 11, which can be interpreted as an electronic Maxwell demon. We stress that a modified version of the fluctuation theorem is still valid 11, and the particular form of broken detailed balance implies that the second law is obeyed when the information current resulting from the feedback loop is included in the entropic balance 12. Very recently, this feedback scheme has been experimentally verified 3. However, there are some limitations in the SQD treatment. First, the scheme is valid only in the weak-coupling limit $\beta_\alpha \Gamma_\alpha^{E/F} \ll 1$, such that for strong feedback driving the results become questionable. Second, the discussion of this scheme as a Maxwell demon lacks the calculation of the energetic balance associated with the control actions, since this is neglected in the conventional master equation treatment. Third, the treatment with a projective measurement does not fully comply with the experimental situation. With the present contribution, we would like to overcome these limitations.
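The weak-coupling picture above can be illustrated with a minimal numerical sketch of the two-state rate equation. It assumes the standard construction in which transitions out of the empty state use the rates $\Gamma_\alpha^E$ and transitions out of the filled state use $\Gamma_\alpha^F$; the function names and the parameter values in the usage note are illustrative and not taken from the paper.

```python
import math

def fermi(eps, beta, mu):
    """Fermi function of a lead with inverse temperature beta and chemical potential mu."""
    return 1.0 / (math.exp(beta * (eps - mu)) + 1.0)

def stationary_current(eps, beta, mu_L, mu_R, gam_E, gam_F):
    """Stationary particle current flowing from the left lead into the dot.

    gam_E / gam_F are dicts {'L': rate, 'R': rate} that apply when the dot
    is empty / filled (assumed feedback convention). Equal dicts mean that
    no feedback is applied.
    """
    f = {'L': fermi(eps, beta, mu_L), 'R': fermi(eps, beta, mu_R)}
    # Total rates for filling (empty -> filled) and emptying (filled -> empty).
    rate_in = sum(gam_E[a] * f[a] for a in 'LR')
    rate_out = sum(gam_F[a] * (1.0 - f[a]) for a in 'LR')
    # Stationary solution of the 2x2 rate equation W P = 0.
    p_E = rate_out / (rate_in + rate_out)
    p_F = rate_in / (rate_in + rate_out)
    # Net flow through the left junction: in when empty, out when filled.
    return gam_E['L'] * f['L'] * p_E - gam_F['L'] * (1.0 - f['L']) * p_F
```

Without feedback the current vanishes at equal chemical potentials. Choosing, for example, `gam_E = {'L': math.e, 'R': 1/math.e}` and `gam_F = {'L': 1/math.e, 'R': math.e}` opens the left junction for an empty dot and the right junction for a filled dot, which in this sketch yields a positive current even for `mu_L < mu_R`.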

B. TQD without measurements
By applying separate fermionic Bogoliubov transforms for each reservoir, we can include separate reaction coordinates into the system, which maps our setup to a serial triple quantum dot system (TQD) that is tunnel-coupled to two residual reservoirs via the renormalized tunneling amplitudes T kα , see Fig. 2 for an illustration.
FIG. 2. Sketch of the reaction coordinate mapping from the SQD (left) to the TQD (right) model. A collective degree of freedom is separated from each reservoir of the SQD and absorbed into a redefined system, the TQD, which is still tunnel-coupled to two residual reservoirs. The original feedback loop on the SQD that modifies the tunneling rates in a Markovian treatment is thereby mapped to a feedback loop on the TQD, where the internal tunneling amplitudes within the TQD are changed in a piecewise-constant fashion.
After the mapping, the Hamiltonian of the TQD assumes the form Here, the first two lines denote the TQD system Hamiltonian $H_S$ with reaction-coordinate on-site energies $\Omega_\alpha$ and TQD internal tunneling amplitudes $\lambda_\alpha$, the third line contains the coupling, and the last line the residual reservoir terms. Based on this TQD model, a new SD can be introduced which can be obtained from the original SD with complex calculus methods. More details regarding the fermionic reaction-coordinate mapping are given in App. A and a companion paper 23. Specifically, when we parametrize the original SD by a Lorentzian function the TQD system parameters can be analytically evaluated and the transformed SD becomes flat This suggests that a Markovian treatment of the TQD system in the infinite-bias and/or high-temperature regime yields the exact dynamics of the SQD. We note that, as opposed to previous treatments of similar mappings in the literature (e.g. Refs. 34,35), the mapping discussed above can be systematically extended by mapping any reservoir into a chain of reaction coordinates. Most importantly, however, the discussed mapping also holds for time-dependent modifications (see also Ref. 36 for a bosonic periodically driven example). In the transformed picture, instead of changing the coupling to the residual reservoirs, the feedback loop now modifies only a parameter ($\lambda_\alpha$) of the TQD Hamiltonian. This means that in case of the discussed piecewise-constant feedback interventions we simply have $H_S \to H_S^{E/F}$, whereas the coupling to the residual reservoirs remains constant.
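The statement that a Lorentzian SD maps to a flat residual SD can be checked numerically. The sketch below assumes the parametrization $\Gamma^{(0)}(\omega) = \Gamma\delta^2/[(\omega-\epsilon)^2+\delta^2]$ and the closed forms $\lambda = \sqrt{\Gamma\delta/2}$, $\Omega = \epsilon$, and a flat residual SD of $2\delta$. These follow from the standard reaction-coordinate normalization $\lambda^2 = \frac{1}{2\pi}\int \Gamma^{(0)}(\omega)\,d\omega$ and are stated here as assumptions rather than quoted from the paper; all function names are hypothetical.

```python
import math

def lorentzian_sd(omega, gamma, delta, eps):
    """Assumed Lorentzian spectral density of the original (SQD) reservoir."""
    return gamma * delta**2 / ((omega - eps)**2 + delta**2)

def rc_parameters(gamma, delta, eps):
    """Return (lam, big_omega, gamma_res) for one reaction coordinate (RC).

    lam: tunneling amplitude between central dot and RC,
    big_omega: RC on-site energy,
    gamma_res: flat SD coupling the RC to its residual reservoir.
    """
    return math.sqrt(gamma * delta / 2.0), eps, 2.0 * delta

def effective_sd(omega, lam, big_omega, gamma_res):
    """SD seen by the central dot: an RC level broadened by a flat residual SD.

    This is 2*pi*lam^2 times a normalized Lorentzian of half-width gamma_res/2.
    """
    return lam**2 * gamma_res / ((omega - big_omega)**2 + (gamma_res / 2.0)**2)
```

Evaluating `effective_sd(w, *rc_parameters(gamma, delta, eps))` reproduces `lorentzian_sd(w, gamma, delta, eps)` for any frequency `w`. In this sense a narrow original SD (small `delta`, strongly non-Markovian) corresponds to a weakly coupled residual reservoir.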

C. TQD with projective measurements and Zeno blockade
Let $P_E = d d^\dagger$ and $P_F = d^\dagger d$ denote projection operators on the empty or filled central dot state, respectively. Upon measuring outcome $\nu \in \{E, F\}$ with probability ${\rm Tr}\{P_\nu \rho\}$, the TQD density matrix transforms according to
$$\rho \to \frac{P_\nu \rho P_\nu}{{\rm Tr}\{P_\nu \rho\}}.$$
Now, if for each measurement outcome the subsequent evolution $\dot\rho = L_\nu \rho$ is conditioned on the measurement outcome $\nu$ (which results from switching $H_S \to H_S^\nu$), we get, by averaging over the measurement outcomes, immediately before the next measurement
$$\rho(t + \Delta t) = \sum_\nu e^{L_\nu \Delta t}\, \mathcal{P}_\nu\, \rho(t), \qquad (10)$$
where we have used $\mathcal{P}_\nu \rho = P_\nu \rho P_\nu$. We see that even in absence of feedback ($L_E = L_F$), the projection superoperators $\mathcal{P}_\nu$ may strongly affect the dynamics. We note also that although $P_E + P_F = 1$ in ordinary operator space, this does not hold in superoperator space, $\mathcal{P}_E + \mathcal{P}_F \neq 1$, which formally reflects the fact that quantum measurements always affect the system. This prevents the transformation of Eq. (10) into a master equation in the continuum limit ($\Delta t \to 0$). However, we can use that $\mathcal{P}_\nu \mathcal{P}_{\nu'} = \delta_{\nu\nu'} \mathcal{P}_\nu$ to infer that the projected density matrix $\tilde\rho \equiv (\mathcal{P}_E + \mathcal{P}_F)\rho = P_E \rho P_E + P_F \rho P_F$ obeys in the continuous measurement limit a master equation of the form
$$\dot{\tilde\rho} = (\mathcal{P}_E + \mathcal{P}_F)(L_E \mathcal{P}_E + L_F \mathcal{P}_F)\, \tilde\rho \equiv L_{\rm fb}^{\rm prj}\, \tilde\rho.$$
In numerical investigations (not shown) we have found that this effective feedback Liouvillian $L_{\rm fb}^{\rm prj}$ is bistable, with different stationary solutions corresponding to an empty and filled central dot, respectively. In addition, the currents associated with these stationary states vanish throughout. Thus, the usual Redfield treatment (see App. C 3) will lead to a complete blockade of the current if the central dot is strongly and continuously measured, see also Sec. III A.
This can be attributed to a Zeno-type blocking of transport 34,37 , which however is not observed in actual electronic transport experiments. This motivates us to model the effect of measurement more realistically, which naturally leads to the concept of positive operator-valued measures (POVMs) 29,38 or weak measurements. We note that the secular approximation would not imply a TQD current blockade, but it is not applicable in the continuum measurement limit. The Zeno blockade of the current for projective measurements at high rates is also found in an independent investigation 39 based on dynamical coarse-graining 40 .
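The mechanism behind the Zeno blockade can be illustrated with a deliberately simplified toy model, which is not the Redfield treatment of the TQD used in the paper. The sketch assumes a single electron that starts on the central dot and can coherently hop to one neighbouring site with amplitude `lam` (with $\hbar = 1$), while the dot occupation is projectively measured at equal time intervals. The function name is hypothetical.

```python
import math

def survival_probability(lam, total_time, n_measurements):
    """Probability that all measurements find the electron still on the dot.

    Between two measurements the electron evolves coherently for a time
    dt = total_time / n_measurements, and the amplitude to remain on the
    initial site of the two-site model is cos(lam * dt).
    """
    dt = total_time / n_measurements
    return math.cos(lam * dt) ** (2 * n_measurements)
```

For `lam * total_time = pi / 2` a single final measurement finds the electron transferred with near certainty, whereas increasing `n_measurements` drives the survival probability towards one. Frequent projective measurements thus freeze the occupation, which is the effect that suppresses the current in the continuous-measurement limit.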

D. TQD with weak measurements
A natural way to introduce a weak measurement is via a physical interaction with a detection device such as a QPC 41,42. Schematically, system and detector are allowed to interact for a finite time $\Delta\tau$ (described e.g. by unitary or dissipative evolution), leading to the buildup of system-detector correlations. Afterwards, a projective measurement in the detector Hilbert space (in our case, fixing the number of charges $n$ tunneled through the QPC during $\Delta\tau$) performs a weak measurement on the system (TQD), implementing Neumark's theorem 43. The feedback loop could then be closed by conditioning the subsequent evolution on the measurement outcome, as sketched in the bottom panel of Fig. 1. To characterize the measurement properties, however, we will for the moment not consider any feedback and consider the limit $\Delta\tau = \Delta t$ (measurement device is always on).
Starting from a microscopic model for the interaction between the TQD and a QPC measurement device, we derive an effective Lindblad generator for the TQD dynamics during the measurement, which eventually can be used to obtain the weak measurement superoperator, see Appendix B. Effectively, the weak measurement is described by a POVM, which depends on two dimensionless parameters $x$ and $y$ and can be written as a minimally disturbing measurement 29, which after observing $n$ tunneled QPC electrons during measurement interval $\Delta\tau$ acts on the TQD system density matrix $\rho$ as
$$M_n \rho = \left(\sqrt{\frac{x^n e^{-x}}{n!}}\, d d^\dagger + \sqrt{\frac{y^n e^{-y}}{n!}}\, d^\dagger d\right) \rho \left(\sqrt{\frac{x^n e^{-x}}{n!}}\, d d^\dagger + \sqrt{\frac{y^n e^{-y}}{n!}}\, d^\dagger d\right).$$
The special form of our detector model lets the measurement affect the central dot only. It is not hard to show that $\sum_n M_n^\dagger M_n = 1$ although the $M_n$ operators are not projectors. Microscopically, the $x$ and $y$ parameters are linked to the maximum QPC current $\gamma$, the reduced QPC current $\gamma(1 - \sigma)^2$, and the measurement time $\Delta\tau$. They correspond to the average particle transfer through the QPC during $\Delta t$ for an empty ($x = \gamma\Delta\tau$) or filled ($y = \gamma\Delta\tau(1 - \sigma)^2$) SQD, respectively, see Fig. Formally, we see directly that the action of $M_n$ preserves the Hermiticity of any valid density matrix. Furthermore, the positivity of $\rho$ is also preserved, which can be deduced from the observation that both trace and determinant of $M_n \rho$ are non-negative for any valid $\rho$. To preserve the trace, one has to renormalize afterwards, i.e., $\rho^{(n)} = M_n \rho / {\rm Tr}\{M_n \rho\}$ is a valid density matrix. The limit $x = y$ corresponds to a QPC that is insensitive to the dot occupation ($\sigma = 0$), and consequently it has (after normalization) no effect on the TQD. Usually, the detector is tuned to obtain information about the system, and the von-Neumann entropy of the system will decrease for most individual measurement outcomes. However, this need not always be the case, i.e., the entropy for individual outcomes $n$ may also increase under the action of the measurement.
In particular, on average, the effect of any minimally disturbing measurement will increase the entropy 29, $S_{\rm vN}\left(\sum_n M_n \rho\right) \ge S_{\rm vN}(\rho)$. For the model at hand one can confirm this by considering that on average the measurement will induce a reduction of coherences, bringing the eigenvalues of the reduced central dot density matrix closer together and thereby increasing the entropy.
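The average entropy increase can be checked on a single two-level dot. The sketch below assumes a minimally disturbing measurement whose Kraus operators weight the empty and filled projectors with the square roots of Poissonian probabilities of mean `x` and `y`; this is an assumption consistent with the two-Poissonian QPC statistics, not a quotation of the detector model in App. B. The reduced dot density matrix is stored as nested tuples in the basis (empty, filled), and all names are hypothetical.

```python
import math

def poisson(n, mean):
    """Poisson probability of n counts (mean > 0), evaluated in log space."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def averaged_measurement(rho, x, y, n_max=200):
    """Apply sum_n M_n rho M_n to rho = ((a, c), (conj(c), b)).

    Populations are multiplied by the summed Poisson weights (close to 1),
    coherences by sum_n sqrt(p_n(x) p_n(y)), which is smaller than 1 for x != y.
    """
    (a, c), (_, b) = rho
    w_E = sum(poisson(n, x) for n in range(n_max))
    w_F = sum(poisson(n, y) for n in range(n_max))
    damp = sum(math.sqrt(poisson(n, x) * poisson(n, y)) for n in range(n_max))
    return ((w_E * a, damp * c), (damp * c.conjugate(), w_F * b))

def entropy(rho):
    """Von-Neumann entropy of a 2x2 density matrix from its two eigenvalues."""
    (a, c), (_, b) = rho
    r = math.sqrt((a - b) ** 2 / 4.0 + abs(c) ** 2)
    eigenvalues = [(a + b) / 2.0 + r, (a + b) / 2.0 - r]
    return -sum(p * math.log(p) for p in eigenvalues if p > 1e-15)
```

For the pure superposition `rho = ((0.5, 0.5), (0.5, 0.5))` with `x = 9` and `y = 4`, the populations stay at 0.5 while the coherence is damped by the factor $e^{-(\sqrt{x}-\sqrt{y})^2/2}$, so the entropy rises from zero to a positive value.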
Since the QPC statistics is in the considered limit just given by the sum of two Poissonian distributions that propagate at different speeds, one can define a suitable discrimination threshold (dashed line and right panel in Fig. 3) at the point where both distributions coincide, i.e., where $x^{n_{\rm thr}} e^{-x} = y^{n_{\rm thr}} e^{-y}$, leading to
$$n_{\rm thr} = \frac{x - y}{\ln x - \ln y}.$$
Supposing that $x \gg y$ (sensitive QPC), obtaining a value of $n$ in the measurement that is close to one of the two peaks will tell a lot about the state of the system, but when $n \approx n_{\rm thr}$, the measurement is practically useless. To obtain a compact description with just two rates (and also to compare with the SQD treatment) it is natural to coarse-grain the measurement outcomes into the ones interpreted as an empty or filled dot, respectively, where $\rho$ still denotes the TQD density matrix and $F_n(x) \equiv \Gamma(n, x)/\Gamma(n)$ with $\Gamma(n, x)$ denoting the incomplete Gamma function and $\Gamma(n) = (n - 1)!$ the ordinary Gamma function. The $F_n(x)$ functions behave similarly to Fermi functions, such that when $x \gg y \gg 1$, the measurement superoperators approach the projective limit $M_{E/F} \to \mathcal{P}_{E/F}$. We note that, as for the projective case, these superoperators do not add up to the identity. Furthermore, we also note that
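The threshold condition and the quality of the coarse-grained readout can be made concrete with a short sketch. It assumes that counts below the threshold are interpreted as a filled dot and counts at or above it as an empty dot, since the QPC current is larger for an empty dot; the function names are hypothetical. For integer $n$, the function $F_n(x) = \Gamma(n, x)/\Gamma(n)$ equals the probability that a Poisson variable of mean $x$ is smaller than $n$, which is what the sums below evaluate.

```python
import math

def poisson(n, mean):
    """Poisson probability of n counts (mean > 0), evaluated in log space."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def threshold(x, y):
    """Count at which the two Poissonian weights x^n e^-x and y^n e^-y coincide."""
    return (x - y) / math.log(x / y)

def misreading_probabilities(x, y):
    """Return (P(read filled | empty), P(read empty | filled)) for x > y > 0."""
    n_cut = math.ceil(threshold(x, y))
    # Empty dot (mean x) is misread when fewer than n_cut charges are counted.
    p_filled_given_empty = sum(poisson(n, x) for n in range(n_cut))
    # Filled dot (mean y) is misread when n_cut or more charges are counted.
    p_empty_given_filled = 1.0 - sum(poisson(n, y) for n in range(n_cut))
    return p_filled_given_empty, p_empty_given_filled
```

The threshold is the logarithmic mean of `x` and `y` and therefore always lies between them. Well separated distributions, for instance `x = 100` and `y = 25`, give small misreading probabilities, in line with the approach to the projective limit, while `x` close to `y` makes the readout uninformative.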

E. TQD with feedback
By conditioning the subsequent evolution on the measurement outcome, we close the feedback loop. We denote the implicit dependence of the dissipators on $\Gamma_\alpha^{E/F}$ by $L \to L_{E/F}$, which yields for $\Delta t > \Delta\tau$ the feedback iteration equation (recall that the measurement duration $\Delta\tau$ is implicit in the $M_{E/F}$)
$$\rho(t + \Delta t) = \left[e^{L_E(\Delta t - \Delta\tau)} M_E + e^{L_F(\Delta t - \Delta\tau)} M_F\right] \rho(t) \equiv P(\Delta t)\, \rho(t).$$
For finite $\Delta\tau \le \Delta t$ this can be solved for a stroboscopic stationary state $\bar\rho_{\Delta t, \Delta\tau} = P(\Delta t)\, \bar\rho_{\Delta t, \Delta\tau}$. We note that when $L_{E/F}$ are of Lindblad form, the above propagator $P(\Delta t)$ will preserve all density matrix properties, since it is derived from an average over all conditional evolutions (which separately preserve the density matrix properties). This property must be preserved in the limit of small $\Delta t$. We therefore consider the limit $\Delta\tau \ll \Delta t \to 0$, keeping however $x$ and $y$ finite to preserve the measurement effects 44. This implies that the QPC coupling $\gamma$ must be large, justifying a posteriori the singular coupling limit used in its derivation in App. B. Formally, we therefore only set the explicit dependence on $\Delta\tau$ to zero and then expand for small $\Delta t$ to obtain
$$\rho(t + \Delta t) \approx \left[M_E + M_F + \Delta t\, (L_E M_E + L_F M_F)\right] \rho(t).$$
Subtracting $\rho(t)$ on both sides and dividing by $\Delta t$ we get an effective feedback master equation, described by the feedback dissipator
$$L_{\rm fb} = \frac{M_E + M_F - 1}{\Delta t} + L_E M_E + L_F M_F.$$
The corresponding stationary state will be defined by $L_{\rm fb} \bar\rho = 0$. The first term defines an effective measurement dissipator (compare App. B)
$$L_{\rm ms}\, \rho = \bar\Gamma \left[d^\dagger d\, \rho\, d^\dagger d - \frac{1}{2} \left\{d^\dagger d, \rho\right\}\right], \qquad (19)$$
which is of Lindblad form and appears formally divergent as $\Delta t \to 0$. However, in the discussed regime $\Delta\tau \ll \Delta t$ we stress that the measurement dissipator remains finite. Effectively, it is determined by the parameter $\bar\Gamma = \frac{2}{\Delta t}\left(1 - e^{-\sigma^2 \gamma \Delta\tau/2}\right) > 0$ describing the dephasing due to the QPC measurement 37,41. If we had directly expanded the projective measurement iteration (10), the corresponding measurement dissipator would indeed diverge. The action of (19) on the central dot density matrix appears trivial, since it deletes coherences which cannot be created anyway and leaves the diagonal elements of the central dot untouched.
However, for the simulation of the full TQD density matrix we have found that the inclusion of this term with sufficiently small $\Delta t$ is necessary to preserve the density matrix properties of the TQD system: even when $L_E$ and $L_F$ are chosen of Lindblad form, $M_{E/F}$ separately preserve positivity and Hermiticity, and $M_E + M_F$ corresponds to a Lindblad exponential, the action of $L_E M_E + L_F M_F$ alone will in general not preserve the density matrix properties. These can only be recovered by adding a very strong measurement dissipator. We stress that a large prefactor of this superoperator does not imply a projective measurement.
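The dephasing parameter of the measurement dissipator can be traced back to the overlap of the two Poissonian distributions in a short worked calculation. The sketch assumes the minimally disturbing form $M_n = \sqrt{p_n(x)}\, d d^\dagger + \sqrt{p_n(y)}\, d^\dagger d$ with Poissonian weights $p_n(z) = z^n e^{-z}/n!$, together with $x = \gamma\Delta\tau$, $y = \gamma\Delta\tau(1-\sigma)^2$ and $0 \le \sigma \le 1$. The symbols $p_n$, $P_E = d d^\dagger$ and $P_F = d^\dagger d$ are notation introduced for this illustration.

```latex
\begin{align}
\sum_{n=0}^{\infty} \sqrt{p_n(x)\, p_n(y)}
  &= e^{-(x+y)/2} \sum_{n=0}^{\infty} \frac{(\sqrt{xy})^n}{n!}
   = e^{-(\sqrt{x}-\sqrt{y})^2/2}
   = e^{-\sigma^2 \gamma \Delta\tau/2}, \\
\frac{1}{\Delta t}\Big(\sum_n M_n - 1\Big)\, P_E\, \rho\, P_F
  &= -\frac{1 - e^{-\sigma^2 \gamma \Delta\tau/2}}{\Delta t}\, P_E\, \rho\, P_F
   = -\frac{\bar\Gamma}{2}\, P_E\, \rho\, P_F .
\end{align}
```

Populations $P_\nu \rho P_\nu$ are left unchanged by $\sum_n M_n$ because each Poissonian distribution is normalized, so only coherences between an empty and a filled central dot decay, at half the rate $\bar\Gamma$ quoted in the text. An insensitive detector ($\sigma = 0$) gives $\bar\Gamma = 0$.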

III. THERMODYNAMICS
In this section, we will define sensible expressions for the heat exchange during the control and measurement phases (Sec. III A and Sec. III B, respectively), and for the work required to switch the Hamiltonian between measurement and control (Sec. III C). These can be used to show that our system obeys the first law (Sec. III D) and the second law (Sec. III E) of thermodynamics, which bound the performance of the demon.

A. Heat flow during conditional evolution
Since by construction Therefore, a phenomenological way to define the stationary matter current entering the TQD from reservoir $\alpha$ proceeds via considering d dt = 0, since the feedback operations do not inject particles into the TQD system. We further note that the currents depend on $\delta_\alpha$, but also implicitly on $\Gamma_\alpha^{E/F}$ due to the feedback. Also, doubling $\Gamma_L$ and $\Gamma_R$ will not necessarily double the current, since these parameters enter the TQD Hamiltonian. An alternative definition of the steady-state matter current would be to look at the time derivative of the occupation of the central dot, which decomposes into a left- and right-flowing contribution. Due to the special structure of the Born-Markov master equation (see App. C 3), these are fully identical with the Heisenberg equations of motion for the central dot, leading to the definition I From this form, we see more directly that a projected density matrix (e.g. $\rho \to d^\dagger d\, \rho\, d^\dagger d$) would lead to a vanishing matter current for any density matrix $\rho$ (e.g. via Tr Finally, the matter currents entering the TQD can also be defined microscopically using the counting field formalism 45. We have found these three definitions to be equivalent at steady state within the Born-Markov (non-secular) description.
The energy currents entering the system from the left and right reservoir can be sensibly defined in a similar way. We note that this treatment neglects the interaction energy between the TQD and its reservoirs, but keeps the interaction energy between the central dot and its reservoirs. However, in addition the system Hamiltonian now depends on the control operations, eventually leading to In presence of feedback ($L_E \neq L_F$ and $H_S^E \neq H_S^F$), the energy currents are not necessarily conserved, since the feedback loop may inject energy into the system, both during measurement and switching. Again, we find from microscopic considerations based on counting fields the same definitions for the energy exchanged with the reservoirs. Together, the energy and matter currents enter the heat current from the corresponding reservoirs

B. Heat during measurement
If we do not change the TQD Hamiltonian $H_S$ during the measurement, the average energy injected into the TQD during $\Delta\tau$ is (in App. B we detail $M_E + M_F$ using a microscopic model) which suggests defining the energy current due to the measurement in the conventional way Making the form of $L_{\rm ms}$ explicit, we see that when $[d^\dagger d, H_S] \to 0$ (this holds approximately in the weak-coupling limit between the central dot and its reservoirs), the measurement will on average not inject any energy. Furthermore, the measurement-associated energy current also vanishes when the measurement is insensitive ($\sigma = 0$). For simplicity, we use as a natural choice the average of the two Hamiltonians, $H_S = \frac{1}{2}\left(H_S^E + H_S^F\right)$, in the numerical results of this paper.
We finally note that a special Hamiltonian acting during the measurement implies that switching work is applied twice to the TQD, at the beginning and at the end of the measurement process, see below. An alternative scheme would be to leave the previous Hamiltonian acting during the measurement, which however would complicate the discussion.

C. Switching work
To run the feedback loop, work has to be performed on the system, both when initializing the measurement (switching the Hamiltonian from $H_S^{E/F}$ to $H_S$) and right after the measurement (switching back from $H_S$ to $H_S^{E/F}$). The Hamiltonian $H_S$ chosen implicitly determines the rates $\Gamma_\alpha$ during the measurement via Eq. (5), compare also Fig. 1. On average, this will imply for the switching work during a feedback cycle Now, upon expanding for small $\Delta t$ we see that the leading order in $\Delta t$ is linear, and we get for the rate $\dot{W}_{\rm sw} = W_{\rm sw}/\Delta t$ the expression We see that for a constant TQD Hamiltonian throughout, this expression must vanish. It can be further simplified by adopting the choice $H_S = \frac{1}{2}\left(H_S^E + H_S^F\right)$ (which we will use in our numerical simulations), where the expression for the work rate becomes $\dot{W}_{\rm sw} =$ In particular, with $\delta_L = \delta_R$ this means for the rates during measure-

D. First law
By adding the contributions (21), (23), and (25) we see analytically that at steady state the energy is conserved where we have used that $L_{E/F} = L_{E/F}^{(L)} + L_{E/F}^{(R)}$. This expression vanishes as $\bar\rho$ denotes the stationary state of the corresponding feedback Liouvillian. In our considerations we will only investigate the total energy flow due to the feedback loop Since it is also information that is needed to run the feedback loop, the gain is not bounded by one. Furthermore, since it counts only electric output power, this measure vanishes at equilibrium ($V = 0$) although the demon feedback loop may produce a finite current.

E. Second law
In our considerations, we are operating the QPC in a unidirectional transport regime, whose associated entropy production rate 45, $\dot{S}_i = \beta V_{\rm QPC} I_{\rm QPC}$, diverges. With all other quantities remaining finite, the global entropy production rate is therefore always positive by construction. However, we may attempt to write a balance equation for the local entropy of the TQD system. In doing so, we first note that the construction of the feedback loop enables us to consider the change of the von-Neumann entropy along particular trajectories belonging to different measurement results.
We will discuss only trajectories that start with the stationary state of the feedback loop. Without the coarse-graining of the measurement outcomes, this is defined by $\bar\rho = \sum_n e^{L_n \Delta t} M_n \bar\rho$ (for simplicity of notation we have put here $\Delta\tau \to 0$). Then, the entropy at the end of the feedback loop will for an initial measurement outcome $n$ be given by Here, the probability for this outcome is given by $p_n = {\rm Tr}\{M_n \bar\rho\}$, $\Delta S_{\rm ct}^{(n)}$ denotes the system entropy change during control, and $\Delta S_{\rm ms}^{(n)}$ the system entropy change during measurement, all conditioned on the measurement outcome $n$.
For a Lindblad-type conditional control with thermal reservoirs, we can split the entropy change of the system into a non-negative irreversible part $\Delta_i S^{(n)} \ge 0$ and an exchange part 46,47 where the latter part is related to the heat flows entering the system. Inserting and solving for the exchange entropy and measurement contribution yields If we average over all measurement outcomes, the last term on the r.h.s. corresponds to the mutual information between system and detector that is discarded, see App. E. By averaging over this expression, we see by invoking that 48 and $\Delta_i S^{(n)} \ge 0$ on average, we must have Dividing by $\Delta t$ and considering $\Delta t \to 0$, the first terms just become the energy and matter currents, leading to which denotes a version of the second law for the continuum weak measurement limit 49. It can be used to bound the energetic performance of the device by the information gained by measurement. For our setup where $\beta_L = \beta_R = \beta$ this yields, using Eq. (27), our version of the second law which can be used to bound e.g. the gain. In particular, to have a gain $G > 1$ (information-driven regime) when $I_E^{\rm ms} + \dot{W}_{\rm sw} > 0$, it is necessary that $d_t S_{\rm ms} < 0$, i.e., that the measurement on average reduces the system entropy. Technically, we note that the average of the measurement entropy change is given by In the regime of positive electric power, we can also define an efficiency for the conversion of both information and feedback energy into electric power via (compare also Ref. 50) which is bounded by one by the second law (35). However, we stress that other regimes are also conceivable, for example generating electric power ($P_{\rm el} > 0$) while simultaneously extracting work ($\dot{W}_{\rm sw} < 0$) 39, which would motivate other definitions of efficiency. Finally, we remark that the same second law can be derived when the entropic contribution of the abstract detector (and thus, the mutual information between system and detector) is explicitly taken into account, see App. E, which is similar to the framework of repeated interactions 51.

IV. NUMERICAL RESULTS
We will first investigate the weak-coupling regime, where one would expect that the TQD treatment is equivalent to the SQD treatment in Ref. 11, as far as the current through the system is concerned. However, our extended description now allows us to quantify the injection of energy into the TQD system by the measurement and control steps, which will in general not vanish. We will demonstrate that in the weak-coupling regime this injection is negligibly small in comparison to the generated electric power, such that the device indeed implements a Maxwell demon feedback loop there. Next, we will investigate how these relations change beyond weak coupling.

A. Weak-coupling regime
We first benchmark our TQD treatment in absence of measurement ($x = y$) and also in absence of control ($H_S^E = H_S^F$), where it should yield results similar to the SQD treatment. Indeed, the black curves in Fig. 4 demonstrate close agreement of the TQD and SQD treatments in the weak-coupling limit in absence of any measurement and control. Then, we compare the SQD treatment in presence of feedback control with the TQD treatment, first only in presence of measurements ($x > y$) but absence of control ($H_S^E = H_S^F$). This already suppresses the current due to the partial projection of spatial superpositions (solid brown), but does not break the detailed balance relations and therefore does not produce electric power. Finally, when control is applied to the TQD (solid red), a similar situation as with the SQD in presence of feedback (dashed orange) arises. Although the electric power is significantly reduced (filled rectangle area vs. hollow rectangle area), the inset showing the TQD energy flows defined in Eqs. (21), (23), and (25) demonstrates that these contributions are negligible in comparison to the electric power (solid red curve in the inset), which justifies calling this parameter regime information-dominated. The dash-dotted magenta curve describes adaptive feedback, explained in Sec. IV C.
In a nutshell, we find that the more realistic TQD treatment with weak measurements and coherent control supports a Maxwell-demon mode in the weak-coupling regime, but with a significant reduction in electric output power. To compensate for this, one can explore the strong-coupling regime, see below.

B. Towards strong coupling
A naive extrapolation of the SQD treatment towards the strong coupling limit predicts that all currents and derived quantities such as generated power should scale linearly in the coupling strength, apparently predicting no limit in power production. However, from the exact solution of the SQD in absence of feedback control we know that by increasing the coupling strength to the reservoirs, the current through the system can be increased only up to a finite limit 30, see also the benchmark in the companion paper 23. It is thus an intriguing question how the generated electric power scales in the strong-coupling regime. In this section, we investigate how the currents change when the coupling strength is scaled by a factor $\kappa$. We note that this convention leads to larger differences between the coupling constants and thereby also to a larger switching work as $\kappa$ is increased. In Fig. 5 we show the power vs. bias voltage for different coupling strengths, where we adopt the convention that the previous parameters of the weak-coupling limit (Fig. 4) are reproduced when $\kappa = 0.01$. We observe that by increasing the coupling strength, the electric output power is indeed increased as well (main plot). For weak to moderate coupling strength (solid black, dashed red, and dash-dotted green curves), we see that the device is still information-dominated, as the gain factor (28) can become larger than one, indicating that then mainly information is converted to electric output power. However, as the gain factor $G$ continuously decreases with increasing coupling strength, beyond a critical coupling the device is no longer information-dominated, and the gain $G$ is smaller than one (dot-dot-dashed blue and dotted brown curves in the top inset of Fig. 5). Considering both information and feedback energy as consumed resources, the efficiency (37) becomes the relevant figure of merit.
In contrast to the gain, the observed maximum efficiency does not evolve monotonically with the coupling strength (bottom inset). It first grows with increasing coupling strength, and we observe an acceptable maximum efficiency of nearly 80% at a moderate coupling strength where the device is still information-dominated (green dash-dotted curve). However, for stronger and ultrastrong couplings, the efficiency decreases again (dot-dot-dashed blue and dotted brown curves), such that in this energy-dominated regime, the device is useless from a practical viewpoint: Running the feedback loop requires more energy than is generated from information.
Additionally, we observe that in the information-dominated (Maxwell-demon) regime, where the gain is larger than one, the non-trivial point $V^*$ (brown circle), where the generated power vanishes, hardly depends on the coupling strength. Experimentally, this may be an important hallmark for the identification of this regime. For the naive SQD treatment, the position of this point may be calculated analytically which is however significantly larger than the observed value for the TQD treatment even in the weak-coupling regime, compare also the main plot of Fig. 4. We attribute this to the inherent weakness of the measurement, which strongly limits the capabilities of the demon.

C. Coarse-graining effects
We see that in the TQD treatment, the electric power produced by the device is significantly smaller than in the SQD treatment, compare Fig. 4. Consistently, the efficiency in the weak-coupling regime (where the device operates information-dominated) is also significantly below the SQD efficiency, compare Fig. 5 with the discussion in App. D. This leads us to the conclusion that the presented TQD treatment does not efficiently use the information to close the feedback loop. Knowing that coarse-graining strongly influences the entropy balance 52, one might question whether this results from the employed coarse-graining of measurement outcomes into $\mathcal{M}_E$ and $\mathcal{M}_F$. Instead of just using the two coarse-grained rates $\Gamma_\alpha^{E/F}$, we can consider $n$-conditioned rates, where a suitable choice could be This will preserve positive tunneling rates throughout with similar tunneling rates as in the coarse-grained picture. Furthermore, if $y < n < x$, the conditional rates will be between the coarse-grained ones, and in particular $\Gamma_\alpha^{(n_{\rm thr})} = \frac{1}{2}(\Gamma_\alpha^E + \Gamma_\alpha^F)$, such that the feedback is adapted to the weakness of the measurement. The information gained by the measurement is thus no longer discarded. We have observed, however, that the generated electric power is not significantly enhanced by this procedure, see the dash-dotted magenta curve in Fig. 4.

FIG. 4. In absence of both measurements and control actions ($x = y$ and $H_S^E = H_S^F$), the TQD treatment (solid black) follows the SQD treatment closely. This changes already when measurement is active but no control is applied (solid brown). When the feedback loop is closed (solid red), the current no longer vanishes at the origin, and there is a regime where electric power is produced, albeit reduced as compared to the SQD (red small rectangle). Additionally adapting the feedback to the actual measurement outcome has little impact (dash-dotted magenta, see Sec. IV C). Inset: Curves for the coarse-grained feedback demonstrate that in the demon regime the generated power (solid red) is significantly larger than other contributions such as measurement energy (dashed green), switching work rate (dotted blue), and sum of left and right energy currents (solid black). Other parameters: $\beta = \beta_L = \beta_R = 1$, $\beta\Gamma_L^E = \beta\Gamma_R^F = 0.015$, $\beta\Gamma_L^F = \beta\Gamma_R^E = 0.005$, $x = 10$, $y = 3$, $\Gamma = (\Gamma_\alpha^E + \Gamma_\alpha^F)/2$, $\Gamma\Delta t = 1$, $\mu_L = -\mu_R = V/2$, and $\beta\delta_L = \beta\delta_R = 0.1$.

V. SUMMARY
We have considered the performance of an externally controlled feedback loop implementing an electronic Maxwell demon. To explore the strong-coupling limit, we employed a fermionic reaction-coordinate mapping to an effective triple quantum dot system, serially coupled to two leads. Combining this with continuous projective measurements of the central dot occupation, the destruction of coherent superpositions within the triple dot system led to a complete suppression of the current due to the quantum Zeno effect. Since the mapping holds also for weak couplings, this raises the question why the Zeno suppression was not observed in the original approach based on a single dot rate equation 11 . An independent investigation shows that this is due to the inherent Markovian assumption in the single dot rate equation: If a non-Markovian approach is applied to the single-dot feedback problem, the Zeno suppression is found 39 . By performing a mapping to a triple quantum dot, we obtain a Markovian embedding, which captures the non-Markovian Zeno suppression in the single quantum dot. A Zeno suppression is not directly observed in the numerous counting statistics experiments, with detectors always switched on and thereby continuously measuring. We therefore are led to believe that realistic charge measurements are far from projective, and reflected this in our work by generalizing our model to weak measurements. Inspired by charge detectors used in experiments, we implemented this by using a microscopic detector model for a point contact. Effectively, this led already in the weak-coupling limit to an overall reduced performance of the demon in comparison with the original single-dot model -both in terms of information-to-power conversion efficiency and electric power output. In the intermediate coupling strength regime, the energy to run the feedback loop becomes important, and the driving force is no longer information but the device rather acts like a pump. 
Finally, in the strong-coupling regime, the energetic contribution (opening and closing of the shutter) becomes dominant, showing that the device can no longer be interpreted as a demon.
Beyond obvious applications to more advanced models (e.g. Coulomb interactions, spin valves), it would also be interesting in the future to discuss the implications of finite-time control cycles, which would allow lifting the strong-coupling assumption during detection. Then, measurements may be constructed that leave the system energy invariant, such that the only energy required for the feedback loop is the switching work.

VI. ACKNOWLEDGMENTS
Financial support by the DFG (grants SCHA 1646/3-1, SFB 910, GRK 1558), the European Research Council (Project No. 681456), the WE-Heraeus foundation (WEH 640) as well as stimulating discussions with A. Carmele, S. Gurvitz, A. Nazir, and D. Segal are gratefully acknowledged. The authors are indebted to Tobias Brandes for motivating the research.
where the system coupling operator $A$ suppresses the tunneling amplitudes $t_{kk'}$ between left and right modes of the QPC, $H_{\rm QPC} = \sum_k \epsilon_{kL} \gamma^\dagger_{kL} \gamma_{kL} + \sum_k \epsilon_{kR} \gamma^\dagger_{kR} \gamma_{kR}$, by a factor $1 - \sigma$ when the central dot of the TQD system is occupied. Thus, when $\sigma = 0$, the QPC is insensitive to the charge of the central dot, and for $\sigma = 1$, QPC transport is completely blocked when the central dot is filled. We now assume in addition that during the measurement, the interaction dominates the internal dynamics of the TQD system. Then, the singular coupling limit is applicable (compare e.g. Sec. 3.3.3 in Ref. 53), which automatically leads to a dissipator of Lindblad form. This will locally act only on the central dot, just as if it was only an SQD coupled to the QPC. In the limit where the bias voltage of the QPC is large, $\beta V_{\rm QPC} \gg 1$, we can consider the wideband limit (compare Sec. 5.4 of Ref. 42) where $T(\omega, \omega') \equiv 2\pi \sum_{kk'} |t_{kk'}|^2 \delta(\omega - \epsilon_{kL}) \delta(\omega' - \epsilon_{k'R}) \to T_0$, which condenses into the QPC tunneling rate $\gamma = T_0 V_{\rm QPC}$ and the simple dissipator In the last line, we have inserted the definition of $A$ and used the fermionic anticommutation relations. This particularly simple form allows one to easily compute the exponential of $\mathcal{L}_{\rm dt}$. Defining the superoperators $\mathcal{J}\rho = d^\dagger d \rho d^\dagger d$, $\mathcal{J}_L \rho = \frac{1}{2} d^\dagger d \rho$, and $\mathcal{J}_R \rho = \frac{1}{2} \rho d^\dagger d$, we can, since they all mutually commute, compute their action separately: $e^{-\gamma\sigma^2 \mathcal{J}_L \Delta\tau} \rho = \left(d d^\dagger + e^{-\sigma^2\gamma\Delta\tau/2} d^\dagger d\right) \rho$, $e^{-\gamma\sigma^2 \mathcal{J}_R \Delta\tau} \rho = \rho \left(d d^\dagger + e^{-\sigma^2\gamma\Delta\tau/2} d^\dagger d\right)$, and $e^{+\gamma\sigma^2 \mathcal{J} \Delta\tau} \rho = \rho + \left(e^{+\sigma^2\gamma\Delta\tau} - 1\right) d^\dagger d \rho d^\dagger d$, which upon sequential application yields for the total effect of the measurement on average From this, we can see that on average the effect of the measurement is just the destruction of coherences between empty and filled central dot states. By employing the fermionic anticommutation relations we therefore get the expression which implies with $\mathcal{M}_E + \mathcal{M}_F = e^{\mathcal{L}_{\rm dt}(0)\Delta\tau}$ (see the discussion in the subsection below) Eq. (19) in the article.
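The statement that the averaged measurement only destroys coherences can be cross-checked numerically. The following is a minimal sketch, assuming a single fermionic mode represented by 2x2 matrices and the dissipator $\mathcal{L}_{\rm dt} = \gamma\sigma^2(\mathcal{J} - \mathcal{J}_L - \mathcal{J}_R)$ implied by the three exponentials quoted above. The function names and parameter values are illustrative and not taken from the article.

```python
import numpy as np

# Single fermionic mode in the basis (|0>, |1>); d annihilates the particle.
d = np.array([[0.0, 1.0], [0.0, 0.0]])
n_op = d.T @ d          # d^dagger d = diag(0, 1)
h_op = d @ d.T          # d d^dagger = diag(1, 0)
I2 = np.eye(2)

def left_right(A, B):
    """Superoperator rho -> A rho B acting on the row-major flattened rho."""
    return np.kron(A, B.T)

def dissipator(gamma, sigma):
    """gamma*sigma^2*(J - J_L - J_R), built from the superoperators in the text."""
    J = left_right(n_op, n_op)
    J_L = 0.5 * left_right(n_op, I2)
    J_R = 0.5 * left_right(I2, n_op)
    return gamma * sigma**2 * (J - J_L - J_R)

def lindblad_form(gamma, sigma):
    """gamma*(A rho A - {A^2, rho}/2) with A = 1 - sigma d^dagger d."""
    A = I2 - sigma * n_op
    A2 = A @ A
    return gamma * (left_right(A, A)
                    - 0.5 * left_right(A2, I2)
                    - 0.5 * left_right(I2, A2))

def closed_form(rho, gamma, sigma, dtau):
    """Sequential application of the three exponentials quoted in the text."""
    a = sigma**2 * gamma * dtau
    damp = h_op + np.exp(-a / 2) * n_op
    jumped = rho + (np.exp(a) - 1.0) * n_op @ rho @ n_op
    return damp @ jumped @ damp

def exact(rho, gamma, sigma, dtau):
    """exp(L*dtau) rho; L is diagonal in this basis, so exponentiate entrywise."""
    L = dissipator(gamma, sigma)
    assert np.allclose(L, np.diag(np.diag(L)))
    return (np.exp(np.diag(L) * dtau) * rho.flatten()).reshape(2, 2)

rho0 = np.array([[0.6, 0.3], [0.3, 0.4]])
rho1 = closed_form(rho0, gamma=2.0, sigma=0.7, dtau=0.5)
print(rho1)  # populations unchanged, off-diagonal elements damped
```

In this sketch the populations stay fixed while the off-diagonal elements are multiplied by $e^{-\sigma^2\gamma\Delta\tau/2}$, in line with the interpretation given in the text.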

Measurement dissipator in presence of QPC electron counting
To furthermore infer the counting statistics of the QPC, it is a well-established practice to introduce counting fields 45, which yields a generalized dissipator of the form where $\chi$ denotes the counting field for the charges transferred through the QPC circuit. By computing derivatives of the moment-generating function $M(\chi) = \mathrm{Tr}\left\{e^{\mathcal{L}_{\rm dt}(\chi)\Delta\tau} \rho(t)\right\}$ with respect to the counting field $\chi$, we can determine all moments of the distribution of tunneled QPC charges during the interval $[t, t + \Delta\tau]$, where $\Delta\tau$ denotes the duration of the measurement. By construction, the inverse Fourier transform of the generating function yields the full distribution and the corresponding conditional (not normalized) density matrix is given by 54 which defines the measurement superoperators $\mathcal{M}_n$. In the joint Hilbert space of system (TQD) and detector (number of particles transferred through the QPC), the most general density matrix can be written as $\rho_{SD}(t) = \sum_{nm} \rho^{(nm)}(t) \otimes |n\rangle\langle m|$, such that performing a projective measurement of the number of particles transferred through the QPC and tracing out the detector afterwards leads to the identification $\rho^{(n)} = \rho^{(nn)}$. We note that from the completeness relation of the Fourier transform we can also infer that $\rho(t + \Delta\tau) = \sum_n \rho^{(n)}(t + \Delta\tau) = e^{\mathcal{L}_{\rm dt}(0)\Delta\tau} \rho(t)$, such that $\sum_n \mathcal{M}_n = e^{\mathcal{L}_{\rm dt}(0)\Delta\tau}$. In a similar fashion as before, we can also partition the generalized dissipator into mutually commuting superoperators, $\mathcal{L}_{\rm dt}(\chi) = \gamma\left[e^{+i\chi} \mathcal{J} + \mathcal{J}_L + \mathcal{J}_R\right]$, for which we can separately compute the exponential. Eventually, this yields $\mathcal{M}_n = \frac{(\gamma\Delta\tau)^n}{n!} \mathcal{J}^n e^{-\gamma\Delta\tau(\mathcal{J}_L + \mathcal{J}_R)}$.
By using that $A^n = d d^\dagger + (1 - \sigma)^n d^\dagger d$ we can explicitly determine the individual superoperators which upon nested application eventually leads to Eq. (12) in the main article.
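As an illustration of how the measurement superoperators act, the sketch below evaluates them for a single dot. It assumes the convention $\mathcal{J}\rho = A\rho A$ and $(\mathcal{J}_L + \mathcal{J}_R)\rho = \frac{1}{2}\{A^2, \rho\}$, which is one reading of the expression for $\mathcal{M}_n$ above and is not confirmed by the text; under this assumption $\mathcal{M}_n \rho = K_n \rho K_n$ with a diagonal $K_n$. The helper names are hypothetical.

```python
import numpy as np
from math import factorial, sqrt

def kraus(n, gamma, sigma, dtau):
    """Diagonal K_n = sqrt((gamma*dtau)^n / n!) * A^n * exp(-gamma*dtau*A^2/2)."""
    a = np.array([1.0, 1.0 - sigma])            # eigenvalues of A on |0>, |1>
    pref = (gamma * dtau) ** (n / 2) / sqrt(factorial(n))
    return np.diag(pref * a**n * np.exp(-gamma * dtau * a**2 / 2))

def measure(rho, n, gamma, sigma, dtau):
    """Unnormalized conditional state M_n rho (assumed Kraus form)."""
    K = kraus(n, gamma, sigma, dtau)
    return K @ rho @ K

def outcome_probabilities(rho, n_max, gamma, sigma, dtau):
    """P_n = Tr{M_n rho} for n = 0 .. n_max-1."""
    return np.array([np.trace(measure(rho, n, gamma, sigma, dtau))
                     for n in range(n_max)])

rho0 = np.array([[0.6, 0.3], [0.3, 0.4]])
P = outcome_probabilities(rho0, 80, gamma=10.0, sigma=0.45, dtau=1.0)
total = sum(measure(rho0, n, 10.0, 0.45, 1.0) for n in range(80))
print(P.sum())   # close to 1 (completeness, up to truncation at n_max)
print(total)     # populations of rho0 with damped coherences
```

Within this assumed form, the outcome distribution is a mixture of two Poissonians with means $\gamma\Delta\tau$ and $\gamma(1-\sigma)^2\Delta\tau$, weighted by the empty and filled populations, and summing over all outcomes reproduces the coherence damping of the averaged measurement.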

Example: SQD monitored by QPC
Taking $\mathcal{L}_{\rm dt}(\chi)$ from Eq. (B5) with $A = 1 - \sigma d^\dagger d$, we can evaluate the action of the QPC in the dot eigenbasis $|0\rangle$, $|1\rangle$. Due to the special form of the dissipator, it does not couple the populations to the coherences of the dot density matrix, and we get $(\mathcal{L}_{\rm dt}(\chi)\rho)_{00} = \gamma(e^{+i\chi} - 1)\rho_{00}$ and $(\mathcal{L}_{\rm dt}(\chi)\rho)_{11} = \gamma(1 - \sigma)^2 (e^{+i\chi} - 1)\rho_{11}$. Since for a single quantum dot system coherences do not play a role, we obtain that the generalized rate matrix (acting on the probability vector $P = (p_E, p_F)^T$ for finding the dot empty or filled, respectively) is diagonal in the SQD eigenbasis Here, we see that for the chosen large QPC-bias limit, we only have unidirectional transport through the QPC, such that only terms with $e^{+i\chi}$ occur, and $\gamma > 0$ describes the QPC transmission (compare Sec. 5.4.1 of Ref. 42). The total rate matrix is then constructed additively (compare Eq. (2)), which leads to Here, the rate matrices $W_{L/R}$ describe electronic jumps onto or off the dot from the left and right leads, respectively, and $W_{\rm dt}$ is a generalized rate matrix for the QPC, with the counting field $\chi$ describing the number of transferred QPC charges. In absence of counting ($\chi = 0$), the effect of the QPC on the single dot vanishes; for larger systems such as a double dot, however, the QPC would still have an effect 55. The above rate matrix is an extremely simple example of a bistable stochastic process: For vanishing SQD tunneling rates $W_{L/R} \to 0$, the dot occupation cannot change, and the statistics $P_n(\Delta t)$ will be fully Poissonian, depending only on the initial SQD occupation: When it is empty, the cumulants will be given by $\gamma\Delta t$, and when it is filled, they are given by $\gamma(1 - \sigma)^2 \Delta t$. The interesting case arises when the SQD rate matrices $W_{L/R}$ are small in comparison to $W_{\rm dt}$: Then, slow switching events occur between the two Poissonian distributions 56, and the time-dependent detector signal can be used to infer the occupation of the dot 32.
From the theory of Full Counting Statistics, we can infer the probability $P_n(\Delta t)$ of observing $n$ QPC charge transfer events during the interval $[t, t + \Delta t]$ via $P_n(\Delta t) = \frac{1}{2\pi} \int_{-\pi}^{+\pi} \mathrm{Tr}\left\{e^{W(\chi)\Delta t} P(t)\right\} e^{-in\chi} d\chi$, where the trace corresponds in this case to the multiplication from the left with the row vector $(1, 1)$. Furthermore, a measurement of $n$ QPC charge transfers after $\Delta t$ would project the probability vector to which still needs to be normalized by $P_n(\Delta t)$. In the limit where $\gamma \gg \Gamma_\alpha$, a perturbative treatment for these expressions can be used. To generate a trajectory such as in Fig. 3, we start with an empty dot $P(0) = (1, 0)$, then compute the probabilities $P_n(\Delta t)$, choose accordingly a particular outcome $n$ (which defines a current $I = n/\Delta t$), and perform the projection, which leads to $P(\Delta t)$. This is then taken as the initial state for the next iteration and so on. Due to the projection, it is more likely to measure large currents after large currents and low currents after low currents, which leads to the switching behaviour shown in Fig. 3.
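The trajectory recipe above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the lead rate matrix is taken in the standard single-level form with one effective rate `Gamma` and Fermi factor `f` (the article's Eq. (2) is not reproduced here), the $\chi$ integral is replaced by a discrete Fourier sum with `n_max` points (which requires `n_max` well above $\gamma\Delta t$), and all names and parameter values are hypothetical.

```python
import numpy as np

def rate_matrix(chi, gamma, sigma, Gamma, f):
    """Generalized rate matrix W(chi) acting on P = (p_E, p_F)^T."""
    # Assumed lead contribution: filling with rate Gamma*f, emptying with Gamma*(1-f).
    W_leads = Gamma * np.array([[-f, 1.0 - f], [f, -(1.0 - f)]], dtype=complex)
    phase = np.exp(1j * chi) - 1.0
    W_dt = np.diag([gamma * phase, gamma * (1.0 - sigma) ** 2 * phase])
    return W_leads + W_dt

def expm2(M):
    """Closed-form exponential of a 2x2 complex matrix."""
    half_tr = 0.5 * (M[0, 0] + M[1, 1])
    N = M - half_tr * np.eye(2)                 # traceless part, N @ N = s2 * 1
    s = np.sqrt(N[0, 0] ** 2 + N[0, 1] * N[1, 0] + 0j)
    sinhc = np.sinh(s) / s if abs(s) > 1e-12 else 1.0
    return np.exp(half_tr) * (np.cosh(s) * np.eye(2) + sinhc * N)

def conditional_vectors(P, dt, n_max, **kw):
    """Unnormalized projected vectors P^(n), n = 0..n_max-1, by Fourier inversion."""
    chis = 2 * np.pi * np.arange(n_max) / n_max
    prop = np.array([expm2(rate_matrix(c, **kw) * dt) @ P for c in chis])
    # np.fft.fft sums a_k * exp(-2*pi*i*k*n/N), matching the e^{-i n chi} kernel.
    return np.real(np.fft.fft(prop, axis=0)) / n_max

def trajectory(steps, dt, rng, n_max=256, **kw):
    """Sample a sequence of QPC currents I = n/dt with repeated projection."""
    P = np.array([1.0, 0.0])                    # start with an empty dot
    currents = []
    for _ in range(steps):
        Pn = conditional_vectors(P, dt, n_max, **kw)
        probs = np.clip(Pn.sum(axis=1), 0.0, None)
        probs /= probs.sum()
        n = rng.choice(n_max, p=probs)
        P = np.clip(Pn[n], 0.0, None)
        P /= P.sum()                            # normalize the projected vector
        currents.append(n / dt)
    return np.array(currents)

rng = np.random.default_rng(1)
I_t = trajectory(200, dt=1.0, rng=rng, n_max=128,
                 gamma=20.0, sigma=0.5, Gamma=0.05, f=0.5)
print(I_t[:20])
```

For `Gamma` much smaller than `gamma`, the sampled currents in this sketch cluster around $\gamma$ and $\gamma(1-\sigma)^2$ with occasional switching between the two levels, which is the qualitative behaviour described for Fig. 3.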
In the simple case when $\epsilon_L = \epsilon_R = \epsilon$, the spectrum and the eigenvectors of the TQD part of the Hamiltonian (5) can be computed analytically. The eigenstates can be grouped into states with the same total particle number, with energies We note that the TQD energies become nearly degenerate for small $\Gamma_\alpha \delta_\alpha$. Applying the master equation formalism to it should be well justified when $\delta_\alpha \beta \ll 1$. Applying the secular approximation on top, however, should only be admissible when $\sqrt{\Gamma_\alpha \delta_\alpha} \gg \delta_\alpha$. Furthermore, it should be noted that out of the 64 matrix elements of the TQD density matrix, not all are allowed within our treatment, as a master equation treatment of the TQD only admits superpositions of states with the same charge on the TQD. That is, we can have coherences among the singly charged states and among the doubly charged states separately, leading to $20 = 1 + 9 + 9 + 1$ non-vanishing density matrix elements in total. Taking only these physically allowed matrix elements into account and then performing the partial trace over the left and right dots, we obtain that the reduced density matrix of the central dot must always be diagonal.
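The counting of allowed matrix elements and the diagonality of the reduced central-dot state can be illustrated numerically. The sketch below assumes an occupation-number (qubit-like) representation of the three dots with basis index $4 n_L + 2 n_C + n_R$ and ignores fermionic sign factors, which do not affect which elements vanish. The random state and all function names are illustrative only.

```python
import numpy as np
from math import comb

# Blocks of fixed particle number N = 0..3 have dimension C(3, N).
n_allowed = sum(comb(3, k) ** 2 for k in range(4))
print(n_allowed)  # 20

def number_conserving_state(rng):
    """Random density matrix without coherences between different charge sectors."""
    dim = 8
    occ = np.array([bin(b).count("1") for b in range(dim)])   # total occupation
    X = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = X @ X.conj().T                                      # positive matrix
    mask = occ[:, None] == occ[None, :]                       # keep equal-charge blocks
    rho = rho * mask
    return rho / np.trace(rho).real, mask

def reduce_to_central(rho):
    """Partial trace over left and right dots; index order (n_L, n_C, n_R)."""
    r = rho.reshape(2, 2, 2, 2, 2, 2)
    return np.einsum("ijkilk->jl", r)

rng = np.random.default_rng(0)
rho, mask = number_conserving_state(rng)
rho_c = reduce_to_central(rho)
print(int(mask.sum()))      # number of allowed matrix elements
print(np.round(rho_c, 3))   # off-diagonal elements vanish
```

The off-diagonal element of the reduced state would require identical left and right occupations but different central occupation, hence different total charge, which is exactly what the block structure excludes.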

Correlation functions
Identifying the coupling operators between the TQD and the residual reservoirs as we can represent the non-vanishing correlation functions as (compare e.g. Chapter 5 in Ref. 42) and similarly for $C_{34}(\tau)$ and $C_{43}(\tau)$ by replacing $L \to R$. From this, we can read off the Fourier transform of e.g. the left-associated correlation functions

Born-Markov master equation in absence of feedback
We can decide not to perform the secular approximation, but only the Born and Markov approximations. This will in general not lead to a Lindblad-type master equation, but we can nevertheless expect that for weak residual couplings $\delta_\alpha$ it will approximately preserve the basic thermodynamic properties of the system. In our case, it assumes the form where (similar for the right lead, $L \to R$ and $12 \to 34$) Here, we have used the TQD energy eigenbasis $H_S |a\rangle = E_a |a\rangle$ (compare App. C 1) and the half-sided Fourier transform of the correlation function These can be rewritten using the convolution theorem where we have inserted the Fourier transform of the Heaviside-$\Theta$ function. These principal value integrals can in principle be evaluated numerically, but for a Lorentzian tunneling rate we may also obtain an analytic solution in terms of Polygamma functions (not shown for brevity). For flat tunneling rates, this Lamb shift contribution diverges logarithmically. For example, for a Lorentzian tunneling rate $\Gamma^{(1)}_\alpha(\omega)$ we would get for large bandwidth $\delta_\alpha$ (compare e.g. App. C of Ref. 57) where $\Psi(x)$ denotes the Digamma function, and similarly for the right-associated rates. However, since in our approach the transformed TQD has flat tunneling rates $\Gamma^{(1)}_\alpha(\omega) = 2\delta_\alpha$, we have to consider the limit $\delta_\alpha \to \delta_{\rm cutoff} \to \infty$ and $\Gamma_\alpha \to 2\delta_\alpha$ in the above equations. Numerically, we see that the currents and steady states do not depend on the cutoff width $\delta_{\rm cutoff}$.

Appendix D: SQD efficiency
To evaluate the efficiency of information conversion of an SQD treatment, we can use earlier results 12 such that an efficiency of information conversion is, in the region of $P_{\rm el} \ge 0$, given by (we assume equal temperatures and in the SQD treatment both energy and matter currents are conserved) Using the numerical values in the figures in the main article, this is much larger than the efficiency in Fig. 5 in the weak-coupling limit.

Appendix E: Second law with included detector
One can also derive the second law by considering the implementation of the measurement apparatus in more detail. Then, it is not necessary to perform an average over different trajectories. We note that the treatment here fits in the framework of repeated interactions 51, where the QPC measurements and external feedback operations assume the role of the units, which however are subject to a nonequilibrium environment. It is sufficient to remain at the level of the average density matrix $\rho_{SD}$ involving TQD system and detector and the corresponding reduced density matrices of system $\rho_S = \mathrm{Tr}_D\{\rho_{SD}\}$ and detector $\rho_D = \mathrm{Tr}_S\{\rho_{SD}\}$, respectively. We will consider finite measurement and control times $\Delta t$, $\Delta\tau$, dropping however for brevity their dependence in the stroboscopic stationary state.
Right before the measurement, the joint density matrix is given by and the mutual information $I \equiv S_{\rm vN}(\rho_S) + S_{\rm vN}(\rho_D) - S_{\rm vN}(\rho_{SD}) \ge 0$ of this state actually vanishes.
As the first part of the measurement, we let TQD system and QPC interact during the time interval $\Delta\tau$, leading to the joint density matrix where the measurement superoperators are defined in Eq. (12). We note that the joint entropy of this state is exactly given by the sum of the Shannon entropy of the detector and the averaged entropy of the system 38, $S_{\rm vN}(\rho^{(1)}_{SD}) = -\sum_n P_n \ln P_n + \sum_n P_n S_{\rm vN}\left(\frac{\mathcal{M}_n \bar\rho}{P_n}\right)$, where as before $P_n = \mathrm{Tr}\{\mathcal{M}_n \bar\rho\}$. Also, we can calculate the reduced density matrices We can confirm its non-negativity by inequality (32). During control, we apply the conditional evolution, leading to The entropy of this state can also be additively decomposed into the Shannon entropy of the detector and the averaged system entropy, $S_{\rm vN}(\rho^{(2)}_{SD}) = -\sum_n P_n \ln P_n + \sum_n P_n S_{\rm vN}\left(\frac{e^{\mathcal{L}_n(\Delta t - \Delta\tau)} \mathcal{M}_n \bar\rho}{P_n}\right)$. By construction, the reduced density matrices become for which we can also confirm the non-negativity by inequality (32). Thus, not all of the mutual information is used to perform the feedback operation.
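The entropy decomposition above translates directly into a way to evaluate the mutual information after the measurement. The following sketch assumes a joint state of the form $\sum_n \rho^{(n)} \otimes |n\rangle\langle n|$ with unnormalized conditional states $\rho^{(n)}$, and uses an illustrative single-dot measurement of assumed Kraus form $\rho^{(n)} = K_n \rho K_n$. The operator form, function names, and parameter values are assumptions for illustration and are not taken from the article.

```python
import numpy as np
from math import factorial, sqrt

def vn_entropy(rho):
    """Von Neumann entropy -Tr(rho ln rho) of a normalized Hermitian matrix."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-14]
    return float(-np.sum(w * np.log(w)))

def mutual_information(cond_states):
    """I = S(rho_S) + S(rho_D) - S(rho_SD) for rho_SD = sum_n rho^(n) (x) |n><n|."""
    probs = np.array([np.trace(r).real for r in cond_states])
    keep = probs > 1e-14
    shannon = float(-np.sum(probs[keep] * np.log(probs[keep])))   # S(rho_D)
    avg_sys = sum(p * vn_entropy(r / p)
                  for p, r in zip(probs, cond_states) if p > 1e-14)
    s_joint = shannon + avg_sys                                   # S(rho_SD)
    s_sys = vn_entropy(sum(cond_states))                          # S(rho_S)
    return s_sys + shannon - s_joint

def qpc_conditional_states(rho, n_max, gamma, sigma, dtau):
    """Assumed Kraus form of a Poissonian QPC measurement on a single dot."""
    a = np.array([1.0, 1.0 - sigma])
    states = []
    for n in range(n_max):
        pref = (gamma * dtau) ** (n / 2) / sqrt(factorial(n))
        K = np.diag(pref * a**n * np.exp(-gamma * dtau * a**2 / 2))
        states.append(K @ rho @ K)
    return states

rho0 = np.array([[0.6, 0.3], [0.3, 0.4]])
for sigma in (0.0, 0.45):
    states = qpc_conditional_states(rho0, 80, gamma=10.0, sigma=sigma, dtau=1.0)
    print(sigma, mutual_information(states))
```

In this sketch an insensitive detector ($\sigma = 0$) yields vanishing mutual information, while a finite $\sigma$ gives a positive value, consistent with the non-negativity stated in the text.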
Finally, we reset the detector to its initial value by $\rho^{(3)}_{SD} = \sum_m |0\rangle\langle m| \rho^{(2)}_{SD} |m\rangle\langle 0|$, closing the loop. In terms of units, we would replace the old unit by a new one. When the detector is reset, we discard the mutual information $I^{(2)}$, which explains the detrimental effect on performance. Explicitly, this resetting yields where we have used that we operate at (stroboscopic) steady state. Accordingly, also the entropies must be the same after one feedback cycle.
During control, the detector does not change, and as we have a conventional evolution, we have for the change of total entropies 47 with irreversible entropy production $\Delta_i S_{\rm ct} \ge 0$ and heat transfers from the reservoirs $\Delta Q^{(\alpha)}$. The measurement-associated contributions can be separated into the which upon invoking Eq. (32) yields the same second law (34) as in the main manuscript. We see that the last two terms on the r.h.s. correspond to the mutual information $I^{(2)}$ that is discarded in the resetting of the detector. Finally, we also mention that we can express the average of the measurement entropy change as $\Delta S_{\rm ms} = S_{\rm vN}(\rho_{SD}) - I^{(2)}$ (E10), which demonstrates that it is not only the mutual information gathered during the measurement which bounds the performance, but also how much of it is actually used during the feedback.