Quantifying memory capacity as a quantum thermodynamic resource

The information-carrying capacity of a memory is known to be a thermodynamic resource facilitating the conversion of heat to work. Szilard's engine explicates this connection through a toy example involving an energy-degenerate two-state memory. We devise a formalism to quantify the thermodynamic value of memory in general quantum systems with nontrivial energy landscapes. Calling this the thermal information capacity, we show that it converges to the non-equilibrium Helmholtz free energy in the thermodynamic limit. We compute the capacity exactly for a general two-state (qubit) memory away from the thermodynamic limit, and find it to be distinct from known free energies. We outline an explicit memory--bath coupling that can approximate the optimal qubit thermal information capacity arbitrarily well.

FIG. 1: Operation of a thermally passive memory: the writing process must establish correlations between the pertinent information variable K (contained in a classical system C) and the quantum memory M, working in a power-deficient environment; the information can later be read through a powered and unrestricted process, recovering an estimate K̃ of the original information. The information written is quantified by the correlations established between C and M.

This paper formalizes the quantum thermodynamic value of such a general memory by conceptualizing a thermally passive memory: writing onto such a memory is constrained to use no free energy other than what the memory's initial state carries intrinsically. We model such a writing process by thermal operations [19], and define the thermal information capacity of the memory's state as the amount of classical information that thermal operations can encode onto it. We show that this capacity is a thermodynamic resource in its own right, distinct from the Gibbs free energy and converging to the latter only in the thermodynamic limit. We quantify this resource exactly for general qubit states, explicitly constructing an optimal thermal-operation encoding scheme. We characterize the capacity's relation to temperature, energy gaps, and the Gibbs free energy. We then outline a schematic interaction between a qubit memory and a bosonic bath that approximates the optimal encoding scheme arbitrarily well, finding a tradeoff between the closeness of the approximation and the time taken to run the encoding process.
Framework. The working of a thermally passive memory is depicted schematically in Fig. 1: the information contained in some classical system C is to be written onto a quantum memory system, M, in a thermal environment and without access to external free energy; the retrieval (readout) of this information is unrestricted and allowed to use free energy and other resources. In the present work we focus only on the writing step, and ask how much classical information can be written in a way that can be reliably recovered by the most general readout.
Represent the classical information in C by a classical variable $K \equiv \{(k, p_k)\}$ distributed according to the probabilities $p_k$. In bra-ket notation, this can be denoted $\kappa_C = \sum_k p_k |k\rangle\langle k|$. Encoding this information onto M entails changing the state of M from some initial $\rho$ to an ensemble of $k$-dependent "codeword" states $\sigma^{(k)}$. In effect, the state of the CM composite changes from the uncorrelated $\kappa_C \otimes \rho_M$ to $\sum_k p_k |k\rangle\langle k| \otimes \sigma^{(k)} \equiv \sigma_{CM}$, which contains classical-quantum (CQ) correlations.
The subsequent readout (which we do not specify) is in general some measurement of M, resulting in an estimate $\tilde K$ of $K$. The amount of information written and recovered is quantified by the correlation between these two classical variables. Allowing the most general readout, Holevo's theorem states that this correlation is bounded above by the mutual information between C and M in the state $\sigma_{CM}$, given by
$$I(C:M)_\sigma = S\Big(\sum_k p_k \sigma^{(k)}\Big) - \sum_k p_k S\big(\sigma^{(k)}\big),$$
where $S(\sigma) := -\mathrm{Tr}(\sigma \log_2 \sigma)$ is the von Neumann entropy of the argument. If we abbreviate the code, i.e. the ensemble of codewords, as $C \equiv \{(p_k, \sigma^{(k)})\}$, the quantity $I(C:M)_\sigma$ is also denoted by $\chi(C)$, called the Holevo information of $C$.
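As a numerical illustration (ours, not part of the original analysis), the Holevo information above can be evaluated for any finite ensemble from eigenvalues of the codewords and their average; the function names below are our own.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log2 rho), in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]          # convention: 0 log 0 := 0
    return float(-np.sum(evals * np.log2(evals)))

def holevo_information(probs, codewords):
    """chi(C) = S(sum_k p_k sigma_k) - sum_k p_k S(sigma_k)."""
    avg = sum(p * sig for p, sig in zip(probs, codewords))
    return von_neumann_entropy(avg) - sum(
        p * von_neumann_entropy(sig) for p, sig in zip(probs, codewords))

# Example: two orthogonal pure qubit codewords used with equal probability
ket0 = np.array([[1.0], [0.0]])
ket1 = np.array([[0.0], [1.0]])
chi = holevo_information([0.5, 0.5], [ket0 @ ket0.T, ket1 @ ket1.T])
# chi = 1 bit: orthogonal pure states are perfectly distinguishable
```

This matches the intuition that a qubit can carry at most one reliably recoverable classical bit.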
Thus far, we have described general codes C. However, thermally passive encoding limits the codewords accessible from a given initial state ρ. We model the encoding by thermal operations [19], which describe the dynamics of a system in contact with a heat bath. Let M's environment be a bath at temperature T. Left to equilibrate with the environment, M would eventually reach its thermal, or Gibbs, state $\gamma \propto \exp(-H_M/k_B T)$, where $H_M$ is the free Hamiltonian of M. A thermal operation is a controlled interaction of M with a thermal auxiliary system A, whose effect can be interpreted as partial thermalization. It starts with M in some initial state ρ uncorrelated with A (which, by virtue of being thermal, is in its own local Gibbs state $\gamma_A$), followed by turning on an arbitrary energy-conserving interaction between M and A, and then decoupling the two again. The thermal operations model leaves the choice of system A arbitrary, so long as it is prepared in the thermal state determined by its own Hamiltonian and the bath's temperature.
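To make this concrete, here is a minimal sketch (our own construction, not from the paper) of one thermal operation on a qubit: an energy-conserving partial swap with a resonant thermal auxiliary qubit. Since the interaction acts only within the degenerate {|01⟩, |10⟩} subspace, it conserves energy, and the Gibbs state is a fixed point.

```python
import numpy as np

def gibbs_qubit(dE, kT):
    """Qubit Gibbs state diag(1, e^{-dE/kT}) / Z in the energy basis."""
    w = np.array([1.0, np.exp(-dE / kT)])
    return np.diag(w / w.sum())

def thermal_op(rho, gamma_A, theta):
    """Energy-conserving partial swap between memory M and a thermal
    auxiliary qubit A with the same energy gap, then trace out A.
    Basis order of MA: |00>, |01>, |10>, |11>."""
    U = np.eye(4)
    c, s = np.cos(theta), np.sin(theta)
    U[1:3, 1:3] = [[c, -s], [s, c]]   # rotation in the degenerate subspace
    chi = U @ np.kron(rho, gamma_A) @ U.T
    # partial trace over A (axes 1 and 3 after reshaping)
    return chi.reshape(2, 2, 2, 2).trace(axis1=1, axis2=3)

gamma = gibbs_qubit(dE=1.0, kT=0.5)
rho = np.array([[0.9, 0.1], [0.1, 0.1]])      # an athermal initial state
out = thermal_op(rho, gamma, theta=np.pi / 4)  # partially thermalized
fixed = thermal_op(gamma, gamma, theta=np.pi / 4)  # Gibbs state is invariant
```

The angle θ interpolates between no interaction (θ = 0) and a full swap within the degenerate subspace (θ = π/2).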
For a given initial state ρ of M, define $\mathcal{C}(\rho)$ as the set of all codes $C \equiv \{(p_k, \sigma^{(k)})\}$ consisting exclusively of codewords $\sigma^{(k)}$ accessible from ρ by thermal operations. In the equivalent representation in terms of CQ states, $\mathcal{C}(\rho)$ corresponds to the set of all states $\sigma_{CM}$ accessible from ρ under a generalized class of processes called conditioned thermal operations [25]. This set represents all possible ways that classical information can be written onto M, allowing arbitrary variations in the classical variable K being written. Our main quantity of interest is the optimal amount of information that can be written in this way, given an initial resource state:

Definition 1 (Thermal information capacity). The thermal information capacity (TIC) of the thermal memory M initialized in blank state ρ is defined as
$$I_{\mathrm{th}}(\rho) := \sup_{C \in \mathcal{C}(\rho)} \chi(C).$$

From the properties of the Holevo information, it follows that the TIC is always nonnegative. Another property of the TIC is that, as a function of the input state, it is monotonically non-increasing under thermal operations: for any thermal operation $\mathcal{E}$,
$$I_{\mathrm{th}}(\mathcal{E}[\rho]) \le I_{\mathrm{th}}(\rho).$$
Thus, as expected, the TIC is a measure of the thermodynamic resourcefulness of the state ρ, akin to free energy functions: a thermal operation acting on a given state can only result in a state with equal or lower TIC. It vanishes when ρ = γ, and takes higher values for non-thermal ρ.
Thermodynamic limit. A helpful starting point for investigating the TIC is its thermodynamic limit, which concerns the average behaviour over a large number of independent, identically-prepared (i.i.d.) instances. What is the optimal TIC per copy of a resource state ρ in the thermodynamic limit? More precisely, this quantity is defined as
$$I_{\mathrm{th}}^{\infty}(\rho) := \lim_{n\to\infty} \frac{1}{n}\, I_{\mathrm{th}}\big(\rho^{\otimes n}\big).$$
Apart from its own operational significance, the limiting i.i.d. value is useful as an upper bound on the single-copy TIC. While the latter is in general difficult to compute, the i.i.d. limit can be calculated exactly using the theory of asymptotic equipartition, leading to the following result.

Proposition 1. In the thermodynamic limit of infinitely many, independent and identically distributed (i.i.d.) copies, the optimal thermal information capacity per copy of a memory state ρ is given by $I_{\mathrm{th}}^{\infty} = F(\rho)$, a quantum generalization of the Gibbs free energy, defined as the quantum relative entropy of ρ with respect to the Gibbs state γ:
$$F(\rho) := S(\rho\,\|\,\gamma) \equiv \mathrm{Tr}(\rho \log_2 \rho) - \mathrm{Tr}(\rho \log_2 \gamma).$$
We provide the proof in Appendix A. Note that F is measured in units of work bits, or "wits", each of which is thermodynamically equivalent to one energy-degenerate qubit prepared in a pure state. Our framework of the TIC is consistent with the general principle that any measure of thermal resourcefulness converges to the Gibbs free energy in the thermodynamic limit [26,27]. We now turn to the study of the TIC in the non-i.i.d., or single-shot, regime. The science of general coherent thermal operations in this regime is nontrivial, but the special case of two-level systems, or qubits, is relatively tractable.
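The free energy $F(\rho) = S(\rho\|\gamma)$ is straightforward to evaluate numerically; the sketch below (our own helper, with illustrative parameter values) computes it for a pure excited state and confirms that it vanishes at equilibrium.

```python
import numpy as np

def relative_entropy(rho, sigma):
    """S(rho || sigma) = Tr(rho log2 rho) - Tr(rho log2 sigma), in bits.
    Assumes sigma is full-rank (true for any Gibbs state at T > 0)."""
    er, vr = np.linalg.eigh(rho)
    es, vs = np.linalg.eigh(sigma)
    safe = np.where(er > 1e-12, er, 1.0)       # convention: 0 log 0 := 0
    log_rho = vr @ np.diag(np.log2(safe)) @ vr.conj().T
    log_sig = vs @ np.diag(np.log2(es)) @ vs.conj().T
    return float(np.real(np.trace(rho @ (log_rho - log_sig))))

# qubit Gibbs state with gap dE at temperature kT (energy basis, k_B = 1)
dE, kT = 1.0, 0.5
lam = np.exp(-dE / kT)
gamma = np.diag([1.0, lam]) / (1.0 + lam)

excited = np.diag([0.0, 1.0])                  # pure excited state |1><1|
F = relative_entropy(excited, gamma)           # free energy, in "wits"
F_gibbs = relative_entropy(gamma, gamma)       # vanishes at equilibrium
```

For the pure excited state the closed form is $F = \log_2[(1+\lambda)/\lambda]$, which the numerics reproduce.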
2-level memory. Consider a qubit memory M governed by a (generally non-degenerate) Hamiltonian $H_M = E_0 |0\rangle\langle 0| + E_1 |1\rangle\langle 1|$ and immersed in an environment at ambient temperature T. Computing the TIC (Definition 1) of a given initial state ρ entails searching over the set $\mathcal{C}(\rho)$ of codes accessible from ρ. The concavity of the von Neumann entropy implies that codes containing only extreme points of the accessible set will attain the optimum. This and other simplifications (cf. Appendix B) lead to our main result:

Theorem 2. For a qubit memory M, an optimal code accessible thermally from an initial state ρ is of the form
$$C_{\mathrm{opt}}(q) = \left\{\left(\tfrac{q}{2}, \rho\right), \left(\tfrac{q}{2}, Z\rho Z\right), \left(1-q, \hat\rho\right)\right\},$$
where $q \in [0,1]$, $Z = |0\rangle\langle 0| - |1\rangle\langle 1|$, and $\hat\rho$ is the state at the tip of the accessible set (Fig. 2). The thermal information capacity (TIC) of ρ can then be determined by carrying out the single-parameter optimization
$$I_{\mathrm{th}}(\rho) = \max_{q\in[0,1]} \chi\big(C_{\mathrm{opt}}(q)\big).$$

FIG. 3: Thermal information capacity (TIC) over different blank-memory states in the X + Z section of the Bloch ball, for a qubit memory (with energy gap ∆E) at various temperatures. The TIC of the Gibbs state γ is zero, and is higher for states further away from γ. The zero-temperature limit behaviour persists at temperatures as high as 0.1 ∆E/k_B; significant variation ensues in the O(∆E/k_B) temperature range, while the high-temperature limit resembles the information landscape of an energy-degenerate qubit memory.

This optimization can easily be carried out numerically. Figure 3 depicts the result: the TIC as a function of the initial state ρ, at various temperatures measured in relation to $\Delta E \equiv E_1 - E_0$. The TIC understandably vanishes when ρ equals the Gibbs state γ, and increases with athermality, i.e. the departure of ρ from this state. The Gibbs free energy F(ρ) [Eq. (5)] is an operationally meaningful measure of athermality, and so we investigate the behaviour of $I_{\mathrm{th}}(\rho)$ in relation to F(ρ) (Fig. 4). We see that the two resources vary similarly with ρ, but less so at lower temperatures.

FIG. 4: Scatter plots of the thermal information capacity vs. Gibbs free energy of qubit memory states: while the two resources are correlated in their state-dependence, they are distinct, particularly at lower temperatures. In each plot, the top-right point of maximum capacity corresponds to the initial state ρ = |1⟩⟨1|, the pure excited state. The maxima occurring to the left of this point correspond to initial states along the equator, e.g. ρ = |+⟩⟨+|. In the T → ∞ limit, the two maximal regions become more and more similar in their Gibbs free energy, as the latter converges to the purity (or "negentropy") of ρ.

In Appendix C, we examine
the TIC in relation with other resourcefulness measures, namely the purity and the relative entropy of coherence; we find the Gibbs free energy to be better than these other resources as an indicator of the TIC. This is understandable, given the asymptotic convergence of the TIC to the free energy (Proposition 1).
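Concretely, the single-parameter optimization of Theorem 2 can be carried out by a simple grid search. The sketch below is our own; in particular, the choice of tip state ρ̂ (a pure excited state) is purely illustrative, since the true tip state depends on the accessible set of Fig. 2.

```python
import numpy as np

def S(rho):
    """Von Neumann entropy in bits."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

Z = np.diag([1.0, -1.0])

def tic_qubit(rho, rho_hat, n_grid=2001):
    """Grid search of chi over the one-parameter family of Theorem 2:
    codewords {rho, Z rho Z, rho_hat} with probabilities {q/2, q/2, 1-q}."""
    refl = Z @ rho @ Z                 # reflection about the Z axis
    s_rho, s_hat = S(rho), S(rho_hat)  # note S(Z rho Z) = S(rho)
    best = 0.0
    for q in np.linspace(0.0, 1.0, n_grid):
        avg = 0.5 * q * (rho + refl) + (1 - q) * rho_hat
        best = max(best, S(avg) - (q * s_rho + (1 - q) * s_hat))
    return best

plus = 0.5 * np.ones((2, 2))           # |+><+|, a maximally coherent state
tic = tic_qubit(plus, np.diag([0.0, 1.0]))   # -> 1.0 bit for this choice
```

For ρ = |+⟩⟨+| the optimum sits at q = 1, where the two reflected codewords average to the maximally mixed state.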
Towards implementation. The thermal operations framework, which we have used to model the encoding process, is agnostic about the existence of a practically feasible auxiliary system A and coupling to realize a desired thermal operation (see [28] for a detailed discussion). Thus, we would like to go beyond the abstraction of thermal operations and construct a concrete realization. To this end, we now probe an interaction of the qubit memory M with a bosonic-mode bath tuned to M's energy gap, coupled to the memory via a Jaynes-Cummings interaction. We refer again to Fig. 2, showing the three states constituting an optimal code obtainable from a given initial state. The initial state itself is one of these; another results from reflecting the initial state about the Pauli Z axis, while the third lies at the tip of the convex cone of accessible states. Reflection about Z is represented by the unitary transformation Z, which can be effected simply by evolving the memory system under its free Hamiltonian for a suitable length of time. Transforming to the third codeword state, however, requires population inversion relative to the initial state, which cannot be achieved perfectly by a Jaynes-Cummings coupling, owing to asynchronicity between the Rabi oscillations within different memory-bath energy levels. Nevertheless, we found that the optimal capacity can be approximated arbitrarily well, albeit at the cost of longer running time (Fig. 5): this mirrors the power-efficiency tradeoff in the performance of heat engines.

FIG. 5: Time taken by a Jaynes-Cummings coupling to approximate the optimal qubit thermal information capacity to various efficiencies, vs. bath temperature. The speed-efficiency tradeoff is reminiscent of a heat engine's performance.

The phase-transition-like jumps occur due to the above-mentioned Rabi oscillations, whose collective effect on the qubit's marginal state is irregular in time. The degree of population inversion required to meet a given efficiency is generally achieved at similar times over short ranges of temperature; but at certain critical temperatures where it just begins to fail, the irregular time-dependence of the population inversion leads to a long period of oscillations over which this failure persists, until a sufficient inversion level is finally reached in a different time regime. This new inversion level again remains sufficient to meet the required efficiency, until the next critical temperature is hit, and so on. The downward dip of some of the curves with increasing temperature seems counterintuitive; we conjecture that it is a consequence of the fall in the optimal capacity with increasing temperature, rendering the capacity easier to approach. Technical details about these results are provided in Appendix D.
Conclusion. Among the many facets of the connection between information and thermodynamics is the work value of a memory, epitomized by Szilard's engine and the Landauer principle. In this work, we probed the thermodynamic limitations of the capacity of a quantum system to store information. We defined a thermally passive quantum memory as one which is written onto without access to free energy sources. The storage capacity of such a passive memory converges to the quantum Gibbs free energy of its initial state in the thermodynamic limit. We then computed the optimal single-copy capacity of a qubit memory, showing it to be distinct from the free energy. Finally, we described a proposal for approximating the optimal encoding strategy through a Jaynes-Cummings interaction of the memory with a bosonic bath.
The study of quantum coherence in thermodynamics has risen as a topic of interest in the past few years. In addition to much work on the basic role of coherence in thermal evolution [29][30][31], its role in the task of work extraction has been studied in some detail [32][33][34]. But other operational tasks in thermodynamics, particularly information-processing tasks [16,35,36], have received less attention with regard to the role of genuinely quantum properties. By focusing on information storage, we have taken a step towards filling this gap.
Our results can be adapted to applications involving probes that record information in remote, noisy, or power-deprived locations, in the spirit of quantum illumination [37]. They also pertain to the first step in the operation of a generalized Szilard engine, namely imprinting the outcome of a measurement of the working medium onto a passive memory. Subsequent steps involve leveraging the stored information to act back on the working medium and extract work, and then re-initializing the memory to complete a cycle. Such cyclic conversion between athermality and correlations is a natural extension of the theme of [12]. There is also potential for extending our ideas to the treatment of a memory that stores quantum information.
A. TIC in the thermodynamic limit

We are interested in determining the optimal TIC rate under the class of thermal operations (TO) on infinitely many copies of a d-dimensional elementary system with Hamiltonian H and associated Gibbs state γ. For convenience, we assume H has no degeneracy; our arguments can easily be generalized to degenerate cases. The result of [38] states that, given two resources ρ and σ, the conversion $\rho^{\otimes m} \otimes \gamma^{\otimes n_m} \to \sigma^{\otimes n_m} \otimes \gamma^{\otimes m}$ in the limit m → ∞ is possible under TO (allowing a conversion error that vanishes in the limit) at the optimal rate
$$\limsup_{m\to\infty} \frac{n_m}{m} = \frac{F(\rho)}{F(\sigma)},$$
where $F(\rho) := S(\rho\|\gamma) = \mathrm{Tr}(\rho \log_2 \rho) - \mathrm{Tr}(\rho \log_2 \gamma)$. We first convert the given m copies of the general resource ρ to some standard resources with the same amount of free energy; the asymptotic reversibility mentioned above ensures that the TIC of these standard resources (which happens to be easier to calculate) is equal to that of the general ones.
The standard resources of our choice are pure states of the form
$$\Psi(\mathbf{j}) = |j_1\rangle \otimes |j_2\rangle \otimes \cdots \otimes |j_n\rangle,$$
where $\mathbf{j}$ is a collection of (an as-yet-unspecified number) n indices, each chosen from {0, 1, ..., d − 1}. The energy of this state is given by
$$E(\mathbf{j}) = \sum_{k=1}^{n} E_{j_k}.$$
The number n is expected to be very large, while the possible values for each $j_k$ number d. Therefore, $\mathbf{j}$ will typically have repeating indices. Define the vector of frequencies, $\mathbf{f} \in \mathbb{N}^d$, by $f_j := |\{k : j_k = j\}|$. Then, the rank of the degenerate subspace of energy $E(\mathbf{j})$ is given by the multinomial coefficient
$$\mu(\mathbf{f}) = \binom{n}{f_0, f_1, \ldots, f_{d-1}} = \frac{n!}{f_0!\, f_1! \cdots f_{d-1}!}.$$
Arbitrary unitaries within this subspace are energy-conserving, and therefore TO. Thus, starting from $\Psi(\mathbf{j})$ (or any other pure state in this subspace), we can use TO to construct an ensemble of $\mu(\mathbf{f})$ equally-probable orthonormal pure states, which achieves a Holevo rate of $\log_2 \mu(\mathbf{f})$. We now draw inspiration from the theory of asymptotic equipartition to determine the best choice of $\mathbf{f}$. As an Ansatz, let us fix n and set $f_j = g_j n$, where the $g_j$ are the Gibbs weights, so that $Z = \sum_j g_j$ is the single-system partition function. Using Eq. (A.2), we find the number of initial copies of ρ required for constructing $\Psi(\mathbf{j})$. Using this construction, we can lower-bound the asymptotic TIC rate defined in Eq. (A.1). To see that this is also an upper bound, we note the following. The final memory state contains correlations with the classical variable, which are themselves a thermodynamic resource that can be capitalized on to recover copies of the original resource ρ at precisely the rate F(ρ). If F(ρ) were not also an upper bound, more of the initial resource could be reconstructed than we began with, leading to a net creation of resource under TO. Since this is forbidden, the bound works both ways, establishing Proposition 1 of the main text.
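The counting step above can be checked numerically: the per-copy Holevo rate log₂ μ(f)/n approaches the Shannon entropy of the frequency distribution, in line with asymptotic equipartition. This is our own sketch; the distribution g below is illustrative.

```python
import math

def log2_multinomial(f):
    """log2 of n! / (f_0! ... f_{d-1}!) for the frequency vector f."""
    n = sum(f)
    return (math.lgamma(n + 1) - sum(math.lgamma(k + 1) for k in f)) / math.log(2)

def shannon_bits(p):
    """Shannon entropy H(p) in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

# Asymptotic equipartition: log2 mu(f) ~ n * H(f/n) as n grows
n = 10_000
g = [0.7, 0.2, 0.1]                    # illustrative single-copy distribution
f = [int(round(x * n)) for x in g]     # frequency vector at this n
rate = log2_multinomial(f) / n         # approaches H(g) ~ 1.157 bits
```

The O(log n / n) Stirling correction is already below 0.002 bits at n = 10⁴.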

B. Technical results for qubit TIC
Here we provide the technical results used in proving Theorem 2 about the TIC under qubit TO. We adopt a convenient shorthand, denoting a general state of the qubit memory M by
$$\eta[a, b] := \begin{pmatrix} a & b \\ b^* & 1-a \end{pmatrix},$$
where the matrix representation is relative to the energy basis $\{|0\rangle, |1\rangle\}$. The Gibbs state is given by $\gamma = \eta[g, 0]$. Our aim is to compute the TIC, defined as
$$I_{\mathrm{th}}(\rho) = \sup_{C \in \mathcal{C}(\rho)} \chi(C). \qquad \mathrm{(B.3)}$$

Observation B.1. The optimization in Eq. (B.3) can be restricted to codes C containing only the extreme points $\eta[s, \kappa_s e^{i\phi}]$ of $\vartheta(\rho)$. For, if some code $\tilde C$ contains the codeword $(p, q\sigma_1 + [1-q]\sigma_2)$ with p > 0, 0 < q < 1, and $\sigma_1 \ne \sigma_2$ both in $\vartheta(\rho)$, we can construct another code C, identical to $\tilde C$ except with this codeword replaced by two others, namely $(pq, \sigma_1)$ and $(p[1-q], \sigma_2)$. By the concavity of the von Neumann entropy, $\chi(C) > \chi(\tilde C)$.
Lemma B.2. The optimization in Eq. (B.3) can be further restricted, to codes consisting solely of pairs $\eta[s, \kappa_s e^{i\phi}]$, $\eta[s, -\kappa_s e^{i\phi}]$ of extremal states lying on opposite sides of the Z axis in the Bloch representation, with both states in a pair of a given s occurring with equal probability. Explicitly, such a code takes the form
$$C = \left\{\left(\tfrac{p_j}{2}, \eta[s_j, \kappa_{s_j} e^{i\phi}]\right), \left(\tfrac{p_j}{2}, \eta[s_j, -\kappa_{s_j} e^{i\phi}]\right)\right\}_j. \qquad \mathrm{(B.6)}$$

Proof. For some code $\tilde C = \{(p_j, \eta[s_j, \beta_j])\}$ constructed from extreme points of $\vartheta_S(\rho)$, the Holevo quantity is given by
$$\chi(\tilde C) = S\Big(\sum_j p_j\, \eta[s_j, \beta_j]\Big) - \sum_j p_j\, S\big(\eta[s_j, \beta_j]\big).$$
Now, we will show that the Holevo rate of the corresponding paired code C, as in Eq. (B.6), is no smaller than $\chi(\tilde C)$.
The second line follows from the unitary relation (namely, through the unitary Z) between the states within each pair, the third from the concavity of the von Neumann entropy, and the fourth from the existence of a common unitary (again, Z) connecting corresponding codewords in the two half-codes. Together with the previous observation, this implies that an extremal code of the paired form (B.6) will attain the optimum TIC.
We now note that $S\big(\eta[s, \kappa_s e^{i\phi}]\big)$ is independent of φ; it is effectively a function of s, which we denote S(s). The Holevo information of a paired code such as in Eq. (B.6) is given by
$$\chi(C) = h(\bar s) - \sum_j p_j\, S(s_j), \qquad \bar s := \sum_j p_j s_j,$$
where we recall that h(·) denotes the binary entropy function. We can now restate the optimization in Eq. (B.3) as follows.

Proposition B.3. The TIC of $\rho = \eta[r, \alpha]$ is given by
$$I_{\mathrm{th}}(\rho) = \sup_{\{(p_j, s_j)\}} \Big[ h\Big(\sum_j p_j s_j\Big) - \sum_j p_j\, S(s_j) \Big].$$
Here it is to be understood that p is a probability distribution and that the $s_j$ are constrained to lie in $[r, 1 - \lambda r]$.
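Our reading of the paired-code Holevo expression can be confirmed numerically: pairing codewords with opposite off-diagonal signs makes the average state diagonal, so χ reduces to h(s̄) minus the average codeword entropy. The values below are illustrative.

```python
import numpy as np

def S(rho):
    """Von Neumann entropy in bits."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def h(x):
    """Binary entropy in bits."""
    return float(-x * np.log2(x) - (1 - x) * np.log2(1 - x))

def eta(a, b):
    """Qubit state with diagonal (a, 1-a) and off-diagonal b, energy basis."""
    return np.array([[a, b], [np.conj(b), 1 - a]])

# a paired code: each s_j appears with off-diagonals +kappa_j and -kappa_j
p = [0.3, 0.3, 0.2, 0.2]
states = [eta(0.7, 0.2), eta(0.7, -0.2), eta(0.4, 0.1), eta(0.4, -0.1)]

avg = sum(pj * sj for pj, sj in zip(p, states))   # diagonal by construction
chi = S(avg) - sum(pj * S(sj) for pj, sj in zip(p, states))

# the closed form: chi = h(s_bar) - sum_j p_j S(s_j)
s_bar = 0.6 * 0.7 + 0.4 * 0.4
chi_formula = h(s_bar) - (0.6 * S(eta(0.7, 0.2)) + 0.4 * S(eta(0.4, 0.1)))
```

The direct Holevo computation and the closed form agree to machine precision.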
We will prove this proposition through several steps, largely exploiting the simplicity of the qubit case, wherein all spectral properties of a density operator reduce to functions of a single parameter. In particular, consider the von Neumann entropy S(σ), introduced already, and the determinant, which we shall denote D(σ). They can both be expressed in terms of a single parameter; one possible choice is the smaller of the two eigenvalues of σ, which we denote t, noting that t ∈ [0, 1/2]. As functions of t, the von Neumann entropy and determinant are
$$S(t) = h(t) = -t \log_2 t - (1-t)\log_2(1-t), \qquad D(t) = t(1-t). \qquad \mathrm{(B.12)}$$
We note that our use of the symbols S and D here refers not to specific functional forms, but to the von Neumann entropy and the determinant treated as variables. When one of these symbols is followed by an argument, it is then (and only then) intended to convey the behaviour of the variable as a function of the said argument. In particular, this means that the functional forms S(t), S(s), and S(D) are all distinct, even though they all represent the von Neumann entropy.

Lemma B.4. Over qubit density operators, the von Neumann entropy S is an invertible function of the determinant D; specifically, S is a strictly increasing, concave function of D.
Proof. Recall that both quantities are effectively functions of the single parameter t ∈ [0, 1/2]. Both functions are well-defined and continuous in the interior of this region; S(t = 0) can be set to 0 using the limit as t → 0⁺. Denoting the total derivative with respect to t by an overhead dot,
$$\dot S(t) = \log_2\frac{1-t}{t}, \qquad \dot D(t) = 1 - 2t. \qquad \mathrm{(B.13)}$$
These are both well-defined and strictly positive in the interior of the parametric region. In other words, both S(t) and D(t) are well-defined and strictly increasing in the region; therefore, S is a strictly increasing function of D. It remains to show the function's concavity. First, using Eqs. (B.13), we have
$$\frac{dS}{dD} = \frac{\dot S}{\dot D} = \frac{1}{1-2t}\log_2\frac{1-t}{t}, \qquad \mathrm{(B.14)}$$
admitting the definition
$$\left.\frac{dS}{dD}\right|_{t=1/2} = 2\log_2 e \qquad \mathrm{(B.16)}$$
through L'Hôpital's rule and Eq. (B.14). Moving on,
$$\frac{d}{dt}\frac{dS}{dD} = \frac{f(t)}{(1-2t)^2}, \qquad \mathrm{(B.17)}$$
whose denominator is nonnegative for t ∈ [0, 1/2]. Now let
$$f(t) := 2\log_2\frac{1-t}{t} - \frac{1-2t}{t(1-t)\ln 2}.$$
Evidently, f(1/2) = 0, while one may show easily (e.g., using L'Hôpital's rule) that $f(t) \to -\infty$ as $t \to 0^+$. The function is well-defined and smooth in the interior of the region, where
$$\dot f(t) = \frac{(1-2t)^2}{t^2(1-t)^2 \ln 2} \ge 0.$$
Thus f is non-decreasing with f(1/2) = 0, so f(t) ≤ 0 throughout; by Eq. (B.17), the slope dS/dD is non-increasing in t, and therefore (since D is increasing in t) also in D. Concavity of S(D) follows from Eq. (B.17).
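Lemma B.4 is easy to corroborate numerically: tabulating S and D on a grid of t and examining finite-difference slopes of S against D exhibits a positive, decreasing slope. This quick check is ours and is no substitute for the proof.

```python
import numpy as np

t = np.linspace(0.001, 0.499, 998)      # smaller eigenvalue of a qubit state
S = -(t * np.log2(t) + (1 - t) * np.log2(1 - t))   # binary entropy h(t)
D = t * (1 - t)                         # determinant

dS_dD = np.diff(S) / np.diff(D)         # finite-difference slope of S vs D
increasing = bool(np.all(dS_dD > 0))    # S strictly increasing in D
concave = bool(np.all(np.diff(dS_dD) < 0))   # slope decreasing: S concave in D
# the slope approaches 2 log2(e) ~ 2.885 as t -> 1/2, matching Eq. (B.16)
```

The endpoint slope also reproduces the L'Hôpital limit $2\log_2 e$ of Eq. (B.16).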
Note that this functional relationship holds generally over all qubit states, although we are only interested in the parametric family $\eta[s, \kappa_s e^{i\phi}]$. Now let us return to our objective of proving Proposition B.3, which concerns the behaviour of S(s). Note that D is also effectively a function of s, given by
$$D(s) = s(1-s) - \kappa_s^2.$$
Denoting the total derivative with respect to s by a prime, we have
$$S'(s) = \frac{dS}{dD}\, D'(s).$$
The details of these derivatives are unimportant to us; what is relevant is that they are both well-defined in general, and that D'(s) is manifestly negative. Now, exploiting the bijective relationship between S and D, we obtain a bound in terms of the interpolation parameter $q_{\bar s}$, the unique number satisfying $q_{\bar s}\, r + (1 - q_{\bar s})(1 - \lambda r) = \bar s$, namely
$$q_{\bar s} = \frac{\bar s + \lambda r - 1}{(1+\lambda)r - 1}. \qquad \mathrm{(B.25)}$$
Note that $q_{\bar s}$ is well-defined except when r = g, which is in any case a trivial and uninteresting case. This brings us to our main result:

Theorem B.6 (Theorem 2 of main text). For a qubit memory M with Gibbs state $\gamma = \eta[g, 0]$, an optimal code accessible from an initial state $\rho = \eta[r, \alpha]$ is of the form
$$C_{\mathrm{opt}}(q) = \left\{\left(\tfrac{q}{2}, \eta[r, \alpha]\right), \left(\tfrac{q}{2}, \eta[r, -\alpha]\right), \left(1-q, \hat\rho\right)\right\},$$
where $\hat\rho$ is the state at the tip of the accessible set. The thermal information capacity (TIC) of ρ is given by the Holevo capacity of $C_{\mathrm{opt}}$:
$$I_{\mathrm{th}}(\rho) = \max_{q\in[0,1]} \chi\big(C_{\mathrm{opt}}(q)\big).$$
The above optimization can easily be carried out numerically, although we have been unable to find a closed analytical form for it.
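The interpolation formula (B.25) can be verified directly: substituting $q_{\bar s}$ back into the convex combination recovers $\bar s$. Below is a quick numerical check with illustrative parameter values.

```python
# check: q r + (1 - q)(1 - lam r) = s_bar, with q given by Eq. (B.25)
lam, r = 0.4, 0.2                       # illustrative values of lambda and r
for s_bar in (0.25, 0.5, 0.7, 0.9):     # points in the allowed range [r, 1 - lam r]
    q = (s_bar + lam * r - 1) / ((1 + lam) * r - 1)
    recovered = q * r + (1 - q) * (1 - lam * r)
    assert abs(recovered - s_bar) < 1e-12
```

The formula fails only at r = g, where the denominator vanishes, as noted above.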
C. The thermal information capacity in relation to other resources

Fig. 4 of the main matter shows a scatter plot of the thermal information capacity (TIC) vs. the Gibbs free energy for a representative sample of qubit initial states. Why did we choose the Gibbs free energy as a reference against which to compare the TIC, and not other relevant measures of the state's resourcefulness? An obvious motivation is the asymptotic convergence of the two quantities. Nevertheless, we also studied the TIC's relation with two other resourceful aspects of the state, namely its purity (measured by the von Neumann "negentropy", 1 − S[ρ]) and its relative entropy of coherence with respect to the energy eigenbasis, given by
$$C_{\mathrm{rel}}(\rho) = S(\rho_{\mathrm{diag}}) - S(\rho),$$
where $\rho_{\mathrm{diag}}$ is the diagonal part of ρ in the energy eigenbasis. Like the Gibbs free energy, both the purity and the coherence are useful properties in information-processing tasks, and never increase under thermal operations. In the context of the task of information storage on a memory, the purity exactly measures the information capacity in the case of an energy-degenerate memory. On the other hand, states with the highest coherence, such as |+⟩⟨+| and |−⟩⟨−|, achieve maximum TIC regardless of the temperature and energy levels. Thus, both the purity and the coherence are ostensibly indicators of the TIC. Indeed, Fig. C.1 and Fig. C.2 bear this out. However, comparing these with Fig. 4 shows that the Gibbs free energy is the resourcefulness measure most strongly correlated with the TIC at all temperatures.

D. Approximate implementation via a Jaynes-Cummings coupling

Define $\omega := \Delta E/\hbar$ and $\lambda := \exp[-\Delta E/k_B T]$. In order to implement the encoding scheme of Theorem 2, we need to be able to perform two transformations:
$$\rho \equiv \eta[r, \alpha] \;\mapsto\; \eta[r, -\alpha] = Z\rho Z, \qquad \mathrm{(D.2a)}$$
$$\rho \;\mapsto\; \hat\rho. \qquad \mathrm{(D.2b)}$$
The first, which only entails rotating the phase of the off-diagonal elements of the density operator, can be achieved simply through local evolution under the memory's free Hamiltonian.
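This first transformation can be sanity-checked in a few lines: evolving for half a free-evolution period, t = π/ω, flips the sign of the off-diagonal elements, i.e. implements Z up to a global phase. The snippet is our own illustration, with ħ = 1 and arbitrary energies.

```python
import numpy as np

E0, E1 = 0.0, 1.0                      # illustrative energies (hbar = 1)
omega = E1 - E0                        # = Delta E / hbar
H = np.diag([E0, E1])                  # free Hamiltonian in the energy basis

t = np.pi / omega                      # half a free-evolution period
U = np.diag(np.exp(-1j * np.diag(H) * t))   # e^{-i H t}, diagonal

rho = np.array([[0.6, 0.3], [0.3, 0.4]], dtype=complex)
out = U @ rho @ U.conj().T             # off-diagonals flip sign: equals Z rho Z
Z = np.diag([1.0, -1.0])
```

The diagonal populations are untouched, as expected of free evolution.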
To approximate the second transformation, we propose to couple the memory to a bosonic-mode bath through a Jaynes-Cummings interaction. The bath B is a single bosonic mode with annihilation operator $\hat b$ and free Hamiltonian
$$H_B = \hbar\omega\, \hat b^\dagger \hat b.$$
We couple the memory to the bath through the Jaynes-Cummings interaction term
$$H_{\mathrm{int}} = \hbar\Omega \big( \sigma_+ \hat b + \sigma_- \hat b^\dagger \big),$$
where $\sigma_+ = |1\rangle\langle 0|_M$, $\sigma_- = \sigma_+^\dagger$, and Ω is the coupling strength. For a general blank tape state $\rho_M \equiv \rho$ as in Eq. (D.2a), the initial state of the composite MB is the uncorrelated product $\rho_M \otimes \gamma_B$. The ground-state population of this state is r(1 − λ). As mentioned above, this component stays invariant, while the higher-energy components oscillate within their degenerate two-dimensional subspaces, as described in (D.6).
To achieve the transformation (D.2b), the $|0\rangle_M$ components in all of these oscillatory terms must be transformed to $|1\rangle_M$ and vice versa. But the frequencies of the Rabi oscillations within different energy levels are mutually incommensurate, scaling as √n for integer n, so the desired transformations within different energy subspaces do not occur synchronously after any finite evolution time. Nevertheless, good approximations to the desired transformation can be implemented in reasonable time. We used numerical computation to find the time taken by the Jaynes-Cummings interaction to achieve various fractions of the optimal capacity; these times ranged between 0.15 and 1.57 in units of Ω⁻¹, where Ω is the strength of the coupling.
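The mechanism can be illustrated with a bare-bones simulation (our own, with illustrative parameter values and a truncated mode): a resonant Jaynes-Cummings coupling transfers population from the thermal mode into an initially unexcited qubit, but the √n-scaled Rabi frequencies prevent the different number sectors from inverting simultaneously.

```python
import numpy as np

def expm_herm(H, t):
    """e^{-i H t} for Hermitian H, via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T

N = 30                                  # bosonic mode truncation
omega, Omega = 1.0, 0.05                # mode frequency and JC coupling (hbar = 1)
a = np.diag(np.sqrt(np.arange(1, N)), 1)           # annihilation operator
sm = np.array([[0, 1], [0, 0]], dtype=complex)     # sigma_- = |0><1|
I2, IN = np.eye(2), np.eye(N)

H = (omega * np.kron(np.diag([0.0, 1.0]), IN)      # qubit, resonant with mode
     + omega * np.kron(I2, a.conj().T @ a)         # free mode
     + Omega * (np.kron(sm.conj().T, a) + np.kron(sm, a.conj().T)))  # JC term

# thermal mode state and a ground-state qubit
lam = np.exp(-omega / 0.5)              # kT = 0.5 (illustrative)
pn = (1 - lam) * lam ** np.arange(N)
pn /= pn.sum()
rho0 = np.kron(np.diag([1.0, 0.0]), np.diag(pn))

U = expm_herm(H, t=20.0)
rho_t = U @ rho0 @ U.conj().T
p_excited = np.real(np.trace(
    rho_t @ np.kron(np.diag([0.0, 1.0]), IN)))     # partial inversion only
```

Each number sector oscillates at its own frequency ∝ √(n+1), so the qubit's excited population rises but never reaches a full inversion of the thermal weights.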