Simple and Tighter Derivation of Achievability for Classical Communication over Quantum Channels

Achievability in information theory refers to demonstrating a coding strategy that accomplishes a prescribed performance benchmark for the underlying task. In quantum information theory, the crafted Hayashi-Nagaoka operator inequality is an essential technique in proving a wealth of one-shot achievability bounds since it effectively resembles a union bound in various problems. In this work, we show that the pretty-good measurement naturally plays the role of the union bound as well. A judicious application of it considerably simplifies the derivation of one-shot achievability for classical-quantum (c-q) channel coding via an elegant three-line proof. The proposed analysis enjoys the following favorable features. (i) The established one-shot bound admits a closed-form expression as in the celebrated Holevo-Helstrom Theorem. Namely, the error probability of sending $M$ messages through a c-q channel is upper bounded by the minimum error of distinguishing the joint channel input-output state against $(M-1)$ decoupled product states. (ii) Our bound directly yields asymptotic results in the large deviation, small deviation, and moderate deviation regimes in a unified manner. (iii) The coefficients incurred in applying the Hayashi-Nagaoka operator inequality are no longer needed. Hence, the derived one-shot bound sharpens existing results relying on the Hayashi-Nagaoka operator inequality. In particular, we obtain the tightest achievable $\epsilon$-one-shot capacity for c-q channel coding heretofore, improving the third-order coding rate in the asymptotic scenario. (iv) Our result holds for infinite-dimensional Hilbert space. (v) The proposed method applies to deriving one-shot achievability for classical data compression with quantum side information, entanglement-assisted classical communication over quantum channels, and various quantum network information-processing protocols.


Introduction
Communicating classical information over a noisy quantum channel is a foundational task in quantum information science. To protect the transmitted messages against potential noise, a coding strategy is indispensable. At the transmitter, Alice encodes each message $m \in \{1, 2, \ldots, M\}$ into an $n$-qubit quantum state. Suppose that, during the communication process, each qubit suffers from independent and identically distributed (i.i.d.) quantum noise, which is characterized by an i.i.d. quantum channel. Then, at the receiver, Bob performs a quantum measurement on the corrupted quantum system to extract the decoded message $\hat{m}$.
Via a coding strategy based on so-called quantum typicality, the well-known Holevo-Schumacher-Westmoreland (HSW) theorem [1-7] states that the probability of erroneous decoding, $\varepsilon := \Pr\{\hat{m} \neq m\}$, vanishes asymptotically in the limit $n \to \infty$ whenever the number of bits sent per qubit, $\lim_{n\to\infty} \frac{1}{n} \log M$, is below the channel capacity. The HSW theorem extends the seminal work of Shannon [8] to the quantum scenario, and hence it is one of the cornerstones of quantum information theory. However, the HSW coding strategy relies on certain technical assumptions that could be physically demanding. First, the asymptotically large number of qubits $n$ requires the quantum devices to implement arbitrarily large encoding and decoding. Second, the actual quantum noise may be correlated among several qubit systems; hence, the underlying quantum noises are not independent. Third, even if the noises are independent, they may not be stationary; namely, the noise acting on the first qubit is not identical to that on the last qubit.
To circumvent the aforementioned technical requirements, one-shot quantum information theory has emerged as a research stream that considers the scenario in which no structural hypotheses are imposed on the underlying quantum state or channel. The ultimate goal is to characterize the optimal trade-off between the error probability $\varepsilon$ and the message size $M$ of transmission when the channel is used only once. Such a study allows us to better understand the fundamental capability of one-shot communication. Therefore, it may serve as a general guideline for designing next-generation quantum information-processing systems. However, without the i.i.d. repetitions of channel use, conventional methods based on quantum typicality are no longer applicable. Hence, more refined and sophisticated coding techniques are required for the one-shot analysis$^{1}$.
Why is it challenging to design and analyze good coding strategies for a one-shot quantum information-processing task? Essentially, a proper coding scheme aims to make the error probability $\Pr\{\bigcup_{\hat{m} \neq m} E_{\hat{m}|m}\}$ of sending each message $m$ small. Here, $E_{\hat{m}|m}$ denotes the event of decoding to $\hat{m}$ when message $m$ was sent. Yet, the analysis and computational evaluation of such a union of error events for a nontrivial quantum measurement could be quite difficult (even in the classical scenario). A useful trick in this effort is the following union bound:
$$\Pr\Big\{\bigcup_{\hat{m} \neq m} E_{\hat{m}|m}\Big\} \leq \sum_{\hat{m} \neq m} \Pr\{E_{\hat{m}|m}\}. \tag{1}$$
In view of the right-hand side of Eq. (1), the decoding rule then has to minimize the $(M-1)$ pairwise error probabilities of deciding $\hat{m}$ against $m$ for each $\hat{m} \neq m$. This serves as the general principle of coding design.
the i.i.d. condition) was proposed by Hayashi and Nagaoka, in which a powerful operator inequality [58, Lemma 2] was proved: for any positive semi-definite operators $0 \leq A \leq \mathbb{1}$ and $B \geq 0$,
$$\mathbb{1} - \frac{A}{A+B} \leq (1+c)(\mathbb{1} - A) + (2 + c + c^{-1}) B, \quad \forall\, c > 0, \tag{2}$$
where we denote a noncommutative quotient by
$$\frac{A}{B} := B^{-1/2} A B^{-1/2}$$
(here, the inverse is defined only on the support of the operator in the denominator). At first glance, it is not obvious how Eq. (2) resembles the union bound in Eq. (1) and how it is applied in analyzing the error probability in channel coding; nonetheless, an ingenious application of it by Hayashi and Nagaoka [58] yields a Feinstein-type bound for achieving the c-q channel capacity [59], [44, Theorem 1]. Later, Oskouei, Mancini and Wilde proposed a quantum union bound with similar coefficients as in Eq. (2), and hence achieved the same error bound as Refs. [58], [57]$^{2}$. Subsequently, Hayashi and Nagaoka's analysis based on Eq. (2) became a technical cornerstone of a wealth of one-shot and asymptotic achievability results in quantum information theory wherever a quantum measurement for extracting classical information from a quantum system is needed. For example, letting the coefficient $c$ in Eq. (2) be a fixed constant, together with a quantum Chernoff bound [61-64], delivers a large deviation bound for c-q channel coding [63]. Letting $c = 1/\sqrt{n}$ for an $n$-fold i.i.d. repetition of a c-q channel, Eq. (2) achieves the second-order coding rate [50,65-68] in the small deviation regime. Later, both results were extended to the moderate deviation regime [69,70] accordingly. In addition, Anshu, Jain, and Warsi proposed a position-based coding for achieving entanglement-assisted classical communication over quantum channels [21], which also relies on the Hayashi-Nagaoka operator inequality in Eq. (2).
Despite the success and significance of Hayashi and Nagaoka's approach, conceptual and practical subtleties remain. First, the technically sophisticated proof of the operator inequality (2) may obscure the insight of the analysis and hide the reason why such a coding strategy works. Does there exist a good quantum coding strategy that naturally reflects the union bound as in Eq. (1) so that the analysis is more interpretable? Second, is it possible to tighten the one-shot achievability bound for quantum information-theoretic tasks by eliminating the incurred coefficients in terms of $c$$^{3,4}$? Removing those coefficients may seem superficial. However, we remark that every bit in an analytical bound counts in the one-shot setting; one cannot ignore any coefficient. On top of that, the unnecessary coefficients may often trivialize the $(\varepsilon, M)$ trade-off. For instance, existing one-shot bounds on the error probability $\varepsilon$ could trivially be greater than 1 for $\log M$ close to the channel capacity. Such an analysis then provides no useful characterization of certain system configurations for practical communication. Lastly, the Hayashi-Nagaoka decoder [58,60] involves solving a semi-definite program (SDP) to obtain the mathematical description of the quantum measurement, for which the computational complexity is exponential in the number of qubits. Moreover, a quantum algorithm for implementing the Hayashi-Nagaoka decoder is still missing.
In this paper, we give affirmative answers to the above concerns and questions by showing that the so-called pretty good measurement (PGM) [72,73] naturally plays the role of the union bound. Together with the random coding technique, it yields a one-shot achievability bound via a much simpler and self-explanatory analysis, which is merely based on previously known facts. The coefficients mentioned above in terms of $c$ are not required anymore, and hence the established one-shot bound is sharpened. Furthermore, the proof itself provides a more transparent connection between c-q channel coding and binary quantum hypothesis testing.
To present our result, we first introduce a noncommutative minimal between two positive semi-definite operators $A$ and $B$ as$^{5}$
$$A \wedge B := \frac{1}{2}\left(A + B - |A - B|\right).$$
This quantity is prominent in quantum state discrimination since the celebrated Holevo-Helstrom theorem [14,74,75] endowed it with an operational meaning$^{6}$: $\mathrm{Tr}[A \wedge B]$ determines the minimum 'error' of discrimination between operators $A$ and $B$.
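As a small numerical illustration of this quantity, the following NumPy sketch evaluates $A \wedge B$ under the assumption that it takes the standard closed form $(A + B - |A - B|)/2$, and compares $\mathrm{Tr}[A \wedge B]$ with the familiar Helstrom error $\frac{1}{2}(1 - \|A - B\|_1)$ when $\mathrm{Tr}[A + B] = 1$. The helper names (`op_abs`, `nc_min`, `random_density`) are hypothetical and chosen only for this example.

```python
import numpy as np

def op_abs(H):
    """|H| for a Hermitian matrix H, computed from its eigendecomposition."""
    w, v = np.linalg.eigh(H)
    return (v * np.abs(w)) @ v.conj().T

def nc_min(A, B):
    """Noncommutative minimal, assumed closed form (A + B - |A - B|) / 2."""
    return 0.5 * (A + B - op_abs(A - B))

def random_density(d, rng):
    """Random full-rank density matrix (hypothetical helper for the demo)."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(0)
p = 0.3                                   # prior probability of the first state
A = p * random_density(3, rng)
B = (1 - p) * random_density(3, rng)

min_error = np.trace(nc_min(A, B)).real
helstrom_error = 0.5 * (1.0 - np.abs(np.linalg.eigvalsh(A - B)).sum())

# For commuting (diagonal) operators, the minimal reduces to the entrywise minimum.
D = nc_min(np.diag([1.0, 2.0]), np.diag([3.0, 0.5]))
print(min_error, helstrom_error, np.diag(D))
```

In the commuting case the operation is just the pointwise minimum, which is one way to see why $\mathrm{Tr}[A \wedge B]$ is a natural noncommutative analogue of the classical Bayes error.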
The coding strategy for a c-q channel $x \mapsto \rho_B^x$ (which maps each classical symbol $x$ to a density operator $\rho_B^x$) proceeds as follows. The encoding is via a random codebook $\{x(1), x(2), \ldots, x(M)\}$, in which each codeword $x(m)$ is drawn pairwise independently according to an arbitrary probability distribution $p_X$. The decoding is via the PGM with respect to the corresponding channel output states [72,73]:
$$\Pi_B^{x(m)} := \Big(\sum_{\bar{m}} \rho_B^{x(\bar{m})}\Big)^{-1/2} \rho_B^{x(m)} \Big(\sum_{\bar{m}} \rho_B^{x(\bar{m})}\Big)^{-1/2}, \quad m = 1, \ldots, M.$$
We show that the associated average error probability is upper bounded by (Theorem 1):
$$\varepsilon \leq \mathrm{Tr}\left[\rho_{XB} \wedge (M-1)\, \rho_X \otimes \rho_B\right], \tag{4}$$
with $\rho_{XB} = \sum_{x \in \mathcal{X}} p_X(x) |x\rangle\langle x| \otimes \rho_B^x$ the resulting joint bipartite state between the channel input and output. Via the Holevo-Helstrom theorem, the established bound in Eq. (4) provides the following interpretation for sending $M$ messages over a c-q channel (see Section 3 for the detailed explanation): the average error probability is upper bounded by the error of discriminating $\rho_{XB}$ and $(M-1)\, \rho_X \otimes \rho_B$.
Below, let us elaborate on the intuition of the proposed coding strategy and why the PGM works well. The key observation is that using the PGM to discriminate $M$ states at the channel output is exactly equivalent to (an average of) binary discriminations between each channel output state, say $\rho_B^{x(m)}$, and the rest, $\rho_B^{x(\bar{m})}$ for all $\bar{m} \neq m$, using a two-outcome PGM. In this regard, the PGM works as a one-versus-rest classification strategy; see Figure 1(A). Most importantly, this manifests the fact that the PGM effectively resembles the quantum union bound as shown in the right-hand side of Eq. (1). By taking the conditional expectation $\mathbb{E}_{x(\bar{m})|x(m)}$ over the random codebook, the remaining states are averaged to $(M-1)$ identical marginal states $\rho_B$; see Figure 1(B). After taking the expectation $\mathbb{E}_{x(m)}$, the error bound is equivalent to discriminating the joint state $\rho_{XB}$ between channel input and output against $(M-1)$ product states $\rho_X \otimes \rho_B$, as shown in Figure 1(C). This gives the elegant and clean bound in Eq. (4).
The proposed simple derivation enjoys the following favorable features.
(I) The one-shot achievability bound in Eq. (4) admits a closed-form expression as in the Holevo-Helstrom theorem. Computing such a bound is more time efficient than the previous results in terms of entropic quantities involving optimizations (see Remark 4 in Section 3).
(II) The proposed coding scheme based on the pretty-good measurement is directly implementable via the existing quantum algorithm by Gilyén et al. [78].
(III) The self-explanatory proof signifies a more lucid connection between c-q channel coding and hypothesis testing. Moreover, our coding strategy and analysis show that the PGM effectively works as a union bound by itself. Hence, neither the operator inequality (2) nor a quantum union bound is needed.
(IV) The proposed bound in Eq. (4) is free of the parameter $c$ appearing in Eq. (2). This shows that the established one-shot achievable error bound is tighter than previously known results based on the Hayashi-Nagaoka operator inequality, Eq. (2); see Section 3.1 and Table 2 therein for a comparison with existing results. Moreover, it unifies asymptotic derivations in the large, small, and moderate deviation regimes. We refer the reader to Figure 2 for a schematic flow chart.
(V) The proposed analysis applies to infinite-dimensional quantum systems [79] as well, e.g. communication over infinite-dimensional channels with energy constraints or cost constraints [80,81].
(VI) The proposed methods via the pretty-good measurement naturally extend to various quantum information-theoretic tasks, leading to more profound and sharpened results. These tasks include: (i) binary quantum hypothesis testing (Section 4.1), (ii) entanglement-assisted classical communication over point-to-point quantum channels (Section 4.2), (iii) classical data compression with quantum side information (Section 4.3), (iv) entanglement-assisted and unassisted classical communication over quantum multiple-access channels (Section 4.4), (v) entanglement-assisted and unassisted classical communication over quantum broadcast channels (Section 4.5), (vi) entanglement-assisted and unassisted classical communication over quantum channels with causal state information available at the encoder (Section 4.6). We refer the reader to the summary given in Table 1 below.
Lastly, the established simple analysis applies to position-based coding, a pivotal technique in one-shot quantum information theory (see, e.g. Refs. [20,21,82,83]) proposed by Anshu, Jain, and Warsi [21, Lemma 4], whose decoding strategy again relies on the Hayashi-Nagaoka operator inequality in Eq. (2). The sharpened position-based coding (Theorem 3 in Section 4) constitutes the primary technique for deriving numerous one-shot achievability bounds in Section 4. By virtue of its versatility, we may term it a one-shot quantum packing lemma, and it might lead to more fruitful applications elsewhere.

Table 1. Columns: Information-theoretic tasks; One-shot achievability: bounds on coding error; Bounds on coding size. First row: Point-to-point quantum channel.

Figure 1. (A) Given a realization of a codebook $\mathcal{C} = \{x(1), \ldots, x(M)\}$, the error probability of sending message $m = 1$ using the pretty-good measurement (PGM) is upper bounded by the error of distinguishing $\rho_B^{x(1)}$ against the remaining states, i.e. $\sum_{\bar{m} \neq 1} \rho_B^{x(\bar{m})}$. We take the sum of the remaining channel output states because the PGM effectively works as a one-versus-rest classification strategy. (B) Taking the conditional expectation over the random codebook $\mathcal{C}$ conditioned on codeword $x(1)$, the error probability of sending message $m = 1$ is upper bounded by the error of distinguishing $\rho_B^{x(1)}$ against $(M-1)\rho_B$. Namely, by randomly drawing a codeword $x(1) \sim p_X$, we are distinguishing the associated channel output state against the average channel output state scaled by a factor $(M-1)$. (C) Taking the expectation over the transmitted codeword $x(1) \sim p_X$, Figure 1(B) is equivalent to distinguishing the joint state $\rho_{XB}$ between channel input and output against the scaled product of its marginal states $(M-1)\rho_X \otimes \rho_B$; see Eq. (4). This may be viewed as a one-shot packing lemma for classical-quantum channel coding.

This paper is organized as follows. Section 2 formally introduces the noncommutative minimal and its properties. Section 3 establishes our main result of the one-shot achievability for c-q channel coding; we compare it with existing results in Section 3.1. Section 4 presents its applications in one-shot quantum information theory. We conclude the paper and discuss possible open problems in Section 5. Appendix A proves a useful trace inequality regarding the noncommutative minimal.

The Noncommutative Minimal and Its Properties
We first recall the basic concepts of (binary) quantum state discrimination, which constitutes the central tool for the proposed achievability analysis in Section 3 below. Given arbitrary positive semi-definite operators $A$ and $B$, we define the minimum error$^{7}$ of distinguishing them using two-outcome positive operator-valued measures (POVMs) as
$$\inf_{0 \leq T \leq \mathbb{1}} \mathrm{Tr}[A(\mathbb{1} - T)] + \mathrm{Tr}[BT]. \tag{5}$$
The well-known Holevo-Helstrom theorem [14,74,75,84] shows that the infimum can be attained by a Neyman-Pearson test $T = \{A - B > 0\}$ that projects onto the positive part of the difference $A - B$, and the minimization is given by its dual formulation as a semi-definite program (SDP) [87], [86, §1.2.3]:
$$\sup_{H} \left\{ \mathrm{Tr}[H] : H \leq A,\ H \leq B \right\}.$$
Here the supremum is attained by the so-called noncommutative minimal (i.e. the operator with the greatest trace among the lower bounds in terms of the Loewner partial ordering) [14,74,75,84] of self-adjoint operators $A$ and $B$$^{8}$, i.e.
$$A \wedge B := \arg\max_{H} \left\{ \mathrm{Tr}[H] : H \leq A,\ H \leq B \right\}. \tag{6}$$

$^{7}$ Quantum state discrimination problems are usually concerned with distinguishing an ensemble of quantum states where each state (i.e. a density operator with unit trace) in the ensemble is endowed with a prior probability. For example, the minimum error defined above coincides with the error probability in conventional quantum state discrimination when assuming $\mathrm{Tr}[A + B] = 1$. We note that in Holevo's early works [74,84], [75, §II], the scenario of distinguishing positive semi-definite operators (even on infinite-dimensional Hilbert space) was studied (see also [85] and [86, §3]).
In other words, the Holevo-Helstrom theorem [14,74,75] provides an operational meaning to the noncommutative minimal "$\wedge$" for characterizing the minimum error of distinguishing positive semi-definite operators $A$ and $B$$^{9}$. We adopt such an interpretation subsequently.
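The statement that the Neyman-Pearson test attains the minimum error can be checked numerically. The sketch below assumes the error functional $\mathrm{Tr}[A(\mathbb{1} - T)] + \mathrm{Tr}[BT]$ and the trace-norm form $\mathrm{Tr}[A \wedge B] = \frac{1}{2}(\mathrm{Tr}[A + B] - \|A - B\|_1)$; all function names are hypothetical and introduced only for this illustration.

```python
import numpy as np

def nc_min_trace(A, B):
    """Tr[A ∧ B], assumed equal to (Tr[A + B] - ||A - B||_1) / 2."""
    return 0.5 * (np.trace(A + B).real - np.abs(np.linalg.eigvalsh(A - B)).sum())

def neyman_pearson_test(A, B):
    """Projector {A - B > 0} onto the positive eigenspace of A - B."""
    w, v = np.linalg.eigh(A - B)
    pos = v[:, w > 0]
    return pos @ pos.conj().T

def two_outcome_error(A, B, T):
    """Error Tr[A(1 - T)] + Tr[B T] of the two-outcome POVM {T, 1 - T}."""
    I = np.eye(A.shape[0])
    return (np.trace(A @ (I - T)) + np.trace(B @ T)).real

def random_psd(d, rng):
    """Random positive semi-definite matrix (hypothetical helper)."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G @ G.conj().T

rng = np.random.default_rng(1)
A, B = random_psd(4, rng), random_psd(4, rng)

optimum = nc_min_trace(A, B)
np_error = two_outcome_error(A, B, neyman_pearson_test(A, B))

# Randomly drawn feasible tests 0 <= T <= 1 should never beat the optimum.
random_errors = []
for _ in range(200):
    T = random_psd(4, rng)
    T = T / np.linalg.eigvalsh(T).max()   # rescale so that T <= 1
    random_errors.append(two_outcome_error(A, B, T))

print(optimum, np_error, min(random_errors))
```

Random sampling is of course not a proof of optimality; it only illustrates that the closed-form value behaves as a lower bound over feasible tests.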
The main goal of this paper is to characterize the average error probability in quantum information-theoretic tasks in terms of the noncommutative minimal "$\wedge$". To that end, we first review the important properties that will be used in the proposed analysis. We note that the following properties can be found in existing literature.

$^{8}$ One can also define the noncommutative minimal among multiple self-adjoint operators. Throughout this paper, we will only consider the case of two positive semi-definite operators.

$^{9}$ In quantum state discrimination, one sometimes studies the maximum success probability [14,74,75] and expresses it via a semi-definite programming formulation as the trace of the so-called noncommutative maximal [15,85,86], i.e. the least (in trace ordering) of the upper bounds (in Loewner partial ordering). Since the present paper aims to relate the error of a quantum information-theoretic task to that of quantum state discrimination, we will only focus on the error. Note also that for distinguishing multiple operators, say $\{A_i\}_i$, the error of the discrimination is given by the noncommutative minimal among the set of operators [...]

Fact (Properties of noncommutative minimal$^{10}$). Considering arbitrary self-adjoint operators $A$ and $B$, the noncommutative minimal defined in Eq. (6) has the following properties.
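(i) (Unique closed-form expression.) The noncommutative minimal $A \wedge B$ is unique and
$$A \wedge B = \frac{1}{2}\left(A + B - |A - B|\right).$$
(ii) (Monotone increase in the Loewner ordering.) It holds that $\mathrm{Tr}[A \wedge B] \leq \mathrm{Tr}[A' \wedge B']$ for $A \leq A'$ and $B \leq B'$.
(iii) (Monotone increase under positive trace-preserving maps.) $\mathrm{Tr}[A \wedge B] \leq \mathrm{Tr}[\mathcal{N}(A) \wedge \mathcal{N}(B)]$ for any positive trace-preserving map $\mathcal{N}$.
(iv) (Trace concavity.) The map $(A, B) \mapsto \mathrm{Tr}[A \wedge B]$ is jointly concave.
(v) (Direct sum.) $\mathrm{Tr}[(A \oplus A') \wedge (B \oplus B')] = \mathrm{Tr}[A \wedge B] + \mathrm{Tr}[A' \wedge B']$.
(vi) (Upper bound.) For $A, B \geq 0$ and $s \in (0, 1)$, $\mathrm{Tr}[A \wedge B] \leq \mathrm{Tr}[A^{1-s} B^{s}]$.
(vii) (Lower bound.) For $A, B \geq 0$, $\mathrm{Tr}[A \wedge B] \geq \mathrm{Tr}\left[A\, (A+B)^{-1/2} B\, (A+B)^{-1/2}\right]$.

Proof. For (i): the uniqueness (also for multiple operators) was proved by Holevo [...]. We note that the monotone increase under a positive trace-preserving map and the concavity for multiple operators also hold. Property (v) with trace is due to the direct-sum structure of the trace norm; the case without trace was proved in Ref. [85, Lemma A.9]. Property (vi) is the celebrated inequality of Audenaert et al. [61,62,92] used in proving the quantum Chernoff bound; later, it was generalized to infinite-dimensional Hilbert space [64]. Property (vii) in the special case of $\mathrm{Tr}[A + B] = 1$ is an immediate consequence of the Barnum-Knill Theorem [93], [86, Theorem 3.10]. That is, the error probability using the pretty good measurement [72,73] is no larger than twice that of using the optimal measurement. The proof for the general case of $A, B \geq 0$ can be found in the author's previous work [30, Lemma 3]. For completeness, we provide an alternative proof of property (vii) (and a strengthened version of it) in Appendix A. □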
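Properties (vi) and (vii) sandwich $\mathrm{Tr}[A \wedge B]$ between two simple trace expressions. A hedged numerical sanity check follows, assuming the standard forms of the two inequalities: the Audenaert et al. upper bound $\mathrm{Tr}[A^{1-s}B^{s}]$ and the pretty-good-measurement lower bound $\mathrm{Tr}[A(A+B)^{-1/2}B(A+B)^{-1/2}]$. The helper names are hypothetical.

```python
import numpy as np

def psd_power(H, p, tol=1e-12):
    """H**p for a positive semi-definite H, restricted to its support."""
    w, v = np.linalg.eigh(H)
    wp = np.zeros_like(w)
    wp[w > tol] = w[w > tol] ** p
    return (v * wp) @ v.conj().T

def nc_min_trace(A, B):
    """Tr[A ∧ B] via the trace-norm closed form."""
    return 0.5 * (np.trace(A + B).real - np.abs(np.linalg.eigvalsh(A - B)).sum())

def pgm_lower_bound(A, B):
    """Tr[A (A+B)^{-1/2} B (A+B)^{-1/2}] (assumed form of the lower bound)."""
    S = psd_power(A + B, -0.5)
    return np.trace(A @ S @ B @ S).real

def chernoff_upper_bound(A, B, s):
    """Tr[A^{1-s} B^{s}] for s in (0, 1) (assumed form of the upper bound)."""
    return np.trace(psd_power(A, 1 - s) @ psd_power(B, s)).real

def random_psd(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G @ G.conj().T

rng = np.random.default_rng(3)
violations = 0
for _ in range(50):
    A, B = random_psd(3, rng), random_psd(3, rng)
    middle = nc_min_trace(A, B)
    if pgm_lower_bound(A, B) > middle + 1e-9:
        violations += 1
    for s in np.linspace(0.1, 0.9, 9):
        if middle > chernoff_upper_bound(A, B, s) + 1e-9:
            violations += 1

print("violations:", violations)
```

The operators here are unnormalized on purpose, since both inequalities are used for general positive semi-definite $A$ and $B$ in the coding analysis.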

Main Result: A One-Shot Achievability for Classical-Quantum Channel Coding
In this section, we prove our main result of establishing a one-shot achievability bound for classical-quantum channel coding via a direct application of the pretty-good measurement (PGM) [72,73].
Let $\mathcal{N}_{X \to B} : x \mapsto \rho_B^x$ be a classical-quantum channel, where each channel output $\rho_B^x$ is a density operator (i.e. a positive semi-definite operator with unit trace).
1. Alice holds classical registers $\mathsf{M}$ and $\mathsf{X}$, and Bob holds a quantum register $\mathsf{B}$.
2. An encoding $m \mapsto x(m)$ maps equiprobable messages in $\mathsf{M}$ to a codeword in $\mathsf{X}$.
3. The classical-quantum channel $\mathcal{N}_{X \to B}$ is applied on Alice's register $\mathsf{X}$ and outputs a state on $\mathsf{B}$ at Bob.
4. A decoding measurement described by a positive operator-valued measure (POVM) $\{\Pi_B^m\}_{m \in \mathsf{M}}$ is performed on Bob's quantum register $\mathsf{B}$ to extract the sent message $m$.
An $(M, \varepsilon)$-code for $\mathcal{N}_{X \to B}$ is a protocol such that $|\mathsf{M}| = M$ and the average error probability satisfies
$$\frac{1}{M} \sum_{m \in \mathsf{M}} \mathrm{Tr}\left[\rho_B^{x(m)} \left(\mathbb{1}_B - \Pi_B^m\right)\right] \leq \varepsilon.$$
The encoding is the standard random coding strategy.
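• Encoding. Consider a random codebook $\mathcal{C} = \{x(1), x(2), \ldots, x(M)\}$, where each of the codewords $x(m) \in \mathsf{X}$ is pairwise independently drawn from a probability distribution $p_X$. Alice sends codewords according to the realization of the codebook $\mathcal{C}$.
• Decoding. At the receiver, given a realization of the random codebook $\mathcal{C}$ and the corresponding channel output states $\{\rho_B^{x(m)}\}_{m \in \mathsf{M}}$, Bob performs the PGM to decode each message $m \in \mathsf{M}$:
$$\Pi_B^{x(m)} := \frac{\rho_B^{x(m)}}{\sum_{\bar{m} \in \mathsf{M}} \rho_B^{x(\bar{m})}}. \tag{7}$$
Our main result is the following.
Theorem 1 (A one-shot achievability bound for classical-quantum channel coding). Consider an arbitrary classical-quantum channel $\mathcal{N}_{X \to B} : x \mapsto \rho_B^x$. Then, there exists an $(M, \varepsilon)$-code for $\mathcal{N}_{X \to B}$ such that for any probability distribution $p_X$,
$$\varepsilon \leq \mathrm{Tr}\left[\rho_{XB} \wedge (M-1)\, \rho_X \otimes \rho_B\right]. \tag{8}$$
Here, $\rho_{XB} := \sum_{x \in \mathsf{X}} p_X(x) |x\rangle\langle x| \otimes \rho_B^x$.
Proof. The claim follows from the lower bound on the noncommutative minimal "$\wedge$" given in Fact (vii), which relates the pretty good measurement to the optimal measurement, and the concavity of "$\wedge$", i.e. Fact (iv). Precisely, given any realization of the codebook $\mathcal{C} = \{x(m)\}_{m \in \mathsf{M}}$, we calculate the average probability of erroneous decoding using the PGM given in Eq. (7) as
$$\frac{1}{M} \sum_{m \in \mathsf{M}} \mathrm{Tr}\left[\rho_B^{x(m)} \left(\mathbb{1}_B - \Pi_B^{x(m)}\right)\right] \leq \frac{1}{M} \sum_{m \in \mathsf{M}} \mathrm{Tr}\Big[\rho_B^{x(m)} \wedge \sum_{\bar{m} \neq m} \rho_B^{x(\bar{m})}\Big], \tag{9}$$
where we have applied Fact (vii) with $A = \rho_B^{x(m)}$ and $B = \sum_{\bar{m} \neq m} \rho_B^{x(\bar{m})}$ to relate the error probability under the PGM to the expression in terms of the noncommutative minimal. Next, we take the expectation for each $x(m) \sim p_X$ to bound the expected average error probability (which is also called the random-coding error probability), i.e.
$$\frac{1}{M} \sum_{m \in \mathsf{M}} \mathbb{E}_{x(m)} \mathbb{E}_{x(\bar{m})|x(m)} \mathrm{Tr}\Big[\rho_B^{x(m)} \wedge \sum_{\bar{m} \neq m} \rho_B^{x(\bar{m})}\Big] \overset{(a)}{\leq} \frac{1}{M} \sum_{m \in \mathsf{M}} \mathbb{E}_{x(m)} \mathrm{Tr}\Big[\rho_B^{x(m)} \wedge \mathbb{E}_{x(\bar{m})|x(m)} \sum_{\bar{m} \neq m} \rho_B^{x(\bar{m})}\Big] \overset{(b)}{=} \frac{1}{M} \sum_{m \in \mathsf{M}} \mathbb{E}_{x(m)} \mathrm{Tr}\left[\rho_B^{x(m)} \wedge (M-1)\rho_B\right],$$
where in (a) we used the concavity given in Fact (iv) and in (b) we recalled the pairwise independence of the random codebook.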
Invoking the direct sum formula given in Fact (v), we arrive at the claimed inequality at the right-hand side of Eq. (8). Lastly, since the random-coding error probability using any $p_X$ is no smaller than the error probability of the optimal code, the proof is completed. □
Below, we provide a detailed explanation of how the PGM works. An important feature of the PGM is that the POVM element $\Pi_B^{x(m)}$ given in Eq. (7) is proportional to the sent state $\rho_B^{x(m)}$. On the other hand, the complement of the POVM element, i.e. $\mathbb{1}_B - \Pi_B^{x(m)}$, is proportional to the sum of the remaining states $\sum_{\bar{m} \neq m} \rho_B^{x(\bar{m})}$. Hence, the average error probability of discriminating $M$ channel output states, i.e. the left-hand side of Eq. (9), is equivalent to the error of deciding each sent state $\rho_B^{x(m)}$ using the following two-outcome PGM:
$$\left\{ \frac{\rho_B^{x(m)}}{\rho_B^{x(m)} + \sum_{\bar{m} \neq m} \rho_B^{x(\bar{m})}},\ \frac{\sum_{\bar{m} \neq m} \rho_B^{x(\bar{m})}}{\rho_B^{x(m)} + \sum_{\bar{m} \neq m} \rho_B^{x(\bar{m})}} \right\}. \tag{10}$$
Such a discrimination of $\rho_B^{x(m)}$ with prior probability $\frac{1}{M}$ against the sum of the remaining states (again each with prior probability $\frac{1}{M}$) reflects the nature of the union bound inherent in the PGM; cf. the right-hand side of Eq. (1). Next, taking the expectation over the remaining states ensures that we are discriminating $\rho_B^{x(m)}$ with prior probability $\frac{1}{M}$ against $(M-1)$ identical marginal states $\rho_B$, each with prior probability $\frac{1}{M}$. Equivalently, this amounts to a binary hypothesis test between $\rho_B^{x(m)}$ with prior probability $\frac{1}{M}$ and the marginal state $\rho_B$ with prior probability $\frac{M-1}{M}$. Lastly, after taking the summation over $m \in \mathsf{M}$, the above is equal to the discrimination of the joint state $\rho_{XB}$ against the scaled decoupled product state $(M-1)\rho_X \otimes \rho_B$. (See Figure 1 for the illustration.) We hope that this simple proof provides a conceptually clear and pedagogical elucidation of the intimate relation between classical-quantum channel coding and quantum hypothesis testing.
Remark 2. As we show shortly in Sections 3.1 and 4, the one-shot bound established in Theorem 1 already implies (and sharpens) various previously known achievability results in the so-called achievable rate region, i.e. rates below the quantum mutual information. Is the bound in Theorem 1 tight outside the achievable rate region? Taking c-q channel coding as an example, when the message size is too large or the coding rate (i.e. $R := \frac{1}{n} \log M$) is well above the mutual information with respect to $\rho_{XB}$, the one-shot bound in Theorem 1 might not be very tight.
In view of this, Theorem 1 can be strengthened to the following more involved form: Bound (11) follows from the tighter inequality (28) given in Lemma 1 of Appendix A, instead of Fact (vii , then the random coding error amounts to randomly guessing equiprobable messages, i.e. ε ≤ 1 − 1/M .Regardless of the message size M , Eq. ( 11) is technically a tighter one-shot bound compared to Eq. ( 8).This naturally raises the question of whether Eq. ( 11) can lead to a simple proof of the upper bound on the strong converse exponent of c-q channel coding; see [97,Section 5.4], [98, Proposition IV.5], and [99, Proposition VI.2.].We leave this for future work.
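The bound of Theorem 1 can be sanity-checked on a toy example. The sketch below is an illustration under stated assumptions, not the paper's own computation: it takes a hypothetical c-q channel with three input symbols and random qubit outputs, draws the $M$ codewords i.i.d. from $p_X$ (which is in particular pairwise independent), computes the exact expected PGM error by enumerating all codebooks, and compares it with $\mathrm{Tr}[\rho_{XB} \wedge (M-1)\rho_X \otimes \rho_B] = \sum_x p_X(x)\, \mathrm{Tr}[\rho_B^x \wedge (M-1)\rho_B]$.

```python
import itertools
import numpy as np

def inv_sqrt(H, tol=1e-12):
    """H^{-1/2} on the support of a positive semi-definite H."""
    w, v = np.linalg.eigh(H)
    f = np.zeros_like(w)
    f[w > tol] = w[w > tol] ** -0.5
    return (v * f) @ v.conj().T

def nc_min_trace(A, B):
    """Tr[A ∧ B] via the trace-norm closed form."""
    return 0.5 * (np.trace(A + B).real - np.abs(np.linalg.eigvalsh(A - B)).sum())

def random_density(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def pgm_average_error(states):
    """Average error of the PGM for equiprobable unit-trace states."""
    S = inv_sqrt(sum(states))
    success = sum(np.trace(rho @ S @ rho @ S).real for rho in states)
    return 1.0 - success / len(states)

rng = np.random.default_rng(2)
num_symbols, dim, M = 3, 2, 3              # toy sizes (assumptions)
p_X = np.array([0.5, 0.3, 0.2])
outputs = [random_density(dim, rng) for _ in range(num_symbols)]

# Exact expectation over all |X|^M codebooks with i.i.d. codewords.
expected_error = 0.0
for codebook in itertools.product(range(num_symbols), repeat=M):
    prob = np.prod(p_X[list(codebook)])
    expected_error += prob * pgm_average_error([outputs[x] for x in codebook])

# Right-hand side of the bound, evaluated block by block over x.
rho_B = sum(p * rho for p, rho in zip(p_X, outputs))
bound = sum(p * nc_min_trace(rho, (M - 1) * rho_B) for p, rho in zip(p_X, outputs))

print(expected_error, bound)
```

Enumerating codebooks is only feasible for such tiny parameters; for larger instances one would presumably replace the enumeration by Monte Carlo sampling of codebooks.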
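The established one-shot achievability in Theorem 1 immediately covers (and sharpens) various known results on the minimal error given a fixed message or coding size $M$, or on the maximal message size given a fixed error $\varepsilon$. Let us define the following two important operational quantities for c-q channel coding:
$$\varepsilon^{\star}(\mathcal{N}, M) := \inf\{\varepsilon : \exists \text{ an } (M, \varepsilon)\text{-code for } \mathcal{N}\}, \qquad M^{\star}(\mathcal{N}, \varepsilon) := \sup\{M : \exists \text{ an } (M, \varepsilon)\text{-code for } \mathcal{N}\}.$$
We note that although $\varepsilon^{\star}(\mathcal{N}, M)$ and $M^{\star}(\mathcal{N}, \varepsilon)$ are inverse functions of each other in the one-shot setting, they lead to different asymptotic expansions in the large deviation and small deviation regimes, respectively.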
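Proposition 1 (Bounding coding error given fixed coding rate). Consider an arbitrary classical-quantum channel $\mathcal{N}_{X \to B} : x \mapsto \rho_B^x$. Then, for any $n \in \mathbb{N}$ and $R > 0$, there exists an $(e^{nR}, \varepsilon)$-code for $\mathcal{N}_{X \to B}^{\otimes n}$ such that for any probability distribution $p_X$,
$$\varepsilon \leq e^{-n \sup_{\alpha \in (1/2, 1)} \frac{1-\alpha}{\alpha} \left( I^{\downarrow}_{2 - 1/\alpha}(X : B)_{\rho} - R \right)}.$$
Here, $I^{\downarrow}_{\alpha}(X : B)_{\rho} := D_{\alpha}(\rho_{XB} \| \rho_X \otimes \rho_B)$ with $D_{\alpha}(\rho \| \sigma) := \frac{1}{\alpha - 1} \log \mathrm{Tr}[\rho^{\alpha} \sigma^{1-\alpha}]$.
Proof. For the one-shot case $n = 1$, we apply Audenaert et al.'s inequality, i.e. Fact (vi), to the one-shot bound given in Theorem 1 with $A = \rho_{XB}$, $B = (M-1)\rho_X \otimes \rho_B$, and $s = \frac{1-\alpha}{\alpha}$ to obtain the large deviation type bound. When considering product channels in the $n$-shot scenario, the exponential decay follows from the fact that $\rho \mapsto I^{\downarrow}_{2-1/\alpha}(X : B)_{\rho}$ is additive for any $n$-fold product state. The positivity holds by noting that the map $\alpha \mapsto$ [...]

Proposition 2. [...] Then, for any $\varepsilon \in (0, 1)$ there exists an $(M, \varepsilon)$-code for $\mathcal{N}_{X \to B}$ such that for any probability distribution $p_X$ and any $\delta \in (0, \varepsilon)$,
$$\log M \geq D_h^{\varepsilon - \delta}(\rho_{XB} \| \rho_X \otimes \rho_B) - \log \frac{1}{\delta}. \tag{12}$$
Here, $D_h^{\varepsilon}(\rho \| \sigma) := \sup_{0 \leq T \leq \mathbb{1}} \left\{ -\log \mathrm{Tr}[\sigma T] : \mathrm{Tr}[\rho(\mathbb{1} - T)] \leq \varepsilon \right\}$ is the quantum hypothesis-testing divergence. Moreover, for any $\varepsilon \in (0, 1)$ and for sufficiently large $n \in \mathbb{N}$, there exists an $(M, \varepsilon)$-code for $\mathcal{N}_{X \to B}^{\otimes n}$ such that for any probability distribution $p_X$:
$$\log M \geq n I(X : B)_{\rho} + \sqrt{n V(X : B)_{\rho}}\, \Phi^{-1}(\varepsilon) - \frac{1}{2} \log n - O(1),$$
where $I(X : B)_{\rho}$ is the quantum mutual information, $V(X : B)_{\rho}$ is the quantum information variance, and $\Phi^{-1}(\varepsilon) := \sup\{u : \int_{-\infty}^{u} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} t^2} \mathrm{d}t \leq \varepsilon\}$ is the inverse of the cumulative distribution of the standard normal distribution.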
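Proof. By recalling the definition of the noncommutative minimal given in Eq. (5) and by Theorem 1, for any test $0 \leq T \leq \mathbb{1}$,
$$\varepsilon \leq \mathrm{Tr}[\rho_{XB}(\mathbb{1} - T)] + (M-1)\, \mathrm{Tr}[\rho_X \otimes \rho_B\, T].$$
Choosing $T$ such that the first term is at most $\varepsilon - \delta$ and $M$ such that the second term is at most $\delta$ yields Eq. (12), completing the proof.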
The second-order achievability then follows from the expansion of the quantum hypothesis-testing divergence [57,66,67,101,102]. □
We remark that both Propositions 1 and 2 extend to the moderate deviation regime by directly following the approaches of [69,70]. The reader may refer to Figure 2 for the corresponding expressions.
Remark 3. Given that Theorem 1 already provides a one-shot bound on the average error probability, one may wonder why one would weaken Eq. (8) in Theorem 1 to obtain another one-shot bound in Proposition 1 (note that they both have closed-form expressions). The reason is that the minimum error in terms of the noncommutative minimal on the right-hand side of Eq. (8) is not multiplicative under product states. Nevertheless, it can be further upper bounded by certain multiplicative Rényi-type quantities. That is exactly the spirit of the quantum Chernoff bound [61,62,64,92], and hence we term the result of Proposition 1 a large deviation type bound$^{12}$ accordingly.
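The point of this remark can be illustrated numerically. The following sketch assumes the Audenaert et al. inequality in the form $\mathrm{Tr}[A \wedge B] \leq \mathrm{Tr}[A^{1-s}B^{s}]$ and uses two hypothetical random qubit states: the Chernoff-type quantity is exactly multiplicative under tensor powers, so its $n$-th power upper bounds the exact minimum error of the $n$-fold product, which itself has no such product structure in general.

```python
import functools
import numpy as np

def psd_power(H, p, tol=1e-12):
    """H**p for a positive semi-definite H, restricted to its support."""
    w, v = np.linalg.eigh(H)
    wp = np.zeros_like(w)
    wp[w > tol] = w[w > tol] ** p
    return (v * wp) @ v.conj().T

def nc_min_trace(A, B):
    """Tr[A ∧ B] via the trace-norm closed form."""
    return 0.5 * (np.trace(A + B).real - np.abs(np.linalg.eigvalsh(A - B)).sum())

def chernoff_quantity(A, B, s):
    """Tr[A^{1-s} B^{s}], multiplicative under tensor products."""
    return np.trace(psd_power(A, 1 - s) @ psd_power(B, s)).real

def tensor_power(A, n):
    """n-fold Kronecker power of A."""
    return functools.reduce(np.kron, [A] * n)

def random_density(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(4)
rho, sigma = random_density(2, rng), random_density(2, rng)
s = 0.5
q1 = chernoff_quantity(rho, sigma, s)

exact = []      # exact minimum errors for n = 1, 2, 3
chernoff = []   # multiplicative upper bounds q1**n
for n in (1, 2, 3):
    exact.append(nc_min_trace(tensor_power(rho, n), tensor_power(sigma, n)))
    chernoff.append(q1 ** n)

q2 = chernoff_quantity(tensor_power(rho, 2), tensor_power(sigma, 2), s)
print(exact, chernoff, q2)
```

Here the second argument is left unscaled (i.e. $M - 1 = 1$) purely to keep the example small; scaling by $(M-1)^n$ would multiply the Chernoff-type bound by $(M-1)^{ns}$.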
On the other hand, Theorem 1 also gives one-shot and asymptotic expansions in the small deviation regime (Proposition 2). Hence, to some extent, Theorem 1 may be viewed as a "meta" achievability for classical communication over quantum channels (see also Theorem 3 later in Section 4.2).
Remark 4. Most existing one-shot achievability bounds to date (e.g. [21,58,60,82]) are expressed in terms of the quantum hypothesis-testing divergence $D_h^{\varepsilon}$ as in Eq. (12) of Proposition 2, because they directly provide a one-shot characterization (lower bound) of the maximal message or coding size $M$ given a fixed coding error $\varepsilon$, which is also called the $\varepsilon$-one-shot channel capacity. To numerically compute $D_h^{\varepsilon}$, one can formulate the quantity as a standard-form semi-definite program (SDP); namely, it is an optimization over a $d_B \times d_B$ matrix-valued variable with $m := d_B^2 + 1$ linear (scalar) constraints, where $d_B$ denotes the dimension of the underlying Hilbert space representing the quantum register $\mathsf{B}$. (Here, we only consider the computation on the quantum part of register $\mathsf{B}$ for simplicity, without involving computation on the classical register $\mathsf{X}$.) Using the state-of-the-art (classical) SDP solver [103], the running time is [...]$^{13}$, where $\omega \leq 2.373$ is the exponent of matrix multiplication [104].
On the other hand, the one-shot bound provided in Theorem 1 admits a closed-form expression in terms of the trace norm. Using the state-of-the-art algorithm for approximating singular values [105], it requires running time [...]. This shows that computing the proposed one-shot achievability bound in terms of the noncommutative minimal in Theorem 1 is nearly quadratically more efficient than computing the one-shot bounds in terms of the quantum hypothesis-testing divergence.
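To make the closed-form evaluation concrete, here is a minimal NumPy sketch, assuming the identity $\mathrm{Tr}[A \wedge B] = \frac{1}{2}(\mathrm{Tr}[A + B] - \|A - B\|_1)$ with the trace norm computed from singular values. It uses a dense SVD rather than the specialized algorithm of [105], and a hypothetical three-symbol qubit channel; it evaluates the bound once on the full joint operators and once block by block over the classical register, which agree because of the direct-sum structure.

```python
import numpy as np

def trace_norm(H):
    """Trace norm ||H||_1 as the sum of singular values."""
    return np.linalg.svd(H, compute_uv=False).sum()

def nc_min_trace(A, B):
    """Tr[A ∧ B] = (Tr[A + B] - ||A - B||_1) / 2 (assumed closed form)."""
    return 0.5 * (np.trace(A + B).real - trace_norm(A - B))

def random_density(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(5)
p_X = np.array([0.6, 0.3, 0.1])                 # hypothetical input distribution
outputs = [random_density(2, rng) for _ in p_X]  # hypothetical channel outputs
M = 4                                            # number of messages

rho_B = sum(p * rho for p, rho in zip(p_X, outputs))

# Full joint operators on the composite register X ⊗ B.
basis = np.eye(len(p_X))
rho_XB = sum(p * np.kron(np.outer(e, e), rho)
             for p, e, rho in zip(p_X, basis, outputs))
rho_X_rho_B = np.kron(np.diag(p_X), rho_B)
full = nc_min_trace(rho_XB, (M - 1) * rho_X_rho_B)

# Block-by-block evaluation: only d_B x d_B matrices are ever decomposed.
blockwise = sum(p * nc_min_trace(rho, (M - 1) * rho_B)
                for p, rho in zip(p_X, outputs))

print(full, blockwise)
```

The block-wise form is the one that matches the remark's convention of counting only the computation on the quantum register $\mathsf{B}$.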
3.1. Comparison to Existing Results. In the following, we compare the implications of the established one-shot achievability bounds, i.e. Propositions 1 and 2, with existing results. We refer the reader to Table 2 below for a summary.
The exponential decay rate of the error probability given in Proposition 1 matches the one proved by Hayashi [63, Eq. (9)]. However, in the one-shot setting, the large deviation type bound in Proposition 1 is tighter than [63, Eq. (9)] (without the factor 4). If, further, $\mathcal{N}_{X \to B}$ is a pure-state channel, one has [...] (where the minimization is over all density operators on the Hilbert space $\mathcal{H}_B$, i.e. $\mathcal{S}(\mathcal{H}_B)$). Hence, the bound in Proposition 1 is tighter than the bound proved by Burnashev and Holevo [9, Proposition 1] (without the factor 2).
Hayashi-Nagaoka [58, Lemma 3] and Wang-Renner [60, Theorem 1] employed the Hayashi-Nagaoka inequality, Eq. (2), to obtain a one-shot achievability bound$^{14}$ on the message or coding size $M$, stated as Eq. (13), in terms of the hypothesis-testing divergence $D_h^{\varepsilon}$ introduced in Proposition 2. The term $-\log\frac{4}{\delta^2}$ results from optimizing the coefficient $c$ when applying the Hayashi-Nagaoka inequality. Compared to Eq. (13), the established Proposition 2 in Section 3 does not need to choose an appropriate coefficient $c$, and hence it gives a tighter one-shot achievability bound on $M$ (especially when δ is small), stated as Eq. (14). Specialized to the i.i.d. asymptotic scenario of n-fold product channels with $\delta = 1/\sqrt{n}$, Eq. (14) improves the third-order coding rate by $\frac{1}{2}\log n$ compared to the asymptotics based on Eq. (13).

Beigi and Gohari [71] generalized a superb classical achievability approach by Yassaee et al. [109] to establish a one-shot achievability bound on $M$ as well [71, Corollary 1], stated as Eq. (15) in terms of the quantum information-spectrum divergence $D_s^{\varepsilon}$ [44,45,58,66,106]. To compare Eq. (14) with Eq. (15), we recall the relation between the quantum hypothesis-testing divergence $D_h^{\varepsilon}$ and the quantum information-spectrum divergence $D_s^{\varepsilon}$ [66, Lemma 12], stated as Eq. (16). This indicates that the proposed one-shot bound, Eq. (14), has a stronger leading term. When considering the asymptotic expansion of the coding rate in the i.i.d. setting, one has to translate $D_s^{\varepsilon}$ in Eq. (15) back to $D_h^{\varepsilon}$ using Eq. (16) and employ the second-order achievability$^{16}$ of the quantum hypothesis-testing divergence $D_h^{\varepsilon}$ [66,67,101,102], stated as Eq. (17). Then, Beigi and Gohari's result, Eq. (15), leads to

$$\log M \ge n I(X:B)_\rho + \sqrt{n V(X:B)_\rho}\,\Phi^{-1}(\varepsilon) - \log n - O(1).$$

This achieves the same third-order term as the asymptotic expansion using Eq. (13) and Eq. (17). On the other hand, the established Eq. (14) with Eq. (17) gives a tighter third-order term of the coding rate:

$$\log M \ge n I(X:B)_\rho + \sqrt{n V(X:B)_\rho}\,\Phi^{-1}(\varepsilon) - \tfrac{1}{2}\log n - O(1).$$

Inspired by the third-order asymptotics of the classical hypothesis-testing divergence proved by Strassen [40, Theorem 3.1] (see also [50, Lemma 46] and [65, Proposition 2.3]), we conjecture a third-order achievability of the quantum hypothesis-testing divergence, stated as Eq. (18). If Eq. (18) were true, then the established Eq. (14) would imply

$$\log M \ge n I(X:B)_\rho + \sqrt{n V(X:B)_\rho}\,\Phi^{-1}(\varepsilon) - O(1). \qquad (19)$$

Note that the leading term of the one-shot bound dominates both the first-order and the second-order coding rates. On the other hand, Eq. (15) does have a better constant $-\log(1-\varepsilon)$, which corresponds to the fourth-order term in the small deviation regime. Such a term is negligible for small errors (e.g. $\varepsilon \le 10^{-3}$). For large errors, one can invoke the strengthened one-shot bound in Eq. (11) (though the formula is more involved).

… namely, it is a constant O(1) away from the fundamental limit. In the small deviation regime, the optimal rate R for some fixed error ε ∈ (0, 1) converges to the fundamental limit at a speed of $O(1/\sqrt{n})$, meaning that R deviates by a small amount. In between lies the moderate deviation regime, where R deviates by the order $\omega(1/\sqrt{n}) \cap o(1)$ [69,70]. In this case, the optimal error vanishes at a sub-exponential speed.

13 We use the notation $O^*$ to hide $m^{o(1)}$ and $\log\frac{1}{\epsilon}$ factors, where ϵ is the accuracy parameter.

14 More precisely, Hayashi and Nagaoka obtained the first one-shot achievability bound (for general c-q channels) as in Eq. (13), but in terms of the information-spectrum divergence $D_s^{\varepsilon}$ [44,45,58,66,106] instead of the hypothesis-testing divergence $D_h^{\varepsilon}$ defined in Proposition 2. On the other hand, the two divergences are related as shown in [66, Lemma 12], and hence the one-shot achievability bound in terms of $D_h^{\varepsilon}$ is tighter than that in terms of $D_s^{\varepsilon}$. Here, we remark that the approach proposed by Hayashi and Nagaoka [58] allows for choosing any measurement along with applying the Hayashi-Nagaoka inequality, Eq. (2), in establishing the achievability. Namely, the analysis in [58] with Eq. (2) can already lead to Eq. (13). We remark that the terminology and concept of the hypothesis-testing divergence $D_h^{\varepsilon}$ may already appear in the context of statistical hypothesis testing in the works of Stein-Chernoff [107], Strassen [40], Csiszár-Longo [108], and Polyanskiy-Poor-Verdú [50], and in the quantum setting in the work of Wang-Renner [60].

16 The second-order expansion of the quantum hypothesis-testing divergence was concurrently proposed by Tomamichel-Hayashi [66] and Ke Li [67]. Here, in Eq. (17), we cite Li's result [67, Theorem 5] since it has a better third-order term $-O(1)$ than that of Tomamichel-Hayashi [66, Eqs. (28) and (33)], in which the third-order term is $-\big(\tfrac{1}{2} + \min(\lambda(\sigma), \nu(\sigma))\big)\log n - O(1)$, where λ(σ) is the logarithmic condition number of σ and ν(σ) is the number of distinct eigenvalues of σ. We also remark that Li's result, Eq. (17), …

Table 2. Comparisons of the one-shot achievability bounds on the error probability and the coding size and rate established in Propositions 1 and 2 of Section 3 with existing results. We also present the i.i.d. asymptotic expansion of the coding rate to highlight the resulting third-order terms, where we use the shorthand I ≡ I(X : B)_ρ and V ≡ V(X : B)_ρ for brevity. (Table panels: one-shot exponential bounds on coding error; achievability bounds on coding size, with columns for the one-shot bounds and their i.i.d. asymptotic expansions.)
We remark that Eq. (19) will give the best possible achievable third-order coding rate for c-q channel coding without further assumptions on the channel.$^{17}$
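As a quick check of the third-order comparison in this subsection, one can substitute $\delta = 1/\sqrt{n}$ into the two one-shot penalty terms. The first is the term $-\log\frac{4}{\delta^2}$ attributed above to Eq. (13); the second, $-\log\frac{1}{\delta}$, is assumed here to be the penalty in Eq. (14), in line with the cost quoted for the pretty-good measurement in Section 4.1. This is plain arithmetic, not an additional result.

```latex
\begin{align*}
-\log\frac{4}{\delta^{2}}\bigg|_{\delta = 1/\sqrt{n}} &= -\log(4n) = -\log n - \log 4, \\
-\log\frac{1}{\delta}\bigg|_{\delta = 1/\sqrt{n}} &= -\log\sqrt{n} = -\tfrac{1}{2}\log n .
\end{align*}
```

The difference is $\frac{1}{2}\log n + \log 4$, which matches the stated $\frac{1}{2}\log n$ improvement up to an $O(1)$ constant.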
Remark 5. At the time of writing, a very recent work by Renes establishes the optimal error exponent for symmetric classical-quantum channels [111]. The result is asymptotic, but it matches the quantum sphere-packing bound [112-114] in the region of high achievable rates, and hence it is asymptotically optimal and tighter than the error exponent obtained in Proposition 1 in the i.i.d. asymptotic setting for symmetric classical-quantum channels. A one-shot counterpart is still missing. We leave this for future work.

Applications in Quantum Information Theory
The analysis proposed in Section 3 naturally extends to classical communication over quantum channels, network information theory [116], and beyond; see Table 1 in Section 1. We apply our analysis using the pretty-good measurement to binary quantum hypothesis testing in Section 4.1. We present entanglement-assisted classical communication over quantum channels in Section 4.2. Section 4.3 is for classical data compression with quantum side information. Section 4.4 studies entanglement-assisted and unassisted classical communication over quantum multiple-access channels. Section 4.5 considers entanglement-assisted and unassisted classical communication over quantum broadcast channels. Section 4.6 is devoted to entanglement-assisted classical communication over quantum channels with causal state information available at the encoder.

4.1. Binary Quantum Hypothesis Testing. Binary quantum hypothesis testing with the optimal quantum measurement is a relatively well-studied topic in quantum information theory due to its simpler mathematical structure and operational significance [14, 15, 61-64, 66, 67, 69, 70, 74, 75, 117-127]. The goal of this section is not to redo the analysis via optimal measurements, but to show how the sub-optimal pretty-good measurement, along with the properties of the noncommutative minimal given in Section 2, can recover the existing results with only a slightly sub-optimal coefficient. Specifically, we will show that the pretty-good measurement can also achieve the quantum Hoeffding bound [62, §5.5]. This indicates that the proposed analysis should not be too loose in terms of the one-shot exponential bounds (at least for binary quantum hypothesis testing).
Symmetric scenario. We first consider the symmetric scenario, where the two quantum hypotheses are described by density operators ρ and σ with prior probabilities p ∈ (0, 1) and 1 − p, respectively. Note that one-shot quantum hypothesis testing is also known as quantum state discrimination; the relation between the optimal measurement (i.e. the quantum Neyman-Pearson test) and the pretty-good measurement was proved by Barnum and Knill [93], [86, Theorem 3.10].
Subsequently, we show that the lower bound on the noncommutative minimal (Fact (vii)) can be interpreted as an adaptation of the Barnum-Knill Theorem. On the one hand, the Holevo-Helstrom theorem [14,74,75] shows that the minimal error for distinguishing ρ and σ in the symmetric scenario is given by $\mathrm{Tr}[p\rho \wedge (1-p)\sigma]$. On the other hand, by using the pretty-good measurement with respect to the weighted states (pρ, (1 − p)σ) and applying the lower bound on the noncommutative minimal, Fact (vii), the corresponding error probability is upper bounded by $2\,\mathrm{Tr}[p\rho \wedge (1-p)\sigma]$ (Eq. (20)), which is twice the minimal error achieved by the optimal measurement. This coincides with the claim made by Barnum and Knill on the relation between the error probability of the optimal measurement and that of the pretty-good measurement.
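The factor-of-two relation above can be checked numerically. The following is a minimal sketch on randomly generated states; the helper names, the random-state construction, and the prior p = 0.3 are illustrative assumptions. It compares the Holevo-Helstrom error with the error of the pretty-good measurement $\{S A S, S B S\}$, where $S = (A+B)^{-1/2}$, $A = p\rho$, and $B = (1-p)\sigma$.

```python
import numpy as np

def inv_sqrt(P, tol=1e-12):
    """Inverse square root of a PSD matrix, taken on its support only."""
    w, V = np.linalg.eigh(P)
    inv = np.zeros_like(w)
    inv[w > tol] = 1.0 / np.sqrt(w[w > tol])
    return (V * inv) @ V.conj().T

def random_state(d, rng):
    """A random full-rank density matrix (illustrative construction)."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def optimal_error(A, B):
    """Holevo-Helstrom error Tr[A ∧ B] = (Tr[A + B] - ||A - B||_1) / 2."""
    return 0.5 * (np.trace(A + B).real - np.abs(np.linalg.eigvalsh(A - B)).sum())

def pgm_error(A, B):
    """Total error of the pretty-good measurement for weighted states A, B."""
    S = inv_sqrt(A + B)
    # Tr[A S B S] + Tr[B S A S]; the two terms coincide by cyclicity.
    return 2.0 * np.trace(A @ S @ B @ S).real

rng = np.random.default_rng(0)
rho, sigma, p = random_state(3, rng), random_state(3, rng), 0.3
opt = optimal_error(p * rho, (1 - p) * sigma)
pgm = pgm_error(p * rho, (1 - p) * sigma)
print(opt, pgm)
```

On any such instance one expects `opt <= pgm <= 2 * opt`, up to floating-point error.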
Remark 6. As mentioned in Remark 2 of Section 3, the upper bound in Eq. (20) can be strengthened by using Eq. (28) of Lemma 1 in Appendix A instead of Fact (vii). On the other hand, one can also use a pretty-good measurement of a modified form.

Asymmetric scenario. We move on to the asymmetric scenario, namely, the tradeoff between the type-I error and the type-II error without knowing the prior distribution. We use the pretty-good measurement $\left\{\frac{\rho}{\rho+\mu\sigma}, \frac{\mu\sigma}{\rho+\mu\sigma}\right\}$ with a parameter µ that will be specified later and apply Fact (vii) to bound the type-I error α and the type-II error β, which gives Eq. (21). Next, we show how Eq. (21) implies both the small deviation type bound and the large deviation type bound. As in the proof of Proposition 2, we invoke the definition of "∧" in Eq. (5) with any test, which gives Eq. (22). Choosing µ such that the right-hand side of Eq. (22) equals ε and recalling the upper bound on "∧" (Fact (vi), taking s → 0), we obtain the bound on the type-II error in Eq. (23). We remark that Eq. (23) is stronger than the analysis provided by Beigi and Gohari [71, Theorem 6] in view of the relation between the quantum hypothesis-testing divergence and the quantum information-spectrum divergence, Eq. (16). This again highlights the fact that the pretty-good measurement yields a one-shot achievability bound on the Stein exponent (i.e. the maximal exponent of the type-II error provided that the type-I error is at most ε ∈ (0, 1)), and it can achieve the second-order asymptotics in the i.i.d. setting as well. Note that, since the pretty-good measurement is sub-optimal, it incurs a cost $-\log\frac{1}{\delta}$ on the Stein exponent in the one-shot setting and a third-order term $-\frac{1}{2}\log n$ in the n-fold i.i.d. scenario. Yet, it is still sufficient to achieve the moderate deviation asymptotics [70] (i.e. the inferior third-order term $-\frac{1}{2}\log n$ does not affect the moderate deviation expansion).
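The type-I/type-II tradeoff of the µ-parameterized pretty-good measurement can be sketched as follows. The code assumes that Fact (vii) takes the form $\mathrm{Tr}[A\,(A+B)^{-1/2} B\,(A+B)^{-1/2}] \le \mathrm{Tr}[A \wedge B]$, which with $A = \rho$ and $B = \mu\sigma$ gives $\alpha \le \mathrm{Tr}[\rho \wedge \mu\sigma]$ and $\beta \le \mu^{-1}\mathrm{Tr}[\rho \wedge \mu\sigma]$. The helper names and the random test states are illustrative.

```python
import numpy as np

def inv_sqrt(P, tol=1e-12):
    """Inverse square root of a PSD matrix, taken on its support only."""
    w, V = np.linalg.eigh(P)
    inv = np.zeros_like(w)
    inv[w > tol] = 1.0 / np.sqrt(w[w > tol])
    return (V * inv) @ V.conj().T

def nc_min_trace(A, B):
    """Tr[A ∧ B] = (Tr[A + B] - ||A - B||_1) / 2."""
    return 0.5 * (np.trace(A + B).real - np.abs(np.linalg.eigvalsh(A - B)).sum())

def pgm_test_errors(rho, sigma, mu):
    """Type-I and type-II errors of the test {rho/(rho+mu*sigma), mu*sigma/(rho+mu*sigma)}."""
    S = inv_sqrt(rho + mu * sigma)
    accept_rho = S @ rho @ S            # outcome "decide rho"
    reject_rho = S @ (mu * sigma) @ S   # outcome "decide sigma"
    alpha = np.trace(rho @ reject_rho).real   # rho is rejected although true
    beta = np.trace(sigma @ accept_rho).real  # rho is accepted although sigma is true
    return alpha, beta

def random_state(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(1)
rho, sigma = random_state(3, rng), random_state(3, rng)
tradeoff = {mu: pgm_test_errors(rho, sigma, mu) for mu in (0.1, 1.0, 10.0)}
print(tradeoff)
```

By cyclicity of the trace, the two errors of this test satisfy α = µβ exactly, so tuning µ moves weight between them.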
Next, we show that the pretty-good measurement can recover the quantum Hoeffding bound [62, §5.5]. Applying the upper bound on "∧" (Fact (vi) with α = 1 − s ∈ (0, 1)) to Eq. (21), we obtain a bound that, once expressed via the quantum Petz-Rényi divergence $D_\alpha$ introduced in Proposition 1, yields the one-shot quantum Hoeffding bound, valid for all r > 0 and α ∈ (0, 1). To the best of our knowledge, this is the first time that the quantum Hoeffding bound has been achieved using the pretty-good measurement.
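The ingredients of this step can be illustrated numerically. The sketch below assumes that Fact (vi) is the inequality $\mathrm{Tr}[A \wedge B] \le \mathrm{Tr}[A^{1-s}B^{s}]$ for $s \in (0,1)$ and uses the standard definition $D_\alpha(\rho\|\sigma) = \frac{1}{\alpha-1}\log \mathrm{Tr}[\rho^{\alpha}\sigma^{1-\alpha}]$ of the Petz-Rényi divergence; both are assumptions about notation not displayed in this section, and the helper names are illustrative.

```python
import numpy as np

def mpow(P, s):
    """Fractional power of a PSD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(P)
    w = np.clip(w, 0.0, None)  # remove tiny negative round-off
    return (V * w**s) @ V.conj().T

def nc_min_trace(A, B):
    """Tr[A ∧ B] = (Tr[A + B] - ||A - B||_1) / 2."""
    return 0.5 * (np.trace(A + B).real - np.abs(np.linalg.eigvalsh(A - B)).sum())

def petz_renyi(rho, sigma, a):
    """Petz-Renyi divergence of order a in (0, 1), natural logarithm."""
    return np.log(np.trace(mpow(rho, a) @ mpow(sigma, 1 - a)).real) / (a - 1)

def random_state(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(2)
rho, sigma, mu = random_state(3, rng), random_state(3, rng), 2.0
lhs = nc_min_trace(rho, mu * sigma)
# Upper bounds Tr[rho^(1-s) (mu*sigma)^s] for a few values of s.
rhs = {s: np.trace(mpow(rho, 1 - s) @ mpow(mu * sigma, s)).real
       for s in (0.25, 0.5, 0.75)}
print(lhs, rhs, petz_renyi(rho, sigma, 0.5))
```

Optimizing the right-hand side over s is what turns the one-shot bound on the noncommutative minimal into an exponent expressed through $D_\alpha$.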

4.2. Entanglement-Assisted Classical Communication Over Quantum Channels. In this section, we elaborate on how the achievability of entanglement-assisted (EA) classical communication [31,80,128-131] follows in the same fashion from the simple derivation proposed in Section 3.
Definition 2 (Entanglement-assisted (EA) classical communication over quantum channels). Let $\mathcal{N}_{A\to B}$ be a quantum channel.
1. Alice holds a classical register M and quantum registers A and A′, and Bob holds quantum registers B and R′.
2. A resource of an arbitrary state $\theta_{R'A'}$ is shared between Bob and Alice beforehand.
3. For any (equiprobable) message m ∈ M that Alice wants to send, she performs an encoding quantum operation $\mathcal{E}^m_{A'\to A}$ on $\theta_{R'A'}$.
4. The quantum channel $\mathcal{N}_{A\to B}$ is applied on Alice's quantum register A and outputs a state on Bob's quantum register B.
5. Bob performs a decoding measurement $\{\Pi^m_{R'B}\}_{m\in M}$ on registers R′ and B to extract the sent message m.
An (M, ε)-EA-code for $\mathcal{N}_{A\to B}$ is a protocol such that |M| = M and the average error probability is at most ε.

We adopt the encoder of the position-based coding [21], but with the pretty-good measurement as the decoder.
• Preparation. Alice and Bob pre-share an M-fold product state $\theta_{R'A'} := \theta_{RA}^{\otimes M}$.
• Encoding. To send a message m ∈ M, Alice simply sends her system $A_m$, i.e. $\mathcal{E}^m_{A^M\to A} = \mathrm{Tr}_{A^{M\setminus\{m\}}}$, which traces out the systems $A_{\bar m}$ for all $\bar m \neq m$.
• Decoding. At the receiver, Bob applies the pretty-good measurement with respect to the channel output states.

… To the best of our knowledge, its explicit construction is not clear in noncommutative probability space. We leave this for future work.
Remark 8. The analysis of Theorem 2 actually shares the same flavor as that of Theorem 1. More precisely, the partial trace $\mathrm{Tr}_{R^{M\setminus\{m\}}}$ in Eq. (25) plays the same role as the averaging over the random codebook in Eq. (10). In other words, the partial trace $\mathrm{Tr}_{R^{M\setminus\{m\}}}$ can be interpreted as a conditional expectation [136-139] (which is a completely positive and trace-preserving map) from the operator algebra of bounded operators on $R^M B$, i.e. $\mathcal{B}(R^M B)$, to its subalgebra.$^{19}$

Directly applying the pretty-good measurement as above allows us to obtain a tighter and cleaner one-shot achievability bound in a more general form. This revisits the position-based coding proposed by Anshu, Jain, and Warsi [21, Lemma 4]. We summarize it as the following one-shot quantum packing lemma, which not only underlies Theorems 1 and 2 and all the forthcoming results in this section but, we believe, is also applicable elsewhere in quantum information theory.
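Theorem 3 (A one-shot quantum packing lemma). Let $\rho_{RB}$ and $\tau_R$ be arbitrary density operators and let M be an integer. For every m ∈ M := {1, . . ., M}, define the state on $R^M B$ in which the pair $R_m B$ is in the state $\rho_{R_m B}$ and every other register $R_{\bar m}$, $\bar m \neq m$, is in the state $\tau_{R_{\bar m}}$, where $\rho_{R_m B} = \rho_{RB}$ and $\tau_{R_m} = \tau_R$ for every m ∈ M. Then, there exists a measurement $\{\Pi^m_{R^M B}\}_{m\in M}$ whose error, for every m ∈ M, satisfies a one-shot bound of the same form as in Theorems 1 and 2.

To see how the one-shot quantum packing lemma applies to the previous achievability bounds, we make the following substitutions: $\rho_{R_m B} \to \mathcal{N}_{A\to B}(\theta_{R_m A_m})$ and $\tau_{R_{\bar m}} \to \theta_{R_{\bar m}}$ for all $\bar m \in$ M with $\bar m \neq m$. Then, Theorem 3 covers Theorem 2 for entanglement-assisted classical communication over quantum channels.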
On the other hand, in the scenario where $R_m B \to X_m B$, $\rho_{R_m B} \to \rho_{X_m B}$, and $\tau_{R_{\bar m}} \to \rho_{X_{\bar m}}$ for all $\bar m \in$ M with $\bar m \neq m$, the setting in Theorem 3 corresponds to randomness-assisted communication over c-q channels, where the systems $X_m$ are the shared randomness at Bob and the joint state $\rho_{X_m B}$ results from Alice sending her m-th classical system through the c-q channel (see also the papers by Wilde [82] and Anshu-Jain-Warsi [140]). Then, Theorem 3 yields the achievability bound on the average error probability over the ensemble of codes, i.e. the right-hand side of Eq. (8) in Theorem 1. Via de-randomization, one can always claim the existence of a good code in the ensemble that achieves such an error bound without randomness assistance. This concludes the statement of Theorem 1 for c-q channel coding.$^{20}$
Following the same reasoning as in Proposition 1, Theorem 2 (or Theorem 3) leads to a large deviation type bound, which is tighter than [83, Theorem 6] without a prefactor 4; following the same reasoning as in Proposition 2, Theorem 2 provides a tighter lower bound on the ε-one-shot entanglement-assisted capacity for N A→B (i.e. the maximal logarithmic size of messages with average error probability below ε) than [21, Theorem 1], [83,Theorem 8], and [57, Theorem 5.1] (with the same improvements as the comparison made in Section 3.1).
Proposition 3 (Bounding coding error given fixed coding rate). Consider an arbitrary quantum channel $\mathcal{N}_{A\to B}$. Then, for any R > 0, there exists an $(e^R, \varepsilon)$-EA-code for $\mathcal{N}_{A\to B}$ such that, for any $\theta_{RA}$, the error ε obeys an exponential upper bound analogous to Proposition 1. Here, we follow the notation given in Proposition 1.

19 Namely, the subalgebra consists of all operators in …

20 We note that applying Theorem 3 to randomness-assisted communication over c-q channels is equivalent to calculating the average error probability using a mutually independent random codebook, while Theorem 1 only requires a pairwise independent random codebook.
Proposition 4 (Bounding coding rate given fixed coding error). Consider an arbitrary quantum channel $\mathcal{N}_{A\to B}$. Then, for any ε ∈ (0, 1), there exists an (M, ε)-EA-code for $\mathcal{N}_{A\to B}$ such that, for any $\theta_{RA}$ and any δ ∈ (0, ε), log M obeys a lower bound analogous to Proposition 2. Here, we follow the notation given in Proposition 2.

4.3. Classical Data Compression with Quantum Side Information. In this section, we show how the proposed method in Section 3 can be applied to classical data compression with quantum side information [99,126,141-143]. Subsequently, we will term such a protocol CQSW (which stands for classical-quantum Slepian-Wolf).
Definition 3 (Classical data compression with quantum side information). Let $\rho_{XB} = \sum_{x\in X} p_X(x)\,|x\rangle\langle x| \otimes \rho_B^x$ be a classical-quantum state.
1. Alice holds classical registers X and M, and Bob holds a quantum register B.
2. Alice performs an encoding E : X → M that compresses the source in X to an index in M.
3. Bob performs a decoding measurement described by a family of POVMs indexed by m ∈ M, i.e. $\{\Pi^{x,m}_B\}_{x\in X}$ on register B, to recover the source x ∈ X.

An (M, ε)-CQSW-code for $\rho_{XB}$ is a protocol such that |M| = M and the error probability is at most ε.

Without loss of generality, we assume that the prior distribution of the source, $p_X$, has full support. We also adopt the standard random coding strategy as in Section 3.
• Encoding. The encoder maps each x ∈ X pairwise independently to a uniformly distributed index m ∈ M.
• Decoding. We use the pretty-good measurement of Eq. (26) (again given the realization of the above encoding).

Theorem 4 (A one-shot achievability bound for classical data compression with quantum side information). Consider an arbitrary classical-quantum state $\rho_{XB} = \sum_{x\in X} p_X(x)\,|x\rangle\langle x| \otimes \rho_B^x$. Then, there exists an (M, ε)-CQSW-code for $\rho_{XB}$ such that

$$\varepsilon \le \mathrm{Tr}\left[\rho_{XB} \wedge \tfrac{1}{M}\, 1_X \otimes \rho_B\right].$$

Proof. We use the pretty-good measurement given in Eq. (26) to bound the expected error probability (over the random encoding) through a chain of steps (a)-(e), where (a) uses the lower bound on the noncommutative minimal given in Fact (vii); (b) follows from the concavity given in Fact (iv); (c) follows from the pairwise-independent and uniform random encoding; (d) follows from the monotone increase in the Loewner ordering and $\sum_{\bar x \neq x} p_X(\bar x)\rho_B^{\bar x} \le \sum_{\bar x} p_X(\bar x)\rho_B^{\bar x} = \rho_B$; and lastly, (e) follows from the direct sum formula given in Fact (v). □

Using the same reasoning as in Propositions 1 and 2 of Section 3, we have the following one-shot bounds for CQSW.
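Before stating these bounds, the encoding and decoding steps above can be sketched as a small simulation. The code assumes that the decoder for bin m is the pretty-good measurement over the weighted states $p_X(x)\rho_B^x$ of all x mapped to m, which is our reading of the decoder described above since Eq. (26) is not reproduced here; the four-state qubit source and the function names are illustrative.

```python
import numpy as np

def inv_sqrt(P, tol=1e-12):
    """Inverse square root of a PSD matrix, taken on its support only."""
    w, V = np.linalg.eigh(P)
    inv = np.zeros_like(w)
    inv[w > tol] = 1.0 / np.sqrt(w[w > tol])
    return (V * inv) @ V.conj().T

def cqsw_error(p, states, binning):
    """Average decoding error for a fixed encoder x -> binning[x]."""
    err = 0.0
    for m in set(binning):
        members = [x for x in range(len(p)) if binning[x] == m]
        total = sum(p[x] * states[x] for x in members)
        S = inv_sqrt(total)
        for x in members:
            Pi = S @ (p[x] * states[x]) @ S  # PGM element for source symbol x
            err += p[x] * (1.0 - np.trace(states[x] @ Pi).real)
    return err

def ket(v):
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    return np.outer(v, v)

# Hypothetical source: side-information states |0>, |1>, |+>, |->, uniform prior.
states = [ket([1, 0]), ket([0, 1]), ket([1, 1]), ket([1, -1])]
p = [0.25] * 4

err_orthogonal_bins = cqsw_error(p, states, [0, 0, 1, 1])  # each bin is orthogonal
err_overlapping_bins = cqsw_error(p, states, [0, 1, 0, 1])  # each bin overlaps
err_injective = cqsw_error(p, states, [0, 1, 2, 3])         # M = 4, no collisions

rng = np.random.default_rng(3)
random_binning = rng.integers(2, size=4).tolist()           # uniform random encoder
err_random = cqsw_error(p, states, random_binning)
print(err_orthogonal_bins, err_overlapping_bins, err_injective, err_random)
```

The example uses fully independent uniform binning, which is in particular pairwise independent; errors arise only when non-orthogonal states share a bin.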
Proposition 5 (Bounding coding error given fixed coding rate). Consider an arbitrary classical-quantum state $\rho_{XB} = \sum_{x\in X} p_X(x)\,|x\rangle\langle x| \otimes \rho_B^x$. Then, for any R > 0, there exists an $(e^R, \varepsilon)$-CQSW-code for $\rho_{XB}$ such that ε obeys an exponential upper bound analogous to Proposition 1.

Proposition 6 (Bounding coding rate given fixed coding error). Consider an arbitrary classical-quantum state $\rho_{XB} = \sum_{x\in X} p_X(x)\,|x\rangle\langle x| \otimes \rho_B^x$. Then, for any ε ∈ (0, 1), there exists an (M, ε)-CQSW-code for $\rho_{XB}$ such that, for any δ ∈ (0, ε), log M obeys a bound analogous to Proposition 2.

4.4. Multiple-Access Channel Coding. In this section, we show one-shot achievability bounds for classical-quantum multiple-access channel (MAC) coding and entanglement-assisted classical communication over quantum MACs [83,144-146]. Note that the former naturally extends to (unassisted) classical communication over quantum MACs. We present the scenario with only two senders and one receiver; the result applies to multiple senders in the same fashion.
We follow the strategy presented in Section 3.
• Encoding. Each pair of messages $(m_A, m_B) \in M_A \times M_B$ is mapped to a codeword $(x(m_A), y(m_B)) \in X \times Y$ pairwise independently according to some probability distribution $p_X \otimes p_Y$.
• Decoding. We use the pretty-good measurement with respect to the corresponding channel output states (given the realization of the random codebook).

Then, we obtain the following result (without duplicating the proof).
Definition 6 (Classical-quantum broadcast channel coding). Let $\mathcal{N}_{X\to BC} : x \mapsto \rho^x_{BC}$ be a classical-quantum broadcast channel.
We follow the analysis proposed in Section 3 by considering communication from Alice to Bob and from Alice to Charlie, separately.
• Encoding. We introduce two auxiliary classical registers U and V for precoding. The messages $m_B \in M_B$ and $m_C \in M_C$ are precoded through U and V, respectively.
• Decoding. For each $m_B \in M_B$, Bob performs the pretty-good measurement with respect to his marginal channel output states. Similarly, for each $m_C \in M_C$, Charlie performs the pretty-good measurement with respect to his marginal channel output states (denoted with a slight abuse of notation again).

Then, we obtain the following result (without duplicating the proof).
Theorem 7 (A one-shot achievability bound for classical-quantum broadcast channel coding). Consider an arbitrary classical-quantum broadcast channel $\mathcal{N}_{X\to BC} : x \mapsto \rho^x_{BC}$. Then, there exists an $(M_B, M_C, \varepsilon_B, \varepsilon_C)$-code for $\mathcal{N}_{X\to BC}$ such that one-shot bounds on $\varepsilon_B$ and $\varepsilon_C$ hold for any probability distributions $p_U$ and $p_V$ and any (deterministic) encoding $(u, v) \mapsto x(u, v)$.

Note that Theorem 7 extends to classical communication over quantum broadcast channels straightforwardly (see Table 1).
Remark 9. Theorem 7 employs independent pre-coding p U ⊗ p V and hence it provides a simple and clean one-shot achievability bound.We note that such a scenario was considered by Anshu, Jain, and Warsi [140].Hence, Theorem 7 improves on the achievability in Ref. [140,Theorem 13].
Generally, Alice can adopt a joint pre-coding $p_{UV}$, which is called Marton's inner bound in the classical setting [153,154] (see also the studies in the quantum setting [21,31,147-150]); however, it would require additional covering techniques. We leave this for future work [155].
Next, we present entanglement-assisted classical communication over quantum broadcast channels.
We follow a similar analysis as above by considering communication from Alice to Bob and from Alice to Charlie separately. Again, we employ the encoder of the position-based coding as in Refs. [21, Theorem 6] and [140, Theorem 6], and apply the pretty-good measurement for decoding.
• Preparation. Consider an arbitrary state θ shared among the parties as the entanglement resource.
• Decoding. With a slight abuse of notation for the marginal channel output states at Bob, Bob performs the corresponding pretty-good measurement. Similarly, with a slight abuse of notation for the marginal channel output states at Charlie, Charlie performs the corresponding pretty-good measurement.

Then, we obtain the following result (without duplicating the proof).
Theorem 8 (A one-shot achievability bound for EA classical communication over quantum broadcast channels). Consider an arbitrary quantum broadcast channel $\mathcal{N}_{A\to BC}$. Then, there exists an entanglement-assisted code for $\mathcal{N}_{A\to BC}$ satisfying analogous one-shot error bounds (see Table 1).

4.6. Communication with Causal State Information at the Encoder. In this section, we consider entanglement-assisted and unassisted classical communication over quantum channels with causal channel state information available at the encoder [21,31,140], which is the quantum generalization of the classical Gel'fand-Pinsker channel [156]; see also Ref. [116, §7].
Definition 8 (Classical-quantum channel coding with causal state information).Let N XS→B : (x, s) → ρ x,s B be a classical-quantum channel parameterized by s ∈ S and assume that a channel state p S is available at the encoder.
1. The channel holds a classical register S. Alice holds classical registers M, X, and S′ (an identical copy of S). Bob holds a quantum register B.
2. Given a realization of the channel state s ∈ S′, an encoding (m, s) → x(m, s) maps an equiprobable message m ∈ M to a codeword in X.
3. The classical-quantum channel $\mathcal{N}_{XS\to B}$ is applied on Alice's register X given the realization of the channel state s ∈ S, and outputs a state on B at Bob. (The realizations s ∈ S′ at Alice and s ∈ S at the channel are identical.)
4. Bob performs a decoding measurement $\{\Pi^m_B\}_{m\in M}$ on B to extract the sent message m ∈ M. (Note that Bob is aware of the mathematical description of the probability distribution $p_S$ but not of the realization of a specific s ∈ S.)

An (M, ε)-code for $\mathcal{N}_{XS\to B}$ with state information $p_S$ is a protocol such that |M| = M and the average error probability is at most ε.

We adopt the standard random coding strategy as follows.
• Encoding. We introduce an auxiliary classical register U for precoding. The message m ∈ M is mapped to a pre-codeword u ∈ U pairwise independently according to $p_U$. With the realization of the channel state s ∈ S′, Alice picks a (deterministic) encoding (u(m), s) → x(u(m), s) ∈ X.
• Decoding. At the receiver, denoting (with a slight abuse of notation) the channel output state by $\rho^m_B := \sum_{s\in S} p_S(s)\,\rho_B^{x(u(m),s),s}$ for each m ∈ M, Bob performs the corresponding pretty-good measurement.

Then, following the analysis in Section 3, we obtain a one-shot achievability bound (without duplicating the proof).
Theorem 9 (A one-shot achievability bound for classical-quantum channel coding with causal state information). Consider an arbitrary classical-quantum channel $\mathcal{N}_{XS\to B} : (x, s) \mapsto \rho_B^{x,s}$ with state information $p_S$. Then, there exists an (M, ε)-code for $\mathcal{N}_{XS\to B}$ with state information $p_S$ such that a one-shot bound on ε holds for any probability distribution $p_U$ and any (deterministic) map (u, s) → x(u, s). Here, $\rho_{UB} := \sum_{(u,s)\in U\times S} p_U(u)\,|u\rangle\langle u| \otimes p_S(s)\,\rho_B^{x(u,s),s}$.
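The state $\rho_{UB}$ appearing in Theorem 9 can be assembled explicitly. The sketch below builds the state-averaged outputs $\rho_B^u = \sum_s p_S(s)\rho_B^{x(u,s),s}$ and then $\rho_{UB}$ as a block-diagonal matrix. The toy channel (output $|x \oplus s\rangle$) and the encoder $x(u,s) = u \oplus s$ are hypothetical choices made only to illustrate how the encoder can use its knowledge of s.

```python
import numpy as np

def effective_states(p_S, channel, encoder, num_u):
    """rho_B^u = sum_s p_S(s) * rho_B^{x(u,s), s} for each pre-codeword u."""
    return [sum(p_S[s] * channel[(encoder[(u, s)], s)] for s in range(len(p_S)))
            for u in range(num_u)]

def joint_state(p_U, eff):
    """rho_UB = sum_u p_U(u) |u><u| ⊗ rho_B^u, stored block-diagonally."""
    d = eff[0].shape[0]
    out = np.zeros((len(p_U) * d, len(p_U) * d), dtype=complex)
    for u, (pu, rho_u) in enumerate(zip(p_U, eff)):
        out[u * d:(u + 1) * d, u * d:(u + 1) * d] = pu * rho_u
    return out

# Hypothetical channel: input x and state s produce the basis state |x XOR s>.
basis = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
channel = {(x, s): basis[x ^ s] for x in (0, 1) for s in (0, 1)}
# Hypothetical encoder that cancels the channel state: x(u, s) = u XOR s.
encoder = {(u, s): u ^ s for u in (0, 1) for s in (0, 1)}

p_S, p_U = [0.7, 0.3], [0.5, 0.5]
eff = effective_states(p_S, channel, encoder, num_u=2)
rho_UB = joint_state(p_U, eff)
print(np.round(rho_UB.real, 3))
```

In this toy case the encoder removes the effect of s entirely, so the averaged outputs for u = 0 and u = 1 are orthogonal; with a state-independent encoder x(u, s) = u, the two averaged outputs would instead be the mixtures diag(0.7, 0.3) and diag(0.3, 0.7).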
The result extends to classical communication over quantum channels N AS→B with quantum state information ϑ S (see also the definition below in the entanglement-assisted setting).We refer the reader to Table 1 for the corresponding results.
Remark 10. In the precoding phase of Theorem 9, the chosen probability distribution $p_U$ is independent of the channel state $p_S$. This coding strategy is called causal state information at the encoder [116, §7.5], and it was also studied in the quantum setting [140, Theorem 12]. (Hence, Theorem 9 improves on [140, Theorem 12].) For the scenario of non-causal state information at the encoder, the pre-coding probability distribution on U may be correlated with the channel state $p_S$ [31, §4], [21, §V]. This would require additional techniques. We leave this for future work [155].
Next, we move on to the entanglement-assisted setting.
Definition 9 (Entanglement-assisted classical communication over quantum channels with causal state information).Let N AS→B be a quantum channel with a channel state ϑ S .
1. The channel holds a quantum register S. Alice holds a classical register M and quantum registers A′ and S′. Bob holds quantum registers R′ and B.
2. A resource of an arbitrary state $\theta_{R'A'}$ is shared between Bob and Alice. Moreover, let $\vartheta_{S'S}$ be a purified state of $\vartheta_S$ shared between Alice and the channel.
3. Alice performs an encoding $\mathcal{E}^m_{A'S'\to A}$ on registers A′ and S′ of the state $\theta_{R'A'} \otimes \vartheta_{S'S}$ for any equiprobable message m ∈ M.
4. The quantum channel $\mathcal{N}_{AS\to B}$ is applied on Alice's register A and the register S of the channel state, and outputs a state on B at Bob.
5. Bob performs a decoding measurement $\{\Pi^m_{R'B}\}_{m\in M}$ on R′B to extract the sent message m. (Note that Bob is aware of the mathematical description of the channel state $\vartheta_S$, but Bob can neither access the channel's register S nor operate on the channel state.)

An (M, ε)-EA-code for $\mathcal{N}_{AS\to B}$ with state information $\vartheta_S$ is a protocol such that |M| = M and the average error probability is at most ε.

We also use the encoder of the position-based coding as in [21, Theorem 5] and [140, Theorem 4], and the pretty-good measurement for decoding.
• Preparation. Consider an arbitrary state $\theta_{RAS}$ satisfying $\theta_{RS} = \theta_R \otimes \vartheta_S$. Let $\theta_{RU}$ be a purified state of $\theta_R$ with an additional quantum register U at Alice. Then, Alice and Bob share the M-fold product state $\theta_{R'A'} := \theta_{RU}^{\otimes M}$.
• Encoding. For each m ∈ M, Alice sends the m-th register of U and applies an isometry $V_{S'U\to EA}$ such that $V_{S'U\to EA}(\theta_{RU} \otimes \vartheta_{S'S})$ equals a purified state $\theta_{ERAS}$ of $\theta_{RAS}$ with an additional purifying register E. Then, the overall encoding map is $\mathcal{E}^m_{U^M\to A} = \mathrm{Tr}_E \circ V_{S'U\to EA} \circ \mathrm{Tr}_{U^{M\setminus\{m\}}}$.
• Decoding. With a slight abuse of notation for the marginal channel output states at Bob for each m ∈ M, Bob performs the corresponding pretty-good measurement.
Then, applying the analysis in Section 4.2, we obtain the following result (without duplicating the proof).
Theorem 10 (A one-shot achievability bound for EA classical communication over quantum channels with causal state information). Consider an arbitrary quantum channel $\mathcal{N}_{AS\to B}$ with state information $\vartheta_S$. Then, there exists an (M, ε)-EA-code for $\mathcal{N}_{AS\to B}$ with state information $\vartheta_S$ such that a one-shot bound on ε holds for any $\theta_{RAS}$ satisfying $\theta_{RS} = \theta_R \otimes \vartheta_S$. Here, $\rho_{RB} := \mathcal{N}_{AS\to B}(\theta_{RAS})$.

Conclusions
We propose a conceptually simple analysis of one-shot achievability for processing classical information in quantum systems. The key point of this work is to demonstrate that the pretty-good measurement directly translates the conditional error probability of a multiple-state discrimination to the error of discriminating a state against the rest. This can be viewed as a one-versus-rest strategy, and hence the pretty-good measurement effectively resembles the quantum union bound in quantum coding design and analysis. We obtain an elegant closed-form expression of the average error probability for classical communication over quantum channels with standard random coding and basic properties of the noncommutative minimal. The proposed method is tight in the sense that it gives tighter one-shot achievability bounds for channels without further constraints such as symmetry (see Section 3.1), and it unifies the asymptotic derivations in the large, small, and moderate deviation regimes (Figure 2). Moreover, the analysis naturally applies to various quantum information-theoretic tasks (see Section 4 and Table 1). This shows that the proposed method may be considered a fundamental and unified approach to deriving achievable error bounds in quantum information theory. In this regard, we may term it a one-shot quantum packing lemma (Theorem 3). Essentially, the proposed analysis can be applied to, and can sharpen, almost all existing results that rely on the Hayashi-Nagaoka operator inequality [58, Lemma 2]; see e.g. Refs. [21,58,60,82,83,99,140,143,157]. The improvement is crucial because every bit counts in a one-shot bound: weaker one-shot bounds could be trivial in certain practical scenarios. Hence, we expect more applications of the proposed analysis to emerge. As for the computational aspect, we point out a recently developed quantum algorithm for implementing the pretty-good measurement [78]. Last but not least, the proposed achievability analysis also applies to the converse analysis for covering-type problems [22,23,30,155].
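The one-versus-rest interpretation can be made concrete with a small numerical sketch. For a list of states $\rho_1, \dots, \rho_M$, the conditional error of the pretty-good measurement for message m is $\mathrm{Tr}[\rho_m S (\sum_{\bar m \neq m} \rho_{\bar m}) S]$ with $S = (\sum_m \rho_m)^{-1/2}$, and it is compared with $\mathrm{Tr}[\rho_m \wedge \sum_{\bar m \neq m} \rho_{\bar m}]$, computed through the trace-norm identity assumed for the noncommutative minimal. The random states and helper names are illustrative.

```python
import numpy as np

def inv_sqrt(P, tol=1e-12):
    """Inverse square root of a PSD matrix, taken on its support only."""
    w, V = np.linalg.eigh(P)
    inv = np.zeros_like(w)
    inv[w > tol] = 1.0 / np.sqrt(w[w > tol])
    return (V * inv) @ V.conj().T

def nc_min_trace(A, B):
    """Tr[A ∧ B] = (Tr[A + B] - ||A - B||_1) / 2."""
    return 0.5 * (np.trace(A + B).real - np.abs(np.linalg.eigvalsh(A - B)).sum())

def pgm_conditional_errors(states):
    """Conditional error of the pretty-good measurement for each message."""
    total = sum(states)
    S = inv_sqrt(total)
    # 1 - Tr[rho_m Pi_m] equals Tr[rho_m S (total - rho_m) S] on the joint support.
    return [np.trace(rho @ S @ (total - rho) @ S).real for rho in states]

def one_vs_rest_bounds(states):
    """Tr[rho_m ∧ (sum of all other states)] for each message."""
    total = sum(states)
    return [nc_min_trace(rho, total - rho) for rho in states]

def random_state(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(4)
states = [random_state(3, rng) for _ in range(4)]
errors = pgm_conditional_errors(states)
bounds = one_vs_rest_bounds(states)
print(list(zip(errors, bounds)))
```

Each conditional error is controlled by a binary "state versus the rest" quantity, which is the sense in which the pretty-good measurement plays the role of a union bound.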
We list some open problems along these research directions.
• Standard second-order analysis of the achievable coding rate consists of two steps: (i) reducing the underlying task to binary quantum hypothesis testing and (ii) an asymptotic expansion of the quantum hypothesis-testing divergence [57,66,67,101,102]. The proposed approach simplifies step (i), and hence the bottleneck now lies in step (ii). Specifically, we conjecture a third-order achievable expansion of the quantum hypothesis-testing divergence in Eq. (18). If Eq. (18) holds, then Proposition 2 will lead to the best possible third-order coding rate for general classical-quantum channels without further assumptions, i.e. Eq. (19).

… within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan.

Figure 1. Schematic illustration of the proposed achievability analysis for classical-quantum channel ($x \mapsto \rho_B^x$) coding. Similar reasoning applies to various quantum information-theoretic tasks. The reader can refer to Table 1 and Section 4 for instances.

Figure 2. Flow chart of the implications of the established one-shot achievability bound in the large deviation, small deviation, and moderate deviation regimes. Here, $R := \frac{1}{n}\log M$ denotes the coding rate for blocklength n ∈ N. The precise notation is given in Section 3.

Definition 4 (Classical-quantum multiple-access channel coding). Let $\mathcal{N}_{XY\to C} : (x, y) \mapsto \rho_C^{x,y}$ be a classical-quantum multiple-access channel.
1. Alice holds classical registers $M_A$ and X, Bob holds $M_B$ and Y, and Charlie holds quantum register C.
2. Alice performs an encoding $m_A \mapsto x(m_A) \in X$ for any equiprobable message $m_A \in M_A$ she wants to send. Bob performs an encoding $m_B \mapsto y(m_B) \in Y$ for any equiprobable message $m_B \in M_B$ he wants to send.
3. The channel $\mathcal{N}_{XY\to C}$ is applied on Alice's and Bob's registers X and Y and outputs a state on C at Charlie.
4. A decoding measurement $\{\Pi_C^{m_A,m_B}\}_{(m_A,m_B)\in M_A\times M_B}$ is performed on register C to extract the sent message pair $(m_A, m_B)$.

An $(M_A, M_B, \varepsilon)$-code for $\mathcal{N}_{XY\to C}$ is a protocol such that $|M_A| = M_A$, $|M_B| = M_B$, and the average error probability is at most ε.

1. Alice holds classical registers $M_B$, $M_C$, and $X$. Bob holds quantum register $B$ and Charlie holds quantum register $C$.
2. Alice performs an encoding $(m_B, m_C) \mapsto x(m_B, m_C) \in \mathsf{X}$ for any equiprobable message $(m_B, m_C) \in \mathsf{M}_B \times \mathsf{M}_C$ she wants to send to Bob and Charlie, respectively.
3. The channel $\mathcal{N}_{X\to BC}$ is applied to Alice's register $X$ and outputs a marginal state on $B$ at Bob and a marginal state on $C$ at Charlie.
4. Bob performs a decoding measurement $\{\Pi_B^{m_B}\}_{m_B\in \mathsf{M}_B}$ on register $B$ to extract the sent message $m_B$, and Charlie performs a decoding measurement $\{\Pi_C^{m_C}\}_{m_C\in \mathsf{M}_C}$ on register $C$ to extract the sent message $m_C$.

An $(M_B, M_C, \varepsilon_B, \varepsilon_C)$-code for $\mathcal{N}_{X\to BC} : x \mapsto \rho_{BC}^x$ is a protocol such that $|\mathsf{M}_B| = M_B$, $|\mathsf{M}_C| = M_C$, and the average error probabilities for Bob and Charlie are at most $\varepsilon_B$ and $\varepsilon_C$, respectively.

• Encoding. Each message $m_B \in \mathsf{M}_B$ for Bob is encoded to a pre-codeword $u(m_B) \in \mathsf{U}$ pairwise independently according to some probability distribution $p_U$; each message $m_C \in \mathsf{M}_C$ for Charlie is encoded to a pre-codeword $v(m_C) \in \mathsf{V}$ pairwise independently according to some probability distribution $p_V$. Then, Alice picks a (deterministic) encoding $(u(m_B), v(m_C)) \mapsto x(u(m_B), v(m_C)) \in \mathsf{X}$.
• Decoding. With a slight abuse of notation, denote the marginal channel output states at Bob by $\rho_B^{m_B} := \frac{1}{M_C} \sum_{m_C \in \mathsf{M}_C} \rho_B^{x(u(m_B), v(m_C))}$.

Let $\theta_{R_B A_B}$ be a purified state of $\theta_{R_B}$ and let $\theta_{R_C A_C}$ be a purified state of $\theta_{R_C}$. Then, Alice and Bob share the $M_B$-fold product states $\theta_{R'_B A'_B} := \theta_{R_B A_B}^{\otimes M_B}$, and Alice and Charlie share the $M_C$-fold product states $\theta_{R'_C A'_C} := \theta_{R_C A_C}^{\otimes M_C}$.

$\log M \geq n I(X : B)_\rho + \sqrt{n V(X : B)_\rho}\, \Phi^{-1}(\varepsilon) - O(1)$.

• To the best of our knowledge, the conjectures made by Mosonyi and Audenaert [85, Conjecture 4.2] and by Qi, Wang, and Wilde [83, Conjecture 18] are still open. If they were true, then the established one-shot achievability bound for classical-quantum multiple-access channels (Theorem 5) would directly imply an upper bound on the error probability by a sum of exponential decays$^{21}$.
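The leading terms of the second-order expansion of $\log M$ quoted above can be evaluated numerically for a small channel. The sketch below makes several assumptions that are not taken from the text: it reads the dispersion term as $\sqrt{nV(X:B)_\rho}$, takes $I(X:B)_\rho$ and $V(X:B)_\rho$ to be the relative entropy and relative entropy variance of $\rho_{XB}$ with respect to $\rho_X \otimes \rho_B$, uses natural logarithms, drops the $O(1)$ term, and requires full-rank output states. All function names are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def logm_psd(a):
    """Matrix logarithm of a full-rank Hermitian positive-definite matrix."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * np.log(vals)) @ vecs.conj().T

def information_and_variance(probs, states):
    """Return (I, V) in nats for the ensemble {probs[x], states[x]}."""
    rho_b = sum(p * rho for p, rho in zip(probs, states))
    log_b = logm_psd(rho_b)
    first, second = 0.0, 0.0
    for p, rho in zip(probs, states):
        diff = logm_psd(rho) - log_b
        first += p * np.trace(rho @ diff).real          # relative entropy part
        second += p * np.trace(rho @ diff @ diff).real  # second moment
    return first, second - first ** 2

def normal_approximation(n, eps, probs, states):
    """n*I + sqrt(n*V) * Phi^{-1}(eps), with the O(1) term omitted."""
    info, var = information_and_variance(probs, states)
    return n * info + np.sqrt(n * var) * NormalDist().inv_cdf(eps)

# Hypothetical example: a binary symmetric channel with crossover 0.1,
# written as commuting (diagonal) output states with a uniform input.
delta = 0.1
probs = [0.5, 0.5]
states = [np.diag([1 - delta, delta]), np.diag([delta, 1 - delta])]
info, var = information_and_variance(probs, states)
log_m_estimate = normal_approximation(1000, 0.01, probs, states)
```

For commuting output states the two quantities reduce to the classical mutual information and channel dispersion, which gives a simple sanity check of the sketch.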

Table 1. Summary of the established one-shot achievability bounds in various quantum information-theoretic tasks. The precise statements and notation can be found in Section 4.
Proposition 2 (Bounding coding rate given fixed coding error). Consider an arbitrary classical-quantum channel $\mathcal{N}_{X\to B}$

Definition 7 (Entanglement-assisted classical communication over quantum broadcast channels). Let $\mathcal{N}_{A\to BC}$ be a quantum broadcast channel.
1. Alice holds classical registers $M_B$ and $M_C$, and quantum registers $A'_B$ and $A'_C$. Bob holds quantum registers $B$ and $R'_B$. Charlie holds quantum registers $C$ and $R'_C$.
2. Bob and Alice share an arbitrary state $\theta_{R'_B A'_B}$. Charlie and Alice share an arbitrary state $\theta_{R'_C A'_C}$.
3. Alice performs an encoding for any equiprobable message $(m_B, m_C) \in \mathsf{M}_B \times \mathsf{M}_C$ she wants to send to Bob and Charlie, respectively.
4. The channel $\mathcal{N}_{A\to BC}$ is applied to Alice's register $A$ and outputs a marginal state on $B$ at Bob and a marginal state on $C$ at Charlie.
5. Bob performs a decoding measurement $\{\Pi_{R'_B B}^{m_B}\}_{m_B\in \mathsf{M}_B}$ on register $B$ to extract the sent message $m_B$, and Charlie performs a decoding measurement {Π