Complexity of fermionic states

How much information does a fermionic state contain? To address this fundamental question, we define the complexity of a particle-conserving many-fermion state as the entropy of its Fock space probability distribution, minimized over all Fock representations. The complexity characterizes the minimum computational and physical resources required to represent the state and store the information obtained from it by measurements. Alternatively, the complexity can be regarded as a Fock space entanglement measure describing the intrinsic many-particle entanglement in the state. We establish a universal lower bound for the complexity in terms of the single-particle correlation matrix eigenvalues and formulate a finite-size complexity scaling hypothesis. Remarkably, numerical studies on interacting lattice models suggest a general model-independent complexity hierarchy: ground states are exponentially less complex than average excited states which, in turn, are exponentially less complex than generic states in the Fock space. Our work has fundamental implications for how much information is encoded in fermionic states.


I. INTRODUCTION
The complexity of an object or a process quantifies how something can be generated from simple building blocks in an optimal way. For example, the computational complexity of a mathematical operation is defined in terms of the number of elementary operations required in its execution, and the complexity of a unitary operation in a quantum computer is defined as the minimum number of elementary quantum gate operations required in its generation. [1,2][5] In this work, we introduce the complexity of $N$-particle fermionic states. The complexity quantifies how resource intensive it is to express a given state as a linear combination of Slater states (fermionic product states), which are the building blocks of the fermionic Fock space. If the complexity of a state is $C$, to express this state in any Fock basis, one needs to specify at least $C$ nonzero coefficients. We show rigorously that the Fock representation of a state cannot be compressed to fewer than $n_{\mathrm{qubit}} = \log_2 C$ qubits, showing how the complexity determines the minimal physical and computational resources required to represent the state. Sophisticated numerical methods [6,7] have been developed to mitigate the exponential complexity of correlated systems; however, only a genuine quantum simulation [8] can be expected to incorporate it in general. [11][12]
Besides its computational and information-theoretic implications, the complexity constitutes an entanglement measure in the Fock space. In contrast to widely studied partition entanglement measures, [13] the complexity describes intrinsic partition-independent properties of $N$-particle states. It sharply distinguishes between the states of interacting and noninteracting Hamiltonians: all non-degenerate eigenstates of noninteracting systems can be represented as a single Slater state, thus having a trivial complexity.
The central finding in our work is that the complexity for distinct classes of states can be faithfully estimated from the correlation entropy $S_c$, defined essentially as the entanglement entropy between a single particle and the rest of the system. The quantity $S_c$, exhibiting intensive size scaling, is calculated from the eigenvalues of the single-particle correlation matrix (i.e. the natural occupations), and is thus easily available in many numerical and theoretical methods. [17][18][21][22]
FIG. 1. Complexity hierarchy of fermionic states as a function of the number of particles $N_p$ and the correlation entropy $S_c$ at fixed filling fraction. Ground states and excited states refer to eigenstates of interacting lattice Hamiltonians at strong coupling exceeding the bandwidth, and $1 \leq \alpha_g \leq 2$, depending on the filling.
Specifically, i) we establish a universal lower bound for the complexity, $S_P \geq S_c$, where $S_P$ is the logarithmic complexity, $C = e^{S_P}$; ii) we introduce a model-independent finite-size complexity scaling hypothesis $S_P \sim \alpha N_p S_c$ for homogeneous $N_p$-particle states with constant filling fraction; iii) numerical studies of interacting lattice models suggest that the coefficient $\alpha$ characterizes universal features of distinct classes of states, implying the exponential complexity hierarchy summarized in Fig. 1. In strongly coupled lattice models, the ground states are exponentially less complex than average excited states, which in turn are exponentially less complex than the generic states in the Fock space. Due to the model-independent nature of the scaling hypothesis, we postulate that the same complexity scaling is applicable for a broad class of local Hamiltonians. Our work has fundamental implications for how much information is contained in fermionic states.

II. FERMIONIC COMPLEXITY
We begin by defining the complexity for an arbitrary fermionic state $|\Phi\rangle$ in the Fock space of $N_p$ identical particles and $N_o$ available single-particle orbitals. This state can be expanded as
$$|\Phi\rangle = \sum_{k=1}^{k_{\max}} a_{\{n_{B_i}\}_k}\, |\{n_{B_i}\}_k\rangle,$$
where $B_i$ denotes orbital $i$ in the single-particle basis $B$, and $\{n_{B_i}\}_k$ labels the distinct sets of single-particle occupation numbers $n_{B_i} = 0, 1$. The $N_p$-particle Slater basis states are defined as
$$|\{n_{B_i}\}_k\rangle = \prod_{i} \big(\hat{c}^{\dagger}_{B_i}\big)^{n_{B_i}} |0\rangle,$$
where the product of fermion creation operators contains the populated orbitals in the set $\{n_{B_i}\}_k$. Each Slater state is multiplied by a nonzero complex probability amplitude $a_{\{n_{B_i}\}_k} \neq 0$. Depending on $|\Phi\rangle$ and the employed single-particle orbitals $B$, the number of terms $k_{\max}$ varies between 1 and the Fock space dimension $Q = \binom{N_o}{N_p}$. We now consider the 2nd Renyi entropy of the probability distribution of the Slater states,
$$S_{P_B} = -\ln \sum_{k} P^2_{\{n_{B_i}\}_k},$$
where $P_{\{n_{B_i}\}_k} = |a_{\{n_{B_i}\}_k}|^2$. To eliminate the dependence on $B$, we define the logarithmic complexity as
$$S_P = \min_{B} S_{P_B},$$
where the minimization is carried out over all possible single-particle bases $B$. Finally, we define the complexity of the state $|\Phi\rangle$ as
$$C = e^{S_P}.$$
In practical calculations, carrying out the minimization in Eq. (1) is a highly nontrivial task. Remarkably, as seen below, for the eigenstates of the studied lattice Hamiltonians, the optimal basis is excellently approximated by the correlation matrix eigenbasis and the position basis at weak and strong coupling.
The complexity, as defined above, has two illuminating interpretations:
i) The complexity of a state determines its maximum compression in the Fock space, characterizing the number of terms in the most compact representation. By employing fundamental results in classical and quantum information theory, [23][24][25] we show in App. A that the maximum compression of the quantum information in a fermionic state is determined by its complexity. Specifically, we prove that the number of qubits required to encode the Fock space information of a state is at least $n_{\mathrm{qubit}} = S_P \log_2 e$. This characterizes the minimum physical resources required to represent and store general fermionic many-body states. We emphasize that the result, which has close parallels with Shannon's and Schumacher's encoding theorems in classical and quantum information theory, is universal and applies to generic quantum simulation and quantum information platforms.
ii) The complexity of a state describes its intrinsic $N$-particle entanglement. Without entanglement, the state could be represented as a single Slater state. If the state has complexity $C$, the amount of entanglement corresponds to that in an equal superposition of $C$ Slater states. [22,26,27] However, none of the previous mode entanglement measures captures the same information as the complexity studied here. In fact, the complexity is unique in establishing a concrete connection between the information content of the state and the $N$-particle entanglement. In contrast to the entanglement entropy and other partition-based measures, the complexity does not depend on an arbitrary case-specific partition. Moreover, the complexity sharply distinguishes interacting and noninteracting systems, since all non-degenerate eigenstates of quadratic Hamiltonians can be represented as a single Slater state with $S_P = 0$, irrespective of whether they obey the area-law, [28] the volume-law [29] or the critical entanglement entropy scaling.
The complexity $S_P$ can be contrasted with other quantities derived from coefficients in the Fock basis. The Slater rank [30,31] of a state is the minimal number of Slater determinants required to exactly expand the state, and has been studied in low-dimensional systems. This is related to the generalized Pauli constraints, [32,33] as exactly satisfying a constraint can lead to a lower-dimensional representation of the state even if the average occupations do not take values of 0 or 1 (cf. the discussion on $S_c$ below), although this may again be mostly relevant in low-dimensional systems. [34] Measures similar to the Slater rank have also been considered in bosonic systems. [35,36] Another extreme is to consider only the weight of the largest Slater determinant, [37,38] which was considered as an entanglement measure for the Laughlin wave function. [39] The complexity $C$, the Slater rank and the largest weight are related to Renyi-$n$ entropies of the Slater weights with $n = 2$, $n = 0$ and $n = \infty$, respectively. Indeed, the generalization of the complexity to other Renyi entropies and the Shannon entropy is immediate, and the relation to the corresponding single-particle entanglement entropies, discussed below in the $n = 2$ case, will be developed in a forthcoming work for general $n$. The Renyi-2 entropy has the advantage of an elementary proof of the lower bound property discussed below and of admitting certain analysis tools [40] used in App. G, while the Shannon entropy has a more direct relation to the asymptotic information content of the state, as discussed in App. A.
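The relation between the three quantities and the Renyi-$n$ entropies can be made concrete with a minimal sketch. It evaluates the entropies for a given list of Slater weights in one fixed basis; the minimization over single-particle bases that defines the actual complexity is not attempted here. The function name `renyi_entropy` and the example weights are illustrative assumptions, not taken from the paper.

```python
import math

def renyi_entropy(probs, n):
    """Renyi-n entropy (natural logarithm) of Slater weights P_k in a FIXED basis.

    Hypothetical helper for illustration: no minimization over bases is done.
    """
    p = [x for x in probs if x > 0.0]
    if n == 0:
        return math.log(len(p))                    # log of the number of nonzero terms
    if n == math.inf:
        return -math.log(max(p))                   # set by the largest weight
    if n == 1:
        return -sum(x * math.log(x) for x in p)    # Shannon limit
    return math.log(sum(x ** n for x in p)) / (1.0 - n)

# Example weights (made up for illustration; they sum to one).
weights = [0.5, 0.25, 0.125, 0.125]

rank_in_basis = math.exp(renyi_entropy(weights, 0))          # number of terms
complexity_in_basis = math.exp(renyi_entropy(weights, 2))    # 1 / sum_k P_k^2
inverse_largest_weight = math.exp(renyi_entropy(weights, math.inf))
```

Since the Renyi entropies are non-increasing in $n$, the three numbers are ordered: the number of terms is the largest and the inverse of the largest weight the smallest, with the fixed-basis complexity in between.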

A. Complexity lower bound from natural orbitals
A central role in the complexity is played by the single-particle correlation matrix, also known as the one-body reduced density matrix,
$$C_{ij} = \langle \Phi | \hat{c}^{\dagger}_i \hat{c}_j | \Phi \rangle,$$
where $\hat{c}^{\dagger}_i$, $\hat{c}_j$ denote the fermionic creation and annihilation operators and indices $i, j \in 1 \ldots N_o$ label all possible single-particle orbitals in a fixed basis. If we have a system with $N_o$ available orbitals, the correlation matrix has dimension $N_o \times N_o$. Due to Fermi statistics, the correlation matrix eigenvalues satisfy $0 \leq \lambda_i \leq 1$ and $\sum_i \lambda_i = N_p$. Thus, they can be interpreted as single-orbital occupation probabilities in the eigenbasis of $C_{ij}$. The eigenstates of $C_{ij}$ are commonly referred to as the natural orbitals, [14] which have found modern applications in analyzing strongly correlated many-body systems. [41,42] We can define the one-particle correlation entropy in the state $|\Phi\rangle$ as
$$S^p_c = -\ln\Big( \frac{1}{N_p} \sum_i \lambda_i^2 \Big),$$
which is a measure of how the occupation probabilities of the natural orbitals collectively differ from 1 or 0. As discussed in App. B, up to a trivial constant, $S^p_c$ is equal to the Renyi entanglement entropy between a single particle and the rest of the system. Generic properties of entanglement entropies of fermionic $N$-particle reduced density matrices have been discussed extensively, [19][20][21][22] and quantities similar to $S^p_c$ have been employed to characterize phase transitions in many-particle systems [43,44] and as measures of correlation [14][15][16][17] and complexity. [18] It is thus useful to establish a connection between $S_P$ and $S_c$, clarifying the role of $S_P$ as a novel entanglement measure.
By interchanging the role of particles and holes, we define single-hole occupation probabilities $\bar{\lambda}_i = 1 - \lambda_i$, which satisfy $0 \leq \bar{\lambda}_i \leq 1$ and $\sum_i \bar{\lambda}_i = N_o - N_p$. We then define a single-hole correlation entropy as
$$S^h_c = -\ln\Big( \frac{1}{N_o - N_p} \sum_i \bar{\lambda}_i^2 \Big),$$
and the correlation entropy as the larger of the two,
$$S_c = \max\{S^p_c, S^h_c\}.$$
In App. C we prove that, for an arbitrary fermionic state $|\Phi\rangle$, the correlation entropy provides a lower bound for the logarithmic complexity,
$$S_P \geq S_c. \qquad (3)$$
[45] States of this type, whose complexity does not scale with the total number of particles at fixed filling fraction $\nu = N_p/N_o$, define the low-complexity category in Fig. 1. This category includes, for example, eigenstates of impurity systems with a non-extensive number of scattering centers, such as the Kondo model. Despite a macroscopic reorganization of the Fermi sea, the eigenstates have only a few correlation matrix eigenvalues that differ from 0 or 1. [46]
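As a minimal sketch, the correlation entropy can be evaluated directly from a list of natural occupations. The sketch assumes the Renyi-2 form implied by App. B, i.e. $S^p_c = -\ln(\sum_i \lambda_i^2 / N_p)$ and the analogous hole expression; the function name `correlation_entropy` is hypothetical, and the occupations are taken as input rather than computed from a correlation matrix.

```python
import math

def correlation_entropy(occupations, n_particles):
    """Return (S_c, S_c^p, S_c^h) from natural occupations lambda_i.

    Assumed Renyi-2 form (see lead-in):
      S_c^p = -ln( sum_i lambda_i^2 / N_p )
      S_c^h = -ln( sum_i (1 - lambda_i)^2 / (N_o - N_p) )
    Requires 0 < N_p < N_o so that both particles and holes exist.
    """
    n_orbitals = len(occupations)
    n_holes = n_orbitals - n_particles
    if not 0 < n_particles < n_orbitals:
        raise ValueError("need 0 < N_p < N_o")
    s_particle = -math.log(sum(l * l for l in occupations) / n_particles)
    s_hole = -math.log(sum((1.0 - l) ** 2 for l in occupations) / n_holes)
    return max(s_particle, s_hole), s_particle, s_hole

# Single Slater state: occupations are exactly 0 or 1, so S_c vanishes.
slater = correlation_entropy([1.0, 1.0, 0.0, 0.0], 2)

# Uniform occupations at filling nu = 1/3 (N_o = 6, N_p = 2):
# S_c^p = -ln(nu) and S_c^h = -ln(1 - nu).
uniform = correlation_entropy([1.0 / 3.0] * 6, 2)
```

The uniform example reproduces the maximal values $-\ln\nu$ and $-\ln(1-\nu)$ quoted for generic states later in the text.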

B. Complexity scaling
The existence of states saturating the lower bound (3) suggests that the bound cannot be significantly improved without making additional assumptions about the states of interest. Eigenstates of local interacting Hamiltonians and other large-scale homogeneous states defined on a $d$-dimensional spatial lattice constitute a class of central importance. They define a family of states which can be studied as a function of the system size for a fixed filling fraction $\nu$. How does the complexity of such states scale as the system size grows? For a generic filling fraction $\nu \neq 0, 1$, the dimension of the Fock space of such states grows exponentially in $N_p$. Thus, in the leading order, we expect that the logarithmic complexity scales as $S_P \sim N_p$. However, the maximum value of the correlation entropy does not scale with the system size, $S_c \leq \max\{-\ln \nu, -\ln(1 - \nu)\}$. This shows that $S_c$ alone does not provide an accurate approximation for the complexity of these states. However, the role of $S_c$ in Eq. (3) suggests that it encodes some universal features of the complexity. Combining this idea with the exponential scaling in the system size, we postulate that the complexity of uniform states follows, in the leading order, the scaling form
$$S_P \sim \alpha N_i S_c, \qquad (4)$$
where $\alpha > 0$ captures universal features of distinct classes of states. Here $N_i$ is the number of particles, $N_i = N_p$, when $\nu \leq \frac{1}{2}$, and the number of holes, $N_i = N_o - N_p$, when $\nu > \frac{1}{2}$. We illustrate this hypothesis for three paradigmatic examples: the Hubbard model of spinful fermions, the $t-V$ model, and Haar-distributed states, which we call "generic states" as they represent uniformly distributed unit vectors in the Fock space (see Sec. D in Methods). We observe that, indeed, the value of $\alpha$ distinguishes different broad classes of states:
1. The generic states have $\alpha = \alpha_g$, where $1 \leq \alpha_g \leq 2$, depending on the filling fraction. The maximum $\alpha_g = 2$ is obtained at $\nu = \frac{1}{2}$, while $\alpha_g \to 1$ when $\nu \to 0$ or $\nu \to 1$.
2. For non-degenerate ground states, $\alpha = \frac{1}{2}$ provides an excellent lower bound, which can become tight in various limits.
3. Average excited states have $1 \lesssim \alpha < \alpha_g$ when the interaction exceeds the bandwidth.
The difference in $\alpha$, despite its innocent appearance, translates into an exponential difference in the complexity. The complexity of generic states provides a baseline reference against which other types of states can be compared. The analytical expression for $\alpha_g$ is derived in App. D. The generic states saturate the maximum value of the correlation entropy $S_c$ and the maximal leading-order complexity allowed by the dimensionality of the Fock space. As seen below, the eigenstates of local Hamiltonians allow exponential compression compared to the generic states.
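The limiting values of $\alpha_g$ quoted above can be checked with a short sketch. It assumes the leading-order result $S_P \approx \ln Q$ for generic states together with Stirling's formula, which for $\nu \leq \frac{1}{2}$ gives $\alpha_g = 1 + (1-\nu)\ln(1-\nu)/(\nu \ln \nu)$ and the same expression with $\nu \to 1-\nu$ otherwise. This closed form is a reconstruction under those assumptions, and the function names are hypothetical.

```python
import math

def alpha_generic(nu):
    """Leading-order alpha_g for generic states at filling 0 < nu < 1.

    Assumed form (Stirling's formula applied to ln Q, see lead-in):
      alpha_g = 1 + (1 - x) ln(1 - x) / (x ln x),  x = min(nu, 1 - nu).
    """
    if not 0.0 < nu < 1.0:
        raise ValueError("filling fraction must satisfy 0 < nu < 1")
    x = min(nu, 1.0 - nu)
    return 1.0 + (1.0 - x) * math.log(1.0 - x) / (x * math.log(x))

def exact_to_leading_order_ratio(n_orbitals, n_particles):
    """Ratio ln(Q) / (alpha_g N_i S_c), which should approach 1 for large systems."""
    nu = n_particles / n_orbitals
    n_i = min(n_particles, n_orbitals - n_particles)   # particles or holes
    s_c = max(-math.log(nu), -math.log(1.0 - nu))      # maximal correlation entropy
    ln_q = math.log(math.comb(n_orbitals, n_particles))
    return ln_q / (alpha_generic(nu) * n_i * s_c)

ratio_half_filling = exact_to_leading_order_ratio(2000, 1000)
ratio_third_filling = exact_to_leading_order_ratio(3000, 1000)
```

The ratios stay slightly below one because the subleading Stirling corrections to $\ln Q$ are negative; they do not affect the leading-order coefficient.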
In Fig. 2 we illustrate the ground state complexity for the Hubbard model for $\nu = \frac{1}{2}$ and the $t-V$ model for $\nu = \frac{1}{3}$. The minimizing basis, found by the conjugate gradient optimization (see App. H for details), is well approximated by the momentum states at weak coupling. In this case, the momentum basis is a natural orbital basis; however, the natural orbitals are not unique due to degeneracies in the natural occupations. For the $t-V$ model, the natural orbitals are essentially the optimal basis also at strong coupling, while for the Hubbard chain, the optimal basis at strong coupling coincides with the position orbitals. The ground state complexity for both models is seen to satisfy $S_P \gtrsim \frac{1}{2} N_p S_c$, where the lower bound appears tight for small $V$ and large $U$. In the Hubbard chain, the correlation entropy saturates the maximum value $S_c = \ln 2$ at strong coupling. Thus the logarithmic complexity of a generic state at half filling, $S_P = 2 N_p \ln 2$, is four times larger than that of the ground state of the Hubbard chain, $S_P \approx \frac{1}{2} N_p \ln 2$, at strong coupling. Furthermore, the size scaling suggests that $\alpha$ converges reasonably close to $\alpha = \frac{1}{2}$ for all coupling strengths. This is observed for both models at fillings for which the ground state is non-degenerate. For the $t-V$ model at half filling, the ground state corresponds to two near-degenerate charge density wave configurations. In this case, we observe that the complexity of each charge-density wave state is well captured by $S_P \sim \frac{1}{2} N_p S_c$. We discuss the mechanism leading to the specific value $\alpha = 1/2$ in App. G, and provide further data for other filling fractions in App. F.
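To illustrate what the factor of four in the logarithmic complexity means in practice, the following sketch converts the two half-filling values quoted above into qubit counts using $n_{\mathrm{qubit}} = S_P \log_2 e$. The particle number $N_p = 12$ and the function name are arbitrary choices made for illustration only.

```python
import math

def qubits_from_log_complexity(s_p):
    """Minimum qubit count n_qubit = S_P * log2(e) for logarithmic complexity S_P."""
    return s_p / math.log(2.0)

n_p = 12  # illustrative particle number, not a value used in the paper

# Generic state at half filling: S_P = 2 N_p ln 2.
qubits_generic = qubits_from_log_complexity(2.0 * n_p * math.log(2.0))

# Strong-coupling Hubbard ground state: S_P ~ (1/2) N_p ln 2.
qubits_ground = qubits_from_log_complexity(0.5 * n_p * math.log(2.0))

# The complexities themselves differ by an exponentially large factor.
complexity_ratio = 2.0 ** (qubits_generic - qubits_ground)
```

For this example the generic state needs about 24 qubits and the ground state about 6, so the complexities differ by a factor of roughly $2^{18}$.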
In Fig. 3 we analyze the complexity of excited states, for the same systems as in Fig. 2, by performing a full diagonalization in the parity and center-of-mass momentum sector which contains the ground state. For the excited states, finding the minimizing basis numerically becomes more challenging. As seen in Figs. 3 a)-b), the numerical optimization does not find the true minimum for some high-complexity states. However, in the vast majority of cases, the optimization converges very close to the minimum value over natural, momentum and position orbitals. This indicates that, as for the ground states, one obtains an accurate approximation for the complexity by analyzing only these orbitals, especially when considering averages over many states. For both models, the average complexity of the excited states grows as a function of interaction and saturates to a constant at $U/t, V/t \approx 4$. At strong coupling, the average complexity is substantially higher than for the ground states. While the full diagonalization is restricted to modest system sizes, a fact one should be conscious of in extrapolating the results, Figs. 3 f) and g) imply that the ratio $S_P/(N_p S_c)$ for the average excited states converges to a constant $\alpha < \alpha_g$. The specific value of $\alpha$ depends on the coupling strength and filling, but the average excited states remain, even around the midspectrum, significantly less complex than generic states. This behaviour is markedly different from the entanglement entropy, which exhibits identical leading-order volume-law scaling for the midspectrum states of nonintegrable Hamiltonians and generic states. [29,47,48] We also note that the quantity $S_P / \log(Q)$, where $Q$ is the number of basis states, can be seen as a basis-independent multifractal coefficient, which is connected to quantum ergodicity and thermalization. We present an outlook on this connection in App. E.

III. DISCUSSION
In the above, we have seen how the complexity hierarchy summarized in Fig. 1 emerges. The single-Slater states, such as the eigenstates of quadratic Hamiltonians, have trivial complexity and are regarded as the fundamental building blocks of more complex states. For the low-complexity states, for which the complexity does not scale with the system size when the filling fraction is fixed, the complexity can be estimated from the universal lower bound $S_c$. As seen above, the ground state complexity is typically well captured by the scaling Ansatz (4) with prefactor $\alpha = 1/2$. When the interaction exceeds the bandwidth, the complexity of average excited states follows (4) with $1 \lesssim \alpha < \alpha_g$, where the upper bound determines the complexity of generic states. The model-independent nature of the scaling hypothesis and the qualitative agreement of different models suggest that the above results are not sensitive to the specific form of the Hamiltonian, as long as some broad features, such as locality and large-scale homogeneity, are satisfied.

IV. CONCLUSION AND OUTLOOK
We introduced the complexity of a fermionic state to quantify the amount of information in it. The complexity provides a bound on the quantum state compression achievable by choosing an optimal Fock basis, determining the minimum computational and physical resources needed to represent states. We showed that, for distinct classes of states, the complexity can be estimated from the eigenvalues of the single-particle correlation matrix. Considering the rapidly increasing interest in fermionic quantum simulation and quantum information processing, our results open several topical avenues of research. Do the observed complexity scaling laws for ground states and excited states represent a fundamental limit in encoding information in the eigenstates of local Hamiltonians? Do the complexity scaling laws, as their model-independent form suggests, also hold for higher-dimensional systems? How can the scaling laws for eigenstates be derived from general arguments? To what extent does the discovered complexity structure apply to bosonic states? How does the notion of Fock complexity, as studied here, reflect the circuit complexity of concrete fermionic quantum simulation schemes? [9,49] Here we especially want to mention the concept of "magic", [50] a property of quantum states critical to speedup over classical computation, and the fact that all non-Gaussian fermionic states can be considered to possess "magic". [51] Answers to these questions would provide fundamental new insight into many-body systems and their quantum information applications.
In information theory, the notion of entropy was introduced to quantify the compression of strings of data which follow a known distribution. [23] Analogously, the logarithmic complexity, which is an entropy quantity, characterizes the compression of the quantum information in a many-body state. Here we provide a derivation of this fundamental property. Following similar steps as in entanglement distillation, [2] we consider an optimal encoding of $n$ copies of a fermionic many-body state $|\Phi\rangle = \sum_k a_{\{n_{B_i}\}_k} |\{n_{B_i}\}_k\rangle$, where $\{n_{B_i}\}_k$ denotes an occupation number set in the single-orbital basis $B$, and $k \in \{1, 2, \ldots, Q\}$, where $Q$ is the Fock space dimension. Thus, the object of interest is a composite state
$$|\Psi\rangle = |\Phi\rangle^{\otimes n} = \sum_{k_1 \ldots k_n} \sqrt{P^B_{k_1} P^B_{k_2} \cdots P^B_{k_n}}\; |\{n_{B_i}\}_{k_1}\rangle \otimes \cdots \otimes |\{n_{B_i}\}_{k_n}\rangle,$$
which is an element of the $Q^n$-dimensional composite Fock space. In the above, the probabilities are defined as $P^B_k = |a_{\{n_{B_i}\}_k}|^2$ and the complex phases of the amplitudes are absorbed in the Fock basis states. In general, composite states $|\Psi\rangle$ live in a lower-dimensional subspace $H$ of the $Q^n$-dimensional space. Schumacher's encoding theorem implies that, in the limit of large $n$, the state $|\Psi\rangle$ can be projected into a typical subspace of dimension $2^{nH(P^B_k)}$ with arbitrarily high accuracy, where
$$H(P^B_k) = -\sum_k P^B_k \log_2 P^B_k$$
is the Shannon entropy. [23,25] The projection operator into the $\delta$-typical subspace is the sum of the projectors onto the $\delta$-typical states, where the $\delta$-typical states are defined by $|P_{k_1} P_{k_2} \ldots P_{k_n} - 2^{-nH(P^B_k)}| \leq \epsilon$, and the number of such states is at most $2^{n(H(P^B_k)+\delta)}$. For arbitrary $\epsilon, \delta > 0$, it is always possible to make the weight of $|\Psi\rangle$ outside the typical subspace arbitrarily small by allowing for sufficiently large $n$. [2] This implies that $|\Psi\rangle$ is accurately approximated by restricting the sum over $k_1 \ldots k_n$ to the $\delta$-typical states. Thus, for sufficiently large $n$, the state can be compressed into a space of dimension $2^{nH(P^B_k)}$, that is, into $H(P^B_k)$ qubits per copy. Since the Shannon entropy is bounded from below by the 2nd Renyi entropy, the number of qubits per copy satisfies, in any Fock basis,
$$n_{\mathrm{qubit}} \geq S_P \log_2 e. \qquad (\mathrm{A1})$$
This shows that the logarithmic complexity has a similar role in encoding the quantum information of many-body states as the Shannon entropy has in encoding classical information. [52] In summary, the complexity characterizes the minimum resources needed to represent many-body states in the Fock space and underlies the physical requirements of all quantum simulation and quantum information applications.

Complexity and quantum information from measurements
In addition to characterizing the optimal information compression, the complexity also characterizes the information that can be obtained from a many-body state by measurements. This section can be regarded as a complementary way to understand the formal result (A1) in more physical terms. Let us consider that we prepare multiple copies of the state $|\Phi\rangle$ and perform repeated $N_p$-particle measurements in some Fock basis $|\{n_{B_i}\}_k\rangle$, where $\{n_{B_i}\}_k$ denotes an occupation number set in the single-orbital basis $B$, and $k \in \{1, 2, \ldots, Q\}$, where $Q$ is the Fock space dimension. The resulting quantum states, obtained as outcomes of the $n$ measurements, constitute the total information obtained from the measurements. This information can be stored as a composite state of the form
$$|\{n_{B_i}\}_{k_1}\rangle \otimes |\{n_{B_i}\}_{k_2}\rangle \otimes \cdots \otimes |\{n_{B_i}\}_{k_n}\rangle,$$
which is an element of the $Q^n$-dimensional Hilbert space. Again we will see that the complexity of $|\Phi\rangle$ provides a fundamental lower bound on how much of the $Q^n$-dimensional space such states cover. In the language of quantum information theory, these composite states, obtained with probability $P^B_{k_1} P^B_{k_2} P^B_{k_3} \ldots P^B_{k_n}$, can be regarded as quantum messages constructed from individual letters, where each letter is a quantum state drawn from the ensemble $\{|\{n_{B_i}\}_k\rangle, P^B_k\}$. Now one can ask how much these quantum messages can be compressed, or what is the minimum dimension of the space $H$ in which the messages can be accurately stored when $n$ is large. The dimension of $H$ determines the physical resources needed to store the information extracted from $|\Phi\rangle$. This formulation turns the problem into an application of Schumacher's encoding theorem [24,25] in the special case where the letters form an orthogonal set. In this case, the quantum state of the messages is uniquely indexed by strings $k_1 k_2 \ldots k_n$. When $n$ is large, Shannon's noiseless coding theorem implies that these strings can be faithfully compressed to strings of $nH(P^B_k)$ bits, where $H(P^B_k)$ is the Shannon entropy. [23,25] Thus, in this limit, almost all messages fit into a space $H$ whose dimension satisfies $\log_2(\dim H) = nH(P^B_k)$. The maximum compression is obtained in the Fock basis that minimizes $H(P^B_k)$, which is bounded from below by the logarithmic complexity, so that $\ln(\dim H) \geq n S_P$, in agreement with Eq. (A1). Thus, the complexity provides a lower bound for the dimension of $H$, where the states obtained by $n$ measurements can be stored. Whenever the logarithmic complexity is smaller than its maximum value $\ln Q$, the composite states obtained from $|\Phi\rangle$ by $n$ measurements do not fill the whole $Q^n$-dimensional space exhaustively but only a subspace of it.
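The chain of inequalities behind $\ln(\dim H) \geq n S_P$ can be illustrated with a short sketch for one fixed basis: the Shannon entropy sets the storage cost $\log_2(\dim H) = nH$, and it is never smaller than the 2nd Renyi entropy expressed in bits. The probability list, the number of copies and the function names are illustrative assumptions, and no minimization over Fock bases is performed.

```python
import math

def shannon_entropy_bits(probs):
    """Shannon entropy H = -sum_k P_k log2 P_k of a fixed-basis distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def renyi2_entropy_nats(probs):
    """2nd Renyi entropy -ln sum_k P_k^2 (fixed basis, natural logarithm)."""
    return -math.log(sum(p * p for p in probs))

probs = [0.5, 0.25, 0.125, 0.125]   # made-up measurement probabilities
n_copies = 1000                     # number of measured copies (illustrative)

# Storage cost of the typical messages: log2(dim H) = n * H.
log2_dim_typical = n_copies * shannon_entropy_bits(probs)

# Lower bound from the fixed-basis Renyi-2 entropy, converted to bits.
log2_dim_lower_bound = n_copies * renyi2_entropy_nats(probs) * math.log2(math.e)

# For a uniform distribution the two entropies coincide and the bound is tight.
uniform = [0.25] * 4
gap_uniform = shannon_entropy_bits(uniform) - renyi2_entropy_nats(uniform) * math.log2(math.e)
```

In this example the typical messages need 1750 bits, while the Renyi-2 bound gives roughly 1541 bits; the gap closes only for uniform distributions.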

Complexity as the characteristic number of terms in the minimal Fock representation
In addition to its rigorous role as an optimal quantum information compression rate discussed above, the complexity of a state is also connected to the characteristic number of terms which are required to span it in the optimal basis. Let $\{P_n\}$ be the probabilities in the optimal Fock basis which determines the complexity, and let us assume that the distribution is arranged in non-increasing order, $P_{n_1} \geq P_{n_2}$ when $n_1 < n_2$. How many terms are needed in the optimal basis to effectively span the state? Specifically, how large should $\tilde{n}$ be to satisfy $\sum_{n=1}^{\tilde{n}} P_n \sim 1$? This question is important for the states with large complexity, $\tilde{n}, C \gg 1$, and the answer depends on the distribution: i) for sufficiently uniform distributions with a well-defined typical probability scale, the required number of terms is $\tilde{n} \sim C$; ii) for heavy-tailed distributions, the required number of terms can scale nonlinearly in the complexity, $\tilde{n} \sim C^{\beta}$ with $\beta > 1$.
Let us first study i) and consider a case where the probabilities have a characteristic order of magnitude $P_n \sim P_0$ when $n \leq n'$, and are strongly suppressed for $n > n'$. This implies that $P_0 \sim 1/n'$ and $C = 1/\sum_n P_n^2 \sim n'$. Thus, the complexity roughly coincides with the effective cutoff index $n'$ and we can conclude that $\sum_{n=1}^{\tilde{n}} P_n \sim 1$ is achieved when $\tilde{n} \sim C$. When the distribution is strictly a box distribution with constant probabilities $P_0 = 1/n'$, the full probability is exactly recovered after $C$ terms, $\sum_{n=1}^{C} P_n = 1$. In general, to recover the full probability for distributions with a finite tail above $n = n'$, one might need to include a few multiples of $C$ terms. The linear scaling between $\tilde{n}$ and $C$ reflects the typical expectation that entropy-like quantities scale as the logarithm of the total number of contributing states.
In case ii), the distribution has a long tail, the probabilities do not have a well-defined scale, and the previous reasoning breaks down. For this type of distribution, a nonlinear scaling $\tilde{n} \sim C^{\beta}$ with a model-specific $\beta > 1$ becomes possible. Such behaviour can be observed, for example, for power-law distributions and the distributions of eigenstates of lattice Hamiltonians, as illustrated in Fig. 4. For the ground state of a strongly coupled Hubbard model, we find that $\sum_{n=1}^{C} P_n \sim 0.5$ and that the standard deviation and the complexity of the optimal distribution satisfy $\sigma = C^{\beta}$, where $\beta \lesssim 2$. Because most of the probability is located within a few standard deviations, the full probability is covered by $\tilde{n} \sim m\sigma = m C^{\beta}$ terms, where $m$ is a small integer.
To summarize, the complexity of a state provides a lower bound estimate for the characteristic number of terms in the optimal Fock representation.
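The two cases can be compared numerically with a minimal sketch that computes $C = 1/\sum_n P_n^2$ and the number of terms $\tilde{n}$ needed to reach a chosen fraction of the total probability. The box and power-law distributions, the exponent 1.5, the thresholds and all function names are illustrative assumptions rather than data from the paper.

```python
def complexity(probs):
    """Complexity C = 1 / sum_n P_n^2 of a normalized distribution."""
    return 1.0 / sum(p * p for p in probs)

def terms_needed(probs, threshold):
    """Smallest number of largest-weight terms whose total probability reaches threshold."""
    total = 0.0
    for count, p in enumerate(sorted(probs, reverse=True), start=1):
        total += p
        if total >= threshold:
            return count
    return len(probs)

def normalized(weights):
    """Rescale non-negative weights so that they sum to one."""
    z = sum(weights)
    return [w / z for w in weights]

# Case i): box distribution with n' = 50 equal probabilities.
box = normalized([1.0] * 50)
box_c = complexity(box)
box_terms = terms_needed(box, 0.99)

# Case ii): heavy-tailed power law P_n ~ n^(-1.5), truncated at 10^4 terms.
power_law = normalized([n ** -1.5 for n in range(1, 10001)])
power_c = complexity(power_law)
power_terms = terms_needed(power_law, 0.9)
```

For the box distribution the number of required terms coincides with the complexity, whereas for the power law even 90% of the probability requires several times more terms than $C$.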

Appendix B: Correlation entropy as an entanglement entropy
The single-particle correlation matrix in the state $|\Phi\rangle$ is conventionally defined as $C_{ii'} = \langle \Phi | \hat{c}^{\dagger}_{i'} \hat{c}_i | \Phi \rangle$, which can also be written in first-quantized notation as
$$C_{ii'} = N_p \sum_{j,k,l,\ldots} \Phi^*(i', j, k, l, \ldots)\, \Phi(i, j, k, l, \ldots),$$
where $\Phi(i, j, k, l, \ldots)$ is the antisymmetric wave function of the particles at coordinates $i, j, k, l, \ldots$. [14] Thus the actual normalized reduced density matrix of a single particle, defined as the partial trace over the coordinates of the other particles, is $\rho_1 = C/N_p$, [43] and the order-$n$ Renyi entanglement entropy is defined as
$$S_n = \frac{1}{1-n} \ln \mathrm{Tr}(\rho_1^n),$$
and has the von Neumann limit $S_1 = -\mathrm{Tr}(\rho_1 \log(\rho_1))$.
If $|\Phi\rangle$ is a single Slater determinant, $C$ has an $N_p$-fold degenerate eigenvalue 1, the rest being zero. Therefore the entanglement entropies become
$$S_n = \ln N_p.$$
However, in the spirit of the complexity $S_P$, which is trivial for Slater states, we subtract the free-fermion contribution and define the single-particle correlation entropies as
$$S_{c,n} = S_n - \ln N_p.$$
$S_{c,2}$ is the particle correlation entropy discussed in the main text. Thus, the correlation entropy is actually a one-particle entanglement entropy from which the free-fermion contribution has been subtracted.
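The subtraction of the free-fermion contribution can be checked with a small sketch that evaluates the single-particle Renyi entropies directly from the natural occupations, using the eigenvalues $\lambda_i/N_p$ of $\rho_1 = C/N_p$. The function names are hypothetical, and the standard Renyi definition $S_n = \ln \mathrm{Tr}(\rho_1^n)/(1-n)$ is assumed.

```python
import math

def single_particle_renyi(occupations, n_particles, order):
    """Renyi entropy of rho_1 = C / N_p, whose eigenvalues are lambda_i / N_p."""
    rho = [l / n_particles for l in occupations if l > 0.0]
    if order == 1:
        return -sum(r * math.log(r) for r in rho)      # von Neumann limit
    return math.log(sum(r ** order for r in rho)) / (1.0 - order)

def correlation_entropy_of_order(occupations, n_particles, order):
    """S_{c,n} = S_n - ln N_p: the free-fermion contribution is subtracted."""
    return single_particle_renyi(occupations, n_particles, order) - math.log(n_particles)

# A single Slater determinant with N_p = 3: S_n = ln 3 for every order.
slater = [1.0, 1.0, 1.0, 0.0, 0.0]
slater_values = [correlation_entropy_of_order(slater, 3, n) for n in (1, 2, 3)]

# Uniform half-filled occupations (N_o = 4, N_p = 2): S_{c,2} = ln 2.
half_filled = correlation_entropy_of_order([0.5] * 4, 2, 2)
```

For the Slater determinant all orders give zero after the subtraction, which is the sense in which the correlation entropy is trivial for free fermions.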

Appendix C: Proof of the complexity lower bound
Here we will give a proof of the complexity lower bound (3) in three steps.
Proposition 1: Let us consider a fermionic $N_p$-particle state $|\Phi\rangle$. Furthermore, let us assume that $\{\lambda_i\}$ is the set of correlation matrix eigenvalues (occupation probabilities of the natural orbitals) and $\bar{n}_{B_i}$ are the occupation probabilities of single-particle orbitals in an arbitrary basis $B$. They always satisfy
$$\sum_i \lambda_i^2 \geq \sum_i \bar{n}_{B_i}^2.$$
Proof: Writing the correlation matrix in the basis $B$, we obtain
$$\sum_i \lambda_i^2 = \mathrm{Tr}\, C^2 = \sum_{\alpha\beta} |C_{B_\alpha B_\beta}|^2 \geq \sum_{\alpha} C_{B_\alpha B_\alpha}^2 = \sum_{\alpha} \bar{n}_{B_\alpha}^2.$$
Here we used the fact that in the double sum all entries are non-negative and that occupation probabilities in a general basis are defined as the diagonal entries of the correlation matrix.
Proposition 2: The average occupation numbers $\bar{n}_{B_i}$ and the state probabilities $P_{\{n_{B_i}\}_k}$ satisfy
$$\sum_i \bar{n}_{B_i}^2 \geq N_p \sum_k P^2_{\{n_{B_i}\}_k}.$$
The first sum is over all the single-particle orbitals, whereas the second sum is over all $k_{\max}$ occupation number sets in Eq. (1). Proof: The average occupation numbers can be written in terms of the state probabilities as $\bar{n}_{B_i} = \sum_k P_{\{n_{B_i}\}_k} n^k_{B_i}$, where $n^k_{B_i} = 0, 1$ is the value of the occupation number of orbital $B_i$ in the set $\{n_{B_i}\}_k$. From this we get
$$\sum_i \bar{n}_{B_i}^2 = \sum_i \sum_{k,k'} P_{\{n_{B_i}\}_k} P_{\{n_{B_i}\}_{k'}}\, n^k_{B_i} n^{k'}_{B_i} \geq \sum_k P^2_{\{n_{B_i}\}_k} \sum_i \big(n^k_{B_i}\big)^2 = N_p \sum_k P^2_{\{n_{B_i}\}_k}.$$
The inequality follows from dropping non-negative terms from the double sum. Comparing the starting and final form, we have proved Proposition 2.
Universal lower bound for $S_P$: using Proposition 1 and the monotonicity of the logarithm, we deduce that
$$S^p_c = -\ln\Big(\frac{1}{N_p}\sum_i \lambda_i^2\Big) \leq -\ln\Big(\frac{1}{N_p}\sum_i \bar{n}_{B_i}^2\Big).$$
Now, using Proposition 2, it follows that
$$S^p_c \leq -\ln \sum_k P^2_{\{n_{B_i}\}_k} = S_{P_B}.$$
Since this holds for an arbitrary basis $B$, we can minimize the right-hand side over all bases and it still holds. Thus, we have proved that $S^p_c \leq S_P$. The corresponding inequality for the hole correlation entropy, $S^h_c \leq S_P$, can be straightforwardly established by exchanging the roles of particles and holes and tracing the same steps. Thus, we arrive at $S_c \leq S_P$, where $S_c$ is the larger one of $S^p_c$, $S^h_c$.
[Figure caption] A $Q$-dimensional generic state has logarithmic complexity $S_{\mathrm{CUE}} = -\log(2/Q)$ in the limit of large $Q$. The complexity can be reduced by applying single-particle rotations, as demonstrated here by transforming to the natural orbital basis or a numerically optimized basis. However, to conserve information, we expect that the typical reduction in the characteristic number of terms $C = \exp(S)$ in the optimal Fock representation of a state with no special structure scales with the number of parameters in the single-particle basis, i.e. polynomially in $N_p$, while $C$ grows exponentially for a fixed filling fraction. $S_{\mathrm{CUE,nat}}$ and $S_{\mathrm{CUE,opt}}$ are thus expected to approach $S_{\mathrm{CUE}}$ for large $N_p$, while the correlation entropy approaches its maximal value $-\log(\nu)$.
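Proposition 2 involves only the Fock-basis probabilities, so it can be spot-checked numerically without constructing a wave function. The sketch below draws random probability distributions over all $\binom{N_o}{N_p}$ occupation sets, computes the average occupations, and compares $\sum_i \bar{n}_{B_i}^2$ with $N_p \sum_k P_k^2$, which is the form of the inequality assumed here. The function name and the system sizes are illustrative assumptions.

```python
import itertools
import random

def proposition_2_sides(n_orbitals, n_particles, rng):
    """Return (sum_i nbar_i^2, N_p * sum_k P_k^2) for a random distribution P_k."""
    # Each configuration lists the occupied orbitals of one Slater state.
    configs = list(itertools.combinations(range(n_orbitals), n_particles))
    weights = [rng.random() for _ in configs]
    z = sum(weights)
    probs = [w / z for w in weights]

    # Average occupations nbar_i = sum_k P_k n_i^k.
    occupations = [0.0] * n_orbitals
    for p, config in zip(probs, configs):
        for orbital in config:
            occupations[orbital] += p

    lhs = sum(n * n for n in occupations)
    rhs = n_particles * sum(p * p for p in probs)
    return lhs, rhs

rng = random.Random(1234)   # fixed seed so the check is reproducible
results = [proposition_2_sides(6, 3, rng) for _ in range(200)]
all_hold = all(lhs >= rhs - 1e-12 for lhs, rhs in results)
```

A random check of this kind does not replace the proof, but it is a quick way to catch sign or normalization slips when the inequality is implemented in a larger calculation.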

Appendix D: Complexity of generic states
Here we derive the complexity of generic states in a Fock space with $N_o$ available orbitals and $N_p$ particles, with dimension $Q = \binom{N_o}{N_p}$. Let us start with some normalized vector in the Fock space, $|\psi_0\rangle = \sum_{k=1}^{Q} a_k |k\rangle$, where $\{|k\rangle\}$ is some basis and $\sum_k |a_k|^2 = 1$, and consider all the states that can be obtained from $|\psi_0\rangle$ by unitary transformations: $|\psi\rangle = U|\psi_0\rangle$, $U \in \mathrm{U}(Q)$. These states fill the Fock space uniformly and are referred to as generic states. To calculate the complexity, we extract the probabilities (repeated indices are summed) $P_i = |\langle i|\psi\rangle|^2 = U_{ij} U^*_{ik} a_j a^*_k$ and their squares $P_i^2 = U_{ij} U^*_{ik} U_{il} U^*_{im} a_j a^*_k a_l a^*_m$. To evaluate the average $\langle P_i^2 \rangle$ over the Haar measure, we can make use of the circular unitary ensemble result 53 $\langle U_{ij} U^*_{ik} U_{il} U^*_{im} \rangle \approx (\delta_{jk}\delta_{lm} + \delta_{jm}\delta_{kl})/Q^2$ for $Q \gg 1$. Employing the above formula, we obtain $\langle P_i^2 \rangle = 2/Q^2$. For large $Q \gg 1$, the average logarithmic complexity becomes $\langle S_P \rangle = -\langle \log \sum_i P_i^2 \rangle \approx -\log \sum_i \langle P_i^2 \rangle = -\log(2/Q)$. The expectation value can be moved inside the logarithm, because the argument becomes non-fluctuating in the large $Q$ limit. Also, the minimization over possible single-particle orbitals would not affect the result in large systems, since the number of optimization parameters scales polynomially with the number of orbitals while the number of independent components of the state vectors grows exponentially. This behaviour is illustrated in Fig. 5, which shows how the optimized complexity in small systems approaches the above analytical result. By employing Stirling's formula, the leading-order complexity of generic states becomes $S_P \approx \log Q \approx -N_p\big[\log\nu + \frac{1-\nu}{\nu}\log(1-\nu)\big]$, (D1) where $\nu = N_p/N_o$. Since the generic states are uniformly distributed in the Fock space and cannot be compressed, their leading-order complexity is the maximum allowed by the dimensionality of the Fock space. As illustrated in Fig. 5, the generic states also maximize the particle and hole correlation entropies, $S^p_c = -\ln\nu$ and $S^h_c = -\ln(1-\nu)$. Thus, the result (D1) can be expressed in the general form (4) with $\alpha = \alpha_g$, where $\alpha_g = -\big[\log\nu + \frac{1-\nu}{\nu}\log(1-\nu)\big] / \max[-\log\nu, -\log(1-\nu)]$.

Appendix E: Outlook on multifractal coefficients and ergodicity
The space-filling properties of quantum state vectors have been intensively studied in the context of the eigenstate thermalization hypothesis (ETH), quantum ergodicity and many-body localization 40,54-57. Given a fixed basis $B$, one can define multifractal coefficients which quantify the extent of the wave vector relative to the full basis size. For example, the Fock-basis multifractal coefficient 40 $D_{2,B} = S_{P_B}/\log(Q)$, where $Q$ is the number of basis states, ranges from $D_{2,B} = 0$ for a Slater state to $D_{2,B} = 1$ for a uniformly distributed state. Using the complexity, one can then define the basis-independent quantity $D_2 = S_P/\log(Q)$, which is $D_{2,B}$ minimized over the single-particle bases $B$. As an example, we plot $D_{2,B}$ for the 1/3-filled $t$-$V$ model in Fig. 6.
For chaotic spin models it has been demonstrated that midspectrum eigenstates are "ergodic": their fractal dimension approaches 1 in the thermodynamic limit 55,57. However, it is much less clear what fractal dimensions should be expected from a chaotic Hamiltonian when moving away from the center of the spectrum, as the states start to develop structure that may effectively limit the available basis states, potentially lowering the multifractal coefficient. If one considers assigning temperatures to the eigenstates based on subsystem density matrices, the midspectrum states are close to infinite temperature, while away from midspectrum the states have more structure and the temperature is finite 58. In analogy to this, the single-particle density matrix reveals structure in fermionic states that limits the complexity, and thus the minimal multifractal coefficients, as quantified by the scaling relation $S_P = \alpha N_p S_c$. Indeed, reaching $D_2 = 1$ is only expected if both $\alpha$ and $S_c$ approach the maximal, generic state values. As long as there is any one-particle structure and, thus, $S_c$ remains below the maximum, we expect to find $D_2 < 1$. For models with density-density interactions the eigenbasis of both the single-particle and the two-particle parts of the Hamiltonian is a Slater basis, and one may generically expect one-particle structure to be present even at midspectrum. Indeed, the basis-independent fractal dimension $D_2$ does not seem to reach 1 for any interaction strength in Fig. 6. More rigorous upper bounds for the Fock-space multifractal coefficients in terms of single-particle and higher correlations will be established in future work.

Appendix F: Additional data on ground state complexity
As discussed in Section II B, the complexity of the ground states is generally found to be lower than that of excited states, with the scaling coefficient $\alpha$ taking values from $\alpha \approx 1/2$ up to $\alpha \approx 0.8$. We plot data for additional filling factors in Fig. 7 with largely similar results, except for the half-filled $t$-$V$ model, which has a doubly-degenerate ground state and a significantly lower complexity. In general, a low value of $\alpha$ means that the state contains structure that is not apparent in the natural occupations due to the degeneracy. Ground states are expected to have a lower ratio $\alpha$ than excited states, because they have more two-particle correlations that restrict the available Slater determinants. The ground state of the half-filled $t$-$V$ model in the limit $V \to \infty$ is $|\psi\rangle_\pm = (|010101\ldots\rangle \pm |101010\ldots\rangle)/\sqrt{2}$, whose natural orbitals are the position orbitals and whose natural occupations are all 1/2. This state thus actually belongs to the low-complexity class with $S_P = S_c = \log(2)$, and the scaling ansatz $S_P = \alpha N_p S_c$ does not apply. For large but finite $V$ the ground state is doubly degenerate, with the eigenstates in the $\pm 1$-parity blocks approaching $|\psi\rangle_\pm$ as $V$ grows, and $\alpha$ approaching 0. However, one can still form linear combinations of the degenerate ground states such that one of the components $|010101\ldots\rangle$ or $|101010\ldots\rangle$ is eliminated. We find that these states again closely follow the scaling ansatz with $\alpha = 1/2$ in their respective natural orbital bases. A similar conclusion holds if the degeneracy is lifted by an additional term which discriminates between the two different charge-density wave states. Thus, the qualitative departure from the ground state scaling with $\alpha \gtrsim 1/2$ can be directly traced to the degeneracy of the $t$-$V$ model at half filling.

Appendix G: Ground state complexity in a stochastic model of perturbation theory
In this section we provide a heuristic model capturing essential properties of the ground states of locally interacting lattice models, and show that it leads to the scaling form $S_P = \frac{1}{2} N_p S_c$. The model is based on a picture where we have a single highest-weight Slater configuration, and the weight of the other configurations decreases exponentially with the growing "distance" from this "Fermi sea". If we fix a basis, we can think of the Slater configurations as bit strings of the occupation numbers, and measure such distances using the Hamming distance. Below we will use heuristic arguments to explicitly express the Slater weights, thus allowing us to compute $S_P$, but first we need a result that allows us to connect the correlation entropy $S_c$ to this picture.
Suppose that we draw two Slater configurations from the probability distribution $p_i$ describing a state of interest $|\psi\rangle$. The expected Hamming distance between these configurations is then $\langle x \rangle = \sum_{ij} p_i p_j x_{ij}$, where $x_{ij}$ is the Hamming distance between configurations $i$ and $j$. Based on the discussion in Orito and Imura 40, one can express the quantity $s_c = \exp(-S_c)$ in the natural orbital basis as $s_c = 1 - \langle x \rangle/(2N_p)$. For example, if we draw two configurations from a generic (Haar distributed) state, the occupied orbital positions are essentially random, and thus on average $\nu N_p$ particles of the second configuration take positions that are occupied in the first configuration. The expected Hamming distance is thus $\langle x \rangle = 2(1-\nu)N_p$, and we recover the generic state result $s_c = \nu$. This is the minimal value of $s_c$ at a fixed filling fraction, while the maximal value $s_c = 1$ is obtained if $|\psi\rangle$ is a Slater state and thus $\langle x \rangle = 0$.
Consider now the structure of a typical ground state in the weakly interacting limit. For zero interactions the state is a Slater determinant in the momentum basis where all orbitals below the Fermi level are occupied and all orbitals above the Fermi level are empty. Crucially, the interaction acts perturbatively by lifting pairs of particles from below the Fermi momentum to above the Fermi momentum. For example, the interaction terms in the Hubbard model are of the form $\frac{U}{L} c^\dagger_{\uparrow k_1 - q} c^\dagger_{\downarrow k_2 + q} c_{\uparrow k_1} c_{\downarrow k_2}$, where $L$ is the system size, and particles at momenta $k_1$ and $k_2$ are lifted to momenta $k_1 - q$ and $k_2 + q$. However, we cannot just use first order perturbation theory to model the limit $L \to \infty$, as that would imply that configurations with more than one pair excitation have zero weight, which precludes the linear scaling of $S_P$ with system size. Instead, we model a large system by assuming that there are $N_e N_p$ independent pair excitations, each of which occurs with a small probability $\gamma/N_e$. The $1/N_e$ scaling is required, as otherwise the number of excitations would grow with $N_e$, which may increase with system size, as the number of possible pair excitations increases faster than $N_p$. We will also assume that $\gamma$ is small, which means the excitations are rare. Thus we can assume that the excited pairs do not "overlap", always affecting a different set of four orbitals.
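The two limits quoted above ($s_c = \nu$ for generic states and $s_c = 1$ for a Slater state) are consistent with the relation $s_c = 1 - \langle x \rangle/(2N_p)$; this explicit form is our reading of the text and should be treated as an assumption. Underlying it is the identity $\frac{1}{N_p}\sum_i \bar{n}_i^2 = 1 - \langle x \rangle/(2N_p)$, valid for any distribution over occupation-number bit strings, which reduces to $s_c$ when the orbitals are the natural orbitals. A minimal Python sketch checking this identity on a random distribution (function name and sizes are hypothetical):

```python
import itertools
import random


def hamming_identity(n_orbitals=6, n_particles=2, seed=1):
    """Return (1 - <x>/(2 N_p), sum_i nbar_i^2 / N_p) for a random distribution."""
    rng = random.Random(seed)
    configs = [
        tuple(1 if i in occupied else 0 for i in range(n_orbitals))
        for occupied in itertools.combinations(range(n_orbitals), n_particles)
    ]
    weights = [rng.random() for _ in configs]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Expected Hamming distance between two independently drawn configurations
    mean_x = sum(
        pi * pj * sum(a != b for a, b in zip(ci, cj))
        for pi, ci in zip(probs, configs)
        for pj, cj in zip(probs, configs)
    )
    # Average orbital occupations in the chosen basis
    nbar = [sum(p * c[i] for p, c in zip(probs, configs)) for i in range(n_orbitals)]
    purity = sum(n * n for n in nbar) / n_particles
    return 1 - mean_x / (2 * n_particles), purity


from_distance, from_occupations = hamming_identity()
print(from_distance, from_occupations)
```

The two returned numbers agree up to rounding because each orbital contributes $2\bar{n}_i(1-\bar{n}_i)$ to $\langle x \rangle$.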
According to the above assumptions the number of excited pairs is distributed binomially, with $N_p N_e$ the number of trials and $\gamma/N_e$ the success probability. For the sake of comparing to numerical data, we note that the Hamming distance $x_{fs}$ measured from the Fermi sea follows the distribution $P(x_{fs} = 4k) = \binom{N_p N_e}{k} (\gamma/N_e)^k (1 - \gamma/N_e)^{N_p N_e - k}$ (G2) and $P(x_{fs}) = 0$ when $x_{fs}$ is not divisible by four. Fig. 8 shows that a reasonable fit to Hubbard model data is obtained with $N_e = 1$ and $\gamma/U^2 \sim 0.004 \ldots 0.005$. It is immediately clear that the assumption of non-overlapping pair excitations is correct to good accuracy, as Hamming distances not divisible by four have a very low weight. We also expect that $\gamma \sim U^2$, because amplitudes in first order perturbation theory scale proportionally to $U$ while probabilities scale as $U^2$. It would be possible to build a more refined model by taking into account that some excitations occur with higher probability than others, but we will leave this to future work.
When we draw a random state from the distribution, it has on average $N_p N_e \gamma/N_e = N_p \gamma$ excited pairs. Drawing two such states, the expected Hamming distance between them is $\langle x \rangle = 2 \cdot 4 N_p \gamma$, where the factor 2 is because both states have $N_p \gamma$ excitations and the factor 4 because a pair excitation causes four opposite bits. The factor $\gamma$ is thus related to $s_c$ as $s_c = 1 - 4\gamma$. Note that this is only correct for small $\gamma$, as otherwise the excitations start to overlap and the calculation becomes more complicated. Indeed, the lower limit for $s_c$ is $\nu$, so we should have $4\gamma \ll 1 - \nu$. We can then compute $S_c$ as $S_c = -\log(s_c) \approx 4\gamma$. (G3)
On the other hand, $S_P$ is related to the collision probability, which in our model means the probability that the two configurations we draw from the distribution are exactly the same, i.e. the probability that exactly the same excitations occur twice. This probability can be written as $P(x = 0) = \sum_{k=0}^{N_p N_e} \binom{N_p N_e}{k} (\gamma/N_e)^{2k} (1 - \gamma/N_e)^{2(N_p N_e - k)}$. (G4) Together with the expansion to linear order in $\gamma$, $S_P = -\log P(x = 0) \approx 2\gamma N_p$ [Eq. (G5)], and Eq. (G3), we thus arrive at the relation $S_P = \frac{1}{2} N_p S_c$. We note that this result can also be obtained simply by approximating $P(x = 0)$ by the probability of selecting twice the unperturbed Fermi sea, corresponding to $k = 0$ in Eqn. G4, as the other contributions are higher order in $\gamma$.
The main point in this simplistic model is that $S_c \approx 1 - s_c \approx 4\gamma$ is proportional to the average number of excitations per particle from the unperturbed Fermi sea, but the constant of proportionality depends on the type of the excitations. The scaling coefficient $\alpha = \frac{1}{2}$ arises because the excitations are typically pair excitations. Had we assumed single-particle excitations, we would have obtained $S_c \approx 1 - s_c \approx 2\gamma$, and the end result would have been $S_P = N_p S_c$.
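The scaling $S_P \approx \frac{1}{2} N_p S_c$ of this stochastic model can be checked with a few lines of arithmetic. The Python sketch below assumes, as our own reconstruction of the model, that each of the $N_p N_e$ possible pair excitations occurs independently with probability $\gamma/N_e$ in each draw, so that the collision probability is a binomial-type sum over the number $k$ of shared excitations, and that $s_c = 1 - 4\gamma$. The function names and the parameter values are hypothetical.

```python
import math


def collision_probability(n_p, n_e, gamma):
    """Probability that two independent draws contain exactly the same pair excitations."""
    n = n_p * n_e          # number of possible pair excitations
    p = gamma / n_e        # probability of each excitation
    return sum(
        math.comb(n, k) * p ** (2 * k) * (1 - p) ** (2 * (n - k))
        for k in range(n + 1)
    )


def scaling_ratio(n_p, n_e, gamma):
    """Ratio S_P / (N_p S_c) of the model, expected to approach 1/2 for small gamma."""
    s_p = -math.log(collision_probability(n_p, n_e, gamma))
    s_c = -math.log(1 - 4 * gamma)
    return s_p / (n_p * s_c)


for gamma in (0.02, 0.005, 0.001):
    print(gamma, scaling_ratio(n_p=10, n_e=1, gamma=gamma))
```

The sum equals the closed form $[p^2 + (1-p)^2]^{N_p N_e}$ with $p = \gamma/N_e$, which makes the leading behaviour $S_P \approx 2\gamma N_p$ explicit; deviations of the ratio from 1/2 are of relative order $\gamma$.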

Appendix H: Computational details
We perform exact diagonalization calculations using the QuSpin package 59,60, which allows easy building of Hamiltonian matrices for the fermionic lattice models considered here. The package also allows selecting specific symmetry sectors of lattice models, fixing e.g. quantum numbers corresponding to center-of-mass momentum and parity under reflection $p \to -p$. For the excited state calculations we perform full diagonalization of the selected symmetry block using standard dense Hermitian methods, while for the ground state results we employ the ARPACK-based sparse methods included in the QuSpin library and SciPy 61.
To study the entropy $S_{P_B}$ in different bases $B$, we need to change the single-orbital basis for the full many-body eigenstates, which are initially computed in the position basis. In the second quantized formalism, an orbital transformation for a system with $N_o$ orbitals is specified by an $N_o \times N_o$ unitary matrix $U$ acting on the annihilation operators as $c_i \to \sum_j U_{ij} c_j$. (H1) The unitary matrix can be parametrized by a Hermitian matrix $A$ such that $U = \exp(iA)$, and this transformation can then be expressed as an operator $\hat{U}$ in the many-body Fock space as $\hat{U} = \exp(i \vec{c}^{\,\dagger} A \vec{c})$, acting on operators as $\hat{U}^\dagger \vec{c}\, \hat{U} = U \vec{c}$. That the orbital rotations can be expressed in such exponentiated form is referred to as the Thouless theorem 62 in the literature 63.
The operator $\hat{A} = \vec{c}^{\,\dagger} A \vec{c}$ is represented as a sparse matrix that only couples basis states connected by a single hop, thus having $N_p N_h Q$ non-zero elements, where $N_p$ and $N_h$ are the number of particles and holes, respectively, and $Q$ is the number of Fock basis states. Applying the operator $\hat{U}$ on a state in the Fock space can be carried out by sparse matrix methods, where the only large matrix operation is matrix-vector multiplication by $\hat{A}$ 64,65. For small systems, the non-zero matrix elements of $\hat{A}$ can be computed and stored in memory in a sparse matrix format. For large systems, it is advantageous to compute the matrix elements of $\hat{A}$ on the fly when performing the matrix-vector multiplication, as memory access becomes the bottleneck of the computation.
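The orbital rotation described above can be sketched for small systems as follows. This is an illustrative Python example and not the implementation used for the results in this work: it stores $\hat{A}$ explicitly as a SciPy sparse matrix and applies $\exp(i\hat{A})$ with `scipy.sparse.linalg.expm_multiply`. The helper names `fock_basis` and `one_body_operator`, the basis ordering and the toy sizes are our own assumptions.

```python
import itertools

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import expm_multiply


def fock_basis(n_orb, n_part):
    """Sorted tuples of occupied orbitals and a lookup from tuple to basis index."""
    states = list(itertools.combinations(range(n_orb), n_part))
    return states, {s: k for k, s in enumerate(states)}


def one_body_operator(A, n_part):
    """Sparse matrix of sum_ij A[i, j] c_i^dagger c_j at fixed particle number."""
    n_orb = A.shape[0]
    states, index = fock_basis(n_orb, n_part)
    rows, cols, vals = [], [], []
    for col, occ in enumerate(states):
        # Diagonal part: sum of A[j, j] over the occupied orbitals
        rows.append(col)
        cols.append(col)
        vals.append(sum(A[j, j] for j in occ))
        for j in occ:                  # orbital that is emptied
            for i in range(n_orb):     # orbital that is filled
                if i in occ:
                    continue
                lo, hi = min(i, j), max(i, j)
                # Fermionic sign: parity of occupied orbitals strictly between i and j
                sign = (-1) ** sum(1 for k in occ if lo < k < hi)
                new = tuple(sorted((set(occ) - {j}) | {i}))
                rows.append(index[new])
                cols.append(col)
                vals.append(sign * A[i, j])
    q = len(states)
    return csr_matrix((vals, (rows, cols)), shape=(q, q), dtype=complex)


rng = np.random.default_rng(0)
n_orb, n_part = 5, 2
M = rng.normal(size=(n_orb, n_orb)) + 1j * rng.normal(size=(n_orb, n_orb))
A = (M + M.conj().T) / 2                     # random Hermitian generator
A_hat = one_body_operator(A, n_part)

psi = np.zeros(A_hat.shape[0], dtype=complex)
psi[0] = 1.0                                 # a single Slater state
psi_rot = expm_multiply(1j * A_hat, psi)     # state after the orbital rotation
entropy = -np.log(np.sum(np.abs(psi_rot) ** 4))
print("Renyi-2 entropy in the rotated basis:", entropy)
```

For a dense generator $A$, the matrix built here has $N_p N_h Q$ off-diagonal entries plus $Q$ diagonal ones. In the single-particle sector it reduces to $A$ itself, which gives a simple consistency check of the sign convention.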
For basis optimization we again parametrize the orbital basis in the form $U = \exp(iA)$ and perform a conjugate gradient minimization of the Rényi entropy $S_{P_B}$ with the elements of the Hermitian matrix $A$ treated as free parameters. For the single-component models, we use a random matrix $U \sim \mathrm{CUE}(N_o)$ as the starting point of the minimization, with $N_o$ the number of orbitals in the model. For the two-component Hubbard model, we enforce component conservation, meaning that $A$ is block-diagonal and does not mix different spin components.
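A toy version of this basis optimization can be sketched in Python with `scipy.optimize.minimize` and its `"CG"` method. The example below is only an illustration under simplifying assumptions of our own: finite-difference gradients, a small random starting point instead of a CUE-distributed one, a single-component model with $N_o = 4$ and $N_p = 2$, and a test state that is a Slater state in a hidden rotated basis. The helper names are hypothetical.

```python
import itertools

import numpy as np
from scipy.optimize import minimize
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import expm_multiply


def one_body_operator(A, n_part):
    """Sparse matrix of sum_ij A[i, j] c_i^dagger c_j at fixed particle number."""
    n_orb = A.shape[0]
    states = list(itertools.combinations(range(n_orb), n_part))
    index = {s: k for k, s in enumerate(states)}
    rows, cols, vals = [], [], []
    for col, occ in enumerate(states):
        rows.append(col)
        cols.append(col)
        vals.append(sum(A[j, j] for j in occ))
        for j in occ:
            for i in range(n_orb):
                if i in occ:
                    continue
                lo, hi = min(i, j), max(i, j)
                # Fermionic sign from occupied orbitals strictly between i and j
                sign = (-1) ** sum(1 for k in occ if lo < k < hi)
                new = tuple(sorted((set(occ) - {j}) | {i}))
                rows.append(index[new])
                cols.append(col)
                vals.append(sign * A[i, j])
    q = len(states)
    return csr_matrix((vals, (rows, cols)), shape=(q, q), dtype=complex)


def hermitian_from_params(x, n_orb):
    """Map n_orb^2 real parameters to a Hermitian matrix."""
    M = np.asarray(x, dtype=float).reshape(n_orb, n_orb)
    upper, lower = np.triu(M, 1), np.tril(M, -1)
    return np.diag(np.diag(M)) + upper + upper.T + 1j * (lower - lower.T)


def renyi2_after_rotation(x, psi, n_orb, n_part):
    """Second Renyi entropy of the Fock probabilities after the rotation exp(i A_hat)."""
    A_hat = one_body_operator(hermitian_from_params(x, n_orb), n_part)
    probs = np.abs(expm_multiply(1j * A_hat, psi)) ** 2
    return -np.log(np.sum(probs ** 2))


rng = np.random.default_rng(1)
n_orb, n_part = 4, 2
q = len(list(itertools.combinations(range(n_orb), n_part)))
slater = np.zeros(q, dtype=complex)
slater[0] = 1.0
hidden = 0.3 * rng.normal(size=n_orb * n_orb)   # hidden rotation of a Slater state
psi = expm_multiply(
    1j * one_body_operator(hermitian_from_params(hidden, n_orb), n_part), slater
)

x0 = 0.01 * rng.normal(size=n_orb * n_orb)      # small random start (our simplification)
s_start = renyi2_after_rotation(x0, psi, n_orb, n_part)
result = minimize(renyi2_after_rotation, x0, args=(psi, n_orb, n_part),
                  method="CG", options={"maxiter": 20})
s_opt = result.fun
print("entropy before:", s_start, "after:", s_opt)
```

For the two-component Hubbard model the same sketch would need a block-diagonal $A$; for larger systems one would want analytic gradients rather than finite differences.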

FIG. 2. Complexity of ground states. a): Ground state complexity of a half-filled Hubbard chain as a function of interaction. The system length is $L = 8$ sites ($N_o = 16$ orbitals). The numerically optimized complexity (black dots) is well-approximated from below by the ground state scaling form $S_P = \frac{1}{2} N_p S_c$ (green dashed line). The solid lines correspond to complexities calculated in the momentum and position orbitals. Complexity at weak coupling does not approach zero due to ground state degeneracy in the non-interacting model. b): Complexity scaling of the half-filled Hubbard chain as a function of the particle number $N_p$. The complexity $S_P$ is approximated by $S_{min} = \min(S_{pos}, S_{mom})$, as we cannot perform the full optimization for large systems. c): Tentative extrapolation of the coefficient $\alpha = S_P/(N_p S_c)$ to infinite system size. The dashed lines are a quadratic fit. At strong coupling the value approaches $\alpha \approx 1/2$. d)-f): The same quantities for the $t$-$V$ model but at $\nu = 1/3$ filling. Here we also explicitly include the complexity $S_{nat}$ in natural orbitals, which always gives the approximated minimal complexity $S_{min}$. Again, the ground state complexity follows closely the scaling form $S_P = \frac{1}{2} N_p S_c$.

FIG. 3. Complexity of excited states in the Hubbard and the $t$-$V$ model in the ground state parity and momentum sector. a)-b): Comparison of numerically optimized results $S_{opt}$ with $S_{min} = \min(S_{nat}, S_{mom}, S_{pos})$, where $S_{nat}$, $S_{mom}$ and $S_{pos}$ are the natural orbital, momentum and position basis complexities in a Hubbard chain of length $L = 8$. c)-d): Distribution of the excited state complexity ratio $S/(N_p S_c)$ for a larger system of $L = 12$. e)-f): Mean complexity ratio in the ground state sector of the Hubbard model for filling factors $\nu = 1/2$ and $\nu = 1/3$. The colored areas around the mean (dashed line) have a width of one standard deviation. g): Scaling of the mean complexity with the number of particles $N_p$. For comparison, generic states have $\alpha_g = 2$ at $\nu = 1/2$, and $\alpha_g$ at $\nu = 1/3$ is marked on the axis. h)-j): the same as e)-f) but for the $t$-$V$ model instead of the Hubbard model.

FIG. 4. Cumulative probability distribution for the ground state of the Hubbard model at $U = 10$, as a function of the scaled logarithm of the state index $\log(n)/L$. The state probabilities $P_n$ have been ordered from the largest to smallest. The vertical lines mark the scaled entropy $S_P/L$ and $\log(\sigma(n))/L$, where $\sigma(n)$ is the standard deviation of $n$. The inset shows a tentative extrapolation of $S_P/L$ and $\log(\sigma(n))/L$ to infinite system size.

FIG. 5. Complexity of generic, Haar-distributed states in different single-particle bases, computed as a mean of five samples with the vertical bars giving the sample standard deviation, except at $N_p = 6$ where we have only computed one realization. A $Q$-dimensional generic state has logarithmic complexity $S_{CUE} = -\log(2/Q)$ in the limit of large $Q$. The complexity can be reduced by applying single-particle rotations, as demonstrated here by transforming to the natural orbital basis or a numerically optimized basis. However, to conserve information, we expect that the typical reduction in the characteristic number of terms $C = \exp(S)$ in the optimal Fock representation of a state with no special structure scales with the number of parameters in the single-particle basis, i.e. polynomially in $N_p$, while $C$ grows exponentially for a fixed filling fraction. $S_{CUE,nat}$ and $S_{CUE,opt}$ are thus expected to approach $S_{CUE}$ for large $N_p$, while the correlation entropy approaches its maximal value $-\log(\nu)$.
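The fixed-basis value $S_{CUE} = -\log(2/Q)$ quoted in this caption can be reproduced by direct sampling. The Python sketch below is an illustration under our own assumptions: it draws Haar-random state vectors as normalized complex Gaussian vectors, uses $N_o = 14$ and $N_p = 7$, and does not perform any optimization over single-particle bases. The function name is hypothetical.

```python
import math
import random


def haar_state_entropy(q, rng):
    """Second Renyi entropy of the basis probabilities of one Haar-random state."""
    amps = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(q)]
    norm = sum(abs(a) ** 2 for a in amps)
    probs = [abs(a) ** 2 / norm for a in amps]
    return -math.log(sum(p * p for p in probs))


rng = random.Random(0)
q = math.comb(14, 7)                       # Fock-space dimension Q for N_o = 14, N_p = 7
samples = [haar_state_entropy(q, rng) for _ in range(20)]
mean_entropy = sum(samples) / len(samples)
predicted = math.log(q / 2)                # S_CUE = -log(2/Q)
fractal_dimension = mean_entropy / math.log(q)

print("sampled mean entropy:", mean_entropy)
print("large-Q prediction:  ", predicted)
print("D_2,B estimate:      ", fractal_dimension)
```

The ratio `fractal_dimension` is the fixed-basis coefficient $D_{2,B} = S_{P_B}/\log(Q)$, which stays below 1 at finite $Q$ because of the $-\log 2$ offset.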

FIG. 6. Fractal dimension $D_{2,B}$ of the states in the ground state symmetry block of the 1/3-filled $t$-$V$ model at different interaction strengths. The left and middle columns show the fractal dimension in the $L = 24$ model in position and momentum bases. The right column shows the scaling of the maximal fractal dimension (i.e. the highest point in the left and middle panels) as a function of system size. $S_B$ is the Rényi-2 entropy of the Slater configuration distribution in the indicated basis and $Q$ is the number of states in the Fock space.

FIG. 7. The ratio $S_{min}/S_c$ for a range of filling factors $\nu$ in a Hubbard model of size $L = 12$ and $t$-$V$ models of size $L = 20, 21$. $S_{min}$ is again the minimal $S_{P_B}$ chosen from the natural orbital, momentum and position bases. In some cases the ground state is degenerate, but the degeneracy can be resolved by the total momentum and reflection parity quantum numbers. We find that the complexity for the degenerate ground states is the same in all symmetry sectors. The basis giving the lowest complexity is typically the natural orbital basis in the $t$-$V$ model or the momentum basis in the Hubbard model, where the natural occupations may become degenerate. The exception is at half-filling (dashed lines) with strong interactions, where the position basis leads to lower complexity (marked with dots), and in the case of the $t$-$V$ model to an apparent departure from the typical ground state scaling $\alpha \gtrsim 1/2$. This is discussed in App. F.

FIG. 8. Fitting Eqn. G2 to ground state data from the Hubbard model. The numerical results were computed at half-filling for system size $L = 10$ in momentum orbitals. The dots show the total weight of Slater configurations with a given Hamming distance from the Fermi sea, while the lines represent the model distribution. Note that the weight at Hamming distances not divisible by four is zero in the model of Eqn. G2, because all excitations from the Fermi sea are assumed to be pair excitations. This agrees with the numerical data quite well.
the restriction to rare excitations, we expand the complexity $S_P = -\log(P(x = 0))$ to linear order in $\gamma$ as $S_P = -\log(P(x = 0)) \approx 2\gamma N_p$. (G5)
$B_i\}_k\rangle$, with $\sum_k P_k = 1$, yields an example of a complexity bound saturating state. For these states the correlation matrix $C$ is diagonal, and the natural occupations are $\lambda_i = C_{ii} = P_k$ for each of the $N_p$ occupied orbitals in the set $\{n_{B_i}\}_k$. Thus, $S_c = S_p = -\ln$
$nH(P^B_k)$ dimensional subspace $\mathcal{H}$. The maximum compression is obtained in the Fock basis $B$ which minimizes $H(P^B_k)$. Since the Shannon entropy is bounded from below by the second Rényi entropy $H(P^B$
or, equivalently, $\dim \mathcal{H} \geq C^n$. This result has fundamental importance in the quantum information applications of many-body physics. The dimension of $\mathcal{H}$ should be regarded as the physical resource, and the asymptotic cost, required in storing $n$ copies of $|\Phi\rangle$. Thus, to represent and store the quantum information in $n$ copies of $|\Phi\rangle$ requires at least $N_{qubit} = \log_2 C^n = \log_2 e^{nS_P}$ qubits, or $n_{qubit} = N_{qubit}/n = \log_2 e^{S_P}$ qubits per copy. To properly appreciate the fundamental nature of the results, we recall Shannon's classical result which states that a string of $n \gg 1$ letters, each appearing with probability $P_k$, can be optimally compressed to $n_{bit} = \log_2 2^{nH(P_k)}$ bits.