Efficiency of neural-network state representations of one-dimensional quantum spin systems

Neural-network state representations of quantum many-body systems are attracting great attention, and a more rigorous quantitative analysis of their expressibility and complexity is warranted. Our analysis of the restricted Boltzmann machine (RBM) state representation of one-dimensional (1D) quantum spin systems provides new insight into their computational complexity. We define a class of long-range-fast-decay (LRFD) RBM states with quantifiable upper bounds on truncation errors and provide numerical evidence that a large class of 1D quantum systems may be approximated by LRFD RBMs of at most polynomial complexity. These results lead us to conjecture that the ground states of a wide range of quantum systems may be exactly represented by LRFD RBMs or a variant of them, even in cases where other state representations become less efficient. Finally, we provide the relations between multiple typical state manifolds. Our work proposes a paradigm for the complexity analysis of generic long-range RBMs, which naturally yields a further classification of this manifold. This paradigm and our characterization of their nonlocal structures may pave the way for understanding the natural measure of complexity for quantum many-body states described by RBMs and are generalizable to higher-dimensional systems and deep neural-network quantum states.

Here, we report results of an investigation of neural-network quantum states in the context of quantum many-body physics [5-9, 28-31], a subject of much recent interest. The core idea is to postulate an ansatz for the wave function in terms of a neural network (NN) [6], which targets a low-dimensional manifold in the exponentially large Hilbert space for state approximation [32], and to apply ML algorithms to find a specific solution. The restricted Boltzmann machine (RBM) [6-8, 28, 29] is a bipartite stochastic construct that combines the concepts of thermodynamic partition functions with those of classical artificial neural networks. RBMs have successfully represented a wide range of quantum states, such as low-lying eigenstates of quantum many-body-localized systems [6,33], code words of a stabilizer code [30,34,35] and chiral topological states [8,36,37].
While RBMs have demonstrated their power in numerical simulation, we have particular motivations to investigate the expressibility and complexity of generic long-range RBMs, which are characterized by dense network architectures with full interlayer connectivity, in contrast to the so-called short-range or sparse RBMs [7,30,31]. First, the state approximators produced by RBM solutions returned by ML algorithms often feature a long-range form and a fast parameter decay, even when the exact RBM representations of the target states are unknown [30,31] or less efficient [8]. Increasing the number of hidden nodes captures spin correlations of higher orders [6], increasing the approximation accuracy. The best approximators often have a form similar to that obtained by magnitude-based pruning [38,39] of a finite truncated RBM with infinitely many hidden nodes. These observations motivate the generalization of the RBM wave-function ansatz to an infinitely-many-hidden-node regime and the justification of the faithfulness of using these truncated long-range RBM approximators [40].
Another motivation for studying long-range RBMs stems from the central goal of exploring effective compressed state representations, which includes understanding the natural measure of complexity [7] and how the global physical information is encoded in that description [32]. There has been some work studying the relationship between RBMs and other concepts of state representation, such as string-bond states [8], correlator product states [41] and tensor network states [37,42]. However, analyzing the effects of truncations by transforming RBMs into other representations may lead to redundancy and inconvenience, and it does not fully exploit the features of RBMs as architectures that naturally describe quantum states in a nonlocal manner [7,32]. Thus, we choose to build a paradigm of direct analysis of the spatial complexity of long-range RBMs.
In this paper, we analyze the efficiency of the long-range RBM state representation for 1D quantum spin systems. Our procedure is as follows:
1. In Sec. II A, we generalize the RBM wave-function ansatz to an infinitely-many-hidden-node regime and define a subset of generic RBM states, the long-range-fast-decay (LRFD) RBM states, whose parameter conditions constrain the nonlocal interactions between spins (visible nodes) and virtual particles (hidden nodes).
2. In Sec. II B, we derive an upper bound on truncation errors associated with two measures of state differences for the sequence of truncated LRFD RBM states. One measure is the $l_2$-norm of the state-vector difference and the other is a Hermitian-operator-based expectation-value difference.
3. In Sec. II C, we identify the dependence of the spatial complexity of LRFD RBMs in state approximation on the decaying rates specified in the nonlocal interaction pattern.
4. In Sec. III, we provide numerical evidence supporting a conjecture that the ground states of a wide range of 1D quantum spin systems, including some critical systems with logarithmic entanglement entropy, can be approximated by LRFD RBMs whose spatial complexity scales at most polynomially in both the system size and the inverse of the approximation error.
5. In Sec. III, we also provide the relations between multiple typical state manifolds, through which the importance of the concept of LRFD RBMs in efficiency analysis for state representation theory is manifested.
Our results offer evidence for the utility of RBMs in cases where other state parameterizations, such as matrix product states (MPSs), become less efficient. Our work proposes a paradigm for the complexity analysis of general long-range RBMs, rather than only short-range or sparse RBMs, and naturally yields a further classification of this manifold based on the complexity scaling.
We find that the nonlocal structure of LRFD RBMs can be characterized by two conditions. These conditions are each determined by bounds associated with two degrees of freedom, defined within a framework of levels that is depicted in Fig. 1(a). One of the two degrees of freedom is a single-level decaying factor resembling localized orbitals and encoding information about correlations between spins (Sec. II D). The second is a level-decay factor, which has a significant influence on the complexity of the RBMs (Sec. II C).
This paradigm and our characterization of the nonlocal structures may promote the understanding of the natural measure of complexities for quantum many-body states described by RBMs and may be generalizable to higher-dimensional systems and to deep neural-network quantum states.

II. THE RESTRICTED BOLTZMANN MACHINE AS A WAVE-FUNCTION ANSATZ
We use the RBM as a wave-function ansatz for 1D quantum many-body spin-1/2 systems [5-7]. The RBM usually works as the building block for understanding and training deeper networks because of its relatively simple structure for inference and its power in parametric modeling as a universal approximator for discrete distributions [43]. As basic constructs of deep NNs, RBMs have two layers. The first layer (a visible layer) represents a spin configuration $\sigma$ in the usual way. Here, the vector $\sigma = (\sigma_1, \ldots, \sigma_L)$ represents a system of $L$ spins with $\sigma_j = \pm 1$ for $j = 1, \ldots, L$. The second layer is a hidden layer. It is composed of $N_h$ nodes, denoted by a vector $h = (h_1, \ldots, h_{N_h})$ with $h_k = \pm 1$ for $k = 1, \ldots, N_h$. The $h_k$'s are introduced as auxiliary particles in the probability model; they play roles similar to those of virtual particles in the valence-bond picture for MPSs [32,44].
Given a specific spin configuration $\sigma$, the RBM outputs the corresponding wave-function amplitude
$$\psi(\sigma) = 2^{-N_h} \sum_{h} \exp\Big(\sum_{j=1}^{L} a_j \sigma_j + \sum_{k=1}^{N_h} b_k h_k + \sum_{j=1}^{L}\sum_{k=1}^{N_h} W_{j,k}\,\sigma_j h_k\Big) = \exp\Big(\sum_{j=1}^{L} a_j \sigma_j\Big) \prod_{k=1}^{N_h} \cosh\Big(b_k + \sum_{j=1}^{L} W_{j,k}\,\sigma_j\Big), \quad N_h \in \mathbb{N}.$$
Here, $a_j$ and $b_k$ are the bias parameters for the $j$-th spin and $k$-th hidden node, respectively, $W_{j,k}$ is a weight parameter describing the interlayer interaction between the $j$-th spin and $k$-th hidden node, and $\mathbb{N}$ denotes the set of all natural numbers. The $a_j$, $b_k$ and $W_{j,k}$ are complex numbers. All such amplitudes defined on the computational basis yield a quantum state vector $|\Psi\rangle = \sum_{\sigma} \psi(\sigma)|\sigma\rangle$, where the summation is over all $2^L$ spin configurations. Note that we adopt the RBM form with a factor of $2^{-N_h}$. This choice allows us to use infinitely many hidden nodes $h_k$ as long as $b_k$ and $W_{j,k}$ decay sufficiently fast to ensure the convergence of $\psi(\sigma)$ as $N_h \to \infty$ for fixed system sizes $L$. In other words, it ensures that adding hidden nodes whose associated parameters ($b_k$ and $W_{j,k}$) are zero will not change the value of the wave function. This choice will facilitate the asymptotic analysis as shown below.
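The role of the $2^{-N_h}$ factor can be checked numerically. The following is a minimal sketch, assuming the amplitude is the explicit sum over hidden configurations $h_k = \pm 1$ weighted by $2^{-N_h}$, which collapses to a product of hyperbolic cosines. The function names, the random parameters and the tiny sizes are illustrative assumptions and are not taken from the paper.

```python
import itertools
import numpy as np

def amplitude_sum(sigma, a, b, W):
    """2^{-N_h} * sum over h in {-1,+1}^{N_h} of exp(a.sigma + b.h + sigma.W.h)."""
    n_h = len(b)
    total = 0.0 + 0.0j
    for h in itertools.product((-1, 1), repeat=n_h):
        h = np.array(h)
        total += np.exp(a @ sigma + b @ h + sigma @ W @ h)
    return total / 2**n_h

def amplitude_cosh(sigma, a, b, W):
    """Closed form: exp(a.sigma) * prod_k cosh(b_k + sum_j W_{j,k} sigma_j)."""
    return np.exp(a @ sigma) * np.prod(np.cosh(b + sigma @ W))

# Illustrative (hypothetical) complex parameters for L = 4 spins, N_h = 3 hidden nodes.
rng = np.random.default_rng(0)
L, n_h = 4, 3
a = 0.1 * (rng.normal(size=L) + 1j * rng.normal(size=L))
b = 0.1 * (rng.normal(size=n_h) + 1j * rng.normal(size=n_h))
W = 0.1 * (rng.normal(size=(L, n_h)) + 1j * rng.normal(size=(L, n_h)))
sigma = np.array([1, -1, -1, 1])

psi_sum = amplitude_sum(sigma, a, b, W)
psi_cosh = amplitude_cosh(sigma, a, b, W)

# Pad with two hidden nodes whose parameters are all zero: cosh(0) = 1,
# so the amplitude is unchanged.
b_pad = np.concatenate([b, np.zeros(2)])
W_pad = np.hstack([W, np.zeros((L, 2))])
psi_padded = amplitude_cosh(sigma, a, b_pad, W_pad)
```

The padded amplitude equals the original one, which is the property that makes the limit $N_h \to \infty$ well posed for sufficiently fast parameter decay.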
As mentioned in Sec. I, the RBMs solved by relevant ML algorithms to approximate target states often feature a long-range form and a fast parameter decay. As more hidden nodes are added to the network, the RBM can capture higher-order correlations between spins [6], thus leading to higher accuracy in approximation. The parameter decay is manifested by the decay of the weight parameters $W_{j,k}$ with an increasing index separation $|j - k|$ as well as the decay of $b_k$ with increasing $k$.
In this work, we assume $N_h$ to be an integer multiple of $L$, which will facilitate the scaling analysis, especially for translationally invariant systems. When $N_h$ is not an integer multiple of $L$, we can simply fill the last fragment with hidden nodes associated with zero-valued parameters without influencing the wave-function values. We divide the hidden layer into multiple levels, each of which contains $L$ hidden nodes (Fig. 1(a)). Thus, there are $N_h/L$ levels in total; the ratio $N_h/L$ is called the hidden-unit density in some references [6]. We will show that hidden nodes at the same level can capture the correlation of the same order between spins by performing an algorithm to reorder all hidden nodes for general RBMs. This point will be further clarified when we use the RBM form with translational symmetry to represent the ground states of 1D translationally invariant quantum systems as shown below [6].
One example of the quantum states that can be exactly represented by short-range RBMs [30,31] is the 1D symmetry-protected topological (SPT) cluster state. The Hamiltonian of the SPT cluster system is defined on a 1D $L$-site lattice with periodic boundary conditions as $\hat{H}_{\rm cluster} = -\sum_{j=1}^{L} \hat{\sigma}^z_{j-1} \hat{\sigma}^x_{j} \hat{\sigma}^z_{j+1}$, where $\hat{\sigma}^x$ and $\hat{\sigma}^z$ are Pauli matrices. A conventional $r_0$-range RBM is defined as an RBM satisfying $W_{j,k} = 0$ for any $|j - k| > r_0$. A short-range RBM usually refers to an $r_0$-range RBM with $r_0$ being a small constant independent of the system size $L$. It was shown in Refs. [30,31] that the ground state of $\hat{H}_{\rm cluster}$ can be exactly represented by a 1-range RBM with $L$ hidden nodes [system of equations (3)], obtained by using the stabilizer nature of the system to decrease the number of equation constraints for parameters from exponential to linear in $L$. Using our language of levels, this RBM has just one level, and its weight parameters at this single level have a support of very short length, which is a manifestation of its quantum entanglement satisfying an area law. Moreover, the translational symmetry of the system is inherited by the RBM form. The parameter patterns of this RBM also have a translational symmetry, which means that its parameters for different hidden nodes can be generated by the action of a translational-symmetry transformation operator on those for a single hidden node [6].
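As a small numerical illustration of the cluster Hamiltonian and its stabilizer nature, the sketch below builds $\hat{H}_{\rm cluster}$ as a dense matrix for a short periodic chain and diagonalizes it. The 0-based site labels, the helper names and the choice $L = 4$ are assumptions made for this example only. Because the $L$ terms are mutually commuting stabilizers, the ground-state energy is $-L$ and flipping one stabilizer costs an energy of 2.

```python
from functools import reduce
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])

def site_op(ops, L):
    """Tensor product over L sites; `ops` maps a 0-based site index to a 2x2 matrix."""
    return reduce(np.kron, [ops.get(j, I2) for j in range(L)])

def cluster_hamiltonian(L):
    """H = -sum_j Z_{j-1} X_j Z_{j+1} with periodic boundary conditions (L >= 3)."""
    H = np.zeros((2**L, 2**L))
    for j in range(L):
        H -= site_op({(j - 1) % L: Z, j: X, (j + 1) % L: Z}, L)
    return H

L = 4
evals = np.linalg.eigvalsh(cluster_hamiltonian(L))  # eigenvalues in ascending order
ground_energy = evals[0]
gap = evals[1] - evals[0]
```

The dense construction scales as $4^L$ in memory, so it is only meant for checking small systems.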
Inspired by the extensibility of the system of equations (3) with growing system sizes and considering the need to capture higher-order correlations between spins [6] and stronger quantum entanglement between subsystem blocks [31], we expect that the RBM representation of general quantum states has multiple, possibly infinitely many, levels and the length of the support of weight parameters at each level may increase from a small constant to the maximum length L. This motivates us to analyze generic long-range RBMs with properly specified nonlocal interactions between spins and hidden nodes (virtual particles).

A. Long-range-fast-decay RBMs
We now discuss aspects of the nonlocal structure of LRFD RBMs that were summarized at the end of Sec. I. This leads to specific definitions of the two conditions that were mentioned there.
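We begin by generalizing the RBM wave-function ansatz to an infinitely-many-hidden-node regime. An RBM state $|\Psi^{(L,\infty)}\rangle$ with infinitely many hidden nodes and a system size $L$ can be defined as
$$|\Psi^{(L,\infty)}\rangle = \sum_{\sigma} \psi^{(L,\infty)}(\sigma)\,|\sigma\rangle,$$
where
$$\psi^{(L,\infty)}(\sigma) = \exp\Big(\sum_{j=1}^{L} a^{(L)}_j \sigma_j\Big) \prod_{k=1}^{\infty} \cosh\Big(b^{(L)}_k + \sum_{j=1}^{L} W^{(L)}_{j,k}\,\sigma_j\Big).$$
Its corresponding truncated-RBM sequence is defined as $\{|\Psi^{(L,N_h)}\rangle\}$, where
$$|\Psi^{(L,N_h)}\rangle = \sum_{\sigma} \psi^{(L,N_h)}(\sigma)\,|\sigma\rangle, \qquad \psi^{(L,N_h)}(\sigma) = \exp\Big(\sum_{j=1}^{L} a^{(L)}_j \sigma_j\Big) \prod_{k=1}^{N_h} \cosh\Big(b^{(L)}_k + \sum_{j=1}^{L} W^{(L)}_{j,k}\,\sigma_j\Big),$$
and $\psi^{(L,N_h)}(\sigma)$ is constructed by removing the hyperbolic cosine terms with $k \ge N_h + 1$ from $\psi^{(L,\infty)}(\sigma)$.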
Then, we define a subset of generic RBM states with infinitely many hidden nodes, the long-range-fast-decay (LRFD) RBM states, as the RBMs whose parameters satisfy the following two conditions.
Condition 1 (boundedness of $W_{j,k}$). There exist an $L$-independent integer $\tilde{k}_s \in \mathbb{N}$ and three nonnegative monotonically decreasing real functions $\lambda_R(\tilde{k})$, $\lambda_I(\tilde{k})$ and $\mu(r)$ such that, after a reordering of all hidden nodes, for all $k > \tilde{k}_s L$,
$$|\mathrm{Re}(W^{(L)}_{j,k})| \le \lambda_R(\tilde{k})\,\mu(r), \qquad |\mathrm{Im}(W^{(L)}_{j,k})| \le \lambda_I(\tilde{k})\,\mu(r), \qquad r = |j - j_c|_{\rm circ},$$
where $\tilde{k} \in \{1, 2, \ldots, N_h/L\}$ designates the numerical index of levels; $j_c$, the center spin for the $k$-th hidden node, denotes the site index of the spin with which the interaction of the $k$-th hidden node reaches its maximum among all $j \in \{1, 2, \ldots, L\}$; $|m|_{\rm circ} = \min\{m, L - m\}$ in accordance with the periodic boundary conditions; and $r \in \{0, 1, \ldots, (L-1)/2\}$ denotes the distance between $j$ and $j_c$, where $L$ is assumed to be odd without influencing the validity of the following asymptotic analysis. The functions $\lambda_R(\tilde{k})$, $\lambda_I(\tilde{k})$ and $\mu(r)$ are further required to satisfy boundedness conditions expressed through finite $L$-independent nonnegative constants $P_0$ and $\mu_0$ [see Eq. (11) for the condition on $\mu(r)$].

We provide an interpretation of each new variable as follows. After the reordering of all hidden nodes, which is usually associated with a sorting step, starting from the level $\tilde{k}_s + 1$, $\tilde{k} = \tilde{k}(k)$ and $j_c = j_c(k)$ are both functions of $k$, and the correspondence between the pair $(\tilde{k}, j_c)$ and $k$ is a bijective map. This means that every hidden node with $k > \tilde{k}_s L$ is associated with a unique pair and thus can be uniquely positioned in the RBM network after the reordering (Fig. 1(a)). The hidden nodes capturing the correlation of the same order between spins are grouped at the same level, so that the new indices of these hidden nodes, characterized by the pair $(\tilde{k}, j_c)$, manifest the level of correlations. This characterization can also make the symmetry manifest for quantum states with translational symmetry. The reordering step is needed because ML algorithms with a stochastic nature are often unable to automatically group the hidden nodes according to level stratification, and their site positions usually exhibit randomness. The condition $\tilde{k}_s = 0$ implies that all hidden nodes satisfy the boundedness conditions, so that the level stratification can be applied to the whole hidden layer (Fig. 1(a)).
Condition 2 (boundedness of $b_k$). After the same reordering of all hidden nodes that is described in Condition 1, for all $k > \tilde{k}_s L$,
$$|\mathrm{Re}(b^{(L)}_k)| \le \lambda_R(\tilde{k})\,\mu(0), \qquad |\mathrm{Im}(b^{(L)}_k)| \le \lambda_I(\tilde{k})\,\mu(0).$$

The definition of LRFD RBMs should be understood from the point of view of state manifolds [32,45]. A state manifold for quantum many-body states usually refers to a subspace of the whole Hilbert space spanned by a parameterized wave-function family [45], and is thus a set containing specific types of quantum states. The manifold of LRFD RBMs can therefore be defined as a space spanned by all parameterized wave functions, every one of which belongs to a quantum-state sequence associated with a varying system size and satisfying Conditions 1 and 2 above. One LRFD RBM state refers to an element in this manifold. This definition is in the same spirit as the definition of MPSs with different scaling laws [32,44].
Condition 1 gives an upper bound on the magnitude of the RBM weight parameters and provides a description of the nonlocal interaction between spins and hidden nodes (virtual particles). It requires that $|\mathrm{Re}(W^{(L)}_{j,k})|$ and $|\mathrm{Im}(W^{(L)}_{j,k})|$ are upper bounded, respectively, by the products $\lambda_R(\tilde{k})\mu(r)$ and $\lambda_I(\tilde{k})\mu(r)$. The monotonically decreasing functions $\lambda_R(\tilde{k})$ and $\lambda_I(\tilde{k})$ can be regarded as level-decay factors, while $\mu(r)$ is a factor describing the decay due to the increase of the distance between the spin-site index ($j$) and the site index of the center spin ($j_c(k)$) for the $k$-th hidden node. The function $\mu(r)$ has a localization feature and resembles a single-modal localized orbital in the physics of periodic potentials, such as Wannier modes [46], which is reflected by its monotonic decrease with increasing $r$. This description can thus effectively capture the parameter decays induced by both the level increase and the growth of system size, providing two degrees of freedom in characterizing the nonlocal interaction pattern. The separate treatments of the real and imaginary parts originate from their inequivalent positions in the RBM wave-function form, which is shown in Appendix A.
Condition 2 implies that the contribution of $b_k$-related terms can be upper bounded by the largest $W_{j,k}$-related terms at each level, so that the $W_{j,k}$ weight parameters play a dominant role in the asymptotic analysis (Appendix A). Since there is often a degree of freedom in choosing the value of $\mu(0)$, Condition 2 can be satisfied for a wide range of RBM states.
Conditions 1 and 2 are proposed to ensure the convergence of the state vector (Eq. (5)) and provide a clear quantification of the rate of parameter decays, on the basis of which a complexity analysis can be conducted. A rigorous proof of the convergence of the state vector when Conditions 1 and 2 are satisfied is given in Appendix A. This proof is important not only because it ensures that the generalization of RBMs to an infinitely-many-hidden-node regime makes sense by defining them as the limits of some infinite sequences, but also because it introduces the key mathematical tricks and concepts that are necessary for analyzing the effects of truncations.
The core idea of the proof is to show that the sequence $\{\psi^{(L,nL)}(\sigma) : n \in \mathbb{N}\}$ is a Cauchy sequence in the field of complex numbers $\mathbb{C}$ [47]. This proof is inspired by the fact that, when $b_k$ and $W_{j,k}$ decay sufficiently fast, the complex-valued ratio $\psi^{(L,(n+m)L)}(\sigma)/\psi^{(L,nL)}(\sigma)$ will quickly fall into the neighborhood of the point $z = 1$ in the complex plane as $n$ increases. So we derive an upper and a lower bound on the ratio's modulus $|\psi^{(L,(n+m)L)}(\sigma)/\psi^{(L,nL)}(\sigma)|$, which converge to 1, and an upper bound on the magnitude of its argument $|\arg(\psi^{(L,(n+m)L)}(\sigma)/\psi^{(L,nL)}(\sigma))|$, which converges to 0 as $n$ increases. Then we show that the corresponding magnitude sequence $\{|\psi^{(L,nL)}(\sigma)|\}$ and the argument sequence $\{\arg(\psi^{(L,nL)}(\sigma))\}$ are Cauchy sequences in the field of real numbers $\mathbb{R}$; thus $\{\psi^{(L,nL)}(\sigma)\}$ is a Cauchy sequence in $\mathbb{C}$.
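The behavior of the ratio can be observed numerically. The sketch below is only an illustration under assumed parameters: a translationally symmetric RBM with $a_j = 0$, weights $c_w \lambda(\tilde{k}) \mu(r)$ and biases $c_b \lambda(\tilde{k}) \mu(0)$, with the hypothetical choices $\lambda(\tilde{k}) = \tilde{k}^{-2}$, $\mu(r) = e^{-r}$, $c_w = 0.3 + 0.2i$ and $c_b = 0.1i$. None of these values come from the paper. It evaluates how far the ratio of amplitudes at consecutive truncation levels lies from $z = 1$.

```python
import numpy as np

L = 4
sigma = np.array([1, -1, 1, 1])          # one fixed spin configuration
idx = np.arange(L)
m = np.abs(idx[:, None] - idx[None, :])
dist = np.minimum(m, L - m)              # circular distance |j - j_c|_circ

def lam(level):
    """Hypothetical level-decay factor lambda(k~) = k~^(-2)."""
    return level ** -2.0

def mu(r):
    """Hypothetical single-level decay factor mu(r) = exp(-r)."""
    return np.exp(-r)

def level_factor(level, c_w=0.3 + 0.2j, c_b=0.1j):
    """Product of cosh terms contributed by the L hidden nodes of one level."""
    W = c_w * lam(level) * mu(dist)      # W[j, j_c] for this level
    b = c_b * lam(level) * mu(0)
    return np.prod(np.cosh(b + sigma @ W))

def psi(n_levels):
    """Truncated amplitude psi^(L, nL)(sigma) with a_j = 0."""
    return np.prod([level_factor(level) for level in range(1, n_levels + 1)])

# Distance of psi^(L,(n+1)L) / psi^(L,nL) from z = 1 for n = 1, ..., 19.
deviations = [abs(psi(n + 1) / psi(n) - 1) for n in range(1, 20)]
```

With these settings the deviation shrinks roughly like $\lambda(\tilde{k})^2$, consistent with the expansion $\cosh\theta \approx 1 + \theta^2/2$ for small arguments.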

B. Effects of wave-function truncation for fixed system sizes
We derive upper bounds on truncation errors associated with two measures of state differences for the sequence of truncated LRFD RBM states. Define $\varepsilon(L, N_h)$ to be a specific type of truncation error for using $|\Psi^{(L,N_h)}\rangle$ to approximate $|\Psi^{(L,\infty)}\rangle$.
A natural measure of state differences is the square of the $l_2$-norm [48] of the state-vector difference, $\varepsilon(L, N_h) = \big\| |\tilde{\Psi}^{(L,\infty)}\rangle - |\tilde{\Psi}^{(L,N_h)}\rangle \big\|_2^2$, where the tilde symbol is used to represent the corresponding states after a normalization operation. Note that the RBM wave-function ansatz is not automatically normalized, and an estimation of the normalization factor $\langle\Psi|\Psi\rangle$ is important and often tricky, as shown in Appendix B. This measure of truncation errors is adopted in fundamental works about the faithfulness and efficiency of other wave-function ansätze, such as MPSs [40,49]. It therefore allows us to make a direct comparison between the efficiencies of RBMs and other state representations.
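A second measure of state differences is a Hermitian-operator-based expectation-value difference, that is, the difference between the expectation values of an operator $\hat{B}$ in the normalized states $|\tilde{\Psi}^{(L,\infty)}\rangle$ and $|\tilde{\Psi}^{(L,N_h)}\rangle$, where $\hat{B}$ is built from single-site operators $\hat{\sigma}^{(m)}_j$ and $\{\hat{\sigma}^{(1)}_j, \hat{\sigma}^{(2)}_j, \hat{\sigma}^{(3)}_j\}$ denote the Pauli matrices. We also use this measure because $\{\hat{\sigma}^{(m)}_j : m = 0, 1, 2, 3\}$ is a complete basis set for the local Hilbert space of the $j$-th spin, and a wide range of typical physical observables, such as spin correlations and total energy, correspond to Hermitian operators of this type or to linear combinations of polynomially many such operators.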
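Both measures can be evaluated by brute force for small systems. The sketch below is a hedged illustration, not the paper's numerical setup: it reuses an assumed translationally symmetric parameter pattern ($a_j = 0$, $\lambda(\tilde{k}) = \tilde{k}^{-2}$, $\mu(r) = e^{-r}$, $c_w = 0.3 + 0.2i$, $c_b = 0.1i$), takes a state with 200 levels as a finite stand-in for the infinite-node state, and uses $\hat{\sigma}^z_1 \hat{\sigma}^z_2$ as the example operator.

```python
import itertools
import numpy as np

L = 4
configs = np.array(list(itertools.product((1, -1), repeat=L)))  # all 2^L spin configurations
idx = np.arange(L)
m = np.abs(idx[:, None] - idx[None, :])
dist = np.minimum(m, L - m)                                      # circular distances

def state(n_levels, c_w=0.3 + 0.2j, c_b=0.1j):
    """Normalized truncated RBM state with n_levels * L hidden nodes (a_j = 0)."""
    psi = np.ones(len(configs), dtype=complex)
    for level in range(1, n_levels + 1):
        lam = level ** -2.0                  # hypothetical level-decay factor
        W = c_w * lam * np.exp(-dist)        # hypothetical mu(r) = exp(-r)
        b = c_b * lam
        psi *= np.prod(np.cosh(b + configs @ W), axis=1)
    return psi / np.linalg.norm(psi)

def l2_error(n_levels, n_ref=200):
    """Squared l2-norm of the difference between normalized state vectors."""
    return np.linalg.norm(state(n_ref) - state(n_levels)) ** 2

def zz_error(n_levels, n_ref=200):
    """Absolute difference of the expectation value of sigma^z_1 sigma^z_2."""
    zz = configs[:, 0] * configs[:, 1]
    expval = lambda psi: np.sum(np.abs(psi) ** 2 * zz)
    return abs(expval(state(n_ref)) - expval(state(n_levels)))

errors_l2 = [l2_error(n) for n in (1, 3, 5)]
errors_zz = [zz_error(n) for n in (1, 3, 5)]
```

Since the example operator is diagonal in the computational basis, its expectation value only needs the probabilities $|\tilde{\psi}(\sigma)|^2$; an off-diagonal operator would require the full state vector.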
Then we can prove a lemma which provides upper bounds on truncation errors of the above two types for the sequence of truncated LRFD RBM states.
Lemma 3 (upper bounds on truncation errors).For LRFD RBMs satisfying Conditions 1 and 2, after the same reordering of all hidden nodes described in Condition 1, there exists n Θ (L) > ks such that, for all where the relevant constants are , n Θ (L) can be estimated by inequality (B11), and we have assumed that λ R ( k) = λ I ( k) = λ( k) for simplicity which holds throughout the following discussion.
The proof is given in Appendix B and uses arguments similar to those described in the proof of the convergence of LRFD RBMs. Based on the intuition that the ratio $\psi^{(L,\infty)}(\sigma)/\psi^{(L,N_h)}(\sigma)$ will rapidly converge to 1 with increasing $N_h$, we derive an upper bound $\sqrt{R_1}$ and a lower bound $\sqrt{R_2}$ on the ratio's modulus $|\psi^{(L,\infty)}(\sigma)/\psi^{(L,N_h)}(\sigma)|$ and an upper bound $\Theta$ on the magnitude of its argument $|\arg(\psi^{(L,\infty)}(\sigma)/\psi^{(L,N_h)}(\sigma))|$. The two types of truncation errors can be upper bounded using these three variables, and the two upper bounds can finally be expressed as functions ($F_1(x)$ and $F_2(x)$) of $LQ(L)P(N_h/L)$, which decreases to zero with increasing $N_h$ and fixed $L$. The idea of the proof is shown schematically in Fig. 2(c).
Based on our description of the nonlocal interactions between spins and virtual particles and using the language of levels, $P(N_h/L)$ is a summation of all level-decay factors for hidden nodes at levels from $\tilde{k} = N_h/L + 1$ to $\tilde{k} = \infty$, while $Q(L)$ corresponds to the localized "orbitals" at every single level and contributes a factor reflecting the pure influence of system-size growth regardless of levels. The two different types of truncation errors correspond to two different forms of the function $F(x)$, but both of them are analytic at the point $x = 0$.
The scaling of the truncation errors in $N_h$ is given in Eq. (23).

Our construction of LRFD RBMs and the theoretical analysis of the truncation errors can be further clarified with results from numerical computations. We can construct LRFD RBMs with translational symmetry whose parameters exactly satisfy Eqs. (24)-(26) for any $1 \le j \le L$, $1 \le k \le N_h$, where $a_0$, $c_w$ and $c_b$ are complex constants, $\lceil x \rceil$ denotes the ceiling function, and $\tilde{k}_s = 0$ in this case. It can be shown that such an RBM form can be directly transformed into the RBM form proposed to represent ground states of 1D translationally invariant systems [6] for any finite $N_h$, but we generalize it to an infinitely-many-hidden-node regime ($N_h \to \infty$). Since the parameters for different hidden nodes can be generated by the action of a translational-symmetry transformation operator on those for a single hidden node, we only need to focus on one representative hidden node for each level. We therefore propose an importance measure $\eta(j, \tilde{k}, L)$ for a set of edges and present it as a function of the spin-site index $j$ and the level index $\tilde{k}$. Its 3D structure reflects the decay of both $\lambda(\tilde{k})$ and $\mu(r)$, while the center of the "orbital" at every level is localized around $j = (L+1)/2$. A plot of the peak at every level as a function of the level index $\tilde{k}$ thus reflects the decay of $\lambda(\tilde{k})$. One example of such an LRFD RBM with a power-law decaying $\lambda(\tilde{k})$ is shown in Fig. 1(b).
We show the two types of truncation errors $\varepsilon(L, N_h)$ as a function of $N_h$ with fixed $L$ for 1D SPT cluster states with a perturbation part (Fig. 2(a) and 2(b)). This means that the RBM is constructed as a sum of the setting defined in the system of equations (3) and a perturbation part specified by Eqs. (24)-(26). The numerical results for $\lambda(\tilde{k})$ with exponential and power-law decays are given. As described above, the 1D SPT cluster states can be exactly represented by a short-range (1-range) RBM [30]. Using our description, its RBM representation has just one level, and the corresponding $\lambda(\tilde{k})$ and $\mu(r)$ quickly go down to zero for $\tilde{k} > 1$ and $r > 1$. The addition of the perturbation part makes the composite RBM an LRFD RBM, so that we can study the truncation errors. We give the results for both types of truncation errors and let $\hat{B}$ be the operator of spin correlations between spins 1 and 2 in the $z$ and $x$ directions.
Our numerical experiments on the scaling of the truncation errors in N h with fixed L are well upper bounded by our estimations given in inequalities ( 15) and ( 16), which substantiates our theoretical analysis.Those experiments also indicate that our estimations in Eq. ( 23) correctly capture the asymptotic properties of ε(L, N h ) with varying N h .Moreover, the fact that the curve of exact ε(L, N h ) and that of our estimation associated with B = σx 2 have exactly the same slope implies that our estimation in Eq. ( 23) gives an asymptotically optimal upper bound.It means that, for the second-type truncation errors (inequality ( 16)), there is still room to improve the constant prefactors in our estimation, but we cannot qualitatively further improve the upper bound.In comparison, there is room to both qualitatively improve the upper bound and improve the constant prefactors for the first-type truncation errors (inequality ( 15)).

C. Scaling of complexity
We can investigate the scaling of spatial complexity in system size for LRFD RBMs, as the results in Sec. II A and Sec. II B still hold for varying $L$. We give an upper-bound estimation of the complexity of RBM representations which depends on the asymptotic behavior at $x = \infty$ of the functions $P(x)$ (Eq. (21)) and $Q(x)$ (Eq. (22)), and thus is determined by the decaying rates specified by $\lambda(\tilde{k})$ and $\mu(r)$.
Define $N^*_h(L, \varepsilon_0)$ as the minimum $N_h$ needed to achieve a sufficiently small approximation error $\varepsilon_0$. Using Lemma 3, a sufficient condition for $\varepsilon(L, N_h) \le \varepsilon_0$ is that the corresponding upper bound on truncation errors is no larger than $\varepsilon_0$. This provides one way to get an upper bound on $N^*_h(L, \varepsilon_0)$ for LRFD RBMs. It can be shown that $N^*_h(L, \varepsilon_0)$ satisfies the upper bound in Eq. (29), where $q(x)$, $p_d(x)$ and $f(x)$ are functions specifying the asymptotic behaviors of $Q(x)$, $P(x)$ and $F(x)$ as defined above, and the superscript "$-1$" denotes the inverse of the corresponding function. This upper-bound estimation is usually asymptotically larger than, and thus not influenced by, $n_\Theta(L)L$.

Rich information can be extracted from Eq. (29). First, the first factor $L$ comes from our assumption that $N_h$ is an integer multiple of the system size $L$, and the second factor $L$ in front of $q(L)$ is extracted using the translational symmetry of the wave function. These two factors reflect the growing system size, and the remaining factors reflect the distinction in complexity between different LRFD RBMs.
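The definition of the minimum number of hidden nodes can be turned into a simple search once an error bound is available. The sketch below is a toy illustration only: the bound `toy_error_bound`, with the form $L^2 (N_h/L)^{-3}$, is a hypothetical stand-in and is not the bound of Lemma 3 or Eq. (29). It restricts $N_h$ to integer multiples of $L$, as assumed throughout, and returns the smallest such $N_h$ meeting the tolerance.

```python
def toy_error_bound(L, n_h):
    """Hypothetical bound eps(L, N_h) = L^2 * (N_h / L)^(-3); not taken from the paper."""
    return L**2 * (n_h / L) ** -3.0

def min_hidden_nodes(L, eps0, error_fn=toy_error_bound, max_levels=10_000):
    """Smallest N_h (an integer multiple of L) with error_fn(L, N_h) <= eps0."""
    for levels in range(1, max_levels + 1):
        n_h = levels * L
        if error_fn(L, n_h) <= eps0:
            return n_h
    raise ValueError("tolerance not reached within max_levels")

# Example: with the toy bound, L = 5 and eps0 = 0.01 need 14 levels.
n_star = min_hidden_nodes(5, 0.01)
```

Because the search proceeds level by level, the result is always a whole number of levels times $L$, so it changes in steps as $L$ or $\varepsilon_0$ varies.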
Second, $P(N_h/L)$ and $Q(L)$ (and thus $\lambda(\tilde{k})$ and $\mu(r)$), which characterize the nonlocal structure of RBMs in our description, have qualitatively different influences on the complexity. Specifically, $Q(L)$ can converge to a finite $L$-independent constant in the thermodynamic limit and does not influence the complexity for sufficiently localized "orbitals", that is, in the cases where $\mu(r)$ decays sufficiently fast. With the upper boundedness condition for $\mu(r)$ (Eq. (11)), $Q(L)$ can contribute an at-most-quadratic factor to this upper bound on $N^*_h(L, \varepsilon_0)$. By contrast, the asymptotic property of $P(x)$ significantly influences the complexity and may lead to the inefficiency of RBM representations if $\lambda(\tilde{k})$ decays sufficiently slowly. That would imply that there are too many high-order correlations between spins to be captured by the RBM, so polynomially many parameters are not enough to fully compress the information into the RBM form. But as long as $p_d^{-1}(x)$ has an at-most-power-law dependence on $x$, this upper-bound estimation implies that the complexity is at most polynomial in both the system size $L$ and $1/\varepsilon_0$ for the above two types of truncation errors. Note also that our estimation only provides an upper bound on the complexity, so a faster-than-polynomial scaling of the bound (such as $S_2^{(7)}$ in Table I) does not necessarily imply the inefficiency of the representation. It is possible that the upper bound is not tight and the real complexity is at most polynomial in this case.
Third, the asymptotic behavior of $F(x)$ at $x = 0$ also influences the scaling of the complexity, and it directly acts on $\varepsilon_0$. We have demonstrated that, for the two types of truncation errors described above, the corresponding $F(x)$'s ($F_1(x)$ and $F_2(x)$) are both analytic at $x = 0$. For general types of truncation errors that can be upper bounded by a function $F(LQ(L)P(N_h/L))$, $1/f^{-1}(\varepsilon_0)$ has a power-law dependence on $1/\varepsilon_0$ as long as $F(x)$ is analytic at $x = 0$, based on the Taylor series expansion of the function.

TABLE I. Complexity estimations for distinct typical settings of $\mu(r)$ and $\lambda(\tilde{k})$. "−" in the $\mu(r)$ column denotes all $\mu(r)$ functions that make $Q(L)$ converge as $L \to \infty$. "−" in the $\lambda(\tilde{k})$ column denotes all $\lambda(\tilde{k})$ functions that make $P(N_h/L)$ have the asymptotic behavior of $O(1/\ln(N_h/L))$ as $N_h \to \infty$. Note that $\delta_P > 1$ and $\alpha_P > 1/2$ in these settings.
This result suggests separate effects of the factors µ(r) and λ( k).The scaling of entanglement entropy, which is an important measure of the complexity of quantum many-body states, is influenced by µ(r), whereas λ( k) significantly influences the spatial complexity of parameterization in LRFD RBM representations.The length of the support of µ(r), which determines the "range" r 0 of RBMs, directly influences the scaling of the entanglement entropy of the states between subregions but does not directly contribute a faster-than-polynomial factor to the parameterization complexity.This result possibly provides further theoretical evidence for the high efficiency of RBMs in representing states with entanglement entropy scaling faster than an area law in system sizes [31].
We apply our complexity estimation to several typical settings of $\mu(r)$ and $\lambda(\tilde{k})$ in Table I. The manifolds $S_2^{(j)}$ with $1 \le j \le 6$ all correspond to a spatial complexity which is at most polynomial in $L$. We also apply this analysis to RBMs constructed as the 1D SPT cluster states with a perturbation part. Our numerical results on the scaling of $N^*_h(L, \varepsilon_0)$ in $L$ with fixed $\varepsilon_0$ (Fig. 2(d)) for small system sizes are consistent with our theoretical analysis summarized in Table I. The piecewise linearity of $N^*_h(L, \varepsilon_0)$ as a function of $L$, with a slope growing very slowly, implies that the scaling is perhaps just slightly faster than linear, consistent with our estimation based on parameter settings. The piecewise linearity is due to our assumption that $N_h$ is an integer multiple of $L$: this applies a ceiling operation to the ratio $N_h/L$, which does not change when $L$ varies within a small range. The inset in Fig. 2(d) shows that $N^U_h(L, \varepsilon_0)$ serve as upper bounds on $N^*_h(L, \varepsilon_0)$, as in our analysis. The $N^U_h(L, \varepsilon_0)$ are obtained by using the exact values of the right-hand side of inequality (15) and its leading-order estimations. These are almost the same and both have a power-law scaling in $L$ as indicated by Eq. (29), supporting the validity of our complexity analysis.

D. Spin-correlation information
In this subsection, we analyze what information about the physical properties of the quantum states can be extracted from the LRFD RBM form using our description of the nonlocal structure. Here, we focus on a small-parameter regime in which all $|a_j|$, $|b_k|$ and $|W_{j,k}|$ are no larger than $\varepsilon_1$, with $\varepsilon_1 \ll 1/L$ and $\varepsilon_1 \ll 1/N_h$. We do not explicitly write the superscript "$(L)$" for RBM parameters and assume that the RBM has only a finite number ($N_h$) of hidden nodes in this subsection.
Based on the proof given in Appendix C, we find that the unnormalized correlation in the z direction between spins separated by a distance r, C^z_unnorm(r), for an LRFD RBM with translational symmetry is given by Eq. (31). Note that this result is the r-related part of the spin correlation, while the real value of the correlation is C^z_unnorm(r) divided by an r-independent normalization factor ⟨Ψ^(L,N_h)|Ψ^(L,N_h)⟩. For RBMs constructed as in Eqs. (24)-(26), with a_0 = 0 and c_w ∈ R for simplicity, the µ(r)-related factor describes the decay rate of spin correlations in the z direction as a function of the distance r, while the r-independent λ(k)-related factors do not influence the decay rate if we only consider the leading-order terms in Eq. (31).

The result in Eq. (31) gives an interpretation of the roles of hidden nodes. The hidden nodes can be viewed as intermediate virtual particles that relate spins (physical particles) at different lattice sites. When an RBM is short-range, the term Re(W W^T)_{1,1+r} vanishes for large enough r, as there is no virtual particle that has nonzero connectivity to both of two spins separated by r. Then, more intermediate hidden nodes are needed to transport such relations, which means that we need to consider higher-order terms. This is additional evidence that long-range RBMs can represent states with strong quantum correlations. It is shown in Appendix C that, even when µ(r) → 0 as r → ∞, we can still construct LRFD RBMs in which the spin correlations in the z direction have a long-range decay lower bounded by Θ(1/r^α_Q) (for µ(r) = Θ(1/r^α_Q) with α_Q > 1), Θ(ln r/r) (for µ(r) = Θ(1/r)), and even Θ(1) (for µ(r) = Θ(1/r^(1/2))). These three kinds of decay rates of spin correlations are demonstrated by numerical computations (Fig. 3). The spin correlation ⟨σ^z_1 σ^z_{1+r}⟩ almost saturates the maximum value of 1 for α_Q = 1/2. In comparison, these spin correlations have different long-range decay rates for α_Q = 1 and α_Q = 2 as r increases.
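The picture of hidden nodes as intermediaries can be checked numerically. The following is a minimal sketch, not the construction of Eqs. (24)-(26): it assumes, purely for illustration, a translationally invariant weight matrix of the form W[j][(k, i)] = c_w · λ(k) · µ(d(i, j)) on a ring, with one hidden node per (level k, site i). All function names and parameter values are hypothetical. It evaluates Re(W W^T)_{1,1+r} for a finite-support µ and for a power-law µ.

```python
def ring_distance(i, j, L):
    """Shortest distance between sites i and j on a ring of L sites."""
    d = abs(i - j) % L
    return min(d, L - d)

def build_weights(L, n_levels, mu, lam, c_w=1.0):
    """Assumed illustrative form: W[j][(k, i)] = c_w * lam(k) * mu(d(i, j)).

    Rows are spins j; columns are hidden nodes, one per (level k, site i).
    """
    W = []
    for j in range(L):
        row = []
        for k in range(1, n_levels + 1):
            for i in range(L):
                row.append(c_w * lam(k) * mu(ring_distance(i, j, L)))
        W.append(row)
    return W

def ww_t_real(W, r):
    """Re[(W W^T)_{1,1+r}] with a plain transpose; spins are 0-indexed here."""
    return sum(a * b for a, b in zip(W[0], W[r])).real

def lam(k):
    return k ** -0.75                      # level-decay factor (illustrative)

def mu_short(d, r0=2):
    return 1.0 if d <= r0 else 0.0         # finite support of range r0

def mu_long(d):
    return 1.0 if d == 0 else 0.5 * d ** -1.5   # power-law tail

L, n_levels = 21, 3
W_short = build_weights(L, n_levels, mu_short, lam)
W_long = build_weights(L, n_levels, mu_long, lam)
short_vals = [ww_t_real(W_short, r) for r in range(1, L // 2 + 1)]
long_vals = [ww_t_real(W_long, r) for r in range(1, L // 2 + 1)]
```

With range r0 = 2, the short-range term is nonzero only while some hidden node connects to both spins, that is for r ≤ 2·r0, whereas the power-law case stays nonzero at every separation.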

III. GROUND-STATE APPLICATIONS
Based on the proposal of the concept of LRFD RBMs and the theoretical analysis of their spatial complexity, it is natural to explore their applications to learning quantum states associated with specific models.
First, in Appendix D, we prove that the state with all spins pointing up in the z direction, which is the ground state of a spin-1/2 system in a single magnetic field along the z direction and has the form of a Kronecker delta function, can be approximated by LRFD RBMs with arbitrary accuracy. We find that the RBM construction for such a target state is not unique, even after fixing the global phase (that is, after eliminating the degree of freedom associated with a global gauge transformation), and we give one construction. Thus, we provide one example of the utility of LRFD RBMs in state representation for arbitrarily large system sizes.

Second, we are particularly interested in the behavior of RBMs in cases where other state representations become less efficient. We numerically study the representation of the ground states of finite-size critical systems, for which the MPS representation becomes less efficient [40,49], even though MPS has achieved notable success in representing quantum many-body states whose entanglement entropy satisfies an area law [32,44].
To accomplish this, we use RBMs with translational symmetry and apply the conventional quantum Monte Carlo algorithm (also a variational method) with stochastic-reconfiguration optimizations [6,50-52] to learn the ground states of two typical quantum models: the 1D transverse-field Ising model (TFIM) and the XXZ model, described by the Hamiltonians in Eqs. (33) and (34), respectively, with periodic boundary conditions. Here, B_x denotes the strength of the transverse field and J_z denotes the strength of the coupling in the z direction. We use RBMs to learn the ground state of the TFIM with B_x = 1, for which the system sits exactly at the phase-transition point between a ferromagnetic and a paramagnetic phase [53], and of the XXZ model with J_z = −0.2, for which the system is in a gapless disordered XY phase [54]. Both systems are critical, with the entanglement entropy of the ground states scaling logarithmically in the system size [53,55,56]. The ground states of these two Hamiltonians (at least for small system sizes) can be learned well by RBMs, as demonstrated by the high accuracy of the spin-correlation calculations given in Appendix E. The importance measures η(j, k, L) for these two RBMs are provided in Figs. 4(a) and 4(b).
The numerical results show that the RBM representations of the ground states of the above two critical systems have forms very similar to LRFD RBMs. The overall 3D structures of the importance measures η(j, k, L) are similar to the one presented in Fig. 1(b), which corresponds to a standard LRFD RBM. The weight parameters for hidden nodes at the same level are quite localized and decay rapidly as the level index k increases and as the spin-site index j moves away from the center. Moreover, it seems that the "ridge" of η(j, k, L) for varying system sizes can be upper bounded by an L-independent power-law decay curve, from which we can extract a corresponding α_P characterizing the rate of level decay for these small-system-size wave functions. If these features still hold as L increases and approaches infinity, these states will form LRFD RBMs which belong to the set S in Table I, and the corresponding λ(k) and µ(r) can be defined.

Moreover, the above results exhibit a feature that is also manifested in the theory of MPS representations. It has been shown [40] that, although MPS becomes less efficient in representing the ground states of critical systems, the bond dimension required to achieve an approximation error ε_0 can still be upper bounded by a function scaling polynomially in the system size L. The exponent in the power-law dependence of the spatial complexity of MPSs on L depends on the central charge c, a quantity that roughly quantifies the "degrees of freedom of the theory" in conformal field theory [44]. A larger c leads to a higher exponent in that estimation, which implies a higher complexity in the MPS representation. The TFIM at the above phase-transition point has c = 1/2 and the XXZ model in the disordered XY phase has c = 1 [57]; correspondingly, our numerical results show a smaller fitted α_P for the XXZ model, which implies that the XXZ model has more intrinsic "complexity" than the TFIM and thus needs more parameters to capture it.
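One way such an exponent could be extracted is a straight-line fit on a log-log scale followed by a rescaling of the prefactor so that the curve bounds all data from above. The sketch below is only an illustration under that assumption; the fitting procedure actually used for Fig. 4 is not specified here, the helper names are hypothetical, and the ridge values are synthetic rather than taken from the numerical results.

```python
import math

def fit_power_law(ks, ridge):
    """Least-squares line through (log k, log ridge).

    Returns (alpha, prefactor) for the model ridge ~ prefactor * k**(-alpha).
    """
    xs = [math.log(k) for k in ks]
    ys = [math.log(v) for v in ridge]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return -slope, math.exp(intercept)

def upper_envelope(ks, curves, alpha):
    """Smallest prefactor c with c * k**(-alpha) >= every curve at every k."""
    return max(v * k ** alpha for curve in curves for k, v in zip(ks, curve))

# Synthetic ridge maxima for two hypothetical system sizes (not real data).
ks = [1, 2, 3, 4, 5, 6]
curves = [[0.8 * k ** -1.5 for k in ks], [0.6 * k ** -1.5 for k in ks]]

alpha, prefactor = fit_power_law(ks, curves[0])
c_bound = upper_envelope(ks, curves, alpha)
```

A smaller fitted `alpha` corresponds to a slower level decay, so more levels must be kept to reach a given accuracy.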

IV. STATE MANIFOLDS AND COMPLEXITY CLASSIFICATION
Rigorously speaking, the numerical results for finite-size systems only provide evidence that the states may be LRFD RBMs but cannot prove it, since the properties of RBMs in the approach to the thermodynamic limit are not yet known. Based on the success of RBMs in numerical simulations and the fact that they can often achieve high accuracy even with a constant number of levels (at least for small system sizes), we conjecture that the ground states of a wide range of quantum systems may be exactly represented by LRFD RBMs, or a variant of them. Here, the term "variant" means a generalization of the forms specified in Conditions 1 and 2 that includes additional factors which can be naturally incorporated into our complexity analysis. For example, the functional forms λ(k) and µ(r), which are L-independent in our definition of LRFD RBMs, can be generalized to λ(k, L) and µ(r, L), respectively, and their effects can be readily evaluated using our paradigm for complexity analysis.
We summarize the relations between multiple typical state manifolds so that the significance of proposing the concept of LRFD RBMs can be better understood. A state manifold usually refers to a subspace of the whole Hilbert space spanned by a parameterized wave-function family [45]; it is thus a set containing a specific scope of quantum states. The manifolds S_1, S_2, S_2^(j) (for 1 ≤ j ≤ 6, j ∈ N), S_3 and S_4 are defined to be the spaces spanned by quantum states represented by RBMs satisfying the corresponding conditions given in Fig. 5, while S_5 is defined to be the manifold spanned by all ground states of 1D quantum many-body spin systems.
The definitions of these manifolds directly imply that S_1 ⊂ S_2^(j) ⊂ S_2 (for 1 ≤ j ≤ 6). Our complexity analysis for LRFD RBMs (Sec. II C) gives the result that S_2^(j) ⊆ (S_2 ∩ S_3). Previous research shows that a set of problems where RBMs appear to be powerful is related to topological states, among which the 1D SPT cluster states belong to S_5 ∩ S_1 [30]. The Laughlin wave functions, which have the structure of Jastrow wave functions and are associated with chiral topological order, can be exactly represented by RBMs in S_3 with a quadratic scaling of N_h in L, but their approximations with RBMs of a long-range form and lower complexity are often used [8]. S_4 contains all other sets mentioned in Fig. 5, as RBMs without restriction on the number of hidden nodes are universal approximators for discrete distributions [43]. Numerical results seem to support that a "large fraction" of S_5 is contained in its intersection with S_2. We argue that the concept of S_2 may help clarify which fraction of S_5 falls into its intersection with S_3, thus also promoting the understanding of the complexity of quantum many-body states.

FIG. 5. Relations between multiple typical state manifolds. S_1: short-range RBMs; S_2: LRFD RBMs; S_2^(j) (for 1 ≤ j ≤ 6, j ∈ N): LRFD RBMs with distinct parameter conditions, specified in Table I; S_3: RBMs with spatial complexities scaling at most polynomially in system sizes; S_4: RBMs with a faster-than-polynomial scaling of spatial complexities in system sizes, corresponding to inefficiency of representation; S_5: ground states of 1D quantum spin systems. The dashed boundary of S_5 means that its relations with other manifolds have not been fully determined.
It is remarkable that our paradigm for complexity analysis and our characterization of the nonlocal structures of RBMs for 1D quantum spin systems can be generalized to higher-dimensional systems, e.g., lattices. This is done by generalizing the description of single-level "orbitals" from µ(r) to its vector-argument counterpart µ(r⃗) while keeping λ(k) as a level-decay factor. For deep NN quantum states, we can still view each single hidden layer as a combination of multiple levels which capture correlations of different orders. We can calculate the truncation errors for each hidden layer associated with specific nodal functions and analyze the propagation of errors through layers.

V. SUMMARY
In this work, we define a subset of generic RBM quantum states, the long-range-fast-decay (LRFD) RBM states. Using the language of levels, the nonlocal structure of LRFD RBMs is described with two functions: one, µ(r), captures the localization of the spatial distribution of the wave function for each single level and encodes information about spin correlations; the other, λ(k), is a level-decay factor capturing correlations of different orders and significantly influencing the complexity of the RBMs. We derive upper bounds on truncation errors, which allow us to analyze the scaling of the spatial complexity in system sizes and approximation errors for LRFD RBMs. We provide numerical results supporting that the ground states of a wide range of 1D quantum spin systems, including some critical systems, may be approximated by LRFD RBMs with an at-most-polynomial complexity. Finally, we describe the relationships between state manifolds of different computational complexity and identify hierarchies of RBM-efficient approximation.
Generalizing the RBM wave-function ansatz to an infinitely-many-hidden-node regime and proposing the concept of LRFD RBMs does not imply the use of an infinitely-large neural network for state representations.These serve to define the completeness of a set of variational states and serve as a tool for complexity analysis based on the good extensibility and analyzability of LRFD-RBM forms.This concept may promote general understanding of the intrinsic complexity of quantum many-body states.

VI. ACKNOWLEDGMENTS
We know that there exist n_Θ(L) > n_I(L) > k_s such that, for all N_h > n_Θ(L)L, [...] We can get [...] For simplicity, we have assumed λ_R(k) = λ_I(k) = λ(k), which implies that the real and imaginary parts of the RBM parameters have the same decaying rate, and [...] where F_1(x) is defined in Eq. (17). The idea of the proof is shown schematically in Fig. 2(c), where ψ_full(σ) denotes the amplitude for the full LRFD RBM ψ^(L,∞)(σ) and ψ_tr(σ) denotes the amplitude for the truncated RBM ψ^(L,N_h)(σ).
We give the proof for the second-type truncation errors as follows.
We have defined a Hermitian operator B of the form B = ⊗_{j=1}^{L} σ_j^(m_j), where ⊗ is the tensor product symbol, [...]

Proof. [...] where [...] and [...] capture the contribution of the deviations in the normalization factor and the unnormalized expectation value to the approximation error, respectively, with [...] where B(σ_2) is the only spin configuration σ_1 that makes B_{σ_1 σ_2} ≠ 0 for a specific σ_2. Using the Cauchy-Schwarz inequality, [...] Therefore, combined with the geometric features, we can get [...] where F_2(x) is defined in Eq. (19).

Since all b_k and W_{j,k} parameters for these RBMs are real positive numbers, the wave-function amplitude for spin configurations reaches its maximum at σ_0 = (1, 1, ..., 1), which corresponds to all spins up. We can get that, for any other spin configuration σ, [...] where σ_1 = (1, 1, ..., 1, −1) refers to the spin configuration obtained by flipping the last spin in σ_0. Let x = (L − 1)µ_0 λ(k) and Δx = 2µ_0 λ(k). By performing the Taylor series expansion of cosh(x + Δx) about x and comparing the leading-order terms, we can get that there exist constants 0 < β_3 < 1 and x_0 > 0 such that, for any 0 < Δx < x < x_0,

cosh(x + Δx)/cosh(x) ≥ 1 + tanh(x) Δx ≥ e^(β_3 x Δx). (D6)

For any fixed L and µ_0, there exists k_0 ∈ N such that (L − 1)µ_0 λ(k) < x_0 for all k > k_0. So it can be proven [...] The above lower bound contains infinitely many terms, each of which is upper bounded by an expression associated with x_0, and it can reach arbitrarily high values by decreasing the decaying rate of λ(k). Therefore, the long-range nature of these RBMs allows the shape of ψ^(L,∞)(σ) in the spin-configuration space to approach the Kronecker delta function as λ(k) decays more and more slowly. Our numerical results in Fig. 6 with λ(k) = δ_P^(−(k−1)) imply that the distribution of the square of the normalized wave-function amplitudes in the spin-configuration space can approximate the Kronecker delta function with increasing accuracy as 1/δ_P grows, and that the ratio |ψ^(L,∞)(σ_0)/ψ^(L,∞)(σ_1)|^2, as a measure of the approximation accuracy, can reach arbitrarily high values as 1/δ_P approaches 1, thus supporting our argument.
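The mechanism behind this argument can be illustrated with a short numerical sketch. It is a simplification under stated assumptions, not the construction of Appendix D: each level k is taken to contribute a single factor cosh((L + 1)µ_0 λ(k)) / cosh((L − 1)µ_0 λ(k)) to the amplitude ratio between σ_0 and σ_1, consistent with x = (L − 1)µ_0 λ(k) and a single-spin-flip increment of 2µ_0 λ(k) (written `dx` in the code), and the infinite product is truncated at `n_levels`. The function names and parameter values are hypothetical.

```python
import math

def log_cosh(y):
    """Numerically stable log(cosh(y))."""
    y = abs(y)
    return y + math.log1p(math.exp(-2.0 * y)) - math.log(2.0)

def check_cosh_bound(x, dx):
    """True if cosh(x + dx) / cosh(x) >= 1 + tanh(x) * dx."""
    lhs = math.exp(log_cosh(x + dx) - log_cosh(x))
    return lhs >= 1.0 + math.tanh(x) * dx

def log_sq_ratio(L, mu0, delta_p, n_levels):
    """log of |psi(sigma_0) / psi(sigma_1)|^2 under the assumptions above,
    with lam(k) = delta_p ** (-(k - 1)) and the product cut at n_levels."""
    total = 0.0
    for k in range(1, n_levels + 1):
        lam = delta_p ** (-(k - 1))
        x = (L - 1) * mu0 * lam      # argument for the flipped configuration
        dx = 2.0 * mu0 * lam         # increment gained by restoring the spin
        total += log_cosh(x + dx) - log_cosh(x)
    return 2.0 * total

bound_holds = all(check_cosh_bound(x, dx)
                  for x in (0.1, 0.5, 1.0) for dx in (0.05, 0.2))
ratios = [log_sq_ratio(L=8, mu0=0.1, delta_p=d, n_levels=200)
          for d in (2.0, 1.5, 1.1)]
```

As `delta_p` approaches 1 from above, more levels contribute appreciably and the log-ratio increases, which mirrors the sharpening toward a Kronecker delta described in the text.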

Appendix E: Error curves
In this section, we provide the approximation errors as a function of the number of hidden nodes for truncated LRFD RBMs and the optimal RBMs (Fig. 7(a)), as well as the truncation errors for the calculations of spin correlations in the x and z directions as a function of the number of levels kept in the truncated LRFD RBMs (Fig. 7(b)). In Fig. 7(a), ε(L_0, N_h) for LRFD RBMs denotes the second-type truncation errors, with B generalized to the Hamiltonian of the quantum system in the ground-state learning, which usually has the form of a linear combination of polynomially many original B-type operators. By definition, the approximation accuracy of the optimal RBM with at most N_h hidden nodes is at least as good as that of an RBM obtained as a finite truncation of an LRFD RBM keeping N_h hidden nodes. So N^U_h(L_0, ε_0) in our complexity analysis also provides an upper bound on N^opt_h(L_0, ε_0), defined as the minimum number of hidden nodes needed to achieve a specific approximation error ε_0 for any kind of RBM (not limited to LRFD RBMs), which is of great importance for pre-training computational-resource estimations in ML tasks. The truncation error for LRFD RBMs converges to 0 as N_h approaches infinity, although possibly not monotonically. Fig. 7(b) shows that the decay curves of the second-type truncation errors for the ground-state learning of the XXZ model, with the same parameter setting as in Fig. 4(b), are consistent with our analysis: they can be upper bounded by power-law decay curves and reach high accuracy as the number of preserved levels increases.

FIG. 1. (a) Network structure of RBMs as a wave-function ansatz for 1D quantum spin systems. A long-range RBM usually implies full connectivity between the visible and hidden layers. The hidden layer is divided into N_h/L levels, each containing L hidden nodes. (b) Importance measure η(j, k, L) for an LRFD RBM with translational symmetry. The RBM is constructed as in Eqs. (24)-(26), where λ(k) = k^(−α_P), µ(r) = (1/2) δ_Q r^(−α_Q) for r ≠ 0, µ(0) = δ_Q = 0.5, α_P = 0.75, α_Q = 1.5, c_w = 1 + i, c_b = 0, a_0 = 0 and L = 11. The inset plots the decay of the maximum of η(j, k, L) among all j at each level with increasing k on a log-log scale. The linearity of the curve reveals a power-law decay of the "ridge" (red circles) of the 3D structure.

FIG. 4. Importance measure η(j, k, L) for the RBMs approximating ground states of two critical systems with L = 15. (a) TFIM with B_x = 1. (b) XXZ model with J_z = −0.2. The insets in each subfigure show the decay of the maximum importance measure at each level as the level index k increases, on a log-log scale, for system sizes L = 9, 11, 13 and 15. The purple dashed curve indicates that these decay curves can be upper bounded by a power-law decay. By numerical fitting, the corresponding α_P values for the dashed lines in the insets of (a) and (b) are 2.957 and 1.232, respectively.

FIG. 7. (a) Approximation errors as a function of the number of hidden nodes for truncated LRFD RBMs and the optimal RBMs.(b) Approximation errors for the calculations of spin correlations in the x and z directions as a function of the number of levels kept in the truncated LRFD RBMs for the XXZ model compared with results from exact-diagonalization methods.The figure is plotted on a log-log scale.