Variational classical networks for dynamics in interacting quantum matter

Dynamics in correlated quantum matter is a hard problem, as its exact solution generally involves a computational effort that grows exponentially with the number of constituents. While a remarkable progress has been witnessed in recent years for one-dimensional systems, much less has been achieved for interacting quantum models in higher dimensions, since they incorporate an additional layer of complexity. In this work, we employ a variational method that allows for an efficient and controlled computation of the dynamics of quantum many-body systems in one and higher dimensions. The approach presented here introduces a variational class of wavefunctions based on complex networks of classical spins akin to artificial neural networks, which can be constructed in a controlled fashion. We provide a detailed prescription for such constructions and illustrate their performance by studying quantum quenches in one- and two-dimensional models. In particular, we investigate the nonequilibrium dynamics of a genuinely interacting two-dimensional lattice gauge theory, the quantum link model, for which we have recently shown -- employing the technique discussed thoroughly in this paper -- that it features disorder-free localization dynamics [P. Karpov, et al., arXiv preprint, arXiv:2003.04901 (2020)]. The present work not only supplies a framework to address purely theoretical questions but also could be used to provide a theoretical description of experiments in quantum simulators, which have recently seen an increased effort targeting two-dimensional geometries. Importantly, our method can be applied to any quantum many-body system with a well-defined classical limit.


I. INTRODUCTION
One of the main challenges in quantum many-body dynamics is that, unless the model under study is exactly solvable, the numerical overhead required to find an exact solution grows, in general, exponentially with the number of degrees of freedom. In the last decades the development of powerful computational techniques has nevertheless seen impressive progress, largely motivated by the experimental advances in realizing and controlling isolated quantum systems far away from equilibrium [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16] . The majority of the advances have been achieved for one-dimensional (1D) systems, for which there exist now a set of reliable methods that can simulate efficiently the dynamics of lattice models. The primary example of such set of techniques are tensor network algorithms such as the time-dependent density matrix renormalization group (or its variants TEBD and tMPS) [17][18][19][20][21][22][23][24] , which uses a matrix product state [25][26][27] representation of the wavefunction and solves the dynamics, for instance, via a Trotter decomposition of the evolution operator. Yet, this approach is generally restricted due to a rapid growth of entanglement, and a substantial increment of its computational complexity in higher dimensions. A recent alternative consists of encoding quantum states in various types of networks of classical degrees of freedom, such as artificial neural networks (ANNs) [28][29][30] , and perturbative classical networks (pCNs) 31 . On the other hand, the description of quantum dynamics in higher dimensions is facing severe limitations. In spite of a few very recent efforts in two dimensions (2D) using tensor net-works [32][33][34][35][36][37][38][39][40][41] , artificial neural networks 30,42,43 , or numerical linked cluster expansion 44 , solving the quantum dynamics of 2D (and higher-dimensional) interacting systems remains one of the central challenges in computational quantum physics.
In this work, we introduce a numerical framework that allows for an efficient solution of the dynamics of quantum lattice models in one and higher dimensions. In our method, the many-body wavefunction is represented as a complex network of classical spins akin to ANNs, with couplings among the spins that are taken as variational parameters, and which are then optimized via a time-dependent variational principle (TDVP) 28,45 , with the key advantage that the classical networks do not face numerical instabilities as they have been observed for ANNs 30 . The architecture of these variational classical networks (vCNs) can be derived systematically, in a similar fashion as in the case of their relative pCNs 31 . A crucial property of vCNs is that they inherit the controlled character that arises from the perturbative nature of pCNs. In addition, the optimization step introduced with the TDVP allows us to mitigate several inherent drawbacks of pCNs, such as being forcibly limited to weak quantum fluctuations and short timescales. Furthermore, there exist situations, as discussed subsequently, where vCNs may bear a reduced computational complexity compared to similar state-of-the-art techniques, while still yielding sufficiently accurate results.
We employ our technique to study the dynamics of two quantum spin systems: First, we consider various quenches in the paradigmatic 1D transverse field Ising model (TFIM), which is exactly solvable, thereby serving as testing ground for our method. Next, we tackle the more challenging case of a genuinely interacting 2D lattice gauge theory, namely, the quantum link model (QLM). As shown in our recent paper 46 , this system features disorder-free localization 47-51 , a mechanism for ergodicity breaking in homogeneous systems due to local constraints imposed by gauge invariance. We find that the methodology presented here is particularly well suited for describing the dynamics of the 2D QLM in the nonergodic regime, allowing us to probe timescales and system sizes that could hardly be accessed with any other state-of-the-art computational scheme. The outline of the remainder of this paper is as follows: In Sec. II we state the general problem and settings for the construction and subsequent application of our method. Further, we show how to derive, in general, the structure of vCNs (Sec. II B), and recall the way in which the TDVP operates (Sec. II C). Next, using our methodology we simulate quantum quenches in the paradigmatic 1D TFIM in Sec. III, and in the genuinely interacting 2D QLM in Sec. IV. Some concluding remarks, including possible applications to recent experiments in quantum simulators, are discussed in the last section.

II. THE METHOD
In this section, we explain the basic idea of our method. It contains two crucial ingredients: (i ) an efficient compression of the wavefunction in terms of networks of classical degrees of freedom, and (ii ) a suitable procedure to optimize such networks so as to get an accurate description of the encoded time-dependent quantum state.

A. General problem and generative machines
Throughout this work, we shall consider systems of N spin-1/2 degrees of freedom. We work in the computational basis of spin configurations s = (s 1 , s 2 , . . . , s N ), with s i =↑, ↓. In this representation, the state vector can be expanded as where the amplitudes ψ(s, t) ≡ s|ψ(t) , contain the full information about the system, from which, in principle, all physical quantities can be computed. However, there is a fundamental computational limitation associated with the expansion (1), as it involves exponentially many terms. A possible way to overcome this difficulty consists of using a generative machine that approximates the wavefunction on the fly, rather than storing all the individual coefficients, see Fig. 1. This approach received increased attention recently, when general-purpose ANNs were proposed as generative models 28 . Here, we provide an alternative class of generative machines akin to ANNs, which can be constructed according to a controlled prescription as explained in the following. For the time being, however, let us consider a generic generative machine ψ η (s), such that, where η refers to a set of complex parameters that carry the time dependence, i.e., η ≡ η(t) = (η 1 (t), η 2 (t), . . . , η K (t)) ∈ C K , and which are optimized variationally.
A representation of the wavefunction in terms of such generative machine renders the problem of computing physical quantities tractable. In effect, the expectation value of an observable O with matrix elements s|O|s = O s,s , can be written as where O η (s) = {s } O s,s ψ η (s )/ψ η (s). Typical local observables are such that s|O|s is sparse. Consequently, getting O η (s) requires only a polynomial computational overhead. Hence, assuming a normalized wave function, the expectation value (3) can be calculated efficiently via a Monte Carlo sampling of the distribution |ψ η (s)| 2 . Note that a compression of the wavefunction such as Eq. (2), will be efficient as long as the overall number K of parameters is significantly less than the dimension of the Hilbert space.
Let us also note that, in general, a generative machine refers to a model that can generate samples according to some target distribution, which in this case is |ψ(s, t)| 2 . Remarkably, generative machines such as that in Eq. (2), which we shall regard in the following, not only achieve the task mentioned above but also give direct access to the complex amplitudes ψ(s, t).

B. Variational Classical Networks
As mentioned before, this work aims to introduce a class of adequate generative machines to represent the many-body wavefunction. In the following, we give a controlled prescription to construct such generative models.

General settings
Let us consider a Hamiltonian of the form where H 0 represents a classical system in the sense that, it is diagonal in the computational basis: and the off-diagonal perturbation γV , with γ playing the role of a small parameter, accounts for quantum fluctuations.
We are interested in the nonequilibrium dynamics generated by the Hamiltonian (4). This can be obtained by solving the time-dependent Schrödinger equation, which admits the formal solution (in units such that = 1): where |ψ 0 denotes the initial state. In general, it is challenging to determine the action of the evolution operator e −iHt , onto the basis vectors. However, whenever the Hamiltonian H can be split as in Eq. (4), it is possible to carry out a perturbative treatment by working in the interaction picture, in which the evolution operator can be written as where where T is the time-ordering operator, and with V (t) satisfying the equation of motion: Within these settings, the many-body wavefunction amplitudes are given by The task now is to calculate the right-hand side Eq. (10). Classical networks provide a possible solution, as detailed below.

Cumulant expansion and pCNs
The right-hand side of Eq. (10) can be computed in a controlled way by means of a cumulant expansion 52 , namely where · c denotes the cumulant average. For example, for the lowest order corrections, we have This expansion allows us to write down the wavefunction as with H eff (s, t) defined by Eqs. (10) and (11). Representations of the wavefunction in the form of a Boltzmann-like factor are quite adequate to compute physical observables via a Monte Carlo procedure 31,[53][54][55] . In order to gain some insight about the physical content of the function H eff (s, t), let us restrict for a moment to a simple initial product state, namely, an equally weighted superposition of the spin configurations: This initial state is particularly convenient as ψ 0 (s) = 2 −N/2 for all s, and hence ψ 0 (s) drops out in all the cumulant averages. In this scenario, and upon performing the integrals in the cumulant expansion (11), the function H eff adopts, in general, the following form That is, H eff can be regarded as the effective Hamiltonian of a classical spin system with complex couplings C l (t), and with spin interactions given by the functions Φ l (s), which are local provided that the quantum Hamiltonian is local, too. Situations where quenches from the initial state (15) are of physical interest are discussed in posterior sections. In some cases, it is possible to recast the systems defined by Eq. (16), as conventional classical statistical mechanical models. For example, the effective model corresponding to the 1D TFIM, discussed later in detail (see Sec. III), contains, up to first-order in the cumulant expansion, the following terms 31 which define a 1D classical Ising model with nearest and next-to-nearest neighbor interactions.
On the other hand, the system (16) can also be visualized as a network of classical spins with connectivity specified by the functions Φ l (s). Hence, the models defined by an effective Hamiltonian as in Eq. (16) are called perturbative classical networks (pCNs) 31 . In Fig. 2, we display the network representation, up to second-order, of the classical spin model that emerges when considering a translationally invariant 1D TFIM.
Let us emphasize that, our approach works as well for initial states other than the equally weighted superposition (15). Importantly, this is true not only for translationally invariant initial states but also for nonuniform ones, as will be shown later. To consider a more general case, let us assume that the perturbation consists of a sum of local terms, namely, V = α v α , where the overall number of terms is polynomial in system size. Thus, we can write an equation of motion for each of the individual terms: The commutator [H 0 , v α ] measures, essentially, the energy difference between two eigenstates of H 0 in a transition induced by v α , i.e., v α |s 1 = |s 2 . Indeed, one can readily prove that Therefore, Eq. (18) can be rewritten as with Ω α being a diagonal operator in the computational basis, which measures the energy difference in a transition induced by v α . This equation admits the formal solution v α (t) = e iΩαt v α . Thus, if the effective Hamiltonian is written as H eff = ∞ n=0 H (n) , where the zeroth-order term is H (0) (s, t) = −iE s t + ln(ψ 0 (s)), and the subsequent orders are defined by Eq. (11), one can write, for example, the first-order correction as and likewise for higher order terms. Note that the matrix s|v α |s is typically sparse for physical systems with fewbody couplings.
Let us remark that the cumulant expansion (11) goes beyond conventional time-dependent perturbation theory, since the corrections considered here effectively account for a resummation of several terms that appear in a standard perturbative expansion 31 . However, pCNs face their own limitations, too. In particular, they are inherently restricted to weak quantum fluctuations (small γ) to ensure that we can safely truncate the expansion (11). Besides, the description of the evolution of observables will eventually break down, since resonant processes may be present, giving rise to secular terms that limit a correct description to timescales of order O(1/γ) 31 . Nonetheless, one can still benefit from the framework introduced here, while mitigating the drawbacks mentioned before. This is achieved by constructing adequate variational wavefunctions with a network architecture that is inherited from a corresponding pCN, as argued in the following.

Variational ansatz
The main idea of this work was already outlined in the previous paragraph: building upon the structure of an underlying pCN, one can construct variational classical networks (vCNs) that are then used as generative machines to compute the dynamics. Importantly, the resulting vCNs will inherit the controlled character of the cumulant expansion, in that, the accuracy of the approximation can be improved systematically by reducing the value of γ, or by taking into account higher-order cumulants.
First, let us consider classical networks of the form given in Eq. (16). The corresponding vCN can be obtained simply by letting the couplings C l be variational parameters, that is, where η(t) denotes a set of complex variational parameters.
In the more general case such as Eq. (21), one can build the corresponding vCN by noting that Ω α (s) take a finite number of discrete values for any s. Thus one can simply introduce a variational parameter for each value of Ω α (s). To fix ideas let us consider the first-order correction given in Eq. (21), and let us denote as Λ Ωα the set of all possible values of Ω α (s). After rewriting the integral we can introduce a set of variational parameters so that the corresponding first-order variational effective Hamiltonian reads and likewise for higher-order terms. In either case, the concomitant wavefunction amplitudes take the form ψ vCN (s; η(t)) = e HvCN(s;η(t)) .
Let us point out that, the more higher-order corrections are included in the architecture of a classical network, the more quantum correlations can, in principle, be taken into account. Thus, the cumulant expansion (11) provides us with a controlled procedure to generate generative machines of the form given in Eq. (25), which allows for a systematic addition of terms that can potentially encode more and more non-local quantum correlations. Before turning to the applications, let us discuss the variational procedure that is used to optimize the resulting vCNs.

C. Time-dependent variational principle
The TDVP 28,45,53,54 is a way of optimizing a timedependent variational ansatz ψ η (s), where η denotes a set of complex time-dependent variational parameters, i.e., η(t) = (η 1 (t), η 2 (t), . . . , η K (t)). Such trial wavefunction could be, e.g., a Jastrow ansatz 56,57 , an ANN 28 , or, as presented in this work, a vCN. In essence, the TDVP is a procedure that establishes an equivalence between the time-dependent Schrödinger equation and a system of first order differential equations that govern the dynamics of the variational parameters, namely, where the over-dot denotes differentiation with respect to time and with the following definitions: which is the so-called covariance matrix, and These quantities are expressed in terms of the local energy E loc (s) := s|H|ψη s|ψη , and the variational derivatives, In order to quantify the accuracy of this TDVP, let us introduce the Fubini-Study metric D, which measures the distance between the exact evolution during a small time interval δt: e −iδtH |ψ η , and the variational evolution |ψ η+δη . Its definition is the following Thus, one can define a relative residual error as 28,30 which can be measured, too, by performing a Monte Carlo sampling of |ψ η (s)| 2 . In practice, we use a secondorder expansion of (29) to compute Eq. (30). Moreover, in the following, we shall consider the integrated residual error R 2 (t) := t 0 dt r 2 (t ). Note that Eq. (26) can be derived by minimizing the numerator in Eq. (30) with respect toη * 28,30 .

A. Model
As a first illustration of our method, we study several quantum quenches in the archetypal 1D TFIM, whose Hamiltonian for N spins on a ring reads where σ µ i (µ = x, y, z) are the Pauli matrices at site i, J > 0 is the exchange constant that sets the overall energy scale, and h is a transverse magnetic field.
Let us recall some features of the model in Eq. (31). First of all, the 1D TFIM is integrable by means of a Jordan-Wigner transformation 58 ; hence, comparison with analytical solutions is at our disposal. Moreover, this model features both equilibrium 59 and dynamical 60 quantum phase transitions. Indeed, the Hamiltonian (31) undergoes an equilibrium quantum phase transition at h c /J = 1 59 , where the critical point separates a ferromagnetic phase (h < h c ) from a paramagnetic one (h > h c ). Its dynamical quantum phase transition (DQPT) is signaled by non-analyticities in the manybody dynamics 60,61 , and occurs when quenching across the underlying equilibrium quantum phase transition. In this respect, it is interesting to study quenches that cross the critical point. Below, we shall consider quenches from the paramagnetic point h 0 = ∞, which corresponds to the initial state in Eq. (15), to both the ferromagnetic and the paramagnetic phases (see details below). Note that the first type of quench is precisely of interest for the study of DQPTs. From an experimental viewpoint, both probing the dynamics of this model and engineering the relevant initial state in Eq. (15), are now feasible tasks with current technologies in quantum simulators in various settings 4,9,10 .
Before discussing the details of the quench dynamics simulations and the corresponding results, let us first construct the vCNs associated to the 1D TFIM.

B. vCNs for the 1D TFIM
The corresponding pCNs for TFIMs have been recently derived elsewhere 31 . Here, we review the main steps of such calculations. First of all, according to the general settings established in Sec. II B, we take the Ising interaction term as reference Hamiltonian, i.e., , and the transverse field as the perturbation, namely, Using the basic commutation relations of the Pauli matrices, one can readily show that where we emphasize again that this commutator measures the change in energy in a transition induced by σ x j , between eigenstates of H TFIM 0 . The solution to the equation of motion Eq. (20) for σ x j (t) therefore reads where the second step follows from Euler's formula. This solution leads to a first-order pCN of the form anticipated in Eq. (17). Indeed, plugging the solution (33) in Eq. (21) and using the fact that ψ 0 (s) = 2 −N/2 for all s, for the initial state in Eq. (15), one gets where the explicit form of the coefficients C (1) l can be easily deduced from Eqs. (33) and (21). As previously explained, the classical network defined above can be turned into a vCN, simply by regarding the couplings C  Jt The perturbatively motivated structure of the vCN can be systematically expanded by straightforwardly plugging Eq. (33) into higher order terms in Eq. (11). By potentiating j σ x j (t), more and more non-local couplings are generated. In fact, at order k couplings up to distance k + 1 are generated (see Ref. 31 for details). Hence, we can systematically increase the vCN by adding all distinct classical coupling terms up to a given distance d, which are compatible with the system's symmetries. For example, when considering second-order corrections, the terms that respect Z 2 and lattice symmetries, and which expand up to a distance d = 3, would be added to the effective Hamiltonian, H TFIM = H Jt

C. Quench protocol and results
As mentioned before, quenches in the 1D TFIM across the critical point comprise a DQPT. In that respect, an interesting class of quenches consists of going from the paramagnetic to the ferromagnetic phase. Here, we concentrate precisely on the aforementioned situation as well as on quenches within the paramagnetic phase. In particular, we consider the initial state |→ given in Eq. (15), which corresponds to the point h 0 = ∞. Next, we compute the unitary dynamics generated by the Hamiltonian in Eq. (31) with h/J < 1 (ferromagnetic) and h/J > 1 (paramagnetic).
We compare results for the dynamics of the TFIM obtained in three different ways: exact, pCN, and vCN. For the exact solution we exploit the integrability of the model. Via a Jordan-Wigner transformation the spin system is mapped to a model of non-interacting fermions 58 , for which closed form expressions can be obtained for all quantities of interest 59 . The results shown are for a translationally invariant chain in the thermodynamic limit. With pCN and vCN, we consider systems with N = 50 sites and periodic boundary conditions. On the timescales shown there is no finite size effect in the observables. To obtain the time-evolved vCN we initialize all network couplings with zero and integrate the TDVP equation using a second order consistent integrator with adaptive time step. Expectation values with respect to |ψ(s)| 2 are estimated using 8 × 10 4 samples generated by a single-spin-flip Markov Chain Monte Carlo.
The results of the quench dynamics are shown in Figs. 3-6. First, in Fig. 3, we compare the performance of the first-order pCN given in Eq. (34) and its associated vCN, in a quench to h/J = 0.1. As illustrated for the dynamics of the transverse magnetization σ x l , both approaches capture very accurately the short-time behavior. However, it is the variational ansatz that yields a much more accurate description at longer times. Interestingly, when looking at the evolution of the perturbative and variational couplings Fig. 3(b), we can see that their dynamics start to differ approximately at the point where discrepancies in the evolution of observables are first noted.
In Fig. 4, we study the overall performance of various vCNs with different coupling distance d, when quenching to the ferromagnetic phase (h/J = 0.3) and the paramagnetic one (h/J = 3), left and right columns in Fig. 4, respectively. As a principal result, we observe that the accuracy is systematically improved upon increasing the coupling distance of the vCNs. This is not only observed from the real-time evolution of the transverse magneti- Time Jt zation σ x l , and the next-to-nearest neighbor correlation function σ z l σ z l+2 , but also from the integrated residuals R 2 (t), which show a systematic error convergence by increasing d. Next, focusing on the transverse magnetization σ x l , we see that, in both quenches, it quickly relaxes to a steady-state value: While at weak transverse field this feature can be well captured by all the considered vCNs, the situation becomes more challenging when the value of h/J is large. Nevertheless, even in the latter case, the dynamics computed with the vCNs with the largest coupling distances regarded here (d = 9, 11), follow very closely the actual relaxation of σ x l . As for the next-to-nearest neighbor correlation function σ z l σ z l+2 , it is found that correlations at this distance are rather small in the quench to the ferromagnetic phase, whereas they are larger and oscillate between positive and negative values before decaying to zero in the quench within the paramagnetic phase. In both cases, however, it is again the highest-order vCNs that yield a better description of this correlation function, as expected. In Fig. 4 results corresponding to a different system size (N = 25), are also shown for comparison in the quench to h/J = 0.3, with d = 11. There is no appreciable finite-size effects observed up to the accessed timescales.
The accuracy of correlations as a function of the coupling distance d, is analyzed in Fig. 5, for the two quenches considered before. In this figure we plot the deviation of the TDVP results from the exact dynamics D( σ z l σ z l+r ), in terms of two-point correlation functions σ z l σ z l+r , at various distances r. As a general remark, we observe that the deviations from the exact result are systematically decreased by increasing the coupling distance d, in agreement with the results in Fig. 4. Also, it should be noted that the smaller the coupling distance d is, the earlier the deviations from the exact dynamics occur, as expected from the underlying perturbatively-motivated structure of the vCNs. Moreover, we observe that in the case of large transverse field the deviations, in general, grow more than in the quench to weak fields. This is due to the fact that in the dynamics with a large transverse field, correlations develop significantly at all the considered distances in the relevant timescales, see Fig. 6 below, whereas at weak transverse field the dynamics is more local and hence correlations at large distances are rather small (see, for instance, the correlation function on the left column of Fig. 4). Lastly, note that the oscillations observed for some of the deviations arise from the fact that the variational results oscillate around the exact solution, as can be seen in Fig. 4. Finally, it is instructive to look at the correlation spreading when using vCNs with different coupling distance d. This is illustrated in Fig. 6 for the quench to h/J = 3, with three vCNs with d = 1, 5, 11, as well as the exact solution. The results shown on this figure reveal another crucial feature of vCNs: The distance for which a vCN can adequately capture the propagation of correlations is exactly determined by the coupling distance d. Thus, we see that the spreading of correlations can be well captured in a controlled manner by increasing the coupling distance in the structure of the vCN. Although, this result is obtained for the Ising model, we expect that it holds in general.

A. Model
We now show that the method presented in this paper can also be used to tackle more challenging problems such as the quantum dynamics of nonintegrable models in higher dimensions. Concretely, we study a 2D U (1) quantum link model (QLM), where gauge fields are represented by spin-1/2 operators, defined on the links of a square lattice, and without matter degrees of freedom (see Fig. 7). The Hamiltonian reads where the sum goes over elementary plaquettes , and x,μ are the standard rasing/lowering spin-1/2 operators defined on the link that connects sites x and x +μ, with the unit vectors on the 2D lattice being denotedî,ĵ. For simplicity, we use spin variables normalized to 1, i.e., the eigenvalues of This allows us to structure the Hilbert space into socalled superselection sectors 49 , with an associated background charge distribution {Q α } satisfying Gauss law, i.e., G(x)|ψ = Q x |ψ , for all x. Thus, every superselection sector consists of only physical states fulfilling that the incoming and outgoing fluxes equal the charge Q x at a given site x, specified by the distribution {Q α } defining the corresponding sector.
As illustrated in Fig. 7, the dynamics in the QLM is generated by the first term on the right-hand side of Eq. (36), where the plaquette operators U , U † induce tunneling processes between configurations with flippable plaquettes, in which the electric fluxes form a loop with clockwise or anti-clockwise orientation (a non-flippable plaquette is annihilated by such operators). The second contribution on the right-hand side of Eq. (36), acts as a potential energy term that favors those configurations with a larger number of flippable plaquettes for λ < 0. When λ = −∞, the Hamiltonian in Eq. (36) has two Z 2 symmetry-broken ground states, with the electric fluxes arranged such that all the plaquettes in the system are flippable, i.e., the four spins in any plaquette form a closed loop with either clockwise or anti-clockwise orientation. These states, together with all other states kinetically connected to them, define a sector that we shall refer to as the fully-flippable (FF) sector. We note that the FF sector is a subset of the superselection sector with zero charge: G(x)|ψ = 0 ∀ x.
The out-of-equilibrium dynamics of the model studied in this section, has recently gained increased attention. For instance, it has been found that DQPTs occur during the unitary dynamics that follows a sudden perturbation in the Hamiltonian in Eq. (36), when starting from one of the ground states in the FF sector 68 . Further, in a recent paper 46 , we have also shown that the model under consideration may undergo a localization-to-ergodic transition, constituting the first example of a genuinely interacting 2D theory featuring disorder-free localization [47][48][49][50][51] . Indeed, it is found that quenches from the initial state |→ to a QLM with J /λ < 0, give rise to localized behavior, such as a significant suppression of transport and a limited spread of correlations, whose origin is linked to a fragmentation 69-72 of the Hilbert space into kinetically disconnected regions due to hard local constraints imposed by gauge invariance. Let us remark that, in the present context, the initial state |→ in Eq. (15), corresponds to an equally weighted superposition of all electric flux (spin) configurations. On the other hand, the dynamics within the FF sector, for example, displays ergodic behavior, characterized by a propagation of correlations and transport quantities throughout the whole system. In the subsequent, we shall consider precisely the two scenarios mentioned above. However, let us first introduce the vCNs for the QLM.

B. vCNs for the 2D QLM
Using the general scheme introduced in Sec. II B, we derive pCNs and the corresponding vCNs for the QLM. The classical limit of the theory under consideration is Quantum fluctuations are then induced by the term γV QLM = −J (U + U † ), where we recognize γ = −J . Next, we solve the equation of motion for the operator U , that is, −i d dt U (t) = [H 0 , U ] (and likewise for U † ). We get, where the operator Ω commutes with both U and U † , and is given by where P = {a, b, c, d} denotes the set of neighboring plaquettes around a given plaquette (see Fig. 8 operators A p and B p are given by In these definitions, the operators P ↑,↓ p,i are projectors onto one of the components of the spins defined on the bonds of the four neighboring plaquettes, e.g., P ↑ a,2 = S + 2 S − 2 = (1 + S z 2 )/2, projects onto the s 2 = +1 component of the spin living on the second link of plaquette a, which lies underneath the reference plaquette as in Fig. 8. Note that the very same spin of this example is also the fourth one in the plaquette on the right side of a. The convention for enumerating the spins in a given plaquette and labelling neighboring plaquettes is shown in Fig. 8.

First-order ansatz
Using λ as the unit of energy, the first-order correction of the cumulant expansion, Eq. (21), gives where ω (s) := F (s)Ω (s), and F := U U † − U † U , is a diagonal operator with possible matrix elements F (s) = +1, −1, 0, when the reference plaquette (indicated by the subscript ) is flippable and has an anti-clockwise orientation (+1), a clockwise orientation (−1), or it is not flippable (0). Also, we use the notation |s ≡ (U +U † )|s , i.e., s differs from s by the flipping of a single plaquette. Ω (s) denotes the diagonal entries of the operator introduced in Eq. (38).
Following the prescription given by Eqs. (23), (24), we can introduce a variational parameter for each one of the nine possible values that the integer variable ω (s) ∈ [−4, 4] can take. Thus, using the short-hand notation we can write down the corresponding variational effective Hamiltonian as which contains up to nine independent (assuming translational invariance) "first-order" variational parameters η (1) On the other hand, the "zeroth-order" contribu-tion readsH with a single variational parameter η (0) (t) and E * s ≡ E s /λ being dimensionless. Thus, a vCN built upon the zeroth and first-order cumulants, reads The first-order vCN defined above contains a maximum coupling distance (along either theî-orĵ-direction) between plaquettes equal to 2. Indeed, for a given plaquette , the function ω (s) involves only the nearest neighboring plaquettes that are shown in Fig. 8. Thus, when plugged into the expression of the classical network, this gives rise to terms where the plaquettes that are separated the most are, for example, b and d in Fig. 8. In terms of parallel spins (e.g., spins on links (x, x +ĵ) and (x + l ·î, x +ĵ), with l an integer), the maximal coupling distance is equal to 3.
Finally, let us note that the ansatz (47) can be explicitly recast as a classical Ising-like spin model with multiple (up to 16) spin interaction terms. Indeed, this can be achieved by rewriting the constraints in Eq. (44) in terms of the projectors P ↑,↓ p,i . For instance, for ω = 4, we have

Second-order ansatz
For most of the numerical results we shall present next, it becomes important to make use of a second-order vCN. Thus, let us show explicitly how to construct such ansatz. The second-order cumulant contains two terms as indicated in Eq. (13). Using the solutions in Eq. (37), we have where |s , ≡ (U +U † )(U +U † )|s . These equations then give the needed corrections to form a secondorder pCN, and will be the basis to build the corresponding second-order vCN. Let us note at this point the following aspect concerning the locality of the second-order cumulant. Each of the two contributions, Eqs. (49) and (50), might, in principle, give rise to couplings at all distances. However, when we subtract them to form the overall second-order correction (see Eq. (13)), most of the resulting terms cancel out, leaving only couplings up to some (local) coupling distance d. For instance, if we consider the initial state |→ in Eq. (15), the ratios of initial amplitudes in the equations above reduce to 1. Then, one can easily verify that the only non-vanishing contributions arise from overlapping plaquettes, i.e., plaquettes sharing one common link (grey and orange plaquettes in Fig. 8 ) or from plaquettes connected by a common neighboring plaquette (grey and green plaquettes in Fig. 8).
In effect, for plaquettes separated by more than one intermediate plaquette (i.e., plaquettes outside the colored region in Fig. 8), we have that F (s ) ≡ F (s) and ω (s ) ≡ ω (s), and hence, Eqs. (49) and (50), become identical. In this case, the coupling distance (along either theî-orĵ-direction) is d = 4 for plaquettes, and d = 5 for parallel spins. We note that this remark also holds for other initial states that can be written as a product of local terms, as long as supports of sufficiently separated local terms do not overlap with each other, as is the case for all initial states considered in this work. Now let us convert the second-order pCN into a secondorder vCN. In the same spirit as for the first-order ansatz, the integrals involved in the second-order correction can be written as for Eq. (49), and likewise for Eq. (50). In this case, there are 81 possible combinations of the tuple (ω 1 , ω 2 ); hence, we introduce 81 second-order variational parameters η (2) ω1,ω2 (t). Thus, using the short-hand notation and upon integrating Eq. (49), one can make the substitution case (disorder-free localization), our method is able to yield sufficiently accurate results for long timescales that could hardly be accessed with any other state-of-the-art numerical technique. Unless otherwise stated, in all the examples discussed in this section, the TDVP equations are solved using a 4th-order Runge-Kutta integrator with step size ∆t = 0.1λ −1 .

Benchmark in quasi-1D ladders: uniform initial states
As a first benchmark, we consider a quasi-1D ladder of 2 × 10 plaquettes, i.e., 40 spins, with dynamics restricted to a given superselection sector. Concretely, we consider the FF sector, consisting of all the states with zero charge that are kinetically connected to the two maximally flippable configurations, as explained in Sec. IV A. As shown below, with these settings, one can still carry out efficient ED calculations. Indeed, the key idea is that, since the Hamiltonian in Eq. (36) adopts a block-diagonal form, with each block corresponding to a given superselection sector, a state that belongs to one of such sectors will remain in that sector during the course of the dynamics generated by H QLM . Thus, one needs to consider only the portion of the Hilbert space that is relevant for the chosen sector.
The quench protocol considered here is the following: We initialize the system in an equally weighted superpo-sition of all the basis states spanning the FF sector. We denote such a superposition as |→ FF . Next, we evolve the system with the Hamiltonian in Eq. (36), with a finite value of J /λ. Later, we will study similar quenches but starting from nonuniform initial states. Note that for a 2 × 10 system, the dimension of the FF sector is 17906. Therefore, in the TDVP calculations, we can also carry out an exact enumeration of states, i.e., we can explicitly sum over all relevant spin configurations in expressions such as the expectation value in Eq. (3), rather than performing a Monte Carlo sampling.
Let us emphasize that the dynamics in the FF sector is interesting because significant correlations may develop in the entire spatial extent of the system, as we have recently shown 46 . Therefore, this constitutes a highly challenging scenario for our method, since a vCN can adequately capture the buildup of correlations up to a distance specified, essentially, by the order of the cumulant expansion, as we have already argued.
The results of this benchmark are shown in Fig. 9, for quenches to J /λ = −0.1 and J /λ = −0.3, left and right columns of Fig. 9, respectively. We compare the dynamics computed with a first-order pCN, a first-order vCN (Eq. (47)), a second-order vCN (Eq. (56)), and via ED. Particularly, we focus on two observables: the nearestneighbor spin-spin correlation function S z x,î S z x+î,ĵ (i.e., S z 1 S z 2 , with the convention in Fig. 8 U + U † . Besides, we also show the integrated relative residuals R 2 (t) of the first-and second-order vCNs, as a measure of their accuracy. As expected, it is the second-order vCN that provides the most accurate results in the studied quenches. This can be seen from both the evolution of the observables and the growth of the residuals. In particular, for the quench to J /λ = −0.1, the second-order vCN captures remarkably well correlations at short distances (left column, upper row), as well as the oscillatory behavior of off-diagonal observables (left column, middle row), in the entire range of accessed timescales λt = 20. As anticipated, in the quench with a bigger strength of the perturbation J /λ = −0.3, the description in terms of all of the considered wavefunctions, breaks down at earlier times. Yet, the second-order vCN still yields a rather excellent agreement with ED up to significant timescales λt ∼ 7. In all cases, the vCNs outperform the first-order pCN regarded here.
Next, we point out that, as illustrated better in the second quench (right panel of Fig. 9), the oscillations of both S z x,î S z x+î,ĵ and U + U † , decay to some steadystate value in agreement with the expectation of ergodic behavior in the FF sector 46 . Importantly, such decay of the oscillations is captured by the TDVP solutions. On the contrary, the pCN fails to capture this crucial feature, although it gets right the frequency of the oscillations up to considerable timescales. Finally, let us emphasize that the results displayed in Fig. 9, corroborate once more the controlled nature of vCNs, in that the accuracy of a vCN can be improved systematically either by adding higher order terms of the cumulant expansion or by decreasing the strength of quantum fluctuations.

Benchmark in quasi-1D ladders: nonuniform initial states
Here, we consider the same setting as before, namely, a 2 × 10 quasi-1D ladder, with dynamics restricted to the FF sector. The chosen initial state, however, is nonuniform. In particular, the initial condition is created by adding a line defect with subextensive energy contribution to the state |→ FF . This is achieved by applying the operator upon the state |→ FF , where C d denotes the set of plaquettes in column d. Thus, P|→ FF represents a state with a line energy defect along column d = 0. The resulting quench dynamics obtained using both a second-order vCN and ED, are analyzed in terms of the evolution of the total energy of column d, which is given by The results are shown in Fig. 10. One can observe that the second-order vCN captures quite well, in a quantitative way, the propagation of the line defect at all distances up to a time λt ≈ 20. After this point, the TDVP solution is not exact anymore, but it still follows qualitatively the exact dynamics to the largest accessed time, Time λt

Quenches in a 2D lattice
After having assessed the performance of our method in quasi-1D ladders, we now consider a truly 2D setting. Moreover, we shall regard regimes where access to ED is computationally prohibited. Concretely, the following quench protocol is carried out: The system is initialized in the state |→ given in Eq. (15) and then evolved with the Hamiltonian in Eq. (36) at finite J /λ. As mentioned before, in the present context the state |→ can be thought of as an equally weigthed superposition of all the electric flux (spin) configurations, thereby involving all the superselection sectors. Thus, apart from very small system sizes, ED techniques become impractical. In the following, we show results for a system of 100 × 100 plaquettes (2 × 10 4 spins). In all the calculations considered here, a Metropolis Monte Carlo sampling was performed with 10 6 sweeps at each time instance, and single-spinflip updates.
As explained in our recent work 46 , it is interesting to investigate quenches from the initial state |→ to the QLM in Eq. (36), as this scenario gives rise to (disorderfree) many-body localized dynamics, even though both the Hamiltonian and the initial condition are homogeneous. In effect, when quenching from the initial state |→ , not only the transport of energy is highly suppressed, but also the spreading of correlations is drastically constrained 46 . This means that relevant correlations can only develop at very short distances. Due to this reason, we expect low-order vCNs to capture very well the main features of the exact quantum dynamics in the present case.
The resulting dynamics is shown in Figs. 11 and  12, where quenches to the points J /λ = −0.1 and J /λ = −0.3, are considered, respectively. Let us focus first on the quench to J /λ = −0.1 (Fig. 11). The left column shows the spatiotemporal buildup of quantum correlations in terms of two correlation functions, namely, the spin-spin correlation function S z 0 (t)S z l (t) ≡ S z x,ĵ (t)S z x+l·î,ĵ (t) , between parallel spins separated by a distance l, and the connected two-point plaquette-flip correlation function Time λt , between parallel links separated by a distance l. Right panel: Plaquette-plaquette connected correlation function (U0 + U † 0 )(U l + U † l ) C (see main text). Results obtained with a first-order vCN (left side), and a second-order vCN (right side), for a 100 × 100 system (2 × 10 4 spins). Here, a time step ∆t = 0.05λ −1 was used to integrate the TDVP equations.
In both cases, we compare the results obtained using a first-order vCN (left-hand side of the colormap), and a second-order vCN (righthand side of the colormap). As a first observation, we note the rather confined spreading of correlations. As already mentioned, this is due to the disorder-free localization mechanism that takes place in the situation considered here. We stress that this is not an artefact of the vCNs. Indeed, a first-order vCN, see Eq. (47), has a coupling distance d = 2, for plaquettes, and d = 3, for parallel spins. On the other hand, a second-order vCN, see Eq. (56), has coupling distances d = 4, 5, for plaquettes and parallel spins, respectively. Therefore, according to our previous analysis (see Fig. 6, in particular), we expect that both of these vCNs would be able to account for the spreading of correlations, at least, up to the aforementioned distances. However, we find that significant correlations are developed predominantly no further than a distance l = 3, at least, up to the accessed timescales.
A second observation is that the results obtained with both vCNs exhibit a very similar qualitative behavior. On the right column of Fig. 11, we make a more quantitative comparison by plotting individual cuts along the distances with the largest signal, i.e., l = 1, 2. We notice that the oscillations in both cases have pretty much the same frequency, whereas the observed discrepancies are potentially due to the fact that the first-order ansatz cannot quite capture the decay of oscillations, as also seen in Fig. 9 for the quasi-1D ladder.
Finally, in Fig. 12, we show the dynamics of correlations for the quench to J /λ = −0.3. We note that, apart from the fact that the dynamics gets accelerated due to the bigger strength of the off-diagonal perturbation, similar remarks as the ones for the previous quench also hold in this case. In particular, we see that the corre-lation spreading is constrained, too, with significant correlations growing only in the spatial region l ≤ 4. Once more, this is a manifestation of the disorder-free localization phenomenon that occurs in the dynamics of the 2D QLM for the type of quenches studied in this section.

V. SUMMARY AND OUTLOOK
We have introduced a numerical variational scheme for the study of dynamics in correlated quantum systems in one and higher dimensions. Our method relies on an efficient representation of the many-body wavefunction, in terms of complex networks of classical spin variables. This class of variational wavefunctions, termed vCNs, are similar to ANNs. Crucially, vCNs can be constructed according to a controlled prescription, as it has been underlined and explicitly shown in this paper. Moreover, the ideas presented in this work are of general use and can be applied, in principle, to study any interacting quantum lattice model, regardless of spatial dimensionality, provided that an expansion around a well-defined classical limit is possible.
We have illustrated the way in which our method works and its range of applicability by studying quantum quenches in two particular models. First, we considered the paradigmatic 1D TFIM, which serves as an ideal testing ground for our method as it can be solved exactly 58 . Moreover, the 1D TFIM presents the advantage that the resulting vCNs have a relatively simple and intuitive form, which is in fact, that of classical Ising models. Further, for this example we have shown, in a rather intuitive manner, how the perturbatively motivated architecture of vCNs can be systematically built upon by incorporating classical couplings that expand over a certain coupling distance, which is closely related to the order of the underlying cumulant expansion. Then, we characterized the performance of various vCNs in terms of the real-time evolution of two-point correlation functions at varying distance, local off-diagonal observables, and the integrated residuals. In general, we found that the basic architecture of vCNs allows for a systematic improvement in the accuracy of such results by adding higher-order terms according to the controlled procedure referred above. We have verified that this statement remains true when varying system size and quenching to different points in the parameter space of the associated Hamiltonian.
Next, we studied the out-of-equilibrium dynamics in a genuinely interacting two-dimensional lattice gauge theory, a U (1)-QLM. As opposed to the Ising model, in this case the resulting vCNs are highly non-conventional spin models; see, for example, Eq. (48). Yet, the same general ideas apply in their construction. We have shown this by explicitly calculating first-and second-order vCNs, which were then used for computing the dynamics of the QLM in different scenarios. On the one hand, we considered quasi-1D ladders of 2 × 10 (40 spins), with dynamics restricted to the FF sector. This allowed us to use ED for the purpose of benchmarking. We emphasize, however, that this is a rather challenging scenario for our method, as it has been found 46 that the dynamics in the FF sector embodies considerable quantum correlations that propagate throughout the entire system. Yet, a remarkable quantitative agreement with ED was obtained. In particular, we found that second-order vCNs yield very accurate results for short-range correlations and local off-diagonal observables, up to long timescales. Importantly, we tested our method using not only homogeneous initial conditions but also nonuniform ones with a spatial energy inhomogeneity along a given column. Once again, we found that second-order vCNs are able to accurately describe the propagation of such a line defect at all distances up to long timescales. Next, we considered situations where ED and other state-of-art techniques become computationally inaccessible. Namely, we studied quenches starting from an equally weighted superposition of all spin configurations, thereby involving all superselection sectors. This part of our study was done in systems with 100×100 plaquettes (2×10 4 spins). Here, however, the crucial observation that the QLM exhibits disorder-free localization dynamics 46 , enabled us to employ reliably our vCNs, since such localization mechanism yields a strong suppression of correlation spreading.
Overall, the quantum quenches discussed above have allowed us to characterize the accuracy and versatility of our methodology. As a principal observation, we note that upon adding higher-order couplings in the structure of a vCN, its accuracy can be improved in a controlled way. In addition, higher-order couplings also account for another crucial feature: correlation spreading is properly captured up to a spatial scale determined by the maximum coupling distance included in the architecture of the vCN. On the other hand, at a fixed or-der of the cumulant expansion, the accuracy can also be systematically increased by reducing the strength of the off-diagonal perturbation. As a rule of thumb, we expect a low-order vCN to give sufficiently accurate results up to a timescale set by the inverse of the perturbation strength 31 . However, such timescales can be further prolonged when considering higher-order vCNs. Besides, for a given order of the cumulant expansion, in general, a vCN outperforms its corresponding pCN at all relevant timescales, and in some cases, the former can account for important features of the quantum dynamics, which are just beyond the scope of the latter, such as the relaxation of observables towards a steady-state value. We have checked that all the remarks made above also hold when varying system size, and quenching to different points in the parameter space of the studied Hamiltonians.
Naturally, our method is not exempt from limitations and drawbacks, which can be understood from the considerations made in the previous paragraph. First and foremost, the addition of higher-order terms is accompanied by an exponential growth in complexity, so in practice, we are limited to a given order of the cumulant expansion. Moreover, we also know that a vCN will fail at capturing correlations beyond their maximum coupling distance, as illustrated very clearly in Fig. 6. Such coupling distance is also determined by the order of the cumulant expansion. Also, in general, we expect a breakdown in the description of the evolution of observables after a timescale set by the inverse of the perturbation strength, at least, as far as low-order vCNs are concerned.
Regarding possible further applications, there are several interesting routes that one could explore employing vCNs. A particularly promising one consists of formulating hybrid approaches that combine vCNs with ANNs, so as to mitigate some of the drawbacks sustained by both kinds of generative machines. Also, as we have already mentioned, the applicability of the approach presented in this paper is not fundamentally restricted by spatial dimensionality. Thus, in this respect, the following models arise as natural extensions of the systems studied here: 2D TFIMs, 2D bosonic and fermionic Hubbard models, 2D QLMs in the presence of a dynamical matter field, 3D quantum spin ice models, as well as other kinetically constrained models. Lastly, regarding vCNs as generative machines also raises the possibility of employing them for tasks other than solving the real-time dynamics of quantum many-body systems. In particular, vCNs could be used for addressing ground-state search problems, with an optimization procedure guided by a conventional variational principle that minimizes the energy functional, rather than using a TDVP.
Finally, let us emphasize that the approach presented in this paper not only allows to address theoretical questions regarding higher-dimensional systems but also could provide a theoretical description of recent experiments in two dimensions using quantum simulators. Among such experiments are studies of two-dimensional many-body localization dynamics employing bosons 8 and fermions 6,11 , experiments probing transport properties of various 2D lattice systems 5,16 , as well as investigations on other aspects of the quench dynamics of several 2D quantum spin systems 7,12,15 . Most of the aforementioned experiments explore regimes that are still inaccessible to simulation methods relying on classical resources, thereby calling for a larger input from theoretical and numerical sides. We believe that the method introduced in this work does constitute an important step into this direction.