Benchmarking a novel efficient numerical method for localized 1D Fermi-Hubbard systems on a quantum simulator

Quantum simulators have made a remarkable progress towards exploring the dynamics of many-body systems, many of which offer a formidable challenge to both theoretical and numerical methods. While state-of-the-art quantum simulators are in principle able to simulate quantum dynamics well outside the domain of classical computers, they are noisy and limited in the variability of the initial state of the dynamics and the observables that can be measured. Despite these limitations, here we show that such a quantum simulator can be used to in-effect solve for the dynamics of a many-body system. We develop an efficient numerical technique that facilitates classical simulations in regimes not accessible to exact calculations or other established numerical techniques. The method is based on approximations that are well suited to describe localized one-dimensional Fermi-Hubbard systems. Since this new method does not have an error estimate and the approximations do not hold in general, we use a neutral-atom Fermi-Hubbard quantum simulator with $L_{\text{exp}}\simeq290$ lattice sites to benchmark its performance in terms of accuracy and convergence for evolution times up to $700$ tunnelling times. We then use these approximations in order to derive a simple prediction of the behaviour of interacting Bloch oscillations for spin-imbalanced Fermi-Hubbard systems, which we show to be in quantitative agreement with experimental results. Finally, we demonstrate that the convergence of our method is the slowest when the entanglement depth developed in the many-body system we consider is neither too small nor too large. This represents a promising regime for near-term applications of quantum simulators.


I. INTRODUCTION
Quantum devices are on the periphery of establishing an advantage over their classical counterparts [1]. Recently, a quantum computational advantage was demonstrated in sampling problems [2,3] using superconducting qubits [4] and a photonic quantum device [5]. Moreover, these and similar platforms based on neutral atoms in optical lattices [6,7], trapped ions [8] and Rydberg atoms in optical tweezers [9,10] have demonstrated high-fidelity simulations using specific model Hamiltonians in regimes that significantly challenge existing state-of-the-art classical numerical simulations. Harnessing the unique capabilities of these platforms and pushing their boundaries to even larger system sizes and evolution times paves the way towards practical applications of quantum devices in the area of quantum simulation [1].
The dynamics of quantum many-body systems out of equilibrium constitute fundamental questions that are both physically pertinent and computationally challenging. Contemporary explorations of this regime have uncovered a number of intriguing phenomena [11] including many-body localization [12][13][14], where an interacting system with quasiperiodic or random disorder defies thermalization [15][16][17][18]. Interestingly, an apparent breaking of ergodicity was also found in disorder-free models [19], e.g., in the presence of a linear potential [20][21][22][23], which was attributed to a novel mechanism, known as Hilbert space fragmentation [24][25][26]. It constitutes one example of a rich variety of weak ergodicity-breaking models [27], where the many-body Hilbert space shatters into (approximately) disconnected subspaces [28,29]. A special example are quantum scars, where exceptional, lowentropy states in the many-body spectrum give rise to long-lived periodic orbits [30,31] that resemble classical scarring. Moreover, within the paradigm of slow thermalization, some driven systems have been shown to feature pre-thermal dynamics [32][33][34], where two distinct thermalization timescales are found [35]. Adding periodic driving to the system further enables the realization of genuine out-of-equilibrium phases, a paradigmatic example being quantum time crystals [36][37][38][39]. Accordingly, there have been extensive experimental, theoretical and numerical efforts to study these phenomena and push the limits of current theoretical methods.
Any computation of the dynamics of quantum manybody systems is met with challenges arising from the dimension of the Hilbert space, which grows exponentially in the system size. In other words, the quantum state may carry a large volume of information, which is impractical to store and process classically. A natural countermeasure is to relax the tolerance and seek approximate solutions which in many cases can be found efficiently [40]. Approximations rely on the expectation that all of the information in the many-body quantum state may not be equally important for the specific dynamics of the specific observable we are interested in. By means of an ansatz, an approximate method identifies a part of the information in the quantum state that is most arXiv:2105.06372v2 [quant-ph] 8 Nov 2021 "important" for the dynamics which can then be used to efficiently compute an approximation of the dynamics. With the advent of quantum simulators and quantum computers, it has become imperative to explore classical approximation methods that could potentially simulate quantum devices [41][42][43].
IfĤ is the Hamiltonian of a many-body system, some of the most physically relevant problems include computation of thermal states e −βĤ (here, β = 1/kT ) or of the time evolution e −iĤt |ψ of a given state |ψ . One of the earliest numerical approximation methods developed was the cluster expansion for 2D and 3D lattice systems [44][45][46]. It is based on the observation that the exponential of the Hamiltonian can be written as a sum of terms representing various paths in the lattice. Another class of approximate methods stem from a matrix product state (MPS) ansatz [47,48]. Most commonly used MPS-based time evolution methods are the time-evolving block decimation (TEBD) and time dependent variational principle (TDVP) [49]. MPS based techniques have been very successful in studying both the ground state and time evolution of localized interacting many-body systems. Some bosonic many-body systems can be studied using Monte-Carlo methods. Recently, a new method has been proposed, making use of local thermalization of many-body systems [50,51].
A key feature of approximate methods is the error estimate, which allows us to determine when it is reliable. However, not every approximation ansatz has a well established error estimate. The bottleneck in such theories is to benchmark them, which obligates us to be able to compute the exact solution for a few instances of the many-body problem. In this work we demonstrate that a neutral atom quantum simulator can be used for this purpose [ Fig. 1(a)].

II. MODEL AND EXPERIMENTAL IMPLEMENTATION
We consider a spinful one-dimensional (1D) Fermi-Hubbard model with L sites and a spatially-dependent on-site potential V i,σ , where σ = {↓, ↑} represents the spin and i represents the site index. The Hamiltonian of the system iŝ Hereĉ † i,σ (ĉ i,σ ) denotes the fermionic creation (annihilation) operator for spin σ on site i, J is the tunneling matrix element and U is the Hubbard interaction strength. In this work, we consider either quasiperiodic or linear on-site potentials to realize two paradig- where fest is a lower bound on the fidelity), which is an appropriate measure of reliability of TEBD (see appendix A for details) falls below 95% for the Stark Hamiltonian with ∆ = 3J and U = 5J, for practically accessible parameters. The orange point represents the experimental parameters, with Lexp 290 and t ≈ 700τ . c Schematic of the approximations used in this work. The red and the green spheres represent the two spin components (see text, Sec. III). d Approximating the occupancy-matrix Γ σ [see text, Eq. (4)] for spin-σ atoms. matic models studied in the context of localization -the Aubry-André model and the Stark model. In the Aubry-André model, V i,σ = ∆ AA cos(2πβi + φ) + α(i − L/2) 2 , where ∆ AA is the strength of the detuning lattice, β is the ratio of its wavelength to that of the primary lattice, φ a phase factor and and α is the harmonic trap confinement strength. For a Stark model, with a harmonic confinement, V i,σ = ∆ σ i + α(i − L/2) 2 , where ∆ σ is the spin-dependent tilt of the lattice.
The neutral-atom quantum simulator we use in this work consists of a degenerate Fermi gas of 50(5) × 10 3 40 K atoms at temperature T /T F = 0.15 (1), where T F is the Fermi temperature. The gas is prepared in an equal mixture of two spin components in the F = 9/2 manifold, with |↑ = |m F = −7/2 and |↓ = |m F = −9/2 . The Fermi gas is loaded into a 3D optical lattice with lattice constant d s = 266 nm along the x direction and deep transverse lattices, with constant d ⊥ = 369 nm, to isolate the 1D chains along x, with a residual coupling    Figure 2. Benchmarking the convergence of the numerical method. a A comparison of the imbalance time trace as predicted with the approximate method, Eq. (4) (green solid curve) with a system size Lapx = 280, k ↓ = 0, = 7 and k ↑ = 3 with the experimental data (red markers) for ∆ ↓ = J and U = J. The precise values of J, ∆σ and α are calibrated by fitting the corresponding non-interacting data (cyan markers) to the single particle theory (cyan solid curve). In this and the following datasets, ∆ ↑ = 0.9∆ ↓ . b Benchmarking the approximate method for short times (up to 25τ ) using the RMS deviation between the theoretical and the experimental time traces (see text) for U = 3J and ∆ ↓ = J, for Lapx = 280, k ↓ = 0 and various values of and k ↑ . c Benchmarking the numerical method for long times (300τ ) for U = 5J and various ∆ ↓ measured experimentally (red markers) with the prediction of the approximate method with Lapx = 100, = 7, k ↓ = 0 and various k ↑ (green curves). The dashed curves show the same calculations with k ↓ = 1. The inset shows the convergence of the predicted steady-state imbalanceĪ ↓ (green curve) at ∆ ↓ = 1.1J towards the experimental value as we increase k ↑ . The upper (lower) shaded band is the experimental value with errorbars for the non-interacting (interacting) case with ∆ ↓ = 1.1J. The dashed line is the extrapolated convergence value of the green curve. d Long time traces computed for a system size Lapx = 280 using = 7, k ↓ = 0 and k ↑ = 2, 3, 4 and 5 shown as different shades for ∆ ↓ = 3.3J and U = 5J. The red markers represent the experimental data (data taken from Ref. [20]). e The long time (100τ ) steady-state value of the imbalance for the Aubry-André Hamiltonian with a detuning strength ∆ = 3J for various interaction strengths, with Lapx = 280, = 7, k ↓ = 0 and k ↑ = 2 (see appendix F for details of the data and the computation).

RMS deviation
< 3 × 10 −4 J. Hence, the system can be considered as a set of approximately 250 independent 1D chains, realizing Hamiltonian (1). Using an additional lattice with constant 2d s we prepare an initial charge density wave (CDW), where only even sites are occupied, with an average density ĉ † i,σĉi,σ < ∼ 0.25. The fraction of doublyoccupied sites is suppressed to < 0.03, by loading the gas at a repulsive scattering length of 100 a 0 and using an additional short off-resonant light pulse [52]; a 0 is the Bohr radius. There is no spin order, therefore our initial state can be modelled as an incoherent distribution of site-localized particles with random spin configuration.
The central chain in our experiment has a length of L exp = 290 ± 10 sites and we can reliably simulate (experimentally) the time-evolution with the Aubry-André or the Stark Hamiltonian up to T ≈ 700τ [20], where τ = /J denotes the tunneling time and is the reduced Planck's constant. Although the dynamics remains fairly localized, it is numerically inaccessible due to the quantitative value of the localization length. Exact diagonaliza-tion (ED) is impractical for L > 21 and methods based on MPS are impractical for T > 200τ , for our system and the parameters we use in the experiment ( Fig. 1(b) and appendix A). The error incurred in the TEBD method using MPS has been studied extensively [42,53]. In particular, the local Uhlmann fidelity [54] was found to be useful to estimate the error in local observables [55]. For our system and parameters, we estimate that the TEBD error in local observables is ∼ 5% after 200τ for realistic bond dimensions (Appendix A). Based on previous experiments done on the same experimental set up with the same models [15,20], including studies of various imperfections [56][57][58], we expect that the systematic deviation in the imbalance due to experimental imperfections is comparable to the errorbars obtained in the measurements. Therefore we use our quantum simulator to benchmark a new approximate numerical method which we develop. For the purpose of comparison with the experiment, we restrict the on-site potential to the above mentioned models models, although our theoreti-cal method is expected to be applicable more generally, for all localized models.

III. THEORETICAL APPROXIMATIONS
Our approximate method is built upon antisymmetrized product states and is suitable for fermionic systems. A basis state |ψ of the Fock space of two component fermions on a 1D lattice of size L with N σ atoms in spin-σ can be represented in second-quantized notation as |ψ =ĉ † j1,↑ · · ·ĉ † j N ↑ ,↑ĉ † i1,↓ · · ·ĉ † i N ↓ ,↓ |vac , where |vac denotes the state of the empty lattice. Here 1 ≤ i 1 < i 2 · · · < i N ↓ ≤ L and 1 ≤ j 1 < j 2 · · · < j N ↑ ≤ L. States of multiple non-interacting particles, (e.g., N ↓ > 1, N ↑ = 0) can always be written as antisymmetrized products. For instance, if the spin-↓ atoms start at sites i 1 , · · · , i N ↓ , we can compute the full time-evolved state |ψ(t) under the Hamiltonian (1) as the antisymmetrized product of |ψ 1 (t) , · · · , |ψ N ↓ (t) , where |ψ r (t) is the time-evolved single-particle state of an atom starting at the site i r , r ∈ {1, 2, · · · , N ↓ }. If the system is localized, we can compute each |ψ r (t) by restricting the lattice to a finite size, including only sites i ∈ {i r − , · · · , i r + } for some positive integer . As we increase , |ψ r (t) converges quickly to the exact value, if the system is localized. This is an efficient approximation for many-particle non-interacting dynamics.
We now construct an efficient approximation for interacting many-body systems (i.e., N ↓ , N ↑ > 0), where the dynamics is localized. Although the many-body state |ψ consists of a large number of variables, the experimentally relevant information can be summarized in two L × L "occupancy matrices" Γ ↓ and Γ ↑ . The ij-th elements of these two matrices are defined as [59], The diagonal entries of Γ σ represent the on-site occupation density of spin σ atoms and the off-diagonal terms represent correlators. For instance, in the previous example of multiple non-interacting particles, Γ ↓ (t) can be written as (see appendix G for a derivation).
If the system is localized, each of the operators |ψ r (t) ψ r (t)| are dominated by elements around a particular diagonal entry [60]. Thus, their sum Γ ↓ (t) will be dominated by elements around the diagonal, although the diagonal itself may be uniform [61] [ Fig. 1(d)].
In the many-body case if the system remains localized, the occupancy matrices Γ ↓ (t) and Γ ↑ (t) are again dominated by elements in and around the diagonal. Therefore, we seek an approximate representation for these matrices, of the form: Where, Γ σ,r (t) are density matrices to be defined, loosely representing the state of the r-th atom. Unlike in the non-interacting case, the matrices Γ σ,r (t) need not be pure or orthonormal [see appendix H for a detailed discussion of Eq. (4)]. We use this representation to approximate the time evolution of a separable initial state |ψ =ĉ † j1,↑ · · ·ĉ † j N ↑ ,↑ĉ † i1,↓ · · ·ĉ † i N ↓ ,↓ |vac under the Hamiltonian Eq. (1). For such a state, the matrices Γ σ,r (t = 0) are well defined. Below, we summarize the approximations used to define Γ σ,r (t) [ Fig. 1(c)]. Hereafter, we use σ to represent the spin component for which we are computing the occupancy matrix andσ to represent the other spin component.
1. Approximation-1: Choose an integer κ σ ≥ 0 and construct a κ σ -shell around the site i r ., i.e., a set of κ σ sites nearest to i r [62]. Replace all spin σ atoms in the initial state |ψ , outside this κ σ -shell with holes. This would result in a new state |ψ with at most κ σ + 1 spin σ atoms.
The final state |ψ after applying the above three approximations on |ψ is a few-body state on a lattice of size 2 + 1, |ψ =ĉ † α1,σ · · ·ĉ † ir,σ · · ·ĉ † αq σ ,σĉ † β1,σ · · ·ĉ † βqσ ,σ |vac where q σ + 1 (the +1 accounts for the spin-σ atom that starts at site i r ) and qσ are the number of atoms of spin σ and spinσ that remain after approximations 1 and 2 respectively and α i , β i are their positions. During a time evolution under the Hamiltonian Eq. (1), |ψ will remain in a smaller Hilbert space H , the dimension of which is independent of L, polynomial in and exponential in q σ , qσ. We evolve |ψ in time under the projection of the Hamiltonian, Eq. (1), to the space H , using ED. We then use the time-evolved state |ψ (t) to compute Γ σ,r : This matrix is in fact the occupancy matrix of spin σ atoms corresponding to |ψ (t) . The factor 1 1+qσ stems from the q σ +1 spin σ atoms in this state (see appendix H for a further discussion). We follow this procedure for r = 1, 2, · · · , N σ to obtain Γ σ,1 (t), · · · , Γ σ,Nσ (t), from which we compute Γ σ (t) following Eq. (4).
The possible choices for the free parameters , κ σ and κσ can be represented by integer points in a 3D region defined by κ σ , κσ ≤ min{2 , L} (see Fig. 6 in the appendix). At the extreme point , κ σ , κσ = L, the approximate method is exact. If we use periodic boundary conditions on the full lattice, the approximate method is exact when 2 , κ σ , κσ = L. In another limiting case when κ σ = κσ = 2 , represented by a line in the 3D space, the approximate method reduces to a possible adaptation of the standard cluster expansion method to 1D Fermi-Hubbard systems [63][64][65], also discussed in appendix C. This extreme case can also be related to TEBD with a spatially varying bond dimension optimized for a local observable (see appendix A and B for more details. Also see Ref. [50] for a related idea). Depending on the nature of the dynamics being studied, the convergence rate of the approximation may be the fastest along a non-trivial path towards the extreme point 2 , κ σ , κσ = L in this 3D space. The practical utility of the approximate method depends on the rate of convergence. We show that the convergence can occur for relatively small values of , κ σ and κσ in the localized regime by benchmarking the numerical results with a neutral-atom quantum simulator.

IV. BENCHMARKING
In the experiment, we prepare the system in a CDW initial state, i.e., a uniform incoherent sum of states where each even site is occupied by a spin-↑ or a spin-↓ atom as described above (with the total numbers fixed to N ↑ and N ↓ respectively) and each odd site is empty. Starting from a spin-polarized sample in the |↓ -state we generate a spin imbalanced gas where the spin imbalance is adjusted by varying the RF power during a sweep that couples the two spin states for constant atom number, N = N ↑ + N ↓ . After a variable time evolution, we measure the spin-resolved imbalance I σ (t) between the evenand odd-site occupancies, defined as Here, N even is the number of spin σ atoms on even [odd] sites. We use the approximate method to compute the spin-resolved imbalance for both, the Stark and Aubry-André model, and benchmark it with experimental results. The parameters in the two models are calibrated by realizing the non-interacting limit.
For simplicity, we use σ =↓, i.e., we will compute the imbalance of ↓ atoms using the proposed method and compare it with the experimental data. Note that in a CDW, since only even sites can be occupied, the number of even sites within the κ σ -shell is more relevant than κ σ itself. Therefore, we label the shells by k σ , the number of even sites in the κ σ -shell and kσ, the number of even sites in a κσ -shell. Note that k σ = 0 when κ σ = 0, 1 or 2. And k σ = 1 when κ σ = 3. In general, when κ σ = 4n + 3, k σ = 2n + 1 and when κ σ = 4n, 4n + 1 or 4n + 2, k σ = 2n. Hereafter, we will label a shell by k σ . See table I in the appendix for a list of k σ values and the corresponding κ σ values. For the Stark Hamiltonian, we consider two time scales -a short time scale where one can observe coherent Bloch oscillations in the imbalance [20] and a long time scale where the oscillations are dephased and a steady-state value is reached. To quantify the disagreement between the experiment and the numerical method, we use the root-mean-square (RMS) deviation, defined as |I expt − I apx | 2 dt. Here, I apx is computed using Eq. (4). In Fig. 2(b) we show the RMS deviation as a function of for various k ↑ , with k ↓ = 0, upto t = 8 ms = 25τ . The data indicates that the variation of the approximate method with respect to the experiment is negligible for > 7, and therefore, we use = 7 for further computations. While there is a significant difference between k ↑ = 0 and k ↑ = 2, increasing it further has no significant impact on the RMS deviation, suggesting that much of the interaction effects on short-time Bloch oscillations stem from three-atom processes. For a CDW initial state, increasing k ↓ does not have a significant effect. This can be attributed to the averaging in the CDW. In particular, the dynamics of a spin-↓ island is unaffected by increasing k ↓ . The effect of k ↓ is most significant for a Néel type initial state (see appendix D for more details).
For long evolution times, we directly compare the steady-state imbalance valuesĪ ↓ , defined as the imbalance averaged over a time window (see appendix F for details). Fig. 2(c) shows the benchmarking results for long-time dynamics, averaged over 10 points between 300τ and 330τ . We find that for ∆ > 2J, numerical results with k ↑ = 4 are already sufficient to predict the long-time dynamics within errorbars (Fig 2d). However, the convergence is slower at smaller ∆, owing to a larger single-particle localization length (Fig 2c) of more than two sites for ∆ < 2J, where we expect our method to fail to converge within the accessible range of the parameters , k ↑ and k ↓ . Similarly, we find that for the Aubry-André Hamiltonian at 4J a value of k ↑ = 2 is already sufficient to reproduce the well-known interaction dependence within errorbars [15]. This observation is in agreement with the analytical localization length [66] of 1.4 lattices sites obtained at a detuning strength of ∆ = 3J.
In all the datasets, ∆ ↑ = 0.9∆ ↓ and the tilt factor 0.9 is determined by the differential tilts experienced by the |F = 9/2 : m F = −9/2 and |F = 9/2; m F = −7/2 states. For a study of the effect of varying this tilt difference and the performance of our approximate method in those cases see Ref. [67].

V. COMPUTATIONS USING THE APPROXIMATE METHOD
We now use the approximate method to obtain further insights into the interacting dynamics of the Stark Hamiltonian. For a CDW initial state, we can use the approximations described above to derive the following expression for the imbalance (see appendix I for a derivation. Also see Ref. [61] for a similar expression for equal The black circles represent the perturbative estimate, valid in the limit J ∆, of the side peak, 3J ± 4J 2 U/(∆ 2 − U 2 ) (see appendix F for more details) . spin populations).
where, λ ↑ = 2N ↑ /L is the filling of spin-↑ atoms and is the imbalance of spin-↓ atoms in a fewbody interacting state with q ↑ spin-↑ atoms and q ↓ + 1 spin-↓ atoms. In the case k ↑ = 0, this expression reduces to The first term in the above expression is the noninteracting imbalance, while the second one corresponds to the interaction effect. Intuitively, for any spin-↓ atom, the probability that it interacts with a spin-↑ atom is proportional to N ↑ and therefore, the first term in the sum is linear in λ ↑ . A CDW initial state with unequal spin populations must therefore show an enhanced (reduced) interaction effect on the minority (majority) spin component. This simple prediction can be directly tested using our cold-atom quantum simulator.
To quantify the effect of interactions, we consider the height of the primary peak in the Fourier spectra of the measured imbalance time traces [20]. As shown in Fig. 2(a), finite interactions induce additional frequency components in the dynamics, thereby reducing the strength of the primary peak in Fourier space. We study the interaction effect for initial CDWs with variable spin imbalance λ ↓ . Fig. 3(a) and Fig. 3(c) show that a spin imbalance in the CDW indeed produces an enhanced interaction effect on the minority spin component and the results agree quantitatively with the predictions of the method. The interaction-induced side-peak in the Fourier spectrum is clearly visible, both in the data and the numerical simulations [ Fig. 3(b)]. These a The growth of the bipartite entanglement entropy computed for a single few-body state in the approximate method using = 7 and various values of q = (q ↑ , q ↓ ) for the Aubry-André Hamiltonian with ∆ = 8J. Similar plot for ∆ = 4J is shown in the inset. See Fig. 11 in the appendix for a comparison of the entropy with TEBD computations. b The correlation Cij = ninj − ni nj in the single particle case for the Aubry-André model. c The steady-state value of the corre-lationCij for various ∆ and q with = 7 for an interacting system, showing that the convergence is the slowest near the critical point, ∆ ≈ 3.5J. The same color code as in a is used. The interaction strength used in all the computations in this figure is U = 5J (see appendix F for more details).
side peaks correspond to energy shifts in the many-body eigenspectra that appear due to interactions [20].
Having an efficient numerical method at hand, we can now systematically explore the frequency-resolved features that appear in the interacting dynamics, which would otherwise be an arduous task. Following Fig. 2, the time dynamics of the imbalance is well approximated by the approximate method with = 7, k ↓ = 0 and k ↑ = 4 for ∆ = 3J, up to 700τ . This corresponds to a resolution of 10 −2 J in the Fourier spectrum. In Fig. 3(d), we show the Fourier spectrum of the imbalance for ∆ = 3J for 100 values of the interaction strength ranging between 0 and 20J. The main peak at 3J corresponds to the primary frequency of the Bloch oscillations. The side peaks are placed fairly symmetrical around this main peak. In the limit of weak interactions (U < 2J) the dominant side peaks are ∝ U away from the main peak and in the limit of strong interactions, they are ∝ 1/U away from the main peak. In these two limits, the prediction of the approximate method agrees with the perturbative estimate, 4J 2 U/(∆ 2 − U 2 ) of the side peaks. This is the first non-zero correction to the energy obtained by treating J as a perturbation [20,67]. Note that at U = ∆, when this perturbative estimate breaks down, the approximate method reveals a rich structure in the spectrum.
So far, we have considered charge imbalance as the main observable to study the many-body dynamics, since it is easily accessible in experiments [15]. We now turn to the bipartite entanglement entropy (EE), which is widely employed to study localization dynamics in many-body systems. It is known that for thermal systems, the EE rapidly increases and saturates to a thermal value, proportional to the volume of the smaller of the two parts of the lattice. In many-body localized systems, the EE grows logarithmically in time before saturating to a value lower than the thermal value [68][69][70][71].
Here we show that the approximations behind our efficient simulations allow us to gain additional insight into the microscopic processes underlying the dynamics and growth of the EE. Indeed, we find that the logarithmic growth at initial times stems mostly from few-body physics. As a reminder, the many-body quantum state in our description is constructed by patching together several few-body states reduced into a single-atom mixed state [Eq. (5)]. Each few-body state is associated with a short sublattice of size 2 + 1 centered at the initial position of the corresponding atom. Computing the EE would require us to compute the full many-body state, which can only be defined by a canonical choice (see appendix H for details). However, constructing the full many-body state and computing its EE is hindered by significant mathematical and computational challenges. Therefore, here we use a simple estimate of the EE. If this short lattice does not contain the center of the full lattice, then we expect that this particular few-body state does not affect the total half-chain EE. Therefore in order to understand the qualitative properties of the EE, we consider only one few-body state, whose associated short lattice is placed at the center of the full lattice. This state would have the largest EE among all the few body states, whose associated short lattice intersects with the center of the full lattice.
We show that the characteristic logarithmic growth of the bipartite EE (computed from a single few-body state with q ↓ +1 spin-↓ atoms and q ↑ spin-↑ atoms) in the localized case arises from few-body processes [ Fig. 4(a)] and quantitatively agrees with TEBD calculations, shown in Fig. 11 in the appendix. While the logarithmic growth is visible already for q ↑ = 1, i.e. two particles, the steadystate value is near convergence by q ↑ = 4.

VI. LIMITATIONS OF THE APPROXIMATE DESCRIPTION
We next consider the conditions under which our approximation ansatz is ineffective, i.e., fails to converge within the accessible range of the parameters , k ↓ and k ↑ , suggesting future experimental work to explore manybody dynamics under these conditions. The parameter represents the dynamical range, i.e., the spatial extent explored by each atom. k σ + kσ + 1 can be interpreted as the maximum entanglement depth of the state [72]. Therefore, the dynamics that break the approximate method would necessarily involve a large dynamical range for each atom and produce a large entanglement depth [73,74]. Therefore, one of the technological challenges in developing quantum simulators is the high fidelity creation and control of states with a large entanglement [75].
The above arguments appear to suggest that the manybody dynamics considered in this paper is the hardest to simulate using our method when the system is fully delocalized. However, on the contrary, we demonstrate that the approximate method is effective in simulating local or few body observables when the localization length is small or large, but ineffective when it is intermediate. It has been identified that the vicinity of the phase transition offers many physically interesting effects such as slow dynamics [76][77][78][79] and anomalous diffusion [80,81].
We consider the Aubry-André model and study the convergence of the density-density correlation C ij = n inj − n i n j between site i and site j. The correlation decays down to zero for large |i − j| when the system is localized and to a non-zero value when the system is delocalized as illustrated in Fig. 4(b) for a simple non-interacting Aubry-André model. Based on this observation, we consider an interacting Aubry-André model and study the convergence of the plateau value,C ij of the correlation C ij for various disorder strengths. Although we can compute the correlation C ij for the full manybody state using our approximate method, for the purpose of the study of the convergence it suffices to consider a single few body state. Note that if the absolute change in the correlator due to increasing k σ , kσ is small, then the corresponding change in the quantum state will also be small. As shown in Fig. 4(c), the convergence of the quantum state is fast for small and large disorders, and slow in the intermediate regime, i.e., when 3J < ∆ < 4J.
In a nutshell, many-body dynamics with small and large number of modes is likely to be approximable either by an effective model or by a thermal ensemble, when we consider local or few body observables. Dynamics with intermediate number of modes, however, constitute the quintessential hard to simulate regime using the approximate method. While the correlator C ij is not a local observable (i.e., it cannot be extracted from the on-site reduced density matrix), it can still be extracted using the reduced density matrix corresponding to two sitesit is a few-body observable. If the many-body system is thermal, the reduced density matrix corresponding to two sites i and j lies in the vicinity of the corresponding Gibbs state (as long as the localization length is larger than the distance i − j). Therefore, approximation methods can be quite effective in studying local observables and two point correlations [50,82,83].

VII. CONCLUSIONS
To conclude, we have shown how a near-term quantum device can be used to quantitatively solve for the time dynamics of a quantum many-body system, which is otherwise inaccessible. In particular, we have developed an efficient approximate numerical method for localized Fermi-Hubbard systems and benchmarked it using a quantum simulator. Although our method is built to study time dynamics, based on a Wick rotation [84] e −iĤt → e −βĤ we can adapt it to study thermodynamic properties in the intermediate to low temperature regime. Moreover we can use our method to study the effect of open system dynamics in a localized Fermi-Hubbard system [67], which involves solving Lindblad equations.
One of the tools that has not been used in this method so far is acceleration of convergence, which is effective in extrapolating well-behaved sequences [44,85]. In the future, it would be interesting to apply this technique to systems where the dynamics does not converge within the accessible range of k ↑ and k ↓ , for instance, 2D Fermi-Hubbard and Bose-Hubbard models, where the number of participating atoms is not strictly bounded by the dynamical range. Moreover, one can combine MPS methods with our approximate method to enhance the efficiency for such systems. Eq. (8) indicates that the imbalance of ↓-atoms is polynomial in λ ↑ and its degree represents the entanglement depth that develops in the dynamics. Therefore, exploring the effect of interactions in relation to the spin-imbalance would reveal the number of atoms participating in the dynamics.
One of the central technological challenges in neutral atom quantum simulators is to benchmark and optimize the experimentally accessible coherence length of the dynamics. Our approximate method provides a quantitative estimate of the range (i.e., the value of at convergence) and the number of atoms (i.e., the value of k ↑ and k ↓ at convergence) contributing to the observed dynamics. We can therefore use our method to set a lower bound on the coherence length accessible in the experimental system. Moreover, using our method we can identify dynamical features that require a certain coherence range to be observed, which can then be used to optimize the experimental system.
Acknowledgments We thank D.
Abanin, E. Mueller, P. Sala  In this section, we discuss the limitations of two of the most common numerical methods, -exact diagonalization (ED) and time-evolved block decimation (TEBD), when applied to the problem of computing the time evolution of our system.
The limitations of ED fundamentally come from the exponentially growing size of the Hilbert space. On our desktop computer with a RAM of 32 GB, we are at-best able to simulate a system with L = 21 sites and N ↑ = N ↓ = 5 atoms, without using a Lanczos algorithm (the code is available at https://gitlab. physik.uni-muenchen.de/LDAP_ag-bec-fermi1/ exact-diagonalization-for-fermi-hubbard-model).
The corresponding Hilbert space dimension is ∼ 4 × 10 8 . See appendix J for the algorithm used. Given the exponential growth of the Hilbert space dimension with N ↑ and N ↓ , hardware improvements and methods such as Lanczos algorithm [86] will only marginally enhance the accessible values of L.
Unlike ED, it is not straightforward to estimate the limitations of TEBD for a specific problem. In TEBD, the precision of computation is traded out for system size. One can in principle compute the time dynamics for a larger system, while incurring a higher error. Moreover the corresponding error cannot be estimated accurately. One can obtain a lower bound for the fidelity of the full quantum state using the discarded weights in each step in TEBD. However, the global fidelity can be low for several reasons and therefore it does not reliably capture the error in local observables like the imbalance, that we are actually interested in. The error in a local observable is better captured by the local fidelity. If |ψ 1 and |ψ 2 are two many-body states, their local fidelity corresponding to a given site is defined using the Uhlmann fidelity [54,55,88] between the reduced density matrices ρ 1 , ρ 2 corresponding to that site, Here, the square-root of a positive-semidefinite hermitian matrix ρ is defined as the unique hermitian matrix whose eigenvalues are the positive square-roots of the eigenvalues of ρ and whose eigenvectors are the same as that of ρ. While the local fidelity is bounded from below by the global fidelity, nothing more can be said of the relation between the two. For instance, while An interesting problem, therefore, is to find an estimate for the local fidelity in a time evolution under TEBD. While we leave a comprehensive investigation of this problem for a future work, here we make a heuristic argument to suggest that for a localized system with a localization length of ξ, f 2ξ L , where f is the global fidelity, is a likely estimate for the local fidelity in a TEBD computation. We assume that, if the system remains localized with a localization length of ξ, a truncation at the bond between sites i and i + 1 will affect the local fidelity at site j only when |i − j| ≤ ξ. Based on this assumption we use f 2ξ L as an estimate for the local fidelity. In Fig. 5 we show that the imbalance time trace computed using TEBD deviates from the experimental data roughly around the same time when the estimate f 2 L est deviates from 1 (we set ξ = 1 because the single particle localization length for ∆ ↓ = 3J is about one site. ξ is expected to be slightly higher due to interactions, but this will only lower f 2 L est ). Here f est is the lower bound on the global fidelity. We therefore use this quantity in Fig. 1 of the main text. See refs. [89,90] for some related ideas.
The time trace in Fig. 5, going up to 300τ was computed using a TEBD with a bond dimension χ = 500, L tebd = 100 and it took 269 hours on our desktop computer. We note that while it maybe possible to compute a time trace for L exp = 290 (the experimental system size) and improve the precision to longer time by choosing a larger bond dimension, it will take an inconvenient amount of computational time. The number of pure CDW states is L/2 L/4 , exponential in the system size. Averaging over all of these states, therefore, would not be possible in a TEBD computation. However, this average can be approximated using the following technique. We map the four possible states of a single site to four states of chain of 4−level systems: |vac → |0 , |↑ → |1 , |↓ → |2 and a doublon, |↑↓ → |3 . We then construct the product state Here, the phases φ 1 , φ 2 , · · · are randomly chosen. For large system sizes, a few avarages of such states quickly converges to a mixed CDW. We describe how this issue of averaging over all pure CDW states is handled in our approximate method in appendix I. In this section, we provide an intuitive explanation of how our approximate method can be linked to TEBD. The imbalance can be expressed as a linear sum of local observables. In fact, ifn i,↓ is the on-site number operator for spin-↓, the imbalance can be written as: Therefore, the error in the computation of the imbalance using TEBD will depend only on the error in the local observablen i,↓ . The error in n i,↓ (t) would depend only on the local fidelity at site i. The standard TEBD is optimized on the global fidelity rather than a local fidelity. That is, the bond dimension is chosen so as to minimize the loss in the global fidelity, while paying no special attention to any particular local fidelity. We can modify the standard TEBD method so that it can be optimized on specific local fidelities. For reasons that will be clear soon, let us assume that we use independent TEBD computations to evaluate n i,↓ (t) for each i. Following the arguments in the previous section, truncations of bonds far away from site i will not contribute to the error in n i,↓ (t) . Therefore, to optimize the computational performance, we may choose a spatially varying bond dimension, that takes a high value around site i and gets lower as we move away from this site [50]. In order to implement such a TEBD scheme, we will have to compute each n i,↓ (t) independently. This will add N ↓ = O(L) overhead to the computation which can be easily offset by the significant advantage that we obtain by choosing the bond dimension χ to be spatially varying. In particular, if we choose χ = 4 between sites i − and i + and χ = 1 for the rest of the bonds, it reduces to our approximate method with κ ↓ = κ ↑ = 2 .
The key feature of our approximate method, however is that κ ↑ and κ ↓ can be chosen to be different from . This feature does not appear naturally in the above described TEBD scheme. These considerations suggest an alternate formulation of the approximate method. By replacing approximation-3 with an MPS ansatz, we trade the parameter for the bond dimension. In this alternative formulation, we compute the time evolved few-body state |ψ (t) on the full lattice of size L using TEBD, with an appropriately chosen spatially varying bond dimension.

Appendix C: Contrasting our work with cluster expansion
In this section, we discuss how our approximate method is related to the well-known cluster expansion method [44,45] for higher dimensional systems. While to our best knowledge there are no non-trivial cluster expansions for a 1D system with only nearest neighbour hopping [63][64][65], it is possible to construct non-trivial clusters in the case of spinful fermions. We show below that in comparison to this approach, our approximate method allows for a better optimization of the computational performance.
The cluster expansion method is suitable for 2D and 3D lattice systems. It comes from the observation that in the expansion for e −βH = 1 − βH + 1 2 β 2 H 2 + · · · , the rth term (i.e., (−1) r β r H r /r!) corresponds to paths in the lattice with r steps. In 2D and 3D the paths with r steps form a non-trivial collection of subsets of the lattice. One can use it to construct a nested collection of subsets of the lattice and build a cluster expansion.
where the approximate method reduces to the standard cluster expansion. At 2 = κ ↑ = κ ↓ = L, it reduces to exact diagonalization (ED), for periodic boundary conditions on the full lattice. In the non-interacting case, the approximate method reduces to the exact calculation when = L/2, κ ↑ = κ ↓ = 0. The orange line indicates the points with κ ↑ = κ ↓ = 0, that can be used to approximate the non-interacting dynamics (see Fig. 9 for a convergence analysis for this case).
Although our system is a 1D lattice we can make use of the internal states of the atoms to map it to a ladder-like system. This ladder, to some extent, allows for a nontrivial cluster expansion. While this idea has not been explored so far, a related idea has been studied for the case of a 1D system with next-nearest neighbour hopping [64].
We may represent the parameters , κ ↑ , κ ↓ of our approximate method in a 3D space (Fig. 6). The direct (trivial) application of a cluster expansion to our system is represented by the line 2 = κ ↑ = κ ↓ . The line κ ↑ = κ ↓ = 0 represents parameters that can be used to approximate non-interacting dynamics. See Fig. 9 for a convergence analysis on this line. As we mentioned before, the key feature of our approximate method is that the atom numbers (κ ↑ , κ ↓ ) can be varied independent of . This not only allows for a better optimization of computational performance, but also reveals a physically meaningful information, i.e., the entanglement depth in the system.
Appendix D: The role of k ↓ One of the parameters whose effect we have not explored in detail in the main text is k ↓ . In this section we  briefly discuss this parameter. As we mentioned before, in the κ ↑ − κ ↓ parameter space for a fixed , the point (κ ↑ , κ ↓ ) = (2 , 2 ) is where the error in the computation is minimized. Depending on the nature of the dynamics and the observable of interest, there may be an optimal trajectory towards this point in the κ ↑ − κ ↓ that maximizes the convergence rate. We note that in general, finding this optimal trajectory can be very complicated. We therefore restrict our discussion to the data shown in Fig. 2 of the main text. As before, we work with k ↓ , i.e., the number of even sites inside the κ ↓ -shell, as this parameter is more convenient and physically meaningful. See Table I for a list of k σ values and the corresponding κ σ values. In Fig. 7(b,c), we demonstrate that the impact of having a higher k ↓ is not significant on the scale of the experimental errorbars for the parameters from Fig. 2 of the main text. We attribute this to the spin polarized islands that may be present in an incoherent CDW initial state. Note that in a polarized spin-↓ system the other spin-↓ atoms are irrelevant for the dynamics of a given spin-↓ atom. Indeed, if we replace the initial incoherent CDW with a Néel type CDW initial state (i.e., • ↑ • ↓ • ↑ • ↓ • · · · ), which prevents spin-polarized islands, the effect of k ↓ is more significant (Fig. 7(a)). In Fig. 8, we show the imbalance steady state value, averaged between 300τ and 330τ as in Fig. 2c of the main text, but for a larger range of ∆ ↓ . Betwen ∆ ↓ = 2.5J and ∆ ↓ = 3.5J, one can see a weak effect of k ↓ , as the experiment agrees better with the computation corresponding to = 7, k ↑ = 4 and k ↓ = 1. This also shows a feature around ∆ = 3J, corresponding to a U = 2∆ resonance, explored in detail in Ref. [67].

Appendix E: Experimental details
The quantum simulator, i.e., the experimental system consists of a 3D lattice formed by three laser beams, one of which along the x-axis, has a wavelength of 532 nm (this is the primary lattice) and the other two along the y and z-axes have a wavelength of 738 nm each. The latter two lattices are deep, preventing any hopping in the y or the z-directions within the timescale of the experiment. If J is the hopping rate along the primary axis the hopping along the orthogonal axes is about ∼ 3 × 10 −4 J. Therefore, our system can be considered as a set of independent 1D tubes with a lattice spacing of 266 nm and length L ≈ 290 ± 10 sites [20]. We have an additional  Figure 7. The effect of k ↓ . a Imbalance time trace with a Néel type initial state computed using the approximate method for various values of (k ↓ , k ↑ ). We can see that the effect of increasing k ↑ on the time trace decreases progressively. The parameters used are ∆ = 3J and U = 5J. b Similar computation for an incoherent CDW initial state with the parameters used in Fig. 2(a) of the main text. Note that the effect of k ↓ is weaker. c similar computation of a long time trace, for the parameters used in Fig. 2(d) of the main text.
lattice in the x-axis created by a 1064 nm laser to form a bichromatic superlattice, which we use for the preparation of a charge density wave and measurement of the imbalance (see appendix E 1 and E 3). We load the lattice with about ∼ 50(5) × 10 3 40 K atoms at a temperature of = 0.15(1)T F , where T F is the Fermi temperature. The internal states |F = 9/2; m F = −9/2 and |F = 9/2; m F = −7/2 of 40 K are used as the spin-↓ and spin-↑ states respectively. We use the magnetic Feshbach resonance at 202.1 G between these two states to control the Hubbard interaction in the lattice. We apply two classes of on-site potentials; a quasirandom potential, V i,σ = ∆ cos(2πβi + φ) + α(i − L/2) 2 , i.e., the Aubry-André model and a linear potential V i,σ = i∆ σ +α(i−L/2) 2 i.e., the Stark model. The Aubry-André model is realized by an incommensurate lattice along the x-axis, with a wavelength of 738 nm, which introduces a quasi periodic on-site potential. The Stark model is realized by a magnetic field gradient along the x-axis, produced by a single current carrying coil. The weak quadratic term, α(i − L/2) 2 stems from an additional harmonic confinement induced by the optical dipole potentials. Typical values are α ≈ h × 216 mHz. In the tight-binding limit, the Hamiltonian of this system is given by Eq. (1) of the main text. The nearest neighbour hopping J is controlled by the depth of the primary lattice. It is typically between h × 200 Hz and h × 500 Hz.

Initial state preparation
To prepare the initial state, we load the atoms repulsively, at a scattering length of a = 100a 0 into a threedimensional (3D) optical lattice [20]. Loading the atoms repulsively suppresses the formation of doublons to a large extent and we are left with < ∼ 15% doubly occupied sites. Moreover, we eliminate any residual doublons by applying a 100 µs near-resonant light pulse right after loading the deep lattice [52], which causes light assisted collisions, removing the doublons without affecting the singlons. With deep orthogonal lattices, we can consider the system as a collection of 1D tubes. We characterize the 4σ width of the central tubes to L exp = 290 sites, using a Gaussian fit to an in-situ image of the atoms. Along the two orthogonal axes, we estimate about 150 sites and 22 sites respectively.
In order to obtain a CDW pattern in our initial state, we make use of an adiabatic ramp of the phase between the lasers forming the short (λ s = 532 nm) and the long (λ l = 1064 nm) lattices along the x-axis, also known as the superlattice phase. The atoms are loaded into the long lattice. We then ramp up the power of the short lattice with a superlattice phase of φ = 0.44π in about 200 µs. A symmetric double-well potential is realized for φ = k ·π, for integer k. The chosen phase creates strongly tilted double wells with the atom that was previously loaded into the long lattice located on the low-energy site of each double well (even site), while the high energy site (odd site) is empty. We then ramp down the power of the long lattice, which remains switched off during the time evolution.
In order to prepare a spin-imbalanced charge-density wave, we make use of an adiabatic sweep of an RF signal coupling the states |F = 9/2; m F = −9/2 and |F = 9/2; m F = −7/2 . The atoms are initially cooled in the |F = 9/2; m F = −9/2 state and before loading into the lattice, we induce an RF sweep that produces the desired incoherent mixture of the two internal states. We control the weights in the mixture using the power of the RF signal. Such a sweep of the RF frequency transports the system across an avoided level crossing, where the minimal gap is controlled by the power of the RF signal. Therefore, the probability of finding the atoms in the |F = 9/2; m F = −7/2 after the sweep is given by the Landau-Zener equation, f 7/2 = e −x/a , where x is the power of the RF signal and f 7/2 is the fraction of the atoms in the |F = 9/2; m F = −7/2 state (Fig. 10). We obtain a fit value of a = 0.44 mW. The adiabatic RF sweep is applied before loading the atoms into the lattice. After loading, the state of a single 1D tube can be described as Note that a global phase factor has been dropped and |0 represents an empty site. The phases φ i are acquired during the sweep, depending on the local magnetic field. Therefore they phases vary along the lattice, across the 250 tubes and between the experimental realizations and can be considered as independent random values. Accounting for this averaging, the initial state is modelled as the following mixed state: This is a mixture of all incoherent CDWs with various spin imbalance, weighted by a binomial distribution, i.e., a CDW with N ↑ spin ↑ atoms appears with a probability With L exp /2 ≈ 145, this distribution has a very sharp peak at N ↑ = f 7/2 L exp /2. Thus, the mixed state is very well modelled by an incoherent mixed state of all spin configurations with N ↑ = f 7/2 L exp /2 and N ↓ = (1−f 7/2 )L exp /2.

Calibration
We use the non-interacting dynamics to calibrate the parameters J, ∆ σ and α in Eq. (1) of the main text. In the Stark model the imbalance I σ oscillates at frequency ∆ σ (see Ref. [20] for details). In the non-interacting case (i.e., when U = 0), this dynamics can be described analytically [91] using Bessel functions. For an atom starting on the i-th site at t = 0, the occupation probability at where J ν is the ν-th order Bessel function of the first kind. The confinement α introduces a damping of the Bloch oscillations. In fact, the Bloch oscillations develop a beat-note envelope with a frequency Lα. This is the result of the effective tilt varying from ∆ σ − Lα/2 to ∆ σ + Lα/2, through the lattice. In our experiment, Lα ≈ h × 60 Hz. We choose the tilt ∆ σ between 0.5 kHz and 2.0 kHz. We calibrate these parameters using the imbalance time trace in the non-interacting case (Fig. 2a of the main text).

Measurement
In order to measure the spin resolved imbalance, we apply a band transfer technique in the superlattice [92,93] along with a Stern-Gerlach gradient. In the band transfer technique, we map the atoms on odd sites into the third band of the long lattice, and atoms on even sites remain in the first band. In order to accomplish this, we use a superlattice phase of φ = 0.15π, while ramping up the long lattice and ramping down the short lattice. Finally, we perform bandmapping and Stern-Gerlach resolved absorption imaging to evaluate the spin-resolved imbalance, at a finite time of flight. We use the same coil to produce the Stern-Gerlach gradient and the magnetic field gradient during time evolution.
The spatial separation between |F = 9/2; m F = −9/2 and |F = 9/2; m F = −7/2 achieved during the Stern-Gerlach separation is not large enough to obtain absorption images where we can distinguish between the two spin states. Therefore, prior to the band transfer, we apply a Landau-Zener sweep to convert atoms from |F = 9/2; m F = −7/2 to |F = 9/2; m F = −5/2 . The ambient magnetic field during this sweep is 231.6 G, corresponding to the zero crossing of the Feshbach resonance between the two states |F = 9/2; m F = −9/2 and |F = 9/2; m F = −5/2 , centered around 224.2 G. We perform a linear frequency ramp with a duration of 10 ms centered at 51.87 MHz with a deviation of 1 MHz. Switching off interactions between these two states ensures the absence of interband oscillations after the transfer to the third band.
To calibrate out the systematic imperfections in the detection sequence, we take two different sets of images. The first set is a measurement with no evolution time. Ideally this measurement should give an imbalance of 1. Due to systematics, we obtain normalized populations n σ e,1 and n σ o,1 in the even and odd sites and the raw imbalance is around 0.92 (2). The second set measures the imbalance after 25 ms evolution time with zero tilt, which should ideally correspond to a zero imbalance. We obtain populations n σ e,2 and n σ o,2 in even and odd sites in this measurement. We then construct a matrix that maps the measured populations for these two sets to the ideal value. That is, we determine the 2 × 2-matrix A σ , for each state σ =↑, ↓, such that This matrix is used to rescale the measured populations for each spin component.

Appendix F: Details of computations and measurements
In this section, we provide the details of the measurements and computations shown in the main text. The following sub sections refer to figures in the main text. Figure 1 In Fig. 1(b) of the main text, the dashed line represents the value of t at which the estimate of the local fidelity, i.e., f 2ξ L est with ξ = 1 dips below 0.9 for typical bond dimensions. Here, f est is the fidelity estimated based on the truncations in TEBD. See appendix A for details of why this expression is used. The typical value of ∆ ↓ for the data used in the paper is 3J and ξ ∼ 1 for this value. We used a Néel-type initial state and a Stark Hamiltonian with ∆ = 3J, U = 5J for this calculation, since this represents the typical parameter regime studied in the rest of the paper. In this plot we use the local fidelity estimated for χ = 500 and L = 100. The violet shades represent the dimension of the Hilbert space for N ↑ = N ↓ = L/4. Figure 2 In Fig. 2(a) of the main text, the blue curve represents the least square fit for data corresponding to U = 0. The fit values were ∆ ↓ = 0.928(2) kHz, J = 0.77(1) kHz and the revival time T r = 14.510(6) ms (the confinement α is related to the revival time as α = 1 2LTr ). The dataset included 101 points in time evenly distributed between t = 0 and t = 10 ms. Each data point was averaged 4 times and the errorbars represent the standard deviation. These values of the parameters were used for the computation of a time trace with U = J, also shown in Fig. 2(a). The initial state was an incoherent CDW (see appendix I for details on how the mixed CDW state is constructed). The system size used in the computation was L apx = 280. The parameters used were = 7, k ↑ = 3 and k ↓ = 0.

Main text
In Fig. 2(b), the fit parameters obtained using a non-interacting dataset were J = 1.502(2) kHz, ∆ ↓ = 1.260(5) kHz and T r = 8.41 (2) ms. Both the interacting and the non-interacting datasets consists of n = 81 points in time sampled uniformly between t = 0 and t = 4 ms, averaged 4 times at each point. The interaction strength was U = 3J. In the computation using our approximate method, we use L apx = 280 and k ↓ = 0 for various values of and k ↑ . The RMS deviation is computed in its discrete form: Here, t i are the points in time at which the data is taken and n = 81. The errorbars on the RMS is defined as Here, ∆I exp (t i ) is the standard deviation at t i , extracted from 4 data points. We show this RMS for = 3, · · · , 9, k ↓ = 0 and k ↑ = 0, 1, 2, 3 in Fig. 2(b) of the main text.
In Fig. 2(c) the fit parameter is J = 0.54(1) kHz. Here, the experimental data was averaged 20 times and the errorbars represent the standard error of the mean. We use L apx = 100 and T r = 7.5 ms for all computations in Fig.  2(c). We also account for an averaging over J, caused due to a Gaussian beam profile of the primary lattice laser [94]. Different 1D tubes have different value of J in the x-axis due to variation of the power of the lattice laser in the y − z plane. To account for this effect, we computed a weighted average over 4 values of J ranging up to 0.8 times the value at the center. In the inset, the black dashed line is obtained by a polynomial extrapolation [85] of the sequence of imbalance computed for different k ↑ . In this extrapolation, we assume that the convergence of the approximate method is described by a power series expansion in 1/k ↑ . Accordingly, we fit the imbalance to a + b/k ↑ an use a as the extrapolated convergence value.
In Fig. 2(e), the parameters are J = 0.560(6) kHz. This value was calibrated using Kapitza-Dirac scattering with a Bose-Einstein condensate of 87 Rb atoms loaded into the same lattice. The detuning strength was ∆ = 3.00(6) J also calibrated using the same method. Each data point was averaged 10 times and the errorbars represent the standard error of the mean. The numerical calculation used L apx = 280 and was averaged over 48 evenly spaced values of the detuning phase between 0 and 2π. The range of the interaction strength U was [−20J, 20J] with a step size of J. In this dataset, we use a calibration factor of 1.1 on the imbalance to account for the underestimation due to imperfections. In all other datasets, we use a more sophisticated calibration method as described in appendix E 3. Figure 3 In Fig. 3(a),and (b), the data shown is a Fourier transform of the imbalance with 80 data points in time ranging between 0 and t = 8 ms = 25τ . The data was averaged four times. The errorbars represent the standard deviation, propagated appropriately. The system parameters, obtained by fitting a non-interacting dataset are, J = 0.54(1) kHz, ∆ ↓ = 1.60(1) kHz, ∆ ↑ = 1.44(1) kHz and T r = 9.00(3) ms. The interaction strength was U = 3J. In the calculation (i.e., solid lines), we use a system size of L apx = 100 and parameters = 4, k ↓ = 0 and k ↑ = 2. The numerical sampling was the same (i.e., 80 data points) as the experiment so that we can make a comparison.
In Fig. 3(d), we compute the Fourier spectra of the imbalance for various interaction strengths. We use a system size of L apx = 280 and parameters = 7, k ↓ = 0 and k ↑ = 4. We begin with a CDW (see appendix I for more details) with 50% population in each spin and time evolve it under the interacting Stark Hamiltonian with ∆ = 3J. and α = 0. We use J = 0.5 kHz and go up to ∼ 700τ , sampled at 2000 points. Therefore, in the Fourier space it corresponds to a step of 5 Hz or 10 −2 J. This computation is done for 100 values of the interaction, placed uniformly between U = 0 and U = 20J. 4. Main text Figure 4 In Fig. 4(a), we compute the entanglement entropy for one few-body state. We begin with a lattice of size L = 15 (this corresponds to = 7) with a spin-↓ atom on site i 0 = 8 and q ↑ spin-↑ atoms filled in i 0 +2, i 0 −2, i 0 +4, i 0 − 4, · · · , in that order. In the case q ↓ = 1, the other spin-↓ atom is placed on site i 0 + 4. We then compute the time evolution of this state under the interacting Aubry-André model with U = 5J and two values of ∆ (4J and 8J). We Entanglement Entropy apx, U=7.0J MPS, U=0.0J MPS, U=7.0J Figure 11. Entanglement entropy.
The entanglement entropy computed using one few-body state with = 7, qσ = 0 and qσ = 4. The parameters were ∆ = 5J and U = 7J. The dark blue curve shows the TEBD calculation for a similar parameter regime reproduced from Ref. [15]. then compute the time-dependent entanglement entropy sampled approximately after each tunnelling time. We then average this entanglement entropy over 12 values of the detuning phase, placed uniformly between 0 and 2π.
For a comparison, we show the entanglement entropy computed as described above with the entanglement entropy computed using TEBD, taken from Ref. [15] in Fig. 11. Here we use ∆ = 5J and U = 7J and average it over 24 detuning phases.
In Fig. 4(b), we compute the non-interacting, timeaveraged correlation for two detuning strengths of the Aubry-André model. We begin with one atom positioned on site i 0 = 2 of a lattice with L = 19 sites. We compute the time evolution of this state under the Aubry-André Hamiltonian up to 700τ , sampled at 500 points using exact diagonalization. We then compute the time dependent correlator C i0,j (t) = n i0nj − n i0 n j . Here, n j is the total occupancy (i.e., both the spins included) on site j. We then compute the time average followed by the disorder phase average over 20 detuning phases placed uniformly between 0 and 2π, of |C i0j |. We then plot this average against j − i 0 in the figure.
In Fig. 4(c), we compute the plateau value of the correlation for an interacting system for various values of detuning strength. Similar to the previous figure, we begin with an initial state with a spin-↓ atoms at position i 0 = 1 on a lattice with size L = 13 sites (see Fig. 12 for a convergence analysis). We place q ↑ number of spin-↑ atoms in the odd sites after i 0 , that is, on i 0 + 2, i 0 + 4, · · · . We use q ↑ = 0, 1, 2, 3 and 4. We evolve this initial state under the interacting Aubry-André Hamiltonian for t = 700τ , sampled at 500 points with an interaction strength U = 5J. Similar to the previous figure, we compute the time and detuning phase average of the correlator C i0,j over 20 values of the latter. Additionally, for this figure, we extract the plateau value of the averaged C i0,j by further averaging it over the second half of the range of j. That is, over j = 7, · · · , 13. We compute this plateau value for 25 values of the detuning strength, ∆, placed uniformly between ∆ = J and ∆ = 10J. We show this plateau value against ∆ in the figure.
Appendix G: The occupancy matrix In this section, we derive Eq. (3) in the main text. Let us assume N ↓ = 0 and the N ↑ spin-↑ atoms at t = 0 are on sites i 1 , · · · , i N ↑ of the lattice. Let φ i1 (t), φ i2 (t), · · · , φ i N ↑ (t) be the time evolved states of the atoms. Clearly, φ i (t), φ j (t) = δ ij . The manybody state ψ is given by the antisymmetrized product of φ i1 (t), φ i2 (t), · · · , φ i N ↑ (t). That is Here S N ↑ is the symmetric group and sgn(µ) is the sign of a permutation µ ∈ S N ↑ . For convenience, we define vectors ψ i =ĉ i,↑ ψ. The occupancy matrix Γ ↑ is given by Following φ i , φ j = δ ij , we may write, Appendix H: Canonical construction of the many-body quantum state The approximate method produces L × L single-particle density matrices Γ ↑,1 , · · · , Γ ↑,N ↑ and Γ ↓,1 , · · · , Γ ↓,N ↓ corresponding to the individual atoms at any given time. These matrices represent a very small proportion of the information in the quantum system. As a result, while it is desirable to construct a many-body quantum state starting from these singleparticle density matrices, it cannot be done in a unique way. There will be multiple many-body quantum states that all map to the same set of single-particle density matrices. Nevertheless, it is worthwhile to construct a canonical many-body quantum state starting from the set of single-particle density matrices. Moreover, we are helped by the fact that the many-body quantum state is pure -this reduces the ambiguity.
Here, we discuss one possible method of constructing such a many-body quantum state. We also discuss Eq. (4) in the main text and possible corrections to it. For simplicity, we restrict to the case k ↓ = 0 in the computation of Γ ↓,j and k ↑ = 0 in the computation of Γ ↑,j . The general case is more involved.
Note that, insofar as Γ σ,i represents localized states, the higher order terms are insignificant (see Fig. 13).
We now address the question of how to construct a many-body state corresponding to N ↑ spin-↑ atoms and N ↓ spin-↓ atoms in the lattice with L sites, starting from the density matrices ρ ↑ and ρ ↓ . Ideally, we would seek a pure "parent" quantum state ψ satisfying the equations Tr ↑ |ψ ψ| = ρ ↓ and Tr ↓ |ψ ψ| = ρ ↑ .
In the special case when at least one of ρ ↑ and ρ ↓ is pure, the parent state is unique, i.e., W ρ ↑ ,ρ ↓ = {ρ ↑ ⊗ ρ ↓ }. To see this let ρ ↑ = |φ ↑ φ ↑ | for some pure state φ ↑ . For any ρ ∈ W ρ ↑ ,ρ ↓ , we may consider the spectral decomposition ρ = i µ i |ψ i ψ i |. It follows from Tr ↓ ρ = In all other cases, however, W ρ ↑ ,ρ ↓ is an infinite set. One canonical choice is to pick a ρ * ∈ W ρ ↑ ,ρ ↓ with the lowest Von-Neumann entropy. This is justified because Von-Neumann entropy is a measure of purity; pure states have zero Von-Neumann entropy. However, states in W ρ ↑ ,ρ ↓ with the lowest Von-Neumann entropy are still not unique. For instance, consider the special case where ρ ↑ and ρ ↓ have the same set of non-zero eigenvalues. Let satisfies Eq. (H4) for arbitrary θ i . Thus, there is an infinitude of pure parent states, all having a zero Von-Neumann entropy. Therefore, even in the general case we may expect there to be multiple parent states with a minimal Von-Neumann entropy. However, constructing a parent state with minimal Von-Neumann entropy is in general fairly non-trivial. Below, we present a natural generalization of Eq. (H5) to the case where ρ ↑ and ρ ↓ don't necessarily have the same set of non-zero eigenvalues, resulting in an impure state ρ * ∈ W ρ ↑ ,ρ ↓ , but nevertheless having a relatively low Von-Neumann entropy.
Let us assume that n and m are the ranks of ρ ↑ and ρ ↓ respectively. While there is no pure parent many-body state , we show that ρ ↑ and ρ ↓ can be written as a sum of matrices that have pure parents: Such that each ρ σ,i is positive semi-definite (not necessarily normalized) matrices and ρ ↑,i and ρ ↓,i have the same set of non-zero eigenvalues, i.e., they have a pure parent ψ i (again, not necessarily normalized). In other words, We define ρ * = |ψ 1 ψ 1 | + · · · + |ψ r ψ r | (H9) as the many-body quantum state. We will argue that this is likely to be one of the parent states with minimal Von-Neumann entropy. We begin by defining ρ σ,ν recursively. For convenience of notation, we introduce remainder matrices, REM σ (ν) = ρ σ,ν+1 + · · · + ρ σ,r .
Furthermore, it follows from Eq. (H12) that, rank(REM ↑ (ν)) + rank(REM ↓ (ν)) = max(rank(REM ↑ (ν − 1)), rank(REM ↓ (ν − 1))) (H13) Thus, the sum of the ranks of REM ↑ (ν) and REM ↓ (ν) is a strictly decreasing function of ν. Therefore, the iteration terminates after a finite number of steps. The Von-Neumann entropy of ρ * [Eq. (H9)]is This follows from the observation that ψ i |ψ j = ψ i | ψ i δ ij . Note that if ρ ↑ and ρ ↓ have the same set of non-zero eigenvalues, the above construction terminates at r = 1, and ρ * will be pure. We have discussed one possible algebraic way of constructing a many-body state starting from the single particle occupancy matrices. Another possible way would be to exploit the continuity of the many-body state ρ(t) in time, given that we know ρ(0), to reduce the ambiguity introduced by Eq. (H4).
Next, we make a few comments on the general case, k σ > 0 when computing Γ σ,j . The single-particle occupancy matrix Γ σ,j (t) cannot be unambiguously defined in this case. Consider for instance, a few-body state |ψ (t) (see main text) corresponding to q σ + 1 spin-σ atoms and qσ spin-σ atoms. The spin-σ occupancy matrix computed using this state describes q σ + 1 atoms. The simple prescription used in the main text, i.e., to divide the occupancy matrix by q σ + 1 does not affect the first term in Eq. (H2). However, this could lead to repetitions in Γ σ,1 , · · · , Γ σ,Nσ making Eq. (H1) unjustified. Moreover, the higher-order terms in Eq. (H2) would include squares.
In Fig. 13, we show a computation for k σ = 1, where we resolve the above issue by simply dropping all terms that contain squares in Eq. (H2). While the result appears to agree with the exact calculation, this procedure is not on firm mathematical grounds. An alternative method would be to resolve the occupancy matrix coming from |ψ (t) into a sum of q σ +1 maximally distinct occupancy matrices, each with unit trace.

Appendix I: Computing the imbalance time trace
In this section, we describe how the imbalance of spin-σ atoms at time t, I σ (t), is computed using the approximate method for a mixed singlon charge density wave (CDW) initial state that we start with in the experiment. In particular, we show how Eq. (7) of the main text is derived. We retrict to a CDW with N ↑ + N ↓ = L/2 and no doubly-occupied sites. The more general derivation for a CDW with doublons and holes follows the same procedure.
The mixed singlon CDW is an incoherent sum of all pure states with N σ atoms in the spin-σ state, all of them in even sites of a lattice of size L, with no double occupancies. The number of such pure states is given by Nσ , where σ andσ are the two spins. The time evolved imbalance of this mixed state is the average of the time evolved imbalances of these pure states. Indeed, if ψ is a pure state, one can compute its time evolved imbalance, denoted by I σ ψ (t) using the time-evolved occupancy matrix, denoted by Γ σ ψ (t) (see main text for a definition of the occupancy matrix). IfÎ σ = L i=1 (−1) iĉ † i,σĉ i,σ is the single-particle imbalance operator, then by restricting to the first term in Eq. (H2) we obtain I σ ψ (t) = 1 Nσ Tr(Γ σ ψ (t)Î σ ) = 1 Nσ Nσ r=1 Tr(Γ σ,r ψ (t)Î σ ). Each of the Tr(Γ σ,r ψ (t)Î σ ) are independently computed and therefore it is most useful to express the total imbalance as a sum of them. Loosely speaking, Tr(Γ σ,r ψ (t)Î σ ) is the imbalance of the r-th spin-σ atom in the initial state ψ. We can now express the imbalance of the CDW in terms of these quantities. Tr(Γ σ,r ψ (t)Î σ ) (I1) Here, the outer sum is over all pure singlon CDW states ψ. This expression is a sum of N σ L/2 Nσ terms. Despite its appearance, this does not require an exponentially (in L) large number of independent computations, for not all of Tr(Γ σ,r ψ (t)Î σ ) are distinct. This quantity depends on the configuration of atoms in the neighborhood of the r-th atom in the state ψ. The same configuration could appear multiple times for different values of r and ψ. Indeed, making use of this, we show that the number of distinct terms in the sum Eq. (I1) is at-most linear in L.
Consider, for instance, the case of k σ = 0 and kσ > 0. The time trace Tr(Γ σ,r ψ (t)Î σ ) depends solely on the configuration of spin-σ atoms in the neighborhood of the r-th spin-σ atom. There are 2 kσ L/2 distinct such traces. To see this, let the position of the r-th atom be 2j for some j ∈ {1, 2, · · · , L/2} (the initial position of an atom is always even in a CDW). j can take L/2 distinct values. Corresponding to each of them, there are 2 kσ configurations of spin-σ atoms in the kσ-neighborhood of 2j (note that we are ignoring the boundary effects due to finite size). Thus, the number of distinct terms in the sum Eq. (I1) is linear in L.
The multiplicity of one such term depends on the number of spin-σ atoms in the kσ-neighborhood of 2j. We denote this number by qσ. The multiplicity C qσ then is, This is the multiplicity of terms in Eq. (I1) corresponding to a configuration with qσ spin-σ atoms in the kσneighborhood of 2j, and there are kσ qσ of such configurations. The time evolved imbalance corresponding to such a configuration is computed by performing exact diagonalization of a system with qσ spin-σ atoms and one spin-σ atoms in a short lattice of length 2 + 1 (as mentiomned in the main text, ≥ kσ). Note that the Hilbert space and the unitary operator corresponding to the time evolution is the same for all kσ qσ initial configurations with qσ spin-σ atoms in the kσ-neighborhood. Therefore, for practical convenience, we compute the time evolved imbalance of all of these configurations together. Accordingly, we define I σ 2j,qσ (t) as the imbalance averaged over all of these states.
Thus, the total imbalance of the mixed CDW is This contains 2 kσ L/2 terms. In fact, we may drop the outer sum over j in the case of the Stark model, since the dynamics of the system does not depend on the position of the center of the short lattice with 2 + 1 sites. The number of terms is then independent of L. This expression maybe compared with the cluster expansion presented in Ref. [61]. Note that the most complex computation is the one corresponding to qσ = kσ, where the dimension of the Hilbert space of the spin-σ atoms is 2 +1 kσ . The above derivation was for k σ = 0. In the general case, assuming that k σ < kσ (which is usually the chosen case, based on the convergence rates shown in the main text), we obtain the following expression upon using the same derivation procedure. For a given spin-σ atom in position 2j, and q σ additional spin-σ atoms in its k σ neighborhood and qσ spin-σ atoms in its kσ neighborhood, the multiplicity of this configuration is The number of configurations of q σ spin-σ atoms in the k σ neighborhood of a site and qσ spin-σ atoms in its kσ neighborhood is kσ qσ kσ−kσ qσ+qσ−kσ . And thus, the total imbalance is Here, I σ 2j,qσ,qσ (t) is the time evolved imbalance of spinσ atoms averaged over all configurations of q σ spin-σ atoms in a k σ neighborhood and qσ spin-σ atoms in a kσ neighborhood of the 2j-th site with a spin-σ atom at the center. There are 2 kσ such configurations for a given j this is simply the sum qσ,qσ kσ qσ kσ−kσ qσ+qσ−kσ . Thus, the number of imbalance time trace computations is 2 kσ L/2.
When N σ = Ω(L), it is convenient to introduce λ σ = 2N σ /L. We can now simplify Eq. (I5) using the Stirling approximation. For a = o(n), it follows that (n+a)!/n! ≈ n a . Thus, Note that (1 − λσ)/N σ = 2/L. It is straightforward to see that the above expression reduces to Eq. (7) of the main text.
Here • represents the Hadamard product. See Ref. [20]. for a derivation. Using Trotter-Suzuki approximation, a time step taking the state M (t) to M (t + δt) can be written as Here, e −iδt•V is the element-wise exponentiation of V . The above expression consists of three mutually commuting operations on M (t)-Hadamard product with e −iδt•V , left multiplication with e −iδtĤ ↑ and right multiplication with e −iδtĤ ↓ . The latter two are multiplications of Msized matrices. The former is an element-by-element multiplication of M-sized matrices.
It remains to compute the unitaries e −iδtĤ ↑ and e −iδtĤ ↓ . Although they are both M-sized matrices, computing them does not require an exponentiation of Msized matrices. We can make use of the structure of Fermionic systems to compute them. Let U ↑ (δt) and U ↓ (δt) be the (2 + 1) × (2 + 1) unitary propagator corresponding to time step δt of a single spin-↑ atom and a single spin-↓ atom respectively. These two are S-sized matrices obtained by exponentiating S-sized matrices. We can now express e −iδtĤ ↑ and e −iδtĤ ↓ in terms of these matrices. For instance, if β = {j 1 , · · · , j q ↓ } and β = {j 1 , · · · , j q ↓ } (e −iδtĤ ↓ ) ββ = µ∈Sq ↓ sgn(µ)U ↓ (δt) j1j µ(1) · · · U ↓ (δt) jq ↓ j µ(q ↓ ) (J4) Thus, the time dynamics can be computed involving only multiplication of M-sized matrices which is done every timestep and an exponentiation of S-sized matrices which is done once for all. One can also absorb the above expression into the product M (t)e −iδtĤ ↓ thereby never having to multiply even M-sized matrices. Nevertheless, we find that the advantage gained from this additional optimization is negligible.
In the code, starting at r = 1, we construct the few body state |ψ (0) corresponding to the r-th spin-σ atom. We then use the above described procedure to compute its time evolution. Using the time evolved few-body state |ψ (t) , we compute the occupancy matrix Γ σ,r (t). We repeat this process for r = 1, 2, · · · , N σ to compute the total occupancy matrix of spin-σ atoms, Γ σ (t). Further, we repeat the whole process for the other spin component to compute its occupancy matrix.
In order to compute the imbalance time trace, we use Eq. (I5) and the above described ED technique.