Minimally entangled typical thermal states algorithm with Trotter gates

We improve the efficiency of the minimally entangled typical thermal states (METTS) algorithm without breaking the Abelian symmetries. By adding the operation of Trotter gates that respects the Abelian symmetries to the METTS algorithm, we find that a correlation between successive states in Markov-chain Monte Carlo sampling decreases by orders of magnitude. We measure the performance of the improved METTS algorithm through the simulations of the canonical ensemble of the Bose-Hubbard model and confirm that the reduction of the autocorrelation leads to the reduction of computation time. We show that our protocol using the operation of Trotter gates is effective also for the simulations of the grand canonical ensemble.


I. INTRODUCTION
One-dimensional (1D) systems are exceptional in the sense that it is tractable with classical computers to simulate their quantum many-body physics. The entanglement of a pure state in a quantum system is often quantified using the entanglement entropy of a subsystem that consists of a left (or right) half of the system. It is widely known that in 1D the entanglement of a ground state is not extensive, i.e., as a function of the system size it is constant in a system when the energy of the lowest excitation is gapped or increases only logarithmically in a gapless system [1][2][3]. Thanks to the low entanglement, a ground state of a generic 1D quantum system can be represented efficiently by matrix product states (MPS) [4] and can be numerically obtained via some smart optimization algorithms, including the density matrix renormalization group [5,6] and imaginary-time evolution using the time-evolving block decimation (TEBD) [7][8][9][10]. Moreover, the MPS representation has been applied for analyzing real-time evolution of a pure state [7][8][9][10][11][12] and a system at finite temperature described by a mixed state [13][14][15][16][17][18]. In general, however, since highly excited states with extensive entanglement entropies are involved in such MPS simulations of real-time dynamics and finitetemperature systems, one needs some ways to circumvent the infeasibility of directly representing such states in terms of MPS.
For the MPS simulations of finite temperature systems, there exist two major approaches: the purification method in an enlarged Hilbert space [14,15] or the sampling method known as the minimally entangled typical thermal states (METTS) algorithm [16,17]. In the former method, a mixed state is represented as a pure state in the enlarged Hilbert space. Because of the enlargement, the computational cost of the purification method increases polynomially but quickly as the matrix dimensions of MPS, namely bond dimensions, increase. Since the bond dimensions increase for decreasing the temper- * goto@phys.kindai.ac.jp † danshita@phys.kindai.ac.jp ature, the approach is efficient at high temperatures. Besides, a variant specialized for low temperature systems has also been proposed [19]. At intermediate temperatures, the purification method often requires a tremendous computational cost.
In the METTS algorithm, a mixed state is represented via a Markov-chain Monte Carlo (MCMC) sampling of pure states in the original Hilbert space, and one does not have to treat the enlarged Hilbert space. Moreover, the most computationally expensive operation is the imaginary-time evolution of MPS in the original Hilbert space, and thus the METTS algorithm does not suffer from a tremendous cost in systems where the ground state algorithm by imaginary-time evolution works well. Thanks to these advantages, the METTS algorithm is expected to be able to access a broad temperature range and seems superior to the purification method in numerical efficiency. Nonetheless, thus far, there have been more applications of the purification method [19][20][21][22][23][24][25][26][27][28][29][30][31] than those of the METTS algorithm [30,[32][33][34][35][36]. Moreover, it has been reported that the METTS algorithm is less efficient than the purification method because of statistical errors induced by the sampling [37].
One of the major roots of the inefficiency is the autocorrelation of successive samples which reduces the effective number of samples [16,17,37]. There exists a remedy to reduce the autocorrelation in the METTS algorithm [16,17]. However, this remedy changes the total magnetization or particle numbers during simulations, i.e., it disregards the Abelian symmetries. In a viewpoint of statistical mechanics, the unfixed quantum numbers only mean that statistical ensemble is the ground canonical ensemble and do not bring a significant problem as long as a system size is large enough. On the contrary, the disregard of the Abelian symmetries dramatically spoils the numerical efficiency [4,35].
In our previous work [36], in order to simulate a finitetemperature dynamics of the Kondo model [38] in a quasi-exact manner, we have improved the METTS algorithm by introducing a unitary operation represented as a series of Trotter gates. This approach is a variant of the symmetric METTS algorithm developed by Binder and Barthel [39] and is more flexible thanks to the controllability of the time step and the Hamiltonian con-structing the Trotter gates. Our approach has allowed us to access the logarithmic temperature dependence of the transport, which is a characteristic feature of the Kondo effects [36,38].
In this paper, we present more extensive and quantitative analyses of the improved METTS algorithm. We discuss how to utilize the flexibility of the unitary operation composed by Trotter gates in order to remove the autocorrelation problem in systems with large gaps such as a strongly interacting Bose-Hubbard model. We also find that our protocol can be combined with the hybrid approach of the purification method and the METTS algorithm, which has been developed very recently [40,41]. Taking the 1D Bose-Hubbard model as a specific example, we present benchmark tests for the approaches developed in this work.
The rest of the paper is organized as follows: In Sec. II, we briefly review the METTS algorithm and introduce its generalization including the basis transformation made by the operation of Trotter gates. In Sec. III, we present performance tests with the Bose-Hubbard model. In Sec. IV, we summarize the results.

II. MINIMALLY ENTANGLED TYPICAL THERMAL STATES ALGORITHM
A. Minimally entangled typical thermal states algorithm and its autocorrelation problem The objective of the METTS algorithm is to compute the thermal expectation value of an operatorÔ at inverse temperature β, which is given by where the summation takes over an orthonormal basis |i , Z = i i|e −βĤ |i , andĤ is the Hamiltonian of the system. Since the number of |i increases exponentially with system size, it is impossible to perform the summation over all |i in large systems. Hence, one has to introduce a sampling method. The METTS algorithm efficiently performs this sampling utilizing the fact that MPS, can efficiently represent low-entangled states [2,4]. Here, σ m is the index of the local Hilbert space at site m, L is the number of lattice sites, |σ = |σ 1 , σ 2 , . . . , σ L , and σ means the summation over all possible configurations of σ i . The matrix dimensions of matrices A σm m are called bond dimensions. In the METTS algorithm, one takes classical product states (CPS) as the orthonormal basis [16,17]: |i = |σ . Each CPS state can be represented by MPS with bond dimension χ = 1.
Basic procedure of the METTS algorithm is as follows. Starting from a certain initial CPS |i = |σ , one calculates a state by imaginary-time evolution and the expectation value φ(i)|Ô|φ(i) . The state |φ(i) is subsequently projected into a new CPS |j with the probability and one repeats the imaginary-time evolution and the observation. This is a MCMC algorithm and its stationary distribution Π i is given as the left eigenvector with eigenvalue 1 of a transition matrix whose elements are defined by p i,j = p i→j [42]. One can easily confirm that the canonical ensemble is the stationary distribution of the Markov chain generated by the METTS algorithm as follows: The METTS algorithm efficiently simulates the thermodynamic properties of spin-rotationally invariant spin systems [16,17]. However, it has been reported that a strong correlation of successive samples arises when the METTS algorithm is applied to the Bose-Hubbard model with particle-number conservation [35]. The existence of such a severe autocorrelation problem can be inferred from Eqs. (3) and (4) by taking a small inverse temperature β. At a small β, |φ(i) has large overlap with |i so that the probability for choosing |i again is very high. A severe autocorrelation problem also arises when a CPS has large overlap with one of eigenstates ofĤ. In SU(2)-symmetric spin systems, these problems can be eliminated by changing the spin axis of CPS, e.g., |φ(i) is projected into a CPS with X-axis for odd steps and projected into a CPS with Z-axis for even steps [16,17]. Although a CPS projected into X-axis is a superposition of states with different magnetization in Z-axis, the CPS is in a symmetric sector of magnetization in X-axis and the Hamiltonian conserve the magnetization in X-axis. In other words, the Abelian symmetry can be utilized. If this procedure is straightforwardly applied to particle systems, on the contrary, a resulting state is a superposition of states with different numbers of particles and there exists no Abelian symmetry that can be utilized unlike SU(2)-symmetric spin systems. Hence, numerical simulations become very inefficient.

B. Symmetric bases given by real-time evolution
In order to relax the autocorrelation problem of the METTS algorithm with the Abelian symmetry respected, Binder and Barthel have incorporated the use of different symmetric bases for different Monte Carlo steps [39]. Here, "symmetric" means that states in a symmetric basis lie within a certain subspace, in which the eigenvalue of an operator associated with an Abelian symmetry is fixed. In this subsection, we generalize their approach and introduce an easy way to obtain various symmetric bases.
A symmetric basis |i g with respect to an operatorĜ is an eigenbasis ofĜ with an eiganvalue g, i.e.,Ĝ |i g = g |i g for any i. Another symmetric basis |i g with the same eigenvalue can be obtained by a unitary transfor-mationÛ , i.e., |i g =Û |i g . One can easily confirm that the basis |i g is also an eigenbasis with the same eigenvalue as long as the unitary operatorÛ commutes with the operatorĜ:Ĝ By representing a unitary operator aŝ with a real parameter τ and a hermitian operatorÂ, the condition for the commutability ofÛ andĜ can be recast into that ofÂ andĜ. In other words, real-time evolution via any Hamiltonian commuting withĜ gives a new symmetric basis. Changing the operatorÂ and the parameter τ , one can obtain many symmetric bases. This variety of bases is one advantage of our proposed symmetric bases given by real-time evolution. Another advantage of our protocol for creating symmetric bases is that it is easy to implement. More specifically, since a time-evolution method such as the TEBD is required in the METTS algorithm for obtaining a state |φ(i) of Eq. (3), the operation |i g = e −iτÂ |i g can be implemented immediately. Hereafter, we consider only symmetric bases of a certain operator and drop the subscript g.
As a specific procedure for utilizing the creation protocol of a symmetric basis to reduce the autocorrelation problem, we project an imaginary-time evolved state into a CPS |i for even steps of the MCMC sampling and into a transformed state |i =Û |i for odd steps. This procedure does not change the stationary distribution of the Markov chain as shown in the followings. From Eq. (4), the transition probability from a CPS |i to a transformed state |k in odd steps is given as Similarly, the transition probability from a transformed state |k to a CPS |j in even steps is given as Considering the two steps as one step, the transition probability from a CPS |i to a CPS |j is given as One can confirm that the canonical ensemble Π i (5) is the stationary distribution of the Markov chain with this transition probability as follows: On the other hand, the transition probability from a transformed state |i to a transformed state |j is given as and one can confirm that the stationary distribution of the Markov chain defined by this transition probability is also the canonical ensemble in |i basis, Thus, the stationary distribution of a CPS |i in odd steps is Π i and that of a transformed state |i in even steps is Π i . It should be noted that the thermal expectation values do not depend on bases because of the similarity invariance of the trace. Hence, samples in both odd and even steps can be used to estimate the thermal expectation values. If a stateÛ † e − β 2Ĥ |i is the superposition of many CPS with similar weights, the above-mentioned procedure significantly reduces the autocorrelation problems. This is because the probability q i→k (9) has also similar values to many ks so that a correlation of successive samples is small. However, there is a trade-off. While more efficient reduction of the autocorrelation requiresÛ † e − β 2Ĥ |i to consist of many CPS, i.e., to be more entangled, manipulations of highly entangled MPS are numerically expensive. Hence, a controller to adjust the entanglement induced byÛ is necessary and we introduce the parameter τ for this purpose. In a typical situation, larger τ makes the entanglement larger but the computation of a single Monte Carlo step more costly.
As the operatorÂ, any choice might be effective as long as an operator respects the Abelian symmetry. A straightforward choice forÂ is the Hamiltonian of the systemĤ and this choice is sufficient for most cases. However, in the case that there are some CPS being overlapped largely with some of the eigenstates ofĤ, this straightforward choice might be ineffective because τ is required to be rather large for creating a state composed of many CPS. Examples include the Bose-Hubbard model with integer filling and the strong onsite interaction, and the strongly anisotropic Heisenberg model. In such cases, one should choose another Hamiltonian like that of the free bosons or the isotropic Heisenberg model.
Since the real-time evolution of MPS is a numerically expensive tasks in general, how to implement the application of an operatorÛ is also important for numerical efficiency. We here emphasize that what is necessary for conducting our procedure is not real-time evolution via a certain Hamiltonian but a unitary operation that makes the state optimally entangled. Therefore, the application of the Trotter decomposed operator is sufficient for this purpose. Here, we assume thatĤ even andĤ odd consist of a sum of 2-site hermitian operators commuting with one another such that we can accurately compute the application ofÛ T by using the TEBD method [7][8][9][10]. As long asĤ even andĤ odd respect the Abelian symmetry, the application ofÛ T gives a symmetric basis. The number of CPS generated by a single application of the Trotter gates is limited. Hence, we also use Û T (τ /n) n with some integer n in order to remove this limit. The parameter τ is optimally chosen so that the truncation errors are not severe.

C. Compatibility with the hybrid approach
Recently, the hybrid approach of the purification and the sampling has been proposed [40,41]. In this approach, the local Hilbert space of some sites is enlarged likewise the purification approach and the sampling is taken with respect to other sites. In the hybrid approach, the consequent sampling of a subsystem results in larger In other words, one can simulate the grand canonical ensemble with the Abelian symmetries respected. A price to pay for the hybrid approach is the enlargement of local Hilbert spaces in some sites. As the number of sites with the enlarged Hilbert space increases, the autocorrelation decreases but the computational cost for obtaining one sample increases [40].
Combining the hybrid approach with the abovementioned procedure for reducing the autocorrelation problem, one can reduce the price and enjoy the access to the grand canonical ensemble. Specifically, we enlarge the local Hilbert space of the two edge sites. By separating the enlarged space into physical and ancilla sites, we can represent the L-site system as a (L + 2)-site system shown in Fig. 1. With the alignment in Fig. 1, one can apply any operators to physical sites in the same way as in a system without ancilla sites. The ancilla sites do not introduce unnecessary entanglement or long-range interactions unlike other alignments. With only two enlarged sites, the decrease of the autocorrelation by the hybrid approach is limited [40] but one can decrease the autocorrelation further by the applications of the Trotter gates.
In the next section, we present the performance tests of our approach. For the tests, we use the L-site 1D Bose-Hubbard model at unit filling, Here, J is the hopping integral, U is the on-site Hubbard interaction,b m (b † m ) annihilates (creates) a boson at site m, andn m =b † mbm . We set L to be even. When we simulate the grand canonical ensemble by the hybrid approach, we replaceĤ toĤ − µ mb † mbm with the chemical potential µ. ForĤ even andĤ odd , we definê andĤ where δ i,j is the Kronecker delta and U is either U or 0. We project an imaginary-time evolved state into a CPS |i for even steps of the MCMC sampling and into a symmetric base Û T (τ /n) n |i for odd steps.

A. Analysis with the second largest magnitude eigenvalue
For large lag t, the autocorrelation is expected to decay exponentially [42] as Here, M is the number of samples and X i is some observed value of sample i. The exponential autocorrelation time τ exp is bounded by the slowest relaxation mode of the Markov chain which is determined by the second largest magnitude eigenvalue (SLME) of a transition matrix λ 2 as [42] Since one already knows that the transition probability of our approach is given by Eq. (11), the SLME can be obtained as long as the dimension of the Hilbert space is small for numerical diagonalization. one can construct explicitly the Hamiltonian and transition matrices. Since the temperature of system is high, −1/ log |λ 2 | is accordingly large in the simple METTS algorithm (τ = 0): Around one hundred of samples can be correlated, therefore one may get an independent sample from several hundred samples. As τ increases, −1/ log |λ 2 | decreases almost exponentially and reaches order of unity around τ J = 1.0. One can also see the limitation of a single application of the Trotter decomposition Eq. (15) by comparing n = 1 and n = 2 data. It should be noted that large τ does not necessarily mean small autocorrelation as is clearly indicated in the revival behavior of the U = 0 case (green dashed-dotted line). Thus, one can significantly reduce the autocorrelation time by the application of Û T (τ /n) n with n > 1 and optimally chosen τ .
Next, we turn our attention to a system where there are some CPS being overlapped largely with some of the eigenstates ofĤ. Figure 3  the system's Hamiltonian are well approximated as CPS.
From these SLME analyses, we see that our symmetric bases created by the application of the Trotter gates can significantly reduce the autocorrelation of samples. In the next subsection, we show that the reduction of the autocorrelation indeed leads to the reduction of computation time.

B. Computation time
The application of the Trotter gates significantly decreases the autocorrelation of samples. However, with only this fact, we cannot conclude that this approach reduces computation time because the application of operators to a MPS is a numerically expensive task in general. In this subsection, we show the benchmark results of MPS simulations, focusing on computation time required to obtain one uncorrelated sample. The number of correlated samples R is estimated from the blocking analysis [43] as R = σ 2 b /σ 2 , where σ is the standard error of the total energy calculated from bare correlated samples, and σ b is the standard error calculated from blocked uncorrelated samples, with sufficiently large block size N b . As N b increases, the ratio R also increases and approaches to a saturated value as shown in Fig. 4. This saturated R corresponds to the number of correlated successive samples. From R at the saturation and the averaged time elapsed to obtain one sample t samp , we define the time required to obtain one uncorrelated sample t unc as t unc ≡ Rt samp . We set the maximum occupation number of boson per site to six and the truncation error to 10 −10 . The initial CPS is a classical unit-filled Mott state i b † i |0 where |0 is the vacuum state. For the imaginary-time evolution of MPS, we use the TEBD method [7][8][9] with the optimized Forest-Ruth-like decomposition [44]. We set the imaginary time step to 0.0625J −1 . We perform all the simulations in this subsection on a single thread of Intel Xeon E5-2683 v4 processor and use the same pseudo-random number sequence.
At first, we simulate the 1D Bose Hubbard model with L = 6, ν = 1, U/J = 1.0, and βJ = 0.25 in order to confirm the results of the SLME analysis summarized in Fig. 2 and compare the thermal expectation value ofĤ obtained by the METTS algorithm to that obtained by the exact diagonalization. Table I shows the summary of the simulations. In order to obtain precise numerical data, we sample much more states than the dimension of the entire Hilbert space. The order of the estimated number of correlated samples R is consistent with the SLME analysis in Fig. 2: Several tens of samples are correlated in the simulation without the Trotter gates and there exists little autocorrelation in the simulation with the Trotter gates with τ J = 1.0. Consequently, t unc in the simulation with the Trotter gates is roughly one tenth of that in the simulation without the Trotter gates. The values of Ĥ /J in the simulations with the  Here, τ , n, and U are the parameters which determine the unitary operator Û T(τ /n) n . R is the square of the ratio of the standard errors of blocked uncorrelated and bare correlated samples, σ 2 b /σ 2 , at the saturation with respect to N b and indicates the effective number of correlated samples. tsamp is the averaged time for obtaining one sample. tunc is the estimation of the time required to obtain one uncorrelated sample. 1σ is the one-sigma uncertainty of Ĥ , and Nsamp is the number of samples used to estimate the average and the one-sigma uncertainty of Ĥ . The thermal expectation value obtained by the exact diagonalization is Ĥ /J = −0.9373. Trotter gates agree with the exact value within one-sigma uncertainty. Next, in order to demonstrate that our approach is efficient in large systems, we perform the same simulations with L = 50. Table II shows the performance of simulations for the 1D Bose-Hubbard model with ν = 1, U/J = 1.0, and βJ = 0.25. The values of Ĥ /J computed by the three METTS simulations agree within onesigma uncertainty. This result confirms that the use of the symmetric basis created by the operation of the Trotter gates does not affect the stationary distribution. As expected, the application of the Trotter gates increases t samp roughly by six times. On the contrary, it substantially decreases the effective number of correlated samples R by at least one twentieth. Consequently, t unc reduces roughly to one third of the value of the METTS algorithm without the Trotter gates. Thus, we can conclude that the application of the Trotter gates noticeably reduces the computation time of the METTS algorithm. Comparing the case of U = U with that of U = 0 in Table II, there is no significant difference. The absence of the difference is consistent with the SLME analysis summarized in Fig. 2.
We also investigate the performance of the METTS algorithm with the Trotter gates for the 1D Bose-Hubbard model with L = 50, ν = 1, U/J = 20.0, and βJ = 0.25, where there are some CPS being overlapped largely with some of the eigenstates of the system's Hamiltonian. Table III shows the performance tests of simulations for this case. With this setting, the effective number of correlated sample R in the METTS algorithm without the Trotter gates is larger than five hundred. It should be noticed that since this value is only a lower bound estimated from 262144 samples, it is possible that the true number of correlated samples is much larger. The application of the Trotter gates reduces such a very large R to 28.1 with U = U and to only 8.49 with U = 0, which means a tough autocorrelation problem can be significantly relaxed with the Abelian symmetry respected. One can also confirm the superiority of the Trotter gates with U = 0, which is consistent with the SLME analysis summarized in Fig. 3. Although t samp increases roughly by four times due to the application of the Trotter gates with U = 0, t unc reduces roughly by one eighteenth. This improvement means that error bars in the METTS algorithm with the Trotter gates withÛ = 0 are about four time smaller than those in the ordinary METTS algorithm with the same computation time.

C. Hybrid approach
In addition to the canonical ensemble, we check the performance of the combination of the application of the Trotter gates and the hybrid approach for the simulations of the grand canonical ensemble. We take the strong U limit and treat bosons as hard-core bosons which do not occupy the same site. A strong advantage of taking the hardcore boson limit is that the system can be mapped on to free fermions [45], where exact diagonalization with a large system is feasible. Hence, we can carry out the performance tests of the METTS algorithm on the basis of the comparison with the results obtained by using the In the grand canonical ensemble, one can obtain the fluctuations of the total number of particles which is related to the compressibility κ as Table IV shows the performance tests of the simulations of the grand canonical ensemble where we take the 1D Bose-Hubbard model in the strong U limit at L = 50, µ/J = −2.0, and βJ = 5.0. These simulations are also performed on a single thread of Intel Xeon E5-2683 v4 processor and use the same pseudo-random number sequence, and we use the second order Suzuki-Trotter decomposition with the time step 0.025J −1 for imaginarytime evolution. We estimate the uncertainty of κ from the jackknife analysis [43]. The estimated values of κ are consistent with the exact value within the one-sigma error, thus confirming that the grand canonical ensemble is simulated properly. Moreover, the computation time required for one uncorrelated data t unc is reduced roughly by one-third by thanks to the Trotter gates. In short, our approach based on the application of the Trotter gates is effective in the hybrid approach. Let us compare the efficiency of the simulations of the grand canonical ensemble with that of the canonical ensemble. Specifically, we simulate the canonical ensemble of the 1D Bose-Hubbard model in the strong U limit at L = 50, ν = 0.08, and βJ = 5.0 by using the METTS algorithm with the Trotter gates. For the comparison, the filling factor ν is determined to be close to that of the grand canonical ensemble simulated in Table IV, ν = 0.074, and we use the same Trotter gates. In the canonical ensemble, the estimated number of correlated samples R is 9.2, which is around one-seventh of the value in the grand canonical ensemble 68.3. There-fore, the canonical ensemble is preferred unless one wants to compute observables converted from the fluctuations of conserved quantities, such as the compressibility and the magnetic susceptibility. The inefficiency of the grand canonical ensemble can be attributed to the fact that while the sampling space is much larger than that in the canonical ensemble, the total particle number can change only by at most two in one Monte Carlo step in our setting, where there are only two ancilla sites. Although increasing ancilla sites reduces the number of correlated samples but also increases the numerical cost of obtaining one sample [40]. Figure 5 represents the chemical potential µdependencies of the filling factor ν, the internal energy per site Ĥ /(JL), and the compressibility κ obtained by using the METTS algorithm with and without the application of the Trotter gates. The expectation values and the one-sigma error bars are estimated from successive 8196 samples. For the filling factor and the internal energy, both of the METTS algorithms with and without the Trotter gates give sufficiently precise and accurate values for the entire region of the chemical potential. On the contrary, as for the compressibility, the deviation from the results obtained by using the exact diagonalization is visible in both algorithms. Nevertheless, the error bars of the METTS algorithm with the Trotter gates are smaller than those of the METTS algorithm without the Trotter gates, especially at µ/J = −1.8, where the compressibility takes a maximum value. This result indicates that the application of the Trotter gates to the METTS algorithm allows for a more efficient description of the grand canonical ensemble.

IV. SUMMARIES
We improved the minimally entangled typical thermal states (METTS) algorithms by adding the operation of IV. Performance tests of simulations for the 1D Bose-Hubbard model with L = 50 and βJ = 5.0 in the strong U limit. The chemical potential µ is set to be −2.0J. Here, κ is the compressibility, 1σ is the one-sigma uncertainty estimated from the jackknife analysis. The exact numerical value of the compressibility κ is 11.866J −1 . See the caption of Tables I for the  definitions of the other   a series of Trotter gates which transforms the symmetric basis [39]. We performed the analysis using the sec-ond largest magnitude eigenvalue of a transition matrix for the one-dimensional Bose-Hubbard model with unit filling in order to show that a correlation of successive samples significantly decreases by applying the Trotter gates. From the performance tests for the same model, we confirmed that the reduction of the autocorrelation leads to the reduction of computation time and thus improves the numerical efficiency of the METTS algorithm. We showed that the application of the Trotter gates can be combined with the recently proposed hybrid approach [40,41] and improves the efficiency of simulations of the grand canonical ensemble. Therefore, the improved approach relaxes the autocorrelation problem of the METTS algorithm without breaking the Abelian symmetries. The improved METTS algorithm is applicable potentially to many problems at finite temperatures, such as transport, quench dynamics, and the magnetization curve.