Application of the path optimization method to the sign problem in an effective model of QCD with a repulsive vector-type interaction

The path optimization method is applied to a QCD effective model with the Polyakov loop and the repulsive vector-type interaction at finite temperature and density to circumvent the model sign problem. We show how the path optimization method can increase the average phase factor and control the model sign problem. This is the first study which correctly treats the repulsive vector-type interaction in the QCD effective model with the Polyakov-loop via the Markov-chain Monte-Carlo approach. It is shown that the complexification of the temporal component of the gluon field and also the vector-type auxiliary field are necessary to evade the model sign problem within the standard path-integral formulation.


I. INTRODUCTION
Understanding the confinement-deconfinement transition at finite temperature (T ) and chemical potential (µ) in quantum chromodynamics (QCD) is one of important and interesting subjects in elementary particle, nuclear and hadron physics. To investigate non-perturbative properties of QCD such as the chiral symmetry breaking and the confinement mechanism, Monte-Carlo simulations of lattice QCD have been utilized as a powerful tool for studying nonperturbative properties of QCD such as the chiral symmetry breaking and the confinement at zero baryon density. Unfortunately, lattice QCD simulations have the sign problem at nonzero real quark chemical potential. To circumvent the sign problem, several methods have been proposed so far such as the Taylor expansion method [1][2][3], the reweighting method [4,5], the analytic continuation method [6][7][8], and the canonical approach [9][10][11][12][13][14]. However, we cannot access cold dense region, µ/T > 1, in these methods at present [15].
The QCD effective models are widely used to investigate the QCD phase structure at finite real chemical potential. We can sometimes avoid the sign problem in simple models such as the Nambu-Jona-Lasinio (NJL) model without the repulsive vector-type interaction. In more realistic models, however, the sign problem arises again. For example, the Polyakov-loop extended NJL (PNJL) model [16] has the sign problem even in the mean-field treatment. The sign problem appearing in the QCD effective model is called the model sign problem [17,18]. Practically, one can avoid the model sign problem by using some prescriptions, which may not have clear theoretical foundation.
One of the model sign problems arises from the vectortype interaction. In the mean-field approximation for the NJL-type model, we usually neglect the Wick rotation of the vector-type auxiliary-field (ω 0 ) and then the stationary point of the action is considered to be the solution. The stationary point is corresponding to the maxima of the thermodynamic potential along the ω 0 direction and thus it is not stable in principle. While this treatment cannot be justified from the standard path-integral formulation, it can be acceptable in the mean-field approximation. Actually, this problem has been discussed in Ref. [19] by using the Lefschetz thimble method [20][21][22]. We can clearly understand that the standard meanfield treatment implicitly employs the complexification of the vector-type auxiliary-field based on the Cauchy(-Poincare) theorem.
In this article, we use the path optimization method [23][24][25][26] to formally tackle the model sign problem induced by the Polyakov-loop and also the repulsive vector-type interaction in a QCD effective model. In Ref. [26], we have shown that the complexification of the temporal component of the gluon field is sufficient to control the model sign problem in the PNJL model without the repulsive vector-type interaction. In addition, we have shown that the complexification of the vectortype auxiliary field should be responsible to control the model sign problem in the NJL model with the repulsive vector-type interaction [19]. It should be noted that the flow equation of the Lefschetz thimble blows up at a small value of the vector-type auxiliary field, and we failed to obtain the Lefschetz thimbles in the auxiliaryfield space in Ref. [19]. Therefore, in this article, we apply the path optimization method to the PNJL model with the repulsive vector-type interaction to control its model sign problem. This study is the first attempt to treat the model sign problem correctly and systematically within the standard path-integral formulation via the complexification of the integral variables in the QCD effective model with the Polyakov loop and the repulsive vector-type interaction.
This article is organized as follows. In the next section, we explain the path optimization method and the PNJL model with the repulsive vector-type interaction. Section III shows numerical results by using the hybrid Monte-Carlo method. Section IV is devoted to summary and discussions.

II. FORMULATION
We investigate the model sign problem appearing in the PNJL model with the repulsive vector-type interaction via the path optimization method. Details are explained below.

A. Polyakov-loop extended Nambu-Jona Lasinio model
The Euclidean Lagrangian density of the two-flavor PNJL model [16] with the repulsive vector-type interaction is given as where m 0 denotes the current quark mass, D ν = ∂ ν − igA ν δ ν4 is the covariant derivative, Φ (Φ) represents the Polyakov-loop (its conjugate), and V g expresses the gluonic contribution. The coupling constants G and G v take positive values, as understood from the QCD one-gluon exchange interaction, see Ref. [27] as an example. We employ the homogeneous auxiliary-field ansatz, as adopted in the previous works using the Monte-Carlo PNJL model [26,28], and thus our numerical results converge to the mean-field results in the infinite volume limit. The homogeneous ansatz corresponds to the momentum truncation to k = 0. After the bosonization and complexification of auxiliary fields, the grand-canonical partition function is given as where Γ is the effective action, and z k represents the dynamical variables in the momentum space. With the homogeneous field ansatz, we truncate the auxiliary fields to k = 0 components. Then the effective action becomes Γ = βV V where V is the effective potential and β is the inverse temperature. Thus our Monte-Carlo results should agree with the mean-field results in the infinite volume limit, where the configuration at the minimum of V dominates. After the Hubbard-Stratonovich transformation (bosonization), the thermodynamic potential is obtained as where V NJL and V g are the fermionic and gluonic parts of the effective potential, respectively. The actual form of V NJL is given as where N f = 2 (N c = 3) is the number of flavor (color) and Λ is the three-dimensional momentum cutoff. We set the same momentum cutoff in the vacuum and the thermal parts. We here introduce auxiliary fields as σ =qq, π = qiγ 5 τ q and ω 4 = −qiγ 0 q. The Fermi-Dirac distribution functions are given as where M, N, N ± andμ are functions of the auxiliary fields, with π 0 = π 3 and π ± = (π 1 ± iπ 2 )/ √ 2. Because of the 2iG v ω 4 term inμ, the repulsive vector-type interaction induces the model sign problem in addition to that from the Polyakov-loop. For V g , we employ the logarithmic type Polyakov-loop potential proposed in Ref. [29], The parameters are usually set to reproduce the lattice QCD data in the pure gauge limit. Basic setup to compute the PNJL model with the Markov-Chain Monte-Carlo method is shown in Refs. [26,28].
Cuts in the logarithm of Eq. (7) may induce the numerical problem but it may be the model artifact and thus we do not consider any additional care for the singularities in this study as in Ref. [26]. One of the promising ways to avoid the problem is the modification of the functional form of the Polyakov-loop potential. The logarithmic term in the Polyakov-loop potential appears as V T 3 0 b 3 × ln(h) in the Boltzmann weight. If V T 3 0 b 3 is set to be a positive integer, the singularity does not matter. In the present potential, V T 3 0 b 3 is not an integer. It is well known that there is another functional form of the Polyakov-loop potential that is the polynomial one [30], which also reproduces the lattice QCD data in the pure gauge limit at finite T and does not have singularities. Nevertheless, sampled configurations are found to be well localized, then the path optimization method works well practically in the present setup as shown later.

B. Path optimization method
In the path optimization method, we first complexify the integral variables, x i ∈ R → z i ∈ C where i = 1, · · · , n with n being the dimension of integration. To construct the new (and good) integral path in the complex space, we use the cost function which represents the seriousness of the sign problem. We vary the integral path in the direction to decrease the cost function. This method has similarity from the viewpoint of the complexification of dynamical variables with the complex Langevin method [31,32] and the Lefschetz thimble method [20][21][22]33]. Especially, the path optimization method belongs to the category of the off-thimble integral methods, which allow the integral path to deviate from the thimble, as proposed in the generalized Lefschetz-thimble method [33]. See Refs. [34][35][36][37][38][39][40] for recent progress in these methods.
The path optimization method was firstly proposed in Ref. [23]. The machine learning (feedforward neuralnetwork) was introduced to describe and to optimize the modified integral path in Refs. [24,25]. Few days before Ref. [25] was submitted, the machine learning was introduced to learn the integral manifold in the generalized Lefschetz-thimble method in Ref. [41]. This method uses the supervised learning because we must teach the relevant integral path (manifold) to the neural-network, and the results of the generalized Lefschetz thimble method have been used as the teacher data; it is the first paper which employs the supervised learning to evade the sign problem as far as we know. Also, the same group applied the machine learning to optimize the integral path by using the average phase factor in Ref. [40] after our paper [25] appeared. This method has similarity with our path optimization method which uses the unsupervised learning. The machine learning can be applied to various optimization problems and thus it is quite useful in physics.
The functional form of the new integral path is represented by using the feedforward neural network [23,25]. Then, the parameters in the feedforward neural network are optimized via the minimization of the cost function. The largest advantage of using the feedforward neural network in the path optimization method is in the universal approximation theorem; the neural network even with the mono hidden-layer can approximate any kind of continuous function on the compact subset as long as we prepare sufficient number of units in the hidden layer [42,43].
To use the feedforward neural network, we represent z i by using parametric quantity (t i ) as where w ij , b i , α i and β i are parameters. In particular, w and b are so called the weight and the bias, respectively. Thus, we have the map Re(z i ) ։ Im(z i ). The function g(x) is so called the activation function and we use the hyperbolic tangent function. We use the back-propagation algorithm in actual optimization of parameters. It should be noted that the path optimization method reproduces the same results with the original theory because of the Cauchy(-Poincare) theorem as long as the integral path does not go across singular points and the contribution at Re z → ±∞ vanishes.
To obtain the good integral path, we use the following form of the cost function; where θ(t) = arg(J(t)e −Γ(z(t)) ), θ 0 = arg(Z), see Ref. [23] for details. In applying the path optimization method to the PNJL model with the vector-type interaction, we complexify temporal gluon components (A 3 and A 8 ) and the Wick rotated vector-type auxiliary field (ω 4 ), while the scalartype and pseudo-scalar-type auxiliary fields (σ and π) are still treated as real variables. Thus, we have 7 dynamical variables (σ, π 0 , π ± , Re ω 4 , Re A 3 , Re A 8 ) and 3 dependent variables (Im ω 4 , Im A 3 , Im A 8 ) where latter three variables are given via Eq. (10). Since it is known that the model sign problem can be resolved by the complexification of the temporal gluon fields in the Lefschetzthimble method at least in the system without the repulsive vector-type interaction [18]. Also, ω 4 can induce the model sign problem even in the NJL model and then we must consider the complexification of ω 4 [19].
In the present study, we directly complexify A 3 , A 8 and ω 4 , but this treatment may violate the periodicity along the Re A 3 and Re A 8 directions. If we wish to take care of the periodicity, we may use periodic functional form in the neural network as in Ref. [41]. In particular, violation of the periodicity may become serious when the configurations are spread widely in the ReA 3 and ReA 8 variables. For example, see Ref. [44] for the issue of the periodicity. As shown later, however, the present cal-culation agrees well with the mean-field approximation and configurations are well localized. Thus, we do not introduce the periodic form of inputs at present.
It should be noted that the path optimization with the feedforward neural network is unsupervised learning because we do not need teacher data. The setting of the feedforward neural network in the path optimization method such as the optimizer are the same with Ref. [26] and thus we skip the explanation here.

III. NUMERICAL RESULTS
In the actual numerical calculation, we have generated 80000 configurations by using the hybrid Monte-Carlo method. Then, the expectation values are estimated after a few times of optimizations. We employ the simple neural network which contains the input, mono-hidden and output layers. The number of unit in the hidden layer is set to 4N dof = 12, where N dof is the number of dependent variables. The expectation value of an operator (O) is obtained via the phase reweighting as where · · · pq means the phase quenched average and The parameters in the NJL part are the same in Ref. [26] and we newly introduce G v as G v = 0.5G. Figure 1 shows the average phase factor, Re e iθ pq , at T = 0.1 GeV with k = V T 3 = 8 and 64. In some regions of µ, the average phase factor becomes almost 0 before the optimization as shown by the dashed line in the figure. By comparison, we can successfully increase the average phase factor after the optimization. It suggests that there are no need to complexify σ and π auxiliary fields in the path optimization method to investigate the PNJL model with the repulsive vector-type interaction. Also, this would be true in the Lefschetz-thimble method and other complexified integral-path approaches. Compared with the PNJL model without the repulsive vector-type interaction [26], the average phase factor becomes worse because ω 4 field additionally induces the sign problem at finite density. Around µ = 0.36 GeV, the optimization is neither sufficient nor automatic in the case with k = 64. With naive initial conditions of dynamical variables, the average phase factor stays to be very small. Then, various initial conditions have been examined and we finally obtain the optimized path with reasonably large average phase factor as shown in Fig. 1. This result indicates that the present neural network in the case with k = 64 does not have enough performance of the approximation to overcome the exponential suppression of the average phase factor. Figure 2 shows the µ-dependence of the order parameters at T = 0.1 GeV after the optimization. We also show the mean-field results based on the CK symmetry ansatz in the fermion determinant [45,46] where C and K are the charge and complex conjugations, respectively. This ansatz can be justified by using the Lefschetz thimble method [18]. Under the CK symmetry condition, we solve gap equations where we do not use the Wick rotation of the vectortype auxiliary field. This treatment cannot be justified in the standard path integral formulation, but practically it reproduces the correct result in the leading-order of the large N c expansion because ω 0 corresponds to the quark number density in the mean-field approximation. From the figure, we find that the numerical errors are well controlled and the difference between Φ andΦ at finite density,Φ > Φ, is correctly reproduced. Compared with the results without the vector-type interaction [26], the chiral condensate decreases more slowly. This is reasonable, since the repulsive vector-type interaction is known to weaken the chiral phase transition. We find that |Im ω 4 | strongly increases above µ = 0.3 GeV. Since the quark number density and ω 4 are related with each other via ω 0 = q † q = iω 4 , this sudden increase indicates the absence of the Silver-Blaze problem at T = 0; the quark number density should start to increase at µ = M (µ = 0). By comparison, results at small µ are almost the same as those without the vector-type interaction [26], since the quark number density and thus the vector potential are small. It should be noted that the real part of ω 4 is consistent with zero within the error-bar and thus the consequence obtained in the analyses in the Lefschetz thimble method [19] is naturally understood in the path optimization method. We must consider Wick rotation of ω 0 , then the model sign problem can be resolved by complexifying ω 4 . The present results imply that the ω 4 field has almost only the imaginary part, then the flow equation of the Lefschetz thimble can stall at a small value of |Re ω 4 |. In Fig. 3, we show the scatter plot of the hybrid Monte-Carlo configurations at T = 0.1 GeV and µ = 0.3 GeV on the Re A 3 -Re A 8 , Re A 3 -Re ω 4 and Re A 8 -Re ω 4 planes. We can see the localized configurations around Re A 8 = 0 and Re A 3 = 0. Thus, this study implies that the standard PNJL model computation with the repulsive vector-type interaction under the CK symmetry ansatz and the un-Wick rotated ω 0 in the mean-field approximation is systematically and numerically justified via the path optimization method. The scatter plot also supports the direct complexification of A 3 and A 8 without taking care of the periodicity. The Monte-Carlo configurations are not spread but localized, then it is not necessary to take account of the periodic boundary condition.

IV. SUMMARY
In this study, we have applied the path optimization method to the QCD effective model with the Polyakov loop and the repulsive vector-type interaction. The feed-forward neural-network with the mono-hidden layer is employed to describe the good integral path in the complexified space of integral variables. The temporal components of the gluon field and the vector-type auxiliary field are complexified and then the path is optimized via the path optimization method.
By optimizing the path (manifold), we can successfully improve the average phase factor, and calculated results of observables show reasonable behavior and have small error bars. It is not easy to optimize the integral path in the rapidly changing region of the order-parameters, but we can finally improve the average phase factor by examining various initial conditions of dynamical variables.
After a few optimization steps, we can well reproduce the mean-field results at large volume as we expect. Since we use the homogeneous ansatz of the integral valuables, our numerical simulation should give the mean-field result in the large volume limit. The imaginary part of the vector-type auxiliary field starts to rapidly increase in strength above µ = 0.3 GeV at T = 0.1 GeV. This indicates the absence of the Silver-Blaze problem at T = 0 and thus the path optimization method can pick up correct properties of the theory.
In the standard mean-field approximation, we do not perform the Wick rotation of the vector auxiliary field (ω 0 ). While such a treatment cannot be justified within the standard path integral formulation, it can be justified by employing the complexified theory such as the Lefschetz thimble method and the path optimization method. In this article, we have demonstrated that the path optimization method correctly resolve the model sign problem and then the ω 4 field takes almost the pure imaginary value which is required from the fact that the grand-canonical partition function is real. This study provides the correct numerical treatment of the repulsive vector-type interaction in the QCD effective model with the Polyakov loop.
Finally, we comment on the problem of the numerical cost. The degree of freedom is enlarged in the present calculation compared with the case without vector-type interaction [26], the improvement of the average phase factor becomes slow, and thus we need more optimization steps (epochs) and/or some other extensions. One of the possible extensions to circumvent such optimization problem is introducing the deep neural-network and it is our future work. Also, the sign problem becomes exponentially severe with increasing system size. Then, it is important to know that the improvement of the average phase factor via the path optimization method can overcome the exponential suppression. Therefore, we need further investigation of the competition in the average phase factor between the suppression from the system size and the improvement from the path optimization. In particular, this problem becomes serious when we consider the lattice calculation. One promising approach is the reduction of the Jacobian computation cost; the diagonal ansatz of the Jacobian [40] and the nearest-neighbor lattice-cites ansatz [39] are promising examples. It will be reported elsewhere.