A new dual representation for staggered lattice QCD

We propose a new strategy to evaluate the partition function of lattice QCD with Wilson gauge action coupled to staggered fermions, based on a strong coupling expansion in the inverse bare gauge coupling $\beta= 2N/g^{2}$. Our method makes use of the recently developed formalism to evaluate the ${\rm SU}(N)$ $1-$link integrals and consists in an exact rewriting of the partition function in terms of a set of additional dual degrees of freedom which we call"Decoupling Operator Indices"(DOI). The method is not limited to any particular number of dimensions or gauge group ${\rm U}(N)$, ${\rm SU}(N)$. In terms of the DOI the system takes the form of a Tensor Network which can be simulated using Worm-like algorithms. Higher order $\beta$-corrections to strong coupling lattice QCD can be, in principle, systematically evaluated, helping to answer the question whether the finite density sign problem remains mild when plaquette contributions are included. Issues related to the complexity of the description and strategies for the stochastic evaluation of the partition function are discussed.


I. INTRODUCTION
Lattice QCD at finite baryon density suffers from the notorious sign problem [1]. In a nutshell, the numerical sign problem arises because the weights of the partition function are not positive definite, prohibiting importance sampling in Monte Carlo simulations. One of the several promising approaches to tackle the various sign problems in lattice field theories or spin systems are dual formulations. The basic idea is to rewrite the partition function by replacing the original (possibly continuous) degrees of freedom by new discrete degrees of freedom, such that the numerical sign problem of the new representation is milder or absent [2].
The idea of dual representations is old, and in the last decade, many different sign problems have been solved in this way. Some of the hallmarks in the context of spin models are the O(N) and CP(N-1) models [3][4][5], and in the context of lattice field theories are the charged scalar φ 4 theory [6], the Abelian gauge-Higgs model [7,8], the SU(2) principle chiral model [9] and scalar QCD [10]. The term "dual representation" may seem as a misnomer (they are not duality transformations), but it has been established as an umbrella term for representations of specific type: the representations are obtained by integrating out the original degrees of freedom and by introducing discrete variables that encode nearest neighbour interaction, e.g. so-called bond variables. These are based on a high temperature or strong coupling expansion [11,12] or similar Taylor expansions, and can be expressed in terms of oriented fluxes and/or unoriented occupation numbers (usually called monomers and dimers). A dual representation is then oftentimes called a world-line representation, or a dimerization, or is a combination of both. An important feature is that the original symmetries of the system are translated into constraints such as flux conservation or restrictions on the allowed occupation numbers. Typically these constraints are central in Monte Carlo simulations such as in the Worm algorithm [13] or generalizations thereof [14]. Dual representations are in general not unique: a model can have several dual representations which may have different residual sign problems. In some cases, a dual representations can introduce a sign problem that did not exist in the original formulation. An important example is the lattice Schwinger model at finite quark mass.
The focus of this paper is whether dual representations can be successfully applied to lattice QCD at finite baryon density, which has a severe sign problem in the usual representation, where fermions are integrated out, resulting in the fermion determinant. The standard approach is then Hybrid Monte Carlo. At finite baryon chemical potential µ B , the fermion determinant becomes complex, resulting in the sign (complex phase) problem. Many strategies are available to circumvent the sign problem for small values of the chemical potential, like the Taylor expansion method [15], the use of an imaginary chemical potential [16,17] and reweighting [18]. The latter has led to a first estimate of the position of the critical endpoint on a coarse lattice [19]. In general, however, reweighting may suffer from the lack of overlap between the sampled µ B = 0 ensemble and the target ensemble at µ B > 0. More recently, other approaches that are not limited to small µ B have been proposed, such as the Complex Langevin approach [20,21], the Lefschetz thimble approach [22][23][24], or the density of states method [25]. To name also some approaches that are in the spirit of a dual representation: the 3-dimensional effective theory [26,27] (a joint strong coupling and hopping parameter expansion that can be mapped to SU(3) spin model), decoupling the gauge links using Hubbard-Stratonovich transformations [28], "Induced QCD" based on an alternative discretization of Yang Mills Theory [29,30]. All these approaches have their shortcomings, and a method that allows to simulate lattice QCD at finite density has not yet been established.
A dual representation of lattice QCD has only been derived in the strong coupling regime: the classical arXiv:1911.08389v1 [hep-lat] 19 Nov 2019 formulation in terms of a monomer-dimer-polymer system has been both addressed via mean-field [31][32][33][34] and Monte Carlo [35][36][37][38] and is valid only in the strong coupling limit. More recently also the leading order gauge corrections have been included [39,40]. At strong coupling, also the fermion bag approach has been used [41,42] and continuous time methods have been applied [43,44]. Beyond the leading order, a dual formulation for lattice QCD is notoriously difficult. First attempts were made using a character expansion [45,46] and the so-called Abelian Colour Cycles [47]. Our ultimate goal is to find a dual representation for lattice QCD: we propose a new approach based on a combined expansion of the Wilson plaquette action (strong coupling expansion) and of the staggered action (hopping and quark mass expansion) to all orders. The integration order is, as in the case of the strong coupling formulation, swapped, with the gauge integral being performed first while Grassmann integration is carried out after a reparameterization of the link integrals. The strong coupling methods we use go back to the early days of lattice QCD, where computers for large scale simulations were not yet available [48,49]. But only due to recent progress in the computation of 1-link integrals (invariant polynomial integration [50,51]) we have complete control on the evaluation of the resulting Boltzmann weight ending up with a fully dualized partition function. The challenge when going beyond the leading order correction is that this dual representation needs to capture non-local effects: it is no longer possible to write the partition function as product of site weights and bond weights only. The basic objects of our dual representation have a tensorial structure. In this paper we show a strategy to compute these tensors. Our method is not restricted to staggered fermions and can readily be applied to Wilson fermions as well.
The paper is organized as follows: In section II we review the computation of link integrals and introduce the SU(N ) Decoupling Operators which constitutes the building blocks of the whole dualization process. In section III we sketch the steps needed to recover the color singlet Boltzmann weight from the computation of polynomial gauge integrals. In section IV the dualized partition function will be presented along with the expression of various observables in terms of the dual degrees of freedom and a discussion about the sign problem. In section V numerical crosschecks from exact enumeration in low dimensional systems will be shown. Finally in section VI we draw our conclusions.

II. STRONG COUPLING EXPANSION AND LINK INTEGRATION
We consider the finite density partition function of lattice gauge theory with SU(N ) gauge group, using the Wilson gauge action and one flavor of unrooted staggered fermions {χ, χ} with lattice quark massm q = am q where = (x, µ) and x stand respectively for lattice links and sites and DU is the Haar measure. The gauge links U are SU(N ) elements, while S g and D f are respectively the plaquette action and the massless staggered Dirac operator: where µ q = 1 N µ B is the lattice quark chemical potential and η µ (x) are the usual staggered phases. All traces are intended to be over color indices and in the following we will always use the letter p to label lattice plaquettes.
The first step in the dualization process is to perform a combined Taylor expansion of Eq. (1) in the reduced gauge couplingβ ≡ β 2N = 1 g 2 and quark masŝ m q : The sum is over the positive integers that single out a particular term in the expansion: (n p ) n p is called the (anti-) plaquette occupation number, m x the monomer number and d ,d stem from the expansion of the massless staggered Dirac operator in forward (d ) and backward (d ) direction. The quantity G contains the non-local part of the computation and is given by a Gauge+Grassmann integral over the whole lattice.
Our dualization corresponds to exactly integrate out the gauge links U x,µ and the Grassmann fieldχ, χ, trading the original degrees of freedom with the integer variables appearing in Eq. (3) . This can be achieved by splitting the computation of G in two steps: 1) The traces appearing in Eq. (4) are written explicitly: we do not perform the matrix multiplication, leaving the color indices uncontracted. As a consequence, the gauge integral SU(N ) DU , becomes a disjoint product of monomial integrals with open color indices and we integrate out every gauge link independently.
2) After gauge integration, some of the open color indices need to be contracted between links that share a common site such that the plaquette terms are recovered. The remaining indices are contracted with the Grassmann-integrated quark fields. We postpone the description of this step to Section III.
If the matrix multiplications are not performed, the link integrals to be computed assume the following general form: where the values a, b depend on the dual degrees of freedom {n p ,n p , d ,d } and we make use of the multi-index notation: Due to the properties of the SU(N ) invariant Haar measure, the integrals in Eq. (5) are non-zero only when the difference a − b is an integer multiple of N . As it will be explained in the next section, this corresponds to a (gauge-) constraint for the dual degrees of freedom. We define: and for U(N ) gauge theory q = 0. Invariant integration over compact groups have been studied extensively in the last decades [48,49,[52][53][54][55][56][57][58][59][60][61][62][63]. Although many results concerning the U(N ) group are known since many years, only recently the SU(N ) generalization has been found [50,51]. Integrals of the type Eq. (5) are now known in closed form in term of Generalized Weingarten Functions. The interested reader will find our derivation in Appendix A. Here we only quote the main result assuming, without loss of generality, a > b (a = qN + p, b = p): (8) In the previous equation, ⊗q is a shortcut for the q−fold product of Levi-Civita epsilon tensors and δ lπ i , δ j kσ are the generalized Kronecker deltas where the indices are reordered according to the permutations π and σ. The leftmost sum with: is carried over the (qN +p)! q!N ! q p! possible ways of partitioning the color indices i, j (which are qN +p) into the q epsilon tensors of size N and into the delta function of size p. All the partitions obtained from each other by only permuting the α r in Eq. (9) are equivalent. Also, note that in Eq. (8) the i and j indices are partitioned in the same way. As in the U(N ) case, a further summation over all possible permutation of indices in the delta functions (sum over π, σ) is present. Every term in the double sum is weighted by the functionWg q,p N , which is a class function of the symmetric group S p and represents the natural generalization of the Weingarten functions Wg p N =Wg 0,p N appearing in the U(N ) result [58,59]. Their expression in terms of the characters χ λ of the symmetric group is: The sum is over the irreducible representations (ir-reps) 1 of the symmetric group S p , while f λ is the dimension of the irrep λ of S p and D λ,N +q is the dimension of the U(N + q) irrep with highest weight {λ 1 , . . . , λ len(λ) , 0, . . .}.
By inspecting Eq. (8), it seems tempting to consider the permutations π, σ as an additional degree of freedom to be evaluated stochastically and to proceed with the index contraction considering single terms in the sum of Eq. (8). Unfortunately, the sign of the generalized Weingarten functions strongly oscillates, preventing the ap-plication of standard Monte Carlo methods. A similar failure is observed in [46] where the role of theWg's is played by the 18j symbols. Instead, we found it useful to exploit the knowledge of the character expansion in Eq. (10) to reparameterize the I-integral. As a starting point we write the S p characters as a matrix product of the corresponding matrix representation: Writing the matrix product explicitly, we are able to cast the Weingarten functions and (after summing over the permutations) the I-integrals in the following form: 1 The Sn irreps are in 1-1 correspondence with the integer partitions of n. In Eq. (10) only the irreps that correspond to integer partitions with at most N parts contribute. 2 Every finite group admits a unitary irrep. In the case of the symmetric group the matrix elements can be also chosen to be real. This basis is known as the Young's orthogonal form.
where n ρ is the total number of operator indices. The operators P ρ decouple the colour indices i, l and k, j in the I-integral and its computation has been automatized by using the standard hook rule to determine f λ and D λ,N . The irreducible matrix elements M λ mn (π) in the orthogonal representation are computed numerically decomposing the permutation π as a product of adjacent transpositions τ j,j+1 and then using the axial distance formula to compute the matrix representation associated to them (see [64] p. 8). The quantity ρ, which identifies a given operator in Eq. (15), will play an important role in the following. We will refer to it as Decoupling Operator Index (DOI), which can be cast into an integer in the range {1, . . . , n ρ }.

III. INDEX CONTRACTION AND TENSOR NETWORK
Given the result in Eq. (15), the next step is to perform the contraction of the color indices {i, j, k, l} in the I-integrals making use of decomposition in terms of the operators P ρ obtained in the previous section. This contraction must be performed for every lattice link . The (anti-) plaquette occupation numbers (n p ) n p together with d ,d determine how the contraction has to be performed in order to recover Eq. (4). We distinguish two types of color indices: those stemming from the expansion of the hopping term and those arising from the expansion of the Wilson gauge action. We will refer to them as the "fermionic" and "gluonic" color indices. The contraction rules for the fermionic color indices are uniquely determined by d andd . These indices are contracted with the Grassmann fields appearing in the corresponding fermionic matrices M and M † . Due to the nilpotency of the Grassmann measure, exactly N (for SU(N )) (anti-) fermion fields (χ x ) χ x have to be present at each site x in order to obtain a non-zero contribution. This property will corresponds to a constraint on the allowed degrees of freedom d ,d , m x . Similarly, gluonic color indices are contracted according to the plaquette they correspond to. In this case the contraction takes place between the color indices of the I-integrals corresponding to links sharing a common site. The contraction rules are determined by the (anti-) plaquette occupation numbers and allows us to recover the plaquette terms in Eq. (4).
The key insight is that fixing the values of the DOI ρ for each link , makes the contraction step local. This means that the contraction of the color indices can be carried out independently at different lattice sites. In Fig. 1 we illustrate the procedure in d = 2. The extension to any number of dimensions is straightforward. To see why the contraction at different lattice sites decouples, let us first consider the case of the gluonic color indices, and rewrite explicitly the definition of the plaquette and anti-plaquette: For any gauge link U with = (x, µ), the contribution from the product of traces TrU p , TrU † p for all plaquettes p containing the link can be gathered into products of matrix elements of U and U † : where U 1 , . . . , U 4 are the four links contained in the plaquette and summation over repeating color indices is implied. The l.h.s. of Eq. (18) thus contributes to the gluonic color indices {i, l}, {k, j} within I i j ,k l . Therefore, given the structure of the operators in Eq. (15), (P ρ ) l i and (P ρ ) k j contract respectively with the operators attached to site x and x + µ. Fermionic color indices arise instead from terms of the form: They can be written explicitly as = (x, µ) 3 : and again by inspecting at the index structure of Eq. (15) the indices {i, l} of the first operator (P ρ ) l i are contracted with the Grassmann variables at site x while the indices {k, j} of the second operator with the Grassmann variables at site x + µ. This concludes the proof of the locality of the contractions which is schematically shown in Fig. 1. At each site, the corresponding Grassmann integral is replaced with the usual product of two epsilon tensors: Hence, on a d-dimensional hypercubic lattice, the 2d operators attached to x, together with the epsilon tensors in Eq. (21) are jointly contracted according to the values of the dual degrees of freedom. This gives a scalar quantity which only depends on the underlying dual degrees of freedom and on the values of the DOIs on the links attached to x. The dependency on the dual degrees of freedom {n p ,n p , d ,d , m x } is local, in the sense that the contraction at site x is completely determined by the monomer number m x , by the values of d andd on the 2d links attached to x, and on the (anti-) plaquette occupation numbers of the 2d(d − 1) plaquettes attached to x. Different sites communicate only via the common DOI ρ on the shared leg. We can collect the scalar quantities obtained from the contraction of different Decoupling Operators in a tensor 3 The dependency of M, M † on ηµ and µq can be factored out as it will be shown in the dual partition function Eq. (27).
where Tr Dx is a "reordered" trace in color space that depends on the local dual degrees of freedom D x and tells us how to contract the color indices of the operators P ρ x µ according to the rules discussed above. In Eq. 22 the DOIs depend on D x implicitly due to the fact that the dual degrees of freedom determine the value of (q, p) in the I-integral Eqs. (5), (8). The tensor elements T ρ x , can be computed numerically by building up the operators P ρ x µ and saturating their color indices according to the contraction rules from D x . Given T ρ x , the value of G is given, up to a global fermionic sign (see Sec. IV), by (23) and the constraint ρ x µ = ρ x+µ −µ just stems from the fact that DOIs on the same link have to be equal as depicted in Fig 2. In this form, the system is represented by a Tensor Network where the value of G is obtained by contracting the network to a scalar.
In some cases the contraction of different operators produces the same tensor elements. For instance, two op- where (α, β) and (α , β ) only differ by a permutation of fermionic color indices, will produce the same element up to a sign factor. This is clear since the fermionic color indices are always contracted with the Grassmann variables, and a permutation of fermionic color indices only amounts to a reordering of the corresponding indices in the epsilon tensors in Eq. (21). The possible relative minus sign is however unimportant. In fact, it will always cancel when considering the contraction of the operator with same DOI and which lives on the same link. We therefore identify these DOIs taking into account the combinatorial factor from their multiplicity. This reduces the size of the tensor T x , hence the numerical cost of contracting the network.
As we already mentioned, not all sets of dual degrees of freedom are allowed. On each lattice link they have to combine in a way that the corresponding I-integral is non-zero, while at any site exactly N (anti-) fermions carrying different colors must be present. We refer to these two constraints as Gauge and Grassmann constraint. Introducing where k is the dimer number and f the quark flux, for each link = (x, µ) the gauge constraint reads: FIG. 2. The Tensor Network resulting from the dual description: Depending on the dual degrees of freedom at any lattice site the tensor Tx is evaluated. Given two neighbouring sites x and x+µ, the tensor index on the common link is contracted . The value of G is the scalar quantity obtained by contracting all pairs of indices between lattice neighbours. In the figure, the tensor indices have been displaced for visualization purposes.
where δn µ,ν (x) ≡ δn p = n p −n p . For each site x, the Grassmann constraint requires in addition: The Eqs. (25), (26) generalize the constraint in the strong coupling limit (where n p =n p = 0 and f x,µ = ±N, 0). Notice that in contrast to strong coupling QCD, dimers (d x,µ = 0) and fluxes (f x,µ = 0) are not mutually exclusive on a given link. The set {n p ,n p , f , k , m x } subject to Eqs. (25), (26) along with the corresponding DOIs define our final dual partition function.

IV. PARTITION FUNCTION IN THE DUAL REPRESENTATION
A. General properties Using the quantities defined in the previous sections, the partition function Eq. (1) can be rewritten as: where the staggered phases η µ are included in the fermionic sign σ f whose form will be discussed in the next subsection. In Eq. (27) the dependence of the DOIs ρ x µ on {n p ,n p , k , f , m x } is implicit, and the constraints in Eqs. (25), (26) are supposed to be fulfilled. In Fig. IV A we show the typical structure of an allowed configuration in d = 2 for N = 3. DOIs are not shown. Notice that quark fluxes f x,µ always form closed loops due to the flux conservation law in Eq. (26). As opposed to the strong coupling limit, the loops can overlap with dimers and can be intersecting. The system is thus an ensemble of unoriented dimers k , monomers m x , closed quark fluxes f and plaquettes. The DOIs instead, can be either thought as a mere mathematical tool to automatize the computation of the statistical weights away from strong coupling or as an additional degrees of freedom to be also sampled via Monte Carlo. Before discussing these two possibilities, we want to highlight some features of the partition function Eq. (27).
A great simplification occurring, is that the strong coupling contributions always decouple from those corresponding to non-zero {n p ,n p }. As we showed in [40], at strong coupling the tensors T ρ have only one non-zero element. Although for baryon fluxes (f x,µ = ±N ) this is a trivial statement as there is only one possible DOI per link, in the case of dimer contributions it is a consequence of the structure of the Decoupling Operators. To show this feature, let us consider the case where only dimers are attached to a given site (Fig. 4, left). Contracting the indices of each delta function appearing in the definitions of the corresponding operators Eq. (15) (for dimer contributions epsilon tensors are absent) with the Grassmann fields, we obtain: where sgn(π µ ) is the parity of the permutation π relative to the operator P in direction µ. Hence, the contraction of single deltas decouples, and due to the Great Orthogonality theorem (GOT) the only surviving DOI is the one associated to the totally antisymmetric irrep of the symmetric group: where in this case ρ x µ = (m, n) λµ as there are no epsilon tensors. Given this result, one can obtain the usual contributions from monomers and dimers (f = 0) to the strong coupling partition function: where we dropped the dependency onm q and µ q in the partition function Eq. (27) at β = 0. The weight for strong coupling baryon loops can be also easily recovered since the corresponding tensors are of size one by construction. The decoupling Eq. (29), also extends to the case where strong coupling dimers combine on a given site with links carrying a non-zero gauge flux. In this case the tensor T ρ can be decomposed as: where the proportionality coefficient depends on the external strong coupling dimer legs. An example is provided in Figure 4 (right), while in Appendix [B] we rederive the O(β) partition function. The indices of the tensor in the r.h.s. of Eq. (32) (ρ x exc. ) correspond to the DOIs of the links attached to excited plaquettes. A similar decomposition holds in presence of an external baryon. As a consequence, the value of G can be written as: where a bubble B i is any plaquette-connected region and two bubbles are disconnected if they do not share an excited link (i.e. a link attached to an excited plaquette). Therefore, to evaluate the total weight of a configuration, it is sufficient to use the more involved structure based on the tensor network contraction on the sublattice where the plaquette occupation numbers are non-zero, exploiting the factorization of the tensor network for disconnected plaquette contributions. The strong coupling part can be evaluated using the standard combinatorial formulae (e.g. Eq. [31]). This is particularly useful since at small values of β the bubbles B i extend over few lattice spacings and the non-local effects from the tensor network are manageable.

B. Complexity and sampling strategies
We now want to comment on the complexity of the dual partition function Eq. (27). Given the background {n p ,n p , k , f , m x }, the weight of the configuration is obtained by contracting the tensor network T ρ x . Two different strategies can be used to sample the partition function: one can either exploit the "bubble decomposition" in Eq. (33) to simplify the numerical cost of contracting the network, or consider the DOIs as an additional degrees of freedom to be evaluated stochastically. In the first case a relevant question is whether the decomposition Eq. (15) is optimal, meaning that the number of Decoupling Operators n ρ in Eq. (16) is the smallest possible. The machine time required to contract the network depends almost completely on the size of the external legs of the tensors. In the U(N ) case we already know that the answer is positive as it can be shown that the operators P ρ are mutually orthogonal, hence independent. It is therefore not possible to perform a reparameterization of the I-integral that results in a decomposition of the type Eq. (15) with a smaller number of terms within the sum. For SU(N ) the situation is not completely clear as we could not prove that the Decoupling Operators corresponding to the SU(N ) contributions (non-zero q) are independent. The question whether the complexity can be reduced using a different parameterization is thus still open. In any case, the lower bound on the number of DOIs provided by the U(N ) result already tells that to a certain degree, the complexity is unavoidable. This number grows as a factorial as the (anti-) plaquette occupation numbers increase and contracting the resulting tensors along the excited plaquettes becomes in general too expensive in d > 2. Even though this description can be used as a starting point for future theoretical development, as it stands, the bubble decomposition and the corresponding tensor network cannot be used for exact calculations in full QCD. Nevertheless, the dual form of the partition function together with the decomposition Eq. (33) can be used to study lattice QCD perturbatively in β, by truncating the expansion of the plaquette action. We remind that this has been done so far, using worldline formulations, only for the leading O(β) corrections [39]. Truncating at O(β n ) means that the allowed configurations are only those corresponding to bubble contributions of at most O(β n ). Making use of this definition, the truncation corresponds to a free energy which is exact up to the same order. For instance, at order O(β 2 ), the largest allowed bubble contributions are 2 × 1 rectangles with an elementary (anti-) plaquette excitation (n p ) n p = 1 as sketched in Fig. 5 In the SU(3) case, four of the six tensors T ρ x making up the bubble, are matrices of sizes at most 6×1, while the other two are rank 3 tensors of sizes at most 6 × 1 × 1. Contracting the reduced tensor network within the bubbles is straightforward and can be done on the fly during Monte Carlo evolution without any overhead. Higher order contributions (n = 3, 4, 5, ...) can be also easily evaluated in 4d. One possible strategy is to compute and store beforehand all the tensors T ρ that are compatible with the constraint and the truncation order. This step needs to be performed one time only, as the tensor network does not depend on the simulation parameters. For instance, the computation of all the tensors needed to address the 4d N 3 LO correction to strong coupling QCD took ≈ 10 2 s on a single CPU, with the largest tensor having only O(10) non-zero elements. The tensors are then loaded and used to compute the value of G Bi when the bubble B i needs to be updated. We are currently designing an ergodic algorithm capable to sample the bubble contributions which will be illustrated in a forthcoming publication where the higher order β corrections to the strong coupling phase diagram will be addressed.
The second possibility is to consider the DOIs as an additional degrees of freedom along with {n p ,n p , k , m x }. The complexity of the tensor network can be thus overcome by importance sampling. In this case, a given configuration is determined by selecting one tensor element for each lattice site. When doing so, the weight of such configurations is local and an additional Metropolis acceptance test can be easily introduced to make sure that the system explores the DOIs configuration space during Monte Carlo. For instance, when a bond, an ele-mentary plaquette or a cube containing 6 plaquettes is updated, we can propose a quasi-local update by randomly choosing new DOIs on the bonds involved. The feasibility of this approach depends on the minus signs induced by splitting the former configurations in terms of {n p ,n p , k , m x } into sub-configurations where one selects a single tensor element out of the full tensor T ρ x . In fact, the tensor elements are not positive defined and it could happen that without contracting the network an additional source of minus signs is plugged into the system. As mentioned in Sec. 1, the main obstacle to the use of the permutation basis was in fact the severe sign problem induced by the Weingarten functionsWg. Using instead the DOIs, the induced sign problem can be much milder. In Sec. V we will provide preliminary evidences to this statement based on an exact enumeration of the partition function.

C. Sign Problem
Having discussed the partition function, we now turn to the computation of σ f in the dual representation. In general, the fermionic sign of a configuration is determined by the staggered phases, the anti-periodic boundary condition for fermion fields and by the so-called geometric sign. The latter stems from the fact that, starting from Eqs. (3), (4), one has to reorder the Grassmann variables contained in the matrices M , M † before performing the Grassmann integration at each site. At strong coupling, only baryon loops (f = ±N ) can induce a negative sign and the geometric sign is known in closed form. It combines with the staggered phases and the winding number to produce for SU(2N + 1), whereas σ f = +1 for SU(2N ). In Eq. (42), C is the set of links traversed by baryons, N (C) the number of baryon loops, N (C) the number of baryon loop segments in negative directions and ω(C) the total winding number in temporal direction. At strong coupling, the baryon-loop induced sign problem is very mild and the finite density phase diagram can be mapped out using sign reweighting [38,43].
At finite β the structure of the geometric sign gets more complicated as the allowed quark fluxes can also be intersecting and the equality in Eq. (42) does no longer hold true. Specializing to SU(3), a fermionic minus sign is only induced by single and triple quark fluxes while for dimers and di-quarks σ f = +1 4 . To compute the geometric sign 4 For SU(2) only single quark fluxes can produce a negative σ f . for intersecting loops, as closed formulae are apparently lacking, we explicitly count how many times the Grassmann variables corresponding to odd fluxes need to be commuted to bring them in canonical ordering at each lattice site. Formally, σ f can be written as where C 1 and C 3 are respectively the set of links traversed by single and triple quark fluxes and the winding number ω(C 1 , C 3 ) is given by where N τ is the temporal extent of the lattice. In Eq. (35), σ G is the global geometric sign and in general cannot be factorized as a product of two terms depending separately on C 1 and C 3 . It is computed after contracting the tensor T ρ x at fixed background {n p ,n p , d , f , m x }, and cannot be cast into a product of local minus signs that can be absorbed with a redefinition of the tensors T x .
As we already mentioned, another potential source of negative signs, which does not depend on the fermion fields, is caused by the lack of positivity of the tensor elements T ρ x . This issue is relevant when considering the DOIs as an additional degrees of freedom. Strong oscillations of the sign within the tensor network can in fact hinder the application of importance sampling. Although, this question can be only answered on the basis of Monte Carlo simulations via sign reweighting, in Sec. V we will show preliminary results on the interplay between the fermionic and the tensor network induced sign problem, obtained from exact enumeration of the partition function on small volumes.

D. Observables
As both the fermion field and the gauge links have been integrated out, the observables in the dual representation take a different form. The ones defined as derivatives of log Z with respect to external parameters, can be obtained taking derivatives in Eq. (27). For instance, the chiral condensate ψ ψ , baryon number n B and average plaquette P are given by: and higher order derivatives (i.e. susceptibilities) can be obtained in a similar fashion, evaluating the various cumulants of m x , f x,0 and n p +n p . The definition of nonderivative observables, such as the Polyakov loop, is less trivial in the dual representation as the gauge fields have been already integrated out. Formally, the Polyakov loop can be written as a ratio of partition functions where Z L is the partition function with a Polaykov loop insertion and Z is given by Eq. (27). Z L admits a dual representation similar to Eq. (27) with modified tensors T ρ x at the sites x crossed by the Polyakov loop. Here, we will not discuss its specific form. The strategy to sample the Polyakov loop will be addressed in a following paper containing the numerical results from Monte Carlo simulations.
To perform finite temperature calculations at non-zero β, we can either vary the temporal lattice extent N τ at fixed lattice spacing a according to or perform simulations on anisotropic lattices. The first strategy works well for β 2N close to one, where one can meaningfully fix the scale and determine the relations β(a),m q (a) imposing a physical constraint on the lowenergy mesonic spectrum. Instead, at small enough β (and especially at strong coupling), the scale cannot be fixed as the lattice is too coarse. In this case the temperature is changed inducing a physical anisotropy ξ = as at by using two different β couplings for spatial (β s ) and temporal (β t ) plaquettes and introducing a fermionic bare anisotropy γ that favours hoppings in temporal direction. Implementing this modification in the partition function Eq. (27) is straightforward. The modifications can be summarized in: e µqδµ,0fx,µ k !(k + |f |)! → e µqδµ,0fx,µ k !(k + |f |)! γ δµ,0(|fx,µ|+2kx,µ) , and (n p s/t ) n p s/t are the (anti-) plaquette occupation numbers for spatial and temporal plaquettes. The relation between the bare parameters β s , β t , γ and the physical anisotropy ξ has to be determined nonperturbatively via the so called anisotropy calibration procedure (see [66] and references therein). This has been done so far in the strong coupling limit at zero [66] and non-zero [67] quark massm q . The extension to strong coupling QCD including O(β) is in preparation. In this paper, we will be only interested in the evolution of the observables as a function of β,m q , µ q , hence in comparing the dual observables with the HMC results we will set γ = 1 and β s = β t = β. In a finite volume, the partition function Eq. (27) truncated at a given order O (β n ), is always a finite polynomial P (β,m q , z q ) in β, quark massm q and fugacity z q = exp lations at zero chemical potential µ q . We considered as gauge group both U(N ) and SU(N ) for N = 2, 3, and obtained the full polynomial P on 2×2 and 4×4 volumes for various n ≤ 6, employing periodic boundary conditions in all directions. To enumerate the coefficients of the polynomial, we first pre-computed all the tensors T x (D x ) compatible with Gauge and Grassmann constraints and with the truncation order. We then generated all possible combinations of {n p ,n p , d , f , m x } and contracted the corresponding tensor network to determine its contribution to P. The result of this contraction was then multiplied by the fermionic sign σ f . In Tab. I we show the total number of configurations as a function of the truncation order. At fixed O(β n ), this number grows very large as a function of the number of dimensions, hence we could not perform the exact enumeration in d > 2. Nevertheless, as our dual formulation does not present any fundamental difference when applied to higher dimensions, we believe that this crosscheck gives some hints about its validity in d = 3, 4. In Figs. 6-7, we show the results of this comparison for the average plaquette P and for the chiral condensate χχ , respectively for U(2), U(3) and SU(2), SU(3). They were analytically determined from P (β,m q , z q ) by for L = 2, 4 and N = 2, 3. A clear result, emerging from Figs. 6 -7, is that the strong coupling branch is well described by the polynomials P for all quark masses, and that the agreement with the HMC results indeed extends to larger and larger β as the truncation order is increased. Notice that at any fixed order O(β n ), the continuum limit β → ∞ of the average plaquette is always zero, as it can be seen from its definition in terms of the dual degrees of freedom Eq. (37). The value β max that corresponds to a maximum of the average plaquette can be used as a strong upper bound for the validity of the expansion at O(β n ). Another relevant information we can extract from the exact enumeration concerns the magnitude of the sign problem. A measure of its severity is given by the average sign σ. It is defined as the ratio of the full (Z) and the so-called phase quenched (Z p.q. ) partition function: The latter is obtained by taking the norm of each statistical weight in Z. In our case, the partition function is a sum of real quantities, hence the norm is just the absolute value. From the definition it follows that σ ≤ 1 and the equality holds if there is no sign problem. As we want to compare the sign problem in the dual representation with and without the DOIs as an additional degree of freedom, we need to employ two different definitions for the phase quenched system. In the first case, we need to set the fermionic sign σ f = 1 and take the absolute value of each tensor element while in the second case it suffices to set σ f = 1 as a configuration is now determined by the contracted tensor network. The two resulting average signs are respectively σ = σ f σ ρ and σ f . In Fig. 8 they are shown in the most relevant cases of SU(2) and SU(3) as a function of the truncation order and in the SU(3) case at non-zero baryon chemical potential as well. In the SU(2) case, the fermionic sign does not play a role on a 2 × 2 lattice as the allowed loop geometries have σ f = 1 and the only source of negative signs is due to the tensor network. In Fig. 8 (a), this is shown for various quark masses and for different truncations up to O(β 4 ). The trend corresponds to a mild deterioration of the sign as β and the truncation order is increased. This deterioration is not dramatic and corresponds to a fall in σ of about 10% at β ≈ 2. In the case of SU(3), the fermionic sign σ f is not positive (Fig. 8 (b)) but remains almost constant as a function of β and truncation order. When considering the sign σ , a trend similar to the SU(2) case shows up (Fig. 8 (c)): the sign in this case remains almost constant for β ≤ 1 where it starts to get worse as a function of the truncation order. When a non-zero baryon chemical potential is considered (Fig.8 (d)), this behaviour does not change. Although our numerical results are preliminary and only based on an exact enumeration of the partition function on a 2 × 2 lattice, we highlight some of the findings that could extend to larger volumes and higher dimensions. First of all, the comparison with the HMC simulations shows that our method provides the correct Boltzmann weights for the dual configurations as the strong coupling branch up to β ≤ 0.5 is well described by the polynomials P (β,m q , z q ) for different gauge groups and quark masses. This is non-trivial: the number of configurations considered already on a small volume is very large (Tab. I) and the computation very sensitive to the exact evaluation of the tensor elements T ρ x and of the fermionic sign σ f . The evaluation of the Boltzmann weights away from the strong coupling limit is thus under control and can be used in Monte Carlo simulations if the truncation order O(β n ) is not too large. We considered two main strategies in view of Monte Carlo simulations: the bubble decomposition Eq. (33), and sampling an enlarged configuration space that includes the DOIs. While we are not yet in position to draw general conclusions on the sign problem, when comparing it with the behaviour in a permutation based dualization [50], the improvement is drastic. Hence, resumming the permutations as in Eq. (13), effectively reduces the sign problem from the Weingarten functions.

VI. CONCLUSION AND DISCUSSION
In this work we proposed a new strategy for the evaluation of higher order contributions in the strong coupling expansion of lattice QCD with staggered fermion discretization.
The dual representation in terms of local tensorial weights improves on the sign problem as compared to evaluations in a Weingarten function basis. The color constraints from gauge and Grassmann integration combine to yield admissible configurations that after contracting the tensors are intersecting plaquette surfaces that are either closed or bounded by fermion fluxes. The configuration space is thus a worldline & worldsheet representation with the additional multi-indices ρ which we called Decoupling Operator Indices and that encode the information about the interplay of the unitary and symmetric group.
The prospects of Monte Carlo simulations of lattice QCD at finite density in the strong coupling regime are encouraging: the weights in the partition functions are local, and various strategies to sample the partition function Eq. (27) are possible. We will be able to obtain results on the phase diagram in the strong coupling regime beyond O(β). One possible way to perform Monte Carlo is via a Worm algorithm based on vertices, as was discussed in the context of the Schwinger model [68]. The drawback of this method is that this algorithm slows down drastically with the number of vertices (Tab. II). This limits in practice the maximal order of β feasible in 3+1 dimensions. Another intriguing possibility is to perform local Metropolis updates that could be parallelized. We can either sample the multiindices ρ alongside the occupation numbers (monomer, dimer plaquette and fermion flux) or contract all ρ's on a background of occupation numbers, employing the bubble decomposition discussed in Sec. IV B. Even when including the higher orders, the sign problem might still be manageable if β is not too large. For what values of β simulations are possible in 3+1 dimensions is only to be seen in practice and will be co-determined by the magnitude of the sign problem, by the numerical cost of evaluating the Boltzmann weights and will also depend crucially on the quark mass. Our representation is also valid for pure Yang Mills theory, which is sign problem free after contracting the tensor network.
A finite chemical potential does not introduce an additional sign problem as the zero-density Boltzmann weights get multiplied only by positive factors. Moreover, at fixed values of β, the sign problem becomes milder for large enough temperatures and/or densities: the worldline configurations contributing to the fermionic sign σ f simplify as the quark fluxes are mainly aligned in temporal direction. A detailed analysis of the sign problem requires, however, large volumes that cannot be obtained via exact enumeration and will be presented in a forthcoming publication.
by taking successive derivatives with respect to the sources J, K ∈ GL(N, C), according to the following equation: To evaluate Z q,p [K, J], we first convert the integral [A1] into a U(N ) integral, using and assuming for the moment J, K ∈ U(N ). The equality holds because the last integrand is invariant under multiplication of the U matrix by a complex phase. As a consequence, it gives the same result when integrated using the SU(N ) or the U(N ) Haar measure. Exploiting this trick, we can make use of the U(N ) character expansion to compute the quantity in the r.h.s of Eq. (A3). Thanks to the Schur-Weyl duality [69,70], power of traces of U(N ) matrices have the following character ex- whereχ λ are the U(N ) characters 5 . Instead, det U q are irreducible one dimensional representations ∀q ∈ Z (socalled determinantal representations). According to a standard group theory result, the tensor product of the irrep. V λ with a determinantal representation V q det gives where V λ+q is the U (N ) irreducible representation with highest weight: {λ 1 + q, . . . , λ N + q}. This gives: where the second equality follows from the orthogonality of characters, the third from the combinatorial identity valid for len(λ) ≤ N and the fourth one from the Frobenius relation (see for instance the Appendix A of [49]). The last equality is just a rearrangment of terms. The quantitiesWg q,p N are the Generalized Weingarten func- They are S p class functions and therefore depend only on the conjugacy class of a given permutation. Conjugacy classes and irreducible representations are in 1-1 correspondence. This is the reason why the Weingarten functions can also have integer partitions as argument.
In Eq (A7), h ρ is the number of permutations within the conjugacy class associated to the partition ρ, while t ρ (JK) is a shortcut for Tr(JK) ρi .
The SU(N ) generating functional is and given the polynomial nature of the expression, it can be extended to any K, J ∈ GL(N, C). In the limits q = 0, q = 1 and p = 0 the known results [49], [63] and [48] are recovered. Given the expression (A11), the I-integral is obtained by taking derivatives with respect to the sources K, J. We do not need to do this explicitly. In fact, it is sufficient to know the result in the cases p = 0 and q = 0 and then use Leibnitz Formula for the derivative of a product. Luckily, these two special cases have already been solved respectively by Creutz [48] and by Collins and collaborators in [58,59]. The two results are where δ lπ i and δ j kσ are the usual Kronecker deltas where the indices are swapped according to permutations π and σ. The sum in the first line of Eq (A12) runs over all possible ways α = {α 1 , . . . , α q } of partitioning the qN indices into q epsilon tensors. To get the general I-integral it is sufficient to exploit the fact that the the generating functional (A11) can be decomposed, apart from a trivial combinatorial factor, as a product of Z q,0 and a term that resemble the generating functional Z 0,p . The only difference is in the coefficientsWg 0,p N that must be sub-stituted withWg q,p N . Therefore, by looking at Eq. (A2), when qN derivatives of K act on the power of the determinant det K q , they will reproduce the result for I qN,0 . Similarly, when p derivatives of K and p derivatives of J act on the second term, they will reproduce I p,p with the substitutionWg where the leftmost sum now runs over all the ways (α, β) of partitioning the i, j indices into the Kronecker deltas and into the q epsilon tensors. This "multplicity" stems from the fact that we need to take into account every possible way of acting with the K derivatives, on the determinant and on the traces tr ρ (JK), and from the Creutz result for I qN,0 in Eq. (A12).
and gives the correct link weight to be used when a dimer belongs to an excited link. The genuine SU(N ) configurations are instead of three types: 1) An incoming (strong coupling) baryon splits, at a corner of the plaquette, into a single quark flux and a N − 1 quark flux. Equivalently, a single quark flux and a N − 1 quark flux can recombine to form an outcoming (strong coupling) baryon ( Fig. 9 (b)).
2) An incoming N − 1 quark flux exits the site following the gauge flux induced by the plaquette. A monomer or an external dimer is also present in order to fulfill the Grassmann constraint ( Fig. 9 (c)).
3) As in 2) with the external dimer or monomer replaced by a dimer on one of the two excited links ( Fig. 9 (d)).
The first two types of configurations are somewhat trivial as the associated tensors have size one. There is in fact only one DOI associated to the external legs of the two tensors. Their values can be readily computed: In the case of configurations of type 3), the associated tensor has size 2 × 1. There are in fact two DOIs in direction +1, where a dimer is superimposed to a N − 1 quark flux. This tensor is given by: To remove this multiplicity, it is sufficient to notice that a link carrying a dimer plus a N − 1 quark flux can only recombine with a N − 1 quark flux from another direction. The latter involves an I-integral made up of a single Decoupling Operator. Therefore, we can perform a resummation of the two DOIs by considering the following modified "tensor" of size 1 2 + T 1,2 and all tensors have been thus reduced to scalar quantities. It is easy to check that the modified dimer weights The rules (1)-(5) together with the usual strong coupling weights define the O(β) partition function [39]. Beyond this order, it is not possible to reduce the tensor network to a product of scalar link and site weights depending only on {n p ,n p , k , f , m x }.