Optimising creation operators for charmonium spectroscopy on the lattice

Smearing the bare quantum fields in lattice calculations before applying composite hadron creation operators has a long record of substantially improving overlaps onto low-lying energy eigenstates. A technique called distillation which defines smearing for quark fields as a low-rank linear projection operator into a small vector space of smooth gauge-covariant fields has proven to be both effective and versatile in hadron spectroscopy calculations albeit with significant computational cost . In this paper, more general operators in this space of smooth fields are introduced and optimised, which enhances the performance of the method when tested on systems of heavy quark-anti-quark pairs close to the charmonium energy scale.


I. INTRODUCTION
In hadron spectroscopy, quark smearing is widely used to improve the overlap between the creation operators applied and the desired state by including quark wave-functions that are not point-like and instead have an extended spatial distribution. A well-known technique called distillation was first proposed in [1] and has been widely used thanks to numerous advantages. First, it provides a cut-off of the high eigenmodes of the 3D lattice covariant Laplacian, which are exponentially suppressed by the Jacobi smearing [2], by serving as a projector onto the subspace spanned by low eigenmodes. Second, lattice quark propagation is represented via a low-dimensional matrix, called the perambulator. Since the lattice propagator is the inverse of a large-but-sparse matrix, practical calculations of correlation functions usually require point-to-all methods or stochastic estimations, and these are simplified in calculations involving smeared fields when written in terms of the perambulator. The subspace is small enough to enable direct manipulation in many cases. Third, since perambulators are independent of the creation operators, the inversion cost invested in their construction is fixed and correlation functions of many arbitrary operators can be calculated efficiently.
Unfortunately, this method also comes with disadvantages. Apart from the additional cost of calculating a number of eigenvectors of the Laplacian for each value of time in the lattice, the Dirac operator must be inverted a significant number of times using these eigenvectors. Namely, for a total N v of eigenvectors per time-slice of a lattice of temporal extent N t and N U gauge configurations, 4 × N v × N t × N U inversions must be performed. The choice of N v is related to the level of smearing and, although commonly used values of N v are very small compared to the total number of eigenvectors the Laplacian has, this can still be of the order of 100, which requires many inversions when large lattices and statistics are considered. Additionally, when the same level of smearing is desired on two different lattices, N v scales with the physical volume. This expense led to the introduction of stochastic es-timation in the space of the Laplacian eigenvectors opposed to the actual calculation involving a large number of inversions [3]. While greatly reducing the number of inversions required, this method uses approximations to the perambulators which depend on the estimation done and the dilution scheme of choice, therefore requiring a tuning of the parameters involved. The ultimate goal of this modification is to obtain results of comparable statistical quality to the exact case while keeping the number of inversions at a manageable level.
In this work an improvement of the distillation method is proposed which still allows for the exact calculation of the perambulators, keeping the number of inversions needed untouched, yet taking advantage of a degree of freedom that neither standard distillation, nor its stochastic version, have exploited in detail before; the use of distillation profiles. The original formulation of the distillation operator allowed it to be expressed as where V [t] has the eigenvectors on time slice t as columns and the diagonal matrix J[t] defines the restriction in the space spanned by V [t]. In Ref. [1], the matrix J[t] is set to unity. By construction, S[t] has support only on time-slice t so creation operators have explicit temporal locality. The spatial locality of the smeared fields has been explored [1] and established empirically when J = 1. It seems reasonable to expect using a smooth function of the eigenvalues to set diagonal entries in J, while all off-diagonal entries are kept fixed to zero will similarly result in a local smearing profile. By exploring the effects of J[t] = 1 and generalising this freedom at the meson creation, it is expected that states with better overlap with the ones of interest can be constructed. The rest of this paper is organized as follows: Section II explores the use of J[t] = 1 combined with a generalized eigenvalue problem to find the optimal J[t]. Section III displays the results of the charmonium spectrum in a model of QCD with N f = 2 degenerate heavy quarks whose mass is about half of the value for the charm quark in nature. We find significant advantages of using the optimal distillation profiles when compared to standard distillation. Section IV presents the conclusions of this work and future directions of the method are proposed here.
Appendix A shows how the optimal smearing profile can be equivalently expressed as an optimal creation operator for a meson.

II. OPTIMAL USE OF QUARK DISTILLATION PROFILES
The starting point is the most general distillation operator S[t] at a fixed time t given by Eq. (1). V [t] is a matrix with 4 × N v columns constructed from eigenvectors of the covariant 3D Laplace operator [t]. Since the operator does not act on Dirac components, V [t] is a block identity in Dirac space and each block contains the first N v eigenvectors v i [t], sorted by magnitude of the corresponding eigenvalue λ i [t] in ascending order. A given column V (i,α) [t] has entries given as is a 4N v ×4N v matrix, also block diagonal in Dirac space. Its entries are given as where g (λ) is the quark distillation profile and α, β = 0, ..., 3 are Dirac indices. S[t] can be written as and the role of J[t], and therefore g(λ), becomes apparent; it serves as the modulation of the contribution from each of the eigenvectors. The choice g (λ) = 1 turns S[t] into an orthogonal projector onto the range of V [t] and is the most commonly used one [4][5][6][7][8][9]. In [6] also a single gaussian profile was tested.
It should be noted that, while remarkable results are obtained with this choice, the exponential suppression caused by the Jacobi smearing, the inspiration for distillation, indicates that it is not a requirement that all v i [t] contribute with the same weight that they have in the original quark field when constructing an optimally smeared quark field. With this in mind, a step to improve over the standard distillation projector commonly used consists on taking advantage of the freedom of choice of g(λ) to build an optimal meson operator based on distilled quark fields. The strategy to follow consists of fixing an operator and building a basis of size N B with different g k (λ), k = 1, ..., N B , from which a variational formulation [10] helps to extract the optimal meson operator.
A. Quark distillation profile in meson two-point correlations Following the procedure in [1], consider the meson operator where q is a doublet of mass-degenerate quarks q = (q 1 , q 2 ), Ω ∈ {I, τ 3 } is a 2 × 2 flavour matrix and Γ is an operator that fixes the symmetry channel. The quark fields q i are replaced in Eq. (5) by their distilled counterpartsq i defined as to build the distilled meson operator This operator can then be projected to zero spatial momentum usingÔ and the two-point meson correlation function C(t , t) can be written as where the subscripts U and F denote the averaging over gauge and fermion fields. The integration over the fermion fields can be done analytically, yielding 1 where Tr (Ω) can either be 2 for Ω = I, the iso-scalar case, or 0 for Ω = τ 3 , the iso-vector case. The modulated elementals and the perambulators τ [t, t ] have entries where the quark propagator D −1 includes the dependence on the mass of the degenerate quarks. For clarity the operator Γ[t] is explicitly separated into an operator H that acts on spin space and an operator D[t] that acts on color and coordinate space, the latter being the one that can act on the eigenvectors. These definitions can be written in matrix form as and 1 whenÔ(t) has non-zero vacuum expectation value the replacement It can be seen that the modulated elementals contain a factor independent of the choice of (λ); the elementals of Γ = HD There is a two-fold advantage in defining and calculating these elementals. On one hand, once they are calculated it is possible to introduce the factor g (λ i [t]) * g (λ j [t]) for different choices of distillation profiles. This represents a notable gain as expensive operations, such as the application of an arbitrary number of lattice derivatives, have to be performed only once while the factor including the distillation profile defines the different N B operators. On the other hand, the elementals of operators that contain linear combinations of an arbitrary number of lattice derivatives are a linear combination of derivative elementals defined as where ∇ i [t] is the symmetric gauge covariant lattice derivative in direction i. These derivative elementals need to be calculated once and can then be combined in different ways depending on the operator and scaled according to the chosen distillation profile. Additionally, the number of derivatives involved in these derivative elementals leads to further simplifications. For the case where there are no derivatives, no elemental needs to be calculated due to the orthogonality of the eigenvectors for a fixed time, The elemental in such a case is given by For the case with a single derivative, the elemental is given by and satisfies Λ [m] ji [t] * , i.e it is anti-hermitian. This means that for a fixed time t only 1 2 N v (N v +1) entries must be calculated, instead of N 2 v , via only N v lattice derivatives and 1 2 N v (N v + 1) inner products. The case with two derivatives yields a derivative elemental [t] must be calculated for fixed t, m and n if m = n. However, the aforementioned property means that only three pairs [mn] satisfying m = n out of the possible 6 need to be calculated. For the case m = n this same property dictates that only 1 2 N v (N v + 1) entries must be calculated. The case with three or more derivatives is not treated here yet similar relationships between the derivative elementals and their hermitian conjugates are expected.

B. Determination of the optimal meson distillation profile
We consider a variational basis of meson operators O i with different quark smearing profiles g i (λ) in a fixed symmetry channel Γ = HD. From such a basis, operators can be constructed that have the best overlap with a given energy eigenstate in this channel. The starting point is the correlation matrix For a fixed value t G the generalized eigenvalue problem (GEVP) given by is solved for t > t G and e = 0, ..., N B − 1 for the generalized eigenvalues ρ e (t, t G ) and eigenvectors w e (t, t G ). As shown in [10][11][12], the generalized eigenvalues ρ e (t, t G ), when periodic boundary conditions in the gauge field are used, behave in the large t limit as where c e is a constant, m e is the mass of the state of interest and T is the temporal extent of the lattice in physical units.
Ordering the generalized eigenvalues from largest to smallest, e = 0 corresponds to the ground state and e = 1, ..., N B − 1 to the further excited states. The effective mass m e (t) can be extracted for t = t G + a, ..., T 2 − a via a root finding method as shown in [12]. This GEVP formulation allows access to excited states but it may suffer from numerical instabilities when many similar operators are considered for the construction of C(t). This issue can be attenuated by following the strategy presented in [13,14]. At a chosen time separation t S a singular value decomposition of C(t S ) is performed. The first N S singular vectors corresponding to the largest N S singular values are used to define a pruned correlation matrix where u i are the selected singular vectors for i = 0, ..., N S − 1. The pruned correlation matrix is better conditioned by construction, and less prone to the numerical instabilities mentioned above. Generalized eigenvalues obtained from solving the GEVP of C S (t) are then used to extract the effective mass [12]. Another use of the GEVP formulation is the fact that the entries of the generalized eigenvectors give the coefficients of an optimal linear combination of O i that achieves the best possible overlap with the energy state of interest. The first step to find the explicit form of the optimal operators is to transform the original basis into the basis of the pruned matrix C S (t).
The original meson profile basis is for k = 0, ..., N B − 1 and the pruned profile basis is constructed from the singular vectors u j as with n = 0, ..., N S − 1 and where u n,m denotes the m-th entry of the n-th singular vector. Note that at this point a dependence on Γ is introduced via the matrix C S (t). The second and final step is to linearly combine the pruned profiles using the generalized eigenvectors w (Γ,e) which yields Here w (Γ,e) k denotes the k-th entry of the e-th generalized eigenvector in the channel Γ and the t, t G dependence is suppressed. With the optimal meson profilef (Γ,e) (λ i , λ j ) the optimal meson elemental can be defined, now including the time dependence, as or in matrix form as where denotes the element-wise product and F (Γ,e) [t] is given by Eq. (28) shows the main difference between distillation using a quark profile and a meson profile: in the first case the profile is inserted manually at quark level via the distillation operator V (λ) such that and therefore it is not possible to define an optimal quark smearing profile J (Γ,e) q [t] with entries given by With this in mind it is more convenient to not attempt to translate these meson profiles to quark ones but rather keep the analysis at meson level. One can shift the attention to the Γ structure that is in the meson operator and define an opti-malΓ † structure which couples the quark fields, includes in its definition the optimal meson profile of the desired energy state of the channel of interest and yields the same correlation function as the one obtained from the GEVP described above (See appendix A).

III. CHARMONIUM SPECTRUM
All the calculations are performed on an ensemble with a 48× 24 3 lattice generated with two dynamical non-perturbatively O(a) improved Wilson quarks [15] with a mass equal to half of the physical charm. We use periodic boundary conditions except for anti-periodic boundary conditions for the fermions in the temporal direction. The bare gauge coupling is g 2 0 = 6/5.3 and the hopping paramater is κ = 0.13270. The lattice spacing is a=0.0658(10) fm [16,17] and the flow scale [18] t0 a 2 = 1.8486 (7). A total of N v = 200 eigenvectors of the 3D covariant Laplacian are used in each time-slice of the lattice. These are calculated with a Chebyshev accelerated [19] Thick-Restart Lanczos algorithm [20] with periodic reorthogonalization [21,22] in the first run and full reorthogonalization in the restarts. The Chebyshev acceleration uses a carefully chosen Chebyshev polynomial as a spectral filter which shifts and spreads out the segment of the spectrum one is interested in, therefore accelerating the convergence of the Lanczos algorithm. The periodic reorthogonalization reduces the computational cost of the algorithm while still keeping its precision. In this work the lower end of the spectrum is subjected to the filtering, however one could choose the upper end of the spectrum as well since there is a one-to-one relationship between the eigenvalues and eigenvectors of the 3D covariant Laplacian at both sides of the spectrum [23]. A total of 20 3D APE smearing [24] steps with α AP E = 0.5 are applied on each gauge field before the eigenvector calculation so as to smooth the link variables that enter the Laplacian operator. These parameters were found to yield the best results in a study of the static potential on the same ensemble. No smearing is applied to the gauge field used for the derivatives ∇ i . All calculations are perfomed by a code based on QCDlib, a library written by us in C+MPI that facilitates massively parallel QCD calculations. The inversions of the Dirac operator are performed by calling the package openQCD 2 [25], namely a deflated SAP GCR solver with improvements based on the two-grid method of [26]. The error analysis in this work is done using the Γ method [27,28]. The N B quark distillation profiles g k (λ) used are given by where the σ k define the width of the gaussians. This basis was chosen due to two reasons. First of all, each g k (λ) enforces a suppression of large eigenvalues, just as Jacobi smearing does, and follows distillation's intuition that small eigenvalues contribute more than large ones. Second of all, different basis functions, such as the monomials λ k , were tried with low statistics and the gaussian basis resulted in the most numerically stable GEVPs. The values σ k are chosen such that a wide range of widths is covered. N B = 7 is fixed and the widths are equally spaced between σ 1 = 0.0924/t 0 and σ 7 = 0.7949/t 0 . This choice means that the broadest profile still allows for a non-negligible contribution from the 200-th eigenvalue while the thinnest has already majorly suppressed it. This way the entire range of the 200 eigenvalues is covered and can be either enhanced or suppressed. The resulting quark distillation profiles can be seen in Fig. 1. The pruning of the correlation matrix mentioned in the previous section is done by taking N S = 4 singular vectors for all the operators studied in this work at a time t S = 3a which also corresponds to the value of t G which is fixed for the GEVP formulation.  For the charmonium spectrum the J P C of interest are 0 −+ , 1 −− , 0 ++ , 1 ++ , 1 +− , 2 ++ and 1 −+ , with the last one being a spin-exotic quantum number. Table I shows the operators that transform according to the irreducible representations of the cubic group that contain a contribution from the J P C of interest. Local operators are used due to their ease of computation. Derivative based operators, here taken from [29], are used to sample different spatial structures [1,6,30,31], explore J P C not available via local operators [6,29,31] and also ones that might include gluonic degrees of freedom [4,29,32]. One example is the operator ijk γ j B k , where B i = ijk ∇ j ∇ k is proportional to ijk F jk in the continuum and therefore explicitly contains the field-strength tensor. An important difference between local and derivative-based operators should be mentioned at this point: while for local operators the elementals are diagonal in distillation space, for derivative-based operators this is not the case. This is due to the fact that the latter act on coordinate/color space and therefore act directly on the eigenvectors that are used. In this sense the elementals retain the information of the original derivative-based operator that exists only in the span of these eigenvectors, just as the perambulators do with respect to the inverse of the Dirac operator. All operators were measured on 4080 gauge configurations. Some of the following results were already presented in [33].

A. Using quark-connected correlations
The first results to be presented correspond to the iso-vector spectrum and are obtained when Ω = τ 3 in Eq. (5). The results for different J P C using local operators will be presented first, followed by derivative-based operators. The first channel of interest to be studied is J P C = 0 −+ and Fig. 2 displays a zoomed-in plot of the effective masses obtained for the ground state of the local operator γ 5 built using both standard distillation and the optimal meson distillation profile from the GEVP analysis. The use of the optimal meson profile results in a notable suppression of excited state contamination and therefore an earlier mass plateau. Fig. 3 shows the effective masses of the lowest three states generated by the GEVP using the basis of operators formed from the seven smearing profiles in the bilinear Γ = γ 5 alone. Clear signals for the first-and secondexcited states are resolved using this single spin bilinear in with multiple meson profiles. Fig. 3 includes the result of a fit of the correlation functions to a single hyperbolic cosine. Given this improvement, the form of the optimal meson profile that corresponds to this ground state is of interest. It can be easily obtained from the formulation presented in the previous section. Since the operator is point-like, the elemental is diagonal in distillation space and only the values off (γ5,e) (λ i , λ j ) with λ i = λ j are relevant. As a result, this profile can be plotted as a function of only one variable. Fig. 4 displays this optimal meson profile. It is only known in the interval in which the eigenvalues are contained, which in the plot corresponds to the region between the two gray shaded areas. The error of this profile, which is also a function of the eigenvalues, has values of order 10 −3 so the corresponding errorbars are omitted since they would not be clearly visible at the scale of the plot. Two important pieces of information can be taken from Fig. 4. First, this optimal profile is not a constant, as standard distillation enforces, which implies that, apart from the truncation of Laplacian eigenmodes, a modulated filtering of these remaining eigenmodes can significantly improve the results. Second, this filter, given here by the meson distillation profile, is not arbitrary but rather obeys one of the ideas behind distillation; low eigenmodes contribute considerably more than higher ones and therefore the weights they have in the entries of the elemental must enhance them accordingly.
It is interesting to see whether such a picture emerges for the excited states as well. Therefore the first excited state for this channel is also analyzed. The optimal meson distillation profile for that state is shown in Fig. 5 together with a scaled version of the profile for the ground state for comparison. The errorbars for this excited state profile are also omitted since the corresponding errors are at the percent level. Similar observations as for the ground state can be made regarding the importance of low eigenmodes compared to high ones and the need for further filtering. However, and unlike the case of the ground state, the profile for this first excited state displays a node. The magnitude of the profile has a second local maximum at larger eigenvalues, indicating that for excited states higher modes are more importand than for the ground state. Moreover the suppression of the highest modes is not as pronounced as it was for the ground state. Profiles like these can serve as a guide for the choice of a reasonable number eigenmodes to work with. The second excited state was also analyzed and although the effective mass data becomes rather noisy it is still possible to extract the optimal meson distillation profile for this state. In this case it displays two nodes, and even higher modes contribute significantly. In addition to the profiles in "eigenvaluespace", it is interesting to visualize the corresponding spatial profiles of the optimalΓ[t] for both ground and first excited state. They can be constructed from the optimal elementals and their explicit form is whereΦ γ5,e [t] is the optimal elemental built from the optimal profile as given by Eq. (28) and e = 0, 1 denote the ground and first excited states. In the quark model this particular operator corresponds to a spin singlet and for this reason it is convenient to define the projected operator where γ 5 is chosen to reflect the fact that this is a spin singlet state and the trace is taken only over the Dirac indices, so that Γ (γ5,e) S [t] only has space and color indices. For a fixed time t and a point-like source at an arbitrary 3D position z given by The spatial distribution of interest is obtained by averaging its norm squared over all time slices where the norm is taken in color space such that Ψ (γ5,e) ( x) is only a function of position that can be calculated for each gauge configuration and normalized so that x Ψ (γ5,e) ( x) = 1. Since the spin singlet has S = 0 then the angular momentum must satisfy L = 0 within the quark model, giving it an S-wave classification and justifying the subscript "S" in Eq. (34). One would therefore expect the spatial distribution of the resulting vector to have an S-wave behavior. This is tested by setting the point-like source to be in z = (12a, 12a, 12a) and explicitly calculating Ψ (γ5,e) ( x). Fig. 6 shows Ψ (γ5,e) ( x) (e = 0, 1 for ground and first excited state) in the xy-plane at z = 12a averaged over 2 gauge configurations, which agrees quite well with the mentioned expectation in terms of spherical symmetry and number of nodes. The extraction of effective masses for the different states, together with the corresponding optimal profiles, is first studied for the local operators listed in Table I. Fig. 7 shows the ground state effective masses of some of these operators using three different methods; standard distillation, distillation with optimal meson profiles and stochastic trace estimation. The latter is done by evaluating the expression using 16 U (1) noise sources per configuration and a total of 2000 configurations. The source vectors are non-zero only on the source time-slice y 0 , chosen randomly on each configuration. The Γ = γ 5 correlator requires one inversion per noise vector and configuration and for every additional Γ one more such set of inversions is needed. This method of evaluating mesonic correlators is commonly used, e.g. recently in [17].
The calculation is carried out using a variant of the program mesons 3 . For all local operators standard distillation leads to a significant decrease of excited state contamination compared to the traditional stochastic estimation and the optimal profile further improves on this. For Γ = I, γ 5 γ i , ijk γ i γ j the use of both variants of distillation also reduces the notorious presence of noise that the stochastic estimation displays after the excited state contamination seems to have been suppressed. Table II shows the corresponding plateau averages 4 for standard and optimized distillation. For all cases, use of distillation allows access to an effective mass plateau and in addition, the use of the optimal meson profiles notably decreases excited state contamination, leading to earlier plateaus. Notable examples are the γ 5 and γ i operators, where the length of the plateaus more than doubles. This improvement clearly speaks in favor of such profiles over standard distillation, especially since the number of necessary inversions remains unchanged. 3 Available at https://github.com/to-ko/mesons 4 Plateau averages are weighted by the inverse errors squared Ground state effective masses of the local operators studied in this work using standard distillation,f (λ i , λ j ) = 1, distillation with the optimal profilesf (Γ,0) (λ i , λ j ) and stochastic trace estimation for each operator Γ. Plateaus of the ones using the optimal meson profiles are displayed. Different channels are displaced for clarity. Points of later times of the estochastic estimation results are omitted as to emphasize only their slower approach to the plateau. TABLE II: Ground state effective mass plateau averages for the local operators. The first row corresponds to the case with optimal meson distillation profiles while the second row corresponds to using standard distillation.
The resulting profiles for the ground states of the local operators are displayed in Fig. 8. All display the previously mentioned main characteristic: a modulated suppression of higher Laplacian eigenmodes. Although the heights of the different profiles differ, their overall shapes seem to be very similar, especially for the I, γ 5 γ i and ijk γ j γ k operators which suppress higher modes less than the γ 5 and γ i operators. The latter two have very similar profiles. In all cases the profiles are different from the flat Heaviside function of standard distillation. f( )   9 shows the optimal meson profiles for the first excited states created by the local operators, where again increased dependence on higher eigenmodes is observed. As for the ground state, the I, γ 5 γ i and ijk γ j γ k are very similar in shape just as the γ 5 and γ i are to each other, the latter group more strongly suppressing the higher eigenmodes than the former and having an earlier node. For the case of the second excited state the same similarity between groups of operators is observed regarding shape, location of the two nodes and the level of suppression of eigenmodes. f( ) The effective masses and optimal meson distillation profiles are also extracted for the derivative-based operators. As with the local operators, these profiles yield a significant improvement over standard distillation. Fig. 10 shows the ground state effective masses for some of the derivative-based operators studied. The gain from the optimal profiles can be seen as clearly as for the local operators. Table III shows the corresponding plateau averages. Notable improvement can be seen for example with the operator γ 0 γ 5 γ i ∇ i , where the length of the plateau more than doubles compared to standard distillation. This speaks for the importance of these profiles to make sure the relevant elementals use the eigenvectors in an optimal way. As was done for the γ 5 operator for the S-wave, the spa-tial distribution of the optimalΓ[t] corresponding to the spin singlet P-wave, J P C = 1 +− can be visualized. It is studied via Γ = γ 5 ∇ i when one derivative is used. The same procedure used for the γ 5 operator is employed for γ 5 ∇ i , yielding the spatial distributions shown in Fig. 11 for i = 1.

10 120
t a am eff FIG. 10: Ground state effective masses of some of the derivative-based operators using both standard distillation, f (λ i , λ j ) = 1, and the one with the optimal profiles f (Γ,0) (λ i , λ j ) for each operator Γ. Plateaus of the optimal meson profiles are displayed. Different channels are displaced for clarity.  III: Ground state effective mass plateau averages for some of the derivative-based operators used. The first row corresponds to the case with optimal meson distillation profiles while the second row corresponds to using standard distillation. Of special interest is the 1 −+ channel, an exotic quantum number that can be modeled as a hybrid meson which includes explicit gluonic degrees of freedom combined with quark and anti-quark components [4]. The operators γ 0 ∇ i and ijk γ j B k are used to study this channel, where the latter operator includes the field-strength tensor via B k for gluonic excitation. Fig. 12 shows the ground state effective mass for both operators with and without the optimal meson distillation profiles. Results corresponding to the optimal profiles are plotted starting at t G +a and shown only when the error is smaller than the signal. As expected, the optimal profile leads to a decrease of excited state contamination, more substantial for the ijk γ j B k operator than for the γ 0 ∇ i one. Second, when considering the optimal profile for both operators there is a clear difference in behavior of the effective mass which points to ijk γ j B k having most overlap with the lowest energy eigenstate. Since ijk γ j B k has an explicit gluonic excitation, unlike  Optimal meson profiles are extracted for all derivative-based operators used. The first optimal profile shown in Fig.  13 corresponds to the ground state created by the operator | ijk |γ j ∇ k . The main contribution to the profile comes clearly from pairs of low eigenvalues, as expected in distillation, and exhibits an approximately radially symmetric decay with increasing eigenvalues. As with the local operators, no nodes are observed for the ground state. Fig. 14 shows the optimal profile for the first excited state, where a single node can be seen and, while higher values of eigenvalues are less suppressed than in the ground state profile, the overall tendency of lower eigenmodes contributing more than higher ones still remains. To compare these profiles with those obtained for local operators one can take the one-dimensional pro- Fig. 15 with error bands. The notationf (| ijk |γj ∇ k ,0) (λ) should be understood asf (| ijk |γj ∇ k ,0) (λ, λ) and the same for the first excited state. The suppression of higher eigenmodes, together with the node for the first excited state, are clearly visible with notable resemblance to the local operators. f( )  ground states can be explained via lattice artifacts since these two should coincide when taking the continuum limit. Fig. 16 displays the ground and first excited state for each fixed irrep, where the values come from the operators that displayed the best signal, and the spin assignment for the ground state is given by the label under each data point. The access to a first excited state purely by the inclusion of meson distillation profiles further demonstrates their usefulness. Note the heirarchy of states computed matches the pattern seen in nature, where there are eight narrow charmonium resonances below the DD threshold. In this investigation, which has no light dynamical quarks, this threshold is absent. The ground state for the 1 −+ channel is shown in red to emphasize its spin-exotic nature.    A set of quantities of interest are the mass splittings. Their definitions are taken from [35] and are where ∆m HF is the 1S hyperfine splitting, ∆m 1P −1S is the spin-average 1P −1S splitting, ∆m S−O is the spin-orbit splitting, ∆m tensor is the tensor splitting and ∆m 1P HF is the Pwave hyperfine splitting. They are calculated using the assignment shown in Table I and from the operator that yields the mass with smallest error in lattice units. In all cases only isovector masses are considered. Table V shows the mass splittings obtained for the ground and first excited states. Good precision is achieved in both cases thanks to the use of the optimal distillation profiles. The last row in Table V gives the hyperfine splitting in physical units along with two uncertainties. The first indicates solely the statistical precision while the second includes the systematic uncertainty arising from scale setting, which relates the lattice spacing to physical units. Recall the quark mass is about one half of the physical charm-quark mass so this comparison is made to emphasise the difference between the precision obtained by this lattice calculation and the overall scale setting. The scale uncertainty is significantly higher in both cases.  VI: Fractional overlaps with the ground state using standard distillation (f (Γ,0) (λ i , λ j ) = 1 ) and the optimal meson profiles (Optimalf (Γ,0) (λ i , λ j )). For the Γ = γ 5 bilinear, data for the first-excited state is included Finally, it is important to numerically quantify how much the use of the optimal distillation profiles helps to create a state closer to the desired energy eigenstate for the different channels. To do this one first looks at the correlation function calculated for a given Γ using standard distillation. It has a large t limit given by where m 0 is the ground state mass for the chosen channel and the coefficient c 0 is the amplitude squared of the overlap between the state created by the meson operator involving Γ and the lowest energy eigenstate in the channel. However, to compare it with the obtained data one must take into account the presence of excited state contamination. The correlation function is then given by where B 1 (t) contains the excited state contamination. One can define a normalized correlator C (t) that satisfies C (t G ) = 1 as From the spectral decomposition it is known that B 1 (t) < B 1 (t G ) if t G < t < T 2 , which is the regime which is analyzed, so it holds that A(t) = 1+B2(t) 1+B2(t G ) < 1 in this regime. In the mass plateau interval where B 1 (t) = 0 one obtains A G = 1 1+B2(t G ) < 1 and this parameter quantifies the presence of excited state contamination at early times. If there is none then B 1 (t G ) = 0 and A G = 1, however B(t G ) > 0 means 0 < A G < 1. The closer A G is to 1 the larger the suppression of excited state contamination, which is the goal of the optimized operators used in this work. To get the value of A G one can fit the normalized correlation data to the form of Eqn. (47) in the plateau region where B 1 (t) is considered to be 0. This way only one fit parameter must be found. To simplify the analysis one can solve for an effective parameter A ef f (t) as and estimate A G as the average of A ef f (t) in the previously mentioned plateau region. A similar procedure can be done for the generalized eigenvalues from the GEVP defined in Eqn. (22). They have a large t limit given by [12] ρ e (t, t G ) = 2c e e −me T 2 cosh which is an identical form to the normalized correlation in the case of standard distillation. One can again account for the presence of excited state contamination and perform a fit of the data to the form of Eqn. (47) by taking e = 0 and replacing C(t) C(t G ) with ρe(t,t G ) ρe(t G ,t G ) , which is equal to ρ e (t, t G ) due to normalization. By doing so via an average of A ef f (t) in the region of the effective mass plateau where B 1 (t) is 0 one gets the value that quantifies the suppression of excited state contamination just as for the case of standard distillation. Once the value of A G is known for the ground state of every operator analyzed in this work via both standard distillation and the one with the optimal profile it is possible to compare them as to quantify the improvement that the optimal profile brings. These values are displayed in Table VI, where A G is denoted as the fractional overlap. For the γ 5 bilinear, the fractional overlap from fitting the first-excited state is included. Again, a value close to unity indicates the profiles also accurately represent excited-state wavefunctions. Note no comparison with the standard distillation method is available, since only a single correlation function is generated from the γ 5 bilinear. As expected this fractional overlap is larger when using the optimal profiles compared to standard distillation. Remarkable improvement is obtained for a majority of the operators studied, with notable examples being γ 0 γ 5 γ i ∇ i , γ i , ijk γ j γ k and ∇ i where the fractional overlaps increase by factors of approximately 1.25, 1.13, 1.2 and 1.56 respectively. The case of ∇ i is also particularly special since the fractional overlap goes from being below 0.5 with standard distillation, which one might consider small, to around 0.742 with the optimal profile, which is around the values that other operators have for the standard distillation case. While this shows that this operator has significant contributions from excited states compared to others of the same J P C and therefore might not be the best one to use to access the ground state, it also holds that the use of the optimal profile significantly decreases such contributions and puts the operator in approximately equal standing to the others with standard distillation. The case of the operators involving B i also display an interesting behaviour. Both γ i B i and γ 5 B i present a rather small increase in their fractional overlaps when the optimal profiles are used. This could be explained by the fact that these operators are expected to have considerable contributions from excited states, including members of hybrid supermultiplets with the same J P C [4], given the combination of derivatives they contain. Because of this the optimal profile might not be able to suppress these contributions significantly or otherwise enhance too much the one from the ground state already present. However, the contribution to these excited states is expected to be enhanced by the optimal meson profile corresponding to them which is of course different than the one of the ground state.

B. Using quark-connected and quark-disconnected correlations
The use of distillation also grants access to the quarkdisconnected correlations essential for iso-scalar spectroscopy, seen as the second term in Eq. (10). To repeat the GEVP analysis to find the optimal distillation profiles requires a calculation of these disconnected pieces. However the presence of significant noise in these quark-disconnected correlations represents a major problem for the GEVP both in terms of numerical stability and the small number of points that contain a discernible signal. To avoid such complications the optimal profiles for the iso-vector operators are used to build corresponding iso-scalar operators. Two reasons motivate such a choice: first, if the iso-vector and iso-scalar states are not too different in mass then it is not unreasonable to expect their optimal meson distillation profiles might also be not too different. Second, even if thet are not very similar, the iso-vector profile might still constitute an improvement over the constant profile of standard distillation. With these considerations in mind, all optimal profiles for different operators Γ described in this section keep the labelf (Γ,e) (λ i , λ j ) and it is understood that they correspond to those used for the isovector analysis. The first operator analyzed is Γ = γ 5 . Fig.  17 shows the resulting effective masses for both iso-scalar and iso-vector operators when using both standard distillation and the optimal meson profile of the iso-vector operator. Results corresponding to the optimal profiles are plotted starting at t G + a and only shown when the error is smaller than the signal. Both profiles display a non-negligible mass difference between iso-scalar and iso-vector channels at early enough times where the error is manageable. The use of the optimal profile leads to a considerable reduction of excited state contamination compared to the constant profile. This confirms the intuition that a reasonable profile is a better choice than the constant. From these improved mass data for the iso-scalar one can establish a rather early plateau whose average has a difference of 99 ± 14 MeV with respect to the improved isovector mass plateau average. Initial estimates of this quantity have pointed to relatively small values in lattice studies [36][37][38] and perturbative NRQCD [39][40][41][42][43], albeit with conflicting signs. Recent lattice calculations of this splitting, either indirectly [39] or directly [8,9], also point to small-yet-positive values. In both cases the value was significantly lower, albeit at a much higher quark mass close to its physical value for the charm quark. Other possible reasons for this disagreement include different flavor content, finite lattice spacing, model assumptions and residual contamination from excited states. Nonetheless it is still clear that the use of an optimal meson profile improves the quality of the result obtained for this mass splitting. FIG. 17: Effective mass of the ground state for the isoscalar and iso-vector γ 5 operators using standard distillation, f (λ i , λ j ) = 1, and the optimal profile built from the isovector data,f (γ5,0) (λ i , λ j ). Data points are displaced horizontally for clarity.
The disconnected pieces for all other operators used were calculated to assess two main aspects: the presence of a signal at small time-separations and how much statistical noise it has. These dictate the feasability of calculating the correlators of iso-scalar operators. Fig. 18 shows the disconnected pieces for the operators that display the best signal when using the corresponding optimal iso-vector meson distillation profiles. Results are shown until the error becomes larger than the signal. The clear signal for the γ 5 operator is shown in Fig. 17 and serves as evidence for the advantage of using distillation. The clear signal for the disconnected correlation function for operator I seen in Fig. 18 motivates a more detailed study of the iso-scalar J P C = 0 ++ channel. Fig. 19 shows the effective masses for the 0 ++ iso-vector and iso-scalar operators using standard distillation, distillation with the optimal profile obtained from the iso-vector data only and operators built from unsmeared quark fields evaluated via stochastic trace es-timation. The traces x Tr D −1 (x, x)Γ (51) required for the latter method are evaluated on every timeslice using 64 U (1) noise vectors per time slice and configuration. Φ (Γ) for all Γ matrices can be evaluated using the same inversions. Iso-vector correlators built from unsmeared and from both types of distilled operators lead to the same effective mass at large temporal separations, with distillation using the optimal profile performing considerably better. Meanwhile, the iso-scalar effective masses tend to values significantly lower than the iso-vector case. Not only this, but the use of the optimal profile from the iso-vector for the iso-scalar operator considerably increases the noise. These two observations are not independent. The first indicates that there exists an iso-scalar state, probably a glueball, with quantum numbers 0 ++ which is much lighter than the iso-vector 0 ++ predominantly created by quark-anti-quark excitation. This explains the second fact: such a mass difference indicates the optimal profile for the heavier state is not sufficiently close to the true optimal profile of the lighter one and therefore is a bad choice to build an operator to create this lighter state. A better approach would include glueball operators together with the iso-scalar Γ = I in the GEVP but this is beyond the scope of the current work. ameff Iso-vector, Stoch.
FIG. 19: Effective masses extracted from the iso-vector and iso-scalar operators defined by Γ = I using standard distillation (f (λ i , λ j ) = 1), distillation with the optimal profile obtained from the iso-vector data (f (I,0) (λ i , λ j )) and stochastic trace estimation. Data points are displaced horizontally for clarity and omitted if the signal is lost to the noise.
It is also useful to compare the signal of the disconnected correlation obtained with local operators with the derivativebased operators. Since this data has the largest noise, the operators which reduce this noise are the natural choice. a priori, it is not know if the local or derivative-based operators will perform best. Fig. 20 shows the disconnected correlation for the 0 ++ , 0 −+ and 1 ++ channels using both types with their corresponding profilef (Γ,0) (λ i , λ j ). These channels exhibit the best signal for this correlation. It is clear that both types show a significant signal with only the 0 ++ case where there is a major difference in noise between local and derivative operators while the other two channels exhibit somewhat similar errors. This indicates the derivative based operators can also be used to analyze iso-scalar operators reliably. Since the quarkdisconnected correlation functions of derivative-based operators were analyzed on a smaller subset of gauge configurations a robust test would first match the statistics of the calculation with local operators. This not only improves the signal for the γ i ∇ i compared to the I operator but also for the 0 −+ and 1 ++ operators. Since the last two already seem to match the error from local operators, an increase of statistics can only provide better results, which could point to promising results when mixing these different types of operators together.

IV. CONCLUSIONS AND OUTLOOK
In this work, the distillation framework for quark-field smearing in lattice QCD was extended and techniques for optimising the resulting creation operators were developed and tested. The extension applies arbitrary scalar functions of each of the eigenvalues of the gauge-covariant laplacian on each timeslice inside the smearing kernel. As a consequence of this extension, meson profile functions are also introduced and investigated. The procedure to build optimal meson distillation profiles via the GEVP formulation was presented explicitly, making the best use of the Laplace eigenvectors which form the basis for the distillation method. Although only meson operators were examined in this work, the method can be extended naturally to other hadrons such as baryons and tetraquarks. Tests were performed in QCD with N f = 2 degenerate quarks, with a mass close to half that of the physical charm-quark. Results for the spectrum of the quark-anti-quark system were determined using local and derivative-based operators and the comparison between standard distillation and the optimisation proposed here was made.
Several advantages were observed. First, comparing the effective mass plots obtained with standard distillation to the optimal profiles, a remarkable decrease in excited state contributions is seen. This effect is graphically seen in Figs. 7 and 10 for some operators and quantified in Table VI for all operators considered in this work. Only some of the operators involving a chromomagnetic component exhibit a small improvement, serving as indication that other aspects not influenced by the distillation profile should be improved, such as their gluonic excitations. The increased suppression of excited state contamination in all others yields earlier plateaus with at least five more data points in several cases. Additionally, even where the mass plateaus have equal length those generated from the optimal profiles occur earlier, benefiting studies of quark-disconnected correlation functions. These notoriously suffer from a signal-to-noise problem which sets in at early times. So achieving a mass plateau at early times for the quark-connected correlations gives more precise information to combine with the quark-disconnected correlation. Second, this improvement does not significantly increase the computational work since no additional solutions of the Dirac equation or lattice derivatives must be calculated beyond those required for the perambulator and standard elementals. Third, the appropriate choice of the number of eigenvectors which is crucial to distillation can be approached more systematically. Should too few be chosen, not enough useful information is kept and the resolution is poor. With too many, insufficient quark-field smearing is applied and the computational cost increases. The optimal distillation profiles studied here provide a very useful guide; eigenvectors that are important for a given state have a large contribution in the profile while less important ones are suppressed. For a fixed distillation-space size, eigenvectors beyond those included might contain important structure that is subsequently neglected. Introducing profile functions does not correct for this defect however it does maximally use the data in the basis of smooth fields and so can make up in part for the limited resolution given by a fixed number of modes. The use of too few eigenvectors is detected by monitoring whether the profile is small for the largest eigenvalue included. At the same time the contributions of higher modes in a very large basis is regulated automatically via the optimal distillation profile. Fourth, the optimal meson distillation profiles can be used to visualize the spatial distributions of the resulting meson operators. These distributions provide not only an estimate of the physical extent of the mesons but also suggest if there are notable finite-volume effects when the spatial distribution has sizable contributions near the edges of the lattice. Fifth and final, this improvement is in principle applicable in the stochastic distillation framework, where the perambulators are not calculated exactly. In this case the noise introduced by the estimation may affect the GEVP formulation and therefore the accuracy of the determined profiles. It could be preferable to perform a small statistics, exact calculation of the perambulators to determine the optimal meson profiles for use in a full-statistics study.
A natural next step is to extend the GEVP basis by not only varying the quark distillation profiles of a single fixed Γ structure but also including multiple structures, each with a different profile. This should allow not only access to further excited states but also to potentially improve the signal obtained for the ground and first excited states. As pointed out in Sec. III from the results displayed in Fig. 19, a further step is to build a GEVP basis of iso-scalar operators resembling both mesons and glueballs to explore their possible mixing. Such studies are under way, both using the ensembles studies in this work and with a finer lattice spacing and larger volume to explore the effects of the lattice spacing.

A. OPTIMALΓ STRUCTURE
To define an optimalΓ (Γ,e) structure that includes the optimal meson distillation profile Eq. (28) at meson level one starts with the original Γ structure in the channel of interest. Assuming the decomposition into an operator H acting on Dirac space and an operator D[t] acting on coordinate/color space, one can perform a change of basis to express D[t] in terms of V N [t], a matrix whose columns are all the Laplacian eigenvectors at time t. Doing this results in it corresponds to applying standard distillation to the quark fields using all of the Laplacian eigenvectors, which is known to have no effect. However, one can exploit this formulation of the Γ structure to manually insert at meson level an arbitrary meson distillation profile just as one could introduce at quark level the quark distillation profile via the matrix J The two-point correlation function of such an operator, considering for now only the quark-connected piece for simplicity, will be given by where the sign depends on Γ and τ N [t 1 , t 2 ] corresponds to the perambulator involving all eigenvectors. Such a correlation function is not tractable since one would need to know all the eigenvectors and eigenvalues of the Laplacian at every value of time and perform an unfeasable amount of inversions of the Dirac operator. However, one can avoid such a problem by using standard distillation, where the quark fields are replaced by their distilled counterparts using a manageable number of eigenvectors. The distilled meson operator will be given bỹ and the corresponding correlation function is given by to be the optimal meson profiles calculated from the GEVP described in Sec. II then C D (t) corresponds to the same correlation function that one would obtain from the eigenvalues of said GEVP. This can be seen first by replacing C(t) with C S (t) (See Eq. (23)) in Eq. (21) and then multiplying from the left by w k (t, t G ) † , yielding w k (t, t G ) † C S (t)w e (t, t G ) = ρ e (t, t G )w k (t, t G ) † C S (t G )w e (t, t G ) = ρ e (t, t G )δ ke , (A.11) where the orthonormality condition w k (t, t G ) † C S (t G )w e (t, t G ) = δ e,k (A.12) has been used. On the right hand side of Eqn. (A.11) are the generalized eigenvalues of the GEVP while on the left hand side is the projected correlation matrix w k (t, t G ) † C S (t)w e (t, t G ). To relate this projected correlation to C D (t) it is important to note that eventhough the vectors w e (t, t G ) have time indices they are expected to be independent of time up to noise effects. The results presented in this work are built using w e (t, t G ) for a fixed value of t where there is no significant change between consecutive times so this time independence is assumed and the t, t G pair is omitted in the following treatment. Expressing the entries of C(t) for a fixed Γ as C(t) mn = O m (t)Ō n (0) F,U (A. 13) one can express the entries of C S (t) as C S (t) a,b = P a (t)P b (0) F,U , (A.14) where the pruned operator P a (t) is given by [t] is equal to the optimal meson elemental obtained from the GEVP as given in Eq. (28), which means that the projected correlation w (Γ,e) † e C S (t)w (Γ,e) e from the GEVP is equal to C D (t) since they are built using identical elementals. However, while in the GEVP case the distillation profiles were introduced at quark level independently of the Γ structure and different optimal meson distillation profile was obtained for every choice of Γ and energy state, here the optimal meson profile is inserted via a redefinition of the Γ structure while standard distillation, i.e a constant profile, is used for the quarks. The second observation is the fact that even though Γ cannot be explicitely built without an excessive amount of computational work one can construct an approximation with a limited number of eigenvectors which is given bỹ where it can be seen that when all eigenvectors are used one recoversΓ [t]. The distilled meson operator in Eq. (A.9) can now be written asÕ D (t) =q(t)Γ D q(t) (A.26) and corresponds to the closest one can get to the meson operator defined in Eq. (A.7). Furthermore, ifΓ D leads to improved correlation functions, as is the case in this work, then it probably conserves useful properties ofΓ that can be studied without the need for excessive computational work.