Improving the Kinetic Couplings in Lattice Non-Relativistic QCD

We improve the non-relativistic QCD (NRQCD) action by comparing the dispersion relation to that of the continuum through $\mathcal{O}(p^6)$ in perturbation theory. The one-loop matching coefficients of the $\mathcal{O}(p^4)$ kinetic operators are determined, as well as the scale at which to evaluate $\alpha_s$ in the $V$-scheme for each quantity. We utilise automated lattice perturbation theory using twisted boundary conditions as an infrared regulator. The one-loop radiative corrections to the mass renormalisation, zero-point energy and overall energy-shift of an NRQCD $b$-quark are also found. We also explore how a Fat$3$-smeared NRQCD action and changes of the stability parameter $n$ affect the coefficients. Finally, we use gluon field ensembles at multiple lattice spacing values, all of which include $u$, $d$, $s$ and $c$ quark vacuum polarisation, to test how the improvements affect the non-perturbatively determined $\Upsilon(1S)$ and $\eta_b(1S)$ kinetic masses, and the tuning of the $b$ quark mass.


I. INTRODUCTION
The Standard Model (SM) of particle physics has been incredibly successful at describing experimental data to date [1,2]. However, in many ways, this success has been a double-edged sword; while SM predictions have overwhelmingly agreed with experimental measurements within errors, this has left little room for large newphysics effects to be observed. Consequently, to illuminate any new-physics phenomena high-precision tests of the SM must be performed. In the b-quark sector, the LHCb and BELLE II experiments will generate increasingly precise measurements. In response to this, we make the next level of improvement to the HPQCD collaboration's formulation of the NRQCD action [3] which has been used for a number of state-of-the-art b-physics calculations [4][5][6][7][8][9][10][11].
In this study we will include, for the first time, operators in the NRQCD action which reproduce the correct quark dispersion relation to O(p 6 ). Then, with different values of the NRQCD stability parameter n, we use lattice perturbation theory to compute the kinetic matching coefficients to O(α s p 4 ). We remove the unphysical tadpole contributions [12] from the lattice action and give perturbative results for two different tadpole improvement programs: the first by using a mean-field improvement parameter in Landau gauge u 0 [12], and the second via Fat3 smearing [13]. Additionally, we determine the one-loop (bare-to-pole) mass renormalisation and zeropoint energy of the b-quark. These can be combined a christine.davies@glasgow.ac.uk b Judd.Harrison@glasgow.ac.uk c chughes@fnal.gov d http://www.physics.gla.ac.uk/HPQCD to give the one-loop energy shift of the NRQCD heavy quark, and added to non-perturbatively obtained static masses to give numerical results which, after converting from lattice units to GeV, can be compared to experimental data. Further, for each of these quantities the scale µ = q * at which to evaluate the strong coupling constant defined in the V-scheme is determined using the Brodsky-Lepage-Mackenzie (BLM) procedure [12,14].
After perturbatively determining the full one-loop radiative corrections to the kinetic couplings, we nonperturbatively determine the Υ(1S) and η b (1S) energies in order to examine how improving the NRQCD action, both with additional O(p 6 ) operators and with the O(α s p 4 ) couplings, reduces the effect of lattice artefacts. This paper is organised as follows. In Section II we describe the improved NRQCD action. In Section III we match the O(α s p 4 , p 6 ) NRQCD dispersion relation to the continuum, describe our tadpole improvement procedures and how the scale at which to evaluate α V is found. Section III A describes the computational setup of the automated lattice perturbation theory, while Section III B gives an analysis of the perturbative results. Section IV gives details of the non-perturbative calculation and Section IV B presents the non-perturbative results. We summarise our findings in Section V.

II. b-QUARKS USING NRQCD
Information about processes involving heavy quarks can be computed on the lattice using correlation functions constructed from combinations of heavy-quark propagators.
Current lattice ensembles have small enough lattice spacings and large enough volumes so that accurate relativistic c-quark formalisms (e.g., Highly Improved Staggered Quarks (HISQ) [15]) are available.

arXiv:1812.11639v1 [hep-lat] 30 Dec 2018
Since the b-quark has a Compton wavelength of O(0.04) fm, most current lattice ensembles cannot resolve relativistic b-quarks since am b > 1 [16] 1 . However, it is well known that b-quarks are very nonrelativistic inside their bound states (with v 2 rel ≈ 0.1 for low-lying bottomonium states) and thus using a nonrelativistic effective field theory, which has a formal expansion in p/m b = v rel [18], is very appropriate. This effective field theory is then discretised as lattice NRQCD [18].
HPQCD's formulation of lattice NRQCD has already proven successful in producing accurate b-physics results in the literature. For example, the NRQCD formalism that gave a quark dispersion relation correct to O(α s p 4 ) has already been used to study bottomonium S, P and D wave mass splittings [3,4], B meson mass splittings [5], B meson decay constants [6,19], Υ and Υ leptonic widths [7]. Subsequently, the spin-dependent O(v 6 ) operators were added to that NRQCD action in order to compute hindered M1 radiative decays [8], precise bottomonium hyperfine splittings [9,10] 2 and to aid in the search for bbbb-type bound tetraquarks [11].
Given the increasingly important emphasis being put on high-precision calculations needed to keep pace with measurements from the LHCb and BELLE II experiments, we take the next steps in improving the lattice NRQCD action to reduce the systematic uncertainties in future theoretical calculations using it. The first part of this improvement is to add the necessary operators to the aforementioned NRQCD action that reproduce the correct O(p 6 ) quark dispersion relation at tree level.
The NRQCD action that gives rise to a O(p 6 ) correct quark dispersion relation, including O(v 4 ) interaction operators [3], produces a heavy-quark propagator which can be found through the evolution equation G(x, t + 1) = e −aH G(x, t), G(x, t src ) = φ(x) (1) where φ(x) is a source function and aH 0 = − ∆ (2) 2am b , aδH = aδH v 4 + aδH p 6 ; 1 Combining results at multiple lattice spacing values and multiple heavy quark masses with a highly improved relativistic action does allow results to be obtained at the physical b quark mass [17]. 2 Four-quark operators were also used in this study.
Here, am b is the bare b-quark mass, ∇ is the symmetric lattice derivative, with∇ the improved version, and ∆ (2) , ∆ (4) , ∆ (6) are the lattice discretisations of i respectively, with our conventions given in Appendix A.Ẽ,B are the improved chromoelectric and chromomagnetic fields, details of which can be found in [3]. Each of these fields, as well as the covariant derivatives, must be tadpole-improved using the same improvement procedure as in the perturbative calculation of the matching coefficients [12]. This will be discussed further in Sec. II B. The parameter n is used to prevent instabilities at large momentum from the kinetic energy operator, and needs to satisfy the constraint p 2 < 4nam b . A choice of n = 4 was suitable for values of am b used in previous non-perturbative studies. We choose to put the aδH p 6 corrections into aδH rather than alter aH 0 so that the kinetic operator remains unchanged. This formulation is also consistent with previous HPQCD NRQCD actions, is symmetric with respect to time reversal and has smaller renormalisations than other formulations [18]. The rotationally-symmetry breaking operators (which vanish as a → 0) with coefficients c p 6 and c p 2 p 4 in δH p 6 remove higher-order discretisation effects from using finite-difference derivatives. The operator with coefficient c (p 2 ) 3 correctly adds the term proportional to (p 2 ) 3 into the heavy-quark dispersion relation. The matching coefficients c i in the above Hamiltonian take into account the high-energy UV modes from QCD processes that are not present in NRQCD. Each c i can be fixed by matching a particular lattice NRQCD formalism to full continuum QCD. Each c i can be expanded perturbatively as and, after tadpole improvement [12], we expect c (1) i to be O(1). In Sec. II A, we will match the on-shell NRQCD dispersion relation to that of the continuum, and determine c 6 and c (1) 5 . Each of these coefficients should exhibit benign behaviour as a function of am b in the regime where the NRQCD effective field theory is wellbehaved. In contrast, the coefficient may diverge as the effective field theory breaks down as p ∼ π/a gets too large or am b gets too small. We take tree level values, c i = 1, for the coefficients appearing in δH p 6 .
We call the NRQCD Hamiltonian presented in Eq. A high-precision non-perturbative calculation of mass splittings will require knowledge of at least the O(α s ) corrections to the matching coefficients in order to improve upon existing few percent errors. For example, when tuning the bare quark mass am b fully nonperturbatively in NRQCD, one computes the kinetic mass of a hadron 3 [3]. This kinetic mass depends on the internal kinematics of the hadron, and hence on (at least) the terms c 1 , c 5 , and c 6 in the Hamiltonian. These matching coefficients are known as the kinetic couplings [20].
The kinetic couplings can be found perturbatively by matching the NRQCD on-shell energy (which corresponds to the location of the pole of the quark propagator in the interacting theory) to the continuum QCD dispersion relation. From now on, to avoid superfluous notation, we will implicitly work in lattice units unless otherwise stated. To O(α s ) the inverse quark propagator may be written in momentum space as with G (0) (p) −1 the quark propagator obtained at treelevel from the non-interacting part of the NRQCD action, Σ(p) the one-loop quark self-energy, p = (p 4 , p) a four-vector in Euclidean space and ω = −ip 4 the energy in Minkowski space. The free quark propagator can be explicitly found as where we have definedc 1 = (c 1 + c 6 m b /2n)/(1 + m b /2n) for computational ease, and c 1 = c 6 =c 1 . The term F (p) arises from the non-interacting momentum space part of (1−H 0 /2n) in (2), while F 1 (p) comes from the (1−δH/2) piece.
To find the NRQCD dispersion relation we determine the on-shell energy ω(p) which causes a pole in the full heavy-quark propagator. The one-loop ω(p) can be found from Eq. (5) and (6) as with ω 0 (p) = − log(F 2n (p)F 2 1 (p)), with tree-level coefficients in F and F 1 , being the tree-level on-shell energy found by setting the tree-level inverse propagator in Eq. (6) to zero. We have constructed the Hamiltonian in Eq. (2) to produce a non-relativistic dispersion relation correct to O(p 6 ), and we now include the O(α s p 4 ) correction. This yields 4 When matching the dispersion relation to O(α s p 4 , p 6 ), it is necessary to decompose the self-energy Σ(p) using the small-p expansion [20] as Further, when ω is small, each function has a well-defined series expansion Σ m (ω) = Σ 3 (ω) = 1 24 Then, by using the tree-level ω 0 from (10) in Eq. (11) we find where the superscript 'r' denotes renormalised quantities and Z (1) m is the O(α s ) coefficient of the bare-to-pole mass renormalisation. Substituting (8) and (17) into (9) gives the one-loop NRQCD dispersion relation to O(α s p 4 , p 6 ) as Matching Eq. (23) to the continuum QCD dispersion relation [3,21] gives the matching coefficients forc as well as the energy shift of a heavy quark (to this order) asc c (1) The shift C is the perturbative shift of the zero of energy. For each heavy quark in a non-perturbative calculation, the shift can be added to the simulation energy and, after being converted from lattice units to GeV, this can then be compared to experimental masses [20,21]. In practice hadron masses can be more precisely determined fully non-perturbatively through their kinetic mass in lattice QCD.
The aim of this study is to determine the one-loop coefficientsc 5 , δC (and thus also Z (1) m and W 0 ) for different improved NRQCD actions to find the best way forward for increasingly accurate non-perturbative calculations in the future. Before these coefficients can be used, it is first necessary to remove unphysical contributions from tadpole diagrams which can cause the coefficients to be rather large [12].

B. Tadpole Improvement
The authors of Ref. [12] show that using Lie group elements when constructing the lattice field theory introduces unphysical tadpole diagrams which do not contribute to continuum schemes. These unphysical tadpole diagrams cause large, process independent renormalisations and produce a poor convergence of the perturbative series. Ref. [12] also suggests a solution to this: a gauge-invariant mean-field improvement program (tadpole-improvement) where each lattice link, U µ (x), is scaled to U µ (x)/u 0 . We choose u 0 to be the mean link in Landau gauge, i.e., u 0 = 1 3 TrU µ (x) . This meanfield parameter has been calculated for the Symanzikimproved gluon [3,22] action both perturbatively to oneloop (with u 0 = 1 − α s u (2) 0 giving u (2) 0 = 0.750) [23] and non-perturbatively [3,6] (where the value of u 0 depends on the ensemble used, e.g., see Table IV). After a tadpole improvement procedure has been implemented, the one-loop coefficients are expected to be O(1). The same tadpole-improvement program must of course be implemented in the nonperturbative calculations as has been used for the perturbative calculations.
Before the mean-field improvement procedure is performed, care must be taken to ensure that any link-pair cancellations U † µ (x)U µ (x) = 1 occur in the lattice action used in both non-perturbative and perturbative calculations. Such cancellations do not generate any unphysical tadpole diagrams and scaling by 1/u 2 0 would be incorrect. Yet, expanding out the complicated NRQCD Hamiltonian in (2) in terms of links U µ (x) is excessively expensive for numerical calculations. Consequently, link-pair cancellations are only taken into account separately for each derivative, (∆ (2n) ) m , or field strengths (E i (x) or B i (x)) appearing in the action. This is called partial cancellation [3,21]. The difference between the complete and partial cancellation prescriptions was empirically shown not to be sizable [21]. Formulae for the partially-cancelled derivatives are given in Appendix A.
By using the partially-cancelled mean-field improve-ment procedure just described, one can find the O(α s ) tadpole counterterms for the one-loop quantities described in Sec. II A. For the NRQCD action without the O(p 6 ) operators, the computation of the tadpole counterterms was checked in two separate calculations. The first was performed analytically, and the second using a Mathematica script. Both calculations reproduced the results of [21] (in the case where the additional parameter used there, v, is set to zero) and [3,24]. We extended the numerical code to include the O(p 6 ) operators. The oneloop tadpole counterterms are given in Appendix B. After the (unimproved) one-loop quantities have been found in lattice perturbation theory, we can add the appropriate tadpole counterterms to determine the improved values. This will be discussed further in Sec. III B.
As can be seen in Appendix B, the mean-field counterterms obtained from using the O(p 6 ) NRQCD action contain higher-orders of 1/m b relative to the counterterms obtained from using the O(p 4 ) NRQCD action. This is a consequence of partially-cancelling the derivative operators (∆ (2n) ) m , whose counterterms are given in Appendix A. The impact of this becomes pronounced as am b is reduced as will become evident in Section III.
In this study, we choose to account for the unphysical tadpole contributions using two different prescriptions. The first prescription proceeds via the partially-cancelled mean-field improvement procedure just described. The one-loop tadpole counterterms given in Appendix B, which depend on 1/am b , remove the unphysical tadpoles. As seen in Sec. III B, the improved values give smaller absolute renormalisations compared to the unimproved case, and exhibit a longer plateau over a larger range in am b indicating stable behaviour in the effective field theory. However, the tadpole counterterms from using the O(p 6 ) action diverge faster as am b → 0 due to the higher-order terms in 1/am b , and therefore the tadpole-improved one-loop results obtained from the O(p 6 ) NRQCD action also diverge faster (see Sec. III B). This could be slightly inconvenient for ensembles with increasingly small lattice spacings, such as the super-fine ensembles currently in use [25], which have a lattice spacing of a ≈ 0.06fm.
Because of this, we explore an alternative improvement procedure based on the fattening of gauge-links [13,26]. The Fat7-smeared link [27] introduces staples of up to seven-link paths to completely remove the tree-level couplings to gluons with high transverse-momentum modes equal to ±π. As the Fat7 link is computationally expensive, alternative fat-links have been designed based on three-or five-link staples, called Fat3 and Fat5 respectively. These latter links reduce the couplings to gluons with high transverse-momentum and suppress unphysical tadpole diagrams [27]. This will be discussed further in section III B. Therefore, we also explore, for the first time, how a Fat3-smeared NRQCD action correct to O(p 4 ) affects the renormalisation of kinetic couplings.
The last piece of information needed to use the tadpoleimproved one-loop coefficients in a non-perturbative computation is the scale, q * , at which to evaluate the strong coupling constant.

C. Determining the scale of αs
The Brodsky-Lepage-Mackenzie procedure [12,28] determines an optimal q * for α V , the coupling defined using the heavy quark potential [29], by examining the momentum flowing through a gluon in the one-loop Feynman diagram. In this prescription, one studies the oneloop integral of a fully dressed gluon within a particular diagram, then uses the running of α V (q) to find a mean-value q * which reproduces the integral. To do this, one expands the running of α V (q) as a polynomial in log(q 2 /q * 2 ) and assumes that the leading order logmoments are the dominant contributions. However, in certain areas of parameter space, the leading order logmoments can be anomalously small and give unphysically large or small erroneous q * . This was noticed in [20] after which [14] determined q * when the zeroth and first log-moments are anomalously small via The appropriate choice of ± in Eq. (28) is usually clear based on requiring q * to be continuous and physically sensible, although the ambiguity can be removed by calculating higher log-moments [14]. When σ 2 > 0, only the first term in Eq. (28) is used. However, when σ 2 < 0, Eq. (28) takes into account the anomalies to first order.
The unphysical tadpole diagrams contribute to the scale q * , using the mean-field improvement prescription described in Sec. II B will alter its value. When the tadpole counterterm c tad α V (q * tad ) is added to the one-loop contribution c a α V (q * a ), the second-order formula (28) is altered to [14] log Again, if σ 2 > 0, then only the first-order term in Eq. (29) is needed and used, while if σ 2 < 0 both terms are needed to yield physical results. Theoretically, we expect aΛ QCD < aq * < π as the oneloop corrections take into account UV modes neglected by imposing a momentum cutoff. Even though the corrected second order formula given in Eq. (29) was used, unphysical values of q * for certain values of am b in the one-loop quantities were sometimes obtained. In these cases, although rare, it was usually clear that the issue was due to the 0 th − 2 nd log-moments being anomalously small. To rectify this issue, we use the simple n th -order formula given by [14] log Leaving the tadpole pieces out of Eq. (31) gives the higher-order tadpole-unimproved scale. As we do not mean-field improve the Fat3-smeared one-loop quantities, the above formulae with the tadpole pieces set to zero are used to find q * in the case of Fat3-smeared links.

A. Perturbative Computational Details
Due to the complexity of the NRQCD action that we utilise, an efficient computational methodology is needed to calculate the Feynman integrals of the one-loop formulae given in (21) and (22). Fortunately, the theory behind the automatic generation of Feynman rules for complex lattice actions exists [26,30]. Here, we employ the automated lattice perturbation theory routines HiPPy and HPsrc [26,31]. These routines have been thoroughly tested and used in previous perturbative calculations [3,5,21,22,32]. Given that we will produce results for a number of different NRQCD actions, these automated packages are ideal.
We automatically generate the Feynman rules for a specific NRQCD action (along with the Symanzikimproved gluonic action [3,32]) using the HiPPy package. We can then construct the Feynman diagrams in a generic fashion using the HPsrc package, which will use these Feynman rules to numerically evaluate the diagram, along with its derivatives thanks to automated differentiation techniques [26,33].
In the matching procedure both the continuum and lattice contributions to the dispersion relation are separately infrared (IR) finite. However, intermediate steps on the lattice may produce IR divergences which cancel when evaluating the one-loop quantities. To regulate the IR divergences, we use twisted boundary conditions (TBCs) on a finite-volume lattice where the momentum integral is replaced by a summation over momentum modes [30]. TBCs introduce a lower momentum cutoff by removing the zero mode from the gluon propagator. Specifically, we employ triple-twist boundary conditions with an appropriate squashing factor in the untwisted temporal direction (used to broaden peaks of the integrand, thus removing most of the dependence on L [30]). Computational details of both concepts are The raw data for W1 at multiple values of 1/L overlaid with our fit curve. described in [26,32], and we refer the reader to those articles for further details. All numerical results are IR finite as expected. As the dispersion relation is UV finite, this allows us to directly equate results obtained on the lattice to those obtained in the continuum. Furthermore, we test that the gauge-invariant quantities are independent of the gluon propagator gauge parameter by working in both Feynman gauge and Landau gauge. All perturbative results presented, except for the Landau-gauge mean-field parameter u (2) 0 , will be in Feynman gauge. The one-loop contributions to the self-energy are shown in Figure 1. Care must be taken when numerically evaluating the rainbow diagram so that the pole of the heavy-quark propagator does not cross the temporal integration contour. Details of our implementation of the contour shift can be found in [21].
In this study we will always take the spatial box length to be 4 ≤ L ≤ 16 and choose a temporal extent of T = 16L. This allows the pole structure to be resolved in greater detail and reduces finite-T effects. As the oneloop integration is carried out by direct summation of the twisted momentum modes, numerical results are exact [32]. We follow the approach suggested by [34] in order to fit exact data. Here, our exact results from TBCs can be expressed as a polynomial in 1/L [30], yet we are only interested in knowing the constant term (corresponding to the infinite-volume result). We may use priors to model the polynomial dependence and then marginalise [35] from the exact data the part of the polynomial that we are not interested in. Using a finite-degree polynomial of order N to model the exact results, we find that N = 20 is a suitable choice and check that all results are unchanged with its variation. Marginalising the last N − N L terms of this polynomial into the exact data and then performing a Bayesian fit [34,36] to a polynomial of degree N L successfully determines the desired constant parameter of the polynomial. As is common with marginalised Bayesian fits [35], marginalising all but one or two fit parameters produces stable and precise results and has seen wide success [8,37]. Even though we produce successful fits when marginalising all but the constant term, we choose N L = 5 and ensure that there is no sensitivity to this.
In the following section, we will give perturbative results for three NRQCD actions: (i) the O(p 6 ) NRQCD action with stability parameter n = 4 as described in Sec. II; (ii) the O(p 4 ) NRQCD action with stability parameter n = 4, 6 and 8; and (iii) a Fat3-smeared O(p 4 ) NRQCD action with stability parameter n = 4 and no mean-field improvement. For a fixed quark and gauge action the one-loop coefficients depend only on the input parameter am b . We calculate results for a range of am b values, enabling interpolation to values not explicitly calculated that may be useful for lattice calculations. This also allows us to demonstrate the functional dependence on am b graphically to see where the divergent behaviour begins as am b goes to zero.

B. Perturbative Results and Analysis
We calculate W 1 and W 2 for both the O(p 4 ) and O(p 6 ) actions with n = 4 (including all log moments) with spatial extent L = 4, 6, 8, 10, 16. We then successfully fit this data using the methodology described in Sec. III A. Figure 2 shows an example of this, with the raw W 1 data   at multiple values of 1/L overlaid against the fit curve. In fact, we found Bayesian fitting to a polynomial so successful that we only needed data with L = 4, 6, 8, 10 to obtain the constant term to sub-percent precision in general. Consequently, we calculate data for W 1 and W 2 for an O(p 4 ) action with n = 6, 8 and with L = 4, 6, 8, 10, as well as all Z (1) m , W 0 and Fat3-smeared results (including log moments). The short computational time needed to calculate u (2) 0 (and its log moments), meant that we were able to do this on lattices of size L = 4, 6, 8, 10, 12, 14, 16. We present the infinite-volume results for the mean-field unimproved W 1 and W 2 in Figure 3. Also shown on each figure is a smooth interpolating curve between the results. This interpolating curve was chosen to be a polynomial in 1/am b in order to reproduce the static limit as m b → ∞. It is expected that all one-loop quantities diverge as am b → 0 for our improved NRQCD action, indicating a breakdown of NRQCD, and that is clearly illustrated in our figures.
The difference between W 1 and W 2 in Figure 3 and c  Figure 4 is purely the conversion factors given in Equations (24) and (25). In these plots, a clear observation is that the results are very insensitive to an increase in n over the mass ranges that we are interested in for non-perturbative calculations on the lattice (1 < am b < 5). Therefore, as there is no clear bene-   Note that the Fat3 smeared data is the same as in Figure 4, since no mean-field correction is applied in this case.
fit to increase n in the perturbative results, future nonperturbative calculations can choose n = 4 for all am b in this range. The Fat3 smearing works as expected to remove the unphysical tadpoles (as outlined in Sec. II B), indicated by a reduction in the absolute size of the oneloop corrections. There is a significant improvement, in terms of longer plateau in am b and sharper divergence at smaller am b , when using the O(p 6 ) NRQCD action over the O(p 4 ). This improved behaviour in the couplings leads to the expectation that the second-order couplings are also well-behaved.
In Figure 5, we then include the mean-field tadpole corrections for all results (except those for Fat3 smearing data) with the formulae explicitly given in Appendix B. The infinite-volume values are given in Table I. Table I shows why we do not need tadpole-improvement when smeared links are used. The one-loop coefficient in u 0 is much smaller in the smeared cases reflecting the fact that tadpole effects are much smaller and the mean smeared link is much closer to 1. Little is then gained by dividing by it.
As can be seen in Figure 5, mean-field improvement noticeably reduces the magnitude of the one-loop coefficients, where it is applied. Due to the higher-order 1/am n b terms in the O(p 6 ) mean-field counterterms, as described in Sec. II B, the one-loop couplings with the O(p 6 ) action now diverge earlier as am b → 0. This is not a desirable feature. This common behaviour is seen in all mean-field improved data we present. Interestingly, the absolute value of c (1) 5 is significantly reduced when using a O(p 6 ) action. c is the coupling which removes the rotational-symmetry breaking operator ∆ (4) at one-loop. Therefore, it is indicative that the O(p 6 ) action will reduce SO(3) symmetry breaking in non-perturbative calculations also, as will be discussed in Sec. IV B.
To fully determine the one-loop shift to the kinetic couplings, the scale at which to evaluate α s in the Vscheme needs to be found. We give the mean-field improved aq * and the Fat3 smeared aq * in Appendix C. To determine the physical scale, q * , we use a −1 = 1.3, 1.6, 2.2 and 3.3 GeV corresponding to very coarse, coarse, fine and superfine MILC ensembles used by the HPQCD collaboration [3]. To run the strong coupling in a particular renormalisation scheme, an initial condition needs to be chosen. Here, we use α MS s (M Z , n f = 5) taken from the Particle Data Group [1], where α s is defined in the MS-scheme, M Z is the Z-boson mass and n f = 5 is the  number of active flavours. To use this with our data, we perturbatively remove the b-quarks' contribution to the running [38], with m b (m b ) = 4.164(23) GeV [35], convert to the V -scheme [12,39] and run to q * [40]. Finally, we combine α V (q * ) with the one-loop coefficient to give the full one-loop coefficient. These are plotted in Figure 6.
We show data for the tadpole-improved Z W 0 is observed to have opposite sign to Z (1) m , but has similar qualitative features as those just described, e.g., the mean-field unimproved result is typically is a factor of 2−3 in magnitude larger than the mean-field improved values and there is a clear plateau and a sharp divergence at small am b . The tadpole-improved W 0 is shown in Figure 8.
Because Z m and W 0 have opposite sign, the one-loop shift in the zero of energy δC is found to be very close to zero in all cases as seen in Figure 9.
We note that our O(p 4 )c m differ by small but significant amounts from those in Ref. [3]. Ref. [3] used Monte Carlo integration combined with numerical derivatives, which they note leads to unstable behaviour when there are large peaks in the IR region. Consequently subtraction functions were used [41]. In our study, we avoid these complications by using TBCs as a gauge-invariant IR regulator, and automatic differentiation for the derivatives, which avoids the numerical instabilities arising from finite-differencing schemes [26,33].
Finally, the tadpole-improved results for the one-loop coefficients plotted here are given in Appendix C. Subtracting the mean-field corrections (given in Appendix B) from this data gives the results before tadpole improvement.

IV. NON-PERTURBATIVE KINETIC MASSES
Here we test how improving the NRQCD action as in Sec. II and III affects the reliability and accuracy  [32] and its log moments for either an unsmeared or smeared gauge-link definition.
Gauge-Link u (2) 0 = f tad f tad log(q 2 ) f tad log 2 (q 2 ) Unsmeared 0.750275(5) 1.45755 (2) 3.6022 (1) Fat3 0.231784(5) 0.26101 (7) 0.6429 (2) Fat7 0.108244(5) 0.10271(4) 0.4117 (2) of energies of bottomonium mesons obtained from nonperturbative calculations. The static mass (the energy corresponding to zero spatial momentum) in lattice NRQCD is shifted due to the removal of the mass term from the Hamiltonian [3], where we found the one-loop shift, C, in Sec. II A. Consequently, one can only determine static mass differences fully nonperturbatively. However, one can still obtain kinetic masses [3,42] entirely non-perturbatively via a fully relativistic dispersion relation as where a∆E is the energy difference between the meson with momentum aP and the meson at rest. The kinetic mass depends on the internal kinematics of the hadron, and hence on the kinetic terms in the NRQCD action. For example, changing the coefficient of the (∆ (2) ) 2 /8am 3 b term, c 1 , from 1 to 1 + O(α s ) will modify the amount of the internal kinetic energy that is incorporated into the meson's kinetic mass, effectively correcting for an O(α s ) mismatch between the static and kinetic masses from this operator's contribution to the binding energy [3]. The change would be expected to be O(α s B) where B is the binding energy of O(500) MeV. This could in principle be as large as 150 − 200 MeV but in practice was found to be much smaller and around 80 MeV on coarse and fine lattices (because c (1) 1 is small) [3]. Therefore, the kinetic mass is the ideal candidate on which to test our improvement of the kinetic part of the action. Furthermore, the kinetic mass is typically utilised to tune the b-quark mass [3,[42][43][44] and thus if sizable improvement is seen, this would indicate that improving the kinetic action would benefit future calculations, where a highly accurate calculation with a reliable error budget requires knowledge of at least the O(α s ) corrections to the matching coefficients.
In a rotationally invariant theory, the symmetry group is the semi-direct product of the rotational group SO(3) with three translations. The little group of the symmetry group, used to classify energy eigenstates in terms of invariant quantities (e.g., J P at zero-momentum and helicity λ at non-zero momentum), is broken by a finitevolume lattice [45]. The symmetry of the lattice discretisation, which breaks SO(3) symmetry at small distances,  does not need to be the same as the symmetry of the finite volume, which breaks rotational symmetry at larger distances [46]. Here we consider a cubic lattice in a finite cubic box with PBCs, and so both the lattice and the boundary break the full SO(3) rotational symmetry of the continuum to the (double cover) of the octahedral group, O D h . The lattice irreducible representations (irreps) for a cubic finite-volume on a cubic lattice depend on the allowed momenta types [45,46] (as not all latticemomenta are related by an octahedral symmetry) and we reproduce them in Table II for convenience. The energy eigenstates of the lattice Hamiltonian (as obtained from non-perturbative lattice QCD calculations) are classified according to representations of the lattice symmetry group.
We denote the energy computed on the lattice for a η b meson with spatial momentum P as E η b (|aP |). Then E η b (|aP|) computed with the same a 2 P 2 but with aP which lie in different lattice little groups (e.g., (3, 0, 0) which has little group Dic 4 and (2, 2, 1) which has little group C 4 ) do not need to yield the same energy within errors. However, as the infinite-volume continuum limit is taken and full SO(3) symmetry is restored, these energies should converge. Improving the lattice NRQCD action, both by adding in higher-order O(p 6 ) terms and one-loop radiative corrections, should reduce SO(3) symmetry breaking and produce the desired infinite-volume continuum energies more accurately at a given value of (n, m, p) C2 A the lattice spacing. Examining the non-perturbative energies should indicate this to be the case. Improving the NRQCD action will reduce the breaking of SO(3) symmetry due to a cubic lattice. This is because higher-order rotational-symmetry breaking operators (which vanish as a → 0) will be increasingly taken into account correctly, e.g., . It is perhaps indicative that including O(p 6 ) operators reduces rotational symmetry breaking, as we found in Sec. III B that the one-loop coupling c (1) 5 , which is constructed in (25) to remove the rotationalsymmetry breaking i p 4 i terms from the dispersion relation to one-loop, gets reduced when improving to the O(p 6 ) NRQCD action.
In the following we will describe our non-perturbative computational setup as well as discuss how the data from the kinetic masses illustrates the reduction of SO(3) symmetry breaking when improving the kinetic parts of the NRQCD action.

A. Non-Perturbative Computational Setup
Our computational setup is similar to that in Refs. [3,8] and we point the reader to those texts for specific details. However, we give a brief overview. We use gauge ensembles generated by the MILC collaboration [47] with the tadpole-improved Lüscher-Weisz gauge action [48] with 2 + 1 + 1 dynamical flavours of HISQ sea quarks [15]. Details of these ensembles are given in Table  III. We use ensembles at three values of the lattice spacing, approximately 0.15 fm, 0.12 fm and 0.09 fm, so that we can test the changing impact of lattice discretisation effects.
Details of the covariant derivative and chromomagnetic/electric field implementation in our NRQCD action can be found in [3]. Each of these must be tadpole-TABLE III. Details of the gauge ensembles used in this study. β is the gauge coupling. aΥ is the lattice spacing determined from the Υ(2S − 1S) splitting [3], where the error combines statistics, experiment and the dominant NRQCD systematic error. amq are the sea quark masses, Ns×NT gives the spatial and temporal extent of the lattices in lattice units and n cfg is the number of configurations in each ensemble. In column 1 we use the numbering convention for the ensembles from [3]. Ensemble 1 is referred to as "very coarse", 3 as "coarse," and 5 as "fine".
Set β aΥ(fm) am l ams amc Ns × NT n cfg improved using the same improvement procedure as in the perturbative calculation of the matching coefficients in Sec. III B. We present kinetic masses using the meanfield improvement procedure where, as in the perturbative results, we take u 0 as the mean trace of the gluon field in Landau gauge, calculated in [3,6]. The u 0 values used for each ensemble are given in Table IV. We also give in Table IV the values that we use for the bare b quark mass am b on each ensemble. The lattice two-point correlator most naturally encodes information on meson energies. We use bilinear bb interpolating operators, listed in Table V with Γ = iγ 5 , γ j , which overlap onto definite J P C = 0 −+ , 1 −− energy eigenstates at rest, respectively, in the infinitevolume continuum version of our theory (which is rotationally invariant) [46]. In [8], as well as [46], it has been shown that at nonzero momentum, O γ 5 (p) is a helicity operator which creates a definite λ = 0 − energy eigenstate, but O γ i (p) creates an admixture of λ = 0 + , ±1, where these λ get contributions from J P values as listed in the third column of Table V. The ± superscript on the λ = 0 case represents the eigenvalueη = P (−1) J from theΠ symmetry (a parity transformation followed by a rotation to bring the momentum direction back to the original direction) [46]. Again, following [3], we simultaneously fit multiexponential functions to the bottomonium meson correlator at rest and with momentum aP. Doing so allows the correlations between the ground states energies to be correctly taken into account when computing the kinetic mass. We take priors of 0.1(1.0) on the amplitudes, priors on the ground state energies are estimated from previous results and given a suitably wide width [3], and priors on energy splittings are taken to be E n+1 − E n = 500(250) MeV. To help invert the covariance matrix a singular value decomposition is used with a tolerance of 10 −5 [8,49]. We present fit results, following [3], for fits including eleven exponentials for Set 1, nine exponentials IV. Parameters used for the valence quarks. am b is the bare b-quark mass in lattice units, u0L is the tadpole parameter [8]. Tp is the total propagation time for the bquark propagator and nt is the number of time sources used per configuration. The O(αs) matching coefficients for c1, c6 and c5 are taken from Tables VIII and IX. As explained in Sec. III the O(αs) coefficients are functions of am b ; the αs value they are combined with to give c1, c5 and c6 depends on the lattice spacing. The values are different for each version of the NRQCD action tested. As we focus on the improvements made in this study, c2, c3 and c4 are taken to be their tree-level values of 1.0.
Tp nt for Set 3, and seven exponentials for Set 5.

B. Non-Perturbative Results and Analysis
We generate data for E η b (|aP|) and E Υ (|aP|) with momenta aP = (0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 0, 0), (2, 1, 1), (2, 2, 1) and (3, 0, 0) in multiples of 2π/L. As discussed above, helicity classifies the energy eigenstates of the infinite-volume continuum NRQCD theory at nonzero momentum. Therefore, compared to the zeromomentum case, additional J P states can contribute to the correlator data at non-zero momentum. The authors of [8] found that when fitting to a 3×3 matrix of smeared correlators, the first excited state in the fit at non-zero momentum was the χ b1 (1P ), h b (1P ) for the operators O γ 5 (x), O γ i (x) respectively. At zero momentum, the first excited state was the η b (2S), Υ(2S) respectively. By using the same smearing types and correlators as those authors, we check that the additional states are present at non-zero momentum when using a 3×3 fit. However even when a fit does not resolve the additional (first excited) state accurately we find that the ground state is uncontaminated and still precise. Further, the finite-volume lattice breaks SO(3) symmetry and allows mixing with higher J P states within each of the lattice irreps given in Table II. As in [8], we find no signal for any mixing in the low-lying spectrum. We conclude that our ground state energies are reliably determined.
Each E Υ (|aP|) extracted from our lattice calculation has larger errors than those on E η b (|aP|) because of the slightly poorer signal-to-noise ratio. The statistical errors also grow with momentum. Consequently, ∆E(|aP|) has larger absolute errors as a 2 P 2 grows, but the relative error on ∆E(|aP|) is larger for small a 2 P 2 . As a result, the kinetic masses with smaller a 2 P 2 have larger errors, which then stabilise. Note the iγ 5 is needed to make the overlaps real [50]. The second column gives the J P C states that these operators create at rest in an infinite volume continuum. The third column gives the helicity eigenvalues λ that these operators create at nonzero momentum in an infinite volume continuum which is only rotationally invariant, while the J in brackets are the states which contribute to that helicity (c.f. [8,46].) On each ensemble we examine the kinetic mass, given by Eq. (32), in order to see how the kinetic mass changes for a given am b as a function of momentum, as we improve the NRQCD action. Since the energies and kinetic masses should only depend on the magnitude of the spatial momentum, rotational symmetry breaking effects show up most clearly as a difference between the energies corresponding to momenta (3, 0, 0) and (2, 2, 1) in units of 2π/L, and also as a kinetic mass that depends on a 2 P 2 .
One feature of the results is that the kinetic mass for the η b is slightly larger than that of the Υ, rather than being lower to reflect the ordering of the masses seen in experiment. This was also seen in [3] and explained there as the result of not including relativistic corrections to the σ · B/2m term in the NRQCD action (the term with coefficient c 4 in Eq. (3)). Such corrections (spindependent terms at O(v 6 )) would allow the effect of the σ·B term to be correctly incorporated in the kinetic mass and solve this problem [9,10]. The strategy adopted in [3] to mitigate this problem was to use the spin-averaged kinetic mass, which is less sensitive to these effects, to tune the b quark mass. The spin-averaged kinetic mass is given by Figure 10 illustrates this feature by showing results for the Υ and η b kinetic masses on set 5, for the p 6 +α s p 4 action. We also show the spin-averaged kinetic mass. The solid lines show the corresponding experimental values. In a full nonperturbative calculation we would want to tune the am b value for each action separately to match the spin-averaged kinetic mass to experiment. Here however we keep the same am b value for each action on a given ensemble (with only an approximate tuning) so that we can compare how the kinetic mass changes.
Data for the spin-averaged kinetic masses from the tree-level O(p 4 ) and O(p 6 ) NRQCD actions, both with and without O(α s p 4 ) corrections, are presented in Figures 11 and 12 for the ensembles described in Table III. Errors are statistical only. Since these plots have the same vertical scale we can see a reduction in the size of ap 6 and O(α s ) effects as the lattice spacing is reduced, from the fact that the range of results become more compressed from Sets 1 to 5.
In Figure 13 we plot the differences between the energies of η b states with momentum (2, 2, 1) and (3, 0, 0), in units of 2π/L, on each ensemble using different actions. We see that the largest SO(3) breaking occurs for the O(p 4 ) NRQCD action. This breaking is reduced when including the O(α s p 4 ) kinetic couplings, and then reduced further by the O(p 6 ) NRQCD action. The least SO(3) breaking occurs for the O(α s p 4 , p 6 ) NRQCD action. This improvement is sizable for the very coarse ensemble, Set 1, while for the coarse and fine ensembles the improvement is visible, but small. Further, the breaking on the coarse and fine ensembles goes from being a significant effect to a non-significant (2σ) one after improvement. Using this improved action allows for a more accurate and reliable determination of the kinetic mass, and hence also of the tuned b-quark mass in high precision calculations.
In Figure 14, we show the speed of light squared, c 2 = (∆E + M kin ) 2 − M 2 kin /P 2 , computed on Set 5 with the O(α s p 4 , p 6 ) NRQCD action against P 2 a 2 , where good agreement with the value of 1 is seen. FIG. 11. Non-perturbatively obtained spin-averaged kinetic masses, given by Eq. (33), on very coarse Set 1 (above) and coarse Set 3 (below) for p 4 and p 6 actions with both tree-level and O(αs) c1, c6 and c5. The errors shown are statistical only, excluding lattice spacing uncertainty, and are correlated. The data points at each value of P 2 a 2 have been offset symmetrically for clarity. The larger energy with a|P| = 9 is from the (2, 2, 1) ground state. The solid line is the experimental value.

V. DISCUSSION AND CONCLUSIONS
In this work we have made the next round of improvement to the HPQCD collaboration's formulation of the NRQCD action to allow increasingly accurate nonperturbative calculations in the future. The key results  presented herein include: • Determining the required operators which need to be added to the NRQCD action in order to give a correct heavy-quark dispersion relation to O(p 6 ), presented in Sec. II.
• Determining the one-loop coefficients of the O(p 4 ) kinetic couplings, namely c 6 and c boundary conditions as an IR regulator. We also present results for the one-loop (bare-to-pole) heavy-quark mass renormalisation Z (1) m and zeropoint energy W 0 which can be combined to give the one-loop energy shift (from neglecting the quark mass term in the NRQCD action) of a b-quark. This one-loop energy shift can be added to the non-perturbatively determined simulation energies to give a numerical value, which after converting to GeV, can be compared to the experimentally determined masses. All perturbative results are shown in Sec. III B.
• Determining the full one-loop radiative correction of these quantities by finding the scale q * of α s defined in the V -scheme. In doing this we use the higher order methodology which takes into account the anomalously small leading-order moments in order to obtain physical q * as described in Sec. II B.
• Determining the one-loop quantities for three different NRQCD action formulations, namely a NRQCD action that gives a heavy-quark dispersion relation correct (i) to O(p 4 ) and (ii) to O(p 6 ). These actions employ a mean-field tadpole improvement procedure. For reasons described in Sec. II B we also explore, for the first time, a (iii) Fat3 smeared NRQCD action with the quark dispersion relation correct to O(p 4 ) which does not require mean-field improvement. The Fat3 results are en-couraging and show stable behaviour against am b , indicating that the use of this or a similar smearing may be the way forward in future, rather than tadpole-improvement.
• Varying the stability parameter with n = 4, 6 and 8 to show that, as shown in Sec. III B, the O(α s p 4 ) kinetic couplings in the NRQCD action are insensitive to this choice. Thus, if future calculations need to compensate a decrease in lattice spacing (which allows higher momentum fluctuations) with an increase in n, they can do so reliably.
• Testing how the improvement of the NRQCD action, both in terms of additional O(p 6 ) operators and one-loop radiative kinetic coefficients, affects the non-perturbatively obtained kinetic masses, c. f. Figs. 11, 12 and 13. The impact of the p 6 terms and the radiative corrections on the kinetic masses obtained is small, particularly on the finer lattices. We find a significant reduction in SO(3) symmetry breaking when using the improved actions on the very coarse ensemble, Set 1, which decreases as the lattice spacing is reduced. On the fine lattice, Set 5, SO(3) symmetry breaking has been reduced to the point that the energy splitting, shown in Figure  13, is very nearly consistent with zero.
Taken together, NRQCD allows increasingly accurate and precise numerical calculations to be performed by including higher-order operators, in combination with determining the matching coefficients using perturbation theory. We have taken both these steps in this work. Furthermore, NRQCD is numerically cheap compared to its relativistic counterparts, being an initial-value, rather than a boundary-value problem. The outlook for NRQCD in the high-precision era is promising and this work helps ensure that this NRQCD formalism will continue to be an active contributor.  4 ) and O(p 6 )) are mean-field improved given the formulae in Appendix B and the mean field parameter in Table I are utilised to produce the one-loop mean-field improved quantities discussed in Sec. III B. Features of these formulae have been discussed in Sec. II B.
For a O(p 4 ) NRQCD action (e.g., using Eq. (2) with δH p 6 = 0) with partial cancellation, the tadpole counterterms are Z (1),tads m TABLE VII. Numerical values for the scale aq * used to evaluate αs in the V-scheme when computing the one-loop contributions. Note that the unsmeared results (O(p 4 ) and O(p 6 )) are mean-field improved given the formulae in Appendix B and the mean field parameter in Table I. The smeared results (Fat3) are not mean-field improved. The aq * are determined as described in Sec. II C.   ) and O(p 6 )) are mean-field improved given the formulae in Appendix B and the mean field parameter in Table I. The smeared results (Fat3) are not mean-field improved. This data is plotted in Figure 6. To determine the physical scale, q * , we use a −1 = 1.3, 1.6, 2.2 and 3.3 GeV corresponding to very coarse, coarse, fine and superfine MILC ensembles used by the HPQCD collaboration [3]. The am b values of 3.4, 2.8, 1.9 and 1.1 are the appropriate ones (approximately) for the b quark on very coarse, coarse, fine and superfine ensembles respectively. However, here we give results for all 4 lattice masses at each lattice spacing for completeness.