Gradient sub-grid-scale model for relativistic MHD Large Eddy Simulations

Federico Carrasco, 2, 3 Daniele Viganò, 2 and Carlos Palenzuela 2 Departament de F́ısica, Universitat de les Illes Balears and Institut d’Estudis Espacials de Catalunya, Palma de Mallorca, Baleares E-07122, Spain Institut d’Aplicacions Computacionals de Codi Comunitari (IAC3), Universitat de les Illes Balears, Palma de Mallorca, Baleares E-07122, Spain Max Planck Institute for Gravitational Physics, Am Mühlenberg 1, 14476 Potsdam, Germany (Dated: January 2019)


I. INTRODUCTION
The first long-awaited simultaneous detection of gravitational waves (GW) and electromagnetic radiation from a binary neutron star (BNS) merger [1] contains a wealth of data, including multi-wavelength electromagnetic radiation ranging from radio to gamma [2]. The existing models of BNS mergers allow to compare the observed GW signal with the theoretical waveform, thus inferring the chirp mass, and, secondarily, constraining the mass ratio and the deformability of nuclear matter. This analysis, together with the associated short gamma-ray burst and the kilonova emission [3], served to constrain some unanswered questions, regarding for instance the equation of state of nuclear matter [4], the maximum mass supported by a neutron star [5,6] and its radius [7], the amount of ejecta [8,9] and the contribution of BNS mergers to heavy elements production via r-processes [10].
Given the intrinsic difficulties of numerical general relativistic magneto-hydrodynamics (GRMHD) simulations, a few works have consistently included the full set of combined Einstein equations and MHD equations to solve such scenarios. Although the main features of the dynamics are quite clear from these simulations, some interesting and more subtle questions regarding the role of magnetic fields and the process they undergo after the merger are still unclear. In particular, during the merger, the Kevin-Helmholtz instability (KHI) is triggered at the shear layer between the colliding stars, and induces a fast growth of any seed magnetic field [11][12][13][14]. The instability develops faster for small-scale perturbation, with a cutoff wavelength of the order of the shear layer thickness (likely being ∼ meter or less). Additionally, other two mechanisms related to the remnant's rotation are supposed to play an important role at longer timescales: the magnetic winding, which creates and amplifies toroidal magnetic field starting from a radial component, and the magneto-rotational instability (MRI) [15].
A fundamental component of most theoretical models of binary NS mergers involved the formation of a jet perpendicular to the orbital plane, which is expected to be produced due to the accretion of the disk onto a central black hole. This jet is supposed to power the short gamma ray burst [16,17], and it is often thought to require or be favoured by the presence of a strong, largescale (dipolar) magnetic field. The creation of the jet is not trivial, with some simulations succeeding in generating it when starting with a strong magnetic field [18], but most of them not yet able to see it (see for instance [19]). In any case, the growth and creation of a large scale magnetic field has not yet been observed, not even in simulations with the highest numerical resolutions of O(10m) [14]. Even in that case, they are not yet able to fully capture the KHI, whose smallest and fastestgrowing scales are of the order of the discontinuity layer (probably a few meters or smaller).
Limitations are being slowly overcome thanks to the growing computational resources. However, the full cap-arXiv:1908.01419v1 [astro-ph.HE] 4 Aug 2019 ture of all scales by means of direct numerical simulation (DNS) is still relatively far out of reach even for the most advanced codes and infrastructures. Thus, it is well worth to study alternative approaches that try to simulate the physical processes at play with a much lower computational cost. There have been several attempts to effectively model the dynamics occurring in the small scales. For instance, Ref. [20,21] employs the viscous hydrodynamic equations to model the differentially rotating remnants of binary neutron stars mergers. Other works concern dynamo mechanisms, in which the magnetic field growth can be modeled by a simple algebraic modification of the induction equation [13,22].
An alternative approach to the problem comes from computational fluid dynamics and it can be summarized in the following statements. First, any numerical simulation can be seen as a filtered version of the evolution equations. Secondly, the filtering of any non-linear combination of the evolved fields (e.g., in the fluxes of the MHD equations) implies the appearance of residual terms in the evolution equations, corresponding to the unresolved dynamics in the small scales. In this approach, commonly known as explicit Large Eddy Simulations (LES), relies first on this separation of the scales of the solution (i.e., resolved and unresolved), combined with an explicit sub-grid scale (SGS) model to account for the dynamics occurring at the unresolved small scales. Although LES are commonly used in many engineering applications, including combustion, acoustics, and simulations of the atmospheric boundary layer and other fluids, its extension to astrophysical magnetized plasmas is less spread [23]. Its application to strongly self-gravitating fluids is even more limited. To our knowledge, only Ref. [24] has so far explicitly developed explicit LES for the relativistic hydrodynamic equations with a covariant form of the viscous SGS model, proposed by Smagorinsky in the sixties [25] and commonly used in other contexts. Since the Smagorinsky model is by definition purely dissipative, it easily allows numerical stability, but it cannot capture any possible backscatter towards larger scales. This is one of the reason why more elaborated models have been considered in the non-relativistic case.
In this paper we propose a more general formulation of the so-called gradient model [26,27], that we already assessed and implemented in the non-relativistic case [28]. It allows to capture part of the unresolved small-scale dynamics, regardless of the specificity of the problem. The main advantage of this model is that it does not rely on any physical assumption in the functional form of these residual terms. First we have to extend the LES formalism to generic conservative evolution equations like the relativistic MHD ones. This system has some particularities; first, the evolution equations of the evolved fields depend on some other quantities, usually called primitive fields. Second, the relation between the conserved and the primitive fields is explicit and analytical, but the inverse is usually not. After writing the LES for generalized conservation equations, we will extend the gradient model to this system.
In § II, we summarize the formalism of the filtering process in LES, very common in a plethora of hydrodynamical simulations related to different fields, but less known in relativistic hydrodynamics. § III summarizes the basics of the gradient SGS model. In § IV, we extend the LES formalism and the SGS gradient model to a general system of conservation laws, applied to nonrelativistic and relativistic compressible ideal MHD in § V. We validate the proposed model in a set of box simulations of the relativistic KHI in § VI, by comparing the SGS model to the residuals coming from filtering highresolution simulations in § VII. Finally, we summarize our results in § VIII.

II. LARGE EDDY SIMULATIONS AND SUB-FILTER-SCALE RESIDUALS
From a formal point of view, the effect of finite resolution in a numerical simulation is equivalent to a low-pass spatial filter, with a filter size of the order of the grid cell.
Applied to a field f (x, t), the filtering operation separates the resolved part from the sub-filter scale (SFS) residuals: Indicating the filter kernel with G, the filtering operator over a field f can be written generically as Large-Eddie-Simulations consist on applying the filter to the evolution equations under consideration and write them down as a function only of the resolved fields. Let us consider a generic evolution system written as where U a is a set of evolved fields and F ka (U ) is the flux along the direction k of the field U a . The equation for the filtered fields can be obtained by applying the filtering operator to the equations. Since the filtering operator commutes with both spatial and time derivatives, we have where we have defined the SFS tensors related to the fluxes as In order to see the effect of this scale-separation in a simple application, let us consider the well-known Burgers equation, a longstanding field of research in LES [29]. In the one-dimensional case, given the velocity field u, the equation reads: The equation for the filtered fields can be obtained by applying the filtering operator to the equations, namely Notice that the filtered equation is equivalent to the original one except by a term proportional to τ which represents the SFS residuals, that is, the loss of small-scale information due to the filtering process related to the non-linear terms. If in the region over which we average u has a definite sign (which is usually the case except close to the shock), then u 2 < u 2 and τ is negative-definite, so that −τ represents the kinetic energy hidden in the SFS. Regardless of the specific system considered, these new terms always appear due to the non-commutativity of the filtering operator with the non-linear terms of the equations. In a LES, which effectively only evolves u and cannot simulate the unresolved scales, the explicit expression of τ is a-priori unknown, and needs to be written as a function of the filtered fields in order to close the system. This implies in practice to approximate the SFS tensors with a sub-grid scale (SGS) model.
There are several ways of doing that, based mostly either on physical arguments or on an expected selfsimilarity of the solution. For instance, a popular, historical approach to fluid dynamics [25] is the setup of an artificial viscosity ν, namely The viscosity parameter is often taken to be proportional to ∆ 2 |∂ x u| due to dimensional reasons and in order to ensure that this term vanishes in the continuum limit, thus guaranteeing numerical convergence to the continuum (or DNS) solution. The numerical value of the proportionality coefficient can be fixed by hand or estimated by means of dynamical procedures which assume selfsimilarity (thus similar to a multiscale/multivariational approach) [30]. Regardless of the value of ν, this functional form of τ is equivalent to the viscous Burger's equation. However, in many cases the physics involved does not consist only in dissipation, but can involve, for instance, inverse cascade and scale-dependent transfers of energy between the fluid kinetic energy and the magnetic one. Therefore, in those cases any result coming from the LES might be biased by the a-priori choice of a given SGS model. Hereafter, we will focus on the so-called gradient model, which has already been shown to capture the main features of the turbulent dynamics in our previous study for non-relativistic MHD turbulence box simulations [28].

III. THE SGS GRADIENT MODEL
Although many of the SFS terms are modeled based on some physical properties, it is possible to compute them relying only on mathematical arguments by considering the analytical Taylor expansion of the SFS terms [26,27], which rely on the properties of the filtering operator G.
An homogeneous isotropic low-pass filter is independent on the direction (i.e., G(x − x ) = G(|x − x |)) and only smooths out fluctuations on length scales smaller than the filter size, leaving unchanged the variations of the solution at larger length scales. In addition, the filter operator is linear and commutes with spatial derivatives. Generically, it can be written for any dimension D as where The simplest low-pass filter is the mean value in a cubic domain with size ∆ f in each Cartesian direction {x i }, described by the normalized kernel Despite the appealing simplicity of the box filter, which makes it very useful to perform numerical calculations, we will see below that it is not suitable for analytical calculations involving its derivatives, since they are not continuous. Therefore, at a formal level, it is more practical to introduce the normalized Gaussian kernel, which in the space domain can be written as where ξ defines the effective filtering width. Besides having the same zeroth and first moments, Gaussian and box filters have the same second moment if we set ξ = ∆ 2 f /24. The filter is a useful mathematical tool to analyze not only the different scales on the solution, but also the structure of the evolution equations, allowing us to distinguish between the resolved or filtered fields and the unresolved ones on the SFS.
The main hypothesis is that the discretization of the equations over a finite-size grid is approximated by a filtering operator acting on the equations, with a Gaussian kernel equivalent to a box filter with width ∆ f , eq. (11). Its D-dimensional Fourier transformed function iŝ where k is the wavenumber. The main idea is to compute an approximation of the inverse filtering operator based on gradient expansion of the filter kernel G, that is, an approximation of the inverse Fourier transform of 1/Ĝ. The first step is to perform a Taylor expansion of the transformed function and its inverse in terms of the filter scale, that is,Ĝ 1 Considering the expansions to the fieldsf andf , the application of the inverse Fourier transformation yields to an infinite series representation of the filter operator and its inverse in terms of gradient operators acting on the fields, namely [31] f These expressions are absolutely convergent and formally accurate at all orders, since the Gaussian kernel is infinitely differentiable and with unbound support. In fact, it was found that these series converges for all canonical filters.
The main idea is to filter the evolution system by means of a spatial filtering with Gaussian kernel (e.g. [32,33]). Then, there appear unknown SFS tensors, which can be computed explicitly by using the Taylor expansion Eqn. (16). Therefore, we have obtained the following useful relations to first order in ξ: where " " means (here and thereafter) "accurate up to O(ξ 2 ) terms" and ∇()·∇() denotes contractions of spatial derivatives. Also, note that by df dg we really mean df dg (g). Going back to our previous Burger's equation example, it is easy to model the SFS terms appearing in Eq. (7) using the gradient expansion above, namely In principle, corrections to these expression are expected due to the form factor (due to the kernel shape) and the contribution from higher orders. In other words, this approximated expression can help as long as we can capture most of the dynamics with our LES, leaving to the SGS the task of mimicking the small-scale contributions.
Notice that the SGS terms scale with ξ ∝ ∆ 2 f , thus ensuring the numerical convergence (i.e., it vanishes in the continuous limit ∆ f → 0).
Regardless on these quite obvious caveats, notice that this prescription is very different from the viscous one given by eqn. (8): one arises from physical considerations, while the other just from mathematical ones. The first one is very useful in scenarios where the physics involved is well known and present universal behaviour, like in pure hydro-dynamical, non-relativistic cases. However, in the case of MHD, the non-linear interplay between kinetic and magnetic energy is rich of different physical mechanisms (dynamo effect, dissipation, helicity transfer...), at different scales. In these cases, it is more useful to have a SGS model which relies only on the mathematical features (basically, the gradients) of the involved field.

IV. GRADIENT MODEL FOR GENERAL SYSTEM OF CONSERVATION LAWS
The gradient model described before has been applied mainly to the non-relativistic HD or MHD equations, often in their incompressible version. In order to generalize it to the relativistic case, we tackle the problem from a broader perspective by considering a general system of conservation laws, where C a is a set of conserved evolved variables and F ka (P ) is the flux along the direction k of the field C a , which can be expressed in terms of the primitive fields P a . The transformations between conserved and primitive fields can be expressed formally as Notice that, although the relation f a between conserved and primitive fields is always known explicitly, the inverse function g a is not analytical in the relativistic case, where it usually needs to be solved numerically.
Conservative-scheme simulations are then equivalent to a filtered version of the evolution equations, such that the values of the filtered conserved variables C a are numerically known. The filtering operation, implicitly contained in a LES, allows us to evaluate P a := g a (C) but not P a ≡ g a (C), which is what really appears in the filtered version of the equations. This distinction suggests a different definition of the SFS tensor in the filtered equations, which can be obtained again by applying the filter to eqs. (18), namely where hereafter the indexes "a, b, e" represent the different fields of the set of primitive and conserved quantities, while "i, j, k" represent the spatial directions. Here we have defined the SFS tensors related to the fluxes as that is, in terms of P a instead of P a , which are the fields that we can compute from the evolved C a . Notice that these SFS tensors for the fluxes are not special, and similar ones can be calculated for any function. In particular, such residuals can be computed for the primitive fields P a (or for any other non-conserved field), namely, where the labels on τ indicates to which field (and specific component of the field) the SFS residuals are referred. In general, neither the SFS tensors associated to the primitive fields nor to the fluxes are known, and they can be important whenever a non-negligible part of the dynamics occurs in the small scales. This is the typical scenario in turbulence, where the fluctuations are relevant. The task is now to exploit the above gradient expansion of the filter kernel to provide a closed expression for τ ka F (20) in terms only of the filtered variables and its derivatives. As we will see, this process essentially involves the filtering of composed functions; the Taylor expansion; and the inverse function theorem.
Let us start by expanding the expression where repeated index "a, b, .." denotes summation over the space of fields. Then, we can Taylor expand F ka (P ) around P , to first order in ξ, as and by the inverse function theorem we can express (locally) the Jacobian of the inverse variable transformation as the inverse of the Jacobian. That is, meaning the matrix inversion of the Jacobian dC a dP b , evaluated at the filtered variables (namely, either C a or P a ).
Finally, by noticing that we can interchange P P (at first order in ξ) in the last term of (22), and then combining it with (23) we finally get the functional form of the SGS tensor τ ka F , namely which approximates the SFS residuals of the fluxes (i.e., τ ka F τ ka F ). In most of the cases it is useful to re-express the SGS tensor τ ka above in the following equivalent form Due to its simplicity we will use this relation in the following, although in some specific evolution systems other equivalent forms can be preferred 1 .
1 For instance, another interesting equivalent expression is given

V. APPLICATIONS TO COMPRESSIBLE IDEAL MHD
We will check the validity of our approach by applying first to the non-relativistic MHD equations, for which the LES equations with the gradient model has been studied for decades and were derived in the compressible case for a generic equation of state [28]. After that, we will consider the unexplored MHD relativistic case.

A. Non-relativistic case
The set of primitives fields for the non-relativistic MHD is given by P a = ρ, v i , , B i (density, velocity, specific internal energy and magnetic field, respectively). The conserved ones are C a = ρ, N i , U, B i (density, momentum density, energy density and magnetic field), which can be written explicitly as a function of the primitive ones as: The evolution system is thus written as where the fluxes read denoting the usual antisymmetrization operation. Notice that these evolution equations involve also the pressure p(ρ, ), so an equation of state is required in order to close the system. The filtered version of the system, including the SFS terms in the right hand sides, is In this problem the relation between conserved and primitives is easily invertible, allowing to express all the fluxes as explicit functions of the conserved quantities. Thus, it is possible to compute dF ka and then use equation (26) to solve for the SGS tensors τ ka F . The former can be written explicitly as: . Also we will need the following derivative, Finally, we obtain where the SGS tensors have been splatted conveniently in order to connect with previous works, being: Notice that these results, obtained starting from a general formulation and applying it to the non-relativistic MHD case, agree with previous derivations calculated for this specific case [28]. The results here were derived for a general equation of state, which is reflected on the expression for τ pres . In particular, the form of τ pres is greatly simplified for an ideal gas equation of state.

B. Relativistic case
Let us now consider the special relativistic MHD, for which a gradient SGS model has not been calculated so far. In this case, the set of primitives variables is given by P a = ρ, v i , , B i , the conserved ones are C a = D, S i , U, B i , and the relations among them are given by is the Lorentz factor, the pressure p is defined through an equation of state and the enthalpy is defined by, h := ρ(1 + ) + p . The evolution system is written, where Since the electric field can be obtained from the ideal MHD condition E i = − ijk v j B k , all the previous fluxes can be easily written in terms of the primitive fields. Finally, we shall define E := hW 2 and Θ := E + B 2 , in order to simplify our following calculations. The filtered version of the system can be written as: where on the right-hand side we have introduced the SFS tensors associated to each flux, as defined above. As in the non-relativistic case, we propose to model the filtered SFS terms appearing in the equations (41)-(44) by means of a compact application of the gradient model. First of all, we define the double gradient operator H, acting on any given field X, as which satisfies a sort of generalized Leibniz's rule, Notice that, when acting on any conserved field, it vanishes by its definition (i.e., On the other hand, when the operator applies to a nonconserved variable, the quantity is non-zero. This holds in particular for { p, Θ, v k } as we shall see below. By applying these rules and using eq. (26), the SGS gradient tensors approximating the SFS terms of eqs. (41)-(44) read: where the set of the H tensors, after some algebraic ma-nipulations, can be written as: 2 −ξH X (exactly like for the conserved field SFS residuals). Their explicit expressions are obtained by computing the following set of equations in the order in which they appear, where the quantities Ψ denote the auxiliary fields which are used to simplify the presentation (and also to facilitate their implementation): Note that these equations are remarkably much more involved than in the well-known non-relativistic case [28]. The main difference being, of course, the more complicate relationship between the conserved and primitive variables in the relativistic setting. Notice that the continuity equation now acquires a SFS term, while the energy equation does not contain any SFS residual. Such "exchanged roles" of the continuity and energy equations comes from the non-relativistic limit and can be seen already at the level of the equations. Notice also that, depending on the equation of state, some terms can be considerably simplified. For instance, Ψ A = W 2 Γpρ for an ideal gas EoS with coefficient Γ.

VI. APPLICATION TO MAGNETIC RELATIVISTIC SIMULATIONS
One way to check the validity of the relativistic SGS gradient model is by performing a detailed analysis of numerical simulations displaying turbulence. Therefore, we will consider a decaying (i.e., non-forced) turbulent dynamics in relativistic MHD, triggered by the KHI, a scenario already studied especially in the non-relativistic case [34,35]. Notice that the KHI provides different stages of the turbulent flow: the initial development at small scales, the transfer of kinetic to magnetic energy, the saturation and mixing, and the final slow decay [28,34]. Moreover, the KHI is thought to take place in binary neutron star mergers, and will be the natural mechanism to be studied in the forthcoming general relativistic simulations.

A. Setup
The initial setup is just an extension from the nonrelativistic [28,35] to the special relativistic case. We evolve eqs. (37)-(40) by using the SAMRAI infrastructure [36,37] with a code generated by the platform Simflowny [38,39]. The numerical schemes are the same used in Ref. [28], which were described in detailed in [40,41]. The discretization of the continuum equations is performed by using the Method of Lines, which allows to address separately the time and the space discretization. We employ High-Resolution Shock-Capturing (HRSC) methods [42] to deal with the possible appearance of shocks and to take advantage of the existence of weak solutions in the equations. The fluxes at the cell interfaces are calculated by combining the Lax-Friedrichs splitting [43] with the WENO5Z [44] high-order non-oscillatory reconstruction scheme. The time integration of the resulting semi-discrete equations is performed by using a fourthorder Runge-Kutta scheme, which ensures the stability and convergence of the solution for a small enough time step ∆t ≤ 0.4 ∆.
We set our problem in Cartesian coordinates, considering a periodic box [−L/2, L/2] 3 . We shall consider differ-ent resolutions between 128 3 and 1024 3 , and evolve the system up to t = 20. The primitive fields read initially: where a l is the mixing layer scale, y l is the distance of the shear layers to the plane y = 0, σ y and σ z are the extension scale of the initial perturbation in the y-direction and the profile of v z , respectively. The main flow is initially given by v x0 . The standard values that we consider are L = 1, y l = 1/4, ρ 0 = 1.5, ρ 1 = 0.5, a l = 0.01, v x0 = 0.5, B x0 = 0.001, B y0 = B z0 = 0, p 0 = 1 and σ 2 z = 0.1. We consider an ideal gas equation of state, p = (γ − 1)ρ , with γ = 4/3. The purpose of this paper is not to explore the dynamics for different parameters (see for instance [34] for a discussion about the role of the initial values of Mach number and magnetic field).
The initial perturbation, δv i , is a superposition of single-mode perturbation with a number of nodes n i ∈ [1, N/2], periodic in the boundary box: We underline that the specific form of the initial perturbation has no influence on the asymptotic turbulent behavior, as long as we excite the entire spectrum of modes, which can be achieved easily if the modes are not the same (or multiple) to each other, n i 1 and δv i v 0x . Hereafter, we use n x = 11, n y = 7, n z = 5, δv x = δv z = 0.01, δv y = 0.1, σ 2 y = 0.01. Since the KHI is known to grow faster for smaller scales, and due to the absence of physical viscosity in this test, we do not expect a numerical convergence, at least in the growth phase, since the more we refine the grid, the more fastgrowing excited mode will be included. This actually reproduces the scenario of the binary neutron star mergers, where the finest available resolutions are likely still very far from being able to capture all the relevant modes, and the simulations do not show numerical convergence, in terms of total magnetic energy and its spectra. See also our previous work [28] for more details and for the general, similar behavior in the non-relativistic case.

B. General behaviour
The following qualitative behavior is similar for all the resolutions considered. The development of the turbulent dynamics is illustrated in Fig. 1, where the density distribution over the slice z = 0 is displayed at As the resolution increases, the turbulent regime develops faster and the transference from kinetic to magnetic energy is more predominant.
times t = {2, 10, 20} for the highest resolution case with N = 1024 3 . At the beginning, the instability develops as small-scale structures with modes given by the initial perturbation. The development starts at the shear layer, where the transfer of kinetic to magnetic energy (dynamo effect) is fast up to t ∼ 2 for all the resolutions (left panel), extending up to t ∼ 4 for the highest one. Then, a larger scale mixing takes place (middle and right panel), during which the effectiveness of the dynamo mechanism reduces. After that period the mixing is completed and the fluid looks isotropic and homogeneous.
In Fig. 2 we compare the evolution of the integrated kinetic and magnetic energy for different resolutions. In our initially kinetic-dominated setup, both the internal and magnetic energies grows during the evolution, at the expense of the kinetic energy. For this particular setup, the internal energy is quantitatively the dominant one (due to the chosen values of the initial pressure), while the magnetic energy is always quite smaller than the kinetic energy.
As it has been observed already by several works, the transfer from kinetic to magnetic energy is more prominent at small scales. Therefore, since the initial perturbation spans all the scales and we have no viscosity included, when we consider higher resolution we capture more unstable modes, which in turn enhance further the magnetic growth. For the explored resolutions, we do not observe yet a saturation of the magnetic field in the homogeneous phase. Actually, in all cases, the dynamo mechanism slowly goes on, allowing in time to reduce the difference between the total kinetic and magnetic energy (possibly showing a slow approach to equipartition at the end of our N = 1024 3 simulation). Similar behaviours are observed in the non-relativistic case for this problem. The saturation level will arguably be set by the strong feedback of large magnetic scales, but longer, and possible even more accurate simulations are needed to have a definitive answer.

C. Spectra
In addition to volume-integrated quantities, whenever there is turbulence it is illustrative to compute also the radially-averaged spectrum [28,45]. For a given field f defined in a periodic box of side L, we use common python functions to calculate its discrete fast Fourier transformf ( k) = Σ x f ( x)e −i k· x , where the sum is performed over the N 3 spatial points equally spaced in each direction, with k j = n ∆k, where ∆k = 2π L and n ∈ [0, N/2] is an integer.
We consider the radial coordinates of the Fourier space, describing it with ∆k-wide radial bins also centered on FIG. 3. Spectra of the kinetic (solid) and magnetic (dashed) energies, for different resolutions, at three representative times; very early, when the turbulence is just quickly developing, an intermediate state when it is growing but at a slower pace and a late state when it saturated already (for the high resolution cases). As the resolution increases, the turbulent regime develops faster and the transference from kinetic to magnetic energy is more predominant. k r = {n ∆k}. Then, we calculate the spectra E(k) as averages < · > kr over the annular bins of the power density per unit of radial wavenumber in 3D: where k 2 = k 2 x +k 2 y +k 2 z and the normalization arises from the Parseval identity. For further technical considerations about normalizations, caveats (e.g. the systematic noise introduced by the conversion to radial coordinates in the Fourier space) and possible corrective factors, we refer to a recent dedicated paper [45]. The analysis of our simulations with a large number of points have been parallelized by using the python package [46] due to the large memory required.
The obtained spectra for t = {2, 10, 20} are displayed in Fig. 3. First of all, note that all spectra show a change of slope at k ∼ few times 2π/∆, due to numerical dissipation. This depends on the scheme, and it is a feature that already appeared in other relativistic hydrodynamical turbulence works employing finite-differences [47]. A possible cure could be to use spectral methods, which are suitable for bounding box simulations, but not for a complex astrophysical scenario. The change of slope means that the dynamics of the smallest resolvable scale is partially numerically damped. In the literature, this is actually a known issue that go under the name of implicit LES, meaning that the numerical dissipation effectively acts as an (uncontrollable) SGS Smagorinsky-like model within the simulation.
With this caveat in mind, we can analyze the spectra, focusing mostly on the large and intermediate scales. As time goes on, there is a direct cascade transferring kinetic energy from large to small scales (i.e., low to high wavenumbers). Therefore, the kinetic energy spectra extends to larger wavenumbers (i.e., smaller scales) with a Kolmogorov's slope ∝ k −5/3 , as the resolution is increased.
When one looks at the magnetic spectra, one can see that most energy tends to be stored at small scales, because these are where they are injected via dynamo process. The magnetic energy is spread also at larger scales (what is known as inverse cascade), so that we obtain a similar shape for the magnetic energy (roughly compatible with the analytical Kazantsev dependence ∝ k 3/2 [48]). The more we rise the resolution, the more effective is this dynamo mechanism. Since the physical limit to the small scale is set by the viscosity (which in turns set the thickness of the shear layer), which is neglected here, we can understand the lack of numerical convergence, at least in the growth phase. This explains the evolution of the total magnetic energy seen above.
Ideally, the implementation of an effective SGS model should properly include the feedback of the small-scale dynamics on the magnetic energy distribution and its growth. That should provide a magnetic spectrum which, at the intermediate and large scales, should be similar to a case with a much higher resolution (see our nonrelativistic results in [28]).

A. Methodology
The most important analysis of our simulations is the a-priori test of the gradient SGS model. We run simulations with a certain grid step ∆ = L/N . Then, we consider a snapshot at a given time, and spatially filter all the evolved fields. The simplest recipe is to use a simple average groups of S 3 f cells, where we define S f as the filter factor. This corresponds to apply a filter in the real space, with a box kernel of size ∆ f = S f ∆, obtaining filtered fields evaluated over N 3 f = (N/S f ) 3 points. The information lost in the filtering process is the field variation contained between the scales represented by N f and N . The solution in these SFS can be quantitatively evaluated by the explicit formal definitions of τ as defined above in Eq. (20). Let us consider a simple case as an illustrative example, with a SFS tensor defined as where i indicate each of the S 3 f discrete positions considered inside the cell centered in x f . Note that this estimation is not an exact evaluation of the loss information, since, by construction, it can only include the range of scales [∆, S f ∆]. The information for scales < ∆ cannot be evaluated. Once built each component of each SFS tensor, one can consider a given SGS model τ . A measurement of the linear correlation between the numerical data and the different models can be estimated with the Pearson correlation coefficient between the SFS and SGS quantities, While the Pearson coefficient tests the functional form, one can also consider each SGS component with a precoefficient C to be adjusted. Its best-fit value can be calculated by the minimizing the L2-norm, where the sum is performed over all the positions { x f }. The minimization gives simply: This procedure can be repeated independently for each SGS component, for each tensor τ . As we showed in our previous work [28], when one compares the performance of the gradient model with other SGS models available for the non-relativistic case, the Pearson correlation of the gradient model stands out, being always much closer to one than the others, and degrading to 0.5 only for quite large filter sizes.
Due to these previous findings and the absence of consistent models to compare with in the relativistic case, we assess the gradient model by a-priori fits for different times, resolutions, and filter size. Below, we report our main findings, usually represented by averaging the different independent components of a given tensor. Differences between different components are statistically negligible, and can arise only temporarily during the initial phase of the development when fields are not homogeneous (see [28] for more details).

B. Results
As a representative example, we consider the different resolutions filtered with S f = 2 (thus, averaging to N f = 512 3 ). In Fig. 4, we show the correlation Pearson values and the best-fit coefficients between the main SFS tensors, eqs. (41)- (44) and the corresponding SGS components tensors, eqs. (47), as a function of time. The Pearson coefficients are very close to one for all tensor components (P 0.8) at all times, indicating a very good correlation between the SFS and the SGS tensors even during the transitional development of the instability (where we have the minimum value of the correlation). Moreover, the best-fit coefficients are fairly constant in time and ∼ O(1). This is the most important result, which confirms that the proposed model is actually fitting well for a variety of time (i.e., MHD configurations) and resolutions. As a comparison, other SGS models in the non-relativstic cases studied in [28] showed little correlation, or even not at all. In particular, the Smagorinsky model, the only one tested so far in GRMHD [24], showed a non-zero correlation (P 0.3 at its best, for S f = 2, with a largely-varying best pre-coefficient) due to the ability to catch the transfer from resolved to SFS scales. Notice however that this is only a part of the non-linear dynamics involved in MHD turbulence. The overall great performance of the gradient model are well known [28,32,33,[49][50][51][52], due to the strong mathematical basis on which it relies. Let us stress that we have explored different initial conditions, finding mainly the same results. Another interesting result can be obtained by studying the simulation with N = 1024 3 points with different filter size ∆ f , as displayed in Fig. 5. As the filter size increases, there is more information lost by averaging in the cells. This leads to a degradation of the Pearson coefficient, although it is still above P 0.5 even for ∆ f ≥ 16, indicating that the LES gradient terms are able to fit the functional form of the non-linear terms corresponding to a quite higher resolution. The variation over time of the best-fit coefficients are limited, which legitimates one to consider a constant pre-coefficient in the LES implementation, without any dynamical procedure (like in other non-relativistic models [30]) for its estimation.
One can also assess the correlations between the auxiliary double gradient terms given by eqs. (51)-(54) and the corresponding SFS terms. In Fig. 6, we show as an illustrative example the good agreement obtained by comparing the SGS term (−ξH v ) with ( v −v) (left panel), and (−ξH Θ ) with ( Θ − Θ) (right). This is an important check, since the contribution to the overall SGS tensors of Fig. 4 could be given by a dominant component, thus "hiding" the others. Our results show that, taken one by one, each testable term (i.e., the ∝ H terms) correlates well with the corresponding SFS value.

VIII. CONCLUSIONS
In this paper we have presented a formalism to study LES for a generic system of conservation equations, in which the evolved fields appear through non-linear (and possibly not analytical) relations in the fluxes. We have also extended the gradient SGS model for this generic system of conservation laws.
We have verified that, by applying this formalism to the non-relativistic MHD equations, one recovers known results available in the literature. Furthermore, we have applied the formalism to the relativistic MHD system, extending for the very first time the gradient SGS model to the relativistic case.
We have performed 3D numerical simulations of the KHI in a bounding box at different resolutions with highorder numerical schemes. Within these simulations we have been able to compare the residual SFS tensors with the gradient SGS tensors (i.e., a priori tests), showing a high correlation between these two quantities, as indicated by a Pearson number close to 1. Notably, the resulting best fit-parameters are also close to the expected value C best ≈ 1. Finally, we have seen that the Pearson value of these tensors is still significant (i.e., larger than 0.5) with a filter size up to ∆ f = 16∆, indicating that this approach can effectively account for a resolution between a factor of 4 and one order of magnitude times larger.
Reproducing small-scale dynamics by means of a validated LES can be seen as a computationally efficient way to solve the equations, equivalent to consider an effective higher resolution. When implemented in a LES, this approach can guarantee a considerable saving of computational time, by using a relatively low resolution, or if combined with high resolution, a feasible way to capture previously unaccessible small-scale dynamics. Quantifying the gain is not trivial, since it depends on the numerical scheme used (see [28] for considerations about the intrinsic dissipation in commonly-used finite-difference schemes). Our previous results in the non-relativistic case and the a-priori assessment here presented, allow us to estimate a gain in the effective resolution of at least a factor 4, and possibly up to a factor 8.
Finally, let us mention that although the LES and gradient SGS model presented here are valid only for the special relativistic case since we have used the Minkowski metric, the extension to General Relativity is quite straightforward. As a matter of fact, although Einstein equations are highly non-linear, they do not show a "turbulent" regime: actually, the metric tends to be smooth and slowly varying, compared to the matter fields. Therefore, the metric functions entering in the filtered MHD equations vary much more smoothly than the fields involved in the turbulent dynamics. Such extension, together with its application to binary neutron star simulations, will be presented in a forthcoming paper.