Unified discontinuous Galerkin scheme for a large class of elliptic equations

We present a discontinuous Galerkin internal-penalty scheme that is applicable to a large class of linear and nonlinear elliptic partial differential equations. The unified scheme can accommodate all second-order elliptic equations that can be formulated in first-order flux form, encompassing problems in linear elasticity, general relativity, and hydrodynamics, including problems formulated on a curved manifold. It allows for a wide range of linear and nonlinear boundary conditions, and accommodates curved and nonconforming meshes. Our generalized internal-penalty numerical flux and our Schur-complement strategy of eliminating auxiliary degrees of freedom make the scheme compact without requiring equation-specific modifications. We demonstrate the accuracy of the scheme for a suite of numerical test problems. The scheme is implemented in the open-source SpECTRE numerical relativity code.


I. INTRODUCTION
Many problems in physics involve the numerical solution of second-order elliptic partial differential equations (PDEs). Such elliptic problems often represent static field configurations under the effect of external forces and arise, for example, in electrodynamics, in linear or nonlinear elasticity, and in general relativity. Elliptic problems also often accompany time evolutions, where they constrain the evolved fields at every instant in time or provide admissible initial data for the evolution.
Discontinuous Galerkin (DG) methods are gaining popularity in the computational physics and engineering community and are currently most prevalently used for time evolutions of hyperbolic initial-boundary-value problems [1][2][3][4]. Many properties that make DG methods advantageous for time evolutions also apply to elliptic problems, which has led to the development of DG schemes for elliptic PDEs [5,6]. In particular, DG schemes provide a flexible mechanism for refining the computational grid, retaining exponential convergence even in the presence of discontinuities when adaptive mesh refinement (AMR) techniques are employed [7,8]. Furthermore, some difficulties with DG schemes in time evolutions, such as shock capturing, are not present in elliptic problems, and their static nature often (but not always) makes it straightforward to place grid boundaries at discontinuities, thus relieving the AMR scheme from the responsibility of resolving them. See, e.g., Ref. [2] and the seminal paper [6] for extensive discussions of DG schemes for the Poisson equation, and Refs. [9][10][11] for discussions of linear and nonlinear elasticity.
In the context of relativistic astrophysics and numerical relativity, DG methods have been developed for hyperbolic equations on curved manifolds thus far [12][13][14]. In Ref. [8] we explored the feasibility of the DG method for elliptic problems in numerical relativity confined to flat Poisson-type equations with nonlinear sources. In this article we present a DG scheme suited to solving a significantly larger class of elliptic problems that arise in numerical relativity. Most notably, the scheme encompasses the extended conformal thin sandwich (XCTS) formulation of the general-relativistic Einstein constraint equations on a curved manifold, and associated boundary conditions [15][16][17]. Solutions to the XCTS equations provide admissible initial data for general-relativistic time evolutions, for scenarios such as two orbiting black holes or neutron stars [18][19][20][21]. To our knowledge, this article presents the first discontinuous Galerkin solution of the full Einstein constraint equations. Aimed at applications in numerical relativity, the scheme is implemented in the publicly available SpECTRE code [13,22,23].
* nils.fischer@aei.mpg.de
Furthermore, the elliptic DG scheme presented in this article is not limited to applications in numerical relativity. It applies to all second-order elliptic problems that can be formulated in first-order flux form. Besides the classic Poisson and elasticity equations, it covers a large class of elliptic problems in general relativity and hydrodynamics, including coupled systems of equations and those formulated on a curved manifold. With our unified DG scheme, new elliptic systems can be implemented by supplying their first-order fluxes and sources, hence no knowledge of the DG technology or of finite-element formulations is required. This lowers the barrier for extending the capabilities of a simulation code. We pay particular attention to supporting a wide range of linear and nonlinear boundary conditions, so our DG scheme is suited to many real-world scenarios (as well as some out-of-this-world scenarios such as initial data for evolutions of black holes and neutron stars). We are aware of only Ref. [24] studying a nonlinear boundary condition for an elliptic DG problem.
To formulate the unified DG scheme we present a generalized internal-penalty numerical flux, which avoids problem-specific parameters that are needed, e.g., in Refs. [25][26][27]. We eliminate auxiliary degrees of freedom that arise from the first-order form with a Schur-complement strategy, which has proven more suitable to the unified DG scheme than primal formulations that are commonly employed in the literature [2,6]. The resulting DG scheme is compact, in the sense that it involves only nearest-neighbor couplings and no auxiliary degrees of freedom, and symmetric, unless the symmetry is broken by the elliptic equations.
This article is structured as follows. Section II details the generic first-order flux formulation that serves as the starting point for our DG discretization. Section III develops the unified DG scheme. In Section IV we apply the DG scheme to a set of increasingly challenging test problems. The test problems include scenarios derived from general relativity that feature sets of coupled, strongly nonlinear equations on a curved manifold with nonlinear boundary conditions, solved on curved meshes. We conclude in Section V.

II. FIRST-ORDER FLUX FORMULATION
We consider second-order elliptic PDEs of one or more "primal" variables u_A(x), where the index A labels the variables. The variables can be scalars (as in the Poisson equation) or tensorial quantities (as in an elasticity problem). We reduce the PDEs to first order by introducing "auxiliary" variables v_A(x), which typically are gradients of the primal variables. We then restrict our attention to problems that can be formulated in first-order flux form,

−∂_i F^i_α[u_A, v_A; x] + S_α[u_A, v_A; x] = f_α(x),   (1)

where the index α enumerates both u_A and v_A. Here the fluxes F^i_α and the sources S_α are functionals of the variables u_A and v_A, but not of their derivatives, as well as of the coordinates x. The fixed sources f_α(x) are independent of the variables. Lowercase Latin indices i, j, k, l enumerate spatial dimensions, and we employ the Einstein summation convention to sum over repeated indices.
The flux form (1) is general enough to encompass a wide range of elliptic problems. For example, a flat-space Poisson equation in Cartesian coordinates,

−∂_i ∂^i u(x) = f(x),   (2)

has the single primal variable u(x). Choosing the auxiliary variable v_i = ∂_i u we can formulate the Poisson equation with the fluxes and sources

F^i_{v_j} = u δ^i_j,   S_{v_j} = v_j,   f_{v_j} = 0,   (3a)
F^i_u = δ^{ij} v_j,   S_u = 0,   f_u = f(x),   (3b)

where δ^i_j denotes the Kronecker delta. Note that Eq. (3a) is the definition of the auxiliary variable, and Eq. (3b) is the Poisson equation (2).
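As a concrete illustration of the flux form (1), the Poisson fluxes and sources (3) can be sketched in code. The following is a minimal, hedged Python sketch; the function names and array layouts are illustrative, not taken from the paper's implementation:

```python
import numpy as np

# Hedged sketch: first-order fluxes and sources for the flat-space Poisson
# equation, following the generic flux form -d_i F^i + S = f.
# Array layout assumption: fields are sampled on npoints grid points.

def poisson_fluxes(u, v):
    """F^i_u = v_i (primal) and F^i_{v_j} = u * delta^i_j (auxiliary)."""
    F_u = v                                   # shape (dim, npoints)
    dim, npoints = v.shape
    F_v = np.zeros((dim, dim, npoints))
    for i in range(dim):
        F_v[i, i, :] = u                      # u times the Kronecker delta
    return F_u, F_v

def poisson_sources(u, v):
    """S_u = 0 and S_{v_j} = v_j."""
    return np.zeros_like(u), v
```

New elliptic systems would plug into the scheme by supplying exactly this kind of pair of functions, with no knowledge of the DG discretization itself.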
The equation of linear elasticity in Cartesian coordinates,

−∂_i (Y^{ijkl} ∂_{(k} ξ_{l)}) = f^j_ext(x),   (4)

has the primal variable ξ^i(x), describing the vectorial deformation of an elastic material. The constitutive relation Y^{ijkl}(x) captures the elastic properties of the material in the linear regime. Choosing the symmetric strain S_{ij} = ∂_{(i} ξ_{j)} = (∂_i ξ_j + ∂_j ξ_i)/2 as auxiliary variable we can formulate the elasticity equation with the fluxes and sources

F^i_{S_{jk}} = δ^i_{(j} ξ_{k)},   S_{S_{jk}} = S_{jk},   f_{S_{jk}} = 0,   (5a)
F^i_{ξ^j} = Y^{ijkl} S_{kl},   S_{ξ^j} = 0,   f_{ξ^j} = f^j_ext(x).   (5b)

Again, Eq. (5a) is the definition of the auxiliary variable and Eq. (5b) is the elasticity equation (4). The fluxes and sources for the elasticity system (5) have higher rank than those for the Poisson system (3). The first-order flux form (1) also accommodates equations formulated on a curved manifold equipped with a metric g_ij(x). Such equations typically involve covariant derivatives ∇_i compatible with g_ij. To formulate the equations in flux form (1) we expand covariant derivatives in partial derivatives and Christoffel symbols,

Γ^i_{jk} = ½ g^{il} (∂_j g_{kl} + ∂_k g_{jl} − ∂_l g_{jk}).

Christoffel symbols also appear when formulating equations in curvilinear coordinates. In our scheme, the terms with partial derivatives are assigned to the fluxes F^i and the terms with Christoffel symbols are assigned to the sources S. For example, a curved-space Poisson equation with auxiliary variable v_i = ∇_i u can be formulated with the fluxes and sources

F^i_{v_j} = u δ^i_j,   S_{v_j} = v_j,   f_{v_j} = 0,   (6a)
F^i_u = g^{ij} v_j,   S_u = −Γ^i_{ij} g^{jk} v_k,   f_u = f(x).   (6b)

Our strategy of expanding covariant derivatives differs from the formulations employed for relativistic hyperbolic conservation laws in Ref. [12], where fluxes are always vector fields and therefore the covariant divergence can always be written in terms of partial derivatives and the metric determinant. In contrast, fluxes in the elliptic equations (1) can be higher-rank tensor fields, as exemplified in Eq. (5).
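The expansion of covariant derivatives into partial derivatives plus Christoffel-symbol source terms can be illustrated numerically. The following hedged sketch computes Christoffel symbols for an example metric (the round 2-sphere) by finite differences; the code and the finite-difference approach are illustrative, not the paper's implementation:

```python
import numpy as np

# Hedged sketch: Christoffel symbols
#   Gamma^i_{jk} = 1/2 g^{il} (d_j g_{kl} + d_k g_{jl} - d_l g_{jk})
# computed by central finite differences of a user-supplied metric function.

def christoffel(metric, x, eps=1e-6):
    """Gamma[i, j, k] at point x, where metric(x) -> (dim, dim) array."""
    dim = len(x)
    ginv = np.linalg.inv(metric(x))
    # dg[l, j, k] = d_l g_{jk} by central finite differences
    dg = np.zeros((dim, dim, dim))
    for l in range(dim):
        xp, xm = x.copy(), x.copy()
        xp[l] += eps
        xm[l] -= eps
        dg[l] = (metric(xp) - metric(xm)) / (2 * eps)
    Gamma = np.zeros((dim, dim, dim))
    for i in range(dim):
        for j in range(dim):
            for k in range(dim):
                Gamma[i, j, k] = 0.5 * sum(
                    ginv[i, l] * (dg[j, k, l] + dg[k, j, l] - dg[l, j, k])
                    for l in range(dim))
    return Gamma

def sphere_metric(x):
    """Round 2-sphere metric g = diag(1, sin^2 theta) in (theta, phi)."""
    return np.diag([1.0, np.sin(x[0])**2])

Gamma = christoffel(sphere_metric, np.array([0.7, 0.0]))
# Known closed form: Gamma^theta_{phi phi} = -sin(theta) cos(theta)
assert np.isclose(Gamma[0, 1, 1], -np.sin(0.7) * np.cos(0.7), atol=1e-5)
```

In the unified scheme such Christoffel terms would be folded into the sources S, while the partial-derivative terms remain in the fluxes F^i.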
The fixed sources f α (x) could, in principle, be absorbed in the sources S α . However, it is useful to keep these variable-independent contributions separate for two reasons. First, they remain constant throughout an elliptic solve, so they need not be recomputed when the dynamic variables change. Second, the constant contributions represent a nonlinearity in the variables u A and v A when included in the sources S α . Assigning the constant contributions to the fixed sources f α eliminates this particular nonlinearity, hence allowing us to avoid an explicit linearization procedure if the remaining sources S α are linear.
The Appendix lists fluxes and sources for selected elliptic problems. Our focus on systems in generic first-order flux form allows us to solve a variety of elliptic systems by only implementing their fluxes and sources. We now proceed to discretize this generic formulation.

III. DG DISCRETIZATION OF THE FLUX FORMULATION
In this section we develop the unified DG scheme for elliptic equations in flux form, Eq. (1). Novel features of our scheme are the formulation of DG residuals and boundary conditions in terms of generic fluxes and sources of arbitrary tensor rank (Sections III B and III E), and the generalized internal-penalty numerical flux (Section III D). The Schur-complement strategy of eliminating auxiliary degrees of freedom has been employed before, e.g., in Ref. [28], but we generalize it to a larger class of equations, including equations with nonlinear fluxes or sources (Section III C). We follow Ref. [12] whenever possible and refer to Ref. [2] for details that have become standard in the DG literature.

A. Domain decomposition
We adopt the same domain decomposition based on deformed cubes detailed in Refs. [8,12,13] and summarize it here.
A d-dimensional computational domain Ω ⊂ R^d is composed of elements Ω_k ⊂ Ω such that Ω = ⋃_k Ω_k. Elements do not overlap, but they share boundaries, as illustrated in Fig. 1a. Each element carries an invertible map ξ(x) from the coordinates x ∈ Ω_k, in which the elliptic equations (1) are formulated, to logical coordinates ξ ∈ [−1, 1]^d representing a d-dimensional reference cube. Inversely, x(ξ) maps the reference cube to the element Ω_k. We define its Jacobian as

J^i_j = ∂x^i / ∂ξ^j,   (7)

with determinant J and inverse (J^{−1})^j_i = ∂ξ^j / ∂x^i. Within each element Ω_k we choose a set of N_{k,i} grid points in every dimension i. We place them at logical coordinates ξ_{p_i}, where the index p_i ∈ {1, ..., N_{k,i}} identifies the grid point along dimension i. The points are laid out in a regular grid along the logical coordinate axes, so an element has a total of

N_{k,points} = ∏_{i=1}^d N_{k,i}   (8)

grid points. The grid points need not be distributed uniformly in logical coordinates. Instead, we choose Legendre-Gauss-Lobatto (LGL) collocation points, i.e., the points ξ_{p_i} fall at the roots of the derivative of the (N_{k,i} − 1)th Legendre polynomial, plus a point on each side of the element, at −1 and 1. It is equally possible to choose Legendre-Gauss (LG) collocation points, i.e., the roots of the N_{k,i}th Legendre polynomial. Figure 1b illustrates the geometry of an example element with N_{k,ξ} = 3 and N_{k,η} = 4 LGL collocation points along ξ and η, respectively; each grid point is labeled with its index (p_ξ, p_η), and the dotted line connects points in the order they are enumerated in by the index p. Fields are represented numerically by their values at the grid points. To facilitate this we construct the one-dimensional Lagrange polynomials

ℓ_{p_i}(ξ) = ∏_{q_i ≠ p_i} (ξ − ξ_{q_i}) / (ξ_{p_i} − ξ_{q_i}),   (9)

and employ their product to define the d-dimensional basis functions

ψ_p(ξ) = ∏_{i=1}^d ℓ_{p_i}(ξ^i).   (10)

The choice of Lagrange polynomials makes Eq. (10) a nodal basis with the useful property ψ_p(ξ_q) = δ_pq. We use the nodal basis (10) to approximate any field u(x) within an element Ω_k by its discretization

u(x) ≈ Σ_{p=1}^{N_{k,points}} u_p ψ_p(ξ(x)),   (11)

where the coefficients u_p = u(x(ξ_p)) are the field values at the grid points. We denote the set of discrete field values within an element Ω_k as

u_k := {u_p : p = 1, ..., N_{k,points}},   (12)

and the collection of discrete field values over all elements as u. The discretization (11) approximates fields with polynomials of degree (N_{k,i} − 1) in dimension i. Although rarely needed, field values at other points within an element can be obtained by Lagrange interpolation (11).
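The LGL collocation points and the nodal (Lagrange) interpolation can be sketched in a few lines of Python; a hedged illustration, not the paper's implementation:

```python
import numpy as np

# Hedged sketch: Legendre-Gauss-Lobatto collocation points on [-1, 1] and
# nodal Lagrange interpolation through them. Interior LGL points are the
# roots of P'_{N-1}; the endpoints -1 and 1 are always included.

def lgl_points(N):
    if N == 2:
        return np.array([-1.0, 1.0])
    interior = np.polynomial.legendre.Legendre.basis(N - 1).deriv().roots()
    return np.concatenate(([-1.0], np.sort(interior), [1.0]))

def lagrange_interp(nodes, values, x):
    """Evaluate the interpolating polynomial through (nodes, values) at x."""
    result = 0.0
    for p, xp in enumerate(nodes):
        ell = np.prod([(x - xq) / (xp - xq)
                       for q, xq in enumerate(nodes) if q != p])
        result += values[p] * ell
    return result

nodes = lgl_points(4)          # 4-point LGL grid on [-1, 1]
vals = nodes**3                # sample a cubic at the nodes
# A 4-point nodal basis represents cubics exactly:
assert abs(lagrange_interp(nodes, vals, 0.3) - 0.3**3) < 1e-12
```

The interpolation is exact for polynomials up to degree N − 1, matching the per-dimension polynomial degree of the discretization (11).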
The field values at element boundaries are double valued because the Lagrange interpolation from neighboring elements to their shared boundary is double valued. Therefore, field approximations will in general be discontinuous at element boundaries. The test problems in Section IV illustrate a few examples of domain decompositions. We refer the reader to, e.g., Ref. [2] for further details on the choice of collocation points, basis functions and their relation to spectral properties of DG schemes.

B. DG residuals
The DG residuals represent the set of equations to be solved for the discrete primal field values u A . The derivation in this section follows the standard procedure, e.g., laid out in Ref. [2], applied to the generic elliptic flux formulation (1), and taking details such as a curved manifold into account.
In the spirit of a Galerkin scheme we project our target PDEs (1) onto the same set of basis functions ψ_p(ξ) that is used to approximate fields within an element Ω_k,

(ψ_p, −∂_i F^i + S)_{Ω_k} = (ψ_p, f)_{Ω_k}.   (13)

Here we dropped the index α that enumerates the equations, and we define the inner product on Ω_k,

(φ, π)_{Ω_k} = ∫_{Ω_k} φ π √g d^d x.   (14)

These integrals are defined with respect to the proper volume element √g d^d x, where g denotes the metric determinant in the coordinates x in which Eq. (1) is formulated. It refers to the metric that covariant derivatives in the equations are compatible with. Since the basis polynomials, Eq. (10), are functions of logical coordinates, we abbreviate ψ_p(ξ(x)) with ψ_p(x) here. The terms without derivatives in Eq. (13) are straightforward to discretize. We approximate the field f, or similarly S, using the expansion in basis functions (11) to find

(ψ_p, f)_{Ω_k} ≈ Σ_q M_pq f_q,   (15)

using the symmetric mass matrix on the element Ω_k,

M_pq = (ψ_p, ψ_q)_{Ω_k}.   (16)

We will discuss strategies to evaluate the mass matrix on the elements of the computational domain in Section III F. The divergence term in Eq. (13) encodes the principal part of the elliptic PDEs and requires more care in its discretization. The derivatives in this term will help us couple grid points across element boundaries. To this end we integrate by parts to obtain a boundary term,

(ψ_p, −∂_i F^i)_{Ω_k} = (∂_i ψ_p, F^i)_{Ω_k} − ∮_{∂Ω_k} dΣ ψ_p n_i F^i,   (17)

where n_i is the outward-pointing unit normal one-form on the element boundary ∂Ω_k. The unnormalized face normal is computed from the Jacobian as ñ_i = sgn(ξ^j)(J^{−1})^j_i, where ξ^j is the logical coordinate that is constant on the particular face and no sum over j is implied. The face normal is normalized as n_i = ñ_i / √(ñ_k ñ_l g^{kl}) using the inverse metric g^{ij}(x). The surface integral in Eq. (17) is defined just like Eq. (14), using the element boundary's proper surface element,

dΣ = √(g_Σ) J_Σ d^{d−1}ξ,   (18)

where g_Σ is the surface metric determinant induced by the metric g_ij and J_Σ is the surface Jacobian.
The crucial step that couples grid points across element boundaries follows from the field n_i F^i being double valued on any section of the boundary that an element shares with a neighbor, with one value arising from either side. We must make a choice of how to combine the two values from either side of a shared element boundary. This choice is often referred to as a numerical flux. For now we will denote the function that combines values from both sides of a boundary as (n_i F^i)* and refer to Section III D for details on our particular choice of numerical flux. Substituting the numerical flux in Eq. (17) yields the weak form of the equations,

(∂_i ψ_p, F^i)_{Ω_k} − ∮_{∂Ω_k} dΣ ψ_p (n_i F^i)* + (ψ_p, S)_{Ω_k} = (ψ_p, f)_{Ω_k}.   (19)

The numerical flux in Eq. (19) introduces a coupling between neighboring elements that allows us to obtain numerical solutions spanning the full computational domain. Another integration by parts of Eq. (19) yields the strong form of the equations,

(ψ_p, −∂_i F^i + S)_{Ω_k} − ∮_{∂Ω_k} dΣ ψ_p [(n_i F^i)* − n_i F^i] = (ψ_p, f)_{Ω_k}.   (20)

We will make use of both the strong and the weak form to obtain symmetric DG operators (see Section III I). Approximating F^i using its expansion in basis functions (11) we find

(ψ_p, −∂_i F^i)_{Ω_k} ≈ −Σ_q MD_{i,pq} F^i_q,   (21)

where the stiffness matrix on the element Ω_k is

MD_{i,pq} = (ψ_p, ∂_i ψ_q)_{Ω_k}.   (22)

The divergence term in its weak form can be expressed in terms of the stiffness-matrix transpose, MD^T_{i,pq} = MD_{i,qp},

(∂_i ψ_p, F^i)_{Ω_k} ≈ Σ_q MD^T_{i,pq} F^i_q.   (23)

Evaluation of the stiffness matrix and its transpose is discussed in Section III F. We now turn toward discretizing the last remaining piece of the DG residuals, the boundary integrals in Eqs. (19) and (20). These involve a "lifting" operation: the integral only depends on field values on the element boundary, but it may contribute to every component p of the DG residual, hence it is "lifted" to the volume. However, on an LGL grid all components p that correspond to grid points away from the boundary evaluate to zero, because they contain at least one Lagrange polynomial that vanishes at the boundary collocation point.
This is not the case on an LG grid, where evaluating the Lagrange polynomials on the boundary produces an interpolation into the volume. Expanding the boundary fluxes in basis functions (11) we find

∮_{∂Ω_k} dΣ ψ_p n_i F^i ≈ Σ_q ML_pq (n_i F^i)_q,   (24)

where we have defined the lifting operator on the element Ω_k,

ML_pq = ∮_{∂Ω_k} dΣ ψ_p ψ_q.   (25)

Section III F provides details on evaluating the lifting operator.
Assembling the pieces of the discretization and restoring the index α that enumerates the equations, the DG residuals on the element Ω_k in strong form are

−MD_i · F^i_α + M · S_α − ML · [(n_i F^i_α)* − n_i F^i_α] = M · f_α,   (26a)

where · denotes a matrix multiplication with the field values over the computational grid of an element. The DG residuals in weak form are

MD^T_i · F^i_α − ML · (n_i F^i_α)* + M · S_α = M · f_α.   (26b)

We can choose either the strong or the weak form for each variable α.

[Fig. 2: Matrix representation of the noncompact DG operator in strong form (26a) for a two-dimensional Poisson equation (3). The computational domain is partitioned into 2 × 2 elements with 6 × 6 LGL grid points each. The image shows the nonzero entries of the operator matrix, i.e., its sparsity pattern, in the order laid out in Fig. 1b.]
Since the fluxes and sources are computed from the primal and auxiliary variables, the DG residuals (26) are algebraic equations for the discrete values u_A and v_A on all elements and grid points in the computational domain. The left-hand side of Eq. (26) is an operator A(u_A, v_A) applied to the variables, and the right-hand side of Eq. (26) is a fixed value at every grid point, so Eq. (26) has the structure

A(u_A, v_A) = b.   (27)

If the fluxes and sources are linear, the DG operator A(u_A, v_A) can be represented as a square matrix, and Eq. (27) is a matrix equation. The size of the DG operator A(u_A, v_A) is the product of N_points with the number of both primal and auxiliary variables. Figure 2 presents a visualization of the DG operator A(u_A, v_A) for a Poisson equation on a regular grid. The axes annotate entries of the operator that correspond to the "input" variables v_i and u, and to the corresponding "output" DG residuals. The mass matrix applied to v_i appears as a diagonal line (see Section III F), and the stiffness matrices applied to both v_i and u appear as block-diagonal and shaded regions for derivatives in x and y, respectively. The remaining entries represent the coupling between neighboring elements through the numerical flux (see Section III D). Note that the elements Ω_1 and Ω_4 as well as Ω_2 and Ω_3 decouple, because they share no boundaries as they are placed diagonally across the 2 × 2 grid of elements. Solving the Poisson equation amounts to inverting the matrix pictured in Fig. 2. However, it is significantly cheaper to invert the equivalent compact operator pictured in Fig. 3, which we derive in the following section.
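A matrix representation like the one visualized in Fig. 2 can be obtained from any linear operator by probing it with unit vectors. The sketch below does this for a simple stand-in operator (a 1D second-difference stencil), not the full DG operator; the names are illustrative:

```python
import numpy as np

# Hedged sketch: build the explicit matrix of a linear operator A(u) by
# applying it to unit vectors, as one would to inspect the sparsity
# pattern of a DG operator. The second-difference stencil below is only
# a stand-in for the actual discretized elliptic operator.

def operator_matrix(apply_op, n):
    """Column j of the matrix is A(e_j)."""
    A = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = 1.0
        A[:, j] = apply_op(e)
    return A

def second_difference(u):
    """Stand-in linear operator: discrete -u'' with homogeneous ends."""
    r = 2.0 * u.copy()
    r[1:] -= u[:-1]    # subdiagonal coupling
    r[:-1] -= u[1:]    # superdiagonal coupling
    return r

A = operator_matrix(second_difference, 5)
assert np.allclose(A, A.T)         # the stand-in operator is symmetric
assert np.count_nonzero(A) == 13   # tridiagonal sparsity: 5 + 2 * 4
```

Inspecting `np.count_nonzero(A)` for the true DG operator would reveal exactly the nearest-neighbor coupling structure discussed in the text.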

C. Eliminating auxiliary degrees of freedom
So far we have treated the primal and the auxiliary equations of the first-order formulation on the same footing, which means the discretized DG operator applies to the primal variables as well as to the auxiliary variables. However, the auxiliary equations inflate the size of the operator significantly, increasing both its memory usage and the computational cost for solving it. In this section we eliminate the auxiliary degrees of freedom from the DG operator, demoting them to quantities that are only computed temporarily.
Many publications on DG formulations adopt a "primal formulation" to eliminate auxiliary degrees of freedom from the DG operator. However, in practice we have found a simpler approach, taking a Schur complement of the discretized equations in flux form as applied, e.g., in Ref. [28], more suited to the generic implementation of DG schemes. The resulting DG operator remains equivalent to the original operator; i.e., it has the same solutions up to numerical precision. This strategy is facilitated by the auxiliary equations defining the auxiliary variables v_A, such as Eqs. (3a) and (5a). We assume here that the auxiliary fluxes depend only on the primal variables,

F^i_{v_A} = F^i_{v_A}[u_A; x],   (28a)

and the auxiliary sources have the form

S_{v_A} = v_A + S̃_{v_A}[u_A; x],   (28b)

where S̃_{v_A} depends only on the primal variables. We further assume f_{v_A} = 0 for convenience. All elliptic systems that we consider in this article fulfill these assumptions. We insert Eq. (28) into the strong DG residuals (26a) and solve for v_A by inverting the mass matrix to find

v_A = D_i · F^i_{v_A} − S̃_{v_A} + L · [(n_i F^i_{v_A})* − n_i F^i_{v_A}],   (29)

where we define D_i := M^{−1} MD_i and L := M^{−1} ML. Note that the right-hand side of Eq. (29) depends only on the primal variables u_A. Evaluating the DG residuals now amounts to first computing the auxiliary variables v_A by Eq. (29), and then using them to evaluate the primal equations. This approach preserves the freedom to use either the strong form (26a) for the primal equations,

−MD_i · F^i_{u_A} + M · S_{u_A} − ML · [(n_i F^i_{u_A})* − n_i F^i_{u_A}] = M · f_{u_A},   (30a)

or the weak form (26b),

MD^T_i · F^i_{u_A} − ML · (n_i F^i_{u_A})* + M · S_{u_A} = M · f_{u_A}.   (30b)

The strong scheme is slightly easier to implement because the primal and auxiliary equations involve the same set of operators. The strong-weak scheme, i.e., selecting the strong form for the auxiliary equations (29) and the weak form for the primal equations (30b), has the advantage that the DG operator can be symmetric, as discussed in Section III I. For linear equations the strategy employed in Eq.
(29) of eliminating the auxiliary variables is equivalent to taking a Schur complement of the DG operator with respect to the (invertible) mass matrix, but the strategy works for nonlinear equations as well. The result is an operator A(u_A) of only the discrete primal variables on all elements and grid points in the computational domain, so the DG residuals (30) have the form

A(u_A) = b.   (31)

[Fig. 4: Matrix representation of the compact DG operator for an elasticity equation (5) with an isotropic-homogeneous constitutive relation Y^{ijkl}. The computational domain is again partitioned into 2 × 2 elements with 6 × 6 LGL grid points each.]
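For linear problems, the elimination of auxiliary variables is a standard block Schur complement. A minimal numpy illustration with random stand-in blocks (not the actual DG matrices) shows that solving the reduced system reproduces the primal solution of the full system:

```python
import numpy as np

# Hedged sketch: eliminating auxiliary variables from a linear block system
#   [ M  B ] [v]   [0]
#   [ C  D ] [u] = [b]
# via the Schur complement with respect to the invertible "mass" block M:
#   v = -M^{-1} B u   =>   (D - C M^{-1} B) u = b.
# The blocks are random stand-ins, not the actual DG operator blocks.

rng = np.random.default_rng(0)
n = 6
M = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # invertible mass block
B = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))
D = 4.0 * np.eye(n)
b = rng.standard_normal(n)

# Full solve over primal and auxiliary degrees of freedom:
full = np.block([[M, B], [C, D]])
vu = np.linalg.solve(full, np.concatenate([np.zeros(n), b]))
u_full = vu[n:]

# Compact solve: invert only the Schur complement acting on u:
S = D - C @ np.linalg.solve(M, B)
u_compact = np.linalg.solve(S, b)
assert np.allclose(u_full, u_compact)
```

The compact system has half the size of the full one here; for the DG operator the savings grow with the number of auxiliary variables eliminated.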
The size of the DG operator A(u_A) is the product of N_points with the number of primal variables. No auxiliary degrees of freedom from the first-order formulation inflate the size of the operator. We refer to such DG operators A(u_A) of only the primal degrees of freedom as compact if they also only involve couplings between nearest-neighbor elements [30]. The coupling between elements is related to the choice of numerical flux (n_i F^i_α)*, which is the subject of Section III D. If the fluxes and sources are linear, A(u_A) can be represented as a square matrix.
Figures 3 and 4 present visualizations of A for a Poisson equation and an elasticity equation, respectively. The block-diagonal structure in Fig. 3 represents the DG-discretized two-dimensional Laplacian on the four elements of the computational domain. The entries that break the block-diagonal structure represent the coupling between nearest-neighbor elements through the numerical flux (Section III D).

D. A generalized internal-penalty numerical flux
Up to this point we have not made a choice for the numerical flux (n_i F^i)* that combines double-valued field values on element boundaries. The numerical flux is a function of the field values on both sides of the boundary. From the perspective of one of the two adjacent elements we refer to the field values on itself as interior and to the field values on the neighboring element as exterior. Contrary to much of the DG literature we formulate the numerical flux entirely in terms of the primal and auxiliary boundary flux quantities n_i F^i_{u_A} and n_i F^i_{v_A} on either side of the boundary, instead of the primal and auxiliary variables u_A and v_A. This choice keeps our scheme applicable to the wide range of elliptic problems defined in Section II. The numerical flux presented here is a generalization of the symmetric internal penalty (SIP) scheme that is widely used in the literature [2,6,8,31]. Our generalized internal-penalty numerical flux is

(n_i F^i_{v_A})* = ½ [(n_i F^i_{v_A})^int − (n_i F^i_{v_A})^ext],   (32a)

(n_i F^i_{u_A})* = ½ [(n_i F^i_{u_A})^int − (n_i F^i_{u_A})^ext] − σ n^int_i F^i_{u_A}[(n_j F^j_{v_A})^int + (n_j F^j_{v_A})^ext],   (32b)

where all fluxes are evaluated from the primal fields and their derivatives on the respective side of the boundary. The numerical flux for the auxiliary equations, Eq. (32a), averages the boundary fluxes of the two adjacent elements. The numerical flux for the primal equations, Eq. (32b), is an average augmented with a penalty contribution with parameter σ. Note that the numerical flux (32) involves only the primal fields and their derivatives, and thus is independent of the auxiliary fields altogether, as is typical for internal-penalty schemes. This has the practical advantage that the contribution from either side of the boundary to both the primal and the auxiliary numerical flux in Eqs. (29) and (30) can be computed early in the algorithm and communicated together, coupling only nearest-neighbor elements and thus making the DG operator compact.
If the primal numerical flux (32b) depended on the auxiliary fields, evaluating the DG operator (30) would require a separate communication once the boundary corrections have been added to the auxiliary equation (29), effectively coupling nearest-neighbor elements as well as next-to-nearest-neighbor elements.

DG literature usually assumes that the face normals on either side of the boundary are exactly opposite, n^ext_i = −n^int_i. This assumption breaks when the background geometry responsible for the normalization of face normals depends on the dynamic variables, since those are discontinuous across the boundary. All of the elliptic problems that we are expecting to solve in the near future are formulated on a fixed background geometry, but it is useful to distinguish between the interior and exterior face normals nonetheless, because the quantity n_i F^i is cheaper to communicate than F^i. Therefore, we always project an element's boundary fluxes onto the face normal before communicating the quantity.
For a simple flat-space Poisson system (3) our generalized internal-penalty numerical flux (32) reduces to the canonical SIP,

(n_i F^i_u)* = ½ n^i (∂_i u^int + ∂_i u^ext) − σ (u^int − u^ext).   (33)

As is well studied for the canonical SIP numerical flux, the penalty factor σ is responsible for removing zero eigenmodes and impacts the conditioning of the linear operator to be solved [2,32]. It scales inversely with the element size h and quadratically with the polynomial degree p, both orthogonal to the element face. Both h and p can be different on either side of the element boundary, so we choose

σ = C max(p^int, p^ext)² / min(h^int, h^ext),   (34)

where we follow Ref. [33] in choosing the scaling with the polynomial degree p on hexahedral meshes, and we follow Ref. [8] in our definition of the element size h = 2/√(ñ_i ñ_j g^{ij}). Note that h generally varies over the element face on curved meshes or on a curved manifold, and that the min operation in Eq. (34) is taken pointwise, so σ also varies over the element face. The remaining penalty parameter C ≥ 1 remains freely specifiable. Note also that we do not need to include a problem-specific scale in the penalty factor, as is done in Refs. [25][26][27], because the generic numerical flux (32b) already includes such scales in the fluxes F^i.
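The penalty factor's scaling can be sketched in a few lines. The min/max combination of the two sides assumed below follows the text's description (σ scaling with p² and 1/h, with the min taken pointwise over the face); it is an illustrative choice, not a quotation of the paper's exact formula:

```python
import numpy as np

# Hedged sketch: penalty factor sigma ~ C p^2 / h, combining polynomial
# degree p and element size h from both sides of a face. On curved meshes
# h may vary over the face, so it is passed as an array and the min is
# taken pointwise.

def penalty(C, p_int, p_ext, h_int, h_ext):
    """Pointwise penalty factor on an element face."""
    p = max(p_int, p_ext)
    h = np.minimum(h_int, h_ext)   # pointwise over the face grid points
    return C * p**2 / h

# Larger polynomial degree or smaller elements increase the penalty:
sigma = penalty(C=1.5, p_int=3, p_ext=4,
                h_int=np.array([0.5, 0.4]),
                h_ext=np.array([0.6, 0.3]))
assert np.allclose(sigma, [1.5 * 16 / 0.5, 1.5 * 16 / 0.3])
```

Choosing the penalty too small leaves zero eigenmodes in the operator, while choosing it too large degrades the conditioning, which is why the scaling with p and h matters in practice.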

E. Imposing boundary conditions
The flux formulation allows imposing a wide range of boundary conditions relatively easily "through the fluxes," without the need to treat external boundaries any differently than internal boundaries between neighboring elements. Imposing boundary conditions amounts to specifying the exterior quantities in the numerical flux, Eq. (32). This strategy is often referred to as imposing boundary conditions through "ghost" elements.

[7] See Eq. (3.21) in Ref. [6] or Section 7.2 in Ref. [2]. [8] See Ref. [32] for sharp results for the optimal penalty factor on triangular and tetrahedral meshes, and Table 3.1 in Ref. [33] for a generalization to hexahedral meshes. [9] Note that Ref. [8] omits the factor of 2 in the definition of h, which we include so the definition reduces to the canonical element size on rectilinear meshes.

As suggested in, e.g., Ref. [2], we impose boundary conditions on
the average of the boundary fluxes to obtain faster convergence. Therefore, on external boundaries, we choose for the exterior quantities in the numerical flux (32)

(n_i F^i_α)^ext = 2 (n_i F^i_α)^b − (n_i F^i_α)^int,   (35)

where we set the quantities (n_i F^i_α)^b according to the boundary conditions at hand. Here we define n^b_i = n^int_i, i.e., we choose to compute external boundary fluxes with the face normal pointing out of the computational domain. The symmetry between the primal and the auxiliary equations in the first-order flux formulation (1) that we employ throughout this article makes this approach of imposing boundary conditions particularly straightforward: a choice of auxiliary boundary fluxes (n_i F^i_{v_A})^b imposes Dirichlet-type boundary conditions, and a choice of primal boundary fluxes (n_i F^i_{u_A})^b imposes Neumann-type boundary conditions. The choice between Dirichlet-type and Neumann-type boundary conditions can be made separately for every primal variable u_A and every external boundary face, simply by setting either (n_i F^i_{v_A})^b or (n_i F^i_{u_A})^b and setting the remaining boundary fluxes to their interior values, (n_i F^i)^b = (n_i F^i)^int. Note that we neither need to distinguish between primal and auxiliary variables in Eq. (35), nor take the choice of Dirichlet-type or Neumann-type boundary conditions into account, but we require only that (n_i F^i)^b be chosen appropriately for every variable. Then, the Neumann-type boundary conditions enter the DG residuals directly through the numerical flux in Eq. (30), and the Dirichlet-type boundary conditions enter the DG residuals through the numerical flux in Eq. (29), which is substituted in Eq. (30).
In practice, this setup means we can initialize (n_i F^i_α)^b = (n_i F^i_α)^int for all variables on a particular external boundary face when preparing to apply the numerical flux, decide which boundary fluxes to modify based on the boundary conditions we wish to impose on the particular face, and then evaluate Eq. (35) to compute the exterior quantities in the numerical flux (32). To impose Neumann-type boundary conditions we set the primal boundary fluxes (n_i F^i_{u_A})^b directly, but to impose Dirichlet-type boundary conditions we typically choose the primal field values u^b_A and compute the auxiliary boundary fluxes as (n_i F^i_{v_A})^b = n_i F^i_{v_A}[u^b_A]. The auxiliary (Dirichlet-type) external boundary fluxes may depend on the interior primal fields u^int_A, and the primal (Neumann-type) external boundary fluxes may depend on both the interior primal fields u^int_A as well as the interior auxiliary fields v^int_A. This means we can impose a wide range of boundary conditions that may depend linearly or nonlinearly on the dynamic fields. For example, a Robin boundary condition for the Poisson equation (2) or (6),

a u + b n^i ∂_i u = g(x),   (36)

where a and b are constants and g(x) is a function defined on the boundary, can be implemented as Neumann-type for b ≠ 0,

(n_i F^i_u)^b = (g − a u^int) / b,   (37)

and as Dirichlet-type for b = 0,

u^b = g / a.   (38)

An important consideration is that boundary conditions are generally nonlinear. Even for linear PDEs, such as the Poisson equation, a simple inhomogeneous Dirichlet boundary condition u^b ≠ 0 introduces a nonlinearity in the DG operator because A(0) ≠ 0. Therefore, we always linearize boundary conditions. For nonlinear equations the boundary conditions linearize along with the DG operator and require no special attention (see Section III J). However, for linear equations the inhomogeneity in the boundary conditions is the only nonlinearity present in the DG operator, so we skip the full linearization procedure.
Instead, we contribute the inhomogeneous boundary conditions $A(0)$ to the fixed sources, leaving only the linearized boundary conditions in the DG operator: we solve $\frac{\delta A}{\delta u}\, u = b - A(0)$, where $\frac{\delta A}{\delta u}$ is just $A$ with linearized boundary conditions. Note that this strategy is equivalent to the full linearization procedure described in Section III J at $u = 0$. In practice, evaluating $A(0)$ simplifies significantly for linear equations because only the lifted external boundary corrections contribute to it.
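To make the flux-based prescription above concrete, here is a minimal sketch in plain Python (not SpECTRE code; the function and variable names are hypothetical). It assumes the standard first-order Poisson fluxes, where the auxiliary flux is $n_j u$ and the primal flux is $n_i v^i$: both boundary fluxes are initialized to their interior values, and then one of them is overwritten depending on the type of boundary condition.

```python
import numpy as np

def boundary_fluxes(n, u_int, v_int, condition, value):
    """Choose exterior boundary fluxes on one external face of a Poisson
    problem. The auxiliary flux is n_j * u and the primal flux is n_i v^i.
    We initialize both to their interior values and then overwrite one of
    them according to the boundary condition."""
    aux_flux = n * u_int            # (n_i F^i_v)^int, one entry per dim
    primal_flux = n @ v_int         # (n_i F^i_u)^int, a scalar
    if condition == "dirichlet":    # prescribe the field value u^b
        aux_flux = n * value
    elif condition == "neumann":    # prescribe the normal gradient
        primal_flux = value
    return aux_flux, primal_flux

n = np.array([0.0, 1.0])  # outward-pointing unit face normal
u_int, v_int = 0.7, np.array([0.2, -0.1])

# Homogeneous Dirichlet (u^b = 0) vs. Neumann (n_i v^i = 1.5):
aux_d, primal_d = boundary_fluxes(n, u_int, v_int, "dirichlet", 0.0)
aux_n, primal_n = boundary_fluxes(n, u_int, v_int, "neumann", 1.5)
```

The same pattern generalizes to systems of variables, where the choice is made independently for every primal variable on every external face.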

F. Evaluating the mass, stiffness, and lifting matrices
The mass matrix (16), the stiffness matrix (22), and the lifting operator (25) are integrals that must be evaluated on every element of the computational domain. We evaluate these integrals on the same grid on which we expand the dynamic fields, which amounts to a Gauss-Lobatto quadrature of an order set by the number of collocation points in the element. This strategy is commonly known as mass lumping.10 Employing mass lumping and our choice of nodal basis (10), the mass matrix (16) evaluates to the diagonal approximation

$M_{pq} \approx \delta_{pq} \left( \prod_{i=1}^{d} w_{p_i} \right) \sqrt{g}\, J \,\big|_{\xi_p} . \quad (40)$

Here the coefficients $w_{p_i}$ denote the Legendre-Gauss-Lobatto quadrature weights associated with the collocation points $\xi_{p_i}$, and the geometric quantities $\sqrt{g}$ and $J$ are evaluated directly on the collocation points.11 Recall from Section III A that the index $p$ enumerates grid points in the regular $d$-dimensional grid and that $p_i$ denotes the grid point's index along the $i$th dimension. The diagonal mass-lumping approximation (40) has the advantage that it is computationally efficient to apply, invert, and store, since it amounts to a pointwise multiplication over the computational grid. Note that Eq. (40) is exact on a rectilinear LG grid with a flat background geometry, and can be made exact on rectilinear LGL grids by including a correction term without increasing the computational cost for applying or inverting it [34]. The quadrature weights $w_{p_i}$ can be cached and reused by all elements with the same number of collocation points in a dimension.
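The diagonal structure of the lumped mass matrix can be sketched in a few lines of numpy. This is an illustrative sketch, not SpECTRE code; the 1D LGL nodes and weights for four collocation points are hard-coded, and a flat geometry with $\sqrt{g}\,J = 1$ is assumed:

```python
import numpy as np

# 1D Legendre-Gauss-Lobatto nodes and weights for 4 collocation points
# (degree-3 basis): nodes +-1 and +-1/sqrt(5); weights 1/6 and 5/6.
nodes = np.array([-1.0, -1.0 / np.sqrt(5.0), 1.0 / np.sqrt(5.0), 1.0])
weights = np.array([1.0 / 6.0, 5.0 / 6.0, 5.0 / 6.0, 1.0 / 6.0])

def lumped_mass_diagonal(weights_1d, sqrt_g_times_J):
    """Diagonal of the mass-lumped mass matrix on a 2D tensor-product
    grid: each entry is the product of the 1D quadrature weights along
    both dimensions, times sqrt(g)*J evaluated at that grid point."""
    w2d = np.outer(weights_1d, weights_1d)  # w_{p1} * w_{p2}
    return (w2d * sqrt_g_times_J).ravel()

# Flat geometry on the reference element: sqrt(g)*J = 1 at every point.
diag = lumped_mass_diagonal(weights, np.ones((4, 4)))

# Applying or inverting the mass matrix is a pointwise multiplication:
u = np.random.default_rng(0).normal(size=16)
massive_u = diag * u
recovered = massive_u / diag
```

Since the diagonal entries integrate the constant function, their sum reproduces the reference-element volume.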
The strong stiffness matrix (22) is built from the one-dimensional "logical" differentiation matrices $\ell'_{q_j}(\xi_{r_j})$, obtained by differentiating the Lagrange polynomials (9) and evaluating them at the collocation points.12 The stiffness matrix is essentially a "massive" differentiation operator that decomposes into one-dimensional differentiation matrices due to the product structure of the basis functions (10) on our hexahedral meshes. The one-dimensional logical differentiation matrices can be cached and reused by all elements with the same number of collocation points in a dimension, keeping the memory cost associated with the stiffness operator to a minimum. The weak stiffness matrix can be computed analogously from the transpose of the logical differentiation matrices. The lifting operator (25) evaluates similarly [Eqs. (42)].

11 See, e.g., Algorithm 25 in Ref. [29] for details on computing Legendre-Gauss-Lobatto quadrature weights, and Algorithm 23 for Legendre-Gauss quadrature weights.

12 See, e.g., Ref. [2] and Algorithm 37 in Ref. [29] for details on computing the one-dimensional logical differentiation matrices $\ell'_{q_j}(\xi_{r_j})$ from properties of Legendre polynomials.
It is diagonal and has a contribution from every face of the element. Note that each face contributes only to the LGL grid points on the respective face. On LG grids additional interpolation matrices from the face into the volume appear in this expression. Also note that the root in Eq. (42b) is simply the magnitude of the unnormalized face normal $\tilde n_j$ [12].
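The decomposition of logical derivatives into one-dimensional differentiation matrices can be sketched with Kronecker products. This is an illustrative numpy sketch under stated assumptions (three LGL points per dimension, row-major grid storage), not SpECTRE code; the "massive" stiffness operator would then follow by multiplying with the diagonal mass matrix:

```python
import numpy as np

# 1D logical differentiation matrix on the 3 LGL points {-1, 0, 1}
# (degree-2 Lagrange basis): D1[r, q] = dl_q/dxi evaluated at xi_r.
D1 = np.array([[-1.5, 2.0, -0.5],
               [-0.5, 0.0, 0.5],
               [0.5, -2.0, 1.5]])
nodes = np.array([-1.0, 0.0, 1.0])

# On a 2D tensor-product grid, derivatives along each logical axis are
# Kronecker products of the 1D matrix with identities (grid values
# stored row-major, axis 0 varying slowest):
I3 = np.eye(3)
D_axis0 = np.kron(D1, I3)
D_axis1 = np.kron(I3, D1)

# Check on u(x, y) = x + 2y, sampled on the tensor-product grid:
x, y = np.meshgrid(nodes, nodes, indexing="ij")
u = (x + 2.0 * y).ravel()
du_dx = D_axis0 @ u   # exact for polynomials: 1 everywhere
du_dy = D_axis1 @ u   # exact for polynomials: 2 everywhere
```

Only the small matrix `D1` needs to be stored and shared between elements; the Kronecker products are never assembled in practice.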
Recall that the objective of these matrices is to evaluate the compact DG operator (30) along with Eq. (29) on every element of the computational domain. In practice, none of these matrices need to be assembled explicitly and stored on the elements: the mass matrix (40) reduces to a multiplication over the computational grid, the stiffness matrix (41) involves logical one-dimensional differentiation matrices that are shared between elements, and the lifting operation (42) reduces to a multiplication over the grid points on the element face. Since both the stiffness and the lifting operation decompose into a mass matrix and a "massless" operation, the same set of operations can be used to evaluate both Eqs. (29) and (30), and the mass matrix factors out of the DG operator entirely. Nevertheless, we will see in Section III I that it is advantageous to keep the mass matrix in the operator (30).

G. A note on dealiasing
The integral expressions discussed in Section III F involve geometric quantities that are typically known analytically, namely the Jacobian and the background metric. Limiting the resolution of the integrals to the quadrature order of the elements can make the scheme susceptible to geometric aliasing because these quantities are resolved with limited precision, potentially reducing the accuracy of the scheme on curved meshes or on a curved manifold. To combat geometric aliasing we can, in principle, precompute the matrices on every element at high accuracy, but at a significant memory cost. Precomputing the matrices is possible in elliptic problems because the geometric quantities remain constant. This is different from time-evolution systems, which often involve time-dependent Jacobians ("moving meshes"). Alternatively, a number of dealiasing techniques are available to combat geometric aliasing, and also to combat aliasing arising from evaluating other background quantities on the collocation points, i.e., quantities in the PDEs that are independent of the dynamic variables and known analytically [35]. For example, Ref. [8] interpolates data from the primary LGL grid to an auxiliary LG grid, on which the Jacobian is evaluated, to take advantage of the higher-order quadrature. However, these dealiasing techniques can significantly increase the computational cost for applying the DG operator. We have chosen to employ the simple mass-lumping scheme detailed in Section III F to minimize the complexity, computational cost, and memory consumption of the DG operator. Detailed studies of cost-efficient dealiasing techniques are a possible subject of future work.

FIG. 5. An element Ω1 faces two elements Ω2 and Ω3 towards the right. Since both Ω1 and Ω2 have four vertical grid points, their shared mortar ∂Ω12 also has four grid points, but covers only a logical half of Ω1 (h nonconforming). The element Ω3 has five vertical grid points, so the mortar ∂Ω13 has max(4, 5) = 5 grid points and also covers only a logical half of Ω1 (hp nonconforming). Elements Ω2 and Ω3 are h conforming but differ in their number of horizontal grid points, so their shared mortar ∂Ω23 has max(4, 5) = 5 grid points (p nonconforming). Note that the empty space between the elements in this visualization is not part of the computational domain.

H. Mesh refinement
The domain decomposition into elements, each with their own set of basis functions, allows for two avenues to control the resolution: we can split the domain into more and smaller elements (h refinement) or increase the number of basis functions within an element (p refinement). We can perform h and p refinement in each dimension independently.
Both h refinement and p refinement can lead to nonconforming boundaries between elements, meaning that grid points on the two sides of the boundary do not coincide. Since we need to work with data from both sides of an element boundary when considering numerical fluxes (see Section III D), we place mortars between elements. A mortar is a $(d-1)$-dimensional mesh that has sufficient resolution to exactly represent discretized fields from both adjacent element faces. Specifically, a mortar $\partial\Omega_{k\bar k}$ between the elements $\Omega_k$ and $\Omega_{\bar k}$ that share a boundary orthogonal to dimension $j$ has $\max(N_{k,i}, N_{\bar k,i})$ grid points in every dimension $i \neq j$. We limit the h refinement of our computational domains such that an element shares its boundary with at most two neighbors per dimension in every direction ("two-to-one balance"). This means a mortar covers either the full element face or a logical half of it in every dimension. Figure 5 illustrates an hp-refined scenario with nonconforming element boundaries.
To project field values from an element face to a mortar we employ the $(d-1)$-dimensional prolongation operator (43), where $p$ enumerates grid points on the coarser (element face) mesh, $\hat p$ enumerates grid points on the finer (mortar) mesh, and $\tilde\xi_{\hat p_i}$ are the coarse-mesh logical coordinates of the fine-mesh collocation points. For mortars that cover the full element face in dimension $i$ the coarse-mesh logical coordinates are just the fine-mesh collocation points, $\tilde\xi_{\hat p_i} = \xi_{\hat p_i}$. For mortars that cover the lower or upper logical half of the element face in dimension $i$ they are $\tilde\xi_{\hat p_i} = (\xi_{\hat p_i} - 1)/2$ or $\tilde\xi_{\hat p_i} = (\xi_{\hat p_i} + 1)/2$, respectively. Note that the prolongation operator (43) is just a Lagrange interpolation from the coarser (element face) mesh to the finer (mortar) mesh. The interpolation retains the accuracy of the polynomial approximation because the mortar has sufficient resolution. The prolongation operator is also an $L_2$ projection (or Galerkin projection) because it minimizes the $L_2$ distance to the original field, where $u^{(k)}$ denotes the prolongated field values on the finer (mortar) mesh.
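The prolongation described above is a plain Lagrange interpolation, which can be sketched as follows (an illustrative numpy sketch, assuming a 1D face with four LGL points and a mortar covering the lower logical half; not SpECTRE code):

```python
import numpy as np

def lagrange_interp_matrix(src_nodes, targets):
    """P[phat, p] = l_p(targets[phat]): Lagrange interpolation from the
    source collocation points to arbitrary target coordinates."""
    P = np.ones((len(targets), len(src_nodes)))
    for p, xp in enumerate(src_nodes):
        for q, xq in enumerate(src_nodes):
            if q != p:
                P[:, p] *= (targets - xq) / (xp - xq)
    return P

# 4 LGL collocation points on the element face:
face_nodes = np.array([-1.0, -1.0 / np.sqrt(5.0), 1.0 / np.sqrt(5.0), 1.0])

# Mortar covering the lower logical half: the coarse-mesh coordinates of
# the mortar's collocation points are (xi - 1) / 2.
mortar_coords = (face_nodes - 1.0) / 2.0
P = lagrange_interp_matrix(face_nodes, mortar_coords)

# Interpolating a polynomial of degree <= 3 is exact:
u_face = face_nodes**3 - 2.0 * face_nodes
u_mortar = P @ u_face
```

Because the mortar has at least the resolution of the face, the interpolation loses no information about the polynomial.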
To project field values from a mortar back to an element face we employ an adjoint $R$ of the prolongation operator such that $R P = \mathbb{1}$. We also refer to this operation as a restriction because it truncates higher modes from the mortar down to the resolution of the element face. Specifically, we employ the mass-conservative adjoint, which in matrix notation reduces to $R = M^{-1} P^T \tilde M$, where $M^{-1}$ is the inverse mass matrix on the coarser (element face) mesh, $\tilde M$ is the mass matrix on the finer (mortar) mesh, and $P^T$ is the transpose of the prolongation operator (43). Note that the $d$-dimensional restriction and prolongation operators can serve not only to project field values to and from mortars, but also to project field values to and from elements that cover the computational domain at different h- and p-refinement levels. We make no use of projections across refinement levels in this article but will do so in upcoming work for the purpose of adaptive mesh-refinement strategies and for multigrid solvers.13

13 See also Sections 3.2 and 3.3 in Ref. [28] for details on the restriction and prolongation operators in the context of multigrid solvers.
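The adjoint identity $R P = \mathbb{1}$ can be verified numerically. The sketch below (illustrative numpy, not SpECTRE code) considers a p-nonconforming case where a four-point face meets a five-point mortar covering the full face, and uses exact (unlumped) mass matrices so that the identity holds to machine precision:

```python
import numpy as np

def lagrange_interp_matrix(src_nodes, targets):
    """P[i, p] = l_p(targets[i]) for the Lagrange basis on src_nodes."""
    P = np.ones((len(targets), len(src_nodes)))
    for p, xp in enumerate(src_nodes):
        for q, xq in enumerate(src_nodes):
            if q != p:
                P[:, p] *= (targets - xq) / (xp - xq)
    return P

def mass_matrix(nodes):
    """Exact mass matrix M_pq = integral of l_p * l_q over [-1, 1],
    via a Gauss quadrature of sufficient order (no mass lumping here,
    so that the adjoint identity R P = 1 holds exactly)."""
    x, w = np.polynomial.legendre.leggauss(len(nodes) + 1)
    L = lagrange_interp_matrix(nodes, x)  # basis evaluated at quad points
    return L.T @ (w[:, None] * L)

# p-nonconforming: a 4-point element face and a 5-point mortar
# covering the full face.
face_nodes = np.array([-1.0, -1.0 / np.sqrt(5.0), 1.0 / np.sqrt(5.0), 1.0])
mortar_nodes = np.array(
    [-1.0, -np.sqrt(3.0 / 7.0), 0.0, np.sqrt(3.0 / 7.0), 1.0])

P = lagrange_interp_matrix(face_nodes, mortar_nodes)   # prolongation
M = mass_matrix(face_nodes)
M_mortar = mass_matrix(mortar_nodes)
R = np.linalg.solve(M, P.T @ M_mortar)                 # restriction
```

Prolongating a field to the mortar and restricting it back is then the identity, while restriction alone truncates modes the face cannot represent.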

I. A note on symmetry
For practical applications it is often advantageous to work with a symmetric operator. For example, some iterative linear solvers, such as the conjugate-gradient method, take advantage of the symmetry to invert the operator more efficiently. One can also often show stronger convergence bounds for iterative linear solvers that are applicable to nonsymmetric matrices, such as GMRES, if the matrix is symmetric [36].
The compact strong-weak scheme presented in Eq. (30b) with the generalized SIP numerical flux (32) is symmetric unless the elliptic equations break the symmetry, e.g., with an asymmetric coupling between equations. Note that a curved manifold will typically break the symmetry because it involves first derivatives in Christoffel-symbol contributions to the primal sources [see, e.g., Eq. (7)]. It is straightforward to see how the strong-weak scheme can make the DG operator symmetric if the elliptic equations allow it: the strong-weak operator involves a symmetric stiffness term of the schematic form $(M D)^T D = D^T M D$, whereas the strong scheme has a nonsymmetric expression of the form $M D D$ instead. Note that the "massless" variant of the strong-weak scheme, schematically $M^{-1} D^T M D$, is not generally symmetric, and neither is the "massless" strong scheme $D D$.
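These schematic symmetry statements are easy to check numerically. The following sketch (illustrative numpy under stated assumptions: three LGL points, diagonal lumped mass matrix; not SpECTRE code) builds the schematic stiffness terms and tests their symmetry:

```python
import numpy as np

# 1D logical differentiation matrix on the LGL points {-1, 0, 1} and a
# diagonal (lumped) mass matrix with the corresponding LGL weights.
D = np.array([[-1.5, 2.0, -0.5],
              [-0.5, 0.0, 0.5],
              [0.5, -2.0, 1.5]])
M = np.diag([1.0 / 3.0, 4.0 / 3.0, 1.0 / 3.0])

strong_weak = D.T @ M @ D                  # schematic strong-weak: symmetric
strong = M @ D @ D                         # schematic strong: not symmetric
massless = np.linalg.inv(M) @ strong_weak  # "massless" strong-weak variant

def sym(A):
    return np.allclose(A, A.T)
```

Here `strong_weak` is symmetric by construction, while both `strong` and the "massless" variant fail the symmetry check.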

J. Linearizing the operator
To solve nonlinear elliptic equations $A(u) = b$ we typically employ a correction scheme, repeatedly solving the linearized equations for a correction quantity $\Delta u$. For example, a simple Newton-Raphson correction scheme solves the linearized problem $\frac{\delta A}{\delta u}(u)\, \Delta u = b - A(u)$ at fixed $u$ and then iteratively corrects $u \to u + \Delta u$. Since the fluxes $F^i_\alpha$ are already linear for all elliptic systems we consider, the linearization $\frac{\delta A}{\delta u}(u)$ involves only linearizing the sources $S_\alpha$ and the boundary conditions.
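The Newton-Raphson correction scheme can be sketched on a toy nonlinear operator (a linear stiffness-like part plus a cubic source). This is a generic illustration, not the SpECTRE solver; in practice the linearized problem would itself be solved iteratively:

```python
import numpy as np

def newton_raphson(A, jacobian, b, u0, tol=1e-12, max_iters=20):
    """Solve the nonlinear system A(u) = b by repeatedly solving the
    linearized problem dA/du(u) @ du = b - A(u) and correcting
    u -> u + du."""
    u = u0.copy()
    for _ in range(max_iters):
        residual = b - A(u)
        if np.linalg.norm(residual) < tol:
            break
        du = np.linalg.solve(jacobian(u), residual)
        u += du
    return u

# Toy nonlinear "operator": linear fluxes (matrix K) plus a cubic source,
# so only the source contributes to the linearization beyond K.
K = np.array([[2.0, -1.0], [-1.0, 2.0]])
A = lambda u: K @ u + u**3
jacobian = lambda u: K + np.diag(3.0 * u**2)

u_exact = np.array([0.5, -0.25])
u = newton_raphson(A, jacobian, A(u_exact), np.zeros(2))
```

Because the fluxes are linear, the Jacobian differs from `K` only through the linearized source term, mirroring the statement above.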

K. Variations of the scheme
We have made a number of choices to formulate the DG discretization in the preceding sections. This section summarizes some of the choices and presents possible variations to explore in future work.
Massive vs massless scheme: We can eliminate the mass matrix in Eq. (30) to obtain a "massless" DG operator. However, we have found evidence that iterative linear solvers converge faster when solving the "massive" DG operator. We attribute this behaviour to the symmetry considerations discussed in Section III I.

Mass-lumping:
We diagonally approximate the mass matrix to reduce the computational cost of applying, inverting, and storing it, and to simplify the scheme (see Section III F). Dealiasing techniques can potentially increase the accuracy of the scheme on curved meshes as discussed in Section III G.
LGL vs LG mesh: We chose to discretize the DG operator on LGL meshes to take advantage of the collocation points on element boundaries, which simplify computations of boundary corrections. Switching to LG meshes can have the advantage that quadratures are one degree more precise, making the mass-lumping exact on rectilinear grids (see Section III F).
Numerical flux: The generalized internal-penalty numerical flux presented in Section III D has proven a viable choice for a wide range of problems so far. However, the ability to switch out the numerical flux is a notable strength of DG methods, and augmenting the numerical flux in the elliptic DG scheme may improve its convergence properties or accuracy.
In particular, the choice of penalty, Eq. (34), on curved meshes remains a subject of further study.

Strong vs weak formulation:
We have chosen the strong formulation (30a) over the strong-weak formulation (30b) because it is slightly simpler and we have, so far, found no evidence that the strong-weak formulation converges faster than the strong formulation, despite the symmetry considerations discussed in Section III I. However, the strong-weak formulation can be of interest if a symmetric DG operator is necessary, e.g., to take advantage of specialized iterative solvers.
Flux vs primal formulation: We have eliminated auxiliary degrees of freedom in the DG operator with a Schur-complement strategy. An alternative strategy is to derive a "primal formulation" of the DG operator (see Section III C). We have found the flux formulation easier to implement due to its similarity to hyperbolic DG schemes. Furthermore, Ref. [28] suggests that the flux formulation can be advantageous in conjunction with a multigrid solver.

IV. TEST PROBLEMS
The following numerical tests confirm that the DG scheme presented in this article can solve a variety of elliptic problems. The test problems involve linear and nonlinear systems of PDEs with nonlinear boundary conditions on curved manifolds, discretized on hp-refined domains with curved meshes and nonconforming element boundaries.
For test problems that have an analytic solution we quantify the accuracy of the numerical solutions by computing an $L_2$ error (46) over all primal variables, where the integrals are evaluated with Gauss-Lobatto quadrature on the elements of the computational domain.

FIG. 6. The two-dimensional rectilinear domain used in the Poisson problem (Section IV A). Black lines illustrate element boundaries and gray lines represent the LGL grid within each element. This domain is isotropically h refined once, i.e., split once in both dimensions, resulting in four elements. Each element has six grid points per dimension, so fields are represented as polynomials of degree five. This is the domain that Figs. 2 and 3 are based on, and that is circled in Fig. 7.
To verify that the DG operator is functional for our test problems we study the convergence of the discretization error (46) under uniform hp refinement of the computational domain (see Section III H). We compute the h-convergence order (47) under pure uniform h refinement, where $\Delta_h$ denotes the difference between successive h-refinement levels and $h$ is the size of an element. Since we always split elements in half along all logical axes we use $\Delta_h \ln(h) = \ln(2)$. We also compute the exponential convergence scale (48) under pure uniform p refinement, where $\Delta_p$ denotes the difference between successive p-refinement levels.
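The h-convergence order can be estimated from errors at successive refinement levels as the ratio of logarithmic differences. A minimal sketch, using synthetic errors that decay at the optimal rate $O(h^{P+1})$ with $P = 5$ (the degree used in the Poisson test below):

```python
import numpy as np

def h_convergence_order(errors):
    """Estimate the h-convergence order between successive uniform
    h-refinement levels, where each level halves the element size, so
    |Delta ln(h)| = ln(2)."""
    log_e = np.log(np.asarray(errors))
    return -np.diff(log_e) / np.log(2.0)

# Synthetic errors decaying as C * h^(P+1) with P = 5 and h halving
# at every refinement level:
P = 5
h = 1.0 / 2.0 ** np.arange(4)
errors = 3.0 * h ** (P + 1)
orders = h_convergence_order(errors)
```

For real data the estimated order fluctuates between levels and is typically reported per pair of successive resolutions, as in the convergence figures below.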

A. A Poisson solution
With this first test problem we establish a simple baseline that the following tests build upon. It is reduced to the absolute essentials to illustrate the basic concepts of the scheme. We solve a flat-space Poisson equation (2) in two dimensions for the analytic solution

$u_\mathrm{analytic}(x) = \sin(\pi x) \sin(\pi y) \quad (49)$

on a rectilinear domain $\Omega = [0, 1]^2$. The domain is illustrated in Fig. 6. To obtain the solution (49) numerically we choose the fixed source $f(x) = 2\pi^2 \sin(\pi x) \sin(\pi y)$, select homogeneous Dirichlet boundary conditions $u^\mathrm{b} = 0$, and solve the strong compact DG-discretized problem (30a) with $C = 1$. This essentially means we invert the matrix depicted in Fig. 3 and apply it to the discretization of the fixed source $f(x)$. Instead of inverting the matrix directly we employ the iterative elliptic solver of the SpECTRE code [22] presented in Ref. [23]. However, note that the technology we use to solve the DG-discretized problem is not relevant for the purpose of this article, since the matrix equation has a unique solution. Assuming the matrix equation is solved to sufficient precision, Eq. (46) quantifies the discretization error of the DG scheme.

FIG. 7. Convergence of the two-dimensional Poisson problem detailed in Section IV A with uniform hp refinement. Solid lines connect numerical solutions where the domain is split into an increasing number of elements (isotropic h refinement), and dotted lines connect numerical solutions with increasing polynomial order (isotropic p refinement). The DG scheme recovers optimal $O(h^{P+1})$ convergence with odd-order superconvergence under h refinement (right panel) and exponential convergence under p refinement (bottom panels). For reference, the circled configuration is pictured in Fig. 6.
We solve the problem on a series of uniformly and isotropically refined domains and present the convergence of the discretization error in Fig. 7. Under h refinement the scheme recovers optimal $O(h^{P+1})$ convergence, where $P$ denotes the polynomial degree of the elements. It also recovers the odd-order superconvergence feature expected for the antisymmetric problem (49).14 Under p refinement the scheme recovers exponential convergence. The exponential convergence scale $\tau_p$ is modulated by the superconvergence feature and its mean increases linearly with the h-refinement level.

14 See also Fig. 7.9 in Ref. [2].

TABLE I. Parameters used in the thermal-noise problem (Section IV B). The beam width and the material properties correspond to Table 1 in Ref. [37]. These material properties characterize a fused-silica mirror, which is a material used in the LIGO gravitational-wave detectors.

B. Thermal noise in a cylindrical mirror
In this second test problem we solve the equations of linear elasticity (4) on a curved mesh with nonconforming element boundaries. The test problem represents a cylindrical mirror that is deformed by pressure from a laser beam incident on one of its sides. This problem arises in studies of Brownian thermal noise in interferometric gravitational-wave detectors [37,38].15 Here we consider an analytic solution to this problem that applies in the limit of an isotropic and homogeneous mirror material with constitutive relation characterized by the Lamé parameter $\lambda$ and the shear modulus $\mu$, or equivalently by the Poisson ratio $\nu = \frac{\lambda}{2(\lambda+\mu)}$, Young's modulus $E = \frac{\mu(3\lambda+2\mu)}{\lambda+\mu}$, or the bulk modulus $K = \lambda + \frac{2}{3}\mu$. We assume the material fills the infinite half-space $z \geq 0$, choose a vanishing force density $f_j(x) = 0$, and a Gaussian profile (51) of the laser beam incident at $z = 0$. Here $T^{ij} = -Y^{ijkl} S_{kl}$ is the stress, $n_i$ is the unit normal pointing away from the mirror, i.e., in negative $z$ direction, $r = \sqrt{x^2 + y^2}$ is the radial coordinate distance from the axis of symmetry, and $r_0$ is the beam width. Under these assumptions the displacement field $\xi^i(x)$ has the analytic solution (52) [39-41], with $\xi_\phi = 0$ in cylindrical coordinates $\{r, \phi, z\}$. Here $J_0$ and $J_1$ are Bessel functions of the first kind, and $p(k) = \frac{1}{2\pi} e^{-(k r_0/2)^2}$ is the Hankel transform of the laser-beam profile. We evaluate these integrals numerically at every collocation point in the computational domain to determine the analytic solution.

FIG. 8. A cut through the cylindrical domain used in the elasticity problem (Section IV B). The domain consists of four wedges enveloping a cuboid, and two vertical layers. The layers are partitioned vertically at $z = r_0$ and the cuboid lies radially within $r = r_0$. In the top layer, the wedges are h refined radially once and the cuboid is h refined in the x and y directions once, resulting in 12 elements in the top layer and 5 elements in the bottom layer. Elements in this example have six grid points per dimension, and the wedge-shaped elements have two additional grid points in their angular direction.
To obtain numerical solutions to the thermal-noise problem we DG discretize the equations of linear elasticity (5) on a cylindrical domain with height and radius $R$, employing the strong compact DG operator (30a). Since the stress $T^{ij} = -F^{ij}_\xi$ is the negative primal flux in the elasticity equations (5), we impose Eq. (51) as a Neumann-type boundary condition on the base of the cylinder at $z = 0$. We impose the analytic solution (52) as Dirichlet-type boundary conditions on the remaining external boundaries of the domain, i.e., on the top face at $z = R$ and on the mantle at $r = R$. These boundary conditions mean that we solve for a finite cylindrical section of the infinite half-space analytic solution (52). We choose a penalty parameter of $C = 100$ for this problem to eliminate variations in the discretization error arising from curved-mesh contributions to the penalty (34) at high resolutions. Table I summarizes the remaining parameters we use in the numerical solutions. Figure 8 illustrates the cylindrical domain. It is refined more strongly toward the origin $x = 0$ where the Gaussian laser beam applies pressure. The refinement is both anisotropic and inhomogeneous, leading to nonconforming element boundaries with different polynomial degrees on either side of the boundary, multiple neighbors adjacent to an element face, or both. Specifically, elements facing the top-layer cuboid or the interface between top and bottom layer are matched two-to-one, and wedge-shaped elements have two additional angular grid points. Therefore, the elements facing the cuboid are both p nonconforming and h nonconforming in the top layer, and p nonconforming in the bottom layer. The elements facing the layer interface are h nonconforming. Figure 9 presents the convergence of the discretization error under uniform hp refinement.
Specifically, we split every element in two along all three dimensions to construct additional h-refinement levels, and increment every polynomial degree by one to construct additional p-refinement levels, retaining the nonconforming element boundaries. Note that the wedge-shaped elements retain a higher polynomial degree of $P+2$ along their angular direction throughout the refinement procedure, where $P$ is the polynomial degree of all other elements and dimensions. The DG scheme recovers optimal $O(h^{P+1})$ convergence under h refinement and exponential convergence under p refinement. Note that the exponential convergence scale $\tau_p$ depends on the domain geometry, the structure of the solution, the placement of grid points, and the refinement strategy. We have chosen to refine the domain as uniformly as possible here to reliably measure convergence properties of the DG scheme. Optimizing the distribution of elements and grid points with adaptive mesh refinement (AMR) strategies to increase the rate of convergence is the subject of ongoing work.

C. A black hole in general relativity
Now we apply the DG scheme to solve the Einstein constraint equations of general relativity in the XCTS formulation, which is a set of coupled, nonlinear, elliptic PDEs on a curved manifold (see Appendix 2). Solutions to the XCTS equations describe admissible configurations of general-relativistic spacetime and provide initial data for general-relativistic time evolutions.
In this test problem we solve the XCTS equations (A.6) for a Schwarzschild black hole in Kerr-Schild coordinates, with the background quantities given in Eqs. (53), where $M$ is the mass parameter, $r = \sqrt{x^2 + y^2 + z^2}$ is the Euclidean coordinate distance, and $l^i = l_i = x^i/r$.16 The time-derivative quantities $\bar u_{ij}$ and $\partial_t K$ in the XCTS equations (A.6) vanish, as do the matter sources $\rho$, $S$, and $S^i$. Note that we have chosen a conformal decomposition with $\psi = 1$ here, but other choices of $\psi$ and $\bar\gamma_{ij}$ that keep the spatial metric $\gamma_{ij} = \psi^4 \bar\gamma_{ij}$ invariant are equally admissible.

FIG. 10. A cut through the uniformly refined spherical-shell domain used in the black hole problem (Section IV C). The domain consists of six wedges with a logarithmic radial coordinate map enveloping an excised sphere. In this example each wedge is isotropically h refined once, i.e., split once in all three dimensions, resulting in a total of 48 elements. Note the elements are split in half along their logical axes, so the element size scales logarithmically in radial direction just like the distribution of grid points within the elements. Each element has six grid points per dimension, so fields are represented as polynomials of degree five.
We solve the XCTS equations numerically for the conformal factor $\psi$, the product $\alpha\psi$, and the shift $\beta^i$. The conformal metric $\bar\gamma_{ij}$ and the trace of the extrinsic curvature $K$ are background quantities that are chosen in advance and remain fixed throughout the solve. They are a source of aliasing when evaluated on the computational grid (see Section III G). Importantly for this test problem, the conformal metric $\bar\gamma_{ij}$ is not flat, resulting in a problem formulated on a curved manifold. For example, unit normal one-forms in the DG scheme are normalized with respect to the conformal metric $\bar\gamma_{ij}$, and the metric determinant appears in the mass matrix and in the $L_2$ error (46).

16 See Table 2.1 in Ref. [17].
To solve the black hole problem numerically we employ the strong compact DG scheme (30a) with $C = 1$ to discretize the XCTS equations (A.6) on a three-dimensional spherical shell, as illustrated in Fig. 10. The domain envelops an excised sphere that represents the black hole, so it has an outer and an inner external boundary that require boundary conditions. To obtain the Schwarzschild solution in Kerr-Schild coordinates we impose Eqs. (53a) to (53c) as Dirichlet-type boundary conditions at the outer boundary of the spherical shell at $r = 10M$. We place the inner radius of the spherical shell at $r = 2M$ and impose nonspinning apparent-horizon boundary conditions (54) at the inner boundary, where $\bar m_{ij} = \bar\gamma_{ij} - n_i n_j$. These boundary conditions are not specific to the Schwarzschild solution but ensure the excision surface is an apparent horizon [16].17 Since the Schwarzschild solution in Kerr-Schild coordinates has an apparent horizon at $r = 2M$ we recover the solution (53) when we place the inner radius of the spherical shell at that radius. The apparent-horizon boundary conditions (54) do not constrain the lapse $\alpha$, so we impose Eq. (53b) at the inner boundary. The apparent-horizon boundary conditions are of Neumann-type for the variable $\psi$, of Dirichlet-type for $\alpha\psi$ and $\beta^i$, and nonlinear. Since the XCTS equations (A.4) and the apparent-horizon boundary conditions (54) are nonlinear, the initial guess for the iterative nonlinear solver becomes relevant. We choose an initial guess close to the analytic solution to ensure fast convergence of the iterative solver to the numerical solution. Note that the initial guess and other details of the iterative solve do not affect the discretization error of the numerical solution once the solve has converged to sufficient precision.
We present the convergence of the discretization error under uniform hp refinement in Fig. 11. The DG scheme for the nonlinear black hole problem recovers $O(h^P)$ convergence under h refinement, which is an order lower than that obtained for the two preceding linear test problems. We find higher-order convergence for pure Dirichlet boundary conditions for this problem, suggesting the apparent-horizon boundary conditions (54) are responsible for the reduction of the convergence order. For a Poisson problem with nonlinear boundary conditions the authors of Ref. [24] also find a loss of convergence under h refinement. Under p refinement the scheme recovers exponential convergence and the mean exponential convergence scale $\tau_p$ increases linearly with the h-refinement level.

D. A black hole binary
Finally, we solve the Einstein constraint equations in the XCTS formulation as in Section IV C, but now we choose background quantities and boundary conditions that represent two black holes in orbit. This binary black hole problem is of significant relevance in numerical relativity to procure initial data for simulations of merging black holes [15,17-19].
Following the formalism for superposed Kerr-Schild initial data, e.g., as laid out in Refs. [18,19], we set the conformal metric and the trace of the extrinsic curvature to the superpositions (55), where $\gamma^{(n)}_{ij}$ and $K^{(n)}$ are the conformal metric and extrinsic-curvature trace of two isolated Schwarzschild black holes in Kerr-Schild coordinates as given in Eqs. (53). They have mass parameters $M_n$ and are centered at coordinates $C_n$, with $r_n$ being the Euclidean coordinate distance from either center. The superpositions are modulated by two Gaussians with widths $w_n$. The time-derivative quantities $\bar u_{ij}$ and $\partial_t K$ in the XCTS equations (A.4) vanish, as do the matter sources $\rho$, $S$, and $S^i$.

FIG. 12. A cut through the three-dimensional black-hole binary domain used in Section IV D. It involves two excised spheres centered at $C_n$ along the x axis and extends to a spherical outer surface at radius $R$. The domain is h refined such that spherical wedges have equal angular size, so the cube-to-sphere boundary is nonconforming. All elements in this picture have eight angular grid points, and {7, 8, 8, 9, 11, 11} radial grid points in the layers ordered from outermost to innermost.
To handle orbital motion we split the shift into a background and an excess contribution [42], and choose the background shift (56), where $\Omega_0$ is the orbital angular velocity. We insert Eq. (56) in the XCTS equations (A.4) and henceforth solve them for $\beta^i_\mathrm{excess}$ instead of $\beta^i$. We solve the XCTS equations on the domain depicted in Fig. 12. It has two excised spheres with radius $2M_n$ that are centered at $C_n$ and correspond to the two black holes, and an outer spherical boundary at finite radius $R$. We impose boundary conditions on these three boundaries as follows. At the outer spherical boundary we impose asymptotic flatness. Since the outer boundary is at a finite radius, the solution will only be approximately asymptotically flat. On the two excision boundaries we impose nonspinning apparent-horizon boundary conditions, Eq. (54). For the lapse we choose to impose the isolated solution (53b) as Dirichlet conditions at both excision surfaces. Note that this choice differs slightly from Ref. [19], where the superposed isolated solutions are imposed on the lapse at both excision surfaces.
Since the binary black hole problem has no analytic solution we assess the precision of numerical solutions by comparing them to a high-resolution reference configuration. Specifically, we interpolate all five fields u A = {ψ, αψ, β i excess } to a set of sample points x m . Then, we compute the discretization error as an L 2 norm of the difference to the high-resolution reference run over all fields and sample points, Figure 13 presents the convergence of the discretization error under uniform hp refinement for our strong compact DG scheme (30a) with C = 1. Specifically, we obtain hrefinement levels from the domain depicted in Fig. 12 by splitting all elements in two along their three logical axes. We obtain p-refinement levels by incrementing the number of grid points by one in all elements and dimensions. The DG scheme recovers exponential convergence under p refinement, and suggests the same O(h P ) convergence x1 = (8.846, 0, 0) x2 = (0, 0, 0) x3 = (100, 0, 0) ψ under h refinement that we have found for the single black hole problem in Section IV C. We have chosen M n = 0.4229, C n = (±8, 0, 0), Ω 0 = 0.0144, w n = 4.8, R = 300, and sample points along the x axis at x 1 = 8.846 (near horizon), x 2 = 0 (origin) and x 3 = 100 (far field) here. For the high-resolution reference configuration in Eq. (59) we use a run that is h refined twice, and has one grid point more per element and dimension than the highestresolution configuration included in Fig. 13. The reference values at the interpolation points are listed in Table II.
We have verified that these values are consistent with the same problem solved with the SpEC code [43,44] up to an absolute error of at most 10^{-7}, which is the precision we report in Table II.
In forthcoming work we intend to employ the DG scheme that we have presented here to develop a scalable initial-data solver for binaries involving black holes and neutron stars in the SpECTRE numerical relativity code [23].

V. CONCLUSION AND FUTURE WORK
We have presented a unified discontinuous Galerkin (DG) internal-penalty scheme that is applicable to a wide range of elliptic equations. Our scheme applies to linear and nonlinear second-order elliptic PDEs of one or more variables, where the variables can be scalars, vectors, or tensors of higher rank. It does not require problem-specific modifications of the DG discretization or of the numerical fluxes that couple neighboring elements. The scheme supports a wide range of linear and nonlinear boundary conditions, and applies to equations formulated on curved manifolds. We demonstrate its versatility by solving a simple Poisson problem, a linear elasticity problem on a curved mesh with nonconforming element boundaries, and two nonlinear problems in general relativity involving black holes. The unified DG scheme is capable of solving these problems with no structural changes. It recovers optimal O(h^{P+1}) convergence for the linear test problems and O(h^P) convergence for the nonlinear test problems, where P is the polynomial degree of the elements. The scheme is implemented in the open-source SpECTRE code [22] and the results presented in this article are reproducible with the supplemental input-file configurations.^18
The DG scheme developed here can potentially be improved in multiple ways in future work. Dealiasing techniques have the potential to increase the accuracy of the scheme on curved meshes and for equations with background quantities. The choice of penalty on curved meshes remains a subject of ongoing study. Furthermore, detailed studies of the symmetry of the DG operator and related adjustments to the scheme, such as switching to the strong-weak formulation, can potentially make the DG operator faster to solve.
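Convergence rates like the O(h^{P+1}) and O(h^P) quoted above are typically verified by comparing errors at successive h-refinement levels. A minimal sketch of this standard diagnostic (not code from SpECTRE):

```python
import math

def observed_order(err_h, err_h_half):
    """Estimate the observed convergence order p from errors at element
    sizes h and h/2, assuming err ~ C * h^p, so that
    p = log2(err_h / err_h_half)."""
    return math.log2(err_h / err_h_half)

# A scheme converging at third order reduces the error by 8x when h halves:
p = observed_order(8.0e-4, 1.0e-4)  # -> 3.0
```

Comparing the observed p against P + 1 (linear problems) or P (nonlinear problems) is how the optimal and suboptimal rates in the text are distinguished.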
Since the convergence properties of the DG scheme are sensitive to the specifics of the computational domain, we have chosen to refine the domains as uniformly as possible while retaining some important features, such as curved meshes and nonconforming element boundaries. For practical applications it is typically more important to obtain steep rather than uniform convergence, in order to conserve computational resources and thus achieve faster or more precise solves. Therefore, a focus of future work will be to develop adaptive mesh-refinement strategies for the elliptic DG scheme that place grid points in regions and dimensions of the domain that dominate the discretization error.
Once the DG discretization of the elliptic equations is at hand, numerical techniques for solving the resulting matrix equation become important. Sophisticated linear and nonlinear iterative algorithms are necessary to solve high-resolution elliptic problems in parallel on large computing clusters. Many of the choices we have made in the development of the DG scheme are motivated by such large-scale applications. For this purpose we are developing a scalable multigrid-Schwarz preconditioned Newton-Krylov iterative solver with task-based parallelism that will be presented in [23].
along with both f_α = 0. Other Poisson-type equations with nonlinear sources can be formulated analogously.

The XCTS equations of general relativity
The XCTS equations, with Ā^{ij} = (1/(2ᾱ)) ((L̄β)^{ij} − ū^{ij}) and ᾱ = αψ^{−6}, are a set of nonlinear elliptic equations that the spacetime metric of general relativity must satisfy at all times [15].^20 They are solved for the conformal factor ψ, the product of lapse and conformal factor αψ, and the shift vector β^j. The remaining quantities in the equations, i.e., the conformal metric γ̄_{ij}, the trace of the extrinsic curvature K, their respective time derivatives ū_{ij} and ∂_t K, the energy density ρ, the stress-energy trace S, and the momentum density S^i, are freely specifiable fields that define the scenario at hand. Of particular importance is the conformal metric γ̄_{ij}, which defines the background geometry, the covariant derivative ∇̄, the Ricci scalar R̄, and the longitudinal operator
(L̄β)^{ij} = ∇̄^i β^j + ∇̄^j β^i − (2/3) γ̄^{ij} ∇̄_k β^k.
Note that the XCTS equations are essentially two Poisson equations and one elasticity equation with coupled, nonlinear sources on a curved manifold. In this analogy, the longitudinal operator plays the role of the elastic constitutive relation that connects the symmetric "shift strain" ∇̄_{(i}β_{j)} with the "stress" (L̄β)^{ij} of which we take the divergence in the momentum constraint (A.4c). This particular constitutive relation is equivalent to an isotropic and homogeneous material (50) with bulk modulus K = 0 (not to be confused with the extrinsic curvature trace K in this context) and shear modulus µ = 1.
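The K = 0, µ = 1 constitutive relation can be made concrete in the flat-space limit, where the conformal metric is the identity and indices are raised trivially. A minimal sketch under that assumption (function name hypothetical):

```python
import numpy as np

def longitudinal_operator(grad_beta, gamma=np.eye(3)):
    """Longitudinal operator (L beta)^{ij} = 2 * strain - (2/3) gamma tr(strain),
    with strain_{ij} = partial_(i beta_j), written for a flat conformal
    metric. This is the isotropic constitutive relation with bulk modulus
    K = 0 and shear modulus mu = 1 discussed in the text.
    grad_beta[i, j] = partial_i beta_j."""
    strain = 0.5 * (grad_beta + grad_beta.T)   # symmetric shift strain
    trace = np.trace(strain)                   # divergence of the shift
    return 2.0 * strain - (2.0 / 3.0) * trace * gamma

# A pure expansion beta^i = c x^i has grad_beta = c * identity, which the
# trace-free longitudinal operator annihilates:
L = longitudinal_operator(np.eye(3))
```

The output being trace-free for any input reflects the K = 0 (vanishing bulk modulus) property: pure volume changes of the shift carry no "stress".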
To formulate the XCTS equations in first-order flux form (1) we choose as auxiliary variables the gradient of the conformal factor, v_i = ∂_i ψ, the gradient of the lapse times the conformal factor, w_i = ∂_i (αψ), and the symmetric shift strain B_{ij} = ∇̄_{(i}β_{j)}. Then the XCTS equations (A.4) can be formulated with the fluxes and sources
^20 See, e.g., Ref. [17] for an introduction to the XCTS equations, in particular Box 3.3.