Least-Informative Priors for $0\nu\beta\beta$ Decay Searches

Bayesian parameter inference techniques require a choice of prior distribution which can strongly impact the statistical conclusions drawn. We discuss the construction of least-informative priors for neutrinoless double beta decay searches. Such priors attempt to be objective by maximizing the information gain from an experimental setup. In a parametrization using the lightest neutrino mass $m_l$ and an effective Majorana phase parameter $\Phi$, we construct such a prior using two different approaches and compare them with the standard flat and logarithmic priors in $m_l$.


Introduction
Neutrinoless double beta (0νββ) decay is a hypothetical process of crucial interest due to its sensitivity both to the neutrino mass scale and to lepton-number violation. Direct searches for the decay, alongside neutrino oscillation studies and other probes of neutrino masses such as cosmological bounds [1] and single-beta decay measurements [2], are key to improving our understanding of neutrinos. While a measurement of the 0νββ decay rate has not yet been made, upper bounds have been placed on the effective 0νββ mass $m_{\beta\beta}$, from which constraints on the neutrino mass scale and Majorana phases may be inferred.
Lacking evidence to the contrary, early formulations of the Standard Model (SM) took neutrinos to be massless. However, data suggesting the occurrence of neutrino flavour oscillations has accumulated since the 1960s, beginning with the Homestake Experiment [3] and culminating with the combined observation of oscillations for atmospheric neutrinos by Super-Kamiokande (SK), for solar neutrinos by the Sudbury Neutrino Observatory (SNO) [4] and for reactor antineutrinos by the KamLAND Experiment [5]. From this it is clear that oscillations necessitate a nonzero mixing angle as well as a nonzero mass difference, i.e. no more than one neutrino mass may be zero. For a model describing the three flavours $\nu_e$, $\nu_\mu$, $\nu_\tau$ of neutrinos in the SM, mixing between weak and mass eigenstates is described by the Pontecorvo-Maki-Nakagawa-Sakata (PMNS) matrix, and again nonzero mixing angles and two nonzero mass eigenvalues are required, with the additional possibility of CP violation.
The realisation that at least two generations of neutrinos are massive leads to natural questions: What is the mechanism responsible for the neutrino masses, and how can all three masses be measured? Oscillation itself lends evidence to the latter question, as measurements have been made of all PMNS matrix elements as well as the mass-squared differences $\Delta m^2_{21} \sim 7.4 \cdot 10^{-5}$ eV$^2$ and $|\Delta m^2_{31}| \sim 2.5 \cdot 10^{-3}$ eV$^2$ [6]. The sign of $\Delta m^2_{21}$ is known, due to solar matter effects on oscillation [8], leading to two candidate mass hierarchies: the normal ordering (NO) $m_1 < m_2 < m_3$, and the inverse ordering (IO) $m_3 < m_1 < m_2$. Throughout this report, we take these oscillation parameters to have their best-fit values from NuFIT 4.0 + SK [9], for which uncorrelated errors are also accounted for in simulation.
Further information on neutrino parameters is provided by cosmological observations, as massive neutrinos play the unique role of both radiation during the early baryon acoustic oscillation epoch, and hot dark matter during later formation of large-scale structure. With the reasonable assumption of equal number densities for all flavours, the sum of masses $\sum m_i$ is proportional to the total energy density of neutrinos in non-relativistic eras. As a result, this quantity is observable in the redshift fluctuations of the cosmic microwave background, which are controlled by energy density at last-scattering via the integrated Sachs-Wolfe effect, as well as in a suppression of structural matter fluctuations due to free-streaming neutrinos. The latest fits from Planck observatory data [1] place an upper bound $\sum m_i < 0.12$ eV at 95% confidence, and efforts in this direction have great potential for further precision.
Our focus is on 0νββ decay, which is a hypothetical nuclear transition $2n \to 2p^+ + 2e^-$ [10,11]. Total lepton number is violated by two units and the process is therefore not permitted in the SM with zero neutrino masses. It instead proceeds if the SM Lagrangian is extended by a Majorana mass term of the form $-\frac{1}{2} m\, \overline{\nu_L^C}\, \nu_L$. The decay process is sensitive to Majorana neutrinos through an observable effective mass

$$m_{\beta\beta} = \sum_{i=1}^{3} U_{ei}^2\, m_i, \qquad (1.1)$$

where the PMNS matrix $U$ includes Majorana phases that are unobservable in oscillation experiments.
The focus of the present work is the development of computational techniques for data-driven Bayesian inference on the 0νββ parameter space. Bayesian methodologies such as Markov Chain Monte Carlo (MCMC) require a choice of prior distribution, which can strongly influence derived bounds. While it is standard to apply flat priors to bounded parameters such as Majorana phases, for unbounded parameters such as neutrino masses there is less of a consensus. Recent Bayesian analyses have either preferred log-flat priors for their scale-invariance [12], or have considered both flat and log-flat priors [13] in order to demonstrate the strong dependence of quantities such as discovery probability on prior selection. In this paper we examine a class of least-informative priors (LIPs) and apply them to the analysis of 0νββ decay. LIPs are constructed by numerical maximisation of the expected Kullback-Leibler divergence between posterior and prior distributions [14], which is taken to represent inferential information gain and therefore to indicate a minimum quantity of information contained in the prior distribution.
In the frequentist interpretation of statistics, a perfect theory is believed to exist, whose parameters have an unknown but fixed value. Bayesian statistics, however, treats theories as having associated degrees of belief. Provided some data, this approach allows a practitioner to infer a sensible probability that a candidate theory is correct. Given that the primary goal of particle physics experimentation is to reject or accept candidate theories and refine their parameters, this Bayesian inference methodology is a powerful analysis tool. We consider the light neutrino exchange mechanism for 0νββ decay, and use published or Poisson-estimated likelihood functions from cosmological observations of $\sum m_i$ and direct searches for $m_{\beta\beta}$ to derive bounds on the neutrino masses and Majorana phases.
In Section 2, we briefly summarize the key aspects of neutrinoless double beta decay and how MCMC studies are being used to infer neutrino parameters. We also introduce an effective parameter which combines the effect of the Majorana phases. In Section 3, we outline the theoretical foundations for least-informative priors and we detail an algorithm for generating LIPs applicable to our physics context. Our results are presented in Section 4. Conclusions and an outlook are featured in Section 5.

Neutrinoless Double Beta Decay and Neutrino Parameter Inference
Thinking of the three terms in Eq. (1.1) as complex numbers, two relative phases $-\pi \le \alpha, \beta < \pi$ are sufficient to describe $m_{\beta\beta}$. Explicitly writing out the PMNS mixing matrix elements, this yields

$$m_{\beta\beta} = c_{12}^2 c_{13}^2\, m_1 + s_{12}^2 c_{13}^2\, m_2\, e^{i\alpha} + s_{13}^2\, m_3\, e^{i\beta} = A + B e^{i\alpha} + C e^{i\beta}, \qquad (2.1)$$

where $s_{ij}^2$ and $c_{ij}^2$ are shorthand for $\sin^2\theta_{ij}$ and $\cos^2\theta_{ij}$, respectively. The real-valued coefficients $A$, $B$ and $C$ are functions of the neutrino masses and mixing parameters. The neutrino masses are not independent of each other: the mass-squared splittings are fixed by oscillations, so the lightest mass $m_l$ can be chosen as the single free parameter, with $m_l = m_1$ if the neutrinos are normally ordered (NO) and $m_l = m_3$ if they are inversely ordered (IO).

0νββ experiments measure the half-life $T_{1/2}({}^{A}\mathrm{X})$ of the decay of an isotope ${}^{A}\mathrm{X}$ [12], which is connected to the effective double beta mass as

$$T_{1/2}^{-1}({}^{A}\mathrm{X}) = G^{0\nu}\, |M^{0\nu}|^2\, |m_{\beta\beta}|^2.$$

Here, $G^{0\nu}$ is a phase-space factor encoding the leptonic part of the process including Coulomb effects between the nucleus and outgoing electrons, and $M^{0\nu}$ is the nuclear matrix element (NME) of the underlying nuclear transition. Both depend on the isotope in question. Due to the large size of the relevant nuclei and correlations between nucleon states, the 0νββ NMEs are challenging to calculate, and disagreement between different nuclear models is a significant source of theoretical error in 0νββ studies [15].
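As an illustration of how $|m_{\beta\beta}|$ and the half-life follow from $m_l$ and the two phases, the following minimal Python sketch may be helpful. It assumes approximate NuFIT 4.0 normal-ordering inputs (the constants `S12_SQ`, `S13_SQ`, `DM21_SQ`, `DM31_SQ` are rounded assumptions, not values quoted in this text), treats $|\Delta m^2_{31}|$ as the same magnitude in both orderings, and assumes the light-neutrino-exchange relation $T_{1/2}^{-1} = G^{0\nu}|M^{0\nu}|^2|m_{\beta\beta}|^2$ with the $^{76}$Ge values of $G^{0\nu}$ and $|M^{0\nu}|$ quoted later in the paper. All function names are hypothetical.

```python
import cmath
import math

# Assumed, rounded oscillation inputs (approximate NuFIT 4.0 values).
S12_SQ = 0.310       # sin^2(theta_12)
S13_SQ = 0.02237     # sin^2(theta_13)
DM21_SQ = 7.39e-5    # eV^2
DM31_SQ = 2.528e-3   # eV^2, magnitude only

G0NU = 3.04e-26      # yr^-1 eV^-2, 76Ge phase-space factor quoted in the paper
NME = 4.32           # |M^0nu| for 76Ge quoted in the paper


def masses(m_l, ordering="NO"):
    """Return (m1, m2, m3) in eV for a lightest mass m_l in eV."""
    if ordering == "NO":
        return (m_l,
                math.sqrt(m_l**2 + DM21_SQ),
                math.sqrt(m_l**2 + DM31_SQ))
    m1 = math.sqrt(m_l**2 + DM31_SQ)  # IO: m3 is the lightest state
    return m1, math.sqrt(m1**2 + DM21_SQ), m_l


def coefficients(m_l, ordering="NO"):
    """Real coefficients A, B, C of A + B e^{i alpha} + C e^{i beta}, in eV."""
    c13_sq = 1.0 - S13_SQ
    m1, m2, m3 = masses(m_l, ordering)
    return (1.0 - S12_SQ) * c13_sq * m1, S12_SQ * c13_sq * m2, S13_SQ * m3


def m_bb(m_l, alpha, beta, ordering="NO"):
    """|m_bb| in eV for given lightest mass and Majorana phases."""
    a, b, c = coefficients(m_l, ordering)
    return abs(a + b * cmath.exp(1j * alpha) + c * cmath.exp(1j * beta))


def half_life(m_bb_abs):
    """Half-life in years under the assumed light-neutrino-exchange relation."""
    return 1.0 / (G0NU * NME**2 * m_bb_abs**2)
```

For example, `half_life(m_bb(0.05, 0.0, 0.0))` gives the half-life for aligned phases at $m_l = 50$ meV under these assumptions.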

An Effective Majorana Phase Parameter
The parametrization of 0νββ decay arising from the neutrino mass matrix, while physically natural, introduces unnecessary complications into our inference. The Majorana phases entering it cannot be determined individually [18] and a combination of both leads to the band structure observed. Fig. 1 shows a continuum of $|m_{\beta\beta}|$ trajectories against the lightest neutrino mass $m_l$ for constant choices of the Majorana phases $(\alpha, \beta)$, where both $\alpha$ and $\beta$ are uniformly scanned over the range $[-\pi, \pi]$. In a 2D rectilinear uniform distribution, volume effects imply that a point is more likely than not to be found near the boundary of the rectangle, and in particular near the corners. For both types of ordering, this and the functional dependence result in four bands of markedly high density, which are often but not always located near the theoretical limits on $|m_{\beta\beta}|$.
From Eq. (2.1), we can think of $m_{\beta\beta}$ as the sum of three complex numbers $A + Be^{i\alpha} + Ce^{i\beta}$, where $A \approx 0.67\, m_1$, $B \approx 0.30\, m_2$ and $C \approx 0.02\, m_3$. This is depicted in Fig. 2: $m_{\beta\beta}$ vanishes only when the three numbers close into a triangle in the complex plane. In the IO case $m_3$ is too small for the sum to reach 0, whereas in the NO case many such closed triangles are found. The lower and upper funnel edges $m_l^{-}$ and $m_l^{+}$ occur at $(\alpha, \beta) = (\pi, 0)$ and $(\pi, \pi)$, respectively, and are analytic functions of the neutrino masses and mixing angles. Using best-fit values from NuFIT 4.0 + SK [6,9], $m_l^{-} = 2.330$ meV and $m_l^{+} = 6.535$ meV. For any $m_l \in [m_l^{-}, m_l^{+}]$, there is a unique Majorana phase pair $(\alpha, \beta)$ which satisfies $m_{\beta\beta} = 0$. This choice of phase pair is extremely fine-tuned, leading to a statistical inaccessibility of the funnel; within the funnel interval in $m_l$, the fraction of $(\alpha, \beta)$ parameter space with $|m_{\beta\beta}| < 10^{-3}$, $10^{-4}$ and $10^{-5}$ eV is $\approx 10^{-1}$, $10^{-3}$ and $10^{-5}$, respectively.
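The funnel edges can be located numerically as the roots of $A - B + C$ (phases $(\pi, 0)$) and $A - B - C$ (phases $(\pi, \pi)$) in the NO case. The sketch below uses plain bisection and assumed, rounded NuFIT 4.0 inputs, so its results should only approximate the values quoted above; the helper names are hypothetical.

```python
import math

# Assumed, rounded oscillation inputs (approximate NuFIT 4.0, normal ordering).
S12_SQ, S13_SQ = 0.310, 0.02237
DM21_SQ, DM31_SQ = 7.39e-5, 2.528e-3  # eV^2


def abc_normal(m_l):
    """Coefficients A, B, C of Eq. (2.1) for normal ordering, in eV."""
    c13_sq = 1.0 - S13_SQ
    a = (1.0 - S12_SQ) * c13_sq * m_l
    b = S12_SQ * c13_sq * math.sqrt(m_l**2 + DM21_SQ)
    c = S13_SQ * math.sqrt(m_l**2 + DM31_SQ)
    return a, b, c


def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lower_edge_condition(m_l):
    """m_bb at (alpha, beta) = (pi, 0): A - B + C."""
    a, b, c = abc_normal(m_l)
    return a - b + c


def upper_edge_condition(m_l):
    """m_bb at (alpha, beta) = (pi, pi): A - B - C."""
    a, b, c = abc_normal(m_l)
    return a - b - c


m_lower = bisect(lower_edge_condition, 0.0, 0.05)  # eV
m_upper = bisect(upper_edge_condition, 0.0, 0.05)  # eV
print(f"funnel edges: {1e3 * m_lower:.2f} meV and {1e3 * m_upper:.2f} meV")
```

Bisection is used here only to keep the sketch dependency-free; any bracketing root finder would serve equally well.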
The high-density banding and the inaccessibility of the funnel in the $|m_{\beta\beta}|$-$m_l$ parameter space due to the variation of the Majorana phases are a consequence of the parametrization chosen. In Bayesian language we may also say that the corresponding flat prior on the phases is an arbitrary choice given our lack of knowledge of the phases. Moreover, the two Majorana phases are somewhat redundant and what truly matters is the range of values in $|m_{\beta\beta}|$ that can occur for a given $m_l$. For phase angles and parameters derived from phase angles, a flat prior is the standard choice [12] and may be most easily described by a linear parametrization. To capture the effects of the Majorana phases we therefore introduce an effective phase parameter $0 \le \Phi \le 1$ which interpolates linearly in $|m_{\beta\beta}|$ between the boundaries of this permissible region. From Eq. (2.1), these boundaries occur for pairs of 0 and $\pi$ Majorana phases, which leads to separate definitions of $\Phi$ for the NO case and for the IO case, where $A$, $B$, $C$ depend explicitly upon $m_l$. In this parametrization, $|m_{\beta\beta}|$ changes linearly as $\Phi$ is varied linearly. This includes the funnel region of the NO case.
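A minimal sketch of such a linear interpolation is given below. It does not reproduce the paper's own case-by-case definitions of $\Phi$; instead it assumes that the permissible range of $|A + Be^{i\alpha} + Ce^{i\beta}|$ follows from the triangle inequality (upper boundary $A + B + C$, lower boundary zero whenever the three lengths can close a triangle), and that $\Phi = 0$ corresponds to the lower boundary. Both the orientation and the function names are assumptions.

```python
def m_bb_bounds(a, b, c):
    """Smallest and largest |A + B e^{i alpha} + C e^{i beta}| over all phases.

    For non-negative A, B, C the maximum is A + B + C, and the minimum is
    zero whenever the largest term does not exceed the sum of the other two.
    """
    total = a + b + c
    largest = max(a, b, c)
    lower = max(0.0, 2.0 * largest - total)
    return lower, total


def m_bb_from_phi(phi, a, b, c):
    """|m_bb| interpolated linearly in phi (phi = 0 taken as the lower boundary)."""
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must lie in [0, 1]")
    lower, upper = m_bb_bounds(a, b, c)
    return lower + phi * (upper - lower)
```

Inside the funnel the lower boundary is zero, so the same linear map covers that region without special handling.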

Bayesian Methodology
A model $(H, \theta)$ is defined by selecting both a theory $H$ and a vector of values $\theta$ in the space of continuous parameters $\Theta_H$ spanning that theory. The following probability densities may be defined.
• $\pi(\theta) \equiv P(H, \theta)$ is the prior belief in the model $(H, \theta)$ before data collection;
• $L_x(\theta) \equiv P(x|H, \theta)$ is the likelihood of observing data $x$ if the model $(H, \theta)$ is assumed to be true;
• $p(\theta|x) \equiv P(H, \theta|x)$ is the posterior belief in the model $(H, \theta)$ given observed data $x$.
While the posterior and prior distributions are probability densities over $\Theta_H$, and therefore normalisable on this domain, the likelihood is instead a probability density over the space of measurable data $X$. Bayes' Theorem relates these quantities in a manner analogous to statistical mechanics, where the posterior gives the probability density for the model to 'occupy' state $\theta$ out of all possible parameter choices,

$$p(\theta|x) = \frac{L_x(\theta)\, \pi(\theta)}{M^H_x}, \qquad M^H_x = \int_{\Theta_H} d\theta\, L_x(\theta)\, \pi(\theta),$$

where the normalisation factor $M^H_x$ is known as the marginal likelihood. In practice, the prior $\pi(\theta)$ is an educated guess, perhaps taking preceding experimental information into account, but presumed to be incomplete. As measured data becomes available, the prior probability is updated according to Bayes' Theorem, and each calculated posterior probability becomes the new prior. Given enough data, this process converges to the true best-fit model regardless of error in the prior.

In this paper, we take 0νββ decay to be mediated by light neutrino exchange as described above. Our model parameter space $\Theta_H$ is represented by the lightest neutrino mass $m_l$ and the effective Majorana phase parameter $\Phi$, $(m_l, \Phi) \in \Theta_H$. For simplicity, the other neutrino oscillation parameters are fixed at their best-fit values and the mass ordering scenario, NO or IO, is considered to be known. The natural hypothesis to consider is the observation of a certain number of signal events $n$ in a 0νββ experiment and so our data space $X$ is represented by the possible counts $n = 0, 1, 2, \ldots$. This framework is easily applied to the comparison of multiple hypotheses, e.g. NO vs IO neutrino mass hierarchies, by computing marginal likelihoods for both models given the same observed data. The ratio of these quantities $K = M^{\mathrm{NO}}_x / M^{\mathrm{IO}}_x$ is the Bayes factor of the NO hierarchy versus the IO hierarchy.
Data available from current or upcoming 0νββ decay experiments in isolation is insufficient to claim convergence of the posterior distribution; instead, predictions and bounds on model parameters are sought. All such quantities are expressible as posterior integrals [19], which we calculate using samples obtained by MCMC with the Metropolis-Hastings algorithm [20].
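For readers unfamiliar with the sampler, a minimal random-walk Metropolis-Hastings sketch is shown below. It is a generic illustration with a symmetric Gaussian proposal, not the implementation used for the results in this paper; the function name and signature are hypothetical.

```python
import math
import random


def metropolis_hastings(log_post, start, step, n_samples, seed=0):
    """Random-walk Metropolis-Hastings with an isotropic Gaussian proposal.

    log_post: function mapping a list of parameters to the log of the
              unnormalised posterior (return -inf outside the allowed region).
    start:    initial parameter list; must have finite log_post.
    step:     standard deviation of the Gaussian proposal in each parameter.
    """
    rng = random.Random(seed)
    current = list(start)
    current_lp = log_post(current)
    chain = []
    for _ in range(n_samples):
        proposal = [x + rng.gauss(0.0, step) for x in current]
        proposal_lp = log_post(proposal)
        # Symmetric proposal: accept with probability min(1, posterior ratio).
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain.append(tuple(current))
    return chain
```

Posterior integrals such as means or credible bounds are then estimated as averages or quantiles over the returned chain, typically after discarding an initial burn-in segment.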

Least-Informative Priors
We now consider the impact of prior selection on Bayesian inference from 0νββ decay. In the limit of perfectly precise measurements of observables which fully cover the given parameter space, the bias introduced by a prior vanishes. Unfortunately, this is not the situation for any real experimental outcome, and any assumption implicitly made by a prior distribution must contribute to the posterior, as demonstrated for 0νββ in particular by [21]. It is therefore advantageous, to the practitioner who wishes to avoid inferential bias, to choose priors which assume the least about the outcome of the experiment, and are in this sense "uninformative". In this section we study least-informative priors (LIPs), which are intended to maximise the expected information gained through measurement and inference, and develop a methodology for the case at hand.

Theoretical Construction of Reference Priors
We first refer to [22] for a justification of the overall methodology of this paper; that is, to conduct Bayesian inference with a diverse set of priors in order to arrive at the fullest picture of what our data implies. The situation in scientific experiment is not so different from that in psychological studies of human behaviour, where personal knowledge or opinions play a non-trivial role in even the most rational decision-making. By employing priors which span the range of acceptable prior beliefs that an experimenter might hold, we can both gain confidence in inferences which hold broadly across the considered priors, and quantify the variation of inferred bounds or measurements between priors.
However, even if we are persuaded that no single prior can offer a complete understanding of any statistical inference, it becomes necessary to establish a reference prior against which the performance of all other priors might be consistently compared. A general procedure is developed in [14], which may be applied to any inference, for identifying such a prior as the solution to an optimisation problem over information-gain; an LIP. This procedure is summarised as follows, applied separately for each parameter $\theta_i$ with all others fixed.
First, taking only the requirement that information gained from multiple measurements is additive, the information contained in a distribution $P(\theta_i)$ is [23,24]

$$I[P] = \int d\theta_i\, P(\theta_i) \log P(\theta_i), \qquad (3.1)$$

which is familiar in physics as the Boltzmann-Gibbs entropy for a continuous collection of states, up to a constant factor [25]. We then take an experiment $E$, which measures data $x$, with likelihood function $L_x(\theta_i)$. Following [24], Ref. [26] calculates the expected information gain of prior $\pi(\theta_i)$ as

$$\begin{aligned} I\{E, \pi\} &= \int dx\, M_x \int d\theta_i\, p(\theta_i|x) \log p(\theta_i|x) - \int d\theta_i\, \pi(\theta_i) \log \pi(\theta_i) \\ &= \int dx\, M_x \int d\theta_i\, p(\theta_i|x) \log \frac{p(\theta_i|x)}{\pi(\theta_i)}, \end{aligned} \qquad (3.2)$$

where $p(\theta_i|x)$ is the posterior given by Bayes' Theorem, and $M_x = \int d\theta_i\, L_x(\theta_i)\, \pi(\theta_i)$ is the marginal likelihood of data $x$. The inner integral on the second line of Eq. (3.2) is known as the Kullback-Leibler divergence $K[p, \pi]$ between $p(\theta_i|x)$ and $\pi(\theta_i)$ [26], of which $I\{E, \pi\}$ is therefore an expectation value over the data-space. Whether phrased as a prior or posterior integral, this quantity depends strongly on the choice of prior, which appears implicitly in the inference of the posterior and as the measure over $\theta_i$ in the marginal likelihood.

Letting $E(k)$ indicate $k$ independent replications of experiment $E$, the quantity $I\{E(\infty), \pi\}$ describes the vagueness of prior $\pi(\theta_i)$ [14], as an infinite quantity of well-defined experiments must arrive at the same precise measurement, and so a greater expected information-gain through inference implies that more information was missing to begin with. The prior $\pi$ which maximizes $I\{E(\infty), \pi\}$ cannot simply be selected because an infinite quantity of information is needed to measure any parameter exactly. Instead, a limit must be taken as the number of measurement repetitions $k$ approaches $\infty$. For a given $k$, the reference prior $\pi_k(\theta)$ is defined as that among all permissible priors [27] which maximizes $I\{E(k), \pi\}$, where permissibility is defined using boundedness and consistency arguments over compact subsets of the parameter space.
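To make the expected information gain concrete, the sketch below evaluates it for a Poisson counting experiment with a discretised prior, i.e. the data-space expectation of the Kullback-Leibler divergence between posterior and prior, in bits. The grid discretisation, the truncation of the count sum at `n_max`, and the function names are assumptions made for illustration.

```python
import math


def poisson_pmf(n, mu):
    """Poisson probability of n counts for a strictly positive mean mu."""
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))


def expected_information_gain(thetas, prior, mean_counts, n_max=200):
    """Expected KL divergence (bits) between posterior and prior.

    thetas:      parameter grid points.
    prior:       prior probability masses on the grid (summing to 1).
    mean_counts: function mapping theta to the Poisson mean (> 0).
    n_max:       truncation of the sum over observed counts.
    """
    gain = 0.0
    for n in range(n_max + 1):
        like = [poisson_pmf(n, mean_counts(t)) for t in thetas]
        evidence = sum(l * p for l, p in zip(like, prior))  # marginal likelihood
        if evidence <= 0.0:
            continue
        for l, p in zip(like, prior):
            post = l * p / evidence
            if post > 0.0:
                gain += evidence * post * math.log2(post / p)
    return gain
```

With two equally likely parameter values whose predicted counts barely overlap, the gain approaches one bit, while a likelihood that does not depend on the parameter yields zero gain.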
Given any measurement $x$, the reference posterior $p_k(\theta_i|x)$ corresponding to prior $\pi_k(\theta)$ is calculable by Bayes' Theorem, and assuming compactness on the set of possible posteriors, the limit $p(\theta_i|x) = \lim_{k\to\infty} p_k(\theta_i|x)$ is well-defined. Due to consistent validity of Bayes' Theorem across the measurement domain, a prior $\pi(\theta_i)$ proportional to $p(\theta_i|x)/L_x(\theta_i)$ is then a well-defined LIP which is independent of $x$.
Obtaining the LIP via a posterior limit is computationally inefficient, but so long as certain regularity conditions are met [27], a limiting sequence among priors which still maximizes Eq. (3.2) may be found:

$$\pi_k(\theta) \propto \exp\left\{ \int dx_k\, L_{x_k}(\theta) \log p^*(\theta|x_k) \right\}. \qquad (3.3)$$

Here, $x_k$ is a collection of data from $k$ repeated measurements, $\pi^*(\theta)$ is an initialization prior chosen among any in the permissible set, and $k$ is taken to be large enough that the posterior $p^*(\theta|x_k)$ induced by prior $\pi^*(\theta)$ is dominated by the characteristics of the likelihood rather than by that prior. A subtlety of this construction is that a direct limit $\lim_{k\to\infty} \pi_k$ can be poorly behaved at singularities and boundaries, resulting in a comb-like LIP. In such cases, the LIP can instead be defined as a conditioned limit at some well-behaved parameter point $\theta_{i,0}$ [27],

$$\pi(\theta_i) = \lim_{k\to\infty} \frac{\pi_k(\theta_i)}{\pi_k(\theta_{i,0})}. \qquad (3.4)$$
In addition to maximizing the expected inferential information-gain, the generated LIP enjoys simple Jacobian transformation under re-parametrizations of θ [14].
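The limiting construction can be approximated by Monte Carlo: at each parameter point, draw $m$ data sets of $k$ repeated measurements from the likelihood, evaluate the log-posterior under a flat initialization prior at the true parameter value, and exponentiate the average. The sketch below does this for a toy single-parameter Poisson rate model, for which the result should approach the Jeffreys prior $\propto \theta^{-1/2}$. The toy model, the flat $\pi^*$ on a finite grid, and all names are assumptions; it is not the two-parameter 0νββ algorithm of this paper.

```python
import math
import random


def sample_poisson(mu, rng):
    """Knuth's multiplication method; adequate for the modest means used here."""
    threshold = math.exp(-mu)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def log_reference_prior(theta, grid, k, m, rng):
    """Monte Carlo estimate of log pi_k(theta), up to a theta-independent constant.

    The model is k independent Poisson(theta) counts; their sum s is a
    sufficient statistic distributed as Poisson(k * theta). The initialization
    prior pi* is flat on the uniform grid.
    """
    step = grid[1] - grid[0]
    log_grid = [math.log(t) for t in grid]
    total = 0.0
    for _ in range(m):
        s = sample_poisson(k * theta, rng)
        # Log-likelihood on the grid; factorial terms cancel in the posterior.
        log_like = [s * lg - k * t for lg, t in zip(log_grid, grid)]
        peak = max(log_like)
        log_evidence = peak + math.log(step * sum(math.exp(v - peak) for v in log_like))
        total += s * math.log(theta) - k * theta - log_evidence
    return total / m


def conditioned_prior(theta, theta_0, grid, k, m, rng):
    """Ratio pi_k(theta) / pi_k(theta_0), the conditioned form of the limit."""
    return math.exp(log_reference_prior(theta, grid, k, m, rng)
                    - log_reference_prior(theta_0, grid, k, m, rng))
```

Working with the sufficient statistic and in log space keeps the toy example cheap and avoids numerical underflow; a general likelihood would require simulating and multiplying all $k$ individual likelihood values.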

Implementation of LIP Algorithm for 0νββ
The application of LIPs to neutrino oscillation experiments is explored in [26]. When an experiment is such that the asymptotic posterior is well-approximated by a Gaussian distribution (a condition known as asymptotic posterior normality), it can be shown that the LIP is simply the multivariate Jeffreys prior of the likelihood [14]. This assumption holds and significantly simplifies computation for oscillation studies of the neutrino mass splittings, and cosmological studies of the sum-of-masses $\Sigma$. However, in 0νββ decay, the observable of interest $|m_{\beta\beta}|$ may be asymptotically small, or may even vanish if neutrinos are not Majorana particles. Asymptotic posterior normality can therefore not be said to hold for any 0νββ decay search, and instead we follow the computational procedure set out by Berger [27] for evaluating the limiting construction of the previous section numerically.

Following [13], we define our 0νββ measurement model to be a Poisson counting experiment, where the likelihood of observing $n$ counts given a background expectation $\lambda$ and signal expectation $\nu$ is

$$L_n(\nu) = \frac{(\lambda + \nu)^n}{n!}\, e^{-(\lambda + \nu)}.$$

The number of expected signal events $\nu$ is related to the 0νββ decay half-life by

$$\nu = \frac{\ln 2\; N_A\, E}{m_{\mathrm{iso}}\, T_{1/2}},$$

where $N_A$ is Avogadro's constant, $m_{\mathrm{iso}}$ is the molar mass of the enriched isotope used in detection and $E$ is the sensitive exposure, also accounting for detection efficiency. In our simulations for the LEGEND-200 $^{76}$Ge experiment, we take one year of runtime with $m_{\mathrm{iso}} = 75.921$ u and $\lambda = 1.7 \cdot 10^{-3}$ cts/(kg·yr) $\cdot\, E$ with sensitive exposure $E = 119$ kg·yr [12]. The half-life $T_{1/2}(|m_{\beta\beta}|)$ also depends on the isotope through the phase space factor and nuclear matrix element, where we use the values $G^{0\nu} = 3.04 \cdot 10^{-26}$ yr$^{-1}$ eV$^{-2}$ and $|M^{0\nu}| = 4.32$ [28].
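The counting model with the LEGEND-200 inputs quoted above can be sketched as follows. The sketch assumes the standard Poisson likelihood with mean $\lambda + \nu$, the relation $\nu = \ln 2\, N_A E / (m_{\mathrm{iso}} T_{1/2})$ with the exposure converted from kg·yr to g·yr, and $T_{1/2}^{-1} = G^{0\nu}|M^{0\nu}|^2|m_{\beta\beta}|^2$; function names are hypothetical.

```python
import math

N_A = 6.02214076e23   # Avogadro's constant, mol^-1
M_ISO = 75.921        # g/mol, 76Ge (value quoted in the text)
EXPOSURE = 119.0      # kg yr, sensitive exposure (value quoted in the text)
BKG_RATE = 1.7e-3     # cts/(kg yr) (value quoted in the text)
G0NU = 3.04e-26       # yr^-1 eV^-2 (value quoted in the text)
NME = 4.32            # |M^0nu| (value quoted in the text)


def expected_signal(m_bb):
    """Expected signal counts nu for an effective mass |m_bb| in eV."""
    inverse_half_life = G0NU * NME**2 * m_bb**2             # yr^-1
    n_atoms_times_years = N_A * (EXPOSURE * 1e3) / M_ISO    # kg yr -> g yr
    return math.log(2.0) * n_atoms_times_years * inverse_half_life


def expected_background():
    """Expected background counts lambda for the quoted exposure."""
    return BKG_RATE * EXPOSURE


def poisson_likelihood(n, m_bb):
    """Likelihood of observing n counts given |m_bb| in eV."""
    mu = expected_background() + expected_signal(m_bb)
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))
```

Under these assumptions, `expected_signal(0.1)` gives the number of signal counts for $|m_{\beta\beta}| = 0.1$ eV, which can be compared directly with `expected_background()`.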
Berger's algorithm discussed above applies only to a single parameter $\theta_i$, and therefore in a multi-parameter problem such as ours, the LIP must be obtained sequentially. Following [26], at step $j$ in the iteration, the prior on $\theta_j$ is computed using fixed values of all $\theta_{i>j}$, written $\pi(\theta_j|\theta_{i>j})$. The likelihood function is then marginalised over parameter $\theta_j$,

$$L_x(\theta_{i>j}) = \int d\theta_j\, L_x(\theta_{i \ge j})\, \pi(\theta_j|\theta_{i>j}). \qquad (3.7)$$

Note that this procedure must be repeated for each combination of parameter values for which we seek to know the LIP, placing a strong bottleneck on achievable precision. At the end of the iteration, the total LIP is given by the product

$$\pi(\theta) = \prod_j \pi(\theta_j|\theta_{i>j}).$$

For a non-separable likelihood function, this depends on the ordering of parameters [26], with more impactful parameters customarily ordered first; we therefore take $\theta_1 \equiv m_l$ and $\theta_2 \equiv \Phi$. The resultant two-stage LIP algorithm is summarized in Fig. 6 in Appendix A. A "free-phi" approximation is also considered, in which the above multi-parameter procedure is still followed, but the $\Phi$-likelihood $L^{\Phi}_n(\Phi)$ is computed over a flat $m_l$ prior, thereby removing costly interpolation evaluations.
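Written out for the two parameters used here, the sequential construction may be read as follows. This is a sketch under the ordering $\theta_1 \equiv m_l$, $\theta_2 \equiv \Phi$ stated above; the integration limits $m_a$, $m_b$ of the flat $m_l$ prior in the free-phi case are hypothetical labels introduced only for illustration.

```latex
\begin{align*}
  &\text{Stage 1 (each fixed } \Phi\text{):}
    && \pi(m_l \mid \Phi) \ \text{from the single-parameter algorithm applied to } L_n(m_l, \Phi), \\
  &\text{Marginalisation:}
    && L^{\Phi}_n(\Phi) = \int dm_l \, L_n(m_l, \Phi)\, \pi(m_l \mid \Phi), \\
  &\text{Stage 2:}
    && \pi(\Phi) \ \text{from the single-parameter algorithm applied to } L^{\Phi}_n(\Phi), \\
  &\text{Total LIP:}
    && \pi(m_l, \Phi) = \pi(m_l \mid \Phi)\, \pi(\Phi), \\
  &\text{Free-phi approximation:}
    && L^{\Phi}_n(\Phi) \approx \frac{1}{m_b - m_a} \int_{m_a}^{m_b} dm_l \, L_n(m_l, \Phi).
\end{align*}
```

In the free-phi case the marginalisation no longer requires the interpolated conditional prior $\pi(m_l \mid \Phi)$, which is where the computational saving arises.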
From a practical standpoint, the sample quantity $m$ impacts the precision of the final LIP, while the number of repetitions $k$ affects its accuracy. If $m$ is too small, the prior may be noisy but still accurate, while a $k$ far from the convergence region could lead to a prior which is far from least-informative. We select $k = 100$ for our simulations, sufficiently large to be near-convergence, but small enough to avoid precision errors as the product of likelihoods dips near $10^{-200}$. The outer loops of the algorithm are parallelizable, and so a 16-core MPI implementation of the algorithm leads to significant speed-up, allowing for $m$ up to 2000 with ∼12-hour run-times.
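The precision issue with long likelihood products can be illustrated, and in principle sidestepped, by accumulating logarithms instead of raw products. The sketch below is an alternative numerical strategy shown for illustration only; it is not claimed to be what the simulations in this paper do, and the helper names are hypothetical.

```python
import math


def log_product(likelihoods):
    """Log of a product of strictly positive likelihood values."""
    return sum(math.log(v) for v in likelihoods)


def log_sum_exp(log_values):
    """Stable log of a sum of exponentials, e.g. for a marginal likelihood."""
    peak = max(log_values)
    return peak + math.log(sum(math.exp(v - peak) for v in log_values))


# One hundred likelihood values of 1e-4 multiply to 1e-400, which underflows
# to zero in double precision, while the log-space sum stays finite.
values = [1e-4] * 100
direct = math.prod(values)
in_logs = log_product(values)
```

Posterior normalisation then uses `log_sum_exp` over the grid of log-likelihoods, so that only differences of logarithms are ever exponentiated.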

Generated LIPs for LEGEND-200
To illustrate the above procedure, we choose the future 0νββ decay experiment LEGEND-200 [29] as the basis for a measurement example. LEGEND-200 plans to use 200 kg of the isotope $^{76}$Ge to achieve a 3σ sensitivity to half-lives greater than $10^{27}$ yr. This corresponds to an expected number of background events of $\lambda = 1.7 \cdot 10^{-3}$ cts/(kg·yr) $\cdot\, E$ with sensitive exposure $E = 119$ kg·yr [12] taken over one year of runtime as an example to illustrate our algorithm. Here the sensitive exposure $E$ is defined as the product of the total exposure with fiducial volume and signal detection efficiencies, also accounting for a 2σ region of interest around the decay energy. Two configurations of the LIP algorithm were considered: the full two-parameter integration discussed above and specified in Figure 6 in the Appendix, and the free-phi approximation.
The results of our simulation, with $2 \times 10^{5}$ likelihood draws at each parameter point $(m_l, \Phi)$ across a grid with resolutions $\Delta\Phi = 0.1$ and $\Delta \log_{10}(m_l) = 0.2$, are shown in Fig. 3. The top left and top right plots show the LIP $\pi(m_l, \Phi)$ in the NO scenario calculated using the full simulation and the free-phi approach, respectively. They demonstrate that the free-phi approximation is generally valid throughout the parameter space. The greatest deviation occurs for large values of both parameters, a region where relevant likelihoods tend to be very low, and the impact upon posterior inference is therefore expected to be negligible. It should be noted that slow convergence at parameter boundaries causes erroneous growth of the raw LIP, producing boundary walls which are locally smoothed in post-processing to produce the visuals throughout this work, and before MCMC or information computations are performed. Signal-to-noise ratio (SNR) analysis of repeated trials showed a minimum SNR of 1000, or 30 dB, between individual generated LIPs and averages of three LIPs, as a reference. Practically, this corresponds to a noise amplitude of at most ±0.05 everywhere in the LIP distribution.
In the right column of Fig. 3, LIPs computed for NO (top) and IO (bottom) neutrino mass orderings are compared and seen to be structurally similar. Both priors feature a near-linear increase with $\Phi$, and a significant trough in $m_l$ between $10^{-2}$ and $10^{-1}$ eV. On either side of this trough, the prior is nearly flat in $m_l$, with higher density for $m_l > 0.1$ eV. Note that these functions are distributions over $m_l$, plotted on a logarithmic scale, rather than distributions over $\log(m_l)$, which would carry an additional factor of $m_l$ from the Jacobian transformation.
It is significant that a predominantly flat prior in $m_l$ emerges from a first-principles Bayesian simulation, perhaps indicating the naturalness of a flat prior for unknown particle masses. The trough may be understood physically as the region of parameter space where LEGEND-200 has the greatest propensity to make either a measurement or an exclusion; the LIP therefore reduces the weight in this region so that any inferences made can be said to be more fully data-driven.

Comparison of Inferred Bounds on m l
We evaluate the performance of the computed LIPs by utilizing them (after bi-cubic spline interpolation) in our MCMC analysis, equipped with the effective parametrisation $(m_l, \Phi)$. Only the experimental likelihood from LEGEND-200 is included, modelled with a Poisson distribution following [13] with $\lambda = 1.7 \cdot 10^{-3}$ cts/(kg·yr) $\cdot\, E$ and sensitive exposure $E = 119$ kg·yr [12].
We consider first the case where LEGEND-200 registers $n = 0$ signal events during its run. We include $10^{7}$ MCMC samples, using a Gaussian proposal distribution of width $10^{-2}$ in both $m_l$ and $\Phi$. The resulting posterior distributions for flat (in both $m_l$ and $\Phi$) and log-flat (flat in $\log m_l$ and $\Phi$) priors as well as both LIP implementations are shown in Fig. 4 for NO (left) and IO (right). This further confirms the strong match between free-phi and full LIP calculations, and both are seen to lead to inferences very similar to those of the flat prior. In the NO case, these three priors exclude $m_l > 0.11$ eV at a 90% credibility level, while the log-flat prior excludes $m_l > 0.03$ eV; in the IO case, these upper bounds are, as expected, slightly higher, though within simulation error.
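An exclusion of this kind can be read off a chain of posterior samples as a one-sided quantile. The sketch below assumes the bound is defined as the value below which the stated fraction of posterior samples lies; the function name is hypothetical and no particular interpolation convention is implied.

```python
import math


def upper_credible_bound(samples, level=0.90):
    """Smallest sample value with at least `level` of the samples at or below it."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("samples must not be empty")
    index = math.ceil(level * len(ordered)) - 1
    index = max(0, min(len(ordered) - 1, index))
    return ordered[index]
```

Applied to the $m_l$ component of an MCMC chain, `upper_credible_bound(chain_ml, 0.90)` returns the value above which $m_l$ is excluded at 90% credibility under this definition, where `chain_ml` is a hypothetical list of sampled $m_l$ values.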
Next we set the observed count to $n = 1$, and repeat the inference procedure, with results shown in Fig. 5. Again both calculated LIPs give similar posteriors to the flat prior. In the NO case, Fig. 5 (left), these three priors result in $m_l = 90 \pm 50$ meV. In the IO case, Fig. 5 (right), the flat and LIP measurements give slightly lower values, but with comparable precision. However, the log-flat prior fails to make any measurement in either hierarchy, instead placing an exclusion at 90% credibility on $m_l > 75$ meV in the NO case, and on $m_l > 45$ meV in the IO case, as the measurement falls squarely within the region extending to low $m_l$ which log-flat priors are intended to probe. It is not surprising that LEGEND-200 fails to distinguish between the neutrino-mass hierarchies, as it does not probe sufficiently small $|m_{\beta\beta}|$ where the allowed parameter regions notably diverge.

Information Content of Inferences
However, the value of LIPs does not lie in their propensity to give a conservative bound or measurement (which the flat prior already achieves), but in their trustworthiness as a reference prior with minimised bias. In Table 1, we report the Kullback-Leibler divergences of each MCMC posterior against its prior, using the same LEGEND-200 likelihood as above. Error propagation was performed by considering a noisy LIP of the form $\pi(\theta) = \pi_{\mathrm{true}}(\theta) \pm n_\pi(\theta)$, which in the limit of small noise (and assuming that variation in the posterior is dominated by variation in the prior) corresponds to a noisy Kullback-Leibler divergence:

$$K[p, \pi] \approx K[p, \pi_{\mathrm{true}}] \mp \int d\theta\, p(\theta|x)\, \frac{n_\pi(\theta)}{\pi_{\mathrm{true}}(\theta)},$$

where the integrated error may be interpreted as the posterior expectation value of the inverse of the signal-to-noise ratio distribution for $\pi(\theta)$. Computation of this integral for the worst-case noise distribution mentioned in Section 4.1 gave an error of ±0.03 bits, leading us to quote our divergence values to 0.01 bit precision.
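Both the divergence and its first-order noise term can be estimated on a grid. The sketch below assumes densities tabulated on a uniform grid with cell volume `cell`, and a first-order expansion in which the noise term is the posterior-weighted integral of $n_\pi/\pi$, converted from nats to bits; it is an illustration with hypothetical function names, not the analysis code behind Table 1.

```python
import math


def kl_divergence_bits(posterior, prior, cell):
    """K[p, pi] in bits for densities tabulated on a uniform grid."""
    return sum(p * math.log2(p / q) * cell
               for p, q in zip(posterior, prior) if p > 0.0)


def noise_error_bits(posterior, prior, noise, cell):
    """First-order shift of K[p, pi] from prior noise n_pi, in bits."""
    shift_nats = sum(p * n / q * cell
                     for p, q, n in zip(posterior, prior, noise))
    return shift_nats / math.log(2.0)
```

For a flat prior on the unit interval and a posterior confined uniformly to half of it, the divergence is exactly one bit, which provides a quick sanity check of the grid normalisation.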
The results give a numerical confirmation of the similarity between free-phi and full LIP computations, whose divergences consistently fall within 0.25 bits of each other. In nearly all configurations, the LIPs outperform both standard priors in information gain, as expected from their construction.

Conclusion
Bayesian parameter inference is a common tool to constrain or determine parameters in particle physics. An inherent issue in this context is the choice of a prior distribution over the model parameters. We have here focussed on the neutrino parameter space relevant to 0νββ decay searches, specifically the lightest neutrino mass $m_l$ and an effective Majorana phase parameter $\Phi$ encapsulating the effect of the Majorana phases in the lepton mixing matrix. Given that 0νββ decay has not been observed yet, prior distributions are expected to have a strong impact on the conclusions drawn.
We have adapted an algorithm for computing least-informative priors for a given experiment via likelihood-sampling to the case of 0νββ direct searches, resulting in exact and approximate parallelised implementations. The LIPs were seen to take the form of a flat-$m_l$, linear-$\Phi$ distribution broken by a trough between $m_l = 10^{-2}$ and $10^{-1}$ eV. We demonstrated that for the proposed 200 kg $^{76}$Ge LEGEND experiment, these priors give similar posterior bounds to the usually adopted flat prior for both neutrino orderings, and in both observation and non-observation scenarios. Furthermore, the LIPs were seen in nearly all cases to outperform both standard flat and logarithmic priors as far as their information-gain during MCMC inference is concerned. This supports the functionality of the adapted algorithm and strengthens the argument for the use of LIPs as reference priors for 0νββ decay searches.
Natural extensions of this work include a study of the variation in LIP performance across proposed experiments of diverse background levels and exposures, simulation of LIPs for the usual parametrization using two Majorana phases, and research towards the construction of a prior which is jointly least-informative over both the 0νββ observable $|m_{\beta\beta}|$ and the cosmology observable $\sum m_i$.

Fig. 6 (Appendix A, two-stage LIP algorithm), surviving step: interpolate over pairs $(m_l^*, \pi(m_l^*|\Phi^*))$ to obtain a smooth $\pi(m_l|\Phi^*)$.