A Unified Approach to Enhanced Sampling

The sampling problem lies at the heart of atomistic simulations and over the years many different enhanced sampling methods have been suggested towards its solution. These methods are often grouped into two broad families. On the one hand methods such as umbrella sampling and metadynamics that build a bias potential based on few order parameters or collective variables. On the other hand tempering methods such as replica exchange that combine different thermodynamic ensembles in one single expanded ensemble. We adopt instead a unifying perspective, focusing on the target probability distribution sampled by the different methods. This allows us to introduce a new method that can sample any of the ensembles normally sampled via replica exchange, but does so in a collective-variables-based scheme. This method is an extension of the recently developed on-the-fly probability enhanced sampling method [Invernizzi and Parrinello, J. Phys. Chem. Lett. 11.7 (2020)] that has been previously used for metadynamics-like sampling. The method is thus very general and can be used to achieve different types of enhanced sampling. It is also reliable and simple to use, since it presents only few and robust external parameters and has a straightforward reweighting scheme. Furthermore, it can be used with any number of parallel replicas. We show the versatility of our approach with applications to multicanonical and multithermal-multibaric simulations, thermodynamic integration, umbrella sampling, and combinations thereof.


I. INTRODUCTION
Sampling is one of the main challenges in atomistic simulations. In fact even the most accurate models cannot produce high quality results if the phase space is not properly sampled. The sampling issue is due to the large gap between the physical macroscopic timescales and the actual time that can be explored in standard atomistic simulations. This results in an ergodicity problem that can be encountered in fields as varied as materials science, chemistry, and biology. One facet of this problem is the existence of different metastable states separated by kinetic bottlenecks, that make the transition from one state to another a rare event. Enhanced sampling methods are a possible solution to this problem. Instead of extracting configurations from the relevant physical ensemble, these methods create an ad hoc modified ensemble in which the probability of sampling rare events is greatly enhanced. One kind of such target ensembles is obtained by combining multiple sub-ensembles that differ only for the temperature or some other quantity, a typical example being parallel tempering [1]. We shall refer to these ensembles as expanded ensembles [2].
In the present paper we formulate the problem of generating such expanded ensembles in a way that allows us to use collective-variables-based methods. Particularly efficient in this respect has proven to be the recently developed on-the-fly probability enhanced sampling [3] (OPES). This method was introduced as an evolution of metadynamics [4], since it is capable of the same kind of enhanced sampling but presents in most cases a faster convergence, while having only few and robust adjustable parameters. These properties of OPES are retained when it is applied to sample expanded ensembles. This provides us with a general and reliable method, that can be easily applied to sample many different ensembles.
We accompany this paper with a number of general considerations (Secs. II and VII), but the reader mostly interested in the method itself and its practical implementation can go directly to Sec. IV. Section III briefly recalls OPES in its original formulation for metadynamics-like sampling. We also present a variety of simulations to show the versatility of the new scheme, in particular (Sec. V) multicanonical, multithermal-multibaric, thermodynamic integration, but also (Sec. VI) enhanced sampling based on an order parameter, both alone and in combination with the previous ensembles.

II. A UNIFIED APPROACH
The most popular approaches to enhanced sampling follow mainly two strategies. A first one was proposed in a pioneering work by Torrie and Valleau and referred to as umbrella sampling [5,6]. This method starts by identifying one or few order parameters, or collective variables (CVs), s = s(x), that are function of the microscopic configuration and encode some of the slow modes of the system. Then a bias potential that is function of the CVs is added to the energy of the system, so that the sampling of the slow modes encoded by the CVs is accelerated. Many have followed this approach, and nowadays one of the most popular methods in this class is metadynamics [4].
A different perspective to enhanced sampling is that of parallel or simulated tempering [7,8]. In this case the idea is to combine in the same generalized ensemble the configurations explored by the system at different temperatures. This can improve the sampling because at higher temperatures the exploration of the phase space is often more efficient, and the system is less likely to remain stuck in metastable states. Over the years this approach has been extended and implemented in a variety of different methods, among which replica exchange [9] is probably the most widely employed.
These two families of enhanced sampling methods often have been seen as distinct and complementary. Although there are some papers in which the two perspectives are combined [10][11][12][13], typically they have been perceived as hybrid approaches [14]. Here we want to take a closer look at these two families and show that it is possible to provide a unified perspective to the enhanced sampling problem.
For a start we must specify that we are not interested in looking at the specific computational technique the various enhanced sampling methods use, since according to this criterion there would be many more than two families. There are methods that use a bias potential and others that use specific Monte Carlo moves [9], but also methods that introduce a fictitious dynamics [15], or that focus on directly modifying the atomic forces [16], to name just a few. This kind of classification is of course perfectly legitimate, but we find it of limited relevance for our purposes.
The distinction between the two families cannot be based on the fact that one uses system-specific CVs, while the other makes use of general thermodynamic properties. For instance it is known that Hamiltonian replica exchange can be used to enhance the fluctuations of any chosen CV [17], and on the other hand that it is possible with metadynamics to use the potential energy itself as CV and sample a multithermal ensemble [18].
Thus we prefer to focus on the target distribution p tg (x) that the different methods sample. In fact each enhanced sampling method explicitly or implicitly aims at sampling a specific probability distribution in the configuration space that is not the physical one, but assigns a higher probability to some rare event. Designing such target distributions so that they are effective is far from trivial, and we can relate the two families of methods to the type of target distribution they imply.
A first class of enhanced sampling methods defines the target distribution by setting a constraint on its marginal distribution along some chosen CVs, p tg (s) = δ(s(x) − s) p tg (x) dx. The most common choice is to impose a uniform marginal, p tg (s) = const. Among the methods that adopt this strategy are adaptive umbrella sampling [19] and metadynamics in its original formulation [4]. Also the Wang-Landau algorithm [20] and various multicanonical algorithms [21,22] chose to sample a flat marginal distribution, using the potential energy as CV. An interesting case is the one of well-tempered metadynamics [23] that aims at sampling an s distribution that is a smoothed version of the original one. Contrary to the uniform case and in general to the fixed target case [24], the well-tempered target explicitly depends on the unbiased probability, and is thus not completely known beforehand. Other kinds of targets are also used in the 1/k ensemble [25] and in nested sampling [26].
Another class of methods will be the main focus of this paper and it is the one that aims at sampling the so-called expanded ensembles [2]. These targets are not defined explicitly as a function of some CVs, but rather consist in the sum of overlapping probability distributions. A typical enhanced sampling technique that targets expanded distributions is for example replica exchange [9]. Expanded ensembles can be obtained by combining the same system at different temperatures, or more in general different Hamiltonians [17,27,28]. They can be sampled also with single replica approaches, such as simulated tempering [8], and integrated tempering [29,30]. Broadly speaking, one could consider as part of this expanded ensemble class also methods like multiple windows umbrella sampling [31], or thermodynamic integration [32], where multiple separated ensembles are simulated and then combined into one via some post-processing procedure such as WHAM [33].
It is important to notice that by classifying enhanced sampling methods with respect to p tg (x) we are not implying that methods in the same class are equivalent. Different methods in fact can use very different strategies to reach their target, each having its own strengths and weaknesses. However, this target-distribution perspective suggest that there is not a fundamental difference between the two traditional families, and that a unified approach is possible.
From this point of view, a special place is occupied by variationally enhanced sampling (VES) [34] and onthe-fly probability enhanced sampling (OPES) [3], since in these methods one has to explicitly choose a target distribution. This makes them particularly suited for developing a unified approach, since they are in principle capable of sampling the targets of both of the two families of enhanced sampling, and also combine them in new ways. In VES the usefulness of various target dis-tributions has already been explored [35][36][37]. In particular, a target distribution has been proposed for sampling multithermal-multibaric ensembles [38] and also for combining them with a CV that drives a phase transition [39]. It is this paper that inspired us to try a generalized unified approach.
Our goal here is to introduce explicitly the expanded ensemble target distribution and show that it can be sampled by using a CV-based bias potential method such as VES or OPES. In doing so we will introduce the concept of expansion CVs, that allows us to define both the target expanded distribution and the bias needed to sample it. The method we propose is thus capable of sampling both kind of target distributions, those typical of replica exchange, but also the uniform and well-tempered distributions similarly to metadynamics. In this sense it provides a unified approach to enhanced sampling.

III. ON-THE-FLY PROBABILITY ENHANCED SAMPLING
The recently developed on-the-fly probability enhanced sampling (OPES) [3] is a collective-variablesbased method. Collective variables (CVs) are function of the microscopic configuration, s = s(x), that provide a low dimensional description of the system. In OPES we aim at modifying the physical probability distribution of s, P (s), in order to reach a given target probability distribution, p tg (s). To achieve this we must add the following bias potential where β is the inverse temperature. OPES has been introduced as an evolution of metadynamics and in this spirit we first have used the well-tempered target distribution, defined as p W T (s) ∝ [P (s)] 1/γ , where γ > 1 is known as bias factor. This target distribution aims at increasing the transition rate between metastable states of the system, by lowering of a factor γ the free energy barriers along the CVs. In the limit of γ = ∞ it is equivalent to choosing a uniform target. Since P (s) is not known a priori, we resort to an iterative scheme. The core idea in OPES is to estimate the probability distribution at each step n, P n (s), by reweighting on-the-fly a simulation that is biased with V n (s) which is itself constructed from such P n (s) estimate according to Eq. (1). The P n (s) is obtained via a weighted kernel density estimation, adding a new kernel at a fixed small interval, similarly to metadynamics.
We refer the interested reader to Ref. 3, where the OPES iterative equations for the case of a well-tempered and a uniform target are presented in detail, and to Refs. 40, 41 for some initial applications. In the present paper we introduce a class of target distributions that allows sampling any expanded ensemble. We will also present in detail the OPES iterative scheme for this class of targets.
In applying OPES to sample expanded ensembles we find a method that is similar in spirit to that of Ref. [2] and to other more recent methods, such as integrated tempering sampling [29], infinite switch simulated tempering [30], and variationallyderived intermediates [42].

IV. TARGETING EXPANDED ENSEMBLES
Let us call u(x) the adimensional reduced potential that contains all the terms depending on the thermodynamic constraints, such as temperature, pressure, or others. With x we concisely indicate the atomic coordinates and any other configurational variable that the reduced potential might depend on, such as the volume or the box tensor. As an example, in the case of the canonical ensemble one has u(x) = βU (x), where β is the inverse temperature and U (x) is the potential energy of the system. Let us consider a system with a reduced potential u λ (x) that is a function of λ, where λ could be either a single parameter or a set of parameters, and might indicate e.g. a thermodynamic property such as the temperature. At equilibrium its probability distribution follows Boltzmann statistics: where Z λ is the partition function, Z λ = e −u λ (x) dx.
We are interested in sampling configurations in a range ∆λ of λ-values. Instead of running multiple independent simulations at different values of λ, we can sample a generalized ensemble which contains all the relevant microscopic configurations, and then reweight them to retrieve the correct statistics for any λ ∈ ∆λ. Sampling such ensemble over ∆λ instead of the separate single λ-ensembles is more efficient when different λ-ensembles overlap in the coordinate space, and it can also help in solving ergodicity problems.
We must choose as target a distribution that covers all the microscopic configurations relevant to the chosen ∆λ range. Similarly to what is done in replica exchange, we choose a set {λ} of N {λ} values λ ∈ ∆λ such that there is a good overlap between contiguous P λ (x). We then define our target distribution as: We will refer to this class of target probability distributions as expanded ensemble target distributions. In the present paper we limit ourselves to considering nonweighted expanded targets, that assign the same 1/N {λ} weight to all the sub-ensembles, but it is also possible to add some λ-dependent weights and give different importance to different P λ (x).
Without loss of generality, one can consider λ = 0 to be inside the desired interval ∆λ. It is then possible to run a simulation at λ = 0 and use OPES to iteratively optimize a bias that allows one to sample p {λ} (x). Before proceeding to explicitly write the target distribution and the bias potential, we express P λ (x) as where ∆u λ (x) = u λ (x) − u 0 (x) is the potential energy difference and is the dimensionless free energy difference from the reference system u 0 , that can be expressed also as an ensemble average, indicated with the notation · u0 . Our expanded target thus becomes In order to define the target bias, we first rewrite Finally the adimensional bias needed to sample the ex- that bears some resemblance to the one adopted in parallel bias metadynamics [43]. Note that in writing the bias in this way P 0 (x) cancels out. It follows that the bias potential v(x) depends on the coordinates x only through the N {λ} quantities ∆u λ (x). We shall refer to these ∆u λ as expansion collective variables. The expansion CVs completely characterize the expansion, since not only the bias, but also ∆F (λ), Eq. (5), and the expanded target distribution p {λ} (x), Eq. (6), are unambiguously defined through these quantities. We will see how, by properly choosing the expansion CVs ∆u λ (x), it is possible to sample different kinds of expanded ensembles. For each of them we will also highlight the connection between these expansion CVs and more traditional CVs that have a straightforward physical interpretation.
Our target bias, Eq. (8), depends on the free energy along the λ parameter, ∆F (λ), that is in general unknown. In the OPES spirit we will reach the target bias iteratively, by estimating on the fly ∆F (λ) via a reweighting procedure, and using such estimate to define the applied bias.

A. Iterative OPES Scheme
The free energy difference ∆F (λ) defined in Eq. (5) can be written using an ensemble average over the reference unbiased system u 0 [44]. However, estimating e −∆u λ u0 from an unbiased trajectory is practically impossible due to the extremely small overlap between P 0 and e −∆u λ . For this reason we use reweighting to write it as an average over the biased ensemble where the ensemble average · u0+v is computed as a time average over a biased trajectory. In this way, one can obtain a much more accurate estimate of ∆F (λ).
The problem with this procedure is that the target bias v, Eq. (8), is itself a function of ∆F (λ). Therefore we set up an iterative scheme that consists in running a biased simulation whose bias is based on the estimate of the free energy difference that we obtain via on-the-fly reweighting. At step n the simulation runs with potential The reweighted estimate ∆F n (λ) is updated at every iteration step n. In between the iteration steps there is a fixed short stride where the simulation proceeds and both ∆F n (λ) and the bias v n (x) are kept constant. The free energy estimate at the nth step can be explicitly written as where we use the notation ∆u , with x k the configuration at the kth simulation step.
As the bias approaches convergence, the ensemble sampled approaches the target one, and the ∆F n (λ) estimates become more and more accurate. Thus not only do we obtain the target bias, but we also have an estimate of the free energy as a function of the λ parameter, i.e. ∆F (λ). Our iterative scheme is similar in spirit to the one used in integrated tempering sampling [29], but the two differ both in their implementation and in their applications.
Eqs. (10) and (11) are the explicit OPES iterative equations used for sampling the expanded ensemble defined by the target distribution p {λ} (x), Eq. (3), and are at the heart of our method. In the following sections we will show how these equations can be used to sample different expanded ensembles, simply by specifying different expansion CVs ∆u λ (x). Once these are chosen, the only free parameter of the method is the stride between the iterations. This should be set so that consecutive steps are not too correlated, as it is the case for the on-the-fly Gaussians deposition in metadynamics.
It is possible to parallelize the procedure using multiple replicas of the system, as done in multiple walkers metadynamics [45], where each replica shares the same bias and all contribute to the ensemble average in Eq. (11). At variance with replica exchange, the number of parallel simulations does not have to be equal to the number N {λ} of λ-points that define the target.
Finally we notice that one could consider expressing the free energy ∆F (λ) via a cumulant expansion [46,47]. This generally provides a very good estimate close to the reference λ = 0, but can be very inaccurate when the range is broad, requiring a great number of terms in the expansion. Furthermore we found it can introduce artificial barriers that stop the system from visiting all the relevant configurations, thus breaking the OPES selfconsistent procedure.

B. Reweighting
Until now we have seen how to sample expanded ensembles by applying a bias potential. We now need a reweighting procedure in order to retrieve statistics at any desired value of λ. To this effect one can use standard umbrella sampling reweighting [6]. Given any observable O = O(x) that is a function of the atomic coordinates, we can calculate its average in the ensemble λ via the following reweighting equation: where O k ≡ O(x k ) and the weight w k (λ) is defined as k−1 . This equation assumes that the applied bias is static or quasi-static, meaning that it is updated in an adiabatic fashion. It is thus good practice to discard an initial transient of the simulation, where the bias changes quickly, and not use it for reweighting. Determining the exact length to be discarded might not be intuitive, however OPES generally assigns a very low weight to this initial transient, and thus the result will not be significantly affected by this choice [3].
A useful diagnostic tool when performing reweighting is the so-called effective sample size, defined as the number of sampled points n times the ratio between the variance of an observable in the unbiased ensemble and its variance in the reweighted ensemble [48]. For practical purposes we will use not this definition, but rather the popular estimator [49] defined as: where w k (λ) are the importance sampling weights. Intuitively, the effective sample size for a given λ will be smaller than the total number of samples, n eff (λ) < n. One should expect the efficiency to be roughly n eff (λ)/n ∝ 1/N {λ} , given a minimal choice of λ-points that properly covers the target range. Plotting n eff /n as a function of λ can be a good diagnostic tool to monitor the consistency of the iterative procedure. Finally we notice that the estimate of uncertainties requires some extra care in case of weighted samples [50]. In the Appendix A we describe in detail the weighted block averaging procedure we adopt, and show how the effective sample size plays a role.

V. LINEARLY EXPANDED ENSEMBLES
An important type of expanded ensemble is the one obtained by linearly perturbing the reduced potential of the system, u λ (x) = u 0 (x) + λ∆u(x). It is defined by the following expansion CVs Various different ensembles can be obtained in this way, such as the multicanonical ensemble and the multibaric ensemble, but also alchemical transformations, and others. Recently an interesting "multiforce" ensemble that falls in this category has been proposed [51]. We will present in detail some of these ensembles in the following sections.
It can be useful to group together these linearly expanded ensembles because they share some interesting properties. In particular for these ensembles we can propose a simple automatic way to chose the λ-points that define the target p {λ} (x). The idea is to have the λ-points uniformly distributed in the ∆λ interval with a spacing ∆λ/N {λ} estimated from the effective sample size as a function of λ, n eff (λ). In practice what we do is to run a short unbiased simulation of n steps at λ = 0 and use a root finding algorithm to determine the points λ + > 0 and λ − < 0 such that n eff (λ ± )/n ≈ 0.5. Then one can use a total of N {λ} = ∆λ/(λ + + λ − ) equally spaced λpoints to define the target p {λ} (x).
This heuristic way of choosing the λ-points is not optimal and more elaborate options have been explored in the replica exchange literature [52]. However, in our case this choice is less critical, since in OPES one can increase N {λ} without the need to simulate additional replicas of the system. Thus this procedure provides an easy and automatic guess for linearly expanded ensembles that can be practically useful in many scenarios.

A. Multicanonical Ensemble
We start by considering as example of linearly expanded ensemble the case of the multicanonical ensemble, which is probably the one with the longest history.
The goal is to sample all the configurations relevant for canonical simulations in a given range of temperatures. In a canonical simulation the reduced potential is u(x) = βU (x), where U (x) is the potential energy, β = 1/(k B T ) is the inverse thermodynamic temperature and k B is Boltzmann constant. It is possible to define a multicanonical linearly expanded ensemble, by putting ∆u(x) = u 0 (x) = β 0 U (x) and λ = (β − β 0 )/β 0 , where β 0 is the inverse temperature set by the thermostat of the simulation, and β spans the target range, β min < β < β max .
The expansion CVs that define such target are and by using them in the OPES iterative equations, Eqs. (10) and (11), we obtain our multicanonical simulation. Given the physical significance of the inverse temperature β, it is more natural to directly consider β as parameter instead of the dimensionless λ. We thus write ∆u β and ∆F = ∆F (β), where we have set ∆F (β 0 ) = 0. Similarly, it is natural to consider the potential energy U (x) as collective variable, and thus write the bias as It is important to notice that we did not require the bias to be a function of a single CV, but rather we find it to be the case when we set as target the temperatureexpanded ensemble. This is in fact a general property of linearly expanded ensembles. When expanding according to a given λ, the resulting bias will be a function only of the thermodynamic conjugate variable ∆u. To define the bias v = v(∆u) we then need to estimate the free energy along λ, ∆F (λ).
Other multicanonical methods aim instead at sampling a flat energy distribution [21,22,38]. In order to do so, they need to estimate the free energy as a function of U (or equivalently the density of states along U ), while in our method, as in other tempering approaches [30], we instead need to estimate the free energy as a function of temperature, ∆F (β).

Example: Alanine Dipeptide
As an example of multicanonical sampling we consider alanine dipeptide in a vacuum, at temperature T 0 = 300 K. This is a typical toy model for testing enhanced sampling methods, since at room temperature it presents two metastable states with an extremely low transition probability. A possible way of enhancing the sampling is to bias the φ and ψ dihedral angles, using as target a flat uniform distribution or the well-tempered distribution [3]. Here instead we bias the potential energy U , and use as a target the multicanonical ensemble over a temperature range from 300 K up to 1000 K.
Simulations are performed with the molecular dynamics software GROMACS [53], patched with the enhanced sampling library PLUMED [54] (see SM for computational details). The only input needed for OPES, beside the temperature range we are interested in sampling, is the stride with which updating the bias, that is taken to be 500 simulation steps (1 ps). Fig. 1a shows on the φ, ψ plane the configurations sampled during the 100 ns multicanonical run. It is interesting to notice that the potential energy U would be considered a bad CV in enhanced sampling methods such as metadynamics, since it cannot distinguish between the two basins, that have roughly the same energy. However, when using the multicanonical ensemble as target, by biasing U we can enhance the probability of visiting the transition state (roughly the region where φ = 0), and thus observe multiple transitions between the basins and converge the free energy difference between them, ∆F AB (Fig. 1b). We can use the angle φ to define this free energy difference between the two basins: where χ is a characteristic function, equal to 1 if the variable is in the proper range and 0 otherwise.
In the supplemental material we show a comparison between this multicanonical run and a well-tempered run biasing the two dihedral angles. As expected the latter is much more efficient (roughly 10 times) in converging the free energy difference at a single temperature, due to the fact that it focuses on the relevant degrees of freedom.
In this case we consider NPT simulations with a ref- where p is the pressure and V (x) the volume. Similarly to what done before, it is more natural to use as λ parameters directly the temperature β and the pressure p, and write the expansion CVs ∆u λ (x) as The target distribution is defined by a set of N {β} temperatures β ∈ [β min , β max ] and N {p} pressures p ∈ [p min , p max ], for a total of N {β,p} = N {β} × N {p} different ∆F (β, p) to be estimated. We will also express the bias, Eq. (8), as a function of the potential energy and the volume v = v(U, V ), which come as a natural CVs choice. As already discussed, the intermediate temperatures β and pressures p that define the target can be chosen automatically from a short unbiased simulation. We can do this independently for the two parameters, despite the fact that the pressure term p is multiplied by β in Eq. 18.
Finally we notice how the choice of β 0 and p 0 is completely free. As long as they lay inside the range of temperatures and pressures that we aim at sampling, no matter what thermodynamic conditions we start from at convergence we will sample the same configurational space. However, when the target range is very broad, choosing β 0 and p 0 roughly at the center can help to speed up convergence.

Example: Chignolin
As an example of a multithermal-multibaric simulation we consider the miniprotein chignolin (CLN025) with CHARMM22* force field [55] and TIP3P water (about 2800 molecules), over a temperature range from T min = 270 K to T max = 800 K and a pressure range from p min = 1 bar to p max = 4000 bar. The velocity-rescaled thermostat [56] is set at T 0 = 500 K and the Parrinello-Rahman barostat [57] at p 0 = 2000 bar. The ∆F n (β, p) estimates and the bias are updated every 500 simulation steps (1 ps). The N {β} temperature steps and N {p} pressure steps are chosen automatically based on a short 100 ps unbiased run. This results in 92 steps in temperature and 26 in pressure, for a total of N {β,p} = 2392 points. In order to avoid the region of low pressure and high temperature where water could evaporate, we discard any (β, p)-point laying below the line from (500 K, 1 bar) and (800 K, 1000 bar). In this way 91 (β, p)-points are discarded.
The simulation is performed in parallel using 40 multiple walkers, and runs for a total of 300 ns of which roughly 10 ns are needed to converge the bias. Also in this case we use GROMACS patched with PLUMED.
In Fig. 2a we show the distribution sampled in the energy-volume space, while in Fig. 2b the corresponding effective sample size n eff is plotted, as a function of temperature and pressure and rescaled over the total number of samples n. The n eff /n is not perfectly uniform, but it has the same order of magnitude over the whole target region. On the right, Fig. 2c, we show for one of the 40 replicas the energy and volume trajectory, together with the trajectory of the Cα-RMSD to the experimental NMR structure [58].
In Fig. 3 we show the chignolin folded fraction at the different temperatures and pressures we targeted. The folded fraction is defined as in Ref. [59], using a dualcutoff on the Cα-RMSD based on the CLN025 experimental NMR structure. A configuration is considered folded when the RMSD goes below 0.1 nm, and unfolded when it goes above 0.4 nm. In the inset we compare our results with those of Ref. [59], that considered a smaller temperature range at standard pressure. The confidence interval of our estimate is calculated with the block analysis described in Appendix A.
The stability diagram of chignolin (Fig. 3) does not present striking features, but it has been recently shown [60] that other miniprotein can have a non-trivial phase diagram, with high pressure and low temperature unfolding.

C. Thermodynamic Integration
Another interesting application of our method is its use for performing thermodynamic integration [61]. Let's consider a system with reduced potential energy u 0 (x) and free energy F 0 = − log Z 0 and another similar system with potential u 1 (x) and free energy F 1 . We are interested in calculating the free energy difference ∆F 0→1 = F 1 − F 0 , for instance because we know the free energy of one of the two systems and in this way we can retrieve the other one. The key idea of thermodynamic integration is to define a ladder of intermediate systems with reduced potentials u λ (x) and 0 < λ < 1, to connect the two systems. The free energy difference ∆F 0→1 to go from the u 0 system to the u 1 can be calculated via the following integral: Typically individual simulations are run using u λ (x) for different values of λ and the ensemble average is estimated for each of them. Then the integration in Eq. (19) can be carried out numerically, e.g. using the trapezoid rule or a Gaussian quadrature. The most common way to define the intermediate states u λ (x) is via a linear interpolation where ∆u(x) ≡ u 1 (x) − u 0 (x). In this case we have ∂u λ /∂λ = ∆u.
In the spirit of the present paper, we aim at performing a single simulation that samples all values of λ simultaneously. It is then possible to reweight for any λ and calculate the integral in Eq. (19). Thus we simulate the system at u 0 (x) and build a target p {λ} (x) as in Eq. (3) using N {λ} λ-points in the interval 0 < λ < 1. The OPES iterative equations, Eqs. (10) and (11), can be written using the expansion CVs ∆u λ = λ∆u as defined in Eq. (20).
Finally we notice that thermodynamic integration can be performed using interpolation schemes different from the linear one, and our method is general and can be applied also in those scenarios, simply by properly defining the expansion CVs ∆u λ (x).

Example: TIP4P Water to Lennard-Jones
We now use the thermodynamic integration formalism described above to calculate the free energy of TIP4P water, relative to a reference Lennard-Jones system. The TIP4P potential energy (U T IP 4P ) is made of an electrostatic energy term and a van der Waals type interaction between the oxygens described by a Lenard-Jones potential (U LJ ). The free energy of a Lennard-Jones fluid with the same U LJ potential has been fit to an equation of state and thus is a good reference [62]. For the simulations we use the LAMMPS [63] molecular dynamics software, patched with PLUMED. We perform an NVT canonical simulation at 443 K using the TIP4P water potential, thus u 0 (x) = βU T IP 4P (x), with N = 384 molecules. The reference system is characterized by u 1 (x) = βU LJ (x).
Being β a constant, we consider as collective variable ∆U (x) ≡ U LJ (x) − U T IP 4P (x), and write the bias according to Eq. (8): From a short 20 ps unbiased run we obtain with the usual automatic procedure (Sec. V) 30 equispaced points in the interval 0 < λ < 1, that define our target distribution. The evolution of ∆U as a function of simulation time is shown in Fig. 4a. There is an initial transient of about 3 ns until the bias potential is optimized and then the system diffuses freely. This has to be compared with a simulation for a given value of λ in which the fluctuations of ∆U would be very small. From this simulation the integrand ∂u λ (x) ∂λ u λ = β ∆U λ can be calculated via reweighting, Eq. (12), where ∆U k = U LJ (x k ) − U T IP 4P (x k ) and w k (λ) = e −λβ∆U k +v (k) k−1 . The values of ∆U λ thus calculated are shown in Fig. 4b. Using these results and Eq. (19) we find a free energy difference ∆F T IP 4P →LJ = F LJ −F T IP 4P = 7.00(1) (N k B T units) in agreement with the result reported in Ref. 32.

VI. BEYOND LINEARITY: MULTIUMBRELLA ENSEMBLE
We consider now another important kind of expanded ensemble, namely the one obtained by combining all the different windows of a typical umbrella sampling simulation [17]. We will refer to such an ensemble as multiumbrella ensemble.
Multiple windows umbrella sampling [31] allows for the free energy surface (FES) reconstruction along some collective variable s = s(x), that can be the reaction coordinate or some slow mode of the system. Typically one simulates multiple copies of the system, each one with a parabolic bias potential centered at a different s λpoint, in such a way that the resulting probability distributions have an overlap and cover the whole CV range. Post-processing via WHAM [33] or other methods is then needed to combine the data in a single FES estimate. Here instead we aim at sampling all the umbrella windows in the same simulation via a single global potential, and obtain the FES with the simple reweighting scheme described in Sec IV B, without further post-processing.
Given a system with reduced potential u 0 (x) and equilibrium Boltzmann distribution P 0 (x), we can write the reduced potential of each umbrella window as u λ (x) = u 0 (x) + ∆u λ (x), with expansion CVs The associated probability distribution is is clearly not linear in λ, and in fact requires an extra parameter σ to be defined. The width σ can in principle vary with λ, but we consider here only the case of uniform umbrellas.
Since the expansion CVs, Eq. .
Contrary to the linear case, in this multiumbrella case both the bias v(s) and the free energy differences The N {λ} s λ -points can be chosen to be uniformly distributed in the desired ∆s = s max − s min interval, in such a way to be at most at a distance of σ, ensuring overlap between contiguous P λ . For a small enough σ, the estimate ∆F n (s) converges precisely to the free energy surface (FES), while if σ is too broad there will be small artifacts, similarly to what happens when a too broad bandwidth is used in kernel density estimation.
It is instructive to consider the marginal of the target probability with respect to the CV, p {λ} (s). In the limit of infinitely small σ and thus infinitely large N {λ} , the multiumbrella target p {λ} (s) is a uniform flat distribution over the ∆s interval. In the opposite limit, of a very broad σ, the target distribution will look like the original, hard-to-sample P 0 . As a rule of thumb σ should be as small as the smallest features of the FES we are interested in. We notice that this is the same criterion used to chose the σ parameter in metadynamics [64], and it can typically be guessed from a short unbiased run. For this reason we prefer to use as parameter σ instead of the more commonly used strength of the harmonic umbrella potential K = 1/σ 2 [31].
In some cases it proved useful to introduce two small modifications to make the multiumbrella iterative optimization scheme more robust. We leave the explanation of them to Appendix B, since they have not been necessary for the examples presented in the paper.
For simplifying the exposition we presented the procedure in case of a 1D CV, but it is straightforward to extend it to higher dimension, by using multidimensional Gaussians and placing the s λ -points on an appropriate multidimensional grid. When dealing with higher dimensions it might be interesting to use some more elaborate shapes for the umbrellas, e.g. a Gaussian mixture in a similar but complementary way to Ref. 65, or to follow a specific path, as in Ref. 66.

Example: Double-well Model
As an example for the multiumbrella ensemble we consider a Langevin dynamics on a 2D model potential [67] using as CV the x coordinate only, Fig. 5a. Such CV is suboptimal, in the sense that it does not include all the slow modes of the system, and can be problematic when performing standard umbrella sampling with multiple windows. For instance the window centered at x = 0 cannot be efficiently sampled in a single simulation, since it presents a barrier along y. With our approach this suboptimality only causes a slower convergence, but does not constitute a problem, and no extra care is required to handle it. Figure 5b shows how the target distribution changes for different σ choices, expressed in units of the unbiased standard deviation in the basins, σ 0 ≈ 0.18 . The FES estimate could be directly obtained from the ∆F n (s), but in the case of large σ this would lead to an estimate in which features are oversmoothed (see SM). As a general rule it is better to estimate the FES via the reweighting procedure.
In the supplemental material we provide all the simulation details and show the convergence of the free energy, comparing it to well-tempered OPES and metadynamics. While in metadynamics and well-tempered OPES the bias is constructed in such a way to push the system out of the visited areas, with multiumbrella OPES we are forcing the system to stay in a chosen CV region. Despite this difference in both cases we have similar target distributions and the resulting sampling allows us to reconstruct the FES.

A. Combining Thermodynamic and Order Parameter Expansions
An important characteristic of the present scheme is that it allows for a straightforward combination of different expanded ensembles. In particular it allows for a rigorous and efficient combination of thermodynamic generalized ensembles with enhanced sampling along a system-specific order parameter.
To understand why this is important one can think about a first order phase transition, where there is a kinetic bottleneck between the two phases that is responsible of an ergodicity problem. Increasing the temperature typically changes the relative stability of the two phases, but the free energy barrier separating them might remain high along the whole coexistence line, thus making convergence very slow. A possible solution is to identify a suitable order parameter and biasing it to increase the transition probability. Combining the two approaches might actually outperform both [68,69]. This kind of combination can be useful not only for phase transitions, for instance also in alchemical free energy calculations an open problem is how to properly handle barriers orthogonal to the transformation [70].
We have already cited some hybrid methods that combine a replica-exchange approach with metadynamics, in order to enhance the sampling both along a thermodynamic quantity and an order parameter [10][11][12][13]68]. A non-hybrid approach has been first proposed with multidimensional replica exchange [17], but it has the drawback of requiring a sometimes impractical number of parallel replicas, due to the multidimensionality of the expan-sion. With OPES we can sample the same target distribution of multidimensional replica exchange, but using a bias potential and without requiring a minimum number of parallel replicas. In developing our method we followed the footsteps of another non-hybrid approach that has been recently proposed by our group, using the VES formalism and a custom target distribution [39,71]. Compared to the very flexible and customizable VES approach, OPES has the advantage of having much fewer free parameters and thus being simpler to set up and use.

Example: Sodium
We consider here as an example the calculation of the liquid-bcc phase diagram of a model of sodium [72], the same studied in Ref. 39. We will sample the liquid and solid phase over a range of temperatures and pressures, using a recently proposed order parameter s, called environment similarity collective variable [39]. Such CV provides a measure of the crystallinity of the system, by comparing the local environment of the atoms to a reference one. For this reason we will refer to it as crystallinity CV, but it is actually more general and can be used to describe a variety of phase transitions [71,73].
Using LAMMPS patched with PLUMED, we perform NPT simulations, u 0 (x) = β 0 U (x) + β 0 p 0 V (x). We can write the OPES equations, Eqs. (10) and Eqs. (11), via the following expansion CVs The free energy estimates ∆F n (β, p, s) are expressed as a function of the inverse temperature β, the pressure p and the crystallinity CV s. The bias v = v(U, V, s), is expressed as a function of the potential energy U , the volume V and the crystallinity CV s. The simulation is performed with 250 atoms at T 0 = 400 K and p 0 = 0.5 GPa (5 kbar), using 4 multiple walkers that share the same bias and contribute to the same ensemble averages to update the ∆F n (β, p, s) estimate.
The aim is to sample liquid and solid configurations in the temperature range from 350 K to 450 K and pressures from 0 GPa to 1 GPa (10 kbar). The uniform grid over β and p to define the target distribution is automatically generated from a short 100 ps unbiased run, and consists of 4 temperature steps and 8 pressure steps. We chose as σ for the multiumbrella target a value of about 2.5 times the unbiased standard deviation in the basins, and it determines the presence of 26 umbrellas uniformly placed between s min = 0 (liquid) and s max = 1 (solid). In total the ∆F n (β, p, s) to be estimated are 4 × 8 × 26 = 832. After less than 3 ns the bias is practically converged, but the simulation is kept running until a total combined time of 100 ns, in order to collect enough statistic for a smooth reweighting.
In Fig. 6a we show the points sampled in the CV space during the simulation, colored according to the value of the bias potential v(U, V, s). We can clearly distinguish the hourglass shape described in Ref. 39. In Fig. 6b we see the same trajectory plotted on the energy-volume plane. For comparison we show the configuration sampled in an unbiased simulation at a single temperature and pressure, in which the system remains all the time in the bcc crystal phase.
We can define the free energy difference between the two phases as in Ref. 39: where χ is a characteristic function, equal to 1 if the variable is in the proper range and 0 otherwise, and · T,p is the ensemble average at temperature T and pressure p.
In Fig. 7a we show ∆F liq→bcc (T, p) obtained by reweighting at different temperatures and pressures. The coexistence line ∆F liq→bcc (T, p) = 0 is shown with a dotted gray line. On the right side, Fig. 7b, we provide the free energy surface as a function of the CV, F (s), at different thermodynamic conditions. Error bars are calculated with a weighted block average (Appendix A) and all the results are in agreement with Ref. 39, see SM.
It is important to notice that while the relative stability between liquid and solid changes across the range considered here, the probability of being in the transition state between the two is always extremely small, as can be seen from the high FES values around s = 0.5 in Fig. 7b. By actively biasing the CV s we allow the system to efficiently sample the transition region as well, and this makes it possible to quickly converge the multithermalmultibaric simulation despite the presence of a first order phase transition.

VII. ABOUT THE OPTIMAL TARGET DISTRIBUTION
Before reaching the conclusion of the paper we would like to add a final remark. At the beginning of Sec. IV we presented the non-weighted expanded ensemble target distribution, p {λ} (x) = 1 N {λ} λ P λ (x). It is reasonable to wonder if this is an optimal target and in which sense. We argue here that the effective sample size can be used to define an optimality criterion.
Let's say our goal is to sample from a generalized ensemble that contains all the relevant microscopical configurations for a give range of the parameter λ. While the expanded target distribution p {λ} (x), Eq. (3), fulfills such goal, there are in principle other possible choices.
It is useful to look at the special case of multicanonical ensembles, that has a long history (see also Sec. V A). In this context various different target distributions have been used, other than the non-weighted expanded ensembles one. One option is to have a uniform sampling in the energy [21,22,38], another one is to have a uniform sampling in the entropy [25], and a third one is to define the target by integrating the probability over the temperature, as in Refs. 29, 30. In this last approach one often approximates the integral with a sum, and effectively uses a target similar to our, Eq. (3), which is also the one used in temperature replica exchange. Another interesting perspective is presented in Ref. 74, where a Riemann metric is introduced to define optimality.
We believe that if our goal is to reweight at different temperatures, then the optimal target distribution is the one that yields the highest possible uniform effective sample size over the whole considered ∆λ range. Here we will not further explore such optimal target distribution nor dig deeper in the definition of effective sample size. However, simply by defining this criterion we can notice that one should not see the sum in Eq. (3) as an approximation to an integral. As a matter of fact, using as few as possible intermediate λ-points brings us closer to this optimal target than having more, at least in the systems we studied (see SM).
It might also be the case that one is not interested in obtaining statistic for the whole ∆λ range, but only for a subset of λ-states. In this case the optimal target would be the one that maximizes the effective sample size for those λ-states while ensuring ergodicity. According to this criterion we argued in Ref. 3 that the well-tempered target is better than a uniform one, since it allows for an ergodic sampling while providing a higher n eff /n ratio and avoids unimportant high free energy regions.

VIII. CONCLUSION
In this paper we presented a general framework that provides a unified approach to enhanced sampling. We showed how OPES, that is an enhanced sampling method based on the construction of a bias potential along a set of collective variables, can be used to sample the same expanded ensembles typically sampled by a different family of enhanced sampling methods.
We also introduced the concept of expansion CVs, ∆u λ (x), that can be used to fully characterize a non-weighted expanded target distribution p {λ} (x), Eq. (3), together with the free energy differences to be iteratively estimated, Eq. (5), and the target bias, Eq. (8).
We then presented various examples of the application of the method to sample the most common expanded ensembles. These ensembles are summarized in Tab. I. In particular we have shown how OPES can be used to enhance at the same time temperature-related fluctuations and a system-specific order parameter, Sec. VI A.
We notice that in defining the target distribution p tg (x) we consider only the positional degrees of freedom, and not the atomic velocities. Thus the ensembles sampled by our method are not identical to the ones sampled for instance by replica exchange, even though the target distribution is the same. In fact the two methods sample the same configuration space, but a different velocity space. This does not have an effect on any statistical average of observables that are function of the coordinates only, but might be an interesting point for further research.
In the future it would be interesting to combine expanded target distributions with well-tempered-like distributions, which can scale better with higher dimensionality. Also weighted expanded targets might be of interest, where each sub-ensemble λ has a specific different normalized weight. More generally, we believe that our perspective of focusing on the target distribution has further potential that should be explored.

DATA AVAILABILITY STATEMENT
The method is implemented in the open source PLUMED enhanced sampling library [54], and will be available in a contributed module called OPES. All the input files needed to reproduce the examples of the paper and the trajectories obtained will be openly available on the PLUMED-NEST [75] (www.plumed-nest.org) and in the Materials Cloud Archive (www.materialscloud.org). I. Some of the most common expanded ensembles, together with the expansion collective variables ∆u λ (x) that define the OPES target bias, Eq. (8), and the free energy differences ∆F (λ), Eq. (5). Each of the considered target biases can in turn be expressed as a function of one or two CVs. It is also possible to easily combine these ensembles to form new ones, as shown in Sec. VI A.

Target Ensemble
Expansion CVs Parameters CVs Linearly Expanded When the iterative scheme starts, the first guess of the ∆F n (s λ ) comes from just one single point, and is thus very inaccurate for CV values far away from the visited one. In particular it tends to become extremely large, which causes the bias to be very strong in pushing the system to the farthest s λ -value. This initial bias is stronger the smaller the σ, and might even cause the simulation to fail during the very first biased steps. To avoid this, we can limit the initial value of the ∆F n (s λ ) estimates to be always smaller than a given value, thus ∆F 0 (s λ ) ≤ ∆E. This ∆E value can be set to be equal to an estimate of the free energy barrier that has to be overcome. Thus, similarly to Ref. [3], we add an extra optional parameter called "barrier", that sets the value of ∆E. This barrier guess does not have to be perfect and a very rough estimate typically suffices. Also, we did not observe significant change in the convergence speed, thus we suggest to use this extra parameter only in case of an initial failure of the simulation, due to a too strong initial bias. The barrier parameter can in principle be used also with types of expansion other than the multiumbrella one, even though in those cases it might be less useful.
The second modification, comes from the observation that, contrary to the previously considered expanded ensembles, the multiumbrella one is not granted to sample the full unbiased distribution P 0 (x). This can be a problem for the iterative scheme, because all the ∆F n (s λ ) use as reference the unbiased free energy F 0 = − log Z 0 , whose estimate can vary substantially if P 0 is not properly sampled. Typically it should be easy to chose an interval ∆s that contains all the s values relevant for P 0 , but if this is not the case then the problem can be fixed by simply adding P 0 itself to the target distribution. To add P 0 to the target, it is sufficient to add an extra expansion CV ∆u 0 (x) ≡ 0, that always returns zero.
In the examples presented in the paper we did not have to use any of these two modifications.