Generalized Langevin Equation with a Non-Linear Potential of Mean Force and Non-Linear Memory Friction From a Hybrid Projection Scheme

We introduce a hybrid projection scheme that combines linear Mori projection and conditional Zwanzig projection techniques and use it to derive a Generalized Langevin Equation (GLE) for a general interacting many-body system. The resulting GLE includes i) explicitly the potential of mean force (PMF) that describes the equilibrium distribution of the system in the chosen space of reaction coordinates, ii) a random force term that explicitly depends on the initial state of the system, and iii) a memory friction contribution that splits into two parts: a part that is linear in the past reaction-coordinate velocity and a part that is in general non-linear in the past reaction coordinates but does not depend on velocities. Our hybrid scheme thus combines all desirable properties of the Zwanzig and Mori projection schemes. The non-linear memory friction contribution is shown to be related to correlations between the reaction-coordinate velocity and the random force. We present a numerical method to compute all parameters of our GLE, in particular, the non-linear memory friction function and the random force distribution, from a trajectory in reaction coordinate space. We apply our method to the dihedral-angle dynamics of a butane molecule in water obtained from atomistic molecular dynamics simulations. For this example, we demonstrate that non-linear memory friction is present and that the random force exhibits significant non-Gaussian corrections. We also present the derivation of the GLE for multidimensional reaction coordinates that are general functions of all positions in the phase space of the underlying many-body system; this corresponds to a systematic coarse-graining procedure that preserves not only the correct equilibrium behavior but also the correct dynamics of the coarse-grained system.

We introduce a hybrid projection scheme that combines linear Mori projection and conditional Zwanzig projection techniques and use it to derive a Generalized Langevin Equation (GLE) for a general interacting many-body system. The resulting GLE includes i) explicitly the potential of mean force (PMF) that describes the equilibrium distribution of the system in the chosen space of reaction coordinates, ii) a random force term that explicitly depends on the initial state of the system, and iii) a memory friction contribution that splits into two parts: a part that is linear in the past reaction-coordinate velocity and a part that is in general non-linear in the past reaction coordinates but does not depend on velocities. Our hybrid scheme thus combines all desirable properties of the Zwanzig and Mori projection schemes. The non-linear memory friction contribution is shown to be related to correlations between the reaction-coordinate velocity and the random force. We present a numerical method to compute all parameters of our GLE, in particular the non-linear memory friction function and the random force distribution, from a trajectory in reaction coordinate space. We apply our method on the dihedral-angle dynamics of a butane molecule in water obtained from atomistic molecular dynamics simulations. For this example, we demonstrate that non-linear memory friction is present and that the random force exhibits significant non-Gaussian corrections. We also present the derivation of the GLE for multidimensional reaction coordinates that are general functions of all positions in the phase space of the underlying many-body system; this corresponds to a systematic coarse-graining procedure that preserves not only the correct equilibrium behavior but also the correct dynamics of the coarse-grained system.

I. INTRODUCTION
Most interesting physical systems are interacting many-body systems. When dealing with the kinetics of such systems, one is typically interested in the dynamics of a low-dimensional reaction coordinate, which is, however, generally influenced by the entire system [1]. Examples include the motion of a particle in a liquid [2][3][4][5][6][7], vibrational modes of a molecule in the gas phase or in a liquid [8][9][10][11], chemical or associative reactions between molecules [12][13][14][15][16] and protein folding [17][18][19][20]. To predict the dynamics of the reaction coordinate, one in principle has to solve the equation of motion of the underlying many-body system, which is typically analytically impossible and is only numerically possible for small systems and over short times. The very attractive idea of coarsegrained modeling is to replace the description of the full many-body system by a description in terms of the reaction coordinates only. The challenge is to derive the appropriate equation of motion that describes the dynamics of the reaction coordinates accurately while maintaining numerical efficiency. For some biologically relevant scenarios, such as the folding of a protein, sufficiently long simulations of the full system dynamics can be performed [21][22][23], but even for these cases, the interpretation of the results typically requires mapping onto a low-dimensional reaction coordinate.
Rigorous coarse-graining methods based on projection operator techniques were introduced by Zwanzig and Mori, which are directly applied to the Liouville equation that describes the dynamics of a classical many-body system governed by a time-independent Hamiltonian [24,25] (in fact, a similar approach applicable to quantum systems was developed by Nakajima even earlier [26]). The result of the projection is a coarse-grained equation of motion for the chosen set of reaction coordinates, the socalled generalized Langevin equation (GLE). It contains three distinct terms: a force term due to a potential that depends on the reaction coordinates, a memory friction contribution that involves the past time dynamics of the reaction coordinates, and a force that explicitly depends on the initial state of the entire many-body system and which is typically interpreted as a random or stochastic force. The GLE is therefore an integro-differential stochastic equation. It should be noted that the Zwanzig and Mori projection schemes give rise to fundamentally different GLEs for non-linear systems, which are both rigorous and reproduce the system dynamics described by the reaction coordinates exactly [24,25]. However, except a few notable exceptions [27][28][29][30][31][32], the exact Zwanzig or Mori equation have rarely been used in practice for non-trivial, i.e. non-linear, systems, for different reasons: In the Mori framework, the force from the potential as well as the memory friction are linear in the reaction coordinate and their velocities, respectively, and therefore all non-linearities are accounted for by the random force, which thus becomes non-Gaussian and is difficult to parameterize; in the Zwanzig framework, the potential term in the GLE is in general non-linear and corresponds to the potential of mean force (PMF), which ensures the correct equilibrium distribution of the reaction coordinates [28], which is a desired property. On the other hand, the memory friction is a general function of both the reaction coordinates and their velocities, which poses severe problems when estimating such a function from simulation or experimental data.
As a consequence, many previous works considered a simplified form of the GLE, which in this paper we refer to as the approximate GLE. It contains the non-linear PMF and a memory friction that is linear in the velocity of the reaction coordinate [6,[33][34][35][36][37][38][39][40]. In principle, this approximate GLE follows from the Zwanzig GLE, assuming that the friction memory depends only linearly on the past reaction coordinates and is independent of the reaction-coordinate velocities. The validity of this approximation can typically not be checked in a systematic manner. The applications of the approximate GLE range from non-Markovian rate theory [41][42][43], over protein folding dynamics [17][18][19][20] to molecular diffusion and conformational dynamics [6,7,37,44]. Methods to derive memory functions from trajectory data for non-linear systems within the framework of the approximate GLE have been introduced and it was demonstrated that the resulting GLE correctly describes the multi-scale fractal dynamics of protein folding [20] and the vibrational spectra of molecules in non-linear bond-length and bondangle potentials [45]. Although widely used, the validity of the approximate GLE in the presence of a non-linear potential is subject to ongoing discussions [46,47].
In this paper, we introduce a projection method that is a hybrid of the Zwanzig and Mori projection schemes. As an advantage over the Mori projection scheme, the resulting GLE contains the force stemming from the generally non-linear PMF, which by itself guides the system into the correct equilibrium distribution in the longtime limit. As an advantage over the Zwanzig projection scheme, the generally non-linear memory friction does not depend on the velocity of the reaction coordinate but only on the reaction coordinate itself, which significantly simplifies the numerical estimation of the memory function from trajectory data. We develop the necessary framework to compute all parameters of the resulting GLE from trajectories of a reaction coordinate. Thus, we present data-based methods i) to derive the non-linear memory friction from simulation or experimental trajectories, ii) to thereby examine the validity of the approximate GLE, and iii) to study the distribution and correlation of the random force from trajectories. We also derive a multidimensional GLE in terms of a general set of reaction coordinates that are arbitrary functions of the positions of the underlying many-body system; this constitutes a rigorous derivation of the equations of motion that accurately describe the equilibrium and dynamic behavior of coarse-grained systems. For the explicit example of the dihedral-angle dynamics of a butane molecule in water, obtained from atomistic molecular dynamics simulations, we demonstrate that non-linear memory friction is present and that the random force exhibits significant non-Gaussian corrections. Therefore, we find that even for this simple molecular system, the approximate GLE, which neglects non-linear memory friction and assumes Gaussian random forces, does not correctly describe the dynamics.
The paper is organized as follows: First, we introduce the Hamiltonian of the many-body system, as well as our notation, and we present important expressions for correlation functions and conditional averages. We then review the Mori and Zwanzig projection schemes and highlight practical problems of the resulting GLEs. After this, we introduce our hybrid projection scheme and derive the GLE that features a non-linear PMF and non-linear memory friction. In the subsequent section, we introduce an algorithm to extract all parameters of our GLE from trajectories. In the final section, we apply our formalism on two exactly solvable model systems and on MD trajectories for the dihedral angle dynamics of a butane molecule in water.

II. HAMILTONIAN MODEL, NOTATION AND USEFUL PROPERTIES
We denote the phase space of a system of N interacting particles in three-dimensional space by Ω. One specific microstate, i.e., a point in Ω, is denoted by ω = (R, P) = (r 1 , r 2 , . . . , r N , p 1 , p 2 , . . . , p N ) which is a 6N vector of the Cartesian positions r i = (r x i r y i r z i ), and the conjugate momenta p i = (p x i p y i p z i ) of all i = 1, 2, . . . , N particles in the system. The Hamiltonian of the system is an invariant of motion and splits into a kinetic and a potential part The potential V (R) contains all interactions between the particles and possible external potentials. The only assumption on V is that it is a function of the positions R only. The time evolution of a point ω in phase space is determined by Hamilton's equation of motion, which can be written in the formω where ω t is the location of the system in phase space at time t andω t denotes the corresponding velocity, given the system was initially at ω 0 . For the sake of compact notation, we denote time dependencies of phase space coordinates by a subscript. In eq. (2), L is the Liouville operator given by All of the operators that we consider in this work, including the Liouville operator L, act on the initial phase space position ω 0 . From eq. (2), it follows that the system is propagated in time by the operator e tL , i.e., e tL ω 0 = ω t . We consider observables that are real-valued functions of phase-space coordinates only and that depend on time implicitly via the time dependence of a trajectory moving in phase space. For the sake of notational brevity, we also denote the time dependency of observables by a subscript too, i.e., A t ≡ A(ω t ) = A(ω 0 , t). Using the chain rule for differentiation, it follows that the time evolution of an observable A t is also governed by the Liouville equation whereȦ t denotes the time derivative of A t . Thus, the time propagation operator of an observable in the initial state A(ω 0 ) ≡ A 0 , is also given by e tL . From this, it follows that Eq. (5) describes how observables are propagated in time by e tL and will be used throughout our derivations. All observables are elements of a Hilbert space, i.e., a vector space equipped with an inner product. Let A and B denote two system observables. For the inner product, we choose where ρ eq (ω 0 ) = e −βH(ω0) /Z is the canonical Boltzmann distribution with the inverse thermal energy β = 1/k B T and the partition function Z = Ω dω 0 e −βH(ω0) . The inner product in eq. (6) thus corresponds to an equilibrium time correlation function which establishes the link to statistical mechanics. The average of a single observable B t is given by B t ≡ B t , 1 and does not depend on time. Because of the form of the Hamiltonian in eq. (1), the Boltzmann distribution factorizes into a position and a momentum-dependent part where ρ kin (P 0 ) ∝ N i=1 exp −β p 2 i,0 /2m i is a Gaussian with zero mean. With respect to the inner product in eq. (6), the Liouville operator, as defined in eq. (3), is anti-self-adjoint [48]

A. Conditional Averages
In addition to time-correlation functions calculated over the entire phase space Ω, as in eq. (6), we will also use conditional time-correlation functions that result from averages over a hyper surface in phase space on which an observable of choice at the initial time t = 0, A 0 = A(ω 0 ), takes a constant value A(ω s ). A conditional correlation of two observables B t = B(ω t ) = B(ω 0 , t) and C t = C(ω t ) = C(ω 0 , t ) is defined by [27,28] In eq. (9), the phase space variable with a hat, ω 0 , is integrated over. The phase space variable ω s is not. Therefore, B t , C t As is a function of ω s , and the times t and t . The conditional average of a single observable B t is given by B t As ≡ B t , 1 As . Finally, we give a few relations which will be frequently used later on. We repeat that a conditional average is a function of phase space via the conditional function A s = A(ω s ) in eq. (9). The time propagation of a conditional average is thus given by The normalized probability that an observable A t has the value a is given by P(a) ≡ δ(A t − a) , from which the potential of mean force (PMF) for an observable follows as [33] U PMF (a) ≡ −k B T ln P(a).
Acting with the Liouville operator on a delta function gives [30] Lδ Using the definition in eq. (9) together with the relations in eq. (8), eq. (12) and the PMF defined in eq. (11), we derive in appendix A the important relation [30]

III. PROJECTION OPERATOR METHOD
We now derive the equation of motion for an arbitrary scalar observable A t , which can of course also be the position of a single particle [48]. The derivation for a general multi-dimensional observable is given in appendix B. A projection P is a linear, idempotent operator, i.e., for arbitrary scalars c 1 , c 2 , it fulfills the properties P (c 1 A t + c 2 B t ) = c 1 P A t + c 2 P B t and P 2 = P . The operator Q = 1 − P projects onto the complementary subspace with 1 being the identity operator. For briefness, we will refer to the subspace onto which P projects as the relevant subspace. The operators P and Q can be used to decompose the Liouville equationÄ t = LȦ t for the observableȦ t as A t = e tL (P + Q)LȦ 0 = e tL P LȦ 0 + e tL QLȦ 0 . (14) To obtain an equation of motion for A t from eq. (14), we introduce the operator Φ(t) = e tL Q.
Eq. (16b) is an inhomogenous differential equation of first order. Using Φ(0) = Q, as follows from eq. (15), the solution reads By using Qe tLQ = e tQL Q and the substitution s = t − u in eq. (17), we find Replacing e tL Q in eq. (14) by eq. (18) leads to the GLE for A t in terms of a general projection P [24,25,48] The function F R (t) stays in the complementary subspace for all times and is an explicit function of the initial state of the entire system, i.e., F R (t) = F R (ω 0 , t). Hence, for large systems, it can be interpreted as a random or stochastic function. For the sake of brevity, we will write out the ω 0 dependence of F R (t) only when it improves clarity. The first term on the r.h.s. of eq. (20a) represents the time evolution of the part ofÄ 0 = LȦ 0 which lies in the relevant subspace and reflects a deterministic force. The second term on the r.h.s. of eq. (20a) is due to the relevant part of LF R (ω 0 , t) and describes dissipative effects. Clearly, the explicit form of eq. (20a) depends on the specific form of the projection operator P . Before we introduce our hybrid projection scheme, we will present the GLE's generated by the Mori projection P M and by the Zwanzig projection P Z .

A. Mori Projection
The Mori projection applied on an observable A t is given by [25] and uses the inner product defined in eq. (6). The observables one projects onto, i.e., B 0 andḂ 0 , are referred to as the projection functions. The projection in eq. (21) maps any observable A t onto the subspace of all functions linear in the observables B 0 andḂ 0 . In addition to being linear and idempotent, P M is self-adjoint w.r.t. to the inner product in eq. (6), i.e., for two arbitrary observables A t , C t , the relation holds. Thus, it is an orthogonal projection, since all functions P M A t and Q M C t are orthogonal, i.e., as follows directly from eq. (22) and from the idempotence of P . For P = P M and choosing the projection functions to be B t = A t andḂ t =Ȧ t , i.e., projecting onto the observable of interest itself, eq. (20a) takes the form [25,48] where Γ M (s) is the memory friction kernel obtained from the Mori projection. Eq. (24) is an exact decomposition of the Liouville equation into three terms: the first term is a generalized force due to a potential of quadratic form; the second term accounts for linear friction and includes the memory kernel Γ M (s), which is related via eq. (24b) to the second moment of the random force F R (ω 0 , t), defined in eq. (20b). The exact form of the memory function can only be computed for very simple models, for realistic systems and practical applications it is infeasible to compute since the fluctuating term F R (ω 0 , t) is an explicit function of the initial state of the entire system. Instead, one typically models the function F R (t) as a stochastic process with zero mean and a second moment given in eq. (24b). Although information on higher-order moments of F R (t) can be obtained from the Mori formalism, F R (t) is typically assumed to be Gaussian. In general, however, this assumption can not hold, since F R (t) contains all non-linearities that A t may exhibit. Thus, imposing F R (t) to be a Gaussian variable becomes a bad approximation for non-linear systems, which reflects a fundamental short-coming of the Mori projection scheme for practical applications.

B. Zwanzig Projection
Contrary to the Mori projection, the Zwanzig projection P Z of an observable A t is non-linear in the projection where we repeat that phase-space variables with a hat inside inner products, i.e., ω 0 , are integrated over. The Zwanzig projection thus is a conditional average as defined in eq. (9) and is linear, idempotent and self-adjoint, similar to the Mori projection. The resulting GLE from the Zwanzig projection is best illustrated by choosing the observable of interest to be the momentum of a single particle,Ȧ 0 = p 0 , and the projection functions as the position and the linear momentum of the same particle, i.e., B 0 → r 0 ,Ḃ 0 → p 0 . With this, eq. (20a) becomes [52] with a memory friction kernel defined by Here, U PMF (r) = −k B T ln δ(r 0 − r) denotes the potential of mean force (PMF) defined in eq. (11), which creates in the GLE a force on the particle that tends to establish the equilibrium positional distribution. This is the main advantage over the Mori projection, since this ensures the correct equilibrium behavior once we switch to a stochastic description and replace the fluctuating force F R (t) by a Gaussian stochastic variable with zero mean [28]. The memory friction kernel Γ Z is a 3 × 3 matrix that, as a result of the conditional average, is a function of particle position r s and particle momentum p s . This is the main drawback of the GLE in eq. (26a), since the position and momentum dependence is difficult to deal with in applications. As a way out, one typically invokes the ad-hoc assumption that the memory function is independent of position and momentum, i.e., This assumption leads to an approximate GLE that is amply used in literature [7, 17-20, 33, 34, 36-43, 45] and readṡ While for various applications the approximate GLE has been demonstrated to reproduce the full system dynamics very accurately [20,45], it is difficult to check for realistic systems whether the ad-hoc assumption Γ Z (t − s, r s , p s ) ≈ Γ app (t − s) is in fact valid. This is one motivation for our hybrid projection scheme, since it allows to derive all parameters of the exact GLE from trajectory data and thereby to access the validity of the approximate GLE explicitly.

IV. HYBRID GLE
Our projection operator P H is a hybrid of the Mori and Zwanzig projection operators and is written in the form P H = P x + P p . Here, we derive the GLE for a scalar observable A t = A(ω t ), the derivation for a general multi-dimensional observable is given in appendix B. Using general projection functions B 0 = B(R 0 ), which is a function of positions only, andḂ 0 =Ḃ(R 0 , P 0 ), which in general is a function of positions and momenta, the hybrid projection operator is given by The projection P x is a conditional average, defined in eq. (9), onto the observable B 0 = B(R 0 ), which is a function of positions R only. As a result, the conditional average is independent of momenta. In appendix C we show that P x P p = P p P x = 0, from which follows that P 2 H = P H , so that P H is idempotent in addition to being linear and hence is a projection. In appendix D, we show that P H is self-adjoint w.r.t the inner product defined in eq. (6), i.e., it fulfills the property in eq. (22). Therefore, P H is an orthogonal projection. Again, we denote the projection onto the complementary subspace of P H by Q H = 1 − P H , where 1 is the identity operator. In appendix E, we prove for the projections P H , Q H of an arbitrary observable A t the important property Hence, the equilibrium ensemble average of any observable that lies completely in the complementary subspace vanishes. As an important consequence, the random force F R (t) defined in eq. (20a) lies completely in the complementary subspace for all times and, therefore, has a vanishing equilibrium average. This property is also obtained for the Zwanzig projection, but not for the Mori projection.
In the remainder, we choose the observable of interest and the projection function to coincide, B(R t ) = A(R t ). Therefore, the GLE we derive from our hybrid scheme describes observables that are functions of positions only, such as the center of mass position, distances and angles. As an important property, Our hybrid projection P H projects the observable A 0 and its velocityȦ 0 onto themselves, meaning that With this choice for the projection function and the specific form of the projection P H in eq. (29), we find for the first term on the r.h.s. of eq. (20a), where we used the relation in eq. (13) to obtain eq. (32c). Equation (32c) describes the force due to a potential. To show this, we make use of the fact that the expectation value Ȧ 2 0 A0 is strictly positive. Thus, we can use it via to define the generalized mass M (A 0 ), which in general is a function of A 0 . Using M (A 0 ), eq. (32c) can be simplified to where we defined the effective potential as The effective potential combines the effects of the PMF and the logarithmic effective mass. The second term on the r.h.s. of eq. (20a) accounts for memory friction, the integrand for our hybrid projection reads e (t−s)L (P x + P p )LF R (s). The P p projection leads to a memory function of the same form as in the Mori projection where we defined the memory kernel due to the P p projection as The memory friction due to the P x projection can, using eq. (10), be written as a conditional average which, using the relation in eq. (13), can be rewritten as (39) Here, we introduced the conditional correlation function between the time derivative of the observable at the initial time,Ȧ 0 , and the random force F R (s) With the definition of the hybrid projection operator P H in eq. (29) and the results in eq. (34), eq. (36b) and eq. (38), the general GLE in eq. (20a) takes the specific form which is the exact GLE that follows from our hybrid projection scheme and constitutes a main result of our paper. A few comments are in order: i) The PMF U PMF (A t ) appears explicitly in the equation of motion, similar to the Zwanzig projection scheme. ii) An inhomogeneous effective mass M (A t ) gives rise to a drift term. If M (A t ) is constant, i.e., if the variance ofȦ t is independent of A t , see eq. (33), this drift term vanishes. For an observable A t that is a linear combination of positions, it follows directly that the effective mass is constant [47]. Even for certain non-linear observables, such as distances in position space, it can be shown that the generalized mass is constant, as demonstrated in appendix F. On the other hand, for angles, which are three-body terms, the effective mass will in general depend on A t , as demonstrated for the dihedral angle of butane in section VI C. iii) The memory kernel Γ p (s) is determined via the unconditional average over the random-force correlations in eq. (37), similarly to the Mori projection, and therefore only depends on time. It thus describes the linear friction contribution. iv) The memory friction function Γ x (A t−s , s) is a general function of the observable A t−s , it therefore accounts for non-linear friction contributions. According to eq. (39), this contribution disappears if the conditional correlation function between the random force and the time derivative of the observable, D(A t , s), as defined in eq. (40), vanishes. This constitutes the exact condition for which the approximate GLE in eq. (28) is valid. v) The first moment of the random force vanishes, F R (t) = 0, as follows from the relation eq. (30). The second moment is determined by the memory kernel Γ p (s) via eq. (37). Higher cumulants do not necessarily vanish but are not expected to play a significant role since non-linear effects are already accounted for by the PMF U PMF (A t ). Indeed, in section VI C we demonstrate for the explicit example of the butane dihedral angle that the random-force distribution exhibits finite but moderate non-Gaussian contributions. The multi-dimensional generalization of eq. (41), i.e., the case in which the observable is a vector

V. NUMERICAL SCHEME FOR EXTRACTING RANDOM FORCES FROM TRAJECTORIES
In the absence of a potential and in the absence of non-linear friction, Carof et al. presented iterative algorithms to compute the random force trajectory and the linear friction kernel from a trajectory of the reaction coordinate [6,35]. Their derivations explicitly use the Mori projection, so the results are only valid for the Mori GLE in eq. (24).
We now introduce a method to compute the random force trajectory F R (ω 0 , t) and from that the memory kernel Γ p (t) and the non-linear memory function Γ x (A s , t − s) as defined by our GLE, eq. (41), from a given trajectory of an arbitrary observable. For this, let us consider the projected propagator e tQ H L based on our hybrid projection scheme eq. (29). From the Dyson decomposition in eq. (19), we obtain by rearranging Applying eq. (42) on the initial random force F R (ω 0 , 0) and using eq. (20b) and the memory functions Γ p (t) and Γ x (A, t) defined in eq. (36b) and eq. (38), respectively, we find Now, we consider eq. where we used the substitution s → s − t in the second integral. Acting with the operator in eq. (45) on the initial random force F R (ω 0 , 0) and using eq. (5) gives Comparing eq. (46) with eq. (43), we see that the first three terms on the r.h.s. of eq. (46) are equal to F R (ω ∆t , t). Hence, we find For given trajectories A t ,Ȧ t and given random force F R (ω ∆t , t) as a function of the phase space configuration ω ∆t , eq. (47) gives the random force F R (ω 0 , t + ∆t) one time step ∆t later as a function of the phase space configuration ω 0 one time step ∆t before. To obtain an iterative scheme for the random force, eq. (47) is discretized in time and A-space. For this, we use the left rectangular rule to discretize the time integrals. The random fore is discretized as , (40) and (47) read If the observable A t has at time t = i∆t a value in the interval I α , we write A i ∈ I α ; Ai∈Iα denotes the sum over all times i for which A i is in the interval I α , which is used to compute conditional averages in eq. (48). N traj denotes the total length of the A t trajectory used. The sums run from i = 0 to N traj − j − 1, because for given j, the iterative scheme has only determined the random force at times up to N traj − j − 1, as follows from eq. (48a). The sums in the denominator extend over the same interval as in the numerator in order to increase the numerical stability [6,35]. The derivatives in A-space in eq. (48c) are computed using central differences. The iterative scheme in eq. (48) works as follows: First, note from eq. (41) that F R (i, 0) = i.e., the random force at time t = 0 equals the acceleration plus the force from the effective potential for all possible initial times i∆t for i = 0, 1, 2, . . . , N traj − 1. This, together withȦ i , can be obtained directly from a given trajectory of the observable A. Then, F R (i, 0), A i , anḋ A i are inserted into eq. (48) to compute F R (i, 1) for i = 0, 1, 2, . . . , N traj −2. F R (i, 1) is then used to compute F R (i, 2) for i = 0, 1, 2, . . . , N traj − 3 and so forth. While computing F R (i, j), the memory friction functions Γ p (i) and Γ x (A i , i − j) are computed simultaneously. If our only goal is to compute the memory friction functions, we can stop the computation of F R (i, j) as soon as the memory functions have dropped to zero. As an example, if the memory functions decay to zero after N mem time steps, we can abort the computation of the random force at F R (i, N mem ). At that point, we generated N traj − N mem − 1 distinct random-force trajectories of length N mem each. Since the memory functions are computed simultaneously, the generated random-force trajectories only need to be stored if one is interested in the random-force statistics, in which case one could extend the length of the random-force trajectories. In appendix G, we present an alternative discretization in time for eq. (47) using the trapezoidal rule.

VI. APPLICATIONS
We test our numerical algorithm in eq. (48) on three different systems: an exactly solvable harmonic Hamiltonian model which leads to a GLE without spatial dependencies in the memory friction term, the non-linear Hamiltonian version of the Zwanzig model [53], where spatial dependencies in the memory friction term are present, and finally, we discuss results obtained for the dihedral angle dynamics of a butane molecule in water from MD simulations.

A. Harmonic Hamiltonian Model
The exactly solvable harmonic model is defined by the Hamiltonian The relevant coordinates are the one-dimensional position x and momentum p which are coupled to the auxiliary particle positions q n and momenta v n . If we choose the potential U (x) to be a harmonic potential, i.e., U (x) = kx 2 /2, we can use our hybrid projection formalism to exactly derive the GLE. For this we compute the random force F R (t) defined in eq. (20b) by an operator expansion to all orders as shown in appendix H. Once F R (t) is computed, the memory functions Γ p (t) and Γ x (x, t) are obtained from eq. (37) and eq. (39), respectively. In appendix I we show how to alternatively obtain a GLE without projection, namely by solving the equations of motion for the q n variables and inserting the result back into the equation of motion for x, which works for general potential U (x). The GLE's obtained from the projection formalism and the exact solution agree with one another and take the form of the approximate GLE in eq. (28), The memory friction kernel is given by with µ n = k n /m n . We note that the spatially dependent memory friction term Γ x (x, t) vanishes, as shown in appendix H and I. In order to test our numerical scheme in eq. (48), we need to generate trajectories of x t . To do this in a numerically efficient fashion, we identify eq. (52) as the Fourier series of an even function with Fourier coefficients k n . In the limit of N → ∞ and for a continuous frequency dependency, i.e., k n → k(µ)dµ/2π, we can choose the exponential-oscillating memory kernel  (49). In A, we compare the input potential U (x) = kx 2 /2 (broken line) with the numerically obtained effective potential U eff (x) defined in eq. (35) (solid blue line). In B, we confirm that the numerical extraction of the random force leads to the expected Gaussian distribution with zero mean and standard deviation p 2 0 Γ p (0). The shaded area in blue highlights the numerical error. In C, we check that the analytical memory function Γ p (t) given in eq. (53) and its running integral are accurately reproduced by the numerical results from the extraction scheme. This maps the Hamiltonian system in eq. (49) onto the stochastic system of two linearly coupled Langevin equations where η(t) in eq. (54) is a white noise variable, as derived in appendix J. The parameters in eq. (53) and eq. (54) are related by τ Γ = m y /γ and ν 2 = 4m y K/γ 2 − 1. The scalar y variable in eq. (54) is the stochastic representation of the Hamiltonian environment produced by the q n variables in eq. (49). Using eq. (54), we numerically generate trajectories x t for a system with thermal energy k B T = 2.5 kJ/mol, which corresponds to T = 300 K [54]. The other parameters are chosen to be m = 50 u, m y = 2 u, K = 30 kJ/mol/nm, k = 7.5 kJ/mol/nm, In fig. 1 we compare analytical results with results derived from the numerically generated trajectories using the scheme in eq. (48), where the x-space is discretized using N A = 200 bins of equal length. In fig. 1A we compare the input potential U (x) = kx 2 /2 (broken line) with the numerically obtained effective potential U eff (x) defined in eq. (35) (solid blue line), both potentials are shifted so that they are zero at x = 0. The agreement is perfect, which in particular means that the effective mass M (x) defined in eq. 33 is a constant, as expected. In fig. 1B we compare the analytical and the numerically determined random force distribution, which demonstrates that indeed non-Gaussian contributions are absent. In fig. 1C we compare the analytic memory kernel Γ p (t) in eq. (53) with the one extracted from the simulation trajectory and again obtain perfect agreement. This all shows that the numerical extraction scheme works perfectly on fluctuating trajectories. In fig. 2 we show the numerical result for the function D(x, t) defined by eq. (38) for different times. As predicted in appendix H, D(x, t) vanishes for all times.

B. The Non-Linear Zwanzig Model
As the second exactly solvable model we consider the Hamiltonian version of the non-linear Zwanzig model [53], for which non-linear friction effects are present and therefore the approximate GLE in eq. (28) is not valid The two potentials coincide, which means that the effective mass is constant, as expected on analytic grounds. B The numerically extracted random force distribution (blue solid line) is well described by a Gaussian with vanishing mean and standard deviation ẋ 2 0 Γ p (0) (broken black line). The shaded area in blue highlights the numerical error. In C, we show the numerically extracted memory function Γ p (t) and its running integral. anymore. This model is defined by the Hamiltonian In eq. (55), a generally non-linear function α(x) determines the coupling between the relevant variable x and the auxiliary variables q n . Note that for α(x) = x, we obtain back the harmonic model defined in eq. (49). The GLE that follows from the Hamiltonian system in eq. (55) can not be calculated in closed form using our hybrid projection scheme for general α(x), we therefore cannot derive the exact form of Γ p (t). On the other hand, by solving the equations of motion for the q n variables and inserting the result into the equation for x, one finds a GLE of the form with a (for α(x) = x) non-linear memory friction function k n cos(µ n (t − s)).

(56b)
Actually, the form of the memory kernel Γ(t−s, x t , x s ) in eq. (56b), and in particular its dependence on the trajectory x t , is not compatible with the form of the memory function Γ x (x t−s , s) or, equivalently, Γ x (x s , t − s), in eq. (41). In fact, in appendix K we demonstrate that the GLEs given in eq. (56b) and in eq. (41) are equivalent in the sense that they produce, for identical initial conditions, identical trajectories x t . This of course is expected, since they follow via exact derivations from the same Hamiltonian. This finding is similar to the fact that the Mori and Zwanzig GLEs are, in the absence of approximations, also equivalent and shows that even GLEs with identical PMFs and different friction memory and random force terms can be equivalent. For α(x) = x, it is therefore interesting to extract the non-linear friction term Γ x (x t−s , s), as defined by our GLE in eq. (56b), from simulation trajectories of x t . Similar to our approach to obtain eq. (54) for N → ∞, we exploit the structure of eq. (56b), which is equivalent to a Fourier decomposition in the time domain, to map the Hamiltonian system in eq. (55) onto a system of non-linearly coupled Langevin equations given by (see appendix J) For U (x) = kx 2 /2 and α(x) = x we recover eq. (54). Using eq. (57), we perform simulations for the parameter set k B T = 2.5kJ/mol, m = 50 u, m y = 2 u, K = 30 kJ/mol/nm, γ = 10 u/ps to generate 100 trajectories x t of 100 ns length each. For the potential we choose a non-linear double-well potential U (x) = U 0 (x 2 − 1) 2 with U 0 = 3k B T , as shown in fig. 3A, and for the nonlinear coupling potential we choose a quadratic function α(x) = α 0 x 2 /2 with α 0 = 4 nm −1 . The resulting trajectories are then used to compute via eq. (48) all parameters of the hybrid GLE in eq. (41), which are presented in fig. 3. In this calculation, the x-space is discretized using N A = 200 bins of equal length. The effective mass M (A t ) for an observable A t that is a linear function of atomic positions is constant [47], as follows directly from the fact that the velocity distribu-tion function factorizes for Hamiltonians of the form in eq. (1). Indeed, in fig. 3A the numerically obtained effective potential U eff (x) defined in eq. (35) (solid blue line) is shown to agree perfectly with the input potential U (x) (broken line) when both potentials are shifted so that they are zero at x = 0. In fig. 3B we compare the random force distribution obtained numerically via eq. (48) from the simulated trajectory (blue line) with a Gaussian with vanishing mean and a variance of ẋ 2 0 Γ p (0), as predicted by eq. (37), and obtain very good agreement; for the comparison, the value Γ p (0) is numerically extracted from the simulated trajectory. Note that eq. (37) does not imply that the distribution of the random force is a pure Gaussian, but the data in fig. 3B demonstrate that non-Gaussian contributions are either absent or very small.
In fig. 3C we show the memory kernel Γ p (t) extracted from the simulation trajectory, the result looks qualitatively similar to the result in fig. 1C for the harmonic model. In fig. 4, we show the correlation function D(x, t) defined in eq. (40) for a few different fixed times. Note that D(x, 0) vanishes at time 0, which is  fig. 4 rises before dropping back to zero in the longtime limit. The time after which D(x, t) decays to zero is about 1 ps and thus comparable to the memory time of Γ p (t) in fig. 3C. Note that a finite correlation function D(x, t) will via eq. (39) give rise to a non-linear friction memory function Γ x (x, t). The non-linear Zwanzig model defined by the Hamiltonian eq. (55) is thus represented by a constant effective mass term M (A) but a non-vanishing non-linear friction memory.

C. Dihedral Angle Dynamics of Butane from MD Simulations
To test our algorithm for an observable that is a nonlinear function of atomic positions, we consider the dihedral angle dynamics of a butane molecule in water as obtained from MD simulations. The dihedral angle φ of butane is a conceptually simple yet relevant observable and provides a simple scenario to study con-formational transitions in polymers and proteins that is both theoretically [55] and experimentally [56] accessible. In fig. 5, we present results for the rescaled angle A = φ/(φ max − φ min ), where the maximal and minimal observed angles in the studied trajectory are φ max = 155 • and φ min = −157 • . In fig. 5A, the effective potential U eff (A) defined in eq. (35) (blue solid line) shows small but significant deviations from the PMF U PMF (A) (broken line), which is explained by the dependence of the effective mass M (A) on the dihedral angle, as shown in fig. 5D. In fig. 5C, the deviations between Γ p (t), as defined within the exact GLE eq. (41) and determined numerically from the MD trajectory via eq. (48), and Γ app (t), defined within the approximate GLE in eq. (28) and obtained via a Volterra scheme [20,37], are pronounced and already suggest that non-linear friction effects, not captured by Γ app (t), are present. A closer look at the results in fig. 5C reveals that Γ p (t) and Γ app (t) have similar decay times, but Γ app (t) oscillates in time while Γ p (t) does not. These deviations between Γ p (t) and Γ app (t) must be due to non-linear memory effects, as confirmed in fig. 5E, where the correlation function D(A, t) defined in eq. (40) is shown for a few different fixed times. Thus, a non-linear memory friction contribution Γ x (A, t), defined in eq. (39) and shown in fig. 5B, is present in the GLE. As mentioned before, from the definition of D(A, t) in eq. (40) it follows that D(A, t) vanishes at time t = 0, i.e., D(A, 0) = 0, from which it is easy to see via eq. (39) that Γ x (A, 0) vanishes, too, as indeed confirmed by the data in fig. 5B and E. For finite time, both D(A, t) and Γ x (A, t) rise in amplitude before decaying to zero after a time corresponding to the memory time of Γ p (t) in fig. 5C, which is about 1 ps.
The rise and decay of non-linear friction effects is presented in fig. 6, where we show Γ x (A, t) as a function of time for different fixed values of A for the non-linear Zwanzig model in fig. 6A and for the butane dihedral angle dynamics in fig. 6B. The vertical gray lines indicate the time after which the linear friction kernel Γ p (t) for each system stays below 1% of its initial value Γ p (0). In addition to the raw numerical data (solid lines), we show smoothed curves which are obtained by fits to Legendre polynomials (broken lines), as described in appendix L.

VII. SUMMARY AND DISCUSSION
By using a hybrid projection scheme that combines linear Mori projection on the reaction coordinate velocities and non-linear conditional Zwanzig projection on the reaction coordinates themselves, we derive a GLE that contains the non-linear potential of mean force and a nonlinear memory friction contribution that is a function of the reaction coordinate A t but not of its velocityȦ t . The complete memory friction then splits into two parts. One part is linear in the reaction coordinate velocity and reflects linear friction proportional to a memory kernel Γ p (t). The memory kernel Γ p (t) is related to the fluctuat-ing force F R (t), defined in eq. (20b), by a relation that resembles a fluctuation-dissipation theorem, eq. (37). The non-linear memory friction function Γ x (A t−s , s) accounts for non-linear dependencies of friction on A t−s and is connected to the fluctuating force F R (t) by a conditional correlation function, given in eq. (40). Thus, when modeling F R (t) as a stochastic variable, it has to fulfill both relations, eq. (37) and eq. (40). The approximate GLE in eq. (28) is obtained from our GLE in eq. (41) only when the memory friction function Γ x (A t−s , s) vanishes, which thus establishes a firm criterion for the validity of the approximate GLE.
We also introduce a numerical scheme to compute all parameters of our GLE from a given trajectory A t and apply it on numerically determined trajectories for a harmonic and a non-linear exactly solvable many-body particle system, here we show that the numerical results agree with the analytical predictions. We also apply our numerical scheme on a dihedral angle trajectory of butane in water, obtained from atomistic MD simulations. We find that the effective mass of the dihedral angle depends on the value of the dihedral angle and that the non-linear memory friction contribution is finite and non-negligible. In order to estimate the importance of the non-linear memory friction, we have to compare the linear memory kernel Γ p (t) and the non-linear memory function Γ x (A, t). For this we multiply the linear-friction memory kernel at time zero, Γ p (0), by the root mean square velocity and obtain Γ p (0) Ȧ2 = 171 ps −3 , which can be directly compared with the maximal value of the non-linear memory friction function Γ x (A max , t max ) = 97 ps −3 , which is obtained for A max = 0.26 and t max = 0.043 ps. The value of Γ x (A max , t max ) thus turns out to be roughly half the value of Γ p (0) Ȧ2 , which means that non-linear memory friction effects are not negligible. Interestingly, our results demonstrate that non-linear friction memory leads to oscillations in the memory function Γ app (t) of the approximate GLE, which are not present in Γ p (t), as shown in fig. 5C. Finally, we show that the random force in the GLE from our hybrid projection scheme exhibits small but detectable deviations from a Gaussian distribution. All these results lead us to conclude that the GLE derived from our hybrid projection scheme is practically useful and allows to detect and model non-linear friction effects that have been neglected in previous applications of the approximate GLE with linear memory friction.

VIII. METHODS AND MATERIALS
MD simulations are performed using the Gromacs MD package (version 2020-Modified) [54]. For the MD simulation of the butane molecule, we use the GROMOS53A6 force field [57] with the TIP4P/2005 rigid water model [58]. The simulation box has side lengths of 3.35 nm and contains 1250 water molecules. We constrain the butane bond lengths and angles using the SHAKE algorithm [59]. For long-range electrostatic interactions, we use the particle-mesh Ewald [60], with a cut-off of 1 nm. The simulation time step is 1 fs, and the total simulation time is 100 ns. All simulations are performed in the NVT ensemble with a temperature of 300 K, controlled with a velocity rescaling thermostat [61]. Input files of the MD simulations are available upon request. The Langevin simulations are performed using the Leap Frog algorithm for numerical integration. Our Python codes for extracting the GLE parameters and running Langevin simulations are also available upon request. When computing non-linear memory contributions, the time resolution of the trajectory and the number of bins in reaction-coordinate space have to be chosen with care. In our analysis of Langevin and MD simulations, we use 200 bins to discretize the reaction-coordinate space. For the butane dihedral angle system in fig. 5, we exclude boundary regions in the trajectory for |A| > 0.4 in the computation of conditional correlations, that means we exclude observable values with a small fraction of realizations along the trajectory, since these would lead to significant noise in the extracted memory functions and thus destabilize the numerical extraction. Such noise effects are clearly visible in the effective mass profile in fig. 5F. where we used eq. (8) and eq. (12). We next pull out the derivative w.r.t. a in eq. (A1d) from the inner product and use the product rule of differentiation, which gives Finally, we use the definition of the PMF in eq. (11) and insert eq. (A1e) into eq. (A1b) to obtain eq. (13).

Appendix B: Multi-Dimensional Hybrid GLE
Here, we derive eq. (41) for a multidimensional observable that is a function of particle positions only. We denote the set of observables using the vector A(R t ) = (A 1 (R t ), A 2 (R t ), . . . , A n (R t )). As before, all observables implicitly depend on time via the positions R t . We denote components as A k (R t ) ≡ A k,t and A k (R 0 ) ≡ A k,0 . In the multi-dimensional case, the projection operator reads for general vectorial projection function B 0 Choosing B 0 = A 0 , as in the main text, the projection in eq. (B1) leads to the following potential term where we introduced the inverse generalized mass matrix The computation of the memory function proceeds similarly as in the main text and the multi-dimensional GLE reads where the following relations hold for all k, l = 1, 2, . . . , n. The k-th component of the vectorial non linear memory memory friction function Γ x (A, s) is given by Appendix C: Idempotency of the Hybrid Projection Operator The linear operator P H in eq. (29) is a projection, if it is idempotent, i.e., P 2 H = (P p + P x ) 2 = P H . Clearly, we have P 2 p = P p and P 2 x = P x . Therefore, one has to check that P p P x A t = P x P p A t = 0 for an arbitrary observable A t = A(ω t ). This is true because of the following: we project onto observables of positions only, i.e., onto B 0 = B(R 0 ). Thus, the velocityḂ 0 is linearly proportional to the particle momentȧ The operator P p maps any function onto the subspace of functions that are linear in the observable velocityḂ 0 , which is linear in the particle momenta p n . From this we see that since the operator P x involves an integral over the particle momenta but adds no momentum dependence. P x maps any observable onto a function which depends on particle positions only. SinceḂ 0 is linearly proportional to the particle momenta, P p applied on a function that depends on particle positions only gives zero. Therefore, it follows that Appendix D: Self-Adjointedness and Orthogonality of Hybrid Projection Here, we prove that the projection P H in eq. (29) is self-addjoint w.r.t. the inner product in eq. (6), i.e., for any observables A t = A(ω t ) and C t = C(ω t ), we have A t , P H C t = P H A t , C t . For this, we consider the projection operators P p and P x separately.
Using the definition in eq. (29c), we find Using the definition in eq. (29b) and eq. (9), we find This means that the hybrid projection P H in eq. (29) is self-adjoint and thus is an orthogonal projection, i.e., for arbitrary observables A t and C t .

Appendix E: Average of Complementary Observables Vanishes
In the following, we prove eq. (30), i.e., we show that the equilibrium average of any observable that lies in the complementary subspace at all times vanishes. For this, we must show for an arbitrary observable A(ω t ) = A t that P H A t = A t holds. First, from the definition of P p in eq. (29), it follows that since our projection function B 0 = B(R 0 ) is a function of positions only and therefore, its velocityḂ 0 = LB 0 is linear in the momenta (see eq. (C1)). For the P x projection operator we find From this, it immediately follows that P H A t = A(t) and thus all equilibrium averages in the complementary subspace vanish, i.e., Q H A t = (1 − P H )A t = 0. In particular, the equilibrium average of the random force vanishes at all times, i.e., F (t) = Q H F (t) = 0.

Appendix F: Generalized Mass of Distance Observables
We demonstrate that the generalized mass M (A) defined in eq. (33) is constant for an observable that corresponds to the scalar distance between particle positions, which is a non-linear function of particle positions. In this case, the force term dU eff /dA in eq. (41) reduces to dU PMF /dA. As an example, we consider the hydrogenbond distance between a nitrogen atom (donor) with initial position r N 0 and an oxygen atom (acceptor) with initial position r O 0 that are located four residues apart on the backbone of a polypeptide. The observable is thus given by Applying the Liouville operator on eq. (F1) gives the velocity of the observable As can be seen in eq. (F2), the velocityȦ 0 is linear in the momenta p N 0 and p O 0 . Computing the effective mass according to the definition in eq. (33), i.e., requires the computation of the numerator on the r.h.s. of eq. (F3). Given an Hamiltonian of the form in eq. (1), factorization of the phase-space integral leads to Inserting eq. (F4) for the numerator on the r.h.s. of eq. (F3), we find which is the reduced mass of the nitrogen-oxygen distance coordinate.
A similar derivation can also be done for a linear combination of distances. For example, consider the mean hydrogen-bond distance between N R donor nitrogen atoms and N R acceptor oxygen atoms that are located four residues apart along the backbone of a polypeptide. We define the observable as with A n,0 being the initial value of the n-th distance. Eq. (F3) becomes As before, terms consisting of mixed momentum factors average to zero, only diagonal terms contribute. In analogy to eq. (F5), the effective mass is constant also for this case.
Here, we present an alternative discretization of eq. (47). Similar to eq. (48), the equation derived here still has an overall error of the order O(∆t 2 ). The advantage over eq. (48) is that we use the trapezoidal rule for the integration involving the memory kernel Γ p (t); note that we keep the rectangular rule for the integration of the memory function Γ x (A, t). We discretize eq. (47) in the following way Note that now, the r.h.s. of eq. (G1) depends on Γ p (j+1).
To compute Γ p (j +1), we need the yet unknown F R (i, j + 1). In the absence of the memory function Γ x (A, t), it has been demonstrated how one can work around this problem [6,36,46]. The trick is to multiply eq. (G1) by F R (i, 0) and average according to eq. (6). This gives By identifying the l.h.s. of eq. (G2) with Ȧ 2 0 Γ p (j + 1) and solving for Γ p (j + 1), we find The function η(j) in eq. (G3d) appears due to the presence of the non-linear friction Γ x (A, t) and is computed using The alternative discretization is then found by replacing Γ p (j + 1) on the r.h.s. of eq. (G1) by eq. (G2). We derive the GLE for the harmonic Hamiltonian model eq. (49) using our hybrid projection in eq. (29). The Liouville operator defined in eq. (3) reads and acts on the initial values x 0 , p 0 , q n,0 , v n,0 . The hybrid projection is given by with the conditional average in eq. (H2b) being defined in eq. (9). Using eq. (H1) and eq. (H2), it follows that as confirmed in fig. 1A. To compute the random force F R (t), we use the operator expansion µ n t − (µ n t) 3 3! + (µ n t) 5 5! + . . . µ n v n,0 + k n 1 − (µ n t) 2 2! + (µ n t) 4 4! + . . . (q n,0 − x 0 ) , (H5) with µ n = k n /m n . Identifying the sums in the parenthesis as the series expansions of sine and cosine, respectively, F R (t) follows as (µ n sin(µ n t)v n,0 + k n cos(µ n t)(q n,0 − x 0 )) .
with Γ(s) = N n=1 k n cos(µ n s) being the result in eq. (I4c) obtained by explicit solution of the equations of motion for the special case α(x) = x.
In the stochastic interpretation of the GLE, it is sufficient to know the distribution of the initial conditions of the complementary variables. For the Hamiltonian in eq. (55), the distribution is given by the Boltzmann distribution. Thus, the initial values q n,0 , v n,0 are Gaussian distributed random variables with (q n,0 − α(x 0 )) = 0, v n,0 = 0, (J6a) α(x 0 ), v n,0 = 0, q n,0 , v n,0 = 0, v n,0 , v m,0 = δ n,m k B T m n , (q n,0 − α(x 0 )), (q m,0 − α(x 0 )) = δ n,m k B T k n . (J6d) From this, it follows thatF R Z is a stationary Gaussian process fulfilling The equal sign in eq. (J7b) follows from the explicit form given in eq. (I4b) and from the relation in eq. (J6d), where the average is a Boltzmann average over the initial conditions {q n,0 , v n,0 }. A Markovian stochastic system which leads to a memory function of the form given in eq. (J4) reads m yÿt = −k (y t − α(x t )) − γẏ t + 2k B T γη(t). (J8b) with η(t) = 0, η(t), η(s) = δ(t − s) being white noise. The relations between the parameters in eq. (J8) and the parameters in eq. (J1) are given by By solving eq. (J8b) and inserting the result into eq. (J8a), we find the random forcẽ F R (t) = ke −|t|/2τ cos 2π T t + 1 ν sin 2π T t × (y 0 − α(x 0 )) + 2 γν e −|t|/2τ sin 2π T t × p y,0 + 2k B T γ where the variable y 0 has the same distribution as q n,0 , and p y,0 = m yẏ0 has the same distribution as v n,0 . The equivalence ofF R in eq. (J5) andF R Z in eq. (J10) follows from the fact that their first and second moments are the same. Using this, we have mapped the non-linear Hamiltonian Zwanzig model defined by eq. (55) onto the set of coupled Markovian stochastic equations in eq. (J8), which can be used to perform numerical simulations.

Appendix K: Transformation Between Different GLEs
When applied to the non-linear Hamiltonian Zwanzig model defined by eq. (55), our hybrid projection operator P H = P x + P p , given in eq. (H2), leads to a GLE of the formṗ where Γ x (x t−s , s) follows from eq. (39). The two GLE's in eq. (I4a) and eq. (K1a) obviously have a different mathematical structure, but they describe the exact same dynamics. To see this, consider the random force F R Z (t, x t ) in eq. (I4b) F R Z (t) = n k n (q n,0 − α(x 0 )) cos(µ n t) + µ n v n,0 sin(µ n t) .
The time derivative of F R Z (t, x t ) is given bẏ Since the function α (x t ) depends on time only via x t , its time derivative can be written using the Liouville operator, i.e., d dt α (x t ) = Lα (x t ). The same is not true for the functionF R Z (t). By applying the Liouville operator, we find LF R Z (t) =Ḟ R Z (t) − p0 m α (x 0 ) n k n cos(µ n t). Hence, we can writė where we used Γ from eq. (I4c). By using the Dyson identity from eq. (19) for e tL , we can write eq. (K5b) in terms of the general projection operators P and Q as From eq. (K2), it follows that where we used the definition of the random force F R (t) in eq. (20b) and the equation of motion for the complementary variables. This means that F R Z (0, x 0 ) coincides with the random force F R (t) = e tQL QLp 0 at time t = 0. Therefore, by inserting the result in eq. (K6) into eq. (I4a), we obtain eq. (K1). Thus we have proven that the GLE obtained by explicitly solving the non-harmonic Hamiltonian Zwanzig model, eq. (I4a), is equivalent to the GLE obtained from our hybrid projection scheme, eq. (K1).