Kinetic Theory for Finance Brownian Motion from Microscopic Dynamics

Recent technological development has enabled researchers to study social phenomena scientifically and in detail, and financial markets have particularly attracted physicists because Brownian motion plays a key role there, as it does in physics. In our previous report (arXiv:1703.06739; to appear in Phys. Rev. Lett.), we presented a microscopic model of trend-following high-frequency traders (HFTs) and its theoretical relation to the dynamics of financial Brownian motion, directly supported by a data analysis tracking the trajectories of individual HFTs in a financial market. Here we show the mathematical foundation of the HFT model in parallel with the traditional kinetic theory in statistical physics. We first derive exactly the time-evolution equation for the phase-space distribution of the HFT model, which corresponds to the Liouville equation in conventional analytical mechanics. By a systematic reduction of the Liouville equation for the HFT model, the Bogoliubov-Born-Green-Kirkwood-Yvon hierarchal equations are derived for financial Brownian motion. We then derive the Boltzmann-like and Langevin-like equations for the order-book and price dynamics by assuming molecular chaos. The qualitative behavior of the model is studied asymptotically by solving the Boltzmann-like and Langevin-like equations for a large number of HFTs, and the results are validated numerically by Monte Carlo simulation. Our kinetic description highlights the parallel mathematical structure between financial Brownian motion and physical Brownian motion.

Inspired by these successes, physicists have attempted to apply statistical-physics approaches beyond materials science, even to social science. In particular, financial markets have attracted physicists as an interdisciplinary area [18,19] since they exhibit phenomena quite similar to those in physics, as represented by Brownian motion. It is noteworthy that the concept of Brownian motion was historically first introduced by Bachelier in finance [20], before the famous work by Einstein in physics [21]. After the work by Bachelier, various characteristics of Brownian motion in finance, and their differences from physical Brownian motion, have been found by both theoretical and data analyses. At the level of price time series, the power-law behavior of price movements has been reported empirically [22-26]. Such universal characteristics have been summarized as the stylized facts [19] and have been studied theoretically with time-series models [19,27-29] and agent-based models [30-37]. In addition, the characteristics of order books (i.e., current distributions of quoted prices) have been studied by both empirical analysis and order-book models [19,38-44]. For example, the zero-intelligence order-book models [38-44] have been investigated from various viewpoints, such as power-law price movement statistics [38], the order-book profile [41], and the market impact of large meta orders [43,44]. The collective motion of the full order book was further found by analyzing the layered structure of the order book [45,46], which was a key to generalizing the fluctuation-dissipation relation to financial Brownian motion. To date, however, the modeling of individual traders' dynamics based on direct microscopic evidence has not been fully studied, which has been a crucial obstacle to applying statistical mechanics starting from microscopic dynamics.
To fully apply statistical mechanics to financial systems, it is necessary to establish a microscopic dynamical model of traders based on microscopic evidence and to develop a non-equilibrium statistical mechanics for such non-Hamiltonian many-body systems.
Recently, an extension of the kinetic framework to financial Brownian motion has been proposed by studying high-frequency data that include trader identifiers (IDs) [46]. The dynamics of high-frequency traders (HFTs) were directly analyzed by tracking the trajectories of individual traders, and a microscopic model of trend-following HFTs was established that agrees with empirical analyses of the microscopic trajectories. On the basis of the "equations of motion" for the HFTs, the Boltzmann-like and Langevin-like equations were then derived for the mesoscopic and macroscopic dynamics, respectively. This framework was shown to be consistent with empirical findings, such as the trend-following of HFTs, the average order book, the price movement, and the layered order-book structure. However, the mathematical argument therein was rather heuristic, similar to the original derivations of the conventional Boltzmann and Langevin equations. In light of the traditional stream of kinetic theory, a mathematical derivation beyond heuristics is necessary for financial Brownian motion, in parallel with the works by BBGKY and van Kampen.
In this paper, we show the mathematical foundation of financial Brownian motion in parallel with the mathematics of kinetic theory. For the trend-following HFT model [46], we first define the phase space and the corresponding phase-space distribution (PSD) according to analytical mechanics [15,47]. We then derive exactly the time-evolution equation for the PSD, which corresponds to the Liouville equation in analytical mechanics. The many-body dynamics for the PSD are reduced to few-body dynamics for the reduced PSDs according to the reduction method of BBGKY. By assuming molecular chaos, we obtain the non-linear Boltzmann equation for the order-book profile and the master-Boltzmann equation for the market price dynamics. We also present their perturbative solutions for a large number of HFTs to study the dynamical behavior of this system at all hierarchies. The validity of our framework is finally examined by Monte Carlo simulation.
This paper is organized as follows: In Sec. II, we briefly review the mathematical structure of the standard kinetic theory before proceeding to our work. In Sec. III, we describe the details of the trend-following HFT model as the microscopic setup. In Sec. IV, the microscopic dynamics of the model are exactly formulated in terms of the Liouville equation and the corresponding BBGKY hierarchal equation. In Sec. V, the financial Boltzmann equation is derived as the mesoscopic description of this financial system. In Sec. VI, the macroscopic behavior is analyzed by deriving the financial Langevin equation. In Sec. VII, implications of our theory are discussed for several related topics. We conclude this paper in Sec. VIII with some remarks.

II. BRIEF REVIEW OF CONVENTIONAL KINETIC THEORY FOR BROWNIAN MOTION
Before proceeding to the core part of our work, we briefly review the scenario of conventional kinetic theory for Brownian motion to convey our essential idea for its generalization to financial systems. Let us consider the Hamiltonian dynamics of $N$ gas particles of mass $m$ and a tracer particle of mass $M$ with hard-core interactions in a hard-core box of volume $V$ (see Fig. 1a for a schematic). The momentum and position of the $i$th gas particle are denoted by $p_i \equiv (p_{i;x}, p_{i;y}, p_{i;z})$ and $q_i \equiv (q_{i;x}, q_{i;y}, q_{i;z})$ for $1 \le i \le N$, and those of the tracer are denoted by $P = p_0$ and $Q = q_0$. The dynamics of this system are described by the equations of motion (1), with interaction force $F_{ij}$ between particles $i$ and $j$ for $0 \le i, j \le N$ ($m_i = M$ for $i = 0$ and $m_i = m$ otherwise).
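For concreteness, Newtonian equations of motion consistent with the notation above can be sketched as follows. The explicit form is the standard textbook one and is an assumption about the content of Eq. (1); the interaction with the walls of the box is not written out.

```latex
\frac{d q_i}{dt} = \frac{p_i}{m_i}, \qquad
\frac{d p_i}{dt} = \sum_{\substack{j=0 \\ j \neq i}}^{N} F_{ij}, \qquad 0 \le i \le N .
```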

A. Liouville equation
In analytical mechanics, the phase space is defined as $S \equiv \prod_{i=0}^{N} (-\infty, \infty)^6$. The state of the system is designated by the phase point $\Gamma \equiv (P, Q; p_1, q_1; \dots; p_N, q_N) \in S$, and the corresponding PSD is denoted by $P_t(\Gamma)$. The time evolution of the PSD is described by the Liouville equation (2), with the Liouville operator $L$ [61] (see Refs. [14,15,47-50] for the details). This equation is mathematically equivalent to the equations of motion (1) and is the fundamental equation for the microscopic description (Fig. 1a). However, it is not analytically solvable, as it addresses the original many-body dynamics in full without any approximation.

FIG. 1. (a) Microscopic setup for Brownian motion. Gas particles and a massive tracer interact with each other, and the dynamics are described by the Liouville equation (2). As the mesoscopic description (Fig. 1b), the full dynamics are reduced to the one-body distribution $\phi^{(1)}$ for the gas particles, which is governed by the Boltzmann equation (6). The macroscopic dynamics of the tracer (Fig. 1c) are described by the master-Boltzmann equation (8), or asymptotically by the Langevin equation (9) for large system size $M \to \infty$. (d-f) Hierarchal structure of financial markets parallel to molecular kinetic theory. In the microscopic hierarchy (Fig. 1d), each trader makes decisions to submit or cancel orders. The dynamics of the traders correspond to those of molecules in kinetic theory. In the mesoscopic hierarchy (Fig. 1e), the information on trader identifiers is lost by coarse-graining. We thus obtain the dynamics of the order book (i.e., the quoted price distribution). The order-book profile corresponds to the velocity distribution in conventional kinetic theory. In the macroscopic hierarchy (Fig. 1f), the dynamics of the market price movement are finally deduced by coarse-graining and exhibit anomalous random walks. The market price dynamics correspond to those of the Brownian particle in kinetic theory.

B. BBGKY hierarchy and Boltzmann equation
To focus on the one-body dynamics of a gas particle or the tracer, let us introduce the reduced PSDs. On the assumption of binary interactions, we can exactly derive the hierarchy of equations for the reduced PSDs, Eq. (3), with one-body Liouville operators $L^{(1)}$, $L^{(T)}$ and two-body collision operators $L^{(2)}$, $L^{(TG)}$. These equations are exact but not closed in terms of $\phi^{(1)}$. To obtain analytical solutions, a further approximation is necessary. The standard approximation in kinetic theory is a mean-field approximation, called molecular chaos, which has been shown mathematically to be asymptotically exact for dilute gases in the thermodynamic limit $N, V \to \infty$ (called the Boltzmann-Grad limit [51]). We then obtain the closed dynamical equation (6) for $\phi^{(1)}$, which is the fundamental equation for the mesoscopic description (Fig. 1b). The steady solution for $\phi^{(1)}$ of the non-linear Boltzmann equation (6) is then given by the celebrated Maxwell-Boltzmann distribution.

C. Langevin equation
The stochastic dynamics of the macroscopic variables $(P, Q)$ can also be obtained within kinetic theory. By applying molecular chaos to $P^{(TG)}(P, Q, p_1, q_1)$, we obtain the master-Boltzmann equation (or the linear Boltzmann equation) (8), which belongs to the class of linear master equations for Markov processes and describes the dynamics of the tracer particle. Equation (8) can be further approximated by the Fokker-Planck equation within the system size expansion [16]. One can thus deduce the Langevin equation (9) for the tracer as the macroscopic description of Brownian motion (Fig. 1c), with viscous coefficient $\gamma$, temperature of the gas $T$, and white Gaussian noise $\xi^{G}$ with unit variance. The above formulation shows the systematic connection from the microscopic Newtonian dynamics to the mesoscopic and macroscopic dynamics. This methodology has been shown to be valid even for non-equilibrium systems when the gas is sufficiently dilute (see Refs. [3-9,12] for its application to various non-equilibrium systems), and it is one of the most successful formulations in statistical physics.
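The macroscopic description can be illustrated numerically. The following is a minimal sketch, assuming the standard momentum form $dP/dt = -(\gamma/M)P + \sqrt{2\gamma T}\,\xi^{G}$ with $k_B = 1$; this convention, the Euler-Maruyama discretization, and the function names are assumptions made for illustration and are not taken from Eq. (9) itself. In the stationary state the momentum variance should approach $MT$ (equipartition).

```python
import math
import random


def simulate_langevin(gamma, mass, temperature, dt, n_steps, seed=0):
    """Euler-Maruyama integration of dP/dt = -(gamma/mass) P + sqrt(2 gamma T) xi(t).

    The form of the equation is an assumed convention (k_B = 1).
    Returns the list of momenta after each time step.
    """
    rng = random.Random(seed)
    momentum = 0.0
    # Noise increment over one step: sqrt(2 gamma T dt) times a unit Gaussian.
    noise_amplitude = math.sqrt(2.0 * gamma * temperature * dt)
    trajectory = []
    for _ in range(n_steps):
        momentum += -(gamma / mass) * momentum * dt
        momentum += noise_amplitude * rng.gauss(0.0, 1.0)
        trajectory.append(momentum)
    return trajectory


def stationary_variance(trajectory, burn_in):
    """Sample variance of the trajectory after discarding the first burn_in steps."""
    samples = trajectory[burn_in:]
    mean = sum(samples) / len(samples)
    return sum((x - mean) ** 2 for x in samples) / len(samples)
```

For example, `stationary_variance(simulate_langevin(1.0, 1.0, 1.0, 0.01, 200000), 10000)` should be close to `mass * temperature = 1.0`, up to sampling and discretization error.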

D. Idea to generalize kinetic theory toward finance
Here we remark on our idea for generalizing the framework to financial Brownian motion. Financial markets have a hierarchal structure quite similar to that of conventional Brownian motion (see Fig. 1d-f for a schematic): In the microscopic hierarchy, individual traders make decisions to buy or sell currencies at a certain price (Fig. 1d). In the mesoscopic hierarchy, the dynamics are coarse-grained into the order-book dynamics by removing the traders' IDs (Fig. 1e). In the macroscopic hierarchy, the dynamics are reduced to the price dynamics (Fig. 1f). These hierarchies correspond directly to those in kinetic theory: traders, the order book, and the price correspond to molecules, the velocity distribution, and the Brownian particle, respectively. In the following sections, we present a parallel mathematical framework for the description of financial markets from microscopic dynamics.

III. MICROSCOPIC SETUP
In this section, the dynamics of the trend-following HFT model in Ref. [46] are mathematically formulated as many-body stochastic processes with collisions, on the basis of microscopic empirical evidence.

A. Notation
We here briefly explain the notation used in this paper. Every stochastic variable carries a hat, such as $\hat{A}$, to distinguish it from non-stochastic real numbers such as $A$. For example, the probability distribution function (PDF) of a stochastic variable $\hat{A}(t)$ at real time $t$ is denoted by $P(A, t) \equiv P(\hat{A}(t) = A)$ with a non-stochastic real number $A$ (i.e., the probability of $\hat{A}(t) \in [A, A + dA)$ is given by $P(A, t)dA$). The complementary cumulative distribution function (CDF) is also defined as $P(\ge A, t) \equiv \int_A^\infty P(A', t)dA'$. To simplify the notation, arguments of functions are sometimes omitted without mention when they are obvious. The ensemble average of any stochastic ...
We next explain the terminology for the order book of the whole market (Fig. 2a). The highest bid (lowest ask) quoted price among all the traders is called the market best bid (ask) price $\hat{b}_M$ ($\hat{a}_M$). The average of the market best bid and ask prices is called the market mid price $\hat{z}_M \equiv (\hat{b}_M + \hat{a}_M)/2$. The difference between the market best bid and ask prices is called the market spread. The market transacted price is the price at which a transaction occurs in the market. In this paper, the market price (denoted by $\hat{p}$) is shorthand for the market transacted price.
As for a single trader, the highest bid (lowest ask) price quoted by that trader is called the best bid (ask) price of the trader (denoted by $\hat{b}_i$ ($\hat{a}_i$) for the $i$th trader). The average of the best bid and ask prices of the trader is called the mid price of the trader (denoted by $\hat{z}_i$). Also, the difference between the best bid and ask prices of the trader is called the buy-sell spread of the trader (denoted by $\hat{L}_i \equiv \hat{a}_i - \hat{b}_i$), which is different from the market spread.
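The terminology above can be made concrete with a small sketch. The `Quote` class and the `market_summary` function are hypothetical names introduced only for illustration; the quantities they compute follow the definitions in the text.

```python
from dataclasses import dataclass


@dataclass
class Quote:
    """Best bid and ask prices quoted by a single trader (hypothetical container)."""
    bid: float  # b_i: best bid price of the trader
    ask: float  # a_i: best ask price of the trader

    @property
    def mid(self):
        # z_i = (b_i + a_i) / 2: mid price of the trader
        return 0.5 * (self.bid + self.ask)

    @property
    def spread(self):
        # L_i = a_i - b_i: buy-sell spread of the trader
        return self.ask - self.bid


def market_summary(quotes):
    """Market best bid/ask, market mid price, and market spread."""
    best_bid = max(q.bid for q in quotes)  # b_M: highest bid among all traders
    best_ask = min(q.ask for q in quotes)  # a_M: lowest ask among all traders
    return {
        "best_bid": best_bid,
        "best_ask": best_ask,
        "mid": 0.5 * (best_bid + best_ask),  # z_M
        "spread": best_ask - best_bid,       # market spread
    }
```

Note that the market spread is generally narrower than each trader's own buy-sell spread, since the market best bid and best ask usually belong to different traders.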
There are two types of time in this paper: the real time $t$ and the tick time $T$ (Fig. 2b). The tick time $T$ is a discrete time incremented by one at every market transaction, and it corresponds to the real time as a stochastic variable, such that $t = \hat{t}[T]$. Here the square brackets in a function argument (e.g., $\hat{A}[T]$) mean that the stochastic variable $\hat{A}(t)$ is measured according to the tick time $T$ (i.e., $\hat{A}[T] \equiv \hat{A}(\hat{t}[T])$), in contrast to measurement according to the real time $t$ (e.g., $\hat{A}(t)$ with round brackets).

B. Characters of real HFTs
Here we describe the characteristics of real HFTs on the basis of a high-frequency data analysis of a foreign exchange (FX) market. We analyzed order-book data, including anonymized trader IDs and anonymized bank codes, from Electronic Broking Services (EBS) for the period from 18:00 GMT on 5 June 2016 to 22:00 GMT on 10 June 2016. EBS is an interbank FX market and is one of the biggest financial platforms in the world. The minimum volume unit for a transaction was one million US dollars (USD) for the FX market between the USD and the Japanese yen (JPY). We focus particularly on HFTs, who frequently submit or cancel their orders according to algorithms. As reported in our previous work [46], HFTs have several characteristics quite different from those of low-frequency traders (LFTs). In this paper, an HFT is defined as a trader who submitted orders more than 2500 times during the week, similarly to previous research [52]. With this definition, the number of HFTs was 135 during this week, while the total number of traders submitting limit orders was 922 [62], and 89.6% of all the orders in this market were submitted by the HFTs. Here we summarize the reported characteristics with some additional evidence:
(α1). Small number of live orders and volume: HFTs typically maintain only a few live orders, fewer than ten (see Fig. 3a and b). Furthermore, a single order submitted by an HFT typically has a volume of one unit of the currency. These characteristics are in contrast to those of LFTs, who sometimes submit a large volume in a single order (see Fig. 3a and c for the fat-tailed distributions of the number of orders or volumes for LFTs).
(α2). Liquidity providers: Typical HFTs play the role of key liquidity providers (or market makers) and are obliged to maintain continuous two-way quotes during their liquidity hours according to the EBS rulebook [53] (see Fig. 3d for a typical trajectory of the top HFT). The balance between the ask and bid order books is kept statistically symmetric to some extent, seemingly thanks to the liquidity providers.
[Fig. 3 panel titles: Orders and volumes (HFT); Orders and volumes (LFT); Volumes filled in one transaction; Typical trajectory of the top HFT.]
(α3). Frequent price modification: Typical HFTs frequently modify their quoted prices by successive submission and cancellation of orders (see Fig. 3d for a typical trajectory of the top HFT). The lifetimes of orders were typically within seconds for the top HFT, while the typical transaction interval was 9.3 seconds in our dataset. In addition, 94.4% of the submissions by all the HFTs were eventually canceled without transactions.
(α4). Trend-following property: HFTs tend to follow the market trends. We here denote the best bid and ask quoted prices of the $i$th trader and the market price at tick time $T$ by $\hat{b}_i[T]$, $\hat{a}_i[T]$, and $\hat{p}[T]$, respectively (see Fig. 2b). We also denote the mid quoted price of the $i$th trader by $\hat{z}_i[T]$ ... $L_{\min}$, respectively. According to Ref. [46], the buy-sell spread distribution $\rho_L$ is directly measured to obey the γ-distribution, with decay length $L^*$ and empirical exponent $\alpha \approx 3$.
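The γ-distributed spread mentioned in (α4) can be sketched numerically. The parametrization below, $\rho(L) = L^{\alpha} e^{-L/L^*} / [\Gamma(\alpha+1) L^{*\,\alpha+1}]$, is an assumption: it takes $\alpha$ to be the power of $L$ in the prefactor and fixes the normalization accordingly. The function names are hypothetical.

```python
import math
import random


def spread_density(spread, decay_length, alpha=3.0):
    """Assumed gamma density rho(L) = L^alpha exp(-L/L*) / (Gamma(alpha+1) L*^(alpha+1))."""
    if spread < 0.0:
        return 0.0
    norm = math.gamma(alpha + 1.0) * decay_length ** (alpha + 1.0)
    return spread ** alpha * math.exp(-spread / decay_length) / norm


def sample_spread(rng, decay_length, alpha=3.0):
    """Draw one spread; random.gammavariate takes (shape, scale) = (alpha + 1, L*)."""
    return rng.gammavariate(alpha + 1.0, decay_length)


def integrate(func, lower, upper, n=20000):
    """Midpoint-rule quadrature of func over [lower, upper]."""
    h = (upper - lower) / n
    return sum(func(lower + (k + 0.5) * h) for k in range(n)) * h
```

Under this parametrization the mean spread is $(\alpha + 1)L^*$, i.e., $4L^*$ for $\alpha = 3$, which gives a quick consistency check between the density and the sampler.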

Trend-following random walks
HFTs tend to maintain continuous two-sided quotes by frequently modifying their prices (i.e., by successive cancellation and submission of limit orders), as required by the market rule [53]. This implies that the mid-price trajectory of an HFT can be modeled as a continuous random trajectory [i.e., characteristics (α2) and (α3)]. Remarkably, there is a mathematical theorem guaranteeing that the Itô processes (i.e., SDEs driven by white Gaussian noise) are the only Markov processes with continuous sample trajectories [13]. As a minimal model satisfying all the characteristics (α1)-(α4) of real HFTs, the dynamics of the HFTs in the absence of transactions (Fig. 4a) are modeled within the Itô processes by Eq. (14), which takes into account the empirical trend-following property (α4). Here $c$ and $\Delta p^*$ are constants characterizing the strength and the threshold of the trend-following effect, and $\hat{\eta}^R_i$ is white Gaussian noise with unit variance. The presence of the trend-following effect in Eq. (14) is the distinctive feature of our HFT model, and it induces the collective motion of limit orders [46]. The trend-following effect triggers translational motion of the full order book, which was crucial for reproducing the layered structure of the order book reported in Ref. [45].
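A single time step of such a trend-following random walk can be sketched as follows. The saturating drift $c\tanh(\Delta p/\Delta p^*)$ and the noise amplitude $\sigma$ are assumptions about the explicit form of Eq. (14), chosen to match the roles of $c$, $\Delta p^*$, and $\sigma^2$ described in the text; the function names and the Euler-Maruyama discretization are illustrative.

```python
import math
import random


def trend_drift(dp, c, dp_star):
    """Assumed trend-following drift: c * tanh(dp / dp_star).

    It is linear for |dp| << dp_star and saturates at +-c for |dp| >> dp_star.
    """
    return c * math.tanh(dp / dp_star)


def euler_maruyama_step(z, dp, c, dp_star, sigma, dt, rng):
    """Advance one mid price z by dt in the absence of transactions.

    dp is the previous price movement; sigma**2 is the noise variance per unit time.
    """
    noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return z + trend_drift(dp, c, dp_star) * dt + noise
```

With `sigma = 0` the step is deterministic, which makes the drift easy to inspect in isolation before the noise is switched on.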

Transaction rule
When the best bid price of one trader and the best ask price of another coincide, a transaction occurs (see Fig. 4b). The transaction condition (i.e., the condition of price matching) is given by $\hat{b}_i = \hat{a}_j$ for $i \neq j$ [Eq. (15)]. In the following, we assume that the index $i$ is an integer always different from another integer $j$. At the instant of a transaction, $\hat{b}_i = \hat{a}_j$, let us assume that the traders requote their prices simultaneously (see Fig. 4c) according to the requotation rule (16), where $\hat{b}^{pst}_i$ and $\hat{a}^{pst}_i$ are the post-transactional bid and ask prices of trader $i$ after the transaction between traders $i$ and $j$. By introducing the mid price of the individual traders as $\hat{z}_i \equiv (\hat{b}_i + \hat{a}_i)/2$, the transaction rule can be rewritten in terms of $\hat{z}_i$. We here define the market price $\hat{p}(t)$ and the previous price movement $\Delta\hat{p}(t)$ at time $t$: $\hat{p}(t)$ is the market price at the previous transaction, and $\Delta\hat{p}(t)$ is the price movement caused by the previous transaction. They are updated after transactions under the post-transaction rule (18) (Fig. 4b and c), with the sign function sgn(x) defined by sgn(x) = x/|x| for x ≠ 0 and sgn(0) = 0.
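The transaction rule can be sketched in terms of the mid prices. The price after the transaction, $p^{pst}_{ij} = z_i - (L_i/2)\,\mathrm{sgn}(z_i - z_j)$, is the expression given in Sec. III D. The requotation of the mid prices used below, in which each trader moves by half of its own spread toward the other, is an assumption about Eq. (16); it is consistent with the CM displacement $-(L_i - L_j)/2N$ quoted in Sec. IV B. The function names are hypothetical.

```python
def sgn(x):
    """Sign function with sgn(0) = 0."""
    return (x > 0) - (x < 0)


def is_transaction(z_i, spread_i, z_j, spread_j):
    """Price matching: the bid of the higher trader meets the ask of the lower one.

    In continuous time this holds with equality; '>=' tolerates discretization.
    """
    return abs(z_i - z_j) >= 0.5 * (spread_i + spread_j)


def requote(z_i, spread_i, z_j, spread_j):
    """Assumed requotation: each mid price jumps by half its own spread toward the other."""
    s = sgn(z_i - z_j)
    return z_i - 0.5 * spread_i * s, z_j + 0.5 * spread_j * s


def update_price(price, z_i, spread_i, z_j):
    """Post-transaction market price and price movement (p_pst, p_pst - p)."""
    new_price = z_i - 0.5 * spread_i * sgn(z_i - z_j)
    return new_price, new_price - price
```

For instance, with `z_i = 103`, `L_i = 2`, `z_j = 100`, and `L_j = 4`, the bid of trader `i` and the ask of trader `j` both equal 102, so the transaction price is 102 and both mid prices are requoted to 102 under the assumed rule.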

D. Complete model dynamics
We here specify the complete dynamics of the quoted prices $\{\hat{z}_i(t)\}_i$ within the framework of stochastic processes with collisions. When the previous price movement is $\Delta p$, we assume that the traders' quoted prices are described by trend-following random walks with requotation jumps, where $\hat{\eta}^T_i$ is the requotation jump term and $\hat{\tau}_{k;ij}$ is the $k$th transaction time between traders $i$ and $j$, at which the transaction condition is satisfied. The requotation jump $\hat{\eta}^T_i$ corresponds to collisions in molecular kinetic theory. The price-matching condition (15) and the requotation rule (16) correspond to the contact condition and the momentum exchange rule in standard kinetic theory for hard-sphere gases, respectively. The model parameters are summarized in Table I with their dimensions. A sample trajectory of this model is depicted in Fig. 5a. We note that this model is a generalization of the previous theoretical models in Refs. [31,34-37] on the basis of the above empirical facts (α1)-(α4) on HFTs.
The dynamics of the price $\hat{p}$ and the previous price movement $\Delta\hat{p}$ can be specified within the framework of stochastic processes. Since $\hat{p}$ and $\Delta\hat{p}$ are updated at the instant of transactions, their dynamics are synchronized with the collision times $\hat{\tau}_{k;ij}$. Considering the transaction rule for prices (18), their concrete dynamical equations are given with the price after collision $\hat{p}^{pst}_{ij} \equiv \hat{z}_i - (L_i/2)\,\mathrm{sgn}(\hat{z}_i - \hat{z}_j)$ and the price movement after collision $\Delta\hat{p}^{pst}_{ij} \equiv \hat{p}^{pst}_{ij} - \hat{p}$. In this paper, the Itô convention is used for multiplication by δ-functions.
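The complete dynamics can be sketched as a small Monte Carlo simulation. This is a minimal sketch under stated assumptions, not the implementation used for the figures: it assumes the tanh drift $c\tanh(\Delta p/\Delta p^*)$, a fixed time step `dt`, identical spreads, and requotation by half of each trader's spread; a transaction is triggered when the market best bid reaches or crosses the market best ask within a step. All names are hypothetical.

```python
import math
import random


def simulate_hft_market(n_traders=20, spread=1.0, c=0.0, dp_star=1.0,
                        sigma=1.0, dt=1e-3, n_steps=3000, seed=0):
    """Return (mids, spreads, prices); prices are indexed by tick time."""
    rng = random.Random(seed)
    spreads = [spread] * n_traders
    # Initial mid prices lie in a band narrower than the spread,
    # so that no bid crosses any ask at t = 0.
    mids = [rng.uniform(-0.25 * spread, 0.25 * spread) for _ in range(n_traders)]
    price, dp = 0.0, 0.0
    prices = []
    indices = range(n_traders)
    for _ in range(n_steps):
        # Continuous part: common trend-following drift plus independent noise.
        drift = c * math.tanh(dp / dp_star)
        mids = [z + drift * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
                for z in mids]
        # Collision part: resolve every crossing of best bid and best ask.
        while True:
            i = max(indices, key=lambda k: mids[k] - 0.5 * spreads[k])  # bidder
            j = min(indices, key=lambda k: mids[k] + 0.5 * spreads[k])  # asker
            best_bid = mids[i] - 0.5 * spreads[i]
            best_ask = mids[j] + 0.5 * spreads[j]
            if best_bid < best_ask:
                break
            # Transaction price p_pst = z_i - L_i/2 for the bidder i.
            new_price = best_bid
            dp, price = new_price - price, new_price
            # Requotation: bidder moves down, asker moves up, by half a spread.
            mids[i] -= 0.5 * spreads[i]
            mids[j] += 0.5 * spreads[j]
            prices.append(price)
    return mids, spreads, prices
```

The inner loop always terminates: each requotation lowers one bid and raises one ask by a finite amount and creates no new crossing, since the requoting bidder's new ask equals its old mid price, which lies above every other bid.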

TABLE I. Summary of the model parameters with their dimensions.
Parameter    Meaning                           Dimension
$L_i$        Buy-sell spreads of traders       price
$c$          Strength of trend-following       price/time
$\Delta p^*$ Saturation for trend-following    price
$\sigma^2$   Variance of random noise          price^2/time

The introduction of slow variables is, in general, the key to reducing complex dynamics (e.g., the center of mass (CM) of the Brownian particle [16] and the slaving principle in synergetics [54]). Here we introduce the CM of the quoted prices, $\hat{z}_{CM}$, as the slow variable of this system (Fig. 5a). The CM is defined together with its dynamics, which include the requotation jumps $\hat{\eta}^T_i$. The CM $\hat{z}_{CM}$ characterizes the macroscopic dynamics of this system. Indeed, as will be shown in Sec. VI C 1, the diffusion coefficient of the CM turns out to be proportional to $N^{-1}$ for the weak trend-following case, implying that the selection of $\hat{z}_{CM}$ as a slow variable is reasonable.
Another motivation for introducing the CM is to define the relative price from the CM, $\hat{r}_i \equiv \hat{z}_i - \hat{z}_{CM}$, since the relative price $\hat{r}_i$ has better mathematical properties than $\hat{z}_i$. For example, the relative price $\hat{r}_i$ fluctuates around zero (see Fig. 5b for the dynamics in the comoving frame of the CM) and has a stationary distribution, while the original variable $\hat{z}_i$ diffuses to infinity over long times and has no stationary distribution.
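The CM and the relative prices can be sketched as follows, assuming that the CM is the arithmetic mean of the mid prices. That choice is an assumption here, consistent with the CM displacement $\Delta z_{CM} = -(L_i - L_j)/2N$ quoted in Sec. IV B. The function names are hypothetical, and the requotation by half of each trader's spread is likewise assumed.

```python
def center_of_mass(mids):
    """Assumed definition: z_CM = (1/N) * sum_i z_i."""
    return sum(mids) / len(mids)


def relative_prices(mids):
    """r_i = z_i - z_CM; by construction the r_i sum to zero."""
    cm = center_of_mass(mids)
    return [z - cm for z in mids]


def cm_shift_after_transaction(mids, spreads, i, j):
    """CM displacement when bidder i (higher mid) transacts with asker j (lower mid).

    Assumed requotation: z_i -> z_i - L_i/2 and z_j -> z_j + L_j/2.
    """
    requoted = list(mids)
    requoted[i] -= 0.5 * spreads[i]
    requoted[j] += 0.5 * spreads[j]
    return center_of_mass(requoted) - center_of_mass(mids)
```

A single transaction thus moves the CM by an amount of order $1/N$, whereas the transacting traders themselves jump by half a spread, which is one way to see why the CM is slow compared with the individual quotes.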

F. Difference to other order-book models
One of the unique characteristics of the HFT model is the collective motion of the order book due to trend-following. As shown in Ref. [45], the order book has a layered structure in the sense that the change in the volume of the bid (ask) order book near the best price is positively (negatively) correlated with price movements. This implies that the order book exhibits translational motion, like inertia in physics (Fig. 5c), and thus that the movements of HFTs are not independent of each other, as in herding behavior. This collective motion has not been implemented in conventional order-book models, which are based on independent Poisson processes for order submission and cancellation. It is minimally implemented in our HFT model as trend-following, for consistency with the layered order-book structure [46].

IV. MAIN RESULT 1: MICROSCOPIC DESCRIPTION
As the main results of this paper, the analytical solutions to the trend-following HFT model are presented by developing the mathematical technique of kinetic theory. We first introduce the phase space for the HFT model in the standard manner of analytical mechanics, and derive the dynamical equation for the PSD, which we call the financial Liouville equation. We next derive the hierarchy for the reduced distributions similarly to the BBGKY hierarchy in molecular kinetic theory, which is the theoretical key to understand the financial system systematically as shown in Secs. V and VI.

A. Phase space and phase-space distribution
We first introduce the phase space for the HFT model in the standard manner of analytical mechanics. Let us introduce a vector $\hat{\Gamma} \equiv (\hat{z}_1, \dots, \hat{z}_N; \hat{z}_{CM}, \hat{p}, \Delta\hat{p})$, which corresponds to a phase point in the phase space $S \equiv \prod_{i=1}^{N+3} (-\infty, \infty)$. Equations (19)-(22) are the complete set of dynamical equations for the phase point, corresponding to the Newtonian equations of motion in conventional mechanics. Let us also define the PSD function $P_t(\Gamma)$: the probability that the phase point exists at time $t$ in the volume element $d\Gamma \equiv dz_1 \cdots dz_N\, dz_{CM}\, dp\, d\Delta p$ around $\Gamma$ is given by $P_t(\Gamma)d\Gamma$.

B. Financial Liouville Equation
As the first main result of this paper, we present the Liouville equation for the trend-following trader model (19)-(22) as the dynamical equation for the PSD. The dynamical equation for the PSD is given by Eq. (24), with the advective and diffusive Liouville operator $L^a$ and the binary collision Liouville operator $L^c$. Here we have introduced the symmetric absolute derivative $|\partial_{ij}| f \equiv |\partial_i f| + |\partial_j f|$ for an arbitrary function $f(z_i, z_j)$ and the abbreviated derivatives $\partial_i \equiv \partial/\partial z_i$ and $\partial_{CM} \equiv \partial/\partial z_{CM}$ (see Appendix A for the detailed derivation). We have also introduced a difference vector that includes the movement of the CM, $\Delta z_{CM} \equiv -(L_i - L_j)/2N$. The advective and diffusive Liouville operator $L^a$ describes the continuous dynamics of the system in the absence of transactions, while the binary collision Liouville operator $L^c$ describes the discontinuous dynamics in the presence of transactions. Equation (24) formally corresponds to the Liouville equation (2) in molecular kinetic theory and is called the financial Liouville equation in this paper. The financial Liouville equation completely characterizes the microscopic dynamics of all traders (Fig. 1d).

C. Financial BBGKY Hierarchy
The financial Liouville equation (24) is exact but cannot be solved analytically. We therefore reduce Eq. (24) to a simplified dynamical equation for a one-body distribution, in parallel with molecular kinetic theory. In standard kinetic theory, the Boltzmann equation, a closed dynamical equation for the one-body distribution, is derived by systematically reducing the Liouville equation with the method of BBGKY (see Sec. II B). We here present the lowest-order equation for the reduced distributions of the trend-following HFT model, obtained by a calculation parallel to that in kinetic theory. We first introduce the relative price from the CM as $r_i \equiv z_i - z_{CM}$. We also define the one-body, two-body, and three-body reduced distribution functions for the relative price. We then obtain the lowest-order hierarchal equation (28) for the one-body distribution, with one-body, two-body, and three-body Liouville operators $L^{(i)}$, $L^{(ij)}$, $L^{(ijk)}$, effective variance $\tilde{\sigma}^2 \equiv \sigma^2(1 - 1/N)$, and jump size $\Delta r_{ij;s}$. Here the term $\Delta r^{(1)}_{ij;s}$ in the jump size indirectly originates from the movement of the CM during requotation. The detailed derivation of Eq. (28) is described in Appendix B. Equation (28) formally corresponds to the conventional BBGKY hierarchal equation (3) for the mesoscopic description. On the basis of Eq. (28), the Boltzmann-type closed equation for the one-body distribution is derived in the next section.
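The effective variance $\tilde{\sigma}^2 = \sigma^2(1 - 1/N)$ can be understood from a short calculation. The sketch below assumes that the CM is the arithmetic mean of the mid prices and that the noises $\hat{\eta}^R_j$ of different traders are independent with unit variance; only the noise part of the relative price is considered.

```latex
\begin{align}
  d\hat{r}_i\big|_{\rm noise}
    &= \sigma\left[\left(1 - \frac{1}{N}\right)\hat{\eta}^R_i
       - \frac{1}{N}\sum_{j \neq i}\hat{\eta}^R_j\right]dt, \\
  \tilde{\sigma}^2
    &= \sigma^2\left[\left(1 - \frac{1}{N}\right)^2 + \frac{N-1}{N^2}\right]
     = \sigma^2\,\frac{(N-1)^2 + (N-1)}{N^2}
     = \sigma^2\left(1 - \frac{1}{N}\right).
\end{align}
```

The correction is of order $1/N$ because the noise of trader $i$ also moves the CM by a fraction $1/N$ of its own displacement.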
We also derive the hierarchal equation for the macroscopic dynamics. For the macroscopic variables $Z \equiv (z_{CM}, p, \Delta p)$, we define the corresponding reduced distributions. We then obtain the hierarchal equation (32) for the macroscopic dynamics, with advective and diffusive Liouville operator $L^{a}_{CM}$ and collision Liouville operator $L^{a;ij}_{CM}$ between particles $i$ and $j$. Equation (32) formally corresponds to the lowest-order conventional BBGKY hierarchal equation (8) for the macroscopic description. Using this hierarchal equation (32), a closed master-Boltzmann equation is derived for the macroscopic variables in the next section. The set of Eqs. (28) and (32) is the second main result of this paper. Equation (28) connects the microscopic description (Fig. 1d) to the mesoscopic description (Fig. 1e), and Eq. (32) connects the mesoscopic description (Fig. 1e) to the macroscopic description (Fig. 1f). Their detailed derivation is presented in Appendix B. These equations are derived by a calculation parallel to that for the conventional BBGKY hierarchal equations (3) and (8), and they are called the financial BBGKY hierarchal equations in this paper. Similarly to the conventional BBGKY hierarchal equations (3) and (8), our hierarchal equations (28) and (32) are exact but not closed: the dynamics of low-order distributions are driven by those of higher-order distributions. Appropriate approximations, such as molecular chaos, are necessary to derive closed equations, as will be studied in the next section.
Remark on the three-body collision term. We here remark on the emergence of the three-body collision term $L^{(ijk)}$ in the BBGKY hierarchy (28), which is slightly different from the conventional BBGKY hierarchy (3). This term appears because our kinetic theory is formulated on the basis of the relative price $\hat{r}_i$. To understand this point, let us consider the movement of the relative price $\hat{r}_i$ of the $i$th trader during a collision between traders $j$ and $k$ (see Fig. 7 for a schematic of the three-body collision). While the mid price $\hat{z}_i$ of the $i$th trader does not move during the collision between traders $j$ and $k$, the CM of this system $\hat{z}_{CM}$ moves through a distance of $\Delta\hat{z}_{CM} \equiv \hat{z}^{pst}_{CM} - \hat{z}_{CM}$. The relative price $\hat{r}_i$ thus moves indirectly through a distance of $\Delta\hat{r}_i \equiv \hat{r}^{pst}_i - \hat{r}_i = -\Delta\hat{z}_{CM} = \Delta r^{(1)}_{jk;s}$, which appears in the three-body collision operator (29c). This effect is intuitively small in the large-$N$ limit and is finally shown to be irrelevant to the leading-order (LO) and next-to-leading-order (NLO) approximations, as discussed later.

V. MAIN RESULT 2: MESOSCOPIC DESCRIPTION
From the microscopic dynamics, we have derived the BBGKY hierarchal equation (28) for the mesoscopic description of the HFT model in parallel with the conventional BBGKY hierarchal equation (3). Here we proceed to derive the closed mean-field model for the mesoscopic description, which will be shown to be useful for understanding the order-book profile systematically.

A. Financial Boltzmann Equation
We here derive a closed equation for the one-body distribution function by assuming a mean-field approximation. Under this approximation, the dynamical equation for the one-body distribution $\phi^{L}_{t}(r)$ is thus obtained as Eq. (35), with mean-field probability flux $J^{LL}_{t;s}(r)$ for $s = \pm 1$. The systematic derivation of this equation is the third main result of this paper (see Appendix C for the details). Equation (35) is a closed equation for the one-body distribution function and corresponds to the Boltzmann equation in molecular kinetic theory (see Fig. 1b). Equation (35) is therefore called the financial Boltzmann equation in this paper. Here the dummy variable $s = +1$ ($s = -1$) denotes transactions as a bidder (an asker), and the integrals on the right-hand side (rhs) correspond to the collision integrals in the standard Boltzmann equation (6). Remarkably, Eq. (35) is derived by a systematic calculation from the Liouville equation (24), whereas it was originally introduced with a rather heuristic discussion in our previous paper [46].
[Figure caption: The average order-book profile is given by the superposition (37) of the tent function (36). For the δ-distributed spread (Case 1), the profile is the tent function (38). For the γ-distributed spread (Case 2), the profile obeys Eq. (39).]

B. Solution
Let us focus on the steady solution of Eq. (35). For the steady state, Eq. (35) can be solved analytically for $N \to \infty$ under an appropriate boundary condition (see Appendix D for details). The LO steady solution is given by the tent function, Eq. (36). The average order-book profile for the ask side $f_A(r)$ is given by the superposition of the tent functions, Eq. (37). We note that the average order-book profile has a symmetry, such that $f_B(r) = f_A(-r)$ for the average bid order book $f_B(r)$. We also note that the NLO correction (E4) can be obtained as shown in Appendix E. Though the LO solution (36) is sufficient to understand the average order-book profile, the NLO solution (E4) is necessary to understand the dynamics of the financial Langevin equation, as shown in Sec. VI.
Numerical comparison 1: δ-distributed spread. We here study the theoretical order-book profiles for two concrete examples with numerical validation (see Appendix F for the detailed implementation). Let us first consider the case of a single spread $L^*$. The corresponding average order-book profile is given by the tent function of Eq. (38). We have numerically examined the validity of this formula in Fig. 9a, which shows the numerical agreement with our formula (38). The LO solution (38) describes the order-book profile quite well, and the numerical convergence in Fig. 9a suggests that Eq. (38) might be exactly valid for $N \to \infty$.

Numerical comparison 2: γ-distributed spread. The formula (37) works well even for $L_{\min} \to 0$ and $L_{\max} \to \infty$ when the integrals converge. As an example, let us consider the case where the spread obeys the γ-distribution, which was empirically validated through single-trajectory analysis of individual traders in our previous work [46]. We have numerically examined the validity of the resulting profile formula (39) in Fig. 9b, which shows the numerical agreement with our formula (39). The numerical convergence in Fig. 9b suggests that the LO solution (39) might also be exact for $N \to \infty$.
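The two profiles compared above can be sketched numerically. The explicit forms used below are assumptions of this sketch, not expressions quoted from the text: a normalized tent $\psi_L(r) = (4/L^2)\max\{L/2 - |r|, 0\}$ supported on $|r| \le L/2$, an ask profile built by shifting each tent by $L/2$, and a γ density $\rho_L = L^3 e^{-L/L^*}/(6L^{*4})$. All function names are hypothetical.

```python
import math

def tent(r, L):
    """Normalized tent on |r| <= L/2 (assumed form of the LO solution)."""
    return (4.0 / L**2) * max(L / 2.0 - abs(r), 0.0)

def gamma_spread_density(L, L_star=1.0):
    """Assumed gamma density rho_L = L^3 exp(-L/L*) / (6 L*^4)."""
    return L**3 * math.exp(-L / L_star) / (6.0 * L_star**4)

def ask_profile_delta(r, L_star=1.0):
    """Ask-side profile for a single spread L*: a tent centred at r = L*/2."""
    return tent(r - L_star / 2.0, L_star)

def ask_profile_gamma(r, L_star=1.0, L_max=30.0, n=600):
    """Superpose shifted tents over the spread density (midpoint rule in L).

    The cutoff L_max = 30 is adequate only for L_star of order one.
    """
    h = L_max / n
    total = 0.0
    for k in range(n):
        L = (k + 0.5) * h
        total += gamma_spread_density(L, L_star) * tent(r - L / 2.0, L) * h
    return total

def integrate(f, a, b, n):
    """Midpoint rule for a one-dimensional integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

if __name__ == "__main__":
    # Both profiles should integrate to one if the assumed forms are normalized.
    print(integrate(ask_profile_delta, 0.0, 1.0, 1000))
    print(integrate(ask_profile_gamma, 0.0, 30.0, 300))
```

Under these assumptions both profiles integrate to one, which is a quick sanity check on any closed-form expression derived for the γ-distributed case.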

VI. MAIN RESULT 3: MACROSCOPIC DESCRIPTION
In this section, we derive the stochastic equations for the macroscopic dynamics of this system from the BBGKY hierarchal equation (32), in a manner parallel to the master-Boltzmann equation (8) for physical Brownian motions.

A. Master-Boltzmann Equation for Financial Brownian Motion
On the basis of the financial BBGKY hierarchy (32) for the macroscopic dynamics, we derive a closed dynamical equation for the macroscopic variables $\hat Z \equiv (\hat z_{\rm CM}, \hat p, \Delta \hat p)$. Here we first make the assumption of molecular chaos. Using the NLO solution (E4), we then deduce a closed master-Boltzmann equation (42) for the macroscopic dynamics (see Appendix G for the detailed calculation). The mean-field collision Liouville operator for the macroscopic variables, $L^{c;{\rm MF}}_{\rm CM}$, is defined in terms of $1/L^{*2}_\rho \equiv \int dL\, \rho_L / L^2$, the Gaussian distribution $N(x; \sigma^2)$, the jump size distribution $w_N(y)$, and the mean transaction interval $\tau^*$, by assuming that $\rho_L$ is zero for $L \notin [L_{\min}, L_{\max}]$. Note that Eq. (42) is a master equation (or the differential form of the Chapman-Kolmogorov equation [13]) and is equivalent to a set of stochastic differential equations (SDEs) (see Eq. (G4) in Appendix G).
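The spread scale $L^{*2}_\rho$ defined through $1/L^{*2}_\rho \equiv \int dL\, \rho_L/L^2$ can be evaluated numerically for any spread density. The following is a minimal sketch; the function name and the example densities are hypothetical, and the γ density $L^3 e^{-L}/6$ (with $L^* = 1$) is an assumed form.

```python
import math

def inverse_square_spread_scale(rho, L_min, L_max, n=20000):
    """Return L*_rho^2, defined by 1/L*_rho^2 = integral of rho(L)/L^2 dL.

    Uses a midpoint rule, so L = 0 itself is never evaluated.
    """
    h = (L_max - L_min) / n
    total = 0.0
    for k in range(n):
        L = L_min + (k + 0.5) * h
        total += rho(L) / L**2 * h
    return 1.0 / total

if __name__ == "__main__":
    # Assumed gamma density with L* = 1, truncated at L = 40.
    gamma = lambda L: L**3 * math.exp(-L) / 6.0
    print(inverse_square_spread_scale(gamma, 0.0, 40.0))
    # Uniform density on [1, 2]: the integral of 1/L^2 is 1/2.
    print(inverse_square_spread_scale(lambda L: 1.0, 1.0, 2.0))
```

For the assumed γ density the result is close to $6L^{*2}$, which matches the value of $L^{*2}_\rho$ used in the numerical comparison of the price-movement variance.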

B. Financial Langevin Equation
We have derived the stochastic dynamics for the three macroscopic variables $\hat Z = (\hat z_{\rm CM}, \hat p, \Delta \hat p)$ as the master equation (42) (or equivalently the SDEs (G4)) in continuous time $t$. We next simplify the dynamics (42) of the three macroscopic variables into that of a single macroscopic variable $\Delta \hat p$ in the tick time $T$. In the tick time $T$, the dynamical equation for the price movement $\Delta \hat p$ is given by Eq. (44). This equation corresponds to the conventional Langevin equation (9), and is thus called the financial Langevin equation in this paper.
Within the mean-field approximation, we can specify all the statistics of the random noise terms analytically. The time interval $\hat\tau[T]$ is an exponential random number with mean interval $\tau^*$ [Eq. (45)]. The zigzag noise $\Delta\hat\xi[T]$ is defined by the difference of two Gaussian random numbers, where $\hat\xi[T]$ is a discrete-time white Gaussian noise with unit variance. The random noise term $\hat\zeta[T]$ is specified in terms of $\hat\mu[T]$, a discrete-time white Gaussian noise with unit variance, and $\hat\nu[T]$, a discrete-time white noise term obeying $P(\hat\nu) = \tilde w(\hat\nu)$ with an $N$-independent distribution $\tilde w(\nu) = w_N(\nu/N)/N$. We next discuss the interpretation of each term on the rhs of Eq. (44). The trend-following term induces the collective motion of the order book and thus keeps the price moving in the same direction for a certain time interval, similarly to inertia in physics. On the other hand, the zigzag noise term exhibits one-tick negative autocorrelation and has the effect of alternately changing the direction of the price movement. In this sense, the trend-following term and the zigzag noise have opposite effects; the balance between their strengths is crucial for the qualitative behavior of the market price dynamics. The random noise term $\hat\zeta$

C. Solution
The macroscopic dynamics of the price strongly depends on the balance between the strength of the trend-following effect and that of the zigzag noise. Here we present the solutions of the financial Langevin equation depending on the strength of trend-following, on the basis of dimensional analysis. The price movement originating from trend-following behavior is estimated to be $c\tau^*$ (of price dimension). On the other hand, the amplitude of the zigzag noise is estimated to be $L^*_\rho/\sqrt{2N}$ (of price dimension). Their balance is thus characterized by the dimensionless parameter $\tilde c$, defined by the ratio of these two scales, $\tilde c \equiv \sqrt{2N}\, c\tau^*/L^*_\rho$. Another dimensionless control parameter is the ratio $\Delta\tilde p^*$ between the saturation threshold against the market trend $\Delta p^*$ (of price dimension) and the average movement by the trend-following $c\tau^*$ (of price dimension): $\Delta\tilde p^* \equiv \Delta p^*/(c\tau^*)$. The set of dimensionless parameters $(\tilde c, \Delta\tilde p^*)$ governs the qualitative dynamics of the market price. For consistency with the empirical report [46], we focus on the case of $\Delta\tilde p^* \ll 1$ in this section, whereby the saturation of the hyperbolic function is valid (see Sec. VII I for the discussion of the case with $\Delta\tilde p^* \gg 1$). Here we introduce three classifications in terms of the strength of trend-following:
1. Weak trend-following case: $\tilde c \ll 1$
2. Strong trend-following case: $\tilde c \gg 1$
3. Marginal trend-following case: $\tilde c \sim 1$
Sample trajectories are plotted in Fig. 10 to highlight the character of each case. For the weak trend-following case (Fig. 10a), the price tends to move upward and downward alternately every tick because of the zigzag noise $\Delta\hat\xi$. For the strong trend-following case (Fig. 10b), the unidirectional movement of the price is kept for a certain time period. For the marginal trend-following case (Fig. 10c), both zigzag and unidirectional movements randomly appear because both effects are in balance. As will be shown later in detail, the marginal case may be the most realistic, at least in our dataset. We next study these qualitative characters through statistical analysis of price time series within the mean-field approximation.

Weak trend-following case

In this case, the master equation (42) can be solved analytically in continuous time $t$. By applying the system size expansion [16] (see Appendix I for the derivation), we obtain the diffusion equation for the CM with the renormalized diffusion coefficient $D(N)$ up to the order of $N^{-1}$ and the second-order Kramers-Moyal coefficient $\alpha_2 \equiv \int_{-\infty}^{\infty} dy\, y^2 \tilde w(y)$. The diffusion constant $D(N)$ decays for $N \to \infty$, which implies that the dynamics of the CM become slower as the number of traders increases. Given that the dynamics of the price $\hat p$ coincides with that of the CM $\hat z_{\rm CM}$ on a long timescale, the diffusion of the price is also shown to be normal on a long timescale, with the same diffusion coefficient $D(N)$ in real time $t$. The mean square displacement (MSD) based on real time $t$ is thus obtained analytically as Eq. (52), showing normal diffusion for a long time.
We also study the price movement at one-tick precision. For the weak trend-following case, the only relevant term in Eq. (44) on a short timescale is the zigzag noise $\Delta\hat\xi[T]$. The price movement $\Delta\hat p$ then obeys the Gaussian distribution (53), whose variance is $L^{*2}_\rho/2N$. The autocorrelation function of the price movement $\Delta\hat p$ is also given by Eq. (54). Interestingly, this property is consistent with the empirical fact that price movements typically exhibit zigzag behavior on a short timescale, which is reflected in the strong one-tick negative autocorrelation of the price movement.
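The one-tick negative autocorrelation of a noise built as the difference of two successive Gaussian numbers can be checked directly. The sketch below assumes the index convention $\Delta\hat\xi[T] = A(\hat\xi[T+1] - \hat\xi[T])$ with an arbitrary amplitude $A$; both the convention and the function names are assumptions for illustration. For such a difference, the covariance between neighboring ticks is $-A^2$ and the variance is $2A^2$, so the normalized lag-1 autocorrelation is $-1/2$ and vanishes at larger lags.

```python
import random

def zigzag_noise(n_ticks, amplitude=1.0, seed=0):
    """Differences of successive unit Gaussian numbers (assumed convention)."""
    rng = random.Random(seed)
    xi = [rng.gauss(0.0, 1.0) for _ in range(n_ticks + 1)]
    return [amplitude * (xi[t + 1] - xi[t]) for t in range(n_ticks)]

def autocorrelation(x, lag):
    """Normalized sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / (n - lag)
    return cov / var

if __name__ == "__main__":
    moves = zigzag_noise(200000)
    print(autocorrelation(moves, 1))  # close to -1/2
    print(autocorrelation(moves, 2))  # close to 0
```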
Here we discuss the origin of the strong negative correlation in terms of price movement. Remarkably, only the random noise $\hat\zeta[T]$ is dominant on a long timescale, whereas only the zigzag noise $\Delta\hat\xi[T]$ is dominant on a short timescale. For $K \gg N$, indeed, we obtain Eq. (55), which implies that the contribution of the zigzag noise $\hat\xi[T]$ is negligible compared with that of the random noise $\hat\zeta[T]$. Considering that the random noise $\hat\zeta[T]$ originates from the diffusion of the CM, Eq. (55) means that the macroscopic behavior of the price is governed by the slow dynamics of the CM. Even though the price movement at one-tick precision is much larger than that of the CM, such movement is irrelevant to the macroscopic dynamics of the whole system. This is the origin of the strong negative correlation of the price movement in this model with weak trend-following. To relieve such negative correlation, stronger trend-following is necessary to induce the collective motion of the order book, as discussed in Ref. [46]. We note that similar slow diffusion is observed in the conventional zero-intelligence order-book models [38][39][40], in which the trend-following effect is likewise not incorporated.
We also note that the negative correlation (54) is related to the slow diffusion of the price on a short timescale. Indeed, the MSD is given by Eq. (56) within the mean-field approximation. This formula implies that the MSD is almost constant (i.e., no diffusion) on a short timescale $K \ll N$, while it is asymptotically linear (i.e., normal diffusion) on a long timescale $K \gg N$.

Numerical comparison. Here we examine the validity of our formulas through comparison with numerical results for the γ-distributed spread (see Appendix F for the implementation).
Transaction interval. We first check the statistics of the time interval between transactions $\hat\tau$. The mean transaction interval $\tau^* \equiv \langle\hat\tau\rangle$ is numerically plotted in Fig. 11a, showing quantitative agreement with the theoretical prediction (45), including the coefficient. We also numerically plotted the probability distribution of $\hat\tau$, qualitatively showing the exponential tail for large $\hat\tau$. Here, we have introduced a scaled transaction interval $\tilde\tau \equiv c_\tau \hat\tau/\tau^*$ and plotted the scaled probability distribution in Fig. 11b, with scaling parameters $c_\tau$ and $Z_\tau$ for the horizontal and vertical axes. The coefficients $c_\tau$ and $Z_\tau$ were determined by the least-squares method to fit the exponential tail for each $N$. The numerical results imply a modified decay length $c_\tau \approx 1.6$, whereas the mean-field solution (45) predicts $c_\tau = 1$. This means that the mean-field solution (45) is not exact but is qualitatively correct for the probability distribution $P(\tau)$.

This modification factor $c_\tau \approx 1.6$ can be roughly understood from the viewpoint of order statistics, as discussed in Ref. [46]. The mean-field approximation predicts the exponential interval distribution (45), which means that the transactions obey an exact Poisson process. As the numerics show, however, the transactions obey the Poisson process not exactly but only asymptotically. One candidate for the origin of this deviation is that a transaction occurs as a pair of arrivals of both bid and ask quotes. Let us assume that the arrival of a bid (ask) quote at the transaction price obeys the Poisson statistics $P(\tau_B) = e^{-\tau_B/\tau_B^*}/\tau_B^*$ ($P(\tau_A) = e^{-\tau_A/\tau_A^*}/\tau_A^*$). Any transaction is assumed to occur when both bid and ask quotes arrive at the transaction price. We then make the approximation that $\hat\tau \approx \max\{\hat\tau_B, \hat\tau_A\}$ and $\tau_B^* = \tau_A^*$. On the basis of order statistics [55], we obtain Eq. (58), $P(\tau) \approx (2/\tau_B^*)\, e^{-\tau/\tau_B^*}\,(1 - e^{-\tau/\tau_B^*})$, where the fitting parameter was determined by the consistency condition for the average interval as $\langle\hat\tau\rangle = 3\tau_B^*/2 = \tau^* \iff \tau_B^* = 2\tau^*/3$. We thus obtain the modification factor $c_\tau = 3/2$ as an approximation. We note that the transaction interval is not under the influence of the trend-following effect. The above statistical characters of the transaction interval are therefore shared for any parameter set $(\tilde c, \Delta\tilde p^*)$.
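The order-statistics argument above can be checked with a short simulation. The sketch draws the bid and ask arrival times as independent exponential numbers with the same mean $\tau_B^*$ and takes their maximum, which is exactly the stated approximation; the function name is hypothetical. The mean of the maximum of two such numbers is $3\tau_B^*/2$, so the tail decay time $\tau_B^*$ is $2/3$ of the mean interval, i.e., $c_\tau = 3/2$.

```python
import random

def pair_arrival_intervals(n, tau_b=1.0, seed=0):
    """Intervals modelled as max{tau_B, tau_A} with i.i.d. exponential arrivals."""
    rng = random.Random(seed)
    rate = 1.0 / tau_b  # expovariate takes the rate, not the mean
    return [max(rng.expovariate(rate), rng.expovariate(rate)) for _ in range(n)]

if __name__ == "__main__":
    tau_b = 1.0
    intervals = pair_arrival_intervals(200000, tau_b)
    mean_interval = sum(intervals) / len(intervals)
    # Ratio of the mean interval to the tail decay time: the factor c_tau.
    print(mean_interval / tau_b)
```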
MSD. Our theoretical prediction for the MSD is numerically examined here for analyses based on both real time $t$ and tick time $K$. We first numerically check the MSD (52) based on real time $t$ in Fig. 11c. This figure shows quantitative agreement with our theoretical formula (52) without fitting parameters. We also check the MSD based on tick time $K$ in Fig. 11d, which shows quantitative agreement with the theoretical prediction (56) for $K \gg 1$. For small $K \sim 1$, the agreement between the numerical data and the theoretical line is not perfect, but the slowness of the diffusion is qualitatively observed, as predicted by the mean-field solution (56).
Price movement. The dependence of the variance of the price movement on the number of traders $N$ is checked in Fig. 11e. We numerically obtained $\langle\Delta p^2\rangle \approx C_{\Delta p^2} (L^{*2}_\rho/2N)$ with modification factor $C_{\Delta p^2} \approx 0.4$ and $L^{*2}_\rho = 6L^{*2}$. Though there is a discrepancy in terms of the factor $C_{\Delta p^2}$, the mean-field solution (53) qualitatively works well for the variance of the price movement. We also checked the PDF $P(|\Delta\tilde p|)$ of the scaled price movement $\Delta\tilde p \equiv \sqrt{N}\Delta p/L^*$ (Fig. 11f and g for the peak and tail of the PDF, respectively). In Fig. 11g, we also show a Gaussian-type fitting curve $h(\Delta\tilde p) = \exp(-h_0^* - h_1^* \Delta\tilde p - h_2^* \Delta\tilde p^2)$ for the tail with parameters $h_0^* = 0.75 \pm 0.05$, $h_1^* = 0.54 \pm 0.04$, and $h_2^* = 0.238 \pm 0.006$. These figures suggest that the PDF of the price movement has a Gaussian tail, which is qualitatively consistent with the theoretical prediction (53) ($h_1^* = 0$ and $h_2^* = 1/6$).

Autocorrelation. The autocorrelation function $C_{\Delta p}[K]$ is checked in Fig. 11h, which supports the qualitative consistency between the theory (55) and the numerical results in terms of the negative correlation at $K = 1$ tick. This negative correlation implies that the price time series exhibits zigzag behavior in the absence of the trend-following effect. Indeed, the probability of $\Delta p[T+1]\Delta p[T] < 0$ is theoretically $2/3 \approx 66.7\%$ for the mean-field model (see Appendix J), considerably higher than 50% (i.e., pure random walks). This result is also qualitatively consistent with the numerical result (around 61%), as shown in Table II.

TABLE II. We numerically obtained the probability that the next price movement $\Delta p[T+1]$ has the same (different) sign as (from) that of the previous price movement $\Delta p[T]$ for $N = 100$. (a) For the weak trend-following case $\tilde c = 0$, the probability of taking a different sign is higher than that of taking the same sign, implying the zigzag motion of the price movement. (b) For the strong trend-following case $(\tilde c, \Delta\tilde p^*) = (2.0, 0.1)$, the probability of taking the same successive sign is much higher than that of taking a different sign, implying the ballistic motion of the price movement. (c) For the marginal trend-following case $(\tilde c, \Delta\tilde p^*) = (0.5, 2.5)$, the probability of taking a different sign is slightly higher than that of taking the same sign. (d) We also obtained the probabilities from the real price time series in our dataset, showing that the probability of taking a different sign is slightly higher than that of taking the same sign. For simplicity, we omitted zero price movements, $\Delta p[T] = 0$, during the data analysis of the real price movement time series $\{\Delta p[T]\}_T$. This table implies that the marginal trend-following case is consistent with the real price time series and is the most realistic, at least for stable markets.

Strong trend-following case

For the strong trend-following case, the price movement obeys the exponential law (59), $P(|\Delta p|) \sim e^{-|\Delta p|/\kappa}$, with decay length $\kappa$ for $|\Delta p| \to \infty$. The decay length is given by the mean movement originating from the trend-following as $\kappa = c\tau^*$ within the mean-field approximation (45). By applying the improved mean-field approximation (58), we obtain the coefficient $\kappa = 2c\tau^*/3$, which is more consistent with the numerical result below. The trend-following effect plays a role similar to momentum inertia in physics, which is reflected in the autocorrelation function and the MSD plot, as shown numerically in the next paragraph.

Numerical comparison. Numerical characters are studied here for the strong trend-following case under the parameter set $(\tilde c, \Delta\tilde p^*) = (2.0, 0.1)$. We first study the price movement distribution $P(|\Delta p|)$. In Fig. 12a, the price movement distribution is plotted by scaling the horizontal and vertical axes, qualitatively showing the exponential tail for the scaled price movement $\Delta\tilde p \equiv \Delta p/\kappa$. Here the scaling parameters $\kappa$ and $Z_{\Delta p}$ were determined by the least-squares method for the tail. The mean-field solution (45) and the improved mean-field solution (58) predict $\kappa = c\tau^*$ and $\kappa = 2c\tau^*/3$, respectively. These theoretical predictions are qualitatively consistent with the numerical estimate $\kappa \approx 0.64c\tau^*$. We next study the autocorrelation function $C_{\Delta p}[K]$ of the price difference $\Delta p$ based on tick time $K$ in Fig. 12b by scaling the horizontal axis. For our parameter sets, the numerical result implies that the autocorrelation function can be described with fitting parameters $\tau_{\rm AC}$ and $Z_{\rm AC}$. This autocorrelation suggests that strong trend-following keeps the price moving unidirectionally for a certain time interval. Indeed, the probability of $\Delta p[T]\Delta p[T+1] > 0$ is much higher than 50% under this condition, as shown in Table II. In addition, the numerical MSD plot in Fig. 12c shows rapid diffusion (almost ballistic motion, $\propto K^2$) for a short time and normal diffusion for a long time.

Marginal case
The most complex case is the marginal case $\tilde c \sim 1$, where both the trend-following effect and the zigzag noise contribute to the price movement on comparable scales. Although both the trend-following term and the noise term $\Delta\hat\xi$ are relevant under this condition, the main contribution to the price-movement tail originates from the trend-following term, because the former yields an exponential tail while the latter yields a Gaussian tail. We thus obtain the exponential tail (59) for the price movement in the marginal case. This theoretical conjecture is validated numerically below.

Numerical comparison. We studied the marginal case under the parameter set $(\tilde c, \Delta\tilde p^*) = (0.5, 2.5)$. In Fig. 13a, we plot the price movement distribution by scaling both the horizontal and vertical axes as in Eq. (60). We thus qualitatively obtain the exponential-law tail (60) for the price movement.
In Fig. 13b, we also studied the autocorrelation function $C_{\Delta p}[K]$ on tick time $K$ through both numerical simulation (points) and empirical data analysis (solid line) of the real time series. This figure shows a slight negative correlation around $K = 1$, which is qualitatively consistent with the empirical result in our dataset. This result also implies that the price time series exhibits a slight zigzag behavior for a certain tick period. This theoretical implication was validated by analyzing the probability of $\Delta p[T]\Delta p[T+1] < 0$, as summarized in Table II. Table II shows the quantitative consistency between the marginal trend-following case and the real price time series.
We also discuss the behavior of the MSD in Fig. 13c and d, which show both slow and rapid diffusion depending on the parameters. For example, we set the parameters $(\tilde c, \Delta\tilde p^*) = (0.5, 2.5)$ and $(\tilde c, \Delta\tilde p^*) = (0.86, 1.43)$ for Fig. 13c and d, respectively. In Fig. 13c, the MSD plot exhibits slightly slow diffusion for a short time and normal diffusion for a long time. In Fig. 13d, on the other hand, the MSD plot exhibits slightly rapid diffusion with the Hurst exponent $H = 0.65$ for a short time and normal diffusion for a long time. We thus conclude that our HFT model can reproduce a variety of diffusive behaviors by adjusting the trend-following parameters.
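The qualitative difference between the three cases can be illustrated with a reduced simulation of the tick-time dynamics. The sketch below is not the full financial Langevin equation: it assumes the scaled form $x[T+1] = \tilde c\,\tilde\tau[T]\tanh\!\big(x[T]/(\tilde c\,\Delta\tilde p^*)\big) + (\xi[T+1]-\xi[T])/\sqrt{2}$, where $x$ is the price movement in units of the zigzag amplitude, $\tilde\tau$ is an exponential number with unit mean, and the random noise $\hat\zeta$ is neglected (as expected for large $N$). The definitions $\tilde c \equiv \sqrt{2N}c\tau^*/L^*_\rho$ and $\Delta\tilde p^* \equiv \Delta p^*/(c\tau^*)$, the index convention, and the function names are assumptions of this sketch, so it should not be expected to reproduce Table II quantitatively.

```python
import math
import random

def simulate_price_moves(c_tilde, dp_tilde, n_ticks, seed=0):
    """Scaled price movements from a reduced trend-following recursion."""
    rng = random.Random(seed)
    threshold = c_tilde * dp_tilde  # saturation threshold in scaled units
    xi_prev = rng.gauss(0.0, 1.0)
    x = 0.0
    moves = []
    for _ in range(n_ticks):
        xi_next = rng.gauss(0.0, 1.0)
        zigzag = (xi_next - xi_prev) / math.sqrt(2.0)  # unit-variance zigzag noise
        tau = rng.expovariate(1.0)  # transaction interval in units of tau*
        trend = c_tilde * tau * math.tanh(x / threshold) if threshold > 0 else 0.0
        x = trend + zigzag
        moves.append(x)
        xi_prev = xi_next
    return moves

def sign_change_fraction(moves):
    """Fraction of successive nonzero movements with opposite signs."""
    pairs = [(a, b) for a, b in zip(moves, moves[1:]) if a != 0.0 and b != 0.0]
    return sum(1 for a, b in pairs if a * b < 0) / len(pairs)

if __name__ == "__main__":
    for label, c_tilde, dp_tilde in [("weak", 0.0, 0.1),
                                     ("strong", 2.0, 0.1),
                                     ("marginal", 0.5, 2.5)]:
        moves = simulate_price_moves(c_tilde, dp_tilde, 100000)
        print(label, sign_change_fraction(moves))
```

Without trend-following the sign-change fraction approaches the mean-field value $2/3$; trend-following lowers it, strongly so in the strong case.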

VII. DISCUSSION
We here discuss the implications of our theory for various topics in detail.

A. Comparison with real dataset
Here we provide a detailed comparison between empirical facts and the above theoretical predictions. As for the order-book profile $f_A(r)$, the validity of the formula (39) was examined by analyzing the daily average order book in Ref. [46]. The exponential tail of the time interval distribution, $P(\tau) \sim e^{-\tau/\tau^*}$, was studied in Ref. [56] by removing the non-stationary property of the time series. The price movement was reported to obey the exponential law $P(|\Delta p|) \sim e^{-|\Delta p|/\kappa}$ in Ref. [46] by removing the non-stationary property of the time series. The price time series tended to exhibit zigzag behavior, which was reflected in the negative autocorrelation function $C_{\Delta p}[K]$ around $K = 1$ (see Fig. 13b) and in the probability of $\Delta p[T]\Delta p[T+1] < 0$ (i.e., taking different signs) being slightly over 50% (see Table II). All these characters are consistent with our theoretical prediction for the marginal trend-following case (see Table III for the summary of the comparison). The HFT model presented here thus shows close agreement with these empirical facts. Considering that the market was stable in our dataset, we conclude that our HFT model can describe the FX market well, at least during the stable period. The description of unstable markets is out of the scope of this paper and is an interesting problem for future studies.

TABLE III.
Case | P(|∆p|) | P(τ) | C_∆p[K] | Prob. of diff. sign
(a) Weak trend-following case | Gaussian | Exponential | Strongly negative at K = 1 | around 60%
(b) Strong trend-following case | Exponential | Exponential | Strongly positive | less than 10%
(c) Marginal trend-following case | Exponential | Exponential | Slightly negative around K = 1 | around 52%
(d) Empirical facts | Exponential | Exponential | Slightly negative around K = 1 | around 52%

B. Validity of Mean-Field Approximation
We have numerically validated the mean-field theory. The LO solution (36) quantitatively describes the order-book profile (37) with high precision, and the NLO solution (E4) qualitatively describes the price movement (44). Here we discuss possible reasons why the mean-field approximation works so well for the trend-following HFT model, in light of common knowledge in physics.
The mean-field approximation is expected to be invalid for low-dimensional physical systems because two-body correlations between colliding pairs do not disappear for a long time. Colliding particles are not allowed to separate far from each other because of the continuity of paths and the low-dimensional space geometry. For one-dimensional Hamiltonian systems with hard-core interactions, for example, any particle successively collides with the same neighboring particles, and two-body correlations then remain forever. The mean-field approximation is therefore shown to be valid only for high-dimensional systems, at least for several concrete setups. From this viewpoint, the precise agreement between the mean-field solution (37) and the numerical result is not trivial.
In contrast, the continuity of the path is absent in our model because of requotation jumps, even though the model is a one-dimensional system. The transaction rule (17) compulsorily separates the transaction pairs after their collision, because of which there is no restriction on the combination of possible transaction pairs. In the $N \to \infty$ limit, in addition, transactions between the same pair of traders become rare (i.e., the probability of successive transactions between the same pair decays on the order of $N^{-2}$), which implies quick disappearance of the two-body correlation between transaction pairs for $N \to \infty$. This is our conjecture to validate the mean-field approximation for this model. If this conjecture is correct, kinetic-like descriptions may be valid for various agent-based systems, provided that agents are separated compulsorily to avoid successive interactions between the same pairs.

C. Non-stationary property for price movements: power-law behavior
Financial markets are known to exhibit strong non-stationary properties statistically, such as the intraday activity patterns. Here we discuss the impact of such non-stationary properties on the price movements and its relation to the celebrated power-law behavior for a long time.
Our theoretical model implies the exponential law (59) for the price movement as the basic statistical property. This property was shown to be consistent with the real price movement in Ref. [46] for a short time, by removing the non-stationary property in terms of the decay length $\kappa$. The decay length $\kappa$ is related to the number of traders $N$ and the strength of trend-following $c$, both of which are expected to have non-stationary properties. Indeed, at least the number of traders $N$ exhibits a trivial but strong non-stationary property that is correlated with the decay length.
To illustrate this character, let us analyze the statistical relation between the mean absolute price movement $\langle|\Delta p|\rangle$ and the number of HFTs $N$ in our dataset. We measured $\langle|\Delta p|\rangle$ as a representative of the market volatility for a short time and studied its correlation with $N$ every two hours in Fig. 14a. Spearman's rank correlation coefficient between $\langle|\Delta p|\rangle$ and $1/N$ was 0.63. This result implies that the market volatility is relatively small when $N$ is large, which is qualitatively consistent with our theoretical prediction of $\langle|\Delta p|\rangle \approx \kappa \sim 1/N^\beta$ (e.g., $\beta = 1$ if parameters other than $N$ are constant in time). The regression analysis between $\log\langle|\Delta p|\rangle$ and $\log N$ implies $\beta = 0.86 \pm 0.1$, as shown in Fig. 14b. We also note that both $\langle|\Delta p|\rangle$ and $1/N$ had a tendency to become large during inactive hours of the EBS market (Fig. 14c).
The non-stationary property of the market volatility is related to the power-law behavior of the price movement for a long time. In Ref. [46], the decay length $\kappa$ is shown to have a power-law distribution $P(\kappa) \propto \kappa^{-\alpha-1}$, which implies the power-law price movement for a long time as the superposition of the short-time exponential distributions, $P_{\rm long}(\geq|\Delta p|) = \int_0^\infty d\kappa\, P(\kappa)\, P_{\rm short}(\geq|\Delta p|; \kappa) \propto |\Delta p|^{-\alpha}$, with the complementary cumulative price movement distributions $P_{\rm long}(\geq|\Delta p|)$ and $P_{\rm short}(\geq|\Delta p|; \kappa) \sim e^{-|\Delta p|/\kappa}$. This result is consistent with previous empirical studies [23][24][25][26]. We thus conclude that both the exponential law and the power law can consistently coexist, at least in our dataset. We here note that the FX market in our dataset was rather stable, without any external shocks. While the exponential law was essential for a short time in our dataset, we do not deny the possibility that the power law may be essential even for a short time for unstable markets under external shocks. We believe that there would exist essentially different structures in unstable markets, and it would be interesting to study the statistics of traders' behavior in unstable markets under financial crises in future work.
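The coexistence of a short-time exponential law and a long-time power law can be illustrated by sampling. The sketch below draws a decay length $\kappa$ from a Pareto distribution, $P(\kappa) \propto \kappa^{-\alpha-1}$ for $\kappa \geq 1$, then draws a price movement from an exponential distribution with mean $\kappa$, and finally estimates the tail exponent of the mixture with the Hill estimator. The choice of a pure Pareto law with lower cutoff 1, the sample sizes, and the function names are assumptions made for illustration; the Hill estimator is a standard tail-index estimator and is not taken from the text.

```python
import math
import random

def sample_long_time_moves(n, alpha, seed=0):
    """Exponential price movements whose decay length is Pareto distributed."""
    rng = random.Random(seed)
    moves = []
    for _ in range(n):
        kappa = rng.paretovariate(alpha)  # P(kappa) ~ kappa^(-alpha-1), kappa >= 1
        moves.append(rng.expovariate(1.0 / kappa))  # exponential with mean kappa
    return moves

def hill_tail_exponent(samples, k):
    """Hill estimate of the exponent of the complementary cumulative tail."""
    top = sorted(samples, reverse=True)[: k + 1]
    threshold = top[k]  # (k+1)-th largest value
    return k / sum(math.log(x / threshold) for x in top[:k])

if __name__ == "__main__":
    moves = sample_long_time_moves(400000, alpha=3.0)
    # The estimated tail exponent should be close to alpha.
    print(hill_tail_exponent(moves, 1000))
```

The estimated exponent of the complementary cumulative distribution is close to $\alpha$, in line with the superposition argument.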
D. Non-stationary property for transaction interval: power-law behavior
As for the transaction interval, our theory predicts that the exponential law (45) is essential rather than the power law. This result is consistent with the previous report in Ref. [56], which showed that the exponential law is essential for a short time but that its superposition leads to the power-law behavior of the transaction interval for a long time.

E. Non-stationary property for order-book dynamics: stability of the order-book profile
We have discussed that both the price movement and the transaction interval are quite sensitive to non-stationary properties of the market. In contrast, the average order-book profile $f_A(r)$ is relatively insensitive to such non-stationary properties. Indeed, the average order-book profile $f_A(r)$ is independent of the trend-following parameter $\tilde c$. In addition, the order-book profile shows convergence for $N \to \infty$, such that $\lim_{N\to\infty} f_A(r)$ is an $L^2$-function, which implies that a large variation of $N$ does not have an impact on the order-book profile.
Similar insensitivity does not exist for the price movement and transaction interval. Indeed, they exhibit the strong divergence for N → ∞ as lim N →∞ P (|∆p|) = δ(|∆p|) and lim N →∞ P (τ ) = δ(τ ), which implies the huge impact of large variation of N on their statistics.
In this sense, the average order-book profile is a stable quantity to measure under non-stationary processes, whereas the price movement and transaction interval are unstable quantities. Our theory provides insight into the sensitivity of measured quantities to the non-stationary nature of the market. We believe that developing systematic methods to remove such non-stationary nature is the key to understanding not only the origin of power laws in finance but also the essence of market microstructure.

One of the most interesting features in statistical physics lies in the fact that many-body systems can exhibit essentially different characters from few-body systems, such as critical phenomena and collective motion. Though the current HFT model does not exhibit critical phenomena, an essential difference can be shown between the cases of $N = 2$ and $N \gg 1$. To illustrate this point, let us consider the case of $\tilde c = 0$ without trend-following. Our theory is applicable to solving the case of $N = 2$ exactly, which leads to the same solution as presented in Ref. [36]. The price movement is then predicted to obey the exponential law even without trend-following, which is qualitatively different from the Gaussian law for $N \to \infty$. This difference appears because the dynamics of the CM are not sufficiently slow for $N = 2$. For $N = 2$, indeed, one can show the absence of the zigzag noise term $\Delta\hat\xi[T]$ in the financial Langevin equation (44), which leads to the dominance of the random exponential noise $\hat\zeta[T]$. For $N \gg 1$, on the other hand, the random noise $\hat\zeta[T]$ is negligibly small because of the slow CM dynamics, and the trend-following effect becomes necessary to explain the exponential price movement statistics. The model presented here thus exhibits essentially different characters as the number of traders increases.
G. Does the trend-following effect break the random walk hypothesis?
At first sight, the trend-following effect strongly contradicts the conventional random walk hypothesis. Our analysis, however, implies that the situation is not so simple. In the absence of trend-following, the market price exhibits strong zigzag behavior, which is far from a pure random walk. By adjusting the strength of trend-following appropriately (i.e., the marginal trend-following case), on the other hand, the zigzag behavior is somewhat relieved and the market price time series rather approaches a random walk. In this sense, the trend-following strategy might originate from the rational behavior of HFTs to equilibrate the strategies among traders. It would be interesting to pursue the origin of trend-following behavior from economic viewpoints in future studies.
We also note that the real price time series exhibits slightly zigzag behavior (i.e., the negative autocorrelation and the tendency for successive price movements to take different signs), which is consistent with our HFT model for the marginal trend-following case. These differences from pure random walks have been well known in finance and are obviously applicable to predicting the direction of the price movement one tick ahead. It is not easy, however, to make profits beyond the market spread (i.e., the difference between the market best bid and ask prices) by utilizing only these properties. While the real price time series slightly deviates from a pure random walk, it is not obvious whether these characters provide easy opportunities to make profits statistically. Making profits requires us to predict price movements beyond the market spread, which is out of the scope of this paper but is an interesting topic for a future study.
H. Possible generalization 1: multiple-tick trend-following random walks and the PUCK model
In this manuscript, we have addressed the trend-following HFT model with one-tick memory. It is straightforward to generalize the one-tick memory model toward a multiple-tick memory model, in which the trend is measured by $\Delta\hat p_{\rm EMA}[T]$, the exponential moving average of the price movements $\{\Delta\hat p[T]\}_T$ with decay time $\tau_{\rm EMA}$ and renormalization constant $Z_{\rm EMA} \equiv 1/(1 - e^{-1/\tau_{\rm EMA}})$. In the authors' view, this model is more realistic because such an exponential moving average is a popular strategy among HFTs according to a detailed regression analysis for trend-following [57]. We then obtain a generalization of the financial Langevin equation, Eq. (64). The generalized financial Langevin equation (64) is equivalent to the potentials of unbalanced complex kinetics (PUCK) model [29], which was previously introduced by time-series data analyses. Here we use an identity for the exponential moving averages $\Delta\hat p_{\rm EMA}[T]$ and $\hat p_{\rm EMA}[T]$, which leads to the PUCK model under a random potential $U(p) = -c\, e^{-1/\tau_{\rm EMA}} \Delta p^* Z_{\rm EMA}\, \hat\tau[T-1] \log\cosh\!\left(e^{1/\tau_{\rm EMA}} p / (\Delta p^* Z_{\rm EMA})\right)$. In this sense, our theory is straightforwardly applicable to a derivation of the PUCK model.
I. Possible generalization 2: reduction to the random multiplicative processes
In Sec. VI C, we assumed $\Delta p^* \ll 1$ both for analytical simplicity and for consistency with the empirical report [46]. Here we discuss the case with $\Delta p^* \gg 1$, whereby the hyperbolic trend-following reduces to the linear trend-following as $c\tanh(\Delta p/\Delta p^*) \approx c\,\Delta p/\Delta p^*$. The financial Langevin equation (44) is then linearized accordingly, giving Eq. (68) [46]. Since Eq. (68) belongs to the random multiplicative processes [58], the price movement obeys power-law statistics, consistent with the previous exact solution [36] for the two-body case $N = 2$.
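The quality of the linear approximation c tanh(∆p/∆p*) ≈ c∆p/∆p* can be checked numerically. The sketch below is illustrative only; the function names and the numerical values of c, ∆p, and ∆p* are hypothetical choices, not parameters taken from this paper.

```python
import math


def hyperbolic_trend(dp, c, dp_star):
    """Hyperbolic trend-following term c * tanh(dp / dp_star)."""
    return c * math.tanh(dp / dp_star)


def linear_trend(dp, c, dp_star):
    """Linearized trend-following term c * dp / dp_star."""
    return c * dp / dp_star


def relative_error(dp, c, dp_star):
    """Relative deviation of the linear form from the hyperbolic form."""
    exact = hyperbolic_trend(dp, c, dp_star)
    return abs(linear_trend(dp, c, dp_star) - exact) / abs(exact)


if __name__ == "__main__":
    c = 1.0
    for ratio in (0.01, 0.1, 1.0, 2.0):
        # ratio = dp / dp_star; small ratios correspond to a large dp_star.
        print(ratio, relative_error(ratio, c, 1.0))
```

Since tanh(x) = x − x³/3 + O(x⁵), the relative error is about (∆p/∆p*)²/3 for small arguments, so the linear form is accurate only when typical price movements are much smaller than ∆p*.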

VIII. CONCLUSION
In this paper, we have presented a systematic solution of the trend-following trader model, which was empirically introduced in our previous work [46]. Starting from the microscopic dynamics of the individual traders, we have systematically reduced the multi-agent dynamics by generalizing the mathematical methods developed in molecular kinetic theory. We first introduced the phase space for our model and derived the dynamical equation for the phase-space distribution function, which corresponds to the Liouville equation in conventional analytical mechanics. On the basis of the Liouville equation for the trend-following trader model, we derived a hierarchy of equations for the reduced distributions in parallel with the BBGKY hierarchy. By introducing the mean-field approximation, corresponding to the assumption of molecular chaos, we derived the mean-field dynamical equation for the one-body distribution function, similarly to the Boltzmann equation. We then derived the analytic solution of the mean-field model, whose validity was numerically examined for a sufficiently large number of traders. We also derived the financial Langevin equation, which governs the macroscopic dynamics of the financial Brownian motion, and studied the macroscopic properties of the market price movements.
Here we have clarified the power of the kinetic framework in describing financial markets from microscopic dynamics. We conjecture that this success rests on the fact that financial markets approximately satisfy the key assumptions of binary interaction and molecular chaos (see Secs. III B and VII B for related discussions): the one-to-one transaction (i.e., the binary interaction) is the most basic interaction, and traders are unlikely to transact repeatedly with the same counterparty for $N \gg 1$. We believe that the financial market is one of the best subjects for applying kinetic theory, besides traffic flow and wealth distribution [7-9,12]. We also believe that the generalization of kinetic theories will be a key to clarifying various social systems from microscopic dynamics, since various microscopic data are accessible these days.

Appendix A: Detailed Derivation of Financial Liouville Equation
We here derive the financial Liouville equation for the trend-following trader model. The dynamics of our model is given by
where we have introduced the colored Gaussian noise $\hat\eta^R_{i;\varepsilon}$ satisfying $\langle\hat\eta^R_{i;\varepsilon}\rangle = 0$ and $\langle\hat\eta^R_{i;\varepsilon}(t)\hat\eta^R_{i;\varepsilon}(s)\rangle = e^{-|t-s|/\varepsilon}/(2\varepsilon)$. For mathematical convenience below, we finally take the white-noise limit $\varepsilon \to +0$: $\lim_{\varepsilon\to+0}\hat\eta^R_{i;\varepsilon} = \hat\eta^R_i$. We next consider the dynamics of the center of mass $\hat z_{\rm CM}$:
Let us next consider the dynamics of an arbitrary function $f(\hat\Gamma)$ for $\hat\Gamma \equiv (\hat z_1, \dots, \hat z_N; \hat z_{\rm CM}, \hat p, \Delta\hat p) \in S$. The time evolution of $f(\hat\Gamma)$ is governed by the continuous movement due to the continuous noise term $\hat\eta^R_{i;\varepsilon}$ and the discontinuous jumps due to the deterministic transaction term $\hat\eta^T_i$. We then obtain
where we have introduced the difference vector $\Delta\hat\Gamma_{ij}$ induced by transactions, defined by
with $\hat p^{\rm pst}_{ij} \equiv \hat z_i - (L_i/2)\,{\rm sgn}(\hat z_i - \hat z_j)$ and $\Delta\hat p^{\rm pst}_{ij} \equiv \hat p^{\rm pst}_{ij} - \hat p$. Let us decompose the sum of δ-functions here as
where we have used $\hat\eta^R_{i;\varepsilon} - \hat\eta^R_{j;\varepsilon} > 0$ just before $\hat r_i - \hat r_j - (L_i + L_j)/2 = 0$ (or equivalently, $\hat\eta^R_{i;\varepsilon} - \hat\eta^R_{j;\varepsilon} < 0$ just before $\hat r_i - \hat r_j + (L_i + L_j)/2 = 0$) by taking collision directions into account.
We then take the ensemble average of both sides of Eq. (A3) with the aid of Novikov's theorem [59] for an arbitrary functional of the colored Gaussian noise $\hat\eta^R_{i;\varepsilon}$. Here we remark the following two important relations for the δ-function
with the dummy variable
By substituting $f(\hat\Gamma) = \delta(\hat\Gamma - \Gamma)$, we take the ensemble average of both sides of Eq. (A3) in the $\varepsilon \to 0$ limit. We then obtain
with the abbreviation $\partial_{ij} \equiv \partial_i - \partial_j$. Here, let us pay attention to the signs of the derivatives. Considering $P(\Gamma) \ge 0$ for all $\Gamma$ and $P(\Gamma) = 0$ for $z_i - z_j > (L_i + L_j)/2$, we obtain the signs of the derivatives
Equation (A10) can be simplified into Eq. (24) in terms of these signs by introducing the symmetric absolute derivative
Note that Eq. (24) is a partial integro-differential equation because of the transaction jumps, whereas the conventional Liouville equation is a partial differential equation. This implies that our financial Liouville equation (24) technically corresponds to the pseudo-Liouville equation [14,48-50] rather than the Liouville equation.
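The colored Gaussian noise used in this appendix, with zero mean and autocorrelation e^{-|t−s|/ε}/(2ε), can be realized as a stationary Ornstein-Uhlenbeck process. The following sketch is an illustration under that assumption and is not part of the derivation; the function names, the parameter values, and the use of the exact one-step update are choices made here.

```python
import math
import random


def colored_noise_path(eps, dt, n_steps, rng):
    """Stationary Ornstein-Uhlenbeck path with <eta(t) eta(s)> = exp(-|t-s|/eps) / (2 eps)."""
    decay = math.exp(-dt / eps)          # correlation between successive samples
    variance = 1.0 / (2.0 * eps)         # stationary variance
    kick = math.sqrt(variance * (1.0 - decay * decay))
    eta = rng.gauss(0.0, math.sqrt(variance))   # start in the stationary state
    path = [eta]
    for _ in range(n_steps):
        # Exact one-step update of the Ornstein-Uhlenbeck process.
        eta = decay * eta + kick * rng.gauss(0.0, 1.0)
        path.append(eta)
    return path


def sample_autocovariance(path, lag):
    """Sample autocovariance at the given lag, using the known zero mean."""
    n = len(path) - lag
    return sum(path[k] * path[k + lag] for k in range(n)) / n


if __name__ == "__main__":
    rng = random.Random(1)
    path = colored_noise_path(eps=0.5, dt=0.05, n_steps=50000, rng=rng)
    print("variance estimate:", sample_autocovariance(path, 0))
    print("target variance:  ", 1.0 / (2.0 * 0.5))
```

The autocorrelation function integrates to unity for any ε, which is why the ε → +0 limit yields white noise of unit intensity.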

Appendix B: Detailed Derivation of Financial BBGKY Hierarchy
We here derive the lowest BBGKY hierarchal equation for the reduced distribution function (28), starting from the financial Liouville equation (24). We first introduce the relative price from the CM as $r_i \equiv z_i - z_{\rm CM}$. By making the transformation $\Gamma = (z_1, \dots, z_N; z_{\rm CM}, p, \Delta p) \to \Gamma_r \equiv (r_1, \dots, r_N; z_{\rm CM}, p, \Delta p)$, the financial Liouville equation can be rewritten as
where we have used the chain rule for the variable transformation:
We have also introduced $\Delta\Gamma_{ij;r} = \Delta\Gamma$
There is a small deviation from the LO solution around the boundary layer R2 because of the finite-number effect for $N$. The deviation is studied within the NLO approximation for the financial Boltzmann equation (35).
The financial Boltzmann equation (35) is then approximated for $r > +L/2$ as
which is consistent with the LO solution (36) for $N \to \infty$: $\lim_{N\to\infty}\phi_L(r) = \psi_L(r)$. We have obtained the NLO solution (E4) rather intuitively, but we can check that the solution satisfies the original Boltzmann equation (35) up to the order of $N^{-1/2}$ by direct substitution. Around $r \sim L/2$, indeed, we obtain
where we have ignored the inflow $J_{LL}(r + L/2) \propto \phi_L(r + L/2) = O(\exp(-N L^2/4L^{*2}_\rho))$ around $r \sim L/2$. This implies that the solution (E4) satisfies the financial Boltzmann equation (35) directly. We also note that the NLO correction is of the order of $N^{-1/2}$ and is consistent with the assumptions in Appendix C, where correction terms of $O(N^{-1})$ are ignored in the derivation of Eq. (C8).

Appendix F: Numerical simulation of the microscopic model
Here we explain the numerical implementation of the trend-following HFT model. We focused on two types of buy-sell spread distributions, given by the δ-distributed spread (38) and the γ-distributed spread (39). The length and time units of this system are taken to be $L^*$ and $L^{*2}/(\sigma^2 N)$, respectively. We performed the Monte Carlo simulation for various numbers of traders $N$ and trend-following parameters $(c, \Delta p^*)$ under a fixed discretization time $\Delta t = 0.01L^{*2}/(\sigma^2 N)$. For initialization, we first ran the simulation for a time interval of $10L^{*2}/\sigma^2$ and then ran the simulation again to take samples. The simulation time was set to $10^5$ ticks, except for the MSD plots in Fig. 11c,d, Fig. 12d, and Fig. 13c,d, for which it was set to $10^6$ ticks.
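The discretization and initialization settings above fix the number of time steps spent on initialization: dividing the interval 10L*²/σ² by ∆t = 0.01L*²/(σ²N) gives 1000N steps. The sketch below only reproduces this arithmetic; the function name `simulation_schedule` and the default values L* = σ = 1 are assumptions made for illustration and are not taken from the simulation code of this paper.

```python
def simulation_schedule(n_traders, l_star=1.0, sigma=1.0):
    """Return (dt, init_time, init_steps) for the stated discretization.

    dt        = 0.01 * L*^2 / (sigma^2 * N)   (discretization time)
    init_time = 10 * L*^2 / sigma^2           (initialization interval)
    """
    dt = 0.01 * l_star ** 2 / (sigma ** 2 * n_traders)
    init_time = 10.0 * l_star ** 2 / sigma ** 2
    # Number of discrete steps in the initialization run; equals 1000 * N.
    init_steps = round(init_time / dt)
    return dt, init_time, init_steps


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, simulation_schedule(n))
```

The step count grows linearly with N because the time unit itself shrinks as 1/N, so larger systems need proportionally more steps to cover the same initialization interval.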