Interpretable Conservation Law Estimation by Deriving the Symmetries of Dynamics from Trained Deep Neural Networks

As deep neural networks (DNNs) have the ability to model the distribution of a dataset as a low-dimensional manifold, we propose a method to extract the coordinate transformation that leaves a dataset distribution invariant by sampling trained DNNs with the replica-exchange Monte Carlo method. In addition, we derive the relation between the canonical transformation that leaves the Hamiltonian invariant (a necessary condition of Noether's theorem) and the symmetry of the manifold structure of the time-series data of a dynamical system. By combining these results, we propose a method to estimate interpretable conservation laws from time-series data. Furthermore, we verify the efficiency of the proposed methods in primitive cases and in large-scale collective motion in a metastable state.


I. INTRODUCTION
Various studies seek to understand (and make predictions about) complicated, large-scale dynamical systems. Such studies might, for example, treat the metastable state of a nonequilibrium large-degree-of-freedom system by modeling it as a low-dimensional canonical dynamical system. One research stream aims to reduce collective motion systems that feature a large degree of freedom (e.g., plasma or acoustic wave systems) [1][2][3][4]. To develop reduced models, researchers have introduced collective coordinates, such as the Fourier basis of the density distribution or the charge distribution, and have derived the Hamiltonian that describes the coarse-grained properties of the dynamical system. It is also well known that the dynamics of the vortex structure of a turbulent system can be modeled as a Hamiltonian system in a vortex feature space [5]. Thus, to develop a reduced model, we need to introduce collective coordinates and derive the Hamiltonian in that space, so that we may describe the properties of the dynamical system. On the other hand, this approach relies heavily on the intuition of physicists; it would be difficult to model a dynamical system with a more complex structure, such as the collective motion of living things like fish or birds. Quite frequently, such systems have stable but very complex patterns in a metastable state [6, 7].
In recent years, several machine-learning methods have been developed to estimate the Hamiltonian from dynamical data. Schmidt et al. [8] estimated the Hamiltonian by regressing the data on a linear sum of multiple basis functions. It is difficult to apply this method to the estimation of a reduced model that has complex, unknown basis functions. More flexible Hamiltonian estimation has been realized using the deep learning techniques developed in recent years [9][10][11].
On the other hand, it is difficult to interpret an estimated Hamiltonian derived through deep neural networks (DNNs), as a DNN is a function with an enormous number of degrees of freedom. Thus, no existing method can construct an interpretable reduced model of complex phenomena. Additionally, such machine-learning approaches find a Hamiltonian whose properties hold only on the given data. Historically, physicists have achieved great success in constructing reduced models by abstracting knowledge obtained from observational data and building universal models that can explain various physical phenomena, not just the target data. For example, thermodynamics, a reduced model that describes the molecular motion of a gas, was linked to chemical reaction theory by Gibbs [12, 13], who was very successful in this regard. It is difficult to estimate such a reduced model with existing machine-learning methods; it is more effective for a physicist to construct the reduced model. For this purpose, it is necessary to extract from the data interpretable physical information that is useful for constructing a reduced model. The purpose of this study is to develop a machine-learning method that provides such interpretable information and can therefore assist physicists who are looking to build reduced models.
Several studies [14][15][16][17][18][19] suggest that DNNs can model the distribution of datasets as manifolds and embed those manifolds in a low-dimensional Euclidean space. From this perspective, the mapping function of a DNN is considered a representation of data manifolds. Studies that apply DNNs to physics data employ time-series data from the phase space (comprising position and momentum) [20][21][22][23][24] or spin-system data from the configuration space [25][26][27][28][29][30][31][32][33]. In such datasets, the manifold structure (which implies that the system has a small degree of freedom) arises from certain physical constraints, such as a conservation law. In other words, a manifold structure modeled by a DNN can represent the conservation laws or order of the system. In addition, in physics, Noether's theorem [34] connects the symmetry of the Hamiltonian with conservation laws. To estimate a conservation law, we need only the tangent space of the manifold of the continuous transformation group that corresponds to the system's symmetry. Thus, unlike direct Hamiltonian estimation, symmetry estimation demands that one model the manifold with, at most, first-order accuracy. The present study derives the relation between the symmetry of a Hamiltonian system and the dataset distribution of the time-series data of the dynamical system. For this purpose, we develop a method to estimate the symmetry of a data manifold modeled by a deep autoencoder [35] and thereby determine the conservation laws of the system. Furthermore, we apply the proposed method to four datasets that correspond to O(2), SO(2), and T(1) symmetries. The datasets with T(1) and SO(2) symmetries correspond to the time-series data of constant-velocity linear motion and of a central-force-potential dynamical system, respectively.
Another SO(2) system involves a case of large-scale collective motion described by the Reynolds model [36]. This model produces a torus-like school of fish as a metastable state. As a result, the proposed method correctly estimates the O(2), SO(2), and T(1) symmetries and directly estimates, from the time-series data, the conservation laws of momentum and angular momentum.
Additionally, the proposed method estimates the canonical coordinates and conservation law of a large-scale collective motion system.

II. NOETHER'S THEOREM AND A DATA MANIFOLD OF TIME SERIES DATA
A. Noether's theorem

Noether's theorem connects the continuous symmetries of a Hamiltonian system with conservation laws [34]. We consider Hamiltonian systems in the (2d + 2)-dimensional extended phase space (q, p) = (q_0, q_1, ..., q_d, p_0, p_1, ..., p_d), and let the system's Hamiltonian be H(q, p).
The Hamiltonian representation of Noether's theorem is described as follows [37]. Assume that the Hamiltonian H(q, p) and the canonical equations (equations of motion), ∂H(q, p)/∂q_i = −ṗ_i and ∂H(q, p)/∂p_i = q̇_i, are invariant under the infinitesimal transformation (q'_i, p'_i) = (q_i + δq_ij, p_i + δp_ij), where i = 1, ..., d and j is the index of the direction of the infinitesimal transformation corresponding to each conservation law. Then, based on Noether's theorem, the conserved value G_j satisfies the following equation:

δq_ij = ε ∂G_j(q, p)/∂p_i,  δp_ij = −ε ∂G_j(q, p)/∂q_i.  (1)

The canonical transformation which makes the Hamiltonian system invariant is given as (Q, P) = (Q(q, p, θ), P(q, p, θ)), where θ is an m-dimensional transformation parameter, Q(θ = 0) = q, and P(θ = 0) = p. We call this transformation the invariant transformation in this paper. The set of transformations characterized by the continuous parameters θ forms a Lie group. By Taylor expansion of Q and P around θ = 0, we obtain the infinitesimal transformation (δq_ij, δp_ij) = (ε ∂Q_i(q, p, θ)/∂θ_j |_{θ=0}, ε ∂P_i(q, p, θ)/∂θ_j |_{θ=0}), where ε ≪ 1.
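The Noether condition above can be checked numerically. The sketch below is our illustrative example, not part of the original derivation: it evaluates the Poisson bracket {G, H} by central finite differences for a central-force Hamiltonian and the angular momentum G = x_1 p_2 − x_2 p_1; since dG/dt = {G, H}, a vanishing bracket means G is conserved.

```python
import math

def poisson_bracket(G, H, q, p, h=1e-5):
    # {G, H} = sum_i (dG/dq_i * dH/dp_i - dG/dp_i * dH/dq_i), central differences
    total = 0.0
    for i in range(len(q)):
        qp, qm, pp, pm = list(q), list(q), list(p), list(p)
        qp[i] += h; qm[i] -= h; pp[i] += h; pm[i] -= h
        dGdq = (G(qp, p) - G(qm, p)) / (2 * h)
        dGdp = (G(q, pp) - G(q, pm)) / (2 * h)
        dHdq = (H(qp, p) - H(qm, p)) / (2 * h)
        dHdp = (H(q, pp) - H(q, pm)) / (2 * h)
        total += dGdq * dHdp - dGdp * dHdq
    return total

# Illustrative central-force Hamiltonian and angular momentum G = x1*p2 - x2*p1
H = lambda q, p: 0.5 * (p[0]**2 + p[1]**2) - 1.0 / math.hypot(q[0], q[1])
G = lambda q, p: q[0] * p[1] - q[1] * p[0]
print(poisson_bracket(G, H, [0.8, 0.3], [-0.2, 0.9]))  # ≈ 0: G is conserved
```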

B. Invariance of Hamiltonian and time series datasets
We show the relation between such invariant transformations and the time-series data of the dynamical system in the extended phase space (q, p). Here, we assume that the transformation (Q, P) = (Q(q, p, θ), P(q, p, θ)) has the inverse transformation (q, p) = (q(Q, P, θ'), p(Q, P, θ')). A transformation that does not change the Hamiltonian satisfies the condition (see Appendix A):

∀(q, p), H(Q(q, p, θ), P(q, p, θ)) = H(q, p).  (6)

Eq. (6) implies that invariance of the Hamiltonian under the transformation is equivalent to invariance of the energy surface at each energy level in the extended phase space.
C. Invariance of canonical equations and data set of time series

Next, we consider the relation between the invariance of the canonical equations of motion and the time-series data of the dynamical system. Discretizing the canonical equations of motion with respect to time gives

q_{t+∆t} = f(q_t, p_t),  p_{t+∆t} = g(q_t, p_t),

where q_t and p_t represent the state evolved up to time t. Following the assumption (Q, P) = (Q(q, p, θ), P(q, p, θ)), these equations can be rewritten as

Q_{T+∆T} = Q(q_{t+∆t}, p_{t+∆t}, θ) = F(Q_T, P_T) := Q(f(q(Q_T, P_T, θ'), p(Q_T, P_T, θ')), g(q(Q_T, P_T, θ'), p(Q_T, P_T, θ')), θ),
P_{T+∆T} = P(q_{t+∆t}, p_{t+∆t}, θ) = G(Q_T, P_T) := P(f(q(Q_T, P_T, θ'), p(Q_T, P_T, θ')), g(q(Q_T, P_T, θ'), p(Q_T, P_T, θ')), θ),

where T = Q_0 and ∆T = ∆Q_0. In order for the transformation (Q, P) = (Q(q, p, θ), P(q, p, θ)) to be a canonical transformation, F and G must satisfy the canonical equations generated by the transformed Hamiltonian H'. If H' and H are identically equal, this condition is equivalent to

(F(q_t, p_t), G(q_t, p_t)) ≡ (f(q_t, p_t), g(q_t, p_t)).  (13)

The relation between this condition and the time-series dataset is derived as follows (see Appendix A and Appendix B):

Eq. (13)
⇔ ∀(q_{t+∆t}, p_{t+∆t}), {q_t, p_t | (q_{t+∆t}, p_{t+∆t}) = (f(q_t, p_t), g(q_t, p_t))} = {q_t, p_t | (q_{t+∆t}, p_{t+∆t}) = (F(q_t, p_t), G(q_t, p_t))},
⇔ {q_{t+∆t}, p_{t+∆t}, q_t, p_t | (q_{t+∆t}, p_{t+∆t}) = (f(q_t, p_t), g(q_t, p_t))} = {q_{t+∆t}, p_{t+∆t}, q_t, p_t | (q_{t+∆t}, p_{t+∆t}) = (F(q_t, p_t), G(q_t, p_t))}.

Therefore, the transformation (Q(q, p, θ), P(q, p, θ)) that simultaneously makes the Hamiltonian and the canonical equations invariant satisfies the condition that the set of consecutive state pairs {q_{t+∆t}, p_{t+∆t}, q_t, p_t} is mapped onto itself (Eq. (18)). Thus, the symmetry of the Hamiltonian system is associated with the symmetry of the time-series data distribution. We define the transformation satisfying Eq. (18) as (Q̃(q, p, θ), P̃(q, p, θ)).
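The equivalence between invariance of the equations of motion and invariance of the set of consecutive state pairs can be illustrated numerically. In this sketch (our example; the integrator and Hamiltonian are illustrative choices), a symplectic-Euler step of a central-force system commutes with an SO(2) rotation, so rotating a pair (q_t, p_t, q_{t+∆t}, p_{t+∆t}) yields another valid pair of the same discrete dynamics.

```python
import math

def step(q, p, dt=0.01):
    # symplectic Euler for H = |p|^2/2 - 1/|q| (illustrative central-force system)
    r3 = (q[0]**2 + q[1]**2) ** 1.5
    p1 = (p[0] - dt * q[0] / r3, p[1] - dt * q[1] / r3)
    q1 = (q[0] + dt * p1[0], q[1] + dt * p1[1])
    return q1, p1

def rotate(v, th):
    c, s = math.cos(th), math.sin(th)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

q, p = (1.0, 0.2), (0.1, 0.9)
q1, p1 = step(q, p)
# rotate the pair (q_t, p_t) and step: should equal stepping then rotating,
# i.e. the rotated 4-tuple (Q_t, P_t, Q_{t+dt}, P_{t+dt}) is again a valid pair
th = 0.7
Q1, P1 = step(rotate(q, th), rotate(p, th))
err = max(abs(Q1[0] - rotate(q1, th)[0]), abs(Q1[1] - rotate(q1, th)[1]),
          abs(P1[0] - rotate(p1, th)[0]), abs(P1[1] - rotate(p1, th)[1]))
print(err)  # near machine precision: the pair set is SO(2)-invariant
```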
In the reduced model of collective motion, there is no guarantee that all energy states of the reduced Hamiltonian are realized. Moreover, when constructing a reduced model of a metastable state, no energy state other than that of the metastable state is realized in the first place. Therefore, we relax the condition as follows. We discretize the energy E at infinitesimal intervals and define each discretized energy as E_i. We also define (Q̃_i(q, p, θ_i), P̃_i(q, p, θ_i)) as the transformation that satisfies the invariance condition Eq. (18) restricted to the dataset {q_{t+∆t}, p_{t+∆t}, q_t, p_t} at energy E_i. Because a transformation that satisfies Eq. (6) does not change the energy, the transformations (Q̃(q, p, θ), P̃(q, p, θ)) and (Q̃_i(q, p, θ_i), P̃_i(q, p, θ_i)) are related as (Q̃(q, p, θ), P̃(q, p, θ)) = ∩_i (Q̃_i(q, p, θ_i), P̃_i(q, p, θ_i)). This implies that the invariant transformation for a certain energy E_i, (Q̃_i(q, p, θ_i), P̃_i(q, p, θ_i)), is always a good candidate for an invariant transformation of the whole system.
The time-evolved dataset is obtained by the time evolution t → T of the time-series data at t. If the Hamiltonian is given, we can obtain the time-evolved dataset by evolving the data according to the equations of motion. Even if the Hamiltonian is not given, we can obtain a time-evolved dataset as follows: assuming that we have time-series data at time T, the time evolution of the data from t to T can be approximated by replacing the evolved data with the observed data at time T. On the other hand, the purpose of this paper is to show that the proposed framework for estimating conservation laws is feasible, so we set the time transformation to the identity mapping t → t for simplicity. Thus, a candidate transformation that makes the Hamiltonian and the canonical equations invariant is obtained as a transformation that leaves invariant the subspace of all possible states of the dynamical system at energy E_i.
From observations or from computational simulations, let there be a finite time-series dataset D, which is part of the subspace S. From D, we assume that the subspace S can be approximated by a DNN as a manifold, and that the invariant transformation of S can be estimated from the symmetry of that manifold. This assumption can easily be violated when the number of data samples is not sufficient to reconstruct S. However, the conservation laws obtained under this assumption can easily be verified by confirming whether the conserved value is invariant along the time-series data.
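The verification step mentioned above, checking that a candidate conserved value stays constant along the time series, can be sketched as follows (an illustrative central-force simulation, not the paper's dataset; the symplectic-Euler integrator is our choice):

```python
# Check a candidate conserved value on simulated time-series data:
# evolve a central-force system and confirm G = x1*p2 - x2*p1 stays constant.
def angular_momentum(q, p):
    return q[0] * p[1] - q[1] * p[0]

q, p = (1.0, 0.0), (0.0, 1.1)
G0 = angular_momentum(q, p)
drift = 0.0
for _ in range(2000):
    r3 = (q[0]**2 + q[1]**2) ** 1.5               # symplectic Euler step
    p = (p[0] - 0.01 * q[0] / r3, p[1] - 0.01 * q[1] / r3)
    q = (q[0] + 0.01 * p[0], q[1] + 0.01 * p[1])
    drift = max(drift, abs(angular_momentum(q, p) - G0))
print(drift)  # stays near machine precision for this rotation-symmetric system
```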
In this study, we deal only with classical systems. A similar relation holds between the data manifold and the symmetry of the system in canonical quantum field theory. In canonical quantum field theory, the Hamiltonian is given as a functional of the field φ(x) and the canonical momentum π(x) conjugate to φ(x), where x = (ct, x) is a point in Minkowski space. The infinitesimal transformation is given analogously to the classical case. As with the nested relation of coordinates and time in the classical system, canonical quantum field theory states that a field and its conjugate momentum have a nested Minkowski space.
Therefore, based on the same discussion as in classical systems, the following relation is given as the condition for an invariant transformation of the Hamiltonian system: the dataset {φ(x_0 + ∆x_0, x), π(x_0 + ∆x_0, x), φ(x_0, x), π(x_0, x)} is invariant under the transformation.

III. DNN AND THE DATA MANIFOLD
Except in situations such as chaos, the time-series dataset, i.e., the subspace S, is considered to have a manifold structure, because it follows continuous differential equations. A manifold is a space constructed by continuously pasting together Euclidean spaces called tangent spaces. A familiar approximate example of a manifold is the Earth's surface: we can consider the Earth's surface as a lamination of maps, each of which is a two-dimensional Euclidean space. Some well-trained DNNs have the ability to model the distribution of the training dataset as a manifold. In this paper, we refer to the manifold modeling the data distribution by a DNN as the "data manifold." We explain how a DNN models manifolds using one of the simplest DNN cases: a three-layer DNN whose input is d_in-dimensional, whose hidden layer is d_h (> d_in)-dimensional, and whose output units are f_j = f(Σ_i w_ij x_i + b_j), where f is called the activation function. Usually, the sigmoid function or the ReLU function is used as the activation function. These activation functions are constructed from linear and flat domains. Based on these properties of the activation function, f_j maps the input subspace related to the linear domain of the activation function to a one-dimensional space along the vector (w_0j, w_1j, ..., w_{d_in j}). If there are p_out units f_j sharing the same input subspace, they define a p_out-dimensional sub-hyperplane. The DNN models the data distribution by continuously pasting together these sub-hyperplanes as if they were tangent spaces of a manifold. In other words, the DNN embeds the input space in the output space by pasting the sub-hyperplanes and compressing along the tangent directions of these sub-hyperplanes (Fig. 1). This is only one illustrative example of how a DNN models a data manifold, but much research suggests that similar structures appear in successful DNN models [14][15][16][17][18][19]. In this study, using a trained DNN that models a time-series data manifold, we propose a method to extract information about the symmetry of the dynamical system from the trained DNN. As described later in the discussion section, our proposed framework does not require special DNNs, so we can directly utilize the vast body of research on physical data analysis using DNNs. This is the reason why we select the DNN among the many machine-learning models that can model manifolds.

IV. METHOD
A. Extracting the invariant transformation of a data manifold using the Monte Carlo method

In this subsection, we propose a general method to estimate the symmetry of data manifolds, not limited to physical time-series data. From the discussion in Sec. III, data points that are not on the manifold in the input space are attracted to the manifold (Fig. 1). Once data points are attracted to the manifold in the hidden layer, they continue to exist on the manifold in the output F(x). Based on this DNN property, we propose a method for extracting the symmetry of the data manifold using a deep autoencoder (DAE) [35]. The deep autoencoder is a model that compresses the input space into a low-dimensional hidden layer and uncompresses that layer into an output space of the same dimension as the input space. In the uncompressing process, only the subspace of the input space around the data manifold is recovered, because of the DNN property described above. Based on this property, we can evaluate whether a transformation X(·) keeps the dataset distribution {x_i}^N_{i=1} within the same subspace of the data manifold (Fig. 2). The procedure is as follows. First, we train the DAE using {x_i}^N_{i=1} as the training dataset. Second, we input the transformed dataset {X(x_i)}^N_{i=1} into the trained DAE. Note that the DAE is not trained on the transformed dataset. Third, we evaluate the transformation X(·) using the squared error E_samp between the transformed input dataset and its mapped (reconstructed) distribution.
A smaller E_samp value implies that X(·) is closer to an invariant transformation. Using the criterion E_samp, we can estimate the invariant transformation X(·). In the case of time-series data of dynamics, x_i = (q_i, p_i), and the transformation X(·) is replaced by the continuous transformation (Q(·, ·, θ), P(·, ·, θ)). As we mentioned in Sec. II A, the continuous symmetry treated in Noether's theorem forms a Lie group. Using the smooth parameter set θ = {θ_k}^p_{k=1}, a representation of the Lie group is expressed as the d × d matrix A_ij(θ) = a_ij(θ), where d is the dimension of the data space x_i, θ is the transformation parameter, and A(0) = I. In the following, candidates for the invariant transformation are searched within such Lie group representations.
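The criterion E_samp can be illustrated with a minimal sketch in which a closed-form linear autoencoder (projection onto the data's principal direction) stands in for the trained DAE; the dataset, the stand-in model, and the two test transformations below are our illustrative choices.

```python
import math

# Dataset on the 1-D linear manifold y = 0.5 x; a projection onto the unit
# vector u along the line plays the role of the trained DAE reconstruction.
data = [(t, 0.5 * t) for t in [i / 10 - 1 for i in range(21)]]
u = (2 / math.sqrt(5), 1 / math.sqrt(5))

def dae(x):                                     # reconstruction = projection
    s = x[0] * u[0] + x[1] * u[1]
    return (s * u[0], s * u[1])

def e_samp(transform):
    # mean squared reconstruction error of the transformed dataset
    err = 0.0
    for x in data:
        y = transform(x)
        r = dae(y)
        err += (y[0] - r[0])**2 + (y[1] - r[1])**2
    return err / len(data)

shift_along = lambda x: (x[0] + 0.2 * u[0], x[1] + 0.2 * u[1])   # invariant
shift_perp = lambda x: (x[0] - 0.2 * u[1], x[1] + 0.2 * u[0])    # not invariant
print(e_samp(shift_along), e_samp(shift_perp))  # ≈ 0.0 vs ≈ 0.04
```

A translation along the manifold leaves E_samp at zero, while a perpendicular shift of size 0.2 produces E_samp = 0.2² = 0.04, so the criterion separates invariant from non-invariant transformations.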
For this purpose, we set the linear transformation A = (a_jk) as the candidate invariant transformation. The invariant transformation is obtained by sampling the elements a_jk of the matrix A following the probability distribution P(a_11, a_12, a_21, ..., a_dd) ∝ exp[−(N/2σ²) E_samp(a_11, a_12, a_21, ..., a_dd)], where σ is the standard deviation of the noise. To perform this sampling, we need to specify σ; however, it is difficult to specify σ in advance. In addition, the target distributions in this study are expected to have globally flat local minima, because a flat E_samp surface extends along the invariant transformations. Generally, such target distributions are difficult to sample. Therefore, as a sampling method [38] that can solve these problems, we used the replica-exchange Monte Carlo method.
This method performs efficient sampling by running parallel samplers with different noise intensities σ while exchanging noise intensities with each other. In the large-noise state, we can realize global sampling from the broadened distribution P'(a_11, a_12, a_21, ..., a_dd) ∝ exp[−(N/2σ'²) E_samp(a_11, a_12, a_21, ..., a_dd)], where σ' > σ. By exchanging this sampling information with the low-temperature state, we can realize efficient sampling from the target distribution P(a_11, a_12, a_21, ..., a_dd). The parameters of the sampling method were set to be the same as in previous studies [39, 40], and the target σ was determined by analysis of the sampling results (see Appendix C). A representation of the Lie group is expressed as A_ij(θ) = a_ij(θ). A vector defined by the elements of this transformation matrix is defined as
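A minimal sketch of the replica-exchange idea follows, using a one-parameter toy error surface with two minima in place of E_samp; the energy function, noise intensities, and proposal width are illustrative choices, not the paper's settings.

```python
import math, random
random.seed(1)

# Replica-exchange Metropolis sketch for P(a) ∝ exp(-E(a)/(2σ²)),
# on a toy double-well error surface standing in for E_samp.
E = lambda a: (a * a - 1.0)**2          # minima at a = ±1
sigmas = [0.05, 0.5]                    # target (low) and high noise intensity
betas = [1.0 / (2 * s * s) for s in sigmas]
a = [2.0, 2.0]                          # deliberately poor initial states
target_samples = []
for step in range(20000):
    for k in range(2):                  # Metropolis update inside each replica
        prop = a[k] + random.gauss(0.0, 0.3)
        if math.log(random.random() + 1e-300) < -betas[k] * (E(prop) - E(a[k])):
            a[k] = prop
    if step % 10 == 0:                  # attempt a replica exchange
        d = (betas[0] - betas[1]) * (E(a[0]) - E(a[1]))
        if math.log(random.random() + 1e-300) < d:
            a[0], a[1] = a[1], a[0]
    if step >= 5000 and step % 10 == 0:
        target_samples.append(a[0])
frac = sum(abs(abs(x) - 1.0) < 0.3 for x in target_samples) / len(target_samples)
print(frac)  # most target-replica samples end up on the low-E_samp set
```

The high-noise replica explores globally while the target replica refines locally; exchanges let the target replica escape the poor initial state.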

Algorithm 1 Estimating the invariant transformation set
where d = d 2 .Lie groups correspond to p-dimensional differentiable manifolds and are constructed using the set of A (θ) with different θ.The implicit function representation of this manifold is defined as . What we wish to determine is the infinitesimal transformation, which corresponds to the tangent space of the manifold at position, I is the representation of the unit matrix I in the A (θ) space.We estimate this tangent space from the sampling results obtained in Sec.IV A.
The Jacobi matrix of the implicit functions is J_kl = ∂h_k/∂b_l. Differentiating these equations with respect to b_l around the point Ī yields d' − p simultaneous partial differential equations, Eq. (28). Solving these simultaneous partial differential equations gives the tangent vector of the manifold around Ī, which is the infinitesimal transformation. Once the L samples D_a = {(a_11, a_12, ..., a_dd)_m}^L_{m=1} are obtained with the sampling method explained in Sec. IV A, we can obtain the simultaneous equations Eq. (27) by the following procedure. First, the upper limit p_max of the dimension of the transformation manifold is estimated using principal component analysis and the "elbow" method [41]. Second, we extract the variable set D_b = {b_l}^p_{l=1}, where p (≤ p_max). Using orthogonal distance regression [42], we fit the implicit functions h_k(·; β, I) = 0, where β is the vector of regression coefficients and I is the indicator vector that determines whether each basis is selected. The indicator vector I and the manifold dimension p are determined using a model selection method such as the Bayesian information criterion (BIC) [43]. If p ≤ 2, p can first be determined by visualization. The following likelihood function is set for statistical model selection.
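The PCA-based estimate of the upper limit p_max can be sketched as follows. For an SO(2)-symmetric dataset, the sampled invariant transformations lie on a one-dimensional curve {(cos θ, −sin θ, sin θ, cos θ)} in the four-dimensional (a_11, a_12, a_21, a_22) space; PCA of these samples bounds the manifold dimension from above (synthetic noiseless samples are used here for illustration).

```python
import numpy as np

# Synthetic samples on the SO(2) rotation-matrix curve in (a11, a12, a21, a22)
theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
D_a = np.stack([np.cos(theta), -np.sin(theta),
                np.sin(theta), np.cos(theta)], axis=1)
cov = np.cov(D_a, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]
# "elbow": count the eigenvalues that are not negligibly small
p_max = int(np.sum(eigvals > 1e-8 * eigvals[0]))
print(eigvals.round(3), p_max)  # two non-zero eigenvalues -> p_max = 2 >= p = 1
```

The circle spans a two-dimensional plane in the four-dimensional parameter space, so PCA reports two significant components: an upper bound p_max = 2 on the true manifold dimension p = 1, which the subsequent model selection then refines.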
From the obtained simultaneous equations, the simultaneous differential equations are obtained.
If the Jacobi matrix J_kl is singular, the solution of these simultaneous equations diverges or becomes indefinite. In that case, the variable set D_b = {b_l}^p_{l=1} is re-extracted, and the same procedure is repeated. If the Jacobi matrix J_kl is not singular, we can obtain the infinitesimal transformation, which makes the data manifold invariant.
Step4: Using BIC, select the indicator vector I and the dimension p of the Lie group manifold in Eq. (29).
Step5: Check whether the Jacobi matrix is non-singular. If J_kl is singular, return to Step1 and re-extract D_b.
Step6: Differentiate the obtained simultaneous equations with respect to b_l around the point Ī to obtain Eq. (28).
Step7: Solve the simultaneous equations Eq. (28) to obtain the infinitesimal transformation, δq^l_ij, δp^l_ij.
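Steps 4-7 can be sketched numerically. Assuming the regression has produced implicit equations for the SO(2) manifold (the three constraints below are an illustrative choice of h_k), differentiating them at the identity and taking the null space of the Jacobian yields the infinitesimal transformation.

```python
import numpy as np

# Illustrative fitted implicit equations h_k(A) = 0 for the SO(2) manifold,
# with A flattened as a = (a11, a12, a21, a22).
def h(a):
    a11, a12, a21, a22 = a
    return np.array([a11 - a22, a12 + a21, a11**2 + a21**2 - 1.0])

I = np.array([1.0, 0.0, 0.0, 1.0])          # the identity in the A(θ) space
eps = 1e-6
# 3x4 Jacobian J_kl = dh_k/da_l at the identity, by central differences
J = np.stack([(h(I + eps * np.eye(4)[i]) - h(I - eps * np.eye(4)[i])) / (2 * eps)
              for i in range(4)], axis=1)
_, s, Vt = np.linalg.svd(J)
tangent = Vt[-1]                             # null-space direction (p = 1)
tangent /= tangent[1]                        # normalize so delta_a12 = 1
print(tangent.round(6))  # ≈ (0, 1, -1, 0): the rotation generator
```

The recovered tangent vector (0, 1, −1, 0) is exactly the generator of infinitesimal rotations, i.e., the infinitesimal transformation sought in Step7.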

V. RESULTS
We evaluate the proposed method using four cases: a) a half sphere, b) one-dimensional constant-velocity linear motion, c) a two-dimensional central-force system, and d) a collective motion system.
Case a) has rotational symmetry. In this case, we confirm whether Method 1 can obtain the set of transformations corresponding to the symmetry. Cases b) and c) are systems that conserve momentum and angular momentum, respectively. Using these cases, we verify Method 2. Finally, we apply the proposed methods to case d), a complex collective motion system, and try to estimate the collective coordinates and conservation law.
a) Half sphere

The dataset of case a) was generated following the half-sphere function x_3 = sqrt(r² − x_1² − x_2²), where r = 0.25. The dataset of case a) (shown in Fig. 3(a)) was used to verify the symmetry-extraction ability of the proposed method described in Sec. IV A. The sampling results of a_ij are shown in Fig. 3(b) as black dots. In the figures, the red curves represent the curves fitted by the models selected using BIC.

b) One-dimensional constant velocity linear motion

The dataset of case b) was generated from one-dimensional constant-velocity linear motion. In this case, we show that the proposed method can estimate the momentum conservation law. We set the transformation matrix A(θ) so that only four parameters a_ij had to be sampled. The coordinate space in which transformation invariance is verified is (x(t + ∆t), p(t + ∆t), x(t), p(t)). The sampling results of a_ij are shown in Fig. 4 as black dots. In the figures, the red curves represent the curves fitted by the models selected using BIC. The simultaneous partial differential equations, Eq. (28) with b_l = b, were obtained from the fitting results. From the solution of the simultaneous partial differential equations, we obtained the infinitesimal translation, where we determined p = 1 based on visualization of the distribution of D_a, and the significant digits are one decimal place. By substituting this into Eq. (1) and solving it, the conserved value G_δ was estimated as G_δ = 1.0εp. This result represents the conservation law of momentum p.

c) Two-dimensional central-force system

The dataset of case c) was generated according to a central-force Hamiltonian. We limited the transformation matrix A(θ) to act on the Euclidean space x; the transformation of the momentum space was then represented as p' = A(θ) · p. As a result, there were only four parameters a_ij to be sampled. The coordinate space in which transformation invariance is verified is (x(t + ∆t), p(t + ∆t), x(t), p(t)). Solving the simultaneous partial differential equations, we obtained the infinitesimal transformation

δq = (0, ε, −1.01ε, 0)q ≈ (0, ε, −ε, 0)q, (42)
δp ≈ (0, ε, −ε, 0)p, (44)

where the significant digits of the final formulas are one decimal place. By substituting this into Eq. (1) and solving it, the conserved value G_δ was estimated as G_δ = ε(x_1 p_2 − x_2 p_1). This result represents that angular momentum was conserved.
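As a consistency check (our illustrative computation, not part of the paper's experiments), an infinitesimal rotation of the form estimated above indeed leaves a central-force Hamiltonian invariant to first order in ε:

```python
import math

# Apply the estimated infinitesimal rotation to (q, p) and check that
# H(q + δq, p + δp) ≈ H(q, p) for a central-force Hamiltonian.
H = lambda q, p: 0.5 * (p[0]**2 + p[1]**2) - 1.0 / math.hypot(q[0], q[1])
eps = 1e-6
q, p = (0.7, -0.4), (0.3, 0.8)
dq = (eps * q[1], -eps * q[0])   # δq = (0, ε; -ε, 0) q  (estimated generator)
dp = (eps * p[1], -eps * p[0])   # δp with the same generator
dH = H((q[0] + dq[0], q[1] + dq[1]), (p[0] + dp[0], p[1] + dp[1])) - H(q, p)
print(abs(dH))  # O(ε²): H is invariant to first order in ε
```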

d) Collective motion system
In this case, we apply our framework to an N-body collective motion system called the Reynolds boid model [36]:

S_c = {j | |q_j − q_i| < r_c, j ≠ i, j ∈ N}, n_c = n(S_c),
S_a = {j | |q_j − q_i| < r_a, j ≠ i, j ∈ N}, n_a = n(S_a),
S_s = {j | |q_j − q_i| < r_s, j ≠ i, j ∈ N}, n_s = n(S_s),

where i is the index of each boid. By tuning the parameters W_c, W_a, W_s, r_c, r_a, and r_s, the Reynolds model can simulate the collective motion of a group of organisms such as birds or fish [36, 44]. In this study, we focused on a parameter set that simulates torus-type behavior like that of a fish school in the sea.
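A minimal sketch of one update step of a Reynolds-type boid model follows; the weights W_c, W_a, W_s and radii r_c, r_a, r_s reuse the notation above, but their values and the exact rule implementations are common textbook variants chosen for illustration, not the paper's parameters.

```python
import math, random
random.seed(0)

# Illustrative weights and interaction radii (arbitrary sketch values)
W_c, W_a, W_s = 0.01, 0.05, 0.1
r_c, r_a, r_s = 1.0, 0.5, 0.2

def boid_step(qs, ps, dt=0.1):
    """One update of positions qs and velocities ps for all boids."""
    new_qs, new_ps = [], []
    for i, (qi, pi) in enumerate(zip(qs, ps)):
        force = [0.0, 0.0]
        for rule, radius, weight in (("c", r_c, W_c), ("a", r_a, W_a), ("s", r_s, W_s)):
            nbrs = [j for j in range(len(qs)) if j != i
                    and math.dist(qs[j], qi) < radius]
            if not nbrs:
                continue
            for k in range(2):
                if rule == "c":    # cohesion: steer toward neighbors' mean position
                    force[k] += weight * (sum(qs[j][k] for j in nbrs) / len(nbrs) - qi[k])
                elif rule == "a":  # alignment: match neighbors' mean velocity
                    force[k] += weight * (sum(ps[j][k] for j in nbrs) / len(nbrs) - pi[k])
                else:              # separation: steer away from close neighbors
                    force[k] += weight * sum(qi[k] - qs[j][k] for j in nbrs)
        new_ps.append((pi[0] + dt * force[0], pi[1] + dt * force[1]))
        new_qs.append((qi[0] + dt * new_ps[-1][0], qi[1] + dt * new_ps[-1][1]))
    return new_qs, new_ps

qs = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(20)]
ps = [(random.uniform(-0.1, 0.1), random.uniform(-0.1, 0.1)) for _ in range(20)]
qs, ps = boid_step(qs, ps)
```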
To estimate the conservation law of the collective motion, we need to set a candidate collective coordinate. We set the candidate collective coordinate based on the following considerations. First, from the visual symmetry of the motion, the average position of all particles, and the time, are set as the origin of the coordinate system. Second, since all individuals behave in the same way regardless of their individuality, the individual degrees of freedom are considered to degenerate.
From these considerations, we prepared the dataset as D = {q(t)_i, q(t + δt)_i, p(t)_i, p(t + δt)_i}^{T,N}_{i=1} := {q(t)_ij, q(t + δt)_ij, p(t)_ij, p(t + δt)_ij}_{⟨i,j⟩}, where ⟨i, j⟩ represents all combinations of the N particle indices i and the T time indices j. Then, we set the transformation matrix A(θ) to act on this collective coordinate space. The sampling results of a_ij are shown in Fig. 6(b) as black dots. In the figures, the red curves represent the curves fitted by the models selected using BIC. The fitting results of the selected models are obtained as follows (red curves of Fig. 6(b)), where the significant digits of the final formulas are one decimal place. By substituting this into Eq. (1) and solving it, the conserved value G_δ was estimated as G_δ = ε(x_1 p_2 − x_2 p_1). This result represents that angular momentum was conserved.
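The collective-coordinate dataset construction described above can be sketched as follows: positions are measured from the swarm's average position at each time, and particle and time indices are pooled into a single index ⟨i, j⟩. The helper name `collective_dataset` is ours, not from the paper.

```python
def collective_dataset(traj_q, traj_p):
    """traj_q[t][i], traj_p[t][i]: 2-D position/momentum of particle i at time t.
    Returns pooled samples (q_t, q_{t+1}, p_t, p_{t+1}) in swarm-centered coordinates."""
    data = []
    T, N = len(traj_q), len(traj_q[0])
    for t in range(T - 1):
        # swarm's mean position at times t and t+1 (the coordinate origin)
        m0 = [sum(q[k] for q in traj_q[t]) / N for k in (0, 1)]
        m1 = [sum(q[k] for q in traj_q[t + 1]) / N for k in (0, 1)]
        for i in range(N):
            q0 = tuple(traj_q[t][i][k] - m0[k] for k in (0, 1))
            q1 = tuple(traj_q[t + 1][i][k] - m1[k] for k in (0, 1))
            data.append((q0, q1, traj_p[t][i], traj_p[t + 1][i]))
    return data

# toy trajectory: 3 time steps, 2 particles drifting together
traj_q = [[(0.0, 0.0), (1.0, 0.0)], [(0.1, 0.0), (1.1, 0.0)], [(0.2, 0.0), (1.2, 0.0)]]
traj_p = [[(0.1, 0.0), (0.1, 0.0)]] * 3
D = collective_dataset(traj_q, traj_p)
print(len(D))  # (T-1) * N = 4 pooled samples
```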

VI. SUMMARY AND DISCUSSION
From the results of case a), we confirm that Method 1 could be used to extract the symmetry.
The results of cases b) and c), wherein the expected conservation laws were estimated, show that Method 2 is effective. Comparing cases a) and c), we see differences in the selected polynomial models in the a_11-a_22 and a_21-a_12 spaces. These differences should indicate that there is mirror symmetry in case a). This finding supports the assertion that the method works well in extracting system symmetry. In the more practical collective motion system (case d, Fig. 6), we estimated the angular momentum conservation law; this result is consistent with a previous study [44] suggesting that angular momentum is conserved in torus-type swarming patterns. Additionally, the finding of a conservation law in the collective coordinates, where the individual degrees of freedom are degenerated and the origin of coordinates is the average position of the swarm, suggests that the large-degree-of-freedom dynamical system can be reduced to a central-force dynamical system.
The present study deals only with the case of a single conservation law. If multiple conservation laws are at work, the manifold S also has multiple dimensions, in line with the number of conservation laws. In such a case, Eq. (28) yields multiple orthogonal solutions. Theoretically, the proposed method can handle this problem, but the number of regression polynomial combinations (Eq. (29)) increases exponentially. Therefore, it is necessary to develop a more efficient means of estimating the infinitesimal transformation. To estimate the infinitesimal transformation, one need only estimate the tangent space around the identity element. Because only a finite sample is available, in the proposed method the manifold formed by the Lie group was regressed over the entire space. It is expected that direct estimation of the tangent space can be achieved by using orthogonal basis decomposition and by imposing various constraints.
In the present study, we used a DAE to model the time-series data manifolds; nonetheless, there is no need to use a DAE. The only requirement for the machine-learning model is that it have a mapping function that can determine whether a point is on the manifold or outside it.
From this perspective, the DAE can be replaced with other DNN models of the same type, such as the variational autoencoder [45] or generative adversarial networks [46]. Additionally, the feed-forward DNN, which is widely used in DNN research, can serve as the module of our proposed method by additionally training a neural network that reconstructs the input data from the output layer of the feed-forward network. The same method should also be feasible with machine-learning models that have mapping functions embedding data manifolds into an output space (e.g., the kernel method). Thus, by leveraging such machine-learning models, the proposed framework could potentially extract explicit physical knowledge from the vast existing research findings on physical data analysis.
In this study, we showed that explicit conservation laws can be estimated from the time-series data of a dynamical system. Based on these results, it is expected that the implicit knowledge of physical data obtained by previous studies using DNNs and the explicit knowledge of physicists might be merged, and that research on reduced model construction might be accelerated as a result.

Input: Data set {x_i}^N_{i=1}.
Output: Invariant transformation set D_a = {a^l_1, a^l_2, ..., a^l_{d'}}^L_{l=1}.
Step1: Train the deep autoencoder on the dataset {x_i}^N_{i=1}.
Step2: Using the trained DAE, sample the transformation parameters a_11, a_12, a_21, ..., a_dd following the probability P(a_11, a_12, a_21, ..., a_dd) ∝ exp[−(N/2σ²) E_samp(a_11, a_12, a_21, ..., a_dd)].
Step3: Determine σ based on the distribution structure of the sampling results.

B. Estimating the infinitesimal transformation of symmetry from the sampling results

Finally, from the L sampling results D_a = {(a_11, a_12, ..., a_1d, a_21, ..., a_dd)_l}^L_{l=1} obtained in Sec. IV A, we propose a method for estimating the infinitesimal transformation, which represents the invariance of the Hamiltonian and the equations of motion.

The Jacobi matrix is J_kl = ∂h_k/∂b_l. If the Jacobi matrix at A = I is non-singular, then based on the implicit function theorem, the variables other than (b_1, b_2, ..., b_p), {c_k}^{d'−p}_{k=1} ⊂ {A \ {b_l}^p_{l=1}}, can be expressed as c_k = g_k(b_1, ..., b_p). This implies that the equations representing the manifold of the Lie group around Ī can be decomposed into the following d' − p simultaneous equations: h_k = c_k − g_k(b_1, ..., b_p) = 0. In the likelihood, each term represents the minimum distance from the point b_m to the geometric feature f(c_k, b_1, b_2, ..., b_p; β, I, p) = 0. The normalization constant Z is estimated numerically by piecewise integration.