Multi-body wave function of ground and low-lying excited states using unornamented deep neural networks

We propose a method to calculate wave functions and energies not only of the ground state but also of low-lying excited states using a deep neural network and the unsupervised machine learning technique. For systems composed of identical particles, a simple method to perform symmetrization for bosonic systems and antisymmetrization for fermionic systems is also proposed.

Among the above, DFT and QMC are classified into methods based on the variational principle.The variational principle [41] guarantees that the ground-state energy E gs of a Hamiltonian H satisfies where all the possible functions are considered in the infimum.The minimizer corresponds to the ground-state wave function.Indeed, it is scarcely possible to consider all the possible functions throughout minimizing the energy expectation value; hence, in practice, the calculation accuracy of a method based on the variational principle depends on the ansatz of a trial wave function.In other words, the size of the space of trial wave functions, in principle, determines the calculation accuracy.
For instance, a trial wave function of DFT is a Slater determinant, which is the simplest antisymmetric trial wave function.Owing to the simpleness of the ansatz, the numerical cost is drastically reduced, while it is known that interparticle correlation is partially missing [42].
In the QMC calculation, a Jastrow-type trial wave function [43] is often used.A Jastrow-type wave function |Ψ⟩ consists of a single-(or sometimes multi-) Slater determinant |Φ 0 ⟩ and a correlation factor F , |Ψ⟩ = F |Φ 0 ⟩.With assuming F as a symmetric function, |Ψ⟩ satisfies antisymmetry.Owing to the introduction of the factor F , interparticle correlations are described better than a single Slater determinant.Nevertheless, most QMC calculations optimize mainly F , while |Φ 0 ⟩ is optimized only around an initial ansatz [44].In addition, an ansatz is introduced even for F ; hence, accuracy also depends on the ansatz.Recently, based on the QMC calculation, a deep neural network (DNN) has been used for the ansatz of a trial wave function [45][46][47].Since deep neural networks span much wider space, calculation accuracy is much improved, which is guaranteed by the universal approximation theorem [48,49].Nevertheless, once the ansatz of a trial wave function is introduced, the systematic improvement of the calculation is difficult even with a DNN since the space trial wave functions span is already limited.
Another problem of variational-principle-based methods is calculation of excitation spectra.Excitation spectra are important quantities of molecules and atomic nuclei, while the variational principle [Eq.(1)] obtains the ground state only.Hence, another technique is needed to calculate excited states based on the variational principle.Indeed, a method to calculate low-lying excited states using the DMC calculation was proposed [50] by considering the orthogonality of wave functions, while its application has been still limited [51], whereas calculation of excited states on top of the DFT ground state has been widely performed by using the random phase approximation or some other techniques [52][53][54][55][56], while there is still room to be improved.Recently, low-lying excited states were also obtained in Ref. [57] using the combination of the DNN, QMC, and the orthogonal condition.
In this paper, we propose a new method to calculate energies and wave functions of the ground state and low-lying excited states based on the variational principle.The ground-state wave function is assumed to be a DNN, which, in principle, is able to represent any function [48,49].Using an essence of the machine learning technique-the minimization of the loss function-, they are directly optimized by using the machine learning technique.In the machine learning technique, a network structure is introduced and the parameters of the network are optimized with minimizing the loss function.This process is often called training.In many works, these parameters are trained by using the given data called learning data, whose size is often large.On the contrary, the machine leaning trained without learning data, called unsupervised machine learning, has also been used in many different fields.References [58,59] also assumed the ground-state wave function as a deep neural network that were optimized by using the machine learning technique: Ref. [58] performed calculation of fewbody bosonic systems and Ref. [59] performed calculation of the simplest realistic system-a deuteron.Although these papers are pioneering works of unsupervised machine learnings for quantum many-body problems, the fermion antisymmetrization was not considered and excited states were not studied, while both ground and excited states of many-fermionic systems are interesting in general.Recently, Ref. [60] proposed a method to obtain the ground state of the many-body Schrödinger equation for fermionic systems by using a tensor neural network, while its implementation is involved and the antisymmetrization is, indeed, not perfectly guaranteed.
In this paper, on top of the method in Refs.[58,59] a simple method of the antisymmetrization for manyfermion systems or the symmetrization for many-boson systems is introduced.Then, low-lying excited states are sequentially calculated by using the orthogonality conditions and the variational principles.In this method, there is no need to discover a DNN architecture to generate (anti)symmetric wave functions.The (anti)symmetrization is put at the level of loss functions.Furthermore, the symmetrization and the antisymmetrization are implemented in almost the same way and the wave function perfectly satisfies (anti)symmetry.Thanks to the simplicity of the implementation, the numerical cost is quite small.We show that our method works successfully for popular examples in bosonic and fermionic quantum mechanical systems, providing a fundamental basis of the DNN method for quantum mechan-ics.
This paper is organized as follows: Section II is devoted to calculation of the ground-state.The novel (anti)symmetrization is introduced.Section III is devoted to the calculation of low-lying excited states.All the calculations are performed in a MacBook Pro with the Apple M1 chip (MacBook Pro (13-inch, M1, 2020): MacBookPro17,1) and 16 GB memory.Section IV gives a summary of this paper.

II. GROUND-STATE CALCULATION
In this section, the ground-state wave function and energy are calculated by using a DNN and the machine learning technique.Throughout the paper, a machine learning software named Tensorflow [61] is used.

A. Network structure and machine learning technique
In general, a wave function of a d-dimensional Nparticle system is a function of the spatial coordinates of all the particles r j = (r j1 , r j2 , . . ., r jd ) (j = 1, 2, . . ., N ).
In this work, the wave function is represented by a deep neural network with N d-input units that corresponds to the spatial coordinate R and one-output unit that corresponds to the value of the wave function ψ (R).Between the input and output layers, there are hidden layers.Each unit is connected to all the units just one before or after layers.The schematic figure of the deep neural network is shown in Fig. 1.In this paper, the "softplus" function is used for an activation function and the Adam optimizer [62] is used for the optimization process.
As the normal procedure of the numerical calculation of the wave function, the spatial coordinate is discretized.Each point is treated as a batch of the machine learning.In other words, if the spatial coordinate of each direction is discretized with M meshes, the batch size is the same as the number of meshes, M dN .The mini-batch technique is not used.
Once the spatial coordinates are discretized, the Hamiltonian (3) can be written as a matrix, where m is the mass of the particles, V ext is the external potential, and V int is the interparticle interaction.The matrices of the external potential and the interaction are diagonal and that of the kinetic energy is sparse.Hence, the expectation value of the Hamiltonian ⟨H⟩ can be calculated by using the sparse-matrix technique.The ground-state wave function minimizes ⟨H⟩; therefore, ⟨H⟩ is regarded as a loss function.Note that all the calculations are performed with double precision floating point numbers (float64).For simplicity, m = ℏ = 1 is assumed.
The procedure in the Tensorflow code is as follows: 1. Construct a model of the deep neural network; 2. Note that although a Tensorflow subroutine specification for the loss function technically requires two inputs-the training data (true_value) and the network output (predicts), the former is not referred in our training; 3. Fit the model (model.fit)where the initial value of predicts consists of positive random numbers; 4. The final wave function output_wf is given by using model.predict; 5. The ground-state energy is calculated using the wave function obtained by the last step.
The third step (model.fit)corresponds to determining the parameters inside the DNN; the fourth step (model.predict)corresponds to storing the wave function obtained in the previous step; the fifth step corresponds to calculating the ground-state energy using the wave function obtained in the fourth step.Note that predicts and the final wave function should be normalized whenever generated.
B. One-dimensional one-particle systems In this section, benchmark calculations of onedimensional systems are shown.The dependence of the numbers of units and layers on calculation accuracy is also discussed.Since there exists only one particle, there is no interaction, V int ≡ 0; thus, the Hamiltonian reads Since we focus only on bound states in this paper, it is enough to deal with the limited spatial region.In the calculation, |x| ≤ x max is considered and the box is discretized within 1024 meshes, i.e., M = 1024.The Dirichlet boundary condition (ψ (±x max ) = 0) is used.For the second derivative, the three-point derivative is used for simplicity, while it can be straightforwardly improved for the accuracy [63].Then, H is discretized as and the wave function is also discretized as a (M − 1)dimensional vector where ψ is assumed to be normalized, i.e., h hj, and h denotes the mesh size h = 2x max /M [64].This ψ is used for predicts and output_wf.Here, a tilde denotes discretized form.Then, ⟨H⟩ can be calculated as 1. Harmonic oscillator First of all, the harmonic oscillator potential is tested.The ground-state wave function ψ gs and energy E gs are, respectively, known exactly as [41] ψ gs (x) = ω π In this calculation, x max = 5 is used.
Table I shows the summary of calculations.In general, all the calculations give almost the correct energy [Eq.(9b)].On the one hand, total optimization costs similar amount of time in all the calculation.On the other hand, different setup requires different number of epochs and time per epoch for optimizing the DNN.Small DNN tends to take shorter time for each epoch, while it requires longer epochs.It seems that 32 units per layer is too large, so it requires longer epoch and longer estimation time per epoch.It should be noted that the number of epochs differs in each run since the initial condition of the fitting procedure is generated by the random numbers.In addition, if one uses a different value of the learning rate, the number of epochs can be different.
Figure 2 shows relative errors of the loss function, ⟨H⟩, to the exact ground-state energy E gs as functions of the number of epochs.It can be seen that, although the loss function achieved the relative error of 1.0×10 −8 , the final accuracy becomes about 1.0 × 10 −4 .This may be due to the precision of the Tensorflow code.
Figure 3 shows calculated wave functions.The red thick lines correspond to the exact solution given in Eq. (9a), while thin lines correspond to the results given in this work, where different colors correspond to different numbers of units and layers.The relative errors of the DNN wave function, ψ DNN , to the exact one, ψ exact , are shown in Fig. 4. It can be seen that the DNN calculation, basically, reproduces the exact solution in our interest within the accuracy of 10 −4 or more.This deviation can be reduced if we use more tight convergence criterion [65].In the tail region, the deviation δψ (x) diverges, while this is because the denominator of Eq. ( 10), ψ exact (x), reaches to zero.The deviation looks larger if ω is smaller, which is related to the cutoff parameter for the spatial mesh x min .The exact value of ψ gs (x min ) is 2.8 × 10 −6 for ω = 1.0, while it is 8.1 × 10 −28 for ω = 5.0 and it is much smaller for ω = 10.0, while in the numerical calculation, they are approximated to zero.The value 2.8 × 10 −6 may be too large to assume to be zero.
It should be noted that rather small DNN is enough to reproduce the solution of the wave function.Owing to the simplicity, it is easy to analyze the weights and biases of the DNN.For instance, the DNN wave function for the single layer with four units includes only 13 parameters; the ground-state DNN wave function for ω = 1.0 can be written as where the first coefficient of Eq. (11a) (1/3.7451) is not obtained by the DNN but instead by the normalization.Hence, smaller DNN is better not only due to the calculation cost but also for analysis of the structure of DNN.
Let us provide our interpretation of the obtained wave function [Eq.(11a)].The rectified linear function (ReLU) , respectively.The DNN with the ReLU function can be understood as an approximation with a piecewise linear function.As shown in Fig. 5, the ground-state wave function in DNN is approximated by the following function: where a and b are positive numbers.The ReLU function at the output layer guarantees to make the wave function vanish for x < −b/a and x > b/a, and thus, a gs should be ±ax + b.This function can be represented by just two ReLU functions.Hence, even two units in hidden layer are enough to describe the brief structure of ground-state wave function, and with increasing the number of units, the ground-state wave function is reproduced easily.Since the ReLU function is not differentiable at x = 0, the ReLU wave function is not differentiable.Hence, the softplus is better to describe a differentiable function, while the ReLU function can also describe a differentiable function approximately if the number of units is large enough.In case of N -bodies systems, the similar function to Eq. ( 11a) can be represented by just 2 N ReLU functions.

Square-well potential
Next, the square-well potential is tested (V 0 > 0).The analytical forms of the groundstate wave function ψ gs and energy E gs are unknown; thus, our values of the energy will be compared with the numerical calculation obtained by the orthodox method of Hamiltonian diagonalization.In this calculation, x max = 20 and x 0 = 1 is used.
Table II shows the summary of calculations.In general, all the calculations give almost the correct energy.The calculation time per epoch and the number of epochs with respect to the number of layers and units is slightly longer than the case of the harmonic oscillator.
Figure 6 shows relative errors of the loss function, ⟨H⟩, to the exact ground-state energy E gs as functions of the number of epochs.It can be seen that, although the loss function achieved the relative error of 1.0 × 10 −7 , the final accuracy becomes about 1.0×10 −2 .Note that in this calculation, the convergence criteria is needed to be set looser than the other case; otherwise, it could not reach convergence.This may be related to the shape of the potential: Asymptotic region of the square-well potential is zero, while the harmonic oscillator potential increases rapidly.It will be shown later that the double-well potential, which is close to the latter situation, reaches convergence with the tight criterion.
Figure 7 shows calculated wave functions.The red thick lines correspond to the exact solution given by the exact diagonalization, where the same mesh size matrix form are used, for comparison; thin lines correspond to the results given in this work.It can be seen that the DNN calculation, basically, reproduces the solutions given by the exact diagonalization.
C. One-dimensional many-particle systems When one considers systems composed of many identical particles, the symmetrization for bosonic systems or the antisymmetrization for fermionic systems of the wave function must be considered.The ground state of the bosonic system is identical to that of the different particles; hence it has no extra difficulty as was done in Ref. [58], while the antisymmetrization is rather difficult.In this section, a simple method of (anti)symmetrization in the DNN wave function is provided, in which the symmetrization and the antisymmetrization can be performed with equal footing.

Hamiltonian matrix
As was done in the last section, the discretized Hamiltonian H should be represented in a matrix form and the discretized wave function ψ should be represented in a vector form.Here, one-dimensional two-body systems are considered as an example, and their coordinates are denoted by x and y.Each direction is discretized with M meshes, i.e., in total M × M meshes.Then, the discretized wave function ψ is . . .
where ψ jk = ψ (x j , y k ), x j = −x max + hj, y k = −y max + hk, and x max = y max .Accordingly, the discretized Hamiltonian H reads where T1 and T2 are the kinetic energy matrices and Ṽint are the interaction matrix whose matrix elements are where I M −1 is the (M − 1) × (M − 1) identity matrix and ⊗ is the Kronecker product.For instance, the matrix elements of Eqs.(17) reads 0 5000 10000 15000 20000 25000 30000 35000 40000     FIG. 3. Wave function under the harmonic oscillator potential.The red thick line corresponds to the exact solution [Eq.(9a)], while thin lines correspond to results of DNN calculation.Different thin line corresponds to different number of units.We observe that all the simulated results overlap with the exact solution. T2

Symmetrization and antisymmetrization
The discretized wave function ψ should be symmetrized or antisymmetrized.In general, for the arbitrary function f (x, y), f (x, y)+f (y, x) is a symmetrized function and f (x, y) − f (y, x) is an antisymmetrized function.
is assumed to be a trial wave function.In the Tensorflow code, instead of the original predicts and output_wf, ψ± is used in the third (the calculation process of the loss function) and fifth (calculate the groundstate energy) steps in Sec.II A. Note that this process can be easily done by using the following commands: 1. predicts_transpose = tf.reshape(predicts,[m, m]), 2. predicts_transpose = tf.transpose(predicts_transpose), . predicts = tf.add(predicts,predicts_transpose) for bosonic systems or predicts = tf.subtract(predicts,predicts_transpose) for fermionic systems, 5. the final predicts is used to evaluate the loss function.
Here, m corresponds to the number of meshes M .Note that this method can be straightforwardly extended to multibody systems.

Two-body systems
Two-body systems under the harmonic oscillator potential [Eq. ( 8)] is tested.If there is no interaction V int ≡ 0, the ground-state wave function ψ gs and energy E gs are known exactly.If one considers bosonic systems, they read and if one considers the fermionic systems, they read Figures 8 and 9, respectively, show wave functions for bosonic and fermionic systems obtained in this work.The total energies and the calculation time are shown in Table III.For comparison, the exact wave functions [Eq.(23a) or Eq.(24a)] are also shown.Here, x max = y max = 5 and M x = M y = 256 are used for the spatial mesh and two layers each of which contains 32 units are used for the DNN.For the interaction, the Gaussian-type interaction is used, where λ is the strength of the interaction.The DNN results with λ = 0 show a good agreement with the exact results, demonstrating that the DNN technique works well.
Behavior of wave functions for nonzero λ is consistent qualitatively with our expectation: if λ is negative, i.e., the interaction is attractive, the wave function tends to collapse, and if λ is positive, i.e., the interaction is repulsive, the wave function tends to be broad.Time cost per epoch is almost universal among all the calculation, while more epochs are required to reach convergence for fermionic systems than for bosonic systems.This may be because all the values are positive for the initial condition, while there are negative values for fermionic groundstate wave functions.More epochs are required for the repulsive interaction (λ > 0) than for the attractive interaction (λ < 0).This may be because the topology of the wave function is more complicated and extended in the repulsive case than the attractive case.
Finally, we point out a strange behavior of the obtained wave function of the two-body system of ω = 1.0 without the interaction.Here, for simplicity, the two-layer DNN in which each layer is composed of four units is used.In the case of two layers, the function obtained by optimized  weights of DNN is where A is the normalization constant, N unit is the number of units of each layer, w is a weight, and b is a bias.The left column of Fig. 10 shows u gs (x, y).The upper and lower rows, respectively, correspond to the results with minimizing the bosonic or fermionic energy expectation value.It is shown that the obtained function u gs , which is referred to as the raw wave function, is not symmetric nor antisymmetric.After the symmetrization ψ sym (x, y) = [u gs (x, y) + u gs (y, x)] /A sym or the antisymmetrization ψ antisym (x, y) = [u gs (x, y) − u gs (y, x)] /A antisym is performed with the normalization constant A sym or A antisym , ψ sym or ψ antisym can be regarded as the bosonic or fermionic ground-state wave function, respectively.The energy expectation value of the raw (u gs ), the symmetrized (ψ sym ), and the antisymmetrized (ψ antisym ) wave functions are shown in Table IV.A surprising fact is that even if the raw wave function is obtained by minimizing the bosonic (fermionic) expectation value, fermionic (bosonic) energy expectation value is close to the correct fermionic (bosonic) energy eigenvalue, and V ext : Square wall V 0 = 0.5 ψ (x) x vice versa.The (anti)symmetrization corresponds to the projection of the raw wave function to the boson (fermion) subspace, while the remaining part is not supposed to be optimized well.This unexpected coincidence may be due to the smallness of the number of parameters in the DNN architecture, and deserves further study.

Three-body systems
Three-body systems under the harmonic oscillator potential [Eq. ( 8)] is tested.For simplicity, we consider a system without any interaction V int ≡ 0.Then, the ground-state wave function ψ gs and energy E gs are known exactly as for bosonic systems and for fermionic systems.Figures 11 and 12, respectively, show wave functions for bosonic and fermionic systems obtained by this work.The total energies and the calculation time are shown in Table V.Here, x max = y max = z max = 5 and M x = M y = M z = 64 are used for the spatial mesh and two layers each of which contains 32 units are used for the DNN.The interaction is not considered.
The DNN calculations reproduce the exact groundstate energies.The DNN wave functions are consistent with the exact solution.The number of epochs for threebody systems is comparable with those for two-body systems, where the number of units and layers are identical for these two cases.In contrast, the time per epoch for the three-body systems are about four times of that for the two-body systems.This is related to the number of spatial meshes: 256 × 256 = 65536 meshes are used for the two-body systems and 64 × 64 × 64 = 262144 meshes are used for the three-body systems; thus, the number of meshes for the three-body systems are four times more than those for the two-body systems.Hence, it can be concluded that the time per epoch is almost proportional to the number of spatial meshes.This is reasonable since we use numerical methods for sparse matrices, in which the number of the nonzero matrix elements is O M N d .

III. EXCITED-STATE CALCULATION
In this section, based on the variational principle, a method to calculate low-lying excited states sequentially is explained.Assume that wave functions of the ground state and n excited states, |ψ 0 ⟩, |ψ 1 ⟩, . . ., |ψ n ⟩, are obtained, where |ψ 0 ⟩ = |ψ gs ⟩.We consider a problem of finding the (n + 1)-th excited state |ψ n+1 ⟩, which satisfies the orthnormal condition tion value where |ψ⟩ is assumed to be orthogonal to |ψ j ⟩ (j = 0, 1, . . ., n).This can be implemented in Tensorflow with assuming that is a trial wave function, instead of the simple |ψ⟩.For one-body problem, x max = 5 and M = 1024 are used for the spatial mesh and the single-layer DNN with eight units is adopted.

A. Harmonic oscillator potential
One-body one-dimensional harmonic oscillators are taken as examples.The exact wave functions for several low-lying excited states are [41] where ψ n is the n-th excited state, and the energies are Figure 13 shows the wave functions of the ground state and first, second, third, and fourth excited states.Ta-ble VI shows the summary of calculations.Basically, not only the ground-state but also low-lying excited-states wave functions and energies are successfully calculated.Thus, it can be concluded that the method to calculate low-lying excited states proposed here works well.
The number of epochs are almost universal for all states calculated here.In contrast, the time per epoch for a higher excited state is slightly longer since calculation for orthogonal condition [Eq.(31)] is needed to be performed, while it takes just a few µs.
Let us explain why our simple DNN can describe even the excited states correctly.For simplicity, the singlelayer DNN with the four unit is used.The optimized raw wave function for the ground state (u gs ) and the first (u 1st ), and the second (u 2nd ) excited states are shown in Fig. 14.The obtained function for the n-th excited state is with n j=0 |a j | 2 = 1, where ψ n-th is the n-th excited-state wave function.In the case of the first and second excited states, are obtained.We notice that the most part of the obtained raw wave function is the ground state and the small fraction is for the excited components; hence, the fairly simple raw wave function, which is made by the small architecture of the DNN and is close to that of the ground-state wave function, is capable of describing even the excited states.

B. Double-well potential
To see the effect of degeneracy, we also test the doublewell potential If the central barrier is low enough, i.e., α is small enough, each state is not degenerate.In contrast, if the central barrier is high, i.e., α is large, low-lying excited states below the central barrier are twofold degenerate: One state ψ L is localized into the left (x < 0) region while the other state ψ R is localized into the right (x > 0) region, and ψ L (x) = ψ R (−x) holds.Using a linear combination of these two degenerate states, one can recognize each state is degenerate of the following two states: where ψ + (ψ − ) is a positive (negative) parity state.According to the exact diagonalization, α = 1.0 and 1.25 give nondegenerate ground and low-lying excited states and thus ψ j is just a j-th excited state, while α = 2.0 and 3.0 give ground and low-lying excited states, which are almost twofold degenerate: ψ 0 and ψ 1 correspond to the ground states and ψ 2 and ψ 3 correspond to the first excited states.
Figures 15 and 16 shows the wave functions of the ground state and first, second, third, and fourth excited states.Table VII shows the summary of calculations.Basically, not only the ground-state but also low-lying excited-states wave functions and energies are successfully calculated, even for the degenerate states.It is not apparent which calculation gives, left-right bases (ψ L and ψ R ), parity bases (ψ + and ψ − ), or even general linear combinations.The DNN calculations for both α = 2.0 and 3.0 obtained wave functions with the left-right bases, while it may depend on the initial condition.Note that the exact diagonalization for α = 2.0 obtained wave functions with the parity bases, while wave functions with the left-right bases are plotted by using linear combinations in Fig. 16 to make a comparison with the DNN result easily.

C. Two-body systems
Two-body one-dimensional harmonic oscillators are taken as the last examples.Here, x max = y max = 5 and M x = M y = 256 are used for the spatial mesh and two layers each of which contains 32 units are used for the DNN.Here, the interparticle interaction is not considered.The exact wave functions for several low-lying excited states can be written as linear combinations of Eqs.(32): where the energy eigenvalue of Ψ 0 , Ψ 1 , Ψ 2 , and Ψ 3 are equal to, respectively, 1, 2, 3 and 3 for bosonic systems and where the energy eigenvalue of Ψ 0 , Ψ 1 , Ψ 2 , and Ψ 3 are equal to, respectively, 2, 3, 4 and 4 for fermionic systems.Note that the second excited states, Ψ 2 and Ψ 3 , are twofold degenerate in both the bosonic and fermionic systems.
Figures 17 and 18, respectively, show the wave functions of the ground state and first and second excited states.Table VIII shows the summary of calculations.
Note that [Ψ 2 (x, y) ± Ψ 3 (x, y)] / √ 2 are plotted for the second excited states for exact solutions.Not only the ground-state but also low-lying excited-states wave functions and energies are successfully calculated even for two-body systems.In addition, as one-body problems, the numerical cost for a low-lying excited state is almost the same as that for the ground-state.Thus, this method to calculate low-lying excited states can work even for multibody systems with a reasonable numerical cost.

IV. SUMMARY
In this paper, we proposed a method to calculate the wave functions and energies of not only the ground state but also low-lying excited states of quantum multibody systems using the deep neural network and the unsupervised machine learning technique.To calculate systems of many-particle systems of identical particles, a simple method of symmetrization for bosonic systems and antisymmetrization for fermionic systems were also proposed.
The obtained wave functions and energies are consistent with the exact solution.We found that the neural network is not necessarily large for one-body systems, which also enables us to analyze the internal structure of the deep neural network used.For instance, just only one hidden layer with four units is enough to describe the ground-state wave function of the harmonic oscillator.This can be understood by using the piecewise approximation with linear functions.We confirmed that our simple (anti)symmetrization method works for multibody systems.The numerical cost per epoch for fermionic systems is almost the same as that for bosonic systems.The numerical cost is almost proportional to the number of spatial meshes since the sparse matrix representation is used.In addition, the numerical cost for a low-lying excited state is almost the same as that for the ground state.
The deep neural network has been applied to solve many-fermion systems where the ground-state wave function is assumed to be a Jastrow wave function [46,47,57].The method proposed in this paper can be an alternative method to solve many-fermionic systems since the ansatz for the ground-state wave function is lenient, and the symmetrization and antisymmetrization are treated on an equal footing.
Since the numerical cost is not so large and our (anti)symmetrization is quite simple, this method can be an alternative method to calculate wave functions and energies of the ground and low-lying excited states, for instance, for the electronic structure of molecules and solids, for the nuclear structure of atomic nuclei including a tetra neutron [66][67][68], and for cold atoms [69].
At this moment, we only considered one-dimensional systems, while most problems interested are threedimensional systems.In addition, spin components, or even isospin components for nuclear systems, are often important.The restricted Boltzmann machine has been applied to obtain the ground-and low-lying excitedstates wave functions [45,70,71].Since the input is dis- crete variables in the spin systems, the Boltzmann machine is suitable.Such pioneering works may help to consider the spin (or isospin) components in this work.Such extensions are possible within our framework, and remain for future work.
As far as we know, all the calculations using the deep neural network for wave functions are static, while describing many phenomena including the interaction between matter and laser [72,73], ion-cluster collision [74] heavy-ion collision [75] nuclear fission [76,77], and fusion [78].To describe such phenomena, time evolution from a state obtained by the deep neural network is also interesting, while it is left for a future study.
Finally, let us make a comment on the interpretation of the wave functions obtained in our work.As we have shown, thanks to the simplicity of the deep neural network, we could interpret the structure of the network easily.We found that replacing the softmax function with the ReLU activation provides a piecewise linear function which approximates the ground-state wave func-tion.Since any wave function including those for excited states, which is naturally continuous, can be approximated by a piecewise-linear function, we intuitively conclude that the neural network representation can work for any physical quantum mechanical system in any dimensions.The physical meaning of the piecewise-linear functions is as follows.First of all, linear functions are solutions of the free Schrödinger equation with no potential term.So it is a good idea to start with linear functions in physical systems.Then the inclusion of the potential term in the Hamiltonian causes the curvature of the wave function.The curvature is determined by the interplay between the Laplacian and the potential term in the Hamiltonian.So, the kink structure of the wave function is dictated by the Hamiltonian.The kinks correspond to the ReLU activations, thus in effect, the nonlinearity in the Hamiltonian corresponds to the neural network structure.This reminds us of the work [79] in which the deep layers of the deep Boltzmann machine representing the ground-state wave functions of spin systems were in- terpreted as a Euclidean Hamiltonian evolution, or the work [80][81][82] in which the deep layers of the sparse neural network used for the AdS/CFT correspondence were interpreted as a bulk curved geometry.Further interplay between the sparsity of the interpretable neural networks and Hamiltonians of physical systems is to be discovered.

FIG. 1 .
FIG. 1. Schematic figure of the deep neural network representing a one-dimensional three-body system.
is a widely used activation function, and the softplus function can be regarded as a smoothed version of the ReLU.Here, for interpreting Eq. (11a), we shall just replace the softplus function with the ReLU.The wave function obtained by the DNN [Eq.(11a)] and the obtained function before the output layer [Eq.(11b)] are shown in Fig. 5, where the normalization factor (1/3.7451) of ψ gs is ignored.Equations (11a) and (11b) where the softplus function is replaced to the ReLU function are also plotted as ψ ReLU gs and a ReLU gs

16 FIG. 4 .FIG. 5 .
FIG.4.Relative error of DNN wave function to exact one for the harmonic oscillator potential.Different thin line corresponds to different number of units.In the region of large |x|, the deviation δψ (x) diverges, because the denominator of Eq. (10), ψ exact (x), reaches to zero.Hence, the figures plot only |δφ (x)| < 10 −1 .

FIG. 8 .
FIG.8.Two-body wave function under the harmonic oscillator potential for bosonic systems.The exact wave function without the interaction is shown in the left-most column.

FIG. 10 .FIG. 11
FIG.10.DNN wave function of the raw (ugs in Eq. (26a)), the symmetrized, and the antisymmetrized wave functions.The rows named "Boson" and "Fermion," respectively, correspond to the results obtained by minimizing the bosonic or fermionic energy expectation value.

FIG. 13 .
FIG.13.Wave functions of the ground and low-lying excited states under the harmonic oscillator potential.The top panel shows the exact wave functions and the bottom one shows the DNN wave functions.To make consistency for the phase factor, −ψ3 (x) is plotted for the exact wave function of the third excited state.

FIG. 14 .
FIG.14.Optimized function u obtained by the DNN for the ground-state (0th), the first and the second excited states.

VVV 3 FIG. 15 .
FIG. 15.Wave functions of the ground and low-lying excited states under the double-well potential for α = 1.0 and 1.25.The top panels show the exact wave functions and the bottom ones show the DNN wave functions.

TABLE I .
Calculation summary of a one-body problem under the harmonic oscillator potential.Row with "-" in the column "# of unit for 2nd layer" corresponds to calculation performed only with one layer.
FIG.2.Relative error of ⟨H⟩ to the exact ground-state energy Egs for the harmonic oscillator potential as functions of the number of epochs.

TABLE II .
Calculation summary of an one-body problem under the square-well potential.Row with "-" in the column "# of unit for 2nd layer" corresponds to calculation performed only with one layer.

TABLE III .
Calculation summary of a two-body problem under the harmonic oscillator potential.

TABLE VI .
Calculation summary of excited states for a onebody problem under the harmonic oscillator potential.Calculation is performed with ω = 1.0.

TABLE VII .
Calculation summary of a one-body problem under the double-well potential.

TABLE VIII .
Calculation summary of excited states for a two-body problem under the harmonic oscillator potential.Calculation is performed with ω = 1.0.