Periodic structure of memory function in spintronics reservoir with feedback current

The role of the feedback effect in physical reservoir computing is studied theoretically by solving the vortex-core dynamics in a nanostructured ferromagnet. Although the spin-transfer torque due to the feedback current makes the vortex dynamics complex, the feedback effect does not always enhance the memory function of a physical reservoir. The memory function, characterized by the correlation coefficient between the input data and the dynamical response of the vortex core, becomes large when the delay time of the feedback current is not an integral multiple of the pulse width. On the other hand, the memory function remains small when the delay time is an integral multiple of the pulse width. As a result, a periodic behavior of the short-term memory capacity is observed with respect to the delay time, a phenomenon that can be attributed to correlations between the virtual neurons via the feedback current.


I. INTRODUCTION
Physical reservoir computing is a kind of recurrent neural network based on physical systems, such as optical lasers, soft materials, and quantum systems [1-10]. The dynamical response of the physical system, called the reservoir, consisting of numerous virtual neurons with recurrent interactions, reflects the history of the time-dependent data injected into the reservoir. Such a response can be applied to energy-efficient, brain-inspired computation that processes and classifies time sequences of input data such as spoken language and video. Therefore, physical reservoir computing has attracted much interest from various fields of science, such as neuroscience, information science, and physics. Recently, it has been shown that nanoscale ferromagnetic structures can act as physical reservoirs [11-19], and other computational schemes and related phenomena based on spintronics devices have also been proposed [20-26].
An intriguing method for physical reservoir computing is the time-multiplexing method [8,11,13], which creates virtual neurons by dividing the dynamical response of the reservoir into several nodes and thus enables reservoir computing with a small number of physical particles. A critical issue of this method is, however, the low memory functionality of the reservoir due to a lack of richness in the dynamics. A fascinating approach to solving this issue is to use a feedback effect, because it makes the dimension of the phase space infinite and causes highly nonlinear dynamics [27], although adding a delay line inevitably enlarges the whole device. Recently, the role of the feedback effect in spintronic systems has been extensively studied [28-31]. For example, a modulation of the Hopf bifurcation in a vortex-type ferromagnet by feedback of the electric current was proposed theoretically [28] and confirmed experimentally [29]. Chaos in nanomagnets via feedback current was also predicted theoretically [30,31]. An enhancement of the pattern-recognition rate in the presence of the feedback current was reported recently [18]. It is, however, still not fully understood how the feedback effect relates to the memory function of a physical reservoir. More generally, the roles of network topology, complexity, and/or parameters in the performance of reservoir computing remain unclear [32-34].
In this work, a periodic behavior of the memory function with respect to the delay time is found in a spintronics reservoir with feedback current. As a figure of merit, the short-term memory capacity is evaluated, which characterizes the number of input data stored in a physical reservoir. The short-term memory capacity is maximized when the delay time τ of the feedback current is slightly longer than the pulse width t_p of the input data. A periodic reduction of the short-term memory capacity is also revealed when the delay time is an integral multiple of the pulse width, i.e., τ ≃ nt_p (n ∈ N). The periodic behavior of the memory function can be attributed to an interaction between the virtual neurons via the feedback effect.
The paper is organized as follows. In Sec. II, we describe the magnetic multilayer including a magnetic vortex and give the equation of motion for the core center. In Sec. III, the dynamics of the magnetic vortex core in response to a pulse current is studied. In Sec. IV, the short-term memory capacity is evaluated, where the periodic structure of the memory function is found. Section V summarizes the conclusions of this work.

II. SYSTEM DESCRIPTION
The system investigated in this work is schematically shown in Fig. 1(a). A magnetic tunnel junction (MTJ) consists of a vortex-type free layer and a uniformly magnetized reference layer.

FIG. 1: (a) Schematic picture of the magnetic trilayer. The magnetization in the reference layer is p. The current density injected into the free layer consists of a bias and a random binary pulse. In addition, the feedback current (χJ_dc[Y(t − τ)/R]) is also injected. (b) A magnetic configuration in a vortex-type free layer. The vortex-core center position is X = (X, Y, 0).

The unit vector pointing in the magnetization direction of the reference layer is denoted as p. The z axis is perpendicular to the film plane, whereas p lies in the xz plane as p = (p_x, 0, p_z). A current density J is injected into the free layer, where positive current is defined as current flowing from the reference to the free layer. The spin-transfer torque [35-37] caused by the current competes with the damping torque and moves the vortex core away from the disk center, as schematically shown in Fig. 1(b). The dynamics of the vortex-core center is described by the Thiele equation for the core position X = (X, Y, 0) given by [38-43]

G ẑ × (dX/dt) − D(1 + ξs²)(dX/dt) + ∂W/∂X = a_J J [p_z ẑ × X + cR_0 p_x x̂],   (1)

where G = 2πpcML/γ and D = −(2παML/γ)[1 − (1/2) log(R_0/R)] consist of the saturation magnetization M, the gyromagnetic ratio γ, the Gilbert damping constant α, the thickness L, the disk radius R, and the core radius R_0 of the free layer. The polarity p and chirality c are assumed to be +1, for convenience. The unit vectors along the x and z axes are denoted as x̂ and ẑ, respectively. The normalized position of the vortex-core center is s = |X|/R. The nonlinear parameter of the damping torque is ξ (≃ 1/4) [43]. The potential energy is

W = (κ/2)|X|² + (κ′/4)(|X|⁴/R²),   (2)

where κ ≃ (10/9)4πM²L²/R and κ′/κ ≃ 1/4 [39,43]. The spin-torque strength with the spin polarization P is a_J = πℏP/(2e).
Throughout the paper, we use the following values of the parameters estimated from experiments and simulations [43-46]: M = 1500 emu/c.c., γ = 1.764 × 10^7 rad/(Oe s), α = 0.005, L = 4 nm, R = 150 nm, R_0 = 10 nm, and P = 0.3. The direct current I = πR²J is 4.0 mA, corresponding to a current density of J_dc ≃ 5.7 × 10^6 A/cm². The magnetization in the reference layer is tilted from the z axis by 75°. We note that the magnetization direction in the reference layer can be stabilized in the tilted direction in experiments with the help of the interlayer exchange coupling in a synthetic antiferromagnet through a nonmagnetic spacer such as Ru, an exchange bias from an antiferromagnet such as IrMn, or an external magnetic field applied in the z direction [44-46]. We note that a finite tilted component p_z in the z direction is necessary to excite the dynamics of the magnetic vortex core; see Appendix A.
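The coefficients of the Thiele equation follow directly from these material parameters. A minimal sketch in Python, assuming CGS units throughout (the numerical values are those quoted above):

```python
import math

# Material parameters from Sec. II (CGS units).
M = 1500.0            # saturation magnetization [emu/c.c.]
gamma = 1.764e7       # gyromagnetic ratio [rad/(Oe s)]
alpha = 0.005         # Gilbert damping constant
L = 4e-7              # thickness [cm] (4 nm)
R = 150e-7            # disk radius [cm] (150 nm)
R0 = 10e-7            # core radius [cm] (10 nm)
p_pol, c_chir = 1, 1  # polarity and chirality, both +1

# Coefficients entering the Thiele equation.
G = 2 * math.pi * p_pol * c_chir * M * L / gamma
D = -(2 * math.pi * alpha * M * L / gamma) * (1 - 0.5 * math.log(R0 / R))
kappa = (10.0 / 9.0) * 4 * math.pi * M**2 * L**2 / R

# D is negative by definition; its magnitude is small compared with G
# because the Gilbert damping alpha is small.
print(G, D, abs(D) / G)
```

The ratio |D|/G is of order α, which is why the damping term can be treated perturbatively in Appendix A.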

III. DYNAMICAL RESPONSE TO PULSE CURRENT
The input data for physical reservoir computing are random binary pulses b = 0 or 1, which are injected as electric current into the spintronics system [11,16,17]. In addition, in this work, the output current from the MTJ is returned to the free layer with a delay time τ, as schematically shown in Fig. 1(a). Therefore, the total current density J in Eq. (1) is given by

J(t) = J_dc [1 + νb(t) + χY(t − τ)/R],   (3)

where ν and χ are the strengths of the binary pulse and the feedback current, respectively, and are set to ν = 0.2 and χ = 0.1 in this work. The binary pulse is constant during a pulse width t_p. Through the magnetoresistance effect, the magnitude of the feedback current reflects the vortex-core position with a time lag equal to the delay time τ. The resistance of the MTJ with the vortex free layer depends on the y component of the core position [28] because it determines the cross-section area with the magnetic moments parallel (or antiparallel) to p. Therefore, the feedback current in Eq. (3) is proportional to Y(t − τ). The fourth-order Runge-Kutta method combined with the continuation method is applied to Eq. (1) to include the feedback effect [31,47].

The reservoir computing is performed by using the output s(t) [11,13]. We note that the output current from the magnetic multilayer depends on the vortex-core position through the tunnel magnetoresistance effect, as mentioned above. Therefore, the memory of the vortex position at a past time is naturally stored in the output signal. In other words, by passing through an electric cable, the feedback current naturally carries the information of the past state with a finite delay time. Thus, no external registers or memory devices are necessary, which is a preferable feature of spintronics devices for physical reservoir computing with the feedback effect. Such a delayed-feedback effect without registers or memory devices was already demonstrated experimentally [29]. We note that electric circuits, such as registers and/or memories, can also be applied to physical reservoir computing because they have memory function and/or nonlinear response. It was, however, shown in our previous work that using a spintronics device for reservoir computing results in a larger memory functionality than an electric circuit without a spintronics device [13]. The result indicates that spintronics devices contribute to an enhancement of the memory function in physical reservoir computing. Furthermore, we show in the following that the feedback effect further contributes to achieving a large memory function in spintronics devices.

Let us first show the role of the feedback current in the vortex dynamics. Figure 2(a) shows the time evolution of the center position s(t) of the vortex core from the disk center in the absence of random binary pulses. The inset shows an enlarged view in a steady state. In the absence of the feedback effect (χ = 0), the radius of the vortex-core trajectory saturates to a certain value after a relaxation time on the order of 1 µs and shows a limit-cycle oscillation, as indicated by the black line.
A small oscillation of the amplitude is due to the spin-transfer torque originating from the in-plane component (p_x) of the magnetization in the reference layer, which breaks the circular rotation of the vortex core through the torque with the polarization in the x direction. The feedback effect changes the vortex dynamics, depending on the delay time. For relatively short delay times such as 10 (red), 50 (orange), and 200 (green) ns, the saturated values of the vortex-core position are changed. This is due to the modulation of the Hopf bifurcation [28], where the critical current density necessary to move the vortex core from the disk center oscillates with the delay time; see also Appendix A. An amplitude modulation, as well as nonperiodic behavior, is observed for relatively long delay times such as 500 ns (blue) and 1 µs (purple). Figure 2(b) shows the vortex dynamics in the presence of the random binary pulses, where the delay time and the pulse width are 500 and 25 ns, respectively. The pulse width is sufficiently shorter than the relaxation time of the vortex-core dynamics, which is on the order of 1 µs, as mentioned above. Therefore, the dynamics shows a transient behavior from one limit-cycle state to another and does not saturate to a steady state.
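Including the feedback term requires the solver to remember the trajectory over at least one delay time. The sketch below illustrates the history-buffer idea behind the Runge-Kutta continuation scheme mentioned above, using a toy linear delayed equation dy/dt = −y(t) + χ y(t − τ) instead of the full Thiele equation; the function name and step sizes are illustrative only.

```python
# Minimal sketch of fourth-order Runge-Kutta integration with a delayed
# term: past values are kept in a list, and y(t - tau) is read back from
# it. The delayed value is held constant within each RK4 step, which is
# a common simplification for small dt.
def integrate_delayed(chi=0.1, tau=1.0, dt=0.01, t_end=20.0, y0=1.0):
    n_delay = int(round(tau / dt))
    history = [y0] * (n_delay + 1)   # y(t) = y0 for t <= 0

    def f(y, y_delayed):
        # Toy stand-in for the equation of motion with feedback.
        return -y + chi * y_delayed

    y = y0
    for _ in range(int(t_end / dt)):
        yd = history[-(n_delay + 1)]  # y(t - tau) from the stored history
        k1 = f(y, yd)
        k2 = f(y + 0.5 * dt * k1, yd)
        k3 = f(y + 0.5 * dt * k2, yd)
        k4 = f(y + dt * k3, yd)
        y += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        history.append(y)
    return y

# For chi < 1 this linear delayed system decays toward zero.
print(integrate_delayed())
```

In the actual simulation the same bookkeeping applies to Y(t − τ) in Eq. (3), with the Thiele equation in place of the toy right-hand side.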

IV. RESERVOIR COMPUTING AND PERIODIC STRUCTURE IN MEMORY CAPACITY
The procedure of the physical reservoir computing is as follows; see also Appendix B. The dynamical response s(t) during an input pulse is divided into N_node nodes. We denote the output s at the ith node in the presence of the kth pulse as s_{k,i}. The time-multiplexing method regards s_{k,i} as the ith virtual neuron at a discrete time k [8,11,13]. The vortex dynamics can then be regarded as an interacting system consisting of N_node neurons, where the state of a neuron s_{k,i} is affected by the other neurons through the time evolution described by Eq. (1). Adding the feedback effect brings an additional interaction, as discussed below. Similarly, let us denote the kth binary input data as b_k. Applying N_L pulses to the vortex, we determine the weights w_{D,i} minimizing the distance between the input and output data,

Σ_{k=1}^{N_L} (b_{k−D} − Σ_{i=1}^{N_node+1} w_{D,i} s_{k,i})²,

where s_{k,N_node+1} = 1 is a bias term. The integer D (= 0, 1, 2, · · ·) is called the delay [8]. Since the recurrent neural network calculates a time sequence of input data, the weight should be defined for each past datum injected D pulses before the current pulse. We note that the delay D is a common quantity in reservoir computing [8,9,13] and is not to be confused with the delay time τ of the feedback current in the present study. Throughout this paper, the symbol D appearing in quantities related to reservoir computing and used as an integer represents the delay, whereas the symbol D appearing in the Thiele equation represents the damping strength.
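The weight determination is an ordinary linear least-squares problem: collect the virtual-neuron outputs into a matrix (with a constant bias node appended, as in Appendix B) and regress the delayed input b_{k−D} on it. A sketch with placeholder reservoir data (in practice s_{k,i} comes from the Thiele equation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder stand-in for the reservoir output s_{k,i}: N_L pulses,
# N_node virtual neurons. Random here; the text obtains it from Eq. (1).
N_L, N_node, D = 200, 10, 1
b = rng.integers(0, 2, size=N_L).astype(float)   # binary input b_k
S = rng.random((N_L, N_node))                    # s_{k,i} (placeholder)
S = np.hstack([S, np.ones((N_L, 1))])            # bias node s_{k,N_node+1} = 1

# Weight w_{D,i} minimizing sum_k (b_{k-D} - sum_i w_{D,i} s_{k,i})^2,
# fitted over the pulses k >= D for which b_{k-D} exists.
target = np.roll(b, D)[D:]                       # b_{k-D}
w_D, *_ = np.linalg.lstsq(S[D:], target, rcond=None)
print(w_D.shape)  # one weight per virtual neuron plus the bias node
```

With real reservoir data the same call is repeated for each delay D, giving one weight vector per delay.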
After determining the weights, we inject a different random binary pulse sequence b′_m and measure the output s′_{m,i}, where m = 1, 2, · · ·, N′_L, and the number of inputs N′_L is not necessarily the same as the number N_L used for the learning. Then, we define

v_{m−D} = Σ_{i=1}^{N_node+1} w_{D,i} s′_{m,i}

and evaluate the correlation coefficient

Cor(D) = Σ_m (b′_{m−D} − ⟨b′⟩)(v_{m−D} − ⟨v⟩) / [√(Σ_m (b′_{m−D} − ⟨b′⟩)²) √(Σ_m (v_{m−D} − ⟨v⟩)²)],

where ⟨b′⟩ and ⟨v⟩ are the averages of b′_{m−D} and v_{m−D}. The correlation coefficient characterizes the reproducibility of the input data b′_m from the output s′_{m,i} and quantifies the memory function of the vortex free layer. When the input data can be reproduced completely, [Cor(D)]² → 1, whereas [Cor(D)]² → 0 when the vortex free layer cannot reproduce the input data. The short-term memory capacity is defined as

C_STM = Σ_{D≥1} [Cor(D)]².

Figures 3(a)-3(c) show the short-term memory capacity as a function of the delay time for the pulse widths of 25, 200, and 300 ns. A large short-term memory capacity due to the presence of the feedback effect is found, in particular for delay times slightly longer than the pulse width; see Appendix C for comparison, where the short-term memory capacity in the absence of the feedback effect is shown. We also show the value of the short-term memory capacity of an echo-state network [3] in Appendix D for comparison. An important finding of this work is a periodic behavior of the short-term memory capacity with respect to the delay time; i.e., the short-term memory capacity shows sharp drops at delay times τ ≃ nt_p, where n is an integer. The results reveal that the feedback effect does not always contribute to the enhancement of the memory function.
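Given reconstructions v and targets, [Cor(D)]² and the short-term memory capacity can be computed as below. The reservoir output is replaced here by a synthetic reconstruction whose quality degrades with D, so the correlation decreases with the delay and the capacity stays finite; the noise model is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
b = rng.integers(0, 2, size=2000).astype(float)  # binary input sequence

def squared_correlation(z, v):
    # [Cor(D)]^2: squared Pearson correlation between target and output.
    return np.corrcoef(z, v)[0, 1] ** 2

cors = []
for D in range(1, 6):
    z = b[D:]                                       # target b_{m-D}
    v = z + 0.5 * D * rng.standard_normal(z.size)   # degrades with D
    cors.append(squared_correlation(z, v))

# Short-term memory capacity: sum of [Cor(D)]^2 over D >= 1.
capacity = sum(cors)
print(cors, capacity)
```

With a real reservoir, v is the weighted output from the trained weights, and the sum runs until the correlations become negligible.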
The periodic behavior of the short-term memory capacity is also reflected in the correlation coefficients. Note that the correlation coefficients in the absence of the feedback effect decrease monotonically with increasing delay D; see Ref. 13. In the presence of the feedback effect, on the other hand, the correlation coefficients can show local maxima at finite D, as shown in Fig. 4(a). These local maxima contribute to the large short-term memory capacity. The monotonic decrease of the correlation coefficients, in contrast, appears when the delay time is an integer multiple of the pulse width, as shown in Fig. 4(b). The short-term memory capacity becomes small in this case.
We now consider the origin of the periodic behavior of the short-term memory capacity, although the relation between the network structure and the memory function in reservoir computing is not fully clarified yet [32-34]. We note that the feedback effect in Eq. (1) can be regarded as an interaction between virtual neurons. The presence of the feedback current in Eq. (1) means that the time evolution of a virtual neuron s_{k,i} depends on another neuron s_{l,j} (l < k) in the past [48]. When the delay time is an integral multiple of the pulse width (τ = nt_p), the neuron in the delayed information is the present neuron itself, i.e., j = i; see Fig. 5(a). Therefore, the feedback effect in this case is a self-feedback and does not provide any interaction between different neurons, as schematically shown in Fig. 5(b). On the other hand, when τ ≠ nt_p, the information of a different neuron s_{l,j} (j ≠ i) is used to determine the time evolution of the present neuron s_{k,i}; see Fig. 5(c). The feedback effect in this case can be regarded as an interaction between different neurons, as shown in Fig. 5(d). It is useful to remember that the memory function of a physical reservoir comes from the recurrent network between neurons [1-5]. Since the feedback effect does not provide interactions between the neurons for τ = nt_p, the memory function of the physical reservoir remains small. On the other hand, the feedback effect provides an additional interaction between the neurons for τ ≠ nt_p, resulting in the enhancement of the memory function. As a result, a periodic structure of the short-term memory capacity with respect to the delay time appears with the period t_p.

V. CONCLUSION
In conclusion, physical reservoir computing using a vortex-type ferromagnet in the presence of the feedback current was studied theoretically. A periodic behavior of the short-term memory capacity was found with respect to the delay time of the feedback current, where the memory capacity becomes large when the delay time τ is not an integer multiple of the pulse width t_p, whereas it shows sharp drops at delay times satisfying τ = nt_p with a positive integer n. The origin of the periodic structure can be attributed to an interaction between the virtual neurons via the feedback current.

Appendix A: Hopf bifurcation and limit-cycle oscillation of the vortex core

The Thiele equation, Eq. (1), written in terms of the normalized core position s and the phase ψ measured from the x axis, is given by

ds/dt = [a_J J p_z/G − (κ|D|/G²)(1 + ξs²)(1 + ζs²)] s + O(p_x R_0/R),
dψ/dt ≃ (κ/G)(1 + ζs²),

where ζ = κ′/κ. We note that the vortex-core size, R_0, is usually much smaller than the disk radius; see Sec. II for the values of the parameters. In addition, the Gilbert damping constant α in conventional ferromagnets, such as Fe, Co, Ni, and their alloys, is usually small (α ≪ 1) [49]. Therefore, neglecting the spin-transfer torque proportional to R_0/R, which originates from the in-plane component of the magnetization p, as well as higher-order terms in |D| ∝ α and s (< 1), the equation of motion of s becomes the Stuart-Landau equation for the real variable s,

ds/dt = as − bs³,

where a and b are defined as

a = a_J J p_z/G − κ|D|/G²,
b = (κ|D|/G²)(ξ + ζ).

The solution of the Stuart-Landau equation with the initial condition s(0) is

s(t) = s(0)e^{at} / √{1 + (b/a)[s(0)]²(e^{2at} − 1)}.

The solution indicates that the Hopf bifurcation appears at the current satisfying a = 0; i.e., the vortex core is stabilized at the disk center (s = 0) when J < J_c1, whereas the vortex core moves from the disk center to a limit-cycle oscillation with the radius lim_{t→∞} s(t) = √(a/b) when J > J_c1, where the critical current density J_c1 is given by

J_c1 = κ|D|/(G a_J p_z).

We note that a finite tilted component of the magnetization in the reference layer, p_z, is necessary to move the vortex core from the disk center.
This is because the injection of energy into the system becomes finite through the work done by the spin-transfer torque originating from the perpendicular component of p. We note that the above analysis explains the limit-cycle oscillation of the vortex core well. For example, using the values of the parameters mentioned in Sec. II, lim_{t→∞} s(t) = √(a/b) = 0.55, which agrees with the saturated value of s in the absence of the feedback effect shown in Fig. 2(a). We also note that the normalized radius should satisfy s ≤ 1. Therefore, the current density should be smaller than J_c2 = J_c1(1 + ξ + ζ). In summary, the Hopf bifurcation appears at the current density J_c1, and the limit-cycle oscillation is excited when the current density is in the range J_c1 < J < J_c2.
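As a consistency check, the Stuart-Landau equation ds/dt = as − bs³ can be integrated numerically and compared with its standard closed-form solution s(t) = s(0)e^{at}/√(1 + (b/a)s(0)²(e^{2at} − 1)). The sketch below uses arbitrary positive test values of a, b, and s(0) rather than the material parameters of Sec. II.

```python
import math

# Arbitrary test values (not the material parameters of the main text).
a_c, b_c, s0 = 2.0, 4.0, 0.1

def s_exact(t):
    # Closed-form solution of ds/dt = a s - b s^3 with s(0) = s0.
    e2 = math.exp(2 * a_c * t)
    return s0 * math.exp(a_c * t) / math.sqrt(1 + (b_c / a_c) * s0**2 * (e2 - 1))

# Simple explicit Euler integration of the same equation.
dt, s = 1e-5, s0
for _ in range(int(6.0 / dt)):
    s += dt * (a_c * s - b_c * s**3)

# Both approach the limit-cycle radius sqrt(a/b).
print(s, s_exact(6.0), math.sqrt(a_c / b_c))
```

The agreement of the numerical trajectory, the closed-form solution, and the limiting radius √(a/b) illustrates the Hopf-bifurcation picture used above.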
Appendix B: Evaluation method of short-term memory capacity

Here, we explain the procedure of the physical reservoir computing by focusing on the evaluation method of the short-term memory capacity [8,12,13]. In general, a physical system consisting of N interacting particles is considered as a reservoir, where N is the number of particles. In this section, however, we consider a single particle and regard its dynamical response to input data as an ensemble of numerous virtual neurons, as done in experiments.
We first apply an N_L-bit random input to the system, as shown in Fig. 6. As an example, let us assume that the input signal is an electric voltage V, as done in experiments [11,13]. The input data consist of a bias term V_bias and random binary data V_p as

V_in,k = V_bias + V_p b_k,

where b_k (= 0 or 1) represents the kth binary pulse. Note that the binary pulse, V_p b_k, is constant during a pulse width t_p. The kth binary pulse is applied to the system in the time range (k − 1)t_p ≤ t < kt_p. The number of inputs is N_L.
The physical system shows some output, such as an oscillating voltage emitted from a spin-torque oscillator [11]. In experiments [11,13,17], the output voltage is divided into amplitude and phase by the Hilbert transformation. The amplitude was used for the reservoir computing in Refs. 11,13, whereas Ref. 17 used the phase for the computation. Although the main text of this work uses the (normalized) amplitude s = |X|/R of the vortex core for the reservoir computing, the following procedure is applicable to computation using the phase. Let us therefore denote a general output from the system as u(t). As explained in the main text, we divide the output u(t) in the presence of the kth input V_in,k into N_node nodes and denote it as u_{k,i}, as shown in Fig. 6, i.e.,

u_{k,i} = u((k − 1)t_p + i t_p/N_node), i = 1, 2, · · ·, N_node.

The discrete output u_{k,i} is regarded as the ith neuron at time k. Figure 7 shows how the virtual nodes are defined from the magnetic vortex-core dynamics studied in the main text. The dynamical response of the vortex-core radius s(t) in the presence of the random binary pulses is shown in Fig. 7(a), which is the same as that shown in Fig. 2(b) in the main text. Figure 7(b) is an enlarged view during a relatively short time range. In this case, the pulse width t_p is 25 ns, whereas the node number N_node is 250. Therefore, the virtual nodes appear with a time interval of 0.1 ns, as indicated by the red circles in Fig. 7(b).
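The node definition above amounts to sampling the continuous output every t_p/N_node. A small sketch, where u(t) is a placeholder waveform standing in for s(t):

```python
import numpy as np

# Time-multiplexing: sample the continuous output u(t) at N_node equally
# spaced times within each pulse of width t_p, giving the virtual
# neurons u_{k,i}. The waveform here is a placeholder, not s(t).
t_p, N_node, N_L = 25e-9, 250, 4
u = lambda t: np.sin(2 * np.pi * t / (10 * t_p))

nodes = np.empty((N_L, N_node))
for k in range(N_L):
    for i in range(1, N_node + 1):
        # ith node of the kth pulse, sampled every t_p / N_node (0.1 ns here)
        nodes[k, i - 1] = u(k * t_p + i * t_p / N_node)
print(nodes.shape)
```

Each row of `nodes` is then one discrete-time state of the N_node virtual neurons.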
The physical reservoir computing aims to reproduce some data from the output u_{k,i}. The data to be reproduced are not necessarily the same as the input data, V_in,k, or equivalently, b_k. In general, the data to be estimated from the output u_{k,i} are called target data. Let us denote the target data as z_k. For the evaluation of the short-term memory capacity, the target data are the random binary pulses, i.e.,

z_k = b_k.

On the other hand, for the evaluation of the parity-check capacity, discussed in Refs. 12,17, the target output is chosen as

z_k = b_k ⊕ b_{k−1} ⊕ · · · ⊕ b_{k−D},

i.e., the parity of the current input and the past D inputs. A weight w_{D,i} is introduced to find a one-to-one correspondence between the system output u_{k,i} and the target output z_k. The weight is determined to minimize the error between the system output and the target output given by

Σ_k (z_{k−D} − Σ_{i=1}^{N_node+1} w_{D,i} u_{k,i})².   (B5)

The bias term of the system output is u_{k,N_node+1} = 1 [8]. The weight w_{D,i} is set so as to reproduce the (k − D)th target output from the system output u_k at time k. Note that the weight should be introduced for each kind of target output, i.e., the weight for the evaluation of the short-term memory capacity differs from that for the evaluation of the parity-check capacity. The process of determining the weight is called learning, and the target output z_k used for the learning is called training data.
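For reference, the parity-check target can be generated by summing bits modulo 2; a small sketch (the exact windowing convention at the sequence boundary in Refs. 12,17 may differ):

```python
# Parity-check target: z_k is the XOR (sum modulo 2) of the current bit
# and the D preceding bits. Near the start of the sequence, only the
# available bits are summed (a boundary convention assumed here).
def parity_target(bits, D):
    return [sum(bits[max(0, k - D):k + 1]) % 2 for k in range(len(bits))]

bits = [1, 0, 1, 1, 0, 1]
print(parity_target(bits, 1))  # pairwise XOR of neighbouring bits
```

For D = 0 the target reduces to the bits themselves, recovering the short-term memory target z_k = b_k.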
After the learning, other random binary data are injected into the system, where the number of input data N′_L is not necessarily the same as N_L used for the learning. We denote the system and target outputs in this process as u′_{n,i} and z′_n to distinguish them from u_{k,i} and z_k in the learning. Using the system output u′_{n,i} with the weight determined by the learning, we define

v_{n−D} = Σ_{i=1}^{N_node+1} w_{D,i} u′_{n,i}.

If the learning is sufficiently achieved, v_{n−D} will reproduce the target output well. To quantify the reproducibility, we evaluate the correlation coefficient given by

Cor(D) = Σ_n (z′_{n−D} − ⟨z′⟩)(v_{n−D} − ⟨v⟩) / [√(Σ_n (z′_{n−D} − ⟨z′⟩)²) √(Σ_n (v_{n−D} − ⟨v⟩)²)].

In the main text, u_{k,i}, as well as u′_{n,i}, and z_n are the normalized vortex-core radius s_{k,i} and the random binary input b_n because we focus on the evaluation of the short-term memory capacity. If we use, on the other hand, the target output for the parity check, the correlation coefficient for the evaluation of the parity-check capacity is evaluated. The capacity is generally defined as C = Σ_{D≥0} [Cor(D)]² (D = 0, 1, 2, · · ·), whereas the capacity in this work is defined from the correlation coefficients for D ≥ 1, as done in previous works [8,12,13,17]. The maximum value of the capacity is the number of nodes, i.e., C ≤ N_node. In the present work, we first solve the Thiele equation without random binary pulses from t = 0 to t = 5 µs. After that, 300 random pulses are applied as washout, and then N_L = 1000 pulses are applied for the learning. Then, a different set of 300 random pulses is applied again as washout. After that, N′_L = 1000 pulses are applied to evaluate the short-term memory capacity.
Appendix C: Short-term memory capacity in the absence of feedback effect

In the main text, we show the correlation coefficients and the short-term memory capacity in the presence of the feedback current. For comparison, here we show these quantities in the absence of the feedback effect. Figure 8 shows the square of the correlation coefficients obtained from the Thiele equation without the feedback effect, where the pulse width is t_p = 25 ns. The correlation decreases monotonically and rapidly with increasing delay D, as found in experiments [13,17]. We remind the reader that a similar behavior is observed even in the presence of the feedback effect when the delay time τ of the feedback current satisfies τ ≃ nt_p (n ∈ N). The short-term memory capacity evaluated from Fig. 8 is 0.74, which is smaller than the value in the presence of the feedback effect with τ ≠ nt_p; see Figs. 3(a)-3(c).

Appendix D: Short-term memory capacity of an echo-state network

An echo-state network consists of N_node nodes given as x(t_d) = [x_1(t_d), x_2(t_d), · · ·, x_{N_node}(t_d)], where t_d = 1, 2, · · · represents a discrete time. The time evolution of the nodes is given by

x(t_d + 1) = tanh[W x(t_d) + W_in b(t_d)],

where b is the binary input. The components W_ij and W_in,i of the N_node th-order matrix W and the vector W_in are random numbers. The matrix W corresponds to the weight of the internal connections between nodes, whereas the vector W_in corresponds to the weight applied to the input data [8]. In this work, we normalize W_in by dividing its components by the maximum absolute value of its components, |max[W_in,i]|. In addition, we evaluate the singular values of W and divide the components of W by the maximum singular value. The maximum singular value of the internal weight W is called the spectral radius [8]. This normalization of W makes the spectral radius in the present calculation 1.0. A training noise is neglected, for simplicity. Similar calculations are reported in Refs. 8,12.
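The echo-state network described above can be sketched in a few lines; the tanh activation is an assumption (the nonlinearity is the conventional choice for echo-state networks), and the normalizations of W and W_in follow the description in the text.

```python
import numpy as np

rng = np.random.default_rng(2)
N_node = 5

# Random internal weights W, rescaled so the largest singular value
# (the "spectral radius" in the text's convention) equals 1.0.
W = rng.standard_normal((N_node, N_node))
W /= np.linalg.svd(W, compute_uv=False)[0]

# Random input weights W_in, normalized by the maximum absolute component.
W_in = rng.standard_normal(N_node)
W_in /= np.abs(W_in).max()

def step(x, b):
    # One discrete-time update x(t_d + 1) = tanh[W x(t_d) + W_in b(t_d)].
    return np.tanh(W @ x + W_in * b)

x = np.zeros(N_node)
for b in rng.integers(0, 2, size=100):
    x = step(x, float(b))
print(np.abs(x).max())  # node states stay bounded inside (-1, 1)
```

The states x(t_d) collected over the input sequence play the role of u_{k,i} in Appendix B, so the same weight training and correlation analysis applies directly.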
We emphasize the difference between the spintronics reservoir in the main text and the echo-state network. The spintronics reservoir in the main text consists of a single device, and the virtual neurons are defined by dividing the dynamical response during a pulse input into N_node nodes. On the other hand, the echo-state network consists of N_node nodes, and we do not divide the time evolution of x during a pulse.
The system output u_{k,i} defined in Appendix B corresponds to x_i(k) of the echo-state network. Applying the procedure explained in Appendix B, the short-term memory capacity of the echo-state network can be evaluated. In this work, we use the same numbers of training data and washout pulses as those used for the spintronics reservoir. Figure 9 shows the dependence of the short-term memory capacity of the echo-state network on the number of nodes, N_node. Although the value of the short-term memory capacity depends on the random numbers in W and W_in, as well as on the spectral radius and the training noise, and thus the values in Fig. 9 might not be optimized, they are close to those found in previous works [8,12]. For example, the short-term memory capacity for N_node = 5 is about 2.2, which is smaller than the maximum values of the short-term memory capacity of the spintronics reservoir with the feedback effect; see Figs. 3(a)-3(c). The result in Fig. 9 implies that a single spintronics reservoir with the feedback current is comparable to an echo-state network consisting of 5-10 nodes.