Metaheuristic conditional neural network for harvesting skyrmionic metastable states

,


I. INTRODUCTION
In systems with many interacting entities, e.g., a chain of amino acids forming a protein [1], a metal cluster [2], or a system of spins [3], the Hamiltonian and its corresponding potential energy surface (PES) can quickly become highly convoluted as the number of interacting particles increases.This complexity is expressed in the form of rugosity of the PES, which in the case of coupled spins is driven by frustration or competing spin-spin interactions.In such cases, finding the global minimum is usually a challenging task due to the non-convexity of the high-dimensional function that represents the PES [4].
However, not only the global minimum, but also the numerous local minima in the PES can provide important information about a given system.In fact, many of the states corresponding to local minima can be long-lived at finite temperatures and have significant consequences for the physical behavior of the system.
Magnetic topological spin textures, e.g.skyrmions, are prime examples of such states.Depending on the intrinsic properties (interactions, geometry, anisotropy) and external conditions (field, temperature), skyrmions can emerge as metastable (excited) states in the configura- * These two authors contributed equally tion space, as confirmed by previous experiments [5][6][7][8].
Recently, more complex metastable skyrmionic textures such as skyrmion bags, skyrmionium, and skyrmion bundles have been observed experimentally [9][10][11][12][13].From the theory side, it is clear that chiral magnetic skyrmions with arbitrary topological charge should exist [14].This current interest in more complex metastable skyrmionic textures stems from that such structures may open the door to new topologically inspired spintronic and information-storage concepts.For example, combinations of several different types of skyrmionic structure might prove very useful to make the concept of a skyrmionic bit applicable in practice [15].Another example is skyrmion-based artificial synapses for skyrmionbased neuromorphic computing [16].In addition, the skyrmion Hall, skyrmion Seebeck, and skyrmion Nernst effects depend strongly on the topological charge.[17] From theory and computer simulations, some highorder topological spin textures have been identified [14,[18][19][20][21][22], but a vast flora with novel and potentially very interesting properties remains to be discovered and analyzed.This has motivated us to develop a computational method that can systematically harvest metastable skyrmionic spin textures, given a spin Hamiltonian defining the PES.
Here, we present and use a metaheuristic algorithm based on gradient-descent optimization within a conditional neural network, driven by random perturba-tions.A conceptual illustration of our method is given in Fig. 1(a).The goal of our method is to efficiently explore the rugged PES and its multiple energy minima, and in this way identify metastable and possibly long-lived states represented by the Hamiltonian.We demonstrate our method by investigating higher-order antiskyrmions within a model of a well-known frustrated system of interacting spins -the Pd/Fe/Ir(111) film.This system has been very thoroughly characterized experimentally and was one of the first studied systems containing twodimensional magnetic skyrmions [23], and has somewhat attained the status of "fruit fly" in the research on topological spin textures.

II. METHOD A. Metaheuristic conditional neural network
Our method consists essentially of two main ingredients: stochastic gradient-descent (SGD) optimization of the Hamiltonian parameterized in a shallow neural network framework, and a condition directing the global exploration of the PES.Both are described in more detail below.
For the first ingredient -i.e., the shallow neural network, in the following called NN-SGD, whose role is to identify local minima of the Hamiltonian -we use the AdamW [24,25] implementation of SGD within a single-layer feedforward NN to minimize a loss function L (X, W, b).Here, X is an input vector defining the starting point of the SGD optimization, whereas W and b are the weights and biases within the NN.The value of the loss function L (X, W, b) is provided by the Hamiltonian, i.e., the total energy of the system.We use a Heisenbergtype classical atomistic spin Hamiltonian of the form where S i is the normalized spin moment at site i and the total number of spins is n.J ij , D ij , K U i , µ i , e z and B ext are Heisenberg exchange interactions, Dzyaloshinskii-Moriya (DM) interactions, uniaxial anisotropy, the magnetic moment length of site i, the easy axis vector, and the applied field, respectively.Thus, to employ SGD for optimization we parameterize the neural network's feed-forward mapping as computation of a spin configuration (S 1 , S 2 , . . ., S n ).The structure of the NN-SGD is illustrated in Fig. 1(b).To initialize the NN, values for X, W , and b are randomly assigned from a Gaussian distribution with a mean of 0 and a standard deviation of 1.The corresponding spin configuration (S 1 , S 2 , . . ., S n ) can then be computed from X, W , and b using where W T is the transpose of W , and M = (m 1 , m 2 , m 3 , . . ., m 3n ) is a vector containing the components of all spins so that the x, y, and z components of S 1 are m 1 , m 2 , and m 3 , and the components of S 2 are m 4 , m 5 and m 6 , etc. Furthermore, Norm is the normalization operation so that each spin retains unit length, and LeakyReLU is the activation function, defined in [26].In our implementation, X is a vector of length k and the number of spins n, is 10 4 .Typically, k is much smaller than n.The role of the NN-SGD is to iteratively converge toward a local minimum of the loss function by adjusting the weights and biases.Convergence is achieved when the stopping condition is met, i.e., the difference in the loss function between two consecutive checkpoints is small enough.Here, is the value of the loss function at the i-th checkpoint and T s is the convergence criterion hyperparameter.In our implementation, typically every 50th optimization step is a checkpoint.Although the NN-SGD optimisation is conducted with respect to W and b, we use an input X to drive the process of the loss function minimisation, as X determines the initial conditions for the SGD and thus it influences which local minimum the NN-SDG converges to.In particular, we use a random vector X and to facilitate "jumping around" over the PES we define new initial conditions by adding random perturbations P to X, where P is a vector where some elements are randomly set to zero, and the remaining elements are random numbers taken from a Gaussian distribution.
This leads us to the second ingredient in our algorithm -the condition we impose, directing the global exploration of the PES.For each random input vector X generated (i.e., for each new "jump" over the PES), we first compute the value of the DM-term i̸ =j D ij • (S i × S j ) in Eq. (1) of the corresponding spin configuration.Only if its value is within a certain predefined range, we proceed with the NN-SGD optimization.This means that we will only investigate local minima pertaining to specific regions of the PES selected by the imposed condition.In more general terms, the approach can be expressed as follows.The imposed condition is where H C is some in principle arbitrary combination of spin-Hamiltonian terms, and H C as well as the thresholds T low

C
and T high C are selected based on physical insight.In practice, the values of T low C and T high C are found through trial and error.Also within the SGD loop, we perform occasional checks that the condition Eq. ( 4) is still valid, in order to minimise the risk of spending the compute time on converging toward an uninteresting local minimum.
In order to sample as many different parts as possible of the PES, we further impose an additional condition that the SGD starting point spin configurations we use be dissimilar to each other.In practice, this is done by creating a large number of randomized spin configurations (typically about 200, all fulfilling the condition in Eq. ( 4)) and, from these, select the one that is most dissimilar to the previously used starting point spin configuration.
When the optimization has converged to a local minimum, the corresponding spin texture is analyzed, for instance by computing its topological charge Q.The al-gorithm keeps generating new sets of X and explore the PES as already described above, until the preset maximum execution time is reached.The main parts of the workflow are summarized in a flowchart in Fig. 1(c).
Finally, let us briefly return to the issue of convergence, Eq. ( 3).We stated previously that the optimization toward the local minimum was performed using SGD.Naturally, this optimization can be done in several ways and there may be methods that are faster than SGD in certain regimes.In this work, we have therefore selected to use not only SGD but also augment it with a non-gradient method based on Metropolis Markov Chain Monte Carlo (MMCMC) when sufficiently close to convergence.We will in the following denote the first approach as the SGD-only mode and the latter approach as the hybrid mode.The SDG-only mode is named this way because it simply uses the SGD inherent in the NN algorithm.In the hybrid mode, we also start out by using SGD, but at a later stage, as convergence is approached, we switch to MMCMC.In essence, the MMCMC optimizer (as implemented in the Uppsala Atomistic Spin Dynamics, Up-pASD, package [27,28]) performs energy minimization under finite temperatures by using the transition probability P t between two spin configurations in a Markov chain: where ∆E is the energy difference between the spin configurations, k B is the Boltzmann constant, and T is the temperature of the system.The hybrid mode has certain advantages, which we discuss in more detail in Section II E. In all MMCMC simulations in the present work, the temperature was set to 1 µK.We note that in [29] a neural-network-based method is described and used to identify the spin spiral state in a model system.However, that method contains neither any condition directing the global exploration of the PES, nor any perturbation scheme for X.In fact, the goal of the method in [29] is to identify the global minimum.That is orthogonal to our goal, which is to harvest a large number of metastable textures.

B. Metastable spin texture isolation
The local minima identified using the method described above usually correspond to several well separated spin textures within the simulation cell.Therefore, for practical reasons, each spin texture that we wish to analyze further needs to be isolated and embedded into a ferromagnetic background.Doing this "by hand" is cumbersome and time consuming.To solve the problem, we designed a pipeline that takes advantage of an AI-driven segmentation model.This approach greatly simplifies the process of harvesting the spin textures we want to analyze further.In the current pipeline, since we are working with a 2D system, we decided to use the state-of-the-art pre-trained Segment Anything Model (SAM), which is designed for image segmentation but can be easily transferred to work for our purposes, i.e., identifying complex spin textures in 2D spin systems, due to its zero-shot generalization property [30].The workflow can be described as follows: 1. Translate the contents of the entire simulation cell (i.e, a normalized 3 × n matrix corresponding to a spin configuration) to an image by mapping the direction of each spin vector onto RGB color space.
2. Generate spin texture masks using SAM.
3. Isolate a selected spin texture with coordination information contained in the mask.
4. Calculate the topological charge of the isolated spin texture and embed it into a ferromagnetic background.
5. Store the embedded spin texture.

C. Computation of the spin Hamiltonian parameters and topological charges
The electronic structure calculations of the fcc-Pd/Fe/Ir(111) multilayer system were performed using Density Functional Theory (DFT).We employed the self-consistent real-space linear muffin-tin orbital method in the atomic-sphere approximation (RS-LMTO-ASA) [31,32] with the local spin density approximation (LSDA) exchange-functional [33].The resulting ab-initio magnetic moments, as well as coupling parameters J ij and D ij , used in Eq. ( 1), are described in detail in [34].To avoid effects coming from anticipated truncation, the coupling coefficients were considered up to 360 neighbors (within a distance of ∼ 7a from the reference site, where a = 3.84 Å is the experimental lattice parameter of the Ir host).Furthermore, the experimental value (0.4 meV/Fe [35]) is here assumed for the K U uniaxial anisotropy constant.Concerning the Zeeman term in Eq. ( 1), we set the external field to be B ext = 3.5 T and applied in the outof-plane direction ([001]).At this field, similarly to situation depicted in [36], the skyrmion lattice and ferromagnetic (single-domain) solutions are energetically almost degenerate [34], being isolated skyrmions characterized as metastable structures.Finally, for the Gilbert damping parameter in SLLG, we used the value α ∼ 0.01, which is in the order of magnitude of the intrinsic α in thin films (lower limit) [37].
In the approximation where the Pd-induced spin moments are treated as independent degrees of freedom [38], the Pd layer (when considered explicitly in the ASD simulations) reproduces the Fe layer spin texture [34].Therefore, we restrict ourselves to the analysis of the Fe layer.
As it is widely known, skyrmionic structures are characterized by the topological number Q, which in the limit of a two-dimensional continuum reads: that measures the winding number of S. As we consider a discrete medium, the total topological charge, Q tot , of each spin configuration was calculated using the method proposed by Berg and Lüscher [39], with periodic boundary conditions.The quantity Q tot accounts for all topological structures with individual topological charge (or winding number) Q i in the spin system (Q Following the same reasoning, each of such structures can be isolated (e.g., in a ferromagnetic background) for the evaluation of the respective topological charges Q i .

D. Atomistic spin dynamics simulations
To perform the stability analysis of the textures found by the algorithm presented in this work, we solve the stochastic Landau-Lifshitz-Gilbert (sLLG) equation [28], which is implemented in the UppASD package, and is given as: Here, ∂Si is the effective field on site i, related to the Hamiltonian defined in Eq. ( 1) and parametrized by the ab-initio quantities described in Section II C. The dimensionless (and isotropic) Gilbert damping parameter is here denoted by α, while γ L = γ/ 1 + α 2 is the renormalized gyromagnetic ratio (as a function of the bare one, γ).In turn, B f i is the stochastic field arising from the thermal fluctuations.This term describes how temperature effects are considered in the spin dynamics, in a Langevin approach [27].More specifically, B f i is modelled by an uncorrelated Gaussian white noise, so that the time-averages B f i (t) = 0 and B f iµ (t)B f jν (t ′ ) = 2Dδ ij δ µν δ(t−t ′ ) for the site indices i and j and the Cartesian coordinates µ, ν = {x, y, z}.In the second relation, the parameter D can be obtained via the fluctuationdissipation theorem [40] as D = αk B T /(γµ i ), for a given temperature T , where k B is the Boltzmann constant.
In Eq. ( 7), the first term expresses the precessional motion of atomistic magnetic moments in a given system, and the second term describes the damped motion (both in the adiabatic approximation, i.e., d |S i |/dt = 0).Besides the dynamics, in the absence of thermal effects, the solution of Eq. ( 7) with α > 0 corresponds to the relaxation process of finding the nearest energy minimum in the PES.
Throughout the text, we define the typical simulation times of 200 ps (for the stability of the topological charge) and 2 ns (for the calculation of the average energy/spin with T > 10 −4 K).The numerical solution of Eq. ( 7) is obtained with a time step of ∆t = 10 −16 s.

E. Comparison of the SGD-only and hybrid modes
From the user perspective, it is important that the computations converge quickly.Although we cannot directly compare the efficiency of the two optimization algorithms (NN-SGD and MMCMC) since, among other reasons, practically, in our implementation they use different architectures (GPU and CPU parallelized by OpenMP, respectively), and are written in different programming languages, we can compare the execution times of the SGD-only and hybrid modes from the user perspective.
To achieve this practical comparison, we introduce an intermediate convergence hyperparameter T int s in the context of Eq. ( 3).When the optimization has reached the intermediate level of precision defined by T int s , we either continue with the NN-SGD optimizer (the SGDonly mode), or switch to the MMCMC optimizer (the hybrid mode) until the stopping condition Eq. ( 3) is fulfilled for our final convergence hyperparameter T final s .In the present benchmark, we have used T int s = 10 −3 mRy/atom and T final s = 10 −6 mRy/atom.For both modes, the magnetic texture eventually converges to the same configuration on the PES, but with distinct execution times.Typical optimization procedures for the SGD-only and hybrid modes, for different initial spin configurations, are illustrated in Fig. 2(a) and (b), respectively.
In Fig. 2 (c), we compare the execution times of the two modes for ten randomly selected initial spin configurations.In order to produce a fair comparison, both algorithms (NN-SGD and MMCMC) start at T int s from the same point within each case, and the minimization procedure is stopped at T final s .The red (green) dots show the execution times, counted from the intermediate threshold, for the SGD-only (hybrid) mode.Clearly, the hybrid scheme shows consistently shorter execution times from the user perspective.

III. RESULTS
A. Textures with higher-order Q We applied our algorithm in the hybrid mode to systematically harvest topological spin textures in the fcc-Pd/Fe/Ir(111) system.Overall, given the right conditions such as for instance an appropriate magnitude of the applied magnetic field, we find that this system can hold many types of spin textures including skyrmions, skyrmionium, antiskyrmions, rings and bags comparable to the ones obtained in other systems [9,14,36,[41][42][43].For the remainder of this work, we have selected to focus our analysis on antiskyrmion textures.We found an abundance of such textures, and, curiously, they have been less discussed in the literature compared to, e.g., skyrmion bags.In Fig. 3, we show 15 selected spin textures that have been identified to be metastable by our algorithm, with the topological charge Q varying from 1 to −13.Note that throughout this work we use the convention that the skyrmion has topological charge +1, and antiskyrmions have negative topological charge.The first row in Fig. 3 shows a Q = 1 skyrmion, skyrmion-  4)) is encountered.Therefore, the algorithm creates a new starting point by adding Gaussian noise to X, and restarts the optimization toward another local minimum.The yellow dashed circle indicates the point at which the intermediate convergence criterion is fulfilled.At this point, the hybrid mode switches from using NN-SGD to MMCMC.Finally, the blue dashed circle points out the final converged solution.The parallelogram insets in both plots (a/b) show the magnetic textures of the system at different times during the minimization procedure.Illustrative movies of the optimization processes can be found in the Supplementary Material.Panel (c) shows the execution times of both algorithms, for ten randomly selected cases.The NN-SGD optimization was performed on a single Nvidia A100 GPU, whereas the MMCMC optimization used an Intel Xeon Gold 6130 CPU, parallelized with OpenMP.
ium (Q = 0), an three simpler antiskyrmions.As |Q| increases, the number of limbs and/or rings in the textures increase, see rows 2 and 3 in Fig. 3.For large |Q|, we find very convoluted textures with several rings and limbs combined.

B. Stability analysis and binding energies
Generally speaking, for a given topological charge Q, several spin textures are typically possible, with varying number of holes and chiral kinks.Here, holes are defined as non-self-intersecting curves that define closed domain walls in the structure.Chiral kinks are described in more detail in [43].To elucidate how the stability of the spin textures vary with Q, we have for each Q selected the lowest-energy spin texture found in our simulations.In Fig. 4, we show the computed energy difference with respect to the ferromagnetic single-domain state (Q = 0), ∆E = E − E FM for these selected spin textures.We see that the energy difference, ∆E, is almost linear with respect to Q.As Q decreases one unit, the energy per spin increases on average ∼ 0.6 µRy.Consequently, the antiskyrmion states with |Q| ≥ 1 are relatively close in energy.
To gain a more detailed understanding of the stability and lifetimes of these topological spin textures, we have analyzed them further using the concept of skyrmionic binding energy E b (Q) developed in [42], in combination with results from atomistic spin dynamics simulations using the UppASD software package [28].In [42], the skyrmion binding energy E b (Q) is defined as (respecting our Q sign convention) where , and m is the number of Q = 0 structures that compose the complex skyrmionic configurations.
Here, an underlying assumption is that the higherorder spin textures can be described as an excited state consisting of multiple bounded Q = ±1 (and eventually Q = 0) structures.Using this definition, in Table I we show the computed binding energies of higher-order antiskyrmions with Q ∈ [−5, −3], where m = 0.The negative E b 's obtained for these antiskyrmions mean that it is energetically favorable for lower-|Q| structures to combine together and form these more complex structures with higher total |Q|.
For comparison, we also computed the binding energy of a Q = 2 texture, in which two skyrmions are trapped inside skyrmionium, i.e., a skyrmion bag, see the last row in Table I.Note that for this case, Eq. ( 8) must account also for the skyrmionium energy, so that m = 1.We find that the binding energy of this skyrmion bag is positive.Similarly, skyrmions bags with Q > 2 were also found to have positive binding energies.Thus, the given (B, T ) conditions disfavor lower-Q structures to generate more complex structures with a higher positive-Q.
Figure 5 depicts the results from our spin-dynamics simulations for selected topological spin textures with a Q ranging from −3 to −6.Specifically, we have mapped out how the topological charge, decay time τ , and total magnetic moment changes with temperature and applied magnetic field.Here, the decay time τ is the time it takes for the spin texture to unpack, i.e., change its topological charge.Based on the decay time, final topological charges, and final magnetic moment (per site), and also visual inspection, we divided the charts into four basic zones.Thus, zone I indicates the region in field and temperature in which the topological particle-like structures are long-lived (here taken to mean a life time exceeding τ > 200 ps), keeping their initial real-space structure intact.When B and/or T become sufficiently high to reduce the binding energy (see Eq. ( 8)) by favoring the FM state or due to thermal fluctuations, these higher-order antiskyrmions unpack into lower-|Q| textures.These lower-|Q| structures can either be relatively stable or survive for just a few picoseconds, depending on the energy barrier height.This characterizes the transition region (zone II), where the initial higher-order antiskyrmions are unstable but topological particle-like structures are still present.With further increased temperature, even the lower-|Q| textures become unstable, gradually shifting to a mostly FM spin configuration (note that here T ≲ T C [34]), which defines zone III.Finally, in zone IV we find metastable configurations that resemble domain structures.In that zone, the sufficiently low B either drives the spin-spiral to be the dominant state, or there is a mix of a spin-spiral state and a skyrmion lattice.As a general trend, we notice that the region where the higher-order antiskyrmions are long-lived (zone I) diminishes with increasing initial |Q|.These observed patterns agree well with the information in Table I.E b increases with temperature, enhancing also the probability that high-|Q| structures separate into lower-|Q| (or Q = −1) constituents due to thermal effects.
The above analysis, together with the facts that (i ) single antiskyrmions are long-lived up to sizable temperatures (T ∼ 32 K at B ext = 3.5 T), which makes the population of these metastable states more likely (with a finite probability given by the Boltzmann factor); and (ii ) higher-order antiskyrmions are relatively close in energy (Fig. 4), suggests that some high-Q states in Pd/Fe/Ir(111) can be formed experimentally, at their critical temperatures, by merging Q = −1 textures, as similarly proposed in [44].The process of bound states Table I.Energies E(Q) of the topological structures with respect to the ferromagnetic (single-domain) background, and binding energies E b (Q) of higher-order antiskyrmions (see Eq. ( 8)), both given in µRy/atom.The out-of-plane applied field is B = 3.5 T, at the temperatures indicated in parenthesis.

C. Geometry-topology analysis
Our method is able to harvest numerous metastable topological spin textures.This opens the door for analyzing whole families of spin textures with the same Q.It is for instance interesting to gain insight into how the stability is affected by the internal structure of spin textures with a given Q.In Fig. 6, the relative energies of seven spin textures, all with Q = −5, are plotted.We see that the spin textures have different energies even though all of them have the same topological charge.This also implies that Eq.( 8) is indeed a simplification.The structures can be classified into three different groups.The textures shown in Figs. 6 (b)-(d) can all be described as a skyrmion ring (or hole) with limbs of varying number and lengths attached, whereas the texture in Fig. 6 (e) consists of several rings directly attached to each other.The remaining textures (Figs.6(f)-(h)) represent holefree antiskyrmions.Overall, the textures with holes have higher energy and are, consequently, less stable than the ones without holes inside.This result agrees well with what can be observed from Figs. 3 and 4, i.e., spin textures with holes tend to be less stable as compared to the textures without holes.We stress that the textures in Fig. 6 do not cover the whole diversity of solutions for Q = −5 (some other possible metastable states with Q = −5 are given in Ref. [43]).

IV. DISCUSSION AND CONCLUSIONS
Several factors may influence the results obtained with the computational approach we have presented and used in this work.Which metastable spin textures (i.e., local minima) our method finds of course depends on which starting points we allow for the optimization.Thus, tuning the condition in Eq. (4) will strongly affect the results.In this work, we have selected to impose a simple condition on the DM interaction strength, i.e., T low C > 0.01.Many other options are possible, including imposing several different conditions simultaneously.A condition that would select on exchange frustration may tip the balance in favor of other types of local minima [21,41].
An important issue is the shape of the energy barriers surrounding each local minimum.The probability for a starting point X to end up on a wider energy barrier is higher than for it to end up on a more narrow energy barrier.Thus, our method will naturally select for local minima with wide (but perhaps low) energy barriers, unless this is remedied by imposing a condition to specifically select for narrow energy barriers.
In addition, the stability of the high-order topological spin textures may depend quite sensitively on the parameters used in the spin Hamiltonian.As an example, we mention that our |Q| > 6 textures have a sensitive dependence on the exchange frustration.In our case, the major source of frustration is the next-nearest neighbor interaction shell, so that the relation J 3 /J 1 (see Appendix VII B) -where J n denotes the Fe-Fe interaction from a reference site to its n-th neighboring shell -plays a significant role.Indeed, a change of ∼ 10% in J 3 /J 1 can lift the theoretical metastability of those configurations.In the context of stability analysis, it is worth noting that there exist additional higher-order interactions that have not been taken into account in this study.These higher-order interactions can influence the energy barrier height as has been demonstrated in recent investigations on skyrmions The energy scale here is set so that the energy is zero for the texture with the lowest energy per atom.[45,46].Optimizing the condition (4) for various scenarios and performing additional sensitivity analysis will be the subject of future work.
To conclude, we have presented a new computational approach based on a conditional neural network, aimed at identifying large numbers of higher-order topological spin textures.Using the Pd/Fe/Ir(111) model system as an example, we have harvested numerous hitherto unidentified antiskyrmion spin textures.The identified spin textures have in turn enabled us to perform a systematic analysis of how the relative stability of spin tex-tures depends on temperature, external magnetic field, the topological charge as well as the number of holes in the texture.
V. DATA AVAILABILITY All data is available from the authors upon request.

VI. CODE AVAILABILITY
All the relevant code is available from the authors upon request.

VII. SUPPLEMENTARY A. Minimum energy paths
The calculation of the minimum energy paths (MEPs) between two local minima in the energy surface results in the corresponding activation barrier.Considering one of the states as the ferromagnetic (single-domain) configuration, then we can define the so-called path to collapse, and the resulting curve can be directly related to the stability of the topological texture.Here, the geodesic nudged elastic band (GNEB) method [47], as implemented in UppASD, is used in order to determine the MEP to collapse of a single skyrmion (or antiskyrmion) in Pd/Fe/Ir(111), considering the configuration space determined by the parameters described in Ref. [34].The resulting MEPs are shown in Fig. 7, where the highest activation barrier is obtained for the isolated skyrmion.This is consistent with previous calculations (e.g., Ref. [48]), and demonstrates the enhanced stability of skyrmions in the context of annihilation to the ferromagnetic state.
B. Intra-layer interactions in Pd/Fe/Ir(111) Fig. 8 shows the isotropic Heisenberg exchange interactions of a reference Fe site with its surrounding Fe neighbors, calculated using the magnetic force theorem [49] and analyzed in detail in [34].Here, J n denotes the interaction between the reference site and the n-th neighboring shell.

C. Supporting movies
Examples of the energy minimization process and the resulting metastable state of Pd/Fe/Ir(111) containing high-order antiskyrmions, obtained with the use of the SGD-only (hybrid) mode, can be found in the movies S1 (S2).In the movies, the frames in red and blue colors refer to optimization with NN-SGD, and the gray and yellow frames refer to optimization using MMCMC, respectively.The red/gray colors represent the out-ofplane component of the magnetic moments, while the blue/yellow colors represent the in-plane components.

Figure 1 .
Figure 1.(a) Conceptual illustration of the overall problem.In a complex PES (the middle picture), we wish to efficently identify local minima containing interesting skyrmionic spin textures (top picture) and avoid converging toward minima with uninteresting spin textures (bottom picture).The yellow and red stars show the initial and final positions for the corresponding NN-SGD optimizations.(b) Chart of the fully connected feed-forward NN-SGD.A random vector X of length k is used to select the starting point of the SGD optimization.The values mi on the output nodes define the spin configuration, see Eq. (2).The loss function L is the spin Hamiltonian, Eq. (1).(c) Flow chart showing the main features of the algorithm.For more details, see Section II A.

Figure 2 .
Figure 2. Illustration of (a) SGD-only mode (NN-SGD), (b) hybrid mode (NN-SGD followed by MMCMC), and (c) comparison of the execution times of the two modes from the user perspective.Panels (a) and (b) show schematically the optimization procedures for the two modes.Note that the two optimizations in (a) and (b) start from different (random) spin configurations, and cannot be compared directly.The red dashed circle is the starting point (a random spin texture).At the green dashed circle, a spin configuration that does not fulfill Eq. (4)) is encountered.Therefore, the algorithm creates a new starting point by adding Gaussian noise to X, and restarts the optimization toward another local minimum.The yellow dashed circle indicates the point at which the intermediate convergence criterion is fulfilled.At this point, the hybrid mode switches from using NN-SGD to MMCMC.Finally, the blue dashed circle points out the final converged solution.The parallelogram insets in both plots (a/b) show the magnetic textures of the system at different times during the minimization procedure.Illustrative movies of the optimization processes can be found in the Supplementary Material.Panel (c) shows the execution times of both algorithms, for ten randomly selected cases.The NN-SGD optimization was performed on a single Nvidia A100 GPU, whereas the MMCMC optimization used an Intel Xeon Gold 6130 CPU, parallelized with OpenMP.

Figure 4 .
Figure 4. Relative energy per atom of isolated spin textures with topological charge Q ranging from −13 to 1. (For completeness we have also included the Q = +2 and Q = +3 cases in the plot).The total energy is calculated by embedding each topological excited state in an fcc-Pd/Fe/Ir(111) supercell i.e., a single-domain ferromagnetic background with size 23.6 × 27.2 nm 2 .The energy scale in the figure is set so that the lowest energy (i.e., the Q = 1 skyrmion state) is zero, for visibility.The black dots are computed energies and the blue dashed line is a linear fit to the data.The horizontal red dashed line shows the energy of the ferromagnetic state.The inset shows the energy gap between the Q = −1 spin texture and the ferromagnetic state.
a n.d.= not defined generation, however, must overcome an energy barrier.

Figure 5 .
Figure 5. Magnetic field versus temperature stability maps of antiskyrmions with topological charge varying from Q = −6 up to Q = −3.Columns 1 to 3 show the final topological charge, topological charge decay time and the normalized magnitude of the final magnetic moment, respectively.The top row is for Q = −6, the second row is for Q = −5, the third row is for Q = −4 and the bottom row is for Q = −3.In the right-most column we show typical final magnetic textures in each zone (I, II, III and IV).

Figure 6 .
Figure 6.(a) Relative energy, with respect to the ferromagnetic single-domain state, per atom of the isolated topological textures with topological charge Q = −5 in a magnetic field of 3.5 T, shown in panels (b) to (h).The energy scale here is set so that the energy is zero for the texture with the lowest energy per atom.

Figure 7 .
Figure 7. Minimum energy paths to collapse, at T = 10 −4 K and B ext = 3.5 T, for: (a) single skyrmion; (b) single antiskyrmion on Pd/Fe/Ir(111).The energies are given with respect to the initial state (i.e., isolated Q = ±1 structure) and mapped as a function of the reaction coordinate corresponding to the collapse path to the ferromagnetic (FM) single-domain state.The maximum energy represents a first order saddle point in the potential energy surface.

Figure 8 .
Figure 8.The Heisenberg Fe-Fe exchange interactions as a function of the (normalized) distance in Pd/Fe/Ir(111).