Crowdsourcing human common sense for quantum control

Citizen science has been applied with great success to help solve complex numerical challenges within microbiology, e.g. protein folding. Quantum physics offers similarly complex challenges, yet there is no systematic research effort exploring the potential of citizen science in this domain. We take first steps in this direction by introducing a game, Quantum Moves 2, and comparing, in three different quantum control problems, the efficacy of various optimization methods: gradient-based optimization with player-based and random seeds, and gradient-free optimization with random seeds. Although approximately optimal results can be obtained within reasonable computational resources, the considered problems fall into distinct degrees of complexity. Players can apply the gradient-based algorithm to their own seeds, and we find that these results perform roughly on par with the best of the standard optimization methods. This highlights the future potential for crowdsourcing the solution of quantum research problems. Further, cluster-optimized player seeds were the only method to yield roughly optimal performance across all problems. We are aware that comparison with random seeding constitutes the lowest bar among the plethora of possible optimization approaches, and our work should therefore be taken only as a necessary first demonstration of the potential for further exploration. Additionally, since the three challenges are much simpler than the problem of folding proteins, they lend themselves to different research questions. Where protein folding focuses on the results of a small subset of players achieving near-expert status, we explore the bulk behavior of all players. The relative success of the player seeds indicates, in our view, the potential value of studying the crowdsourcing of common sense (the innate responses of all players to the interface) as inspiration for expert optimization.


I. INTRODUCTION
Despite amazing advances in the past years, it has become increasingly clear that pattern-matching results from deep learning algorithms alone can be surprisingly brittle [1]. This failure has been attributed to, among other reasons, a lack of hierarchical learning, transfer between problems, and common sense comprehension of real world phenomena [2]. Common sense has been defined in many different ways, but here we refer to it as implicit and shared fundamental assumptions that people have about the world [3]. Additionally, there are indications that humans may sometimes solve computationally hard problems quickly and near-optimally [4,5]. However, humans just as often fail miserably [6]. Thus, many argue that optimized synergetic systems integrating individuals or collectives of humans and machines offer a promising human-in-the-loop approach to tackle complex problems [7][8][9][10]. One key challenge in this approach is to perform large-scale studies of human capacities, such as common sense and the development of rich cognitive models [2]. Initial steps in this direction can be taken by exploring problems in research relevant contexts, such as in the related fields of citizen science and collective intelligence [8,10], and detailed comparisons between human [5] and AI [11] performance are becoming feasible for problems such as protein folding.
In this paper, we take steps towards exploring how citizen input can be applied to solving relevant problems in quantum mechanics. In particular, we examine the efficiency of player solutions as seeds for a local optimization algorithm in a dynamic quantum state transfer task. We analyze three distinct scientific challenges, where the performance of the hybrid (human-computer) approach is compared against standard algorithms from quantum optimal control and computer science. The human input was gathered through our game, Quantum Moves 2, which allowed players both to create seeds and to subsequently engage with the local optimization algorithm embedded in the game and see its action in real time. Please note that a previous version of the game has already examined one of these challenges [12]. Inadvertently, an error discovered by A. Grønlund [13] in the numerical algorithm used for its analysis invalidated the quantitative conclusions, and Ref. [12] has therefore been retracted. The new game, launched in December 2018, explicitly contains numerical benchmark curves for the players to attempt to beat. These are based on corrected and updated algorithms [14][15][16] with performance verified to equal that subsequently discussed by A. Grønlund [17]. Our code will be made available at https://gitlab.com/quatomic/quantum-moves2.
In order to address the tenability of game-based exploration of quantum research problems, we pose here five specific questions about the current and potential scientific contribution to quantum physics and related fields: Q1: Can a suitable gamified interface allow citizen scientists to solve quantum control problems entirely on their own, using their own hardware to run the required computation (e.g. using an in-game optimizer), with a quality on par with traditional expert-driven optimization? If this is the case, then such an interface would allow for a novel form of online, quantum citizen science combining human problem solving with crowd computing. In such a framework, one could imagine quantum researchers continually feeding in optimization challenges that are then solved efficiently by the community at no computational cost to the researchers.
Q2: Can the player-generated solutions, in combination with concrete algorithms, provide an edge against purely algorithmic solutions, and how does that depend on the type and mathematical complexity of the problem?
Q3: Could citizen science games be first steps towards playful, explorative tools for domain experts? This approach is currently being pursued in microbiological research settings [18] but not yet in quantum physics. As a related question, can the citizen science experience be expanded to include citizen contributions to more aspects of the scientific method, such as data analysis and, ultimately, hypothesis formation and problem formulation? If such so-called extreme citizen science [19] could be understood and systematically implemented, then this would constitute a major advance in the World Economic Forum challenge of understanding complex problem solving [20].
Q4: If the games are sufficiently challenging for humans, can they be used for systematic studies of human problem solving? In our group we have started to investigate this within the setting of quantum experiment optimization [21] and by developing cognitive science variants of quantum challenges [22]. Q5: If larger portions of the player base can, using the gamified interface, make non-trivial contributions to several classes of research problems, could this potentially contribute to the solution of one of the major roadblocks on the path to domain-general AI according to e.g. the author of Ref. [2], that is, the crowdsourcing and algorithmization of human common sense?
The latter two questions are very speculative and move well beyond the realm of quantum physics and the scope of this paper. We currently pursue Q3 in other work (manuscript in preparation) by developing an intuitive and visual quantum programming environment [23]. Here, we shall focus merely on addressing Q1 and Q2 in a systematic manner. For this purpose, we compare optimization of player seeding to uniform random seeding across a range of distinct problems. We are aware that making comparisons with entirely random seeding constitutes the lowest possible bar of comparison with the plethora of possible numerical optimization approaches. As we shall see below, player seeds are indeed, for the investigated problems, on average more efficient than the randomly generated ones. This broad contribution of the players stands in stark contrast to e.g. citizen science projects like Foldit, where only a small fraction of players provide a scientific contribution after an extensive training process. In order to distinguish these two types of citizen science challenges, we assert in this work that the Quantum Moves 2 game to some extent taps into certain aspects of common sense (shared tacit knowledge of reasonable behavior) of the player population at large. Although we do not consider it within the scope of this work to verify the explicit nature of this common sense, it seems reasonable to conjecture that the liquid analogy of the game interface taps into the classical intuition for sloshing water, which was also proposed by D. Sels in his numerical analysis of one of the problems [24].

II. QUANTUM OPTIMAL CONTROL
The hallmarks of the second quantum revolution [25] are the exploitation and engineering of fragile, isolated quantum objects. For example, quantum computing on any platform is predicated on precise control of the constituent qubits and associated gate operations. Additionally, the control must be carried out expeditiously so as to avoid decoherence and other effects detrimental to the overall goal. Controls meeting these criteria and more can be obtained within the well-established theory of quantum optimal control.
A common class of quantum optimal control problems deals with facilitating a particular initial-to-target state transfer, |ψ_0⟩ → |ψ_tgt⟩, for some fixed process duration T. The manipulatory access to the state evolution is through a set of control parameters {u(t)}, where each solution (specific choice of functions) uniquely maps to a final state, |ψ_0⟩ → |ψ(T)⟩ (see Appendices B-C). In this context, the transfer fidelity F for each fixed T can be interpreted as a high-dimensional optimization (or control) landscape, as illustrated in Fig. 1. Optimal controls (or solutions) can then be associated with points in the landscape that exceed some threshold fidelity value. This value can either be (1) absolute, regardless of T, or (2) relative to the global maximum at a given T. To avoid ambiguity, we take the latter definition in this work. Additionally, for a given fidelity requirement F, we associate a fundamental quantum speed limit, T^F_QSL, below which no maximum exceeds F. T^F_QSL is thus defined as the shortest duration at which at least one maximum can attain the given F. Common choices for threshold values are F = 0.99, 0.999, 0.9999, . . . , depending on the context, characterized by a trade-off between F and T^F_QSL (increased precision leads to longer durations).
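For concreteness, the threshold and speed-limit definitions above can be summarized as (a sketch in standard notation; the exact fidelity functional used in the game may differ in detail):

```latex
F(T) = \bigl|\langle \psi_{\mathrm{tgt}} \,|\, \psi(T) \rangle\bigr|^{2},
\qquad
T^{F}_{\mathrm{QSL}} = \min\Bigl\{\, T \;:\; \max_{\{u(t)\}} F(T) \ge F \,\Bigr\},
```

i.e. the quantum speed limit is the shortest duration at which the best possible control attains the required fidelity.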
In the limit T → ∞, most problems become easy in the sense that many global maxima with F ≈ 1 exist. As T → 0, however, the control problem becomes increasingly difficult as previously global maxima gradually become sub-optimal and the control landscape becomes more rugged. The usually unfavorable topography of the control landscape in the T ≈ T^F_QSL regime therefore makes uncovering global maxima especially difficult for the aforementioned high-fidelity requirements.
At its core, any iterative optimization algorithm attempting to locate global extrema must prescribe a meaningful way to traverse the optimization landscape. It must thus strike a balance between local (exploitation) and global (exploration) search methodologies. A common optimization paradigm is to initialize an algorithm, e.g. one excelling in finding the nearest local optimum, from many different seeds (starting points), thereby introducing a simple global component [26]. In principle, a local optimizer maps each seed to its nearest attractor [27]. Sub-optimal local optima are often called local traps, referring to the propensity of such optimizers to locate these and terminate (since they are "stuck"). The effectiveness of this paradigm is therefore strongly correlated with the combination of seeding strategy (the mechanism with which seeds are generated) and choice of optimization algorithm.
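As a concrete illustration of this multi-start paradigm, the following sketch runs a local L-BFGS optimizer from many random seeds and keeps the best attractor. The toy landscape, function names, and seed ranges are our own illustrative choices, not the paper's implementation:

```python
import numpy as np
from scipy.optimize import minimize

def multistart(cost, seeds):
    """Run a local quasi-Newton optimizer from every seed and keep the best result."""
    best = None
    for s in seeds:
        res = minimize(cost, s, method="L-BFGS-B")  # strictly local search
        if best is None or res.fun < best.fun:
            best = res
    return best

# Toy multimodal landscape: many local traps, global optimum at x = 0.
cost = lambda x: float(np.sum(x**2 + 0.5 * np.sin(8 * x)**2))
rng = np.random.default_rng(0)
seeds = rng.uniform(-3.0, 3.0, size=(20, 1))  # uniform random seeding (rs)
result = multistart(cost, seeds)
```

Each seed flows to its nearest attractor; only the random restarts supply the global component.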
All seeding strategies cover some particular region of the optimization landscape, as depicted in Fig. 1. A good seeding strategy not only covers high-quality regions but also has a high probability of generating a seed that reaches optimality upon subsequent optimization by the chosen algorithm. Effective seeding strategies and algorithms naturally become increasingly important with growing problem complexity, which can broadly be characterized along two axes: the computational (or numerical) complexity of the underlying simulations and the inherent topographical complexity of the landscape. The computational difficulty can also be interpreted in terms of the amount of available resources. Below, we discuss problems with two different degrees of computational difficulty, drawing upon the single-particle Schrödinger equation and the nonlinear dynamics of Bose-Einstein condensates.

FIG. 1. Abstract illustration of an optimization (control) landscape with two control parameters. Each point in the plane corresponds to a particular set of functions {u_1(t), u_2(t)} and the height of the landscape is given by the associated fidelity. Only a small region of control space is optimal (with respect to F ≈ 1) and contains many local traps. Purple region: Uniform random seeding. Obtaining F ≈ 1 is possible, but with low probability. Green region: Specialized seeding. Obtaining F ≈ 1 is possible with high probability, but not guaranteed. Such a region is conventionally targeted by experts through their accumulated knowledge, problem insights, heuristics, and programming proficiency. Non-expert players may be useful in uncovering this region without extensive prior training.
Possible seeding strategies range from uniform random guessing (least imposed structure) to parameterizations based on highly domain-specific expert knowledge or heuristics (most imposed structure), as depicted by the colored regions in Fig. 1. Thus, discovering the "good" regions of the optimization landscape can be very challenging. We investigate whether non-expert players may be useful in identifying good regions. Each player can be considered an independent, adaptive seeding strategy that incorporates into the optimization loop complex decision-making processes that are otherwise difficult to capture and implement programmatically. This methodology may be used as a general means of extracting features and heuristics for a given problem, which could then aid experts in further analysis and guide the development of seeding strategies.

III. OVERVIEW OF QUANTUM MOVES 2
In Quantum Moves 2, the player's goal is to solve various state transfer problems, referred to in-game as levels. Each level concerns 1D transfers of either single-particle or Bose-Einstein condensate (bec) wave functions ψ(x,t) = ⟨x|ψ(t)⟩, both describable by the Hamiltonian

H = -ℏ²/(2m) ∂²/∂x² + V(x, {u(t)}) + g|ψ(x,t)|²,

where taking the non-linear coupling parameter g = 0 corresponds to the single-particle case. While there is no in-game distinction in their representation, the numerics involving a bec (g ≠ 0) are more intricate, which has consequences for the efficiency of some optimization algorithms as discussed later. The potential has up to two controllable parameters, u_1(t) and u_2(t), depending on the level.
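A minimal numerical sketch of propagating ψ(x,t) under such a Hamiltonian is the split-step Fourier method. This is our own illustrative implementation with ℏ = m = 1 and assumed variable names, not the game's verified propagator [14-16]:

```python
import numpy as np

def split_step(psi, V, g, dx, dt, hbar=1.0, m=1.0):
    """Propagate i*hbar dpsi/dt = (-hbar^2/(2m) d^2/dx^2 + V + g|psi|^2) psi.

    V has shape (steps, n): the (possibly time-dependent) potential per step."""
    n = psi.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)
    kin = np.exp(-1j * hbar * k**2 * dt / (2.0 * m))  # full kinetic step in k-space
    for Vt in V:
        psi *= np.exp(-1j * (Vt + g * np.abs(psi)**2) * dt / (2.0 * hbar))  # half potential step
        psi = np.fft.ifft(kin * np.fft.fft(psi))                            # kinetic step
        psi *= np.exp(-1j * (Vt + g * np.abs(psi)**2) * dt / (2.0 * hbar))  # half potential step
    return psi

# Free evolution of a normalized Gaussian (g = 0 reduces to the linear case).
x = np.linspace(-10.0, 10.0, 256, endpoint=False)
dx = x[1] - x[0]
psi0 = np.exp(-x**2 / 2.0).astype(complex)
psi0 /= np.sqrt(np.sum(np.abs(psi0)**2) * dx)
V = np.zeros((50, x.size))
psi_T = split_step(psi0.copy(), V, g=0.0, dx=dx, dt=0.01)
```

Since every factor has unit modulus and the FFT is unitary, the norm of ψ is conserved, a useful sanity check for any propagator.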

Scientific problems of various complexity
Here we briefly describe and motivate the three main levels in the game. Fig. 2 displays their in-game representation. See Appendix A for a more complete description of the game interface.
Bring Home Water: A single atom resides in the ground state of a static tweezer and must be picked up and shuttled back into the ground state at the original location of the movable tweezer. This type of transfer is necessary for implementing quantum computations in neutral atoms based on collision gates [12,28].
Splitting: A bec initially resides in the ground state of a single-well configuration on an atom chip and must be transferred into the ground state of a double-well configuration by deforming the potential. The split condensate can then be used for matter-wave interferometry [29][30][31][32].
Shake Up: A bec initially resides in the ground state of a single-well configuration on an atom chip and must be transferred into the first excited state by shaking the potential. The excited state of the bec acts as a source for twin-atom beams [14,[31][32][33][34].

IV. ALGORITHMS AND SEEDING STRATEGIES
In this section we specify the suite of different algorithms and seeding strategies under consideration. We define a method as a particular combination of algorithm and seeding strategy. For example, grape pr-rs is the method that uses the grape algorithm to optimize preselected random seeds. Further details of the numerics and each algorithm are included in Appendices B-C.

FIG. 2. Top: Bring Home Water (single particle). Middle: Splitting (bec). Bottom: Shake Up (bec). The instantaneous density |ψ(x,t)|² (red line) must be transferred into the target density |ψ_tgt(x)|² (yellow line) without residual excitation. The trapping potential (green line) is parametrized by the position of the round, draggable cursor, {u_1(t), u_2(t)} = {f_1(x_cursor(t)), f_2(y_cursor(t))}, which is unable to leave the turquoise bounding box (control boundaries). The functions f_1 and f_2 are linear. The wave function densities are offset by the potential, |ψ(x)|² + V(x), for illustrative purposes (e.g. the Splitting target density represents two equally sized wave packets trapped in a double well: the large bump in the center is the barrier and the two smaller bumps are the wave packets).

A. Algorithms
grape: Standard gradient-based optimization using the l-bfgs quasi-Newton search direction with line search. Bandwidth limitation (smoothness) is included through a derivative-regularization cost term.
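The derivative-regularization cost term can be sketched as follows. This is an illustrative stand-alone version with a hypothetical weight γ; the in-game cost functional and its scaling are more elaborate:

```python
import numpy as np

def derivative_regularization(u, dt, gamma):
    """Cost gamma * sum_i ((u_{i+1} - u_i)/dt)^2 * dt, penalizing jagged,
    high-bandwidth controls, together with its analytic gradient for l-bfgs."""
    du = np.diff(u) / dt
    cost = gamma * np.sum(du**2) * dt
    grad = np.zeros_like(u)
    grad[:-1] -= 2.0 * gamma * du   # d cost / d u_i   (left end of each difference)
    grad[1:] += 2.0 * gamma * du    # d cost / d u_{i+1} (right end)
    return cost, grad
```

In practice such a term is simply added to the infidelity 1 − F before handing the total cost and gradient to the quasi-Newton routine.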
pgrape: Player grape. The player can start and stop the optimization. Otherwise, it is identical to grape. The algorithm runs locally inside the game.
Stochastic Ascent (sa): Gradient-free, maximally greedy, time-local search. T is segmented into n_b bins of equal width within which the control values are constant. The bin values are updated in a stochastic order. The bandwidth limitation (smoothness) is inversely proportional to n_b (no additional cost term in the current implementation).
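A gradient-free sweep of this kind can be sketched as follows. This is our own minimal illustration using a surrogate fidelity function; the in-game sa differs in its value discretization and stopping criteria:

```python
import numpy as np

def stochastic_ascent(fidelity, u, values, rng, max_sweeps=20):
    """Maximally greedy time-local search on a piecewise-constant control u:
    visit the bins in stochastic order and set each to the best value on a grid."""
    best = fidelity(u)
    for _ in range(max_sweeps):
        improved = False
        for b in rng.permutation(len(u)):        # stochastic bin order
            trial = u.copy()
            for v in values:                     # exhaustive search along one bin
                trial[b] = v
                f = fidelity(trial)
                if f > best:
                    best, u[b], improved = f, v, True
        if not improved:                         # no bin improved: terminate
            break
    return u, best

# Surrogate separable "fidelity", peaked at a known target control.
target = np.array([0.2, -0.4, 0.6])
surrogate = lambda u: -np.sum((u - target)**2)
rng = np.random.default_rng(1)
u_opt, f_opt = stochastic_ascent(surrogate, np.zeros(3), np.linspace(-1, 1, 11), rng)
```

Because each update is an exhaustive search along one dimension at a time, the method can step out of shallow traps that would stop a strictly local gradient step.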

B. Seeding Strategies
As outlined in Sec. II, seeding strategies are a fundamental component of optimization. Drawing from random distributions provides the most generic way of seeding: Random Seed (rs): The control is assembled by independently sampling a uniform distribution within given boundaries (Appendix B) for each control parameter u_p(t), p = 1, 2. Subsequent control values are completely uncorrelated. To introduce correlations, one can also segment T into n_b bins of equal width w such that the control value is initially constant within each bin (see also Appendix C).
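A binned uniform random seed of this kind can be generated as follows (a minimal sketch; the variable names are ours):

```python
import numpy as np

def random_seed(n_t, n_b, lo, hi, rng):
    """Piecewise-constant rs seed: n_b equal-width bins spanning n_t time steps,
    each bin value drawn independently from U(lo, hi)."""
    bin_values = rng.uniform(lo, hi, size=n_b)
    edges = np.linspace(0, n_t, n_b + 1).astype(int)  # equal-width bin boundaries
    u = np.empty(n_t)
    for b in range(n_b):
        u[edges[b]:edges[b + 1]] = bin_values[b]
    return u

seed = random_seed(n_t=100, n_b=10, lo=-1.0, hi=1.0, rng=np.random.default_rng(0))
```

With n_b = n_t every time step is drawn independently, recovering the completely uncorrelated limit.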
In previous work [12] we employed random seeding in frequency space, which has the advantage of enabling frequency cutoffs and thereby adjustable levels of smoothness of the seeds. This, along with other types of basis parametrization, also requires the introduction of additional parameter choices. Although we have also conducted the analysis below for this type of seed parametrization, it will not be presented here in order to preserve the clarity of the presentation, as it generally performed on par with or slightly worse than uniform random seeding for the chosen parameters.
Quantum Moves 2 provides two novel seeding strategies: Player Seed (ps): The control is assembled by mapping the player's cursor position during gameplay as a function of time, u_1(t) = f_1(x_cursor(t)) and u_2(t) = f_2(y_cursor(t)). Player Optimized Seed (po): This seeding strategy describes a ps seed that has been optimized by the player (pgrape), u_1(t) = pgrape(f_1(x_cursor(t))) and u_2(t) = pgrape(f_2(y_cursor(t))). If the optimization was stopped before convergence, the optimized control can be used as a seed for further optimization. If the optimization converged, the seed itself is already a local optimum. From a resource perspective, these seeds are very valuable since they come partially or fully pre-optimized at no cost to the research team.
A heuristic extension of any seeding strategy is preselection: Preselection (pr): A naïve greedy heuristic for choosing which seeds to optimize. Given a set of candidate seeds and their associated fidelities, we perform optimization only on the seeds with the highest initial fidelity.
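This heuristic amounts to a simple ranking step (an illustrative sketch; function and variable names are ours):

```python
import numpy as np

def preselect(seeds, fidelities, k):
    """pr heuristic: keep only the k candidate seeds with highest initial fidelity."""
    order = np.argsort(fidelities)[::-1]  # indices sorted by descending fidelity
    return [seeds[i] for i in order[:k]]

top = preselect(["s1", "s2", "s3", "s4"], [0.10, 0.90, 0.50, 0.70], k=2)
```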

V. BRING HOME WATER
In this section we present optimization results obtained in Bring Home Water for the different methods described in Sec. IV. Outside the game, optimizations were performed in large batches on a computer cluster. Each seed was optimized until convergence (step size below 10^-7), exceeding a fidelity threshold F = 0.999, or exceeding the allotted optimization time. The initial minimal allotted time was roughly 13 minutes per seed. To maximize resource use, excess time from seeds that terminated due to the other criteria was divided equally amongst the remaining seeds and added to their minimal times. Inside the game, pgrape ran locally on the player devices until manually stopped by the player, convergence (step size below 10^-7), or F = 0.999, incurring no computational cost for us. We denote by pgrape ps + po the joint result of the player seeds and their optimizations. Figure 3 shows the density distribution of solution infidelities (1 − F) [35] as a function of process duration T in the high-fidelity regime for various methods. The densities estimate the probability distribution P(F|T) of obtaining a particular fidelity for a given T, since each column is individually normalized. The solid line shows the best obtained grape ps results from Fig. 4 for reference. Figure 4 shows the aggregate, monotonically best 1 − F as a function of process duration T. The blue dots show all results produced only by players, i.e. player seeds (ps) and player-optimized seeds (po). The red dots show the result of optimizing the ps with the computational resources described above. Lines with (without) dots show the best results for methods based on player (random) seeds.
For the grape ps density and the pgrape ps + po results (Fig. 3), we observe two clear bands of solutions, each described by distinct exponential behavior, which hints at two corresponding solution strategies. This is verified and analyzed in Sec. V A using clustering techniques. The identification of the two exponentially gapped solution strategies and the relative likelihoods of different methods identifying each is the first of three main novel findings in this section.

FIG. 3. Each density is normalized for every individual T and thus represents an estimate of the probability distribution P(F|T) of obtaining a particular F for a given T. The reference curve shows the best obtained results for grape ps. Purple crosses indicate individual solutions at densities lower than 0.002.
FIG. 4. Aggregate, monotonically best optimization results (lower is better) for several methods in Bring Home Water. Solid lines with (without) dots show the best results obtained with (without) player influence. The scattered blue dots show all results produced by players seeding and optimizing their own seeds. The scattered red dots show the same, except the optimization is carried out with the computational resources described in the text (i.e. no player influence after seeding). The dot translucency indicates the density distribution.
The optimal strategy changes from one to the other near T = 0.092 ms, explaining the kink in the best-result reference curve seen in Fig. 3 and the departing line of red dots in Fig. 4. The gap between the strategies increases exponentially, leading to significantly different quantum speed limit estimates T^F_QSL (as defined in Sec. II). Outside of these strategies, the densities are mostly sparse but non-zero, indicating a topographically complex optimization landscape containing many sub-optimal local optima or regions with near-vanishing gradient. This is also reflected in the interspersed red and blue dot distributions in Fig. 4.
We now examine grape rs, where the only difference to the former methods is the seeding strategy. In this case, we observe that only the sub-optimal strategy is discovered, with only two (out of 20734) instances located in the strategy gap. Evidently, for the same optimization algorithm, the structure of player seeds is preferable. The deficiency of grape rs is analyzed in Sec. V B.
Next, we turn our attention to the sa rs variants with full resolution (n_b = n_t) and reduced resolution (n_b = 40). As discussed in detail in Appendix C, sa is expected to be efficient for linear problems (g = 0), preferably with a single control parameter. For Bring Home Water the former is satisfied by the problem definition and the latter can be satisfied by choosing the tweezer amplitude control such that it is maximally deep at all times [24], u_2(t) = u_2^max, and thus only optimizing u_1(t). Indeed, this fixation heuristic is the strongest type of structure to impose on the problem, and its motivation is rooted in insight into the particular problem (steeper potentials yield faster dynamical time scales). This type of heuristic may not always be readily available, e.g. if the control parameters must simultaneously change in a non-trivial way to facilitate the transfer (e.g. the θ(t) and β(t) parameters in Ref. [16]). Nevertheless, in this case the control landscape dimension is halved since only u_1(t) is randomly seeded and optimized. Keeping both control parameters variable would present the fairest comparison, but here we choose the heuristically constrained version to be consistent with previous studies [17,24]. Likewise, the number of bins for the reduced-resolution variant, n_b = 40, was chosen based on these studies.
With these choices, sa rs (n_b = n_t) is capable of discovering the two strategies with remarkable efficiency, despite the u_1(t) seeding mechanism being structureless. Only a few low-fidelity solutions occur. sa rs (n_b = 40) does not identify the optimal solution branch and instead concentrates on a broad band of solutions with a center shifted away from the upper branch. This can be attributed to the reduced resolution of the algorithm, since it cannot resolve the dynamics finely enough. It is safe to conjecture that a more careful choice of the heuristic bin parameter 40 < n_b < n_t would yield results matching (n_b = n_t) above some resolution threshold.
Both grape ps and sa rs (n_b = n_t) find the same estimates T^{F=0.99}_QSL ≈ 0.0973 ms and T^{F=0.999}_QSL ≈ 0.1057 ms, whereas the estimate from pgrape ps + po is off by less than 1% with respect to F = 0.99. The fact that the purely player-optimized curve roughly matches the best results optimized on the computer cluster represents the second novel finding of this section. This, and similar findings for the two remaining challenges, represents our experimental confirmation of Q1 from the introduction. grape ps, pgrape ps + po, and sa rs (n_b = n_t) are also all able to find both solution strategies, but sa rs (n_b = n_t) is the most efficient at optimizing low-fidelity solutions into high-fidelity solutions. This cannot be attributed to the different seeding mechanisms, since grape rs fails to find the optimal solution strategy. Instead, the difference is due to how the two optimization algorithms traverse the landscape (Appendix C) and how they respond to being near a local, sub-optimal trap. sa performs exhaustive search (within its discretization) along a single, stochastically chosen dimension at a time, and it is therefore able to escape the initial trapping regions likely to be found by structureless random seeding. Eventually, the optimal strategies are discoverable provided that their attractors are sufficiently broad. On the other hand, grape is principally a strict local optimizer and will rapidly converge to the local trap; the only possibility for escape is if the inexact line search accepts a step size beyond the trap. Thus, when using grape, the structureless rs seeding is more likely to get stuck in local traps, whereas the more structured ps seeding is more likely to yield better results under the assumption of sufficiently broad attractors.
Observations on the above end-of-optimization results do not account for the associated computation times. Figure 5 shows the optimization trajectories for the rs methods as a function of wall time for the same 100 seeds at T^{F=0.99}_QSL [36]. These are separate from the results in Figs. 3-4, and each seed had in this case a maximum optimization time of roughly 13 minutes. The quantile statistics consider only ongoing optimizations, i.e. they do not include solutions converged at earlier times. We clearly see that sa (n_b = 40) rapidly improves the fidelity initially, lending merit to its usefulness (see Appendix C for a discussion). This is owed to the speed with which a full iteration can be carried out, i.e. one in which all parts of the control domain are adjusted once. The mean time per complete iteration is 14.2 ± 1.4 s. Further speed-up can be achieved as pointed out in Appendix C. However, progress stagnates and terminates before F = 0.99 due to the reduced resolution. A better choice for the heuristic resolution parameter 40 < n_b < n_t would address this issue. The full-resolution sa (n_b = n_t) does not suffer from stagnation, but progress is initially orders of magnitude slower since control values at successive times are completely uncorrelated and a full iteration takes much longer. The mean time per complete iteration is 219 ± 31 s. grape completes an iteration in 0.27 ± 0.03 s and performs about the same as sa (n_b = n_t) until around 400 s. Beyond this point sa (n_b = n_t) dominates in the mean, reflecting that grape rs finds only the sub-optimal strategy and with less efficiency. Recall that, as opposed to grape, sa optimizes only u_1(t) and is not subject to derivative regularization [15] (smoothness criterion) of the control; the algorithm therefore effectively solves an easier problem in the current implementation.
Although addressing both of these points would lead to roughly a doubling in computation time for obtaining similar results and more constrained controls, we do not believe it would drastically change the conclusions in terms of overall performance.
Appendix D provides an alternative statistical characterization of each method, as well as the effect of the preselection heuristic.

A. Optimal Strategies -Control Clustering
In order to extract the identified solution strategies, we apply dbscan clustering [37] to the grape ps method. Based on the results presented in the previous section, we expect the existence of distinct solution strategies, i.e. families of solutions that have a similar functional shape and characteristics but possibly different durations. Such solutions can be transformed into each other by time compression or expansion. The dbscan clustering algorithm allows even controls that are far away from each other (as measured by e.g. the Euclidean metric) to be in the same cluster, provided there are other controls bridging the gap between them. This property enables solutions from a given optimal strategy to be grouped within the same cluster.
The clustering was performed only on the duration-normalized tweezer position, u_1(t/T). To simplify the clustering, all solutions were given the same number of points by interpolation onto a 1000-point grid (the original number of points was defined by T/δt). We used the sklearn implementation of dbscan with the Euclidean metric, ε = 3, and a minimum number of neighbors min_samples = 5.
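The pipeline described above can be sketched as follows. The parameters ε = 3 and min_samples = 5 match the text, but the controls here are synthetic stand-ins, not game data:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_controls(controls, n_grid=1000, eps=3.0, min_samples=5):
    """Resample duration-normalized controls u_1(t/T) onto a common grid and
    cluster with dbscan; label -1 marks unclassified controls."""
    s = np.linspace(0.0, 1.0, n_grid)
    X = np.vstack([np.interp(s, np.linspace(0.0, 1.0, len(u)), u) for u in controls])
    return DBSCAN(eps=eps, min_samples=min_samples, metric="euclidean").fit_predict(X)

# Two synthetic "strategy" families plus one outlier control.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
family_a = [np.sin(np.pi * t) + 0.01 * rng.standard_normal(t.size) for _ in range(10)]
family_b = [-np.sin(np.pi * t) + 0.01 * rng.standard_normal(t.size) for _ in range(10)]
outlier = [5.0 * np.cos(3.0 * np.pi * t)]
labels = cluster_controls(family_a + family_b + outlier)
```

Controls within a family lie within ε of each other after resampling, so each family forms a density-connected cluster, while the isolated outlier is labeled -1.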
Initial results revealed two major and three minor clusters. By inspection it was found that each smaller cluster was in fact identical to one of the major clusters, apart from an initial delay in placing the tweezer near the atom. In order to filter out this physically insignificant delay, we apply a correction procedure to all the controls, dropping all initial points where u_1(t/T) < 0 before the first instance of u_1(t/T) ≥ 0. Applying the clustering with the same parameters returns a significantly simpler result, as seen in Fig. 6. We now find only two major clusters (labeled 0 and 1) with increased populations and a "cluster" of unclassified controls (labeled -1) with reduced population compared to the unfiltered case. Each cluster exhibits a strikingly exponential trade-off between fidelity and duration, well described by the fits, and it is clearly seen that the strategy gap widens exponentially. Assuming these trends can be extrapolated, this yields T^F_QSL,0 / T^F_QSL,1 = 1.17, 1.26, 1.33 for F = 0.999, 0.9999, 0.99999, respectively. The drawbacks associated with the longer transfer times are accentuated by decoherence and decay mechanisms present in realistic settings, which themselves exhibit exponential behavior.
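The delay-removal correction can be sketched as (illustrative; the array names are ours):

```python
import numpy as np

def remove_delay(u):
    """Drop the physically insignificant initial delay: discard leading samples
    with u < 0 occurring before the first sample where u >= 0."""
    nonneg = np.nonzero(u >= 0)[0]
    if nonneg.size == 0:
        return u                      # no anchor point; keep the control unchanged
    return u[nonneg[0]:]

trimmed = remove_delay(np.array([-0.5, -0.2, 0.1, -0.3, 0.4]))
```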
Examination of the unclassified controls suggests that these solutions do belong to one of the two strategies but have a few local defects (sudden displacements) that make them sub-optimal with respect to the nearest cluster and make them appear far away from other cluster members under the Euclidean metric. Thus, effectively only two high-fidelity strategies exist.
Physically, the strategy corresponding to cluster 0 begins by placing the tweezer in front of the atom, providing immediate acceleration towards the target position. We name this the front-swing strategy. Conversely, the strategy corresponding to cluster 1 begins by placing the tweezer behind the atom, providing immediate acceleration away from the target position. We name this the back-swing strategy. In both instances, there is very little deviation from the cluster mean except during the shuttling of the atom, where small deviations are allowed. In our Quantum Moves 1 work [12], we named these the shoveling and tunneling strategies, respectively. However, as pointed out in Ref. [24], the "tunneling" solution exhibits largely classical motion (|ψ(x, t)|² remains unimodal at all times), and hence that naming is not quite appropriate.

FIG. 6. Bring Home Water: Clustering on controls with removed delay. The cluster color code and their populations are shown in the legend. Top: Each cluster (0 and 1) corresponds to a strategy, and the gap between them exhibits an exponential 1 − F(T) behavior, leading to different estimates of quantum speed limits for a given threshold. The unclassified points (-1) are mostly populated by controls similar to either strategy, except for a few local defects that make them appear distant to the cluster with respect to the Euclidean metric. Bottom: Lines correspond to cluster means and shaded areas to the standard deviation. The most populous cluster (0) corresponds to a front-swing strategy with the tweezer initially placed in front of the atom, whereas the less populous cluster (1) corresponds to a back-swing strategy with the tweezer initially placed behind the atom.
Intuitively one might expect that the back-swing strategy would be slower because the atom must travel an overall longer distance compared to the front-swing strategy, but this is evidently not the case. Instead, initially displacing the atom onto the static well's right side can serve as an additional accelerating force to that of the movable tweezer. With this analysis, we are in a position to explore why grape rs fails to find the back-swing strategy in the following section.

B. GRAPE RS – Efficiency vs. Optimality
It is clear from Fig. 4 that grape rs (with n_b = n_t) fails to find the back-swing solution, whereas the same optimization algorithm with player-based seeds succeeds. As shown in Sec. V A, this strategy requires the control to initially be located on the right-hand side of the static tweezer (u_1 > x_0) for a time interval sufficient to displace the atom. To gain a sense of timescales, the harmonic approximation to the fully overlapped tweezers (u_1 = x_0) yields an oscillation period of T = 2π/ω ≈ 0.1 (in simulation units, see Appendix B). Assuming T/10 is a sufficient response time for meaningful dynamics, the probability of consecutively sampling the corresponding number of points, n = (T/10)/δt = 10^-2/(3.5·10^-4) ≈ 29, such that they all satisfy u_1 > x_0 is vanishingly small, P(u_1 > x_0)^n = (0.25)^n = 6·10^-18, when successive control values are uncorrelated. Even for the much less conservative estimate of T/40, the probability 5·10^-5 remains strongly suppressed. Since grape traverses the landscape locally, it is therefore almost guaranteed that subsequent optimization leads to the front-swing strategy. The same is not observed for sa, since the one-dimensional exhaustive search allows for the adjustment of points with u_1 < x_0 into u_1 > x_0.
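The probability estimate above can be reproduced with back-of-the-envelope arithmetic (a sanity check, not part of the original analysis; the per-step probability P(u_1 > x_0) = 0.25 and the time-step size are taken from the text):

```python
# Rough probability that an uncorrelated random seed keeps the tweezer on the
# right-hand side of the static trap (u_1 > x_0) for a sufficient window.
T_osc = 0.1         # harmonic oscillation period 2*pi/omega (simulation units)
dt = 3.5e-4         # time-step size from the text
p_right = 0.25      # per-step probability of sampling u_1 > x_0

n = (T_osc / 10) / dt           # ~29 consecutive steps for a T/10 window
p_strict = p_right ** n         # vanishingly small, of order 1e-18
p_lenient = p_right ** (n / 4)  # T/40 window: still strongly suppressed
print(n, p_strict, p_lenient)
```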
One obvious criticism of the difference between randomly and player-seeded behavior is thus that, at a very high time resolution, the random seed oscillates very rapidly, whereas the player seeds are probably typically rather smooth due to the physical limitations on the speed of the players' cursor movements. One might then hypothesize that sufficiently coarse-grained, piecewise constant random seeds would also yield similarly good behavior. If this were the case, the player superiority would indeed be a trivial artifact of the choice of discretization. To test this, we therefore study the behavior of grape rs optimizations when heuristically introducing correlations in the random seed by dividing the seed into n_b ≤ n_t piecewise constant control segments (as with sa, but in this case the subsequent optimization is not constrained to this parametrization). This makes it increasingly likely for the seed to initially begin and remain on the right side of the atom for a sufficient amount of time. Figure 7 shows the results of optimizing 2000 of these seeds near T_QSL^{F=0.999} as a function of n_b, as well as the grape ps results from Figs. 3-4. In the limit n_b = n_t, there is a high probability of finding solutions belonging to the front-swing strategy with associated fidelities around 0.99. This is consistent with the findings in Fig. 3. Only for n_b ≲ 4 is there an appreciable, albeit low, probability (about 1 to 3.5%) of identifying back-swing solutions with much higher associated fidelities above 0.99, whereas more than 91% of the density resides below F = 0.9. The increase in low-fidelity solutions can be attributed to the fact that for e.g. n_b = 2 (which has the highest empirical probability of finding the back-swing), there is an increased probability of placing the control tweezer far away from the atom and never touching the atom at all, leaving it in the initial state and resulting in a vanishing gradient (it is proportional to ⟨ψ_tgt|ψ(T)⟩).
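A piecewise constant random seed of the kind described above might be generated as follows (a sketch; the function name, bounds, and the example values of n_t and n_b are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def piecewise_constant_seed(n_t, n_b, lo=-1.0, hi=1.0):
    """Random seed with n_b <= n_t piecewise constant segments: one uniform
    random value per segment, repeated to fill the n_t-point time grid.
    The bounds lo/hi stand in for the actual control limits."""
    values = rng.uniform(lo, hi, size=n_b)
    # distribute n_t points as evenly as possible over the n_b segments
    lengths = np.full(n_b, n_t // n_b)
    lengths[: n_t % n_b] += 1
    return np.repeat(values, lengths)

seed = piecewise_constant_seed(n_t=100, n_b=5)
```

Lowering n_b correlates ever longer stretches of the seed, so a single segment placed at u_1 > x_0 already covers the response window discussed above.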
On the other hand, the probability of grape finding the back-swing strategy with player seeds is much larger (about 13%) with only 36% of solutions below F = 0.90.
The non-trivial observation that player seeds outperform randomly generated piecewise constant seeds at all coarseness scales represents the third novel contribution of this section. An astute reader might now point out that the shapes of both optimal strategies in Fig. 6 suggest an obvious improvement: employing piecewise linear rather than piecewise constant seeding. Indeed, a similar observation in [12], based on bulk analysis of player-based solutions, led to the so-called hilo seeding strategy. It solved the problem with remarkable efficiency using only three free seed parameters and thereby demonstrated that the Bring Home Water problem is indeed simple when viewed in the appropriate parametrization. Similar insights were obtained based on randomly seeded grape results in Ref. [13]. A figure similar to Fig. 7 with piecewise linear grape rs results would therefore yield a dramatically different picture. Our main conclusion is therefore not that a standard, heuristic-free numerical approach will fail on the problem but that it can fail.
We stress again that comparison to randomly generated seeds represents the lowest possible bar of comparison. It should therefore in no way be taken as proof that players can provide computational improvements to this or other problems. It does, however, constitute a necessary first demonstration of the value of examining the gamified approach further and comparing to or integrating it with more sophisticated expert heuristics.
As a final remark in this section, we note that, for this problem, the exclusive identification of the sub-optimal front-swing strategy from the presented grape rs results by itself points to the piecewise linear seeding, which, if applied in subsequent optimization, would directly lead to the discovery of the optimal back-swing strategy. This relies on the fact that both solution strategies are captured within the simple parametrization of two piecewise linear segments. In general, such a similarity between distinct solution strategies is not guaranteed. As an example, we find in Sec. VII that the Shake Up problem also contains a plurality of strategies which are related to one another in a more subtle way. Bulk analysis of solutions from one strategy would thus not be sure to yield a problem parametrization that also encapsulates the other strategies. This underscores the potential usefulness of having available data sets that have been generated with sets of basic underlying assumptions that are as different as possible.

FIG. 8. Splitting: solution densities. Each density is normalized for every individual T and thus represents an estimate of the probability distribution for obtaining a particular F for a given T (P(F|T)). The red reference curve shows the best obtained results for grape ps.
FIG. 9. Aggregate, monotonically best optimization results (lower is better) for several methods in Splitting. Solid lines with (without) dots show the best results obtained with (without) player influence. The scattered blue dots show all results produced by players seeding and optimizing their own seeds. The scattered red dots show the same, except the optimization is carried out with the computational resources described in the text (i.e. no player influence after seeding). The dot translucency indicates the density distribution.

VI. SPLITTING
In this section, we present optimization results obtained for the Splitting problem using the same suite of methods, convergence criteria, and computational resources as in Sec. V.
The solution densities are shown in Fig. 8 and the aggregate results are shown in Fig. 9. Looking at the solution density for grape ps, we see a single, dominant band of solutions with only a sparse population of solutions away from it. This hints at a simple, almost trap-free, easily navigated control landscape containing a very broad attractor for a single optimal strategy. Indeed, this is also understood by observing the dot density shift in Fig. 9: there are very few blue dots in the sea of red dots, meaning that almost all of them coincide with the best curve. A similar situation can be observed for grape rs, except for a modest increase in low-fidelity solutions but with virtually no population between these and the optimal band. Specifically, the average total density per bin below F ≈ 0.9 is (18 ± 4)% for rs and (11 ± 5)% for ps. Player seeds are thus slightly more likely to be within reach of the optimal attractor. Both methods obtain estimates T_QSL^{F=0.99} ≈ 0.92 ms and T_QSL^{F=0.999} ≈ 1.05 ms. The estimates from pgrape ps + po are off by less than 2% in both instances.
The sa methods perform significantly worse than the grape methods on this problem because g ≠ 0. Even the best sa (n_b = 40) results do not reach the same fidelity, possibly due to a combination of reduced controllability (low resolution) and the computational penalties associated with g ≠ 0 as described in Appendix C. The full-resolution sa (n_b = n_t) also fails to converge almost everywhere, except at T sufficiently larger than T_QSL^{F=0.99}. This reaffirms that the control landscape topography is benign enough that even an inefficient algorithm can find the optimal strategy with appreciable probability at these durations. Figure 10 shows the optimization trajectories as a function of wall time for the same 100 seeds at T_QSL^{F=0.99}. The conclusions are readily apparent in this data as well.

A. Optimal Strategies – Control Clustering
Even without clustering, the optimal strategy was apparent. Figure 11 shows the mean of all controls within 0.02 of the optimal fidelity as a function of T > 0.2 ms. Performing clustering with ε = 5 and min_samples = 5 on the grape ps or grape rs results verifies that this problem has a simple optimization landscape: 3569 of the controls were associated with a single cluster, whereas 56 controls were unclassified.
For all T, the mean takes an initially maximal control value for an extended period. This physically corresponds to the double-well barrier in the center of the trap (where the wave function is initially localized), providing maximal acceleration to split the condensate into two equal wave packets. At low T the mean control exhibits a bang-bang structure, which tapers off as T increases. Near T = 0.4 ms there is a sharp, single transition between extremal values, after which another, smoother transition is introduced. As the third transition is introduced around T = 0.65 ms, the mean control becomes increasingly blurry, indicating a growing departure from bang-bang structures.

FIG. 11. Splitting: mean of (near) optimal solutions ⟨u_2(t/T)⟩_opt. For a given T (y-axis), the color indicates the mean optimal control value at a given t/T (x-axis). Only a single cluster was identified.

FIG. 12. Shake Up: solution densities. Each density is normalized for every individual T and thus represents an estimate of the probability distribution for obtaining a particular F for a given T (P(F|T)). The red reference curve shows the best obtained results for grape ps.
FIG. 13. Aggregate, monotonically best optimization results (lower is better) for several methods in Shake Up. Solid lines with (without) dots show the best results obtained with (without) player influence. The scattered blue dots show all results produced by players seeding and optimizing their own seeds. The scattered red dots show the same, except the optimization is carried out with the computational resources described in the text (i.e. no player influence after seeding). The dot translucency indicates the density distribution.

VII. SHAKE UP
In this section, we present optimization results obtained in Shake Up for the same suite of methods, convergence criteria, and computational resources as in Sec. V.
The solution densities are shown in Fig. 12 and the aggregate results are shown in Fig. 13. The kinks in the reference curve in Fig. 12 are more readily understood by looking at the red dots in Fig. 13 (corresponding to grape ps solutions). The several pronounced staircase-like plateaus suggest the existence of multiple strategies that are relevant at different duration intervals. These are examined more closely in Sec. VII A and are found to correspond to periodic solution strategies. Each plateau extends over the next, meaning that the newly sub-optimal strategy remains a prominent attractor for quite some time. As the grape solution densities show, this makes the new optimal strategy much harder to find. For grape ps, the density splits when crossing the kinks, and one plateau in particular remains the main attractor, obfuscating the optimal strategy that exists in the wing of the distribution. On the other hand, the density of very low fidelity solutions is sparse and independent of T. This is because partial transfers are not difficult to achieve: the initial state density can very quickly and easily be overlapped with one of the lobes of the double-peaked target state density (see Fig. 2) by a small constant displacement. Looking at the aggregate results, we indeed observe that the red dots are separated from the upper, thick sea of blue dots. The abundance of the latter at low fidelity is due to players terminating the optimization prematurely.
No single method is the best for all T. This is contrary to both Bring Home Water and Splitting, where grape ps, pgrape ps + po and either sa rs or grape rs, respectively, found the optimal solutions independently. In Shake Up, however, the ps seeds seem to have the upper hand around T = 0.85 ms and after T = 0.98 ms, while the rs seeds seem to be better between those durations. The grape rs density provides some nuance to this observation. It shows a broader, less dense distribution of high-fidelity solutions and the addition of many very low fidelity solutions as T increases. In fact, near T = 1.05 ms the monotonically best results are due to only a few points in an otherwise empty density region. In contrast, for grape ps, the same region has a fairly high density, making it the statistically superior method in the T = 1.05 ms regime. We thus obtain slightly different quantum speed limit estimates from the two methods. A caveat to this comparison is that there are more rs seeds than ps seeds. To make the comparison fair, we form subsets by dividing T into 24 bins and randomly drawing rs seeds equal to the number of ps seeds in each bin. The presented solution density and aggregate results are representative of the statistical outcome of this procedure. For the sa rs methods, the (n_b = n_t) and (n_b = 40) variants both fail entirely to find any meaningful results. Whereas the Splitting control landscape was benign enough to compensate for the computational difficulties associated with g ≠ 0, this is clearly not the case in Shake Up.
The optimization trajectories of the same 100 seeds at T = 0.89 ms (close to the ps solution at 0.887 ms) for the three rs methods are shown in Fig. 14. sa (n_b = 40) and sa (n_b = n_t) complete an iteration in 159 ± 13 s and > 800 s, respectively, but never reach high fidelities. grape completes an iteration in 0.76 ± 0.08 s and fares comparatively much better.

A. Optimal Strategies – Control Clustering
For Shake Up, the optimized controls did not possess any readily apparent structure. When individually plotted alongside the corresponding position expectation value of the wave function, however, an oscillatory structure begins to emerge. Subtracting the expectation values, u(t) − ⟨x(t)⟩, we observe that these relative controls are dominated by low-frequency cosine components. Decomposing the relative controls as

c_k = (2/T) ∫_0^T (u(t) − ⟨x(t)⟩) cos(πkt/T) dt    (8)

thus yields low-dimensional vectors c = (c_0, ..., c_5) in frequency space (with corresponding number of oscillations N_k = k/2) on which we apply clustering. We join the grape ps and grape rs result sets, selected by the criteria 0.267 ms < T < 1.068 ms and F > 0.6, since the player and random seeds are dominant in different regions. We use ε = 0.1 and min_samples = 250. These parameters were chosen such that a single cluster is identified per cosine component. Figure 15 shows the clustering results. Each cluster is labeled by its dominant coefficient, corresponding to a half-integer or integer number of oscillations, which is evident from the cluster means when transforming back into real space. When solutions are colored according to their cluster membership in the aggregate plot, we uncover a clear hierarchy of the solution strategies corresponding to the clusters: within the full set of time intervals considered, each strategy is optimal in ascending order of oscillation number with approximately equal interval lengths. As the optimal strategy transitions, the previously optimal strategy remains a relatively broad attractor for an appreciable amount of time, leading to the plateaus observed in Fig. 13.
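The decomposition of Eq. (8) can be sketched numerically as follows (the 2/T normalization and all names are our assumptions, and the synthetic k = 4 control is purely illustrative):

```python
import numpy as np

def trapezoid(y, t):
    """Plain trapezoidal rule (avoids NumPy-version naming differences)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def cosine_coefficients(u, x_exp, t, k_max=5):
    """Project the relative control u(t) - <x(t)> onto the cosine modes
    cos(pi*k*t/T) of Eq. (8), returning c = (c_0, ..., c_{k_max})."""
    T = t[-1]
    rel = u - x_exp
    return np.array([
        (2.0 / T) * trapezoid(rel * np.cos(np.pi * k * t / T), t)
        for k in range(k_max + 1)
    ])

# sanity check on a synthetic k = 4 control (N_k = k/2 = 2 oscillations)
t = np.linspace(0.0, 1.0, 2001)
c = cosine_coefficients(0.7 * np.cos(4 * np.pi * t), np.zeros_like(t), t)
# c is dominated by c_4; the other components vanish by orthogonality
```

Clustering then operates on these six-dimensional vectors c rather than on the full time-resolved controls, which is what makes a single cluster per dominant coefficient possible.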
Based on the presented analysis, we do not believe we have found the true best results and associated quantum speed limits since (i) neither grape ps, pgrape ps + po, nor grape rs alone produces the best results, (ii) the solution densities are shifted away from the optimal strategies with high variances, and (iii) sa fails completely for this task. In particular, the grape ps solution at 0.887 ms in Fig. 13 strongly suggests the existence of optimal solutions in this vicinity belonging to the k = 4 solution strategy shown in Fig. 15. Their discovery, however, is evidently obfuscated by the other active, sub-optimal solution strategies (k = 3, 5) in this region. Based on this analysis, one could imagine alleviating this issue by developing a seeding mechanism parametrized to specifically target the k = 4 strategy.
This demonstrates that Shake Up is the most challenging of the three examined problems in terms of landscape complexity.

FIG. 15. Shake Up clustering results. The dots with bars denote the mean and standard deviation, whereas the dashed lines only help to guide the eye. Middle: Cluster means in real space, color-coded as above. Bottom: Distribution in terms of fidelity, color-coded as above.

VIII. CONCLUSIONS AND OUTLOOK
We have presented Quantum Moves 2, a citizen-science game in which players act as seeding mechanisms and initial optimizers for quantum optimal control problems. Selecting three distinct problems in the game for analysis, we applied different optimization methods (combinations of algorithms and seeding strategies) to them. For each problem, we examined (a) the respective method efficiencies, (b) the optimal solution strategies, and (c) the overall problem structure.
In Bring Home Water, a single-particle problem, we identified two solution strategies (front- and back-swing) characterized by an exponentially widening gap between them. Using the same resources and gradient-based algorithm (grape), the player-infused seeds uncover both strategies efficiently, whereas the random seeds only find the sub-optimal strategy at the durations relevant for high-fidelity transfers. Imposing increasing structure on the random seeds a posteriori, based on physical insight, allows discovery of the optimal strategy, albeit with significantly reduced overall efficiency. Employing a gradient-free algorithm (stochastic ascent) with random seeds, with a strongly enforced behavior of one of the controls, we find both strategies with high efficiency. The success of sa is explained by a combination of the optimization landscape structure (many small local traps and a few broad optimal attractors), the algorithm's search methodology (allowing it to escape the abundant local traps), and the linearity of the equations of motion (reducing algorithmic complexity). On the optimization-algorithmic level, the comparatively reduced efficiency of the gradient-based algorithm is due to its inability to escape these traps.
In Splitting, a bec problem, we identified a single solution strategy. This strategy was found equally efficiently by both the player-infused and randomly seeded gradient-based methods. The gradient-free algorithms with random seeds fail due to the non-linearity of the equations of motion (resulting in increased algorithmic complexity), with the exception that one variant is somewhat successful in the limit of large durations. The optimization landscape is thus so simple that even a numerically inefficient algorithm can discover the optimal strategy with the allotted resources.
In Shake Up, another bec problem, we identify four solution strategies, corresponding to low-frequency oscillations around the bec center of mass. Each strategy, individually characterized by a dominant half-integer frequency, is optimal at different durations in order of lowest to highest number of oscillations. Outside of their respective regions, each strategy remains a broad sub-optimal attractor, leading to plateaus in the optimization results. Neither of the gradient-based methods is found to be the best on its own, but each exhibits exclusive regions of dominance. In this instance, the gradient-free algorithms completely fail. From these observations, we determine that the optimization landscape is very complex, and we do not believe we have found the ultimately best results.
The fast linear evaluation (g = 0) for the sa algorithm is independent of the choice to restrict control values to a discretized set of values as presented in Appendix C. For example, it would be straightforward to use the same fast evaluation methodology with derivative-based methods and perform line searches. Such a change in update rule shifts the exploration vs. exploitation trade-off: the discrete version is in a sense fully exploratory (the search is globally exhaustive), but only along a given axis at a time. Obviously, discretization and fixing the remaining axes produce a reduced representation of the underlying control landscape, with respect to which the discrete version exhibits a mix of global- and local-search properties. Abandoning discretization and performing line searches turns the algorithm into a fully exploitative one, but again only along a single axis. This is the more standard version of coordinate ascent [38]. In our implementation, line searching usually requires only about 5 (≪ n_d) objective evaluations [39], allowing potentially orders of magnitude fewer time steps (n_d → 5 in Table I, Appendix C) when close to an optimum. Although such an approach does not allow caching of Û_d since the controls can take any value, the aforementioned benefits should more than compensate for this during the local adjustment phase. In this setting, however, the advantageous convergence rates associated with derivatives and adaptive step sizes are only with respect to the chosen axis: there are no theoretical guarantees for convergence in the full-dimensional landscape [38].
In light of these observations, it would be interesting to combine the three methodologies with handover techniques, for example starting with the most global algorithm and ending with the most local one: discrete sa → gradient sa → grape. The performance of such a combination on the different seeding strategies discussed in the main text would be interesting to investigate and is a potential subject of future work.
We now return to the five citizen science related questions discussed in the introduction. A main feature of Quantum Moves 2 was an in-game optimization button enabling players to store and optimize candidate solutions on their local device. This clearly underscored the player's role in the search for overall, global features in solutions (that is, locating the green region in Fig. 1), whereas fine-grained, local optimization could be left to the optimization algorithm. This certainly supports a sequential, "one-off" player-computer interaction. However, with supporting tools (such as replay and the ghost feature, see Appendix A), the game also enables more intertwined, hybrid human-computer interactions in which players gain insight by examining the output of the computer optimization and can thereby improve their search for promising features and heuristics. Figures 4, 9, and 13 clearly demonstrate that the method of player-seeding with player-invoked local-device optimization (pgrape ps + po) performs roughly on par with the best-performing purely algorithmic approaches under consideration in each problem. For the two hardest challenges, Bring Home Water and Shake Up, this method outperforms the randomly seeded grape and sa, respectively. Thus, we conclude, as proposed in Q1, that it does indeed make sense to develop a framework, like Foldit does for protein folding, in which the solution of quantum problems is outsourced to the general population. In Q2, we ask whether the game-based approach could actually yield a computational advantage. A full answer to this question would entail comprehensive comparison to the best possible expert-driven optimization. Here, we take first steps in that direction with a baseline benchmark comparison to off-the-shelf optimization with initialization that is as heuristic-free as possible.
As stated above, we are aware that making comparisons with entirely random seeding constitutes the lowest possible bar of comparison given the plethora of possible numerical optimization approaches. Of course, any numerical optimization expert will routinely introduce heuristics beyond random seeding. However, player-generated data show potential because, similar to machine learning-generated data [40][41][42], they constitute a means for researchers to address problems that is not influenced by expert biases. This can inform the extraction of heuristics and insights that can subsequently be understood, utilized, and expanded upon by the domain expert. An indication of this was seen in the most complex challenge, Shake Up, in which player- and randomly seeded methods were optimal at different durations, probing different parts of the interleaved optimal strategies. Our work is at most a first demonstration inviting further exploration and should not be taken as a guarantee that player-based seeding is advantageous when compared to increasingly complex algorithmic strategies, but merely a first step demonstrating efficacy in a somewhat simplified domain. The extent to which our work may be extended to gain broader implications for the fields of quantum research (Q3), social science (Q4), and computer science (Q5) constitutes interesting topics for future studies.

Units: For numerical purposes we obtain non-dimensionalized [15] working equations such that effectively ℏ = m = 1 and where κ is a constant that can be used to gauge the units. SI and simulation units are related by α_SI = µ_[α] α_sim, where µ_[α] is the chosen unit for the dimension of quantity α and α_sim is the dimensionless number entering e.g. Eq. (B2) (the subscript is then omitted for brevity, and quantities written without units imply simulation values). We take the atomic species to be rubidium such that the unit of mass is µ_mass = m_Rb = 87 amu, and we take the energy unit to be µ_energy = ℏ/µ_time.
Fixing two elements of the triplet {κ, µ_length, µ_time} determines the remaining one to produce Eq. (B2). All problems are simulated with n_x = 256 grid points. The sum of the tweezer potentials, initial control values, and bounds are given for the first problem; the atom chip potential (see [15] and references therein), initial control values, and bounds for the second; and the atom chip potential for ω = 2π · 1.26 MHz after non-dimensionalization (see also [32,43]), initial control values, and bounds for the third. In this form, p = (µ_magnetic · µ_Bohr m_F g_F)/µ_energy = 8794.1 is an overall factor (µ_magnetic has been factored out from under the square root), and m_F = 2 and g_F = 1/2 are the internal hyperfine-state and Landé factors, respectively. Additionally, B_ω = 0.9 and B_S(x) = √((Gr · x)² + B_I²), where Gr = 0.2 is a magnetic field gradient and B_I = 1. We take g_1D = 1.8299.
This section reviews and expands the analysis of the stochastic ascent algorithm from Ref. [24]. In the following, we assume a single control parameter for simplicity. The time axis is segmented into n_b bins of equal width w, which share the same control value, such that n_t = w · n_b. For example, for w = 3 the propagator for the first bin is Û_b1 = Û_t3 Û_t2 Û_t1, and the pattern continues for the other bins [44]. We then allow u_bk to assume only values from a predefined discrete set Ω = {u_d}_{d=1,...,n_d} (this choice is discussed at the end of the section). Using n_d = 128, these values are linearly spaced from the lower to the upper control boundary, see (B1).
Updating u_bk is done by exhaustively computing the fidelity for all possible values in Ω and setting u_bk to the value that maximizes it, while keeping the other control values fixed (discrete coordinate ascent [45]). Once u_bk has been updated, it is not chosen for further updates until all the remaining points have also been updated. Updating all points once constitutes an iteration, and the sequential control update order is stochastic within each iteration. As is, bandwidth limitations are only imposed by the choice of n_b, but one could easily accommodate a derivative regularization term as in grape. The pertinent stopping conditions are: exceeding the allotted optimization wall time (minimum ∼ 13 minutes), exceeding the fidelity threshold (F ≥ 0.999), or the algorithm achieving no gain in fidelity by changing any single control value.
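The update rule can be sketched generically as follows (a minimal sketch on a toy separable objective standing in for the transfer fidelity; Ω with n_d = 128 linearly spaced values follows the text, everything else is illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def stochastic_ascent(objective, n_b, omega, max_iters=50):
    """Discrete stochastic coordinate ascent as described in the text: each
    iteration visits all n_b bins in random order, and each visited bin is
    set by exhaustive search over the discrete value set omega. Stops when
    no single control value change yields any gain."""
    u = rng.choice(omega, size=n_b)          # random initial control
    best = objective(u)
    for _ in range(max_iters):
        improved = False
        for k in rng.permutation(n_b):       # stochastic update order
            trial = u.copy()
            scores = []
            for v in omega:                  # exhaustive search along axis k
                trial[k] = v
                scores.append(objective(trial))
            if max(scores) > best:
                best = max(scores)
                improved = True
            u[k] = omega[int(np.argmax(scores))]
        if not improved:
            break
    return u, best

# toy landscape with its maximum at u = (0.5, ..., 0.5)
omega = np.linspace(-1.0, 1.0, 128)          # n_d = 128 discrete values
u_opt, f_opt = stochastic_ascent(lambda u: -np.sum((u - 0.5) ** 2),
                                 n_b=8, omega=omega)
```

Because the exhaustive scan along each axis includes the current value, the objective never decreases, and the algorithm terminates once a full sweep brings no improvement.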
Exhaustive evaluation of the fidelity for bin k can be sped up for the linear Schrödinger equation (g = 0): the forward-propagated |ψ_{b_{k−1}}⟩ = ∏_{j=1}^{k−1} Û_{b_j} |ψ_0⟩ and backward-propagated |χ_{b_{k+1}}⟩ = ∏_{j=n_b}^{k+1} Û†_{b_j} |ψ_tgt⟩ vectors for all times up to and after bin k, respectively, only need to be calculated and cached once per bin update. Upon finishing the evaluation, the update is applied to the control and the forward-propagated state. Additionally, the matrix representations of the time-evolution operators Û_d corresponding to every element in Ω can be precomputed and cached in memory, changing the time-stepping method to a single matrix-vector multiplication [24] instead of the Fourier split-step method.
The first control value update requires $(n_t + n_d)$ time steps, after which the forward/backward vectors have been initialized and cached. Calculating the new forward/backward vectors and updating a subsequent bin at $k'$ requires only $w|k - k'|$ time steps when re-using the old vectors. If $k < k'$ the $\psi$ cache is updated, and the $\chi$ cache otherwise.
We may write the average time step distance $\langle w|k - k'| \rangle = w \rho n_b = \rho n_t$, where $\rho \approx 1/3$ is found empirically. Performing subsequent updates thus costs $(n_t/3 + n_d)$ time steps when averaged over all bins. The average number of time steps required to complete a full iteration is thus $n_b (n_t/3 + n_d)$, except for the first iteration, which costs an additional $2 n_t/3$ due to forward/backward vector cache initialization [46].
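The empirical value $\rho \approx 1/3$ is consistent with the average index distance between two independently, uniformly drawn bins (within an iteration the order is a permutation rather than independent draws, but the independent-uniform model gives the same $1/3$). A quick Monte Carlo check of this approximation, not the paper's measurement:

```python
import numpy as np

rng = np.random.default_rng(1)
n_b = 1000
# two independently drawn bin indices per update
k = rng.integers(0, n_b, size=200_000)
k_prime = rng.integers(0, n_b, size=200_000)
rho = np.mean(np.abs(k - k_prime)) / n_b
print(round(rho, 3))  # close to 1/3
```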
In the non-linear case $g \neq 0$, the explicit state dependence has severe consequences for the algorithm's feasibility. First, the time-evolution operators $\hat{U}_j$ cannot be precomputed since they depend on $\psi$, which changes as the control changes. Second, it does not make sense to maintain a cache of backward-propagated vectors: altering the control at $t_k$ changes the $\psi$ state trajectory from $k$ to $n_t$, and the backward-propagated vectors depend on these states in the non-linear case. In this case one may just as well evaluate the fidelity at $t_{n_t}$ using the first equality in Eq. (C1). The $\psi$ cache only needs to be updated when $k < k'$, yielding $\langle w|k - k'| \rangle_{k<k'} = w n_b/6 = n_t/6$. From there, evaluating the fidelity of a single $u_{b_k} \in \Omega$ requires $w|n_b - k|$ time steps, which must be done for all $n_d$ elements. Averaging over a full iteration yields $\langle w|n_b - k| \rangle = w n_b/2 = n_t/2$. Consequently, the number of time steps needed to update a single point $u_{b_k}$ changes as $n_t/3 \to n_t/6$ and $n_d \to n_d n_t/2$. Thus, roughly an additional $n_b n_d n_t/2$ time steps must be performed when $g \neq 0$, each of which has an increased computation time because $\hat{U}_d$ cannot be cached. The differences between the two cases are summarized in Table I.
The speed with which the stochastic (coordinate) ascent operates comes at the cost of not being able to perform correlated, simultaneous control value updates between bins. This could, in principle, easily be remedied by updating $n_p$ bins instead of just a single one. However, such an approach is untenable for the discrete version even for small $n_p > 1$ if one desires exhaustive search: the update cost would then depend on the largest index distance between the parameters, and an exponential number of discrete combinations $n_d^{n_p}$ would have to be evaluated.

TABLE I. Summary of sa for the linear ($g = 0$) and non-linear ($g \neq 0$) cases.

                   Linear ($g = 0$)                     Non-linear ($g \neq 0$)
Caching            $\hat{U}_d$, $\psi$, $\chi$          $\psi$
Time evolution     matrix $\times$ vector               Fourier split-step
Time steps/iter.   $n_b(\frac{1}{3} n_t + n_d)$         $n_b(\frac{1}{6} n_t + \frac{1}{2} n_t n_d)$
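The per-iteration costs in Table I are simple closed-form expressions, so the gap between the two regimes can be made concrete with example numbers (the helper and the parameter values below are illustrative, not from the paper):

```python
def steps_per_iteration(n_t, n_b, n_d, linear=True):
    """Average number of time steps per full sa iteration, as in Table I."""
    if linear:
        return n_b * (n_t / 3 + n_d)            # g = 0: cached propagators
    return n_b * (n_t / 6 + n_t * n_d / 2)      # g != 0: no propagator cache

# Example: n_t = 120 steps, w = 3 so n_b = 40 bins, n_d = 128 discrete values
linear_cost = steps_per_iteration(120, 40, 128)
nonlinear_cost = steps_per_iteration(120, 40, 128, linear=False)
print(linear_cost, nonlinear_cost)  # non-linear case is far more costly
```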

Appendix D: Statistical Performance by Sampling
Here we assess the relative usefulness of the preselection heuristic introduced in Sec. IV, and provide an alternative way of characterizing the statistical method performance, which is not based on solution densities. Instead, we compare statistics when only a restricted number of seeds are allowed to be optimized. This emulates either the restriction of computational resources (e.g. no cluster for parallel computation is available) or an increased numerical difficulty of the problem (i.e. operations are more expensive in wall time, allowing fewer seeds to be optimized within the same time frame).
Under either interpretation, we estimate a mean quantum speed limit $\overline{T^{\mathrm{fit}}_{\mathrm{QSL}}}$ for $F = 0.99$ as a function of the sample size, $N_{\mathrm{samples}}$, on the aggregate results in Figs. The procedure is as follows:

1. Draw a random subset of $N_{\mathrm{samples}}$ seeds and record the best optimized fidelity at each sampled duration in $T$.
2. Linearly fit $\log_{10}(1 - F)$ and denote by $T^{\mathrm{fit}}_{\mathrm{QSL}}$ the duration where the fit value corresponds to $F = 0.99$.
3. If $T^{\mathrm{fit}}_{\mathrm{QSL}} \in T$ (interpolation), the trial counts as a success. If it lies outside $T$ (extrapolation), it counts as a failure.
4. Repeat steps 1-3 $N_{\mathrm{trials}} = 1000$ times.
5. Compute the mean value $\overline{T^{\mathrm{fit}}_{\mathrm{QSL}}}$ over the successful trials. This avoids skewing the mean due to outliers.
6. Compute the empirical success rate (estimated probability of success) $P(T^{\mathrm{fit}}_{\mathrm{QSL}} \in T) = N_{\mathrm{successes}}/N_{\mathrm{trials}}$.

Fig. 17 shows the results for the different methods in all three levels. The top row shows $\overline{T^{\mathrm{fit}}_{\mathrm{QSL}}}$ and the bottom row shows $P(T^{\mathrm{fit}}_{\mathrm{QSL}} \in T)$. Taking the generic grape rs method as a baseline comparison, one finds that the average behavior of pgrape ps + po (player seeding and player optimization) tends to be worse in all levels: it has a comparatively high $\overline{T^{\mathrm{fit}}_{\mathrm{QSL}}}$ and a low $P$ for small sample sizes. For instance, in Bring Home Water, $N_{\mathrm{samples}} = 30$ yields success probability $P \approx 0.20$ with mean value $\overline{T^{\mathrm{fit}}_{\mathrm{QSL}}} = 1.15$. An exception occurs beyond $N_{\mathrm{samples}} = 200$, where the players perform better on average, but only in this level.
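The subsampling estimate of $\overline{T^{\mathrm{fit}}_{\mathrm{QSL}}}$ and its success rate can be sketched as below. This is a minimal illustration: `results`, `durations`, and the function name are hypothetical stand-ins for the aggregated optimization data, and seeds are drawn with replacement as in a standard bootstrap:

```python
import numpy as np

def estimate_tqsl(results, durations, n_samples, n_trials=1000,
                  f_target=0.99, seed=0):
    """Subsampled estimate of the quantum speed limit at fidelity f_target.

    results: dict mapping duration T -> array of optimized fidelities (one
             entry per available seed).
    Returns (mean T_fit over successful trials, empirical success rate).
    """
    rng = np.random.default_rng(seed)
    log_target = np.log10(1 - f_target)
    t = np.asarray(durations, dtype=float)
    fits, successes = [], 0
    for _ in range(n_trials):
        # best fidelity among n_samples randomly drawn seeds, per duration
        best_f = np.array([np.max(rng.choice(results[T], size=n_samples))
                           for T in durations])
        infid = np.clip(1.0 - best_f, 1e-12, None)   # guard against log10(0)
        slope, intercept = np.polyfit(t, np.log10(infid), 1)
        t_fit = (log_target - intercept) / slope     # duration where fit hits F = 0.99
        if t.min() <= t_fit <= t.max():              # interpolation counts as success
            fits.append(t_fit)
            successes += 1
    mean_t = float(np.mean(fits)) if fits else float("nan")
    return mean_t, successes / n_trials
```

On synthetic data where infidelity decays exponentially with duration, the estimator recovers the duration at which $F = 0.99$ is crossed.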
Thus, player methods without any additional optimization are not a superior approach on average. However, when preselecting the best individual 600 ps and po seeds and optimizing these, we see a significant shift in performance. The grape pr-po and grape pr-ps methods are observed to require up to several orders of magnitude fewer samples to produce the same $\overline{T^{\mathrm{fit}}_{\mathrm{QSL}}}$ across all three levels. Moreover, their success rates are significantly higher than those of any other method for small sample sizes, and only dip below unit $P$ in Shake Up. For the most extreme case at $N_{\mathrm{samples}} = 30$, we can compare $P = 0.95$ for grape pr-po with only $P = 0.07$ for grape rs, while the former still has a lower estimate for the mean. Obtaining $\overline{T^{\mathrm{fit}}_{\mathrm{QSL}}} = 1.135$ requires $N_{\mathrm{samples}} = 30$ and $N_{\mathrm{samples}} = 120$, respectively, with corresponding success probabilities $P = 0.95$ and $P = 0.70$ in favor of grape pr-po, while the first instance of $P \geq 0.99$ occurs at $N_{\mathrm{samples}} = 39$ against $N_{\mathrm{samples}} = 260$. Similar trends are seen for the other levels, e.g. in Bring Home Water at $N_{\mathrm{samples}} = 30$, where grape pr-po ($P = 1$) yields a substantial improvement in $\overline{T^{\mathrm{fit}}_{\mathrm{QSL}}}$ over grape rs ($P = 0.69$).
The reduction in $\overline{T^{\mathrm{fit}}_{\mathrm{QSL}}}$ gained by increasing the sample size for grape pr-po is minimal and quickly saturates in Bring Home Water and Splitting, while this is not true for Shake Up. This leads to the same conclusion as in the main text: Shake Up is the most difficult overall problem. The only instance where grape pr-po is matched in performance at the same $N_{\mathrm{samples}}$ occurs in Splitting. Here, grape pr-rs intersects near $N_{\mathrm{samples}} = 200$, the maximum sample size for the preselection-based methods. However, practically the same $\overline{T^{\mathrm{fit}}_{\mathrm{QSL}}}$ can be achieved using just $N_{\mathrm{samples}} = 30$ with grape pr-po.
Additionally, the relative difference between pr-po and pr-ps signifies how valuable the player optimizations are in terms of absolute results for a given level. Bring Home Water and Shake Up show a clear gain, while Splitting is nearly unaffected, again lending itself to the interpretation that it has the least difficult landscape topography.
The sa ($n_b = n_t$) method performs very well in Bring Home Water. The sa ($n_b = 40$) method is seen to have the unequivocally worst overall statistical performance: even in the linear case, $\overline{T^{\mathrm{fit}}_{\mathrm{QSL}}}$ scales comparatively poorly with $N_{\mathrm{samples}}$. For Shake Up, both of these methods fail ($P = 0$) for all $N_{\mathrm{samples}}$. The failure is, again, less severe for Splitting.
There is a clear, unambiguous advantage in using the pr-ps and pr-po seeding strategies to obtain the best average performance for small sample sizes (equivalently, increased numerical difficulty or restricted available computational resources) for these problems. The benefit diminishes gradually as the sample size is increased. This echoes the conclusion drawn in the previous section: given enough resources (correspondingly, large enough $N_{\mathrm{samples}}$), most algorithms achieve similar results. Moreover, when the underlying landscape topography is difficult, po seeds are definitively more valuable than ps seeds in terms of absolute results.