Collective behavior of self-steering active particles with velocity alignment and visual perception

The formation and dynamics of swarms are widespread in living systems, from bacterial biofilms to schools of fish and flocks of birds. We study this emergent collective behavior in a model of active Brownian particles with visual-perception-induced steering and alignment interactions through agent-based simulations. The dynamics, shape, and internal structure of the emergent aggregates, clusters, and swarms of these intelligent active Brownian particles (iABPs) are determined by the maneuverabilities $\Omega_v$ and $\Omega_a$, quantifying the steering based on the visual signal and polar alignment, respectively, the propulsion velocity, characterized by the P{\'e}clet number $Pe$, the vision angle $\theta$, and the orientational noise. Various non-equilibrium dynamical aggregates -- like motile worm-like swarms and millings, and close-packed or dispersed clusters -- are obtained. Small vision angles imply the formation of small clusters, while large vision angles lead to more complex clusters. In particular, a strong polar-alignment maneuverability $\Omega_a$ favors elongated worm-like swarms, which display super-diffusive motion over a much longer time range than individual ABPs, whereas a strong vision-based maneuverability $\Omega_v$ favors compact, nearly immobile aggregates. Swarm trajectories show long persistent directed motion, interrupted by sharp turns. Milling rings, where a worm-like swarm bites its own tail, emerge for an intermediate regime of $Pe$ and vision angles. Our results offer new insights into the behavior of animal swarms, and provide design criteria for swarming microbots.


I. INTRODUCTION
Self-organized group formation and collective motion in the form of swarms or flocks is a hallmark of living systems over a wide range of length scales, from bacterial biofilms to schools of fish, flocks of birds, and animal herds [1][2][3][4][5][6]. This behavior emerges without central control and is instead governed by the response of individuals to the actions of other group members or agents. The self-organized structures and motion typically extend over much larger length scales than the size of the individual units, and the emergent properties and functions are beyond the capacity of the constituent units [7][8][9][10][11]. The emerging patterns and structures depend not only on the physical interactions between the agents of an ensemble, but are often governed by nonreciprocal information input (e.g., visual perception in the case of animals), processing of this information, and active response. Unravelling the underlying mechanisms and principles not only sheds light on the behavior of biological systems, but also provides concepts for the design of functional synthetic active [2,12] and microrobotic [13] systems, which are able to adapt to environmental conditions and perform complex tasks autonomously [14,15].
Several interactions and information-exchange processes can contribute to the formation of swarms and flocks. Various models have been proposed and analyzed to understand this process and the resulting structures.
A pertinent feature of the collective motion of animal groups is motion alignment and cohesion. In a pioneering work, Reynolds proposed three interaction rules for bird-like objects called "boids": collision avoidance, velocity matching, and flock centering [16]. The boid model shares features with earlier models of fish schools, which considered alignment with nearby individuals, attraction to the center of the school, and avoidance of close neighbors to prevent collisions [8,10]. A similar model is the "behavioural zonal model" by Couzin et al. [17,18], which considers different types of interactions in three non-overlapping zones: minimum distance, attraction towards other individuals, and alignment with neighbours. The analysis of this model shows complex structures like swarms, millings, and groups with highly parallel motion. From the physics perspective, perhaps the most celebrated model of collective motion is the "Vicsek model" [19], together with its refinements and extensions [5,20-26]. In the basic version of this model, particles move with constant speed and change their direction at each time step by aligning with the mean orientation of neighboring particles within a prescribed interaction range, subject to some noise accounting for environmental perturbations. The Vicsek model shows a phase transition from a disordered phase to an ordered phase with increasing density and decreasing noise.
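The update rule of the basic Vicsek model described above can be sketched in a few lines. This is an illustrative sketch only, not the implementation used in any of the cited works: the function name `vicsek_step`, the uniform noise in $[-\eta/2, \eta/2]$, the inclusion of the particle itself in the average, and the periodic square box are all assumptions made here.

```python
import math
import random

def vicsek_step(positions, angles, speed, radius, noise, box, rng):
    """One synchronous update of a basic 2D Vicsek-type model (illustrative)."""
    n = len(positions)
    new_angles = []
    for i in range(n):
        sx = sy = 0.0
        for j in range(n):
            # Minimum-image separation in a periodic square box of side `box`.
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            dx -= box * round(dx / box)
            dy -= box * round(dy / box)
            if dx * dx + dy * dy <= radius * radius:  # includes j == i
                sx += math.cos(angles[j])
                sy += math.sin(angles[j])
        # Mean neighbor direction plus uniform noise (assumed noise convention).
        mean_dir = math.atan2(sy, sx)
        new_angles.append(mean_dir + noise * (rng.random() - 0.5))
    # Move every particle with constant speed along its new direction.
    new_positions = [
        ((x + speed * math.cos(a)) % box, (y + speed * math.sin(a)) % box)
        for (x, y), a in zip(positions, new_angles)
    ]
    return new_positions, new_angles
```

Calling `vicsek_step(pos, ang, 0.1, 1.0, 0.5, 10.0, random.Random(0))` repeatedly advances the system; lowering `noise` or raising the density is expected to favor the ordered state mentioned in the text.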
Another class of models emphasizes short-range steric repulsion, and possibly longer-range attraction, between self-propelled units called "active Brownian particles" (ABPs). This implies that the shape of the objects is now relevant for structure formation and collective motion. Motility-induced phase separation is observed in such systems, where uniformly distributed spherical ABPs can phase separate into a dense phase of slow-moving particles and a dilute phase of fast-moving particles for certain packing fractions and activities [27][28][29], while ABPs with elongated shapes form non-equilibrium motile clusters and swarms [30][31][32].
Models based solely on an attraction of individual units to the center of mass of neighbouring particles, induced by self-steering controlled by visual perception [33][34][35][36], show different non-equilibrium structures like clusters, single-file motion, and millings. Millings have also been observed in simulations of other models [7,37]. With vision-based velocity alignment and time delays, agents can spontaneously condense into 'droplets' [38], and increasing the activity and/or delay time of an active particle's attraction to a target point can induce a dynamic chiral state [39]. The combination of the two mechanisms of attraction and Vicsek-type alignment [40] shifts the critical noise amplitude of the phase transition, and changes the type of phase transition, compared to the case of pure velocity alignment [19]. Detailed observations of flocks of surf scoters have also been used to infer individual interaction forces in the behavioural zonal model [41].
After the recognition that in starling flocks a typical individual interacts significantly only with its 7 or 8 closest neighbours [11], models with metric-free, "topological" interactions have also been studied [42][43][44][45]. For instance, the "topological Vicsek model", in which particles align their velocity with neighbors defined through the first Voronoi shell, shows qualitatively different results, such as the absence of density segregation, compared to its metric counterpart. In Delaunay-based models [44], the communication topology of the swarm is determined by Delaunay triangulation, where the rules of attraction and repulsion between neighboring individuals are the same as for the zone-based models, except that the region of attraction is unbounded. The results suggest that Delaunay-based models are more appropriate for swarms that are larger in number and more spatially spread out, whereas the zone-based models are more appropriate for small groups.
We focus here on the numerical investigation of a minimal cognitive flocking metric model with visual perception [34,36] and polar alignment [19] for a system of self-steering particles. The usual ABPs are additionally equipped with visual perception and polar-alignment interactions. The visual signal allows these "intelligent ABPs" (iABPs) to detect the instantaneous position of the center of mass of neighbouring particles within their vision cone (VC), whereas polar alignment favors reorientation toward the average orientation of neighbors. Our model shares some basic features with the behavioral zonal model [17], like long-range attraction, short-range repulsion, and medium-range alignment. However, it is important to note that in our study we incorporate (i) hard-core excluded-volume interactions, instead of a zone of repulsion between particles [17] or even point particles [34,40], and (ii) a limited maneuverability in response to external signals. Additionally, the vision-based attraction in our model is non-additive, i.e., the reorientation force does not depend on the number of particles, but on their (normalized) distribution in the vision cone. A recent theoretical study [46] highlights the influence of non-additive versus additive interactions on structure formation in the Vicsek model. Experimentally, systems of intelligent ABPs can be realised by active colloids that are steered externally by a laser beam, with an input signal mimicking visual perception [47][48][49].
The main goal of our study is the exploration of the state diagram, which depends on several parameters (propulsion, maneuverabilities for vision-induced steering and alignment, vision angle, and ranges of vision and alignment interactions), as well as a characterization of the emerging structures and dynamical behaviors. We find various types of emergent structures, like dispersed clusters, compact aggregates, worm-like swarms, and millings, resulting from the interplay of visual-signal-controlled steering and polar alignment. An important feature is the formation of worm-like swarms with a large variability of elongation and thickness. The swarm dynamics displays persistent super-diffusive motion over a wide range of time scales, which becomes diffusive at long times. This motion is characterized by a persistence length, controlled by the maneuverabilities and the vision angle. Furthermore, swarms are found to display interesting trajectories, with long periods of directed motion interrupted by sharp turns or circular arcs.

II. MODEL
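We consider a two-dimensional system of $N$ responsive "intelligent" active Brownian particles (iABPs), with $\mathbf{r}_i(t)$ the position of particle $i$ ($i = 1, \ldots, N$) at time $t$. The particles are self-steered with a constant propulsion force $\mathbf{F}^a_i(t) = \gamma v_0 \mathbf{e}_i(t)$ along the direction $\mathbf{e}_i(t)$ and velocity $v_0$. The dynamics of this system is governed by the equations of motion [36,50]
$$ m \ddot{\mathbf{r}}_i = \gamma v_0 \mathbf{e}_i - \gamma \dot{\mathbf{r}}_i + \mathbf{F}_i + \boldsymbol{\Gamma}_i(t). \tag{1} $$
Here, $m$ is the mass of an iABP, $\gamma$ the translational friction coefficient, $\mathbf{F}_i$ the force due to excluded-volume interactions between the iABPs, and $\boldsymbol{\Gamma}_i$ a Gaussian and Markovian stochastic process with zero mean and the second moments $\langle \boldsymbol{\Gamma}_i(t) \cdot \boldsymbol{\Gamma}_j(t') \rangle = 4 \gamma k_B T \delta_{ij} \delta(t - t')$, with $T$ the temperature and $k_B$ the Boltzmann constant. Excluded-volume interactions are taken into account by the short-range, truncated, and shifted Lennard-Jones potential
$$ U_{LJ}(r) = \begin{cases} 4\epsilon \left[ (\sigma/r)^{12} - (\sigma/r)^{6} \right] + \epsilon, & r \le 2^{1/6}\sigma, \\ 0, & \text{otherwise}, \end{cases} \tag{2} $$
where $r = |\mathbf{r}|$ is the distance between iABPs, $\sigma$ represents their diameter, and $\epsilon$ is the energy determining the strength of repulsion.
An iABP is able to react to information about the position and orientation of neighboring particles. As shown schematically in Fig. 1, particle $i$ at position $\mathbf{r}_i$ can adjust its propulsion direction $\mathbf{e}_i$ through self-steering in the direction $\mathbf{u}_{ij} = (\mathbf{r}_j - \mathbf{r}_i)/|\mathbf{r}_j - \mathbf{r}_i|$, determined by the positions of neighbors, with an adaptive force $\mathbf{f}^{av}_i$, as in the cognitive flocking model [9,34,36]. Simultaneously, it is able to align its propulsion direction with those of neighboring particles, $\mathbf{e}_j$, with the alignment force $\mathbf{f}^{aa}_i$, as in the Vicsek model [5,21,46]. Hence, the dynamics of the propulsion direction of particle $i$ is determined by [51]
$$ \dot{\mathbf{e}}_i = \mathbf{f}^{av}_i + \mathbf{f}^{aa}_i + \boldsymbol{\Lambda}_i(t) \times \mathbf{e}_i. \tag{3} $$
Here, the $\boldsymbol{\Lambda}_i$ represent Gaussian and Markovian stochastic processes with zero mean and the correlations $\langle \boldsymbol{\Lambda}_i(t) \cdot \boldsymbol{\Lambda}_j(t') \rangle = 2 D_R \delta_{ij} \delta(t - t')$, with $D_R$ the rotational diffusion coefficient. The cognitive force ("visual" force) is given by
$$ \mathbf{f}^{av}_i = \frac{\Omega_v}{N_{c,i}} \sum_{j \in VC} e^{-r_{ij}/R_0}\, \mathbf{e}_i \times (\mathbf{u}_{ij} \times \mathbf{e}_i), \tag{4} $$
with $r_{ij} = |\mathbf{r}_j - \mathbf{r}_i|$, the "visual" maneuverability $\Omega_v$, and the (distance-weighted) number $N_{c,i} = \sum_{j \in VC} e^{-r_{ij}/R_0}$ of iABPs within the vision cone (VC). The condition for particles $j$ to lie within the vision cone of particle $i$ is
$$ \mathbf{u}_{ij} \cdot \mathbf{e}_i \ge \cos\theta, \tag{5} $$
where $\theta$ (denoted as vision angle in the following) is the opening angle of the vision cone centered on the particle orientation $\mathbf{e}_i$ (Fig. 1). The exponential distance dependence describes a characteristic range $R_0$ of visual perception. In addition, we limit the vision range to $|\mathbf{r}_i - \mathbf{r}_j| \le 4R_0$ and treat all particles further apart as invisible.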
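The visibility test described above can be sketched as follows. This is a minimal sketch under stated assumptions: the cone condition is taken to be $\mathbf{u}_{ij} \cdot \mathbf{e}_i \ge \cos\theta$ (so $\theta$ acts as a half-opening angle and $\theta = \pi$ means full vision), periodic images are ignored, and the names `in_vision_cone` and `visual_weight` are hypothetical helpers introduced here.

```python
import math

def in_vision_cone(ri, ei, rj, theta, r0):
    """True if particle j at rj is visible to particle i (position ri, unit heading ei)."""
    dx, dy = rj[0] - ri[0], rj[1] - ri[1]
    dist = math.hypot(dx, dy)
    # Coincident points have no direction; beyond 4*R0 particles are invisible.
    if dist == 0.0 or dist > 4.0 * r0:
        return False
    cos_angle = (dx * ei[0] + dy * ei[1]) / dist  # u_ij . e_i
    return cos_angle >= math.cos(theta)

def visual_weight(ri, rj, r0):
    """Exponential distance weight exp(-r_ij / R0) of a visible neighbor."""
    return math.exp(-math.hypot(rj[0] - ri[0], rj[1] - ri[1]) / r0)
```

With $R_0 = 1.5\sigma$, a neighbor straight ahead at distance $\sigma$ is visible for any $\theta > 0$, whereas a neighbor directly to the side is visible only for $\theta \ge \pi/2$.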
Alignment of the propulsion direction (velocity alignment) is described by the adaptive force
$$ \mathbf{f}^{aa}_i = \frac{\Omega_a}{N_{a,i}} \sum_{j \in PA} \mathbf{e}_i \times (\mathbf{e}_j \times \mathbf{e}_i), \tag{6} $$
with the "alignment" maneuverability $\Omega_a$ and the number $N_{a,i}$ of iABPs in the polar-alignment circle (PA) (Fig. 1). The condition for particles $j$ to lie within the polar-alignment (PA) range of particle $i$ is
$$ |\mathbf{r}_j - \mathbf{r}_i| \le R_c, \tag{7} $$
where $R_c$ is the cutoff radius. Unless stated otherwise, $R_c = \sigma$, i.e., particles align up to the second shell of neighbors. Particles inside the overlap zone of the VC and PA regions interact via both $\mathbf{f}^{av}_i$ and $\mathbf{f}^{aa}_i$. The representation of the propulsion directions in polar coordinates, $\mathbf{e}_i = (\cos\varphi_i, \sin\varphi_i)^T$ (see Fig. 1), yields the equations of motion for the orientation angles
$$ \dot{\varphi}_i = \frac{\Omega_v}{N_{c,i}} \sum_{j \in VC} e^{-r_{ij}/R_0} \sin(\phi_{ij} - \varphi_i) + \frac{\Omega_a}{N_{a,i}} \sum_{j \in PA} \sin(\varphi_j - \varphi_i) + \Lambda_i(t). \tag{8} $$
The first sum on the right-hand side of Eq. (8) describes the preference of an iABP to move toward the center of mass of iABPs in its "vision" cone (VC), while the second sum describes the preference of an iABP to align with the neighboring particles. In the vision-based self-steering, the sum corresponds to the projection of the positions of all $N_c$ particles within the VC onto the "retina" of particle $i$, with $\phi_{ij}$ the polar angle of the unit vector $\mathbf{u}_{ij} = (\cos\phi_{ij}, \sin\phi_{ij})^T$ between the positions of particles $i$ and $j$. The activity of the iABPs is characterized by the Péclet number
$$ Pe = \frac{\sigma v_0}{D_T}, \tag{9} $$
where $D_T = k_B T/\gamma$ is the translational diffusion coefficient.
Our model presents a minimalistic description of self-steering particles with visual perception and velocity alignment, and provides insight into the interplay of these two swarming mechanisms. However, it depends on a significant number of parameters: the Péclet number $Pe$, the vision angle $\theta$ and the vision range $R_0$, the visual maneuverability $\Omega_v$, the alignment maneuverability $\Omega_a$, the particle size $\sigma$, and the packing fraction $\Phi$. In order not to get lost in this large parameter space, we focus here on varying the alignment-vision ratio $\Omega_a/\Omega_v$, the Péclet number $Pe$, and the vision angle $\theta$.
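The two steering contributions described above can be combined into a single deterministic turning rate per particle. The sketch below is an assumption-laden illustration rather than the authors' code: it takes the vision term as a normalized, exponentially weighted sum of $\sin(\phi_{ij} - \varphi_i)$ over visible neighbors and the alignment term as the mean of $\sin(\varphi_j - \varphi_i)$ over neighbors within $R_c$, excludes the particle itself from both sums, ignores periodic boundaries, and omits the rotational noise. The function name `steering_rate` is hypothetical.

```python
import math

def steering_rate(i, pos, phi, omega_v, omega_a, theta, r0, rc):
    """Deterministic part of d(phi_i)/dt: vision-based plus alignment steering."""
    xi, yi = pos[i]
    vis_sum = vis_norm = 0.0
    ali_sum, n_align = 0.0, 0
    for j, (xj, yj) in enumerate(pos):
        if j == i:
            continue
        dx, dy = xj - xi, yj - yi
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue
        bearing = math.atan2(dy, dx)  # polar angle of u_ij
        # Vision: neighbor inside the cone and within the vision range 4*R0.
        if math.cos(bearing - phi[i]) >= math.cos(theta) and dist <= 4.0 * r0:
            w = math.exp(-dist / r0)
            vis_sum += w * math.sin(bearing - phi[i])
            vis_norm += w
        # Alignment: neighbor inside the polar-alignment circle of radius Rc.
        if dist <= rc:
            ali_sum += math.sin(phi[j] - phi[i])
            n_align += 1
    rate = 0.0
    if vis_norm > 0.0:
        rate += omega_v * vis_sum / vis_norm  # non-additive (normalized) vision term
    if n_align > 0:
        rate += omega_a * ali_sum / n_align
    return rate
```

Because both sums are normalized, a single neighbor at right angles produces the full turning rate $\Omega_v$ (or $\Omega_a$), independent of how many further neighbors share the same direction; this is the non-additivity emphasized in the introduction.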

III. PARAMETERS
In the simulations, we measure time in units of $\tau = \sqrt{m\sigma^2/(k_B T)}$, energies in units of the thermal energy $k_B T$, and lengths in units of $\sigma$. We choose $\gamma = 10^2 \sqrt{m k_B T/\sigma^2}$ and the rotational diffusion coefficient $D_R = 8 \times 10^{-2}/\tau$, which yields the relation $D_T/(\sigma^2 D_R) = 1/8$. This choice of the friction and rotational diffusion coefficients ensures that inertia does not affect the behavior, because the resulting relation $m/\gamma = 10^{-2}\tau \ll \tau$ implies strongly overdamped single-particle dynamics. The main reason for including the inertia term in Eq. (1) is the reduced numerical effort and the improved accuracy of the numerical integration, as purely Brownian dynamics requires a time step that is orders of magnitude smaller. We set $\epsilon/k_B T = 1 + Pe$ to ensure a nearly constant iABP overlap upon collisions, even at high activities. The iABP density is measured in terms of the global packing fraction $\Phi = \pi\sigma^2 N/(4L^2)$, with $L$ the side length of the square simulation box. Periodic boundary conditions are applied, and the equations of motion (1) are solved with a velocity-Verlet-type algorithm suitable for stochastic systems, with the time step $\Delta t = 10^{-3}\tau$ [52]. We perform $10^7$ equilibration steps and collect data for an additional $10^7$ steps. For certain averages, up to 10 independent realizations are considered. As shown in Ref. [53], for the ratio $M = mD_R/\gamma = 8 \times 10^{-4}$ and the considered Péclet numbers, we do not expect motility-induced phase separation (MIPS).
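The quoted parameter relations can be checked with a few lines of arithmetic in reduced units. The sketch assumes $m = \sigma = k_B T = 1$, the Einstein relation $D_T = k_B T/\gamma$, and a time unit $\tau = \sqrt{m\sigma^2/(k_B T)}$; all variable names are chosen here for illustration.

```python
# Reduced units (assumption): m = sigma = k_B T = 1.
m = sigma = kT = 1.0

tau = (m * sigma**2 / kT) ** 0.5          # time unit
gamma = 1e2 * (m * kT / sigma**2) ** 0.5  # translational friction coefficient
d_rot = 8e-2 / tau                        # rotational diffusion coefficient D_R
d_trans = kT / gamma                      # translational diffusion coefficient D_T

ratio = d_trans / (sigma**2 * d_rot)      # D_T / (sigma^2 D_R), quoted as 1/8
inertial_time = m / gamma                 # m / gamma, quoted as 1e-2 tau
big_m = m * d_rot / gamma                 # M = m D_R / gamma, quoted as 8e-4

print(ratio, inertial_time / tau, big_m)
```

The three printed values reproduce the relations stated in the text up to floating-point rounding.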
If not indicated otherwise, the number of particles is $N = 625$, the length of the simulation box is $L = 250\sigma$ (corresponding to a packing fraction $\Phi = 7.85 \times 10^{-3}$), the characteristic radius is $R_0 = 1.5\sigma$, the visual maneuverability is $\Omega_v/D_R = 12.5$, and the vision angle is varied in the range $\pi/16 \le \theta \le \pi$.
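As a quick consistency check of the default parameters, the packing fraction $\Phi = \pi\sigma^2 N/(4L^2)$ can be evaluated directly. The helper name `packing_fraction` is introduced here for illustration only.

```python
import math

def packing_fraction(n_particles, sigma, box_length):
    """Area fraction of n disks of diameter sigma in a square box of side box_length."""
    return math.pi * sigma**2 * n_particles / (4.0 * box_length**2)

phi_default = packing_fraction(625, 1.0, 250.0)  # equals pi/400
print(f"{phi_default:.3e}")
```

For $N = 625$ and $L = 250\sigma$ this evaluates to $\pi/400$, matching the quoted $7.85 \times 10^{-3}$.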
Initially, the iABPs are typically arranged on a square lattice in the center of the simulation box, with a lattice spacing equal to their diameter $\sigma$. In order to study the relative importance of vision and alignment in the interplay between these two self-steering mechanisms, we keep the vision-based maneuverability $\Omega_v/D_R$ constant, and vary the alignment-vision ratio by changing the alignment-based maneuverability $\Omega_a$.

Phase behavior - vision-induced steering versus polar alignment
The effect of the maneuverability ratio of polar alignment and vision-induced steering, $\Omega_a/\Omega_v$, and of the vision angle on the emerging structures is illustrated in Fig. 2 for two Péclet numbers. The packing fraction $\Phi = 0.00785$ is very low; hence, typically only a single cluster or very few clusters or aggregates can be observed at any moment in time.
For the low Péclet number $Pe = 10$, where orientational noise plays a significant role, the state diagram in Fig. 2(a) shows two clearly distinguishable regimes: (i) the pursuit-dominated regime at low $\Omega_a/\Omega_v \lesssim 4$ and large vision angles $\theta \gtrsim \pi/4$, characterized by large quasi-circular, nearly immobile clusters, and (ii) the alignment-dominated regime at higher $\Omega_a/\Omega_v \gtrsim 10$ and smaller vision angles $\theta \lesssim \pi/5$, characterized by thick elongated worm-like swarms, which are highly mobile.
When alignment interactions dominate, $\Omega_a/\Omega_v \gtrsim 10$, the iABPs tend to align in the same direction, but cohesion by vision-based steering toward clusters of other iABPs is still relevant; together, these two effects result in the formation of worm-like motile swarms for the large vision angle $\theta = \pi$. As the vision angle is reduced to $\pi/5$, the number of particles in the vision cone decreases, so that cohesion weakens and the worm-like swarms become thinner and more elongated. An important point to note is that even for very small vision angles, i.e., $\theta \le \pi/8$, vision-based cohesion remains important for aggregate formation due to the very low packing fraction.
When the vision-mediated interaction dominates, i.e., for $\Omega_a/\Omega_v = 0.1$ and 0.5, close-packed structures are observed for vision angles $\theta \ge \pi/3$, and dilute structures for lower angles $\theta \le \pi/6$. These cases are similar to those of systems with purely vision-based interactions [36]. For large vision angles, a significant number of neighbors are sensed by an iABP, which then easily moves toward their center of mass; the effect of the alignment interaction is too weak to generate any significant parallel orientation, and the iABPs form large close-packed aggregates. When the vision angle is small, e.g., $\theta = \pi/6$, very few particles are detected within the vision cone, no clusters can form, and particles are distributed homogeneously.
For the intermediate value $\Omega_a/\Omega_v = 4$, close-packed structures are observed at the high vision angle $\theta = \pi$, while thick worm-like motile swarms emerge at the lower value $\theta = \pi/3$ (see also Supplementary Movies M1 and M2).
The effect of vision-based steering becomes weaker with decreasing vision angle, as fewer particles appear in the vision cone. This can be captured by an effective maneuverability $\Omega_{v,eff} = \Omega_v \theta^{\nu}$ with $\nu \ge 1$. We will show below that $\nu \simeq 2$. Thus, vision-based steering dominates at large vision angles, which favors compact clusters, and alignment dominates at intermediate vision angles, which favors worm-like swarms. The wiggling of the worm-like swarm arises from the orientational noise of the leading particles.
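The effective maneuverability suggests a single control parameter, $\Omega_a/(\Omega_v \theta^{\nu})$. The small sketch below evaluates it for the two cases discussed for $\Omega_a/\Omega_v = 4$; it assumes $\nu = 2$ and $\theta$ in radians, and the helper name `effective_ratio` is hypothetical. The text reports the corresponding data collapse only for $\theta \le \pi/4$, so applying the same expression at larger angles is an extrapolation made purely for illustration.

```python
import math

def effective_ratio(omega_a, omega_v, theta, nu=2.0):
    """Scaled alignment-vision ratio Omega_a / (Omega_v * theta**nu)."""
    return omega_a / (omega_v * theta**nu)

# Same bare ratio Omega_a / Omega_v = 4, two different vision angles.
wide = effective_ratio(4.0, 1.0, math.pi)        # about 0.41
narrow = effective_ratio(4.0, 1.0, math.pi / 3)  # about 3.65
print(wide, narrow)
```

The wide-angle case gives a much smaller scaled ratio than the narrow-angle case, in line with the observation of compact clusters at $\theta = \pi$ and worm-like swarms at $\theta = \pi/3$.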
For the higher Péclet number $Pe = 70$, where persistent ballistic motion becomes more prevalent, the characteristic emergent structures are displayed in Fig. 2(b). At high alignment-vision ratio, elongated worm-like swarms are again observed, which are, however, much thinner than those of the lower-activity case with $Pe = 10$. A new feature is the emergence of milling structures, where thin worm-like swarms "bite their own tail" and form ring-like rotating aggregates; they are observed for $1 \le \Omega_a/\Omega_v \le 10$ and vision angles $\pi/3 \le \theta \le \pi/2$ (see Supplementary Movie M3). In the vision-dominated regime, with $\Omega_a/\Omega_v = 0.5$, we observe small rotating clusters or a coexistence phase of small worm-like swarms and small rotating aggregates at vision angles $\theta \ge \pi/3$ (see Supplementary Movies M4, M5). A phase of small worm-like swarms is found for $\theta \le \pi/4$, which is similar to the worm-aggregate phase and single-file motion in the system without alignment interactions [36], except that the aggregates here are rotating and smaller in size. We emphasize that we use different initial conditions for all parameter sets in order to avoid a bias by the initial condition toward some rare configuration, in particular for the milling structures. The highly elongated worm-like swarms can sometimes show milling intermittently, but then regain the worm-like conformation (see Supplementary Movie M6). Yet, the milling conformations displayed in Fig. 2(b) always remain stable over the whole simulation time. For close-packed structures at $Pe = 10$ in Fig. 2(a), we employ an initial configuration in which particles are distributed uniformly. This leads first to the formation of multiple close-packed aggregates, which subsequently merge to form a single large cluster. A different behavior is observed for the small rotating aggregates at $Pe = 70$, e.g., at $\Omega_a/\Omega_v = 0.5$ and $\theta \ge \pi/3$ in Fig. 2(b), which do not merge, but rather form by splitting of an initial large aggregate in the center of the simulation box. Thus, the small rotating clusters at high activities are different from the large close-packed aggregates observed at lower activities.

Phase Behavior - Alignment-Dominated Regime
For a more detailed investigation of the alignment-dominated regime, we focus on the alignment-vision ratio $\Omega_a/\Omega_v = 10$. This provides insight into the structural evolution with increasing activity, characterized by the Péclet number $Pe$, and with the vision angle $\theta$. Figure 3 shows typical snapshots of emerging structures, like thin and thick worm-like swarms, milling, dispersed clusters, and a dilute phase, as a function of these two parameters. For large vision angles $\theta \ge \pi/4$, predominantly long and thick motile worm-like swarms are present. For vision angles $\theta \le \pi/6$, either a dilute phase or dispersed clusters dominate.
With increasing propulsion, the large worm-like swarms become thinner and more elongated as long as $\theta \gtrsim \pi/4$, while for $\theta \le \pi/8$ small aggregates persist. At high activity, $Pe \ge 100$, the large swarms show dynamical splitting into multiple swarms, while small swarms can merge into larger swarms. The very thin worm-like swarms can sometimes span the whole system. There is a small window of parameters ($Pe \simeq 70$, $\theta \simeq \pi/2$) where circular milling-like structures appear.

Phase Behavior - Balanced Alignment-Vision Regime
Figure 2 indicates that $\Omega_a/\Omega_v = 4$ roughly marks the boundary between stationary close-packed compact structures, where vision-based attraction dominates, and motile worm-like swarms, where alignment interactions dominate. Thus, these two types of interactions approximately balance each other at this alignment-vision ratio.
Snapshots of typical emerging structures at different $Pe$ and vision angles $\theta$ are displayed in Fig. 4. For vision angles $\theta \le \pi/8$, either a dilute phase or dispersed clusters are obtained across all activities. With increasing vision angle, $\theta \ge \pi/6$, first worm-like swarms and then compact clusters are stabilized. For higher activity, $Pe \ge 100$, the close-packed structures are absent even at the maximum possible vision angle $\theta = \pi$, because the turning radius of a particle, determined by $Pe/(\Omega_v/D_R)$ [51], becomes too large at fixed maneuverability for the particle to reach the target cluster. There is a transition from close-packed clusters to thick elongated worm-like swarms at $Pe = 10$ and 20 when the vision angle decreases from $\theta = \pi$ to $\pi/6$, similar to Fig. 2(a).
The main characteristics are the presence of (i) compact clusters in the vision-induced steering regime, with $\Omega_a/\Omega_{v,eff} \le 4$, where alignment plays a minor role, (ii) worm-like swarms in the alignment-dominated regime, with $\Omega_a/\Omega_{v,eff} \ge 10$, where the elongation of the swarm increases and the thickness decreases with increasing $Pe$ and increasing $\Omega_a/\Omega_{v,eff}$, and (iii) millings at intermediate values of $\Omega_a/\Omega_{v,eff}$ and $Pe$.
The transition from close-packed aggregates to thick worm-like swarms occurs as the effective alignment-vision ratio increases from $\Omega_a/(\Omega_v \theta^{\nu}) \simeq 1$, both with increasing $\Omega_a$ (see Fig. 2), due to stronger alignment, and with decreasing vision angle (see Fig. 4), due to weaker vision-based steering. Long and thin worm-like swarms are favored by larger activities, due to a larger inward-pushing force of particles at the swarm edges, as studied and explained in more detail in Sec. IV B 1 below. Long and thin worm-like swarms are also favored by small vision angles, as the thickness is related to the range $R_0\theta$ of the vision cone. If the swarm thickness becomes larger than $R_0\theta$, particles on the rim cannot see the full swarm width, and the swarm can split, similar to the single-file motion observed in the vision-only case [34,36].
Importantly, alignment stabilizes persistent swarm motion (compared to the single-file motion of vision-only systems [36]), because the incipient leader particle becomes aware of and is affected by its followers.
It is important to note that the presence of worm-like swarms in our model at the low packing fraction ($\Phi = 0.00785$) is in stark contrast to the structures observed in the Vicsek model at higher packing fraction (e.g., $\Phi = 0.25$), where homogeneous disordered phases and giant motile aggregates coexisting with a dilute gas of single particles are observed [21]. Increasing the vision range yields an outcome comparable to enhancing the visual maneuverability of particles (see SM S-II). Similarly, extending the range of polar alignment has an effect akin to increasing the alignment-related maneuverability (see SM S-III). An interesting feature of worm-like swarms is the increasing elongation and thinning with increasing $\Omega_a$ and increasing $Pe$. This is related to the behavior of particles at the edge of the swarm, which, due to the vision-induced steering, push "inwards", but, due to the strong alignment, can do so only to a limited extent. The balance of vision-induced steering and alignment forces can be employed in a simple mean-field estimate (see SM S-IV) to predict the particle orientation angle $\varphi^*$ at the edge of the swarm. This estimate is in semi-quantitative agreement with the orientational structure of snapshots of worm-like swarms, see Fig. 6. The preferred tilt angles $\varphi^*$ imply a lateral compressive force and an equivalent perpendicular Péclet number
$$ Pe_{\perp} = Pe \sin\varphi^*, \tag{11} $$
which increases with $Pe$, in agreement with the conformations in Fig. 3. Furthermore, the snapshot shows an interesting correlation between particle orientation and the local curvature of the swarm centerline, where an imbalance of particles with inward orientation on the two sides seems to generate the snake-like motion of the swarm.

Swarm Shape and Asphericity
We characterize the overall size and shape of the emerging structures by the radius-of-gyration tensor [54]
$$ G_{mn} = \frac{1}{N} \sum_{i=1}^{N} \Delta r_{i,m} \Delta r_{i,n}, \tag{12} $$
where $\Delta \mathbf{r}_i$ is the position of the $i$-th particle relative to the cluster's center of mass, $m, n \in \{x, y\}$, and $N$ is the total number of particles in the cluster. We use a distance criterion to define a cluster, where an iABP belongs to a cluster when its distance to another iABP of the cluster is within a radius $\sigma_0$. Since our system is very dilute, we choose $\sigma_0 = 2\sigma$. In order to prevent averages from being strongly affected by rarely occurring configurations, we only consider realizations which appear in more than 1% of the recorded configurations.
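The distance criterion for cluster membership amounts to finding connected components of a graph in which two particles are linked when they are closer than $\sigma_0$. A minimal sketch using a union-find structure is given below; the function name `find_clusters` is hypothetical, the pair search is the simple $O(N^2)$ loop, and periodic boundary conditions are ignored for brevity.

```python
def find_clusters(positions, cutoff):
    """Group particles into clusters; i and j are linked if |r_i - r_j| <= cutoff."""
    n = len(positions)
    parent = list(range(n))

    def find(a):
        # Follow parent links to the root, compressing the path on the way.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i in range(n):
        for j in range(i + 1, n):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            if dx * dx + dy * dy <= cutoff * cutoff:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i  # merge the two clusters

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    # Largest cluster first.
    return sorted(clusters.values(), key=len, reverse=True)
```

With `cutoff = 2.0` (i.e., $\sigma_0 = 2\sigma$ in units of $\sigma$), a chain of particles spaced by $1.5\sigma$ is identified as one cluster even though its end particles are farther apart than the cutoff.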
An important quantity to characterize the shape of aggregates is the asphericity
$$ A = \frac{(\lambda_1 - \lambda_2)^2}{(\lambda_1 + \lambda_2)^2}, \tag{13} $$
where $\lambda_1$ and $\lambda_2$ are the eigenvalues of the radius-of-gyration tensor. Figure 7 shows the asphericity $A$ as a function of the alignment-vision ratio at various vision angles $\theta$. The close-packed structures for weak alignment and strong vision at small $\Omega_a/\Omega_v \le 0.5$ are nearly circular, hence $A \simeq 0$, similar to the vision-only case [36]. The worm-like swarms for $10 \le \Omega_a/\Omega_v \le 25$ and $\theta = \pi$, as well as for $\Omega_a/\Omega_v = 1$ and $\theta = \pi/4$, are highly elongated, which results in the large asphericities $A \simeq 0.8$. The asphericity starts to increase with $\Omega_a/\Omega_v$ significantly earlier for smaller vision angles, because cohesion, and thus the formation of compact aggregates, is suppressed for smaller visual signals, which favors worm-like swarms. Thus, the effect of an increase of the vision angle is similar to an enhanced visual maneuverability, because in both cases the tendency of an iABP to steer toward existing clusters increases. Consequently, we rescale the visual maneuverability $\Omega_v$ by a factor that increases with the vision angle $\theta$ to account for this effect. As a result, we can collapse all data of the asphericity for $\theta \le \pi/4$ onto a single master curve by employing the effective scaled variable $\Omega_a/(\Omega_v \theta^{\nu})$, with $\nu \simeq 2$, as demonstrated in the inset of Fig. 7. This shows that the asphericity displays universal behavior as a function of this scaled alignment-vision ratio, with a sharp transition from the compact-cluster to the worm-like swarm phase at $\Omega_a/(\Omega_v \theta^2) \simeq 2$.
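For a two-dimensional cluster, the gyration tensor is a symmetric $2 \times 2$ matrix whose eigenvalues can be written in closed form. The sketch below assumes the common two-dimensional definition $A = (\lambda_1 - \lambda_2)^2/(\lambda_1 + \lambda_2)^2$, which gives $A = 0$ for a circularly symmetric cluster and $A = 1$ for a straight line; the function names are hypothetical and positions are assumed to be already unwrapped across periodic boundaries.

```python
import math

def gyration_tensor(points):
    """Return (Gxx, Gxy, Gyy) of the 2D radius-of-gyration tensor."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    gxx = sum((p[0] - cx) ** 2 for p in points) / n
    gyy = sum((p[1] - cy) ** 2 for p in points) / n
    gxy = sum((p[0] - cx) * (p[1] - cy) for p in points) / n
    return gxx, gxy, gyy

def asphericity(points):
    """Asphericity from the two eigenvalues of the gyration tensor."""
    gxx, gxy, gyy = gyration_tensor(points)
    trace = gxx + gyy  # lambda_1 + lambda_2
    if trace == 0.0:
        return 0.0     # all points coincide
    # lambda_1 - lambda_2 for a symmetric 2x2 matrix.
    diff = 2.0 * math.sqrt(((gxx - gyy) / 2.0) ** 2 + gxy**2)
    return (diff / trace) ** 2
```

Applied to each cluster returned by a cluster analysis, this gives the per-cluster shape measure that can then be averaged over configurations.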
A similar scaling behavior is found for the radius of gyration, see SM S-V.

Global Polarization
The global polarization is characterized by the order parameter
$$ P = \left\langle \left| \frac{1}{N} \sum_{i=1}^{N} \mathbf{e}_i \right| \right\rangle, \tag{14} $$
where $\mathbf{e}_i$ is the orientation of particle $i$ and the average is performed over time. Figure 8 shows the polarization as a function of the vision angle $\theta$ for $\Omega_a/\Omega_v = 10$ at various activities $Pe$. For $\theta \lesssim \pi/8$, particles are randomly oriented and $P \simeq 0$. For larger vision angles, global polarization emerges, which can reach $P = 1$ for $\theta = \pi$. Global polarization at small vision angles is enhanced by larger $Pe$, due to the more persistent particle motion. It is important to note that we are not characterizing bulk phases here, but typically a single large cluster. Thus, $P$ quantifies the alignment order within the cluster. $P \simeq 1$ also does not imply that the cluster is always moving in the same direction, only that the propulsion directions of the individual particles remain highly aligned at any moment in time.
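A minimal sketch of the polarization measurement, assuming the order parameter is the magnitude of the mean orientation vector, averaged over stored configurations; the function names `polarization` and `mean_polarization` are introduced here for illustration.

```python
import math

def polarization(angles):
    """Magnitude of the mean unit orientation vector for one configuration."""
    n = len(angles)
    px = sum(math.cos(a) for a in angles) / n
    py = sum(math.sin(a) for a in angles) / n
    return math.hypot(px, py)

def mean_polarization(frames):
    """Time average of the instantaneous polarization over a list of frames."""
    return sum(polarization(frame) for frame in frames) / len(frames)
```

Because the magnitude is taken before the time average, a swarm that turns as a whole between frames still yields a value close to 1, consistent with the remark that a large polarization does not imply a fixed direction of motion.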

Mean-Square Displacement
The translational motion of the iABPs is characterized by their mean-square displacement (MSD)
$$ \langle \mathbf{r}^2(t) \rangle = \frac{1}{N} \sum_{i=1}^{N} \left\langle \left( \mathbf{r}_i(t + t_0) - \mathbf{r}_i(t_0) \right)^2 \right\rangle, \tag{15} $$
where the average is performed over the initial time $t_0$. An important reference case is the behavior of single ABPs, for which theoretical calculations in two dimensions yield
$$ \langle \mathbf{r}^2(t) \rangle = 4 D_T t + \frac{2 v_0^2}{D_R^2} \left( D_R t - 1 + e^{-D_R t} \right). \tag{16} $$
Figure 9 displays mean-square displacements of iABPs for various alignment-vision ratios, vision angles, and Péclet numbers. For larger $\Omega_a/\Omega_v = 4$ to 25, where alignment interactions dominate over vision-controlled steering (worm-like swarms), the particles move nearly ballistically and $\langle \mathbf{r}^2(t) \rangle \sim t^{\alpha}$, with the exponent $\alpha \approx 1.95$. For $\Omega_a/\Omega_v \le 0.1$, in the vision-dominated regime, the close-packed aggregates display translational diffusion and $\langle \mathbf{r}^2(t) \rangle \sim t$. The transition from ballistic to diffusive motion occurs at $\Omega_a/\Omega_v \simeq 0.5$ for $\theta = \pi/4$. It shifts to $\Omega_a/\Omega_v \simeq 1$ for $\theta = \pi/2$, in agreement with the conclusion in Sec. IV B 2 that the importance of vision-controlled steering increases with increasing vision angle.
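The single-ABP reference curve and its crossover from ballistic to diffusive scaling can be explored numerically. The sketch below assumes the standard overdamped two-dimensional result $\langle \mathbf{r}^2(t) \rangle = 4 D_T t + (2 v_0^2/D_R^2)(D_R t - 1 + e^{-D_R t})$, which neglects the small inertial correction; the function names and the finite-difference estimate of the local exponent are choices made here.

```python
import math

def abp_msd(t, v0, d_rot, d_trans):
    """Overdamped MSD of a free active Brownian particle in two dimensions."""
    x = d_rot * t
    # expm1 avoids cancellation in (x - 1 + exp(-x)) at small x.
    active = 2.0 * v0**2 / d_rot**2 * (x + math.expm1(-x))
    return 4.0 * d_trans * t + active

def local_exponent(t, v0, d_rot, d_trans, factor=1.01):
    """Logarithmic slope d ln(MSD) / d ln(t), estimated by a finite difference."""
    ratio = abp_msd(t * factor, v0, d_rot, d_trans) / abp_msd(t, v0, d_rot, d_trans)
    return math.log(ratio) / math.log(factor)

# Active part only (d_trans = 0): ballistic at short and diffusive at long times.
short_time = local_exponent(1e-3 / 0.08, 0.7, 0.08, 0.0)
long_time = local_exponent(1e4 / 0.08, 0.7, 0.08, 0.0)
print(short_time, long_time)
```

For a single ABP the crossover sits near $D_R t \simeq 1$, which is the baseline against which the much longer super-diffusive regime of the worm-like swarms can be compared.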

Collective Dynamics
The dynamics of elongated worm-like swarms is characterized by an essentially one-dimensional motion along a curvilinear path, where all particles of the swarm trace out trajectories that are only slightly displaced laterally from the trajectory of the center-line (see Movie M7). This is reminiscent of the "railway motion" performed by semi-flexible, tangentially driven active polymers at high Péclet numbers [55].
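Thus, we can characterize the dynamics of the whole swarm by the temporal auto-correlation function of the orientations of individual particles,
$$ C_{\theta}(t) = \frac{1}{N} \sum_{i=1}^{N} \langle \mathbf{e}_i(t + t_0) \cdot \mathbf{e}_i(t_0) \rangle, \tag{17} $$
where $\mathbf{e}_i$ represents the orientation of particle $i$, and $N$ is the total number of particles in the swarm.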
In the case of railway motion, the spatial conformations of the swarm as well as the temporal auto-correlation function, Eq. (17), are determined by the statistical properties of the (infinitely long) rail, with the spatial correlation function of tangent vectors $\mathbf{t}(s)$ (with contour length $s$)
$$ \langle \mathbf{t}(s) \cdot \mathbf{t}(s') \rangle = \exp(-|s - s'|/\xi_p) \tag{18} $$
and persistence length $\xi_p$. This length is also the spatial correlation length of shape fluctuations of the center-line of the swarm. Furthermore, the "railway" assumption implies that
$$ C_{\theta}(t) = \langle \mathbf{t}(s + v_0 t) \cdot \mathbf{t}(s) \rangle = \exp(-v_0 t/\xi_p), \tag{19} $$
where $v_0$ is the active velocity. Thus, the temporal decay of $C_{\theta}(t)$ should be controlled by the relaxation time $\tau = \xi_p/v_0$, with the same persistence length $\xi_p$ as the instantaneous conformations.
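Under the railway picture, a persistence length can be extracted from a measured orientation auto-correlation function by an exponential fit. The sketch below assumes a single-exponential decay $C(t) = e^{-t/\tau}$ and fits $\ln C$ against $t$ by least squares through the origin; the function names are hypothetical, and non-positive correlation values (noisy tail) are simply skipped.

```python
import math

def fit_relaxation_time(times, corr):
    """Least-squares fit of ln C(t) = -t / tau through the origin; returns tau."""
    num = den = 0.0
    for t, c in zip(times, corr):
        if c <= 0.0:
            continue  # logarithm undefined; skip the noisy tail
        num += t * math.log(c)
        den += t * t
    slope = num / den
    return -1.0 / slope

def persistence_length(times, corr, v0):
    """Temporal persistence length xi_p = v0 * tau from the fitted decay."""
    return v0 * fit_relaxation_time(times, corr)

# Synthetic check: data generated with tau = 50 and v0 = 0.7 (illustrative values).
ts = [float(k) for k in range(1, 21)]
cs = [math.exp(-t / 50.0) for t in ts]
xi = persistence_length(ts, cs, 0.7)
print(xi)
```

In practice the fit window would be restricted to times where the decay is clearly exponential; the synthetic numbers above are not taken from the simulations.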
Figure 10 shows examples of the auto-correlation function $C_\theta(t)$, together with exponential fits (inset), and the derived persistence lengths for various parameter combinations. It is interesting to note that the spatial and the temporal (Eq. (19), Eq. S-5) persistence lengths are in good quantitative agreement for elongated worm-like swarms (see SM S-VIII, Fig. S8). The persistence length $\xi_p$ displays three important trends: (i) it grows roughly linearly with the alignment-induced maneuverability $\Omega_a$, (ii) it is only weakly dependent on the Péclet number, and (iii) it decreases with the vision angle roughly as a power law $\theta^{-1}$ in the range $\pi/4 \le \theta \le \pi/2$. Together, this implies $\xi_p \sim \Omega_a \theta^{-1}$ (20). The increase of the persistent swarm motion (compared to the single-file motion of vision-only systems [36]) with stronger alignment-induced maneuverability can be traced back to the effect of the followers on the incipient leader particle through the (isotropic) alignment interaction. The larger persistence length at $\theta = \pi/4$ than at $\pi/2$ can be attributed to the larger worm length at $\theta = \pi/4$, where a more focused vision enables the particles to follow the incipient leader more easily.
An important point to note here is that the persistence length for the considered parameter combinations is always large, with $\xi_p/\sigma > 100$. Since the effective translational diffusion coefficient for a random-walk-like motion with Kuhn length $\xi_p$ scales as $D_{\rm eff} \sim v_0 \xi_p$ (21), this explains the large ballistic/superdiffusive regime in Fig. 15, because the crossover from the ballistic to the diffusive regime occurs at $D_R t^* \simeq D_R \xi_p/v_0$, which implies $D_R t^* \simeq 8(\xi_p/\sigma)/Pe \simeq 100$. A notable exception in Fig. 10 is the vision angle $\theta = \pi$. In this case, the persistence length increases roughly in proportion to the Péclet number. The main difference to the case of smaller vision angles is that the worm-like swarms are here much thicker (and shorter), and exhibit a less persistent motion for small $Pe$. This implies that the rotational diffusion of the leading group of particles, which is determined by $Pe$, now plays an important role. Furthermore, with increasing $Pe$, the swarm thickness decreases (see Fig. 3), which also contributes to an increasing persistence length.

Milling
Milling structures are characterized by the angular frequency $\omega$ and the radius $R$ of these aggregates, where $N_e$ is the total number of particles in the milling structure and $\mathbf{r}_{cm}$ is the center-of-mass position. For $\Omega_a/\Omega_v = 4$ and $\pi/4 < \theta < \pi/2$, the radius of the ring-like milling ribbon increases roughly as $R \sim Pe^2$. This is caused by the increasing persistence of the iABP motion with increasing $Pe$. Figure 11(a) shows the scaled radius as a function of the vision angle, and suggests that $R \sim \theta^{-\gamma}$, with $\gamma \simeq 1.3$. The angular frequency $\omega$ decreases with increasing activity $Pe$ approximately as $\omega \sim 1/Pe$. This decrease is related to the increasing radius, because $\omega \sim v/R$ and $v \sim Pe$. Figure 11(b) shows the scaled frequency $\omega$ as a function of the vision angle, with $\omega \sim \theta^{\gamma}$ and the same exponent $\gamma$ as for the radius. Thus, all together, we predict the scaling behavior $R = c_R \, \sigma \, \theta^{-\gamma} N_e Pe^2$, $\omega = c_\omega \, \theta^{\gamma} D_R/(N_e Pe)$ (24), with $\gamma \simeq 1.3$ and constants $c_R$ and $c_\omega$. The data in Fig. 11 indeed collapse well onto these single scaling curves.
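A minimal sketch of how the radius and angular frequency of a milling structure could be measured from a single snapshot, together with the scaling prediction of Eq. (24), is given below. The estimators (mean distance from the center of mass, and mean of $(\mathbf{r} \times \mathbf{v})_z/r^2$) are assumptions made for illustration and need not coincide with the definitions used in this work. The constants $c_R$ and $c_\omega$ are set to 1 as placeholders, because their fitted values are not quoted in the text.

```python
import numpy as np

def milling_observables(pos, vel):
    """Estimate (R, omega) of a milling structure from positions and velocities.

    pos, vel: arrays of shape (N_e, 2). Assumed estimators:
    R     = mean distance of particles from the center of mass,
    omega = mean of the z-component of (r x v) / r^2 about the center of mass.
    """
    rel = pos - pos.mean(axis=0)
    dist = np.linalg.norm(rel, axis=1)
    radius = dist.mean()
    cross = rel[:, 0] * vel[:, 1] - rel[:, 1] * vel[:, 0]
    omega = np.mean(cross / dist**2)
    return radius, omega

def milling_scaling(theta, n_e, pe, sigma=1.0, d_r=1.0, c_r=1.0, c_w=1.0, gamma=1.3):
    """Scaling form of Eq. (24); c_r and c_w are placeholder prefactors."""
    radius = c_r * sigma * theta ** (-gamma) * n_e * pe**2
    omega = c_w * theta**gamma * d_r / (n_e * pe)
    return radius, omega
```

In this scaling form the product $R\,\omega$ is independent of $N_e$ and $\theta$ and grows linearly with $Pe$, consistent with $\omega \sim v/R$ and $v \sim Pe$.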

D. Finite-Size Effects
In order to elucidate the influence of finite-size effects on the presented results, we construct a phase diagram for the same parameters as in Fig. 5, i.e., for the alignment-vision ratio $\Omega_a/\Omega_v = 4$, packing fraction $\phi = 0.00785$, and vision maneuverability $\Omega_v/D_R = 12.5$, but now with twice the number of particles, i.e., $N = 1250$, see Fig. 12. Overall, the topology of the phase diagram remains the same; for the most part, only the phase boundaries are slightly shifted. The most significant change is the extension of the region of stability of the milling structure, which extends to smaller Péclet numbers and smaller $\theta$ for larger $N$. This similarity does not imply that the iABP behavior is independent of $N$. An obvious effect of increasing $N$ is that the close-packed, nearly circular aggregates in the HCP phase grow in size, with their radius increasing as $\sqrt{N}$. As the particles are all pushing toward the joint center of mass, this implies that the aggregates become more stable, which is expressed by the shift of the boundary between the HCP phase and the worm-like swarms to lower vision angles. Worm-like swarms and milling structures can either remain a single aggregate, or split intermittently into several smaller structures and merge again. In the latter case, only minor finite-size effects can be expected. In the former case, increasing $N$ implies longer or thicker worm-like swarms or milling structures. Thicker swarms exhibit a more persistent and less "snaking-like" motion. This also appears for the milling structures, where $R \sim N$ and correspondingly $\omega \sim 1/N$, see Fig. 11 and Eq. (24).
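The system sizes involved can be made concrete with a short calculation. The sketch assumes disks of diameter $\sigma$ in a square box, with the packing fraction defined as $\phi = N \pi \sigma^2/(4 L^2)$, and a circular aggregate at the hexagonal close-packing density $\pi/(2\sqrt{3})$. Both definitions are assumptions for illustration, and the baseline $N = 625$ is inferred from the statement that $N = 1250$ is twice the original particle number.

```python
import math

def box_length(n, phi, sigma=1.0):
    """Side length L of a square box, assuming phi = n * pi * sigma^2 / (4 L^2)."""
    return math.sqrt(n * math.pi * sigma**2 / (4.0 * phi))

def close_packed_radius(n, sigma=1.0):
    """Radius of a circular aggregate of n disks at hexagonal close packing."""
    phi_hex = math.pi / (2.0 * math.sqrt(3.0))   # area fraction of 2D hexagonal packing
    return 0.5 * sigma * math.sqrt(n / phi_hex)

# Doubling N at fixed packing fraction enlarges the box and the aggregate by sqrt(2).
for n in (625, 1250):
    print(n, box_length(n, 0.00785), close_packed_radius(n))
```

Under these assumptions, $\phi = 0.00785 \approx \pi/400$ corresponds to a box of side length of about $250\sigma$ for $N = 625$, so a single close-packed aggregate occupies only a small part of the simulation box.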

V. SUMMARY AND CONCLUSIONS
We have studied the emergent structures and dynamics of ensembles of cognitive, self-steering particles, with a combination of visual-perception-controlled steering and polar alignment. The visual signal gives particles a tendency to reorient toward the center of mass of other particles in their visual field and implies group cohesion, whereas polar alignment induces particle reorientation toward the average orientation of their neighbors within the alignment-perception range and implies collective directed motion. Depending on the vision-induced maneuverability $\Omega_v$ and the polar-alignment-related maneuverability $\Omega_a$, various kinds of collective motion are obtained. Moreover, the vision angle $\theta$, the vision range $R_0$, and the activity $Pe$ play a crucial role in structure formation. In worm-like swarms, which are predominantly observed for large alignment-vision ratios $\Omega_a/\Omega_v \gtrsim 4$, particles move together collectively with little individual orientational fluctuations, and the swarm displays super-diffusive or nearly ballistic motion over long times. Dispersed clusters and dilute phases prevail at small vision angles, typically $\theta \le \pi/8$, due to the small number of particles in the vision cone, which implies weak cohesion. Close-packed disk-like clusters emerge for high vision-induced maneuverability and vision angles $\theta \ge \pi/4$, because the larger the number of particles in a visible cluster, the larger is also the tendency to quickly turn toward its center of mass.
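The interplay of the two steering contributions summarized above can be illustrated with a minimal sketch of the deterministic reorientation rate of one particle. The functional form is an assumption made for illustration: a vision term weighted by $e^{-r_{ij}/R_0}$ inside a cone of half-opening angle $\theta$ and range $4R_0$, and an alignment term averaged over neighbors within a cutoff $R_C$. Rotational noise is omitted, the function name is hypothetical, and the model definition given earlier in the paper takes precedence.

```python
import numpy as np

def steering_rate(i, pos, phi, omega_v, omega_a, theta, r0, r_c):
    """Deterministic part of d(phi_i)/dt for particle i (assumed form, noise omitted).

    pos: (N, 2) positions, phi: (N,) orientation angles.
    """
    rel = np.delete(pos, i, axis=0) - pos[i]
    dist = np.linalg.norm(rel, axis=1)
    phi_j = np.delete(phi, i)
    bearing = np.arctan2(rel[:, 1], rel[:, 0])
    # Angle between the heading of particle i and the direction to each neighbor,
    # wrapped to (-pi, pi].
    off_axis = np.angle(np.exp(1j * (bearing - phi[i])))

    in_cone = (np.abs(off_axis) <= theta) & (dist <= 4.0 * r0)   # vision cone
    in_align = dist <= r_c                                       # alignment region

    rate = 0.0
    if in_cone.any():
        # Vision term: turn toward the weighted center of mass of visible particles.
        w = np.exp(-dist[in_cone] / r0)
        rate += omega_v * np.sum(w * np.sin(off_axis[in_cone])) / np.sum(w)
    if in_align.any():
        # Alignment term: turn toward the mean heading of nearby particles.
        rate += omega_a * np.mean(np.sin(phi_j[in_align] - phi[i]))
    return rate
```

With $\Omega_a = 0$ the particle only turns toward what it sees, which favors cohesion, while with $\Omega_v = 0$ it only aligns, which favors directed motion; the ratio $\Omega_a/\Omega_v$ sets the balance between the two.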
Circular milling structures are obtained mainly for balanced alignment-vision ratios, $\Omega_a/(\Omega_v \theta^2) \simeq 4$, at an intermediate range of activities and vision angles. The underlying mechanism for such structures to be stable is that the persistence length of the worm-like contour of a conformation should be on the same order as its contour length, because large persistence lengths favor elongated swarms. The milling structure is characterized by a radius $R \sim Pe^2$ and, hence, a rotation frequency $\omega \sim 1/Pe$. Balanced maneuverability, i.e., $\Omega_a/(\Omega_v \theta^2) \simeq 4$, seems to be a very favorable condition for swarms in general, because it makes swarms susceptible to external perturbations while remaining cohesive, so that the swarm can quickly react to the appearance of predators. We want to mention parenthetically that the importance of criticality in biological systems has also been discussed in the context of scale-free correlations in swarms of midges [56].
A closer look at the internal structure of a worm-like swarm reveals interesting new features. Particle orientations at the edge of the swarm are weakly inclined toward the centerline, which implies a compressive force responsible for swarm elongation. Furthermore, a lateral asymmetry seems to be correlated with undulations of the centerline. Although our approach shares some basic features with the "boid model" [16] and the "behavioural zonal model" [17], it differs in other important aspects. First, we employ a hard-core repulsion between the agents, which implies close-packed aggregates, whereas the previous models typically adopt a softer repulsion potential, which leads to disordered aggregates. Thus, the way the repulsion between particles is modeled plays a crucial role in structure formation. Which of these repulsive interactions is more appropriate certainly depends on the real system to be considered. Second, while in the other models [16,17] attraction-related reorientation is instantaneous, in our model cohesion emerges from vision-based steering, where the reorientation toward a target is restricted by a limited maneuverability. Thus, both vision- and alignment-related maneuverability are important parameters, which have not been investigated in combination with alignment so far.
Thick worm-like swarms have already been observed in the "behavioural zonal model", where they are called highly parallel groups [17,18]. However, we also obtain highly elongated, thin worm-like swarms, in particular for large Péclet numbers and large alignment-vision ratios. These swarms have to be distinguished from the swarms in the pure vision-based minimal cognitive model with and without excluded volume [34,36], where they display single-file motion and are much shorter in length, i.e., less stable. The most interesting feature of these thin worm-like swarms is that they can transform into metastable milling states, where the swarm bites its own tail, and then regain the elongated shape later on. Milling structures have also been observed previously in the behavioural zonal model [17]. We observe both large (polar) milling bands and small rotating aggregates, where the latter differ from the nematic ring-like bands in the vision-only case with point particles [34].
Millings have been seen previously in simulations of other models [7,37], but more importantly in groups of several animal species in the wild, such as schooling fish [57], army ants [58], bats [59], plant-animal worms [60,61], and Dictyostelium [62]. Large extended worm-like swarms have been observed in flocks of birds [6,63], herds of sheep, and schools of fish.
We conclude from our simulations that it would be very interesting to study and characterize the existence, motion, and trajectories of large worm-like swarms in more detail, both in simulations and in animal herds in the wild. We have analyzed the trajectories in terms of a persistent random walk model and extracted an effective persistence length. However, it is not at all obvious that the assumption of a persistent random walk fully captures the complexity of motion of an animal herd. In fact, a more detailed look at the long-time trajectory of a worm-like swarm, see Fig. 13, already indicates that this behavior, with long stretches of persistent directed motion interrupted by loop-like pieces and sharp turns, is much more complex and interesting than a simple persistent random walk.

FIG. 1. (a) Schematic representation of the vision cone and alignment neighborhood of particle $i$ at position $\mathbf{r}_i$, with orientation $\mathbf{e}_i$, distance vector $\mathbf{r}_j - \mathbf{r}_i$ to other particles, and the corresponding orientation angle $\phi_{ij}$. (b) Schematic showing the polar orientation field with cutoff $R_C$ and the vision cone of the blue particle with vision angle $\theta$ and cutoff radius $4R_0$. The blue particle interacts with other particles (red) through visual perception only within the vision cone (green), and aligns with other particles (pink) within the alignment region (grey). Particles (white) in the overlap region experience both interactions.

FIG. 2. Snapshots of emerging structures for various vision angles $\theta$, alignment-vision ratios $\Omega_a/\Omega_v$, and the Péclet numbers (a) $Pe = 10$ and (b) $Pe = 70$. In order to ensure clear visibility, the snapshots are not presented to scale. For certain structures, a zoomed-in view is necessary to provide a more detailed representation; see SM, Fig. S1 for scaled structures. The dilute phase is highlighted by a yellow box, dispersed clusters by a grey box, close-packed clusters by a purple box, and worm-like swarms by a green box.

FIG. 3. Snapshots of iABP structures for various vision angles $\theta$, Péclet numbers $Pe$, and an alignment-vision ratio $\Omega_a/\Omega_v = 10$. The snapshots are not to scale for better visualization. See SM, Fig. S2 for the full phase diagram.

FIG. 4. Snapshots of emerging structures for different activities $Pe$, vision angles $\theta$, and alignment-vision ratio $\Omega_a/\Omega_v = 4$ for the packing fraction $\phi = 0.00785$. The snapshots are not to scale for better visualization.

FIG. 7. Aggregate asphericity $A$ as a function of the ratio $\Omega_a/\Omega_v$ for $Pe = 10$ and the indicated vision angles.

FIG. 8. Polarization order parameter $P$ as a function of the vision angle $\theta$ for the indicated activities $Pe$ and $\Omega_a/\Omega_v = 10$. The packing fraction is $\phi = 0.00785$.