Modeling collective behaviors from optic flow and retinal cues

Animal collective behavior is often modeled with self-propelled particles, assuming each individual has "omniscient" knowledge of its neighbors. Yet, neighbors may be hidden from view, and the effect of this information loss is unknown. To address this question, we propose a visual model of collective behavior where each particle moves according to bio-plausible visual cues, in particular the optic flow. This visual model successfully reproduces three classical collective behaviors: swarming, schooling, and milling. It also offers a potential solution for controlling artificial swarms visually.

In self-propelled particle (SPP) models, the rules of attraction, alignment, and avoidance are typically applied with the simplifying assumption that each individual possesses idealized senses, is "omniscient", and gauges perfectly the position, distance, orientation, and velocity of its neighbors [16,17]. However, this assumption may not hold in practice, especially when some neighboring individuals are hidden from view [18-22].
The causal link between visual cues and collective behavior has been shown through several vision-based biological models [23-26]. As a result, different ways of incorporating vision in SPP models have been suggested. The most widespread approach consists of using vision to filter information [27,28]. One of these models suggested that flocks of starlings adjust their density to reach a state of "marginal opacity" [29]. However, this density adjustment does not seem to be widespread, as certain animals such as fish can form opaque schools [7,30].
In a recent study, Bastien and Romanczuk proposed a model of collective behavior based purely on vision, capable of reproducing most collective behaviors [31]. Their model simulates each individual's response to a projection of the visual field, rather than relying on omniscient information. However, the portrayed milling phase is uncoordinated, meaning that the particles turn in both directions within the same swirl. Moreover, while they claim to use the simplest possible equations of movement that satisfy fundamental symmetries, their model involves six parameters that are hard to relate to the classical 3A rules. In addition, four terms of their model involve spatial and temporal derivatives that would require significant computing resources. Instead, we propose to use bio-plausible visual cues that can be measured directly by a visual sensor, i.e., the optic flow and the retinal position.
Optic flow refers to the apparent angular velocity of objects in the visual field due to the relative motion between the observer and its surroundings. Numerous animal species perceive and use optic flow for a variety of tasks. Insects use it to navigate in crowded settings [32], evade ground obstacles, and control their landing [33]. Fish use it for navigation [34], and birds use it during takeoff [35]. Optic flow is ubiquitous in nature. Its processing involves specialized neurons, well identified in invertebrates, which have inspired sensors dedicated to optic flow [36-39]. These visual sensors can provide panoramic optic flow sensing [37,38] and allow for direct measurement and on-board panoramic use of this visual cue [40,41].
In this article, we propose a non-stereoscopic vision-based model of collective behavior, inspired by animal vision. This model is intended to be implementable on autonomous robots equipped with visual sensors: the robots do not need to communicate with a central command, to know their neighbors' relative positions, or to be georeferenced. Our model aims at bridging the gap between traditional SPP models that rely on omniscient information and biomimetic visual approaches.

MODEL
We consider a system of N self-propelled particles in two dimensions. Each particle is a circular object with radius a, moving at a constant speed U. The position of the i-th particle is denoted x_i and its direction e_i = [cos θ_i, sin θ_i], with θ_i its heading [Fig. 1(a)].
[Fig. 1. Notations and principle of the visual cues. (a) The particle i is located at x_i and θ_i is its heading; θ_ij is the retinal position of particle j for the i-th particle. (b) The angle ϕ is the line of sight with respect to the particle's heading. The vector V_ij = U(e_j − e_i) is the relative velocity of particle j perceived by i; it can be decomposed into radial and azimuthal components, V_ir and V_iϕ. (c) Point of view of particle i; the arrows represent the optic flow generated by the relative angular velocity of each particle. (d) The visual field V_i(ϕ) is a binary representation of (c), formed of a set of "shades". (e) The function R_i(ϕ) represents the estimated distance associated with the shades, assuming each corresponds to a single individual of radius a. (f) Optic flow O_i(ϕ) associated with the apparent angular velocity of the shade edges. (g) Optic flow divergence D_i(ϕ).]

We model the interactions between particles using changes in their angular velocity associated with attraction, alignment, and noise. Specifically, the equations of
motion can be written as

$$\dot{\mathbf{x}}_i = U\,\mathbf{e}_i, \tag{1a}$$
$$\dot{\theta}_i = k_\odot\,\omega_\odot + k_\parallel\,\omega_\parallel + k_\eta\,\dot{\eta}, \tag{1b}$$

where dots denote temporal derivatives and η(t) is a standard Wiener process representing rotational noise. The functions ω_⊙ and ω_∥ are O(1) functions representing attraction and alignment. The parameters k_⊙, k_∥, and k_η control the strengths of attraction, alignment, and noise. For simplicity, we did not include an avoidance rule, as it is not required to reproduce collective behaviors [2,12,42].

We begin by introducing an omniscient model that will serve as a reference. This model is inspired by a data-driven fish model [43,44], with the difference that each particle interacts with all the others. The attraction and alignment terms are given by

$$\omega_\odot^{\mathrm{omn.}} = \big\langle \|\mathbf{x}_j - \mathbf{x}_i\|\, \sin(\theta_{ij})\, b_\epsilon(\theta_{ij}) \big\rangle_j, \tag{2a}$$
$$\omega_\parallel^{\mathrm{omn.}} = \Big\langle U\, \frac{\mathbf{e}_i \times \mathbf{e}_j}{\|\mathbf{x}_j - \mathbf{x}_i\|^2}\, b_\epsilon(\theta_{ij}) \Big\rangle_j, \tag{2b}$$

where b_ε(ϕ) = 1 + ε cos ϕ models the blind angle (see also Fig. S1, [45]). When ε = 0, b_ε is isotropic; when ε = 1, the particle cannot see behind itself [45]. The brackets denote a normalization defined as

$$\langle f \rangle_j = \frac{\sum_{j \neq i} f(j)}{\sum_{j \neq i} |f(j)|}. \tag{3}$$

To model visual perception, we assume that each particle senses a visual field V_i(ϕ), where ϕ represents the angle between the particle's heading and the line of sight [Fig. 1(c)]. The function output is binary, indicating the presence or absence of a shade in the visual field [Figs. 1(b)-(d)].
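As a concrete illustration, here is a minimal NumPy sketch of these omniscient terms; the function name and array layout are ours, and it assumes the reconstructed forms of Eqs. (2a)-(3) above.

```python
import numpy as np

def omniscient_terms(x, theta, i, eps=1.0, U=1.0):
    """Attraction and alignment felt by particle i in the omniscient
    model, assuming the forms of Eqs. (2a)-(3)."""
    dx = np.delete(x - x[i], i, axis=0)              # vectors towards neighbors
    dtheta = np.delete(theta - theta[i], i)          # heading differences
    r = np.linalg.norm(dx, axis=1)                   # distances to neighbors
    theta_ij = np.arctan2(dx[:, 1], dx[:, 0]) - theta[i]  # retinal positions
    b = 1.0 + eps * np.cos(theta_ij)                 # blind-angle weighting
    att = r * np.sin(theta_ij) * b                   # summand of Eq. (2a)
    ali = U * np.sin(dtheta) / r**2 * b              # summand of Eq. (2b)
    normalize = lambda f: f.sum() / np.abs(f).sum()  # bracket of Eq. (3)
    return normalize(att), normalize(ali)
```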
Using the information from the visual field V_i(ϕ), we can derive the function R_i(ϕ) = a / sin(∆ϕ/2), where ∆ϕ represents the angular width of the shade [Fig. 1(e)]. With this definition, when a shade is associated with a single particle, R_i(ϕ) represents its distance from the viewer.
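As a quick numerical check, a shade of angular width 0.2 rad produced by a particle of radius a = 1 is estimated to be about ten radii away; below is a minimal sketch (the helper name is ours):

```python
import numpy as np

def estimated_distance(delta_phi, a=1.0):
    """Estimated distance R = a / sin(delta_phi / 2) of a shade of
    angular width delta_phi, assuming a single particle of radius a."""
    return a / np.sin(delta_phi / 2)

print(estimated_distance(0.2))  # ~10.0: a narrow shade means a distant particle
```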
Temporal changes in the features of the visual field can be used to calculate the optic flow O_i(ϕ). A simplified optic flow is estimated by assuming that each shade has a pattern that moves and deforms with it. As a result, O_i(ϕ) is a linearly interpolated function of the angular velocities of two features of the shade, its rising and falling edges [Fig. 1(f)]. This method of computing the optic flow corresponds to a cross-correlation of visual field features [46], which is known to occur in animal eyes [47,48]. Similarly, we can compute the optic flow divergence D_i(ϕ) by differentiating the optic flow O_i(ϕ) [Fig. 1(g)]. Due to the piecewise-linear nature of O_i(ϕ), D_i(ϕ) is a piecewise-constant function.
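A minimal sketch of this edge-based estimate for a single shade, assuming its rising and falling edges move with known angular velocities (names and signature are ours):

```python
import numpy as np

def shade_optic_flow(phi, edge_rise, edge_fall, v_rise, v_fall):
    """Optic flow inside one shade: linear interpolation between the
    angular velocities of its rising and falling edges, and the
    resulting piecewise-constant divergence."""
    O = np.interp(phi, [edge_rise, edge_fall], [v_rise, v_fall])
    D = (v_fall - v_rise) / (edge_fall - edge_rise)  # dO/dphi on the shade
    return O, D
```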
The optic functions V, R, O, and D are inspired by animal vision, and they can easily be computed by an artificial vision system. We will now use these functions to construct the attraction and alignment terms of a visual model.
In this visual model, the attraction and alignment terms are given by

$$\omega_\odot^{\mathrm{visu.}} = \big\langle V_i(\phi)\, R_i^2(\phi)\, \sin(\phi)\, b_\epsilon(\phi) \big\rangle_\phi, \tag{4a}$$
$$\omega_\parallel^{\mathrm{visu.}} = \Big\langle V_i(\phi)\, \frac{\mathbf{e}_i \times \mathbf{V}_{ij}}{R_i(\phi)}\, b_\epsilon(\phi) \Big\rangle_\phi, \tag{4b}$$

where the brackets denote the normalization given by Eq. (3) with the sum replaced by an integral over the visual field. The attraction and alignment terms of the visual model are constructed to be similar to those of the omniscient model given in Eqs. (2a)-(2b). The difference in the exponent of R comes from the additional ∆ϕ ∼ 1/R arising from the integration.
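For illustration, a minimal sketch of the attraction term of Eq. (4a) evaluated on a sampled visual field; the discretization and names are ours, and the bracket is approximated by the normalized integral:

```python
import numpy as np

def visual_attraction(phi, V, R, eps=1.0):
    """phi: (K,) sample angles; V: (K,) binary visual field;
    R: (K,) estimated distances. Returns the normalized Eq. (4a)."""
    f = V * R**2 * np.sin(phi) * (1.0 + eps * np.cos(phi))
    denom = np.trapz(np.abs(f), phi)
    return np.trapz(f, phi) / denom if denom > 0 else 0.0
```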
In the alignment term, ω_∥^visu., the cross product e_i × e_j is evaluated from the visual information. This is done by using the optic flow O_i(ϕ) and its divergence D_i(ϕ), which are related to the velocity of particle j with respect to particle i [Fig. 1(b)]. Specifically, the radial component is given by V_ir = −R_i(ϕ) D_i(ϕ), and the azimuthal component by V_iϕ = R_i(ϕ) O_i(ϕ). We can use these components to express the vector V_ij in polar coordinates (r, ϕ) as (V_ir, V_iϕ), while e_i can be expressed as (cos ϕ, −sin ϕ). It results that

$$\mathbf{e}_i \times \mathbf{V}_{ij} = R_i(\phi)\,\big[O_i(\phi)\cos\phi - D_i(\phi)\sin\phi\big]. \tag{5}$$

The alignment term is thus the sum of two terms: one proportional to the derotated optic flow O and sensitive to the azimuthal velocity of neighbors, and one proportional to the optic flow divergence D and sensitive to the radial velocity. When computing e_i × V_ij with Eq. (5), we remove the particle's own rotation, with angular velocity θ̇_i, from the optic flow O_i(ϕ).
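The corresponding alignment integrand can be sketched as follows; the derotation sign convention is an assumption on our part:

```python
import numpy as np

def alignment_integrand(phi, R, O, D, theta_dot, eps=1.0):
    """Integrand of the visual alignment term, (e_i x V_ij) b_eps / R,
    with the particle's own rotation theta_dot removed from the
    measured optic flow (the derotation sign is an assumption)."""
    O_derot = O + theta_dot            # derotated optic flow
    b = 1.0 + eps * np.cos(phi)        # blind-angle weighting
    return (O_derot * np.cos(phi) - D * np.sin(phi)) * b
```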
The equations of motion presented in Eq. (1), along with the attraction and alignment terms derived in Eqs. (4a) and (4b), provide a model of collective behavior based on realistic visual cues. To make the problem dimensionless, we chose a = 1 and U = 1. With this approach, four dimensionless parameters remain: the strengths of noise, attraction, and alignment, k_η, k_⊙, k_∥, and the blind angle parameter ε.

NUMERICAL SIMULATIONS
To explore the effect of these parameters on collective behaviors, we performed numerical simulations with N = 50 particles. Initially, the particles are randomly placed in a square of side aN with random headings [45,49]. The dynamical system described by Eqs. (1a) and (1b) is solved numerically using a discrete implementation of Eqs. (4a) and (4b) (see also Figs. S2-S3, [45]). We examined the effect of the time step δt by conducting simulations with δt = 0.001, 0.01, and 0.1 (Figs. S17-S22 and Videos S1-S6, [45]). No significant differences were observed in the collective behavior, and a time step of δt = 0.1 was selected for the remaining simulations to ensure computational efficiency.
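For reference, a minimal sketch of one explicit integration step, assuming a forward Euler scheme in which the Wiener increment has standard deviation √δt (the scheme details are ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def step(x, theta, omega_att, omega_ali, k_att, k_ali, k_eta, dt, U=1.0):
    """One explicit integration step for N particles.
    x: (N, 2) positions; theta: (N,) headings;
    omega_att, omega_ali: (N,) attraction and alignment terms."""
    e = np.stack([np.cos(theta), np.sin(theta)], axis=1)
    x_new = x + U * e * dt                               # Eq. (1a)
    dW = np.sqrt(dt) * rng.standard_normal(theta.shape)  # Wiener increment
    theta_new = theta + (k_att * omega_att + k_ali * omega_ali) * dt + k_eta * dW
    return x_new, theta_new
```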
We first set k_η = 0.01 and ε = 1 (maximum blind angle) and explore the effects of the two remaining parameters k_⊙ and k_∥ in the visual model. Our simulations show three distinct dynamical phases (Fig. 6 for the visual model, Fig. S10 for the omniscient model, and Figs. S23-S28 for snapshots of the evolution of the stable phases seen in Videos S7-S9 and S10-S12, [45]). If the alignment is zero, a disordered swarming phase is observed, where individuals form a group without a preferred direction [Fig. 6(a)]. When the alignment strength increases, particles begin to align in the same direction, resulting in the schooling phase [Fig. 6(b)]. If the ratio between the alignment and the attraction strengths is around 0.4, the group exhibits a milling phase [Fig. 6(c)], creating a vortex. These three phases (swarming, schooling, and milling) have regularly been observed in (omniscient) self-propelled-particle models [43,44,50-52].

To quantitatively distinguish between the different dynamical phases, we introduce three global order metrics: polarization P, milling M, and opacity O [43,44]. These metrics are defined as

$$P = \big\| \overline{\mathbf{e}_i} \big\|, \quad M = \big\| \overline{\hat{\mathbf{r}}_i \times \mathbf{e}_i} \big\|, \quad O = \overline{\frac{1}{2\pi}\int_{-\pi}^{\pi} V_i(\phi)\,\mathrm{d}\phi},$$

where the overbar represents an average over all individuals and the unit vector r̂_i points towards particle i from the center of mass. All three metrics range in the interval [0, 1]. The polarization P measures the alignment: P = 0 corresponds to particles pointing in all directions, and P = 1 to a perfectly aligned school. The milling M represents the normalized angular momentum: a straight-line formation gives M = 0 and perfect milling gives M = 1. The opacity O measures the "occupancy" of the visual field: O = 0 when there is no object in the visual field, and O = 1 when the entire visual field is obscured.

We now compare the visual and omniscient models by setting the noise to k_η = 0.01 and exploring the parameter space (k_⊙, k_∥) ∈ [0, 0.2] × [0, 0.2] for two values of the blind angle parameter (ε = 0 or 1). For each parameter set, we ran 10 simulations over long durations (∆t = 5000). The mean values of P, M, and O were determined by averaging over the last 1000 time units to ensure that the transient has no influence. The outcomes of these simulations are synthesized in phase diagrams [Fig. 3, expanded for each metric independently in Figs. S4, S6-S9, S11, and S13-S16, [45]].
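A minimal sketch of these three metrics, assuming the definitions above (array layout and names are ours):

```python
import numpy as np

def order_metrics(x, theta, V=None):
    """Polarization P, milling M, and (optionally) opacity O.
    x: (N, 2) positions; theta: (N,) headings;
    V: (N, K) binary visual fields sampled on K angles, if available."""
    e = np.stack([np.cos(theta), np.sin(theta)], axis=1)
    P = np.linalg.norm(e.mean(axis=0))               # polarization
    r = x - x.mean(axis=0)                           # from center of mass
    r_hat = r / np.linalg.norm(r, axis=1, keepdims=True)
    cross = r_hat[:, 0] * e[:, 1] - r_hat[:, 1] * e[:, 0]
    M = np.abs(cross.mean())                         # normalized ang. momentum
    O = V.mean() if V is not None else None          # occupied field fraction
    return P, M, O
```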

DISCUSSION
Somewhat arbitrarily, we chose to identify the collective phases from the values of the polarization P and the milling M parameters: schooling when P > 0.5 and M < 0.5; milling when P < 0.5 and M > 0.5; swarming when P < 0.5 and M < 0.5; and bistable when P > 0.5 and M > 0.5 (we will come back to this particular phase below).
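These thresholds translate directly into a small classifier; a sketch:

```python
def classify_phase(P, M):
    """Phase identification thresholds used in the text (P, M in [0, 1])."""
    if P > 0.5 and M < 0.5:
        return "schooling"
    if P < 0.5 and M > 0.5:
        return "milling"
    if P < 0.5 and M < 0.5:
        return "swarming"
    return "bistable"
```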
Our visual model qualitatively reproduces the phases observed in the omniscient model [Fig. 3]. Specifically, we observe the three phases in the visual model: schooling when k_⊙ ≲ 0.5 k_∥; swarming when k_∥ ≲ 0.3 k_⊙; and milling or bistability otherwise. Increasing the blind angle parameter tends to stabilize the milling phase in both the visual and the omniscient models, as observed in the literature [43].
The difference between the two models increases at high values of the alignment and attraction strengths, when the opacity O is maximum. In the visual model, the opacity does not exceed 0.7, regardless of the values of ε and k_η. In the omniscient model, however, the opacity exceeds 0.8 and can even reach one [Fig. 3(c) and (d)]. This is because the radius a does not play any role in the omniscient model, where the characteristic length is set by k_η². In both models, as expected, a larger noise strength causes a decrease in average opacity (Figs. S4 and S11) [45].

Now, let us examine the bistable phase. In this regime, the system settles into a milling phase, but eventually transitions to a schooling phase for 6000 ≲ t ≲ 7500 before returning to a milling phase, and so on. These transitions show that the system exhibits a noise-induced intermittency between two stable states: milling and schooling. The schooling phase far from the transition resembles a front of parallel individuals [Fig. 4(f)]. Just before the transition to milling, some individuals move ahead [Fig. 4(c)]. Reciprocally, the milling phase just before the transition to schooling opens up, generating a C shape [Fig. 4(e)]. These transitions are fairly stereotyped, as they tend to follow the same path in the (M, P) plane [Fig. 4(b)]. The existence of these noise-mediated intermittent transitions is likely due to a second-order transition between the milling and schooling phases, as seen in the metric distributions along the boundaries of both phases [Figs. S6-S9 and S13-S16]. This multi-stability has been reported in similar 3A models [43].
We varied the group size from N = 5 to 300 (Figs. S5 and S12, [45]). Although the group size does not seem to qualitatively affect the phase diagram, small groups tend to favor schooling, whereas large groups tend to favor swarming.

CONCLUSION
In conclusion, we proposed a model based on biologically plausible visual cues. This model successfully reproduces the three classical phases of animal collective behavior: swarming, schooling, and milling. These findings show that visual cues provide enough information to enable collective behavior.
Furthermore, our findings suggest potential practical applications for coordinating groups of artificial drones, which could be governed by analogous visual stimuli. In future studies, we aim to investigate these opportunities more thoroughly and to extend our model toward a more accurate depiction of animal collective behavior in three-dimensional space.
Supplemental Material for "Modeling collective behaviors from optic flow and retinal cues"
D. Castro

S1.1 Equivalence of the visual and omniscient model for the attraction term

We defined the attraction rules of the omniscient and visual models to be equivalent to one another. However, changing the discrete sum of the omniscient model into the integral needed for the visual model requires multiplying the integrand by R_i(ϕ).
Here we present a step-by-step calculation of this integral to show its equivalence with the discrete sum. We start from the integral of Eq. (4a),

$$\omega_\odot^{\mathrm{visu.}} = \Big\langle \int_{-\pi}^{\pi} V_i(\phi)\, R_i^2(\phi)\, \sin(\phi)\, b_\epsilon(\phi)\, \mathrm{d}\phi \Big\rangle, \tag{S1}$$

where b_ε(ϕ) = 1 + ε cos ϕ.
V_i(ϕ) is a piecewise-constant function that can be expressed as a set of unit boxcar functions (shades). Each shade can be described by a rising edge φ⇑k and its corresponding falling edge φ⇓k. From these edges, we can compute the shade's mid-point φ_k = (φ⇑k + φ⇓k)/2 and the shade's half-width ∆φ_k = (φ⇓k − φ⇑k)/2. R_i(ϕ) is V_i(ϕ) weighted by a/sin ∆φ_k. This means that, for a binary V_i(ϕ), R_i(ϕ) is independent of ϕ and only depends on the k-th shade's half-width; we therefore write it R_k(∆φ_k). The subscript changes from i to k: the subscript i refers to continuous information, while k refers to the discrete information of a single shade. A shade k is different from an individual j, since a shade can result from the occlusion or aggregation of several particles. We can then rewrite Eq. (S1) as a sum over the shades [Eq. (S2)]. With R_k(∆φ_k) = a/sin ∆φ_k, this results in Eq. (S5). For the case where there is no occlusion nor aggregation, k = j, and taking into account the physical constraints of the particles, sin ∆φ_k ≈ ∆φ_k, R_k ≈ ∥x_j − x_i∥, and φ_k = θ_ij. Finally, the factor 2a is nullified by the ⟨·⟩ normalization, so the expression becomes identical to the attraction term in Eq. (2a), proving the equivalence between the two models.
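A minimal sketch of this shade decomposition for a sampled binary visual field (wrap-around at ±π is ignored for brevity, and the names are ours):

```python
import numpy as np

def extract_shades(phi, V, a=1.0):
    """phi: (K,) sample angles; V: (K,) binary visual field.
    Returns (mid-point, half-width, estimated distance R_k) per shade.
    Assumes the field starts and ends outside a shade."""
    dV = np.diff(V.astype(int))
    rising = phi[np.where(dV == 1)[0] + 1]    # 0 -> 1 transitions
    falling = phi[np.where(dV == -1)[0] + 1]  # 1 -> 0 transitions
    shades = []
    for r, f in zip(rising, falling):
        mid = 0.5 * (r + f)
        half_width = 0.5 * (f - r)
        R_k = a / np.sin(half_width)          # estimated distance
        shades.append((mid, half_width, R_k))
    return shades
```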

S1.2 Normalization for non-trigonometric equation
We introduced a normalization to guarantee that all the terms of the models are O(1). Here we prove that the non-trigonometric terms normalized in this way do in fact maintain the same form as shown in Eq. (3). We start by taking the alignment term of the omniscient model, whose cross product can be expressed as

$$\mathbf{e}_i \times \mathbf{e}_j = \|\mathbf{e}_i\| \|\mathbf{e}_j\| \sin(\theta_j - \theta_i), \tag{S7}$$

where e_i and e_j are unit vectors. Rewriting Eq. (2b) to isolate the terms of the normalization described in Eq. (3), the trigonometric factor is sin(θ_j − θ_i) and the weight is f(j) = U ∥e_i∥ ∥e_j∥ ∥x_j − x_i∥⁻² b_ε(θ_ij), which maintains the form of the normalization presented in Eq. (3).

S1.3 Equivalence of the visual and omniscient model for the alignment term
Here we show the equivalence between the visual and omniscient models for the alignment term, if we consider that each shade is associated with a single particle.
The visual alignment term is defined in Eq. (S10). In the previous section, we discussed the piecewise-constant property of R_i(ϕ); the same is true for D_i(ϕ) and O_i(ϕ), which are independent of ϕ within a shade. Taking Eq. (S10) and Eq. (4b), we can express the alignment term of the visual model as two components, a radial one and an azimuthal one [Eqs. (S12)-(S13)]. Eq. (S13) has the exact form of Eq. (S2), with the weight of the integral being given by Eq. (S14). As for Eq. (S12), developing the integral and assuming ∆ϕ ≈ sin ∆ϕ, we obtain Eq. (S15). Once more, for the case where there is no occlusion nor aggregation, k = j, and taking into account the physical constraints of the particles, cos ∆φ_k ≈ 1 and φ_k = θ_ij; Eqs. (S14)-(S15) then simplify accordingly. The normalization nullifies the factor 2 that is not present in the omniscient model and shows that the two models are equivalent when shades are formed by single particles.

As previously stated, the model that we propose has an omniscient implementation. The alignment term shown in Eq. (S7) can be normalized by Eq. (3), which leads to the implemented expression. For the attraction term, the implemented equation is Eq. (S6).

In the main text, we described the continuous form of the equations for attraction [Eq. (4a)] and for alignment [Eq. (4b)], and Sections S1.1 and S1.3 expanded on the equivalence of these visual terms when there is no occlusion or aggregation. Here, we go into more depth on how the visual components are derived and calculated, as well as on the formulation of the equations implemented in the numerical simulations.
Consider the point of view of the i-th particle at two consecutive time steps for the j-th particle [Fig. S6]. Each particle has a radius a; the superscripts + and − denote times t and t − dt, respectively. e_r and e_ϕ are the unit vectors of the polar coordinates referred to θ_i, where the former is radial and the latter azimuthal. V_ij is the relative velocity vector perceived by particle i of the net velocity of particle j. From the geometry of the problem, we can express the R_i vectors at the two time steps and, consequently, the displacement associated with V_ij [Eq. (S26)]. From Eq. (S26), we can perform the decomposition into radial and azimuthal components to calculate O_i and D_i. Note that in a robotic implementation, this method of calculating the optic flow and its divergence (O_i and D_i) would be entirely replaced by their direct measurement via a specialized optic flow sensor.
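A minimal sketch of this two-time-step estimate for a single tracked shade, using the relations V_iϕ = R O and V_ir = −R D (the finite-difference details are ours):

```python
import numpy as np

def visual_velocities(phi_prev, R_prev, phi_curr, R_curr, dt):
    """Optic flow O, divergence D, and velocity components for one shade
    observed at t - dt (phi_prev, R_prev) and t (phi_curr, R_curr)."""
    O = (phi_curr - phi_prev) / dt      # apparent angular velocity
    R = 0.5 * (R_prev + R_curr)         # mid-point distance estimate
    V_phi = R * O                       # azimuthal component, V_phi = R O
    V_r = (R_curr - R_prev) / dt        # radial component
    D = -V_r / R                        # divergence, from V_r = -R D
    return O, D, V_r, V_phi
```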
For the computer implementation of the alignment term of the visual model, we use Eq. (S26) to calculate the cross product of the relative vectors over R_i²(∆φ_k) [Eq. (S29)]. Note that there are two approximations (cos ∆φ_k ≈ 1 and sin ∆φ_k ≈ ∆φ_k) in Eq. (S29).
As for the attraction, we use Eq. (S6), written in terms of the quantities described above.

Algorithm to detect corresponding features necessary to compute optic flow
The optic flow is the perception of relative movement, and we use a feature-matching-inspired algorithm to calculate it in the numerical simulations. This feature matching requires discriminating and detecting the same feature in two consecutive time steps. Here we present the algorithm used to match features, given the binary nature of V_i(ϕ). In the model, this is the assumption underlying Eqs. (S27) and (S28). Note that shades can split or join between time steps. To detect these events, we employ a minimum distance error to correlate the two sets of features in consecutive time steps.
In Fig. S7, we illustrate the algorithm used, where yellow boxes are associated with corresponding shades. To correspond to each other, shades should be associated with radial and azimuthal velocities bounded by U.
This algorithm is also illustrated by the pseudo-code sketched below.
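A minimal Python sketch of this matching step, assuming shades are matched to the nearest mid-point and rejected when the implied azimuthal speed exceeds U (details are ours):

```python
import numpy as np

def match_shades(mid_prev, mid_curr, R_curr, dt, U=1.0):
    """mid_prev, mid_curr: arrays of shade mid-points (rad);
    R_curr: estimated distances of current shades.
    Returns a list of (index_prev, index_curr) pairs."""
    pairs = []
    if len(mid_prev) == 0:
        return pairs
    for j, (m, R) in enumerate(zip(mid_curr, R_curr)):
        # angular distances, wrapped to (-pi, pi]
        d = np.abs(np.angle(np.exp(1j * (mid_prev - m))))
        i = int(np.argmin(d))            # minimum-distance candidate
        # reject matches whose apparent azimuthal speed would exceed U
        if R * d[i] / dt <= U:
            pairs.append((i, j))
    return pairs
```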
The model is implemented in Python 3.10 (it will be available for download at [1] and [2]). This includes the implementation of the algorithm described in Fig. S7, all model equations, data management, and figure creation. There is also a markdown file that includes only the model implementation, without data logging.

Fig. S6. Point of view of the i-th particle (black) of the j-th particle at t − 1 (contour) and t (solid color).

Fig. S7. Example of two consecutive visual fields. The matrix at the bottom illustrates the algorithm used to find corresponding shades between two time steps. ∆φ is the difference between the mid-points of the shades in two consecutive time steps.

Fig. S12. Phase diagram for the visual model with a flock of 50 individuals, k_η = 0.1 and no blind angle. (a) Integrated metrics, (b) separated metrics, and (c) variance of the final metrics across the 10 runs.

Fig. S17. Phase diagram for the omniscient model with a flock of 50 individuals, k_η = 0.01 and no blind angle. (a) Integrated metrics, (b) separated metrics, and (c) variance of the final metrics across the 10 runs.

Fig. S18. Phase diagram for the omniscient model with a flock of 50 individuals, k_η = 0.01 and maximum blind angle. (a) Integrated metrics, (b) separated metrics, and (c) variance of the final metrics across the 10 runs.

Fig. S19. Phase diagram for the omniscient model with a flock of 50 individuals, k_η = 0.1 and no blind angle. (a) Integrated metrics, (b) separated metrics, and (c) variance of the final metrics across the 10 runs.

Fig. S20. Phase diagram for the omniscient model with a flock of 50 individuals, k_η = 0.1 and maximum blind angle. (a) Integrated metrics, (b) separated metrics, and (c) variance of the final metrics across the 10 runs.