Controlling Optical Beam Thermalization via Band-Gap Engineering

We establish dispersion engineering rules that allow us to control the thermalization process and the thermal state of an initial beam propagating in a multimode nonlinear photonic circuit. To this end, we have implemented a kinetic equation (KE) approach in systems whose Bloch dispersion relation exhibits bands and gaps. When the ratio between the gap-width to the band-width is larger than a critical value, the KE has stationary solutions which differ from the standard Rayleigh-Jeans (RJ) distribution. The theory also predicts the relaxation times above which such non-conventional thermal states occur. We have tested the validity of our results for the prototype SSH model whose connectivity between the composite elements allows to control the band-gap structure. These spectral engineering rules can be extended to more complex photonic networks that lack periodicity but their spectra consist of groups of modes that are separated by spectral gaps.


I. INTRODUCTION
The understanding of light propagation in nonlinear multimode settings has both fundamental and technological ramifications. Optical phase transitions [1][2][3][4], beam self-cleaning [5][6][7] (a similar phenomenon occurs also for vibrational polar modes [8]), spatio-temporal mode locking [9], and multimode solitons [10] are some of the novel phenomena that these studies have recently revealed. At the same time, the development of predictive tools of the nonlinear beam dynamics will be addressing urging technological needs associated with the looming information "capacity crunch" in fiber-optical communication systems [11,12] and the quest for new platforms of high-power light sources [9]. Yet, the evaluation of energy redistribution among the various modes due to the nonlinear interaction is a daunting task. Typically, it is addressed via brute-force computations -a formidable process which involves a mode-by-mode analysis [13]. For example, the light propagation in multimode nonlinear fibers requires knowledge of the group velocity of each mode, their propagation constant, the self-phase modulation coefficients (which are proportional to M for M -modes), the crossphase modulation coefficients (proportional to M 2 ), the four wave mixing coefficients (proportional to M 3 ), etc.
This kind of a brute force calculation would give the detail picture of the time evolution of energy distribution over the (linear) modes of the system. Often, however, one is interested only in the long-time limit of the distribution. The latter follows from the general principles of statistical mechanics which imply that a collection of weakly interacting modes reaches an equilibrium distribution given by the Rayleigh-Jeans (RJ) formula [14,15]. When the interacting optical modes reach an * tkottos@wesleyan.edu internal equilibrium, one can define various thermodynamic functions and develop an "optical thermodynamics" [16], in complete analogy with the standard theory for systems in thermal equilibrium [17]. We will often refer to this internal equilibrium in the mode space as "thermal equilibrium" although the corresponding temperature is only an effective one which has nothing to do with the actual temperature of the physical environment.
The assertion of the RJ distribution, in the long-time limit, has been supported by deriving and studying an approximate kinetic equation for the intensity distribution of the modes in case of Kerr type non-linearities [14,18]. This equation exhibits a stationary solution of the RJ form. The same conclusion has been reached using the optical thermodynamic approach [16,19,20]. Importantly, this framework, immediately implies that the formation of a RJ equilibrium distribution is independent of the specific nature of the nonlinear interaction (e.g. Kerr or saturable or thermal nonlinearities) as long as the latter is weak [4,16,21]. The RJ distribution has been recently confirmed by a direct measurement of the thermal state in a highly-multimode optical fiber [22]. Other indirect measurements of RJ have provided additional confidence on the validity of these theoretical predictions [23].
In contrast to the optical thermodynamic approach that takes thermalization as granted, the kinetic equation provides also an understanding of the thermalization process. Unfortunately, its mathematical complexity has limited its implementation only to simple systems, with a linear spectrum consisting of a single Bloch band (like the photonic lattice in Fig. 1a). In this case one always ends up with a stationary RJ distribution whose effective temperature and chemical potential are uniquely determined from the initial norm and energy. The main purpose of the present paper is to demonstrate that in more complex cases, when the linear spectrum of the system exhibits bands and gaps, the kinetic equation can have, under specific conditions, stationary solutions with different RJ arXiv:2108.05072v1 [physics.optics] 11 Aug 2021 distributions in different bands. These solutions describe states of a partial (or quasi) equilibrium of the system. Our analysis underlines the importance of band-gap engineering in the Bloch dispersion relation of a composite photonic circuit and highlights the relaxation processes that are responsible for the thermalization of an initial beam towards this (quasi-) stationary two-component RJ (TCRJ) distribution. Our theoretical considerations are confirmed via detailed numerical simulations with a variety of weakly nonlinear multimode systems.
The structure of the paper is as follows. First we present the theoretical model that describes the beam propagation in nonlinear multimode (multicore) fibers and photonic networks of coupled resonators. The associated kinetic equation and its stationary RJ solution is briefly reviewed as well. Then in Sec. III we analyze the conditions under which a thermal equilibrium states follows the two-component RJ distribution. This analysis allows us to establish a set of spectral engineering rules that determine the nature of the thermal state. In the next Sec. IV we provide examples of thermal equilibrium management in composite photonic networks with underlying periodicity. A prominent example that we analyze in detail is the Su-Schrieffer-Heeger (SSH) photonic lattice for which we also derive theoretical expressions for the relaxation times towards the thermal equilibrium. In Sec. V we extend the spectral engineering approach to random photonic lattices. Our conclusions and outlook are summarized at the last section VI.

A. Dynamical Equations
We model the dynamics of multimode nonlinear photonic networks using a time-dependent coupled mode theory (CMT) where ψ l is the complex field amplitude at node l while J lj = J * jl describes the coupling between the nodes l and j. The last term models a nonlinear light-matter interaction, which is associated with a Kerr effect. The variable t in Eq. (1) represents (a) either time in the case of temporal field dynamics at a network of coupled microresonators or (b) the paraxial propagation distance of a beam propagating in a multimode fiber or in an array of coupled waveguides.
The equation of motion (1) is derivable from the Hamiltonian (internal energy) which is a constant of motion. Below we will assume that the Kerr nonlinearity coefficient χ is small. This assumption will allow us to approximate the total internal energy (Hamiltonian) by its linear part i.e. H({ψ l }) ≡ H L + H N L ≈ H L . Physically, in multimode waveguide settings, the internal energy represents the longitudinal electrodynamic momentum flow. Another constant of motion is which can be interpreted as the total optical power of the beam. Equation (1) describes the field dynamics in the Wannier basis associated with the (localized) modes of the individual nodes of the circuit. There is an alternative formulation of this problem which utilizes the supermode basis {f α }, α = 1, · · · , M of the underlying linear network. In this case ψ l (t) = α C α (t)f α (l), and Eq. (1) is re-written as: where ε α is the eigenvalue associated with eigenmode f α of the linear network and describes the interactions associated with a nonlinear mixing between supermodes.

B. Kinetic Equation and Standard Rayleigh-Jeans Distribution
A well established method that allows to analyze the thermalization process of a light beam propagating under Eqs. (1,4) is based on the nonequilibrium kinetic equation, which in turn is based on the Random Phase Approximation. The approximation is valid because any particular mode is coupled, by nonlinearity, to a great number of other modes (essentially to a continuum of modes). This coupling leads to a chaotic mode dynamics and to the phase randomization. The system therefore can be described solely in term of the slowly varying mode intensities which obey the kinetic equation (KE) [14,18]: where I α (t) = |C α (t)| 2 is the optical power in supermode α, βγδ denotes the summation over all the non-diagonal  terms (triplets satisfying (α, β) = (γ, δ) and (α, β) = (δ, γ)), and It is straightforward to confirm that a stationary solution of Eq. (6) is the RJ distribution [14,15,24] where the optical temperature and chemical potential (T, µ), are obtained from the total power A and the total (linear) energy E of the initial beam. Equation (6) features a four-mode (i.e. excitations) interaction process which respects the conservation of the total power A = α I α and total (linear) energy E ≈ α ε α I α , and admits the following physical interpretation. Generally, all four modes I α,β,γ,δ appearing in the first line in Eq. (6) are occupied (this is the case in thermal equilibrium), although in principle it can happen that one out of the four modes is initially empty but gets occupied after the collision. During the interaction process, two of these modes will gain some norm (and energy) while the other two will lose the same amount. Specifically, the collision process assumes that two (occupied) modes act on the third (occupied) mode, causing (by induced emission) an increase of the norm of that third mode and producing some norm in the fourth mode (which can be occupied or being empty before the collision). We stress, that the KE approach adopted in this work is applicable only in the weak nonlinearity limit where the concept of linear modes is still valid. In the opposite limit of strong nonlinearities, one needs to adopt an alternative approach based on the evaluation of the Gibbs distribution [25]. Figure 1a displays the simplest photonic lattice, consisting of an array of coupled single-mode waveguides with nearest neighbor coupling constants J l,l±1 = v(= 1) and same resonant frequency J 0 ≡ J l,l (= 2). The photonic lattice has a linear dispersion relation consisting of a single band ε(k) = 2 − 2v cos(k) where k ∈ [−π, π] is the propagation constant (wavevector). For reasons that will become clear below, we have chosen to represent the dispersion relation of such system using a reduced Brillouin zone associated with doubling of the unit cell. In this representation the dispersion relation takes the form ε(k) = 2 ± 2v cos(k/2), see Fig. 1b.
The thermalization process of a typical initial state (open circles) has been verified numerically in Fig. 1c. In these simulations we have integrated Eq. (1) up to times O(10 6 ) (in units of coupling constant) using a high-order three part split symplectic integrator scheme [4]. The method conserved, up to errors O(10 −8 ), the total internal energy and power of the system. The thermal state (filled black circles) has been extracted by performing a time averaging over the last 2 × 10 5 propagation times. We have found that such system always thermalizes to the standard RJ distribution (red line) Eq. (8).

III. TWO-COMPONENT RAYLEIGH-JEANS DISTRIBUTIONS
We now argue that more complicated optical networks, whose spectrum involves bands and gaps, can approach in their thermalization process a different distribution which we name two-component Rayleigh-Jeans (TCRJ). More precisely, the kinetic equation for such systems has stationary solutions which correspond to different RJ distributions in different bands. An example is shown in Fig. 1d where the connectivity J lj of the waveguide array (SSH system-see below) is such that the spectrum of the system consists of two bands separated by a gap, see Fig. 1e. It turns out that the TCRJ distributions (see Fig. 1f for the SSH example) are possible because, under conditions specified below, the optical power is conserved separately in each individual band.
Strictly speaking, a TCRJ distribution corresponds to a partial equilibrium (a long-lived state) rather than to a full one. The latter is eventually established via processes involving exchange of the modal power between the bands. The main point, however, is that these processes (again, under the appropriate conditions, see below) are beyond the KE Eq. (6) and would involve time scales much longer than those needed for reaching the TCRJ state. Therefore the latter can be considered as a genuine equilibrium state for all practical processes. Our goal below is to establish dispersion engineering rules that will allow us to control the thermalization process, and therefore the thermal distribution, of an initial beam.

A. Collision Processes and Conservation laws
To understand the results shown in Fig. 1f we analyze the KE under the assumption that the spectrum of the system has a band-gap structure. For simplicity, we consider a system with one energy gap separating the spectrum into two bands. In this case, there are the following "collision" processes between modes that contribute to the relaxation described by Eq. (6): (I) all four interacting modes are from the same band; (II) two of the interacting modes belong to one band while the other two belong to the other band; (III) three interacting modes are from the same band, and one mode is from the other band. All other scenarios are energetically impossible.
Processes (II) and (III) exchange energy between the two bands, while only process (III) exchanges both energy and power. The lack of process (III), therefore, enforces the conservation of power in each individual band, i.e., where α 1 and α 2 refer to the modes associated with the first and the second band respectively. A lack of processes (II) and (III) altogether will lead to a total disconnection between the two bands. Let us analyze further the conditions under which process (III) is possible. For this purpose, we define the gap-width ∆ g , and the band-widths ∆ of the lower (s = 1) and upper (s = 2) bands. We further define the corresponding bottom energy states ε Fig. 1e for an example. Next, we consider a collision process which involves three modes from the lower band and one mode from the upper band. In this case, the most extreme collision process (that respects the energy conservation) involves the two highest modes of the lower band, interacting to form two excitations, each located at the bottom of the corresponding band. These considerations enforce the following inequality 2(ε A similar argument applies when the three modes that participate in the collision process are associated with the upper band s = 2 and the fourth is associated with the lower band s = 1. The corresponding extreme process, which still allows the presence of process (III), involves two modes from the bottom of the upper band to collide and form two modes that are located at the top of the upper and lower band respectively. Consequently, we have the energetic restriction 2ε b . We conclude therefore that the absence of process (III) is guaranteed whenever the following inequality is satisfied leading to the conservation of the total internal energy, and of the individual band-powers (E, A (1) , A (2) ) in Eq. (6). Under the broad gap condition Eq. (10), the KE collapses to the following form for the relaxation process associated with the α 1 −modes (and similarly for the which results from Eq. (6) after excluding from the summation the collision processes (III). The first sum at the r.h.s. of Eq. (11) describes the intra-band collisions (I) while the second term represents the inter-band processes (II). Using Eq. (11), it is straightforward to show that in- A numerical confirmation of the consequences of the absence/presence of the broad-gap condition Eq. (10), is shown in Figs. 2a,b where we are reporting our results for A (s) (t), for two representative cases: In the set-up of Fig. 1a the band powers A (1,2) (t) vary in time (see Fig. 2a). In contrast, in the set-up of Fig. 1d where the connectivity is such that the bandspectrum satisfies the broad-gap condition Eq. (10), both A (1,2) (t) remain constant in time (see Fig. 2b).

B. Two-Component Rayleigh-Jeans Distribution
Next we analyze the nature of the thermal state under the broad gap condition. In this case, the distribution governed by Eq. (11) does not evolve towards the standard RJ distribution i.e., there is no full thermalization. Rather, the following TCRJ distribution nullifies the r.h.s. of Eq. (11) where ε (s) αs andĨ αs refer to the eigenvalues and modal power associated with the α 1,2 −modes of the first (s = 1) and second (s = 2)) band respectively. Equations (12) dictate that the initial beam relaxes to a uniform optical temperature, but the modes in the two bands do not share the same chemical potential. Consequently the TCRJ distribution is defined by three parameters, (T, µ 1 , µ 2 ). These parameters have an one-to-one correspondence with the conserved quantities (E, A (1) , A (2) ), and can be calculated explicitly from the initial beam. Specifically, the conservation of total energy and norm in each band can be expressed as: where M 1 and M 2 refer to the number of modes involved in each band (M = M 1 + M 2 ).
Let us re-iterate that the TCRJ distribution might not necessarily be the true equilibrium state of the system. Its existence is tightly connected with the validity  Fig. 1a,d: (a) When the broadgap condition is violated, the band power A (s) oscillates in time; (b) When the broad-gap condition is satisfied, the band power A (s) remains constant. In this case ∆g = 1.5, and ∆ b = 0.5 corresponding to ∆ = 3. In these simulations we have assumed that the strength of the Kerr nonlinearity is χ = 0.05 while the SSH model (see Fig. 1d) has intra-cell coupling v = 1 and inter-cell coupling w = 1/4 of conservation laws Eq. (9) for A (1) and A (2) . The latter is the result of the KE Eqs. (11), which describes the dynamics generated by Eq. (1) for times up to t ∼ t KE ∝ O(1/χ 4 ). This time constraint signifies the break-down of the second-order approximation for the rate of change of I α . For even longer times, the effects of higher-order terms in the nonlinear interaction among the modes are going to accumulate, and the modal powers will eventually converge to the standard RJ distribu-tion. However, the derivation of the broad-gap condition Eq. (10) for the existence of a (quasi-)stationary TCRJ thermal state is extremely relevant to any realistic applications where the paraxial distance (or evolution time) is finite.

IV. EXAMPLES OF THERMALIZATION MANAGEMENT VIA DISPERSION ENGINEERING
In this section we will be implementing in practice the thermalization management via dispersion engineering. For this purpose, we shall analyze the thermal state for a number of photonic circuits corresponding to different connectivity matrices J lj . A special attention will be given to the thermalization process of the so-called Su-Schrieffer-Heeger (SSH) network which has attracted a lot of attention for its topological properties recently.
It turns out that the SSH model provides a good theoretical framework where the effects of the existence/absence of the broad-gap condition Eq. (10) on the relaxation times can be also analyzed theoretically. These results can further guide our understanding of the relaxation process itself. It is important to point out that photonic systems with separated modes in different groups, which do not exchange power with one-another, have been demonstrated also in [16]. However, the design mechanism in Ref. [16] is completely different from the one discussed here. Specifically, these authors have utilized the polarization degrees of freedom in order to enforce a mode-connectivity matrix V αβγδ which suppresses transitions between modes at different polarization. In contrast, our design protocol involves spectral engineering methods for the management of the thermal equilibrium states.

A. The SSH Model
A prototype system that demonstrates a transition from a RJ to a TCRJ is the SSH model, see Fig. 1d. It consists of M/2 unit cells, with each cell containing two single-mode waveguides coupled together with an intracell coupling constant v. The inter-cell coupling is only between neighboring cells and it is described by a coupling constant w < v. By setting the "on-site" potential (mode) of each waveguide equal to v + w, we ensure that the ground-state energy is zero. The corresponding coupled mode theory (CMT) Hamiltonian is: where s, s ∈ {1, 2} and ψ n,s refers to the complex amplitude on the sth site in the nth unit cell. The connectivity matrix is three-diagonal with a diagonal element J 0 0 = −(v + w), and off-diagonal elements J 0 ±1 = v, J ∓1 ±1 = w, and J = 0 otherwise.
The eigenmodes of this system can be analytically evaluated and they take the form is the eigenmode amplitude in the momentum representation i.e. ψ n,s = 1 2π´d ke ikn f (s) (k) and k ∈ [−π, π] is the wavenumber taken over the Brillouin zone. The corresponding eigenvalues are The lower (higher) band ε (1) (k)(ε (2) (k)) ranges from 0 to 2w (2v to 2(w + v)), and they are separated by a band-gap of width 2(v − w), see Fig. 1e. When w = v the two bands merge. In this case, the thermalization process leads to a RJ distribution Eq. (8) as discussed previously (see Figs. 1a-c).
In the opposite limit of broad gaps (∆ 1) Eqs. (15,16) can be further simplified to ε (s) (k) ≈ 2vδ s,2 + w (1 + (−1) s cos(k)) . s = 1, 2 It turns out that in this case the system approaches a (quasi-)thermal state which differs from the RJ distribution. In fact, our numerics nicely confirms the theoretical predictions of Eq. (12) associated with the TCRJ distribution, see Fig. 1f. We found that the TCRJ emerges when the ratio of the gap-width to the bandwidth ∆ ≡ v−w w reaches the value ∆ = 1. In Fig. 1f we have reported the theoretical predictions (solid red and black lines) for the (quasi-)equilibrium modal-power distribution for an SSH system where ∆ g = 2(v − w) = 3/2 > ∆ b = 2w = 1/2 where v = 1, w = 1/4. In this broad-gap case we expect the formation of TCRJ distribution Eq. (12). The optical thermodynamic variables (T, µ 1 , µ 2 ) can be evaluated explicitly from the initial beam preparation using the extended variables (E, A (1) , A (2) ), see Eq. (13). We have found that these results are nicely describing the outcome of the numerical simulations (see filled circles). We have also confirmed numerically, (see Fig. 2b) that the emergence of the TCRJ coincides with the "individual band power" conservation of A (1) , A (2) .

B. Relaxation Times
Next, we investigate the relaxation time (distance) that an initial beam needs to propagate before reaching a (quasi)-stationary thermal state. Although the analysis below focuses on the case where the broad-gap condition is satisfied, we will also comment on the relaxation times for narrow-gap conditions.
We assume a small deviation δI α from the stationary distributionĨ α such that the modal power of the α-mode becomes Substitution of Eq. (18) in Eq. (6) and subsequent linearization lead us to the following rate equation which describes the rate at which a single α-mode relaxes towards the thermal state, given that all other modes are already at equilibrium. For a better analysis we will decompose the total relaxation rate into various processes associated with different collision mechanisms i.e. and analyze each one separately. We point out that, in the case of broad gap spectrum, we immediately exclude the inter-band contributions associated with the collision process (III). These processes are responsible for a power-exchange between the bands and consequently they lead to the formation of a standard RJ thermal state.

Intra-band Processes
We first analyze the intra-band processes (I) that do not contribute to any energy or norm exchange between the bands. These processes are also present in case of a narrow-gap (or even single band) photonic system. Below we evaluate the relaxation time of the α 1 − modes that belong to the lower band, but a similar calculation can be done for the α 2 −modes that belong to the upper band. In the case of α 1 -modes, Eq. (19) becomes which can be also derived from the first term of the r.h.s. of Eq. (11) after substituting the modal powers I α (see Eq. (18)) and performing a subsequent linearization.
In the case of the SSH model, we can make further progress with the evaluation of Eq. (20). To this end, we substitute Eqs. (5,7) in Eq. (20) and express the matrix Γ α1β1γ1δ1 in terms of the linear modes of the SSH model. When ∆ 1, we can use the approximate expressions Eq. (17) for the modes of the SSH, which allows to simplify the matrix Γ α1β1γ1δ1 into the Kronecker's delta function 1 2M δ k+q−k −q ,0 , where k, q, k , q refers to the wavenumber of modes α 1 , β 1 , γ 1 , δ 1 respectively. In the thermodynamic limit M → ∞, we turn the sums appearing in Eq. (20) into integrals over the wavenumbers and have that The Dirac's delta function in Eq.(21) enforces four-wave mixing processes which satisfy the wavevector constraints (k, q = π − k, k , q = π − k ). Then Eq.(21) becomes which is amenable to further theoretical treatment in the two limiting cases of high and low temperatures.
In the low temperature limit T → 0, the power is concentrated at the bottom of the band corresponding to ε (1) (k ≈ 0) ≈ 0 (and similarly ε (2) (k ≈ ±π) ≈ 2v for the upper band). The associated modal power scales as I k ∼ O( w T ), while the power of higher modes diminishes asĨ k ∼ O( T w ) (see Supplementary Material). A substitution of these estimations in Eq. (22) leads to the following expression for the relaxation times for energies in the middle of the spectrum where a (s) ≡ A (s) /M s is the modal power density associated with the lower (s = 1) and upper (s = 2) bands respectively. The scaling of the relaxation times of the modes in the second band is given by a similar relation as Eq. (23), with the only difference being that a (1) has to be substituted with a (2) . From Eq. (23) we conclude that the scattering rate vanishes when T → 0, which is a generic result for any excitation. One can also develop a qualitative understanding of the linear dependence of the scattering rate on the temperature T by analyzing the expression of the KE appearing in the first line of Eq. (6). For instance, the term −I α I β I γ , describes the following process: an αexcitation combines with a β-excitation and produce an induced emission of a γ-excitation, while the δ-excitation is emitted to conserve energy. This process contributes to the decay rate 1/τ α for the added α-excitation a term ∼ I β I γ . At low T , a small energy interval, ∼ T , is occupied with low energy excitations with high density, ∼ 1/T (see also Supplementary Material), while the bulk of the band is occupied by "high-energy" excitations with low density proportional to ∼ T . Now, for the above process to be possible, the added α-excitation (at high energy, in the bulk of the band) has to meet a β-excitation and a γ-excitation to interact with. Moreover, ε γ must be close to ε α , in order to conserve energy. Specifically, ε γ must compensate for the energy of the destroyed α-excitation in the bulk of the spectrum, while the β-excitation belongs to the bottom of the band. The large 1/T density is compensated by the small T -factor, coming from the summation over β, and we are finally left with the Tfactor due to the small density of γ-excitations so that 1/τ α is proportional to T .
The high temperature limit T → ∞ of the relaxation times Eq. (19) can be also evaluated. In this case, the power is uniformly distributed within each α s -mode in the lower (s=1) and upper (s=2) bands i.e.Ĩ αs = a (s) . For the modes in the first band s = 1 we obtain while intra-band relaxation time for the modes in the upper band (s = 2) will be given by the same expression as Eq. (24) with the only substitution a (1) → a (2) . The theoretical predictions Eqs. (23,24) are nicely confirmed by a direct numerical evaluation of the intra-band relaxation times using Eq. (20), see Fig. 3a. Let us also mention that the intra-band transitions discussed above have their equivalent in narrow gap systems as well (for the high temperature limit see Ref. [1]). In this case, of course, one has a standard RJ distribution Eq. (8) of the modal powers. As the power populates progressively the modes of the lower band for T → 0, we have to substitute in Eq. (23), the individual band power density a (1) with the total one a ≡ A/M . For completeness, we also report in Fig. 3b, the scaling behavior of the intra-band relaxation times for the narrow-gap scenario of an SSH model corresponding to inter-dimer coupling w = 2/3 and intra-dimer v = 1.

Inter-band Processes
Next we analyze the contribution of collision processes (II) to the relaxation of the α 1 -modes. This mechanism involves inter -band transitions and it is responsible for the energy relaxation between the two bands. In this case, Eq. (19) becomes which can be evaluated analytically in the case of the SSH model. Following the same calculations as for the intra-band relaxation time, we can further simplify the above equation into the following form where the super-indexes in the modal powers indicate the corresponding band. In the above analysis, we have used the fact that the presence of the Kronecker's and Dirac's delta functions (see Eqs. (5,7), and the discussion in the previous subsection) limits the four-wave mixing to processes that satisfy the relations (k, q = −k, k , q = −k ) .
When T → ∞, the modal power becomes independent of k and it only depends on the power density associated with the specific band s, i.e.Ĩ (s) k = a (s) . Using Eq. (26) we get that the inter-band relaxation rate of the modes associated with the first band is A similar expression applies for the modes of the second band with the obvious substitution a (2) → a (1) .
In the opposite limit of T → 0 the power will be mainly concentrated in modes that are at the bottom of the spectrum of each of the two bands. Specifically, the modal powerĨ (1) k at the first band will be significant for modes with k ≈ 0, while the modal powerĨ (2) k at the second band will be appreciable for modes with k ≈ ±π. Using their corresponding approximate expressions (see Supplementary Material) and substituting them back in Eq. (26) we have which holds for modes belonging to either band. The relaxation rates vanish linearly with the temperature T as in the intra-band scenario discussed above. In fact, one can invoke the same considerations as above, for the qualitative understanding of this linear scaling law. The above theoretical predictions for the inter-band relaxation times describe nicely our results from the direct numerical evaluation of Eq. (25) for the SSH model in case where the broad gap condition Eq. (10) is satisfied, see Fig. 3a. In the opposite case of narrow-gaps (but still ∆ = 0), the inter-band relaxation rate is supplemented with an additional term associated with collision processes (III). Although we were not able to evaluate these additional contributions analytically, our detailed numerical calculations using Eq. (19) confirm that these processes do not affect the overall behavior of the relaxation rates, see Fig. 3b.

Limitations and Energy Relaxation Rate
The main assumption underlying the derivation of the KE is that initially the phases of the linear modes are random and only the intensities are out of equilibrium. However, for an arbitrary initial preparation the phases need not be random and, generally, some "phase randomization time", τ , may enter the picture. For small J l,j , this time-scale, is controlled by the very slow process of equilibration between different regions in real space and it obviously approaches infinity when J l,j → 0. In typical circumstances where the coupling between local modes is evanescently small, we expect that τ ≥ τ α and therefore the latter can be always considered as the lower bound of the total relaxation process. Let us finally point out that our analysis also sheds light on energy (not power!) relaxation between the two bands. For example, the rate of energy exchange of the first band δE (1) ≡ α ε (1) α1 δI α1 can be approximated as where we have used Eq. (19) together with the fact that only the inter-band transitions can contribute to the energy relaxation between bands. Equation (29) can be used for extracting the energy relaxation time τ H1 defined as dδE (1) dt ≡ − δE (1) τ H .

V. THERMALIZATION MANAGEMENT IN RANDOM PHOTONIC NETWORKS
In the previous section we have highlighted the importance of dispersion engineering in the control of the thermalization process of an initial beam. Our analysis was focusing on photonic networks with an underlying spatial periodicity. It is natural, therefore, to question the validity of our theoretical results in cases of random photonic networks where the concepts of bands and gaps are not, strictly speaking, applicable. Still, one can design photonic structures whose spectrum is consisting of groups of modes which are spectrally away from one another and investigate the applicability of the "broad gap" condition for the realization of TCRJ distributions. In this section we are analyzing the thermal state for two such representative non-periodic networks.
The first network consists of an array of coupled singlemode waveguides (resonators). The propagation constants (node-resonant frequencies) are J ll = J 1 for the first n = 1, · · · , M 1 waveguides (resonators) and J ll = J 2 for the remaining n = 1, · · · , M 2 waveguides (resonators) [27]. For simplicity we have further assumed that M 1 = M 2 while we considered that all waveguides (resonators) are coupled together with the same coupling constant J n,n±1 = J 0 . The parameters (J 0 , J 1 , J 2 ) of the network are chosen in a way that the spectrum is organized in two groups that are separated by a gap which satisfies a broad gap condition Eq. (10), see Fig. 4a. In Fig. 4b we show the (quasi-)thermal state (filled circles) associated with a specific initial excitation (open circles). We have extracted the optical temperature T and the two chemical potentials (µ 1 , µ 2 ) from the individual-band powers A (1) , A (2) and the total energy H of the initial excitation, see Eqs. (13). Using these inputs, we have evaluated the TCRJ distribution Eq. (12) for the modal powers (see red and black lines in Fig. 4b). The theoretical results, nicely match the numerical data extracted by evolving the initial excitation up to times t ≈ 10 7 (units of J 0 ).
A similar analysis has been performed for a random photonic network whose connectivity matrix J nm has been constructed via an orthogonal transformation of a diagonal matrix O is a random orthogonal matrix. The linear spectrum of this network is characterized by the parameters J 0 , M 1 , and ∆ g which have been chosen to enforce a broad-gap (see Eq. (10)) between two spectral groups, see Fig. 4c. Following the same methodology as previously, we have extracted from Eqs. (13) the theoretical values for the optical temperature T and the chemical potentials (µ 1 , µ 2 ) using as input the individual-band powers A (1) , A (2) and the total energy E of the initial excitation. The predicted TCRJ, which uses these values, is shown in Fig.  4d together with the modal thermal occupation that has been evaluated via direct dynamical simulations. In these simulations the initial excitation was evolved up to time t ≈ 10 5 . The nice comparison confirms once more the validity of our theory.

VI. CONCLUSIONS
We have highlighted the importance of dispersion engineering methods for the control of the thermalization process and the formation of a thermal state of an initial beam propagating in a nonlinear multimode photonic structure. Using a kinetic equation approach we have shown that its stationary solutions might differ from the standard Rayleigh-Jeans distribution if the ratio between the gap-width and the band-width exceeds a critical value. In such case, each individual band preserves the power with which it is initially populated. The modal power in each band is characterized by a Rayleigh-Jeans distribution with distinct chemical potential which is dictated by the individual band power. The latter, together with the optical temperature that it is determined by the initial beam, controls the relaxation rate towards these thermal states. We have tested numerically the valid-ity of the predictions of the kinetic equation using the SSH model whose band-gap structure is controlled by the coupling between the elements of the network. We have further extended the spectral engineering rules for the management of the thermal states using more complex networks that are not periodic. We have shown that an appropriate connectivity between the elements of the network, can enforce the formation of groups of modes that are separated by spectral gaps whose width determines the thermal state of the system. It will be interesting to implement these ideas to other complex systems, like quasi-periodic or aperiodic photonic structures that demonstrate fractal spectra, and analyze the thermalization process towards a thermal equilibrium state. . The gap-width between these two spectral intervals is ∆g = 2.125 corresponding to broad-gap condition ∆ = 1.1333; (d) The same as in (b) but for modal power distribution associated with the random network of (c). The parameters that determine the TCRJ distribution in this case are (T, µ1, µ2) = (0.841, −2.09, 1.8).
The Kerr nonlinearity parameter used in this simulation is χ = 0.01.

S1. LOW TEMPERATURE APPROXIMATION OF MODAL POWER IN A STANDARD RJ DISTRIBUTION
In this section we derive the relaxation times for the one-band spectrum. In this case the dispersion relation of the system is ε α ≡ ε(k) = −2w cos(k) where the wavevector k ∈ [−π, π] and w is the coupling element between nearby waveguides (or resonators). Our starting point is the general equation Eq. (19) that give us the relaxation times τ α as: where Γ αβγδ is given by Eq. (5). In the large systemsize M → ∞ limit the above sum can be turned to the following integral 1 τ k = χ 2 πĨ kˆπ −πˆπ −πĨ qĨq Ĩ k+q−q δ (−2w [cos(k) + cos(q) − cos(q ) − cos(k + q − q )]) dqdq = χ 2 2πwĨ kˆπ −π dq |sin(q ) − sin(k)|Ĩ π−kĨq Ĩ π−q (S2) In the infinite temperatire limit T → ∞, the equilibrium distribution of the modal powers isĨ k = −T µ = a ≡ A/M (µ < 0) and the above expression becomes and M = 2M 1 is the total numger of modes in both bands. Furthermore, we define h ≡ E M to be the energy density, and a (1) ≡ A (1) M1 , a (2) ≡ A (2) M2 to be the individual band power density. Under the broad-gap approximation ∆ 1, we can substitute from Eq. (17) the eigenvalues of the SSH model, and arrive to the following expressions (S20) where E ≡ 1 M α ε α is the mean energy of the spectrum and · · · indicates an averaging over random phases.
The last step in the equation above utilized the following contraction rule

B. Two-band systems
We proceed with a similar analysis for a two-band system. We assume that the number of modes in the lower band is M 1 and in the upper band is M 2 . The total number of modes is M = M 1 + M 2 . In case of GRJ thermal states, one needs to take into consideration the additional "individual-band" conservation laws: where a (s) are the power densities in each band s = 1, 2 The low temperature limit T = 0 is trivial. Following the same arguments as for the one-band case, we get for the minimum energy density (for fixed a 1 , a 2 ): The evaluation of the high temperature border is more subtle. Before writing it down, we note that only Γ αβαβ = Γ αββα come into play and regardless of whether α, β belong to the same band or to different bands we have Following the same argumentation as in the case of one band we assume that all modes are equally excited. From Eq. (S22) we conclude that C αs = √ a (s) e iΦα s . It then immediately follows that the total maximum energy density is h max = s m s a (s) E (s) + χ m 1 a (1) + m 2 a (2) 2 (S25)