Projection-free approximate balanced truncation of large unstable systems

In this article, we show that the projection-free, snapshot-based, balanced truncation method can be applied directly to unstable systems. We prove that even for unstable systems, the unmodified balanced proper orthogonal decomposition algorithm theoretically yields a converged transformation that balances the Gramians (including the unstable subspace). We then apply the method to a spatially developing unstable system and show that it results in reduced-order models of similar quality to those obtained with existing methods. Due to the unbounded growth of unstable modes, a practical restriction on the final impulse response simulation time appears, which can be adjusted depending on the desired order of the reduced-order model. Recommendations are given to further reduce the cost of the method if the system is large, and to improve the performance of the method if it does not yield acceptable results in its unmodified form. Finally, the method is applied to the linearized flow around a cylinder at Re = 100 to show that it is indeed able to reproduce impulse responses accurately for more realistic large-scale unstable systems in practice. The well-established approximate balanced truncation numerical framework can therefore be safely applied to unstable systems without any modifications. Additionally, balanced reduced-order models can readily be obtained even for large systems, where the computational cost of existing methods is prohibitive.


Introduction
Many linear dynamical systems, such as the linearized Navier-Stokes equations, are composed of a large number of states, O(10^5-10^8), but their behavior is dominated by a much smaller number of modes, O(1-100). Obtaining a low-order model that only retains the dominant features of the system's behavior is of great value in order to understand and modify its dynamics. Specifically, in a feedback control setting, most controller design methods are only tractable if a reduced-order model (ROM) is available.
In fluid mechanics, many successful ROMs of the flow field dynamics have been obtained by projecting the system equations onto a low-dimensional subspace composed of a set of particularly relevant modes, such as global modes, proper orthogonal modes (POMs) or balanced modes. Global modes are ranked by their damping rate, so projecting the dynamics onto the least stable modes is a natural choice. Åkervik et al. [3] and Henningson et al. [23] successfully applied this strategy to damp global oscillations in a shallow cavity using an LQG controller. However, it is common for the dynamics to be dominated by the non-normal interaction between (potentially highly damped) global modes, in which case a large number of modes may be required to obtain an acceptable ROM. A more strategic selection of global modes can result in better performance in some cases [19,12], but finding a set of robust selection criteria is not straightforward [12]. Another weakness of this method is that identifying a large number of (highly damped) global modes may be prohibitively expensive.
Alternatively, proper orthogonal modes provide an optimally low rank approximation of the state over a chosen set of snapshots as they are ranked by their energy content. ROMs of many flow-fields have been developed by projecting the dynamics onto these modes. For instance, models of boundary layers [6,38], channel flows [37], backward-facing steps [39,40,10], bluff body flows and wakes [22,29,35,42,45,47], forward facing steps [41], and cavities [21] have all been studied. POD-based ROMs are attractive both for the simplicity of the (snapshot-based) method used to obtain them [46], and the intuitive projection basis they provide (which allows retaining most of the energy of the snapshot ensemble). However, there is not a strong theoretical justification for using this orthogonal basis, as it depends on the specific set of snapshots used to construct it. It is therefore unsurprising that a large number of modes are sometimes required to accurately represent the flow dynamics [11]. Several methods have been developed to improve the robustness and reliability of the models, such as regularly adapting the set of snapshots used to form the basis [40,13], or adding a so called shift-mode [35] which acts as a mean flow-correction mode and allows transients to be modeled more accurately.
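The energy ranking of proper orthogonal modes described above can be sketched in a few lines; this is a minimal illustration of the snapshot POD idea, where the random snapshot matrix stands in for real flow-field data (a hypothetical example, not any of the cited configurations):

```python
import numpy as np

# Snapshot matrix: columns are instantaneous states; random data here
# stands in for real flow-field snapshots (hypothetical example)
rng = np.random.default_rng(1)
Q = rng.standard_normal((200, 30))        # 200 states, 30 snapshots

# POD modes are the left singular vectors, ranked by energy content
U, s, _ = np.linalg.svd(Q, full_matrices=False)
energy = s**2 / np.sum(s**2)              # energy fraction captured by each mode

# Smallest basis capturing 90% of the snapshot energy
r = int(np.searchsorted(np.cumsum(energy), 0.9)) + 1
Q_r = U[:, :r] @ (U[:, :r].T @ Q)         # rank-r reconstruction of the snapshots
```

The reconstruction error of the rank-r projection equals exactly the energy of the discarded modes, which is the sense in which the POD basis is optimally low rank over the chosen snapshot set.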
In this article, we focus instead on ROMs based on balanced modes. These are ranked by their dynamical significance to the input-output relationship of the system, and are therefore better suited for feedback control purposes than POMs or global modes by design. ROMs based on balanced modes have attracted a significant amount of attention in recent years, and have been used in many flow configurations such as the response of the vertical force of an airfoil to a plunging motion [52], a channel flow [43], a cavity flow [11], a boundary layer flow [7,8], the flow over a flat plate at large incidence [1], over a backward facing step [18], in a three dimensional boundary layer [44], and over a cylinder [50]. Furthermore, several studies have compared the performance of ROMs based on global, proper orthogonal and balanced modes [52,11,9,18] and have concluded that balanced ROMs typically require a much smaller number of modes for a given degree of accuracy.
As the classical (exact) balanced truncation method is intractable for large systems, Moore [32], Willcox and Peraire [52], and Rowley [43] developed an approximate snapshot-based method sometimes called balanced POD (or BPOD, see Section 2.1) to reduce the computational cost of the balancing procedure. When using this snapshot method, approximate balanced modes can be obtained from as few as two impulse response simulations (for single-input-single-output or SISO systems), and the snapshots do not need to be updated or complemented by a shift-mode.
A further issue with the classical and snapshot methods is that they were designed for stable systems. While a number of studies [27,30,20,54,18] have developed techniques to balance unstable systems (see 2.2), only a recent extension of the snapshot method [11,1] has allowed the computation of ROMs for large unstable systems. The method was applied to the flow over a cavity [11], a flat plate at large incidence [1], and a cylinder [50]. This extension requires the computation of the system's unstable global modes and the projection of the system onto its stable subspace. For large systems, such as three-dimensional flows, the cost of this procedure may still be excessively large.
The aim of this paper is to show that this expensive projection step is in fact unnecessary and that applying the unmodified snapshot-based balanced truncation method (designed for stable systems) directly to unstable systems yields a converged transformation, which balances the system and can result in ROMs of the same quality as those obtained when the projection method is used. Balanced reduced-order models can therefore readily be obtained with the projection-free method for many large-scale unstable systems (e.g. three-dimensional flows) for which the projection step would make approximate balanced truncation intractable.
In Sec. 2, the theoretical framework is introduced and the existing balanced truncation methods are outlined. The fact that the projection-free snapshot method can be used for unstable systems is proven in Sec. 3, and it is then applied to a representative one-dimensional model system in Sec. 4, where it is compared to existing methods. In Sec. 5, the method is applied to a two-dimensional Navier-Stokes simulation of the flow over a cylinder to show that it also performs well in this more realistic and computationally demanding setup.

Balanced truncation of stable systems
In this section, we introduce the main concepts and methods that are relevant to balanced truncation, in particular for stable systems. More details about these methods can be found in standard control textbooks (e.g. [53] or [5]). The standard continuous-time, linear time-invariant state-space system is:

dx/dt = A x + B u,    y = C x,    (1)

where x ∈ C^{n_x} are the states of the system, u ∈ C^{n_u} are the inputs, y ∈ C^{n_y} are the outputs, and A ∈ C^{n_x×n_x}, B ∈ C^{n_x×n_u} and C ∈ C^{n_y×n_x} are three time-independent matrices. The dynamics of the system are then governed by the transfer matrix G(s) = C(sI − A)^{-1}B, where Y(s) = G(s)U(s), s is the Laplace variable, and U(s) and Y(s) are the Laplace transforms of the input and output signals, respectively.
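To make the notation concrete, the transfer matrix can be evaluated numerically at any Laplace point by solving a linear system rather than forming the matrix inverse explicitly; the 2-state SISO system below is a made-up toy example:

```python
import numpy as np

def transfer_matrix(A, B, C, s):
    """Evaluate G(s) = C (sI - A)^{-1} B by solving a linear system."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B)

# Made-up 2-state SISO example; here G(s) = 2 / ((s + 1)(s + 3))
A = np.array([[-1.0, 2.0], [0.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
G = transfer_matrix(A, B, C, 1j * 0.5)   # frequency response at omega = 0.5
```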
The balanced truncation approach is based on an analysis of the controllability and observability of the system. The controllability of a state is related to the minimum input energy required to reach it from x(0) = 0. The observability of a state x 0 is related to the energy of the output signal generated by the system starting from x(0) = x 0 , without any input: u(t) = 0. A state that has a large impact on the input-output behavior of the system -i.e. on the transfer matrix -is said to be dynamically significant.
If a state is unobservable or nearly unobservable, then even if only a small amount of input energy is required to reach it (i.e. if it is highly controllable), it will not have a large impact on the output signal. Conversely, if a state is uncontrollable or nearly uncontrollable, then a large (or infinite) amount of input energy is required to reach it, so only a comparatively negligible part of the output energy can be due to that state (as long as x(0) = 0). Balanced truncation therefore aims to create a ROM by only retaining the states whose observability and controllability is high, as these are the ones which have the largest impact on the system dynamics. In order to identify which states are the most controllable and observable, let us define the controllability Gramian:

W_c(t_∞) = ∫_0^{t_∞} e^{At} BB† e^{A†t} dt,    (2)

and the observability Gramian:

W_o(t_∞) = ∫_0^{t_∞} e^{A†t} C†C e^{At} dt,    (3)

where † is the complex conjugate transpose. These are defined for stable and unstable systems for 0 ≤ t_∞ < +∞. As t_∞ → +∞ however, the Gramians converge to constant matrices for stable systems but become unbounded for unstable systems. When the limits exist, the following notation is used: W_c = lim_{t_∞→+∞} W_c(t_∞) and W_o = lim_{t_∞→+∞} W_o(t_∞). A system is said to be balanced if W_c = W_o = Σ = diag(σ_1, ..., σ_{n_x}), where Σ ∈ R^{n_x×n_x} is a diagonal matrix and σ_1 ≥ σ_2 ≥ ... ≥ σ_{n_x} are the Hankel singular values (HSVs) of the system. The (non-zero) HSVs are unique and provide an indication of the corresponding state's dynamical significance. In other words, the states of a balanced system are ranked according to their joint controllability and observability.
In general however, systems are not balanced: given a transfer matrix G(s), the realization (A, B, C), or internal coordinate system used to define the states x, is not unique. In order to balance a system it is therefore necessary to find the coordinate transformation T = S^{-1} ∈ C^{n_x×n_x} that ensures:

S W_c S† = T† W_o T = Σ.    (4)

The transformed (balanced) system is then (Â, B̂, Ĉ) = (SAT, SB, CT). In order to obtain T and S for stable systems, the converged Gramians are evaluated by solving the following Lyapunov equations:

A W_c + W_c A† + BB† = 0,    A† W_o + W_o A + C†C = 0.    (5)

The balancing transformations are then found using Eq. (6c), which can be computed from the singular value decompositions (SVDs) in Eq. (6a) and (6b):

W_c = XX†,  with X = U_c Σ_c V_c†,    (6a)
W_o = ZZ†,  with Z = U_o Σ_o V_o†,    (6b)
Z†X = UΣV†,  T = X V Σ^{-1/2},  S = Σ^{-1/2} U† Z†.    (6c)

Note that X and Z are not unique. The balanced system can then be decomposed as follows:

Â = [Â_11 Â_12; Â_21 Â_22],  B̂ = [B̂_1; B̂_2],  Ĉ = [Ĉ_1 Ĉ_2],

and as the transformed states are now ranked by dynamical significance, the less observable and controllable states x̂_2 can be truncated to give the following ROM:

dx̂_1/dt = Â_11 x̂_1 + B̂_1 u,    y = Ĉ_1 x̂_1,

which is also a stable balanced system. An attractive property of balanced truncation is that an a priori upper bound on the ROM error exists:

||G − G_r||_∞ ≤ 2 (σ_{r+1} + ... + σ_{n_x}),

where r is the number of states that have been retained in the ROM, ||·||_∞ is the H_∞ norm, and G_r(s) = Ĉ_1(sI − Â_11)^{-1}B̂_1 is the reduced transfer matrix.
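For a small stable system whose matrices are available, the procedure above (Lyapunov equations, Gramian factorizations, and the SVD-based transformations of Eq. (6a)-(6c)) can be sketched as follows; this is a minimal illustration, not the paper's code:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, svd

def balancing_transformations(A, B, C):
    """Exact balancing transformations for a (small) stable system."""
    # Converged Gramians from the Lyapunov equations
    Wc = solve_continuous_lyapunov(A, -B @ B.conj().T)
    Wo = solve_continuous_lyapunov(A.conj().T, -C.conj().T @ C)
    # Factor the Gramians as Wc = X X^dag and Wo = Z Z^dag
    Uc, sc, _ = svd(Wc)
    Uo, so, _ = svd(Wo)
    X = Uc * np.sqrt(sc)
    Z = Uo * np.sqrt(so)
    # SVD of Z^dag X yields the HSVs and the balancing transformations
    U, sig, Vh = svd(Z.conj().T @ X, full_matrices=False)
    T = X @ Vh.conj().T / np.sqrt(sig)
    S = (U.conj().T @ Z.conj().T) / np.sqrt(sig)[:, None]
    return T, S, sig
```

With these conventions, S @ Wc @ S† and T† @ Wo @ T both equal diag(sig), and truncation amounts to keeping the leading columns of T and rows of S.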
Unfortunately, if the system matrices are not explicitly available or if the system dimension is too large, this technique cannot be applied directly. Moore [32], Willcox and Peraire [52] and Rowley [43] therefore introduced an approximate balanced truncation method sometimes referred to as balanced POD (BPOD), which is based on snapshots from the impulse response of the primal system (1) and the adjoint system (7), also assumed to be available:

dz/dt = A† z + C† v,    w = B† z.    (7)

In this approach, snapshots from the primal impulse response x(t_ck) = x_k = e^{A t_ck} B and the adjoint impulse response z(t_ok) = z_k = e^{A† t_ok} C† are stacked into matrices in order to find an approximation for the Gramians in the form of Eq. (6a), by defining X and Z in the following way:

X = [x_1 √δ_c1, ..., x_{N_c} √δ_cN_c],    Z = [z_1 √δ_o1, ..., z_{N_o} √δ_oN_o],    (8)

where there are N_c primal snapshots and N_o adjoint snapshots taken at discrete times t_ck and t_ok respectively, and δ_ck ∈ R and δ_ok ∈ R are the associated quadrature coefficients corresponding to a chosen numerical integration scheme. As a result, XX† and ZZ† are discrete versions of the continuous definitions of the Gramians in Eq. (2) and Eq. (3), respectively. Note that x_k ∈ C^{n_x×n_u} and z_k ∈ C^{n_x×n_y} and that the snapshots are not necessarily equally spaced in time. Forming X and Z in this manner therefore allows T and S to be computed using Eq. (6c), but at a reduced cost, since it avoids finding the solution of the Lyapunov equations (5) and also only requires finding the SVD of a (N_c n_u × N_o n_y) matrix instead of the three (n_x × n_x) matrices in Eq. (6a) and (6b). This can represent significant savings if N_c n_u ≪ n_x and N_o n_y ≪ n_x, which is usually the case in computational fluid dynamics. Additionally, "squaring up" the matrices X and Z to form the Gramians explicitly is detrimental to the accuracy of the results [32], so working with the factors directly is also preferable numerically.
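The snapshot method can be sketched as follows; this is a minimal illustration in which a matrix exponential stands in for the time-stepping code, with equally spaced snapshots and trapezoidal quadrature weights (all choices here are assumptions for the sake of the example):

```python
import numpy as np
from scipy.linalg import expm, svd

def bpod_transformations(A, B, C, dt, N, tol=1e-10):
    """Snapshot-based approximation of the balancing transformations,
    using equally spaced snapshots and trapezoidal quadrature weights."""
    w = np.full(N, dt)
    w[0] = w[-1] = dt / 2.0                      # trapezoidal coefficients
    Phi = expm(A * dt)                           # one-step primal propagator
    x = B.astype(complex)                        # primal impulse response e^{At} B
    z = C.conj().T.astype(complex)               # adjoint impulse response e^{A†t} C†
    Xc, Zc = [], []
    for k in range(N):
        Xc.append(np.sqrt(w[k]) * x)
        Zc.append(np.sqrt(w[k]) * z)
        x = Phi @ x
        z = Phi.conj().T @ z
    X, Z = np.hstack(Xc), np.hstack(Zc)
    U, sig, Vh = svd(Z.conj().T @ X, full_matrices=False)
    r = int(np.sum(sig > tol * sig[0]))          # drop numerically zero HSVs
    U, sig, Vh = U[:, :r], sig[:r], Vh[:r]
    T = X @ Vh.conj().T / np.sqrt(sig)
    S = (U.conj().T @ Z.conj().T) / np.sqrt(sig)[:, None]
    return T, S, sig
```

Only the small matrix Z†X is ever decomposed, and the Gramians are never formed explicitly.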

Balanced truncation of unstable systems
In this section, existing methods related to the balanced truncation of unstable systems are discussed. For stable systems it is straightforward to show that the balancing transformation T converges and that it diagonalizes and equalizes the Gramians as t ∞ → +∞, since the primal and adjoint states go to zero.
If the system is unstable however, the state becomes unbounded for large t_∞. W_c(t_∞) and W_o(t_∞), as defined in Eq. (2) and (3) respectively, are therefore also unbounded, and W_c and W_o are not defined. Nevertheless, Chiu [14], Kenney and Hewer [27], Therapos [49], and Al-Saggaf [4] showed that despite Eq. (2) and (3) being unbounded, the Lyapunov equations (5) still have solutions if and only if λ_i + λ_j ≠ 0, where λ_i and λ_j are any two eigenvalues of A. When this is the case, a balancing transformation can be obtained if W_c W_o is similar to a real diagonal matrix. This approach is therefore problematic whenever Eq. (5) does not have a solution or the balancing transformation does not exist. Furthermore, the resulting reduced-order models are not always of satisfactory quality [54]. An alternative method developed by Meyer [30] is based on the co-prime factorization of the transfer matrix. Nett [34] showed that G(s) = N(s)M(s)^{-1}, where N(s) = C(sI − Ā)^{-1}B and M(s) = I + K(sI − Ā)^{-1}B, is a right co-prime factorization of G(s) if Ā = A + BK is stable. Meyer and Franklin [31] showed that if K = −B†P, where P is the solution to the algebraic Riccati equation P A + A†P − P BB†P + C†C = 0, then the co-prime factorization is normalized. A realization of the stable system [N(s)† M(s)†]† is then given by (Ā, B, [C† K†]†, [0 I]†), which can be balanced and truncated as in Sec. 2.1. The stabilizing feedback K and output augmentation can then be undone to retrieve the ROM. An issue with this method is that it does not simplify to the standard method when the plant is stable. Zhou [54] developed a related method by using the frequency-domain definition of the Gramians:

W_cf = (1/2π) ∫_{−∞}^{+∞} (iωI − A)^{-1} BB† ((iωI − A)^{-1})† dω,
W_of = (1/2π) ∫_{−∞}^{+∞} ((iωI − A)^{-1})† C†C (iωI − A)^{-1} dω,

where W_cf and W_of are now well-defined as long as A has no eigenvalues on the imaginary axis. In the stable case, Parseval's theorem can be applied to show that W_cf = W_c and W_of = W_o.
It is then possible to evaluate W_cf and W_of directly if a transformation that decouples the stable and antistable dynamics of the system is available. However, they are usually instead evaluated by finding the stabilizing solutions to:

(A + BK) W_cf + W_cf (A + BK)† + BB† = 0,
(A + LC)† W_of + W_of (A + LC) + C†C = 0,

with K = −B†P and L = −Y C†. Here P is the stabilizing solution to the algebraic Riccati equation P A + A†P − P BB†P = 0 and Y is the stabilizing solution to the algebraic Riccati equation AY + Y A† − Y C†CY = 0. The balancing and truncation procedure from Sec. 2.1 is applied once these "generalized" Gramians are known. A snapshot-based extension to this method was developed by Dergham et al. [18], who applied it to a rounded backward-facing step and a cavity flow. Here, the snapshots are defined in the frequency domain rather than the time domain: the primal snapshots are of the form X(ω) = (iωI − A)^{-1}B, with an analogous form for the adjoint snapshots. Each snapshot therefore has to be evaluated by explicitly inverting a large matrix, but all the snapshots can potentially be evaluated simultaneously in parallel.
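The claim that the frequency-domain Gramian reduces to the time-domain one in the stable case (via Parseval's theorem) is easy to check numerically; the sketch below approximates the frequency-domain integral by a simple Riemann sum for a made-up stable 2-state system:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Made-up stable 2-state example: in the stable case Parseval's theorem
# gives Wcf = Wc, which we verify by direct quadrature
A = np.array([[-1.0, 2.0], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])
n = A.shape[0]

# Wcf = (1/2pi) * integral of (iw I - A)^-1 B B† ((iw I - A)^-1)† dw,
# approximated by a Riemann sum on a wide, symmetric frequency grid
omega, dw = np.linspace(-300.0, 300.0, 30001, retstep=True)
Wcf = np.zeros((n, n), dtype=complex)
for w in omega:
    R = np.linalg.solve(1j * w * np.eye(n) - A, B)
    Wcf += (R @ R.conj().T) * dw
Wcf /= 2.0 * np.pi

Wc = solve_continuous_lyapunov(A, -B @ B.T)   # time-domain Gramian
```

For an unstable A without imaginary-axis eigenvalues the same integral remains convergent, which is what makes the frequency-domain definition usable in that case.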
In the methods described above, the Hankel singular values corresponding to unstable modes are not necessarily larger than the ones corresponding to stable modes, so there is no guarantee that unstable modes will not be truncated. This is often not a desirable property for control design purposes. If, on the other hand, the transformation that uncouples the stable and antistable dynamics is available, such that G(s) = G_a(s) + G_s(s), then an alternative approach, for instance suggested by Enns [20], is to only balance and truncate the stable part of the system G_s(s) and simply add the unbalanced antistable dynamics back in, such that the ROM transfer matrix is G_r(s) = G_a(s) + Ĝ_s(s), where the hat refers to the balancing and truncation procedure. All the unstable modes are thus conserved.
Unfortunately, if only a time-stepping code is available and if the system dimension is large, none of the methods described above can be used. In order to tackle this issue, an extension to the snapshot method introduced in Sec. 2.1 was developed by Barbagallo et al. [11] and Ahuja et al. [1]. This extension is closely related to the approach proposed by Enns [20], as it is based on the separation of the stable and antistable dynamics of the system. Although it may be expensive to compute the full stable-antistable uncoupling transformation, it may still be possible to use an Arnoldi procedure to identify the right and left antistable eigenspaces P_a and Q_a respectively (scaled such that Q_a†P_a = I). The primal and adjoint systems can then be projected onto their respective stable subspaces by defining the projection matrix P = I − P_a Q_a† and applying it to the impulse responses. This projected system is then balanced and truncated using snapshots, as described in Sec. 2.1. The antistable dynamics are finally added back in, so that the final reduced-order model combines the unprojected antistable part with the projected system's balanced and truncated matrices (denoted, as above, with a hat). As in the method proposed by Enns [20], this procedure does not balance the unstable subspace (unlike the ones developed by Kenney and Hewer [27], Meyer [30], and Zhou [54]), but it guarantees that unstable modes are not truncated. For fluid systems, the dimension of the unstable subspace is often O(1-10), so this method often only requires computing a few eigenvalues and eigenmodes. However, an Arnoldi package is not always available, and for large systems (e.g. three-dimensional flows), finding even a few eigenmodes may still not be tractable. A further issue with this method is that if the system is poorly conditioned, it may be difficult to evaluate the projection matrix itself. In this article we propose an alternative method that is projection-free and does not require evaluating any global modes.
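The projection step can be illustrated on a small made-up example, building P = I − P_a Q_a† from the left and right eigenvectors of an unstable matrix (here they come from a dense eigensolver rather than an Arnoldi procedure):

```python
import numpy as np
from scipy.linalg import eig

# Hypothetical 4-state system with a single unstable eigenvalue (0.5)
A = np.array([[0.5, 1.0, 0.0, 0.0],
              [0.0, -1.0, 1.0, 0.0],
              [0.0, 0.0, -2.0, 1.0],
              [0.0, 0.0, 0.0, -3.0]])

lam, vl, vr = eig(A, left=True, right=True)
unstable = lam.real > 0.0
Pa = vr[:, unstable]                                  # right antistable eigenvectors
Qa = vl[:, unstable]                                  # left antistable eigenvectors
Qa = Qa @ np.linalg.inv(Qa.conj().T @ Pa).conj().T    # rescale so Qa† Pa = I
P = np.eye(A.shape[0]) - Pa @ Qa.conj().T             # projection onto stable subspace
```

Applying P to the impulse-response snapshots removes the unstable component; since P is a spectral projector it commutes with A, so the projected dynamics P A retain only the stable eigenvalues (plus zeros).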
As briefly mentioned in [5], one can choose to simply use the standard method for stable systems, but based on finite-time Gramians, which are expected to approximate the impulse response over the chosen time interval well enough. This article aims to show that if this finite time interval is long enough, the balancing transformations actually converge and hence balance the Gramians for any further time. These transformations can therefore be considered to be true balancing transformations, often obtained at a fraction of the cost of the methods described above. As a result, we show that the projection-free, snapshot-based balanced truncation method is directly applicable to unstable systems.
Projection-free balanced truncation of unstable systems

Theoretical justification
The goal of this section is to show that the projection-free snapshot-based balanced truncation method (BPOD) introduced in Sec. 2.1 can be used even in the unstable case, despite the fact that the Gramians are unbounded. In order to complete the proof, we set out to prove the following statements:

1. The transformations T and S as defined in Eq. (6c) converge to constant matrices, even for unstable systems.

2. The controllability and observability Gramians are balanced by the converged transformations for any sufficiently large t_∞, despite not converging to constant matrices.
These two statements are proven in Proposition 2 and Proposition 3 respectively, and ensure that the converged T and S matrices can be considered to be balancing transformations for the unstable system. These proofs are based on Proposition 1, which states that as t_∞ → +∞, the unstable singular vectors and values of the matrices X and Z can be identified explicitly. In particular, all the left singular vectors tend to constant vectors. The proofs outlined here correspond to the case where all eigenvalues have distinct real parts; Appendix A deals with configurations where some of the modes have the same growth rate. Let (A, B, C) be a minimal realization, where A = PΛQ† with P†Q = I, P = [p_1 ... p_{n_x}], Λ = diag(λ_1, ..., λ_{n_x}), Q = [q_1 ... q_{n_x}], and Re(λ_1) > ... > Re(λ_{n_x}). The primal and adjoint impulse responses are defined as in Sec. 2.1 and can respectively be written:

x(t) = e^{At}B = P e^{Λt} Q†B,    z(t) = e^{A†t}C† = Q e^{Λ*t} P†C†.

The approximations of X and Z from the impulse response snapshots defined in Eq. (8) can therefore be written:

X = Pβ,    Z = Qξ,

where β = [β_1† ... β_{n_x}†]† and ξ = [ξ_1† ... ξ_{n_x}†]† are built from the block rows:

β_i = [e^{λ_i t_c1} q_i†B √δ_c1, ..., e^{λ_i t_cN_c} q_i†B √δ_cN_c],
ξ_i = [e^{λ_i* t_o1} p_i†C† √δ_o1, ..., e^{λ_i* t_oN_o} p_i†C† √δ_oN_o],
where the superscript * refers to the complex conjugate.
Proposition 1: For large t_∞, the singular values and vectors of X (Z) corresponding to unstable singular values in Σ_c (Σ_o) converge to:

σ_ci = ||T_c(i−1) p_i|| ||β_i||,  u_ci = T_c(i−1) p_i / ||T_c(i−1) p_i||,  v_ci = β_i† / ||β_i||,
σ_oi = ||T_o(i−1) q_i|| ||ξ_i||,  u_oi = T_o(i−1) q_i / ||T_o(i−1) q_i||,  v_oi = ξ_i† / ||ξ_i||,

respectively, where ||·|| is the Euclidean norm of a vector and:

T_ci = I − Σ_{j=1}^{i} u_cj u_cj†,  T_oi = I − Σ_{j=1}^{i} u_oj u_oj†,  T_c0 = T_o0 = I.    (10a)

Proof. We will use a proof by induction and choose the induction hypothesis (I_n) to be the fact that the result holds for all i ≤ n. For the base case (I_1), if Re(λ_1) > 0 then the first mode p_1 is unstable and, for large t_∞, X = Pβ is dominated by the contribution p_1β_1, since ||β_1|| grows faster than all the other ||β_i||; this contribution is of rank 1. (I_1) therefore holds, up to a unit norm factor e^{iθ_c1}:

u_c1 = e^{iθ_c1} p_1 / ||p_1||,

where θ_c1 is a real scalar. The first singular vectors and value are thus uniquely defined, up to e^{iθ_c1}, which we can always choose to be equal to 1, so we will make this assumption without loss of generality for the remainder of this paper to simplify the notation. As a result, the direction of the first left singular vector u_c1 converges to that of the first eigenvector of the system p_1 (which is constant).
In the inductive step, we prove that if (I_n) holds for some rank n, then (I_{n+1}) also holds. The transformation matrices T_ci defined in Eq. (10a) project out all the left singular vectors u_cj for j ≤ i, i.e. T_ci u_cj = 0, while T_cj u_ci = u_ci for j < i. Thus, if p_{n+1} is an unstable mode and recalling that we assumed Re(λ_1) > ... > Re(λ_{n_x}), the matrix T_cn X is dominated for large t_∞ by the rank 1 contribution T_cn p_{n+1} β_{n+1}, all faster-growing contributions having been projected out. Since we are assuming that (I_n) holds for rank n, the first n singular values and vectors of X have already converged to the stated expressions, and the (n+1)th singular value and vectors must therefore converge to:

σ_c(n+1) = ||T_cn p_{n+1}|| ||β_{n+1}||,  u_c(n+1) = T_cn p_{n+1} / ||T_cn p_{n+1}||,  v_c(n+1) = β_{n+1}† / ||β_{n+1}||,

and hence (I_{n+1}) holds too. This completes the inductive step and, along with the base case, concludes the proof by induction. The singular vector u_ci is therefore pointing in the direction of the component of p_i that is orthogonal to the subspace defined by p_1 ... p_{i−1}: the procedure that identifies the left unstable controllability singular vectors is essentially a Gram-Schmidt process. An analogous derivation starting from the adjoint impulse response leads to the corresponding expressions for σ_oi, u_oi and v_oi, completing the proof of Proposition 1.
It is clear that the left singular vectors and singular values of X and Z, when restricted to the stable subspace, also converge to constants as t_∞ → +∞ (this is the basis for the snapshot method in [11] and [1]). Therefore all stable and unstable left singular vectors of X and Z tend to constants, i.e. U_c and U_o converge to constant matrices.

Proposition 2:
The balancing transformations T and S converge to constant matrices for large t ∞ .
Proof. Let us define the Hankel matrix M_H = Z†X = UΣV†, noting that the SVD of the Gramians can be approximated with snapshots for any finite t_∞. Using a similar reasoning to the one in the proof of Proposition 1, the (i+1)th set of Hankel singular values and vectors corresponding to each unstable mode of the system can be identified by projecting M_H onto the subspace that is orthogonal to the left and right singular vectors corresponding to all unstable modes j such that j ≤ i. By using a similar proof by induction to that in Proposition 1 (see Appendix B for the full derivation), with two transformations analogous to Eq. (10a) built from the left and right singular vectors of M_H, it can be shown that as t_∞ → +∞ the singular values and vectors of M_H corresponding to unstable modes converge in the same manner as those of X and Z. Because the singular values corresponding to unstable modes tend to infinity for large t_∞, we can separate the stable and antistable parts of X and M_H (denoted with the subscripts s and a respectively). As a result, the ith column of the converged antistable part T_a of the transformation matrix T, as defined in Eq. (6c), is T_ai = u_ci σ_ci / σ_i^{1/2}. The ratio of the singular values can be shown to tend to a constant (this fact is proven in Appendix C), so the matrix T converges to a constant matrix for large t_∞ (since T_s must also converge). An analogous argument can be made to show that S converges to a constant matrix too, with the ith row of the converged S_a given by S_ai = σ_oi u_oi† / σ_i^{1/2}. This completes the proof of Proposition 2.

Proposition 3: The converged transformations balance the controllability and observability Gramians for any sufficiently large t_∞.

Proof. If a (converged) transformation T(t_1) is found, corresponding to a given set of snapshots with the final snapshot taken at t_∞ = t_1, we would like to check that it diagonalizes and equalizes the Gramians W_c(t_2) and W_o(t_2), corresponding to a different (but also sufficiently large) set of snapshots such that t_∞ = t_2.
Let M_H(21) = Z(t_2)†X(t_1) = U(21)Σ(21)V(21)† denote the Hankel matrix formed from the adjoint snapshots taken up to t_2 and the primal snapshots taken up to t_1, with analogous notation for the other combinations of snapshot sets. The transformed observability Gramian T(t_1)†W_o(t_2)T(t_1) can then be expressed in terms of Σ(21) and V(21). By following the same procedure as in the proof of Proposition 2, we obtain V_a(21) = V_a(11), where, as above, the a subscript refers to the antistable part of the matrix for large t_∞ (i.e. large t_1 and t_2). For the stable part, given Eq. (16) and the fact that the stable modes decay for large t_1 and t_2, any additional snapshots cannot modify these subspaces, so V_s(21) = V_s(11). The transformed observability Gramian is therefore diagonal. An analogous proof can be applied to the controllability Gramian and leads to the same diagonal matrix, so that the converged transformations indeed balance the Gramians for any sufficiently large t_2. This completes the proof of Proposition 3.
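The conclusion of Proposition 2 can be checked numerically on a small made-up unstable system: although the Gramians grow without bound, the transformation computed from the impulse responses barely changes once the final snapshot time is large enough. A minimal sketch, assuming equispaced snapshots, trapezoidal weights and a hypothetical 3-state system with distinct growth rates:

```python
import numpy as np
from scipy.linalg import expm, svd

def bpod_transformation(A, B, C, dt, t_end, tol=1e-10):
    """Balancing transformation T from impulse-response snapshots up to t_end
    (equally spaced snapshots, trapezoidal quadrature weights)."""
    N = int(round(t_end / dt)) + 1
    w = np.full(N, dt)
    w[0] = w[-1] = dt / 2.0
    Phi = expm(A * dt)
    x, z, Xc, Zc = B.copy(), C.T.copy(), [], []
    for k in range(N):
        Xc.append(np.sqrt(w[k]) * x)
        Zc.append(np.sqrt(w[k]) * z)
        x, z = Phi @ x, Phi.T @ z
    X, Z = np.hstack(Xc), np.hstack(Zc)
    U, sig, Vh = svd(Z.T @ X, full_matrices=False)
    r = int(np.sum(sig > tol * sig[0]))        # retain the numerically nonzero HSVs
    return X @ Vh[:r].T / np.sqrt(sig[:r])

# Hypothetical unstable 3-state system (eigenvalues 0.3, -0.5, -1.0: distinct real parts)
A = np.array([[0.3, 1.0, 0.0], [0.0, -0.5, 1.0], [0.0, 0.0, -1.0]])
B = np.array([[1.0], [1.0], [1.0]])
C = np.array([[1.0, 0.5, 1.0]])

# Extending the simulations from t = 20 to t = 25 grows the Gramians by e^{0.6*5},
# yet the transformation itself should barely change (up to column signs)
T1 = bpod_transformation(A, B, C, dt=0.02, t_end=20.0)
T2 = bpod_transformation(A, B, C, dt=0.02, t_end=25.0)
```

The column-sign ambiguity of the SVD is the only remaining freedom, which is why the comparison of T1 and T2 has to be made up to a sign per column.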
In Appendix A, it is shown that the same conclusions hold for unstable and marginally stable repeated eigenvalues, as well as marginally stable complex conjugate pairs. For unstable complex conjugate pairs, only the two-dimensional subspace spanned by the two corresponding balanced modes converges. This is all that we require, as it is usually not desirable to truncate only one of the two modes corresponding to an unstable pair of complex conjugate eigenvalues. In this case, the two columns (rows) of the transformation T (S) that define this converged two-dimensional subspace oscillate as t ∞ is increased. The transformed Gramians are therefore equal and diagonal except for one (2×2) block along the diagonal for each unstable complex conjugate pair. Let us emphasize here that despite the transformation not becoming constant and the Gramians not being strictly diagonalized and equal, any T (and S) corresponding to a sufficiently large t ∞ can be used as the balancing transformation, as long as the 2D subspace corresponding to the unstable modes has converged. In this case the unstable complex conjugate mode pair is balanced as a whole.

Practical considerations
As illustrated in sections 4 and 5, the projection-free snapshot-based balanced truncation method can often be applied directly to large unstable systems, just as in the stable case. Nevertheless, some modifications to the method that can lead to significant improvements in the quality of the ROMs and/or the computational cost of the method are outlined in this section. These may be required for particularly challenging systems.

Final simulation time and sampling intervals
For both stable and unstable systems, the sampling intervals must be small enough to capture the highest frequencies of interest in the flow field and the final simulation time t ∞ must be large enough for all stable modes to decay. For unstable systems t ∞ must also be large enough for the impulse response to be dominated by the unstable modes at the end of the simulation, thus allowing T a and S a to converge.
Although there is no theoretical upper-bound on t ∞ , in practice as t ∞ → +∞ any initial transients eventually become so negligible compared to the long term response that the information related to modes that are more stable than the dominating unstable mode(s) may be lost. This can result in an inaccurate identification of the corresponding balanced modes. Both the upper and lower bounds on t ∞ are clearly problem-dependent and the trade-off between these two limits is investigated in further detail in Sec. 4.

Improving the accuracy of the method
If several unstable modes have nearly identical growth rates or if it is necessary to identify slowly decaying modes, the lower bound on t ∞ may be higher than its upper bound (see 3.2.1). It may therefore not be possible to obtain sufficiently accurate information about the less unstable parts of the system from the impulse response. One way to work around this is to use the fact that for sufficiently large t ∞ the most unstable mode(s) are still identified accurately. Another set of impulse response simulations can then be run with these modes projected out in order to identify the more stable modes. The procedure can potentially be repeated until each unstable mode has been identified and projected out. In this case the method is clearly not projection-free anymore. It becomes more similar to the one suggested by Barbagallo et al. [11] and Ahuja et al. [1], although here the entire unstable subspace does not necessarily need to be projected out for the method to work. Additionally, it still avoids the need for an Arnoldi solver and theoretically still yields the same transformation T as the projection-free method (and hence the unstable subspace is still balanced).
An alternative approach is inspired from the work of Morgans and Dowling [33] and Illingworth, Morgans and Rowley [25]: the unsatisfactory ROM obtained from the impulse response can be used as a first approximation to the system (with again the most unstable mode(s) accurately identified). Using this ROM, a controller aimed at suppressing the most unstable mode(s) can be designed and the resulting closed-loop impulse response can be used to deduce the corresponding open-loop dynamics and improve the initial ROM.

Large systems and Gaussian quadrature
In order to generate the balanced ROM, snapshots are typically recorded at regular time intervals. The matrices X and Z as defined in Eq. (8) -which are used to approximate the Gramian integrals in Eq. (2) and (3) -are formed by using Newton-Cotes quadrature weights of a selected order (e.g. trapezoidal, Simpson or Boole rule). However, the number of snapshots required can be significantly reduced by selecting a Gaussian quadrature rule instead (e.g. Gauss-Legendre quadrature), where the snapshots are not equally spaced in time.
Since the final simulation time is usually not known a priori, it may be preferable to use a "composite" quadrature rule. In this case each impulse response is divided into several time windows (of potentially different lengths). If there are N i snapshots in the ith time window, the integral of this window can then be evaluated independently from the rest of the impulse response, using a (N i )-point Gaussian quadrature. With this piecewise integration method, it becomes straightforward to add an additional time window to increase the final simulation time. It is then also possible to optimize the distribution of the snapshots across the full simulation, for instance, if a higher-order quadrature is required in a specific time window.
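A sketch of such a composite rule, using NumPy's Gauss-Legendre nodes mapped onto each time window (the window boundaries and point counts below are arbitrary choices for illustration):

```python
import numpy as np

def composite_gauss(windows):
    """Snapshot times and quadrature weights for a composite Gauss-Legendre rule.
    `windows` is a list of (t_start, t_end, n_points) tuples."""
    times, weights = [], []
    for a, b, n in windows:
        xi, wi = np.polynomial.legendre.leggauss(n)   # nodes/weights on [-1, 1]
        times.append(0.5 * (b - a) * xi + 0.5 * (b + a))
        weights.append(0.5 * (b - a) * wi)
    return np.concatenate(times), np.concatenate(weights)

# e.g. finer sampling of the early transient, coarser sampling later on
t, delta = composite_gauss([(0.0, 1.0, 12), (1.0, 5.0, 10)])
```

Appending a new `(t_end, t_end_new, n)` window extends the simulation without recomputing the existing snapshot times, which is the main practical advantage of the piecewise rule.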
If the system dimension is so large that storing each snapshot is computationally demanding, or if the number of snapshots required is so large that it results in an excessively expensive SVD (for three-dimensional flows, for instance), then it may be necessary to use Gaussian quadrature to make the balanced POD method tractable.
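As an illustration of such a composite rule, the following Python sketch (hypothetical helper name, not from the paper's codes) builds Gauss-Legendre snapshot times and weights per window, and applies them to a scalar impulse response whose Gramian integral is known analytically:

```python
import numpy as np

def composite_gauss(edges, n_per_window):
    """Gauss-Legendre snapshot times and weights on each window
    [edges[i], edges[i+1]], so that sum_j w_j f(t_j) approximates the
    integral of f over [edges[0], edges[-1]]."""
    ts, ws = [], []
    for a, b, n in zip(edges[:-1], edges[1:], n_per_window):
        x, w = np.polynomial.legendre.leggauss(n)  # nodes/weights on [-1, 1]
        ts.append(0.5 * (b - a) * x + 0.5 * (a + b))
        ws.append(0.5 * (b - a) * w)
    return np.concatenate(ts), np.concatenate(ws)

# scalar impulse response x(t) = e^{-t}: its Gramian integral is
# int_0^10 e^{-2t} dt = (1 - e^{-20}) / 2
edges = [0.0, 2.0, 5.0, 10.0]            # three windows of different lengths
t, w = composite_gauss(edges, [8, 8, 6])
X = np.exp(-t) * np.sqrt(w)              # weighted snapshot "matrix" (here 1 x 22)
approx = X @ X                            # = sum_j w_j x(t_j)^2
exact = (1.0 - np.exp(-20.0)) / 2.0
print(abs(approx - exact) < 1e-6)         # 22 snapshots suffice -> True
```

A Newton-Cotes rule of comparable accuracy on this interval would require far more (equispaced) snapshots, which is the point of the Gaussian alternative when storage or SVD cost dominates. Extending the final simulation time only requires appending one more window.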
Application to a spatially developing unstable system

Linearized complex Ginzburg-Landau equation simulation setup
In order to check that the application of the approximate balanced truncation method to unstable systems results in satisfactory ROMs for large-scale problems, we first apply the method to the linearized complex Ginzburg-Landau equation. This one-dimensional system exhibits spatially developing behavior and instabilities, which are representative of typical flow fields. The linearized complex Ginzburg-Landau equation (18a) and the corresponding adjoint equation (18b) are

∂q/∂t = −ν ∂q/∂x + γ ∂²q/∂x² + µ(x)q, (18a)
∂q⁺/∂t = ν* ∂q⁺/∂x + γ* ∂²q⁺/∂x² + µ*(x)q⁺, (18b)

where µ(x) = µ₀ − c_u² + µ₂x²/2. Note that here x is the spatial variable, while q is the system state. The complex parameters ν and γ characterize the convection and the diffusion/dispersion in the system respectively, while µ is related to the exponential growth of instabilities. The parameter µ₀ plays a role similar to that of the Reynolds number in the Navier-Stokes equations, as it determines the nature of the global stability of the system. A large value of µ₂ corresponds to a large degree of non-parallelism, while a small value of µ₂ and a large value of ν result in a strongly non-normal flow. Finally, c_u is the most unstable wavenumber in the flow field.
The results in this section were obtained using the code developed by Bagheri [9], with a set of parameters similar to the supercritical (globally unstable) system considered in [9], as summarized in Table 1. In order to demonstrate the method's ability to deal with several unstable modes, however, µ₀ was set to 0.57 in order to obtain a two-dimensional antistable subspace. In this code, the forward problem uses a spectral Hermite collocation method, where the state q ∈ C^{n_x} is evaluated at the roots of the n_x-th Hermite polynomial H_{n_x}(bx), and the parameter b is chosen to obtain an accurate approximation of the continuous problem. The adjoint equations have been obtained from the discretized forward equations using a scaling matrix M, so that the energy of the discretized state, defined using the inner product q†Mq, approximates the energy of the continuous state. The adjoint state-space matrices are therefore of the form (A⁺, B⁺, C⁺), as opposed to simply (A†, C†, B†) (see the Appendix of [9] for more details regarding the discretization of the code).
In the present work, the system was set up as a single-input-single-output (SISO) system: the input (actuator) has a narrow Gaussian distribution, centered at x I , the upstream limit of the region where instabilities are able to grow ("branch I" in [9]). The output (sensor) is defined by the same Gaussian function, but centered at x II , the downstream limit of the growth region ("branch II"). The discretized system has 220 states, corresponding to a spatial extent of [−85, 85].
The Ginzburg-Landau equation has often been used to model fluid instabilities (reviews on this topic such as [24,15] frequently demonstrate important concepts with this one-dimensional model), and is a common test case for flow-control and model reduction studies of convectively and globally unstable flows, because its behavior is often representative of the Navier-Stokes equations but at a much lower computational cost (e.g. [36,28,16]). For more details about the Ginzburg-Landau equation and related studies, the reader is referred to [9].

Numerical results
In this case, the cost of a simulation is low, so there was no need to use the Gaussian quadrature scheme suggested in Sec. 3.2; a Boole-rule quadrature scheme was chosen instead. The number of snapshots was chosen such that t∞/N ≈ 0.05. The balancing transformations and ROM were then obtained as described in Sec. 2.1 and Sec. 3. As there are only 220 states in the full system, it was also possible to compute the transformations and ROMs based on the various procedures described in Sec. 2.2. For clarity, however, we choose to compare our method to the snapshot-based projection method [11,1] and to what can be referred to as its "exact" equivalent [20], where the stable subsystem is balanced exactly.
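The projection-free snapshot procedure itself (Sec. 2.1 and Sec. 3) can be sketched compactly. The toy unstable system below is purely illustrative, and trapezoidal weighting stands in for the Boole rule used in the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy test system with one unstable eigenvalue (lambda = +0.1); illustrative only
D = np.diag([0.1, -0.5, -1.0, -2.0, -3.0])
P = rng.standard_normal((5, 5))
A = P @ D @ np.linalg.inv(P)
B = rng.standard_normal((5, 1))
C = rng.standard_normal((1, 5))

def impulse_snapshots(M, v0, times):
    """Columns e^{M t} v0 (M is diagonalizable here, so use its eigenbasis)."""
    lam, V = np.linalg.eig(M)
    c = np.linalg.solve(V, v0).ravel()
    return np.stack([(V @ (np.exp(lam * t) * c)).real for t in times], axis=1)

# forward and adjoint impulse responses, weighted by trapezoidal quadrature
t_end, N = 8.0, 400
times = np.linspace(0.0, t_end, N)
w = np.full(N, times[1] - times[0]); w[[0, -1]] *= 0.5
X = impulse_snapshots(A, B, times) * np.sqrt(w)       # primal snapshot matrix
Z = impulse_snapshots(A.T, C.T, times) * np.sqrt(w)   # adjoint snapshot matrix

# balancing transformation from the SVD of the weighted Hankel matrix Z^T X
U, s, Vh = np.linalg.svd(Z.T @ X, full_matrices=False)
r = 3
T = X @ Vh[:r].T / np.sqrt(s[:r])        # direct balanced modes
S = (U[:, :r] / np.sqrt(s[:r])).T @ Z.T  # adjoint balanced modes
Ar, Br, Cr = S @ A @ T, S @ B, C @ T     # reduced-order model (Petrov-Galerkin)

print(np.allclose(S @ T, np.eye(r)))     # bi-orthogonality -> True
```

No eigenvalue problem is solved and no subspace is projected out: the snapshots contain the growing unstable mode, exactly as in the method applied here. In the paper the snapshots come from PDE simulations rather than a matrix exponential.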

Comparison of the system's behavior with theory
As the conclusions of Sec. 3 are based on the theoretical limiting behavior of unstable systems, it is crucial to investigate the extent to which the key steps of the method yield the predicted results. In this section, we therefore compare several matrices obtained from the simulations with their theoretical counterparts. When the matrices considered here have a dual analog, which behaves in the same way, we choose to focus only on one of them for brevity.
The first result that is checked here is part of the conclusion of Proposition 1, i.e. that lim_{t∞→+∞} u_ci = ũ_ci, where ũ_ci = T_{c,i−1} p_i ‖T_{c,i−1} p_i‖⁻¹ for unstable modes (there are two in this case), as stated in Eq. (9a). We therefore plot the first 10 columns of |Ũ_c† U_c|, with Ũ_c = [ũ_c1 ũ_c2], which we expect to be equal to [I_{2×2} 0_{2×8}]. Figure 1 shows that indeed, for large enough t∞, the first unstable singular vector of the controllability Gramian, u_c1, points in the direction of p_1, and the second singular vector u_c2 points in the direction of the component of p_2 that is orthogonal to p_1. An analogous conclusion can be drawn for the observability singular vectors. Next, we partition V_c and V into their antistable and stable parts, [V_ca V_cs] and [V_a V_s] respectively, and we expect the cross-terms V_ca†V_s and V_cs†V_a to vanish for large t∞. Indeed, Fig. 2 shows that this is approximately true for large enough t∞, although at t∞ = 80, residuals still appear in the cross-terms V_ca†V_s and V_cs†V_a. We now turn our attention to the transformation matrix, where we theoretically expect the direction of the unstable balanced modes to tend to that of the unstable left controllability singular vectors U_ca. In Fig. 3, we plot |U_c†T̂|, where the T̂_i are the normalized balanced modes T̂_i = T_i ‖T_i‖⁻¹, so each column is in the same direction as the corresponding column of U_c†T. We therefore expect |U_ca†T̂_a| = I, |U_ca†T̂_s| = 0 and |U_cs†T̂_a| = 0. However, as seen in Fig. 2, V_c†V is not fully converged at t∞ = 80, and hence the only part of the expected behavior that clearly appears in Fig. 3 is |U_cs†T̂_a| = 0. Recalling that U_cs spans the component of the stable subspace that is orthogonal to the antistable subspace, this can be interpreted as the fact that, in practice, the unstable balanced modes span the same subspace as the unstable global modes. However, it does not appear that the stable balanced modes are orthogonal to the unstable subspace at t∞ = 80, since |U_ca†T̂_s| ≠ 0, nor that each unstable balanced mode is in the direction of the corresponding controllability singular vector, since |U_ca†T̂_a| ≠ I.
These observations can be explained by considering the structure of |U_c†T̂| when V_c†V is still converging, with small residual blocks ε_aa, ε_as and ε_sa (of the correct dimensions) in place of its converged block structure. Given that Σ_cs and Σ_s converge to constants, while lim_{t∞→+∞} Σ_a = +∞ and lim_{t∞→+∞} Σ_ca = +∞, it is not surprising that lim_{t∞→+∞} Σ_cs ε_sa Σ_a^{−1/2} = 0, while the remaining cross-term, which is amplified by Σ_ca, need not vanish. The behavior of the converging matrices considered up to this point can therefore readily be explained. We now wish to check that the transformation matrices converge as expected in practice. In order to do this, the transformation matrices T and S are computed for a range of t∞ values, and the rates of change in the transformations over a fixed time interval ∆t, ‖T(t∞) − T(t∞ − ∆t)‖_F/∆t and ‖S(t∞) − S(t∞ − ∆t)‖_F/∆t, are shown in Fig. 4b (‖·‖_F is the Frobenius norm of a matrix). Clearly, both matrices converge to constants for large t∞, as expected, despite the exponential growth of the energy of the primal and adjoint states shown in Fig. 4a.
It is important to note that although T and S converge, the convergence of the direction of the balanced modes also needs to be checked. We therefore plot 1 − |T̂_i(t)†T̂_i(t − ∆t)| for each normalized balanced mode T̂_i, to check that the direction of the balanced modes also converges. Figure 5 shows that even for large t∞, the most unstable mode converges, while the more stable modes eventually become so negligible compared to the most unstable mode that they lose their accuracy. Not surprisingly, the less dynamically significant the mode, the lower the maximum value of t∞ before it diverges. This behavior is expected, as mentioned in Sec. 3.2, and its consequences for the quality of the ROM are further investigated in Sec. 4.2.2 and in the next paragraph.
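This direction-convergence diagnostic is straightforward to compute. A minimal illustrative Python sketch (the helper name is hypothetical) is:

```python
import numpy as np

def mode_direction_change(T_new, T_old):
    """Per-column convergence metric 1 - |That_i(t)^H That_i(t - dt)|.

    Each column is normalized first, so the metric is 0 when the direction
    of the mode is unchanged and approaches 1 when it becomes orthogonal."""
    Tn = T_new / np.linalg.norm(T_new, axis=0)
    To = T_old / np.linalg.norm(T_old, axis=0)
    return 1.0 - np.abs(np.sum(Tn.conj() * To, axis=0))

# a converged direction (rescaled copy of itself) and a drifting one
t_old = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
t_new = np.array([[2.0, 1.0],
                  [0.0, 2.0]])   # column 0: same direction, rescaled
change = mode_direction_change(t_new, t_old)
print(change[0] < 1e-12)   # -> True (direction unchanged)
print(change[1] > 0.01)    # -> True (direction drifted)
```

Applied to successive transformation matrices T(t∞) and T(t∞ − ∆t), this reproduces the quantity plotted mode by mode in Fig. 5.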
Finally, we check that the Gramians are indeed balanced by the transformation. By definition, Gramians computed using a given set of snapshots are exactly balanced by the transformations T and S computed using the same set of snapshots. However, we wish to investigate how well the "converged" transformation balances Gramians computed using another set of snapshots. In order to do this, recall from Eq. (15) that the transformed observability Gramian only differs from the diagonal matrix Σ through the factor V_{(11)}†V_{(21)}. This term, the only potentially non-diagonal one, therefore provides a good indication of how well-balanced the Gramians are, as we expect it to tend to the identity matrix. Similarly, we could investigate how close U_{(11)}†U_{(12)} is to the identity matrix, to evaluate how well-balanced the controllability Gramian is.
We therefore choose to compute ‖I − |V_{(11)}†V_{(21)}|‖_F, where the transformations have been computed using sets of snapshots corresponding to t∞ = t₁ = 20, 40 and 80 time units, and applied to Gramians computed for values of t∞ = t₂ ranging from 0 to 120. We only consider the top-left (4×4), (8×8) and (12×12) blocks of V_{(11)}†V_{(21)}, in order to evaluate how balanced the resulting 4th, 8th and 12th-order ROMs would be.
Focusing first on the influence of t₁, Fig. 6 shows that, in general, computing the transformation with a longer t₁ improves the balancing for all three ROM orders (for all but the highest values of t₂ and close to t₁ = t₂), as expected: theoretically, the higher the value of t₁, the more converged the transformation matrices. Specifically, the three curves corresponding to t₁ = 20 have a large error regardless of the ROM order, as the transformation matrices are not converged. On the other hand, if t₁ is too large, the set of snapshots used to define the transformation does not allow an accurate identification of the dynamically less significant modes. Gramians (and ROMs) which include these modes will therefore not be properly balanced, regardless of t₂. Hence, in Fig. 6, the curve corresponding to a 12-mode Gramian with t₁ = 80 always has a large balancing error.
Considering now the effect of t₂, the quality of the balancing increases as t₂ approaches t₁. At t₂ = t₁, the Gramians are exactly balanced by definition, as mentioned above. For t₂ > t₁, the error settles to a constant value until, eventually, a sharp degradation appears for the highest values of t₂, for instance at t₂ ≈ 100 for the t₁ = 80, 8-mode ROM. This behavior can be explained by the fact that the accuracy of the Gramians XX† and ZZ† (as opposed to that of the transformations) decreases if t₂ is too large, as they are also approximated using state snapshots.
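A closely related cross-validation diagnostic can be sketched as follows. Instead of the ‖I − |V_{(11)}†V_{(21)}|‖_F metric used in the text, this illustrative Python sketch (toy three-state unstable system, hypothetical helper names) measures the normalized off-diagonal content of the observability Gramian computed from t₂ snapshots after transformation by the balancing modes computed from t₁ snapshots:

```python
import numpy as np

rng = np.random.default_rng(1)
D = np.diag([0.2, -0.4, -1.5])                    # one unstable mode; toy system
P = rng.standard_normal((3, 3))
A = P @ D @ np.linalg.inv(P)
B = rng.standard_normal((3, 1))
C = rng.standard_normal((1, 3))

def snapshots(M, v0, t_end, N=300):
    """Trapezoid-weighted impulse-response snapshots sqrt(w_j) e^{M t_j} v0."""
    t = np.linspace(0.0, t_end, N)
    w = np.full(N, t[1] - t[0]); w[[0, -1]] *= 0.5
    lam, V = np.linalg.eig(M)
    c = np.linalg.solve(V, v0).ravel()
    return np.stack([(V @ (np.exp(lam * tj) * c)).real for tj in t], axis=1) * np.sqrt(w)

def transformation(t1, r=2):
    X, Z = snapshots(A, B, t1), snapshots(A.T, C.T, t1)
    U, s, Vh = np.linalg.svd(Z.T @ X, full_matrices=False)
    return X @ Vh[:r].T / np.sqrt(s[:r]), (U[:, :r] / np.sqrt(s[:r])).T @ Z.T

def balancing_error(t1, t2, r=2):
    """Off-diagonal content of the observability Gramian from t2 snapshots,
    transformed by the balancing modes computed from t1 snapshots."""
    T, S = transformation(t1, r)
    Z2 = snapshots(A.T, C.T, t2)
    Wo_t = T.T @ (Z2 @ Z2.T) @ T                  # diagonal if well balanced
    off = Wo_t - np.diag(np.diag(Wo_t))
    return np.linalg.norm(off) / np.linalg.norm(Wo_t)

print(balancing_error(8.0, 8.0) < 1e-6)           # same snapshots: exact -> True
print(f"cross-validated error (t1 = 8, t2 = 6): {balancing_error(8.0, 6.0):.1e}")
```

As in Fig. 6, the error vanishes (to round-off) at t₂ = t₁, while mismatched snapshot sets leave a finite residual.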
In this section, we have shown that, in practice, we cannot let the system fully converge by taking t∞ → +∞, as this results in the loss of crucial information about the dynamically significant stable behavior of the system. However, the balanced modes do converge up to a critical t∞ value and result in approximately balanced transformed Gramians. We therefore now turn our attention to the practical performance of reduced-order models obtained with this method, and compare them with ROMs obtained with existing methods.

Analysis of the reduced-order models
In this article, the L∞ norm of the difference between the full-order and reduced-order transfer functions, defined as sup_{ω∈ℝ∪{∞}} |G(iω) − G_r(iω)| for SISO systems, is used as a measure of ROM quality. In practice, we approximate this by max_{ω∈[−ω∞,ω∞]} |G(iω) − G_r(iω)| for sufficiently large ω∞. Note that this transfer function norm is different both from the H∞ error, defined as sup_{s:Re(s)>0} |G(s) − G_r(s)|, which is infinite for unstable systems, and from the time-domain L∞(I_t) norm usually used for signals x(t) and defined over some interval I_t as ess sup_{t∈I_t} |x(t)|. The L∞ error norm normalized by the L∞ norm of the full-order system, i.e. ‖G − G_r‖_∞/‖G‖_∞, will be referred to as the "ROM error" in the following paragraphs. Potentially unstable systems can alternatively be compared using the δ_ν-gap metric [51], which can be interpreted as the "distance" between two plants from the point of view of their behavior in a feedback setting. However, for the purposes of the present analysis, the L∞ norm is an adequate measure of the difference between the two systems (in this case it represents the maximum distance between the loci of G(iω) and G_r(iω)), as our focus in this article is on model reduction as opposed to controller design.
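The frequency-domain approximation of the ROM error described above can be sketched as follows (illustrative Python; the modal test system and its two-mode "ROM" are hypothetical, chosen so the error is known to be small):

```python
import numpy as np

def tf(A, B, C, omegas):
    """Complex transfer function G(i w) = C (i w I - A)^{-1} B on a grid."""
    n = A.shape[0]
    return np.array([(C @ np.linalg.solve(1j * om * np.eye(n) - A, B)).item()
                     for om in omegas])

# hypothetical modal test system: G(s) = 1/(s-0.1) + 1/(s+1) + 0.001/(s+5),
# and a two-mode "ROM" that drops the weak stable mode at s = -5
A,  B,  C  = np.diag([0.1, -1.0, -5.0]), np.ones((3, 1)), np.array([[1.0, 1.0, 0.001]])
Ar, Br, Cr = np.diag([0.1, -1.0]),       np.ones((2, 1)), np.array([[1.0, 1.0]])

om = np.linspace(-100.0, 100.0, 2001)             # finite surrogate for [-w_inf, w_inf]
G, Gr = tf(A, B, C, om), tf(Ar, Br, Cr, om)
rom_error = np.max(np.abs(G - Gr)) / np.max(np.abs(G))
print(rom_error < 1e-4)                            # dropped mode is weak -> True
```

Note that the error is evaluated on the imaginary axis only, which is what keeps the measure finite even though the system (and the ROM) are unstable.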
Given the restrictions on the final simulation time mentioned in Sec. 3.2 and Sec. 4.2.1, we first investigate the quality of ROMs obtained with different t∞ in Fig. 7. The behavior described in Sec. 3.2 is apparent: as the simulation time increases, the ROM error initially decreases. The lower bound on t∞ related to the decay of the stable modes is illustrated by the black lines with no markers, corresponding to ROMs of different orders computed with the projection method of [11,1]. The second lower bound on t∞, corresponding to the convergence of the unstable modes as they begin to dominate the impulse response, appears to be associated with a slower convergence, as shown by the lines with markers, corresponding to projection-free ROMs. Once t∞ is sufficiently large for the less dynamically significant information to be lost, increasing the order of the ROM does not reduce the error any further. Additionally, each ROM order has a different optimal final simulation time: the error is minimized just before the unstable modes start to dominate the response enough to sharply increase the error. In the following paragraphs, an estimated value t_opt(r) of the optimal t∞ corresponding to each ROM order r is used. These values are shown in Fig. 8, where it can be observed that, for this particular case, t_opt(r) seems to decrease roughly linearly as the model order is increased for r > 2.
The existence of an optimal final simulation time can be further illustrated by plotting the singular values of the Hankel matrix (Fig. 10) for different values of t∞. Here, the HSVs corresponding to 4th, 8th and 12th-order ROMs, obtained with the exact and snapshot projection methods as well as with the projection-free method (using the estimated optimal t∞ values), are plotted. As t∞ is increased in the projection-free case, the HSVs corresponding to unstable modes grow, while the rest of the singular values tend towards their respective converged values. This explains the increasing precision of the ROMs as the final simulation time is increased, as in stable systems. On the other hand, the flat region of each curve, where the singular values effectively have no dynamical significance, incorporates an increasing number of states as t∞ grows. This again corresponds to the less significant information disappearing as the unstable modes become too large, and can be used as a way to choose the appropriate ROM order, given an HSV distribution. For instance, with t∞ = t_opt(4) = 108 (top line with diamonds in Fig. 10), roughly 4 singular values are not in the flat region of the curve. As shown in Fig. 7, the error will start growing if t∞ is increased further with r = 4. Conversely, increasing the ROM order beyond 4 will not reduce the ROM error significantly when t∞ = t_opt(4).

Figure 9: (Color Online) ROM error as a function of the ROM order, obtained with t∞ = t_opt(r) for the projection-free method. Crosses: exact projection method [20]. Filled circles: snapshot-based projection method [11,1]. Open circles: projection-free method.
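One possible heuristic for automating this order-selection rule (not one proposed in the text) is to estimate the flat noise floor from the tail of the HSV curve and count the singular values that sit clearly above it; `tail` and `margin` below are ad-hoc tuning parameters:

```python
import numpy as np

def significant_order(hsv, tail=5, margin=10.0):
    """Heuristic ROM-order choice: number of Hankel singular values that sit
    clearly above the flat "noise floor", estimated from the curve's tail."""
    hsv = np.sort(np.asarray(hsv, dtype=float))[::-1]
    floor = np.median(hsv[-tail:])
    return int(np.sum(hsv > margin * floor))

# synthetic HSV curve: 4 significant values, then a flat floor
hsv = np.array([1e3, 1e2, 1e1, 1e0, 1e-4, 9e-5, 1.1e-4, 1e-4, 9.5e-5])
print(significant_order(hsv))   # -> 4
```

Applied to the curves in Fig. 10, such a rule would recover the observation that roughly 4 singular values sit above the flat region at t∞ = t_opt(4).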
In Fig. 9, the error obtained with t_opt(r) is plotted as a function of the ROM order, and compared to the performance of the snapshot [11,1] and exact [20] projection methods. Clearly, the two projection approaches yield ROMs of the same quality. The error of the projection-free method is of a comparable order of magnitude, with the difference increasing slightly with the ROM order. Note, however, that this difference is not significant: the error of a 12-state projection-free ROM is smaller than that of an 11-state ROM computed with the projection methods.
For control design purposes, it is crucial that the ROMs are able to reproduce the full system's impulse response and transfer function. In Fig. 11, the impulse responses (a) and the transfer function gains (b) of the three methods are compared for different ROM orders. The three methods yield similar results for the three ROM orders considered. As the ROM order is increased, in all three cases, the initial transient of the impulse response is estimated with increasing precision. Similarly, the high-frequency gain of the transfer function is better modeled at high ROM orders.

Application to a two-dimensional unstable flow-field
The previous sections in this article focused on demonstrating the basis for using the method from a theoretical point of view and by comparing its performance to that of existing methods, when applied to the linearized complex Ginzburg-Landau equation. In this section, the applicability of the method in a more realistic scenario, i.e. for large-scale unstable flow-fields is investigated. The chosen test case is the flow over a cylinder at Reynolds number Re = U ∞ D/ν = 100, where U ∞ is the incoming flow velocity, D is the cylinder diameter, and ν is the kinematic viscosity.
The forward direct numerical simulations run here are based on the immersed boundary projection method code developed by Taira and Colonius [48,17]. The two-dimensional, viscous, incompressible Navier-Stokes equations, expressed in vorticity form, are discretized on a second-order accurate finite-volume staggered grid. A multi-grid algorithm is used, where the problem is solved on a series of nested identical Cartesian uniform grids, each of twice the physical extent and half the resolution of the previous one. A second-order Adams-Bashforth scheme is used to discretize the advection terms, and a second-order Crank-Nicolson scheme is used for the other (linear) terms. For more information, the reader is referred to [48,17]. The linearized equations are therefore a discretization of Eq. (20), where ω refers to the vorticity, u to the velocity, the subscript 0 to base flow quantities, and f_IB to the immersed boundary forcing that imposes the no-slip condition on the cylinder surface, i.e. the second equation in Eq. (20), where u_B is the velocity on the body surface. The spatially distributed forcing function f is used to model the input (more details below).
The corresponding continuous adjoint equations can be shown to be Eq. (21), where ⁺ refers to adjoint quantities and f⁺ forces the adjoint simulation at the sensor location (the output location of the forward simulation), as opposed to f, which forces the forward simulation at the actuator (input) location. In this code, however, the adjoint equations were obtained from the spatially discrete but temporally continuous linearized equations; the time-marching scheme is self-adjoint in the present case, since the base flow is the steady unstable equilibrium of the problem. Note that the same nested-grid method was used in both the forward and adjoint problems; this procedure is not self-adjoint, so the adjoint equations we solve are technically not the exact discrete adjoint of the forward problem. The solver used for Eq. (20) and (21) is similar for both sets of equations, except for the advection terms and the fact that Eq. (21) runs backwards in time.
The same grid as the one in [50] was used. Each grid has 1500 × 500 square cells of length and width δ = 0.02. The chosen input-output setup is also almost identical to the one in [50]: the flow is forced using a disk-shaped vertical body force of radius 1 (white disk in Fig. 12), centered at (x, y) = (2, 0) in the wake. The sensor measures the vertical velocity, also at (x, y) = (2, 0) (black triangle in Fig. 12). The base flow, shown in Fig. 12, was computed using Selective Frequency Damping (SFD) [2,26]. The SFD parameter values chosen here were χ = 0.4391 and ∆ = 3.1974, as suggested by [26] for the cylinder flow at Re = 100.
In order to compute the reduced-order model, both the forward and the adjoint impulse responses were run for a long time (300 time units). Snapshots were stored every 0.2 time units, and a Boole quadrature rule was used to scale the snapshots. Once the snapshots are stored, one can compute the Hankel singular values of the ROMs for different values of t∞, as shown in Fig. 13. The general behavior is similar to the one in Sec. 4, but in this case the HSV distribution is more complex than in Fig. 10. Furthermore, since both the transients and the long-term response of the system are of interest here, it is less clear how one should quantify the ROM error. Nevertheless, since the cost of computing the SVD of the Hankel matrix is relatively small, one can find a compromise between ROM order and quality by trial and error, comparing the model impulse response to the one measured at the sensor.
In this case, using only 125 snapshots, corresponding to a final simulation time of 25 time units, a 12th-order model was obtained, whose impulse response is compared to the full system's in Fig. 14. It is clear that the method successfully reproduces the impulse response in this case, both for times that are much longer than the final simulation time used to compute the model and for the initial transients.
Note that the instability here is due to an unstable complex-conjugate pair of eigenvalues, i.e. the special case considered from a theoretical point of view in Appendix A.3.2. This therefore demonstrates that the technique is also readily applicable in practice for this ubiquitous scenario. The HSVs corresponding to the two unstable complex conjugate modes clearly appear in Fig. 13 as the first two HSVs, which grow to large values as t ∞ is increased, and are always of the same order of magnitude.
Another important point is that at t = 25, the output is |y(t)| ≈ 1, which is only about 4 times larger than the first transient peak. In other words, we obtained a successful ROM of this unstable linear system without requiring an excessively long final simulation time or a strong domination of the unstable modes. This suggests that, with a sufficiently small impulse at the input, the method might be directly applicable to the initial linear response of some nonlinear systems.

Conclusions
In this article, we have shown that applying the snapshot-based, projection-free, approximate balanced truncation method to unstable systems theoretically yields, as t∞ → +∞, a converged transformation that balances the Gramians, including the unstable subspace. For stable systems, this method has been popular in recent years, as it is straightforward to implement and scales well to large systems. The fact that it can be used without any modifications for unstable systems will allow the computation of BPOD-based ROMs and controllers even for three-dimensional unstable flow fields, without the need to identify or project out any global modes.
Benchmarking the method on a one-dimensional, spatially developing, unstable system showed that the system is not required to converge to its theoretical limiting state for this approach to yield ROMs of similar quality to the ones obtained with existing methods [11,1,20]. In fact, this limiting behavior cannot be reached exactly in practice due to finite-precision arithmetic: if the final simulation time becomes excessively large, only the unbounded growth of the most unstable modes is retained, and the initial transient information from the impulse response is lost. It was found that there is an optimal final simulation time, which is a function of the desired ROM order, and methods to estimate this optimal value were discussed.
In cases where the unmodified method yields unacceptable ROMs, or where the balanced modes must be identified more accurately, a method was outlined to circumvent the final time restriction. This method has a slightly increased cost as it requires projecting out some of the system's unstable balanced modes, but their identification still does not require an Arnoldi solver. Additionally, if the system dimension is so large that storing a large number of snapshots is challenging, it was suggested that a (piecewise) Gaussian quadrature method (with varying sampling intervals) could be used when storing the snapshots, as opposed to a traditional Newton-Cotes approach with a constant sampling rate, as this is a simple way to significantly reduce the number of snapshots required to obtain a ROM of a given accuracy.
Finally, the method was applied to a more realistic large-scale system: the supercritical (linearized) flow over a circular cylinder at Re = 100, and an excellent match was obtained with a 12th-order model. Only a short simulation and a small number of snapshots were required to obtain the ROM, where the final output amplitude was similar to the initial transients, suggesting that it may be possible to apply the method to the initial linear growth of nonlinear unstable flows.
The standard approximate balanced truncation algorithm was therefore shown to be applicable to unstable systems both from a theoretical and a practical point of view. The implications of this are twofold. First, the simple numerical setup required to compute the snapshot-based balanced truncation of stable systems can be safely applied without any modifications to unstable systems. Second, the computation of balanced reduced-order models of unstable systems that are so large that other existing methods are not tractable is made possible with this approach.

Acknowledgments
This research is funded by the EPSRC. The authors would also like to thank Shervin Bagheri for allowing us to use the Ginzburg-Landau Equation matlab code, on which Sec. 4 is based, as well as Tim Colonius for the Immersed-Boundary Fractional Step code used in Sec. 5.
A Projection-free approximate balanced truncation of unstable systems: special cases

In Sec. 3, it was assumed that the growth rates of the different eigenmodes of the flow field were distinct, so that one eigenmode would eventually dominate the impulse response. The purpose of this section is to show that the proof extends in a similar way when some of the eigenmodes have the same growth rate. The general case, where the nth to (n+m)th eigenvalues have the same growth rate Re(λ_n) = Re(λ_{n+1}) = … = Re(λ_{n+m}) = α but distinct imaginary parts ω_n ≠ ω_{n+1} ≠ … ≠ ω_{n+m} (with ω_i = Im(λ_i)), is considered first; the special cases of repeated eigenvalues and of complex conjugate eigenvalue pairs are then treated separately.

A.1.1 Controllability and observability singular vectors
Let us assume that the (n−1) most unstable modes are projected out, as in Sec. 3. Let us also write Ω = diag(ω_n, …, ω_{n+m}). For large t∞, the impulse response state x(t) = e^{At}B becomes dominated by the modes with growth rate α. The left and right singular vectors of T_{c,n−1}X thus tend to the subspaces spanned by T_{c,n−1}[p_n … p_{n+m}] and [β_n … β_{n+m}] respectively. T_{o,n−1}Z behaves in an analogous way, and hence its left and right singular vectors tend to the subspaces spanned by T_{o,n−1}[q_n … q_{n+m}] and [ξ_n … ξ_{n+m}] respectively. However, u_cn, …, u_{c,n+m} and u_on, …, u_{o,n+m} do not generally converge to constants, due to the oscillatory term e^{iΩt}.

A.1.2 Hankel matrix
Next, as in Sec. 3, we consider the projected Hankel matrix. The left and right Hankel singular vectors tend to the subspaces spanned by [v_on … v_{o,n+m}] and [v_cn … v_{c,n+m}] respectively, which are already (individually) orthonormal bases by definition.
We thus only need unitary matrices R_o and R_c to identify the Hankel singular vectors and values, which we can obtain from the SVD of the matrix M. Therefore, in general, the Hankel singular vectors are of the form given in Eq. (24), where R_c and R_o are not necessarily constant.

A.1.3 Balancing transformations
For large t∞, only the nth to (n+m)th columns of the transformation matrix T = U_c Σ_c V_c† V Σ^{−1/2} need to be considered, since Eq. (24) implies that v_ci†v_j = 0 if n ≤ i ≤ n+m and j ∉ [n, n+m], and hence the rest of T is independent of these (m+1) columns. Using Eq. (24), it follows that, in general, the subspace spanned by the balanced modes T_i for n ≤ i ≤ n+m converges to the subspace spanned by T_{c,n−1}[p_n … p_{n+m}]. Similarly, it can be shown that the adjoint balanced modes S_i for n ≤ i ≤ n+m must span the same subspace as T_{o,n−1}[q_n … q_{n+m}].

A.1.4 Transformed Gramians
Finally, since Eq. (15) is still valid here, W_o is diagonalized by T if V_{(21)} = V_{(11)}. We again only need to consider the nth to (n+m)th columns of each matrix, since the other columns are unaffected by these modes for large t₁ and t₂. Using Eq. (24), the block of V_{(11)}†V_{(21)} corresponding to these columns becomes R_{c(11)}†R_{c(21)}, which is not necessarily diagonal. The transformed Gramian T(t₁)†W_o(t₂)T(t₁) is thus fully diagonal, except for (m+1) columns, which have a dense ((m+1)×(m+1)) block along the diagonal. An analogous derivation applied to the controllability Gramian leads to the conclusion that the Gramians are balanced (diagonal and equal), except for the (m+1) columns corresponding to the modes with equal growth rates. The fact that the transformation does not in general balance unstable modes with identical growth rates is not problematic because, as noted above, the subspace spanned by the (m+1) balanced modes does converge, and the rest of the system remains unaffected. In other words, the (m+1) modes are balanced as a whole. As it is undesirable to truncate unstable modes for control purposes, the transformations T and S are still adequate as long as they are computed with a set of snapshots that allows this subspace to converge.
On the other hand, if R c and R o were to tend to constant matrices, the Gramians would become fully balanced by T and S for large t ∞ . We will focus on special cases where this happens in the sections below.

A.2 Repeated eigenvalues
In this case, λ_n = λ_{n+1} = … = λ_{n+m} = α + iω. In order to analyse the singular values and vectors of X in this case, the transformation is applied to the full controllability Gramian W_c(t∞). Here σ_i = σ̃_i e^{2αt∞}/(2α) (and σ_i = σ̃_i t∞ for α = 0), and M̃ tends to a constant matrix (and hence so do its singular values σ̃_n and vectors R_o, R_c). Since R_c and R_o converge to constants for large t∞, the nth to (n+m)th columns of the transformation matrix also converge, and we can conclude that, for repeated eigenvalues, the matrices T and S converge and fully balance the Gramians.
Therefore, even for large t∞, the left controllability (observability) singular vectors oscillate but stay in the two-dimensional plane defined by the eigenvectors. Similarly, the Hankel singular values grow on average exponentially as e^{2αt∞}, but the growth rate also oscillates around this mean trend. As discussed in Sec. A.1, for each unstable complex conjugate pair of modes, two columns of T and two rows of S will not converge regardless of t∞ (but will stay bounded), and a full (2×2) block will be present along the diagonal of the transformed Gramians, while the rest of the matrix will be independently balanced. If t∞ is large enough for the subspace spanned by the two corresponding columns of T (and rows of S) to be converged, then the resulting T and S can be used as adequate balancing transformations, where the complex conjugate pair is considered to be balanced as a whole.

B Induction proof for the Hankel matrix SVD
As in Proposition 1 in Sec. 3, we will use a proof by induction to show that, for large t∞, the singular vectors and values of the Hankel matrix corresponding to unstable modes converge as stated. We choose the induction hypothesis (I_n) to be the fact that the result holds for all i ≤ n. For the base case (I₁), if the first mode is unstable, then since σ_c1 ≫ σ_ci and σ_o1 ≫ σ_oi for i > 1 when t∞ is large, (I₁) holds: u₁ = v_o1, σ₁ = σ_o1 u_o1†u_c1 σ_c1 and v₁ = v_c1.
In the inductive step, we prove that if (I_n) holds for some rank n, then (I_{n+1}) also holds. The transformation matrices T_{li} defined in Eq. (12) project out the left singular vectors u_j for all j ≤ i, i.e. T_{li}u_j = 0 for j ≤ i, while T_{li}u_j = u_j for j > i. T_{ri} acts in an analogous way on the right singular vectors. Assuming (I_n) to hold for rank n, if the (n+1)th mode is unstable, then since σ_ci ≫ σ_cj and σ_oi ≫ σ_oj for i < j when t∞ is large, the projected Hankel matrix is dominated by the (n+1)th mode. Since V_o and V_c are individually orthonormal bases, v_{o,n+1} is orthogonal to v_oi = u_i for i ≤ n, and v_{c,n+1} is orthogonal to v_ci = v_i for i ≤ n. Therefore (I_{n+1}) holds too: u_{n+1} = v_{o,n+1}, σ_{n+1} = σ_{o,n+1} u_{o,n+1}†u_{c,n+1} σ_{c,n+1} and v_{n+1} = v_{c,n+1}. This completes the inductive step and, together with the base case, concludes the proof by induction.

C Proof of the convergence of the ratio of singular values
In this section, we prove that σ_ci²/σ_i tends to a constant for large t∞. In the corresponding expressions, only β_i and ξ_i are not constant for large t∞. Defining λ_i + λ_i* = 2α_i with α_i ∈ ℝ, both ‖β_i‖² and ‖ξ_i‖² grow as e^{2α_i t∞}/(2α_i), up to constant factors, for large t∞. Since σ_ci² and σ_i then grow at the same exponential rate, their ratio σ_ci²/σ_i tends to a constant.