Multiple abrupt phase transitions in urban transport congestion

During the last decades, our view of cities has been deeply transformed by new approaches combining engineering and complexity sciences. Network theory is playing a central role, facilitating the quantitative analysis of crucial urban dynamics, such as mobility, city growth or urban organization. In this framework, betweenness, a well-known centrality measure, represents a fundamental tool, standing out as a proxy of traffic density and congestion, among others. In this work, we focus on the spatial aspects of congestion. To help unveiling such relationship, we introduce a simple model composed of a grid connected to a set of trees. This structure, coined as the GT-model, allows us to analytically describe in terms of betweenness, how and why congestion emerges in particular geographical areas, and more importantly, how it may co-evolve with city growth and mobility pattern shifts. We observe the emergence of several congestion regimes, with abrupt transitions betwenn them, related to the entanglement of arterial and urban local roads. The existence of these transitions is corroborated in a large amount of real city networks, thus representing an important step towards the understanding and optimization of traffic dynamics over coupled transportation networks.


I. INTRODUCTION
Cities evolved, and continue evolving, into different organizational patterns as a consequence of historical, political, financial and efficiency circumstances [1,2]. Similar to the many-body problem, city constituents usually interact with one another, requiring a complex systems perspective to describe the observed phenomena. Urban transportation networks, on top of road networks, are a paradigmatic example of such scenarios.
The analysis of these systems has taken different approaches depending on their scale and purpose. Dynamics on inter-urban roads (a.k.a. arterial roads, or high capacity roads), characterized by long segments and limited interconnection, is governed by the interaction between cars circulating within. Thus, most of the phenomena can be explained by considering vehicles moving on independent roads, e.g., using car-following models and fluid dynamics [3], or the macroscopic fundamental diagram [4,5]. On the contrary, dynamics on intra-urban roads, characterized by short street segments with frequent interactions between them, is basically driven by road intercommunication, where vehicles can relocate to different segments during their trajectory. In these situations, complex networks theory plays a salient role. Theoretical models developed along these lines have helped to understand the observed phenomena on such systems, e.g., urban mobility [6,7], traffic [8][9][10][11][12][13][14][15], road usage [16,17], and network collapse [18][19][20].
These, a priori independent, transportation networks are increasingly entangled as cities sprawl over suburban areas. Under a multi-level analysis, these systems may develop special phenomena [21]. Up to date, only * asolerib@uoc.edu few works have studied the urban transportation networks from this intertwined (and entangled) perspective, and their implications have been barely analyzed. In [22] the authors describe the structural properties of systems where arterial and local urban roads share the same geographic space [23]. This model allows them to explain the shape of the experimentally observed universal betweenness distribution of the road networks of 97 cities worldwide.
In this work, we consider the complementary situation in which arterial roads and urban local ones operate on separate geographic spaces. Specifically, local roads are basically located at the city center, and arterial roads at the urban periphery. This model, that we call Grid-Tree model or GT-model, reproduces previous results in terms of betweenness distribution [22], and provides a different, yet plausible, explanation for them.
A considerable advantage of the GT-model comes from its analytical tractability. GT-model equations evidence that cities may experience a set of multiple abrupt phase transitions in the spatial localization of congested areas. These transitions define a set of congestion regimes, that we also empirically find in many cities worldwide. These regimes imply the emergence of congestion in the city center, its periphery or in urban arterial roads, and are related to the way different road types are entangled to form a unique road transportation system.
The manuscript is organized as follows. In Sec. II, we discuss the importance of the betweenness centrality in urban settings, its characteristics when the network is embedded in Euclidean spaces, and we relate our work with the recent literature. In Sec. III, we develop our GTmodel and analyze it in terms of known results related to the betweenness distribution and its spatial behavior. Section IV is devoted to the derivation, and subsequent validation, of the analytical expressions for the between-ness of several crucial nodes of the GT-model. In Sec. V we derive, using the previous analytical calculations, the abrupt transitions which define the congestion regimes. The existence of these different congestion regimes is validated with empirical road networks associated to real cities in Sec. VI. Finally, Sec. VII contains some concluding remarks and perspectives.
We focus on the classical definition based on shortest paths routing dynamics (in time or length) over urban roads networks. Recent studies indicate that drivers (and pedestrians) may opt for alternative trajectories larger than the shortest path [56], but the assumption of shortest path dynamics is still a crucial routing model, based on the rational choice of trajectories, thus helping in the design and analysis of transportation networks. In this context, the shortest-paths betweenness (B n ) considers only geodesic paths between city locations and is defined, for a given node n, as: where σ od is the number of shortest-paths going from origin o to destination d, while σ od (n) is the number of these paths crossing n. Factor N , usually taken to be (N − 1)(N − 2), N 2 , N (where N represents the number of nodes in the network), or even 1, represents a normalization constant which may be different depending on the application. For convenience, we will calculate the unnormalized betweenness, setting N = 1. As already evidenced by Guimerà et al. [8], the analysis of the betweenness distribution is crucial for understanding the dynamical properties of transportation networks since it is an accurate proxy of node and link usage, which combined with other properties (e.g., processing capacity) can accurately predict congestion. Recently, Kirkley et al. [22] have shown that, when networks are attributed with planar properties, such as urban road networks, these distributions display a similar shape that scales with N . Additionally, there is a strong dependence between node betweenness and its geographic position. In general, low betweenness nodes are located at peripheral regions, while high betweenness nodes appear near the urban center. This distribution can be seen in Fig. 2 for three different cities (see Fig. 2 of [22] for a large scale analysis). The right hand side of the betweenness distribution can be approximated with a power-law with an exponential cut-off: P (B n ) = B −α n e Bn/β . Exponent α happens to be quite stable, with values α ≈ 1. However, the value of β has a strong dependence on the size of the city [22].

III. GRID-TREE MODEL FOR CITY ROAD NETWORKS
The shape of the tail of the betweenness distribution suggests that there might be a structural transition between different network topologies as we depart from the city center. At large distances, we mainly expect to find arterial roads, whose connectivity may resemble a tree. The betweenness distribution of trees follow a power-law which is compatible with low influence on the cut-off parameter β on P (B n ). As distances approach the city center, the presence of local roads increases and the structure of the network increments in regularity. Although many types of regular networks could mimic this structure, e.g. Delaunay triangulation or rectangular grids, most of them have equivalent properties in terms of betweenness. These structures provide high inter-connectivity between city center buildings, with high redundancy of paths, offering, in turn, higher resilience to congestion than tree-like structures. A grid graph displays a betweenness distribution with a significant cut-off.
These observations allow us to set the basis of our model for road networks, composed of a rectangular grid to describe the city center, connected to trees that model the periphery. More precisely, our grid-tree model (simply GT-model hereafter) consists of a regular grid, of size w × w, connected with a set of n t trees with height h and branching factor r, see Fig. 1. We suppose the trees are full and complete.
The number of nodes of the grid, N G , and of the trees, N T , are given by while the size of the GT-model structure is For simplicity and symmetry of the GT-model, we choose four trees (n t = 4), an odd number for the sides of the grid (w = 2 + 1, with ∈ N), and assume that trees are connected to the central node of each side, see Fig. 1. Compared to actual cities of diverse size, the GTmodel reproduces the range (1, ∞) (right branch) of the Figure 1. Diagram of a network generated with the GTmodel, with parameters w = 2 + 1, r = 2 and h = 3. The maximum betweenness can be located at the grid-center node (in red), the connector nodes (marked with a double circle), or the tree-root nodes (marked with a star). Colors, labels and notation are set to explain the analytical computation of the betweenness of the central node of the grid, see Sec. IV. betweenness distribution in Fig. 2 (red line in panels a-c). These are the nodes with larger betweenness centrality, the critical ones to many network phenomena. This is a clear evidence that this simple regular network model may suffice for the analysis of these phenomena. The left hand side of the rescaled betweenness distribution, range (−∞, 1), corresponding to low values, emerges with the addition of structural noise to the GT-model. We have tested three types of noise: random addition and removal of edges (both with distance bias), and Delaunay noise. See Appendix A for a detailed description of noise generation procedures.
QQ-plots for the three cities in Fig. 2 show that the GT-model leads to similar shapes of the betweenness distribution, and specially, a good fit for the upper part of the distribution. The lower part of the distribution, corresponding to the structural noise, slightly underestimate betweenness.
Additionally, as important as the shape of the distribution, is the relationship between the node's geographic position and its betweenness. We show in Fig. 3 and Supplemental Fig. S3 that the GT-model recovers the monotonic decrease of the larger betweenness nodes, as well as the average betweenness, as we move away from the city center. Furthermore, it not only recovers the general trend but also the deviation of these values, i.e., large deviations for nodes near the city center, and low deviations for peripheral regions.
These positive results depend, of course, on the choice  of the parameters. The accuracy of the GT-model is determined by the values taken by w, r, h, and the level of noise. In particular, it is governed by the ratio between N G and N T . As this ratio grows, the high-betweenness distribution branch approaches the form of a grid and the cut-off β has an influential role, while in the opposite case a power-law emerges as it is related to the betweenness distribution of the tree. Results in Fig. 3 correspond to intermediate regimes.

IV. ANALYTICAL DERIVATION OF BETWEENNESS IN THE GT-MODEL
For a large set of traffic models and strategies, the critical injection rate of vehicles, γ c , i.e., the maximum rate at which vehicles can enter the system without congesting it, can be obtained in terms of the maximum node betweenness, B * n [8, 9, 21, 44-46, 48, 49, 52]: The value of B * n sets the onset of congestion. Thus, it makes sense to study routing dynamics over the GTmodel in such terms. Here, we are interested in understanding how topological changes in terms of the GTmodel model parameters (i.e., w, r and h), affect the position of B * n in the network. Numerical exploration of the (noiseless) GT-model reveals that the onset of congestion is set by nodes located Figure 3. Comparison between the real (bottom) and estimated (top) relationship between nodes geographic position and their betweenness for small (Dalian, China), medium (Medan, Indonesia) and big cities (Tokyo, Japan). The GTmodels used have parameters w = 51, r = 2, the values of h indicated in each column, and additive noise ∆ρ/ρ = 1.5%, see Appendix A 2. All distances and betweenness are normalized between 0 and 1. Coordinates for the nodes of the GT-model have been assigned using the planar embedding described in Appendix A 1. Solid lines on the scatter plots represent the 80-centiles of the betweenness distribution at the given radius R.
at three different key network positions: the center node of the grid (red node in Fig. 1), at connector nodes on the perimeter of the grid (nodes with a circumscribed circle), or at the root of the trees (nodes with a star within). We will refer to these nodes as grid-center (gc), connector (c), and tree-root (t), respectively. According to them, it is possible to define three different congestion regimes, which correspond to the location of the node that marks the onset of congestion. Therefore, the transportation network may collapse in the city center, in the perimetral city roads (ring roads), or in arterial roads. Note that in these two last cases, because of symmetry, we have four nodes with equivalent structural properties (see Fig. 1).
According to Eq. (4), the understanding of which circumstances lead to each regime, goes through the analysis of the betweenness for these three types of nodes. The rest of this section develops the analytical expressions of the betweenness for these nodes of interest, while we leave for the next Sec. V the analysis of the congestion regimes. To this aim, we proceed in the classical way of counting the number of paths crossing each of the nodes, according the definition in Eq. (1). As we will see, regularities in the network allow for an efficient computation of such values. For the sake of simplicity, we assume that origin and destination nodes do not contribute to the betweenness of such nodes. Otherwise, one should add 2(N GT − 1) to each node: N GT − 1 for the paths where the node is origin, and N GT − 1 for the paths where the node is destination.
Considering the structural composition of the GTmodel, each node may be traversed by three different types of paths: (1) paths with origin and destination in the trees; (2) paths between a tree and the grid; (3) paths within the grid. The betweenness of each node can be obtained as the sum of the contributions of each of these types of paths. Specifically, the betweenness of any node in the network can be written as a second-degree polynomial: with different coefficients depending on the GT-model parameters. The constant term c j considers the contribution of paths fully contained within the grid, which does not depend on the parameters of the tree. The linear term b j N T , instead, considers the contribution of paths that go between a tree and the grid, while the first term, a j N 2 T , considers the contribution of paths that go between trees. All the coefficients a j , b j , and c j depend on the grid parameter (w or ), and for the tree-root node, also on r.
The analytical computation of the coefficients of B j is cumbersome. Consequently, to improve readability of the paper, in the following, we only describe the mechanism we use to obtain the betweenness for the grid-center node (B (gc) ), and postpone the derivation of the betweenness for the connector node (B (c) ) and the tree-root node (B (t) ) to Appendices B and C, respectively.
Coefficient a (gc) of the betweenness of the grid-center node can be expressed as: where = (w − 1)/2 corresponds to the distance between the central node of the grid and its sides, see Fig. 1, and we have defined π x,y , the number of different paths in the grid involving x horizontal and y vertical steps: As described in Eq. (5), a (gc) only considers the contribution to betweenness of paths with origin and destination belonging to nodes in different trees. Following Fig. 1, which visually describes the process, we first consider the betweenness contribution to the grid-center node (colored in red) of paths that go from T 2 to T 4 . Each of these paths contributes with a unity to B (gc) , since all paths between those nodes go though the central node. Since we could also choose the paths between T 1 and T 3 , we obtain the 2 in Eq. (6). Now consider the contribution of paths that go from either T 1 or T 3 to nodes within T 4 . All these paths are canalized through nodes n o and n d , and there are many paths of equal length between n o and n d : we have path multiplicity (or degeneration). Some of them are illustrated in green lines in Fig. 1. We proceed combinatorially to count all these paths. Consider we need to move steps to the right (→) and steps down (↓) to go from n o to n d . There are π , = 2l l ways in which we could order the → and ↓ operations. Only one of these paths goes through the central node, the one where all ↓ operation precede the → ones. In this way, the betweenness contribution of any of the paths that go from nodes in T 1 or T 3 to nodes within T 4 is 1/π , . Considering there are four different origin-destination combinations in this configuration -(T 1 , T 4 ), (T 3 , T 4 ), (T 1 , T 2 ) and (T 3 , T 2 )we add a factor 4 to Eq. (6). Note that we do not have to account for the reversed assignment of the trees as origin and destination, e.g., (T 4 , T 1 ), due to the reversibility of the paths; if we calculate all the shortest paths from node i to node j, it is not necessary to do the same for paths between j and i.
The expression for b (gc) is the following: Figure 1 provides a visual support for the explanation of the terms in Eq. (8). Consider a node identified with variables (x, y), i.e., located at x horizontal and y vertical steps of the central node of the grid. The shortest paths whose destination node is within tree T 4 and that go through the central node, are paths whose origin is located on the left-hand side of the grid, shaded in orange in the diagram. Any of these origin-destination pairs has π x,y different paths that cross the central node, and π x+ ,y paths to reach the connector node that connects the grid and the tree, since the connector node is nodes to the right of the central node. Thus, each of the origin-destination pairs contribute to the betweenness of the central node with a factor π x,y /π x+ ,y .
We can now exploit the grid's symmetries to obtain their total contribution to the betweenness of the central node. For each origin node above the central node there is an equivalent node below it, thus a factor 2 must be added. However, for nodes with x = 0, there is just one path to the destination, and it crosses the central node, thus the term in front of the sums. Finally, a factor 4 to both terms is necessary to account for the 4 possible destination trees connected to the grid, thus completing all the terms in Eq. (8).
The calculation of coefficient c (gc) is similar to the previous ones, with just the difference that both origin and destination of the paths belong to the grid. The result is The idea is to consider that the origin node is identified with variables (x, y) and the destination node with (a, b), where x and a represent the horizontal distances to the grid center, while y and b are the corresponding vertical distances. There exist four different configurations for the relative positions of these origin and destination nodes, that lead to the four terms in Eq. (9). In the first term, the origin and destination are aligned with the grid center, i.e., x = a = 0 for a vertical alignment, and y = b = 0 for the horizontal alignment. They amount a total of 2 non-degenerated shortest paths per alignment, in which all of them contain the grid-center node.
The second term considers the cases in which origin and destination nodes are, one aligned horizontally and the other vertically with the grid center, i.e., the cases x = b = 0 and y = a = 0. There are four of these combinations, and in all of them there is only one shortest path that crosses the grid center. If we choose for example the case x = b = 0, there exist a total of π a,y shortest paths between the origin and the destination, thus explaining this term in Eq. (9).
Next, we have the situation in which the alignment with the grid center is limited to one of the nodes, and in one of the directions; let us choose the destination node and the horizontal alignment, i.e., b = 0. Now, only π x,y of the π x+a,y shortest paths connecting origin and destination pass through the grid center. Since there are eight possible origin-destination configurations with this relative alignment, they explain the third term in the r.h.s. of Eq. (9).
Finally, the last term comes from the two configurations where no alignment is present, with π x,y π a,b shortest paths crossing the grid center among the π x+a,y+b connecting origin and destination.
Once we have computed the three coefficients, the betweenness of the grid center is simply: Note that the same approach could have been used to obtain the betweenness of all the nodes in the grid, not just the grid center. Basically, the limits of the sums in Eqs. (8) and (9) would change to reflect the new position of the reference node, some terms would disappear from the coefficients due to the node not being within a shortest path, and special care would be needed to account for nodes in the sides and corners of the grid. The analysis for one type of these nodes, the connector node, is available in Appendix B, while in Appendix C we show the calculation of the betweenness for all of the nodes in the trees, including the important tree-root node. For validation purposes, Fig. 4 displays a correlation plot between the analytical values of the betweenness calculated using the expressions above, and the numerical values obtained using the Brandes algorithm [57], showing perfect agreement.

V. CONGESTION REGIMES
The position of the maximum betweenness node provides information about which city areas determine the collapse of the transportation network. The three regimes (central's grid, connector and tree-root node) we have discussed in the previous sections may be interpreted as different congestion phases which characterize a urban system. In this section, we establish a relation between the three congestion regimes and the parameters of the GT-model. Precisely, given the values of w, r and h, we conclude where, geographically, congestion occurs.
The transition between two different regions is defined by the condition where r 1 and r 2 are two distinct regimes. For instance, the frontier between the tree-root (t) and the connector node (c) regimes is defined by the equation with B (t) and B (c) introduced in Eq. (5). Clearly, Eq. (12) is a second-degree polynomial: Then, by using the equations in Sec. IV, and Appendices B and C, we can provide an analytic solution to Eq. (13): with the discriminant In a similar way we can obtain the transition between the other regimes. Figure 5 shows the three different regimes for varying configurations of the GT-model. When the trees are small with respect to the grid, the grid-center node dominates congestion. Increasing the size of the trees, the congestion jumps first to the connector nodes, and later to the tree-root nodes. An appropriate parameter to know the current congestion regime is the congestion radius R c , defined as the distance between the maximum betweenness node and the grid center.
Once noise is added to the GT-model using the methods in Appendix A, the regime's transitions are expected to soften. The left panel of Fig. 5 presents the behavior of the congestion radius as a function of N T for fixed r and w. First, for small sizes of the trees, we observe an offset in the grid-center regime, although the congestion radius remains close to the grid center (R c = 0). Secondly, as tree height increases, we observe an expected noisy behavior preclusion of the transitions between the regimes. Finally, a general shift of the different transition with respect to the noiseless case is also pointed out. However, despite all, the different regimes are still clearly identifiable and stable for a wide range of N T values when noise is added to the GT-model. It is worth highlighting that the abruptness of the transition remains despite the noise. As it can be seen in Supplemental Fig. S4, as we decrease the level of noise, the observed shift vanishes.
The right panel of Fig. 5 generalizes the results on the left one and analyzes, in terms of a phase diagram, the accuracy in predicting the transition between the different regimes for a large set of model parameters. As defined, the GT-model only considers squared grids and regular trees, which probably is too restrictive to resemble real cities and. Also, it renders sparse model sampling as the parameters get large. To overcome these drawbacks, we extend the model to incorporate intermediate grid and tree sizes (besides the ones given by Eq. 2) which allows to draw GT-model realizations of any network size. In the grid case, we generate these intermediate sizes between w 2 and (w + 1) 2 by incrementally adding nodes (one by one) to the current grid periphery until we reach the desired nodes number w 2 ≤ N G ≤ (w + 1) 2 . In this iterative process, nodes are added at random locations considering the remaining empty positions. We also implement an equivalent procedure for the tree.
Results in the right panel of Fig. 5 are intuitive: congestion occurs in the center of the grid (dark values in the phase diagram) when trees are short, i.e., grid dynamics predominate. As trees increase in size (vertical axis) a phase transition to the connector regime occurs (gray values in the diagram). At this point, the connector nodes become a bottleneck for the transportation network. Eventually, as trees keep increasing in size, the highest hierarchy nodes of trees becomes the bottleneck and then we reach the tree-root regime (light gray in the diagram). These transitions from the center to the outer bounds of the graph are reminiscent of the double percolation phase transition observed in core-periphery networks [58]. Indeed, the GT-model shares some structural features with that family of networks, which may explain the resemblance of such observation. Finally, note that the phase transitions evidenced in the left panel of Fig. 5 represent a vertical slice of the phase diagram in right one (marked as a blue vertical line).
We see that the transition between the grid center and the connector regimes, predicted by Eq. (17), is accurate in all the phase diagram (red line). The transition between the connector and the tree-root regimes, as given by Eq. (17), is only accurate at grid sizes given by Eq. (2) (black solid line with dots in Fig. 5). This is mainly because the addition of new nodes alters the betweenness of the connector and the tree root in an imbalanced way. See that, in comparison with the regular sizes of the grid (circles) we need much larger trees to trespass the phase transition. This can be considered in Eq. (17) without much effort to obtain a better prediction (black dashed line). A detailed explanation of how to correct Eq. (17) is given in Supplemental Sec. S1.
In addition to the detection of this phase transition points, our analytical development allows to understand the asymptotic behavior of the regimes. Here, the parabolic character of the transition function in Eq. (13) permits to state that the two regimes never collapse, neither reach a constant separation and the grid-side area enlarges as N G increases. This would mean that, as cities grow larger, the internal flow predominates the dynamics of the transportation system. This is compatible with recent observations considering real transport data [59].

VI. EVIDENCE OF CONGESTION REGIMES IN CITIES
The congestion regimes and their abrupt transitions uncovered by the GT-model have not been previously observed in real cities. In this section, we validate the existence of these regimes, and analyze how abrupt the transitions are by applying to empirical road networks the same analysis done for the GT-model. To this aim, we consider the road network contained within a circle of radius R p , using as geometric center of the city the one provided by the service [60]. For each value of R p , we identify the set of λ nodes with largest betweenness, and calculate the average congestion radius as the average distance to the center of the city of these λ-largest betweenness nodes. Theoretical analysis on GT-model predicts three different regimes delimited by two transitions. However, on real cities, these regimes may or may not exist or, alternatively, be different in number. To detect them, we use an automated method which statistically identifies change points where the congestion radius significantly changes, either in mean or slope [61]. Additionally, we make use of the elbow method (usually applied to k-means clustering algorithm) to choose the optimal number of change points, see Supplemental Sec. S2. Subsequently, these change points define the different congestion regimes.
We have analyzed the 97 city road networks in [22], using the data provided by the authors; a detailed description of all these cities can be read in this reference. From all of them, 49 networks present detectable regimes with multiple abrupt transitions between them, whereas the rest exhibit softer transitions or other patterns. Figure 6 shows the results for 9 of the networks with clear multiple abrupt transitions, with radius ranging from the 39 km of Abidjan to the 64 km of Tianjin. Equivalent results for another 21 cities are presented in Supplemental Fig. S5. For these 30 cities, the behavior is very close to that predicted by the GT-model. We first highlight that real cities, in general, have between 3 and 5 regimes. The additional regimes absorb noise when the transition is not as abrupt as expected. Besides this effect, we clearly observe that many cities, indeed, contain the expected regimes where congestion abruptly changes between distant geographical regions.
These results have been obtained by averaging the radius of the λ largest betweenness nodes for each city and radius, where λ = 15 in Fig. 6 and Supplemental Fig. S5. We present in Supplemental Fig. S6 a sensitivity analysis of the results in Fig. 6 for values of λ ∈ {2, 5, 10, 15}. This allows us to conclude that the detected pattern persists by extending our analysis beyond the single congestion node. Among the remaining 48 networks, 43 present softer regime transitions (see Supplemental Fig. S7) and 5 of them exhibit other patterns (see Supplemental Fig. S8). This makes evident that GT-model does not work properly for all the cities since the landscape characteristics, socio-economic factors and historical background may have different impacts in the morphogenesis of cities [62]. These factors may eventually derive into different downtown layouts (other than grid-like), or different connectivity patterns between arterial roads and urban centers, which are not considered by the GT-model. See for instance Melbourne, Rio de Janeiro, Mumbai and San Francisco in Supplemental Fig. S8, as examples. These cities present important commonalities: they are all constrained by the sea, with a shape that resembles a peninsula, and with weak connectivity among city differentiated components.

VII. CONCLUSIONS AND PERSPECTIVES
We have presented the GT-model for the analysis of dynamical properties of transportation networks. By using the model, which reproduces several well-known properties of real transportation networks, we unveil the existence of a double congestion phase transition, which is tightly related to the geographic areas where traffic congestion emerges. Remarkably, the transitions between these regimes are abrupt, implying that they may be un-predictable by only considering the current state of the system. Consequently, taking proper actions previous to the network collapse requires a complex systems analysis. Using the GT-model, we are able to provide analytical predictions along this direction, and reveal that the emergence of such phenomena is related to the size of the city, the way in which arterial and local roads are connected, and the size of the urban catchment area [63,64].
We provide here another step towards the understanding of how urban growth and changes in mobility dynamics affects urban infrastructure. Occurring these transitions at long or short time scales [17,20,59], urban infrastructure is continuously evolving and, undoubtedly, congested areas (or, alternatively, dense traffic areas) will also co-evolve with them. Clearly exemplified in [20], the system may transform into a radically different effective structural organization, from a lattice to a small-world in terms of percolation. Here, we unveil a similar effect for the spatial behavior of congestion. As cities increasingly incorporate mobility from larger distances, the core of the transportation network goes from a lattice-like behavior to a tree-like one, causing that the areas with the least resilience to congestion abruptly change into different geographic position, and eventually concluding in the different congestion phases we identify. We provide clues of the existence of such phenomena for about 50% of the analyzed cities worldwide -observing at times more than two regime shifts. Interestingly, we also show that several other transitional patterns may emerge.
Further work, along the lines of this paper, may continue in this direction and provide information about how the architecture of road inter-connectivity affects the emergence of the confirmed and alternative regimes. Although cities are currently undergoing a paradigm shift of urban mobility, ground transportation is, and will be, still prominent in the near future, and being able to adapt our transportation networks to the accelerated urban growth is of pivotal importance to optimize the cities of tomorrow.

A. GT-MODEL WITH NOISE
We have tested three types of noise: (1) random edge addition with length bias; (2) random edge removal with length bias; and (3) Delaunay triangulation noise. For all noise models, edges are gradually inserted or removed until graph density ρ = E/(3N − 6) is modified in the desired amount ∆ρ = ∆E/ (3N −6), where E is the number of edges and N the number of nodes of the network. Note that for planar graphs 0 ρ 1, see Chapter 2 of [62]. Since the noise is introduced with biases proportional to the distances between the nodes, it is necessary to assign first coordinates to the nodes.

Planar embedding
The purpose of the current appendix is to explain how the GT-model network is embedded in the surface, namely how we assign to the GT-model nodes a position in 2D. The coordinates origin, i.e., the node with position (0, 0), is assumed to be located in the center of the grid, which is well defined since the grid is supposed to have sides with an odd number of nodes, w = 2 + 1 (perfect grid), as shown in Fig. 1. This holds even for grids with a general size N G (such as in Fig. 5), because they are obtained by randomly adding new nodes to the periphery of a perfect grid. In the grid, nodes coordinates (x g , y g ) remain defined as the number of jumps required to reach the central node: x g counts the jumps in the horizontal direction, while y g in the vertical one.
It is possible to proceed in a similar way for the tree nodes. Here position of a node t is defined using polar coordinates with respect to the grid-center node: where • The tree-root is set to be in the same axis as the corresponding connector node, and at a distance R (t) = w + 1 from the grid center. This selection is useful to avoid the collision between tree and grid nodes; • d t is the number of jumps required to reach the node t from its tree-root (i.e., the tree level); • θ t accounts for the angular separation of node t with respect to the grid center. All nodes at the same level of the tree, and for the four trees of the GT-model, are uniformly distributed along the circle of radius R (t) + d t .
We show an example of this planar embedding in Supplementary Fig. S9.

GT-model with additive length bias noise
For this type of noise, we randomly add edges with a probability inversely proportional to a certain power of the euclidean distance d ij between endpoints i and j. The iterative procedure works as follows: for a given pair of nodes i and j chosen uniformly at random, an edge (i, j) is added with probability 1 − (d ij ) − , provided that planarity restrictions are not compromised. The value of > 0 controls the bias towards the introduction of new edges.

GT-model with negative length bias noise
This kind of noise is implemented following the same procedure as for additive noise, with the only difference that now edges are removed, rather than added.

GT-model with Delaunay noise
This type of noise is generated considering the Delaunay triangulation of the GT-model nodes, mimicking the procedure in [22]. Once the triangulation is generated, edges, uniformly chosen at random, are gradually inserted to the base model until the desired edge density is reached.

B. BETWEENNESS OF CONNECTOR NODES OF THE GT-MODEL
The calculation of the betweenness of the connector nodes of the GT-model, i.e., the nodes in the center of the sides of the grid to which the root of the trees are connected, can be done following the same approach as in Sec. IV. First, we decompose the betweenness in three contributions using Eq. (5): Let us consider for example the connector node to tree T 4 in Fig. 1. All the shortest paths coming from nodes in the other three trees have to pass through it, thus Similarly, all the paths starting in nodes of the grid and going to the tree T 4 also cross our connector node, contributing to its betweenness with one unity per origin node, i.e., with a term w 2 − 1. Additionally, this connector also participates in other paths between the grid and the other trees. More precisely, if the origin is a node in the same side of the grid as the connector, there are two trees with shortest paths through the connector. For example, if we take as origin the node which is at a distance y below the considered connector in Fig. 1, there is one path crossing the connector among the π 2 ,y shortest paths to arrive to tree T 2 , and π , paths through the connector among the π , +y to arrive to tree T 1 . Given the up-down symmetry of the system, the final expression for Finally, shortest paths that start and end in the grid and cross a connector node need again that one of the endpoints is in the same side as the connector, thus we choose again as origin a node at a distance y below the previously considered connector. The destination node can be identified with variables (a, b), the horizontal (to the left) and vertical (upwards) distances to the connector node, respectively. The case a = 0 consists of destinations in the same side as the origin and connector, thus contributing to the betweenness of the connector with a value 2 . When a > 0 there are π a,b paths through the connector node among the total π a,b+y shortest paths connecting the origin and destination nodes, thus we get:

C. BETWEENNESS OF TREE NODES OF THE GT-MODEL
The symmetries of full and complete trees allow for the calculation of the betweenness of all their nodes. In fact, all nodes of the tree located at the same level share the same value of the betweenness, thus we can denote it as B(v), where level v ∈ {0, . . . , h} is 0 for the root and h for the leaves of the tree.
Trees are characterized by the absence of cycles. As a consequence, there is a unique path connecting every pair of nodes, which means that all σ od = 1, thus simplifying Eq. (1).
Let us consider a node at level v of the tree. There are two types of paths that cross it: paths that come from one children (level v+1) and continue to one of its siblings (level v +1); paths that come from the parent (level v −1) and continue to one of the children (level v + 1). We need to count how many different paths exist for each of these types.
In the first case, for each of the r(r − 1)/2 pairs of children, there are N T (h − v − 1) possible origins of the path, and the same number of possible destinations, thus forming a total of r(r−1) Here, we have made use of Eq. (2), which provides the number of nodes in a full and complete tree with branching ratio r and height h: In our case, we have applied it to calculate the number of nodes of the sub-tree formed by a child from level v + 1 and all its descendants. In a similar way, the number of paths that cross the node and its parent is equal to the number of descendants of the node, r N T (h − v − 1), multiplied by the number of the rest of the nodes, N T (h) − N T (h − v). Combining both results, the betweenness of a node at level v reads: which can be written as: This expression is valid for all nodes of the tree, including the root (v = 0). In particular, betweenness vanishes for the leaves, B(h) = 0, as expected.
It is worth noting that maximum betweenness of the tree is located at the root only for trees with branching ratio r > 2; binary trees have the maximum at the children of the root. This can be shown by calculating the difference of betweenness between levels v = 0 and v = 1: The term in brackets is negative if r = 2 and h > 2. To obtain Eq. (28), we have made use of the property Once we have determined the betweenness within the tree, we need to include the contribution of the rest of the network which forms the GT-model. The new paths to consider are those starting outside the tree, with destination in a node descendant of the one for which we want to calculate the betweenness. If this node belongs to level v, following the same procedure which has led to Eq. (26), the result is (30) Note that N GT = w 2 + 4N T (h). Now, we obtain the final expression for the desired betweenness of the tree-roots of the GT-model, B (t) = B total (0): (31) For binary trees (r = 2), the consideration of the full GT-model network makes the root of the tree become the node of maximum betweenness among the rest of the nodes in the tree: (32) This time, the term in brackets is positive for all values of the branching ratio, height of the tree, and size of the grid.
The betweenness of the tree-root nodes can also be expressed in terms of the sizes of the trees of the GTmodel, as in Eq. (5), if we do not expand the value of N T = N T (h) in Eqs. (26) and (30): where we have made use of Rearranging the terms in Eq. (33) we get with

S1. CONGESTION REGIME IN NON-COMPLETE GRIDS
Analytical betweenness expressions in Sec. 4 have been derived assuming the case of complete grids, i.e. grids with side w = 2 + 1 (n ∈ N) and w 2 nodes. In Fig. 5, we also introduced incomplete grids, to fill the gaps between grids of odd-squared sizes. These incomplete grids, with extra nodes in the periphery, modify differently the values of the betweenness of the connector and tree-root nodes, thus shifting the transition between these regimes.
Consider Fig. S1 as an illustration of the following process. We start by connecting a new node n o to the left side of a complete grid, and quantify its contribution to the betweenness of the connector node n c (belonging to the side in front of the new node), and the adjacent treeroot, n t . After adding n o , nodes n c and n t , among others, experience an increment in its betweenness that can be formalized as: The value of ∆B (t) includes the contribution to betweenness of paths between n o and the nodes of the tree to which node n t belongs to. Since all paths need to cross n t to reach their destinations, this value is equivalent to the number of nodes in the tree minus one. These paths also cross the connector and contribute to ∆B (c) , see the black path in Fig. S1. However, ∆B (c) embodies a new term, δ (c) , that considers the paths between the new node and the grid-ones located in the same side as the connector n c , and above it, see the green paths in Fig. S1. It turns out that which explains why, in Fig. 5, numerical simulations associated to the connector regime overcome the solid black frontier defined by Eq. (17). Formally, we may write It can be approximated by where we have taken advantage of the same idea used to derive the c (c) term in Eq. (B4). This is an approximation because we are supposing that the side in which node n o is located is also full of new nodes, above and below it. This gives a good upper bound for most configurations, which is enough to obtain the corrected transition boundary depicted as a dashed line in the right panel of Fig. 5.

S2. DETECTION OF THE NUMBER OF CONGESTION REGIMES
Theoretical analysis based on GT-model predicts three different phases that can be clearly defined by two cut points (a.k.a. change points). To identify these pattern in empirical road networks (see Fig. 6 in the main text), one has to decide the number of change points to consider. Note that, as the number of change points increases, the fitting error decreases. To automatically decide the optimal number of change points, we recall to the elbow test introduced in [65]. The method is based on the concept of diminishing returns to balance the accuracy obtained with respect to the number of change points considered.
In Fig. S2 we plot the evolution of the mean squared error (MSE) of the fit with respect to the number of change points. As expected, the resulting line draws an elbow: in general, the error rapidly decreases as the number of change points grows, but after a certain value, adding another change point doesn't provide much improvement. Such a surplus of precision can be considered a kind of over-fitting. Of course, the opposite situation, namely a very low number of change points, leads to under-fitting. Thus, the elbow singles out the optimal number of change points. In Fig. S2 we see that the elbow position approximately falls between 2 and 4 change points, endorsing the GT-model prediction. Figure S1. Graphical representation of a network generated with the GT-model with parameters w = 2 + 1, r = 2 and h = 3, in which an additional node no has been attached to periphery. Colors, labels and notation are set to explain the development of Sec. S1 of this document. Under-fitting Optimal Over-Fitting Figure S2. Relationship between fitting error, in terms of MSE, and number of change points as used to approximate the curve, see Fig. S5. Figure S3. Comparison between the real (bottom) and estimated (top) relationship between nodes geographic position and their betweenness for small (Dalian, China), medium (Medan, Indonesia) and big cities (Tokyo, Japan). The GTmodels used have parameters w = 51, r = 2, ∆ρ/ρ = 1.5%, and the values of h indicated in each column. All distances and betweenness are normalized between 0 and 1. Coordinates for the nodes of the GT-model have been assigned using the planar embedding described in Appendix A1. Solid lines on the scatter plots represent the mean betweenness at the given radius R. Deviations correspond to one σ of the distribution.   Figure S8. Several examples of real cities that do not present congestion regimes as defined by the GT-model. These cities present distinct patterns of displacement of the largest betweenness nodes. We see for Melbourne, Rio de Janeiro, Mumbai and San Francisco that, as we increase the radius Rp, the maximum betweenness nodes tend to move towards the city center. The rest of the cities, Milan and Dubai, present different patterns as well. Figure S9. Descriptive example of the planar embedding of the GT-model.