Bottom-to-top decomposition of time-series by smoothness-controlled cubic splines: Uncovering distinct freezing-melting dynamics between the Arctic and the Antarctic


The classical methods of identifying significant slow components (modes) in a strongly fluctuating signal usually require strict stationarity. A notable exception is the procedure called "empirical mode decomposition" (EMD), which is designed to work well for non-stationary and nonlinear (quasiperiodic) time series. However, EMD has some well-known limitations such as the end divergence effect, mode mixing, and the general problem of interpreting the modes. Methods to overcome these limitations, such as "ensemble EMD" or "complete ensemble EMD with adaptive noise", promise an exact reconstruction of the original signal and a better spectral separation of the "intrinsic mode functions" (IMFs). All these variants share the feature that the decomposition runs from the top to the bottom: the first few IMFs represent the noise contribution and the last one is a long-term trend. Here we propose a decomposition from the bottom to the top, through the introduction of smoothness-controlled cubic spline fits. The key tool is a systematic scan by cubic spline fits with an input parameter controlling the smoothness, essentially the number of knots. Regression quality is evaluated by the usual coefficient of determination R², which grows monotonically as the number of knots increases. In contrast, the growth rate of R² is not monotonic: when an "essential slow mode" (ESM) is approached, the growth rate exhibits a local minimum. We demonstrate that this behavior provides an optimal tool to identify strongly quasi-periodic slow modes in non-stationary signals. We illustrate the capability of our method by reconstructing a synthetic signal composed of a chirp, a strong nonlinear background, and large-amplitude additive noise, where all EMD-based algorithms fail spectacularly. As a practical application, we identify essential slow modes in daily ice extent anomalies at both the Arctic and the Antarctic.
Our analysis demonstrates the distinct freezing-melting dynamics at the two poles, where apparently different factors determine the time evolution of the ice sheets. Thus, we believe that our methodology offers a competitive tool to identify modes in strongly fluctuating data and significantly advances the state of the art in the decomposition of nonlinear time series.

I. INTRODUCTION
The decomposition of nonlinear time series is a general problem pervading many branches of science, where a plethora of data are produced and recorded by advanced automated measuring systems. Thus, there is a great deal of interest in general methods capable of providing fruitful insight into the complex realm of time series. An interesting method extensively used during the past decades is the so-called "empirical mode decomposition" (EMD) [1] which, along with its many variants, has gained popularity in several fields, especially in analysing geophysical records [2][3][4][5][6] and biomedical signals [7][8][9][10]. Decomposition of non-stationary, nonlinear, noisy signals is not well performed by traditional methods. EMD provides a general tool for an adaptive signal decomposition into a finite number of narrow-band intrinsic mode functions (IMFs), which are derived directly from the data [17][18][19]. The IMFs generated by the EMD algorithm are determined by the local extrema of the signal, and traditional EMD uses cubic splines to interpolate the upper and lower envelopes. Variants such as the ensemble EMD (EEMD) [20,21] or the complete ensemble EMD with adaptive noise (CEEMDAN) [22,23] are noise-assisted extensions of EMD, where the IMFs are constructed as the mean of an ensemble of decompositions in which the original signal is decorated by different added white-noise series. These and other improvements can handle some shortcomings of the traditional EMD algorithm, e.g. mode mixing, where segments of a given characteristic frequency range are randomly distributed among two or more IMFs [24][25][26], or the problem of missing data points [see e.g., 27]. Notably, the EMD algorithm works as a dyadic filter bank, that is, the mean frequency of a given IMF is approximately half of that of the previous one [27][28][29]. For this reason, the basic problem of all top-to-bottom decompositions is the interpretation: how to distinguish relevant from irrelevant IMFs in an efficient way.
Here we propose an alternative procedure providing a bottom-to-top decomposition. Similarly to EMD, this is an empirical, purely data-driven method, and it uses cubic spline fits. We have successfully validated the procedure on a series of synthetically constructed composite signals (see below and the Appendix). Specifically, we demonstrate the efficiency of the method on two real signals: the daily sea ice extent over the Arctic and over the Antarctic as determined by satellite image processing (Fig. 1).
The original idea of smoothness-controlled cubic spline fits is explained by de Boor [30]. Smoothing splines are commonly used function estimates for a set of noisy observations, with the goal of striking a balance between having a smooth curve and being close to the given data. The starting point in the search for an optimum is the smoothest possible curve, a straight-line fit. When the data set is segmented with a growing number of knots, local cubic fits (with matching second derivatives at the knots) approach the data set better and better. A perfect fit is achieved by the natural cubic spline incorporating all the data points, at the price that the fit can exhibit extreme waviness. The goodness of fit improves monotonically with an increasing number of knots; nevertheless, we have observed that the rate of improvement is not monotonic for composite signals. A large number of numerical experiments indicated that the growth rate of any correlation-based measure of goodness of fit slows down when the cubic spline approaches an "essential slow mode" (ESM), a given component in a synthetic signal. This observation is the key that permits a bottom-to-top decomposition of non-stationary nonlinear signals, as demonstrated in the remainder of the paper.

A. Daily ice extent over both poles
Figures 1a-1b illustrate the time series of daily ice extent over the Arctic and the Antarctic in the period between 10/26/1978 and 05/15/2020. These freely available data are collected and maintained by the US National Snow and Ice Data Center [31]. During the first eight years of satellite observations, data were recorded only every second day, and there is a gap of 42 days at the end of 1988 (clearly visible in Fig. 1b). Since the freezing-melting dynamics is apparently smooth enough, we used linear interpolation as a proxy for the missing data, resulting in two evenly sampled time series of N = 15177 days.
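A gap-filling step of this kind can be sketched in a few lines of NumPy (a minimal illustration with made-up numbers, not the authors' actual code):

```python
import numpy as np

# Hypothetical daily record with missing days marked as NaN.
days = np.arange(10, dtype=float)
extent = np.array([12.0, np.nan, 11.0, 10.5, np.nan, np.nan,
                   9.0, 8.5, np.nan, 8.0])

# Linear interpolation over the valid samples fills each gap
# from its nearest recorded neighbours.
valid = ~np.isnan(extent)
filled = np.interp(days, days[valid], extent[valid])
```

For a record as smooth as the ice extent curves in Fig. 1, such a proxy introduces only a negligible error over gaps of a few days.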
Climatological mean values for each calendar day were determined in the usual way [32]: averages over 36, 37, or 38 years were computed (depending on the available samples), including the 10 leap days (29 February). The results are plotted in Fig. 2 for both poles. The width of the orange band represents one standard deviation, illustrating the well-known larger variability of the Arctic ice cover around the annual cycle. Our starting point for the subsequent analysis is the ice extent anomaly, namely the difference between a given observation and the climatological mean value on the particular calendar day. Note that the removal of climatological means does not affect long-term trends; e.g., when an average calculated over systematically decreasing values in consecutive years is subtracted, the tendency in the anomaly time series obviously remains the same.
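The anomaly construction amounts to subtracting a per-calendar-day average; a minimal sketch with a synthetic multi-year series (all names and parameters illustrative) could read:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical record: 10 "years" of 365 days with an annual cycle plus noise.
n_years, n_days = 10, 365
doy = np.tile(np.arange(n_days), n_years)          # calendar-day index
t = np.arange(n_years * n_days)
series = 5.0 * np.sin(2 * np.pi * t / 365.0) + rng.normal(0.0, 0.3, t.size)

# Climatological mean: average over the years for each calendar day.
clim = np.array([series[doy == d].mean() for d in range(n_days)])

# Anomaly: observation minus the climatological mean of its calendar day.
anomaly = series - clim[doy]
```

By construction, the anomaly averages to zero on every calendar day, while any long-term trend present in `series` would survive the subtraction unchanged.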

B. Smoothness-controlled cubic spline fits
We implemented the Python package "csaps" (version 0.11.0) developed by Eugene Prilepin for univariate, multivariate, and n-dimensional grid data approximation using cubic smoothing splines [33]. The methodology is based on the original algorithm of de Boor [30], where a smoothing spline S with a smoothing parameter p minimizes the sum of an error measure and a roughness measure,

p Σᵢ [yᵢ − S(xᵢ)]² + (1 − p) ∫ [S''(x)]² dx ,

where {xᵢ, yᵢ} is the original data set and S'' denotes the second derivative. The smoothing parameter p ∈ [0, 1] has well-defined limits: p = 0 corresponds to a least-squares linear fit, while p = 1 yields the natural cubic spline. A widely discussed point in the literature related to smoothing splines is the optimal choice of the smoothing parameter p [30,[34][35][36]; for a recent survey see, e.g., Ref. [37]. Here we follow a different strategy and systematically evaluate spline fits over a wide range of the smoothing parameter p, because we are interested in composite signals, and not only in finding "the" optimal smoothing cubic spline. Our approach is illustrated in Fig. 3 with a synthetic signal Y(t) composed of a chirp C(t), a slow background trend B(t), and an additive Gaussian noise G(t). Our construction mimics the time range and amplitudes of the polar ice extent anomalies, with the chirp parameters a = 0.3, Ω = 1.115 × 10⁻³ day⁻¹, α = 5 × 10⁻⁷ day⁻¹, and t = 1, 2, . . . , N = 15177.
The background parameters are b = 0.5, ω₁ = 2.867 × 10⁻⁴ day⁻¹, and ω₂ = 6.881 × 10⁻⁴ day⁻¹, and the additive Gaussian noise G(t) has unit amplitude and a variance of 1/2. The composite signal is shown in Fig. 3a (black), together with the chirp component C(t) (white) and the slow background B(t). The quality of a smoothing cubic spline with a particular smoothing parameter p is characterized by the usual coefficient of determination

R² = 1 − Σᵢ [yᵢ − S(xᵢ)]² / Σᵢ [yᵢ − ȳ]² ,

where the numerator of the fraction represents the sum of squared errors of the fit and the denominator is the total variance of the signal. We are aware of the well-known limitations of correlation-based measures [see e.g., 38]; however, this work is not aimed at analysing the occurrence of extreme cases such as enormous outliers, strongly correlated noise components, etc. By definition, the coefficient of determination R² grows monotonically when p increases. Indeed, Fig. 3b illustrates this relationship well for our synthetic test signal Y(t), where the results for 151 logarithmically placed p values are shown, starting from p₁ = 8 × 10⁻¹⁵ and iterating as pⱼ = 1.24 pⱼ₋₁, where j = 2, 3, . . . , 151.
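The scan itself is straightforward to emulate. The paper uses csaps with p ∈ [0, 1]; the sketch below swaps in SciPy's `UnivariateSpline`, whose smoothing factor `s` plays an analogous (but inverse) role: decreasing `s` permits more knots, so R² grows toward 1. All signal parameters here are illustrative, not those of the paper:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(1)

# Illustrative noisy signal: two periods of a slow sine plus Gaussian noise.
t = np.linspace(0.0, 20.0, 800)
y = np.sin(2 * np.pi * t / 10.0) + rng.normal(0.0, 0.3, t.size)

def r_squared(y, fit):
    """Coefficient of determination: 1 - SSE / total variance."""
    sse = np.sum((y - fit) ** 2)
    return 1.0 - sse / np.sum((y - y.mean()) ** 2)

# Scan from very smooth (large s) toward nearly interpolating (small s).
s_grid = np.logspace(3, 0, 40)
r2 = np.array([r_squared(y, UnivariateSpline(t, y, s=s)(t))
               for s in s_grid])
```

As the text describes for p, `r2` grows monotonically (up to numerical wiggles) along the scan, and it is the *rate* of this growth that carries the mode information.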
Notably, the growth rate of R² is not uniform, as demonstrated in Fig. 3c, where the derivative d(R²)/dp is plotted. We have performed several numerical experiments with various synthetic signals composed of periodic, polynomial, chirp-like [see Eq. (2)], piecewise linear, etc., and additive noise terms, obtaining the following general behaviour. When the smoothing cubic spline approaches a fundamental mode, which we call an "essential slow mode" (ESM), the growth rate of R² exhibits a local minimum. The first such minimum belongs to ESM₁ of the smallest characteristic frequency range, the second minimum represents the superposition of the slowest and second-slowest modes ESM₁ + ESM₂, and so forth. This behavior provides the key for decomposing a non-stationary and nonlinear signal into ESMs with the following steps:

1. Perform a scan over a wide enough range of the smoothing parameter p, and determine R² for each cubic spline fit S_p(t). (We recommend logarithmic spacing of the p values because, in general, the growth of R² has more or less logarithmic tendencies.)

2. Determine the local minima {p₁, p₂, p₃, . . . , p_M} of the empirical growth-rate function d(R²)/dp.

3. A proper decomposition of the original signal is then given iteratively: the slowest mode ESM₁ is the cubic spline fit S_p₁(t), the second slowest mode ESM₂ is the difference of cubic spline fits S_p₂(t) − S_p₁(t), the mode ESM₃ is the difference S_p₃(t) − S_p₂(t), etc. [note that S_p₂(t) is already the superposition of ESM₁ and ESM₂].

C. Complete ensemble EMD with adaptive noise (CEEMDAN)
In order to compare the results of our bottom-to-top decomposition with those of the standard EMD, we used the Python package "PyEMD" developed by Dawid Laszuk [39]. Besides the original EMD and ensemble EMD (EEMD) procedures, the package contains the "complete ensemble empirical mode decomposition with adaptive noise" (CEEMDAN) algorithm as described in Refs. [22,23]. Since the methodology is widely discussed in the literature [22,23,40,41], we do not repeat the details here. The decomposition runs from the top to the bottom, using the local maxima and minima to construct upper and lower envelopes at each step of the identification of an intrinsic mode function (IMF).
The improved CEEMDAN algorithm [23] identified 12 intrinsic mode functions (IMFs) for the synthetic test signal illustrated in Fig. 3a. The first 6 IMFs represent noise terms with decreasing characteristic frequencies. The interesting part, IMF-7, . . . , IMF-12, is shown in Fig. 4.
The top-to-bottom decomposition spectacularly fails to reconstruct the essential slow modes: the chirp part C(t) is divided among IMF-7, IMF-8, and IMF-9, and the background part B(t) is divided between IMF-10 and IMF-11 (even their sum has serious flaws at the ends of the signal). The almost linear, small-slope trend IMF-12 is a clear artifact of the procedure.
The decomposition of chirp-like modes is a particularly demanding task where most classical methods fail. Since practical signal processing often meets the problem of smoothly changing internal frequencies, there are some recent proposals to handle such situations [42][43][44]. Our method seems to be by far the simplest.

As a representative example, we evaluate the two ice extent time series described in Subsection II A. Although these data sets are available at daily resolution, we have not found any publication in which the whole records are evaluated. Probably the smoothness of the curves in Fig. 1 suggests that there is not much information in a daily datum; therefore, practically every analysis in the literature uses monthly mean values as a starting point. Here we compare the results of our bottom-to-top decomposition method with those of the CEEMDAN procedure (see Subsection II C). Figure 5 illustrates the results of the systematic scan by smoothing cubic spline fits as a function of p. As explained in Subsection II B, local minima of the growth-rate curves of the coefficient of determination, d(R²)/dp, identify the essential slow modes (ESMs). Fig. 5b exhibits three local minima for the Arctic time series: presumably a slow background tendency at m1, a small-amplitude slow oscillatory mode at m2, and a large-amplitude higher-frequency mode at m3. However, at the smoothness parameter belonging to m3, the cubic spline is an almost perfect match with R² ≈ 1.0 (see Fig. 5a), indicating that all the high-frequency noise spikes are already incorporated in the fit. In order to have a smoother component, we can choose p₃ = 5 × 10⁻⁶ (vertical red dashed line in Fig. 5b). This provides a very close match (see Fig. 5a), however with dampened noise, as we will discuss below.
The same procedure for the Antarctic time series (Figs. 5c-d) locates four local minima; the situation at m4 is the same as for the Arctic m3. Instead of analysing a perfect noisy fit, we estimate the highest-frequency ESM with a smoothness parameter p₄ = 10⁻⁴ (vertical red dashed lines in Figs. 5c-d).
We might have expected the two time series shown in Fig. 1 to be decomposed into similar ESMs (where ESM₀ is the climatological mean plotted in Fig. 2 for both poles). In contrast, the results in Fig. 5 reveal essential differences between the freezing-melting dynamics at the Arctic and the Antarctic. Next, we provide details and comparisons with the CEEMDAN decomposition.

A. Arctic
The ice extent anomaly time series for the Arctic is plotted in Fig. 6a. Clearly, the removal of the climatological mean (Fig. 2a) from the original record (Fig. 1a) did not affect the long-term tendency: the well-documented, dramatic ice melting at the northern pole. This smooth trend is the first slow mode ESM₁ (Fig. 6b), obtained with the smoothness parameter p₁ (Fig. 5b). It seems that a regime change happened between 2000 and 2005: in the last two decades of the 20th century, fluctuations exhibited large-amplitude positive (freezing) spikes, while in the first two decades of the 21st century the sign changed, and large-amplitude negative (melting) spikes dominate.
ESM₂ (red curve in Fig. 6c) is a slow quasi-periodic oscillatory mode with a characteristic period of around 5 years and a visible phase shift in the early 2000s, coinciding with the suggested regime change.
ESM₃ (blue curve in Fig. 6c) exhibits much larger amplitude oscillations than ESM₂, as expected already from the peak heights in Fig. 5b. The dominating period is sharply 1 year, and the regime change is also apparent: the large positive peaks of the 20th century are followed by dampened fluctuations in the early 2000s, while large negative annual excursions dominate the last 15 years. The residuals (Fig. 6c, gray line) have a broad, very small amplitude Fourier hump centered at around 80-100 days (not shown here), which can reasonably be related to short-term fluctuations of different amplitudes in the different seasons.
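Dominant periods such as the sharp 1-year peak can be read off a discrete Fourier transform of a mode; a generic sketch (synthetic annual-type signal, illustrative parameters, not the actual ESM data) could look like:

```python
import numpy as np

# Hypothetical oscillatory mode sampled daily: a 1-year periodicity.
n = 15177                                  # days, as in the ice records
t = np.arange(n)
mode = np.sin(2 * np.pi * t / 365.25)

# Real FFT amplitudes and the corresponding frequencies in cycles/day.
spectrum = np.abs(np.fft.rfft(mode))
freqs = np.fft.rfftfreq(n, d=1.0)

# The strongest non-zero-frequency bin gives the dominant period.
peak = np.argmax(spectrum[1:]) + 1         # skip the DC bin
period_days = 1.0 / freqs[peak]
```

With roughly 41.5 years of daily data, the frequency resolution around the annual peak is of the order of a few days, ample for distinguishing a 1-year from a 1.2-1.3-year periodicity.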
The comparison of our results with the top-to-bottom CEEMDAN decomposition is illustrated in Fig. 7. Similarly to the synthetic case in Fig. 4, the algorithm produced 11 IMFs here, the first 6 of which represent high-frequency modes. A visual inspection again reveals substantial mode mixing. While IMF-7 seems quite similar to ESM₃ identified in Fig. 6c (blue line), large discrepancies are present at the peaks of largest amplitude. IMF-8 provides some amplitude corrections to IMF-7, although the mismatch with ESM₃ is obvious. Particularly strong mode mixing is apparent when we compare IMF-9 and IMF-10 with the essential slow mode ESM₂. Note that even the slow general trend of ESM₁ is poorly resolved by the last intrinsic mode function, IMF-11.

B. Antarctic
We repeated the same analysis for the Antarctic ice extent time series, following the steps of Subsection III A. Daily ice extent anomalies are plotted in Fig. 8a. The slowest background mode ESM₁ (Fig. 8b), identified by the cubic spline fit with p₁ in Fig. 5c, is totally different from the monotonically decreasing long-term trend at the Arctic (Fig. 6b). The lack of an overall shrinking of the Antarctic ice sheet until around 2015 is again well known, and possible explanations are widely discussed in the geophysical literature (together with the quick melting during 2015-2016).
The slowest oscillatory mode ESM₂ (Fig. 8c, red curve) seems to be similar to the Arctic 5-year mode. However, a Fourier test suggests a somewhat longer period of around 5.5 years. Note the "swing" of ESM₂ with an increasing amplitude in the last two decades. (Since such behavior is not consistent with the stationarity criteria required by spectral analysis, we do not go into details here.) The next fluctuating mode, ESM₃ of p₃ (Fig. 8c, blue curve), exhibits a characteristic peak at a period of 1.2-1.3 years, which is somewhat perplexing. The lack of a clear 1-year component (in strong contrast with the Arctic) might be a consequence of interferences among the annual periodicity and distinct climate modes, such as the Southern Annular Mode, the quasi-stationary wave-3 pattern, the Pacific South American pattern, and the Semi-Annual Oscillation, all known to be related to the Antarctic ice extent [45][46][47].
Similarly interesting is the fastest oscillatory mode, ESM₄ of p₄ (Fig. 8c, brown curve). It has an amplitude comparable to ESM₃ and a broad spectral hump centered at a period of around 160-180 days, somewhat less than half a year. This is again in contrast with the Arctic signal; however, semi-annual oscillations are known to determine many meteorological patterns in the southern hemisphere [45][46][47]. Unlike for the Arctic, the Antarctic residuals (Fig. 8c, gray line) have negligible spectral peaks at small periods between 1-100 days, and essentially zero power at larger values. This indicates that the extra near-semi-annual mode ESM₄ (superposed with the slower ESMs) explains all the relevant variability of the Antarctic record apart from some uncorrelated random noise.
The CEEMDAN results for the Antarctic are illustrated in Fig. 9, where the first four IMFs represent noise. Again, the top-to-bottom decomposition cannot reproduce the ESMs properly. While IMF-9 seems to be close to ESM₂, significant discrepancies are obvious. The closest match is visible between IMF-7 and ESM₂, at least for the characteristic frequencies, although the amplitudes are overestimated by IMF-7. Notably, the long-term trend ESM₁ spectacularly suffers from mode mixing; even the sum IMF-11 + IMF-12 does not fit ESM₁, particularly at the ends of the signal.

IV. DISCUSSION
While empirical methods based on mode decomposition are able to provide a kind of basis functions for noisy, non-stationary, nonlinear signals, the top-to-bottom decomposition suffers from several shortcomings. The basic question is how to interpret the intrinsic mode functions (IMFs): do they represent physical modes or not? The forced dyadic filtering usually results in 10+ IMFs for empirical records, the first few being the noise parts, which depend strongly on the first local maxima and minima representing the upper and lower envelopes. Noise-assisted variants like EEMD and CEEMDAN seemingly override this randomness with a decomposition of an ensemble of noise-decorated records. However, Fig. 10 illustrates a peculiar feature of such procedures: repeated ensemble realizations often result in strongly different IMFs.
In Fig. 10, the results of four subsequent runs of the CEEMDAN decomposition are illustrated for the Antarctic ice extent anomaly time series (Fig. 8a). In each run, an ensemble of 100 test sets is generated with an internal random number generator seed [39]. In particular, the low-frequency intrinsic mode functions, such as IMF-11 plotted in Fig. 10, suffer from extreme variability, questioning any interpretation of them as representative of a physical mode.
There are several proposals to identify relevant IMFs (presumably representing some physical modes) [see e.g., 48-51]. The limitation of all proposals known to us is the following: the first step is always the top-to-bottom decomposition into a limited number of IMFs, after which some measures of correlation with the original signal are evaluated to discriminate relevant from irrelevant IMFs. In contrast, by evaluating hundreds of "IMFs" (candidate spline fits), our method is able to identify essential slow modes (ESMs) which represent particularly well the overall tendencies of the signal.
In conclusion, we believe that the proposed bottom-to-top cubic spline decomposition provides a competitive tool for identifying modes in strongly fluctuating data and significantly advances the state of the art in the decomposition of time series. The computational cost (CPU time) demanded by our algorithm, with 100-200 cubic spline fits, is at most one tenth of that of a standard CEEMDAN procedure.
Appendix: How does the method work?
Since our bottom-to-top decomposition method is a fully empirical and data-driven procedure, it is impossible to provide a comprehensive exploration of the infinite variety of artificial signals. Instead, we use simple harmonic functions and Gaussian noise as a first test:

Y(t) = sin[2π t / (365 T)] + A G(0, 1) .     (A.1)

The harmonic component has unit amplitude, the time t is measured in "days", T is measured in "years", and G(0, 1) represents a Gaussian white noise of zero mean and unit standard deviation, with amplitude A.
In order to check the sensitivity of the procedure to the resolution of the smoothing parameter p, we constructed three sets covering the range p ∈ [10⁻¹⁶, 3 × 10⁻³] (with logarithmic spacing) as follows. The set p151 is formed by pₙ = 1.23 pₙ₋₁ for n = 2, . . . , 151; similarly, p301 is constructed as pₙ = 1.11 pₙ₋₁ for n = 2, . . . , 301; finally, p601 obeys the rule pₙ = 1.053 pₙ₋₁ for n = 2, . . . , 601. In all three cases the first value is 10⁻¹⁶. Figure 11a illustrates the results for a single harmonic component with a period T = 3 without noise, while Fig. 11b presents analogous data for the same sinusoidal slow mode with noise [A = 2 in Eq. (A.1)]. The shape of the R²(p) curves is a simple sigmoid (not shown here) starting from zero and shooting up to 1 with a logarithmic slope; consequently, the derivative is an almost symmetric single peak (the presence of noise hardly affects its location). When the single peaks are normalized by their maxima, they fully coincide. Normalization is necessary for the comparison, because finer steps of the smoothing parameter p obviously result in smaller values of the numerical gradient d(R²)/dp. The remarkable result in Fig. 11b is that the procedure is weakly sensitive to the high-frequency noise: the slow sinusoidal mode is easily detectable even with surprisingly large noise amplitudes (the example A = 2 means a noise amplitude twice as large as that of the slow mode). The presence of noise becomes visible at larger smoothness parameters beyond the peak, where the quality of the cubic spline fits improves again by representing the noise progressively better.
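The three scanning grids are plain geometric sequences and can be generated as follows (a sketch; the array names mirror those in the text):

```python
import numpy as np

def p_grid(ratio, length, start=1e-16):
    """Geometric sequence p_n = ratio * p_(n-1) with p_1 = start."""
    return start * ratio ** np.arange(length)

p151 = p_grid(1.23, 151)
p301 = p_grid(1.11, 301)
p601 = p_grid(1.053, 601)
```

All three grids start at 10⁻¹⁶ and end near 3 × 10⁻³, so the finer grids refine the sampling without changing the covered range.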
The position of the peak depends on the characteristic period, as illustrated in Fig. 11c. The horizontal axis is log(T) (covering the interval [0.3, 7.5] years in steps of 0.3 year), the vertical axis is the position of the peak, log(p_peak). The dashed line has a fitted slope of −4.02, indicating a power-law dependence

p_peak ∝ T⁻⁴ .

The widths of the peaks on semilog scales (such as Figs. 11a and 11b) hardly depend on the position, but the scaling with T⁻⁴ means that the separation of peaks decreases sharply with increasing T. This obviously limits the identification of slow sinusoidal components of close frequencies. Figure 12a illustrates that a superposition of up to five sinusoidal modes is clearly decomposable when the separation of the characteristic frequencies is large enough. A feasible bottom-to-top decomposition requires clear local minima between the peaks. As expected, when the separation is too small, the peaks merge and the identification of such slow modes is not possible, as clearly visible in Fig. 12b. The critical separation ΔT_crit, where the double peak still has a clear local minimum, depends on the period, as noted above. We performed a series of tests similar to the ones in Fig. 12b with different periods T₀ of the slower component, and identified the last clear double peak as a function of the separation, which defines the critical difference ΔT_crit. Fig. 12c illustrates the result, which suggests a surprisingly simple linear relationship ΔT_crit = T₀/2 for two sinusoidal components.
It is clear that there is a plethora of open questions about the sensitivity and selectivity of the bottom-to-top decomposition, such as modes with different amplitudes, slow background trends, signals other than sinusoidal, etc. A detailed analysis of every such aspect is beyond the scope of this work. Nevertheless, we believe that the method works efficiently and is able to identify slow nonlinear modes present in complicated signals when the condition of a large enough characteristic frequency separation is fulfilled.