Characterizing spreading dynamics of subsampled systems with non-stationary external input

Many systems with propagation dynamics, such as spike propagation in neural networks and spreading of infectious diseases, can be approximated by autoregressive models. The estimation of model parameters can be complicated by the experimental limitation that one observes only a fraction of the system (subsampling) and potentially time-dependent parameters, leading to incorrect estimates. We show analytically how to overcome the subsampling bias when estimating the propagation rate for systems with certain non-stationary external input. This approach is readily applicable to trial-based experimental setups and seasonal fluctuations, as demonstrated on spike recordings from monkey prefrontal cortex and spreading of norovirus and measles.

Propagation dynamics in complex networks are often approximated by models with an autoregressive representation. Examples include affinity maturation in immune systems [1], reproductive dynamics of bacteria [2][3][4][5] or humans [6], epidemiological disease spreading in a network of humans [7,8], neutron transport theory [9] and collective cortical dynamics [10][11][12][13][14][15]. The inference of propagation dynamics is often complicated. First, only a fraction of all system components can be observed experimentally (subsampling) [12,[16][17][18]. Second, the model parameters can be time-dependent (non-stationary), and specific time-dependent input rates can lead to signatures of criticality even for networks of uncorrelated units [19]. In general, time-dependent input rates are ubiquitous for collective dynamics in neural networks, and are one source for seasonal fluctuations of infectious disease incidence [20].
The subsampling challenge is typically addressed for stationary model parameters. Recent progress has been made for equilibrium and non-equilibrium systems by explicitly modelling the hidden units [21][22][23][24][25][26]. However, explicit knowledge about the hidden units cannot be guaranteed for real-world applications. A subsampling-invariant approach that does not require knowledge about the underlying model size was recently proposed [12]. The authors showed that established estimators based on linear regression or Kalman filtering underestimate the propagation behaviour. They introduced a novel multistep regression (MR) estimator that is subsampling invariant, characterizing the propagation dynamics through the system's autocorrelation time τ. However, it does not consider time-dependent model parameters.
To tackle non-stationarities, recent approaches considered models with time-dependent parameters. Examples include Bayesian models based on Cox-processes [27], weighted least-squares [28], or expectation-maximization based on Kalman filtering [29,30]. However, none of these methods consider the complication of subsampling, although real spreading processes are usually subsampled [12,31].
In this Letter, we derive an estimator for a subsampled process subject to a specific type of non-stationary external input, namely cyclostationary input. We first show that the subsampling-invariant MR estimator [12] can be biased if the external input rate changes over time. We then analytically derive a generalization of the MR estimator that can overcome the bias in the case of cyclostationary input. This approach is subsampling invariant and readily applicable to two prevalent situations: First, to trial-based experiments, which are commonly found in neuroscience; Second, to periodic input rates, e.g. the seasonal fluctuations of infectious disease incidence [20]. We demonstrate the applicability of our methodology on numerical data (testing robustness to relaxation of our assumptions) and on real-world experimental data.
We consider the class of stochastic processes with an autoregressive representation of first order. This includes widely-used processes, such as branching processes, Kesten processes, and AR(1) processes. Time evolves in discrete steps (∆t = 1). Let A^i_t denote the activity of a realization i at time t; then the conditional expectation value over the ensemble of independent realizations is defined as

$\langle A^i_{t+1} \,|\, A^i_t \rangle = m A^i_t + \langle h \rangle_t, \quad (1)$

where m is the time-independent mean offspring parameter and ⟨h⟩_t is the average ensemble rate of a time-dependent input distribution. In the framework of spike propagation in neural networks, m describes the average number of neurons that a single neuron subsequently activates, and ⟨h⟩_t describes the expected input rate at time t from sensory modalities or other brain areas.
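As a concrete illustration of the dynamics defined in Eq. (1), such a process can be simulated directly; the following is a minimal numpy sketch (function and variable names are ours, not from the paper), exploiting that the sum of A^i_t independent Poisson(m) offspring counts is itself Poisson distributed with mean m A^i_t:

```python
import numpy as np

def simulate_bp(m, h, n_trials, rng):
    """Branching process with immigration: Poisson offspring with mean m
    per unit, plus Poisson-distributed external input with rate h[t]."""
    T = len(h)
    A = np.zeros((n_trials, T), dtype=np.int64)
    A[:, 0] = rng.poisson(h[0] / (1.0 - m), size=n_trials)  # start near stationarity
    for t in range(T - 1):
        # sum of A[:, t] iid Poisson(m) offspring counts is Poisson(m * A[:, t])
        A[:, t + 1] = rng.poisson(m * A[:, t]) + rng.poisson(h[t + 1], size=n_trials)
    return A

rng = np.random.default_rng(42)
tau = 20.0
m = np.exp(-1.0 / tau)          # m = exp(-dt/tau) with dt = 1 step
h = np.full(1000, 50.0)         # constant input rate, for illustration
A = simulate_bp(m, h, n_trials=10, rng=rng)
```

For the subcritical case the simulated mean settles near the stationary value ⟨A⟩ = ⟨h⟩/(1 − m).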
Note that the expectation values in Eq. (1) are defined over the ensemble of independent realizations (trials) of the stochastic process, e.g., ⟨h⟩_t = ⟨h^i_t⟩_i (for the trial average we drop the index that was summed over). For a general non-stationary external input, ⟨h⟩_t cannot be defined unless one has multiple realizations from the same process.

FIG. 1. (a) Step function as example for a time-dependent input rate. (b) Subsampled activity a^i_t of a branching process with constant internal autocorrelation time τ and non-stationary input rate shows non-stationary behavior in regimes II and III. Colored lines are individual trials; the black solid line is the trial-ensemble average ā_t (over 200 trials). (c) Linear regression slope estimates r_k (blue dots) at time lag k for the process in (b) do not decay exponentially as expected from the process's internal autocorrelation (red line), which makes an unbiased estimation of τ impossible. (d) Time series of (b) corrected by subtracting the trial-ensemble average: ã^i_t = a^i_t − ā_t. (e) For the corrected time series (d), the r_k decay exponentially with τ, such that τ can now be inferred without bias. Simulation parameters: trial length T = 10000 steps, internal autocorrelation time τ = 20 steps, number of trials N = 200, mean (fully sampled) baseline activity A_0 = 1000, subsampling fraction α = 0.05, relative step height ⟨h⟩_up/⟨h⟩_down = 2.6, step duration c = 200 steps.
In nature, this is approximately realized by cyclostationarity, e.g., trial-based experiments or seasonal fluctuations. We make use of this to solve the problem even without knowledge of the precise realization of external inputs. In the following, we assume that the generation of offspring is Poisson distributed with time-independent m, while the generation of external input is Poisson distributed with time-dependent rate ⟨h⟩_t.
Subsampling is incorporated as follows: We only require that the subsampled activity a^i_t is on average linear in the full activity A^i_t, i.e., ⟨a^i_t | A^i_t⟩ = α A^i_t (for details see Ref. [12]). For example, every spike or disease incidence is sampled with probability p = α.
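Such binomial subsampling is simple to emulate numerically; a small sketch (variable names ours):

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.05                                  # subsampling fraction
A = rng.poisson(1000.0, size=(200, 500))      # stand-in for the full activity A^i_t
a = rng.binomial(A, alpha)                    # each event observed with probability alpha
# the conditional mean is linear in the full activity: <a_t | A_t> = alpha * A_t
```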
To estimate the spreading behaviour m under subsampling and time-dependent external input rates, we follow the principal idea of the MR estimator [12]. We generalize Eq. (1) by recursive iteration to k time steps:

$\langle A^i_{t+k} \,|\, A^i_t \rangle = m^k A^i_t + \sum_{l=1}^{k} m^{k-l} \langle h \rangle_{t+l}. \quad (2)$

If the rate is time-independent (⟨h⟩_t = ⟨h⟩), Eq. (2) implies that the original process A^i_t has an exponential autocorrelation function in the time lag k (in steps of ∆t),

$C(k) = m^k = \exp(-k \Delta t / \tau). \quad (3)$

The autocorrelation function relates the propagation dynamics (m) to an internal autocorrelation time τ = −∆t/ln(m) and represents a measure of how long information persists in the activity [12]. For stationary processes the variance across trials is equal to the variance within trials (Var_i(a^i_t) = Var_t(a^i_t) = Var_{i,t}(a^i_t)), such that the autocorrelation function C(k) of the subsampled activity a^i_t can be calculated directly via linear regression [12],

$r_k = b\, m^k, \quad (4)$

with time-independent autocorrelation strength b = α² Var_{i,t}(A^i_t)/Var_{i,t}(a^i_t) for all k ≠ 0. While b is biased under subsampling (b < 1 if α < 1), the autocorrelation time τ is subsampling invariant and can be obtained by fitting Eq. (4) to the data [12].
For a time-dependent external input rate ⟨h⟩_t, however, the autocorrelation function is not time invariant, and if calculated does not necessarily decay exponentially (Fig. 1a-c). Consider, for example, a step-function external input rate. Linear regression applied to each regime independently would yield similar slopes (identical slopes for the full activity A^i_t) but different offsets of the linear regression (Supplemental Material, Fig. S2). Therefore, the naive application of the MR estimator fails even for the full activity. This represents an issue for general time-dependent input.
In the following, we construct a reliable estimate of the internal autocorrelation time τ in the presence of cyclostationary external input rates. We focus our discussion on the subsampled activity a^i_t, which includes the fully sampled case (α = 1, b_t = 1).

FIG. 2.
Robustness of our estimate to variability in a non-stationary input (step function with step duration c = 10 τ). (a) Variability in the onset time t_step with standard deviation σ_t. (b) Variability in the step height ∆⟨h⟩ with standard deviation σ_h. Color in the matrix indicates the relative error between the estimated autocorrelation time τ̂ and the internal autocorrelation time τ of the branching process. The 5% error bound was fitted (black lines) and scales as σ ∝ τ^γ. Simulation parameters: T = 1000 τ steps, N = 300, A_0 = 5000, α = 0.01.
To correct the bias from cyclostationary external input (⟨h⟩_t is time-dependent but identical for each trial i), we introduce the following method: Given N trials with independent realizations a^i of a subsampled linear autoregressive process, we calculate the time-dependent trial-ensemble average ā_t over all trials (not to be confused with an average calculated over all recorded times). Now, we correct for the non-stationarity of the original process by subtracting the trial-ensemble average (Fig. 1d),

$\tilde a^i_t = a^i_t - \bar a_t. \quad (5)$

Its linear regression slopes r_k reveal the true internal autocorrelation time in their exponential decay (Fig. 1e) for sufficiently large N (see below and Supplemental Material Fig. S4). From the corrected time series ã^i_t, we can thus infer the unbiased autocorrelation time by applying the MR estimator [32] (see Supplemental Material S.6 for the full derivation). To prove this, we reformulate Eq. (4) as a simple linear regression at each time across trials, i.e., r_{k,t} = Cov_i(ã^i_t, ã^i_{t+k})/Var_i(ã^i_t). For trial-ensemble corrected ã^i_t, we find that the correction compensates the convolution in Eq. (2), such that r̂_{k,t} = b_t m^k with time-independent decay but time-dependent autocorrelation strength (Supplemental Material, Eq. (S23)),

$b_t = \alpha^2 \frac{\mathrm{Var}_i(A^i_t)}{\mathrm{Var}_i(\tilde a^i_t)} = \frac{\alpha F_t}{\alpha F_t + 1 - \alpha},$

where the relation to the (across-trial) Fano factor of the full activity, F_t = Var_i(A^i_t)/⟨A⟩_t, is strictly true only for binomial subsampling. However, we can show (Supplemental Material, Eqs. (S25)-(S29)) that for the corrected time series direct application of Eq. (4) with the standard regression approaches yields an unbiased estimate of the internal autocorrelation time τ despite cyclostationary input and subsampling (for a proof of concept see Fig. S3).
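A minimal numpy rendering of this correction, assuming the subsampled activity is stored as an N-by-T array with one row per trial (all names ours):

```python
import numpy as np

def corrected_slopes(a, k_max):
    """Subtract the trial-ensemble average, then estimate the linear
    regression slope r_k between activities k steps apart, pooled over
    trials and time."""
    a_tilde = a - a.mean(axis=0, keepdims=True)  # remove trial-ensemble average
    slopes = []
    for k in range(1, k_max + 1):
        x = a_tilde[:, :-k].ravel()
        y = a_tilde[:, k:].ravel()
        slopes.append(np.cov(x, y)[0, 1] / np.var(x, ddof=1))
    return np.array(slopes)
```

For cyclostationary input the returned slopes should decay approximately as b m^k, so τ can be read off from an exponential fit.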
In addition to the biases from subsampling or non-stationary input, there can be a bias from short trial length T [33] and from small trial number N. The short-trial bias can be avoided by estimating both covariance and variance as fluctuations around a global stationary mean (cf. the "stationarymean" method in Ref. [32] with a detailed discussion). For all our analyses (experimental and numerical), we thus use the MR estimator toolbox [32] with the "stationarymean" method. In principle, this allows for an unbiased estimation down to N = 10 short trials (Fig. S4), while of course the variance of the results increases with decreasing N (Supplemental Material, Secs. S.4 and S.7).
We tested the applicability of MR estimation for cyclostationary external input by increasing the level of realism of a numerical problem. The test case is a baseline rate ⟨h⟩_0 plus a step function at onset time t_step with step height ∆⟨h⟩ and step duration c. We consider three cases: i) perfect cyclostationarity across trials (Fig. 1, and Fig. S5 for an extreme example), ii) variation of the onset time t_step ∼ N(T/2, σ_t) with ∆⟨h⟩ = ⟨h⟩_0 fixed (Fig. 2a), and iii) variation of the step height ∆⟨h⟩ ∼ N(⟨h⟩_0, σ_h) with t_step = T/2 fixed (Fig. 2b). We generated N = 200 trials of branching processes with internal autocorrelation time τ, trial duration T = 1000 τ, and baseline activity A_0 = 5000 such that ⟨h⟩_0 = (1 − m) A_0 (m = exp(−∆t/τ), ∆t = 1 step). This setup allows us to independently investigate variability in the onset time and height of the input.
Variations in the onset time and step height do not hinder correct inference as long as the standard deviations are sufficiently low (Fig. 2). In our test case, variations in the onset time barely affect the correct inference as long as the standard deviation σ_t is below the magnitude of the autocorrelation time (Fig. 2a). When σ_t ≈ O(τ), the method still provides consistent estimates of the process's autocorrelation time. Moreover, for a given σ/τ the estimates improve with increasing autocorrelation time τ. We observe that the 5% error bound scales as σ_t ∝ τ^γ̂ with γ̂ ≈ 0.22(3). Similarly, variations in the step height barely affect the correct inference as long as the standard deviation σ_h is below ⟨h⟩_0/5 (Fig. 2b). Again, the estimates improve with increasing autocorrelation time, and the 5% error bound scales as σ_h/∆⟨h⟩ ∝ τ^γ̂ with γ̂ ≈ 0.4(1). We conclude that our method provides consistent results even after relaxation of perfect cyclostationarity.
FIG. 3. The intrinsic timescales τ in macaque pre-frontal cortex have been inferred with our new approach from spike recordings during a trial-based visual short-term memory task [34].

We applied our method to two sets of experimental data. The first dataset consists of spiking activity in pre-frontal cortex from a trial-based short-term visual memory task on Macaca mulatta [34] (about N = 300 trials each, see Supplemental Material Sec. S.11). In this dataset, the external input can be interpreted as sensory input from other brain areas to the investigated area. The second dataset consists of epidemiological case reports from the Robert Koch Institute [35] (N = 18 trials each, see Supplemental Material Sec. S.10). In the epidemiological dataset, the infections carried into the country via travel can be interpreted as non-stationary external input.
For the monkey data, we want to emphasize three findings: First, although the trial-ensemble average ā_t increases by a factor of 3 (Fig. 3a), the autocorrelation function hardly differs in most cases (Fig. 3b). Second, we find a systematic decrease of intrinsic timescales after correction, while for the majority of the recording sets the decrease was less than 10% (Fig. 3c). Third, a robustness test of our method with parameters adjusted to the experimental scale (Fig. 3d, with experimentally realistic stimulus shape) indicates that our method yields less than 5% deviation for τ ≥ 200 ms despite stimulus-onset variability with σ_t < 50 ms, which is a realistic constraint given the steep rise of typical ensemble responses within 30 ms-50 ms (Fig. 3a). To conclude, our method reveals intrinsic timescales in pre-frontal cortex between 57(4) ms and 345(26) ms with median 214 ms (compared to 239 ms if not corrected) from recordings covering the full task. Our results are consistent with previous results in pre-frontal areas of macaque (about 200 ms) confined to the stimulus foreperiod to approximate the resting state [36,37].
In the example of disease spreading, our method accounts well for seasonal fluctuations (Fig. 3e-g). The weekly case-number reports reveal a strong yearly periodicity, suggesting a year-wise separation into trials. The improvement due to the trial-ensemble average correction is readily visible in the regression function r_k (Fig. 3f). With the correction, the infectiousness estimate is higher than without (Fig. 3g; norovirus: τ = 14(3) weeks, measles: τ = 15(8) weeks). The disease results are in principle subject to additional uncertainty from the small number of trials (cf. Fig. S4), which is probably on the order of 10% and thus smaller than the error bars from the fits. Our results highlight that the correction by trial-ensemble average can reveal a higher infectiousness of diseases, which might otherwise be underestimated due to seasonal fluctuations and other non-stationary effects, and that long-term recordings are necessary to reveal the intrinsic infectiousness of a disease.
In summary, we have presented a simple, subsampling-invariant estimate of the internal autocorrelation time for stochastic processes with an autoregressive representation, subject to (approximately) cyclostationary external input.
The key success of the presented approach (MR estimation with trial-ensemble average corrected time series) is the potential to disentangle the internal spreading dynamics from any hidden but repetitive external input rate. Thereby, our approach solves the problem of apparent criticality due to non-stationary input rates [19] for repetitive stimulation protocols. We demonstrated the robustness of our approach to violations of perfect cyclostationarity of the external input rate, and we showed its applicability to real-world problems from neuroscience and epidemiology. In conclusion, we recommend the trial-ensemble average correction as best practice when approximating trial-based experiments with autoregressive models. A toolbox for the multistep regression analysis is readily available [32].
We thank Matthias Munk for sharing his data. All authors acknowledge support by the Max Planck Society. Financial support was received from the Gertrud …

b: Subsampling-induced correlation bias
α: Subsampling fraction
F: Fano factor
r_k: Linear regression slope estimate for time lag k
s_k: Linear regression offset estimate for time lag k
Var_i(·): Variance over index i
Cov_i(·): Covariance over index i

TABLE S1. List of notations.

S.2 Branching process
The branching process (BP) with immigration is a stochastic autoregressive process. Each realization i of the process is described by a temporal evolution with time t. For realization i at time t there are A^i_t units, each of which generates a random integer number of "offspring" y_{i,t,j}, where all y_{i,t,j} ∈ N are independently and identically distributed with mean m [38][39][40]. Additionally, a time-dependent external input H^i_t with mean ⟨h⟩_t "immigrates" at each time step, where ⟨·⟩ denotes the expectation value over independent realizations. The evolution of the total activity A^i_t of the branching process is recursively given by

$A^i_{t+1} = \sum_{j=1}^{A^i_t} y_{i,t,j} + H^i_{t+1}. \quad (S1)$

Autocorrelation time. The branching parameter m is directly connected to the process's autocorrelation time τ by [12]

$m = \exp(-\Delta t / \tau), \quad (S2)$

given a time binning ∆t, e.g., from simulation steps or data binning in experiments.
Stationary BP. Assuming the branching parameter m and the external input rate are constant over time, i.e., ⟨h⟩_t = ⟨h⟩, we can derive the dynamics of the branching process. First, the expectation value of the next time step given the activity A^i_t is

$\langle A^i_{t+1} \,|\, A^i_t \rangle = m A^i_t + \langle h \rangle. \quad (S3)$

Recursive iteration of Eq. (S3) and identification of the geometric series yields the expectation value of the evolution over k time steps,

$\langle A^i_{t+k} \,|\, A^i_t \rangle = m^k A^i_t + \langle h \rangle \frac{1 - m^k}{1 - m}. \quad (S4)$

Now we can separate the dynamics into three regimes: subcritical for m < 1, critical for m = 1, and supercritical for m > 1. We find a stationary solution ⟨A⟩_∞ for the subcritical case by iterating Eq. (S4) to k → ∞ (m^k → 0), such that

$\langle A \rangle_\infty = \frac{\langle h \rangle}{1 - m}. \quad (S5)$

In the critical state, the mean activity shows linear growth due to ⟨h⟩, whereas the activity diverges exponentially in the supercritical state.
BP with non-stationary input. Since our main manuscript addresses non-stationary external input, we here investigate a non-stationary branching process with a time-dependent external input rate ⟨h⟩_t. The stationary distribution in Eq. (S5) is no longer valid, and by iterating Eq. (S3) with the time-dependent external input rate ⟨h⟩_t we can derive the conditional expectation value of the activity after k time steps:

$\langle A^i_{t+k} \,|\, A^i_t \rangle = m^k A^i_t + \sum_{l=1}^{k} m^{k-l} \langle h \rangle_{t+l}. \quad (S6)$

S.3 Subsampling
When only a fraction of the full system can be observed, this is defined as subsampling. Examples include electrophysiological recordings of neuronal activity in neuroscience or incomplete case reporting of infectious disease propagation. Naive analysis of the data that neglects the influence of subsampling can lead to severe misinterpretations of the system's dynamics [12,16,18]. The theory and implications of subsampling for linear autoregressive processes have been described in detail in Ref. [12] and are recapitulated here briefly. The time series a^i_t is called a subsample of A^i_t if

$\langle a^i_t \,|\, A^i_t \rangle = \alpha A^i_t + \beta \quad (S7)$

holds for all t ∈ N with constants α, β ∈ R. The subsample a^i_t is constructed from the fully sampled time series upon sampling and does not interfere with its evolution. We assume β = 0.

S.4 Multistep regression estimation
To infer a network's autocorrelation time τ and the branching parameter m even under subsampling, Wilting & Priesemann developed the multistep regression (MR) estimator [12]. It addresses the issue of classical estimators being biased under subsampling. The MR estimator is applicable to stationary autoregressive processes of first order only, giving misestimations when applied to a non-stationary autoregressive time series. The MR estimator works as follows: In a first step, we estimate the linear correlation of Eq. (S4) between a step a^i_t and a^i_{t+k} (within the same realization), with slope r_k and offset s_k for time lags 0 < k < k_max and time steps t < T − k, by minimizing the sum of residuals,

$\hat r_k, \hat s_k = \arg\min_{r_k, s_k} \sum_{t=1}^{T-k} \left( a^i_{t+k} - (r_k\, a^i_t + s_k) \right)^2. \quad (S8)$

It can be shown [12] that for stationary dynamics r̂_k converges in probability to

$\hat r_k \xrightarrow{P} b\, m^k, \quad (S9)$

where m^k is the slope between the fully sampled activity pairs and b is the bias in the slope estimation due to subsampling. More specifically, the linear regression slopes fulfil the relation

$\hat r_k = \frac{\mathrm{Cov}_i(a^i_t, a^i_{t+k})}{\mathrm{Var}_i(a^i_t)} = \alpha^2 \frac{\mathrm{Var}_i(A^i_t)}{\mathrm{Var}_i(a^i_t)}\, m^k = b\, m^k, \quad (S10)$

where the notation Var_i(·) denotes the variance over independent realizations. The bias b depends on the subsampling fraction α (see Eq. (S7)) and on the variances of the full activity A^i_t and the subsampled activity a^i_t, respectively. However, these are usually unknown. Then, in the second step of the estimator, the sum of residuals is minimized for

$\hat b, \hat m = \arg\min_{b, m} \sum_{k=1}^{k_{\max}} \left( \hat r_k - b\, m^k \right)^2, \quad (S11)$

where the two-step estimation over various time lags k allows us to infer the bias b, which remains unknown to classical linear-regression estimators, and the branching parameter m. The autocorrelation time τ can easily be calculated via Eq. (S2). The procedure is equivalent to calculating the time-series autocorrelation, which has a decreased correlation strength b, in step 1, and fitting the exponential decay in step 2.
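The second (fitting) step can be sketched as follows; for simplicity, the nonlinear fit of Eq. (S11) is here replaced by a linear fit in log space (our simplification, valid only when all r̂_k > 0; the toolbox [32] performs a proper nonlinear fit with error estimates):

```python
import numpy as np

def mr_fit(r_k, dt=1.0):
    """Given slope estimates r_k for lags k = 1..len(r_k), fit r_k = b * m**k
    via linear regression of log(r_k) on k, and return m, b, and tau."""
    k = np.arange(1, len(r_k) + 1)
    log_m, log_b = np.polyfit(k, np.log(r_k), 1)  # log r_k = log b + k log m
    m, b = np.exp(log_m), np.exp(log_b)
    tau = -dt / np.log(m)        # Eq. (S2)
    return m, b, tau

# exact synthetic slopes: b = 0.5 (subsampling bias), m = 0.9
r = 0.5 * 0.9 ** np.arange(1, 11)
m_hat, b_hat, tau_hat = mr_fit(r)
```

The fit recovers m (and hence τ) even though the amplitude b < 1 reflects the subsampling bias.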
We used the python toolbox Mr. Estimator [32], an implementation of the multistep regression estimator, for all analyses. The exponential function f_exp(x, τ, b) = b exp(−x/τ) was used for this investigation, as it captures the pure exponential decay characteristic of the autocorrelation function of an autoregressive process. For the monkey dataset, the offset-exponential fit function f_exp(x, τ, b, c) = b exp(−x/τ) + c was used.
Two estimation methods are implemented in the Mr. Estimator toolbox. The method stationarymean uses all trials combined to calculate the activity average ā, which is needed to calculate the linear regression slopes r̂_k numerically. The advantage of the method stationarymean is that the linear regression estimation is more robust if only short trials with few data points each are available. The method trialseparated calculates the activity average ā^i and subsequently the linear regression slopes for each trial i independently, and averages over all obtained regression slopes r̂_{k,i}; see Ref. [32] for further details. In case the mean activity between trials varies significantly, the method trialseparated provides better estimation results for the regression slopes. However, when each trial is short but activity across trials is stationary (or, as in our case, cyclostationary), the method stationarymean corrects for short-trial biases [32]. We validate the trial-ensemble average correction on both methods in Sec. S.8. For all estimations in the paper, the method stationarymean was used to correct for short-trial biases.
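The distinction between the two methods can be illustrated with a simplified numpy sketch (our own rendering, not the toolbox implementation, which additionally handles bias corrections and error estimation):

```python
import numpy as np

def rk_stationarymean(a, k):
    """Slope at lag k from fluctuations around the global mean,
    pooled over all trials and times (robust for short trials)."""
    d = a - a.mean()
    x, y = d[:, :-k].ravel(), d[:, k:].ravel()
    return (x * y).mean() / (x * x).mean()

def rk_trialseparated(a, k):
    """Slope at lag k computed per trial around each trial's own mean,
    then averaged over trials."""
    d = a - a.mean(axis=1, keepdims=True)
    x, y = d[:, :-k], d[:, k:]
    return ((x * y).mean(axis=1) / (x * x).mean(axis=1)).mean()
```

For long stationary trials both methods agree; for short trials the per-trial mean subtraction in trialseparated introduces the short-trial bias discussed above.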

S.5 Effect of non-stationary input on MR estimation
Assuming a branching process that is subject to a time-dependent external input rate ⟨h⟩_t, a naive application of the MR estimator gives a biased estimate τ̂. We will demonstrate this analytically in the following example of a step function, which can be generalized to arbitrary time-dependent external input rates.
Let (A^i_t)_{t=0}^{T} be subject to a time-dependent external input rate ⟨h⟩_t with a step function:

$\langle h \rangle_t = \begin{cases} \langle h \rangle_1 & \text{for } t < t_{\mathrm{step}}, \\ \langle h \rangle_2 & \text{for } t \geq t_{\mathrm{step}}. \end{cases} \quad (S12)$

Now one can divide the mean activity development into three regimes, as shown in Fig. S1: two stationary regimes, I and III, with different expectation values, and one transient regime, II, right after the jump in the external input rate. The expectation values for the regimes can be derived from Eqs. (S5) and (S4).
FIG. S1. Mean activity ⟨A⟩_t (bottom) of a BP subject to a non-stationary input rate ⟨h⟩_t (top) with a step-function rate [Eq. (S12)] can be divided into three regimes: mean baseline activity A_0 in regime I, the transient regime II with growing activity, and a new mean activity in regime III; compare Eq. (S13).
$\langle A \rangle_t = \begin{cases} \frac{\langle h \rangle_1}{1-m} & \text{for } t < t_{\mathrm{step}}, \\ m^{t - t_{\mathrm{step}}} \frac{\langle h \rangle_1}{1-m} + \frac{\langle h \rangle_2}{1-m} \left( 1 - m^{t - t_{\mathrm{step}}} \right) & \text{for } t \geq t_{\mathrm{step}}, \\ \frac{\langle h \rangle_2}{1-m} & \text{for } t \gg t_{\mathrm{step}}. \end{cases} \quad (S13)$

Here, cases 1 (t < t_step) and 3 (t ≫ t_step) follow immediately from Eq. (S5), while for case 2 (t ≥ t_step) we assume the stationary solution of regime I, ⟨h⟩_1/(1−m), at t_step, insert this into Eq. (S4) with ⟨h⟩ = ⟨h⟩_2, and identify k = t − t_step.
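The regime-wise values derived from Eqs. (S4) and (S5) can be verified numerically by iterating the mean recursion ⟨A⟩_{t+1} = m ⟨A⟩_t + ⟨h⟩_{t+1} directly; a small sketch with illustrative parameters (all names ours):

```python
import numpy as np

# Analytic mean evolution <A>_t under a step-function input rate.
m = np.exp(-1.0 / 20.0)                       # tau = 20 steps
T, t_step = 1000, 500
h1, h2 = 40.0, 80.0
h = np.where(np.arange(T) < t_step, h1, h2)   # step-function input rate
mean_A = np.empty(T)
mean_A[0] = h1 / (1.0 - m)                    # regime I: stationary solution, Eq. (S5)
for t in range(T - 1):
    mean_A[t + 1] = m * mean_A[t] + h[t + 1]
```

Before the step the mean stays at the regime-I fixed point ⟨h⟩_1/(1 − m); long after the step it relaxes to the regime-III value ⟨h⟩_2/(1 − m).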
By applying the estimator to the different regimes separately, following the steps in Ref. [12], one finds that the linear regression offset estimators ŝ_k take on different values due to ⟨h⟩_t in the different regimes:

$\hat s_k = \langle a \rangle \left( 1 - \hat r_k \right) = \alpha \frac{\langle h \rangle}{1-m} \left( 1 - b\, m^k \right), \quad (S14)$

with ⟨h⟩ = ⟨h⟩_1 in regime I and ⟨h⟩ = ⟨h⟩_2 in regime III. Here we can clearly see that the least-squares estimation with Eq. (S8) on the entire time series is influenced by the step function in ⟨h⟩_t, as two different offsets would effectively be treated as one. Consequently, the estimation of r̂_k is biased by the time dependence of the external input rate. To visualize this analytical example, the linear regression at a given time lag k is shown in Fig. S2 for the regimes I to III, both separately and combined.

S.6 Trial-ensemble average correction

This section presents the trial-ensemble average correction for subsampled autoregressive processes of first order, which realizes a correct estimation of the process's spreading dynamics in terms of the branching parameter m and the autocorrelation time τ, and proves its validity analytically. We discuss subsampled systems with ⟨a^i_t | A^i_t⟩ = α A^i_t; all following results are applicable to fully sampled systems A^i_t by choosing α = 1. When the external input H^i_t is drawn from the same time-dependent probability distribution in every trial, we can define a trial-ensemble average that averages over all trials for each time step,

$\bar a_t = \frac{1}{N} \sum_{i=1}^{N} a^i_t, \quad (S15)$

where N is the number of trials. For N → ∞ the trial-ensemble average converges in probability to the time-dependent expectation value, ā_t → ⟨a⟩_t at a given time step, which we assume for the following analytical derivations. We start by defining the mean-corrected time series

$\tilde a^i_t = a^i_t - \bar a_t, \quad (S16)$

such that ⟨ã⟩_t = 0. We recall that the actual time evolution takes place in the original process A^i_t. Next, we show that the slopes r_{k,t} from linear regressions over mean-corrected subsampled activities from an ensemble of trials at times t and t + k can be decomposed into a time-dependent correlation bias b_t and a k-dependent decay m^k. For each time t we can solve the simple linear regression, Eq.
(S8), with

$\hat r_{k,t} = \frac{\mathrm{Cov}_i(\tilde a^i_t, \tilde a^i_{t+k})}{\mathrm{Var}_i(\tilde a^i_t)}, \quad (S17)$

where the covariance is given by

$\mathrm{Cov}_i(\tilde a^i_t, \tilde a^i_{t+k}) = \langle \tilde a_t \tilde a_{t+k} \rangle - \langle \tilde a_t \rangle \langle \tilde a_{t+k} \rangle, \quad (S18)$

and ⟨·⟩ denotes the ensemble expectation value (where by convention the index i is dropped). Per construction ⟨ã_t⟩ = ⟨ã_{t+k}⟩ = 0, cf. Eq. (S16). We thus only need to calculate

$\langle \tilde a_t \tilde a_{t+k} \rangle = \langle a_t a_{t+k} \rangle - \langle a_t \rangle \langle a_{t+k} \rangle = \alpha^2 \left( \langle A_t A_{t+k} \rangle - \langle A_t \rangle \langle A_{t+k} \rangle \right), \quad (S19)$

where we used ⟨a_t⟩ = ⟨⟨a_t | A_t⟩⟩ and Eq. (S7). Using again the law of total expectation, namely ⟨A_t A_{t+k}⟩ = ⟨A_t ⟨A_{t+k} | A_t⟩⟩ and ⟨A_{t+k}⟩ = ⟨⟨A_{t+k} | A_t⟩⟩, we find

$\langle A_t A_{t+k} \rangle = m^k \langle A_t^2 \rangle + \langle A_t \rangle \sum_{l=1}^{k} m^{k-l} \langle h \rangle_{t+l}, \quad (S20)$

$\langle A_t \rangle \langle A_{t+k} \rangle = m^k \langle A_t \rangle^2 + \langle A_t \rangle \sum_{l=1}^{k} m^{k-l} \langle h \rangle_{t+l}, \quad (S21)$

such that

$\langle \tilde a_t \tilde a_{t+k} \rangle = \alpha^2 m^k\, \mathrm{Var}_i(A^i_t), \quad (S22)$

where we used Eq. (S6). We thus find

$\hat r_{k,t} = \alpha^2 \frac{\mathrm{Var}_i(A^i_t)}{\mathrm{Var}_i(\tilde a^i_t)}\, m^k = b_t\, m^k, \quad (S23)$

with a time-dependent amplitude (or bias) b_t = α² Var_i(A^i_t)/Var_i(ã^i_t) and a purely k-dependent decay m^k. The time-dependent amplitude can be related to the Fano factor of the original process. To see this, we note that per construction the trial-ensemble expectation value ⟨ã⟩_t = 0, such that Var_i(ã^i_t) = ⟨ã²_t⟩ = Var_i(a^i_t). When the subsampling procedure can be described by a binomial distribution, where Var_i(a^i_t | A^i_t) = α(1−α)A^i_t, we obtain from the law of total variance, Var_i(a^i_t) = α² Var_i(A^i_t) + α(1−α)⟨A⟩_t, such that

$b_t = \frac{\alpha^2\, \mathrm{Var}_i(A^i_t)}{\alpha^2\, \mathrm{Var}_i(A^i_t) + \alpha(1-\alpha)\langle A \rangle_t} = \frac{\alpha F_t}{\alpha F_t + 1 - \alpha}, \quad (S24)$

with the across-trial Fano factor F_t = Var_i(A^i_t)/⟨A⟩_t. Finally, we show that the time-dependent amplitude b_t still allows the application of the linear regression estimator r̂_k to the mean-corrected subsampled process, as found in Eq. (S8) and Eq. (S11), despite cyclostationary external input. For this, it is important to notice that the minimization in the simple linear regression step, Eq. (S8), is solved by

$\hat r_k = \frac{\mathrm{Cov}_{i,t}(\tilde a^i_t, \tilde a^i_{t+k})}{\mathrm{Var}_{i,t}(\tilde a^i_t)}, \quad (S25)$

where both covariance and variance here run over the trial ensemble (i) as well as time (t).
For the stationarymean method of the MR estimator [32], Eq. (S25) translates to

$\hat r_k = \frac{\frac{1}{NT} \sum_{i,t} (\tilde a^i_t - \bar{\tilde a})(\tilde a^i_{t+k} - \bar{\tilde a})}{\frac{1}{NT} \sum_{i,t} (\tilde a^i_t - \bar{\tilde a})^2}, \quad (S26)$

where $\bar{\tilde a} = \frac{1}{NT} \sum_{i,t} \tilde a^i_t = 0$ by construction. We thus find that

$\hat r_k = \frac{\sum_t \mathrm{Cov}_i(\tilde a^i_t, \tilde a^i_{t+k})}{\sum_t \mathrm{Var}_i(\tilde a^i_t)} = \frac{\sum_t b_t\, \mathrm{Var}_i(\tilde a^i_t)}{\sum_t \mathrm{Var}_i(\tilde a^i_t)}\, m^k = b\, m^k, \quad (S27)$

where we used Eq. (S23) and observe that the effective amplitude b = Σ_t b_t Var_i(ã^i_t) / Σ_t Var_i(ã^i_t) remains k-independent. With a k-independent amplitude, the second (fitting) step of the MR estimator, Eq. (S11), becomes unbiased.
For the trialseparated method of the MR estimator [32], Eq. (S25) translates to

$\hat r_k = \frac{1}{N} \sum_{i=1}^{N} \frac{\mathrm{Cov}_t(\tilde a^i_t, \tilde a^i_{t+k})}{\mathrm{Var}_t(\tilde a^i_t)}, \quad (S28)$

where [·] denotes the time average and [ã^i_t] = (1/T) Σ_t ã^i_t ≈ 0. Because the trials are independent but identically distributed, we can assume that Var_t(ã^i_t) = [(ã^i_t)²] is constant across trials, take it out of the sum, rearrange the double sum as in Eq. (S26), and find

$\hat r_k = \frac{\sum_t b_t\, \mathrm{Var}_i(\tilde a^i_t)}{T\, \mathrm{Var}_t(\tilde a^i_t)}\, m^k = b\, m^k, \quad (S29)$

where we used Eq. (S23) and observe that the effective amplitude b = Σ_t b_t Var_i(ã^i_t)/(T Var_t(ã^i_t)) remains k-independent. With a k-independent amplitude, the second (fitting) step of the MR estimator, Eq. (S11), becomes unbiased.
To summarize, we showed that the corrected time series ã^i_t enables the application of the MR estimator [12,32] for an unbiased estimation of the internal dynamics (m, or equivalently τ) from subsampled data despite a time-dependent, cyclostationary external input rate.

S.7 Proof of concept
To verify the trial-ensemble average correction methodology, a numerical test was performed, see Fig. S3. More specifically, branching process trials with a time-dependent external input (step function) were simulated for various autocorrelation times τ and subsampling fractions α. Both estimation methods stationarymean and trialseparated of the MR estimator toolbox were tested, see Sec. S.4. For stationarymean (Fig. S3 a-c), the length of each trial was chosen as T = 100 τ, with N = 5000 trials. For trialseparated (Fig. S3 d-f), the length of each trial was chosen as T = 5000 τ (to avoid short-trial biases, cf. Sec. S.4 and Refs. [32,33]), with N = 200 simulated trials. In both cases, the relative height of the step function is ⟨h⟩_up/⟨h⟩_down = 2. In Fig. S3, we plot independent estimates for each τ (after trial-ensemble average correction) together with the fit error from the MR estimation toolbox [32]. One can clearly see that the results differ by less than 1% from the true τ, and moreover that about 2/3 of the results are correct within bootstrap error bars, as they should be. This demonstrates the general applicability of our method despite subsampling.
In addition, we checked the applicability of our methodology for small trial numbers N (Fig. S4). Again, we simulated branching processes with different autocorrelation times τ and a time-dependent external input (step function), but now we fixed the trial length T = 100 τ and the subsampling fraction α = 0.01 and varied the number of trials N. For each data point in Fig. S4, we generated 100 simulations with independent estimates of τ, and we plotted the mean and standard deviation (as error bars). For stationarymean (Fig. S4 a-c), the standard deviation decreases with increasing trial number as expected (it starts at 10% for N = 10, which would still enable one to quantify the order of magnitude, and falls to about 2% for N ≈ 200), while the mean always coincides with the true τ. For trialseparated (Fig. S4 d-f), the variance also decreases with increasing N, but the mean no longer coincides with the true τ. This is due to the aforementioned short-trial bias [32,33], given the short trial length of T = 100 τ. This shows that for cyclostationary external input, the better choice of the two methods above is stationarymean.

S.8 Effect of time-dependent autocorrelation strength b_t
As we derived in Eq. (S24), regimes with rapid changes in the input rate lead to a variation of the Fano factor F_t and, under subsampling, subsequently of the amplitude b_t. More specifically, a rapid increase in h_t decreases F_t, and a rapid decrease in h_t increases F_t. These regimes of rapid change will be called transient regimes.
To test the influence of strong transients on the estimation of r_k and subsequently τ, a branching process with extreme transients was simulated and analyzed with the MR estimator (Fig. S5). More specifically, a periodically recurring jump in the external input rate h_t was implemented on a branching process with τ = 200 steps (Fig. S5 a-f). The period of the external input was chosen as T_P = 10 τ, so that the up and down transients cover five autocorrelation times each. This way, approximately stationary regions were excluded and the process consists only of transient regimes. The jump's height in activity and input rate is h_up / h_down = 5, and the subsampling fraction is α = 0.01. The estimated τ̂ deviates by only 2% from the internal autocorrelation time. The same check was repeated for a periodically increasing input rate (Fig. S5 g-i) with the same result. Note that the minor systematic underestimation of the internal timescale (τ̂ < τ) results from a finite-statistics bias [33]. We conclude that the MR estimator applied to trial-ensemble average corrected time series correctly infers the internal autocorrelation time despite strong time dependence of the input rate.

S.9 Numerical data
All branching process simulations were performed in C++, where the random number generator MT19937 was used to drive two Poisson distributions: one for the recurrent internal activation of the branching process and one for the external input. Reproducible seeding was utilized. Subsampling of the full activity was implemented numerically by drawing from a binomial distribution.
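The simulation scheme can be sketched as follows (a minimal C++ sketch with illustrative parameter values, not those of the paper; for brevity the input rate h is constant here, while the time-dependent step functions h_t used above would simply replace it inside the loop). At each step the full activity evolves as A_{t+1} ~ Poisson(m A_t) + Poisson(h), and the observed activity is a_t ~ Binomial(A_t, α):

```cpp
#include <random>
#include <vector>

// Sketch of a subsampled branching process with branching parameter m,
// external input rate h, subsampling fraction alpha, and T time steps.
std::vector<int> simulate(double m, double h, double alpha,
                          int T, unsigned seed) {
    std::mt19937 rng(seed);                      // reproducible seeding
    std::poisson_distribution<int> external(h);  // external input drive
    std::vector<int> subsampled(T);
    int A = static_cast<int>(h / (1.0 - m));     // start near stationary mean
    for (int t = 0; t < T; ++t) {
        // observe each active unit independently with probability alpha
        subsampled[t] = std::binomial_distribution<int>(A, alpha)(rng);
        int next = external(rng);
        if (A > 0) {                             // recurrent internal activation
            next += std::poisson_distribution<int>(m * A)(rng);
        }
        A = next;
    }
    return subsampled;
}
```

The internal autocorrelation time then follows from the branching parameter as τ = −Δt / ln m.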

S.10 Epidemiological recordings
Case report data for measles and norovirus infections in Germany were obtained from the Robert Koch Institute [35]. Strong seasonal fluctuations, and presumably non-stationarities, with a period of 52 weeks motivated the investigation of the epidemiological case reports, preprocessed with the trial-ensemble average correction and the MR estimator. The case numbers were available with weekly binning for 52+1 weeks per year from 2001 to 2018. Week 53 was omitted due to overlap between years, and only full years were used, thus ignoring the first recording year. The data were separated into trials representing one year each, in agreement with the 52-week periodicity of the fluctuations.
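This trial separation can be sketched as follows (variable and function names are ours), assuming the weekly counts of the complete years are stored consecutively with 53 entries per year; each trial keeps weeks 1-52 and drops week 53:

```cpp
#include <vector>

// Split consecutive weekly case counts into one trial per year of 52 weeks,
// omitting week 53 of each year and any trailing incomplete year.
std::vector<std::vector<int>>
yearly_trials(const std::vector<int>& weekly, int weeks_per_year = 53) {
    std::vector<std::vector<int>> trials;
    for (int start = 0;
         start + weeks_per_year <= static_cast<int>(weekly.size());
         start += weeks_per_year) {
        // keep weeks 1..52 of this year, omit week 53
        trials.emplace_back(weekly.begin() + start,
                            weekly.begin() + start + 52);
    }
    return trials;
}
```

The resulting trial ensemble is then treated exactly like the simulated trials above: trial-ensemble average correction followed by MR estimation.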

S.11 Spike data of Macaca mulatta
The monkey experiments were performed according to the German Law for the Protection of Experimental Animals, and were approved by the Regierungspräsidium Darmstadt. The procedures also conformed to the regulations issued by the NIH and the Society for Neuroscience.
Spike data from electrophysiological recordings in the brains of Macaca mulatta monkeys were analyzed. The dataset originates from a visual short-term memory experiment by Pipa et al. [34]. The monkeys were presented with visual sample stimuli, which they had to remember for 3 s. Afterwards, test stimuli were shown, which the monkeys had to classify as matching or non-matching. 16 single-ended micro-electrodes and tetrodes in a 4x4 grid were placed in the lateral prefrontal cortex of the trained monkeys, with an inter-electrode spacing between 0.5 and 1 mm. The setup allowed simultaneous recording of single-unit activity and field potentials at 1 kHz; the signals were digitized and processed such that artefacts from licking and movement were rejected. The tetrode recordings were spike-sorted with the Spyke Viewer software, whereas the micro-electrode data were processed with the Smart Spike Sorter by Nan-Hui Chen.
To convert the recordings into activity data, all simultaneously recorded spikes were collapsed into one collective spike count and binned into Δt = 4 ms time steps [12]. The 4 ms binning represents the timescale on which spikes propagate from one neuron to the next, motivated by the autoregressive process as a model for neural activity propagation. The trials within each set had slightly varying lengths, so they were cut off at the end to share the length of the shortest trial within the set. No temporal alignment was performed. For the MR estimation, the maximum time lag was chosen as half the minimum trial length, thus k_max = T_min / 2. This way, enough data is available for each regression step k. For fitting, the offset fit function f_exp(x, τ, b, c) = b exp(−x/τ) + c from the Mr. Estimator toolbox was used, see Sec. S.4. At most 300 trials were analyzed for each set; two sets contained fewer than 300 trials, see Tab. S2.
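The binning and the choice of the maximum lag can be sketched as follows (function names are ours; spike times are assumed in milliseconds and trial lengths in time steps):

```cpp
#include <vector>
#include <algorithm>

// Collapse spike times (ms) from all units of one trial into a collective
// spike count with dt = 4 ms bins; spikes outside the trial are discarded.
std::vector<int> bin_spikes(const std::vector<double>& spike_times_ms,
                            double trial_length_ms, double dt_ms = 4.0) {
    std::vector<int> counts(static_cast<int>(trial_length_ms / dt_ms), 0);
    for (double t : spike_times_ms) {
        int bin = static_cast<int>(t / dt_ms);
        if (bin >= 0 && bin < static_cast<int>(counts.size())) ++counts[bin];
    }
    return counts;
}

// Maximum regression lag: half the shortest trial length, k_max = T_min / 2.
int max_lag(const std::vector<int>& trial_lengths) {
    return *std::min_element(trial_lengths.begin(), trial_lengths.end()) / 2;
}
```

Cutting all trials of a set to the shortest length then guarantees that every regression step k up to k_max can draw on the same number of value pairs per trial.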