Dynamically accelerated cover times

Among observables characterising the random exploration of a graph or lattice, the cover time, namely the time to visit every site, continues to attract widespread interest. Much insight about cover times is gained by mapping to the (spaceless) coupon-collector problem, which amounts to ignoring spatio-temporal correlations, and an early conjecture that the limiting cover time distribution of regular random walks on large lattices converges to the Gumbel distribution in $d \geq 3$ was recently proved rigorously. Furthermore, a number of mathematical and numerical studies point to the robustness of the Gumbel universality to modifications of the \textit{spatial} features of the random search processes (e.g.\ introducing persistence and/or intermittence, or changing the graph topology). Here we investigate the robustness of the Gumbel universality to dynamical modification of the \textit{temporal} features of the search, specifically by allowing the random walker to "accelerate" or "decelerate" upon visiting a previously unexplored site. We generalise the mapping mentioned above by relating the statistics of cover times to the roughness of $1/f^\alpha$ Gaussian signals, leading to the conjecture that the Gumbel distribution is but one of a family of cover time distributions, ranging from Gaussian for highly accelerated cover, to exponential for highly decelerated cover. While our conjecture is confirmed by systematic Monte Carlo simulations in dimensions $d > 3$, our results for acceleration in $d = 3$ challenge the current understanding of the role of correlations in the cover time problem.


I. INTRODUCTION
How long does it take to collect N distinct objects that are sampled uniformly with replacement? This is the so-called coupon collector problem [1]. Depending on the context, the objects may represent stickers in a football album, vertices on a fully-connected graph, or people in an epidemic. Close analogies to the coupon collector can be found in a toy model for the build-up of strain in a seismic fault [2], the random deposition of k-mers on a substrate [3], the infection of nodes on a network [4], or the parasitization of hosts [5]. More generally, the coupon collector belongs to the family of urn problems [6,7]. An early result, proved by Erdős and Rényi [8], is that the coupon collection time follows a Gumbel distribution.
Often, the N objects to be collected are not sampled uniformly at any given time. For example, a random walker exploring a lattice can only "collect" nearest-neighbour sites. In this context, the total time to visit every site on a graph or lattice is known as the cover time. Cover times have been intensely studied since the 1980s [9-11]. For example, an early conjecture [12] that the cover time for a d ≥ 3 torus is also Gumbel distributed was recently proved rigorously [13]. The manner in which a random walker covers a lattice [14-16] is encoded in the trace of the walk, i.e. the walk's history, and this non-trivial random object has received much attention in the mathematics literature [17,18]. Qualitatively, an important distinction is between walks that are transient (d > 2) versus recurrent (d ≤ 2), even if the walk is restricted to a finite torus, in which case every site will eventually be visited.
In this paper, we are interested in modifying the cover process in time. Thus, we study the consequences of accelerating or decelerating the random walker upon visiting a new site. In this way, we show that the Gumbel distribution is but one of a family of cover time distributions, ranging from Gaussian for highly accelerated cover, to exponential for highly decelerated cover. Coincidentally, this family of distributions describes the roughness of 1/f^α Gaussian signals [19].
Our motivation for dynamically modifying the cover process is to further investigate some of the assumptions underlying the mapping of the cover time problem in d ≥ 3 to the coupon collector problem, specifically those relating to the irrelevance of spatio-temporal correlations. The specific procedure we implement is also inspired by transport behavior in e.g. cellular environments, in which a molecule may aggregate or fragment in the course of its diffusion, thereby altering its diffusion constant in time [20,21]. Alternatively, in the context of search problems [22], the random walker could be "rewarded" or "penalized" upon acquiring new targets, thereby enhancing or inhibiting future search.
The structure of the paper is as follows: In Sec. II we review basic results of the coupon collector problem. In Sec. III we describe how we accelerate or decelerate the dynamics, and identify the distribution of collection times. In Sec. IV we turn our attention to cover times on a torus, and present numerical results for accelerated and decelerated random walkers in Secs. V and VI. We summarize our findings in Sec. VII.

II. COUPON COLLECTOR PROBLEM
In this section we review the basic properties of the coupon collector problem [8]. The probability p_i of collecting a new coupon, given that i have already been collected, is

p_i = 1 − i/N.

Qualitatively, the first coupons are collected rapidly, while the last coupons are collected very slowly. Let n_i be the number of coupons drawn between collecting the ith and (i + 1)th distinct coupon. Then the total number of draws C_N to collect N coupons is

C_N = Σ_{i=0}^{N−1} n_i,

where the n_i are independent but non-identical geometric random variables with mean 1/p_i. Using angular brackets to denote expectation, the mean of C_N is therefore

⟨C_N⟩ = Σ_{i=0}^{N−1} 1/p_i = N Σ_{k=1}^{N} 1/k ≃ N log N, (4)

which behaves like N log N for large N, i.e. collecting the full set of N coupons is slower than linear in N. Similarly, it can be shown that the variance of C_N is proportional to N². Erdős and Rényi derived the full distribution of C_N, showing it to be Gumbel [8].
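To make these scalings concrete, here is a minimal Monte Carlo check (our own illustration, not part of the original analysis; `collect_coupons` and `harmonic` are names we introduce):

```python
import random

def collect_coupons(N, rng):
    """Draw coupons uniformly with replacement until all N distinct
    coupons are seen; return the total number of draws C_N."""
    seen = set()
    draws = 0
    while len(seen) < N:
        seen.add(rng.randrange(N))
        draws += 1
    return draws

def harmonic(N):
    """H_N = sum_{k=1}^N 1/k, so that <C_N> = N * H_N exactly."""
    return sum(1.0 / k for k in range(1, N + 1))

rng = random.Random(1)
N, runs = 200, 2000
samples = [collect_coupons(N, rng) for _ in range(runs)]
mean_est = sum(samples) / runs
mean_exact = N * harmonic(N)   # ~ N log N for large N
```

For N = 200 the exact mean N H_N ≈ 1176, and the empirical mean agrees with it to within a few per cent.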
Before giving a heuristic derivation of this distribution, it is convenient to embed the coupon collector in continuous time, such that coupons arrive at unit rate in the manner of a Poisson point process [12]. Thus, rather than the discrete unit steps representing the number of coupon draws, consider instead the amount of continuous time elapsed since collection began. In this perspective, the collection time H_j for any particular coupon j is an exponential random variable with mean N,

P(H_j > t) = e^{−t/N}.

The total collection time is the maximum of all the individual coupon collection times. Since these times are identical and independent,

P(C_N ≤ t) = (1 − e^{−t/N})^N.

[Fig. 1: The intensity of coupon arrivals (middle) is increased as distinct coupons are acquired (filled circles). The piecewise constant and increasing intensity profile (top) gives rise to a point process of distinct coupon arrivals (bottom) whose intensity can be adjusted.]
After centering and rescaling,

lim_{N→∞} P(C_N/N − log N ≤ z) = exp(−e^{−z}),

which is recognized as the Gumbel distribution from extreme value statistics [23].
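As a numerical illustration of this limit (ours): in the continuous-time embedding, C_N is the maximum of N iid exponentials of mean N, so the rescaled variable C_N/N − log N should approach Gumbel statistics, with mean given by the Euler-Mascheroni constant γ ≈ 0.5772 and variance π²/6:

```python
import math
import random

rng = random.Random(2)
N, runs = 500, 4000

def rescaled_collection_time():
    # maximum of N iid Exp(mean N), rescaled by N and centred by log N
    c = max(rng.expovariate(1.0 / N) for _ in range(N))
    return c / N - math.log(N)

z = [rescaled_collection_time() for _ in range(runs)]
mean_z = sum(z) / runs
var_z = sum((v - mean_z) ** 2 for v in z) / runs

# Gumbel predictions: mean = Euler-Mascheroni constant, variance = pi^2/6
gumbel_mean = 0.5772156649
gumbel_var = math.pi ** 2 / 6
```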

III. ACCELERATED AND DECELERATED COUPON COLLECTOR
The waiting time T_i between collecting the ith and (i + 1)th distinct coupon is a sum over a random number n_i of unit exponential random variables. Since n_i is a geometric random variable, T_i is, in fact, also exponentially distributed with mean 1/p_i [24]. Thus, the total collection time can be written as

C_N = Σ_{i=0}^{N−1} T_i = Σ_{k=1}^{N} (N/k) ε_k, (12)

where the ε_k are independent and identically distributed exponential random variables with unit mean. We now manipulate the arrival rate of random coupons which, in turn, alters the rate at which distinct coupons are collected. For example, if coupons arrive at rate ρ_i = 1/p_i = 1/(1 − i/N) while i coupons have been collected, then the waiting time between distinct coupons has unit mean. Thus, by accelerating the arrival of coupons to compensate for the decreasing likelihood of obtaining a distinct coupon, distinct coupons are collected at unit rate. This acceleration protocol is depicted schematically in Fig. 1: the piecewise constant rates ρ_i increase each time a distinct coupon is collected.
In order to accommodate a variety of acceleration-deceleration protocols, we generalize the rates ρ_i according to

ρ_i = (1 − i/N)^{α−1}. (13)

This leads to the collection time

C_N(α) = Σ_{k=1}^{N} (N/k)^α ε_k = N^α Σ_{k=1}^{N} ε_k/k^α, (14)

where the unaccelerated coupon collector is recovered for α = 1, i.e. Eq. (12), and the accelerated version just discussed above corresponds to α = 0. For large N, the mean of C_N(α) scales as

⟨C_N(α)⟩ ∼ N/(1 − α) for 0 ≤ α < 1,  N log N for α = 1,  ζ(α) N^α for α > 1,

so that coupon collecting is accelerated for 0 ≤ α < 1, and decelerated for α > 1, as compared to the original unaccelerated process with α = 1. Apart from the N^α prefactor, the exact same sum in Eq. (14) describes the roughness of periodic Gaussian 1/f^α signals [19], as outlined in the Appendix. In that context, α = 0, 1, 2, 4 correspond respectively to white noise, 1/f noise [25], a steady-state Edwards-Wilkinson interface [26], and a steady-state curvature-driven interface [27].
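Eq. (14) can be sampled directly as a weighted sum of unit exponentials. A short sketch (ours, using numpy; the function names are our own) that checks the mean against the scalings quoted above:

```python
import numpy as np

def sample_CN(N, alpha, size, rng):
    """Sample C_N(alpha) = N^alpha * sum_k eps_k / k^alpha, Eq. (14)."""
    k = np.arange(1, N + 1)
    eps = rng.exponential(size=(size, N))     # iid unit-mean exponentials
    return N ** alpha * (eps / k ** alpha).sum(axis=1)

def mean_CN(N, alpha):
    """Exact mean N^alpha * sum_k k^(-alpha)."""
    k = np.arange(1, N + 1)
    return N ** alpha * np.sum(k ** (-float(alpha)))

rng = np.random.default_rng(3)
N = 1000
m0 = mean_CN(N, 0.0)   # accelerated: mean is exactly N
m1 = mean_CN(N, 1.0)   # unaccelerated: mean ~ N log N
m2 = mean_CN(N, 2.0)   # decelerated: mean ~ zeta(2) N^2 = pi^2 N^2 / 6
emp = sample_CN(N, 1.0, 2000, rng).mean()
```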
When α = 0, C_N(0) in Eq. (14) is a sum over independent and identically distributed exponential random variables, which, after rescaling, is described by the central limit theorem. As shown in the Appendix, the Lindeberg condition extends the central limit theorem to non-identical random variables, such that the rescaled distribution of C_N(α) remains Gaussian for all α ≤ 1/2. For α = 2, the distribution is Kolmogorov-Smirnov, i.e. the distribution of the test statistic in the Kolmogorov-Smirnov goodness-of-fit test [28]. This distribution recurs in many Brownian problems [29,30], branching processes [31], aggregation [32], and statistics [33]. For α = 4, the distribution of C_N(4) has been calculated in [34]. Finally, in the limit α → ∞, C_N(∞) is exponentially distributed, since only the first term in Eq. (14) contributes. A full discussion of the properties of C_N(α) can be found in [19]. In summary, the Gumbel distribution is one of a family of distributions of sums of weighted exponential random variables.
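The interpolation between Gaussian and exponential statistics can be tracked through the skewness of C_N(α), which follows in closed form from the cumulants of a weighted sum of unit exponentials, κ_n = (n − 1)! N^{nα} Σ_k k^{−nα} (the N^{nα} prefactor cancels in the skewness). A minimal check (ours):

```python
import numpy as np

def skewness_CN(N, alpha):
    """Skewness of C_N(alpha): for a weighted sum of unit exponentials
    the cumulants are kappa_n = (n-1)! * sum_k w_k^n with w_k = (N/k)^alpha,
    and the overall N^alpha factor cancels in kappa_3 / kappa_2^(3/2)."""
    k = np.arange(1, N + 1, dtype=float)
    s2 = np.sum(k ** (-2 * alpha))
    s3 = np.sum(k ** (-3 * alpha))
    return 2 * s3 / s2 ** 1.5

N = 10 ** 6
skew_gauss = skewness_CN(N, 0.0)   # -> 2/sqrt(N): vanishes in the CLT regime
skew_exp   = skewness_CN(N, 10.0)  # -> 2, the skewness of a single exponential
```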

IV. COVER TIMES ON A TORUS
If one identifies coupons with sites, then coupon collecting is similar in spirit to covering a lattice or graph, that is, visiting each and every site at least once. However, if the lattice exploration is undertaken by a random walker, it is far from obvious that coupon collecting describes the statistics of covering: at any given time coupons are sampled uniformly, whereas a random walker samples nearest neighbour sites. This non-uniform sampling is illustrated in Fig. 2, showing a portion of the trace of a random walk as it covers a lattice in d = 3. On a fully-connected graph all sites are nearest neighbours. Therefore covering a fully-connected graph via a random walk is almost identical to coupon collecting, with the irrelevant difference that the random walker must necessarily leave the site most recently visited (assuming self-loops are excluded). Meanwhile, for random graphs cover times have been actively studied by mathematicians [10,35] and physicists [36,37], among others. If the probability distribution of the random walker location converges to the uniform distribution sufficiently fast, the same N log N scaling as Eq. (4) often describes the mean cover time. A graph-dependent constant prefactor will reflect the fact that the walker has to diffuse across the graph to cover it. This constant can be expressed in terms of the mean time spent at the origin [9].
For random walks on a torus (i.e., a regular lattice with periodic boundary conditions), cover times depend on dimension. In d = 1, the cover time (equivalent to the first-passage time of the range process) is not Gumbel distributed [38], while in d ≥ 3 it is [13]. The d = 2 cover time, posed as the "white screen problem" [11], is not completely resolved to this day. Dembo et al. have established rigorously that the mean cover time converges to 4L²(log L)²/π as the side length L of the square lattice tends to infinity, although there are practical difficulties in observing this behaviour in numerics [39]. Subleading order corrections to Dembo et al.'s result have been explored in the mathematics literature [40]. In the physics literature, numerical evidence suggests that d = 2 cover times are approximately Gumbel-distributed [41].
For this reason, in the following we restrict our attention to d ≥ 3, where it is rigorously known that the cover time is Gumbel distributed [13] (already anticipated heuristically in [12]). The technical proof of this result relies on the transience of the random walk in d ≥ 3, and the approximately Poisson distribution of unvisited sites at the late stage of the cover process [13].
Remarkably, the coupon collector scaling carries over to the cover time, even though the first-passage times {H_1, H_2, ..., H_N} to each of the N sites are clearly not independent random variables, although they are approximately exponential. The appropriately scaled cover time now takes the form

lim_{N→∞} P(C_N/(g(0)N) − log N ≤ z) = exp(−e^{−z}),

which is identical to the coupon collector apart from a factor g(0). This factor is the Green function for the unrestricted random walker evaluated at the origin, which is equivalent to the mean time spent at the origin. For example, for the simple cubic lattice in d = 3 [42],

g(0) ≈ 1.5164.

Thus, random walk covering is approximately 50% slower on a simple-cubic lattice compared to a fully-connected graph.
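The value of g(0) can be reproduced independently (our sketch): a closed form in terms of Gamma functions is known for the simple cubic lattice (Watson's integral, evaluated by Glasser and Zucker), and g(0) also equals the expected number of visits the walker makes to its starting site, since g(0) = 1/(1 − p_return):

```python
import math
import numpy as np

# Closed form for the simple-cubic lattice Green function at the origin:
g0_exact = (math.sqrt(6) / (32 * math.pi ** 3)) * (
    math.gamma(1 / 24) * math.gamma(5 / 24)
    * math.gamma(7 / 24) * math.gamma(11 / 24)
)  # ~ 1.5164

# Monte Carlo cross-check: g(0) is the mean number of visits to the
# origin (counting the start) of a simple random walk on Z^3.
rng = np.random.default_rng(5)
walkers, steps = 4000, 5000
pos = np.zeros((walkers, 3), dtype=np.int64)
visits = np.ones(walkers)                 # each walk starts at the origin
for _ in range(steps):
    axis = rng.integers(0, 3, size=walkers)
    sign = rng.choice([-1, 1], size=walkers)
    pos[np.arange(walkers), axis] += sign
    visits += np.all(pos == 0, axis=1)
g0_mc = visits.mean()
```

The Monte Carlo estimate is biased slightly low because returns after the simulated horizon are missed, but the two numbers agree to a few per cent.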

V. ACCELERATED AND DECELERATED COVER
In the coupon collector, the waiting times between coupon arrivals are exponential, and acceleration or deceleration is effected by changing their rate. Analogously, the cover process is accelerated or decelerated by changing the rate of the exponential waiting times between random walk steps. Thus, if we employ the acceleration-deceleration protocol as described in Eq. (13), we might conjecture that, for d ≥ 3, the cover time in Eq. (14) is generalized to

C_N(α) = g(0) N^α Σ_{k=1}^{N} ε_k/k^α, (18)

where the ε_k are again iid exponential random variables, and the effect of the underlying lattice is incorporated by the Green function g(0). This generalization assumes that the correlations that were carefully accounted for in the standard cover problem [13] continue to play a minor role for α ≠ 1. In the case α = 1, it is known that the first-passage times H_x and H_y of sites x and y respectively are correlated such that

⟨1(H_x > t) 1(H_y > t)⟩ ≃ exp(−2t/[(g(0) + g(x − y))N]), (19)

where 1(H_x > t) indicates that site x is first visited at a time greater than t. Eq. (19) is an asymptote in large system size N with t proportional to that size [15,18]. For α ≠ 1, on the other hand, the nature of the correlations is unknown to us. We numerically test the conjecture of Eq. (18) in the following by rescaling the observed probability density p(C_N(α)) by the mean, or by the standard deviation after centering,

φ_1(x) with x = C_N(α)/⟨C_N(α)⟩,  and  φ_2(z) with z = (C_N(α) − ⟨C_N(α)⟩)/σ_{C_N(α)}.

A. Deceleration, α = 2

For α = 2, we conjecture that C_N(2) is described by the Kolmogorov-Smirnov distribution, with Laplace transform [29,30]

⟨e^{−sC_N(2)}⟩ = πN√(g(0)s)/sinh(πN√(g(0)s)) (22)

for large N, and first two moments

⟨C_N(2)⟩ = π²g(0)N²/6,  σ²_{C_N(2)} = π⁴g(0)²N⁴/90.

The Laplace transform in Eq. (22) can be inverted to recover a series expansion for the probability density p(C_N(2)) which, after rescaling by the mean, reads [29]

φ_1(x) = (π²/3) Σ_{k=1}^{∞} (−1)^{k+1} k² exp(−π²k²x/6).
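As a self-check of this series (our sketch), its normalisation and its unit mean can be verified by numerical quadrature:

```python
import numpy as np

def phi1_alpha2(x, kmax=400):
    """Series density of C_N(2) rescaled by its mean:
    phi_1(x) = (pi^2/3) sum_{k>=1} (-1)^(k+1) k^2 exp(-pi^2 k^2 x / 6)."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    for k in range(1, kmax + 1):
        out += (-1.0) ** (k + 1) * k ** 2 * np.exp(-np.pi ** 2 * k ** 2 * x / 6)
    return np.pi ** 2 / 3 * out

def trapz(y, x):
    # simple trapezoidal rule (avoids numpy version differences)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

x = np.linspace(1e-4, 12.0, 200_001)
p = phi1_alpha2(x)
norm = trapz(p, x)        # should be close to 1
mean = trapz(x * p, x)    # should be close to 1 (density rescaled by its mean)
```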
The sum converges fast, so that the cover time density of relatively small systems is very close to the asymptotic density as N → ∞.

B. Deceleration, α = 4

For α = 4, we conjecture that C_N(4) has the same distribution as the roughness of a curvature-driven interface, with Laplace transform [34]

⟨e^{−sC_N(4)}⟩ = 2π²N²√(g(0)s) / [cosh(√2 πN(g(0)s)^{1/4}) − cos(√2 πN(g(0)s)^{1/4})] (25)

for large N, and first two moments

⟨C_N(4)⟩ = π⁴g(0)N⁴/90,  σ²_{C_N(4)} = π⁸g(0)²N⁸/9450.

The Laplace transform in Eq. (25) can be inverted to recover a series expansion for the probability density p(C_N(4)) which, after rescaling by the mean, reads [34]

φ_1(x) = (2π⁵/45) Σ_{k=1}^{∞} (−1)^{k+1} (k⁵/sinh(πk)) exp(−π⁴k⁴x/90).

C. Acceleration, 0 ≤ α ≤ 1/2

For 0 ≤ α ≤ 1/2, Eq. (18) falls under the scope of the central limit theorem. Therefore, the conjectured statistics of C_N(α) normalised to zero mean and unit standard deviation are described by a Gaussian distribution,

φ_2(z) = (1/√(2π)) e^{−z²/2}.

In the presence of correlations, the central limit theorem need no longer apply. Indeed, we find that our conjecture breaks down for accelerated cover in d = 3, and we discuss that case separately in Section VI. For d ≥ 4, however, our conjecture continues to agree well with numerics.

D. Acceleration, 1/2 < α < 1

As explained in [19], for 1/2 < α < 1 the rescaled cover time densities φ_2(z) can be expanded as a series whose coefficients involve the Riemann zeta function ζ [Eq. (29)], interpolating between the Gaussian (α ≤ 1/2) and Gumbel (α = 1) limits.

VI. ACCELERATED COVER IN d = 3

In all cases considered so far, the conjecture that the cover time C_N(α) is described statistically by Eq. (18) is successfully verified empirically. However, the conjecture fails in the case α = 0 in d = 3. According to Eq. (18), the cover time is predicted to be statistically equivalent to a sum of independent and identical exponential random variables, therefore falling under the scope of the central limit theorem. The only feature correctly predicted by Eq. (18) is that the mean cover time ⟨C_N(0)⟩ still behaves as g(0)N, as shown in Fig. 10. However, the standard deviation σ_{C_N(0)} does not scale as N^{1/2}. Instead, for system sizes N ≥ 10³ it is well approximated by

σ_{C_N(0)} ≈ A N^γ,

where we note that the fitted values of the amplitude A = 0.44(2) and exponent γ = 0.6608(12) are close to √2 − 1 = 0.414... and 2/3, respectively. The rescaled cover time density φ_2(z) is also not Gaussian, as shown in Figs. 11 and 12. We are not able to identify the empirical density, although a Tracy-Widom density for the largest eigenvalue from the Gaussian orthogonal ensemble of random matrices gives a reasonable approximation. Given the discrepancies in the right tail of the density, and the behaviour of the skewness and kurtosis as shown in Fig. 13, we cannot claim conclusive support for the Tracy-Widom density and offer this curious near coincidence as an open problem.
While we cannot identify the empirical density of cover times for α = 0 and d = 3, we can nevertheless investigate the breakdown of our conjecture, Eq. (18), which naively expresses the cover time as a sum over exponential waiting times ε_k. Since we do not recover the anticipated Gaussian distribution, we are led to conclude that the random variables ε_k are either sufficiently non-identical, or non-independent (or both).
To isolate these questions, we perform a shuffling operation across (independent) members of the ensemble from which we collect statistics of cover times. Specifically, we choose one member of the ensemble at random, i.e. one realisation of the cover process, and sum the first b of its waiting times {ε_k^{(1)}}_{k=1}^{b}. Then we pick another realisation at random, and sum the next b waiting times from that process {ε_k^{(2)}}_{k=b+1}^{2b}, and so on. We continue this operation N/b times, so that we accumulate the shuffled cover time

C_N^{shuff.}(0) = Σ_{m=1}^{N/b} Σ_{k=(m−1)b+1}^{mb} ε_k^{(m)}. (32)

By this operation, we generate an ensemble of cover times from processes that have been block-shuffled. If the block length b = N, then the original cover process is left intact and no shuffling occurs. Meanwhile, if b = 1, then each waiting time ε_k is drawn randomly from the ensemble distribution of waiting times to the kth unvisited site. More generally, the block length plays the role of a "high-pass" filter that destroys correlations with characteristic scale larger than b. Thus, for b = 1, the block-shuffled cover time C_N^{shuff.}(0) is assembled from waiting times belonging to independent realizations. The resulting C_N^{shuff.}(0) could only be non-Gaussian if the ε_k were sufficiently non-identical.
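A minimal implementation of this block-shuffling operation (our sketch; `waits` holds one realisation of waiting times per row, and the demo uses synthetic iid data rather than actual cover processes):

```python
import numpy as np

def block_shuffle(waits, b, rng):
    """Assemble one shuffled cover time, Eq. (32): for each of the N/b
    blocks, sum b consecutive waiting times taken from a randomly
    chosen realisation (row) of the ensemble."""
    R, N = waits.shape
    assert N % b == 0, "block length must divide the number of sites"
    total = 0.0
    for m in range(N // b):
        r = rng.integers(R)                     # random realisation
        total += waits[r, m * b:(m + 1) * b].sum()
    return total

# demo on synthetic iid waiting times (illustration only)
rng = np.random.default_rng(7)
R, N = 100, 1000
waits = rng.exponential(size=(R, N))
unshuffled = waits.sum(axis=1)
shuffled = np.array([block_shuffle(waits, b=10, rng=rng) for _ in range(200)])
c_full = block_shuffle(waits, b=N, rng=rng)     # b = N: one realisation intact
```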
As a measure of discrepancy between the empirical density φ_2(z) of shuffled cover times and a standard Gaussian density g(z), we compute the Kullback-Leibler divergence from φ_2(z) to g(z) for different block lengths b. Figure 13 shows that a comparatively large block length of b ≈ 5 × 10³ = N/25 is enough to recover Gaussian cover time behaviour, thus suggesting that long-range correlations are at play. It is instructive to consider another modification of the cover process (also implemented in [41] in the unaccelerated case). Instead of splicing together blocks of cover from independent realizations, we intermittently allow the random walker to "teleport" to a randomly chosen site. Thus, the walker performs a teleportation jump with probability p, and a nearest-neighbour step with probability (1 − p). If p = 0, the original cover process is recovered. If p = 1, the walker effectively explores a fully-connected graph. The outcome is shown in Fig. 15. In conclusion, we attribute the non-Gaussianity of α = 0 accelerated cover times in d = 3 to correlations in the sequence of sites visited as the lattice is covered. However, we are not able to explain why such correlations can be ignored for d ≥ 4, or for deceleration protocols with α > 1.
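The teleporting walker is straightforward to sketch (our code; the function name is our own). For p = 1 every move is a uniformly chosen site, so the cover time should revert to coupon-collector statistics with mean close to N H_N:

```python
import numpy as np

def cover_time_teleport(L, p, rng):
    """Cover time of a walker on an L^3 torus that teleports to a uniformly
    random site with probability p, else takes a nearest-neighbour step."""
    N = L ** 3
    visited = np.zeros((L, L, L), dtype=bool)
    pos = np.array([0, 0, 0])
    visited[0, 0, 0] = True
    count, t = 1, 0
    while count < N:
        t += 1
        if rng.random() < p:
            pos = rng.integers(0, L, size=3)           # teleport
        else:
            axis = rng.integers(3)                     # nearest-neighbour step
            pos[axis] = (pos[axis] + rng.choice([-1, 1])) % L
        if not visited[tuple(pos)]:
            visited[tuple(pos)] = True
            count += 1
    return t

rng = np.random.default_rng(8)
L = 5                      # N = 125 sites
times_p1 = [cover_time_teleport(L, p=1.0, rng=rng) for _ in range(100)]
mean_p1 = np.mean(times_p1)
H_N = sum(1 / k for k in range(1, L ** 3 + 1))
```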

VII. CONCLUSION
We have studied the cover times of accelerated and decelerated random walks on a torus in dimensions d ≥ 3. Building on the work of Aldous [12] and Belius [13], we conjecture a generalized cover time which agrees well with numerics for a range of acceleration-deceleration protocols and dimensions. The α-indexed family of cover time distributions is in fact the family describing the roughness of 1/f^α Gaussian signals [19], which includes Gaussian (0 ≤ α ≤ 1/2), Gumbel (α = 1) and exponential (α → ∞) distributions, to name a few.
A notable exception to our conjecture is for α = 0 in d = 3, where we find a cover time distribution somewhat resembling a Tracy-Widom distribution from the Gaussian orthogonal ensemble of random matrices. Although the numerics do not support this identification conclusively, it is interesting to speculate whether a connection between accelerated cover in d = 3 and random matrices exists, e.g. via mappings to KPZ interfaces [43], Gaussian free fields [44,45], or spin glasses [46].
This study leaves a number of open questions, such as the identification of the cover time distribution for α = 0 in d = 3, and why this distribution is particular to d = 3.
Appendix A: Roughness of 1/f^α Gaussian signals

Consider the periodic Gaussian signal

h(x) = Σ_{k=1}^{N} (1/k^{α/2}) [a_k sin(2πkx/L) + b_k cos(2πkx/L)], (A1)

where the amplitudes a_k and b_k are independent standard Gaussian random variables [19]. By construction, the signal is periodic with zero mean, and its power spectrum decays as 1/k^α. The integrated power spectrum is

(1/L) ∫_0^L h²(x) dx = Σ_{k=1}^{N} (a_k² + b_k²)/(2k^α)

by Parseval's theorem. Since the sum of two squared standard Gaussian random variables is exponentially distributed,

(1/L) ∫_0^L h²(x) dx = Σ_{k=1}^{N} ε_k/k^α in distribution.

Hence, apart from an N^α prefactor, the integrated power spectrum of 1/f^α signals has the same distribution as the coupon collection time C_N(α) discussed in the main text.
In the language of interfaces, Eq. (A1) describes a periodic steady-state height profile, and the integrated power spectrum is equivalent to the profile's roughness [47]. A review of 1/f α signals can be found in [19].
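A direct check of Eq. (A1) and the Parseval identity (our sketch): synthesise the signal on a uniform grid and compare the spatial average of h² with the spectral sum. For mode numbers below half the grid size, discrete orthogonality makes the two agree to machine precision:

```python
import numpy as np

rng = np.random.default_rng(9)
alpha = 1.0
L = 1024                 # number of grid points
N = 200                  # number of Fourier modes (N < L/2)

k = np.arange(1, N + 1)
a = rng.standard_normal(N)
b = rng.standard_normal(N)

x = np.arange(L)
phase = 2 * np.pi * np.outer(k, x) / L          # shape (N, L)
h = (k ** (-alpha / 2))[:, None] * (
    a[:, None] * np.sin(phase) + b[:, None] * np.cos(phase)
)
h = h.sum(axis=0)                                # the 1/f^alpha signal, Eq. (A1)

spatial = np.mean(h ** 2)                        # (1/L) * sum_x h(x)^2
spectral = np.sum((a ** 2 + b ** 2) / (2 * k ** alpha))
```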

Appendix B: Lindeberg condition
Given a collection of independent but not necessarily identical random variables {X_k}_{k=1}^{N} with (finite) variances {σ_k²}_{k=1}^{N}, the Lindeberg condition [48] guarantees that their rescaled sum is still Gaussian-distributed, provided that

lim_{N→∞} (1/s_N²) Σ_{k=1}^{N} ⟨(X_k − ⟨X_k⟩)² 1(|X_k − ⟨X_k⟩| > δ s_N)⟩ = 0 for every δ > 0, where s_N² = Σ_{k=1}^{N} σ_k². (B1)

In our context, the collection of random variables ε_k/k^α have variances 1/k^{2α}. Therefore, satisfying Eq. (B1) requires that the total variance diverges,

s_N² = Σ_{k=1}^{N} 1/k^{2α} → ∞ as N → ∞,

i.e. that 0 ≤ α ≤ 1/2.
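The role of the boundary α = 1/2 can be seen numerically (our illustration): for α < 1/2 the total variance s_N² diverges with N, so no single term dominates, while for α > 1/2 it saturates and the k = 1 term retains a finite share:

```python
import numpy as np

def variance_share(N, alpha):
    """Return (s_N^2, share of the largest single variance sigma_1^2 = 1)."""
    k = np.arange(1, N + 1, dtype=float)
    s2 = np.sum(k ** (-2 * alpha))
    return s2, 1.0 / s2

# alpha < 1/2: s_N^2 ~ N^(1-2*alpha) diverges, largest-term share -> 0
s2_small, share_small = variance_share(10 ** 6, 0.4)
# alpha > 1/2: s_N^2 -> zeta(2*alpha), largest-term share stays finite
s2_large, share_large = variance_share(10 ** 6, 0.6)
```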