Maximal intervals between rare random events

© 2011 by Alexei Kourbatov,
As of 01/2013 this article has been updated and incorporated into: Maximal gaps between prime k-tuples: a statistical approach (see Appendix)

Abstract. We derive a simple formula expressing the expected maximal interval between rare random events in terms of the average interval:

E(max interval)  =  a ln(T/a) + O(a)     (1 << a << T),
where a is the average interval between the rare events, and T is the total observation time or length, as applicable.

The formula holds for random events occurring at exponentially distributed (real-valued) intervals, as well as for events at geometrically distributed discrete (integer-valued) intervals. (For more information on extreme value distributions of random sequences see [1–4]; for extreme value distributions of integer-valued random sequences, such as head runs in coin toss sequences, see [5,6] and further references therein.) We expect that similar formulas would also be applicable to maximal intervals between terms in certain other kinds of random or pseudorandom sequences. Moreover, a similar heuristic formula a(ln(x/a) − b) describes maximal gaps between primes near x as well as maximal gaps between prime constellations or prime k-tuples [7].

Notation. In all formulas hereafter:
ln x denotes the natural logarithm of x
logbx denotes the base-b logarithm of x
E(x) denotes the expected value of a random variable x
SD x denotes the standard deviation of x
Var x denotes the variance of x
y = O(x) is shorthand for “y is such that |y| ≤ Cx for some constant C
y = o(x) is shorthand for “y is such that y/x tends to zero.”


Two problems about random events

To illustrate the formula for maximal intervals between random events, we will use two problems:

Problem A. Consider a non-stop toll bridge with very light traffic. Let p > 1/2 be the probability that no car crosses the toll line during a one-second interval, and q = 1 − p the probability to see a car at the toll line during any given second. Suppose we observe the bridge for a total of T seconds, where T is large, while p is constant.

Problem B. Consider a biased coin with a probability of heads p > 1/2 (and the probability of tails q = 1 − p). We toss the coin a total of T times, where T is large.

In both problems, we want to answer the following questions:
(1) What is the expected total number of rare events (cars or tails) in the observation series of length T?
(2) What is the expected average interval a between events (i.e. between cars/tails)?
(3) What is the expected maximal interval between events, as a function of the average interval a?

Before you read on, try to answer the questions yourself. (Notice that the first and the second question are so much easier than the third!)

Here are the easy answers:

(1) Because the probability of the event is q at any given second/toss, we expect a total of nq events after n seconds/tosses, and a total of Tq events at the end of the entire observation series of length T.

(2) To estimate the expected average interval a between events, we divide the total length T of our observation series by the expected total number of events Tq. So the expected average interval between events is aT/(Tq) = 1/q.

(3) Quite obviously, we can predict that the expected maximal interval is less than T, but not less than a:

a  ≤  E(max interval)  <  T.
The expected maximal interval will likely depend on both a and T:
E(max interval)  =  ƒ(a,T).
It is also reasonable to expect that ƒ(a,T) is an increasing function of both arguments, a and T.

This is nice – but can we say anything more specific about the expected maximal interval?


Estimating the most probable maximal interval

In both problems A and B we will assume that 1 << a << T  –  or, in plain English:
  • the events are rare and
  • our observations continue for long enough to see many events.

    In problem A, to estimate the most probable maximal interval between cars we proceed as follows:

    After n seconds of observations, we would have seen about nq cars, hence about nq intervals between cars. The intervals are independent of each other and real-valued. A known good model for the distribution of these intervals is the exponential distribution:

    with probability 1, each interval must be at least 0 seconds;
    with probability p, any given interval is at least 1 second;1
    with probability p2, any given interval is at least 2 seconds;
    with probability p3, any given interval is at least 3 seconds;...
    with probability pt, any given interval is at least t seconds.

    Thus, after n seconds of observations and about nq carless intervals, we would reasonably expect that at least one interval is no shorter than t seconds if we choose t such that

    pt × (nq)  ≥  1.
    Now it is easy to estimate the most probable maximal interval tmax:
    ptmax  ≈  1/(nq)
    (1/p)tmax  ≈  nq
    tmax  ≈  log1/p(nq).

    In problem B we can estimate the longest run of heads Rn after n coin tosses reasoning very similarly. One notable difference is that now the head runs are discrete (have integer lengths). Accordingly, they should be modeled using the geometric distribution. Schilling [5] gives this estimate for the longest run of heads after n tosses, given the heads probability p:

    Rn  ≈  log1/p(nq).
    In both problems, the estimates for the most probable maximal interval (as a function of p and n) have the same form log1/p(nq). Therefore, it is reasonable to expect that the answers to our original question (3) in both problems A and B will also be the same or similar functions of the average interval a, even though the problems are modeled using different distributions of intervals. We will soon see that this indeed is the case.


    If random events are rare...

    If the events (cars in problem A, or tails in problem B) are rare, then p is very close to 1, and q is very close to 0. Since q is small, we have 1/(1 − q) = 1 + q + o(q), so we can write:
    ln(1/p)  =  ln(1/(1 − q))  =  ln(1 + q + o(q))  =  q + o(q)
    or, omitting the o(q) terms,
    ln(1/p)  ≈  ln(1 + q)  ≈  q,     and therefore  
    1 / ln(1/p)  ≈  1/q  ≈  a.

    This allows us to transform our previous estimate log1/p(nq) as follows:

    log1/p(nq)  =  ln(nq) / ln(1/p)  ≈  (1/q) ln(nq)  ≈  a ln(n/a).
    For a long series of observations, with the total length or duration n = T (e.g. T tosses of a biased coin, or T seconds of observing the bridge), the estimate for the most probable maximal interval becomes a ln(T/a).


    Expected maximal intervals

    The specific formulas for expected maximal intervals between rare events depend on the nature of events in the problem – in particular, whether the initial distribution of intervals is exponential or geometric. However, in the formulas for both cases the highest-order term (as T → ∞) turns out to be the same: a ln(T/a), which was precisely our estimate for the most probable maximal interval.

    (A) Exponential initial distribution. Fisher and Tippett [1], Gnedenko [2], Gumbel [3] and other authors showed that, for initial distributions of exponential type (including the exponential distribution as a special case) the limiting distribution of maximal terms in a random sequence is the double exponential distribution – often called the Gumbel distribution. If the initial exponential distribution has the mean a and the probability density function (PDF) 1 − pt = 1 − et/a, then the limiting Gumbel distribution for problem A has these parameters:

    Scale  =  a  =  1/ln(1/p)  ≈  1/q
    Mode  =  μ  =  log1/p(Tq)  ≈  a ln(T/a)
    Mean  =  μ + γa
    where γ = 0.5772... is the Euler-Mascheroni constant. The mean value of observed maximal intervals in problem A will converge almost surely to the mean μ + γa of the Gumbel distribution,
    E(max interval)  =  μ + γa  =  log1/p(Tq) + γ/ln(1/p)    or, expressing the result in terms of a and T,
    E(max interval)  ≈  a ln(T/a) + γa  =  a ln(T/a) + O(a).

    Historical notes:  Interestingly, Gumbel was not the original discoverer of his namesake distribution. Fisher and Tippett (1928) [1] described three types of limiting extreme-value distributions and showed that the double exponential distribution is the limiting extreme-value distribution for a certain wide class of random sequences. They also computed, among other parameters, the mean-to-mode distance in the double exponential distribution (see [1], p.186; it is this result that allows one to conclude that the mean is μ + γa if the mode is μ.) Gnedenko (1943) [2] rigorously proved the necessary and sufficient conditions for an initial distribution to be in the domain of attraction of a given type of limiting distribution. The three possible types of limiting distributions are the Gumbel (double exponential), Fréchet, and Weibull distributions.

    (B) Geometric initial distribution. Somewhat surprisingly, in this case the limiting extreme-value distribution does not exist (see [5], p.203, for a detailed explanation). For the longest run of heads Rn in a series of n tosses of a biased coin, with the probability of heads p, Schilling [5] gives the expected value

    E(Rn)  =  log1/p(nq) + γ/ln(1/p) − 1/2 + r1(n) + ε1(n)
    where the first term is the same as in Problem A (up to a substitution n = T). The sum of the other terms grows as O(a) when p → 1 (and is bounded when n → ∞) so, again, we have:
    E(Rn)  =  a ln(n/a) + O(a).


    Standard deviation (SD)

    As above, the specific formula for standard deviation in distributions of maximal intervals between events depends on the nature of the problem (whether the initial distribution of intervals is exponential or geometric). Still, in both cases the SD is O(a):

    (A) Exponential initial distribution. Here the limiting distribution of maximal intervals is the Gumbel distribution with the scale a = 1/ln(1/p), therefore the SD of maximal intervals must be very close to the SD of the Gumbel distribution, which is πa/6 = π/(6 ln(1/p)) = O(a).

    (B) Geometric initial distribution. For the longest run of heads Rn in a series of n tosses of a biased coin, the variance is

    Var Rn  =  π2/(6 ln2(1/p)) + 1/12 + r2(n) + ε2(n)     ([5], p.202)
    where the first term grows as O(a2) when the heads probability p → 1, while the sum of the other three terms is O(1) when n → ∞ and/or p → 1. (Again, recall that for average intervals a between rare events – in this case, between “tails” – we have a ≈ 1/ln(1/p).) Therefore, the standard deviation (the square root of the variance) is:
    SD Rn  =  π/(6 ln(1/p)) + O(1)  =  O(a).


    A shortcut to the answer

    There is a simple way to guess the answer  a ln(T/a) + O(a). If a is the average interval between events, we estimate that the most probable maximal interval is about  a ln(T/a). We can now simply use the fact that the width of the extreme value distribution is O(a). (Imagine what happens if every interval becomes twice as large: then average and maximal intervals will also become twice as large, and the extreme value distribution will be twice as wide. This immediately implies that the extreme value distribution is O(a) wide.) But then the true value of the expected maximal interval cannot be any farther than O(a) from our estimate  a ln(T/a); so the expected maximal interval is  a ln(T/a) + O(a).



    We have thus considered maximal intervals between random events in two common situations:
    (A) rare events occurring at exponentially distributed intervals;
    (B) discrete rare events at geometrically distributed intervals.

    These two situations are somewhat different:
    in case (A) maximal intervals have a limiting distribution (the Gumbel distribution), while
    in case (B) no limiting distribution exists (here the Gumbel distribution is simply a good approximation).

    Nevertheless, in both cases the expected maximal interval between events is

    E(max interval)  =  a ln(T/a) + γa + lower-order term  =  a ln(T/a) + O(a),    
    with standard deviation  ≈  πa/6  =  O(a),  
    where a is the average interval between events, and T is the total observation time or length. When T → ∞, the lower-order term tends to zero in case (A) – but does not in case (B).

    Note. A remarkably similar heuristic formula, a(ln(x/a) − b) [7], with a negative term −ba replacing the positive γa, describes

  • maximal gaps between primes near x (a = ln x, b ≈ 3; OEIS A005250)
  • maximal gaps between twin primes near x (a = 0.76 ln2x, b ≈ 1.2; OEIS A113274) and, more generally,
  • maximal gaps between prime k-tuples (a = Clnkx, b ≈ 2/k).



    1. R.A. Fisher and L.H.C. Tippett, Limiting forms of the frequency distribution of the largest and smallest member of a sample, Proc. Camb. Phil. Soc., 24 (1928), 180-190.

    2. B.V. Gnedenko, Sur la distribution limite du terme maximum d'une série aléatoire. Ann. Math., 44 (1943), 423-453.
    English translation: On the limiting distribution of the maximum term in a random series. In: Breakthroughs in Statistics, Volume 1: Foundations and Basic Theory. Springer, New York (1993), 185-225.

    3. E.J. Gumbel, Statistical theory of extreme values and some practical applications. Applied Mathematics Series, 33. US National Bureau of Standards (1954)

    4. E.J. Gumbel, Statistics of Extremes, Columbia University Press, New York (1958)

    5. M. F. Schilling, The longest run of heads. The College Mathematics Journal, v.21 (1990), No.3, 196-207.

    6. L. Gordon, M.F. Schilling, and M.S. Waterman, An extreme value theory for long head runs. Probability Theory and Related Fields, v.72 (1986), 279-297.

    7. A. Kourbatov, Maximal gaps between prime k-tuples. In: Doing math with JavaScript (2011)

    8. A. Kourbatov, Maximal gaps between Cramer's random primes (2014)

    Copyright © 2011-2014, Alexei Kourbatov,