# Survival Times and Probabilities

My acquaintance gwern had a question: given the current total lifetime of a certain web service, how to infer the probable future lifetime?

## Quick Estimate

A quick and dirty intuitive method of calculating this uses Laplace’s rule of succession: treat survival or failure as a Bernoulli trial per unit time, so that given that the subject has survived for N days without a closure on any of those days, we could say that the probability of surviving the next day is as given:

$p(N) = \frac{1}{N+2}$

In that case the probability that we survive at least the next $$n$$ days could be $$p^n$$, or

$P(n | N) = \left(\frac{1}{N+2}\right)^n$

But of course that’s not really right, because after the first of those $$n$$ days we have one more day of information, so we’re a little bit more likely to survive the next. Therefore it can’t just be $$p^n$$. We really want:

$P(n | N) = \prod_{d=0}^{n-1} \left(\frac{1}{N+d+2}\right)$

## Problem?

The above solutions have two big problems. The first is somewhat obvious: the second, more correct solution requires $$n$$ to be an integral multiple of your unit of time. You can’t ask about $$n=1.618$$ days, because you can’t take the product of discrete steps from $$d=0$$ to $$1.618$$.

The second problem is more subtle: neither of these solutions are invariant with respect to the base unit of time. This is easiest to see with the first solution – changing from measuring in days to months turns $$\left(\frac{1}{N+2}\right)^n$$ into $$\left(\frac{1}{N/30+2}\right)^{(n/30)}$$ which is not identically equal. This is a problem because our units of time are an arbitrary convention and should make no difference to the result.

## A True Solution

The resolution to both those problems lies in observing that the rule of succession isn’t really aware of time as a continuous (as opposed to discrete) notion. We need a proper model. As it turns out, the usual model for ‘lifetimes’ of things is a negative exponential:

$P(T=t | a) = a e^{-at}$

$P(T>t | a) = e^{-at}$

This could perhaps be called the continuous generalization of a Bernoulli trial. The event has a constant probability per unit time given that it hasn’t happened yet, described by $$a$$, which is called the “hazard rate”. $$a$$ has units of inverse time, or “events per unit time”, as you’d expect.

In our case, $$a$$ is an unknown variable. The only input we’ve been given is the survival time so far, $$Y$$, from which we must calculate the distribution for the future survival time, $$X$$ (switching to variables $$X$$ and $$Y$$ here rather than $$n,N$$ since we’re no longer thinking discretely). Our goal is to calculate the posterior distribution:

$P(X=x | Y=y)$

or

$P(T=(x+y) | T>y)$

We can start by calculating the posterior distribution on $$a$$:

\begin{aligned} P(a | T>y) &= \frac{P(T>y|a) P(a)}{P(T>y)} \\ &= \frac{P(T>y|a) P(a)}{\int da' ~ P(T>y|a') P(a')} \end{aligned}

A reasonable choice of prior $$P(a)$$ is also a negative exponential $$P(a) = w e^{-wa}$$. We can take the limit $$w \to 0$$ for the flat (improper) prior of maximum ignorance. Then,

\begin{aligned} P(T>y | a) P(a) &= e^{-ay} w e^{-wa} \\ &= w e^{-a(y+w)} \\ P(T>y) &= w \int_0^\infty e^{-a(y+w)} ~ da \\ &= \frac{w}{y+w} \end{aligned}

So we have our posterior distribution on $$a$$:

\begin{aligned} P(a | T>y) &= (y + w) e^{-a(y+w)} \end{aligned}

With this we can calculate the posterior pdf on $$X$$, by marginalizing over $$a$$:

\begin{aligned} P(T=(x+y) | T>y) &= \int_0^∞ a e^{-ax} P(a|T>y) ~ da \\ &= \int_0^∞ a e^{-ax} (y + w) e^{-a(y+w)} ~ da \\ &= (y + w) \int_0^∞ a e^{-a(w + y + x)} ~ da \\ &= \frac{w + y}{(w + y + x)^2} \end{aligned}

And the posterior reverse cumulative distribution:

\begin{aligned} P(T>(x+y) | T>y) &= \int_0^∞ { e^{-ax} ~ (y + w) e^{-a(y+w)} } ~ da \\ &= \int_0^∞ { (y + w) e^{-a(w+y+x)} } ~ da \\ &= \frac{w + y}{w + y + x} \end{aligned}

Notice that $$w$$ (the prior) seems to affect the result in the same way as would $$w$$ units of previously observed survival time. Taking the ignorance prior $$w=0$$, the results are:

\begin{aligned} P(a | Y=y) &= y e^{-ay} \\ P(X=x | Y=y) &= \frac{y}{(y + x)^2} \\ P(X>x | Y=y) &= \frac{y}{y + x} \end{aligned}

The expected value of $$a$$ is $$1/y$$ which is quite reasonable, but notice that since $$x P(X=x | Y=y)$$ goes as $$1/x$$ asymptotically, the expectation $$E[x|y]$$ doesn’t converge.

Also notice that these results are properly invariant with respect to the unit of time. $$P(X>x | Y=y)$$ is a unitless probability, and therefore is completely invariant as it should be, while $$P(X=x | Y=y)$$ and $$P(a | Y=y)$$ are probability densities with respect to time and inverse time, and therefore have units of inverse time / time as they should.