Detailed Description of Distributions

This document describes the implementation of distributions for sampling in continuous-time stochastic simulation. This kind of simulation makes specific demands on the calls a distribution must support, and those calls differ from what standard libraries provide. This is a guide for implementing new distributions and a way to check that implementations look correct. If something is wrong here, it matters, so file a bug report.

Notation

First, let’s fix notation. The cumulative distribution function of every regular distribution can be written as an integral over its hazard rate, \(\lambda\),

(1)\[F(t)=1-e^{-\int_{0}^t \lambda(s)ds}.\]

All algorithms for stochastic simulation treat distributions as being defined in absolute time, specified as an enabling time, \(t_e\),

(2)\[F(t, t_e)=1-e^{-\int_{0}^{t-t_e} \lambda(s)ds}.\]

Working with distributions in absolute time is a simple shift of the time scale, so it will be ignored in further discussion, although the enabling time, \(t_e\), will certainly appear in code. See Fig. (1).

The density function is the derivative of the cumulative distribution function,

(3)\[f(t)=\frac{dF(t)}{dt}=\lambda(t)e^{-\int_{0}^t \lambda(s)ds}.\]

The survival function is

(4)\[G(t)=1-F(t)=e^{-\int_{0}^t \lambda(s)ds}.\]

Because survival is multiplicative, we further label the survival from time \(t_0\) to \(t_1\) as

(5)\[G(t_0, t_1)=\frac{G(t_1)}{G(t_0)}=e^{-\int_{t_0}^{t_1} \lambda(s)ds}\]

Requirements for a Continuous-Time Simulation

Shifted Sample

The First Reaction method requires that we sample a distribution given that we know it has not yet fired by a time \(t_0\). The statement that it hasn’t fired by time \(t_0\) creates a new distribution from which to sample. If the old distribution had the survival \(G(t)=G(0, t)\), it could be written as

(6)\[G(0, t)=G(0, t_0)G(t_0, t).\]

It is this partial survival, since \(t_0\), that we want to sample. Solving for \(G(t_0, t)\) and subtracting both sides from 1,

\[1-G(t_0, t)=\frac{G(0, t_0)-G(0, t)}{G(0, t_0)}.\]

Written in terms of the cumulative distribution functions, the cdf of the new distribution, which we’ll call \(F(t, t_0)\), is

(7)\[F(t, t_0)=\frac{F(t)-F(t_0)}{1-F(t_0)}\]

This kind of distribution could be sampled by a rejection method, but the default way to sample it is by inversion, which means generating a uniform random value \(U\) in \([0,1]\) and solving \(U=F(t)\) for \(t\). For Eq. (7), this becomes

(8)\[\begin{split}\begin{eqnarray} U&=&F(t,t_0,t_e) \\ &=&\frac{F(t,t_e)-F(t_0,t_e)}{1-F(t_0,t_e)} \\ U(1-F(t_0,t_e))&=&F(t,t_e)-F(t_0,t_e) \\ F(t,t_e)&=&U(1-F(t_0,t_e))+F(t_0,t_e) \\ F(t-t_e)&=&U(1-F(t_0-t_e))+F(t_0-t_e) \\ t-t_e &=& F^{-1}\left[U(1-F(t_0-t_e))+F(t_0-t_e)\right] \\ t &=& t_e+F^{-1}\left[U(1-F(t_0-t_e))+F(t_0-t_e)\right] \end{eqnarray}\end{split}\]

We will call this operation SampleShifted.
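Eq. (8) translates directly into code. As a minimal sketch for a unit-rate exponential, with its closed-form cdf and quantile (the `texp_*` and `sample_shifted_exp` names are ours, not from any library):

```julia
# Closed-form cdf and quantile of a unit-rate exponential (names are ours).
texp_cdf(t) = 1 - exp(-t)
texp_quantile(q) = -log(1 - q)

# SampleShifted by inversion, Eq. (8): sample given no firing by t0.
function sample_shifted_exp(t0::Float64, te::Float64, U::Float64)
    F0 = texp_cdf(t0 - te)
    te + texp_quantile(U*(1 - F0) + F0)
end
```

Because the exponential is memoryless, this reduces to \(t_0-\ln(1-U)\), which makes a handy check.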

Hazard Rate for Next Reaction

The Next Reaction method requires sampling a distribution such that the quantile is saved, so that later adjustments to the distribution can use the same quantile.

During a simulation, the hazard rate, \(\lambda\), is a function of the state of the system, \(X(t)\). The state of the system only changes in jumps, so the hazard rate is effectively a series of distributions in time. For instance, a hazard rate, from enabling time \(T_0\) to firing time \(T_3\), might have three parts.

(9)\[\begin{split}\begin{eqnarray} \lambda(\{X_0, T_0\}, t)=h_0(t) & &\qquad T_0 \le t < T_1 \\ \lambda(\{X_0, T_0, X_1, T_1\}, t)=h_1(t) & &\qquad T_1 \le t < T_2 \\ \lambda(\{X_0, T_0, X_1, T_1, X_2, T_2\}, t)=h_2(t) & &\qquad T_2 \le t < T_3 \end{eqnarray}\end{split}\]

The algorithm therefore samples for a firing time from \(h_0(t)\) as soon as the transition is enabled, but that time will turn out to be wrong (we call it a putative time). Later, the algorithm will resample using \(h_1(t)\), reusing the original sample’s quantile and taking time off the clock. If the first sample were by inversion, it would look like solving this equation for \(t\) (still ignoring enabling times),

\[U=1-\exp\left(-\int_0^{t}h_0(s)ds\right).\]

Then a later sample would use the same \(U\), but with knowledge that the distribution now contains a new part, \(h_1(t)\),

(10)\[ U=1-\exp\left(-\int_0^{t_1}h_0(s)ds\right)\exp\left(-\int_{t_1}^{t}h_1(s)ds\right)\]

Anderson had the bright idea to write the quantile as an exponential quantile, \(1-U=e^{\ln (1-U)}\), so that the equation requires only addition of integrated hazards,

(11)\[\begin{split}\begin{eqnarray} \ln(1-U)&=&-\int_0^{t_1}h_0(s)ds-\int_{t_1}^{t}h_1(s)ds \\ \int_{t_1}^{t}h_1(s)ds&=&-\ln(1-U)-\int_0^{t_1}h_0(s)ds. \end{eqnarray}\end{split}\]

As the underlying distribution, \(h_i(t)\), changes, the right hand side gets smaller and smaller. Let’s call the sum of all consumed hazard \(\gamma\),

\[\gamma=\sum_i \int_{t_i}^{t_{i+1}}h_i(s)ds\]

The algorithm therefore needs three operations from the distribution.

  1. MeasuredSample—Sample the distribution, returning the exponential quantile. Calling the random number generator, “rng,” and the putative time \(t_p\), it’s

    \[\left(\mbox{rng}, h_0(t)\right) \mapsto \left(t_p, -\ln(1-U)\right).\]
  2. ConsumeSample—Consume remaining quantile for the next sample. If the sum of hazard in the past is called \(\gamma\), then

    \[\left(\gamma, h_i(t), t_i\right) \mapsto \left(\gamma', t_{i+1}\right)\]
  3. Putative—Generate a new putative time from the exponential quantile and the consumed hazard,

    \[\left(-\ln(1-U), \gamma, t_i, h_i(t)\right) \mapsto p_t\]

The nice part about the first step is that there is no need to sample by inversion. Any sampling method will do, as long as the exponential quantile is calculated.
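For a piecewise-constant hazard, the three operations reduce to arithmetic. A minimal sketch, with hypothetical names, for a hazard that is \(h_0\) until \(t_1\) and \(h_1\) afterward:

```julia
# MeasuredSample: putative time and exponential quantile for constant h0.
measured_sample(mlogU, h0) = (mlogU/h0, mlogU)

# ConsumeSample: add the hazard consumed on [0, t1) to γ.
consume_sample(γ, h0, t1) = γ + h0*t1

# Putative: solve ∫_{t1}^{t} h1 ds = -ln(1-U) - γ for t,
# assuming mlogU ≥ γ (the sample has not yet fired by t1).
putative(mlogU, γ, t1, h1) = t1 + (mlogU - γ)/h1
```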

Cumulative Distributions for Next Reaction

The original form of the Next Reaction, by Gibson and Bruck, was written in terms, not of the hazards, but of the cumulative distribution functions. This form remains useful because some distributions are much simpler, or more accurate, to sample as cdfs instead of sampling from their hazard rates.

Returning to Eq. (10), this can be rewritten as

\[1-U=\exp\left(-\int_0^{t_1}h_0(s)ds\right)\exp\left(-\int_{t_1}^{t}h_1(s)ds\right)\]

In terms of the survival functions, this becomes

\[1-U=G_0(0, t_1)G_1(t_1, t)\]

If we wish to solve this for \(t\), then, in terms of the survival, it looks like

\[G_1(t_1, t)=\frac{1-U}{G_0(0, t_1)}\]

Writing the left-hand side as a cumulative distribution function requires the transformation

\[G(a, b)=\frac{G(b)}{G(a)}=\frac{1-F(b)}{G(a)}\]

so we have

\[F_1(t)=1-G_1(t_1) \frac{1-U}{G_0(0, t_1)}\]

This generalizes to many steps as

\[F_j(t)=1-G_j(t_j) (1-U) \prod_{i=0}^{j-1} \frac{G_i(t_i)}{G_i(t_{i+1})}\]

Let’s call the running product on the right \(\delta\),

\[\delta=\prod_{i=0}^{j-1} \frac{G_i(t_i)}{G_i(t_{i+1})}\]

Then the algorithm requires three operations

  1. MeasuredSample—Sample the distribution, returning the quantile. Calling the random number generator, “rng,” and the putative time \(t_p\), it’s

    \[\left(\mbox{rng}, h_0(t)\right) \mapsto \left(t_p, 1-U\right).\]
  2. ConsumeSample—Consume remaining quantile for the next sample.

    \[\left(\delta_i, h_i(t), t_i\right) \mapsto \left(\delta_{i+1}, t_{i+1}\right)\]
  3. Putative—Generate a new putative time from the quantile and the running product \(\delta\),

    \[\left(1-U, \delta_i, t_i, h_i(t)\right) \mapsto p_t\]

As you can see by comparison with the hazards version, it’s simple to write the algorithm to accommodate either method of sampling. Therefore, each distribution can choose which interface to support.
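The bookkeeping for the survival version is equally mechanical. A sketch, with hypothetical names, where `surv`, `cdfj`, and `quantilej` stand in for \(G_i\), \(F_j\), and \(F_j^{-1}\):

```julia
# Update the running product δ = ∏ G_i(t_i)/G_i(t_{i+1}) by one factor
# when the distribution changes at time tnext.
consume_survival(surv, δ, ti, tnext) = δ * surv(ti) / surv(tnext)

# Putative time from the quantile: solve F_j(t) = 1 - G_j(t_j)·(1-U)·δ.
function putative_from_quantile(cdfj, quantilej, tj, oneminusU, δ)
    quantilej(1 - (1 - cdfj(tj))*oneminusU*δ)
end
```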

Testing Distributions

EmpiricalDistribution

The type EmpiricalDistribution will estimate the mean and variance of a distribution, given samples from it. More importantly, it can compare those samples with a given functional form of a distribution using the Kolmogorov-Smirnov test. Often the distributions in this library have corresponding distributions in the main library, just with a different interface.

There are test functions after the definition of each distribution to perform these tests.

Emulating Next Reaction Sampling

Take a MeasuredSample and then repeatedly consume the sample, with the same distribution, and finally use Putative to get a value. It should agree with the original sample.
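For the exponential, the check can be worked by hand. A sketch with made-up numbers, following the operations defined earlier:

```julia
λ = 1.5
mlogU = 0.9                  # pretend randexp(rng) returned this
t_sample = 0.0 + mlogU/λ     # MeasuredSample's putative time: 0.6
γ = (0.4 - 0.0)*λ            # ConsumeSample up to t = 0.4: 0.6
t_new = 0.4 + (mlogU - γ)/λ  # Putative from what remains: 0.6 again
```

Because the distribution did not change, the putative time is recovered exactly.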

Exact Solution to Paradigmatic Systems

There are some stochastic systems, such as a random walk, or the SIR model, for which exact results are known for certain distributions.

The Ball and Nasell example shows this. It turns out there is a minor error in Ball and Nasell, which we found this way. The Weiss example makes a similar measurement.

Using Julia’s Distributions

Julia’s continuous univariate distributions support a fixed interface. In this section, we look at how to translate any distribution into the operations above.

In this table, d is the distribution.

cdf(d,t) \(F(t)\)
quantile(d,q) \(F^{-1}(q)\)
logcdf(d,t) \(\ln(F(t))\)
ccdf(d,t) \(G(t)\)
logccdf(d,t) \(\ln(G(t))=-\int_0^t \lambda(s)ds\)
cquantile(d,q) \(F^{-1}(1-q)=G^{-1}(q)\)
invlogcdf(d,lp) \(F^{-1}(e^{l_p})\)
invlogccdf(d,lp) \(G^{-1}(e^{l_p})\), i.e. the \(t(l_p)\) for which \(-\int_0^{t(l_p)}\lambda(s)ds=l_p\)
randexp(rng) \(-\ln(1-U)\)

A shifted sample, from Eq. (8), which ends with

\[t = t_e+F^{-1}\left[U(1-F(t_0-t_e))+F(t_0-t_e)\right]\]

transliterates to

wrappeddistribution.jl
function MeasuredSample(d::WrappedDistribution, t0::Float64, rng)
    U=rand(rng)
    te=d.enabling_time
    value=te+quantile(d.relative_distribution,
        U+(1-U)*cdf(d.relative_distribution, t0-te))
    (value, -log(1-U))
end

The next two pieces concern the hazard. The goal is to find the integral of the hazard between two absolute times, \(t_1\) and \(t_2\), where both are at least the enabling time, \(t_1, t_2\ge t_e\). This is

\[\int_{t_1-t_e}^{t_2-t_e} \lambda(s)ds=\int_{0}^{t_2-t_e} \lambda(s)ds -\int_{0}^{t_1-t_e} \lambda(s)ds.\]

In terms of the given methods, this would be, noting the minus sign in the table,

wrappeddistribution.jl
function ConsumeSample(dist::WrappedDistribution, xa, start, finish)
    if xa<0
        xa=0
    end
    xa+HazardIntegral(dist, start, finish)
end

function HazardIntegral(dist::WrappedDistribution, t1, t2)
    # logccdf is log(1-cdf(d, x))
    rel=dist.relative_distribution
    te=dist.enabling_time
    logccdf(rel, t1-te)-logccdf(rel, t2-te)
end

Looking back to Eq. (11),

\[\begin{equation} \int_{t_1}^{t}h_1(s)ds=-\ln(1-U)-\int_0^{t_1}h_0(s)ds, \end{equation}\]

we can label xa the combination of the exponential quantile and the sums of integrals on the right-hand side.

wrappeddistribution.jl
function Putative(dist::WrappedDistribution, when,
        interval, consumed_interval)
    ImplicitHazardIntegral(dist, interval-consumed_interval, when)
end

function ImplicitHazardIntegral(dist::WrappedDistribution, xa, t0)
    rel=dist.relative_distribution
    te=dist.enabling_time
    t=te+invlogccdf(rel, -xa+logccdf(rel, t0-te))
    @assert(t>=t0)
    t
end

Exponential

The exponential distribution is constructed with a hazard rate, even though the internal distributions object uses a scale, which is \(\theta =1/\lambda\),

exponentialdistribution.jl
type TransitionExponential <: TransitionDistribution
    hazard::Float64
    enabling_time::Float64
    TransitionExponential(rate::Real)=new(rate, 0.0)
end

It doesn’t matter how we sample the distribution, as long as we return its quantile. This samples using Base.randexp, which uses the ziggurat method for a sample that’s much faster than inversion. The value returned by randexp is equivalent to \(-\ln(1-U)\).

exponentialdistribution.jl
function MeasuredSample(d::TransitionExponential, now::Float64, rng)
    u=randexp(rng)
    (now+u/d.hazard, u)
end

The hazard integral for constant hazards is \((t_2-t_1)\lambda\).

exponentialdistribution.jl
function HazardIntegral(dist::TransitionExponential, start, finish)
    @assert(finish>=start)
    (finish-start)*dist.hazard
end

function ConsumeSample(dist::TransitionExponential, xa, start, finish)
    xa = xa<0 ? 0 : xa
    xa+HazardIntegral(dist, start, finish)
end

Even inverting the hazard integral is just an increment and a multiplication.

exponentialdistribution.jl
function ImplicitHazardIntegral(dist::TransitionExponential,
        cumulative_hazard, current_time)
    @assert(cumulative_hazard>=0)
    current_time+cumulative_hazard/dist.hazard
end

function Putative(dist::TransitionExponential, when,
        interval, consumed_interval)
    ImplicitHazardIntegral(dist, interval-consumed_interval, when)
end

Weibull

Like the exponential distribution, the Weibull distribution has an integrable hazard rate, which makes implementation straightforward. Unfortunately, the use of the parameter \(\lambda\) in the definition of the Weibull is at odds with our use of it as a hazard rate, but it’s just a scale parameter here.

(12)\[ F(t)=1-\exp\left[-\left(\frac{t-t_e}{\lambda}\right)^k\right]\]

The constructor uses this cdf.

weibulldistribution.jl
type TransitionWeibull <: TransitionDistribution
    parameters::Array{Float64,1}
    te::Float64
    TransitionWeibull(lambda, k)=new([lambda, k], 0.0)
end

From the cdf, the integrated hazard is

\[\Lambda(t)=\int_0^t\lambda(s)ds=\left(\frac{t-t_e}{\lambda}\right)^k\]

The inverse, where we ask when the integral equals \(l_u=-\ln(1-U)\), is

\[t=t_e+ \lambda l_u^{1/k}\]

The version in the code is overachieving because it allows for shifting the distribution.

weibulldistribution.jl
function MeasuredSample(distribution::TransitionWeibull, now::Float64, rng)
    (λ, k)=distribution.parameters
    d=now-distribution.te
    mlogU=randexp(rng)
    if d>0
        # Shifted sample: add the hazard (d/λ)^k already survived by `now`.
        value=distribution.te + λ*(mlogU + (d/λ)^k)^(1/k)
    else
        value=distribution.te + λ*mlogU^(1/k)
    end
    (value, mlogU)
end

Given that the hazard is already integrated in Eq. (12), integrating the hazard is algebraic.

weibulldistribution.jl
function HazardIntegral(dist::TransitionWeibull, last, now)
    (λ, k)=dist.parameters
    if now-dist.te>eps(Float64)
        return ((now-dist.te)/λ)^k - ((last-dist.te)/λ)^k
    else
        return 0.0
    end
end

function ConsumeSample(dist::TransitionWeibull, xa, start, finish)
    xa=(xa<0) ? 0 : xa
    xa+HazardIntegral(dist, start, finish)
end
weibulldistribution.jl
function ImplicitHazardIntegral(dist::TransitionWeibull,
        cumulative_hazard, when)
    (λ, k)=dist.parameters
    if when-dist.te>eps(Float64)
        return dist.te + λ*(cumulative_hazard + ((when-dist.te)/λ)^k)^(1.0/k)
    else
        return dist.te + λ*(cumulative_hazard)^(1.0/k)
    end
end

function Putative(dist::TransitionWeibull, when,
        interval, consumed_interval)
    ImplicitHazardIntegral(dist, interval-consumed_interval, when)
end

Log-Logistic

Working from wikipedia, because Gradstein and Ryzhik is too heavy to lift.

\[F(x;\alpha, \beta)=\frac{1}{1+(x/\alpha)^{-\beta}}.\]

We shift this to

\[F(t, t_e)=\frac{1}{1+((t-t_e)/\alpha)^{-\beta}}.\]

The pdf is

\[f(x;\alpha, \beta)=\frac{(\beta/\alpha)(x/\alpha)^{\beta-1}} {(1+(x/\alpha)^\beta)^2}.\]

The quantile is

\[F^{-1}(p; \alpha, \beta)=\alpha \left(\frac{p}{1-p}\right)^{1/\beta}.\]

Survival

\[G(t)=1-F(t)=\frac{1}{1+(t/\alpha)^\beta}.\]

Hazard

\[\lambda(t)=\frac{f(t)}{G(t)}=\frac{(\beta/\alpha)(t/\alpha)^{\beta-1}} {1+(t/\alpha)^\beta}\]

Lastly, we need invlogccdf(d,lp), which is \(G_d^{-1}(e^{l_p})\), the \(t(l_p)\) such that \(-\int_0^{t(l_p)}\lambda(s)ds=l_p\).

\[\begin{split}\begin{aligned} l_p&=&\ln(G(t)) \\ e^{l_p}&=&G(t) \\ e^{l_p}&=&\frac{1}{1+(t/\alpha)^\beta} \\ e^{-l_p}&=&1+(t/\alpha)^\beta \\ (t/\alpha)^\beta&=& e^{-l_p}-1\\ t/\alpha&=& (e^{-l_p}-1)^{1/\beta}\\ t&=&\alpha(e^{-l_p}-1)^{1/\beta}\\\end{aligned}\end{split}\]
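The last line translates to a one-liner; the function name is ours:

```julia
# invlogccdf for the log-logistic: the t such that ln G(t) = lp, lp ≤ 0.
loglogistic_invlogccdf(α, β, lp) = α*(exp(-lp) - 1)^(1/β)
```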

Gamma

We will parameterize by the shape \(\alpha\) and rate \(\beta\).

\[f(x)=\frac{\beta^\alpha}{\Gamma(\alpha)}x^{\alpha-1}e^{-\beta x}\]

where

\[\Gamma(t)=\int_0^\infty x^{t-1}e^{-x}dx.\]

The CDF is

\[F(x;\alpha,\beta)=\frac{\gamma(\alpha,\beta x)}{\Gamma(\alpha)}\]

where \(\gamma\) is the (lower) incomplete gamma function,

\[\gamma(\alpha, x)=\int_0^x t^{\alpha-1}e^{-t}dt\]

In our back pocket, from Boost::Math, are \(\Gamma(x)\), \(\ln(|\Gamma(x)|)\), digamma, which is

\[\psi(x)=\frac{d}{dx}\ln(\Gamma(x))=\frac{\Gamma'(x)}{\Gamma(x)},\]

gamma ratio, which is \(\Gamma(a)/\Gamma(b)\), gamma delta ratio, which is \(\Gamma(a)/\Gamma(a+\Delta)\), and the set of incomplete gamma functions. In order, they are the normalized lower incomplete, normalized upper incomplete, full (non-normalized) lower incomplete, and full (non-normalized) upper incomplete gamma functions.

\[\begin{split}\begin{aligned} \mbox{gamma\_p}(a,z)&=&\frac{\gamma(a,z)}{\Gamma(a)}=\frac{1}{\Gamma(a)} \int_0^zt^{a-1}e^{-t}dt \\ \mbox{gamma\_q}(a,z)&=&\frac{\Gamma(a,z)}{\Gamma(a)}=\frac{1}{\Gamma(a)} \int_z^\infty t^{a-1}e^{-t}dt \\ \mbox{tgamma\_lower}(a,z)&=&\gamma(a,z)= \int_0^zt^{a-1}e^{-t}dt \\ \mbox{tgamma}(a,z)&=&\Gamma(a,z)= \int_z^\infty t^{a-1}e^{-t}dt \\\end{aligned}\end{split}\]

There is also a set of inverses of the incomplete gamma functions and of their derivatives. OK, back to what we need.

\[\begin{split}\begin{aligned} F(x;\alpha,\beta)&=&\mbox{gamma\_p}(\alpha, \beta x) \\ F^{-1}(y;\alpha,\beta)&=&\mbox{gamma\_p\_inv}(\alpha, y)/\beta\end{aligned}\end{split}\]

The hazard integral, in terms of the cdf, is

\[\begin{split}\begin{aligned} \int_{t_1-t_e}^{t_2-t_e}\lambda(s)ds&=&-\ln(1-F(t_2-t_e))+\ln(1-F(t_1-t_e)) \\ &=& \ln\left[\frac{1-F(t_1-t_e)}{1-F(t_2-t_e)}\right].\end{aligned}\end{split}\]

Can we simplify this into something provided?

\[\begin{split}\begin{aligned} \int_{t_1-t_e}^{t_2-t_e}\lambda(s)ds & = & \ln\left[\frac{1-\frac{\gamma(\alpha,\beta (t_1-t_e))}{\Gamma(\alpha)}}{1-\frac{\gamma(\alpha,\beta (t_2-t_e))}{\Gamma(\alpha)}}\right] \\ & = & \ln\left[\frac{\Gamma(\alpha)-\gamma(\alpha,\beta (t_1-t_e))} {\Gamma(\alpha)-\gamma(\alpha,\beta (t_2-t_e))} \right] \\ \gamma(\alpha,\beta (t_1-t_e)) & = & \int_0^{\beta(t_1-t_e)} t^{\alpha-1}e^{-t}dt\end{aligned}\end{split}\]

It looks like we might do best just with

Ga=tgamma(a)
hazint(te, t1, t2)=log((Ga-tgamma_lower(a,b*(t1-te)))/
    (Ga-tgamma_lower(a,b*(t2-te))))
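The same computation in Julia, as a sketch assuming the SpecialFunctions package, whose `gamma_inc(a, x)` returns the pair of normalized lower and upper incomplete gamma functions \((P, Q)\):

```julia
using SpecialFunctions: gamma_inc

# ln[(1-F(t1-te))/(1-F(t2-te))] with F(t) = gamma_p(a, b·t); Q = 1 - P.
function gamma_hazard_integral(a, b, te, t1, t2)
    log(gamma_inc(a, b*(t1 - te))[2]) - log(gamma_inc(a, b*(t2 - te))[2])
end
```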

Our other goal for Gamma distributions is to get the inverse hazard. This can be seen as two steps. First find the integral

\[l_p=-x-\int_0^{t_0-t_e}\lambda(s)ds.\]

Then solve for \(t'\) in

\[l_p=-\int_0^{t'-t_e}\lambda(s)ds.\]

Or, we could write this as

\[l_e =e^{-x}e^{-\int_0^{t_0-t_e}\lambda(s)ds}=e^{-x}(1-F(t_0-t_e))\]

and

\[l_e=e^{-\int_0^{t'-t_e}\lambda(s)ds}=1-F(t'-t_e).\]

All at once,

\[\begin{split}\begin{aligned} F(t'-t_e)&=&1-e^{-x}(1-F(t_0-t_e)) \\ t'&=&t_e+F^{-1}\left(1-e^{-x}(1-F(t_0-t_e))\right). \\ F(t_0-t_e)&=&\mbox{gamma\_p}(\alpha,\beta(t_0-t_e)) \\ F^{-1}(y)&=&\mbox{gamma\_p\_inv}(\alpha, y)/\beta\end{aligned}\end{split}\]

So here is our inverse hazard integral.

quad=1-exp(-x)*(1-gamma_p(a,b*(t0-te)))
tp=te + gamma_p_inv(a, quad)/b

Uniform Distribution

Maybe this one will be easier. This distribution has two parameters, a start time and an end time, \(t_a\) and \(t_b\). The pdf is constant, \(f(t)=1/(t_b-t_a)\), for \(t_a\le t<t_b\). The CDF is the integral of that, \(F(t)=(t-t_a)/(t_b-t_a)\). The integrated hazard has nonzero cases for \(t_1<t_a<t_2<t_b\), \(t_1<t_a<t_b<t_2\), \(t_a<t_1<t_2<t_b\), and \(t_a<t_1<t_b<t_2\). It is zero for \(t_1<t_2<t_a\) and \(t_a<t_b<t_1<t_2\).

\[\int_{t_1-t_e}^{t_2-t_e}\lambda(s)ds= \ln\left[\frac{1-F(t_1-t_e)}{1-F(t_2-t_e)}\right]\]

If \(t_a\le t_n-t_e<t_b\), then \(F(t_n-t_e)=(t_n-t_e-t_a)/(t_b-t_a)\). Otherwise it is \(0\) or \(1\). It should never be the case that a uniform distribution does not fire before \(t_b\). The hazard integral always sums over time already past in the simulation. Nevertheless, it will be necessary to check for overflow near \(t_b\), and it would help to keep the two logs separated, instead of in the fraction.

What about the inverse of the hazard integral? Since \(F^{-1}(x)=t_a+(t_b-t_a)x\), for \(t_a\le t_0-t_e\),

\[t'=t_e+t_a+(t_b-t_a)\left[1-e^{-x}\left(1-\frac{t_0-t_e-t_a}{t_b-t_a}\right)\right]\]

and for \(t_0-t_e< t_a\),

\[t'=t_e+t_a+(t_b-t_a)\left[1-e^{-x}\right]\]
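Putting the two cases together, a sketch of the putative-time calculation, with hypothetical names, assuming \(t_0-t_e<t_b\); `x` is the remaining cumulative hazard:

```julia
# Inverse hazard integral for the uniform distribution on [ta, tb).
function uniform_putative(ta, tb, te, x, t0)
    if t0 - te < ta
        te + ta + (tb - ta)*(1 - exp(-x))
    else
        F0 = (t0 - te - ta)/(tb - ta)   # F(t0 - te)
        te + ta + (tb - ta)*(1 - exp(-x)*(1 - F0))
    end
end
```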

Triangular Distribution

The cumulative distribution function for the triangular distribution with endpoints \(a\) and \(b\) and midpoint \(m\) is

\[\begin{split}\begin{aligned} \frac{(t-a)^2}{(b-a)(m-a)} & & a\le t \le m \\ 1-\frac{(b-t)^2}{(b-a)(b-m)} & & m<t\le b.\end{aligned}\end{split}\]

This makes the survival

\[\begin{split}\begin{aligned} 1-\frac{(t-a)^2}{(b-a)(m-a)} & & a\le t \le m \\ \frac{(b-t)^2}{(b-a)(b-m)} & & m<t\le b.\end{aligned}\end{split}\]

Simple Sample

The cutoff is at \(t=m\), which corresponds to the quantile

\[\begin{split}\begin{aligned} U'&=&\frac{(m-a)^2}{(b-a)(m-a)} \\ &=&\frac{m-a}{b-a}\end{aligned}\end{split}\]

so first check whether \(U\) is greater than that. Then, for \(U\) less than that,

\[\begin{split}\begin{aligned} t = a + \left[U(b-a)(m-a)\right]^{1/2} & & U\le U' \\ t = b- \left[(1-U)(b-a)(b-m)\right]^{1/2} & & U'<U \\\end{aligned}\end{split}\]
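As a sketch, the two-case inversion above, with hypothetical names:

```julia
# Inversion sampling for the triangular distribution with endpoints a, b
# and midpoint m; U is a uniform draw in [0,1].
function triangular_sample(a, m, b, U)
    Ucut = (m - a)/(b - a)
    if U <= Ucut
        a + sqrt(U*(b - a)*(m - a))
    else
        b - sqrt((1 - U)*(b - a)*(b - m))
    end
end
```

At \(U=U'\) both branches give \(t=m\), which is a quick sanity check.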

Shifted Sample

If this is sampled after some time, \(x\), then the thing we want to invert is

\[U=\frac{F(t)-F(x)}{G(x)}\]

so

\[F(t)=UG(x)+F(x)\]

which, for \(a<t\le m\) and \(a<x\le m\), is

\[\begin{split}\begin{aligned} \frac{(t-a)^2}{(b-a)(m-a)} & = & U\left[1-\frac{(x-a)^2}{(b-a)(m-a)}\right] +\frac{(x-a)^2}{(b-a)(m-a)} \\ (t-a)^2 & = & U(b-a)(m-a)+ (1-U)(x-a)^2 \\ t & = & a + \left[U(b-a)(m-a)+ (1-U)(x-a)^2\right]^{1/2}\end{aligned}\end{split}\]

For \(m<t\le b\) and \(m<x\le b\), this is

\[\begin{split}\begin{aligned} 1-\frac{(b-t)^2}{(b-a)(b-m)} & = & U\frac{(b-x)^2}{(b-a)(b-m)}+ 1-\frac{(b-x)^2}{(b-a)(b-m)} \\ -(b-t)^2 & = & U(b-x)^2-(b-x)^2 \\ t & = & b-(b-x)\sqrt{1-U}\end{aligned}\end{split}\]

In the case that \(m<t\le b\) and \(a<x\le m\), the result is

\[\begin{split}\begin{aligned} 1-\frac{(b-t)^2}{(b-a)(b-m)} & = & U\left[1-\frac{(x-a)^2}{(b-a)(m-a)}\right] + \frac{(x-a)^2}{(b-a)(m-a)}\\ 1-\frac{(b-t)^2}{(b-a)(b-m)} & = &U+\frac{(x-a)^2(1-U)}{(b-a)(m-a)} \\ \frac{(b-t)^2}{(b-a)(b-m)} & = &(1-U)-\frac{(x-a)^2(1-U)}{(b-a)(m-a)} \\ (b-t)^2& = &(1-U)(b-a)(b-m)-\frac{(x-a)^2(1-U)(b-m)}{(m-a)} \\ t & = & b-\left[(1-U)(b-a)(b-m)-\frac{(x-a)^2(1-U)(b-m)}{(m-a)}\right]^{1/2}\end{aligned}\end{split}\]
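The three cases combine into one sampler. A sketch, with hypothetical names, assuming \(a\le x\le b\):

```julia
# Shifted triangular sample: invert U = (F(t)-F(x))/G(x) for t ≥ x.
function triangular_shifted(a, m, b, x, U)
    if x <= m
        # First case: try a < t ≤ m with a < x ≤ m.
        t = a + sqrt(U*(b - a)*(m - a) + (1 - U)*(x - a)^2)
        if t <= m
            return t
        else
            # Third case: the result lands in m < t ≤ b.
            return b - sqrt((1 - U)*((b - a)*(b - m) - (x - a)^2*(b - m)/(m - a)))
        end
    else
        # Second case: m < x ≤ b.
        return b - (b - x)*sqrt(1 - U)
    end
end
```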

Sampling from Quantile

The equation we have to solve is

\[F(t)=1-G(t_j)(1-u)\prod_i\frac{G_i(t_i)}{G_i(t_{i+1})},\]

given to us as

\[F(t)=1-G(t_j)(1-u)\gamma,\]

so, in terms of survivals, it’s

\[G(t)=G(t_j)(1-u)\gamma\]

The value on the right is all known. If \(t_j<m\), the cutoff is

\[G'=\frac{(b-m)^2}{(b-a)(b-m)}=\frac{b-m}{b-a}\]

Below that,