Implementing the Hull-White Model
The Hull-White model12 can be considered the simplest of the usual risk-neutral interest rate models for arbitrary starting yield curves. As such it is the bread-and-butter model many market-consistent valuations - in the German insurance market it is used by the GDV-built Branchensimulationsmodell and - in a risk-neutral two-factor variant - for the risk-return classification for pension savings products.
Below I derive the formula I use for simulation in my Hull-White model Jupyter notebook.
The Hull-White model specifies the short rate $r_t$ to follow the SDE $$ dr_t = \alpha (\mu(t) - r_t) dt + \sigma dW_t $$ where $\alpha >0$ is the mean-reversion speed, $\mu$ the (time-dependent) mean-reversion target , $\sigma >0$ is the volatility, $W_t$ a Wiener process. With constant $\mu$, the process turns into the Ornstein-Uhlenbeck or (in the financial literature) Vasicek process. To derive at a risk neutral model, the price at time $t$ of a zero coupon bond maturing at time $T$ is $$ P(t,T) = E\big[ \exp(-\int_t^T r_t dt) | \mathcal{F}_t\big] $$ and the deflator (the stochastic discount factor, reciprocal of the money-market numéraire) is $$ D(t) = \exp(-\int_0^t r_t dt). $$
Interestingly, even though the Gaussian nature of the Hull-White model allows to sample directly from the distribution of the relevant quantities at the desired time-step3, most implementations I have seen use a time discretization of the SDE such as the Euler scheme that uses approximations and may require a calculation time step that is finer than the desired output time step.
Similarly, the model is often formulated - and implemented - using the time-dependent mean reversion target derived from the instantaneous forward rates directly even though it is well-known that the yield curve can also be accommodated separately4, greatly simplifying the implementation.
Leaving the shifting for later, we can thus set $\mu=0$ and consider $$ dx_t = -\alpha x_t + \sigma dW_t $$ with initial condition $x_0=0$.
To produce Monte Carlo simulations we need to simultaneously sample $x_t$ and $\int_0^t x_s ds$, Say at positive integer times $t$. We will (somewhat inconsistent with the above) assume that $x_0$ is given and we want to calculate the processes at $t$. Note that both are function(al)s of a Gaussian and thus themselves normally distributed. Also, the integral is a path integral and thus the usual integration rules apply.
The standard way5 to solve $x_t$ is to modify the process to make the drift vanish: $$ d(\exp(\alpha t) x_t) = \alpha_t \exp(\alpha t) x_t dt - \exp(\alpha t) dx_t = \exp(\alpha_t) \sigma dW_t $$ thus for $t>0$ we can integrate and get $$ \exp(\alpha t) x_t = x_0 + \sigma \int_0^t \exp(\alpha s) dW_s $$ or $$ x_t = \exp(-\alpha t) x_0 + \sigma \int_0^t \exp(\alpha (t-s)) dW_s. $$ The integral has expectation $0$ so $$ E[x_t | x_0] = x_0 \exp(-\alpha t). $$ and the variance can be calculated by means of the Itô isometry6 (or directly, as the integrand is not stochastic) to be $$ V[x_t] = E[ (\int_0^t \sigma \exp(-\alpha (t-s)) dW_s)^2 ] = \int_0^t \sigma^2 \exp(-2\alpha (t-s)) ds = \frac{\sigma^2}{2 \alpha} (1-\exp(-2\alpha t)). $$ To get the covariance, we consider $s < t$ and use the indicator function $1_{u< s}$, i.e. $1$ if $u< s$ and else $0$. Using the Itô isometry again we obtain $$ \begin{aligned} Cov[x_s, x_t] &= E[(\int_0^s \sigma \exp(-\alpha (s-u)) dW_u) (\int_0^t \sigma \exp(-\alpha (t-u)) dW_u) ] \\ &= E[(\int_0^t 1_{u< s} \sigma \exp(-\alpha (s-u)) dW_u) (\int_0^t \sigma \exp(-\alpha (t-u)) dW_u) ] \\ &= \sigma^2 \int_0^t 1_{u< s} \exp(-\alpha (s+t-2u)) du \\ &=\sigma^2 \int_0^s \exp(-\alpha (s+t-2u)) du \\ &=\frac{\sigma^2}{2\alpha} (\exp(-\alpha (t-s)-\exp(-\alpha (t+s))) \end{aligned} $$ The same calculations apply when $t_0>0$ is substituted for $0$ and subtracted from $s$ and $t$.
We can now turn to the integrated $x_t$ needed for the numéraire development. To compute the expected value we exchange expectation and time-integral and obtain $$ E[\int_0^t x_s ds | x_0] = \int_0^t E[x_s | x_0] ds = x_0 \frac{1}{\alpha} (1-\exp(-\alpha t)). $$ To compute the variance of $\int x_s ds$ we introduce two integration variables. We first use Fubini's Theorem to exchange time integration and expectation, plug in the covariance and stare down the integration to see $$ \begin{aligned} Var[\int_0^t x_s ds] &= E[(\int_0^t x_s ds - E[\int_0^t x_s ds])^2] \\ &= E[(\int_0^t x_s ds - E[ \int_0^t x_s ds]) (\int_0^t x_u du - E[\int_0^t x_u du]) ] \\ &= E[(\int_0^t x_s - E[ x_s] ds) (\int_0^t x_u - E[ x_u ] du) ] \\ &= E[(\int_0^t \int_0^t (x_s - E[ x_s]) ( x_u - E[ x_u ]) du ds) ] \\ &= \int_0^t \int_0^t E[(x_s - E[ x_s]) ( x_u - E[ x_u ])] du ds \\ & =\int_0^t \int_0^t Cov(x_s, x_u) du ds \\ & = 2 \int_0^t \int_0^s Cov(x_s, x_u) du ds \\ & = 2 \int_0^t \int_0^s \frac{\sigma^2}{2 \alpha} \exp(-\alpha s) (\exp(\alpha u)-\exp(-\alpha u)) du ds \\ & = \frac{\sigma^2}{\alpha^2} \int_0^t \exp(- \alpha s) (\exp( \alpha s)+\exp(-\alpha s)-2) ds \\ & = \frac{\sigma^2}{\alpha^2} \int_0^t 1+\exp(-2 \alpha s)-2\exp(-\alpha s) ds \\ & = \frac{\sigma^2}{\alpha^3} (\alpha t + \frac{1}{2} (1-\exp(- 2 \alpha t)) + 2 (\exp(-\alpha s)-1)). \end{aligned} $$
As the last covariance we need that between $x_t$ and its integral, this is easier and we are at operating temperature by now. We calculate $$ \begin{aligned} Cov[x_t, \int_0^t x_s ds] &= \int_0^t Cov(x_t, x_s) ds \\ & = \int_0^t \frac{\sigma^2}{2 \alpha} \exp(-\alpha t) (\exp(\alpha s)-\exp(-\alpha s)) ds \\ & = \frac{\sigma^2}{2 \alpha^2} \exp(-\alpha t) (\exp(\alpha t)+\exp(-\alpha t)-2) ds \\ & = \frac{\sigma^2}{2 \alpha^2} (1-exp(-\alpha t))^2. \end{aligned} $$
We now compute the updates from time $0$ to time $t$ and so want to compute $x_t$ and its integral which we denote $y_t := y_0 + \int_0^t x_t dt$.
To sample from the joint normal distribution with given covariance, we use a pair of independent standard Gaussian samples, $Z_1$ and $Z_2$. Then we set $x_t := x_t exp(-\alpha t) + \sqrt{V[x_t]} Z_1$.
Recall that $Z := \rho Z_1 + \sqrt{1-\rho^2} Z_2$ gives a unit normal distribution with correlation $\rho$ between $Z$ and $Z_2$. Using $$ \rho = \frac{Cov[x_t,y_t]}{\sqrt{V[x_t] V[y_t]}} $$ we set $$ \begin{aligned} y_t &:= y_0 + x_0 \frac{1}{\alpha} (1-\exp(-\alpha t)) + \sqrt{V[y_t]} Z \\ &= y_0 + x_0 \frac{1}{\alpha} (1-\exp(-\alpha \delta t)) + \frac{Cov[x_t,y_t]}{\sqrt{V[x_t]}} Z_1 + \sqrt{V[y_t]-\frac{Cov[x_t,y_t]^2}{V[x_t]}} Z_2. \end{aligned} $$
Phew, that was a bit of hard work, but it saves us from worrying about the discretisation timestep and in my experience the usual predictor-corrector discretisations can be quite tideous as well.
Note that the calculation also gives the price formula for zero coupon bonds. As the discounted value of the payoff $\exp(-\int_0^t x_s ds) = \exp(-y_t)$ is log-normally distributed, its expectation, i.e. the ZCB price, is $$ P(t) = E[\exp(-y_t | x_0] = \exp( - E[y_t | x_0] + \frac{1}{2} V[y_t] ). $$ Note that $V[y_t]$ does not depend on $x_0$ and $E[y_t | x_0]$ is linear in $x_0$. Thus denoting the coefficient of $x_0$ in the expectation by $B(t) := \frac{1}{\alpha} (1-\exp(-\alpha t))$ and setting $A(t) := \exp(\frac{1}{2} V[y_t])$ we obtain the usual $P(t) = A(t) \exp(-B(t) x_t)$.
After this bit of hard calculation, we have the ingredients to simulate the Hull-White-Model by directly sampling the short rate and the deflator.
I have implemented these in pytorch along with Jamshidian's swaption pricing for the 1-Factor model in my Hull-White model Jupyter notebook.
-
John Hull and Alan White, Pricing interest-rate derivative securities, The Review of Financial Studies, Vol 3, No. 4 (1990) pp. 573–592 ↩
-
Wikipedia Hull-White_model ↩
-
I do believe that this has been published in D.T. Gillespie: Exact numerical simulation of the Ornstein-Uhlenbeck process and its integral, Physical review E, 1996, but I have not obtained the article. ↩
-
Brigo, Mercurio: On deterministic shift extensions of short rate models. I have seen this implemented as a "scenario shift tool" that was commonly used to derive scenario sets for sensitivities. ↩