## Interview Questions VII: Integrated Brownian Motion

For a standard Wiener process denoted $W_t$, calculate

${\mathbb E}\Big[\Big(\int_0^T W_t\,dt\Big)^2\Big]$

This integral is the limit of the sum of $W_t$ over each infinitesimal time slice from $0$ to $T$; it is called Integrated Brownian Motion.

We see immediately that this will be a random variable with expectation zero, so the expectation of its square is simply the variance of the random variable, which is what we need to calculate. Here I show two ways to solve this: first by taking the limit of the sum, and second using stochastic integration by parts.


1) Limit of a sum

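$\int_0^T W_t\,dt = \lim_{n\to\infty} \sum_{i=1}^{n} W_{iT/n} \cdot {T \over n} \qquad (1)$

The values of the Wiener process in this sum are not independent of one another, since only the increments are independent. However, we can re-write them in terms of sums of these independent increments

$W_{iT/n} = \sum_{j=1}^{i} \big( W_{jT/n} - W_{(j-1)T/n} \big) = \sum_{j=1}^{i} \Delta W_j \qquad (2)$

where $\Delta W_j$ are the individual independent increments of the brownian motion, each with variance $T/n$. Substituting into our previous equation and reversing the order of the summation

$\int_0^T W_t\,dt = \lim_{n\to\infty} {T \over n} \sum_{i=1}^{n} \sum_{j=1}^{i} \Delta W_j = \lim_{n\to\infty} {T \over n} \sum_{j=1}^{n} (n - j + 1)\, \Delta W_j \qquad (3)$

which is simply a weighted sum of independent gaussians. To calculate the total variance, we sum the individual variances using the summation formula for $\sum_{k=1}^{n} k^2 = {1 \over 6}n(n+1)(2n+1)$

${\mathbb E}\Big[\Big(\int_0^T W_t\,dt\Big)^2\Big] = \lim_{n\to\infty} {T^2 \over n^2} \cdot {T \over n} \sum_{j=1}^{n} (n-j+1)^2 = \lim_{n\to\infty} {T^3 \over n^3} \cdot {n(n+1)(2n+1) \over 6} = {T^3 \over 3} \qquad (4)$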

which is the solution.

2) Stochastic integration by parts

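The stochastic version of integration by parts for potentially stochastic variables $X_t$, $Y_t$ looks like this:

$X_T Y_T = X_0 Y_0 + \int_0^T X_t\,dY_t + \int_0^T Y_t\,dX_t + \int_0^T dX_t\,dY_t$

Re-arranging this gives

$\int_0^T Y_t\,dX_t = X_T Y_T - X_0 Y_0 - \int_0^T X_t\,dY_t - \int_0^T dX_t\,dY_t$

Now setting $X_t = t$ and $Y_t = W_t$, the cross-variation term vanishes (since $dt\,dW_t = 0$) and we have

$\int_0^T W_t\,dt = T\,W_T - \int_0^T t\,dW_t = \int_0^T (T - t)\,dW_t \qquad (5)$

We recognise this as a weighted sum of independent gaussian increments, which is (as expected) a gaussian variable with expectation 0 and variance that we can calculate with the Ito isometry

${\mathbb E}\Big[\Big(\int_0^T (T-t)\,dW_t\Big)^2\Big] = \int_0^T (T-t)^2\,dt = {T^3 \over 3} \qquad (6)$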

which is the solution.
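As a quick numerical sanity check of the $T^3/3$ answer, here is a minimal Monte Carlo sketch in Python. The function name, the number of steps and paths, and the seed are all my own illustrative choices; it just approximates the integral by the right-endpoint sum used in the first method, so expect a small discretisation bias on top of the sampling noise.

```python
import numpy as np

def integrated_bm_second_moment(T=2.0, n_steps=200, n_paths=20_000, seed=0):
    """Monte Carlo estimate of E[(int_0^T W_t dt)^2] via a Riemann sum."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Independent increments, each distributed N(0, dt)
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    # Value of W at the right-hand end of each time slice
    W = np.cumsum(dW, axis=1)
    # Riemann-sum approximation to the time integral, one value per path
    integral = W.sum(axis=1) * dt
    return float(np.mean(integral**2))

estimate = integrated_bm_second_moment()
exact = 2.0**3 / 3
print(estimate, exact)
```

The two printed numbers should agree to within a few percent; increasing `n_paths` tightens the sampling error and increasing `n_steps` reduces the discretisation bias.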

## Fitting the initial discount curve in a stochastic rates model

I’ve introduced the Vasicek stochastic rates model in an earlier post, and here I’m going to introduce a development of it called the Hull-White (or sometimes Hull-White extended Vasicek) model.

The rates are modelled by a mean-reverting stochastic process

$dr = a \big[ \theta(t) - r(t)\big] dt + \sigma dW_t$

which is similar to the Vasicek model, except that the $\theta$ term is now allowed to vary with time (in general $a$ and $\sigma$ are too, but I’ll ignore those issues for today).

The freedom to set theta as a deterministic function of time allows us to calibrate the model to match the initial discount curve, which means at least initially our model will give the right price for things like FRAs. Calibrating models to match market prices is one of the main things quants do. Of course, the market doesn’t really obey our model. This means that, in time, the market prices and the prices predicted by our model will drift apart, and the model will need to be re-calibrated. But the better a model captures the various risk factors in the market, the less often this procedure will be needed.

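Using the trick

$d\big( e^{at} r(t) \big) = e^{at}\,dr + a e^{at} r(t)\,dt$

to re-express the equation and integrating gives

$r(t) = r_0 e^{-at} + a\int_0^t e^{-a(t-u)}\theta(u)\,du + \sigma\int_0^t e^{-a(t-u)}\,dW_u$

where $r_0$ is the rate at $t=0$. The observable quantities from the discount curve are the initial discount factors (or equivalently the initial forward rates) ${\rm ZCB}(0,T)$, where

${\rm ZCB}(0,T) = {\mathbb E}\Big[ \exp{\Big\{-\int_0^T r(t)\,dt\Big\}} \Big]$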

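The rate $r(t)$ is normally distributed, so the integral $\int_0^T r(t)\,dt$ must be too. This is because an integral is essentially a sum, and a sum of jointly normal variables is also normally distributed. Applying the Ito isometry as discussed before, the expectation of this variable will come wholly from the deterministic terms and the variance will come entirely from the stochastic terms, giving

${\mathbb E}\Big[ \int_0^T r(t)\,dt \Big] = r_0 B(0,T) + a\int_0^T \theta(u) B(u,T)\,du$

${\mathbb V}\Big[ \int_0^T r(t)\,dt \Big] = \sigma^2 \int_0^T B(u,T)^2\,du$

where throughout

$B(t,T) = {1 \over a}\big( 1 - e^{-a(T-t)} \big)$

and since

${\mathbb E}\big[ e^{x} \big] = e^{\mu + {1\over 2}\sigma_x^2}$ for $x \sim {\mathbb N}(\mu, \sigma_x^2)$

we have

${\rm ZCB}(0,T) = \exp{\Big\{ -r_0 B(0,T) - a\int_0^T \theta(u) B(u,T)\,du + {\sigma^2 \over 2}\int_0^T B(u,T)^2\,du \Big\}}$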

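Two differentiations of this expression give

$-{\partial \over \partial T}\ln{\rm ZCB}(0,T) = r_0 e^{-aT} + a\int_0^T \theta(u) e^{-a(T-u)}\,du - {\sigma^2 \over 2a^2}\big(1 - e^{-aT}\big)^2$

$-{\partial^2 \over \partial T^2}\ln{\rm ZCB}(0,T) = -a r_0 e^{-aT} + a\theta(T) - a^2\int_0^T \theta(u) e^{-a(T-u)}\,du - {\sigma^2 \over a}\big(1 - e^{-aT}\big)e^{-aT}$

and combining these equations gives an expression for $\theta(T)$ that exactly fits the initial discount curve for the given currency

$\theta(T) = -{1 \over a}{\partial^2 \over \partial T^2}\ln{\rm ZCB}(0,T) - {\partial \over \partial T}\ln{\rm ZCB}(0,T) + {\sigma^2 \over 2a^2}\big(1 - e^{-2aT}\big)$

and since $f(0,T) = -{\partial \over \partial T}\ln{\rm ZCB}(0,T)$ is simply the initial market observed forward rate to each time horizon coming from the discount curve, this can be compactly expressed as

$\theta(T) = f(0,T) + {1 \over a}{\partial f(0,T) \over \partial T} + {\sigma^2 \over 2a^2}\big(1 - e^{-2aT}\big)$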
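To see the calibration doing its job, here is a small Python sketch. It assumes the SDE is written as $dr = a[\theta(t) - r(t)]dt + \sigma dW_t$ (so $\theta(t) = f(0,t) + {1\over a}\partial_t f(0,t) + {\sigma^2 \over 2a^2}(1 - e^{-2at})$), and a deliberately simple made-up market: a flat forward curve at `f0`. The function names and parameter values are purely illustrative. If the fit is right, the model ZCB price should reproduce the market price $e^{-f_0 T}$.

```python
import numpy as np

def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def theta_fit(t, f, dfdt, a, sigma):
    """theta(t) implied by the initial forward curve f(0,t) and its slope."""
    return f + dfdt / a + sigma**2 / (2 * a**2) * (1 - np.exp(-2 * a * t))

def model_zcb_flat_curve(T, f0, a, sigma, n=20_001):
    """ZCB(0,T) from the model when the market forward curve is flat at f0."""
    u = np.linspace(0.0, T, n)
    B = (1 - np.exp(-a * (T - u))) / a                 # B(u,T) on the grid
    theta = theta_fit(u, f0, 0.0, a, sigma)            # flat curve, so df/dt = 0
    mean = f0 * B[0] + a * trapezoid(theta * B, u)     # r_0 = f(0,0) = f0
    var = sigma**2 * trapezoid(B**2, u)
    return float(np.exp(-mean + 0.5 * var))

T, f0, a, sigma = 5.0, 0.03, 0.1, 0.01
model_price = model_zcb_flat_curve(T, f0, a, sigma)
market_price = float(np.exp(-f0 * T))
print(model_price, market_price)
```

The volatility-dependent part of $\theta$ is exactly what is needed to cancel the convexity term ${\sigma^2 \over 2}\int B^2$ in the bond price, which is why the two prices line up whatever value of `sigma` is used.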

Today we’ve seen how a simple extension to the ‘basic’ Vasicek model allows us to match the initial discount curve seen in the market. Allowing the volatility parameter to vary will allow us to match market prices of other products such as swaptions (an option to enter into a swap), which I’ll discuss another time. But we’re gradually building up a suite of simple models that we can combine later to model much more complicated environments.

## Interview Questions VI: The Drunkard’s Walk

A drunk man is stumbling home at night after closing time. Every step he takes moves him either 1 metre closer or 1 metre further away from his destination, with an equal chance of going in either direction (!). His home is 70 metres down the road, but unfortunately, there is a cliff 30 metres behind him at the other end of the street. What are the chances that he makes it to his house BEFORE tumbling over the edge of the cliff?

This is a fun question and quite heavy on conditional probability. We are trying to find the probability that the drunkard has moved 70 metres forward BEFORE he has ever moved 30 metres backward, or vice versa. There are several ways of attempting this, including some pretty neat martingale maths, but I’m going to attempt it here in the language of matrices and markov chains.

Essentially, there are 101 states that the drunkard can be in, from the cliff edge all the way down the road to his front door. Let’s label these from 0 to 100, in terms of the number of metres away from the cliff, so that he starts in state 30. At each step, he transitions from his current state to either one state higher or one state lower with probability 50%, and the process continues until he reaches either the cliff or the door, at which point the process will cease in either a good or a very bad night’s rest. We call these two states, 0 and 100, ‘absorbers’, because the process stops at this point and no transitions to new states can happen. A markov diagram that illustrates this process can be drawn like this:

We can characterise each step in the process by a transition matrix acting on a state vector. The drunkard initially has a state of 30 metres, so his state vector is a long string of zeroes with a single 1 at the 30th position:

$S_0 = \begin{pmatrix} 0 \\ \vdots \\ 1\\ \vdots\\ 0 \end{pmatrix}$

This vector is probabilistic – a 1 indicates with 100% certainty that the drunkard is in the 30th state. However, with subsequent moves this probability will diffuse into the nearby states. The transition matrix multiplies the drunkard’s state vector to give his state vector after another step:

$P = \begin{pmatrix} 1 & 0.5 & 0 & \cdots & 0 & 0\\ 0 & 0 & 0.5 & \cdots & 0 & 0\\ 0 & 0.5 & 0 & \cdots & 0 & 0\\ 0 & 0 & 0.5 & & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\ 0 & 0 & 0 & \cdots & 0.5 & 1 \end{pmatrix}; \quad S_{i+1} = P \cdot S_i$

So, after one step the drunkard’s state vector will have a 0.5 in the 29th and the 31st position and zeroes elsewhere, saying that he will be in either of these states with probability 0.5, and certainly nowhere else. Note that the 1’s at the top and bottom of the transition matrix will absorb any probability that arrives at that state for the rest of the process.

To keep things simple, let’s consider a much smaller problem with only six states, where state 1 is ‘down the cliff’ and state 6 is ‘home’; and we’ll come back to the larger problem at the end. We want to calculate the limit of the drunkard’s state as the transition matrix is applied a large number of times, ie. to calculate

$S_n = P^n \cdot S_0$

An efficient way to calculate powers of a large matrix is to first diagonalise it. We have $P = U A\ U^{-1}$, where $A$ is a diagonal matrix whose diagonal elements are the eigenvalues of $P$, and $U$ is the matrix whose columns are the eigenvectors of $P$. Note that, as $P$ is not symmetric, $U^{-1}$ will have row vectors that are the LEFT eigenvectors of $P$, which in general will be different to the right eigenvectors. It is easy to see how to raise $P$ to the power n

$P^n = (U A\ U^{-1})^n = U A^n\ U^{-1}$

since all of the $U$s and $U^{-1}$s in the middle cancel. Since $A$ is diagonal, to raise it to the power of n we simply raise each element (which are just the eigenvalues of $P$) to the power of n. To calculate the eigenvalues of P we solve the characteristic equation

$|P - \lambda_a I| = 0$

with

$P = \begin{pmatrix} 1 & 0.5 & 0 & 0 & 0 & 0\\ 0 & 0 & 0.5 & 0 & 0 & 0\\ 0 & 0.5 & 0 & 0.5 & 0 & 0\\ 0 & 0 & 0.5 & 0 & 0.5 & 0 \\ 0 & 0 & 0 & 0.5 & 0 & 0\\ 0 & 0 & 0 & 0 & 0.5 & 1 \end{pmatrix}$

This gives six eigenvalues, two of which are one (these will turn out to correspond to the two absorbing states) and the remainder all satisfy $|\lambda_a| < 1$ (some of them are negative). Consequently, when raised to the power n in the diagonal matrix above, all of the terms except the two eigenvalues equal to 1 will disappear as n becomes large.

Calculating the eigenvectors is time consuming and we’d like to avoid it if possible. Luckily, in the limit that n gets large, we’ve seen that most of the eigenvalues raised to the power n will go to zero which will reduce significantly the amount we need to calculate. We have

$P^n\cdot S = \bigl( U A^n \ U^{-1}\bigr)\cdot S$

and in the limit that n gets large, $A^n$ is just a matrix of zeroes with a one in the upper left and lower right entry. At first sight it looks as though calculating $U^{-1}$ will be required, which is itself an eigenvector problem, but in fact we only have to calculate a single eigenvector – the first (or equivalently the last), which will give the probability of evolving from an initial state S to a final state 0 (or 100).

$U^{-1}$ is the matrix of left eigenvectors of $P$, each of which satisfies

$x_a \cdot P = \lambda_a x_a$

and we are looking for the eigenvectors corresponding to eigenvalue of 1, so we need to solve the matrix equation

$x \cdot \begin{pmatrix} 0 & 0.5 & 0 & 0 & 0 & 0\\ 0 & -1 & 0.5 & 0 & 0 & 0\\ 0 & 0.5 & -1 & 0.5 & 0 & 0\\ 0 & 0 & 0.5 & -1 & 0.5 & 0 \\ 0 & 0 & 0 & 0.5 & -1 & 0\\ 0 & 0 & 0 & 0 & 0.5 & 0 \end{pmatrix} = 0$

We know that if he starts in the first state (ie. over the cliff) he must finish there (trivially, after 0 steps) with probability 100%, and if he starts in the last state (at home) he can never reach the cliff, so that $x_1 = 1$ and $x_6 = 0$. The solution to this is

$x_n = {6 - n \over 5}$

which is plotted here for each initial state

which says that the probability of ending up in state 1 (down the cliff!) falls linearly with starting distance from the cliff. We can scale up this final matrix equation to the original 100 by 100 state space, and find that for someone starting in state 30, there is a 70/100 chance of ending up down the cliff, and consequently only a 30% chance of getting home!
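If you would rather let the computer grind through the matrix powers than find eigenvectors, here is a small Python sketch. It builds the same column-convention transition matrix as above for states 0 to 100 and raises it to a large power by repeated squaring (via `numpy.linalg.matrix_power`), which is a brute-force stand-in for the diagonalisation argument rather than the same method. The function names and the choice of $2^{20}$ steps are my own.

```python
import numpy as np

def transition_matrix(n_states):
    """Column-stochastic P for a fair walk on states 0..n_states-1 with absorbing ends."""
    P = np.zeros((n_states, n_states))
    P[0, 0] = 1.0                # the cliff absorbs
    P[-1, -1] = 1.0              # home absorbs
    for j in range(1, n_states - 1):
        P[j - 1, j] = 0.5        # from state j, step towards the cliff
        P[j + 1, j] = 0.5        # from state j, step towards home
    return P

def long_run_state(n_states, start, n_doublings=20):
    """Apply P^(2^n_doublings) to a state vector concentrated on `start`."""
    P = transition_matrix(n_states)
    S0 = np.zeros(n_states)
    S0[start] = 1.0
    Pn = np.linalg.matrix_power(P, 2**n_doublings)
    return Pn @ S0

final = long_run_state(101, 30)
p_cliff, p_home = final[0], final[-1]
print(p_cliff, p_home)
```

After that many steps essentially all of the probability has been absorbed at the two ends, and the two printed values should be very close to 0.7 and 0.3.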

This problem is basically the same as the Gambler’s ruin problem, where a gambler in a casino stakes $1 on each toss of a coin and leaves after reaching a goal of $N or when broke. There are some very neat martingale methods for solving it that don’t use the mechanics above, which I’ll look at in a future post.

## Forwards vs. Futures

I’ve covered Forwards and Futures in previous posts, and now that I’ve covered the basics of Stochastic Interest Rates as well, we can have a look at the difference between Forwards and Futures Contracts from a financial perspective.

As discussed before, the price of a Forward Contract is enforceable by arbitrage if the underlying is available and freely storable and there are Zero Coupon Bonds available to the Forward Contract delivery time. In this case, the forward price is

$F(t,T) = S(t) \cdot {1 \over {\rm ZCB}(t,T)}$

In this post I’m going to assume a general interest rate model, which in particular may well be stochastic. In such cases, the price of a ZCB at the present time is given by

${\rm ZCB}(t,T) = {\mathbb E}\Big[ \exp{\Big\{-\int_t^T r(t') dt'\Big\} } \Big]$

Futures Contracts are a bit more complicated, and we need to extend our earlier description in the case that there are interest rates. The basic description was given before, but additionally in the presence of interest rates, any deposit that is in either party’s account is delivered to the OTHER party at the end of each time period. So, taking the example from the previous post, on day 4 we had $4 on account with the exchange – if rates on that day were 10% p.a., over that day the $4 balance would accrue about 0.1c interest, which would be paid to the other party.

Let’s say we’re at time s, and want to calculate the Futures price to time T. Our replication strategy is now as follows, following the classic proof due to Cox, Ingersoll and Ross but in continuous time. Futures Contracts are free to enter into and break out of due to the margin in each account, so entering X Futures Contracts at time t and closing them at time t+dt will lead to a net receipt (or payment if negative) of ${\rm X}\cdot\big[ H(t+dt,T) - H(t,T)\big]$. From t+dt to T, we invest (borrow) this amount at the short rate and thus receive (need to pay), in discrete time with step $\tau$,

${\rm X}\cdot\big[ H(t+\tau,T) - H(t,T)\big]\cdot\prod_t^T \big( 1 + r(t)\tau \big)$

and now moving to continuous time

${\rm X}\cdot\big[ H(t+dt,T) - H(t,T)\big]\cdot e^{\int_t^T r(t')dt'}$

We follow this strategy in continuous time, constantly opening contracts and closing them in the following time period [I’m glossing over discrete vs. continuous time here – as long as the short rate corresponds to the discrete time step involved this shouldn’t be a problem], and investing our profits and financing our losses both at the corresponding short rate. We choose a different X for each period [t,t+dt] so that ${\rm X}(t) = \exp{\{\int_s^t r(t')dt'\}}$. We also invest an amount H(s,T) at time s at the short rate, and continually roll this over so that it is worth $H(s,T)\cdot \exp{\{\int_s^T r(t)dt\}}$ at time T.

Only the final step of this strategy costs money to enter, so the net price of the portfolio and trading strategy is H(s,T). The net payoff at expiry is

$H(s,T)\cdot e^{\int_s^T r(t')dt'} + \sum_s^T {\rm X}(t)\cdot[H(t+dt,T)-H(t,T)]\cdot e^{\int_t^T r(t')dt'}$

$= H(s,T)\cdot e^{\int_s^T r(t')dt'} + \sum_s^T e^{\int_s^t r(t')dt'}\cdot[H(t+dt,T)-H(t,T)]\cdot e^{\int_t^T r(t')dt'}$

$= H(s,T)\cdot e^{\int_s^T r(t')dt'} + e^{\int_s^T r(t')dt'} \cdot \sum_s^T [H(t+dt,T)-H(t,T)]$

$= H(s,T)\cdot e^{\int_s^T r(t')dt'} + e^{\int_s^T r(t')dt'} \cdot [H(T,T)-H(s,T)]$

$= H(T,T) \cdot e^{\int_s^T r(t')dt'}$

And H(T,T) is S(T), so the net payoff of a portfolio costing H(s,T) is

$= S(T) \cdot e^{\int_s^T r(t')dt'}$
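The telescoping in this argument holds path by path, so it is easy to check numerically. Below is a minimal discrete-time Python sketch. The conventions are my own assumptions: each period of length `tau` has a known short rate, the number of contracts held over a period equals the value of the rolled-over cash account at the end of that period, and the rate and futures paths are just arbitrary made-up random numbers. The terminal value of the strategy should equal the final futures price times the total cash growth factor.

```python
import numpy as np

def replication_payoff(H, r, tau):
    """Terminal value of the roll-over strategy for one path.

    H   : futures prices H(t_0,T), ..., H(t_N,T), length N+1
    r   : short rate applying over each of the N periods, length N
    tau : length of each period in years
    """
    growth = 1.0 + r * tau             # one-period cash growth factors
    cum = np.cumprod(growth)           # cash account value at the end of each period
    total = cum[-1]                    # growth of one unit of cash from t_0 to T
    X = cum                            # contracts held over each period
    gains = X * np.diff(H)             # margin received at the end of each period
    to_expiry = total / cum            # growth factor from end of period to T
    return H[0] * total + float(np.sum(gains * to_expiry))

rng = np.random.default_rng(1)
N, tau = 250, 1.0 / 250
r = 0.03 + 0.01 * rng.standard_normal(N)                     # made-up rate path
H = 100.0 + np.cumsum(np.r_[0.0, rng.standard_normal(N)])    # made-up futures path
payoff = replication_payoff(H, r, tau)
expected = float(H[-1] * np.prod(1.0 + r * tau))
print(payoff, expected)
```

Nothing here depends on how the two paths were generated, which is the point: the identity is pure bookkeeping, and the modelling only enters when we take expectations.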

How does this differ from a portfolio costing the Forward price? Remember that in Risk-Neutral Valuation, the present value of an asset is equal to the expectation of its future value discounted by a numeraire. In the risk-neutral measure, this numeraire is a unit of cash B continually re-invested at the short rate, which is worth $B(t,T) = e^{\int_t^T r(t')dt' }$. The portfolio above costs H(s,T) and pays S(T) times exactly this growth factor, so its discounted payoff is just S(T), and we see that the Futures Price is a martingale in the risk-neutral measure (sometimes called the ‘cash measure’ because of its numeraire). So the current Futures price on some underlying should be

$H(t,T) = {\mathbb E}^{\rm RN}\big[ S(T) | {\cal F}_t \big]$

ie. the undiscounted expectation of the future spot in the risk-neutral measure. The Forward Price is instead the expected price in the T-forward measure whose numeraire is a ZCB expiring at time T

$F(t,T) = {\mathbb E}^{\rm T}\big[ S(T) | {\cal F}_t \big]$

We can express these in terms of each other remembering F(T,T) = S(T) = H(T,T) and using a change of numeraire (post on this soon!). I also use the expression for two correlated lognormals, which I derived at the bottom of this post

\begin{align*}
F(t,T) &= {\mathbb E}^{T}\big[ F(T,T) | {\cal F}_t \big] \\
&= {\mathbb E}^{T}\big[ S(T) | {\cal F}_t \big] \\
&= {\mathbb E}^{T}\big[ H(T,T) | {\cal F}_t \big] \\
&= {\mathbb E}^{RN}\big[ H(T,T) {B(t)\over B(T)} {{\rm ZCB}(T,T)\over {\rm ZCB}(t,T)}| {\cal F}_t \big] \\
&= {1 \over {\rm ZCB}(t,T)}{\mathbb E}^{RN}\big[ H(T,T) {B(t)\over B(T)}| {\cal F}_t \big] \\
&= {1 \over {\rm ZCB}(t,T)}{\mathbb E}^{RN}\big[ H(T,T) | {\cal F}_t \big] {\mathbb E}^{RN}\big[ e^{-\int_t^T r(t')dt'} | {\cal F}_t \big] e^{\sigma_H \sigma_B \rho} \\
&= H(t,T) \cdot e^{\sigma_H \sigma_B \rho}
\end{align*}

where $\sigma_H$ is the volatility of the Futures price, and $\sigma_B$ is the volatility of a ZCB – in general the algebra will be rather messy!

As a concrete example, let’s consider the following model for asset prices, with S driven by a geometric brownian motion and rates driven by the Vasicek model discussed before

${dS \over S} = r(t) dt + \sigma_S dW_t$

$dr = a \big[ \theta - r(t)\big] dt + \sigma_r \widetilde{dW_t}$

And (critically) assuming that the two brownian processes are correlated according to rho

$dW_t \cdot \widetilde{dW_t} = \rho dt$

In this case, the volatility $\sigma_B$ is the volatility of ${\mathbb E}\big[ e^{-\int_t^T r(t')dt'}\big]$, and as I discussed in the post on stochastic rates, this is tractable and lognormally distributed in this model.

We can see that in the case of correlated stochastic rates, these two prices are not the same – which means that Futures and Forward Contracts are fundamentally different financial products.

For two standard normal variates x and y with correlation rho, we have:

\begin{align*} {\mathbb E}\big[ e^ {\sigma_1 x} \big]& = e^ {{1\over 2}\sigma_1^2 } \end{align*}

and, writing $y = \rho x + \sqrt{1-\rho^2}z$ with z a standard normal variate independent of x,

\begin{align*}
{\mathbb E}\big[ e^ {\sigma_1 x + \sigma_2 y} \big]& = {\mathbb E}\big[ e^ {\sigma_1 x + \sigma_2 \rho x + \sigma_2 \sqrt{1-\rho^2}z} \big]\\
& = {\mathbb E}\big[ e^ {(\sigma_1 + \sigma_2 \rho) x + \sigma_2 \sqrt{1-\rho^2}z} \big]\\
& = \big[ e^ {{1\over 2}(\sigma_1 + \sigma_2 \rho)^2 + {1\over 2}(\sigma_2 \sqrt{1-\rho^2})^2} \big]\\
& = \big[ e^ {{1\over 2}\sigma_1^2 + {1\over 2}\sigma_2^2 + \sigma_1 \sigma_2 \rho} \big]\\
& = {\mathbb E}\big[ e^ {\sigma_1 x }\big] {\mathbb E} \big[ e^{\sigma_2 y}\big] e^{ \sigma_1 \sigma_2 \rho}
\end{align*}
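The correlated lognormal identity is easy to check by brute force. Here is a minimal Monte Carlo sketch in Python; the parameter values, sample size and seed are arbitrary choices of mine, and the correlated pair is built from independent normals in the same way as in the derivation.

```python
import numpy as np

def check_correlated_lognormal(s1=0.3, s2=0.2, rho=-0.5, n=1_000_000, seed=0):
    """Compare a Monte Carlo estimate of E[exp(s1*x + s2*y)] with the closed form."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    z = rng.standard_normal(n)                    # independent of x
    y = rho * x + np.sqrt(1 - rho**2) * z         # corr(x, y) = rho
    mc = float(np.mean(np.exp(s1 * x + s2 * y)))
    closed_form = float(np.exp(0.5 * s1**2 + 0.5 * s2**2 + s1 * s2 * rho))
    return mc, closed_form

mc, closed_form = check_correlated_lognormal()
print(mc, closed_form)
```

With a negative `rho` the cross term pulls the expectation below the product of the two individual expectations, which is the same mechanism that separates the Futures and Forward prices above.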

## Stochastic Rates Models

In a previous post, I introduced the Black-Scholes model, in which the price of an underlying stock is modeled with a stochastic variable that changes unpredictably with time. I’ve also discussed the basic model-independent rates products whose value can be determined at the present time exactly. However, to progress further with interest rate derivatives, we’re going to need to model interest rates more carefully. We’ve assumed rates are deterministic so far, but of course this isn’t true – just like stocks, they change with time in an uncertain manner, so we need to allow them to become stochastic as well.

One way of doing this is by analogy with the BS case, by allowing the short rate (which is the instantaneous risk-neutral interest rate $r(t)$) to become stochastic as well, and then integrating it over the required period of time to calculate forward rates.

A very basic example of this is the Vasicek Model. In this model the short rate is defined to be stochastic, with behaviour governed by the following SDE

$dr = a \big[ \theta - r(t)\big] dt + \sigma dW_t$

where $a$, $\theta$ and $\sigma$ are constants and $dW_t$ is the standard Wiener increment as described before. This is marginally more complicated than the BS model, but still belongs to the small family of SDEs that are analytically tractable. Unlike stock prices, we expect rates to be mean-reverting – stock price variance grows with time, but we expect the distribution of rates to be confined to a fairly narrow range by comparison. The term in square brackets above achieves this, since if $r(t)$ is greater than $\theta$ then it will be negative and cause the rate to be pulled down, while if it is below $\theta$ the term will be positive and push the rate up towards $\theta$.

Solving the equation requires a trick, which is instead of thinking about the rate alone to think about the quantity $d\big(e^{at} r\big)$. This is equal to $e^{at}\,dr + a e^{at} r\,dt$, and substituting in the incremental rate term from the original equation we have

$d\big(e^{at} r\big) = e^{at}\big[ a\theta\,dt + \sigma dW_t \big]$

Note that the term in $r$ has been cancelled out, and the remaining terms can be integrated directly from a starting time $0$ to a finishing time $t$

$r(t) = r_0 e^{-at} + \theta\big(1 - e^{-at}\big) + \sigma\int_0^t e^{-a(t-u)}\,dW_u$

where $r_0$ is the initial rate. This is simply a gaussian distribution with mean and variance given by

${\mathbb E}\big[r(t)\big] = r_0 e^{-at} + \theta\big(1 - e^{-at}\big)$

${\mathbb V}\big[r(t)\big] = {\sigma^2 \over 2a}\big(1 - e^{-2at}\big)$

where the variance is calculated using the Ito isometry

${\mathbb E}\Big[\Big(\int_0^t f(u)\,dW_u\Big)^2\Big] = \int_0^t f(u)^2\,du$

as stated above. Note that this allows the possibility of rates going negative, which is generally considered to be a weakness of the model, but the chance is usually rather small.
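Here is a rough Python sketch that simulates the short rate and compares the sample mean and variance with the standard Vasicek results. It assumes the SDE in the form $dr = a[\theta - r]dt + \sigma dW_t$ and uses a plain Euler-Maruyama scheme; the function name and all parameter values are illustrative choices of mine, and the scheme carries a small time-stepping bias on top of the Monte Carlo noise.

```python
import numpy as np

def simulate_vasicek(r0, a, theta, sigma, t, n_steps=400, n_paths=50_000, seed=0):
    """Euler-Maruyama paths of dr = a(theta - r)dt + sigma dW; returns r(t) for each path."""
    rng = np.random.default_rng(seed)
    dt = t / n_steps
    r = np.full(n_paths, r0, dtype=float)
    for _ in range(n_steps):
        # Mean-reverting drift plus a gaussian shock with variance sigma^2 * dt
        r += a * (theta - r) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
    return r

r0, a, theta, sigma, t = 0.01, 0.5, 0.05, 0.02, 2.0
r_t = simulate_vasicek(r0, a, theta, sigma, t)

# Closed-form mean and variance of r(t) for comparison
mean_exact = r0 * np.exp(-a * t) + theta * (1 - np.exp(-a * t))
var_exact = sigma**2 / (2 * a) * (1 - np.exp(-2 * a * t))
print(r_t.mean(), mean_exact)
print(r_t.var(), var_exact)
print((r_t < 0).mean())   # fraction of paths with a negative rate at time t
```

The last line gives a feel for how often rates go negative for a given parameter set.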

As we know the distribution of the short rate, we can calculate some other relevant quantities. Of primary importance are the Zero Coupon Bonds, which are required for calculation of forward interest rates. A ZCB is a derivative that pays \$1 at a future time $T$, and we can price this using the Risk-Neutral Valuation technique. According to the fundamental theorem of asset pricing, the current price of a derivative divided by our choice of numeraire must be equal to its future expected price at any time divided by the value of the numeraire at that time, with the expectation taken in the probability measure corresponding to the choice of numeraire. In the risk-neutral measure, the numeraire is just a unit of currency, initially worth \$1 but continually re-invested at the instantaneous short rate, so that its price at time $t$ is \$1 $\cdot\, e^{\int_0^t r(u)du}$. Now, the price of the ZCB is given by

${\rm ZCB}(t,T)\, e^{-\int_0^t r(u)du} = {\mathbb E}^{\rm RN}\Big[ {\rm ZCB}(t',T)\, e^{-\int_0^{t'} r(u)du} \Big| {\cal F}_t \Big]$

Although the RHS is true at any time $t'$, we only know the value of the ZCB exactly at a single time at the moment – the expiry date $T$, at which it is worth exactly \$1. Plugging in these values we have

${\rm ZCB}(t,T) = {\mathbb E}^{\rm RN}\Big[ e^{-\int_t^T r(u)du} \Big| {\cal F}_t \Big]$

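So the ZCB is given by the expectation of the exponential of the integral of the rate over a period of time. Since the rate is itself gaussian, and an integral is the limit of a sum, it’s not surprising that this integral is also gaussian (it’s effectively the sum of many correlated gaussians, which is also gaussian, as discussed before), but it’s rather tricky to calculate; I’ve included the derivation at the bottom of the post to save space here. The mean and variance are given by

${\mathbb E}\Big[ \int_t^T r(u)\,du \Big] = \theta (T-t) + \big( r(t) - \theta \big) B(t,T)$

${\mathbb V}\Big[ \int_t^T r(u)\,du \Big] = {\sigma^2 \over a^2}\Big[ (T-t) - B(t,T) - {a \over 2}B(t,T)^2 \Big]$

where

$B(t,T) = {1 \over a}\big( 1 - e^{-a(T-t)} \big)$

The ZCB is given by the expectation of the exponential of a gaussian variable – and we’ve seen on several occasions that

${\mathbb E}\big[ e^{x} \big] = e^{\mu + {1\over 2}\sigma_x^2}$ for $x \sim {\mathbb N}(\mu, \sigma_x^2)$

So the ZCB prices are

${\rm ZCB}(t,T) = A(t,T)\, e^{-r(t) B(t,T)}$

with $B(t,T)$ as defined above and

$A(t,T) = \exp{\Big\{ \Big( \theta - {\sigma^2 \over 2a^2} \Big)\big( B(t,T) - (T-t) \big) - {\sigma^2 \over 4a}B(t,T)^2 \Big\}}$

and as these expressions only depend on the rates at the initial time, we can calculate ZCB prices and hence forward rates for any future expiry dates at any given time if we have the current instantaneous rate.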
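As a sketch of how this might look in code, the Python below prices a Vasicek ZCB two ways: from the standard closed form $A(t,T)e^{-r(t)B(t,T)}$, and from $e^{-{\rm mean} + {1\over 2}{\rm variance}}$ with the variance $\sigma^2\int_t^T B(v,T)^2 dv$ integrated numerically. It assumes the SDE $dr = a[\theta - r]dt + \sigma dW_t$; the function names and parameter values are mine and purely illustrative.

```python
import numpy as np

def vasicek_zcb(r_t, a, theta, sigma, tau):
    """Closed-form ZCB price A * exp(-r_t * B) for time to expiry tau = T - t."""
    B = (1 - np.exp(-a * tau)) / a
    ln_A = (theta - sigma**2 / (2 * a**2)) * (B - tau) - sigma**2 * B**2 / (4 * a)
    return float(np.exp(ln_A - r_t * B))

def vasicek_zcb_from_moments(r_t, a, theta, sigma, tau, n=20_001):
    """Same price via exp(-mean + var/2), with the variance integral done numerically."""
    v = np.linspace(0.0, tau, n)
    B_v = (1 - np.exp(-a * (tau - v))) / a          # B(v,T) on the grid
    # Trapezoidal rule for sigma^2 * int B(v,T)^2 dv
    var = sigma**2 * float(np.sum(0.5 * (B_v[1:]**2 + B_v[:-1]**2) * np.diff(v)))
    mean = theta * tau + (r_t - theta) * B_v[0]     # B_v[0] is B(t,T)
    return float(np.exp(-mean + 0.5 * var))

p_closed = vasicek_zcb(0.03, 0.5, 0.05, 0.02, 5.0)
p_moments = vasicek_zcb_from_moments(0.03, 0.5, 0.05, 0.02, 5.0)
print(p_closed, p_moments)
```

Calling `vasicek_zcb` over a range of `tau` values gives the whole model discount curve from the single input `r_t`, which is exactly why the model cannot be made to fit an arbitrary market curve.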

Although we can calculate a discount curve for a given set of parameters, the Vasicek model can’t calibrate to an initial ZCB curve taken from the market, which is a serious disadvantage. There are more advanced generalisations which can, and I’ll discuss some soon, but they will use all of the same tricks and algebra that I’ve covered here.

I’ve written enough for one day here – in later posts I’ll discuss changing to the T-forward measure, in which the ZCB forms the numeraire instead of a unit of currency, which simplifies many calculations, and I’ll use it to price caplets under stochastic rates, and show that these are equivalent to european options on ZCBs.

An alternate approach to the short-rate model approach discussed today which is very popular these days is the Libor Market Model (LMM) approach, in which instead of simulating the short rate and calculating the required forwards, the different forwards required are instead computed directly and in tandem – I’ll look further at this approach in another post.

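Here is the calculation of the distribution of the integral of the instantaneous rate over the period $t$ to $T$:

$\int_t^T r(u)\,du = \int_t^T \Big[ r(t) e^{-a(u-t)} + \theta\big(1 - e^{-a(u-t)}\big) + \sigma\int_t^u e^{-a(u-v)}\,dW_v \Big]\,du$

and splitting this into the terms that contribute to the expectation and the variance we have

$\int_t^T r(u)\,du = \theta(T-t) + \big(r(t) - \theta\big)B(t,T) + \sigma\int_t^T e^{-au}\Big( \int_t^u e^{av}\,dW_v \Big)\,du$

to calculate the variance, we first need to deal with the following term

$\int_t^T e^{-au}\Big( \int_t^u e^{av}\,dW_v \Big)\,du$

we use stochastic integration by parts

$\int_t^T e^{-au}\Big( \int_t^u e^{av}\,dW_v \Big)\,du = -{e^{-aT} \over a}\int_t^T e^{av}\,dW_v + {1 \over a}\int_t^T dW_v = \int_t^T B(v,T)\,dW_v$

and we’re now in a position to try and find the variance of the integral

${\mathbb V}\Big[ \int_t^T r(u)\,du \Big] = \sigma^2\, {\mathbb E}\Big[ \Big( \int_t^T B(v,T)\,dW_v \Big)^2 \Big] = \sigma^2 \int_t^T B(v,T)^2\,dv$

where Ito’s isometry has been used again, and several more lines of routine algebra lead to the result

${\mathbb V}\Big[ \int_t^T r(u)\,du \Big] = {\sigma^2 \over a^2}\Big[ (T-t) - B(t,T) - {a \over 2}B(t,T)^2 \Big]$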

## Stochastic Differential Equations Pt2: The Lognormal Distribution

This post follows from the earlier post on Stochastic Differential Equations.

I finished last time by saying that the solution to the BS SDE for terminal spot at time T was

$S_T = S_0 e^{(\mu - {1 \over 2}\sigma^2)T + \sigma W_T}$

When we solve an ODE, it gives us an expression for the position of a particle at time T. But we’ve already said that we are uncertain about the price of an asset in the future, and this expression expresses that uncertainty through the $\sigma W_T$ term in the exponent. We said in the last post that the difference between this quantity at two different times s and t was normally distributed, and since this term is the distance between t=0 and t=T (we have implicitly ignored a term $W_0$, but this is ok because we assumed that the process started at zero) it is also normally distributed,

$W_T \sim {\mathbb N}(0,T)$

It’s a well-known property of the normal distribution (see the Wikipedia entry for this and many others) that if $X \sim {\mathbb N}(0,1)$ then $aX \sim {\mathbb N}(0,a^2)$ for constant a. We can use this in reverse to reduce $W_T$ to a standard normal variable X, by taking a square root of time outside of the distribution so $W_T \sim \sqrt{T}\cdot{\mathbb N}(0,1)$ and we now only need standard normal variables, which we know lots about. We can repeat our first expression in these terms

$S_T = S_0 e^{(\mu - {1 \over 2}\sigma^2)T + \sigma \sqrt{T} X}$

What does all of this mean? In an ODE environment, we’d be able to specify the exact position of a particle at time T. Once we try to build in uncertainty via SDEs, we are implicitly sacrificing this ability, so instead we can only talk about expected positions, variances, and other probabilistic quantities. However, we certainly can do this: the properties of the normal distribution are very well understood from a probabilistic standpoint, so we expect to be able to make headway! Just as X is a random variable distributed across a normal distribution, S(t) is now a random variable whose distribution is a function of random variable X and the other deterministic terms in the expression. We call this distribution the lognormal distribution since the log of S is distributed normally.

The random nature of S is determined entirely by the random nature of X. If we take a draw from X, that will entirely determine the corresponding value of S, since the remaining terms are deterministic. The first things we might want to do are calculate the expectation of S, its variance, and plot its distribution. To calculate the expectation, we integrate over all possible realisations of X weighted by their probability, complete the square and use the gaussian integral formula with a change of variables

${\mathbb E}[S_t] = \int^{\infty}_{-\infty} S_t(x) p(x) dx$

$={S_0 \over \sqrt{2\pi}}\int^{\infty}_{-\infty} e^{(\mu-{1\over 2}\sigma^2)t + \sigma \sqrt{t}x} e^{-{1\over 2}x^2} dx$

$={ {S_0 e^{(\mu-{1\over 2}\sigma^2)t}} \over \sqrt{2\pi}}\int^{\infty}_{-\infty} e^{-{1\over 2}x^2 + \sigma \sqrt{t} x} dx$

$={{S_0 e^{(\mu-{1\over 2}\sigma^2)t}} \over \sqrt{2\pi}}\int^{\infty}_{-\infty} e^{-{1\over 2} (x - \sigma \sqrt{t})^2} e^{{1\over 2}\sigma^2 t} dx$

$={{S_0 e^{\mu t}} \over \sqrt{2\pi}}\int^{\infty}_{-\infty} e^{-{1\over 2} y^2} dy$

$=S_0 e^{\mu t}$

which is just growth at the drift rate $\mu$ acting over time [exercise: calculate the variance in a similar way]. We know what the probability distribution of X looks like (it’s a standard normal variable), but what does the probability distribution of S look like? We can calculate the pdf using the change-of-variables technique, which says that if S = g(x), then the area under each curve in corresponding regions must be equal:

$\int_{x_1}^{x_2}&space;p_x(x)&space;dx&space;=&space;\int_{g(x_1)}^{g(x_2)}&space;p_S(S)&space;dS$

$p_x(x)&space;dx&space;=&space;p_S(S)&space;dS$

$p_S(S_t)&space;=&space;p_x(x)&space;{dx&space;\over&space;dS_t}$

We know the function S(x), but the easiest way to calculate this derivative is first to invert the function to make it amenable to differentiation

$x = {\ln{S_t \over S_0} - (\mu - {1 \over 2}\sigma^2)t \over \sigma \sqrt{t}}$

${dx \over dS_t} = {1 \over \sigma \sqrt{t} S_t}$

So the pdf of S expressed in terms of S is

$p_S(S_t) = {1 \over S_t \sigma \sqrt{2\pi t}} \exp\Bigl( -{\bigl(\ln{S_t\over S_0} - (\mu-{1\over 2}\sigma^2)t \bigr)^2\over 2 \sigma^2 t} \Bigr)$

Well it’s a nasty looking function indeed! I’ve plotted it below for a few typical parameter sets and evolution times.
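For anyone who wants to evaluate the density themselves, here is a minimal Python sketch (not part of the original derivation). It codes up the pdf above and uses a plain midpoint rule to check that it integrates to 1 and that its mean matches $S_0 e^{\mu t}$. The parameter values, the truncation point `upper`, and the helper names `gbm_pdf` and `integrate` are all illustrative assumptions.

```python
import math


def gbm_pdf(s, s0, mu, sigma, t):
    """Density of S_t, transcribed from the formula above."""
    if s <= 0.0:
        return 0.0
    z = math.log(s / s0) - (mu - 0.5 * sigma**2) * t
    norm = s * sigma * math.sqrt(2.0 * math.pi * t)
    return math.exp(-z * z / (2.0 * sigma**2 * t)) / norm


def integrate(func, a, b, n=200_000):
    """Plain midpoint rule on [a, b] with n equal slices."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


# Illustrative parameters (assumptions, not taken from the post)
s0, mu, sigma, t = 100.0, 0.05, 0.2, 1.0
upper = 1000.0  # truncation point; the tail beyond this is negligible here

total_prob = integrate(lambda s: gbm_pdf(s, s0, mu, sigma, t), 0.0, upper)
mean = integrate(lambda s: s * gbm_pdf(s, s0, mu, sigma, t), 0.0, upper)

print("integral of pdf:", total_prob)
print("numerical mean :", mean)
print("S0 * exp(mu t) :", s0 * math.exp(mu * t))
```

The same `gbm_pdf` function can be handed to any plotting library to draw the curve for other parameter sets.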

This distribution is really central to a lot of what we do, so I’ll come back to it soon and discuss a few more of its properties. The one other thing to mention is that if we want to calculate an expected value over S (which will turn out to be something we do a lot), we have two approaches – either integrate over $p_S(S_t)$

${\mathbb E}[f(S_t)] = \int_0^{\infty} f(S_t)p_S(S_t)dS_t$

or express the function in terms of x (using $S_t = S_0 e^{(\mu - {1 \over 2}\sigma^2)t + \sigma \sqrt{t} x}$) and integrate over the normal distribution instead

${\mathbb E}[f(S_t(x))] = \int_{-\infty}^{\infty} f(S_t(x))p_x(x)dx$

This is typically the easier option, and it is known as the Law of the Unconscious Statistician. On that note, we’ve certainly covered enough ground for the moment!
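To see that the two routes to an expectation really do agree, here is a small numerical sketch. The function f is taken to be a call-style payoff max(S - K, 0) purely as an example; the strike, the other parameters, the truncation limits and the helper names are all assumptions made for illustration.

```python
import math

# Illustrative parameters (assumptions, not taken from the post)
s0, mu, sigma, t, strike = 100.0, 0.05, 0.2, 1.0, 110.0


def payoff(s):
    """Example choice of f: a call-style payoff."""
    return max(s - strike, 0.0)


def s_of_x(x):
    """S_t written as a function of the standard normal draw x."""
    return s0 * math.exp((mu - 0.5 * sigma**2) * t + sigma * math.sqrt(t) * x)


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def lognormal_pdf(s):
    """p_S(S_t) from the change-of-variables result above (s > 0)."""
    z = math.log(s / s0) - (mu - 0.5 * sigma**2) * t
    return math.exp(-z * z / (2.0 * sigma**2 * t)) / (
        s * sigma * math.sqrt(2.0 * math.pi * t)
    )


def midpoint(func, a, b, n=100_000):
    """Plain midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


# Route 1: integrate f(S) against the density of S (truncated at 1000)
via_s = midpoint(lambda s: payoff(s) * lognormal_pdf(s), 0.0, 1000.0)
# Route 2: integrate f(S(x)) against the standard normal density
via_x = midpoint(lambda x: payoff(s_of_x(x)) * normal_pdf(x), -10.0, 10.0)

print("integrating over S:", via_s)
print("integrating over x:", via_x)
```

The second route never needs the density of S at all, which is why it tends to be the more convenient one.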

-QuantoDrifter

## Stochastic Differential Equations Pt1: The BS Equation

SDEs are of fundamental importance in quantitative finance. Because we don’t know what is going to happen in the future, we need some kind of random process built into our equations to model this uncertainty.

By far the most common process used is Brownian motion, and in 1 dimension this is called the Wiener process, which we will label $W_t$ at time t. This is a diffusive process (basically, there are no ‘jumps’, and in small time intervals the distance that the particle has traveled is probably very small) with no memory, so that future motion is independent of past motion. If we choose any two times s and t with t > s, then the difference between the value of the process at these times is distributed normally, $W_t - W_s \sim {\mathbb N}(0,t-s)$. This means that our uncertainty about a particle (or asset) following a Wiener process (taking $W_0 = 0$) at some future time T can be characterised by a normal distribution with variance T (i.e. standard deviation $\sqrt{T}$) and expectation 0.
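The increment property can be checked with a quick simulation. The sketch below builds discretised Wiener paths by summing independent normal increments of variance dt, then looks at the sample mean and variance of $W_t - W_s$ for s = 0.3 and t = 0.8. The grid size, the number of paths, the seed and the function name `wiener_path` are arbitrary choices for illustration.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable


def wiener_path(n_steps, dt, rng):
    """Discretised Wiener path starting at 0, built from independent N(0, dt) steps."""
    w = [0.0]
    for _ in range(n_steps):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w


dt, n_steps, n_paths = 0.01, 100, 5000
s_idx, t_idx = 30, 80  # grid indices for s = 0.3 and t = 0.8

diffs = []
for _ in range(n_paths):
    w = wiener_path(n_steps, dt, rng)
    diffs.append(w[t_idx] - w[s_idx])

sample_mean = statistics.fmean(diffs)
sample_var = statistics.pvariance(diffs)

print("sample mean of W_t - W_s    :", sample_mean)
print("sample variance of W_t - W_s:", sample_var)
print("theoretical variance t - s  :", (t_idx - s_idx) * dt)
```

With only a few thousand paths the sample values will wobble around 0 and t - s rather than hit them exactly.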

The Black-Scholes SDE for a stock price (for reasons that will become clear later, this is usually called geometric Brownian motion, by the way) is

${dS \over S} = \mu dt + \sigma dW_t$

What does this mean? Well, the first two terms are simple enough for anyone who has studied ordinary differential equations. S is an asset price, t is time and $\mu$ and $\sigma$ are both constants. Without the stochastic term, we would be able to solve the equation by a simple integration on both sides

$\int_{S_0}^{S_T} {dS \over S} = \int_0^T \mu dt$

$S_T = S_0 e^{\mu T}$

which is simply exponential growth of a stock with time. Later, we will introduce risk-free bonds that grow in this way, like compound interest in a bank account, but here we are interested in including an element of uncertainty as well. How do we deal with that?

We might be tempted to try the same thing and integrate the whole equation, using the result that $d \ln S = S^{-1} \cdot dS = \mu dt + \sigma d W_t$ as we implicitly did above. Unfortunately, for non-deterministic functions this doesn’t work any more. The reasons are rather deep, but for the moment the fix is that we need to use a central result of stochastic calculus, Ito’s Lemma, to resolve the conundrum. This is the stochastic version of the chain rule for a function of many variables, which says that if

$dS = \mu (S,t) dt + \sigma (S,t) dW_t$

then

$d( f(S,t)) = \Bigl( {\partial f \over \partial t} + \mu(S,t) {\partial f \over \partial S} + {\sigma(S,t)^2 \over 2}{\partial^2 f \over \partial S^2} \Bigr) dt + \sigma(S,t) {\partial f \over \partial S} dW_t$

That seems like quite a mouthful, but for the case that we’re considering, ln(S), it’s actually pretty straightforward. Since ln(S) doesn’t depend explicitly on time, the time-derivative disappears. It’s quite easy to calculate the first and second derivatives of ln(S) with respect to S, and remembering that we needed to multiply all of the terms in the BS equation above by S to turn the LHS into plain old dS (these factors were absorbed into our $\mu(S,t)$ and $\sigma(S,t)$, so that $\mu(S,t) = \mu S$ and $\sigma(S,t) = \sigma S$), we get:

$d \ln S = ( \mu - {1 \over 2} \sigma^2 ) dt + \sigma dW_t$

with the $\mu$ and $\sigma$ just plain old constants again. We have picked up an extra term here, and it has come from the second derivative term in Ito’s Lemma. We are now in a position to integrate both sides of the equation ($\int_0^T \sigma dW_t = \sigma W_T$, by the way), and similar to the result in the deterministic case we get

$S_T = S_0 e^{(\mu - {1 \over 2}\sigma^2)T + \sigma W_T}$

This is the solution of the BS SDE, but what does it mean? There’s quite a bit to digest above, so I’ll talk a bit more about what this means in the second part of this blog post.
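In the meantime, a quick numerical experiment gives some confidence that the extra term really belongs there. The sketch below steps the SDE forward with a simple Euler-Maruyama scheme (a standard discretisation, swapped in here for illustration rather than anything used in the derivation above) and compares the end point against the closed-form solution evaluated on the same Brownian path, and against the 'naive' answer without the ${1 \over 2}\sigma^2$ correction. All parameter values and variable names are assumptions.

```python
import math
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Illustrative parameters (assumptions, not taken from the post)
s0, mu, sigma, T = 100.0, 0.05, 0.2, 1.0
n_steps = 10_000
dt = T / n_steps

s_euler = s0  # running value of the discretised SDE
w = 0.0       # running value of the Brownian path W_t

for _ in range(n_steps):
    dw = rng.gauss(0.0, math.sqrt(dt))            # increment dW ~ N(0, dt)
    s_euler += s_euler * (mu * dt + sigma * dw)   # dS = S (mu dt + sigma dW)
    w += dw

# Closed-form solution evaluated on the same path
s_exact = s0 * math.exp((mu - 0.5 * sigma**2) * T + sigma * w)
# What ordinary calculus would have suggested (no Ito correction)
s_naive = s0 * math.exp(mu * T + sigma * w)

print("Euler-Maruyama end point :", s_euler)
print("closed form (with -s^2/2):", s_exact)
print("naive (no correction)    :", s_naive)
```

The stepped path should land much closer to the closed form with the correction than to the naive version, and the gap to the closed form should shrink as `n_steps` grows.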

-QuantoDrifter