Improving Monte Carlo: Control Variates

I’ve already discussed quite a lot about Monte Carlo in quantitative finance. MC can be used to value products for which an analytical price is not available in a given model, which includes most exotic derivatives in most models. However, two big problems are the time that it takes to run and the ‘Monte Carlo error’ in results.

One technique for improving MC is to use a ‘control variate’. The idea is to find a product whose price is strongly correlated to that of the product we’re trying to price, but which is easier to calculate or whose price we already know. When we simulate a path in MC, it will almost surely give either an under-estimate or an over-estimate of the true price, but we don’t know which, and averaging all of these errors is what leads to the Monte Carlo error in the final result. The insight behind the control variate technique is to use the knowledge given to us by the control variate to reduce this error. If the two prices are strongly correlated and a path produces an over-estimate of the product price, it most likely also produces an over-estimate of the control variate, and vice versa, which allows us to improve our estimate of the product we’re trying to price.

The textbook example is the Asian option. Although the arithmetic version of the Asian option discussed in previous posts has no analytic price in Black-Scholes, a similar geometric Asian option does have one. So, for a given set of model parameters, we can calculate the price of the geometric option exactly. As a reminder, the payoff of an arithmetic Asian option at expiry is

    \[C_{\rm arit}(T) = \Bigl({1\over N}\sum_{i=0}^{N-1} S(t_i) - K \Bigr)^+\]

and the payoff of the related geometric-averaging Asian is

    \[C_{\rm geo}(T) = \Bigl( \bigl(\prod_{i=0}^{N-1} S(t_i)\bigr)^{1\over N} - K \Bigr)^+\]
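
In code, both payoffs are one-liners. Here is a minimal Python sketch (the function names, and the use of numpy, are my own choices purely for illustration):

    import numpy as np

    def arithmetic_asian_payoff(spots, strike):
        """Payoff of an arithmetic-average Asian call, given the spot fixings along one path."""
        return max(np.mean(spots) - strike, 0.0)

    def geometric_asian_payoff(spots, strike):
        """Payoff of a geometric-average Asian call, given the spot fixings along one path."""
        # exp(mean(log S)) is a numerically stable way of taking the geometric mean
        return max(np.exp(np.mean(np.log(spots))) - strike, 0.0)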

Denoting by X the (discounted) payoff of the arithmetic option and by Y that of the geometric option, the traditional Monte Carlo approach is to generate N paths and, for each path i, calculate X_i (the realisation of the payoff along that path), then take the average over all paths, so that

    \[{\mathbb E}[X] \approx {1 \over N} \sum_{i=0}^{N-1} X_i\]

which will get closer to the true price as N \to \infty.
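
To make this concrete, here is a rough Python sketch of the plain Monte Carlo estimator: it simulates Black-Scholes spots at the averaging dates and averages the discounted arithmetic payoff. All function names and parameter values are illustrative assumptions rather than anything from the original post.

    import numpy as np

    def simulate_fixings(S0, r, sigma, times, n_paths, rng):
        """Simulate the spot at each averaging date under Black-Scholes (exact log-normal steps)."""
        dts = np.diff(np.concatenate(([0.0], times)))
        z = rng.standard_normal((n_paths, len(times)))
        log_increments = (r - 0.5 * sigma**2) * dts + sigma * np.sqrt(dts) * z
        return S0 * np.exp(np.cumsum(log_increments, axis=1))

    def mc_arithmetic_asian(S0, K, r, sigma, times, n_paths, seed=0):
        """Plain MC estimate (and standard error) of the arithmetic Asian call price."""
        rng = np.random.default_rng(seed)
        spots = simulate_fixings(S0, r, sigma, times, n_paths, rng)
        disc = np.exp(-r * times[-1])          # discount from the final fixing / payment date
        payoffs = disc * np.maximum(spots.mean(axis=1) - K, 0.0)
        return payoffs.mean(), payoffs.std(ddof=1) / np.sqrt(n_paths)

    # e.g. monthly averaging over a year, with the parameters quoted in the figure caption below
    times = np.linspace(1.0 / 12, 1.0, 12)
    price, std_err = mc_arithmetic_asian(S0=100, K=105, r=0.03, sigma=0.10, times=times, n_paths=100_000)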

Using Y as a control variate, we instead calculate

    \[{\mathbb E}[X] \approx {1\over N} \sum_{i=0}^{N-1}\bigl( X_i - \lambda( Y_i - {\mathbb E}[Y] ) \bigr)\]

where {\mathbb E}[Y] is the price of the geometric option known from the analytical expression, and \lambda is a constant (in this case we will set it to 1).

What do we gain from this? Well, consider the variance of X_i - \lambda (Y_i - {\mathbb E } [Y])

    \[{\rm Var}\bigl( X_i - \lambda( Y_i - {\mathbb E}[Y] ) \bigr) = {\rm Var}( X_i ) + \lambda^2 {\rm Var} ( Y_i ) - 2 \lambda {\rm Cov}( X_i, Y_i )\]

(since {\mathbb E } [Y] is known, so has zero variance) which is minimal for \lambda = \sqrt{ {\rm Var}(X_i) \over {\rm Var}(Y_i) }\cdot \rho(X_i,Y_i) in which case

    \[{ {\rm Var}\bigl( X_i - \lambda( Y_i - {\mathbb E}[Y] ) \bigr) \over {\rm Var}( X_i )} = 1 - \rho(X_i,Y_i)^2\]

that is, if the two prices are strongly correlated, the variance of the price calculated using the control variate will be significantly smaller. I’ve plotted the payoffs of the two types of option for 100 paths below – the sample correlation is about 99.98%. Consequently, we expect a reduction in variance of about 2000 times for a given number of paths (although we now have to do a little more work on each path, as we need to calculate the geometric average as well as the arithmetic average of the spots). That corresponds to a standard error roughly 45 times smaller – well over an extra decimal place, which isn’t bad – and it is certainly much easier than running 2000 times as many paths to achieve the same result.
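
A rough sketch of the control-variate estimator, reusing the simulate_fixings helper from the sketch above: the analytic geometric price {\mathbb E}[Y] is assumed to be supplied separately (for instance by the closed form derived in the next post), and \lambda can either be fixed to 1, as in the text, or estimated from the sample covariance.

    def mc_arithmetic_asian_cv(S0, K, r, sigma, times, n_paths, geo_price_analytic, seed=0):
        """Control-variate MC for the arithmetic Asian, using the geometric Asian as control."""
        rng = np.random.default_rng(seed)
        spots = simulate_fixings(S0, r, sigma, times, n_paths, rng)
        disc = np.exp(-r * times[-1])
        X = disc * np.maximum(spots.mean(axis=1) - K, 0.0)                      # arithmetic payoffs
        Y = disc * np.maximum(np.exp(np.log(spots).mean(axis=1)) - K, 0.0)      # geometric payoffs

        # optimal lambda = Cov(X, Y) / Var(Y); setting lam = 1.0 recovers the simpler choice above
        lam = np.cov(X, Y)[0, 1] / Y.var(ddof=1)
        adjusted = X - lam * (Y - geo_price_analytic)
        return adjusted.mean(), adjusted.std(ddof=1) / np.sqrt(n_paths), lam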

The relationship between the payoffs of a geometric and an arithmetic Asian option, which here show a 99.98% sample correlation. Parameters used were: r: 3%; vol: 10%; K: 105; S(0): 100; averaging dates: monthly intervals for a year

Asian options III: the Geometric Asian

I’ve introduced the Asian option before: it is similar to a vanilla option, but its payoff is based on the average price over a period of time rather than solely the price at expiry. We saw that it is exotic – it is not possible to give it a unique price using only the information available from the market.

Today I’m going to introduce another type of option – the geometric Asian option. This option is similar, but its payoff is now based on the geometric average, rather than the arithmetic average, of the spot over the averaging dates

    \[C(T) = \Bigl( \bigl( \prod_{i=1}^{N} S_i \bigr)^{1 \over N} - K \Bigr)^+\]

This option is exotic, just like the regular (arithmetic-average) Asian option. At first sight it seems even more complicated; and what’s more, it is almost never encountered in practice, so why bother with it? Well, it’s very useful as an exercise, because in many models where the arithmetic Asian’s price has no closed form, the geometric Asian happens to have one! Further, since the arithmetic average of a set of numbers is always at least as large as their geometric average, the price of the geometric Asian gives us a lower bound on the price of the arithmetic Asian.

Let’s work in the Black-Scholes model in its simplest form for the moment (although the pricing formula can be extended to more general models) and, as we did in the earlier post, consider a simple geometric Asian averaging over only two times t_1 and t_2, so that the payoff is C(T) = \Bigl( \sqrt{ S_1 \cdot S_2 } - K \Bigr)^+. We need to know what the spot looks like at each of the averaging times.

At t_0, S(t_0) = S_0. At t_1,

    \[S(t_1) = S_1 = S_0 \exp{ \Bigl\{ (r - {1\over 2} \sigma^2 )t_1 + \sigma \sqrt{t_1} x_1 \Bigr\} }\]

where x_1 \sim {\mathbb N}(0,1). At t_2,

    \[S(t_2) = S_2 = S_1 \exp{ \Bigl\{ (r - {1\over 2} \sigma^2 )(t_2 - t_1) + \sigma \sqrt{t_2-t_1} x_2 \Bigr\} }\]

where x_2 \sim {\mathbb N}(0,1) also, and importantly x_2 is uncorrelated with x_1 due to the independent increments of the Brownian motion.

Now, the reason we couldn’t find a closed-form solution for the arithmetic Asian was that S_1 + S_2 is the sum of two lognormal distributions, which is itself NOT lognormally distributed. However, as discussed in my post on distributions, the product S_1 \cdot S_2 of two lognormal distributions IS lognormal, so valuing an Asian that depends on the product of these two is similar to pricing a vanilla, with a slight change to the parameters that we need

    \[\begin{matrix} \sqrt{ S_1 \cdot S_2 } & = & S_0 \exp \Bigl\{ {1\over 2}(r - {1\over 2} \sigma^2 )(t_1 + t_2) + \sigma ( \sqrt{t_1} x_1 + {1\over 2} \sqrt{t_2 - t_1} x_2 ) \Bigr\} \\ & = & S_0 \exp \Bigl\{ {1\over 2}(r - {1\over 2} \sigma^2 )(t_1 + t_2) + {1 \over 2} \sigma \sqrt{ 3 t_1 + t_2 } x_3 \Bigr\} \end{matrix}\]

where x_3 is another normal variable (we’ve used the result about the sum of two normally distributed variables here).
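
As a quick, purely illustrative sanity check of that last step in Python: the standard deviation of \sqrt{t_1}\,x_1 + {1 \over 2}\sqrt{t_2-t_1}\,x_2 should come out as {1 \over 2}\sqrt{3t_1+t_2}.

    import numpy as np

    rng = np.random.default_rng(0)
    t1, t2 = 0.5, 1.0
    x1, x2 = rng.standard_normal((2, 1_000_000))

    combined = np.sqrt(t1) * x1 + 0.5 * np.sqrt(t2 - t1) * x2
    print(combined.std())                      # sample standard deviation of the combination
    print(0.5 * np.sqrt(3 * t1 + t2))          # predicted value, approximately 0.7906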

If we re-write this as

    \[\begin{matrix} \sqrt{ S_1 \cdot S_2 } & = & S_0 \exp \Bigl\{ {1 \over 2} r ( t_1 + t_2 ) \Bigr\} \exp \Bigl\{ -{1 \over 4} \sigma^2 (t_1 + t_2) + {1 \over 2} \sigma \sqrt{ 3 t_1 + t_2 } x_3 \Bigr\} \\ & = & \sqrt{ F_1 \cdot F_2 } \exp \Bigl\{ {1 \over 2} \sigma^2 ( \tilde{t} - \tilde{\tilde{t}} ) \Bigr\} \exp \Bigl\{ -{1 \over 2} \sigma^2 \tilde{t} + \sigma \sqrt{\tilde{t}} x_3 \Bigr\} \end{matrix}\]

where F_1 and F_2 are the forwards at the respective times and \tilde{t} and \tilde{\tilde{t}} are defined below. This is just the same as the vanilla pricing problem solved here. So, we can use a vanilla pricer to price a geometric Asian with two averaging dates, but we need to enter transformed parameters

    \[\begin{matrix} \tilde{t} & = & {1 \over 4} ( 3 t_1 + t_2 ) \\ \tilde{\tilde{t}} & = & {1 \over 2} (t_1 + t_2) \\ F & \to & \sqrt{ F_1 \cdot F_2 } \cdot e^{{1 \over 2} \sigma^2 (\tilde{t} - \tilde{\tilde{t}}) } \\ \sigma^2 t & \to & \sigma^2 \tilde{t} \end{matrix}\]

In fact this result is quite general: we can price a geometric Asian with any number of averaging dates using the general transformations below (have a go at demonstrating this following the logic above)

    \[\begin{matrix} \tilde{t} & = & {1 \over n^2} \sum_{i,j=1}^n {\rm min}(t_i, t_j) \\ \tilde{\tilde{t}} & = & {1 \over n} \sum_{i=1}^n t_i \\ F & \to & \Bigl( \prod_{i=1}^n F_i \Bigr)^{1 \over n} \cdot e^{{1 \over 2} \sigma^2 (\tilde{t} - \tilde{\tilde{t}}) } \\ \sigma^2 t & \to & \sigma^2 \tilde{t} \end{matrix}\]
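
As a sketch of how this drops into a Black-style vanilla formula (flat rate and vol assumed, payment taken to be on the final averaging date, and all function names my own), something along these lines should work:

    import numpy as np
    from scipy.stats import norm

    def geometric_asian_call(S0, K, r, sigma, times):
        """Closed-form geometric Asian call via the transformed vanilla parameters above."""
        times = np.asarray(times, dtype=float)
        n = len(times)
        t_eff = np.minimum.outer(times, times).sum() / n**2   # effective variance time, tilde-t
        t_avg = times.mean()                                  # average fixing time, double-tilde-t
        fwd = S0 * np.exp(r * times).prod() ** (1.0 / n)      # geometric mean of the forwards
        fwd_adj = fwd * np.exp(0.5 * sigma**2 * (t_eff - t_avg))

        total_var = sigma**2 * t_eff
        d1 = (np.log(fwd_adj / K) + 0.5 * total_var) / np.sqrt(total_var)
        d2 = d1 - np.sqrt(total_var)
        # discount from the payment date, taken here to be the final averaging date
        return np.exp(-r * times[-1]) * (fwd_adj * norm.cdf(d1) - K * norm.cdf(d2))

    # monthly averaging over a year with the parameters from the earlier figure caption
    print(geometric_asian_call(S0=100, K=105, r=0.03, sigma=0.10, times=np.linspace(1.0 / 12, 1.0, 12)))

A price produced this way also makes a convenient {\mathbb E}[Y] input for the control-variate sketch in the first section.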

Credit Default Swaps Pt II – Credit Spreads

As well as governments, companies also need to borrow money, and one way of achieving this is to issue bonds. Like government bonds, these pay a fixed rate of interest each year. Of course, this rate of interest will typically be quite a bit higher than that on government bonds, to compensate for the increased credit risk associated with them. The difference between the two rates of interest is called the credit spread.

As we did before, today I’m just going to look at pricing a ZCB, but coupon-bearing bonds can be constructed as a portfolio of these, so it’s a relatively straightforward extension.

Consider a ZCB issued by a risky company. It’s going to pay £1 at expiry, and nothing in between. However, if the company defaults in the meantime we get nothing. In fact, we’ll extend this a bit – instead of getting nothing, we say that we will receive a fraction R of the £1, since the company’s assets will be run down and some of its liabilities will get paid. This also allows us to distinguish between different classes of debt based on their seniority (we’re getting dangerously close to corporate finance here, so let’s steer back to the maths).

In order to protect against the risk of loss, we might enter into a Credit Default Swap (CDS) with a third party. In exchange for us making regular payments of £K/year to them (here we’re assuming that the payment accrues continuously, for mathematical ease), they will pay us if and when the counterparty defaults. There are a few varieties of CDS – they might take the defaulted bond off us in exchange for its face value, or they might just pay out an amount equal to our loss. We’ll assume the second variety here, so in the event of default they pay us £(1-R), since this is what we lose (up to some discount factors, but we’re keeping it simple here!). Also, if default occurs we stop making the regular payments, since we no longer need protection, so this product is essentially an insurance policy.

A cartoon of the payment schedule for a Credit Default Swap. We make continuous payments to the insurer of K per unit time, and in return we receive a one-off payment of (1-R) should the counterparty default, in which case the contract finishes early and no more payments happen

We value this product in the normal way. The value of the fixed leg is just the discounted cash flow of each payment multiplied by the probability that the counterparty hasn’t defaulted by that time (because we stop paying once it has) – and remember that we said the payments are accruing continuously, so we use an integral instead of a sum (the minus sign is there because these are payments we make):

    \[{\rm fixed} = -K \int_0^T \delta(0,t) S(t) dt\]

and the value of the insurance leg is the value of the payment times the probability that the counterparty defaults in any given interval

    \[{\rm floating} = (1-R) \int_0^T \delta(0,t) p(t) dt\]

where, in keeping with our notation from before, p(t) is the default probability density at a given time and S(t) is the probability that the counterparty hasn’t defaulted by that time (the survival function).

The contract is fair if the values of the two legs cancel. Since p(t) = -{dS \over dt}, this happens when

    \[0 = \int \delta(0,t) \Bigl(-(1-R) {dS \over dt} - K\cdot S(t) \Bigr) dt\]

At this point we refer to our exponential default model from Part I of this post.

In the exponential default model described in the previous post, we postulated a default probability density p(t) = \lambda e^{-\lambda t}, which led to the survival function S(t) = e^{-\lambda t}. Substituting these into the equation above, we have

    \[0 = \int \delta(0,t) e^{-\lambda t} \Bigl((1-R) \lambda - K \Bigr) dt\]

and since \delta(0,t) e^{-\lambda t} is strictly positive, this is only zero when the constant bracket is zero, so

    \[K = (1-R)\lambda\]

This is called the credit triangle, and it tells us the price of insuring a risky bond against default if we know its hazard rate. If the expected lifetime of the firm increases (i.e. \lambda decreases) then the spread will fall, as insurance is less likely to be required. If the expected recovery after default increases (R increases) then the spread also falls, as the payout will be smaller if it is required.
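
As a rough numerical sanity check of the credit triangle (all parameter values here are purely illustrative), we can integrate the two legs on a grid and confirm that the break-even spread comes out as (1-R)\lambda regardless of the discount rate:

    import numpy as np

    def cds_breakeven_spread(lam, R, r, T=5.0, n_steps=100_000):
        """Break-even spread from crude numerical integration of the fixed and protection legs."""
        t = np.linspace(0.0, T, n_steps)
        dt = t[1] - t[0]
        df = np.exp(-r * t)                  # discount factor delta(0, t)
        surv = np.exp(-lam * t)              # survival probability S(t)
        density = lam * surv                 # default probability density p(t)

        fixed_per_unit_spread = np.sum(df * surv) * dt            # value of paying 1/year while alive
        protection = (1 - R) * np.sum(df * density) * dt          # value of receiving (1-R) on default
        return protection / fixed_per_unit_spread

    lam, R = 0.02, 0.4
    print(cds_breakeven_spread(lam, R, r=0.03))   # ~0.012, independent of r
    print((1 - R) * lam)                          # credit triangle: 0.012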

Although a little care is needed when moving from ZCBs to coupon-bearing bonds, it can be seen that the payments K (normally called the spreads) paid to insure the bond against default should be essentially the difference in interest payments between government (‘risk-free’) bonds and the risky bonds we are considering.

The default probabilities we have used here can be calibrated from market-observed values of the spread K between government and corporate bonds. This resembles the process used to calibrate volatility – the observable quantity in the market is actually the price, not the volatility or hazard rate; we back these parameters out from the market prices of simple products, and then use them to price more exotic products consistently. What we obtain is the market’s view of the default probabilities in the risk-neutral measure, which means that they may not bear any resemblance to real-world default probabilities (e.g. historical default rates) – in fact risk-neutral default probabilities are usually larger, often several times larger. See for example this paper for more on the subject.

Of course, throughout this analysis we have ignored the risk of the insurer themselves defaulting – indeed, credit derivatives rapidly become a nightmare when considered in general, and the Lehman Brothers default was largely down to some fairly heroic approximations that were made in pricing credit products that ultimately didn’t hold up in extreme environments. I’ll explore some of this behaviour soon in a post on correlated defaults and the gaussian copula model.