Asian options III: the Geometric Asian

I’ve introduced the Asian Option before, it is similar to a vanilla option but its payoff is based on the average price over a period of time, rather than solely the price at expiry. We saw that it was exotic – it is not possible to give it a unique price using only the information available from the market.

Today I’m going to introduce another type of option – the geometric asian option. This option is similar, but its payoff is now based on the geometric average, rather than the arithmetic average, of the spot over the averaging dates

    \[C(T) = \Bigl( \bigl( \prod_{i=1}^{N} S_i \bigr)^{1 \over N} - K \Bigr)^+\]

This option is exotic, just like the regular (arithmetic-average) asian option. At first sight it seems even more complicated; and what’s more, it is almost never encountered in practice, so why bother with it? Well, it’s a very useful as an exercise because in many models where the arithmetic asian’s price has no closed form, the geometric asian happens to have one! Further, since an arithmetic average is ALWAYS higher than a geometric average for a set of numbers, the price of the geometric asian will give us a strict lower bound on the price of the arithmetic asian.

Considering the Black-Scholes model in its simplest form for the moment (although the pricing formula can be extended to more general models), let’s consider what the spot will look like at each of the averaging times, and as we did in the earlier post, considering a simple geometric asian averaging over only two times t_1 and t_2 so that the payoff is C(T) = \Bigl( \sqrt{ S_1 \cdot S_2 } - K \Bigr)^+

At t_0S(t_0) = S_0. At t_1,

    \[S(t_1) = S_1 = S_0 \exp{ \Bigl\{ (r - {1\over 2} \sigma^2 )t_1 + \sigma \sqrt{t_1} x_1 \Bigr\} }\]

where x_1 \sim {\mathbb N}(0,1). At t_2,

    \[S(t_2) = S_2 = S_1 \exp{ \Bigl\{ (r - {1\over 2} \sigma^2 )(t_2 - t_1) + \sigma \sqrt{t_2-t_1} x_2 \Bigr\} }\]

where x_2 \sim {\mathbb N}(0,1) also, and importantly x_2 is uncorrelated with x_1 due to the independent increments of the Brownian motion.

Now the reason we couldn’t make a closed form solution for the arithmetic asian was that S_1 + S_2 is the sum of two lognormal distributions, which itself is NOT lognormally distributed. However, as discussed in my post on distributions, the product S_1 \cdot S_2 of two lognormal distributions IS lognormal, so valuing an asian that depends on the product of these two is similar to pricing a vanilla, with a slight change to the parameters that we need

    \[\begin{matrix} \sqrt{ S_1 \cdot S_2 } & = & S_0 \exp \Bigl\{ {1\over 2}(r - {1\over 2} \sigma^2 )(t_1 + t_2) + \sigma ( \sqrt{t_1} x_1 + {1\over 2} \sqrt{t_2 - t_1} x_2 ) \Bigr\} \\ & = & S_0 \exp \Bigl\{ {1\over 2}(r - {1\over 2} \sigma^2 )(t_1 + t_2) + {1 \over 2} \sigma \sqrt{ 3 t_1 + t_2 } x_3 \Bigr\} \end{matrix}\]

where x_3 is another normal variable (we’ve used the result about the sum of two normally distributed variables here).

If we re-write this as

    \[\begin{matrix} \sqrt{ S_1 \cdot S_2 } & = & S_0 \exp \Big\{ {1 \over 2} r ( t_1 + t_2 ) \Bigr\} \exp \Big\{ -{1 \over 2} \sigma^2 (t_1 + t_2) + {1 \over 2} \sigma \sqrt{ 3 t_1 + t_2 } x_3 \Big\} \\ & = & \sqrt{ F_1 \cdot F_2 } \exp \Big\{ {1 \over 2} \sigma^2 ( \tilde{t} - \tilde{\tilde{t}} ) \Big\} \exp \Big\{ -{1 \over 2} \sigma^2 \tilde{t} + \sigma \tilde{t} x_3 \Big\} \end{matrix}\]

where F_1 and F_2 are the forwards at the respective times and \tilde{t} and \tilde{\tilde{t}} are defined below. This is just the same as the vanilla pricing problem solved here. So, we can use a vanilla pricer to price a geometric asian with two averaging dates, but we need to enter transformed parameters

\begin{matrix} \tilde{t} & = & {1 \over 2} \sqrt{ 3t_1 + t_3 } \\ \tilde{\tilde{t}} & = & {1 \over 2} (t_1 + t_2) \\ F & \to & \sqrt{ F_1 \cdot F_2 } \cdot e^{{1 \over 2} \sigma^2 (\tilde{t} - \tilde{\tilde{t}}) } \\ \sigma^2 t & \to & \sigma^2 \tilde{\tilde{t}} \end{matrix}

In fact this result is quite general, we can price a geometric asian with any number of averaging dates, using the general transformations below (have a go at demonstrating this following the logic above)

 \begin{matrix} \tilde{t} & = & {1 \over n} \sqrt{  \sum_{i , j=0}^n {\rm min}(t_i, t_j) } \\ \tilde{\tilde{t}} & = & {1 \over n} \sum_{i =1}^n t_i \\ F & \to & \Big( \prod_{i=1}^n F_i \Big)^{1 \over n} \cdot e^{{1 \over 2} \sigma^2 (\tilde{t} - \tilde{\tilde{t}}) } \\ \sigma^2 t & \to & \sigma^2 \tilde{\tilde{t}} \end{matrix}

Interview Quesions III

Today’s question will test some of the statistics and correlation I’ve discussed in the last couple of months. Assume throughout that x\sim {\mathbb N}(0,1) and y\sim {\mathbb N}(0,1) are jointly normally distributed such that {\mathbb E}[x \cdot y] = \rho

a) Calculate {\mathbb E}[\ e^x \ ]
b) Calculate {\mathbb E}[\ e^x \ | \ y = b\ ]

The first expectation is of a lognormal variate, and the second is of a lognormal variate conditional on some earlier value of the variate having been a particular value – these are very typical of the sorts of quantities that a quant deals with every day, so the solution will be quite instructive! Before reading the solution have a go at each one, the following posts may be useful: SDEs pt. 1, SDEs pt. 2, Results for Common Distributions

a) Here, we use the standard result for expectations

    \begin{align*} {\mathbb E}[ \ e^x \ ] &= \int^{\infty}_{-\infty} e^x \cdot f(x) \ dx \nonumber \\ \nonumber \\ &= {1 \over \sqrt{2 \pi}}\int^{\infty}_{-\infty} e^x \cdot e^{-{1 \over 2}x^2} \ dx \nonumber \\ \nonumber \\ &= {1 \over \sqrt{2 \pi}}\int^{\infty}_{-\infty} \exp\Bigl( -{1\over 2}\Bigl[ x^2 - 2x + 1 - 1 \Bigr] \Bigr) \ dx \nonumber \\ \nonumber \\ &= {e^{1\over 2} \over \sqrt{2 \pi}}\int^{\infty}_{-\infty} \exp\Bigl( -{1\over 2} (x-1)^2 \Bigr) \ dx \nonumber \\ \nonumber \\ &= e^{1\over 2} \nonumber \end{align}

b) This one is a little tougher, so first of all I’ll discuss what it means and some possible plans of attack. We want to calculate the expectation of e^x, given that y takes a value of b. Of course, if x and y were independent, this wouldn’t make any difference and the result would be the same. However, because they are correlated, the realised value of y will have an effect on the distribution of x.

To demonstrate this, I’ve plotted a few scatter-graphs illustrating the effect of specifying y on x, with x and y uncorrelated and then becoming increasing more correlated.

When x and y are uncorrelated, the realised value of y doesn't affect the distribution for x, which is still normally distributed around zero
When x and y are uncorrelated, the realised value of y doesn’t affect the distribution for x, which is still normally distributed around zero
When x and y are correlated, the realised value of y has an effect on the distribution of x, which is no longer centered on zero and has a smaller variance
When x and y are correlated, the realised value of y has an effect on the distribution of x, which is no longer centered on zero and has a smaller variance
When the correlation of x and y becomes high, the value of x is almost completely determined by y. Now, if y is specified then x is tightly centered around a value far from zero
When the correlation of x and y becomes high, the value of x is almost completely determined by y. Now, if y is specified then x is tightly centered around a value far from zero

The simplest way of attempting this calculation is to use the result for jointly normal variates given in an earlier post, which says that if x and y have correlation \rho, we can express x in terms of y and a new variate z \sim {\mathbb N}(0,1) which is uncorrelated with y

    \[x = \rho\cdot y + \sqrt{1 - \rho^2}\cdot z\]


    \[(\ e^x\ |\ y = b\ ) = e^{\rho y + \sqrt{1-\rho^2}z} = e^{\rho b}\cdot e^{\sqrt{1-\rho^2}z}\]

Since the value of y is already determined (ie. y = b), I’ve separated this term out and the only thing I have to calculate is the expectation of the second term in z. Since y and z are independent, we can calculate the expectation of the z, which is the same process as before but featuring slightly more complicated pre-factors

    \begin{align*} {\mathbb E}[ \ e^x \ |\ y = b\ ] &= e^{\rho b} \int^{\infty}_{-\infty} e^{\sqrt{1-\rho^2}z} \cdot f(z) \ dz \nonumber \\ \nonumber \\ &= {e^{\rho b} \over \sqrt{2 \pi}}\int^{\infty}_{-\infty} e^{\sqrt{1-\rho^2}z} \cdot e^{-{1 \over 2}z^2} \ dz \nonumber \\ \nonumber \\ &= {e^{\rho b} \over \sqrt{2 \pi}}\int^{\infty}_{-\infty} \exp\Bigl( -{1\over 2}\Bigl[ z^2 - 2\sqrt{1-\rho^2}z \nonumber \\ & \quad \quad \quad \quad \quad \quad \quad \quad + (1-\rho^2) - (1-\rho^2) \Bigr] \Bigr) \ dz \nonumber \\ \nonumber \\ &= {e^{\rho b} \over \sqrt{2 \pi}} e^{{1 \over 2} (1-\rho^2)} \int^{\infty}_{-\infty} \exp\Bigl( -{1\over 2} \bigl(z-\sqrt{1-\rho^2}\bigr)^2 \Bigr) \ dz \nonumber \\ \nonumber \\ &= e^{{1\over 2}(1-\rho^2)+\rho b} \nonumber \end{align}

We can check the limiting values of this – if \rho = 0 then x and y are independent [this is not a general result by the way – see wikipedia for example – but it IS true for jointly normally distributed variables], in this case {\mathbb E}[ \ e^x \ |\ y = b\ ] = e^{0.5} just as above. If \rho = \pm 1, {\mathbb E} [\ e^x \ |\ y=b\ ] = e^{\pm b}, which also makes sense since in this case x = \pm y = \pm b, so y fully determines the expectation of x.

The more general way to solve this is to use the full 2D joint normal distribution as given in the previous post mentioned before,

    \[f(x,y) = {1 \over {2\pi \sqrt{1-\rho^2}}} \exp{\Bigl(-{1 \over 2(1-\rho^2)}(x^2 + y^2 - 2\rho xy)\Bigr)}\]

This is the joint probability function of x and y, but it’s not quite what we need – the expectation we are trying to calculate is

    \[{\mathbb E}[ \ e^x \ |\ y \ ] = \int^{\infty}_{-\infty} e^x \cdot f(\ x \ |\ y \ ) \ dx\]

So we need to calculate the conditional expectation of x given y, for which we need Bayes’ theorem

    \[f(x,y) = f( \ x \ | \ y \ ) \cdot f(y)\]

Putting this together, we have

    \[{\mathbb E}[ \ e^x \ |\ y = b\ ] = \int^{\infty}_{-\infty} e^x \cdot {f(x,y) \over f(y)}\ dx\]

    \[={1 \over {2\pi \sqrt{1-\rho^2}}} \int^{\infty}_{-\infty} { e^x \over e^{-{1\over 2}y^2}} \exp{\Bigl(-{1 \over 2(1-\rho^2)}(x^2 + y^2 - 2\rho xy)\Bigr)} dx\]

    \[={e^{{1\over 2}b^2} \over {2\pi \sqrt{1-\rho^2}}} \int^{\infty}_{-\infty} e^x \exp{\Bigl(-{1 \over 2(1-\rho^2)}(x^2 + b^2 - 2\rho xb)\Bigr)} dx\]

This integral is left as an exercise to the reader, but it is very similar to those given above and should give the same answer as the previous expression for {\mathbb E}[\ e^x \ | \ y = b\ ] after some simplification!

Some Results for Common Distributions

[NB. I try to make note of technical conditions where relevant, from now on I’ll put these in square brackets. On a first reading these can probably be ignored]


I review here a few commonly used results for normal and lognormal distributions. They get used over and over across quantitative finance, so I wanted to have lots of them in once place where I can refer back to them later.

Univariate Normal Distribution

The probability density function for X is

p(X=x) = {1\over \sigma\sqrt{2\pi}}e^{-{{(x-\mu)^2}\over 2 \sigma^2}}

with mean \inline \mu and standard deviation \inline \sigma. I tend to use the notation \inline X\sim{\mathbb N}(\mu,\sigma^2) to describe a normally distributed variable with these parameters.

As already referred to, we can take the location and the scale out of the variable X as follows:

X = a + bz

where a and b are constants and z \sim {\mathbb N}(0,1) is called a ‘standard normal variable’.

This comes from the result that the sum of any two normal variables is also normal – for \inline X_1\sim{\mathbb N}(\mu_1, \sigma_1) and \inline X_2\sim{\mathbb N}(\mu_2, \sigma_2), then [as long as \inline X_1 and \inline X_2 are independent]

aX_1 + bX_2 = X_3\sim{\mathbb N}(a\mu_1 + b\mu_2, a^2 \sigma_1^2 + b^2\sigma_2^2)

The cdf of the normal distribution isn’t analytically tractable, but I’ve already discussed a few numerical approximations to it here. I include a doodle that re-implements this function for the standard normal cdf for you to play with:

\Phi() =

The normal distribution comes up time and time again due to the central limit theorem. This says that [usually], as we take the mean value of a sequence of [independent, identically distributed] random variables, it will tend to a normally distributed variate regardless of the distribution of the underlying variables:

\lim_{n\rightarrow \infty} \Bigl( {1 \over n}\sum_{i=0}^n X_i\Bigr) \sim{\mathbb N}(\mu,{\sigma^2\over n})

This is of very broad importance. For example, it is the basis of Monte Carlo and the square root N convergence rate, since by taking many simulations, we are sampling the distribution of the mean of N realisations of the payoff of the option. Although the payoff probably isn’t normally distributed across the r-n probability distribution of the underlying, a very large number of payoffs will approach a normal distribution, and its mean is an estimator for the payoff mean. The variance also gives us error bounds, as variance should decrease with increasing number of samples as 1/n.

Lognormal Distribution

A lognormal variable is one whose log is distributed normally – so if \inline X\sim {\mathbb N}(\mu,\sigma^2) then \inline S \sim e^{a + bX} is lognormally distributed.

The pdf for a lognormal distribution is

p(X=x) = {1\over x\sigma\sqrt{2\pi}}e^{-{{(\ln x-\mu)^2}\over 2 \sigma^2}}; x>0

once again with mean \inline \mu and standard deviation \inline \sigma [Exercise: \inline \mu and \inline \sigma are actually the mean and std. dev. for the normal distribution in the exponent, NOT for the lognormal itself – calculate their true values for the lognormal distribution]. I tend to use the notation \inline X\sim {\mathbb L}{\mathbb N}(\mu,\sigma^2) to describe a lognormally distributed variable with these parameters.

Special properties of lognormal variables are mostly due to the properties of normal in the exponent. The two most important are related to the products of lognormal variates [here I am still assuming independence – I’ll generalise this another time], if X_1 \sim {\mathbb L}{\mathbb N}(\mu_1, \sigma_1^2) and X_2 \sim {\mathbb L}{\mathbb N}(\mu_2, \sigma_2^2) then:

X_1 \cdot X_2 = X_3 \sim {\mathbb L} {\mathbb N}(\mu_1 + \mu_2, \sigma_1^2+\sigma_2^2)

(X_1)^n \sim {\mathbb L}{\mathbb N}(n \mu_1, n^2\sigma_1^2)

{1\over X_1} = (X_1)^{-1}\sim{\mathbb L}{\mathbb N}(-\mu_1,\sigma_1^2)

Although the third of these is a special case of the second, it is worth taking a little bit of time to think about. It says that the distribution of the inverse of a lognormal variable is also lognormal. This will be useful areas like foreign exchange, since it says that if the future exchange rate is randomly distributed in this way, then the inverse of the exchange rate (which is of course just the exchange rate from the opposite currency perspective) is also lognormal. Secondly, what does \inline -\mu mean for the distribution? Well, don’t forget that this isn’t the distribution mean but the mean of the normal in the exponent – even when this is negative, the lognormal is positive, so this is an acceptable value.

Unlike the normal, there is no closed form expression for the sum of two lognormal variables. Two approaches are typically used in this case – for a large, independent selection of variables with the same mean and variance the CLT implies that the distribution of the average will be roughly normal, while for small, highly correlated sequences with similar means they are still roughly lognormal with an adjusted mean and variance. The second technique is an example of ‘moment matching’, I’ll discuss it later in more detail.

Multivariate Normal Distribution

In the case that we have more than one normally distributed random variable [we assume here that they are ‘jointly normally distributed’], we need to build into our calculations the possibility that they might not be independent, which will lead to the multivariate normal distribution (a trivial example of the failure of our above expressions for non-independent variables is if \inline X_2 = –\inline X_1; in which case \inline X_2 + \inline X_1 = 0, and it is certainly NOT true that \inline 0\sim {\mathbb N}(\mu_1+\mu_2,\sigma_1^2 + \sigma_2^2)!

To measure the dependence of two normal variables, our starting point is their covariance. Similarly to variance, this measures the difference between the expectation of the product of two of these variables from the individual expectations

{\rm Cov}[X_1,X_2] = {\mathbb E}[X_1\cdot X_2] - {\mathbb E}[X_1]{\mathbb E}[X_2]

which is 0 for independent variables. A scaled version of the covariance is called the correlation

\rho(X_1,X_2) = {{\rm Cov}[X_1,X_2] \over \sqrt{{\rm Var}[X_1]{\rm Var}[X_2]}}

so that \inline \rho \in [-1,1]. The combined pdf for two or more variables becomes rapidly more complicated, I’ll look at ways of thinking about it another time but state here the case for two variables each with a standard normal distribution and correlation \inline \rho:

P(X=x, Y=y) = {1 \over 2\pi \sqrt{1-\rho^2}}\exp{\Bigl(-{1 \over 2(1-\rho^2)}(x^2 + y^2 - 2\rho xy)\Bigr)}

[Exercise: derive the product rule for two normal and for two lognormal variables when the normals involved are allowed to be correlated]

Finally, for normal variables only, we have the important result that if X and Y have correlation \inline \rho, we can re-write X as the sum of Y and Z as

X = \rho \cdot Y + \sqrt{1-\rho^2}\cdot Zwhere Y and Z are uncorrelated. I’ll be using this expression lots in the future!

Stochastic Differential Equations Pt2: The Lognormal Distribution

This post follows from the earlier post on Stochastic Differential Equations.

I finished last time by saying that the solution to the BS SDE for terminal spot at time T was

\inline S_T = S_0 e^{(\mu - {1 \over 2}\sigma^2)T + \sigma W_T}

When we solve an ODE, it gives us an expression for the position of a particle at time T. But we’ve already said that we are uncertain about the price of an asset in the future, and this expression expresses that uncertainty through the \inline \sigma W_T term in the exponent. We said in the last post that the difference between this quantity at two different times s and t was normally distributed, and since this term is the distance between t=0 and t=T (we have implicitly ignored a term W_0, but this is ok because we assumed that the process started at zero) it is also normally distributed,

W_T \sim {\mathbb N}(0,T)

It’s a well-known property of the normal distribution (see the Wikipedia entry for this and many others) that if X \sim {\mathbb N}(0,1) then aX \sim {\mathbb N}(0,a^2) for constant a. We can use this in reverse to reduce W_T to a standard normal variable x, by taking a square root of time outside of the distribution so W_T \sim \sqrt{T}\cdot{\mathbb N}(0,1) and we now only need standard normal variables, which we know lots about. We can repeat our first expression in these terms

\inline S_T = S_0 e^{(\mu - {1 \over 2}\sigma^2)T + \sigma \sqrt{T} X}

What does all of this mean? In an ODE environment, we’d be able to specify the exact position of a particle at time T. Once we try to build in uncertainty via SDEs, we are implicitly sacrificing this ability, so instead we can only talk about expected positions, variances, and other probabilistic quantities. However, we certainly can do this, the properties of the normal distribution are very well understood from a probabilistic standpoint so we expect to be able to make headway! Just as X is a random variable distributed across a normal distribution, S(t) is now a random variable whose distribution is a function of random variable X and the other deterministic terms in the expression. We call this distribution the lognormal distribution since the log of S is distributed normally.

The random nature of S is determined entirely by the random nature of X. If we take a draw from X, that will entirely determine the corresponding value of S, since the remaining terms are deterministic. The first things we might want to do are calculate the expectation of S, its variance, and plot its distribution. To calculate the expectation, we integrate over all possible realisations of X weighted by their probability, complete the square and use the gaussian integral formula with a change of variables

{\mathbb E}[S_t] = \int^{\infty}_{-\infty} S_t(x) p(x) dx

={S_0 \over \sqrt{2\pi}}\int^{\infty}_{-\infty} e^{(\mu-{1\over 2}\sigma^2)t + \sigma \sqrt{t}x} e^{-{1\over 2}x^2} dx

={ {S_0 e^{(\mu-{1\over 2}\sigma^2)t}} \over \sqrt{2\pi}}\int^{\infty}_{-\infty} e^{-{1\over 2}x^2 + \sigma \sqrt{t} x} dx

={{S_0 e^{(\mu-{1\over 2}\sigma^2)t}} \over \sqrt{2\pi}}\int^{\infty}_{-\infty} e^{-{1\over 2} (x - \sigma \sqrt{t})^2} e^{{1\over 2}\sigma^2 t} dx

={{S_0 e^{\mu t}} \over \sqrt{2\pi}}\int^{\infty}_{-\infty} e^{-{1\over 2} y^2} dy

=S_0 e^{\mu t}

which is just the linear growth term acting over time [exercise: calculate the variance in a similar way]. We know what the probability distribution of X looks like (it’s a standard normal variable), but what does the probability distribution of S look like? We can calculate the pdf using the change-of-variables technique, which says that if S = g(x), then the area under each curve in corresponding regions must be equal:

\int_{x_1}^{x_2} p_x(x) dx = \int_{g(x_1)}^{g(x_2)} p_S(S) dS

p_x(x) dx = p_S(S) dS

p_S(S_t) = p_x(x) {dx \over dS_t}

 We know the function S(x), but the easiest way to calculate this derivative is first to invert the function t make it ameanable to differentiation

x = {\ln{S_t \over S_0} - (\mu - {1 \over 2}\sigma^2)t \over \sigma \sqrt{t}}

{dx \over dS_t} = {1 \over \sigma \sqrt{t} S_t}

So the pdf of S expressed in terms of S is

 p_S(S_t) = {1 \over S_t \sigma \sqrt{2\pi t}} \exp{-\Bigl(\ln{S_t\over S_0} - (\mu-{1\over 2}\sigma^2)t \Bigr)^2\over 2 \sigma^2 t}

Well it’s a nasty looking function indeed! I’ve plotted it below for a few typical parameter sets and evolution times.

A lognormal PDF with typical parameter values. Of course, it is highly unlikely that parameter values will stay constant for 4 years – we’ll discuss time dependence of these another day
A more-volatile PDF. Note that even though the mode is falling over time, the mean (which is independent of vol) is still going up due to the fat tails at positive values.

This distribution is really central to a lot of what we do, so I’ll come back to it soon and discuss a few more of its properties. The one other thing to mention is that if we want to calculate an expected value over S (which will turn out to be something we do a lot), we have two approaches – either integrate over \inline p_S(S_t)

{\mathbb E}[f(S_t)] = \int_0^{\infty} f(S_t)p_S(S_t)dS

or,instead express the function in terms of x instead (using \inline S_t = S_0 e^{(\mu - {1 \over 2}\sigma^2)t + \sigma \sqrt{t} x}) and instead integrate over the normal distribution

{\mathbb E}[f(S_t(x))] = \int_{-\infty}^{\infty} f(x)p_x(x)dx

This is typically the easier option. I think it is called the Law of the Unconscious Statistician. On that note, we’ve certainly covered enough ground for the moment!