European vs. American Options

All of the options that I’ve discussed so far on this blog have been European options. A European option gives us the right to buy or sell an asset at a fixed price, but only on a particular expiry date. In this post, I’m going to start looking at American options, which give the right to buy or sell at ANY date up until the expiry date.

Surprisingly for the case of vanilla options, despite the apparent extra utility of American options, it turns out that the price of American and European options is almost always the same! Why is this?

The value of a european call option, broken down into its intrinsic and its time component. Volatility is 10%, strike is 100, time to expiry is 1 year and risk free rate is 1%
The value of a european call option, broken down into its intrinsic and its time component. Volatility is 10%, strike is 100, time to expiry is 1 year and risk free rate is 1%

In general, American options are MUCH harder to price than European options, since they depend in detail on the path that the underlying takes on its way to the expiry date, unlike Europeans which just depend on the terminal value, and no closed form solution exists. One thing we can say is that an American option will never be LESS valuable than the corresponding European option, as it gives you extra optionality but doesn’t take anything away. So we can always take the European price to be a lower bound on American prices. Also note that Put-Call Parity no longer holds for Americans, and becomes instead an inequality.

How can we go any further? It is useful in this case to think about the value of an option as made up of two separate parts, an ‘intrinsic value’ and a ‘time value’, which sum to give the true option value. The ‘intrinsic value’ is the value that would be received if the exercise was today – in the case of a vanilla call, this is simply \max(0,S-K). The ‘time value’ is the ‘extra’ value due to time-to-expiry. This is the volatility-dependent part of the price, since we are shielded by the optionality from price swings in the wrong direction, but are still exposed to upside from swings in our favour. As time goes by, the value of the option must approach the ‘intrinsic value’, as the ‘time value’ decays towards expiry.

Consider the graph above, which shows the BS value of a simple European call under typical parameters. Time value is maximal at-the-money, since this is the point where the implicit insurance that the option provides is most useful to us (far in- or out-of-the-money, the option is only useful if there are large price swings, which are unlikely).

What is the extra value that we should assign to an American call relative to a European call due to the extra optionality it gives us? In the case of an American option, at any point before expiry we can exercise and take the intrinsic value there and then. But up until expiry, the value of a European call option is ALWAYS* more than the intrinsic value, as the time value is non-negative. This means that we can sell the option on the market for more than the price that would be received by exercising an American option before expiry – so a rational investor should never do this, and the price of a European and American vanilla call should be identical.

It seems initially as though the same should be true for put options, but actually this turns out not quite to be right. Consider the graph below, showing the same values for a European vanilla put option, under the same parameters.

The value of a european put option under the same parameters as used above, broken down into intrinsic and time components. Unlike the call option, for far in-the-money puts, the time value can be negative, so early exercise can be valuable
The value of a european put option under the same parameters as used above, broken down into intrinsic and time components. Unlike the call option, for far in-the-money puts, the time value can be negative, so early exercise can be valuable

Notice that here, unlike before, when the put is far in-the-money the option value becomes smaller than the intrinsic value – the time value of the option is negative! In this case, if we held an American rather than a European option it might well make sense to exercise at this point, since we would receive the intrinsic value, which is greater than the option value on the market [actually it’s slightly more complicated, because in this scenario the American option price would be higher than the European value shown below, so it would need to be a bit more in the money before it was worth exercising – you can see how this sort of recursive problem rapidly becomes hard to deal with!].

What is it that causes this effect for in-the-money puts? It turns out that it comes down to interest rates. Roughly what is happening is this – if we exercise an in-the-money American put to receive the intrinsic value, we receive (K-S) cash straight away. But if we left the option until expiry, our expected payoff is roughly (K-F), where F is the forward value

    \[F(t,T) = {1\over ZCB(t,T)} S(t)\]

so we can see that leaving the underlying to evolve is likely to harm our option value [this is only true for options deep enough in the money for us to be able to roughly neglect the \max(0,K-S) criterion]

We can put this on a slightly more rigourous footing by thinking about the GREEK for time-dependence, Theta. For vanilla options, this is given by

    \begin{align*} \Theta &= - {\partial V \over \partial \tau} \nonumber \\ &= ZCB(t,T)\cdot\Big\{-{\sigma F(t,T) \phi(d_1)\over 2\sqrt{\tau}} \mp rK\Phi(\pm d_2) \Big\} \nonumber \end{align*}

where F is the forward price from t to T\phi(x) is the standard normal PDF of x and \Phi(x) is its CDF, ZCB is a zero-coupon bond from t to T and the upper of the \mp refers to calls and the lower to puts.

The form for Theta shows exactly what I said in the last paragraph – for both calls and puts there is a negative component coming from the ‘optionality’, which is decreasing with time, and a term coming from the expected change in the spot at expiry due to interest rates which is negative for calls and positive for puts.

The plot below shows Theta for the two options shown in the graphs above, and sure enough where the time value of the European put goes negative, Theta becomes positive – the true option value is increasing with time instead of decreasing as usual, as the true value converges to the intrinsic value from below.

The Theta value for European options. For a European put, this becomes positive when the option value falls below the intrinsic value. The difference between these two Thetas is independent of spot, which can be seen directly from put-call parity.
The Theta value for European options. For a European put, this becomes positive when the option value falls below the intrinsic value. The difference between these two Thetas is independent of spot, which can be seen directly from put-call parity.

*Caveats – I’m assuming a few things here – there are no dividends, rates are positive (negative rates reverses the situation discussed above – so that American CALLS can be more valuable than Europeans), no transaction fees or storage costs, and the other sensibleness and simpleness criteria that we usually assume apply.

In between European and American options lie Bermudan options, a class of options that can be exercised early but only at one of a specific set of times. As I said, it is in general really tough to price more exotic options with early exercise features like these, I’ll look at some methods soon – but this introduction is enough for today!

Excel Monte-Carlo Pricer

I’ve updated the Monte-Carlo pricer to run in excel. Give it a go and the benefits should be obvious – no more messing around with consoles or repeated typing, excel provides a handy interface for running C++ files.

What you’ll need to do is:

  • Download the Excel pricing sheet
  • Download the .xll library (large!) – this is the equivalent of the old .exe file
  • Open the Excel file, and drag-and-drop the .xll file inside the pricing sheet
  • Click on the option to “enable this add-on for this session only”
  • The Option Price function should now re-calculate option price as you make changes to the parameters!

Try using the option pricer function yourself – it now works just like a native excel function, so you can drag it into many cells and it will adjust target cells as usual allowing for easy variation of parameters and plotting of results.

If you’ve successfully set up XLW and Code::Blocks as I described in a previous post, I’ve also made the source code available in a .zip file which you can load in Code::Blocks and fiddle with yourself. I’ll be doing a bit of updating soon to make the code slightly cleaner and allow pricing of more interesting options.

The old console project source code is still available in compiled form and with the source code in a zip file at the bottom of the MC pricers page.

Forwards vs. Futures

I’ve covered Forwards and Futures in previous posts, and now that I’ve covered the basics of Stochastic Interest Rates as well, we can have a look at the difference between Forwards and Futures Contracts from a financial perspective.

As discussed before, the price of a Forward Contract is enforceable by arbitrage if the underlying is available and freely storable and there are Zero Coupon Bonds available to the Forward Contract delivery time. In this case, the forward price is

F(t,T) = S(t) \cdot {1 \over {\rm ZCB}(t,T)}

In this post I’m going to assume a general interest rate model, which in particular may well be stochastic. In such cases, the price of a ZCB at the present time is given by

{\rm ZCB}(t,T) = {\mathbb E}\Big[ \exp{\Big\{\int_t^T r(t') dt'\Big\} } \Big]

Futures Contracts are a bit more complicated, and we need to extend our earlier description in the case that there are interest rates. The basic description was given before, but additionally in the presence of interest rates, any deposit that is in either party’s account is delivered to the OTHER party at the end of each time period. So, taking the example from the previous post, on day 4 we had $4 on account with the exchange – if rates on that day were 10% p.a., over that day the $4 balance would accrue about 10c interest, which would be paid to the other party.

Let’s say we’re at time s, and want to calculate the Futures price to time T. Our replication strategy is now as follows, following the classic proof due to Cox, Ingersall and Ross but in continuous time. Futures Contracts are free to enter into and break out of due to the margin in each account, so entering X Futures Contracts at time t and closing them at time t+dt will lead to a net receipt (or payment if negative) of \inline {\rm X}\cdot\big[ H(t+dt,T) - H(t,T)\big]. From t+dt to T, we invest (borrow) this amount at the short rate and thus recieve (need to pay)

{\rm X}\cdot\big[ H(t+\tau,T) - H(t,T)\big]\cdot\prod_t^T \big( 1 + r(t)\tau \big)

and now moving to continuous time

{\rm X}\cdot\big[ H(t+dt,T) - H(t,T)\big]\cdot\int_t^T e^{ r(t)}\ dt

We follow this strategy in continuous time, constantly opening contracts and closing them in the following time period [I’m glossing over discrete vs. continuous time here – as long as the short rate corresponds to the discrete time step involved this shouldn’t be a problem], and investing our profits and financing our losses both at the corresponding short rate. We choose a different X for each period [t,t+td] so that \inline {\rm X}(t) = \int_s^t \exp{\{r(t')\}}dt'. We also invest an amount H(s,T) at time s at the short rate, and continually roll this over so that it is worth \inline H(s,T)\cdot \int_s^T \exp{\{r(t)\}}dt at time T

Only the final step of this strategy costs money to enter, so the net price of the portfolio and trading strategy is H(s,T). The net payoff at expiry is

H(s,T)\cdot \int_s^T e^{r(t)}dt + \sum_s^T {\rm X}\cdot[H(t+dt,T)-H(t,T))]\cdot\int_t^T e^{r(t)}dt

= H(s,T)\cdot \int_s^T e^{r(t)}dt + \sum_s^T \int_s^t e^{r(t)}dt\cdot[H(t+dt,T)-H(t,T))]\cdot\int_t^T e^{r(t)}dt

= H(s,T)\cdot \int_s^T e^{r(t)}dt + \int_s^T e^{r(t)}dt \cdot \sum_s^T [H(t+dt,T)-H(t,T))]

= H(s,T)\cdot \int_s^T e^{r(t)}dt + \int_s^T e^{r(t)}dt \cdot [H(T,T)-H(s,T))]= H(T,T) \cdot \int_s^T e^{r(t)}dt

And H(T,T) is S(T), so the net payoff of a portfolio costing H(s,T) is

= S(T) \cdot \int_s^T e^{r(t)}dt

How does this differ from a portfolio costing the Forward price? Remembering that in Risk-Neutral Valuation, the present value of an asset is equal to the expectation of its future value discounted by a numeraire. In the risk-neutral measure, this numeraire is a unit of cash B continually re-invested at the short rate, which is worth \inline B(t,T) = e^{\int_t^T r(t')dt' }, so we see that the Futures Price is a martingale in the risk-neutral measure (sometimes called the ‘cash measure’ because of its numeraire). So the current value of a Futures Contract on some underlying should be

H(t,T) = {\mathbb E}^{\rm RN}\big[ S(T) | {\cal F}_t \big]

ie. the undiscounted expectation of the future spot in the risk-neutral measure. The Forward Price is instead the expected price in the T-forward measure whose numeraire is a ZCB expiring at time T

F(t,T) = {\mathbb E}^{\rm T}\big[ S(T) | {\cal F}_t \big]

We can express these in terms of each other remembering F(T,T) = S(T) = H(T,T) and using a change of numeraire (post on this soon!). I also use the expression for two correlated lognormal, which I derived at the bottom of this post

\begin{align*} F(t,T) &= {\mathbb E}^{T}\big[ F(T,T) | {\cal F}_t \big] \\ &= {\mathbb E}^{T}\big[ S(T) | {\cal F}_t \big] \\ &= {\mathbb E}^{T}\big[ H(T,T) | {\cal F}_t \big] \\ &= {\mathbb E}^{RN}\big[ H(T,T) {B(t)\over B(T)} {{\rm ZCB}(t,T)\over {\rm ZCB(T,T)}}| {\cal F}_t \big] \\ &= {\rm ZCB}(t,T){\mathbb E}^{RN}\big[ H(T,T) {1\over B(T)}| {\cal F}_t \big] \\ &= {\rm ZCB}(t,T){\mathbb E}^{RN}\big[ H(T,T)\big] {\mathbb E}^{RN}\big[ e^{-\int_t^T r(t')dt'} \big] e^{\sigma_H \sigma_B \rho} \\ &= H(t,T) \cdot e^{\sigma_H \sigma_B \rho} \\ \end{align*}

where \inline \sigma_H is the volatility of the Futures price, and \inline \sigma_B is the volatility of a ZCB – in general the algebra will be rather messy!

As a concrete example, let’s consider the following model for asset prices, with S driven by a geometric brownian motion and rates driven by the Vasicek model discussed before

{dS \over S} = r(t) dt + \sigma_S dW_t

dr = a \big[ \theta - r(t)\big] dt + \sigma_r \widetilde{dW_t}

And (critically) assuming that the two brownian processes are correlated according to rho

dW_t \cdot \widetilde{dW_t} = \rho dt

In this case, the volatility \inline \sigma_B is the volatility of \inline {\mathbb E}\big[ e^{-\int_t^T r(t')dt'}\big], and as I discussed in the post on stochastic rates, this is tractable and lognormally distributed in this model.

We can see that in the case of correlated stochastic rates, these two prices are not the same – which means that Futures and Forward Contracts are fundamentally different financial products.


For two standard normal variates x and y with correlation rho, we have:

\begin{align*} {\mathbb E}\big[ e^ {\sigma_1 x} \big]& = e^ {{1\over 2}\sigma_1^2 } \end{align*}


\begin{align*} {\mathbb E}\big[ e^ {\sigma_1 x + \sigma_2 y} \big]& = {\mathbb E}\big[ e^ {\sigma_1 x + \sigma_2 \rho x + \sigma_2 \sqrt{1-\rho^2}z} \big]\\ & = {\mathbb E}\big[ e^ {(\sigma_1 + \sigma_2 \rho) x + \sigma_2 \sqrt{1-\rho^2}z} \big]\\ & = \big[ e^ {{1\over 2}(\sigma_1 + \sigma_2 \rho)^2 + {1\over 2}(\sigma_2 \sqrt{1-\rho^2})^2} \big]\\ & = \big[ e^ {{1\over 2}\sigma_1^2 + {1\over 2}\sigma_2^2 + \sigma_1 \sigma_2 \rho} \big]\\ & = {\mathbb E}\big[ e^ {\sigma_1 x }\big] {\mathbb E} \big[ e^{\sigma_2 y}\big] e^{ \sigma_1 \sigma_2 \rho} \end{align*}

Interview Questions IV

Another question about correlations today, this time I thought we could have a look at a simple type of random walk, in which the distance travelled at each step either backwards or forwards and has a random length, and how to deal with the issues that come up.

Let’s say at each step we move a distance along the x-axis that is distributed randomly and uniformly between -1 and +1, and importantly that each step is independent of the others. So, after N steps the total distance travelled, L, is

L_N = \sum_{i=0}^{N} x_i\ ; \qquad x_i\sim {\mathbb U}[-1,+1]

where \inline x_i is the i-th step length.


i) the expected distance travelled after N steps

ii) the standard deviation of the distance travelled after N steps

iii) the autocorrelation between the distance travelled at N steps and the distance travelled at N+n steps

Since we’re dealing with uniform variables, it makes sense to start by calculating the expectation and variance of a single realisation of a variable of this type. The expectation is trivially 0, while the variance is

\begin{align*} {\rm Var}[x_i] & = {\mathbb E}[x_i^2] - {\mathbb E}[x_i]^2\\ & = \int_{-1}^{+1} x^2 dx - 0 \\ & = {2\over 3} \end{align}

We’ll also make use of the independence of the individual variables at several points, we recall that for independent variables x and y, that \inline {\mathbb E}[xy] = {\mathbb E}[x] {\mathbb E}[y]


i) This one is fairly straight-forward. Expectation is a linear operator, so we can take it inside the sum. We know the expectation of an individual variate, so the expectation of the sum is just the product of these

\begin{align*} {\mathbb E}\Big[\sum_{i=0}^N x_i \Big] & = \sum_{i=0}^N {\mathbb E}[ x_i ]\\ & = N\cdot 0\\ & = 0 \end{align}


ii) The standard deviation is the square root of the variance, which is the expectation of the square minus the square of the expectation. We know the second of these is 0, so we only need to calculate the first,

\begin{align*} {\rm Var}\Big[\sum_{i=0}^N x_i \Big] & = {\mathbb E}\Big[\Big(\sum_{i=0}^N x_i \Big)^2\Big]\\ & = {\mathbb E}\Big[\sum_{i,j=0}^N x_i x_j\Big]\\ &=\sum_{i,j=0}^N {\mathbb E} [x_i x_j] \end{align}

There are two types of term here. When i and j are not equal, we can use the independence criterion given above to express this as the product of the two individual expectations, which are both 0, so these terms don’t contribute. So we are left with

\begin{align*} {\rm Var}\Big[\sum_{i=0}^N x_i \Big] &=\sum_{i=0}^N {\mathbb E} [(x_i)^2] \\ &= N\cdot {2\over3} \end{align}and the standard deviation is simply the square root of this.


iii) This is where things get more interesting – the autocorrelation is the correlation of the sum at one time with its value at a later time. This is a quantity that quants are frequently interested in, since the value of a derivative that depends on values of an underlying stock at several times will depend sensitively on the autocorrelation. We recall the expression for correlation

\rho(x,y) = {{\rm Cov}(x,y) \over \sqrt{{\rm Var}(x){\rm Var}(y) } }

So we are trying to calculate

\begin{align*} \rho(L_N, L_{N+n}) = {\rm Cov}\Big[\sum_{i=0}^N x_i \cdot \sum_{j=0}^{N+n} x_j \Big] \cdot {3 \over 2\sqrt{N (N+n)}} \end{align}

where I’ve substituted in the already-calculated value of the variances of the two sums.

We can again use the independence property of the steps to separate the later sum into two, the earlier sum and the sum of the additional terms. Also, since the expectation of each sum is zero, the covariance of the sums is just the expectation of their product

\begin{align*} \rho(L_N, L_{N+n})&= {\rm Cov}\Big[\sum_{i=0}^N x_i \cdot \Big(\sum_{j=0}^{N} x_j + \sum_{j=N+1}^{N+n} x_j \Big) \Big] \cdot {3 \over 2\sqrt{N (N+n)}}\\&= {\mathbb E}\Big[\sum_{i=0}^N x_i \cdot \Big(\sum_{j=0}^{N} x_j + \sum_{j=N+1}^{N+n} x_j \Big) \Big] \cdot {3 \over 2\sqrt{N (N+n)}}\\&= {\mathbb E}\Big[\sum_{i,j=0}^N x_i x_j + \sum_{i=0}^N x_i \cdot\sum_{j=N+1}^{N+n} x_j \Big] \cdot {3 \over 2\sqrt{N (N+n)}} \end{align}and using the results above and the independence of the final two sums (because they are the sums of different sets of terms, and each term is independent to all the others) we know

{\mathbb E}\Big[\sum_{i,j=0}^N x_i x_j \Big] = {2 \over 3}N

{\mathbb E}\Big[\sum_{i=0}^N x_i \cdot\sum_{j=N+1}^{N+n} x_j \Big] ={\mathbb E}\Big[\sum_{i=0}^N x_i \Big]\cdot {\mathbb E}\Big[\sum_{j=N+1}^{N+n} x_j \Big] = 0


\begin{align*}\rho(L_N, L_{N+n}) & = {N\over \sqrt{N(N+n)}}\\ &= \sqrt{N\over N+n} \end{align*}

What does this tell us? Roughly that the sum of the sequence up to N+n terms is correlated to its value at earlier points, but as n gets larger the correlation decreases, as the new random steps blur out the position due to the initial N steps.

We can test our expressions using the RAND() function in excel. Try plotting a sequence of sets of random numbers and summing them, and then plotting the set of sums of 100 terms against the set of sums of 120 or 200 terms (nb. in excel, you probably want to turn auto-calculate off first to stop the randoms from refreshing every time you make a change – instructions can be found here for Excel 2010; for Excel 2013 I found the option inside the “FORMULAS” tab and at the far end – set the ‘Calculation Options’ to manual). I’ve done exactly that, and you can see the results below.

The sum of 100 terms vs. the sum of 120 terms. These are of course highly correlated, as the additional 20 terms usually don't affect the overall sum to a significant extent
The sum of 100 terms vs. the sum of 120 terms. These are of course highly correlated, as the additional 20 terms usually don’t affect the overall sum to a significant extent
The sum of the first 100 terms against the sum of 200 terms. We can see that the sums are slowly becoming less correlated
The sum of the first 100 terms against the sum of 200 terms. We can see that the sums are slowly becoming less correlated
This is the sum of the first 100 terms against the first 500. The correlation is much lower than in the graphs above, but not that from the formula we derived we still expect a correlation of around 45% despite the large number of extra terms in the second sum.
This is the sum of the first 100 terms against the first 500. The correlation is much lower than in the graphs above, but note that from the formula we derived we still expect a correlation of around 45% despite the large number of extra terms in the second sum.

You can also try calculating the correlation of the variables uing Excel’s CORREL() that you generate – these should tend towards the expression above as the number of sums that you compute gets large (if you press F9, all of the random numbers in your sheet will be recomputed and you can see the actual correlation jump around, but these jumps will be smaller as the number of sums gets larger).

Running C++ in Excel via XLW

Anyone who has downloaded and compiled the console version of my Monte Carlo Pricer will know that it’s not incredibly user-friendly. You have to double-click on an .exe file to run it, and then each time select input parameters which involves a fair amount of typing. The output is given as a single number on a console window, which isn’t very convenient for plotting multiple results or for further data analysis.

In fact, console applications aren’t that commonly used by quants. There are a few solutions used in the industry, and the one I’m going to look at today is an excel wrapper. The idea is that we will write C++ code as normal, then use a programme called XLW to wrap it as a link library and allow it to be called from inside excel. This combines the advantages of excel’s flexibility with C++’s speed and power. This post is going to be a step-by-step guide to getting things up and running (I’ve struggled with it a bit while trying to set things up) with a very simple example function – in future posts I’ll discuss shifting the old pricer over to an excel implementation.

I’m going to go through with screenshots from a clean start, hopefully this should be fairly general but if things change in future it may become outdated and require me to update or re-write these guidelines. I will discuss how to set XLW up using the open source Code::Blocks IDE, it’s a bit more advanced than the Dev-C++ that I’ve been using before and I’m shifting to it from now on for this blog.


  • Download and install Code::Blocks, WITHOUT a built in compiler

Auto-installers are available here, but it’s important that you choose the version WITHOUT the built in compiler, as this is out of date

Choose this download!
Choose this download!

Double-click on the downloaded file to auto-install; agree to the terms and choose default options (by default it should do a full install needing about 60MB and install to directory “C:\Program Files (x86)\CodeBlocks”)


  • Download and install the latest version of MinGW compiler

The latest version of the MinGW compiler can be found here, download and double-click on the .exe file to install. Click through the screens, choosing the following option on the third screen: “Use pre-packaged repository catalogues”; by default the compiler will install to “C:\MinGW”. On the ‘select components’ screen, be sure to select “C++ Compiler”! This WASN’T checked for me by default!

Check C++ compiler on this screen
Check C++ compiler on this screen


  • Register MinGW with Code::Blocks

In order to register the compiler that we’ve just downloaded, run Code::Blocks (there should now be a shortcut from your start bar).

How to register MinGW with Code::Blocks
How to register MinGW with Code::Blocks

In the toolbar, select Settings->Compiler Settings; and in the window that opens select the “Toolchain Executables” tab. Click on “Auto-detect”, and the IDE should find MinGW successfully.


  • Download and install XLW

The latest version of XLW is available via the “Download” link on the front page of the project here. At the time of writing, this is version 5.0 (released Feb’ 2013).

Once again, double-click on the downloaded .exe to install, but pay attention to the screens that pop up as I had to make some modifications to the default choices, as follows:

  1. XLW auto-detected Code::Blocks on the third page (Installed Development Environments) – if you see a blank white box here, you’ve not installed the IDE correctly
  2. On the 4th screen, I had to expand the grey ‘X’s and selected the option shown in the image below
Expand the selections, and check 'Code::Blocks' under 'Source'
Expand the selections, and check ‘Code::Blocks’ under ‘Source’

By default, XLW will install to “C:\Program Files (x86)\XLW\xlw-5.0.0f0”


  • Run the XLW template generator

Now we’re getting close. In your start menu, you should now find an XLW option. Go to XLW>xlw-5.0.0f0>xlw>Extract XLW xll template; select “Code::Blocks” on the window that appears, by default the template will be created in the directory “C:\Users\….\Documents\XLL_Project”

Select Code::Blocks in the template extractor options
Select Code::Blocks in the template extractor options



Unfortunately, there seems to be a bug which has something to do with how XLW and MinGW are talking to one-another. After fiddling with many install configurations without success, I found the same issue discussed here (the second paragraph with a dark background), along with the following fix:

  1. Go to C:\MinGW\bin and locate “mingw32-make.exe”
  2. Copy this to the directory C:\Users\….\Documents\XLL_Project which you created in the previous step (it should contain “RunInterfaceGenerator.mak” among other files)
  3. Rename “mingw32-make.exe” to “make.exe”
  4. Run Code::Blocks, in the toolbar go to Settings->Compiler again, and in the window that pops up select ‘Restore Defaults’. Click ‘OK’ on both of the warning pop-up windows.


  • Open the Template project in Code::Blocks

If it isn’t already running from the last step, run Code::Blocks. Go to the directory C:\Users\….\Documents\XLL_Project that was created before, and open the project file “Template.cbp”. If all of the previous steps have been done correctly, you should simply be able to click “Build” (it’s the cog-shaped icon in one of the task bars) and have it compile without problems. Note that after compilation, a file called “Template.xll” appears in the directory C:\Users\….\Documents\XLL_Project\bin\Debug\

To test that everything is working, add an additional function to “source.cpp” and declare it in “cppinterface.h”, such as the example given in the picture below and at the bottom of this post, and check that the project still compiles.

I've added a trivial additional function here to add two doubles - of course, I've also updated the header file with a declaration of the function. XLW will do the rest!
I’ve added a trivial additional function here to add two doubles – of course, I’ve also updated the header file with a declaration of the function. XLW will do the rest!


  • Call our functions in Excel!

Open an instance of Excel. In order to load the functions that we’ve created into Excel, find the file “Template.xll” in the directory C:\Users\….\Documents\XLL_Project\bin\Debug\ and drag-and-drop it into the Excel window.

Here we can see the trivial example we made before in action - our function is adding two doubles in different cells together. I will present more advanced examples in future posts.
Here we can see the trivial example we made before in action – our function is adding two doubles in different cells together. I will present more advanced examples in future posts.

You should now find that the additional function you’ve written is available alongside Excel’s in-built functions: congratulations, you’ve made your C++ scripts run with excel as a front end! I’ll be looking at the advantages of this a lot in the near future.


Additional example function code

In source.cpp:

double // adds two doubles
AddTwoDoubles( double x, // The first double
               double y  // The second double
  return x + y;

In cppinterface.h (before the #endif ):

double // adds two doubles
AddTwoDoubles( double x, // The first double
               double y  // The second double

Put-Call Parity

Put-Call parity is a simple result connecting the prices of puts and calls in a model-independent way via the forward price.

Consider the three graphs below, showing independently the payoff at expiry of a vanilla call, a vanilla put, and a forward contract. We can see that the payoff of a long call plus the payoff of a short put will precisely overlap the forward contract payoff (assuming they have the same strike and expiry…).

The combination of a long call and a short put with the same strike and expiry is equivalent to a forward at the same strike and expiry. This is guaranteed by their payoffs at expiry (making the usual assumptions about tradability of the underlying/forward etc.), since the payoff of holding a long call and a short strike can be exactly hedged by holding a forward contract
The combination of a long call and a short put with the same strike and expiry is equivalent to a forward at the same strike and expiry. This is guaranteed by their payoffs at expiry (making the usual assumptions about tradability of the underlying/forward etc.), since the payoff of holding a long call and a short strike can be exactly hedged by holding a forward contract

This can be seen from the algebra as well:

\begin{align*} C_{\rm call}(T) - C_{\rm put}(T)\ & = \big( S(T) - K \big)^+ - \big( K - S(T) \big)^+ \\ & = S(T) - K \\ & = F(T,T) \end{align*}

If two portfolios have the same payoff, it’s a fundamental rule of derivatives pricing that they must have the same price, which gives the fundamental put-call parity relationship

C_{\rm call}(t) - C_{\rm put}(t)\ = Z(t,T)\cdot\Big( F(t,T) - K\Big)

We can take the BS expressions for put and call prices given in a previous post and observe that they obey the relationship [since \inline \Phi(x) + \Phi(-x) = 1], but importantly this result is model independent. As long as there is a forward contract available with the strike and expiry of the options, then this result MUST hold in ANY model.

One implication of this is that if we use a model to determine the price of a call option, put-call parity fixes the price of the put option to be

C_{\rm put}(t) = C_{\rm call}(t) - Z(t,T)\cdot\Big( F(t,T) - K\Big)

We can actually use this to improve our code. Out-of-the-money options typically require more paths to converge in Monte Carlo simulations because they rely on the detailed behaviour of a few paths that travel a long way. However, we can calculate in-the-money prices quickly and due to put-call parity these will lock the price of the matching out-of-the-money options.

Put-call parity is also an important test of the implementation of a model – if the prices that we are getting out don’t obey this relationship then we’ve done something seriously wrong [as long as both prices have converged!]. An equivalent statement is that the implied BS volatility of matching puts and calls should be exactly the same in any model, since the implied vol is the BS vol that would produce the given price for a put/call, and this is locked by put-call parity.

We can get some other similar relationships between put and call prices that I’ll look at in the future. Also note that put-call parity holds for european-style options, but not for many more complicated path-dependent options or those with early exercise features like American options.

Stochastic Rates Models

In a previous post, I introduced the Black-Scholes model, in which the price of an underlying stock is modeled with a stochastic variable that changes unpredictably with time. I’ve also discussed the basic model-independent rates products whose value can be determined at the present time exactly. However, to progress further with interest rate derivatives, we’re going to need to model interest rates more carefully. We’ve assumed rates are deterministic so far, but of course this isn’t true – just like stocks, they change with time in an uncertain manner, so we need to allow them to become stochastic as well.

One way of doing this is by analogy with the BS case, by allowing the short rate (which is the instantaneous risk-neutral interest rate r(t)) to become stochastic as well, and then integrating it over the required period of time to calculate forward rates.

A very basic example of this is the Vasicek Model. In this model the short rate is defined to be stochastic, with behaviour governed by the following SDE

    \[dr = k\ [\ \theta - r(t)\ ]\ dt\ +\ \sigma\ dW\]

where k\theta and \sigma are constants and dW is the standard Wiener increment as described before. This is marginally more complicated than the BS model, but still belongs to the small family of SDEs that are analytically tractable. Unlike stock prices, we expect rates to be mean-reverting – stock price variance grows with time, but we expect the distribution of rates to be confined to a fairly narrow range by comparison. The term in square brackets above achieves this, since if r(t) is greater than \theta then it will be negative and cause the rate to be pulled down, while if it is below \theta the term will be positive and push the rate up towards \theta.

Solving the equation requires a trick, which is instead of thinking about the rate alone to think about the quantity d\big(\ e^{kt}\cdot r(t)\ \big). This is equal to e^{kt}\cdot dr(t)\ + \ ke^{kt} dt \cdot r(t), and substituting in the incremental rate term from the original equation we have

    \[d\big(\ e^{kt}\cdot r(t)\ \big) = e^{kt} \Big(k\ [ \theta - r(t) ] dt + \sigma dW \Big) + \ e^{kt} dt \cdot r(t)\]

    \[d\big(\ e^{kt}\cdot r(t)\ \big) = e^{kt} \Big( k\cdot \theta\cdot dt + \sigma\cdot dW \Big)\]

note that the term in r(t) has been cancelled out, and the remaining terms can be integrated directly from a starting time s to a finishing time T

    \[\Big[\ e^{kt}\cdot r(t)\ \Big]_{t=s}^{t=T} = \int_{t=s}^{t=T} k\ \theta\ e^{kt}\cdot dt + \int_{t=s}^{t=T} e^{kt} \sigma\cdot dW\]

    \[e^{kT}\cdot r(T)\ - e^{ks}\cdot r(s)\ = \Big[\ \theta\ e^{kt}\ \Big]_{t=s}^{t=T} + \sigma \int_{t=s}^{t=T} e^{kt}\cdot dW\]

    \[r(T)\ = e^{-k(T-s)}\cdot r(s)\ + \theta\Big(\ 1 - e^{-k(T-s)}\ \Big) + \sigma \int_{s}^{T} e^{-k(T-t)}\cdot dW(t)\]

where r(s) is the initial rate. This is simply a gaussian distribution with mean and variance given by

    \[\mathbb{E} \big[\ r(T)\ |\ {\cal F}_{\rm s}\ \big] = e^{-k(T-s)}\cdot r(s)\ +\ \theta\Big(\ 1 - e^{-k(T-s)}\ \Big)\]

    \[\mathbb{V} \big[\ r(T)\ |\ {\cal F}_{\rm s}\ \big] = {\sigma^2 \over 2k} \Big(\ 1 - e^{-2k(T-s)}\ \Big)\]

Where the variance is calculated using the “Ito isometry

    \[\mathbb{V} \big[\ r(T)\ |\ {\cal F}_{\rm s}\ \big] = \mathbb{E} \big[\ \big(r(T)\big)^2\ |\ {\cal F}_{\rm s}\ \big] - \big(\ \mathbb{E} \big[\ r(T)\ |\ {\cal F}_{\rm s}\ \big]\ \big)^2\]

    \[= \Big( \sigma \int_s^T e^{-k(T-t)} dW(t) \Big)^2\]

    \[=\sigma^2 \int_s^T \Big( e^{-k(T-t)}\Big)^2 dt\]

    \[= {\sigma^2 \over 2k} \Big(\ 1 - e^{-2k(T-s)}\ \Big)\]

as stated above. Note that this allows the possibility of rates going negative, which is generally considered to be a weakness of the model, but the chance is usually rather small.

As we know the distribution of the short rate, we can calculate some other relevant quantities. Of primary importance are the Zero Coupon Bonds, which are required for calculation of forward interest rates. A ZCB is a derivative that pays $1 at a future time t, and we can price this using the Risk-Neutral Valuation technique.

According to the fundamental theorem of asset pricing, the current price of a derivative divided by our choice of numeraire must be equal to it’s future expected price at any time divided by the value of the numeraire at that time, with the expectation taken in the probability measure corresponding to the choice of numeraire.

In the risk-neutral measure, the numeraire is just a unit of currency, initially worth $1 but continually re-invested at the instantaneous short rate, so that its price at time t is $1 \cdot e^{\int_0^t r(t') dt}. Now, the price of the ZCB is given by

    \[{Z(0,t) \over \$1} = {\mathbb E}^{RN}\Big[\ {Z(s,t)\over \$1 \cdot e^{\int_{0}^s r(t') dt' }}\ |\ {\cal F}_0\ \Big]\]

    \[Z(0,t) = {\mathbb E}\Big[\ {Z(s,t)\cdot e^{-\int_{0}^s r(t') dt' }}\ |\ {\cal F}_0\ \Big]\]

Although the RHS is true at any time, we only know the value of the ZCB exactly at a single time at the moment – the expiry date t, at which it is worth exactly $1. Plugging in these values we have

    \[Z(0,t) = \$1 \cdot {\mathbb E}\Big[\ e^{-\int_{0}^t r(t') dt' }\ |\ {\cal F}_0\ \Big]\]

So the ZCB is given by the expectation of the integral of the rate over a period of time. Since the rate is itself gaussian, and an integral is the limit of a sum, it’s not surprising that this quantity is also gaussian (it’s effectively the sum of many correlated gaussians, which is also gaussian, as discussed before), but it’s rather tricky to calculate, I’ve included the derivation at the bottom of the post to same space here. The mean and variance are given by

    \[\mathbb{E} \big[\int_s^T r(t)\ dt \ |\ {\cal F}_{\rm s}\ \big] = \big(r(s) - \theta \big) \cdot B(s,T)\ +\ \theta\cdot\big( T- s \big)\]

    \[\mathbb{V} \big[\int^T_s r(t)\ dt \ |\ {\cal F}_{\rm s}\ \big] = {\sigma^2 \over k^2} \Big( (T-s) -B(s,T) - {k\over 2} B(s,T)^2 \Big)\]


    \[B(s,T) = {1\over k}\big( 1 - e^{-k(T-s)}\big)\]

The ZCB is given by the expectation of the exponential of a gaussian variable – and we’ve seen on several occasions that

    \[\mathbb{E} \big[\ \exp{(x)}\ |\ x\sim {\mathbb N}(\mu, \sigma^2)\ \big] = \exp{ \Big(\mu + {1\over 2}\sigma^2 \Big) }\]

So the ZCB prices are

    \[Z(s,T) = A(s,T)\ e^{-B(s,T)\ r(s)}\]

with B(s,T) as defined above and

    \[A(s,T) = \exp{ \Big[ \Big( \theta - {\sigma^2 \over 2 k^2}\Big)\Big( B(s,T) - T + s \Big) - {\sigma^2 \over 4 k^2} B(s,T)^2 \Big]}\]

and as these expressions only depend on the rates at the initial time, we can calculate ZCB bond prices and hence forward rates for any future expiry dates at any given time if we have the current instantaneous rate.

Although we can calculate a discount curve for a given set of parameters, the Vasicek model can’t calibrate to an initial ZCB curve taken from the market, which is a serious disadvantage. There are more advanced generalisations which can, and I’ll discuss some soon, but they will use all of the same tricks and algebra that I’ve covered here.

I’ve written enough for one day here – in later posts I’ll discuss changing to the t-forward measure, in which the ZCB Z(0,t) forms the numeraire instead of a unit of currency, which simplifies many calculations, and I’ll use it to price caplets under stochastic rates, and show that these are equivalent to european options on ZCBs.

An alternate approach to the short-rate model approach discussed today which is very popular these days is the Libor Market Model (LMM) approach, in which instead of simulating the short rate and calculating the required forwards, the different forwards required are instead computed directly and in tandem – I’ll look further at this approach in another post.


Here is the calculation of the distribution of the integral of the instantaneous rate over the period s to T:

    \[r(t)\ = e^{-k(t-s)}\cdot r(s)\ + \theta\Big(\ 1 - e^{-k(t-s)}\ \Big) + \sigma \int_{s}^{t} e^{-k(t-u)}\cdot dW(u)\]

    \[\int_s^T r(t)\ dt = \int_s^T \Big[\ e^{-k(t-s)} r(s)\ + \theta\Big(\ 1 - e^{-k(t-s)}\ \Big) + \sigma \int_{s}^{t} e^{-k(t-u)} dW(u)\Big] dt\]

and splitting this into the terms that contribute to the expectation and the variance we have

    \[\mathbb{E} \big[\int_s^T r(t)\ dt \ |\ {\cal F}_{\rm s}\ \big] = \int_s^T \Big[\ e^{-k(t-s)} r(s)\ + \theta\Big(\ 1 - e^{-k(t-s)}\ \Big)\Big] dt\]

    \[= r(s) \int_s^T e^{-k(t-s)} \ dt + \theta \int_s^T \Big(\ 1 - e^{-k(t-s)}\ \Big) dt\]

    \[= r(s) \Big[ {1 \over -k}\ e^{-k(t-s)} \Big]^T_s + \theta \Big[ t - {1 \over -k}\ e^{-k(t-s)}\ \Big]^T_s\]

    \[= \big(r(s) - \theta \big)B(s,T)\ +\ \theta\cdot\big( T- s \big)\]

to calculate the variance, we first need to deal with the following term

    \[\int_s^T \Big[\ \sigma \int_{s}^{t} e^{-k(t-u)} dW(u)\Big] dt\]

we use stochastic integration by parts

    \[\sigma \int_s^T \int_{s}^{t} e^{-k(t-u)} dW(u)\ dt = \int_s^T e^{-kt}\big(\int_{s}^{t} e^{ku} dW(u)\big)\ dt\]

    \[= \sigma\int_s^T \big(\int_{s}^{t} e^{ku} dW(u)\big)\ d_t \Big( \int^t_s e^{-kv}dv \Big)\]

    \[= \sigma\Big[ \int_{s}^{t} e^{ku} dW(u)\cdot \int^t_s e^{-kv}dv\ \Big]^T_s - \sigma\int_s^T dW(t)\Big( e^{kt} \int^t_s e^{-kv}dv\ \Big)\]

    \[= \sigma\int_{s}^{T} e^{ku} dW(u)\cdot \int^T_s e^{-kv}dv - \sigma\int_s^T dW(t) e^{kt} \Big( \int^T_s e^{-kv}dv - \int^T_t e^{-kv}dv \Big)\]

    \[= \sigma\int_s^T e^{kt} \Big(\int^T_t e^{-kv}dv\Big) dW(t)\]

    \[= {\sigma \over k}\int_s^T \big(1 - e^{-k(T-t)}\big)\ dW(t)\]

and we’re now in a position to try and find the variance of the integral

    \[\mathbb{V} \big[\int^T_s r(t)\ dt \ |\ {\cal F}_{\rm s}\ \big] = {\mathbb E}\Big[\ \Big({\sigma \over k}\int_s^T \big(1 - e^{-k(T-t)}\big)\ dW(t)\Big)^2\ \Big]\]

    \[= {\mathbb E}\Big[\ {\sigma^2 \over k^2}\int_s^T \big(1 - e^{-k(T-t)}\big)^2\ dt\ \Big]\]

where Ito’s isometry has been used again, and several more lines of routine algebra leads to the result

    \[= {\sigma^2 \over k^2} \Big( (T-s) -B(s,T) - {k\over 2} B(s,T)^2 \Big)\]

Bootstrapping the Discount Curve from Swap Rates

Today’s post will be a short one about calculation of discount curves from swap rates. I’ve discussed both swaps and discount curves in previous posts, you should read those before this one or it might not make much sense!

Although bonds can be used to calculate discount bond prices, typically swaps are the most liquid products on the market and will go to the longest expiry times (often 80+ years for major currencies), so these are used to calculate many of the points on the discount curve [and often both of these can be done simultaneously to give better reliability].

In the previous post on swaps, I calculated the swap rate X that makes swaps zero-valued at the current time

    \[X = {Z(0,t_0) - Z(0,t_N) \over \sum_{n=0}^{N-1} \tau \cdot Z(0,t_{n+1})}\]

where the t‘s here represent the fixing dates of the swap (although payment is made at the beginning of the following period, so the n‘th period is received at t_{n+1}.

Consider the sequence of times \{t_0 = 0, t_1, t_2, \dots , t_n\} for which a sequence of swaps are quoted on the markets, with swap rates X_i for the swap running from t_0 up to t_i. We can back out the discount factor at each time as follows:

    \[X_0 = {Z(0,0) - Z(0,t_1) \over \tau \cdot Z(0,t_1)}\]

    \[\Rightarrow Z(0,t_1) = {1 \over 1 + \tau X_0 }\]

    \[X_1 = {Z(0,0) - Z(0,t_2) \over \tau \big(Z(0,t_1) + Z(0,t_2)\big) }\]

    \[\Rightarrow Z(0,t_2) = {1 - \tau \cdot X_1 \cdot Z(0,t_1) \over 1 + \tau X_1}\]

and we can see from this the general procedure, calculating another ZCB from each successive swap rate using the expression

    \[Z(0,t_i) = {1 - \sum_{i=1}^{i-1} \big(\tau \cdot X_i \cdot Z(0,t_i) \big) \over 1 + \tau X_{i-1}}\]

These swaps and ZCBs are called co-initial because they both started at the same time t_0.

Now imagine that instead the swaps X_i have the first fixing at time t_i and their final fixing at time t_n for 0 \leq i < n – such swaps are called co-terminal swaps as they start at different times but finish at the same one. Once again we can calculate the discount factors up to a constant factor, this time by working backwards:

    \[X_{n-1} = {Z(0,t_{n-1}) - Z(0,t_n) \over \tau \cdot Z(0,t_n)}\]

    \[\Rightarrow Z(0,t_{n-1}) = Z(0,t_n)\cdot\big( { 1 + \tau X_{n-1} }\big)\]

    \[X_{n-2} = {Z(0,t_{n-2}) - Z(0,t_n) \over \tau \big(Z(0,t_{n-1}) + Z(0,t_n)\big) }\]

    \[\Rightarrow Z(0,t_{n-2}) = Z(0,t_{n})\big( 1 + \tau X_{n-2} ( 1 + \tau X_{n-1}) \big)\]

and so on, the dcfs can be backed out.

To specify the exact values of co-terminal swaps, we need to know at least one dcf exactly. In general the co-initial case will also require this – I implicitly assumed that they started fixing at t=0 where we know Z(0,0) = 1, but for general co-initial swaps we would also have this issue.

Solving the Heat Equation

This post follows on from my earlier post on the BS Equation from Delta Hedging. In that post, I showed how delta hedging arguments within the BS model lead to the heat equation

{1\over 2}\sigma^2 {\partial^2 D\over \partial y^2} - {\partial D \over \partial \tau} = 0

where \inline D = C e^{-r \tau}, C(S,t) is the derivative price, \inline y = \ln S + \big( r - {1\over 2}\sigma^2 \big)\tau, and \inline \tau = T - t. In this post I’m going to solve this equation to give the BS option price for a vanilla call. S can only take positive values, but due the the logarithm y runs from \inline -\infty to \inline \infty.

In physics, this equation is usually encountered with periodic boundary conditions coming from the boundaries of the region of interest, but here we don’t have that luxury so will need to be more careful. In this post I’m going to solve the equation using the method of separation of variables, in a later post I’ll re-solve it using the method of Fourier transforms.

The heat equation is linear, which means that if \inline f(y,\tau) and \inline g(y,\tau) are both solutions, then so is the sum \inline \big( f(y,\tau) + g(y,\tau) \big). This means that if we can find a selection of possible solutions, we can combine them to match the boundary condition given by the known form of \inline D(y,\tau = 0) at expiry (coming from \inline C(S,\tau = 0) = \big( S - K \big)^+). We look for separable solutions, which are those of the form \inline f(y,\tau) = Y(y)T(\tau), in which case the derivatives in the heat equation only act on their respective terms:

{1\over 2}\sigma^2 T(\tau){\partial^2 Y(y)\over \partial y^2} - Y(y){\partial T(\tau) \over \partial \tau} = 0

{1\over 2}\sigma^2 Y''T = Y \dot{T}

{1\over 2}\sigma^2 {Y''\over Y} = {\dot{T} \over T} = \lambda^2

where a dash denotes differentiation by y and a dot differentiation by tau, In the final equation, both sides depend only on independent variables y and tau respectively – the only way this can be true is if both sides are equal to some constant, which we call \inline \lambda^2.

In general, we should consider the possibilities that \inline \lambda^2 is zero or negative, but it turns out that these won’t matter in this example so I ignore them for brevity. Two equations can be extracted, linked by \inline \lambda^2

Y'' - 2 {\lambda^2 \over \sigma^2} Y = 0

\dot{T} - \lambda^2 T = 0

Solving these equations,

Y_\lambda(y) = Y_{0,\lambda} e^{\sqrt{2}{\lambda \over \sigma }y}

T_\lambda (\tau) = e^{\lambda^2 \tau}

with any positive value of lambda allowed. We first try to find the fundamental solution of the heat equation, which is the solution that satisfies \inline f(y,\tau = 0) = \delta(y-a) where \inline \delta is the Dirac delta function, and I’ll use this to find the particular solution for a vanilla call at the end. At \inline \tau = 0 all of the functions T are 1, so these can be neglected, and we simply sum up the right number of Y terms, each with a different value of \inline \lambda which we index with n, to produce the delta function:

\sum_n Y_{0,\lambda_n} e^{\sqrt{2}{\lambda_n \over \sigma }y} = \delta(y-a)because \inline \lambda can take any value on the continuous interval \inline -\infty < \lambda < \infty, we change the sum to an integral, and also express the delta function using its integral representation and compare co-efficient to equate the two expressions

\int_{-\infty}^{\infty} Y_{0,\lambda_n} e^{\sqrt{2}{\lambda_n \over \sigma }y}dn = {1 \over 2\pi} \int_{-\infty}^{\infty} e^{in(y-a)}dn


Y_{0,n} = {1 \over 2\pi}e^{-in a} \quad ; \quad \lambda_n = {1 \over \sqrt{2}}\sigma i n

And plugging these into our earlier expressions for Y and T, and combining these in the same way for \inline \tau > 0 to provide the temporal behavior, we arrive at

f(y,\tau) = \int_{-\infty}^{\infty} Y_{\lambda_n}(y) T_{\lambda_n}(\tau) dn

= {1\over 2\pi} \int_{-\infty}^{\infty} e^{in(y-a)} e^{-{\sigma^2 n^2 \tau \over 2}}dn

and once again, this is a gaussian integral that we’ll tackle by completing the square…

= {1\over 2\pi} \int_{-\infty}^{\infty} \exp\Big\{ -{1 \over 2} \sigma^2 \tau \Big( n^2 - 2 {i(y-a)\over \sigma^2 \tau} + \big( {i(y-a)\over \sigma^2 \tau}\big)^2-\big( {i(y-a)\over \sigma^2 \tau}\big)^2 \Big) \Big\} dn

= {1\over 2\pi} \int_{-\infty}^{\infty} \exp\Big\{ -{1 \over 2} \sigma^2 \tau \Big(n- \big( {i(y-a)\over \sigma}\big) \Big)^2 \Big\} \cdot \exp{-(y-a)^2 \over 2 \sigma^2 \tau} dn

= {1\over 2\pi}\cdot\sqrt{2\pi \over \sigma^2 \tau}\cdot \exp{-(y-a)^2 \over 2 \sigma^2 \tau}

= {1\over \sqrt{2\pi \sigma^2 \tau}}\cdot \exp{-(y-a)^2 \over 2 \sigma^2 \tau}

and this is the fundamental solution given at the end of the first post and also at wikipedia. Note that although we used separation of variables to find the forms of the functions Y and T, the separability hasn’t been carried through the summation of many different T-Y combinations – the fundamental solution that we’ve arrived at ISN’T separable!

The fundamental solution is the time and space evolution of a function that is initially a delta-function, such that the evolution always obeys the heat equation. We can use the following property of the delta function to construct specific solutions for particular payoff functions:

f(y) = \int_{-\infty}^{\infty} \delta(y-a) f(a) da

So, because the fundamental solution \inline f(y,\tau=0) = \delta(y-a), we can construct the payoff function using this integral at \inline \tau = 0

D(y,\tau=0) = \int_{-\infty}^{\infty} f(a,\tau=0) D(a,\tau=0) da

and consequently, at later times

D(y,\tau) = \int_{-\infty}^{\infty} f(a,\tau) D(a,\tau=0) daand we can solve this for the specific case that \inline C(S,\tau = 0) = \big( S(\tau=0) - K \big)^+ as follows (denoting a as y’)

D(y,\tau) = {e^{-r\tau} \over \sqrt{2\pi\sigma^2\tau}}\int_{-\infty}^{\infty} \exp\Big\{ - {(y'-y)^2 \over 2\sigma^2 \tau} \Big\} \cdot \big( e^{y'} - K \big)^+ dy'

The second factor is only non-zero when \inline y' > \ln K = y_L, so we can absorb this condition into the lower limit

= {e^{-r\tau} \over \sqrt{2\pi\sigma^2\tau}}\int_{y_L}^{\infty} \exp\Big\{ - {(y'-y)^2 \over 2\sigma^2 \tau} \Big\} \cdot \big( e^{y'} - K \big) dy'and splitting this into two integrals, which we do in turn

= {e^{-r\tau} \over \sqrt{2\pi\sigma^2\tau}}\int_{y_L}^{\infty} K\exp\Big\{ - {(y'-y)^2 \over 2\sigma^2 \tau} \Big\} dy' = e^{-r\tau}\cdot K\cdot \Phi\Big({y - y_L \over \sigma \sqrt{\tau}}\Big)


{e^{-r\tau} \over \sqrt{2\pi\sigma^2\tau}}\int_{y_L}^{\infty} \exp\Big\{ - {(y'-y)^2 \over 2\sigma^2 \tau} \Big\} \cdot e^{y'} dy'

= {e^{-r\tau} \over \sqrt{2\pi\sigma^2\tau}}\int_{y_L}^{\infty} \exp\Big\{ - {1\over 2\sigma^2 \tau}\Big( y'^2 + y^2 - 2y'y -2y'\sigma^2\tau \pm ( y + \sigma^2\tau )^2 \Big) \Big\} dy'

= {e^{-r\tau} \over \sqrt{2\pi\sigma^2\tau}}\int_{y_L}^{\infty} \exp\Big\{ - {1\over 2\sigma^2 \tau}\Big( \big(y'-(y+\sigma^2 \tau)\big)^2 - \big(y + \sigma^2 \tau\big)^2 + y^2\Big) \Big\} dy'

= {e^{-r\tau} \over \sqrt{2\pi\sigma^2\tau}}\int_{y_L}^{\infty} \exp\Big\{ - {1\over 2\sigma^2 \tau}\Big( \big(y'-(y+\sigma^2 \tau)\big)^2 - \sigma^2\tau\big(2y + \sigma^2 \tau\big)\Big) \Big\} dy'

= {e^{-r\tau} \over \sqrt{2\pi\sigma^2\tau}}\int_{y_L}^{\infty} \exp\Big\{ - {\big(y'-(y+\sigma^2 \tau)\big)^2\over 2\sigma^2 \tau}\Big\}\cdot S e^{r\tau} dy'

= e^{-r\tau} \cdot F \cdot \Phi \Big( {y - y_L + \sigma^2 \tau \over \sigma \sqrt{\tau}} \Big)

where as usual, \inline \Phi(x) is the cumulative normal distribution; and we’ve used the expression \inline y = \ln S + (r - {1\over 2}\sigma^2)\tau and also \inline y + {1 \over 2}\sigma^2 \tau = S e^{r\tau} = F. Putting all of this together, we finally arrive at the Black-Scholes expression for a vanilla call price,

{e^{-r\tau} \over \sqrt{2\pi\sigma^2\tau}}\int_{y_L}^{\infty} \exp\Big\{ - {(y'-y)^2 \over 2\sigma^2 \tau} \Big\} \cdot \big( e^{y'} - K \big) dy'

= e^{-r\tau} \Big( F \cdot \Phi(d_1) - K\cdot \Phi(d_2)\Big)



Wow! Finally! Hopefully this will convince you that the method of risk-neutral valuation is a bit more straight-forward!

Digital Options

Today I’m going to talk about the valuation of another type of option, the digital (or binary) option. This can be seen as a bit of a case study, as I’ll present the option payoff and the analytical price and greeks under BS assumptions, and give add-ons to allow pricing with the MONTE CARLO pricer. I’ve also updated the ANALYTICAL pricer to calculate the price and greeks of these options.

Digital options are very straight-forward, they are written on an underlying S and expire at a particular date t, at which point digital calls pay $1 if S is greater than a certain strike K or $0 if it is below that, and digital puts pay the reverse – ie. the payoff is

P_{\rm dig\ call} = \Big\{ \ \begin{matrix} \$1 \quad S \geq K\\ \$0 \quad S < K\end{matrix}

We can calculate the price exactly in the BS approximation using the same method that I used to calculate vanilla option prices by risk-neutral valuation as follows

C_{\rm dig\ call}(0) = \delta(0,t)\ {\mathbb E}[ C_{\rm dig\ call}(t) ]

where \inline C_{\rm dig\ call}(t') is the price of the option at time \inline t', and we know that this must converge to the payoff as \inline t \to t', so \inline C_{\rm dig\ call}(t) = P_{\rm dig\ call}

= \delta(0,t)\ \int^{\infty}_{-\infty} P_{\rm dig\ call} \cdot e^{-{1\over 2}x^2} dx

\inline P_{\rm dig\ call} is zero if \inline S = S_0 e^{(r - {1\over 2}\sigma^2)t + \sigma \sqrt{t} x} < K which corresponds to \inline x < {\ln{K \over S_0} - (r-{1\over 2}\sigma^2)t \over \sigma \sqrt{t}} = -d_2

= \delta(0,t)\ \cdot \$1 \cdot \int^{\infty}_{-d_2} e^{-{1\over 2}x^2} dx

= \$ \delta(0,t)\cdot \Phi(d_2)


d_1 = {\ln{S\over K}+(r+{1\over 2}\sigma^2)t \over \sigma \sqrt{t}} \quad ;\quad d_2 = {\ln{S\over K}+(r-{1\over 2}\sigma^2)t \over \sigma \sqrt{t}}

Since we have an analytical price, we can also calculate an expression for the GREEKS of this option by differentiating by the various parameters that appear in the price. Analytical expressions for a digital call’s greeks are:

\Delta = {\partial C \over \partial S} = e^{-rt}\cdot {\phi(d_2)\over S\sigma \sqrt{t}}

\nu = {\partial C \over \partial \sigma} = -e^{-rt}\cdot {d_1 \ \phi(d_2)\over \sigma}

\gamma = {\partial^2 C\over \partial S^2} = -e^{-rt}\cdot {d_1 \ \phi(d_2)\over S^2 \sigma^2 t} = -e^{-rt}\cdot {d_1\ \phi(d_1)\over S K \sigma^2 t}

{\rm Vanna} = {\partial^2 C \over \partial S \partial \sigma} =e^{-rt}\cdot {\phi(d_2)\over S \sigma^2 \sqrt{t}}\Big( d_1 d_2 - 1 \Big)

{\rm Volga} = {\partial^2 C \over \partial \sigma^2} = e^{-rt}\cdot {\phi(d_2)\over \sigma^2}\cdot \Big( d_1 + d_2 - d_1^2 d_2 \Big)


\phi(d_1) = {1\over \sqrt{2 \pi}}\ {e^{-{1\over 2}d_1^2} }

Holding a binary put and a binary call with the same strike is just the same as holding a zero-coupon bond, since we are guaranteed to receive $1 wherever the spot ends up, so the price of a binary put must be

e^{-rt} = e^{-rt}\cdot \Phi(d_2) + C_{\rm dig\ put}(t'=0)\quad \quad \quad \quad [1]

C_{\rm dig\ put}(0) = e^{-rt} \cdot \Phi(-d_2)

Moreover, differentiating equation [1] above shows that the greeks of a digital put are simply the negative of the greeks of a digital call with the same strike.

Graphs of these are shown for a typical binary option in the following graphs. Unlike vanilla options, these option prices aren’t monotonic in volatility: if they’re in-the-money, increasing vol will actually DECREASE the price since it makes them more likely to end out-of-the-money!

Price and first-order greeks for a digital call option.
Price and first-order greeks for a digital call option.
Second-order greeks for a digital call option. Greeks for digital puts are simply the negative of these values

One final point on pricing, note that the payoff of a digital call is the negative of the derivative of a vanilla call payoff wrt. strike; and the payoff of a digital put is the positive of the derivative of a vanilla put payoff wrt. strike. This means that any binary greek can be calculated from the corresponding vanilla greek as follows

\Omega_{\rm dig\ call} = -{\partial \Omega_{\rm call}\over \partial K}

\Omega_{\rm dig\ put} = {\partial \Omega_{\rm put}\over \partial K}

where here \inline \Omega represents a general greek.

If you haven’t yet installed the MONTE CARLO pricer, you can find some instructions for doing so in a previous post. The following links give the header and source files for binary calls and puts which can be dropped in to the project in your C++ development environment

These will register the option types with the option factory and allow monte carlo pricing of the options (so far, all of the options in the factory also have analytical expressions, but I’ll soon present some options that can only be priced by Monte Carlo).