Calibrating time-dependent volatility to swaption prices

We have seen in a previous post how to fit initial discount curves to swap rates in a model-independent way. What if we want to control the volatility parameter to match vanilla rates derivatives as well? Just as we found for vanilla calls and puts, we will need to choose a model, for example the Hull-White extended Vasicek (HWeV) model that we’ve seen before (there are a few reasons why this isn’t a great choice, discussed later).

    \[dr = \Bigl( \theta(t) - ar(t) \Bigr) dt + \sigma(t) dW_t\]

Our next choice is which vanilla rates options we want to use for the calibration. A common choice is the interest rate swaption, which is the right to enter a swap at some future time T with fixed payment dates \{T_i\} > T and a strike X. These are fairly liquid contracts, so they present a good choice for our calibration. A ‘payer swaption’ is one in which we pay fixed and receive floating, and a ‘receiver swaption’ is the opposite. For simplicity, for the rest of this post we will assume all payments are annual, so year fractions \tau are ignored.

A receiver swaption can be seen as a call option on a coupon-paying bond with fixed payments equal to X at the same payment dates as the swap. To see this, consider the price of a swap discussed before:

    \[S(t,\{T_i\}, X) = P(t,T_N) - P(t,T_0) + \sum^N_{i=1} X \cdot P(t,T_i)\]

where t is the time now. This is exactly the same as the price of buying a unit of bond at time T_0 for $1, receiving the fixed coupons X at each payment period, and receiving the original notional back at time T_N.

So, the price of a swaption is an option on receiving a portfolio of coupon payments, each of which can be thought of as a zero-coupon bond paid at that time, and the value of the swaption is the positive part of the expected value of these:

    \[C(t, T_0, T_N, K, X) = P(t,T_0) \cdot {\mathbb E}\Bigl[ \Bigl( K - \sum^N_{i=1} X \cdot P(t,T_i) \Bigr)^+ \Bigr]\]

where the cash payments have been replaced by the strike of the option (the only ‘notional’ payment that will be paid).

This isn’t readily tractable, but we make use of Jamshidian’s decomposition (I won’t go further here – this is worth its own post!) to re-write this with the max inside the summation:

    \[C(t, T_0, T_N, K, X) = P(t,T_0) \cdot {\mathbb E}\Bigl[ \sum^N_{i=1} \Bigl( K_i - X \cdot P(t,T_i) \Bigr)^+ \Bigr]\]

where K_i is the price of a ZCB at time T_0 expiring at T_i if rates at T_0 make the value of the coupon-bearing bond equal to X.

Looking at this expression, we see that each term is simply the present value of an option to buy a ZCB at time T that expires at one of the payment dates T_i with strike K_i. So the price of a swaption has been expressed entirely as the price of a portfolio of options on ZCBs!

For the HWeV model, these option prices are deterministic functions of the initial rate and the calibrated time-dependent parameters in the model. Since rates are Gaussian in HWeV this can be done analytically. Calculating these for time-varying parameters is algebra-intensive and I leave it for a later post, but for constant parameters the calculation is described in Brigo and Mercurio pp. 75-76 and gives a price of

    \begin{align*} ZBC(t,T_0,T_i,K) &= P(t,T_0) {\mathbb E}\Bigl[ K - X \cdot P(t,T_i) \Bigr]^+ \nonumber \\ &= P(t,T_i) \cdot \Phi(h) - X \cdot P(t,T_0) \cdot \Phi(h-\sigma_p) \nonumber \end{align*}


where

    \begin{align*} \sigma_p &= \sigma \sqrt{\frac {1-e^{-2a(T_0 - t)}} {2a}} B(T_0,T_i) \nonumber \\ h &= {\frac 1 {\sigma_p}} \ln{\frac {P(t,T_i)} {X \cdot P(t,T_0)}} + {\frac {\sigma_p} 2} \nonumber \\ B(T_0,T_i) &= {\frac 1 a} \Bigl[ 1 - e^{-a(T_i - T_0)}\Bigr] \nonumber \end{align*}

We can see how we could use the above to calibrate the volatility parameter to match a single market-observed swaption price. When several are visible, the challenge becomes to choose a piecewise continuous function to match several of them. In HWeV this can be done analytically, but for more general models some sort of optimisation would be required.
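As a rough illustration of the single-price case, the constant-parameter formula can be coded up and inverted numerically for \sigma. This is only a sketch under stated assumptions: the flat 3% curve, the mean reversion a = 0.1, the "market" price and the function names hw_zbc and calibrate_sigma are all invented for the example, the bond maturity is taken to be T_i throughout, and a simple bisection stands in for whatever root-finder you prefer. It calibrates to one zero-bond option (a single term of the Jamshidian sum), not to a full swaption.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal CDF


def hw_zbc(P_T0, P_Ti, T0, Ti, X, a, sigma, t=0.0):
    """Call on a zero-coupon bond under constant-parameter Hull-White.

    P_T0, P_Ti: today's discount factors to option expiry T0 and bond maturity Ti.
    X: strike, a: mean reversion, sigma: short-rate volatility.
    """
    B = (1.0 - exp(-a * (Ti - T0))) / a
    sigma_p = sigma * sqrt((1.0 - exp(-2.0 * a * (T0 - t))) / (2.0 * a)) * B
    h = log(P_Ti / (X * P_T0)) / sigma_p + 0.5 * sigma_p
    return P_Ti * N(h) - X * P_T0 * N(h - sigma_p)


def calibrate_sigma(target, lo=1e-6, hi=1.0, tol=1e-12, **kw):
    """Bisection on sigma, relying on the price increasing with sigma."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if hw_zbc(sigma=mid, **kw) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical inputs: flat 3% curve, 1y option on a 5y bond, struck at the forward.
inputs = dict(P_T0=exp(-0.03 * 1.0), P_Ti=exp(-0.03 * 5.0),
              T0=1.0, Ti=5.0, X=exp(-0.03 * 4.0), a=0.1)
market_price = hw_zbc(sigma=0.01, **inputs)   # pretend this was observed
sigma_fit = calibrate_sigma(market_price, **inputs)
print(f"price {market_price:.6f} -> calibrated sigma {sigma_fit:.6f}")
```

With several expiries, the same search could be repeated bucket by bucket once the time-dependent version of the formula is in hand.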

Since these contracts have an exercise date T when the swap starts and the swaps themselves will have another termination date T_N, which together define a 2-dimensional space \{T,T_N\}, it will not be possible to fit all market-observable swaptions with a one factor model. Many alternatives are discussed in the literature to deal with this concern, but the general procedure is the same. Practically, we should choose the most liquid swaptions and bootstrap to these, and only a few (5Y, 10Y etc) will practically be tradable in any case.

Hedging in a finite-state model (Binary Trees)

Today’s post is about hedging in a world where there are only a few outcomes. It demonstrates a lot of principles that are important in asset pricing, and it’s also a frequent topic for interview questions so almost got included as one of those!

A very basic approximation for a stock is a binary tree, as in the diagram shown below. The stock is worth 100 today, and after some time (say a year) it will be worth either 110 with probability p or 90 with probability (1-p). In this very basic model, we assume no other values are possible – the stock either does well and goes up by 10, or it does badly and goes down by 10. Furthermore, we assume that there is a risk-free yearly interest rate r at which we can deposit money in the bank and receive (1+r) times our initial deposit in a year.

The two possibilities for a stock’s evolution in the binary tree model

We’re going to try and price options in this model. Let’s say we’ve got a call option to buy the stock in a year’s time for 100. How much is this worth? Because of the simple state of the world, we can enumerate the two possibilities – either the stock is worth 110, in which case the call option to buy it for 100 is worth 10, or the stock is worth 90, and the option is worthless. We’ll try and simulate that payoff using a combination of cash-in-the-bank and the stock itself, and then by no-arbitrage the price must be the same today.

Our portfolio consists of \alpha stocks S and \beta cash-in-the-bank. The value of the cash will certainly be (1+r)\beta in a year, while the stock will be 110 \alpha or 90 \alpha depending on the state, and we want the sum of these two to equal the option payoff (in either the up or the down scenario), which gives us two simultaneous equations

    \[(1+r) \beta + 110 \alpha = 10\]

    \[(1+r) \beta + 90 \alpha = 0\]


Solving these gives

    \[\alpha = 0.5\]

    \[\beta = {-45 \over (1+r)}\]

This says that a portfolio containing half of a stock and minus 45 pounds in the bank will exactly replicate the payoff of the option in either world state. The value of this portfolio today is just 0.5*100 – 45/(1+r), and I’ve plotted that as a function of r below.
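The two simultaneous equations are easy to solve in code, which makes it simple to try other payoffs or tree values. A minimal sketch follows; the function name replicate_two_state and the choice r = 0.05 are just for illustration.

```python
def replicate_two_state(s_up, s_down, pay_up, pay_down, s0, r):
    """Solve for the stock holding alpha and bank deposit beta that match
    the option payoff in both the up and the down state."""
    alpha = (pay_up - pay_down) / (s_up - s_down)
    beta = (pay_up - alpha * s_up) / (1.0 + r)
    price_today = alpha * s0 + beta
    return alpha, beta, price_today


# Call struck at 100 on the 110/90 tree from the text, with an assumed r of 5%.
alpha, beta, price = replicate_two_state(110.0, 90.0, 10.0, 0.0, 100.0, r=0.05)
print(alpha, beta, price)
```

Note that the up-probability p never appears as an input.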

The price of a call option on a stock as a function of the risk-free rate in the two-state world given above

This gives meaningless prices if r > 0.1 (the price of the option must be between 0 and 10/(1+r) as these are the discounted values of the least/most that it can pay out at expiry). What does this mean intuitively? Well, if r > 0.1 then we have an arbitrage: 100 in the bank will *always* yield more than the stock at expiry, so we should short the stock as much as possible and put the proceeds in the bank. If r < -0.1 the situation is exactly reversed.

The important point about all of this was that the option price DOESN’T depend on the relative chances of the stock increasing to 110 or falling to 90. As long as both of these are strictly greater than zero and less than one, then ANY set of probabilities will lead to the same price for the option. This is the discrete analogue of a result I presented before (here) – that the expected growth of the stock in the real world doesn’t matter to the option’s price, it is the risk-free rate that affects its price. Indeed, it is possible to derive the continuous result using a binary tree by letting the time period go to zero. I won’t attempt it here, but the derivation can be found in many textbooks (and e.g. on Wikipedia).

Things really get interesting when we try to extend the model in a slightly different way. Often banks will be interested in models that have not just a diffusive component, but also a ‘jump’ component, which gives a finite chance that a price will fall suddenly by a large amount (I’ll present one such model in the near future). Unlike a straight-forward Black-Scholes model, because these jumps happen suddenly and unexpectedly they are very difficult to hedge, and are meant to represent market crashes that can result in sudden sharp losses for banks.

In our simple tree model, this can be incorporated by moving to a three-branch model, shown below

The three possibilities for a stock’s evolution in the three-branch tree model

We have the same two branches as before, but an additional state now exists in which the stock price crashes to 50 with probability q. In this case, again trying to price an option to buy the stock for 100 gives three simultaneous equations

    \[(1+r) \beta + 110 \alpha = 10\]

    \[(1+r) \beta + 90 \alpha = 0\]

    \[(1+r) \beta + 50 \alpha = 0\]

Unlike before, we can’t find a single alpha and beta that will replicate the payoff in all three world states, as we have three equations and only two unknowns. Consequently, the best we will be able to do is sub-replicate or super-replicate the payoff. That is, find portfolios that either always pay equal to or greater than the option, or always pay less than or equal to the option. These will give bounds on the lowest and highest arbitrage-free option prices, but that’s as far as arbitrage-free prices will take us (in fact ANY of the prices between these limits is arbitrage-free) – choosing an actual price will require knowing the probabilities p and q and deciding on our personal risk-premium.

In the simple three-state model, upper and lower bounds can be calculated by ignoring the second and third equation respectively above: dropping the second equation gives a portfolio that overpays in the 90 state (super-replication), while dropping the third gives one that underpays in the crash state (sub-replication). They give the limits shown in the graph below. Once again, as r \to 0.1 they converge, but note that a case where r < -0.1 is now possible, as the ‘crash’ option means that the stock can still pay out less than the bank account.
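The two bounding portfolios can be checked numerically. In the sketch below (helper names are made up, and rates are assumed to lie between -0.1 and 0.1 so that both portfolios are the binding ones), matching the up and crash states gives a portfolio that overpays in the 90 state, so its cost is the upper bound; matching the up and down states gives one that underpays in the crash state, so its cost is the lower bound.

```python
def portfolio(state_a, state_b, r, s0=100.0):
    """Match the option payoff in two chosen states.
    Each state is a (stock value, option payoff) pair.
    Returns (alpha, beta, cost today)."""
    (sa, pa), (sb, pb) = state_a, state_b
    alpha = (pa - pb) / (sa - sb)
    beta = (pa - alpha * sa) / (1.0 + r)
    return alpha, beta, alpha * s0 + beta


UP, DOWN, CRASH = (110.0, 10.0), (90.0, 0.0), (50.0, 0.0)


def price_bounds(r):
    """Lower and upper arbitrage-free prices of the call struck at 100."""
    _, _, upper = portfolio(UP, CRASH, r)  # ignores DOWN, overpays there
    _, _, lower = portfolio(UP, DOWN, r)   # ignores CRASH, underpays there
    return lower, upper


for r in (0.0, 0.05, 0.1):
    lo, hi = price_bounds(r)
    print(f"r = {r:.2f}: {lo:.4f} <= price <= {hi:.4f}")
```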

The limiting values given by the no-arbitrage requirement on the price of a call option as a function of the risk-free rate in the three-branch tree model given above

This is what’s called an incomplete market. The securities that exist in the market aren’t sufficient to hedge us against all future possible states of the universe. In this case, we can’t uniquely determine prices by risk-neutral pricing – since we can’t hedge out all of the risk, risk preferences of investors will play a part in determining the market-clearing prices of securities. Most models that are used in the industry are incomplete in this way – I’ll be looking at some examples of this soon which will include models involving jump processes and stochastic volatility.

The Dupire Local Vol Model

In this post I’m going to look at a further generalisation of the Black-Scholes model, which will allow us to re-price any arbitrary market-observed volatility surface, including those featuring a volatility smile.

I’ve previously looked at how we can produce different at-the-money vols at different times by using a piecewise constant volatility \inline \sigma(t), but we were still unable to produce smiley vol surfaces which are often observed in the market. We can go further by allowing vol to depend on both t and the value of the underlying S, so that the full BS model dynamics are given by the following SDE

dS_t = S_t\cdot \Big(\ r(t) dt\ +\ \sigma(t,S_t) dW_t\ \Big)

Throughout this post we will make constant use of the probability distribution of the underlying implied by this SDE at future times, which I will denote \inline \phi(t,S_t). It can be shown [and I will show in a later post!] that the evolution of this distribution obeys the Kolmogorov Forward Equation (sometimes called the Fokker-Planck equation)

{\partial \phi(t,S_t) \over \partial t} = -{\partial \over \partial S_t}\big(rS_t\phi(t,S_t)\big) + {1\over 2}{\partial^2 \over \partial S_t^2}\big(\sigma^2(t,S_t) S_t^2 \phi(t,S_t)\big)

This looks a mess, but it essentially tells us how the probability distribution changes with time – we can see that it looks very much like a heat equation with an additional driving term due to the SDE drift.

Vanilla call option prices are given by

C(t,S_t) = P(0,t)\int_K^{\infty} \big(S_t - K\big)\phi(t, S_t) dS_t

Assuming the market provides vanilla call option prices at all times and strikes [or at least enough for us to interpolate across the region of interest], we can calculate the time derivative of the call price which is equal to

{\partial C \over \partial T} = -rC + P(0,T)\int_K^{\infty}\big( S_T - K \big)\ {\partial \phi \over \partial T}\ dS_T

and we can substitute in the value of the time derivative of the probability distribution from the Kolmogorov equation above

rC + {\partial C \over \partial T} = P(0,T)\int_K^{\infty}\big( S_T - K \big)\Big[ -{\partial \over \partial S_T}\big( rS_T\phi\big) + {1\over 2}{\partial^2 \over \partial S_T^2}\big( \sigma^2S_T^2\phi\big) \Big]dS_T

These two integrals can be solved by integration by parts with a little care

\begin{align*} -\int_K^{\infty}\big( S_T - K \big){\partial \over \partial S_T}\big( rS_T\phi\big)dS_T & = -\Big[rS_T\phi \big( S_T - K \big)\Big]^{\infty}_{K} + \int_K^{\infty}rS_T\phi\ dS_T \\ & = r\int_K^{\infty} (S_T\phi)\ dS_T \end{align*}

\begin{align*} \int_K^{\infty}\big( S_T - K \big){\partial^2 \over \partial S_T^2}\big( \sigma^2 S_T^2\phi\big)dS_T & =\Big[\big( S_T - K \big){\partial \over \partial S_T}(\sigma^2 S_T^2\phi) \Big]^{\infty}_{K} - \int_K^{\infty}{\partial \over \partial S_T}(\sigma^2 S_T^2\phi)\ dS_T \\ & = -\Big[\sigma^2 S_T^2\phi\Big]^{\infty}_{K} \\ & = \sigma^2 K^2\phi(K,T) \end{align*}

where in both cases the boundary terms disappear at the upper limit because the distribution \inline \phi(t,S_t) and its derivatives go to zero rapidly at high spot, and at the lower limit because of the factor \inline (S_T - K).

We already have an expression for \inline \phi(t,S_t) in terms of C and its derivatives from our survey of risk-neutral probabilities,

\phi(t,S_t) = {1 \over P(0,t)}{\partial^2 C \over \partial K^2}

and we can re-arrange the formula above for call option prices

\begin{align*} P(0,T)\int_K^{\infty} S_T\ \phi\ dS_T & = C + P(0,T)\int_K^{\infty} K\phi\ dS_T \\ & = C - K {\partial C \over \partial K} \end{align*}

where the last step uses \inline {\partial C \over \partial K} = -P(0,T)\int_K^{\infty}\phi\ dS_T. Substituting these expressions for \inline \phi(t,S_t) and \inline \int^{\infty}_K (S_T \phi)\ dS_T into the equation above, and keeping the factor of one half from the Kolmogorov equation, gives

\begin{align*} rC + {\partial C \over \partial T} & = P(0,T)\cdot \Big[ r\int_K^{\infty} (S_T\phi)\ dS_T + {1\over 2}\sigma^2 K^2\phi(K,T) \Big]\\ & = rC - rK {\partial C \over \partial K} + {1\over 2}\sigma^2 K^2{\partial^2 C \over \partial K^2} \end{align*}

and remember that \inline \sigma = \sigma(t,S_t), which is our Dupire local vol. Cancelling the rC terms from each side and re-arranging gives

\sigma(T,K) = \sqrt{ {\partial C \over \partial T} + rK {\partial C \over \partial K} \over {1\over 2}K^2{\partial^2 C \over \partial K^2}}

It’s worth taking a moment to think what this means. From the market, we will have access to call prices at many strikes and expiries. If we can choose a robust interpolation method across time and strike, we will be able to calculate the derivative of price with time and with strike, and plug those into the expression above to give us a Dupire local vol at each point on the time-price plane. If we are running a Monte-Carlo engine, this is the vol that we will need to plug in to evolve the spot from a particular spot level and at a particular time, in order to recover the vanilla prices observed on the market.
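A quick sanity check of the recipe is to feed it a call surface generated by plain Black-Scholes with a flat vol, in which case the local vol that comes back should be that same flat vol. The sketch below does this with central finite differences. It uses the standard form of Dupire's formula, with the factor of one half from the Kolmogorov equation in the denominator; the bump sizes, the parameter values and the function names are assumptions for the example, and the closed-form surface stands in for an interpolated market surface.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf


def bs_call(S0, K, T, r, vol):
    """Black-Scholes call price, no dividends."""
    d1 = (log(S0 / K) + (r + 0.5 * vol * vol) * T) / (vol * sqrt(T))
    d2 = d1 - vol * sqrt(T)
    return S0 * N(d1) - K * exp(-r * T) * N(d2)


def dupire_local_vol(call, K, T, r, dK=0.1, dT=1e-4):
    """Local vol at (T, K) from a call surface call(K, T),
    using central finite differences for the derivatives."""
    dC_dT = (call(K, T + dT) - call(K, T - dT)) / (2.0 * dT)
    dC_dK = (call(K + dK, T) - call(K - dK, T)) / (2.0 * dK)
    d2C_dK2 = (call(K + dK, T) - 2.0 * call(K, T) + call(K - dK, T)) / (dK * dK)
    return sqrt((dC_dT + r * K * dC_dK) / (0.5 * K * K * d2C_dK2))


# Stand-in "market" surface: flat 20% vol, spot 100, rate 1%.
surface = lambda K, T: bs_call(100.0, K, T, 0.01, 0.2)
local_vol = dupire_local_vol(surface, K=110.0, T=1.0, r=0.01)
print(f"recovered local vol: {local_vol:.4f}")
```

On real market data the second strike derivative in the denominator is the fragile part, which is why the choice of interpolation matters so much.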

A nice property of the local vol model is that it can match uniquely any observed market call price surface. However, the model has weaknesses as well – by including only one source of uncertainty (the volatility), we are making too much of a simplification. Although vanilla prices match, exotics priced using local vol models typically have prices that are much lower than prices observed on the market. The local vol model tends to put most of the vol at early times, so that longer running exotics significantly underprice.

It is important to understand that this is NOT the implied vol used when calculating vanilla option prices. The implied vol and the local vol are related along a spot path by the expression

\Sigma^2 T = \int_0^T\sigma^2(t,S_t)dt

(where \inline \Sigma is the implied vol) and the two are quite different. Implied vol is the square root of the average variance per unit time, while the local vol gives the amount of additional variance being added at particular positions on the S-t plane. Since we have an expression for local vol in terms of the call price surface, and there is a 1-to-1 correspondence between call prices and implied vols, we can derive an expression to calculate local vols directly from an implied vol surface. The derivation is long and tedious but trivial mathematically so I don’t present it here, the result is that the local vol is given by (rates are excluded here for simplicity)

\sigma(y,T) = \sqrt{{\partial w \over \partial T} \over \Big[ 1 - {y \over w}{\partial w \over \partial y} + {1\over 2}{\partial^2 w \over \partial y^2}+ {1 \over 4}\Big( -{1 \over 4} - {1 \over w}+ {y^2 \over w^2}\Big)\Big({\partial w \over \partial y}\Big)^2 \Big]}

where \inline w = \Sigma^2 T is the total implied variance to a maturity and strike and \inline y = \ln{K \over F_T} is the log of ‘moneyness’.

This is probably about as far as Black-Scholes alone will take you. Although we can reprice any vanilla surface, we’re still not pricing exotics very well – to correct this we’re going to need to consider other sources of uncertainty in our models. There are a wide variety of ways of doing this, and I’ll begin to look at more advanced models in future posts!

European vs. American Options

All of the options that I’ve discussed so far on this blog have been European options. A European option gives us the right to buy or sell an asset at a fixed price, but only on a particular expiry date. In this post, I’m going to start looking at American options, which give the right to buy or sell at ANY date up until the expiry date.

Surprisingly for the case of vanilla options, despite the apparent extra utility of American options, it turns out that the price of American and European options is almost always the same! Why is this?

The value of a European call option, broken down into its intrinsic and its time component. Volatility is 10%, strike is 100, time to expiry is 1 year and risk free rate is 1%.

In general, American options are MUCH harder to price than European options, since they depend in detail on the path that the underlying takes on its way to the expiry date, unlike Europeans which just depend on the terminal value, and no closed form solution exists. One thing we can say is that an American option will never be LESS valuable than the corresponding European option, as it gives you extra optionality but doesn’t take anything away. So we can always take the European price to be a lower bound on American prices. Also note that Put-Call Parity no longer holds for Americans, and becomes instead an inequality.

How can we go any further? It is useful in this case to think about the value of an option as made up of two separate parts, an ‘intrinsic value’ and a ‘time value’, which sum to give the true option value. The ‘intrinsic value’ is the value that would be received if the exercise was today – in the case of a vanilla call, this is simply \max(0,S-K). The ‘time value’ is the ‘extra’ value due to time-to-expiry. This is the volatility-dependent part of the price, since we are shielded by the optionality from price swings in the wrong direction, but are still exposed to upside from swings in our favour. As time goes by, the value of the option must approach the ‘intrinsic value’, as the ‘time value’ decays towards expiry.

Consider the graph above, which shows the BS value of a simple European call under typical parameters. Time value is maximal at-the-money, since this is the point where the implicit insurance that the option provides is most useful to us (far in- or out-of-the-money, the option is only useful if there are large price swings, which are unlikely).

What is the extra value that we should assign to an American call relative to a European call due to the extra optionality it gives us? In the case of an American option, at any point before expiry we can exercise and take the intrinsic value there and then. But up until expiry, the value of a European call option is ALWAYS* more than the intrinsic value, as the time value is non-negative. This means that we can sell the option on the market for more than the price that would be received by exercising an American option before expiry – so a rational investor should never do this, and the price of a European and American vanilla call should be identical.

It seems initially as though the same should be true for put options, but actually this turns out not quite to be right. Consider the graph below, showing the same values for a European vanilla put option, under the same parameters.

The value of a European put option under the same parameters as used above, broken down into intrinsic and time components. Unlike the call option, for far in-the-money puts, the time value can be negative, so early exercise can be valuable.

Notice that here, unlike before, when the put is far in-the-money the option value becomes smaller than the intrinsic value – the time value of the option is negative! In this case, if we held an American rather than a European option it might well make sense to exercise at this point, since we would receive the intrinsic value, which is greater than the option value on the market [actually it’s slightly more complicated, because in this scenario the American option price would be higher than the European value shown below, so it would need to be a bit more in the money before it was worth exercising – you can see how this sort of recursive problem rapidly becomes hard to deal with!].

What is it that causes this effect for in-the-money puts? It turns out that it comes down to interest rates. Roughly what is happening is this – if we exercise an in-the-money American put to receive the intrinsic value, we receive (K-S) cash straight away. But if we left the option until expiry, our expected payoff is roughly (K-F), where F is the forward value

    \[F(t,T) = {1\over ZCB(t,T)} S(t)\]

so we can see that leaving the underlying to evolve is likely to harm our option value [this is only true for options deep enough in the money for us to be able to roughly neglect the \max(0,K-S) criterion].

We can put this on a slightly more rigorous footing by thinking about the GREEK for time-dependence, Theta. For vanilla options, this is given by

    \begin{align*} \Theta &= - {\partial V \over \partial \tau} \nonumber \\ &= ZCB(t,T)\cdot\Big\{-{\sigma F(t,T) \phi(d_1)\over 2\sqrt{\tau}} \mp rK\Phi(\pm d_2) \Big\} \nonumber \end{align*}

where \tau = T - t is the time to expiry, F is the forward price from t to T, \phi(x) is the standard normal PDF of x and \Phi(x) is its CDF, ZCB is a zero-coupon bond from t to T, and the upper sign of the \mp refers to calls and the lower to puts.
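The Theta expression is short enough to code directly. The sketch below assumes a constant rate r (so the zero-coupon bond is e^{-r\tau}), no dividends, and the parameters quoted in the figure captions (vol 10%, strike 100, one year to expiry, rate 1%); the function name bs_theta is my own.

```python
from math import exp, log, sqrt
from statistics import NormalDist

_norm = NormalDist()


def bs_theta(S, K, tau, r, vol, is_call):
    """Black-Scholes Theta (minus the derivative of value w.r.t. time to expiry)."""
    zcb = exp(-r * tau)
    F = S / zcb                                  # forward price
    d1 = (log(F / K) + 0.5 * vol * vol * tau) / (vol * sqrt(tau))
    d2 = d1 - vol * sqrt(tau)
    decay = -vol * F * _norm.pdf(d1) / (2.0 * sqrt(tau))   # optionality term
    sign = 1.0 if is_call else -1.0
    carry = -sign * r * K * _norm.cdf(sign * d2)           # interest rate term
    return zcb * (decay + carry)


for spot in (70.0, 100.0, 130.0):
    call_theta = bs_theta(spot, 100.0, 1.0, 0.01, 0.1, True)
    put_theta = bs_theta(spot, 100.0, 1.0, 0.01, 0.1, False)
    print(f"S = {spot:5.1f}: call {call_theta:+.4f}, put {put_theta:+.4f}, "
          f"difference {call_theta - put_theta:+.4f}")
```

The deep in-the-money put (spot 70) should show a positive Theta, and the call-minus-put difference should be the same at every spot, equal to -rK times the zero-coupon bond.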

The form for Theta shows exactly what I said in the last paragraph – for both calls and puts there is a negative component coming from the ‘optionality’, which is decreasing with time, and a term coming from the expected change in the spot at expiry due to interest rates which is negative for calls and positive for puts.

The plot below shows Theta for the two options shown in the graphs above, and sure enough where the time value of the European put goes negative, Theta becomes positive – the true option value is increasing with time instead of decreasing as usual, as the true value converges to the intrinsic value from below.

The Theta value for European options. For a European put, this becomes positive when the option value falls below the intrinsic value. The difference between these two Thetas is independent of spot, which can be seen directly from put-call parity.

*Caveats – I’m assuming a few things here – there are no dividends, rates are positive (negative rates reverse the situation discussed above – so that American CALLS can be more valuable than Europeans), no transaction fees or storage costs, and the other sensibleness and simpleness criteria that we usually assume apply.

In between European and American options lie Bermudan options, a class of options that can be exercised early but only at one of a specific set of times. As I said, it is in general really tough to price more exotic options with early exercise features like these, I’ll look at some methods soon – but this introduction is enough for today!

Put-Call Parity

Put-Call parity is a simple result connecting the prices of puts and calls in a model-independent way via the forward price.

Consider the three graphs below, showing independently the payoff at expiry of a vanilla call, a vanilla put, and a forward contract. We can see that the payoff of a long call plus the payoff of a short put will precisely overlap the forward contract payoff (assuming they have the same strike and expiry…).

The combination of a long call and a short put with the same strike and expiry is equivalent to a forward at the same strike and expiry. This is guaranteed by their payoffs at expiry (making the usual assumptions about tradability of the underlying/forward etc.), since the payoff of holding a long call and a short put can be exactly hedged by holding a forward contract.

This can be seen from the algebra as well:

\begin{align*} C_{\rm call}(T) - C_{\rm put}(T)\ & = \big( S(T) - K \big)^+ - \big( K - S(T) \big)^+ \\ & = S(T) - K \\ & = F(T,T) - K \end{align*}

If two portfolios have the same payoff, it’s a fundamental rule of derivatives pricing that they must have the same price, which gives the fundamental put-call parity relationship

C_{\rm call}(t) - C_{\rm put}(t)\ = Z(t,T)\cdot\Big( F(t,T) - K\Big)

We can take the BS expressions for put and call prices given in a previous post and observe that they obey the relationship [since \inline \Phi(x) + \Phi(-x) = 1], but importantly this result is model independent. As long as there is a forward contract available with the strike and expiry of the options, then this result MUST hold in ANY model.

One implication of this is that if we use a model to determine the price of a call option, put-call parity fixes the price of the put option to be

C_{\rm put}(t) = C_{\rm call}(t) - Z(t,T)\cdot\Big( F(t,T) - K\Big)
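As a small check of the relationship, the sketch below prices a call with the Black-Scholes formula, backs out the put from parity, and compares it with the directly computed Black-Scholes put. It assumes a constant rate and no dividends, so that Z(t,T) = e^{-r\tau} and F = S/Z; the function names are illustrative only.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf


def bs_price(S, K, tau, r, vol, is_call):
    """Black-Scholes price of a vanilla call or put, no dividends."""
    d1 = (log(S / K) + (r + 0.5 * vol * vol) * tau) / (vol * sqrt(tau))
    d2 = d1 - vol * sqrt(tau)
    if is_call:
        return S * N(d1) - K * exp(-r * tau) * N(d2)
    return K * exp(-r * tau) * N(-d2) - S * N(-d1)


def put_from_call(call_price, S, K, tau, r):
    """Put price implied by put-call parity."""
    zcb = exp(-r * tau)
    forward = S / zcb
    return call_price - zcb * (forward - K)


S, tau, r, vol = 100.0, 1.0, 0.02, 0.25
for K in (80.0, 100.0, 120.0):
    call = bs_price(S, K, tau, r, vol, True)
    print(K, put_from_call(call, S, K, tau, r), bs_price(S, K, tau, r, vol, False))
```

In a Monte Carlo setting the same put_from_call step would be applied to a converged in-the-money call estimate.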

We can actually use this to improve our code. Out-of-the-money options typically require more paths to converge in Monte Carlo simulations because they rely on the detailed behaviour of a few paths that travel a long way. However, we can calculate in-the-money prices quickly and due to put-call parity these will lock the price of the matching out-of-the-money options.

Put-call parity is also an important test of the implementation of a model – if the prices that we are getting out don’t obey this relationship then we’ve done something seriously wrong [as long as both prices have converged!]. An equivalent statement is that the implied BS volatility of matching puts and calls should be exactly the same in any model, since the implied vol is the BS vol that would produce the given price for a put/call, and this is locked by put-call parity.

We can get some other similar relationships between put and call prices that I’ll look at in the future. Also note that put-call parity holds for European-style options, but not for many more complicated path-dependent options or those with early exercise features like American options.

Digital Options

Today I’m going to talk about the valuation of another type of option, the digital (or binary) option. This can be seen as a bit of a case study, as I’ll present the option payoff and the analytical price and greeks under BS assumptions, and give add-ons to allow pricing with the MONTE CARLO pricer. I’ve also updated the ANALYTICAL pricer to calculate the price and greeks of these options.

Digital options are very straight-forward, they are written on an underlying S and expire at a particular date t, at which point digital calls pay $1 if S is greater than a certain strike K or $0 if it is below that, and digital puts pay the reverse – ie. the payoff is

P_{\rm dig\ call} = \Big\{ \ \begin{matrix} \$1 \quad S \geq K\\ \$0 \quad S < K\end{matrix}

We can calculate the price exactly in the BS approximation using the same method that I used to calculate vanilla option prices by risk-neutral valuation as follows

C_{\rm dig\ call}(0) = \delta(0,t)\ {\mathbb E}[ C_{\rm dig\ call}(t) ]

where \inline C_{\rm dig\ call}(t') is the price of the option at time \inline t', and we know that this must converge to the payoff as \inline t' \to t, so \inline C_{\rm dig\ call}(t) = P_{\rm dig\ call}

= \delta(0,t)\ \int^{\infty}_{-\infty} P_{\rm dig\ call} \cdot {1\over \sqrt{2\pi}} e^{-{1\over 2}x^2} dx

\inline P_{\rm dig\ call} is zero if \inline S = S_0 e^{(r - {1\over 2}\sigma^2)t + \sigma \sqrt{t} x} < K which corresponds to \inline x < {\ln{K \over S_0} - (r-{1\over 2}\sigma^2)t \over \sigma \sqrt{t}} = -d_2

= \delta(0,t)\ \cdot \$1 \cdot \int^{\infty}_{-d_2} {1\over \sqrt{2\pi}} e^{-{1\over 2}x^2} dx

= \$ \delta(0,t)\cdot \Phi(d_2)


where

d_1 = {\ln{S\over K}+(r+{1\over 2}\sigma^2)t \over \sigma \sqrt{t}} \quad ;\quad d_2 = {\ln{S\over K}+(r-{1\over 2}\sigma^2)t \over \sigma \sqrt{t}}

Since we have an analytical price, we can also calculate an expression for the GREEKS of this option by differentiating by the various parameters that appear in the price. Analytical expressions for a digital call’s greeks are:

\Delta = {\partial C \over \partial S} = e^{-rt}\cdot {\phi(d_2)\over S\sigma \sqrt{t}}

\nu = {\partial C \over \partial \sigma} = -e^{-rt}\cdot {d_1 \ \phi(d_2)\over \sigma}

\gamma = {\partial^2 C\over \partial S^2} = -e^{-rt}\cdot {d_1 \ \phi(d_2)\over S^2 \sigma^2 t} = -{d_1\ \phi(d_1)\over S K \sigma^2 t}

{\rm Vanna} = {\partial^2 C \over \partial S \partial \sigma} =e^{-rt}\cdot {\phi(d_2)\over S \sigma^2 \sqrt{t}}\Big( d_1 d_2 - 1 \Big)

{\rm Volga} = {\partial^2 C \over \partial \sigma^2} = e^{-rt}\cdot {\phi(d_2)\over \sigma^2}\cdot \Big( d_1 + d_2 - d_1^2 d_2 \Big)


\phi(d_1) = {1\over \sqrt{2 \pi}}\ {e^{-{1\over 2}d_1^2} }
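The expressions above are easy to get wrong by a sign, so here is a small Python sketch that codes them up and checks delta and vega against central finite differences of the price. The helper names are my own, and I'm again assuming a flat rate r.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def d1_d2(S, K, r, sigma, t):
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    return d1, d1 - sigma * math.sqrt(t)

def digital_call(S, K, r, sigma, t):
    """Analytic digital call price exp(-r t) * Phi(d2)."""
    _, d2 = d1_d2(S, K, r, sigma, t)
    return math.exp(-r * t) * norm_cdf(d2)

def digital_call_greeks(S, K, r, sigma, t):
    """Analytic greeks of a digital call, following the formulas in the text."""
    d1, d2 = d1_d2(S, K, r, sigma, t)
    df = math.exp(-r * t)
    pdf = norm_pdf(d2)
    return {
        "delta": df * pdf / (S * sigma * math.sqrt(t)),
        "vega": -df * d1 * pdf / sigma,
        "gamma": -df * d1 * pdf / (S**2 * sigma**2 * t),
        "vanna": df * pdf * (d1 * d2 - 1.0) / (S * sigma**2 * math.sqrt(t)),
        "volga": df * pdf * (d1 + d2 - d1**2 * d2) / sigma**2,
    }

S, K, r, sigma, t = 105.0, 100.0, 0.02, 0.2, 1.0
greeks = digital_call_greeks(S, K, r, sigma, t)

# Central finite differences of the price as an independent check
h = 1e-4
fd_delta = (digital_call(S + h, K, r, sigma, t) - digital_call(S - h, K, r, sigma, t)) / (2.0 * h)
fd_vega = (digital_call(S, K, r, sigma + h, t) - digital_call(S, K, r, sigma - h, t)) / (2.0 * h)

print(f"delta: analytic {greeks['delta']:.6f}, bumped {fd_delta:.6f}")
print(f"vega:  analytic {greeks['vega']:.6f}, bumped {fd_vega:.6f}")
```

The example is in-the-money, so vega comes out negative, which is the non-monotonic behaviour in vol that distinguishes digitals from vanillas.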

Holding a binary put and a binary call with the same strike is just the same as holding a zero-coupon bond, since we are guaranteed to receive $1 wherever the spot ends up, so the price of a binary put must be

e^{-rt} = e^{-rt}\cdot \Phi(d_2) + C_{\rm dig\ put}(t'=0)\quad \quad \quad \quad [1]

C_{\rm dig\ put}(0) = e^{-rt} \cdot \Phi(-d_2)

Moreover, differentiating equation [1] above shows that the greeks of a digital put are simply the negative of the greeks of a digital call with the same strike.

Graphs of these for a typical binary option are shown below. Unlike vanilla options, these option prices aren’t monotonic in volatility: if they’re in-the-money, increasing vol will actually DECREASE the price since it makes them more likely to end out-of-the-money!

Price and first-order greeks for a digital call option.
Second-order greeks for a digital call option. Greeks for digital puts are simply the negative of these values

One final point on pricing, note that the payoff of a digital call is the negative of the derivative of a vanilla call payoff wrt. strike; and the payoff of a digital put is the positive of the derivative of a vanilla put payoff wrt. strike. This means that any binary greek can be calculated from the corresponding vanilla greek as follows

\Omega_{\rm dig\ call} = -{\partial \Omega_{\rm call}\over \partial K}

\Omega_{\rm dig\ put} = {\partial \Omega_{\rm put}\over \partial K}

where here \inline \Omega represents a general greek.
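The strike-derivative relationship can be checked numerically for the price itself (the simplest 'greek'). The Python sketch below bumps the strike of a BS vanilla call and compares minus the central difference with the analytic digital call price; the function names are mine and a flat rate r is assumed.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d1_d2(S, K, r, sigma, t):
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    return d1, d1 - sigma * math.sqrt(t)

def vanilla_call(S, K, r, sigma, t):
    """BS price of a vanilla call."""
    d1, d2 = d1_d2(S, K, r, sigma, t)
    return S * norm_cdf(d1) - K * math.exp(-r * t) * norm_cdf(d2)

def digital_call(S, K, r, sigma, t):
    """BS price of a digital call paying $1."""
    _, d2 = d1_d2(S, K, r, sigma, t)
    return math.exp(-r * t) * norm_cdf(d2)

def digital_from_vanilla(S, K, r, sigma, t, dK=1e-3):
    """Minus the central-difference strike derivative of the vanilla call price."""
    up = vanilla_call(S, K + dK, r, sigma, t)
    down = vanilla_call(S, K - dK, r, sigma, t)
    return -(up - down) / (2.0 * dK)

S, K, r, sigma, t = 100.0, 95.0, 0.01, 0.2, 0.5
exact = digital_call(S, K, r, sigma, t)
approx = digital_from_vanilla(S, K, r, sigma, t)
print(f"digital call = {exact:.6f}, -dC/dK by bumping = {approx:.6f}")
```

The bumped version is effectively a very tight call spread scaled by 1/(2 dK), which is one way of seeing why the two prices agree.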

If you haven’t yet installed the MONTE CARLO pricer, you can find some instructions for doing so in a previous post. The following links give the header and source files for binary calls and puts which can be dropped in to the project in your C++ development environment

These will register the option types with the option factory and allow monte carlo pricing of the options (so far, all of the options in the factory also have analytical expressions, but I’ll soon present some options that can only be priced by Monte Carlo).

BS from Delta-Hedging

Today I’m going to look at another method of getting to the BS equations, by constructing a delta-hedge. This is the way that the equation was in fact first reached historically, and it’s a nice illustration of the principle of hedging. All of the same assumptions are made as in the post that derived the BS equation via Risk Neutral Valuation.

The principle is that because the price of some derivative C(S,t) is a function of the stochastic underlying S, then all of the uncertainty in C comes from the same source as the uncertainty in S. We try to construct a risk-free portfolio made up of the two of these that perfectly cancels out all of the risk. If the portfolio is risk-free, we know it must grow at the risk free rate r(t) or else we have an arbitrage opportunity.

Our model for S is geometric Brownian motion; note that we allow the rate of growth \mu in general to be different from r

    \[dS = \mu Sdt + \sigma SdW_t\]

We can express dC in terms of its derivatives with respect to t and S using Ito’s lemma, which I discussed in a previous post,

    \[dC = \Bigr[ {\partial C \over \partial t} + \mu S{\partial C \over \partial S} + {1 \over 2}\sigma^2 S^2{\partial^2 C \over \partial S^2} \Bigl] dt + \sigma S{\partial C \over \partial S} dW_t\]

Our portfolio is made up of one derivative worth C(S,t) and a fraction \alpha of the underlying stock, worth \alpha \cdot S; so the net price is C(S,t) + \alpha S. We combine the above two results to give

    \begin{eqnarray*} d(C+\alpha S) &=& \Bigl[ {\partial C \over \partial t} + \mu S{\partial C \over \partial S} + {1 \over 2}\sigma^2 S^2{\partial^2 C \over \partial S^2} + \mu S\alpha \Bigr] dt \\ \nonumber \\ &   & + \, \sigma S \bigl( {\partial C \over \partial S} +\alpha \bigr) dW_t \nonumber \end{eqnarray*}

We are trying to find a portfolio that is risk-free, which means we would like the stochastic term to cancel. We see immediately that this happens for \alpha = -{\partial C \over \partial S}, which gives

    \[d(C+\alpha S) = \Bigr[ {\partial C \over \partial t} + {1 \over 2}\sigma^2 S^2{\partial^2 C \over \partial S^2}\Bigl] dt\]

Since this portfolio is risk-free, to prevent arbitrage it must grow deterministically at the risk free rate

    \[d ( C + \alpha S) = r ( C + \alpha S) dt\]

and so

    \[rC= {\partial C \over \partial t} + {1 \over 2}\sigma^2 S^2{\partial^2 C \over \partial S^2} + rS{\partial C \over \partial S}\]

This is the BS partial differential equation (pde). Note that despite the fact that the constant growth term for the underlying had a rate \mu, this has totally disappeared in the pde above – we might disagree with someone else about the expected rate of growth of the stock, but no-arbitrage still demands that we agree with them about the price of the option [as long as we agree about \sigma, that is!]
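One way to convince yourself of the pde is to plug the closed-form BS call price into it and check that the residual vanishes. The Python sketch below does this with central finite differences; the function names and bump sizes are my own choices.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, t, K, T, r, sigma):
    """Closed-form BS call value at time t for an option expiring at T > t."""
    tau = T - t
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def pde_residual(S, t, K, T, r, sigma, dS=1e-2, dt=1e-4):
    """dC/dt + 0.5 sigma^2 S^2 d2C/dS2 + r S dC/dS - r C, by finite differences."""
    C = bs_call(S, t, K, T, r, sigma)
    C_up = bs_call(S + dS, t, K, T, r, sigma)
    C_dn = bs_call(S - dS, t, K, T, r, sigma)
    dC_dS = (C_up - C_dn) / (2.0 * dS)
    d2C_dS2 = (C_up - 2.0 * C + C_dn) / dS**2
    dC_dt = (bs_call(S, t + dt, K, T, r, sigma) - bs_call(S, t - dt, K, T, r, sigma)) / (2.0 * dt)
    return dC_dt + 0.5 * sigma**2 * S**2 * d2C_dS2 + r * S * dC_dS - r * C

residual = pde_residual(S=110.0, t=0.25, K=100.0, T=1.0, r=0.03, sigma=0.25)
print(f"pde residual = {residual:.2e}")
```

Note that \mu never appears as an input anywhere in the check, in line with the remark above.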

As for any pde, we can only solve for a specific situation if we have boundary conditions – in this case, given by the payoff at expiry t=T. At that point we know the exact value that C(S,T) must take; for a call this is

    \[C_{\rm call}(S,T) = \max(S-K,0)\]

Our job is to use the pde to evolve the value of C(S,t) backwards to t=0. In the case of vanilla options this can be done exactly, while for more complicated payoffs we would need to discretise and solve numerically. This gives us another way of valuing options that is complementary (and equivalent) to the expectations approach discussed previously.

To solve the equation above, it is useful to first make some substitutions. As we are interested in time-to-expiry only, we make the change of variables \tau = T - t which yields

    \[rC= -{\partial C \over \partial \tau} + {1 \over 2}\sigma^2 S^2{\partial^2 C \over \partial S^2} + rS{\partial C \over \partial S}\]

We can eliminate the S terms by considering change-of-variables M = \ln S. This means that

    \[{\partial C \over \partial S} = {\partial C \over \partial M}{\partial M \over \partial S} = {1 \over S}{\partial C \over \partial M}\]

    \begin{eqnarray*} {\partial^2 C \over \partial S^2} = {\partial \over \partial S} \Bigl({\partial C \over \partial S}\Bigr) & = & {\partial \over \partial S} \Bigl({1 \over S}{\partial C \over \partial M}\Bigr) \nonumber \\ \nonumber \\ & = & {-1 \over S^2} {\partial C \over \partial M} + {1 \over S}{\partial \over \partial S} \Bigl({\partial C \over \partial M}\Bigr) \nonumber \\ \nonumber \\ & = & {-1 \over S^2} {\partial C \over \partial M} + {1 \over S}\bigl({\partial M \over \partial S}{\partial \over \partial M}\bigr) \Bigl({\partial C \over \partial M}\Bigr) \nonumber \\ \nonumber \\ & = & {-1 \over S^2} {\partial C \over \partial M} + {1 \over S^2}{\partial^2 C \over \partial M^2} \nonumber \end{eqnarray*}

Combining these the BS equation becomes

    \[rC= -{\partial C \over \partial \tau} + {1 \over 2}\sigma^2 {\partial^2 C \over \partial M^2} + \bigr( r - {1 \over 2} \sigma^2 \bigl) {\partial C \over \partial M}\]

The linear term in C can be removed by another transformation D = C e^{r\tau} so that

    \begin{eqnarray*} {\partial D \over \partial \tau} &=& {\partial C \over \partial \tau}e^{r\tau} + rCe^{r\tau} \nonumber \\ \nonumber \\ {\partial^n D \over \partial M^n} &=& {\partial^n C \over \partial M^n}e^{r\tau} \nonumber \end{eqnarray*}

The exponential terms cancel throughout, and we are left with

    \[0 = -{\partial D \over \partial \tau} + {1 \over 2}\sigma^2 {\partial^2 D \over \partial M^2} + \bigr( r - {1 \over 2} \sigma^2 \bigl) {\partial D \over \partial M}\]

One final transformation will be needed before putting in boundary conditions. The transformation will be

    \[y = M + \Bigr( r -{1\over 2 }\sigma^2\Bigl)\tau\]

But unlike the other transformations I’ve suggested so far, this one mixes the two variables that we are using, so a bit of care is required about what we mean. When I most recently wrote the BS equation, D was a function of M and \tau – this means that the partial derivatives with respect to \tau were implicitly holding M constant and vice versa. I’m now going to write D as a function of y and \tau instead, and because the relationship features all three variables we need to be careful with our partial derivatives:

    \[dD(M,\tau) = {\partial D \over \partial \tau}\bigg|_{M} d\tau + {\partial D \over \partial M}\bigg|_{\tau} dM\]

where vertical lines indicate the variable that is being held constant during evaluation. Now, to move from M to y, we expand out the dM term in the same way as we did for dD above

    \[dD(M(y,\tau),\tau) = {\partial D \over \partial \tau}\bigg|_{M} d\tau + {\partial D \over \partial M}\bigg|_{\tau} \Bigr({\partial M \over \partial \tau}\bigg|_{y} d\tau + {\partial M \over \partial y}\bigg|_{\tau}dy\Bigl)\]

    \[= dD(y,\tau) = {\partial D \over \partial \tau}\bigg|_{y} d\tau + {\partial D \over \partial y}\bigg|_{\tau} dy\]

We can compare these last two equations to give expressions for the derivatives that we need after the transformation by comparing the coefficients of d\tau and dy

    \begin{eqnarray*} {\partial D \over \partial \tau}\bigg|_y &=& {\partial D \over \partial \tau}\bigg|_M + {\partial D \over \partial M}\bigg|_{\tau} {\partial M\over \partial \tau}\bigg|_y \nonumber \\ \nonumber \\ {\partial D \over \partial y}\bigg|_{\tau} &=& {\partial D \over \partial M}\bigg|_{\tau}{\partial M \over \partial y}\bigg|_{\tau} \nonumber \end{eqnarray*}

Computing and inserting these derivatives [I’ve given a graphical representation of the first of these equations below, because the derivation is a little dry at present!] into the BS equation gives

    \[0 = -{\partial D \over \partial \tau}\bigg|_y + {1 \over 2}\sigma^2 {\partial^2 D \over \partial y^2}\bigg|_{\tau}\]

This is the well-known Heat Equation in physics. For the sake of brevity I won’t solve it here, but the solution is well known – see for example the Wikipedia page – which gives the general solution:

    \[D(y,\tau) = {1 \over \sqrt{2\pi \sigma^2 \tau}}\int^{\infty}_{-\infty} e^{-{(y-y')^2 \over 2\sigma^2 \tau}} p(y') dy'\]

where p(y') is the payoff condition (it’s now an initial condition, as expiry is at \tau = 0) and y' is a dummy integration variable. The algebra is quite involved so I give the solution its own post, and you can show by substitution that the BS option formulae given previously are solutions to the equation.
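Rather than doing the algebra, the heat-kernel integral can also be evaluated numerically for a call payoff and compared against the closed-form BS price. The Python sketch below is only an illustration under my own choices (trapezoid rule, integration range of ten standard deviations, hypothetical function names); it undoes the substitutions by setting y = \ln S + (r - \sigma^2/2)\tau and discounting the heat-equation solution by e^{-r\tau}.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, tau):
    """Closed-form BS call price for time-to-expiry tau."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def call_via_heat_kernel(S, K, r, sigma, tau, n_steps=20_000, width=10.0):
    """Convolve the call payoff with the Gaussian heat kernel (trapezoid rule)."""
    y = math.log(S) + (r - 0.5 * sigma**2) * tau
    var = sigma**2 * tau
    lower = math.log(K)  # the call payoff max(e^u - K, 0) vanishes below ln K
    upper = max(y, lower) + width * math.sqrt(var)
    h = (upper - lower) / n_steps

    def integrand(u):
        kernel = math.exp(-(y - u) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
        return kernel * (math.exp(u) - K)

    total = 0.5 * (integrand(lower) + integrand(upper))
    for i in range(1, n_steps):
        total += integrand(lower + i * h)
    undiscounted = total * h
    return math.exp(-r * tau) * undiscounted

closed_form = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
numerical = call_via_heat_kernel(100.0, 100.0, 0.05, 0.2, 1.0)
print(f"closed form = {closed_form:.6f}, heat kernel = {numerical:.6f}")
```

The same routine would work for any payoff that can be written as a function of the log-spot at expiry, which is the sense in which the pde approach generalises beyond vanillas.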

An illustration of the difference between partial differentials when a change of variables involving both current variables is used. This should be thought of as a contour plot with the value of D on the out-of-plane axis. The amount D changes when moving a small amount dt depends on which direction you are moving in, as shown above.

As an aside, what was the portfolio that I was considering all of the way through? Comparing \alpha to the vanilla greeks, we recognise it as (minus) the option delta – the hedging portfolio is just the option together with just enough stock to hedge out the local delta risk. Of course, as time goes by this value will change, and we need to constantly adjust our hedge to account for this. This exposes the weakness of one of our assumptions – that we could trade whenever we want and without transaction costs. In fact, because we need to re-hedge at every moment to enforce this portfolio’s risk-free nature, in the presence of transaction costs the hedging costs in this strategy will be infinite! I’ll come back to the effect of this in the real world in future posts.

The Greeks

I’ve already discussed how to price vanilla options in the BS model. But options traders need to know more than just the price: they also want to know how price changes with the various other parameters in their models.

The way traders make money is just the same way that shop-keepers do – by selling options to other people for a little bit more than they buy them for. Once they sell an option, they have some money but they also have some risk, since if the price of the underlying moves in the wrong direction, they stand to lose a large amount of money. In the simplest case, a trader might be able to buy a matching option on the market for less than she sold the original option to her client for. This would cancel (‘hedge’) all of her risk, and generate a small positive profit (‘PnL’ or Profit & Loss) equal to the difference in the two prices.

This might be difficult to do, however, and it won’t generate as much profit as the trader would like, because whoever she buys the hedging option from will also be trying to charge a premium over the actual price. Another possibility is to try and create a hedged portfolio consisting of several options and the underlying stock as well, so as to minimise the net risk of the portfolio.

Since she has sold an option on a stock (for concreteness, let’s say she has sold a call option expiring at time T on a stock with a spot price S(t) and the option has strike K – because she has sold it, we say she is ‘short’ a call option), the trader will have to pay the larger of zero or ( S(T) – K ) at the option expiry to the client. Clearly, if the stock price goes up too high, she will lose more money than she received for selling the option. One possibility might be for her to buy the stock for S(t). Since she will have to pay a maximum of S(T) – K, but would be able to sell the stock for S(T), she would cover her position in the case that the stock price goes very high (and actually guarantee a profit in this case). But she has over-hedged her position – in the case that the stock falls in price, she will lose S(t) – S(T) on the stock. This is shown in the graph below.

The payoff at expiry of the three portfolios shown in the text. The unhedged option makes the trader money if spot doesn’t rise by more than the premium. The overhedged option is the reverse – now the trader loses money if the stock falls too far below the strike price – this is called a covered call, the payoff is the same as the payoff for an uncovered put option. Finally, a delta hedged option will make money as long as the spot does not move too far in either direction – the trader has taken a directionless bet, she is instead betting on volatility remaining low 

In fact, she can ‘delta-hedge’ the option by buying a fraction \inline \Delta of the stock. One way of deriving the BS equation (which I’ll get to at some point) is to construct a portfolio consisting of one option and a (negative) fraction of the underlying stock, where the movement of the option price due to the underlying moving is exactly cancelled out by the movement of the underlying in the portfolio – any small increase in S(t) will increase the price of the option but decrease the value of the negative stock position so that the net portfolio change is zero. The fraction \inline \Delta is called the Delta of the option; mathematically it is the derivative of the option price C with respect to the stock price S

The price of a vanilla call is roughly the same as the payoff at expiry for very high spots and very low spots. Near-the-money, the difference between the option price and its payoff at expiry is greatest as the implicit insurance provided by the option is most useful. The delta of this option is the local gradient of the call price with spot. This is the amount of the underlying that would be required to delta-hedge the portfolio, so that its value is unaffected by small changes in the spot price.

\Delta = {\partial C \over \partial S}

For a call option this will be positive and for a put it will be negative, and the magnitude of both will be between 0 when far out-of-the-money and 1 when far in-the-money.

This graph shows the instantaneous change in PnL due to changes in spot for the three portfolios discussed above. Both the covered and the uncovered calls have some delta – a change in the spot price will have a direct effect on the value of the portfolio. By contrast, the value of the delta hedged portfolio is insensitive to the value of spot for small moves. Unfortunately, it is short gamma, so large moves in either direction will reduce portfolio value, so the trader must be careful to re-hedge frequently. As all of these portfolios are short an option, they are long theta – that is, the value of the portfolio will INCREASE with time if other factors remain constant, as the time value of the option is decaying towards expiry.

Similarly, if the market changes its mind about the implied volatility of an option, this will increase or decrease the price of the trader’s current option portfolio. This exposure can also be hedged, but now she will need to do it by trading options in the stock as the stock price itself is independent of volatility. This time the relevant quantity is the ‘vega’ of the option, the rate of change of price with respect to vol.

These sensitivities of the derivative price are called the Greeks, as they tend to be represented with various greek letters. Some examples are delta (variation with spot), vega (variation with vol), theta (variation with time); and second-order greeks like gamma (sensitivity of delta to spot), vanna (sensitivity of delta to vol, or equivalently sensitivity of vega to spot), and volga (sensitivity of vega to vol).

For vanilla options in the BS model, there are simple expressions for the greeks (see, for example, the Wikipedia page). I’ve updated the PRICERS page to give values for a few of the vanilla greeks of these options along with the price, and there are some graphs of typical greeks below.

The deltas of a long call and long put position – note that these are opposite sign and everywhere separated by a constant value (here 1, but in general the discount factor at the option expiry). Vega is always positive – so increased vol will always increase the price. Increasing spot tends to increase the value of a call, while it decreases the value of a put, but by a progressively smaller amount as spot increases. For these options (and the graph below), spot = fwd = strike = 100; vol = 0.1, expiry = 1 and r = 0.
The Gamma, Vanna and Volga of long vanilla options (these are the same for a call and a put). Gamma is always positive for long options – this means that price is a convex function of spot. Vanna and volga tell us the sensitivity of other greeks to volatility, and are useful in hedging portfolios if vol is changing rapidly.

For exotic options greeks are often intractable analytically, so typically they will be calculated by ‘bump and revalue’, where input parameters are varied slightly and the change in price is observed. For example, a derivative’s \inline \Delta at spot price S could be calculated from its price by ‘bumping’ spot \inline S by a small amount \inline \delta S:

\begin{matrix} C(S+\delta S,\sigma) = C(S,\sigma) + {\partial C \over \partial S}\delta S + {1 \over 2}{\partial^2 C \over \partial S^2} (\delta S)^2 + O[(\delta S)^3] \\ C(S-\delta S,\sigma) = C(S,\sigma) - {\partial C \over \partial S}\delta S + {1 \over 2}{\partial^2 C \over \partial S^2} (\delta S)^2 + O[(\delta S)^3] \end{matrix}

{C(S+\delta S, \sigma) - C(S-\delta S, \sigma) \over 2 (\delta S) } = {\partial C \over \partial S} + O[(\delta S)^2] \simeq \Delta

which is the derivative delta to a very good approximation for small \inline \delta S.
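Here is a minimal Python sketch of bump-and-revalue for delta. I've used the BS vanilla call as the pricer only because its analytic delta \Phi(d_1) is available to compare against; `bump_delta` is a hypothetical helper that accepts any function of spot.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, t):
    """Closed-form BS call price."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return S * norm_cdf(d1) - K * math.exp(-r * t) * norm_cdf(d2)

def bump_delta(pricer, S, dS):
    """Central-difference delta: reprice at S + dS and S - dS."""
    return (pricer(S + dS) - pricer(S - dS)) / (2.0 * dS)

# Same inputs as the graphs: spot = strike = 100, vol = 0.1, expiry = 1, r = 0
S, K, r, sigma, t = 100.0, 100.0, 0.0, 0.1, 1.0
pricer = lambda spot: bs_call(spot, K, r, sigma, t)

d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
analytic_delta = norm_cdf(d1)

errors = []
for dS in (1.0, 0.1, 0.01):
    err = abs(bump_delta(pricer, S, dS) - analytic_delta)
    errors.append(err)
    print(f"dS = {dS:5.2f}: error = {err:.2e}")
```

Each tenfold reduction in the bump should shrink the error by roughly a factor of a hundred, reflecting the quadratic error term of the central difference. For a slow pricer the cost is two extra revaluations per greek, which is why pricing speed matters so much for risk runs.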

Since banks will have large portfolios and want to calculate their total exposure fairly frequently, pricing procedures will typically need to be fairly fast so that these risk calculations can be done in a reasonable amount of time, which usually rules out Monte Carlo as a technique here.

These hedges will only work for small changes in the underlying price (for example, delta itself changes with the underlying price according to the second-order greek, gamma). What this means is that the trader will need to re-hedge from time to time, which will cost her some money to do – one of the main challenges for a trader is to balance the need to hedge her portfolio with the associated costs of doing so. Hopefully by buying and selling a wide variety of options to various clients she will be able to minimise many of her greek exposures naturally – this ‘warehousing of risk’ is one of the main functions that banks undertake and a key driver of their profits.

Risk Neutral Valuation

There are a few different but equivalent ways of viewing derivatives pricing. The first to be developed was the partial differential equations method, which was how the Black Scholes equation was originally derived. There’s lots of discussion of this on the wikipedia page, and I’ll talk about it at some stage – it’s quite intuitive and a lot of the concepts fall out of it very naturally. However, probably the more powerful method, and the one that I use almost all of the time in work, is the risk-neutral pricing method.

The idea is quite simple, although the actual mechanics can be a little intricate. Since the distribution of the underlying asset at expiry is known, it makes sense that the price of a derivative might be the expected value of the payoff of the option at expiry (eg. (S_t-K)^+ for a call option, where a superscript ‘+’ means “The greater of this value or zero”) over the underlying distribution. In fact, it turns out that this isn’t quite right: due to the risk aversion of investors, this will usually produce an overestimate of the true price. However, in arbitrage-free markets, there exists another probability distribution under which the expectation does give the true price – this is called the risk-neutral distribution of the asset. Further, [as long as the market is complete] any price other than the risk-neutral price allows the possibility of arbitrage. Taken together, these are roughly a statement of the Fundamental Theorem of Asset Pricing. In the case of vanilla call options, a portfolio of the underlying stock and a risk-free bond can be constructed that exactly replicates the option and could be used in such an arbitrage.

In this risk-neutral distribution, all risky assets grow at the risk-free rate, so that the ‘price of risk’ is exactly zero. Let’s say a government bond – which we’ll treat as risk-free – exists, has a price B and pays an interest rate of r, so that

dB = r B dt

Then, the stochastic process for the underlying stock that we looked at before

dS = \mu S dt + \sigma S dW_t

is modified so that mu becomes r, and the process is

dS = rS dt + \sigma S dW_t

 so the risk-neutral distribution of the asset is still lognormal, but with mu’s replaced by r’s:

S_t = S_0 e^{(r - {1\over 2}\sigma^2)t + \sigma\sqrt{t}z}


I’ve not provided the explicit formula yet, so I’ll demonstrate here how this can be used to price vanilla call options

C(F,K,\sigma,t,\phi) = \delta(t){\mathbb E^{S_t}}[(S_t-K)^+]

= \delta(t)\int^\infty_{0} (S_t - K)^+ p_S(S_t)dS_t

= \delta(t)\int^\infty_{K} (S_t - K) p_S(S_t)dS_t

= {\delta(t) \over \sqrt{2\pi}}\int^\infty_{x_K} (S_0 e^{(r-{1\over 2}\sigma^2)t + \sigma \sqrt{t}x} - K) e^{-{1\over 2}x^2}dx

= {\delta(t) \over \sqrt{2\pi}}\Bigr[S_0 e^{(r-{1\over 2}\sigma^2)t} \int^\infty_{x_K} e^{\sigma \sqrt{t}x -{1\over 2}x^2}dx - \int^\infty_{x_K} K e^{-{1\over 2}x^2} dx \Bigl]

= {\delta(t) \over \sqrt{2\pi}}\Bigr[S_0 e^{(r-{1\over 2}\sigma^2)t} \int^\infty_{x_K} e^{-{1\over 2}(x-\sigma\sqrt{t})^2 + {1\over 2}\sigma^2t}dx - \sqrt{2\pi} K\Phi(-x_K)\Bigl]

= {\delta(t) \over \sqrt{2\pi}}\Bigr[S_0 e^{rt} \int^\infty_{x_K - \sigma \sqrt{t}} e^{-{1\over 2}x^2} dx - \sqrt{2\pi} K \Phi(-x_K)\Bigl]

= \delta(t)\Bigr[S_0 e^{rt} \Phi(-x_K + \sigma \sqrt{t}) - K \Phi(-x_K)\Bigl]

= \delta(t)\Bigr[F \Phi(d_1) - K \Phi(d_2)\Bigl]

which is the celebrated BS formula! In the above, F is the forward price \inline S_0 e^{rt}, \inline \Phi(x) is the standard normal cumulative distribution function of x, and \inline x_K is the value of x corresponding to strike S=K, ie.

x_K = {\ln{K \over F} + {1\over 2}\sigma^2t \over \sigma \sqrt{t}}

It is typical to use the variables d1 and d2 for the arguments of the cdfs, such that

d_1 = {\ln{F \over K} + {1\over 2}\sigma^2t \over \sigma \sqrt{t}}

d_2 = {\ln{F \over K} - {1\over 2}\sigma^2t \over \sigma \sqrt{t}} = d_1 - \sigma \sqrt{t}
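To tie the final formula back to the integral it came from, the Python sketch below prices a call both ways: with the closed form in terms of F, d_1 and d_2, and by brute-force trapezoid integration of the payoff against the standard normal density from x_K upwards. The function names and quadrature settings are my own, and a flat rate is assumed so that \delta(t) = e^{-rt}.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_forward(S0, K, r, sigma, t):
    """Closed form: delta(t) * (F * Phi(d1) - K * Phi(d2))."""
    F = S0 * math.exp(r * t)
    s = sigma * math.sqrt(t)
    d1 = (math.log(F / K) + 0.5 * s * s) / s
    d2 = d1 - s
    return math.exp(-r * t) * (F * norm_cdf(d1) - K * norm_cdf(d2))

def bs_call_quadrature(S0, K, r, sigma, t, n_steps=20_000, x_max=10.0):
    """Discounted integral of (S_t - K) against the normal density for x >= x_K."""
    F = S0 * math.exp(r * t)
    s = sigma * math.sqrt(t)
    x_K = (math.log(K / F) + 0.5 * s * s) / s
    upper = max(x_K, 0.0) + x_max  # far enough out that the tail is negligible
    h = (upper - x_K) / n_steps

    def integrand(x):
        S_t = F * math.exp(-0.5 * s * s + s * x)
        return (S_t - K) * math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

    total = 0.5 * (integrand(x_K) + integrand(upper))
    for i in range(1, n_steps):
        total += integrand(x_K + i * h)
    return math.exp(-r * t) * total * h

closed = bs_call_forward(100.0, 100.0, 0.0, 0.1, 1.0)
integrated = bs_call_quadrature(100.0, 100.0, 0.0, 0.1, 1.0)
print(f"closed form = {closed:.4f}, quadrature = {integrated:.4f}")
```

For these inputs (spot = strike = 100, vol = 0.1, one year, zero rates) both numbers come out at about 3.99.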

We have made certain assumptions in this derivation that aren’t justified in reality. Some of these are:

1. No arbitrage – we assume that there is no opportunity for a risk-free profit

2. No transaction costs – we can freely buy and sell the underlying at a single price

3. Can go long/short as we please – we have no funding constraints, and can buy/sell an arbitrarily large amount of stock/options and balance it with an opposite position in bonds

4. Constant vol and r – we assume that vol and r are constant and don’t vary with strike. In fact, it’s an easy extension to allow them to vary with time, I’ll come back to this later

I’ll look at the validity of these and other assumptions in a future post.

If prices of vanillas have non-constant vols that vary with strike, doesn’t that make all of the above useless? Not at all – but we do need to turn it on its head! Instead of using the r-n distribution to find prices, we use prices to find the r-n distribution! Let’s assume that we have access to a liquid market of vanilla calls and puts that we can trade in freely. If we look at their prices and differentiate twice with respect to strike, applying the Fundamental Theorem of Calculus each time, we find

C(t)= \delta(t) \int^\infty_{K} (S_t - K)p_S(S_t)dS_t

{\partial C \over \partial K} = -\delta(t) \int^\infty_K p_S(S_t)dS_t

{1\over \delta(t)}{\partial^2 C \over \partial K^2} =p_S(K)

So the curvature of call prices wrt. strike tells us the local risk-neutral probability density! This means that for each expiry time at which we can see vanilla option prices, we can calculate the market-implied r-n distribution (which probably won’t be lognormal, telling us that the market doesn’t really believe the BS assumptions as stated above either). Once we know this, we can use it to calibrate our choice of market model and to price other, more exotic options.
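As an illustration, the Python sketch below takes call prices as a function of strike, forms the second difference in strike, and divides by the discount factor to recover the density. I've generated the 'market' prices from a flat-vol BS model purely so that the answer can be compared with the known lognormal density; with real quotes the price function would come from an interpolated smile instead. All names here are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

F, sigma, t, r = 100.0, 0.1, 1.0, 0.02
df = math.exp(-r * t)  # discount factor to expiry, assuming a flat rate

def call_price(K):
    """Stand-in for market call prices: flat-vol BS in forward form."""
    s = sigma * math.sqrt(t)
    d1 = (math.log(F / K) + 0.5 * s * s) / s
    d2 = d1 - s
    return df * (F * norm_cdf(d1) - K * norm_cdf(d2))

def implied_density(price_fn, K, discount, dK=0.01):
    """Second strike difference of call prices, divided by the discount factor."""
    return (price_fn(K + dK) - 2.0 * price_fn(K) + price_fn(K - dK)) / (dK**2 * discount)

def lognormal_density(K):
    """Risk-neutral lognormal density of S_t evaluated at K, for comparison."""
    s = sigma * math.sqrt(t)
    z = (math.log(K / F) + 0.5 * s * s) / s
    return math.exp(-0.5 * z * z) / (K * s * math.sqrt(2.0 * math.pi))

for K in (80.0, 90.0, 100.0, 110.0, 120.0):
    print(f"K = {K:5.1f}: implied = {implied_density(call_price, K, df):.6f}, "
          f"lognormal = {lognormal_density(K):.6f}")
```

In practice the second difference is very sensitive to noise in the quotes, so some smoothing of the smile is usually needed before this is usable.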

[Post script: It is worth noting that although this looks like alchemy, we haven’t totally tamed the distribution, because although we know the underlying marginal distribution at each expiry time, we still don’t know anything about the correlations between them. That is, we know the marginal distributions of the underlying at each time, but not the full joint distribution. For concreteness, consider two times \inline t_1 and \inline t_2. We know \inline P(S_{t_1}) and \inline P(S_{t_2}) but not \inline P(S_{t_1},S_{t_2}). To price a call option with strike K on the average of the spot at these two times, knowing the first two isn’t enough, we need to know the third, as the expectation is \inline \int^\infty_0 \int^\infty_0 \bigl({1\over 2}(S_{t_1} + S_{t_2}) - K\bigr)^+ P(S_{t_1},S_{t_2}) dS_{t_1}dS_{t_2}. To see the difference, from Bayes’ Theorem we have that \inline P(S_{t_1},S_{t_2}) = \inline P(S_{t_1}) \cdot \inline P(S_{t_2}|S_{t_1}). So, although we know how the spot will be distributed at each time, we don’t know how each distribution is conditional on the times before, which we’d need to know to price exactly – our modelling challenge will be to choose some sensible process for this that is consistent with the marginal distributions.]

Price vs. Implied Vol

Something people often comment on when they start out in quantitative finance is that it’s odd that prices tend to be quoted in terms of implied vol instead of… well, price! This seems a bit strange, surely price is both more useful and more meaningful, given that implied vol is based on a model which isn’t really correct?

Briefly, when a vanilla option is priced in the Black-Scholes model, its price is given by the following formula

 C(F,K,\sigma,\tau,\delta) = \delta(\tau) \phi \Bigl( F \Phi(\phi \cdot d_1) - K \Phi(\phi \cdot d_2) \Bigr)

d_1 = {\ln{F \over K} + {1 \over 2}\sigma^2 \tau \over \sigma \sqrt{\tau}}

d_2 = d_1 - \sigma \sqrt{\tau}

with \inline \tau the time to expiry, \inline \Phi(x) the standard normal cumulative distribution function of x, \inline \delta(\tau) the discount factor to expiry, \inline \phi equal to +1 for a call and -1 for a put, F the forward to expiry, K the strike, and \inline \sigma the Black-Scholes volatility (there are a few different ways of expressing this formula, I’ll come back to it another time).

Importantly, for both puts and calls there is a 1-to-1 correspondence between price and vol – in both cases, increased vol means increased price, since more vol means a higher chance of greater returns, while our losses are capped at zero. However, the BS price is derived by assuming that vol is a constant parameter (or at least that it only varies with time), but we know that in reality it also varies with strike (this is called the vol smile, and it is a VERY important phenomenon which I’ll talk about LOTS in these posts!). What vol should we put into the equation to get a sensible price?

Actually, we usually think about this in reverse – prices are quoted on the market, and we can invert the BS price to give us instead an implied vol. In fact, usually even the quotes that we receive will be given in terms of implied vol!

There are a few reasons for this. Firstly, a price varies depending on the notional of an option – in physics we’d call it an extrinsic variable, while imp vol is an intrinsic one. But it’s more that price doesn’t really give us as much information about where the option is as vol does. Have a look at the graphs below:

Two graphs of price variation with strike, for options with a flat BS vol (grey) and from a SABR vol smile (red). On the left, their comparative prices are plotted with the forward price for reference. On the right, the corresponding BS implied vols are plotted. In all cases, the forward price is 100, time to expiry is 1 year, the flat vol is a constant 0.1 while the SABR parameters are instant vol = 0.1, vol of vol = 0.5 and rho = -0.2.

These graphs show the price variation with strike for two vol surfaces. Although they come from very different vol surfaces, we really can’t see that from the price graphs. Because the scale is so large, the relatively small price differences are overwhelmed. On one end they look like essentially forwards, while on the other end they are effectively zero.

But when we look at the implied vols instead, we see that they’re in fact very different options. One set has an (unrealistic) constant vol of 10%, while the other set shows higher vols away from the money (ie. at high and low strikes), which is what we typically see in the market. If we didn’t take these into account and priced them using the same vols, we’d be exposing ourselves to significant arbitrage opportunities (incidentally, this vol smile comes from a model commonly used to model and interpolate vol smiles called SABR – we’ll be seeing a lot more of this in the future).

Finally, implied vols give us a feeling for what is happening – since vol is annualised, this is the same order as the percentage change that we would expect in the underlying in a typical year. This gives us an important intuition check on our results that could easily be forgotten in the decimal points or trailing zeroes of a price given in dollars or euros.

As an aside, I’m in the process of upgrading the vanilla pricer to do implied vol calculations as well – so you will be able to either enter a vol and calculate the price of the option, or else enter a price and work out the corresponding vol. Have fun!

[This requires some root-finding (once again, no closed form for the normal cdfs…), and once again I’m taking the path of least resistance for the moment and coding a bisection solver. Since this involves many, many calls to the normal cdf code I used before, I should probably use a quicker method eventually, so I’ll be coding a Brent solver soon, which will probably be a post in itself]
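For the curious, here is a minimal Python sketch of what a bisection implied vol solver could look like. It is not the code behind the pricer: the function names, the vol bracket of [1e-6, 5] and the tolerance are all my own assumptions, and it relies only on the fact that the BS price increases with vol for both calls and puts.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_price(F, K, sigma, tau, df, phi):
    """BS price in forward form; phi = +1 for a call, -1 for a put."""
    s = sigma * math.sqrt(tau)
    d1 = (math.log(F / K) + 0.5 * s * s) / s
    d2 = d1 - s
    return df * phi * (F * norm_cdf(phi * d1) - K * norm_cdf(phi * d2))

def implied_vol(price, F, K, tau, df, phi, lo=1e-6, hi=5.0, tol=1e-10):
    """Invert the BS price for vol by bisection on the bracket [lo, hi]."""
    p_lo = black_price(F, K, lo, tau, df, phi)
    p_hi = black_price(F, K, hi, tau, df, phi)
    if not (p_lo <= price <= p_hi):
        raise ValueError("price is outside the range attainable on the vol bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Price is increasing in vol, so keep the half-interval containing the target
        if black_price(F, K, mid, tau, df, phi) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip: price a put at 25% vol, then recover the vol from the price
F, K, tau, df = 100.0, 90.0, 1.0, 0.98
target_vol = 0.25
put_price = black_price(F, K, target_vol, tau, df, phi=-1)
recovered = implied_vol(put_price, F, K, tau, df, phi=-1)
print(f"price = {put_price:.6f}, recovered vol = {recovered:.8f}")
```

Bisection halves the bracket on every step, so this takes around 36 price evaluations to reach the tolerance above; a Brent-style solver would typically need far fewer, at the cost of more bookkeeping.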