Credit Valuation Adjustment (CVA)

Now that we’ve covered a basic model for the default of firms and the pricing of Credit Default Swaps, we’re ready to consider the implication of your counterparty’s credit risk on the price of a derivative contract signed with them – this is called the ‘Credit Valuation Adjustment’ or CVA, and is the amount that one should change the value of an uncollatorised credit-risk-free derivative to reflect the counterparty’s risk of default.

When we have priced derivatives under Black-Scholes assumptions, we’ve implicitly assumed that the contract runs to expiry. However, if our counterparty defaults before the end of the contract, this won’t happen (typically, in the event of default a liquidator will attempt to collect everything the firm is owed from counterparties, and use the finite resources collected to pay back creditors as much as possible).

If we hold a single derivative contract against a defaulting counterparty and it’s out-of-the-money (ie. the value is negative – we owe them money), we will still be required to hold the position. On the other hand, if we’re in-the-money, we won’t necessarily get all of our contract value back – as with CDS contracts, we assume that of the true value of the contract, we will only receive a fraction R, the ‘Recovery Rate’. If we gain nothing if the value is negative, but lose something when the value is positive, then to calculate the pricing impact we need to calculate expected value of the contract at a future point if it is positive, and ignore the price impact coming from the chance that it is negative, which we call the Expected Positive Exposure (EPE).

For a contract depending on underlier S with value C(S,t) at future time t, we can calculate this by integrating across all possible underlier values that generate a positive contract value at that time

    \[EPE(t) = \int_{C(s) \textgreater 0} C(S,t) \cdot p(S) dS\]

this will be larger for contracts whose values have a higher volatility, which are more likely to have a high positive or negative value at some point in the future.

Our loss due to a counterparty default event at time t is the discounted expectation of our loss due to a default accounting for the finite recovery value

    \[(1 - R) \cdot EPE(t) \cdot P(0,t)\]

where P(0,t) is the discount factor to time t as usual.

In order to calculate the CVA for a whole contract, we need to sum the product of the loss-given-default at t time and the probability of default at time t

    \[CVA = (1-R) \int_0^{T_f} EPE(t) \cdot P(0,t) \cdot S'(t) \; dt\]

where T_F is the time of the final cashflow due to the contract and S'(t) is the probability of default at time t as discussed in the previous post on firm default models, and if this is uncorrelated to the value of the contract, this integral is relatively straightforward to calculate. Default probabilities can be calibrated from observed CDS prices on the counterparties, most major counterparties (typically large investment banks) will have CDS contracts available on the market to use for calibration.

It’s worth mentioning that CVA is a pricing problem – we’re treating the pricing adjustment as the price of an exotic derivative and hedging it with a portfolio of CDS contracts. This means we’re calculating an actual dollar value, and when a trader quotes a derivative price to a counterparty he should also quote a ‘CVA adjustment’ to represent the counterparty’s chance of default. Internally, the trader will often have to pass this straight through to a ‘CVA Desk’, which internally manages the risk the bank faces due to the risk of default of any given counterparty – and will then reimburse the trader if she faces a loss due to such an event!

It’s worth noting that many simple products like ZCBs, forwards and FRAs have model-independent prices, so we can calculate their value independent of any assumptions about the market’s future behaviour. However, we’ve had to make some assumptions about the underlying volatility in the market to calculate EPEs, so the CVA for these products is model-dependent – we need to make some assumptions about the model driving the market and can’t enforce a price today purely by no-arbitrage considerations (although it will be informed by the prices of other model-dependent products whose prices we can see).

We’ve also discussed only the CVA adjustment from an individual contract, which is already quite computationally complex (in a future post, I’ll calculate this for some simple products in a simple model). If we have a portfolio of products in place against a specific counterparty, these should all be considered together and may lead to some offsetting exposures that reduce the CVA adjustment on a portfolio basis (although even this isn’t always the case, depending upon ‘netting agreements’ in the Credit Support Annex – CSA – that you have in place with the given counterparty…).

Further, CVA only applies to non-collaterised positions. Many positions (like futures) require the deposit of margin in an escrow account, exactly to try and cover for the risk of counterparty default. This leads instead to ‘gap risk’ (ie. that the margin might not be sufficient) but also to the related ‘Funding Valuation Adjustment’ (or FVA, essentially the cost of funding the margin you provide).

The Dupire Local Vol Model

In this post I’m going to look at a further generalisation of the Black-Scholes model, which will allow us to re-price any arbitrary market-observed volatility surface, including those featuring a volatility smile.

I’ve previously looked at how we can produce different at-the-money vols at different times by using a piecewise constant volatility \inline \sigma(t), but we were still unable to produce smiley vol surfaces which are often observed in the market. We can go further by allowing vol to depend on both t and the value of the underlying S, so that the full BS model dynamics are given by the following SDE

dS_t = S_t\cdot \Big(\ r(t) dt\ +\ \sigma(t,S_t) dW_t\ \Big)

Throughout this post we will make constant use of the probability distribution of the underlying implied by this SDE at future times, which I will denote \inline \phi(t,S_t). It can be shown [and I will show in a later post!] that the evolution of this distribution obeys the Kolmogorov Forward Equation (sometimes called the Fokker-Planck equation)

{\partial \phi(t,S_t) \over \partial t} = -{\partial \over \partial S_t}\big(rS_t\phi(t,S_t)\big) + {1\over 2}{\partial^2 \over \partial S_t^2}\big(\sigma^2(t,S_t) S_t^2 \phi(t,S_t)\big)

This looks a mess, but it essentially tells us how the probability distribution changes with time – we can see that is looks very much like a heat equation with an additional driving term due to the SDE drift.

Vanilla call option prices are given by

C(t,S_t) = P(0,t)\int_K^{\infty} \big(S_t - K\big)\phi(t, S_t) dS_t

Assuming the market provides vanilla call option prices at all times and strikes [or at least enough for us to interpolate across the region of interest], we can calculate the time derivative of the call price which is equal to

{\partial C \over \partial T} = -rC + P(0,T)\int_K^{\infty}\big( S_T - K \big)\ {\partial \phi \over \partial T}\ dS_T

and we can substitute in the value of the time derivative of the probability distribution from the Kolmogorov equation above

rC + {\partial C \over \partial T} = P(0,T)\int_K^{\infty}\big( S_T - K \big)\Big[ -{\partial \over \partial S_T}\big( rS_T\phi\big) + {1\over 2}{\partial^2 \over \partial S_T^2}\big( \sigma^2S_T^2\phi\big) \Big]dS_T

These two integrals can be solved by integration by parts with a little care

\begin{align*} -\int_K^{\infty}\big( S_T - K \big){\partial \over \partial S_T}\big( rS_T\phi\big)dS_T & = -\Big[rS_T\phi \big( S_T - K \big)\Big]^{\infty}_{K} + \int_K^{\infty}rS_T\phi\ dS_T \\ & = r\int_K^{\infty} (S_T\phi)\ dS_T \end{align*}\begin{align*} \int_K^{\infty}\big( S_T - K \big){\partial^2 \over \partial S_T^2}\big( \sigma^2 S_T^2\phi\big)dS_T & =\Big[\big( S_T - K \big){\partial \over \partial S_T}(\sigma^2 S_T^2\phi) \Big]^{\infty}_{K} - \int_K^{\infty}{\partial \over \partial S_T}(\sigma^2 S_T^2\phi)\ dS_T \\ & = -\sigma^2 K^2\phi(K,T) \end{align*}where in both cases, the boundary terms disappear at the upper limit due to the distribution \inline \phi(t,S_t) and its derivatives, which go to zero rapidly at high spot.

We already have an expression for \inline \phi(t,S_t) in terms of C and its derivatives from our survey of risk-neutral probabilities,

\phi(t,S_t) = {1 \over P(0,t)}{\partial^2 C \over \partial K^2}

and we can re-arrange the formula above for call option prices

\begin{align*} P(0,T)\int_K^{\infty} S_T\ \phi\ dS_T & = C + P(0,T)\int_K^{\infty} K\phi\ dS_T \\ & = C + K {\partial C \over \partial K} \end{align*}and substituting these expressions for \inline \phi(t,S_t) and \inline \int^{\infty}_K (S_T \phi)\ dS_T into the equation above

\begin{align*} rC + {\partial C \over \partial T} & = P(0,T)\cdot \Big[ r\int_K^{\infty} (S_T\phi)\ dS_T + \sigma^2 K^2\phi(K,T) \Big]\\ & = rC + rK {\partial C \over \partial K} + \sigma^2 K^2{\partial^2 C \over \partial K^2} \end{align*}

and remember that \inline \sigma = \sigma(t,S_t), which is our Dupire local vol. Cancelling the rC terms from each side and re-arranging gives

\sigma(T,K) = \sqrt{ {\partial C \over \partial T} + rK {\partial C \over \partial K} \over K^2{\partial^2 C \over \partial K^2}}

It’s worth taking a moment to think what this means. From the market, we will have access to call prices at many strikes and expires. If we can choose a robust interpolation method across time and strike, we will be able to calculate the derivative of price with time and with strike, and plug those into the expression above to give us a Dupire local vol at each point on the time-price plane. If we are running a Monte-Carlo engine, this is the vol that we will need to plug in to evolve the spot from a particular spot level and at a particular time, in order to recover the vanilla prices observed on the market.

A nice property of the local vol model is that it can match uniquely any observed market call price surface. However, the model has weaknesses as well – by including only one source of uncertainty (the volatility), we are making too much of a simplification. Although vanilla prices match, exotics priced using local vol models typically have prices that are much lower than prices observed on the market. The local vol model tends to put most of the vol at early times, so that longer running exotics significantly underprice.

It is important to understand that this is NOT the implied vol used when calculating vanilla vol prices. The implied vol and the local vol are related along a spot path by the expression

\Sigma^2 T = \oint_0^T\sigma^2(S_t,t)dt

(where \inline \Sigma is the implied vol) and the two are quite different. Implied vol is the square root of the average variance per unit time, while the local vol gives the amount of additional variance being added at particular positions on the S-t plane. Since we have an expression for local vol in terms of the call price surface, and there is a 1-to-1 correspondence between call prices and implied vols, we can derive an expression to calculate local vols directly from an implied vol surface. The derivation is long and tedious but trivial mathematically so I don’t present it here, the result is that the local vol is given by (rates are excluded here for simplicity)

\sigma(y,T) = \sqrt{{\partial w \over \partial T} \over \Big[ 1 - {y \over w}{\partial w \over \partial y} + {1\over 2}{\partial^2 w \over \partial y^2}+ {1 \over 4}\Big( -{1 \over 4} - {1 \over w}+ {y^2 \over w}\Big)\Big({\partial w \over \partial y}\Big)^2 \Big]}

where \inline w = \Sigma^2 T is the total implied variance to a maturity and strike and \inline y = \ln{K \over F_T} is the log of ‘moneyness’.

This is probably about as far as Black-Scholes alone will take you. Although we can reprice any vanilla surface, we’re still not pricing exotics very well – to correct this we’re going to need to consider other sources of uncertainty in our models. There are a wide variety of ways of doing this, and I’ll being to look at more advanced models in future posts!

Digital Options

Today I’m going to talk about the valuation of another type of option, the digital (or binary) option. This can be seen as a bit of a case study, as I’ll present the option payoff and the analytical price and greeks under BS assumptions, and give add-ons to allow pricing with the MONTE CARLO pricer. I’ve also updated the ANALYTICAL pricer to calculate the price and greeks of these options.

Digital options are very straight-forward, they are written on an underlying S and expire at a particular date t, at which point digital calls pay $1 if S is greater than a certain strike K or $0 if it is below that, and digital puts pay the reverse – ie. the payoff is

P_{\rm dig\ call} = \Big\{ \ \begin{matrix} \$1 \quad S \geq K\\ \$0 \quad S < K\end{matrix}

We can calculate the price exactly in the BS approximation using the same method that I used to calculate vanilla option prices by risk-neutral valuation as follows

C_{\rm dig\ call}(0) = \delta(0,t)\ {\mathbb E}[ C_{\rm dig\ call}(t) ]

where \inline C_{\rm dig\ call}(t') is the price of the option at time \inline t', and we know that this must converge to the payoff as \inline t \to t', so \inline C_{\rm dig\ call}(t) = P_{\rm dig\ call}

= \delta(0,t)\ \int^{\infty}_{-\infty} P_{\rm dig\ call} \cdot e^{-{1\over 2}x^2} dx

\inline P_{\rm dig\ call} is zero if \inline S = S_0 e^{(r - {1\over 2}\sigma^2)t + \sigma \sqrt{t} x} < K which corresponds to \inline x < {\ln{K \over S_0} - (r-{1\over 2}\sigma^2)t \over \sigma \sqrt{t}} = -d_2

= \delta(0,t)\ \cdot \$1 \cdot \int^{\infty}_{-d_2} e^{-{1\over 2}x^2} dx

= \$ \delta(0,t)\cdot \Phi(d_2)

where

d_1 = {\ln{S\over K}+(r+{1\over 2}\sigma^2)t \over \sigma \sqrt{t}} \quad ;\quad d_2 = {\ln{S\over K}+(r-{1\over 2}\sigma^2)t \over \sigma \sqrt{t}}

Since we have an analytical price, we can also calculate an expression for the GREEKS of this option by differentiating by the various parameters that appear in the price. Analytical expressions for a digital call’s greeks are:

\Delta = {\partial C \over \partial S} = e^{-rt}\cdot {\phi(d_2)\over S\sigma \sqrt{t}}

\nu = {\partial C \over \partial \sigma} = -e^{-rt}\cdot {d_1 \ \phi(d_2)\over \sigma}

\gamma = {\partial^2 C\over \partial S^2} = -e^{-rt}\cdot {d_1 \ \phi(d_2)\over S^2 \sigma^2 t} = -e^{-rt}\cdot {d_1\ \phi(d_1)\over S K \sigma^2 t}

{\rm Vanna} = {\partial^2 C \over \partial S \partial \sigma} =e^{-rt}\cdot {\phi(d_2)\over S \sigma^2 \sqrt{t}}\Big( d_1 d_2 - 1 \Big)

{\rm Volga} = {\partial^2 C \over \partial \sigma^2} = e^{-rt}\cdot {\phi(d_2)\over \sigma^2}\cdot \Big( d_1 + d_2 - d_1^2 d_2 \Big)

where

\phi(d_1) = {1\over \sqrt{2 \pi}}\ {e^{-{1\over 2}d_1^2} }

Holding a binary put and a binary call with the same strike is just the same as holding a zero-coupon bond, since we are guaranteed to receive $1 wherever the spot ends up, so the price of a binary put must be

e^{-rt} = e^{-rt}\cdot \Phi(d_2) + C_{\rm dig\ put}(t'=0)\quad \quad \quad \quad [1]

C_{\rm dig\ put}(0) = e^{-rt} \cdot \Phi(-d_2)

Moreover, differentiating equation [1] above shows that the greeks of a digital put are simply the negative of the greeks of a digital call with the same strike.

Graphs of these are shown for a typical binary option in the following graphs. Unlike vanilla options, these option prices aren’t monotonic in volatility: if they’re in-the-money, increasing vol will actually DECREASE the price since it makes them more likely to end out-of-the-money!

Price and first-order greeks for a digital call option.
Price and first-order greeks for a digital call option.
DigitalGreeks2
Second-order greeks for a digital call option. Greeks for digital puts are simply the negative of these values

One final point on pricing, note that the payoff of a digital call is the negative of the derivative of a vanilla call payoff wrt. strike; and the payoff of a digital put is the positive of the derivative of a vanilla put payoff wrt. strike. This means that any binary greek can be calculated from the corresponding vanilla greek as follows

\Omega_{\rm dig\ call} = -{\partial \Omega_{\rm call}\over \partial K}

\Omega_{\rm dig\ put} = {\partial \Omega_{\rm put}\over \partial K}

where here \inline \Omega represents a general greek.

If you haven’t yet installed the MONTE CARLO pricer, you can find some instructions for doing so in a previous post. The following links give the header and source files for binary calls and puts which can be dropped in to the project in your C++ development environment

These will register the option types with the option factory and allow monte carlo pricing of the options (so far, all of the options in the factory also have analytical expressions, but I’ll soon present some options that can only be priced by Monte Carlo).

BS from Delta-Hedging

Today I’m going to look at another method of getting to the BS equations, by constructing a delta-hedge. This is the way that the equation was in fact first reached historically, and it’s a nice illustration of the principle of hedging. All of the same assumptions are made as in the post that derived the BS equation via Risk Neutral Valuation.

The principle is that because the price of some derivative C(S,t) is a function of the stochastic underlying S, then all of the uncertainty in C comes from the same source as the uncertainty in S. We try to construct a risk-free portfolio made up of the two of these that perfectly cancels out all of the risk. If the portfolio is risk-free, we know it must grow at the risk free rate r(t) or else we have an arbitrage opportunity.

Our model for S is the geometric brownian motion, note that we allow the rate of growth \mu in general to be different from r

    \[dS = \mu Sdt + \sigma SdW_t\]

We can express dC in terms of its derivatives with respect to t and S using Ito’s lemma, which I discussed in a previous post,

    \[dC = \Bigr[ {\partial C \over \partial t} + \mu S{\partial C \over \partial S} + {1 \over 2}\sigma^2 S^2{\partial^2 C \over \partial S^2} \Bigl] dt + \sigma S{\partial C \over \partial S} dW_t\]

Our portfolio is made up of one derivative worth C(S,t) and a fraction \alpha of the underlying stock, worth \alpha \cdot S; so the net price is C(S,t) + \alpha S. We combine the above two results to give

    \begin{eqnarray*} d(C+\alpha S) &=& \Bigr[ {\partial C \over \partial t} + \mu S{\partial C \over \partial S} + {1 \over 2}\sigma^2 S^2{\partial^2 C \over \partial S^2} + \mu S\alpha \Bigl] dt \\ \nonumber \\ &   & + \, \sigma S \bigl( {\partial C \over \partial S} +\alpha \bigr) dW_t \nonumber \end{eqnarray}

We are trying to find a portfolio that is risk-free, which means we would like the stochastic term to cancel. We see immediately that this happens for \alpha = -{\partial C \over \partial S}, which gives

    \[d(C+\alpha S) = \Bigr[ {\partial C \over \partial t} + {1 \over 2}\sigma^2 S^2{\partial^2 C \over \partial S^2}\Bigl] dt\]

Since this portfolio is risk-free, to prevent arbitrage it must grow deterministically at the risk free rate

    \[d ( C + \alpha S) = r ( C + \alpha S) dt\]

and so

    \[rC= {\partial C \over \partial t} + {1 \over 2}\sigma^2 S^2{\partial^2 C \over \partial S^2} + rS{\partial C \over \partial S}\]

This is the BS partial differential equation (pde). Note that despite the fact that the constant growth term for the underlying had a rate \mu, this has totally disappeared in the pde above – we might disagree with someone else about the expected rate of growth of the stock, but no-arbitrage still demands that we agree with them about the price of the option [as long as we agree about \sigma, that is!]

As for any pde, we can only solve for a specific situation if we have boundary conditions – in this case, given by the payoff at expiry t=T. At that point we know the exact form the value that C(S,T) must take

    \[C_{\rm call}(S,T) = \max(K-S,0)\]

Our job is to use the pdf to evolve the value of C(S,t) backwards to t=0. In the case of vanilla options this can be done exactly, while for more complicated payoffs we would need to discretise and solve numerically. This gives us another way of valuing options that is complementary (and equivalent) to the expectations approach discussed previously.

To solve the equation above, it is useful to first make some substitutions. As we are interested in time-to-expiry only, we make the change of variables \tau = T - t which yields

    \[rC= -{\partial C \over \partial \tau} + {1 \over 2}\sigma^2 S^2{\partial^2 C \over \partial S^2} + rS{\partial C \over \partial S}\]

We can eliminate the S terms by considering change-of-variables M = \ln S. This means that

    \[{\partial C \over \partial S} = {\partial C \over \partial M}{\partial M \over \partial S} = {1 \over S}{\partial C \over \partial M}\]

    \begin{eqnarray*} \partial^2 C \over \partial S^2} = {\partial \over \partial S} \Bigl({\partial C \over \partial S}\Bigr) & = & {\partial \over \partial S} \Bigl({1 \over S}{\partial C \over \partial M}\Bigr) \nonumber \\ \nonumber \\ & = & {-1 \over S^2} {\partial C \over \partial M} + {1 \over S}{\partial \over \partial S} \Bigl({\partial C \over \partial M}\Bigr) \nonumber \\ \nonumber \\ & = & {-1 \over S^2} {\partial C \over \partial M} + {1 \over S}\bigl({\partial M \over \partial S}{\partial \over \partial M}\bigr) \Bigl({\partial C \over \partial M}\Bigr) \nonumber \\ \nonumber \\ & = & {-1 \over S^2} {\partial C \over \partial M} + {1 \over S^2}{\partial^2 C \over \partial M^2 \nonumber \end{eqnarray}

Combining these the BS equation becomes

    \[rC= -{\partial C \over \partial \tau} + {1 \over 2}\sigma^2 {\partial^2 C \over \partial M^2} + \bigr( r - {1 \over 2} \sigma^2 \bigl) {\partial C \over \partial M}\]

The linear term in C can be removed by another transformation D = C e^{-r\tau} so that

    \begin{eqnarray*} {\partial D \over \partial \tau} &=& {\partial C \over \partial \tau}e^{-r\tau} - rCe^{-r\tau} \nonumber \\ \nonumber \\ {\partial^n D \over \partial M^n} &=& {\partial^n C \over \partial M^n}e^{-r\tau} \nonumber \end{eqnarray}

The exponential terms cancel throughout, and we are left with

    \[0 = -{\partial D \over \partial \tau} + {1 \over 2}\sigma^2 {\partial^2 D \over \partial M^2} + \bigr( r - {1 \over 2} \sigma^2 \bigl) {\partial D \over \partial M}\]

One final transformation will be needed before putting in boundary conditions. The transformation will be

    \[y = M + \Bigr( r -{1\over 2 }\sigma^2\Bigl)\tau\]

But unlike the other transformations I’ve suggested so far, this one mixes the two variables that we are using, so a bit of care is required about what we mean. When I most recently wrote the BS equation, D was a function of M and \tau – this means that the partial differentials with respect to \tau were implicitly holding M constant and vise versa. I’m now going to write D as a function of y and \tau instead, and because the relationship features all three variables we need to take a bit of care with our partial derivatives:

    \[dD(M,\tau) = {\partial D \over \partial \tau}\bigg|_{M} d\tau + {\partial D \over \partial M}\bigg|_{\tau} dM\]

where vertical lines indicate the variable that is being held constant during evaluation. Now, to move from M to y, we expand out the dM term in the same way as we did for dD above

    \[dD(M(y,\tau),\tau) = {\partial D \over \partial \tau}\bigg|_{M} d\tau + {\partial D \over \partial M}\bigg|_{\tau} \Bigr({\partial M \over \partial \tau}\bigg|_{y} d\tau + {\partial M \over \partial y}\bigg|_{\tau}dy\Bigl)\]

    \[= dD(y,\tau) = {\partial D \over \partial \tau}\bigg|_{y} d\tau + {\partial D \over \partial y}\bigg|_{\tau} dy\]

We can compare these last two equations to give expressions for the derivatives that we need after the transformation by comparing the coefficients of d\tau and dy

    \begin{eqnarray*} {\partial D \over \partial \tau}\bigg|_y &=& {\partial D \over \partial \tau}\bigg|_M + {\partial D \over \partial M}\bigg|_{\tau} {\partial M\over \partial \tau}\bigg|_y \nonumber \\ \nonumber \\ {\partial D \over \partial y}\bigg|_{\tau} &=& {\partial D \over \partial M}\bigg|_{\tau}{\partial M \over \partial y}\bigg|_{\tau} \nonumber \end{eqnarray}

Computing and inserting these derivatives [I’ve given a graphical representation of the first of these equations below, because the derivation is a little dry at present!] into the BS equation gives

    \[0 = -{\partial D \over \partial \tau}\bigg|_y + {1 \over 2}\sigma^2 {\partial^2 D \over \partial y^2}\bigg|_{\tau}\]

This is the well-known Heat Equation in physics. For the sake of brevity I won’t solve it here, but the solution is well known – see for example the wikipedia page – which gives the general solution:

    \[D(x,\tau) = {1 \over \sqrt{2\pi \sigma^2 \tau}}\int^{\infty}_{-\infty} e^{-(x-y)^2 \over 2\sigma^2 \tau} p(y) dy\]

Where p(y) is the payoff condition (it’s now an initial condition, as expiry is at \tau = 0). The algebra is quite involved so I give the solution its own post, and you can show by substitution that the BS option formulae given previously is a solution to the equation.

An illustration of the difference between partial differentials when a change of variables involving both current varibles is used
An illustration of the difference between partial differentials when a change of variables involving both current variables is used. This should be thought of as a contour plot with the value of D on the out-of-plane axis. The amount D changes when moving a small amount dt depends on which direction you are moving in, as shown above.

As an aside, what was the portfolio that I was considering all of the way through? Comparing \alpha to the vanilla greeks, we recognise it as the option delta – the hedging portfolio is just the portfolio of the option with just enough stock to hedge out the local delta risk. Of course, as time goes by this value will change, and we need to constantly adjust our hedge to account for this. This shows the breakdown caused by one of our assumptions – that we could trade whenever we want and without transaction costs. In fact, because we need to re-hedge at every moment to enforce this portfolio’s risk free nature, in the presence of transaction costs the hedging costs in this strategy will be infinite! This demonstrates a significant failing of one of our assumptions, I’ll come back again to the effect of this in the real world in future posts.

Time Varying Parameters

When I discussed the BS equation here, one of the assumptions was that r and \inline \sigma were constant parameters. In reality, neither of these will be constant: how much of a problem is this for us? In general, they will both be stochastic and hence unpredictable in the future. In this post however I’m going to stick to deterministic quantities and demonstrate that BS can be readily extended to time-varying rates and vols. This will enable us to price at-the-money options correctly, but still won’t help us with the vol smile effect that I discussed here.

If r and \inline \sigma are time varying, the stochastic differential equation describing the underlying spot price in BS is

{ dS \over S} = r(t)dt + \sigma(t)dW_t

Using Ito’s Lemma as discussed before, this can be re-written in terms of the log as

d(\ln S) = \Bigl( r(t) - {1 \over 2}\sigma^2(t)\Bigr) dt + \sigma(t) dW_t

And hence

S(t) = S(0).\exp{ \Bigr[ \Bigr( \bar{r} - {1 \over 2}\bar{\sigma}^2 \Bigl)t + \bar{\sigma} \sqrt{t} z \Bigl] }

where \inline z \sim {\mathbb N}(0,1). This is still lognormal as before, but we now have an effective rate and an implied volatility defined by (you can see these by comparing to the distribution coming from constant r and \inline \sigma):

\bar{r}(t) = \int_0^t r(t')dt'

\bar{\sigma}(t) = \sqrt{ {1 \over t} \int_0^t \sigma^2(t')dt'}

\inline \sigma(t) is called the instantaneous vol, but the relevant quantity for option pricing is always the implied vol \inline \bar{\sigma}(t). Assuming that we have a discount curve constructed from traded bonds we can calculate r(t) from that as I described here, and if we can see liquid at-the-money (ATM) options on the market at different times we can also calculate the \inline \sigma(t) function consistent with them from their implied volatilities, which we will need to simulate paths in Monte Carlo, via the following procedure

  • Take all available ATM options, label their times in order \inline \{ t_1, t_2, \cdots , t_n \}
  • Calculate their implied vols using a vol solver (eg. the one on my PRICERS page). These correspond to \inline \{ \bar{\sigma}(t_1), \bar{\sigma}(t_2), \cdots , \bar{\sigma}(t_n) \}
  • As a first approximation, we will assume \inline \sigma(t) is constant between time windows (although you could make approximations arbitrarily complicated). In this case, for \inline 0 < t < t_1 we have

\bar{\sigma}(t_1) = \sqrt{ {1 \over t_1} \int_0^{t_1} \sigma^2dt'} \qquad 0 < t' < t_1

  • Which works out to simply \inline \sigma(t) = \bar{\sigma}(t) in this region
  • For later vols, the procedure is a little more complicated:

\bar{\sigma}(t_2) = \sqrt{ {1 \over t_2} \int_0^{t_2} \sigma^2(t')dt'} = \sqrt{ {1 \over t_2} \int_{t_1}^{t_2} \sigma^2(t')dt' + \bar{\sigma}^2(t_1)}

t_2 \cdot\bar{\sigma}^2(t_2) - t_1 \cdot\bar{\sigma}^2(t_1) = \int_{t_1}^{t_2} \sigma^2(t')dt' = (t_2 - t_1) \sigma^2(t_1)

\sigma(t_1) = \sqrt{t_2 \cdot\bar{\sigma}^2(t_2) - t_1 \cdot\bar{\sigma}^2(t_1) \over (t_2 - t_1) }

  • And for general time window \inline t_n < t < t_{n+1}:

\sigma(t) = \sqrt{t_{n+1} \cdot\bar{\sigma}^2(t_{n+1}) - t_n \cdot\bar{\sigma}^2(t_n) \over (t_{n+1} - t_n) } \qquad t_n < t < t_{n+1}

  • And this final expression can be used to calculate the value of the instantaneous vol required between each time window to give the correct ATM implied vol

Note that this will fail if \inline t_{n+1} \cdot\bar{\sigma}^2(t_{n+1}) < t_n \cdot\bar{\sigma}^2(t_n) – this is because we expect the distribution variance \inline t \cdot\sigma^2(t) to be an increasing function of time. If it didn’t hold for any time windows we’d have an arbitrage opportunity, selling the first option and buying the second to lock in a risk-free profit.

As I mentioned, this extension allows BS to correctly price options and forwards at different times by matching to market observables, but still doesn’t predict any volatility smile. Can we take it any further? In fact we can: in a later post I will discuss the local vol model which extends \inline \sigma(t) to be \inline \sigma(S,t) and will allow us to match any set of arbitrage-free market prices.

Importance of the Vol Smile

After going through the BS model and deriving an equation for vanilla options, it is tempting to believe all of the assumptions that have gone into it. This post will be the first in a series examining some of those assumptions, extending them where possible, and looking at how they fail in some other cases.

Today I’m going to write about the vol smile. If you go to the market and examine quoted vanilla option prices at a given expiry, and put these into an implied vol solver (for example the one on my page!!), if the market believed the BS model was correct, we’d expect to get the same value for each option, no matter what the strike. Alas, not so! In fact, what we will see is that options that have strikes further away from the forward price will usually have higher implied vols than those near the forward price (ie. ‘at-the-money’). This is called the ‘vol smile’ because as it increases away from the money in either direction, it looks something like a smile! In the picture below I’ve shown some toy vol smiles and the sort of evolution that is typically seen with time-to-expiry.

What does this mean? A higher implied vol means a higher price for the option, so we’re saying options far in-the-money or out-of-the-money cost more in real life than expected by the BS model. In the BS model, the log of the stock price was normally distributed at expiry, but more expensive options at distant strikes means that the real distribution has a higher-than-expected probability of ending up at extreme strikes, so the real probability has ‘fat tails’ relative to a normal distribution.

As described in the post on Risk-Free Valuation, we can go a step further and back out the market-implied distribution from the observed call prices from the equation

p_S(K) = {1\over \delta(t)}{\partial^2 C\over \partial K^2}

Taking the smile shown above at \inline t_1, and calculating prices at each strike (using the standard BS vanilla equation, but with the implied vol given by the smile as the vol input), and taking the second derivative with respect to strike, gives the following risk-neutral distribution

The risk-neutral distribution for the smile shown above, compared to a lognormal Black-Scholes distribution. Note the central peak and fat tails of the smiley distribution

Note that the real distribution does indeed have fat tails at distant strikes. Since they have the same expectation and variance, this means that it also has a central peak, and intermediate values suppressed relative to the normal. The graph below shows the tails in more focus

A comparison of the tails of the distribution. The smile distribution leads to an increased probability of extreme events with large spot moves.

How can we alter the BS model to accommodate a vol smile? One possibility is to allow the vol parameter to vary deterministically with spot and time. This approach can indeed match observed vol smiles, also has several weaknesses, I’ll explore it in depth in a later post (it’s called the Local Volatility model, by the way).

A more interesting idea is to allow the volatility itself to be a random variable. This seems intuitive – the volatility, as well as the stock price, responds to information arriving randomly and unpredictably and thus probably should be stochastic. Why would this give us a vol smile? Well – option prices can be seen as the average payoff over all of the different paths that the spot might take. For paths in which the vol stays low, the price won’t go very far. On the other hand, if the vol increases lots there’s a much higher chance that we will end up far away from the money. Looking at this in reverse, if at the expiry date we’re far from the money, it’s much more likely that we followed a higher volatility path to get here, so the implied vol away from the money will be higher.

Stochastic vol models are widely used by practitioners, and there are many different types and models used with many different strengths and weaknesses. I will return to this topic again, most likely repeatedly! The take-home lesson for today though is that vol smiles are important: they imply fat-tailed distributions relative to a lognormal, and they are significant, real features of markets – we need our models to match them or we will lose lots of money to other people who are!

[An interesting historical point is that before the market crash in 1987, there was no vol smile – options indeed tended to have the same vol regardless of strike – people believed the BS model more than they do now. Could the crash have made people realise that large moves were much more likely in real life than BS suggested, and adjusted accordingly? Or do higher prices at extreme strikes represent traders insuring themselves against the possibilities of more market crashes? There are parallels with the present – another assumption of BS is that it is possible to borrow unlimited amounts at the risk-free borrowing rate. This was almost true for big banks before the 2007 crash, but not so any more, and once again a lot of what we do now is trying to understand how to price options correctly in a price where there isn’t really such a thing at the risk-free rate. Each crash seems to lead to belated better understanding of the BS model weaknesses, and because markets often follow the models that participants are using to model them, this improved understanding itself has an effect on the market!]