Stochastic Differential Equations Pt1: The BS Equation

SDEs are of fundamental importance in quantitative finance. Because we don’t know what is going to happen in the future, we need some kind of random process built into our equations to model this uncertainty.

By far the most common process used is Brownian motion, which in one dimension is called the Wiener process; we will label its value at time t as \inline W_t. This is a diffusive process (basically, there are no ‘jumps’, and in small time intervals the distance the particle travels is probably very small) with no memory, so that future motion is independent of past motion. If we choose any two times s and t with t > s, then the difference between the values of the process at these times is distributed normally, \inline W_t - W_s \sim {\mathbb N}(0,t-s). This means that our uncertainty about a particle (or asset) following a Wiener process at some future time T can be characterised by a normal distribution with expectation 0 and variance T (ie. standard deviation \inline \sqrt{T}).
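This characterisation is easy to check numerically. Here’s a quick sketch (the step and path counts are my own choices, not anything from the post) that builds \inline W_T out of independent normal increments and checks that the terminal values have mean near 0 and variance near T:

```python
# Build W_T by summing independent increments W_{t+dt} - W_t ~ N(0, dt),
# then check the terminal distribution: mean ~ 0, variance ~ T.
import random

random.seed(42)

def simulate_terminal_values(T=1.0, n_steps=50, n_paths=10000):
    """Return one W_T value per simulated path."""
    dt = T / n_steps
    terminals = []
    for _ in range(n_paths):
        w = 0.0
        for _ in range(n_steps):
            w += random.gauss(0.0, dt ** 0.5)
        terminals.append(w)
    return terminals

values = simulate_terminal_values()
mean = sum(values) / len(values)
var = sum((v - mean) ** 2 for v in values) / len(values)
```

With 10,000 paths the sample mean should land within a few hundredths of 0, and the sample variance within a few hundredths of T = 1.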

The Black-Scholes SDE for a stock price (for reasons that will become clear later, this is usually called geometric Brownian motion, by the way) is

{dS \over S} = \mu dt + \sigma dW_t

What does this mean? Well, the first two terms are simple enough for anyone who has studied ordinary differential equations. S is an asset price, t is time and \inline \mu and \inline \sigma are both constants. Without the stochastic term, we would be able to solve the equation by a simple integration on both sides

\int_{S_0}^{S_T} {dS \over S} = \int_0^T \mu dt

S_T = S_0 e^{\mu T}

which is simply exponential growth of a stock with time. Later, we will introduce risk-free bonds that grow in this way, like compound interest in a bank account, but here we are interested in including an element of uncertainty as well. How do we deal with that?

We might be tempted to try the same thing and integrate the whole equation, using the result that \inline d \ln S = S^{-1} \cdot dS = \mu dt + \sigma d W_t as we implicitly did above. Unfortunately, for stochastic processes this doesn’t work any more. The reasons are rather deep, but for the moment the fix is that we need to use a central result of stochastic calculus, Ito’s Lemma, to resolve the conundrum. This is the stochastic version of the chain rule for a function of several variables, which says that if

dS = \mu (S,t) dt + \sigma (S,t) dW_t


then

d( f(S,t)) = \Bigl( {\partial f \over \partial t} + \mu(S,t) {\partial f \over \partial S} + {\sigma(S,t)^2 \over 2}{\partial^2 f \over \partial S^2} \Bigr) dt + \sigma(S,t) {\partial f \over \partial S} dW_t

That seems like quite a mouthful, but for the case that we’re considering, \inline \ln S, it’s actually pretty straight-forward. Since \inline \ln S doesn’t depend explicitly on time, the time derivative disappears, and the other derivatives are easy: \inline {\partial \over \partial S} \ln S = {1 \over S} and \inline {\partial^2 \over \partial S^2} \ln S = -{1 \over S^2}. Remembering that we need to multiply all of the terms in the BS equation above by S to turn the LHS into plain old dS (so that, in the notation of Ito’s Lemma, \inline \mu(S,t) = \mu S and \inline \sigma(S,t) = \sigma S), we get:

d \ln S = ( \mu - {1 \over 2} \sigma^2 ) dt + \sigma dW_t

with the \inline \mu and \inline \sigma just plain old constants again. We have picked up an extra term here, which came from the second-derivative term in Ito’s Lemma. We are now in a position to integrate both sides of the equation (\inline \int_0^T \sigma \, dW_t = \sigma W_T, by the way), and similarly to the result in the deterministic case we get

S_T = S_0 e^{(\mu - {1 \over 2}\sigma^2)T + \sigma W_T}

This is the solution of the BS SDE, but what does it mean? There’s quite a bit to digest above, so I’ll talk a bit more about what this means in the second part of this blog post.
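As a quick numerical sanity check of this solution (a sketch in Python; the parameter values are my own choices), we can sample \inline S_T directly, since \inline W_T \sim {\mathbb N}(0,T) — the log-returns should then have mean \inline (\mu - {1 \over 2}\sigma^2)T and variance \inline \sigma^2 T:

```python
# Sample S_T from the closed-form solution, using W_T ~ N(0, T);
# log-returns ln(S_T/S_0) should be normal with mean (mu - sigma^2/2)*T
# and variance sigma^2 * T.
import math
import random

random.seed(1)

def sample_terminal_price(S0, mu, sigma, T):
    """One draw of S_T = S0 * exp((mu - sigma^2/2)*T + sigma*W_T)."""
    w_T = random.gauss(0.0, math.sqrt(T))
    return S0 * math.exp((mu - 0.5 * sigma * sigma) * T + sigma * w_T)

S0, mu, sigma, T, n = 100.0, 0.05, 0.2, 1.0, 50000
log_returns = [math.log(sample_terminal_price(S0, mu, sigma, T) / S0)
               for _ in range(n)]
mean_lr = sum(log_returns) / n                              # ~ 0.03
var_lr = sum((r - mean_lr) ** 2 for r in log_returns) / n   # ~ 0.04
```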


Monte Carlo in Quantitative Finance

In this post I’m going to give a little introduction to Monte Carlo as a method for integration, and try to get server-side scripting working via WordPress!

Monte Carlo is fundamentally a technique for doing numerical integration. If we’re confronted with an integral that we can’t solve analytically, we have an array of possible techniques available to us. For 1d integrals, probably the easiest thing for well-behaved functions is to use Simpson’s rule and integrate across the part we’re interested in. But to do this in many dimensions can be tricky – for each dimension we probably want the same number of partitions, so the overall number of function evaluations scales exponentially with dimension – very bad news! By contrast, Monte Carlo essentially samples the function at random points and takes an average. The higher-value points contribute more to the average than lower-value points, and the overall error in the average scales as \inline {1 \over \sqrt{n}} in the number of samples n, regardless of dimension, giving a good approximation relatively quickly for high-dimensional problems.

The quintessential example of a Monte Carlo experiment is a simple process to approximate pi. Consider drawing a circle inscribed in a square, as shown in the following image:

Now, imagine scattering grains of dried couscous over the whole area. They will land randomly all over the surface, and first of all we sweep away any that fall outside of the square.

What next? Well, if we count the number of grains that fell inside the circle, and compare that to the total amount that fell inside the square, what do we expect the ratio to be? As long as they’re not too clustered, the expectation is that it will be \inline {\pi \over 4}, which is of course just the ratio of the areas.

Rather than actually counting grains, we can simulate this process on the computer much more quickly. To represent each grain, we simulate two uniform variables, each on the interval [-0.5,0.5]. We treat these as the grain’s x-coordinate and y-coordinate, and calculate its squared distance from the origin (0,0) by summing the squares of the two. If this sum is less than the square of the circle’s radius (ie. 0.25) then the grain is ‘inside’ the circle; if it is greater then the grain is ‘outside’.
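The steps above translate almost line-for-line into code — here’s a minimal sketch (the sample count is my own choice):

```python
# Scatter n random "grains" over the square [-0.5, 0.5]^2 and count the
# fraction landing inside the inscribed circle of radius 0.5; that
# fraction estimates pi/4, the ratio of the two areas.
import random

random.seed(7)

def estimate_pi(n):
    inside = 0
    for _ in range(n):
        x = random.uniform(-0.5, 0.5)
        y = random.uniform(-0.5, 0.5)
        if x * x + y * y < 0.25:   # squared distance < squared radius
            inside += 1
    return 4.0 * inside / n

approx = estimate_pi(100000)
```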

Here is the simulation run for 1000 grains of cous-cous (refresh for a repeat attempt):

Grains Simulated: 1000
Grains Inside Circle: 793
Our estimate of pi is 3.172

The estimate here is probably fairly close – you may have been lucky (or unlucky!) – but we claimed that the estimate will converge to the real value of pi as the number of grains grows [exercise: can you demonstrate this using the central limit theorem?]. Below you’ll find another simulation, this time going up to a little over a million grains, making an estimate each time the number of grains doubles and recording the error in that estimate at each step.

grains      estimate    error
2           2           -1.141593
4           2           -1.141593
8           2.5         -0.641593
16          2.75        -0.391593
32          2.75        -0.391593
64          2.9375      -0.204093
128         3           -0.141593
256         3.078125    -0.063468
512         3.15625     0.014657
1024        3.109375    -0.032218
2048        3.101563    -0.04003
4096        3.101563    -0.04003
8192        3.099121    -0.042472
16384       3.116943    -0.024649
32768       3.12439     -0.017203
65536       3.132324    -0.009268
131072      3.133606    -0.007987
262144      3.134659    -0.006934
524288      3.135147    -0.006446
1048576     3.13747     -0.004122

It should be fairly straight-forward to copy-paste these figures into Excel – does the error fall like \inline {1 \over \sqrt{n}} as claimed? In fact, the central limit theorem tells us that the ratio of grains inside tends to a normal distribution with standard deviation proportional to this quantity, so we should expect the error to fall outside the one-standard-deviation bound about 30% of the time (if you have calculated the right constant!).
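The doubling experiment itself can be sketched like this (seed and grain counts are my own choices):

```python
# Reproduce the doubling experiment: keep one running count of grains
# inside the circle, and record the estimate and its error every time
# the total number of grains doubles, up to 2^20 grains.
import math
import random

random.seed(0)

inside, total, rows = 0, 0, []
n_target = 2
while n_target <= 2 ** 20:
    while total < n_target:
        x = random.uniform(-0.5, 0.5)
        y = random.uniform(-0.5, 0.5)
        if x * x + y * y < 0.25:
            inside += 1
        total += 1
    estimate = 4.0 * inside / total
    rows.append((total, estimate, estimate - math.pi))
    n_target *= 2

for total, estimate, error in rows:
    print(total, round(estimate, 6), round(error, 6))
```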

Of course, in the 2 dimensional circle case, a better idea might be to try and put points evenly over the square and count how many of these fall inside/outside. As mentioned before, the benefits of Monte Carlo are most pronounced when there are many dimensions involved. But, this is something like the procedure involved in quasi-Monte Carlo, a procedure that I’ll talk about some other time that doesn’t use random numbers at all…


Cumulative Normal Distribution

As discussed in my last post, I had some issues with an implementation of the standard normal cdf function, which I address here. This post will be my first test of the equation editor that I’ve just installed!

The standard normal cdf looks like this:

\Phi(x) = {1 \over \sqrt{2\pi}}\int ^{x}_{-\infty} e^{-{1 \over 2} y^2 } dy

I had a few thoughts as to how to go about this. The normal pdf is symmetric around x = 0, so \inline \Phi(-x) = 1 - \Phi(x) and we only need to worry about solving for positive x; negative x values then follow by reflection. Possibilities were:

  • Series Approximation

e^x = \sum_{n = 0}^{\infty} {1 \over n!} x^n

{1\over \sqrt{2\pi}}\int_{-\infty}^x e^{-{1 \over 2}y^2} dy = {1 \over 2} + {1\over \sqrt{2\pi}}\sum_{n = 0}^{\infty} \int_{0}^x {(-1)^n \over 2^n n!} y^{2n} \, dy = {1 \over 2} + {1\over \sqrt{2\pi}}\sum_{n = 0}^{\infty} {(-1)^n \over (2n+1)\, 2^n n!} x^{2n+1}

The problem with this expression is that the terms oscillate in sign and grow before they shrink, which means that for large enough x a fixed number of terms will explode to either very big positive or negative values. We could cut off above a certain x and patch in some sort of asymptotic curve, but I’m looking for a fairly robust solution that doesn’t involve too much testing for the moment, and this sounds like a lot of work!
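A quick sketch shows both faces of this series — accurate for moderate x, useless at a fixed order for large x:

```python
# Partial sums of the series above:
#   Phi(x) ~ 1/2 + (1/sqrt(2*pi)) * sum_n (-1)^n x^(2n+1) / ((2n+1) 2^n n!)
# Accurate for moderate x, but at a fixed number of terms the oscillating,
# growing terms make it explode for large x.
import math

def phi_series(x, n_terms=30):
    total = 0.0
    for n in range(n_terms):
        total += ((-1) ** n * x ** (2 * n + 1)
                  / ((2 * n + 1) * 2 ** n * math.factorial(n)))
    return 0.5 + total / math.sqrt(2.0 * math.pi)

def phi_exact(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

small_err = abs(phi_series(1.0) - phi_exact(1.0))    # tiny
big_err = abs(phi_series(10.0) - phi_exact(10.0))    # astronomically large
```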

  • Lookup Table

Again, this solution would be a lot of work to set up, but would have a guaranteed level of accuracy and would find the answer in a fixed amount of time. A vector could be set up containing the value of x that gives cdf(x) = 0.5 + n*delta; with delta of one millionth or so the array would have around 2^19 members, and a binary search would locate the closest entry to x in about 19 operations (assuming constant-time access to a vector). The index n of this entry would then give the value of the cdf as 0.5 + n*delta, and an interpolation scheme between adjacent values could improve accuracy further.

Pros of this solution are efficiency and accuracy, but there would be a large setup cost and possibly a memory cost to load the lookup table. If we had to call the function inside a Monte Carlo engine or a numerical integration, this might be justified by the runtime savings, but for the moment it’s a bit too much work!

  • Analytic Approximation

To get the pricer described in my last post running, I implemented the following approximation to the error function:

a1 = 0.278393
a2 = 0.230389
a3 = 0.000972
a4 = 0.078108

{\rm erf}(x) \approx 1 - {1 \over (1 + a_1x + a_2 x^2 + a_3 x^3 + a_4 x^4)^4}

\Phi(x) = 0.5\Bigl[ 1 + {\rm erf}\Bigl({x \over \sqrt{2}}\Bigr) \Bigr]
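In Python (my own transcription, not the pricer’s JavaScript), this approximation reads:

```python
# Abramowitz & Stegun 7.1.25: erf(x) ~ 1 - 1/(1 + a1*x + ... + a4*x^4)^4
# for x >= 0 (max error around 5e-4); erf(-x) = -erf(x) handles negatives.
import math

A1, A2, A3, A4 = 0.278393, 0.230389, 0.000972, 0.078108

def erf_approx(x):
    if x < 0:
        return -erf_approx(-x)
    d = 1.0 + A1 * x + A2 * x ** 2 + A3 * x ** 3 + A4 * x ** 4
    return 1.0 - 1.0 / d ** 4

def phi_approx(x):
    """Standard normal cdf via the erf approximation above."""
    return 0.5 * (1.0 + erf_approx(x / math.sqrt(2.0)))
```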

Well, it got the pricer working but turns out to be not too accurate. For example, I got this running in Excel and compared it to the inbuilt NORMSDIST implementation for some typical numbers (forward = 100, strike = 90, vol = 10%, dcf = 1, time = 1). For NORMSDIST(d1) Excel gives 0.86512, while my Phi(d1) gives 0.86489, so the error is in the 4th significant figure. This leads to a price of 10.7124 from Excel against 10.7095 from my implementation, and I’d like to do a bit better. Since I’d ruled out the possibilities described above, I went back to Wikipedia and implemented the following version of the cdf, which gave a much more agreeable price (matching Excel’s now to 7 significant figures):

b0 = 0.2316419;
b1 = 0.31938153;
b2 = -0.356563782;
b3 = 1.781477937;
b4 = -1.821255978;
b5 = 1.330274429;

t = {1 \over 1+b_0 x}

\phi(x) = {1 \over \sqrt{2\pi}} e^{-{1 \over 2}x^2}

\Phi(x) = 1 - \phi(x)\cdot(b_1t + b_2t^2 +b_3t^3+b_4t^4+b_5t^5)

(valid for x \geq 0; negative arguments come from the reflection \inline \Phi(-x) = 1 - \Phi(x) as before)
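Transcribing this scheme (again my own Python sketch, not the pricer’s code):

```python
# Abramowitz & Stegun 26.2.17 (the Wikipedia version above): valid for
# x >= 0, max absolute error around 7.5e-8; reflection handles x < 0.
import math

B0 = 0.2316419
B = (0.31938153, -0.356563782, 1.781477937, -1.821255978, 1.330274429)

def phi_cdf(x):
    if x < 0:
        return 1.0 - phi_cdf(-x)
    t = 1.0 / (1.0 + B0 * x)
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return 1.0 - pdf * sum(b * t ** (i + 1) for i, b in enumerate(B))
```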

It’s worth noting that although this implementation seems to give better accuracy for most positive x, it relies on some library functions (square root and exponential) itself, so we’re leaving ourselves slightly at the mercy of how those are programmed. It’s also going to take much longer than the previous implementation, which only used polynomial arithmetic. Since we only need a couple of evaluations in a vanilla pricer, it’s not a big concern, but for applications that call it repeatedly we might want to do better.

That’s all for today!


A First Pricer

Well, this wouldn’t be much of a derivatives blog without a discussion of, and pricer for, vanilla options. ‘Vanilla’ options are what quants call the basic call and put options that really are the bread-and-butter of what we do (there are good descriptions all over the web of what these are, and I’m not going to go into too much detail here).

A call(put) option on an asset is the right to buy(sell) that asset at a certain strike price, K, at a certain time, T. Usually, the underlying asset will be an asset that has a quoted market spot price, S, that it can be readily traded at. If at time T the price S is more than K, then the call option will have some value, as the asset can be bought at K and sold at S for a profit of (S – K). The put will be worthless, since the right to sell the asset at K isn’t much good if you can readily sell it on the market for a higher price S anyway! If the spot price S is below K, then the situation is reversed.

So calculating the price of the option at the expiry time is straight-forward. The question is, what is the price before this time? The answer is given by the famous Black-Scholes equation, and I’ve implemented a version in the pricer section for you to play with here: Vanilla Pricer. Clearly, before time T we don’t know what the stock price will be at time T. However, we might have some idea of what its probability distribution will look like. The amazing thing about option prices is that (within certain approximations) they depend only on the variance of this distribution, and not its expected value – no matter how fast we think the stock will grow, the option value is determined only by the spread of values. The more spread, the more valuable the option – because options pay off for high terminal S while losses are capped at zero for low terminal S, they are more valuable when a wider range of outcomes is possible.
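As a sketch of what sits behind a pricer like that (my own Python, not the JavaScript linked above), here is the Black-Scholes price of a call on a forward F with strike K and discount factor df, using the standard library’s erf for the normal cdf:

```python
# Black-Scholes call on a forward: df * (F * Phi(d1) - K * Phi(d2)).
import math

def norm_cdf(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_call(F, K, sigma, T, df=1.0):
    d1 = (math.log(F / K) + 0.5 * sigma * sigma * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return df * (F * norm_cdf(d1) - K * norm_cdf(d2))

price = black_call(F=100.0, K=90.0, sigma=0.10, T=1.0)   # ~10.71
```

Notice that only the volatility appears in the formula – no growth rate anywhere – which is exactly the point made above about the price depending only on the spread of outcomes.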

What’s even more amazing is that we can get information the other way round – if we don’t know how the spot price S is going to end up, but we do have access to a market where options are being liquidly traded, we can IMPLY the terminal spot distribution from how these prices vary with strike! This is a really rich area of study that can be approached from lots of different angles; I’ll be going over some of it fairly slowly, and in no particular order, in the coming months.

Future posts will probably get a bit more mathematical, particularly once I’ve worked out how to use the equation editor here. Also, in implementing the pricer for vanilla options I’ve realised how woefully inadequate the maths infrastructure in JavaScript is for this sort of thing. I needed a cumulative normal distribution function, but no default was available, so I snatched the first one I could find from Wikipedia (due to Abramowitz and Stegun). It’s not too good, so I’ll address the quest for a better way of doing this fairly soon. Posts of this sort will be just as much for my benefit as yours, as these are the sorts of tasks I’ve taken for granted before!


Hello world!

Hello and welcome to ‘What Quants Do’! On this blog I aim to delve a bit into the wonderful world of mathematical finance, and the things that I do in my day job as a quant. There’ll be a mix of nice results, graphs, reviews of papers and books, and probably several streams of consciousness…

I’m also using this as an attempt to learn the WordPress content management system. Once I work out how to integrate my own PHP and JavaScript, I also hope to include some pricers to illustrate a few of the points that I plan to cover!

Finally, I’m hoping that I’ll get a bit of discussion going, since that’s always the bit I enjoy most about other people’s blogs.

Looking forward to it!