I’m in the process of preparing some C++ code for the blog to allow user-downloadable and -compilable Monte Carlo pricers that won’t overload my server. That’s got me thinking about random number generators, and I’m going to talk a bit about them here [In fact, all of the algorithms discussed below are entirely deterministic, so I should really call them pseudo-random number generators. Whether or not particular algorithms are suitable for use in applications requiring random numbers is an area of active research and occasional catastrophic failure].
Generally, in Monte Carlo simulations we want to be able to generate many ‘sample paths’, each of which is a realisation of a possible evolution of the underlying price, and evaluate the payoff for that path. As I discussed in an earlier post, we usually model these quantities as depending on brownian motions and results are normally distributed, which means we need to generate random variables that are also normally distributed (so, for example, we are more likely to get values near to zero than at large values).
Generating these directly is quite tricky, so the first step is usually to generate uniform random variables and then transform these to gaussian variates. A uniform random variable is a variable with a constant probability of being anywhere within a certain range, typically [0,1] or [-1,1] – for the rest of this post, I denote a uniform variable as U[…], and define u~[0,1] and v~[-1,1]. The pdf of u and v are shown here:
A straight-forward method of generating these numbers uses modular multiplication [this is variously called a Park-Miller generator, a linear congruential generator and a Lehmer generator]:
For good choices of the variables, this will generate a sequence with a period of n that is uniformly distributed across the interval [0,n], so dividing by n rescales the variables to be distributed roughly according to u. The choices of g, n and (the ‘seed’) are non-trivial and several are given in the literature, as usual I’ve followed the advice given on wikipedia and lumped for n = 2^32 – 5, g = 279470273, and the choice of seed is left to the user but should be coprime to modulus n. Note that this will have a period of about 2^32, so the upper limit on the number of monte-carlo evaluations that can meaningfully be done is around 4 billion steps. It is also easy to see how the generator could be hampered by a poor choice of parameters – if the multiplier and the modulus are not co-prime, for example, the actual period of the sequence can be much less than n.
The more advanced Marsenne Twister algorithm aims to correct many of the problems with the Park-Miller generator, such as increasing the period length of the sequence and improving the statistical randomness of the values produced. It’s too complicated to go through here (it’s a full post in itself!) in full detail but many good summaries of the theory and performance of the twister algorithm can be found online.
One other thing to point out is that if initilised in the same way, each time one of the above algorithms is run it will return the same sequence of numbers. At first this might seem like an unwanted property as it deviates from ‘true’ randomness. However, in our context it will be extremely valuable as it allows us to test code using the same sequence of numbers each time. This is vital for calculating greeks by bump-and-revalue using monte carlo as otherwise monte carlo error in price will usually swamp small changes due to greeks, and it also helpful when optimising code to make sure that results aren’t being affected.
Once we have a source of uniform random variates, we need to consider methods for converting them to gaussian variates. The simplest and most direct is the inversion method, which involves finding a mapping between uniform random variate u and the standard normal random variate z. Consider first the CDF of the normal variable (I have discussed this before here), shown below:
This shows us the likelihood that z is below a certain value X, and is equal to the integral of the PDF from to X. Because the PDF is normalised to integrate to 1, the CDF maps each possible outcome of z to a value between 0 and 1, with the most likely outcomes being where the CDF is steepest (since these cover the y-axis values more quickly) which is exactly where the PDF is largest. This is exactly the opposite of what we want, so we consider instead the inverse CDF shown here:
This mapping is exactly what we’re after [although there are others – it’s not unique]. It maps uniformly distributed variates to normally distributed variates and they’re in a 1-to-1 correspondence. This procedure is robust and in principle it is exact. However, it is not always the fastest method, and it requires us to know the inverse CDF, which to evaluate in a reasonable time requires us to make approximations (this technique works for any distribution – although there is no closed form CDF for the normal distribution, for many other distributions the inverse CDF can be expressed in closed form, in which case this is almost certainly the best method).
There are several other techniques for obtaining normal distributions from uniform variates, a few of which I have implemented and describe them here because they’re quite interesting. The first is called the Box-Muller technique, which converts two independent uniform variates into two independent normal variates by combining the inversion technique described above, the trick that we use to calculate a gaussian integral and the change of variables law for Pdfs that I discussed here.
Consider two independent normally-distributed variables x and y. The joint PDF of the two is
As for gaussian integration, we can re-express this in polar co-ordinates remembering that
If we can simulate a value of r and theta using uniforms that obeys these densities, we will be able to transform them to normals x and y straightforwardly. There is radial symmetry so we can integrate over to give us the radial cumulative probability density (ie. radial CDF)
Unlike x or y, we can calculate this in closed form and invert it
This quantile function allows us to select a point with the correct radial distribution using a first uniform variate, and we then need to select a theta from a uniform distribution around angles – this can be done by multiplying a second uniform variate by [nb. we can simplify the radial quantile further – note that if u is uniformly distributed across [0,1], then so is (1-u), so we can substitute u for (1-u)]. The values of normally distributed variates are given by the x- and y-coordinates of the resulting point, ie.
This is a really useful trick, quickly converting two uniform variates into two gaussian variates. We can improve the speed still further if we can get rid of the computationally expensive sine and cosine functions, for which there is another nifty trick using rejection. In the final step, we inserted two uniform variates to produce and . We can use the following transformation to turn our initial uniform variates into two new variates that avoid the need for a sine or cosine:
Instead of taking and , consider instead two uniform variables and across [-1,1]. Let , and if s > 1 then reject the values and start again. The variables now lie within the unit sphere as shown below:
We can use change of variables to show that after rejecting external points, both theta and s are also uniform on [0,1]:
Above, we used and to generate a value of r and angle, but since the s and that we’ve generated here are also uniform on [0,1] they can be used just as well. Because of our definitions, , so new expressions are:
We’ve got an even more efficient set of formulae, at the cost of having to throw out around 20% of our values. It will turn out that this is often quicker than the basic version of the algorithm, but is totally useless if we move to pseudo-random numbers, as I will discuss another time.
The final technique I want to talk about is a generalisation of acceptance-rejection, which we’ve just touched on and was also used implicitly in my first post on Monte Carlo.
The idea here is once again to simulate a distribution f(x) by sampling instead from another distribution g(x) that overlaps it everywhere (we can rescale g(x) by a constant A to ensure this), and rejecting values that fall inside A.g(x) but outside of f(x). This is usually used when we can simulate g(x) directly from uniforms by using its inverse CDF – perhaps the best way to demonstrate how this works is through another example. We let f(x) be the normal PDF, and g(x) the exponential distribution, such that
This only has support on positive x, so we redefine it to cover positive and negative x, and rescale it so that it is everywhere larger than the normal pdf
and calculating the inverse CDF gives
We generate two independent uniform draws and . The first is used to simulate a variate distributed according to A.g(x) via the inverse CDF given here. Then we compare the values of A.g(x) at that point to the value of f(x) evaluated at that point. The second uniform variate is compared to the ratio of the two – if it is GREATER than the ratio, our point has fallen ‘outside’ of f(x), so it is rejected and we start again. If it is LOWER than the ratio, we accept the value of as a realisation of our target distribution f(x).
An obvious problem with this technique is that if the distributions don’t overlap well, many variates will be rejected, so we need a good uniform generator to provide enough independent variables for the algorithm to work.
The algorithm can be made arbitrarily complex by designing trial distributions g(x) that match f(x) as closely as possible, such as the ‘Ziggaraut algorithm’ which optimises the approach by breaking g(x) into many parts matched to different sections of f(x), which and is often used to generate normal random variates. As with the Marsenne Twister, there are many good guides to this algorithm available online.