## Interview Questions VII: Integrated Brownian Motion

For a standard Wiener process denoted $W_t$, calculate

${\mathbb E}\Big[\Big(\int_0^T W_t\,dt\Big)^2\Big]$

This integral is the limit of the sum of $W_t$ at each infinitesimal time slice from $0$ to $T$; it is called Integrated Brownian Motion.

We see immediately that this will be a random variable with expectation zero, so the expectation of its square is simply the variance of the random variable, which is what we need to calculate. Here I show two ways to solve this: first by taking the limit of the sum, and second using stochastic integration by parts.


1) Limit of a sum

$\int_0^T W_t\,dt = \lim_{n \to \infty} \sum_{i=1}^{n} W_{t_i}\,\delta t\,; \qquad t_i = i\,\delta t\,, \quad \delta t = {T \over n}$ (1)

The values of the Wiener process appearing in this sum are not independent of one another, since only the increments are independent. However, we can re-write them in terms of sums of these independent increments

$W_{t_i} = \sum_{j=1}^{i} \delta W_j\,; \qquad \delta W_j = W_{t_j} - W_{t_{j-1}}$ (2)

where $\delta W_j$ are the individual independent increments of the Brownian motion, each with variance $\delta t$. Substituting into our previous equation and reversing the order of the summation

$\sum_{i=1}^{n} W_{t_i}\,\delta t = \sum_{i=1}^{n} \sum_{j=1}^{i} \delta W_j\,\delta t = \sum_{j=1}^{n} (n - j + 1)\,\delta W_j\,\delta t$ (3)

which is simply a weighted sum of independent Gaussians. To calculate the total variance, we sum the individual variances using the summation formula for $\sum_{k=1}^{n} k^2$

${\rm Var}\Big[\sum_{j=1}^{n} (n - j + 1)\,\delta W_j\,\delta t\Big] = \sum_{j=1}^{n} (n - j + 1)^2\,\delta t^2 \cdot \delta t = {n(n+1)(2n+1) \over 6}\,\delta t^3 \;\longrightarrow\; {T^3 \over 3}$ (4)

in the limit $n \to \infty$ with $\delta t = T/n$,

which is the solution.

2) Stochastic integration by parts

The stochastic version of integration by parts for potentially stochastic variables $X_t$, $Y_t$ looks like this:

$d(X_t Y_t) = X_t\,dY_t + Y_t\,dX_t + dX_t\,dY_t$

Re-arranging this gives

$X_T Y_T - X_0 Y_0 = \int_0^T X_t\,dY_t + \int_0^T Y_t\,dX_t + \int_0^T dX_t\,dY_t$

Now setting $X_t = t$ and $Y_t = W_t$ (so that the cross term $dt\,dW_t$ vanishes) we have

$\int_0^T W_t\,dt = T\,W_T - \int_0^T t\,dW_t = \int_0^T (T - t)\,dW_t$ (5)

We recognise this as a weighted sum of independent Gaussian increments, which is (as expected) a Gaussian variable with expectation 0 and a variance that we can calculate with the Itô isometry

${\mathbb E}\Big[\Big(\int_0^T (T - t)\,dW_t\Big)^2\Big] = \int_0^T (T - t)^2\,dt = {T^3 \over 3}$ (6)

which is the solution.
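As a sanity check on the result, here is a short Monte Carlo sketch of my own (not part of the original derivation; the path and step counts are arbitrary choices) that estimates the variance of a discretised $\int_0^T W_t\,dt$ and compares it with $T^3/3$:

```python
import random

def integrated_bm_sample(n_steps, T):
    """One Riemann-sum sample of int_0^T W_t dt along a simulated Wiener path."""
    dt = T / n_steps
    w, integral = 0.0, 0.0
    for _ in range(n_steps):
        w += random.gauss(0.0, dt ** 0.5)  # independent increment with variance dt
        integral += w * dt                 # accumulate W_t * dt
    return integral

def estimate_variance(n_paths=10000, n_steps=100, T=1.0):
    """Sample variance of the discretised integral over many paths."""
    samples = [integrated_bm_sample(n_steps, T) for _ in range(n_paths)]
    mean = sum(samples) / n_paths
    return sum((s - mean) ** 2 for s in samples) / n_paths

# With T = 1, the estimate should be close to 1/3.
```

The small positive bias of the Riemann sum (its exact variance is $\delta t^3\,n(n+1)(2n+1)/6$) shrinks as the step count grows.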

## Hedging in a finite-state model (Binary Trees)

Today’s post is about hedging in a world where there are only a few outcomes. It demonstrates a lot of principles that are important in asset pricing, and it’s also a frequent topic for interview questions, so it almost got included as one of those!

A very basic approximation for a stock is a binary tree, as in the diagram shown below. The stock is worth 100 today, and after some time (say a year) it will be worth either 110 with probability p or 90 with probability (1-p). In this very basic model, we assume no other values are possible – the stock either does well and goes up by 10, or it does badly and goes down by 10. Furthermore, we assume that there is a risk-free yearly interest rate r at which we can deposit money in the bank and receive (1+r) times our initial deposit in a year.

We’re going to try and price options in this model. Let’s say we’ve got a call option to buy the stock in a year’s time for 100. How much is this worth? Because of the simple state of the world, we can enumerate the two possibilities – either the stock is worth 110, in which case the call option to buy it for 100 is worth 10, or the stock is worth 90, and the option is worthless. We’ll try and simulate that payoff using a combination of cash-in-the-bank and the stock itself, and then by no-arbitrage the price must be the same today.

Our portfolio consists of $\alpha$ stocks $S$ and $\beta$ cash-in-the-bank. The value of the cash will certainly be $(1+r)\beta$ in a year, while the stock will be worth 110 or 90 depending on the state, and we want the sum of these two to equal the option payoff (in either the up or the down scenario), which gives us two simultaneous equations

$110\,\alpha + (1+r)\,\beta = 10$

$90\,\alpha + (1+r)\,\beta = 0$

so

$\alpha = 0.5\,; \qquad \beta = -{45 \over 1+r}$

This says that a portfolio containing half of a stock and minus 45 pounds in the bank will exactly replicate the payoff of the option in either world state. The value of this portfolio today is just 0.5*100 – 45/(1+r), and I’ve plotted that as a function of r below.
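The replication argument is easy to check numerically. This sketch is my own (the choice of r = 5% is just illustrative): it solves the two simultaneous equations for $\alpha$ and $\beta$ and prices the option by no-arbitrage.

```python
def replicate_binary(up, down, payoff_up, payoff_down, r):
    """Solve alpha*S_state + (1+r)*beta = payoff_state across the two states."""
    alpha = (payoff_up - payoff_down) / (up - down)  # stock holding
    beta = (payoff_up - alpha * up) / (1.0 + r)      # cash held today
    return alpha, beta

def price_option(spot, up, down, payoff_up, payoff_down, r):
    """Today's value of the replicating portfolio = the option price."""
    alpha, beta = replicate_binary(up, down, payoff_up, payoff_down, r)
    return alpha * spot + beta

# Call struck at 100 on the 110/90 tree with r = 5%:
# alpha = 0.5, beta = -45/1.05, so the price is 50 - 45/1.05.
```

Note that the probability p appears nowhere in the calculation, exactly as discussed below.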

This gives meaningless prices if r > 0.1 (the price of the option must be between 0 and 10/(1+r), as these are the discounted values of the least/most that it can pay out at expiry). What does this mean intuitively? Well, if r > 0.1 then we have an arbitrage: 100 in the bank will *always* yield more than the stock at expiry, so we should short the stock as much as possible and put the proceeds in the bank. If r < -0.1 the situation is exactly reversed.

The important point about all of this was that the option price DOESN’T depend on the relative chances of the stock increasing to 110 or falling to 90. As long as both of these are strictly greater than zero and less than one, then ANY set of probabilities will lead to the same price for the option. This is the discrete analogue of a result I presented before (here) – that the expected growth of the stock in the real world doesn’t matter to the option’s price; it is the risk-free rate that matters. Indeed, it is possible to derive the continuous result using a binary tree by letting the time period go to zero; I won’t attempt it here, but the derivation can be found in many textbooks (e.g. on Wikipedia).

Things really get interesting when we try to extend the model in a slightly different way. Often banks will be interested in models that have not just a diffusive component, but also a ‘jump’ component, which gives a finite chance that a price will fall suddenly by a large amount (I’ll present one such model in the near future). Unlike a straight-forward Black-Scholes model, because these jumps happen suddenly and unexpectedly they are very difficult to hedge, and are meant to represent market crashes that can result in sudden sharp losses for banks.

In our simple tree model, this can be incorporated by moving to a three-branch model, shown below

We have the same two branches as before, but an additional state now exists in which the stock price crashes to 50 with probability q. In this case, again trying to price an option to buy the stock for 100 gives three simultaneous equations

$110\,\alpha + (1+r)\,\beta = 10$

$90\,\alpha + (1+r)\,\beta = 0$

$50\,\alpha + (1+r)\,\beta = 0$

Unlike before, we can’t find a single alpha and beta that will replicate the payoff in all three world states, as we have three equations and only two unknowns. Consequently, the best we will be able to do is sub-replicate or super-replicate the payoff. That is, find portfolios that either always pay equal to or greater than the option, or always pay less than or equal to the option. These will give bounds on the lowest and highest arbitrage-free option prices, but that’s as far as arbitrage-free pricing will take us (in fact ANY price between these limits is arbitrage-free) – choosing an actual price will require knowing the probabilities p and q and deciding on our personal risk-premium.

In the simple three-state model, the lower bound comes from the sub-replicating portfolio found by ignoring the crash equation (it pays less than the option in the crash state), and the upper bound from the super-replicating portfolio found by ignoring the middle equation; they give the limits shown in the graph below. Once again, as r → 0.1 they converge, but note that a case where r < -0.1 is now possible, as the ‘crash’ outcome means that the stock can still pay out less than the bank account
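Here is a quick numeric sketch of my own of those bounds (state values and payoffs as in the text): dropping the crash equation gives a sub-replicating portfolio whose cost is the lower bound, and dropping the middle equation a super-replicating one whose cost is the upper bound.

```python
def replicate(s1, s2, v1, v2, r):
    """Two-state replication: alpha stocks plus beta cash matching payoffs v1, v2."""
    alpha = (v1 - v2) / (s1 - s2)
    beta = (v1 - alpha * s1) / (1.0 + r)
    return alpha, beta

def bounds(r, spot=100.0):
    """Sub-/super-replication bounds for the 100-strike call on the 110/90/50 tree."""
    a_lo, b_lo = replicate(110, 90, 10, 0, r)  # ignores crash state: pays -20 < 0 there
    a_hi, b_hi = replicate(110, 50, 10, 0, r)  # ignores middle state: pays > 0 there
    return a_lo * spot + b_lo, a_hi * spot + b_hi

# At r = 0 the bounds come out as 5 and 50/6 (about 8.33).
```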

This is what’s called an incomplete market. The securities that exist in the market aren’t sufficient to hedge us against all future possible states of the universe. In this case, we can’t uniquely determine prices by risk-neutral pricing – since we can’t hedge out all of the risk, risk preferences of investors will play a part in determining the market-clearing prices of securities. Most models that are used in the industry are incomplete in this way – I’ll be looking at some examples of this soon, which will include models involving jump processes and stochastic volatility.

## Interview Questions VI: The Drunkard’s Walk

A drunk man is stumbling home at night after closing time. Every step he takes moves him either 1 metre closer or 1 metre further away from his destination, with an equal chance of going in either direction (!). His home is 70 metres down the road, but unfortunately, there is a cliff 30 metres behind him at the other end of the street. What are the chances that he makes it to his house BEFORE tumbling over the edge of the cliff?

This is a fun question and quite heavy on conditional probability. We are trying to find the probability that the drunkard has moved 70 metres forward BEFORE he has ever moved 30 metres backward, or vice versa. There are several ways of attempting this, including some pretty neat martingale maths, but I’m going to attempt it here in the language of matrices and Markov chains.

Essentially, there are 101 states that the drunkard can be in, from over the cliff all the way down the road to right beside his front door. Let’s label these from 0 to 100, in terms of the number of metres away from the cliff, so that he starts in state 30. At each step, he transitions from his current state to either one state higher or one state lower with probability 50%, and the process continues until he reaches either the cliff or the door, at which point the process will cease in either a good or a very bad night’s rest. We call these two states, 0 and 100, ‘absorbers’, because the process stops at this point and no transitions to new states can happen. A Markov diagram that illustrates this process can be drawn like this:

We can characterise each step in the process by a transition matrix acting on a state vector. The drunkard initially has a state of 30 metres, so his state vector is a long string of zeroes with a single 1 at the 30th position:

$S_0 = \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix}$

This vector is probabilistic – a 1 indicates with 100% certainty that the drunkard is in the 30th state. However, with subsequent moves this probability will be spread over the nearby states as his position’s probability density diffuses into other states. The transition matrix multiplies the drunkard’s state vector to give his state vector after another step:

$P = \begin{pmatrix} 1 & 0.5 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 0.5 & \cdots & 0 & 0 \\ 0 & 0.5 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 0.5 & & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0.5 & 1 \end{pmatrix}; \quad S_{i+1} = P \cdot S_i$

So, after one step the drunkard’s state vector will have a 0.5 in the 29th and the 31st position and zeroes elsewhere, saying that he will be in either of these states with probability 0.5, and certainly nowhere else. Note that the 1’s at the top and bottom of the transition matrix will absorb any probability that arrives at those states for the rest of the process.

To keep things simple, let’s consider a much smaller problem with only six states, where state 1 is ‘down the cliff’ and state 6 is ‘home’; and we’ll come back to the larger problem at the end. We want to calculate the limit of the drunkard’s state as the transition matrix is applied a large number of times, ie. to calculate

$S_n = P^n \cdot S_0$

An efficient way to calculate powers of a large matrix is to first diagonalise it. We have $P = U A\, U^{-1}$, where $A$ is a diagonal matrix whose diagonal elements are the eigenvalues of $P$, and $U$ is the matrix whose columns are the eigenvectors of $P$. Note that, as $P$ is not symmetric, $U^{-1}$ will have row vectors that are the LEFT eigenvectors of $P$, which in general will be different from the right eigenvectors. It is easy to see how to raise $P$ to the power n

$P^n = (U A\, U^{-1})^n = U A^n\, U^{-1}$

since all of the $U$s and $U^{-1}$s in the middle cancel. Since $A$ is diagonal, to raise it to the power of n we simply raise each element (which are just the eigenvalues of $P$) to the power of n. To calculate the eigenvalues of $P$ we solve the characteristic equation

$|P - \lambda_a I| = 0$

with

$P = \begin{pmatrix} 1 & 0.5 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0.5 & 0 & 0 & 0 \\ 0 & 0.5 & 0 & 0.5 & 0 & 0 \\ 0 & 0 & 0.5 & 0 & 0.5 & 0 \\ 0 & 0 & 0 & 0.5 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0.5 & 1 \end{pmatrix}$

This gives six eigenvalues, two of which are one (these will turn out to correspond to the two absorbing states), while the remainder satisfy $|\lambda_a| < 1$. Consequently, when the diagonal matrix above is raised to the power n, all of the terms will vanish as n becomes large except the first and the last, whose eigenvalues are 1.

Calculating the eigenvectors is time consuming and we’d like to avoid it if possible. Luckily, in the limit that n gets large, we’ve seen that most of the eigenvalues raised to the power n will go to zero, which significantly reduces the amount we need to calculate. We have

$P^n \cdot S = \bigl( U A^n\, U^{-1} \bigr) \cdot S$

and in the limit that n gets large, $A^n$ is just a matrix of zeroes with a one in the upper-left and lower-right entries. At first sight it looks as though calculating $U^{-1}$ will be required, which is itself an eigenvector problem, but in fact we only have to calculate a single eigenvector – the first (or equivalently the last), which will give the probability of evolving from an initial state S to the cliff state (or the home state).

$U^{-1}$ is the matrix of left eigenvectors of $P$, each of which satisfies

$x_a \cdot P = \lambda_a x_a$

and we are looking for the eigenvectors corresponding to eigenvalue of 1, so we need to solve the matrix equation

$x \cdot \begin{pmatrix} 0 & 0.5 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0.5 & 0 & 0 & 0 \\ 0 & 0.5 & -1 & 0.5 & 0 & 0 \\ 0 & 0 & 0.5 & -1 & 0.5 & 0 \\ 0 & 0 & 0 & 0.5 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0.5 & 0 \end{pmatrix} = 0$

We know that if he starts in the first state (ie. over the cliff) he must finish there (trivially, after 0 steps) with probability 100%, so that $x_1 = 1$ and $x_6 = 0$. The solution to this is

$x_n = {6 - n \over 5}$

which is plotted here for each initial state

which says that the probability of ending up in state 1 (down the cliff!) falls linearly with starting distance from the cliff. We can scale up this final matrix equation to the original 101-state space, and find that for someone starting in state 30, there is a 70/100 chance of ending up down the cliff, and consequently only a 30% chance of getting home!
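The absorption probabilities can also be found directly by solving the linear system $h_i = \tfrac12 (h_{i-1} + h_{i+1})$ with $h$ pinned at the two absorbers. This sketch of mine (not from the original post) solves the resulting tridiagonal system with the Thomas algorithm:

```python
def prob_home(n_states, start):
    """P(absorbed at the top absorber) for a symmetric walk on states 0..n_states-1,
    with absorbers at both ends, via the Thomas (tridiagonal) algorithm."""
    n = n_states - 2                  # interior unknowns h_1 .. h_n
    # Equations: h_i - 0.5*h_{i-1} - 0.5*h_{i+1} = 0, with h_0 = 0, h_top = 1.
    b = [1.0] * n                     # diagonal entries
    rhs = [0.0] * n
    rhs[-1] = 0.5                     # from the h_top = 1 boundary condition
    # Forward elimination (sub- and super-diagonals are both -0.5).
    for i in range(1, n):
        m = -0.5 / b[i - 1]
        b[i] -= m * (-0.5)
        rhs[i] -= m * rhs[i - 1]
    # Back substitution.
    h = [0.0] * n
    h[-1] = rhs[-1] / b[-1]
    for i in range(n - 2, -1, -1):
        h[i] = (rhs[i] + 0.5 * h[i + 1]) / b[i]
    return h[start - 1]

# 101 states (0..100): starting 30 metres from the cliff gives a 0.3 chance of home.
```

The solution is linear in the starting state, matching $x_n$ above.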

This problem is basically the same as the Gambler’s Ruin problem, where a gambler in a casino stakes $1 on each toss of a coin and leaves after reaching a goal of $N or when broke. There are some very neat ways of solving it via martingale methods that don’t use the mechanics above, which I’ll look at in a future post.

## Interview Questions V: The Lowest Unique Positive Integer Game

Today’s post will be quite an extended one, about an interesting game theory problem I’ve been thinking about recently. The game is a lottery that works as follows. Each of N players chooses a positive integer, and the player who picks the lowest integer that no-one else has selected wins. There are two variants of the game discussed in the literature, one in which the integers must be between 1 and N, and one in which they have no limit. Except in the specific case of only 3 players, this restriction seems to make no difference to the solution […unless I’ve got my algebra wrong, which is by no means impossible!].

Note that throughout this post, I’ll assume a basic familiarity with probability and knowledge of the formulae for infinite sums (given here, for example), particularly the sum to infinity of a geometric progression:

$\sum_{n=0}^{\infty} a r^n = {a \over 1 - r}\,, \qquad |r| < 1$

3-Player Case

I start with the N=3 case to illustrate the principles of the problem. Because going in to the problem we have exactly the same information as the other two players, there is no deterministic solution, and in order to maximise our chances of victory we need to adopt a ‘mixed strategy’ which involves selecting a random distribution that assigns a probability to the selection of each integer. This is explained a bit further at this website, although unfortunately the solution that they present is incorrect!

In the case that players are limited to selecting 1-N (so 1, 2 or 3), the problem is analysed in a paper on arXiv (although there’s also a mistake in this paper – this time the N=4 solution presented is incorrect, as I’ll show later), and my solution will follow the author’s.

We are trying to select a set of probabilities $\{p_1, p_2, p_3\}$ such that when all players play this strategy, no player increases her chances of victory by adjusting her strategy (such a solution is called a Nash Equilibrium in game theory, as seen in A Beautiful Mind). Let’s assume player 2 and player 3 are both using these probabilities, and player 1 tries to increase her chance of victory by playing instead the strategy $\{q_1, q_2, q_3\}$.

If she chooses 1, she wins if the other players both play either 2 or 3; if she chooses 2 she wins only if the other players both play 1 or both play 3, and if she chooses 3 she wins only if both other players play 1 or both play 2. So, we can express her total probability of victory as the sum of these three terms

$P_{\rm win} = q_1\,(p_2 + p_3)^2 + q_2\,(p_1^2 + p_3^2) + q_3\,(p_1^2 + p_2^2)$

And subject to the normalisation condition

$q_1 + q_2 + q_3 = 1$

Maximising the victory probability is a natural problem for the method of Lagrange Multipliers. The Lagrange equation

$\nabla_q \Big[ P_{\rm win} - \lambda \Big( \sum_i q_i - 1 \Big) \Big] = 0$

gives us the following three equations

$(p_2 + p_3)^2 = \lambda\,; \qquad p_1^2 + p_3^2 = \lambda\,; \qquad p_1^2 + p_2^2 = \lambda$

And as in the paper above, we can solve these simultaneously along with the normalisation condition to give the probabilities

$p_1 = 2\sqrt{3} - 3 \approx 0.4641\,; \qquad p_2 = p_3 = 2 - \sqrt{3} \approx 0.2679$

Her chances of victory are now 0.28719, regardless of her strategy – the equations above say that as long as two players play this solution, the other player can’t change her probability of winning by choosing ANY strategy. For example, if she switches her strategy to ALWAYS choosing 1, she will now win whenever the other two players both play either 2 or 3, the chance of which is given by $(p_2 + p_3)^2 = 28 - 16\sqrt{3}$ – and this is still 0.28719.
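The indifference property is easy to check by simulation. Here is my own sketch (the round count is an arbitrary choice), pitting a deviant who always bids 1 against two players using the equilibrium mix:

```python
import math
import random

EQ = [2 * math.sqrt(3) - 3, 2 - math.sqrt(3), 2 - math.sqrt(3)]  # p1, p2, p3

def winner(bids):
    """Index of the owner of the lowest unique bid, or None if no-one is unique."""
    unique = [b for b in bids if bids.count(b) == 1]
    return bids.index(min(unique)) if unique else None

def deviant_win_rate(n_rounds, seed=1):
    """Win frequency of a player who always bids 1 against two equilibrium players."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_rounds):
        bids = [1] + rng.choices([1, 2, 3], weights=EQ, k=2)
        if winner(bids) == 0:
            wins += 1
    return wins / n_rounds

# Should come out close to 28 - 16*sqrt(3), i.e. about 0.28719.
```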

Now what if we relax the restriction that bids be between 1 and 3?

We can see immediately that the solution presented in the first link above, $p_n = 2^{-n}$, is incorrect – in this case, if the third player always plays 1, she wins ¼ of the time (when both other players play more than 1), while if she always plays 1000, she wins almost a third of the time (whenever the other two players play the same number). But we said her victory probability should be independent of strategy at the Nash Equilibrium, so this can’t be correct. It is along the right lines, though – the actual solution does decay rapidly as x increases.

Her chance of victory is now

$P_{\rm win} = \sum_{x=1}^{\infty} q_x \bigg[ \Big( \sum_{y=x+1}^{\infty} p_y \Big)^2 + \sum_{y=1}^{x-1} p_y^2 \bigg]$

And differentiating this with respect to each $q_x$ in turn gives an infinite set of coupled equations

$\Big( \sum_{y=x+1}^{\infty} p_y \Big)^2 + \sum_{y=1}^{x-1} p_y^2 = \lambda \qquad \forall\, x$

This turns out to be solvable, but difficult – a sketch of the solution is given at the bottom of the post. The solution is $p_n = c\,\alpha^n$, where $c$ is given by the normalisation condition

$c = {1 - \alpha \over \alpha}$

and we can use a trick to find the parameter $\alpha$. We’ve said that the third player’s odds of victory are independent of her strategy. Consider two strategies, one in which she plays the same as the other players, and another in which she always plays the same, very large number. In the first case, chances of victory are equal for all three players, so they must be equal to 1/3 times one minus the chance of NO-ONE winning (which happens when they all choose the same number)

$P_{\rm win} = {1 \over 3} \Big( 1 - \sum_{n=1}^{\infty} p_n^3 \Big)$

And in the second case, she wins whenever the other two players play the same number

$P_{\rm win} = \sum_{n=1}^{\infty} p_n^2$

We can combine these and substitute in the solution $p_n = c\,\alpha^n$ with $c = (1-\alpha)/\alpha$, and solve for $\alpha$:

${1 \over 3} \Big( 1 - c^3 \sum_{n=1}^{\infty} \alpha^{3n} \Big) = c^2 \sum_{n=1}^{\infty} \alpha^{2n}$

and using the summation formula

${1 \over 3} \Big( 1 - {c^3\,\alpha^3 \over 1 - \alpha^3} \Big) = {c^2\,\alpha^2 \over 1 - \alpha^2}$

Expanding out and cancelling terms where possible, then taking out factors:

${\alpha \over 1 + \alpha + \alpha^2} = {1 - \alpha \over 1 + \alpha}$

$\alpha^3 + \alpha^2 + \alpha - 1 = 0$ (1)

So $\alpha$ is the real root of the equation $\alpha^3 + \alpha^2 + \alpha - 1 = 0$, which is given by $\alpha \approx 0.5437$. A quick Wikipedia search reveals that this is the reciprocal of something called the tribonacci constant, vital in the construction of the Snub Cube. I don’t know if this has some deeper resonance, but I’d love to hear from anyone who thinks it does! So, the full symmetric Nash Equilibrium in the 3-player, no-upper-bid-limit case is

$p_n = {1 - \alpha \over \alpha}\,\alpha^n$

with $\alpha \approx 0.5437$. Note that the (incorrect) form given in the linked-to website can also be expressed in the same form, but with $\alpha = 0.5$.
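A short numeric check of my own: bisect for the real root of $\alpha^3 + \alpha^2 + \alpha - 1 = 0$, then confirm that the geometric strategy makes the Lagrange quantity $(\sum_{y>x} p_y)^2 + \sum_{y<x} p_y^2$ constant in x, as the equilibrium requires (the truncation at 60 terms is an arbitrary choice):

```python
def tribonacci_reciprocal(tol=1e-12):
    """Real root of a^3 + a^2 + a - 1 = 0 by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid ** 3 + mid ** 2 + mid - 1 < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lagrange_condition(alpha, n_terms=60):
    """(sum_{y>x} p_y)^2 + sum_{y<x} p_y^2 for x = 1..10, truncated at n_terms."""
    p = [(1 - alpha) * alpha ** (n - 1) for n in range(1, n_terms + 1)]
    vals = []
    for x in range(1, 11):
        tail = sum(p[x:])                       # sum over y > x
        head_sq = sum(q * q for q in p[: x - 1])  # sum of p_y^2 over y < x
        vals.append(tail ** 2 + head_sq)
    return vals

# alpha is about 0.543689; the condition values all equal (1-alpha)/(1+alpha).
```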

4-Player Case

In the case of 4 players and above, the maximisation equation is

$P_{\rm win} = \sum_x q_x \bigg[ \Big( \sum_{y>x} p_y \Big)^3 + 3 \sum_{y<x} p_y^2 \sum_{z>x} p_z + \sum_{y<x} p_y^3 \bigg]$

where the new term in the middle of the expression comes from the victory scenario in which two players both choose the same lower integer than our player and so are not unique, while the third player chooses one that is larger. This is the term that is missing in the paper linked to above that leads to the incorrect solution.

In the case that UL = 4, we have four equations and the normalisation condition to solve. The equations are cubic, so the general solution of the cubic is required. It’s uncommon enough for me to have had to refer to Wikipedia; the solution can be found here – we are saved some hassle because the relevant cubics don’t have a quadratic term.

Once again, a sketch of the solution in the unrestricted scenario is presented at the bottom of the post, but it seems to make very little difference whether the upper limit UL is set to 4 or infinity. In both cases I find essentially the same set of probabilities.

Why does the upper limit no longer make any difference? We only want to choose a higher number if there is a significant chance of the lower numbers being ‘crowded out’ by the other players. That is, we sometimes choose 4 because there is a small chance that the other three players will all select 1, 2 or 3. Similarly, we would only choose 5 if there was a significant chance that 4 wasn’t high enough, because there was also a significant chance of the other three all playing 4 – but looking at the values, we see that the probability of choosing 4 is already very small, and the probability of all three others playing it is truly tiny. Consequently, in the general case $p_x$ for larger $x$ is going to be negligibly small.

N-player Case

The problem rapidly becomes hard to compute for large N, but there is a body of literature on the subject; in particular this 2010 paper points out the mistake in the paper I linked to before, and presents a general solution for N players. Note from their graphs that the probability of choosing a number greater than N is essentially zero, so whether or not we impose this restriction doesn’t change the equilibrium materially.

Derivations

I hope you’ve enjoyed my take on this problem, the sketches of my method of solution for the 3- and 4-player cases are given below but can probably be skipped on a first reading.

In the three-player case, the Lagrange equations are

$\Big( \sum_{y>i} p_y \Big)^2 + \sum_{y<i} p_y^2 = \lambda$

For i=1, and using normalisation, this gives

$\lambda = \Big( \sum_{y>1} p_y \Big)^2 = (1 - p_1)^2$

Which means for i greater than 1 we have

$\Big( \sum_{y>i} p_y \Big)^2 + \sum_{y<i} p_y^2 = (1 - p_1)^2$

and with a little re-arrangement (subtracting the equations for successive values of i, and using $\sum_{y>i} p_y - \sum_{y>i+1} p_y = p_{i+1}$)

$p_{i+1} = {p_i^2 \over \sum_{y>i} p_y + \sum_{y>i+1} p_y}$

This gives a ‘step-up’ formula for the construction of later terms from the sum of earlier terms. An engineering-style solution is to write a script (or an Excel spreadsheet) that calculates $p_2$ to $p_N$ from an initial guess of $p_1$, and then to vary the initial guess to satisfy the normalisation condition. Alternatively, we can show by induction that the expression above is satisfied by our solution $p_n = c\,\alpha^n$. First of all, assuming the relationship holds for $i$, let’s see what it implies for $i+1$ (for notational simplicity, I’m going to write $\sum_{y>i} p_y$ below as $S_i$)

${p_{i+1} \over p_i} = {p_i \over S_i + S_{i+1}}$

and remembering that from our normalisation condition $c = (1-\alpha)/\alpha$, so that $S_i = \sum_{y>i} c\,\alpha^y = {c\,\alpha^{i+1} \over 1-\alpha} = \alpha^i$, we have

${p_{i+1} \over p_i} = {c\,\alpha^i \over \alpha^i + \alpha^{i+1}} = {1 - \alpha \over \alpha\,(1 + \alpha)} = \alpha$

where the last step uses $\alpha^3 + \alpha^2 + \alpha = 1$,

so we see that the ratio demonstrated above satisfies the step-up equation for every $i > 1$. The first term is a special case, so I show this separately: from the equations for i=1 and i=2,

$p_2\,(S_1 + S_2) = p_1^2$

And now setting $p_1 = c\,\alpha = 1 - \alpha$, $S_1 = \alpha$ and $S_2 = \alpha^2$:

$p_2 = {(1-\alpha)^2 \over \alpha + \alpha^2} = (1-\alpha)\,\alpha = c\,\alpha^2$

so we see that the first term is satisfied by the expression.

A similar expression for 4 players is given by the solution to

$\Big( \sum_{y>i} p_y \Big)^3 + 3 \sum_{y<i} p_y^2 \sum_{z>i} p_z + \sum_{y<i} p_y^3 = \lambda$

which gives a step-up relation analogous to the one above. I’ve not found an analytical solution to this yet, and had to proceed by the variational technique discussed above, giving the answers in the body of the post. If you can solve this exactly for me I’d be very interested to hear!

## Interview Questions IV

Another question about correlations today. This time I thought we could have a look at a simple type of random walk, in which each step moves either backwards or forwards by a random length, and at how to deal with the issues that come up.

Let’s say at each step we move a distance along the x-axis that is distributed randomly and uniformly between -1 and +1, and importantly that each step is independent of the others. So, after N steps the total distance travelled, L, is

$L_N = \sum_{i=0}^{N} x_i\,; \qquad x_i \sim {\mathbb U}[-1,+1]$

where $x_i$ is the i-th step length.

Calculate:

i) the expected distance travelled after N steps

ii) the standard deviation of the distance travelled after N steps

iii) the autocorrelation between the distance travelled at N steps and the distance travelled at N+n steps

Since we’re dealing with uniform variables, it makes sense to start by calculating the expectation and variance of a single realisation of a variable of this type. The expectation is trivially 0, while the variance is

$\begin{align*} {\rm Var}[x_i] & = {\mathbb E}[x_i^2] - {\mathbb E}[x_i]^2 \\ & = \int_{-1}^{+1} x^2\,{dx \over 2} - 0 \\ & = {1 \over 3} \end{align*}$

where the factor of ${1 \over 2}$ is the density of a ${\mathbb U}[-1,+1]$ variate.

We’ll also make use of the independence of the individual variables at several points; recall that for independent variables x and y, ${\mathbb E}[xy] = {\mathbb E}[x]\,{\mathbb E}[y]$

i) This one is fairly straight-forward. Expectation is a linear operator, so we can take it inside the sum. We know the expectation of an individual variate, so the expectation of the sum is just the product of these

$\begin{align*} {\mathbb E}\Big[\sum_{i=0}^N x_i \Big] & = \sum_{i=0}^N {\mathbb E}[x_i] \\ & = N \cdot 0 \\ & = 0 \end{align*}$

ii) The standard deviation is the square root of the variance, which is the expectation of the square minus the square of the expectation. We know the second of these is 0, so we only need to calculate the first,

$\begin{align*} {\rm Var}\Big[\sum_{i=0}^N x_i \Big] & = {\mathbb E}\Big[\Big(\sum_{i=0}^N x_i \Big)^2\Big] \\ & = {\mathbb E}\Big[\sum_{i,j=0}^N x_i x_j\Big] \\ & = \sum_{i,j=0}^N {\mathbb E}[x_i x_j] \end{align*}$

There are two types of term here. When i and j are not equal, we can use the independence criterion given above to express this as the product of the two individual expectations, which are both 0, so these terms don’t contribute. So we are left with

$\begin{align*} {\rm Var}\Big[\sum_{i=0}^N x_i \Big] & = \sum_{i=0}^N {\mathbb E}[x_i^2] \\ & = {N \over 3} \end{align*}$

and the standard deviation is simply the square root of this.

iii) This is where things get more interesting – the autocorrelation is the correlation of the sum at one time with its value at a later time. This is a quantity that quants are frequently interested in, since the value of a derivative that depends on values of an underlying stock at several times will depend sensitively on the autocorrelation. We recall the expression for correlation

$\rho(x,y) = {{\rm Cov}(x,y) \over \sqrt{{\rm Var}(x)\,{\rm Var}(y)}}$

So we are trying to calculate

$\rho(L_N, L_{N+n}) = {\rm Cov}\Big( \sum_{i=0}^N x_i\,,\ \sum_{j=0}^{N+n} x_j \Big) \cdot {3 \over \sqrt{N(N+n)}}$

where I’ve substituted in the already-calculated value of the variances of the two sums.

We can again use the independence property of the steps to separate the later sum into two, the earlier sum and the sum of the additional terms. Also, since the expectation of each sum is zero, the covariance of the sums is just the expectation of their product

$\begin{align*} \rho(L_N, L_{N+n}) & = {\rm Cov}\Big( \sum_{i=0}^N x_i\,,\ \sum_{j=0}^{N} x_j + \sum_{j=N+1}^{N+n} x_j \Big) \cdot {3 \over \sqrt{N(N+n)}} \\ & = {\mathbb E}\Big[\sum_{i=0}^N x_i \cdot \Big(\sum_{j=0}^{N} x_j + \sum_{j=N+1}^{N+n} x_j \Big) \Big] \cdot {3 \over \sqrt{N(N+n)}} \\ & = {\mathbb E}\Big[\sum_{i,j=0}^N x_i x_j + \sum_{i=0}^N x_i \cdot \sum_{j=N+1}^{N+n} x_j \Big] \cdot {3 \over \sqrt{N(N+n)}} \end{align*}$

and using the results above and the independence of the final two sums (because they are the sums of disjoint sets of terms, and each term is independent of all the others) we know

${\mathbb E}\Big[\sum_{i,j=0}^N x_i x_j \Big] = {N \over 3}$

${\mathbb E}\Big[\sum_{i=0}^N x_i \cdot \sum_{j=N+1}^{N+n} x_j \Big] = {\mathbb E}\Big[\sum_{i=0}^N x_i \Big] \cdot {\mathbb E}\Big[\sum_{j=N+1}^{N+n} x_j \Big] = 0$

so

$\begin{align*} \rho(L_N, L_{N+n}) & = {N \over \sqrt{N(N+n)}} \\ & = \sqrt{N \over N+n} \end{align*}$

What does this tell us? Roughly that the sum of the sequence up to N+n terms is correlated to its value at earlier points, but as n gets larger the correlation decreases, as the new random steps blur out the position due to the initial N steps.
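Before reaching for Excel, here is a quick simulation sketch of my own (path and step counts are arbitrary choices) that estimates $\rho(L_N, L_{N+n})$ and compares it to $\sqrt{N/(N+n)}$:

```python
import random

def walk_correlation(N, n, n_paths=20000, seed=7):
    """Sample correlation between L_N and L_{N+n} over many uniform random walks."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_paths):
        total = 0.0
        l_n = 0.0
        for step in range(1, N + n + 1):
            total += rng.uniform(-1.0, 1.0)
            if step == N:
                l_n = total  # record the partial sum after N steps
        pairs.append((l_n, total))
    mx = sum(a for a, _ in pairs) / n_paths
    my = sum(b for _, b in pairs) / n_paths
    cov = sum((a - mx) * (b - my) for a, b in pairs) / n_paths
    vx = sum((a - mx) ** 2 for a, _ in pairs) / n_paths
    vy = sum((b - my) ** 2 for _, b in pairs) / n_paths
    return cov / (vx * vy) ** 0.5

# For N = 100, n = 100 the theoretical value is sqrt(1/2), about 0.707.
```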

We can test our expressions using the RAND() function in excel. Try plotting a sequence of sets of random numbers and summing them, and then plotting the set of sums of 100 terms against the set of sums of 120 or 200 terms (nb. in excel, you probably want to turn auto-calculate off first to stop the randoms from refreshing every time you make a change – instructions can be found here for Excel 2010; for Excel 2013 I found the option inside the “FORMULAS” tab and at the far end – set the ‘Calculation Options’ to manual). I’ve done exactly that, and you can see the results below.

You can also try calculating the correlation of the variables you generate using Excel’s CORREL() function – these should tend towards the expression above as the number of sums that you compute gets large (if you press F9, all of the random numbers in your sheet will be recomputed and you can see the actual correlation jump around, but these jumps will be smaller as the number of sums gets larger).

## Interview Questions III

Today’s question will test some of the statistics and correlation I’ve discussed in the last couple of months. Assume throughout that $X$ and $Y$ are jointly normally distributed standard normal variates with correlation $\rho$, so that

$X,\, Y \sim {\mathcal N}(0,1), \qquad {\rm Corr}(X, Y) = \rho$

a) Calculate ${\mathbb E}\big[\,e^X\,\big]$
b) Calculate ${\mathbb E}\big[\,e^X \,|\, Y = y\,\big]$

The first expectation is of a lognormal variate, and the second is of a lognormal variate conditional on some earlier value of the variate having been a particular value – these are very typical of the sorts of quantities that a quant deals with every day, so the solution will be quite instructive! Before reading the solution have a go at each one, the following posts may be useful: SDEs pt. 1, SDEs pt. 2, Results for Common Distributions

a) Here, we use the standard result for expectations of lognormal variates: if $X \sim {\mathcal N}(\mu, \sigma^2)$, then

${\mathbb E}\big[\,e^X\,\big] = e^{\mu + {1\over 2}\sigma^2}$

and since $X$ is a standard normal, ${\mathbb E}[e^X] = e^{1\over 2}$.
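This is easy to sanity-check numerically. A quick Monte Carlo sketch, assuming numpy; the sample size is an arbitrary choice:

```python
import numpy as np

# Check E[e^X] = e^{1/2} for X ~ N(0,1) by straightforward simulation.
rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)

mc_estimate = np.exp(x).mean()
exact = np.exp(0.5)           # about 1.6487

print(mc_estimate, exact)
```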

b) This one is a little tougher, so first of all I’ll discuss what it means and some possible plans of attack. We want to calculate the expectation of $e^X$, given that $Y$ takes the value $y$. Of course, if $X$ and $Y$ were independent, the conditioning would make no difference and the result would be the same as in a). However, because they are correlated, the realised value of $Y$ has an effect on the distribution of $X$.

To demonstrate this, I’ve plotted a few scatter-graphs illustrating the effect of conditioning on $Y$ on the distribution of $X$, first with $X$ and $Y$ uncorrelated and then with them becoming increasingly correlated.

The simplest way of attempting this calculation is to use the result for jointly normal variates given in an earlier post, which says that if $X$ and $Y$ have correlation $\rho$, we can express $X$ in terms of $Y$ and a new variate $Z$ which is uncorrelated with $Y$:

$X = \rho Y + \sqrt{1 - \rho^2}\, Z$

so

${\mathbb E}\big[\,e^X \,|\, Y=y\,\big] = {\mathbb E}\big[\,e^{\rho Y + \sqrt{1-\rho^2}Z} \,|\, Y=y\,\big] = e^{\rho y}\, {\mathbb E}\big[\,e^{\sqrt{1-\rho^2}Z}\,\big]$

Since the value of $Y$ is already determined (ie. $Y = y$), I’ve separated this term out, and the only thing left to calculate is the expectation of the second term in $Z$. Since $Y$ and $Z$ are independent, we can calculate the expectation of $e^{\sqrt{1-\rho^2}Z}$ directly, which is the same process as before but featuring slightly more complicated pre-factors:

${\mathbb E}\big[\,e^X \,|\, Y=y\,\big] = e^{\rho y}\cdot e^{{1\over 2}(1-\rho^2)}$

We can check the limiting values of this – if $\rho = 0$ then $X$ and $Y$ are independent [this is not a general result by the way – see wikipedia for example – but it IS true for jointly normally distributed variables], and in this case ${\mathbb E}[e^X | Y=y] = e^{1\over 2}$ just as above. If $\rho = 1$, ${\mathbb E}[e^X | Y=y] = e^y$, which also makes sense since in this case $X = Y$, so $y$ fully determines the expectation of $e^X$.

The more general way to solve this is to use the full 2D joint normal distribution as given in the previous post mentioned before,

$f_{X,Y}(x,y) = {1 \over 2\pi \sqrt{1-\rho^2}}\, \exp\Big( -{x^2 - 2\rho x y + y^2 \over 2(1-\rho^2)} \Big)$

This is the joint probability density of $X$ and $Y$, but it’s not quite what we need – the expectation we are trying to calculate is

${\mathbb E}\big[\,e^X \,|\, Y=y\,\big] = \int_{-\infty}^{\infty} e^x \, f_{X|Y}(x\,|\,y)\, dx$

So we need the conditional density of $X$ given $Y = y$, for which we use Bayes’ theorem:

$f_{X|Y}(x\,|\,y) = {f_{X,Y}(x,y) \over f_Y(y)}$

Putting this together, we have

${\mathbb E}\big[\,e^X \,|\, Y=y\,\big] = \int_{-\infty}^{\infty} e^x \,{f_{X,Y}(x,y) \over f_Y(y)}\, dx$

This integral is left as an exercise to the reader, but it is very similar to those given above and should give the same answer as the previous expression for ${\mathbb E}[e^X | Y=y]$ after some simplification!
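We can also verify the closed form numerically, by integrating $e^x$ against the conditional density $X\,|\,Y=y \sim {\mathcal N}(\rho y,\, 1-\rho^2)$ directly. A sketch assuming numpy; $\rho = 0.5$ and $y = 1$ are illustrative choices:

```python
import numpy as np

# Integrate exp(x) against the conditional density of X | Y = y,
# which is N(rho*y, 1 - rho^2), and compare to the closed form
# exp(rho*y + (1 - rho^2)/2).
rho, y = 0.5, 1.0
mu, var = rho * y, 1.0 - rho**2

dx = 1e-3
x = np.arange(-10.0, 10.0, dx)
pdf = np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

numeric = np.sum(np.exp(x) * pdf) * dx     # simple Riemann sum
closed_form = np.exp(rho * y + 0.5 * (1.0 - rho**2))

print(numeric, closed_form)   # both about e^0.875 = 2.3989
```

The grid limits and spacing are overkill here – the integrand is essentially zero well inside $[-10, 10]$ – but they make the truncation error negligible.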

## Interview Questions II

A friend visited last weekend and after several beers challenged me to a numbers game. I liked it and thought it would make a good post – apparently he read it in a book of interview questions so there’s a good excuse for me to put it on here! The variant of the game we played went as follows, although there is some scope for changing the various numbers.

The first player says any number between 1 and 10. The second player then follows with a larger number, but it can be no more than 10 larger. So, if player one had said 4, player two could say any number between 5 and 15. Play continues in turn until someone gets to exactly 50, at which point they are the winner and play finishes.

There is actually a simple rule to play by to ensure victory, but the best way to proceed is to play a few games with someone first to get the feel of play. The solution is below, along with the rest of this post.

.

.

.

.

Solution: The winner is the first person to say 50. But, after playing a few times you’ll rapidly notice that another good number to arrive at is 39. If I say 39, the other player has to say a number between 40 and 49, any of which allow me to say 50 on the next move and win; so 39 is a winning number as well. But of course, I can ensure that I am the person who says 39 if I am also the player who says 28, by the same logic. I can ensure that I say 28 if I also say 17, and 6. So, if I can be the player that says 6, I can be certain of victory – and since this is possible on the first move, I can always win if I go first. Is this always the case? Have a think about the case where the victor is the first person to say 55, instead of 50!
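The backwards reasoning above is easy to mechanise. Here is a short sketch (the function name is my own) that works back from the target to find every ‘winning’ number to say:

```python
def winning_numbers(target, max_step):
    # A number s is winning to say if it is the target itself, or if
    # every possible reply r = s+1 .. s+max_step still lets us say
    # another winning number on our next turn (backward induction).
    winning = set()
    for s in range(target, 0, -1):
        if s == target or all(
            any(r + k in winning for k in range(1, max_step + 1))
            for r in range(s + 1, min(s + max_step, target) + 1)
        ):
            winning.add(s)
    return sorted(winning)

print(winning_numbers(50, 10))   # [6, 17, 28, 39, 50]
print(winning_numbers(55, 10))   # [11, 22, 33, 44, 55]
```

For a target of 55, the smallest winning number is 11, which is out of reach of the first move (1 to 10) – so whatever player one says, player two can claim 11 and win.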

This backwards-propagation of the result is a lot like option pricing (the link seems slightly tenuous, but they ‘feel’ rather similar to me). With an option, we know what the payoff will look like as a function of spot at expiry, but we don’t know what it looks like before then – the whole challenge is to propagate the final payoff backwards in whatever model we are using to the current date, which can be done using backwards pdes or expectations.

As an extra, if you are interested in a slightly more complicated variant, try the following game (again for two people). Arrange 15 matchsticks in rows to form a pyramid as follows:

I
I I
I I I
I I I I
I I I I I

On each turn, remove some matchsticks from a single row. You can remove as few or as many as you like, but they have to come from the same row. The winner is the person who removes the last match/matches.

[The game has been thoroughly solved, it’s called Nim and a quick internet search should tell you everything you need to know, but I strongly recommend having a play with a friend first to try and get a feel for how it works!]
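For completeness, the classical theory of Nim says the player to move wins exactly when the XOR (the ‘nim-sum’) of the row sizes is non-zero, and a winning move is one that restores a zero nim-sum. A minimal sketch, with a function name of my own choosing:

```python
from functools import reduce
from operator import xor

def nim_winning_moves(rows):
    # The mover wins iff the nim-sum (XOR of row sizes) is non-zero;
    # each winning move shrinks one row so the nim-sum becomes zero.
    s = reduce(xor, rows, 0)
    moves = []
    if s:
        for i, r in enumerate(rows):
            target = r ^ s
            if target < r:
                moves.append((i, r - target))  # take (r - target) from row i
    return moves

# The 1-2-3-4-5 pyramid has nim-sum 1, so the first player wins,
# e.g. by taking a single match from the row of 1, 3 or 5.
print(nim_winning_moves([1, 2, 3, 4, 5]))
```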

## Interview Questions I

I thought it would be fun to have a section looking at some of the interview questions that I’ve faced, or heard of, during my time as a quant, and to discuss some ways of approaching them. I’ll start with a fairly simple one that I got some time ago.

One of the traders at your bank has agreed to sell a product to a client, where a coin is tossed. If it’s a tails, the client is paid $1; if it’s a head, the coin is tossed again. If the second flip is a tails, the payoff is $2. Otherwise, we continue. On the third flip, if it’s a tails the payoff is $4, or $8 on the fourth flip, and so on, with the pot doubling each time. The trader asks you what price he should charge his client to play the game.

The first part of this problem is to calculate the expected payout of the game. Let’s say X is the number of heads flipped before the first tails comes up. It’s a discrete game, so we calculate the expectation by summing the payout, C(X), over the probability distribution p(X).

If the first flip is a tails, X=0 and C(X)=$1. If there’s one head then one tails, X=1 and C(X)=$2, and so on, so that

$C(X) = \$2^X$

The probability of getting tails on the first flip is 0.5, so p(X=0)=0.5. The probability of getting exactly one head and then one tails is 0.25, so p(X=1)=0.25, and so on, so p(X) is given by

$p(X) = \Bigl({1\over 2}\Bigr)^{X+1}$

To calculate the expectation of the payoff, we need to sum over all possible values of X. There are an infinite number of possibilities, as we could have any number of heads before the first tails, although of course the probability gets very small, so the higher X values shouldn’t contribute too much. Let’s have a look:

${\mathbb E}\big[C(X)\big] = \sum^\infty_{X=0} C(X)\,p(X)$

$= \sum^\infty_{X=0} 2^X \Bigl({1\over 2}\Bigr)^{X+1}$

$= \sum^\infty_{X=0} {1\over 2}$

What is this sum? It’s the value ${1\over 2}$ summed over all possible states of X – so each state adds the same amount to the expectation… but as we’ve already said, there are an infinite number of possible states of X, so the sum must be

$= \infty$

Oh dear – what does this mean?! Although the probability of getting a very large number of heads is very small, the payoff in these cases is correspondingly large, so although unlikely they still add a constant amount to the expectation. This is difficult – our expected loss from this game is infinite! How can we find a reasonable price to charge the client?

Probably the best way to deal with this is to notice that most of the weight of the expectation is coming from rather unlikely payoffs involving huge sums of money and very tiny possibilities. We could artificially cut these off beyond a certain point, but in fact there’s a very natural cutoff – the bank’s solvency. Beyond a certain point, we simply can’t pay any more because we don’t have it! Let’s say we’re a particularly well capitalised bank, and could afford to lose $100bn before collapsing (by way of comparison, the modest loss of$35bn by the RBS in 2008 was enough to topple that bank, at the time the World’s largest). The log base 2 of 100bn is about 36.5, so after 37 heads in a row we’re already bankrupt. In fact, the payout function looks something more like this:

$C(X) = \begin{cases} \$2^X & X < 37 \\ \$10^{11} & X \geq 37 \end{cases}$

Now, the states X=0 to X=36 each contribute ${1\over 2}$ to the expectation, but the contribution of the remaining states declines with the probability, since the payoff is capped at our capital reserves. The total contribution of the remaining states will then be

$\sum_{X=37}^\infty \Bigl({1\over 2}\Bigr)^{X+1} \cdot 10^{11}$

$= \Bigl({1\over 2}\Bigr)^{37} \cdot 10^{11}$

which is around $0.73. Adding this to the $0.50 contributed by each of the 37 uncapped states gives a price of about $19.23 – a more realistic consideration of our capital position, and much more useful than our initial price of infinity. Of course, the trader would be sensible to charge quite a bit more than this, both to make a profit but also to avoid giving away potentially market-sensitive information about our capital reserves!
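The capped expectation is easy to confirm numerically, using the normalised probabilities $p(X) = (1/2)^{X+1}$. A quick sketch; truncating the sum at X = 200 is an arbitrary choice, as the terms beyond it are negligible:

```python
# Expected payout of the capped coin game: payoff min(2^X, 10^11)
# with p(X) = (1/2)^(X+1) for X heads before the first tail.
cap = 1e11
ev = sum(min(2.0**X, cap) * 0.5 ** (X + 1) for X in range(200))

print(f"capped expectation: ${ev:.2f}")
```

The uncapped states contribute 37 × $0.50 = $18.50 and the capped tail about $0.73, in line with the hand calculation above.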