Briefly About Monte Carlo Simulation

Monte Carlo methods in the most basic form is used to approximate to a result aggregating repeated probabilistic experiments. For instance; to find the true probability of heads in a coin toss repeat the coin toss enough (e.g. 100 times) and calculate the probability by dividing number of heads to the total number of experiments. Here is a small example.

#Function of coin tossing with the given number of instances n_toss
toss_coins<-function(n_toss){
    sample(c("H","T"),n_toss,replace=TRUE)
}
#Do the experiment with 10 instances
experiment_1<-toss_coins(10)
#Print out the tosses to check the function
print(experiment_1)
##  [1] "T" "T" "T" "T" "H" "T" "H" "T" "T" "T"
sum(experiment_1=="H")/10
## [1] 0.2
#Now repeat the experiment with more instances
experiment_2<-toss_coins(10^4)
sum(experiment_2=="H")/10^4
## [1] 0.5007

Monte Carlo and its extensions (e.g. Markov Chain Monte Carlo) are generally used to find the results of very complex or analytically intractable calculations. For instance one of the earlier examples of MC methods, Metropolis Algorithm, is devised by Manhattan Project members and it is used in mathematical physics to understand the particle movements of the atomic bomb.

Before starting calculating options with Monte Carlo methods we will start with some toy examples and random variate generation. Aside from the lecture notes, I will also partly follow BOUN CMPE584 lecture notes and IE 586 lecture notes (not publicly available).

Toy Example: Calculation of \(\pi\)

To calculate the \(\pi\) value, think of a quarter of a unit circle (round with radius value 1).

#Create the points necessary to build a polygon
circle_data<-data.frame(x_val=c(0,seq(0,1,length.out=1000)),
                        y_val=c(0,sqrt(1-seq(0,1,length.out=1000)^2)))

library(ggplot2) #Load ggplot2
#create the shaded area for the quarter circle
plot_pi <- ggplot() +
geom_polygon(data=circle_data,aes(x=x_val,y=y_val),alpha=0.1) + theme_bw()
plot_pi

Now, let’s randomly put dots on the unit square (i.e. square with side length of 1). Then, we define them “in” or “out” depending on whether they are within the circle area or not.

#Create random points by declaring dot positions
#with 2 random uniformly distributed values for x and y.
dot_data<-data.frame(x_val=runif(25),y_val=runif(25))
#Define in with 1 and out with 0.
#Since circle radius is 1. We calculate in/out with the distance from origin.
dot_data$in_or_out<-ifelse(sqrt(dot_data$x_val^2+dot_data$y_val^2)<=1,1,0)
head(dot_data)
##          x_val     y_val in_or_out
## 1 0.3560381935 0.8451746         1
## 2 0.6841733987 0.8082791         0
## 3 0.0008856533 0.1399003         1
## 4 0.8064633757 0.4911957         1
## 5 0.5106650034 0.6675278         1
## 6 0.9941859178 0.4793008         0
plot_pi + geom_point(data=dot_data,aes(x=x_val,y=y_val,color=in_or_out)) +
theme(legend.position="none")

Ratio of number of dots within the circle area to the total number of dots will give us the ratio of the quarter unit circle to the unit square. We know the area of the quarter circle as \(\pi r^2/4\) and square with side length equal to the radius as \(r^2\). The ratio result as \(\pi/4\). So, 4 times the ratio of dots should give us approximately the value of \(\pi\).

#Simulated value of pi
4*sum(dot_data$in_or_out)/nrow(dot_data)
## [1] 3.04
#True value of pi
pi
## [1] 3.141593

Our simulated value of pi is not very satisfactory. Let’s increase the sample size from 25 to 10000.

dot_data_2<-data.frame(x_val=runif(10^4),y_val=runif(10^4))
dot_data_2$in_or_out<-ifelse(sqrt(dot_data_2$x_val^2+dot_data_2$y_val^2)<=1,1,0)
plot_pi + geom_point(data=dot_data_2,aes(x=x_val,y=y_val,color=in_or_out)) +
theme(legend.position="none")

4*sum(dot_data_2$in_or_out)/nrow(dot_data_2)
## [1] 3.1444

Better.

Toy Example: Chevalier de Méré

Directly quoting from Cut the Knot:

“A 17th century gambler, the Chevalier de Méré, made it to history by turning to Blaise Pascal for an explanation of his unexpectd losses. Pascal combined his efforts with his friend Pierre de Fermat and the two of them laid out mathematical foundations for the theory of probability.

Gamblers in the 1717 France were used to bet on the event of getting at least one 1 (ace) in four rolls of a dice. As a more trying variation, two die were rolled 24 times with a bet on having at least one double ace. According to the reasoning of Chevalier de Méré, two aces in two rolls are 1/6 as likely as 1 ace in one roll. (Which is correct.) To compensate, de Méré thought, the two die should be rolled 6 times. And to achieve the probability of 1 ace in four rolls, the number of the rolls should be increased four fold - to 24. Thus reasoned Chevalier de Méré who expected a couple of aces to turn up in 24 double rolls with the frequency of an ace in 4 single rolls. However, he lost consistently."

TLDR, there are two games with ordinary dice:

  • Getting a 6 on 4 rolls of a single die
  • Getting double 6 (düşeş) on 24 rolls of paired dice

Our gambling knight turned from Game 1 to Game 2 and asks why he starts losing money (assume at each game you either gain or lose 1 gold). Let’s calculate the probability of win for each case with simulation.

Game 1: One six in a four roll

#Set number of trials
n_trial<-10^4
#Roll the dice
#Each row is the result of 4 rolls
die_toss<-matrix(sample(1:6,n_trial*4,replace=TRUE),ncol=4)
head(die_toss)
##      [,1] [,2] [,3] [,4]
## [1,]    6    1    5    5
## [2,]    6    1    5    3
## [3,]    5    3    5    6
## [4,]    1    3    4    4
## [5,]    6    6    4    1
## [6,]    1    4    2    2
#See if the dice are 6 or not
die_toss<-die_toss==6
head(die_toss)
##       [,1]  [,2]  [,3]  [,4]
## [1,]  TRUE FALSE FALSE FALSE
## [2,]  TRUE FALSE FALSE FALSE
## [3,] FALSE FALSE FALSE  TRUE
## [4,] FALSE FALSE FALSE FALSE
## [5,]  TRUE  TRUE FALSE FALSE
## [6,] FALSE FALSE FALSE FALSE
#Calculate the wins
wins<-rowSums(die_toss)>=1
head(wins)
## [1]  TRUE  TRUE  TRUE FALSE  TRUE FALSE
#Calculate the winning proportion
sum(wins)/n_trial
## [1] 0.5167

True probability can be calculated as \(1-P(Loss)^4= 1 - (5/6)^4 = 0.518\).

Game 2: Double-six in 24 rolls of paired dice

Exercise!

True probability can be calculated as \(1-P(Loss)^{24} = 1 - (35/36)^24 = 0.491\).

Example: Pricing European Options

Suppose we want to price an option with the initial asset price \(S_0=100\), strike price \(K=100\), risk free rate \(r=0.02\), volatility \(\sigma=0.25\) and maturity \(T=1\) year. Risk-neutral pricing process is as follows. \[S_T = S_0 * exp((r - \sigma^2/2)T + \sigma Z \sqrt{T})\] where \(Z\) is a random standard normal variate \(N(0,1)\).

#Let's build a function to simulate
sim_european_call<-function(S_0=100,K=100,vol=0.25,T_years=1,r=0.02,n=10^4){
    #Simulate the stock
    sim_S_T<-S_0*exp((r-0.5*vol^2)*T_years + vol*rnorm(n)*sqrt(T))
    #Calculate payoffs
    payoffs<-pmax(sim_S_T-K,0)*exp(-r*T_years)
    #Simulate results and bounds
    Price<-mean(payoffs)
    SE<-1.96*sd(payoffs)/sqrt(n)
    LowerB <- Price - SE
    UpperB <- Price + SE
    return(c(Price=Price,SE=SE,Lower=LowerB,Upper=UpperB))
}
#Simulation with 1000 instances
sim_european_call(S_0=100,K=100,vol=0.25,T_years=1,r=0.02,n=10^3)
##     Price        SE     Lower     Upper 
## 10.129527  1.061294  9.068233 11.190821
#Simulation with 10000 instances
sim_european_call(S_0=100,K=100,vol=0.25,T_years=1,r=0.02,n=10^4)
##      Price         SE      Lower      Upper 
## 10.9288232  0.3447952 10.5840280 11.2736184
#Simulation with 100000 instances
sim_european_call(S_0=100,K=100,vol=0.25,T_years=1,r=0.02,n=10^5)
##      Price         SE      Lower      Upper 
## 10.9078898  0.1094115 10.7984782 11.0173013
#Compare with Black Scholes value
black_scholes_eopt(s0=100,K=100,r=0.02,T_in_days=252,sig=0.25,callOrPut="call")
## [1] 10.87056

We will return to option pricing in the following sections in detail.

Random Number Generation

Linear Congruential Generator

\[s_{i+1}=(as_i+b),\mod m\]

Let’s make a home made LCG

#This is the generating function.
lcg_generator <- function(s_i,m,a=7^5,b=0){
    val<-(a*s_i+b)%%m
}
#This returns a vector of pseudonumbers given the seed.
lcg_rand<-function(n,s_0=123457,m=2^35-1,...){

    s_vector<-lcg_generator(s_i=s_0,m=m,...)

    for(i in 2:n){
        s_vector[i]<-lcg_generator(s_i=s_vector[i-1],m=m,...)
    }
    #We make it this way to never get a 0 or 1
    return((s_vector+0.5)/m)
}
lcg_rand(n=10)
##  [1] 0.06038875 0.95379398 0.41544445 0.37485500 0.18793875 0.68649200
##  [7] 0.87100583 0.99494424 0.02791106 0.10125041

LCG is not the best pseudorandom number generator. Also, there might be some unwanted consequences, for instance someone win the lottery four times, if your randomness pattern is not so random.

Note: Mersenne Twister is the most popular random number generation algorithm with a period of \(2^{19937}-1\). It is also used by R.

Inversion Method

Generating random variates from non-uniform distributions is not as straightforward. One way to do it is inversion method. Basically, take the inverse of the cumulative distribution function and put a uniform random number as the input. Resulting number is a random variate from the corresponding distribution.

#Inverse CDF of exponential distribution
inverse_exp<-function(u,mu=1){
    return(-log(1-u)/mu)
}
#Prepare the comparison data
compare_data<-data.frame(inv_data=inverse_exp(u=runif(1000)),rexp_data=rexp(1000))
#Plot the inversion method histogram
ggplot(data=compare_data) + geom_histogram(aes(x=inv_data),binwidth=0.2)