Simulation

Background and Uses of Monte Carlo Simulation

[TBP, based on the existing course handout]

Pseudo-Random Numbers

The fundamental building block for most Monte Carlo simulations is a source of random variables.  While a physical device can be used to produce these random variates (see, for example, the discussion at <http://www.random.org/nform.html>), by far the most common approach combines two basic ingredients: a source of Uniform[0,1] pseudo-random numbers, and a theorem that permits us to transform such a Uniform[0,1] random variable into a random variable from any other distribution whose cumulative distribution function is invertible.

The construction of suitable pseudo-random numbers is a non-trivial computer programming problem that we will not cover in any detail here.  However, it is worth noting that a pseudo-random number generator (PRNG) is a (relatively) simple computer program that produces a defined and repeatable sequence of numbers that, while deterministic by definition, appear to be uniformly distributed, usually on the interval [0,1].  Considerable theoretical and practical effort has been devoted to constructing PRNGs that have good statistical properties and use computer resources efficiently; for our purposes we can assume that such a PRNG is available.
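
As a concrete, deliberately simplified illustration, the sketch below implements a linear congruential generator in Mathematica using the classic Park-Miller "minimal standard" constants; the names lcgSeed and nextUniform are ours, and production-quality PRNGs are considerably more sophisticated.

(* A minimal linear congruential generator: each call updates the seed and *)
(* returns the next number in a repeatable, approximately Uniform[0,1] sequence. *)
lcgSeed = 12345;
nextUniform[] := (lcgSeed = Mod[16807 lcgSeed, 2147483647];
      N[lcgSeed/2147483647])

Table[nextUniform[], {5}]

Restarting from the same seed reproduces exactly the same sequence, which is what makes the generator "pseudo" random.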

Generating Non-Uniform Random Variables

If the only random variables in our simulation are uniformly distributed, we are essentially done; adding and multiplying (that is, shifting and scaling) allows us to transform a Uniform[0,1] random variable into a random variable uniformly distributed on any other interval.  But if we need random variables that are not uniformly distributed, we need a way to create pseudo-random sequences that follow those other distributions.  Fortunately, there is a relatively simple procedure that allows us to create such sequences starting from a Uniform[0,1] PRNG.
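
For example, if U is Uniform[0,1], then a + (b - a)U is uniformly distributed on the interval [a,b].  A one-line illustration in Mathematica (the endpoints 5 and 10 are arbitrary choices for this sketch):

(* Shift and scale a Uniform[0,1] draw to obtain a Uniform[5,10] draw. *)
5 + (10 - 5) Random[]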

Theorem: If a continuous random variable X has cumulative distribution function F(x), then the random variable defined by F^(-1)(U), where F^(-1)(F(x)) = x (in other words, F^(-1) is the functional inverse of F(x)) and U is a Uniform[0,1] random variable, also follows this distribution.

Pragmatically, this means that to construct a pseudo-random number generator for an arbitrary distribution, we follow this procedure:
1.  Obtain a good Uniform[0,1] PRNG.
2.  Compute the CDF of the random variable of interest, F(x).  
3.  Compute the inverse CDF, F^(-1)(U), by setting F(x)=U and solving for x in terms of U.
4.  The sequence X_i=F^(-1)(U_i), obtained by substituting a sequence of Uniform[0,1] pseudo-random numbers into the inverse CDF, is a pseudo-random sequence with CDF F(x).

For example, suppose we are interested in generating exponentially distributed random variables.  The CDF of the exponential distribution is

F(x) = 1 - e^(-λx)

and so its inverse is

x = F^(-1)(U) = -(1/λ) ln(1 - U)

and substituting the output of a Uniform[0,1] PRNG into this expression will give an instance of an exponentially distributed random variable.  Knowing that the Mathematica function Random[ ] returns a Uniform[0,1] random variable,

In[13]:=

-1/0.5Log[1 - Random[]]

Out[13]=

11.4565

returns an exponentially distributed random variable with mean 2 (parameter λ=0.5).
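
To check the construction, we can generate a larger sample and compare its average to the theoretical mean 1/λ = 2.  The sketch below uses an arbitrary sample size of 10000 and the made-up name exponentialSample; the sample mean should come out near 2, though the exact value will vary from run to run.

(* Generate 10000 exponential draws with λ = 0.5 and compute their sample mean, *)
(* which should be close to the theoretical mean 1/λ = 2. *)
exponentialSample = Table[-1/0.5 Log[1 - Random[]], {10000}];

Apply[Plus, exponentialSample]/Length[exponentialSample]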

The Special Case of the Normal Distribution

The use of the inverse CDF, F^(-1)(U), is effective for many random variables of interest in the construction of simulations.  However, this method is impractical for the Normal distribution for one simple reason: since we cannot write the CDF of the Normal distribution as an analytic function, we cannot invert the CDF to apply the method!

As a result, specialized techniques have been developed for generating pseudo-random numbers from the Normal distribution, ranging from simply adding several uniform random numbers to more sophisticated algorithms.

One of the simplest techniques for generating approximately Normal random variables exploits a very powerful theorem we will discuss later in the course: the Central Limit Theorem. In the discussion of the Central Limit Theorem below, the generation of normally distributed pseudo-random numbers is used as an example.
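
As a preview of that idea, the sketch below uses the common textbook sum-of-uniforms approximation: the sum of twelve Uniform[0,1] draws has mean 6 and variance 1, so subtracting 6 yields an approximately standard Normal value.  This is offered only as an illustration of "adding several uniform random numbers"; the function name approxStandardNormal is ours, and the details may differ from the Central Limit Theorem example below.

(* Sum twelve Uniform[0,1] draws and subtract 6; the result has mean 0, *)
(* variance 1, and is approximately standard Normal. *)
approxStandardNormal[] := Apply[Plus, Table[Random[], {12}]] - 6

(* Shift and scale to approximate a Normal with mean 10 and standard deviation 3 *)
(* (arbitrary illustrative values). *)
10 + 3 approxStandardNormal[]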

Simulation Experiments

An example of a simulation experiment appears below in the discussion of confidence intervals.  Its primary purpose is to demonstrate how confidence intervals work, but it also serves as a model for how to construct a simulation in Mathematica.

Programming Techniques for Larger Experiments

If the simulation experiment is large, the techniques described above will fail due to limitations of the programming environment, usually by causing the computer in question to run out of memory. Since the problem is caused by characteristics of the computer being used, the size at which a problem must be considered "large" will vary from computer to computer. The main factors that determine whether a simulation is large enough to need these techniques are the number of replications desired, the number of random variables involved, and the complexity of the calculations at each replication.

Investigation of programming techniques for larger experiments is beyond the scope of this course.  The principles involved combine computer programming, numerical methods (the theory of how computer implementations affect mathematical computations), and the Monte Carlo methods discussed in this section.

