The worlds of data science, mathematical finance, physics, engineering and bioinformatics (among many others) routinely produce difficult problems: problems for which no computationally ‘easy’ solution is available.
Fortunately, there are ways to approximate the solutions to these problems with a remarkably simple trick.
Monte Carlo methods are a class of techniques that can be applied to computationally ‘difficult’ problems to arrive at sufficiently accurate approximate answers.
As an analogy, imagine you are an ant crawling on a large, tiled mosaic. From your vantage point, you have no easy way of finding out what the mosaic represents.
If you start walking around the mosaic and sampling the tiles you see at random intervals, you’ll form a rough idea of what the mosaic is about. The more samples you take, the better your approximation will be.
If you could cover every single tile, you would eventually have a true representation of the mosaic. However, this won’t be necessary – after taking a certain number of samples, you’ll have a pretty good estimate.
This is exactly how Monte Carlo methods approximate solutions to otherwise intractable problems.
The name refers to a famous casino in Monaco. It was coined in 1949 by one of the pioneers of the method, Stanisław Ulam. Ulam’s uncle was reportedly a gambler, and the connection between gambling and the element of ‘chance’ in Monte Carlo methods must have been particularly evident to Stanisław.
The best way to understand a technical concept is to dive into it and see how it works. The rest of this article will show how Monte Carlo methods can solve three interesting problems. Examples will be presented in the Julia programming language.
Introduction to Julia
If you are interested in specializing in data science, there are several languages you can consider learning. One that has emerged as a serious alternative in recent years is a language called Julia.
Julia is a numerical programming language that has been adopted within a range of quantitative disciplines. It is free to download. There’s also a really neat browser-based interface called JuliaBox, which is powered by Jupyter Notebook.
One of the cool features of Julia that we are using today is how easily it facilitates parallel computing. This allows you to perform calculations on multiple processes, giving a serious performance boost when done on a large scale.
The @everywhere macro ensures that the hello() function is defined on all processes. The @spawn macro wraps the hello(“world!”) expression in a closure, which is then automatically evaluated remotely on an available process.
The result of that expression is immediately returned as a Future, a remote reference. If you try to print the result directly, you will be disappointed: the output of hello(“world!”) is computed on a separate process, and is not available here. To make it available, use the fetch() method.
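The steps above can be sketched as follows. This is a minimal example assuming a Julia 1.x session, where the parallel primitives live in the `Distributed` standard library (in older versions they were built in, and `@spawn` is nowadays often written `@spawnat :any`); the `hello` function is just an illustrative stand-in:

```julia
using Distributed

# Add two worker processes (alternatively, start Julia with `julia -p 2`).
addprocs(2)

# @everywhere defines hello() on every process, not just the master.
@everywhere hello(name) = string("Hello, ", name)

# @spawn wraps the call in a closure and evaluates it on an available
# worker, immediately returning a Future (a remote reference).
ref = @spawn hello("world!")

# Printing `ref` would only show the Future itself; fetch() blocks until
# the remote computation finishes and returns its value.
println(fetch(ref))  # prints "Hello, world!"
```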
If bothering with spawning and hatching sounds like too much, you’re in luck. Julia also has a @parallel macro that will do some of the heavy lifting needed to run tasks in parallel.
The @parallel macro works either standalone, or with a reducer function that collects the results from all processes and reduces them to a final output.
The for-loop simply returns the value of i at each step. The @parallel macro uses the addition operator as a reducer: it takes each value of i and adds it to the previous values.
The result is the sum of the first billion integers.
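As a sketch of this pattern, here is the reduced parallel loop. Note the assumption that we are on Julia 1.x, where `@parallel` was renamed `@distributed` and moved into the `Distributed` standard library:

```julia
using Distributed

# Use two worker processes for the parallel loop.
addprocs(2)

# @distributed (Julia 1.x's name for @parallel) splits the range across
# the workers; the (+) reducer sums the partial results from each process.
total = @distributed (+) for i = 1:1_000_000_000
    i
end

# Sum of the first billion integers: n(n+1)/2.
println(total)  # 500000000500000000
```

The reducer is applied pairwise to the values produced by the loop body, so any associative operator (such as `+`, `*`, or `max`) can be used in its place.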
With that whistle-stop tour of Julia’s parallel programming capabilities in mind, let’s look at how we can use Monte Carlo methods to solve some interesting example problems.