Random Walks and Pi

We are going to estimate pi from the mean distance traveled <D_N> by a swarm of random walker each of N steps using python. The key equation that connects pi with the square of the mean distance traveled is

<D_N>^2 ~ (2/pi)*N

when N is large (source)

By simulating a bunch of random walkers each of N steps, and varying N we can plot <D_N>^2 vs N which is a linear function with constant 2/pi. The slope of our linear regression is 2/pi and we can solve for pi.

Supplies

Code:
python: dumpy, matplotlib
Concepts:
Random Walks. A great Wolfram article with the derivation of the equation needed. And a classic discussion from Richard Feynman.
Linear regression

Simulate One Random Walk

We can think of a random walk as a random sequence of tails/heads or -1/+1. The following code give us one random walk of N steps:

# 1 random walker, 1000 steps each
N = 1000 # N: number of steps
nRW = 1 # nRW =  number of random walks
random_walk_swarm = np.random.choice([-1,1],(nRW,N)) # sequence of nRW random walks of length N

We can plot this path using matplotlib

# not the most efficient way but it demonstrates the concept
plt.figure(figsize=(20,10))
plt.plot(item,color='k',alpha=0.5)
plt.title('Random Walk with 1000 steps')
plt.xlabel('steps')
plt.show()

One particular path is shown.

Simulate Several Random Walks

We can modify the code above to build several random walks and collect information about the distance traveled. The distance between the starting point (x=0) and the absolute value of final position of the random walk is d_N. We will average over several random walks for each N (number of steps)

# 100 random walkers, 1000 steps each
N = 1000 # N: number of steps
nRW = 100 # nRW =  number of random walks
random_walk_swarm = np.random.choice([-1,1],(nRW,N)) # sequence of nRW random walks of length N

# not the most efficient way but it demonstrates the concept
plt.figure(figsize=(20,10))
lD = [] # an array to store the distance traveled by the random walks
for item in np.cumsum(random_walk_swarm,axis=1):
    plt.plot(item,color='k',alpha=0.5)
    lD.append(np.abs(item[-1])) # adding each distance traveled (absolute value of the last position)
plt.title('100 Random Walks with 1000 steps')
plt.xlabel('steps')
plt.show()

We can compute the mean value of the distance traveled by all the random walks

np.mean(lD)

In this case is 25.18

Repeat Step 2 With Different Number of Steps

We are ready to repeat the process for different numbers of steps. Note that the approximation that we mentioned in the introduction is valid only for large numbers, so we will choose numbers between 1000 and 20000 in intervals of 1000.

nRW = 10000
lN = [] # an array to store the value of N
lD = [] # an array to store the distance traveled by the random walks
for N in range(1000,20000,1000):
    random_walk_swarm = np.random.choice([-1,1],(nRW,N))
    lN.append(N)
    lD.append(np.mean(np.abs(np.cumsum(random_walk_swarm,axis=1).T[-1])))

We can then plot the number of steps N and the square of the average distance traveled <d_N>^2

plt.scatter(lN,np.array(lD)**2) # notice that we are plotting N vs <D_N>^2
plt.xlabel('N - Number of steps')
plt.ylabel('Average distance traveled')
plt.show()

Notice the linear relationship between N and <d_N>^2

Find the Slope

From step 3 we can see there is a linear relationship between N and <d_N>^2. The equation in the introduction gives us a clue of the proportionality constant: 2/pi

We will use spicy.linregress to find the slope of the line

slope, intercept, r, p, se = stats.linregress(lN, np.array(lD)**2)

with the following values for our simulation

# N range between 1000 and 20000 intervals of 1000
# number of random walkers 10000
slope = 0.6369379699710177
pi = 3.1400231958082276

Close enough!

Increase Number of Random Walkers and Range of N

By increasing the number of random walkers and the range of the number of steps we obtain values closer to pi