Probability when tossing a coin

In the original coin tossing practical from Week 1 we discovered that as we increased the number of tosses of a coin, the fraction of heads converged towards 0.5, as expected according to the Law of large numbers.

Let’s get out our virtual coin and simulate the tosses again.

coin <- c(1,0)

tosses <- sample(coin, size=1000000, replace=TRUE)

num_tosses <- c(1, 3, 5, 10, 100, 1e3, 1e4, 1e5, 1e6)
frac_heads <- c()
for (n in num_tosses) {
  frac <- sum(tosses[1:n])/n
  frac_heads <- c(frac_heads, frac)
}
plot(num_tosses, frac_heads, type="b", ylim=c(0,1), log="x",
     xlab="Number of tosses", ylab="Fraction of Heads")
abline(h=0.5, col='red')

Let’s look a little bit closer at this result.

Do you think that the higher the number of tosses of a coin, the closer the number of heads will be to exactly half the number of tosses?

Well, not exactly. The analysis we have done shows that the relative number of heads (expressed as a fraction) really does get closer and closer to 0.5. But the absolute number of heads might be doing something unexpected…




Difference between observed and expected values

What happens if we decide to count exactly how many heads we obtained, and compare this against the number of heads we expected (\(\frac{1}{2}\) of the number of tosses)? Do you think the difference will get smaller and smaller as the number of tosses grows?

That’s the great thing about R, we can simply perform a simulation experiment to find out!

Remember, we still have our million tosses. Let’s revisit that:

length(tosses)
## [1] 1000000
sum(tosses)
## [1] 499683

All seems to be in order. Now, instead of calculating fractions, let’s calculate the difference between the observed number of heads, and those we expected.

diff1 <- sum(tosses[1]) - 0.5
diff2 <- sum(tosses[1:2]) - 1
diff3 <- sum(tosses[1:3]) - 1.5
diff4 <- sum(tosses[1:4]) - 2
diff5 <- sum(tosses[1:5]) - 2.5
diff6 <- sum(tosses[1:6]) - 3
diff7 <- sum(tosses[1:7]) -3.5

diff_heads <- c(diff1, diff2, diff3, diff4, diff5, diff6, diff7)

Note that we essentially reused the code we had before, but instead of dividing we are now subtracting. The value at the end of each line corresponds to the expected number of heads (0.5 * n). So we can revisit any of our results, we also take care to use different names for all our objects.

plot(diff_heads, type="b", xlab="Number of tosses", ylab="Observed - Expected number of Heads")
abline(h=0, col='red')

So, that looks pretty noisy, right? Do note that the differences (as in total number of Heads) are quite small (between -1.5 and 0.5.

What do you think will happen if we toss the coin more and more times? Will it stabilize closer to 0?

Let’s go and look at larger number of tosses to find out…

diff10 <- sum(tosses[1:10]) - 5
diff100 <- sum(tosses[1:100]) - 50
diff1k <- sum(tosses[1:1000]) - 500
diff10k <- sum(tosses[1:10000]) - 5000
diff100k <- sum(tosses[1:1e5]) - 5e4
diff1000k <- sum(tosses[1:1e6]) - 5e5

diff_heads_big <- c(diff10, diff100, diff1k, diff10k, diff100k, diff1000k)

Again note we have used almost the same code. Just be careful of calculating correctly the expected number of Heads.

plot(diff_heads_big, type="b", xlab="Much larger numbers of tosses", ylab="Observed - Expected number of Heads")
abline(h=0, col='red')

Of course the differences can be positive or negative, since we can by chance get more or less Heads than we expected. It can also happen that at some point, the value of the difference might get a bit closer to 0. But in general, we are using much larger number of tosses, and the absolute difference is getting larger (this plot shows values of up to a few hundred Heads, which is much larger than the previous plot).

If you still think that this is strange, and expect that the differences should “cancel out” at some point and get closer to zero, you can manually confirm by repeating the experiment a few times:

sum(sample(coin, size=1e4, replace=TRUE)) - 5e3
## [1] -1
sum(sample(coin, size=1e4, replace=TRUE)) - 5e3
## [1] -34
sum(sample(coin, size=1e4, replace=TRUE)) - 5e3
## [1] -19
sum(sample(coin, size=1e6, replace=TRUE)) - 5e5
## [1] -446
sum(sample(coin, size=1e6, replace=TRUE)) - 5e5
## [1] -679
sum(sample(coin, size=1e6, replace=TRUE)) - 5e5
## [1] -183

You should find that although the differences can fluctuate quite a bit (more on that later), the higher the number of tosses, the higher the difference tends to be.

Another important observation is that the difference does not grow as quickly as we are increasing the number of tosses.

This last simulation may have been one of those situations in which your intuition is wrong (especially after observing the fraction getting closer and closer to 0.5). But if you think about it a bit more, it is actually to be expected. When you toss the coin a small number of times, the number of Heads is clearly bounded by the number of tosses (if you toss the coin 3 times, you cannot possibly get more than 3 Heads). So the possible range of Heads will be between 0 and 3. The same is true for larger number of tosses, but the range will be larger, and so it is possible to get an ever larger number of Heads as you increase the number of tosses. Of course, just by chance it becomes less and less likely to observe the maximum possible number of Heads. Getting 3 Heads in 3 tosses is not too difficult, but getting 3000 Heads in 3000 tosses is virtually impossible. So, chance will tend to keep the number of Heads closer to the expected value (0.5) even though the absolute fluctuations around this exact value can be larger.