Lecture 5 Classical probability I

5.1 Probability with equally likely outcomes

Classical probability is the name we give to probability where there are a finite number of equally likely outcomes.

Classical probability was the first type of probability to be formally studied – partly because it is the simplest, and partly because it was useful for working out how to win at gambling. Tossing fair coins, rolling dice, and dealing cards are all common gambling situations that can be studied using classical probability – in a deck of cards, for example, there are 52 cards that are equally likely to be drawn. Among the first works to seriously study classical probability were “Book on Games of Chance” by Girolamo Cardano (written in 1564, but not published until 1663, one hundred years later), and a famous series of letters between Blaise Pascal and Pierre de Fermat in 1654.

Definition 5.1 Let \(\Omega\) be a finite sample space. Then the classical probability measure on \(\Omega\) is given by \[ \mathbb P(A) = \frac{|A|}{|\Omega|} . \]

So to work out a classical probability \(\mathbb P(A)\), crucially we need to be able to count how many outcomes \(|A|\) are in the event \(A\) and count how many outcomes \(|\Omega|\) are in the whole sample space \(\Omega\). (This is why classical probability is also called “enumerative probability” – “enumeration” is another word for counting.) In this lecture and the next, we’ll look at some different ways in which we can count the number of outcomes in common events and sample spaces.

Example 5.1 We roll a dice. What is the probability we get at least 5?

The sample space is \(\Omega = \{1,2,3,4,5,6\}\), with \(|\Omega| = 6\). The event that we roll at least 5 is \(A = \{5,6\}\), with \(|A| = 2\). Hence \[ \mathbb P(A) = \frac{|A|}{|\Omega|} = \frac{2}{6} = \frac{1}{3} . \]

There’s something we ought to check before going any further!

Theorem 5.1 Let \(\Omega\) be a finite nonempty sample space. Then the classical probability measure on \(\Omega\), \[ \mathbb P(A) = \frac{|A|}{|\Omega|} , \] is indeed a probability measure, in that it satisfies the three axioms in Definition 4.1.

Proof. We’ll take the axioms one by one.

  1. Since \(|\Omega| \geq 1\) and \(|A| \geq 0\), it is indeed the case that \(\mathbb P(A) = |A|/|\Omega| \geq 0\).
  2. We have \({\displaystyle \mathbb P(\Omega) = \frac{|\Omega|}{|\Omega|} = 1}\), as required.
  3. Since we have a finite sample space, we only need to show Axiom 3 for a sequence of two disjoint events; the argument can be repeated to get any finite number of events. Let \(A = \{a_1, a_2, \dots, a_k\}\) and \(B = \{b_1, b_2, \dots, b_l\}\) be two disjoint events with \(|A| = k\) and \(|B| = l\). Note that we can enumerate the elements of the disjoint union \(C = A \cup B\) as \[ c_1 = a_1, c_2 = a_2, \dots, c_k = a_k, c_{k+1} = b_1, c_{k+2} = b_2, \dots, c_{k+l} = b_l . \] Since \(A\) and \(B\) are disjoint, this list has no repeats, and we see that \(|C| = |A \cup B| = k+l\). Hence \[ \mathbb P(A \cup B) = \frac{k+l}{|\Omega|} = \frac{k}{|\Omega|} + \frac{l}{|\Omega|} = \mathbb P(A) + \mathbb P(B) , \] and Axiom 3 is fulfilled.

5.2 Multiplication principle

In classical probability, to find the probability of an event \(A\), we need to count the number of outcomes in \(A\) and the total number of possible outcomes in \(\Omega\). This can be easy when we’re just looking at one choice – like the 2 outcomes from tossing a single coin, the 6 outcomes of rolling a single dice, or the 52 outcomes from dealing a single card. Now we’re going to look at what happens if there are a number of choices one after another – like tossing multiple coins, rolling more than one dice, or dealing a hand of cards.

Here, an important principle is the multiplication principle. The multiplication principle says that if you have \(n\) choices followed by \(m\) choices, than all together you have \(n \times m\) total choices. You can see this by imagining the choices in a \(n \times m\) grid, with the \(n\) columns representing the first choice and \(m\) rows representing the second choice. For example, suppose you go to a burger restaurant where there are 3 choices of burger (beefburger, chicken burger, veggie burger) and 2 choices of sides (fries, salad), then altogether there are \(3 \times 2 = 6\) choices of meal.

Beefburger Chicken burger Veggie burger
Fries 1: Beefburger with fries 2: Chicken burger with fries 3: Veggie burger with fries
Salad 4: Beefburger with salad 5: Chicken burger with salad 6: Veggie burger with salad

More generally, if you have \(m\) stages of choosing, with \(n_1\) choices in the first stage, then \(n_2\) choices in the second stage, all the way to \(n_m\) choices in the final stage, you have \(n_1 \times n_2 \times \cdots \times n_m\) total choices altogether.

Example 5.2 Five fair coins are tossed. What is the probability they all show the same face?

Here, the sample space \(\Omega\) is the set of all sequences of 5 coin outcomes. How many sample outcomes are in \(\Omega\)? Well, the first coin can be heads or tails (2 choices); the second coin can be heads or tails (2 choices) and so on, until the fifth and final coin. So, by the multiplication principle, \(|\Omega| = 2 \times 2 \times 2 \times 2 \times 2 = 2^5 = 32\).

The event we’re interested in is \(A = \{\text{HHHHH}, \text{TTTTT}\}\), the event that the faces are all the same – either all heads or all tails. This clearly has \(|A| = 2\) outcomes.

So the probability all five coins show the same face is \[ \mathbb P(A) = \frac{|A|}{|\Omega|} = \frac{2}{32} = \frac{1}{16} \approx 0.06. \]

Example 5.3 Four dice are rolled. What is the probability we get at least one 6?

Here, \(\Omega\) is the set of all possible sequences of four dice rolls. Clearly \(|\Omega| = 6^4 = 1296\).

The event \(A\) is the set of all dice roll sequences with at least one 6. Whenever you see a question with the phrase “at least one” in it, it’s very often a good idea to look at the complementary event \(A^\mathsf{c}\) instead. We know from the last lecture that \(\mathbb P(A) = 1 - \mathbb P(A^\mathsf{c})\), but in “at least one” questions, it’s often easier to count \(|A^\mathsf{c}|\) than to count \(|A|\).

Here, since \(A\) is the set of all dice roll sequences with at least one 6, then \(A^\mathsf{c}\) is the set of dice roll sequence without any 6s at all. This means all four dice must have rolled a 1, 2, 3, 4 or 5. Since each of the four dice rolls has five possibilities, this means that \(|A^\mathsf{c}| = 5^4 = 625\).

Putting this together, we see that \[ \mathbb P(A) = 1 - \mathbb P(A^\mathsf{c}) = 1 - \frac{|A^\mathsf{c}|}{|\Omega|} = 1 - \frac{625}{1296} = \frac{671}{1296} \approx 0.518 .\] So there’s about a 52% chance we get at least one 6.

5.3 Sampling with and without replacement

Probabilists love problems where they pick coloured balls out of a bag!

Example 5.4 A bag contains 15 balls: 10 black balls and 5 white balls. We draw 3 balls out of the bag. What is the probability all 3 balls are black (a) if we put each ball back into the bag after it is chosen; (b) if we do not put each ball back into the bag after it is chosen.

Let’s start with (a). The number of ways to choose a ball out 15 on three occasions is \(|\Omega| = 15^3\). The number of ways to choose a black ball out of 10 on three occasions is \(|A| = 10^3\). Hence \[ \mathbb P(A) = \frac{|A|}{|\Omega|} = \frac{10^3}{15^3} = \frac{1000}{3375} = \frac{8}{27} \approx 0.30. \]

What about (b)? Here we don’t put the ball back in the bag once it has been chosen. There are 15 ways to pick the first ball. But then there are only 14 balls left in the bag for the second choice, and only 13 balls for the third choice. So \(|\Omega| = 15\times14\times13\). Similarly, there are 10 ways the first ball can be black. But once that black ball is removed, only 9 choices for the second black ball, and only 8 for the third. So \(|A| = 10\times9\times8\). So this time we have \[ \mathbb P(A) = \frac{|A|}{|\Omega|} = \frac{10\times9\times8}{15\times14\times13} = \frac{720}{2730} = \frac{24}{91} \approx 0.26, \] which is slightly smaller than the answer in part (a).

This example illustrated the difference between sampling with replacement (when the balls were put back into the bag) and sampling without replacement (when the balls were not put back). If we want to sample \(k\) items from a set of \(n\) items, then:

  • the number of ways to sample with replacement is \[ n^k = n\times n\times\cdots\times n; \]
  • the number of ways to sample without replacement is \[ {n}^{\underline{k}} = n\times(n-1)\times \cdots\times (n-k+1) .\]

Here, we’ve defined the notation \({n}^{\underline{k}}\) for the number of ways to sample without replacement; this is called the falling factorial or permutation number. This is still \(k\) numbers multiplied together, but decreasing by 1 each time down from \(n\). The final number in the product is the number of choices in the \(k\)th an final round: this is the original \(n\) items minus the \(k-1\) items sampled in the previous \(k-1\) rounds; so the final number is \(n - (k-1) = n - k + 1\). A notation point: Notice that the subscript is underlined in the falling factorial; other notation sometimes used includes \((n)_k\), \(P(n,k)\), or \({}^nP_k\).

In our balls-in-a-bag problem, the answer for sampling with replacement was \(10^3/15^3\), while the answer for sampling without replacement was \({10}^{\underline{3}}/{15}^{\underline{3}}\).

Next time, we will look at more classical probability problems. Do two of your friends share a birthday? Can you shuffle a deck of cards in an order that has never happened before in the history of the universe? And can you win the National Lottery?

Summary

  • “Classical probability” describes the situation where there are finitely many equally likely outcomes. The classical probability \(\mathbb P(A) = |A|/|\Omega|\) requires us to count how many outcomes there are in events or sample spaces.
  • The multiplication principle says that \(n\) choices followed by \(m\) choices makes \(n \times m\) choices in total.
  • Sampling \(k\) objects out of \(n\) with replacement gives \(n^k\) choices.
  • Sampling \(k\) objects out of \(n\) without replacement gives \(n^{\underline{k}} = n(n-1)\cdots(n-k+1)\) choices.

Recommended reading: