5 What is Probability#
In the previous sections, we have learned about permutations and combinations. Now, we will introduce the concept of probability.
Sample Space{definition}
The sample space \(\Omega\) is the set of all possible outcomes of an experiment.
Examples:
The sample space of a single coin flip is \(\Omega = \{H, T\}\). (Heads, Tails)
The sample space of rolling a 6-sided die is \(\Omega = \{1, 2, 3, 4, 5, 6\}\).
The sample space of tossing a coin twice is \(\Omega = \{HH, HT, TH, TT\}\).
The sample space of tossing a coin until the first head appears is \(\Omega = \{H, TH, TTH, TTTH, ...\}\). (Infinite)
The sample space of tossing two indistinguishable coins is \(\Omega = \{HH, HT, TT\}\). (order doesn’t matter)
The number of e-mails you receive in a day is \(\Omega = \{0, 1, 2, 3, ...\}\). (Infinite)
\(\cdots\)
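The finite sample spaces above can be enumerated with a few lines of Python; this is a small illustrative sketch (the variable names are my own, not from the text):

```python
from itertools import product, combinations_with_replacement

# Sample space of a single coin flip
omega_flip = {"H", "T"}

# Sample space of tossing a coin twice (ordered outcomes)
omega_two_flips = {"".join(p) for p in product("HT", repeat=2)}
print(sorted(omega_two_flips))  # ['HH', 'HT', 'TH', 'TT']

# Sample space of two indistinguishable coins (order doesn't matter)
omega_indist = {"".join(c) for c in combinations_with_replacement("HT", 2)}
print(sorted(omega_indist))  # ['HH', 'HT', 'TT']
```

Note how `product` generates ordered outcomes (HT and TH are distinct), while `combinations_with_replacement` collapses order, matching the indistinguishable-coins example.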
Event{definition}
An event \(A\) is a subset of the sample space \(\Omega\). An event is a collection of outcomes of an experiment.
Examples:
The event of getting heads when flipping a coin is \(A = \{H\}\).
The event of rolling an even number with a 6-sided die is \(A = \{2, 4, 6\}\).
The event of rolling a number greater than 4 with a 6-sided die is \(A = \{5, 6\}\).
The event of getting at least one head when flipping a coin twice is \(A = \{HH, HT, TH\}\).
The event that the first head appears after at most two tails when flipping a coin repeatedly is \(A = \{H, TH, TTH\}\).
The event of getting two tails when flipping two indistinguishable coins is \(A = \{TT\}\).
The event of receiving less than 20 e-mails in a day is \(A = \{x \in \mathbb{N} \mid x < 20\}\).
\(\cdots\)
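Since events are just subsets of \(\Omega\), they can be modeled directly as Python sets; a small sketch (names are illustrative):

```python
# Sample space of one roll of a 6-sided die
omega = {1, 2, 3, 4, 5, 6}

# Events are subsets of the sample space
even = {x for x in omega if x % 2 == 0}       # {2, 4, 6}
greater_than_4 = {x for x in omega if x > 4}  # {5, 6}

# Events support the usual set operations
print(even <= omega)          # True: every event is a subset of omega
print(even & greater_than_4)  # {6}: even AND greater than 4
print(even | greater_than_4)  # {2, 4, 5, 6}: even OR greater than 4
```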
Probability as Stochasticity#
Probability{definition}
The probability of an event \(A\) is the likelihood of \(A\) occurring. It is denoted by \(P(A)\).
We can think of the probability as a function that maps events to real numbers in the range \([0, 1]\).
One way of thinking about probability is as the long-run relative frequency of an event: the ratio of the number of favorable outcomes (outcomes landing in \(A\)) to the total number of trials:
\[P(A) = \lim_{n \to \infty} \frac{n(A)}{n}\]
Here, \(n(A)\) is the number of times event \(A\) occurs in \(n\) trials (the number of times the outcome of the \(n\) trials “lands” in \(A\)).
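The relative-frequency idea can be illustrated with a short simulation, assuming a fair 6-sided die and the event "roll an even number" (the setup is my own example, not from the text):

```python
import random

random.seed(0)  # fixed seed for reproducibility

n = 100_000
# n(A): how often the event A = "roll an even number" occurs in n trials
n_A = sum(1 for _ in range(n) if random.randint(1, 6) % 2 == 0)

# The relative frequency n(A)/n approaches P(A) = 1/2 as n grows
print(n_A / n)
```

Running the simulation with larger and larger `n` makes the printed ratio cluster more tightly around 0.5.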
Probability as Uncertainty#
Another way of thinking about probability is as a measure of uncertainty. For example, think about blindly rolling a die. Before looking at the outcome, we are uncertain about it (even though the outcome is already determined and the die has a specific number on top). The probability of each outcome is \(\frac{1}{6}\); in other words, we are uncertain about the outcome. If we had to choose, we would pick a specific number between 1 and 6, but no matter which number we chose, we would be right only 1 out of 6 times. This is the uncertainty of the outcome. In this sense, probability is a language for expressing “that we don’t know” or “that we are uncertain about” the outcome.
Note: This view of probability does not imply that there is any randomness in the “world”. Even if the world is purely deterministic (the behaviour of the die is determined solely by the laws of physics, and we could in principle predict the outcome), we can still use probability to express how uncertain we are about the outcome. For example, if we knew the exact rotation and speed of the die before it hits the table, we might be able to reduce our uncertainty and make better predictions, which could in turn be expressed as a change in the probability distribution over the outcomes.
Axioms of Probability{Axioms}
There are three axioms of probability:
\(0 \leq P(A) \leq 1\) for any event \(A\). (The probability of any event is between 0 and 1)
\(P(\Omega) = 1\). (The probability of the sample space is 1)
If \(E, F\) are two mutually exclusive events (they cannot occur at the same time), then \(P(E \cup F) = P(E) + P(F)\). (The probability of the union of two mutually exclusive events is the sum of the probabilities of the events)
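The three axioms can be checked numerically for a finite sample space; here is a sketch for a fair die, using exact fractions to avoid floating-point error (the helper `prob` is my own illustration):

```python
from fractions import Fraction

# Probability of each outcome of a fair 6-sided die
P = {k: Fraction(1, 6) for k in range(1, 7)}

def prob(event):
    """P(A) for an event A given as a set of outcomes."""
    return sum(P[x] for x in event)

omega = set(P)
E, F = {1, 2}, {5, 6}  # mutually exclusive events

assert 0 <= prob(E) <= 1                  # Axiom 1
assert prob(omega) == 1                   # Axiom 2
assert prob(E | F) == prob(E) + prob(F)   # Axiom 3 (E and F are disjoint)
print("all three axioms hold for this distribution")
```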
Identity Rule of Probability{rule}
The identity rule of probability states that the probability of an event \(A\) occurring is 1 minus the probability of the event not occurring:
\[P(A) = 1 - P(A^c)\]
Identity Rule of Probability{proof}
The identity rule of probability can be proven using the axioms of probability:
\begin{align*} P(A) + P(A^c) &= P(A \cup A^c) \quad &\text{(Axiom 3 since they are mutually exclusive/disjoint)} \\ &= P(\Omega) \\ &= 1 \quad &\text{(Axiom 2)} \end{align*}
so:
\begin{align*} P(A) + P(A^c) = 1 \implies P(A) = 1 - P(A^c) \end{align*}
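The identity rule is handy when the complement is easier to count. A small sketch for "at least one head in two coin flips", where the complement is just \(\{TT\}\) (the example and helper are my own):

```python
from fractions import Fraction
from itertools import product

omega = {"".join(p) for p in product("HT", repeat=2)}  # {'HH','HT','TH','TT'}

def prob(event):
    # equally likely outcomes: P(A) = |A| / |Omega|
    return Fraction(len(event), len(omega))

# A = "at least one head"; its complement is {"TT"}
A = {o for o in omega if "H" in o}
assert prob(A) == 1 - prob(omega - A)  # identity rule: P(A) = 1 - P(A^c)
print(prob(A))  # 3/4
```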
Rule for equally likely outcomes{rule}
If there are \(n\) outcomes in \(\Omega\) and each of them is equally likely, then the probability of any single outcome \(\omega \in \Omega\) is:
\[P(\{\omega\}) = \frac{1}{n}\]
(proof similar to the previous rule with Axiom 2 and 3)
Directly from this rule, we can conclude:
\[P(A) = \frac{|A|}{|\Omega|}\]
(if the outcomes are equally likely)
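For equally likely outcomes, computing a probability reduces to counting; a minimal sketch for the event "roll an even number" on a fair die (example of my own choosing):

```python
from fractions import Fraction

omega = set(range(1, 7))  # fair 6-sided die: all outcomes equally likely
A = {2, 4, 6}             # event: roll an even number

# P(A) = |A| / |Omega| when all outcomes are equally likely
print(Fraction(len(A), len(omega)))  # 1/2
```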
# TODO: Add many examples of various solution spaces and event spaces and calculate the probabilities