

Conditional probability

Although everybody knows the formula of conditional probability, it is useful to derive it here. The notation is $ P(E\,\vert\,H)$, to be read ``probability of $ E$ given $ H$'', where $ H$ stands for hypothesis. This means: the probability that $ E$ will occur under the hypothesis that $ H$ has occurred.

The event $ E\,\vert\,H$ can take three values, illustrated in the sketch below:

TRUE:
if $ E$ is TRUE and $ H$ is TRUE;
FALSE:
if $ E$ is FALSE and $ H$ is TRUE;
UNDETERMINED:
if $ H$ is FALSE; in this case we are simply not interested in what happens to $ E$. In terms of betting, the bet is invalidated and no one loses or gains.
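To make the three-valued logic concrete, here is a minimal Python sketch (an illustration added here, not part of the original derivation) that evaluates $ E\,\vert\,H$ for each outcome of a fair die, with the assumed events $ E$ = ``the outcome is even'' and $ H$ = ``the outcome is at most 3'':

    def conditional_event(outcome):
        """Value of the conditional event E|H for one die outcome."""
        E = (outcome % 2 == 0)     # E: the outcome is even
        H = (outcome <= 3)         # H: the outcome is at most 3
        if not H:
            return "UNDETERMINED"  # H false: the bet is invalidated
        return "TRUE" if E else "FALSE"

    for outcome in range(1, 7):
        print(outcome, conditional_event(outcome))

For the outcomes 4, 5 and 6 the hypothesis fails and the bet is simply called off.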
Then $ P(E)$ can be written $ P(E\,\vert\,\Omega)$, to state explicitly that it is the probability of $ E$ whatever happens to the rest of the world ($ \Omega$ means all possible events). We realize immediately that this condition is really too vague and nobody would bet a penny on such a statement. The reason for usually writing $ P(E)$ is that many conditions are implicitly, and reasonably, assumed in most circumstances. In the classical problems of coins and dice, for example, one assumes that they are regular. In the example of the energy loss, it was implicit (``obvious'') that the high voltage was on (at which voltage?) and that HERA was running (under which conditions?). But one has to take care: many riddles are based on the fact that one tries to find a solution which is valid under stricter conditions than those explicitly stated in the question, and many people make bad business deals by signing contracts in which what ``was obvious'' was not explicitly stated.

In order to derive the formula of conditional probability let us assume for a moment that it is reasonable to talk about ``absolute probability'' $ P(E)=P(E\,\vert\,\Omega)$, and let us rewrite

\begin{align*}
P(E) \equiv P(E\,\vert\,\Omega) &\;\underset{\bf a}{=}\; P(E\cap\Omega)\\
&\;\underset{\bf b}{=}\; P\left(E\cap (H \cup \overline{H})\right)\\
&\;\underset{\bf c}{=}\; P\left((E\cap H) \cup (E\cap\overline{H})\right)\\
&\;\underset{\bf d}{=}\; P(E\cap H) + P(E\cap\overline{H})\,,\tag{3.1}
\end{align*}

where the result has been achieved through the following steps:
(a)
$ E$ implies $ \Omega$ (i.e. $ E\subseteq \Omega$) and hence $ E\cap\Omega=E$;
(b)
the complementary events $ H$ and $ \overline{H}$ make a finite partition of $ \Omega$, i.e. $ H \cup \overline{H} = \Omega$;
(c)
distributive property;
(d)
axiom 3.
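Decomposition (3.1) lends itself to a quick numerical check. The following Python sketch (with arbitrarily chosen events on a fair die: $ E$ = even outcome, $ H$ = outcome at most 3) verifies it with exact fractions:

    from fractions import Fraction

    omega = set(range(1, 7))                  # all outcomes of a fair die
    E = {x for x in omega if x % 2 == 0}      # E: the outcome is even
    H = {x for x in omega if x <= 3}          # H: the outcome is at most 3
    H_bar = omega - H                         # complement of H

    def P(A):                                 # probability by counting
        return Fraction(len(A), len(omega))

    # Eq. (3.1): P(E) = P(E and H) + P(E and not-H)
    assert P(E) == P(E & H) + P(E & H_bar)
    print(P(E), "=", P(E & H), "+", P(E & H_bar))   # 1/2 = 1/6 + 1/3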
The final result of (3.1) is very simple: $ P(E)$ is equal to the probability that $ E$ occurs and $ H$ also occurs, plus the probability that $ E$ occurs but $ H$ does not occur. To obtain $ P(E\,\vert\,H)$ we just get rid of the part of $ E$ that lies outside $ H$ (i.e. $ E\cap\overline{H}$) and renormalize the probability by dividing by $ P(H)$, assumed to be different from zero. This guarantees that if $ E=H$ then $ P(H\,\vert\,H)=1$. We get, finally, the well-known formula

$\displaystyle P(E\,\vert\,H) = \frac{P(E\cap H)}{P(H)}\hspace{1.0cm}[P(H)\ne 0]\,.$ (3.2)
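As a quick illustration of (3.2), the following fragment (same assumed die events as above) computes $ P(E\,\vert\,H)$ both by the formula and by direct counting in the restricted sample space $ H$:

    from fractions import Fraction

    omega = set(range(1, 7))
    E = {x for x in omega if x % 2 == 0}          # E: the outcome is even
    H = {x for x in omega if x <= 3}              # H: the outcome is at most 3

    def P(A):
        return Fraction(len(A), len(omega))

    by_formula  = P(E & H) / P(H)                 # Eq. (3.2)
    by_counting = Fraction(len(E & H), len(H))    # restrict the sample space to H
    assert by_formula == by_counting
    print(by_formula)                             # 1/3: only 2 is even among {1, 2, 3}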

In the most general (and realistic) case, where both $ E$ and $ H$ are conditioned by the occurrence of a third event $ H_\circ$, the formula becomes

$\displaystyle P(E\,\vert\,H, H_\circ) = \frac{P(E\cap H\,\vert\,H_\circ)}{P(H\,\vert\,H_\circ)}\hspace{1.0cm}[P(H\,\vert\,H_\circ)\ne 0]\,.$ (3.3)
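As a sketch of (3.3) in the same toy setting, one can add an assumed conditioning event $ H_\circ$ (here, arbitrarily, ``the outcome is at most 4''); both sides of the formula must then agree:

    from fractions import Fraction

    omega = set(range(1, 7))
    E  = {x for x in omega if x % 2 == 0}     # E: the outcome is even
    H  = {x for x in omega if x <= 3}         # H: the outcome is at most 3
    H0 = {x for x in omega if x <= 4}         # H0: assumed conditioning event

    def P_given(A, C):                        # P(A|C) by counting within C
        return Fraction(len(A & C), len(C))

    lhs = P_given(E, H & H0)                  # P(E | H, H0)
    rhs = P_given(E & H, H0) / P_given(H, H0) # Eq. (3.3)
    assert lhs == rhs
    print(lhs)                                # 1/3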

Usually we shall make use of (3.2) (which means $ H_\circ=\Omega$), assuming that $ \Omega$ has been properly chosen. We should also remember that (3.2) can be solved for $ P(E\cap H)$, obtaining the well-known

$\displaystyle P(E\cap H) = P(E\,\vert\,H)P(H)\,,$ (3.4)

and by symmetry

$\displaystyle P(E\cap H) = P(H\,\vert\,E)P(E)\,.$ (3.5)
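Both product-rule forms can be checked in the same toy setting. In the sketch below the conditional probabilities are obtained by direct counting in the restricted sample space, so the assertions are a genuine check of (3.4) and (3.5) rather than a tautology:

    from fractions import Fraction

    omega = set(range(1, 7))
    E = {x for x in omega if x % 2 == 0}      # E: the outcome is even
    H = {x for x in omega if x <= 3}          # H: the outcome is at most 3

    def P(A):
        return Fraction(len(A), len(omega))

    def P_given(A, C):                        # P(A|C) by counting within C
        return Fraction(len(A & C), len(C))

    assert P(E & H) == P_given(E, H) * P(H)   # Eq. (3.4)
    assert P(E & H) == P_given(H, E) * P(E)   # Eq. (3.5)
    print(P(E & H))                           # 1/6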

We recall that two events are called independent if

$\displaystyle P(E\cap H) = P(E)P(H)\,.$ (3.6)

This is equivalent to saying that $ P(E\,\vert\,H) = P(E)$ and $ P(H\,\vert\,E)=P(H)$, i.e. the knowledge that one event has occurred does not change the probability of the other. If $ P(E\,\vert\,H) \ne P(E)$, then the events $ E$ and $ H$ are correlated. In particular, they are positively correlated if $ P(E\,\vert\,H) > P(E)$ and negatively correlated if $ P(E\,\vert\,H) < P(E)$.
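A final numeric illustration (with assumed events on two fair coin tosses, not taken from the text): the first pair of events below is independent, while the second is positively correlated:

    from fractions import Fraction
    from itertools import product

    omega = set(product("HT", repeat=2))      # two fair coin tosses
    E = {w for w in omega if w[0] == "H"}     # E: first coin shows heads
    H = {w for w in omega if w[1] == "H"}     # H: second coin shows heads
    J = {w for w in omega if "H" in w}        # J: at least one head

    def P(A):
        return Fraction(len(A), len(omega))

    print(P(E & H) == P(E) * P(H))            # True: E and H independent
    print(P(E & J) > P(E) * P(J))             # True: E and J positively correlated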