

Conditional probability

Although everybody knows the formula of conditional probability, it is useful to derive it here. The notation is $ P(E\,\vert\,H)$, to be read ``probability of $ E$ given $ H$'', where $ H$ stands for hypothesis. This means: the probability that $ E$ will occur under the hypothesis that $ H$ has occurred.

The event $ E\,\vert\,H$ can take three values, illustrated in the sketch below:

TRUE:
if $ E$ is TRUE and $ H$ is TRUE;
FALSE:
if $ E$ is FALSE and $ H$ is TRUE;
UNDETERMINED:
if $ H$ is FALSE; in this case we are simply not interested in what happens to $ E$. In terms of betting, the bet is invalidated and no one loses or gains.
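To make the three-valued logic concrete, here is a minimal Python sketch (an illustration added here, not part of the original derivation) that evaluates $ E\,\vert\,H$ for each outcome of a fair die, with the assumed events $ E$ = ``the outcome is even'' and $ H$ = ``the outcome is at most 3'':

    def conditional_event(outcome):
        """Value of the conditional event E|H for one die outcome."""
        E = (outcome % 2 == 0)     # E: the outcome is even
        H = (outcome <= 3)         # H: the outcome is at most 3
        if not H:
            return "UNDETERMINED"  # H false: the bet is invalidated
        return "TRUE" if E else "FALSE"

    for outcome in range(1, 7):
        print(outcome, conditional_event(outcome))

For the outcomes 4, 5 and 6 the hypothesis fails and the bet is simply called off.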
Then $ P(E)$ can be written $ P(E\,\vert\,\Omega)$, to state explicitly that it is the probability of $ E$ whatever happens to the rest of the world ($ \Omega$ means all possible events). We realize immediately that this condition is really too vague and nobody would bet a penny on such a statement. The reason for usually writing $ P(E)$ is that many conditions are implicitly, and reasonably, assumed in most circumstances. In the classical problems of coins and dice, for example, one assumes that they are regular. In the example of the energy loss, it was implicit (``obvious'') that the high voltage was on (at which voltage?) and that HERA was running (under which conditions?). But one has to take care: many riddles are based on the fact that one tries to find a solution which is valid under stricter conditions than those explicitly stated in the question, and many people make bad business deals by signing contracts in which what ``was obvious'' was not explicitly stated.

In order to derive the formula of conditional probability let us assume for a moment that it is reasonable to talk about ``absolute probability'' $ P(E)=P(E\,\vert\,\Omega)$, and let us rewrite

\begin{align*}
P(E) \equiv P(E\,\vert\,\Omega) &\;\underset{\bf a}{=}\; P(E\cap\Omega)\\
&\;\underset{\bf b}{=}\; P\left(E\cap (H \cup \overline{H})\right)\\
&\;\underset{\bf c}{=}\; P\left((E\cap H) \cup (E\cap\overline{H})\right)\\
&\;\underset{\bf d}{=}\; P(E\cap H) + P(E\cap\overline{H})\,,\tag{3.1}
\end{align*}

where the result has been achieved through the following steps:
(a)
$ E$ implies $ \Omega$ (i.e. $ E\subseteq \Omega$) and hence $ E\cap\Omega=E$;
(b)
the complementary events $ H$ and $ \overline{H}$ make a finite partition of $ \Omega$, i.e. $ H \cup \overline{H} = \Omega$;
(c)
distributive property;
(d)
axiom 3.
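Decomposition (3.1) lends itself to a quick numerical check. The following Python sketch (with arbitrarily chosen events on a fair die: $ E$ = even outcome, $ H$ = outcome at most 3) verifies it with exact fractions:

    from fractions import Fraction

    omega = set(range(1, 7))                  # all outcomes of a fair die
    E = {x for x in omega if x % 2 == 0}      # E: the outcome is even
    H = {x for x in omega if x <= 3}          # H: the outcome is at most 3
    H_bar = omega - H                         # complement of H

    def P(A):                                 # probability by counting
        return Fraction(len(A), len(omega))

    # Eq. (3.1): P(E) = P(E and H) + P(E and not-H)
    assert P(E) == P(E & H) + P(E & H_bar)
    print(P(E), "=", P(E & H), "+", P(E & H_bar))   # 1/2 = 1/6 + 1/3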
The final result of (3.1) is very simple: $ P(E)$ is equal to the probability that $ E$ occurs and $ H$ also occurs, plus the probability that $ E$ occurs but $ H$ does not occur. To obtain $ P(E\,\vert\,H)$ we just get rid of the part of $ E$ that lies outside $ H$ (i.e. $ E\cap\overline{H}$) and renormalize the probability by dividing by $ P(H)$, assumed to be different from zero. This guarantees that if $ E=H$ then $ P(H\,\vert\,H)=1$. We get, finally, the well-known formula

$\displaystyle P(E\,\vert\,H) = \frac{P(E\cap H)}{P(H)}\hspace{1.0cm}[P(H)\ne 0]\,.$ (3.2)
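As a quick illustration of (3.2), the following fragment (same assumed die events as above) computes $ P(E\,\vert\,H)$ both by the formula and by direct counting in the restricted sample space $ H$:

    from fractions import Fraction

    omega = set(range(1, 7))
    E = {x for x in omega if x % 2 == 0}          # E: the outcome is even
    H = {x for x in omega if x <= 3}              # H: the outcome is at most 3

    def P(A):
        return Fraction(len(A), len(omega))

    by_formula  = P(E & H) / P(H)                 # Eq. (3.2)
    by_counting = Fraction(len(E & H), len(H))    # restrict the sample space to H
    assert by_formula == by_counting
    print(by_formula)                             # 1/3: only 2 is even among {1, 2, 3}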

In the most general (and realistic) case, where both $ E$ and $ H$ are conditioned by the occurrence of a third event $ H_\circ$, the formula becomes

$\displaystyle P(E\,\vert\,H, H_\circ) = \frac{P(E\cap H\,\vert\,H_\circ)}{P(H\,\vert\,H_\circ)}\hspace{1.0cm}[P(H\,\vert\,H_\circ)\ne 0]\,.$ (3.3)
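As a sketch of (3.3) in the same toy setting, one can add an assumed conditioning event $ H_\circ$ (here, arbitrarily, ``the outcome is at most 4''); both sides of the formula must then agree:

    from fractions import Fraction

    omega = set(range(1, 7))
    E  = {x for x in omega if x % 2 == 0}     # E: the outcome is even
    H  = {x for x in omega if x <= 3}         # H: the outcome is at most 3
    H0 = {x for x in omega if x <= 4}         # H0: assumed conditioning event

    def P_given(A, C):                        # P(A|C) by counting within C
        return Fraction(len(A & C), len(C))

    lhs = P_given(E, H & H0)                  # P(E | H, H0)
    rhs = P_given(E & H, H0) / P_given(H, H0) # Eq. (3.3)
    assert lhs == rhs
    print(lhs)                                # 1/3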

Usually we shall make use of (3.2) (which means $ H_\circ=\Omega$), assuming that $ \Omega$ has been properly chosen. We should also remember that (3.2) can be solved for $ P(E\cap H)$, obtaining the well-known

$\displaystyle P(E\cap H) = P(E\,\vert\,H)P(H)\,,$ (3.4)

and by symmetry

$\displaystyle P(E\cap H) = P(H\,\vert\,E)P(E)\,.$ (3.5)
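Both product-rule forms can be checked in the same toy setting. In the sketch below the conditional probabilities are obtained by direct counting in the restricted sample space, so the assertions are a genuine check of (3.4) and (3.5) rather than a tautology:

    from fractions import Fraction

    omega = set(range(1, 7))
    E = {x for x in omega if x % 2 == 0}      # E: the outcome is even
    H = {x for x in omega if x <= 3}          # H: the outcome is at most 3

    def P(A):
        return Fraction(len(A), len(omega))

    def P_given(A, C):                        # P(A|C) by counting within C
        return Fraction(len(A & C), len(C))

    assert P(E & H) == P_given(E, H) * P(H)   # Eq. (3.4)
    assert P(E & H) == P_given(H, E) * P(E)   # Eq. (3.5)
    print(P(E & H))                           # 1/6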

We recall that two events are called independent if

$\displaystyle P(E\cap H) = P(E)P(H)\,.$ (3.6)

This is equivalent to saying that $ P(E\,\vert\,H) = P(E)$ and $ P(H\,\vert\,E)=P(H)$, i.e. the knowledge that one event has occurred does not change the probability of the other. If $ P(E\,\vert\,H) \ne P(E)$, then the events $ E$ and $ H$ are correlated. In particular, they are positively correlated if $ P(E\,\vert\,H) > P(E)$ and negatively correlated if $ P(E\,\vert\,H) < P(E)$.
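A final numeric illustration (with assumed events on two fair coin tosses, not taken from the text): the first pair of events below is independent, while the second is positively correlated:

    from fractions import Fraction
    from itertools import product

    omega = set(product("HT", repeat=2))      # two fair coin tosses
    E = {w for w in omega if w[0] == "H"}     # E: first coin shows heads
    H = {w for w in omega if w[1] == "H"}     # H: second coin shows heads
    J = {w for w in omega if "H" in w}        # J: at least one head

    def P(A):
        return Fraction(len(A), len(omega))

    print(P(E & H) == P(E) * P(H))            # True: E and H independent
    print(P(E & J) > P(E) * P(J))             # True: E and J positively correlated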