Updating beliefs

Let us come finally to proposition (3): rational people are ready to change their opinion in front of `enough' experimental evidence. What is enough? It is quite well understood that it all depends on the prior plausibility of the claim. This is the reason why practically nobody took the CDF claim very seriously (not even most members of the collaboration, and I know several of them), while practically everybody is now convinced that the Higgs boson has finally been caught at CERN [31] - no matter that the so-called `statistical significance' is more or less the same in both cases (and was, by the way, more or less the same for the excitement at CERN described in footnote 11) - nevertheless, the degree of belief that a Higgs boson has been found at CERN is substantially different!

Probability theory teaches us how to update the degrees of belief in the different causes that might be responsible for an `event' (read `experimental data'), as simply explained by Laplace in his Philosophical essay [17] (`VI principle', on page 17 of the original book, available at book.google.com - boldface is mine):

``The greater the probability of an observed event given any one of a number of causes to which that event may be attributed, the greater the likelihood of that cause {given that event}. The probability of the existence of any one of these causes {given the event} is thus a fraction whose numerator is the probability of the event given the cause, and whose denominator is the sum of similar probabilities, summed over all causes. If the various causes are not equally probable a priori, it is necessary, instead of the probability of the event given each cause, to use the product of this probability and the possibility of the cause itself. This is the fundamental principle of that branch of the analysis of chance that consists of reasoning a posteriori from events to causes.''
This is the famous Bayes' theorem (although Bayes did not really derive this formula, but only developed a similar inferential reasoning for the parameter of Bernoulli trials), which we rewrite in mathematical terms [omitting the subjective `background condition' $I_s(t)$, which should appear - and be the same! - in all probabilities of the same equation] as

$$P(C_i\,\vert\,E) = \frac{P(E\,\vert\,C_i)\cdot P(C_i)}{\sum_j P(E\,\vert\,C_j)\cdot P(C_j)}\,.$$
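Laplace's rule can be tried on invented numbers. The causes, priors, and likelihoods below are hypothetical, chosen only to show the mechanics of the formula:

```python
# Three hypothetical causes C_1, C_2, C_3 (numbers invented for illustration)
priors = [0.70, 0.20, 0.10]        # P(C_j): prior degrees of belief
likelihoods = [0.10, 0.50, 0.90]   # P(E | C_j): how well each cause explains E

# Denominator of Laplace's formula: sum_j P(E|C_j) * P(C_j)
evidence = sum(l * p for l, p in zip(likelihoods, priors))

# Posterior for each cause: P(C_i | E)
posteriors = [l * p / evidence for l, p in zip(likelihoods, priors)]

print([round(p, 3) for p in posteriors])   # -> [0.269, 0.385, 0.346]
```

Note how the a-priori favourite cause $C_1$ loses ground to $C_2$ and $C_3$, which explain the event better, yet the priors still matter: $C_3$, despite the highest likelihood, does not end up as the most probable cause.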

This formula teaches us that what matters is not (only) how probable $E$ is in the light of $C_i$ (unless it is impossible, in which case $C_i$ is ruled out - it is falsified, to use a Popperian expression), but rather how probable $E$ is under $C_i$ compared with the alternative causes. The essence of the Laplace(-Bayes) rule can be emphasized by writing the above formula, for any pair of causes $C_i$ and $C_j$, as

$$\frac{P(C_i\,\vert\,E)}{P(C_j\,\vert\,E)} = \frac{P(E\,\vert\,C_i)}{P(E\,\vert\,C_j)} \times \frac{P(C_i)}{P(C_j)}\,;$$

that is, the prior odds are updated by the observed effect $E$ by a factor (the `Bayes factor') given by the ratio of the probabilities of the two causes to produce that effect.
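This odds form makes the CDF-vs-Higgs point quantitative. In this illustrative sketch (the numbers are invented, not taken from either experiment), the same Bayes factor - i.e. the same `statistical significance' - moves the odds to very different final values depending on the prior odds of the hypothesis:

```python
# Same evidence, different priors (all numbers hypothetical):
bayes_factor = 20.0                  # P(E|C_i) / P(E|C_j), identical in both cases

for prior_odds in (1.0, 1e-4):       # plausible claim vs initially implausible one
    posterior_odds = bayes_factor * prior_odds
    print(f"prior odds {prior_odds:g} -> posterior odds {posterior_odds:g}")
# prior odds 1      -> posterior odds 20     (now strongly believed)
# prior odds 0.0001 -> posterior odds 0.002  (still practically disbelieved)
```

The multiplication is trivial, but it is exactly the point of the text: identical evidence can leave one hypothesis convincing and the other still unbelievable.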

In particular, we learn that:

In particular, the latter points look rather trivial, as can be seen from the `senator vs. woman' example of the abstract. But already the Gaussian-generator example there might confuse somebody, while the `$\mu$ vs. $x$' example is a typical source of misunderstandings, also because in statistical jargon $f(x\,\vert\,\mu)$ is called the `likelihood' function of $\mu$, and many practitioners think it describes the probabilistic assessment of the possible values of $\mu$ (again a misuse of words! - for further comments see Appendix H of [5]).
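The `$\mu$ vs. $x$' confusion can be sketched numerically (a hypothetical Gaussian example, not taken from the paper): reading $f(x\,\vert\,\mu)$ as a function of $\mu$ does not by itself give probabilities for $\mu$; the probabilistic conclusion also depends on the prior, as the Laplace rule requires.

```python
import math

x_obs, sigma = 2.0, 1.0
mus = [i * 0.01 for i in range(-500, 1001)]          # grid of mu values

def lik(mu):
    # f(x_obs | mu), Gaussian, as a function of mu (the `likelihood')
    return math.exp(-0.5 * ((x_obs - mu) / sigma) ** 2)

flat = [1.0 for mu in mus]                           # uniform prior on mu
positive = [1.0 if mu >= 0 else 0.0 for mu in mus]   # prior: mu cannot be negative

def posterior(prior):
    # Laplace's rule on the grid: normalize likelihood * prior
    weights = [lik(mu) * p for mu, p in zip(mus, prior)]
    norm = sum(weights)
    return [w / norm for w in weights]

p_flat, p_pos = posterior(flat), posterior(positive)
# Same likelihood function, different priors -> different beliefs about mu:
# p_pos assigns zero probability to mu < 0, p_flat does not.
```

The likelihood values are identical in both cases; only after multiplying by a prior and normalizing do they become statements of probability about $\mu$.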

Giulio D'Agostini 2012-01-02