The intellectual process of learning from observations can be
sketched as illustrated in figure 1.
From experimental data we wish to `determine' the value of
some physical quantities,
or to establish which theory `best describes'
the observed phenomena. Although these two
tasks are usually
seen as separate issues, and analyzed with different
mathematical tools, they
can be viewed as two subclasses of the same process:
*inferring hypotheses from observations*.
What differs between the two kinds of inference
is the number of hypotheses that enters the game:
a discrete, usually small number when dealing with
*theory comparison*;
a large, virtually infinite number
when *inferring the value of physical quantities*.

In general, given some data (*past observations*),
we wish to:

- select a theory and determine its parameters with the aim of describing and `understanding' the physical world;
- predict
*future observations* (which, once they are recorded, join the set of past observations to corroborate or weaken our confidence in each theory and its parameters).

Given this cause-effect scheme, having observed an effect, we cannot be sure about its cause. (This is what happens to most of the effects of figure 3 -- an effect that can only be due to a single cause has to be considered an exception, at least in the inferential problems scientists typically meet.)
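In quantitative terms, the plausibility of each possible cause $C_i$ in the light of an observed effect $E$ is given by Bayes' theorem, anticipated here in generic notation (the symbols $C_i$ and $E$ are placeholders, not taken from the figures):

$$P(C_i \mid E) = \frac{P(E \mid C_i)\, P(C_i)}{\sum_j P(E \mid C_j)\, P(C_j)}\,.$$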

**Example 1.** As a simple example, think about the effect
identified by the number $x$
resulting from one of the following random generators,
chosen at random:
$H_1$ = ``a Gaussian generator with $\mu_1$ and $\sigma_1$'';
$H_2$ = ``a Gaussian generator with $\mu_2$ and $\sigma_2$'';
$H_3$ = ``an exponential generator with $\tau$''
($\tau$ stands for the expected value of the exponential distribution;
$\mu$ and $\sigma$ are the usual parameters of the Gaussian distribution).
Our problem, stated in intuitive terms, is to find out which
hypothesis might have caused $x$: $H_1$,
$H_2$ or $H_3$? Note that none of the hypotheses of this example
can be excluded and, therefore,
there is no way to reach a boolean conclusion. We can
only state, somehow, our
*rational preferences*, based on the experimental result and our best knowledge of the behavior of each *model*.
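To make the idea concrete, here is a minimal sketch in Python of how such rational preferences could be quantified. Assuming equal prior probabilities for the three generators, the preference for each hypothesis is proportional to the probability density it assigns to the observed number. The parameter values below are purely illustrative placeholders, not those of the original example.

```python
import math

def gauss_pdf(x, mu, sigma):
    """Density of a Gaussian with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def expo_pdf(x, tau):
    """Density of an exponential with expected value tau (defined for x >= 0)."""
    return math.exp(-x / tau) / tau if x >= 0 else 0.0

x = 1.5  # the observed number (illustrative)

# Likelihood of the observation under each hypothesis
# (all parameter values are hypothetical, chosen only for the demo).
likelihoods = {
    "H1 (Gaussian)":    gauss_pdf(x, 0.0, 1.0),
    "H2 (Gaussian)":    gauss_pdf(x, 3.0, 2.0),
    "H3 (exponential)": expo_pdf(x, 2.0),
}

# With equal priors, the posterior probability of each hypothesis is
# just its likelihood divided by the sum of all likelihoods.
total = sum(likelihoods.values())
posteriors = {h: l / total for h, l in likelihoods.items()}

for h, p in posteriors.items():
    print(f"{h}: {p:.3f}")
```

None of the three posterior probabilities is exactly zero, which mirrors the point above: no hypothesis can be excluded, and we can only rank our rational preferences.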

The status of uncertainty
does not prevent us from
doing Science. Indeed, in Feynman's words,
*``it is scientific
only to say what is more likely and what is less
likely''* [3].
Therefore, it becomes crucial to
learn how to deal quantitatively with probabilities of causes,
because the *``problem(s) in the probability of causes ...
may be said to be the essential problem(s) of
the experimental method''* (Poincaré [4]).

However, and unfortunately, it is a matter of fact that
nowadays most scientists are
incapable of reasoning correctly about probabilities of causes,
probabilities of hypotheses, probabilities
of values of quantities,
and so on. This lack of expertise is due to
the fact that we have been educated and trained
with a statistical theory in which the very concept of probability
of hypotheses is absent, although we naturally tend to think and
express ourselves in such terms. In other words, the common prejudice
is that probability *is* the long-term relative frequency;
on the other hand, probabilistic statements about hypotheses
(or statements implying,
anyway, a probabilistic meaning) are constantly made by
the same persons, statements that are
irreconcilable with their own definition of
probability [5].
This
mismatch between natural thinking and cultural
over-structure produces mistakes in scientific judgment,
as discussed e.g. in Refs. [1,5].

Another prejudice, rather common among scientists, is that, when they deal with hypotheses, `they think they reason' according to the falsificationist scheme: hence, the hypothesis tests of conventional statistics are approached with a genuine intent of proving/falsifying something. For this reason we need to briefly review these concepts, in order to show why they are less satisfactory than we might naïvely think. (The reader is assumed to be familiar with the concepts of hypothesis tests, though at an elementary level: null hypothesis, one- and two-tail tests, acceptance/rejection, significance, type 1 and type 2 errors, and so on.)