One of the problems with this term is that it tends to have several
meanings, and thus to create misunderstandings.
In plain English `likelihood' is ``1. the condition of being likely or probable; probability'', or ``2. something that is
probable''^{58};
but also ``3. (Mathematics & Measurements / Statistics) the probability of a given sample being randomly drawn regarded as a function of the parameters of the population''.

Technically, with reference to the example of the previous appendix, the likelihood is simply $f(x\,|\,M_i)$, where $x$ is fixed (the observation) and $M_i$ is the `parameter'. Then it can take two values, $f(x\,|\,M_1)$ and $f(x\,|\,M_2)$.

If, instead of
only two models, we had a continuity of models,
for example
the family of
all Gaussian distributions characterized by central value
$\mu$ and `effective width' (standard deviation) $\sigma$, our likelihood
would be
$f(x\,|\,\mu,\sigma)$, i.e.

$${\cal L}(\mu,\sigma\,;\,x) = f(x\,|\,\mu,\sigma)\,, \eqno(37)$$

written in this way to remind us that: 1) a likelihood is a function of the model parameters and not of the data; 2) it is not a probability (density) of the parameters.
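A minimal numerical sketch of this reading of Eq. (37), with an invented observation ($x = 3$) and invented parameter values: the Gaussian density is evaluated at the *fixed* observation and scanned over the parameter $\mu$.

```python
import math

def gaussian_likelihood(mu, sigma, x):
    """L(mu, sigma; x) = f(x | mu, sigma): the Gaussian density
    evaluated at the fixed observation x, read as a function of
    the parameters (mu, sigma)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

x_obs = 3.0  # the observation is held fixed ...
# ... while the parameters vary: the likelihood is a function of (mu, sigma)
for mu in (2.0, 3.0, 4.0):
    print(mu, gaussian_likelihood(mu, sigma=1.0, x=x_obs))
```

Note that nothing forces this function to integrate to 1 over $\mu$ and $\sigma$: it is a density in $x$, not in the parameters.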

In principle there is nothing wrong with giving a special name
to this function of the parameters. But, frankly, I would have preferred
that statistics gurus had named it after their dog
or their lover, rather than call it
`likelihood'.^{59} The problem is that it is very common to
hear students, teachers and researchers
explain that the `likelihood' tells
``how likely the parameters are'' (this is *the probability
of the parameters*, not the `likelihood'!). Or they say,
with reference to our example,
``it is the probability that $x$ comes from $\mu$''
(again, this expression would be the probability of
$\mu$ given $x$, and not the probability of $x$ given the
model!). Imagine if we have only a single $\mu_0$ in the game:
$x$ comes with certainty from $\mu_0$,
although $\mu_0$ does not yield $x$ with certainty.^{60}
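This asymmetry is easy to check numerically. In the sketch below (observation and model values are invented), a single Gaussian model is `in the game': its posterior probability is 1 by construction, while its likelihood is a small number, nowhere near 1.

```python
import math

def gauss(x, mu, sigma=1.0):
    # Gaussian density f(x | mu, sigma)
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

x_obs = 3.0
models = {"mu0": 0.0}  # only one model "in the game" (hypothetical numbers)

# Likelihood of the single model: small -- mu0 does not yield x with certainty
lik = gauss(x_obs, models["mu0"])

# Posterior probability of the model: 1 by construction -- x certainly came from it
norm = sum(gauss(x_obs, mu) for mu in models.values())
posterior = gauss(x_obs, models["mu0"]) / norm

print(lik)        # small: far from 1
print(posterior)  # exactly 1.0
```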

Several methods in `conventional statistics'
somehow use the likelihood to decide which model
or which set of parameters best describes the data.
Some even use the likelihood ratio (our Bayes factor),
or the logarithm of it (something equal or proportional,
depending on the base, to the weight of evidence we have indicated
here by JL).
The most famous method of the series
is the *maximum likelihood principle*.
As is easy to guess from its name, it states
that the *best estimates* of the parameters
are those which maximize the likelihood.
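The principle can be sketched numerically for a Gaussian model with known width (the data values below are made up): a crude scan of the log-likelihood recovers the well-known result that the maximum-likelihood estimate of $\mu$ is the sample mean.

```python
import math

data = [2.1, 2.9, 3.4, 2.6]  # hypothetical observations; sigma assumed known = 1

def log_likelihood(mu, sample, sigma=1.0):
    # log L(mu) = sum of log f(x_i | mu, sigma); the log is a monotonic map,
    # so it has the same maximum as the likelihood itself
    return sum(-0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))
               for x in sample)

# crude grid search for the maximum-likelihood estimate
grid = [i / 1000 for i in range(1000, 5000)]
mu_hat = max(grid, key=lambda mu: log_likelihood(mu, data))
print(mu_hat)                # the ML estimate ...
print(sum(data) / len(data)) # ... equals the sample mean, as expected for a Gaussian
```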

All that *seems* reasonable and in agreement
with what has been expounded here, but it is not
quite so. First, for those who support this approach,
likelihoods are not just a part of the inferential
tool: they are everything. Priors are completely
neglected, more or less because of the objections
in footnote 9. This can be acceptable
if the evidence is overwhelming, but this is not always the case.
Unfortunately, as it is now easy to understand, neglecting
priors is mathematically
equivalent to considering the alternative hypotheses equally likely!
A consequence of this statistical miseducation
(most statistics courses in universities all around the world
teach only `conventional statistics', and cover probabilistic
inference never, little, or badly)
is that too many people one would never suspect
fail to solve the AIDS problem of appendix B,
or confuse the likelihood with the probability of the
hypothesis, resulting in misleading scientific claims
(see also footnote 60 and Ref. [3]).
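To see concretely what neglecting the prior amounts to, here is a minimal Bayes-theorem sketch in the spirit of the appendix-B problem; the test performance and the 1-in-1000 prevalence are invented numbers for illustration, not necessarily the figures used there.

```python
# Hypothetical numbers: a test with P(positive | infected) = 1.0 and
# P(positive | healthy) = 0.002, applied to a population where only
# 1 person in 1000 is infected.
p_pos_inf, p_pos_healthy = 1.0, 0.002

def posterior(prior):
    # Bayes' theorem: P(infected | positive)
    num = p_pos_inf * prior
    return num / (num + p_pos_healthy * (1 - prior))

# With the realistic prior, the likelihood ratio of 500 is tempered by rarity:
print(posterior(0.001))  # about 1/3 -- far from "practically certain"

# "Neglecting the prior" is the same as silently setting it to 1/2:
print(posterior(0.5))    # above 0.99 -- the misleading conclusion
```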

The second difference is that, since ``there are no priors'',
the result cannot have a probabilistic meaning, as
is
openly recognized by the promoters of this method,
who, in fact, do not admit that we can talk about probabilities of causes
(but most practitioners seem unaware of this
`little philosophical detail', also because frequentist
gurus, having difficulty explaining what the outcomes
of their methods mean, say they are `probabilities',
but in quote marks!^{61}).
As a consequence, the resulting `error analysis',
which in human terms means assigning different
beliefs to different values of the parameters,
is cumbersome. In practice the results are reasonable only
if the possible values of the parameters are
initially equally likely and the `likelihood function' has
a `kind' (well-behaved) shape (for more details see chapters 1 and 12
of Ref. [3]).
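A toy comparison (all numbers invented) of what happens when the first condition fails, i.e. the prior is *not* flat: the likelihood peaks at the observation, while the posterior mode is pulled toward the prior knowledge, so the maximum-likelihood answer and the probabilistic answer differ.

```python
import math

def gauss(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

x_obs = 2.0
grid = [i / 1000 for i in range(0, 6000)]  # mu scanned from 0 to 6

# Likelihood alone (what maximum likelihood looks at)
lik = {mu: gauss(x_obs, mu, 1.0) for mu in grid}

# A non-uniform prior: previous knowledge concentrated around mu = 4
prior = {mu: gauss(mu, 4.0, 0.5) for mu in grid}
post = {mu: lik[mu] * prior[mu] for mu in grid}

print(max(lik, key=lik.get))   # 2.0: the ML estimate ignores the prior
print(max(post, key=post.get)) # 3.6: the posterior mode is pulled toward it
```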