Strict falsificationism is definitely naïve, and its implementation via frequentistic hypothesis tests is seriously flawed from a logical point of view. Such tests `often work' -- unfortunately I cannot go through this point for lack of space and I refer to Section 10.8 of Ref. [1] -- if we want to use them to form a rough idea about whether it is worth investigating alternative hypotheses that would describe the data better.
Stated in different words, there is nothing to reproach -- and I admit I do it myself -- in calculating a $\chi^2$ to get an idea of the `distance' between a model and the data. What is not correct is to use the $\chi^2$, or any other test variable, to quantitatively assess our confidence in that model.
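To make the logical gap explicit (the notation here is only illustrative and is not taken from the previous sections): a tail probability computed from a test variable is a statement about possible data given the model, not about the model given the data,
\[
P(\chi^2 \ge \chi^2_{\rm obs}\,|\,H_0) \;\ne\; P(H_0\,|\,\mbox{data})\,,
\]
and only the latter, which requires priors and the likelihoods of the alternative hypotheses, quantifies our confidence in the model.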
An alternative way of reasoning, based on probability theory and therefore capable of quantifying our confidence consistently in formal probabilistic terms, has been briefly outlined. I hope that, also thanks to the simple examples, the paper has been able to convey some important points.
Another important class of applications, not discussed in this paper, concerns parametric inference. Essentially, one starts from Eq. (4), and all the rest is `just math', including the extensions to several dimensions and some `tricks' to get the computations done. It can easily be shown that standard methods are recovered as approximate applications of Bayesian inference under some well-defined assumptions that usually hold in routine applications. I refer to Refs. [1] and [12] for details on this point, as well as for other issues in Bayesian data analysis not discussed here, and for a rich bibliography.
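As a minimal sketch of such a recovery (the symbols below are illustrative and do not appear in the earlier sections): for a single parameter $\mu$, a Gaussian likelihood of known standard deviation $\sigma$ and a prior $p_0(\mu)$ that is practically flat where the likelihood is non-negligible,
\[
p(\mu\,|\,x) \;\propto\; p(x\,|\,\mu)\, p_0(\mu) \;\approx\; \frac{1}{\sqrt{2\pi}\,\sigma}\,
\exp\!\left[-\frac{(x-\mu)^2}{2\sigma^2}\right],
\]
so that the posterior expectation and standard deviation coincide with the maximum-likelihood estimate $\hat{\mu}=x$ and its usual uncertainty $\sigma$. This is the sense in which conventional results appear as approximate, special cases of Bayesian inference.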
Finally, I would like to add some epistemological remarks. The first one concerns falsificationism, since after my conference talk I received quite energetic reactions from colleagues who defended that principle.
From a probabilistic perspective, falsificationism is easily recovered if the likelihood vanishes, i.e. $p(\mbox{data}\,|\,H) = 0$. However, this condition is rarely met in scientific practice, if we speak rigorously (zero is a very committing value!).
I guess we just speak of falsificationism because that is what we have been taught is the `good thing', but without being aware of its implications. It seems to me that we actually think in terms of something that would better be named testability, which can be stated quite easily in the language
of probabilistic inference. Given a hypothesis $H_i$, testability requires that the likelihood is positive in a region $\Omega$ of the space $X$ of the achievable experimental outcomes of an experiment [i.e. $p(x\,|\,H_i) > 0$ for $x \in \Omega \subseteq X$] and is not trivially proportional to the likelihood of another hypothesis $H_j$ [i.e. $p(x\,|\,H_i) \ne \alpha\, p(x\,|\,H_j)$ for all $x$].
These are in fact the conditions for a hypothesis to gain in credibility,
via Bayes theorem,
over the alternative hypotheses in the light of the
expected experimental results.
The theory is definitively falsified if the experimental outcome falls in another region $\overline{\Omega}$ such that $p(x\,|\,H_i) = 0$ for all $x \in \overline{\Omega}$.
Therefore, falsificationism is just a special case of Bayesian inference.
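In formulae (again with illustrative notation), comparing two hypotheses in the light of the observed data,
\[
\frac{P(H_i\,|\,\mbox{data})}{P(H_j\,|\,\mbox{data})}
= \frac{p(\mbox{data}\,|\,H_i)}{p(\mbox{data}\,|\,H_j)}\times\frac{P(H_i)}{P(H_j)}\,,
\]
so that $H_i$ gains credibility over $H_j$ when the ratio of the likelihoods exceeds unity, and it is falsified in the strict sense only when $p(\mbox{data}\,|\,H_i)=0$, which drives its posterior probability to zero whatever the prior.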
Anyway, if there is a topic to which falsificationism can be applied in a strict sense, it is the use of conventional statistical methods, as I wrote elsewhere [1]: ``I simply apply scientific methodology to statistical reasoning in the same way as we apply it in Physics and in Science in general. If, for example, experiments show that Parity is violated, we can be disappointed, but we simply give up the principle of Parity Conservation, at least in the kind of interactions in which it has been observed that it does not hold. I do not understand why most of my colleagues do not behave in a similar way with the Maximum Likelihood principle, or with the `prescriptions' for building Confidence Intervals, both of which are known to produce absurd results.''
The second epistemological remark concerns another myth presumed by scientists, i.e. that ``since Galileo an accepted base of scientific research is the repeatability of experiments.''[13] (``This assumption justifies the Frequentistic definition of probability ...'' -- continues the author.) Clearly, according to this point of view, most things discussed in this workshop are `not scientific'. Fortunately, it is presently rather well accepted (also by the author of Ref. [13], I understand) that Science can also be based on a collection of individual facts that we cannot repeat at will, or that might happen naturally and beyond our control (but there is still someone claiming that fields like Geology, Evolutionary Biology and even Astrophysics are not Science!). The relevant thing that allows us to build up a rational scientific knowledge grounded on empirical observations is that we are capable of relating, though in a stochastic way and with the usual unavoidable uncertainties, our conjectures to experimental observations, no matter whether the phenomena occur spontaneously or arise under well controlled experimental conditions. In other words, we must be able to model, though approximately, the likelihoods that connect hypotheses to observations. This way of building the scientific edifice is excellently expressed in the title of one of the volumes issued to celebrate the Centennial of the Carnegie Institute of Washington [14]. This scientific building can be formally (and graphically) described by the so-called `Bayesian networks' or `belief networks' [2]. If you have never heard these expressions, try to google them and you will discover a new world (and how far behind we physicists are, mostly sticking to books and lecture notes that are too often copies of copies of obsolete books!).
It is a pleasure to thank the organizers for the stimulating workshop in such a wonderful location, and Paolo Agnoli for useful comments.