Strict falsificationism is definitely naïve, and its implementation via frequentistic hypothesis tests is seriously flawed from a logical point of view. Such tests `often work' -- unfortunately I cannot go through this point for lack of space and I refer to Section 10.8 of Ref. [1] -- if we want to use them to form a rough idea about whether it is worth investigating alternative hypotheses that would describe the data better.
Stated in different words, there is nothing to reproach -- and I admit I do it myself -- in calculating a $\chi^2$ to get an idea of the `distance' between a model and the data. What is not correct is to use the $\chi^2$, or any other test variable, to quantitatively assess our confidence in that model.
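To make the logical gap explicit (the notation here is only illustrative and is not taken from the previous sections): a tail probability computed from a test variable is a statement about possible data given the model, not about the model given the data,
\[
P(\chi^2 \ge \chi^2_{\rm obs}\,|\,H_0) \;\ne\; P(H_0\,|\,\mbox{data})\,,
\]
and only the latter, which requires priors and the likelihoods of the alternative hypotheses, quantifies our confidence in the model.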
An alternative way of reasoning, based on probability theory and therefore capable of quantifying our confidence consistently in formal probabilistic terms, has been briefly outlined. I hope that, also thanks to the simple examples, the paper has been able to convey some important points.
Another important class of applications, not discussed in this paper, concerns parametric inference. Essentially, one starts from Eq. (4), and all the rest is `just math', including the extensions to several dimensions and some `tricks' to get the computations done. It can easily be shown that standard methods are recovered as approximate applications of Bayesian inference under some well-defined assumptions that usually hold in routine applications. I refer to Refs. [1] and [12] for details on this point, as well as for other issues in Bayesian data analysis not discussed here, and for a rich bibliography.
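As a minimal sketch of such a recovery (the symbols below are illustrative and do not appear in the earlier sections): for a single parameter $\mu$, a Gaussian likelihood of known standard deviation $\sigma$ and a prior $p_0(\mu)$ that is practically flat where the likelihood is non-negligible,
\[
p(\mu\,|\,x) \;\propto\; p(x\,|\,\mu)\, p_0(\mu) \;\approx\; \frac{1}{\sqrt{2\pi}\,\sigma}\,
\exp\!\left[-\frac{(x-\mu)^2}{2\sigma^2}\right],
\]
so that the posterior expectation and standard deviation coincide with the maximum-likelihood estimate $\hat{\mu}=x$ and its usual uncertainty $\sigma$. This is the sense in which conventional results appear as approximate, special cases of Bayesian inference.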
Finally, I would like to add some epistemological remarks. The first one concerns falsificationism, since after my conference talk I received quite energetic reactions from colleagues who defended that principle.
From a probabilistic perspective, falsificationism is easily recovered if the likelihood vanishes, i.e. $p(\mbox{data}\,|\,H) = 0$. However, this condition is rarely met in scientific practice, if we speak rigorously (zero is a very committing value!).
I guess we just speak of falsificationism because that is what we have been taught is the `good thing', but without being aware of its implications. It seems to me that we actually think in terms of something that would better be named testability, which can be stated quite easily in the language
of probabilistic inference. Given a hypothesis $H_i$, testability requires that the likelihood is positive in a region $\Omega$ of the space $X$ of the achievable experimental outcomes of an experiment [i.e. $p(x\,|\,H_i) > 0$ for $x \in \Omega \subseteq X$] and is not trivially proportional to the likelihood of another hypothesis $H_j$ [i.e. $p(x\,|\,H_i) \ne \alpha\, p(x\,|\,H_j)$ for all $x$].
These are in fact the conditions for a hypothesis to gain in credibility,
via Bayes theorem,
over the alternative hypotheses in the light of the
expected experimental results.
The theory is definitively falsified if the experimental outcome falls in another region $\overline{\Omega}$ such that $p(x\,|\,H_i) = 0$ for all $x \in \overline{\Omega}$.
Therefore, falsificationism is just a special case of Bayesian inference.
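In formulae (again with illustrative notation), comparing two hypotheses in the light of the observed data,
\[
\frac{P(H_i\,|\,\mbox{data})}{P(H_j\,|\,\mbox{data})}
= \frac{p(\mbox{data}\,|\,H_i)}{p(\mbox{data}\,|\,H_j)}\times\frac{P(H_i)}{P(H_j)}\,,
\]
so that $H_i$ gains credibility over $H_j$ when the ratio of the likelihoods exceeds unity, and it is falsified in the strict sense only when $p(\mbox{data}\,|\,H_i)=0$, which drives its posterior probability to zero whatever the prior.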
Anyway, if there is a topic to which falsificationism can be applied in a strict sense, it is the use of conventional statistical methods, as I wrote elsewhere [1]: ``I simply apply scientific methodology to statistical reasoning in the same way as we apply it in Physics and in Science in general. If, for example, experiments show that Parity is violated, we can be disappointed, but we simply give up the principle of Parity Conservation, at least in the kind of interactions in which it has been observed that it does not hold. I do not understand why most of my colleagues do not behave in a similar way with the Maximum Likelihood principle, or with the `prescriptions' for building Confidence Intervals, both of which are known to produce absurd results.''
The second epistemological remark concerns another myth presumed by scientists, i.e. that ``since Galileo an accepted base of scientific research is the repeatability of experiments.''[13] (``This assumption justifies the Frequentistic definition of probability ...'' -- continues the author.) Clearly, according to this point of view, most things discussed in this workshop are `not scientific'. Fortunately, it is presently rather well accepted (also by the author of Ref. [13], I understand) that Science can also be based on a collection of individual facts that we cannot repeat at will, or that might happen naturally and beyond our control (but there is still someone claiming that fields like Geology, Evolutionary Biology and even Astrophysics are not Science!). The relevant thing that allows us to build up a rational scientific knowledge grounded on empirical observations is that we are capable of relating, though in a stochastic way and with the usual unavoidable uncertainties, our conjectures to experimental observations, no matter whether the phenomena occur spontaneously or arise under well controlled experimental conditions. In other words, we must be able to model, though approximately, the likelihoods that connect hypotheses to observations. This way of building the scientific edifice is excellently expressed in the title of one of the volumes issued to celebrate the Centennial of the Carnegie Institute of Washington [14]. This scientific building can be formally (and graphically) described by the so-called `Bayesian networks' or `belief networks' [2]. If you have never heard these expressions, try to google them and you will discover a new world (and how far behind we physicists are, mostly sticking to books and lecture notes that are too often copies of copies of obsolete books!).
It is a pleasure to thank the organizers for the stimulating workshop in such a wonderful location, and Paolo Agnoli for useful comments.