However, this does not imply that the latter is the correct way to proceed in case the pattern of the individual results is at odds with the weighted average applied to all points. A more pondered analysis should rather be performed in order to model our doubts, as done e.g. in Ref. [2]. (In the case of the charged kaon mass there is, however, a curious compensation, such that the biased result comes out to agree, at least in terms of central value and `error', with that of the `sceptical analysis' [2].)
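For reference, the standard combination under discussion is the inverse-variance weighted average. The following minimal Python sketch only illustrates the rule itself; the numerical inputs are invented and are not the actual kaon-mass measurements.

```python
import numpy as np

def weighted_average(values, sigmas):
    """Inverse-variance weighted average of independent results
    assumed Gaussian, x_i +/- sigma_i."""
    w = 1.0 / np.asarray(sigmas, dtype=float) ** 2
    x_bar = np.sum(w * np.asarray(values, dtype=float)) / np.sum(w)
    sigma_bar = 1.0 / np.sqrt(np.sum(w))
    return x_bar, sigma_bar

# purely illustrative numbers (not real data)
values = [10.12, 9.95, 10.31]
sigmas = [0.10, 0.15, 0.08]
x_bar, sigma_bar = weighted_average(values, sigmas)
print(f"weighted average: {x_bar:.3f} +/- {sigma_bar:.3f}")
```

It is precisely the blind application of this rule, when the individual results show a suspicious pattern, that is being questioned here.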
I would like to conclude with some remarks concerning how to report an experimental result, in view of its further uses. In fact, a result is not an end in itself, as Physics and all Sciences are not just collections of facts (and even an experimental result is not a mere `fact', since it is derived from many empirical observations through models relying on a web of beliefs$^{17}$).
Focusing on pure science, results are finally confronted with theoretical evaluations (not strictly `predictions') in order to rank, in degree of belief, the possible models describing how `the World works' (note that the acclaimed Popperian falsification is an idealistic scheme that seldom applies in practice [23,24]). But in order to achieve the best selective power, individual results are combined together, as we have seen in this note. Moreover, a result can be propagated into other evaluations, as it is, itself, practically always based on other results, since it depends on quantities which enter the theoretical model(s) on which it relies (`principles of measurement' [10]), including those which govern the `pieces of apparatus', as recalled in footnote 17.
Therefore, it is important to provide, as outcome of an experimental investigation, something that can be used at best, even after years, for comparison, combination and propagation. Fortunately there is something on which there is universal consensus: the most complete information resulting from the empirical findings, concerning a quantity that can assume values with continuity, is the so-called likelihood function.$^{18}$ In fact, in the case of independent experiments reporting evidence on the same physical quantity, the rule of combination is straightforward, as it results from probability theory without the need of ad hoc prescriptions: just multiply the individual likelihoods. It follows then that the likelihood (or its negative log) should be described at best in a publication, as for example done in Ref. [27], in which several negative log-likelihoods were shown in figures and parameterized around their minimum by suitable polynomials.
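As a purely illustrative sketch (not taken from Ref. [27]; the polynomial coefficients and the range of the quantity are invented), the following Python fragment shows the combination rule in practice: for independent experiments the likelihoods multiply, i.e. the negative log-likelihoods, here parameterized by polynomials around their minima, simply add.

```python
import numpy as np

# grid of the physical quantity of interest (illustrative range)
mu = np.linspace(-1.0, 3.0, 4001)

# each experiment's negative log-likelihood, parameterized around its
# minimum by a polynomial (invented coefficients, in the spirit of Ref. [27])
nll_1 = 0.5 * ((mu - 1.2) / 0.4) ** 2                          # parabolic (Gaussian case)
nll_2 = 0.5 * ((mu - 0.9) / 0.6) ** 2 + 0.1 * (mu - 0.9) ** 3  # slightly skewed

# independent experiments: the likelihoods multiply,
# hence the negative log-likelihoods simply add
nll_comb = nll_1 + nll_2
print("combined minimum at mu ~", mu[np.argmin(nll_comb)])
```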
Reducing the detailed information provided by the likelihood to a couple of numbers does not provide, in general, an effective and unbiased way to report the result of the findings, unless the likelihood is, with some degree of approximation, Gaussian. Instead, if the likelihood is not Gaussian [or the $\chi^2$ is not parabolic, in those cases in which the likelihood can be rewritten as $\exp(-\chi^2/2)$], then reporting the value that maximizes it, with an `error' related to the curvature of its negative log at the minimum, or `asymmetric errors' derived from a prescription that is only justified for a Gaussian likelihood, is also an inappropriate way of reporting the information contained in the findings. This is because, when a result is given in terms of $x_{\rm best}\,^{+\Delta_+}_{-\Delta_-}$, then $x_{\rm best}$ is often used in further calculations, and the $\Delta$'s are `propagated' into further uncertainties in `creative' ways, forgetting that the well known formulae for propagation in linear combinations (or in linearized forms) rely on probabilistic properties of means and variances (and the Central Limit Theorem makes the result Gaussian if `several' contributions are considered). There are, instead, no similar theorems that apply to the `best values' obtained by minimizing the $\chi^2$ or the negative log-likelihood, and to the (possibly asymmetric) `errors' obtained by the $\Delta\chi^2 = 1$ and the $\Delta(-\ln{\cal L}) = 1/2$ rules, still commonly used to evaluate `errors'. Therefore these rules might produce biased results, directly and/or in propagations [26].
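A toy numerical illustration of this point (the distribution and all numbers are invented, chosen only to exhibit the mechanism): if a skewed likelihood is summarized by its mode and `asymmetric errors', propagating the mode into a sum does not reproduce the expectation of the sum, which is what the standard propagation formulae actually refer to.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent quantities whose (normalized) likelihoods, with flat
# priors, are taken to be lognormal(0, 0.5) pdfs -- a skewed shape whose
# mode, exp(-sigma^2) ~ 0.78, differs from its mean, exp(sigma^2/2) ~ 1.13.
x = rng.lognormal(mean=0.0, sigma=0.5, size=1_000_000)
y = rng.lognormal(mean=0.0, sigma=0.5, size=1_000_000)

mode = np.exp(-0.5 ** 2)        # mode of lognormal(0, 0.5)
s = x + y                       # the quantity actually of interest

print("sum of the two modes:  ", 2 * mode)   # what naive propagation would use
print("E[x + y] = E[x] + E[y]:", s.mean())   # what probability theory gives
```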
Reporting the likelihood is also very important in the case of `negative searches', in which a lower/upper bound is usually reported. In fact, although there is no way to combine the bounds (and so people often rely on the most stringent one, which could just be due to a larger fluctuation of the background with respect to its expectation), there is little doubt about how to `merge' the individual (independent) likelihoods into a single combined likelihood, from which conventional bounds can be evaluated (see Ref. [28] and chapter 13 of Ref. [29]).
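A minimal numerical sketch of such a merging follows, assuming Poisson counting experiments with known expected backgrounds and efficiencies, and, for the final conventional bound, a flat prior on the non-negative signal (all numbers are invented; this is only an illustration of the multiplication of likelihoods, not a reproduction of Refs. [28,29]).

```python
import numpy as np

# Toy 'negative search' combination: two counting experiments with
# expected background b, efficiency eps and observed counts n; the
# likelihood of the signal s in each is Poisson with mean b + eps*s.
experiments = [dict(n=0, b=0.5, eps=1.0),
               dict(n=2, b=2.0, eps=0.8)]

s = np.linspace(0.0, 20.0, 2001)

def log_like(s, n, b, eps):
    lam = b + eps * s
    return n * np.log(lam) - lam      # Poisson log-likelihood (n! term dropped)

# independent experiments: multiply the likelihoods, i.e. sum the logs
log_L = sum(log_like(s, **e) for e in experiments)
L = np.exp(log_L - log_L.max())

# with a flat prior on s >= 0, a conventional 95% probabilistic bound follows
cdf = np.cumsum(L) / L.sum()
print("95% upper bound on s ~", s[np.searchsorted(cdf, 0.95)])
```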
Finally, a puzzle is proposed in the Appendix, as a warning on the use of the weighted average to combine results, even if they are believed to be independent and affected by Gaussian errors.
I am indebted to Enrico Franco for extensive discussions on the subject and for comments on the manuscript.