Introduction

It is not rare for experimental results to `appear' to be in mutual disagreement. The quote marks are mandatory, as a reminder that very improbable events, too, can occur by nature.$^{1}$ The results `appear' to us to be in mutual disagreement because we know by experience that uncertainties$^{2}$ might be underestimated, systematic errors overlooked, theoretical corrections not (properly) taken into account, or even that mistakes of various kinds might have been made in building or running the experiment or in handling the data. It is enough to browse the PDG [3] to find cases of this kind, such as that of Fig. 1, concerning the mass of the charged kaon, whose values, as selected by the PDG, are reported in Tab. 1.$^{3}$
Figure: Charged kaon mass from several experiments as summarized by the PDG [3]. Note that, besides the `error' of 0.013 MeV, obtained by a $\times 2.4$ scaling, an `error' of 0.016 MeV is also provided, obtained by a $\times 2.8$ scaling. The two results are called `OUR AVERAGE' and `OUR FIT', respectively.
\begin{figure}\centering\epsfig{file=PDG_summary.eps,clip=,width=0.8\linewidth}\end{figure}

Table: Experimental values of the charged kaon mass, limited to those taken into account by the 2019 issue of PDG [3] (see footnote 3 for remarks).
\begin{table}\centering
\begin{tabular}{clccc}
\hline
$i$ & Authors & pub. year & central value $[d_i]$ (MeV) & uncertainty $[s_i]$ (MeV) \\
\hline
$1$ & G. Backenstoss et al. [4] & 1973 & 493.691 & 0.040 \\
$2$ & S.C. Cheng et al. [5] & 1975 & 493.657 & 0.020 \\
$3$ & L.M. Barkov et al. [6] & 1979 & 493.670 & 0.029 \\
$4$ & G.K. Lum et al. [7] & 1981 & 493.640 & 0.054 \\
$5$ & K.P. Gall et al. [8] & 1988 & 493.636 & 0.011 (*) \\
$6$ & A.S. Denisov et al. [9] & 1991 & 493.696 & $[$0.0059$]$ \\
 & \& Yu.M. Ivanov [10] & 1992 & $[$same$]$ & 0.007 \\
\hline
\end{tabular}
\end{table}


The usual probabilistic interpretation$^{4}$ of the results is that each experiment provides a probability density function (pdf) centered at $d_i$ with standard deviation $s_i$, as shown by the solid lines of Fig. 2.

Figure: Graphical representation of the results on the charged kaon mass of Tab. 1 (solid lines). The dashed red Gaussian shows the result of the naive standard combination (see text).
\begin{figure}\centering\epsfig{file=naive_combination.eps,clip=,width=0.76\linewidth}\end{figure}
The standard way to combine the individual results consists in calculating the weighted average, with weights equal to $1/s_i^2$, that is
\begin{eqnarray}
d_w &=& \frac{\sum_i d_i/s_i^2}{\sum_i 1/s_i^2} \\
s_w &=& \left(\sum_i 1/s_i^2\right)^{-\frac{1}{2}}\,,
\end{eqnarray}

which, applied to the values of Tab. 1, yields $d_w = 493.6766\,$MeV and $s_w = 0.0055\,$MeV, i.e. a charged kaon mass of $493.6766\pm 0.0055\,$MeV,$^{5}$ shown graphically in Fig. 2 as a dashed red Gaussian. The outcome `appears' suspicious because the probability mass is concentrated in the region less preferred by the individual, more precise results, as also emphasized in the ideogram of Fig. 1, to the meaning of which we shall return in Section 5.
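The numbers quoted above can be reproduced with a few lines of code. The following is only a minimal sketch in Python (a language chosen here for illustration, not taken from the text); the function name `weighted_average` is hypothetical, and the input values are those of Tab. 1.

```python
# Central values d_i and standard uncertainties s_i (MeV), copied from Tab. 1.
d = [493.691, 493.657, 493.670, 493.640, 493.636, 493.696]
s = [0.040, 0.020, 0.029, 0.054, 0.011, 0.007]

def weighted_average(values, sigmas):
    """Inverse-variance weighted average, Eqs. (1)-(2): returns (d_w, s_w)."""
    weights = [1.0 / sigma**2 for sigma in sigmas]
    total = sum(weights)
    d_w = sum(w * v for w, v in zip(weights, values)) / total
    s_w = total ** -0.5
    return d_w, s_w

d_w, s_w = weighted_average(d, s)
print(f"d_w = {d_w:.4f} MeV, s_w = {s_w:.4f} MeV")
# prints: d_w = 493.6766 MeV, s_w = 0.0055 MeV
```

Note that the two most precise results (rows 5 and 6 of Tab. 1) carry most of the total weight, which is why the average falls between them, in a region that neither of them favors.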

As a matter of fact, a situation of this kind is not impossible; nevertheless, there is a natural tendency to believe that something has not been properly taken into account by one or more experiments. To put it in the words of a dictum attributed to a famous Italian politician, “a pensar male degli altri si fa peccato ma spesso ci si indovina” (roughly, `to think ill of others is a sin, but one often guesses right').$^{6}$

Figure: Same as Fig. 2 but with one result arbitrarily shifted by $-50\,$keV (dotted line).
\begin{figure}\centering\epsfig{file=naive_combination_Andreotti.eps,clip=,width=0.8\linewidth}\end{figure}
For example, looking at Fig. 2, one is strongly tempted to lower, just as an exercise, the highest value by 50 keV,$^{7}$ thus getting the excellent overall agreement shown in Fig. 3 (shifted Gaussian plotted with a dotted gray line) and a combined mass value of $493.6460 \pm 0.0055\,$MeV. And the question would be settled. But this sounds unfair, to say the least, in particular because we are aware, from the history of measurements, of a kind of `inertia' of new results, which tend not to differ much from old ones; but sometimes the new results moved `too far' from the old ones, and the presently accepted value lies somewhere in the middle.
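The effect of the arbitrary shift on the naive combination can be checked numerically. This is again a minimal Python sketch with hypothetical names (`weighted_average`, `shifted`), using the values of Tab. 1; it only repeats the exercise described above and is not meant as a recommended procedure.

```python
# Central values d_i and standard uncertainties s_i (MeV), copied from Tab. 1.
d = [493.691, 493.657, 493.670, 493.640, 493.636, 493.696]
s = [0.040, 0.020, 0.029, 0.054, 0.011, 0.007]

def weighted_average(values, sigmas):
    """Inverse-variance weighted average: returns (central value, std. uncertainty)."""
    weights = [1.0 / sigma**2 for sigma in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, total ** -0.5

# Exercise from the text: lower the highest central value by 50 keV = 0.050 MeV.
shifted = list(d)
i_max = shifted.index(max(shifted))   # index of the highest value
shifted[i_max] -= 0.050

d_shift, s_shift = weighted_average(shifted, s)
print(f"{d_shift:.4f} +- {s_shift:.4f} MeV")
# prints: 493.6460 +- 0.0055 MeV
```

The combined uncertainty is unchanged by the shift, since it depends only on the $s_i$ and not on the central values.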
Figure: Some history plots from the PDG [3,14].
\begin{figure}\centering\epsfig{file=history_plot_UR.eps,clip=,width=0.87\linewidth}\end{figure}
Figure 4 shows some of the history plots traditionally reported by the PDG [14].

In such a state of uncertainty, probability theory can help us build a model in which the values about which we are in doubt are allowed to vary from the nominal ones. Obviously, the model is not unique, nor are the probability distributions that can be used. Following [15], which was inspired by [16], these are the criteria adopted and some (hopefully shareable) desiderata:

Once the model has been built, we can easily write down the multidimensional pdf $f(\underline{d},\mu,\underline{r}\,\vert\,\underline{s},I)$ of all the variables of interest, that is, the `observed' $d_i$ and the uncertain values $\mu$ and $r_i$. The $s_i$ will instead be considered as fixed conditions, as will become clear shortly; $\underline{d}$ stands for all the $d_i$, and so on.

Once the multi-dimensional pdf has been settled, writing down the unnormalized pdf of the uncertain quantities,

\begin{eqnarray*}
\widetilde{f}(\mu,\underline{r}\,\vert\,\underline{d},\underline{s},I)
&\propto& f(\mu,\underline{r}\,\vert\,\underline{d},\underline{s},I)\,,
\end{eqnarray*}


is straightforward, as we shall see shortly. But, unlike in [15], the rest of the technical work (normalization, marginalization and calculation of the moments of interest) will be done here by sampling, i.e. by Monte Carlo, and the use of a suitable software package will make the task rather easy.

But, before we build the model of interest, let us start with a simpler one, in which we fully trust the reported standard uncertainties, i.e. we assume $f(r_i\,\vert\,I) = \delta(r_i-1)$, and hence $\sigma_i=s_i$. For the prior we also take, following Gauss [17,18], a flat distribution of $\mu$ in the region of interest.$^{9}$
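To give a concrete idea of what `done by sampling' means for this simpler model, here is a minimal sketch in Python of a hand-written random-walk Metropolis sampler. It is not the software package referred to in the text; it assumes Gaussian pdfs centered at $d_i$ with standard deviation $s_i$ and a flat prior on $\mu$, and the names `log_post` and `metropolis`, as well as the step size, chain length, burn-in and seed, are arbitrary illustrative choices.

```python
import numpy as np

# Central values d_i and standard uncertainties s_i (MeV), copied from Tab. 1.
d = np.array([493.691, 493.657, 493.670, 493.640, 493.636, 493.696])
s = np.array([0.040, 0.020, 0.029, 0.054, 0.011, 0.007])

def log_post(mu):
    """Unnormalized log-pdf of mu: flat prior times Gaussian terms (assumed)."""
    return -0.5 * np.sum(((d - mu) / s) ** 2)

def metropolis(n_steps=60000, burn_in=5000, step=0.01, seed=1):
    """Random-walk Metropolis on mu; returns the chain after burn-in."""
    rng = np.random.default_rng(seed)
    mu = d.mean()                      # arbitrary starting point
    lp = log_post(mu)
    chain = np.empty(n_steps)
    for k in range(n_steps):
        proposal = mu + step * rng.normal()
        lp_prop = log_post(proposal)
        # Accept with probability min(1, ratio of unnormalized pdfs).
        if np.log(rng.random()) < lp_prop - lp:
            mu, lp = proposal, lp_prop
        chain[k] = mu
    return chain[burn_in:]

chain = metropolis()
mu_mean, mu_std = chain.mean(), chain.std(ddof=1)
print(f"mean = {mu_mean:.4f} MeV, std = {mu_std:.4f} MeV")
```

Only the unnormalized pdf enters the acceptance step, so no normalization integral is needed. Under the stated assumptions, the sample mean and standard deviation should come out close to $d_w$ and $s_w$ of Eqs. (1)-(2), up to Monte Carlo fluctuations.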