More generally, the $\sqrt{\chi^2/\nu}$ scaling factor is at least suspect. This is because it is well known that the $\chi^2$ distribution does not scale with $\nu$ and therefore, while a $\chi^2/\nu = 2$, for example, is quite in the norm for $\nu$ equal to 2, 3 or 4 (even a strict frequentist would admit that the resulting p-values of 0.14, 0.11 and 0.09, respectively, are nothing to worry about), things get different for $\nu$ equal to 10, 20 or 30 (p-values of 0.029, 0.005 and 0.0009, respectively). Moreover, I am not aware of cases in which the standard deviation of the weighted average was scaled down when $\chi^2/\nu$ was smaller than one.$^{15}$
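The p-values quoted above can be checked with a few lines of code. The following is only a minimal sketch: the helper name `chi2_p_value` is hypothetical, and the tail probability is obtained from the standard recurrence of the regularized upper incomplete gamma function, using nothing beyond the Python standard library.

```python
import math

def chi2_p_value(chi2, nu):
    """Upper-tail probability P(X >= chi2) for a chi-square with nu degrees of freedom.

    Uses Q(a+1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a+1), with a = nu/2, x = chi2/2,
    starting from Q(1, x) = exp(-x) (even nu) or Q(1/2, x) = erfc(sqrt(x)) (odd nu).
    """
    x = chi2 / 2.0
    if nu % 2 == 0:
        a, q = 1.0, math.exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))
    while a < nu / 2.0:
        q += x**a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q

# Same reduced chi-square (chi2/nu = 2), very different p-values as nu grows.
for nu in (2, 3, 4, 10, 20, 30):
    print(nu, chi2_p_value(2.0 * nu, nu))
```

The loop keeps $\chi^2/\nu$ fixed at 2 and only varies $\nu$, which is exactly the situation in which a single scaling rule treats very different degrees of disagreement alike.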
But there is another subtle issue with the method, which I have realized only very recently, going through the details of the charged kaon mass measurements: if the prescription is applied first to a sub-sample of results and then to all of them (using, for the sub-sample, its weighted average and scaled standard deviation), then a bias is introduced in the final result with respect to the case in which all results are taken individually. This is because the summary provided by such a prescription is not a sufficient statistic.
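The effect can be illustrated with a toy calculation. The sketch below uses made-up numbers (not the kaon data), hypothetical function names, and assumes that the standard deviation is enlarged by $\sqrt{\chi^2/\nu}$ only when $\chi^2/\nu > 1$. It compares a direct weighted average of three results with a two-step combination in which the first two are summarized beforehand.

```python
import math

def weighted_average(values, sigmas):
    """Inverse-variance weighted mean and its standard deviation."""
    weights = [1.0 / s**2 for s in sigmas]
    mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))

def scaled_average(values, sigmas):
    """Weighted mean with sigma enlarged by sqrt(chi2/nu) when chi2/nu > 1 (assumed rule)."""
    mean, sigma = weighted_average(values, sigmas)
    nu = len(values) - 1
    chi2 = sum(((v - mean) / s) ** 2 for v, s in zip(values, sigmas))
    scale = math.sqrt(chi2 / nu) if nu > 0 and chi2 > nu else 1.0
    return mean, sigma * scale

# Toy data: two mutually discrepant results plus a third one.
sub_values, sub_sigmas = [10.0, 10.6], [0.1, 0.1]
last_value, last_sigma = 11.0, 0.1

# (a) All three results taken individually.
direct_mean, _ = weighted_average(sub_values + [last_value], sub_sigmas + [last_sigma])

# (b) Sub-sample summarized WITHOUT scaling, then combined: same mean as (a).
m_plain, s_plain = weighted_average(sub_values, sub_sigmas)
staged_plain_mean, _ = weighted_average([m_plain, last_value], [s_plain, last_sigma])

# (c) Sub-sample summarized WITH scaling, then combined: the mean shifts.
m_scaled, s_scaled = scaled_average(sub_values, sub_sigmas)
staged_scaled_mean, _ = weighted_average([m_scaled, last_value], [s_scaled, last_sigma])

print(direct_mean, staged_plain_mean, staged_scaled_mean)
```

In this toy case the unscaled summary reproduces the direct average, while the scaled summary down-weights the sub-sample and pulls the final mean towards the third result. The size of the shift depends entirely on the invented inputs.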
The lowest high-precision mass value (see Tab. 1 and Fig. 3) comes in fact from the combination, done directly by the experimental team [18] applying the scaling prescription. Without this scaling, the four individual results, reported in Tab. 2,
Nevertheless, if we apply to the standard deviation the scaling factor $\sqrt{\chi^2/\nu}$, then we get a standard deviation of 0.010 MeV (the difference between this value and the 0.011 MeV of Tabs. 1 and 2 could be just due to rounding of the individual values). The result is shown in Fig. 4, together with the individual results that enter the analysis (see also entry B of the summary table 3).
It is interesting to see what we get if we use the nine individual points, i.e. 1, 2, 3, 4 and 6 of Tab. 1, together with the four individual results of Tab. 2.
As a further example of this effect on the same data, let us do the academic exercise of grouping the data in a different way. For example, we first combine all results published before year 1990 (1-4 of Tab. 1 and the four results of Tab. 2) and include the most recent one (6 of Tab. 1) in a second step. The outcome of the exercise is reported in Fig. 6 and in the entries D and E of the summary table 3.
The weighted average of the eight results before year 1990 (upper plot of Fig. 6 and entry D in Tab. 3) is shown as a dashed red line. The $\chi^2$ is equal to 10.8, producing a scaling factor of 1.24 and thus a modified result with an enlarged standard deviation (solid brown line of Fig. 6 and entry D in Tab. 3). Combining this outcome with the 1991 result [19,20] we get (lower plot of Fig. 6 and entry E in Tab. 3) a weighted average with the very large $\chi^2$ of 29, and hence a very small p-value, thus yielding a correspondingly large scaling factor and then a widened standard deviation. At least, contrary to the previous cases, this time the scaled standard deviation is able to cover both individual results, although an experienced physicist would suspect that most likely only one of the two is correct. (In situations of this kind a `sceptical analysis' would result in a bimodal distribution, as shown in Fig. 4 of Ref. [3].)
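The scaling factors in this exercise follow directly from the quoted $\chi^2$ values. As a rough check, assuming $\nu = N - 1$ for a weighted average of $N$ results (the helper names below are hypothetical):

```python
import math

def scale_factor(chi2, n_results):
    """sqrt(chi2/nu), assuming nu = n_results - 1 for a weighted average."""
    nu = n_results - 1
    return math.sqrt(chi2 / nu)

def p_value_one_dof(chi2):
    """Upper-tail probability of a chi-square with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Entry D: eight results with chi2 = 10.8.
scale_d = scale_factor(10.8, 8)

# Entry E: two inputs (the pre-1990 summary and the 1991 result) with chi2 = 29.
scale_e = scale_factor(29.0, 2)
p_e = p_value_one_dof(29.0)

print(round(scale_d, 2), scale_e, p_e)
```

The first number reproduces the 1.24 quoted in the text; the other two are what the same assumption gives for the second step and are meant only as a consistency check, not as a replacement for the values in Tab. 3.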