More generally,
the $\sqrt{\chi^2/\nu}$ scaling factor
is at least suspect.
This is because it is well known that the
$\chi^2$ distribution does not scale with $\nu$.
Therefore,
while a $\chi^2/\nu = 2$, for example, is quite within the norm
for $\nu$
equal to 2, 3 or 4 (even a strict frequentist would admit
that the resulting p-values of 0.14, 0.11 and 0.09, respectively,
are nothing to worry about), things are different for $\nu$
equal to 10,
20 or 30 (p-values of 0.029, 0.005 and 0.0009, respectively).
Moreover, I am not aware of cases in which the standard
deviation of the weighted average was scaled down
when $\chi^2/\nu$
was smaller
than one.$^{15}$
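The p-values quoted above can be checked with a short calculation. The following is a minimal sketch using only the Python standard library; the helper name `chi2_sf` is hypothetical and not part of the original analysis. It evaluates the upper tail of the $\chi^2$ distribution for integer $\nu$ through the standard recurrence of the regularized incomplete gamma function, at a fixed $\chi^2/\nu = 2$.

```python
import math

def chi2_sf(x, nu):
    """Upper-tail probability P(chi2 >= x) for an integer number of
    degrees of freedom nu >= 1 (hypothetical helper, stdlib only)."""
    half_x = x / 2.0
    # Closed-form starting points: nu = 2 (exponential) or nu = 1 (erfc).
    if nu % 2 == 0:
        a, q = 1.0, math.exp(-half_x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half_x))
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1),
    # applied until a reaches nu / 2.
    while a < nu / 2.0:
        q += half_x ** a * math.exp(-half_x) / math.gamma(a + 1.0)
        a += 1.0
    return q

# Same chi2/nu = 2 for every nu, very different p-values.
for nu in (2, 3, 4, 10, 20, 30):
    p = chi2_sf(2.0 * nu, nu)
    print(f"nu = {nu:2d}   chi2 = {2 * nu:2d}   p-value = {p:.2g}")
```

The point of the exercise is only that a fixed value of $\chi^2/\nu$ carries a very different meaning depending on $\nu$, which is why a scaling rule based on that ratio alone is questionable.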
But there is another subtle issue with the method, which I realized only very recently while going through the details of the charged kaon mass measurements: if the prescription is applied to a sub-sample of results and then to all of them (taking for the sub-sample the weighted average and the scaled standard deviation), then a bias is introduced in the final result with respect to the case in which all results are taken individually. This is because the summary provided by such a prescription is not a sufficient statistic.
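The effect can be illustrated with a toy calculation. The sketch below uses made-up numbers (not the kaon data) and a hypothetical helper `weighted_average`; it also assumes that the standard deviation is only ever scaled up, i.e. by $\max(1, \sqrt{\chi^2/\nu})$. It compares a one-step combination of three results with a two-step combination in which the first two are summarized beforehand.

```python
import math

def weighted_average(values, sigmas, scale=False):
    """Inverse-variance weighted average. With scale=True the standard
    deviation is multiplied by max(1, sqrt(chi2/nu)); never scaling
    down is an assumption of this sketch."""
    weights = [1.0 / s ** 2 for s in sigmas]
    mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    sd = 1.0 / math.sqrt(sum(weights))
    if scale and len(values) > 1:
        chi2 = sum(((v - mean) / s) ** 2 for v, s in zip(values, sigmas))
        nu = len(values) - 1
        sd *= max(1.0, math.sqrt(chi2 / nu))
    return mean, sd

# Made-up illustrative results: two in mutual tension, plus a third one.
sub_values, sub_sigmas = [10.0, 10.6], [0.1, 0.1]
last_value, last_sigma = 10.9, 0.1

# (a) All three results taken individually.
direct = weighted_average(sub_values + [last_value],
                          sub_sigmas + [last_sigma], scale=True)

# (b) Sub-sample summarized first with the scaled standard deviation,
#     then combined with the third result.
sub = weighted_average(sub_values, sub_sigmas, scale=True)
stepwise = weighted_average([sub[0], last_value],
                            [sub[1], last_sigma], scale=True)

# (c) Same two steps, but without any scaling.
sub_plain = weighted_average(sub_values, sub_sigmas)
stepwise_plain = weighted_average([sub_plain[0], last_value],
                                  [sub_plain[1], last_sigma])

print("direct mean:            ", direct[0])
print("stepwise mean (scaled): ", stepwise[0])
print("stepwise mean (plain):  ", stepwise_plain[0])
```

Without scaling, the two-step mean coincides with the direct one, because the weighted average and its unscaled standard deviation summarize Gaussian results without loss. Once the sub-sample standard deviation is inflated, the sub-sample is underweighted in the second step and the final mean shifts towards the remaining result.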
The lowest high-precision mass value
(see Tab. 1 and Fig. 3)
comes in fact from a combination done directly by the experimental
team [18] applying the $\sqrt{\chi^2/\nu}$
prescription.
Without this scaling, the four individual results,
reported in Tab. 2,
Nevertheless,
if we apply the $\sqrt{\chi^2/\nu}$ scaling factor to the standard deviation, then
we get a scaled standard deviation of 0.010 MeV (the difference between this value
and the 0.011 MeV of Tabs. 1 and 2
could be due just to rounding of the individual values).
The result is shown in
Fig. 4, together with the individual results
that enter the analysis (see also entry B of the summary table 3).
It is interesting to see what we get if we use
the nine individual points, i.e.
1, 2, 3, 4 and 6 of Tab. 1, together
with the four individual results
of Tab. 2.
As a further example of this effect on the same data,
let us make the academic exercise of grouping
the data in a different way. For example, we first combine all results
published before 1990 (results 1-4 of Tab. 1 together with
the four individual results of Tab. 2)
and include the most recent one (6 of Tab. 1) in a second step.
The outcome of the exercise is reported in Fig. 6
and in the entries D and E of the summary table 3.
Combining this outcome with the 1991
result [19,20]
we get (lower plot of Fig. 6
and entry E in Tab. 3) a weighted average
with the very large $\chi^2$
of 29,
thus yielding
a large $\sqrt{\chi^2/\nu}$
scaling factor and then a widened
standard deviation. At least, contrary
to the previous cases,
this time the scaled standard deviation is able
to cover both individual results, although
an experienced physicist would suspect that
most likely only one of the two is
correct. (In situations of this kind a 'sceptical analysis'
would result in a bimodal distribution, as shown in Fig. 4 of
Ref. [3].)
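As a rough cross-check of the size of the effect, the scaling factor and the p-value associated with a $\chi^2$ of 29 can be estimated as follows. This is a sketch that assumes $\nu = 1$, since only two results are combined in this step; for one degree of freedom the upper tail of the $\chi^2$ distribution reduces to $\mathrm{erfc}(\sqrt{\chi^2/2})$.

```python
import math

chi2 = 29.0   # value quoted in the text
nu = 1        # assumption: two combined results leave one degree of freedom

# Scaling factor of the prescription, sqrt(chi2/nu).
scale = math.sqrt(chi2 / nu)

# For nu = 1, P(chi2 >= x) = P(|z| >= sqrt(x)) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2.0))

print(f"scaling factor = {scale:.2f}")
print(f"p-value        = {p_value:.1e}")
```

Under this assumption the standard deviation is inflated by a factor somewhat above five, which is what allows the scaled interval to stretch across two results that disagree this strongly.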