From a sample of individual observations to a couple of numbers: the role of statistical sufficiency
Let us restart from Eq. (5) of Ref. [2], based on the graphical model in Fig. 5 of the same paper, both reproduced here for the reader's convenience as Eq. (1) and Fig. 1.
Figure 1: Graphical model behind the standard combination, assuming independent measurements of the same quantity, each characterized by a Gaussian error function with standard deviation $\sigma_i$.

$$f(x_1,\dots,x_n,\mu\,|\,\sigma_1,\dots,\sigma_n) = \left[\prod_{i=1}^{n} f(x_i\,|\,\mu,\sigma_i)\right]\cdot f_0(\mu) \qquad (1)$$

is the joint probability density function (pdf) of all the quantities of interest, with, in compact notation, $\boldsymbol{x}\equiv(x_1,\dots,x_n)$ and $\boldsymbol{\sigma}\equiv(\sigma_1,\dots,\sigma_n)$. The standard deviations $\sigma_i$ are instead considered just conditions of the problem.
The pdf $f_0(\mu)$ models our prior beliefs about the `true' value $\mu$ of the quantity of interest (see Ref. [2] for details, in particular footnote 9). The pdf of $\mu$, also conditioned on $\boldsymbol{x}$, is then, by virtue of a well-known theorem of probability theory,

$$f(\mu\,|\,\boldsymbol{x},\boldsymbol{\sigma}) = \frac{f(\boldsymbol{x},\mu\,|\,\boldsymbol{\sigma})}{f(\boldsymbol{x}\,|\,\boldsymbol{\sigma})}\,. \qquad (2)$$
Noting that, given the model and the observed values $\boldsymbol{x}$, the denominator is just a number, although in general not easy to calculate, and making use of Eq. (1), we get

$$f(\mu\,|\,\boldsymbol{x},\boldsymbol{\sigma}) \propto \left[\prod_{i=1}^{n} f(x_i\,|\,\mu,\sigma_i)\right]\cdot f_0(\mu)\,. \qquad (3)$$
Speaking in terms of likelihood, and ignoring multiplicative factors, we can rewrite the previous equation as

$$f(\mu\,|\,\boldsymbol{x},\boldsymbol{\sigma}) \propto \left[\prod_{i=1}^{n} \mathcal{L}(\mu;\,x_i,\sigma_i)\right]\cdot f_0(\mu)\,, \qquad (4)$$

that is, indeed, the particular case, valid for independent observations $x_i$, of the more general form

$$f(\mu\,|\,\boldsymbol{x},\boldsymbol{\sigma}) \propto \mathcal{L}(\mu;\,\boldsymbol{x},\boldsymbol{\sigma})\cdot f_0(\mu)\,, \qquad (5)$$

since, under the condition of independence,

$$\mathcal{L}(\mu;\,\boldsymbol{x},\boldsymbol{\sigma}) = \prod_{i=1}^{n} \mathcal{L}(\mu;\,x_i,\sigma_i)\,. \qquad (6)$$
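As a concrete illustration of Eqs. (3)-(6), the following minimal Python sketch (the observations and standard deviations are invented for the purpose of the example) evaluates the posterior of $\mu$ on a grid, as the product of the individual Gaussian likelihoods times a flat prior:

```python
import numpy as np

# Invented data: three independent measurements of the same quantity,
# each with its own (assumed known) Gaussian standard deviation.
x     = np.array([10.2, 9.8, 10.5])
sigma = np.array([0.3, 0.4, 0.6])

mu = np.linspace(8.0, 12.0, 2001)          # grid of hypotheses for mu

# Individual likelihoods L(mu; x_i, sigma_i), multiplicative factors dropped
L_i = np.exp(-(x[:, None] - mu)**2 / (2 * sigma[:, None]**2))

L = L_i.prod(axis=0)                       # joint likelihood, Eq. (6)
prior = np.ones_like(mu)                   # flat prior f0(mu)

posterior = L * prior                      # Eq. (3), up to normalization
posterior /= posterior.sum()               # normalize on the uniform grid

print("mode of the posterior:", mu[posterior.argmax()])
```

With a flat prior the posterior is just the normalized likelihood, which is precisely the situation discussed in the following remarks.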
- The inference depends on the product of likelihood and prior (note the `;' instead of the `|' in the notation, to remind us that in `conventional statistics' $\mathcal{L}$ is simply a mathematical function of $\mu$, with parameters $x_i$ and $\sigma_i$);
- if the prior is `flat', then the inference is determined by the likelihood: in particular, the most probable value (`mode') of $\mu$ is the value which maximizes the likelihood;
- in the case of independent Gaussian error functions the likelihood can be rewritten, apart from multiplicative factors, as

$$\mathcal{L}(\mu;\,\boldsymbol{x},\boldsymbol{\sigma}) \propto \exp\!\left[-\frac{\chi^2}{2}\right],$$

having recognized the sum in the exponent as

$$\chi^2 = \sum_{i=1}^{n}\frac{(x_i-\mu)^2}{\sigma_i^2}\,:$$

under the hypotheses and the approximations of this model the most probable value of $\mu$ can then also be obtained by minimizing $\chi^2$ (this is also verified in the numerical sketch below);
- going through the steps from Eqs. (7)-(12) of Ref. [2], under the assumptions stated in the previous items, we can further rewrite Eq. (3) as

$$f(\mu\,|\,\boldsymbol{x},\boldsymbol{\sigma}) \propto \exp\!\left[-\frac{(\mu-\bar{x})^2}{2\,\sigma_C^2}\right]\cdot f_0(\mu)\,,$$

where

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i/\sigma_i^2}{\sum_{i=1}^{n} 1/\sigma_i^2}\,, \qquad \sigma_C^2 = \left(\sum_{i=1}^{n}\frac{1}{\sigma_i^2}\right)^{-1},$$

in which we recognize Gauss' Eqs. (G1) and (G2). In terms of likelihoods,

$$\mathcal{L}(\mu;\,\boldsymbol{x},\boldsymbol{\sigma}) \propto \mathcal{L}(\mu;\,\bar{x},\sigma_C)\,. \qquad (7)$$
Equation (7) is an important result, related to the concept of statistical sufficiency: the inference is exactly the same if, instead of using the detailed information provided by $\boldsymbol{x}$ and $\boldsymbol{\sigma}$, we just use the weighted mean $\bar{x}$ and its standard deviation $\sigma_C$, as if $\bar{x}$ were a single equivalent observation of $\mu$ with a Gaussian error function with “degree of accuracy” [1] $h_C = 1/(\sqrt{2}\,\sigma_C)$. This is exactly the result Gauss was aiming at in Book 2, Section 3 of Ref. [1], recalled in the opening quote and in the introduction of Ref. [2].
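The sufficiency expressed by Eq. (7) is easy to check numerically. The following sketch (same invented numbers as before) compares the posterior computed from the individual observations with the one computed from the single equivalent observation $(\bar{x},\sigma_C)$, and also verifies that the posterior mode coincides with the weighted mean, as anticipated in the $\chi^2$ item above:

```python
import numpy as np

x     = np.array([10.2, 9.8, 10.5])
sigma = np.array([0.3, 0.4, 0.6])
mu    = np.linspace(8.0, 12.0, 2001)

w = 1 / sigma**2                          # weights 1/sigma_i^2
xbar    = np.sum(w * x) / np.sum(w)       # Gauss' (G1): weighted mean
sigma_C = np.sqrt(1 / np.sum(w))          # Gauss' (G2): its standard deviation

# Posterior from the n individual observations (flat prior)
chi2 = ((x[:, None] - mu)**2 / sigma[:, None]**2).sum(axis=0)
post_full = np.exp(-chi2 / 2)
post_full /= post_full.sum()

# Posterior from the single equivalent observation (xbar, sigma_C)
post_suff = np.exp(-(xbar - mu)**2 / (2 * sigma_C**2))
post_suff /= post_suff.sum()

print("max |difference|:", np.abs(post_full - post_suff).max())   # ~ 0
print("posterior mode  :", mu[post_full.argmax()], " weighted mean:", xbar)
```

Up to floating-point rounding the two normalized curves are identical, which is precisely the content of Eq. (7).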
Moreover, we can split the sum in the exponent of Eq. (3), i.e. the $\chi^2$ written above, into two contributions, from $i=1$ to $m$ (with $m$ arbitrary) and from $i=m+1$ to $n$, thus having

$$\chi^2 = \sum_{i=1}^{m}\frac{(x_i-\mu)^2}{\sigma_i^2} + \sum_{i=m+1}^{n}\frac{(x_i-\mu)^2}{\sigma_i^2}\,.$$
Going again through the steps from Eq. (7) to Eq. (12) of Ref. [2] we get

$$\chi^2 = \frac{(\mu-\bar{x}_A)^2}{\sigma_A^2} + \frac{(\mu-\bar{x}_B)^2}{\sigma_B^2} + \mbox{const}\,,$$

where $\bar{x}_A$, $\sigma_A$ and $\bar{x}_B$, $\sigma_B$ are the weighted means and standard deviations, Eqs. (G1)-(G2), of the two groups of observations. It follows, writing $\exp(-\chi^2/2)$ as a product of exponentials and absorbing the constant term into the proportionality factor [2], that

$$f(\mu\,|\,\boldsymbol{x},\boldsymbol{\sigma}) \propto \exp\!\left[-\frac{(\mu-\bar{x}_A)^2}{2\,\sigma_A^2}\right]\cdot \exp\!\left[-\frac{(\mu-\bar{x}_B)^2}{2\,\sigma_B^2}\right]\cdot f_0(\mu)\,,$$

that is, in terms of likelihoods,

$$\mathcal{L}(\mu;\,\boldsymbol{x},\boldsymbol{\sigma}) \propto \mathcal{L}(\mu;\,\bar{x}_A,\sigma_A)\cdot \mathcal{L}(\mu;\,\bar{x}_B,\sigma_B)\,.$$
The result can be extended to averages of averages, that is

$$\mathcal{L}(\mu;\,\boldsymbol{x},\boldsymbol{\sigma}) \propto \mathcal{L}(\mu;\,\bar{x}_{AB},\sigma_{AB})\,,$$

where

$$\bar{x}_{AB} = \frac{\bar{x}_A/\sigma_A^2 + \bar{x}_B/\sigma_B^2}{1/\sigma_A^2 + 1/\sigma_B^2}\,, \qquad \sigma_{AB}^2 = \left(\frac{1}{\sigma_A^2}+\frac{1}{\sigma_B^2}\right)^{-1},$$

i.e. $\bar{x}_{AB}$ and $\sigma_{AB}$ are obtained by applying Gauss' Eqs. (G1)-(G2) to the two partial averages, and they coincide with $\bar{x}$ and $\sigma_C$.
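Again with invented numbers, a short numerical check that combining the two partial averages by means of Gauss' Eqs. (G1)-(G2) reproduces the overall weighted average and its standard deviation:

```python
import numpy as np

x     = np.array([10.2, 9.8, 10.5, 10.1, 9.9])
sigma = np.array([0.3, 0.4, 0.6, 0.5, 0.35])
m = 2                                     # arbitrary split point

def wmean(x, sigma):
    """Weighted mean and its standard deviation, Gauss' Eqs. (G1)-(G2)."""
    w = 1 / sigma**2
    return np.sum(w * x) / np.sum(w), np.sqrt(1 / np.sum(w))

xbar,    sC  = wmean(x, sigma)            # all observations at once
xbar_A,  sA  = wmean(x[:m], sigma[:m])    # first group
xbar_B,  sB  = wmean(x[m:], sigma[m:])    # second group
xbar_AB, sAB = wmean(np.array([xbar_A, xbar_B]),   # average of the
                     np.array([sA, sB]))           # two partial averages

print(xbar, "=", xbar_AB)                 # identical (up to rounding)
print(sC,   "=", sAB)                     # identical (up to rounding)
```

Both printed pairs coincide, as the algebra above guarantees.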
The property can be extended further to many partial averages, showing that the inference does not depend on whether we use the individual observations, their overall weighted average, the grouped weighted averages, or the weighted average of the grouped averages.
This is one of the `amazing' properties
of the Gaussian distribution, which simplifies our work
when it is possible to use it.
But there is no guarantee that it works in general, and it should then be proved case by case.
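As an illustration of this caveat, here is a final sketch in which the Gaussian error functions are replaced by Cauchy ones; the choice of model is an assumption made only for this example, not taken from Refs. [1,2]. The posterior computed from the individual observations is no longer reproduced by any single `equivalent' observation of the same type, so the sample cannot be compressed into a couple of numbers:

```python
import numpy as np

# Same invented data, but with a Cauchy error model instead of a Gaussian:
# f(x_i | mu, gamma_i) proportional to 1 / (1 + ((x_i - mu)/gamma_i)^2)
x     = np.array([10.2, 9.8, 10.5])
gamma = np.array([0.3, 0.4, 0.6])
mu    = np.linspace(8.0, 12.0, 2001)

post_full = (1 / (1 + ((x[:, None] - mu) / gamma[:, None])**2)).prod(axis=0)
post_full /= post_full.sum()

# Try to mimic it with a single Cauchy term centred at the posterior mode,
# scanning a few trial widths: none reproduces the shape, because the
# product of Cauchy densities is not itself a Cauchy density.
x0 = mu[post_full.argmax()]
for g0 in (0.2, 0.3, 0.5):
    post_one = 1 / (1 + ((x0 - mu) / g0)**2)
    post_one /= post_one.sum()
    print(f"g0 = {g0}: max |difference| = {np.abs(post_full - post_one).max():.3g}")
```

The printed differences do not vanish for any trial width, consistently with the warning above.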