At this point some further remarks on the utility
of Eq. () are in order.
Its advantage,
within its limits of validity (checked in our case),
is that it allows us to disentangle the contributions to the overall uncertainty.
In particular we can rewrite it as
that is a
`quadratic sum' (or `quadratic combination', indicated by the symbol `')
of three contributions,
due, as indicated by the suffixes, to the binomials (`'
standing for `random'), to the uncertainty on
and to that on .
This quadratic combination of the contributions
can be easily extended, just dividing by ,
to the uncertainty on the fraction of positives,
thus getting
quadratic sum of
We see immediately, for example,
that for around 0.1 the contribution due to
dominates over that due to
by a factor
.
This allows us
to evaluate, on the basis of the Monte Carlo
results shown in Tab. , the
contribution due to the systematic effects alone.
For example we get, for our customary
values of and ,
equal to
0.003 and 0.020, respectively.
Assuming a quadratic combination,
the contribution due to systematics is then
. Apart from questions of
rounding, it is clear that the uncertainty is largely dominated
by the uncertainty on and .
We can check this result by a direct, although
approximate, calculation using
Eqs. () and ():
getting the same result.
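The quadratic subtraction used above can be sketched numerically as follows. This is only a minimal illustration: we assume that, of the two values quoted in the text, 0.003 is the contribution due to the binomials alone and 0.020 is the overall uncertainty; the function and variable names are ours and purely illustrative.

```python
import math


def quadratic_sum(*contributions):
    """Combine independent standard uncertainties in quadrature."""
    return math.sqrt(sum(c * c for c in contributions))


def quadratic_difference(total, part):
    """Remove one contribution from a total that is a quadratic sum."""
    if part > total:
        raise ValueError("a single contribution cannot exceed the total")
    return math.sqrt(total**2 - part**2)


# Assumed reading of the two values quoted in the text.
sigma_random = 0.003  # binomials alone (assumption)
sigma_total = 0.020   # overall uncertainty (assumption)

sigma_syst = quadratic_difference(sigma_total, sigma_random)
print(f"systematic contribution: {sigma_syst:.4f}")  # 0.0198

# Share of the total variance carried by the systematic contribution.
variance_share = (sigma_syst / sigma_total) ** 2
print(f"share of the variance: {variance_share:.1%}")
```

Under these assumptions the systematic contribution is indistinguishable from the overall uncertainty once both are rounded to three decimal places.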
Looking at the numbers of Tab. ,
we see that this effect starts already at .
For example, for we get
,
twice the standard uncertainty of 0.010 due to the binomials alone.
The sample size at which the two contributions have the
same weight in the global uncertainty is around 300
(for example, for we get
).
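The interplay between the two contributions as the sample size changes can be illustrated with a short sketch. It rests on assumptions that are ours and not taken from the text: the contribution due to the binomials is taken to scale as the inverse square root of the sample size, the quoted value of 0.010 is assumed to refer to a sample size of 1000, and the systematic contribution is fixed at the rounded value 0.020. All names are hypothetical.

```python
import math

# Assumed inputs (see the caveats in the lead-in).
SIGMA_REF = 0.010   # binomial contribution quoted in the text
N_REF = 1000        # sample size it is assumed to refer to (assumption)
SIGMA_SYST = 0.020  # systematic contribution, rounded (assumption)


def random_sigma(n, sigma_ref=SIGMA_REF, n_ref=N_REF):
    """Binomial-type contribution, rescaled as 1/sqrt(n) from a reference point."""
    return sigma_ref * math.sqrt(n_ref / n)


def crossover_size(sigma_ref=SIGMA_REF, n_ref=N_REF, sigma_syst=SIGMA_SYST):
    """Sample size at which the random and systematic contributions are equal."""
    return n_ref * (sigma_ref / sigma_syst) ** 2


n_equal = crossover_size()
print(f"equal-weight sample size: {n_equal:.0f}")  # 250

for n in (100, 250, 1000, 10000):
    s_random = random_sigma(n)
    s_total = math.hypot(s_random, SIGMA_SYST)  # quadratic sum of the two
    print(f"n = {n:>5}: random = {s_random:.4f}, total = {s_total:.4f}")
```

With these rounded inputs the equal-weight sample size is a few hundred, and beyond a sample size of about 1000 the total hardly decreases any further.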
The take-home message is, at this point,
rather clear (and well known to physicists and other scientists):
unless we are able to make our knowledge about and more
accurate, using sample sizes much larger than 1000 is
only a waste of time.
However, there is still another important effect we need
to consider, due to the fact that we
are indeed sampling a population.
This effect leads unavoidably to extra variability
and therefore to a new contribution to the uncertainty in prediction
(which will in turn be reflected in the uncertainty of the inferential process).
Before moving to this other important effect, let us
exploit a bit further the approximate evaluation of
.
For example,
solving with respect to the condition
we get from Eqs. ()-()
which gives a rough idea of the sample size above which
the uncertainty due to systematics starts to dominate.
For example, for we get
of the order of magnitude (
)
obtained from the Monte Carlo study.
If we require, to be safe,
2-3
we get
and
,
again in reasonable
agreement with the results of Tab. .
We shall go through a more complete analysis of
in Sec. , in which
a further contribution to the uncertainty will also be taken
into account.