Obviously, if the sample size is equal to the size of the whole population,
i.e. if we completely empty the box, then
we acquire full knowledge of the box content and the solution
is trivial. However, in most cases
we are unable to analyze the entire population
and we have to infer the proportion of interest
from a sample.
Therefore, although the proportion observed in the sample
can be a reasonable rough
estimate of the proportion in the population,
we can never be sure about the true proportion.
At most, there are numerical values we shall believe
more (those around the proportion observed in the sample)
and others we shall believe less.
This problem was first tackled analytically
by Laplace in 1774 [27].
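The idea that some values of the proportion deserve more belief than others can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions: a uniform prior over the proportion, binomial sampling (a box much larger than the sample), and hypothetical numbers (20 balls drawn, 5 of them white). The function name `posterior_on_grid` is introduced here and is not part of the original text.

```python
from math import comb

def posterior_on_grid(n_sample, n_white, grid_size=101):
    """Posterior probabilities of the proportion p on an evenly spaced grid.

    Assumptions of this sketch: uniform prior over the grid points and
    binomial sampling (population much larger than the sample).
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Binomial likelihood of the observed sample for each candidate p.
    likelihood = [
        comb(n_sample, n_white) * p**n_white * (1 - p) ** (n_sample - n_white)
        for p in grid
    ]
    total = sum(likelihood)
    # With a uniform prior, the posterior is the normalized likelihood.
    return grid, [w / total for w in likelihood]

# Hypothetical sample: 20 balls drawn, 5 of them white.
grid, post = posterior_on_grid(n_sample=20, n_white=5)
p_best = grid[post.index(max(post))]
print("most believed value of p:", p_best)
```

Under these assumptions the most believed value coincides with the proportion observed in the sample, while neighbouring values keep a sizeable probability, which is why the sample proportion is only a rough estimate.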
Let us now complicate the problem, taking into account the fact that we are not even sure about the characteristics of each sampled individual, as we are, instead, with black and white balls. This is exactly what happens with infections of different kinds, unless the symptoms are so evident and unique as to rule out any other explanation. We then have to rely on tests that are typically not perfect, especially if we have neither time nor money to inspect each individual in detail in order to really see the active agent. Sticking to tests providing only a binary response, as we hear and read in the media, and assuming that such testing devices and procedures are planned to detect the infected individuals, we expect that if the answer is positive then there should be a quite high chance that the individual is really infected, and a small chance that she is not. Similarly, if the answer is negative, there should be a high chance that the individual is not infected. (The conditionals are due to the fact that there are other pieces of information to take into account, as we shall see.)
We can therefore characterize the test by two virtually continuous
numbers, $\pi_1$ and $\pi_2$,
in the range between 0 and 1 such that, depending on whether
the individual is infected or not,
the test procedure
provides positive and negative answers with
probabilities

$$
\begin{aligned}
P(\mathrm{Pos}\,|\,\mathrm{Inf}) &= \pi_1 \\
P(\mathrm{Neg}\,|\,\mathrm{Inf}) &= 1-\pi_1 \\
P(\mathrm{Pos}\,|\,\mathrm{NoInf}) &= \pi_2 \\
P(\mathrm{Neg}\,|\,\mathrm{NoInf}) &= 1-\pi_2
\end{aligned}
$$
As is easy to understand, the numerical values of these two parameters
do not come from first principles, but result from previous
measurements. They are therefore affected by uncertainty, as
all results of measurements typically are [29].
Probability distributions must therefore also be associated with
the possible numerical values of these two test parameters.
Within this section, however, we take the
freedom to use their nominal values of 0.98 and 0.12
for our first rough considerations.
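As a first rough consideration, the nominal values can be plugged into a minimal sketch of the test model. Two assumptions of this sketch should be stressed: 0.98 is read as the probability of a positive answer for an infected individual and 0.12 as the probability of a positive answer for a non-infected one, and the proportions of infected passed to `expected_positive_fraction` are purely hypothetical. All names in the code are illustrative.

```python
# Nominal test parameters quoted in the text. Which number belongs to which
# conditional probability is an assumption of this sketch.
P_POS_GIVEN_INF = 0.98    # probability of a positive answer if infected
P_POS_GIVEN_NOINF = 0.12  # probability of a positive answer if not infected

def outcome_probabilities(infected):
    """Probabilities of the two test answers for a given true state."""
    p_pos = P_POS_GIVEN_INF if infected else P_POS_GIVEN_NOINF
    return {"pos": p_pos, "neg": 1.0 - p_pos}

def expected_positive_fraction(p_infected):
    """Expected fraction of positive answers in a population in which a
    fraction p_infected is infected (law of total probability)."""
    return (P_POS_GIVEN_INF * p_infected
            + P_POS_GIVEN_NOINF * (1.0 - p_infected))

# Hypothetical proportions of infected individuals, for illustration only.
for p in (0.0, 0.1, 0.5):
    print(p, round(expected_positive_fraction(p), 4))
```

With this reading of the parameters, even a population with no infected individuals would yield about 12% positive answers, which already hints that the fraction of positives cannot be taken at face value as the proportion of infected.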