The graphical model describing the quantities of interest
is shown in the left hand network of Fig. ,
based on that of Fig. ,
to which we have added parents to the nodes
$n_I$ and $n_{NI}$, the numbers
of Infected and Not Infected in the sample, respectively.
More precisely, the number of infectees
in the sample, $n_I$, is described by a hypergeometric distribution,
that is
$$n_I \sim \mbox{HG}(N_I, N_{NI}, n_s)\,,$$
with $N_I$ and $N_{NI}$ the numbers of infected and not infected individuals
in the population and $n_s$ the sample size. Then, the number of not
infected people in the sample, $n_{NI}$, is
deterministically related to $n_I$, being
$$n_{NI} = n_s - n_I\,.$$
However, since in this paper
we are interested in sample sizes
much smaller than those of the populations, we can remodel
the problem according to the right hand network
of Fig. , in which
$n_I$ is described by a binomial distribution, that is
$$n_I \sim \mbox{Binom}(n_s, p)\,,$$
with
$$p = \frac{N_I}{N_I + N_{NI}}\,.$$
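The quality of the binomial approximation for a sample much smaller than the population can be checked numerically. The following is a minimal sketch in Python (not the R code used in this paper), relying only on the standard library. The population and sample sizes are hypothetical values chosen for illustration, and the names `hypergeom_pmf` and `binom_pmf` are introduced here; $n_I$ is the number of infected in the sample, $N_I$ and $N_{NI}$ the numbers of infected and not infected in the population, $n_s$ the sample size.

```python
from math import comb


def hypergeom_pmf(k, n_inf, n_not_inf, n_s):
    """P(n_I = k) when n_s individuals are drawn without replacement
    from a population of n_inf infected and n_not_inf not infected."""
    if k < 0 or k > n_inf or n_s - k < 0 or n_s - k > n_not_inf:
        return 0.0
    return comb(n_inf, k) * comb(n_not_inf, n_s - k) / comb(n_inf + n_not_inf, n_s)


def binom_pmf(k, n_s, p):
    """P(n_I = k) when each sampled individual is infected with probability p."""
    return comb(n_s, k) * p**k * (1.0 - p) ** (n_s - k)


# Hypothetical numbers, chosen only for illustration (not from the text)
N_I, N_NI, n_s = 10_000, 90_000, 100
p = N_I / (N_I + N_NI)

# Largest pointwise difference between the two probability functions
max_diff = max(
    abs(hypergeom_pmf(k, N_I, N_NI, n_s) - binom_pmf(k, n_s, p))
    for k in range(n_s + 1)
)
print(f"p = {p:.3f}, largest pmf difference = {max_diff:.2e}")
```

With a sample of 100 out of a population of 100000 the two probability functions are expected to be practically indistinguishable; increasing `n_s` towards the population size makes the difference grow.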
This simplified model has been re-drawn in the network
shown in the left hand side
of Fig. ,
indicating by the symbol `' the certain variables
in the game (indeed those which are for some
reason assumed),
in contrast to the others, which are uncertain
and whose values will be ranked in degree of belief
following the rules of probability theory.

Figure:
Simplified graphical model of
Fig. rewritten in order to
make explicit `known'/`assumed' quantities, tagged by the symbol
'', and the uncertain ones. In particular, in the left
hand diagram precise values of $\pi_1$ and $\pi_2$ are assumed,
while in the right hand one the uncertainty on their values
is modeled with Beta pdf's with parameters $(r_1, s_1)$ and $(r_2, s_2)$.

Note that in this diagram $\pi_1$ and $\pi_2$ are assumed to be
exactly known. Instead, as we have already seen in Sec. ,
their values are uncertain and their probability distributions can
be conveniently modeled by Beta probability functions characterized
by parameters $r$'s and $s$'s. The graphical
model which also takes into account
the uncertainty about $\pi_1$ and $\pi_2$
is drawn in the same Fig.
(right side).
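To illustrate what modeling the uncertainty on the two test parameters with Beta distributions amounts to in practice, here is a minimal Python sketch (standard library only, not the R or JAGS code of this paper). The Beta parameters below are hypothetical and are NOT the values used in the text; the helper names `beta_mean` and `beta_std` are introduced here. The sampled values are compared with the well-known closed-form mean $r/(r+s)$ and standard deviation $\sqrt{rs/[(r+s)^2(r+s+1)]}$ of a Beta distribution.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(1)  # fixed seed, so that the sketch is reproducible

# Hypothetical Beta parameters (assumptions for illustration only)
r1, s1 = 98.0, 2.0    # first test parameter, centered around 0.98
r2, s2 = 12.0, 88.0   # second test parameter, centered around 0.12
n_draws = 20_000


def beta_mean(r, s):
    """Closed-form mean of Beta(r, s)."""
    return r / (r + s)


def beta_std(r, s):
    """Closed-form standard deviation of Beta(r, s)."""
    return (r * s / ((r + s) ** 2 * (r + s + 1.0))) ** 0.5


# Monte Carlo draws of the two uncertain parameters
pi1_draws = [rng.betavariate(r1, s1) for _ in range(n_draws)]
pi2_draws = [rng.betavariate(r2, s2) for _ in range(n_draws)]

for name, draws, r, s in (("pi1", pi1_draws, r1, s1), ("pi2", pi2_draws, r2, s2)):
    print(f"{name}: MC mean = {fmean(draws):.4f} (exact {beta_mean(r, s):.4f}), "
          f"MC std = {pstdev(draws):.4f} (exact {beta_std(r, s):.4f})")
```

Each draw of the pair can then be fed to a simulation of the sample, so that the spread of the simulated results reflects also the uncertainty on the test performance.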
We have already discussed extensively, in Sec. ,
how the expectation of $n_P$, and therefore of the fraction
of positives in the sample, $f_P$, depends on the model parameters.
Now we go a bit deeper into the question of the dependence
of $f_P$ on the fraction of infectees in the population, $p$, and,
more precisely, we ask which are the `closest' (to be defined somehow)
two values of $p$ such that the resulting $f_P$'s
are `reasonably separated' (again to be defined somehow)
from each other. Moreover, instead of simply relying
on the approximate
formulae developed in Sec. , we are going
to use Monte Carlo methods in different ways: initially
just based on R random number generators; then using
(well below its potential!) the program JAGS, which will then
be used in Sec. for inferences.
However, we shall keep using
the approximate formulae as a cross check and
to derive some useful, although approximate, results in closed form.
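The kind of Monte Carlo study just outlined can be sketched as follows, in Python rather than R or JAGS and using only the standard library. Several ingredients are assumptions made for this illustration and are not taken from the text: the probabilities `pi1` and `pi2` that an infected and a not infected individual are tagged as positive are held at fixed hypothetical values; the number of positives is taken as the sum of two binomial contributions, one from the infected and one from the not infected in the sample; the expected fraction of positives is $\pi_1 p + \pi_2 (1-p)$; and the `separation' is quantified, as just one possible choice, by the difference of the two means divided by the quadratic sum of the two standard deviations.

```python
import random
from statistics import fmean, pstdev


def simulate_fp(p, pi1, pi2, n_s, n_rep, rng):
    """Return n_rep simulated fractions of positives in a sample of size n_s."""
    values = []
    for _ in range(n_rep):
        # Infected in the sample: binomial with parameters n_s and p
        n_I = sum(rng.random() < p for _ in range(n_s))
        # Positives among the infected and among the not infected
        n_P_I = sum(rng.random() < pi1 for _ in range(n_I))
        n_P_NI = sum(rng.random() < pi2 for _ in range(n_s - n_I))
        values.append((n_P_I + n_P_NI) / n_s)
    return values


rng = random.Random(2020)   # fixed seed for reproducibility
pi1, pi2 = 0.98, 0.12       # hypothetical test parameters (assumption)
n_s, n_rep = 500, 1000      # hypothetical sample size and number of replicas

results = {}
for p in (0.10, 0.20):      # two hypothetical fractions of infectees
    fp = simulate_fp(p, pi1, pi2, n_s, n_rep, rng)
    expected = pi1 * p + pi2 * (1.0 - p)
    results[p] = (fmean(fp), pstdev(fp), expected)
    print(f"p = {p:.2f}: f_P = {results[p][0]:.4f} +- {results[p][1]:.4f} "
          f"(expected {expected:.4f})")

# One possible measure of separation between the two f_P distributions
(m_a, s_a, _), (m_b, s_b, _) = results[0.10], results[0.20]
separation = abs(m_b - m_a) / (s_a**2 + s_b**2) ** 0.5
print(f"separation = {separation:.2f} combined standard deviations")
```

Repeating the exercise with values of $p$ closer to each other, or with a different sample size, shows how the separation shrinks or grows; other definitions of `reasonably separated' would only change the last lines.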