Model and analysis method
In dealing with problems of this kind,
we have learned (see e.g. [17])
the importance of building a graphical representation
of the causal model relating the quantities
of interest, some of them `observed' and others `unobserved',
the latter including the quantities we wish to infer.
In this case too, despite some initial skepticism
about the possibility of obtaining meaningful results,
once we had built the model, which is very basic indeed,
it became clear that the main outcome concerning the vaccine efficacy
did not depend on the many details of the trials.
Our initial doubts concerned the many
details about
the people involved in the test campaign,
but these turned out
to be much less critical than we had at first thought.
The causal model used in this analysis
is implemented in the Bayesian network
of Fig.
.
Figure:
Simplified Bayesian network of the vaccine vs placebo
experiment (see text).
The top nodes $n_V$ and $n_P$ stand for the
numbers of individuals in the vaccine and placebo (i.e. control)
groups, respectively,
as the subscripts indicate, while the bottom
ones ($n_{V_I}$ and $n_{P_I}$) are the numbers of individuals
of the two groups
found infected during the trial. These are the observed nodes
of our model and their values are summarized in Tab.
.
Then there is the question of how to relate the numbers
of infected individuals to the numbers of participants in the trial.
This depends on several variables, such as the
prevalence of the virus in the population(s) of the people involved,
their social behavior, personal
lifestyle, age, state of health and so on. And, hopefully,
it depends on whether
a person has been vaccinated or not.
Lacking detailed information, we simplify
the model by introducing an assault probability $p_A$,
a catch-all term embedding the many real-life variables, apart from
being vaccinated or not.
Nodes $n_{V_A}$ and $n_{P_A}$ in the network
then represent the numbers of `assaulted individuals'
in each group, and they are modeled according to
binomial distributions, that is
$$ n_{V_A} \sim \mathrm{Binom}(n_V, p_A) $$
$$ n_{P_A} \sim \mathrm{Binom}(n_P, p_A), $$
represented in the graphical model by solid arrows.
The `assaulted individuals' of the control group
are then assumed to be all infected, hence the
deterministic link, drawn as a dashed arrow, relating node $n_{P_A}$
to node $n_{P_I}$
(indeed the two numbers are exactly
the same in our model, and we make this distinction
only for graphical symmetry with respect to the vaccine group).
Instead, the `assaulted individuals' of the other group
are `shielded' by the vaccine with probability $\epsilon$,
which we therefore identify with the efficacy,
although we shall come back in due course to what
should be reported as `efficacy'.
The probability of becoming
infected if assaulted is therefore equal to $1-\epsilon$,
so that node $n_{V_I}$ is related to node $n_{V_A}$ by
$$ n_{V_I} \sim \mathrm{Binom}(n_{V_A}, 1-\epsilon). $$
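The causal chain just described (participants, then `assaulted individuals', then infected individuals) can be run forward as a simulation. The following is a minimal sketch in base R, not the analysis of the paper: the function name `simulate_trial` and all numerical values are hypothetical, chosen only for illustration.

```r
# Forward simulation of the simplified causal model (base R only).
# All names and numbers below are hypothetical illustrations.
simulate_trial <- function(n_V, n_P, p_A, eps) {
  # 'Assaulted' individuals in each group: binomial with assault probability p_A
  n_VA <- rbinom(1, size = n_V, prob = p_A)
  n_PA <- rbinom(1, size = n_P, prob = p_A)
  # Control group: every assaulted individual is infected (deterministic link)
  n_PI <- n_PA
  # Vaccine group: an assaulted individual is infected with probability 1 - eps
  n_VI <- rbinom(1, size = n_VA, prob = 1 - eps)
  c(n_VA = n_VA, n_PA = n_PA, n_VI = n_VI, n_PI = n_PI)
}

set.seed(1)  # for reproducibility of this illustration
sim <- simulate_trial(n_V = 10000, n_P = 10000, p_A = 0.01, eps = 0.9)
print(sim)
```

The inference problem treated in the text runs in the opposite direction: given the observed numbers of participants and of infected individuals, it asks for the values of the assault probability and of the efficacy.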
At this point all the rest is a matter of calculation,
which we do by MCMC techniques with the help of the program JAGS [15],
interfaced with R [18] via rjags [19].
The nice thing about using such a tool is that we only have to take
care of describing the model, with
instructions whose meaning is quite
transparent:
model {
  nP.I ~ dbin(pA, nP)      # 1.
  nV.A ~ dbin(pA, nV)      # 2.
  pA   ~ dbeta(1,1)        # 3.
  nV.I ~ dbin(ffe, nV.A)   # 4.  [ ffe = 1 - eff ]
  ffe  ~ dbeta(1,1)        # 5.
  eff  <- 1 - ffe          # 6.
}
We easily recognize in lines 1. and 2. of the JAGS model
the two binomial relations involving $p_A$ given above
(line 1. is written directly for $n_{P_I}$, since $n_{P_A}$
and $n_{P_I}$ coincide in our model), while line 4. stands for
the binomial relation between $n_{V_I}$ and $n_{V_A}$.
Line 6. is simply the transformation
of `$1-\epsilon$' (`ffe' in the code) to $\epsilon$,
the quantity we want to trace in the `chain'.
Finally, lines 3. and 5. describe the priors of the
`unobserved nodes' that have no `parents', in this case
$p_A$ and $1-\epsilon$.
We use for both a uniform prior,
modeled by a Beta distribution (see Sec.
for details) with
parameters $(1, 1)$.
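As a quick check that a Beta distribution with both parameters equal to 1 is indeed uniform, one can write the general Beta density with generic shape parameters $r$ and $s$ (these symbol names are an assumption made here for illustration; they correspond to the two arguments of `dbeta` in the model) and set $r = s = 1$:

```latex
f(x \mid r, s) = \frac{x^{r-1}\,(1-x)^{s-1}}{B(r, s)}, \qquad 0 \le x \le 1,
\qquad\Longrightarrow\qquad
f(x \mid 1, 1) = \frac{1}{B(1, 1)} = 1 ,
```

since $B(1,1) = \Gamma(1)\,\Gamma(1)/\Gamma(2) = 1$.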
Then we have to provide the data, in our case
the observed values of $n_V$, $n_P$, $n_{V_I}$ and $n_{P_I}$.
The program samples the space of possibilities
and returns lists of numbers (a `chain')
for each `monitored variable', which can then
be analyzed `statistically'. For example,
the frequency of occurrence of the values
in each list is expected to be proportional
to the probability of those values of the variable
(Bernoulli's theorem). Similarly, we can evaluate
correlations among variables.
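The workflow described above (describe the model, provide the data, collect the chain, analyze it statistically) can be sketched in R as follows. This is only an illustrative sketch, assuming that JAGS and the rjags package are installed: the data values are hypothetical and are not the trial data of the paper, and the chain lengths are arbitrary choices.

```r
# Sketch of the rjags workflow on HYPOTHETICAL data (not the trial data).
model_string <- "
model {
  nP.I ~ dbin(pA, nP)
  nV.A ~ dbin(pA, nV)
  pA   ~ dbeta(1,1)
  nV.I ~ dbin(ffe, nV.A)
  ffe  ~ dbeta(1,1)
  eff  <- 1 - ffe
}"

# Hypothetical observed nodes, for illustration only
data <- list(nV = 15000, nP = 15000, nV.I = 10, nP.I = 200)

# Run only if rjags (and hence JAGS) is available on this machine
if (requireNamespace("rjags", quietly = TRUE)) {
  library(rjags)
  jm <- jags.model(textConnection(model_string), data = data,
                   n.chains = 1, quiet = TRUE)
  update(jm, 1000, progress.bar = "none")        # burn-in (arbitrary length)
  chain <- coda.samples(jm, variable.names = c("eff", "pA"),
                        n.iter = 10000, progress.bar = "none")
  draws <- as.matrix(chain)                      # one column per monitored variable

  # 'Statistical' analysis of the chain
  print(summary(chain))
  print(quantile(draws[, "eff"], c(0.025, 0.5, 0.975)))
  # Relative frequency in the chain as an estimate of a probability
  print(mean(draws[, "eff"] > 0.9))
  # Correlation between the two monitored variables
  print(cor(draws[, "eff"], draws[, "pA"]))
}
```

Monitoring both `eff` and `pA` makes it possible to look at their correlation, in addition to the summaries of each variable taken separately.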
Table:
Top table: MCMC results for the
model parameter
(see text).
Bottom table: same as Tab.
for easier comparison with the MCMC results.