Inferring the proportions of infectees in two different populations

Let us now work through what was anticipated in Sec. [*], where we discussed predictions. We have seen that, at least in our model, an important contribution to the uncertainty comes from systematics, related to the uncertain knowledge of $\pi_1$ and $\pi_2$. We therefore cannot hope to reduce the uncertainty on $p$ simply by increasing the sample size at will. Nevertheless, as a consequence of what we have seen in Sec. [*], we expect to be able to measure the difference of the proportions of infectees in two populations much more precisely than we can measure a single proportion.

Let us again use sample sizes of 10000 (they could differ between the two populations) and imagine that we get numbers of positives that are rather `close', in the sense we know from the predictive distribution: $n_P^{(1)}=2000$ and $n_P^{(2)}=2200$. As far as sensitivity and specificity are concerned, since we have already learned their effect, let us stick, for this exercise, to our default case, summarized by $\pi_1=0.978\pm 0.007$ and $\pi_2=0.115\pm 0.022$. The R script is given in Appendix B.11. Here is the result of the joint inference and of the difference of the proportions:

$\displaystyle p^{(1)} = 0.097 \pm 0.023$  
$\displaystyle p^{(2)} = 0.120 \pm 0.022$  
$\displaystyle \Delta p = p^{(2)} - p^{(1)} = 0.023 \pm 0.007$  
$\displaystyle \rho(p^{(1)},p^{(2)}) = 0.955\,.$  
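
As a quick consistency check, the uncertainty on $\Delta p$ can be recovered from the other three numbers with the usual formula for the variance of a difference of two correlated quantities. The symbols $\sigma_1$ and $\sigma_2$ (standard uncertainties of $p^{(1)}$ and $p^{(2)}$) are introduced here only for illustration, and the arithmetic uses the rounded values quoted above, so the last digit is only indicative:

```latex
\begin{align*}
\sigma^2(\Delta p) &= \sigma_1^2 + \sigma_2^2 - 2\,\rho\,\sigma_1\,\sigma_2 \\
                   &\approx 0.023^2 + 0.022^2 - 2\times 0.955\times 0.023\times 0.022 \\
                   &\approx 0.000529 + 0.000484 - 0.000966 \;\approx\; 0.000047 \\
\sigma(\Delta p)   &\approx 0.007
\end{align*}
```

With $\rho = 0$ the same formula would give $\sigma(\Delta p) \approx \sqrt{0.001013} \approx 0.032$, that is, an uncertainty larger than the difference itself.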

As we see, $p^{(1)}$ and $p^{(2)}$ are, as we are used to saying, `equal within the uncertainties', but their difference is nevertheless rather `significant'. This is because the common systematics induce a quite strong positive correlation between the determinations of the two proportions, quantified by the correlation coefficient. The relevance of measuring differences was already discussed in Sec. [*], in which we also provided some details on how to evaluate the uncertainty of the difference from the other pieces of information. Here we would just like to stress its practical and economic importance. For example, dozens of regions of a state could be sampled and tested with `rather cheap' kits, with performances of the kind we have seen here (it is important, however, that the same kind of kit is used everywhere!), while only one region (or a couple of them, just for cross-checks) would also be tested with a more expensive (and hopefully more accurate) kit. The region(s) tested with the high quality kit could then be used as calibration point(s) for the others, and the practical impact on the planning of a test campaign is rather evident.
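
The mechanism behind the correlation can be illustrated with a small Monte Carlo sketch. This is not the JAGS-based inference of Appendix B.11 but a simplified propagation under stated assumptions: $\pi_2$ is read as the probability of a positive result for a non-infected individual, each proportion is obtained by inverting $f_P = \pi_1\,p + \pi_2\,(1-p)$, the uncertainties on $\pi_1$ and $\pi_2$ are modelled by Beta distributions moment-matched to the quoted means and standard deviations, and the binomial fluctuation of the fraction of positives is approximated by a Gaussian. The function names `beta_params`, `simulate` and `summarize` are hypothetical helpers written for this example.

```python
import math
import random


def beta_params(mean, std):
    """Moment-match Beta(a, b) to a given mean and standard deviation."""
    nu = mean * (1.0 - mean) / std ** 2 - 1.0
    return mean * nu, (1.0 - mean) * nu


def simulate(n_pos=(2000, 2200), n_s=10000, n_draws=50000, seed=1):
    """Draw (p1, p2) pairs; each draw uses the SAME pi_1, pi_2 for both."""
    rng = random.Random(seed)
    a1, b1 = beta_params(0.978, 0.007)  # pi_1 (assumed Beta-distributed)
    a2, b2 = beta_params(0.115, 0.022)  # pi_2 (assumed Beta-distributed)
    f_obs = [n / n_s for n in n_pos]
    # Assumption: Gaussian approximation of the binomial fluctuation.
    f_sig = [math.sqrt(f * (1.0 - f) / n_s) for f in f_obs]
    p1, p2 = [], []
    for _ in range(n_draws):
        pi1 = rng.betavariate(a1, b1)       # common systematic
        pi2 = rng.betavariate(a2, b2)       # common systematic
        f1 = rng.gauss(f_obs[0], f_sig[0])  # independent, population 1
        f2 = rng.gauss(f_obs[1], f_sig[1])  # independent, population 2
        # Invert f = pi1 * p + pi2 * (1 - p) for p.
        p1.append((f1 - pi2) / (pi1 - pi2))
        p2.append((f2 - pi2) / (pi1 - pi2))
    return p1, p2


def summarize(p1, p2):
    """Means, standard deviations and correlation of the simulated pairs."""
    n = len(p1)
    m1, m2 = sum(p1) / n, sum(p2) / n
    v1 = sum((x - m1) ** 2 for x in p1) / n
    v2 = sum((y - m2) ** 2 for y in p2) / n
    cov = sum((x - m1) * (y - m2) for x, y in zip(p1, p2)) / n
    return {
        "mean_p1": m1,
        "mean_p2": m2,
        "sd_p1": math.sqrt(v1),
        "sd_p2": math.sqrt(v2),
        "mean_diff": m2 - m1,
        "sd_diff": math.sqrt(v1 + v2 - 2.0 * cov),
        "rho": cov / math.sqrt(v1 * v2),
    }


summary = summarize(*simulate())
for name, value in summary.items():
    print(f"{name:>9s} = {value:.4f}")
```

Because $\pi_1$ and $\pi_2$ are drawn once per iteration and shared by both populations, their fluctuations shift $p^{(1)}$ and $p^{(2)}$ together and largely cancel in the difference, which is left dominated by the independent sampling fluctuations. The numbers printed by this sketch should be comparable to, but need not coincide with, those of the full joint inference.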