Detailed study of the four contributions to $\sigma(f_P)$

At this point it is time to relax the limiting assumption of exact values of sensitivity and specificity, i.e. $\sigma(\pi_1)=\sigma(\pi_2)=0$. Moreover, having checked that the approximated formulae can also account, with great accuracy, for the contribution due to the uncertain value of $p_s$, we find it interesting and useful to study the individual contributions to the uncertainty with which we can forecast the fraction $f_P$ of tested individuals who test positive. For the reader's convenience, we summarize here the relevant approximated expressions, simplified by making use of the equality E$(p_s) = p$:
\begin{align}
\mbox{E}(f_P) &\approx \mbox{E}(\pi_1)\cdot p + \mbox{E}(\pi_2)\cdot (1-p) \tag{69}\\
\sigma(f_P) &\approx \sigma_R(f_P) \oplus \sigma_{p_s}(f_P) \oplus \sigma_{\pi_1}(f_P) \oplus \sigma_{\pi_2}(f_P) \tag{70}\\
\sigma_R(f_P) &= \sqrt{\mbox{E}(\pi_1)\cdot (1-\mbox{E}(\pi_1))\cdot p + \mbox{E}(\pi_2)\cdot (1-\mbox{E}(\pi_2))\cdot (1-p)}/\sqrt{n_s} \tag{71}\\
\sigma_{\pi_1}(f_P) &= \sigma(\pi_1)\cdot p \tag{72}\\
\sigma_{\pi_2}(f_P) &= \sigma(\pi_2)\cdot (1-p) \tag{73}\\
\sigma_{p_s}(f_P) &= \sigma(p_s)\cdot \vert\mbox{E}(\pi_1) - \mbox{E}(\pi_2)\vert \nonumber\\
 &\approx \vert\mbox{E}(\pi_1) - \mbox{E}(\pi_2)\vert \cdot \sqrt{p\cdot (1-p)\cdot (1-n_s/N)}/\sqrt{n_s}\,. \tag{74}
\end{align}

Note that $\sigma_{\pi_1}(f_P)$ and $\sigma_{\pi_2}(f_P)$ are independent of the sample size $n_s$, while $\sigma_R(f_P)$ and $\sigma_{p_s}(f_P)$ exhibit the typical `statistical dependence' $\propto 1/\sqrt{n_s}$. Therefore we shall refer hereafter to $\sigma_R(f_P)$ and $\sigma_{p_s}(f_P)$ as random (or statistical) contributions, and to the others as contributions due to systematics, which cannot be reduced by increasing the sample size.

The upper plot of Fig. [*]

Figure: Contributions to the relative uncertainty on the fraction of positives as a function of the sample size $n_s$, assumed to be much smaller than the population size $N$, for a proportion of infected individuals $p=0.1$. The solid blue line with negative slope is the contribution from $\sigma_R(f_P)$, the dashed blue one is the contribution from $\sigma_{p_s}(f_P)$, and the dotted line is the `quadratic sum' of the two; the lower horizontal red line is the contribution from $\sigma_{\pi_1}(f_P)$ and the upper horizontal one is the contribution from $\sigma_{\pi_2}(f_P)$ (a dotted red line showing their `quadratic sum' practically overlaps the $\pi_2$ contribution). The overall uncertainty is shown by the uppermost curve (dotted brown). The upper plot is for a standard uncertainty on $\pi_2$ of $\sigma(\pi_2)=0.022$. The lower plot is for the case of uncertainty reduced to $\sigma(\pi_2)=\sigma(\pi_1)=0.007$.
\begin{figure}\begin{center}
\epsfig{file=Contributions_rel_unc_p0.1_standard.e...
....92\linewidth}
\\ \mbox{} \vspace{-1.0cm} \mbox{}
\end{center}
\end{figure}
shows, for our reference value of $p=0.1$ and for uncertain $\pi_1$ and $\pi_2$ (summarized as $\pi_1=0.978\pm 0.007$ and $\pi_2=0.115\pm 0.022$), the relative uncertainty on $f_{P}$, that is $\sigma(f_{P})/$E$(f_{P})$, as a function of $n_s$, highlighting the different contributions to the total uncertainty. The horizontal lines represent the two systematic contributions, independent of $n_s$; their quadratic sum does not appear as a separate line in the plot, because it overlaps almost exactly with the dominant systematic contribution, the one due to the uncertain $\pi_2$. The `straight lines with negative slopes' (in a log-log plot, which, as is well known, linearizes power laws) are the individual statistical contributions (solid and dashed, respectively; see the figure caption for details) and their quadratic sum (dotted). The uppermost (dotted brown) curve is the overall uncertainty, dominated at small $n_s$ by the statistical contributions and at high $n_s$ by the systematic ones, namely by $\sigma_{\pi_2}(f_P)$. (We shall come shortly to the meaning and importance of the vertical line.)
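
As a rough numerical check of the curves just described, Eqs. (69)--(74) can be evaluated with the values quoted above. This is only a sketch under the assumption $n_s \ll N$ (so that the factor $1-n_s/N$ is dropped); the two sample sizes used below, $n_s=10^2$ and $n_s=10^4$, are chosen here purely for illustration.
\begin{align*}
\mbox{E}(f_P) &\approx 0.978\times 0.1 + 0.115\times 0.9 = 0.2013\,,\\
\sigma_R(f_P) &\approx \sqrt{0.978\times 0.022\times 0.1 + 0.115\times 0.885\times 0.9}\,/\sqrt{n_s} \approx 0.306/\sqrt{n_s}\,,\\
\sigma_{p_s}(f_P) &\approx 0.863\times\sqrt{0.1\times 0.9}\,/\sqrt{n_s} \approx 0.259/\sqrt{n_s}\,,\\
\sigma_{\pi_1}(f_P) &= 0.007\times 0.1 = 0.0007\,,\\
\sigma_{\pi_2}(f_P) &= 0.022\times 0.9 = 0.0198\,.
\end{align*}
The quadratic sum of the two statistical terms is then $\approx 0.401/\sqrt{n_s}$, while that of the two systematic terms is $\approx 0.0198$, i.e. about $9.8\%$ of E$(f_P)$. For $n_s=10^2$ the statistical part is $\approx 0.040$ and the total relative uncertainty is $\approx 22\%$; for $n_s=10^4$ the statistical part drops to $\approx 0.004$ and the total is $\approx 10\%$, essentially the systematic floor.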

Since the dominant contribution due to $\sigma(\pi_2)$ limits the relative uncertainty on $f_P$ to about $10\%$, reached for $n_s$ above a few thousand, it is interesting to see what we would gain by reducing $\sigma(\pi_2)$ to the value of $\sigma(\pi_1)$. This is done in the bottom plot of Fig. [*], which shows a clear improvement, although the contribution due to $\sigma(\pi_2)$ still dominates over that due to $\sigma(\pi_1)$, because the former enters, for $p=0.1$, with a weight 9 times higher than the latter, as follows from Eqs. ([*]) and ([*]). Moreover, since all contributions to the uncertainty on $f_P$ also depend on $p$, we report in Fig. [*] the case of a supposed proportion of infectees as high as $50\%$ (i.e. $p=0.5$).
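
The factor 9 and the size of the gain can be made explicit with a short calculation. It is again only a sketch based on the approximated expressions, for $p=0.1$ and E$(f_P)\approx 0.2013$ (from E$(\pi_1)=0.978$ and E$(\pi_2)=0.115$), and it considers the systematic contributions alone, i.e. the large-$n_s$ limit.
\begin{align*}
\frac{\sigma_{\pi_2}(f_P)}{\sigma_{\pi_1}(f_P)} &= \frac{\sigma(\pi_2)}{\sigma(\pi_1)}\cdot\frac{1-p}{p}
 = \frac{\sigma(\pi_2)}{\sigma(\pi_1)}\times 9 \qquad (p=0.1)\,,\\
\sigma(\pi_2)=0.022:\quad \sigma_{\pi_1}(f_P)\oplus\sigma_{\pi_2}(f_P) &= 0.0007\oplus 0.0198 \approx 0.0198
 \;\Rightarrow\; \approx 9.8\%\ \mbox{of E}(f_P)\,,\\
\sigma(\pi_2)=0.007:\quad \sigma_{\pi_1}(f_P)\oplus\sigma_{\pi_2}(f_P) &= 0.0007\oplus 0.0063 \approx 0.0063
 \;\Rightarrow\; \approx 3.1\%\ \mbox{of E}(f_P)\,.
\end{align*}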

Figure: Same as Fig. [*] for a proportion of infected individuals of $50\%$ ($p=0.5$). In this case the contribution from sampling the population, $\sigma_{p_s}(f_P)$, is larger than that from $\sigma_R(f_P)$. Note that in the lower plot the two solid horizontal lines collapse into a single one, since the contributions from $\sigma_{\pi_1}(f_P)$ and $\sigma_{\pi_2}(f_P)$ are equal. Unlike in the plots of Fig. [*], the horizontal dotted line showing the quadratic combination of the systematic contributions is visible here; it is reached asymptotically by the top dotted curve representing the global relative uncertainty on $f_P$.
\begin{figure}\begin{center}
\centering {\epsfig{file=Contributions_rel_unc_p0....
..._unc_p0.5_spi20.007.eps,clip=,width=0.92\linewidth}}
\end{center}
\end{figure}
One of the remarkable differences with respect to Fig. [*] is that the contribution from $\sigma_{p_s}(f_P)$ becomes larger than that from $\sigma_R(f_P)$ (the two always remain `parallel' as a function of $n_s$ in `log-log' plots, since they depend on the same power of the sample size). Indeed, $\sigma_{p_s}(f_P)$ dominates from $p\approx 0.15$ up to $p\approx 0.95$, as shown in Fig. [*],
Figure: Ratio of $\sigma_{p_s}(f_P)$ to $\sigma_R(f_P)$ as a function of the population fraction of infected $p$.
\begin{figure}\begin{center}
\epsfig{file=Rapporto_incertezze_sampling.eps,clip=,width=\linewidth}
\\ \mbox{} \vspace{-1.3cm} \mbox{}
\end{center}
\end{figure}
in which the ratio $\sigma_{p_s}(f_P)/\sigma_R(f_P)$ is reported as a function of $p$, exhibiting a whale-like shape.
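
The ratio can also be written explicitly from the approximated expressions for $\sigma_R(f_P)$ and $\sigma_{p_s}(f_P)$. The following is a sketch under the assumption $n_s \ll N$, in which case the ratio no longer depends on $n_s$; the numerical values use E$(\pi_1)=0.978$ and E$(\pi_2)=0.115$.
\begin{align*}
\frac{\sigma_{p_s}(f_P)}{\sigma_R(f_P)} &\approx
 \frac{\vert\mbox{E}(\pi_1)-\mbox{E}(\pi_2)\vert\cdot\sqrt{p\cdot(1-p)}}
      {\sqrt{\mbox{E}(\pi_1)\cdot(1-\mbox{E}(\pi_1))\cdot p + \mbox{E}(\pi_2)\cdot(1-\mbox{E}(\pi_2))\cdot(1-p)}}\,,\\
p=0.1:\quad \frac{\sigma_{p_s}(f_P)}{\sigma_R(f_P)} &\approx \frac{0.259}{0.306} \approx 0.85\,,\\
p=0.5:\quad \frac{\sigma_{p_s}(f_P)}{\sigma_R(f_P)} &\approx \frac{0.432}{0.248} \approx 1.74\,.
\end{align*}
Setting the ratio equal to one gives the quadratic equation $0.745\,p^2 - 0.825\,p + 0.102 \approx 0$, whose roots, $p\approx 0.14$ and $p\approx 0.97$, are close to the approximate boundaries quoted above.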

As a further example we show in Fig. [*] the contributions

Figure: Same quantities as in Figs. [*] and [*], but in the symmetric case of specificity equal to sensitivity, i.e. E$(\pi_2)=1-$E$(\pi_1)=0.022$, again with equal uncertainties, i.e. $\sigma(\pi_2)=\sigma(\pi_1)=0.007$. The upper plot, for $p=0.1$, is to be compared with the lower plot of Fig. [*]; the lower plot, for $p=0.5$, is to be compared with the lower plot of Fig. [*].
\begin{figure}\begin{center}
% attenzione: i nomi dei file sono errati, nel sen...
...pi20.007_pi1-eq-pi2.eps,clip=,width=0.92\linewidth}}
\end{center}
\end{figure}
to the relative uncertainty on $f_P$ for the case of improved specificity of the test, i.e. reducing the expected value of $\pi_2$ from 0.115 to 0.022, while keeping its uncertainty equal to that of $\pi_1$, that is 0.007. This means that we consider specificity equal to sensitivity, both in expected value and in uncertainty. In practice this is done by swapping the parameters of the related Beta distributions, that is $r_2=s_1$ and $s_2=r_1$ (see Sec. [*]).

In order to make evident the differences with respect to the previous cases, we plot $\sigma_{p_s}(f_P)/$E$(f_P)$ for both $p=0.1$ (upper plot) and $p=0.5$ (lower plot). In particular, in order to see the effect of this last improvement of the specificity (i.e. increasing its expected value from 0.885 to 0.978, keeping the same standard uncertainty) we need to compare the upper plot of Fig. [*] with the lower plot of Fig. [*], and the lower plot of Fig. [*] with the lower plot of Fig. [*]. The result is, at least at first sight, quite counter-intuitive, since a sizable improvement in specificity results in a reduction of the relative accuracy with which the fraction of positives is expected (an effect particularly important for $p=0.1$). We shall comment on it in the next sub-section, where we begin by describing the vertical lines in the plots of Figs. [*], [*] and [*] and commenting on their importance.
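
The size of the effect can be illustrated with a short numerical sketch. It uses only the approximated expressions and only the two systematic contributions, i.e. the large-$n_s$ limit, with $\sigma(\pi_1)=\sigma(\pi_2)=0.007$ and E$(\pi_1)=0.978$ throughout.
\begin{align*}
p=0.1,\ \mbox{E}(\pi_2)=0.115:&\quad \mbox{E}(f_P)\approx 0.2013\,,\quad
 \sigma_{\pi_1}(f_P)\oplus\sigma_{\pi_2}(f_P)\approx 0.0063 \;\Rightarrow\; \approx 3.1\%\,,\\
p=0.1,\ \mbox{E}(\pi_2)=0.022:&\quad \mbox{E}(f_P)\approx 0.1176\,,\quad
 \sigma_{\pi_1}(f_P)\oplus\sigma_{\pi_2}(f_P)\approx 0.0063 \;\Rightarrow\; \approx 5.4\%\,,\\
p=0.5,\ \mbox{E}(\pi_2)=0.115:&\quad \mbox{E}(f_P)\approx 0.5465\,,\quad
 \sigma_{\pi_1}(f_P)\oplus\sigma_{\pi_2}(f_P)\approx 0.0049 \;\Rightarrow\; \approx 0.91\%\,,\\
p=0.5,\ \mbox{E}(\pi_2)=0.022:&\quad \mbox{E}(f_P)\approx 0.5000\,,\quad
 \sigma_{\pi_1}(f_P)\oplus\sigma_{\pi_2}(f_P)\approx 0.0049 \;\Rightarrow\; \approx 0.99\%\,.
\end{align*}
In this limit the absolute systematic contribution does not change, since $\sigma_{\pi_1}(f_P)$ and $\sigma_{\pi_2}(f_P)$ do not depend on E$(\pi_2)$; the relative value grows only because E$(f_P)$ becomes smaller, and more markedly so for $p=0.1$ than for $p=0.5$.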