Moving to probabilistic considerations

Let us start by looking at what happens when there are no infected individuals in the population, i.e. when the proportion of infected in the population is exactly zero. In our rough reasoning none of the 10000 sampled individuals will be infected, but 12% of them will be tagged as positive, exactly the critical value of 1200 we have seen above. In reality we have neglected the fact that 1200 is an expectation, in the probabilistic meaning of expected value, and that other values are also possible. In fact, given the assumed properties of the test, the number of individuals who will test positive is uncertain, and is described by the well known binomial distribution with `probability parameter' (see Ref. [19] for clarifications) $\pi_2$. The expectation therefore has an uncertainty, which we quantify with the standard uncertainty [29], i.e. the standard deviation of the related probability distribution. Using the well known formula for the binomial distribution, which in our case reads $\sigma=\sqrt{\pi_2\cdot (1-\pi_2)\cdot m}$, we get, with $\pi_2=0.12$ and $m=10000$, $\sigma=32.5$. Since we are dealing with reasonably large numbers, the Gaussian approximation holds and we can easily calculate that there is about 16% probability to get a number of positives equal to or below 1167 (one standard deviation below the expectation), and so on. In particular we get 0.1% probability to observe a number equal to or below 1100, which we could consider a safe limit for practical purposes.
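The numbers quoted above can be checked with a minimal sketch like the following, which assumes $\pi_2=0.12$ and a sample of 10000 individuals and uses the plain Gaussian approximation without continuity correction. The function and variable names (`binomial_sigma`, `gaussian_cdf`, `p_below_1167`, `p_below_1100`) are ours, introduced only for illustration.

```python
import math

def binomial_sigma(prob, trials):
    """Standard deviation of a binomially distributed count."""
    return math.sqrt(prob * (1.0 - prob) * trials)

def gaussian_cdf(x, mean, sigma):
    """P(X <= x) for a Gaussian with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))

pi2 = 0.12   # assumed nominal probability that a non-infected individual tests positive
m = 10000    # assumed sample size

expected = pi2 * m
sigma = binomial_sigma(pi2, m)

# Gaussian approximation of the lower tails (no continuity correction)
p_below_1167 = gaussian_cdf(1167, expected, sigma)
p_below_1100 = gaussian_cdf(1100, expected, sigma)

print(f"expected = {expected:.0f}, sigma = {sigma:.1f}")
print(f"P(n <= 1167) ~ {p_below_1167:.3f}")
print(f"P(n <= 1100) ~ {p_below_1100:.4f}")
```

The threshold 1100 lies about three standard deviations below the expectation, which is why its tail probability comes out at the per mille level.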

But, unfortunately, the story is a bit longer. In fact we must not forget that $\pi_2$ itself comes from measurements and is therefore uncertain. Hence, although 0.12 is its `nominal value', values below 0.10 are also quite possible, yielding e.g. an expected number of positives among the non-infected individuals of $1000\pm 30$ for $\pi_2=0.10$ and $800\pm 27$ for $\pi_2=0.08$ (hereafter, unless indicated otherwise, we quote standard uncertainties).
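As a quick illustration of how the expected number of positives and its standard uncertainty move with $\pi_2$, the following sketch evaluates the binomial formula for the three values mentioned in the text, assuming a sample of 10000 non-infected individuals. The dictionary `results` is a hypothetical helper, not part of the original analysis.

```python
import math

m = 10000  # assumed sample size, all individuals non-infected

results = {}
for pi2 in (0.12, 0.10, 0.08):
    expected = pi2 * m                        # binomial expected value
    sigma = math.sqrt(pi2 * (1.0 - pi2) * m)  # binomial standard deviation
    results[pi2] = (expected, sigma)
    print(f"pi2 = {pi2:.2f}: {expected:.0f} +/- {sigma:.1f}")
```

Note that the shift of the expected value between, say, $\pi_2=0.12$ and $\pi_2=0.10$ is several times larger than the binomial standard deviation at either value.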

Then there is the fact that we apply the tests to the sample, and not to the entire population. Therefore, unless the proportion of infectees in the population is exactly 0 or 1, the proportion of infectees in the sample will in general differ from that in the population. For example, sticking to a reference proportion of 10% infectees in the population, in the 10000 individuals sampled from a population ten times larger we do not expect exactly 1000 infected, but $1000\pm 28$, as we shall see in detail in a later section (we only anticipate, for the benefit of anyone who has quickly checked the numbers, that the standard uncertainty differs from 30, the value calculated from a binomial distribution, because this kind of sampling follows, unlike the binomial, the model of `extraction without reintroduction', i.e. sampling without replacement).
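The difference between 28 and 30 can be reproduced with a short sketch based on the standard deviation of the hypergeometric distribution, which is the usual model for sampling without replacement. It assumes a population of 100000 individuals, 10% of them infected, and a sample of 10000; the names `N`, `K`, `n_s` and `hypergeometric_sigma` are ours and only illustrative.

```python
import math

def hypergeometric_sigma(population, infected, sample):
    """Standard deviation of the number of infected in a sample drawn
    without replacement from a finite population."""
    p = infected / population
    # Finite-population correction; it tends to 1 for a very large population
    correction = (population - sample) / (population - 1)
    return math.sqrt(sample * p * (1.0 - p) * correction)

N = 100000    # assumed population size, ten times the sample
K = 10000     # assumed number of infected in the population (10%)
n_s = 10000   # sample size

sigma_hyper = hypergeometric_sigma(N, K, n_s)
sigma_binom = math.sqrt(n_s * (K / N) * (1.0 - K / N))

print(f"without replacement: 1000 +/- {sigma_hyper:.1f}")
print(f"binomial:            1000 +/- {sigma_binom:.1f}")
```

The two results differ only by the finite-population correction factor, which here is close to $\sqrt{0.9}$ because one tenth of the population is sampled.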