
Unfolding an experimental distribution

If one observes $ n(E)$ events with effect $ E$, the expected number of events assignable to each of the causes is
$\displaystyle \widehat{n}(C_i) = n(E)\, P(C_i\,\vert\,E)\,.$     (7.2)

As the outcome of a measurement one has several possible effects $ E_j$ ($ j=1, 2, \ldots, n_E$) for a given cause $ C_i$. For each of them the Bayes formula ([*]) holds, and $ P(C_i\,\vert\,E_j)$ can be evaluated. Let us write ([*]) again for the case of $ n_E$ possible effects, indicating the initial probability of the causes with $ P_\circ (C_i)$:

$\displaystyle P(C_i\,\vert\,E_j) = \frac{P(E_j\,\vert\,C_i)\, P_\circ (C_i)} {\sum_{l=1}^{n_C} P(E_j\,\vert\,C_l)\, P_\circ (C_l)}\, .$ (7.3)
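As an illustration of Eq. (7.3), here is a minimal sketch in plain Python. The function name `posterior`, the nested-list layout (`smearing[i][j]` for $ P(E_j\,\vert\,C_i)$, `prior[i]` for $ P_\circ (C_i)$) and the toy numbers are assumptions made for this example only; they do not come from the text.

```python
def posterior(smearing, prior):
    """Return P(C_i | E_j) as post[i][j], following Eq. (7.3).

    smearing[i][j] = P(E_j | C_i); prior[i] = P_o(C_i).
    Both names and the list layout are illustrative assumptions.
    """
    n_causes, n_effects = len(smearing), len(smearing[0])
    post = [[0.0] * n_effects for _ in range(n_causes)]
    for j in range(n_effects):
        # Denominator of Eq. (7.3): total probability of observing E_j.
        norm = sum(smearing[l][j] * prior[l] for l in range(n_causes))
        for i in range(n_causes):
            # Assumption: an effect that no cause can produce gets zero weight.
            post[i][j] = smearing[i][j] * prior[i] / norm if norm > 0 else 0.0
    return post


# Hypothetical toy problem: 2 causes, 3 effects, flat initial probabilities.
smearing = [[0.6, 0.3, 0.0],
            [0.1, 0.4, 0.4]]
prior = [0.5, 0.5]
post = posterior(smearing, prior)
```

For every effect $ E_j$ that at least one cause can produce, the entries `post[i][j]` sum to one over the causes, as the normalization in Eq. (7.3) requires.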

One should note the following.

After $ N_{obs}$ experimental observations one obtains a distribution of frequencies $ \underline{n}(E) \equiv \{n(E_1), n(E_2), \ldots , n(E_{n_E})\}$. The expected number of events to be assigned to each of the causes (taking into account only the observed events) can be calculated by applying ([*]) to each effect:

$\displaystyle \left.\widehat{n}(C_i)\right\vert _{obs} = \sum_{j=1}^{n_E}n(E_j)\, P(C_i\,\vert\,E_j)\,.$ (7.4)

When inefficiency is also brought into the picture, with $ \epsilon_i \equiv \sum_{j=1}^{n_E} P(E_j\,\vert\,C_i)$ the efficiency of detecting the cause $ C_i$ in any of the possible effects, the best estimate of the true number of events becomes
$\displaystyle \widehat{n}(C_i) = \frac{1}{\epsilon_i} \sum_{j=1}^{n_E}n(E_j)\, P(C_i\,\vert\,E_j)\,, \hspace{1cm} \epsilon_i \ne 0\,.$ (7.5)

From these unfolded events we can estimate the true total number of events, the final probabilities of the causes and the overall efficiency:
$\displaystyle \widehat{N}_{true} = \sum_{i=1}^{n_C} \widehat{n}(C_i)\, ,$
$\displaystyle \widehat{P}(C_i) \equiv P(C_i\,\vert\,\underline{n}(E)) = \frac{\widehat{n}(C_i)}{\widehat{N}_{true}} \, ,$
$\displaystyle \widehat{\epsilon} = \frac{N_{obs}}{\widehat{N}_{true}}\, .$
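
The chain from Eq. (7.3) to the three estimates above can be sketched as a single unfolding step. This is only an illustration under stated assumptions: the function name `unfold_once`, the data layout and the toy numbers are invented for the example, the efficiency of each cause is taken to be $ \epsilon_i = \sum_j P(E_j\,\vert\,C_i)$, and a cause with $ \epsilon_i = 0$ is simply assigned zero events.

```python
def unfold_once(counts, smearing, prior):
    """One unfolding step: Eqs. (7.3)-(7.5) plus the summary estimates.

    counts[j] = n(E_j); smearing[i][j] = P(E_j | C_i); prior[i] = P_o(C_i).
    Names and layout are illustrative assumptions.
    """
    n_causes, n_effects = len(smearing), len(smearing[0])
    # Denominator of Eq. (7.3) for each effect.
    norm = [sum(smearing[l][j] * prior[l] for l in range(n_causes))
            for j in range(n_effects)]
    n_hat = []
    for i in range(n_causes):
        # Assumed definition of the efficiency: eps_i = sum_j P(E_j | C_i).
        eff = sum(smearing[i])
        # Eq. (7.4): observed events assigned to cause C_i.
        assigned = sum(counts[j] * smearing[i][j] * prior[i] / norm[j]
                       for j in range(n_effects) if norm[j] > 0)
        # Eq. (7.5); assumption: a cause with zero efficiency gets no events.
        n_hat.append(assigned / eff if eff > 0 else 0.0)
    n_true = sum(n_hat)                  # estimated true number of events
    p_hat = [n / n_true for n in n_hat]  # final probabilities of the causes
    eff_overall = sum(counts) / n_true   # overall efficiency N_obs / N_true
    return n_hat, n_true, p_hat, eff_overall


# Hypothetical toy data, built from 50 true events per cause.
smearing = [[0.6, 0.3, 0.0],
            [0.1, 0.4, 0.4]]
counts = [35, 35, 20]
n_hat, n_true, p_hat, eff_overall = unfold_once(counts, smearing, [0.5, 0.5])
```

Because the initial probabilities here coincide with the proportions used to build the toy counts, the step returns approximately 50 events for each cause and an overall efficiency of about 0.9.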

If the initial distribution $ \underline{P_\circ} (C)$ is not consistent with the data, it will not agree with the final distribution $ \underline{\widehat{P}}(C)$. The closer the initial distribution is to the true distribution, the better the agreement is. For simulated data one can easily verify that the distribution $ \underline{\widehat{P}}(C)$ lies between $ \underline{P_\circ} (C)$ and the true one. This suggests proceeding iteratively, using the final distribution of one step as the initial distribution of the next. Fig. [*] shows an example of the unfolding of a two-dimensional distribution.

More details about iteration strategy, evaluation of uncertainty, etc. can be found in Ref. [56].
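
A minimal sketch of the iteration, assuming the simplest possible strategy: a fixed number of steps, with no smoothing and no stopping criterion. The function names `unfold_step` and `iterate_unfolding`, the toy numbers and the definition $ \epsilon_i = \sum_j P(E_j\,\vert\,C_i)$ are assumptions made for this example, not the procedure of Ref. [56].

```python
def unfold_step(counts, smearing, prior):
    """One pass of Eqs. (7.3)-(7.5); returns the updated cause probabilities."""
    n_causes, n_effects = len(smearing), len(smearing[0])
    # Probability of each effect under the current prior (denominator of Eq. 7.3).
    norm = [sum(smearing[l][j] * prior[l] for l in range(n_causes))
            for j in range(n_effects)]
    n_hat = []
    for i in range(n_causes):
        eff = sum(smearing[i])  # assumed efficiency: sum_j P(E_j | C_i)
        assigned = sum(counts[j] * smearing[i][j] * prior[i] / norm[j]
                       for j in range(n_effects) if norm[j] > 0)
        n_hat.append(assigned / eff if eff > 0 else 0.0)
    total = sum(n_hat)
    return [n / total for n in n_hat]


def iterate_unfolding(counts, smearing, prior, n_iter):
    """Feed each final distribution back in as the next initial distribution."""
    history = [list(prior)]
    for _ in range(n_iter):
        history.append(unfold_step(counts, smearing, history[-1]))
    return history


# Hypothetical toy data built from true proportions (0.8, 0.2) and 100 events.
smearing = [[0.6, 0.3, 0.0],
            [0.1, 0.4, 0.4]]
counts = [50, 32, 8]
history = iterate_unfolding(counts, smearing, [0.5, 0.5], n_iter=20)
```

With these noise-free toy counts, the first step moves the probability of the first cause from 0.5 to about 0.63, which lies between the initial value and the true one, and later steps approach 0.8. With real, fluctuating data the choice of when to stop matters, which is why the fixed `n_iter` here should be read as a placeholder.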

Figure: Example of a two-dimensional unfolding: true distribution (a), smeared distribution (b) and results after the first four steps [(c) to (f)].

I would just like to comment on an obvious criticism that may be made: ``the iterative procedure is against the Bayesian spirit, since the same data are used many times for the same inference''. In principle the objection is valid, but in practice this technique is a ``trick'' to give the experimental data a weight (an importance) larger than that of the priors. A more rigorous procedure that took into account the uncertainties and correlations of the initial distribution would have been much more complicated. An attempt of this kind can be found in Ref. [57]. Examples of unfolding procedures performed with non-Bayesian methods are described in Refs. [54] and [55].

Note added: A recent book by Cowan [58] contains an interesting chapter on unfolding.

Giulio D'Agostini 2003-05-15