``` ---------- Forwarded message ----------
Date: Tue, 3 Jun 2003 20:34:53 +0200 (CEST)
From: Alexander Grossheim
To: Giulio.Dagostini@cern.ch
Subject: bayesian unfolding program

Hello,

I am currently investigating the use of your program to unfold measured
distributions in HARP (a fixed target experiment at the CERN PS to measure

Because I just started I do not have (yet) comments on method or
performance (the first trials yielded quite promising results though) but
a more technical issue on speed:

It is still possible to speed up the calculation of the covariance
matrices, at least on a standard intel linux platform using gcc as
compiler.

In error-mode 22 both 4-fold nested loops still contain calculations
in the innermost loop which can be done one level higher (I will just
quote the code here so you can easily see the changes):

1st loop:

if(IE .ne. 0) then
do k = 1,noc
l0 = 1
if (OFFD .lt. 0) l0 = k
do l = l0,k
do i =1,noe
--->         temp3=M_unf(k,i)*ne(i)
do j =1,noe
if (i .eq. j) then
if (IE .eq. 1) then
Vc0(k,l) = Vc0(k,l) + M_unf(k,i)*M_unf(l,j)*ne(j)
else
Vc0(k,l) = Vc0(k,l) + M_unf(k,i)*M_unf(l,j)*temp4(j)
endif
else  ! (i .ne. i)
if (IE .eq. 2) then
--->               Vc0(k,l) = Vc0(k,l) - temp3*M_unf(l,j)*temp5(j)
endif
endif
enddo
enddo
Vc0(l,k) = Vc0(k,l)
enddo
enddo
endif

temp3 does not depend on the index j an can be calculated earlier. the
gain here is not _that_ much, but on a little benchmark program with a few
hundred causes and effects I gained about 10-15 seconds on 4:30 minutes.

2nd loop:

...

do i = 1,noe
-->           temp1=M_unf(l,i)*nc_inv_mc(l)
-->           temp2=M_unf(l,i)*eff(l)*npec_inv(i,l)
\$              +M_unf(k,i)*eff(k)*npec_inv(i,k)
do j = 1,noe
C----------   covariance matrix of unfolding matrix
-->            Cov_M_kilj = temp1
+                     + M_unf(k,j)*nc_inv_mc(k)
+                     + M_tmp(i,j)
if (k.eq.l) then
Cov_M_kilj=Cov_M_kilj-neff_inv
if (i.eq.j) then
Cov_M_kilj=Cov_M_kilj+npec_inv(i,k)
endif
endif
if (i.eq.j) then
-->              Cov_M_kilj=Cov_M_kilj-temp2
endif
Cov_M_kilj = Cov_M_kilj*M_unf(k,i)*M_unf(l,j)

C----------   covariance matrix of results
Vc1(k,l) = Vc1(k,l) + ne(i)*ne(j)*Cov_M_kilj
enddo ! j
enddo ! i

...

here again, temp1 and temp2 are not depending on index j. the gain here
was 30 seconds on my 4:30 mins program.

So the (total) execution time went down from 270 seconds to about 230
seconds (in the 270 seconds there is already the time needed for about
10-15 iterations without error calculation).

As I already pointed out this might all depend on the platform and
compiler. Maybe commercial compilers will find these things during
code optimization. The system on which I made these 'measurements' is a
Pentium 4 (1.8GHz) running Redhat 9 with gcc 3.2 compiler using the
highest optimization level.

If you find this useful and if you can reproduce it feel free to do with
it whatever you like.

Cheers,
Alex

ps: please allow me _one_ (maybe stupid) non-technical question: if I come
with measured causes or effects into a region where the Monte Carlo input
sample is very weakly populated... shouldn't the error (in mode 22) on the
final result become very large in these regions? I see wild fluctuations
but the errors are very small.. (but maybe I am still doing something
completely wrong..)

```