---------- Forwarded message ---------- Date: Tue, 3 Jun 2003 20:34:53 +0200 (CEST) From: Alexander GrossheimTo: Giulio.Dagostini@cern.ch Subject: bayesian unfolding program Hello, I am currently investigating the use of your program to unfold measured distributions in HARP (a fixed target experiment at the CERN PS to measure hadron production cross-sections). Because I just started I do not have (yet) comments on method or performance (the first trials yielded quite promising results though) but a more technical issue on speed: It is still possible to speed up the calculation of the covariance matrices, at least on a standard intel linux platform using gcc as compiler. In error-mode 22 both 4-fold nested loops still contain calculations in the innermost loop which can be done one level higher (I will just quote the code here so you can easily see the changes): 1st loop: if(IE .ne. 0) then do k = 1,noc l0 = 1 if (OFFD .lt. 0) l0 = k do l = l0,k do i =1,noe ---> temp3=M_unf(k,i)*ne(i) do j =1,noe if (i .eq. j) then if (IE .eq. 1) then Vc0(k,l) = Vc0(k,l) + M_unf(k,i)*M_unf(l,j)*ne(j) else Vc0(k,l) = Vc0(k,l) + M_unf(k,i)*M_unf(l,j)*temp4(j) endif else ! (i .ne. i) if (IE .eq. 2) then ---> Vc0(k,l) = Vc0(k,l) - temp3*M_unf(l,j)*temp5(j) endif endif enddo enddo Vc0(l,k) = Vc0(k,l) enddo enddo endif temp3 does not depend on the index j an can be calculated earlier. the gain here is not _that_ much, but on a little benchmark program with a few hundred causes and effects I gained about 10-15 seconds on 4:30 minutes. 2nd loop: ... do i = 1,noe --> temp1=M_unf(l,i)*nc_inv_mc(l) --> temp2=M_unf(l,i)*eff(l)*npec_inv(i,l) $ +M_unf(k,i)*eff(k)*npec_inv(i,k) do j = 1,noe C---------- covariance matrix of unfolding matrix --> Cov_M_kilj = temp1 + + M_unf(k,j)*nc_inv_mc(k) + + M_tmp(i,j) if (k.eq.l) then Cov_M_kilj=Cov_M_kilj-neff_inv if (i.eq.j) then Cov_M_kilj=Cov_M_kilj+npec_inv(i,k) endif endif if (i.eq.j) then --> Cov_M_kilj=Cov_M_kilj-temp2 endif Cov_M_kilj = Cov_M_kilj*M_unf(k,i)*M_unf(l,j) C---------- covariance matrix of results Vc1(k,l) = Vc1(k,l) + ne(i)*ne(j)*Cov_M_kilj enddo ! j enddo ! i ... here again, temp1 and temp2 are not depending on index j. the gain here was 30 seconds on my 4:30 mins program. So the (total) execution time went down from 270 seconds to about 230 seconds (in the 270 seconds there is already the time needed for about 10-15 iterations without error calculation). As I already pointed out this might all depend on the platform and compiler. Maybe commercial compilers will find these things during code optimization. The system on which I made these 'measurements' is a Pentium 4 (1.8GHz) running Redhat 9 with gcc 3.2 compiler using the highest optimization level. If you find this useful and if you can reproduce it feel free to do with it whatever you like. Cheers, Alex ps: please allow me _one_ (maybe stupid) non-technical question: if I come with measured causes or effects into a region where the Monte Carlo input sample is very weakly populated... shouldn't the error (in mode 22) on the final result become very large in these regions? I see wild fluctuations but the errors are very small.. (but maybe I am still doing something completely wrong..)