From a.l.read@fys.uio.no Wed Apr 12 16:20:53 2000
Date: Mon, 03 Apr 2000 16:17:25 +0200
From: Alex Read
To: [See mailing list in a separate file]
Subject: Re: Conclusions of the CERN CLW

Dear All,

I would like to draw attention to this paragraph from Prosper.

> What I'd like to know is: Why have we high energy physicists been beating
> ourselves silly all these years to achieve absolute coverage just because
> Neyman told us we should? Maybe approximate coverage, on average, across the
> parameter space is good enough? I would hazard a guess that Neyman was far
> less dogmatic about his ideas than we are about them!

In my talk at the CERN CLW I tried to show why absolute coverage makes perfect
sense when the experimental sensitivity to the hypothetical signal one is
searching for is outstanding, and why it is completely useless when the
experimental sensitivity has vanished. In most searches the transition of
coverage from a useful to a useless principle is continuous, and the obtained
results often fall in the grey region in between (such as the (in)famous case
of a handful of background candidates expected and significantly fewer
candidates than this observed). What I offered in my talk was a prescription
(not found in the established statistical literature, that's true; an
approximation (!) to something we can never actually do in our experiments,
that's true; conservative with respect to the absolutely frequentist
confidence CLs+b, that's numerically true) for constructing this transition,
together with the motivations for and the properties of this prescription.
Highland was right: Zech's epsilon, which is CLs for a single-channel counting
experiment, does not correspond to any physical experiment if interpreted as a
strict probability. If we look for approximate coverage across the parameter
space, as Prosper suggests above, we will again get into a discussion about
the ensemble, about which parameter space, which metric, etc., etc. I am
asking you to consider whether it is not reasonable to have absolute coverage
precisely where it is possible, to have NO coverage when the experimental
sensitivity is ZERO, and to identify a practical prescription to vary the
coverage in a controlled manner from CL down to 0.

I concur with Murray's and Prosper's reservations about point 5. I argue for
explicitly tossing out coverage as an absolute principle relevant in all
situations. I find it reasonable of Cousins not to claim consensus.

Point 7 was indeed a point for which there was an obvious and growing
consensus at the CERN workshop. My impression in January was that this would
turn out to be the strongest conclusion/recommendation of the workshop.

I would like to express the (psychological, subjective, irrational, take your
pick) impression the workshop made on me. I had previously considered myself a
frequentist of sorts and thought it was a reasonable point of view. I like the
idea of being able to associate confidences with the range of possible
outcomes of gedanken experiments. As I read the required reading for the
workshop I started to realize that the membership fee for joining the
frequentist club might be higher than I could afford to pay. This fear was
only strengthened at the workshop. There is no frequency distribution for
signal-only experiments in the presence of background (so no "quantum
mechanics" of search results near the sensitivity bound). To insist that
absolutely frequentist confidence intervals can make statements about ONLY the
signal is wishful thinking.
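For concreteness, here is a toy sketch (an illustration only, not part of the
workshop material; Python with scipy assumed available) of CLs for a
single-channel counting experiment with expected background b, hypothetical
signal s and n_obs observed candidates, i.e. Zech's epsilon mentioned above:

    # Toy illustration: CLs for a single-channel Poisson counting experiment.
    from scipy.stats import poisson

    def cl_s(n_obs, s, b):
        cl_sb = poisson.cdf(n_obs, s + b)  # CL_{s+b} = P(n <= n_obs | s+b)
        cl_b  = poisson.cdf(n_obs, b)      # CL_b     = P(n <= n_obs | b)
        return cl_sb / cl_b                # Zech's epsilon for one channel

    # Example: 3.0 background events expected, 0 observed, test s = 3.0.
    # CLs = exp(-3) ~ 0.05, so s = 3.0 is excluded at roughly 95% CL.
    print(cl_s(0, 3.0, 3.0))

With 3.0 background events expected and none observed, CLs comes out near
0.05, while the absolutely frequentist CL_{s+b} = exp(-6) would exclude the
same signal much more aggressively, simply because fewer candidates than the
expected background were observed.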
Such an interpretation is an approximation (!) which works well for large
signals (measurements), but not always for the small ones we consider in
frontier experiments. One of the consequences of this approximation is that
distasteful results can arise when nuisance parameters are introduced. Cousins
remarks that 'It is disturbing that the classical method gives the "wrong"
sign to the effect' (the effect in question is an uncertainty on the
integrated luminosity of the experiment). Let me give a slight twist to
Highland's (valid) criticism of Zech, and implicitly of CLs (in the sense that
CLs doesn't come from a confidence distribution): it is difficult to see how a
strict frequentist confidence interval can say much about the signal in
isolation, since background and signal are indistinguishable.

The likelihood function is an unbiased representation of an experimental
result, for large or small signals, and for searches it should be computed and
presented down to the sensitivity limit (a toy illustration for a single
counting channel is sketched in the P.S. below). This statement is not on the
list of points, but it could perhaps be promoted. I think we have understood
the arguments against the existence of a purely objective probability of the
signal given the data and the need for priors, but I also think a step in
between is possible and instructive: a relatively unbiased and intuitive
interpretation of the obtained likelihood function (which is NOT a probability
of signal given data) before embedding it in a larger Bayesian probability
framework.

Cheers,
Alex

"Harrison B. Prosper" wrote:
>
> Hi All,
>
> I would tend to agree with Bill, that the ten points listed below
>
> > 1) Civility
> > 2) P(Hypothesis|data) requires a prior
> > 3) Likelihood is not a PDF in the unknown parameters
> > 4) Answers assuming Uniform prior depend upon the metric
> > 5) Consistent treatment of UL requires automatic 2-sided CL
> > 6) Bayesian intervals do not have frequentist coverage
> > 7) Publishing L function is encouraged
> > 8) Chi2 does not exist naturally in Bayes statistics
> > 9) Classical CI construction has no prior
> > 10) Any argument for objective decisions ignores the subjective
> > utility function.
>
> are non-controversial apart from number 5). Here is a cartoon version of
> the argument for 5):
>
> flip-flopping between 1-sided and 2-sided limits wrecks coverage,
> so if you want absolute coverage you're not allowed to flip-flop.
>
> My quibble with the wording of 5) is that it is not so much that it is
> "inconsistent" to flip-flop, but rather that it wrecks coverage. What is
> inconsistent is to make an a priori claim of coverage while using an
> ensemble in which you flip-flop. But note: No one has proven that coverage
> is impossible in a flip-flopping ensemble. I suspect that if you are
> prepared to pay the price of extra over-coverage then you can flip-flop as
> much as you like.
>
> But this argument assumes that you believe that absolute coverage is a
> useful notion. If, as Glen Cowan suggested in his excellent summary talk at
> the Fermilab workshop, flip-flopping and the wrecking of coverage may be
> only a formal difficulty rather than one with verifiable (bad) consequences,
> then one can question 5). Indeed, if, as Berger claimed last week, some
> modern frequentists reject Neyman's absolutism regarding coverage, then the
> fact that flip-flopping undercovers a la Neyman is no big deal.
>
> What I'd like to know is: Why have we high energy physicists been beating
> ourselves silly all these years to achieve absolute coverage just because
> Neyman told us we should? Maybe approximate coverage, on average, across the
> parameter space is good enough? I would hazard a guess that Neyman was far
> less dogmatic about his ideas than we are about them!
>
> A useful outcome of the two workshops was the general recognition that we
> should be less dogmatic on all fronts.
>
> Harrison
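P.S. The toy illustration of a presented likelihood curve promised above: a
single-channel Poisson counting experiment with an invented expected
background of b = 3.0 and n_obs = 0 observed candidates (Python with numpy and
scipy assumed available). The curve L(s)/L_max is not a probability for s; it
is simply the likelihood of the hypothetical signal, computed down to small
values where the sensitivity runs out:

    # Toy illustration: likelihood of a hypothetical signal s in a
    # single-channel counting experiment, L(s) = P(n_obs | s + b),
    # shown normalized to its maximum over s >= 0.
    import numpy as np
    from scipy.stats import poisson

    b, n_obs = 3.0, 0
    s_grid = np.linspace(0.0, 10.0, 101)
    L = poisson.pmf(n_obs, s_grid + b)   # Poisson likelihood at each s
    L_rel = L / L.max()                  # maximum is at s = 0 here

    for s, l in zip(s_grid[::20], L_rel[::20]):
        print(f"s = {s:4.1f}   L(s)/L_max = {l:.3f}")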