``Perhaps a Socratic exchange between an ideally sharp, i.e., not easily bamboozled, student (S.) of a typical introductory statistics course and his prof (P.) is the best way to illustrate what I think of the issue. The class is at the point where the confidence interval (CI) for the normal mean is introduced and illustrated with a concrete example for the first time.

**P.**- ...and so a 95% CI for the unknown mean is (1.2, 2.3).
**S.**- Excuse me sir, just a few minutes ago you emphasized that a CI is some kind of random interval with certain coverage properties in REPEATED trials.
**P.**- Correct.
**S.**- What, then, is the meaning of the interval above?
**P.**- Well, it is one of the many possible realizations from a collection of intervals of a certain kind.
**S.**- And can we say that the 95% coverage probability, which is a property of the collective, is somehow carried over to this particular realization?
**P.**- No, we can't. It would be worse than incorrect; it would be meaningless, for the probability claim is tied to the collective.
**S.**- Your claim is then meaningless?
**P.**- No, it isn't. There is actually a way, called Bayesian statistics, to attribute a single-trial meaning to it, but that is beyond the scope of this course. However, I can assure you that there is no numerical difference between the two approaches.
**S.**- Do you mean they always agree?
**P.**- No, but in this case they do, provided that you have no reason, prior to obtaining the data, to believe that the unknown mean is in any particularly narrow area.
**S.**- Fair enough. I also noticed, sir, that you called it `a' CI, instead of `the' CI. Are there others then?
**P.**- Yes, there are actually infinitely many ways to obtain CIs which all have the same coverage properties. But only the one above is a Bayesian interval (with the proviso above added, of course).
**S.**- Is Bayesian-ness the only way to justify the use of this particular one?
**P.**- No, there are other ways too, but they are complicated and they operate with concepts that draw their meaning from the collective (except the so-called likelihood interval, but then this strange guy does not operate with probability at all).
**...**
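
The professor's numerical claims can be checked with a short simulation. What follows is only a minimal sketch under stated assumptions, not part of the original exchange: the standard deviation is taken as known, the Bayesian interval is the equal-tailed one from a flat prior on the mean, and the "other" CI is a deliberately wasteful one built from the first half of the sample. All function names (`z_interval`, `half_sample_interval`, `coverage`, `flat_prior_mass`) and the numbers in the demo are hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

Z95 = NormalDist().inv_cdf(0.975)  # two-sided 95% normal quantile, about 1.96


def z_interval(sample, sigma):
    """95% CI for a normal mean with known sigma: mean +/- z * sigma / sqrt(n)."""
    half_width = Z95 * sigma / math.sqrt(len(sample))
    center = fmean(sample)
    return center - half_width, center + half_width


def half_sample_interval(sample, sigma):
    """Another valid 95% CI: the same recipe applied to the first half of the data."""
    return z_interval(sample[: len(sample) // 2], sigma)


def coverage(make_interval, mu, sigma, n, trials, seed=0):
    """Fraction of repeated trials in which the interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lo, hi = make_interval(sample, sigma)
        hits += lo <= mu <= hi
    return hits / trials


def flat_prior_mass(interval, sample, sigma):
    """Posterior probability of `interval` under a flat prior on the mean.

    With known sigma, the posterior is Normal(sample mean, sigma / sqrt(n)).
    """
    posterior = NormalDist(fmean(sample), sigma / math.sqrt(len(sample)))
    lo, hi = interval
    return posterior.cdf(hi) - posterior.cdf(lo)


if __name__ == "__main__":
    mu, sigma, n = 1.75, 1.0, 10  # assumed values, chosen only for the demo
    for make_interval in (z_interval, half_sample_interval):
        rate = coverage(make_interval, mu, sigma, n, trials=5000)
        print(make_interval.__name__, "coverage in repeated trials:", rate)

    # One realized data set: the single-trial (posterior) statement.
    rng = random.Random(1)
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    for make_interval in (z_interval, half_sample_interval):
        mass = flat_prior_mass(make_interval(sample, sigma), sample, sigma)
        print(make_interval.__name__, "flat-prior posterior mass:", mass)
```

Under these assumptions both recipes should cover the true mean in roughly 95% of repeated trials, but only the standard interval carries 95% posterior probability for the one data set actually observed. The half-sample interval's posterior mass depends on the data, which is one concrete reading of the professor's remark that only one of the many CIs is also a Bayesian interval.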

It could be continued ad infinitum. Assuming sufficiently more advanced students, one could come up with similar exchanges concerning practically every frequentist concept orthodoxy operates with (sampling distribution of estimates, measures of performance, the very concept of independence, etc.). The point is that orthodoxy would fail at the first opportunity if students were sufficiently sharp, open-minded, and inquisitive. That we are not humiliated repeatedly by such exchanges (in my long experience not a single one has ever taken place) says more about... well, I don't quite know about what: the way the mind plays tricks with the concept of probability? The background of our students? Both?

Ultimately, then, we teach the orthodoxy not only because of intellectual inertia, tradition, and the rest, but also because, like good con artists, we can get away with it. And that I find very disturbing. I must agree with Basu's dictum that nothing in orthodox statistics makes sense unless it has a Bayesian interpretation. If, as is the case, the only thing one can say about frequentist methods is that they work only insofar as they don't violate the likelihood principle, and that when they don't violate it (and they frequently do) they numerically agree with a Bayesian procedure with some flat prior, then we should go ahead and teach the real thing, not the substitute. (The latter, incidentally, can live only parasitically on an illicit Bayesian usage of its terms. Just ask an unsuspecting biologist how he thinks about a CI or a P-value.)

One can understand, or perhaps follow is a better word, the historical reasons why orthodoxy has become the prevailing view. Now, however, we know better.''