Where is the problem?
The matter is very simple. No matter which test statistic
has been used, there is no simple
logical relation between a p-value and the probability
of the hypothesis under test
(`
' -- in this case ``
'').
Indeed, p-values are notoriously misunderstood, as is
well explained in a section of Wikipedia
that I reproduce here verbatim for the convenience of the
reader[11], highlighting the sentences that
are most relevant to our discussion.
- ``The p-value is not the probability that the null hypothesis is true.
In fact, frequentist statistics does not, and cannot,
attach probabilities to hypotheses. Comparison of Bayesian
and classical approaches shows that a p-value can be very close
to zero while the posterior probability of the null is very close
to unity (if there is no alternative hypothesis with a large
enough a priori probability and which would explain the results
more easily). This is the Jeffreys-Lindley paradox.
- The p-value is not the probability that a finding is
``merely a fluke.''
As the calculation of a p-value is based on the assumption that
a finding is the product of chance alone, it patently cannot also
be used to gauge the probability of that assumption being true.
This is different from the real meaning which is that the p-value
is the chance of obtaining such results if the null hypothesis is true.
- The p-value is not the probability of falsely rejecting
the null hypothesis. This error is a version of the so-called
prosecutor's fallacy.
- The p-value is not the probability that a replicating
experiment would not yield the same conclusion.
- (1 - p-value) is not the probability of the
alternative hypothesis being true.
- The significance level of the test is not determined by the p-value.
The significance level of a test is a value that should
be decided upon by the agent interpreting the data before
the data are viewed, and is compared against the p-value
or any other statistic calculated after the test has been performed.
(However, reporting a p-value is more useful than simply saying
that the results were or were not significant at a given level,
and allows the reader to decide for himself whether to consider
the results significant.)
- The p-value does not indicate the size or importance
of the observed effect (compare with effect size).
The two do vary together however - the larger the effect,
the smaller sample size will be required to get a significant p-value.''
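The first point of the list, the Jeffreys-Lindley paradox, can be checked numerically. The following is only a minimal sketch under assumptions that are not part of the text above: a binomial experiment with illustrative counts (k successes in n trials), a point null hypothesis theta = 0.5, a uniform prior on theta under the alternative, and equal prior probabilities for the two hypotheses. The function names are hypothetical, and the p-value uses the normal approximation.

```python
import math

def two_sided_p_value(k, n, theta0=0.5):
    """Two-sided p-value (normal approximation) for k successes in n trials."""
    mean = n * theta0
    sd = math.sqrt(n * theta0 * (1.0 - theta0))
    z = (k - mean) / sd
    # erfc(|z|/sqrt(2)) is the two-sided Gaussian tail probability
    return math.erfc(abs(z) / math.sqrt(2.0))

def posterior_null(k, n, theta0=0.5, prior_null=0.5):
    """Posterior probability of the point null theta = theta0.

    Assumption: under the alternative, theta has a uniform prior on [0, 1],
    so the marginal likelihood of the data is exactly 1 / (n + 1).
    """
    log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    log_like_null = log_binom + k * math.log(theta0) + (n - k) * math.log(1.0 - theta0)
    log_like_alt = -math.log(n + 1)
    bayes_factor = math.exp(log_like_null - log_like_alt)  # null vs. alternative
    odds = bayes_factor * prior_null / (1.0 - prior_null)
    return odds / (1.0 + odds)

# Illustrative counts (an assumption, chosen to make the effect visible)
n, k = 98451, 49581
p = two_sided_p_value(k, n)
post = posterior_null(k, n)
print(f"p-value = {p:.4f}, P(H0 | data) = {post:.3f}")
```

With these inputs the p-value falls below the conventional 0.05 threshold, while the posterior probability of the null hypothesis stays above 0.9. The result depends on the assumed prior for the alternative, which is exactly the ingredient a p-value ignores.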
Are you still sure you really understood what p-values mean?
Giulio D'Agostini
2012-01-02