If $\chi^2(\mu)$ or $-\ln L(\mu)$ has a nice parabolic shape, the likelihood is, apart from a multiplicative factor, a Gaussian function^{4} of $\mu$. In fact, as is well known from calculus, any regular function can be approximated by a parabola in the vicinity of its minimum. Let us see in detail the expansion of $\varphi(\mu) \equiv -\ln L(\mu)$ around its minimum $\mu_m$:

$$\varphi(\mu) \simeq \varphi(\mu_m) + \left.\frac{\mathrm{d}\varphi}{\mathrm{d}\mu}\right|_{\mu_m}(\mu-\mu_m) + \frac{1}{2}\left.\frac{\mathrm{d}^2\varphi}{\mathrm{d}\mu^2}\right|_{\mu_m}(\mu-\mu_m)^2 \qquad (4)$$

$$= \varphi(\mu_m) + \frac{(\mu-\mu_m)^2}{2\,\sigma_m^2}\,, \qquad (5)$$

where the second term of the r.h.s. vanishes by definition of minimum and we have indicated with $\sigma_m^2$ the inverse of the second derivative at the minimum, $\sigma_m^2 = \left(\mathrm{d}^2\varphi/\mathrm{d}\mu^2\big|_{\mu_m}\right)^{-1}$. Going back to the likelihood, we get:

$$L(\mu) \propto e^{-\varphi(\mu)} \simeq e^{-\varphi(\mu_m)}\,\exp\!\left[-\frac{(\mu-\mu_m)^2}{2\,\sigma_m^2}\right] \qquad (6)$$

$$\propto \exp\!\left[-\frac{(\mu-\mu_m)^2}{2\,\sigma_m^2}\right]: \qquad (7)$$

apart from a multiplicative factor, this is a `Gaussian' centered in $\mu_m$ with standard deviation $\sigma_m$. However, although this function is mathematically a Gaussian, it does not yet have the meaning of a probability density in an inferential sense, i.e. describing our knowledge about $\mu$ in the light of the experimental data. In order to do this, we need to process the likelihood through Bayes theorem, which allows the probabilistic inversion: with a uniform prior, the posterior is simply the likelihood normalized to unit area.

We can now speak about the ``probability that $\mu$ lies within a given interval'' and calculate it, together with the expectation of $\mu$, its standard deviation and so on.
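As a numerical illustration of the parabolic case, here is a minimal sketch (the values $\mu_m = 5$ and $\sigma = 0.3$ are invented for the purpose): it normalizes $\exp(-\chi^2/2)$ under a flat prior and checks that the resulting pdf has the expected Gaussian summaries.

```python
import numpy as np

# Toy parabolic chi^2 curve: minimum at mu_m = 5.0 with sigma = 0.3
# (both numbers are invented for illustration).
mu_m, sigma = 5.0, 0.3
mu = np.linspace(mu_m - 6 * sigma, mu_m + 6 * sigma, 24001)
dmu = mu[1] - mu[0]
chi2 = ((mu - mu_m) / sigma) ** 2

# Likelihood up to a factor; with a flat prior the posterior p(mu|data)
# is proportional to it, so normalizing to unit area gives the pdf.
post = np.exp(-chi2 / 2.0)
post /= post.sum() * dmu

# Probabilistic statements are now legitimate:
mean = (mu * post).sum() * dmu
std = np.sqrt(((mu - mean) ** 2 * post).sum() * dmu)
mask = np.abs(mu - mu_m) <= sigma + 1e-12     # epsilon guards the endpoints
p_1sigma = post[mask].sum() * dmu             # P(mu_m - sigma <= mu <= mu_m + sigma)

print(round(mean, 3), round(std, 3), round(p_1sigma, 3))  # 5.0 0.3 0.683
```

The interval $\mu_m \pm \sigma$ indeed carries a probability of about 68.3%, as it must for a Gaussian posterior.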

If this is the case, it is a simple exercise to show that:

*a*) $\mathrm{E}[\mu]$ is equal to the value $\mu_m$ which minimizes the $\chi^2$ or $-\ln L$;
*b*) $\sigma(\mu)$ can be obtained by the famous conditions $\Delta\chi^2 = 1$ or $\Delta\ln L = -1/2$, respectively, or from the second derivative around $\mu_m$: $\sigma(\mu) = \left(\frac{1}{2}\,\partial^2\chi^2/\partial\mu^2\big|_{\mu_m}\right)^{-1/2}$ or $\sigma(\mu) = \left(-\,\partial^2\ln L/\partial\mu^2\big|_{\mu_m}\right)^{-1/2}$, respectively.
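Rules *a*) and *b*) can be checked numerically on the same invented parabolic $\chi^2$ (minimum at $\mu_m = 5$, $\sigma = 0.3$, $\chi^2_{\min} = 7$, all toy values): the minimum sits at $\mu_m$, the curve rises by one unit at $\mu_m \pm \sigma$, and the curvature reproduces $\sigma$.

```python
import numpy as np

# Toy parabolic chi^2 (invented numbers: mu_m = 5.0, sigma = 0.3,
# chi2_min = 7.0).
mu_m, sigma = 5.0, 0.3

def chi2(m):
    return ((m - mu_m) / sigma) ** 2 + 7.0

# (a) the best estimate is the value that minimizes chi^2 (grid scan)
grid = np.linspace(3.0, 7.0, 40001)
mu_best = grid[np.argmin(chi2(grid))]

# (b) Delta chi^2 = 1 rule: chi2(mu_m + sigma) lies one unit above the minimum
dchi2 = chi2(mu_m + sigma) - chi2(mu_m)

# (b) second derivative: sigma = [ (1/2) d^2 chi^2 / d mu^2 ]^(-1/2)
h = 1e-4
curv = (chi2(mu_m + h) - 2 * chi2(mu_m) + chi2(mu_m - h)) / h**2
sigma_curv = (0.5 * curv) ** -0.5

print(round(mu_best, 4), round(dchi2, 4), round(sigma_curv, 4))  # 5.0 1.0 0.3
```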

If this is not the case, one should instead:

- restart from Eq. (9) or from Eq. (11), depending on the other underlying hypotheses;
- go even one step before Eq. (9), namely to the most general Eq. (8), if priors matter (e.g. physical constraints, sensible previous knowledge, etc.).

Other examples of asymmetric curves, including the case with more than one minimum, are shown in Chapter 12 of Ref. [3], and compared with the results coming from frequentist prescriptions (but, indeed, there is no generally accepted rule to get frequentist results - whatever they mean - when the shape gets complicated).

Unfortunately, it is not easy to translate numbers obtained by *ad hoc* rules into probabilistic results, because the dependence on the actual shape of the $\chi^2$ or $-\ln L$ curve can be non-trivial. Anyhow, some *rules of thumb* can be given for next-to-simple situations, where the $\chi^2$ or $-\ln L$ has only one minimum and the curve looks like a `skewed parabola', as in Fig. 2:

- the 68% `confidence interval' obtained by the $\Delta\chi^2 = 1$ or $\Delta\ln L = -1/2$ rule still provides, approximately, a 68% probability interval for $\mu$;
- the standard deviation obtained using Eq. (14) is approximately equal to the average between the $\Delta_-$ and $\Delta_+$ values obtained by the $\Delta\chi^2 = 1$ or $\Delta\ln L = -1/2$ rule:

$$\sigma(\mu) \approx \frac{\Delta_+ + \Delta_-}{2}\,; \qquad (15)$$

- the expected value is equal to the mode ($\mu_m$, coinciding with the maximum-likelihood or minimum-$\chi^2$ value) plus a *shift*:

$$\mathrm{E}[\mu] \approx \mu_m + \frac{\Delta_+ - \Delta_-}{2}\,. \qquad (16)$$
[This latter rule is particularly rough, because $\mathrm{E}[\mu]$ is more sensitive than $\sigma(\mu)$ to the exact shape of the $\chi^2$ or $-\ln L$ curve. Equation (16) has to be taken only to get an idea of the order of magnitude of the effect. For example, in the case depicted in Fig. 2 the actual shift is about 80% of the value given by Eq. (16).]
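The quality of these rules of thumb can be probed on a concrete skewed curve. The sketch below uses a Poisson measurement with $n = 5$ observed counts (my own choice of asymmetric example, not one from the text), whose $-2\ln L$ is $2(\mu - n\ln\mu)$ up to a constant; it extracts $\Delta_-$ and $\Delta_+$ from the $\Delta\chi^2 = 1$ rule and compares the half-sum and half-difference rules of thumb with the exact posterior moments under a flat prior.

```python
import numpy as np

# A 'skewed parabola': phi(mu) = -2 ln L(mu) for a Poisson measurement
# with n = 5 observed counts (invented example),
# phi(mu) = 2 * (mu - n * ln mu) up to an irrelevant constant.
n = 5
mu = np.linspace(0.05, 30.0, 600000)
dmu = mu[1] - mu[0]
phi = 2.0 * (mu - n * np.log(mu))

# Delta chi^2 = 1 rule gives asymmetric Delta-, Delta+ around the minimum
i_min = np.argmin(phi)
mu_mode = mu[i_min]                       # mode, equal to n
inside = mu[phi - phi[i_min] <= 1.0]      # phi is convex -> single interval
delta_m, delta_p = mu_mode - inside[0], inside[-1] - mu_mode

# Exact posterior moments under a flat prior: p(mu|n) prop. to exp(-phi/2)
post = np.exp(-(phi - phi[i_min]) / 2.0)
post /= post.sum() * dmu
mean = (mu * post).sum() * dmu
std = np.sqrt(((mu - mean) ** 2 * post).sum() * dmu)

sigma_rule = (delta_p + delta_m) / 2.0    # half-sum rule of thumb
shift_rule = (delta_p - delta_m) / 2.0    # half-difference rule of thumb
print(f"Delta- = {delta_m:.2f}, Delta+ = {delta_p:.2f}")
print(f"sigma: rule {sigma_rule:.2f} vs exact {std:.2f}")
print(f"shift: rule {shift_rule:.2f} vs exact {mean - mu_mode:.2f}")
```

The interval is visibly asymmetric ($\Delta_- \approx 1.9$, $\Delta_+ \approx 2.6$); the half-sum rule ($\approx 2.25$) underestimates the exact standard deviation ($\sqrt{6} \approx 2.45$), and the half-difference shift ($\approx 0.33$) is well below the exact shift $\mathrm{E}[\mu] - \mu_m = 1$, confirming that the shift rule gives only the order of magnitude.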

The remarks about the misuse of the $\Delta\chi^2 = 1$ and $\Delta\ln L = -1/2$ rules can be extended to cases where several parameters are involved. I do not want to go into details (in the Bayesian approach there is nothing deeper than studying $\chi^2$ or $-\ln L$ as a function of several parameters.^{8}), but I just want to get the reader worried about the meaning of contour plots of the kind shown in Fig. 3.
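One concrete reason for worry is a standard result, easily checked numerically: for a Gaussian likelihood, $\Delta\chi^2$ follows a $\chi^2$ distribution with as many degrees of freedom as fitted parameters, so the $\Delta\chi^2 = 1$ contour of a two-parameter fit, which would delimit a 68.3% interval for each parameter taken singly, encloses only about 39% of the joint probability.

```python
import math

# P(Delta chi^2 <= 1) for a Gaussian likelihood:
#   k = 1 parameter : P(chi^2_1 <= 1) = erf(1/sqrt(2))  -> the usual 68.3%
#   k = 2 parameters: P(chi^2_2 <= 1) = 1 - exp(-1/2)   -> only ~39.3%
p_one = math.erf(1.0 / math.sqrt(2.0))
p_two = 1.0 - math.exp(-0.5)
print(round(p_one, 4), round(p_two, 4))  # 0.6827 0.3935
```

To enclose 68.3% joint probability for two parameters, the contour must instead be drawn at $\Delta\chi^2 \approx 2.30$.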