Probability and Measurement Uncertainty
lectures for the PhD programmes
in Physics (32nd cycle)
(G. D'Agostini)
Syllabus for the written exam
20-hour course, starting on
Thursday 12 January 2017
Programme
- Indicative programme.
- For further details see the courses of previous years
(in particular those highlighted in bold),
bearing in mind that those courses lasted 40 hours.
- Note: some topics and applications may depend
on the interests of the PhD students.
Exam format
To be decided between the following two options (possibly both, with the first
as an intermediate test, and the programme to be redefined):
- a written test on a subset of the course
(valid from the 26th to the 29th cycle and subject to change):
/dott-prob_26/programma_scritto.html)
- a seminar-style presentation on an agreed topic,
possibly, but not necessarily, involving the development or use
of programs to solve practical problems or problems based on toy models.
Lectures
Detailed lecture topics
- Legend
- ☑ : assumed well known
- ☛ : assumed known, with an invitation to check it
- ☝☀ : important item which will be discussed in detail: start thinking about it
- ★ : nice observation
- ☠ : possible misinterpretation/pitfall
- Lecture 1 (12/1/17)
- Introduction to the course:
- Entry self-test.
- Some comments on the first seven problems
(reminders and overview of issues we shall see in detail):
- Probability functions and probability density functions
☑
- Expected value, variance and standard deviation
of discrete and continuous uncertain numbers
(“random variables”)
☑
- Uniform distribution
☑
- Gaussian distribution
☑
- Central Limit Theorem
☛
- Binomial and Poisson distributions
☑
- Probability 'of' true values, of hypotheses, of causes,
of model parameters, etc
☝☀
('of': “of being true”;
or “of being in a given interval”)
- Observations, inference and predictions
☝☀
- Distribution of the arithmetic average
☑
- Uncertainty about the Gaussian's parameter μ
☝☀
- Uncertainty about the binomial's parameter p
and the Poisson's parameter λ
☝☀
- Distribution of observations ('X') in the
case of uncertain parameters
☝☀
- Chi-square distribution
☛
- Poisson distribution and Poisson process
☝☀
- First introduction to R (see also here
for a starting point and some examples
related to the course).
- Handling vectors;
- Simple plots;
- Simple functions;
- Probability distribution, e.g. dnorm, pnorm, qnorm and rnorm,
etc.: 'unif', 'binom', 'pois', 'chisq', 'exp'...
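As a quick illustration of the naming convention mentioned above, the prefixes d, p, q and r select density (or probability function), cumulative function, quantile and random generation. The numerical arguments below are arbitrary choices made only for this sketch.

```r
# The four prefixes, shown for a standard normal
dnorm(0)            # density f(x) at x = 0
pnorm(1.96)         # cumulative F(x) = P(X <= 1.96)
qnorm(0.975)        # quantile: inverse of pnorm
set.seed(1)         # arbitrary seed, only for reproducibility
x <- rnorm(10000)   # 10000 pseudo-random values

# The same prefixes apply to the other families
dbinom(3, size = 10, prob = 0.5)   # P(X = 3) for a binomial
ppois(2, lambda = 3)               # P(X <= 2) for a Poisson
qexp(0.5, rate = 2)                # median of an exponential
mean(runif(10000))                 # sample mean of uniform values
```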
- Lecture 2 (13/1/17)
- Intro 2
- More on the entry test.
- From the probability distribution of the value of the observation
given a parameter to the probability of the value of the parameter
given the observation
☝☀
- f(x|μ) → f(μ|x) [the average can be considered
an 'equivalent' measurement ('statistical sufficiency')
with σ/sqrt(n) ]
- f(x|λ) → f(λ|x)
- f(x|n,p) → f(p|n,x)
- Predictive distribution: f(x_{f}|x_{p}), with
x_{p} and x_{f} past and future
observations
☝☀
- Gaussian case: ∫ f(x_{f}|μ) f(μ|x_{p}) dμ [integral from -∞ to +∞]
→ √2 effect!!
☝☀
→ problem nr. 3.c and (under Gaussian approximation)
nr. 6.
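The √2 effect in the Gaussian case can be checked with a short simulation. This is only a sketch: it fixes a hypothetical true value mu and a known sigma (both arbitrary), and looks at the spread of a future observation around a past one, which is what the predictive distribution describes when the prior on μ is flat.

```r
set.seed(2017)              # arbitrary seed
n     <- 100000
mu    <- 10                 # hypothetical true value
sigma <- 2                  # hypothetical known standard deviation
xp <- rnorm(n, mu, sigma)   # past observations
xf <- rnorm(n, mu, sigma)   # future observations, same mu
ratio <- sd(xf - xp) / sigma
ratio                       # expected to be close to sqrt(2)
```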
- Exponential distribution
☛
- Probability distribution vs (past) frequency distribution:
the summaries of the distribution have similar names
☑,
but different meaning
☝☀.
In particular expected value
and standard uncertainty (σ) only apply to
probability distributions.
- P-values
☠
☝☀
- Covariance and correlation coefficient
☛
- Effect of 'reconditioning' (problem nr 9)
☝☀
- About frequentist confidence intervals at a given
confidence level:
☠
☝☀
- Please try to recollect how you have learned them!
☛
- Case of the Higgs boson (before its experimental observation
in the final state):
- Reading error and measurement uncertainties due to instruments
with analogue scales (plus other issues related
to standard teaching of “errors in measurements”)
☛
- Uncertainty related to a digital instrument (having no systematic
error)
- Probability distribution of the sum and difference of two dice
☛
→ Probability distribution of the sum and difference of
independent uniformly distributed variables
“with the same widths”
☝☀
- ISO GUM
(see also here):
example of evaluation
of Type B Uncertainty → Section 4.3.7
- General theorem concerning linear combination of
uncertain numbers (what you are used to calling
“random variables”, but without
going through the 'metaphysical' issue of
what is randomness —
what matters is uncertainty! ☝☀)
- Expected value of a linear combination
☑
- Variance of a linear combination (of independent variables)
☑
- Central Limit Theorem
☑
- Playing with simulations, e.g. with R
- n=100000; x1=runif(n); x2=runif(n); x3=runif(n) # etc. ...
- hist(x1+x2, nc=100, col='cyan')
- hist(x1-x2, nc=100, col='cyan')
- hist(x1+x2+x3, nc=100, col='cyan')
- etc. ...
- How to present a result affected by uncertainty ☝☀ :
- possibly give the information about the pdf;
- give at least expected value and 'standard uncertainty'
- possibly give more 'summaries' if the distribution is not trivial
- providing intervals of 'certainty' in most cases amounts to
providing no information at all (imagine the Gaussian case)
- About the results presented as "most likely value" with "asymmetric uncertainty"
☠
☝☀
→ There are no general theorems
(of the kind of those for expected values and variances)
for modes and probability intervals!
☑
- Finally starting...
- Uncertainty and probability
- What does 'statistics' mean?
- Lies, damn lies and statistics.
- Getting confused by p-values.
- Odds and probabilities.
- Role of the bet.
- Probabilities and coherent bets (de Finetti).
- Lecture 3 (19/1/17)
- Claims of discoveries based on 'sigmas'
- More on p-values, with real cases from particle physics.
- References
- Lecture 4 (20/1/17)
- R session — Measurements
- Some of the problems of the entry test reviewed with R
- Rhistory_20Jan.txt
☝☀
- From the binomial distribution to the
Bernoulli theorem (beware of its misinterpretations!)
☝☀
- More on linear fits on simulated data:
fit_residui.R
To execute the script from an R session:
- source("fit_residui.R")
☝☀
→ change the model parameters (m, c and sigma),
or even the nr of data points, to get a feeling of the
results.
- Observations (not “observables” as meant e.g. by
particle physics theorists) vs model parameters
('true values').
- Higgs → γγ at LHC vs Anderson's positron.
- Measuring a mass on a scale: the 'infinite' complications
of a high precision and high accuracy measurement.
- Mass → reading
↔ reading → mass.
- Clarifying the terminology
[ ☛
ISO GUM
(see also
here)]:
measurand; result of a measurement; uncertainty; error; true value.
- Type A and Type B uncertainties (according to the ISO GUM)
☝☀
- Sources of uncertainties ★
☛
ISO GUM
(with details in “decalogo ISO”)
- Lecture 5 (24/1/17)
- Probabilistic inference (and forecasting)
- Learning about causes of effects:
- Observation → hypotheses
- Causes → effects: deep reason of uncertainty
- “The essential problem of the experimental method” (Poincaré)
- An example: AIDS test.
- Model thinking: “The knowledge of a single effect
acquired through its causes opens our mind to understand
and be assured of other effects without
the need of making experiments” (Galileo)
(By the way,
one of the best Coursera courses)
- The six box toy experiment:
- Where is probability?
- What is probability?
- Meaning of subjective probability: probability is always
conditional probability
- About probability and its dependence on the status of information:
- With Schroedinger's words.
- Ellsberg paradox ☛ Wiki
- Variants of the three box problem
☛ "Monty Hall" → Wiki
+ apps
- Probability and coherent bet.
- Basic rules of the mathematics of beliefs
- Laplace's Bayes theorem
References
Problems (AIDS; three boxes with Gold/Silver rings;
particle identification)
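A minimal sketch of how Bayes' theorem applies to a problem of the AIDS-test kind. The three input numbers are illustrative assumptions chosen for this example, not the values of the actual problem. The same result is obtained through odds, multiplying the prior odds by the ratio of the two likelihoods.

```r
prior    <- 1/600    # assumed P(Infected) before the test
p_pos_i  <- 1.0      # assumed P(Positive | Infected)
p_pos_ni <- 0.002    # assumed P(Positive | Not infected)

# Bayes' theorem for the hypothesis 'Infected' given a positive result
post <- p_pos_i * prior / (p_pos_i * prior + p_pos_ni * (1 - prior))

# Same result via odds: posterior odds = likelihood ratio * prior odds
bf        <- p_pos_i / p_pos_ni
post_odds <- bf * prior / (1 - prior)
post2     <- post_odds / (1 + post_odds)

c(post, post2)
```

With these assumed inputs the probability of infection after a positive result stays below 50%, even though the test looks very reliable.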
Self study (for the moment)
- try to remember/understand
how the frequentistic confidence intervals
were taught during your studies.
(We shall come back to them)
- Lecture 6 (25/1/17)
- Bayesian reasoning
- Some practice with R: qnorm() etc.; quantile(); barplot().
- Solution of selected problems: AIDS; boxes with Gold/Silver rings.
☛
Particle identification: try again, with all variants.
- Laplace's teaching and the original sin
of the hypothesis tests (→ p-values).
- Prior and posterior probabilities, probability
ratios and odds.
- Bayes factor, or Bayes-Turing factor
- Why do p-values "often work":
- Different ways to express Laplace's Bayes theorem.
- Probability of the sequences WW, WB, BW and BB from the
box of unknown composition and from the box with 5W and 5B.
Try to solve the problem in two different ways:
- using the conditional probability of the second extraction,
given the first one (as outlined during the lecture);
- starting from the probabilities of the different sequences,
given the box content, and then 'averaging' the result
with the probabilities of the different compositions.
- Application to the six box toy experiment:
- updating the probabilities of the hypotheses;
- updating the probabilities of the effects;
- sequential use of the Bayes theorem: posterior of the
previous inference becomes the prior of the following inference.
- analysis of simulated sequences
- comparison between probability and past frequencies of the outcomes;
- on the impossibility of comparing the probabilistic results
concerning the probability of hypotheses with
frequentistic methods;
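The sequential updating listed above can be sketched in a few lines. The setup is an assumption made for this example: six boxes H_0 ... H_5, each holding 5 balls of which i are white in box H_i, with extractions followed by reintroduction. The observed sequence is hypothetical.

```r
n_white <- 0:5                  # assumed: box H_i holds i white balls out of 5
p_white <- n_white / 5          # P(White | H_i)
prob_H  <- rep(1/6, 6)          # uniform prior over the six boxes

obs <- c("W", "W", "B", "W")    # hypothetical sequence of extractions

for (o in obs) {
  lik    <- if (o == "W") p_white else 1 - p_white
  prob_H <- lik * prob_H / sum(lik * prob_H)   # posterior becomes the next prior
  p_next <- sum(p_white * prob_H)              # P(next is White | data so far)
  cat(o, ": P(H_i) =", round(prob_H, 3), " P(W next) =", round(p_next, 3), "\n")
}
```

Each pass updates both the probabilities of the hypotheses and the probability of the next effect, so the loop covers the first three items of the list.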
- Complicating the problem
- Adding a 'Reporter' (or witness), which could lie or err
(detectors do err);
- Adding a different hypothesis concerning the box preparation
(uniform vs binomial).
- Probability vs propensity
- On the verifiability of events that are the subject of probabilistic statements.
- From the probability of H_{i} to the
probability distribution of p.
Main references
Try to play with the R commands shown in the papers
(or to reproduce them in other 'human oriented' scripting languages)
Probability distributions — a useful vademecum
Please come to the next lecture with
Probability distributions
installed (there is also an Apple version).
- Lecture 7 (26/1/17)
- Bayesian inference: applications (+ other matter)
- From the inference of the box composition
to the inference of an uncertain number (p):
- Bayes' billiard
- Importance of graphical models in probabilistic inference
(→ Bayesian networks):
- learning about a Poisson intensity ('r') from an
observed number of events and the best knowledge
of the background;
- model of fits (including systematics)
- Generalities about 'uncertain numbers'.
- Probability distributions vs statistical distributions:
- Example of probability distribution:
n=10; x=0:n; p=0.7; barplot(dbinom(x,n,p), names=x)
- Example of statistical distribution
n=10; p=0.7; x=rbinom(10000,n,p); barplot(table(x))
- Example, continued from the previous one, of a probability
distribution derived from a statistical distribution:
sample(x)[1]
(the probability of each outcome is proportional to the
bar heights of barplot(table(x)) )
- First intro to Monte Carlo:
- Partitioning of F(x) → probability of occurrence
proportional to the 'steps' in F(x) [for discrete variables]
→ probability of occurrence
proportional to the slope of F(x) [for continuous variables]
→ inverting F(R)
- f(x)=3x^2 (0<=x<=1) → F(x)=x^3 → runif(1)^(1/3)
- write a generator
such that f(t) = k exp(-t/τ)
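A possible sketch of the inversion method for the two cases above. The value of tau is a hypothetical choice, and k = 1/τ is the normalization of the exponential pdf on t ≥ 0.

```r
set.seed(1)                   # arbitrary seed
n <- 100000

# f(x) = 3 x^2 on [0,1]: F(x) = x^3, hence x = u^(1/3) with u uniform
x <- runif(n)^(1/3)
mean(x)                       # expected value is 3/4

# f(t) = (1/tau) exp(-t/tau): F(t) = 1 - exp(-t/tau), hence t = -tau * log(1 - u)
tau   <- 2.5                  # hypothetical lifetime
times <- -tau * log(1 - runif(n))
c(mean(times), sd(times))     # both expected to be close to tau
```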
- Hit/miss method for sampling and for integration
- estimating π:
n=10000; x=runif(n); y=runif(n); sum(x^2+y^2 <= 1)/n * 4
- Given the very complicated function
( sin(log(x^2+1)/4) )^(1+log(sqrt(x)+1)) * exp(-(x-0.1)^2/0.1) / sqrt(log(x+2) )
with x in the interval [0,1]:
- Estimate by sampling (hit/miss) its integral
in the interval [0,1];
- Sample x using the hit/miss algorithm...
- ... and evaluate average and standard deviation.
[As far as points 2 and 3 are concerned,
it is easy to understand that normalization
is not needed.]
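One way to approach the three points with hit/miss, given here as a sketch rather than as the reference solution. The height of the enclosing box (fmax) is taken from a fine grid with a small safety margin, which is an assumption that seems adequate for this smooth function. The sample size and seed are arbitrary.

```r
f <- function(x) {
  ( sin(log(x^2 + 1)/4) )^(1 + log(sqrt(x) + 1)) *
    exp(-(x - 0.1)^2 / 0.1) / sqrt(log(x + 2))
}

set.seed(1)                                # arbitrary seed
n    <- 200000
fmax <- 1.05 * max(f(seq(0, 1, length.out = 1001)))   # box height (assumption)

x   <- runif(n)                            # candidate abscissae in [0,1]
y   <- runif(n, 0, fmax)                   # candidate ordinates in [0,fmax]
hit <- y <= f(x)                           # points falling under the curve

I_hm <- mean(hit) * fmax                   # box area (1 * fmax) times hit fraction
xs   <- x[hit]                             # accepted x values are distributed as f(x)
c(integral = I_hm, mean = mean(xs), sd = sd(xs))
```

The integral can be compared with integrate(f, 0, 1)$value. The accepted points depend only on the shape of f(x), which is why normalization plays no role in points 2 and 3.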
- Bernoulli process and related probability distributions
- Bernoulli distribution;
- Geometric distribution (the "drunk man problem": p=1/8)
- On the occurrence of rare events
- Binomial distribution
... and probability of relative frequencies (X/n)
- Bernoulli theorem, again;
- no relative frequency is, strictly speaking, impossible,
although some can be VERY IMPROBABLE, e.g.
- n=1000; dbinom(n, n, 1/n, log=TRUE) / log(10)
- Pascal distribution.
- Bayes theorem for probability functions and for probability
density functions.
- First (very important, historically and practically) application:
inferring the p parameter of a Bernoulli process from
- first trial in which the event of interest occurs;
- the number (x) of successes in n trials;
- the detail of the sequence
→ same dependence on the nr of successes and failures
- Conjugate priors:
- Binomial ↔ Beta
- Poisson ↔ Gamma
- Gaussian ↔ Gaussian
- Updating rule for the Beta parameters, given x successes and
(n-x) failures
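The updating rule for the Beta parameters can be checked numerically against a brute-force application of Bayes' theorem on a grid. The data (7 successes in 10 trials) are hypothetical, and the prior is the uniform one, Beta(1,1).

```r
a0 <- 1; b0 <- 1         # uniform prior: Beta(1,1)
n  <- 10; x <- 7         # hypothetical data: 7 successes in 10 trials

a1 <- a0 + x             # alpha grows by the number of successes
b1 <- b0 + (n - x)       # beta grows by the number of failures

# Cross-check: likelihood times prior, normalized numerically on a grid
p    <- seq(0, 1, length.out = 1001)
post <- dbinom(x, n, p) * dbeta(p, a0, b0)
post <- post / (sum(post) * (p[2] - p[1]))
max_diff <- max(abs(post - dbeta(p, a1, b1)))

c(mean = a1 / (a1 + b1), max_diff = max_diff)
```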
Assignments (please try!)
- Practice with the Beta distribution of the app,
for example with the following sequence
of parameters (α,β):
(1,1), (2,1), (3,1), (4,1), (4,2), (4,3), (5,3), etc.
Does it remind you of something?
- Examples in R:
- p=seq(0,1,len=100); plot(p,dbeta(p,5,3),ty='l')
- pbeta(0.5, 5, 3)
- qbeta(0.5, 5, 3)
- hist( rbeta(100000, 5, 3), nc=100 )
- qbeta(0.5, 5, 3)
- quantile( rbeta(100000, 5, 3), 0.5 )
- Using a uniform prior, i.e. (1,1), and using
the 'Formulas'[*] of the app, evaluate the
expressions of:
- f(p|n,x)
- E[p|n,x] (and verify that you get the
Laplace's rule of succession)
- σ(p|n,x)
[*] Pay attention: the formulae, written for the generic
variable 'x', have to be rewritten for the variable 'p',
in which we are interested.
- Gamma distribution (using the notation of the app):
- Special case of α = 1 :
which function do we recover?
- What happens if α = 1 and β → 0 ?
- Examples in R:
- In analogy to the previous commands for the beta,
try to practice with
dgamma(), pgamma(), qgamma() and rgamma(),
understanding the meaning
of the outcomes.
- Analysing the structure of the Poisson distribution (with
parameter λ) and the Gamma distribution
(in the variable λ),
- find the update rule of the Gamma (expressed
as a function of λ), given x observed counts
- find the expression of
- f(λ|x)
- E[λ|x]
- Var(λ|x)
- σ(λ|x)
in the special case in which α=1 and β=0.
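A numerical cross-check of the Gamma update for a Poisson count, offered as a sketch. It assumes that β is a rate parameter (so that α=1, β→0 gives a flat prior), which may differ from the convention of the app. The observed count is hypothetical.

```r
alpha0 <- 1; beta0 <- 0       # flat prior as the limit alpha = 1, beta -> 0 (rate convention assumed)
x <- 5                        # hypothetical observed count

alpha1 <- alpha0 + x          # shape increases by the observed count
beta1  <- beta0 + 1           # rate increases by one per observation

# Cross-check on a grid: Poisson likelihood times a constant prior
lambda <- seq(0, 40, length.out = 4001)
post   <- dpois(x, lambda)
post   <- post / (sum(post) * (lambda[2] - lambda[1]))
max_diff <- max(abs(post - dgamma(lambda, shape = alpha1, rate = beta1)))

c(E = alpha1 / beta1, sigma = sqrt(alpha1) / beta1, max_diff = max_diff)
```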
- Inference of μ of Gaussian, given the observed value x
and assuming σ well known:
- Assuming a uniform prior, i.e. f_{0}(μ) = const,
- find the expression of f(μ|x,σ),
- find the expression of the expected value
and the 'standard uncertainty' of μ.
Other references
- As far as the technical properties of the various probability
distributions are concerned, see e.g.
the lecture notes
Probabilità e incertezze di misura,
Parte 2 and Parte 3 (in Italian, but math is math, and the names
are similar).
- Lecture 8 (3/2/17)
- Bayesian inference: applications (+ other matter)
- More on inference of Bernoulli p and Poisson λ
- More on conjugate priors.
- Poisson process and related distributions: Poisson,
exponential, Erlang → Gamma.
- Summary of distributions derived from the Bernoulli process.
- Markov and Chebyshev inequalities (not relevant for the course).
- More on the misunderstandings of Bernoulli's theorem.
- 'Memoryless' property of the geometric and exponential distributions.
- Generalities about probability distributions of continuous variables
(a reminder).
- Triangular distributions (symmetric and asymmetric) and their
role in modelling uncertainties due to 'systematic errors'.
- Decays: from probabilistic model to (approximated) deterministic
model described by the famous differential equation whose solution
is the exponential distribution:
- How to measure lifetimes without having to observe
the instants of birth and death.
The proton that was 10^{25}
years old (according to Corriere della Sera):
- Gaussian distribution
- A short reminder
- The reason why many distributions 'tend to'
a Gaussian, thanks to the Central Limit Theorem:
reproductivity property under the sum.
- Properties of the -log of the Gaussian pdf:
- parabola with the minimum at μ and
second derivative equal to 1/σ^{2}
- The “Gaussian trick” (not official name,
and anyway due to Laplace) to evaluate in a simple
way (just derivatives, instead of integrals!!)
μ and σ of a pdf 'assumed to be'
(by general arguments or by visual inspection)
approximately Gaussian
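A numerical version of the “Gaussian trick”, as a sketch. It uses the Poisson likelihood with a flat prior, so that f(λ|x) ∝ λ^x exp(-λ), and a hypothetical count x = 20. The minimum of -log f gives μ and the second derivative at the minimum gives 1/σ^2. The finite-difference step h is an arbitrary small value.

```r
x   <- 20                                           # hypothetical observed count
phi <- function(lambda) lambda - x * log(lambda)    # -log f(lambda|x), constants dropped

# mu: location of the minimum of phi
mu_hat <- optimize(phi, interval = c(1, 100), tol = 1e-8)$minimum

# sigma: from the second derivative at the minimum (central finite difference)
h  <- 1e-3
d2 <- (phi(mu_hat + h) - 2 * phi(mu_hat) + phi(mu_hat - h)) / h^2
sigma_hat <- 1 / sqrt(d2)

# Compare with the exact mean and standard deviation of a Gamma(x+1, rate 1)
rbind(trick = c(mu_hat, sigma_hat), exact = c(x + 1, sqrt(x + 1)))
```

The small differences between the two rows come from the asymmetry of the exact pdf, which the Gaussian approximation ignores.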
References
Assignments
- ☛ Problem 4 of the entry test
- ☛ Problem 7 of the entry test
- Apply the “Gaussian trick” in the following
cases
- f(λ|x) assuming a flat pdf for λ;
- f(p|n,x) assuming a flat pdf for p.
(Note: multiplicative factors are irrelevant
because the 'trick' is based on the log).
- Install JAGS and rjags and test the installation by executing
the following simple scripts
→ R scripts can be executed e.g. by
source("simpleMC_1.R").
- Lecture 9 (19/2/17)
- Multidimensional problems
- Multiple variables ('uncertain vectors'): generalities
for discrete and continuous variables (including extension of
Bayes theorem).
- Covariance and correlation coefficient.
- Independence vs null covariance.
- Bivariate normal distribution: joint distribution,
marginal and conditional →
problem 7 of entry test.
- Matrix form of the bivariate normal distribution and
its extension to the multivariate case:
covariance matrix
- 'Gaussian trick' in many dimensions.
- Linear combination of uncertain variables:
expected values, variance and covariances (also
taking into account possible covariances between 'input quantities'):
→ compact form ('transformation of covariance matrices').
- Linearization: the matrix C becomes the matrix of derivatives evaluated
at the expected values of the input quantities.
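The 'transformation of covariance matrices', V_Y = C V_X C^T, can be sketched as follows for a linear case and for a linearized product. All input values (expected values, standard deviations and correlation coefficient) are hypothetical.

```r
# Hypothetical input quantities X1, X2
mu_x <- c(2, 3)              # expected values
sig  <- c(0.1, 0.2)          # standard deviations
rho  <- 0.5                  # correlation coefficient
V_x  <- diag(sig) %*% matrix(c(1, rho, rho, 1), 2, 2) %*% diag(sig)

# Linear case: Y1 = X1 + X2, Y2 = X1 - X2
C_lin <- matrix(c(1,  1,
                  1, -1), 2, 2, byrow = TRUE)
V_y   <- C_lin %*% V_x %*% t(C_lin)

# Linearization: Y = X1 * X2, derivatives evaluated at the expected values
C_nl   <- matrix(c(mu_x[2], mu_x[1]), 1, 2)   # dY/dX1 = X2, dY/dX2 = X1
V_prod <- C_nl %*% V_x %*% t(C_nl)

V_y             # covariance matrix of (Y1, Y2)
sqrt(V_prod)    # standard uncertainty of the product
```

Note that Y1 and Y2 come out correlated even if X1 and X2 were not, whenever the two input variances differ.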
- General solution of uncertainty about functions of uncertain variables:
discrete and continuous case (with the Dirac delta).
- Warning about propagating "best values" and "confidence intervals".
- Details on the inference of the parameters of a Gaussian model:
- Single measurement: → μ;
- Role of prior and Gaussian prior: → "combinations of results";
- Predictive distribution: problem 3
of entry test (taking the arithmetic average as an
'equivalent single measurement', by virtue of
sufficiency)
- Inferring μ from a sample → statistical sufficiency
of the arithmetic average.
- Joint inference of μ and σ from a sample:
- general ideas;
- approximated results for large samples
(application of the 'Gaussian trick', details
left as exercise).
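The role of a Gaussian prior listed above ("combinations of results") can be illustrated with a sketch in which the prior summarizes a previous result and the likelihood a new measurement with known σ. The numbers are hypothetical. The closed-form weighted average is compared with Bayes' theorem evaluated on a grid.

```r
# Hypothetical prior knowledge about mu and a new measurement
mu0 <- 10.0; s0 <- 0.5      # prior: Gaussian with mean mu0 and standard deviation s0
x   <- 11.0; s  <- 0.3      # new observation with known sigma

w0 <- 1 / s0^2; w <- 1 / s^2               # weights = inverse variances
mu_post <- (w0 * mu0 + w * x) / (w0 + w)   # posterior expected value
s_post  <- 1 / sqrt(w0 + w)                # posterior standard uncertainty

# Cross-check with Bayes' theorem evaluated on a grid of mu values
mu   <- seq(5, 15, length.out = 10001)
post <- dnorm(x, mu, s) * dnorm(mu, mu0, s0)
post <- post / sum(post)
mu_grid <- sum(mu * post)
s_grid  <- sqrt(sum((mu - mu_grid)^2 * post))

rbind(formula = c(mu_post, s_post), grid = c(mu_grid, s_grid))
```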
References
- Lecture 10 (16/2/17)
- Systematics -- Fits
- Uncertainty due to 'systematics':
- Reminder of ISO terminology: variables of influence
- General strategies to include the uncertainties due to
uncertain variables of influence.
- Detailed case of the uncertain offset.
- Correlations of results affected by common systematics.
- Handling uncertainty due to systematics by transformation:
detailed case of offset and scale uncertainty.
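How a common uncertain offset correlates two results can be seen in a small simulation. This is a sketch with hypothetical standard deviations: each result has its own independent contribution plus the same offset value z.

```r
set.seed(3)                  # arbitrary seed
n  <- 200000
s1 <- 0.3; s2 <- 0.4         # hypothetical independent standard deviations
sz <- 0.5                    # hypothetical standard uncertainty of the common offset

z  <- rnorm(n, 0, sz)        # same offset value shared by both results
x1 <- rnorm(n, 0, s1) + z
x2 <- rnorm(n, 0, s2) + z

rho_sim <- cor(x1, x2)
rho_th  <- sz^2 / sqrt((s1^2 + sz^2) * (s2^2 + sz^2))   # Cov = sz^2 from the linear combination rules
c(simulated = rho_sim, expected = rho_th)
```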
- Fits
- Building up the model as a Bayesian network
- Simplest case of linear model between true values,
with a bunch of assumptions/approximations
- Exact solution of the 'simplest model' with known
'σs'.
- Example with JAGS:
References