Probabilistic Inference and Forecasting in the Sciences
Lectures to PhD students in Physics (39° Ciclo)
(Giulio D'Agostini and Andrea Messina)


[Poster for lectures at LNF]

The course consists of about 40 hours of lectures (6 credits).

Contents


Timetable

Nr.   Day          Time         Room
1     Mon  8 Jan   10:00-12:00  Sala Lauree
2     Wed 10 Jan    9:00-12:00  Sala Lauree
3     Fri 12 Jan    9:00-12:00  Sala Lauree
4     Mon 15 Jan    9:00-11:00  Aula 8
5     Wed 17 Jan    9:00-11:00  Aula 8
6     Fri 19 Jan    9:00-12:00  Aula 8
7     Mon 22 Jan    9:00-11:00  Aula 8
8     Wed 24 Jan    9:00-11:00  Aula 8
9     Fri 26 Jan    9:00-12:00  Aula 8
10    Mon 29 Jan    9:00-11:00  Aula 8
11    Wed 31 Jan    9:00-11:00  Aula 8
12    Fri  2 Feb    9:00-12:00  Sala Lauree
-----------------------------------------
13    Mon 12 Feb    9:00-11:00  Sala Lauree
14    Wed 14 Feb    9:00-11:00  Aula 8
15    Fri 16 Feb    9:00-12:00  Aula 8
-----------------------------------------
16    Mon 26 Feb    9:00-11:00  Aula 8
(*)   Mon 26 Feb   16:00        Aula Conversi
17    Wed 28 Feb    9:00-11:00  Aula 8
18    Fri  1 Mar    9:00-11:00  Aula 8

(*) Seminar on 30 years of Bayesian unfolding
Lecture 1 (8 January)
 
Introduction to the course and entry test,
in particular (although at a qualitative level).
 
References, links, etc.
 
Lecture 2 (10 January)
 
 
References, links, etc.
  • A cultural/scientific must, mentioned during the lecture
     
    Lecture 3 (12 January)
     
    • Probabilistic statements about the numerical values of physical quantities
      vs confidence intervals and coverage
      (somewhat qualitative, as an invitation to think about the issue and, as far as the 'frequentist' terms are concerned, to look them up in the preferred books and lecture notes).
    • Causes → Effects, and back.
    • Simple example with two causes and two effects (AIDS test; see the sketch after this list).
    • P(A|B) vs P(B|A).
    • 'Prosecutor fallacy' (aka base rate fallacy, base rate neglect or base rate bias).
    • What is 'statistics'? [→ "Lies, damned lies and statistics"]:
      descriptive statistics, probability theory, inference.
    • "Lies, damned lies and... Physics": claims of discoveries based on 'p-values' ('n-σ')
     
    References, links, etc.
     
    Lecture 4 (15 January)
     
    • Entry test: nr. 10, with qualitative discussion of related concepts:
      • inferring several physical quantities from the same dataset;
      • they are, in most cases, 'correlated', i.e. 'not independent';
      • reconditioning the value of some quantities on the 'assumed exact value' of the others
        (but uncertainty on the conditioning quantities can also be taken into account).
      • Special case of two quantities, e.g. m and p resulting from a linear fit:
        • graphical representation of the result in the (m,p) plane;
        • graphical meaning of the correlation;
        • reason why, if the data points (x and y) are in the first quadrant, ρ is negative (→ m and p anticorrelated).
      Blackboard
    • Probability of hypotheses vs 'classical' hypothesis tests
    • More on χ2: see last slide of previous lecture (updated)
    • Mechanism behind the 'classical' hypothesis tests:
      • from falsificationism to p-values
        ... and misunderstandings(!!)
    • Doing logical mistakes and crimes with p-values:
      • p-hacking
    • Examples of p-values based on χ2 (see the sketch below)
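    A quick numerical sketch of how such a p-value is computed (illustrative numbers; scipy's survival function does the tail integral):

      # p-value from an observed chi^2 with 'dof' degrees of freedom.
      from scipy.stats import chi2

      chi2_obs, dof = 25.0, 12              # illustrative values
      p_value = chi2.sf(chi2_obs, dof)      # P(chi^2 >= chi2_obs | H0)
      print(f"p-value = {p_value:.4f}")
      # Remember: this is NOT the probability of the hypothesis!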
     
    References, links, etc.
     
    Lecture 5 (17 January)
     
    • Learning from data: model thinking
    • A toy experiment: six boxes, each containing five balls:
      • Probability of the box composition;
      • Probability of the next extraction
        (we have analysed, for the moment, the case of 'reintroduction').
      It will be the guiding example for many concepts of the course.
    • Ellsberg paradox (shown without mentioning the name)
    • What/where is probability?
    • Subjective nature of probability:
      • Probability is always conditional probability:
        P(E | I_s(t)), i.e. conditioned on the state of information I of the subject s at the time t
    • Role of bets, and in particular of coherent bets, in order to express the 'degree of belief' ('of certainty', etc.)
    • On the standard (old) textbook 'definitions' of probability
    • Basic rules of probability (3+1):
      for the moment, just a reminder, with clarifications
    • Rule to update the probability in the light of a new piece of information:
      → just a mathematical manipulation of the fourth basic rule
      (no membership to a sect required...)
      → Bayes' theorem, or 'rule'
      (indeed, calling it 'theorem' is perhaps too much; calling it 'principle' shows mental confusion...)
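    In formulae, the update rule follows in one line from the fourth basic rule (the product rule), written for both orders of conditioning:

      \[
        P(E \cap H \,|\, I) \;=\; P(E \,|\, H, I)\, P(H \,|\, I) \;=\; P(H \,|\, E, I)\, P(E \,|\, I)
        \quad\Longrightarrow\quad
        P(H \,|\, E, I) \;=\; \frac{P(E \,|\, H, I)\, P(H \,|\, I)}{P(E \,|\, I)}\,.
      \]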
     

     
    Lecture 6 (19 January)
     
    • From the basic rules of probability to Laplace's "Bayes Theorem"
    • Proposed problems:
      • AIDS test
      • Three boxes and two rings
      • Particle identification
      • Six-box toy experiment:
        • analysis of the experimental data (5 times Black):
          → evolution of the probabilities of the box compositions;
          → evolution of the probability of the next Black
          (BUT remember that the box composition remains constant: WHERE IS PROBABILITY?)
          A numerical sketch follows this list.
        • Homework at p. 14 of the slides
    • Playing with Hugin
      (six box problem, with variations, framed in a Bayesian network)
    • On the intuitive evaluations of probability
      (following David Hume, with caveats concerning induction)
    • "Bayes' factor" (conceptual tool proposed by Gauss!)
     
    References, links, etc.
     
    Lecture 7 (22 January)
     
    • Six boxes analysed by HUGIN:
      suggested variations (no code provided — try it)
    • Uncertain numbers — an introduction.
    • More about the importance of the state of information in our scientific judgements (cows and sheep jokes).
    • Extending the past to the future (possibly avoiding the end of the inductivist turkey...).
    • Uncertain numbers and probability distributions
    • Summaries of probability distributions (do not confuse RMS with σ!)
    • Bernoulli process and related distributions (Geometric, Binomial and Pascal).
    • Introduction to Monte Carlo sampling using basic techniques:
      • hit/miss;
      • inverting the cumulative distribution
      (techniques introduced on discrete distributions
      and easily/better extended to continuous distributions).
      Write (pseudo-)random number generators for
      1. f(x) ∝ x   (0 ≤ X ≤ 1)
      2. f(t) ∝ e^(-t/τ)   (0 ≤ T < ∞)
      3. f(x) ∝ cos(x)   (-π/2 ≤ X ≤ π/2)
      4. X ∼ B(n=10, p=1/5) (binomial distribution)
      [Note: in cases 1, 3 and 4 use both techniques we have seen;
      in case 2 we can only use the "F⁻¹(y_R)" technique.
      A sketch of the first three generators follows this list.]
    • From the Bernoulli process to the Bernoulli theorem:
      • Bernoulli distribution
      • Geometric distribution
      • Pascal distribution (no details for the moment)
      • Binomial distribution
    • Bernoulli theorem: "p → fn"
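    A minimal sketch of the first three proposed generators (my own code, using the two standard techniques above):

      import math
      import random

      # 1. f(x) ∝ x on [0,1]: F(x) = x^2, so inverting the cumulative gives X = sqrt(y_R).
      def sample_linear():
          return math.sqrt(random.random())

      # 2. f(t) ∝ exp(-t/tau) on [0,∞): T = -tau * ln(1 - y_R) (inverse-CDF only).
      def sample_exponential(tau=1.0):
          return -tau * math.log(1.0 - random.random())

      # 3. f(x) ∝ cos(x) on [-π/2, π/2], via hit/miss with f_max = 1.
      def sample_cosine():
          while True:
              x = random.uniform(-math.pi / 2, math.pi / 2)
              if random.random() <= math.cos(x):    # accept with probability cos(x)/f_max
                  return x

      print(sample_linear(), sample_exponential(), sample_cosine())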
     
    References, links, etc.
     
    Lecture 8 (24 January)
     
    • Poisson process and related distributions (Poisson, Exponential and Erlang);
      see the numerical sketch after this list.
          → Entry test nr. 8
    • Bernoulli theorem: meaning and misunderstandings.
      • decay lifetime vs half-life of radioactive decays.
    • Markov and Chebyshev inequalities
    • Continuous probability distributions,
      with some important examples:
      • (negative) exponential;
      • uniform;
      • triangular (symmetric and asymmetric)
      • Gaussian
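    A small numerical check (my own, with illustrative values) of the link between the two faces of the Poisson process: exponential waiting times with τ = 1/r between counts imply that the number of counts in a window T is Poisson-distributed with expectation rT:

      import math
      import random

      # Simulate a Poisson process of rate r via exponential waiting times,
      # counting the arrivals that fall within a window of length T.
      def poisson_count(r, T):
          t, n = 0.0, 0
          while True:
              t += -math.log(1.0 - random.random()) / r   # exponential gap, tau = 1/r
              if t > T:
                  return n
              n += 1

      r, T, trials = 2.0, 3.0, 50_000
      counts = [poisson_count(r, T) for _ in range(trials)]
      print("empirical mean:", sum(counts) / trials, " expected r*T:", r * T)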
     
    References, links, etc.
     
    Lecture 9 (26 January)
     
    • The 'Gaussian trick' (Laplace approximation — !! )
    • Propagation of uncertainties: general problem and minimal solution
    • Central Limit Theorem and its importance (it plays a central role)
    • Reproductive property of some distributions
    • Exact, approximate and MC propagation of uncertainties (see the sketch after this list)
    • Criticism concerning 'propagation prescriptions'
    • Introduction to multivariate distributions ('uncertain vectors')
    • Chain rule and its importance
    • Probabilistic dependence/independence and covariance
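    A minimal sketch of Monte Carlo propagation (my own illustrative example): Z = X/Y with independent Gaussian X and Y, compared with the linearization formula:

      import random

      # MC propagation for Z = X / Y, with X ~ N(10, 0.5), Y ~ N(2, 0.1), independent.
      N = 100_000
      z = [random.gauss(10, 0.5) / random.gauss(2, 0.1) for _ in range(N)]

      mean = sum(z) / N
      sigma = (sum((v - mean) ** 2 for v in z) / N) ** 0.5
      print(f"Z = {mean:.2f} +/- {sigma:.2f}")
      # Linearization gives sigma_Z ≈ |Z| * sqrt((0.5/10)^2 + (0.1/2)^2) ≈ 0.35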
     
    References, links, etc.
     
    Lecture 10 (29 January)
     
    • Logical vs probabilistic(*) dependence/independence
      (examples in the slides just sketched: → work out the details)
      (*) Note: in the slides the adjective 'stochastic' is often used, but I tend now to prefer the Latin root.
    • About the transitivity of probabilistic(*) dependence/independence
      (examples in the slides just sketched: → work out the details)
    • Some remarks on updating probabilities:
      ↠ P(B) is modified, in percentage, by hypothesis A
      by the same amount as P(A) is modified, in percentage, by hypothesis B
      (written out explicitly after this list).
    • More on exact propagations:
      • work out the assigned problems (before reading the solutions on the slides);
      • alternative method (w.r.t. that using the Dirac delta) for variables monotonically related;
      • mathematical transformation function equal to the cumulative:
        • the resulting variable is uniformly distributed;
        • exercise: prove it using the 'Dirac delta' method.
      • Waiting time for 'k' counts in a Poisson process:
        • Erlang distribution;
        • from Erlang distribution to Gamma distribution;
      • Summary of distributions arising from the Bernoulli process
      • Some related problems on Gaussian distributions
      • Bivariate normal distribution
        Problem nr 10 of the entry test
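    The symmetry of the relative update, in formulae:

      \[
        \frac{P(B \,|\, A)}{P(B)} \;=\; \frac{P(A \cap B)}{P(A)\,P(B)} \;=\; \frac{P(A \,|\, B)}{P(A)}\,.
      \]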
     
    References, links, etc.
     
    Lecture 11 (31 January)
     
    • Completing the exponential (i.e. completing the square), based on the following exercise:
      • given a pdf f(x) = K·exp[-α x² + β x],
        with K and α positive, find
        • E[X]
        • Var[X] and σ(X)
        without computing integrals
      (and tell which kind of distribution we are dealing with;
      a worked sketch follows this list).
    • 'Gaussian Trick' in 2 dimensions (use and misuse)
    • Extension of the bivariate normal to n dimensions:
      multidimensional reconditioning.
    • More on linear combinations of uncertain numbers
      ↠ 'Propagation' of covariance matrix
    • Propagation via linearization;
    • Special case of monomial functions
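    A worked sketch of the exercise, by completing the square in the exponent:

      \[
        -\alpha x^{2} + \beta x \;=\; -\alpha\left(x - \frac{\beta}{2\alpha}\right)^{2} + \frac{\beta^{2}}{4\alpha}\,,
      \]

    so f(x) is a Gaussian; matching the exponent to the standard form -(x-μ)²/(2σ²) gives

      \[
        \mathrm{E}[X] = \mu = \frac{\beta}{2\alpha}\,, \qquad
        \mathrm{Var}[X] = \sigma^{2} = \frac{1}{2\alpha}\,, \qquad
        \sigma(X) = \frac{1}{\sqrt{2\alpha}}\,.
      \]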
     
    References, links, etc.
    (*) Note: Ref. 3 (M. Eaton, Multivariate Statistics...) is no longer available at the website indicated there.
    However,
    • the formulae for multivariate conditioning (see slides) can be found on Wiki (see above);
    • the book can be found at an alternative link (visited 31 January 2024; pdf download not available);
    and, moreover, the book is VERY formal — NOT recommended.

     
    Lecture 12 (2 February)
     
    • A first look at JAGS/rjags for MC simulation
      → try to translate the examples in Python and/or Julia
    • Back again to the six boxes
      → simulations and comparison with frequency based evaluations of probabilities
    • Inferring p of Bernoulli processes
    • Inferring p of a binomial distribution and inference of the number of successes in future trials
      (assuming p constant although unknown — and if p depends on time? Think about it!)
      • Probability vs frequency:
        • Bernoulli theorem: p → f_n
        • Laplace's rule of succession (via "Bayes Theorem"): f_n → p
      • Conjugate prior → Beta distribution
      • Predicting the number of successes in future trials.
      • Special case in which no successes were observed (but the experiment was performed)
        → Exercise proposed at p. 28 of the slides (very important → we shall come back to the issue)
    • Inference and prediction related to Poisson processes
      • Inferring Poisson λ (and 'r')
      • Conjugate prior → Gamma distribution
      • Predicting the number of counts in future measurements (assuming constant 'r')
      • Special case in which no events have been observed (but the experiment was done!)
        Solve an exercise similar to that proposed at p. 28 of the slides
        [in this case the quantity of interest is r = λ/T
        → plot f(r | x=0, f0=k) in log-log scale].
      • Use of JAGS for inference and prediction (see examples below).
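    A minimal Python translation of the conjugate analysis for the binomial p (as suggested above for the JAGS examples; the data are illustrative), assuming a uniform Beta(1,1) prior:

      from scipy import stats

      # Inferring p of a binomial: x successes in n trials, uniform prior Beta(1,1).
      # Posterior: Beta(1 + x, 1 + n - x).
      x, n = 3, 12                                # illustrative data
      posterior = stats.beta(1 + x, 1 + n - x)

      print(f"E[p] = {posterior.mean():.3f}")     # (x+1)/(n+2): Laplace's rule of succession
      print(f"sigma(p) = {posterior.std():.3f}")
      print("95% probability interval:", posterior.interval(0.95))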
     
    References, links, etc.
     
    Lecture 13 (12 February)
     
    • Gaussian model assuming σ known:
      • joint pdf of model parameters and observed values;
      • inference of μ;
      • predicting a future x;
      • case of several 'independent' observations, with some remarks:
        • observations on independence vs conditional independence;
        • empirical observations vs measurements
          (slide nr 17, to be commented in a following lecture);
        • remarks on propagation of evidence (case of 'divergent connections')
          [for 'converging' and 'serial' connections see arXiv:1504.02065 (Sec. 11)];
      • conjugate prior.
    • Joint inference of μ and σ (general considerations — details left to self study)
      • summary of the 'large n' behaviour;
      • problem solved by JAGS
    • Details on the rejection sampling ('hit/miss') for Monte Carlo.
    • Importance sampling.
    • Introduction to Markov Chain Monte Carlo by playing with a three state toy model.
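    A minimal sketch of the three-state toy idea (the transition matrix is my own illustrative choice, not necessarily the lecture's): iterate the chain and watch the occupation frequencies converge to the stationary distribution:

      import random

      # Three-state Markov chain; P[i] = transition probabilities from state i.
      P = [[0.5, 0.3, 0.2],
           [0.2, 0.6, 0.2],
           [0.3, 0.3, 0.4]]

      state, visits = 0, [0, 0, 0]
      for _ in range(100_000):
          state = random.choices([0, 1, 2], weights=P[state])[0]
          visits[state] += 1

      total = sum(visits)
      print("occupation frequencies:", [round(v / total, 3) for v in visits])
      # For a long chain these approach the stationary distribution w = wP.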
     
    References, links, etc.
     
    Lecture 14 (14 February)
     
    • Practical introduction to MCMC (focusing on data analysis):
      • global and detailed balance conditions;
      • Metropolis algorithm (with mention of Metropolis-Hastings); see the sketch after this list;
      • simulated annealing;
      • Gibbs sampler.
    • Simple examples of inference/forecasting making use of self-made MCMCs.
    • Inferring n of a binomial given x and p.
    • Framing the previous 'exercise' in a more general model.
    • Including 'systematics' in the probabilistic model:
      • recalling the ISO's influence quantities (hereafter h);
      • possible strategies:
        1. global inference on f(μ, h) followed by marginalization;
        2. conditional inference;
        3. inference of 'μR', followed by propagation
          (method particularly suited to get approximate formulae).
      • Details on the systematics due to an uncertain offset in a Gaussian model:
        • case of a single μ;
        • influence of systematics in the result vs calibration (μ known → z);
        • case of several μ's measured with the same instrument (same 'f(z)'), with details on two μ's:
          • f(μ1, μ2 | x1, x2),
            in particular ρ(μ1, μ2).
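    A self-contained sketch of the Metropolis algorithm (my own illustrative code): with a symmetric Gaussian random-walk proposal the acceptance probability reduces to min(1, f(x')/f(x)); the target here is an unnormalized standard normal:

      import math
      import random

      def target(x):                        # unnormalized N(0,1) as illustrative target
          return math.exp(-0.5 * x * x)

      x, chain = 0.0, []
      for _ in range(50_000):
          x_prop = x + random.gauss(0.0, 1.0)               # symmetric proposal step
          if random.random() < target(x_prop) / target(x):  # accept with prob min(1, ratio)
              x = x_prop                                    # otherwise stay at x
          chain.append(x)

      sample = chain[5_000:]                                # discard burn-in
      mean = sum(sample) / len(sample)
      var = sum((v - mean) ** 2 for v in sample) / len(sample)
      print(f"mean ≈ {mean:.2f}, variance ≈ {var:.2f}")     # expect ≈ 0 and ≈ 1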
     
    References, links, etc.
     
    Lecture 15 (16 February)
     
    • Proposed exercises on exact propagation:
      • sum of two Gaussians;
      • distribution of Z², with Z ∼ N(0,1).
    • Some problems involving binomial distributions related to Covid.
    • Ratios of counts vs ratios of λ's in Poisson processes.
    • Uncertainties due to systematics
      • Reminder of 'approach nr. 3';
      • Application to offset and scale systematics.(*)
    • Fits (just parametric inference!):
      • the importance of the underlying model;
      • linear model: general approach; simplified model (under well understood conditions) and
        'least squares' approximation (no 'principles'!);
      • case of uncertain σ;
      • forecasting a future 'y' at a given x_f
        (a numerical sketch follows the note below).
    (*) Note added: as we have seen, the 'systematics' are related to influence quantities (ISO GUM).
    Not only can we evaluate the contribution to the overall uncertainty due to the uncertain values of the influence factors,
    but, if there are good reasons to expect that a quantity can be better measured in the future (think of frontier Physics),
    we can also provide, among the results, the derivatives of the final value of interest w.r.t. the values of the uncertain input quantities.
    As an example see the paper with Degrassi on the Higgs mass: → Tabs. 1-4.
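    A minimal numerical sketch of the linear fit as parametric inference (illustrative data; under a Gaussian model with known σ and flat priors, the posterior mode coincides with the least-squares solution):

      import numpy as np

      # Illustrative data: y ≈ m*x + c with Gaussian noise of known sigma.
      x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
      y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

      # Least-squares solution = posterior mode under flat priors on (m, c).
      A = np.vstack([x, np.ones_like(x)]).T
      (m, c), *_ = np.linalg.lstsq(A, y, rcond=None)

      # Forecast a future y at x_f (parameter uncertainty ignored for simplicity).
      x_f = 6.0
      print(f"m = {m:.3f}, c = {c:.3f}, y(x_f={x_f}) ≈ {m * x_f + c:.2f}")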
     
    References, links, etc.
     
    Lecture 16 (26 February) + Seminar on unfolding
     
    • Two curious problems:
      • two envelopes: hold or change?
      • three prisoners problem.
    • Coherent bet and basic rules of probability.
    • p-values vs Bayes Factors (and much more!)
      (based on the real case of 2015 GW's)
    • Back to basic probabilistic issues
    • Probability and odds (summarized in the formula after this list)
    • Coherent bets (de Finetti)
    • Basic probability rules derived from coherence
    • Expected gain in coherent bets
    • Events and sets (and rules of probabilities)
    • More on independence
    • Relative update of probability
    • Back to p-values:
      • What they are
      • What they are not
      • p-values say very little (if not nothing) on the probability of hypotheses
      • 'Discovery' of the Higgs particle (2011-2012) and first detection on Earth of GW's
    • More on Bayes factors and how they are (not) related to p-values
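    The relations used throughout: posterior odds = Bayes factor × prior odds, with odds O and probability p related by O = p/(1-p):

      \[
        \frac{P(H_1 \,|\, \mathrm{data})}{P(H_2 \,|\, \mathrm{data})}
        \;=\;
        \underbrace{\frac{P(\mathrm{data} \,|\, H_1)}{P(\mathrm{data} \,|\, H_2)}}_{\text{Bayes factor}}
        \times
        \frac{P(H_1)}{P(H_2)}\,,
        \qquad
        O = \frac{p}{1-p} \;\Leftrightarrow\; p = \frac{O}{1+O}\,.
      \]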
     
    References, links, etc.
    1. Lecture
    2. Seminar on unfolding
      (some of the papers were already indicated in previous lectures)

     
    Lecture 17 (28 February)
     
    • Remarks on unfolding
    • High Bayes Factor and high p-value: ??
    • Recall on general ideas concerning fits
    • Fits with 'other complications'
    • Model comparison (just the general ideas + references):
      → (automatic) Ockham's Razor
    • From linear fits to 'linear models'
     
    References, links, etc.
     
    Lecture 18 (1 March)
     
    • More on fits and 'linear models'
    • On a curious bias due to scale correlation among the data points
    • Multinomial distribution (just a qualitative introduction)
    • Presenting prior-free results in frontier Physics searches
     
    References, links, etc.
