
Bayesian Techniques for Parameter Estimation

"He has Van Gogh's ear for music." -- Billy Wilder

Statistical Inference
Goal: The goal of statistical inference is to draw conclusions about a
phenomenon based on observed data.
Frequentist: Observations made in the past are analyzed with a specified
model; the result is regarded as confidence about the state of the real world.
Probabilities are defined as the frequencies with which an event occurs if the
experiment is repeated many times.
Parameter Estimation:
o Relies on estimators derived from different data sets and a specific
sampling distribution.
o Parameters may be unknown but are fixed.
Bayesian: The interpretation of probability is subjective and can be updated
with new data.
Parameter Estimation: Parameters are described by a probability density.

Bayesian Inference
Framework:
Prior Distribution: Quantifies prior knowledge of the parameter values.
Likelihood: Probability of observing the data given a certain set of
parameter values.
Posterior Distribution: Conditional probability distribution of the unknown
parameters given the observed data.
Joint PDF: Quantifies all combinations of parameters and observations.

Bayes Relation: Specifies the posterior in terms of the likelihood, prior, and
normalization constant,

    π(q | d) = π(d | q) π₀(q) / ∫ π(d | q) π₀(q) dq.

Problem: Evaluation of the normalization constant typically requires
high-dimensional integration.

Bayesian Inference
Uninformative Prior: No a priori information about the parameters

Informative Prior: Use conjugate priors; the prior and posterior come from the
same distribution family
Evaluation Strategies:
Analytic integration --- Rare
Classical quadrature; e.g., p = 2
Monte Carlo quadrature techniques
Markov Chains
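The Monte Carlo quadrature strategy above can be sketched directly: the normalization constant ∫ π(d | q) π₀(q) dq is the prior expectation of the likelihood, so averaging the likelihood over prior draws estimates it. The one-observation Gaussian model below is an illustrative assumption, not an example from these slides.

```python
import math
import random

def likelihood(q, d=1.0, sigma=1.0):
    # Gaussian likelihood of a single observation d given parameter q
    return math.exp(-(d - q) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def mc_normalization(n=100_000, seed=0):
    # Z = ∫ π(d|q) π₀(q) dq  ≈  (1/n) Σ likelihood(q_j),  with q_j ~ π₀ = N(0, 1)
    rng = random.Random(seed)
    return sum(likelihood(rng.gauss(0.0, 1.0)) for _ in range(n)) / n

# For this toy model Z is known exactly: marginally d ~ N(0, 2), so Z = N(1; 0, 2)
exact = math.exp(-0.25) / math.sqrt(4 * math.pi)
```

With 10^5 prior draws the estimate agrees with the exact value to roughly three decimal places; in higher parameter dimensions this naive estimator degrades quickly, which is what motivates Markov chain methods.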

Bayesian Inference
Example: Coin flipping -- posterior for the probability of heads, updated as
observations accumulate.

[Figures: posterior densities after 1 head, 0 tails; 5 heads, 9 tails;
49 heads, 51 tails]

Bayesian Inference
Example: Now consider a fair coin.

[Figures: posterior densities after 5 heads, 5 tails; 50 heads, 50 tails]

Note: A poor informative prior incorrectly influences results for a long time.
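The coin example follows the conjugate Beta-Bernoulli update: a Beta(a, b) prior with h heads and t tails gives a Beta(a + h, b + t) posterior. A quick sketch of how a poorly chosen informative prior lingers; the specific prior parameters below are illustrative assumptions:

```python
def posterior_mean(a, b, heads, tails):
    # Beta(a, b) prior with Bernoulli data -> Beta(a + heads, b + tails) posterior;
    # the posterior mean is (a + heads) / (a + b + heads + tails)
    return (a + heads) / (a + b + heads + tails)

# Flat Beta(1, 1) prior: 50 heads / 50 tails gives mean 51/102 = 0.5
flat = posterior_mean(1, 1, 50, 50)
# Heavily biased Beta(50, 1) prior: the same fair-coin data still gives 100/151 ≈ 0.66
biased = posterior_mean(50, 1, 50, 50)
```

The biased prior acts like 49 fictitious extra heads, so even 100 real flips cannot pull the estimate back to 0.5.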

Parameter Estimation Problem

Statistical Model: dᵢ = f(tᵢ; q) + εᵢ , i = 1, ..., n

Likelihood: π(d | q) = (2πσ²)^(−n/2) exp(−SS_q / 2σ²), where
SS_q = Σᵢ [dᵢ − f(tᵢ; q)]²

Assumption: The errors εᵢ are independent and identically distributed with
εᵢ ~ N(0, σ²).

Parameter Estimation: Example


Example: Consider the spring model

Parameter Estimation: Example


Ordinary Least Squares: Here the estimate minimizes the sum of squared
residuals between model and data.

Sampling Distribution:

Parameter Estimation: Example


Bayesian Inference: The likelihood is

Posterior Distribution:

Strategy: Create a Markov chain using random sampling so that the created
chain has the posterior distribution as its limiting (stationary) distribution.

Markov Chains
Definition:

Note: A Markov chain is characterized by three components: a state space, an
initial distribution, and a transition kernel.
State Space:
Initial Distribution: (Mass)

Transition Probability: (Markov Kernel)

Markov Chains
Example:

Chapman-Kolmogorov Equations:

Markov Chains: Limiting Distribution


Example: Raleigh weather -- tomorrow's weather conditioned on today's
weather

[Transition diagram: states rain and sun]
Question:

Definition: This is the limiting distribution (invariant measure)

Markov Chains: Limiting Distribution


Example: Raleigh weather
Solve πP = π with Σᵢ πᵢ = 1.

[Transition diagram: states rain and sun]
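The limiting distribution can also be found numerically by iterating the Chapman-Kolmogorov recursion μₖ₊₁ = μₖ P. The rain/sun transition probabilities below are illustrative assumptions, since the slide's actual matrix is not reproduced here.

```python
def step(dist, P):
    # One Chapman-Kolmogorov step: (mu P)_j = sum_i mu_i P[i][j]
    n = len(P)
    return tuple(sum(dist[i] * P[i][j] for i in range(n)) for j in range(n))

# Hypothetical transition matrix (rows: today, columns: tomorrow)
P = [[0.6, 0.4],   # rain -> rain, rain -> sun
     [0.2, 0.8]]   # sun  -> rain, sun  -> sun

dist = (1.0, 0.0)  # start deterministically on a rainy day
for _ in range(100):
    dist = step(dist, P)
# dist has converged to the stationary distribution, here (1/3, 2/3)
```

Starting instead from a sunny day gives the same limit; that independence from the initial distribution is exactly the uniqueness property of the stationary distribution for irreducible, aperiodic chains.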

Irreducible Markov Chains


Reducible Markov Chain:

[Diagram: chain with two non-communicating parts, p1 and p2]

Note: The limiting distribution is not unique if the chain is reducible.

Irreducible: Every state can be reached from every other state in a finite
number of steps.

Periodic Markov Chains


Example:

Periodicity: A Markov chain is periodic if parts of the state space are visited at
regular intervals. The period of a state i is defined as
k = gcd{ n ≥ 1 : Pⁿ(i, i) > 0 }; the chain is aperiodic if k = 1 for all states.

Periodic Markov Chains


Example:

Stationary Distribution
Theorem: A finite, homogeneous Markov chain that is irreducible and aperiodic
has a unique stationary distribution π, and the chain will converge in the sense
of distributions to π from any initial distribution.
Recurrence (Persistence): A state is recurrent if, starting from it, the chain
returns to it with probability 1; otherwise it is transient.

Example: State 3 is transient

Ergodicity: A state is termed ergodic if it is aperiodic and recurrent. If all
states of an irreducible Markov chain are ergodic, the chain is said to be
ergodic.

Matrix Theory
Definition:

Lemma:

Example:

Matrix Theory
Theorem (Perron-Frobenius):

Corollary 1:

Proposition:

Stationary Distribution
Corollary:

Proof:

Convergence: Express

Stationary Distribution

Detailed Balance Conditions


Reversible Chains: A Markov chain determined by the transition matrix P is
reversible if there is a distribution π that satisfies the detailed balance
conditions

    π(i) p(i, j) = π(j) p(j, i) for all states i, j.

Proof: We need to show that π is stationary; summing the detailed balance
conditions over i gives Σᵢ π(i) p(i, j) = π(j) Σᵢ p(j, i) = π(j).

Example:
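A numerical check of the detailed balance conditions for a small chain; the two-state transition matrix and its stationary distribution below are a hypothetical illustration:

```python
# Hypothetical two-state transition matrix and its stationary distribution
P = [[0.6, 0.4],
     [0.2, 0.8]]
pi = (1 / 3, 2 / 3)   # solves pi P = pi

# Detailed balance: pi_i p(i, j) = pi_j p(j, i) for every pair of states
balanced = all(
    abs(pi[i] * P[i][j] - pi[j] * P[j][i]) < 1e-12
    for i in range(2)
    for j in range(2)
)
# Every two-state chain is reversible with respect to its stationary law,
# so balanced is True here
```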

Markov Chain Monte Carlo Methods


Strategy: Markov chain simulation is used when it is impossible, or
computationally prohibitive, to sample directly from the posterior distribution.

Note:
In Markov chain theory, we are given a Markov chain, P, and we
construct its equilibrium distribution.
In MCMC theory, we are given a distribution and we want to construct
a Markov chain that is reversible with respect to it.

Markov Chain Monte Carlo Methods


General Strategy:

Intuition: Recall that

Markov Chain Monte Carlo Methods


Intuition:

Note: Narrower proposal distribution yields higher probability of acceptance.

Metropolis Algorithm
Metropolis Algorithm: [Metropolis and Ulam, 1949]

Metropolis-Hastings Algorithm
Metropolis-Hastings Algorithm:

Examples:

Note: Considered one of the top 10 algorithms of the 20th century

Proposal Distribution
Proposal Distribution: Significantly affects mixing
Too wide: Too many points are rejected and the chain stays still for long
periods;
Too narrow: The acceptance ratio is high but the algorithm is slow to explore
the parameter space.
Ideally, it should have a shape similar to the posterior (target) distribution.

Problem: Anisotropic posterior with an isotropic proposal; efficiency is
nonuniform across parameters.

Result: A proposal shaped to the posterior recovers the efficiency of the
univariate case.

Proposal Distribution and Acceptance Probability


Proposal Distribution: Two basic approaches
Choose a fixed proposal function
o Independent Metropolis
Random walk (local Metropolis)
o Two (of several) choices:

Acceptance Probability:

    α(q* | q) = min{ 1, [π(q* | d) J(q | q*)] / [π(q | d) J(q* | q)] }

Random Walk Metropolis Algorithm for Parameter Estimation

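A minimal random-walk Metropolis sketch for a single scalar parameter. The unnormalized Gaussian log-target below stands in for the posterior π(q | d) and is an assumption for illustration; with a symmetric proposal the Hastings correction cancels, leaving only the target ratio.

```python
import math
import random

def log_target(q):
    # Unnormalized log-posterior; a toy N(1, 1) target for illustration
    return -((q - 1.0) ** 2) / 2.0

def rw_metropolis(n=20_000, width=1.0, seed=0):
    rng = random.Random(seed)
    q = 0.0                                   # initial chain value
    chain = []
    for _ in range(n):
        q_star = q + rng.gauss(0.0, width)    # symmetric random-walk proposal
        log_alpha = min(0.0, log_target(q_star) - log_target(q))
        if rng.random() < math.exp(log_alpha):
            q = q_star                        # accept the candidate
        chain.append(q)                       # rejection repeats the old value
    return chain

chain = rw_metropolis()
burned = chain[5_000:]                        # discard burn-in
mean = sum(burned) / len(burned)              # ≈ 1.0, the center of the target
```

Shrinking `width` raises the acceptance rate but slows exploration, and widening it does the opposite; this is exactly the proposal-distribution trade-off described above.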

Markov Chain Monte Carlo: Example


Example: Consider the spring model

Markov Chain Monte Carlo: Example


Example: Single parameter c

Markov Chain Monte Carlo: Example


Example: Consider the spring model

Case i:

Markov Chain Monte Carlo: Example


Case i:

Note:

Markov Chain Monte Carlo: Example


Case i:

Markov Chain Monte Carlo: Example


Example: SMA-driven bending actuator -- talk with John Crews

Model:

Estimated Parameters:


Transition Kernel and Detailed Balance Condition


Transition Kernel: Recall that

Detailed Balance Condition:

Transition Kernel and Detailed Balance Condition


Detailed Balance Condition: Here

Note:

Transition Kernel: Definition

Sampling Error Variance


Strategy: Treat the error variance σ² as a parameter to be estimated.

Recall: The assumption that the errors are normally distributed yields

Goal: Determine the posterior distribution

Strategy:
Choose the prior so that the posterior is from the same family --- termed a
conjugate prior. For a normal distribution with unknown variance, the
conjugate prior is the inverse gamma distribution, which is equivalent to a
scaled inverse chi-squared distribution.

Sampling Error Variance


Definition:

Strategy:

Note:
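The conjugate update can be sketched as follows: with an inverse-gamma IG(α₀, β₀) prior and n normal residuals whose sum of squares is SS, the conditional posterior for σ² is IG(α₀ + n/2, β₀ + SS/2), which can be sampled by inverting a gamma draw. The prior parameters and toy residual sum below are illustrative assumptions.

```python
import random

def sample_sigma2(ss, n, alpha0=2.0, beta0=1.0, rng=None):
    # IG(alpha0, beta0) prior + n normal residuals with sum of squares ss
    # -> conditional posterior IG(alpha0 + n/2, beta0 + ss/2)
    rng = rng or random.Random()
    alpha = alpha0 + n / 2.0
    beta = beta0 + ss / 2.0
    # If X ~ Gamma(shape=alpha, rate=beta), then 1/X ~ InvGamma(alpha, beta);
    # random.gammavariate takes (shape, scale), and scale = 1/rate
    return 1.0 / rng.gammavariate(alpha, 1.0 / beta)

rng = random.Random(1)
# Toy case: 100 residuals with squared sum 100 (consistent with sigma^2 = 1)
draws = [sample_sigma2(ss=100.0, n=100, rng=rng) for _ in range(5_000)]
mean = sum(draws) / len(draws)   # posterior mean of IG(52, 51) is 51/51 = 1
```

In a full sampler this draw alternates with Metropolis updates of the model parameters, so σ² is refreshed at every chain iteration.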

Sampling Error Variance


Example: Consider the spring model

Related Topics
Note: This is an active research area and there are a number of related topics
Burn in and convergence
Adaptive algorithms
Population Monte Carlo methods
Sequential Monte Carlo methods and particle filters
Gaussian mixture models
Development of metamodels, surrogates and emulators to improve
implementation speeds
References:
A. Solonen, Monte Carlo Methods in Parameter Estimation of Nonlinear Models,
Master's Thesis, 2006.
H. Haario, E. Saksman and J. Tamminen, An adaptive Metropolis algorithm,
Bernoulli, 7(2), pp. 223-242, 2001.
C. Andrieu and J. Thoms, A tutorial on adaptive MCMC, Statistics and
Computing, 18, pp. 343-373, 2008.
M. Vihola, Robust adaptive Metropolis algorithm with coerced acceptance rate,
arXiv:1011.4381v2.
