Course on Bayesian Methods in Environmental Valuation. Basics (continued): Models for proportions and means

Francisco José Vázquez Polo
[www.personales.ulpgc.es/fjvpolo.dmc]
Miguel Ángel Negrín Hernández
[www.personales.ulpgc.es/mnegrin.dmc]
{fjvpolo or mnegrin}@dmc.ulpgc.es
Contents
1. Introduction to Bayesian Analysis
2. Bayesian inference. Conjugate priors
   2.1 Analysis of proportions
   2.2 Analysis of count data
3. Software: WinBUGS
Introduction to Bayesian Analysis
Thomas Bayes (1702 - 1761)
He set out Bayes's theory of probability in the paper "An
essay towards solving a problem in the doctrine of
chances" (Philosophical Transactions of the Royal Society
of London, 1763). The paper was sent to the Royal Society
by Richard Price, a friend of Bayes, after Bayes's death.
This paper introduced the concept of inverse probability.

$H_1, \dots, H_k$: set of hypotheses
$P(H_i)$, $i = 1, \dots, k$: prior probabilities, with $\sum_i P(H_i) = 1$
$P(A \mid H_i)$, $i = 1, \dots, k$: likelihood of the data $A$

Bayes' theorem:
$$P(H_i \mid A) = \frac{P(A \mid H_i)\,P(H_i)}{\sum_j P(A \mid H_j)\,P(H_j)}$$
The posterior probability of Hi given A is proportional to
the product of the prior probability of Hi and the
likelihood of A when Hi is true.
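As a minimal sketch of Bayes' theorem for a finite set of hypotheses, the function below multiplies each prior by its likelihood and divides by the sum over all hypotheses. The function name `posterior` and the three-hypothesis numbers are hypothetical, chosen only for illustration.

```python
def posterior(priors, likelihoods):
    """Return P(H_i | A) for each hypothesis, given P(H_i) and P(A | H_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # P(A) = sum_j P(A | H_j) P(H_j)
    return [j / evidence for j in joint]


# Hypothetical numbers: three hypotheses with equal prior probability.
priors = [1 / 3, 1 / 3, 1 / 3]
likelihoods = [0.2, 0.5, 0.8]
post = posterior(priors, likelihoods)
print([round(p, 3) for p in post])
```

With equal priors the posterior is simply the likelihoods rescaled to sum to 1, so the hypothesis under which the data are most likely receives the largest posterior probability.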
Some settings in which Bayesian statistics is used today:
-Marketing
-Economics and econometrics
-Social sciences
-Queues
-Education
-Operations & Manufacturing
-Health policy
-Quality
-Medical research
-Diagnosis
-Weather
-Maintenance
-Environmental
What does probability mean?
The frequency definition of the probability of an event:
The probability of an event is the proportion of the time
it would occur in a long sequence of observations (i.e.
as the number of trials tends to infinity).
Example: when we say that the probability of getting a head on
a toss of a fair coin is 0.5, we mean that we would expect to
get a head half the time if we flipped the coin a huge number
of times under exactly the same conditions.
Requires a sequence of repeatable experiments.
No frequency interpretation is possible for the probabilities
of many kinds of events.
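The frequency definition can be illustrated with a small simulation. This is only a sketch: the function name `head_frequency`, the seed, and the numbers of tosses are arbitrary choices, and a pseudo-random generator stands in for a real coin.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible


def head_frequency(n_tosses):
    """Proportion of heads in n_tosses simulated tosses of a fair coin."""
    heads = sum(random.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


# The observed proportion settles near 0.5 as the number of tosses grows.
for n in (10, 1_000, 100_000):
    print(n, head_frequency(n))
```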
Probability as degree of belief
The subjective definition of probability:
The probability of an event is a number between 0 and 1
that measures a particular person's subjective opinion as
to how likely that event is to occur (or to have occurred).
Applies whenever the person in question has an opinion
about the event
-If we count ignorance as an opinion, this is always the case.
Different people may have different subjective
probabilities regarding the same event.
The same person's subjective probability may change as
more information comes in.
Properties of probabilities
These properties apply to probability whichever definition
is being used.
-Probabilities must not be negative. If A is any event,
then
$P(A) \ge 0$
-All possible outcomes together must have probability 1.
If S is the sample space in a probability model, then
$P(S) = 1$
Example: Do you have a rare disease?
Suppose your friend is diagnosed with a rare disease
that has no obvious symptoms.
You wish to determine how likely it is that you, too, have
the disease.
That is, you are uncertain about your true disease status.
Your friend's doctor has told him that the proportion of
people in the general population who have the disease is
0.001. The disease is not contagious.
A blood test exists to detect the disease, but it
gives an incorrect result with probability 0.05.
Prior distribution
Before we see any data, we have some idea about what
values the parameters might take
-Sources: experts, experience, previous studies, and so on.
-Example: there are very few people 3 m tall.
The prior distribution expresses our subjective uncertainty
about the parameters before we see the data.
Prior Terminology
Uninformative prior
-Uniform, as wide as possible
-Sometimes called flat priors
-Problem: often difficult to define
Informative Prior
-Not uniform
-Assume we have some prior knowledge
Conjugate Prior
-Prior and posterior belong to the same family of distributions
-Often makes the maths easier
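A standard example of conjugacy, sketched here under assumptions that are not in the slides: for a proportion with binomial data, a Beta(a, b) prior gives a Beta(a + successes, b + failures) posterior. The function name and the survey numbers are hypothetical.

```python
def beta_binomial_update(a, b, successes, failures):
    """Beta(a, b) prior + binomial data -> Beta(a + successes, b + failures)."""
    return a + successes, b + failures


# Hypothetical survey: 7 of 20 respondents say "yes"; uniform Beta(1, 1) prior.
a_post, b_post = beta_binomial_update(1, 1, successes=7, failures=13)
posterior_mean = a_post / (a_post + b_post)
print(a_post, b_post, round(posterior_mean, 3))  # 8 14 0.364
```

The update is just addition of counts, which is what "makes the maths easier": no integration is needed to obtain the posterior.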
Noninformative or reference priors
Useful when we want inference to be unaffected by
information apart from the current data.
In many scientific contexts, we would not bother to carry
out an experiment unless we thought it was going to
increase our knowledge significantly
-i.e. we expect and want the likelihood to
dominate the prior
Informative priors
Elicitation is the process of extracting expert knowledge
about some unknown quantity of interest, or the
probability of some future event, which can then be used
to supplement any numerical data that we may have.
If the expert in question does not have a statistical
background, as is often the case, translating their beliefs
into a statistical form suitable for use in our analyses can
be a challenging task.
Prior elicitation is an important and yet under-researched
component of Bayesian statistics.
Example (continuation)
Two possible events:
1. You have the disease
2. You don't have the disease
Before taking any blood test, you think your chance of
having the disease is similar to that of a randomly
selected person in the population. So you assign the
following prior probabilities to the two events:
Prob(have disease) = 0.001
Prob(don't have disease) = 0.999
Data
You decide to take the blood test.
-The new information that you obtain to learn about the
different models (here, the two possible disease states) is called data.
-The different possible data results are called observations.
-The data in this example is the result of the blood test.
The two possible observations are:
-A positive blood test (+) suggests you have the disease.
-A negative blood test (-) suggests you don't have the
disease.
Likelihood
The probabilities of the two possible test results are
different depending on whether you have the disease or
not.
These probabilities are called likelihoods: the
probabilities of the different data outcomes conditional on
each possible model.
P(+ | have disease) = 0.95
P(+ | don't have disease) = 0.05
P(- | have disease) = 0.05
P(- | don't have disease) = 0.95
Bayesian Inference
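Let $\theta$ be the unknown parameter and $X$ the data.
As $P(X)$ is a constant with respect to $\theta$, all we need to
estimate $P(\theta \mid X)$ are $P(\theta)$ and $P(X \mid \theta)$
Bayes' rule becomes:
$P(\theta \mid X) \propto P(\theta)\,P(X \mid \theta)$
$P(\theta \mid X)$ is called the posterior distribution
-Product of the prior and the likelihood
-We can ignore the constant of proportionality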
Posterior distribution
The posterior distribution contains all the current
information about the unknown parameter
All Bayesian inference is based on the posterior
distribution:
-Estimation
  -Estimating values of unknown parameters that can
  never be observed or known
-Testing
-Prediction
  -Estimating the values of potentially observable but
  currently unobserved quantities.
Using Bayes' rule to update probabilities
Bayes' rule is the formula for updating your probabilities
about the models given the data.
Enables you to compute posterior probabilities given the
observed data
Bayes' rule:
$P(\text{event} \mid \text{data}) \propto P(\text{event}) \times P(\text{data} \mid \text{event})$
$\text{posterior} \propto \text{prior} \times \text{likelihood}$
Bayes' rule applied to the example
You take the blood test and the result is positive (+). This
is the data or observation.
$$P(\text{have disease} \mid +) = \frac{P(+ \mid \text{have disease})\,P(\text{have disease})}{P(+ \mid \text{have disease})\,P(\text{have disease}) + P(+ \mid \text{don't have disease})\,P(\text{don't have disease})}$$
P(have disease | +) = 0.019
P(don't have disease | +) = 0.981
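The arithmetic behind these two numbers can be checked with a few lines of code. The variable names are illustrative; the probabilities are the ones given in the example.

```python
prior_disease = 0.001   # P(have disease)
sensitivity = 0.95      # P(+ | have disease)
false_positive = 0.05   # P(+ | don't have disease)

numerator = sensitivity * prior_disease
evidence = numerator + false_positive * (1 - prior_disease)  # P(+)
post_disease = numerator / evidence

print(round(post_disease, 3))      # 0.019
print(round(1 - post_disease, 3))  # 0.981
```

The denominator P(+) is about 0.051: most positive results come from the large healthy group, which is why the posterior stays small.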
Learning
The Bayesian approach is often talked about as a
learning process
As we get more data, we add it to our store of information
by multiplying its likelihood by our current posterior
distribution.
It has been argued that this can form the basis of a
philosophy of science
What have you learned from the blood test?
The probability of your having the disease has increased
by a factor of about 19 (from 0.001 to 0.019).
But the actual probability is still small.
You decide to obtain more information by taking the
blood test again.
Updating the probabilities again
We assume that the blood tests are independent, given your
true disease status.
The posterior probabilities after the first test will become
your prior probabilities with respect to the second test.
Suppose that the second test is also positive.
The new posterior probabilities are:
P(have disease | +,+) = 0.269
P(don't have disease | +,+) = 0.731
(These values carry the rounded prior 0.019 forward; without
intermediate rounding they are 0.265 and 0.735.)
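Sequential updating can be sketched as a loop in which each posterior becomes the next prior. The function name `update` and its parameter names are hypothetical. Carrying the unrounded posterior forward gives about 0.265 after two positive tests; starting the second update from the rounded value 0.019 reproduces 0.269.

```python
def update(prior, result, p_pos_disease=0.95, p_pos_healthy=0.05):
    """One Bayes update of P(have disease) after a '+' or '-' test result."""
    if result == "+":
        like_d, like_h = p_pos_disease, p_pos_healthy
    else:
        like_d, like_h = 1 - p_pos_disease, 1 - p_pos_healthy
    joint_d = like_d * prior
    joint_h = like_h * (1 - prior)
    return joint_d / (joint_d + joint_h)


p = 0.001  # prior before any test
for result in "++":
    p = update(p, result)  # yesterday's posterior is today's prior
    print(result, round(p, 3))

print(round(update(0.019, "+"), 3))  # second update from the rounded prior
```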
What if the second test had been negative?
Suppose that the second test is negative.
The new posterior probabilities are:
P(have disease | +,-) = 0.001
P(don't have disease | +,-) = 0.999
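The return to the original prior is not a coincidence. Writing the two independent test results into Bayes' rule, the factor 0.95 x 0.05 appears in every term and cancels, because the test errs with the same probability in both directions:

```latex
P(\text{have disease} \mid +,-)
  = \frac{0.95 \times 0.05 \times 0.001}
         {0.95 \times 0.05 \times 0.001 + 0.05 \times 0.95 \times 0.999}
  = \frac{0.001}{0.001 + 0.999}
  = 0.001
```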
Bayesian Inference
Estimation
-Point estimates (posterior mean, mode, median)
-Measures of spread
-Bayesian intervals
Bayesian Inference
The posterior variance
-The posterior variance is one summary of the spread of the
posterior distribution
-The larger the posterior variance, the more uncertainty we still
have about the parameter
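As an illustration of these summaries, the sketch below assumes a hypothetical Beta(8, 14) posterior for a proportion and uses the standard closed-form expressions for the mean, mode and variance of a Beta distribution. The function name is illustrative.

```python
def beta_summaries(a, b):
    """Mean, mode and variance of a Beta(a, b) distribution (mode needs a, b > 1)."""
    mean = a / (a + b)
    mode = (a - 1) / (a + b - 2)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, mode, variance


# Hypothetical posterior: Beta(8, 14).
mean, mode, variance = beta_summaries(8, 14)
print(round(mean, 3), round(mode, 3), round(variance, 4))  # 0.364 0.35 0.0101
```

Doubling both parameters to Beta(16, 28) leaves the mean unchanged but roughly halves the variance, reflecting less remaining uncertainty about the parameter.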
Bayesian Inference
Precisely what information does a p-value provide?
Recall the definition of a p-value: the probability of observing
a test statistic as extreme as or more extreme than the
observed value, assuming that the null hypothesis is true.
What is the correct way to interpret a confidence interval?
Does a 95% confidence interval provide a 95% probability
region for the true parameter value? If not, what is the correct
interpretation?
No. A confidence interval is a range of values computed from
the study sample by a procedure that, over repeated samples,
contains the true population value with the specified
frequency (e.g. 95% of the time).
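The repeated-sampling meaning of a confidence interval can be seen in a simulation. This is a sketch under simplifying assumptions that are not in the slides: a normal population with known standard deviation, and hypothetical values for the true mean, sample size and number of repetitions.

```python
import random
import statistics

random.seed(1)
TRUE_MEAN, SIGMA, N, REPS = 10.0, 2.0, 30, 2000  # hypothetical population and design
z = statistics.NormalDist().inv_cdf(0.975)       # about 1.96

covered = 0
for _ in range(REPS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    xbar = statistics.fmean(sample)
    half_width = z * SIGMA / N ** 0.5  # known-sigma interval, for simplicity
    covered += (xbar - half_width <= TRUE_MEAN <= xbar + half_width)

coverage = covered / REPS  # fraction of intervals that contain the true mean
print(coverage)
```

The 95% describes the long-run behaviour of the procedure across samples; any single computed interval either contains the true mean or does not.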
Bayesian Inference
Frequentist approach
-Parameters are considered "fixed but unknown"
-We cannot assign them a probability distribution.
Bayesian approach
-Parameters are considered random and unknown
-They are random because they are unknown
Bayesian Inference
Bayesian intervals
Called posterior intervals or credible sets.
Recall that the posterior distribution represents our updated
subjective probability distribution for the unknown parameter.
Thus, for us, the interpretation of the 95% credible set is that
the probability is 0.95 that the true $\theta$ is in the interval.
Contrast this with the interpretation of a frequentist confidence
interval.
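One simple way to obtain a credible interval is to draw from the posterior and read off sample quantiles. The sketch below assumes a hypothetical Beta(8, 14) posterior for a proportion and uses the standard library's `random.betavariate`; the seed and number of draws are arbitrary.

```python
import random

random.seed(2)

# Monte Carlo draws from a hypothetical Beta(8, 14) posterior.
draws = sorted(random.betavariate(8, 14) for _ in range(20_000))

# Equal-tailed 95% credible interval: the 2.5% and 97.5% sample quantiles.
lower = draws[int(0.025 * len(draws))]
upper = draws[int(0.975 * len(draws))]
print(round(lower, 2), round(upper, 2))
```

Under this posterior, the statement "the probability is 0.95 that the proportion lies between `lower` and `upper`" is a direct probability statement about the parameter.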
Hypothesis testing
Frequentist approach
$H_0$ vs. $H_1$ (two hypotheses)
$\alpha$: Type I error (probability of rejecting the null
hypothesis when it is true)
-Typical values: 1%, 5%, 10%
$\beta$: Type II error (probability of accepting the null
hypothesis when it is false)
p-value
(accept $H_0$ if p-value > 0.05; reject if p-value < 0.05)
Hypothesis testing
Bayesian approach
$H_0, H_1, H_2$, etc. (several hypotheses)
We can estimate the probability of each hypothesis from the
posterior distribution of the parameters, $f(\theta \mid X)$.
$H_0: \theta \le 0$
$H_1: \theta > 0$
$\text{Prob}(H_0) = \text{Prob}(\theta \le 0)$, $\text{Prob}(H_1) = \text{Prob}(\theta > 0)$
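A minimal sketch of this calculation, assuming that H0 is the complement of H1 (theta > 0) and that the posterior of theta is normal with a hypothetical mean of 0.8 and standard deviation of 0.5. Neither assumption comes from the slides.

```python
from statistics import NormalDist

# Hypothetical posterior for theta: Normal(mean=0.8, sd=0.5).
posterior = NormalDist(mu=0.8, sigma=0.5)

prob_h0 = posterior.cdf(0.0)  # Prob(theta <= 0 | X)
prob_h1 = 1 - prob_h0         # Prob(theta > 0 | X)
print(round(prob_h0, 3), round(prob_h1, 3))  # 0.055 0.945
```

Unlike a p-value, these are probabilities of the hypotheses themselves given the data.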

Prediction
In many situations, interest focuses on predicting values of
a future sample from the same population
-i.e. on estimating values of potentially observable
but not yet observed quantities
Example: we may be interested in the result of the next
blood test.
The posterior predictive probability is defined as
$$P(x^* \mid x) = \int p(x^* \mid \theta)\,p(\theta \mid x)\,d\theta$$
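In the blood test example the unknown takes only two values (disease or no disease), so the integral reduces to a sum. The sketch below computes the predictive probability that a second test is positive given that the first was positive. The variable names are illustrative.

```python
# Posterior after one positive test, from the example's numbers.
post_disease = 0.95 * 0.001 / (0.95 * 0.001 + 0.05 * 0.999)  # P(disease | +)

# P(next + | first +) = sum over disease status of P(+ | status) * P(status | +)
p_next_positive = 0.95 * post_disease + 0.05 * (1 - post_disease)
print(round(p_next_positive, 3))  # 0.067
```

Even after one positive result, a second positive is predicted with probability of only about 0.067, because the disease remains unlikely.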
