
BAYESIAN MODELS

(SINGLE-PARAMETER)

Shirlee Remoto-Ocampo
shirlee.ocampo@dlsu.edu.ph

Components of Bayesian Inference

Prior Distribution - uses probability to quantify uncertainty about unknown quantities (parameters)
Likelihood - relates all variables in a full probability model
Posterior Distribution - the result of using data to update information about the unknown quantities (parameters)

Bayesian inference
Prior information p(θ) on the parameters θ
Likelihood of the data given parameter values, f(y|θ)

Bayesian inference

p(θ|y) = f(y|θ) p(θ) / f(y)

or

p(θ|y) ∝ f(y|θ) p(θ)

Posterior distribution is proportional to likelihood × prior distribution.


Importance of priors
Prior beliefs about uncertain parameters
are a fundamental part of Bayesian
statistics.
When we have few data about the
parameter of interest, our prior beliefs
dominate inference about that parameter.
In any application, effort should be made
to model our prior beliefs accurately.
24-25 January 2007

An Overview of State-of-the-Art Data Modelling


Assessment of Prior Distributions

Discrete Case 1: Very little or no information available
(1) Assign probabilities directly to the various possible values of the uncertain quantity of interest.
Ex. The probability that p = 0.6 is 0.01.
(2) Use of lotteries
Ex. X is obtained with probability P(E); Y is obtained with probability 1 - P(E).
(3) Use of betting odds
If the probability of an event is p, then the odds in favor of that event are p to 1 - p. If the odds in favor of an event are a to b, then the probability of that event is a/(a+b).
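The odds-probability conversion in (3) is easy to sanity-check in code; a minimal sketch (the function names are ours, not from the slides):

```python
def odds_to_prob(a, b):
    """Odds of a to b in favor of an event -> probability a/(a+b)."""
    return a / (a + b)

def prob_to_odds(p):
    """Probability p -> odds p to (1 - p) in favor of the event."""
    return p, 1 - p
```

For example, odds of 1 to 19 in favor of an event correspond to probability 1/20 = 0.05.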

Assessment of Prior Distributions

Discrete Case 2: There is prior information
Rule of thumb: Let the probabilities be as close as possible to the relative frequencies.

Independent case: the joint prior distribution is the product of the marginals.
Uncertain about independence: use conditional distributions.


Assessment of Prior Distributions

Continuous Case 1: Historical or sample information is available

Draw the histogram, smooth it out, and determine whether the resulting curve more or less approximates a common parametric family.

Assessment of Prior Distributions

Use measures such as the mean, median, mode, quantiles, etc.
Alternatively, determine a series of quantiles and plot them. Draw a rough curve through them and identify the distribution.


Continuous Case 2: Very little or no information


PRINCIPLE OF INSUFFICIENT REASON

When nothing is known about θ in advance, let the prior p(θ) be a uniform distribution, that is, let all possible outcomes of θ have the same probability.

Example
1. Prior Distribution: A ball W is randomly thrown (according to a uniform distribution on the table). The horizontal position of the ball on the table is θ, expressed as a fraction of the table width.
2. Likelihood: A ball O is randomly thrown n times. The value of y is the number of times ball O lands to the right of ball W.

UNIFORM-BINOMIAL MODEL

Location Model
The distribution of y - θ given θ is free of θ.
Example:
Show that N(θ, 1) represents a location model.
Other examples:
U(θ - 1/2, θ + 1/2)
Cauchy density
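A quick simulation illustrates the location-model property for N(θ, 1): subtracting θ leaves draws from N(0, 1), whatever θ is. A sketch, not part of the slides (function name is ours):

```python
import random

def shifted_residuals(theta, n=100_000, seed=0):
    """Draw y ~ N(theta, 1) and return y - theta; its law is N(0, 1) for any theta."""
    rng = random.Random(seed)
    return [rng.gauss(theta, 1.0) - theta for _ in range(n)]

# Same seed, different theta: the residuals match (up to floating-point rounding)
a = shifted_residuals(0.0)
b = shifted_residuals(57.3)
```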
Example

Scale Model
The distribution of y/σ given σ is free of σ.
Y has a scale model if there exist a function f and a quantity σ such that the distribution of Y given σ satisfies

f(y|σ) = (1/σ) f(y/σ)

Show that N(0, σ²) represents a scale model.

Other Examples:
Exponential density
U(0, σ)

Location-Scale Model
Y has a location-scale model if there exist a function f and quantities θ and σ such that the distribution of Y given (θ, σ) satisfies

f(y|θ, σ) = (1/σ) f((y - θ)/σ)

Show that N(θ, σ²) is a location-scale model.
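For the N(θ, σ²) claim, standardizing by (y - θ)/σ removes both parameters. A small simulation sketch (names are ours, values illustrative):

```python
import random

def standardized(theta, sigma, n=100_000, seed=1):
    """Draw y ~ N(theta, sigma^2); (y - theta)/sigma should behave like N(0, 1)."""
    rng = random.Random(seed)
    return [(rng.gauss(theta, sigma) - theta) / sigma for _ in range(n)]

z = standardized(10.0, 3.0)
mean = sum(z) / len(z)
var = sum(x * x for x in z) / len(z) - mean ** 2
# mean should be near 0 and var near 1, regardless of theta and sigma
```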


Weak prior information

If we accept the subjective nature of Bayesian statistics but are not comfortable using subjective priors, then, as many have argued, we should try to specify prior distributions that represent no prior information.
These prior distributions are called noninformative, reference, ignorance, or weak priors.
The idea is to have a completely flat prior distribution over all possible values of the parameter.
Unfortunately, this can lead to improper distributions being used.

Proper and Improper Priors

Proper prior - a prior distribution that does not depend on the data and integrates to 1
Improper prior - a prior distribution that does not integrate to 1 (the integral is infinite or does not converge to any positive finite value)

Note: An improper prior distribution can still lead to a proper posterior distribution.

Weak prior information

In our coin tossing example, Be(1,1), Be(0.5,0.5), and Be(0,0) have been recommended as noninformative priors. Be(0,0) is improper.

J(θ) = I(θ) = Fisher information for θ


Jeffreys non-informative prior

1. Consider a one-to-one transformation of the parameter, say φ = h(θ).
2. The prior density p(θ) is equivalent to the prior density on φ:

p(φ) = p(θ) |dθ/dφ|

leading to the noninformative prior

p(θ) ∝ [J(θ)]^(1/2)

Fisher Information: I(θ) = J(θ) = -E[ ∂² log p(y|θ) / ∂θ² | θ ]

Example
Find the Jeffreys noninformative prior for the Poisson distribution (single parameter).

y1, y2, …, yn | θ ~ Poisson(θ)
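As a numeric check on this exercise: for Poisson(θ), -∂² log p(y|θ)/∂θ² = y/θ², so J(θ) = E[y]/θ² = 1/θ and the Jeffreys prior is p(θ) ∝ θ^(-1/2). A sketch that computes the expectation by direct summation over the pmf (truncated at a large count):

```python
import math

def poisson_fisher_info(theta, kmax=200):
    """Compute J(theta) = E[-d^2/dtheta^2 log p(y|theta)] = E[y]/theta^2 numerically."""
    pmf = math.exp(-theta)                  # P(Y = 0)
    total = 0.0
    for y in range(kmax):
        total += pmf * (y / theta ** 2)     # -d^2 log p / dtheta^2 evaluated at y
        pmf *= theta / (y + 1)              # recurrence: P(Y = y+1) from P(Y = y)
    return total

# J(theta) = 1/theta, so the Jeffreys prior is p(theta) proportional to theta**(-0.5)
```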

EXERCISE/SEATWORK


Informative priors
An informative prior is an accurate
representation of our prior beliefs.
An informative prior is essential when we
have few or no data for the parameter of
interest.
Elicitation is the process of translating someone's beliefs into a distribution.

Conjugate priors
When we move away from noninformative
priors, we might use priors that are in a
convenient form.
That is, a form in which combining them with the likelihood produces a distribution from the same family.


Informative Priors


Conjugate Priors
Conjugacy - the property that the posterior distribution follows the same parametric form as the prior distribution
Hyperparameters - the parameters of a prior distribution


Assessment of Likelihood
Case 1: Sufficiently large set of data
1. Make an appropriate frequency histogram for the data.
2. Determine the mean and the variance.
3. Hypothesize a distribution that might fit the data.
4. Estimate the parameters under the hypothesized distribution.
5. Test the goodness of fit of the data against the hypothesized distribution.

Assessment of Likelihood
Case 2: Extremely sparse set of data
Make a smooth assessment of the CDF:
1. Make a preliminary estimate of the cumulative frequencies corresponding to each observed value of the variable of interest, that is, an array of n observations. The kth observation will be the k/(n+1)th quantile.
2. Adjust the estimates so that the whole distribution is smooth and of reasonable shape.
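The k/(n+1) quantile rule in step 1 can be sketched as follows (the data values are made up for illustration):

```python
# Hypothetical observations of the variable of interest
obs = sorted([3.1, 4.7, 2.2, 5.0, 3.8])
n = len(obs)

# The kth smallest observation (k = 1..n) is treated as the k/(n+1) quantile
cdf_estimates = [(k / (n + 1), x) for k, x in enumerate(obs, start=1)]
```

Plotting these (quantile, value) pairs and drawing a smooth curve through them gives the preliminary CDF assessment.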

Assessment of Posterior Distributions

CONJUGATE PRIOR DISTRIBUTIONS
Formal Definition of CONJUGACY
If F is a class of sampling distributions f(y|θ), and P is a class of prior distributions for θ, then the class P is CONJUGATE for F if
p(θ|y) ∈ P for all f(·|θ) ∈ F and p(·) ∈ P.


Conjugate Prior for a Bernoulli Process

Let y1, y2, …, yn be a random sample ~ Bernoulli(θ). Suppose that the prior distribution of θ is Beta(α, β), α, β > 0. Then the posterior distribution of θ is Beta(α + y, β + n - y), where y = Σ yi.
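The Beta update above is a one-liner in code; a minimal sketch (the function name is ours, not from the slides):

```python
def beta_update(alpha, beta, data):
    """Posterior of theta after Bernoulli data under a Beta(alpha, beta) prior."""
    y = sum(data)                       # number of successes
    n = len(data)
    return alpha + y, beta + n - y      # Beta(alpha + y, beta + n - y)

# e.g. Beta(2, 3) prior, 4 successes in 6 trials -> Beta(6, 5)
```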


Recall: Bernoulli Distribution

Recall: Binomial Distribution

Binomial Distribution

Prior: Beta Distribution

Beta and Gamma Functions

Beta Distribution

Beta-Binomial Bayesian Model

Posterior Mean and Variance

Question:
What are the posterior mean and posterior variance of the beta-binomial model? Of the uniform-binomial model?

Poisson-Gamma Bayesian Model

Let y1, y2, …, yn be a random sample ~ Poisson(θ), θ > 0. Suppose that the prior distribution of θ is Gamma(α, β), α, β > 0. Then the posterior distribution of θ is Gamma(α + y, β + n), where y = Σ yi.
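A minimal sketch of the Poisson-Gamma update (rate parametrization of the Gamma, matching the statement above; the function name is ours):

```python
def poisson_gamma_update(alpha, beta, data):
    """Posterior of theta after Poisson counts under a Gamma(alpha, beta) prior."""
    return alpha + sum(data), beta + len(data)   # Gamma(alpha + y, beta + n)
```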


Recall: Poisson Distribution

Recall: Gamma Distribution

Poisson-Gamma Bayesian Model

Posterior Mean

What are the posterior mean and posterior variance for the Poisson-Gamma model?

Gamma-Poisson Model
https://www.youtube.com/watch?v=0XD6C_MQXXE

Negative Binomial-Beta Model

Let y1, y2, …, yn be a random sample ~ Negative Binomial(r, θ), where θ ranges from 0 to 1. Suppose that the prior distribution of θ is Beta(α, β), α, β > 0. Then the posterior distribution of θ is Beta(α + nr, β + y), where y = Σ yi.
EXERCISE!

Exponential-Gamma Model
Let y1, y2, …, yn be a random sample ~ Exp(θ), θ > 0. Suppose that the prior distribution of θ is Gamma(α, β), α, β > 0. Then the posterior distribution of θ is Gamma(α + n, β + y), where y = Σ yi.
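For the exercise, the algebra mirrors the Poisson case; a sketch of the resulting update, assuming the rate parametrization of both the Exponential and the Gamma (function name is ours):

```python
def exp_gamma_update(alpha, beta, data):
    """Posterior of theta after Exponential(theta) data under a Gamma(alpha, beta) prior."""
    return alpha + len(data), beta + sum(data)   # Gamma(alpha + n, beta + sum of y_i)
```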

EXERCISE!


Normal-Normal Model
Normal likelihood with known variance (prior on the mean)

PROBLEM SET

Normal-Normal Bayesian Model


Inverse Gamma-Normal Model

Normal likelihood with known mean (Inverse Gamma prior on the variance)

Conjugate Priors

Prior    | Likelihood                    | Posterior
Beta     | Binomial                      | Beta
Gamma    | Poisson                       | Gamma
Gamma    | Exponential                   | Gamma
Beta     | Negative Binomial             | Beta
Normal   | Normal (with known variance)  | Normal

Conjugate Priors

Prior     | Likelihood   | Posterior
Dirichlet | Multinomial  | Dirichlet

APPLICATIONS
Suppose that the prior information concerning p, the proportion of defectives, can well be represented by a Beta distribution with parameters α = 1 and β = 19. Following the assessment of his prior distribution, the manager takes a random sample of 5 items from the production process, observing one defective item. Determine the posterior distribution of p.
Suppose that the manager decides to take another random sample of 5 items and observes 2 defectives. Determine the new posterior distribution of p.
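A quick check of both posteriors using the Beta conjugate update from earlier (a sketch; the function name is ours):

```python
def update(alpha, beta, n, defectives):
    """Beta(alpha, beta) prior updated by n Bernoulli trials with the given defectives."""
    return alpha + defectives, beta + n - defectives

post1 = update(1, 19, 5, 1)      # first sample of 5 items with 1 defective
post2 = update(*post1, 5, 2)     # second sample of 5 items with 2 defectives
```

The first posterior is Beta(2, 23); after the second sample it becomes Beta(4, 26).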

APPLICATIONS
Suppose that magnetic recording tape is manufactured by a certain process, and that the mean number of defects W on a 1000-ft roll of tape is unknown. Suppose that the prior distribution of W is a gamma distribution with mean 2 and variance 1, and that the number of defects on any roll of tape when W = w has a Poisson distribution with mean w. Suppose further that after a random sample of rolls of tape has been counted, the mean of the posterior distribution of W is 1.6 and the variance is 0.16. Show that 8 rolls of tape were included in the sample.
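One way to verify the answer: a Gamma(α, β) prior (rate form) has mean α/β and variance α/β², so mean 2 and variance 1 give α = 4, β = 2; for a Gamma posterior, mean/variance equals the posterior rate β + n, which pins down n. A sketch of the arithmetic:

```python
prior_mean, prior_var = 2.0, 1.0
beta = prior_mean / prior_var        # alpha/beta = mean and alpha/beta^2 = var -> beta = mean/var
alpha = prior_mean * beta            # alpha = 4

post_mean, post_var = 1.6, 0.16
n = post_mean / post_var - beta      # posterior rate beta + n = mean/var -> n = 8
```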

APPLICATIONS
An unknown proportion W of the items produced by a certain machine is defective. Suppose that the prior distribution of W is Beta with parameters α = 1 and β = 99. Suppose also that items produced by the machine are selected at random and observed one at a time until exactly 5 defective items have been found. If, when sampling terminates, the mean of the posterior distribution of W is 0.02, show that 195 nondefective items were observed during the sampling process.
EXERCISE!

APPLICATIONS
Suppose that two physicists A and B are concerned with obtaining a more accurate value of some physical constant μ, previously known only approximately. Suppose physicist A, being very familiar with this area of study, can make a moderately good guess of what the answer will be, and that his prior opinion about μ can be approximately represented by a normal distribution centered at 900 with a standard deviation of 20. By contrast, suppose that physicist B has had little experience in this area and has rather vague prior beliefs, which can be represented by a normal distribution with mean 800 and a standard deviation of 80.

Suppose now that an unbiased method of experimental measurement is available, and that an observation made by this method, to a sufficient approximation, follows a normal distribution with a standard deviation of 40. Suppose that the result of a single observation is X = 850. Determine the posterior distribution of μ.
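The Normal-Normal posterior with known variance combines precisions (reciprocal variances); a sketch applying the standard conjugate update to both physicists (function name is ours, numbers from the problem):

```python
def normal_update(mu0, sd0, x, sd_obs):
    """Posterior for mu: Normal prior N(mu0, sd0^2), one observation x ~ N(mu, sd_obs^2)."""
    prec = 1 / sd0**2 + 1 / sd_obs**2              # posterior precision
    mean = (mu0 / sd0**2 + x / sd_obs**2) / prec   # precision-weighted average
    return mean, (1 / prec) ** 0.5                 # posterior mean and sd

post_A = normal_update(900, 20, 850, 40)           # physicist A
post_B = normal_update(800, 80, 850, 40)           # physicist B
```

Under this update, A's posterior is N(890, 320) and B's is N(840, 1280): the sharper prior moves less toward the observation.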


Pythagorean Theorem is to Geometry as Bayes' Theorem is to Probability.

THANK YOU!
