
Probability

Questions
what is a good general size for artifact
samples?
what proportion of populations of interest
should we be attempting to sample?
how do we evaluate the absence of an
artifact type in our collections?
frequentist approach
probability should be assessed in purely
objective terms
no room for subjectivity on the part of
individual researchers
knowledge about probabilities comes from
the relative frequency of a large number of
trials
this is a good model for coin tossing
not so useful for archaeology, where many of
the events that interest us are unique.
Bayesian approach
Bayes' Theorem
Thomas Bayes
18th-century English clergyman
concerned with integrating prior knowledge into
calculations of probability
problematic for frequentists
prior knowledge = bias, subjectivity.
basic concepts
probability of event = p
0 ≤ p ≤ 1
0 = certain non-occurrence
1 = certain occurrence
.5 = even odds
.1 = 1 chance out of 10
if A and B are mutually exclusive events:
P(A or B) = P(A) + P(B)
ex., die roll: P(1 or 6) = 1/6 + 1/6 = .33
possibility set:
sum of all possible outcomes
~A = anything other than A
P(A or ~A) = P(A) + P(~A) = 1
basic concepts (cont.)
discrete vs. continuous probabilities
discrete
finite number of outcomes
continuous
outcomes vary along continuous scale
basic concepts (cont.)
[Figure: discrete probabilities. p (0 to .5) for the outcomes HH, HT, TT.]
[Figure: continuous probabilities. Curve of p (0.00 to 0.22) over values from -5 to 5.]
continuous probabilities
total area under curve = 1
but
the probability of any
single value = 0
so we are interested in the
probability assoc. w/
intervals
independent events
one event has no influence on the outcome
of another event
if events A & B are independent
then P(A&B) = P(A)*P(B)
if P(A&B) = P(A)*P(B)
then events A & B are independent
coin flipping
if P(H) = P(T) = .5 then
P(HTHTH) = P(HHHHH) =
.5*.5*.5*.5*.5 = .5^5 = .03
if you are flipping a coin and it has already
come up heads 6 times in a row, what are
the odds of a 7th head?
.5
note that P(10H) ≠ P(4H,6T)
lots of ways to achieve the 2nd result (therefore
much more probable)
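The coin-flipping arithmetic can be checked with a few lines of Python. This is a minimal sketch rather than part of the original slides: it assumes a fair coin, and the variable names are illustrative only.

```python
from math import comb

p = 0.5  # assumed fair coin: P(H) = P(T) = .5

# Any one specific sequence of 5 flips (HTHTH, HHHHH, ...) is equally likely.
p_sequence_5 = p ** 5

# 10 heads in a row can happen in only one way.
p_10_heads = p ** 10

# 4 heads and 6 tails, in any order, can happen in comb(10, 4) ways.
ways_4h_6t = comb(10, 4)
p_4h_6t = ways_4h_6t * p ** 10

print(round(p_sequence_5, 3))  # 0.031
print(ways_4h_6t)              # 210
print(round(p_4h_6t, 3))       # 0.205
```

The 210 different orderings are what make "4 heads, 6 tails" so much more probable than "10 heads", even though every individual sequence of 10 flips has the same probability.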
mutually exclusive events are not
independent
rather, the most dependent kinds of events
if not heads, then tails
joint probability of 2 mutually exclusive events
is 0
P(A&B) = 0
conditional probability
concerns the odds of one event occurring,
given that another event has occurred
P(A|B) = Prob of A, given B
e.g.
consider a temporally ambiguous, but
generally late, pottery type
the probability that an actual example is
late increases if found with other types of
pottery that are unambiguously late.
P = probability that the specimen is late:
isolated: P(Ta) = .7
w/ late pottery (Tb): P(Ta|Tb) = .9
w/ early pottery (Tc): P(Ta|Tc) = .3
conditional probability (cont.)
P(B|A) = P(A&B)/P(A)
if A and B are independent, then
P(B|A) = P(A)*P(B)/P(A)
P(B|A) = P(B)
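The conditional-probability rule, and the way it collapses to P(B) for independent events, can be illustrated by brute-force counting. The sketch below uses two fair coin flips as a stand-in example; the events and helper names (`prob`, `cond_prob`) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two fair coin flips: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event, given as a True/False test on an outcome."""
    hits = sum(1 for o in outcomes if event(o))
    return Fraction(hits, len(outcomes))

def cond_prob(b, a):
    """P(B|A) = P(A&B) / P(A)."""
    return prob(lambda o: a(o) and b(o)) / prob(a)

first_heads = lambda o: o[0] == "H"    # event A
second_heads = lambda o: o[1] == "H"   # event B, independent of A
any_heads = lambda o: "H" in o         # event C, not independent of A

print(cond_prob(second_heads, first_heads), prob(second_heads))  # 1/2 1/2
print(cond_prob(any_heads, first_heads), prob(any_heads))        # 1 3/4
```

For the independent pair, knowing A leaves the probability of B unchanged; for the dependent pair, knowing A raises the probability of C from 3/4 to 1.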
Bayes' Theorem
can be derived from the basic equation for
conditional probabilities

$$P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A \mid B)\,P(B) + P(A \mid {\sim}B)\,P({\sim}B)}$$

application
archaeological data about ceramic design
bowls and jars, decorated and undecorated
previous excavations show:
75% of assemblage are bowls, 25% jars
of the bowls, about 50% are decorated
of the jars, only about 20% are decorated
we have a decorated sherd fragment, but it's too
small to determine its form.
what is the probability that it comes from a bowl?
can solve for P(B|A)
events: B = bowlness; A = decoratedness
P(B) = .75; P(A|B) = .50
P(~B) = .25; P(A|~B) = .20
P(B|A) = (.75*.50) / ((.75*.50) + (.25*.20))
P(B|A) = .88
         bowl           jar
dec.     50% of bowls   20% of jars
undec.   50% of bowls   80% of jars
total    75%            25%


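The bowl-or-jar calculation can be written as a small function. This is a minimal sketch under the slide's own numbers; the function and parameter names are assumptions made for illustration, and it covers only the two-alternative case (B versus ~B).

```python
def posterior(prior_b, p_a_given_b, p_a_given_not_b):
    """Bayes' Theorem for two alternatives, B and ~B: returns P(B|A)."""
    prior_not_b = 1 - prior_b
    numerator = p_a_given_b * prior_b
    denominator = numerator + p_a_given_not_b * prior_not_b
    return numerator / denominator

# Values from the slides: B = bowl, A = decorated.
p_bowl_given_dec = posterior(prior_b=0.75, p_a_given_b=0.50, p_a_given_not_b=0.20)
print(round(p_bowl_given_dec, 2))  # 0.88
```

The prior (75% bowls) already favours a bowl; the fact that bowls are decorated more often than jars pushes the probability up further, to about .88.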

Binomial theorem
P(n,k,p)
probability of k successes in n trials
where the probability of success on any one
trial is p
success = some specific event or outcome
k specified outcomes
n trials
p = probability of the specified outcome in 1 trial

$$P(n,k,p) = C(n,k)\,p^{k}\,(1-p)^{n-k}$$

where

$$C(n,k) = \frac{n!}{k!\,(n-k)!}$$

n! = n*(n-1)*(n-2)*...*1 (where n is an integer)
0! = 1
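A direct translation of the binomial formula into Python might look like the sketch below. The function name `binomial_p` is an assumption made for illustration; `math.comb` supplies C(n,k), and the example values (10 trials, p = .5) are illustrative.

```python
from math import comb

def binomial_p(n, k, p):
    """P(n,k,p): probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Example: exactly 4 heads in 10 tosses of a fair coin.
print(round(binomial_p(10, 4, 0.5), 3))  # 0.205

# The probabilities over all possible k (0 through n) should add up to 1.
total = sum(binomial_p(10, k, 0.5) for k in range(11))
print(abs(total - 1) < 1e-12)  # True
```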
misc. useful derivations from BT
if repeated trials are carried out:
mean successes (k) = n*p
sd of successes (k) = sqrt(n*p*q) (note: q = 1-p)
(really only approximated when trials are
repeated many times.)
k = 0: P(n,0,p) = (1-p)^n
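These shortcuts are easy to evaluate directly. The sketch below uses illustrative values (n = 15 trials, p = .05), which are assumptions chosen for the example rather than anything fixed by the formulas.

```python
from math import sqrt

n, p = 15, 0.05   # assumed example values: 15 trials, 5% chance of success
q = 1 - p

mean_k = n * p            # expected number of successes
sd_k = sqrt(n * p * q)    # standard deviation of the number of successes
p_zero = q ** n           # shortcut for k = 0: P(n,0,p) = (1-p)^n

print(round(mean_k, 2), round(sd_k, 2), round(p_zero, 2))  # 0.75 0.84 0.46
```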
binomial distribution
binomial theorem describes a theoretical
distribution that can be plotted in two
different ways:
probability density function (PDF)
cumulative density function (CDF)
probability density function (PDF)
summarizes how odds/probabilities are
distributed among the events that can arise
from a series of trials
ex: coin toss
we toss a coin three times, defining the
outcome "head" as a success.
what are the possible outcomes?
how do we calculate their probabilities?
coin toss (cont.)
how do we assign values to
P(n,k,p)?
3 trials; n = 3
even odds of success; p = .5
P(3,k,.5)
there are 4 possible values for 'k',
and we want to calculate P for
each of them
k   outcomes
0   TTT
1   HTT (THT, TTH)
2   HHT (HTH, THH)
3   HHH
probability of k successes in n trials
where the probability of success on any one trial is p

$$P(n,k,p) = \frac{n!}{k!\,(n-k)!}\,p^{k}\,(1-p)^{n-k}$$

$$P(3,1,.5) = \frac{3!}{1!\,(3-1)!}\,(.5)^{1}\,(1-.5)^{3-1} = .375$$

$$P(3,0,.5) = \frac{3!}{0!\,(3-0)!}\,(.5)^{0}\,(1-.5)^{3-0} = .125$$
[Figure: P(3,k,.5) plotted against k = 0 to 3; vertical axis 0.000 to 0.400]
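As a cross-check on the formula, the three-toss distribution can also be obtained by listing every possible sequence and counting heads. This is an illustrative sketch assuming a fair coin, so that all eight sequences are equally likely; the variable names are hypothetical.

```python
from collections import Counter
from itertools import product

# Enumerate all 2**3 = 8 equally likely sequences of three tosses.
sequences = ["".join(s) for s in product("HT", repeat=3)]

# Count sequences by number of heads (k), then convert counts to probabilities.
counts = Counter(seq.count("H") for seq in sequences)
pdf = {k: counts[k] / len(sequences) for k in range(4)}

print(pdf)  # {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
```

Counting works here only because every sequence has the same probability; when p is not .5 the sequences are no longer equally likely and the formula is needed.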
practical applications
how do we interpret the absence of key
types in artifact samples?
does sample size matter?
does anything else matter?
1. we are interested in ceramic production in
southern Utah
2. we have surface collections from a
number of sites
are any of them ceramic workshops?
3. evidence: ceramic wasters
ethnoarchaeological data suggests that
wasters tend to make up about 5% of samples
at ceramic workshops
example
one of our sites: 15 sherds, none identified
as wasters.
so, our evidence seems to suggest that this
site is not a workshop
how strong is our conclusion?
reverse the logic: assume that it is a ceramic
workshop
new question:
how likely is it to have missed collecting wasters in a
sample of 15 sherds from a real ceramic workshop?
P(n,k,p)
[n trials, k successes, p = prob. of success on 1 trial]
P(15,0,.05)
[we may want to look at other values of k.]
k     P(15,k,.05)
0     0.46
1     0.37
2     0.13
3     0.03
4     0.00
...
15    0.00
[Figure: P(15,k,.05) plotted against k = 0 to 15; vertical axis 0.00 to 0.50]
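The values of P(15,k,.05) can be reproduced in a few lines. This is a minimal sketch that applies the binomial formula through `math.comb`; the variable names are illustrative, and the 5% waster rate is the ethnoarchaeological estimate quoted in the slides.

```python
from math import comb

n, p = 15, 0.05  # 15 sherds; wasters assumed to be about 5% at a workshop

# P(15,k,.05) for every possible number of wasters k = 0..15
table = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

for k in range(5):
    print(k, f"{table[k]:.2f}")
```

A sample with no wasters at all is the single most likely outcome (about .46), which is why a waster-free sample of 15 sherds says little about whether the site was a workshop.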
how large a sample do you need before you
can place some reasonable confidence in the
idea that no wasters = no workshop?
how could we find out?
we could plot P(n,0,.05) against different
values of n.
[Figure: P(n,0,.05) plotted against n = 0 to 150; vertical axis 0.00 to 0.50]
n = 50: less than 1 chance in 10 of collecting
no wasters.
n = 100: less than 1 chance in 100 (about 0.006).
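The same curve can be tabulated rather than plotted. The sketch below evaluates P(n,0,.05) = (.95)^n for a handful of sample sizes; the function name and the particular values of n are illustrative choices.

```python
p = 0.05  # assumed proportion of wasters at a real workshop

def p_no_wasters(n, p):
    """P(n,0,p) = (1-p)^n: chance that a sample of n contains no wasters."""
    return (1 - p) ** n

for n in (15, 50, 100, 150):
    print(n, round(p_no_wasters(n, p), 3))
```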
what if wasters existed at a higher proportion than 5%?
[Figure: P(n,0,p) plotted against n = 0 to 160, with one curve for p=.05 and one for p=.10; vertical axis 0.00 to 0.50]
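One way to compare the two waster rates is to ask how many sherds are needed before the chance of finding no wasters drops below a chosen level. The sketch below solves (1-p)^n <= risk for n; the function name and the `risk` parameter are hypothetical, introduced here only for illustration.

```python
from math import ceil, log

def min_sample_size(p, risk):
    """Smallest n for which (1-p)^n <= risk."""
    return ceil(log(risk) / log(1 - p))

for p in (0.05, 0.10):
    print(p, min_sample_size(p, risk=0.10), min_sample_size(p, risk=0.01))
```

Under these assumptions, doubling the waster rate from 5% to 10% roughly halves the sample needed (45 versus 22 sherds for a 1-in-10 risk).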
so, how big should samples be?
depends on your research goals & interests
need big samples to study rare items.
rules of thumb are usually misguided (ex.
"200 pollen grains is a valid sample")
in general, sheer sample size is more
important than the actual proportion
large samples that constitute a very small
proportion of a population may be highly
useful for inferential purposes
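The point about sample size versus sampling proportion can be illustrated numerically. The sketch below is an illustration under stated assumptions, not something taken from the slides: it treats sampling as drawing without replacement from a finite population in which 5% of items are rare, and all names are hypothetical.

```python
def p_none_found(population, rare_items, sample):
    """Chance that a random sample drawn without replacement has no rare items."""
    prob = 1.0
    for i in range(sample):
        prob *= (population - rare_items - i) / (population - i)
    return prob

# Same sample size (100) and same 5% rarity; only the population size changes,
# so the sample is 10%, 1%, and 0.01% of the population respectively.
for population in (1_000, 10_000, 1_000_000):
    rare = population // 20  # 5% of the population
    print(population, round(p_none_found(population, rare, 100), 4))
```

In all three cases the chance of missing the rare item entirely stays below 1 in 100, even though the sampled proportion changes by a factor of a thousand.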
the plots we have been using are probability
density functions (PDF)
cumulative density functions (CDF) have a
special purpose
example based on mortuary data.
Pre-Dynastic cemeteries in Upper Egypt
Site 1
800 graves
160 exhibit body position and grave goods that mark
members of a distinct ethnicity (group A)
relative frequency of 0.2
Site 2
badly damaged; only 50 graves excavated
6 exhibit group A characteristics
relative frequency of 0.12
expressed as a proportion, Site 1 has nearly
twice as many burials of individuals from
group A as Site 2
how seriously should we take this
observation as evidence about social
differences between underlying
populations?
assume for the moment that there is no
difference between these societies; they
represent samples from the same underlying
population
how likely would it be to collect our Site 2
sample from this underlying population?
we could use data merged from both sites as
a basis for characterizing this population
but since the sample from Site 1 is so large,
let's just use it.
Site 1 suggests that about 20% of our
society belong to this distinct social class.
if so, we might have expected that 10 of the
50 graves excavated from Site 2 would belong
to this class
but we found only 6.
how likely is it that this difference (10 vs. 6)
could arise just from random chance?
to answer this question, we have to be
interested in more than just the probability
associated with the single observed outcome
"6"
we are also interested in the total probability
associated with outcomes that are more
extreme than "6".
imagine a simulation of the
discovery/excavation process of graves at
Site 2:
repeated drawing of 50 balls from a jar:
ca. 800 balls
80% black, 20% white
on average, samples will contain 10 white
balls, but individual samples will vary
we keep score of how many times we
draw a sample that is as divergent as, or more
divergent than (relative to the mean sample),
what we observed in our real-world sample.
this means we have to tally all samples that
produce 6, 5, 4, ... 0 white balls.
a tally of just those samples with 6 white
balls eliminates crucial evidence.
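The jar experiment can be run as a short simulation. This is a sketch under the assumptions stated on the slide (about 800 balls, 20% white, samples of 50); the number of trials and the random seed are arbitrary choices made here for illustration.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable; the value is arbitrary

# The jar: about 800 balls, 20% white (group A) and 80% black.
jar = ["white"] * 160 + ["black"] * 640

trials = 10_000
observed = 6      # white balls (group A graves) in the real Site 2 sample
as_extreme = 0    # number of samples with 6 or fewer white balls

for _ in range(trials):
    sample = random.sample(jar, 50)  # draw 50 balls without replacement
    if sample.count("white") <= observed:
        as_extreme += 1

share = as_extreme / trials
print(share)
```

The printed share estimates how often chance alone produces a sample at least as far below the expected 10 white balls as the one actually observed.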
we can use the binomial theorem instead of
the drawing experiment, but the same logic
applies
a cumulative density function (CDF)
displays probabilities associated with a
range of outcomes (such as 6 to 0 graves
with evidence for elite status)
n    k   p      P(n,k,p)   cumP
50   0   0.20   0.000      0.000
50   1   0.20   0.000      0.000
50   2   0.20   0.001      0.001
50   3   0.20   0.004      0.006
50   4   0.20   0.013      0.018
50   5   0.20   0.030      0.048
50   6   0.20   0.055      0.103
[Figure: cumP(50,k,.20) plotted against k = 0 to 50; vertical axis 0.00 to 1.00]
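The cumulative probabilities for the Site 2 example can be computed as a running total of the individual binomial probabilities. This is a minimal sketch; the variable names are illustrative, and the 20% figure is the group A frequency taken from Site 1.

```python
from itertools import accumulate
from math import comb

n, p = 50, 0.20  # 50 graves; group A assumed to be 20% of the population

pdf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
cdf = list(accumulate(pdf))  # running total: cumP for k = 0, 1, 2, ...

for k in range(7):
    print(k, f"{pdf[k]:.3f}", f"{cdf[k]:.3f}")
```

The value of cumP at k = 6 (about 0.103) is the total probability of finding 6 or fewer group A graves in a sample of 50.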
so, the odds are about 1 in 10 that the
differences we see could be attributed to
random effects, rather than social
differences
you have to decide what this observation
really means, and other kinds of evidence
will probably play a role in your decision.
