
Descriptive statistics

Variance = (1/k) Σ (x_i − x̄)² = (1/k) [ Σ x_i² − n x̄² ]
    k = n for population
    k = n − 1 for sample

Coeff. of variation = S.D. / x̄

* V_{p/100} = X_k, where k is the smallest integer > np/100, if np/100 is not an integer; V_{p/100} = (X_{np/100} + X_{np/100+1}) / 2 when np/100 is an integer
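The variance identity and coefficient of variation above can be checked numerically; a minimal sketch with hypothetical data (the values of `x` are illustrative only):

```python
import statistics

# Hypothetical sample data (illustration only).
x = [4.0, 7.0, 6.0, 5.0, 8.0]
n = len(x)
mean = sum(x) / n

# Sample variance via the definition (k = n - 1) ...
var_def = sum((xi - mean) ** 2 for xi in x) / (n - 1)
# ... and via the shortcut form (1/k)[sum(x_i^2) - n*mean^2].
var_short = (sum(xi * xi for xi in x) - n * mean * mean) / (n - 1)

# Both forms agree, and match the library's sample variance.
assert abs(var_def - var_short) < 1e-9
assert abs(var_def - statistics.variance(x)) < 1e-9

# Coefficient of variation = S.D. / mean (dimensionless).
cv = statistics.stdev(x) / mean
print(var_def, round(cv, 4))
```

The shortcut form avoids a second pass over the data but can lose precision for large values; the definition form is numerically safer.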
Probability
- Relative Risk = Pr(D | E) / Pr(D | Ē)
- Bayes' rule: Pr(B | A) = Pr(A | B) Pr(B) / [ Pr(A | B) Pr(B) + Pr(A | B̄) Pr(B̄) ]
- Diagnostic tests:
    Given +ve: Disease with prob. PV+ ; No disease with prob. 1 − PV+
    Given −ve: No disease with prob. PV− ; Disease with prob. 1 − PV−
    Given disease: +ve with prob. Sensitivity ; −ve (false −ve) with prob. 1 − Sensitivity
    Given no disease: −ve with prob. Specificity ; +ve (false +ve) with prob. 1 − Specificity

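Bayes' rule gives the predictive values directly from sensitivity, specificity, and prevalence; a short sketch (all three input numbers are hypothetical, chosen only for illustration):

```python
# Predictive values from sensitivity, specificity and prevalence
# via Bayes' rule. Numbers below are hypothetical.
sens = 0.90   # Pr(+ve | disease)
spec = 0.95   # Pr(-ve | no disease)
prev = 0.02   # Pr(disease)

# PV+ = Pr(disease | +ve)
pv_pos = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
# PV- = Pr(no disease | -ve)
pv_neg = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)

print(round(pv_pos, 4), round(pv_neg, 4))
```

Note how a low prevalence drags PV+ down even for an accurate test: most positives come from the large disease-free group.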
Probability distribution
- (for all X) μ = E(X) ; σ² = Var(X) = E{(X − μ)²} = E(X²) − [E(X)]²
- (discrete X) μ = Σ x Pr(X = x) ; σ² = Σ (x − μ)² Pr(X = x) = Σ x² Pr(X = x) − μ²
- (continuous X) μ = ∫ x f(x) dx ; σ² = ∫ (x − μ)² f(x) dx = ∫ x² f(x) dx − μ²
Binomial (limited by n)
    Pr(X = x) = nCx p^x (1 − p)^(n−x), where nCx = n! / [x! (n − x)!]
Poisson (not limited by n)
    Pr(X = x) = e^(−μ) μ^x / x!, where μ = np
Normal [standard normal N(0, 1²) in brackets]
    f(x) = [1 / (σ√(2π))] e^(−(x−μ)² / (2σ²))    [ φ(x; 0, 1) = (1/√(2π)) e^(−x²/2) ]
    Standardization: Z = (X − μ) / σ
    Φ(x) = Pr(X ≤ x) = ∫ from −∞ to x of φ(t; 0, 1) dt
    z_u is the value with Pr(Z ≤ z_u) = u
* Φ(−x) = 1 − Φ(x) ; Pr(a ≤ X ≤ b) = Pr(z_a ≤ Z ≤ z_b) = Φ(z_b) − Φ(z_a)
* Pr(X ≤ a) = Pr(X < a) for continuous random variables only (as Pr(X = a) = 0)
* Approximation: equate E(X) and Var(X) of the two distributions
    Poisson approximation to binomial when np < 5 (remember to check values)
    Normal approximation to binomial when npq ≥ 5 (remember continuity correction)
        Pr(a ≤ X ≤ b) ≈ Pr(a − 0.5 < Y < b + 0.5) ; Pr(X = a) ≈ Pr(a − 0.5 < Y < a + 0.5)
        special cases: Pr(X = 0) ≈ Pr(Y < 0.5) ; Pr(X = n) ≈ Pr(Y > n − 0.5)
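The binomial pmf and its Poisson approximation (μ = np) can be compared directly; a sketch with hypothetical parameters n = 100, p = 0.02, so np = 2 < 5 and the approximation should be close:

```python
import math

# Binomial pmf vs. its Poisson approximation with mu = n*p.
# Parameters are hypothetical, chosen so that np = 2 < 5.
n, p = 100, 0.02
mu = n * p

def binom_pmf(x):
    # nCx * p^x * (1-p)^(n-x)
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x):
    # e^(-mu) * mu^x / x!
    return math.exp(-mu) * mu**x / math.factorial(x)

for x in range(4):
    print(x, round(binom_pmf(x), 4), round(poisson_pmf(x), 4))
```

The two columns printed agree to about two decimal places, which is the practical content of the np < 5 rule of thumb.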
Relationships between random variables
- E(XY) = Σ Σ x y Pr(X = x, Y = y) ; E(XY) = E(X) E(Y) if X, Y are independent
- linear combination (l.c.) for all Xs: E(L) = Σ c_i E(X_i) = Σ c_i μ_i
- Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)] = E(XY) − μ_X μ_Y
    Cov(X, X) = Var(X) ; Cov(X, Y) = Cov(Y, X) ; Cov(X, Y) = 0 if X, Y are independent
    Cov(X, aY + bZ) = a Cov(X, Y) + b Cov(X, Z)
- Corr(X, Y) = ρ_XY = Cov(X, Y) / (σ_X σ_Y)
- l.c. for all Xs: Var(L) = Σ c_i² Var(X_i) + 2 Σ_{i<j} c_i c_j Cov(X_i, X_j) = Σ c_i² σ_i² + 2 Σ_{i<j} c_i c_j σ_i σ_j Corr(X_i, X_j)
- (sample covariance) s_XY = [1/(n−1)] Σ (x_i − x̄)(y_i − ȳ)
    s_XX = [1/(n−1)] Σ (x_i − x̄)² = s_X² (sample variance) when X = Y
    sample corr. coeff. = r_XY = s_XY / (s_X s_Y)

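The sample covariance and correlation formulas above, including the s_XX = s_X² special case, can be sketched as follows (the paired data are hypothetical):

```python
import statistics

# Sample covariance and correlation. Hypothetical paired data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# s_XY = [1/(n-1)] * sum of (x_i - xbar)(y_i - ybar)
s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)
# r_XY = s_XY / (s_X * s_Y)
r_xy = s_xy / (statistics.stdev(x) * statistics.stdev(y))

# Cov(X, X) reduces to the sample variance.
s_xx = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
assert abs(s_xx - statistics.variance(x)) < 1e-9

print(round(s_xy, 3), round(r_xy, 3))
```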
Point estimation
- Choice of estimator: E(θ̂) = θ for unbiased; Var(θ̂₁) < Var(θ̂₂) for minimum variance
- S² = [1/(n−1)] Σ (X_i − X̄)² is the estimator for pop. variance σ²
- X̄ is the best estimator for pop. mean μ for N.D.; standard error = σ(X̄) = √Var(X̄) = σ/√n
    s/√n is the estimator for the standard error
- p̂ is the best estimator for pop. proportion p for N.D.; standard error = σ(p̂) = √Var(p̂) = √(pq/n)
    √(p̂q̂/n) is the estimator for the standard error
- Sampling distribution: X̄ ~ N(μ, σ²/n) if X ~ N(μ, σ²) ; X̄ ≈ N(μ, σ²/n) for n ≥ 30 (central limit theorem)
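A small simulation illustrates the standard error σ/√n and its estimator s/√n; the population parameters N(10, 2²) and the sample size are hypothetical:

```python
import math
import random
import statistics

# Standard error of the mean: sigma/sqrt(n), estimated by s/sqrt(n).
# Hypothetical N(10, 2^2) population, fixed seed for reproducibility.
random.seed(0)
n = 50
sample = [random.gauss(10, 2) for _ in range(n)]

xbar = statistics.mean(sample)          # estimator of the population mean
s2 = statistics.variance(sample)        # unbiased estimator of sigma^2 (divides by n-1)
se_hat = math.sqrt(s2) / math.sqrt(n)   # estimated standard error of xbar
se_true = 2 / math.sqrt(n)              # true standard error, sigma known

print(round(xbar, 3), round(se_hat, 3), round(se_true, 3))
```

With n = 50 the estimated and true standard errors are both near 0.28; repeating with larger n shows both shrinking like 1/√n.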

Interval estimation: (1−α)100% C.I. (two-sided)

Mean (σ known, or n > 200 if σ unknown):  x̄ ± z_{1−α/2} σ/√n
    90%: z_{0.95} = 1.645 ; 95%: z_{0.975} = 1.960 ; 99%: z_{0.995} = 2.576
Mean (σ unknown):  x̄ ± t_{n−1, 1−α/2} s/√n — valid for large n even if not normally distributed
Proportion (when n p̂ q̂ ≥ 5):  p̂ ± z_{1−α/2} √( p̂(1 − p̂)/n )
Variance:  ( (n−1)s² / χ²_{n−1, 1−α/2} , (n−1)s² / χ²_{n−1, α/2} ) — not valid for non-normal distributions

* t_{n−1, α/2} = −t_{n−1, 1−α/2}, but there is no simple relationship between χ²_{n−1, α/2} and χ²_{n−1, 1−α/2}
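A minimal sketch of the large-n confidence interval for a mean, using the 95% quantile z_{0.975} = 1.960 from the table above (the summary statistics n, x̄, s are hypothetical):

```python
import math

# Two-sided 95% C.I. for the mean, sigma unknown but n large (> 200),
# so the normal quantile is used. Summary numbers are hypothetical.
n, xbar, s = 250, 12.4, 3.1
z = 1.960  # z_{0.975}

half = z * s / math.sqrt(n)       # half-width of the interval
ci = (xbar - half, xbar + half)
print(tuple(round(v, 3) for v in ci))
```

For smaller n with σ unknown, t_{n−1, 0.975} replaces 1.960 and the interval widens.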

Hypothesis testing (Given H0: μ = μ0)

- Alternatives: two-sided H1: μ ≠ μ0 ; one-sided H1: μ > μ0 or H1: μ < μ0
- Type I error: reject H0 when H0 is true; Type II error: fail to reject H0 when H0 is false
- significance level α (given) = Pr(Reject H0 | H0 is true)
- Power of test = 1 − β = Φ( −z_{1−α/2} + |μ0 − μ1| √n / σ )   (two-sided; use z_{1−α} for one-sided)
- sample size n = (z_{1−β} + z_{1−α/2})² σ² / (μ0 − μ1)²   (two-sided; use z_{1−α} for one-sided)
- p-value = probability of obtaining a test statistic as extreme as, or more extreme than, the one observed
    p > α: not sufficient evidence to reject H0 (vice versa for p < α)
    (for z, t) one-sided p = 1 − Φ(|test statistic|) ; two-sided p = 2[1 − Φ(|test statistic|)]
    (for χ²) two-sided p = 2 Pr(χ²_{n−1} ≤ X²) for s² ≤ σ0² ; 2[1 − Pr(χ²_{n−1} ≤ X²)] for s² > σ0²
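The z-test p-value rules above can be computed with Φ expressed through the error function; a sketch with hypothetical summary numbers (μ0, σ, n, x̄):

```python
import math

# One-sample z test of H0: mu = mu0. Summary numbers are hypothetical.
mu0, sigma = 100.0, 15.0
n, xbar = 36, 106.0

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

z = (xbar - mu0) / (sigma / math.sqrt(n))   # test statistic
p_two = 2 * (1 - phi(abs(z)))               # two-sided = 2[1 - Phi(|z|)]
p_one = 1 - phi(z)                          # one-sided (H1: mu > mu0)

print(round(z, 3), round(p_two, 4), round(p_one, 4))
```

Here z = 2.4, so p_two ≈ 0.016 < 0.05: sufficient evidence to reject H0 at α = 0.05.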
One-sample (vs. population) tests (two-sided)

Two paired-samples test (two-sided)

Two independent-samples tests (two-sided)

For the 2-sample test, compare the sample σ² with the population σ². * Must clarify!
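For the two independent-samples case under an equal-variance assumption, the pooled t statistic combines both sample variances; a sketch with hypothetical data (the equal-variance assumption itself should be checked first, per the note above):

```python
import math
import statistics

# Two independent-samples t statistic with pooled variance
# (equal-variance assumption). Data are hypothetical.
x1 = [5.1, 4.9, 5.6, 5.2, 4.7, 5.3]
x2 = [4.4, 4.8, 4.2, 4.9, 4.5, 4.3]
n1, n2 = len(x1), len(x2)

s1_sq, s2_sq = statistics.variance(x1), statistics.variance(x2)
# Pooled estimate of the common variance:
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
# t statistic with n1 + n2 - 2 degrees of freedom:
t = (statistics.mean(x1) - statistics.mean(x2)) / math.sqrt(sp_sq * (1 / n1 + 1 / n2))

print(round(t, 3), "df =", n1 + n2 - 2)
```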