K Kiran Kumar
Outline
What and why of statistical inference
Point and Interval estimates
Sampling Distributions
  Mean, Variance and Proportion
Hypothesis Testing
  Basic Concepts
  Parametric tests for Mean, Variance and Proportion
  Non-parametric tests
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
Shape: Skewness, Kurtosis
So what is a Population?
The aggregate of things or objects under study. It can contain a finite or an infinite number of objects.
Module 2 : Statistical Inference K Kiran Kumar
Key Concepts
Population: described by a Parameter
Sample: a subset of the population, described by a Statistic
Statistical Inference
Statistical inference is the process of making an estimate, prediction, or decision about a population based on a sample.
[Diagram: a sample is drawn from the population; inference runs from the sample statistic to the population parameter]
Types of Sampling
Non-Probability Sampling: Convenience Sampling, Expert Opinion, Quota Sampling
Probability Sampling: Simple Random Sampling, Systematic Sampling, Stratified Sampling, Cluster Sampling
Simple Random
[Figure: a sample drawn at random from the population]
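As a minimal sketch of simple random sampling, the snippet below draws units without replacement so that every unit has the same chance of selection. The population of 100 numbered units, the sample size of 10 and the helper name `simple_random_sample` are illustrative assumptions, not part of the slides.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units, each unit being equally likely to be chosen."""
    rng = random.Random(seed)          # seed only makes the draw repeatable
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical population: units numbered 1 to 100
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```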
Systematic
o Select a picking interval, e.g. every fifth
o Choose randomly one among the first five (or whatever the picking interval is)
o Pick out every fifth (or whatever the picking interval is), beginning from the chosen one
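The systematic procedure above can be sketched in a few lines. The population of 100 numbered units and the function name `systematic_sample` are assumptions made for illustration only.

```python
import random

def systematic_sample(population, interval, seed=None):
    """Pick a random start among the first `interval` units, then every interval-th unit."""
    rng = random.Random(seed)
    start = rng.randrange(interval)      # random position within the first interval
    return population[start::interval]   # every interval-th unit from the start

# Hypothetical population: units numbered 1 to 100, picking interval of 5
population = list(range(1, 101))
sample = systematic_sample(population, interval=5, seed=7)
print(sample)
```

With 100 units and an interval of 5, the sample always contains 20 units, whichever starting point is drawn.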
Stratified
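Guarantees that all groups are represented in the sample as they are in the population.
Proportional allocation: each group's share of the sample matches its share of the population.
[Figure: population divided into groups, with a sample drawn from each group]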
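Proportional allocation can be illustrated with a short sketch. The three strata "A", "B" and "C" with 60, 30 and 10 units, the total sample size of 10 and the helper `stratified_sample` are hypothetical. The sketch also assumes that rounding each stratum's share is acceptable; with awkward proportions the rounded shares may not add up exactly to the requested size.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n, seed=None):
    """Sample each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # Proportional allocation (assumption: simple rounding of the share)
        k = round(n * len(members) / len(units))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 units in stratum A, 30 in B, 10 in C
units = ([("A", i) for i in range(60)]
         + [("B", i) for i in range(30)]
         + [("C", i) for i in range(10)])
sample = stratified_sample(units, lambda unit: unit[0], n=10, seed=1)
print(sample)
```

Here the sample holds 6, 3 and 1 units from strata A, B and C, mirroring the 60/30/10 split of the population.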
Cluster
o Divide the population into clusters (industry groups, districts, ...)
o Choose randomly some of the clusters
o Draw the sample from the chosen clusters using an appropriate sampling method (or investigate the chosen groups in whole)
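A minimal sketch of cluster sampling follows, using the variant in which the chosen clusters are investigated in whole. The district names, the household labels and the function `cluster_sample` are made up for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and take every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # random choice of cluster names
    sample = [unit for name in chosen for unit in clusters[name]]
    return chosen, sample

# Hypothetical clusters: households grouped by district
clusters = {
    "North": ["N1", "N2", "N3"],
    "South": ["S1", "S2"],
    "East": ["E1", "E2", "E3", "E4"],
    "West": ["W1", "W2", "W3"],
}
chosen, sample = cluster_sample(clusters, n_clusters=2, seed=3)
print(chosen, sample)
```

To sample within the chosen clusters instead of taking them whole, the list comprehension could be replaced by a random draw from each chosen cluster.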
Statistical Inference
Rationale: Large populations make investigating each member impractical and expensive, and it has been shown that even observing 100% of a population is not perfect. It is easier and cheaper to take a sample and make inferences about the population from the sample.

However: Such conclusions and estimates are not always going to be correct. For this reason, we build measures of reliability into statistical inference, namely the confidence level and the significance level.
[Figure: a single sample drawn from the population, with sample mean $\bar{X}$]
[Figure: five samples (A to E) drawn from the same population, with sample means $\bar{X}_A$, $\bar{X}_B$, $\bar{X}_C$, $\bar{X}_D$ and $\bar{X}_E$]
The sample mean is just one of many possible sample means drawn from the population, and is rarely equal to the real population value.
Sampling Error
Population mean: 40.8
Sample 1 mean: 40.5
Sample 2 mean: 40.3
Sample 3 mean: 41.4
Different samples from the same population give different results, due to chance.
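Using the figures from this slide, the sampling error of each sample can be computed as the sample mean minus the population mean. The variable names are illustrative, and rounding to one decimal place is an assumption made to hide floating-point noise.

```python
population_mean = 40.8
sample_means = {"Sample 1": 40.5, "Sample 2": 40.3, "Sample 3": 41.4}

# Sampling error = sample mean - population mean (rounded to one decimal)
sampling_errors = {
    name: round(mean - population_mean, 1)
    for name, mean in sample_means.items()
}

for name, error in sampling_errors.items():
    print(f"{name}: {error:+.1f}")
```

Samples 1 and 2 underestimate the population mean and Sample 3 overestimates it, although all three come from the same population.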
Estimation
Estimation of population characteristics from a sample
Characteristics? What are they?
o Parameter vs. Statistic
o Estimator vs. Estimate
Point estimation
o Provides a single value
o Gives no information about how close the value is to the unknown population parameter
o Example: sample mean $\bar{X} = 3$

Interval estimation
o Provides a range of values
o Gives information about closeness to the unknown population parameter
o Stated in terms of probability: the location of the parameter is given with a specified probability
o Parameter = Statistic ± Error
o Level of confidence: $(1 - \alpha) \times 100\%$
o E.g. at the 95% level, 95% of the intervals constructed this way capture the population mean between their limits
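$$\bar{X} = \mu \pm Z\,\sigma_{\bar{x}}$$

[Figure: sampling distribution of $\bar{X}$ centered on $\mu$, with limits marked at $\mu \pm 1.65\,\sigma_{\bar{x}}$ (90%), $\mu \pm 1.96\,\sigma_{\bar{x}}$ (95%) and $\mu \pm 2.58\,\sigma_{\bar{x}}$ (99%)]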
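The multipliers 1.65, 1.96 and 2.58 on this slide are the standard normal values for the 90%, 95% and 99% confidence levels. The sketch below recovers them and builds an interval of the form statistic ± error. It assumes the population standard deviation is known; the sample mean of 3 echoes the point-estimate example, while `sigma = 1` and `n = 25` are made-up values.

```python
from math import sqrt
from statistics import NormalDist

def z_value(confidence):
    """Two-sided standard normal multiplier for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Interval: sample mean +/- z * sigma / sqrt(n), assuming sigma is known."""
    margin = z_value(confidence) * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_value(level):.3f}")

# Hypothetical inputs: sigma and n are illustrative assumptions
low, high = confidence_interval(sample_mean=3, sigma=1, n=25, confidence=0.95)
print(f"95% interval: ({low:.3f}, {high:.3f})")
```

The 90% multiplier is about 1.645, which the slide rounds to 1.65. A higher confidence level gives a larger multiplier and therefore a wider interval.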
Sampling Distributions
A sampling distribution is created by, as the name suggests, sampling. The method we will employ relies on the rules of probability and the laws of expected value and variance to derive the sampling distribution. For example, consider the roll of one and two dice.
While there are 36 possible samples of size 2, there are only 11 values for $\bar{X}$, and some (e.g. $\bar{X} = 3.5$) occur more frequently than others (e.g. $\bar{X} = 1$).
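The counts quoted here can be checked by enumerating every ordered pair of dice and tabulating the sample mean. This is only a sketch; the variable names are illustrative, and exact fractions are used to avoid rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered samples of size 2 from a fair die
samples = list(product(range(1, 7), repeat=2))

# Count how often each sample mean occurs, then convert counts to probabilities
counts = Counter(Fraction(a + b, 2) for a, b in samples)
distribution = {
    mean: Fraction(count, len(samples))
    for mean, count in sorted(counts.items())
}

for mean, prob in distribution.items():
    print(f"mean = {float(mean):.1f}  P = {prob}")
```

Probabilities are printed in lowest terms, so 6/36 appears as 1/6.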
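[Figure: sampling distribution of $\bar{X}$ for two dice, with $P(\bar{X})$ on the vertical axis (ticks at 5/36 and 6/36)]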
Compare
Compare the distribution of $X$ (one die) with the sampling distribution of $\bar{X}$ (mean of two dice).
[Figure: both distributions plotted over the values 1.0 to 6.0 in steps of 0.5]
Generalize
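We can generalize the mean and variance of the sampling distribution of the mean of two dice, where $\mu$ and $\sigma^2$ are the mean and variance of a single die:

$$\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}}^2 = \frac{\sigma^2}{2}$$

to n dice:

$$\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$$

The standard deviation of the sampling distribution is called the standard error:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$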
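As a check on the generalization, the sketch below enumerates every outcome of n fair dice and computes the exact mean and variance of the sample mean. The standard result is that the mean stays at the single-die mean while the variance is the single-die variance divided by n, so the standard error is the single-die standard deviation divided by the square root of n. The function name is an assumption, and full enumeration is only practical for small n.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

def sampling_distribution_moments(n):
    """Exact mean and variance of the mean of n fair dice, by full enumeration."""
    outcomes = product(range(1, 7), repeat=n)           # all 6**n equally likely rolls
    means = [Fraction(sum(roll), n) for roll in outcomes]
    mu = sum(means) / len(means)
    variance = sum((m - mu) ** 2 for m in means) / len(means)
    return mu, variance

for n in (1, 2, 3):
    mu, variance = sampling_distribution_moments(n)
    print(f"n = {n}: mean = {mu}, variance = {variance}, "
          f"standard error = {sqrt(variance):.4f}")
```

For one die the variance is 35/12; for two and three dice it falls to 35/24 and 35/36, while the mean remains 7/2.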