
Applied Statistics for Securities Markets, cSMP @ NISM

K Kiran Kumar, Faculty @ NISM

Module 2 : Statistical Inference

K Kiran Kumar

Outline
What and why of statistical inference
Point and interval estimates
Sampling distributions
o Mean, variance and proportion
Hypothesis testing
o Basic concepts
o Parametric tests for mean, variance and proportion
o Non-parametric tests


Let's start with what we know about statistics


Collect ... Present ... Characterize the data

Descriptive Statistics
o Central Tendency: Mean, Median, Mode
o Variation: Range, Interquartile Range, Variance, Standard Deviation, Coeff. of Variation
o Shape: Skewness, Kurtosis

Inferential Statistics: One step ahead from there


Descriptive Statistics Vs Inferential Statistics
Collecting and describing data: is it the same as drawing conclusions?
o Descriptive statistics describe the data set that's being analyzed, but don't allow us to draw any conclusions or make any inferences about the data, other than visual "It looks like ..." type statements.
o Hence we need another branch of statistics: inferential statistics

Inferential or Inductive Statistics


Logic of drawing statistically valid conclusions about the population based on a sample
Subject of interest to us now
Estimate ... test the claim ... draw conclusion

So what is a Population?
Aggregate of things / objects under study
Can be finite or infinite

Key Statistical Concepts


Population: the group of all items of interest
o Frequently very large; sometimes infinite
Sample: a set of data drawn from the population
o Potentially very large, but less than the population
o E.g. a sample of 1000 voters exit polled on election day
Parameter: a descriptive measure of a population
Statistic: a descriptive measure of a sample

Key Concepts

[Diagram: the Sample is a subset of the Population; the Parameter describes the Population, the Statistic describes the Sample.]

Populations have Parameters, Samples have Statistics.



Statistical Inference
Statistical inference is the process of making an estimate, prediction, or decision about a population based on a sample.

[Diagram: a Sample is drawn from the Population; inference runs from the sample Statistic back to the population Parameter.]

What can we infer about a Population's Parameters based on a Sample's Statistics?



Questions that we can answer


Types of Sampling

Non-Probability Sampling
o Convenience Sampling
o Expert Opinion
o Quota Sampling

Probability Sampling
o Simple Random Sampling
o Stratified Random Sampling
o Systematic Sampling
o Cluster Sampling

Simple Random

[Figure: units drawn at random from the Population form the Sample.]
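A minimal sketch of simple random sampling, assuming a hypothetical population of 100 numbered units and a sample size of 10 (neither figure comes from the slides). Every unit has the same chance of selection and no unit is drawn twice.

```python
import random

# Hypothetical population: 100 numbered units (e.g. listed securities).
population = list(range(1, 101))

rng = random.Random(42)                  # fixed seed so the draw is repeatable
sample = rng.sample(population, k=10)    # 10 units, drawn without replacement

print(sorted(sample))
```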


Systematic

o Select a picking interval, e.g. every fifth
o Choose randomly one among the first five (or whatever the picking interval is)
o Pick out every fifth (or whatever the picking interval is), beginning from the chosen one
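The systematic procedure can be sketched as follows. The function name `systematic_sample` and the frame of 50 units are illustrative assumptions, not part of the slides.

```python
import random

def systematic_sample(units, interval, rng):
    """Pick a random start among the first `interval` units, then take every `interval`-th unit."""
    start = rng.randrange(interval)      # index 0 .. interval - 1
    return units[start::interval]

population = list(range(1, 51))          # hypothetical sampling frame of 50 units
sample = systematic_sample(population, interval=5, rng=random.Random(7))

print(sample)
```

With a picking interval of 5 and 50 units, the sample always contains 10 units spaced exactly 5 apart. Only the starting point is random.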


Stratified

Population strata (age groups): 18-29, 30-49, 50-64, 65+

Proportional allocation
o Guarantees that all the groups are represented in the sample as in the population

Even allocation
o Used to compare groups
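The two allocation rules can be compared in a short sketch. Only the age bands come from the slide. The stratum sizes and the total sample size of 100 are assumptions chosen so that the shares divide exactly.

```python
import random

# Hypothetical stratum sizes; only the age bands are taken from the slide.
strata = {"18-29": 200, "30-49": 400, "50-64": 250, "65+": 150}
n = 100                                   # total sample size (assumed)
total = sum(strata.values())

# Proportional allocation: a stratum's share of the sample equals its share
# of the population. These sizes divide exactly; real data would need rounding.
proportional = {group: n * size // total for group, size in strata.items()}

# Even allocation: the same number of units from every stratum.
even = {group: n // len(strata) for group in strata}

# Draw a simple random sample inside each stratum (proportional allocation).
rng = random.Random(1)
sample = {group: rng.sample(range(size), proportional[group])
          for group, size in strata.items()}

print(proportional)
print(even)
```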

Cluster

o Divide the population into clusters (industry groups, districts, ...)
o Choose some of the clusters at random
o Draw a sample from the chosen clusters using an appropriate sampling method (or investigate the chosen groups in whole)
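A minimal two-stage sketch of cluster sampling. The industry groups and firm labels are made up for illustration. In the second stage this version investigates every member of the chosen clusters, which is one of the two options the slide mentions.

```python
import random

# Hypothetical clusters: industry groups and their member firms.
clusters = {
    "Banking": ["B1", "B2", "B3"],
    "IT": ["I1", "I2", "I3", "I4"],
    "Pharma": ["P1", "P2"],
    "Energy": ["E1", "E2", "E3"],
}

rng = random.Random(3)
chosen = rng.sample(sorted(clusters), k=2)    # stage 1: pick clusters at random

# Stage 2: investigate the chosen clusters in whole.
sample = [firm for name in chosen for firm in clusters[name]]

print(chosen, sample)
```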


Non-probability Sample

When a sample is not drawn randomly, it is called a non-probability sample
o For example, when you use the elements most readily available, as in self-selecting surveys or street interviews
In the case of a non-probability sample you should not draw conclusions about the whole population
Statistical inference: drawing conclusions about the whole population on the basis of a sample
Precondition for statistical inference: the sample is randomly selected from the population (= probability sample)


Statistical Inference
Rationale:
o Large populations make investigating each member impractical and expensive; moreover, it has been shown that observing 100% of a population is not perfect.
o It is easier and cheaper to take a sample and make inferences about the population from the sample.
However:
o Such conclusions and estimates are not always going to be correct.
o For this reason, we build measures of reliability into the statistical inference, namely the confidence level and the significance level.


You describe the sample with any central tendency measure, say the mean.

[Figure: a Sample drawn from the Population, summarized by its mean $\bar{X}$.]


Sample Mean is a random variable

[Figure: Samples A, B, C, D and E drawn from the same Population, with sample means $\bar{X}_A$, $\bar{X}_B$, $\bar{X}_C$, $\bar{X}_D$ and $\bar{X}_E$.]

The sample mean is just one of many possible sample means drawn from the population, and is rarely equal to the real population value.

Sampling Error

Population mean 40.8
Sample 1 mean 40.5
Sample 2 mean 40.3
Sample 3 mean 41.4

Different samples from the same population give different results
Due to chance
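Sampling error can be illustrated with a small simulation. The population below is synthetic (10,000 normally distributed values centred near 40.8 with an assumed spread of 5), and the sample size of 50 is also an assumption. The printed means will not reproduce the slide's figures, only the pattern.

```python
import random
import statistics

rng = random.Random(0)

# Synthetic population; the centre 40.8 echoes the slide, the spread is assumed.
population = [rng.gauss(40.8, 5.0) for _ in range(10_000)]
population_mean = statistics.fmean(population)

# Three independent random samples of 50 units each.
sample_means = [statistics.fmean(rng.sample(population, 50)) for _ in range(3)]

for i, mean in enumerate(sample_means, start=1):
    print(f"Sample {i} mean {mean:.1f}, sampling error {mean - population_mean:+.2f}")
```

Each sample mean lands near the population mean but not on it, and the three differ from one another purely by chance.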

GAUSS, Carl Friedrich 1777-1855 http://www.york.ac.uk/depts/maths/histstat/people/gauss_note.gif.gz



Estimation
Estimation of sample characteristics
Characteristics? What are they?
o Parameter Vs Statistic
o Estimator Vs Estimate

Point or Interval Estimation

Do sample characteristics closely represent population characteristics?

How close is close? Some yardsticks are needed
o Measures of central tendency: mean, median, mode
o Unbiasedness
o Consistency
o Efficiency


Point estimation
o Provides a single value
o Gives no information about how close the value is to the unknown population parameter
o E.g. sample mean $\bar{X} = 3$

Interval estimation
o Provides a range of values
o Gives information about closeness to the unknown population parameter
o Stated in terms of probability: location of the parameter with specified probability = Statistic ± Error
o Level of confidence: $100(1 - \alpha)\%$
o E.g. 95% of the time, the confidence interval captures the mean between its limits
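The "Statistic ± Error" form can be sketched for a mean, assuming the population standard deviation is known and the sampling distribution is approximately normal. The helper `z_interval` and all input numbers are hypothetical.

```python
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) = sample_mean -/+ z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    error = z * sigma / n ** 0.5
    return sample_mean - error, sample_mean + error

# Hypothetical inputs: sample mean 121, known sigma 5, sample size 100.
low, high = z_interval(121.0, 5.0, 100)
print(f"95% interval: {low:.2f} to {high:.2f}")
```

The point estimate alone (121) says nothing about precision. The interval adds an error margin of just under 1 on either side at the 95% level.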


Confidence & Significance Levels


The confidence level is the proportion of times that an interval estimate for a population parameter will be correct.
o E.g. a confidence level of 95% means that interval estimates based on this form of statistical inference will be correct 95% of the time.
o "I am 95% confident that the TRUE mean is between 120 and 122."

When the purpose of the statistical inference is to test a claim about a population parameter, the significance level measures how frequently a true claim is accidentally rejected.
o E.g. a 5% significance level means that, in the long run, a true claim will be rejected 5% of the time.
o Coin flips should result in 50% heads, on average. A 5% significance level implies that we run a 5% risk of concluding that heads do not occur 50% of the time, on average, when in fact they do.

We use $\alpha$ (Greek letter alpha) to represent the significance level when testing a claim about a population parameter, and $1 - \alpha$ to represent the confidence level when we wish to estimate a population parameter.
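The coin-flip example can be checked numerically. The sketch below assumes 100 flips per experiment and a two-sided z-test on the proportion of heads, neither of which is specified on the slide. It computes the exact probability that a genuinely fair coin falls in the rejection region.

```python
from math import comb
from statistics import NormalDist

n = 100                                          # flips per experiment (assumed)
alpha = 0.05                                     # nominal significance level
z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # about 1.96
se = (0.5 * 0.5 / n) ** 0.5                      # std. error of the proportion if p = 0.5

def rejects(heads):
    """True if the observed proportion is more than z_crit standard errors from 0.5."""
    return abs(heads / n - 0.5) / se > z_crit

# Exact chance that a fair coin produces a head count in the rejection region.
type_one_rate = sum(comb(n, k) for k in range(n + 1) if rejects(k)) / 2 ** n
print(f"Chance of rejecting a true claim: {type_one_rate:.3f}")
```

The result is close to, but not exactly, 5%, because head counts are discrete while the 1.96 cut-off comes from the continuous normal approximation.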

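Standardized Confidence Interval

$\bar{X} = \mu_{\bar{x}} \pm Z\sigma_{\bar{x}}$

[Figure: sampling distribution of $\bar{x}$ centred at $\mu_{\bar{x}}$, with the horizontal axis marked at $\mu_{\bar{x}} \pm 1.65\sigma_{\bar{x}}$, $\mu_{\bar{x}} \pm 1.96\sigma_{\bar{x}}$ and $\mu_{\bar{x}} \pm 2.58\sigma_{\bar{x}}$, containing 90%, 95% and 99% of samples respectively.]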
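The multipliers 1.65, 1.96 and 2.58 are rounded two-sided critical values of the standard normal distribution. A short sketch, using Python's standard library rather than a printed Z table, recovers them:

```python
from statistics import NormalDist

z_values = {}
for confidence in (0.90, 0.95, 0.99):
    # Two-sided interval: split the excluded area equally between both tails.
    z_values[confidence] = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    print(f"{confidence:.0%} of samples within +/- {z_values[confidence]:.3f} standard errors")
```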



Sampling Distribution-A Conceptual Framework


Sampling Distribution
o A distribution of a sample statistic based on random samples of a fixed size from a population
o A distribution that describes the chance fluctuations of a statistic

Sampling Distribution of Mean
o The mean of the distribution of sample means is the same as the population mean
o How about its standard deviation?

Standard Error = standard deviation of the sampling distribution
o Standard deviation of all possible sample means
o Measures scatter in all sample means


Sampling Distributions
A sampling distribution is created by, as the name suggests, sampling.
The method we will employ relies on the rules of probability and the laws of expected value and variance to derive the sampling distribution.
For example, consider the roll of one and two dice.


Sampling Distribution of the Mean


A fair die is thrown infinitely many times, with the random variable X = # of spots on any throw.
The probability distribution of X is:

x      1    2    3    4    5    6
P(x)  1/6  1/6  1/6  1/6  1/6  1/6

and the mean and variance are calculated as well:

$\mu = \sum x P(x) = 3.5$
$\sigma^2 = \sum (x - \mu)^2 P(x) = 35/12 \approx 2.92$
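The mean and variance of a single fair die follow directly from the definitions of expected value and variance. A minimal sketch using exact fractions:

```python
from fractions import Fraction

faces = range(1, 7)
p = Fraction(1, 6)                               # each face is equally likely

mu = sum(x * p for x in faces)                   # expected value: sum of x * P(x)
var = sum((x - mu) ** 2 * p for x in faces)      # variance: sum of (x - mu)^2 * P(x)

print(mu, var)                                   # exact fractions
print(float(mu), round(float(var), 2))           # decimal form
```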


Sampling Distribution of Two Dice


A sampling distribution is created by looking at all samples of size n=2 (i.e. two dice) and their means.

While there are 36 possible samples of size 2, there are only 11 values for $\bar{x}$, and some (e.g. $\bar{x} = 3.5$) occur more frequently than others (e.g. $\bar{x} = 1$).

Sampling Distribution of Two Dice


The sampling distribution of $\bar{x}$ is shown below:

$\bar{x}$   $P(\bar{x})$
1.0   1/36
1.5   2/36
2.0   3/36
2.5   4/36
3.0   5/36
3.5   6/36
4.0   5/36
4.5   4/36
5.0   3/36
5.5   2/36
6.0   1/36

[Figure: bar chart of $P(\bar{x})$ against $\bar{x}$, rising from 1/36 at 1.0 to 6/36 at 3.5 and falling back to 1/36 at 6.0.]
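The two-dice sampling distribution can be rebuilt by brute-force enumeration of all 36 ordered samples. This is an illustrative sketch and the variable names are arbitrary.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

samples = list(product(range(1, 7), repeat=2))            # all 36 ordered pairs
counts = Counter(Fraction(a + b, 2) for a, b in samples)  # sample mean of each pair

# Probability of each distinct sample mean.
dist = {mean: Fraction(c, len(samples)) for mean, c in sorted(counts.items())}

mean_of_means = sum(m * p for m, p in dist.items())
var_of_means = sum((m - mean_of_means) ** 2 * p for m, p in dist.items())

for mean in dist:
    print(f"{float(mean):.1f}  {counts[mean]}/36")
print(mean_of_means, var_of_means)
```

The enumeration gives 11 distinct sample means. Their mean equals the single-die mean (7/2), and their variance (35/24) is half the single-die variance (35/12).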


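Compare

Compare the distribution of X with the sampling distribution of $\bar{X}$.

[Figure: the distribution of X (flat over 1 to 6) beside the sampling distribution of $\bar{X}$ (peaked at 3.5, horizontal axis from 1.0 to 6.0 in steps of 0.5).]

As well, note that:

$\mu_{\bar{x}} = \mu$ and $\sigma^2_{\bar{x}} = \sigma^2 / 2$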


Generalize
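We can generalize the mean and variance of the sampling distribution of two dice:

$\mu_{\bar{x}} = \mu$ and $\sigma^2_{\bar{x}} = \sigma^2 / 2$

to n dice:

$\mu_{\bar{x}} = \mu$ and $\sigma^2_{\bar{x}} = \sigma^2 / n$

The standard deviation of the sampling distribution is called the standard error:

$\sigma_{\bar{x}} = \sigma / \sqrt{n}$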
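A short sketch of how the standard error shrinks as the number of dice grows, taking the single-die variance of 35/12 as given. The helper name `standard_error` and the sample sizes shown are illustrative choices.

```python
import math

sigma = math.sqrt(35 / 12)      # standard deviation of one fair die

def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)

for n in (1, 2, 4, 25, 100):
    print(f"n = {n:>3}: standard error = {standard_error(sigma, n):.3f}")
```

Quadrupling the sample size halves the standard error, since the error falls with the square root of n rather than with n itself.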
