
Statistical Concepts And Their Applications In Business

Lesson 2

Copyright 2014, Simplilearn, All rights reserved.



Agenda
After completing this course, you will be able to understand:

Statistical Methods overview

Population and Samples

Developing a sampling plan and Sampling Methods

What is Descriptive Statistics

What are its components

Business usage of Descriptive Statistics via a Case Study

Probability theory and distributions

Confidence Interval

The concept of tests of significance

One sided and two sided hypothesis testing

The various tests of significance

Non parametric testing



Statistical Methods

Statistics is a branch of applied (business) mathematics that is used to estimate the present and predict the future.

Descriptive Statistics
    Sample
    Measure of Central Tendency
    Measure of Dispersion

Inferential Statistics
    Population
    Estimation
    Hypothesis Testing

Population and Samples


A population is any entire collection of objects or observations from which we may collect data. It is
the entire group we are interested in, which we wish to describe or draw conclusions about.
For each population there are many possible samples.

It is important that the investigator carefully and completely defines the population before
collecting the sample, including a description of the members to be included.
A sample is a group of units selected from a larger group (the population). By studying the sample it
is hoped to draw valid conclusions about the larger group.
A sample is generally selected for study because the population is too large to study in its entirety.
The sample should be representative of the general population. This is often best achieved by
random sampling.

Developing a sampling plan


Define the target population: in terms of number of elements, sampling unit, extent and time.
Select a sampling method: probability or non-probability sampling.
Obtain the sampling frame: it must contain all the potential elements.
Determine the sample size: for the desired level of accuracy.
Choose a data collection method: the procedure to obtain the data.
Develop an operational plan: decide which technique fits best.
Execute the operational plan: verify the specified procedure.

Sampling techniques

Probability sampling
    Simple Random
    Systematic
    Stratified
    Cluster

Non-Probability sampling
    Convenience
    Judgmental
    Quota
    Snowball
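Three of the probability sampling techniques can be sketched in a few lines of Python. This is only an illustration: the sampling frame of 100 customer records, the "region" field and the function names are all hypothetical and do not come from the lesson.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: 100 customer records, each tagged with a region.
population = [{"id": i, "region": "North" if i % 4 == 0 else "South"}
              for i in range(100)]

def simple_random_sample(frame, n):
    """Simple random: every unit has the same chance of selection."""
    return random.sample(frame, n)

def systematic_sample(frame, n):
    """Systematic: pick a random start, then take every k-th unit."""
    k = len(frame) // n
    start = random.randrange(k)
    return frame[start::k][:n]

def stratified_sample(frame, key, fraction):
    """Stratified: draw the same fraction at random from each stratum."""
    strata = {}
    for unit in frame:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(random.sample(members, round(len(members) * fraction)))
    return sample

srs = simple_random_sample(population, 10)
sys_sample = systematic_sample(population, 10)
strat = stratified_sample(population, "region", 0.2)
```

The stratified sample keeps the 25:75 split between the two hypothetical regions, which a simple random sample of the same size does not guarantee.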

Descriptive Statistics

Help describe, show and summarize data in a meaningful manner
Non-conclusive, as the conclusions are limited to the data being analysed

Score Range    Number of Students
Below 40       20
40-50          22
50-60          33
60-70          21
70-80          13
>80            5
Total          114

[Bar chart: Number of Students by Score Range]

Measure of Central Tendency

Identify a data set with a single value that represents its center
Also called measures of central location

Measures of central tendency: Mean, Median, Mode

Mean

The mean is the average of the numbers: a calculated "central" value of a set of numbers.

Median

The median is the number in the middle of the sorted data.
The number of values above and below the median is the same.

Mode

The mode is the value that occurs most often.

A set of data can have more than one mode.
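As a quick check of the three measures, the sketch below uses Python's standard `statistics` module on the six values 2, 5, 5, 4, 6, 8 (the same example data as the variance slide). The extra value 100 is a made-up outlier, added only to show why the median is preferred when there are extreme values.

```python
import statistics

data = [2, 5, 5, 4, 6, 8]

mean = statistics.mean(data)          # sum of the values / number of values
median = statistics.median(data)      # middle value of the sorted data
modes = statistics.multimode(data)    # every value with the highest frequency

# A hypothetical extreme value pulls the mean but leaves the median unchanged.
with_outlier = data + [100]
mean_out = statistics.mean(with_outlier)
median_out = statistics.median(with_outlier)
```

`multimode` returns a list because, as noted above, a data set can have more than one mode.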



When to use what?


Mean:
The average is required
The variable is continuous / discrete
Median:
The variable is discrete
There are abnormal extreme values / Non-normal data
The characteristic under study is qualitative
Mode:
The variable is discrete
There are abnormal extreme values
The characteristic under study is qualitative


Measure of Dispersion

The spread or dispersion of a set of scores around some central value


Describes the amount of heterogeneity or variation within a distribution of scores
Measures of dispersion: Variance, Standard Deviation

Variance and Standard Deviation

Variance is an average of squared deviations about the mean.
Standard deviation is the square root of the variance.

Example data: 2, 5, 5, 4, 6, 8
n = 6
Mean = (2+5+5+4+6+8)/6 = 5
Sum of squared deviations: $(2-5)^2 + (5-5)^2 + (5-5)^2 + (4-5)^2 + (6-5)^2 + (8-5)^2 = 20$
Variance: $\frac{20}{5} = 4$ (dividing by n - 1 = 5, the sample variance)
Standard Deviation: $\sqrt{4} = 2$
Dividing by n = 6 instead gives the population variance, 20/6 ≈ 3.33.
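The arithmetic in the example above can be reproduced step by step. The sketch below computes the variance both ways (dividing by n - 1 and dividing by n) and checks the results against Python's standard `statistics` module; the variable names are illustrative.

```python
import math
import statistics

data = [2, 5, 5, 4, 6, 8]

mean = sum(data) / len(data)                       # 5.0
squared_deviations = [(x - mean) ** 2 for x in data]
ss = sum(squared_deviations)                       # 20.0

sample_variance = ss / (len(data) - 1)             # divide by n - 1 -> 4.0
population_variance = ss / len(data)               # divide by n
sample_sd = math.sqrt(sample_variance)             # 2.0

# The standard library agrees with the hand calculation.
assert math.isclose(sample_variance, statistics.variance(data))
assert math.isclose(population_variance, statistics.pvariance(data))
```

Which divisor to use depends on whether the data are treated as a sample (n - 1) or as the whole population (n).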

Case Study: Descriptive Statistics


Business Case: A telecommunications company maintains a customer database that includes, among
other things, information on how much each customer spent on long distance, toll-free, equipment
rental, calling card, and wireless services in the previous month.
The telecom company surveyed 1000 of its customers on all the above services.
Use Descriptive analysis to study customer spending to determine which services are most profitable.


Case Study: Descriptive Statistics (Contd.)

Service                    N      Valid N   Min    Max      Mean    Standard Deviation
Long distance last month   1000   1000      0.90   99.95    11.72   10.36
Toll free last month       1000   475       0.00   173.00   13.27   16.90
Equipment last month       1000   386       0.00   77.70    14.21   19.07
Calling card last month    1000   678       0.00   109.25   13.78   14.08
Wireless last month        1000   296       0.00   111.95   11.58   19.72

On average, customers spend the most on equipment rental, but there is a lot of variation in the
amount spent.
Customers with calling card service spend only slightly less, on average, than equipment rental
customers, and there is much less variation in the values.
The real problem here is that most customers don't have every service, so a lot of zeros are being counted. One solution to this problem is to treat zeros as missing values, so that the analysis for each service becomes conditional on having that service.
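The effect of treating zeros as missing values can be seen on a small made-up example. The ten spend values below are hypothetical and are not taken from the telecom data set.

```python
import statistics

# Hypothetical monthly spend on one service; 0 means the customer does not have it.
spend = [0, 0, 12.5, 0, 30.0, 0, 0, 22.5, 0, 15.0]

mean_all = statistics.mean(spend)        # zeros drag the average down

# Treat 0 as missing: keep only the customers who actually have the service.
valid = [x for x in spend if x > 0]
valid_n = len(valid)
mean_valid = statistics.mean(valid)      # average conditional on having the service
```

Here the mean over all customers is 8.0, while the mean over the 4 customers who have the service is 20.0.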

Probability Theory
Probability is a branch of mathematics that deals with the uncertainty of an event happening in the future.
A probability value always lies in the range 0 to 1.
Probability of an event: $P(E) = \frac{\text{No. of favorable occurrences}}{\text{No. of possible occurrences}}$
Example: a toss of a fair coin has two possible outcomes, HEAD and TAIL, so P(HEAD) = 1/2.

Assigning Probabilities
Classical method: based on equally likely outcomes.
    E.g.: Rolling a die.
Relative frequency method: based on experimentation or historical data.
    E.g.: A car agency has 5 cars. Its past record, shown in the table, gives the number of cars used on each of the past 60 days.

    No. of cars used   No. of days   Probability
    0                  3             (3/60) = 0.05
    1                  10            (10/60) = 0.17
    2                  16            (16/60) = 0.27
    3                  15            (15/60) = 0.25
    4                  9             (9/60) = 0.15
    5                  7             (7/60) = 0.12

Subjective method: based on judgment.
    E.g.: 75% chance that England will adopt the Euro currency by 2020.
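The relative frequency method is simply a count divided by a total. The sketch below applies it to the car agency record. The day counts come from the table; pairing them with 0 to 5 cars used is an assumption, since the first column did not survive in the source.

```python
# Days (out of 60) on which 0, 1, ..., 5 cars were used.
# The keys 0-5 are an assumed labelling of the six rows of the table.
days_by_cars_used = {0: 3, 1: 10, 2: 16, 3: 15, 4: 9, 5: 7}

total_days = sum(days_by_cars_used.values())   # 60

# Relative frequency: days with that outcome / total days observed.
probabilities = {cars: days / total_days
                 for cars, days in days_by_cars_used.items()}

# Example use: probability that at least 4 cars are used on a given day.
p_at_least_4 = probabilities[4] + probabilities[5]
```

Because every day falls into exactly one row, the probabilities add up to 1.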

Probability Distribution
The probability distribution for a random variable describes how the probabilities are distributed over the values of that random variable.
It is defined by a function f(x), which gives the probability of each value.

E.g.: Suppose we have sales data for AC units over the last 300 days.

Units sold   No. of days   Probability of units sold, f(x)
0            10            0.03
1            55            0.18
2            150           0.50
3            55            0.18
4            25            0.08
5            5             0.02

[Bar chart: Probability of units sold, f(x), by units sold]

Binomial Distribution
Discrete probability distribution
Following conditions should be satisfied
A fixed number of trials
Each trial is independent of the others
The probability of each outcome remains constant from trial to trial.
Examples
Tossing a coin 10 times for occurrences of head
Surveying a population of 100 people to know if they watch television or not
Rolling a die to check for occurrence of a 2


Case Study: Binomial Distribution

Example of binomial distribution: Amir buys a chocolate bar every day during a promotion that says one out of six chocolate bars has a gift coupon inside. Answer the following questions:
1. What is the distribution of the number of chocolate bars with gift coupons in seven days?
2. What is the probability that Amir gets no gift coupons in seven days?
3. Amir gets no gift coupons for the first six days of the week. What is the chance that he will get one on the seventh day?
4. Amir buys a bar every day for six weeks. What is the probability that he gets at least three gift coupons?
5. How many days of purchase are required so that Amir's chance of getting at least one gift coupon is 0.95 or greater?

(Assume that the conditions of the binomial distribution apply: the outcomes of Amir's purchases are independent, and the population of chocolate bars is effectively infinite.)

Case Study: Binomial Distribution (contd.)

Steps:
Formula: $P(X = r) = {}^{n}C_{r}\, p^{r} q^{n-r}$
where n is the number of trials, r is the number of successful outcomes, p is the probability of success, and q is the probability of failure.
Other important formulae include
$p + q = 1$
Hence, $q = 1 - p$
Thus,
$p = 1/6$
$q = 5/6$

Case Study: Binomial Distribution (contd.)

1. Distribution of the number of chocolate bars with gift coupons in 7 days: $P(X = r) = {}^{7}C_{r} (1/6)^{r} (5/6)^{7-r}$
2. Probability of getting no coupon in 7 days: $P(X = 0) = (5/6)^{7} \approx 0.279$
3. Probability of winning a coupon on the 7th day: 1/6, because the purchases are independent
4. Probability of winning at least 3 coupons in six weeks (n = 42):
   $P(X \geq 3) = 1 - P(X \leq 2)$
   $= 1 - (P(X = 0) + P(X = 1) + P(X = 2))$
   $= 1 - (0.0005 + 0.0040 + 0.0163)$
   $= 0.979$
5. Number of purchase days required so that the probability of at least one coupon is 0.95 or greater:
   $P(X \geq 1) \geq 0.95$
   $1 - P(X = 0) \geq 0.95$
   $(5/6)^{n} \leq 0.05$
   $n \geq \log(0.05) / \log(5/6) \approx 16.43$ (applying the log function)
   n = 17 days.
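The answers to questions 2, 4 and 5 can be checked numerically. The sketch below implements the binomial formula with `math.comb` from the Python standard library; `binomial_pmf` is an illustrative helper name, not part of the lesson.

```python
import math

def binomial_pmf(r, n, p):
    """P(X = r) for n independent trials with success probability p."""
    return math.comb(n, r) * p**r * (1 - p) ** (n - r)

p = 1 / 6   # chance that a bar contains a coupon

# Question 2: no coupon in 7 days.
p_none_in_7 = binomial_pmf(0, 7, p)

# Question 4: at least 3 coupons in six weeks (42 days).
p_at_least_3_in_42 = 1 - sum(binomial_pmf(r, 42, p) for r in range(3))

# Question 5: smallest n with 1 - (5/6)^n >= 0.95,
# i.e. n >= log(0.05) / log(5/6).
days_needed = math.ceil(math.log(0.05) / math.log(5 / 6))
```

Rounding up is needed in question 5 because 16 days would leave the chance of at least one coupon just under 0.95.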

Normal Distribution
Theoretical model of the whole population
Centered around the mean and symmetrical on both sides
Standard normal distribution: mean 0 and standard deviation 1

Poisson distribution
Discrete probability distribution for events that happen randomly in time
The following conditions need to be satisfied:
    The event results in a success or a failure
    The average number of successes, $\lambda$, is known
    The probability of success is proportional to the size of the region/time interval
    The probability of success in an extremely small region/time interval is almost zero
Properties: the mean and the variance are equal, and both are denoted by $\lambda$.
Examples
    The average number of houses sold by a company is 5 per day. What is the probability that exactly 4 houses will be sold tomorrow?
    The average number of births in a hospital is 2.1 births per hour. What is the probability that there will be exactly 6 births in the next two hours?
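Both example questions can be answered with the standard Poisson probability formula, $P(X = k) = e^{-\lambda} \lambda^{k} / k!$. The formula is not stated on the slide, so treat the sketch below as a supplement; `poisson_pmf` is an illustrative helper name.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) when the average number of events in the interval is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Houses: average 5 per day, exactly 4 tomorrow.
p_four_houses = poisson_pmf(4, 5)

# Births: average 2.1 per hour, so 4.2 over two hours; exactly 6 births.
p_six_births = poisson_pmf(6, 2.1 * 2)
```

In the second example the rate is scaled to the two-hour interval before the formula is applied, which follows from the condition that the probability is proportional to the length of the interval.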

Skewness and Kurtosis

Skewness: measure of deviation from symmetry
    Difference between median and mean
    Right or left skewed
    Negative skewness: more negative values (left skewed)
    Positive skewness: more positive values (right skewed)
Kurtosis: measure of peakedness of the distribution
    High kurtosis: tall peak, rapid decline in the tails
    Low kurtosis: flat peak, gradual decline in the tails
    Extreme case: uniform distribution
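One common way to compute the two measures is from the central moments of the data. The sketch below uses the simple moment-based definitions (skewness as the third moment over the 1.5 power of the second, excess kurtosis as the fourth moment over the square of the second, minus 3). Statistical packages often apply small-sample corrections, so their output can differ slightly. The two data sets are made up.

```python
def central_moment(data, k):
    """Average of (x - mean)^k over the data."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)

def skewness(data):
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    # Subtracting 3 makes the normal distribution score 0.
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]          # evenly spread, like a uniform distribution
right_skewed = [1, 1, 2, 2, 3, 20]   # one large value stretches the right tail

skew_symmetric = skewness(symmetric)
skew_right = skewness(right_skewed)
kurt_symmetric = excess_kurtosis(symmetric)
```

The evenly spread data has zero skewness and negative excess kurtosis (a flat shape), while the data with one large value has positive skewness.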

Case Study: Skewness and Kurtosis

                                  Skewness                 Kurtosis
Service                    N      Statistic   Std. Error   Statistic   Std. Error
Long distance last month   1000   2.966       0.077        14.012      0.155
Toll free last month       475    3.465       0.112        26.735      0.224
Equipment last month       386    0.756       0.124        0.641       0.248
Calling card last month    678    2.150       0.094        7.572       0.187
Wireless last month        296    1.359       0.142        3.079       0.282

Equipment last month has the lowest skewness and kurtosis of the five services, so its distribution is the closest to symmetric.

Conclusion: Equipment is the segment where the telecom company is earning more profit than in the others, and where it can invest more.

Confidence interval
A confidence interval is a rule for determining, from sample information, an interval that is likely to include a population parameter.
Suppose random samples are taken repeatedly from the population and an interval is calculated from each sample. A certain percentage of these intervals contain the unknown value of the parameter.
If 95% of the intervals calculated in this fashion contain the true value of the unknown parameter, each interval is called a 95% confidence interval for that parameter.

Data Requirements
    Confidence level
    Statistic
    Margin of error
Range of the confidence interval = sample statistic ± margin of error.
The uncertainty associated with the confidence interval is specified by the confidence level.

Constructing a Confidence Interval

Identify a sample statistic: choose the statistic that will be used to estimate a population parameter.
Select a confidence level: it describes the uncertainty of the sampling method.
Find the margin of error:
    Margin of error = Critical value * Standard error of the statistic
Specify the confidence interval: the range of the confidence interval is defined by the following equation.
    Confidence interval = sample statistic ± Margin of error
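The four steps can be followed in code for a sample mean. The sketch below is a minimal example under stated assumptions: the ten bill amounts are hypothetical, and the critical value is taken from the standard normal distribution via `statistics.NormalDist`. For a sample this small, a t critical value would usually be preferred.

```python
import math
import statistics

# Hypothetical sample of 10 monthly bills.
sample = [11.2, 9.8, 12.5, 10.1, 13.0, 8.7, 11.9, 10.4, 12.2, 9.6]

# Step 1: sample statistic (here, the mean).
sample_mean = statistics.mean(sample)

# Step 2: confidence level.
confidence_level = 0.95

# Step 3: margin of error = critical value * standard error.
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
critical_value = statistics.NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
margin_of_error = critical_value * standard_error

# Step 4: confidence interval = sample statistic ± margin of error.
interval = (sample_mean - margin_of_error, sample_mean + margin_of_error)
```

For a 95% confidence level the normal critical value is about 1.96; a higher confidence level gives a larger critical value and therefore a wider interval.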

Tests of Significance
Tests used in assessing the evidence in favor of or against a given assumption
Begin with a Null Hypothesis, H0
Tests either retain (fail to reject) the null hypothesis, or reject it in favor of an Alternate Hypothesis, Ha
Two types of tests
    One sided tests
    Two sided tests
Results decided by calculating the p-value
Interpretation:
    If the p-value is less than the significance level $\alpha$, reject the null hypothesis.
    Common values of $\alpha$ are 0.05 and 0.01.
General Assumptions:
    The distribution is approximately normal
    The samples have approximately equal variances

One Sided Hypothesis Testing

$\mu_0$ = null value
Null hypothesis: $\mu = \mu_0$
Alternative hypothesis: $\mu < \mu_0$ OR $\mu > \mu_0$
Example: Given a sample of heights of 100 males in New York, decide whether the height has increased in general from a given average height of 5 feet 9 inches.
    Null Value: $\mu_0$ = 5 feet 9 inches
    Null Hypothesis: $\mu$ = 5 feet 9 inches
    Alternative Hypothesis: $\mu$ > 5 feet 9 inches
    Using one of the various hypothesis tests, calculate the p-value and reject the null hypothesis if the p-value is less than 0.05.

Two Sided Hypothesis Testing

$\mu_0$ = null value
Null hypothesis: $\mu = \mu_0$
Alternative hypothesis: $\mu \neq \mu_0$
Example: Given a sample of heights of 100 males in New York, decide whether the height has changed (increased or decreased) in general from a given average height of 5 feet 9 inches.
    Null Value: $\mu_0$ = 5 feet 9 inches
    Null Hypothesis: $\mu$ = 5 feet 9 inches
    Alternative Hypothesis: $\mu \neq$ 5 feet 9 inches
    Using one of the various hypothesis tests, calculate the p-value and reject the null hypothesis if the p-value is less than 0.05.
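The difference between the one sided and two sided p-values can be shown with a one sample z-test on the height example. The sample mean, the population standard deviation and the use of inches are all hypothetical; the lesson gives only the null value and the sample size.

```python
import math
from statistics import NormalDist

# Hypothetical numbers, heights in inches, population SD assumed known.
mu_0 = 69.0          # null value: 5 feet 9 inches
sample_mean = 70.0   # assumed mean of the 100 sampled heights
sigma = 3.0          # assumed population standard deviation
n = 100
alpha = 0.05

# Test statistic: distance of the sample mean from mu_0 in standard errors.
z = (sample_mean - mu_0) / (sigma / math.sqrt(n))

p_one_sided = 1 - NormalDist().cdf(z)              # Ha: mu > mu_0
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))   # Ha: mu != mu_0

reject_one_sided = p_one_sided < alpha
reject_two_sided = p_two_sided < alpha
```

The two sided p-value is twice the one sided value, so a result can be significant in a one sided test and not in the two sided test on the same data.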

Tests of Significance
One sample z-test
Two sample z-test
One sample t-test
Two sample t-test
Paired t-test
Chi Squared test
F test - Analysis of Variance (ANOVA)
F test - Regression

Chi-Squared tests

Compare the observed result against an expected result based on a hypothesis
Steps:
    State the null hypothesis
    Prepare the contingency table for the variable
    Determine the expected results
    Calculate the chi-squared value
    Calculate the degrees of freedom
    Based on the above, calculate the p-value
    If p-value < 0.05, reject the null hypothesis.
Test of independence:
    Verify if two variables are independent
    Same steps as above.

Case Study: Chi-Squared Test

A city has a newly opened nuclear plant, and there are families staying dangerously close to the plant. A health safety officer wants to take this case up to provide relocation for the families that live in the surrounding area. To make a strong case, he wants to prove with numbers that exposure to radiation is leading to an increase in disease in the population. He formulates a contingency table of exposure and disease.
Does the data suggest an association between the disease and exposure?

               Disease
Exposure   Yes    No     Total
Yes        37     13     50
No         17     53     70
Total      54     66     120

Case Study: Chi-Squared Test (contd.)

Steps:
Calculate the number of individuals in the exposed and unexposed groups expected in each disease category (yes and no) if the probabilities were the same.
    If there were no effect of exposure, the probabilities should be the same and the chi-squared statistic would have a very low value.
    Proportion of population exposed = (50/120) = 0.42
    Proportion of population not exposed = (70/120) = 0.58
Thus, expected values (calculated with the unrounded proportions):
    Population with disease = 54
        Exposure Yes: 54 * (50/120) = 22.5
        Exposure No: 54 * (70/120) = 31.5
    Population without disease = 66
        Exposure Yes: 66 * (50/120) = 27.5
        Exposure No: 66 * (70/120) = 38.5

Case Study: Chi-Squared Test (contd.)

Calculate the Chi-squared statistic
$\chi^2 = \sum \frac{(O - E)^2}{E}$
$= \frac{(37 - 22.5)^2}{22.5} + \frac{(17 - 31.5)^2}{31.5} + \frac{(13 - 27.5)^2}{27.5} + \frac{(53 - 38.5)^2}{38.5}$
$= 29.1$
where O is an observed count and E is the corresponding expected count.
Calculate the degrees of freedom:
    (Number of rows - 1) x (Number of columns - 1)
    df = (2 - 1) x (2 - 1) = 1
Calculate the p-value from the chi-squared table
    For chi-squared value 29.1 and degrees of freedom = 1, from the table, p-value is < 0.001
Interpretation: There is less than a 0.001 chance of obtaining such discrepancies between expected and observed values if there is no association.
Conclusion: There is an association between the exposure and the disease.
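The whole case study can be reproduced from the contingency table. The sketch below computes the expected counts, the chi-squared statistic and the degrees of freedom in plain Python. The p-value line uses a closed form (`math.erfc`) that is valid only for 1 degree of freedom; it stands in for the table lookup and is not the method shown in the lesson.

```python
import math

# Observed counts from the contingency table.
observed = [[37, 13],   # exposed:     disease yes, disease no
            [17, 53]]   # not exposed: disease yes, disease no

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-squared = sum over all cells of (observed - expected)^2 / expected.
chi_squared = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2) for j in range(2)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 1 only: P(chi-squared > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_squared / 2))
```

The computed statistic rounds to 29.1 and the p-value is far below 0.001, in line with the conclusion above.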

ANOVA
Analysis of Variance: used to compare more than two groups
Extension of the independent t-tests
Factor variable: the variable defining the groups
Response variable: the variable being compared

One way ANOVA
    Groups of a single variable
    E.g.: Is there a difference in students' scores based on the row in which they are seated (front/middle/back)?
Two way ANOVA
    Two independent variables
    E.g.: Do race and gender affect a person's yearly income?

Case Study: One Way ANOVA

Marks obtained in the same subject by three students from each of three different schools are given below.
Does the data suggest any association between schools and marks?

School    A     B     C
Marks     82    83    38
          83    78    59
          97    68    55

Basic Idea: Partition the total variation in the data into the variation between groups and the variation within groups.

Case Study: One Way ANOVA (contd.)

Steps:
Calculate the means
    School A: mean(82, 83, 97) = 87.3
    School B: mean(83, 78, 68) = 76.3
    School C: mean(38, 59, 55) = 50.7
Calculate the grand mean
    Grand mean = mean(82, 83, 97, 83, 78, 68, 38, 59, 55) = 71.4
Calculate the variations
    Sum of squared deviations of all observed values about the grand mean: $SS_{Total}$ = 2630.2
    Sum of squared deviations of the three group means about the grand mean, weighted by group size: $SS_{Between}$ = 2124.2
    Sum of squared deviations of the observations within a group about their group mean, added across all groups: $SS_{Within}$ = 506

Case Study: One Way ANOVA (contd.)

Calculate the degrees of freedom for every variance
    $df_{Total}$ = Number of observations - 1 = 9 - 1 = 8
    $df_{Between}$ = Number of groups - 1 = 3 - 1 = 2
    $df_{Within}$ = Number of observations - Number of groups = 9 - 3 = 6
Calculate the mean squared variances
    Mean squared variance between groups: $MS_{Between} = SS_{Between} / df_{Between}$ = 2124.2/2 = 1062.1
    Mean squared variance within groups: $MS_{Within} = SS_{Within} / df_{Within}$ = 506/6 = 84.3
Calculate the F-statistic
    F-value: $MS_{Between} / MS_{Within}$ = 1062.1/84.3 = 12.59
Calculate the p-value from the F-table
    The p-value for the F-value 12.59 with degrees of freedom 2 and 6 is 0.007
Conclusion: Since the p-value is less than alpha, we reject the null hypothesis and conclude that there is a difference in the marks obtained by students belonging to different schools.
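The sums of squares, the F-value and the p-value can be recomputed from the marks. The sketch below follows the same steps in plain Python. The last line replaces the F-table lookup with a closed form that holds only when the between-groups degrees of freedom equal 2, as they do here; it is a convenience for this example, not a general method.

```python
# Marks by school, from the case study.
groups = {
    "A": [82, 83, 97],
    "B": [83, 78, 68],
    "C": [38, 59, 55],
}

all_marks = [m for marks in groups.values() for m in marks]
grand_mean = sum(all_marks) / len(all_marks)
group_means = {g: sum(marks) / len(marks) for g, marks in groups.items()}

# Variation of the group means about the grand mean, weighted by group size.
ss_between = sum(len(marks) * (group_means[g] - grand_mean) ** 2
                 for g, marks in groups.items())

# Variation of each observation about its own group mean.
ss_within = sum((m - group_means[g]) ** 2
                for g, marks in groups.items() for m in marks)

df_between = len(groups) - 1
df_within = len(all_marks) - len(groups)

f_value = (ss_between / df_between) / (ss_within / df_within)

# Valid only for df_between == 2: P(F > f) = (1 + 2f/df_within)^(-df_within/2).
p_value = (1 + 2 * f_value / df_within) ** (-df_within / 2)
```

The results match the slide: the sums of squares are about 2124.2 and 506, the F-value is about 12.59 and the p-value is about 0.007.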

Non Parametric Testing

Referred to as distribution free, as they do not assume that the data follow any particular distribution.
They have lower power than the parametric tests and hence are usually the second preference after the parametric tests.
These tests are typically focused on the median rather than the mean.
They involve straightforward procedures like counting and ordering.
There is at least one non-parametric test for each parametric test, and they are classified into the following categories.
    Tests of differences between groups (independent samples)
    Tests of differences between variables (dependent samples)
    Tests of relationships between variables.

Non Parametric Tests

Test: One Qualitative Response Variable
    Parametric: One Sample Test
    Non Parametric: Sign Test

Test: One Quantitative Response Variable, Two Values from Paired Samples
    Parametric: Paired sample t-test
    Non Parametric: Wilcoxon Signed Rank Test

Test: One Quantitative Response Variable, One Qualitative Independent Variable with two groups
    Parametric: Two Independent Sample t-test
    Non Parametric: Wilcoxon Rank Sum or Mann-Whitney Test

Test: One Quantitative Response Variable, One Qualitative Independent Variable with three or more groups
    Parametric: ANOVA
    Non Parametric: Kruskal-Wallis

Correlation
Measure of association between variables
Positive and negative correlation, ranging between +1 and -1
Positive correlation example:
    Earning and expenditure
Negative correlation example:
    Speed and time
Parametric: normal distribution and homogeneous variance
    Pearson correlation
Non parametric: no distributional assumptions, ordinal (ranked) variables
    Spearman correlation

Correlation coefficient
r : correlation coefficient
+1 : Perfectly positive
-1 : Perfectly negative

Strength of association, by the absolute value of r:
    0 to 0.2: No or very weak association
    0.2 to 0.4: Weak association
    0.4 to 0.6: Moderate association
    0.6 to 0.8: Strong association
    0.8 to 1: Very strong to perfect association
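The Pearson coefficient and the strength bands can be tried out on small made-up data sets. In the sketch below all four data lists are hypothetical, `pearson_r` and `strength` are illustrative helper names, and values that fall exactly on a band boundary are assigned to the higher band by assumption.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def strength(r):
    """Label the absolute value of r using the bands listed above."""
    size = abs(r)
    if size < 0.2:
        return "no or very weak"
    if size < 0.4:
        return "weak"
    if size < 0.6:
        return "moderate"
    if size < 0.8:
        return "strong"
    return "very strong to perfect"

# Hypothetical data.
earning = [30, 40, 50, 60, 70]
expenditure = [22, 31, 35, 48, 52]
speed = [20, 40, 60, 80, 100]              # km/h
travel_time = [6.0, 3.0, 2.0, 1.5, 1.2]    # hours for a fixed 120 km trip

r_positive = pearson_r(earning, expenditure)
r_negative = pearson_r(speed, travel_time)
```

Earning and expenditure rise together, so r is positive; travel time falls as speed rises, so r is negative.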

Summary
Here is a quick recap of what we have learned in this lesson:

Overview of statistical methods

Descriptive statistics: Measures of Central Tendency and Measures of Dispersion

A business case study to understand the concepts of descriptive statistics

Probability distribution

What are tests of significance

The process flow of hypothesis testing

One sided and two sided hypothesis tests

Various tests used in calculating the p-value

What is non parametric testing and why is it used

Non parametric alternatives for the usual tests of significance


Quiz


QUIZ 1

Which of the following is NOT a measure of central tendency?

a. Mean
b. Median
c. Mode
d. Standard Deviation

Answer: d.
Explanation: Standard Deviation is used to measure dispersion and not to measure central tendency.

QUIZ 2

Calculate the mean, median and mode of the following data and choose the right option:
13, 3, 10, 9, 7, 10, 12, 8, 6

a. Mean = 8.67, Median = 9, Mode = 10
b. Mean = 9, Median = 9, Mode = 10
c. Mean = 9, Median = 10, Mode = 9
d. Mean = 8.67, Median = 10, Mode = 10

Answer: a.
Explanation: Mean is the average of all the values, median is the middle value and the mode is the most commonly occurring value.

QUIZ 3

Calculate the variance of the following data and choose the right option:
5, 10, 12, 4, 8, 9, 16

a. 15.41
b. 14.41
c. 9.14
d. 12.41

Answer: b.
Explanation: Variance is the average of squared deviations about the mean, given by $\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$. Here the mean is 64/7 ≈ 9.14 and the sum of squared deviations is about 100.86, so the variance is 100.86/7 ≈ 14.41.

QUIZ 4

From the research question below, choose the alternative hypothesis from the following options.
Given a sample of body temperatures of 50 persons, and the average temperature $\mu_0$, decide if the average body temperature has increased over time.

a. $\mu = \mu_0$
b. $\mu > \mu_0$
c. $\mu < \mu_0$
d.

Answer: b.
Explanation: The question forms a one sided hypothesis, checking if the average temperature has increased, that is, if $\mu > \mu_0$.

QUIZ 5

Choose the commonly used value for the significance level from the values given below.

a. 0.1
b. 0.5
c. 1.0
d. 0.05

Answer: d.
Explanation: The commonly used values for the significance level are 0.01 and 0.05.

QUIZ 6

Choose the right answer: Non parametric tests can also be referred to as?

a. Distribution free
b. Deviation free
c. Dispersion free
d. Decision free

Answer: a.
Explanation: Non parametric tests are distribution free.

QUIZ 7

Descriptive statistics measures which of the following?

a. Estimation
b. Hypothesis testing
c. Dispersion
d. Data mining

Answer: c.
Explanation: Descriptive statistics deals with the measure of dispersion.

QUIZ 8

Which of the following is not a condition of the binomial distribution?

a. A fixed number of trials
b. Each trial is independent of the others
c. The probability of each outcome remains constant from trial to trial.
d. Normal distribution

Answer: d.
Explanation: Normal distribution. The other three are conditions of the binomial distribution.

QUIZ 9

Probability of an event always lies between?

a. 0 and 1
b. -1 and 1
c. Negative and positive
d. Only positive

Answer: a.
Explanation: The probability of an event always lies between 0 and 1, where 0 means the event cannot occur and 1 means it is certain to occur.

QUIZ 10

Which is not a part of descriptive statistics?

a. Sample
b. Measure of central tendency
c. Measures of dispersion
d. Hypothesis testing

Answer: d.
Explanation: Hypothesis testing is not a part of descriptive statistics; it is a part of inferential statistics.

QUIZ 11

Which is used to calculate the central value of numbers?

a. Median
b. Mode
c. Mean
d. Standard deviation

Answer: c.
Explanation: Mean is used to calculate the central value or average of a given set of numbers.

QUIZ 12

Which is used to find the value with the highest frequency in a given set of numbers?

a. Median
b. Mode
c. Mean
d. Standard deviation

Answer: b.
Explanation: Mode is the value that occurs with the highest frequency in a given set of numbers.

QUIZ 13

Which is used to measure the dispersion in a given set of numbers?

a. Median
b. Mode
c. Mean
d. Standard deviation

Answer: d.
Explanation: Standard deviation is used to measure the dispersion in a given set of numbers.

QUIZ 14

What is the logical conclusion if the p-value is less than 0.05?

a. Accept null hypothesis
b. Reject null hypothesis
c. Reject alternate hypothesis
d. None of the above

Answer: b.
Explanation: If the p-value is less than 0.05, i.e., p < 0.05, we reject the null hypothesis.

QUIZ 15

Which is used to measure the peakedness of a distribution?

a. Skewness
b. Outlier
c. Kurtosis
d. Variance

Answer: c.
Explanation: Kurtosis is mainly used to measure the peakedness of the distribution of a particular data set.

QUIZ 16

Which is the measure of deviation from symmetry?

a. Skewness
b. Outlier
c. Kurtosis
d. Variance

Answer: a.
Explanation: Skewness is the measure of deviation from symmetry, and a distribution may be left skewed or right skewed.

Thank You
