
SRI GURU TEGH BAHADUR INSTITUTE OF MANAGEMENT & INFORMATION TECHNOLOGY (GGSIPU)

RESEARCH METHODOLOGY USING SPSS - PRACTICAL FILE

SUBMITTED TO:
Ms. Simranjeet Kaur
Assistant Professor
SGTBIMIT
GGSIPU

SUBMITTED BY:
Karmdeep
BBA (B&I) Mor(IV)
01290201817
SGTBIMIT

TABLE OF CONTENTS

Sr. No.  Title
1        Frequency Distribution - workers data
2        Descriptive Statistics - ice data
3        Outlier Testing - sports data
4        Normal Distribution Detection - monthly sales data
5        One Sample T-Test - weight loss program data
6        Paired Samples T-Test - impact of training program data
7        Independent Samples T-Test - average performance of employees data
8        One-way ANOVA - three companies sales data

LIST OF FIGURES

Sr. No.  Title
1.       Bar Diagram for Education
2.       Pie Chart for Education
3.       Cluster Bar Diagram
4.       Box Plot
5.       Sales Diagram

DATA SET 1: FREQUENCY DISTRIBUTION

Description: Workers
This data set consists of workers employed in small & medium enterprises in a city of India.

Objective:

• To represent the frequency distribution in tabular as well as graphical form.
• To calculate the frequency distribution and present a bar chart of the education profile of the workers.

Statistical Package Used:


IBM SPSS Statistics, Version 21

Statistical Technique Deployed:

a) Frequency Distribution
b) Bar Chart
c) Pie Chart
d) Crosstabs

The dataset of workers working in small & medium scale enterprises in a city of India is shown below in Table 1.1.

S.No Gender Age_group Religion Education S.No Gender Age_group Religion Education
1 1 1 3 2 26 1 5 3 2
2 1 4 2 1 27 1 1 1 2
3 1 3 3 4 28 1 5 2 2
4 1 3 1 3 29 1 1 2 4
5 2 4 1 1 30 1 5 2 2
6 1 4 1 1 31 1 2 3 5
7 2 2 1 1 32 1 3 2 1
8 1 2 3 1 33 2 2 2 2
9 1 2 2 1 34 1 5 2 1
10 2 2 2 2 35 2 5 1 2
11 1 3 1 2 36 2 5 2 3
12 1 3 1 3 37 2 2 3 4
13 1 4 1 4 38 2 5 2 3
14 2 1 2 3 39 1 3 3 3
15 1 5 2 2 40 1 5 2 2
16 2 2 2 2 41 1 2 1 1
17 1 1 1 5 42 1 2 3 1
18 1 5 1 5 43 1 3 2 1
19 1 5 2 2 44 1 5 2 5
20 1 2 2 5 45 1 2 1 2
21 2 5 2 2 46 2 5 2 3
22 1 2 2 1 47 2 2 1 2
23 1 2 3 1 48 1 3 3 4
24 1 2 1 5 49 1 4 2 4
25 2 5 2 5 50 2 1 1 1
Table 1.1: Data set of workers working in small & medium scale enterprises in city of India.

The coding details of the different variables in the dataset are shown below in Table 1.2.

Variables    Numeric codes
Gender       1 = Male
             2 = Female
Age group    1 = Less than 25 years
             2 = 26-35 years
             3 = 36-45 years
             4 = 46-55 years
             5 = 56 & above
Religion     1 = Hindu
             2 = Muslim
             3 = Other religion
Education    1 = Below 10th
             2 = High school
             3 = Intermediate
             4 = Technical diploma
             5 = Degree level
Table 1.2: Coding details of different variables
STEPS

STEP 1: In VARIABLE VIEW, define the column heads that will appear in DATA VIEW. In this case the first head is ‘Gender’; then choose its type, measure, values, labels, etc. as done below.

STEP 2: Similarly, assign the values for all the other variable names in Variable View as per the data given above.

STEP 3: Final output of Variable View.

Results shown in the Data View

• To display the value labels instead of the numeric codes in Data View, click the Value Labels button on the toolbar (the icon showing "A", an arrow and "1").

Observation 1: Frequency Distribution


STEP 4: Analyze > Descriptive Statistics > Frequencies

STEP 5: In the new dialogue box, choose any one variable and transfer it to the Variable(s) box.

For example: Education. Then click OK.
RESULT
FREQUENCIES

Statistics

EDUCATION
                            Frequency  Percent  Valid Percent  Cumulative Percent
Valid  BELOW TENTH GRADE           14     28.0           28.0                28.0
       HIGH SCHOOL                 16     32.0           32.0                60.0
       INTERMEDIATE                 7     14.0           14.0                74.0
       TECHNICAL DIPLOMA            6     12.0           12.0                86.0
       DEGREE LEVEL                 7     14.0           14.0               100.0
       Total                       50    100.0          100.0
Table 1.3: Shows the frequency distribution
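As a cross-check outside SPSS, the frequency table can be recomputed from the education codes listed in Table 1.1. The following is only a minimal Python sketch, not part of the SPSS workflow described in this file; the function name `frequency_table` and the shortened labels are assumptions made for illustration.

```python
from collections import Counter

# Education codes for S.No 1-50, transcribed from Table 1.1.
education = [
    2, 1, 4, 3, 1, 1, 1, 1, 1, 2, 2, 3, 4, 3, 2, 2, 5, 5, 2, 5, 2, 1, 1, 5, 5,
    2, 2, 2, 4, 2, 5, 1, 2, 1, 2, 3, 4, 3, 3, 2, 1, 1, 1, 5, 2, 3, 2, 4, 4, 1,
]

# Value labels taken from Table 1.2.
labels = {1: "Below 10th", 2: "High school", 3: "Intermediate",
          4: "Technical diploma", 5: "Degree level"}


def frequency_table(codes):
    """Return rows of (code, frequency, percent, cumulative percent)."""
    counts = Counter(codes)
    total = len(codes)
    rows, cumulative = [], 0.0
    for code in sorted(counts):
        percent = 100.0 * counts[code] / total
        cumulative += percent
        rows.append((code, counts[code], round(percent, 1), round(cumulative, 1)))
    return rows


table = frequency_table(education)
for code, freq, pct, cum in table:
    print(f"{labels[code]:<18} {freq:>3} {pct:>6.1f} {cum:>6.1f}")
```

The printed frequencies and percentages can be compared row by row with Table 1.3.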

STEP 6: For the graph, go to Graphs > Chart Builder.

STEP 7: From the gallery of the new dialogue box, select the type of graph (here a bar graph is selected).

STEP 8: Drag Education to the x-axis; Count will appear on the y-axis.

The bar graph will appear like this.

Fig 1.1: SPSS output in graphical form

STEP 9: PIE CHART for Education

Fig 1.2: Pie chart of education in count and percentage.

Observation 2: CROSSTABS

STEP 10: Analyze > Descriptive Statistics > Crosstabs.

STEP 11: Select the two variables (here Education for the rows and Gender for the columns).

RESULT

Education * Gender Crosstabulation

Count
                                    Gender
                                 Male  Female  Total
Education  Below tenth grade       11       3     14
           High school             10       6     16
           Intermediate             3       4      7
           Technical diploma        5       1      6
           Degree level             6       1      7
Total                              35      15     50
Table 1.4: Shows the cross tab of gender and education.
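The same cross tabulation can be reproduced by counting (education, gender) pairs from Table 1.1. This is a rough Python sketch under the assumption that the two code lists below were transcribed correctly from the table; the variable names `cells` and `crosstab` are illustrative only.

```python
from collections import Counter

# Gender and education codes for S.No 1-50, transcribed from Table 1.1.
gender = [
    1, 1, 1, 1, 2, 1, 2, 1, 1, 2, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 2,
    1, 1, 1, 1, 1, 1, 1, 2, 1, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 2,
]
education = [
    2, 1, 4, 3, 1, 1, 1, 1, 1, 2, 2, 3, 4, 3, 2, 2, 5, 5, 2, 5, 2, 1, 1, 5, 5,
    2, 2, 2, 4, 2, 5, 1, 2, 1, 2, 3, 4, 3, 3, 2, 1, 1, 1, 5, 2, 3, 2, 4, 4, 1,
]

# Count how often each (education, gender) combination occurs.
cells = Counter(zip(education, gender))

# One row per education code: (male count, female count).
crosstab = {edu: (cells[(edu, 1)], cells[(edu, 2)])
            for edu in sorted(set(education))}

for edu, (male, female) in crosstab.items():
    print(f"Education {edu}: male={male} female={female} total={male + female}")
```

Each printed row corresponds to one row of Table 1.4 (education codes 1 to 5 as defined in Table 1.2).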

CONCLUSION

A. Table 1.3: The frequency distribution of education shows the number of workers in each education category in a summarized manner.
B. Fig 1.1: The bar diagram graphically shows the count of workers in each education category.
C. Fig 1.2: The pie chart shows the education level of the 50 workers: Below 10th grade is 28%, High school is 32%, Intermediate is 14%, Technical diploma is 12% & Degree level is 14%.
D. Table 1.4: The cross tabulation of gender and education shows the joint distribution of these two variables. For example, of the 14 workers below tenth grade, 11 are male and 3 are female.

Hence, through the frequency distribution we find that the largest group of workers studied up to high school level, which is 32% (16 of the total 50 workers).

DATA SET 2: MEAN, MODE & RANGE
Introduction: Ice Melt

In this data set we estimate when the ice on the river will melt next year by observing and understanding past data.

Objectives:

1. To determine the mean and mode hour of the day at which the ice melts, using variables like hour, minute, month and year.

2. To determine the hour range for the ice melt and to determine in which month the ice melts most often.

Statistical Package Used:

IBM SPSS Statistics, Version 21

Statistical Techniques Deployed:

1. Mean
2. Mode
3. Bar chart
4. Pie chart
5. Range

STEPS

STEP 1: Fill the information in the Variable View.

Result on Data View.

STEP 2: To find the mean and mode hour of the day, go to Analyze > Descriptive Statistics > Frequencies.

STEP 3: Click Statistics and select Mean and Mode under Central Tendency.
RESULT

Statistics
hour of the day
N     Valid        90
      Missing       0
Mean            14.60
Mode               13
Table 2.1: Shows Mean & Mode hour of the days in which ice melt.

OUTPUT OF Descriptive Statistics

Descriptive Statistics
                      N  Minimum  Maximum   Mean  Std. Deviation
hour of the day      90        5       23  14.60           4.069
Valid N (listwise)   90
Table 2.2: Shows the descriptive statistics of hour of day of ice melt.
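The statistics reported in Tables 2.1 and 2.2 (mean, mode, minimum, maximum, standard deviation) can be illustrated with a short Python sketch. The 90 observations of the ice data are not reproduced in this file, so the list `hours` below is entirely hypothetical and is used only to show how each measure is computed.

```python
import statistics

# HYPOTHETICAL hours of the day (0-23); these are NOT the 90 observations
# used in the report, which are not listed in the text.
hours = [13, 9, 13, 15, 18, 13, 21, 11, 15, 12]

mean_hour = statistics.mean(hours)     # arithmetic mean
mode_hour = statistics.mode(hours)     # most frequent value
hour_range = max(hours) - min(hours)   # maximum minus minimum
sd_hour = statistics.stdev(hours)      # sample standard deviation (n - 1)

print(f"mean={mean_hour} mode={mode_hour} range={hour_range} sd={sd_hour:.3f}")
```

Applied to the figures in Table 2.2, the same range formula gives 23 - 5 = 18 hours.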

STEP 4: To determine the hour range using a bar graph, go to Graphs > Legacy Dialogs > Bar.

STEP 5: For the clustered bar graph, choose hour for the category axis and month for the clusters.

RESULT

Fig 2.1: Cluster bar diagram for hour of ice melt corresponding to months.

STEP 6: To determine in which month the ice melts most, go to Graphs > Legacy Dialogs > Pie.

Define pie chart.

Pie chart for the hour of the day in which the ice melts, in count, not in percentage.

Fig 2.2: Shows the pie chart for hours of ice melt in count.

Key Findings

1. Table 2.1 shows that the mean hour of ice melt is 14.60.
2. Table 2.2 shows the descriptive statistics: the minimum hour of ice melt is 5 and the maximum is 23, so the range is 18 hours. The S.D. is 4.069, which measures the dispersion of the ice-melting hour.
3. The mode month of ice melt is May, i.e. month 5, and the mode hour of ice melt is 13, i.e. 1 PM.
DATA SET 3: OUTLIERS
Description: Sports
This data set consists of males and females of different age categories and the time (hours) they spend playing their favourite outdoor sport.

Outlier:

• In statistics, an outlier is an observation point that is distant from other observations.
• An outlier may be due to variability in the measurement or it may indicate experimental error; the latter are sometimes excluded from the data set.
• An outlier can cause serious problems in statistical analysis.
• An outlier is an observation that lies outside the overall pattern of a distribution; usually the presence of an outlier indicates some sort of problem.
• This can be a case which does not fit the model under study or an error in measurement.

Objective:
a) To find the outlier (if any).
b) To understand the effect of the outlier on the measurement and replace it so that the distorted data can be rectified.
c) To correct the data.

Statistical package used:

IBM SPSS Statistics, Version 21

Statistical techniques deployed:

• Extreme Values and Stem-and-Leaf plot
• Box Plot
STEPS

STEP 1: In VARIABLE VIEW, define the column heads that will appear in DATA VIEW. In this case the first head is ‘Gender’; then choose its type, measure, values, labels, etc. as done below. The values given are 1 for male and 2 for female.

STEP 2: Similarly fill in the other variable names.

STEP 3: This is the final Variable View.

RESULT as per Data View.

STEP 4: For outliers, go to Analyze > Descriptive Statistics > Explore.

STEP 5: In the new dialogue box, drag the variable of interest (say, hours spent for playing) into the Dependent List. Click Statistics.

STEP 6: In the new dialogue box, tick Outliers. Click Continue.

Result of outliers.

Case Processing Summary
                                          Cases
                            Valid          Missing          Total
                           N  Percent     N  Percent     N  Percent
hours spent for playing   30   100.0%     0     0.0%    30   100.0%
Table 3.1: Shows the summary of data

Descriptives
                                                               Statistic  Std. Error
hours spent for playing  Mean                                      3.033       .4082
                         95% Confidence Interval  Lower Bound      2.198
                         for Mean                 Upper Bound      3.868
                         5% Trimmed Mean                           2.759
                         Median                                    2.750
                         Variance                                  4.999
                         Std. Deviation                           2.2358
                         Minimum                                      .5
                         Maximum                                    13.0
                         Range                                      12.5
                         Interquartile Range                         2.1
                         Skewness                                  3.146        .427
                         Kurtosis                                 13.662        .833
Table 3.2: Shows all the descriptive statistics of the data.

Extreme Values
                                       Case Number  Value
hours spent for playing  Highest  1             22   13.0
                                  2             29    5.0
                                  3              4    4.5
                                  4             26    4.5
                                  5             27    4.5
                         Lowest   1              8     .5
                                  2             30    1.0
                                  3             11    1.0
                                  4             10    1.0
                                  5             15    1.5a
Table 3.3: Shows the extreme values.

Hours spent for playing

Hours spent for playing Stem-and-Leaf Plot

Frequency Stem Leaf

1.00 0. 5
6.00 1. 000555
8.00 2. 00000555
7.00 3. 0000055
6.00 4. 000555
1.00 5. 0
1.00 Extremes (>=13.0)

Stem width: 1.0


Each leaf: 1 case(s)

BOX PLOT

Fig 3.1: Box plot

Key findings

From the Extreme Values table we find that:

1. The highest value is 13 at case number 22.
2. The lowest value is 0.5 at case number 8.
3. The extreme value is 13 at case number 22.

This shows that the respondent in case 22 reports spending 13 hours playing the favourite sport, far above the next highest value of 5 hours.

The box plot shows that case 22 is an outlier.
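The box-plot rule behind this finding can be sketched in Python. The 30 values below are rebuilt from the stem-and-leaf plot (so the case order is lost), and the quartiles come from Python's `statistics.quantiles`, which is assumed here to be close to, but not necessarily identical with, the hinges SPSS uses for its box plot. The helper name `tukey_outliers` is illustrative.

```python
import statistics

# Values rebuilt from the stem-and-leaf plot; case numbers are not recoverable.
hours = ([0.5] + [1.0] * 3 + [1.5] * 3 + [2.0] * 5 + [2.5] * 3
         + [3.0] * 5 + [3.5] * 2 + [4.0] * 3 + [4.5] * 3 + [5.0] + [13.0])


def tukey_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


outliers = tukey_outliers(hours)
mean_with = statistics.mean(hours)
mean_without = statistics.mean([v for v in hours if v not in outliers])

print("outliers:", outliers)
print(f"mean with outlier:    {mean_with:.3f}")
print(f"mean without outlier: {mean_without:.3f}")
```

Comparing the two means shows how strongly a single extreme case pulls the average upward, which is the effect named in objective (b).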

DATA SET 4: NORMALITY TEST

Introduction: Monthly Sales


The following data shows the number of items sold by an enterprise in India.

Objective:

To check whether the given data is normally distributed or not.

Statistical Package Used:

IBM SPSS Statistics, Version 21

Statistical Techniques Deployed:

• Normality test

Assumptions/ Hypothesis:

Ho: Data is normally distributed.

Ha: Data is not normally distributed.

STEPS

STEP 1: Fill the data in the Variable View; the result is shown in the Data View.

STEP 2: For the normality test, click Analyze > Descriptive Statistics > Explore.

STEP 3: Drag Monthly Sales to the Dependent List.

STEP 4: In Explore, click Plots, select "Normality plots with tests", then click Continue and OK.
RESULT
OUTPUT MONTHLY_SALES.sav

Case Processing Summary
                                Cases
                  Valid          Missing          Total
                 N  Percent     N  Percent     N  Percent
Monthly Sales   50   100.0%     0     0.0%    50   100.0%
Table 4.1: Shows the summary of the data

Descriptives
                                                     Statistic  Std. Error
monthly sales  Mean                                      61.34       4.519
               95% Confidence Interval  Lower Bound      52.26
               for Mean                 Upper Bound      70.42
               5% Trimmed Mean                           59.99
               Median                                    55.00
               Variance                               1021.290
               Std. Deviation                           31.958
               Minimum                                       8
               Maximum                                     150
               Range                                       142
               Interquartile Range                          35
               Skewness                                   .761        .337
               Kurtosis                                   .240        .662
Table 4.2: Shows all the statistical values

Tests of Normality
                   Kolmogorov-Smirnov(a)        Shapiro-Wilk
                 Statistic   df   Sig.     Statistic   df   Sig.
Monthly Sales         .133   50   .027          .951   50   .039
a. Lilliefors Significance Correction
Table 4.3: Test of normality

KEY FINDINGS
Table 4.3 shows that the significance values are 0.027 (Kolmogorov-Smirnov) and 0.039 (Shapiro-Wilk), both of which are less than 0.05.

1. So, Ho will be rejected & Ha will be accepted.
2. Hence, the monthly sales are not normally distributed.

CONCLUSION
That means parametric tests (t-test, F-test, Z-test, ANOVA), which assume normally distributed data, should not be applied to this data; non-parametric tests (for example, chi-square) should be used instead.
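The p-value decision rule applied to the normality tests can be written out as a few lines of Python. This is only a sketch of the rule itself: the two significance values are copied from the Tests of Normality table, and the helper name `decide` is an assumption for illustration.

```python
ALPHA = 0.05  # significance level used throughout this file

# Sig. values copied from the Tests of Normality table.
p_values = {"Kolmogorov-Smirnov": 0.027, "Shapiro-Wilk": 0.039}


def decide(p_value, alpha=ALPHA):
    """Reject H0 (data is normally distributed) when p is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


decisions = {test: decide(p) for test, p in p_values.items()}
for test, decision in decisions.items():
    print(f"{test}: p = {p_values[test]:.3f} -> {decision}")
```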

DATA SET 5: ONE SAMPLE t-TEST
Introduction: Healthcare

A healthcare provider claims that on average its customers lose 5 kg of weight in a month after joining its weight loss programme. In order to test the validity of the claim, an independent researcher collects data on the weight lost by 50 customers a month after joining the programme. The researcher has decided to apply a one sample t-test in order to test the validity of the claim.

Hypothesis:
Null hypothesis (Ho): the mean of the population is 5.
Alternative hypothesis (Ha/H1): the mean of the population is ≠ 5, which means it is either < 5 or > 5.

Objective:
To check whether the claim of the healthcare provider is right, i.e. whether the null hypothesis is accepted or not. If the null hypothesis is rejected, find a value of the population mean that is consistent with the data.

Statistical package used:

IBM SPSS Statistics, Version 21

Statistical techniques deployed:

One sample t-test
P-value approach

STEPS

STEP 1: Define the weight loss variable in the Variable View.

STEP 2: Result as per Data View of 50 customers.

STEP 3: Analyze > Compare Means > One-Sample T Test

STEP 4: Drag the variable to Test Variable(s) and enter 5 as the Test Value.
RESULT

t- Test

One-Sample Statistics
                                              N   Mean  Std. Deviation  Std. Error Mean
Loss in Weight During Weight Loss Program    50   4.02           1.116             .158
Table 5.1: Shows the statistics of one sample

One-Sample Test
Test Value = 5
                                                                                              95% Confidence Interval
                                                                                                of the Difference
                                                  t   df  Sig. (2-tailed)  Mean Difference    Lower    Upper
Loss in Weight During Weight Loss Program    -6.212   49             .000            -.980    -1.30     -.66
Table 5.2: Shows the result of one sample t-test

Observation 1:
Here the p-value (0.000) is less than the alpha value (0.05), so we reject the null hypothesis, which means the healthcare provider's claim of 5 kg weight loss is not supported.

Now take 4 as the hypothesized mean of the population instead of 5.

RESULT
t-TEST

One-Sample Statistics
                                              N   Mean  Std. Deviation  Std. Error Mean
Loss in Weight During Weight Loss Program    50   4.02           1.116             .158
Table 5.3: Shows the result for one sample statistics

One-Sample Test
Test Value = 4
                                                                                              95% Confidence Interval
                                                                                                of the Difference
                                                  t   df  Sig. (2-tailed)  Mean Difference    Lower    Upper
Loss in Weight During Weight Loss Program      .127   49             .900             .020     -.30      .34
Table 5.4: Shows the (revised) result of one sample t-test
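The t values in the two One-Sample Test tables can be approximately reproduced from the summary statistics alone, using t = (sample mean - test value) / (standard deviation / sqrt(N)). The Python sketch below is a hand check, not SPSS output; the function name `one_sample_t` is an assumption, and small differences in the third decimal are expected because the inputs are rounded.

```python
import math


def one_sample_t(mean, sd, n, test_value):
    """Return (t statistic, standard error of the mean) from summary data."""
    se = sd / math.sqrt(n)
    return (mean - test_value) / se, se


# Summary statistics copied from the One-Sample Statistics table.
MEAN, SD, N = 4.02, 1.116, 50

t_value_5, se = one_sample_t(MEAN, SD, N, test_value=5)
t_value_4, _ = one_sample_t(MEAN, SD, N, test_value=4)

print(f"standard error = {se:.3f}")
print(f"t for test value 5 = {t_value_5:.3f}, df = {N - 1}")
print(f"t for test value 4 = {t_value_4:.3f}, df = {N - 1}")
```

A large negative t at test value 5 and a t close to zero at test value 4 match the pattern of the reported p-values (.000 and .900).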

Observation 2

Here the p-value (0.900) is greater than the alpha value (0.05), so we do not reject the null hypothesis that the population mean is 4. In other words, if the healthcare provider had claimed 4 kg as the mean weight loss of the population, the data would not contradict the claim.

Key findings
a) Table 5.2 shows that the p-value is 0.000, which is less than the alpha value of 0.05. We reject the null hypothesis, which means the claim of 5 kg weight loss is not supported.
b) Table 5.4 shows that the p-value is 0.900, which is greater than the alpha value. So we conclude that 4 is a plausible value for the mean of the population.
c) At test value 5, the null hypothesis is rejected because the p-value is less than the alpha value (0.05).
d) At test value 4, the null hypothesis is not rejected because the p-value (0.900) is greater than the alpha value (0.05).

CONCLUSION

Therefore, there is no significant difference between the sample mean and the hypothesized population mean at test value 4.

The claim of the healthcare provider is not supported. On average, a customer loses about 4 kg of weight instead of 5 kg in a month after joining its weight loss program.

DATA SET 6: PAIRED t-TEST
Introduction: Training program

The HR manager of a business firm wants to analyze the impact of a training program conducted for 30 employees. The purpose of conducting the training program was to improve the performance of employees. The performance scores of the employees are noted before and after the training program. He wants to observe the performance of the same respondents in the pre sample and post sample, i.e. the data are bivariate.

Assumption:

• Null hypothesis (Ho): There is no difference between the mean pre-training and post-training scores of employees.
• Alternative hypothesis (Ha): There is a difference between the mean pre-training and post-training scores of employees.

Objective:

• To record the performance scores of the employees before & after training.
• To improve the performance of employees.
• To perform paired t-test.

Statistical Package Used:

IBM SPSS Statistics, Version 21

Statistical Technique Deployed

• Paired / Repeated Sample T-Test

STEPS

STEP 1: Define the variables (name, type, label, measure) and fill them with values.

STEP 2: For the paired sample t-test, go to Analyze > Compare Means > Paired-Samples T Test.

STEP 3: Drag both Pre_training_Score and Post_training_Score to Paired Variables.

RESULT
Output of “Paired t-test.”

Paired Samples Statistics
                                Mean    N  Std. Deviation  Std. Error Mean
Pair 1  Pre_training_Score     51.43   30          12.792            2.335
        Post_training_Score    68.80   30          12.416            2.267
Table 6.1: Shows Paired Samples Statistics

Paired Samples Correlations
                                                     N  Correlation  Sig.
Pair 1  Pre_training_Score & Post_training_Score    30         .712  .000
Table 6.2: Shows paired sample correlations

Paired Samples Test
                                                                   Paired Differences
                                                                                   95% Confidence Interval
                                                                    Std. Error       of the Difference
                                                  Mean  Std. Deviation    Mean      Lower     Upper        t   df  Sig. (2-tailed)
Pair 1  Pre_training_Score - Post_training_Score  -17.367       9.565    1.746    -20.938   -13.795   -9.945   29             .000
Table 6.3: Shows paired sample test
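The paired t statistic is the mean of the differences divided by its standard error, and the standard deviation of the differences is linked to the two score standard deviations through their correlation. The Python sketch below checks both relations against the values in Tables 6.1 to 6.3; the function names are assumptions, and the results agree only up to the rounding of the reported figures.

```python
import math


def paired_t(mean_diff, sd_diff, n):
    """Return (t statistic, standard error) for a paired-samples t-test."""
    se = sd_diff / math.sqrt(n)
    return mean_diff / se, se


def sd_of_difference(sd_pre, sd_post, r):
    """SD of (pre - post) from the two SDs and their correlation r."""
    return math.sqrt(sd_pre ** 2 + sd_post ** 2 - 2 * r * sd_pre * sd_post)


# Mean and SD of the paired differences, N = 30 (Paired Samples Test table).
t_paired, se_paired = paired_t(mean_diff=-17.367, sd_diff=9.565, n=30)

# Cross-check of the SD of differences from Tables 6.1 and 6.2.
sd_check = sd_of_difference(sd_pre=12.792, sd_post=12.416, r=0.712)

print(f"standard error = {se_paired:.3f}")
print(f"t = {t_paired:.3f}, df = {30 - 1}")
print(f"SD of differences from r = {sd_check:.3f}")
```

The positive correlation between pre and post scores is what makes the standard deviation of the differences smaller than either individual standard deviation, which is why the paired design is sensitive here.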

Observation

Since the p-value (0.000) is less than the alpha value (0.05), we reject the null hypothesis.

Key Findings
Table 6.3 shows that the significance value 0.000 is less than the alpha value 0.05, so we reject the null hypothesis.

1. The null hypothesis is rejected because the p-value is less than the alpha value, which means that the alternative hypothesis is accepted. Therefore, there is a significant difference between the means of pre-training and post-training performance of employees.

So, the training program is effective in improving the performance of the employees: the mean score rose from 51.43 before training to 68.80 after training.

DATA SET 7: INDEPENDENT SAMPLES t-TEST

Introduction: Average performance of employees.

A researcher is interested in analyzing the difference in the average performance of employees with different demographic profiles. He divides the employees on the basis of gender and age group and applies the independent samples t-test to analyze the difference between their performances.

Objective:

To perform Independent Sample T-Test

Assumptions
For Levene’s homogeneity test
Ho = There is no significant difference between the variances of the two independent samples (equality of variance).
Ha = There is a significant difference between the variances of the two independent samples.

For Independent sample t-test
Ho = There is no significant difference between the mean performance scores of male and female employees.
Ha = There is a significant difference between the mean performance scores of male and female employees.

Statistical Package Used:

IBM SPSS Statistics, Version 21

Statistical Techniques Deployed:

• Independent Sample T-Test
• Levene’s Homogeneity Test

STEPS

STEP 1: Define the variables (name, type, label, measure) and fill them with values.

STEP 2: For Levene’s homogeneity test of variances and the t-test, go to Analyze > Compare Means > Independent-Samples T Test.

STEP 3: Define the variable list. Drag Performance Score to Test Variable(s) and Gender to Grouping Variable. Click “Define Groups” and enter the values for “Male” and “Female” in Group 1 and Group 2 respectively. Click Continue, then OK.

RESULT

Group Statistics
                   Gender    N   Mean  Std. Deviation  Std. Error Mean
Performance_Score  Male     25  61.68          19.313            3.863
                   Female   25  60.60          18.949            3.790
Table 7.1: Shows group statistics

Independent Samples Test
                                                Levene's Test for
                                                Equality of Variances   t-test for Equality of Means
                                                                                                                                     95% Confidence Interval
                                                                                        Sig.         Mean        Std. Error            of the Difference
                                                    F     Sig.        t       df   (2-tailed)  Difference    Difference          Lower      Upper
Performance_Score  Equal variances assumed       .003     .956     .200       48         .843       1.080         5.411         -9.800     11.960
                   Equal variances not assumed                     .200   47.983         .843       1.080         5.411         -9.800     11.960
Table 7.2: Shows Levene's test and independent samples t-test

If Levene’s null hypothesis (Ho) is rejected, i.e. equal variances cannot be assumed, read the t-test results from the second row (“Equal variances not assumed”).

The first two columns show the result of Levene’s homogeneity test: the p-value = 0.956 is compared with the alpha value of 0.05, and since it is greater, equal variances are assumed. The later columns give the t-test results.
In this case, the t-test p-value (0.843) > 0.05, so we accept Ho.

CASE 2: On the basis of age group

STEP 4: For Levene’s homogeneity test of variances and the t-test, go to Analyze > Compare Means > Independent-Samples T Test.

STEP 5: Define the variable list on the basis of age group and set the cut point to 40.

Group Statistics
                   Age      N   Mean  Std. Deviation  Std. Error Mean
Performance Score  >= 40   22  68.86          19.075            4.067
                   < 40    28  55.07          16.777            3.171
Table 7.3: Shows group statistics for age

Independent Samples Test
                                                Levene's Test for
                                                Equality of Variances   t-test for Equality of Means
                                                                                                                                     95% Confidence Interval
                                                                                        Sig.         Mean        Std. Error            of the Difference
                                                    F     Sig.        t       df   (2-tailed)  Difference    Difference          Lower      Upper
Performance_Score  Equal variances assumed      1.408     .241    2.717       48         .009      13.792         5.077          3.585     23.999
                   Equal variances not assumed                    2.675   42.170         .011      13.792         5.157          3.387     24.197
Table 7.4: Shows t test for age
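Both rows of the Independent Samples Test tables can be approximately reproduced from the group statistics: the "equal variances assumed" row uses a pooled variance, and the "equal variances not assumed" row uses the Welch formula with adjusted degrees of freedom. The Python sketch below is a hand check with illustrative function names; results differ slightly from SPSS because the inputs are rounded.

```python
import math


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Equal-variances t-test from summary statistics: returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


def welch_t(m1, s1, n1, m2, s2, n2):
    """Unequal-variances (Welch) t-test from summary statistics: (t, df)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (m1 - m2) / se, df


# Gender groups (Table 7.1): Male vs Female.
t_gender, df_gender = pooled_t(61.68, 19.313, 25, 60.60, 18.949, 25)

# Age groups (Table 7.3): >= 40 vs < 40.
t_age, df_age = pooled_t(68.86, 19.075, 22, 55.07, 16.777, 28)
t_age_welch, df_age_welch = welch_t(68.86, 19.075, 22, 55.07, 16.777, 28)

print(f"gender: t = {t_gender:.3f}, df = {df_gender}")
print(f"age (pooled): t = {t_age:.3f}, df = {df_age}")
print(f"age (Welch):  t = {t_age_welch:.3f}, df = {df_age_welch:.3f}")
```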

KEY FINDINGS
1. According to Table 7.2, the p-value of Levene's test on the basis of gender is 0.956, which is greater than 0.05. So, the variances of performance for male & female employees can be taken as equal ($\sigma_1^2 = \sigma_2^2$).
2. According to Table 7.4, the p-value of Levene's test on the basis of age is 0.241, which is greater than 0.05. So, the variances of performance of the two age groups can be taken as equal ($\sigma_1^2 = \sigma_2^2$).
3. According to Table 7.2, for the performance of employees on the basis of gender, the p-value is 0.843, which is greater than 0.05, which implies that Ho will be accepted.
4. According to Table 7.4, for the performance of employees on the basis of age, the p-value is 0.009, which is less than 0.05, which implies that Ho will be rejected & Ha will be accepted.

So, there is no significant difference in the average performance of the male & female employees.

So, there is a significant difference in the average performance of the employees below 40 years of age and those aged 40 and above.

DATA SET 8: ANOVA

Introduction: Company Sales


A researcher wants to compare the sales of three companies, collected from different retail stores. The companies are coded as 1, 2 and 3, and the data of their sales from different retail stores is given in the data set.

Assumption:

FOR LEVENE'S TEST-

Null Hypothesis (Ho): Variances are equal for all 3 companies.
Alternative Hypothesis (Ha): Variances of all three companies are not equal.

FOR ANOVA TEST-

Null Hypothesis (Ho): Mean sales of all three companies are equal.
Alternative Hypotheses (Ha) of ANOVA are:
a) Mean sale of comp. 1 = mean sale of comp. 2 ≠ mean sale of comp. 3
b) Mean sale of comp. 1 ≠ mean sale of comp. 2 = mean sale of comp. 3
c) Mean sale of comp. 1 = mean sale of comp. 3 ≠ mean sale of comp. 2
d) Mean sale of comp. 1 ≠ mean sale of comp. 2 ≠ mean sale of comp. 3

Objective:
Apply one-way ANOVA to find out if the mean sales of the three companies are equal.

Statistical Package Used:

IBM SPSS Statistics, Version 21

Statistical Techniques Deployed:

• Levene's test
• ANOVA test

STEPS
STEP 1: Fill the data in the Variable View.

STEP 2: Result as per Data View.

STEP 3: Analyze > Compare Means > One-Way ANOVA

STEP 4: Move Sales to the Dependent List and Company to the Factor box.

STEP 5: Post Hoc > Tukey > Continue

STEP 6: Click Options. Select Descriptive, Homogeneity of variance test and Means plot. Click Continue.
RESULT

Descriptives
Sales
                                               95% Confidence Interval for Mean
         N   Mean  Std. Deviation  Std. Error   Lower Bound   Upper Bound   Minimum  Maximum
1       18  17.11          15.335       3.615          9.49         24.74         6       76
2       15  22.53          20.525       5.299         11.17         33.90         5       76
3       17  44.47          15.895       3.855         36.30         52.64        20       89
Total   50  28.04          20.767       2.937         22.14         33.94         5       89
Table 8.1: Shows descriptive data.

Test of Homogeneity of Variances
Sales
Levene Statistic  df1  df2  Sig.
           2.098    2   47  .134
Table 8.2: Shows Homogeneity of Variance

ANOVA
Sales
                 Sum of Squares  df  Mean Square       F  Sig.
Between Groups         7194.174   2     3597.087  12.130  .000
Within Groups         13937.746  47      296.548
Total                 21131.920  49
Table 8.3: Shows ANOVA

Multiple Comparisons
Dependent Variable: Sales
Tukey HSD
                                                                    95% Confidence Interval
(I) Company  (J) Company  Mean Difference (I-J)  Std. Error  Sig.  Lower Bound  Upper Bound
1            2                           -5.422       6.020  .643       -19.99         9.15
             3                         -27.359*       5.824  .000       -41.45       -13.26
2            1                            5.422       6.020  .643        -9.15        19.99
             3                         -21.937*       6.100  .002       -36.70        -7.17
3            1                          27.359*       5.824  .000        13.26        41.45
             2                          21.937*       6.100  .002         7.17        36.70
*. The mean difference is significant at the 0.05 level.
Table 8.4: Shows multiple comparison.

Sales
Tukey HSD(a,b)
                Subset for alpha = 0.05
Company    N        1        2
1         18    17.11
2         15    22.53
3         17             44.47
Sig.             .639    1.000
Table 8.5: Shows Tukey comparison of the three companies
Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 16.570.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.
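The F ratio of the one-way ANOVA can be cross-checked from the group sizes, means and standard deviations in Table 8.1, without the raw sales data. The Python sketch below is a hand check, not SPSS output; the function name `anova_from_summary` is an assumption, and the sums of squares differ slightly from the SPSS values because the table entries are rounded.

```python
# (N, mean, standard deviation) per company, copied from Table 8.1.
groups = {
    1: (18, 17.11, 15.335),
    2: (15, 22.53, 20.525),
    3: (17, 44.47, 15.895),
}


def anova_from_summary(groups):
    """One-way ANOVA from per-group (n, mean, sd) summaries."""
    n_total = sum(n for n, _, _ in groups.values())
    grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total
    # Between-groups: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
    # Within-groups: pooled spread inside each group.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio


ss_b, ss_w, df_b, df_w, f_ratio = anova_from_summary(groups)
print(f"SS between = {ss_b:.1f} (df {df_b})")
print(f"SS within  = {ss_w:.1f} (df {df_w})")
print(f"F = {f_ratio:.2f}")
```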

Fig 8.1: Shows the diagrammatic representation of sales of companies
CONCLUSION
Table 8.2: The test of homogeneity of variances using Levene’s test gives a significance value of 0.134, which is more than the alpha value of 0.05; hence the null hypothesis for Levene’s test is accepted, i.e. the variances of sales of all three companies are similar.

Table 8.3: In the ANOVA table the significance value is 0.000, which is less than 0.05, so the null hypothesis (Ho) is rejected: the mean sales of the three companies are not all equal.

Hence, we accept the alternative hypothesis (Ha). To check which of the 4 cases applies, we perform the post hoc test.

Table 8.4: Shows the significance value of each company with respect to the others. For example, company 1 with company 2 has a significance value of 0.643, so we accept Ho here; the significance value of company 1 with company 3 is 0.000, so here we reject Ho.

Fig 8.1: Shows that the sales of company 1 are equivalent to those of company 2, but the sales of company 3 are different from both.

Lastly, we conclude that the mean sales of Company 1 and Company 2 are equivalent, but the mean sales of Company 3 are significantly different from both, which corresponds to case (a) of the alternative hypotheses.