This is not a training on Six Sigma!
The training presentation assumes that you are already
aware of Six Sigma concepts and are looking for ways to
implement them using MS Excel.
The training presentation also assumes that you know the
basics of MS Excel, so it focuses on some advanced
analytical concepts.
The Excel tips and tools mentioned in this presentation can
be used in multiple phases of the DMAIC cycle, so the
presentation does not follow a DMAIC flow of thought.
The training is based on MS Excel 2007. Expect to adapt the
steps a little if you are using MS Excel 2003.
In mathematics, the central tendency of a data set is a measure of the
"middle" or "expected" value of the data set. There are many different
descriptive statistics that can be chosen as a measurement of the
central tendency of the data items. These include the mean, the median,
and the mode.
Other statistical measures such as the standard deviation and the range
are called measures of spread and describe how spread out the data is.
Measures of Central Tendency:
• Mean
• Median
• Mode
Syntax
=AVERAGE(number1,number2,...)
A median is described as the number separating the higher half of a
sample, a population, or a probability distribution, from the lower half.
If there is an even number of observations, the median is not unique, so
one often takes the mean of the two middle values.
Syntax
=MEDIAN(number1,number2,...)
The mode is the value that occurs the most frequently in a data set or a
probability distribution. The mode is not necessarily unique, since the
same maximum frequency may be attained at different values.
Syntax
=MODE(number1,number2,...)
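The three measures above can be cross-checked outside Excel. The following is a minimal sketch in Python using only the standard library; the sample values are hypothetical and were chosen to show an even number of observations and a tie for the mode.

```python
import statistics

# Hypothetical sample (not from the slides): six observations, two modes.
data = [3, 5, 5, 7, 8, 8]

# Arithmetic mean, comparable to =AVERAGE(...)
mean_value = statistics.mean(data)

# Even number of observations: the mean of the two middle values (5 and 7)
median_value = statistics.median(data)

# Every value that reaches the highest frequency; here both 5 and 8 occur twice
modes = statistics.multimode(data)

print(mean_value, median_value, modes)
```

Note that Excel's MODE returns a single value, whereas multimode lists every value tied for the highest frequency.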
In statistics, variance is the expected squared deviation of a variable or
distribution from its expected value or mean. To estimate variance from a
sample, Excel uses the function “=VAR”.
Syntax
=VAR(number1,number2,...)
Standard deviation is a measure of the variability or dispersion of a
statistical population, a data set, or a probability distribution. To
calculate the standard deviation of a sample in an Excel worksheet, we
use the function “=STDEV”.
Syntax
=STDEV(number1,number2,...)
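VAR and STDEV treat the data as a sample and divide by n − 1; Excel also offers VARP and STDEVP, which divide by n. The sketch below, in Python with the standard library, shows the difference on a hypothetical data set that is not taken from the slides.

```python
import statistics

# Hypothetical data set: mean is 5, sum of squared deviations is 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Sample statistics (divide by n - 1), comparable to =VAR and =STDEV
sample_variance = statistics.variance(data)
sample_stdev = statistics.stdev(data)

# Population statistics (divide by n), comparable to =VARP and =STDEVP
population_variance = statistics.pvariance(data)
population_stdev = statistics.pstdev(data)

print(sample_variance, sample_stdev)
print(population_variance, population_stdev)
```

The sample figures are always slightly larger than the population figures for the same data, and the gap shrinks as the number of observations grows.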
In descriptive statistics, the range is the length of the smallest interval
which contains all the data. It is calculated in Excel by subtracting the
MIN from the MAX value of the sample.
Syntax
=MAX(A2:A16)-MIN(A2:A16)
In probability theory and statistics, skewness is a measure of the
asymmetry of the probability distribution of a real-valued random
variable. It is measured in Six Sigma because, in reality, data are
not always perfectly symmetric.
Syntax
=SKEW(A2:A16)
In probability theory and statistics, kurtosis is a measure of the
"peakedness" of the probability distribution of a real-valued random
variable.
Syntax
=KURT(A2:A16)
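To make the two shape measures concrete, here is a minimal Python sketch of the sample-adjusted formulas that Excel documents for SKEW and KURT. The function names excel_skew and excel_kurt are hypothetical helpers written for this illustration, and the test values are made up.

```python
import statistics


def excel_skew(values):
    """Sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least 3 values")
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    total = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * total


def excel_kurt(values):
    """Sample excess kurtosis, using the same n - 1 standard deviation."""
    n = len(values)
    if n < 4:
        raise ValueError("kurtosis needs at least 4 values")
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    total = sum(((x - mean) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * total - correction


symmetric = [1, 2, 3, 4, 5]        # hypothetical, perfectly symmetric
right_tailed = [1, 1, 2, 2, 10]    # hypothetical, long tail to the right

print(excel_skew(symmetric), excel_kurt(symmetric))
print(excel_skew(right_tailed))
```

A skewness of zero indicates symmetry, a positive value indicates a longer right tail, and a negative kurtosis indicates a flatter shape than the normal distribution.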
If the mean is 85 days and the standard deviation is 5 days,
what is the yield if the USL is 90 days?
Z = (90 − 85) / 5 = 1
Y = Pr(x ≤ 90) = Pr(z ≤ 1)
P(z < 1) = P(z > −1) = 1 − 0.15865 = 0.8413
Yield ≅ 84.1%
The area under the curve to the right of the USL would be considered % defective.
[Figure: normal curve on the Z-scale from −7 to 7, showing the USL and the yield.]
=NORMDIST(x,mean,standarddeviation,cumulative)
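The yield example (mean 85 days, standard deviation 5 days, USL 90 days) corresponds to =NORMDIST(90,85,5,TRUE). As a cross-check, the same cumulative probability can be sketched in Python with the standard library; the variable names are illustrative only.

```python
from statistics import NormalDist

mean_days = 85
std_dev_days = 5
usl_days = 90

# Z-score of the upper specification limit
z = (usl_days - mean_days) / std_dev_days

# Cumulative probability Pr(x <= USL), i.e. the yield
yield_fraction = NormalDist(mean_days, std_dev_days).cdf(usl_days)

# Area to the right of the USL, i.e. the fraction defective
defect_fraction = 1 - yield_fraction

print(f"Z = {z}, yield = {yield_fraction:.4f}, defective = {defect_fraction:.5f}")
```

Setting the cumulative argument to TRUE in NORMDIST gives this cumulative probability; FALSE would return the height of the density curve instead.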
17
For a pizza delivery center, the mean of the delivery time is
20 minutes and the standard deviation is 3.5 minutes. What is their
target, if the probability of achieving the target is 99.78%?
[Figure: normal curve showing the yield and the USL.]
18
=NORMINV(probability,mean,standarddeviation)
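For the pizza delivery question (mean 20 minutes, standard deviation 3.5, probability 99.78%), the Excel call would be =NORMINV(0.9978,20,3.5). The sketch below reproduces the inverse lookup in Python with the standard library as a cross-check; the variable names are illustrative only.

```python
from statistics import NormalDist

mean_minutes = 20
std_dev_minutes = 3.5
probability = 0.9978

delivery_time = NormalDist(mean_minutes, std_dev_minutes)

# Inverse of the cumulative distribution: the time met with the given probability
target_minutes = delivery_time.inv_cdf(probability)

# Round trip: feeding the target back into the CDF should return the probability
check_probability = delivery_time.cdf(target_minutes)

print(f"Target = {target_minutes:.2f} minutes")
```

The result is just under 30 minutes, so a 30-minute delivery target would be met in roughly 99.78% of deliveries under these assumptions.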
Data in raw form are usually not easy to use for decision making.
Some type of organization is needed:
▪ Table
▪ Graph
Techniques reviewed here:
▪ Ordered Array
▪ Histograms
▪ Bar charts and pie charts
▪ Contingency tables
A sorted list of data:
Shows range (min to max)
Data in raw form (as collected):
Class                   Class Midpoint   Frequency
10 but less than 20           15             3
20 but less than 30           25             6
30 but less than 40           35             5
40 but less than 50           45             4
50 but less than 60           55             2
[Figure: Histogram: Daily High Temperature. Frequency on the vertical axis; class midpoints 5, 15, 25, 35, 45, 55, More on the horizontal axis; no gaps between bars.]
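The grouping behind the frequency table can be sketched in a few lines of Python. The raw temperatures below are hypothetical: the slides do not include the raw data, so these values were simply chosen to reproduce the frequencies shown in the table.

```python
# Hypothetical raw daily high temperatures (assumed, not from the slides)
temperatures = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
                32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

class_lower_bounds = [10, 20, 30, 40, 50]
class_width = 10

frequencies = []
midpoints = []
for lower in class_lower_bounds:
    upper = lower + class_width
    # "lower but less than upper": lower bound included, upper bound excluded
    count = sum(1 for t in temperatures if lower <= t < upper)
    frequencies.append(count)
    midpoints.append(lower + class_width / 2)

for lower, midpoint, count in zip(class_lower_bounds, midpoints, frequencies):
    print(f"{lower} but less than {lower + class_width}: midpoint {midpoint}, frequency {count}")
```

Each value falls into exactly one class because the upper boundary of a class is excluded and becomes the lower boundary of the next one.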
Choose Histogram
Input data range and bin range (bin range is a cell range containing
the upper class boundaries for each class grouping)
Select Chart Output and click “OK”
Scatter Diagrams are used for bivariate
numerical data
Bivariate data consists of paired observations
taken from two numerical variables
Volume per day   Cost per day
      23              125
      26              140
      29              146
      33              160
      38              167
      42              170
      50              188
      55              195
      60              200
[Figure: scatter diagram "Cost per Day vs. Production Volume". Volume per Day on the horizontal axis (0 to 70); Cost per Day on the vertical axis (0 to 250).]
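The strength of the linear relationship in the paired data above can be summarized with the Pearson correlation coefficient, which is what Excel's CORREL function and the Correlation tool report. The following Python sketch computes it by hand from the table values; the helper name pearson_r is hypothetical.

```python
import math

# Paired observations taken from the table above
volume_per_day = [23, 26, 29, 33, 38, 42, 50, 55, 60]
cost_per_day = [125, 140, 146, 160, 167, 170, 188, 195, 200]


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    if len(xs) != len(ys):
        raise ValueError("both variables need the same number of observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


r = pearson_r(volume_per_day, cost_per_day)
print(f"r = {r:.3f}")
```

A value close to +1 indicates that cost per day rises almost linearly with production volume, which matches the upward pattern of the scatter diagram.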
Microsoft Excel
descriptive statistics output,
using the house price data:
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
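The main figures that a descriptive statistics summary reports for the house price data can be reproduced as a cross-check. This is a minimal Python sketch using the standard library, not the Excel output itself.

```python
import statistics

# House prices from the slide, in dollars
house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

mean_price = statistics.mean(house_prices)
median_price = statistics.median(house_prices)
mode_price = statistics.mode(house_prices)
price_range = max(house_prices) - min(house_prices)
sample_stdev = statistics.stdev(house_prices)  # n - 1 divisor, like =STDEV

print(f"Mean:   {mean_price:,.0f}")
print(f"Median: {median_price:,.0f}")
print(f"Mode:   {mode_price:,.0f}")
print(f"Range:  {price_range:,.0f}")
print(f"Standard deviation: {sample_stdev:,.0f}")
```

The single $2,000,000 house pulls the mean to twice the median, which is why the median is often the better measure of central tendency for skewed data such as prices.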
Select
Data Analysis
Choose Correlation from
the selection menu
Click OK . . .
Input data range and select
appropriate options
Click OK to get output
Select the input ranges from the data.
Select the residuals options. If you are not sure,
just select line fit plots.
Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error       41.33032
Observations         10

ANOVA
             df    SS            MS            F         Significance F
Regression   1     18934.9348    18934.9348    11.0848   0.01039
Residual     8     13665.5652    1708.1957
Total        9     32600.5000

The regression equation is:
house price = 98.24833 + 0.10977 (square feet)
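The figures in the regression output are linked by simple arithmetic, which helps when reading the table. The Python sketch below recomputes the summary statistics from the ANOVA sums of squares and uses the fitted equation for one prediction. The 2,000 square feet input and the helper name predict_house_price are hypothetical, and the result is in whatever units the house price data uses.

```python
import math

# Values copied from the regression output
n_observations = 10
ss_regression = 18934.9348
ss_residual = 13665.5652
ss_total = 32600.5000
df_regression = 1
df_residual = 8

# R Square: share of total variation explained by the regression
r_square = ss_regression / ss_total

# Multiple R: square root of R Square (one predictor, positive slope)
multiple_r = math.sqrt(r_square)

# Adjusted R Square: penalizes R Square for the number of predictors
adjusted_r_square = 1 - (1 - r_square) * (n_observations - 1) / df_residual

# Standard Error: square root of the residual mean square
ms_residual = ss_residual / df_residual
standard_error = math.sqrt(ms_residual)

# F: regression mean square divided by residual mean square
f_statistic = (ss_regression / df_regression) / ms_residual


def predict_house_price(square_feet):
    """Apply the fitted equation from the output above."""
    return 98.24833 + 0.10977 * square_feet


print(r_square, multiple_r, adjusted_r_square, standard_error, f_statistic)
print(predict_house_price(2000))
```

An R Square of about 0.58 means that roughly 58% of the variation in house price is explained by square feet in this sample.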