BOXPLOT
DESCRIBING SHAPES AND PATTERNS
SKEWNESS
Symmetry of Distribution
A measure of the symmetry or asymmetry of the
distribution of values in a data set.
POSITIVELY SKEWED or SKEWED TO
THE RIGHT - the values are concentrated
on the left side of the distribution, with a
tail that tapers off on the right side.
NEGATIVELY SKEWED or SKEWED TO
THE LEFT - the values are concentrated
on the right side of the distribution, with a
tail that tapers off on the left side.
Rule of thumb for a unimodal distribution:
The distribution is symmetric if mean = median = mode.
The distribution is skewed to the right (positively skewed) if mean > median > mode.
The distribution is skewed to the left (negatively skewed) if mean < median < mode.
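The comparison of the three averages can be tried on small data sets. The sketch below is only an illustration: the function name `classify_by_averages` and the three sample lists are hypothetical, and the exact equality test for the symmetric case only works cleanly on tidy, made-up numbers like these.

```python
import statistics

def classify_by_averages(values):
    """Compare mean, median, and mode using the rule of thumb above."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    # statistics.mode returns the first mode found if several values tie.
    mode = statistics.mode(values)
    if mean == median == mode:
        return "symmetric"
    if mean > median > mode:
        return "skewed to the right"
    if mean < median < mode:
        return "skewed to the left"
    return "inconclusive"  # the ordering matches none of the three cases

# Hypothetical data sets chosen so that each case appears once.
right_tailed = [1, 2, 2, 3, 4, 5, 11]    # long tail on the right
left_tailed = [1, 7, 8, 9, 10, 10, 11]   # long tail on the left
balanced = [1, 2, 3, 3, 3, 4, 5]         # mirror image around 3

for name, data in [("right_tailed", right_tailed),
                   ("left_tailed", left_tailed),
                   ("balanced", balanced)]:
    print(name, "->", classify_by_averages(data))
```

Real data rarely gives exactly equal averages, so in practice the direction of the inequalities is more informative than the equality test.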
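Coefficient of skewness (Pearson's first and second coefficients):
$Sk_1 = \dfrac{\bar{x} - Mo}{s}$
$Sk_2 = \dfrac{3(\bar{x} - Md)}{s}$
where $\bar{x}$ is the mean, $Mo$ the mode, $Md$ the median, and $s$ the standard deviation.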
Interpretation:
If the coefficient of skewness is less than 0,
then the distribution is skewed to the left or
negatively skewed.
If the coefficient of skewness is greater
than 0, then the distribution is skewed to
the right or positively skewed.
If the coefficient of skewness is 0, then the
distribution is symmetric.
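The interpretation can be illustrated with one common coefficient, Pearson's second coefficient of skewness, 3(mean - median)/s. Choosing this particular coefficient is an assumption made for the example, as are the function names and the sample data; the sample standard deviation (divisor n - 1) is used for s.

```python
import statistics

def pearson_second_skewness(values):
    """Pearson's second coefficient of skewness: 3 * (mean - median) / s."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    s = statistics.stdev(values)  # sample standard deviation, divisor n - 1
    return 3 * (mean - median) / s

def interpret_skewness(coefficient):
    """Translate the sign of the coefficient into words."""
    if coefficient < 0:
        return "skewed to the left (negatively skewed)"
    if coefficient > 0:
        return "skewed to the right (positively skewed)"
    return "symmetric"

# Hypothetical data sets for illustration only.
samples = {
    "long right tail": [1, 2, 2, 3, 4, 5, 11],
    "long left tail": [1, 7, 8, 9, 10, 10, 11],
    "mirror image": [1, 2, 3, 3, 3, 4, 5],
}

for label, data in samples.items():
    sk = pearson_second_skewness(data)
    print(f"{label}: Sk = {sk:.3f}, {interpret_skewness(sk)}")
```

With real data the coefficient is almost never exactly 0, so values close to 0 are usually read as approximately symmetric.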
KURTOSIS
Concavity of Distribution
Kurtosis describes the peakedness (the
"hump") of a distribution of values relative
to the hump of a normal distribution
with the same amount of variability.
The Normal Distribution:
It is a bell-shaped curve.
It is symmetric about its mean.
It has tails that extend indefinitely in both directions
but never touch the x-axis.
The area below the curve and above the x-axis is 1.
About 68% of observations lie within one standard deviation
of the mean, about 95% lie within two standard deviations
of the mean, and about 99.7% lie within three standard
deviations of the mean.
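The percentages quoted for the normal distribution can be checked numerically. For a normal variable, the probability of falling within k standard deviations of the mean equals erf(k / sqrt(2)); the helper name `prob_within` below is hypothetical.

```python
import math

def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")
# prints 0.6827, 0.9545, and 0.9973 for k = 1, 2, 3
```

The result does not depend on the particular mean or standard deviation, which is why the rule applies to every normal distribution.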
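Coefficient of kurtosis:
$K = \dfrac{m_4}{s^4}$, where $m_4 = \dfrac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^4$ and $s$ is the standard deviation.
Interpretation:
If $\dfrac{m_4}{s^4} - 3 < 0$, the distribution is flatter than the normal distribution (platykurtic).
If $\dfrac{m_4}{s^4} - 3 > 0$, the distribution is more peaked than the normal distribution (leptokurtic).
If $\dfrac{m_4}{s^4} - 3 = 0$, the distribution has the same peakedness as the normal distribution (mesokurtic).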
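A small sketch of the comparison with 3, assuming the usual moment-based form m4/s^4 - 3, where m4 is the average fourth power of the deviations from the mean. Computing s^2 with divisor n is an assumption of this example, and the function name and both data sets are hypothetical.

```python
def excess_kurtosis(values):
    """Return m4 / s**4 - 3, with s**2 computed using divisor n (an assumption)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance, divisor n
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

# Hypothetical data: evenly spread values versus values piled up at the center.
flat = [1, 2, 3, 4, 5]
peaked = [-5, 0, 0, 0, 0, 0, 0, 0, 0, 5]

for label, data in [("flat", flat), ("peaked", peaked)]:
    k = excess_kurtosis(data)
    if k < 0:
        shape = "flatter than normal"
    elif k > 0:
        shape = "more peaked than normal"
    else:
        shape = "same peakedness as normal"
    print(f"{label}: {k:.2f} ({shape})")
```

Subtracting 3 works because the ratio m4/s^4 equals 3 for a normal distribution, so the sign of the difference shows on which side of the normal curve the data fall.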
BOXPLOT
Exploratory-Data-Analysis Tool
Box-and-whisker plot: a
graphical tool for assessing
the shape of a distribution.
Features:
Location
Spread
Symmetry
Extremes
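Parts of a boxplot (drawn along an AXIS):
Upper whisker end: Q3 + 1.5(IQR), or the maximum if it is smaller
3rd Quartile (Q3): upper edge of the box
2nd Quartile (Q2), or Median: line inside the box
1st Quartile (Q1): lower edge of the box
Lower whisker end: Q1 - 1.5(IQR), or the minimum if it is larger
IQR = Q3 - Q1: range of the middle 50% of the data (length of the box)
OUTLIERS: observations beyond the whisker ends, plotted individually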
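The quantities labeled on the boxplot can be computed directly. The sketch below uses `statistics.quantiles` from the Python standard library with its default "exclusive" method; other quartile conventions give slightly different values. The function name `boxplot_summary` and the data are hypothetical.

```python
import statistics

def boxplot_summary(values):
    """Compute quartiles, IQR, whisker limits, and outliers for a boxplot."""
    data = sorted(values)
    q1, q2, q3 = statistics.quantiles(data, n=4)  # default method="exclusive"
    iqr = q3 - q1
    lower_limit = max(data[0], q1 - 1.5 * iqr)   # Q1 - 1.5(IQR) or min
    upper_limit = min(data[-1], q3 + 1.5 * iqr)  # Q3 + 1.5(IQR) or max
    outliers = [x for x in data if x < lower_limit or x > upper_limit]
    return {
        "q1": q1, "median": q2, "q3": q3, "iqr": iqr,
        "lower_limit": lower_limit, "upper_limit": upper_limit,
        "outliers": outliers,
    }

# Hypothetical data with one unusually large value.
scores = [2, 4, 4, 5, 6, 7, 8, 9, 10, 30]
summary = boxplot_summary(scores)
for key, value in summary.items():
    print(key, "=", value)
```

Many plotting libraries end each whisker at the most extreme observation inside the 1.5(IQR) limit rather than at the limit itself; the code above follows the labels as written here.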
ASSESSING SKEWNESS
USING A BOXPLOT
/:END:/