
Skewness/Kurtosis

Skewness is the degree of departure from symmetry of a distribution. A positively
skewed distribution has a "tail" which is pulled in the positive direction. A negatively
skewed distribution has a "tail" which is pulled in the negative direction.

Kurtosis is the degree of peakedness of a distribution. A normal distribution is a
mesokurtic distribution. A pure leptokurtic distribution has a higher peak than the normal
distribution and has heavier tails. A pure platykurtic distribution has a lower peak than a
normal distribution and lighter tails.

Most departures from normality display combinations of both skewness and kurtosis
different from a normal distribution.
Calculating Skewness and Kurtosis
There are many methods for calculating skewness and kurtosis indices. Not all
computer programs calculate Skewness and Kurtosis the same way. If you use a
computer program to obtain skewness and kurtosis indices be sure you know how it
calculates them!
There are measures of skewness such as Pearson's second coefficient of skewness,
which is three times the difference between the mean and the median, divided by the
standard deviation: 3(mean - median) / s. There are also skewness indices based on the
quartiles, and many others. The most important group of measures of skewness and
kurtosis uses the third and fourth moments about the mean.
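As a rough illustration of Pearson's second coefficient described above, the following sketch computes it with Python's standard library. The function name and the sample values are made up for illustration, and using the sample (n - 1) standard deviation is an assumption, since the text does not say which divisor is intended.

```python
import statistics


def pearson_second_skewness(data):
    """Pearson's second coefficient: 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    # Assumption: sample standard deviation (divisor n - 1).
    sd = statistics.stdev(data)
    return 3 * (mean - median) / sd


# Made-up values with a long right tail: the mean sits above the median,
# so the coefficient comes out positive.
sample = [1, 2, 3, 4, 10]
print(pearson_second_skewness(sample))
```

A symmetric data set such as 1, 2, 3, 4, 5 has equal mean and median, so the coefficient is zero.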
The moments about the mean are simply the sum of [each observed value minus the
mean] raised to some power and divided by the sample size. In algebraic form, the rth
moment about the mean is:

m_r = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^r

where n is the sample size and \bar{x} is the sample mean.
The second moment about the mean is simply the variance.
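The verbal definition of the rth moment about the mean translates directly into code. This is a minimal sketch; the function name `central_moment` is an illustrative choice. Because the definition divides by the sample size, the second moment matches the population variance (divisor n), not the n - 1 sample variance.

```python
import statistics


def central_moment(data, r):
    """r-th moment about the mean: sum of (x - mean)**r divided by n."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** r for x in data) / n


values = [1, 2, 3, 4, 5]  # made-up data
# The second central moment equals the population variance.
print(central_moment(values, 2), statistics.pvariance(values))
# The third central moment vanishes for data symmetric about the mean.
print(central_moment(values, 3))
```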


The Moment Coefficient of Skewness, denoted by statisticians as g3, is defined in
dimensionless form as:

The expected value of this statistic will be zero for symmetrical distributions.
And similarly, the Moment Coefficient of Kurtosis, denoted by statisticians as g4, is
defined in dimensionless form as:

The expected value of this statistic will be zero for Normal distributions.
These are the Skewness and Kurtosis formulas that are used by MVPstats and by
programs such as SPSS and Excel.
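Since the text names Excel and SPSS, the sketch below uses, as an assumption, the bias-adjusted sample formulas documented for Excel's SKEW and KURT worksheet functions. Whether these agree term for term with the MVPstats definitions is not confirmed by the text, and the function names here are illustrative.

```python
import math


def adjusted_skewness(data):
    """Bias-adjusted sample skewness (the form documented for Excel's SKEW)."""
    n = len(data)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    z3 = sum(((x - mean) / s) ** 3 for x in data)
    return n / ((n - 1) * (n - 2)) * z3


def adjusted_excess_kurtosis(data):
    """Bias-adjusted excess kurtosis (the form documented for Excel's KURT)."""
    n = len(data)
    if n < 4:
        raise ValueError("need at least 4 observations")
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    z4 = sum(((x - mean) / s) ** 4 for x in data)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * z4 - correction


print(adjusted_skewness([1, 2, 3, 4, 10]))       # made-up right-skewed data
print(adjusted_excess_kurtosis([1, 2, 3, 4, 5]))  # made-up flat data
```

Both statistics are centered on zero for Normal data, which is why a table of critical values must be matched to the exact formula in use.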
Critical Values
The critical value tables for Skewness and Kurtosis may be found on this Website. (See
Skewness Critical Values, and Kurtosis Critical Values.) These tables have been
generated to match the formulas above. Note that other tables exist which do not match
these formulas, and using them would be misleading.
Using the Critical Value Tables and p-values
The tests for skewness and kurtosis are two-sided tests. The null hypothesis to be
tested is that the skewness and kurtosis values are zero. The alternative hypotheses
are generally that skewness and kurtosis are not equal to zero.
The critical value tables, found on this Website, provide the critical values for different
selections of alpha, for various sample sizes.
For skewness, if the absolute value equals or exceeds the critical value for your chosen
alpha, reject the assumption of normality.
For kurtosis, if the kurtosis value is greater than or equal to the high critical value, or is
less than or equal to the low critical value, reject the assumption of normality.
MVPstats displays p-values for skewness based on a t-distribution where (D'Agostino
and Tietjen, 1971):

The 't' approximation has an error in p-value of no more than 1/2 percent for sample
sizes below 20, compared to the published values. It is actually better (1/10 percent error)
for small alpha (0.01-0.02) in this range. For sample sizes in the range 20-35 the error is
not greater than 1/10 percent. Above a sample size of 40, the p-value is essentially exact.
The random sampling distribution of kurtosis is difficult to model, so exact p-values are
not calculated. Instead, the program looks up the value in the table and displays the range
of the significance. For sample sizes greater than 5,000, it uses the Normal approximation
of the kurtosis sampling distribution to generate p-values.

A fundamental task in many statistical analyses is to characterize the location and
variability of a data set. A further characterization of the data includes skewness and
kurtosis.
Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A
distribution, or data set, is symmetric if it looks the same to the left and right of the
center point.
Kurtosis is a measure of whether the data are peaked or flat relative to a normal
distribution. That is, data sets with high kurtosis tend to have a distinct peak near the
mean, decline rather rapidly, and have heavy tails. Data sets with low kurtosis tend to
have a flat top near the mean rather than a sharp peak. A uniform distribution would be
the extreme case.
The histogram is an effective graphical technique for showing both the skewness and
kurtosis of a data set.
Definition of Skewness
For univariate data Y_1, Y_2, ..., Y_N, the formula for skewness is:

\text{skewness} = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^3}{(N - 1)\, s^3}

where \bar{Y} is the mean, s is the standard deviation, and N is the number of data
points. The skewness for a normal distribution is zero, and any symmetric data should
have a skewness near zero. Negative values for the skewness indicate data that are
skewed left and positive values for the skewness indicate data that are skewed right.
By skewed left, we mean that the left tail is long relative to the right tail. Similarly,
skewed right means that the right tail is long relative to the left tail. Some measurements
have a lower bound and are skewed right. For example, in reliability studies, failure
times cannot be negative.
Definition of Kurtosis
For univariate data Y_1, Y_2, ..., Y_N, the formula for kurtosis is:

\text{kurtosis} = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^4}{(N - 1)\, s^4}

where \bar{Y} is the mean, s is the standard deviation, and N is the number of data
points.
Alternative Definition of Kurtosis
The kurtosis for a standard normal distribution is three. For this reason, some sources
use the following definition of kurtosis (often referred to as "excess kurtosis"):

\text{excess kurtosis} = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^4}{(N - 1)\, s^4} - 3

This definition is used so that the standard normal distribution has a kurtosis of zero.
In addition, with the second definition positive kurtosis indicates a "peaked" distribution
and negative kurtosis indicates a "flat" distribution.
Which definition of kurtosis is used is a matter of convention (this handbook uses the
original definition). When using software to compute the sample kurtosis, you need to be
aware of which convention is being followed. Many sources use the term kurtosis when
they are actually computing "excess kurtosis", so it may not always be clear.
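The two kurtosis conventions differ only by the constant 3, which the following sketch makes explicit. It assumes the sums are divided by (N - 1) times a power of the sample standard deviation; other sources divide by N instead, and the function names are illustrative.

```python
import statistics


def skewness(data):
    """Sum of cubed deviations over (N - 1) * s**3 (assumed divisor convention)."""
    n = len(data)
    mean = sum(data) / n
    s = statistics.stdev(data)  # sample standard deviation, divisor N - 1
    return sum((y - mean) ** 3 for y in data) / ((n - 1) * s ** 3)


def kurtosis(data):
    """'Original' definition: a normal distribution scores about 3."""
    n = len(data)
    mean = sum(data) / n
    s = statistics.stdev(data)
    return sum((y - mean) ** 4 for y in data) / ((n - 1) * s ** 4)


def excess_kurtosis(data):
    """'Excess' definition: a normal distribution scores about 0."""
    return kurtosis(data) - 3


values = [1, 2, 3, 4, 5]  # made-up, evenly spread (flat) data
print(skewness(values), kurtosis(values), excess_kurtosis(values))
```

When comparing numbers from two programs, a difference of almost exactly 3 is a strong hint that one reports kurtosis and the other excess kurtosis.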
Examples
The following example shows histograms for 10,000 random numbers generated from a
normal, a double exponential, a Cauchy, and a Weibull distribution.

Normal Distribution
The first histogram is a sample from a normal distribution. The normal distribution is a
symmetric distribution with well-behaved tails. This is indicated by the skewness of 0.03.
The kurtosis of 2.96 is near the expected value of 3. The histogram verifies the symmetry.
Double Exponential Distribution
The second histogram is a sample from a double exponential distribution. The double
exponential is a symmetric distribution. Compared to the normal, it has a stronger peak,
more rapid decay, and heavier tails. That is, we would expect a skewness near zero and a
kurtosis higher than 3. The skewness is 0.06 and the kurtosis is 5.9.
Cauchy Distribution
The third histogram is a sample from a Cauchy distribution. For better visual comparison
with the other data sets, we restricted the histogram of the Cauchy distribution to values
between -10 and 10. The full data set for the Cauchy data in fact has a minimum of
approximately -29,000 and a maximum of approximately 89,000.
The Cauchy distribution is a symmetric distribution with heavy tails and a single peak at
the center of the distribution. Since it is symmetric, we would expect a skewness near
zero. Due to the heavier tails, we might expect the kurtosis to be larger than for a normal
distribution. In fact the skewness is 69.99 and the kurtosis is 6,693. These extremely high
values can be explained by the heavy tails: the Cauchy distribution has no finite mean or
variance, so a few extreme observations dominate the sample moments. Just as the mean
and standard deviation can be distorted by extreme values in the tails, so too can the
skewness and kurtosis measures.
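The sensitivity of kurtosis to heavy tails can be reproduced with a small simulation. This sketch is not the handbook's data: the seed, the generators, and the plain moment-ratio kurtosis (divisor N) are all assumptions made for illustration, so the numbers will differ from those quoted above.

```python
import math
import random


def kurtosis(data):
    """Fourth central moment over the squared second central moment."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m4 / m2 ** 2


random.seed(0)  # arbitrary seed so the run is repeatable
n = 10_000
normal_sample = [random.gauss(0, 1) for _ in range(n)]
# Standard Cauchy draws via the inverse CDF: tan(pi * (U - 1/2)).
cauchy_sample = [math.tan(math.pi * (random.random() - 0.5)) for _ in range(n)]

print("normal kurtosis:", kurtosis(normal_sample))
print("cauchy kurtosis:", kurtosis(cauchy_sample))
print("cauchy range:", min(cauchy_sample), max(cauchy_sample))
```

Changing the seed leaves the normal figure close to 3 but moves the Cauchy figure by orders of magnitude, which is the instability the text describes.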
Weibull Distribution
The fourth histogram is a sample from a Weibull distribution with shape parameter 1.5.
The Weibull distribution is a skewed distribution with the amount of skewness depending
on the value of the shape parameter. The degree of decay as we move away from the
center also depends on the value of the shape parameter. For this data set, the skewness
is 1.08 and the kurtosis is 4.46, which indicates moderate skewness and kurtosis.
Dealing with Skewness and Kurtosis
Many classical statistical tests and intervals depend on normality assumptions.
Significant skewness and kurtosis clearly indicate that data are not normal. If a data set
exhibits significant skewness or kurtosis (as indicated by a histogram or the numerical
measures), what can we do about it?
One approach is to apply some type of transformation to try to make the data normal, or
more nearly normal. The Box-Cox transformation is a useful technique for trying to
normalize a data set. In particular, taking the log or square root of a data set is often
useful for data that exhibit moderate right skewness.
Another approach is to use techniques based on distributions other than the normal. For
example, in reliability studies, the exponential, Weibull, and lognormal distributions are
typically used as a basis for modeling rather than using the normal distribution. The
probability plot correlation coefficient plot and the probability plot are useful tools for
determining a good distributional model for the data.
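The effect of a log transformation on right-skewed data can be seen on a toy example. The data below are made up (successive powers of two) and chosen so that the logged values are evenly spaced; real data will rarely become this symmetric. The skewness helper uses the plain moment ratio, which is an assumption about the divisor convention.

```python
import math


def skewness(data):
    """Third central moment over the 3/2 power of the second (divisor N)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


raw = [1, 2, 4, 8, 16, 32, 64]       # long right tail
logged = [math.log(x) for x in raw]  # evenly spaced after the transform
rooted = [math.sqrt(x) for x in raw]

print("raw:        ", skewness(raw))
print("square root:", skewness(rooted))
print("log:        ", skewness(logged))
```

The log corresponds to the Box-Cox transformation with its power parameter set to zero, and the square root to a power of one half, so comparing the two is a coarse version of searching over that parameter.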
Software
The skewness and kurtosis coefficients are available in most general purpose statistical
software programs, including Dataplot.

A high kurtosis distribution has a sharper peak and longer, fatter tails, while a low
kurtosis distribution has a more rounded peak and shorter, thinner tails.
Distributions with zero excess kurtosis are called mesokurtic, or mesokurtotic. The most
prominent example of a mesokurtic distribution is the normal distribution family,
regardless of the values of its parameters. A few other well-known distributions can be
mesokurtic, depending on parameter values: for example, the binomial distribution is
mesokurtic for p = 1/2 \pm \sqrt{1/12}.


A distribution with positive excess kurtosis is called
leptokurtic, or leptokurtotic. In terms of shape, a
leptokurtic distribution has a more acute peak around the
mean (that is, a higher probability than a normally
distributed variable of values near the mean) and fatter tails
(that is, a higher probability than a normally distributed
variable of extreme values). Examples of leptokurtic
distributions include the Laplace distribution and the
logistic distribution. Such distributions are sometimes
termed super Gaussian.

[Figure: The coin toss is the most platykurtic distribution.]

A distribution with negative excess kurtosis is called platykurtic, or platykurtotic. In
terms of shape, a platykurtic distribution has a lower, wider peak around the mean (that
is, a lower probability than a normally distributed variable of values near the mean) and
thinner tails (if viewed as the height of the probability density; that is, a lower
probability than a normally distributed variable of extreme values). Examples of
platykurtic distributions include the continuous or discrete uniform distributions, and the
raised cosine distribution. The most platykurtic distribution of all is the Bernoulli
distribution with p = ½ (for example the number of times one obtains "heads" when
flipping a coin once, a coin toss), for which the excess kurtosis is −2. Such distributions
are sometimes termed sub Gaussian.
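The binomial cases mentioned above can be checked with the closed-form excess kurtosis of a binomial(n, p) variable, (1 - 6p(1 - p)) / (n p (1 - p)). The sketch below evaluates it at the Bernoulli coin toss (n = 1, p = 1/2) and at the mesokurtic values p = 1/2 ± sqrt(1/12); the function name is an illustrative choice.

```python
import math


def binomial_excess_kurtosis(n, p):
    """Excess kurtosis of a binomial(n, p) distribution."""
    q = 1 - p
    return (1 - 6 * p * q) / (n * p * q)


# A single fair coin toss: the most platykurtic case.
print(binomial_excess_kurtosis(1, 0.5))

# The two success probabilities at which the binomial is mesokurtic.
for p in (0.5 - math.sqrt(1 / 12), 0.5 + math.sqrt(1 / 12)):
    print(p, binomial_excess_kurtosis(10, p))
```

At p = 1/2 ± sqrt(1/12) the product p(1 - p) equals 1/6, so the numerator vanishes for every n, up to floating-point rounding.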
In probability theory and statistics, kurtosis (from the Greek word κυρτός, kyrtos or
kurtos, meaning bulging) is a measure of the "peakedness" of the probability distribution
of a real-valued random variable. Higher kurtosis means more of the variance is the
result of infrequent extreme deviations, as opposed to frequent modestly sized deviations.
Kurtosis:
Kurtosis measures the "heaviness of the tails" of a
distribution (compared to a normal distribution).
Kurtosis is positive if the tails are "heavier" than for a
normal distribution, and negative if the tails are "lighter"
than for a normal distribution. Under this definition the
normal distribution has a kurtosis of zero.
Kurtosis characterizes the shape of a distribution - that
is, its value does not depend on an arbitrary change of
the scale and location of the distribution. For example,
kurtosis of a sample (or population) of temperature
values in Fahrenheit will not change if you transform the
values to Celsius (the mean and the variance will,
however, change).
The kurtosis of a distribution or sample is equal to the
4th central moment divided by the 4th power of the
standard deviation, minus 3.
To calculate the kurtosis of a sample:
i) subtract the mean from each value to get a set of
deviations from the mean;
ii) divide each deviation by the standard deviation of all
the deviations;
iii) average the 4th power of the deviations and subtract
3 from the result.
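Steps i) to iii) can be written out as a short function. This is a sketch: the function name and the temperature readings are made up, and the standard deviation is taken with divisor n so that it matches the central-moment definition given just before the steps. The second half checks the claim that converting Fahrenheit to Celsius leaves the kurtosis unchanged.

```python
import math


def sample_excess_kurtosis(values):
    """Follow steps i) to iii): deviations, standardize, average 4th powers, minus 3."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]                 # step i
    sd = math.sqrt(sum(d ** 2 for d in deviations) / n)     # divisor n (assumption)
    standardized = [d / sd for d in deviations]             # step ii
    return sum(z ** 4 for z in standardized) / n - 3        # step iii


fahrenheit = [50.0, 59.0, 68.0, 77.0, 104.0]  # made-up temperature readings
celsius = [(f - 32) * 5 / 9 for f in fahrenheit]

# Mean and variance change with the units; the kurtosis does not.
print(sample_excess_kurtosis(fahrenheit))
print(sample_excess_kurtosis(celsius))
```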

Kurtosis, of Greek origin meaning "bulging" or "swelling", is a measurement used to
describe the peakedness of a data distribution relative to a bell curve. In other words,
kurtosis measures whether the data are sharp or flat relative to a normal distribution.
Since kurtosis measures the shape of the distribution (the fatness of the tails), it focuses
on how returns are arranged around the mean. A normal distribution has a kurtosis
coefficient of three. Kurtosis of less than three indicates a low peak with a fat midrange
on either side; this is referred to as platykurtic. Conversely, kurtosis greater than three
indicates a sharp, high peak with a thin midrange and fat tails; this is called leptokurtic.
Put simply, kurtosis describes how bunched around the center or spread at the endpoints
a frequency distribution is. Investors can use kurtosis to describe trends found in charts
and to assess volatility; sometimes kurtosis is called "the volatility of volatility."
Kurtosis is related to skewness, except that skewness measures the imbalance between
the two tails rather than their combined fatness.
