n
x
i
i=1
x := .
n
Note: in this class we will work with both the population mean and the sample
mean x. Do not confuse them! Remember, x is the mean of a sample taken from the
population and is the mean of the whole population.
Sample median: order the data values x
(1)
x
(2)
x
(n)
, so then
x
(
n+1 n odd
2
)
median:= x:= .
1
[x
(
n
)
+x
(
n
+1)
] n even
2 2 2
Mean and median can be very dierent: 1,2,3,4, 500 .
outlier
The median is more robust to outliers.
Quantiles/Percentiles: Order the sample, then nd x
p
so that it divides the data
into two parts where:
a fraction p of the data values are less than or equal to x
p
and
the remaining fraction (1p) are greater than x
p
.
That value x
p
is the p
th
-quantile, or 100p
th
percentile.
5-number summary
{x
min
, Q
1
, Q
2
, Q
3
, x
max
},
where, Q
1
=
.25
, Q
2
=
.5
, Q
3
=
.75
.
Range: x
max
x
min
measures dispersion
Interquartile Range: IQR :=Q
3
Q
1
, range resistant to outliers
1
Sample Variance s
2
and Sample Standard Deviation s:
s
2
:=
1
n1
n
(x
i
x)
2
.
i=1
see why later
Remember,foralargesamplefromanormaldistribution,95%ofthesamplefallsin
[ x2s,x + 2s].
Do not confuse s
2
with
2
which is the variance of the population.
Coecient of variation(CV) :=
x
s
and
(y
i
x
tw+1
+ +x
t
MA
t
= ,
w
where w is the size of the window. Note that we make window w smaller at the
beginning of the time series when t < w.
Example
To use moving averages for forecasting, given x
1
, . . . , x
t1
, let the predicted value
at time t be x
t
=MA
t1
. Then the forecast error is:
e
t
=x
t
x
t
=x
t
MA
t1
.
The Mean Absolute Percent Error (MAPE) is:
T
1
100%.
e
t
MAPE =
T 1 x
t
t=2
6
MIT OpenCourseWare
http://ocw.mit.edu
15.075J / ESD.07J Statistical Thinking and Data Analysis
Fall 2011
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.