The customers in the Using Statistics scenario are asking questions about numerical variables. When summarizing and describing numerical variables, you need to do more than [...]. You need to consider the central [...]
This chapter discusses ways you can measure the central tendency, variation, and shape of a variable. You will also learn about the covariance and the coefficient of correlation, which help measure the strength of the association between two numerical variables. Using these measures would give the customers of the Choice Is Yours service the answers they seek.
3.1: Measures of Central Tendency
The Mean
The arithmetic mean (typically referred to as the mean) is the most common measure of central tendency. The mean is the only common measure in which all the values play an equal role. The mean serves as a "balance point" in a set of data (like the fulcrum on a seesaw). You calculate the mean by adding together all the values in a data set and then dividing that sum by the number of values in the data set.
The symbol $\bar{X}$, called X-bar, is used to represent the mean of a sample. For a sample containing n values, the equation for the mean of a sample is written as
$$\bar{X} = \frac{\text{Sum of the values}}{\text{Number of values}}$$
Using the series $X_1, X_2, \ldots, X_n$ to represent the set of n values and n to represent the number of values, the equation becomes:
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
By using summation notation (discussed fully in Appendix B), you replace the numerator $X_1 + X_2 + \cdots + X_n$ by the term $\sum_{i=1}^{n} X_i$, which means sum all the $X_i$ values from the first X value, $X_1$, to the last X value, $X_n$, to form Equation (3.1), a formal definition of the sample mean.
SAMPLE MEAN
The sample mean is the sum of the values divided by the number of values.
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \qquad (3.1)$$
where
$\bar{X}$ = sample mean
n = number of values or sample size
$X_i$ = ith value of the variable X
$\sum_{i=1}^{n} X_i$ = summation of all $X_i$ values in the sample
Because all the values play an equal role, a mean is greatly affected by any value that is greatly different from the others in the data set. When you have such extreme values, you should avoid using the mean.
The mean can suggest a typical or central value for a data set. For example, if you knew the typical time it takes you to get ready in the morning, you might be able to better plan your morning and minimize any excessive lateness (or earliness) going to your destination. Suppose you define the time to get ready as the time (rounded to the nearest minute) from when you get out of bed to when you leave your home. You collect the times shown below for 10 consecutive work days (stored in a data file):
Time (minutes): 39 29 43 52 39 44 40 31 44 35
$$\bar{X} = \frac{\text{Sum of the values}}{\text{Number of values}} = \frac{\sum_{i=1}^{n} X_i}{n}$$
$$\bar{X} = \frac{39 + 29 + 43 + 52 + 39 + 44 + 40 + 31 + 44 + 35}{10} = \frac{396}{10} = 39.6$$
With one extreme value in the data, the same computation gives
$$\bar{X} = \frac{\text{Sum of the values}}{\text{Number of values}} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{446}{10} = 44.6$$
The one extreme value has increased the mean by more than 10%, from 39.6 to 44.6 minutes. In contrast to the original mean that was in the "middle" (that is, greater than 5 of the getting-ready times and less than the 5 other times), the new mean is greater than 9 of the 10 getting-ready times. Because of the extreme value, the mean is now a poor measure of central tendency.
EXAMPLE 3.1 THE MEAN THREE-YEAR ANNUALIZED RETURN FOR SMALL-CAP GROWTH MUTUAL FUNDS WITH LOW RISK
The 838 mutual funds that are part of the Using Statistics scenario (see page 96) are classified according to the category (small cap, mid cap, and large cap), the type (growth or value), and the risk level of the mutual funds (low, average, and high). You are interested in small companies with a lot of growth potential, but you want to consider only those funds with low risk. Thus, you first sort the mutual fund data and locate the funds that are classified as specializing in small-cap companies that have growth objectives and are perceived to be low risk. The following seven companies meet all these criteria:
Fund                                    Category    Objective   Risk   Three-Year Return
Baron Growth                            Small Cap   Growth      Low    20.8
Columbia Acorn Z                        Small Cap   Growth      Low    26.0
FBR Small Cap                           Small Cap   Growth      Low    24.9
Perritt Micro Cap Opportunities         Small Cap   Growth      Low    29.9
Schroder Capital US Opportunities Inv   Small Cap   Growth      Low    22.3
Value Line Emerging Opportunities       Small Cap   Growth      Low    19.0
Wells Fargo Advtg Small Cap Opp Adm     Small Cap   Growth      Low    22.4
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{165.3}{7} = 23.6143$$
The ordered array for the seven small-cap growth funds with low risk is:
19.0 20.8 22.3 22.4 24.9 26.0 29.9
Four of these returns are below the mean of 23.61, and three of them are above the mean.
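As a quick check of Equation (3.1), the two means above can be reproduced with a few lines of Python. This is only a sketch: the function name `sample_mean` is an illustrative choice, and the data are typed in from the text rather than read from the chapter's data files.

```python
# Data copied from the text; the function name is an illustrative choice.
times = [39, 29, 43, 52, 39, 44, 40, 31, 44, 35]        # getting-ready times (minutes)
returns = [20.8, 26.0, 24.9, 29.9, 22.3, 19.0, 22.4]    # three-year annualized returns


def sample_mean(values):
    """Equation (3.1): the sum of the values divided by the number of values."""
    return sum(values) / len(values)


mean_time = sample_mean(times)
mean_return = sample_mean(returns)

# Count how many fund returns fall on each side of the mean.
below = sum(1 for r in returns if r < mean_return)
above = sum(1 for r in returns if r > mean_return)

print(f"Mean getting-ready time: {mean_time:.1f} minutes")
print(f"Mean three-year return: {mean_return:.4f}")
print(f"{below} returns below the mean, {above} above")
```

The counts on either side of the mean match the statement that four returns are below it and three are above.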
The Median
The median is the middle value in a set of data that has been ranked from smallest to largest. Half the values are smaller than or equal to the median, and half the values are larger than or equal to the median. The median is not affected by extreme values, so you can use the median when extreme values are present.
To calculate the median for a set of data, you first rank the values from smallest to largest and then use Equation (3.2) to compute the rank of the value that is the median.
MEDIAN
$$\text{Median} = \frac{n+1}{2}\ \text{ranked value} \qquad (3.2)$$
Ranked values: 29 31 35 39 39 40 43 44 44 52
Ranks:          1  2  3  4  5  6  7  8  9 10
Median = 39.5
The 838 mutual funds that are part of the Using Statistics scenario (see page 96) are classified according to the category (small cap, mid cap, and large cap), the type (growth or value), and the risk level of the mutual funds (low, average, and high). Compute the median three-year annualized return for the small-cap growth funds with low risk.
SOLUTION Because the result of dividing n + 1 by 2 is (7 + 1)/2 = 4 for this sample of seven, using Rule 1, the median is the fourth ranked value. The three-year annualized returns for the small-cap growth funds with low risk (see page 99) are ranked from the smallest to the largest:
Ranked values: 19.0 20.8 22.3 22.4 24.9 26.0 29.9
Ranks:            1    2    3    4    5    6    7
Median = 22.4
The Mode
The mode is the value in a set of data that appears most frequently. Like the median and unlike the mean, extreme values do not affect the mode. Often, there is no mode or there are several modes in a set of data. For example, consider the time-to-get-ready data shown below:
29 31 35 39 39 40 43 44 44 52
1 3 0 3 26 2 7 4 0 2 3 3 6 3
SOLUTION The ordered array for these data is
0 0 1 2 2 3 3 3 3 3 4 6 7 26
Because 3 appears five times, more times than any other value, the mode is 3. Thus, the systems manager can say that the most common occurrence is having three server failures in a day. For this data set, the median is also equal to 3, and the mean is equal to 4.5. The extreme value 26 is an outlier. For these data, the median and the mode better measure central tendency than the mean.
A set of data has no mode if none of the values is "most typical." Example 3.4 presents a data set with no mode.
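The comparison of mean, median, and mode for the server-failure data can be verified with Python's standard `statistics` module. This is a minimal sketch, assuming Python 3.8 or later (for `multimode`); the variable names are illustrative.

```python
from statistics import mean, median, multimode

# Server failures per day, copied from the example above.
failures = [1, 3, 0, 3, 26, 2, 7, 4, 0, 2, 3, 3, 6, 3]

ordered = sorted(failures)        # the ordered array
modes = multimode(failures)       # every value tied for the highest frequency
center_median = median(failures)  # average of the two middle ranked values
center_mean = mean(failures)      # pulled upward by the outlier 26

print("Ordered array:", ordered)
print("Mode(s):", modes)
print("Median:", center_median)
print("Mean:", center_mean)
```

Note that `multimode` returns every value when no value repeats, so a caller would have to treat that case as "no mode" in the sense used here.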
Quartiles
Quartiles split a set of data into four equal parts. The first quartile, $Q_1$, divides the smallest 25.0% of the values from the other 75.0% that are larger. The second quartile, $Q_2$, is the median: 50.0% of the values are smaller than the median and 50.0% are larger. The third quartile, $Q_3$, divides the smallest 75.0% of the values from the largest 25.0%. Equations (3.3) and (3.4) define the first and third quartiles.¹
¹ $Q_1$, the median, and $Q_3$ are also the 25th, 50th, and 75th percentiles, respectively. Equations (3.2), (3.3), and (3.4) can be expressed generally in terms of finding percentiles.
CHAPTER THREE Numerical Descriptive Measures
Use the following rules to calculate the quartiles:
Ranked values: 29 31 35 39 39 40 43 44 44 52
Ranks:          1  2  3  4  5  6  7  8  9 10
$$Q_1 = \frac{n+1}{4}\ \text{ranked value} = \frac{7+1}{4}\ \text{ranked value} = \text{2nd ranked value}$$
$$Q_3 = \frac{3(n+1)}{4}\ \text{ranked value} = \frac{3(7+1)}{4}\ \text{ranked value} = \text{6th ranked value}$$
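The ranked-value computation for the quartiles can be sketched in Python. The handling of fractional ranks is an assumption, since the rules are not legible here: a whole-number rank selects that ranked value, a rank ending in .5 averages the two neighboring values, and any other rank is rounded to the nearest whole rank. This convention reproduces the quartiles used elsewhere in this chapter (35 and 44 for the getting-ready times).

```python
def ranked_value(ordered, rank):
    """Return the value at a 1-based rank in an ordered list.

    Assumed convention for fractional ranks: x.5 averages the two
    neighboring values, anything else rounds to the nearest rank.
    """
    whole = int(rank)
    fraction = rank - whole
    if fraction == 0:
        return ordered[whole - 1]
    if fraction == 0.5:
        return (ordered[whole - 1] + ordered[whole]) / 2
    return ordered[round(rank) - 1]


def quartiles(values):
    """First and third quartiles from the (n + 1)/4 and 3(n + 1)/4 ranks."""
    ordered = sorted(values)
    n = len(ordered)
    return ranked_value(ordered, (n + 1) / 4), ranked_value(ordered, 3 * (n + 1) / 4)


returns = [20.8, 26.0, 24.9, 29.9, 22.3, 19.0, 22.4]   # n = 7: ranks 2 and 6
times = [39, 29, 43, 52, 39, 44, 40, 31, 44, 35]       # n = 10: ranks 2.75 and 8.25

print("Funds Q1, Q3:", quartiles(returns))
print("Times Q1, Q3:", quartiles(times))
```

Library routines such as `statistics.quantiles` interpolate between ranked values instead of rounding, so they can return slightly different quartiles for the same data.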
The Geometric Mean
The geometric mean measures the rate of change of a variable over time. Equation (3.5) defines the geometric mean.
$$\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n} \qquad (3.5)$$
$$\bar{X} = \frac{(-0.50) + (1.00)}{2} = 0.25 \text{ or } 25\%$$
The rate of return for Year 1 is
$$R_1 = \frac{50{,}000 - 100{,}000}{100{,}000} = -0.50 \text{ or } -50\%$$
and the rate of return for Year 2 is
$$R_2 = \frac{100{,}000 - 50{,}000}{50{,}000} = 1.00 \text{ or } 100\%$$
$$\bar{R}_G = [(1 + R_1) \times (1 + R_2)]^{1/n} - 1$$
$$= [(1 + (-0.50)) \times (1 + (1.0))]^{1/2} - 1$$
$$= [(0.50) \times (2.0)]^{1/2} - 1$$
$$= [1.0]^{1/2} - 1$$
$$= 1 - 1 = 0$$
The percentage change in the Russell 2000 Index of the stock prices of 2,000 small companies was +18.33% in 2004 and +4.55% in 2005. Compute the geometric rate of return.
SOLUTION Using Equation (3.6), the geometric mean rate of return in the Russell 2000 Index for the two years is
$$\bar{R}_G = [(1 + R_1) \times (1 + R_2)]^{1/n} - 1$$
$$= [(1 + (0.1833)) \times (1 + (0.0455))]^{1/2} - 1$$
$$= [(1.1833) \times (1.0455)]^{1/2} - 1$$
$$= [1.2371]^{1/2} - 1$$
$$= 1.1123 - 1 = 0.1123$$
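The geometric mean rate of return used in these two computations can be checked with a short Python sketch. It assumes Python 3.8 or later (for `math.prod`); the function name is an illustrative choice, and rates are entered as decimals (0.1833 for 18.33%).

```python
import math


def geometric_mean_return(rates):
    """[(1 + R1) x (1 + R2) x ... x (1 + Rn)]^(1/n) - 1 for decimal rates."""
    growth = math.prod(1 + r for r in rates)
    return growth ** (1 / len(rates)) - 1


# A loss of 50% followed by a gain of 100% returns to the starting value.
two_year = [-0.50, 1.00]
arithmetic = sum(two_year) / len(two_year)
geometric = geometric_mean_return(two_year)

# Russell 2000 Index changes for 2004 and 2005, from the example.
russell = geometric_mean_return([0.1833, 0.0455])

print(f"Arithmetic mean: {arithmetic:.2f}, geometric mean: {geometric:.2f}")
print(f"Russell 2000 geometric mean rate of return: {russell:.4f}")
```

The arithmetic mean of 25% overstates what actually happened to the investment, while the geometric mean of 0 reflects that the ending value equals the starting value.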
The Range
The range is the simplest numerical descriptive measure of variation in a set of data.
RANGE
The range is equal to the largest value minus the smallest value.
To determine the range of the times to get ready in the morning, you rank the data from smallest to largest:
29 31 35 39 39 40 43 44 44 52
The Interquartile Range
The interquartile range (also called midspread) is the difference between the third and first quartiles in a set of data.
The interquartile range measures the spread in the middle 50% of the data. Therefore, it is not influenced by extreme values. To determine the interquartile range of the times to get ready, you subtract the first quartile from the third quartile of the ranked times:
29 31 35 39 39 40 43 44 44 52
Interquartile range = 44 - 35 = 9 minutes
Therefore, the interquartile range in the time to get ready is 9 minutes.
The Variance and the Standard Deviation
Although the range and the interquartile range are measures of variation, they do not take into consideration how the values distribute or cluster between the extremes. Two commonly used measures of variation that take into account how all the values in the data are distributed are the variance and the standard deviation. These statistics measure the "average" scatter around the mean: how larger values fluctuate above it and how smaller values distribute below it.
3.2: Variation and Shape
SAMPLE VARIANCE
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \qquad (3.9)$$
SAMPLE STANDARD DEVIATION
$$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} \qquad (3.10)$$
If the denominator were n instead of n - 1, Equation (3.9) [and the inner term in Equation (3.10)] would calculate the average of the squared differences around the mean. However, n - 1 is used because of certain desirable mathematical properties possessed by the statistic $S^2$ that
Computing the Variance of the Getting-Ready Times ($\bar{X}$ = 39.6)
Time (X)   Step 1: (Xi - X-bar)   Step 2: (Xi - X-bar)^2
39          -0.60                   0.36
29         -10.60                 112.36
43           3.40                  11.56
52          12.40                 153.76
39          -0.60                   0.36
44           4.40                  19.36
40           0.40                   0.16
31          -8.60                  73.96
44           4.40                  19.36
35          -4.60                  21.16
Step 3: Sum: 412.40
Step 4: Divide by (n - 1): 45.82
You can also calculate the variance by substituting values for the terms in Equation (3.9):
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{(39 - 39.6)^2 + (29 - 39.6)^2 + \cdots + (35 - 39.6)^2}{10 - 1} = \frac{412.4}{9} = 45.82$$
The sample standard deviation is
$$S = \sqrt{45.82} = 6.77$$
$$\sum_{i=1}^{n} (X_i - \bar{X}) = 0 \quad \text{for all sets of data}$$
This property is one of the reasons that the mean is used as the most common measure of central tendency.
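The four steps of the variance computation, and the property that the differences around the mean sum to zero, can be traced in Python. This is a sketch that follows Equations (3.9) and (3.10) directly; the variable names are illustrative.

```python
import math

times = [39, 29, 43, 52, 39, 44, 40, 31, 44, 35]   # getting-ready times (minutes)

n = len(times)
x_bar = sum(times) / n

# Step 1: difference between each value and the mean.
deviations = [x - x_bar for x in times]
# Step 2: square each difference.
squared = [d ** 2 for d in deviations]
# Step 3: sum the squared differences.
sum_of_squares = sum(squared)
# Step 4: divide by n - 1 to get the sample variance.
variance = sum_of_squares / (n - 1)
std_dev = math.sqrt(variance)

print(f"Sum of squared differences: {sum_of_squares:.2f}")
print(f"Sample variance: {variance:.2f}")
print(f"Sample standard deviation: {std_dev:.2f}")
print(f"Sum of differences around the mean: {sum(deviations):.10f}")
```

In floating-point arithmetic the sum of the differences may come out as a tiny nonzero number rather than exactly zero, which is why the last line is printed with fixed precision.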
EXAMPLE 3.9 COMPUTING THE VARIANCE AND STANDARD DEVIATION OF THE THREE-YEAR ANNUALIZED RETURNS FOR SMALL-CAP GROWTH MUTUAL FUNDS WITH LOW RISK
The 838 mutual funds that are part of the Using Statistics scenario (see page 96) are classified according to the category (small cap, mid cap, and large cap), the type (growth or value), and the risk level of the mutual funds (low, average, and high). Compute the variance and standard deviation of the three-year annualized returns for the small-cap growth funds with low risk (see page 99).
TABLE 3.2 Computing the Variance of the Three-Year Annualized Returns for the Small-Cap Growth Funds with Low Risk ($\bar{X}$ = 23.6143)
Three-Year Annualized Return   Step 1: (Xi - X-bar)   Step 2: (Xi - X-bar)^2
20.8                           -2.8143                 7.9202
26.0                            2.3857                 5.6916
24.9                            1.2857                 1.6531
29.9                            6.2857                39.5102
22.3                           -1.3143                 1.7273
19.0                           -4.6143                21.2916
22.4                           -1.2143                 1.4745
Step 3: Sum: 79.2686
Step 4: Divide by (n - 1): 13.2114
SOLUTION Using Equation (3.9) on page 107:
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{79.2686}{7 - 1} = 13.2114$$
Using Equation (3.10) on page 107, the sample standard deviation, S, is
$$S = \sqrt{S^2} = \sqrt{13.2114} = 3.635$$
The following summarizes the characteristics of the range, interquartile range, variance, and standard deviation:
• The more the data are spread out or dispersed, the larger the range, interquartile range, variance, and standard deviation.
• The more the data are concentrated or homogeneous, the smaller the range, interquartile range, variance, and standard deviation.
• If the values are all the same (so that there is no variation in the data), the range, interquartile range, variance, and standard deviation will all equal zero.
• None of the measures of variation (the range, interquartile range, standard deviation, and variance) can ever be negative.
Unlike the previous measures of variation presented, the coefficient of variation is a relative measure of variation that is always expressed as a percentage rather than in terms of the units of the particular data. The coefficient of variation, denoted by the symbol CV, measures the scatter in the data relative to the mean.
COEFFICIENT OF VARIATION
The coefficient of variation is equal to the standard deviation divided by the mean, multiplied by 100%.
$$CV = \left(\frac{S}{\bar{X}}\right) 100\% \qquad (3.11)$$
where
S = sample standard deviation
$\bar{X}$ = sample mean
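For the sample of 10 getting-ready times, because $\bar{X}$ = 39.6 and S = 6.77, the coefficient of variation is
$$CV = \left(\frac{S}{\bar{X}}\right) 100\% = \left(\frac{6.77}{39.6}\right) 100\% = 17.10\%$$
For weight, the coefficient of variation is
$$CV_W = \left(\frac{3.9}{26.0}\right) 100\% = 15.0\%$$
For volume, the coefficient of variation is
$$CV_V = \left(\frac{2.2}{8.8}\right) 100\% = 25.0\%$$
Thus, relative to the mean, the package volume is much more variable than the package weight.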
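Because the coefficient of variation is unit-free, it can compare data sets measured on different scales. The sketch below applies the definition (standard deviation divided by the mean, times 100%) to two data sets from this chapter using the standard `statistics` module; the function name is an illustrative choice.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the sample mean."""
    return stdev(values) / mean(values) * 100


times = [39, 29, 43, 52, 39, 44, 40, 31, 44, 35]        # minutes
returns = [20.8, 26.0, 24.9, 29.9, 22.3, 19.0, 22.4]    # three-year annualized returns

cv_times = coefficient_of_variation(times)
cv_returns = coefficient_of_variation(returns)

print(f"CV of getting-ready times: {cv_times:.1f}%")
print(f"CV of fund returns: {cv_returns:.1f}%")
```

Relative to their means, the getting-ready times are somewhat more variable than the fund returns, even though the two variables are in different units.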
Z Scores
An extreme value or outlier is a value located far away from the mean. Z scores are useful in identifying outliers. The larger the Z score, the greater the distance from the value to the mean. The Z score is the difference between the value and the mean, divided by the standard deviation.
Z SCORES
$$Z = \frac{X - \bar{X}}{S} \qquad (3.12)$$
Table 3.3 shows the Z scores for all 10 days. The largest Z score is 1.83 for Day 4, on which the time to get ready was 52 minutes. The lowest Z score was -1.57 for Day 2, on which the time to get ready was 29 minutes. As a general rule, a Z score is considered an outlier if it is less than -3.0 or greater than +3.0. None of the times met that criterion to be considered outliers.
TABLE 3.3 Z Scores for the 10 Getting-Ready Times
Time (X)   Z Score
39         -0.09
29         -1.57
43          0.50
52          1.83
39         -0.09
44          0.65
40          0.06
31         -1.27
44          0.65
35         -0.68
Mean                 39.6
Standard deviation   6.77
SOLUTION Table 3.4 illustrates the Z scores of the three-year annualized returns for the small-cap growth funds with low risk. The largest Z score is 1.73, for an annualized return of 29.9. The lowest Z score is -1.27, for an annualized return of 19.0. There are no apparent outliers in these data because none of the Z scores are less than -3 or greater than +3.
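The Z scores and the outlier check can be reproduced with a short Python sketch based on Equation (3.12). The cutoff of 3 standard deviations follows the general rule stated in the text; the function name is illustrative.

```python
from statistics import mean, stdev


def z_scores(values):
    """Z = (X - mean) / S for every value, using the sample standard deviation."""
    x_bar = mean(values)
    s = stdev(values)
    return [(x - x_bar) / s for x in values]


times = [39, 29, 43, 52, 39, 44, 40, 31, 44, 35]
returns = [20.8, 26.0, 24.9, 29.9, 22.3, 19.0, 22.4]

z_times = z_scores(times)
z_returns = z_scores(returns)

# Flag any value whose Z score is below -3.0 or above +3.0.
outliers = [x for x, z in zip(times, z_times) if abs(z) > 3.0]

for x, z in zip(times, z_times):
    print(f"{x:3d}  {z:6.2f}")
print("Outliers:", outliers)
```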
Panel A: Negative, or left-skewed. Panel B: Symmetrical. Panel C: Positive, or right-skewed.
The data in Panel A are negative, or left-skewed. In this panel, most of the values are in the upper portion of the distribution. A long tail and distortion to the left is caused by some extremely small values. These extremely small values pull the mean downward so that the mean is less than the median.
The data in Panel B are symmetrical. Each half of the curve is a mirror image of the other half of the curve. The low and high values on the scale balance, and the mean equals the median.
The data in Panel C are positive, or right-skewed. In this panel, most of the values are in the lower portion of the distribution. A long tail on the right is caused by some extremely large values. These extremely large values pull the mean upward so that the mean is greater than the median.
VISUAL EXPLORATIONS Exploring Descriptive Statistics
You can use the Visual Explorations Descriptive Statistics procedure to see the effect of changing data values on measures of central tendency, variation, and shape. Open the Visual Explorations add-in workbook (see Appendix D) and select VisualExplorations → Descriptive Statistics (Excel 97-2003) or Add-ins → Visual Explorations → Descriptive Statistics (Excel 2007) from the Microsoft Excel menu bar. Read the instructions in the pop-up box (see illustration below) and click OK to examine a dot-scale diagram for the sample of 10 getting-ready times used throughout this chapter.
Experiment by entering an extreme value such as 10 minutes into one of the tinted cells of column A. Which measures are affected by this change? Which ones are not? You can flip between the "before" and "after" diagrams by repeatedly pressing Ctrl + Z (undo) followed by Ctrl + Y (redo) to help see the changes the extreme value caused in the diagram.
The Descriptive Statistics procedure of the ToolPak add-in (see Section E3.1) computes the mean, median, mode, standard deviation, variance, range, minimum, maximum, and count (sample size) and displays these statistics on a new worksheet. In addition, the procedure calculates and displays the standard error, the kurtosis, and skewness, three statistics not discussed previously in this section. The standard error, discussed in a later chapter, is the standard deviation divided by the square root of the sample size. Skewness measures the lack of symmetry in the data. A skewness value of zero indicates a symmetrical distribution. A positive value indicates right skewness, while a negative value indicates left skewness. Kurtosis measures the relative concentration of values in the center of the distribution, as compared with the tails. A kurtosis value of zero indicates a bell-shaped distribution. A negative value indicates a distribution that is flatter than a bell-shaped distribution. A positive value indicates a distribution with a sharper peak than a bell-shaped distribution.
Figure 3.2 shows the results of using the Descriptive Statistics procedure for low-risk, average-risk, and high-risk mutual funds. (The procedure was used three times, and the results that appeared on three separate worksheets were consolidated on one sheet.)
[Figure 3.2: Descriptive statistics worksheet for the three risk levels]
In examining the results, there appear to be slight differences in the three-year annualized return for the three risk levels. Low-risk and high-risk funds had a slightly higher mean and median than did average-risk funds. There was very little difference in the standard deviations of the three groups. Each of the risk levels showed some evidence of positive skewness.
Learning the Basics
3.1 The following is a set of data from a sample of n = 5:
7 4 9 8 2
a. Compute the mean, median, and mode.
b. Compute the range, interquartile range, variance, standard deviation, and coefficient of variation.
c. Compute the Z scores. Are there any outliers?
d. Describe the shape of the data set.
3.2 The following is a set of data from a sample of n = 6:
7 4 9 7 3 12
a. Compute the mean, median, and mode.
b. Compute the range, interquartile range, variance, standard deviation, and coefficient of variation.
c. Compute the Z scores. Are there any outliers?
d. Describe the shape of the data set.
3.3 The following set of data is from a sample of n = 7:
12 7 4 9 0 7 3
a. Compute the mean, median, and mode.
b. Compute the range, interquartile range, variance, standard deviation, and coefficient of variation.
c. Compute the Z scores. Are there any outliers?
d. Describe the shape of the data set.
3.4 The following is a set of data from a sample of n = 5:
7 -5 -8 7 9
a. Compute the mean, median, and mode.
b. Compute the range, interquartile range, variance, standard deviation, and coefficient of variation.
c. Compute the Z scores. Are there any outliers?
d. Describe the shape of the data set.
3.5 Suppose the rate of return for a particular stock during the past two years was 10% and 30%. Compute the geometric mean rate of return. (Note: A rate of return of 10% is recorded as 0.10, and a rate of return of 30% is recorded as 0.30.)
Applying the Concepts
3.6 The operations manager of a plant that manufactures tires wants to compare the actual inner diameters of two grades of tires, each of which is expected to be 575 millimeters. A sample of five tires of each grade was selected, and the results representing the inner diameters of the tires, ranked from smallest to largest, are as follows:
Grade X: 568 570 575 578 584
Grade Y: 573 574 575 577 578
a. For each of the two grades of tires, compute the mean, median, and standard deviation.
b. Which grade of tire is providing better quality? Explain.
c. What would be the effect on your answers in (a) and (b) if the last value for grade Y were 588 instead of 578? Explain.
3.7 The data in the accompanying file contain the price for two tickets with online service charges, large popcorn, and two medium soft drinks at a sample of six theatre chains:
$36.15 $31.00 $35.05 $40.25 $33.75 $43.00
Source: Extracted from K. Kelly, "The Multiplex Under Siege," The Wall Street Journal, December 24-25, 2005, pp. P1, P5.
a. Compute the mean, median, first quartile, and third quartile.
b. Compute the variance, standard deviation, range, interquartile range, and coefficient of variation.
c. Are the data skewed? If so, how?
d. Based on the results of (a) through (c), what conclusions can you reach concerning the cost of going to the movies?
3.8 A total of 92,000 new single-family homes were sold in the United States during February 2006. The median price of the homes was $230,400, a decrease of 2.9% from February 2005 (U.S. Census Bureau, www.census.gov). Why do you think the Census Bureau refers to the median price instead of the mean price?
3.9 The data in the accompanying file contain the bounced check fees, in dollars, for a sample of 23 banks for direct-deposit customers who maintain a $100 balance:
26 28 20 20 21 22 25 25 18 25 15 20
18 20 25 25 22 30 30 30 15 20 29
Source: Extracted from "The New Face of Banking," June 2000. Copyright © 2000 by Consumers Union of U.S., Inc., Yonkers, NY 10703-1057.
a. Compute the mean, median, first quartile, and third quartile.
b. Compute the variance, standard deviation, range, interquartile range, coefficient of variation, and Z scores.
c. Are the data skewed? If so, how?
d. Based on the results of (a) through (c), what conclusions can you reach concerning the bounced check fees?
6L'E 6r'9 9V'9 Zy S 8€'0 0I'9 js'v
ffiililEV9'E VEZ LL'V EI'9 Z0'E S9'S rZ'V 0v7, 0I I 07r 08€ 9E 092
petsllerepTrelfi@fit ellJ ewq
:,ry\oloq 09v 08E \Lr 98 08I 00€
'>loo./y\
euo Jo poped 3 JoAo :sererueclelr?IpIex1deerqlroJ (s1oqsul) oJII,("re1
$peuleluoc eJ€ sllnseJ oIII
sr rnoq s1ql Suunp sreluolsnc g I Jo eldures 3 Jo -wq,eqtslueserder@ellJ otll q ewp eqJ, ZVt
LY\Jellel oql seI{ceeJel{s Jo oI{ uoq,&\ol ewl oq} sJe}ue
aseqcl^\
eq] erull oql se peulJep) solnuru uI 'eulp Eultlezlt
-pu€sue>lclqcJo 1€Jlelol oql Suturecuocqc€eJno.( uel
?olred qcunl 'ru'd 00:I-ol-uoou eql Eupnp sroruol suorsnlcuoc ler{^a'(c) qEnorq}(€)Jo s}lnsereq}uo pes€g 'p
ffmnresJoJssecordpenordultue pedololep seq ,qlc elep oql eJY 'c
el\oq 'or JI ape,/Ke>ls
nsFlslpl€IcreruruocE rrI pelecol qcu€rq>lueqv I L'g 'ureldxg esrerllnotlueororpoJV
'sllnsoJ eql uI ocue 'seJocs 'e?uetelpenb;olul
Zpue 'uo4epenJoluoTclJJooc
eq] uo tuolu-ruoJ 'enlB^ srql Sutsn '(c) qEnorqt (e) 'e1uet 'uorlerlop pJepuels 'ecu€uen oI{} e}ndulo3 'q
'ggJo peelsul 86 s€^/y\ enl€AlsrlJ eql }€q1esoddnS 'p 'eppenb
'slo>lcl1,(ep-euo pJlrll pue 'eltpenb lsrlJ 'uerpeur'ueeul eql elndulo3 'e
ud uorssnupe Surlrels eql EuluJecuoc l{Jeer nort 'I fge 'dd 'h002 raqulatdas 'sgodeS rarunsuoJ
.J
rsnlcuoc let!!r. '(q) pu€ (e) lo sllnser eIP uo pes?g aql
,,'nua74J of tppag Sutpp7:pool 1sDl,, u'ro'r{papuqxfl :a)'mos
pBIAoppJ€pu€lspu€ 'ocrJeuen'eEu€Jeql elndulo3 'q
'eppenb 9S 0V 0€ 0€ 6Z 67, 61 9Z 0E EZ
-e mrean,median, first quartile, and third tions institutions,and the military, TASER'spopularity
has enjoyeda roller-coasterride. The stockprice in 2004
the r-ariance, standard deviation, tange, increased36r.4%,but in 2005,it decreased 78.0%
range, coefficient of vanattofl, and Z (Source: Extracted
fromfinance.yahoo.com, April I 7,2006).
fiiMG thEre any outliers? Explain.
a. Computethe geometricmean rate of increasefor fhe
skewed? If so, how?
two-yearperiod 2004-2005.(Hint: Denotean increase
rvalks into the branch office during the
of 3 61.4%asRl : 3.614.)
she asks the branch manager how long she
b. If you purchased$ 1,000of TASER stockat the start of
tmu-ait. The branch managerreplies, 'Almost
2004,what wasits valueat the end of 20A5?
fiqxi's
than five minutes." On the basis of the
c. comparethe resultof (b) to that of problem3.17(b).
( a pthrough (c), evaluate the accuracy of this
3.19 In 2002, all the major stock market indexes
decreaseddramaticallyas the attackson glll drove stock
that another branch. located in a residen-
prices spiralingdownward.Stockssoon rebounde4but
rurilsn
concerned with the noon-to- 1 p.m. lunch
what type of meanreturn did investorsexperienceover the
w,ururingtime, in minutes (defined as the time
four-yearperiod from 2002 to 2005?The datain the fol-
qnters the line to when he or she reachesthe
1owingtab1e(containedinthedatafi1eEBI@)repre-
). of a sampleof 15 customersduring this
sentthe total rate of return (in percentage)for the Dow
over a period of one week. The results
JonesIndustrtalAverage(DJIA), the Standard& Poor's
500 (s&P 500), and the technology-heavyNASDAe
Composite(Nasdaq).
5 ar) 8.02 5.79 9.73 3.82 9.01 9.35
ffi"ffi85.64 4.09 6.t7 g.gI 5.47
Year DJIA s&P500 NASDAQ
2005 -0._6 2.9 r .4
tfuemean,median, first quartile, and third 20041 3.4 g.L 8.6
2001 30.0 26.4 50.0
the variance, standard deviation, range,. 200t/ * 16.8 -24.2 -3 1.5
range, and coefficient of variation. Are
ruruthers?
Explain. source:Extracted
from finance.yahoo.com,
April I 4, 2006.
skewed? If so" how?
walks into the branch office during the a. Calculate the geometric mean rateof return for the DJIA,
he asks the branch manager how long he can S&P 500, and Nasdaq.
tM,wrait. The branch manager replies, 'Almost b. What conclusions can you reach concerning the geomet-
ilessthan five minutes." On the basis of the nc rates of return of the three market indexes?
(iat nhrough (c), evaluate the accuracy of this c. Compare the results of (b) to those of Problem 3.20 (b).
POPULATION MEAN
The population mean is the sum of the values in the population divided by the population size, N.
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} \qquad (3.13)$$
To compute the mean one-year return for the population of bond funds given in Table 3.5, use Equation (3.13):
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{2.74 + 1.62 + 2.25 + 2.88 + 3.66}{5} = \frac{13.15}{5} = 2.63$$
POPULATION VARIANCE
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \qquad (3.14)$$
where
$\mu$ = population mean
$X_i$ = ith value of the variable X
$\sum_{i=1}^{N} (X_i - \mu)^2$ = summation of all the squared differences between the $X_i$ values and $\mu$
POPULATION STANDARD DEVIATION
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} \qquad (3.15)$$
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
$$= \frac{(2.74 - 2.63)^2 + (1.62 - 2.63)^2 + (2.25 - 2.63)^2 + (2.88 - 2.63)^2 + (3.66 - 2.63)^2}{5}$$
$$= \frac{0.0121 + 1.0201 + 0.1444 + 0.0625 + 1.0609}{5} = \frac{2.30}{5} = 0.46$$
$$\sigma = \sqrt{\sigma^2} = \sqrt{0.46} = 0.68$$
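The population computations differ from the sample versions only in the divisor (N rather than n - 1). A minimal Python sketch of Equations (3.13) through (3.15), using the five returns from the computation above, is shown here; the variable names are illustrative.

```python
import math

# One-year returns for the population of five bond funds, from the text.
population = [2.74, 1.62, 2.25, 2.88, 3.66]

N = len(population)
mu = sum(population) / N                               # Equation (3.13)
sigma_sq = sum((x - mu) ** 2 for x in population) / N  # Equation (3.14): divide by N
sigma = math.sqrt(sigma_sq)                            # Equation (3.15)

print(f"Population mean: {mu:.2f}")
print(f"Population variance: {sigma_sq:.2f}")
print(f"Population standard deviation: {sigma:.2f}")
```

The standard library's `statistics.pvariance` and `statistics.pstdev` implement the same divide-by-N definitions, whereas `statistics.variance` and `statistics.stdev` divide by n - 1.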
Therefore, the typical percentage return differs from the mean of 2.63 by approximately 0.68. This small amount of variation suggests that these large bond funds produce results that do not differ greatly.
In most data sets, a large portion of the values tend to cluster somewhere near the median. In right-skewed data sets, this clustering occurs to the left of the mean, that is, at a value less than the mean. In left-skewed data sets, the values tend to cluster to the right of the mean, that is, at a value greater than the mean. In symmetrical data sets, where the median and mean are the same, the values often tend to cluster around the median and mean, producing a bell-shaped distribution. You can use the empirical rule to examine the variability in bell-shaped distributions:
• Approximately 68% of the values are within a distance of ±1 standard deviation from the mean.
• Approximately 95% of the values are within a distance of ±2 standard deviations from the mean.
• Approximately 99.7% are within a distance of ±3 standard deviations from the mean.
This empirical rule helps you measure how the values distribute above and below the mean and can help you identify outliers. The empirical rule implies that for bell-shaped distributions, only about 1 out of 20 values will be beyond two standard deviations from the mean in either direction. As a general rule, you can consider values not found in the interval $\mu \pm 2\sigma$ as potential outliers. The rule also implies that only about 3 in 1,000 will be beyond three standard deviations from the mean. Therefore, values not found in the interval $\mu \pm 3\sigma$ are almost always considered outliers. For heavily skewed data sets, or those not appearing bell shaped for any other reason, the Chebyshev rule discussed below should be applied instead of the empirical rule.
EXAMPLE 3.12 USING THE EMPIRICAL RULE
A population of 12-ounce cans of cola is known to have a mean fill-weight of 12.06 ounces and a standard deviation of 0.02. The population is also known to be bell shaped. Describe the distribution of fill-weights. Is it very likely that a can will contain less than 12 ounces of cola?
SOLUTION
$\mu \pm \sigma = 12.06 \pm 0.02 = (12.04, 12.08)$
$\mu \pm 2\sigma = 12.06 \pm 2(0.02) = (12.02, 12.10)$
$\mu \pm 3\sigma = 12.06 \pm 3(0.02) = (12.00, 12.12)$
The Chebyshev Rule
The Chebyshev rule (reference 1) states that for any data set, regardless of shape, the percentage of values that are found within distances of k standard deviations from the mean must be at least
$$\left(1 - \frac{1}{k^2}\right) \times 100\%$$
You can use this rule for any value of k greater than 1. Consider k = 2. The Chebyshev rule states that at least $[1 - (1/2)^2] \times 100\% = 75\%$ of the values must be found within ±2 standard deviations of the mean.
The Chebyshev rule is very general and applies to any type of distribution. The rule indicates at least what percentage of the values fall within a given distance from the mean. However, if the data set is approximately bell shaped, the empirical rule will more accurately reflect the greater concentration of data close to the mean. Table 3.6 compares the Chebyshev and empirical rules.
TABLE 3.6 % of Values Found in Intervals Around the Mean
Interval                   Chebyshev (any distribution)   Empirical Rule (bell-shaped distribution)
(mu - sigma, mu + sigma)       At least 0%                    Approximately 68%
(mu - 2 sigma, mu + 2 sigma)   At least 75%                   Approximately 95%
(mu - 3 sigma, mu + 3 sigma)   At least 88.89%                Approximately 99.7%
EXAMPLE 3.13 USING THE CHEBYSHEV RULE
As in Example 3.12, a population of 12-ounce cans of cola is known to have a mean fill-weight of 12.06 ounces and a standard deviation of 0.02. However, the shape of the population is unknown, and you cannot assume that it is bell shaped. Describe the distribution of fill-weights. Is it very likely that a can will contain less than 12 ounces of cola?
SOLUTION
$\mu \pm \sigma = 12.06 \pm 0.02 = (12.04, 12.08)$
$\mu \pm 2\sigma = 12.06 \pm 2(0.02) = (12.02, 12.10)$
$\mu \pm 3\sigma = 12.06 \pm 3(0.02) = (12.00, 12.12)$
Because the distribution may be skewed, you cannot use the empirical rule. Using the Chebyshev rule, you cannot say anything about the percentage of cans containing between 12.04 and 12.08 ounces. You can state that at least 75% of the cans will contain between 12.02 and 12.10 ounces and at least 88.89% will contain between 12.00 and 12.12 ounces. Therefore, between 0 and 11.11% of the cans will contain less than 12 ounces.
You can use these two rules for understanding how data are distributed around the mean when you have sample data. In each case, you use the value you calculated for $\bar{X}$ in place of $\mu$ and the value you calculated for S in place of $\sigma$. The results you compute using the sample statistics are approximations because you used sample statistics ($\bar{X}$, S) and not population parameters ($\mu$, $\sigma$).
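The intervals and percentages in Examples 3.12 and 3.13 can be tabulated with a short Python sketch. The function names are illustrative, and the empirical-rule percentages are hard-coded from the rule as stated above, so they apply only if the population really is bell shaped.

```python
def chebyshev_min_percent(k):
    """At least (1 - 1/k^2) x 100% of values lie within k standard deviations."""
    if k <= 1:
        raise ValueError("the rule is stated for k greater than 1")
    return (1 - 1 / k ** 2) * 100


def interval(mean, sd, k):
    """The interval mean +/- k standard deviations."""
    return (mean - k * sd, mean + k * sd)


# Approximate percentages from the empirical rule (bell-shaped data only).
EMPIRICAL_PERCENT = {1: 68.0, 2: 95.0, 3: 99.7}

mu, sigma = 12.06, 0.02   # fill-weight of the 12-ounce cans
for k in (2, 3):
    low, high = interval(mu, sigma, k)
    print(f"k={k}: ({low:.2f}, {high:.2f})  "
          f"Chebyshev at least {chebyshev_min_percent(k):.2f}%, "
          f"empirical about {EMPIRICAL_PERCENT[k]}%")

# Upper bound on the share of cans outside 12.00 to 12.12 ounces.
max_outside = 100 - chebyshev_min_percent(3)
print(f"At most {max_outside:.2f}% of cans fall outside the 3-sigma interval")
```

The Chebyshev figure is a guaranteed lower bound for any shape of distribution, which is why it is so much weaker than the empirical-rule figure for the same interval.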
the Basics
3.22 The following is a set of datafor a popula-
3"21 The following is a set of datafor a popula-
tion with l/: 10:
. I \ \ ir hi /-1 0 :
1t 8 3 62 98 7 5 6664 8 693
::e population mean. a. Compute the population mean.
:re population standarddeviation. b. Compute the population standarddeviation.
'(V prur-g secuere;er)1o1drolsqm.-pu?-xoq oqt pu€ Ar€luums
eql sepnlcut pqt srs,(leueelep ,fto1ero1dxeqEnorqt sl "tep I€cIJoIrmuEutqucsep
ulrluiJlfln-eArJ
** JaqiolrV.edeqs pue 'uorlepe,r',{cuepuel l€4uecJo sems"alussncslp€'€-I'g suo4ces
,I
Wffiruw
ffiHqdw-ffi
wmhwr Swffisffiffiryr
;illilril
A five-numbersummarvthatconsists
of
provides a way to determine the shape of a distribution. Table 3.7 explains how the relation-
ships among the "five numbers" allows you to reco gnrze the shape of a data set.
TABLE 3.7 Relationships Among the Five-Number Summary and the Type of Distribution

Comparison: the distance from Xsmallest to the median versus the distance from the median to Xlargest
  Left-Skewed: The distance from Xsmallest to the median is greater than the distance from the median to Xlargest.
  Symmetric: Both distances are the same.
  Right-Skewed: The distance from Xsmallest to the median is less than the distance from the median to Xlargest.

Comparison: the distance from Xsmallest to Q1 versus the distance from Q3 to Xlargest
  Left-Skewed: The distance from Xsmallest to Q1 is greater than the distance from Q3 to Xlargest.
  Symmetric: Both distances are the same.
  Right-Skewed: The distance from Xsmallest to Q1 is less than the distance from Q3 to Xlargest.

Comparison: the distance from Q1 to the median versus the distance from the median to Q3
  Left-Skewed: The distance from Q1 to the median is greater than the distance from the median to Q3.
  Symmetric: Both distances are the same.
  Right-Skewed: The distance from Q1 to the median is less than the distance from the median to Q3.
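The three comparisons in Table 3.7 can be sketched in a few lines of Python. The sample values and function names below are hypothetical. The quartiles come from `statistics.quantiles` with `method="exclusive"`, which interpolates between ranked values, so the results may differ slightly from a hand rule that rounds the ranked position.

```python
import statistics

def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest) for a list of numbers."""
    data = sorted(values)
    # "exclusive" places quartiles at ranked positions (n + 1) * p and
    # interpolates linearly when the position is not a whole number.
    q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return data[0], q1, median, q3, data[-1]

def compare(left, right):
    """Label one Table 3.7 comparison of a left-side and a right-side distance."""
    if left > right:
        return "left-skewed"
    if left < right:
        return "right-skewed"
    return "symmetric"

def shape_comparisons(summary):
    """Apply the three comparisons of Table 3.7 to a five-number summary."""
    smallest, q1, median, q3, largest = summary
    return [
        compare(median - smallest, largest - median),
        compare(q1 - smallest, largest - q3),
        compare(median - q1, q3 - median),
    ]

# Hypothetical data, not taken from the text.
sample = [1, 2, 3, 4, 5, 6, 20]
summary = five_number_summary(sample)
print(summary)
print(shape_comparisons(summary))
```

The three comparisons need not agree for real data, and exact equality of two distances is rare, so the labels are best read as hints to be checked against a box-and-whisker plot.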
A box-and-whisker plot provides a graphical representation of the data based on the five-number
summary. Figure 3.3 illustrates the box-and-whisker plot for the getting-ready times.
[Figure 3.3: box-and-whisker plot of the getting-ready times. Horizontal axis: Time (minutes). Values marked: 29, 35, 39.5 (median), 44, 52.]
The three comparisons listed in Table 3.7 are used to evaluate skewness. The distance from
Xsmallest to the median (22.4 - 19.0 = 3.4) is less than the distance from the median to Xlargest
(29.9 - 22.4 = 7.5). The distance from Xsmallest to Q1 (20.5 - 19.0 = 1.5) is less than the
distance from Q3 to Xlargest (29.9 - 26.0 = 3.9). The distance from Q1 to the median
(22.4 - 20.5 = 1.9) is less than the distance from the median to Q3 (26.0 - 22.4 = 3.6). Based
on the three comparisons, the data show a right-skewed distribution.
[Figure: box-and-whisker plots for different types of distributions. Panel C: right-skewed distribution. Panel D: rectangular distribution.]
A bank branch located in a commercial district of a city has developed an improved process for
serving customers during the noon-to-1:00 p.m. lunch period. The waiting time, in minutes
(defined as the time the customer enters the line to when he or she reaches the teller window),
of a sample of 15 customers during this hour is recorded over a period of one week. The results
are contained in the data file and are listed below:
... 3.02 5.13 4.77 2.34 3.54 3.20
... 0.38 5.12 6.46 6.19 3.79
Another branch, located in a residential area, is also concerned with the noon-to-1 p.m. lunch
hour. The waiting time, in minutes (defined as the time the customer enters the line to when he
or she reaches the teller window), of a sample of 15 customers during this hour is recorded over
a period of one week. The results are contained in the data file and are listed below:
9.66 5.90 8.02 5.79 8.73 3.82 8.01 8.35
10.49 6.68 5.64 4.08 6.17 9.91 5.47
a. List the five-number summaries of the waiting times at the two bank branches.
b. Construct box-and-whisker plots and describe the shape of the distribution of each for the
two bank branches.
c. What similarities and differences are there in the distributions of the waiting time at the
two bank branches?
3.5 THE COVARIANCE AND THE COEFFICIENT OF CORRELATION
In Section 2.5, you used scatter plots to visually examine the relationship between two
numerical variables. This section presents two numerical measures that examine the relationship
between two numerical variables: the covariance and the coefficient of correlation.
The Covariance
The covariance measures the strength of the linear relationship between two numerical variables
(X and Y). Equation (3.16) defines the sample covariance, and Example 3.16 illustrates its use.
THE SAMPLE COVARIANCE
$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} \qquad (3.16)$$
City              Hamburger   Movie Tickets
Tokyo               5.99         32.66
London              7.62         28.41
New York            5.75         20.00
Sydney              4.45         20.71
Chicago             4.99         18.00
San Francisco       5.29         19.50
Boston              4.39         18.00
Atlanta             3.70         16.00
Toronto             4.62         18.05
Rio de Janeiro      2.99          9.90
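As a check on Equation (3.16), the covariance of the two cost columns listed above can be computed directly. This is a minimal sketch: the function and variable names are illustrative, and the result can be compared with the covariance of 6.83777 used in Example 3.17.

```python
# Costs for the 10 cities: hamburger meal (X) and two movie tickets (Y).
hamburger = [5.99, 7.62, 5.75, 4.45, 4.99, 5.29, 4.39, 3.70, 4.62, 2.99]
tickets = [32.66, 28.41, 20.00, 20.71, 18.00, 19.50, 18.00, 16.00, 18.05, 9.90]

def sample_covariance(x, y):
    """Sum of cross-products of deviations from the means, divided by n - 1."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length, with at least two pairs")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross_products = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return cross_products / (n - 1)

cov_xy = sample_covariance(hamburger, tickets)
print(round(cov_xy, 5))
```

The sign of the covariance shows the direction of the linear relationship, but its size depends on the units of X and Y, which is why the coefficient of correlation is used to judge strength.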
Figure 3.6 contains a Microsoft Excel worksheet that calculates the covariance for these data.
The calculations area of Figure 3.6 breaks down Equation (3.16) into a set of smaller
calculations. From the worksheet, or by using Equation (3.16) directly, you find that the
covariance is 6.83777:
$$\operatorname{cov}(X, Y) = \frac{61.53993}{10 - 1} = 6.83777$$
The Coefficient of Correlation
The coefficient of correlation measures the relative strength of a linear relationship between
two numerical variables. The values of the coefficient of correlation range from -1 for a perfect
negative correlation to +1 for a perfect positive correlation. Perfect means that if the points
were plotted in a scatter plot, all the points could be connected with a straight line. When
dealing with population data for two numerical variables, the Greek letter ρ is used as the
symbol for the coefficient of correlation. Figure 3.7 illustrates three different types of
association between two variables.
FIGURE 3.7 Types of association between variables
FIGURE 3.8 Scatter plots created from Microsoft Excel and their sample coefficients of correlation, r
[Panel labels recovered from the figure: Panel D (r = 0.3), Panel F (r = 0.9).]
In Panel A, the coefficient of correlation, r, is -0.9. You can see that for small values of X,
there is a very strong tendency for Y to be large. Likewise, the large values of X tend to be
paired with small values of Y. The data do not all fall on a straight line, so the association
between X and Y cannot be described as perfect. The data in Panel B have a coefficient of
correlation equal to -0.6, and the small values of X tend to be paired with large values of Y.
The linear relationship between X and Y in Panel B is not as strong as that in Panel A. Thus,
the coefficient of correlation in Panel B is not as negative as that in Panel A. In Panel C,
the linear relationship between X and Y is very weak, r = -0.3, and there is only a slight
tendency for the small values of X to be paired with the larger values of Y. Panels D through F
depict data sets that have positive coefficients of correlation because small values of X tend
to be paired with small values of Y and the large values of X tend to be associated with large
values of Y.
In the discussion of Figure 3.8, the relationships were deliberately described as tendencies and
not as causes and effects. This wording was used on purpose. Correlation alone cannot prove that
there is a causation effect, that is, that the change in the value of one variable caused the
change in the other variable. A strong correlation can be produced simply by chance, by the
effect of a third variable not considered in the calculation of the correlation, or by a
cause-and-effect relationship. You would need to perform additional analysis to determine which
of these three situations actually produced the correlation. Therefore, you can say that
causation implies correlation, but correlation alone does not imply causation.
Equation (3.17) defines the sample coefficient of correlation, r, and Example 3.17 illustrates
its use.
THE SAMPLE COEFFICIENT OF CORRELATION
$$r = \frac{\operatorname{cov}(X, Y)}{S_X S_Y} \qquad (3.17)$$
where
$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$
$$S_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
$$S_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$$
EXAMPLE 3.17 COMPUTING THE SAMPLE COEFFICIENT OF CORRELATION
Consider the cost of a fast-food hamburger meal and the cost of two movie tickets in 10 cities
around the world (see Table 3.8 on page 127). From Figure 3.9 and Equation (3.17), compute
the sample coefficient of correlation.
SOLUTION
$$r = \frac{\operatorname{cov}(X, Y)}{S_X S_Y} = \frac{6.83777}{(1.2925)(6.337)} = 0.8348$$
[Figure 3.9: Microsoft Excel worksheet with the calculations for the sample coefficient of correlation between the hamburger meal and movie ticket costs.]
The cost of a fast-food hamburger meal and the cost of two movie tickets are positively
correlated. Those cities with the lowest cost of a fast-food hamburger meal tend to be
associated with the lowest cost of two movie tickets. Those cities with the highest cost of a
fast-food hamburger meal tend to be associated with the highest cost of two movie tickets. This
relationship is fairly strong, as indicated by a coefficient of correlation, r = 0.8348.
You cannot assume that having a low cost of a fast-food hamburger meal caused the low cost of
two movie tickets. You can only say that this is what tended to happen in the sample.
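The arithmetic in Example 3.17 can be reproduced from the raw city costs. The sketch below is only an illustration of Equation (3.17): the helper names are hypothetical, and the data are the 10 pairs of hamburger meal and movie ticket costs used in the example.

```python
import math

# Costs for the 10 cities: hamburger meal (X) and two movie tickets (Y).
hamburger = [5.99, 7.62, 5.75, 4.45, 4.99, 5.29, 4.39, 3.70, 4.62, 2.99]
tickets = [32.66, 28.41, 20.00, 20.71, 18.00, 19.50, 18.00, 16.00, 18.05, 9.90]

def sample_std(values):
    """Sample standard deviation, using n - 1 in the denominator."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def sample_covariance(x, y):
    """Sample covariance, using n - 1 in the denominator."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

def sample_correlation(x, y):
    """Covariance divided by the product of the two sample standard deviations."""
    return sample_covariance(x, y) / (sample_std(x) * sample_std(y))

r = sample_correlation(hamburger, tickets)
print(f"S_X = {sample_std(hamburger):.4f}, S_Y = {sample_std(tickets):.4f}, r = {r:.4f}")
```

Dividing by both standard deviations removes the units, so r always falls between -1 and +1 whatever the scale of the two variables.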
rilri|;;:rr-TT
spJ€pII?]SIUOUruIOAOS ]UOJJnCpU? JOU^/KOUee goq :s>lcnqrels
i:
i|Lli;
,f :leleJ er{} }no qE q.cBeJno.( ue} suolsnlcuoc }€r{1V\'p pu? slnuocl 6ur>luncle s>lulrp eeJJocpocl
'ur?ldxf,euollslerroc ecuno-gI Jo 'surerS ul ']€J pu€ serJol€coql lues
-,: r[J[30c oq] Jo e3u€IJe^oceq]-aEeeFul spJepue]s -erder ollJ eql ur etep eql O?'t
-,., -J-\OElUeJJnCpUU J3UA\OUeO1KlOq
iiri drqsuoll€leJ '( e) 8 €' €
iilr" ::sserdxeq elq€nl€A ororu sl {uF{} nof op qcll{71\ .J urelqord Jo esoq] ol (e) Jo sllnser eql eredulo3 'q
'uorlslerJocJo luorcrJJooceq] olndulo3 'q
'ecuerJe^ocoq] elndulo3 .E LsluelulsoAuI
'S'n Jo
Jo sedr$ reqlo e^lJ eseql Jo qcee pue spuog
'gt 'd'900e 'y y ["mnun; VSn ,,'pa'tallYaq luerulselul uo uJnloJ er{} uee.&ueqdrqsuo}}eleJ eq} Jo
rir ',,,:,xtDlnt)p3 [u,touoJg lanl,, 'tapa11 'p"lepoJ
u,to,{paputlxfl :a),mos ql8uer1s ot{} lnoqe o>lerunor( vel suolsnlcuoc }el{7y1'E
'0I'0 se..l\lqep sle>lreruSurEreruepu€ spuoq 'S'n pue
",99 E LE snlJde1o.(o;S00Z
'02'0- sBA s>lcolssle>lr€ruEurEreruepu€ spuoq 'S'n '8?'0,
:Eg elloro3 e1o,(o1E00Z s3.&\spuoq l€uol]surolul pu€ spuoq'S'n'8I'0- s€/!\s>lcols
; 8Z L'IT, ,furue3e1o.(o; g00Z dec II€us Ieuolleurelul pu€ spuoq 'S'O 'tl'g- s€.,Ks>lcols
J"8r 8'91 reroldxg proC7,002 dec e8rel Ieuorl€uralul pue spuoq 'S'n Jo luetulsonul
'r"Lv 8'8t plrq,(H cl^lc spuoH v00z uo uJnleJ er{} uee,&uequor}€leJJocJo }uolclileoc eq} wql
;"F[ 6'LZ cl^I3 upuoH7,002 pelels spuoq u8rerog ur ]uerulselur pqssncslp wqt ( t C
:"97, 8'LZ xT proscYBpuoH7,002 'd 6nyz 'gZ JegruenoN 'pu,tnoy pa"tts na/ll aqJ ,.'spung
r
!"1, 0'9I opsro^lls]elor^oqc 9002 uEre;og ur orloJuod {co}S rleql Jo %08 o} dn }nd
$ '9 r EV r 091-cprod 9002 plnoqs srolsenul {q4,, 'slueulolJ 'f) elcl}r€ uV 6g'g
aao9 Jouaro reJ '( e) $t
urolqord Jo esoql ol (e) Jo sllnser er{} ereduro3 'q
:spJspu€lsluaruureno8 lueJJnc fq pue
ul,o,(qpe}3In3I€cse,e3eOIFIeq}Se}€cIpu(E@Hu eslueIulse^ul
'.(ulouoce sedfi rer{}o e^lJ esoqt r{c€e pue s>lco}s 'S 'n Jo
egl uI peuleluoc) elq€l SutznolloJorII Jo Jo
frurlelnclecroJ spoqleurI€renosete erer{J ,V'e luorulsonul uo uJnleJ eq] uao.ttleq dlqsuoll€leJ er{}Jo
ql8uer1s eql lnoqe e>lerunof uec suolsnlouoc 13q16 'E
LIEJpu? selJolecuee^\}eq '89'0 seln lqep sle>lreruEur8reruepu€ s>lcols'S'n pue
rtBIeJoql lno qe I4cEe;nor( u€c suolsnlcuoc ]?r{a 'p 'IL'0 s€1Ks>lcolssle>lreruSur8reruopu€ s>lcols'S'n '80'0
'ureldxg euollelorroc Jo luelclJJeoc eql s€1y\spuoq I€uolleuJolul pu€ s>lJols'S'n 'Eg' 0 s€zlrs>lcols
ue^oc eql-lal prJesolJolec uoe./KleqdrqsuolleleJ dec lleurs Iauortaurelul pu€ s>lcols'S'n '08'0 sen{ s>lco}s
imsserdxe q elqenl€A eroru sl {u}q1 nor( op qcll{A 'c dec e8rel Ieuolleurelul pue s>lcols 'S'n Jo luetulsonul
'uorlelerroc Jo luercrJJeoceql olndulo3 'q uo uJnloJ eI{} uee./Klequol}sleJJocJo }uelclJJeoc eq} Wql
'ocu€rJeloc eql olnduro3 .E pel€ls s>lcolsu8rerog ur luetulse^ul pessncslp wql (tC
'6 'd 'h00e aunl"sl;ode6 Jelunsuo3 ,,'s4cnq"rcry 'd 6n1Z 'gZ Jeqruolo5l 'lnutnoy paus Ua/UaqJ ..'spung
w:a slnuoT,ur4unQ 7o r(puo3 so aa.[o3,, uto",ttpapo"\xf,:ac'mos uSrerog ur orloJuod {co}S rlaql Jo %08 o} dn }nd
plnoqs srolsalul f,{16,, 'slueulel3 'f) elclue uV 8t't
,*,',ilil-l
3.42 College basketball is big business, with coaches' salaries, revenues, and expenses in
millions of dollars. The data file contains the coaches' salaries and revenue for college
basketball at selected schools in a recent year (extracted from R. Adams, "Pay ...," The Wall
Street Journal, March 11-12, 2006).
a. Compute the covariance.
b. Compute the coefficient of correlation.
c. What conclusions can you reach about the relationship between a coach's salary and revenue?
3.43 College football players trying out for the NFL are given the Wonderlic standardized
intelligence test. The data file contains the average Wonderlic score of football players trying
out for the NFL and the graduation rate for football players at selected schools (extracted
from S. Walker, "The NFL's Smartest Team," The Wall Street Journal, September 30, 2005,
pp. W1, W10).
a. Compute the covariance.
b. Compute the coefficient of correlation.
c. What conclusions can you reach about the relationship between the average Wonderlic score
and graduation rate?
3.6 PITFALLS IN NUMERICAL DESCRIPTIVE MEASURES AND ETHICAL ISSUES
In this chapter, you have studied how a set of numerical data can be characterized by various
statistics that measure the properties of central tendency, variation, and shape. Your next step
is analysis and interpretation of the calculated statistics. Your analysis is objective; your
interpretation is subjective. You must avoid errors that may arise either in the objectivity of
your analysis or in the subjectivity of your interpretation.
The analysis of the mutual funds is objective and reveals several impartial findings.
Objectivity in data analysis means reporting the most appropriate numerical descriptive measures
for a given data set. Now that you have read the chapter and have become familiar with various
numerical descriptive measures and their strengths and weaknesses, how should you proceed with
the objective analysis? Because the data distribute in a slightly asymmetrical manner, shouldn't
you report the median in addition to the mean? Doesn't the standard deviation provide more
information about the property of variation than the range? Should you describe the data set as
right-skewed?
On the other hand, data interpretation is subjective. Different people form different
conclusions when interpreting the analytical findings. Everyone sees the world from different
perspectives. Thus, because data interpretation is subjective, you must do it in a fair,
neutral, and clear manner.
Ethical Issues
Ethical issues are vitally important to all data analysis. As a daily consumer of information,
you need to question what you read in newspapers and magazines, what you hear on the radio or
television, and what you see while surfing the Internet. Over time, much skepticism has been
expressed about the purpose, the focus, and the objectivity of published studies. Perhaps no
comment on this topic is more telling than a quip often attributed to the famous
nineteenth-century British statesman Benjamin Disraeli: "There are three kinds of lies: lies,
damned lies, and statistics."
Ethical considerations arise when you are deciding what results to include in a report. You
should document both good and bad results. In addition, when making oral presentations and
presenting written reports, you need to give results in a fair, objective, and neutral manner.
Unethical behavior occurs when you willfully choose an inappropriate summary measure (for
example, the mean for a very skewed set of data) to distort the facts in order to support a
particular position. In addition, unethical behavior occurs when you selectively fail to report
pertinent findings because it would be detrimental to the support of a particular position.
Describing the relationship between two numerical variables: covariance, coefficient of correlation (Section 3.5)

Sample Mean
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \qquad (3.1)$$
Median
$$\text{Median} = \frac{n + 1}{2}\ \text{ranked value} \qquad (3.2)$$
First Quartile Q1
$$Q_1 = \frac{n + 1}{4}\ \text{ranked value} \qquad (3.3)$$
Third Quartile Q3
$$Q_3 = \frac{3(n + 1)}{4}\ \text{ranked value} \qquad (3.4)$$
Geometric Mean
$$\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n} \qquad (3.5)$$
Geometric Mean Rate of Return
$$\bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1$$
Range
$$\text{Range} = X_{largest} - X_{smallest}$$
Interquartile Range
$$\text{Interquartile range} = Q_3 - Q_1$$
Sample Variance
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
Sample Standard Deviation
$$S = \sqrt{S^2}$$
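Z Score
$$Z = \frac{X - \bar{X}}{S} \qquad (3.12)$$
Population Mean
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} \qquad (3.13)$$
Population Variance
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \qquad (3.14)$$
Sample Covariance
$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} \qquad (3.16)$$
Sample Coefficient of Correlation
$$r = \frac{\operatorname{cov}(X, Y)}{S_X S_Y} \qquad (3.17)$$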
fut*q Ieels secnpordz(uedurocEuunlceJnu€tuY 89't I€uolllppe roJ slsenber elqlssod 51ceqcn€ernq uoll€ruroJul
I€crporu e 'uo11ecr1dde eIIl Jo /KeIAer3 sopnlcq qclq/K 'Eul
'ursldxg er(€snof plno l'
-]rr^\repun Jo slslsuoc ssecord lenordde eql '(tfgS) ecue
penlosor luleldruoc e onerl ol llelv\ o] loedxo plnoqs -rnsur oJII >lueq sEutzrespe11€oocu€rnsul oJIIJo ruroJ € IIes
e Euol 1soq.(uedulocoqlJo tuoplsord eq] 11o]o] ol pollrur.red oJe $lueq s8utrres 'el?ls >lJoI ,'!\3N uI 99't
no{ J} '(c) q8nolqt (e) .+os}lnsererl}Jo slseqeq} uO 'p
' reeurEug
000'09 6EL'19 | L9'91 000'gZI 0AZ'6 668
rt;?tlrtT"5 'r
s eleperllerv 'lold re>lslrll!\-pus-x"O , , , ..'. i . '. :::
. ,$ te n D
i ,l :
lllll$rllfltt,ttl iriillililllillllllll;
ilmr ::le;&lnrriledian, range, and standard devia- c. Construct side-by-side box-and-whisker plots. Are the
jffiffi *:,',ith. Interpret these measures of central
lltillil' data skewed? If so, how?
mm' rumffi-"'aiability.
uiurffiilllilIlul)$r d. on the basis of the results of (a) through (c), are there
rlnrltfl ilitiiinruits
illllllllm -"1-lmb er sufirmary. any differences between the two central offices?
ililffililillllis,&':n:x-end-whisker plot and describe its shape. Explain.
,"iliillllffiilttttutirumr
,,/:r* conclude about the number of troughs
,rffilttrrrffiiiluilMi 3.61 In many manufacturing processes,the term work-
ulnreffuhe company's requirement of troughs
rtilffi;tllll||tmmm
in-process (often abbreviated wIP) is used. In a book
rilmn E"3 I and 8.61 inches wide?
manufacturing plant, the WIP represents the time it takes
irrilMrnr;fecruring
company in Problem 3.58 also for sheetsfrom a press to be folded, gathered sewn, tipped
u umsulators.
If the insulatorsbreakwhen in on end sheets, and bound. The data contained in the file
ils likely to occur. To test the strength of U[ilG represent samples of 20 books at each of two pro-
-rnm,w;lt
ciestmctive testing is carried out to deter- duction plants and the processing time (operationally
rrmnuush
tonce is required to break the insulators. defined as the time, in days, from when the books came
by observing how many pounds must be off the press to when they were packed in cartons) for
ilfu mmsmtrator
before it breaks.The datafrom 30 thesejobs:
this experiment are contained in the file
Plant A
trfffii56
il.610 1,634I ,784 | ,522 I ,696 I ,592 | ,662 5.62 s.2g 16.25 r0.g2 1r.46 2r.62 8.45 8.58 s.4r rr.42
mLru4N.6621,7341,7741,550r,756 r,762 1,966 rr.62 7.2g 7.50 7.96 4.42 10.50 7.58 g.2g 7.s4 8.g2
il[/ffiifr
[,688 1,9101,7521,6901,9101,6521,736
Plant B
ffiB mrean,median, range, and standard devia-
fforce variable. 9.54 tr.46 t6.62 12.6225.75 rs.4r 14.2913.13 13.7110.04
ffis measures of central tendency and variabil-
5.75 12.46 9.r7 t3.21 6.00 2.33 14.25 s.37 6.25 9.7r
, -"u mneasurements made on the company's tion and benefits; national and other local expenses;and
l;ililllllllllltitl
il utillilrffitis)lrr;S
3.ffid140 measurements made on vermont income from baseball operations?
ffimr t:rrltttutrllrl,
*r*,:-:lurnber summary for the Boston shingles 3.69 In Section 3.5 on page 131, the correlation coeffi-
dffir illnirllililililtnu
-, cient between the cost of a fast-food hamburger meal and
rrruulllliiltlttmrr"rrtffitr
cnrront shingles.
the cost of movie tickets in 10 different cities was com-
s,1e-by-side box-and-whisker plots for the
r,irrilltillllMffilllliurrirr,r
puted. The datafile@also includes the overall
rumnntnlrrllittilrruumlil*
:'i shingles and describe the shapes of the
cost index, the monthly rent for a two bedroom apartment,
,,mnniiiilitttrurutn,fl
,ti)nius
and the costs of a cup of coffee with service, dry cleaning
i,,iuuuillililffimflm[fi,
r',,,,,,,,, :,r. uhreshingles' ability to achieve a granule
for a men's blazer, and toothpaste.
i{ffiuiurililt''irrtilr
li E:J.rTIor less.
a. Compute the correlation coefficient between the overall
,ffiur,d"
imthe file lssffi representthe resultsof cost index and the monthly rent for a two-bedroom
CcrnmunitySurvey,a samplingof 700,000 apartment, the cost of a cup of coffee with service, the
mrums,nm eachstateduringthe 2000U.S.Census. cost of a fast food hamburger meal, the cost of dry
u-ariablesaverage travel-to-work time in
'rmtili'ffi,e cleanin g a men's blazer, the cost of toothpaste, and the
mmnrrulcr3mge of homes with eight or more rooms, cost of movie tickets. (There will be six separatecorre-
oild income, and percentage of mortgage' lation coefficients.)
merswhose housing costs exceed 3 |oh of b. What conclusions carryou reach about the relationship
of the overall cost index to each of these six vniables?
rnhErnean, median, first quartile, and third
3.7O The data in the file EBIIfr contains the character-
istics for a sample of 20 chicken sandwichesfrom fast-food
ffie range, interquartile range, variance, stan-
chains.
and coefficient of variation.
a. Compute the correlation coefficient between calories
**-and-whisker plot. Are the dataskewed?
# and carbohydrates.
b. Compute the correlation coefficient between calories
snonscan you reach concerning the mean
and sodium.
r time in minutes, percentage of homes
c. Compute the correlation coefficient between calories
$'r'more rooms. medran household income,
and total fat.
ge of mortgage-paying homeowners whose
d. Which variable (total fat, carbohydrates, or sodium)
luroilsms
exceed 30% of income?
seemsto be most closely related to calories? Explain.
cs of baseball has causeda great deal of
3.71Thedatai nthefi 1e@ repreS entthetota1com .
w,ith owners arguing that they are losing
pensation (in $millions) of CEOs of the 100 largest compa-
aryuingthatowners are making money, and
nies, by revenue (extracted from "Special Report:
rnrn,i'rno
about how expensive it is to attend a
Executive Compensation," USA Tbday,April 10, 2006, pp.
Eameson cable television. In addition to
38,4B ).
ffim$eamstatistics for the 2001 season"the file
a. Compute the mean, median, first quartile, and third
statisticson ticketprices;
insteam-by-team
quartile.
regular gate receipts; local televi-
'rmflrnd€K' season b. Compute the range, interquartile tange, variance, stan-
;md cahle receipts; all other operating revenue;
dard deviation, and coefficient of variation.
lon and benefits; national and other local
c. Construct a box-and-whisker plot. Are the data skewed?
rncorne from baseball operations. For each
If so, how?
d. What conclusions can you draw concerning the total
ffire mean, median, first quartile, and third
c9-mpensation(in $millions) of CEOs?
l' -s tra.nge,interquartile range, variance, stan- 3.72 The data in the file @ is the per captta
urm-and coefficient of variation. spending, in thousands of dollars, for each state in2004.
;nrhor-and-whisker plot. Are the data skewed? a. Compute the mean, median, first quartile, and third
,10)
quartile.
tffirecorrelation between the number of wins b. Compute the tange, interquartile tange, variance, stan-
clrrmpensationand benefits. How strong is the dard deviation, and coefficient of variation.
rmmhem€en these two vartables? c. Construct side-by-side box-and-whisker plots. Are the
rons can you reach concerning the regular data skewed? If so, how?
r@ receipts; local television, radio, and cable d. What conclusions can you reach concerning per caprta
omheroperating revenue; player compensa-
orffiili spending, in thousandsof dollars, for each state in2004?
'lottdsoHounavJ ru puo 'lotldsoH orcnbas.taqua3p)rpary p,rotuotsruo"ttpaqdopV:n,mos
li
' su o rl eredo l l e +o l soo eD el ene aql are el ep prorr.uel s
.u o ltre;ed o W
ru q ce o r o ; sa 6 r e q c lle + o %A gol pp!. e,.{}1o se6erone ere s}soo etonbeg
[]
' Ae ls Ae p - e u r u e q ltM l uoruoce;del dl q e pue A el s A ep-onnl e ql l m ql rl q
e ;d u ts e r o l se 6 ;e q c nnol pue q6tq;o e6e.ro^e oql al e sl soc oul ure3 l :l
ffi
lu o u r o ce ld a t d lp qulq e;durg ssedAqAreuoro3
0:
V/N
000'01,
000'02 U
o
0)
proluels 000'0t U)
ffi
e r on b a s
000'0t
ourrJes l:l
ffi
' u o ltg te d u loc l ecol uteul s,l ol uaC l ecl pol 4 prol uel s
000'09
o Je sle lr d so ;1 o u n r ,te J l l pue etonbeg 'suotl eJedo snotl en Jo+
e r u Jo + !le Cu r sa b le q c le lr dsoq gO-OB G;,a6eran e +o uostl edtuoc y
slsoc erec qlleeH leqfut
'seJns€oruelqducsep leclJelunu pu€ 'sgeqc eql ,(q peu8rsse selqelJe^ lecrro8elac pu€ I€clJeunu
:rpr eleudordde ile eq plnoqs Uodor rno.( 01 pepueddy IeJoAesSurureluoc les ewp e JoJ sper{c pue 'se1qe1'uot1
rrlo U red (surer8 ul) sol€rpfqoqrecJo regrunu -BruroJurfreununs pepeeu aq] le8 o1 Iacxfl gosorcll tresn
"seJuno71 nd serrol€3Jo Joqrunu'loqocl€ e8eluocred o] poroolunlo^ ssq l€npl^lpq slql 'sserdu4 ol ]u€./y\.(pe1n
p{qBue^Iscrrerunu eI{}Jo qc€eJo uol}?nl?Aoentlducsep -ctped no.( woq./KJoeuo 'seleulsselc;o dnotB e qll1\ uollsu
,Cruoc e uo peseq lroder e elIJ./Ko1 sI {se} JnoA -rruexo sorlsrlelsrno,( rog ,(prus o1 Suruueld ere no1 VL'e
'9002 '[€ qcffiW 'trroJ'00[nag'ituu tuo'ttpapouxg :a),mos ' ' ' {der pue 'qlee;q daep B e>lel'epurs no1'esuodse.r
'secunou red (sulerS req eredord nor{ leqt s}senborzlrou oqs 'uorutdo req pe>lsg
"eterp,(qoqJ€cJo Jeqrunupu€ 'secuno Zl red selJolscJo pus sselEuru€eruf11e1o1 sBA ueqc s1p Wqt peuollueru rueql
u '1oqoc1ee?eluecred:pspnlcul eJeselqelJe^ooJlpJoJ Jo euo wqt pu€ lq81u ]s€l sOf,J reluec Isclpeul eerv leuor8er
iE\ eqJ'ffia1J oI{}ur pe}ecoler€'S n oq}ul go Surloeu e Jo ued se Surpes dnorE uorssncslp e ur.pelues
rrlseruop equo
Surllos-lsoq 89 Sururecuoc epe g4t -erd se./Kelcrue eq] }€q} nor( slle] oqs 'sFp ssncsrpol uI nof
soslrJax36u;ry.1M
uodeu sil€c pu€ scrlsllels q esJnoce Surlel r(lluerrnc ore no,( s/!\oDI
OAJ JnoI'Jeluoc I€clpotu e q EurqrozneJe./v\ nof esoddng
sl rofeur roJ useru '(progu€]Spue 'etonbeg 'outure3
6r(1derrnort sI ler{A ,;E;t
"0s'I sr repueSro3u€eruoqt'9L'z sr.xepu ]urodeperS IE) suorlnlllsur Eurleduroc oorr{t te (}ueurecelder dtq pue
roJ u€erueq}'ees lSurqtr(re^o
IrlrEerueql 'EZ'ggsr 1q8rer{ 'I{ulq elduus 'ssedfq ,ftuuoroc) seJnpoco;d Iuclpetu eorq}
r uLreqllo8 I-selqerr€A erll Jo eruos JoJ slels ollldrrcsep roJ se8reqc lelrdsoq 0661 o1 686I o8erene oql eredruoc
;mwla8 l,uec e/K ples lelqqoJ) JosseJoJdfqrn pu€lsrepun o] peprnord s€.&\A\oleq Neqc erIJ 'sluollsd xeldruoc eJotu
| 'oslv '1{3req roJ pue xepq }urod eper8 roJ sueqc
'fi,ilHrro pue 're>1crs?leclpery 'erecrpenl 'lue8tpw ]€oJ101 suollez
,ilmr$aql pue rofeur roJ pue JopueE rog s1o1dJo>lslql\-pue -rue8ro Jor{}ouerp fle{}l eJoruse.,lt. JoruJoJer{}esn€ceqsuoq
4i1c'r4 3r{} e>lrl-pJre \ qool 1nd1noerl} Jo eruos 'st urelqord -nlrlsrn Surleduroc w ueqtreq8rq dn uenlrp ueeq per{ re}ueJ
'selqerrel rno IecrpelNproJuels 1€slsoc rcqt pelldurt (OOO I 'I I reque^oNl
im_l_ II€ roJ-sperlc eld eq] 's1o1dre>lslqa
-mrm-xoqoql 'suotlellep pJepu€1se{} 'suerpeur oq} 'sueeul 'uotlceg ssoursng fepung saLuIJ4Jor{.Mal{ aqJ ..'ecueq
tm:-lle I loE eA.L, 'surrelcxe pue lnolurrd eqt gll/tr no.( o1 lu€qdrlg proJu€ls eqt Eux€oJ,,) uourer; uuelg ,(q e1c
;e-\o seruoc uos;ed sFII 'sesodrnd ,(pn1sJoJ Jo}cnJ}sul
il{ffiu -Um u€ 'scrlsrlelsJo osnspr orll Jo uol]€rlsnlll u€ sv t4e
{mWlll i lllr -trqffiS d. What conclusions can you reach about differences
lffil*r111ffi6, L..l|||llllt|l|lllii[[ri|gfficontainsinformationregard- between mutual funds that have a growth objective and
EtiitffMH[ii' rntttutu''i'rr{rilur:iinrb"es
from a sample of 838 mutual funds: those that have a value objective?
m, lililHhi ,ulturuumnl,,-T)?e of stocks comprising the mutual fund 3.79 You wish to compare sm aII cap mid e&p, and large
rrrrrnnMlllilll
urffiffi^,
mid cdp,large cap) cap mutual funds. For each of these three groups, for the
illllllfuMm-,r-Cbjectiveof stocks comprising the mutual
variables expense ratio in percentage, 2005 return, three-
irrffillllilltqgmrrrrnm,,*-il
or value) year return, and five-year return,
lrrillffiililtii---:
nnilli ons of dollars a. Compute the mean, median, first quartile, and third
iiitimr'-S'tl,lE:ch&rges (no or yes)
quartile.
lillfihpnmrrrm of expenses to net assets ln per- b. Compute the range, interquartile tange) variance, stan-
-rlm-ratio
'iirffiffiilllilmH' dard deviation, and coefficient of variation.
;-of-loss factor of the mutual fund (low, c. Construct a box-and-whisker plot. Are the data skewed?
I
"mxgh If so, how?
Twelve-monthreturn in 2005 d. What conclusions can you reach about differences
return-Annuali zed return, 2003100 5 between small cap, mid edp,and large cap mutual funds?
uerurn-Annuali zedreturn, 200| -2005
Student Survey Data Base
rmmgrense
ratio in percentage, 2005 return, three- 3.80 Problem I.27 on page 15 describesa survey of 50
md five-year return, undergraduate students (see the file ).
the mean, median, first quartile, and third For these data, for each numerical variable
a. Compute the mean, median, first quartile, and third
rtherange, interquartile tange, vatiance, stan- quartile.
io'o. and coefficient of variation. b. Compute the range, interquartile range, variance, stan-
n box-and-whisker plot. Are the data skewed? dard deviation, and coefficient of variation.
'? c. Construct a box-and-whisker plot. Are the data skewed?
,rommmclusions
can you reach concerning these If so, how?
,,fllll
d. Write a report summafiztngyour conclusions.
w"""""""ush
to compare mutual funds that have fees to 3.81 Problem 1.27 on page 15 describesa survey of 50
,ffi, not have fees. For each of these two groups, undergraduate students (see the file ).
expenseratio in percentage,2005 return, L. Select a sample of 50 undergraduate students at your
and five-year return, school and conduct a similar survey for those students.
the mean, median, first quartile, and third b. For the data collected in (a), repeat (a) through (d) of
Problem 3.80.
ffie range, interquartile range, variance, stan- c. Compare the results of (b) to those of Problem 3.80.
ion- and coefficient of variation. 3.82 Problem I.28 on page 15 describesa survey of 50
n hox-and-whisker plot. Are the data skewed? MBA students (see the file ffi). For these data,
for each numerical variable,
lu@mmmctusions
aan you reach about differences a. Compute the mean, median, first quartile, and third
funds that have fees and those that do
mmunral quartile.
fuss? b. Compute the tange, interquartile tange, vatrance, stan-
dard deviation, and coefficient of variation.
wrsh to compare mutual funds that have a
c. Construct a box-and-whisker plot. Are the data skewed?
h"e to those that have a value objective. For If so, how?
fino groups, for the variables expense ratio in d. Write a report summarrzing your conclusions.
3005 return, three-yeatreturn, and five-year
3.83 Problem I.28 on page 15 describesa survey of 50
uhe nnean, median, first quartile, and third MBA students (see the fileffi).
a,, Select a sample of 50 graduate students from your MBA
ffie range, interquartile range, vatiance, stan- program and conduct a similar survey for those students.
Mon. and coefficient of variation. b. For the data collected in (a), repeat (a) through (d) of
a box-and-whisker plot. Are the data skewed? Problem 3 .82.
l c. Compare the results of (b) to those of Problem 3 .82.
'(IOOZ'uor1€rodro3
'( t SOt 'ssel4.ftnqxnq Uosorclltr :V16 puourpe$ n7Z p)xfl tIoso,tcry'Z
:gl srsrfJouyotoe {"toqruo1dxE Io SuryndwoJ puo
'fuaot
s 'sttorlncryddy'urlEeoH'C 'CIpuu 'C A 'ueurelle1 'v 'sser4 .(lrsrenlun proJXO:>lro1A\oI\D'pe r{}9 '{toaq1
' (t tA t'r(e1se16-uos
rppv uounqqusle : I awnlol 'srttsryo$Io [,toaqJpacuo^pv
"frurpeeU)
slsrQnuyotoe rfuorutoldxE''1'fe>1n;'t s,llopuax?.lO')'f pue'pen1g'V ''D'IN 6ll€pue)'I