Project
Statistics
In this project, we compare starting compensation packages for college graduates across different occupational fields. The project analyzes data from 2014, likely because it was available in good quality and more readily than 2015 data would have been at the beginning of the Spring 2016 semester. The methods used to analyze the data include histograms, 5 Point Summaries, boxplots, confidence intervals, and hypothesis tests.
Major Field                     Sample Mean    Sample Std. Dev.
Business                        $54,537        $7,231.72
Communications                  $46,227        $7,214.31
Computer Science                $59,542        $4,669.97
Education                       $40,021        $2,364.82
Engineering                     $60,664        $7,878.86
Humanities & Social Sciences    $38,817        $7,146.18
Math & Sciences                 $41,923        $4,938.29
Histograms
[Figures: histograms of Starting Compensation (in USD) for Business, Communication, Computer Science, Education, Engineering, Humanities & Social Sciences, and Mathematics & Non-Social Sciences.]
The graphs reflect what we expect to see, inasmuch as they reflect the given data, and they seem reasonably representative of the general population at the time. Three of the graphs appear reasonably symmetric and four appear skewed. The symmetric graphs are Computer Science, Education, and Humanities & Social Sciences. The skewed graphs are Business, Communication, Engineering, and Mathematics & Non-Social Sciences. The symmetric graphs appear to follow an approximately normal distribution, and the skewed graphs seem to follow some other distribution. On a basic level, the fields differ because they are funded differently and because different people have different skills and interests, which would produce a chaotic-looking distribution if all the data were plotted on one graph. We say "on a basic level" because the author's chosen field is sociology (a social science), and in the interest of keeping this project focused on math, we choose to forgo the lengthier sociological explanations.
5 Point Summaries
Field                           Min      Inner Quartile (IQ)   Median     Outer Quartile (OQ)   Max
Education                       35250    38698.75              39857      41589                 44591
Business                        40745    49229.25              54729      58885.25              71803
Communications                  33282    41689.75              45420      51103.25              62359
Engineering                     46475    54975                 60042      66394.75              80236
Math and Science                31920    38739                 41939.5    45000.25              54261
Computer Science                46785    56937.25              59836      62462.25              69490
Humanities & Social Sciences    23114    33733.25              39600.5    42940.5               54220
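As an illustration of how a 5 Point Summary can be obtained, the following Python sketch computes one from a list of values. The raw 2014 data are not reproduced in this report, so the sample list is hypothetical. The "inclusive" quartile method of `statistics.quantiles` is also an assumption; other quartile conventions give slightly different Inner and Outer Quartile values.

```python
import statistics

def five_point_summary(values):
    """Return (min, inner quartile, median, outer quartile, max)."""
    data = sorted(values)
    # method="inclusive" is an assumed quartile convention; it is not
    # necessarily the one used to build the summaries in this report.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (data[0], q1, q2, q3, data[-1])

# Hypothetical starting compensations in USD (not the project data).
sample = [35250, 38000, 39857, 41000, 44591]
print(five_point_summary(sample))
```

With five sorted values and this convention, the three quartiles fall exactly on the second, third, and fourth values; with larger samples they are usually interpolated between neighboring values.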
Boxplots
[Figures: boxplots of starting compensation by field. Axis: Compensation in USD/Year.]
With these boxplots in mind, Business, Engineering, and Humanities & Social Sciences seem to have some outliers that skew the data to some degree. These outliers may have occurred because Business and Engineering are well funded (by the government and the technology industry, respectively), leading to some higher values. Much of what the Humanities & Social Sciences work with already overlaps with other fields, which leads to lower values because spending on those problems has already been made through the other fields. For example, a business looking to sell goods might allocate some of its profits to advertising (Communications), knowing that the investment will bring in more profits, and the advertiser may in turn spend some money on psychology (Social Science). However, by the time the money makes it to the field of psychology, those in business are already hoping to see a return on the advertising they commissioned. Knowing this, it makes sense that less money would be available for the starting positions offered in Humanities & Social Sciences, since they do not produce the same immediately observable results seen in the other fields presented in this project. The data seem suitable for normal (z) distributions and t distributions, depending on whether we choose to analyze a proportion or a mean, respectively.
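One way to check for outliers numerically is the 1.5 × IQR rule (Tukey fences). The sketch below applies it to the 5 Point Summary values for the three fields named above, reading the Inner and Outer Quartiles as Q1 and Q3. The 1.5 multiplier is an assumed convention, and the rule used to draw the boxplots in this project is not stated.

```python
# Minimum, inner quartile (q1), outer quartile (q3), and maximum
# copied from the 5 Point Summaries in this report.
SUMMARIES = {
    "Business": {"min": 40745, "q1": 49229.25, "q3": 58885.25, "max": 71803},
    "Engineering": {"min": 46475, "q1": 54975, "q3": 66394.75, "max": 80236},
    "Humanities & Social Sciences": {
        "min": 23114, "q1": 33733.25, "q3": 42940.5, "max": 54220,
    },
}

def tukey_fences(q1, q3, k=1.5):
    """Return (lower fence, upper fence) for the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def extremes_outside_fences(summary, k=1.5):
    """Report whether the minimum or maximum lies beyond the fences."""
    lower, upper = tukey_fences(summary["q1"], summary["q3"], k)
    return {"low": summary["min"] < lower, "high": summary["max"] > upper}

for field, s in SUMMARIES.items():
    lower, upper = tukey_fences(s["q1"], s["q3"])
    print(field, lower, upper, extremes_outside_fences(s))
```

Under this particular convention, the reported minimum and maximum of each of the three fields lie inside the fences, so whether a point counts as an outlier depends on the rule the plotting tool applied.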
Confidence Intervals
A confidence interval is a range of values, calculated from sample data, that is used to estimate the true value of a population parameter. Its purpose is to establish what that parameter might be at a previously decided or given confidence level.
The results of the calculated confidence intervals from the page Confidence Intervals can be interpreted as follows:
We are 95% confident that the interval from $35,726.33 to $41,907.68 actually includes the true value of the Humanities and Social Sciences starting compensation population mean.
We are 99% confident that the interval from $5,800.87 to $9,287.74 actually includes the true value of the Business starting compensation population standard deviation.
We are 80% confident that the interval from 0.396 to 0.464 actually contains the true value of the population proportion of all graduating students compensated $50,000 or more.
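The three kinds of interval interpreted above (a t interval for a mean, a chi-square interval for a standard deviation, and a z interval for a proportion) can be sketched in Python as follows. The sample means and standard deviations come from the table at the start of this report. The sample sizes, the table critical values, and the count of 150 out of 350 are hypothetical, so the printed intervals are not expected to reproduce the ones reported above.

```python
import math
from statistics import NormalDist

def mean_interval(sample_mean, sample_sd, n, t_critical):
    """t interval for a population mean; t_critical is read from a t table (df = n - 1)."""
    margin = t_critical * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

def sd_interval(sample_sd, n, chi2_left, chi2_right):
    """Chi-square interval for a population standard deviation (df = n - 1)."""
    df = n - 1
    lower = math.sqrt(df * sample_sd ** 2 / chi2_right)
    upper = math.sqrt(df * sample_sd ** 2 / chi2_left)
    return lower, upper

def proportion_interval(successes, n, confidence):
    """Normal-approximation (z) interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical inputs: n = 25; t = 2.064 (95%, df = 24);
# chi-square values 9.886 and 45.559 (99%, df = 24);
# 150 of 350 graduates at $50,000 or more.
print(mean_interval(38817, 7146.18, 25, 2.064))
print(sd_interval(7231.72, 25, 9.886, 45.559))
print(proportion_interval(150, 350, 0.80))
```

The critical values for the mean and the standard deviation are passed in as arguments because the Python standard library has no t or chi-square quantile function; the proportion interval uses `statistics.NormalDist` for its z value.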
The interval widths, each in its own units, are about $6,200 for the mean, about $3,500 for the standard deviation, and about 7 percentage points for the proportion, at varying confidence levels. One could reasonably estimate that the mean starting compensation package for Humanities and Social Sciences occupations is about $39,000. It would also be reasonable to state that, although the population standard deviation for the Business field could be just over $9,000, the actual value is likely to be less (a standard deviation of about $7,500 for those entering the Business field is more reasonable with the confidence interval in mind). Finally, knowing that anywhere from about 40% to 46% of graduating students can make $50,000 or more upon graduating is motivational, even if none of those students are pursuing a career in Education (where no graduating student in the given data exceeded a starting compensation package of even $45,000.00).
Hypothesis Tests
A hypothesis test is a procedure for testing a claim about a property of a population. With the ability to test such a claim, we are able to see whether the claim is reasonable.
Each of the given claims was rejected. The first claim, that one will make under $35,000 when beginning in the field of Education, rings personally accurate, although it is probably not reflective of the overall population. The second claim, that upon graduating from college a student will receive a starting compensation package of $40,000 or more, does not seem to be the case. It seems more accurate to state that the starting compensation package for a student graduating from college is less than $40,000.
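As a sketch of the mechanics, the first claim can be written as a left-tailed t test with the alternative hypothesis that the mean Education starting compensation is below $35,000. The sample mean and standard deviation are taken from the table at the start of this report. The sample size of 25, the 0.05 significance level, and the table critical value of 1.711 (df = 24) are hypothetical, since the values used in the original calculation are not given in the text.

```python
import math

def t_statistic(sample_mean, claimed_mean, sample_sd, n):
    """One-sample t statistic for a claim about a population mean."""
    return (sample_mean - claimed_mean) / (sample_sd / math.sqrt(n))

def supports_less_than_claim(t_stat, t_critical):
    """Left-tailed test: the claim 'mean < claimed value' is supported
    only when the statistic falls below the negative critical value."""
    return t_stat < -abs(t_critical)

# Education sample mean and standard deviation from the report;
# n = 25 and t_critical = 1.711 are hypothetical.
t = t_statistic(40021, 35000, 2364.82, 25)
print(round(t, 3), supports_less_than_claim(t, 1.711))
```

Because the sample mean lies above $35,000, the statistic is positive and cannot fall in the left tail, which agrees with the rejection of the first claim.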
Reflection
To do an interval estimate, we must have a simple random sample. We also need to know the procedure to use for testing a hypothesis or claim. The simple random sample of the population that we obtain must be large enough to test. We may also require a normally distributed population if our sample size is smaller than the requisite amount, which varies from five to thirty depending on the statistical method being applied. A normally distributed population is required for the confidence interval of a population standard deviation. The given samples were enough to meet the necessary conditions, although with more data we may have been able to improve our results. Possible sources of error in using this data include outliers at both the high and low ends, compounded by a small sample size. If we wanted to improve the sampling method, we could increase the sample size. From this statistical research, we conclude that receiving a degree increases the likelihood of a greater starting package in any field.
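The conditions described above can be written as simple checks. This sketch assumes that the thresholds of five and thirty refer to two common textbook rules of thumb: at least five successes and five failures for a proportion, and more than thirty observations (or a normal population) for a mean. The function names are illustrative only.

```python
def mean_conditions_ok(n, population_is_normal):
    """Assumed rule of thumb: n > 30, or the population is normal."""
    return n > 30 or population_is_normal

def proportion_conditions_ok(successes, n):
    """Assumed rule of thumb: at least 5 successes and at least 5 failures."""
    return successes >= 5 and (n - successes) >= 5

def sd_conditions_ok(population_is_normal):
    """The standard deviation interval requires a normal population,
    regardless of sample size."""
    return population_is_normal

# Hypothetical sample sizes and counts, for illustration only.
print(mean_conditions_ok(25, population_is_normal=True))
print(proportion_conditions_ok(150, 350))
print(sd_conditions_ok(population_is_normal=False))
```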
[Handwritten pages: Confidence Intervals and Hypothesis Tests worksheets. The scanned calculations are illegible.]