
Introduction
Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, and information privacy. The term often refers simply to the use of predictive analytics or certain other advanced methods to extract value from data, and seldom to a particular size of data set. Accuracy in big data may lead to more confident decision making, and better decisions can mean greater operational efficiency, cost reductions and reduced risk.

Analysis of data sets can find new correlations, to "spot business trends, prevent diseases, combat crime and so on." Scientists, practitioners of media and advertising, and governments alike regularly meet difficulties with large data sets in areas including Internet search, finance and business informatics. Scientists encounter limitations in e-Science work, including meteorology, genomics, connectomics, complex physics simulations, and biological and environmental research.

Data sets grow in size in part because they are increasingly being gathered by cheap and numerous information-sensing mobile devices, aerial (remote sensing), software logs, cameras, microphones, radio-frequency identification (RFID) readers, and wireless sensor networks. The world's technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s; as of 2012, every day 2.5 exabytes (2.5×10^18) of data were created. The challenge for large enterprises is determining who should own big data initiatives that straddle the entire organization.

Work with big data is necessarily uncommon; most analysis is of "PC size" data, on a desktop PC or notebook that can handle the available data set. Relational database management systems and desktop statistics and visualization packages often have difficulty handling big data. The work instead requires "massively parallel software running on tens, hundreds, or even thousands of servers". What is considered "big data" varies depending on the capabilities of the users and their tools, and expanding capabilities make Big Data a moving target. Thus, what is considered to be "Big" in one year will become ordinary in later years. "For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration."
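
The introduction's statement that storage capacity has roughly doubled every 40 months can be turned into a growth factor for any time span. The following is a minimal sketch of that arithmetic, assuming smooth exponential growth between doublings; the function name `growth_factor` is illustrative and does not come from the text.

```python
def growth_factor(months: float, doubling_period_months: float = 40.0) -> float:
    """Multiplicative growth after `months`, assuming capacity doubles
    every `doubling_period_months` (40 months is the figure in the text)."""
    return 2.0 ** (months / doubling_period_months)


# Ten years is 120 months, i.e. exactly three 40-month doubling periods.
decade_factor = growth_factor(120)

# The same doubling rule expressed as an equivalent yearly growth rate.
annual_rate = growth_factor(12) - 1.0

print(f"Growth over 10 years: {decade_factor:.0f}x")
print(f"Equivalent annual growth: {annual_rate:.1%}")
```

Under this assumption, a 40-month doubling period corresponds to an eightfold increase per decade, or growth of roughly 23 percent per year.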
What is Big Data?
Since the first censuses were taken and crop yields recorded in ancient times, data collection and analysis have been essential to improving the functioning of society. Foundational work in calculus, probability theory, and statistics in the 17th and 18th centuries provided an array of new tools used by scientists to more precisely predict the movements of the sun and stars and determine population-wide rates of crime, marriage, and suicide. These tools often led to stunning advances. In the 1800s, Dr. John Snow used early modern data science to map cholera clusters in London. By tracing to a contaminated public well a disease that was widely thought to be caused by miasmatic air, Snow helped lay the foundation for the germ theory of disease.[1]

Gleaning insights from data to boost economic activity also took hold in American industry. Frederick Winslow Taylor's use of a stopwatch and a clipboard to analyze productivity at Midvale Steel Works in Pennsylvania increased output on the shop floor and fueled his belief that data science could revolutionize every aspect of life.[2] In 1911, Taylor wrote The Principles of Scientific Management to answer President Theodore Roosevelt's call for increasing national efficiency:

"[T]he fundamental principles of scientific management are applicable to all kinds of human activities, from our simplest individual acts to the work of our great corporations. [W]henever these principles are correctly applied, results must follow which are truly astounding."[3]

Today, data is more deeply woven into the fabric of our lives than ever before. We aspire to use data to solve problems, improve well-being, and generate economic prosperity. The collection, storage, and analysis of data is on an upward and seemingly unbounded trajectory, fueled by increases in processing power, the cratering costs of computation and storage, and the growing number of sensor technologies embedded in devices of all kinds. In 2011, some estimated the amount of information created and replicated would surpass 1.8 zettabytes.[4] In 2013, estimates reached 4 zettabytes of data generated worldwide.[5]
What is a Zettabyte?
A zettabyte is 1,000,000,000,000,000,000,000 bytes, or units of information. Consider that a single byte equals one character of text. The 1,250 pages of Leo Tolstoy's War and Peace would fit into a zettabyte 323 trillion times.[6] Or imagine that every person in the United States took a digital photo every second of every day for over a month. All of those photos put together would equal about one zettabyte.
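
The two comparisons above can be sanity-checked with a few lines of arithmetic. This is a rough sketch, not a calculation taken from the source: the US population figure and the reading of "over a month" as 31 days are assumptions introduced here for illustration, and the variable names are hypothetical.

```python
ZETTABYTE = 10**21  # bytes, as defined in the text

# War and Peace: the text says the book fits into a zettabyte 323 trillion times.
copies_per_zettabyte = 323 * 10**12
bytes_per_copy = ZETTABYTE / copies_per_zettabyte
bytes_per_page = bytes_per_copy / 1250  # 1,250 pages, per the text

# Photos: one photo per person per second for a month.
US_POPULATION = 316_000_000  # ASSUMPTION: approximate population, not from the text
SECONDS_PER_DAY = 86_400
DAYS = 31                    # ASSUMPTION: "over a month" read as 31 days
total_photos = US_POPULATION * SECONDS_PER_DAY * DAYS
bytes_per_photo = ZETTABYTE / total_photos

print(f"Implied size of one copy of the book: {bytes_per_copy / 1e6:.1f} MB")
print(f"Implied characters per page:          {bytes_per_page:,.0f}")
print(f"Implied size of one photo:            {bytes_per_photo / 1e6:.2f} MB")
```

Under these assumptions the book works out to about 3 MB of text (about 2,500 characters per page) and each photo to a little over 1 MB, so both comparisons are internally consistent with a zettabyte being 10^21 bytes.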
More than 500 million photos are uploaded and shared every day, along with more than 200 hours of video every minute. But the volume of information that people create themselves (the full range of communications from voice calls, emails and texts to uploaded pictures, video, and music) pales in comparison to the amount of digital information created about them each day.

These trends will continue. We are only in the very nascent stage of the so-called Internet of Things, when our appliances, our vehicles and a growing set of wearable technologies will be able to communicate with each other. Technological advances have driven down the cost of creating, capturing, managing, and storing information to one-sixth of what it was in 2005. And since 2005, business investment in hardware, software, talent, and services has increased as much as 50 percent, to $4 trillion.
The Internet of Things
The Internet of Things is a term used to describe the ability of devices to communicate with each other using embedded sensors that are linked through wired and wireless networks. These devices could include your thermostat, your car, or a pill you swallow so the doctor can monitor the health of your digestive tract. These connected devices use the Internet to transmit, compile, and analyze data.
There are many definitions of big data which may differ depending on whether you are a computer scientist, a financial analyst, or an entrepreneur pitching an idea to a venture capitalist. Most definitions reflect the growing technological ability to capture, aggregate, and process an ever-greater volume, velocity, and variety of data. In other words, data is now available faster, has greater coverage and scope, and includes new types of observations and measurements that previously were not available.[7] More precisely, big datasets are large, diverse, complex, longitudinal, and/or distributed datasets generated from instruments, sensors, Internet transactions, email, video, click streams, and/or all other digital sources available today and in the future.[8]

What really matters about big data is what it does. Aside from how we define big data as a technological phenomenon, the wide variety of potential uses for big data analytics raises crucial questions about whether our legal, ethical, and social norms are sufficient to protect privacy and other values in a big data world. Unprecedented computational power and sophistication make possible unexpected discoveries, innovations, and advancements in our quality of life. But these capabilities, most of which are not visible or available to the average consumer, also create an asymmetry of power between those who hold the data and those who intentionally or inadvertently supply it.

Part of the challenge, too, lies in understanding the many different contexts in which big data comes into play. Big data may be viewed as property, as a public resource, or as an expression of individual identity.[9] Big data applications may be the driver of America's economic future or a threat to cherished liberties. Big data may be all of these things.

For the purposes of this 90-day study, the review group does not purport to have all the answers to big data. Both the technology of big data and the industries that support it are constantly innovating and changing. Instead, the study focuses on asking the most important questions about the relationship between individuals and those who collect and use data about them.
