A Step by Step Backpropagation Example
Matt Mazur
Background

Backpropagation is a common method for training a neural network. There is no shortage of papers online that attempt to explain how backpropagation works, but few that include an example with actual numbers. This post is my attempt to explain how it works with a concrete example that folks can compare their own calculations to in order to ensure they understand backpropagation correctly.


If this kind of thing interests you, you should sign up for my newsletter where I post about AI-related projects that I'm working on.


Backpropagation in Python


You can play around with a Python script that I wrote that implements the backpropagation algorithm in this Github repo.

Backpropagation Visualization


For an interactive visualization showing a neural network as it learns, check out my Neural Network visualization.


Additional Resources

If you find this tutorial useful and want to continue learning about neural networks and their applications, I highly recommend checking out Adrian Rosebrock's excellent tutorial on Getting Started with Deep Learning and Python.

Overview


For this tutorial, we're going to use a neural network with two inputs, two hidden neurons, and two output neurons. Additionally, the hidden and output neurons will each include a bias.


Here's the basic structure:

[Figure: the network diagram. Inputs i1 and i2 feed hidden neurons h1 and h2 (with bias b1), whose outputs feed output neurons o1 and o2 (with bias b2), through weights w1 through w8.]

In order to have some numbers to work with, here are the initial weights, the biases, and training inputs/outputs:

[Figure: the same diagram annotated with the starting values: w1 = 0.15, w2 = 0.20, w3 = 0.25, w4 = 0.30, b1 = 0.35; w5 = 0.40, w6 = 0.45, w7 = 0.50, w8 = 0.55, b2 = 0.60; inputs i1 = 0.05, i2 = 0.10; target outputs 0.01 and 0.99.]


The goal of backpropagation is to optimize the weights so that the neural network can learn how to correctly map arbitrary inputs to outputs.


For the rest of this tutorial we're going to work with a single training set: given inputs 0.05 and 0.10, we want the neural network to output 0.01 and 0.99.
The Forward Pass

To begin, let's see what the neural network currently predicts given the weights and biases above and inputs of 0.05 and 0.10. To do this we'll feed those inputs forward through the network.

We figure out the total net input to each hidden layer neuron, squash the total net input using an activation function (here we use the logistic function), then repeat the process with the output layer neurons.

Total net input is also referred to as just net input by some sources.

Here's how we calculate the total net input for h1:

net_h1 = w1 * i1 + w2 * i2 + b1 * 1
net_h1 = 0.15 * 0.05 + 0.2 * 0.1 + 0.35 * 1 = 0.3775

We then squash it using the logistic function to get the output of h1:

out_h1 = 1 / (1 + e^(-net_h1)) = 1 / (1 + e^(-0.3775)) = 0.593269992

Carrying out the same process for h2 we get:

out_h2 = 0.596884378

We repeat this process for the output layer neurons, using the output from the hidden layer neurons as inputs.

Here's the output for o1:

net_o1 = w5 * out_h1 + w6 * out_h2 + b2 * 1
net_o1 = 0.4 * 0.593269992 + 0.45 * 0.596884378 + 0.6 * 1 = 1.105905967
out_o1 = 1 / (1 + e^(-net_o1)) = 1 / (1 + e^(-1.105905967)) = 0.75136507

And carrying out the same process for o2 we get:

out_o2 = 0.772928465
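If it helps to see the arithmetic in code, here is a minimal Python sketch of this forward pass. This is separate from the script in the Github repo linked above; the variable names (i1, w1, out_h1, and so on) simply mirror the diagram:

    import math

    def sigmoid(x):
        """The logistic function used to squash each net input."""
        return 1 / (1 + math.exp(-x))

    # Initial weights, biases, and inputs from the figure above
    i1, i2 = 0.05, 0.10
    w1, w2, w3, w4, b1 = 0.15, 0.20, 0.25, 0.30, 0.35
    w5, w6, w7, w8, b2 = 0.40, 0.45, 0.50, 0.55, 0.60

    # Hidden layer
    out_h1 = sigmoid(w1 * i1 + w2 * i2 + b1 * 1)  # 0.593269992
    out_h2 = sigmoid(w3 * i1 + w4 * i2 + b1 * 1)  # 0.596884378

    # Output layer, using the hidden outputs as inputs
    out_o1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2 * 1)  # 0.75136507
    out_o2 = sigmoid(w7 * out_h1 + w8 * out_h2 + b2 * 1)  # 0.772928465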

Calculating the Total Error

We can now calculate the error for each output neuron using the squared error function and sum them to get the total error:

E_total = Σ ½ (target - output)²

Some sources refer to the target as the ideal and the output as the actual.


The ½ is included so that the exponent is cancelled when we differentiate later on. The result is eventually multiplied by a learning rate anyway so it doesn't matter that we introduce a constant here [1].

For example, the target output for o1 is 0.01 but the neural network output 0.75136507, therefore its error is:

E_o1 = ½ (target_o1 - out_o1)² = ½ (0.01 - 0.75136507)² = 0.274811083

Repeating this process for o2 (remembering that the target is 0.99) we get:

E_o2 = 0.023560026

The total error for the neural network is the sum of these errors:

E_total = E_o1 + E_o2 = 0.274811083 + 0.023560026 = 0.298371109
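Continuing the sketch above, the same error calculation in code:

    target_o1, target_o2 = 0.01, 0.99

    E_o1 = 0.5 * (target_o1 - out_o1) ** 2  # 0.274811083
    E_o2 = 0.5 * (target_o2 - out_o2) ** 2  # 0.023560026
    E_total = E_o1 + E_o2                   # 0.298371109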

The Backwards Pass

Our goal with backpropagation is to update each of the weights in the network so that they cause the actual output to be closer to the target output, thereby minimizing the error for each output neuron and the network as a whole.
Output Layer

Consider w5. We want to know how much a change in w5 affects the total error, aka ∂E_total/∂w5.

∂E_total/∂w5 is read as "the partial derivative of E_total with respect to w5". You can also say "the gradient with respect to w5".

By applying the chain rule we know that:

∂E_total/∂w5 = ∂E_total/∂out_o1 * ∂out_o1/∂net_o1 * ∂net_o1/∂w5

Visually, here's what we're doing:

[Figure: the output neuron o1, with the three partial derivatives traced along the path from w5 through net_o1 and out_o1 to E_total.]


We need to figure out each piece in this equation.

First, how much does the total error change with respect to the output?

E_total = ½ (target_o1 - out_o1)² + ½ (target_o2 - out_o2)²
∂E_total/∂out_o1 = 2 * ½ (target_o1 - out_o1) * -1 + 0
∂E_total/∂out_o1 = -(target_o1 - out_o1) = -(0.01 - 0.75136507) = 0.74136507

-(target - out) is sometimes expressed as (out - target).

When we take the partial derivative of the total error with respect to out_o1, the quantity ½ (target_o2 - out_o2)² becomes zero because out_o1 does not affect it, which means we're taking the derivative of a constant, which is zero.

Next, how much does the output of o1 change with respect to its total net input?

The partial derivative of the logistic function is the output multiplied by 1 minus the output:

out_o1 = 1 / (1 + e^(-net_o1))
∂out_o1/∂net_o1 = out_o1 (1 - out_o1) = 0.75136507 (1 - 0.75136507) = 0.186815602

Finally, how much does the total net input of o1 change with respect to w5?

net_o1 = w5 * out_h1 + w6 * out_h2 + b2 * 1
∂net_o1/∂w5 = out_h1 = 0.593269992

Putting it all together:

∂E_total/∂w5 = ∂E_total/∂out_o1 * ∂out_o1/∂net_o1 * ∂net_o1/∂w5
∂E_total/∂w5 = 0.74136507 * 0.186815602 * 0.593269992 = 0.082167041

You'll often see this calculation combined in the form of the delta rule:

∂E_total/∂w5 = -(target_o1 - out_o1) * out_o1 (1 - out_o1) * out_h1

Alternatively, we have ∂E_total/∂out_o1 and ∂out_o1/∂net_o1, which can be written as ∂E_total/∂net_o1, aka δ_o1 (the Greek letter delta), aka the node delta. We can use this to rewrite the calculation above:

δ_o1 = ∂E_total/∂out_o1 * ∂out_o1/∂net_o1 = -(target_o1 - out_o1) * out_o1 (1 - out_o1)

Therefore:

∂E_total/∂w5 = δ_o1 * out_h1

Some sources extract the negative sign from δ so it would be written as:

∂E_total/∂w5 = -δ_o1 * out_h1

To decrease the error, we then subtract this value from the current weight (optionally multiplied by some learning rate, eta, which we'll set to 0.5):

w5+ = w5 - η * ∂E_total/∂w5 = 0.4 - 0.5 * 0.082167041 = 0.35891648

Some sources use α (alpha) to represent the learning rate, others use η (eta), and others even use ε (epsilon).

We can repeat this process to get the new weights w6, w7, and w8:

w6+ = 0.408666186
w7+ = 0.511301270
w8+ = 0.561370121

We perform the actual updates in the neural network after we have the new weights leading into the hidden layer neurons (i.e., we use the original weights, not the updated weights, when we continue the backpropagation algorithm below).
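In code, continuing the sketch from the forward pass, the output-layer deltas and weight updates look something like this. The new weights are stored separately so the originals stay available for the hidden-layer pass below:

    eta = 0.5  # learning rate

    # Node deltas: dE_total/dnet_o = -(target - out) * out * (1 - out)
    delta_o1 = -(target_o1 - out_o1) * out_o1 * (1 - out_o1)  # 0.138498562
    delta_o2 = -(target_o2 - out_o2) * out_o2 * (1 - out_o2)  # -0.038098237

    # Each gradient is the node delta times the input feeding that weight
    w5_new = w5 - eta * delta_o1 * out_h1  # 0.35891648
    w6_new = w6 - eta * delta_o1 * out_h2  # 0.408666186
    w7_new = w7 - eta * delta_o2 * out_h1  # 0.511301270
    w8_new = w8 - eta * delta_o2 * out_h2  # 0.561370121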

Hidden Layer

Next, we'll continue the backwards pass by calculating new values for w1, w2, w3, and w4.

Big picture, here's what we need to figure out:

∂E_total/∂w1 = ∂E_total/∂out_h1 * ∂out_h1/∂net_h1 * ∂net_h1/∂w1

Visually:

[Figure: the same chain of partial derivatives traced through hidden neuron h1, whose output feeds both E_o1 and E_o2.]

We're going to use a similar process as we did for the output layer, but slightly different to account for the fact that the output of each hidden layer neuron contributes to the output (and therefore error) of multiple output neurons. We know that out_h1 affects both out_o1 and out_o2, therefore ∂E_total/∂out_h1 needs to take into consideration its effect on both output neurons:

∂E_total/∂out_h1 = ∂E_o1/∂out_h1 + ∂E_o2/∂out_h1

Starting with ∂E_o1/∂out_h1:

∂E_o1/∂out_h1 = ∂E_o1/∂net_o1 * ∂net_o1/∂out_h1

We can calculate ∂E_o1/∂net_o1 using values we calculated earlier:

∂E_o1/∂net_o1 = ∂E_o1/∂out_o1 * ∂out_o1/∂net_o1 = 0.74136507 * 0.186815602 = 0.138498562

And ∂net_o1/∂out_h1 is equal to w5:

net_o1 = w5 * out_h1 + w6 * out_h2 + b2 * 1
∂net_o1/∂out_h1 = w5 = 0.40

Plugging them in:

∂E_o1/∂out_h1 = 0.138498562 * 0.40 = 0.055399425

Following the same process for ∂E_o2/∂out_h1, we get:

∂E_o2/∂out_h1 = -0.019049119

Therefore:

∂E_total/∂out_h1 = 0.055399425 - 0.019049119 = 0.036350306

Now that we have ∂E_total/∂out_h1, we need to figure out ∂out_h1/∂net_h1 and then ∂net_h1/∂w for each weight:

out_h1 = 1 / (1 + e^(-net_h1))
∂out_h1/∂net_h1 = out_h1 (1 - out_h1) = 0.59326999 (1 - 0.59326999) = 0.241300709

We calculate the partial derivative of the total net input to h1 with respect to w1 the same as we did for the output neuron:

net_h1 = w1 * i1 + w2 * i2 + b1 * 1
∂net_h1/∂w1 = i1 = 0.05

Putting it all together:

∂E_total/∂w1 = ∂E_total/∂out_h1 * ∂out_h1/∂net_h1 * ∂net_h1/∂w1
∂E_total/∂w1 = 0.036350306 * 0.241300709 * 0.05 = 0.000438568

You might also see this written as:

∂E_total/∂w1 = (Σ_o δ_o * w_ho) * out_h1 (1 - out_h1) * i1 = δ_h1 * i1


We can now update w1:

w1+ = w1 - η * ∂E_total/∂w1 = 0.15 - 0.5 * 0.000438568 = 0.149780716

Repeating this for w2, w3, and w4:

w2+ = 0.19956143
w3+ = 0.24975114
w4+ = 0.29950229
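The hidden-layer pass in the same sketch; note that it reuses the original w5 through w8 and the output deltas from above:

    # Each hidden output feeds both output errors, so sum over the output deltas
    dE_dout_h1 = delta_o1 * w5 + delta_o2 * w7  # 0.036350306
    dE_dout_h2 = delta_o1 * w6 + delta_o2 * w8

    # Hidden node deltas
    delta_h1 = dE_dout_h1 * out_h1 * (1 - out_h1)  # 0.008771354
    delta_h2 = dE_dout_h2 * out_h2 * (1 - out_h2)

    w1_new = w1 - eta * delta_h1 * i1  # 0.149780716
    w2_new = w2 - eta * delta_h1 * i2  # 0.199561432
    w3_new = w3 - eta * delta_h2 * i1  # 0.249751144
    w4_new = w4 - eta * delta_h2 * i2  # 0.299502287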

Finally, we've updated all of our weights! When we fed forward the 0.05 and 0.1 inputs originally, the error on the network was 0.298371109. After this first round of backpropagation, the total error is now down to 0.291027924. It might not seem like much, but after repeating this process 10,000 times, for example, the error plummets to 0.000035085. At this point, when we feed forward 0.05 and 0.1, the two output neurons generate 0.015912196 (vs 0.01 target) and 0.984065734 (vs 0.99 target).
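Putting the pieces together, a compact sketch of the whole loop, reusing the sigmoid function defined earlier. It follows the post in holding the biases fixed and training on the single example:

    def train_step(w, i1, i2, t1, t2, b1=0.35, b2=0.60, eta=0.5):
        """One forward pass plus one round of backpropagation.
        w is the tuple (w1..w8); returns the updated weights and total error."""
        w1, w2, w3, w4, w5, w6, w7, w8 = w
        out_h1 = sigmoid(w1 * i1 + w2 * i2 + b1)
        out_h2 = sigmoid(w3 * i1 + w4 * i2 + b1)
        out_o1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2)
        out_o2 = sigmoid(w7 * out_h1 + w8 * out_h2 + b2)
        error = 0.5 * (t1 - out_o1) ** 2 + 0.5 * (t2 - out_o2) ** 2

        d_o1 = -(t1 - out_o1) * out_o1 * (1 - out_o1)
        d_o2 = -(t2 - out_o2) * out_o2 * (1 - out_o2)
        d_h1 = (d_o1 * w5 + d_o2 * w7) * out_h1 * (1 - out_h1)
        d_h2 = (d_o1 * w6 + d_o2 * w8) * out_h2 * (1 - out_h2)

        return (w1 - eta * d_h1 * i1, w2 - eta * d_h1 * i2,
                w3 - eta * d_h2 * i1, w4 - eta * d_h2 * i2,
                w5 - eta * d_o1 * out_h1, w6 - eta * d_o1 * out_h2,
                w7 - eta * d_o2 * out_h1, w8 - eta * d_o2 * out_h2), error

    weights = (0.15, 0.20, 0.25, 0.30, 0.40, 0.45, 0.50, 0.55)
    for _ in range(10000):
        weights, error = train_step(weights, 0.05, 0.10, 0.01, 0.99)
    print(error)  # ~0.000035 after 10,000 rounds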
If you've made it this far and found any errors in any of the above or can think of any ways to make it clearer for future readers, don't hesitate to drop me a note. Thanks!



Posted on March 17, 2015 by Mazur. This entry was posted in Machine Learning and tagged ai, backpropagation, machine learning, neural networks. Bookmark the permalink.

115 thoughts on "A Step by Step Backpropagation Example"


Mostafa Razavi
December 7, 2015 at 1:09 pm
That was heaven, thanks a million.

Sonal Shrivastava
December 8, 2015 at 11:40 am
That was awesome. Thanks a ton.

Nayantara
December 9, 2015 at 7:29 am
Hi Matt, can you also please provide a similar example for a convolutional neural network which uses at least 1 convolutional layer and 1 pooling layer? Surprisingly, I haven't been able to find ANY similar example for backpropagation, on the internet, for Conv. Neural Network.
TIA.

Mazur
December 9, 2015 at 8:36 am
I haven't learnt that yet. If you find a good tutorial please let me know.

Pingback: A Step by Step Backpropagation Example | Matt Mazur | tensorflow graphs

payamrastogi
December 11, 2015 at 4:24 am
All hail to The Mazur


Louis Hong
December 11, 2015 at 4:41 pm
Thank you so much for your most comprehensive tutorial ever on the internet.

ad
December 17, 2015 at 1:49 am
why is bias not updated?

Mazur
December 17, 2015 at 9:23 am
Hey, in the tutorials I went through they didn't update the bias which is why I didn't include it here.

justaguy
December 24, 2015 at 8:54 pm
Typically, bias error is equal to the sum of the errors of the neurons that the bias connects to. For example, in regards to your example, b1_error = h1_error + h2_error. Updating the bias weight would be adding the product of the summed errors and the learning rate to the bias, ex. b1_weight += b1_error * learning_rate. Although many problems can be learned by a neural network without adjusting biases, and there may be better ways to adjust bias weights. Also, updating bias weights may cause problems with learning as opposed to keeping them static. As usual with neural networks, through experimentation you may discover more optimal designs.
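For readers who want to try this, a minimal sketch of a gradient-style bias update, reusing eta and the node deltas (delta_o1, delta_o2, delta_h1, delta_h2) from the sketches in the post above; the post itself leaves b1 and b2 fixed:

    # The derivative of a net input with respect to its bias is 1, so each bias
    # moves by the sum of the deltas of the neurons it feeds into.
    b2_new = b2 - eta * (delta_o1 + delta_o2)  # output-layer bias
    b1_new = b1 - eta * (delta_h1 + delta_h2)  # hidden-layer bias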

patriczhao
January 13, 2016 at 1:30 am
nice explanations, thanks.

Ahad Khan
December 20, 2015 at 2:26 am
This is perfect. I am able to visualize the backpropagation algo better after reading this article. Thanks once again!

sunlyt
December 21, 2015 at 12:57 am
Brilliant. Thank you!

garky
December 24, 2015 at 8:25 am
If we have more than one sample in our dataset, how can we train it by considering all samples, not just one sample?

Daniel Zukowski
December 24, 2015 at 2:32 pm
Invaluable resource you've produced. Thank you for this clear, comprehensive, visual explanation. The inner mechanics of backpropagation are no longer a mystery to me.

Long Pham
December 26, 2015 at 10:58 am
precisely, intuitively, very easy to understand, great work, thank you.

Dionisius AN
December 27, 2015 at 1:16 pm
Thank you very much, it helps me a lot; you really give detailed direction that allows me to imagine how it works. I really appreciate it. May God repay your kindness a thousand times over. Thank you.

singhrocks91
December 28, 2015 at 1:35 am
Thank You. I have a better insight now

DGelling
January 1, 2016 at 6:48 pm
Shouldn't the derivative of out_o1 wrt net_o1 be net_o1 * (1 - net_o1)?

Naan Tadow
February 24, 2016 at 1:10 am
No, the one stated above is correct; see here for the steps on the gradient of the activation function with respect to its input value (net): https://theclevermachine.wordpress.com/2014/09/08/derivation-derivatives-for-common-neural-network-activation-functions/
Oh, and thanks for this Matt; I was able to work through your breakdown of the partial derivatives for the Andrew Ng ML Course on Coursera :D
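A quick standalone numerical check of this point: the analytic derivative out * (1 - out) matches a central-difference estimate of the logistic function's slope at the same net input.

    import math

    def sigmoid(x):
        return 1 / (1 + math.exp(-x))

    net = 0.3775                # net_h1 from the post
    out = sigmoid(net)
    analytic = out * (1 - out)  # derivative written in terms of the OUTPUT
    h = 1e-6
    numeric = (sigmoid(net + h) - sigmoid(net - h)) / (2 * h)
    print(abs(analytic - numeric) < 1e-8)  # True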

Pingback: Coding Neural networks | Bits and pieces
Pingback: Apprendre à coder un réseau de neurones | Actuaires Big Data
Pingback: Contextual Integration Is the Secret Weapon of Predictive Analytics

Aro
January 10, 2016 at 6:23 pm
thanks so much, I haven't seen a tutorial like this before.

DeriveMe
January 12, 2016 at 1:22 am
Hello. I don't understand, below the phrase "First, how much does the total error change with respect to the output?", why there is a (*-1) in the second equation, which eventually changes the result to -(target - output) instead of just (target - output). Can you help me understand?
Thank you!

angie1pecht
January 17, 2016 at 8:52 pm
This helped me a lot. Thank you so much!

LEarningAIagain
January 18, 2016 at 4:26 pm
This was awesome. Thanks so much!

Ashish
January 19, 2016 at 7:21 am
Thanks a lot Matt. Appreciated the effort, Kudos

Pingback: Learning How To Code Neural Networks | ipython blog

Tariq
January 20, 2016 at 12:03 pm
If the error isn't squared but is simply E = sum(target - output), you can still do the calculus to work out the error gradient... and then update the weights. Where did I go wrong with this logic?

Elliot
January 28, 2016 at 9:03 am
Good afternoon, dear Matt Mazur!
Thank you very much for writing such a complete and comprehensive tutorial; everything is understandable and written in an accessible way! If possible, may I ask the following question: if I need to compute Jacobian matrix elements in the formula for computing the error gradient with respect to a weight, dEtotal/dwi, should I just perceive Etotal not as the full error from all outputs but as an error from some certain single output? Could you please say whether this is correct? Also, are you planning to make a similar tutorial for computing second order derivatives (backpropagation with partial derivatives of second order)? I have searched the internet for a tutorial on calculating second order derivatives in backpropagation but did not find anything. Maybe you know some good tutorials for it? I know that second order partial derivatives (elements of the Hessian matrix) can be approximated by multiplying Jacobians, but I wanted to find their exact non-approximated calculation. Thank you in advance for your reply!
Sincerely

Pulley
February 1, 2016 at 9:52 pm
hello Matt, can you please tell me whether, after updating all weights in the first iteration, I should update the values of all h at last in the first iteration or not.

Behroz Ahmad Ali
February 6, 2016 at 8:01 am
Thank you for such a comprehensive explanation of backpropagation. I have been trying to understand backpropagation for months, but today I finally understood it after reading this post of yours.

Tariq
February 8, 2016 at 10:57 am
i am writing a gentle intro to neural networks aimed at being accessible to someone at school, approx age 15. here is a draft which includes a very very gentle intro to backprop
https://goo.gl/7uxHlm
i'd appreciate feedback to @myoneuralnet

Rebeka Sultana
February 16, 2016 at 12:59 am
Thank you so much.

Ron
February 21, 2016 at 1:10 pm
Firstly, thank you VERY much for a great walkthrough of all the steps involved with real values. I managed to create a quick implementation of the methods used, and was able to train successfully.
I was looking to use this setup (but with 4 inputs / 3 outputs) for the famous iris data (http://archive.ics.uci.edu/ml/datasets/Iris). The 3 outputs would be 0.0-1.0 for each classification, as there would be an output weight towards each type.
Unfortunately it doesn't seem to be able to resolve to an always lower error value, and fluctuates drastically as it trains. Is this an indication that a second layer is needed for this type of data?

Werner
February 22, 2016 at 5:44 am
The first explanation I read that actually makes sense to me. Most just seem to start shovelling maths in your face in the name of not making it simpler than they should. Now let's hope my AI will finally be able to play a game of draughts.

admin
February 22, 2016 at 9:20 am
It helps me a lot. thanks for the work!!!

Name (required)
February 24, 2016 at 9:04 pm
Great tutorial. By any chance do you know how to backpropagate 2 hidden layers?

Mazur
February 25, 2016 at 8:22 am
I do not, sorry.

Kiran
February 25, 2016 at 12:29 am
Thank you so much! The explanation was so intuitive.

Anon
February 25, 2016 at 11:18 pm
Thank you! The way you explain this is very intuitive.

tariq
February 26, 2016 at 9:38 am
I'd love your feedback on my attempt to explain the maths and ideas underlying neural networks and backprop.
Here's an early draft online. The aim for me is to reach as many people as possible, inc teenagers with school maths.
http://makeyourownneuralnetwork.blogspot.co.uk/2016/02/early-draft-feedback-wanted.html

Garett Ridge And Then Some More Words
March 1, 2016 at 5:45 pm
I have a presentation tomorrow on neural networks in a grad class that I'm drowning in. This book is going to save my life

falcatrua
February 29, 2016 at 2:23 pm
It's a great tutorial but I think I found an error:
at forward pass values should be:
net_h1 = 0.15 * 0.05 + 0.25 * 0.1 + 0.35 * 1 = 0.3825
out_h1 = 1 / (1 + e^(-0.3825)) = 0.594475931
net_h2 = 0.20 * 0.05 + 0.30 * 0.1 + 0.35 * 1 = 0.39
out_h2 = 1 / (1 + e^(-0.39)) = 0.596282699

Garett Ridge And Then Some More Words
March 1, 2016 at 9:37 pm
The labels go the other way in his drawing, where the label that says w_2 goes with the line it's next to (on the right of it) and the value of w_2 gets written to the left. Look at the previous drawing without the values to see what I mean.

Bill
March 2, 2016 at 3:09 am
Good stuff! Professors should learn from you. Most professors make complex things complex. A real good teacher should make complex things simple.

b
March 2, 2016 at 3:11 am
Also, I recommend this link if you want to find an even simpler example than this one.
http://www.cs.toronto.edu/~tijmen/csc321/inclass/140123.pdf

Priti
March 2, 2016 at 4:27 am
Can you give an example for backpropagation in optical networks?

Moboluwarin
March 2, 2016 at 2:13 pm
Hey there, very helpful indeed. In the line for net_o1 = w5 * out_h1 + w6 * out_h2 + b2 * 1, is it not meant to be w7??
Cheers

Dara
March 4, 2016 at 9:17 am
Can anyone help me explain the manual calculation for testing outputs with trained weights and bias? It seems it does not give the correct answer when I directly substitute my inputs into the equations. Answers are different from what I get from the MATLAB NN toolbox.


