A Step by Step Backpropagation Example
Matt Mazur

Background
Backpropagation is a common method for training a neural network. There is no shortage of papers online that attempt to explain how backpropagation works, but few that include an example with actual numbers. This post is my attempt to explain how it works with a concrete example that folks can compare their own calculations to in order to ensure they understand backpropagation correctly.
If this kind of thing interests you, you should sign up for my newsletter where I post about AI-related projects that I'm working on.
Backpropagation in Python
You can play around with a Python script that I wrote that implements the backpropagation algorithm in this Github repo.
Backpropagation Visualization

For an interactive visualization showing a neural network as it learns, check out my Neural Network visualization.
Additional Resources

If you find this tutorial useful and want to continue learning about neural networks and their applications, I highly recommend checking out Adrian Rosebrock's excellent tutorial on Getting Started with Deep Learning and Python.
Overview
For this tutorial, we're going to use a neural network with two inputs, two hidden neurons, and two output neurons. Additionally, the hidden and output neurons will include a bias.

Here's the basic structure:

[Figure: a 2-2-2 network with inputs i1, i2, hidden neurons h1, h2, output neurons o1, o2, and biases b1, b2]
In order to have some numbers to work with, here are the initial weights, the biases, and training inputs/outputs:

[Figure: the network with initial values w1 = 0.15, w2 = 0.20, w3 = 0.25, w4 = 0.30, b1 = 0.35 and w5 = 0.40, w6 = 0.45, w7 = 0.50, w8 = 0.55, b2 = 0.60]
The goal of backpropagation is to optimize the weights so that the neural network can learn how to correctly map arbitrary inputs to outputs.
For the rest of this tutorial we're going to work with a single training set: given inputs 0.05 and 0.10, we want the neural network to output 0.01 and 0.99.
The Forward Pass
To begin, let's see what the neural network currently predicts given the weights and biases above and inputs of 0.05 and 0.10. To do this we'll feed those inputs forward through the network.
We figure out the total net input to each hidden layer neuron, squash the total net input using an activation function (here we use the logistic function), then repeat the process with the output layer neurons.

Total net input is also referred to as just net input by some sources.
Here's how we calculate the total net input for h1:

net_h1 = w1 * i1 + w2 * i2 + b1 * 1
net_h1 = 0.15 * 0.05 + 0.20 * 0.10 + 0.35 * 1 = 0.3775

We then squash it using the logistic function to get the output of h1:

out_h1 = 1 / (1 + e^-net_h1) = 1 / (1 + e^-0.3775) = 0.593269992

Carrying out the same process for h2 we get:

out_h2 = 0.596884378
We repeat this process for the output layer neurons, using the output from the hidden layer neurons as inputs.

Here's the output for o1:

net_o1 = w5 * out_h1 + w6 * out_h2 + b2 * 1
net_o1 = 0.40 * 0.593269992 + 0.45 * 0.596884378 + 0.60 * 1 = 1.105905967

out_o1 = 1 / (1 + e^-net_o1) = 1 / (1 + e^-1.105905967) = 0.75136507

And carrying out the same process for o2 we get:

out_o2 = 0.772928465
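The forward pass described above can be sketched in a few lines of Python. This is my own minimal sketch, not the script from the Github repo; the variable names mirror the notation in the post:

```python
import math

def sigmoid(x):
    # The logistic function used to squash each neuron's total net input.
    return 1 / (1 + math.exp(-x))

# Inputs, weights, and biases from the figure above.
i1, i2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60

# Hidden layer: compute each total net input, then squash it.
net_h1 = w1 * i1 + w2 * i2 + b1 * 1
net_h2 = w3 * i1 + w4 * i2 + b1 * 1
out_h1, out_h2 = sigmoid(net_h1), sigmoid(net_h2)

# Output layer: the hidden outputs become the inputs.
net_o1 = w5 * out_h1 + w6 * out_h2 + b2 * 1
net_o2 = w7 * out_h1 + w8 * out_h2 + b2 * 1
out_o1, out_o2 = sigmoid(net_o1), sigmoid(net_o2)

print(round(out_o1, 6), round(out_o2, 6))  # 0.751365 0.772928
```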
Calculating the Total Error
We can now calculate the error for each output neuron using the squared error function and sum them to get the total error:

E_total = Σ 1/2 (target - output)^2

Some sources refer to the target as the ideal and the output as the actual.

The 1/2 is included so that the exponent is cancelled when we differentiate later on. The result is eventually multiplied by a learning rate anyway so it doesn't matter that we introduce a constant here [1].

For example, the target output for o1 is 0.01 but the neural network output 0.75136507, therefore its error is:

E_o1 = 1/2 (target_o1 - out_o1)^2 = 1/2 (0.01 - 0.75136507)^2 = 0.274811083

Repeating this process for o2 (remembering that the target is 0.99) we get:

E_o2 = 0.023560026

The total error for the neural network is the sum of these errors:

E_total = E_o1 + E_o2 = 0.274811083 + 0.023560026 = 0.298371109
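The error calculation is short enough to check directly. A quick sketch, assuming the forward-pass outputs computed above:

```python
# Targets and the forward-pass outputs from earlier in the post.
target_o1, target_o2 = 0.01, 0.99
out_o1, out_o2 = 0.75136507, 0.772928465

# Squared error for each output neuron, then the total.
E_o1 = 0.5 * (target_o1 - out_o1) ** 2
E_o2 = 0.5 * (target_o2 - out_o2) ** 2
E_total = E_o1 + E_o2
print(round(E_total, 6))  # 0.298371
```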
The Backwards Pass

Our goal with backpropagation is to update each of the weights in the network so that they cause the actual output to be closer to the target output, thereby minimizing the error for each output neuron and the network as a whole.
Output Layer

Consider w5. We want to know how much a change in w5 affects the total error, aka ∂E_total/∂w5.

∂E_total/∂w5 is read as "the partial derivative of E_total with respect to w5". You can also say "the gradient with respect to w5".

By applying the chain rule we know that:

∂E_total/∂w5 = ∂E_total/∂out_o1 * ∂out_o1/∂net_o1 * ∂net_o1/∂w5

Visually, here's what we're doing:

[Figure: the chain rule path from E_total back through out_o1 and net_o1 to w5]
We need to figure out each piece in this equation.

First, how much does the total error change with respect to the output?

∂E_total/∂out_o1 = 2 * 1/2 (target_o1 - out_o1)^(2-1) * -1 = -(target_o1 - out_o1) = -(0.01 - 0.75136507) = 0.74136507

-(target - out) is sometimes expressed as out - target.

When we take the partial derivative of the total error with respect to out_o1, the quantity 1/2 (target_o2 - out_o2)^2 becomes zero because out_o1 does not affect it, which means we're taking the derivative of a constant, which is zero.

Next, how much does the output of o1 change with respect to its total net input?

The partial derivative of the logistic function is the output multiplied by 1 minus the output:

∂out_o1/∂net_o1 = out_o1 (1 - out_o1) = 0.75136507 (1 - 0.75136507) = 0.186815602

Finally, how much does the total net input of o1 change with respect to w5?

net_o1 = w5 * out_h1 + w6 * out_h2 + b2 * 1
∂net_o1/∂w5 = out_h1 = 0.593269992

Putting it all together:

∂E_total/∂w5 = 0.74136507 * 0.186815602 * 0.593269992 = 0.082167041
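The logistic-derivative identity used above, out * (1 - out), can be sanity-checked numerically. A small sketch of mine, using the value of net_o1 from the forward pass and a central finite difference as an independent check:

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

x = 1.105905967                  # net_o1 from the forward pass
s = sigmoid(x)
analytic = s * (1 - s)           # out_o1 * (1 - out_o1)

# Central finite difference approximation of d(sigmoid)/dx at the same point.
numeric = (sigmoid(x + 1e-6) - sigmoid(x - 1e-6)) / 2e-6
print(round(analytic, 6), round(numeric, 6))  # 0.186816 0.186816
```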
You'll often see this calculation combined in the form of the delta rule:

∂E_total/∂w5 = -(target_o1 - out_o1) * out_o1 (1 - out_o1) * out_h1

Alternatively, we have ∂E_total/∂out_o1 and ∂out_o1/∂net_o1 which can be written as ∂E_total/∂net_o1, aka δ_o1 (the Greek letter delta), aka the node delta. We can use this to rewrite the calculation above:

δ_o1 = ∂E_total/∂out_o1 * ∂out_o1/∂net_o1 = ∂E_total/∂net_o1

Therefore:

∂E_total/∂w5 = δ_o1 * out_h1

Some sources extract the negative sign from δ so it would be written as:

∂E_total/∂w5 = -δ_o1 * out_h1

To decrease the error, we then subtract this value from the current weight (optionally multiplied by some learning rate, eta, which we'll set to 0.5):

w5+ = w5 - η * ∂E_total/∂w5 = 0.4 - 0.5 * 0.082167041 = 0.35891648

Some sources use α (alpha) to represent the learning rate, others use η (eta), and others even use ε (epsilon).

We can repeat this process to get the new weights w6, w7, and w8:

w6+ = 0.408666186
w7+ = 0.511301270
w8+ = 0.561370121

We perform the actual updates in the neural network after we have the new weights leading into the hidden layer neurons (ie, we use the original weights, not the updated weights, when we continue the backpropagation algorithm below).
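The output-layer update for w5 can be sketched directly from the three chain-rule pieces. The variable names below are mine; the values are the ones computed earlier in the post:

```python
target_o1 = 0.01
out_h1 = 0.593269992   # forward-pass hidden output
out_o1 = 0.75136507    # the network's current output for o1
w5, eta = 0.40, 0.5    # initial weight and learning rate

# The three pieces of the chain rule for dE_total/dw5.
dE_dout_o1 = -(target_o1 - out_o1)      # how the error changes with the output
dout_dnet_o1 = out_o1 * (1 - out_o1)    # logistic derivative
dnet_dw5 = out_h1                       # net_o1 changes with w5 by out_h1

dE_dw5 = dE_dout_o1 * dout_dnet_o1 * dnet_dw5
w5_new = w5 - eta * dE_dw5              # gradient descent step
print(round(w5_new, 6))  # 0.358916
```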
Hidden Layer

Next, we'll continue the backwards pass by calculating new values for w1, w2, w3, and w4.

Big picture, here's what we need to figure out:

∂E_total/∂w1 = ∂E_total/∂out_h1 * ∂out_h1/∂net_h1 * ∂net_h1/∂w1

Visually:

[Figure: the chain rule paths from both output errors back through h1 to w1]
We're going to use a similar process as we did for the output layer, but slightly different to account for the fact that the output of each hidden layer neuron contributes to the output (and therefore error) of multiple output neurons. We know that out_h1 affects both out_o1 and out_o2, therefore ∂E_total/∂out_h1 needs to take into consideration its effect on both output neurons:

∂E_total/∂out_h1 = ∂E_o1/∂out_h1 + ∂E_o2/∂out_h1

Starting with ∂E_o1/∂out_h1:

∂E_o1/∂out_h1 = ∂E_o1/∂net_o1 * ∂net_o1/∂out_h1

We can calculate ∂E_o1/∂net_o1 using values we calculated earlier:

∂E_o1/∂net_o1 = ∂E_o1/∂out_o1 * ∂out_o1/∂net_o1 = 0.74136507 * 0.186815602 = 0.138498562
And ∂net_o1/∂out_h1 is equal to w5:

net_o1 = w5 * out_h1 + w6 * out_h2 + b2 * 1
∂net_o1/∂out_h1 = w5 = 0.40

Plugging them in:

∂E_o1/∂out_h1 = 0.138498562 * 0.40 = 0.055399425

Following the same process for ∂E_o2/∂out_h1, we get:

∂E_o2/∂out_h1 = -0.019049119

Therefore:

∂E_total/∂out_h1 = 0.055399425 + (-0.019049119) = 0.036350306

Now that we have ∂E_total/∂out_h1, we need to figure out ∂out_h1/∂net_h1 and then ∂net_h1/∂w for each weight:

∂out_h1/∂net_h1 = out_h1 (1 - out_h1) = 0.593269992 (1 - 0.593269992) = 0.241300709

We calculate the partial derivative of the total net input to h1 with respect to w1 the same as we did for the output neuron:

net_h1 = w1 * i1 + w2 * i2 + b1 * 1
∂net_h1/∂w1 = i1 = 0.05

Putting it all together:

∂E_total/∂w1 = 0.036350306 * 0.241300709 * 0.05 = 0.000438568

You might also see this written as:

∂E_total/∂w1 = (Σ_o δ_o * w_ho) * out_h1 (1 - out_h1) * i1 = δ_h1 * i1
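The hidden-layer step is the same chain rule with one twist: out_h1 feeds both output neurons, so both node deltas contribute. A sketch of mine using the values established earlier in the post:

```python
i1 = 0.05
out_h1 = 0.593269992
out_o1, out_o2 = 0.75136507, 0.772928465  # forward-pass outputs
target_o1, target_o2 = 0.01, 0.99
w1, w5, w7 = 0.15, 0.40, 0.50             # original (pre-update) weights
eta = 0.5

# Node deltas for the two output neurons: dE/dnet = -(target - out) * out * (1 - out).
delta_o1 = -(target_o1 - out_o1) * out_o1 * (1 - out_o1)
delta_o2 = -(target_o2 - out_o2) * out_o2 * (1 - out_o2)

# out_h1 feeds both output neurons, so both deltas contribute.
dE_dout_h1 = delta_o1 * w5 + delta_o2 * w7

# Chain through the logistic derivative at h1, then the input carried by w1.
dE_dw1 = dE_dout_h1 * (out_h1 * (1 - out_h1)) * i1
w1_new = w1 - eta * dE_dw1
print(round(w1_new, 6))  # 0.149781
```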
We can now update w1:

w1+ = w1 - η * ∂E_total/∂w1 = 0.15 - 0.5 * 0.000438568 = 0.149780716

Repeating this for w2, w3, and w4:

w2+ = 0.19956143
w3+ = 0.24975114
w4+ = 0.29950229

Finally, we've updated all of our weights! When we fed forward the 0.05 and 0.1 inputs originally, the error on the network was 0.298371109. After this first round of backpropagation, the total error is now down to 0.291027924. It might not seem like much, but after repeating this process 10,000 times, for example, the error plummets to 0.000035085. At this point, when we feed forward 0.05 and 0.1, the two output neurons generate 0.015912196 (vs 0.01 target) and 0.984065734 (vs 0.99 target).
If you've made it this far and found any errors in any of the above or can think of any ways to make it clearer for future readers, don't hesitate to drop me a note. Thanks!
Posted on March 17, 2015 by Mazur. This entry was posted in Machine Learning and tagged ai, backpropagation, machine learning, neural networks. Bookmark the permalink.
115 thoughts on "A Step by Step Backpropagation Example"

Mostafa Razavi
December 7, 2015 at 1:09 pm
That was heaven, thanks a million.

Sonal Shrivastava
December 8, 2015 at 11:40 am
That was awesome. Thanks a ton.

Nayantara
December 9, 2015 at 7:29 am
Hi Matt, can you also please provide a similar example for a convolutional neural network which uses at least 1 convolutional layer and 1 pooling layer? Surprisingly, I haven't been able to find ANY similar example for backpropagation, on the internet, for a Conv. Neural Network. TIA.

Mazur
December 9, 2015 at 8:36 am
I haven't learnt that yet. If you find a good tutorial please let me know.

Pingback: A Step by Step Backpropagation Example | Matt Mazur | tensorflow graphs

payamrastogi
December 11, 2015 at 4:24 am
All hail to The Mazur
Louis Hong
December 11, 2015 at 4:41 pm
Thank you so much for your most comprehensive tutorial ever on the internet.

ad
December 17, 2015 at 1:49 am
Why is bias not updated?

Mazur
December 17, 2015 at 9:23 am
Hey, in the tutorials I went through they didn't update the bias, which is why I didn't include it here.

justaguy
December 24, 2015 at 8:54 pm
Typically, bias error is equal to the sum of the errors of the neurons that the bias connects to. For example, in regards to your example, b1_error = h1_error + h2_error. Updating the bias weight would be adding the product of the summed errors and the learning rate to the bias, ex. b1_weight = b1_error * learning_rate. Although many problems can be learned by a neural network without adjusting biases, and there may be better ways to adjust bias weights. Also, updating bias weights may cause problems with learning as opposed to keeping them static. As usual with neural networks, through experimentation you may discover more optimal designs.

patriczhao
January 13, 2016 at 1:30 am
Nice explanations, thanks.
Ahad Khan
December 20, 2015 at 2:26 am
This is perfect. I am able to visualize the backpropagation algo better after reading this article. Thanks once again!

sunlyt
December 21, 2015 at 12:57 am
Brilliant. Thank you!
garky
December 24, 2015 at 8:25 am
If we have more than one sample in our dataset, how can we train it by considering all samples, not just one sample?

Daniel Zukowski
December 24, 2015 at 2:32 pm
Invaluable resource you've produced. Thank you for this clear, comprehensive, visual explanation. The inner mechanics of backpropagation are no longer a mystery to me.

Long Pham
December 26, 2015 at 10:58 am
Precise, intuitive, very easy to understand. Great work, thank you.

Dionisius AN
December 27, 2015 at 1:16 pm
Thank you very much, it helps me well. You really give detailed direction to allow me to imagine how it works. I really appreciate it. May God repay your kindness a thousand times what you do.
singhrocks91
December 28, 2015 at 1:35 am
Thank you. I have a better insight now.

DGelling
January 1, 2016 at 6:48 pm
Shouldn't the derivative of out_o1 wrt net_o1 be net_o1 * (1 - net_o1)?
Naan Tadow
February 24, 2016 at 1:10 am
No, the one stated above is correct. See here for the steps on the gradient of the activation function with respect to its input value (net):
https://theclevermachine.wordpress.com/2014/09/08/derivationderivativesforcommonneuralnetworkactivationfunctions/
Oh, and thanks for this Matt. I was able to work through your breakdown of the partial derivatives for the Andrew Ng ML Course on Coursera :D

Pingback: Coding Neural networks | Bits and pieces
Pingback: Apprendre à coder un réseau de neurones | Actuaires Big Data
Pingback: Contextual Integration Is the Secret Weapon of Predictive Analytics

Aro
January 10, 2016 at 6:23 pm
Thanks so much, I haven't seen a tutorial like this before.
DeriveMe
January 12, 2016 at 1:22 am
Hello. I don't understand, below the phrase "First, how much does the total error change with respect to the output?", why there is a (-1) in the second equation, that eventually changes the result to -(target - output) instead of just (target - output). Can you help me understand?
Thank you!
angie1pecht
January 17, 2016 at 8:52 pm
This helped me a lot. Thank you so much!

LEarningAIagain
January 18, 2016 at 4:26 pm
This was awesome. Thanks so much!

Ashish
January 19, 2016 at 7:21 am
Thanks a lot Matt. Appreciated the effort, kudos.

Pingback: Learning How To Code Neural Networks | ipythonblog
Tariq
January 20, 2016 at 12:03 pm
If the error isn't squared but simply E = sum(target - output), you can still do the calculus to work out the error gradient... and then update the weights. Where did I go wrong with this logic?
Elliot
January 28, 2016 at 9:03 am
Good afternoon, dear Matt Mazur!
Thank you very much for writing such a complete and comprehensive tutorial; everything is understandable and written in an accessible way! If it is possible, may I ask the following question: if I need to compute Jacobian matrix elements in the formula for computing the error gradient with respect to a weight, dEtotal/dwi, should I just perceive Etotal not as the full error from all outputs but as an error from some certain single output? Could you please say if this is correct? Could you please also say whether you are planning to make a similar tutorial but for computing second-order derivatives (backpropagation with partial derivatives of second order)? I have searched the internet for a tutorial on calculating second-order derivatives in backpropagation but did not find anything. Maybe you know some good tutorials for it? I know that second-order partial derivatives (elements of the Hessian matrix) can be approximated by multiplying Jacobians, but I wanted to find their exact non-approximated calculation. Thank you in advance for your reply!
Sincerely
Pulley
February 1, 2016 at 9:52 pm
Hello Matt, can you please tell me whether, after updating all weights in the first iteration, I should update the values of all h at last in the first iteration or not.

Behroz Ahmad Ali
February 6, 2016 at 8:01 am
Thank you for such a comprehensive explanation of backpropagation. I have been trying to understand backpropagation for months, but today I finally understood it after reading this post of yours.
Tariq
February 8, 2016 at 10:57 am
I am writing a gentle intro to neural networks aimed at being accessible to someone at school, approx age 15. Here is a draft which includes a very very gentle intro to backprop:
https://goo.gl/7uxHlm
I'd appreciate feedback to @myoneuralnet

Rebeka Sultana
February 16, 2016 at 12:59 am
Thank you so much.
Ron
February 21, 2016 at 1:10 pm
Firstly, thank you VERY much for a great walkthrough of all the steps involved with real values. I managed to create a quick implementation of the methods used, and was able to train successfully.
I was looking to use this setup (but with 4 inputs / 3 outputs) for the famous iris data (http://archive.ics.uci.edu/ml/datasets/Iris). The 3 outputs would be 0.0-1.0 for each classification, as there would be an output weight towards each type.
Unfortunately it doesn't seem to be able to resolve to an always-lower error value, and fluctuates drastically as it trains. Is this an indication that a second layer is needed for this type of data?

Werner
February 22, 2016 at 5:44 am
The first explanation I read that actually makes sense to me. Most just seem to start shovelling maths in your face in the name of not making it simpler than they should. Now let's hope my AI will finally be able to play a game of draughts.
admin
February 22, 2016 at 9:20 am
It helps me a lot. Thanks for the work!!!

Name (required)
February 24, 2016 at 9:04 pm
Great tutorial. By any chance do you know how to backpropagate 2 hidden layers?

Mazur
February 25, 2016 at 8:22 am
I do not, sorry.
Kiran
February 25, 2016 at 12:29 am
Thank you so much! The explanation was so intuitive.

Anon
February 25, 2016 at 11:18 pm
Thank you! The way you explain this is very intuitive.
tariq
February 26, 2016 at 9:38 am
I'd love your feedback on my attempt to explain the maths and ideas underlying neural networks and backprop.
Here's an early draft online. The aim for me is to reach as many people as possible, inc teenagers with school maths.
http://makeyourownneuralnetwork.blogspot.co.uk/2016/02/earlydraftfeedbackwanted.html

Garett Ridge (And Then Some More Words)
March 1, 2016 at 5:45 pm
I have a presentation tomorrow on neural networks in a grad class that I'm drowning in. This book is going to save my life.
falcatrua
February 29, 2016 at 2:23 pm
It's a great tutorial, but I think I found an error. At the forward pass, values should be:
net_h1 = 0.15 * 0.05 + 0.25 * 0.1 + 0.35 * 1 = 0.3825
out_h1 = 1 / (1 + e^-0.3825) = 0.594475931
net_h2 = 0.20 * 0.05 + 0.30 * 0.1 + 0.35 * 1 = 0.39
out_h2 = 1 / (1 + e^-0.39) = 0.596282699

Garett Ridge (And Then Some More Words)
March 1, 2016 at 9:37 pm
The labels go the other way in his drawing, where the label that says w_2 goes with the line it's next to (on the right of it) and the value of w_2 gets written to the left. Look at the previous drawing without the values to see what I mean.
Bill
March 2, 2016 at 3:09 am
Good stuff! Professors should learn from you. Most professors make complex things complex. A real good teacher should make complex things simple.

b
March 2, 2016 at 3:11 am
Also, I recommend this link if you want to find an even simpler example than this one:
http://www.cs.toronto.edu/~tijmen/csc321/inclass/140123.pdf

Priti
March 2, 2016 at 4:27 am
Can you give an example for backpropagation in optical networks?

Moboluwarin
March 2, 2016 at 2:13 pm
Hey there, very helpful indeed. In the line for net_o1 = w5 * out_h1 + w6 * out_h2 + b2 * 1, is it not meant to be w7??
Cheers

Dara
March 4, 2016 at 9:17 am
Can anyone help me explaining manual calculation for testing outputs with trained weights and bias? Seems it does not give the correct answer when I directly substitute my inputs to the equations. Answers are different than I get from the MATLAB NN toolbox.