A Step by Step Backpropagation Example
Matt Mazur
Background

Backpropagation is a common method for training a neural network. There is no shortage of papers online that attempt to explain how backpropagation works, but few that include an example with actual numbers. This post is my attempt to explain how it works with a concrete example that folks can compare their own calculations to in order to ensure they understand backpropagation correctly.


If this kind of thing interests you, you should sign up for my newsletter where I post about AI-related projects that I'm working on.


Backpropagation in Python


You can play around with a Python script that I wrote that implements the backpropagation algorithm in this Github repo.

Backpropagation Visualization


For an interactive visualization showing a neural network as it learns, check out my Neural Network visualization.


Additional Resources

If you find this tutorial useful and want to continue learning about neural networks and their applications, I highly recommend checking out Adrian Rosebrock's excellent tutorial on Getting Started with Deep Learning and Python.

Overview


For this tutorial, we're going to use a neural network with two inputs, two hidden neurons, and two output neurons. Additionally, the hidden and output neurons will each include a bias.


Here's the basic structure:

[Figure: the network diagram. Inputs i1 and i2 feed hidden neurons h1 and h2 (with bias b1), whose outputs feed output neurons o1 and o2 (with bias b2), through weights w1 through w8.]

In order to have some numbers to work with, here are the initial weights, the biases, and training inputs/outputs:

[Figure: the same diagram annotated with the starting values: w1 = 0.15, w2 = 0.20, w3 = 0.25, w4 = 0.30, b1 = 0.35; w5 = 0.40, w6 = 0.45, w7 = 0.50, w8 = 0.55, b2 = 0.60; inputs i1 = 0.05, i2 = 0.10; target outputs 0.01 and 0.99.]


The goal of backpropagation is to optimize the weights so that the neural network can learn how to correctly map arbitrary inputs to outputs.


For the rest of this tutorial we're going to work with a single training set: given inputs 0.05 and 0.10, we want the neural network to output 0.01 and 0.99.
The Forward Pass

To begin, let's see what the neural network currently predicts given the weights and biases above and inputs of 0.05 and 0.10. To do this we'll feed those inputs forward through the network.

We figure out the total net input to each hidden layer neuron, squash the total net input using an activation function (here we use the logistic function), then repeat the process with the output layer neurons.

Total net input is also referred to as just net input by some sources.

Here's how we calculate the total net input for h1:

net_h1 = w1 * i1 + w2 * i2 + b1 * 1
net_h1 = 0.15 * 0.05 + 0.2 * 0.1 + 0.35 * 1 = 0.3775

We then squash it using the logistic function to get the output of h1:

out_h1 = 1 / (1 + e^(-net_h1)) = 1 / (1 + e^(-0.3775)) = 0.593269992

Carrying out the same process for h2 we get:

out_h2 = 0.596884378

We repeat this process for the output layer neurons, using the output from the hidden layer neurons as inputs.

Here's the output for o1:

net_o1 = w5 * out_h1 + w6 * out_h2 + b2 * 1
net_o1 = 0.4 * 0.593269992 + 0.45 * 0.596884378 + 0.6 * 1 = 1.105905967
out_o1 = 1 / (1 + e^(-net_o1)) = 1 / (1 + e^(-1.105905967)) = 0.75136507

And carrying out the same process for o2 we get:

out_o2 = 0.772928465
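If it helps to see the arithmetic in code, here is a minimal Python sketch of this forward pass. This is separate from the script in the Github repo linked above; the variable names (i1, w1, out_h1, and so on) simply mirror the diagram:

    import math

    def sigmoid(x):
        """The logistic function used to squash each net input."""
        return 1 / (1 + math.exp(-x))

    # Initial weights, biases, and inputs from the figure above
    i1, i2 = 0.05, 0.10
    w1, w2, w3, w4, b1 = 0.15, 0.20, 0.25, 0.30, 0.35
    w5, w6, w7, w8, b2 = 0.40, 0.45, 0.50, 0.55, 0.60

    # Hidden layer
    out_h1 = sigmoid(w1 * i1 + w2 * i2 + b1 * 1)  # 0.593269992
    out_h2 = sigmoid(w3 * i1 + w4 * i2 + b1 * 1)  # 0.596884378

    # Output layer, using the hidden outputs as inputs
    out_o1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2 * 1)  # 0.75136507
    out_o2 = sigmoid(w7 * out_h1 + w8 * out_h2 + b2 * 1)  # 0.772928465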

Calculating the Total Error

We can now calculate the error for each output neuron using the squared error function and sum them to get the total error:

E_total = Σ ½ (target - output)²

Some sources refer to the target as the ideal and the output as the actual.


The ½ is included so that the exponent is cancelled when we differentiate later on. The result is eventually multiplied by a learning rate anyway so it doesn't matter that we introduce a constant here [1].

For example, the target output for o1 is 0.01 but the neural network output 0.75136507, therefore its error is:

E_o1 = ½ (target_o1 - out_o1)² = ½ (0.01 - 0.75136507)² = 0.274811083

Repeating this process for o2 (remembering that the target is 0.99) we get:

E_o2 = 0.023560026

The total error for the neural network is the sum of these errors:

E_total = E_o1 + E_o2 = 0.274811083 + 0.023560026 = 0.298371109
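Continuing the sketch above, the same error calculation in code:

    target_o1, target_o2 = 0.01, 0.99

    E_o1 = 0.5 * (target_o1 - out_o1) ** 2  # 0.274811083
    E_o2 = 0.5 * (target_o2 - out_o2) ** 2  # 0.023560026
    E_total = E_o1 + E_o2                   # 0.298371109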

The Backwards Pass

Our goal with backpropagation is to update each of the weights in the network so that they cause the actual output to be closer to the target output, thereby minimizing the error for each output neuron and the network as a whole.
Output Layer

Consider w5. We want to know how much a change in w5 affects the total error, aka ∂E_total/∂w5.

∂E_total/∂w5 is read as "the partial derivative of E_total with respect to w5". You can also say "the gradient with respect to w5".

By applying the chain rule we know that:

∂E_total/∂w5 = ∂E_total/∂out_o1 * ∂out_o1/∂net_o1 * ∂net_o1/∂w5

Visually, here's what we're doing:

[Figure: the output neuron o1, with the three partial derivatives traced along the path from w5 through net_o1 and out_o1 to E_total.]


We need to figure out each piece in this equation.

First, how much does the total error change with respect to the output?

E_total = ½ (target_o1 - out_o1)² + ½ (target_o2 - out_o2)²
∂E_total/∂out_o1 = 2 * ½ (target_o1 - out_o1) * -1 + 0
∂E_total/∂out_o1 = -(target_o1 - out_o1) = -(0.01 - 0.75136507) = 0.74136507

-(target - out) is sometimes expressed as (out - target).

When we take the partial derivative of the total error with respect to out_o1, the quantity ½ (target_o2 - out_o2)² becomes zero because out_o1 does not affect it, which means we're taking the derivative of a constant, which is zero.

Next, how much does the output of o1 change with respect to its total net input?

The partial derivative of the logistic function is the output multiplied by 1 minus the output:

out_o1 = 1 / (1 + e^(-net_o1))
∂out_o1/∂net_o1 = out_o1 (1 - out_o1) = 0.75136507 (1 - 0.75136507) = 0.186815602

Finally, how much does the total net input of o1 change with respect to w5?

net_o1 = w5 * out_h1 + w6 * out_h2 + b2 * 1
∂net_o1/∂w5 = out_h1 = 0.593269992

Putting it all together:

∂E_total/∂w5 = ∂E_total/∂out_o1 * ∂out_o1/∂net_o1 * ∂net_o1/∂w5
∂E_total/∂w5 = 0.74136507 * 0.186815602 * 0.593269992 = 0.082167041

You'll often see this calculation combined in the form of the delta rule:

∂E_total/∂w5 = -(target_o1 - out_o1) * out_o1 (1 - out_o1) * out_h1

Alternatively, we have ∂E_total/∂out_o1 and ∂out_o1/∂net_o1, which can be written as ∂E_total/∂net_o1, aka δ_o1 (the Greek letter delta), aka the node delta. We can use this to rewrite the calculation above:

δ_o1 = ∂E_total/∂out_o1 * ∂out_o1/∂net_o1 = -(target_o1 - out_o1) * out_o1 (1 - out_o1)

Therefore:

∂E_total/∂w5 = δ_o1 * out_h1

Some sources extract the negative sign from δ so it would be written as:

∂E_total/∂w5 = -δ_o1 * out_h1

To decrease the error, we then subtract this value from the current weight (optionally multiplied by some learning rate, eta, which we'll set to 0.5):

w5+ = w5 - η * ∂E_total/∂w5 = 0.4 - 0.5 * 0.082167041 = 0.35891648

Some sources use α (alpha) to represent the learning rate, others use η (eta), and others even use ε (epsilon).

We can repeat this process to get the new weights w6, w7, and w8:

w6+ = 0.408666186
w7+ = 0.511301270
w8+ = 0.561370121

We perform the actual updates in the neural network after we have the new weights leading into the hidden layer neurons (i.e., we use the original weights, not the updated weights, when we continue the backpropagation algorithm below).
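In code, continuing the sketch from the forward pass, the output-layer deltas and weight updates look something like this. The new weights are stored separately so the originals stay available for the hidden-layer pass below:

    eta = 0.5  # learning rate

    # Node deltas: dE_total/dnet_o = -(target - out) * out * (1 - out)
    delta_o1 = -(target_o1 - out_o1) * out_o1 * (1 - out_o1)  # 0.138498562
    delta_o2 = -(target_o2 - out_o2) * out_o2 * (1 - out_o2)  # -0.038098237

    # Each gradient is the node delta times the input feeding that weight
    w5_new = w5 - eta * delta_o1 * out_h1  # 0.35891648
    w6_new = w6 - eta * delta_o1 * out_h2  # 0.408666186
    w7_new = w7 - eta * delta_o2 * out_h1  # 0.511301270
    w8_new = w8 - eta * delta_o2 * out_h2  # 0.561370121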

Hidden Layer

Next, we'll continue the backwards pass by calculating new values for w1, w2, w3, and w4.

Big picture, here's what we need to figure out:

∂E_total/∂w1 = ∂E_total/∂out_h1 * ∂out_h1/∂net_h1 * ∂net_h1/∂w1

Visually:

[Figure: the same chain of partial derivatives traced through hidden neuron h1, whose output feeds both E_o1 and E_o2.]

We're going to use a similar process as we did for the output layer, but slightly different to account for the fact that the output of each hidden layer neuron contributes to the output (and therefore error) of multiple output neurons. We know that out_h1 affects both out_o1 and out_o2, therefore ∂E_total/∂out_h1 needs to take into consideration its effect on both output neurons:

∂E_total/∂out_h1 = ∂E_o1/∂out_h1 + ∂E_o2/∂out_h1

Starting with ∂E_o1/∂out_h1:

∂E_o1/∂out_h1 = ∂E_o1/∂net_o1 * ∂net_o1/∂out_h1

We can calculate ∂E_o1/∂net_o1 using values we calculated earlier:

∂E_o1/∂net_o1 = ∂E_o1/∂out_o1 * ∂out_o1/∂net_o1 = 0.74136507 * 0.186815602 = 0.138498562

And ∂net_o1/∂out_h1 is equal to w5:

net_o1 = w5 * out_h1 + w6 * out_h2 + b2 * 1
∂net_o1/∂out_h1 = w5 = 0.40

Plugging them in:

∂E_o1/∂out_h1 = 0.138498562 * 0.40 = 0.055399425

Following the same process for ∂E_o2/∂out_h1, we get:

∂E_o2/∂out_h1 = -0.019049119

Therefore:

∂E_total/∂out_h1 = 0.055399425 - 0.019049119 = 0.036350306

Now that we have ∂E_total/∂out_h1, we need to figure out ∂out_h1/∂net_h1 and then ∂net_h1/∂w for each weight:

out_h1 = 1 / (1 + e^(-net_h1))
∂out_h1/∂net_h1 = out_h1 (1 - out_h1) = 0.59326999 (1 - 0.59326999) = 0.241300709

We calculate the partial derivative of the total net input to h1 with respect to w1 the same as we did for the output neuron:

net_h1 = w1 * i1 + w2 * i2 + b1 * 1
∂net_h1/∂w1 = i1 = 0.05

Putting it all together:

∂E_total/∂w1 = ∂E_total/∂out_h1 * ∂out_h1/∂net_h1 * ∂net_h1/∂w1
∂E_total/∂w1 = 0.036350306 * 0.241300709 * 0.05 = 0.000438568

You might also see this written as:

∂E_total/∂w1 = (Σ_o δ_o * w_ho) * out_h1 (1 - out_h1) * i1 = δ_h1 * i1


We can now update w1:

w1+ = w1 - η * ∂E_total/∂w1 = 0.15 - 0.5 * 0.000438568 = 0.149780716

Repeating this for w2, w3, and w4:

w2+ = 0.19956143
w3+ = 0.24975114
w4+ = 0.29950229
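The hidden-layer pass in the same sketch; note that it reuses the original w5 through w8 and the output deltas from above:

    # Each hidden output feeds both output errors, so sum over the output deltas
    dE_dout_h1 = delta_o1 * w5 + delta_o2 * w7  # 0.036350306
    dE_dout_h2 = delta_o1 * w6 + delta_o2 * w8

    # Hidden node deltas
    delta_h1 = dE_dout_h1 * out_h1 * (1 - out_h1)  # 0.008771354
    delta_h2 = dE_dout_h2 * out_h2 * (1 - out_h2)

    w1_new = w1 - eta * delta_h1 * i1  # 0.149780716
    w2_new = w2 - eta * delta_h1 * i2  # 0.199561432
    w3_new = w3 - eta * delta_h2 * i1  # 0.249751144
    w4_new = w4 - eta * delta_h2 * i2  # 0.299502287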

Finally, we've updated all of our weights! When we fed forward the 0.05 and 0.1 inputs originally, the error on the network was 0.298371109. After this first round of backpropagation, the total error is now down to 0.291027924. It might not seem like much, but after repeating this process 10,000 times, for example, the error plummets to 0.000035085. At this point, when we feed forward 0.05 and 0.1, the two output neurons generate 0.015912196 (vs 0.01 target) and 0.984065734 (vs 0.99 target).
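Putting the pieces together, a compact sketch of the whole loop, reusing the sigmoid function defined earlier. It follows the post in holding the biases fixed and training on the single example:

    def train_step(w, i1, i2, t1, t2, b1=0.35, b2=0.60, eta=0.5):
        """One forward pass plus one round of backpropagation.
        w is the tuple (w1..w8); returns the updated weights and total error."""
        w1, w2, w3, w4, w5, w6, w7, w8 = w
        out_h1 = sigmoid(w1 * i1 + w2 * i2 + b1)
        out_h2 = sigmoid(w3 * i1 + w4 * i2 + b1)
        out_o1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2)
        out_o2 = sigmoid(w7 * out_h1 + w8 * out_h2 + b2)
        error = 0.5 * (t1 - out_o1) ** 2 + 0.5 * (t2 - out_o2) ** 2

        d_o1 = -(t1 - out_o1) * out_o1 * (1 - out_o1)
        d_o2 = -(t2 - out_o2) * out_o2 * (1 - out_o2)
        d_h1 = (d_o1 * w5 + d_o2 * w7) * out_h1 * (1 - out_h1)
        d_h2 = (d_o1 * w6 + d_o2 * w8) * out_h2 * (1 - out_h2)

        return (w1 - eta * d_h1 * i1, w2 - eta * d_h1 * i2,
                w3 - eta * d_h2 * i1, w4 - eta * d_h2 * i2,
                w5 - eta * d_o1 * out_h1, w6 - eta * d_o1 * out_h2,
                w7 - eta * d_o2 * out_h1, w8 - eta * d_o2 * out_h2), error

    weights = (0.15, 0.20, 0.25, 0.30, 0.40, 0.45, 0.50, 0.55)
    for _ in range(10000):
        weights, error = train_step(weights, 0.05, 0.10, 0.01, 0.99)
    print(error)  # ~0.000035 after 10,000 rounds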
If you've made it this far and found any errors in any of the above or can think of any ways to make it clearer for future readers, don't hesitate to drop me a note. Thanks!



Posted on March 17, 2015 by Mazur. This entry was posted in Machine Learning and tagged ai, backpropagation, machine learning, neural networks. Bookmark the permalink.

115 thoughts on "A Step by Step Backpropagation Example"


Mostafa Razavi
December 7, 2015 at 1:09 pm
That was heaven, thanks a million.

Sonal Shrivastava
December 8, 2015 at 11:40 am
That was awesome. Thanks a ton.

Nayantara
December 9, 2015 at 7:29 am
Hi Matt, can you also please provide a similar example for a convolutional neural network which uses at least 1 convolutional layer and 1 pooling layer? Surprisingly, I haven't been able to find ANY similar example for backpropagation, on the internet, for Conv. Neural Network.
TIA.

Mazur
December 9, 2015 at 8:36 am
I haven't learnt that yet. If you find a good tutorial please let me know.

Pingback: A Step by Step Backpropagation Example | Matt Mazur | tensorflow graphs

payamrastogi
December 11, 2015 at 4:24 am
All hail to The Mazur


Louis Hong
December 11, 2015 at 4:41 pm
Thank you so much for your most comprehensive tutorial ever on the internet.

ad
December 17, 2015 at 1:49 am
why is bias not updated?

Mazur
December 17, 2015 at 9:23 am
Hey, in the tutorials I went through they didn't update the bias which is why I didn't include it here.

justaguy
December 24, 2015 at 8:54 pm
Typically, bias error is equal to the sum of the errors of the neurons that the bias connects to. For example, in regards to your example, b1_error = h1_error + h2_error. Updating the bias weight would be adding the product of the summed errors and the learning rate to the bias, ex. b1_weight += b1_error * learning_rate. Although many problems can be learned by a neural network without adjusting biases, and there may be better ways to adjust bias weights. Also, updating bias weights may cause problems with learning as opposed to keeping them static. As usual with neural networks, through experimentation you may discover more optimal designs.
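For readers who want to try this, a minimal sketch of a gradient-style bias update, reusing eta and the node deltas (delta_o1, delta_o2, delta_h1, delta_h2) from the sketches in the post above; the post itself leaves b1 and b2 fixed:

    # The derivative of a net input with respect to its bias is 1, so each bias
    # moves by the sum of the deltas of the neurons it feeds into.
    b2_new = b2 - eta * (delta_o1 + delta_o2)  # output-layer bias
    b1_new = b1 - eta * (delta_h1 + delta_h2)  # hidden-layer bias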

patriczhao
January 13, 2016 at 1:30 am
nice explanations, thanks.

Ahad Khan
December 20, 2015 at 2:26 am
This is perfect. I am able to visualize the backpropagation algo better after reading this article. Thanks once again!

sunlyt
December 21, 2015 at 12:57 am
Brilliant. Thank you!

garky
December 24, 2015 at 8:25 am
If we have more than one sample in our dataset, how can we train it by considering all samples, not just one sample?

Daniel Zukowski
December 24, 2015 at 2:32 pm
Invaluable resource you've produced. Thank you for this clear, comprehensive, visual explanation. The inner mechanics of backpropagation are no longer a mystery to me.

Long Pham
December 26, 2015 at 10:58 am
precisely, intuitively, very easy to understand, great work, thank you.

Dionisius AN
December 27, 2015 at 1:16 pm
Thank you very much, it helps me a lot; you really give detailed direction that allows me to imagine how it works. I really appreciate it. May God repay your kindness a thousand times over. Thank you.

singhrocks91
December 28, 2015 at 1:35 am
Thank You. I have a better insight now

DGelling
January 1, 2016 at 6:48 pm
Shouldn't the derivative of out_o1 wrt net_o1 be net_o1 * (1 - net_o1)?

Naan Tadow
February 24, 2016 at 1:10 am
No, the one stated above is correct; see here for the steps on the gradient of the activation function with respect to its input value (net): https://theclevermachine.wordpress.com/2014/09/08/derivation-derivatives-for-common-neural-network-activation-functions/
Oh, and thanks for this Matt; I was able to work through your breakdown of the partial derivatives for the Andrew Ng ML Course on Coursera :D
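A quick standalone numerical check of this point: the analytic derivative out * (1 - out) matches a central-difference estimate of the logistic function's slope at the same net input.

    import math

    def sigmoid(x):
        return 1 / (1 + math.exp(-x))

    net = 0.3775                # net_h1 from the post
    out = sigmoid(net)
    analytic = out * (1 - out)  # derivative written in terms of the OUTPUT
    h = 1e-6
    numeric = (sigmoid(net + h) - sigmoid(net - h)) / (2 * h)
    print(abs(analytic - numeric) < 1e-8)  # True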

Pingback: Coding Neural networks | Bits and pieces
Pingback: Apprendre à coder un réseau de neurones | Actuaires Big Data
Pingback: Contextual Integration Is the Secret Weapon of Predictive Analytics

Aro
January 10, 2016 at 6:23 pm
thanks so much, I haven't seen a tutorial like this before.

DeriveMe
January 12, 2016 at 1:22 am
Hello. I don't understand, below the phrase "First, how much does the total error change with respect to the output?", why there is a (*-1) in the second equation, which eventually changes the result to -(target - output) instead of just (target - output). Can you help me understand?
Thank you!

angie1pecht
January 17, 2016 at 8:52 pm
This helped me a lot. Thank you so much!

LEarningAIagain
January 18, 2016 at 4:26 pm
This was awesome. Thanks so much!

Ashish
January 19, 2016 at 7:21 am
Thanks a lot Matt. Appreciated the effort, Kudos

Pingback: Learning How To Code Neural Networks | ipython blog

Tariq
January 20, 2016 at 12:03 pm
If the error isn't squared but is simply E = sum(target - output), you can still do the calculus to work out the error gradient... and then update the weights. Where did I go wrong with this logic?

Elliot
January 28, 2016 at 9:03 am
Good afternoon, dear Matt Mazur!
Thank you very much for writing such a complete and comprehensive tutorial; everything is understandable and written in an accessible way! If possible, may I ask the following question: if I need to compute Jacobian matrix elements in the formula for computing the error gradient with respect to a weight, dEtotal/dwi, should I just perceive Etotal not as the full error from all outputs but as an error from some certain single output? Could you please say whether this is correct? Also, are you planning to make a similar tutorial for computing second order derivatives (backpropagation with partial derivatives of second order)? I have searched the internet for a tutorial on calculating second order derivatives in backpropagation but did not find anything. Maybe you know some good tutorials for it? I know that second order partial derivatives (elements of the Hessian matrix) can be approximated by multiplying Jacobians, but I wanted to find their exact non-approximated calculation. Thank you in advance for your reply!
Sincerely

Pulley
February 1, 2016 at 9:52 pm
hello Matt, can you please tell me whether, after updating all weights in the first iteration, I should update the values of all h at last in the first iteration or not.

Behroz Ahmad Ali
February 6, 2016 at 8:01 am
Thank you for such a comprehensive explanation of backpropagation. I have been trying to understand backpropagation for months, but today I finally understood it after reading this post of yours.

Tariq
February 8, 2016 at 10:57 am
i am writing a gentle intro to neural networks aimed at being accessible to someone at school, approx age 15. here is a draft which includes a very very gentle intro to backprop
https://goo.gl/7uxHlm
i'd appreciate feedback to @myoneuralnet

Rebeka Sultana
February 16, 2016 at 12:59 am
Thank you so much.

Ron
February 21, 2016 at 1:10 pm
Firstly, thank you VERY much for a great walkthrough of all the steps involved with real values. I managed to create a quick implementation of the methods used, and was able to train successfully.
I was looking to use this setup (but with 4 inputs / 3 outputs) for the famous iris data (http://archive.ics.uci.edu/ml/datasets/Iris). The 3 outputs would be 0.0-1.0 for each classification, as there would be an output weight towards each type.
Unfortunately it doesn't seem to be able to resolve to an always lower error value, and fluctuates drastically as it trains. Is this an indication that a second layer is needed for this type of data?

Werner
February 22, 2016 at 5:44 am
The first explanation I read that actually makes sense to me. Most just seem to start shovelling maths in your face in the name of not making it simpler than they should. Now let's hope my AI will finally be able to play a game of draughts.

admin
February 22, 2016 at 9:20 am
It helps me a lot. thanks for the work!!!

Name (required)
February 24, 2016 at 9:04 pm
Great tutorial. By any chance do you know how to backpropagate 2 hidden layers?

Mazur
February 25, 2016 at 8:22 am
I do not, sorry.

Kiran
February 25, 2016 at 12:29 am
Thank you so much! The explanation was so intuitive.

Anon
February 25, 2016 at 11:18 pm
Thank you! The way you explain this is very intuitive.

tariq
February 26, 2016 at 9:38 am
I'd love your feedback on my attempt to explain the maths and ideas underlying neural networks and backprop.
Here's an early draft online. The aim for me is to reach as many people as possible, inc teenagers with school maths.
http://makeyourownneuralnetwork.blogspot.co.uk/2016/02/early-draft-feedback-wanted.html

Garett Ridge And Then Some More Words
March 1, 2016 at 5:45 pm
I have a presentation tomorrow on neural networks in a grad class that I'm drowning in. This book is going to save my life

falcatrua
February 29, 2016 at 2:23 pm
It's a great tutorial but I think I found an error:
at forward pass values should be:
net_h1 = 0.15 * 0.05 + 0.25 * 0.1 + 0.35 * 1 = 0.3825
out_h1 = 1 / (1 + e^(-0.3825)) = 0.594475931
net_h2 = 0.20 * 0.05 + 0.30 * 0.1 + 0.35 * 1 = 0.39
out_h2 = 1 / (1 + e^(-0.39)) = 0.596282699

Garett Ridge And Then Some More Words
March 1, 2016 at 9:37 pm
The labels go the other way in his drawing, where the label that says w_2 goes with the line it's next to (on the right of it) and the value of w_2 gets written to the left. Look at the previous drawing without the values to see what I mean.

Bill
March 2, 2016 at 3:09 am
Good stuff! Professors should learn from you. Most professors make complex things complex. A real good teacher should make complex things simple.

b
March 2, 2016 at 3:11 am
Also, I recommend this link if you want to find an even simpler example than this one.
http://www.cs.toronto.edu/~tijmen/csc321/inclass/140123.pdf

Priti
March 2, 2016 at 4:27 am
Can you give an example for backpropagation in optical networks?

Moboluwarin
March 2, 2016 at 2:13 pm
Hey there, very helpful indeed. In the line for net_o1 = w5 * out_h1 + w6 * out_h2 + b2 * 1, is it not meant to be w7??
Cheers

Dara
March 4, 2016 at 9:17 am
Can anyone help me explain the manual calculation for testing outputs with trained weights and bias? It seems it does not give the correct answer when I directly substitute my inputs into the equations. Answers are different from what I get from the MATLAB NN toolbox.


