Luke de Oliveira
@lukede0 · @lukedeo · lukedeo@ldo.io · https://ldo.io
Welcome!
Schedule
• RNNs
• CNNs
• What is an order?
• Answer: reductions!
Reductions (or, “bags”)
• Pros:
• Interpretable (sometimes…)
• …?
• Cons:
[Diagram: a sequence x1…x6 collapsed by a reduction that ignores time; a variant augments each observation with the elapsed time since the first: 0, Δ1, Δ2, …, Δ5]
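A time-ignoring reduction of this kind can be sketched as a mean pool over the sequence axis (a minimal illustration, not from the slides):

```python
import numpy as np

# A "bag" reduction: collapse a variable-length sequence of feature
# vectors (x1..x6) into one fixed-size vector by ignoring time order.
def bag_reduce(xs: np.ndarray) -> np.ndarray:
    """Mean-pool over the time axis: (T, d) -> (d,)."""
    return xs.mean(axis=0)

seq = np.random.randn(6, 4)   # six observations, 4 features each
z = bag_reduce(seq)           # order-invariant summary
assert z.shape == (4,)

# Permuting the sequence leaves the reduction unchanged:
perm = np.random.permutation(6)
assert np.allclose(bag_reduce(seq[perm]), z)
```

Any permutation-invariant aggregate (sum, max, mean) works as the pooling step; that invariance is exactly what "ignore time" means here.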
• [0, 1] and [-1, 1] and [0, 3], fine! Then adding [-99, 120]? Not ideal…
• Classification / regression / survival / etc. agnostic
• Transformers
Next state
Recurrent Neural Networks
• Take the last hidden state
• Output every hidden state, then use them to predict the target
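The two output archetypes can be sketched with a minimal Elman-style RNN cell (random stand-in weights, illustrative shapes only):

```python
import numpy as np

# Minimal recurrent cell: next state h_t = tanh(W x_t + U h_{t-1}).
rng = np.random.default_rng(0)
d_in, d_h, T = 4, 8, 6
W = rng.normal(size=(d_h, d_in)) * 0.1   # input -> hidden
U = rng.normal(size=(d_h, d_h)) * 0.1    # hidden -> hidden (recurrence)

def run_rnn(xs: np.ndarray) -> np.ndarray:
    h = np.zeros(d_h)
    states = []
    for x in xs:
        h = np.tanh(W @ x + U @ h)       # next state
        states.append(h)
    return np.stack(states)              # (T, d_h)

xs = rng.normal(size=(T, d_in))          # x1..x6
H = run_rnn(xs)
last = H[-1]      # archetype 1: summarize the sequence with the final state
per_step = H      # archetype 2: keep every state, e.g. for per-step targets
assert last.shape == (d_h,) and per_step.shape == (T, d_h)
```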
RNNs (summary)
[Diagrams: the two RNN archetypes applied to x1…x6; a width-3 convolutional filter sliding over x1, x2, x3]
Filter is parameterized by w, u, v
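A width-3 filter with weights (w, u, v) sliding over the sequence can be sketched as follows (the particular weight values are illustrative):

```python
import numpy as np

# 1-D convolution: the same (w, u, v) filter is reused at every position,
# so each output mixes three neighboring inputs.
w, u, v = 0.5, 1.0, -0.5
kernel = np.array([w, u, v])

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # x1..x6

# "Valid" convolution: 6 inputs, width-3 filter -> 4 outputs.
out = np.array([kernel @ x[i:i + 3] for i in range(len(x) - 2)])
assert out.shape == (4,)
# First output: w*x1 + u*x2 + v*x3 = 0.5*1 + 1.0*2 - 0.5*3 = 1.0
assert np.isclose(out[0], 1.0)
```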
• Position encoding
Positional Encoding
[Diagram: position encodings p1…p6 added to the inputs x1…x6]
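One standard choice for the p1…p6 vectors is the sinusoidal encoding: sin/cos pairs at geometrically spaced frequencies, added to the token embeddings so position is visible to an otherwise order-blind model. A minimal sketch:

```python
import numpy as np

def positional_encoding(T: int, d: int) -> np.ndarray:
    """Sinusoidal positional encodings, shape (T, d)."""
    pos = np.arange(T)[:, None]              # positions 0..T-1
    i = np.arange(0, d, 2)[None, :]          # even dimension indices
    angles = pos / (10000 ** (i / d))
    P = np.zeros((T, d))
    P[:, 0::2] = np.sin(angles)              # even dims: sine
    P[:, 1::2] = np.cos(angles)              # odd dims: cosine
    return P

T, d = 6, 8
X = np.random.randn(T, d)        # embeddings for x1..x6
P = positional_encoding(T, d)    # p1..p6
X_pos = X + P                    # what the model actually consumes
assert P.shape == (T, d)
```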
How does this fit with our archetypes?
Paying Attention in Sequence Models
• Attention has gotten lots of hype - intuitively, give a deep learning model an explicit mechanism to look through its history
• In self-attention, the word 'treat' can pay attention to the word 'dog', and report that out to another layer
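Single-head scaled dot-product self-attention captures that intuition: each position forms a query, compares it against every key (its history plus itself), and reports an attention-weighted mix of values to the next layer. A sketch with random stand-ins for the learned projections:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 8
X = rng.normal(size=(T, d))                  # x1..x6
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))

Q, K, V = X @ Wq, X @ Wk, X @ Wv
scores = Q @ K.T / np.sqrt(d)                # (T, T) pairwise attention logits
A = np.exp(scores - scores.max(axis=1, keepdims=True))
A /= A.sum(axis=1, keepdims=True)            # softmax over keys: rows sum to 1
out = A @ V                                  # each row: weighted mix of values

assert A.shape == (T, T) and np.allclose(A.sum(axis=1), 1.0)
assert out.shape == (T, d)
```

Row t of `A` says how much position t "pays attention" to every other position, exactly the treat-to-dog link described above.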
Transformers
Encoder Decoder
Transformers (overly simplified encoder)
[Diagram: the encoder consuming the sequence x1…x6]
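In that spirit, an overly simplified encoder layer is just self-attention to mix information across positions, followed by a position-wise feed-forward net, each wrapped in a residual add (layer norm and multiple heads omitted for brevity; all weights are random stand-ins):

```python
import numpy as np

rng = np.random.default_rng(1)
T, d = 6, 8
X = rng.normal(size=(T, d))                   # x1..x6 (with positions added)

def softmax(s: np.ndarray) -> np.ndarray:
    e = np.exp(s - s.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
W1 = rng.normal(size=(d, 4 * d)) * 0.1        # FFN expand
W2 = rng.normal(size=(4 * d, d)) * 0.1        # FFN contract

# Self-attention sublayer, with a residual connection around it.
attn = softmax((X @ Wq) @ (X @ Wk).T / np.sqrt(d)) @ (X @ Wv)
h = X + attn

# Position-wise ReLU feed-forward sublayer, again with a residual.
out = h + np.maximum(h @ W1, 0.0) @ W2
assert out.shape == (T, d)                    # same shape in, same shape out
```

Stacking several such layers, and pairing the encoder with a decoder that also attends to the encoder's output, gives the full transformer architecture.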