
Deep Learning for Semantic Composition

Xiaodan Zhu
National Research Council Canada

Edward Grefenstette
DeepMind
1 Tutorial Description

Learning representations to model the meaning of text has been a core problem in natural language processing (NLP). The last several years have seen extensive interest in distributional approaches, in which text spans of different granularities are encoded as continuous vectors. If properly learned, such representations have been shown to help achieve state-of-the-art performance on a variety of NLP problems.

In this tutorial, we will cover the fundamentals of, and selected research topics in, neural network-based modeling for semantic composition, which aims to learn distributed representations for larger spans of text, e.g., phrases (Yin and Schütze, 2014) and sentences (Zhu et al., 2016; Chen et al., 2016; Zhu et al., 2015a,b; Tai et al., 2015; Kalchbrenner et al., 2014; Irsoy and Cardie, 2014; Socher et al., 2012), from the meaning representations of their parts, e.g., word embeddings.

We begin by briefly introducing traditional approaches to semantic composition, including logic-based formal semantic approaches and simple arithmetic operations over vectors based on corpus word counts (Mitchell and Lapata, 2008; Landauer and Dumais, 1997).

Our main focus, however, will be on distributed representation-based modeling, whereby the representations of words and the operations composing them are jointly learned from a training objective. We cover the generic ideas behind neural network-based semantic composition and dive into the details of three typical composition architectures: convolutional composition models (Kalchbrenner et al., 2014; Zhang et al., 2015), recurrent composition models (Zhu et al., 2016), and recursive composition models (Irsoy and Cardie, 2014; Socher et al., 2012; Zhu et al., 2015b; Tai et al., 2015). After that, we will discuss several unsupervised approaches (Le and Mikolov, 2014; Kiros et al., 2014; Bowman et al., 2016; Miao et al., 2016).

We will then advance to several selected topics. We first cover models that combine compositional with non-compositional (e.g., holistically learned) semantics (Zhu et al., 2016, 2015a). Next, we discuss composition models that integrate multiple neural network architectures. We also discuss semantic composition and decomposition (Turney, 2014). Finally, we briefly discuss sub-word neural-network-based composition models (Zhang et al., 2015; Sennrich et al., 2016).

We will conclude by summarizing the tutorial, fleshing out the limitations of current approaches, and discussing future directions that are of interest to us.

2 Tutorial Outline

• Introduction
  ◦ Definition of semantic composition
  ◦ Conventional and basic approaches
    ▪ Formal semantics
    ▪ Bag of words with learned representations (additive, learned projection)
• Parametrising Composition Functions
  ◦ Convolutional composition models
  ◦ Recurrent composition models
  ◦ Recursive composition models
    ▪ TreeRNN/TreeLSTM
    ▪ SPINN and RL-SPINN
  ◦ Unsupervised models
    ▪ Skip-thought vectors and paragraph vectors
    ▪ Variational auto-encoders for text
• Selected Topics
  ◦ Incorporating compositional and non-compositional (e.g., holistically learned) semantics
  ◦ Integrating multiple composition architectures
  ◦ Semantic composition and decomposition
  ◦ Sub-word composition models
• Summary
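To make the contrast between the traditional arithmetic approaches and the learned composition functions above concrete, the following is a minimal sketch in pure Python. The toy vocabulary, three-dimensional embeddings, and hand-picked identity weights are invented purely for illustration; none of this comes from the cited papers, and real models learn all parameters jointly from a training objective.

```python
import math

# Toy 3-dimensional word embeddings, invented for illustration only.
embeddings = {
    "very":  [0.1, 0.3, -0.2],
    "good":  [0.8, -0.1, 0.4],
    "movie": [0.2, 0.5, 0.1],
}

def compose_additive(words):
    """Bag-of-words composition: average the word vectors.

    The result is order-insensitive, the main limitation of the
    simple arithmetic approaches.
    """
    dim = len(embeddings[words[0]])
    total = [0.0] * dim
    for w in words:
        for i, x in enumerate(embeddings[w]):
            total[i] += x
    return [x / len(words) for x in total]

def compose_recurrent(words, W_h, W_x, b):
    """A minimal Elman-style recurrent composition:

        h_t = tanh(W_h @ h_{t-1} + W_x @ x_t + b)

    The final hidden state is the phrase representation; it is
    order-sensitive. In practice W_h, W_x, and b are learned.
    """
    dim = len(b)
    h = [0.0] * dim
    for w in words:
        x = embeddings[w]
        h = [
            math.tanh(
                sum(W_h[i][j] * h[j] for j in range(dim))
                + sum(W_x[i][j] * x[j] for j in range(dim))
                + b[i]
            )
            for i in range(dim)
        ]
    return h

# Hand-picked (untrained) weights for the demo: identity matrices, zero bias.
I = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
zero = [0.0, 0.0, 0.0]

phrase = ["very", "good", "movie"]
additive_vec = compose_additive(phrase)
recurrent_vec = compose_recurrent(phrase, I, I, zero)
```

Reordering the words leaves `additive_vec` unchanged but changes `recurrent_vec`, which is the basic motivation for the parameterized, order-aware composition functions the tutorial focuses on.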
Xiaodan Zhu, Researcher, National Research Council Canada.

Xiaodan Zhu is a Research Officer at the National Research Council Canada. His research interests are in Natural Language Processing and Machine Learning. His recent work has focused on deep learning, semantic composition, sentiment analysis, and natural language inference. Xiaodan has taught a tutorial at EMNLP '14.

Edward Grefenstette, Senior Research Scientist, DeepMind.

Edward Grefenstette is a Senior Research Scientist at DeepMind. His research covers the intersection of Machine Learning, Computer Reasoning, and Natural Language Understanding. Recent publications cover the topics of neural computation, representation learning at the sentence level, recognising textual entailment, and machine reading.

References

Samuel R. Bowman, Jon Gauthier, Abhinav Rastogi, Raghav Gupta, Christopher D. Manning, and Christopher Potts. 2016. A fast unified model for parsing and sentence understanding. In ACL.

Qian Chen, Xiaodan Zhu, Zhenhua Ling, Si Wei, and Hui Jiang. 2016. Enhancing and combining sequential and tree LSTM for natural language inference. arXiv:1609.06038.

Ozan Irsoy and Claire Cardie. 2014. Deep recursive neural networks for compositionality in language. In NIPS.

Nal Kalchbrenner, Edward Grefenstette, and Phil Blunsom. 2014. A convolutional neural network for modelling sentences. In ACL.

Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard Zemel, Antonio Torralba, Raquel Urtasun, and Sanja Fidler. 2014. Skip-thought vectors. In NIPS.

Thomas K. Landauer and Susan T. Dumais. 1997. A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review.

Quoc Le and Tomas Mikolov. 2014. Distributed representations of sentences and documents. In ICML.

Yishu Miao, Lei Yu, and Phil Blunsom. 2016. Neural variational inference for text processing. In ICML.

Jeff Mitchell and Mirella Lapata. 2008. Vector-based models of semantic composition. In ACL.

Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016. Neural machine translation of rare words with subword units. In ACL.

Richard Socher, Brody Huval, Christopher D. Manning, and Andrew Y. Ng. 2012. Semantic compositionality through recursive matrix-vector spaces. In EMNLP.

Kai Sheng Tai, Richard Socher, and Christopher D. Manning. 2015. Improved semantic representations from tree-structured long short-term memory networks. In ACL.

Peter Turney. 2014. Semantic composition and decomposition: From recognition to generation. arXiv:1405.7908.

Wenpeng Yin and Hinrich Schütze. 2014. An exploration of embeddings for generalized phrases. In ACL Student Research Workshop.

Xiang Zhang, Junbo Zhao, and Yann LeCun. 2015. Character-level convolutional networks for text classification. In NIPS.

Xiaodan Zhu, Hongyu Guo, and Parinaz Sobhani. 2015a. Neural networks for integrating compositional and non-compositional sentiment in sentiment composition. In *SEM.

Xiaodan Zhu, Parinaz Sobhani, and Hongyu Guo. 2015b. Long short-term memory over recursive structures. In ICML.

Xiaodan Zhu, Parinaz Sobhani, and Hongyu Guo. 2016. DAG-structured long short-term memory for semantic compositionality. In NAACL.