
Information Sciences 266 (2014) 90–100

Contents lists available at ScienceDirect

Information Sciences
journal homepage: www.elsevier.com/locate/ins

Sentiment topic models for social emotion mining


Yanghui Rao a,*, Qing Li a, Xudong Mao a, Liu Wenyin b

a Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
b College of Computer Science and Technology, Shanghai University of Electric Power, Shanghai, China

Article info

Article history:
Received 12 June 2013
Received in revised form 7 November 2013
Accepted 30 December 2013
Available online 11 January 2014
Keywords:
Social emotion mining
Sentiment topic model
Social emotion classification
Social emotion lexicon

Abstract
The rapid development of social media services has facilitated the communication of opinions through online news, blogs, microblogs/tweets, instant messages, and so forth. This article concentrates on the mining of readers' emotions evoked by social media materials.
Compared to the classical sentiment analysis from the writer's perspective, sentiment analysis of readers is sometimes more meaningful in social media. We propose two sentiment topic models to associate latent topics with evoked emotions of readers. The first model, which is an extension of the existing Supervised Topic Model, first generates a set of topics from words, followed by sampling emotions from each topic. The second model generates topics from social emotions directly. Both models can be applied to social emotion classification and to generating social emotion lexicons. Evaluation on social emotion classification verifies the effectiveness of the proposed models. The generated social emotion lexicon samples further show that our models can discover meaningful latent topics exhibiting emotion focus.
© 2014 Elsevier Inc. All rights reserved.

1. Introduction
Measuring public opinion about social events, political movements, company strategies, marketing campaigns, and product preferences emerges as a challenging task [8]. Traditional methods would require polling a large number of people about their feelings, either through offline or online channels. With the paradigm shift in the usage of the Web from information consumption to information production and sharing (Web 2.0), numerous social media services have emerged. Online users can now conveniently express their opinions through news portals, forum discussions, reviews, messages, blogs and microblogs/tweets, and thus their willingness to engage in social interactions has increased tremendously. As an important medium for reporting events happening around the world, news portals have become increasingly popular for conveying positive or negative sentiments underlying an opinion, or for communicating an affective state. Many news portals provide an interactive emotion rating service on various channels, e.g., news.sina.com.cn, cul.chinanews.com, tech.huanqiu.com, and learning.sohu.com. In such news portals, each article is associated with ratings shared by users who have voted over a set of predefined emotion labels/tags. Fig. 1 presents an example of emotion labels and user ratings from a popular news portal in China (i.e., news.sina.com.cn), in which multiple emotion labels are voted on by 3064 users for a particular news article. Such emotional responses are known as social emotions, the amount of which has been increasing rapidly [1,2,34]. Social emotion mining has therefore attracted a large amount of attention from researchers in natural language processing and machine learning. Due to the important role played by social media in reflecting and shaping public opinion, the analysis of social emotions also brings benefits to the social sciences and humanities.
The existing approaches to social emotion mining can be classified into two categories: word-level [16,20,21,31] and topic-level [1,2] models. Earlier works on social emotion mining focused on exploiting the sentiment of individual words
Corresponding author. Tel.: +852 6525 8082.
E-mail address: raoroland@gmail.com (Y. Rao).
0020-0255/$ - see front matter © 2014 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.ins.2013.12.059


Fig. 1. An example of emotion labels and user ratings.

[16,31], by assuming that words are the foundation of annotating user sentiments on public events. For example, a number of relevant words are extracted from the news articles, and the word-level features are used to assign each article to an appropriate emotion category. However, the underlying assumption that words are the only essential feature in social emotion mining suffers from several problems. One is sentiment ambiguity [25], as the same word in different topics or contexts may reflect different emotions. Another problem is due to background noisy words [1]. In particular, the methods of constructing word-level emotion lexicons treat each word individually, so many relevant words associated with emotions are usually mixed with background noisy words which do not convey much affective meaning. More recently, topic-level models have been proposed to exploit the sentiment of topics. A topic means a real-world event, object, or abstract entity that indicates the subject or context of the sentiment, which is essential to topic-level social emotion mining [30]. The Emotion-Topic Model (ETM) [2] borrows the machinery of latent topic models like Latent Dirichlet Allocation (LDA) [6], thereby making it easier to distinguish different meanings of the same word. Unfortunately, it is more suited to emotion annotation of text from the writer's perspective, rather than to online users' emotional responses after reading specific news articles (i.e., the reader's perspective). Sentiment analysis from the reader's perspective has potential applications that differ from those of writer-sentiment analysis [18]. For example, the former can assist authors in foreseeing how their work will influence readers emotionally, and help users retrieve documents that contain both relevant contents and desired emotions if we integrate emotion scores and rankings into information retrieval.
In this article, we present two sentiment topic models called the Multi-label Supervised Topic Model (MSTM) and the Sentiment Latent Topic Model (SLTM), in order to overcome the drawbacks of the previous approaches. Both MSTM and SLTM can be applied to the tasks of (1) social emotion classification and (2) generating social emotion lexicons. For (1), social emotion classification allows us to infer the distribution of social emotions for an unseen document, and is often utilized to evaluate and compare the performance of models quantitatively [1,2]. For (2), social emotion lexicons help users identify entities (e.g., products, brands, cities and organizations), aspects, topics or public events that evoke certain social emotions. The main contributions of our work in this article include the following:
Firstly, both MSTM and SLTM allow us to distinguish between different affective senses of the same word, and to discover meaningful topics evoking social emotions, since the expression of sentiment is characterized by its topic rather than by individual words. Secondly, both models associate each topic with social emotions jointly, so they are capable of capturing the public feelings and attitudes towards an existing topic, and of predicting user reactions and emotions on a new topic. Finally, different from most existing writer-sentiment analysis approaches, our proposed models are designed from the reader's perspective. Capturing reader emotions is sometimes more meaningful in social media, especially since the detection of opinion spamming [22] has become a major issue for writer-sentiment analysis.
We evaluate the proposed models on online news articles gathered from one of the largest news portals in China, i.e., news.sina.com.cn. Experimental results show that both MSTM and SLTM can effectively discover meaningful latent topics from news articles. The models can also distinguish between topics with strong emotions and background topics. For social emotion classification, the SWAT [16,31], Emotion-Topic Model (ETM) and Emotion-Term (ET) model [1,2] are used as baselines. As will be shown, in terms of accuracy, SLTM outperforms ETM, SWAT and ET by 10.81, 12.96 and 74.91 percent, respectively. We also conduct a qualitative investigation on samples of the social emotion lexicon generated by our models. The results show that the generated lexicon can reflect not only explicit emotive words, but also implicit emotive words that potentially evoke social emotions.
The models proposed in this article offer many avenues for future expansion and application. Although only tested in the field of social emotion mining of news articles, our models could well be applied to other tasks involving other types of text or media. First, it would be interesting to use our sentiment topic models to explore ratings, tags and comments of music, photos and videos that convey viewers' opinions [29]. Second, since all words, latent topics/aspects and multi-labels of each document are modeled jointly, our proposed models could also be applied or extended to the tasks of aspect-level sentiment analysis [36], multi-label classification [3,17] and ranking [15,18,34].
The remainder of this article is organized as follows. Related work is given in Section 2. The two sentiment topic models for social emotion mining are presented in Section 3. The dataset, experimental results and discussions are presented in Section 4. Finally, we draw conclusions in Section 5.


2. Related work
In this section, we first review the related work on sentiment classification and analysis, and then summarize the existing works on social emotion mining. Sentiment classification and analysis mainly deal with reviews and messages via supervised or unsupervised algorithms, which paved the way for the newer research area of social emotion mining. A comparison of these two research areas is given in Table 1.
2.1. Sentiment classification and analysis
In previous studies, most sentiment analysis algorithms focus on the classification of emotions or opinions, with the corresponding data sources being reviews or messages. Das and Chen [9] utilized a classification algorithm to extract market emotions from stock message boards, based on which decisions like whether to buy or sell a stock can be made. However, the performance was heavily dependent on certain key words. For instance, in the stock market, the sentence "It is not a bear market" effectively means a bull market, because negation words such as "no" and "not" are much more important and serve to reverse the meaning. Pang et al. [23] applied three classification algorithms, Naive Bayes, Maximum Entropy and Support Vector Machines (SVM), to classify movie reviews into positive and negative ones. They reported that those algorithms do not perform as well on sentiment classification as on text classification. Xia et al. [35] made a comparative study of the effectiveness of ensemble techniques for sentiment classification. Three types of ensemble methods of feature sets and classification algorithms were evaluated on reviews of movies and products. In another early work [32], Turney applied an unsupervised learning algorithm to classify the emotional orientation of users' reviews (i.e., reviews of movies, travel destinations, automobiles and banks). His approach first calculated the mutual information between each phrase and the word "excellent", as well as the mutual information with the word "poor". Then, the difference of the two mutual information scores was used to classify each review as recommended or not.
Sentiment analysis by classification is effective and useful, but it cannot find which entities, aspects, topics or events people like and dislike [10]. Without knowing the targets of each emotional state, the resultant solution is also of limited use [22]. Besides, as the feelings, opinions, views or beliefs of authors/writers expressed in their reviews are personalized, the sentiment classification of reviews is sensitive to the domain of the training data [7,14,24].
2.2. Social emotion mining
The first line of research work on social emotion mining is the affective text task in SemEval-2007 [16,31], which aims to annotate news headlines with the evoked emotions of readers. A similar piece of research work has studied readers' emotional states evoked by news sentences [3,4]. However, due to the limited information in news headlines or sentences, it is usually intractable to annotate the emotions consistently even for human beings [16,25]. The Emotion-Term (ET) model [1,2] is a variant of Naive Bayes, designed to model the word-emotion associations by exploiting all words in the news content. The disadvantage of ET is that words used in different contexts can be quite different for expressing emotions [7,14,25], because words that bear sentiment ambiguity (i.e., the same word may be positive in one context but negative in another) or multiple emotions (i.e., different senses contained in the same word) are difficult to recognize and distinguish with word-level emotion lexicons [25].
In order to solve the above problem, topic models like Latent Dirichlet Allocation (LDA) [6] have been used to distinguish different senses of the same word. As a joint emotion-topic model for social emotion mining, the Emotion-Topic Model (ETM) [1,2] introduced an intermediate layer into LDA, in which a topic acts as an important component of an emotion. Informative and coherent topics are extracted and grouped under different emotions. The detailed generative process of an affective document is as follows. First, for each document, a distribution over ratings is generated from a Multinomial distribution. Second, for each word in the document, a single emotion label is sampled according to the above distribution. Third, a latent topic is generated from a Dirichlet distribution conditioned on the emotion, and finally a term is generated from the latent topic, which is modeled by another Multinomial distribution over words. Fig. 2 shows the graphical model of ETM, where shaded nodes are observed data, blank ones are latent (not observed), and arrows encode dependencies. Given a document d, the distribution rd,n is sampled from a Multinomial distribution parameterized by the set of ratings. Section 3.1 will explain all other notations in detail. The experimental results reported in [1,2] show that ETM outperforms SVM and several other methods for social emotion classification.
Table 1
Comparison of the related work.

Research areas                           Algorithms and models                              Data sources
Sentiment classification and analysis    Supervised: Naive Bayes, Maximum Entropy           Reviews and messages
                                         and Support Vector Machines;
                                         Unsupervised: Mutual Information
Social emotion mining                    Word-level: Emotion-Term Model;                    News headlines and articles
                                         Topic-level: Emotion-Topic Model


[Plate diagram: nodes vd, rd,n, zd,n and wd,n, the latter three inside a plate of size Nd per document d.]
Fig. 2. The graphical model of the Emotion-Topic Model (ETM).

From the aspect of the generative process, ETM is homologous to the Author-Topic Model (ATM) [28], Labeled LDA [26,27] and the Joint Sentiment/Topic Model (JSTM) [19]. ATM extends LDA by including authorship information, assuming that each author is associated with a Dirichlet distribution over topics, and each topic is associated with a Multinomial distribution over words. Labeled LDA incorporates supervision by constraining the model to use only those topics that correspond to a document's observed label set. JSTM, which exploits a similar generative process, shows good performance on sentiment analysis of movie reviews. Those models are mainly designed from the perspective of the authors/writers. For ATM, authors may have their preferred types of topics before writing. For JSTM, if a user has a positive impression of a movie, then the latent topics and observable words of the review written by the user may be determined, and the review also has a positive emotion property. In terms of readers' emotional responses when they are exposed to news articles, however, the distribution of emotion ratings does not determine, but rather is determined by, the latent topics and words. The reason is that readers' emotion ratings are triggered after they have read the content of a news article.

3. Sentiment topic models


In this section, we present two sentiment topic models from the readers' perspective, namely, the Multi-label Supervised Topic Model (MSTM) and the Sentiment Latent Topic Model (SLTM). The problem is first defined. Then, we present the two sentiment topic models in detail.
3.1. Problem definition
For the sake of convenience, we define the following notations for describing the sentiment topic models:
An online news collection {d1, d2, . . . , dD} consists of D documents with word tokens from a vocabulary containing W distinct terms, and a set of ratings by online users over a predefined list of emotion labels. The list of emotion labels is denoted by e = {e1, e2, . . . , eE}, and common instances of e are "joy", "anger", "fear", "surprise", "touching", "empathy", "boredom", "sadness", "warmness", etc. In particular, a document d is a sequence of Nd word tokens w1, w2, . . . , wNd, and a set of ratings over E emotion labels denoted by vd = {vd,e1, vd,e2, . . . , vd,eE}. The value of vd,ek is the number of online users who have voted the kth emotion label ek for document d. Similar to the sequence of word tokens, the set of ratings for document d can be represented by a sequence of Md emotion labels e1, e2, . . . , eMd, where em is the mth emotion label in the sequence and em ∈ e. The number of times ek appears in the sequence of Md emotion labels is vd,ek, i.e., vd,ek = |{ei | ei = ek, i = 1, 2, . . . , Md}|.
For example, assume that there is a collection of two documents d1 and d2, and the four predefined emotion labels are "joy", "anger", "fear" and "surprise". Thus, D and E equal 2 and 4 respectively, and the list of emotion labels is e = {joy, anger, fear, surprise}. Each document is a sequence of word tokens and has been voted on by the following numbers of readers over the four emotion labels: vd1 = {3, 0, 0, 1}, vd2 = {0, 3, 2, 0}. Accordingly, document d1 is voted on by 4 readers in total, among which 3 readers voted "joy" and 1 reader voted "surprise". The second document (d2) is voted on by 5 readers in total, among which 3 readers voted "anger" and 2 readers voted "fear". The task of social emotion mining needs to model the number of readers who have voted on each document over these four emotions jointly, and we can treat each count of an emotion rating akin to the occurrence of a word. Then, each document can be represented by a sequence of both word tokens and emotion labels. From the aspect of emotion labels, since the emotion label "joy" (i.e., e1) is voted 3 times and "surprise" (i.e., e4) is voted 1 time, document d1 can be represented by the sequence of emotion labels (joy, joy, joy, surprise). For this sequence, e1, e2 and e3 are "joy", and e4 is "surprise". Similarly, document d2 is represented by the sequence of emotion labels (anger, anger, anger, fear, fear), where e1, e2 and e3 are "anger", while e4 and e5 are "fear",
respectively.
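The expansion of rating vectors into label sequences described above can be sketched as follows (a minimal illustration; the function name and label strings are our own, not from the paper):

```python
def ratings_to_sequence(ratings, labels):
    """Expand a per-document rating vector into a sequence of emotion
    labels, repeating each label once per vote it received."""
    seq = []
    for label, count in zip(labels, ratings):
        seq.extend([label] * count)
    return seq

labels = ["joy", "anger", "fear", "surprise"]

# Document d1 was voted {3, 0, 0, 1}: three "joy" votes and one "surprise".
print(ratings_to_sequence([3, 0, 0, 1], labels))
# -> ['joy', 'joy', 'joy', 'surprise']
# Document d2 was voted {0, 3, 2, 0}: three "anger" votes and two "fear".
print(ratings_to_sequence([0, 3, 2, 0], labels))
# -> ['anger', 'anger', 'anger', 'fear', 'fear']
```

Treating each vote as one "occurrence" of an emotion label is what lets both models below reuse the standard topic-model machinery for the rating side of a document.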
The whole collection of online news is modeled by K latent topics, and the notations θd, φ and δ are used to represent the document-topic, topic-word and topic-emotion distributions, respectively. The hyperparameters are denoted by α, β and γ. In particular, the hyperparameter α is a Dirichlet prior of θ, which can be interpreted as the prior observation count for the number of times each topic was sampled from a document before any word is observed. The hyperparameter β is a Dirichlet prior of φ, which can be treated as the prior observation count for the number of times words were sampled from a topic before any actual word has been observed [19]. The hyperparameter γ is a Dirichlet prior of δ, which can be interpreted as the prior observation count for the number of times emotion ratings were sampled from a topic before any emotion rating is observed. Table 2 summarizes the notations of these frequently-used variables.

Table 2
Notations of frequently-used variables.

Symbol    Description
D         Number of documents
Nd        Number of word tokens in document d
wd,n      The nth word token in document d
zd,n      The topic assigned to word token wd,n
Md        Number of emotion labels in document d
ed,m      The mth emotion label in document d
zd,m      The topic assigned to emotion label ed,m
vd        The set of ratings for document d
W         Number of distinct terms
E         Number of predefined emotion labels
ek        The kth emotion label
K         Number of topics
θd        The multinomial distribution of topics specific to document d
φ         The multinomial distribution of words specific to topics
δ         The multinomial distribution of emotions specific to topics
α         Dirichlet prior of θ
β         Dirichlet prior of φ
γ         Dirichlet prior of δ
We describe in the next two subsections the Multi-label Supervised Topic Model (MSTM) and Sentiment Latent Topic
Model (SLTM), respectively. As a complete generative model, both MSTM and SLTM allow us to associate each topic with
words and emotions jointly, and to infer the probability of emotions conditioned on future unlabeled documents which only
contain words.
3.2. Multi-label supervised topic model
In this subsection, we first briefly introduce the Supervised Topic Model (STM) [5], and then present our Multi-label STM which extends STM for social emotion mining. STM is a statistical model of labeled documents. The label might be the category of a document, the number of stars given to a movie, or the number of readers who marked an article "interesting". Sentiment analysis of movie reviews is usually conducted to evaluate STM, in which the document (i.e., the movie review) and its label (i.e., the number of stars given) are modeled jointly. The latent topics that best fit the observed data are generated, and can be used to predict the label for future unlabeled documents. Two similar pieces of research work are the Topic-Over-Time (TOT) model [33] and the User-Question-Answer (UQA) model [12]. Nevertheless, those models cannot be applied to social emotion mining directly, because social emotion mining requires us to model multiple kinds of labels rather than only one type of label.
The Multi-label Supervised Topic Model (MSTM) is proposed to jointly model multiple labels, e.g., reader ratings over multiple emotions. For each document d, the word tokens are first generated as follows:
1. Choose θd ∼ Dir(α).
2. For each of the Nd word tokens wd,n:
(a) Choose a topic zd,n ∼ Multinomial(θd).
(b) Choose a word token wd,n from p(wd,n|zd,n, β).
After generating D documents by the process described above, the posterior distribution for θd is further used to generate the emotion labels as follows:
1. Choose a topic zd,m from θd.
2. Choose an emotion label ed,m from p(ed,m|zd,m, γ).
The graphical model of MSTM is presented in Fig. 3, where the word tokens are generated first, and then the emotion labels are generated conditionally.
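The two-stage generative story can be simulated as in the following sketch. All dimensions and names are illustrative; note also that in MSTM the emotion stage uses the Gibbs-sampled posterior of θd, whereas this simplified forward simulation reuses the same prior draw of θ for both stages.

```python
import numpy as np

rng = np.random.default_rng(0)

K, W, E = 5, 1000, 8          # topics, vocabulary size, emotion labels
alpha, beta, gamma = 0.1, 0.1, 0.1

# Topic-word and topic-emotion distributions (phi and delta in the paper),
# drawn here from their Dirichlet priors for illustration.
phi = rng.dirichlet(beta * np.ones(W), size=K)
delta = rng.dirichlet(gamma * np.ones(E), size=K)

def generate_document(n_words, n_votes):
    # 1. Draw document-topic proportions theta_d ~ Dir(alpha).
    theta = rng.dirichlet(alpha * np.ones(K))
    # 2. For each word token: draw a topic, then a word from that topic.
    z_words = rng.choice(K, size=n_words, p=theta)
    words = [rng.choice(W, p=phi[z]) for z in z_words]
    # 3. For each emotion vote: draw a topic from the same document-topic
    #    proportions, then an emotion label from that topic.
    z_votes = rng.choice(K, size=n_votes, p=theta)
    emotions = [rng.choice(E, p=delta[z]) for z in z_votes]
    return words, emotions

words, emotions = generate_document(n_words=50, n_votes=10)
```

The key design point this makes concrete is that words and emotion votes share one θd, which is what ties latent topics to social emotions.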
In MSTM, the generative process of each word token is the same as in LDA. Since it is intractable to perform exact inference, we use an approximate inference method based on Gibbs sampling to estimate the parameters. For each word token wn, the conditional posterior distribution P(zn = z | z¬n, w, α, β) can be derived as below:

$$P(z_n = z \mid z_{\neg n}, w, \alpha, \beta) \propto (c_{d_n,z}^{\neg n} + \alpha) \cdot \frac{c_{z,w_n}^{\neg n} + \beta}{\sum_{w'} c_{z,w'}^{\neg n} + W\beta}, \quad (1)$$


[Plate diagram: word nodes zd,n and wd,n in a plate of size Nd; emotion nodes zd,m and ed,m in a plate of size Md, per document d.]
Fig. 3. The graphical model of Multi-label Supervised Topic Model (MSTM).

where zn is the candidate topic that wn is assigned to, z¬n refers to the topic assignments of all other word tokens, dn indicates the document from which word token wn is sampled, cd,z is the number of word tokens in document d assigned to topic z, and cz,w is the number of instances of word token w that have been assigned to topic z. The superscript ¬n denotes counts that exclude the current assignment of word token wn.
Given the sampled topics, the posterior distribution for θd is estimated by the following:

$$\theta_{d,z} = (c_{d,z} + \alpha) \Big/ \Big( \sum_{z'} c_{d,z'} + K\alpha \Big). \quad (2)$$
After generating all word tokens, we use the posterior distribution for θd to sample the topics of emotion labels so as to ensure the one-to-one thematic mapping between words and emotions. For each emotion label em, the conditional posterior distribution P(zm = z | z¬m, e, θd, γ) can be derived as follows:

$$P(z_m = z \mid z_{\neg m}, e, \theta_d, \gamma) \propto \theta_{d,z} \cdot \frac{c_{z,e_m}^{\neg m} + \gamma}{\sum_{e'} c_{z,e'}^{\neg m} + E\gamma}, \quad (3)$$

where cz,e is the number of instances of emotion label e that have been assigned to topic z. The superscript ¬m denotes counts that exclude the current assignment of emotion label em.
After sampling according to Eqs. (1) and (3), it is convenient to estimate the probability of word token w conditioned on topic z, and the probability of emotion label e conditioned on topic z, by the following:

$$\varphi_{z,w} = (c_{z,w} + \beta) \Big/ \Big( \sum_{w'} c_{z,w'} + W\beta \Big), \quad (4)$$

$$\delta_{z,e} = (c_{z,e} + \gamma) \Big/ \Big( \sum_{e'} c_{z,e'} + E\gamma \Big). \quad (5)$$

We can further infer the emotions conditioned on an unlabeled document d̃ as follows:

$$P(e \mid \tilde{d}) = \sum_{z'} \delta_{z',e} \, \theta_{\tilde{d},z'}, \quad (6)$$

where the topic distribution θd̃,z of d̃ can be estimated according to Eq. (2) once we have a sample from its word-topic assignment zd̃. Sampling zd̃ can be performed with a similar method as in Eq. (1), but now only for each word token wi in d̃, i.e.,

$$P(z_i = z \mid z_{\neg i}, w, \alpha, \beta) \propto (c_{\tilde{d},z}^{\neg i} + \alpha) \cdot \frac{\hat{c}_{z,w_i}^{\neg i} + \beta}{\sum_{w'} \hat{c}_{z,w'}^{\neg i} + W\beta}. \quad (7)$$

The notation ĉz,w refers to the total number of instances of word token w that have been assigned to topic z in the union of the D training documents and the new document d̃.
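The collapsed Gibbs update for word-topic assignments, Eq. (1) style, can be sketched as below. This is a simplified single-sweep implementation covering only the word stage; variable names are ours, and hyperparameter handling is the standard symmetric-prior form.

```python
import numpy as np

def gibbs_sweep(docs, z, c_dz, c_zw, alpha, beta, rng):
    """One Gibbs sweep over all word tokens.

    docs:  list of documents, each a list of word ids
    z:     current topic assignments, z[d][n] for each token
    c_dz:  (D, K) counts of tokens per document and topic
    c_zw:  (K, W) counts of word types per topic
    """
    K, W = c_zw.shape
    for d, doc in enumerate(docs):
        for n, w in enumerate(doc):
            k_old = z[d][n]
            # Remove the current assignment from the counts (the "not-n" counts).
            c_dz[d, k_old] -= 1
            c_zw[k_old, w] -= 1
            # P(z_n = z | ...) proportional to
            # (c_dz + alpha) * (c_zw + beta) / (sum_w' c_zw' + W*beta)
            p = (c_dz[d] + alpha) * (c_zw[:, w] + beta) \
                / (c_zw.sum(axis=1) + W * beta)
            k_new = rng.choice(K, p=p / p.sum())
            # Add the new assignment back into the counts.
            c_dz[d, k_new] += 1
            c_zw[k_new, w] += 1
            z[d][n] = k_new
    return z

# After enough sweeps, theta_d is estimated Eq. (2)-style as
# (c_dz[d] + alpha) / (c_dz[d].sum() + K * alpha).
```

The emotion-label update of Eq. (3) follows the same remove/resample/add pattern, with θd entering the proposal directly instead of the document-topic counts.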
3.3. Sentiment latent topic model
In this subsection, we proceed to propose another model named the Sentiment Latent Topic Model (SLTM) for social emotion mining. Like MSTM, SLTM can also associate each topic with words and emotions jointly, and infer the emotion distribution of future unlabeled documents. For each document d, the emotion labels are first generated as follows:
1. Choose θd ∼ Dir(α).
2. For each of the Md emotion labels ed,m:
(a) Choose a topic zd,m ∼ Multinomial(θd).
(b) Choose an emotion label ed,m from p(ed,m|zd,m, γ).


[Plate diagram: emotion nodes zd,m and ed,m in a plate of size Md; word nodes zd,n and wd,n in a plate of size Nd, per document d.]
Fig. 4. The graphical model of Sentiment Latent Topic Model (SLTM).

After generating D documents by the above process, the posterior distribution for θd is further used to generate the word tokens, as follows:
1. Choose a topic zd,n from θd.
2. Choose a word token wd,n from p(wd,n|zd,n, β).
The graphical model of SLTM is shown in Fig. 4, where the emotion labels are generated first, and then the word tokens are generated conditionally.
Similar to MSTM, we also use the approximate inference method based on Gibbs sampling to estimate the parameters of SLTM, as it is intractable to perform exact inference. For each emotion label em, the conditional posterior distribution P(zm = z | z¬m, e, α, γ) is derived as follows:

$$P(z_m = z \mid z_{\neg m}, e, \alpha, \gamma) \propto (c_{d_m,z}^{\neg m} + \alpha) \cdot \frac{c_{z,e_m}^{\neg m} + \gamma}{\sum_{e'} c_{z,e'}^{\neg m} + E\gamma}. \quad (8)$$

Given the sampled topics, the posterior distribution for θd can be estimated as in Eq. (2), but here the notation cd,z refers to the number of emotion labels in document d assigned to topic z.
To ensure the one-to-one thematic mapping between emotions and words, the posterior distribution for θd after generating all emotion labels is used to sample the topics of word tokens. For each word token wn, the conditional posterior distribution P(zn = z | z¬n, w, θd, β) is derived as follows:

$$P(z_n = z \mid z_{\neg n}, w, \theta_d, \beta) \propto \theta_{d,z} \cdot \frac{c_{z,w_n}^{\neg n} + \beta}{\sum_{w'} c_{z,w'}^{\neg n} + W\beta}. \quad (9)$$

After sampling according to Eqs. (8) and (9), it is convenient to estimate the probability of emotion label e conditioned on topic z, and the probability of word token w conditioned on topic z, by Eqs. (5) and (4), respectively. We can also infer the emotions conditioned on unlabeled documents as in Eq. (6). The resultant samples of SLTM are as follows. First, the emotion ranking of each topic z is achieved by sampling the emotion labels from the latent topics (e.g., z is represented by an emotion ranking list of "touching, empathy, sadness"); second, the word ranking of the same topic z is obtained by sampling the word tokens (e.g., z is also represented by a word ranking list of "rescue, touched, help, hospital, disease"). Those sampling results can be used to annotate the emotions of future unlabeled documents.
The time complexity of both MSTM and SLTM is O(VK + ZK) for each iteration of sampling, where K is the number of topics, V = Σd′ Nd′ is the total number of word tokens, and Z = Σd′ Md′ is the total number of emotion labels.
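For both models, annotating an unseen document ultimately reduces to the Eq.-(6)-style mixture of topic-emotion distributions weighted by the inferred topic proportions. With estimated quantities in hand, that final step is a matrix-vector product; the values below are illustrative, standing in for estimates obtained by Gibbs sampling.

```python
import numpy as np

# delta[z, e] = P(emotion e | topic z); theta[z] = inferred topic mixture
# of the unseen document. Both are illustrative placeholder values.
delta = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.6, 0.3]])    # 2 topics, 3 emotion labels
theta = np.array([0.25, 0.75])

# P(e | d~) = sum_z delta[z, e] * theta[z]
p_emotion = theta @ delta
top_label = int(np.argmax(p_emotion))  # top-ranked predicted emotion
```

Because each row of delta and theta sums to one, p_emotion is itself a proper distribution over the emotion labels.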
4. Experiments
In this section, we first evaluate the performance of the two proposed models on social emotion classification quantitatively; the experiments are designed to achieve the following two goals: (i) to analyze the influence of the number of topics, and (ii) to study the effect of the hyperparameters. We then conduct a qualitative investigation on samples of the social emotion lexicon generated by the proposed models.
4.1. Experiment design
To test the effectiveness of the proposed models, we have collected 4570 news articles from the Society channel of Sina.1 The attributes of each article include the URL address, publishing date (from January to April of 2012), news title, content, and user ratings over 8 emotion labels: touching, empathy, boredom, anger, amusement, sadness, surprise and warmness. To ensure that the user ratings have stabilized, the dataset was crawled from half a year after the publishing dates of the
1 While the proposed models are language independent, we use this Chinese dataset for evaluation since there are no similar services in English yet [1,2]. The dataset is publicly available at: http://www.hkws.org/public-sources/newsdata.zip.



Table 3
Statistics of the dataset.

Emotion label    # Of articles    # Of ratings
Touching         749              41,796
Empathy          225              23,230
Boredom          273              21,995
Anger            2048             138,167
Amusement        715              43,712
Sadness          355              37,162
Surprise         167              11,386
Warmness         38               7986
news articles. Table 3 summarizes the statistics of the dataset for each emotion label. The number of articles for each emotion label represents the number of articles having the most ratings over that emotion. For example, there are 2048 news articles having the most user ratings over the emotion label of anger, with the total number of ratings for that emotion being 138,167.
As a preprocessing step, the above dataset has gone through the following two measures:
Firstly, the news headline and content are integrated into a single news document. This process is reasonable for social emotion mining, since readers' emotions can be evoked by both the title and the content of each news article.
Secondly, a Chinese lexical analysis system (ICTCLAS) is used to perform the Chinese word segmentation. ICTCLAS is an integrated Chinese lexical analysis system based on a multi-layer HMM.
As a result, the dataset comprises 1,975,153 word tokens and 325,434 emotion ratings. For each document, the number of word tokens ranges from 6 to 3042, and the number of emotion ratings ranges from 1 to 3064.
We split the dataset into a training set and a testing set, and evaluate the performance by 5-fold cross-validation [5]. The
existing SWAT [16,31], Emotion-Term (ET) and Emotion-Topic Model (ETM) [1,2], and the proposed Multi-label Supervised
Topic Model (MSTM) and Sentiment Latent Topic Model (SLTM) are implemented for comparison. The accuracy is employed
as the indicator of performance; it is considered the most important metric among the Acc@k (i.e., accuracy at top k) measures [1,2]. Given an unlabeled document d, the top-ranked predicted emotion label ep, and the ground-truth emotion set Etopk@d containing the k top-ranked emotion labels, the emotion distribution of document d is predicted correctly if ep ∈ Etopk@d. Acc@k is then computed by dividing the number of correctly predicted documents by the total number of documents. Accuracy, i.e., Acc@1, is defined as the percentage of documents whose top-ranked emotion label is correctly predicted.
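To make the metric concrete, the following sketch computes Acc@k from predicted top labels and gold rating counts; the labels and rating numbers below are illustrative, not taken from the evaluated dataset:

```python
def acc_at_k(top_predictions, gold_ratings, k=1):
    """Fraction of documents whose top predicted emotion label falls
    within the k emotions that received the most reader ratings."""
    correct = 0
    for top_pred, ratings in zip(top_predictions, gold_ratings):
        # Emotions ranked by their gold rating counts, descending.
        top_k_gold = sorted(ratings, key=ratings.get, reverse=True)[:k]
        if top_pred in top_k_gold:
            correct += 1
    return correct / len(top_predictions)

preds = ["anger", "warmness", "boredom"]
gold = [{"anger": 10, "sadness": 3},
        {"touching": 5, "warmness": 4},
        {"surprise": 2, "boredom": 7}]
print(acc_at_k(preds, gold, k=1))  # 2 of 3 top labels match -> 0.666...
print(acc_at_k(preds, gold, k=2))  # all 3 fall within the top-2 set -> 1.0
```

Note that Acc@1 is the strictest case: the predicted label must coincide with the single most-rated emotion.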
4.2. Influence of topic number
The number of topics K indicates how many latent aspects of articles can be derived, which may influence the performance of the baseline ETM and the proposed MSTM and SLTM. Note that the performance of the baselines SWAT and ET does not change with different topic numbers since they do not consider any latent topics. To evaluate the influence of K, we vary
it from 2 to 100. Similar to previous works [1,2,11], the hyperparameters α and β are set to symmetric Dirichlet priors with values of 50/K and 0.1, respectively. The extra hyperparameter of MSTM and SLTM, i.e., γ, is set to be the same as β. Fig. 5 presents the performance of all five models with different K values.
The experimental results show that the performance of ETM, MSTM and SLTM converges within a relatively small number of topics. The proposed MSTM outperforms the baseline ETM when the number of topics K is less than 8, while performing worse than ETM for larger K. The proposed SLTM always outperforms the baseline ETM when K is less than 40; for larger K, SLTM and ETM yield comparable performance.
Table 4 shows the mean and variance of accuracy for the different models. Compared with the baselines ETM, SWAT and ET, the average accuracy of SLTM improves by 10.81%, 12.96% and 74.91%, respectively; that of MSTM improves by 0.44%, 2.39% and 58.54%, respectively. The variance of accuracy for SLTM and MSTM is much smaller than that of ETM, meaning that both SLTM and MSTM are more stable than ETM with respect to the number of topics.
To evaluate the differences between these models statistically, we also perform two kinds of statistical tests on paired models. The first evaluates the stability of performance in terms of variances, and the second evaluates the average performance in terms of means. The p-values are estimated for both kinds of tests. The conventional significance level of 0.05 is adopted, i.e., the null hypothesis can be rejected with 95% confidence; the difference between paired models is statistically significant if the resulting p-value is lower than 0.05.
Firstly, the analysis of variance in terms of the F-test is employed to test the underlying assumption of homoscedasticity (i.e., the null hypothesis implies homogeneity of variance). As SWAT and ET do not exploit latent topics, their performance is independent of K, so the F-tests are conducted only on SLTM, MSTM and ETM. The results show that the difference between SLTM and MSTM is not statistically significant in terms of variance (p-value 0.47), while the variance of ETM is significantly different from those of SLTM and MSTM (p-values 5.5E−10 and 4.1E−10, respectively). This indicates that both SLTM and MSTM are significantly more stable than ETM.
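The F statistic underlying this homoscedasticity test is simply the ratio of the two sample variances; a minimal stdlib-only sketch follows (obtaining the p-value itself requires the CDF of the F distribution with (n1−1, n2−1) degrees of freedom, e.g. scipy.stats.f, which we omit here):

```python
import statistics

def f_statistic(sample_a, sample_b):
    """Two-sample F statistic for testing equality of variances:
    the larger sample variance divided by the smaller one."""
    va = statistics.variance(sample_a)
    vb = statistics.variance(sample_b)
    return max(va, vb) / min(va, vb)

# Using the variances reported in Table 4 directly:
F = max(0.0051, 0.0001) / min(0.0051, 0.0001)
print(F)  # 51.0: ETM's accuracy varies far more across K than SLTM's
```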
Fig. 5. The performance with different topic numbers.


Table 4
Statistics of different models.

Models    Mean accuracy (%)    Variance
SLTM      51.86                0.0001
MSTM      47.01                0.0001
ETM       46.80                0.0051
SWAT      45.91                –
ET        29.65                –

Secondly, t-tests are conducted to test the underlying assumption that the difference in performance between paired models has a mean value of zero (i.e., the null hypothesis implies identical performance). The results show that the proposed SLTM outperforms the baselines ETM, SWAT and ET significantly, with p-values of 0.0042, 6.7E−14 and 1.7E−23, respectively. Furthermore, the proposed MSTM performs significantly better than the baselines SWAT and ET, with p-values of 0.0005 and 8.0E−22, respectively, while the difference between MSTM and the baseline ETM is not statistically significant (p-value 0.45).
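The paired t statistic is the mean of the per-setting accuracy differences divided by its standard error; a stdlib-only sketch with illustrative accuracy numbers (the p-value again requires the t distribution, e.g. scipy.stats.ttest_rel):

```python
import math
import statistics

def paired_t(sample_a, sample_b):
    """Paired t statistic: mean of the pairwise differences divided by
    their standard error. The null hypothesis is a zero mean difference."""
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

# Hypothetical accuracies of two models over five settings of K:
model_a = [0.52, 0.50, 0.53, 0.51, 0.52]
model_b = [0.47, 0.47, 0.48, 0.46, 0.47]
print(paired_t(model_a, model_b))  # ~11.5: model_a consistently wins
```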
4.3. Effect of hyperparameters
Hyperparameters generally have a smoothing effect on multinomial parameters. There are three hyperparameters (i.e., α, β and γ) in the proposed MSTM and SLTM. The estimate for hyperparameter α indicates how different the documents are in terms of their latent semantics, and the estimate for hyperparameter β suggests how large the groups of commonly co-occurring words are [13]. According to [1,2,11], good model quality is obtained for α equal to 50/K and β equal to 0.1 across various experiments. The hyperparameter γ serves as the prior observation count for the number of times emotions are sampled from a topic before any emotion rating is observed, so the value of γ reflects how large the groups of commonly co-occurring emotion ratings are. To evaluate the effect of γ, we first set the number of topics K to be the same as the number of predefined emotion labels (or categories, i.e., E), and set the hyperparameters α and β to symmetric Dirichlet priors with values of 50/K and 0.1, respectively. Then, we vary the value of γ from 0.01 to 2. The performance of MSTM and SLTM with different values of γ is plotted in Fig. 6.
The experimental results show that the value of γ affects MSTM and SLTM differently. Firstly, the performance of MSTM with different values of γ is more stable than that of SLTM: the variances of accuracy for MSTM and SLTM are 5.5E−6 and 1.6E−5, respectively, and the difference is statistically significant in terms of the F-test (p-value 0.003). However, both variance values are smaller than the variance of MSTM or SLTM under different topic numbers (i.e., 0.0001), which indicates that MSTM and SLTM are more stable with respect to γ than with respect to the number of topics. Secondly, the performance of SLTM and MSTM is not the best, but good enough, when γ is set equal to β (i.e., 0.1); in particular, with γ equal to β, SLTM and MSTM rank top-4 and top-5, respectively.
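The pseudo-count role of γ can be illustrated on the emotion rating counts of a single topic (the counts below are illustrative, with E = 8 emotions as in the dataset): a larger γ pulls the smoothed multinomial estimate toward the uniform distribution.

```python
def smoothed_emotions(rating_counts, gamma):
    """Smoothed estimate p(e|z) = (n_e + gamma) / (n + E * gamma),
    where gamma acts as a prior observation count for every emotion."""
    total = sum(rating_counts) + gamma * len(rating_counts)
    return [(c + gamma) / total for c in rating_counts]

counts = [120, 30, 0, 0, 0, 0, 0, 0]  # ratings for one topic over E = 8 emotions
low = smoothed_emotions(counts, 0.01)
high = smoothed_emotions(counts, 2.0)
print(round(low[0], 3), round(high[0], 3))  # 0.8 0.735: heavier smoothing
```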
4.4. Social emotion lexicon samples
As stated earlier, both MSTM and SLTM are proposed to jointly model latent topics and social emotions from the readers' perspective. Fig. 7 shows representative samples of the social emotion lexicon generated by SLTM, in which the number of topics K is set to be the same as the number of emotion labels (i.e., E), and the hyperparameters α, β and γ are set to symmetric Dirichlet priors with values of 50/K, 0.1 and 0.1, respectively. The probability of social emotions conditioned on each topic is estimated by Eq. (5), and is shown in parentheses in Fig. 7. The representative words of each topic are selected according to Eq. (4). All words have been translated from Chinese to English. Since word tokens are sampled
from the posterior topic distributions drawn by emotion labels in SLTM, there is a one-to-one mapping between words and emotion labels for each topic. Although the baseline ETM can also associate words and emotion labels with topics, it estimates the probabilities of topics conditioned on social emotions, and therefore cannot generate social emotion lexicons like the ones shown in Fig. 7.

Fig. 6. The performance with different values of hyperparameter γ.

Fig. 7. Samples of social emotion lexicon. Recovered figure content:
Topic 7: investigate, crime, legal, case, rob, condemn, court. Emotions: anger (1.0)
Topic 1: the old, life, look after, help, reside, mother, son, father, parents, daughter. Emotions: touching (0.94), warmness (0.06)
Topic 2: die. Emotions: sadness (0.66), empathy (0.34)
Topic 6: most, high, report, long, meter, safe. Emotions: surprise (0.54), warmness (0.27), empathy (0.19)
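A lexicon entry like those in Fig. 7 can be read off the Gibbs count matrices; the sketch below uses smoothed estimates in the spirit of Eqs. (4) and (5) with β = γ = 0.1, on toy counts and names chosen for illustration, and is not the authors' implementation:

```python
def lexicon_entry(word_counts, emo_counts, vocab, emotions,
                  beta=0.1, gamma=0.1, top_n=3):
    """Top words of a topic ranked by p(w|z) = (n_w + beta) / (n + V*beta),
    paired with its emotion distribution p(e|z) = (n_e + gamma) / (m + E*gamma)."""
    n, m = sum(word_counts), sum(emo_counts)
    p_w = [(c + beta) / (n + beta * len(vocab)) for c in word_counts]
    p_e = [(c + gamma) / (m + gamma * len(emotions)) for c in emo_counts]
    top_words = [w for _, w in sorted(zip(p_w, vocab), reverse=True)[:top_n]]
    emo_dist = sorted(zip(emotions, p_e), key=lambda x: -x[1])
    return top_words, emo_dist

vocab = ["crime", "court", "help", "mother"]
emotions = ["anger", "touching"]
words, emos = lexicon_entry([9, 6, 0, 1], [20, 1], vocab, emotions)
print(words)  # ['crime', 'court', 'mother']
print(emos)   # anger dominates with probability ~0.95
```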
The samples indicate that the proposed SLTM can discover meaningful latent topics which trigger social emotions, where both explicit and implicit emotive words are identified. For instance, the words "look after" and "help" are explicit emotive words that evoke the reader emotions of touching and warmness, while the words "crime" and "rob" are implicit emotive words that trigger the emotion of anger. Our model can also distinguish topics evoking strong emotions from background topics. For example, topics 7 and 1 mainly trigger the emotions of anger and touching, respectively. Topic 2 also has an emotion focus, because the emotions of sadness and empathy are highly correlated. As an exception, topic 6 is a topic with merged social emotions: many news articles reporting world wonders evoke the emotion of surprise, while other articles on safety trigger the emotions of warmness and empathy.
5. Conclusion
Social emotion mining has many potential applications, including emotion-based document retrieval [34] and emotion classification of online news articles [1,2,31]. It can also help in understanding the preferences and perspectives of online users, thereby enabling news portals to provide more relevant and personalized services. In this article, we have proposed two reader-oriented sentiment topic models, i.e., the Multi-label Supervised Topic Model (MSTM) and the Sentiment Latent Topic Model (SLTM), to connect latent topics with the evoked emotions of readers. Empirical experiments have been conducted to evaluate the effectiveness of the proposed models. The main conclusions of this research are as follows:
(1) The performance of the proposed SLTM and MSTM is more stable than that of the baseline ETM under varied topic numbers. In terms of average accuracy, SLTM and MSTM outperform the baselines ETM, SWAT and ET.
(2) The performance of MSTM and SLTM with different hyperparameter values is more stable than that with different
topic numbers.
(3) The generated social emotion lexicon samples indicate that the proposed models can discover meaningful latent topics with strong social emotions, where both explicit and implicit emotive words are identified.

For our future work, we plan to evaluate MSTM and SLTM on a larger-scale collection of online news articles; we also plan to extend and apply our models to other applications, such as emotion-aware recommendation of news articles or multimedia resources like audio and video.
Acknowledgments
We are most thankful to the anonymous reviewers for their constructive comments and suggestions on an earlier
version of this paper. The work described in this paper has been supported substantially by a Strategic Research Grant from
City University of Hong Kong (Project Number: 7002912).
References
[1] S. Bao, S. Xu, L. Zhang, R. Yan, Z. Su, D. Han, Y. Yu, Joint emotion-topic modeling for social affective text mining, in: Proc. 9th IEEE International Conference on Data Mining (ICDM), 2009, pp. 699–704.
[2] S. Bao, S. Xu, L. Zhang, R. Yan, Z. Su, D. Han, Y. Yu, Mining social emotions from affective text, IEEE Trans. Knowl. Data Eng. 24 (2012) 1658–1670.
[3] P.K. Bhowmick, A. Basu, P. Mitra, Reader perspective emotion analysis in text through ensemble based multi-label classification framework, Comput. Inf. Sci. 2 (2009) 64–74.
[4] P.K. Bhowmick, A. Basu, P. Mitra, A. Prasad, Sentence level news emotion analysis in fuzzy multi-label classification framework, Res. Comput. Sci. 46 (2010) 143–154.
[5] D.M. Blei, J.D. McAuliffe, Supervised topic models, in: Proc. Advances in Neural Information Processing Systems (NIPS), 2007.
[6] D.M. Blei, A.Y. Ng, M.I. Jordan, Latent Dirichlet allocation, J. Mach. Learn. Res. 3 (2003) 993–1022.
[7] D. Bollegala, D. Weir, J. Carroll, Using multiple sources to construct a sentiment sensitive thesaurus for cross-domain sentiment classification, in: Proc. 49th Annual Meeting of the Association for Computational Linguistics (ACL), 2011, pp. 132–141.
[8] E. Cambria, B. Schuller, Y. Xia, C. Havasi, Knowledge-based approaches to concept-level sentiment analysis: new avenues in opinion mining and sentiment analysis, IEEE Intell. Syst. 28 (2013) 15–21.
[9] S. Das, M. Chen, Yahoo! for Amazon: extracting market sentiment from stock message boards, in: Proc. 8th Asia Pacific Finance Association Annual Conference (APFA), 2001.
[10] R. Feldman, Techniques and applications for sentiment analysis, Commun. ACM 56 (2013) 82–89.
[11] T.L. Griffiths, M. Steyvers, Finding scientific topics, Proc. Natl. Acad. Sci. USA 101 (2004) 5228–5235.
[12] J. Guo, S. Xu, S. Bao, Y. Yu, Tapping on the potential of Q&A community by recommending answer providers, in: Proc. 17th ACM International Conference on Information and Knowledge Management (CIKM), 2008, pp. 921–930.
[13] G. Heinrich, Parameter Estimation for Text Analysis, Technical Report, 2008.
[14] Y. He, C. Lin, H. Alani, Automatically extracting polarity-bearing topics for cross-domain sentiment classification, in: Proc. 49th Annual Meeting of the Association for Computational Linguistics (ACL), 2011, pp. 123–131.
[15] K.-H. Jung, J. Lee, Probabilistic generative ranking method based on multi-support vector domain description, Inf. Sci. 247 (2013) 144–153.
[16] P. Katz, M. Singleton, R. Wicentowski, SWAT-MP: the SemEval-2007 systems for task 5 and task 14, in: Proc. 4th International Workshop on Semantic Evaluations (ACL), 2007, pp. 308–313.
[17] P. Li, H. Li, M. Wu, Multi-label ensemble based on variable pairwise constraint projection, Inf. Sci. 222 (2013) 269–281.
[18] K. Lin, H. Chen, Ranking reader emotions using pairwise loss minimization and emotional distribution regression, in: Proc. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2008, pp. 136–144.
[19] C. Lin, Y. He, Joint sentiment/topic model for sentiment analysis, in: Proc. 18th ACM International Conference on Information and Knowledge Management (CIKM), 2009, pp. 375–384.
[20] K.H.-Y. Lin, C. Yang, H.-H. Chen, What emotions do news articles trigger in their readers?, in: Proc. 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2007, pp. 733–734.
[21] K.H.-Y. Lin, C. Yang, H.-H. Chen, Emotion classification of online news articles from the reader's perspective, in: Proc. IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI/IAT), 2008, pp. 220–226.
[22] B. Liu, Sentiment Analysis and Opinion Mining, Morgan & Claypool Publishers, 2012.
[23] B. Pang, L. Lee, S. Vaithyanathan, Thumbs up? Sentiment classification using machine learning techniques, in: Proc. Empirical Methods in Natural Language Processing (EMNLP), 2002, pp. 79–86.
[24] S. Pan, X. Ni, J. Sun, Q. Yang, Z. Chen, Cross-domain sentiment classification via spectral feature alignment, in: Proc. 19th International Conference on World Wide Web (WWW), 2010, pp. 751–760.
[25] C. Quan, F. Ren, An exploration of features for recognizing word emotion, in: Proc. 23rd International Conference on Computational Linguistics (Coling), 2010, pp. 922–930.
[26] D. Ramage, S. Dumais, D. Liebling, Characterizing microblogs with topic models, in: Proc. 4th International AAAI Conference on Weblogs and Social Media (ICWSM), 2010.
[27] D. Ramage, D. Hall, R. Nallapati, C.D. Manning, Labeled LDA: a supervised topic model for credit attribution in multi-label corpora, in: Proc. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2009, pp. 248–256.
[28] M. Rosen-Zvi, T. Griffiths, M. Steyvers, P. Smyth, The author-topic model for authors and documents, in: Proc. 20th Conference on Uncertainty in Artificial Intelligence (UAI), 2004, pp. 487–494.
[29] P. Saari, T. Eerola, Semantic computing of moods based on tags in social media of music, IEEE Trans. Knowl. Data Eng. (2013), http://dx.doi.org/10.1109/TKDE.2013.128.
[30] V. Stoyanov, C. Cardie, Annotating topics of opinions, in: Proc. 6th International Conference on Language Resources and Evaluation (LREC), 2008.
[31] C. Strapparava, R. Mihalcea, SemEval-2007 task 14: affective text, in: Proc. 4th International Workshop on Semantic Evaluations (ACL), 2007, pp. 70–74.
[32] P.D. Turney, Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews, in: Proc. 40th Annual Meeting of the Association for Computational Linguistics (ACL), 2002, pp. 417–424.
[33] X. Wang, A. McCallum, Topics over time: a non-Markov continuous-time model of topical trends, in: Proc. 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), 2006, pp. 424–433.
[34] Q.S. Wang, O. Wu, W.M. Hu, J.F. Yang, W.Q. Li, Ranking social emotions by learning listwise preference, in: Proc. 1st Asian Conference on Pattern Recognition (ACPR), 2011, pp. 164–168.
[35] R. Xia, C. Zong, S. Li, Ensemble of feature sets and classification algorithms for sentiment classification, Inf. Sci. 181 (2011) 1138–1152.
[36] X. Xu, S. Tan, Y. Liu, X. Cheng, Z. Lin, Towards jointly extracting aspects and aspect-specific sentiment knowledge, in: Proc. 21st ACM International Conference on Information and Knowledge Management (CIKM), 2012, pp. 1895–1899.
