Analyzing Sentiment
Emily Fox & Carlos Guestrin
Machine Learning Specialization
University of Washington
2015 Emily Fox & Carlos Guestrin Machine Learning Specialization
Predicting sentiment by topic: An intelligent restaurant review system
Sample review:
Watching the chefs create incredible edible art made the experience very unique.
Novel intelligent restaurant review app
[Figure: all reviews for a restaurant feed a sentence sentiment classifier, and sentiment is reported per topic (Experience, Ramen, Sushi).]
Sentences from reviews:
- All the sushi was delicious.
- The sushi was amazing, and the rice is just outstanding.
- The service is somewhat hectic.
- Easily best sushi in Seattle.
Sentences about sushi, passed to the sentiment classifier:
- All the sushi was delicious.
- The sushi was amazing, and the rice is just outstanding.
- Easily best sushi in Seattle.
[Figure: the app highlights "Easily best sushi in Seattle."]
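The review-app flow above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the course's implementation: `toy_sentiment` is a hypothetical stand-in for a trained sentence sentiment classifier (it uses a hand-picked word list), splitting sentences on periods is a simplification, and averaging the per-sentence predictions is one assumed way to summarize a topic.

```python
import re

# Hypothetical stand-in for a trained sentence sentiment classifier:
# +1 if the sentence has more positive than negative words, else -1.
POSITIVE = {"delicious", "amazing", "outstanding", "best"}
NEGATIVE = {"hectic", "awful", "terrible"}

def toy_sentiment(sentence):
    words = re.findall(r"[a-z]+", sentence.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return 1 if score > 0 else -1

def split_sentences(review):
    # Assumed simplification: every sentence ends with a period.
    return [s.strip() for s in review.split(".") if s.strip()]

def topic_sentiment(reviews, topic):
    # Keep only sentences that mention the topic, then average the predictions.
    sentences = [s for r in reviews for s in split_sentences(r)
                 if topic in s.lower()]
    if not sentences:
        return None, []
    predictions = [toy_sentiment(s) for s in sentences]
    return sum(predictions) / len(predictions), sentences

reviews = [
    "All the sushi was delicious. The service is somewhat hectic.",
    "The sushi was amazing, and the rice is just outstanding.",
    "Easily best sushi in Seattle.",
]
average, sushi_sentences = topic_sentiment(reviews, "sushi")
print(average, sushi_sentences)
```

The sentence about service is filtered out before classification, so it does not affect the sushi summary.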
Classifier applications
Classifier
Input: x (sentence from review) -> Classifier MODEL -> Output: y (predicted class)
Webpage classification
Input: x = webpage
Output: y = Education, Finance, or Technology
Spam filtering
Input: x = text of email, sender, IP, ...
Output: y = Not spam or Spam
Image classification
Input: x = image pixels
Output: y = predicted object
Personalized medical diagnosis
Input: x -> Disease Classifier MODEL -> Output: y
Output: y = Healthy, Cold, Flu, or Pneumonia
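Decision boundary example
Word      Weight
awesome    1.0
awful     -1.5
Score(x) = 1.0 #awesome - 1.5 #awful
[Plot: #awesome on the horizontal axis (0 to 4), #awful on the vertical axis. The decision boundary separates the region where Score(x) > 0 from the region where Score(x) < 0.]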
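As a small sketch of how this score could be computed, the snippet below uses the two weights from the slide. Treating every other word as weight 0, the tokenization, the tie-breaking rule at a score of exactly 0, and the example sentence are all assumptions made for illustration.

```python
import re

# Weights from the slide; every other word is assumed to have weight 0.
WEIGHTS = {"awesome": 1.0, "awful": -1.5}

def score(sentence):
    # Score(x) = 1.0 * #awesome - 1.5 * #awful
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(WEIGHTS.get(w, 0.0) for w in words)

def predict(sentence):
    # Assumption: a score of exactly 0 is treated as negative here.
    return 1 if score(sentence) > 0 else -1

# Hypothetical example sentence: 2 x "awesome", 1 x "awful".
example = "Awesome sushi, awesome ramen, but awful service"
print(score(example), predict(example))
```

With two occurrences of "awesome" and one of "awful", the score is 2(1.0) - 1(1.5) = 0.5, so the sentence falls on the positive side of the boundary.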
Decision boundary separates positive & negative predictions
For linear classifiers:
- When 2 weights are non-zero: line
- When 3 weights are non-zero: plane
- When many weights are non-zero: hyperplane
For more general classifiers: more complicated shapes
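One way to see why the boundary is a line, plane, or hyperplane: for a linear classifier the boundary is the set of inputs whose weighted sum is exactly 0, whatever the number of weights. The sketch below assumes that form; the third weight (0.5) and the count vectors are hypothetical values chosen for illustration.

```python
def linear_score(weights, counts):
    # Weighted sum of word counts; works for any number of weights.
    return sum(w * x for w, x in zip(weights, counts))

def side_of_boundary(weights, counts):
    s = linear_score(weights, counts)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "on boundary"

# 2 non-zero weights (awesome, awful): the boundary is a line.
two_weights = side_of_boundary([1.0, -1.5], [3, 2])            # 3.0 - 3.0 = 0
# 3 non-zero weights: the boundary is a plane (third weight is hypothetical).
three_weights = side_of_boundary([1.0, -1.5, 0.5], [3, 2, 1])  # 0 + 0.5 > 0
print(two_weights, three_weights)
```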
Data (x, y): (Sentence1, label), (Sentence2, label), ...
- Training set: learn classifier
- Test set: evaluate?
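A minimal sketch of splitting labeled data into a training set and a test set follows. The 80/20 split, the fixed random seed, and the placeholder sentences and labels are assumptions for illustration; the slides do not specify them.

```python
import random

def train_test_split(data, test_fraction=0.2, seed=0):
    # Shuffle a copy so the original order is left untouched.
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    # The first n_test examples are held out; the rest are used for training.
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical (sentence, label) pairs with labels +1 / -1.
data = [(f"Sentence{i}", 1 if i % 2 == 0 else -1) for i in range(1, 11)]
train_set, test_set = train_test_split(data)
print(len(train_set), len(test_set))
```

The classifier is learned only from `train_set`; `test_set` is held back so the evaluation uses sentences the classifier has not seen.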
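Classification error
For each test example, hide the true label, feed the sentence to the learned classifier, and compare the predicted label with the true label.
Test examples: (Sushi was great, label), (Food was OK, label)
Each prediction is either Correct! or a Mistake!, and the matching tally (Correct or Mistakes) goes from 0 to 1.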
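Classification error & accuracy
Error measures fraction of mistakes:
error = (# of mistakes) / (total # of test examples)
Accuracy measures fraction of correct predictions:
accuracy = (# correct) / (total # of test examples) = 1 - error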
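Error as the fraction of mistakes, and accuracy as the fraction of correct predictions, can be computed as in the sketch below. The label encoding (+1 / -1) and the example labels are hypothetical.

```python
def classification_error(true_labels, predicted_labels):
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("need two non-empty label lists of equal length")
    mistakes = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return mistakes / len(true_labels)

def accuracy(true_labels, predicted_labels):
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Hypothetical test-set labels and predictions.
true_labels = [1, -1, 1, 1]
predicted_labels = [1, 1, 1, -1]
err = classification_error(true_labels, predicted_labels)
acc = accuracy(true_labels, predicted_labels)
print(err, acc)
```

Two of the four predictions differ from the true labels, so both error and accuracy are 0.5 here, and the two always sum to 1.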
                      Predicted Positive     Predicted Negative
True label Positive   True Positive          False Negative (FN)
True label Negative   False Positive (FP)    True Negative
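The four outcomes (true positive, false negative, false positive, true negative) can be tallied from true and predicted labels as in this sketch. The +1 / -1 label encoding and the example data are assumptions for illustration.

```python
from collections import Counter

def confusion_counts(true_labels, predicted_labels):
    counts = Counter()
    for t, p in zip(true_labels, predicted_labels):
        if t == 1 and p == 1:
            counts["TP"] += 1   # true label positive, predicted positive
        elif t == 1:
            counts["FN"] += 1   # true label positive, predicted negative
        elif p == 1:
            counts["FP"] += 1   # true label negative, predicted positive
        else:
            counts["TN"] += 1   # true label negative, predicted negative
    return counts

# Hypothetical labels and predictions.
true_labels = [1, 1, -1, -1, 1]
predicted_labels = [1, -1, 1, -1, 1]
counts = confusion_counts(true_labels, predicted_labels)
print(dict(counts))
```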
Cost of different types of mistakes can be different (& high) in some applications
                  Spam filtering    Medical diagnosis
False negative    Annoying          Disease not treated
False positive    Email lost        Wasteful treatment
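One simple way to make unequal costs concrete is a cost-weighted count of mistakes. This weighting scheme is an assumption for illustration, not something the slides prescribe, and the cost values and mistake counts below are hypothetical.

```python
def total_cost(n_false_positives, n_false_negatives, cost_fp, cost_fn):
    # Each type of mistake is charged its own (assumed) cost.
    return n_false_positives * cost_fp + n_false_negatives * cost_fn

# Spam filtering: a false positive (email lost) is assumed to cost more
# than a false negative (annoying spam in the inbox).
spam_cost = total_cost(2, 5, cost_fp=10.0, cost_fn=1.0)

# Medical diagnosis: a false negative (disease not treated) is assumed to
# cost more than a false positive (wasteful treatment).
medical_cost = total_cost(2, 5, cost_fp=1.0, cost_fn=10.0)
print(spam_cost, medical_cost)
```

The same counts of mistakes give very different totals once each type is weighted by what it costs in the application.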
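[Table: true label vs. predicted label for binary classification (True Positive, False Negative (FN), False Positive (FP), True Negative), and the same layout for a multiclass example with true labels Healthy, Cold, Flu.]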
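For more than two classes, the same true-label versus predicted-label tally becomes a square table with one row and one column per class. The sketch below builds it for the classes Healthy, Cold, and Flu; the labels and predictions are hypothetical, since the counts from the slide did not survive.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    index = {c: i for i, c in enumerate(classes)}
    # Rows are true labels, columns are predicted labels.
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

classes = ["Healthy", "Cold", "Flu"]
# Hypothetical true labels and predictions.
true_labels = ["Healthy", "Healthy", "Cold", "Flu", "Flu", "Cold"]
predicted_labels = ["Healthy", "Cold", "Cold", "Flu", "Healthy", "Cold"]

matrix = confusion_matrix(true_labels, predicted_labels, classes)
# Correct predictions lie on the diagonal.
n_correct = sum(matrix[i][i] for i in range(len(classes)))
print(matrix, n_correct)
```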
In practice:
- More complex models require more data
- Empirical analysis can provide guidance
Bias of model
Classifier based on single words
P(y|x), where y is the output label and x is the input sentence
Extremely useful in practice
Summary of classification