MATRIX FACTORIZATION
proprietary material
METHODS COLLABORATIVE
FILTERING
USER RATINGS PREDICTION
1 Alex Lin
Senior Architect
Intelligent Mining
Outline
Factoranalysis
Matrix decomposition
proprietary material
Matrix Factorization Model
2
Factor Analysis
Aprocedure can help identify the factors that
might be used to explain the interrelationships
among the variables
proprietary material
Model based approach
3
Refresher: Matrix Decomposition
r q p
5 x 6 matrix 5 x 3 matrix 3 x 6 matrix
proprietary material
X21 X22 X12 X24 X25 X26 y
proprietary material
1 5 4 4 3
2 3 ?
items
3 3 ? 5 ? 2
4 3 3 4 3
4 4
5
Rating Prediction Movie Preference Factor Vector
Learn Factor Vectors
users
1 2 3 4 5 6 7 8 9 10 n
1 5 4 4 3
proprietary material
2 3 ?
items
3 3 ? 5 ? 2
4 3 3 4 3
4 4
4=
m U3-1 * I1-1 + U3-2 * I1-2 + U3-3 * I1-3 + U3-4 * I1-4
3 = U7-1 * I2-1 + U7-2 * I2-2 + U7-3 * I2-3 + U7-4 * I2-4
..
proprietary material
- 99.9%)
See Appendix for SVD
7
How to Learn Factor Vectors
How do we learn preference factor vectors (a, b, c) and
(x, y, z)?
proprietary material
Minimize errors on the known ratings
8
Data Normalization
Remove Global mean
users
proprietary material
1 2 3 4 5 6 7 8 9 10 n
2 .79 ?
items
.52 .8
9
Factorization Model
Only Preference factors
min ui
(r q T
p ) 2
proprietary material
i u
q*. p*
(u,i)k
To learn the factor vectors (pu and qi)
Rating = 4
Global Preference
Mean Factor
min ui
(r b b q T
p ) 2
proprietary material
i u i u
q*. p*
(u,i)k
To learn Item bias and User bias
Rating = 4
Global Preference
Mean Factor
proprietary material
T 22 2
min (r bi bu q p u ) + ( qi +
i pu + bi + bu)
q*. p*
(u,i)k
Regularization to prevent overfitting
Rating = 4
Global Preference
Mean Factor
Optimize Factor Vectors
Find optimal factor vectors - minimizing cost
function
proprietary material
Algorithms:
Stochastic gradient descent
Others: Alternating least squares etc..
Most frequently use:
Stochastic gradient descent
13
Matrix Factorization Tuning
Number of Factors in the Preference vectors
Learning Rate of Gradient Descent
proprietary material
Best result usually coming from different learning rate
for different parameter. Especially user/item bias terms.
Parameters in Factorization Model
Time dependent parameters
Seasonality dependent parameters
Many other considerations !
14
High-Level Implementation Steps
Construct User-Item Matrix (sparse data structure!)
Define factorization model - Cost function
proprietary material
Take out global mean
proprietary material
16
Appendix
Stochastic Gradient Descent
Batch Gradient Descent
proprietary material
Singular Value Decomposition (SVD)
17
Stochastic Gradient Descent
Repeat Until Convergence {
for i=1 to m in random order {
proprietary material
j := j + (y (i) h (x ( i) ))x (ji) (for every j)
} partial derivative term
}
18
Batch Gradient Descent
Repeat Until Convergence {
m
j := j + (y (i) h (x ( i) ))x (ji) (for every j)
proprietary material
i=1 partial derivative term
}
19
Singular Value Decomposition
(SVD)
A = U S VT
A U S VT
m x r matrix r x r matrix r x n matrix
proprietary material
m x n matrix
items
items
rank = k
k<r
users users
Ak = U k Sk VkT
20