2010/12/10
[Figure: Tucker decomposition of an I1 × I2 × I3 tensor (axes labeled "Sensors" and "Time") into an r1 × r2 × r3 core tensor and factor matrices (loadings).]
Conventional formulation (nonconvex)

    \min_{X}\; \|\Omega \circ (Y - X)\|_F^2 \quad \text{s.t.}\quad \operatorname{rank}(X) \le (r_1, r_2, r_3).

• Alternating minimization.
• The rank (r1, r2, r3) has to be fixed beforehand.
Our approach

• Matrix: estimation of a low-rank matrix (hard) is relaxed to trace norm minimization (tractable) [Fazel, Hindi, Boyd 01].
• Tensor (our generalization): estimation of a low-rank tensor (hard) is relaxed to extended trace norm minimization (tractable), with rank defined in the sense of the Tucker decomposition.
Trace norm regularization (for matrices)

For X ∈ R^{R×C} with m = min(R, C),

    \|X\|_{\mathrm{tr}} = \sum_{j=1}^{m} \sigma_j(X),

i.e. the linear sum of the singular values.
• Roughly speaking, L1 regularization on the singular values.
• Stronger regularization --> more zero singular values --> low rank.
• Not obvious for tensors (tensors have no singular values).
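This L1-on-singular-values view is easy to check numerically; a minimal sketch in plain NumPy (the helper names `trace_norm` and `svt` are mine, not from the talk):

```python
import numpy as np

def trace_norm(X):
    """Trace (nuclear) norm: the sum of the singular values of X."""
    return np.linalg.svd(X, compute_uv=False).sum()

def svt(X, tau):
    """Singular-value thresholding: soft-threshold the singular values
    of X by tau (the proximal operator of tau * trace norm), i.e. L1
    shrinkage applied to the spectrum."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 5))
# Stronger regularization (larger tau) zeroes more singular values,
# hence a result of equal or lower rank.
weak = np.linalg.matrix_rank(svt(X, 0.1))
strong = np.linalg.matrix_rank(svt(X, 3.0))
assert strong <= weak
```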
Mode-k unfolding (matricization)

For a tensor in Tucker form, e.g. the mode-3 unfolding,

    X_{(3)} = U_3 C_{(3)} (U_2 \otimes U_1)^\top,

the rank of X_{(k)} is no more than the rank of C_{(k)}, hence rank(X_{(k)}) ≤ r_k.
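The unfolding and the identity above can be verified in a few lines; this sketch assumes the Kolda–Bader (Fortran-order) unfolding convention, which matches the (U2 ⊗ U1) ordering on the slide, with illustrative sizes:

```python
import numpy as np

def unfold(X, k):
    """Mode-k unfolding X_(k): mode-k fibers become columns
    (Kolda-Bader convention, Fortran column ordering)."""
    return np.reshape(np.moveaxis(X, k, 0), (X.shape[k], -1), order='F')

rng = np.random.default_rng(0)
r, I = (2, 3, 4), (5, 6, 7)          # core ranks and tensor sizes (arbitrary)
C = rng.standard_normal(r)
U1, U2, U3 = (rng.standard_normal((I[k], r[k])) for k in range(3))
# X = C x1 U1 x2 U2 x3 U3 (Tucker form) via a single einsum
X = np.einsum('pqr,ap,bq,cr->abc', C, U1, U2, U3)

# Check X_(3) = U3 C_(3) (U2 kron U1)^T  (0-indexed mode k=2)
lhs = unfold(X, 2)
rhs = U3 @ unfold(C, 2) @ np.kron(U2, U1).T
assert np.allclose(lhs, rhs)
# rank(X_(k)) <= rank(C_(k)) <= r_k
assert np.linalg.matrix_rank(unfold(X, 2)) <= r[2]
```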
What it means

• We can use the trace norm of an unfolding of a tensor X to learn a low-rank X.
• Tensor X is low-rank (∃k, r_k < I_k) exactly when the unfolding X_{(k)} is a low-rank matrix: matricization goes from tensor to matrix, tensorization goes back.
Approach 1: As a matrix

• Pick a mode k, and hope that the tensor to be learned is low rank in mode k.

    \min_{X \in \mathbb{R}^{I_1 \times \cdots \times I_K}}\; \frac{1}{2\lambda}\|\Omega \circ (Y - X)\|_F^2 + \|X_{(k)}\|_*.
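As one illustration of how such a problem can be solved (this is a generic proximal-gradient / soft-impute-style sketch, not the optimization method of the technical report; all helper names are mine):

```python
import numpy as np

def unfold(X, k):
    return np.reshape(np.moveaxis(X, k, 0), (X.shape[k], -1), order='F')

def fold(M, k, shape):
    """Inverse of unfold: refold a mode-k unfolding into a tensor."""
    moved = (shape[k],) + shape[:k] + shape[k + 1:]
    return np.moveaxis(np.reshape(M, moved, order='F'), 0, k)

def svt(M, tau):
    """Soft-threshold the singular values of M by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def complete_as_matrix(Y, mask, k, lam, n_iter=300):
    """Proximal-gradient sketch for
        min_X 1/(2*lam) ||Omega o (Y - X)||_F^2 + ||X_(k)||_* .
    With step size lam, the gradient step simply restores the observed
    entries, and the prox step soft-thresholds the singular values of
    the mode-k unfolding by lam."""
    X = np.zeros_like(Y)
    for _ in range(n_iter):
        Z = np.where(mask, Y, X)                 # fill in observed entries
        X = fold(svt(unfold(Z, k), lam), k, Y.shape)
    return X

# Tiny demo: a tensor whose mode-0 unfolding has rank 1.
rng = np.random.default_rng(0)
X_true = np.einsum('a,bc->abc', rng.standard_normal(10),
                   rng.standard_normal((10, 5)))
mask = rng.random(X_true.shape) < 0.8
Y = np.where(mask, X_true, 0.0)
X_hat = complete_as_matrix(Y, mask, k=0, lam=1e-2)
```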
Approach 2: As a constraint

• Constrain X so that every unfolding of X is simultaneously low rank.

    \min_{X \in \mathbb{R}^{I_1 \times \cdots \times I_K}}\; \frac{1}{2\lambda}\|\Omega \circ (Y - X)\|_F^2 + \sum_{k=1}^{K} \gamma_k \|X_{(k)}\|_*.
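For reference, this overlapped objective is cheap to evaluate directly; a small sketch (the helper names are mine):

```python
import numpy as np

def unfold(X, k):
    return np.reshape(np.moveaxis(X, k, 0), (X.shape[k], -1), order='F')

def trace_norm(M):
    return np.linalg.svd(M, compute_uv=False).sum()

def overlapped_objective(X, Y, mask, lam, gammas):
    """1/(2*lam) ||Omega o (Y - X)||_F^2 + sum_k gamma_k ||X_(k)||_* ."""
    fit = np.linalg.norm(mask * (Y - X)) ** 2 / (2.0 * lam)
    reg = sum(g * trace_norm(unfold(X, k)) for k, g in enumerate(gammas))
    return fit + reg

Y_demo = np.arange(24.0).reshape(2, 3, 4)
val = overlapped_objective(np.zeros_like(Y_demo), Y_demo,
                           np.ones_like(Y_demo, dtype=bool), 1.0,
                           (1.0, 1.0, 1.0))
# with X = 0 the regularizer vanishes, so val = ||Y||_F^2 / 2
```

Penalizing every unfolding at once is what removes the need to guess a single good mode k beforehand.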
Approach 3: Mixture

• Each mixture component Z_k is regularized to be low-rank only in mode k.

    \min_{Z_1, \ldots, Z_K}\; \frac{1}{2\lambda}\Big\|\Omega \circ \Big(Y - \sum_{k=1}^{K} Z_k\Big)\Big\|_F^2 + \sum_{k=1}^{K} \gamma_k \|Z_{k(k)}\|_*.
• True tensor: Size 50x50x20, rank 7x8x9. No noise (λ=0).
• Random train/test split.
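The synthetic setup can be reproduced roughly as follows (the seed and observation pattern are illustrative):

```python
import numpy as np

# Experimental setup from the slide: a 50x50x20 tensor with Tucker
# rank (7, 8, 9), no noise, and a random split of the entries into
# observed (train) and held-out (test).
rng = np.random.default_rng(0)
I, r = (50, 50, 20), (7, 8, 9)
core = rng.standard_normal(r)
U = [rng.standard_normal((I[k], r[k])) for k in range(3)]
X_true = np.einsum('pqr,ap,bq,cr->abc', core, U[0], U[1], U[2])

frac = 0.5                                # fraction of observed elements
mask = rng.random(I) < frac               # Omega: observation indicator
Y = np.where(mask, X_true, 0.0)           # training data
# generalization error would be measured on the entries where ~mask

def unfold(X, k):
    return np.reshape(np.moveaxis(X, k, 0), (X.shape[k], -1), order='F')

# sanity check: each unfolding has the intended mode rank
assert [np.linalg.matrix_rank(unfold(X_true, k)) for k in range(3)] == [7, 8, 9]
```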
[Figure: generalization error vs. fraction of observed elements (log scale; the floor at 10^-4 is the optimization tolerance). Curves: As a Matrix (modes 1, 2, 3), Constraint, Mixture, Tucker (large), Tucker (exact); Tucker = the conventional EM algorithm (nonconvex).]
Computation time
• The convex formulation is also fast.
[Figure: computation time (s) vs. fraction of observed elements for As a Matrix, Constraint, Mixture, Tucker (large), and Tucker (exact).]
Phase transition behaviour
• Sum of true ranks = min(r1, r2·r3) + min(r2, r3·r1) + min(r3, r1·r2)
[Figure: fraction of observed elements required for perfect reconstruction vs. sum of true ranks. Each point is labeled by its true rank triple, from [20 3 2] and [5 4 3] near the bottom up to [40 20 6] near the top; the required fraction grows with the sum of true ranks.]
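The x-axis quantity can be computed with a one-line helper (the function name is mine); the min appears because the mode-k unfolding of the core is an r_k × (product of the other two ranks) matrix:

```python
def sum_of_true_ranks(r1, r2, r3):
    """min(r1, r2*r3) + min(r2, r3*r1) + min(r3, r1*r2): the rank of
    each mode-k unfolding can never exceed the product of the other
    two mode ranks, hence the min in each term."""
    return min(r1, r2 * r3) + min(r2, r3 * r1) + min(r3, r1 * r2)

# e.g. the highly unbalanced rank [40 3 2] contributes only
# min(40, 6) + min(3, 80) + min(2, 120) = 11,
# while the balanced rank [7 8 9] contributes 7 + 8 + 9 = 24.
assert sum_of_true_ranks(40, 3, 2) == 11
assert sum_of_true_ranks(7, 8, 9) == 24
```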
Summary
• Low-rank tensor completion can be computed as a convex optimization problem using the trace norm of the unfoldings.
  - No need to specify the rank beforehand.
• The convex formulation is more accurate and faster than the conventional EM-based Tucker decomposition.
• A curious "phase transition" was found → a compressive-sensing-type analysis is ongoing work.
• Technical report: arXiv:1010.0789 (including optimization).
• Code:
  - http://www.ibis.t.u-tokyo.ac.jp/RyotaTomioka/Softwares/Tensor
Acknowledgment
• This work was supported in part by MEXT KAKENHI 22700138 and 22700289, and by NTT Communication Science Laboratories.