
On the extension of trace norm to tensors

Ryota Tomioka¹, Kohei Hayashi², Hisashi Kashima¹

¹The University of Tokyo
²Nara Institute of Science and Technology

2010/12/10

NIPS 2010 Workshop: Tensors, Kernels, and Machine Learning


Convex low-rank tensor completion
Tucker decomposition

[Figure: a tensor of size I1 × I2 × I3 (axes labeled, e.g., sensors × time) decomposed into a core tensor of size r1 × r2 × r3 capturing the interactions, and three factor matrices ("loadings") of sizes I1 × r1, I2 × r2, I3 × r3 capturing the features.]
Conventional formulation (nonconvex)

$$\mathop{\mathrm{minimize}}_{C,\,U_1,U_2,U_3}\;\; \big\|\Omega \circ \big(Y - C \times_1 U_1 \times_2 U_2 \times_3 U_3\big)\big\|_F^2 + \text{regularization},$$

where $\Omega$ masks the observed entries and $\times_k$ denotes the mode-$k$ product. Equivalently,

$$\mathop{\mathrm{minimize}}_{X}\;\; \|\Omega \circ (Y - X)\|_F^2 \quad \text{s.t.} \quad \mathrm{rank}(X) \le (r_1, r_2, r_3).$$

• Solved by alternating minimization.
• The rank $(r_1, r_2, r_3)$ has to be fixed beforehand.
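To make the mode-$k$ product and the masked objective above concrete, here is a minimal numpy sketch (not the authors' code; the names `Y`, `Omega`, `C`, and `Us` are illustrative):

```python
import numpy as np

def mode_product(T, M, k):
    """Mode-k product T x_k M: multiplies every mode-k fiber of the
    tensor T by the matrix M (shape: out_dim x T.shape[k])."""
    T = np.moveaxis(T, k, 0)                # bring mode k to the front
    out = np.tensordot(M, T, axes=(1, 0))   # contract M's columns with mode k
    return np.moveaxis(out, 0, k)           # restore the original mode order

def tucker_objective(Y, Omega, C, Us):
    """Masked squared error ||Omega o (Y - C x_1 U_1 x_2 U_2 x_3 U_3)||_F^2."""
    X = C
    for k, U in enumerate(Us):
        X = mode_product(X, U, k)
    return np.sum((Omega * (Y - X)) ** 2)
```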
Our approach

Matrix case: estimating a low-rank matrix is hard, but trace norm minimization is tractable [Fazel, Hindi & Boyd 01].

Generalization to tensors: estimating a low-rank tensor (rank defined in the sense of the Tucker decomposition) is hard; we extend trace norm minimization so that it remains tractable.
Trace norm regularization (for matrices)

For $X \in \mathbb{R}^{R \times C}$ with $m = \min(R, C)$,

$$\|X\|_{\mathrm{tr}} = \sum_{j=1}^{m} \sigma_j(X),$$

i.e., the linear sum of the singular values.

• Roughly speaking, L1 regularization on the singular values.
• Stronger regularization → more zero singular values → low rank.
• Not obvious for tensors (there are no singular values for tensors).
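As a quick aside (a sketch, not part of the slides), both the trace norm and its proximal operator, singular-value soft-thresholding, take a few lines of numpy; the soft-thresholding is what zeroes out small singular values and pushes solutions toward low rank:

```python
import numpy as np

def trace_norm(X):
    """||X||_tr: the sum of the singular values (a.k.a. nuclear norm)."""
    return np.linalg.svd(X, compute_uv=False).sum()

def svt(X, tau):
    """Proximal operator of tau * ||.||_tr: soft-threshold the singular
    values; larger tau zeroes more of them, giving a lower-rank matrix."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
```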
Mode-k unfolding (matricization)

[Figure: the mode-1 unfolding X_(1) rearranges the I1 × I2 × I3 tensor into an I1 × (I2 I3) matrix whose columns are the mode-1 fibers; the mode-2 unfolding X_(2) likewise gives an I2 × (I3 I1) matrix.]
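A minimal numpy sketch of unfolding (this uses the common moveaxis-then-reshape convention; the slides order the remaining modes cyclically, which only permutes the columns and leaves the rank unchanged):

```python
import numpy as np

def unfold(X, k):
    """Mode-k unfolding: an I_k x (product of the other dimensions)
    matrix whose columns are the mode-k fibers of X."""
    return np.moveaxis(X, k, 0).reshape(X.shape[k], -1)

X = np.random.randn(3, 4, 5)
print(unfold(X, 0).shape, unfold(X, 1).shape, unfold(X, 2).shape)
# (3, 20) (4, 15) (5, 12)
```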
Elementary facts about Tucker decomposition

$$X = C \times_1 U_1 \times_2 U_2 \times_3 U_3$$

Mode-1 unfolding: $X_{(1)} = U_1 C_{(1)} (U_3 \otimes U_2)^\top$, so $\mathrm{rank}(X_{(1)}) \le r_1$.

Mode-2 unfolding: $X_{(2)} = U_2 C_{(2)} (U_1 \otimes U_3)^\top$, so $\mathrm{rank}(X_{(2)}) \le r_2$.

Mode-3 unfolding: $X_{(3)} = U_3 C_{(3)} (U_2 \otimes U_1)^\top$, so $\mathrm{rank}(X_{(3)}) \le r_3$.

In general, the rank of $X_{(k)}$ is no more than the rank of $C_{(k)}$.
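These rank bounds are easy to verify numerically; a small sketch with illustrative sizes (same unfolding convention as in the previous snippet):

```python
import numpy as np

r, I = (2, 3, 4), (10, 11, 12)            # Tucker ranks and tensor dimensions
C = np.random.randn(*r)                   # random core tensor
Us = [np.random.randn(Ik, rk) for Ik, rk in zip(I, r)]

# X = C x_1 U_1 x_2 U_2 x_3 U_3, built by repeated mode-k products
X = C
for k, U in enumerate(Us):
    X = np.moveaxis(np.tensordot(U, np.moveaxis(X, k, 0), axes=(1, 0)), 0, k)

for k in range(3):
    Xk = np.moveaxis(X, k, 0).reshape(X.shape[k], -1)   # mode-k unfolding
    print(f"rank(X_({k+1})) = {np.linalg.matrix_rank(Xk)} <= r_{k+1} = {r[k]}")
```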
What it means

• We can use the trace norm of an unfolding of a tensor X to learn a low-rank X.

Tensor $X$ is low-rank ($\exists k,\ r_k < I_k$) ⇔ the unfolding $X_{(k)}$ is a low-rank matrix (matricization in one direction, tensorization in the other).
Approach 1: As a matrix

• Pick a mode k, and hope that the tensor to be learned is low-rank in mode k.

$$\mathop{\mathrm{minimize}}_{X \in \mathbb{R}^{I_1 \times \cdots \times I_K}}\;\; \frac{1}{2\lambda} \|\Omega \circ (Y - X)\|_F^2 + \|X_{(k)}\|_*$$

Pro: basically a matrix problem → theoretical guarantee (Candès & Recht 09).

Con: you have to be lucky to pick the right mode.
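One standard way to solve this problem is proximal gradient descent, alternating a gradient step on the smooth loss with singular-value soft-thresholding of the chosen unfolding. The sketch below is illustrative (not necessarily the authors' solver; the step size, λ, and iteration count are placeholders):

```python
import numpy as np

def complete_as_matrix(Y, Omega, k, lam=1.0, eta=1.0, iters=200):
    """Proximal gradient for (1/2*lam)||Omega o (Y - X)||_F^2 + ||X_(k)||_*."""
    X = np.zeros_like(Y)
    perm_shape = np.moveaxis(Y, k, 0).shape
    for _ in range(iters):
        G = X - (eta / lam) * Omega * (X - Y)            # gradient step
        M = np.moveaxis(G, k, 0).reshape(G.shape[k], -1) # mode-k unfolding
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        M = U @ np.diag(np.maximum(s - eta, 0.0)) @ Vt   # soft-threshold sigma
        X = np.moveaxis(M.reshape(perm_shape), 0, k)     # fold back to a tensor
    return X
```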
Approach 2: Constrained optimization

• Constrain X so that every unfolding of X is simultaneously low-rank.

$$\mathop{\mathrm{minimize}}_{X \in \mathbb{R}^{I_1 \times \cdots \times I_K}}\;\; \frac{1}{2\lambda} \|\Omega \circ (Y - X)\|_F^2 + \sum_{k=1}^{K} \gamma_k \|X_{(k)}\|_*$$

$\gamma_k$: tuning parameters, usually set to 1.

Pro: jointly regularizes every mode.

Con: a strong constraint.

See also Signoretto et al. '10.
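For concreteness, the overlapped penalty $\sum_k \gamma_k \|X_{(k)}\|_*$ can be evaluated as below (a sketch; `gammas` defaults to all ones, as on the slide):

```python
import numpy as np

def overlapped_trace_norm(X, gammas=None):
    """Sum over modes of gamma_k * ||X_(k)||_* (the Approach 2 penalty)."""
    gammas = [1.0] * X.ndim if gammas is None else gammas
    return sum(
        g * np.linalg.svd(np.moveaxis(X, k, 0).reshape(X.shape[k], -1),
                          compute_uv=False).sum()
        for k, g in enumerate(gammas)
    )
```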


Approach 3: Mixture of low-rank tensors

• Each mixture component Z_k is regularized to be low-rank only in mode k.

$$\mathop{\mathrm{minimize}}_{Z_1,\ldots,Z_K}\;\; \frac{1}{2\lambda} \Big\|\Omega \circ \Big(Y - \sum_{k=1}^{K} Z_k\Big)\Big\|_F^2 + \sum_{k=1}^{K} \gamma_k \|Z_{k\,(k)}\|_*$$

Pro: each Z_k takes care of one mode.

Con: the sum is not low-rank.
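The mixture objective can likewise be written down directly (a sketch with illustrative names; `Zs` is the list of mixture components $Z_1, \ldots, Z_K$):

```python
import numpy as np

def mixture_objective(Y, Omega, Zs, lam=1.0, gammas=None):
    """(1/2*lam)||Omega o (Y - sum_k Z_k)||_F^2 + sum_k gamma_k ||Z_k,(k)||_*."""
    gammas = [1.0] * len(Zs) if gammas is None else gammas
    loss = np.sum((Omega * (Y - sum(Zs))) ** 2) / (2 * lam)
    penalty = sum(
        g * np.linalg.svd(np.moveaxis(Z, k, 0).reshape(Z.shape[k], -1),
                          compute_uv=False).sum()
        for k, (Z, g) in enumerate(zip(Zs, gammas))
    )
    return loss + penalty
```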
Numerical experiment

• True tensor: size 50×50×20, rank 7×8×9. No noise (λ = 0).
• Random train/test split.

[Figure: generalization error (log scale, down to the optimization tolerance around 10⁻⁴) vs. fraction of observed elements (0-1), comparing As a Matrix (modes 1, 2, 3), Constraint, Mixture, Tucker (large), and Tucker (exact). Tucker = EM algorithm (nonconvex).]
Computation time

• The convex formulations are also fast.

[Figure: computation time in seconds (0-50) vs. fraction of observed elements (0-1) for As a Matrix, Constraint, Mixture, Tucker (large), and Tucker (exact).]
Phase transition behaviour

• Sum of true ranks = min(r1, r2·r3) + min(r2, r3·r1) + min(r3, r1·r2); e.g., for rank (7, 8, 9) this is 7 + 8 + 9 = 24.

[Figure: fraction of observations required for perfect reconstruction (0-1) vs. sum of true ranks (0-80), with each point labeled by its true rank, e.g. [7 8 9], [20 3 2], [40 20 6]; the required fraction grows roughly with the sum of true ranks.]
Summary

• Low-rank tensor completion can be formulated as a convex optimization problem using the trace norm of the unfoldings.
  - No need to specify the rank beforehand.
• The convex formulation is more accurate and faster than the conventional EM-based Tucker decomposition.
• A curious "phase transition" was found → a compressive-sensing-type analysis is ongoing work.
• Technical report: arXiv:1010.0789 (including the optimization).
• Code:
  - http://www.ibis.t.u-tokyo.ac.jp/RyotaTomioka/Softwares/Tensor
Acknowledgment

• This work was supported in part by MEXT KAKENHI 22700138 and 22700289, and by NTT Communication Science Laboratories.
