
On the extension of trace norm to tensors

Ryota Tomioka¹, Kohei Hayashi², Hisashi Kashima¹

¹The University of Tokyo
²Nara Institute of Science and Technology

2010/12/10

NIPS 2010 Workshop: Tensors, Kernels, and Machine Learning


Convex low-rank tensor completion
Tucker decomposition

[Figure: a tensor of size I1 × I2 × I3 (axes labeled, e.g., sensors × time) decomposed into a core tensor of size r1 × r2 × r3 capturing the interactions, and three factor matrices ("loadings") of sizes I1 × r1, I2 × r2, I3 × r3 capturing the features.]
Conventional formulation (nonconvex)

$$\mathop{\mathrm{minimize}}_{C,\,U_1,U_2,U_3}\;\; \big\|\Omega \circ \big(Y - C \times_1 U_1 \times_2 U_2 \times_3 U_3\big)\big\|_F^2 + \text{regularization},$$

where $\Omega$ masks the observed entries and $\times_k$ denotes the mode-$k$ product. Equivalently,

$$\mathop{\mathrm{minimize}}_{X}\;\; \|\Omega \circ (Y - X)\|_F^2 \quad \text{s.t.} \quad \mathrm{rank}(X) \le (r_1, r_2, r_3).$$

• Solved by alternating minimization.
• The rank $(r_1, r_2, r_3)$ has to be fixed beforehand.
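To make the mode-$k$ product and the masked objective above concrete, here is a minimal numpy sketch (not the authors' code; the names `Y`, `Omega`, `C`, and `Us` are illustrative):

```python
import numpy as np

def mode_product(T, M, k):
    """Mode-k product T x_k M: multiplies every mode-k fiber of the
    tensor T by the matrix M (shape: out_dim x T.shape[k])."""
    T = np.moveaxis(T, k, 0)                # bring mode k to the front
    out = np.tensordot(M, T, axes=(1, 0))   # contract M's columns with mode k
    return np.moveaxis(out, 0, k)           # restore the original mode order

def tucker_objective(Y, Omega, C, Us):
    """Masked squared error ||Omega o (Y - C x_1 U_1 x_2 U_2 x_3 U_3)||_F^2."""
    X = C
    for k, U in enumerate(Us):
        X = mode_product(X, U, k)
    return np.sum((Omega * (Y - X)) ** 2)
```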
Our approach

Matrix case: estimating a low-rank matrix is hard, but trace norm minimization is tractable [Fazel, Hindi & Boyd 01].

Generalization to tensors: estimating a low-rank tensor (rank defined in the sense of the Tucker decomposition) is hard; we extend trace norm minimization so that it remains tractable.
Trace norm regularization (for matrices)

For $X \in \mathbb{R}^{R \times C}$ with $m = \min(R, C)$,

$$\|X\|_{\mathrm{tr}} = \sum_{j=1}^{m} \sigma_j(X),$$

i.e., the linear sum of the singular values.

• Roughly speaking, L1 regularization on the singular values.
• Stronger regularization → more zero singular values → low rank.
• Not obvious for tensors (there are no singular values for tensors).
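As a quick aside (a sketch, not part of the slides), both the trace norm and its proximal operator, singular-value soft-thresholding, take a few lines of numpy; the soft-thresholding is what zeroes out small singular values and pushes solutions toward low rank:

```python
import numpy as np

def trace_norm(X):
    """||X||_tr: the sum of the singular values (a.k.a. nuclear norm)."""
    return np.linalg.svd(X, compute_uv=False).sum()

def svt(X, tau):
    """Proximal operator of tau * ||.||_tr: soft-threshold the singular
    values; larger tau zeroes more of them, giving a lower-rank matrix."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
```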
Mode-k unfolding (matricization)

[Figure: the mode-1 unfolding X_(1) rearranges the I1 × I2 × I3 tensor into an I1 × (I2 I3) matrix whose columns are the mode-1 fibers; the mode-2 unfolding X_(2) likewise gives an I2 × (I3 I1) matrix.]
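A minimal numpy sketch of unfolding (this uses the common moveaxis-then-reshape convention; the slides order the remaining modes cyclically, which only permutes the columns and leaves the rank unchanged):

```python
import numpy as np

def unfold(X, k):
    """Mode-k unfolding: an I_k x (product of the other dimensions)
    matrix whose columns are the mode-k fibers of X."""
    return np.moveaxis(X, k, 0).reshape(X.shape[k], -1)

X = np.random.randn(3, 4, 5)
print(unfold(X, 0).shape, unfold(X, 1).shape, unfold(X, 2).shape)
# (3, 20) (4, 15) (5, 12)
```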
Elementary facts about Tucker decomposition

$$X = C \times_1 U_1 \times_2 U_2 \times_3 U_3$$

Mode-1 unfolding: $X_{(1)} = U_1 C_{(1)} (U_3 \otimes U_2)^\top$, so $\mathrm{rank}(X_{(1)}) \le r_1$.

Mode-2 unfolding: $X_{(2)} = U_2 C_{(2)} (U_1 \otimes U_3)^\top$, so $\mathrm{rank}(X_{(2)}) \le r_2$.

Mode-3 unfolding: $X_{(3)} = U_3 C_{(3)} (U_2 \otimes U_1)^\top$, so $\mathrm{rank}(X_{(3)}) \le r_3$.

In general, the rank of $X_{(k)}$ is no more than the rank of $C_{(k)}$.
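These rank bounds are easy to verify numerically; a small sketch with illustrative sizes (same unfolding convention as in the previous snippet):

```python
import numpy as np

r, I = (2, 3, 4), (10, 11, 12)            # Tucker ranks and tensor dimensions
C = np.random.randn(*r)                   # random core tensor
Us = [np.random.randn(Ik, rk) for Ik, rk in zip(I, r)]

# X = C x_1 U_1 x_2 U_2 x_3 U_3, built by repeated mode-k products
X = C
for k, U in enumerate(Us):
    X = np.moveaxis(np.tensordot(U, np.moveaxis(X, k, 0), axes=(1, 0)), 0, k)

for k in range(3):
    Xk = np.moveaxis(X, k, 0).reshape(X.shape[k], -1)   # mode-k unfolding
    print(f"rank(X_({k+1})) = {np.linalg.matrix_rank(Xk)} <= r_{k+1} = {r[k]}")
```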
What it means

• We can use the trace norm of an unfolding of a tensor X to learn a low-rank X.

Tensor $X$ is low-rank ($\exists k,\ r_k < I_k$) ⇔ the unfolding $X_{(k)}$ is a low-rank matrix (matricization in one direction, tensorization in the other).
Approach 1: As a matrix

• Pick a mode k, and hope that the tensor to be learned is low-rank in mode k.

$$\mathop{\mathrm{minimize}}_{X \in \mathbb{R}^{I_1 \times \cdots \times I_K}}\;\; \frac{1}{2\lambda} \|\Omega \circ (Y - X)\|_F^2 + \|X_{(k)}\|_*$$

Pro: basically a matrix problem → theoretical guarantee (Candès & Recht 09).

Con: you have to be lucky to pick the right mode.
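One standard way to solve this problem is proximal gradient descent, alternating a gradient step on the smooth loss with singular-value soft-thresholding of the chosen unfolding. The sketch below is illustrative (not necessarily the authors' solver; the step size, λ, and iteration count are placeholders):

```python
import numpy as np

def complete_as_matrix(Y, Omega, k, lam=1.0, eta=1.0, iters=200):
    """Proximal gradient for (1/2*lam)||Omega o (Y - X)||_F^2 + ||X_(k)||_*."""
    X = np.zeros_like(Y)
    perm_shape = np.moveaxis(Y, k, 0).shape
    for _ in range(iters):
        G = X - (eta / lam) * Omega * (X - Y)            # gradient step
        M = np.moveaxis(G, k, 0).reshape(G.shape[k], -1) # mode-k unfolding
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        M = U @ np.diag(np.maximum(s - eta, 0.0)) @ Vt   # soft-threshold sigma
        X = np.moveaxis(M.reshape(perm_shape), 0, k)     # fold back to a tensor
    return X
```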
Approach 2: Constrained optimization

• Constrain X so that every unfolding of X is simultaneously low-rank.

$$\mathop{\mathrm{minimize}}_{X \in \mathbb{R}^{I_1 \times \cdots \times I_K}}\;\; \frac{1}{2\lambda} \|\Omega \circ (Y - X)\|_F^2 + \sum_{k=1}^{K} \gamma_k \|X_{(k)}\|_*$$

$\gamma_k$: tuning parameters, usually set to 1.

Pro: jointly regularizes every mode.

Con: a strong constraint.

See also Signoretto et al. '10.
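For concreteness, the overlapped penalty $\sum_k \gamma_k \|X_{(k)}\|_*$ can be evaluated as below (a sketch; `gammas` defaults to all ones, as on the slide):

```python
import numpy as np

def overlapped_trace_norm(X, gammas=None):
    """Sum over modes of gamma_k * ||X_(k)||_* (the Approach 2 penalty)."""
    gammas = [1.0] * X.ndim if gammas is None else gammas
    return sum(
        g * np.linalg.svd(np.moveaxis(X, k, 0).reshape(X.shape[k], -1),
                          compute_uv=False).sum()
        for k, g in enumerate(gammas)
    )
```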


Approach 3: Mixture of low-rank tensors

• Each mixture component Z_k is regularized to be low-rank only in mode k.

$$\mathop{\mathrm{minimize}}_{Z_1,\ldots,Z_K}\;\; \frac{1}{2\lambda} \Big\|\Omega \circ \Big(Y - \sum_{k=1}^{K} Z_k\Big)\Big\|_F^2 + \sum_{k=1}^{K} \gamma_k \|Z_{k\,(k)}\|_*$$

Pro: each Z_k takes care of one mode.

Con: the sum is not low-rank.
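The mixture objective can likewise be written down directly (a sketch with illustrative names; `Zs` is the list of mixture components $Z_1, \ldots, Z_K$):

```python
import numpy as np

def mixture_objective(Y, Omega, Zs, lam=1.0, gammas=None):
    """(1/2*lam)||Omega o (Y - sum_k Z_k)||_F^2 + sum_k gamma_k ||Z_k,(k)||_*."""
    gammas = [1.0] * len(Zs) if gammas is None else gammas
    loss = np.sum((Omega * (Y - sum(Zs))) ** 2) / (2 * lam)
    penalty = sum(
        g * np.linalg.svd(np.moveaxis(Z, k, 0).reshape(Z.shape[k], -1),
                          compute_uv=False).sum()
        for k, (Z, g) in enumerate(zip(Zs, gammas))
    )
    return loss + penalty
```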
Numerical experiment

• True tensor: size 50×50×20, rank 7×8×9. No noise (λ = 0).
• Random train/test split.

[Figure: generalization error (log scale, down to the optimization tolerance around 10⁻⁴) vs. fraction of observed elements (0-1), comparing As a Matrix (modes 1, 2, 3), Constraint, Mixture, Tucker (large), and Tucker (exact). Tucker = EM algorithm (nonconvex).]
Computation time

• The convex formulations are also fast.

[Figure: computation time in seconds (0-50) vs. fraction of observed elements (0-1) for As a Matrix, Constraint, Mixture, Tucker (large), and Tucker (exact).]
Phase transition behaviour

• Sum of true ranks = min(r1, r2·r3) + min(r2, r3·r1) + min(r3, r1·r2); e.g., for rank (7, 8, 9) this is 7 + 8 + 9 = 24.

[Figure: fraction of observations required for perfect reconstruction (0-1) vs. sum of true ranks (0-80), with each point labeled by its true rank, e.g. [7 8 9], [20 3 2], [40 20 6]; the required fraction grows roughly with the sum of true ranks.]
Summary

• Low-rank tensor completion can be formulated as a convex optimization problem using the trace norm of the unfoldings.
  - No need to specify the rank beforehand.
• The convex formulation is more accurate and faster than the conventional EM-based Tucker decomposition.
• A curious "phase transition" was found → a compressive-sensing-type analysis is ongoing work.
• Technical report: arXiv:1010.0789 (including the optimization).
• Code:
  - http://www.ibis.t.u-tokyo.ac.jp/RyotaTomioka/Softwares/Tensor
Acknowledgment

• This work was supported in part by MEXT KAKENHI 22700138 and 22700289, and by NTT Communication Science Laboratories.
