
10x - Culture of Technical Breakthrough

Motivation & Principles - Project 10x Qi Review, 2014 April
10x vs. 10%
Rethink the problem entirely: aim for an order-of-magnitude improvement, not a marginal one
Challenge the constraints and assumptions

Impact driven
Big problem, radical solution, breakthrough technology
Aspirational and grounded

Project 10x

A series of efforts to drive Ads platform capability
Change agent for innovation culture
Catalyst for technology breakthrough

[Figure: 10x, combining Big Problem, Radical Solution, and Breakthrough Technology. Ref: Solve For X]

Project 10X: IDHash Status and Roadmap - Project 10x Qi Review, 2014 Nov
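RnR Capacity Gains

Store: Nena
Partitions & SSD size: 4 * 1TB
Pegasus capacity: 270GB
Pegasus memory: 120GB
IDHash capacity: 1.4TB
IDHash memory: 27GB
Current model size: 26GB
Growth headroom: 50x

Store: Jenny
Partitions & SSD size: 8 * 1TB
Remaining values as listed (two cells are missing for this row): 1.1TB, 12GB, 400GB
Growth headroom: 2.5x

Store: Category
Partitions & SSD size: 12 * 160GB
Pegasus capacity: 320GB
Pegasus memory: 32GB
IDHash capacity: 400GB
IDHash memory: 1.3GB
Current model size: 46GB
Growth headroom: 9x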
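The growth headroom figures can be roughly cross-checked against the other columns. The sketch below assumes that headroom means IDHash capacity divided by current model size. That definition is not stated on the slide, and the function and dictionary names are hypothetical. It uses the Nena and Category rows and treats 1TB as 1000GB.

```python
# Sizes in GB from the Nena and Category rows (1TB treated as 1000GB).
STORES = {
    "Nena": {"idhash_capacity_gb": 1400, "model_size_gb": 26},
    "Category": {"idhash_capacity_gb": 400, "model_size_gb": 46},
}


def growth_headroom(capacity_gb: float, model_size_gb: float) -> float:
    """Assumed definition: how many times the current model fits in the capacity."""
    return capacity_gb / model_size_gb


for name, sizes in STORES.items():
    ratio = growth_headroom(sizes["idhash_capacity_gb"], sizes["model_size_gb"])
    print(f"{name}: {ratio:.1f}x")
```

Under this assumption the ratios come out near 54x and 8.7x, in line with the rounded 50x and 9x shown for these two stores.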

Roadmap

May 2014
Focus area: Read-only store; Perfect hash index; SSD
Metrics: Per machine capacity 500GB; Memory overhead 8.3 bytes; Partitions 16; Latency for 100: 2ms

Sep 2014
Focus area: Sparse index; Compression; In-memory mode; I/O and cache improvement
Metrics: Per machine capacity 1TB; Memory overhead 2.3 bytes; Partitions 16; Latency for 100: 2ms

Dec 2014
Focus area: Updatable store; Enhanced refresh support; Snapshot + delta mode
Metrics: Per machine capacity 1TB * 2; Memory overhead 2.3 bytes; Partitions 16; Latency for 100: 2ms

June 2015
Focus area: Partitioning; Streaming update; Communication stack
Metrics: Per machine capacity 1TB * 2; Memory overhead 2.3 bytes; Partitions 100; Latency for 100: 2ms

Scenarios
Nena, Jenny, CatStore (in PROD)
AAA Store (in PROD)
AdPredictUser Store (coding)
Eddy (coding)
Relevance Store (coding)

Project 10X: NRT Status and Roadmap - Project 10x Qi Review, 2014 Nov

KPIs

Slices: All; Bing O&O en-us PC; Y! O&O en-us PC; Y! Synd en-us PC
Column headers: Rev; Click and Impression, with Coverage, Accuracy, and Latency
Values in extraction order (cell positions are not recoverable): 99.63%, 98.96%, 99.86%, 99.51%, 99.71%, 98.11%, 0.02%, 99.97%, 98.41%, 99.10%, 0.13%, 99.97%, 99.15%, 99.53%, 0.06%, 99.98%, 98.69%, 99.98%, 96.56%, 8 sec, 0.93%, 8 sec

Roadmap

June 2014
Focus area: Streaming Join; Streaming Fraud
Metrics: Event delivery 99%; Fraud accuracy 98%; Latency 8sec @99%tile

Nov 2014
Focus area: ACK and Retry for cross DC event transmission; ML fraud model for more marketplace slices
Metrics: Event delivery 99.6%; Fraud accuracy 99%; Latency 8sec @99%tile

March 2015
Focus area: ACK and Retry across NRT stack; Streaming Join v2; ML fraud model improvement
Metrics: Event delivery 99.9%; Fraud accuracy 99.5%; Latency 8sec @99%tile

June 2015
Focus area: One Fraud Pipeline (LRP)
Metrics: Event delivery 99.99%; Fraud accuracy 99.9%; Latency 8sec @99%tile

Scenarios
NRT KPI for DE Deployment (in PROD, 5h -> 15min)
NRT MP metrics (in PROD, 5h -> 15min)
NRT Abacus Counting (in Flight, 16h -> 10sec, +1% CY)
NRT AdInsight (in Pilot)
Fraud decision feed for FastBI (Integration)
Online budget update (Design)
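The latency target on this roadmap is written as "8sec @99%tile". As a minimal sketch of how such a target could be checked, the code below computes a nearest-rank percentile over a list of event latencies. The function names and the sample data are hypothetical, and nearest-rank is only one common percentile convention. The slide does not say which one the NRT team uses.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def meets_latency_target(samples_sec, target_sec=8.0, pct=99):
    """True when the pct-th percentile latency is within the target."""
    return percentile_nearest_rank(samples_sec, pct) <= target_sec


# Hypothetical latencies in seconds: 99 fast events and one slow outlier.
latencies = [2.0] * 99 + [30.0]
print(percentile_nearest_rank(latencies, 99))
print(meets_latency_target(latencies))
```

With one outlier in a hundred events the 99th percentile stays at the fast value, so the target is still met. A second outlier would push the percentile past 8 seconds.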

Project 10X: Woodblocks Status and Roadmap - Project 10x Qi Review, 2014 Nov

Motivation: memorization to generalization, big data

Model: Abacus
Training data: 40M
Data type: Click and nonclick
# Features: 200
Feature type: Counting and semantic
Trainer: NN - single machine

Model: CDSSM
Training data: 50M
Data type: Click only
# Features: 300k
Feature type: Tri-letter
Trainer: NN single GPU

Model: TM
Training data: 2B (1 year)
Data type: Click only
# Features: 80M
Feature type: Term pair
Trainer: Cosmos based IBM model 1

Model: Woodblocks
Training data: 70B (1 year)
Data type: Click and nonclick
# Features: 120M
Feature type: Term pair and term matching/drop
Trainer: LR (L-BFGS) - ScopeML

Feature rank (three values listed for four models): 30 (on gbin slice); Neutral for CP; 1 (on gbin slice)

Roadmap

April 2014
Focus area: Training data 30B; Features 80M
Applications: Ads Click Prediction

Oct 2014
Focus area: Training data 70B; Features 120M
Applications: Ads Click Prediction; Ads Selection; Search MM

Jan 2015
Focus area: Training data 70B; Features 200M+; Spark based trainer (MSRA)
Applications: Product Ads; Ads Relevance

June 2015
Focus area: Training data 70B; Features 500M+ on LR; DNN model on big data, parameter server
Applications: More adoption in ads and search

Scenarios

Text Ads Click Prediction (Mainstreamed)
First released in April: MLCY +0.7%, MLIY -2.26%
Subsequent 3 releases, average MLCTR +0.5%

PA Click Prediction (Flighting)
For L1 ranker in Jenny: CY +0.86%, IY -4.31% (PA slice)
For L2 ranker: Offline AUC +5.3%

Search MM WPR (Mainstreamed)
+1.57 DSQ/UU for video, +0.88 DSQ/UU for image

Mobile CP, Selection, MSM/Eddy, Relevance
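As a rough illustration of the "term pair and term matching/drop" feature type listed for Woodblocks, the sketch below crosses query terms with ad terms and also marks which query terms are matched or dropped in the ad text. The function name, the whitespace tokenization, and the feature string format are all assumptions for illustration. The slides do not describe how Woodblocks actually builds these features.

```python
from itertools import product


def term_pair_features(query, ad_text):
    """Build hypothetical sparse feature names for one (query, ad) example."""
    query_terms = query.lower().split()
    ad_terms = ad_text.lower().split()
    ad_vocab = set(ad_terms)

    # Term pair: every query term crossed with every ad term.
    features = [f"pair:{q}|{a}" for q, a in product(query_terms, ad_terms)]
    # Term matching: query terms that also appear in the ad.
    features += [f"match:{q}" for q in query_terms if q in ad_vocab]
    # Term drop: query terms that the ad does not contain.
    features += [f"drop:{q}" for q in query_terms if q not in ad_vocab]
    return features


example = term_pair_features("cheap flights", "Discount Flights Today")
for name in example:
    print(name)
```

Crossing terms this way makes the feature space grow with vocabulary size, which is consistent with the feature counts on the slide being in the tens or hundreds of millions.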

Transcend Excellence

[Figure: Excellence plotted against Time, with "Today" marked]

Review framework from Qi Lu

1. Compare with past
2. Compare with your goal
3. Compare with competitor
4. Compare with perfect

How to promote a culture of technical breakthrough?

Every project should set a breakthrough ambition that is:
Breakthrough in impact. Think of the first version of IDHash.
Self-initiated.
Greater than 10% in chance of success.

It will take both depth and ambition to come up with these goals. Depth takes time and effort to build.
Use this opportunity to push and pull up the ambition: the ambition of being truly world-class, not just better than before.

Appendix

IDHash Design Doc
