
Modeling and Predicting Retweeting Dynamics on Microblogging Platforms

Shuai Gao, Jun Ma, Zhumin Chen
WSDM 2015

Sankarshan Mridha
Abir De
Reading Group
11/06/2015

1. Introduction
2. Problem Statement
3. Data Set
4. Point Process
   1. Basic Idea
   2. Different Types of Point Process
   3. Poisson Process
   4. Reinforced Poisson Process
   5. Extended Reinforced Poisson Process
      1. Model Formulation
      2. Parameter Estimation
      3. Prediction
   6. Result


INTRODUCTION

Popularity prediction is an active research topic.

Existing work focuses only on finding effective features for prediction.

It ignores the underlying arrival process of the events (retweets).


PROBLEM STATEMENT
To model the retweeting dynamics of a message using data from the training period.

To use this model to predict the popularity of that message in the future.

CONTD

The retweeting dynamics of a message m up to $T_i$ is characterized by the set of time moments $\{t_k^m\}$ at which each retweet arrives.

Prediction problem: for a message m, given its retweeting dynamics $\{t_k^m\}$ up to the indicator time $T_i$, predict its popularity at the reference time $T_r$.
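
As a minimal sketch of this setup, assume the retweets of a message are stored as a plain list of arrival times (hours since posting is a hypothetical unit) and that popularity means the retweet count. Under those assumptions the observed dynamics and the prediction target can be written as below; the function names are illustrative and do not come from the paper.

```python
from bisect import bisect_right


def observed_dynamics(retweet_times, t_indicator):
    """Retweet times {t_k} that arrived no later than the indicator time T_i."""
    times = sorted(retweet_times)
    return times[:bisect_right(times, t_indicator)]


def popularity(retweet_times, t_reference):
    """Assumed definition: number of retweets that arrived by the reference time T_r."""
    return bisect_right(sorted(retweet_times), t_reference)


# Hypothetical arrival times, in hours since the message was posted.
times = [5.0, 1.0, 3.0, 10.0, 40.0]
train = observed_dynamics(times, t_indicator=6.0)  # what the model is allowed to see
target = popularity(times, t_reference=24.0)       # what the model has to predict
print(train, target)
```

Only `train` is available at prediction time; `target` is what the fitted model is evaluated against.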


DATA SET

Two datasets of Weibo messages from July 2013.

Random: 0.8 million original messages from 10K random users; 10K random messages with a retweet count in [50, 20K] are drawn from this set.

News: all original messages from 25 news accounts; 18K messages with a retweet count in [50, 20K] are drawn from this set.


POINT PROCESS (BASIC IDEA)

A point process is a random collection of points.
Each point represents the time and/or location of an event.
E.g., a lightning strike or an earthquake.


TYPES OF POINT PROCESS

Simple Point Process


Temporal Point Process
Marked Point Process


POISSON PROCESS

It is a simple point process.

N(t) is a (homogeneous) Poisson process with rate λ if the number of events in [0, t] follows a Poisson distribution with mean λt and the counts in disjoint intervals are independent.
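
A homogeneous Poisson process can be simulated by drawing independent exponential gaps between consecutive events. The sketch below does that and checks empirically that the count in [0, t] has mean and variance close to rate × t, as a Poisson count should. The rate, horizon, and seed are hypothetical values chosen only for illustration.

```python
import random


def simulate_poisson_process(rate, horizon, rng):
    """Event times of a homogeneous Poisson process on [0, horizon].

    Gaps between consecutive events are independent Exponential(rate) draws.
    """
    times = []
    t = rng.expovariate(rate)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


rng = random.Random(0)    # fixed seed so the run is repeatable
rate, horizon = 2.0, 5.0  # hypothetical values
counts = [len(simulate_poisson_process(rate, horizon, rng)) for _ in range(5000)]

mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
# For a Poisson count, both should be close to rate * horizon = 10.
print(mean, var)
```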


REINFORCED POISSON PROCESS [Shen et al. 2014]


Generative probabilistic model.
Salient features:
Item fitness
Aging effect
Reinforcement mechanism (rich-get-richer phenomenon)

Rate equation:
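
In Shen et al. (2014), the three ingredients listed above enter the rate as a product. The block below restates that form; the symbol names follow that paper and are not taken from this slide, so treat the notation as an assumption.

```latex
% Rate at which item d attracts attention at time t (notation of Shen et al. 2014)
x_d(t) = \lambda_d \, f_d(t; \theta_d) \, i_d(t)
```

Here $\lambda_d$ is the fitness of item d, $f_d(t; \theta_d)$ is the relaxation function that captures aging, and $i_d(t)$ is the attention received so far, which grows with every new event and so produces the reinforcement effect.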


[Shen et al. 2014] Huawei Shen, Dashun Wang, Chaoming Song, Albert-László Barabási. Modeling and Predicting Popularity Dynamics via Reinforced Poisson Processes. AAAI 2014.


EXTENDED REINFORCED POISSON PROCESS


Gao et al. (2015) extend the reinforced Poisson process (RPP) of Shen et al. (2014) to retweeting dynamics.

Power-law temporal relaxation function instead of a log-normal relaxation function.

Exponential reinforcement function instead of a linear reinforcement function.

Rate equation:


MODEL FORMULATION
Given that the $(k-1)$th retweet arrives at $t_{k-1}^m$, the probability that the $k$th retweet arrives at $t_k^m$ follows:

The probability that no retweet arrives between $t_{n_m}^m$ and $T_i$ is

The likelihood of observing the retweeting dynamics $\{t_k^m\}$ up to $T_i$ follows
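
For a point process with conditional rate $x^m(t)$, the three quantities named on this slide, together with the resulting log-likelihood, take the standard form below. This is a generic sketch: the symbol $x^m(t)$ and the convention $t_0^m = 0$ are assumptions, and the expressions are not necessarily written the way the authors write them.

```latex
% x^m(t): conditional retweeting rate of message m (assumed symbol), with t_0^m = 0
\begin{align*}
p\left(t_k^m \mid t_{k-1}^m\right)
  &= x^m(t_k^m)\,\exp\left(-\int_{t_{k-1}^m}^{t_k^m} x^m(\tau)\,d\tau\right) \\
P\left(\text{no retweet in } (t_{n_m}^m, T_i]\right)
  &= \exp\left(-\int_{t_{n_m}^m}^{T_i} x^m(\tau)\,d\tau\right) \\
\mathcal{L}
  &= \left[\prod_{k=1}^{n_m} p\left(t_k^m \mid t_{k-1}^m\right)\right]
     \exp\left(-\int_{t_{n_m}^m}^{T_i} x^m(\tau)\,d\tau\right) \\
\log \mathcal{L}
  &= \sum_{k=1}^{n_m} \log x^m(t_k^m) - \int_{0}^{T_i} x^m(\tau)\,d\tau
\end{align*}
```

The last line follows from the third because the integrals over consecutive intervals add up to a single integral over $[0, T_i]$.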


CONTD
The log-likelihood of the retweeting dynamics $\{t_k^m\}$ up to $T_i$ is

where


PARAMETER ESTIMATION (c*, *, *)


Maximizing the log-likelihood function:
For the two parameters other than c, the optimal values can be found by maximizing the log-likelihood using the gradient ascent method.

The two step sizes (subscripts 1 and 2) are the learning rates at each iteration.
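
The gradient ascent step can be illustrated on a deliberately simplified model. The sketch below assumes a rate x(t) = c (1 + t)^(-theta), that is, power-law relaxation only and no reinforcement term, which is not the paper's full rate. For this assumed rate the optimal c has the closed form n / ∫ (1 + t)^(-theta) dt, so only theta is updated by gradient ascent. All function names, the data, and the learning rate are hypothetical.

```python
import math


def relaxation_integral(theta, horizon):
    """Integral of (1 + t) ** -theta over [0, horizon]."""
    a = 1.0 - theta
    log_span = math.log1p(horizon)
    if abs(a) < 1e-12:  # theta == 1: the integrand is 1 / (1 + t)
        return log_span
    return math.expm1(a * log_span) / a


def profile_log_likelihood(theta, times, horizon):
    """Log-likelihood in theta with c replaced by its closed-form optimum."""
    n = len(times)
    integral = relaxation_integral(theta, horizon)
    c_star = n / integral
    return (n * math.log(c_star)
            - theta * sum(math.log1p(t) for t in times)
            - c_star * integral)


def fit_theta(times, horizon, theta=0.5, learning_rate=0.02, steps=2000, h=1e-5):
    """Gradient ascent on theta using a central-difference gradient."""
    for _ in range(steps):
        grad = (profile_log_likelihood(theta + h, times, horizon)
                - profile_log_likelihood(theta - h, times, horizon)) / (2 * h)
        theta += learning_rate * grad
    return theta


# Hypothetical retweet times observed up to horizon T_i = 20.
times = [0.2, 0.5, 0.9, 1.5, 2.4, 4.0, 7.0, 12.0]
horizon = 20.0
theta_hat = fit_theta(times, horizon)
c_hat = len(times) / relaxation_integral(theta_hat, horizon)
print(theta_hat, c_hat)
```

A fixed learning rate is enough here because the profile log-likelihood of this toy model is concave in theta; a model with two coupled parameters would use one learning rate per parameter, as the slide describes.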


PREDICTION

The goal is to predict the expected number of retweets N(t) of message m at any given time t.

Solving this,



RESULT

OUTPUT


SH: linear regression for logarithmic popularity
ML: multivariate linear regression method
LL: RPP model with log-normal relaxation
PL: RPP model with power-law relaxation and linear reinforcement function
PE: RPP model with power-law relaxation and exponential reinforcement function

FIN
