DSC4213-2016-4 Notes

DSC4213 Analytical Tools for Consulting
Session 4 Modeling Demand and Choice: The Multinomial Logit Model
Prof. WANG Tong

Dept. of Decision Sciences
NUS Business School
Estimating Demand -- Data Sources
• Sales transaction data

– POS
– Loyalty program
– Web clicks, sensor networks
• Third-party data
– Industry research (A. C. Nielsen, etc.)
– Macro-economic data
• Survey/experimental data
– Questionnaires
– Live experiments
• Expert Judgment
Aug 2016 DSC4213 Session 4 - Prof. WANG Tong 2

NYHC
• Survey data on WTP

– Multiple products: 6 time slots
– Multiple segments: student, others
Maximum WTP ($) of each client for each time period, and maximum over the day

Client ID   6am-9am   9am-noon   Noon-2pm   2pm-5pm   5pm-9pm   9pm-close   Max. WTP (Day)
 1            75        25         50         25        75        50          75
 2            25        25         25         25        75        75          75
 3            25        25         25         25        50        50          50
 4            25        25         25         25        50        75          75
 5            75        75         75         75        75        75          75
 6            25        25         25         25        75        75          75
 7            25        25         25         25        25        25          25
 8            50        50         50         50        50        50          50
 9           100       100        100        100       100       100         100
10           100       100        100        100       100       100         100
11            25        25         25         25        50        25          50
12            50        75        150         75       150        75         150
13            25        25         25         25        25        25          25
14            50        75        100        100       125       100         125
15           100       125        125        125       125       100         125
Willingness-To-Pay (WTP)
• WTP (a.k.a. reservation price, valuation) is the maximum amount a potential customer is willing to pay for a product
– Customer will buy if WTP ≥ p (surplus WTP – p ≥ 0)
– Customer will not buy if WTP < p (surplus WTP – p < 0)
• If the distribution of WTP in a given population of size N is F(w) = Pr(WTP < w), then
Pr(one buys at price p) = Pr(WTP ≥ p) = 1 – F(p)
so expected demand at price p is D(p) = N (1 – F(p))

The Choice Process
• Customer assigns utilities (WTP) for each alternative i
ui = utility for alternative i
• Price of alternative i
pi = price for alternative i
• Deterministic choice: customer picks the alternative with the highest Net
Utility (surplus)
max {0, u1-p1, u2-p2, … , un-pn}
• Customer does not buy if all net utilities are negative
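The deterministic rule above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original notes: the function name `choose`, the tie-breaking rule (the earliest alternative with the highest surplus wins), and the uniform $50 price are assumptions. The WTP values are those of Clients 1 and 7 in the NYHC survey table.

```python
SLOTS = ["6am-9am", "9am-noon", "Noon-2pm", "2pm-5pm", "5pm-9pm", "9pm-close"]

def choose(wtp, prices):
    """Return the index of the alternative with the highest net utility,
    or None when every net utility is negative (no purchase)."""
    surplus = [u - p for u, p in zip(wtp, prices)]
    # Assumption: ties go to the earliest alternative in the list.
    best = max(range(len(surplus)), key=lambda i: surplus[i])
    return best if surplus[best] >= 0 else None

client_1 = [75, 25, 50, 25, 75, 50]   # WTP of Client 1 for the six slots
client_7 = [25, 25, 25, 25, 25, 25]   # WTP of Client 7
prices = [50] * 6                     # assumed: every slot priced at $50

pick = choose(client_1, prices)
print(SLOTS[pick] if pick is not None else "no purchase")  # 6am-9am
print(choose(client_7, prices))                            # None
```

Client 1 is indifferent between 6am-9am and 5pm-9pm (surplus 25 each), so the outcome depends on the assumed tie-breaking rule.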

NYHC Q1: single price
• Histogram of Max WTP
[Figure: histogram of Max WTP; horizontal axis 25 to 150 in steps of 25, vertical axis 0 to 40]

• Demand and revenue

• P* = $75
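As a rough check of the single-price logic, the sketch below computes demand and revenue at each candidate price from the Max. WTP column of the 15 clients listed in the survey table. The slide's P* comes from the full survey, so this subsample is only an illustration; the function names are assumptions.

```python
# Max. WTP over the day for the 15 clients shown in the survey table
max_wtp = [75, 75, 50, 75, 75, 75, 25, 50, 100, 100, 50, 150, 25, 125, 125]

def demand(price, wtps):
    """Number of clients whose WTP is at least the price."""
    return sum(1 for w in wtps if w >= price)

def best_single_price(wtps):
    """Evaluate revenue at every observed WTP level and return the best price."""
    candidates = sorted(set(wtps))
    revenue = {p: p * demand(p, wtps) for p in candidates}
    best = max(revenue, key=revenue.get)
    return best, revenue

best, revenue = best_single_price(max_wtp)
print(revenue)  # {25: 375, 50: 650, 75: 750, 100: 500, 125: 375, 150: 150}
print(best)     # 75
```

Only observed WTP levels are tried as prices, because under deterministic choice revenue can only rise as the price moves up to the next WTP level without losing a buyer.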

Estimation -- Parametric vs. nonparametric
• Nonparametric estimation
– Use the raw data directly
• Empirical histogram of demand volume
• Empirical histogram of reservation prices
– Pros: no assumptions, directly from observed data
– Cons: cannot extrapolate beyond data, hard to optimize, over-fitting
• Parametric estimation
– Assume a demand model with a modest number of parameters, and estimate the parameters from data
• Linear: D(p) = a – bp
• Normal: D ~ N(µ, σ²)
– Pros: concise description, mathematically tractable, can extrapolate beyond data
– Cons: assumptions on the form of response may not be valid

Parametric Demand Models
• Demand functions D(p)
– Non-negative
– Downward sloping
– Continuous (hopefully)
– Differentiable (ideally)
• Exceptions not considered
– Luxury goods
– Giffen goods
– Goods whose price affects perceived quality
• Measures of price effect
– Price sensitivity: increase p by $1, demand decreases by ?
• Unit dependent
– Price elasticity: increase p by 1%, demand decreases by ?%
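To make the two measures concrete, the sketch below evaluates them for the linear demand D(p) = 11362 - 84.755p that these notes fit to the NYHC data. The function names are assumptions. Sensitivity is the slope dD/dp (in units of demand per dollar), while elasticity is the unit-free ratio (dD/dp)·p/D.

```python
a, b = 11362.0, 84.755   # linear fit quoted in these notes

def demand(p):
    return a - b * p

def sensitivity(p):
    """Change in demand per $1 price increase; constant for linear demand."""
    return -b

def elasticity(p):
    """Percentage change in demand per 1% price increase."""
    return sensitivity(p) * p / demand(p)

p_star = a / (2 * b)     # revenue-maximizing price for linear demand
print(sensitivity(50.0))               # -84.755
print(round(elasticity(50.0), 3))
print(round(elasticity(p_star), 6))    # -1.0
```

At the revenue-maximizing price of a linear demand curve the elasticity is exactly -1, which is a quick sanity check on the formula P* = a/2b.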

Common Demand Functions (Single-Product)
                    Linear                 Exponential            Iso-elastic
Demand              D(p) = a – bp          D(p) = exp(a – bp)     D(p) = a p^(–e), e > 1
WTP distribution    Uniform                Exponential
Estimation          D = a – bp             Log(D) = a – bp        Log(D) = Log(a) – e Log(p)
                    Constant sensitivity                          Constant elasticity
Optimization        P* = a/2b              P* = 1/b               P* = c·e/(e – 1) (c is unit cost)
[Figure: plots of demand D and elasticity |ϵ| for each of the three demand functions]
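The estimation row amounts to three straight-line regressions. The sketch below is an illustration under stated assumptions: it uses the demand counts implied by the Max. WTP column of the 15 clients shown in the survey table (number of clients with WTP at or above each price), not the full data set behind the fitted curves on the following slides, and it uses `numpy.polyfit` for ordinary least squares.

```python
import numpy as np

# Demand at each price level implied by the 15 listed clients' Max. WTP
prices = np.array([25, 50, 75, 100, 125, 150], dtype=float)
demand = np.array([15, 13, 10, 5, 3, 1], dtype=float)

# Linear: D = a - b p
slope, intercept = np.polyfit(prices, demand, 1)
a_lin, b_lin = intercept, -slope

# Exponential: Log(D) = a - b p
slope, intercept = np.polyfit(prices, np.log(demand), 1)
a_exp, b_exp = intercept, -slope

# Iso-elastic: Log(D) = Log(a) - e Log(p)
slope, intercept = np.polyfit(np.log(prices), np.log(demand), 1)
a_iso, e_iso = np.exp(intercept), -slope

print(f"linear:      D = {a_lin:.2f} - {b_lin:.3f} p")
print(f"exponential: D = exp({a_exp:.3f} - {b_exp:.4f} p)")
print(f"iso-elastic: D = {a_iso:.1f} * p^(-{e_iso:.3f})")
```

With only six points the fits are crude; the point is the transformation applied before each regression, not the coefficient values.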

• Linear demand: D(p) = 11362 - 84.755p

• P* = a/2b = $67

• Exponential demand: D(p) = exp(10.4 – 0.0348p)

• P* = 1/b = $29

• Iso-elastic demand: D(p) = exp(13.584) * p^(-1.434)

• P* = ??
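The closed-form prices from the table can be checked numerically. In the sketch below the function names are assumptions. The linear and exponential formulas maximize revenue, while the iso-elastic formula maximizes profit and needs a unit cost c; the notes do not give one, so the value c = 10 is purely hypothetical and is there only to show the calculation.

```python
def linear_opt(a, b):
    """Revenue-maximizing price for D(p) = a - b p."""
    return a / (2 * b)

def exponential_opt(b):
    """Revenue-maximizing price for D(p) = exp(a - b p)."""
    return 1 / b

def isoelastic_opt(e, c):
    """Profit-maximizing price for D(p) = a p^(-e) with unit cost c (needs e > 1)."""
    if e <= 1:
        raise ValueError("elasticity e must exceed 1 for a finite optimum")
    return c * e / (e - 1)

print(round(linear_opt(11362, 84.755), 2))    # 67.03
print(round(exponential_opt(0.0348), 2))      # 28.74
print(round(1.434 / (1.434 - 1), 2))          # 3.3, the markup factor e/(e-1)
print(round(isoelastic_opt(1.434, 10.0), 2))  # 33.04, for the hypothetical c = 10
```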

NYHC Q2
• Price 5pm-9pm and fix other slots at $50

NYHC Q2
• Choice by Net Utility maximization

• P* = $60

NYHC Q3
• Price both 5pm-9pm and other slots

Modeling Choice
• What if there are multiple alternatives to price?
• Examples:
– Multiple versions of a product
– Competing products
• Models:
– Multi-product demand functions
– Discrete choice models

Multi-product Demand Functions
• Two-product linear demand
d1(p1, p2) = a1 – b11 p1 + b12 p2

d2(p1, p2) = a2 – b22 p2 + b21 p1
– Parameter estimation can be done by regressing each demand against both prices
• Multi-product iso-elastic demand functions are also available
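The regression mentioned above can be sketched as follows. No two-product data set is given in these notes, so the example generates noise-free synthetic data from hypothetical coefficients (a1 = 100, b11 = 2, b12 = 0.5) and checks that least squares recovers them; every number here is an assumption for illustration only.

```python
import numpy as np

# Hypothetical "true" coefficients, used only to generate illustrative data
a1, b11, b12 = 100.0, 2.0, 0.5

rng = np.random.default_rng(0)
p1 = rng.uniform(10, 30, size=50)   # synthetic prices of product 1
p2 = rng.uniform(10, 30, size=50)   # synthetic prices of product 2
d1 = a1 - b11 * p1 + b12 * p2       # noise-free synthetic demand for product 1

# Regress d1 on an intercept and both prices
X = np.column_stack([np.ones_like(p1), p1, p2])
coef, *_ = np.linalg.lstsq(X, d1, rcond=None)
a1_hat, b11_hat, b12_hat = coef[0], -coef[1], coef[2]

print(round(a1_hat, 3), round(b11_hat, 3), round(b12_hat, 3))
```

The same regression, run with d2 as the response, would give a2, b22 and b21.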

Multinomial Logit (MNL) Choice Model
• Incorporate a random component in the net utility
utility for i = ui – pi + ξi
• Random components ξi are assumed to be independent and to have a Gumbel distribution with mean zero and variance σ² = s²π²/6
• Given n alternatives, then
\[ \Pr(\text{chooses } i) = \frac{V_i}{1 + V_1 + \cdots + V_n}, \qquad V_i = e^{(u_i - p_i)/s} \]

• Demo of the MNL model: 5pm-9pm vs others
Choices        5pm-9pm    Others
WTP            $100       $75
Price          P          $50
Net Utility    100 – P    25
• Under deterministic choice, as long as P ≤ 75, Pr(choose 5pm-9pm) = 1
• Under MNL, the probability looks like

• MNL choice probability for different levels of uncertainty s
[Figure: Pr(choose 5pm-9pm) on a vertical axis from 0 to 1 against a horizontal axis from 0 to 148, with curves for deterministic choice and for s = 1, 10, 100, 10000]
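The demo can be reproduced with a short function. This is a sketch under the demo's stated numbers (WTP $100 for 5pm-9pm, WTP $75 and price $50 for the other slots, plus a no-purchase option with net utility 0); the function name and the max-subtraction trick for numerical stability are assumptions, not part of the notes.

```python
import math

def prob_evening(P, s, u_evening=100.0, u_other=75.0, p_other=50.0):
    """MNL probability of choosing 5pm-9pm at price P with scale s."""
    # Scaled net utilities: no purchase, 5pm-9pm, others
    nets = [0.0, (u_evening - P) / s, (u_other - p_other) / s]
    m = max(nets)                          # subtract the max to avoid overflow
    weights = [math.exp(x - m) for x in nets]
    return weights[1] / sum(weights)

for s in (1, 10, 100, 10000):
    print(s, [round(prob_evening(P, s), 3) for P in (50, 75, 100)])
```

For small s the probability drops sharply around P = 75, close to the deterministic step; for very large s the three options become nearly equally likely regardless of price.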
NYHC B: Estimate MNL from WTP data
• Determine ui based on data
ui = average(WTPi)
• s is derived from average of var(WTPi)
• Prediction
– Calculate Vi = exp( (ui – pi) / s)
– Purchase probability
Pr(buy i) = Vi / (1 + V1 + V2 + … + Vn)
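The recipe above can be sketched with the 15 clients listed in the WTP survey table. Two choices here are assumptions rather than the notes' stated method: s is backed out of the average variance through the Gumbel relation σ² = s²π²/6, and the population variance (`ddof=0`) is used. The full survey would give different numbers.

```python
import numpy as np

# WTP of the 15 listed clients; columns are the six time slots in table order
wtp = np.array([
    [75, 25, 50, 25, 75, 50], [25, 25, 25, 25, 75, 75], [25, 25, 25, 25, 50, 50],
    [25, 25, 25, 25, 50, 75], [75, 75, 75, 75, 75, 75], [25, 25, 25, 25, 75, 75],
    [25, 25, 25, 25, 25, 25], [50, 50, 50, 50, 50, 50], [100, 100, 100, 100, 100, 100],
    [100, 100, 100, 100, 100, 100], [25, 25, 25, 25, 50, 25], [50, 75, 150, 75, 150, 75],
    [25, 25, 25, 25, 25, 25], [50, 75, 100, 100, 125, 100], [100, 125, 125, 125, 125, 100],
], dtype=float)

u = wtp.mean(axis=0)                  # ui = average(WTPi)
avg_var = wtp.var(axis=0).mean()      # average of var(WTPi), population variance
s = np.sqrt(6.0 * avg_var) / np.pi    # assumed mapping: sigma^2 = s^2 pi^2 / 6

def purchase_probs(prices):
    """Return (probabilities for the six slots, no-purchase probability)."""
    z = np.append((u - np.asarray(prices, dtype=float)) / s, 0.0)
    w = np.exp(z - z.max())
    p = w / w.sum()
    return p[:-1], p[-1]

slot_probs, p_none = purchase_probs([50] * 6)
print(np.round(u, 2), round(float(s), 2))
print(np.round(slot_probs, 3), round(float(p_none), 3))
```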

Challenges in using MNL
• Model
– Independence of Irrelevant Alternatives (IIA)
• Optimization
– Not trivial
• Data & Estimation

– Often WTP is not available
– But we have actual choices from sales data
– Maximum Likelihood Estimation (MLE)

Model
• Independence of Irrelevant Alternatives (IIA)

– Probi / Probj depends only on their own net utilities NUi and NUj, not on other
alternatives (proportional substitution)
– Was the original motivation of the logit model
– Not always appealing, e.g. the red-bus/blue-bus problem
• Choice between Car and Red Bus, same NU => 50%-50% choice
• If Blue Bus is added as a third option, again with the same NU
– Intuition: 1/2, 1/4, 1/4
– MNL prediction of choice probabilities: 1/3, 1/3, 1/3
• Extensions
– Probit model: multivariate Normal ξi
– Generalized extreme value (GEV) model: allow correlation in ξi
– Nested logit: allow hierarchical choices
– Mixed logit: individual variations
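The red-bus/blue-bus numbers follow directly from the logit formula. The sketch below mirrors the slide's example, which has no no-purchase option, so the "1 +" term is dropped from the denominator; the function name and the common net utility of 0 are assumptions.

```python
import numpy as np

def mnl_shares(net_utils, s=1.0):
    """MNL choice shares among the listed alternatives (no outside option)."""
    z = np.asarray(net_utils, dtype=float) / s
    w = np.exp(z - z.max())
    return w / w.sum()

two = mnl_shares([0.0, 0.0])          # Car, Red Bus
three = mnl_shares([0.0, 0.0, 0.0])   # Car, Red Bus, Blue Bus
print(two)     # [0.5 0.5]
print(three)   # each share equals 1/3

# IIA: the Car / Red Bus ratio is unchanged by adding Blue Bus
print(two[0] / two[1], three[0] / three[1])
```

The Car share falls from 1/2 to 1/3 even though the new alternative is a perfect substitute for Red Bus only, which is the behaviour the extensions listed above are designed to avoid.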

Optimization
• Given an MNL demand model, we choose prices Pi to maximize total revenue
\[ \text{Revenue} = P_1 D_1(P) + P_2 D_2(P) + \cdots + P_n D_n(P) = N \cdot \frac{P_1 V_1(P_1) + P_2 V_2(P_2) + \cdots + P_n V_n(P_n)}{1 + V_1(P_1) + V_2(P_2) + \cdots + V_n(P_n)} \]
• The revenue function is in general not concave or unimodal
– Generic solvers will not work (reliably)
– Ad-hoc algorithms exist (Hanson and Martin, 1996, Mgmt. Sci.)
• Special cases that are solvable:

– When there is only one option to price (Fjord case)
– When there are multiple options, but all prices are the same

Estimation based on choice data
• NYHC Data
– Price =$50 for all slots
– Observed choices
Client   6am-9am   9am-noon   Noon-2pm   2pm-5pm   5pm-9pm   9pm-close   No-purchase
 1         1         0          0          0         0         0           0
 2         0         0          0          0         1         0           0
 3         1         0          0          0         0         0           0
 4         0         0          0          0         1         0           0
 5         0         0          0          0         1         0           0
 6         0         0          1          0         0         0           0
 7         0         0          0          0         0         1           0
 8         0         0          0          0         0         1           0
 9         0         0          0          0         1         0           0
10         0         0          0          0         0         1           0
11         0         0          0          0         1         0           0
12         0         0          0          0         0         0           1
13         0         0          0          0         1         0           0
14         0         0          0          0         0         0           1
15         0         0          0          1         0         0           0

Maximum Likelihood Estimation (MLE)
• A standard statistical method for parameter estimation

• Brief idea: find the parameter values under which the observed data are most likely
• Quick example
– Two possible models: f(x | u1) and f(x | u2)
– Data: x1, x2, …, xn
– Likelihood of the two models:
• f(data|u1) = f(x1|u1) f(x2|u1) … f(xn|u1) = L(u1|data)
• f(data|u2) = f(x1|u2) f(x2|u2) … f(xn|u2) = L(u2|data)
– Choose the one with max likelihood
• Often max log(Likelihood) instead

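• Recall that in MNL, the probability of choice is
\[ \Pr(\text{chooses } i \mid u, s) = \frac{V_i}{1 + V_1 + \cdots + V_n}, \qquad V_i = e^{(u_i - p_i)/s} \]
Client   6am-9am   9am-noon   Noon-2pm   2pm-5pm   5pm-9pm   9pm-close   No-purchase
  1        1         0          0          0         0         0           0
  2        0         0          0          0         1         0           0
  …        …         …          …          …         …         …           …
100        0         0          0          0         0         0           1
• Likelihood of Client 1 choosing 1 and Client 2 choosing 5 is
\[ \Pr(\text{chooses } 1 \mid u, s) \cdot \Pr(\text{chooses } 5 \mid u, s) \]
• Likelihood of seeing the whole data set is
\[ \Pr(\text{chooses } 1 \mid u, s) \cdot \Pr(\text{chooses } 5 \mid u, s) \cdots \Pr(\text{chooses } 0 \mid u, s) = L(u, s \mid \text{data}) \]
• Solving \( \max_{u,s} L(u, s \mid \text{data}) \) or \( \max_{u,s} \log L(u, s \mid \text{data}) \) yields the maximum likelihood estimate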
s = 9.888807
Total Log Likelihood = -75.98299317

            6am-9am    9am-noon   Noon-2pm   2pm-5pm    5pm-9pm    9pm-close   No Purch.
Price       $50.00     $50.00     $50.00     $50.00     $50.00     $50.00
uj          $55.05     $34.08     $44.95     $39.14     $55.81     $50.64
Net util.   $5.05      -$15.92    -$5.05     -$10.86    $5.81      $0.64       0
vj          1.666646   0.199999   0.599996   0.333329   1.79998    1.066657    1
pj          0.249999   0.03       0.09       0.05       0.269999   0.16        0.150001

Client   6am-9am   9am-noon   Noon-2pm   2pm-5pm   5pm-9pm   9pm-close   No Purch.   Likelihood
1          0         0          0          0         0         1           0           0.15999999
2          1         0          0          0         0         0           0           0.249999182
3          0         0          0          0         1         0           0           0.269999435
4          0         0          0          0         1         0           0           0.269999435
5          1         0          0          0         0         0           0           0.249999182
6          0         0          0          0         0         1           0           0.15999999
Client   6am-9am   9am-noon   Noon-2pm   2pm-5pm   5pm-9pm   9pm-close   No-purchase
  1        1         0          0          0         0         0           0
  2        0         0          0          0         1         0           0
  …        …         …          …          …         …         …           …
100        0         0          0          0         0         0           1
• Aggregate demand
Demand     25        3          9          5         27        16          15
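• Likelihood is
\[ L(u, s \mid \text{data}) = \prod_{i=0}^{n} \Pr(\text{chooses } i \mid u, s)^{\text{Demand}_i} \]
• Log-Likelihood is
\[ LL(u, s \mid \text{data}) = \sum_{i=0}^{n} \text{Demand}_i \cdot \log \Pr(\text{chooses } i \mid u, s) \]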
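The aggregate log-likelihood can be maximized with a few lines of NumPy. This is a sketch, not the spreadsheet procedure behind the notes: it uses plain gradient ascent on the scaled net utilities (ui - pi)/s, and it holds s fixed at the value 9.888807 reported in the notes, because with every price equal to $50 the choice data identify only the ratio (ui - pi)/s and not u and s separately. The step size and iteration count are assumptions.

```python
import numpy as np

# Aggregate demand for the six slots, followed by no-purchase
demand = np.array([25, 3, 9, 5, 27, 16, 15], dtype=float)
price, s = 50.0, 9.888807      # s held at the value reported in the notes

def choice_probs(theta):
    """MNL probabilities; theta holds (u_i - p_i)/s, no-purchase fixed at 0."""
    z = np.append(theta, 0.0)
    w = np.exp(z - z.max())
    return w / w.sum()

def log_likelihood(theta):
    """LL = sum_i Demand_i * log Pr(chooses i), natural logarithm."""
    return float(demand @ np.log(choice_probs(theta)))

def fit(steps=5000):
    theta = np.zeros(6)
    n = demand.sum()
    for _ in range(steps):
        p = choice_probs(theta)
        # Gradient of LL w.r.t. theta_i is Demand_i - n * p_i; step size 1/n
        theta += (demand[:6] - n * p[:6]) / n
    return theta

theta_hat = fit()
u_hat = price + s * theta_hat
ll_natural = log_likelihood(theta_hat)
ll_base10 = ll_natural / np.log(10.0)
print(np.round(u_hat, 2))   # [55.05 34.08 44.95 39.14 55.81 50.64]
print(round(ll_natural, 2), round(ll_base10, 2))
```

At the optimum the fitted probabilities equal the observed shares Demand_i / 100, so u_i = p + s·ln(Demand_i / Demand_0). The total of -75.98 reported in the notes matches the base-10 value computed here, which suggests the spreadsheet used a base-10 logarithm; the maximizer is the same under either base.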

MNL with more explanatory variables
• Utility can be a function of various explanatory variables

\[ u_i = a_1 x_{1i} + a_2 x_{2i} + \cdots + a_K x_{Ki} \]
– \( x_{ki} \) is attribute k for alternative i
– \( a_k \) is the unknown coefficient for attribute k
• Can be similarly estimated by MLE => Logistic Regression

Summary
• Single-product demand models

– WTP distribution ⇔ Demand curve
– Estimation: parametric/nonparametric, transaction/survey
– Linear/Exponential/Iso-elastic demand
• Discrete Choice Models

– Deterministic choice: maximize net utility ui – pi
– Random utility model: ui – pi + ξi
• Multinomial logit: ξi independent, zero mean, σ² = s²π²/6
• Challenges and solutions in applying MNL

– Model: IIA
– Optimization
– Estimation: MLE for choice data
