
Question 1

235 commuters have each been asked to choose between two different train journeys, "A"
and "B". The journeys vary in time, price, number of shifts and comfort level; journey "A"
(and "B") is thus not the same for different commuters. The journeys last between
55 minutes and 4 hours, with prices ranging from 1 to 100 euros.

A sensible model for such data is that the individual commuter selects the option / journey
which gives her / him the greatest "utility". If we let

denote the utilities of options "A" and "B", then option "A" is chosen if 𝑈𝐴 > 𝑈𝐵 .
In (1), 𝑥𝐴 denotes the covariates (one or more among price, travel time, number of shifts, comfort
level) for option "A", while 𝑒𝐴 is a random variable that is interpreted as the person-specific
part of the utility (i.e. differences between commuters); likewise for 𝑥𝐵 and 𝑒𝐵 . It is
assumed that 𝑒𝐴 and 𝑒𝐵 are independent and identically distributed, and that the distribution of
𝑒𝐴 is continuous with a strictly positive density on all of R.

In practice, we only observe whether 𝑈𝐴 is greater than 𝑈𝐵 (or, equivalently, whether
𝑈𝐴 − 𝑈𝐵 is greater than 0), along with the values of the explanatory variables.
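
The observation model described above can be written out as a short derivation. This is only a sketch: it assumes that (1), which is not reproduced here, has the linear form U_A = β'x_A + e_A and U_B = β'x_B + e_B with a common coefficient vector β, and it introduces F as the distribution function of e_B − e_A. Both the assumed form of (1) and the symbol F are illustrative assumptions.

```latex
\begin{align*}
P(\text{``A'' chosen}) &= P(U_A > U_B) \\
  &= P\bigl(e_B - e_A < \beta^\top (x_A - x_B)\bigr) \\
  &= F\bigl(\beta^\top (x_A - x_B)\bigr),
  \qquad F(t) = P(e_B - e_A \le t).
\end{align*}
```

Since 𝑒𝐴 and 𝑒𝐵 are independent with the same continuous, strictly positive density, their difference is symmetric about 0 and also has a strictly positive density. Under the assumed form, F is therefore strictly increasing with F(0) = 1/2 and F(−t) = 1 − F(t), so g = F^{-1} can serve as a link function, with g(1/2) = 0 and g(1 − p) = −g(p), and the covariates enter only through the differences x_A − x_B.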

1. Explain how (1) implies that the statistical model for the observed choices is a
generalized linear model with linear predictor given as

and a link function g which satisfies:

In the data set, the commuters’ preference (choice of "A" or "B") is found in the
variable “valg”. The difference between the price of alternative "A" and the price of
alternative "B" is found in “pris.diff”; a positive value means that option "A" is more
expensive than option "B".

2. Examine how the probability of selecting option "A" depends on the difference
in price. Use this to formulate a statistical model for how the probability of choosing
option "A" depends on the difference in price, and estimate how large a price difference
is needed for the probability that alternative "A" is chosen over "B" to be at
least 80%.
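
One possible way to approach question 2 is sketched below in Python. It is not the intended solution: it assumes a logit link (the text only requires some link g), fits the model by a hand-written Newton-Raphson iteration, and uses synthetic data in place of the real "valg" and "pris.diff" columns. The names `valg_A`, `pris_diff`, `fit_logit` and the simulated coefficient −0.08 are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logit(x, y, iters=30):
    """Newton-Raphson fit of P(y = 1) = sigmoid(b0 + b1 * x); returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w                # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Synthetic stand-in for the real data (NOT the commuters' answers).
random.seed(1)
pris_diff = [random.uniform(-40.0, 40.0) for _ in range(235)]
valg_A = [1 if random.random() < sigmoid(-0.08 * d) else 0 for d in pris_diff]

b0, b1 = fit_logit(pris_diff, valg_A)

# Solve b0 + b1 * d = logit(0.80) for the price difference d.
target = 0.80
d80 = (math.log(target / (1.0 - target)) - b0) / b1
print(f"b0 = {b0:.3f}, b1 = {b1:.4f}, price difference for P(A) = 0.80: {d80:.1f}")
```

With a negative slope, P("A") is at least 80% whenever the price difference is at or below `d80`, i.e. when "A" is cheaper than "B" by at least that amount.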

3. Test whether the parameters in the model are significantly different from 0.
Explain what the hypothesis H0: β0 = 0 means in this model with these data, and
discuss whether it is a meaningful hypothesis. Reduce the model if possible.
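
The test of H0: β0 = 0 could, for example, be carried out as a likelihood ratio test. In a model with intercept β0 and a slope for the price difference, β0 = 0 says that "A" and "B" are equally likely to be chosen when they cost the same. The sketch below assumes a logit link, compares the fits with and without intercept, and uses synthetic data; the function names and the simulated coefficient are hypothetical, and a Wald test would be an equally reasonable choice.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logit(x, y, intercept=True, iters=30):
    """Newton-Raphson fit of P(y = 1) = sigmoid(b0 + b1 * x).

    With intercept=False, b0 is held fixed at 0.
    """
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        if intercept:
            det = h00 * h11 - h01 * h01
            b0 += (h11 * g0 - h01 * g1) / det
            b1 += (h00 * g1 - h01 * g0) / det
        else:
            b1 += g1 / h11
    return b0, b1


def loglik(x, y, b0, b1):
    """Bernoulli log-likelihood: sum of y * z - log(1 + exp(z))."""
    total = 0.0
    for xi, yi in zip(x, y):
        z = b0 + b1 * xi
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        total += yi * z - softplus
    return total


# Synthetic stand-in for the real data (NOT the commuters' answers).
random.seed(2)
pris_diff = [random.uniform(-40.0, 40.0) for _ in range(235)]
valg_A = [1 if random.random() < sigmoid(-0.08 * d) else 0 for d in pris_diff]

full = fit_logit(pris_diff, valg_A, intercept=True)
reduced = fit_logit(pris_diff, valg_A, intercept=False)

ll_full = loglik(pris_diff, valg_A, *full)
ll_reduced = loglik(pris_diff, valg_A, *reduced)

# Likelihood ratio statistic, compared with a chi-square distribution on 1 df.
lr_stat = max(0.0, 2.0 * (ll_full - ll_reduced))
p_value = math.erfc(math.sqrt(lr_stat / 2.0))
print(f"LR statistic = {lr_stat:.3f}, p-value = {p_value:.3f}")
```

If the p-value is large, the intercept can be dropped, which matches the idea that the labels "A" and "B" carry no preference in themselves.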

In the data set there are also differences (defined as the value for
alternative "A" minus the value for alternative "B") for the other explanatory
variables: tid.diff (difference in travel time), skift.diff (difference in number of shifts) and
komfort.diff (difference in comfort). A positive value of, for example, skift.diff
means that option "A" has more shifts than option "B".

4. It is of economic interest to estimate the "price of time": how large a price
difference is needed to compensate for an extended travel time of 60 minutes?
Add the aforementioned explanatory variables to the fitted model from question 3,
estimate the "price of time" in the fitted model, and find a 95% confidence
interval for it.
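
A minimal sketch of how the "price of time" and an approximate confidence interval might be computed is given below. It rests on several assumptions that are not in the text: a logit link, a model without intercept that contains only the price and time differences (the exercise also asks for skift.diff and komfort.diff), travel time measured in minutes, and a delta-method interval based on the normal quantile 1.96. The data are synthetic and all names and simulated coefficients are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logit_2(x1, x2, y, iters=30):
    """Newton-Raphson fit of P(y = 1) = sigmoid(b1 * x1 + b2 * x2), no intercept.

    Returns (b1, b2) and the inverse information matrix as (v11, v12, v22),
    evaluated at the estimates entering the final iteration (converged by then).
    """
    b1 = b2 = 0.0
    for _ in range(iters):
        g1 = g2 = h11 = h12 = h22 = 0.0
        for u, v, yi in zip(x1, x2, y):
            p = sigmoid(b1 * u + b2 * v)
            w = p * (1.0 - p)
            g1 += (yi - p) * u
            g2 += (yi - p) * v
            h11 += w * u * u
            h12 += w * u * v
            h22 += w * v * v
        det = h11 * h22 - h12 * h12
        b1 += (h22 * g1 - h12 * g2) / det
        b2 += (h11 * g2 - h12 * g1) / det
    return (b1, b2), (h22 / det, -h12 / det, h11 / det)


# Synthetic stand-in for the real data (NOT the commuters' answers).
random.seed(3)
n = 235
pris_diff = [random.uniform(-40.0, 40.0) for _ in range(n)]   # euros
tid_diff = [random.uniform(-90.0, 90.0) for _ in range(n)]    # minutes (assumed unit)
valg_A = [1 if random.random() < sigmoid(-0.08 * p - 0.02 * t) else 0
          for p, t in zip(pris_diff, tid_diff)]

(b_p, b_t), (v_pp, v_pt, v_tt) = fit_logit_2(pris_diff, tid_diff, valg_A)

# 60 extra minutes change the linear predictor by 60 * b_t; a price change of
# -60 * b_t / b_p offsets this, so the price of 60 minutes is 60 * b_t / b_p.
price_of_time = 60.0 * b_t / b_p

# Delta method: gradient of 60 * b_t / b_p with respect to (b_p, b_t).
grad_p = -60.0 * b_t / (b_p * b_p)
grad_t = 60.0 / b_p
var = grad_p ** 2 * v_pp + 2.0 * grad_p * grad_t * v_pt + grad_t ** 2 * v_tt
se = math.sqrt(var)
ci_low, ci_high = price_of_time - 1.96 * se, price_of_time + 1.96 * se
print(f"price of 60 minutes: {price_of_time:.1f} euros "
      f"(95% CI {ci_low:.1f} to {ci_high:.1f})")
```

Because the price of time is a ratio of two coefficients, the delta-method interval is only a first-order approximation; an interval based on Fieller's method or on the profile likelihood may behave better when the price coefficient is estimated imprecisely.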

5. Check whether the model from question 4 provides a reasonable description of the data, and
correct any problems if possible.

6. Reduce the model and formulate a conclusion: What does the model say about the
probability of selecting option "A"?
