Advanced Econometrics – Assignment #1

Name: Devvrat Raghav

a) In the specified model, both the variables corresponding to racial identity (black and

hispan) are statistically insignificant at conventional levels of significance, i.e. at 10%,

5% and 1%. Further, as shown by the F-test, they are also jointly insignificant. Hence, we

claim that on average, race is not a significant predictor of a baseball player’s salary.
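
As a rough illustration of the joint test above, the F-statistic for excluding a set of regressors can be computed from the R-squared of the unrestricted and restricted regressions. The sketch below is not the assignment's actual computation: the function name and every number passed to it are hypothetical placeholders, and the formula assumes both regressions share the same dependent variable and sample.

```python
def f_stat_from_r2(r2_unrestricted, r2_restricted, n_obs, n_regressors, n_restrictions):
    """F-statistic for H0: the excluded coefficients are jointly zero.

    n_regressors counts the slope coefficients in the unrestricted model
    (the intercept is accounted for separately by the "- 1").
    """
    df_resid = n_obs - n_regressors - 1
    numerator = (r2_unrestricted - r2_restricted) / n_restrictions
    denominator = (1.0 - r2_unrestricted) / df_resid
    return numerator / denominator


# Hypothetical inputs, NOT taken from the assignment's regression output:
# dropping two dummies (e.g. black and hispan) barely changes the R-squared.
f_race = f_stat_from_r2(
    r2_unrestricted=0.640,
    r2_restricted=0.638,
    n_obs=330,
    n_regressors=10,
    n_restrictions=2,
)
print(f"F = {f_race:.3f}")
```

A small F-statistic relative to the critical value of the F(q, n-k-1) distribution means the excluded variables are jointly insignificant, which is the pattern described for the race dummies.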

Similarly, three of the four variables that capture player performance are statistically

insignificant at conventional levels. These are bavg, hrunsyr, rbisyr. However, allstar is an

important predictor of salary at the 99.5% confidence level. This implies that MLB team

owners deem selection for the all-star team as the dominant proxy for player performance,

rather than raw batting statistics. Thus, we find that for each additional year that a player is

nominated as an all-star, his salary is expected to rise by 8.62%.
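
The 8.62% figure reads the coefficient as an approximate percentage change. A minimal sketch of the approximate versus exact conversion follows, assuming (this is not stated in the text) that the dependent variable is log(salary) and that the allstar coefficient is 0.0862; the function name is illustrative.

```python
import math


def exact_pct_change(coef, delta=1.0):
    """Exact % change in y implied by a log-level model when x rises by delta."""
    return 100.0 * (math.exp(coef * delta) - 1.0)


# Assumed coefficient, inferred from the 8.62% quoted in the text.
allstar_coef = 0.0862

approx_pct = 100.0 * allstar_coef          # the usual 100 * beta approximation
exact_pct = exact_pct_change(allstar_coef)  # slightly larger than the approximation

print(f"approximate: {approx_pct:.2f}%  exact: {exact_pct:.2f}%")
```

For coefficients this small the two readings are close, so the approximate interpretation used above is reasonable.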

b) It is very likely that the variables indicating a player’s performance are highly correlated.

For instance, if a player hits more home runs in a given year (hrunsyr), then it directly

increases both his batting average (bavg) and runs batted-in (rbisyr) for that year, and vice

versa. Consequently, the player’s likelihood of being selected for the all-star team rises,

thereby leading to potentially higher values of allstar. We evaluate this claim using a test

for Variance Inflation, which yields an especially high vif for rbisyr, indicating that it is

highly correlated with the other regressors.
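
The variance inflation check described above can be sketched as a set of auxiliary regressions: each regressor is regressed on all the others, and its vif is 1 / (1 - R2) from that regression. The example below assumes NumPy and uses synthetic data rather than the MLB dataset; the variable names are illustrative only.

```python
import numpy as np


def vif(X):
    """Variance inflation factor for each column of X (n_obs x n_regressors)."""
    n, k = X.shape
    factors = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # auxiliary regression with intercept
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - resid.var() / y.var()             # R-squared of the auxiliary regression
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)


# Synthetic data (an assumption for illustration): x3 is almost a linear
# combination of x1 and x2, while x4 is unrelated to the rest.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2 + 0.1 * rng.normal(size=n)
x4 = rng.normal(size=n)

vifs = vif(np.column_stack([x1, x2, x3, x4]))
print(np.round(vifs, 2))
```

In this setup the three entangled columns show very large values, while the unrelated column stays close to 1, which is the signature one looks for when diagnosing multicollinearity.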

c) The problem of high multicollinearity can be mitigated by either:

i. Dropping one or more of the highly correlated regressors. This is done by

comparing their vif to a predetermined threshold, with variables having a high vif

being dropped from the model. Subsequently, the vif of remaining regressors goes

down due to a fall in the R² of the auxiliary regressions, which reduces the standard

errors on our estimates of the beta coefficients, thereby making them more precise.
ii. Conducting Principal Component Analysis on the model, through which it is

possible to collapse multiple highly correlated regressors (x1, … x4) into another

variable, known as their principal component (z). This variable replaces (x1, … x4)

in our main regression function as the proxy for player performance.
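
As a sketch of option ii, the first principal component can be obtained from the eigen-decomposition of the regressors' correlation matrix. The code below assumes NumPy and four synthetic, highly correlated columns standing in for (x1, … x4); it is not the assignment's actual procedure, and the function name is hypothetical.

```python
import numpy as np


def first_principal_component(X):
    """Return (scores on the first component, all eigenvalues in descending order)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize each regressor
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)            # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs[:, 0]                         # z: one value per observation
    return scores, eigvals


# Synthetic stand-ins for four performance measures driven by one common factor.
rng = np.random.default_rng(0)
common = rng.normal(size=300)
X = np.column_stack([common + 0.3 * rng.normal(size=300) for _ in range(4)])

z, eigvals = first_principal_component(X)
share_explained = eigvals[0] / eigvals.sum()
print(np.round(eigvals, 3), round(share_explained, 3))
```

The resulting z would then enter the salary regression in place of the four original columns. Its sign is arbitrary, so it may need flipping so that higher values mean better performance.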

d) Both methods yield different specifications, thereby requiring separate justification:

i. By setting our arbitrary threshold at vif <= 10, we drop rbisyr from the model. This

is because its vif of 19.01 indicates that ~94.7% of the variation in this variable can

be explained by other regressors already in the model. As such, we retain most of

the model’s explanatory power, while shedding some of its multicollinearity.

However, this does occur at the cost of potential endogeneity in the model: if rbisyr

belongs in the true model, it is absorbed into the error term, which is then correlated with the other regressors.

ii. The PCA approach, by design, aims to capture the maximum possible variation in

the correlated regressors through the principal component (factor). This is done by

ensuring that the eigenvalue of the selected component(s) is greater than 1, meaning

that it explains the variation in those regressors fairly well. Hence, the issue of

multicollinearity can be somewhat averted without losing the model’s explanatory

power, while simultaneously increasing the degrees of freedom and ease of

interpreting performance’s effect on salary.
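
The share of variation quoted in (d)(i) follows from inverting the vif formula: since vif = 1 / (1 - R2), the auxiliary R2 is 1 - 1/vif. A small sketch of that arithmetic, using the vif of 19.01 and the threshold of 10 from the text (the function names are illustrative):

```python
def r2_from_vif(vif):
    """Auxiliary-regression R-squared implied by a variance inflation factor."""
    if vif < 1.0:
        raise ValueError("a vif cannot be below 1")
    return 1.0 - 1.0 / vif


def vif_from_r2(r2):
    """Variance inflation factor implied by an auxiliary-regression R-squared."""
    return 1.0 / (1.0 - r2)


r2_rbisyr = r2_from_vif(19.01)     # share of rbisyr explained by the other regressors
r2_threshold = r2_from_vif(10.0)   # the R-squared that a vif cutoff of 10 corresponds to

print(f"rbisyr: {r2_rbisyr:.3f}  threshold: {r2_threshold:.3f}")
```

Read this way, a cutoff of vif <= 10 amounts to tolerating regressors that are up to 90% explained by the others, and rbisyr sits above that line.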

e) Depending on our method of choice, the following results are obtained:


i. Once rbisyr is dropped, hrunsyr becomes a significant predictor of the player’s
salary, while allstar remains significant at the 1% significance level. Even so, bavg
remains insignificant, albeit slightly less so, as its p-value drops from 0.588 to 0.472.
Taken together, the effect of performance on salary is still statistically significant,
as the F-statistic of ~12.558 and very small p-value indicate.
ii. The variable obtained from the principal component analysis (perf_hat) is a
statistically significant predictor of the player’s salary. Hence, for each standard
deviation increase in a baseball player’s performance, their salary is expected to
increase by 22.7%.
