ECMT1020: Introduction to Econometrics

Tutorial Questions, Week 3


As this is the first tutorial using Stata, I include a list of the most useful Stata commands with a short
explanation. For more detailed information, check Stata’s extensive Help menu; if anything is still
unclear, your tutor or lab helper will be happy to assist you.
• OPENING FILES:
Stata can read data files that have the .dta extension. Loading a data set is most easily done
through the drop-down menu: just click File, Open, and find the data file you need. Alternatively,
the command to do this is use; for example, use intrate to load the data set for Question 2.
• LOOKING AT THE DATA:
To see the individual values in your data set, you can use the “Data Browser” from the Data menu.
Alternatively, the Stata command list shows you the entire data set. You can also choose to view,
for example, only cases where a certain condition is met (list if infl>1 & infl<=4), or
only some of the variables (list date intrate). These options can also be combined (list
date intrate if infl>1 & infl<=4).
• SUMMARY STATISTICS:
Summary statistics can be found using the summarize command: summarize infl will give
the number of observations, mean, standard deviation, minimum, and maximum values of inflation.
If you need more detail (percentiles, skewness, kurtosis), use summarize infl, detail.
The summarize command can be combined with if conditions in the same way as list; for
example, summarize infl if intrate>5 gives you summary statistics on the inflation
rates from those months with interest rates over 5% only.
• CREATING NEW VARIABLES:
To create a new variable that is a function of an existing one, you can use generate. For
example, if you have a variable called tempF that measures temperature in Fahrenheit, the command
generate tempC=(tempF-32)/1.8 creates a new temperature variable that is in Celsius.
• T-TESTS:
The t test for H0: µ = µ* versus Ha: µ ≠ µ* is performed by ttest. For example, to test
whether the mean inflation rate is five percent, ttest infl==5 will give you both the t statistic
and the p-value. (It actually performs three different t tests; the one that you want is in the middle.)
• MAKING GRAPHS:
For histograms, use histogram infl. Scatter plots can be made using scatter infl
intrate; this will put “infl” on the vertical axis and “intrate” on the horizontal one.
• SIMPLE CALCULATIONS:
Calculator functionality is available through the display command. For example, display
3+5 gives you the number 8, display normal(1.96) gives you Pr[Z ≤ 1.96] = 0.975,
and display invnormal(0.975) gives you 1.96. The related functions for the t distribution
are ttail and invttail, which were discussed in Lecture 2.
• GETTING HELP:
If you can’t find something through the Help menu, another good option is the help command.
For example, help on the histogram command can be obtained by typing help histogram.
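
The commands above can be strung together into a short first session. The following is only a sketch: it assumes that intrate.dta sits in Stata's current working directory and contains the variables date, intrate, and infl used in the examples above. The variable infl_sq and the numbers passed to ttail are hypothetical, chosen purely for illustration.

```stata
* Load the data; the clear option drops any data already in memory
use intrate, clear

* Look at selected variables for selected months
list date intrate if infl>1 & infl<=4

* Summary statistics, with and without extra detail and conditions
summarize infl, detail
summarize infl if intrate>5

* A new variable built from an existing one (hypothetical, for illustration)
generate infl_sq = infl^2

* t test of H0: mean inflation equals 5
ttest infl==5

* Graphs: histogram, then a scatter plot with infl on the vertical axis
histogram infl
scatter infl intrate

* Calculator: a normal critical value and a two-sided t-distribution p-value
* ttail(df, t) returns the upper-tail probability, hence the factor 2
display invnormal(0.975)
display 2*ttail(29, 2.045)

* Built-in help for any command
help histogram
```

Lines starting with * are comments, so the block can be pasted into a do-file or typed one command at a time.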
Question 1. For this question, use the data set generated.dta that is available on Canvas. This is not
economic data but rather some random values that I have generated; you should find that it only contains
a single variable called x.

(a) Load the data and obtain the sample mean and standard deviation.

(b) Use these results to find intervals that 68% and 95% of the data should lie in, respectively, if we
assume a normal distribution.

(c) Find the proportion of the data that actually lies in the intervals you computed in part (b). How would
you explain the discrepancy?

(d) Create a histogram of the data. How does it relate to your findings in part (c)?
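
Part (c) requires counting how many observations satisfy a condition. One possible approach, which is not in the command list above, is Stata's count command. The sketch below deliberately uses Stata's built-in auto data set and arbitrary bounds (20 and 30) instead of generated.dta, so it shows the mechanics only.

```stata
* Illustration on Stata's built-in auto data set, not on generated.dta
sysuse auto, clear

* Count the cars whose mpg lies between 20 and 30 (arbitrary bounds)
count if mpg>=20 & mpg<=30

* count stores its result in r(N); _N is the total number of observations
display r(N)/_N
```

The same if condition also works with summarize, which reports the number of observations that satisfy it.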

Question 2. For this question, use the data set intrate.dta, also available on Canvas. It contains
monthly data on the nominal interest rate (intrate) and inflation (infl) in the United States, for
1980–2009.

(a) We first focus on the infl variable. Find its mean, standard deviation, skewness, and kurtosis, and
plot a histogram of the sample. Show that both the summary statistics and the histogram suggest that this
variable is not normally distributed.

(b) We introduced the t statistic and the related test under the assumption that the data came from a
normal distribution. Why can we still use this test, despite what we found in part (a)?

(c) Macroeconomics textbooks often state that the average inflation rate for an industrialized country
should be 4% in the long run. Based on your results from part (a), but without using the ttest
command, find the t statistic and the p-value for testing the hypothesis that 4% is the population mean
inflation rate. What do you conclude?

(d) Use the ttest command to confirm your results in part (c).

(e) Find the mean and standard deviation of the interest rates as well.

(f) Create a new variable, measuring the real interest rate: the nominal interest rate minus the inflation
rate. Also find this variable’s mean and standard deviation.

(g) The variable you created in part (f) is defined as a linear combination of two variables: it is equal
to a·intrate + b·infl, where a = 1 and b = −1. Show that the formula for the mean of a linear
combination correctly gives the result that you found in part (f).

(h) A similar formula for the variance of a linear combination was given in the lecture slides. Show that
this formula does not give the correct result in this case. Why is that?

(i) The capital gains tax system was constructed around the assumption that on average, the real interest
rate should be 2.5% per year. Test whether this is the case for the years in this data set.

Non-Stata questions: Even in the weeks with Stata tutorials, I will occasionally include some
pencil-and-paper questions. Your tutor may cover these questions if time allows; solutions will be
posted to Canvas.

Question 3. Let X ∼ N(12, 3).

(a) Find P(X ≤ 12).

(b) Find the value of k such that P(12 − k ≤ X ≤ 12 + k) = 0.05.

(c) If a new variable Y is defined as Y = 2X − 6, find P(Y ≥ 18).

Question 4. This question focuses on the sum of squares $\sum_{i=1}^{n} (x_i - \bar{x})^2$, which pops up in the
definition of a variance, as well as in several other places, as we will see later on in the semester.

Prove that this sum of squares can be simplified to $\sum_{i=1}^{n} x_i^2 - n\bar{x}^2$.
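
Before attempting the proof, it can help to check the identity on a small example. The numbers below are hypothetical, with n = 3 and x = (1, 2, 3). The check illustrates the claim but does not replace the general argument.

```latex
\begin{align*}
\bar{x} &= \tfrac{1}{3}(1 + 2 + 3) = 2, \\
\sum_{i=1}^{3} (x_i - \bar{x})^2 &= (-1)^2 + 0^2 + 1^2 = 2, \\
\sum_{i=1}^{3} x_i^2 - n\bar{x}^2 &= (1 + 4 + 9) - 3 \cdot 2^2 = 14 - 12 = 2.
\end{align*}
```

Both expressions give 2, as the identity predicts.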

Question 5. A random survey of 20 people found the following values for D:

(1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1)

where 1 denotes employed and 0 denotes not employed. Find the mean and median of D. Find the
coefficient of variation. What is the unemployment rate according to these data?
