
# Guide to Stata (using the example auto.dta)

1. Importing data into Stata
- From an existing .dta file: File -> Open -> select the .dta file. The variables appear and can be
used during the session
- Typing the variables directly into Stata: Data -> Data Editor -> Edit, then enter text or numerical data
- From an existing Excel file: the quick way is to copy and paste into the Data Editor (change variable
names if necessary), OR save the sheet as a .csv file and use the command insheet using name.csv
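
The same imports can be typed as commands. The sketch below loads the example dataset with sysuse, which reads the copy of auto.dta shipped with Stata. The file names mydata.dta and name.csv are placeholders, so those lines are left commented out. The import delimited line assumes Stata 13 or later, where it is the newer equivalent of insheet.

```stata
* Load the example dataset shipped with Stata (clear drops any data in memory)
sysuse auto, clear

* Load a .dta file from disk (mydata.dta is a placeholder name)
* use "mydata.dta", clear

* Read a comma-separated file (name.csv is a placeholder name)
* insheet using name.csv, clear

* Newer equivalent of insheet, assuming Stata 13 or later
* import delimited using name.csv, clear

* Check what was loaded
describe
```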

NOTES: (i) press Page Up to recall the previous command; (ii) every command obtained by
clicking can also be typed directly into the command window (the corresponding commands can
be found in the help section: type help followed by a keyword, e.g. help histogram); (iii)
missing data are marked by a dot (.)
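
As a small illustration of note (iii), the sketch below looks at rep78, a variable in auto.dta that has missing values. One thing worth knowing when writing conditions: Stata treats a missing value as larger than any number, so a condition such as rep78 > 3 also selects the missing observations unless they are excluded explicitly.

```stata
sysuse auto, clear

* Observations where rep78 is missing appear with a dot
list make rep78 if missing(rep78)

* Count the missing values
count if missing(rep78)

* Missing counts as larger than any number, so exclude it explicitly
count if rep78 > 3 & !missing(rep78)
```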

2. Creating a log file which saves everything you do in Stata
- First open a log file by giving it a name: log using filename
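
A minimal sketch of a logged session, assuming the hypothetical file name mysession. The replace option overwrites an earlier log with the same name. The log close command from section 9 is included so that the example is complete.

```stata
* Start recording commands and output (mysession is a hypothetical name)
log using mysession, replace

sysuse auto, clear
summarize price

* Stop recording; the session is saved in mysession.smcl
log close
```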

3. Simple calculations
- Use the command display in the command window, e.g. display log(2) or display invttail(5,0.05)
(inverse of the upper-tail Student's t distribution, i.e. the value x such that the mass to the right
of x is 5 percent for 5 degrees of freedom)
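
The two calculations above can be typed as follows. The last line is an added check, not part of the original example: ttail is the upper-tail probability function, so applying it to the result of invttail should return the 0.05 that went in.

```stata
* Natural logarithm of 2
display log(2)

* t value with 5 degrees of freedom and 5 percent of the mass to its right (about 2.015)
display invttail(5, 0.05)

* Round trip: upper-tail probability of that value, which recovers 0.05
display ttail(5, invttail(5, 0.05))
```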

4. Descriptive statistics
- Making graphs with Stata
o Histogram: go to Graphics -> Histogram -> select the variable (e.g. the variable price)
o Pie chart (share of each category): go to Graphics -> Pie chart -> select the variable
o Scatter plot (relationship between two variables): use the command scatter y x, where the
first variable goes on the vertical axis (e.g. scatter mpg price)
o Summary statistics for the variables : go to Data -> Describe Data -> Summary
statistics -> select the variables of interest
o Displaying some variables from the dataset, e.g. car makes and their prices (with the
condition that the price is lower than 6000): list make price if price<6000
o Conditional calculations, e.g. summary statistics for price considering only prices
below 5000: summarize price if price<5000 (NOTE: conditions can be applied to
almost everything, e.g. a regression restricted with if price<5000)
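
The menu steps above have command equivalents. The sketch below runs them on auto.dta. Using rep78 as the pie chart variable is an assumption made for illustration, since the text does not say which variable to use.

```stata
sysuse auto, clear

* Histogram of price
histogram price

* Pie chart of the number of cars in each repair-record category (rep78 is an illustrative choice)
graph pie, over(rep78)

* Scatter plot: mpg on the vertical axis, price on the horizontal axis
scatter mpg price

* Summary statistics, for all cars and for cars priced below 5000
summarize price mpg
summarize price if price < 5000

* List make and price for cars priced below 6000
list make price if price < 6000
```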

5. Creating/managing variables
- Generating new variables as functions of existing variables: use generate, e.g. generate
new_variable=log(price)
- Replacing the values of a variable, e.g. replace trunk=trunk/100
- Renaming an existing variable: rename rep78 rep
- Sorting the observations in ascending order of price: sort price
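
Run in sequence on auto.dta, these commands look as follows. The name log_price is a hypothetical choice for the new variable. The final list command is added only to show the effect of the sort.

```stata
sysuse auto, clear

* New variable as a function of an existing one (log_price is a hypothetical name)
generate log_price = log(price)

* Overwrite the values of an existing variable
replace trunk = trunk/100

* Rename rep78 to rep
rename rep78 rep

* Sort the observations from cheapest to most expensive
sort price

* Show the five cheapest cars
list make price log_price in 1/5
```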

6. Simple linear regression
- Easy way to run a simple/multiple regression between variables: go to Statistics ->
Linear models and related -> Linear regression -> select the variables OR directly type regress
y x in the command window (dependent variable first)
- NOTE: options can be typed directly in the command window after a comma (for example
regress y x, noconstant)
- Other types of regression and tests exist, e.g. for binary dependent variables (logit, probit),
panel data, time series. We will talk about some of them during the semester
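
A short sketch of the regress syntax on auto.dta. The choice of price as the dependent variable and of mpg and trunk as regressors is borrowed from the examples used elsewhere in this guide.

```stata
sysuse auto, clear

* Simple regression: the dependent variable comes first
regress price mpg

* Multiple regression with two regressors
regress price mpg trunk

* Same model without an intercept; options go after the comma
regress price mpg trunk, noconstant

* Regression restricted to cars priced below 5000
regress price mpg if price < 5000
```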

7. Postestimation computations
- Testing parameters: first run regress price mpg trunk. Then test that the coefficients on both
variables are jointly equal to zero: go to Statistics -> Postestimation -> Test -> Test
parameters OR simply type test mpg trunk after the regression
- Fitted values (for price, given mpg, trunk and the regression results): first run regress price
mpg trunk. Then Postestimation -> Prediction -> give a name to the fitted values OR simply
type predict price_hat after the regression
- Same procedure to get the residuals from the regression (with predict, add the option residuals)
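
Typed as commands, the postestimation steps could look like this. The name price_res for the residuals is a hypothetical choice. By default, predict after regress returns fitted values, and the residuals option asks for residuals instead.

```stata
sysuse auto, clear
regress price mpg trunk

* Joint F test that the coefficients on mpg and trunk are both zero
test mpg trunk

* Fitted values (the default for predict after regress)
predict price_hat

* Residuals (price_res is a hypothetical name)
predict price_res, residuals

* Price equals fitted value plus residual, up to rounding
summarize price price_hat price_res
```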

8. Panel and time series data
- In this case, you must first declare that the dataset is a panel or a time series
o For a panel, use xtset followed by the name of the variable identifying the different
individuals in the panel (e.g. a variable named Country; an example will be shown on the
board)
o For a time series, declare tsset year (where year is a variable for years, e.g. 1990 1991
1992)
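
Since auto.dta is neither a panel nor a time series, the sketch below types in a tiny made-up dataset. The variable names country, country_id, gdp and gdp_change and all the numbers are hypothetical. xtset needs a numeric identifier, so the string variable is first converted with encode.

```stata
clear

* Hypothetical panel: two countries observed over three years
input str7 country year gdp
"France" 1990 10
"France" 1991 11
"France" 1992 12
"Italy"  1990 8
"Italy"  1991 9
"Italy"  1992 10
end

* xtset requires a numeric panel identifier, so encode the string variable
encode country, generate(country_id)

* Declare the panel: individual identifier first, then the time variable
xtset country_id year

* Time-series operators now work within each country (D. is the first difference)
generate gdp_change = D.gdp

* A single time series: keep one country and declare only the time variable
keep if country == "France"
tsset year
```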

9. Closing the log file : log close
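- This closes and saves the .smcl file, which can be reopened in Stata to review every command you ran and its output. Opening the log does not rerun the calculations; to redo them, save the commands in a do-file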

10. Getting help
- Type help followed by a key word