
SOFTWARE NOTES FOR REGRESSION ANALYSIS

STATISTIX FOR WINDOWS and EXCEL

STATISTIX FOR WINDOWS

Statistix is a popular, easy-to-use professional software package. You can download a FREE trial version of Statistix
from: http://www.statistix.com. This version is good for 30 days only. Really. A copy of Statistix is also on
Blackboard.

Click on “free trial” on the left panel and then “download now”. Click on the program you downloaded (sx9trial.exe).
The Statistix program will be installed in the folder c:\Statistix (unless you override this default). You can start the
program either via START - PROGRAMS - STATISTIX, or by clicking on sxw.exe in the Statistix folder.

Data Entry
Click Data on Tool Bar - Insert - Variables
Key in names, then OK
You should see a screen with your names on the top of the columns
Key in the data values. (Doing one column at a time using the "down arrow" is a fast way to enter the data.)

If you want to convert an Excel file to the Statistix format, you may be able to IMPORT the data (depending on the
version of Statistix and on whether you have garbage ($, commas, etc.) in the file, which you can clean up by
changing the cell definition to numeric). Another easy method is to go into Statistix, click Data - Insert -
Variables, and key in the names of the variables. Then go to Excel, highlight the numeric data (clean file), and click
CTRL-C or Copy. Back in Statistix, click CTRL-V to paste in the data.
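
If the file is large, the clean-up can also be done outside Excel before pasting. The following is only a minimal Python sketch, assuming the raw cells are available as text; the function name clean_numeric and the sample values are hypothetical and not part of Statistix or Excel.

```python
def clean_numeric(cell: str) -> float:
    """Strip dollar signs, thousands separators, and spaces, then convert to a number."""
    stripped = cell.replace("$", "").replace(",", "").strip()
    return float(stripped)


# Hypothetical raw cells as they might appear in a spreadsheet export.
raw_cells = ["$1,250.00", " 980 ", "2,100"]
cleaned = [clean_numeric(c) for c in raw_cells]
print(cleaned)  # [1250.0, 980.0, 2100.0]
```

A cell that still contains text after stripping (for example "n/a") raises a ValueError, which is a convenient way to find entries that need manual attention.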

Save the File
Use the "Save As" option. (Example: ABC.sx)

Do a Scatter Plot
Click Statistics on the Tool Bar - Summary Statistics - Scatter Plot
Select the X and Y axes. Check “display regression line”.
(This is easier than "eyeballing" a linear relationship.)
Print this now if you wish; this is the easiest approach.
To get out of this screen, click on the "X" in the upper right corner of the window,
even though it may not be highlighted. Your data should be displayed again.
(If you wish to save plots and import them into a document, save the plot as an
“emf” or “wmf” file (example: abc.emf, abc.wmf).
To import the graph into a document, open the emf (wmf) file, click copy on the tool bar,
then open your document, position the cursor at the location where you
want to place the graph, and click paste. If it is too large, you may need to shrink the plot before clicking copy.)

Do Regression Analysis
Click Statistics - Linear Models - Linear Regression
Select the independent and dependent variables, then OK.
The output report will appear on the screen.
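
To see what the coefficient estimates in the report represent, here is a minimal Python sketch of the least-squares fit for simple regression. It is an illustration only, not how Statistix itself is implemented; the function name fit_line and the data are made up.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_line(x, y)
print(round(b0, 4), round(b1, 4))  # 2.2 0.6
```

The two printed numbers correspond to the constant and the slope coefficient in a regression report for the same data.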

Save the Output Report
Click File - Save As (Example: ABC_OUT.txt)
On subsequent saves, use the same name; Statistix will then give you the option to append information to the existing file,
which you should take. (This file will be saved in a text format (*.txt), which you can later read in a word processor.)

Additional Regression Results
For each option click “Results” on the tool bar. Don't get out of this screen until you have gotten all the
reports that you need. (You should do the first three categories listed here.)
Results - Durbin-Watson Statistics - File - Save As (using same *.txt name) - Append
Results - Prediction - key in the value of the independent variable in the Predictor Values,
note CI box (95% is the default, which you have the option of changing) - File - Save As - Append
Results - Plots - select the "Simple Regression Plot", "Std Resids by Fitted Value" – Print;
then do "Std Resids Time Series" – Print (Alternatively, save as noted above and paste later.)
(Note: repeat "Results - Plots - your selection - Print" for each plot you desire.)
(optional) Results – Save Residuals – for fitted values key in “Yhat” and for
residuals key in “e”. These will be saved as columns in the original .sx file,
which you should then save again.
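
The saved columns are simply the fitted values and the residuals (observed minus fitted). A minimal Python sketch, assuming made-up data and coefficients b0 and b1 that would in practice be read off the regression report:

```python
# Hypothetical data and coefficients, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

yhat = [b0 + b1 * xi for xi in x]         # fitted values ("Yhat")
e = [yi - fi for yi, fi in zip(y, yhat)]  # residuals ("e") = observed - fitted

# One row per observation: X, Y, Yhat, e.
for xi, yi, fi, ei in zip(x, y, yhat, e):
    print(f"{xi:>3} {yi:>5.1f} {fi:>6.2f} {ei:>6.2f}")
```

When the model includes a constant, least-squares residuals sum to zero (up to rounding), which is a quick check that the columns were saved correctly.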

Printing Your Results
By this time, you should have all your graphs printed or saved in separate files.
To print the output file, click "X" in the upper right corner of the window to get out of the current screen.
(You can't print your file from this screen, since you'll just get what's on the screen,
not what is in your file.)
Exit from Statistix.
Bring your output file, which contains the report information, into a word processor, paste in graphs if
necessary, and print.

EXCEL FOR REGRESSION ANALYSIS

The following assumes that you have some knowledge of EXCEL, and that you are using only the standard
ANALYSIS TOOLPAK of EXCEL. Note there may be some slight differences between various versions
of EXCEL, so read screens carefully and employ the HELP function when necessary.

1. Data Entry
Put the observation numbers or identifiers (if any) in column A.
Put the dependent variable data (Y values) in column B.
For simple regression, put the independent variable data (X values) in column C.
For multiple regression, put the data into column C and the adjacent columns. So if there are 3 independent
variables, they would go into columns C, D, and E.
Save this data in a file NOW and also periodically.

2. Scatter Plot
HIGHLIGHT the X and Y data values. (You can include the title; it doesn’t seem to matter for this
particular operation.)
Select INSERT | CHART. You want to select "XY Scatter" and the sub-chart that just has dots. Click
NEXT. In the Data Range tab select columns; ignore everything else. In the Series tab set up the X and
Y values. They will be reversed (Excel takes the first column, here Y, as the X values), so you need to fix that. (We could have switched the B and C columns in
Data Entry above, but that is not good if you are planning on doing multiple regression.) In the NEXT screen
you can select labels and any other options regarding appearance. Some versions of Excel will vary, so
just follow the on-screen directions.

If you want to draw a trend line through the points you can click on your chart and “CHART” will appear
in the toolbar. Depending on your system do ONE of the following:
A. CHART | ADD TRENDLINE Click linear. Ignore other options.
B. INSERT | TRENDLINE Choose type: linear Do NOT select "set intercept = 0".

3. Regression Analysis
Before proceeding make sure sheet1 is active.
Select TOOLS | DATA ANALYSIS | REGRESSION
(Note, if you do not have data analysis on the tools drop-down menu you need to select
TOOLS | ADD-INS and select ANALYSIS TOOLPAK, then OK.
Then terminate EXCEL and restart the program.)

To do the regression, indicate the Y range (click on the box, then highlight the Y values including the label on the
spreadsheet) and the X range (click on the box and then highlight). Click Labels, click Confidence Level (95% is the
default; you can change it), select New Worksheet Ply and include a name like RESULTS, and take the Residual
options: Residuals and Residual Plots, at least. To see the titles completely, you will need to widen the
columns. Note that the value in the Regression Statistics section labeled "Standard Error" is the standard
deviation of the residuals. (Note: if you do not highlight the variable label name, then do not click
“LABELS”.)
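
As a check on the "Standard Error" figure, the standard deviation of the residuals can be computed as the square root of SSE / (n - k - 1), where k is the number of independent variables. A minimal Python sketch with made-up residuals; the function name standard_error is hypothetical:

```python
import math


def standard_error(residuals, num_predictors=1):
    """Standard deviation of the residuals: sqrt(SSE / (n - k - 1))."""
    n = len(residuals)
    sse = sum(r ** 2 for r in residuals)  # sum of squared residuals
    return math.sqrt(sse / (n - num_predictors - 1))


# Made-up residuals from a simple regression on five observations.
residuals = [-0.8, 0.6, 1.0, -0.6, -0.2]
s = standard_error(residuals)
print(round(s, 4))  # 0.8944
```

For simple regression the divisor is n - 2, because two coefficients (the intercept and one slope) are estimated.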

4. Analysis of residuals
At this point, you should have a residual plot of units v. residuals from the previous action. To create a plot
of Fitted Y values v. Residuals, do INSERT | CHART and select scatter plot as above. You want to graph
the predicted Y values (found in the residual section) on the horizontal axis (X axis in Excel), and the
residual values on the vertical axis (Y axis in Excel). To evaluate autocorrelation you need a time series
plot. You can do INSERT | CHART but this time select LINE and the sub-graph with the dots. You can
select the residuals for the vertical axis and this graph form will automatically place numbers on the
horizontal axis. Note that it is OK if your data is spread out on different worksheets. In front of the cell
ranges you can type: =sheetname! (For example: =sheet1!a4:a15.) The Durbin-Watson statistic also
indicates autocorrelation, but it requires considerably more effort to obtain in EXCEL. This calculation could be done by hand for
small data sets. (When graphing you can use either the Residuals or the Standardized Residuals.)
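
The hand calculation of the Durbin-Watson statistic divides the sum of squared differences between successive residuals by the sum of squared residuals. A minimal Python sketch, assuming the residuals are listed in time order; the function name durbin_watson and the values are made up for illustration:

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of (e_t - e_(t-1))^2 divided by sum of e_t^2."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    den = sum(r ** 2 for r in residuals)
    return num / den


# Made-up residuals, in time order.
residuals = [-0.8, 0.6, 1.0, -0.6, -0.2]
d = durbin_watson(residuals)
print(round(d, 4))  # 2.0167
```

The statistic ranges from 0 to 4. Values near 2 suggest little first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.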

5. Confidence or Prediction Intervals
The confidence intervals for the beta coefficients are done automatically. The Y intervals are not done
automatically. You can get the standard deviation of residuals from the output and then follow the formulae
in the NOTES handout. The EXCEL file for determining indirect manufacturing costs called “Indirect with
intervals.XLS” is on Blackboard. It has three worksheets called Interval X (for three different values of X)
that contain all the steps needed to create prediction and confidence intervals for Y. You can use that as a
model if you wish to perform the calculations in EXCEL. See the discussion of confidence intervals versus
prediction intervals in the NOTES handout.
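
For orientation, here is a minimal Python sketch of the Y intervals for simple regression. It uses the standard textbook formulas, which may be laid out differently in the NOTES handout and in the Blackboard worksheet. The function name y_intervals and the data are made up, and the t value is assumed to be looked up in a t table (3.182 for 95% with 3 degrees of freedom).

```python
import math


def y_intervals(x, y, x0, t_crit):
    """Return (fitted Y at x0, confidence interval for mean Y, prediction interval for Y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Standard deviation of the residuals, with n - 2 degrees of freedom.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    y0 = b0 + b1 * x0
    spread = 1 / n + (x0 - x_bar) ** 2 / sxx
    ci_half = t_crit * s * math.sqrt(spread)      # interval for the mean of Y at x0
    pi_half = t_crit * s * math.sqrt(1 + spread)  # interval for a single new Y at x0
    return y0, (y0 - ci_half, y0 + ci_half), (y0 - pi_half, y0 + pi_half)


# Made-up data; t_crit taken from a t table for 95% and n - 2 = 3 degrees of freedom.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y0, ci, pi = y_intervals(x, y, x0=4, t_crit=3.182)
print(round(y0, 2), [round(v, 2) for v in ci], [round(v, 2) for v in pi])
```

The prediction interval is always wider than the confidence interval, because it also accounts for the scatter of an individual observation around the regression line.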

6. Output
You can print out each worksheet separately. When printing the regression results, make sure that the
cursor is clicked on the written report, rather than the charts. If it is set on the charts (since, probably, you
just moved them), then only the selected chart will be printed.
