
SPSS Training

SPSS Training Bangalore


Copyright 2001 ACNielsen

Understand your Data


Four different types of data:
- Nominal Scale
- Ordinal Scale
- Interval Scale
- Ratio Scale


Examples

Nominal Scale
- Occupation
- Education
- Brand usage
- Products used

Ordinal Scale
- Ranking of attributes
- Age categories


Examples (contd.)

Interval Scale
- Purchase interest
- Uniqueness
- Attribute ratings

Ratio Scale
- Number of packs used
- Number of toffees consumed


Broadly, data is of two types

Non-Metric
- Nominal
- Ordinal

Metric
- Interval
- Ratio


Data Types
Scale      Order   Equal step size   Absolute zero
Nominal    no      no                no
Ordinal    yes     no                no
Interval   yes     yes               no
Ratio      yes     yes               yes


Types of Analysis
- Univariate
- Bivariate
- Multivariate


Univariate Analysis
Single variable
- Mean
- Variance
- Standard deviation
- Median
- Mode
- Range
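
As a rough illustration of the univariate measures listed above, the following Python sketch computes each of them for one variable. It is not SPSS output; the data values are hypothetical and the variance and standard deviation use the sample (n - 1) form.

```python
import statistics

# Hypothetical ratings for a single variable (illustrative data only).
values = [2, 4, 4, 5, 7, 8]

summary = {
    "mean": statistics.mean(values),
    "variance": statistics.variance(values),  # sample variance, n - 1 denominator
    "std_dev": statistics.stdev(values),      # square root of the sample variance
    "median": statistics.median(values),
    "mode": statistics.mode(values),
    "range": max(values) - min(values),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```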


Bivariate Analysis
Two variables
- Covariance
- Chi-square
- Simple regression
- Correlation coefficient
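
To make three of the bivariate measures above concrete, here is a minimal Python sketch that computes the covariance, the correlation coefficient and a simple regression line for two variables. The paired observations are hypothetical, and chi-square is left out because it applies to cross-tabulated categorical data rather than to paired metric values.

```python
import math

# Hypothetical paired observations (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sample covariance and variances (n - 1 denominator).
cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)

# Pearson correlation coefficient.
r = cov_xy / math.sqrt(var_x * var_y)

# Simple regression of y on x: y = intercept + slope * x.
slope = cov_xy / var_x
intercept = mean_y - slope * mean_x

print(cov_xy, r, slope, intercept)
```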


Multivariate Analysis
Many variables
- Multiple Regression
- Correspondence Analysis
- Factor Analysis
- Cluster Analysis
- Discriminant Analysis
- Multidimensional Scaling
- AID (Automatic Interaction Detection)


Multivariate Analysis for more than one Dependent Variable

Major techniques used for such analysis are:
- Canonical correlation
- Analysis of variance and covariance


Multivariate Analysis: Interdependence

- Factor Analysis
- Cluster Analysis
- Multidimensional Scaling


MVA Analysis


Data Files


Data File Types




Data files come in a wide variety of forms, and SPSS is designed to handle many of them:
- Spreadsheets created in Excel
- Database files created in Dbase
- Database files created in MS Access
- Tab-delimited and other types of ASCII text files

Opening a file

- File > Open > Data


Opening Excel file

- Variable names are read from the first row of the spreadsheet
- Variable names should not be longer than 8 characters
- A name longer than 8 characters is truncated
- If the first 8 characters do not give a unique name, the name is modified to make it unique
- By default, the data is read from the first worksheet
- To read from a different sheet, select the worksheet from the drop-down list

Steps:
- In the Open File dialog box, select the folder in Look in
- Select file type Excel
- Select the Excel file (district.xls in the SPSSTrain folder)
- Select the worksheet
- Click Open

Opening Dbase file

- Field names are automatically translated into variable names
- Variable names longer than 8 characters are truncated
- Records marked for deletion are also read, but an additional field D_R is created, which contains an asterisk for cases marked for deletion

Command: File > Open > Data
- In the Open File dialog box, select file type Dbase
- Select the folder in Look in
- Select the Dbase file (EFED.DBF in the SPSSTrain folder)


MS ACCESS File

Command: File > Open Database > New Query
- In the Database Wizard, select MS Access Database
- Click Next
- In the ODBC Driver Login dialog box, click Browse
- Select Pay.mdb in the SPSSTrain folder and click OK
- You can select one table or fields from different tables
- Tables are shown on the left
- Double-click fields from the various tables
- Click Next
- Specify relationships and click Next
- Click Next for all cases to be retrieved
- In Define Variables, you can change the type of a field from string to numeric for fields holding numeric values
- Results shows the syntax, which can be pasted into the syntax editor
- Click Finish

Reading Text Files

The Text Wizard can read text files formatted in a variety of ways:
- Tab-delimited
- Space-delimited
- Comma-delimited
- Fixed-field format

Command: File > Read Text Data
- Select Pay.txt
- In the Read Text Wizard:
  - Step 1: select no predefined format and click Next
  - Step 2: select Delimited, and Yes for variable names; click Next
  - Step 3: keep the default selection and click Next
  - Step 4: select comma as the delimiter
  - Step 5: change field types if needed
  - Step 6: click Finish
- Repeat with reg.txt, which is in fixed format
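
For readers who want to see what the delimited case amounts to outside the wizard, here is a small Python sketch that reads comma-delimited text with variable names in the first row and then converts one field to numeric, loosely mirroring Steps 2, 4 and 5. The sample text and its field names are hypothetical stand-ins; the actual contents of Pay.txt are not shown in the course material.

```python
import csv
import io

# Hypothetical stand-in for a comma-delimited file with variable names in row 1.
sample_text = "id,gender,salary\n1,m,57000\n2,f,40200\n"

reader = csv.DictReader(io.StringIO(sample_text), delimiter=",")
rows = list(reader)

# Every field is read as text; convert numeric fields explicitly,
# much as the wizard lets you change field types.
for row in rows:
    row["salary"] = float(row["salary"])

print(reader.fieldnames)
print(rows[0])
```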

Data Transformations


Types of Data transformations

- Compute Variable: computes values for a variable based on numeric transformations of other variables
- Recode Values: modifies values by recoding them. This is useful for combining or collapsing categories. Values can be recoded within the existing variable or into a new one
- Visual Bander: creates intervals (bands) based on different criteria
- Count: counts occurrences of values within cases
- Rank Cases: multiple ranking methods can be selected. A separate ranking variable is created for each method
- Automatic Recode: converts string and numeric values into consecutive integers

Compute Variable
- Use the Employee data.sav file
- Compute the increase in salary as salary - begsalary
- Create the new variable as sal_inc
- Command: Transform > Compute
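
The computation itself is a row-by-row subtraction. A minimal Python sketch of the same idea follows; the two records are hypothetical, and the field names simply follow the slide (salary, begsalary, sal_inc).

```python
# Hypothetical records; field names follow the slide, values are illustrative.
employees = [
    {"id": 1, "salary": 57000, "begsalary": 27000},
    {"id": 2, "salary": 40200, "begsalary": 18750},
]

# Create the new variable sal_inc for every case.
for case in employees:
    case["sal_inc"] = case["salary"] - case["begsalary"]

print([case["sal_inc"] for case in employees])
```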


Recode Values
- Use Employee data.sav
- Recode salary into salary groups
- Command: Transform > Recode > Into Different Variables
- Double-click Current Salary
- Type Salary_Gps in Output Variable Name
- Type Salary Groups in Label
- Click Old and New Values
- Click Range: Lowest through, type 25000, click Value under New Value, type 1, click Add
- Click Range, type 25001 through 50000, new value 2, click Add
- Click Range, type 50001 through 75000, new value 3, click Add
- Click Range, type 75001 through 100000, new value 4, click Add
- Click Range: through Highest, type 100001, new value 5, click Add
- Click Continue, click Change, click OK
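
The recode rules above amount to a lookup from salary to group number. The Python sketch below expresses the same five ranges; the function name is hypothetical and it assumes whole-number salaries, so that values such as 25000.50, which fall between the ranges typed in the dialog box, do not arise.

```python
def recode_salary(salary):
    """Return the salary group (1 to 5) using the ranges from the slide."""
    if salary <= 25000:
        return 1
    if salary <= 50000:
        return 2
    if salary <= 75000:
        return 3
    if salary <= 100000:
        return 4
    return 5


# Hypothetical salaries, one per group boundary of interest.
salaries = [21000, 25000, 25001, 60000, 100000, 135000]
groups = [recode_salary(s) for s in salaries]
print(groups)
```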

Visual Bander
- Open Employee data.sav
- Transform > Bander
- Double-click Current Salary and click Continue
- Click Current Salary
- Type the banded variable name as Salary_B
- Click Make Cutpoints
- Choose Equal Width Intervals
- Type 20000 in First Cutpoint Location
- Type 4 as the Number of Cutpoints
- Click on Width; the width is calculated automatically
- Click Apply
- Click Make Labels
- Click OK

Count
- Use S3data.sav
- Click Transform > Count
- Type the target variable as TopBox_Count
- Type the target label as Count of attributes rated 5
- Select variables q16r1 to q16r18
- Click Define Values
- Type 5 and click Add
- Click Continue
- Click OK
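
Count works across variables within each case. As a sketch of the logic only, the Python below counts how many ratings in a case equal 5; the respondents and their ratings are hypothetical, and only four attributes are shown instead of the eighteen in the exercise.

```python
def count_top_box(ratings, target=5):
    """Count how many ratings in one case equal the target value."""
    return sum(1 for rating in ratings if rating == target)


# Hypothetical attribute ratings per respondent (four attributes for brevity).
respondents = {
    "r001": [5, 4, 5, 3],
    "r002": [2, 3, 4, 4],
    "r003": [5, 5, 5, 5],
}

top_box_count = {rid: count_top_box(ratings) for rid, ratings in respondents.items()}
print(top_box_count)
```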


Rank Cases

- Rank Cases creates new variables containing ranks, normal and Savage scores, and percentile values for numeric variables
- Ranking can be in ascending or descending order
- Ranking can be done within sub-groups

Steps:
- Open Employee data.sav
- Click Transform > Rank Cases
- Choose the variable Current Salary
- Under Assign Rank 1 to, click Largest value
- Select By: Employee Category
- Click Rank Types, select Rank, and click Continue
- Click Ties and select Sequential ranks to unique values
- Click OK
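
The options chosen above (rank 1 to the largest value, ranking within employee category, sequential ranks to unique values) can be sketched in Python as follows. The cases are hypothetical, and the tie handling is an interpretation of the dialog option: tied cases share a rank and the next distinct value gets the next integer.

```python
from collections import defaultdict

# Hypothetical cases: two employee categories, with one tie among clerical staff.
cases = [
    {"id": 1, "jobcat": "Clerical", "salary": 30000},
    {"id": 2, "jobcat": "Clerical", "salary": 45000},
    {"id": 3, "jobcat": "Clerical", "salary": 30000},
    {"id": 4, "jobcat": "Manager", "salary": 90000},
    {"id": 5, "jobcat": "Manager", "salary": 120000},
]

# Collect the distinct salaries in each category.
distinct = defaultdict(set)
for case in cases:
    distinct[case["jobcat"]].add(case["salary"])

# Rank 1 goes to the largest value; ranks are consecutive over distinct values.
rank_of = {
    group: {value: rank for rank, value in enumerate(sorted(values, reverse=True), start=1)}
    for group, values in distinct.items()
}

for case in cases:
    case["rank"] = rank_of[case["jobcat"]][case["salary"]]

print([(case["id"], case["rank"]) for case in cases])
```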

Automatic Recode
- Open Employee data.sav
- Automatically recode Current Salary; this shows how many different values of salary exist
- Click Transform > Automatic Recode
- Select the variable Current Salary
- Under Recode Starting from, click Lowest value
- Type the new name
- Click Add New Name
- Click OK
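
Conceptually, automatic recoding sorts the distinct values and numbers them 1, 2, 3 and so on. A minimal Python sketch of that idea, using hypothetical salaries and starting from the lowest value:

```python
# Hypothetical salaries containing repeated values.
salaries = [30000, 45000, 30000, 21000, 45000]

# Map each distinct value to a consecutive integer, starting from the lowest.
codes = {value: code for code, value in enumerate(sorted(set(salaries)), start=1)}
recoded = [codes[s] for s in salaries]

print(codes)
print(recoded)
print("distinct values:", len(codes))
```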


Command Syntax


Syntax File

- A syntax file is simply a text file that contains commands
- While it is possible to open a syntax window and type in the commands, it is easier to let SPSS build your syntax file
- There are three methods of doing this:
  - Pasting command syntax from dialog boxes
  - Copying syntax from the output log
  - Copying syntax from the journal file
- For online help, click Help and select Command Syntax Reference. It gives a reference guide for all syntax and the options available


Command Syntax Rules

- Each command must begin on a new line and end with a period (.)
- Most subcommands are separated by a slash (/). The slash before the first subcommand of a command is usually optional
- Variable names must be spelled out fully
- Text included within apostrophes or quotation marks must be contained on a single line
- Each line of command syntax cannot exceed 80 characters
- A period (.) must be used to indicate decimals, regardless of Windows regional settings
- Variable names should not end with a period (.)


More Points about Command Syntax

- Command syntax is case insensitive
- Three-letter abbreviations can be used for many commands
- Any number of lines can be used to specify a single command
- You can add spaces or break lines at almost any point where a single blank is allowed, such as around slashes, parentheses, arithmetic operators, or between variable names

For example,

    FREQUENCIES VARIABLES=JOBCAT GENDER
      /PERCENTILES=25 50 75
      /BARCHART.

and

    freq var=jobcat gender /percent=25 50 75 /bar.

are both acceptable alternatives that generate the same results.

Command Syntax from Dialog Boxes

- To paste the syntax, open the dialog box, make selections, and click Paste

For example:
- Open S3data.sav
- Click Analyze > Descriptive
- Select variables
- Click Options
- Select the statistics required
- Click Continue
- Click Paste


Using syntax from Output Log

- You can build a syntax file by copying from the log that appears in the Viewer
- Before running the analysis, choose Options from the Edit menu
- Click the Viewer tab
- Select Display commands in the log
- As you run analyses, the commands for your dialog box selections are recorded in the log


Syntax from Journal file

- By default, all commands executed during a session are recorded in the spss.jnl file
- The journal file is a text file and can be edited like any other text file, so you can edit it and save it as a syntax file
- Error messages and warnings are recorded in the journal file along with command syntax, so it should be edited before it is used as syntax
- Save the edited journal file under a different file name, because the journal file is automatically overwritten for each session
- To locate the file: File > Open > Other, then c:\windows\temp\spss.jnl


Syntax Files
- Syntax files are saved with the *.sps extension
- To open a syntax file, click File > Open > Syntax and select a *.sps file
- To run commands from a syntax file, select the commands and click the Run button (the right-pointing triangle)


Regression Analysis


What is it?
Regression analysis is a method used to develop an equation relating a single metric criterion variable to a set of predictor variables.


What does it do?

- Estimates the overall relationship between the criterion variable and the set of predictor variables
- Estimates the magnitude, relative importance, and statistical significance of the contribution of each predictor to the relationship
- Derives a predictive equation for the criterion variable on the basis of known values of the predictors


What is it used for?

- To forecast sales, market share, profitability
- To model choice, buying patterns, impact of marketing programs
- To estimate elasticities, response functions


Steps in Regression Analysis

- Identify clearly the criterion (dependent) variable to be explained
- Isolate and identify an exhaustive list of predictor (independent) variables that may explain the criterion variable
- Screen and carefully select the predictor variables and decide which to retain
- Estimate the parameters of the regression equation
- Perform diagnostic checks on the output of the analysis to test the adequacy of the model proposed by the researcher
- If the model appears to be adequate, understand and interpret the analysis
- Monitor, maintain and update the model


Assumptions

- The predictor (independent) variables are metric, and the criterion (dependent) variable is metric, continuous and unbounded
- All variables are measured without error
- All independent variables have non-zero variance
- There is no exact linear relationship (perfect multicollinearity) between two or more independent variables
- At each set of values of the k independent variables, the error terms are normally distributed with mean zero and constant variance (i.e. homoscedasticity)
- Each independent variable is uncorrelated with the error term
- The error terms for different observations are uncorrelated (i.e. no autocorrelation)


Example

- A market researcher is interested in consumers' attitudes towards nutritional additives in ready-to-eat cereals
- A set of written concept descriptions is prepared in which two characteristics are varied:
  - Amount of protein
  - Percentage of the minimum daily requirement of vitamin D
- The researcher obtains consumers' interval-scaled evaluations of ten concept descriptions on a preference rating scale from 1 (dislike extremely) to 9 (like extremely well)
- Multiple regression analysis is done with preference rating as the criterion (dependent) variable and amount of protein and percentage of vitamin D as the predictor (independent) variables
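
To show what estimating the regression equation involves for a case like this one, here is a Python sketch of least-squares estimation with two predictors, solved from the normal equations on centred data. Everything in it is an assumption made for illustration: the function name is hypothetical, and the ten data points are not the course data. They are constructed so that preference = 1 + 0.8 * protein + 0.4 * vitamin_d exactly, purely so that the recovered coefficients are easy to check.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n

    # Centred sums of squares and cross-products.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))

    det = s11 * s22 - s12 ** 2
    if det == 0:
        # Perfect multicollinearity: the coefficients are not identified.
        raise ValueError("predictors are perfectly collinear")

    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


# Hypothetical design for ten concept descriptions (not the course data).
protein = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
vitamin_d = [1, 3, 2, 4, 1, 3, 2, 4, 1, 3]
preference = [1 + 0.8 * p + 0.4 * v for p, v in zip(protein, vitamin_d)]

b0, b1, b2 = fit_two_predictors(protein, vitamin_d, preference)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

With real ratings the fit would not be exact, and the diagnostic checks listed under the steps in regression analysis would still be needed.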


Data collected


SPSS Output


Importance of predictors


Some terms used

- Heteroscedasticity: large variance associated with certain groups in a cross-sectional study, e.g. larger variance in consumption is associated with heavy users than with light users.
- Autocorrelation: correlated errors from adjacent time periods (panel data) or from cross-sectional observations (such as errors related through different ethnic groups in a market segmentation study) are called autocorrelation or serial correlation.
- The Durbin-Watson statistic is computed from the residuals and compared with table values, which give two limits (lower and upper) for the given number of independent variables and number of observations.
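
As a sketch of how the Durbin-Watson statistic is computed from residuals, the Python function below uses the usual textbook definition: the sum of squared differences between successive residuals divided by the sum of squared residuals. The function name and the residual values are hypothetical; values near 2 suggest no first-order autocorrelation, and the comparison with the tabulated limits is not shown.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals ordered in time.
alternating = [1, -1, 1, -1]   # sign flips every period (negative autocorrelation)
constant = [1, 1, 1, 1]        # identical residuals (extreme positive autocorrelation)

print(durbin_watson(alternating), durbin_watson(constant))
```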


Some terms ...

- Partial regression coefficient (b)
- Beta coefficient: b when all variables are transformed into standard unit variates (zero mean and unit standard deviation)
- Multiple correlation (R)
- Coefficient of multiple determination (R^2)
- Adjusted R^2: deflation of R-square that takes into account the number of predictor variables and the number of observations in the analysis
- Part correlation: correlation coefficient between Y and the residual that remains after X has been regressed on all other predictors. The square of the part correlation represents the absolute increase in R-square
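
The slide describes adjusted R^2 only in words. A common textbook form of the adjustment is 1 - (1 - R^2)(n - 1)/(n - k - 1), where n is the number of observations and k the number of predictors. The Python sketch below assumes that form; the function name and the input values are hypothetical.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-square using the common (n - 1) / (n - k - 1) correction."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical values: R-square of 0.8 from ten observations and two predictors.
print(adjusted_r_squared(0.8, 10, 2))
```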


Some terms

- Partial correlation: simple correlation between two residuals, the residual of Y and the residual of X, the residuals having been obtained by regressing both Y and X separately on all the remaining predictor variables
- Standard error: the standard deviation of actual Y values from the corresponding values fitted by least-squares regression analysis. It is a conditional standard deviation, measured from the regression line rather than from the sample mean
- Tolerance: a measure of multicollinearity, equal to 1 - Rj^2, where Rj is the multiple correlation using variable j as criterion and the other independent variables as predictors
- Variance Inflation Factor (VIF): reciprocal of tolerance
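
The last two definitions translate directly into arithmetic. A minimal Python sketch follows, with hypothetical function names and a hypothetical Rj^2 of 0.75 for predictor j:

```python
def tolerance(rj_squared):
    """Tolerance of predictor j: 1 - Rj^2."""
    return 1 - rj_squared


def variance_inflation_factor(rj_squared):
    """VIF of predictor j: the reciprocal of its tolerance."""
    tol = tolerance(rj_squared)
    if tol == 0:
        raise ValueError("perfect multicollinearity: tolerance is zero")
    return 1 / tol


# Hypothetical case: the other predictors explain 75% of the variance of predictor j.
print(tolerance(0.75), variance_inflation_factor(0.75))
```

A low tolerance (and so a high VIF) means that predictor j is largely explained by the other predictors.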
