bhfdhfc

Attribution Non-Commercial (BY-NC)

Discriminant analysis

Discriminant analysis is used to estimate the relationship between a categorical dependent variable and a set of interval-scaled independent variables.

Naresh Malhotra and David Birks, Marketing Research, 3rd Edition, Pearson Education Limited 2007

Slide 21.2

Chapter outline

1. Basic concept
2. Relation to regression and ANOVA
3. Discriminant analysis model
4. Statistics associated with discriminant analysis
5. Conducting discriminant analysis
6. Stepwise discriminant analysis

Slide 21.3

Table 21.1 Similarities and differences among ANOVA, regression and discriminant analysis

Y = b0 + b1 X1 + b2 X2 + b3 X3

Slide 21.4

Discriminant analysis

Y = b0 + b1 X1 + b2 X2 + b3 X3

Discriminant analysis is a technique for analyzing data when the dependent variable is categorical and the independent variables are interval in nature. The objectives of discriminant analysis are as follows:

- Development of discriminant functions, or linear combinations of independent variables, which will best discriminate between the categories of the dependent variable (groups).
- Examination of whether significant differences exist among the groups, in terms of the independent variables.
- Determination of which predictor variables contribute to most of the intergroup differences.
- Classification of cases to one of the groups based on the values of the independent variables.
- Evaluation of the accuracy of classification.

Slide 21.5

When the criterion variable has two categories, the technique is known as two-group discriminant analysis. When three or more categories are involved, the technique is referred to as multiple discriminant analysis. The main distinction is that, in the two-group case, it is possible to derive only one discriminant function. In multiple discriminant analysis, more than one function may be computed. In general, with G groups and k predictors, it is possible to estimate up to the smaller of G - 1 or k discriminant functions. The first function has the highest ratio of between-groups to within-groups sum of squares. The second function, uncorrelated with the first, has the second highest ratio, and so on. However, not all the functions may be statistically significant.
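The limit on the number of functions can be checked with a few lines of Python. This is only an illustrative sketch; the function name `max_discriminant_functions` is hypothetical and not part of any statistics package.

```python
def max_discriminant_functions(n_groups: int, n_predictors: int) -> int:
    """Upper bound on the number of discriminant functions: min(G - 1, k)."""
    if n_groups < 2 or n_predictors < 1:
        raise ValueError("need at least two groups and one predictor")
    return min(n_groups - 1, n_predictors)


# Two groups and five predictors: only one function can be derived.
print(max_discriminant_functions(2, 5))
# Four groups but only two predictors: the predictors are the binding limit.
print(max_discriminant_functions(4, 2))
```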

Slide 21.6

Geometric interpretation

[Figure: scatter plot of two groups, G1 and G2, on the independent variables X1 and X2, with the cases of both groups projected onto the discriminant axis D.]

Slide 21.7

The discriminant analysis model involves linear combinations of the following form:

D = b0 + b1 X1 + b2 X2 + b3 X3 + ... + bk Xk

where
D = discriminant score
b's = discriminant coefficients or weights
X's = predictors or independent variables

The coefficients or weights (b) are estimated so that the groups differ as much as possible on the values of the discriminant function. This occurs when the ratio of between-group sum of squares to within-group sum of squares for the discriminant scores is at a maximum.

Ratio to be maximised: between-group sum of squares / within-group sum of squares
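As a rough illustration of what "groups differ as much as possible" means, the sketch below computes discriminant scores for two candidate sets of weights and compares the between-group to within-group sum of squares ratio of the scores. All data values, weights and function names are hypothetical, and the code only evaluates given weights; it does not estimate the optimal ones.

```python
def discriminant_score(b0, weights, x):
    """D = b0 + b1*X1 + ... + bk*Xk for one case."""
    return b0 + sum(b * xi for b, xi in zip(weights, x))


def ss_ratio(scores_by_group):
    """Between-group SS divided by within-group SS of the scores."""
    all_scores = [s for group in scores_by_group for s in group]
    grand_mean = sum(all_scores) / len(all_scores)
    between = 0.0
    within = 0.0
    for group in scores_by_group:
        group_mean = sum(group) / len(group)
        between += len(group) * (group_mean - grand_mean) ** 2
        within += sum((s - group_mean) ** 2 for s in group)
    return between / within


# Hypothetical cases measured on two predictors (X1, X2).
group_1 = [(5, 2), (6, 3), (7, 2)]
group_2 = [(2, 3), (3, 2), (1, 2)]


def ratio_for(weights):
    scores = [[discriminant_score(0.0, weights, x) for x in g]
              for g in (group_1, group_2)]
    return ss_ratio(scores)


ratio_a = ratio_for((1.0, 0.0))  # weight only on X1
ratio_b = ratio_for((0.0, 1.0))  # weight only on X2
print(ratio_a, ratio_b)
```

With these made-up values the weights that load on X1 separate the groups far better than those that load on X2, which is the property the estimation procedure seeks to maximise.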

Slide 21.8

Canonical correlation. Canonical correlation measures the extent of association between the discriminant scores and the groups. It is a measure of association between the single discriminant function and the set of dummy variables that define the group membership.

Centroid. The centroid is the mean value of the discriminant scores for a particular group. There are as many centroids as there are groups, as there is one for each group. The means for a group on all the functions are the group centroids.

Classification matrix. Sometimes also called confusion or prediction matrix, the classification matrix contains the number of correctly classified and misclassified cases.

Slide 21.9

Discriminant function coefficients. The discriminant function coefficients (unstandardised) are the multipliers of variables, when the variables are in the original units of measurement.

Discriminant scores. The unstandardised coefficients are multiplied by the values of the variables. These products are summed and added to the constant term to obtain the discriminant scores.

Eigenvalue. For each discriminant function, the eigenvalue is the ratio of between-group to within-group sums of squares. Large eigenvalues imply superior functions.

Eigenvalue = between-group sum of squares / within-group sum of squares

Slide 21.10

F values and their significance. These are calculated from a one-way ANOVA.

Group means and group standard deviations. These are computed for each predictor for each group.

Pooled within-group correlation matrix. The pooled within-group correlation matrix is computed by averaging the separate covariance matrices for all the groups.

Slide 21.11

Standardised discriminant function coefficients. The standardised discriminant function coefficients are the discriminant function coefficients used as the multipliers when the variables have been standardised to a mean of 0 and a variance of 1.

Structure correlations. Also referred to as discriminant loadings, the structure correlations represent the simple correlations between the predictors and the discriminant function.

Total correlation matrix. If the cases are treated as if they were from a single sample and the correlations computed, a total correlation matrix is obtained.

Wilks' λ. Sometimes also called the U statistic, Wilks' λ for each predictor is the ratio of the within-group sum of squares to the total sum of squares. Its value varies between 0 and 1. Large values of λ (near 1) indicate that group means do not seem to be different. Small values of λ (near 0) indicate that the group means seem to be different.

Wilks' λ = within-group sum of squares / total sum of squares
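The univariate Wilks' λ for a single predictor follows directly from this definition. The sketch below uses made-up values for one predictor in two groups; the function name is hypothetical.

```python
def wilks_lambda_univariate(groups):
    """Within-group SS divided by total SS for one predictor."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    total_ss = sum((v - grand_mean) ** 2 for v in all_values)
    within_ss = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        within_ss += sum((v - group_mean) ** 2 for v in group)
    return within_ss / total_ss


# Hypothetical predictor whose group means are far apart: lambda near 0.
separated = wilks_lambda_univariate([[5, 6, 7], [1, 2, 3]])
# Hypothetical predictor with identical group means: lambda equals 1.
overlapping = wilks_lambda_univariate([[1, 2, 3], [1, 2, 3]])
print(separated, overlapping)
```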

Slide 21.12

Slide 21.13

Identify the objectives, the dependent variable and the independent variables. The dependent variable must consist of two or more mutually exclusive and collectively exhaustive categories. The predictor variables should be selected based on a theoretical model or previous research, or the experience of the researcher. One part of the sample, called the estimation or analysis sample, is used for estimation of the discriminant function. The other part, called the holdout or validation sample, is reserved for validating the discriminant function. Often the distribution of the number of cases in the analysis and validation samples follows the distribution in the total sample.

Slide 21.14

Example: We want to determine the salient characteristics of families that have visited a skiing resort during the last two years. The households that visited the resort during the last two years are coded as 1 and those that did not visit are coded as 2.

Table 21.2

Slide 21.15

Table 21.2 (Continued)

Slide 21.16

Table 21.3

Slide 21.17

The direct method involves estimating the discriminant function so that all the predictors are included simultaneously.

In stepwise discriminant analysis, the predictor variables are entered sequentially, based on their ability to discriminate among groups.

Slide 21.18

Table 21.4

The two groups are separated more widely in terms of income than in terms of the other variables.

The correlations among the independent variables are low. Small values of λ mean that the groups differ on these variables.

Only income, holiday and household size significantly differentiate between those who visited a resort and those who did not.

Slide 21.19

Because there are two groups, only one discriminant function is estimated. The eigenvalue associated with this function is 1.782, and it accounts for 100% of the explained variance.

The canonical correlation associated with this function is 0.8007. The square of this correlation is 0.64, which indicates that 64% of the variance in the dependent variable is explained by this model.
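The eigenvalue and the canonical correlation are two views of the same quantity: because the eigenvalue is the between-group over within-group sum of squares, the squared canonical correlation equals eigenvalue / (1 + eigenvalue). The sketch below applies this standard identity to the eigenvalue quoted on the slide; the function name is hypothetical, and small differences in the later decimals against the quoted correlation are presumably due to rounding in the reported figures.

```python
import math


def canonical_correlation(eigenvalue: float) -> float:
    """Canonical correlation implied by a discriminant function's eigenvalue."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))


eigenvalue = 1.782  # value quoted on the slide
r = canonical_correlation(eigenvalue)
# Roughly 0.80 for the correlation and 0.64 for its square.
print(round(r, 2), round(r * r, 2))
```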

The significance level indicates that the predictors significantly discriminate between the groups.

The standardised coefficients identify the most important independent variable in discriminating between groups. Predictors with large standardised coefficients contribute more to the discriminating power of the function.

The signs of the coefficients of all the independent variables are positive, which suggests that higher family income, household size, importance attached to family skiing holiday, attitude towards travel and age are more likely to result in the family visiting the resort.

Structure matrix: tells the relative importance of the predictors

Slide 21.20

Group centroids give the value of the discriminant function evaluated at the group means.
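Because the discriminant function is linear, the mean of a group's discriminant scores equals the function evaluated at that group's predictor means. The sketch below checks this on hypothetical data and then assigns a new case to the nearest centroid. The nearest-centroid rule is an assumption made for illustration (reasonable with equal priors and equal group sizes), and every number, weight and name here is made up.

```python
def discriminant_score(b0, weights, x):
    """D = b0 + b1*X1 + ... + bk*Xk for one case."""
    return b0 + sum(b * xi for b, xi in zip(weights, x))


def column_means(cases):
    """Mean of each predictor across the cases of one group."""
    n = len(cases)
    return tuple(sum(col) / n for col in zip(*cases))


def centroid(cases, b0, weights):
    """Mean discriminant score of one group."""
    scores = [discriminant_score(b0, weights, x) for x in cases]
    return sum(scores) / len(scores)


def nearest_group(score, centroids):
    """Index of the group whose centroid is closest to the score."""
    return min(range(len(centroids)), key=lambda i: abs(score - centroids[i]))


b0, weights = -4.0, (1.0, 0.5)          # hypothetical coefficients
group_1 = [(5, 2), (6, 4), (7, 6)]      # hypothetical cases
group_2 = [(1, 2), (2, 2), (3, 2)]

c1 = centroid(group_1, b0, weights)
c2 = centroid(group_2, b0, weights)
# Same values obtained by evaluating the function at the group means.
at_means_1 = discriminant_score(b0, weights, column_means(group_1))
at_means_2 = discriminant_score(b0, weights, column_means(group_2))

assigned = nearest_group(2.5, [c1, c2])  # 0 means group 1, 1 means group 2
print(c1, c2, assigned)
```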

Slide 21.21

Table 21.4 (Continued)

Slide 21.22

The null hypothesis that, in the population, the means of all discriminant functions in all groups are equal can be statistically tested.

In SPSS this test is based on Wilks' λ. If several functions are tested simultaneously (as in the case of multiple discriminant analysis), the Wilks' λ statistic is the product of the univariate λ for each function. The significance level is estimated based on a chi-square transformation of the statistic. If the null hypothesis is rejected, indicating significant discrimination, one can proceed to interpret the results.
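A minimal sketch of this test is given below. It assumes that the λ for each function is 1 / (1 + eigenvalue) and that the chi-square transformation is Bartlett's approximation, a commonly used choice; the slides do not say which transformation SPSS applies. The sample size of 30 is hypothetical, the five predictors correspond to income, travel, vacation, hsize and age, and the function names are made up.

```python
import math


def wilks_lambda(eigenvalues):
    """Overall Wilks' lambda: product of 1 / (1 + eigenvalue) over functions."""
    result = 1.0
    for eigenvalue in eigenvalues:
        result *= 1.0 / (1.0 + eigenvalue)
    return result


def bartlett_chi_square(lam, n_cases, n_predictors, n_groups):
    """Bartlett's approximation: returns (chi-square statistic, degrees of freedom)."""
    multiplier = n_cases - 1 - (n_predictors + n_groups) / 2.0
    statistic = -multiplier * math.log(lam)
    df = n_predictors * (n_groups - 1)
    return statistic, df


lam = wilks_lambda([1.782])  # single function, eigenvalue quoted on the slide
statistic, df = bartlett_chi_square(lam, n_cases=30, n_predictors=5, n_groups=2)
print(round(lam, 4), round(statistic, 2), df)
```

The statistic would then be compared with a chi-square distribution on `df` degrees of freedom to obtain the significance level.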

Slide 21.23

The interpretation of the discriminant weights, or coefficients, is similar to that in multiple regression analysis. Given the multicollinearity in the predictor variables, there is no unambiguous measure of the relative importance of the predictors in discriminating between the groups.

With this caveat in mind, we can obtain some idea of the relative importance of the variables by examining the absolute magnitude of the standardised discriminant function coefficients.

Some idea of the relative importance of the predictors can also be obtained by examining the structure correlations, also called canonical loadings or discriminant loadings. These simple correlations between each predictor and the discriminant function represent the variance that the predictor shares with the function.

Another aid to interpreting discriminant analysis results is to develop a characteristic profile for each group by describing each group in terms of the group means for the predictor variables.

Slide 21.24

Many computer programs, such as SPSS, offer a leave-one-out cross-validation option.

The discriminant weights, estimated by using the analysis sample, are multiplied by the values of the predictor variables in the holdout sample to generate discriminant scores for the cases in the holdout sample. The cases are then assigned to groups based on their discriminant scores and an appropriate decision rule.

The hit ratio, or the percentage of cases correctly classified, can then be determined by summing the diagonal elements of the classification matrix and dividing by the total number of cases.

It is helpful to compare the percentage of cases correctly classified by discriminant analysis to the percentage that would be obtained by chance. Classification accuracy achieved by discriminant analysis should be at least 25% greater than that obtained by chance.
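The hit ratio calculation can be sketched as follows. The classification matrix is hypothetical (rows are actual groups, columns are predicted groups), chance accuracy is taken as 1 / G for equal-sized groups, and "25% greater than chance" is read here as 1.25 times the chance accuracy; that reading is an assumption, since the slide does not spell it out.

```python
def hit_ratio(matrix):
    """Share of cases on the diagonal of a square classification matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def beats_chance(ratio, n_groups, margin=0.25):
    """True if the hit ratio is at least (1 + margin) times equal-group chance."""
    chance = 1.0 / n_groups
    return ratio >= (1.0 + margin) * chance


# Hypothetical holdout results for two groups of 15 cases each.
classification_matrix = [
    [12, 3],   # actual group 1: 12 correct, 3 misclassified
    [2, 13],   # actual group 2: 2 misclassified, 13 correct
]

ratio = hit_ratio(classification_matrix)
print(round(ratio, 4), beats_chance(ratio, n_groups=2))
```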

Slide 21.25

Table 21.5

Slide 21.26

Slide 21.27

Table 21.5 (Continued)

Slide 21.28

Table 21.5 (Continued)

Slide 21.29

Slide 21.30

Slide 21.31

Stepwise discriminant analysis is analogous to stepwise multiple regression in that the predictors are entered sequentially based on their ability to discriminate between the groups. An F ratio is calculated for each predictor by conducting a univariate analysis of variance in which the groups are treated as the categorical variable and the predictor as the criterion variable. The predictor with the highest F ratio is the first to be selected for inclusion in the discriminant function, if it meets certain significance and tolerance criteria. A second predictor is added based on the highest adjusted or partial F ratio, taking into account the predictor already selected.

Slide 21.32

Each predictor selected is tested for retention based on its association with other predictors selected. The process of selection and retention is continued until all predictors meeting the significance criteria for inclusion and retention have been entered in the discriminant function. The selection of the stepwise procedure is based on the optimizing criterion adopted. The Mahalanobis procedure is based on maximising a generalised measure of the distance between the two closest groups. The order in which the variables were selected also indicates their importance in discriminating between the groups.
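The first selection step can be illustrated with a univariate F ratio for each predictor. The sketch below covers only that first step, picking the predictor with the highest F. The significance and tolerance checks and the partial F ratios used for later steps are not implemented. The predictor names come from the skiing example, but all values are made up.

```python
def f_ratio(groups):
    """One-way ANOVA F ratio for one predictor split by group."""
    all_values = [v for group in groups for v in group]
    n_total = len(all_values)
    n_groups = len(groups)
    grand_mean = sum(all_values) / n_total
    between = 0.0
    within = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        between += len(group) * (group_mean - grand_mean) ** 2
        within += sum((v - group_mean) ** 2 for v in group)
    mean_square_between = between / (n_groups - 1)
    mean_square_within = within / (n_total - n_groups)
    return mean_square_between / mean_square_within


# Hypothetical values of two predictors for visitors and non-visitors.
predictors = {
    "income": [[5, 6, 7], [1, 2, 3]],
    "age": [[3, 4, 5], [2, 4, 6]],
}

f_values = {name: f_ratio(groups) for name, groups in predictors.items()}
first_selected = max(f_values, key=f_values.get)
print(f_values, first_selected)
```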

Slide 21.33

SPSS Windows

The DISCRIMINANT program performs both two-group and multiple discriminant analysis. To select this procedure using SPSS for Windows, click: Analyze>Classify>Discriminant. To run logit analysis or logistic regression using SPSS for Windows, click:

Slide 21.34

1. Select ANALYZE from the SPSS menu bar.
2. Click CLASSIFY and then DISCRIMINANT.
3. Move visit into the GROUPING VARIABLE box.
4. Click DEFINE RANGE. Enter 1 for MINIMUM and 2 for MAXIMUM. Click CONTINUE.
5. Move income, travel, vacation, hsize and age into the INDEPENDENTS box.
6. Select ENTER INDEPENDENTS TOGETHER (default option).
7. Click on STATISTICS. In the pop-up window, in the DESCRIPTIVES box check MEANS and UNIVARIATE ANOVAS. In the MATRICES box check WITHIN-GROUP CORRELATIONS. Click CONTINUE.
8. Click CLASSIFY.... In the pop-up window in the PRIOR PROBABILITIES box check ALL GROUPS EQUAL (default). In the DISPLAY box check SUMMARY TABLE and LEAVE-ONE-OUT CLASSIFICATION. In the USE COVARIANCE MATRIX box check WITHIN-GROUPS. Click CONTINUE.
9. Click OK.

Slide 21.35

The X4 variable indicates the region in which the firm was located, i.e. North America or outside North America. The HBAT management team is interested in any differences in perceptions between those customers served by the US sales force and those customers outside the US, who are served by independent distributors. The management team wants to see whether the other areas of operation (variables X6 to X18) are viewed differently by these two sets of customers. This inquiry follows the obvious need by management to always strive to better understand their customers, in this instance by focusing on any differences that may occur between geographic areas. If any perceptions of HBAT are found to differ significantly between firms in these two regions, the company would then be able to develop strategies to remedy any perceived deficiencies and develop differentiated strategies to accommodate different perceptions.

Slide 21.36

Dependent variable: X4. Independent variables: X6 to X18, used to discriminate between firms in each area.

Estimation model: The objective is to identify the set of independent variables (HBAT perceptions) that maximally differentiates between the two groups of customers.

Slide 21.37

Assessing group differences: In profiling the two groups, we can identify the variables with the largest differences in the group means (X6, X11, X12, X13 and X17). Repeat the same analysis using the stepwise method. Carry out profiling of each group on these variables to understand the differences between them.

Slide 21.38

We see varied profiles between these two groups on these five variables.

Group 0: Customers in the US have higher perceptions on three variables:
- X6 Product quality
- X13 Competitive pricing
- X11 Product line

Group 1: Customers outside the US have higher perceptions on these variables:
- X7 E-commerce
- X17 Price flexibility

The US customers have much better perceptions of the HBAT products, whereas the customers outside the US feel better about pricing issues and e-commerce.

Management should use these results to develop strategies that accentuate these strengths and develop additional strengths to complement them.
