4. NON-DISGUISED, STRUCTURED TECHNIQUES
   4.1. NOMINAL DATA
   4.2. ORDINAL SCALES
   4.3. INTERVAL SCALES
   4.4. RATIO SCALES
   4.5. SEMANTIC DIFFERENTIAL SCALE
   4.6. THE CONSTANT SUM SCALE
        4.6.1. Advantages
        4.6.2. Disadvantage
   4.7. THURSTONE SCALE
        4.7.1. Advantages
        4.7.2. Disadvantages
   4.8. LIKERT SCALE
        4.8.1. Advantages
        4.8.2. Disadvantages
   4.9. COMPARISON OF THURSTONE AND LIKERT SCALE
5. DISGUISED, STRUCTURED TECHNIQUES
6. CONCEPT TESTING
Members of PMRO (Group 7): Devatanu Banerjee (01), Roopa Chandrasekhar (07), Padma Kinger (20), Yashesh Mukhi (27), Aditi Nair (28), Seema Pillai (31), Rishi Tanna (41)
Topics covered: Interval Scale, Ratio Scale, Cartoon Method, Likert Scale, Thurstone Scale.
1. Laddering involves having respondents identify attributes that distinguish brands by asking questions. Each distinguishing attribute is then probed to determine why it is important or meaningful, and those reasons are probed in turn, and so forth. The purpose is to uncover the network of meanings associated with the product, brand, or concept.
2. Hidden-issue questioning focuses on individual respondents' feelings about sensitive issues. Analysis focuses on common underlying themes across respondents. These themes can then be used to guide advertising development.
3. Symbolic questioning requires respondents to describe the opposites of the product/activity of interest or of a specific attribute of the product/activity.
Individual depth interviews have been found to generate more and higher-quality ideas on a per-respondent basis than either focus groups or minigroups. They are particularly appropriate when:
1. Detailed probing of an individual's behavior, attitudes, or needs is required;
2. The subject matter under discussion is likely to be of a highly confidential nature (e.g. personal investments);
3. The subject matter is of an emotionally charged or embarrassing nature;
4. Certain strong, socially acceptable norms exist (e.g. baby feeding) and the need to conform in a group discussion may influence responses;
5. A highly detailed understanding of complicated behavior or decision-making patterns (e.g. planning the family holiday) is required; or
6. The interviews are with professional people, or with people on the subject of their jobs (e.g. finance directors).
2.1.2. Focus group discussions (F.G.Ds): The standard focus group interview in the United States involves 8 to 12 individuals and lasts about 2 hours. Normally each group is designed to reflect the characteristics of a particular market segment.
The respondents are selected according to the relevant sampling plan and meet at a central location that generally has facilities for taping and/or filming the interviews. In Europe, focus groups tend to consist of 6 to 8 respondents, vary in length from 1.5 to 4 hours, and are often conducted in the home of the recruiter. Otherwise the interviews are similar. The discussion itself is led by a moderator, who attempts to progress through three stages during the interview: (1) establish rapport with the group, structure the rules of group interaction, and set objectives; (2) provoke intense discussion in the relevant areas; and (3) summarize the group's responses to determine the extent of agreement. Generally, either the moderator or a second person prepares a summary of each session after analyzing the session's transcript. Focus group interviews can be applied to:
1. Basic-need studies for product idea creation,
2. New product idea or concept exploration,
3. Product positioning studies,
Group 7: Phoenix Market Research Organization (VES College) Page 5
4. Advertising and communications research,
5. Background studies on consumers' frames of reference,
6. Establishment of consumer vocabulary as a preliminary step in questionnaire development, and
7. Determination of attitudes and behavior.
Advantages
1. Each individual is able to expand and refine their opinions in interaction with the other members. This process provides more detailed and accurate information than could be derived from each individual separately.
2. A group interview situation is generally more exciting and offers more stimulation to the participants than the standard depth interview.
3. The security of being in a crowd encourages some members to speak out when they otherwise would not.
4. As the questions raised by the moderator are addressed to the entire group rather than to an individual, the answers contain a degree of spontaneity that is not produced by other techniques.
5. Focus groups can be used successfully with children over five. They are also very useful with adults in developing countries where literacy rates are low and survey research is difficult.
6. A final major advantage of focus groups is that executives often observe the interview (from behind mirrors) or watch films of the interview.
Disadvantages
1. Since focus group interviews last 1.5 to 3 hours and take place at a central location, securing cooperation from a random sample is difficult.
2. Those who attend group interviews and actively participate in them are likely to be different in many respects from those who do not.
3. There is a chance that participants may go along with the popular opinion instead of expressing their own, which may be contrary to the popular opinion.
4. The presence of a one-way mirror and/or observer(s) has been found to distort participants' responses.
5. The moderator can introduce serious biases in the interview by shifting topics too rapidly, verbally or nonverbally encouraging certain answers, failing to cover specific areas, and so forth.
6.
Focus groups are expensive on a per-respondent basis.
Minigroups
Minigroups consist of a moderator and 4 to 5 respondents, rather than the 8 to 12 used in most focus groups. They are used when the issue being investigated requires more extensive probing than is possible in a larger group. Minigroups do not allow the collection of confidential or highly sensitive data as might be possible in an individual depth interview. However, they do allow the researcher to obtain substantially greater depth of response on the topics that are covered. Further, the intimacy of the small group often allows discussion of quite sensitive issues.
The advantages and disadvantages of minigroups are similar to those of standard focus groups, but on a smaller scale. In principle, these interviews are the same as the previous ones, except that they are conducted in groups rather than with individuals. This method is therefore less expensive and less time-consuming than depth interviews. It is advantageous because it gives excellent leads to consumer attitudes that no other method can give. Another advantage is that each respondent receives stimulation for responding from his group members, so the interviewer need not prompt the interviewee to answer. The disadvantage is that one or two members could dominate the group and others might not get a chance to answer, which would again make it an individual effort.
A word or category is given in response to the word of interest to the researcher. Word association techniques are used in testing potential brand names and occasionally for measuring attitudes about particular products, product attributes, brands, packages, or advertisements.
3.1.2. Completion Techniques
This technique requires the respondent to complete an incomplete stimulus. Two types of completion are of interest to marketing researchers: sentence completion and story completion. Sentence completion, as the name implies, involves requiring the respondent to complete a sentence. In most sentence completion tests the respondents are asked to complete the sentence with a phrase. Generally they are told to use the first thought that comes to mind or anything that makes sense. Because the individual is not required directly to associate himself or herself with the answer, conscious or subconscious defenses are more likely to be relaxed, allowing a more revealing answer. Story completion is an expanded version of sentence completion. As the name suggests, part of a story is told and the respondent is asked to complete it.
3.1.3. Construction Techniques
This technique requires the respondent to produce or construct something, generally a story, dialogue, or description. Construction techniques are similar to completion techniques except that less initial structure is provided. Cartoon techniques present cartoon-type drawings of one or more people in a particular situation. One or more of the individuals are shown with a sentence in bubble form above their heads, and one of the others is shown with a blank bubble that the respondent is to fill in. Instead of having the bubble show replies or comments, it can be drawn to indicate the unspoken thoughts of one or more of the characters. This device allows the respondent to avoid any restraints that might be felt against having even a cartoon character speak, as opposed to think, certain thoughts.
Third-person techniques allow the respondent to project attitudes onto some vague third person. This third person is generally "an average woman," "your neighbors," "the guys where you work," "most doctors," or the like. Thus, instead of asking respondents why they did something or what they think about something, the researcher asks what friends, neighbors, or the average person thinks about the issue. Picture response, another useful construction technique, involves using pictures to elicit stories. These pictures are usually relatively vague, so that the respondent must use his or her imagination to describe what is occurring.
Fantasy scenario requires the respondent to make up a fantasy about the product or brand. Personification asks the respondent to create a personality for the product or brand.
3.1.4. Expressive Techniques
Role-playing is the only expressive technique utilized to any extent by marketing researchers. In role-playing, the consumer is asked to assume the role or behavior of an object or another person, such as a sales representative for a particular department store. The role-playing customer can then be asked to try to sell a given product to a number of different consumers who raise varying objections. The means by which the role-player attempts to overcome these objections can reveal a great deal about his or her attitudes. Another version of the technique involves studying the role-player's attitudes on what type of people should shop at the store in question.
3.1.5. Problems
As projective techniques generally require personal interviews with highly trained interviewers and interpreters to evaluate the responses, they tend to be very expensive. Small sample sizes can increase the probability of substantial sampling error. The reliance on small samples has often been accompanied by non-probability selection procedures. Some of the projective techniques require the respondents to engage in behavior that may well be strange to them; this is particularly true for techniques such as role-playing. Thus there is reason to believe that there might be error in the findings. Measurement error is also a serious issue with respect to projective techniques. The possibility of interpreter bias is obvious.
3.1.6. Promises
They can uncover information not available through direct questioning or observation. They are particularly useful in the exploratory stages of research. They can generate hypotheses for further testing and provide attribute lists and terms for more structured techniques such as the semantic differential.
The results of projective techniques can be used directly for decision-making.
Responses are timed so that those responses the respondent reasons out can be identified and taken into account in the analysis. The time limit is usually 5 seconds. The usual way of constructing such a test is to choose many stimulating and neutral words. The words are read out to the respondent one at a time, and the interviewer records the first word association given by the respondent. Respondents should not be asked to write their responses, because then the interviewer will not know whether the responses were spontaneous or whether the respondent took time to think them out. An example of such a test is: "Who would eat a lot of oatmeal?" If the first response is "athletes," the respondent feels that the product is more suited for sportspersons. More words on the same topic will reveal more about the respondent's attitude toward the product. While analyzing the results of word-association tests, responses are arranged along such lines as favorable-unfavorable and pleasant-unpleasant.
3.5.1. TAT
Clinical psychologists have long used this method. The respondent is shown many ambiguous pictures and is asked to spin stories about them. The interviewer may ask questions to help the respondent think. For example, "What is happening here?" focuses the answer toward an action, while "Which one is the aggressor?" makes the respondent think about the picture as one of aggression. Respondents must be asked such prompting questions because the pictures are very abstract and general, and as such are open to very broad and irrelevant interpretations; some amount of focus is needed to channel the respondent's thinking. Each subject in the pictures is a medium through which the respondent projects his feelings, ideas, emotions, and attitudes. The respondent attributes these feelings to the characters because he sees in the picture something related to himself. Responses differ widely, and analysis depends upon the ambiguity of the picture, the extent to which the respondent is able to guess the conclusions, and the vagueness of the supporting questions asked by the interviewer.
3.5.2. Cartoon Tests
These are a version or modification of the TAT, but they are simpler to administer and analyze. Cartoon characters are shown in a specific situation pertinent to a problem. One or more balloons indicating the conversation of the characters are left open. The respondent has to fill in these balloons, and the responses are then analyzed.
The term reliability refers to the degree of variable error in a measurement. We define reliability as the extent to which a measurement is free of variable errors. This is reflected when repeated measures of the same stable characteristic in the same objects show limited variation. A common conceptual definition of validity is the extent to which the measure provides an accurate representation of what one is trying to measure. In this conceptual definition, validity includes both systematic and variable error components. However, it is more useful to limit the meaning of the term validity to the degree of consistent or systematic error in a measurement. Therefore we define validity as the extent to which a measurement is free from systematic error. Measurement accuracy is then defined as the extent to which a measurement is free from both systematic and variable error. Accuracy is the ultimate concern of the researcher, since a lack of accuracy may lead to incorrect decisions.
3.6. Reliability
There are various operational approaches for estimating reliability, summarized below.
3.6.1. Approaches to Assessing Reliability
1. Test-retest reliability: Applying the same measure to the same objects a second time.
2. Alternative-forms reliability: Measuring the same objects by two instruments that are designed to be as nearly alike as possible.
3. Internal-comparison reliability: Comparing the responses among the various items on a multiple-item index designed to measure a homogeneous concept.
4. Scorer reliability: Comparing the scores assigned by two or more judges.
No one approach is best; several different assessment approaches should generally be used. The selection of one or more means of assessing a measure's reliability depends on the errors likely to be present and the cost of each assessment method in the situation at hand.
3.6.2. Test-Retest Reliability
Test-retest reliability estimates are obtained by repeating the measurement with the same instrument under as nearly equivalent conditions as possible. The results of the two administrations are then compared and the degree of correspondence is determined. The greater the differences, the lower the reliability.
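The degree of correspondence between the two administrations is commonly summarized as a correlation coefficient. The following is a minimal sketch of that calculation; the respondent scores are invented for illustration:

```python
# Test-retest reliability: correlate two administrations of the same
# instrument to the same respondents.  Scores here are hypothetical
# 1-5 attitude ratings from five respondents, two weeks apart.

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

test   = [4, 2, 5, 3, 4]   # first administration
retest = [4, 3, 5, 2, 4]   # second administration

r = pearson_r(test, retest)
print(f"test-retest reliability: r = {r:.2f}")
```

The larger the differences between the two administrations, the lower the resulting correlation; identical responses would give r = 1.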
A number of practical and computational difficulties are involved in measuring test-retest reliability:
1. Some items can be measured only once. For e.g.: it would not be possible to re-measure an individual's initial reaction to a new advertising slogan.
2. In many situations, the initial measurement may alter the characteristic being measured. Thus an attitude survey may focus the individual's attention on the topic and cause new or different attitudes to be formed about it.
3. There may be some form of carryover effect from the first measure. Retaking a measure may produce boredom, anger, or attempts to remember the answers given in the initial measurement.
4. Factors extraneous to the measuring process may cause shifts in the characteristic being measured. For e.g.: a favorable experience with the brand during the period between the test and the retest might cause a shift in individual ratings of that brand.
3.6.3. Alternative-Form Reliability
Alternative-form reliability estimates are obtained by applying two equivalent forms of the measuring instrument to the same subjects. As in test-retest reliability, the results of the two instruments are compared on an item-by-item basis and the degree of similarity is determined. The basic logic is the same as in the test-retest approach. Two primary problems are associated with this approach:
1. The extra time, expense, and trouble involved in obtaining two equivalent measures.
2. More importantly, the problem of constructing two truly equivalent forms. Thus a low degree of response similarity may reflect either an unreliable instrument or nonequivalent forms.
Despite these difficulties, researchers should use alternative measures of important concepts whenever possible to allow assessment of reliability (and validity) as well as to improve accuracy (by using the data from both measures).
3.6.4. Internal-Comparison Reliability
Internal-comparison reliability is estimated from the intercorrelation among the scores of the items on a multiple-item index. All items on the index must be designed to measure precisely the same thing. For e.g.: measures of store image generally involve assessing a number of specific dimensions of the store, such as price level, merchandise, service, and location. Because these are somewhat independent, an internal-comparison measure of reliability is not appropriate across dimensions. However, it can be used within each dimension if several items are used to measure each dimension. Split-half reliability is the simplest type of internal comparison. It is obtained by comparing the results of half the items on a multi-item measure with the results from
the remaining items. The usual approach to split-half reliability involves dividing the total number of items into two groups on a random basis and computing a measure of similarity. A better approach to internal comparison is known as coefficient alpha. This measure, in effect, produces the mean of all possible split-half coefficients resulting from different splittings of the measurement instrument. Coefficient alpha can range from 0 to 1. A value of .6 or less is usually viewed as unsatisfactory.
3.6.5. Scorer Reliability
Marketing researchers frequently rely on judgment to classify a consumer's response. This occurs, for e.g., when projective techniques, focus groups, observation, or open-ended questions are used. In these situations, the judges, or scorers, may be unreliable, rather than the instrument or respondent. To estimate the level of scorer reliability, each scorer should have some of the items he or she scores judged independently by another scorer. The correlation between the various judges is a measure of scorer reliability.
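Coefficient alpha can be computed directly from the item-by-respondent response matrix. A minimal sketch follows, using the standard formula (alpha = k/(k-1) x (1 - sum of item variances / variance of total scores)); the response data are invented for illustration:

```python
# Coefficient alpha for a multiple-item index.  Each row is one
# respondent's answers to k items intended to measure the same
# homogeneous concept.  Data are hypothetical 1-5 agreement scores.

def variance(values):
    """Population variance of a list of scores."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def coefficient_alpha(responses):
    """Cronbach's alpha: k/(k-1) * (1 - sum item variances / total variance)."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # column-wise item scores
    item_var = sum(variance(list(col)) for col in items)
    totals = [sum(row) for row in responses]   # each respondent's total score
    return (k / (k - 1)) * (1 - item_var / variance(totals))

responses = [   # five respondents x four items
    [4, 4, 5, 4],
    [2, 3, 2, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 2],
    [4, 5, 4, 4],
]

alpha = coefficient_alpha(responses)
print(f"coefficient alpha = {alpha:.2f}")  # values of .6 or less are usually unsatisfactory
```

Because the invented items move together across respondents, the resulting alpha is high; inconsistent items would drive it toward 0.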
3.7. Validity
Validity, like reliability, is concerned with error. However, it is concerned with consistent or systematic error rather than variable error. A valid measurement reflects only the characteristics of interest and random error. There are three basic types of validity:
1. Content validity,
2. Construct validity, and
3. Criterion-related validity (predictive and concurrent).
These are summarized below.
3.7.1. Basic Approaches to Validity Assessment
1. Content validation: Involves assessing the representativeness or the sampling adequacy of the items contained in the measuring instrument.
2. Criterion-related validation: Involves inferring an individual's score or standing on some measurement, called a criterion, from the measurement at hand.
   a. Concurrent validation: Involves assessing the extent to which the obtained score may be used to estimate an individual's present standing with respect to some other variable.
   b. Predictive validation: Involves assessing the extent to which the obtained score may be used to estimate an individual's future standing with respect to the criterion variable.
3.7.2. Content Validity
Content validity estimates are essentially systematic, but subjective, evaluations of the appropriateness of the measuring instrument for the task at hand. The term face validity has a similar meaning. However, face validity generally refers to nonexpert judgments by individuals completing the instrument and/or executives who must approve its use. This does not mean that face validity is unimportant. Respondents may refuse to cooperate or may fail to treat seriously measurements that appear irrelevant to them. Managers may refuse to approve projects that utilize measurements lacking face validity. Therefore, to the extent possible, researchers should strive for face validity. The most common use of content validity is with multi-item measures. In this case, the researchers or some other individual or group of individuals assess the representativeness, or sampling adequacy, of the included items in light of the purpose of the measuring instrument. Thus, an attitude scale designed to measure the overall attitude toward a shopping center would not be considered to have content validity if it omitted any major attributes such as location, layout, and so on. Content validation is the most common form of validation in applied marketing research.
3.7.3. Criterion-Related Validity
Criterion-related validity can take two forms, based on the time period involved:
1. Concurrent validity, and
2. Predictive validity.
Concurrent validity is the extent to which one measure of a variable can be used to estimate an individual's current score on a different measure of the same, or a closely related, variable. For e.g.: a researcher may be trying to relate social class to the use of savings and loan associations. In a pilot study, the researcher finds a useful relationship between attitudes toward savings and loan associations and social class, as defined by Warner's ISC scale.
The researcher now wishes to test this relationship further in a national mail survey. Unfortunately, Warner's ISC is difficult to use in a mail survey. Therefore, the researcher develops brief verbal descriptions of each of Warner's six social classes. Respondents will be asked to indicate the social class that best describes their household. Prior to using this measure, the researcher should assess its concurrent validity against the standard ISC scale. Predictive validity is the extent to which an individual's future level on some variable can be predicted by his or her performance on a current measurement of the same or a different variable. Predictive validity is the primary concern of the applied
marketing researcher. Some of the predictive validity questions that confront marketing researchers are: (a) Will a measure of attitudes predict future purchases? (b) Will a measure of sales in a controlled store test predict future market share? (c) Will a measure of initial sales predict future sales? and (d) Will a measure of the demographic characteristics of an area predict the success of a branch bank in the area?
3.7.4. Construct Validity
Construct validity, understanding the factors that underlie the obtained measurement, is the most complex form of validity. It involves more than just knowing how well a given measure works; it also involves knowing why it works. Construct validity requires that the researcher have a sound theory of the nature of the concept being measured and of how it relates to other concepts. A number of approaches exist for assessing construct validity, of which the most common is the multitrait-multimethod matrix approach. These multiple measures (by methods as different from each other as possible) of multiple traits or concepts can be analyzed by the Campbell-Fiske procedure, confirmatory factor analysis, or the direct product model. These techniques generally involve ensuring that the measure correlates positively with other measures of the same construct (convergent validity), does not correlate with theoretically unrelated constructs (discriminant validity), correlates in the theoretically predicted way with measures of different but related constructs (nomological validity), and correlates highly with itself (reliability). For e.g.: suppose we develop a multi-item scale to measure the tendency to purchase prestige brands. Our theory suggests that this tendency is caused by three personality variables:
1. Low self-focus,
2. High need for status, and
3. High materialism.
We believe that it is unrelated to brand loyalty and the tendency to purchase new products. Evidence of construct validity would exist if our scale:
1.
Correlates highly with other measures of prestige brand preference, such as reported purchases and classifications by friends (convergent validity);
2. Has low correlations with the unrelated constructs brand loyalty and tendency to purchase new products (discriminant validity);
3. Has a low correlation with self-focus and high correlations with need for status and materialism (nomological validity); and
4. Has a high level of internal consistency (reliability).
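The convergent and discriminant checks above reduce to a pattern of correlations: high with related measures, near zero with unrelated ones. A minimal sketch of that check follows; the scale scores, friends' classifications, and loyalty scores are all invented for illustration:

```python
# Checking a convergent/discriminant correlation pattern for a
# hypothetical prestige-brand scale.  All data are invented.

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

prestige_scale = [10, 14, 8, 16, 12]   # our multi-item scale totals
friend_rating  = [3, 4, 2, 5, 3]       # classification by friends (related)
brand_loyalty  = [4, 4, 3, 3, 5]       # theoretically unrelated construct

convergent = pearson_r(prestige_scale, friend_rating)
discriminant = pearson_r(prestige_scale, brand_loyalty)
print(f"convergent validity:   r = {convergent:.2f}")    # should be high
print(f"discriminant validity: r = {discriminant:.2f}")  # should be near zero
```

A high convergent correlation together with a near-zero discriminant correlation is the pattern that supports construct validity; the reverse pattern would argue against the scale.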
An interval scale is a scale of measurement where the distance between any two adjacent units of measurement (or 'intervals') is the same but the zero point is arbitrary. Scores on an interval scale can be added and subtracted but cannot be meaningfully multiplied or divided. For example, the time interval between the starts of years 1981 and 1982 is the same as that between 1983 and 1984, namely 365 days. The zero point, year 1 AD, is arbitrary; time did not begin then. Other examples of interval scales include the heights of tides, and the measurement of longitude.
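The claim that scores on an interval scale cannot be meaningfully divided can be illustrated with temperature, a classic interval scale (this example is an addition, not from the source):

```python
# Ratios on an interval scale depend on the arbitrary zero point.
# Converting Celsius to Fahrenheit is a legitimate rescaling of an
# interval scale, yet it changes the ratio of two readings while
# preserving the equality of differences.

def c_to_f(celsius):
    return celsius * 9 / 5 + 32

a, b, c = 10, 20, 30  # three Celsius readings

# Differences are meaningful: equal gaps stay equal after conversion.
print(c_to_f(b) - c_to_f(a) == c_to_f(c) - c_to_f(b))  # True (18 == 18)

# Ratios are not: "twice as hot" does not survive the conversion.
print(b / a)                  # 2.0 on the Celsius scale
print(c_to_f(b) / c_to_f(a))  # 1.36 on the Fahrenheit scale
```

The same argument applies to the calendar example: year 20 is not "twice as late" as year 10 in any meaningful sense, because year 1 AD is an arbitrary zero.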
A semantic differential item is laid out as a seven-point scale numbered 3 2 1 0 1 2 3 from each end, bounded by a pair of bipolar adjectives or phrases.
With a rank order scale the researcher has no way of knowing whether price is of overriding importance (GROUP C), part of a general, strong concern for overall cost (GROUP A), or not much more important than the other attributes (GROUP B). The constant sum scale provides such evidence.
4.6.2. Disadvantage
A disadvantage is that individuals may occasionally misassign points so that the total is more than, or less than, 100. This can be adjusted for by dividing each point allocation by the actual total and multiplying the result by 100.
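The adjustment just described can be sketched in a few lines; the attribute names and point allocations are invented for illustration:

```python
# Rescaling constant-sum allocations whose total is not exactly 100:
# divide each allocation by the actual total and multiply by 100.

def rescale_to_100(allocations):
    total = sum(allocations.values())
    return {attr: points / total * 100 for attr, points in allocations.items()}

# A respondent mistakenly allocated 105 points across three attributes.
raw = {"price": 40, "quality": 35, "service": 30}
adjusted = rescale_to_100(raw)

print(adjusted)  # relative preferences preserved; new total is 100
```

The rescaling preserves each attribute's relative share, which is all the constant sum scale is meant to capture.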
____ Strongly Agree
____ Agree
____ Neither Agree nor Disagree
____ Disagree
____ Strongly Disagree
To analyze a Likert scale, each response category is assigned a numerical value. The categories above could be assigned values such as Strongly Agree = 1 through Strongly Disagree = 5, the scoring could be reversed, or a -2 through +2 system could be used. The items can be analyzed on an item-by-item basis, or they can be summed to form a single score for each individual.
4.8.1. Advantages
1. It is relatively easy to construct and administer.
2. The instructions that accompany the scale are easily understood; hence it can be used for mail surveys and interviews with children.
4.8.2. Disadvantages
1. It takes longer to complete than Semantic Differential Scales, etc.
2. Care needs to be taken when using Likert scales in cross-cultural research, as there may be cultural variations in willingness to express disagreement.
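The scoring and summing step can be sketched as follows. One of the codings mentioned above (Strongly Agree = 5 down to Strongly Disagree = 1) is chosen here, and the reverse-keyed-item handling is an added detail, not from the source:

```python
# Scoring Likert items: map each response category to a numeric value,
# then sum across items for a single per-respondent score.

SCORES = {
    "Strongly Agree": 5,
    "Agree": 4,
    "Neither Agree nor Disagree": 3,
    "Disagree": 2,
    "Strongly Disagree": 1,
}

def likert_total(responses, reverse_items=()):
    """Sum item scores; reverse-keyed items are flipped (6 - score on a 1-5 coding)."""
    total = 0
    for i, answer in enumerate(responses):
        score = SCORES[answer]
        if i in reverse_items:
            score = 6 - score
        total += score
    return total

answers = ["Agree", "Strongly Agree", "Disagree", "Agree"]
print(likert_total(answers))                     # 4 + 5 + 2 + 4 = 15
print(likert_total(answers, reverse_items={2}))  # third item flipped: 4 + 5 + 4 + 4 = 17
```

Equivalently, each item's score could be analyzed on its own rather than summed, as the text notes.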
6. CONCEPT TESTING
Attitude Scale: Sets of rating scales used to measure one or more dimensions of an individual's attitude toward some object. Attitude scales are constructed using Likert, semantic differential, or Stapel scales.
Concurrent Validity: A measure of how accurate a measure of an object, state, or event is now, as opposed to how accurate it will be in the future (predictive validity). One measure of concurrent validity is how comparable the results of Instrument A and Instrument B are when both are used to measure the same characteristic in the same object at the same point in time.

Constant Sum Scale: Requires the respondent to divide a constant sum, generally 10 or 100, among two or more objects or attributes in order to reflect the respondent's relative preference for each object, the importance of the attribute, or the degree to which an object contains each attribute.

Construct Validity: Understanding the factors that underlie the obtained measurement. It involves knowing how well and why a given measure works by having a sound theory of the nature of the concept being measured and how it relates to other concepts.

Depth Interview: An interviewing procedure in which the interviewer does not have a prespecified list of questions. The interviewer is free to create questions and probe responses that appear relevant. Respondents are free to respond to questions in any way they think appropriate. Types of depth interviews include individual, minigroup, and focus group.

External Validity: The ability of the results from an experiment to predict the results in the actual situation.

Face Validity: A form of content validity that exists when nonexperts, such as respondents or executives, judge the measuring instrument as appropriate for the task at hand.

Free Word Association: A projective technique that requires the respondent to give the first word or thought that comes to mind after the researcher presents a word or phrase.

Internal Validity: The degree of replicability of an experiment, or assurance that experimental results are due to the variables manipulated in the experiment in that specific environment.
Interval Scale: Numbers are used to rank items such that numerically equal distances on the scale represent equal distances in the property being measured. The location of the zero point and the unit of measurement are determined by the researcher; consequently, ratios calculated on data from interval scales are not meaningful.

Ordinal Scale: A rating scale in which numbers, letters, or other symbols are used to assign ranks to items. An ordinal scale requires the respondent to indicate whether one item has more or less of a characteristic than another item. The magnitude of the difference between the items is not estimated.
Predictive Validity: The extent to which the future level of some variable can be predicted by a current measurement of the same or a different variable.

Projective Technique: The technique of inferring a subject's attitudes or values based on his or her description of vague objects requiring interpretation. Common types used in market research include cartoon, picture-response, third-person, and sentence-completion techniques.

Ratio Scale: A rating scale in which items are ranked so that numerically equal scale distances represent equal distances in the property being measured. These scales have a natural and known zero point.

Reliability: The extent to which a measurement is free of variable error. Reliability exists when repeated measures of the same stable characteristic in the same objects or persons show limited variation.

Scorer Reliability: The extent of agreement among judges (scorers) working independently to categorize a series of objects. The higher the degree of agreement between the judges, the greater the reliability of the categorization.

Semantic Differential Scale: An attitude scaling device that requires the respondent to rate the attitude object on a number of itemized, seven-point rating scales bounded at each end by one of two bipolar adjectives or phrases.

Sentence Completion Technique: A projective technique requiring the subject to complete a sentence using the first phrase that comes to mind. The subject is not required to associate himself or herself with the response.

Split-Half Reliability: A measure of reliability in which the results from half the items on a multi-item measure are compared with the results for the remaining items. If there is substantial variation between the groups, the reliability of the instrument is in doubt.

Validity: The extent to which a measurement is free of systematic error.