Chapter 1 Introduction

In today's world, an understanding of statistics is essential in many professions, across the spectrum from medicine to business. Physicians and other health-related professionals evaluate results of studies investigating new drugs and therapies for treating disease. Managers analyze quality of products, determine factors that help predict sales of various products, and measure employee performance.

But statistics is an important tool today even for those who do not use statistical methods as part of their job. Every day we are exposed to an explosion of information, from advertising, news reporting, political campaigning, surveys about opinions on controversial issues, and other communications containing statistical arguments. Statistics helps us make sense of this information and better understand our world. Even if you never use statistical methods in your career, we think that you will find many of the ideas in this text helpful in understanding the information that you will encounter.

We realize that you are probably not reading this book in hopes of becoming a statistician, and you may not even plan on working in social science research. In fact, you may suffer from math phobia and feel fear at what lies ahead. Please be assured that you can read this book and learn the major concepts and methods of statistics with very little knowledge of mathematics.
To understand this book, logical thinking and perseverance are much more important than mathematics. And don't be frustrated if learning comes slowly and you need to read a chapter a few times before it starts to make sense. Just as you would not expect to take a single course in a foreign language and be able to speak that language fluently, the same is true with the language of statistics. On the other hand, once you have completed even a portion of this text, you will have a much better understanding of how to make sense of quantitative information.

Data

Information-gathering is at the heart of all sciences. The social sciences use a wide variety of information-gathering techniques that provide the observations used in statistical analyses. These techniques include questionnaire surveys, telephone surveys, content analysis of newspapers and magazines, planned experiments, and direct observation of behavior in natural settings. In addition, social scientists often analyze information already observed and recorded for other purposes, such as police records, census materials, and hospital files. The observations gathered through such processes are collectively called data. Data consist of measurements on the characteristics of interest. We might measure, for instance, characteristics such as political party affiliation, annual income, marital status, race, and opinion about the legalization of abortion.

Statistics
Statistics consists of a body of methods for collecting and analyzing data.

These methods for collecting and analyzing data help us to evaluate the world in an objective manner.

Sec. 1.1 Introduction to Statistical Methodology

Purposes of Using Statistical Methods

Let's now be more specific about the objectives of using statistical methods. Statistics provides methods for

1. Design: Planning and carrying out research studies.
2. Description: Summarizing and exploring data.
3. Inference: Making predictions or generalizing about phenomena represented by the data.

Design refers to ways of determining how best to obtain the required data. The design aspects of a study might consider, for instance, how to conduct a survey, including the construction of a questionnaire and selection of a sample of people to participate in it.

Description and inference are the two elements of statistical analysis: ways of analyzing the data obtained as a result of the design. This book deals primarily with statistical analysis. This is not to suggest that statistical design is unimportant. If a study is poorly designed or if the data are improperly collected or recorded, then the conclusions may be worthless or misleading, no matter how good the statistical analysis. Methods for statistical design are covered in detail in textbooks on research methods (e.g., Babbie, 1995).

Description (describing and exploring data) includes ways of summarizing and exploring patterns in the data using measures that are more easily understood by an observer. The main purpose is to take what, to the untrained observer, are meaningless reams of data and present them in an understandable and useful form. The raw data are a complete listing of measurements for each characteristic under study. For example, an analysis of family size in New York City might start with a list of the sizes of all families in the city. Such bulks of data, however, are not easy to assess; we simply get bogged down in numbers.

For presentation of statistical information to readers, instead of listing all observations we use numbers that summarize the typical family size in the collection of data, or we present a graphical picture of the data. These summary descriptions, called descriptive statistics, are much more meaningful for most purposes than the complete data listing.
In addition, these descriptions and related explorations of the data may reveal patterns to be investigated more fully in future studies.

Inference consists of ways of making predictions based on the data. For instance, in a recent survey of 750 Americans conducted by the Gallup organization, 24% indicated a belief in reincarnation. Can we use this information to predict the percentage of the entire population of 260 million Americans that believe in reincarnation? A method presented later in this book allows us to predict that for this much larger group, the percentage believing in reincarnation falls between 21% and 27%. Predictions made using data are called statistical inferences.

Social scientists use descriptive and inferential statistics to answer questions about social phenomena. For instance, "Are women politically more liberal than men?" "Is the imposition of the death penalty associated with a reduction in violent crime?" "Does student performance in secondary schools depend on the amount of money spent per student, the size of the classes, or the teachers' salaries?" Statistical methods help us study such issues.

1.2 Description and Inference

We have seen that statistics consists of methods for designing studies and methods for analyzing data collected for the studies. Statistical methods for analyzing data include descriptive methods for summarizing the data and inferential methods for making predictions. A statistical analysis is classified as descriptive or inferential, according to whether its main purpose is to describe the data or make predictions. To explain this distinction in more detail, we next define the population and sample.

Populations and Samples

The objects on which one makes measurements are called the subjects for the study. Usually the subjects are people, but might instead be families, schools, cities, or companies, for instance.
Population and Sample
The population is the total set of subjects of interest in a study. A sample is the subset of the population on which the study collects data.

The ultimate goal of any study is to learn about populations. But it is usually necessary, and more practical, to study only samples from those populations. For example, the Gallup and Harris polling organizations usually select samples of 750-1500 Americans to collect information about political and social beliefs of the population of all Americans.

Descriptive Statistics

Descriptive statistical methods summarize the information in a collection of data. We use descriptive statistics to summarize basic characteristics of a sample. We might, for example, describe the typical family size in New York City by computing the average family size. The main purpose of descriptive statistics is to explore the data and to reduce them to simpler and more understandable terms without distorting or losing much of the available information. Summary graphs, tables, and numbers such as averages and percentages are easier to comprehend and interpret than are long listings of data.

Inferential Statistics

Inferential statistical methods provide predictions about characteristics of a population, based on information in a sample from that population. Example 1.1 illustrates the use of inferential statistics.

Example 1.1 Opinion About Handgun Control

The first author of this text is a resident of Florida, a state with a relatively high crime rate. He would like to know the percentage of Florida residents who favor controls over the sales of handguns. The population of interest is the collection of more than 10 million adult residents in Florida. Since it is impossible for him to discuss the issue with everyone,
he can study results from a poll of 834 residents of Florida conducted in 1995 by the Institute for Public Opinion Research at Florida International University. In that poll, 54% of the sampled subjects said that they favored controls over the sales of handguns.

This poll collected data for 834 residents. He is interested, however, not just in those 834 people but in the entire population of all adult Florida residents. Inferential statistics can provide a prediction about this larger population using the sample data. An inferential method presented in Chapter 5 predicts that the population percentage favoring control over sales of handguns falls between 50% and 58%. Even though the sample is very small compared to the population size, he can conclude, for instance, that probably a slim majority of Florida residents favor handgun control.

Using inferential statistical methods with properly chosen samples, we can determine characteristics of entire populations quite well by selecting samples that are small relative to the size of the population.

Parameters and Statistics

Parameters and Statistics
A parameter is a numerical summary of the population. A statistic is a numerical summary of the sample data.

Example 1.1 dealt with estimating the percentage of Florida residents who support gun control. The parameter of interest was the true, but unknown, population percentage favoring gun control. The inference about this parameter was based on a statistic: the percentage of the 834 Florida residents in the sample who favor gun control, namely, 54%. Since this number describes a characteristic of the sample, it is an example of a descriptive statistic. The value of the parameter to which the inference refers, namely, the population percentage in favor of gun control, is unknown. In summary, we use known sample statistics in making inferences about unknown population parameters.

(Students should note that in statistical usage,
the term parameter does not have its usual meaning of "limit" or "boundary.")

The primary focus of most research studies is the parameters of the population, not the statistics calculated for the particular sample selected. The sample and statistics describing it are important only insofar as they provide information about the unknown parameters. We would want a prediction about all Floridians, not only the 834 subjects in the sample.

An important aspect of statistical inference involves reporting the likely accuracy of the sample statistic that predicts the value of a population parameter. An inferential statistical method predicts how close the sample value of 54% is likely to be to the true (unknown) percentage of the population favoring gun control. A method from Chapter 5 determines that a sample of size 834 yields accuracy within about 4%; that is, the true population percentage favoring gun control falls within 4% of the sample value of 54%, or between 50% and 58%.

When data exist for an entire population, there is no need to use inferential statistical methods, since one can then calculate exactly the parameters of interest. For example, place of residence and home ownership are observed for virtually all Americans during census years. When the population of interest is small, we would normally study the records of the entire population instead of only a sample. In studying the voting records of members of the U.S. Senate on bills concerning defense appropriations, for example, we could obtain data on votes for all senators on all such bills.

In most social science research, it is impractical to collect data for the entire population, due to monetary and time limitations. It is usually unnecessary to do so, in any case, since good precision for inferences about population parameters results from relatively small samples, such as the 750-1500 subjects that most polls take. This book explains why this is so.
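As a rough check on the accuracy figures quoted in this section, the following sketch computes an approximate 95% margin of error for a sample percentage. It assumes a simple random sample and the usual normal approximation, 1.96 * sqrt(p(1 - p)/n). The function name margin_of_error and these assumptions are made here for illustration only; the method of Chapter 5 may differ in detail. Under these assumptions the Florida poll comes out at roughly 3.4 percentage points, a little tighter than the "about 4%" quoted in the text.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample and a normal approximation;
    z = 1.96 corresponds to roughly 95% confidence.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Florida handgun poll from Example 1.1: 54% of 834 respondents.
florida = margin_of_error(0.54, 834)

# Gallup reincarnation survey: 24% of 750 respondents.
gallup = margin_of_error(0.24, 750)

# Two ends of the 750-1500 range, using p = 0.5 (the widest case).
small_poll = margin_of_error(0.5, 750)
large_poll = margin_of_error(0.5, 1500)

for label, value in [("Florida poll", florida), ("Gallup poll", gallup),
                     ("n = 750", small_poll), ("n = 1500", large_poll)]:
    print(f"{label}: +/- {100 * value:.1f} percentage points")
```

In this approximation the size of the population does not enter the calculation at all, only the sample size, which is one way to see why a sample of 750-1500 subjects can serve for a population of millions.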
Defining Populations

Inferential statistical methods require specifying clearly the population to which the inferences apply. Sometimes the population is a clearly defined set of subjects. In Example 1.1, it was the collection of adult Florida residents. Often, however, the generalizations refer to a conceptual population: a population that does not actually exist but that one can hypothetically conceptualize.

For example, suppose a team of researchers tests a new drug designed to relieve severe depression. They plan to analyze results for a sample of patients suffering from depression to make inferences about the conceptual population of all individuals who might suffer depressive symptoms now or sometime in the future. Or a consumer organization may evaluate gas mileage for a new model of an automobile by observing the average number of miles per gallon for five sample autos driven on a standardized 100-mile course. Inferences then refer to the performance on this course for the conceptual population of all autos of this model that will be or could hypothetically be manufactured.

A caution is due here. Investigators often try to generalize to a broader population than the one to which the sample results can be statistically extended. A psychologist may conduct an experiment using a sample of students from an introductory psychology course. With statistical inference, the sample results generalize to the population of all students in the class. For the results to be of wider interest, however, the psychologist might claim that the conclusions generalize to all college students, to all young adults, or even to a more heterogeneous group. These generalizations may well be wrong, since the sample may differ from those populations in fundamental ways, such as in racial composition or average socioeconomic status.

For instance, in her 1987 book Women and Love,
Shere Hite presented results of a survey she conducted of adult women in the United States. One of her conclusions was that 70% of women who had been married at least five years have extramarital affairs. She based this conclusion on responses to questionnaires returned from a sample of 4500 women, which sounds impressively large. However, the questionnaire was mailed to about 100,000 women. We cannot know whether the 4.5% of the women who responded are representative of the 100,000 who received the questionnaire, much less the entire population of adult American women. Thus, it is dangerous to try to make an inference to the larger population.

You should carefully assess the scope of conclusions in research articles, political and government reports, advertisements, and the mass media. Evaluate critically the basis for the conclusions by noting the makeup of the sample upon which the inferences are built. Chapter 2 discusses some desirable and undesirable types of samples.

In the past quarter century, social scientists have increasingly recognized the power of inferential statistical methods. Presentation of these methods occupies a large portion of this textbook, beginning in Chapter 5.

1.3 The Role of Computers in Statistics

Even as you read this book, the computer industry continues its ceaseless growth. New and more powerful personal computers and workstations are reaching the market, and these computers are becoming more accessible to people who are not technically trained. An important aspect of this expansion is the development of highly specialized software.

Versatile and user-friendly software is now readily available for analyzing data using descriptive and inferential statistical methods. The development of this software has provided an enormous boon to the use of sophisticated statistical methods.

Statistical Software

Statistical Package for the Social Sciences (SPSS), SAS (SAS Institute, Inc.), and Minitab are among the popular statistical software packages found on college campuses. It is much easier to apply statistical methods using this software than using old-fashioned hand calculation. The accuracy of computations is greatly improved, since hand calculations often result in mistakes or crude answers, especially when the data set is large. Moreover, some modern statistical methods presented in this text are too complex to be done by hand.

The first seven chapters of this text present some fundamental ideas and methods of statistics. The remaining chapters deal with more advanced methods that provide more complete analyses. The presentation of these more advanced methods shows examples of the output of statistical software. The calculations for the methods of Chapters 8-16 are so complex that they would almost always be done by computer. One purpose of this textbook is to teach you what to look for in a computer printout and how to interpret the information provided. Knowledge of computer programming is not necessary for using statistical software or for reading this book.

The appendix contains many examples of the use of SAS and SPSS statistical software for conducting the analyses, organized by chapter. If you use SAS or SPSS, refer to this appendix as you read each chapter to see how they perform the analyses of that chapter.

A Data File

Table 1.1 is an example of part of a file of data as organized for analysis by computer software. Each row contains the measurements for a different subject in the sample. Each column contains the measurements for a particular characteristic we observe. Table 1.1 shows data for the first ten subjects in the sample, for the characteristics: subject's gender (F = female, M = male), racial group (B = black, H = Hispanic, W = white), marital status (1 = married, 0 = unmarried), age (in years), and annual income (in thousands of dollars). Some of the data are numerical,
and some consist simply of labels. Chapter 2 introduces the various types of data that occur in files of this type. Chapter 3 presents descriptive statistics for summarizing the information about each characteristic.

TABLE 1.1 Example of Part of a Data File

Subject   Gender   Racial Group   Marital Status   Age   Annual Income
1         F        W              1                ?     ?
2         F        B              0                ?     ?
3         M        W              1                ?     ?
4         F        W              1                ?     46.2
5         M        B              1                30    16.5
6         M        W              0                ?     ?
7         M        W              1                ?     ?
8         F        W              0                ?     ?
9         F        H              1                ?     ?
10        M        B              0                ?     ?

Uses and Misuses

A note of caution: The easy accessibility of complex statistical methods by means of computer software has dangers as well as benefits. It is simple, for example, to apply inappropriate methods to the data. A computer performs the analysis requested whether or not the assumptions required for its proper use are satisfied.

Many incorrect analyses result when researchers take insufficient time to understand the nature of a statistical method, the assumptions for its use, or its application to the specific problem. It is vital to understand the method before using it. Even if you read about similar studies that used the same type of analysis you are considering, do not assume that the analysis was correct without understanding the reasons. It may well be that a different approach is more appropriate.

Just knowing how to use statistical software does not guarantee a proper analysis. You'll need a good background in statistics to understand which method to select, which options to choose in that method, and how to make valid conclusions based on the computer output. The main purpose of this text is to give you this background.

1.4 Chapter Summary

The discipline of statistics includes methods for

• designing and carrying out research studies,
• exploring and describing data,
• making predictions (inferences) using the data.

We normally apply statistical methods to measurements in a sample selected from a population of interest. Statistics describe samples,
while parameters describe populations. The two types of statistical analyses are descriptive methods for summarizing the sample and population and inferential methods for making predictions about population parameters using sample statistics. Statistical methods are easy to apply using computer software, relieving us of computational drudgery and helping us focus on the proper application and interpretation of the methods.

PROBLEMS

Practicing the Basics

1. Distinguish between description and inference as two purposes for using statistical methods. Illustrate the distinction using an example.
2. Give an example of a situation in which descriptive statistics would be helpful but inferential statistical methods would not be needed.
3. The Environmental Protection Agency (EPA) uses a few new automobiles of each brand every year to collect data on pollution emission and gasoline mileage performance. For a particular brand, identify the (a) subject, (b) sample, (c) population.
4. a) Distinguish between a statistic and a parameter.
   b) An article in a Florida newspaper (The Gainesville Sun) in 1995 reported that 66.5% of Floridians believe that state government should not restrict access to abortion. Is 66.5% most likely the value of a statistic, or of a parameter? Why?
5. The student government at the University of Wisconsin conducts a study about alcohol abuse among students. One hundred of the 40,000 members of the student body are sampled and asked to complete a questionnaire. One question asked is "On how many days in the past week did you consume at least one alcoholic drink?"
   a) Describe the population of interest.
   b) For the 40,000 students, suppose that one characteristic of interest is the percentage who would respond "zero" to this question. This value is computed for the students sampled. Is it a parameter or a statistic? Why?
6.
The Current Population Survey of about 60,000 households in the United States in 1992 indicated that 10.3% of whites, 31.0% of blacks, and 26.7% of Hispanics in the United States have annual income below the poverty level (Statistical Abstract of the United States, 1994).
   a) Are these numbers statistics or parameters? Explain.
   b) Using a method from this text, we would conclude that the percentage of all black households in the United States having income below the poverty level is at least 30% but no greater than 32%. What type of statistical method does this illustrate, descriptive or inferential?

Concepts and Applications

7. Your instructor will help the class create a data file consisting of the values for class members of characteristics such as GE = gender, AG = age in years, HI = high school GPA (on a four-point scale), CO = college GPA, DH = distance (in miles) of the campus from your home town, DR = distance (in miles) of the classroom from your current residence, number of times a week you read a newspaper, TV = average number of hours per week that you watch TV, SP = average number of hours per week that you participate in sports or have other physical exercise, VE = whether you are a vegetarian (yes, no), AB = opinion about whether abortion should be legal in the first three months of pregnancy (yes, no), PI = political ideology (1 = very liberal, 2 = liberal, 3 = slightly liberal, 4 = moderate, 5 = slightly conservative, 6 = conservative, 7 = very conservative), PA = political affiliation (D = Democrat, R = Republican, I = independent), RE = how often you attend religious services (never, occasionally, most weeks, every week), LD = belief in life after death (yes, no), AA = support affirmative action (yes, no), AH = number of people you know who have died from AIDS or who are HIV+. Alternatively,
your instructor may ask you to use a data file of this type already prepared in fall 1996 with a class of social science graduate students at the University of Florida, available on the World Wide Web at http://www.stat.ufl.edu/users/aa/social/data.html
   Using a spreadsheet program or the statistical software the instructor has chosen for your course, create a computer data file containing this information. Each row of the file should contain data for a particular student, and each column should contain values of a particular characteristic. Print the data. What are some questions one might ask about these data? Homework exercises in each chapter will refer to these data.
8. A sociologist is interested in estimating the average age at marriage for women in New England in the early eighteenth century. She finds within her state archives reasonably complete marriage records for a large Puritan village for the years 1700-1730. She then takes a sample of those records, noting the age of the bride for each. The average age of the brides in the sample is 24.1 years. Using a statistical method from Chapter 5, the sociologist then estimates the average age of brides at marriage for the population to be between 23.5 and 24.7 years.
   a) What part of this example is descriptive?
   b) What part of this example is inferential?
   c) To what population does the inference refer?
9. The 1994 General Social Survey of adult Americans asked subjects whether astrology (the study of star signs) has some scientific truth. Of 1245 sampled subjects who had an opinion, 651 responded definitely or probably true, and 594 responded definitely or probably not true. The proportion responding definitely or probably true was 651/1245 = .523.
   a) Describe the population of interest.
   b) For what population parameter might we want to make an inference?
   c) What sample statistic could be used in making this inference?
   d) Does the value of the statistic in (c) necessarily equal the parameter in (b)? Explain.
10. Look at a few recent issues of a major social science journal, such as American Sociological Review or American Political Science Review. About what proportion of the articles seem to use statistics? Find some examples of descriptive statistics.
11. Find out what statistical software is available to you while taking this course, either on PCs or workstations in a computer lab or on an institution-wide mainframe computer. Find out how to access the software, enter data, and print any files you create. As an exercise, create a data file using the data in Table 1.1 in Section 1.3, and print it.

Chapter 2 Sampling and Measurement

The ultimate goals of social science research are to understand, explain, and make inferences about social phenomena. To do this, we need data. Descriptive statistical methods provide ways of summarizing the data. Inferential statistical methods use sample data to make predictions about populations. We must decide which subjects of the population to sample. Selecting a sample that is likely to be representative of the population is a primary topic of this chapter.

We must convert our ideas about social phenomena into actual data through measurement. The development of ways to measure abstract concepts such as prejudice, love, intelligence, and status is one of the most difficult problems of social research. Moreover, the problems related to finding valid and reliable measures of concepts have consequences for statistical analysis of the data. In particular, invalid or unreliable data-gathering instruments render the statistical manipulations of the data meaningless.

The first section of this chapter discusses some statistical aspects of measurement, such as the different types of data. The second and third sections discuss the principal methods for selecting the sample that provides the measurements.

2.1 Variables and Their Measurement

Statistical methods provide a way to deal with variability.
Variation occurs among people, schools, towns, and the various subjects of interest to us in our everyday lives. For instance, variation occurs from person to person in characteristics such as income, IQ, political party preference, religious beliefs, marital status, and musical talent. We shall see that the nature and the extent of the variability have important implications for both descriptive and inferential statistical methods.

Variables

A characteristic measured for each subject in a sample is called a variable. The name refers to the fact that values of the characteristic vary among subjects in a sample or population.

Variable
A variable is a characteristic that can vary in value among subjects in a sample or population.

Each subject has a particular value for a variable, but different subjects may possess different values. Examples of variables are gender (with values female and male), age at last birthday (with values 0, 1, 2, 3, and so on), religious affiliation (Protestant, Roman Catholic, Jewish, Other, None), number of children in a family (0, 1, 2, ...), and political party preference (Democrat, Republican, Independent). The possible values the variable can assume form the scale for measuring the variable. For gender, for instance, that scale consists of the two labels, female and male.

The valid statistical methods for analyzing a variable depend on the scale for its measurement. We treat a numerical-valued variable such as annual income (in thousands of dollars) differently than a variable with a scale consisting of labels, such as political preference (with scale Democrat, Republican, Independent). We next introduce two ways to classify variables that determine the valid statistical methods. The first refers to whether the measurement scale consists of labels or numbers. The second refers to the number of levels in that scale.
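To make the idea of a variable and its scale concrete, the sketch below stores a few hypothetical subjects as records and lists the values each variable takes among them. The records, the variable names, and the two helper functions are invented for illustration and do not come from the text.

```python
# Hypothetical sample: one record (row) per subject, one key per variable.
subjects = [
    {"gender": "F", "party": "Democrat", "children": 2},
    {"gender": "M", "party": "Independent", "children": 0},
    {"gender": "F", "party": "Republican", "children": 1},
    {"gender": "M", "party": "Democrat", "children": 2},
]

def observed_values(records, variable):
    """Return the distinct values a variable takes in the sample, sorted."""
    return sorted({record[variable] for record in records})

def has_numeric_scale(records, variable):
    """Return True if every observed value of the variable is a number."""
    return all(isinstance(record[variable], (int, float)) for record in records)

for name in ["gender", "party", "children"]:
    kind = "numbers" if has_numeric_scale(subjects, name) else "labels"
    print(name, observed_values(subjects, name), kind)
```

The values observed in a sample need not exhaust the scale, and numeric codes such as 1 = married, 0 = unmarried are still labels, so a check like has_numeric_scale is only a first guess at how a variable should be treated.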
Qualitative and Quantitative Data

Data are called qualitative when the scale for measurement is a set of unordered categories. For example, marital status, with categories (single, married, divorced, widowed), is qualitative. For Canadians, the province of one's residence is qualitative, with the categories Alberta, British Columbia, and so on. Other qualitative variables are religious affiliation (with categories such as Catholic, Jewish, Muslim, Protestant, Other, None), gender (female, male), political party preference (Democrat, Republican, Independent), and marriage form of a society (monogamy, polygyny, polyandry). For each variable, the categories are unordered; the scale does not have a "high" or "low" end.

For qualitative variables, distinct categories differ in quality, not in quantity or magnitude. Although the different categories are often called the levels of the scale, no level is greater than or smaller than any other level. Names or labels such as "Alberta" and "British Columbia" identify the categories, but those names do not represent different magnitudes of the variable.

When the possible values of a variable do differ in magnitude, the variable is called quantitative. Each possible value of a quantitative variable is greater than or less than any other possible value. Such comparisons result from variables having a numerical scale. Examples of quantitative variables are a subject's annual income, number of years of education completed, number of siblings, and number of times arrested.

The set of categories for a qualitative variable is called a nominal scale. For instance, a variable pertaining to one's mode of transportation to work might use the nominal scale consisting of the categories (car, bus, subway, bicycle, walk).
A set of numerical values for a quantitative variable is called an interval scale. Interval scales have a specific numerical distance or "interval" between each pair of levels. Annual income is usually measured on an interval scale; the interval between $40,000 and $30,000, for instance, equals $10,000. We can compare outcomes in terms of how much larger or how much smaller one is than the other, a comparison that is not relevant for a nominal scale.

A third type of scale falls, in a sense, between nominal and interval. It consists of categorical scales having a natural ordering of values, but undefined interval distances between the values. Examples are social class (classified into upper, middle, lower), political philosophy (measured as very liberal, slightly liberal, moderate, slightly conservative, very conservative), and government spending on the environment (classified as too little, about right, too much). These scales are not nominal, because the categories are naturally ordered. The levels are said to form an ordinal scale.

Ordinal scales consist of a collection of ordered categories. Although the categories have a clear ordering, the distances between them are unknown. For example, a person categorized as very liberal is more liberal than a person categorized as slightly liberal, but there is no numerical value for how much more liberal that person is.

Both nominal and ordinal scales consist of a set of categories. Each observation falls into one and only one category. Variables having categorical scales are called categorical variables. While the categories have a natural ordering for an ordinal scale, they are unordered for a nominal scale. For the categories (Catholic, Jewish, Muslim, Protestant, Other, None) for religious affiliation, it does not make sense to think of one category as being higher or lower than another.

The various scales refer to the actual measurement of social phenomena and not to the phenomena themselves.
Place of residence may indicate the geographic place name of one's residence (nominal), the distance of that residence from a point on the globe (interval), the size of one's community (interval or ordinal), or other kinds of sociological variables.

Quantitative Nature of Ordinal Data

As we've discussed, data from nominal scales are qualitative: distinct levels differ in quality, not in quantity. Data from interval scales are quantitative; distinct levels have differing magnitudes of the characteristic of interest. The position of ordinal scales on the quantitative-qualitative classification is fuzzy. Because their scale consists of a set of categories, they are often treated as qualitative, being analyzed using methods for nominal scales. But in many respects, ordinal scales more closely resemble interval scales. They possess an important quantitative feature: each level has a greater or smaller magnitude of the characteristic than another level.

Some statistical methods apply specifically to ordinal variables. Often, though, statisticians take advantage of the quantitative nature of ordinal scales by assigning numerical scores to categories. That is, they often treat ordinal data as interval in order to use the more sophisticated methods available for quantitative data. For instance, course grades (such as A, B, C, D, E) are ordinal, but we treat them as interval when we assign numbers to the grades (such as 4, 3, 2, 1, 0) to compute a grade point average. Treating ordinal data as interval requires good judgment in assigning scores, and it is often accompanied by a "sensitivity analysis" of checking whether substantive results differ for differing choices of the scores.
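
As a rough illustration of the sensitivity analysis just described, the following Python sketch computes grade point averages under two different scorings of the same ordinal grades. The students, their grades, and the alternative scoring `wider_top_gap` are all made up for illustration; only the equally spaced scores (4, 3, 2, 1, 0) come from the text.

```python
from statistics import mean

# Hypothetical course grades for two students (illustrative data only).
grades = {
    "student_a": ["A", "C", "C", "A"],
    "student_b": ["B", "B", "B", "B"],
}

# Equally spaced scores, as in the grade point average example in the text.
equal_spacing = {"A": 4, "B": 3, "C": 2, "D": 1, "E": 0}
# An alternative scoring (an assumption) that treats the A-to-B gap as larger.
wider_top_gap = {"A": 5, "B": 3, "C": 2, "D": 1, "E": 0}


def grade_point_average(letter_grades, scores):
    """Treat ordinal grades as interval by averaging their assigned scores."""
    return mean(scores[g] for g in letter_grades)


def compare(scores):
    """Return each student's average under the given scoring."""
    return {name: grade_point_average(g, scores) for name, g in grades.items()}


gpa_equal = compare(equal_spacing)
gpa_wider = compare(wider_top_gap)

for label, result in (("equal spacing", gpa_equal), ("wider top gap", gpa_wider)):
    print(label, result)
```

With these invented data the two students tie under equal spacing but not under the alternative scoring, which is the kind of dependence on the choice of scores that a sensitivity analysis is meant to reveal.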
The quantitative treatment of ordinal data has benefits in the variety of methods available for data analysis, particularly for data sets with many variables.

Statistical Methods and Type of Measurement

The main reason for distinguishing between qualitative and quantitative data is that different statistical methods apply to each type of data. Some methods are designed for qualitative variables and others are designed for quantitative variables.

It is not possible to analyze qualitative data using methods for quantitative variables. If a variable has only a nominal scale, for instance, one cannot use methods for interval data, since the levels of the scale do not have numerical values. One cannot apply quantitative statistical methods based on interval scales to qualitative variables such as religious affiliation or county of residence. For instance, the average is a statistical summary for quantitative data, since it uses numerical values; one can compute the average for a variable having an interval scale, such as income, but not for a variable having a nominal scale, such as religious affiliation.

On the other hand, it is always possible to treat a variable in a less quantitative manner. For example, suppose age is measured using the ordered categories under 18, 18-40, 41-65, over 65. This variable is quantitative, but one could treat it as qualitative either by ignoring the ordering of these four categories or by using unordered levels such as working age, nonworking age. Normally, though, we apply statistical methods specifically appropriate for the actual scale of measurement, since they use the characteristics of the data to the fullest. You should measure variables at as high a level as possible,
because a greater variety of methods apply with higher-level variables.

Discrete and Continuous Variables

We now present one other way of classifying variables that helps determine which statistical method is most appropriate for a data set. This classification refers to the number of values in the measurement scale.

Discrete and Continuous Variables
A variable is discrete if it can take on a finite number of values and continuous if it can take an infinite continuum of possible real number values.

Examples of discrete variables are number of children (measured for each family), number of murders in the past year (measured for each census tract), and number of visits to a physician in the past year (measured for each subject). Any variable phrased as "the number of ..." is discrete, since one can list all the possible values (0, 1, 2, 3, 4, ...) for the variable. (Strictly speaking, there could be an infinite number of values for such a variable, namely, all the nonnegative integers. As long as the possible values do not form a continuum, the variable is still said to be discrete.)

Examples of continuous variables are height, weight, age, and the amount of time it takes to read a passage of a book. It is impossible to write down all the distinct potential values of a continuous variable, since they form a continuum. The amount of time needed to read a book, for example, could take on the value 8.6294473... hours.

With discrete variables, one cannot subdivide the basic unit of measurement. For example, 2 and 3 are possible values for the number of children in a family, but 2.571 is not. On the other hand, a collection of values for a continuous variable can always be refined; that is, between any two possible values, there is always another possible value. For example, an individual does not age in discrete jumps. Between 20 and 21 years of age, there is 20.5 years (among other values).
Between 20.5 and 21, there is 20.7. At some well-defined point during the year in which a person ages from 20 to 21, that person is 20.3275 years old, and similarly for every other real number between 20 and 21. A continuous, infinite collection of age values occurs between 20 and 21 alone.

Qualitative variables are discrete, having a finite set of unordered categories. In fact, all categorical variables, nominal or ordinal, are discrete. Quantitative variables can be discrete or continuous; age is continuous, and number of times arrested is discrete.

The distinction between discrete and continuous variables is often blurry in practice, because of the way variables are actually measured. Continuous variables must be rounded when measured, so we measure them as though they are discrete. We usually say that an individual is 20 years old whenever that person's age is somewhere between 20 and 21. Other variables of this type are prejudice, intelligence, motivation, and other internalized attitudes or orientations. Such variables are assumed to vary continuously, but measurements of them describe, at best, rough
