DATA SOURCES
Primary (Data Collection): Observation, Survey, Experimentation
Secondary (Data Compilation): Print or Electronic
TYPES OF DATA
Data are either Categorical or Numerical.
Numerical data are either Discrete or Continuous.
DEFINITIONS
Quantitative Data (Numerical) consists of numbers representing counts or measurements.
Qualitative Data (Categorical) can be separated into different categories that are distinguished by some nonnumeric characteristic.
DEFINITIONS
Discrete Data result when the number of possible values is either a finite number or a countable number.
Continuous Data result from infinitely many possible values that correspond to some continuous scale that covers a range of values without gaps.
WHAT IS A VARIABLE?
Variable - a characteristic of a population or a sample, e.g. examination marks, stock price, the waiting time for medical services.
Data - observed values of variables.
EXAMPLE
46 45 46 48 41
49 46 44 43 47
46 44 42 43 43
48 47 45 49 47
45 44 46 40 48
49 45 46 44 42
46 49 42 46 44
45 46 45 43 48
47 42 41 45 48
43 47 47 44 45
Scores on a Test
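As a rough illustration of how observed values of a variable can be summarized, the sketch below tallies the 50 test scores listed above into a frequency table. The helper name `frequency_table` is hypothetical and chosen only for this example; the scores are discrete numerical data, so counting each distinct value is a natural first step.

```python
from collections import Counter

# The 50 test scores listed above, row by row.
scores = [
    46, 45, 46, 48, 41,
    49, 46, 44, 43, 47,
    46, 44, 42, 43, 43,
    48, 47, 45, 49, 47,
    45, 44, 46, 40, 48,
    49, 45, 46, 44, 42,
    46, 49, 42, 46, 44,
    45, 46, 45, 43, 48,
    47, 42, 41, 45, 48,
    43, 47, 47, 44, 45,
]

def frequency_table(values):
    """Return (value, count) pairs sorted by value (hypothetical helper)."""
    return sorted(Counter(values).items())

table = frequency_table(scores)
for value, count in table:
    # One asterisk per observation gives a quick text histogram.
    print(f"{value}: {'*' * count} ({count})")
```

The same tally would work for any discrete variable; a continuous variable would first need to be grouped into intervals.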
TYPES OF VARIABLES
A. Qualitative or Attribute variable - the characteristic being studied is nonnumeric.
EXAMPLES: Gender, religious affiliation, type of automobile owned, state of birth, eye color.
SCALES OF MEASUREMENT
1. Nominal Scale: categorical/qualitative observations. Numbers are used only to represent the categories. Example: Single=1, Married=2
2. Ordinal Scale: ordered categorical observations. The values are in order. Example: Poor=1, Fair=2, Good=3
3. Interval Scale: numerical/quantitative observations. The numbers carry meaning as values. Example: marks, temperature, IQ
4. Ratio Scale: numerical/quantitative observations that have an absolute zero value. Example: weight, height, income
SCALES OF MEASUREMENT
Nominal level - data that is classified into categories and cannot be arranged in any particular order.
EXAMPLES: eye color, gender, religious affiliation.
Ordinal level - involves data arranged in some order, but the differences between data values cannot be determined or are meaningless.
EXAMPLE: During a taste test of 4 soft drinks, Mellow Yellow was ranked number 1, Sprite number 2, Seven-Up number 3, and Orange Crush number 4.
Interval level - similar to the ordinal level, with the additional property that meaningful amounts of differences between data values can be determined. There is no natural zero point.
EXAMPLE: Temperature on the Fahrenheit scale.
Ratio level - the interval level with an inherent zero starting point. Differences and ratios are meaningful for this level of measurement.
EXAMPLES: Monthly income of surgeons, or distance traveled by manufacturers' representatives per month.
DEFINITIONS
Nominal Scale is characterized by data that consists of names, labels, or categories only.
Ordinal Scale data can be arranged in some order, but differences between data values either cannot be determined or are meaningless.
DEFINITIONS
Interval Scale is like the ordinal scale, with the additional property that the difference between any two data values is meaningful. However, data at this level do not have a natural zero starting point.
Ratio Scale is similar to the interval scale, with the additional property that there is an absolute zero (where zero indicates that none of the quantity is present). In this scale ratios are meaningful.
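A small sketch may help show why ratios are meaningful on a ratio scale but not on an interval scale. The readings below (80 and 40 degrees Fahrenheit, 200 and 100 pounds) are hypothetical values chosen for illustration. Changing units rescales differences by a constant factor in both cases, but the ratio of two temperatures changes with the unit, while the ratio of two weights does not.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0

# Two hypothetical temperature readings (interval scale).
a_f, b_f = 80.0, 40.0
a_c, b_c = fahrenheit_to_celsius(a_f), fahrenheit_to_celsius(b_f)

# Differences survive a change of units up to a constant factor (5/9) ...
diff_f = a_f - b_f
diff_c = a_c - b_c

# ... but ratios do not, because neither temperature scale has a true zero.
ratio_f = a_f / b_f
ratio_c = a_c / b_c

# Two hypothetical weights (ratio scale): the ratio is the same in any unit.
KG_PER_LB = 0.45359237
w1_lb, w2_lb = 200.0, 100.0
ratio_lb = w1_lb / w2_lb
ratio_kg = (w1_lb * KG_PER_LB) / (w2_lb * KG_PER_LB)

print("temperature differences:", diff_f, diff_c)
print("temperature ratios:", ratio_f, ratio_c)
print("weight ratios:", ratio_lb, ratio_kg)
```

Because the two temperature ratios disagree, a statement such as "twice as hot" has no fixed meaning on the Fahrenheit or Celsius scale, whereas "twice as heavy" holds in pounds and in kilograms alike.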
EXAMPLES
Nominal
Person     Marital status
Ahmad      married
Siva       single
Ah Keong   single
. .        . .

Computer   Brand
1          IBM
2          Dell
3          IBM
. .        . .

Weight gain
+10
+5
. .
EXAMPLES
Nominal
With nominal data, all we can do is calculate the proportion of data that falls into each category.
Brand    Frequency  Proportion
IBM      25         50%
Dell     11         22%
Compaq   8          16%
Other    6          12%
Total    50
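The proportion calculation described above can be sketched in a few lines. The brand frequencies are the ones given in this example; the function name `proportions` is a hypothetical helper, not part of any library.

```python
# Frequencies from the computer-brand example above.
brand_counts = {"IBM": 25, "Dell": 11, "Compaq": 8, "Other": 6}

def proportions(counts):
    """Return each category's share of the total as a percentage."""
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}

shares = proportions(brand_counts)
for brand, pct in shares.items():
    # Print brand, frequency, and percentage in aligned columns.
    print(f"{brand:<8}{brand_counts[brand]:>4}{pct:>6.0f}%")
```

Only the counts enter the calculation; the order in which the brands are listed plays no role, which is what makes this a valid summary for nominal data.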
Code 1 2 3 4
Frequency 3 5 2 4
HIERARCHY OF DATA
Ratio/Interval*
Values are real numbers.
All calculations are valid.
Data may be treated as ordinal or nominal.
Example: Examination marks
Ordinal
Values must represent the ranked order of the data.
Calculations based on an ordering process are valid.
Data may be treated as nominal but not as interval.
Nominal
Values are arbitrary numbers that represent categories.
Only calculations based on the frequencies of occurrence are valid.
Data may not be treated as ordinal or interval.
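One way to picture the hierarchy is as a lookup of which summaries are permitted at each level. The sketch below is only an illustration under stated assumptions: the pairing of mode with frequency-based calculations, median with ordering-based calculations, and mean with full arithmetic is a common convention rather than something the slide specifies, and the names `LEVELS`, `allowed_summaries`, and `summarize` are hypothetical.

```python
import statistics

# Assumed encoding of the hierarchy: a higher level may also be treated
# as any lower level, but not the other way round.
# "interval" stands for the combined Ratio/Interval level.
LEVELS = ["nominal", "ordinal", "interval"]

def allowed_summaries(level):
    """List the summaries treated as valid for data at the given level."""
    rank = LEVELS.index(level)
    summaries = ["mode"]              # frequency-based: valid at every level
    if rank >= 1:
        summaries.append("median")    # order-based: needs ranked values
    if rank >= 2:
        summaries.append("mean")      # arithmetic: needs real-valued data
    return summaries

def summarize(values, level):
    """Compute only the summaries permitted at the given level."""
    funcs = {
        "mode": statistics.mode,
        "median": statistics.median_low,  # always returns an observed value
        "mean": statistics.mean,
    }
    return {name: funcs[name](values) for name in allowed_summaries(level)}

# Nominal codes (Single=1, Married=2): only the mode is reported.
print(summarize([1, 2, 2], "nominal"))
# Ordinal codes (Poor=1, Fair=2, Good=3): mode and median.
print(summarize([1, 2, 3, 3, 3], "ordinal"))
# Examination marks: mode, median, and mean.
print(summarize([46, 45, 46, 48, 41], "interval"))
```

`statistics.median_low` is used so that the median of ordinal codes is always one of the observed categories rather than an average of two codes.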
PUBLISHED DATA
This is often a preferred source of data due to low cost and convenience.
Published data is found as printed material, tapes, disks, and on the Internet.
Data published by the organization that has collected it is called PRIMARY DATA.
For example: Data published by the US Bureau of Census.
Data published by an organization different than the organization that has collected it is called SECONDARY DATA.
For example: The Statistical Abstract of the United States compiles data from primary sources. Compustat sells a variety of financial data tapes compiled from primary sources.
OBSERVATIONAL or EXPERIMENTAL
When published data is unavailable, one needs to conduct a study to generate the data.
An observational study is one in which measurements representing a variable of interest are observed and recorded, without controlling any factor that might influence their values.
An experimental study is one in which measurements representing a variable of interest are observed and recorded, while controlling factors that might influence their values.
STATISTICAL STUDIES
Do you make observations only, or do you modify the subjects?
Modify the subjects: Experiment
Observations only: Retrospective study (past), Cross-sectional study (one point in time), Prospective study (future)
DEFINITIONS
Voluntary Response Sample (or self-selected sample) is one in which the respondents themselves decide whether to be included in the sample. A voluntary response sample might not be representative of the intended population.
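The arithmetic below sketches how self-selection can distort a sample. Every number in it is an assumption made up for illustration: a population of 1,000 customers of whom 200 are dissatisfied, and response rates of 50% for dissatisfied customers against 5% for satisfied ones.

```python
# Hypothetical population: 1,000 customers, 200 of them dissatisfied.
population = {"dissatisfied": 200, "satisfied": 800}

# Assumed response rates: unhappy customers are far more likely to opt in.
response_rate = {"dissatisfied": 0.50, "satisfied": 0.05}

def share_dissatisfied(counts):
    """Fraction of a group that is dissatisfied."""
    return counts["dissatisfied"] / sum(counts.values())

# Expected make-up of the self-selected sample.
respondents = {
    group: population[group] * response_rate[group] for group in population
}

true_share = share_dissatisfied(population)
sample_share = share_dissatisfied(respondents)
print(f"population: {true_share:.0%} dissatisfied")
print(f"voluntary sample: {sample_share:.0%} dissatisfied")
```

Under these assumed rates the voluntary sample overstates dissatisfaction by a wide margin, even though no individual response is false.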
SURVEYS
Surveys solicit information from people. Surveys can be made by means of
QUESTIONNAIRE
A good questionnaire must be well designed:
Keep the questionnaire as short as possible.
Ask short, simple, and clearly worded questions.
Start with demographic questions to help respondents get started comfortably.
Use dichotomous and multiple choice questions.
Use open-ended questions cautiously.
Avoid using leading questions.
Pretest a questionnaire on a small number of people.
Think about the way you intend to use the collected data when preparing the questionnaire.