
Data Tabulation

Descriptive Analysis
The transformation of raw data into a form that will make them easy to understand and interpret; rearranging, ordering, and manipulating data to generate descriptive information

Type of Measurement                  Type of descriptive analysis
Nominal, two categories              Frequency table; Proportion (percentage)
Nominal, more than two categories    Frequency table; Category proportions (percentages); Mode
Ordinal                              Rank order; Median
Interval                             Arithmetic mean
Ratio                                Index numbers; Geometric mean; Harmonic mean

Tabulation
Tabulation - Orderly arrangement of data in a table or other summary format
Frequency table
Percentages

Frequency Table
The arrangement of statistical data in a row-and-column format that exhibits the count of responses or observations for each category assigned to a variable

Central Tendency
Type of Scale        Measure of Central Tendency    Measure of Dispersion
Nominal              Mode                           None
Ordinal              Median                         Percentile
Interval or ratio    Mean                           Standard deviation

Making Data Usable


Frequency distributions
Proportions
Central tendency
    Mean
    Median
    Mode
Measures of dispersion

Frequency Distribution of Deposits


Amount                Frequency (number of people making deposits in each range)
less than $3,000      499
$3,000 - $4,999       530
$5,000 - $9,999       562
$10,000 - $14,999     718
$15,000 or more       811
Total                 3,120

Percentage Distribution of Amounts of Deposits


Amount                Percent
less than $3,000      16
$3,000 - $4,999       17
$5,000 - $9,999       18
$10,000 - $14,999     23
$15,000 or more       26
Total                 100

Probability Distribution of Amounts of Deposits


Amount                Probability
less than $3,000      .16
$3,000 - $4,999       .17
$5,000 - $9,999       .18
$10,000 - $14,999     .23
$15,000 or more       .26
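The three deposit tables are the same information at different scales. A minimal Python sketch, assuming the frequencies listed in the frequency distribution table, shows how the percentages and probabilities follow from the counts. The variable names are illustrative and not part of the original slides.

```python
# Deposit ranges and frequencies, copied from the frequency distribution table.
frequencies = {
    "less than $3,000": 499,
    "$3,000 - $4,999": 530,
    "$5,000 - $9,999": 562,
    "$10,000 - $14,999": 718,
    "$15,000 or more": 811,
}

total = sum(frequencies.values())  # number of depositors across all ranges

# Relative frequency of each range, read here as an estimated probability.
probabilities = {label: count / total for label, count in frequencies.items()}

# Percentages rounded to whole numbers, as in the percentage table.
percentages = {label: round(100 * p) for label, p in probabilities.items()}

for label, count in frequencies.items():
    print(f"{label:<20} {count:>5} {percentages[label]:>4}% {probabilities[label]:>6.2f}")
print(f"{'Total':<20} {total:>5}")
```

Dividing each count by the total gives the proportion; multiplying by 100 gives the percentage shown in the second table.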

Measures of Central Tendency


Mean - arithmetic average ($\mu$, population; $\bar{X}$, sample)
Median - midpoint of the distribution
Mode - the value that occurs most often

Internet Usage Data


Respondent                       Internet   Attitude Toward          Usage of Internet
Number       Sex    Familiarity  Usage      Internet    Technology   Shopping   Banking
1            1.00   7.00         14.00      7.00        6.00         1.00       1.00
2            2.00   2.00         2.00       3.00        3.00         2.00       2.00
3            2.00   3.00         3.00       4.00        3.00         1.00       2.00
4            2.00   3.00         3.00       7.00        5.00         1.00       2.00
5            1.00   7.00         13.00      7.00        7.00         1.00       1.00
6            2.00   4.00         6.00       5.00        4.00         1.00       2.00
7            2.00   2.00         2.00       4.00        5.00         2.00       2.00
8            2.00   3.00         6.00       5.00        4.00         2.00       2.00
9            2.00   3.00         6.00       6.00        4.00         1.00       2.00
10           1.00   9.00         15.00      7.00        6.00         1.00       2.00
11           2.00   4.00         3.00       4.00        3.00         2.00       2.00
12           2.00   5.00         4.00       6.00        4.00         2.00       2.00
13           1.00   6.00         9.00       6.00        5.00         2.00       1.00
14           1.00   6.00         8.00       3.00        2.00         2.00       2.00
15           1.00   6.00         5.00       5.00        4.00         1.00       2.00
16           2.00   4.00         3.00       4.00        3.00         2.00       2.00
17           1.00   6.00         9.00       5.00        3.00         1.00       1.00
18           1.00   4.00         4.00       5.00        4.00         1.00       2.00
19           1.00   7.00         14.00      6.00        6.00         1.00       1.00
20           2.00   6.00         6.00       6.00        4.00         2.00       2.00
21           1.00   6.00         9.00       4.00        2.00         2.00       2.00
22           1.00   5.00         5.00       5.00        4.00         2.00       1.00
23           2.00   3.00         2.00       4.00        2.00         2.00       2.00
24           1.00   7.00         15.00      6.00        6.00         1.00       1.00
25           2.00   6.00         6.00       5.00        3.00         1.00       2.00
26           1.00   6.00         13.00      6.00        6.00         1.00       1.00
27           2.00   5.00         4.00       5.00        5.00         1.00       1.00
28           2.00   4.00         2.00       3.00        2.00         2.00       2.00
29           1.00   4.00         4.00       5.00        3.00         1.00       2.00
30           1.00   3.00         3.00       7.00        5.00         1.00       2.00

Using these data, construct a frequency table of familiarity with the Internet.

Frequency Distribution
In a frequency distribution, one variable is considered at a time. A frequency distribution for a variable produces a table of frequency counts, percentages, and cumulative percentages for all the values associated with that variable.
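As an illustration of the definition above, the following Python sketch tabulates counts, percentages, and cumulative percentages for the Familiarity column of the Internet Usage Data. It treats every recorded value the same way. The single value 9 may be a missing-value code, but that is an assumption the slides do not confirm, so the sketch does not handle it specially.

```python
from collections import Counter

# Familiarity column from the Internet Usage Data table (30 respondents).
familiarity = [7, 2, 3, 3, 7, 4, 2, 3, 3, 9, 4, 5, 6, 6, 6,
               4, 6, 4, 7, 6, 6, 5, 3, 7, 6, 6, 5, 4, 4, 3]

counts = Counter(familiarity)
n = len(familiarity)

# One row per distinct value: (value, frequency, percent, cumulative percent).
table = []
cumulative = 0.0
for value in sorted(counts):
    percent = 100 * counts[value] / n
    cumulative += percent
    table.append((value, counts[value], percent, cumulative))

print(f"{'Value':>5} {'Freq':>5} {'Pct':>6} {'Cum pct':>8}")
for value, freq, percent, cum in table:
    print(f"{value:>5} {freq:>5} {percent:>6.1f} {cum:>8.1f}")
print(f"{'Total':>5} {n:>5}")
```

If 9 does turn out to be a missing-value code, the percentages would normally be recomputed on the 29 valid responses.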

Frequency of Familiarity with the Internet

Frequency Histogram
[Figure: histogram of Frequency (vertical axis, 0 to 8) against Familiarity (horizontal axis, 2 to 7)]

Statistics Associated with Frequency Distribution: Measures of Location


The mean, or average value, is the most commonly used measure of central tendency. The mean, $\bar{X}$, is given by

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

where
$X_i$ = observed values of the variable $X$
$n$ = number of observations (sample size)

The mode is the value that occurs most frequently. It represents the highest peak of the distribution. The mode is a good measure of location when the variable is inherently categorical or has otherwise been grouped into categories.


The median of a sample is the middle value when the data are arranged in ascending or descending order. If the number of data points is even, the median is usually estimated as the midpoint between the two middle values by adding the two middle values and dividing their sum by 2. The median is the 50th percentile.
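The three measures of location can be checked on the Internet Usage column of the data table. This is a small sketch using Python's standard `statistics` module. `multimode` is used instead of `mode` because more than one value can tie for the highest count.

```python
import statistics

# Internet Usage column from the Internet Usage Data table (30 respondents).
usage = [14, 2, 3, 3, 13, 6, 2, 6, 6, 15, 3, 4, 9, 8, 5,
         3, 9, 4, 14, 6, 9, 5, 2, 15, 6, 13, 4, 2, 4, 3]

n = len(usage)
mean = sum(usage) / n                # sum of the observed values divided by n
median = statistics.median(usage)    # average of the two middle values when n is even
modes = statistics.multimode(usage)  # every value tied for the highest frequency

print(f"n = {n}, mean = {mean:.2f}, median = {median}, mode(s) = {modes}")
```

With an even number of observations, the median falls halfway between the 15th and 16th ordered values.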

Statistics Associated with Frequency Distribution: Measures of Variability


The range measures the spread of the data. It is simply the difference between the largest and smallest values in the sample.

$$\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$$

The interquartile range is the difference between the 75th and 25th percentile. For a set of data points arranged in order of magnitude, the pth percentile is the value that has p% of the data points below it and (100 - p)% above it.

The variance is the mean squared deviation from the mean. The variance can never be negative. The standard deviation is the square root of the variance.

$$s_x = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$

The coefficient of variation is the ratio of the standard deviation to the mean expressed as a percentage, and is a unitless measure of relative variability.

$$CV = \frac{s_x}{\bar{X}}$$
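The measures of variability can be computed directly from their definitions. The sketch below applies them to the Internet Usage column. The quartiles come from `statistics.quantiles`, which implements one of several common percentile conventions, so other software may report slightly different quartiles on other data sets.

```python
import math
import statistics

# Internet Usage column from the Internet Usage Data table (30 respondents).
usage = [14, 2, 3, 3, 13, 6, 2, 6, 6, 15, 3, 4, 9, 8, 5,
         3, 9, 4, 14, 6, 9, 5, 2, 15, 6, 13, 4, 2, 4, 3]

n = len(usage)
mean = sum(usage) / n

# Range: largest value minus smallest value.
value_range = max(usage) - min(usage)

# Interquartile range: 75th percentile minus 25th percentile.
q1, _median, q3 = statistics.quantiles(usage, n=4)
iqr = q3 - q1

# Sample variance with the n - 1 divisor, then its square root.
variance = sum((x - mean) ** 2 for x in usage) / (n - 1)
std_dev = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = std_dev / mean

print(f"range = {value_range}, IQR = {iqr}")
print(f"variance = {variance:.3f}, std dev = {std_dev:.3f}, CV = {cv:.3f}")
```

Multiplying `cv` by 100 expresses it as a percentage.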

Statistics Associated with Frequency Distribution: Measures of Shape


Skewness is the tendency of the deviations from the mean to be larger in one direction than in the other. It can be thought of as the tendency for one tail of the distribution to be heavier than the other.

Kurtosis is a measure of the relative peakedness or flatness of the curve defined by the frequency distribution. Measured as excess kurtosis, the kurtosis of a normal distribution is zero. If the kurtosis is positive, then the distribution is more peaked than a normal distribution. A negative value means that the distribution is flatter than a normal distribution.
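The slides give no formulas for these shape measures. As a hedged illustration, the sketch below uses the simple moment-based definitions (third and fourth central moments scaled by powers of the second), which is only one of several estimators in use. The function names are hypothetical, and statistical packages often apply small-sample corrections that give slightly different numbers.

```python
def central_moment(values, k):
    """k-th central moment: the average of (x - mean) ** k."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)


def skewness(values):
    """Moment-based skewness: positive when the right tail is heavier."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores zero."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


# Internet Usage column from the Internet Usage Data table (30 respondents).
usage = [14, 2, 3, 3, 13, 6, 2, 6, 6, 15, 3, 4, 9, 8, 5,
         3, 9, 4, 14, 6, 9, 5, 2, 15, 6, 13, 4, 2, 4, 3]

print(f"skewness = {skewness(usage):.3f}")
print(f"excess kurtosis = {excess_kurtosis(usage):.3f}")
```

A perfectly symmetric sample such as 1, 2, 3, 4, 5 has a skewness of zero under this definition.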

Skewness of a Distribution
[Figure: (a) Symmetric Distribution and (b) Skewed Distribution, each marked with the positions of the mean, median, and mode]

Cross-Tabulation
While a frequency distribution describes one variable at a time, a cross-tabulation describes two or more variables simultaneously. Cross-tabulation results in tables that reflect the joint distribution of two or more variables with a limited number of categories or distinct values, e.g., Table 15.3.


Prepare two cross-tabulations: Internet Usage by Gender, and Gender by Internet Usage.

Value label

Gender and Internet Usage


                    Gender
Internet Usage      Male    Female    Row Total
Light (1)           5       10        15
Heavy (2)           10      5         15
Column Total        15      15
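The counts in this table can be rebuilt from the raw data. The sketch below makes two assumptions that the slides do not state: Sex is coded 1 = male and 2 = female, and an Internet Usage value of 5 or less counts as light while 6 or more counts as heavy. Both were chosen because they reproduce the counts shown above.

```python
from collections import Counter

# Sex and Internet Usage columns from the Internet Usage Data table.
sex = [1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 1, 1, 1,
       2, 1, 1, 1, 2, 1, 1, 2, 1, 2, 1, 2, 2, 1, 1]
usage = [14, 2, 3, 3, 13, 6, 2, 6, 6, 15, 3, 4, 9, 8, 5,
         3, 9, 4, 14, 6, 9, 5, 2, 15, 6, 13, 4, 2, 4, 3]

SEX_LABELS = {1: "Male", 2: "Female"}  # assumed coding, not stated in the slides
LIGHT_MAX = 5                          # assumed cut-off between light and heavy usage


def usage_group(value):
    """Classify an Internet Usage value as Light or Heavy (assumed cut-off)."""
    return "Light" if value <= LIGHT_MAX else "Heavy"


# Joint counts: one cell per (usage group, gender) pair.
cells = Counter((usage_group(u), SEX_LABELS[s]) for s, u in zip(sex, usage))

print(f"{'Internet Usage':<16}{'Male':>6}{'Female':>8}{'Row Total':>11}")
for group in ("Light", "Heavy"):
    male, female = cells[(group, "Male")], cells[(group, "Female")]
    print(f"{group:<16}{male:>6}{female:>8}{male + female:>11}")
```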

Two-Variable Cross-Tabulation


Since two variables have been cross-classified, percentages could be computed either columnwise, based on column totals (Table 15.4), or rowwise, based on row totals (Table 15.5). The general rule is to compute the percentages in the direction of the independent variable, across the dependent variable. The correct way of calculating percentages is as shown in Table 15.4.

Internet Usage by Gender


Table 15.4

                    Gender
Internet Usage      Male      Female
Light               33.3%     66.7%
Heavy               66.7%     33.3%
Column total        100%      100%

Gender by Internet Usage


            Internet Usage
Gender      Light     Heavy     Total
Male        33.3%     66.7%     100.0%
Female      66.7%     33.3%     100.0%
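Columnwise and rowwise percentages differ only in which total serves as the base. A short Python sketch, starting from the counts in the Gender and Internet Usage table, computes both. The dictionary layout and names are illustrative choices, not part of the slides.

```python
# Cell counts from the Gender and Internet Usage cross-tabulation:
# outer key = Internet usage level (row), inner key = gender (column).
counts = {
    "Light": {"Male": 5, "Female": 10},
    "Heavy": {"Male": 10, "Female": 5},
}
genders = ("Male", "Female")

column_totals = {g: sum(row[g] for row in counts.values()) for g in genders}
row_totals = {level: sum(row.values()) for level, row in counts.items()}

# Columnwise: each cell as a percentage of its gender (column) total.
column_pct = {
    level: {g: 100 * row[g] / column_totals[g] for g in genders}
    for level, row in counts.items()
}

# Rowwise: each cell as a percentage of its usage-level (row) total.
row_pct = {
    level: {g: 100 * row[g] / row_totals[level] for g in genders}
    for level, row in counts.items()
}

for level in counts:
    col = "  ".join(f"{g} {column_pct[level][g]:.1f}%" for g in genders)
    print(f"{level:<6} columnwise: {col}")
for level in counts:
    rw = "  ".join(f"{g} {row_pct[level][g]:.1f}%" for g in genders)
    print(f"{level:<6} rowwise:    {rw}")
```

In the columnwise version each gender's percentages sum to 100, which is the form to read when gender is treated as the independent variable.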