Descriptive Statistics Summary (Session 1-5)
Statistics is a science that helps us make better decisions in business and economics as well as in
other fields.
Statistics teaches us how to summarize, analyze, and draw meaningful inferences from data,
which in turn lead to improved decisions.
Types of Data - Two Types
Qualitative (categorical or nominal) and quantitative (measurable or countable).
Data are measured on one of four scales:
• Nominal Scale - groups or classes
✓ Gender, color, professional classification, etc.
• Ordinal Scale - order matters
✓ Ranks (top ten videos, products, etc.)
• Interval Scale - difference or distance matters
✓ Temperatures (°F, °C)
• Ratio Scale - Ratio matters – “True Zero Point”
✓ Salaries, weight, volume, area, length, etc.
Population
Collection of all the items or individuals about which you want to draw a conclusion.
Sample
A portion of a population selected for analysis.
Parameter
A numerical measure that describes a characteristic of a population.
Statistic
A numerical measure that describes a characteristic of a sample.
Measures of Location
Population Mean: $\mu = \frac{\sum x_i}{N}$
Sample Mean: $\bar{x} = \frac{\sum x_i}{n}$
Weighted Mean: $\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$
Geometric Mean: $\bar{x}_g = \sqrt[n]{x_1 x_2 \cdots x_n}$
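The formulas above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample values are made up for the example and do not come from the notes.

```python
import math

def sample_mean(xs):
    """Arithmetic mean: sum of the observations divided by their count."""
    return sum(xs) / len(xs)

def weighted_mean(xs, ws):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for w, x in zip(ws, xs)) / sum(ws)

def geometric_mean(xs):
    """Geometric mean: n-th root of the product (values assumed positive)."""
    return math.prod(xs) ** (1 / len(xs))

scores = [2, 4, 8]    # made-up observations
weights = [1, 1, 2]   # made-up weights

print(sample_mean(scores))             # (2 + 4 + 8) / 3
print(weighted_mean(scores, weights))  # (2 + 4 + 16) / 4 = 5.5
print(geometric_mean(scores))          # cube root of 64, approximately 4
```

The population mean uses the same arithmetic as the sample mean; only the interpretation (all N items versus a sample of n) differs.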
Median: Middle value, middlemost or most central item
• Arrange n observations in increasing order.

• If n is odd, the median is the ((n+1)/2)th observation.
• If n is even, the median is the average of the (n/2)th and (n/2 + 1)th observations.
Mode: The most frequent value, i.e. the value that is repeated most often.
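A minimal sketch of the median rule and the mode, assuming plain Python lists as input. The helper names are illustrative, and returning every tied value for the mode is a choice made here, not something the notes specify.

```python
from collections import Counter

def median(xs):
    """Median via the odd/even rule on the sorted observations."""
    ordered = sorted(xs)
    n = len(ordered)
    if n % 2 == 1:
        # ((n+1)/2)th observation, converted to a 0-based index
        return ordered[(n + 1) // 2 - 1]
    # average of the (n/2)th and (n/2 + 1)th observations
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

def modes(xs):
    """All values tied for the highest frequency, in increasing order."""
    counts = Counter(xs)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

print(median([7, 1, 5]))          # 5
print(median([7, 1, 5, 3]))       # 4.0
print(modes([2, 3, 3, 5, 5, 9]))  # [3, 5]
```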

Percentiles: To compute the pth percentile, determine the data point in position (n + 1)p/100 of the ordered data.
• Quartiles are the percentage points that break down the ordered data set into quarters.
• The first quartile is the 25th percentile. It is the point below which lie 1/4 of the data.
• The second quartile is the 50th percentile. It is the point below which lie 1/2 of the data. This is
also called the median.
• The third quartile is the 75th percentile. It is the point below which lie 3/4 of the data.
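The position rule (n + 1)p/100 can be sketched as follows. The notes do not say what to do when the position is not a whole number; this sketch assumes linear interpolation between the two neighbouring observations and clamps positions that fall outside the data. The function name and data are made up.

```python
def percentile(xs, p):
    """p-th percentile using position (n + 1) * p / 100 in the ordered data.

    Assumptions (not stated in the notes): a fractional position is resolved
    by linear interpolation, and positions outside 1..n are clamped to the
    smallest or largest observation.
    """
    ordered = sorted(xs)
    n = len(ordered)
    pos = (n + 1) * p / 100   # 1-based position
    if pos <= 1:
        return ordered[0]
    if pos >= n:
        return ordered[-1]
    lower = int(pos)          # whole part of the position
    frac = pos - lower        # fractional part
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])

data = [10, 20, 30, 40, 50, 60, 70]  # made-up sample, n = 7
q1, q2, q3 = (percentile(data, p) for p in (25, 50, 75))
print(q1, q2, q3)  # positions 2, 4 and 6 of the ordered data
```

With n = 7 the three quartile positions are whole numbers, so no interpolation is needed; with n = 4 the first quartile sits at position 1.25, a quarter of the way from the first to the second observation.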
Measures of dispersion:
Range: The difference between the highest and the lowest observed values.
Special fractiles: Deciles, percentiles and quartiles.
Interquartile range: Q3 - Q1
Variance: Measures the variability in the data from the mean.

Standard Deviation: It is the positive square root of variance.
Population Variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$

Sample Variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
The sample variance $s^2$ is an “UNBIASED ESTIMATE” of the population variance.
• Coefficient of variation relates the SD and the mean by expressing the SD as a
percentage of the mean.
• Coefficient of variation = $(\sigma / \mu) \times 100\%$
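A short sketch of the dispersion measures, assuming a small made-up data set whose mean is exactly 5 so the arithmetic is easy to follow by hand. Function names are illustrative.

```python
import math

def population_variance(xs):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = sum(xs) / len(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1)

def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return 100 * sd / mean

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up; mean 5, squared deviations sum to 32
mean = sum(data) / len(data)

value_range = max(data) - min(data)    # highest minus lowest value
pop_var = population_variance(data)    # 32 / 8
pop_sd = math.sqrt(pop_var)            # positive square root of the variance
samp_var = sample_variance(data)       # 32 / 7

print(value_range, pop_var, pop_sd, samp_var)
print(coefficient_of_variation(pop_sd, mean))  # SD as a percentage of the mean
```

Dividing by n - 1 instead of N makes the sample variance slightly larger than the population formula applied to the same numbers.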
Skewness and Kurtosis
Skewness
Measure of the degree of asymmetry of a frequency distribution
• Skewed to left or negatively skewed

• Symmetric or unskewed
• Skewed to right or positively skewed
Kurtosis
Measure of flatness or peakedness of a frequency distribution

• Platykurtic (relatively flat)
• Mesokurtic (normal)
• Leptokurtic (relatively peaked)
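The notes give no formulas for skewness or kurtosis. One common choice, assumed here purely for illustration, is the moment-based coefficients: skewness = m3 / m2^1.5 and kurtosis = m4 / m2^2, where m_k is the k-th central moment. Under this definition a normal distribution has kurtosis 3, so values below 3 suggest a platykurtic shape and values above 3 a leptokurtic one.

```python
def central_moment(xs, k):
    """k-th central moment: average of (x - mean) ** k."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** k for x in xs) / len(xs)

def skewness(xs):
    """Moment coefficient of skewness (assumed definition): m3 / m2 ** 1.5."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def kurtosis(xs):
    """Moment coefficient of kurtosis (assumed definition): m4 / m2 ** 2."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2

symmetric = [1, 2, 3, 4, 5]          # made-up, symmetric about 3
right_tail = [1, 1, 2, 2, 3, 10]     # made-up, long right tail
left_tail = [-x for x in right_tail] # mirror image, long left tail

print(skewness(symmetric))   # zero for symmetric data
print(skewness(right_tail))  # positive: skewed to the right
print(skewness(left_tail))   # negative: skewed to the left
print(kurtosis(symmetric))   # below 3: flatter than a normal curve
```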
Methods of Displaying Data
Bar Graphs
• Heights of rectangles represent group frequencies
Histograms
• Histogram consists of a series of rectangles whose widths are defined by the limits of the
classes, and whose heights are determined by the frequency in each interval.
Frequency Polygons
• Height of line represents frequency
Ogives
• Height of line represents cumulative frequency
Pie Charts
• Categories represented as percentages of total
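The numbers behind these displays can be sketched without any plotting library: class frequencies feed a histogram or frequency polygon, cumulative frequencies feed an ogive, and percentages of the total feed a pie chart. The function, the equal-width class scheme and the data below are assumptions made for the example.

```python
def frequency_table(xs, start, width, n_classes):
    """Tabulate equal-width classes [lower, upper).

    Returns rows of (lower, upper, frequency, cumulative frequency, percent).
    Observations outside the classes are ignored; at least one observation
    must fall inside them.
    """
    counts = [0] * n_classes
    for x in xs:
        idx = int((x - start) // width)
        if 0 <= idx < n_classes:
            counts[idx] += 1
    total = sum(counts)
    rows, running = [], 0
    for i, count in enumerate(counts):
        running += count
        lower = start + i * width
        rows.append((lower, lower + width, count, running, 100 * count / total))
    return rows

ages = [21, 23, 25, 31, 34, 35, 38, 42, 47, 55]  # made-up observations
for lower, upper, freq, cum, pct in frequency_table(ages, 20, 10, 4):
    print(f"{lower}-{upper}: freq={freq} cum={cum} pct={pct:.0f}%")
```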
Exploratory Data Analysis – EDA
Techniques to determine relationships and trends, identify outliers and influential observations, and
quickly describe or summarize data sets.
Stem-and-Leaf Displays
• Quick way of listing all observations

• Conveys some of the same information as a histogram
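A stem-and-leaf display can be sketched as below, assuming non-negative integer observations with the units digit as the leaf and the remaining digits as the stem. The function name and data are illustrative.

```python
from collections import defaultdict

def stem_and_leaf(xs):
    """Map each stem (tens and above) to its sorted leaves (units digit).

    Assumption: observations are non-negative integers.
    """
    groups = defaultdict(list)
    for x in sorted(xs):
        groups[x // 10].append(x % 10)
    return dict(groups)

marks = [12, 15, 21, 23, 23, 28, 34, 40, 47]  # made-up observations
for stem, leaves in stem_and_leaf(marks).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Each printed row lists every observation in that stem, so row lengths show the same shape a histogram with class width 10 would.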
Box Plots
• Median
• Lower and upper quartiles
• Maximum and minimum
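The five values a box plot draws can be computed with Python's standard `statistics` module. This sketch assumes `statistics.quantiles` with `method="exclusive"`, which places the quartiles at positions (n + 1)p/100 in the ordered data; the wrapper function and the data are made up.

```python
import statistics

def five_number_summary(xs):
    """Minimum, lower quartile, median, upper quartile, maximum."""
    q1, q2, q3 = statistics.quantiles(xs, n=4, method="exclusive")
    return min(xs), q1, q2, q3, max(xs)

data = [3, 7, 8, 5, 12, 14, 21, 13, 18]  # made-up observations, n = 9
print(five_number_summary(data))
```

For these nine values the quartile positions are 2.5, 5 and 7.5, so the lower and upper quartiles fall halfway between neighbouring observations.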
Scatter Plots:
• Scatter Plots are used to identify and report any underlying relationships among pairs of
data sets.
• The plot consists of a scatter of points, each point representing an observation.
Relations between the Mean and Standard Deviation

Chebyshev’s Theorem
• Applies to any distribution, regardless of shape. Places lower limits on the percentages of
observations within a given number of standard deviations from the mean.
• At least $1 - \frac{1}{k^2}$ of the elements of any distribution lie within k standard deviations of the mean (for k > 1)
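A quick check of Chebyshev's bound on a deliberately skewed, made-up data set. The function names are illustrative, and the population standard deviation (divide by N) is assumed.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2

def fraction_within(xs, k):
    """Observed fraction of xs within k population SDs of the mean."""
    n = len(xs)
    mean = sum(xs) / n
    sd = (sum((x - mean) ** 2 for x in xs) / n) ** 0.5
    return sum(abs(x - mean) <= k * sd for x in xs) / n

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 40]  # made-up, one extreme value
for k in (2, 3):
    # the observed fraction should never fall below the bound
    print(k, chebyshev_lower_bound(k), fraction_within(data, k))
```

Even with one extreme value, the observed fractions stay at or above the guaranteed minimums of 75% for k = 2 and about 89% for k = 3.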
Empirical Rule
• Applies only to roughly mound-shaped and symmetric distributions. Specifies
approximate percentages of observations within a given number of standard deviations
from the mean.
• Roughly 68% lie within one standard deviation of the mean.
• Roughly 95% lie within two standard deviations of the mean.
• Nearly all (about 99.7%) lie within three standard deviations of the mean.
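The percentages in the Empirical Rule match those of a normal distribution, taken here as the standard example of a mound-shaped, symmetric distribution. A small sketch using `statistics.NormalDist` from the Python standard library; the helper name is made up.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def normal_fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(normal_fraction_within(k), 4))
```

For comparison, Chebyshev's Theorem guarantees only 75% within two standard deviations; the Empirical Rule's 95% is tighter because it assumes a specific shape.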
