Introduction
History of Statistics
• The word “statistics” derives from the New Latin statisticum collegium ("council of state") and the Italian
word “statista” (“statesman” or “politician”); both relate to the political state or government.
• In the past, statistics was used mainly by rulers.
• Its application was very limited, but rulers and kings needed information about the lands, agriculture,
commerce and population of their states to assess their military potential, wealth, taxation and other
aspects of government.
• At the beginning of the 20th century, William S. Gosset developed methods for decision making based
on small sets of data.
• During the 20th century, several statisticians were active in developing new methods, theories and
applications of statistics.
• Today, the availability of electronic computers is certainly a major factor in the modern development
of statistics.
Meaning of Statistics
• The word statistics has different meanings (senses), which are discussed below:
7. Statistics, psychology and education: In education and psychology, statistics has found wide application,
for example in determining the reliability and validity of a test, in factor analysis, etc.
8. Statistics and war: In war, the theory of decision functions can be of great assistance to military
personnel in planning “maximum destruction with minimum effort.”
• Statistical tools are very useful in the fields of defense and war because they help to compare the military
strength of different countries in terms of manpower, tanks, warplanes, missiles, etc. They also help in
planning the future military strategy of the country, in estimating the losses due to war and in arranging
war finance.
9. Statistics and State:
• Statistics are the eyes of the state, as they help in administration. In ancient times, ruling kings and
chiefs had to rely heavily on statistics to frame suitable military and fiscal policies. Similarly, modern
states make tremendous use of statistical tools on various problems.
• The state also conducts the population census and estimates figures of national income to assess the
prosperity of the country. In this way, the state is the single largest unit that not only collects the largest
amount of statistics but also needs statistics on a very extensive scale.
10. Statistics in Research:
• Statistical techniques are of immense use in any research enquiry. In industry and commerce,
research is carried out to find the causes of variation in different products.
• Similarly, market research is carried out with the help of statistical techniques. Even in the literary field,
research is conducted that uses various types of statistical data.
Everyday Reasons Why Statistics Are Important
• Statistics are sets of mathematical equations that are used to analyze what is happening in the world
around us. When used correctly, statistics reveal trends in what happened in the past and can be
useful in predicting what may happen in the future.
1. Weather Forecasts
• When watching a weather forecast, have you ever heard the forecaster talk about
weather models? These computer models are built using statistics that compare prior weather conditions
with current weather to predict future weather.
2. Emergency Preparedness
• What happens if the forecast indicates that a hurricane is imminent or that tornadoes are likely to occur?
Emergency management agencies move into high gear to be ready to rescue people. Emergency teams
rely on statistics to tell them when danger may occur.
3. Predicting Disease
• News reports often include statistics about a disease. If the reporter simply reports
the number of people who either have the disease or who have died from it, it's an interesting fact but it
might not mean much to your life. But when statistics become involved, you have a better idea of how
that disease may affect you.
• For example, studies have shown that 85 to 95 percent of lung cancers are smoking related. The statistic
should tell you that almost all lung cancers are related to smoking and that if you want to have a good
chance of avoiding lung cancer, you shouldn't smoke.
4. Medical Studies
• Scientists must show a statistically valid rate of effectiveness before any drug can be prescribed. Statistics
are behind every medical study you hear about.
5. Genetics
• Many people are afflicted with diseases that come from their genetic make-up and these diseases can
potentially be passed on to their children. Statistics are critical in determining the chances of a new baby
being affected by the disease.
6. Political Campaigns
• Whenever there's an election, news organizations consult their models when they try to predict who
the winner will be. Candidates consult voter polls to determine where and how they campaign. Statistics play a
part in who your elected government officials will be.
7. Insurance
• You know that in order to drive your car you are required by law to have car insurance. If you have a
mortgage on your house, you must have it insured as well. The rate that an insurance company charges
you is based upon statistics from all drivers or homeowners in your area.
8. Consumer Goods
• For example, a leading worldwide retailer of imported goods keeps track of everything it sells and uses
statistics to calculate what to ship to each store and when.
9. Quality Testing
• Companies make thousands of products every day and each company must make sure that a good quality
item is sold. But a company can't test each and every item that they ship to you, the consumer. So the
company uses statistics to test just a few, called a sample, of what they make. If the sample passes quality
tests, then the company assumes that all the items made in the group, called a batch, are good.
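The batch-and-sample idea above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `batch_passes`, the `is_good` check and the item weights are hypothetical, and real acceptance sampling plans usually tolerate a small number of defects rather than requiring a perfect sample.

```python
import random

def batch_passes(batch, sample_size, is_good, seed=None):
    """Test a random sample of the batch; accept the batch if every sampled item is good."""
    rng = random.Random(seed)
    # Simple random sample without replacement.
    sample = rng.sample(batch, sample_size)
    return all(is_good(item) for item in sample)

# Hypothetical batch: item weights in grams; an item is "good" if it weighs 95-105 g.
def within_tolerance(weight):
    return 95 <= weight <= 105

good_batch = [100] * 500
bad_batch = [80] * 500

good_result = batch_passes(good_batch, sample_size=20, is_good=within_tolerance, seed=1)
bad_result = batch_passes(bad_batch, sample_size=20, is_good=within_tolerance, seed=1)
print(good_result, bad_result)
```

Only 20 of the 500 items are inspected, yet the decision is applied to the whole batch, which is exactly the inference step the company relies on.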
10. Stock Market
• Another topic that you hear a lot about in the news is the stock market. Stock analysts also use statistical
computer models to forecast what is happening in the economy.
Difference Between Descriptive and Inferential Statistics
• Descriptive statistics is the branch of statistics concerned with describing the
population under study.
• What it does: organizes, analyzes and presents data in a meaningful way.
• Form of final result: charts, graphs and tables.
• Usage: to describe a situation.
• Function: it explains data that are already known, in order to summarize the sample.
• Inferential statistics is the branch of statistics that focuses on drawing conclusions about the
population on the basis of sample analysis and observation.
• What it does: compares, tests and predicts data.
• Form of final result: probability.
• Usage: to explain the chances of occurrence of an event.
• Function: it attempts to reach conclusions about the population that extend beyond the data
available.
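As a rough illustration of the contrast, the Python sketch below first summarizes a sample (descriptive) and then uses it to say something about the wider population (inferential). The weights are made-up values, and the interval uses the t critical value 2.262 for 9 degrees of freedom, which assumes the data are roughly normally distributed.

```python
import math
import statistics

# Hypothetical sample: weights (kg) of 10 students drawn from a larger population.
sample = [52, 55, 58, 60, 61, 63, 65, 68, 70, 78]

# Descriptive statistics: summarize the data we actually have.
sample_mean = statistics.mean(sample)
sample_median = statistics.median(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# Inferential statistics: an approximate 95% confidence interval
# for the unknown population mean, based only on the sample.
t_critical = 2.262  # t value for 95% confidence with n - 1 = 9 degrees of freedom
standard_error = sample_sd / math.sqrt(len(sample))
ci_low = sample_mean - t_critical * standard_error
ci_high = sample_mean + t_critical * standard_error

print("mean:", sample_mean, "median:", sample_median, "sd:", round(sample_sd, 2))
print("95% CI for the population mean:", round(ci_low, 2), "to", round(ci_high, 2))
```

The first three numbers describe the ten students only; the interval is a statement, with stated uncertainty, about students who were never measured.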
Difference Between a Population and a Sample
• When we think of the term “population,” we usually think of people in our town, region, state or country
and their respective characteristics such as gender, age, marital status, ethnic membership, religion and so
forth. In statistics the term “population” takes on a slightly different meaning. The “population” in
statistics includes all members of a defined group that we are studying or collecting information on for
data-driven decisions.
• In simple terms, population means the aggregate of all elements under study having one or more
common characteristics; for example, all people living in India constitute a population. The population is
not confined to people only, but may also include animals, events, objects, buildings, etc. It can be of any
size, and the number of elements or members in a population is known as the population size. It is denoted
by N, while the sample size is denoted by n.
• Population is defined as the whole set of data, individuals, events or objects, etc. on which the researcher is
performing research.
• The whole area of study is included in a population, whereas the sample is relatively smaller. It is a subset of
the population. Since it is difficult to handle and analyze each and every member of the population, a
smaller and representative portion of the population is picked. This is called a sample.
Definition of Sample
• A part of the population is called a sample. The sample is a portion of the population, a slice of it, that
should carry all its characteristics. A sample is a scientifically drawn group that tends to possess the
same characteristics as the population, provided it is drawn randomly.
• By the term sample, we mean a part of the population chosen at random for participation in the study. The
sample selected should represent the population in all its characteristics, and it should be free from bias,
so as to produce a miniature cross-section, as the sample observations are used to make generalizations
about the population.
• In other words, the respondents selected out of the population constitute a ‘sample’, and the process of
selecting respondents is known as ‘sampling.’ The units under study are called sampling units, and the
number of units in a sample is called the sample size.
• While conducting statistical testing, samples are mainly used when the population is too large to include
all its members in the study.
• Sample and population are related to each other, i.e. the sample is drawn from the population, so without
a population a sample cannot exist. Further, the primary objective of the sample is to make statistical
inferences about the population that are as accurate as possible. In general, the greater the size of the
sample, the higher the level of accuracy of the generalization.
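The relationship between N, n and random sampling can be sketched as follows. The population here is simulated (heights drawn from a normal distribution with a fixed seed), so every number in it is an assumption made purely for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the illustration is repeatable

# Hypothetical population: heights (cm) of 10,000 people.
population = [rng.gauss(170, 10) for _ in range(10_000)]
N = len(population)  # population size

# Simple random sample without replacement: every member is equally likely to be chosen.
n = 100  # sample size
sample = rng.sample(population, n)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print("N =", N, "n =", n)
print("population mean:", round(population_mean, 2))
print("sample mean:    ", round(sample_mean, 2))
```

Only 1% of the population is measured, yet a randomly drawn sample typically gives a mean close to the population mean; repeating the draw with a larger n usually narrows the gap further.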
Difference Between Parameter and Statistic/Estimate
• A fixed characteristic of a population, based on all the elements of the population, is termed a
parameter. Here population refers to the aggregate of all units under consideration, which share common
characteristics.
• A descriptive measure (such as the mean, mode or median) of a population is known as a parameter. It
numerically expresses the value of an attribute by summarizing the available data. As indicated earlier, it
is usually impractical to observe the values of an attribute over the whole population. Therefore, a sample is
used to calculate the measures, which are then used to infer the corresponding population values.
• A statistic is defined as a numerical value obtained from a sample of data. It is a descriptive
statistical measure and a function of the sample observations. A sample is described as a fraction of the
population that represents the entire population in all its characteristics. The common use of a statistic is
to estimate a particular population parameter.
• For population parameters, µ (Greek letter mu) represents the mean, P denotes the population proportion,
the standard deviation is labeled σ (Greek letter sigma), the variance is represented by σ², the population
size is indicated by N, the standard error of the mean is represented by σx̄, the standard error of the
proportion is labeled σp, the standardized variate (z) is (X − µ)/σ and the coefficient of variation is σ/µ.
• For sample statistics, x̄ (x-bar) represents the mean, p̂ (p-hat) denotes the sample proportion, the standard
deviation is labeled s, the variance is represented by s², n denotes the sample size, the standard error of
the mean is represented by sx̄, the standard error of the proportion is labeled sp, the standardized variate
(z) is (x − x̄)/s and the coefficient of variation is s/x̄.
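The notation above maps directly onto Python's standard `statistics` module, which provides separate population (`pstdev`, `pvariance`) and sample (`stdev`, `variance`) functions. The data below are made-up values chosen to keep the arithmetic easy to follow, and the variable names simply mirror the symbols.

```python
import math
import statistics

# Hypothetical population of N = 8 values and a sample of n = 4 drawn from it.
population = [2, 4, 4, 4, 5, 5, 7, 9]
sample = [2, 4, 5, 9]

# Parameters: computed from every element of the population.
N = len(population)
mu = statistics.mean(population)             # population mean
sigma = statistics.pstdev(population)        # population standard deviation (divides by N)
sigma_sq = statistics.pvariance(population)  # population variance
cv_population = sigma / mu                   # coefficient of variation
z_of_9 = (9 - mu) / sigma                    # standardized variate for X = 9

# Statistics: computed from the sample only, used to estimate the parameters.
n = len(sample)
x_bar = statistics.mean(sample)              # sample mean
s = statistics.stdev(sample)                 # sample standard deviation (divides by n - 1)
s_sq = statistics.variance(sample)           # sample variance
se_mean = s / math.sqrt(n)                   # estimated standard error of the mean

print("mu =", mu, "sigma =", sigma, "sigma^2 =", sigma_sq)
print("x-bar =", x_bar, "s =", round(s, 3), "s^2 =", round(s_sq, 3))
```

Note that the sample functions divide by n − 1 rather than n, which is why s² and σ² are computed by different functions.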
What is a variable?
• A variable is any characteristic, number or quantity that can be measured or counted. A variable may
also be called a data item. Age, sex, business income and expenses, country of birth, capital expenditure,
class grades, eye colour and vehicle type are examples of variables. It is called a variable because the value
may vary between data units in a population, and may change in value over time.
• For example, 'income' is a variable that can vary between data units in a population (i.e. the people or
businesses being studied may not have the same incomes) and can also vary over time for each data unit
(i.e. income can go up or down).
• Categorical variables have values that describe a 'quality' or 'characteristic' of a data unit, like 'what
type' or 'which category'. Categorical variables fall into mutually exclusive (in one category or in another)
and exhaustive (include all possible options) categories. Therefore, categorical variables are qualitative
variables and tend to be represented by a non-numeric value. Categorical variables may be further
described as ordinal or nominal:
• An ordinal variable is a categorical variable whose observations can take a value that can be logically
ordered or ranked. The categories associated with ordinal variables can be ranked higher or lower than one
another, but do not necessarily establish a numeric difference between each category. Examples of ordinal
categorical variables include academic grades (e.g. A, B, C), clothing size (e.g. small, medium, large, extra
large) and attitudes (e.g. strongly agree, agree, disagree, strongly disagree).
• A nominal variable is a categorical variable whose observations take values that cannot be
organised in a logical sequence. Examples of nominal categorical variables include sex, business type, eye
colour, religion and brand.
• The data collected for a categorical variable are qualitative data.
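The practical difference between ordinal and nominal variables shows up in which operations make sense. In the Python sketch below (all values are made up, and `SIZE_ORDER` and `size_rank` are hypothetical names), clothing sizes can be sorted and compared because they have a defined order, while eye colours can only be counted.

```python
from collections import Counter

# Ordinal: categories have a logical order, encoded here by position in the list.
SIZE_ORDER = ["small", "medium", "large", "extra large"]

def size_rank(size):
    """Return the rank of a clothing size (0 = smallest)."""
    return SIZE_ORDER.index(size)

sizes = ["large", "small", "medium", "small", "extra large"]
sorted_sizes = sorted(sizes, key=size_rank)  # ordering is meaningful
largest = max(sizes, key=size_rank)          # so are "highest" and "lowest"

# Nominal: no logical order, so only counting and the mode are meaningful.
eye_colours = ["brown", "blue", "brown", "green", "brown"]
colour_counts = Counter(eye_colours)
most_common_colour = colour_counts.most_common(1)[0][0]

print(sorted_sizes)
print(largest)
print(most_common_colour, colour_counts[most_common_colour])
```

The ranks 0 to 3 only record order; the gap between "small" and "medium" is not assumed to equal the gap between "large" and "extra large", so averaging the ranks would not be justified.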
Types of variables flowchart
• Data can be understood as the quantitative information about a specific characteristic. The characteristic
can be qualitative or quantitative, but for the purpose of statistical analysis a qualitative characteristic is
transformed into a quantitative one by assigning numerical values to it. Such a quantitative
characteristic is known as a variable, and it may be either discrete or continuous.
• These are also known as the result of numerical measurement or the
• An example is a small collection of sea shells gathered on the beach. All the shells in the collection are
similar: small disk-shaped shells with a hole in the center. But the shells also differ from one another in
overall size and weight, in color, in smoothness, in the size of the hole, etc. Any data set is something like
the shell collection. It consists of cases: the objects in the collection. Each case has one or more attributes
or qualities, called variables. The word “variable” emphasizes that it is differences or variation that are
often of primary interest. Usually, there are many possible variables. The researcher chooses those that
are of interest, often drawing on detailed knowledge of the system that is under study. The researcher
measures or observes the value of each variable for each case.
Univariate vs. Bivariate Data
• Statistical data are often classified according to the number of variables being studied.
• Univariate data. When one conducts a study that looks at only one variable, she is working with
univariate data. For example, when conducting a survey to estimate the average weight of high school
students, the researcher works with only one variable (weight), so the data are univariate.
• Bivariate data. When one conducts a study that examines the relationship between two variables, she is working
with bivariate data. For example, when conducting a study to see if there is a relationship between the height and
weight of high school students, the researcher works with two variables (height and weight), so the data are
bivariate.
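The two examples above can be sketched in Python. The heights and weights are made-up values for four students, and the Pearson correlation coefficient is used here as one common (assumed, not the only) way to measure a linear relationship between two variables.

```python
import math
import statistics

# Hypothetical measurements for four high school students.
heights_cm = [150, 160, 170, 180]
weights_kg = [50, 58, 69, 80]

# Univariate: one variable (weight) summarized on its own.
mean_weight = statistics.mean(weights_kg)

# Bivariate: the relationship between two variables (height and weight).
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(ss_x * ss_y)

r = pearson_r(heights_cm, weights_kg)

print("mean weight:", mean_weight)
print("height-weight correlation:", round(r, 3))
```

A value of r close to 1 indicates that taller students in this made-up data set also tend to be heavier.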
Qualitative and Quantitative Data
• Data collected about a numeric variable will always be quantitative and data collected about a categorical variable
will always be qualitative. Therefore, you can identify the type of data, prior to collection, based on whether the
variable is numeric or categorical.
Why are quantitative and qualitative data important?
• Quantitative and qualitative data provide different outcomes, and are often used together to get a full picture of
a population. For example, if data are collected on annual income (quantitative), occupation data (qualitative)
could also be gathered to get more detail on the average annual income for each type of occupation.
How can you use quantitative and qualitative data?
• It is important to identify whether the data are quantitative or qualitative, as this affects the statistics that can be
produced.
Examples:
• The age of your car. (Quantitative.)
• The number of musical instruments at home. (Quantitative.)
• The softness of a cat. (Qualitative.)
• The color of the sky. (Qualitative.)
• The number of coins in your pocket. (Quantitative.)
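To see how the type of data affects the statistics that can be produced, consider the small Python sketch below. The helper `summarize` and the data values are hypothetical; the sketch simply treats numbers as quantitative and everything else as qualitative.

```python
import statistics
from collections import Counter

def summarize(values):
    """Produce the summaries that make sense for the type of data given."""
    if all(isinstance(v, (int, float)) for v in values):
        # Quantitative data: arithmetic summaries are meaningful.
        return {
            "type": "quantitative",
            "mean": statistics.mean(values),
            "median": statistics.median(values),
        }
    # Qualitative data: only counts and the most frequent category are meaningful.
    return {
        "type": "qualitative",
        "counts": dict(Counter(values)),
        "mode": statistics.mode(values),
    }

car_ages_years = [3, 7, 1, 12, 7]                         # the age of a car
sky_colours = ["blue", "grey", "blue", "orange", "blue"]  # the color of the sky

age_summary = summarize(car_ages_years)
colour_summary = summarize(sky_colours)
print(age_summary)
print(colour_summary)
```

An average car age is a sensible number, whereas an "average sky color" is not defined, so the qualitative summary reports frequencies instead.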
The Four Levels/Scales of Measurement