2 Collection of Data
2018-19
10 STATISTICS FOR ECONOMICS
from crop to crop. As these values vary, they are called variables. The variables are generally represented by the letters X, Y or Z. Each value of a variable is an observation. For example, the food grain production in India varies from 108 million tonnes in 1970–71 to 272 million tonnes in 2016–17, as shown in the following table. The years are represented by variable X and the production of food grain in India (in million tonnes) is represented by variable Y.

TABLE 2.1
Production of Food Grain in India (Million Tonnes)

X          Y
1970–71    108
1978–79    132
1990–91    176
1997–98    194
2001–02    212
2015–16    252
2016–17    272

Here, the values of these variables X and Y are the ‘data’, from which we can obtain information about the production of food grains in India. To know the fluctuations in food grains production, we need the ‘data’ on the production of food grains in India for various years. ‘Data’ is a tool, which helps in understanding problems by providing information.

You must be wondering where ‘data’ come from and how we collect them. In the following sections we will discuss the types of data, methods and instruments of data collection and sources of obtaining data.

2. WHAT ARE THE SOURCES OF DATA?

Statistical data can be obtained from two sources. The researcher may collect the data by conducting an enquiry. Such data are called Primary Data, as they are based on first hand information. Suppose you want to know about the popularity of a filmstar among school students. For this, you will have to enquire from a large number of school students, asking them questions to collect the desired information. The data you get is an example of primary data.

If the data have been collected and processed (scrutinised and tabulated) by some other agency, they are called Secondary Data. They can be obtained either from published sources such as government reports, documents, newspapers and books written by economists, or from any other source, for example, a website. Thus, the data are primary to the source that collects and processes them for the first time and secondary for all sources that later use such data. Use of secondary data saves time and cost. For example, after collecting the data on the popularity of the filmstar among students, you publish a report. If somebody uses the data collected by you for a similar study, it becomes secondary data.

3. HOW DO WE COLLECT THE DATA?

Do you know how a manufacturer decides about a product or how a political party decides about a candidate? They conduct a survey by
Personal Interview
Advantages
• Highest response rate
• Allows use of all types of questions
• Better for using open-ended questions
• Allows clarification of ambiguous questions.
Disadvantages
• Most expensive
• Possibility of influencing respondents
• More time-taking.

Mailed Interview
Advantages
• Least expensive
• Only method to reach remote areas
• No influence on respondents
• Maintains anonymity of respondents
• Best for sensitive questions.
Disadvantages
• Cannot be used by illiterates
• Long response time
• Does not allow explanation of ambiguous questions
• Reactions cannot be watched.

Telephonic Interviews
Advantages
• Relatively low cost
• Relatively less influence on respondents
• Relatively high response rate.
Disadvantages
• Limited use
• Reactions cannot be watched
• Possibility of influencing respondents.
Suppose you want to study the average income of people in a certain region. According to the Census method, you would be required to find out the income of every individual in the region, add them up and divide by the number of individuals to get the average income of people in the region. This method would require huge expenditure, as a large number of enumerators have to be employed.

Alternatively, you select a representative sample, of a few individuals, from the region and find out their income. The average income of the selected group of individuals is used as an estimate of the average income of the individuals of the entire region.

Example
• Research problem: To study the economic condition of agricultural labourers in Churachandpur district of Manipur.
• Population: All agricultural labourers in Churachandpur district.
• Sample: Ten per cent of the agricultural labourers in Churachandpur district.

Most of the surveys are sample surveys. These are preferred in statistics for a number of reasons. A sample can provide reasonably reliable and accurate information at a lower cost and in a shorter time. As samples are smaller than the population, more detailed information can be collected by conducting intensive enquiries. As we need a smaller team of enumerators, it is easier to train them and supervise their work more effectively.

Activities
• In which years will the next Census be held in India and China?
• If you have to study the opinion of students about the new economics textbook of class XI, what will be your population and sample?
• If a researcher wants to estimate the average yield of wheat in Punjab, what will be her/his population and sample?

Now the question is how do you do the sampling? There are two main types of sampling, random and non-random. The following description will make their distinction clear.

Random Sampling

As the name suggests, random sampling is one where the individual units from the population (samples) are selected at random. Suppose the government wants to determine the impact of the rise in petrol price on the household budget of a particular locality. For this, a representative (random) sample of 30 households has to be taken and studied. The names of all 300 households of that area are written on paper and mixed, then 30 names to be interviewed are selected one by one.

In random sampling, every individual has an equal chance of being selected. In the above example, all 300 sampling units (also called sampling frame) of the population got an equal chance of being included in
the sample of 30 units and hence the sample, so drawn, is a random sample. This is also called the lottery method. Nowadays computer programmes are used to select random samples.

[Figure: A population of 20 kuchha and 20 pucca houses, shown with a representative sample and a non-representative sample.]

Exit Polls
You must have seen that when an election takes place, the television networks provide election coverage. They also try to predict the results. This is done through exit polls, wherein a random sample of voters who exit the polling booths are asked whom they voted for. From the data of the sample of voters, the prediction is made. You might have noticed that exit polls do not always predict correctly. Why?

Activity
• You have to analyse the trend of foodgrains production in India for the last fifty years. As it is difficult to collect data for all the years, you are asked to select a sample of production of ten years.

Non-Random Sampling

There may be a situation in which you have to select 10 out of 100 households in a locality. You have to decide which household to select and which to reject. You may select the households conveniently situated or the households known to you or your friend. In this case, you are using your judgement (bias) in selecting 10 households. This way of selecting 10 out of 100 households is not a random selection. In a non-random sampling method, the units of the population do not all have an equal chance of being selected, and the convenience or judgement of the investigator plays an important role in the selection of the sample. The units are mainly selected on the basis of judgement, purpose, convenience or quota, and such samples are non-random samples.

5. SAMPLING AND NON-SAMPLING ERRORS

Sampling Errors

A population consisting of numerical values has two important characteristics which are of relevance here. First, Central Tendency, which may be measured by the mean, the median or the mode. Second, Dispersion, which can be measured by calculating the “standard deviation”, “mean deviation”, “range”, etc.
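To see what these measures look like in practice, here is a minimal sketch in Python. It assumes the seven Y values of Table 2.1 as the set of numbers and uses Python's standard `statistics` module. Treating these seven years as a complete population is an assumption made only for illustration, and the variable names are our own. The mode is left out because every value occurs only once.

```python
import statistics

# Food grain production in India (million tonnes): the Y values of Table 2.1.
# Assumption: the seven values are treated as a complete population.
production = [108, 132, 176, 194, 212, 252, 272]

# Central tendency
mean = statistics.mean(production)
median = statistics.median(production)

# Dispersion
value_range = max(production) - min(production)
# Mean deviation: average absolute distance of each value from the mean.
mean_deviation = sum(abs(y - mean) for y in production) / len(production)
# pstdev is the population standard deviation (divides by N, not N - 1).
std_deviation = statistics.pstdev(production)

print(f"Mean: {mean:.2f}")
print(f"Median: {median}")
print(f"Range: {value_range}")
print(f"Mean deviation: {mean_deviation:.2f}")
print(f"Standard deviation: {std_deviation:.2f}")
```

If the values were a sample drawn from a larger population, `statistics.stdev` (which divides by N - 1) would be the usual choice instead of `statistics.pstdev`.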
Recap
• Data is a tool which helps in reaching a sound conclusion on any
problem.
• Primary data is based on first hand information.
• Survey can be done by personal interviews, mailing questionnaires
and telephone interviews.
• Census covers every individual/unit belonging to the population.
• Sample is a smaller group selected from the population from which
the relevant information would be sought.
• In a random sampling, every individual is given an equal chance
of being selected for providing information.
• Sampling error is due to the difference between the value of the
sample estimate and the value of the corresponding population
parameter.
• Non-sampling errors can arise in data acquisition, by non-response
or by bias in selection.
• Census of India and National Sample Survey are two
important agencies at the national level, which collect,
process and tabulate data on many important economic
and social issues.
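The recap points on random sampling and sampling error can be tried out with a short Python sketch. It imitates the lottery method for the locality of 300 households discussed in this chapter, drawing 30 of them with the standard `random` module. The spending figures are invented for illustration, the household labels and variable names are hypothetical, and taking the sampling error as "sample estimate minus population parameter" is an assumed sign convention.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch gives the same draw each time

# Hypothetical monthly petrol spending (in rupees) of 300 households.
# These figures are invented for illustration only.
population = {
    f"Household {i}": random.randint(500, 5000) for i in range(1, 301)
}

# Lottery method: each household has an equal chance of being drawn,
# and no household can be drawn twice.
sample_names = random.sample(list(population), k=30)

# Census approach: use every household in the locality.
population_mean = statistics.mean(list(population.values()))

# Sample approach: use only the 30 selected households.
sample_mean = statistics.mean([population[name] for name in sample_names])

# Sampling error: sample estimate minus the population parameter.
sampling_error = sample_mean - population_mean

print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean: {sample_mean:.2f}")
print(f"Sampling error: {sampling_error:.2f}")
```

Changing the seed gives a different sample and therefore a different sampling error, while the population mean depends only on the invented data. Increasing `k` towards 300 tends to shrink the error, since a census has no sampling error at all.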
EXERCISES