Quota sampling in online research: what is it and how to use it?
To get reliable data from a survey, it is necessary to obtain a representative sample of the
universe we want to study (a process known as sampling).
The most direct way to obtain a representative sample is simple random sampling. In this
type of sampling, all individuals that constitute the studied universe have the same probability
of being part of the sample. It would be something like placing all the individuals in an urn and
drawing them at random until the desired sample size is reached. Simple random
sampling guarantees the absence of bias in the sample selection.
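As a minimal sketch of the urn analogy, and assuming we had a complete list of the universe,
simple random sampling could look like this in Python. The universe of numbered individuals
and the function name `simple_random_sample` are illustrative assumptions, not part of any
specific survey tool.

```python
import random

def simple_random_sample(universe, sample_size, seed=None):
    """Draw sample_size individuals without replacement.

    Every individual in the universe has the same chance of being drawn.
    """
    rng = random.Random(seed)
    return rng.sample(universe, sample_size)

# Hypothetical universe: 10,000 individuals identified by a number.
universe = list(range(10_000))
sample = simple_random_sample(universe, 1_000, seed=42)
print(len(sample), "individuals drawn, all distinct:", len(set(sample)) == len(sample))
```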
But in practice, it is extremely difficult to use this sampling method. Many conditions
must be met in order to use simple random sampling: the first one is to have a complete list of
the individuals that constitute the universe. We also need to be able to reach them, and they
must be willing to participate. The refusal of a single individual to participate in a survey is
enough to invalidate an otherwise perfect process of random sampling.
Quota sampling
The above conditions are so demanding that simple random sampling becomes an extremely
tough, if not impossible, method to apply. That is why other sampling techniques, such as
quota sampling, are more commonly used.
Quota sampling aims to obtain a representative sample from a selection of individuals that is
not necessarily random, by forcing the distribution of some specific variables in the sample to
be identical to their distribution in the studied universe.
To use quota sampling, it is necessary to take two points into account:
(1) We need to select the variables that are relevant to the object of our study. For example,
if we are designing an electoral study, variables such as age, geographical location or
social class are relevant because they can condition voting behaviour.
In other words, voting behaviour depends on variables such as age. By contrast, the
height or the name of an individual is irrelevant in an electoral study.
(2) The second factor to consider is that we need to know the distribution of those
relevant variables in the studied universe. Otherwise, we will not be able to reproduce
that distribution in the sample. For example, if we want to carry out a study of the
general population and to fix a quota on the age variable, we can turn to census
studies and other information sources provided by government agencies in each
country. Those studies are carried out either with direct access to all the individuals or
through a sampling method very similar to simple random sampling. Both
guarantee the reliability of the information that we handle for the universe.
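The second point can be illustrated with a short sketch that turns a known distribution into
quota targets for a given sample size. The age brackets and percentages below are made-up
figures for illustration, not census data, and `quota_targets` is a hypothetical helper that uses
largest-remainder rounding so that the targets always add up to the sample size.

```python
def quota_targets(distribution, sample_size):
    """Convert universe shares (summing to 1) into whole-number quota targets."""
    raw = {group: share * sample_size for group, share in distribution.items()}
    targets = {group: int(value) for group, value in raw.items()}  # round down first
    # Hand out the units lost by rounding down to the largest remainders.
    shortfall = sample_size - sum(targets.values())
    by_remainder = sorted(raw, key=lambda g: raw[g] - targets[g], reverse=True)
    for group in by_remainder[:shortfall]:
        targets[group] += 1
    return targets

# Made-up age distribution of the universe (assumption for this example).
age_distribution = {"18-34": 0.30, "35-54": 0.37, "55+": 0.33}
targets = quota_targets(age_distribution, 1000)
print(targets)
```

Rounding matters little for a sample of 1,000, but with small samples or many quota cells it
avoids targets that do not add up to the intended total.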
9th (In)formative Capsule
How many and which variables should I use in my quotas?
Fixing quotas in a sampling design has costs. Regardless of the methodology used (face-to-face,
by phone or online), adding quotas raises the price of the fieldwork, as it forces us to screen
possible participants and turn away those whose quota has already been filled. Therefore, in
order to select relevant variables we should consider the following:
(1) Some relevant variables can be redundant (i.e. highly correlated), so it may be enough
to control only one of them. For example, if we fix a quota on the variable social
class, we could probably leave out other variables such as income, square metres of
dwelling or level of education.
(2) If the way in which we select individuals for the sample, although it is not purely random
(simple random sampling), does guarantee randomness regarding the relevant
variable, then we can leave that variable out of the quotas. For example, we might be
drawing a sample from the readers of a sports news website and, although it would clearly
not be a random sample (all the individuals are Internet users, sports fans and readers of
the sports site), it could happen that the readers of this website are geographically
distributed in the same way as the studied universe. In this case, it would not be necessary
to fix quotas for the variable region.
Considering the above two criteria, we should choose only the truly essential variables: those
that will guarantee a good level of representativity in the sample without excessively
increasing the study's cost.
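To make the cost argument concrete, here is a rough simulation, under stated assumptions, of
fieldwork with quotas: candidates arrive at random, and anyone whose quota cell is already full
is screened out. The arrival shares (70% under 40, 30% over 40) are invented for the example,
and `fill_quotas` is a hypothetical helper, not a description of how any particular panel works.

```python
import random

def fill_quotas(targets, arrival_shares, seed=0):
    """Accept arriving candidates until every quota cell is full.

    Returns the accepted counts per cell and the number of candidates
    screened out because their cell was already complete.
    Every cell with a positive target needs a positive arrival share,
    otherwise the loop would never finish.
    """
    rng = random.Random(seed)
    cells = list(targets)
    weights = [arrival_shares[cell] for cell in cells]
    accepted = {cell: 0 for cell in cells}
    screened_out = 0
    while any(accepted[cell] < targets[cell] for cell in cells):
        cell = rng.choices(cells, weights=weights, k=1)[0]
        if accepted[cell] < targets[cell]:
            accepted[cell] += 1
        else:
            screened_out += 1
    return accepted, screened_out

age_targets = {"under 40": 500, "over 40": 500}
arrivals = {"under 40": 0.7, "over 40": 0.3}  # assumed mix of candidates
accepted, screened_out = fill_quotas(age_targets, arrivals)
print(accepted, "screened out:", screened_out)
```

The further the natural mix of candidates is from the quota targets, the more candidates have to
be contacted and discarded, which is where the extra fieldwork cost comes from.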
Interlocked quotas or non-interlocked quotas?
When we apply quota control over two or more variables, we can define that
control in an interlocked or a non-interlocked way. For example, let's suppose we want to get a
sample of 1,000 people for an electoral study and we have identified two relevant variables:
sex (50% men and 50% women) and age (50% under 40 years old and 50% over 40 years old).
If we impose non-interlocked quota control, we will require that 500 of the 1,000
respondents in the sample be men and the other 500 women, and that 500 be under
40 years old and 500 over 40 years old. In other words, a sample of 500 men under 40
and 500 women over 40 would be valid.
However, if we define sex-age interlocked quotas, we will require the sample to be
composed of 250 men under 40, 250 men over 40, 250 women under 40 and 250 women over
40.
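The difference between the two controls can be checked with a small sketch that uses the
figures of the example above. The function names and the way respondents are represented
(as `(sex, age)` pairs) are assumptions made for illustration.

```python
from collections import Counter

def meets_non_interlocked(sample, sex_targets, age_targets):
    """Check each variable separately (marginal totals only)."""
    sex_counts = Counter(sex for sex, _ in sample)
    age_counts = Counter(age for _, age in sample)
    return sex_counts == Counter(sex_targets) and age_counts == Counter(age_targets)

def meets_interlocked(sample, cell_targets):
    """Check every sex-age combination (crossed cells)."""
    return Counter(sample) == Counter(cell_targets)

sex_targets = {"man": 500, "woman": 500}
age_targets = {"under 40": 500, "over 40": 500}
cell_targets = {(sex, age): 250 for sex in sex_targets for age in age_targets}

# The extreme sample from the text: 500 men under 40 and 500 women over 40.
lopsided = [("man", "under 40")] * 500 + [("woman", "over 40")] * 500

print(meets_non_interlocked(lopsided, sex_targets, age_targets))  # True
print(meets_interlocked(lopsided, cell_targets))                  # False
```

The lopsided sample satisfies both marginal quotas while leaving two of the four crossed cells
empty, which is exactly what interlocked quotas rule out.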
Defining interlocked quotas is more expensive than defining non-interlocked quotas. To
know whether or not it is necessary to interlock quotas, we should evaluate the relevance of
the variables and the method used to select individuals:
(1) Do the variables influence the study's object in an identical way when considered
separately or together? If two relevant variables are independent of each other, we
can treat them as non-interlocked. In the example above, if age influences
voting behaviour in the same way for men and women, we could use non-interlocked
quotas.
(2) Can the way in which we select individuals for our sample favour the over-representation
of some groups defined by the combination of relevant variables? If the
way we select individuals guarantees per se the correct proportions of the crossed
variables, we could dispense with this control.
Offline and online quotas
The variables most often used to define quotas in any methodology are sociodemographic: sex,
age, region... They are usually influential in most of the subjects studied and are
easily controllable, as there are reliable data sources to compare with. However, each
methodology has its own particularities, so it is advisable to adapt the use of quotas.
When we run studies based on personal interviews, geographical quotas are often simplified
because of the huge cost of sending interviewers to all the towns and cities
of a country. That is why it is usual to fix quotas in the main cities, assuming that those cities
represent perfectly the individual behaviour of a wider region. For example, in a sample from
Brazil, it is common to request a specific number of individuals for São Paulo, Rio de Janeiro
and Recife. For Mexico we would use Mexico City and Monterrey, and in Spain we would consider
Madrid and Barcelona.
In telephone studies, it is especially relevant to consider the variable occupation,
because we have to contact people when they are at home and we could mistakenly
over-represent unemployed and retired people.
The use of quotas in online studies is still understudied. A widespread, and possibly
mistaken, practice is to use the same quotas as those defined when the study was run
offline.
Below are a number of considerations to keep in mind when using the Internet to obtain
samples:
(1) Most online research uses panels: It is important to remember how panels work and
which criteria are used when recruiting their members. For example, panels try to
survey all their panelists to a similar extent. So, if some profiles are
requested more often in studies (mums, people with high incomes, etc.), they can be
over-represented in the panel. An online panel is not in itself representative of the
population; it is a place in which you can define representative samples using appropriate
quotas.
(2) Geographical dispersion is not a problem when it comes to online research: On the
contrary, it is an opportunity. If we want to obtain a representative sample from a region, we
do not need to limit the research to a few cities or towns as we used to do offline.
When such limits are applied, they reduce the representativity and increase the cost
of the study, as it is cheaper to recruit people from wide regions than from a
specific town.
(3) All individuals use the Internet: It is obvious, but it needs to be taken into account. We
cannot fix quotas for non-Internet users or for illiterate people: everybody who
uses the Internet can read. Additionally, Internet use can correlate strongly with
other variables such as technology usage, which may make it necessary to set quotas on such
variables. For example, if we are carrying out a study about mobile technology, the
type of contract (contract/pre-paid) can influence the expected results. This can
happen because the contract types of more advanced users (those with mobile Internet
access) can be over-represented in an online panel.
(4) The adoption of the Internet in each country is a key factor in defining good quotas: If a
population group has adopted the Internet to a greater extent than another one, it
will be over-represented in an online sample. Variables such as age and
social class are essential, especially in Latin America, a region where social
differences are really extreme, which means that lower social classes have difficulties
when accessing the Internet.
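The effect described in point (4) is simple arithmetic, sketched below with invented figures:
if two groups are equally large in the population but one has adopted the Internet twice as
much, it will make up two thirds of the online population. The shares and adoption rates are
assumptions for illustration only, and `online_shares` is a hypothetical helper.

```python
def online_shares(population_shares, adoption_rates):
    """Share of each group among Internet users.

    Each group's weight is its population share times its Internet
    adoption rate, rescaled so that the shares add up to 1.
    """
    online = {g: population_shares[g] * adoption_rates[g] for g in population_shares}
    total = sum(online.values())
    return {g: value / total for g, value in online.items()}

population = {"under 40": 0.5, "over 40": 0.5}  # made-up population shares
adoption = {"under 40": 0.8, "over 40": 0.4}    # made-up Internet adoption rates
shares = online_shares(population, adoption)
for group, share in shares.items():
    print(f"{group}: {share:.1%} of Internet users vs {population[group]:.1%} of the population")
```

Without an age quota, a sample drawn from this online population would tend to contain about
67% of people under 40 instead of the 50% found in the universe.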
Ultimately, when we carry out an online survey, quotas are an essential tool to achieve
representativeness. So, we have to bear in mind the study's object as well as the way in
which the sample is obtained when selecting the variables used in quotas.