American Marketing Association is collaborating with JSTOR to digitize, preserve and extend access to
Journal of Marketing Research.
http://www.jstor.org
Regression Model for Market Segmentation Studies
One of the oldest and most popular types of marketing research activity is the explanation of the variance
in some measure of consumption among a population of consuming units. It is usually done by relating such
behavior to one or more managerially relevant characteristics of the consuming unit. Typical of such
research are the attempts to use demographic and socioeconomic variables to explain variations in
household consumption (purchase) of food items (Frank 1968). A common feature of many of these
analyses is the use of the normal regression model of the form

$$x = Z\beta + u \tag{1}$$

... unbiasedness, minimum variance, linearity, and maximum likelihood (Johnston 1972, p. 126).
However, the appropriateness of using OLS in analyzing consumption data has been questioned.
Empirical work by Ehrenberg (1972) suggests that typical measures of consumption and purchase, such
as number of items purchased in a given time period, are not normally distributed but are better
represented by a Poisson process. Morrison (1973) indicates that a strict Poisson process may not be
appropriate because the mean of the process may not always be equal to the variance, and proposes that
the process may be better represented by a distribution whose variance is proportional to the mean.
In addition, Morrison argues that the usual $R^2$ statistic obtained from such an OLS regression is not
an appropriate statistic for evaluating the results of the usual segmentation study. This argument
centers on the notion that, though a model may accurately predict a consumer's average purchase rate,
it may do a very poor job of predicting the exact number of purchases for that consumer in any given
time period. Beckwith and Sasieni (1976) develop these ideas further, but do not offer a feasible
alternative to OLS for the researcher interested in relating consumption behavior to predictor
variables, such as consumer characteristics. Wildt (1976) suggests a regression approach and a
decomposition of the error variance that permits one to assess the results of a segmentation analysis.
We report on the development of a regression model
$$x_{ij} = \lambda_i + v_{ij}, \qquad i = 1, \ldots, N; \quad j = 1, \ldots, m$$

$$\lambda_i = z_i'\beta + \varepsilon_i$$
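A small simulation may help fix ideas about this two-level structure. The sketch below is illustrative only: the source leaves the distribution of $x_{ij}$ given $\lambda_i$ unspecified, so the Poisson draw (which corresponds to $k = 1$), the normal household effect, the positive floor on $\lambda_i$, and every function name and parameter value are assumptions made for the example.

```python
import math
import random

def poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, prod = 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= limit:
            return count
        count += 1

def simulate_panel(Z, beta, sigma_eps, m, rng):
    """Return (lam, X): household mean rates and an N x m purchase matrix.

    lam_i = z_i'beta + eps_i with eps_i ~ Normal(0, sigma_eps^2) (assumed),
    x_ij | lam_i ~ Poisson(lam_i), so Var(x_ij | lam_i) = k * lam_i with k = 1.
    """
    lam, X = [], []
    for z in Z:
        mean_rate = sum(zi * bi for zi, bi in zip(z, beta)) + rng.gauss(0.0, sigma_eps)
        mean_rate = max(mean_rate, 0.05)  # assumed floor: a purchase rate must be positive
        lam.append(mean_rate)
        X.append([poisson(mean_rate, rng) for _ in range(m)])
    return lam, X

rng = random.Random(0)
Z = [[1.0, float(i % 3)] for i in range(50)]  # hypothetical intercept plus one characteristic
lam, X = simulate_panel(Z, beta=[2.0, 1.0], sigma_eps=0.5, m=36, rng=rng)
print(len(X), len(X[0]))
```

Each row of `X` plays the role of one household's $m$ monthly purchase counts; replacing the Poisson draw with another count distribution whose variance is proportional to its mean would give other values of $k$.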
ESTIMATION
The estimation of the model is influenced by two considerations: (1) the functional form of the
distribution of $x_{ij}$ given $\lambda_i$ is unspecified (only the mean and variance are assumed) and
(2) the parameters $\sigma_\varepsilon^2$, $k$, and $\lambda_i$, $i = 1, \ldots, N$, are unknown. In
this section we discuss the estimation of the $Nm \times Nm$ variance-covariance matrix of the $u_{ij}$
in equation 4 and a modified Aitken's procedure for estimating $\beta$. The approach taken here is
similar to that suggested by Wallace and Hussain (1969) in the context of combining cross-section with
time-series data.

If the $u_{ij}$ were observable, they could be considered in the context of a one-way, random effects,
analysis of variance model with $N$ levels and $m$ observations per level, and best quadratic unbiased
estimators of $\sigma_i^2 = k\lambda_i$ and $\sigma_\varepsilon^2$ could be obtained (Graybill 1961,
Ch. 16). Further, if the $\lambda_i$ were also observable, the least squares estimator of $k$ could be
obtained by considering the relationship
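$$\frac{\sum_{j=1}^{m} (u_{ij} - \bar{u}_{i\cdot})^2}{m-1} = k\lambda_i + e_i, \qquad i = 1, \ldots, N,$$

where $\bar{u}_{i\cdot} = \sum_{j=1}^{m} u_{ij}/m$ and the $u_{ij}$ are the disturbances of equation 4,

$$x_{ij} = z_i'\beta + \varepsilon_i + v_{ij} = z_i'\beta + u_{ij}. \tag{4}$$

[The remaining estimator expressions at this point are not recoverable from the source.]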
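The iteration between variance-component estimates and a weighted regression can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the unobservable $\lambda_i$ are replaced by fitted values $z_i'\hat{\beta}$, $k$ is estimated by least squares through the origin from the relationship above, and the moment estimator used for $\sigma_\varepsilon^2$ is an assumption of this sketch. Because $z_i$ does not vary over $j$, generalized least squares with this error structure reduces to weighted least squares on the household means with weights $1/(\sigma_\varepsilon^2 + k\lambda_i/m)$, which is what the code uses. All function names are hypothetical.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def wls(Z, y, w):
    """Weighted least squares: minimize sum_i w_i * (y_i - z_i'b)^2."""
    p = len(Z[0])
    A = [[sum(wi * z[a] * z[b] for z, wi in zip(Z, w)) for b in range(p)] for a in range(p)]
    c = [sum(wi * z[a] * yi for z, yi, wi in zip(Z, y, w)) for a in range(p)]
    return solve(A, c)

def igls(Z, X, max_iter=25, tol=1e-8):
    """Iterate between variance-component estimates and weighted least squares."""
    N, m, p = len(X), len(X[0]), len(Z[0])
    xbar = [sum(row) / m for row in X]
    # Within-household sample variances; z_i'beta cancels, so residuals are not needed.
    s2 = [sum((x - xb) ** 2 for x in row) / (m - 1) for row, xb in zip(X, xbar)]
    beta = wls(Z, xbar, [1.0] * N)  # OLS on household means as the starting value
    for iteration in range(1, max_iter + 1):
        lam = [max(sum(zi * bi for zi, bi in zip(z, beta)), 1e-8) for z in Z]
        # Least squares through the origin for s2_i = k * lam_i + e_i.
        k = sum(l * s for l, s in zip(lam, s2)) / sum(l * l for l in lam)
        # Assumed moment estimator: residual mean square of the means minus its within part.
        msa = sum((xb - l) ** 2 for xb, l in zip(xbar, lam)) / (N - p)
        sigma2 = max(msa - k * sum(lam) / (N * m), 0.0)
        weights = [1.0 / max(sigma2 + k * l / m, 1e-12) for l in lam]
        new_beta = wls(Z, xbar, weights)
        converged = max(abs(a - b) for a, b in zip(new_beta, beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta, k, sigma2, iteration

Z = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]  # hypothetical design matrix
X = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]      # hypothetical purchase counts
beta, k, sigma2, n_iter = igls(Z, X)
print(beta, k, sigma2, n_iter)
```

In this toy data set the household means lie exactly on a line, so the among-household component is estimated as zero and the iteration stops at once; real panel data would normally need a few passes.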
$$E(u_{ij} u_{kq}) = \begin{cases} \sigma_\varepsilon^2 + k\lambda_i, & i = k,\ j = q \\ \sigma_\varepsilon^2, & i = k,\ j \neq q \\ 0, & i \neq k \end{cases}$$

$$SSA = \sum_{i=1}^{N} \sum_{j=1}^{m} (\bar{x}_{i\cdot} - z_i'\hat{\beta})^2 = m \sum_{i=1}^{N} (\bar{x}_{i\cdot} - z_i'\hat{\beta})^2,$$

$$SSW = \sum_{i=1}^{N} \sum_{j=1}^{m} (x_{ij} - \bar{x}_{i\cdot})^2,$$

$$F = \frac{SSR/(p-1)}{SSA/(N-p)},$$

$$\bar{R}^2 = 1 - s^2/s_T^2, \qquad \text{where } s_T^2 = SST/(Nm-1).$$

[Further expressions involving $SSA/(N-p)$, $SST/(Nm-1)$, and $SSW/N(m-1)$ are not recoverable from the source.]

Data
The data are for a low-price, frequently purchased consumer good and consist of 36 monthly observations
for the three-year period 1964-1966 taken from the household purchase panel operated by Market Research
Corporation of America (MRCA). (See McCann 1974 for a more detailed description of the data.) Data
were made available for only those panel members who purchased the product category at least once
during the time period considered. As with most panels, some members dropped out of the panel (or
were added to the panel) during the study period and some panel members occasionally failed to submit
a diary. For these data it was impossible to distinguish between zero purchases and missing data. Hence,
the available data were screened for those households who were in the panel for the full three-year
period and who submitted diaries every week. The screening resulted in a usable sample of 110
households. In addition to the purchase behavior of the households, measures were taken on several
household characteristics.

Analysis
Models relating to the problem under discussion typically consider a single criterion variable, which
is some measure of purchase or consumption, and a set of predictor variables, often consisting of
consumer characteristics. In this section we consider three different models. Model I is an ordinary
least squares model relating the number of equivalent units of the product purchased per month to a
set of predictor variables consisting of education level of head of household, household size, and
household income. Table 1 describes these variables in more detail. Model II, the iterative generalized
least squares model previously described, uses the same criterion and predictor variables as Model I.
Model III is an ordinary least squares model using the same predictor variables as the other models,
but with the average number of equivalent units of the product purchased per month as the criterion
variable. Though our purpose is to illustrate the estimation of Model II, the other two models provide
a useful comparison. The results of all three models are presented in Table 2.

Table 1
DEFINITION OF PREDICTOR VARIABLES

A. Dummy variables
   1. Education of head of household
        Education 1    1 = 9-12 years             0 = other
        Education 2    1 = 13 or more years       0 = other
        (excluded class = 0-8 years)
   2. Household size
        HH Size 1      1 = 3 members              0 = other
        HH Size 2      1 = 4 or 5 members         0 = other
        HH Size 3      1 = 6 or more members      0 = other
        (excluded class = 1 or 2 members)
B. Continuous variable
   Household income (thousands of dollars)
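The coding scheme of Table 1 can be written out as a short function. This is a sketch for illustration; the function name, argument names, and the example household are hypothetical, and the leading 1 is the constant term included in all three models.

```python
def code_household(education_years, household_size, income_thousands):
    """Return [1, Education 1, Education 2, HH Size 1, HH Size 2, HH Size 3, income]."""
    education_1 = 1 if 9 <= education_years <= 12 else 0   # excluded class: 0-8 years
    education_2 = 1 if education_years >= 13 else 0
    hh_size_1 = 1 if household_size == 3 else 0            # excluded class: 1 or 2 members
    hh_size_2 = 1 if household_size in (4, 5) else 0
    hh_size_3 = 1 if household_size >= 6 else 0
    return [1, education_1, education_2, hh_size_1, hh_size_2, hh_size_3, income_thousands]

row = code_household(education_years=12, household_size=4, income_thousands=10.0)
print(row)  # [1, 1, 0, 0, 1, 0, 10.0]
```

A household in both excluded classes (for example, 6 years of education and 2 members) has every dummy equal to zero, so its predicted purchases depend only on the constant and income.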
The results of pooling all of the data (36 observations
on 110 households) and using ordinary least squares
(OLS) to estimate the unknown parameters are shown
in the first panel of Table 2. The $R^2$ value is relatively
low (0.14) and all predictor variables have coefficient
estimates with absolute values greater than twice their
estimated standard errors.
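The Model I summary statistics can be reproduced from the sums of squares reported in Table 2. The sketch below uses the standard definitions of $R^2$, adjusted $R^2$, and the standard error of estimate; the function names are illustrative, and $p = 7$ counts the constant plus the six predictors.

```python
import math

def r_squared(ss_regression, ss_error):
    """Share of the total sum of squares explained by the regression."""
    return ss_regression / (ss_regression + ss_error)

def adjusted_r_squared(ss_regression, ss_error, n_obs, n_params):
    """1 - s^2 / s_T^2, with degrees-of-freedom corrections in both variances."""
    s2 = ss_error / (n_obs - n_params)
    s2_total = (ss_regression + ss_error) / (n_obs - 1)
    return 1.0 - s2 / s2_total

N, m, p = 110, 36, 7                    # households, months, estimated coefficients
ss_reg, ss_err = 6693.86, 39965.83      # Model I sums of squares from Table 2

r2 = r_squared(ss_reg, ss_err)
adj_r2 = adjusted_r_squared(ss_reg, ss_err, N * m, p)
std_error = math.sqrt(ss_err / (N * m - p))
print(round(r2, 3), round(adj_r2, 3), round(std_error, 3))  # 0.143 0.142 3.18
```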
The results of the iterative generalized least squares
regression are shown in the second panel of Table
2. These results indicate that approximately 23% of
the total variation in consumption (and 27% of the
error variance) is attributable to within-household
variation, i.e., variation of individual purchases about
Table 2
ANALYSIS RESULTS FOR THREE MODELS

                         Model I                     Model II                    Model III
Estimation method        OLS                         IGLS                        OLS
Dependent variable       Monthly purchases           Monthly purchases           Average monthly purchases

Parameter estimates
                         Model I                     Model II                    Model III
Variable              Est.    S.E.  Est./S.E.     Est.    S.E.  Est./S.E.     Est.    S.E.  Est./S.E.
Constant              2.11    0.10    20.86       2.09    0.52     4.02       2.11    0.53     3.94
Education 1           0.88    0.14     7.68       0.86    0.59     1.46       0.88    0.60     1.45
Education 2           0.57    0.16     3.60       0.56    0.81     0.69       0.57    0.84     0.68
HH size 1             0.81    0.15     5.28       0.81    0.79     1.03       0.81    0.81     1.00
HH size 2             0.93    0.15     6.11       0.93    0.78     1.19       0.93    0.81     1.15
HH size 3             5.36    0.22    24.41       5.32    1.14     4.68       5.36    1.16     4.61
Income               -0.08    0.01    -6.47      -0.08    0.06    -1.25      -0.08    0.07    -1.22

Summary statistics                        Model I    Model II    Model III
Multiple R                                 .379       .379        .432
R^2                                        .143       .143        .186
Adjusted R^2                               .142                   .138
Est. std. error                           3.180                  2.807
R_lambda^2                                            .186
Adjusted R_lambda^2                                   .181
Estimated k                                          0.974
Est. variance of mean
  purchase rate (sigma_epsilon^2)                    7.312
No. of iterations                                        2

Analysis of variance
                         Model I                     Model II                    Model III
Source            Sum of sq.   d.f.  F-ratio   Sum of sq.   d.f.  F-ratio    Sum of sq.   d.f.  F-ratio
Regression          6693.86       6   110.35     6690.62       6    3.98(a)     185.94       6     3.93
Error              39965.83    3953             39969.07    3953                811.62     103
  Among                                         29221.65     103  101.63(b)
  Within                                        10747.42    3850

(a) F-ratio for testing the significance of z_i in explaining the mean purchase rate, lambda_i. The value of the traditional F-ratio, F = MSR/MSE, is 110.29.
(b) Computed as F = MSA/MSW and used in testing the hypothesis: sigma_epsilon^2 = 0.
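The F-ratios and variance shares quoted for Table 2 follow from its sums of squares and degrees of freedom, as the short check below illustrates. The helper name is hypothetical; all numbers are taken from the table. One caveat: computed from the rounded sums of squares, the Model II ratio MSR/MSA comes out near 3.93, whereas the table entry reads 3.98, and this sketch does not attempt to resolve that difference.

```python
def f_ratio(ss_num, df_num, ss_den, df_den):
    """Ratio of two mean squares."""
    return (ss_num / df_num) / (ss_den / df_den)

# Model I: regression against pooled error.
f_model_1 = f_ratio(6693.86, 6, 39965.83, 3953)

# Model II: error sum of squares split into among- and within-household parts.
ss_reg, ss_err = 6690.62, 39969.07
ss_among, ss_within = 29221.65, 10747.42
f_traditional = f_ratio(ss_reg, 6, ss_err, 3953)          # footnote a: MSR/MSE
f_among_within = f_ratio(ss_among, 103, ss_within, 3850)  # footnote b: MSA/MSW
f_mean_rate = f_ratio(ss_reg, 6, ss_among, 103)           # MSR/MSA

# Model III: regression on household means.
f_model_3 = f_ratio(185.94, 6, 811.62, 103)

# Within-household share of total variation and of error variation.
share_of_total = ss_within / (ss_reg + ss_err)
share_of_error = ss_within / ss_err

print(round(f_model_1, 2), round(f_traditional, 2), round(f_among_within, 2))
print(round(f_mean_rate, 2), round(f_model_3, 2))
print(round(share_of_total, 2), round(share_of_error, 2))
```

The last line gives roughly 0.23 and 0.27, the shares of total variation and of error variance attributed to within-household variation.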
$$E(\bar{u}_{i\cdot}^2) = E_\lambda E[(\varepsilon_i + \bar{v}_{i\cdot})^2] = \sigma_\varepsilon^2 + E_\lambda[k\lambda_i/m].$$

But as $m \to \infty$, $k\lambda_i/m \to 0$, and $E(\bar{u}_{i\cdot}^2) \to \sigma_\varepsilon^2$. Therefore,
as $m \to \infty$, the OLS estimator of Model III will approach BLUE and, as a property of Aitken's
estimator, it is a minimum variance unbiased linear estimator of $\beta$. In fact, the estimators for
Models II and III will, for all practical purposes, be identical for very large $m$. For the example
data used, these two analyses were repeated using only the first six purchase occasions. The results
are not reported here, but the difference in estimated coefficients ranged up to approximately 9% and
the estimated standard errors of IGLS coefficients were about 2.5 to 3% lower than the Model III OLS
values. Even with only six observations per household, however, the among-household variation was very
large. One other comparison worth noting is that the $R^2$ of Model III will be higher than that of
Model II but identical to the $R_\lambda^2$ of Model II. However, the adjusted $R^2$'s are different,
the adjusted $R_\lambda^2$ being the more representative statistic for describing the variation in
mean purchase rates explained by the model.
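The limiting argument can be illustrated numerically with the Model II estimates from Table 2. In the sketch below the function name and the mean purchase rate of 3.0 units per month are assumptions chosen only for illustration; the estimates of $k$ and $\sigma_\varepsilon^2$ are those reported in the table.

```python
def variance_of_household_mean(sigma2_eps, k, lam, m):
    """sigma_eps^2 + k * lam / m: among-household part plus the within part of an m-month mean."""
    return sigma2_eps + k * lam / m

sigma2_eps, k = 7.312, 0.974  # Model II estimates from Table 2
lam = 3.0                     # hypothetical mean purchase rate, not from the source

for m in (6, 36, 360):
    total = variance_of_household_mean(sigma2_eps, k, lam, m)
    within_share = (k * lam / m) / total
    print(m, round(total, 3), round(within_share, 3))
```

Under these assumed values the within-household term is already a small fraction of the variance of a household mean at six observations and shrinks further as $m$ grows, which is consistent with Models II and III giving similar estimates.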
After all things are considered, Model II, the IGLS
model, offers a more realistic representation of the
process under investigation and is more consistent
with available empirical evidence. With the functional
form and independent variables correctly specified,
the estimated coefficients of all three models are
unbiased. However, the modified Aitken's estimator
of Model II has the smallest true variance of the three
and the estimated variances of the coefficients for
it are unbiased, whereas they are biased in the case
of Models I and III. This fact alone should be sufficient
grounds for the acceptance of the IGLS model, Model
II. However, as has been shown, if the number of purchase occasions becomes large, the unexplained
variation in Model III attributable to within-purchasing-unit variation becomes small and Model III
provides a reasonably good estimate of $\beta$. To extend this reasoning further, in any case where
the difference between $\sigma_\varepsilon^2$ and $E(\bar{u}_{i\cdot}^2)$ is small, the gain in
efficiency of Aitken's estimator over the OLS estimate of $\beta$ based on Model III may not be great
enough to justify the added computational effort. This situation can occur when $m$ is large or when
$\sigma_\varepsilon^2$, the unexplained variance in mean purchase rate, is large.
SUMMARY
We incorporate the long-standing marketing research effort of explaining variation in observed consumption behavior (by person-specific characteristics)
REFERENCES
Bass, Frank M. (1974), "The Theory of Stochastic Preference and Brand Switching," Journal of Marketing Research, 11 (February), 1-20.
Beckwith, Neal E. and Maurice W. Sasieni (1976), "Criteria
for Market Segmentation Studies," Management Science,
22 (April), 892-903.
Ehrenberg, A. S. C. (1972), Repeat Buying. Amsterdam:
North Holland Publishing Co.
Frank, Ronald E. (1968), "Market Segmentation Research: Findings and Implications," in The Application
of the