
Communications in Statistics - Simulation and Computation

Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/lssp20

Support Vector Clustering for Customer Segmentation on Mobile TV Service

Pedro Albuquerque (a), Solange Alfinito (a) & Claudio V. Torres (b)

(a) Department of Administration, University of Brasília, Brasília, Brazil
(b) Department of Psychology, University of Brasília, Brasília, Brazil

Accepted author version posted online: 20 Aug 2014. Published online: 10 Dec 2014.

To cite this article: Pedro Albuquerque, Solange Alfinito & Claudio V. Torres (2015) Support Vector
Clustering for Customer Segmentation on Mobile TV Service, Communications in Statistics - Simulation
and Computation, 44:6, 1453-1464, DOI: 10.1080/03610918.2013.794289
To link to this article: http://dx.doi.org/10.1080/03610918.2013.794289


Communications in Statistics - Simulation and Computation®, 44: 1453-1464, 2015
Copyright © Taylor & Francis Group, LLC
ISSN: 0361-0918 print / 1532-4141 online
DOI: 10.1080/03610918.2013.794289

Support Vector Clustering for Customer Segmentation on Mobile TV Service

PEDRO ALBUQUERQUE,1 SOLANGE ALFINITO,1 AND CLAUDIO V. TORRES2

1 Department of Administration, University of Brasília, Brasília, Brazil
2 Department of Psychology, University of Brasília, Brasília, Brazil


This article presents a proposal for customer segmentation through Support Vector Clustering, a technique that has been gaining attention in the academic literature due to the good results usually obtained. An application of this method to a sample of Brazilian consumers regarding a mobile TV service is performed, and we compare this approach with the classical hierarchical methods of cluster analysis. It is concluded that this methodology is effective in reducing the heterogeneity often present in customer data, improving the cluster segmentation analysis for managers.

Keywords Customer relationship management; Customer segmentation; Machine learning

Mathematics Subject Classification 62H30; 62P25; 68T10

1. Introduction
Customer segmentation and pattern recognition through Computational Statistics in the management environment is critical to the success of organizations, as highlighted by Dibb (1998). Market segmentation can help organizations deal with the heterogeneity frequently present in consumers' profiles, since for most firms adopting a single form of relationship, or developing a single strategy to serve all customers, is unrealistic and could result in higher costs. In this context, the development of Expert Systems to support the manager in his strategies and marketing plans, in order to model the different types of consumers, is essential to the organization (Peppard, 2000).
The large amount of data that permeates the organizational environment, whether due to the computational progress achieved in recent decades or due to more effective Information Management, often makes the task of customer segmentation intractable. In such cases, an approach that has obtained good results is the use of Machine Learning (Florez-Lopez and Ramon-Jeronimo, 2008). Machine Learning is a multidisciplinary field of Expert Systems that combines statistical techniques with mathematical and computational research, and whose primary focus is to provide automatic recognition of complex patterns and then make intelligent decisions based on this information.

Received December 10, 2012; Accepted April 1, 2013

Address correspondence to Pedro Albuquerque, Management Department, Universidade de Brasília, Campus Universitário Darcy Ribeiro, Brasília, 70254-070, Brazil; E-mail: pedroa@unb.br


According to Kaufmann (1990), the history of Machine Learning can be conveniently divided into three major periods of activity, namely: exploration of data (the 1950s and 1960s), development of practical algorithms (the 1970s), and the explosion of research directions (from the 1980s onward). With this explosion of research directions, associated with the spread of personal computers, tools for analyzing complex patterns became available. It is possible to cite methods and models already popular in the Expert Systems field, such as Neural Networks, Fuzzy Logic, Classification Trees, and Support Vector Machines (SVM), among others.
Specifically, Support Vector Machines (SVM) are a set of computational algorithms constructed with the intention of learning from the available data. For example, SVM has been used to recognize credit risk patterns (Huang et al., 2004), for face recognition in biometrics (Osuna et al., 1997), and in marketing (Cheung et al., 2003) and finance (Huang et al., 2005) applications.
In this article, we apply the SVM approach to consumer segmentation for a mobile TV service through Support Vector Clustering (SVC). The text is organized as follows: Section 2 presents the theoretical framework associated with customer segmentation and Machine Learning; Section 3 describes the SVC and how it can be applied to customer segmentation; finally, the last two sections present the application of SVC to data obtained from a survey conducted in Brazil in 2008 and the conclusions of the work. This article contributes to the incipient literature on customer segmentation through Machine Learning in an Expert Systems context, providing a more robust segmentation technique. It thus gives direction to researchers interested in modeling customer profiles and patterns, while promoting the integration and dissemination of this kind of knowledge.

2. Material and Methods


In the competitive environment in which organizations are embedded, customer segmentation through effective Customer Relationship Management (CRM) is necessary, since customer needs are becoming increasingly diversified (Dibb, 1998). These needs cannot be attended to by a mass marketing approach. Hence, it is important to formulate differentiated marketing policies for the different types of customers that companies serve.
Since an individualized service policy is impractical, because the costs of personalized service may outweigh the financial benefits of individual assistance, it is desirable for a company's marketing strategy to search for an optimal solution in which customers are segmented into homogeneous groups (in order to properly model the heterogeneity present in consumers' profiles) while keeping the number of groups reasonable (so that the manager can work with these clusters in a feasible way).
The basic assumption of customer segmentation rests on the hypothesis that clients demonstrate heterogeneity in their product preferences and also in their buying behaviors (Green, 1977). For instance, Green (1977) formulates six segments through associative networks to capture the complete image of a brand, discussing the implications of perceptual segmentation for image management, brand positioning, competitive analysis, and perceptual brand communication.
Wind (1978) analyzes the state of, and advances in, segmentation research during the 1970s, including the definition of the segmentation problem, research design considerations, types of approaches to data collection procedures, data analysis, and the implementation of customer segmentation.


Despite the importance of customer segmentation in environments permeated by heterogeneity in consumers' profiles, there is no consensus in the literature about which statistical technique is best for partitioning customers into similar groups. Statistical techniques such as cluster analysis and principal component analysis combined with discriminant analysis (DA), or models such as logistic regression, have traditionally been used for building segmentation models.
The existence of large volumes of data, together with the correlational characteristics of individuals, can reduce the fit, robustness, and interpretability of these models (Berger and Nasr, 1998; Rao and Steckel, 1995; Reinartz and Kumar, 2000; Schmittlein and Peterson, 1994).
Unlike traditional statistical models, Machine Learning techniques do not impose restrictive assumptions on the data. These methods, combined with stored customer history data, can provide strategic information to optimize the efficiency of promotion policies, using socioeconomic and behavioral trends as input to develop models that can predict future customer decisions (Gath and Geva, 1989; Hruschka and Natter, 1999; Kim et al., 2005).
Among the models used in Machine Learning marketing applications, we can cite decision trees (Breiman et al., 1984; Florez-Lopez and Ramon-Jeronimo, 2008), SVM (Cui and Curry, 2005; Zhao et al., 2005), Neural Networks (Baesens et al., 2002; Venugopal and Baets, 1994; Wray et al., 1994), and Fuzzy Logic (Li et al., 2002; Yager, 2000). Despite the vast literature on Expert Systems applied to marketing, few texts explore the possibility of using SVC as a tool for customer segmentation and for modeling the profiles of the resulting clusters. The use of SVC is motivated by the fact that this model outperforms the classical group analysis models commonly used in customer segmentation.
Ben-Hur et al. (2002) were the forerunners of the SVC method. The authors demonstrate that a new clustering method can be developed from the SVM framework; this methodology was called SVC. The basic idea behind SVC is that the data points are mapped, usually through a Gaussian kernel (or other types of kernels), to a feature space of dimension larger than that of the original data space, where the aim is to find the smallest enclosing hypersphere. This sphere, when mapped back into the data space, separates the observations into various components. Each component encloses a separate group of points, thus forming observational agglomerates. Ben-Hur et al. (2002) present a simple algorithm for identifying these clusters. The difference of this algorithm from classical cluster analysis, such as hierarchical cluster analysis, is that the SVC method can generate clusters with arbitrary shape, while other algorithms are more often limited to hyperellipsoids (Jain and Dubes, 1988).
Chiang and Hao (2003) used SVC to recognize handwritten text. The authors demonstrate that this approach yields good results in generating clusters with arbitrary geometry, thus corroborating the findings of Ben-Hur et al. (2002).
Xu and Zhang (2005) present an application of SVC to the study of intrusion detection in computer networks. Instead of using a similarity measure such as the Euclidean distance adopted by Ben-Hur et al. (2002), the authors propose other measures of similarity. Their results support the superiority of the SVC method in determining clusters with arbitrary geometry, therefore decreasing the heterogeneity.
Hao et al. (2007) state that the automatic categorization of documents into predefined hierarchies or taxonomies is a crucial step in Content Management. According to the authors, standard Machine Learning techniques, such as SVM, have been successfully applied to the task of determining these taxonomies. Hao et al. (2007) compared the classic nonhierarchical SVM classifier, and systems that go beyond traditional document categorization, with their proposed hierarchical SVM. The hierarchical SVM classification suggested by the authors shows an improvement in classification accuracy compared to the other methods studied.
Huang et al. (2007) studied marketing segmentation using SVC, noting that the various clustering algorithms used to deal with marketing segmentation problems are generally limited, since not all of them are able to determine clusters with arbitrary geometry. Huang et al. (2007) apply SVC to a dataset from a beverage company and demonstrate that the suggested method outperforms algorithms such as k-means and the self-organizing feature map (SOFM).
Unlike Huang et al. (2007), this article presents an application of the standard SVC for determining the profiles of respondents regarding the consumption and use of a mobile TV service. The study used a sample of one thousand individuals in Brazil in 2008, and compares the results with classical hierarchical clustering methods. The formalization of the classical SVC method proposed by Ben-Hur et al. (2002) is presented below.

3. Theory
The SVC is a cluster analysis algorithm that uses the SVM approach. Consider $x_i \in \mathbb{R}^{d_1}$, the data vector for the $i$th observation consisting of $d_1$ variables, for $i = 1, \ldots, N$. Using a nonlinear transformation $\Phi : \mathbb{R}^{d_1} \to \mathbb{R}^{d_2}$, with $d_1 \ll d_2$, to map $x_i$ to a multidimensional feature space (Hastie et al., 2003), the goal of SVC is to find the smallest sphere of radius $R$ containing the data. Mathematically,

$$\|\Phi(x_i) - a\|^2 \le R^2, \quad \forall i, \qquad (1)$$

where $a$ is the center of the hypersphere. The problem can be relaxed by inserting slack variables $\xi_i$:

$$\|\Phi(x_i) - a\|^2 \le R^2 + \xi_i, \quad \forall i. \qquad (2)$$

To solve this nonlinear programming problem, we can introduce the Lagrangian function:

$$L = R^2 - \sum_{i=1}^{N} \left(R^2 + \xi_i - \|\Phi(x_i) - a\|^2\right)\beta_i - \sum_{i=1}^{N} \xi_i \mu_i + C\sum_{i=1}^{N} \xi_i, \qquad (3)$$

where $\xi_i \ge 0$ and the Lagrange multipliers are $\beta_i \ge 0$ and $\mu_i \ge 0$. $C$ is a positive constant, and $C\sum_{i=1}^{N} \xi_i$ is a penalty term in the objective function whose aim is to drive the slack variables $\xi_i$ to zero. Differentiating expression (3) with respect to $R$, $a$, and $\xi_i$ and equating to zero, we obtain

$$\sum_{i=1}^{N} \beta_i = 1, \qquad (4a)$$

$$a = \sum_{i=1}^{N} \beta_i \Phi(x_i), \qquad (4b)$$

$$\beta_i = C - \mu_i. \qquad (4c)$$


The Karush-Kuhn-Tucker conditions result in

$$\xi_i \mu_i = 0, \qquad (5a)$$

$$\left(R^2 + \xi_i - \|\Phi(x_i) - a\|^2\right)\beta_i = 0. \qquad (5b)$$

Eq. (5b) tells us that the image of a sample point $x_i$ with $\xi_i > 0$ and $\beta_i > 0$ lies outside the hypersphere in the feature space. By Eq. (5a), such a point has $\mu_i = 0$ and thus, because of Eq. (4c), $\beta_i = C$. In this case, the point is called a Bounded Support Vector (BSV), since it is outside the hypersphere of radius $R$.
A sample point $x_i$ with $\xi_i = 0$ is mapped within or on the hypersphere. If, in addition, $0 < \beta_i < C$, then by Eq. (5b) its image lies on the boundary of the hypersphere in the feature space; such a point is called a Support Vector (SV). The SV-type points lie exactly on the boundary of the hypersphere, thus determining the boundary between the clusters of observations. The points characterized as BSVs are outside the clusters; the points named SVs form the boundaries of the clusters. All other points are contained within a cluster. Note that when $C \ge 1$, no points of type BSV exist, due to restriction (4a).
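In practice, once the multipliers have been computed numerically, this classification can be read directly from the values of $\beta_i$. The following minimal sketch (our own hypothetical helper, not part of the original method's code; a numerical tolerance tol is assumed because the optimized $\beta_i$ are only approximately equal to $0$ or $C$) illustrates the idea:

```python
import numpy as np

def classify_points(beta, C, tol=1e-6):
    """Classify each observation from its Lagrange multiplier beta_i.

    beta_i close to C  -> Bounded Support Vector (outside the sphere),
    0 < beta_i < C     -> Support Vector (on the sphere boundary),
    beta_i close to 0  -> point strictly inside the sphere.
    """
    beta = np.asarray(beta)
    labels = np.full(beta.shape, "inside", dtype=object)
    labels[beta > tol] = "SV"          # boundary points
    labels[beta > C - tol] = "BSV"     # beta_i = C: bounded support vectors
    return labels
```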
Using the previous relationships to eliminate $R$, $a$, and $\xi_i$, we can obtain the Wolfe dual form of the Lagrangian as a function of the parameters $\beta_i$, $i = 1, \ldots, N$ (Wolfe, 1961):

$$W = \sum_{i=1}^{N} \Phi(x_i)^2 \beta_i - \sum_{i=1}^{N}\sum_{j=1}^{N} \beta_i \beta_j\, \Phi(x_i)\cdot\Phi(x_j), \qquad (6)$$

where $\Phi(x_i)^2 = \Phi(x_i)\cdot\Phi(x_i)$. Since the multipliers $\mu_i$ no longer appear in (6), they can be replaced by the box constraints of the dual problem:

$$0 \le \beta_i \le C, \quad \text{for } i = 1, \ldots, N. \qquad (7)$$

To compute the inner product $\Phi(x_i)\cdot\Phi(x_j)$, we use the kernel trick (Schölkopf, 2001), which consists of replacing the inner product by a kernel:

$$K(x_i, x_j) = e^{-q\|x_i - x_j\|^2}, \qquad (8)$$

with bandwidth parameter $q$. Other kernel functions have been proposed in the literature (Cristianini and Shawe-Taylor, 2000), including the polynomial kernel (Tax and Duin, 1999). The Lagrangian in its dual form can then be written as

$$W = \sum_{i=1}^{N} K(x_i, x_i)\,\beta_i - \sum_{i=1}^{N}\sum_{j=1}^{N} \beta_i \beta_j K(x_i, x_j). \qquad (9)$$
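To make the optimization concrete, the sketch below builds the Gaussian kernel matrix of Eq. (8) and maximizes the dual (9) subject to (4a) and (7) with a general-purpose SciPy solver. This is only an illustrative sketch under the assumption of a modest sample size (specialized solvers are normally preferred for large $N$); the function names and the choice of the SLSQP routine are ours, not part of the original study.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.distance import cdist

def gaussian_kernel(X, q):
    """Kernel matrix K(x_i, x_j) = exp(-q ||x_i - x_j||^2) of Eq. (8)."""
    return np.exp(-q * cdist(X, X, metric="sqeuclidean"))

def solve_svc_dual(K, C=1.0):
    """Maximize W(beta) of Eq. (9) subject to sum(beta) = 1 and 0 <= beta_i <= C."""
    N = K.shape[0]
    diag = np.diag(K)  # K(x_i, x_i); equals 1 for the Gaussian kernel

    def neg_W(beta):   # minimizing -W maximizes W
        return -(diag @ beta - beta @ K @ beta)

    def neg_W_grad(beta):
        return -(diag - 2.0 * K @ beta)

    res = minimize(
        neg_W,
        x0=np.full(N, 1.0 / N),            # feasible starting point: uniform weights
        jac=neg_W_grad,
        method="SLSQP",
        bounds=[(0.0, C)] * N,             # box constraints (7)
        constraints=[{"type": "eq", "fun": lambda b: b.sum() - 1.0}],  # constraint (4a)
    )
    return res.x
```

Note that, for the Gaussian kernel, $K(x_i, x_i) = 1$, so under (4a) the first term of (9) is constant and the problem amounts to minimizing $\beta' K \beta$ over the constrained simplex.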

For each sample point $x$, the distance of its image to the center of the hypersphere is calculated as

$$R^2(x) = \|\Phi(x) - a\|^2. \qquad (10)$$

Using the kernel function and Eq. (4b), we have

$$R^2(x) = K(x, x) - 2\sum_{i=1}^{N} \beta_i K(x_i, x) + \sum_{i=1}^{N}\sum_{j=1}^{N} \beta_i \beta_j K(x_i, x_j). \qquad (11)$$

To determine which sample points belong to which clusters, a geometric approach is used. According to Ben-Hur et al. (2002), each pair of sample points is examined through the line segment that connects them: usually only a few points on this segment are checked, and if any of them lies outside the minimal hypersphere when mapped to the feature space, the two sample points are initially considered disconnected. It should be noted that, during the cluster-labeling process, such points can still end up in the same cluster by transitivity. For each pair of points $(x_i, x_j)$, the element of the adjacency matrix $A$ takes the binary value

$$A_{ij} = \begin{cases} 1, & \text{if } R(x_k) \le R \text{ for all points } x_k \text{ on the line segment connecting } x_i \text{ and } x_j,\\ 0, & \text{otherwise.} \end{cases} \qquad (12)$$

The clusters are now defined as the connected components of the graph induced by the matrix $A$. In practice, to check the points contained in the segment connecting $x_i$ and $x_j$, we take a systematic sample of a few points in this interval.
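A minimal sketch of Eqs. (11) and (12) is given below, again with our own hypothetical function names: the radius is estimated from the support vectors, each segment between a pair of points is checked at a small number of equally spaced points (here 10, an arbitrary choice), and the clusters are read off as connected components of the graph induced by $A$. The sketch favors clarity over efficiency (the kernel matrix is recomputed inside the helper).

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def radius2(points, X, beta, q):
    """Squared feature-space distance R^2(x) to the sphere center, Eq. (11)."""
    cross = np.exp(-q * np.square(points[:, None, :] - X[None, :, :]).sum(-1))
    K = np.exp(-q * np.square(X[:, None, :] - X[None, :, :]).sum(-1))
    return 1.0 - 2.0 * cross @ beta + beta @ K @ beta  # K(x, x) = 1 (Gaussian kernel)

def svc_labels(X, beta, q, C=1.0, n_checks=10, tol=1e-6):
    """Cluster labels from the adjacency matrix A of Eq. (12)."""
    N = X.shape[0]
    sv = (beta > tol) & (beta < C - tol)          # support vectors lie on the sphere
    R2 = radius2(X[sv], X, beta, q).max()         # squared radius R^2
    A = np.zeros((N, N), dtype=int)
    t = np.linspace(0.0, 1.0, n_checks)[:, None]
    for i in range(N):
        for j in range(i, N):
            segment = (1.0 - t) * X[i] + t * X[j]     # sampled points between x_i and x_j
            A[i, j] = A[j, i] = int(np.all(radius2(segment, X, beta, q) <= R2 + tol))
    n_clusters, labels = connected_components(csr_matrix(A), directed=False)
    return labels
```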
The SVC method can be summarized by the following algorithm:
1. Choose starting values for $q$ and $C$.
2. Compute the kernel matrix $K(x_i, x_j)$ for $i = 1, \ldots, N$ and $j = 1, \ldots, N$.
3. Maximize Eq. (9) with respect to the parameters $\beta = (\beta_1, \ldots, \beta_N)'$.
4. Label the clusters using the adjacency matrix $A$.


Usually $C = 1$ is used and, for the bandwidth parameter $q$, a common approach is to traverse increasing values of $q$ starting from

$$q^{*} = \frac{1}{\max_{i,j} \|x_i - x_j\|^2}, \qquad (13)$$

where $q^{*}$ represents the initial value of $q$. This value is increased until clusters that meet the requirements of the researcher are found (Ben-Hur et al., 2002), such as size, number, or a heterogeneity threshold.
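A short sketch of this initialization and of the sweep over $q$ follows; here run_svc stands for the combination of the dual-solving and cluster-labeling steps sketched above, and the geometric growth factor and the stopping rule (a target number of clusters) are arbitrary illustrative choices of ours, not prescriptions from Ben-Hur et al. (2002).

```python
import numpy as np
from scipy.spatial.distance import pdist

def initial_bandwidth(X):
    """Starting value q* = 1 / max_{i,j} ||x_i - x_j||^2 of Eq. (13)."""
    return 1.0 / pdist(X, metric="sqeuclidean").max()

def sweep_bandwidth(X, run_svc, target_clusters=5, factor=1.3, max_steps=20):
    """Increase q from q* until run_svc(X, q) yields enough clusters."""
    q = initial_bandwidth(X)
    labels = run_svc(X, q)
    for _ in range(max_steps):
        if len(np.unique(labels)) >= target_clusters:
            break
        q *= factor
        labels = run_svc(X, q)
    return q, labels
```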


4. Results
SVC was applied as a mechanism for customer segmentation to a survey conducted in Brazil in 2008 with a total of 1,000 respondents. All respondents were users of mobile phones; about 54% used a prepaid subscription and 46% a postpaid subscription. The sample consisted of 50% men and 50% women, with an age range of 18 to 83 years.
To proceed with the SVC method, the data were initially standardized so that all variables (78 in total) had zero mean and unit variance (Claeskens et al., 2008). These variables describe the importance of issues associated with the mobile TV service, such as: characteristics that possibly influence the acquisition of the service, types of mobile TV use, frequency and types of mobile phone usage, consumer characteristics as users of mobile services, and consumer characteristics according to preferences for types of TV shows.
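As a small illustration, the standardization step can be written as follows, assuming the 78 survey variables are held in a numeric array with one row per respondent (a sketch, not the original code):

```python
import numpy as np

def standardize(X):
    """Scale each survey variable to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
```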
Based on this information, we proceeded with the SVC analysis for the development of internally homogeneous groups, with the aim of segmenting customers and thus treating the heterogeneity present in the profiles of the surveyed consumers. For the SVC parameters, we set $C = 1$, since we were not interested in obtaining BSV-type points: every observation is then either contained in the hypersphere in the feature space or is a Support Vector point (Ben-Hur et al., 2002).
The kernel bandwidth was gradually increased from the value given in Eq. (13), thereby producing 20 cluster solutions, with sizes ranging from a single cluster (i.e., all observations comprising one group) to the trivial solution with 1,000 clusters (where each observation would be its own cluster). To keep the problem tractable, we chose a feasible value for the number of clusters, in this case five, an amount manageable for policy making in marketing management.
One way of evaluating the quality of the SVC method against traditional cluster analysis algorithms is to measure the within-cluster sum of squares over the groups, as defined in (14):

$$\mathrm{TSS} = \sum_{s=1}^{S} \sum_{i=1}^{N^{(s)}} \|x_i^{(s)} - \bar{x}^{(s)}\|^2, \qquad (14)$$

where $s = 1, \ldots, 5$ indexes the clusters, $N^{(s)}$ is the number of observations in the $s$th cluster, and $x_i^{(s)}$ and $\bar{x}^{(s)}$ represent, respectively, the $i$th observation of cluster $s$ and the average of the observations in the $s$th cluster.
Using the ratio of the total sum of squares (TSS) between methods, with five clusters for each method to enable comparison with the SVC, we can measure how many times more effective one method is than another in reducing the total within-cluster variability for the dataset studied (Everitt et al., 2009).
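A minimal sketch of Eq. (14) and of the ratios reported in Table 2 is given below, assuming the cluster labels produced by each method are available as integer arrays keyed by the method name (function names are ours):

```python
import numpy as np

def total_sum_of_squares(X, labels):
    """Within-cluster sum of squares of Eq. (14)."""
    return sum(
        np.square(X[labels == s] - X[labels == s].mean(axis=0)).sum()
        for s in np.unique(labels)
    )

def tss_ratios(X, labelings):
    """Pairwise TSS ratios between methods, as in Table 2 (row over column)."""
    tss = {name: total_sum_of_squares(X, lab) for name, lab in labelings.items()}
    return {(r, c): tss[r] / tss[c] for r in tss for c in tss}
```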
The classical methods Ward, Single Linkage, Complete Linkage, Average, McQuitty, Median, and Centroid were constructed using a hierarchical process. In other words, the algorithm employed for these methods can be classified as a combinatorial algorithm with a hierarchical cluster-formation structure. The following steps describe the hierarchical algorithm:
1. We start by assuming $N$ clusters, each containing just one element.
2. The two closest clusters are merged, and the proximity between the resulting set of $(N - 1)$ clusters is evaluated again.
3. The closest two clusters are again merged and the procedure continues; it can go on until we are left with a single cluster consisting of all items.

Table 1
Parameters for the hierarchical cluster methods

Method     α1                        α2                        β                           γ
Ward       (N(r)+N(s))/(N(r)+N(q))   (N(r)+N(t))/(N(r)+N(q))   -N(r)/(N(r)+N(q))           0
Single     1/2                       1/2                       0                           -1/2
Complete   1/2                       1/2                       0                           1/2
Average    N(s)/(N(s)+N(t))          N(t)/(N(s)+N(t))          0                           0
McQuitty   1/2                       1/2                       0                           0
Median     1/2                       1/2                       -1/4                        0
Centroid   N(s)/(N(s)+N(t))          N(t)/(N(s)+N(t))          -N(s)N(t)/[N(s)+N(t)]^2     0

In practice, if we want only $S$ clusters, then we must stop when $S$ clusters have been obtained. Some kind of dissimilarity measure is used to evaluate the proximity between the clusters and the units. According to Lance and Williams (1967), it is possible to consider a general formula for the distances between clusters. Since new clusters are formed by combining the two most similar clusters, i.e., by their union, Lance and Williams (1967) defined the distance between $B_r$ and $B_q = B_s \cup B_t$ as

$$h(B_r, B_q) = \alpha_1 h(B_r, B_s) + \alpha_2 h(B_r, B_t) + \beta\, h(B_s, B_t) + \gamma\, |h(B_r, B_s) - h(B_r, B_t)|, \qquad (15)$$

where $B_r$ and $B_q = B_s \cup B_t$ are clusters created by the hierarchical algorithm. For the classical methods, the coefficients $\alpha_1$, $\alpha_2$, $\beta$, and $\gamma$ are given in Table 1.
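For the comparison, the classical five-cluster solutions can be obtained, for example, with SciPy's agglomerative routines, which implement the Lance and Williams (1967) updates of Eq. (15); in SciPy's naming, the McQuitty method corresponds to the "weighted" linkage. This is only a sketch of how such a comparison could be coded, not the study's original implementation:

```python
from scipy.cluster.hierarchy import linkage, fcluster

# Classical hierarchical methods compared with SVC (see Table 1);
# 'weighted' is SciPy's name for the McQuitty method.
METHODS = ["ward", "single", "complete", "average", "weighted", "median", "centroid"]

def hierarchical_labels(X, n_clusters=5):
    """Cluster labels (1..n_clusters) for each classical hierarchical method."""
    labels = {}
    for method in METHODS:
        Z = linkage(X, method=method)                      # Lance-Williams agglomeration
        labels[method] = fcluster(Z, t=n_clusters, criterion="maxclust")
    return labels
```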
Table 1 can thus be used to construct clusters using the hierarchical approach. Furthermore, according to Table 2, although the SVC method does not strictly dominate all the other methods, it outperforms most of the classical hierarchical cluster analysis methods; the Ward method, for example, shows roughly 1.07 times the within-cluster variability of SVC (SVC/Ward ratio of 0.9358 in Table 2). Where SVC is outperformed, the difference is no more than 0.01% compared with Single Linkage and about 0.1% compared with Centroid; moreover, classical methods such as Single Linkage and Centroid are limited to hyperellipsoids, which constrains the possibilities of agglomeration (Jain and Dubes, 1988).
Table 2
Ratio of the total sum of squares between the different cluster methods

Method     SVC    Ward     Single   Complete  Average  McQuitty  Median   Centroid
SVC        1      0.9358   1.0001   0.9703    0.9891   0.9665    0.9977   1.001
Ward              1        1.0686   1.0367    1.0569   1.0327    1.0661   1.0705
Single                     1        0.9702    0.9890   0.9664    0.9976   1.001
Complete                            1         1.019    0.9961    1.028    1.0325
Average                                       1        0.9771    1.008    1.012
McQuitty                                               1         1.0322   1.0365
Median                                                           1        1.004
Centroid                                                                  1

Table 3
Types of clusters

Cluster 1
  Feature: Consider that the characteristics associated with the mobile TV service are important and are shown to be particular about quality. They watch TV on the mobile.
  Service: Watch all programs at any time of the day.
  Mobile: Use the mobile more to listen to songs than to make calls or do any other activity with the device.
  Client: Practice sports and consider that the device should not have an antiquated appearance.

Cluster 2
  Feature: See the variety of products and services as an essential criterion for acquiring mobile phones and using the mobile TV service.
  Service: Use the service more at night and give more importance to the news.
  Mobile: Listen more to music and ask more for information than make calls.
  Client: Are video game fans.

Cluster 3
  Feature: Consider that the characteristics of mobile TV do not influence their purchasing decisions.
  Service: Watch movies and sports on the mobile.
  Mobile: Prioritize innovation.
  Client: Consider that the mobile should have the same functionality as a personal computer and declare that it contributes to the maintenance of their social network.

Cluster 4
  Feature: Consider the lightness of the device, its size, and battery longevity as important criteria.
  Service: Watch little TV, but when they do, consider news and shows most important.
  Mobile: Consider the possibility of making international calls essential.
  Client: Appreciate change.

Cluster 5
  Feature: Consider almost exclusively the quality of service as a criterion for purchasing.
  Service: Watch soap operas.
  Mobile: Use features such as messages, photos, and MP3s on the device more than making calls.
  Client: Do not have a very active social life and prefer to stay at home.


Thus, we proceeded with the SVC method for the customer segmentation, since we want a method that minimizes the within-cluster variability and allows clusters with arbitrary geometry.
The method is therefore robust in generating clusters with low internal variability compared to traditional cluster analysis methods. The clusters obtained by the SVC method can be characterized on the basis of the available variables, as shown in Table 3.


5. Conclusions
When compared with traditional methods of hierarchical cluster analysis, the SVC method performed well for customer segmentation in marketing, corroborating the results obtained by Huang et al. (2007).
Among the advantages of the SVC method over other approaches is the fact that it is able to form clusters with arbitrary geometry, thereby allowing a better partition of the data into groups with maximum internal homogeneity. However, the SVC method is sensitive to the choice of the parameters $q$ and $C$, and different choices of these values can produce quite different results; in addition, labeling the clusters based on the adjacency matrix $A$ is a computationally time-consuming process, due to the size of the matrix and the complexity of the generated graph.
Regarding the survey results, it was possible to segment the potential users of the mobile TV service into groups with little similarity between them and maximum homogeneity within each group of customers.
This kind of result can assist the manager in decision making and in formulating marketing policies for each of the five clusters built. Advertisements, devices, services, and even payment plans may be developed in order to reach segments of the population who require a differentiated service, maximizing the potential loyalty of these users.
This article does not claim to exhaust the subject of customer segmentation in the context of Expert Systems and Machine Learning methods, specifically via SVC. This methodology is still little explored in the operations research and marketing literature; the article thus provides a direction for researchers interested in modeling customer profiles and patterns, while promoting the integration and dissemination of this type of knowledge.

References
Baesens, B., Viaene, S., den Poel, D. V., Vanthienen, J., Dedene, G. (2002). Bayesian neural network learning for repeat purchase modelling in direct marketing. European Journal of Operational Research 138: 191-211.
Ben-Hur, A., Horn, D., Siegelmann, H. T., Vapnik, V. (2002). Support vector clustering. Journal of Machine Learning Research 2: 125-137.
Berger, P. D., Nasr, N. I. (1998). Customer lifetime value: Marketing models and applications. Journal of Interactive Marketing 12: 17-30.
Breiman, L., Friedman, J., Olshen, R., Stone, C. (1984). Classification and Regression Trees. Monterey, CA: Wadsworth and Brooks.
Cheung, K.-W., Kwok, J. T., Law, M. H., Tsui, K.-C. (2003). Mining customer product ratings for personalized marketing. Decision Support Systems 35: 231-243.
Chiang, J.-H., Hao, P.-Y. (2003). A new kernel-based fuzzy clustering approach: Support vector clustering with cell growing. IEEE Transactions on Fuzzy Systems 11: 518-527.
Claeskens, G., Croux, C., Van Kerckhoven, J. (2008). An information criterion for variable selection in support vector machines. Journal of Machine Learning Research 9: 541-558.
Cristianini, N., Shawe-Taylor, J. (2000). An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. 1st ed. Cambridge: Cambridge University Press.
Cui, D., Curry, D. (2005). Prediction in marketing using the support vector machine. Marketing Science 24: 595-615.
Dibb, S. (1998). Market segmentation: Strategies for success. Marketing Intelligence & Planning 16: 394-406.
Everitt, B. S., Landau, S., Leese, M. (2009). Cluster Analysis. 4th ed. New York: Wiley.
Florez-Lopez, R., Ramon-Jeronimo, J. M. (2008). Marketing segmentation through machine learning models: An approach based on customer relationship management and customer profitability accounting. Social Science Computer Review 27: 96-117.
Gath, I., Geva, A. (1989). Unsupervised optimal fuzzy clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence 11: 773-780.
Green, P. E. (1977). A new approach to market segmentation. Business Horizons 20: 61-73.
Hao, P.-Y., Chiang, J.-H., Tu, Y.-K. (2007). Hierarchically SVM classification based on support vector clustering method and its application to document categorization. Expert Systems with Applications 33: 627-635.
Hastie, T., Tibshirani, R., Friedman, J. H. (2003). The Elements of Statistical Learning. New York: Springer.
Hruschka, H., Natter, M. (1999). Comparing performance of feedforward neural nets and k-means for cluster-based market segmentation. European Journal of Operational Research 114: 346-353.
Huang, J.-J., Tzeng, G.-H., Ong, C.-S. (2007). Marketing segmentation using support vector clustering. Expert Systems with Applications 32: 313-317.
Huang, W., Nakamori, Y., Wang, S.-Y. (2005). Forecasting stock market movement direction with support vector machine. Computers & Operations Research 32: 2513-2522.
Huang, Z., Chen, H., Hsu, C.-J., Chen, W.-H., Wu, S. (2004). Credit rating analysis with support vector machines and neural networks: A market comparative study. Decision Support Systems 37: 543-558.
Jain, A. K., Dubes, R. C. (1988). Algorithms for Clustering Data. Upper Saddle River, NJ: Prentice-Hall.
Kaufmann, M. (1990). Readings in Machine Learning (The Morgan Kaufmann Series in Machine Learning). San Mateo, CA: Morgan Kaufmann.
Kim, Y., Street, W. N., Russell, G. J., Menczer, F. (2005). Customer targeting: A neural network approach guided by genetic algorithms. Management Science 51: 264-276.
Lance, G. N., Williams, W. T. (1967). A general theory of classificatory sorting strategies 1. Hierarchical systems. The Computer Journal 9: 373-380.
Li, S., Davies, B., Edwards, J., Kinman, R., Duan, Y. (2002). Integrating group Delphi, fuzzy logic and expert systems for marketing strategy development: The hybridisation and its effectiveness. Marketing Intelligence & Planning 20: 273-284.
Osuna, E., Freund, R., Girosi, F. (1997). Training support vector machines: An application to face detection. Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition 6: 130-136.
Peppard, J. (2000). Customer relationship management (CRM) in financial services. European Management Journal 18: 312-327.
Rao, V. R., Steckel, J. H. (1995). Selecting, evaluating and updating prospects in direct mail marketing. Journal of Direct Marketing 9(2): 20-31.
Reinartz, W. J., Kumar, V. (2000). On the profitability of long-life customers in a noncontractual setting: An empirical investigation and implications for marketing. Journal of Marketing 64: 17-35.
Schmittlein, D. C., Peterson, R. A. (1994). Customer base analysis: An industrial purchase process application. Marketing Science 13: 41-67.
Schölkopf, B. (2001). The kernel trick for distances. Advances in Neural Information Processing Systems 13, Proceedings of the 2000 Conference: 301-307.
Tax, D. M. J., Duin, R. P. W. (1999). Support vector domain description. Pattern Recognition Letters 20: 1191-1199.
Venugopal, V., Baets, W. (1994). Neural networks and statistical techniques in marketing research: A conceptual comparison. Marketing Intelligence & Planning 12: 30-38.
Wind, Y. (1978). Issues and advances in segmentation research. Journal of Marketing Research 15: 317-337.
Wolfe, P. (1961). A duality theorem for nonlinear programming. Quarterly of Applied Mathematics 19: 239-244.
Wray, B., Palmer, A., Bejou, D. (1994). Using neural network analysis to evaluate buyer-seller relationships. European Journal of Marketing 28: 32-48.
Xu, B., Zhang, A. (2005). Application of support vector clustering algorithm to network intrusion detection. In: International Conference on Neural Networks and Brain, 2005 (ICNN&B '05), Vol. 2, pp. 1036-1040. IEEE.
Yager, R. (2000). Targeted e-commerce marketing using fuzzy intelligent agents. IEEE Intelligent Systems and their Applications 15: 42-45.
Zhao, Y., Li, B., Li, X., Liu, W., Ren, S. (2005). Customer churn prediction using improved one-class support vector machine. In: Li, X., Wang, S., Dong, Z., eds. Advanced Data Mining and Applications (Lecture Notes in Computer Science), Vol. 3584. Berlin/Heidelberg: Springer, pp. 731-731.
