Series Editors:
J. Breuker, N. Guarino, J.N. Kok, J. Liu, R. López de Mántaras,
R. Mizoguchi, M. Musen, S.K. Pal and N. Zhong
Volume 293
Recently published in this series
Vol. 292. H. Jaakkola, B. Thalheim, Y. Kiyoki and N. Yoshida (Eds.), Information Modelling
and Knowledge Bases XXVIII
Vol. 291. G. Arnicans, V. Arnicane, J. Borzovs and L. Niedrite (Eds.), Databases and
Information Systems IX – Selected Papers from the Twelfth International Baltic
Conference, DB&IS 2016
Vol. 290. J. Seibt, M. Nørskov and S. Schack Andersen (Eds.), What Social Robots Can and
Should Do – Proceedings of Robophilosophy 2016 / TRANSOR 2016
Vol. 289. I. Skadiņa and R. Rozis (Eds.), Human Language Technologies – The Baltic
Perspective – Proceedings of the Seventh International Conference Baltic HLT 2016
Vol. 288. À. Nebot, X. Binefa and R. López de Mántaras (Eds.), Artificial Intelligence Research
and Development – Proceedings of the 19th International Conference of the Catalan
Association for Artificial Intelligence, Barcelona, Catalonia, Spain, October 19–21,
2016
Vol. 287. P. Baroni, T.F. Gordon, T. Scheffler and M. Stede (Eds.), Computational Models of
Argument – Proceedings of COMMA 2016
Vol. 286. H. Fujita and G.A. Papadopoulos (Eds.), New Trends in Software Methodologies,
Tools and Techniques – Proceedings of the Fifteenth SoMeT_16
Vol. 285. G.A. Kaminka, M. Fox, P. Bouquet, E. Hüllermeier, V. Dignum, F. Dignum and
F. van Harmelen (Eds.), ECAI 2016 – 22nd European Conference on Artificial
Intelligence, 29 August–2 September 2016, The Hague, The Netherlands – Including
Prestigious Applications of Artificial Intelligence (PAIS 2016)
Edited by
Shilei Sun
International School of Software, Wuhan University, China
Antonio J. Tallón-Ballesteros
Department of Languages and Computer Systems, University of Seville, Spain
Dragan S. Pamučar
Department of Logistics, University of Defence in Belgrade, Serbia
and
Feng Liu
International School of Software, Wuhan University, China
All rights reserved. No part of this book may be reproduced, stored in a retrieval system,
or transmitted, in any form or by any means, without prior written permission from the publisher.
Publisher
IOS Press BV
Nieuwe Hemweg 6B
1013 BG Amsterdam
Netherlands
fax: +31 20 687 0019
e-mail: order@iospress.nl
LEGAL NOTICE
The publisher is not responsible for the use which might be made of the following information.
Preface
Fuzzy Systems and Data Mining (FSDM) is an annual international conference devoted
to four main groups of topics: a) fuzzy theory, algorithms and systems; b) fuzzy
applications; c) the interdisciplinary field of fuzzy logic and data mining; and d) data mining.
Following the great success of FSDM 2015, held in Shanghai, the second edition in the
FSDM series was held in Macau, China, where experts, researchers, academics and
participants from industry were introduced to the latest advances in the field of
Fuzzy Sets and Data Mining. Macau was declared a UNESCO World Heritage Site in
2005 by virtue of its cultural importance. The historic centre of Macau is of particular
interest because of its mixture of traditional Chinese and Portuguese cultures. Macau
has both Cantonese (a variant of Chinese) and Portuguese as official languages.
This volume contains the papers accepted and presented at the 2nd International
Conference on Fuzzy Systems and Data Mining (FSDM 2016), held on 11–14 Decem-
ber 2016 in Macau, China. All papers have been carefully reviewed by programme
committee members and reflect the breadth and depth of the research topics which fall
within the scope of FSDM. From several hundred submissions, 81 of the most promis-
ing and FAIA mainstream-relevant contributions have been selected for inclusion in
this volume; they present original ideas, methods or results of general significance
supported by clear reasoning and compelling evidence.
FSDM 2016 was also a reference conference, and the conference programme in-
cluded keynote and invited presentations, oral and poster contributions. The event pro-
vided a forum where more than 100 qualified and high-level researchers and experts
from over 20 countries, including 4 keynote speakers, gathered to create an important
platform for researchers and engineers worldwide to engage in academic communica-
tion.
We would like to thank all the keynote and invited speakers and authors for the effort
they have put into preparing their contributions to the conference. We would also like
to take this opportunity to express our gratitude to those people, especially the programme
committee members and reviewers, who devoted their time to assessing the papers. It is
an honour to continue with the publication of these proceedings in the prestigious series
Frontiers in Artificial Intelligence and Applications (FAIA) from IOS Press. Our par-
ticular thanks also go to J. Breuker, N. Guarino, J.N. Kok, R. López de Mántaras, J. Liu,
R. Mizoguchi, M. Musen, S.K. Pal and N. Zhong, the FAIA series editors, for support-
ing this conference.
Last but not least, I hope that all our participants have enjoyed their stay in Macau
and their time at the Macau University of Science and Technology (M.U.S.T.). We
hope you had a magnificent experience in both places.
Antonio J. Tallón-Ballesteros
University of Seville, Spain
Contents
Preface v
Antonio J. Tallón-Ballesteros
Data Mining
Introduction
Corresponding Author: Sanjay KUMAR, Department of Mathematics, Statistics & Computer Science,
G. B. Pant University of Agriculture & Technology, Pantnagar-263145, Uttarakhand, India; E-mail:
skruhela@hotmail.com.
4 S.S. Gangwar and S. Kumar / Cumulative Probability Distribution
Wong et al. [8] utilized the window size of FTS to propose a time-variant forecasting
model. The performance of this model was tested using the enrollment time series of the
University of Alabama and TAIEX data. Chi et al. [9] used the K-means clustering technique
to discretize the universe of discourse and proposed an enhanced fuzzy time series
forecasting model. Chen and Tanuwijaya [10] presented new methods to handle
forecasting problems using high-order fuzzy logical relationships and automatic
clustering techniques.
Cheng et al. [11] discretized the universe of discourse (UD) using the minimum entropy
principle and used trapezoidal membership functions to enhance accuracy in FTS
forecasting. Huarng and Yu [12] used a ratio-based method to identify the length of
intervals in fuzzy time series forecasting, which was further enhanced by Yolcu et al.
[13] using a single-variable constrained optimization technique. Teoh et al. [14] used the
cumulative probability distribution approach (CPDA) with rough set rule induction and
proposed a hybrid FTS model. Su et al. [15] used MEPA, CPDA and a rough
set algorithm to develop a new model for FTS forecasting.
The fuzzy relational equation and a suitable defuzzification process are the pivotal
components of any fuzzy time series forecasting method. To minimize the time spent
generating fuzzy relational equations using the complex min–max composition operation,
and to eliminate the search for a suitable defuzzification process, Singh [16, 17, 18]
proposed various computational methods using difference parameters as the fuzzy relation
for FTS forecasting. Joshi and Kumar [19] also presented a computational method
using the third-order difference as the fuzzy relation. To enhance the performance of
computational FTS forecasting methods, Gangwar and Kumar [20] developed a
computational algorithm using high-order difference parameters and implemented it in a
discretized universe of discourse. Intuitionistic fuzzy sets (IFS) were used with CPDA
by Gangwar and Kumar [21] to introduce hesitation into FTS forecasting with unequal
intervals.
The UD in all these computational methods was partitioned into intervals of equal length.
In some cases, discretizing the universe of discourse into equal-length intervals
may not give a correct classification of the time series data. The motivation and intention of
this study is to present a computational method using high-order difference parameters
as the fuzzy relation, with a discretized UD in which the lengths of the intervals are optimized
using CPDA. The proposed algorithm eliminates the time spent constructing relational
equations through tedious min–max composition operations and removes the need for a
defuzzification process. The developed FTS forecasting method has been applied to the
benchmark problem of forecasting the student enrollment data of the University of Alabama
and compared with other recent methods proposed by various researchers.
Let $U = \{u_1, u_2, u_3, \ldots, u_n\}$ be the UD. A fuzzy set $\tilde{A}_i$ of $U$ is defined as follows:

$$\tilde{A}_i = \mu_{\tilde{A}_i}(u_1)/u_1 + \mu_{\tilde{A}_i}(u_2)/u_2 + \mu_{\tilde{A}_i}(u_3)/u_3 + \cdots + \mu_{\tilde{A}_i}(u_n)/u_n$$

Here $\mu_{\tilde{A}_i}$ is the membership function of the fuzzy set $\tilde{A}_i$ and assigns to each element
of $U$ a value in $[0, 1]$; $\mu_{\tilde{A}_i}(u_k)$ $(1 \le k \le n)$ is the grade of membership of $u_k$ in $\tilde{A}_i$. Suppose
fuzzy sets $f_i(t)$ $(i = 1, 2, \ldots)$ are defined on the universe of discourse $Y(t)$. If $F(t)$ is
the collection of the $f_i(t)$, then $F(t)$ is known as a fuzzy time series on $Y(t)$ [1]. $F(t)$ and $Y(t)$
depend upon $t$ and hence both are functions of time. If only $F(t-1)$ causes $F(t)$, i.e.
$F(t-1) \to F(t)$, then the relationship is denoted by the fuzzy relational equation
$F(t) = F(t-1) \circ R(t, t-1)$ and is called the first-order model of $F(t)$, where "$\circ$" is the
max–min composition operator. If more than one fuzzy set $F(t-n), F(t-n+1), \ldots, F(t-1)$
causes $F(t)$, then the relationship is called an $n$th-order fuzzy time series model [1, 2].
The proposed FTS method uses CPDA to discretize the UD and the ratio formula [20] to
determine the number of partitions. The order of the difference parameter used in each
forecast is computed as follows:

- For the year 1973 enrollment forecast, the proposed computational method uses the second-order difference parameter $D_2 = |E_2 - E_1|$.
- For the year 1974 enrollment forecast, the proposed computational method uses the third-order difference parameter $D_3 = |E_3 - E_2| - |E_2 - E_1|$.
- For the year 1975 enrollment forecast, the proposed computational method uses the fourth-order difference parameter $D_4 = |E_4 - E_3| - |E_3 - E_2| - |E_2 - E_1|$.

The $i$th-order difference parameter is defined as follows:

$$D_i = |E_i - E_{i-1}| - \Bigl[\sum_{c=1}^{i-2} |E_{i-c} - E_{i-(c+1)}|\Bigr], \qquad 2 \le i \le N \tag{1}$$
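As an illustrative sketch of Eq. (1) (assuming observations indexed $E_1, \ldots, E_N$ and the chained absolute first differences shown in the examples above), the $i$th-order difference parameter can be computed as:

```python
def difference_parameter(E, i):
    """i-th order difference parameter D_i of Eq. (1); E is 1-based
    conceptually, so E[1] is the first observation (E[0] is unused)."""
    d = abs(E[i] - E[i - 1])
    for c in range(1, i - 1):          # c = 1 .. i-2
        d -= abs(E[i - c] - E[i - (c + 1)])
    return d
```

For example, with observations 10, 13, 11, 16 this reproduces the pattern of $D_2$, $D_3$ and $D_4$ given above.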
$$P_{LB} = \begin{cases} 0, & i = 1 \\[4pt] \dfrac{2i-3}{2n}, & 2 \le i \le n \end{cases} \tag{2}$$

$$P_{UB} = 1, \qquad i = n \tag{3}$$

and

$$P_F(x \mid c, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} \exp\!\Bigl(-\frac{(t-c)^2}{2\sigma^2}\Bigr)\, dt \tag{5}$$
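The CPDA interval construction of Eqs. (2)–(5) can be sketched with the standard-library normal distribution; placing the cut points at the $P_{LB}$ bounds of Eq. (2) is an assumption of this sketch:

```python
from statistics import NormalDist, mean, stdev

def cpda_intervals(data, n):
    # Sub-intervals of the UD whose lengths follow the cumulative
    # probability distribution of a normal fit to the data (cf. Eqs. (2)-(5)).
    nd = NormalDist(mean(data), stdev(data))
    # cut points at the cumulative probabilities (2i-3)/(2n), i = 2..n
    cuts = [nd.inv_cdf((2 * i - 3) / (2 * n)) for i in range(2, n + 1)]
    lo, hi = min(data), max(data)
    edges = [lo] + [x for x in cuts if lo < x < hi] + [hi]
    return list(zip(edges[:-1], edges[1:]))
```

The resulting intervals are narrower where the fitted density is high, which is the intended effect of the CPDA partitioning.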
Step 4 Construct the triangular fuzzy sets Ãi in accordance with the intervals
constructed in Step 3.
Step 5 Fuzzify the observations of the time series by choosing the maximum membership
grade and set up the fuzzy logical relationships.
Step 6 Use the ratio formula [20] to repartition the time series into different partitions.
Step 7 Apply the following computational algorithm.
For a fuzzy logical relation $\tilde{A}_i \to \tilde{A}_j$, $\tilde{A}_i$ and $\tilde{A}_j$ are the fuzzified enrollments of the
current and next year, and $E_i$ and $F_j$ are the actual enrollment of the current year and the crisp
forecasted enrollment of the next year.
Computational algorithm: Forecasted enrollments of the University of Alabama are
computed using the following computational algorithm, whose complexity is of linear order.
The algorithm uses the difference parameters ($D_i$) of various orders and the lower and upper
bounds of the intervals. For a fuzzy logical relation $\tilde{A}_i \to \tilde{A}_j$, it uses the midpoints of the
intervals $u_i$ and $u_j$ having supremum value in $\tilde{A}_i$ and $\tilde{A}_j$. The algorithm starts by forecasting the
enrollment for the year 1973 in partition 1, 1981 in partition 2 and 1988 in partition 3 using
the second-order difference parameter. In the following algorithm, [*Ãj] is the
interval $u_j$ for which the membership in $\tilde{A}_j$ is supremum (i.e. 1), L[*Ãj] and U[*Ãj] are the
lower and upper bounds of the interval $u_j$ respectively, and l[*Ãj] and M[*Ãj] are the length and
midpoint of the interval $u_j$ whose membership in $\tilde{A}_j$ is supremum (i.e. 1).
Compute the difference parameter Di according to Eq. (1).
For a = 2, 3, ..., i
    Fia = M[*Ãi] + Di/(2(a−1))
    FFia = M[*Ãi] − Di/(2(a−1))
    If Fia ≥ L[*Ãj] and Fia ≤ U[*Ãj]
        Then P = P + Fia and Q = Q + 1
    If Fia ≥ M[*Ãj]
        Then P = P + l[*Ãj]/(2(i−1)·(2(a−1))²)
        Else P = P − l[*Ãj]/(2(i−1)·(2(a−1))²)
    If FFia ≥ L[*Ãj] and FFia ≤ U[*Ãj]
        Then P = P + FFia and Q = Q + 1
    If FFia ≥ M[*Ãj]
        Then P = P + l[*Ãj]/(2(i−1)·(2(a−1))²)
        Else P = P − l[*Ãj]/(2(i−1)·(2(a−1))²)
Next a
Fj = (P + M[*Ãj])/(Q + 1)
Next i
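A minimal sketch of one forecast of this algorithm follows; interpreting "2(a−1)" and "2(i−1)" as plain products is an assumption of the sketch, and the interval data (bounds, midpoints, length) are passed in as plain numbers:

```python
def forecast_step(D_i, i, mid_i, L_j, U_j, mid_j, len_j):
    # One forecast of the computational algorithm: accumulate admissible
    # candidates in P (counted by Q), then average with the midpoint M[*Aj].
    P, Q = 0.0, 0
    for a in range(2, i + 1):
        for cand in (mid_i + D_i / (2 * (a - 1)),   # Fia
                     mid_i - D_i / (2 * (a - 1))):  # FFia
            if L_j <= cand <= U_j:
                P += cand
                Q += 1
            adj = len_j / (2 * (i - 1) * (2 * (a - 1)) ** 2)
            P += adj if cand >= mid_j else -adj
    return (P + mid_j) / (Q + 1)
```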
We use the root mean square error (RMSE) and the average forecasting error (AFE) to
compare the results of the different forecasting methods. The coefficients of
correlation and determination are used to measure the strength of the relationship between
the actual and forecasted enrollments of the University of Alabama.
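These two error measures can be sketched as follows; the AFE definition assumed here is the mean absolute error relative to the actual value, expressed as a percentage:

```python
import math

def rmse(actual, forecast):
    # root mean square error
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast))
                     / len(actual))

def afe(actual, forecast):
    # average forecasting error, in percent (assumed definition:
    # mean of |actual - forecast| / actual)
    return 100 * sum(abs(a - f) / a
                     for a, f in zip(actual, forecast)) / len(actual)
```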
3. Experimental Study
5. Conclusions
linear order with the partition mechanism of the UD, and thus forecasting time series data
with a large number of observations may not be a matter of concern; (ii) it uses CPDA to
determine the lengths of the intervals used in forecasting; (iii) it reduces the intricate
computations of fuzzy relational matrices and eliminates the need for a defuzzification
method.
Even though the fusion of CPDA with the computational approach in a partitioned
environment enhances the accuracy of the forecasted output, the proposed method has the
following limitations.
1. It cannot be applied to time series data that do not follow a normal
distribution.
2. Time series data are partitioned using the ratio
$\rho = (E_{\max} + E_{\min})/2(E_{\max} - E_{\min})$. If $0 < \rho \le 1$ then there will be no partitioning.
In this case, the difference parameters increase heavily, making the computation
very complex.
3. If $\rho \ge N/2$ ($N$ = number of observations in the time series data), there will not be
enough observations in the partitions for subsequent forecasting.
However, some preprocessing techniques could be explored to make time series data
approximately normally distributed, addressing the limitation regarding non-normally
distributed time series data. There is also scope to explore the proposed method with the
well-known k-means or other exclusive clustering techniques for partitioning the time
series data, rather than using the ratio formula.
References
[1] Q. Song, B. S. Chissom, Fuzzy time series and its models, Fuzzy Sets and Systems, 54(1993), 269-277.
[2] Q. Song, B. S. Chissom, Forecasting enrollments with fuzzy time series - Part I, Fuzzy Sets and Systems,
54(1993), 1-9.
[3] Q. Song, B. S. Chissom, Forecasting enrollments with fuzzy time series - Part II, Fuzzy Sets and Systems,
64(1994), 1-8.
[4] L. A. Zadeh, Fuzzy set, Information and Control, 8(1965), 338-353.
[5] S. M. Chen, Forecasting enrollments based on fuzzy time series, Fuzzy Sets and Systems, 81(1996), 311-
319.
[6] J. R. Hwang, S. M. Chen, C. H. Lee, Handling forecasting problems using fuzzy time series, Fuzzy Sets
and Systems, 100(1998), 217-228.
[7] C. M. Own, P. T. Yu, Forecasting fuzzy time series on a heuristic high-order model, Cybernetics and
Systems: An International Journal, 36(2005), 705-717.
[8] W. K. Wong, E. Bai, A. W. C. Chu, Adaptive time variant models for fuzzy time series forecasting. IEEE
Transaction on Systems, Man and Cybernetics-Part B: Cybernetics, 40(2010), 1531-1542.
[9] K. Chi, F. P. Fu and W. G. Chen, A novel forecasting model of fuzzy time series based on K-means
clustering, IWETCS, IEEE, 2010, 223–225.
[10] S. M. Chen, K. Tanuwijaya, Fuzzy forecasting based on high-order fuzzy logical relationships and
automatic clustering techniques, Expert Systems with Applications, 38(2011), 15425-15437.
[11] C. H. Cheng, R. J. Chang, C. A. Yeh, Entropy-based and trapezoid fuzzification based fuzzy time series
approach for forecasting IT project cost, Technological Forecasting and Social Change, 73(2006), 524-
542.
[12] K. Huarng, T. H. K. Yu, Ratio-based lengths of intervals to improve fuzzy time series forecasting,
IEEE Transactions on Systems, Man and Cybernetics-Part B: Cybernetics, 36(2006), 328-340.
[13] U. Yolcu, E. Egrioglu, V. R. Uslu, M. A. Basaran, C. H. Aladag, A new approach for determining the
length of intervals for fuzzy time series, Applied Soft Computing, 9(2009), 647-651.
[14] H. J. Teoh, C. H. Cheng, H. H. Chu, J. S. Chen, Fuzzy time series model based on probabilistic
approach and rough set rule induction for empirical research in stock markets, Data & Knowledge
Engineering, 67(2008), 103-117.
[15] C. H. Su, T. L. Chen, C. H. Cheng, Y. C. Chen, Forecasting the stock market with linguistic rules
generated from the minimize entropy principle and the cumulative probability distribution
approaches, Entropy, 12(2010), 2397-2417.
[16] S. R. Singh, A robust method of forecasting based on fuzzy time series, Applied Mathematics and
Computation, 188(2007), 472-484.
[17] S. R. Singh, A simple time variant method for fuzzy time series forecasting, Cybernetics and Systems:
An International Journal, 38(2007), 305-321.
[18] S. R. Singh, A computational method of forecasting based on fuzzy time series, Mathematics and
Computers in Simulation, 79(2008), 539-554.
[19] B. P. Joshi, S. Kumar, A Computational method for fuzzy time series forecasting based on difference
parameters, International Journal of Modeling, Simulation and Scientific Computing, 4(2013),
1250023-1250035.
[20] S. S. Gangwar, S. Kumar, Partitions based computational method for high-order fuzzy time series
forecasting, Expert Systems with Applications, 39(2012), 12158-12164.
[21] S. S. Gangwar, S. Kumar, Probabilistic and intuitionistic fuzzy sets based method for fuzzy time series
forecasting, Cybernetics and Systems, 45(2014), 349-361.
[22] G. E. Dallal, L. Wilkinson, An Analytic Approximation to the Distribution of Lilliefors’s Test for
Normality, The American Statistician, 40(1986), 294-296.
[23] H. T. Liu, An improved fuzzy time series forecasting method using trapezoidal fuzzy numbers, Fuzzy
Optimization and Decision Making, 6(2007), 63-80.
[24] C. H. Cheng, J. W. Wang, G. W. Cheng, Multi-attribute fuzzy time series method based on fuzzy
clustering, Expert Systems with Applications, 34(2008), 1235-1242.
[25] E. Egrioglu, A new time-invariant fuzzy time series forecasting method based on genetic algorithm,
Advances in Fuzzy Systems, 2012, 2.
[26] G. Chen, H. W. Qu, A new forecasting method of fuzzy time series model, Control and Decision,
28(2013), 105-109.
Fuzzy Systems and Data Mining II 11
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-11
Introduction
Corresponding Author: Felix A. C. MORA-CAMINO, ENAC, Toulouse University, 7 avenue
Edouard Belin, 31055 Toulouse, France; E-mail: felix.mora@enac.fr.
12 C.A.N. Cosenza et al. / Introduction to Fuzzy Dual Mathematical Programming
fuzzy numbers, adopting some elements of classical dual number calculus [7] and [8].
Indeed, the proposed special class of numbers, fuzzy dual numbers, integrates the
nilpotent operator $\varepsilon$ of dual numbers theory while considering symmetrical fuzzy
numbers. Uncertain values are then characterized by only three parameters: a mean
value, an uncertainty interval and a shape parameter.
In this communication, we first introduce the elements of fuzzy dual calculus
useful for tackling the proposed issue: the basic operations as well as the strong and
weak fuzzy dual partial orders and fuzzy dual equality. Then two classes of fuzzy dual
mathematical programming problems are considered: those where uncertainty lies
only in the parameters of the problem, and those for which the implementation of the
solution is subject to uncertainty. In both situations, the proposed formalism is
developed and used to identify the expected performance of the solutions.
The set of fuzzy dual numbers is the set $\tilde{\Delta}$ of numbers of the form $u = a + \varepsilon b$ such that
$a \in \mathbb{R}$, $b \in \mathbb{R}^+$, where $r(u) = a$ is the primal part and $d(u) = b$ is the dual part of the fuzzy
dual number.
A crisp fuzzy dual number is one in which $b$ is equal to zero, losing its fuzzy dual
attribute. To each fuzzy dual number $a + \varepsilon b$ is attached a symmetrical fuzzy number
whose membership function $\mu$ is such that:

$$\mu(x) = 0 \ \text{ if } x \le a - b \text{ or } x \ge a + b; \qquad \mu(x) = \mu(2a - x), \ x \in [a - b, a + b] \tag{1}$$
Different basic operations can be defined on $\tilde{\Delta}$ [9]. First, the fuzzy dual addition $\tilde{+}$ is
given by:

$$(x_1 + \varepsilon y_1) \mathbin{\tilde{+}} (x_2 + \varepsilon y_2) = (x_1 + x_2) + \varepsilon (y_1 + y_2) \tag{2}$$

and the fuzzy dual product $\tilde{\times}$ by:

$$(x_1 + \varepsilon y_1) \mathbin{\tilde{\times}} (x_2 + \varepsilon y_2) = x_1 x_2 + \varepsilon \bigl( |x_1|\, y_2 + |x_2|\, y_1 \bigr) \tag{3}$$

The fuzzy dual product has been chosen here in a way that preserves the fuzzy
interpretation of the dual part of fuzzy dual numbers, so it is different from the product
of dual calculus. The neutral element of fuzzy dual multiplication is $(1 + 0\varepsilon)$, written $\tilde{1}$.
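A minimal sketch of these operations follows; the absolute values in the product are an assumption made to keep the dual part nonnegative, consistent with the reconstruction of Eq. (3):

```python
class FuzzyDual:
    # u = a + eps*b, with a in R (primal part) and b >= 0 (dual part)
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b

    def __add__(self, v):                      # fuzzy dual addition, Eq. (2)
        return FuzzyDual(self.a + v.a, self.b + v.b)

    def __mul__(self, v):                      # fuzzy dual product, Eq. (3)
        return FuzzyDual(self.a * v.a,
                         abs(self.a) * v.b + abs(v.a) * self.b)

    def __repr__(self):
        return f"{self.a} + eps*{self.b}"
```

Note that with this product the operator $\varepsilon$ (i.e. $0 + 1\varepsilon$) is nilpotent, as stated by Eq. (4) below.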
It is easy to check that internal operations such as fuzzy dual addition and fuzzy dual
multiplication are commutative and associative. Fuzzy dual multiplication is
distributive with respect to fuzzy dual addition since, according to Eq. (3), the operator
$\varepsilon$ is such that:

$$\varepsilon \mathbin{\tilde{\times}} \varepsilon = \tilde{0} \tag{4}$$
Compared with common fuzzy calculus, fuzzy dual calculus appears to be much
less demanding in computer resources [10] and [11].
Let $E$ be a Euclidean space of dimension $p$ over $\mathbb{R}$; we then define the set of fuzzy dual
vectors $\tilde{E}$ as the pairs of vectors taken from the Cartesian product $E \times E^+$,
where $E^+$ is the positive half-space of $E$. Basic operations can be defined over $\tilde{E}$, such as
addition, scalar multiplication and the inner product:

$$(\lambda + \varepsilon \mu)(a + \varepsilon b) = \lambda a + \varepsilon \bigl( |\lambda|\, b + \mu\, |a| \bigr), \qquad \lambda + \varepsilon \mu \in \tilde{\Delta},\ a + \varepsilon b \in \tilde{E} \tag{6}$$

$$u \mathbin{\tilde{*}} v = r(u) \cdot r(v) + \varepsilon \bigl( |r(u)| \cdot d(v) + d(u) \cdot |r(v)| \bigr), \qquad u, v \in \tilde{E} \tag{7}$$

where "$\tilde{*}$" represents the inner product in $\tilde{E}$ and "$\cdot$" represents the inner product in $E$.
In order to make possible the comparison of fuzzy dual numbers, as well as
the identification of extremum values between fuzzy dual numbers, a new operator
from $\tilde{\Delta}$ to $\mathbb{R}^+$, called the fuzzy dual pseudo norm, is introduced,
where $\rho$ is a shape parameter associated with the considered fuzzy dual number, given by:
$$\rho = \frac{1}{2b} \int_{x \in \mathbb{R}} \mu(x)\, dx \ \in [0, 1] \tag{9}$$

$$\forall a \in \mathbb{R},\ b \in \mathbb{R}^+: \quad \|a + \varepsilon b\|_D = 0 \iff a = 0 \text{ and } b = 0 \tag{11}$$

$$\|\lambda (a + \varepsilon b)\|_D = |\lambda| \cdot \|a + \varepsilon b\|_D, \qquad a, \lambda \in \mathbb{R},\ b \in \mathbb{R}^+ \tag{13}$$

However, since the set of dual numbers $\tilde{\Delta}$ is not a vector space, the proposed
operator can only be regarded as a pseudo norm.
The fuzzy dual pseudo norm of a fuzzy dual vector $u$ can be introduced as (here $\|\cdot\|$
is the Euclidean norm associated with $E$):

$$\|u\|_D = \|r(u)\| + \rho\, \|d(u)\| \tag{14}$$
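The vector pseudo norm of Eq. (14) can be sketched directly; treating $r(u)$ and $d(u)$ as plain tuples of coordinates is an assumption of the sketch:

```python
import math

def fd_vector_pseudo_norm(r, d, rho=1.0):
    # ||u||_D = ||r(u)|| + rho * ||d(u)||  (Eq. (14)),
    # with the Euclidean norm on the primal and dual parts
    return math.hypot(*r) + rho * math.hypot(*d)
```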
Partial orders between fuzzy dual numbers can be introduced using this pseudo norm.
Depending on whether the fuzzy dual numbers overlap or not, strong and weak partial
orders can be introduced.
A strong fuzzy dual partial order, written $\tilde{\ge}$, is defined over $\tilde{\Delta}$ by:

$$\forall a_1 + \varepsilon b_1,\ a_2 + \varepsilon b_2 \in \tilde{\Delta}: \quad a_1 + \varepsilon b_1 \mathbin{\tilde{\ge}} a_2 + \varepsilon b_2 \iff a_1 - \rho b_1 \ge a_2 + \rho b_2 \tag{15}$$

In that case there is no overlap between the membership functions associated with
the two fuzzy dual numbers, and the first one is definitely larger than the second one.
A weak fuzzy dual partial order, also written $\tilde{\ge}$, is defined over $\tilde{\Delta}$ by:

$$\forall a_1 + \varepsilon b_1,\ a_2 + \varepsilon b_2 \in \tilde{\Delta}: \quad a_1 + \varepsilon b_1 \mathbin{\tilde{\ge}} a_2 + \varepsilon b_2 \iff a_1 + \rho b_1 \ge a_2 + \rho b_2 \ge a_1 - \rho b_1 \ge a_2 - \rho b_2 \tag{16}$$

In that case there is an overlap between the membership functions associated with
the two fuzzy dual numbers, and the first one appears to be partially larger than the
second one.
A fuzzy dual equality, written $\tilde{\approx}$, can be defined between two fuzzy dual numbers
by:

$$\forall a_1 + \varepsilon b_1,\ a_2 + \varepsilon b_2 \in \tilde{\Delta}: \quad a_1 + \varepsilon b_1 \mathbin{\tilde{\approx}} (a_2 + \varepsilon b_2) \iff a_2 \in [a_1 - \rho b_1, a_1 + \rho b_1] \text{ and } a_1 \in [a_2 - \rho b_2, a_2 + \rho b_2] \tag{17-a}$$

$$\forall a_1 + \varepsilon b_1,\ a_2 + \varepsilon b_2 \in \tilde{\Delta}: \quad a_1 + \varepsilon b_1 \mathbin{\tilde{\approx}} a_2 + \varepsilon b_2 \iff
\begin{aligned}
& a_1 + \rho b_1 \ge a_2 + \rho b_2 \ge a_2 - \rho b_2 \ge a_1 - \rho b_1 \\
\text{or } & a_2 + \rho b_2 \ge a_1 + \rho b_1 \ge a_1 - \rho b_1 \ge a_2 - \rho b_2
\end{aligned} \tag{17-b}$$

In this last case there is a complete overlap of the membership functions associated
with the two fuzzy dual numbers.
Then, when considering two fuzzy dual numbers, they will be in one of the above
situations (no overlap, partial overlap or full overlap): strong fuzzy dual inequality,
weak fuzzy dual inequality or fuzzy dual equality.
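The two order tests can be sketched as follows; the placement of the $\pm\rho b$ terms follows the reconstruction of Eqs. (15) and (16) and is an assumption:

```python
def strong_ge(u, v, rho=1.0):
    # strong order (no overlap of supports, cf. Eq. (15));
    # u and v are (a, b) pairs representing a + eps*b
    (a1, b1), (a2, b2) = u, v
    return a1 - rho * b1 >= a2 + rho * b2

def weak_ge(u, v, rho=1.0):
    # weak order (partial overlap, first partially larger, cf. Eq. (16))
    (a1, b1), (a2, b2) = u, v
    return a1 + rho * b1 >= a2 + rho * b2 >= a1 - rho * b1 >= a2 - rho * b2
```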
The max and min operators over two or more fuzzy dual numbers can now be
defined. Let $c + \varepsilon \gamma$ be the fuzzy dual maximum of the fuzzy dual numbers $a + \varepsilon \alpha$ and $b + \varepsilon \beta$:

$$c + \varepsilon \gamma = \max \{a + \varepsilon \alpha,\ b + \varepsilon \beta\} \tag{18}$$

and let $d + \varepsilon \delta$ be the fuzzy dual minimum:

$$d + \varepsilon \delta = \min \{a + \varepsilon \alpha,\ b + \varepsilon \beta\} \tag{20}$$
3.1. Discussion
To illustrate the proposed approach, the case of a linear programming problem with real
variables, where all parameters are uncertain and described by fuzzy dual numbers, is
considered. The proposed approach can easily be extended to integer or nonlinear
mathematical programming problems, or to problems with different types of level
constraints.
Let us then formally define problem $\tilde{L}$ as:

$$\min_{x \in \mathbb{R}^n} \sum_{i=1}^{n} |\tilde{c}_i\, x_i| \tag{22}$$

and

$$x_i \in \mathbb{R}, \quad i \in \{1, \ldots, n\} \tag{24}$$
When the problem is a constrained cost minimization problem, the cost parameters
$\tilde{c}_i$, although uncertain, remain positive, and the absolute operator can be removed
from the expression of Eq. (22). Here the fuzzy dual hypothesis is adopted for the cost
coefficients $c_i$, the technical parameters $a_{ki}$ and the constraint levels $b_k$. This opens up
different perspectives when dealing with parameter uncertainty.
Three different cases are considered here:
- the nominal case (a standard deterministic linear programming problem), in which
the dual parts of the parameters are zero;
- the pessimistic case, where uncertainty adds to the cost and the constraints are
strong ones;
- the optimistic case, where uncertainty subtracts from the cost and the constraints
are weak ones.
The nominal case corresponds to a standard mathematical programming problem. The
analysis of the pessimistic case is developed here in more detail and can easily be
transposed to the study of the optimistic case.
This problem corresponds to the minimization of the worst estimate of the total cost
with satisfaction of strong level constraints. Here the variables $x_i$ are supposed to take real
positive values, but they could also take fully real or integer values. In the case in
which the $d_i$ are zero, the uncertainty is relative to the feasible set. Problem $L^+$ is
equivalent to the following problem in $\mathbb{R}^n$:

$$\min_{x \in \mathbb{R}^n} \sum_{i=1}^{n} c_i x_i + \rho \sum_{i=1}^{n} d_i x_i \tag{28}$$

$$\sum_{i=1}^{n} (a_{ki} - \rho\, \alpha_{ki})\, x_i \ge b_k + \rho\, \beta_k, \quad k \in \{1, \ldots, m\} \tag{29}$$

and

$$x_i \ge 0, \quad i \in \{1, \ldots, n\} \tag{30}$$
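Building the crisp data of the pessimistic problem from the fuzzy dual parameters is a simple transformation; the sign placement in the constraints follows the reconstruction of Eqs. (28)–(30) and is an assumption:

```python
def pessimistic_lp(c, d, A, alpha, b, beta, rho):
    """Crisp data of the pessimistic problem L+ (cf. Eqs. (28)-(30)):
    minimise (c + rho*d).x  s.t.  (A - rho*alpha) x >= b + rho*beta, x >= 0."""
    cost = [ci + rho * di for ci, di in zip(c, d)]
    A_p = [[aki - rho * aik for aki, aik in zip(row, arow)]
           for row, arow in zip(A, alpha)]
    b_p = [bk + rho * bbk for bk, bbk in zip(b, beta)]
    return cost, A_p, b_p
```

The returned triple can then be handed to any standard continuous LP solver.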
In the case in which $\sum_{i=1}^{n} c_i x_i$ has no particular sign, the solution $x$ of problem $L^+$ will be the one
corresponding to the minimum of:

$$\min \Bigl\{ \sum_{i=1}^{n} c_i x_i^{\varepsilon} + \rho \sum_{i=1}^{n} d_i x_i^{\varepsilon},\ \ \rho \sum_{i=1}^{n} d_i x_i^{\delta} - \sum_{i=1}^{n} c_i x_i^{\delta} \Bigr\} \tag{31}$$

where $x^{\varepsilon}$ is the solution of the problem:

$$\min_{x \in \mathbb{R}^n} \Bigl( \sum_{i=1}^{n} c_i x_i + \rho \sum_{i=1}^{n} d_i x_i \Bigr) \tag{32}$$

$$\sum_{i=1}^{n} c_i x_i \ge 0 \quad \text{and} \quad x_i \ge 0, \ i \in \{1, \ldots, n\} \tag{34}$$

and where $x^{\delta}$ is the solution of the problem:

$$\min_{x \in \mathbb{R}^n} \Bigl( \rho \sum_{i=1}^{n} d_i x_i - \sum_{i=1}^{n} c_i x_i \Bigr) \tag{35}$$

$$\sum_{i=1}^{n} c_i x_i \le 0 \quad \text{and} \quad x_i \ge 0, \ i \in \{1, \ldots, n\} \tag{37}$$
The fuzzy dual optimal performance of this program is then given by:

$$\sum_{i=1}^{n} (c_i + \varepsilon d_i)\, x_i = \sum_{i=1}^{n} c_i x_i + \varepsilon \sum_{i=1}^{n} d_i x_i \tag{38}$$

The problems of Eqs. (32)-(34) and of Eqs. (35)-(37) are classical
continuous linear programming problems which can be solved in acceptable time, even
for large problems.
For the optimistic problem $L^-$, the constraints are:

$$\sum_{i=1}^{n} (a_{ki} + \rho\, \alpha_{ki})\, x_i \ge b_k - \rho\, \beta_k, \quad k \in \{1, \ldots, m\} \tag{40}$$

and

$$x_i \ge 0, \quad i \in \{1, \ldots, n\} \tag{41}$$

while for the nominal problem $L^0$:

$$\sum_{i=1}^{n} a_{ki}\, x_i \ge b_k, \quad k \in \{1, \ldots, m\} \tag{43}$$

and

$$x_i \ge 0, \quad i \in \{1, \ldots, n\} \tag{44}$$
Let $x^-$ and $x^0$ be the respective solutions of the problems of Eqs. (39)-(41) and of
Eqs. (42)-(44). It will be instructive to compare in a first step the performances
of problems $L^+$, $L^-$ and $L^0$, where:

$$\sum_{i=1}^{n} c_i x_i^- - \rho \sum_{i=1}^{n} d_i x_i^- \ \le\ \sum_{i=1}^{n} c_i x_i^0 \ \le\ \sum_{i=1}^{n} c_i x_i^+ + \rho \sum_{i=1}^{n} d_i x_i^+ \tag{45}$$

This allows the dispersion of the results between the pessimistic view of problem
$L^+$, the optimistic view of problem $L^-$ and the neutral view of problem $L^0$ to be displayed.
Then, in a second step, since $x^+$ is feasible for problems $L^-$ and $L^0$, it is of interest to
compare the different performances when adopting the solution $x^+$:

$$\sum_{i=1}^{n} c_i x_i^+ - \rho \sum_{i=1}^{n} d_i x_i^+ \ \le\ \sum_{i=1}^{n} c_i x_i^+ \ \le\ \sum_{i=1}^{n} c_i x_i^+ + \rho \sum_{i=1}^{n} d_i x_i^+ \tag{46}$$
Now we consider fuzzy dual programming problems with fuzzy dual parameters and
fuzzy dual decision variables as well. In that case, problem $V$ is formulated as:

$$\min_{x_i \in \mathbb{R},\ y_i \in \mathbb{R}^+} \sum_{i=1}^{n} (c_i + \varepsilon d_i)(x_i + \varepsilon y_i) \tag{47}$$

and

$$x_i \in \mathbb{R},\ y_i \ge 0, \quad i \in \{1, \ldots, n\} \tag{49}$$

The above problem corresponds to the minimization of the worst estimate of the total cost
with satisfaction of strong level constraints when there is some uncertainty not only in
the values of the parameters but also in the ability to implement exactly what should be
the optimal solution. According to Eq. (3), problem $V$ can be rewritten as:

$$\min_{x \in \mathbb{R}^n,\ y \in \mathbb{R}^n} \sum_{i=1}^{n} \bigl( c_i x_i + \varepsilon (|x_i|\, d_i + |c_i|\, y_i) \bigr) \tag{50}$$
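The expansion used in this rewriting can be sketched directly; the absolute values follow the product rule of Eq. (3) as reconstructed here and are an assumption:

```python
def v_objective(c, d, x, y):
    # primal and dual parts of sum_i (c_i + eps d_i)(x_i + eps y_i),
    # expanded term by term with the fuzzy dual product of Eq. (3)
    # (cf. Eq. (50))
    primal = sum(ci * xi for ci, xi in zip(c, x))
    dual = sum(abs(xi) * di + abs(ci) * yi
               for ci, di, xi, yi in zip(c, d, x, y))
    return primal, dual
```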
$$\forall x \in \mathbb{R}^n,\ y \in \mathbb{R}^{+n}: \quad A(x, y) \ge A(x, 0) \ \text{ and } \ C(x, y) \ge C(x, 0) \tag{55}$$

It appears, as expected, that the case of no diversion from the nominal solution is
always preferable. In the case in which the diversion from the nominal solution is fixed
to $y_i$, $i \in \{1, \ldots, n\}$, problem $V$ has the same solution as problem $V'$, given by:

$$\min_{x \in \mathbb{R}^n} \sum_{i=1}^{n} c_i x_i + \rho \sum_{i=1}^{n} d_i x_i \tag{56}$$

$$\sum_{i=1}^{n} (a_{ki}\, x_i - \rho\, \alpha_{ki}\, x_i) \ \ge\ b_k + \rho \Bigl( \beta_k + \sum_{i=1}^{n} a_{ki}\, y_i \Bigr), \quad k \in \{1, \ldots, m\} \tag{57}$$
The fuzzy dual optimal performance of the problem of Eq. (46) will then be given by:

$$\sum_{i=1}^{n} c_i x_i^* + \varepsilon \sum_{i=1}^{n} \bigl( |x_i^*|\, d_i + |c_i|\, y_i \bigr) \tag{58}$$
5. Conclusion
References
[1] M. Delgado, J. L. Verdegay and M. A. Vila, Imprecise costs in mathematical programming problems,
Control and Cybernetics, 16(1987), 114-121.
[2] T. Gal, H. J. Greenberg (Eds.), Advances in Sensitivity Analysis and Parametric Programming, Series:
International Series in Operations Research & Management Science, Vol. 6, Springer, 1997.
[3] A. Ruszczynski and A. Shapiro. Stochastic Programming. Handbooks in Operations Research and
Management Science, Vol. 10, Elsevier, 2003.
[4] A. Ben-Tal, L. El Ghaoui and A. Nemirovski, Robust Optimization. Princeton Series in Applied
Mathematics, Princeton University Press, 2009.
[5] H. J. Zimmermann, Fuzzy Sets Theory and Mathematical Programming, in A. Jones et al. (eds.), Fuzzy
Sets Theory and Applications, D. Reidel Publishing Company, 99-114, 1986.
[6] C. A. N. Cosenza and F. Mora-Camino, Nombres et ensembles duaux flous et applications, in French,
Technical report, Labfuzzy laboratory, COPPE/UFRJ, Rio de Janeiro, August 2011.
[7] W. Kosinsky, On Fuzzy Number Calculus, International Journal of Applied Mathematics and Computer
Science, 16(2006), 51-57.
[8] H. H. Cheng, Programming with Dual Numbers and its Application in Mechanism Design, Journal of
Engineering with Computers, 10(1994), 212-229.
[9] F. Mora-Camino, O. Lengerke and C. A. N. Cosenza, Fuzzy sets and dual numbers, an integrated
approach, Fuzzy Sets and Knowledge Discovery Conference, Chongqing, China, 28-31 May 2012.
[10] H. Nasseri, Fuzzy Numbers: Positive and Nonnegative, International Mathematical Forum, 3(2006),
1777-1780.
[11] E. Pennestrelli and R. Stefanelli, Linear Algebra and Numerical Algorithms using Dual Numbers,
Journal of Multibody Systems Dynamics, 18(2007), 323-344.
22 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-22
Introduction
Sports outcome prediction is an important area of betting on sports events, which has
gained a great deal of popularity recently. American Football, such as National Football
League (NFL) games, uses a complex scoring system whose resulting scores are
hard to model using standard modeling approaches. There are five ways to score in
American Football, giving 2 points, 3 points, 6 points, and 7 points under different
touchdown situations. Other sports, such as soccer, baseball and basketball, are relatively
much simpler, awarding different points in only a few situations. Consequently, standard
modeling approaches, such as Poisson-type regression models, can provide impressive
performance when modeling scores in soccer, but may perform worse when applied
to American Football scores due to their peculiar distribution [1].
Much research on sports forecasting has demonstrated that the win/lose result
of a game may be affected by past scores, offense/defense statistics, player
absence [2], etc. Even the temperature, wind speed and moisture at the
competition venue may potentially influence player performance. Most research
adopts these influencing factors for quantitative analysis to estimate the pointed score
1 Corresponding author: Yu-Chia Hsu, Dep. of Sports Information and Communication, National Taiwan University of Sport, No. 16, Sec. 1, Shuang-Shih Rd., Taichung, Taiwan; E-mail: ychsu@ntupes.edu.tw.
Y.-C. Hsu / Forecasting National Football League Game Outcomes 23
The market data in sports betting, such as odds, point spreads, and over/unders, offer a type of predictor and a source of expert advice and expected probability regarding sports outcomes. Adopting the betting-market data published by bookmakers in the prediction model can provide rather high forecasting accuracy [4]. This is reasonable, because betting companies could not survive with inefficient odds and spreads.
The betting market shares many characteristics with financial markets [5]. The three variants of the efficient market hypothesis (EMH), the "weak", "semi-strong", and "strong" forms, which reflect how rationally and how quickly current prices incorporate information, have also been extended to the betting market: the line is held to incorporate, respectively, all relevant information contained in past game outcomes, all public information, and inside information [6].
Moreover, price fluctuations follow the mechanism known as the random walk model under certain restrictions and conditions, so profitable forecasting models do not persist for long. However, in both financial and betting markets, profitable forecasting models have existed during periods of market inefficiency, although they require extensive modeling innovations [7].
Candlestick charting originated in the Japanese rice futures market of the 18th century. It provides a visual aid for looking at data differently, forecasting near-term equity price movement, and developing insight into market psychology. Japanese candlestick theory is now one of the most widely used technical analysis techniques based on empirical models for investment decisions. The trend of a financial time series is assumed to be predictable by recognizing specific candlestick patterns.
A candlestick is produced from the opening, highest, lowest, and closing prices over a given time interval. Each candlestick includes a body and possibly a wick that extends above or below the body. Figure 1 illustrates the candlestick line. The body is shown as a box representing the difference between the opening and closing prices, and the wick is shown as a line representing the range between the highest and lowest prices over the interval. The body is filled with either black or white, according to whether the opening price is above or below the closing price, respectively. In some time intervals the highest/lowest price coincides with the top/bottom of the body, so a candlestick may or may not have a wick.
Sports-metric candlestick charts, proposed by Mallios [9], provide simple graphics of game outcomes relative to the gambling line. As in the candlestick charts used in financial equity price analysis, each sports-metric candlestick includes a body and a wick that extends above or below the body. However, the open, high, close, and low prices that constitute the body and wick of a financial candlestick are not appropriate for sports. For the sports metric, the candlestick charts are composed of the winning/losing margin, the total points scored, and their corresponding gambling lines. Figure 2 illustrates the sports-metric candlestick. The body of the candlestick is determined by the winning/losing margin, denoted D, and the gambling line on the winning/losing margin, denoted LD, for a given team. If D > LD, the body's color is white, and the body's maximum and minimum values are defined by D and LD. If LD > D, the body's color is black, and the body's maximum and minimum values are defined by LD and D. The length of the candlestick wick is determined by the gambling shock of the line on total points scored, denoted GST. GST is the difference between the total points scored in the game and the corresponding line on total points scored. If GST > 0, the wick extends above the body; if GST < 0, below the body. There is no wick when GST = 0.
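Mallios's sports-metric construction above translates directly into code. The following is an illustrative sketch; the function and field names are hypothetical, not taken from [9].

```python
def sports_candlestick(margin, margin_line, total, total_line):
    """Build a sports-metric candlestick.

    margin:      winning/losing margin D for the team
    margin_line: gambling line LD on the margin
    total:       total points scored in the game
    total_line:  gambling line on total points scored
    """
    gst = total - total_line                       # gambling shock GST
    color = "white" if margin > margin_line else "black"
    body = (min(margin, margin_line), max(margin, margin_line))
    # wick direction follows the sign of GST; no wick when GST == 0
    direction = "above" if gst > 0 else "below" if gst < 0 else "none"
    return {"body": body, "color": color, "wick": (direction, abs(gst))}
```

For example, a 7-point win against a 3-point line with 45 total points against a 41-point totals line yields a white body spanning 3 to 7 with a 4-point wick above it.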
The size of a candlestick line only reflects the characteristics of the price fluctuation during one time interval, which is not enough to model valuable candlestick patterns. In order to capture the consequent trend of candlesticks, the relationship between two adjacent candlestick lines should be considered. The open style and the close style are modeled by the positions of the opening and closing prices relative to the previous candlestick line. Five linguistic variables, Low, EqualLow, Equal, EqualHigh, and High, are defined to represent the open and close styles. Figure 4 shows the membership functions of these linguistic variables for the open style and close style. The unit of the x-axis is the price in the previous time interval, and the y-axis gives the values of the membership function. The parameters of the functions describing the linguistic variables depend on the previous candlestick line, as illustrated by the previous candlestick line at the bottom of Figure 4.
The candlestick charts are characterized with fuzzy linguistic variables by applying the maximum-membership method: when more than one fuzzy set matches a single crisp value, the fuzzy set with the maximum membership value is selected. Table 1 shows an example of a fuzzy candlestick pattern at times t-i to t for forecasting the next game outcome.
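The max-membership labeling step can be sketched as follows. The triangular breakpoints below are illustrative placeholders, since the paper's membership parameters depend on the previous candlestick line and are not reproduced here.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical five-label partition of a normalized price axis [-1, 1].
LABELS = {
    "Low":       (-1.5, -1.0, -0.5),
    "EqualLow":  (-1.0, -0.5,  0.0),
    "Equal":     (-0.5,  0.0,  0.5),
    "EqualHigh": ( 0.0,  0.5,  1.0),
    "High":      ( 0.5,  1.0,  1.5),
}

def fuzzify(x):
    """Assign the linguistic label with maximum membership for crisp x."""
    return max(LABELS, key=lambda name: tri(x, *LABELS[name]))
```

With overlapping triangles, a crisp value usually has nonzero membership in two adjacent sets; taking the maximum resolves it to a single symbol, as described above.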
To mine the rules of candlestick patterns for forecasting the next game outcome, we extract the historical data, consisting of the point-spread line, the total-points line, the actual box score, and the outcome, at times t, t-1, …, t-i. Then we translate these data into candlestick chart entities and symbolize the time series by fuzzification. The fuzzy candlestick patterns are then recognized using the random forests algorithm to obtain the optimal decision tree. Finally, the next game outcomes are predicted with the optimal decision tree.
Figure 4. The membership functions of the linguistic variables of the open style and close style
Table 1. Example of fuzzy candlestick pattern

Time frame | Body length | Upper wick length | Lower wick length | Body color | Open style | Close style | Outcome
t-i        | Short       | VeryShort         | VeryShort         | Black      | EqualHigh  | Low         | Win
…
t          | VeryLong    | Long              | Short             | White      | EqualLow   | Equal       | Lose
To demonstrate the effectiveness of forecasting game outcomes, we use NFL data gathered from covers.com for the 2011-2012 season. We choose the champion of that year's Super Bowl, the New York Giants, as the team for the empirical study. The data cover the regular season and the post-season of that year. The New York Giants played 20 games in total, comprising regular-season games from week 1 to week 17 and 4 post-season games, namely the Wild Card, Divisional, Conference, and Super Bowl rounds. The data are divided into two sets according to the NFL season: the regular-season data form the training set, and the play-off and Super Bowl data form the testing set. The rules of the candlestick patterns are learned from the regular-season data and used to forecast the post-season outcomes.
The prediction is evaluated with three performance measurements, precision, recall, and F-measure, which are widely used in data mining. The formulas are shown in Eqs. (1)-(3).
precision = TP / (TP + FP) × 100%    (1)

recall = TP / (TP + FN) × 100%    (2)

F-measure = 2 × (recall × precision) / (recall + precision)    (3)
where TP, FP, and FN denote true positives, false positives, and false negatives, respectively.
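Eqs. (1)-(3) translate directly into code, which is handy for checking the figures reported in Table 2:

```python
def precision(tp, fp):
    """Eq. (1): precision as a percentage."""
    return tp / (tp + fp) * 100

def recall(tp, fn):
    """Eq. (2): recall as a percentage."""
    return tp / (tp + fn) * 100

def f_measure(tp, fp, fn):
    """Eq. (3): harmonic mean of precision and recall (same % scale)."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if p + r else 0.0
```

For instance, TP = 3, FP = 0, FN = 1 gives precision 100%, recall 75%, and F-measure about 85.7%, i.e. 0.857 as a fraction, matching the "t-1, t / Win" row of Table 2.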
The empirical results of the forecasting are presented in Table 2. The results reveal that the precision, recall, and F-measure of the outcome prediction for wins are extremely high. This may occur because of the small sample size, which is an innate limitation of sports outcome forecasting: most NFL teams play only about 20 games in one season. It is therefore understandable that only 17 samples are used for training, with the remaining 4 samples for testing. In fact, the New York Giants won all 4 post-season games, including the Super Bowl.
Table 2. The results of prediction

Time frame of input data | Number of input variables | Outcome prediction | Precision | Recall | F-measure
t                        | 7                         | Win                | 100%      | 100%   | 1
                         |                           | Lose               | 0%        | 0%     | 0
t-1, t                   | 14                        | Win                | 100%      | 75%    | 0.857
                         |                           | Lose               | 0%        | 0%     | 0
5. Conclusion
References
[1] R. D. Baker, I. G. McHale, Forecasting exact scores in National Football League games, International
Journal of Forecasting, 29 (2013), 122-130.
[2] W. H. Dare, S. A. Dennis, R. J. Paul, Player absence and betting lines in the NBA, Finance Research
Letters, 13 (2015), 130-136.
[3] M. Lewis, Moneyball: The Art of Winning an Unfair Game, W. W. Norton & Company, New York, 2003.
[4] D. Paton, L. V. Williams, Forecasting outcomes in spread betting markets: can bettors use 'quarbs' to
beat the book?, Journal of Forecasting, 24 (2005), 139-154.
[5] S. D. Levitt, Why are gambling markets organised so differently from financial markets?, The Economic
Journal, 114 (2004), 223-246.
[6] L. V. Williams, Information efficiency in betting markets: A survey, Bulletin of Economic Research, 51
(1999), 1-39.
[7] W. S. Mallios, Forecasting in Financial and Sports Gambling Markets. Wiley, New York, 2011.
[8] Y. Li, Z. Feng, L. Feng, Using candlestick charts to predict adolescent stress trend on micro-blog,
Procedia Computer Science, 63 (2015), 221-228.
[9] W. Mallios, Sports Metric Forecasting, Xlibris Corporation, 2014.
[10] C.-H. L. Lee, A. Liu, W.-S. Chen, Pattern discovery of fuzzy time series for financial prediction, IEEE
Transactions on Knowledge and data Engineering, 18 (2006), 613-625.
[11] Q. Lan, D. Zhang, L. Xiong, Reversal pattern discovery in financial time series based on fuzzy
candlestick lines, Systems Engineering Procedia, 2 (2011), 182-190.
[12] P. Roy, S. Sharma, M. K. Kowar, Fuzzy candlestick approach to trade S&P CNX NIFTY 50 index
using engulfing patterns, International Journal of Hybrid Information Technology, 5 (2012), 57-66.
28 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-28
Abstract. Aiming at the high cost and slow equalization speed of traditional circuits, a parallel filling-valley equalization circuit based on fuzzy control is proposed in this paper. A fuzzy controller suitable for the circuit is designed. The average voltage, the voltage range, and the balance electric quantity of the battery pack are described by a fuzzy model. Fuzzy reasoning and defuzzification are performed to optimize the circuit control logic, which can adapt to the nonlinearity of the battery pack and the uncertainty of the battery parameters. The simulation and experimental results show that, during charging and discharging, the fuzzy-control-based parallel filling-valley equalization circuit equalizes quickly and efficiently, which can improve the use efficiency of the battery pack.
Introduction
With continuing environmental pollution and the depletion of oil, the structure of vehicle energy systems has become a hot issue of global concern and research [1]. In recent years, people have been committed to the development of safe, efficient, and clean transport. The electric vehicle represents the development direction of the new generation of environmentally friendly vehicles. As the power source of electric vehicles, the power battery directly affects their usability [2]. The lithium battery is one of the best choices for the power source of electric vehicles because of its advantages, such as high voltage, low self-discharge rate, high efficiency, and environmental friendliness [3]. However, owing to differences introduced in production, long-term storage, and repeated charge/discharge cycles, the gap between the charge levels of individual cells increases: the dispersion among the cells within the battery pack grows, the performance degradation of individual cells intensifies, and eventually the whole battery pack fails [4]. Therefore, battery equalization is an indispensable technology for ensuring the safety of the battery and extending the service life of the battery pack [5]. Battery equalization can be roughly divided into active and passive equalization [6-7]. Active balancing does not consume the battery energy in the process and has become a hot research topic today [8]. In the active balancing scheme, the
1 Corresponding author: Yuan JI, Department of Microelectronics Center, Shanghai University, Shanghai, China; E-mail: jiyuan@shu.edu.cn.
F. Ran et al. / A Fuzzy Control Based Parallel Filling Valley Equalization Circuit 29
highest-energy cell of the battery pack transfers energy to the lowest-energy cell through a converter. Super-capacitor equalization, inductance equalization, and converter equalization are the most common ways to achieve parallel filling-valley equalization [9]. However, active equalization still has problems that need to be solved urgently, such as high cost, complex control circuits, and slow equalization speed.
At this stage, research on battery equalization technology mainly covers two aspects. On the one hand, there is the equalization strategy [10]: how to build a common evaluation system for the battery pack and then derive the basis of the equalization control strategy. On the other hand, there is the design of the equalization circuit topology [11], which mainly concerns the hardware implementation. For these two aspects, researchers have put forward many different equalization solutions. Tian et al. [12] proposed an energy-tap charging and discharging equalization control strategy but did not give a specific implementation of the scheme. Wu et al. [13] proposed that SOC-based equalization can effectively eliminate the inconsistency of the battery; but because the accuracy of SOC estimation is not guaranteed, it is only suitable for offline equalization. Fu et al. [14] proposed a control strategy that takes the battery voltage as the equalization criterion, with the goal of achieving relative consistency of the SOC of single cells. It is widely used because of its clear goal and simple control, but its ability to deal with nonlinear problems needs to be improved.
Generally, the lithium battery shows nonlinear characteristics. In order to keep the battery system stable and balancing fast in different environments with uncertain parameters, this paper proposes a parallel filling-valley equalization scheme based on fuzzy control. A fuzzy balancing controller is used to optimize the balancing strategy. Simulation results show that the balancing speed and the efficiency of the proposed parallel filling-valley equalization scheme are improved compared with the traditional fly-back filling-valley control strategy. Thirteen general e-bike lithium batteries (rated 48 V) connected in series were used as the object of the charging and discharging experiments. The experimental results show that, with the fuzzy-control-based parallel filling-valley equalization strategy, the voltage difference between the lithium batteries converges to less than 10 mV even when a large initial voltage difference exists between the cells.
Figure 2 and Figure 3 show the input/output membership functions of the filling-valley equalization fuzzy controller. The equalization current and the equalization time are determined by the fuzzy controller from the measured average voltage AV and the voltage range VD. The triangular function is chosen as the membership function of the average voltage (AV) and the voltage difference (VD) because it is easy to compute compared with other membership functions. The average voltage (AV) is divided into 5 fuzzy subsets: average very large (AVL), average large (AL), average medium (AM), average small (AS), and average very small (AVS), covering the domain [2.7 V, 4.2 V]. The input variable voltage difference VD is also divided into 5 fuzzy subsets: difference very large (DVL), difference large (DL), difference medium (DM), difference small (DS), and difference very small (DVS), covering the domain [0 mV, 20 mV]. The system treats the input as 20 mV whenever the voltage range is greater than 20 mV. The output variable, the equalization electric quantity QBAL, is divided into the subsets VL (very large), L (large), M (medium), S (small), and VS (very small). Figures 2 and 3 show the membership functions of the fuzzy control system with VD, AV, and QBAL on the horizontal coordinates. For example, when AV is 3.4 V, it has one hundred percent membership in S and zero percent membership in M and VS. Figure 4 shows the surface relationship of the fuzzy controller, from which the relationship among AV, VD, and the balance capability can be seen. The rule base is described in Table 1.
Figure 3. Membership functions for voltage difference and balancing electric quantity
Table 1 shows that the fuzzy control system has a total of 25 rules, denoted R1, R2, …, R25. A fuzzy rule can be expressed as

Ri: IF AV is Ai AND VD is Bi THEN QBAL is Ci

In the theory of fuzzy control there are many kinds of operations, so there are many choices in practical applications. According to the requirements of the design, the operation rules of the filling-valley equalization fuzzy controller are as follows: the fuzzy "and" operation uses "min", taking the minimum value; the fuzzy "or" operation uses "max", taking the maximum value; the implication relation uses "min"; the output synthesis uses "max"; and the centroid method is used in the output defuzzification. All of the rules together can be expressed as

R = R1 ∪ R2 ∪ … ∪ R25 = ∪_{i=1}^{25} Ri    (3)
When the exact values of AV and VD are known, the fuzzy quantity of the output QBAL can be given by

μo(QBAL) = (AV × VD) ∘ R    (4)

= ∪_{i=1}^{25} (AV × VD) ∘ (Ai and Bi → Ci)    (5)

Because the "min" method is used in the calculation of the implication relation,

μo(QBAL) = [AV ∘ ∪_{i=1}^{25} (Ai → Ci)] ∧ [VD ∘ ∪_{i=1}^{25} (Bi → Ci)]    (6)

Finally, the output fuzzy variable is made exact by the defuzzification module using the centroid method:

QBAL = ∫ QBAL μo(QBAL) dQBAL / ∫ μo(QBAL) dQBAL    (7)
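The inference chain of Eqs. (4)-(7) — "min" implication, "max" aggregation, centroid defuzzification — can be sketched on a discretized output domain. The membership shapes and the rule encoding below are illustrative, not the paper's 25-rule base.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def infer(av, vd, rules, q_grid):
    """Mamdani-style inference.

    rules:  list of (mu_A, mu_B, mu_C) callables, one triple per rule
            (antecedents on AV and VD, consequent on QBAL).
    q_grid: discretized output domain for QBAL.
    """
    mu_out = []
    for q in q_grid:
        # "max" aggregation over rules of the "min" implication
        mu_out.append(max(min(mu_a(av), mu_b(vd), mu_c(q))
                          for mu_a, mu_b, mu_c in rules))
    # discrete centroid defuzzification, Eq. (7)
    num = sum(q * m for q, m in zip(q_grid, mu_out))
    den = sum(mu_out)
    return num / den if den else 0.0
```

With a single fully-fired rule whose consequent is a symmetric triangle, the centroid lands at the triangle's peak, as expected.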
The maximum equalization current Ieq_max is computed in Eq. (8) from the DC bus voltage Udc, the cell voltages UM and U0, the diode voltage drop Udio, the inductances LP, LS, and Lx, the period T, and the turns ratio K. The equalization current IBAL is then

IBAL = min( QBAL / TBAL_MIN , Ieq_max )    (9)

Eq. (10) relates IBAL, Umin, Udio, Lp, Lx, Udc, UM, K, and T. The equalization time is

TBAL = min( QBEC / IBEC , TBAL_MAX )    (11)

where TBAL_MAX denotes the maximum equilibrium time, here TBAL_MAX = 10 s. TBAL_MAX is set to limit the length of each equalization interval, in case the battery-pack charging and discharging voltage changes too much within the time period.
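The limiting logic of Eqs. (9) and (11) amounts to two clamps. The following is a minimal sketch with illustrative variable names (the paper writes QBEC and IBEC for the balance electric quantity and current; here a single q_bal/i_bal pair is used for simplicity).

```python
def balancing_setpoints(q_bal, t_bal_min, i_eq_max, t_bal_max=10.0):
    """Convert a defuzzified charge quantity into current/time setpoints.

    q_bal:     balance electric quantity from the fuzzy controller
    t_bal_min: minimum balancing interval
    i_eq_max:  circuit's maximum equalization current, Eq. (8)
    t_bal_max: cap on the balancing time (10 s in the paper)
    """
    i_bal = min(q_bal / t_bal_min, i_eq_max)   # Eq. (9): current clamp
    t_bal = min(q_bal / i_bal, t_bal_max)      # Eq. (11): time clamp
    return i_bal, t_bal
```

The current clamp keeps the converter within its physical limit; the time clamp keeps each balancing step short enough that the pack voltage does not drift significantly during the interval.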
The simulation of battery charging and discharging is carried out in MATLAB, and ordinary fly-back control is compared with fuzzy control on the basis of the simulation results.
Figure 5 shows the lithium battery charging and discharging process in the MATLAB simulation. The battery pack has 13 series-connected cells with different initial voltages, corresponding to the 13 curves of different colors in the figure. According to the algorithm and the differences between the charging and discharging processes, the figure is divided into Fig. 5A, Fig. 5B, Fig. 5C, and Fig. 5D. The horizontal axis shows the simulation time, and the vertical axis shows the cell voltages of the battery pack. Fuzzy control is used in Fig. 5B and Fig. 5D, while ordinary fly-back control is used in Fig. 5A and Fig. 5C. By comparison, in the charging process the battery pack needs 190 min to reach the energy balance under fly-back control, as shown in Figure 5A, while the fuzzy-control-based controller achieves energy balance in only 120 min, as shown in Figure 5B. In the static discharge process, the battery pack needs 232 min to reach the energy balance, as shown in Figure 5C, while in Figure 5D the equilibrium state is reached in only 150 min. The simulation results show that the fuzzy-control-based parallel filling-valley equalization strategy has a faster equalization speed than the normal fly-back controller.
The battery charge and discharge experiments are carried out in this paper against the background of the filling-valley equalization fuzzy control. The initial voltage values of the cells vary from 2.9 V to 3.4 V.
The charging experiment follows the scheme of constant-current charging first and then constant-voltage charging. The full charging process is shown in Figure 6. In order to see the equalization time clearly, Figure 7 shows the equalizing-charge diagram of the first 50 min. It can be seen from the figure that the battery pack moves from the initial state of disequilibrium into equilibrium in only 30 min, which is a remarkable improvement compared with general equalization technology. As shown in Figure 8, the equalizing discharging process can balance the power of cells with different initial voltages and bring the cells into equalization.
From Table 2, the experimental results show that the fuzzy-control-based parallel filling-valley equalization circuit can clearly reduce the voltage difference and performs well with respect to both the nonlinear problem and the equalization speed. However, the fuzzy rule base and data base in a realistic fuzzy control process need adequate accuracy to be reliable, and better rules and inference processes will certainly be sought in later research.
3. Conclusion
The proposed fuzzy-control-based parallel filling-valley equalization circuit can reach equalization quickly. It has a good ability to handle the nonlinear problem, compared with the traditional circuit. With the development of electric vehicles, people require high-quality cell equalization. Lossless equalization can achieve lossless energy transfer between different batteries to avoid wasting energy, and filling-valley equalization is one of the lossless equalization schemes. How to improve the energy flow efficiency and correct the imbalance in multi-string parallel battery packs should be addressed in future research on lossless equalization. In addition, the equalization circuit is supposed to be as succinct as possible; how to reduce the size of the chip and enhance its applicability deserves attention in further study.
References
[1] E. Kim, K. G. Shin, J. Lee. Real-time battery thermal management for electric vehicles. Cyber-Physical
Systems (ICCPS). Berlin: IEEE, (2014):72-83.
[2] C. L. Wey, P. C. Jui. A unitized charging and discharging smart battery management system. Connected
Vehicles and Expo (ICCVE). Las Vegas: IEEE, (2012):903-909.
[3] B. B. Qiu, H. P. Liu, J. L. Yang, et al. An active balance charging system of lithium iron phosphate
power battery group, Advanced Technology of Electrical Engineering and Energy, 2014.
[4] J. Cao, N. Schofield, A. Emadi. Battery Balancing Methods: A Comprehensive Review. Vehicle Power
and Propulsion Conference (VPPC). Harbin: IEEE, (2008):1-6
[5] B. T. Kuhn, G. E. Pitel, P. T. Krein, et al. Electrical properties and equalization of lithium-ion cells in
automotive applications. Vehicle Power and Propulsion Conference (VPPC): IEEE, 2005
[6] B. Lindemark. Individual cell voltage equalizers (ICE) for reliable battery performance.
Telecommunications Energy Conference: INTELEC, (1991):196-201
[7] A. Baughman, M. Ferdowsi. Analysis of the Double-Tiered Three-Battery Switched Capacitor Battery
Balancing System. Vehicle Power and Propulsion Conference (VPPC). Harbin: IEEE, (2006):1-6
[8] W. G. Ji, X. Lu, Y. Ji, et al. Low cost battery equalizer using buck-boost and series LC converter with
synchronous phase-shift control. Annual IEEE Applied Power Electronics Conference and Exposition
(APEC). Long Beach: IEEE, 331(2013):1152-1157
[9] M. Daowd, N. Omar, DBP Van, et al. Passive and Active Battery Balancing comparison based on
MATLAB Simulation. IEEE Vehicle Power and Propulsion Conference (VPPC). Chicago, IL: IEEE,
(2011):1-7
[10] H. R. Liu, S. H. Zhang, et al. Lithium-ion battery charge and discharge equalizer and balancing
strategy.Transactions of China Electrotechnical Society, 16(2015):186-192.
[11] W. G. Ji, X. Liu, Y. Ji, et al. Low cost battery equalizer using buck-boost and series LC converter with
synchronous phase-shift control. In 2013 28th Annual IEEE applied Power Electronics Conference and
Exposition (APEC). Long Beach. CA, USA, (2013): 1152-1157
[12] R. Tian, D. T. Qin, M. H. Hu, et al. Research on battery equalization balance strategy. Journal of
Chongqing University (Nature Science Edition), (2005):1-4
[13] Y. Y. Wu, H. Liang. Research on electric vehicle battery equalization method. Automotive Engineering,
(2004): 384-385.
[14] J. J. Fu, B. J. Wu, H. J. Wu, et al. Dynamic bidirectional equalization system to a vehicle hang-ion
battery weave. China Measurement Technology, (2005): 10-11.
Fuzzy Systems and Data Mining II 37
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-37
Abstract. The hesitant fuzzy set (HFS) is one of the most commonly used techniques for expressing a decision maker's subjective evaluation information. The interval-valued hesitant fuzzy set (IVHFS) is an extension of the HFS and can reflect our intuition more objectively. In this paper we focus on IVHF information aggregation methods based on the Bonferroni mean (BM). We propose the IVHF geometric BM (IVHFGBM) operator and the weighted IVHFGBM operator. Some numerical examples for the operators are designed to show their effectiveness. The desirable properties of the weighted IVHFGBM operator are also discussed in detail. These operators can be applied in many areas, especially in decision-making problems.
Introduction
There are various methods available for decision making. One common feature of decision-making methods is the information aggregation technique [1-7]. Using an information aggregation operator in decision making, we can obtain the comprehensive performance values of the alternatives, which are then used to compare them. The alternative with the largest comprehensive performance value is the best option. The Bonferroni mean (BM) [8-10] is a widely used technique in the information aggregation and decision-making area. At present, it has been extended to the interval-valued uncertain environment, the intuitionistic fuzzy (IF) environment, the interval-valued intuitionistic fuzzy (IVIF) environment, the fuzzy environment, the uncertain linguistic fuzzy environment, and the hesitant fuzzy environment.
However, we found that the BM operator cannot be used to aggregate interval-valued hesitant fuzzy information [11], which is the research focus of this paper. In the rest of this paper, we first review the basic concepts of the interval-valued hesitant fuzzy set (IVHFS) and then extend the BM to the interval-valued hesitant fuzzy environment. Numerical examples are presented to better understand these interval-valued hesitant fuzzy information aggregation methods based on BM operators.
1
Corresponding Author: Xiao-Rong HE, School of Economics and Management, Southeast University,
Nanjing, China; E-mail: shelley526@126.com.
38 X.-R. He et al. / Interval-Valued Hesitant Fuzzy Geometric BM Aggregation Operator
1. Preliminaries
In this section, a brief review of the interval-valued hesitant fuzzy set (IVHFS) is presented.
Definition 1 [11]. Let X be a reference set. An IVHFS on X can be represented in the following mathematical form:

E = { ⟨x, fE(x)⟩ | x ∈ X }    (1)
Definition 2 [11]. Let h = ∪_{γ∈h} {[γ^L, γ^U]}, h1 = ∪_{γ1∈h1} {[γ1^L, γ1^U]}, and h2 = ∪_{γ2∈h2} {[γ2^L, γ2^U]} be three IVHFEs, and let λ be a real number bigger than 0. Then the operations are defined as follows:

1) h^λ = ∪_{γ∈h} { [ (γ^L)^λ , (γ^U)^λ ] }
2) λh = ∪_{γ∈h} { [ 1 − (1 − γ^L)^λ , 1 − (1 − γ^U)^λ ] }
3) h1 ⊕ h2 = ∪_{γ1∈h1, γ2∈h2} { [ γ1^L + γ2^L − γ1^L γ2^L , γ1^U + γ2^U − γ1^U γ2^U ] }
4) h1 ⊗ h2 = ∪_{γ1∈h1, γ2∈h2} { [ γ1^L γ2^L , γ1^U γ2^U ] }

The score function of an IVHFE h is

S(h) = (1/#h) Σ_{γ∈h} (γ^L + γ^U) / 2    (2)

where #h is the number of intervals in h.
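Definition 2 and the score function of Eq. (2) can be implemented directly. The sketch below is illustrative: an IVHFE is represented as a list of [lower, upper] interval pairs.

```python
def ivhfe_add(h1, h2):
    """h1 ⊕ h2: probabilistic sum applied to each interval end point."""
    return [[a + c - a * c, b + d - b * d] for a, b in h1 for c, d in h2]

def ivhfe_mul(h1, h2):
    """h1 ⊗ h2: product of interval end points."""
    return [[a * c, b * d] for a, b in h1 for c, d in h2]

def ivhfe_scale(lam, h):
    """λh = [1 - (1 - γ^L)^λ, 1 - (1 - γ^U)^λ]."""
    return [[1 - (1 - a) ** lam, 1 - (1 - b) ** lam] for a, b in h]

def ivhfe_power(h, lam):
    """h^λ = [(γ^L)^λ, (γ^U)^λ]."""
    return [[a ** lam, b ** lam] for a, b in h]

def score(h):
    """Eq. (2): mean of the interval midpoints."""
    return sum((a + b) / 2 for a, b in h) / len(h)
```

Note that ⊕ and ⊗ enumerate every pair of intervals from the two operands, so the result of combining IVHFEs with m and n intervals has m·n intervals, matching the six intervals of Example 3 below (2 × 1 × 3).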
X.-R. He et al. / Interval-Valued Hesitant Fuzzy Geometric BM Aggregation Operator 39
After the concepts of IVHFS and IVHFE were proposed, aggregation operators for IVHFEs were put forward correspondingly, such as the IVHFWA, IVHFWG, IVHFOWA, IVHFOWG, GIVHFWA, GIVHFWG, induced IVHFWA, and induced IVHFWG operators, and so on [12-13]. It should be noted that these IVHF information aggregation operators cannot be used to fuse correlated arguments. On the other hand, the geometric mean (GM) is a common aggregation operator and has been widely used in the information fusion field. Based on the GM, the geometric BM (GBM) operator has been proposed and investigated by some researchers. However, it seems that researchers have not yet investigated the GBM for aggregating IVHFEs, which is the concern of the following studies.
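For reference, the crisp geometric Bonferroni mean that the IVHF extension generalizes is usually written as GBM^{p,q}(a_1, …, a_n) = (1/(p+q)) ∏_{i≠j} (p·a_i + q·a_j)^{1/(n(n−1))}. A minimal sketch of this standard definition:

```python
def gbm(values, p, q):
    """Crisp geometric Bonferroni mean GBM^{p,q} of a list of numbers."""
    n = len(values)
    prod = 1.0
    for i in range(n):
        for j in range(n):
            if i != j:
                # each of the n(n-1) ordered pairs contributes one factor
                prod *= (p * values[i] + q * values[j]) ** (1 / (n * (n - 1)))
    return prod / (p + q)
```

As a sanity check, when all inputs equal a, every factor is ((p+q)·a)^{1/(n(n−1))}, so the product is (p+q)·a and the result is a (idempotency).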
Definition 4. Let hj = ∪_{γj∈hj} {[γj^L, γj^U]} (j = 1, 2, …, n) be a group of IVHFEs. If

IVHFGBM^{p,q}(h1, h2, …, hn) = (1/(p+q)) ( ⊗_{i,j=1; i≠j}^{n} (p hi ⊕ q hj) )^{1/(n(n−1))}

then IVHFGBM is called the interval-valued hesitant fuzzy geometric Bonferroni mean, and the aggregated value is the IVHFE

∪_{γi∈hi, γj∈hj} { [ 1 − ( 1 − ∏_{i,j=1; i≠j}^{n} ( 1 − (1 − γi^L)^p (1 − γj^L)^q )^{1/(n(n−1))} )^{1/(p+q)} ,
1 − ( 1 − ∏_{i,j=1; i≠j}^{n} ( 1 − (1 − γi^U)^p (1 − γj^U)^q )^{1/(n(n−1))} )^{1/(p+q)} ] }    (4)
Figure 1. Scores for IVHFEs obtained by the IVHFGBM operator (p∊ (0, 10), q∊ (0, 10))
Figure 2 shows the changing trend of the scores for the aggregated IVHFEs based on the IVHFGBM operator when the two parameters are fixed.
Definition 5. Let hj (j = 1, 2, …, n) be a group of IVHFEs with weight vector w = (w1, w2, …, wn)^T, where wj ∈ [0, 1] and Σ_{j=1}^{n} wj = 1. The interval-valued hesitant fuzzy weighted geometric Bonferroni mean (IVHFWGBM) is defined as

IVHFWGBM^{p,q}(h1, h2, …, hn) = (1/(p+q)) ( ⊗_{i,j=1; i≠j}^{n} ( p (wi hi) ⊕ q (wj hj) ) )^{1/(n(n−1))}    (5)

and the aggregated value is the IVHFE

∪_{γi∈hi, γj∈hj} { [ 1 − ( 1 − ∏_{i,j=1; i≠j}^{n} ( 1 − (1 − (1 − γi^L)^{wi})^p (1 − (1 − γj^L)^{wj})^q )^{1/(n(n−1))} )^{1/(p+q)} ,
1 − ( 1 − ∏_{i,j=1; i≠j}^{n} ( 1 − (1 − (1 − γi^U)^{wi})^p (1 − (1 − γj^U)^{wj})^q )^{1/(n(n−1))} )^{1/(p+q)} ] }    (6)
In terms of the power operation hj^{wj}, the aggregated value can also be expressed as

∪_{γi∈hi, γj∈hj} { [ 1 − ( 1 − ∏_{i,j=1; i≠j}^{n} ( 1 − (1 − (γi^L)^{wi})^p (1 − (γj^L)^{wj})^q )^{1/(n(n−1))} )^{1/(p+q)} ,
1 − ( 1 − ∏_{i,j=1; i≠j}^{n} ( 1 − (1 − (γi^U)^{wi})^p (1 − (γj^U)^{wj})^q )^{1/(n(n−1))} )^{1/(p+q)} ] }    (7)
Example 3. Suppose there are three IVHFEs, h1 = {[0.31,0.45], [0.46,0.71]}, h2 = {[0.34,0.47]}, and h3 = {[0.23,0.35], [0.46,0.58], [0.65,0.73]}, and the weight vector of the three IVHFEs is (0.3, 0.4, 0.3)^T. Based on the IVHFWGBM operator, the aggregated IVHFE can be obtained once the values of p and q are assigned specific numbers. For example, when p = 0.1 and q = 10,
IVHFWGBM(h1, h2, h3)
= {[0.6512, 0.7377], [0.6829, 0.7648], [0.6831, 0.7650], [0.6528, 0.7390], [0.6861, 0.7672], [0.6864, 0.7673]}
and the score of the aggregated IVHFE is 0.7153.
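The reported score can be checked against the score function of Eq. (2), i.e. the mean of the interval midpoints:

```python
# Intervals of the aggregated IVHFE from Example 3.
intervals = [[0.6512, 0.7377], [0.6829, 0.7648], [0.6831, 0.7650],
             [0.6528, 0.7390], [0.6861, 0.7672], [0.6864, 0.7673]]

# Score = mean of the interval midpoints, Eq. (2).
s = sum((lo + hi) / 2 for lo, hi in intervals) / len(intervals)
print(round(s, 4))  # → 0.7153
```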
Example 4. Suppose there are four IVHFEs, h1 = {[0.2,0.4], [0.2,0.7]}, h2 = {[0.5,0.6], [0.3,0.6]}, h3 = {[0.3,0.5]}, and h4 = {[0.5,0.6], [0.3,0.6]}, and the weight vector of the four IVHFEs is (0.2, 0.3, 0.3, 0.2)^T. Based on the IVHFWGBM operator, the scores are shown in Figure 4 as the parameters p and q change from 0 to 10 simultaneously.
3. Conclusions
In this paper, we have extended the traditional BM and proposed the IVHFGBM and IVHFWGBM operators to aggregate IVHFEs. Numerical examples for these operators are also presented to show their practicality and effectiveness. In future research, we intend to consider extensions of some other BMs and study their relationships, and to apply the proposed operators to real application areas such as sustainable development evaluation, science and technology project review, and group decision making [14-16].
References
[1] J. J. Peng, J. Q. Wang, J. Wang, et al. Simplified neutrosophic sets and their applications in multi-criteria
group decision-making problems. International Journal of Systems Science, 47(2016), 2342-2358.
[2] D. Yu, D. F. Li and J. M. Merigó, Dual hesitant fuzzy group decision making method and its application
to supplier selection. International Journal of Machine Learning and Cybernetics, In press. DOI:
10.1007/s13042-015-0400-3
[3] H. Zhao, Z. Xu and S. Liu, Dual hesitant fuzzy information aggregation with Einstein t-conorm and t-
norm. Journal of Systems Science and Systems Engineering, In press. DOI: 10.1007/s11518-015-5289-6.
[4] X. F. Wang, J. Q. Wang and W. E. Yang. Group decision making approach based on interval-valued
intuitionistic linguistic geometric aggregation operators. International Journal of Intelligent
Information and Database Systems, 7(2013), 516-534.
[5] M. Xia, Z. Xu and N. Chen. Induced aggregation under confidence levels. International Journal of
Uncertainty, Fuzziness and Knowledge-Based Systems, 19(2011), 201-227.
44 X.-R. He et al. / Interval-Valued Hesitant Fuzzy Geometric BM Aggregation Operator
[6] G. Wei. Interval valued hesitant fuzzy uncertain linguistic aggregation operators in multiple attribute
decision making. International Journal of Machine Learning and Cybernetics, In press. DOI:
10.1007/s13042-015-0433-7
[7] H. Liu, Z. Xu and H. Liao. The multiplicative consistency index of hesitant fuzzy preference relation.
IEEE Transactions on Fuzzy Systems, 24(2016), 82-93.
[8] C. Bonferroni, Sulle medie multiple di potenze, Bolletino Matematica Italiana, 5 (1950), 267-270.
[9] M. M. Xia, Z. S. Xu, and B. Zhu. Geometric Bonferroni means with their application in multi-criteria
decision making. Knowledge-Based Systems, 40 (2013), 88-100.
[10] W. Zhou and J. M. He. Intuitionistic fuzzy geometric Bonferroni means and their application in multi-
criteria decision making. International Journal of Intelligent Systems, 27(2012), 995-1019.
[11] N. Chen, Z. S. Xu, and M. M. Xia. Interval-valued hesitant preference relations and their applications
to group decision making. Knowledge-Based Systems, 37(2013), 528-540.
[12] R. M. Rodríguez, B. Bedregal, H. Bustince, et al. A position and perspective analysis of hesitant fuzzy
sets on information fusion in decision making. Towards high quality progress. Information Fusion,
29(2016), 89-97.
[13] R. Pérez-Fernández, P. Alonso, H. Bustince, et al. Applications of finite interval-valued hesitant fuzzy
preference relations in group decision making. Information Sciences, 326(2016), 89-101.
[14] D. Yu. Group decision making under interval-valued multiplicative intuitionistic fuzzy environment
based on Archimedean t-conorm and t-norm. International Journal of Intelligent Systems, 30(2015),
590-616.
[15] D. Yu, W. Zhang and G. Huang. Dual hesitant fuzzy aggregation operators. Technological and
Economic Development of Economy, 22(2016), 194-209.
[16] W. Zhou and Z. S. Xu. Generalized asymmetric linguistic term set and its application to qualitative
decision making involving risk appetites. European Journal of Operational Research, 254(2016), 610-
621.
Fuzzy Systems and Data Mining II 45
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-45
Introduction
Decision making based on multi-criteria evaluation has been used with great success
for many applications. Most of these applications are characterized by high levels of
uncertainties and vague information. Fuzzy set theory has provided a useful way to
deal with vagueness and uncertainties in solving multi-criteria decision making
(MCDM) problems. During the last two decades, MCDM methods integrated with fuzzy sets have been one of the fastest growing research areas. Abdullah [1] presents a brief review of the categories in the integration of fuzzy sets and MCDM. In general, MCDM can be categorized into multi-attribute decision making (MADM) and multi-objective decision making (MODM). Naturally, a MADM problem involves multiple attributes, which represent the different dimensions from which the alternatives can be viewed by decision makers. Many fuzzy MADM methods have been discussed in the literature, and the fuzzy technique for order preference by similarity to ideal solution (TOPSIS) is among them.
1 Corresponding Author: Lazim ABDULLAH, School of Informatics and Applied Mathematics, Universiti Malaysia Terengganu; E-mail: lazim_m@umt.edu.my.
46 L. Abdullah and C.W.R.A.C.W. Kamal / A New Integrating SAW-TOPSIS
1. Proposed Method
This paper integrates the IT2 FSAW with IT2 FTOPSIS to establish a new MADM
method. In this proposed method, the IT2 FSAW is used to find weights of the criteria,
whereas IT2 FTOPSIS is used to establish preference of alternatives. The definitions
of IT2 FS [8], upper and lower memberships of IT2 FS [9], and ranking values of the
trapezoidal IT2 FS [10] are used in the proposed method. The detailed procedure of the
proposed method is described as follows.
Step 1: Construct the decision matrix $Y_p$ of the p-th decision maker and construct the average decision matrix $\bar{Y}$, respectively:

$$Y_p=(f_{ij}^{\,p})_{m\times n}=\begin{pmatrix} f_{11}^{\,p} & f_{12}^{\,p} & \cdots & f_{1n}^{\,p}\\ f_{21}^{\,p} & f_{22}^{\,p} & \cdots & f_{2n}^{\,p}\\ \vdots & \vdots & & \vdots\\ f_{m1}^{\,p} & f_{m2}^{\,p} & \cdots & f_{mn}^{\,p}\end{pmatrix},\qquad \bar{Y}=(\bar{f}_{ij})_{m\times n}, \qquad (1)$$

where $\bar{f}_{ij}=\dfrac{f_{ij}^{1}\oplus f_{ij}^{2}\oplus\cdots\oplus f_{ij}^{k}}{k}$; the rows $f_1, f_2, \ldots, f_m$ represent the criteria and the columns $x_1, x_2, \ldots, x_n$ represent the alternatives.
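The averaging in Step 1 is an element-wise mean of the k individual decision matrices. The sketch below uses plain numeric scores standing in for the IT2 fuzzy ratings (which average component-wise in the same way); the data is hypothetical:

```python
def average_decision_matrix(matrices):
    """Element-wise average of the k decision matrices Y_1, ..., Y_k (Eq. (1))."""
    k = len(matrices)
    m, n = len(matrices[0]), len(matrices[0][0])
    return [[sum(Y[i][j] for Y in matrices) / k for j in range(n)]
            for i in range(m)]

# two decision makers, two criteria (rows) and two alternatives (columns)
Y1 = [[0.7, 0.9], [0.3, 0.5]]
Y2 = [[0.9, 0.7], [0.5, 0.5]]
Ybar = average_decision_matrix([Y1, Y2])
```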
Step 2: Construct the aggregated fuzzy weight matrix $\bar{W}$ from the weighting matrices $W_p$ of the attributes provided by the p-th decision maker.

Let $w_i^{\,p}=(a_i,b_i,c_i,d_i)$, $i=1,2,\ldots,m$, be the linguistic weight given to the subjective criteria $C_1, C_2, \ldots, C_h$ and the objective criteria $C_{h+1}, \ldots, C_n$ by decision maker $D_p$:

$$W_p=(w_i^{\,p})_{1\times m}=\bigl[\,w_1^{\,p}\ \ w_2^{\,p}\ \ \cdots\ \ w_m^{\,p}\,\bigr], \qquad (2)$$

$$\bar{W}=(\bar{w}_i)_{1\times m}, \qquad (3)$$

where $\bar{w}_i=\dfrac{w_i^{1}\oplus w_i^{2}\oplus\cdots\oplus w_i^{k}}{k}$ and $\bar{w}_i$ is an interval type-2 fuzzy set.
The positive ideal solution $x^{+}=(v_1^{+},v_2^{+},\ldots,v_m^{+})$ over the alternatives $x_1, x_2, \ldots, x_n$ is given by

$$v_i^{+}=\begin{cases}\max_{1\le j\le n}\{\mathrm{Rank}(\bar{v}_{ij})\}, & f_i\in F_1,\\ \min_{1\le j\le n}\{\mathrm{Rank}(\bar{v}_{ij})\}, & f_i\in F_2,\end{cases} \qquad (8)$$

and the negative ideal solution $x^{-}=(v_1^{-},v_2^{-},\ldots,v_m^{-})$ by

$$v_i^{-}=\begin{cases}\min_{1\le j\le n}\{\mathrm{Rank}(\bar{v}_{ij})\}, & f_i\in F_1,\\ \max_{1\le j\le n}\{\mathrm{Rank}(\bar{v}_{ij})\}, & f_i\in F_2,\end{cases} \qquad (9)$$

where $F_1$ denotes the set of benefit attributes and $F_2$ denotes the set of cost attributes. The distances of each alternative $x_j$ from the two ideal solutions are

$$d^{+}(x_j)=\sqrt{\sum_{i=1}^{m}\bigl(\mathrm{Rank}(\bar{v}_{ij})-v_i^{+}\bigr)^{2}}, \qquad (10)$$

$$d^{-}(x_j)=\sqrt{\sum_{i=1}^{m}\bigl(\mathrm{Rank}(\bar{v}_{ij})-v_i^{-}\bigr)^{2}}, \qquad (11)$$

and the closeness coefficient is

$$C(x_j)=\frac{d^{-}(x_j)}{d^{+}(x_j)+d^{-}(x_j)}, \qquad (12)$$

where $1\le j\le n$.

Step 8: Arrange the values of $C(x_j)$ in descending order; a larger value of $C(x_j)$ indicates a higher preference for the alternative $x_j$.
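The closing steps (Eqs. (8)–(12)) can be sketched as follows, taking as input a matrix of already-computed ranking values Rank(v̄_ij). The function name and data are ours, not the paper's; `benefit[i]` marks whether attribute f_i belongs to F1. The sketch assumes the alternatives are not all identical on every attribute, so the denominator in Eq. (12) is nonzero.

```python
import math

def closeness(rank, benefit):
    """rank[i][j]: ranking value of alternative x_j on attribute f_i."""
    m, n = len(rank), len(rank[0])
    v_pos = [max(rank[i]) if benefit[i] else min(rank[i]) for i in range(m)]  # Eq. (8)
    v_neg = [min(rank[i]) if benefit[i] else max(rank[i]) for i in range(m)]  # Eq. (9)
    coeffs = []
    for j in range(n):
        d_pos = math.sqrt(sum((rank[i][j] - v_pos[i]) ** 2 for i in range(m)))  # Eq. (10)
        d_neg = math.sqrt(sum((rank[i][j] - v_neg[i]) ** 2 for i in range(m)))  # Eq. (11)
        coeffs.append(d_neg / (d_pos + d_neg))                                  # Eq. (12)
    return coeffs

# three benefit attributes, two alternatives (hypothetical ranking values)
c = closeness([[0.2, 0.8], [0.4, 0.6], [0.5, 0.5]], [True, True, True])
```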
2. Numerical Example
For the purpose of illustration and to show the feasibility of the proposed method, an example retrieved from Chou et al. [5] is presented. Researchers intend to identify facility location alternatives to build a new plant. The team has identified three alternatives: alternative 1 (A1), alternative 2 (A2) and alternative 3 (A3). To determine the best alternative site, a committee of four decision makers is created: decision maker 1 (D1), decision maker 2 (D2), decision maker 3 (D3) and decision maker 4 (D4). Three selection criteria are deliberated: transportation availability (C1), availability of skilled workers (C2) and climatic conditions (C3). Table 1 shows the linguistic terms used to rate criteria with respect to alternatives, together with the weights for the criteria.
Table 1. Linguistic terms and IT2 FS
Linguistic Terms Interval Type-2 Fuzzy Sets
Very Poor (VP) ((0,0,0,0.1;1,1),(0,0,0,0.05;0.9,0.9))
Poor (P) ((0.0,0.1,0.1,0.3;1,1),(0.05,0.1,0.1,0.2;0.9,0.9))
Medium Poor (MP) ((0.1,0.3,0.3,0.5;1,1),(0.2,0.3,0.3,0.4;0.9,0.9))
Fair (F) ((0.3,0.5,0.5,0.7;1,1),(0.4,0.5,0.5,0.6;0.9,0.9))
Medium Good (MG) ((0.5,0.7,0.7,0.9;1,1),(0.6,0.7,0.7,0.8;0.9,0.9))
Good (G) ((0.7,0.9,0.9,1;1,1),(0.8,0.9,0.9,0.95;0.9,0.9))
Very Good (VG) ((0.9,1,1,1;1,1),(0.95,1,1,1;0.9,0.9))
Based on the ratings given by the decision makers, the example is solved using the proposed method. The final degrees of closeness and the preference order are shown in Table 2.
Table 2. Degree of closeness and preference
Degree of closeness    Preference order
C(A1) 0.4112           3
C(A2) 0.4605           2
C(A3) 0.4778           1
It can be seen that the preference order of the alternatives is A3 ≻ A2 ≻ A1; the proposed method therefore identifies A3 as the best alternative. This preference is slightly inconsistent with the result obtained using the FSAW method, where the preference order is A2 ≻ A3 ≻ A1.
3. Conclusions
This paper proposed a novel method that integrates IT2 FSAW and IT2 FTOPSIS to solve MADM problems. Decision makers used interval type-2 linguistic variables to assess the importance of the criteria. The ranking weighted decision matrix obtained from IT2 FSAW was then used as an input to IT2 FTOPSIS, where the ideal solutions could be computed. Finally, the preference of alternatives was obtained as a result of the integrated method. To illustrate the feasibility of the proposed method, a numerical example formerly solved using the FSAW method was considered. The results showed that A3 is the most preferred alternative. A detailed comparative analysis between the results obtained using the integrated method and other decision making methods is left for future research. Future research may also include a sensitivity analysis in which the uncertainty of the final preference of the integrated model can be investigated.
Acknowledgments
This work is part of the research grant project FRGS 59389. We acknowledge the financial support provided by the Malaysian Ministry of Education and Universiti Malaysia Terengganu.
References
[1] L. Abdullah, Fuzzy Multi Criteria Decision Making and its Application: A Brief Review of Category.
Procedia-Social and Behavioral Sciences, 97 (2013), 131-136.
[2] S. Vinodh, M. Prasanna, N. Hari Prakash, Integrated fuzzy AHP-TOPSIS for selecting the best plastic
recycling method: A case study. Applied Mathematical Modelling, 39 (2014),4662-4672.
[3] K. Rezaie, S. S. Ramiyani, S Nazari-Shirkouhi, A. Badizadeh, Evaluating performance of Iranian
cement firms using an integrated fuzzy AHP-VIKOR method. Applied Mathematical Modelling, 38
(2014), 5033-5046.
[4] T. Wang, J. Liu, J. Li, C. Niu, An integrating OWA–TOPSIS framework in intuitionistic fuzzy settings
for multiple attribute decision making, Computers & Industrial Engineering, 98(2016), 185-194.
[5] M. G. Kharat, S. J. Kamble, R. D. Raut, S. S Kamble, S. M. Dhume, Modeling landfill site selection
using an integrated fuzzy MCDM approach . Earth Systems and Environment, 2(2016), 53.
[6] D. Pamučar, G. Ćirović, The selection of transport and handling resources in logistics centers using
Multi-Attributive Border Approximation area Comparison (MABAC), Expert Systems with
Applications, 42(2015), 3016-3028.
[7] M. Tavana, E. Momeni, N. Rezaeiniya, S. M. Mirhedayatian, H. Rezaeiniya, A novel hybrid social media
platform selection model using fuzzy ANP and COPRAS-G, Expert Systems with Applications,
40(2013), 5694-5702.
[8] Y. C. Chang, S. M. Chen, A new fuzzy interpolative reasoning method based on interval type-2 fuzzy
sets. IEEE International Conference on Systems, Man and Cybernetics, (2008), 82-87.
[9] J. M. Mendel, R. I., John, F. Liu, Interval Type-2 Fuzzy Logic Systems Made Simple. IEEE
Transactions of Fuzzy Systems, 14 (2006), 808-821.
[10] L. Lee, S. Chen, Fuzzy Multiple Attributes Group Decision-Making Based On The Extension Of
TOPSIS Method And Interval Type-2 Fuzzy Sets. Proceedings of the Seventh International Conference on
Machine Learning and Cybernetics, (2008), 3260-3265.
[11] J. S.Yao, K. Wu, Ranking fuzzy number based on decomposition principle and signed distance. Fuzzy
Sets and Systems, 116(2000), 275-288.
Fuzzy Systems and Data Mining II 51
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-51
Abstract. In this paper, we focus on describing the oscillation period and index of a fuzzy tensor. The definition of the induced third-order fuzzy tensor is proposed. Using this notion, the oscillation period and index of a fuzzy tensor are first obtained on the basis of the Power Method with max-min operation. Secondly, we rely on the Minimal Strong Component method to find the oscillation period of a fuzzy tensor; this graph-theoretic method is more practical when the number of nonzero elements is less than half of the total number of fuzzy tensor elements. Furthermore, numerical results demonstrate that the two algorithms, the Power Method and the Minimal Strong Component method, are effective and promising for solving the period and index of a fuzzy tensor.
Introduction
In fuzzy mathematics, the study of fuzzy matrices is very complex but quite important, since they have a wide range of applications, especially in fuzzy control and fuzzy decision making. A fuzzy control system can reach a stable state in limited time, and its stability can be studied using the periodicity of fuzzy matrices. In order to study multi-objective fuzzy decision making and dynamic multi-objective fuzzy control, it is necessary to investigate higher order forms of the fuzzy matrix.

Periodicity is one of the most important characteristics of fuzzy matrices. Thomason [1] first studied the powers of a fuzzy matrix with convergence period or oscillation period. Fan and Liu [2] concluded that the period of a fuzzy matrix is equal to the least common multiple of the periods of its cut matrices. Li [3] discussed the periodicity of fuzzy matrices in the general case. Liu and Ji [4] described the periodicity of square fuzzy matrices; furthermore, they refined the upper bound on the convergence index of powers of a fuzzy matrix and obtained the greatest period of any square fuzzy matrix, thereby solving the problem of estimating the period of a general fuzzy matrix.
1 Corresponding Author: Ling CHEN, School of Mathematical Sciences, Guizhou Normal University,
Owing to the literature [5,6,7], many practical problems can nowadays be modeled as tensor problems. A vector is a first-order tensor and a matrix is a second-order tensor; a tensor of order three or higher is called a higher order tensor. Ding et al. [8] explained a fast Hankel tensor-vector product algorithm and its application to exponential data fitting. Song and Qi [9] considered infinite and finite dimensional Hilbert tensors and researched their periodicity. Generalizing the tensor to the fuzzy tensor is therefore practical and meaningful.

In this paper, we deal with the oscillation period and index of fuzzy tensors with max-min operation. We find the oscillation period and index of a fuzzy tensor by the Power Method and by the Minimal Strong Component method in Sections 1 and 2, respectively. Our numerical examples show the feasibility of the two proposed algorithms. Finally, Section 3 gives the conclusions.
In this section, we first describe some concepts and results about fuzzy matrices from the literature [1,2,3,4,10,11,12,13], which will be used below. We then give the definition of the fuzzy tensor, and analyze the periodicity and index of fuzzy tensors.

Let A = (a_ij) and B = (b_ij) be n × n fuzzy matrices. We have the following product definition: A × B = C = (c_ij) = (⋁_{k=1}^{n} (a_ik ∧ b_kj)), where a_ij ∧ b_ij = min{a_ij, b_ij} and a_ij ∨ b_ij = max{a_ij, b_ij}, and A^{k+1} = A^k × A, k = 1, 2, · · · . A = B if a_ij = b_ij for all i, j ∈ {1, 2, · · · , n}.

Consider a finite number of fuzzy matrices A_1, A_2, · · · , A_n with any A_i ∈ F^{n×n}, where F^{n×n} denotes the set of all n × n fuzzy matrices. We write F = {A_1, A_2, · · · , A_n}.

Let Z⁺ = {x | x is a positive integer} and [n] be the least common multiple of 1, 2, · · · , n.
Referring to the relevant literature [1,2,3,4,11], for convenience in application, we propose an equivalent definition of the oscillation period and the index of a fuzzy matrix.
Remark 1. The possible range of the period of a fuzzy matrix is from 1 to [n], that is, 1 ≤ d ≤ [n] and d|[n]. If d = 1, we say A is convergent.
For our purposes, throughout this paper, we always consider i_1, · · · , i_m to have the same dimension.
From the above definition of the fuzzy tensor, a fuzzy tensor is clearly a higher order generalization of a fuzzy matrix, and is also a tensor extension of the characteristic function. Next, we discuss the third-order clustering of a fuzzy tensor by using the slice-by-slice method. For a fuzzy tensor, we obtain two-dimensional sections by fixing all indices
L. Chen and L.-Z. Lu / Algorithms for Finding Oscillation Period of Fuzzy Tensors 53
except for two indices. Each slice is a fuzzy matrix. Fixing all indices but three, we will define the induced third-order fuzzy tensor.
Definition 3. Let A = (a_{i1···im}) be an order m, dimension n fuzzy tensor. Multiple third-order fuzzy tensor clusterings (A_{ij ik ih}, A) of A are constructed by fixing all but three indices. We call A_{ij ik ih} the induced third-order fuzzy tensor of A, where i_j, i_k, i_h ∈ {i_1, · · · , i_m}.
By the third-order clustering theory, the study of the period and index of a higher order fuzzy tensor is converted into the study of third-order fuzzy tensors. A third-order fuzzy tensor has horizontal, lateral and frontal slices, and each direction contains a set of fuzzy matrices. From an order m, dimension n fuzzy tensor we obtain C_m^3 n^{m−3} induced third-order fuzzy tensors and 3C_m^3 n^{m−3} sets of fuzzy matrix sequences. Figure 1 shows the horizontal, lateral and frontal slices of the third-order fuzzy tensor A_{ij ik ih}, denoted by A_{ij::}, A_{:ik:} and A_{::ih}, respectively.
On the whole, it is far more intuitive and simpler to investigate a higher order fuzzy tensor with the help of the geometric significance of third-order fuzzy tensors. Furthermore, it is convenient to apply them in various fields.
Now, we introduce the period of the induced third-order fuzzy tensors and of the given fuzzy tensor. The following result follows immediately from Definition 1. From the geometric significance of third-order fuzzy tensors, we can easily state the main conclusion as follows.
Theorem 2. Let A_{ij ik ih} be the induced third-order fuzzy tensor of an order m, dimension n fuzzy tensor A. Suppose d, d_{ij}, d_{ik}, d_{ih} and k, k_{ij}, k_{ik}, k_{ih} are the oscillation periods and indices of A, A_{ij::}, A_{:ik:} and A_{::ih}, respectively. Then
d = l.c.m.[d1, · · · , dn], k = max{k1, · · · , kn}.
Proof. The theorem can be proved using the block fuzzy matrix theory, as in Theorem 1.
Clearly, based on Definition 3 and Theorem 2, we have the following result.
Theorem 3. Let A = (a_{i1···im}) be an order m, dimension n fuzzy tensor with third-order clusterings (A_{ij ik ih}, A), where A_{ij ik ih} is the induced third-order fuzzy tensor of A. Then the oscillation period D of the fuzzy tensor A is the least common multiple of the oscillation periods of all the induced third-order fuzzy tensors, and the index K of A is the maximum of the indices of all the induced third-order fuzzy tensors.
Example 1. Let A be a 4-order fuzzy tensor with dimension four, defined by Table 1. For m = 4, we have the induced 3-order fuzzy tensors A_{i1 i2 i3}, A_{i1 i2 i4}, A_{i1 i3 i4} and A_{i2 i3 i4}. For A_{i1 i2 i3}, if i4 = 1, the induced 3-order fuzzy tensor A_{i1 i2 i3 1} contains the data denoted by A_{i1 i2 i3 1} = (A(:, :, 1, 1), A(:, :, 2, 1), A(:, :, 3, 1), A(:, :, 4, 1)), and we obtain three sets of fuzzy matrices F_1^1, F_2^1, F_3^1 by fixing one index in turn among i1, i2, i3, where F_i^1 = {A_1, A_2, A_3, A_4}, i = 1, 2, 3.
Consider all the fuzzy matrices F_i^1 (i = 1, 2, 3) by Definition 1 and Theorem 1: d_{F_1^1} = [1, 1, 1, 1] = 1, k_{F_1^1} = max{4, 4, 4, 4} = 4; d_{F_2^1} = [2, 1, 1, 1] = 2, k_{F_2^1} =
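The Power Method side of these computations can be sketched for a single fuzzy matrix: form the max-min powers A, A², A³, … until a power repeats. If the first repetition gives A^{k+d} = A^k with minimal k and d, then d is the oscillation period and k is the index. The code below is our own illustration of this idea, not the paper's algorithm; termination is guaranteed because the entries of every power come from the finite set of entries of A.

```python
def maxmin_mult(A, B):
    """Max-min (composition) product of two n x n fuzzy matrices."""
    n = len(A)
    return [[max(min(A[i][k], B[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]

def period_index(A):
    """Oscillation period d and index k of A under max-min powers."""
    seen = {}                       # matrix -> first power at which it appeared
    P, t = [row[:] for row in A], 1
    while True:
        key = tuple(map(tuple, P))
        if key in seen:
            k = seen[key]
            return t - k, k         # period d = t - k, index k
        seen[key] = t
        P, t = maxmin_mult(P, A), t + 1

# powers of this matrix alternate A, I, A, I, ...: period 2, index 1
d, k = period_index([[0.0, 1.0], [1.0, 0.0]])
```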
In this section, using graph theory tools, we give a method to find the oscillation period of a fuzzy tensor. When m and n are not large and the number of nonzero elements is less than half of the total number of fuzzy tensor elements, the minimal strong component method is simpler than the Power Method for finding the oscillation period and does not need much calculation. The following definition is from [11].
Let Φ_A denote the set of all nonzero elements of a fuzzy matrix A. For any λ ∈ Φ_A, we call A_λ = ((a_λ)_ij) the cut matrix of A, where (a_λ)_ij = 1 if a_ij ≥ λ, and (a_λ)_ij = 0 otherwise.
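The cut matrices follow directly from this definition; a small illustration (the matrix below is hypothetical):

```python
def cut_matrix(A, lam):
    """Boolean cut matrix A_lambda: entry 1 where a_ij >= lambda, else 0."""
    return [[1 if a >= lam else 0 for a in row] for row in A]

def levels(A):
    """The set Phi_A of distinct nonzero elements of A, in descending order."""
    return sorted({a for row in A for a in row if a > 0}, reverse=True)

A = [[0.5, 0.0], [0.3, 0.4]]
```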
We follow [14,15,4] to express the period of a Boolean matrix by strong components and the period of a fuzzy matrix by minimal strong components. Furthermore, we shall find the period of a fuzzy tensor based on the minimal strong components.
According to the above discussion, we can develop the following algorithm for the oscillation period of a fuzzy tensor by minimal strong components.
Figure 2. Digraphs of (a) D_{0.5}, (b) D_{0.4} and (c) D_{0.3}.
Example 2. Let A be a third order, six dimensional fuzzy tensor A = (A(:, :, 1), A(:, :, 2), A(:, :, 3)) defined by Table 2.
For A(:, :, 1) (see Figure 2), we have λ1 = 0, λ2 = 0.3, λ3 = 0.4, λ4 = 0.5, and the digraphs D_{0.5}, D_{0.4} and D_{0.3} can be represented as shown in the figure.
In D_{0.5} there is only one strong component S1 = {a1}. In D_{0.4} there is one strong component S2 = {a1, a3}. In D_{0.3} there are two strong components S3 = {a1, a2, a3} and S4 = {a4, a5}.
Notice that S4 is a strong component that has no common vertices with S1, S2, S3. Hence, we say that S4 is a newly appeared strong component. Moreover, the set of minimal strong components of the fuzzy matrix A(:, :, 1) is Ω = {S1, S4}. Then d(A(:, :, 1)) = [d(S1), d(S4)] = [1, 2] = 2.
Considering A(:, :, 2) and A(:, :, 3), we have d(A(:, :, 2)) = [2, 3] = 6 and d(A(:, :, 3)) = [1, 2, 2] = 2. Then d(A) = [d(A(:, :, 1)), d(A(:, :, 2)), d(A(:, :, 3))] = [2, 6, 2] = 6.
This example illustrates the great advantage of Algorithm 2: the oscillation period of a sparse fuzzy matrix can be found using only its directed graphs, without troublesome calculations.
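Combining the slice periods, as in the last line of Example 2, is a plain least-common-multiple computation:

```python
from math import lcm            # Python 3.9+
from functools import reduce

def tensor_period(slice_periods):
    """Oscillation period of the tensor: lcm of its slice periods."""
    return reduce(lcm, slice_periods)

d = tensor_period([2, 6, 2])    # the slice periods found in Example 2
```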
3. Conclusions
In this paper, we proposed the fuzzy tensor, a new class of nonnegative tensor that is a higher order form of the fuzzy matrix. We gave the definition of the induced third-order fuzzy tensor, which has the advantage of intuitive geometric significance. Based on these concepts, we investigated the oscillation period and index of fuzzy tensors with the help of the Power Method and the Minimal Strong Component method, respectively. Our numerical results showed that the two methods are feasible and favourable. Hence, it is worthwhile to research further properties of fuzzy tensors, and in the future we will continue to study all aspects of the fuzzy tensor.
Acknowledgements
The work of the first author was supported by the Innovation Foundation of Guizhou Normal University for Graduate Students (201529, 201528) and the Shandong Province College's Outstanding Young Teachers Domestic Visiting Scholar Program (2013). The work of the second author was supported by the National Science Foundation of China (Grant No. 11261012).
References
[1] M.G.Thomason, Convergence of powers of a fuzzy matrix, Journal of Mathematical Analysis and Appli-
cations, 57(1977), 476–480.
[2] Z.T.Fan, D.F.Liu, On the oscillating power sequence of a fuzzy matrix, Fuzzy Sets and Systems, 93(1998),
75–85.
[3] J.X.Li, Periodicity of powers of fuzzy matrices, Fuzzy Sets and Systems, 48(1992), 365–369.
[4] W.B.Liu, Z.J.Ji, The periodicity of square fuzzy matrices based on minimal strong components, Fuzzy
Sets and Systems, 126(2002), 233–240.
[5] L.Q.Qi, Eigenvalues of a real supersymmetric tensor, Journal of Symbolic Computation, 40(2005), 1302–
1324.
[6] L.H.Lim, Singular values and eigenvalues of tensors: A variational approach, Proceeding of the IEEE
Internatinal Workshop on Computation advances in multi-tensor adaptive processing, 1(2005), 129–132.
[7] T.G.Kolda , B.W.Bader, Tensor decomposition and applications, SIAM Review, 51(2009), 455–500.
[8] W.Y.Ding, L.Q.Qi, Y.M.Wei, Fast Hankel tensor-vector product and its application to exponential data
fitting, Linear Algebra and its Applications, 22(2015), 814–832.
[9] Y.Song, L.Q.Qi, Infinite and finite dimensional Hilbert tensors, Linear Algebra and its Applications,
451(2014), 1–14.
[10] C.Z.Luo, Introduction to fuzzy sets (Vol.1), Beijing Normal University Press,Beijing,(In Chinese), 1989.
[11] Z.T.Fan, D.F.Liu, On the power sequence a fuzzy matrix-Convergent power sequence, Journal of Com-
putational and Applied Mathematics, 4(1997), 147–165.
[12] L.A.Zadeh, Fuzzy sets, Information and Control, 8(1965), 338–353.
[13] S.G.Guu, Y.Y.Lur, C.T.Pang, On infinite products of fuzzy matrices, SIAM Journal on Matrix Analysis
and Applications, 22(2001), 1190–1203.
[14] B.De.Schutter, B.DE.Moor, On the sequence of consecutive powers of a matrix in a Boolean algebra,
SIAM Journal on Matrix Analysis and Applications, 21(1999), 328–354.
[15] K.H.Kim, Boolean matrix theory and application, Marcel Dekker, New York, 1982.
58 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-58
Introduction
As a classic combinatorial problem, the minimum cost flow problem has a wide range of applications and ramifications. In the logistics industry, it is common for decision makers to generate a plan to optimally transport damageable items from multiple sources to multiple destinations through transshipment stations. Furthermore, impreciseness in defining parameters, such as the cost per unit on a route, is another commonly encountered problem in realistic environments. Therefore, this paper is devoted to solving this problem.
With respect to the fuzzy minimum cost flow problem [1], many fruitful outcomes exist. In the fuzzy minimum cost flow problem proposed by Shih and Lee [2], the cost parameter and capacity constraints are taken as fuzzy numbers. In addition, they proposed a fuzzy multiple objective minimum cost flow problem and used minimization of the total passing time as the second objective in an example. Ding proposed an α-minimum cost flow problem to deal with uncertain capacities [3]. However, few studies refer to the adaptation of this problem to damageable items transportation. A closely related problem is the multi-objective, multi-item intuitionistic fuzzy solid transportation problem for damageable items, which was proposed by Chakraborty et al. [4]. To defuzzify the imprecise parameters, we use the k-preference integration method, the area compensation method and the signed distance method, respectively. Computations to solve the problem are done using Wolfram Mathematica 9.
1 Corresponding Author: Si-Chao LU, School of Traffic and Transportation, Beijing Jiaotong University, Beijing, China; E-mail: lusichao@163.com.
S.-C. Lu and X.-F. Wang / Toward a Fuzzy Minimum Cost Flow Problem 59
The remainder of the paper is organized as follows. The next section offers a brief introduction to fuzzy numbers and three defuzzification methods. The mathematical model of the fuzzy minimum cost flow problem for damageable items is proposed in Section 2. A simulated problem instance is given and solved in Section 3. Finally, the paper is concluded in Section 4.
1. Fuzzy Preliminaries
The membership function of a triangular fuzzy number (TFN) Ã = (a_1, a_2, a_3) is

$$\mu_{\tilde{A}}(x)=\begin{cases}0, & x\le a_1,\\ \dfrac{x-a_1}{a_2-a_1}, & a_1\le x\le a_2,\\ \dfrac{a_3-x}{a_3-a_2}, & a_2\le x\le a_3,\\ 0, & a_3\le x.\end{cases} \qquad (2)$$

Figure 1. A triangular fuzzy number.
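The triangular membership function of Eq. (2) can be sketched directly (assuming a_1 < a_2 < a_3):

```python
def tfn_membership(x, a1, a2, a3):
    """Membership degree of x in the TFN (a1, a2, a3), per Eq. (2);
    assumes a1 < a2 < a3."""
    if x <= a1 or x >= a3:
        return 0.0
    if x <= a2:
        return (x - a1) / (a2 - a1)   # rising left branch
    return (a3 - x) / (a3 - a2)       # falling right branch
```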
The k-preference integration method was introduced by Chen and Hsieh [6]. Based on this method, the k-preference integration representation of a general TFN Ã = (a_1, a_2, a_3) is defined as:

$$P_k(\tilde{A})=\frac{\int_{0}^{1}h\bigl[kL^{-1}(h)+(1-k)R^{-1}(h)\bigr]\,dh}{\int_{0}^{1}h\,dh}=\frac{1}{3}\bigl[ka_1+2a_2+(1-k)a_3\bigr] \qquad (3)$$
From Eq. (3), it can be seen that the k-preference integration is fairly flexible compared with other defuzzification methods, because the value of k is determined by the decision maker. It has been used to handle the fuzzy cold storage problem [7] and the constrained knapsack problem in a fuzzy environment [8].
If k = 0.5, the result generated by the k-preference integration method is the same as that obtained by the graded mean integration (GMI) method, which was introduced by Chen and Hsieh [9].
Based on the area compensation method [10], the TFN Ã = (a_1, a_2, a_3) can be defuzzified as:

$$\Phi_A(\tilde{A})=\frac{\int_{a_1}^{a_2}x\,\mu_{\tilde{A}}(x)\,dx+\int_{a_2}^{a_3}x\,\mu_{\tilde{A}}(x)\,dx}{\int_{a_1}^{a_2}\mu_{\tilde{A}}(x)\,dx+\int_{a_2}^{a_3}\mu_{\tilde{A}}(x)\,dx}=\frac{(a_3-a_1)(a_1+a_2+a_3)/6}{(a_3-a_1)/2}=\frac{a_1+a_2+a_3}{3} \qquad (4)$$
1
The left and the right α-cuts of the TFN A = (a1 , a2 , a3 ) are L-1 (α) = a+(b-a)α and
R-1 (α) = c-(c-b)α [11]. Based on the ranking system for fuzzy numbers proposed by
is defined as follows:
Yao and Wu [12], the signed distance of the A
1 1
1 1 1
,0) = [L-1 (α)+R-1 (α)] dα = [a+(b-a)α+c-(c-b)α] dα= (a+2b+c)
d(A (5)
2 0 2 0 4
Shekarian et al. [11] combined this method with existing economic production
quantity models to find optimal production quantities.
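The three defuzzification methods each reduce a TFN (a_1, a_2, a_3) to a crisp value in closed form; a side-by-side sketch of Eqs. (3)–(5):

```python
def k_preference(a1, a2, a3, k):
    """Eq. (3): k-preference integration representation."""
    return (k * a1 + 2 * a2 + (1 - k) * a3) / 3

def area_compensation(a1, a2, a3):
    """Eq. (4): centroid of the TFN."""
    return (a1 + a2 + a3) / 3

def signed_distance(a1, a2, a3):
    """Eq. (5): signed distance of the TFN from 0."""
    return (a1 + 2 * a2 + a3) / 4
```

With k = 0.5, k_preference reduces to (a_1 + 4a_2 + a_3)/6, the GMI value, as noted above.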
2. Mathematical Formulation
The fuzzy minimum cost flow problem for damageable items transportation blends the
fuzzy set theory and the minimum cost flow problem. The objective of the proposed
problem is to minimize the total cost of sending the available supply through
transshipment nodes to satisfy the demand. It is also necessary to introduce constraints
that guarantee the feasibility of flows.
Let G = (N, A) be a directed network with node set N = {1, 2, 3, …, n} and arc set A. Each arc a_ij ∈ A stands for a route and has a positive upper bound capacity ũ_ij and a positive cost c̃_ij. Both ũ_ij = (u_ij^l, u_ij, u_ij^r) and c̃_ij = (c_ij^l, c_ij, c_ij^r) are taken as triangular fuzzy numbers, because some vehicles may provide a small degree of leeway in capacity [5] and the transportation cost of each route tends to vary. Each node i ∈ N has a value b_i, which represents the nature of node i: if node i is a supply node then b_i > 0; if node i is a demand node then b_i < 0; if node i is a transshipment node then b_i = 0. We use a TFN α̃_ij = (α_ij^l, α_ij, α_ij^r) to denote the percentage of unit damaged products on the route a_ij due to physical vibration caused by bad road conditions or improper driving behaviors, etc. x_ij is the decision variable denoting the flow quantity through the route a_ij.
Based on the above descriptions, the mathematical formulation can be developed as follows.
min Z = ∑_{i=1}^{n} ∑_{j=1}^{n} c̃_ij x_ij   (6)
s.t.
∑_{j=1}^{n} x_ij − ∑_{j=1}^{n} (1 − α̃_ji) x_ji = b_i, ∀i ∈ {i | b_i ≥ 0}   (7)
∑_{j=1}^{n} x_ij − ∑_{j=1}^{n} (1 − α̃_ji) x_ji ≤ b_i, ∀i ∈ {i | b_i < 0}   (8)
0 ≤ x_ij ≤ ũ_ij, ∀i, ∀j   (9)
∑_{i=1}^{n} b_i − ∑_{i=1}^{n} ∑_{j=1}^{n} α̃_ij x_ij ≥ 0   (10)
Here (6) indicates the cost minimization objective function. Constraints (7) and (8) represent the net flow of node i under two different situations, respectively. In addition, constraint (8) implies that demand nodes can be supplied with excess items. Constraint (9) ensures that the amount of transported damageable items on route a_ij is less than or equal to the capacity of that route. Constraint (10) guarantees that the total amount of items provided by the supply nodes is no less than the amount of damaged items plus the total amount of items required by the demand nodes.
Based on the k-preference integration method, Eqs. (6)-(10) can be redefined as follows, where k_c, k_α, and k_u can be set separately according to the decision maker's preference.
min Z = Σ_{i=1}^n Σ_{j=1}^n [k_c c^l_ij + 2c_ij + (1 − k_c) c^r_ij] x_ij / 3    (11)
s.t.
Σ_{j=1}^n x_ij − Σ_{j=1}^n {1 − [k_α α^l_ji + 2α_ji + (1 − k_α) α^r_ji] / 3} x_ji = b_i,  ∀i ∈ {i | b_i ≥ 0}    (12)
Σ_{j=1}^n x_ij − Σ_{j=1}^n {1 − [k_α α^l_ji + 2α_ji + (1 − k_α) α^r_ji] / 3} x_ji ≤ b_i,  ∀i ∈ {i | b_i < 0}    (13)
0 ≤ x_ij ≤ [k_u u^l_ij + 2u_ij + (1 − k_u) u^r_ij] / 3,  ∀i, ∀j    (14)
Σ_{i=1}^n b_i − Σ_{i=1}^n Σ_{j=1}^n [k_α α^l_ij + 2α_ij + (1 − k_α) α^r_ij] x_ij / 3 ≥ 0    (15)
Applying the area compensation method, Eq. (6)-Eq. (10) can be written in the
following form:
min Z = Σ_{i=1}^n Σ_{j=1}^n (c^l_ij + c_ij + c^r_ij) x_ij / 3    (16)
s.t.
Σ_{j=1}^n x_ij − Σ_{j=1}^n [1 − (α^l_ji + α_ji + α^r_ji) / 3] x_ji = b_i,  ∀i ∈ {i | b_i ≥ 0}    (17)
Σ_{j=1}^n x_ij − Σ_{j=1}^n [1 − (α^l_ji + α_ji + α^r_ji) / 3] x_ji ≤ b_i,  ∀i ∈ {i | b_i < 0}    (18)
0 ≤ x_ij ≤ (u^l_ij + u_ij + u^r_ij) / 3,  ∀i, ∀j    (19)
Σ_{i=1}^n b_i − Σ_{i=1}^n Σ_{j=1}^n (α^l_ij + α_ij + α^r_ij) x_ij / 3 ≥ 0    (20)
Similarly, with the help of the signed distance method, Eq. (6)-Eq. (10) can be
expressed as:
min Z = Σ_{i=1}^n Σ_{j=1}^n (c^l_ij + 2c_ij + c^r_ij) x_ij / 4    (21)
s.t.
Σ_{j=1}^n x_ij − Σ_{j=1}^n [1 − (α^l_ji + 2α_ji + α^r_ji) / 4] x_ji = b_i,  ∀i ∈ {i | b_i ≥ 0}    (22)
Σ_{j=1}^n x_ij − Σ_{j=1}^n [1 − (α^l_ji + 2α_ji + α^r_ji) / 4] x_ji ≤ b_i,  ∀i ∈ {i | b_i < 0}    (23)
0 ≤ x_ij ≤ (u^l_ij + 2u_ij + u^r_ij) / 4,  ∀i, ∀j    (24)
Σ_{i=1}^n b_i − Σ_{i=1}^n Σ_{j=1}^n (α^l_ij + 2α_ij + α^r_ij) x_ij / 4 ≥ 0    (25)
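The three defuzzification operators reduce to one-line formulas. As a quick reference, they can be sketched in Python (the function names are ours):

```python
def k_preference(l, m, r, k):
    """k-preference integration (GMI) value of a TFN (l, m, r), with k in [0, 1]."""
    return (k * l + 2 * m + (1 - k) * r) / 3

def area_compensation(l, m, r):
    """Area compensation value of a TFN: the mean of its three defining points."""
    return (l + m + r) / 3

def signed_distance(l, m, r):
    """Signed distance value of a TFN: the middle point is weighted twice."""
    return (l + 2 * m + r) / 4
```

For the symmetric TFN (1, 2, 3), the area compensation and signed distance values are both 2, and the k-preference value coincides with them at k = 0.5; larger k pulls the value toward the lower bound, consistent with the decreasing cost reported later for increasing k.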
3. Numerical Experiment
The case in this section is adapted from an example in [1], which deals with the crisp minimum cost flow problem. Assume 60 units and 40 units of damageable items are supplied by node A and node B, whereas no less than 30 units and 60 units of damageable items are required by node D and node E, respectively. Node C is a transshipment node. Capacities and costs of the routes cannot be determined precisely in advance. If route a_ij has no specified capacity, then u_ij can be regarded as a large number and hence omitted from the mathematical model. Critical parameters of this problem instance are shown in Figure 2.
Given that this problem is small-scale and hence solvable by exact algorithms, we use Wolfram Mathematica 9 to generate optimal solutions. The imprecise parameters are defuzzified using three methods: the k-preference integration method, the area compensation method, and the signed distance method. To simplify the problem, we let k = k_c = k_α = k_u. The mathematical formulations and results obtained with the
GMI method and the area compensation method are shown in Figure 3 and Figure 4.
Computational results are shown in Table 1.
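The paper solves the instance exactly in Mathematica; after defuzzification the model is an ordinary linear program. Purely to illustrate the structure of the damage-adjusted constraints, the toy instance below (one supply node, one demand node, two parallel routes; every number is invented and is not a Figure 2 parameter) is brute-forced over integer flows:

```python
# Toy defuzzified instance, for illustration only.
SUPPLY = 100                 # units available at the supply node
DEMAND = 80                  # units that must arrive undamaged
costs = [2, 3]               # defuzzified unit cost of each route
deliver = [9, 10]            # units delivered per 10 shipped (10% damage vs. none)
caps = [60, 70]              # defuzzified route capacities

def best_flow():
    """Exhaustive search over integer flows (fine for a toy instance)."""
    best = None
    for x1 in range(caps[0] + 1):
        for x2 in range(caps[1] + 1):
            if x1 + x2 > SUPPLY:
                continue
            if deliver[0] * x1 + deliver[1] * x2 < 10 * DEMAND:
                continue  # not enough undamaged units reach the demand node
            cost = costs[0] * x1 + costs[1] * x2
            if best is None or cost < best[0]:
                best = (cost, x1, x2)
    return best
```

The search returns the cheapest feasible integer flow; in this toy instance it fills the cheaper but lossier route to capacity first and tops up with the lossless route.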
Figure 2. Network representation of a fuzzy minimum cost flow problem for damageable items
transportation.
Figure 3. Mathematical formulation and results by using the GMI method with Mathematica.
Figure 4. Mathematical formulation and results by using the area compensation method with Mathematica.
From Table 1, it can be clearly seen that the GMI method, the area compensation method, and the signed distance method generate similar results. Furthermore, the total cost decreases as k increases, which supports the soundness of the defuzzification by the k-preference integration method.
Table 1. Solutions obtained using the k-preference integration method, the signed distance method, and the
area compensation method
4. Conclusion
In this paper, we have presented a minimum cost flow problem for the transportation of damageable items in an imprecise environment. After defuzzifying the fuzzy parameters with the k-preference integration method, the area compensation method, and the signed distance method, the optimal flow can be obtained with Wolfram Mathematica.
There are three major avenues for future work. First, more defuzzification methods, such as the credibility measure method [8] or methods based on a tolerance level [13], could be applied and the results compared further. Second, more objective functions could be added and more item properties considered. Finally, given that Das et al. successfully solved a multi-objective solid transportation problem with type-2 fuzzy variables [14], some parameters in this model could also be taken as type-2 fuzzy numbers to describe the problem better and be defuzzified to generate optimal solutions.
References
[1] F. S. Hillier and G. J. Lieberman, Introduction to Operations Research (Ninth Edition), McGraw-Hill,
New York, 2010.
[2] H. S. Shih and E. S. Lee, Fuzzy multi-level minimum cost flow problems, Fuzzy Sets & Systems,
107(1999), 159-176.
[3] S. Ding, Uncertain minimum cost flow problem, Soft Computing, 18 (2014), 2201-2207.
[4] D. Chakraborty, D. K. Jana, T. K. Roy, Expected value of intuitionistic fuzzy number and its application
to solve multi-objective multi-item solid transportation problem for damageable items in intuitionistic
fuzzy environment, Journal of Intelligent & Fuzzy Systems, 30 (2016), 1109-1122.
[5] H. J. Zimmermann, Fuzzy Set Theory and Its Applications, Fourth Edition, Kluwer Academic Publishers, Norwell, 2001.
[6] S. H. Chen and C. H. Hsieh, A new method of representing generalized fuzzy number, Tamsui Oxford
Journal of Management Sciences, 13-14 (1998), 133-143.
[7] S. Lu and X. Wang, Modeling the Fuzzy Cold Storage Problem and Its Solution by a Discrete Firefly
Algorithm, Journal of Intelligent and Fuzzy Systems, 31(2016), 2431-2440.
[8] C. Changdar, G. S. Mahapatra, and R.K. Pal, An improved genetic algorithm based approach to solve
constrained knapsack problem in fuzzy environment, Expert Systems with Applications 42 (2015),
2276-2286.
[9] S. H. Chen and C. C. Wang, Representation, ranking, distance, and similarity of fuzzy numbers with step form membership function using k-preference integration method, Joint 9th IFSA World Congress and 20th NAFIPS International Conference, IEEE, 2 (2001), 801-806.
[10] S. K. De and I. Beg, Triangular dense fuzzy sets and new defuzzification methods, Journal of
Intelligent and Fuzzy Systems, 31(1) (2016), 469-477.
[11] E. Shekarian, C. H. Glock, S.M.P. Amiri, K. Schwindl, Optimal manufacturing lot size for a single-
stage production system with rework in a fuzzy environment, Journal of Intelligent and Fuzzy Systems
27 (2014), 3067-3080.
[12] J. S. Yao, K. Wu, Ranking fuzzy numbers based on decomposition principle and signed distance, Fuzzy
Sets and Systems, 116 (2000), 275-288.
[13] J. Brito, F. J. Martinez, J. A. Moreno, J. L. Verdegay, Fuzzy optimization for distribution of frozen
food with imprecise times, Fuzzy Optimization and Decision Making, 11 (2012), 337-349.
[14] A. Das, U. K. Bera and M. Maiti, Defuzzification of trapezoidal type-2 fuzzy variables and its application to solid transportation problem, Journal of Intelligent and Fuzzy Systems, 30 (2016), 2431-2445.
Fuzzy Systems and Data Mining II 65
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-65
Introduction
Electronic commerce is a new mode of commerce that uses digital electronic technology to carry out business activities, with the Internet as its backbone and information technology at its core. Electronic commerce brings new opportunities and challenges to businesses and individuals: it moves traditional business processes onto the network, changes the business activities of enterprises and the consumption habits of individuals, and makes business activities digital and intelligent.
The development of electronic commerce has led enterprises to accumulate large amounts of data internally, and there is an urgent need to turn these data into useful information and knowledge that can create more potential profit. The massive data accessible from the Internet give data mining a rich data foundation. Data mining technology can help an enterprise analyze its data in a highly automated way, perform inductive reasoning, discover hidden regularities, and extract useful information, guiding the enterprise to adjust its marketing strategy and make sound business decisions. At the same time, it helps provide the
1 Corresponding Author: Xia SONG, Shandong Agricultural Engineering Institute, Jinan, Shandong, China; E-mail: 643549139@qq.com.
66 X. Song and F. Huang / Research on the Application of DM in the Field of Electronic Commerce
dynamic, personalized, and efficient service for customers and improve the core competitiveness of the enterprise.
1.1. E-commerce
Data mining (DM), also known as knowledge discovery in databases (KDD), is the process of extracting implicit, previously unknown, and potentially useful information and knowledge from large, incomplete, noisy, fuzzy, and random data [2]. Data mining is a cross-disciplinary field that brings together database technology, artificial intelligence, machine learning, data visualization, pattern recognition, parallel computing, and several other areas of knowledge.
Data mining is also a new business information processing technology: guided by the enterprise's established business objectives, it extracts, transforms, analyzes, and models the large volume of business data in the enterprise database to obtain the key data that support business decisions. It is an advanced and effective method for revealing hidden, unknown regularities, or for validating known ones.
In electronic commerce data mining, Web mining uses data mining technology to automatically discover and extract interesting and useful patterns and information from WWW resources (Web documents) and behaviors (Web services) [3]. Web data come in three types: HTML-marked Web page data, link-structure data between Web documents, and user access data. According to the data type, Web mining can be divided into three categories. Web content mining is the process of selecting knowledge from Web documents or their descriptions. Web structure mining derives knowledge from the organizational structure and links of the Web; its purpose is to find authoritative Web pages through clustering and analysis of Web links and page structure. Web usage mining mines the access logs stored on Web servers to discover the access patterns of users and potential customers, among other information.
Correlation (association) analysis digs out hidden correlations within a dataset. For example, it can analyze the correlation between different items in one online purchase: if a customer buys item A, the model can predict the probability that the customer also buys item B, based on the correlation between A and B. The Apriori algorithm is the most commonly used method for association analysis [4].
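The support and confidence computations behind such association analysis fit in a few lines. The transactions below are invented for illustration; this is the counting step underlying Apriori, not a full implementation:

```python
from itertools import combinations

# Hypothetical transaction data for illustration only.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Estimated probability of buying the consequent given the antecedent."""
    return support(antecedent | consequent) / support(antecedent)

# First Apriori pass: keep the pairs meeting a 40% minimum support threshold.
items = sorted(set().union(*transactions))
frequent_pairs = [set(p) for p in combinations(items, 2) if support(set(p)) >= 0.4]
```

With these toy transactions, `confidence({"bread", "butter"}, {"milk"})` estimates how often milk accompanies bread and butter, which is exactly the kind of rule discussed below.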
Cluster analysis is a technique for grouping objects into different clusters. It can be used to group customers with similar interests or items with common characteristics. The most widely used clustering algorithms include hierarchical clustering, centroid-based clustering, distribution-based clustering, and density-based clustering [5].
Cluster analysis is commonly used in E-Commerce for subdividing client groups. A clustering algorithm can split clients into subgroups by analyzing the similarities of their consumption patterns. The business owner can then devise different strategies and provide personalized services for the different target groups.
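A minimal centroid-based sketch of such customer segmentation, in Python; the customer data are invented, and the deterministic seeding (first and last point) is a simplification for illustration:

```python
# Hypothetical customers: (monthly purchase count, average order value).
customers = [(1, 10), (2, 12), (1.5, 9), (20, 200), (22, 210), (19, 190)]

def kmeans2(points, iters=10):
    """Plain two-cluster k-means with deterministic seeding for the sketch."""
    centroids = [points[0], points[-1]]
    groups = [[], []]
    for _ in range(iters):
        groups = [[], []]
        for p in points:
            # Assign each point to the nearest centroid (squared distance).
            d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            groups[d.index(min(d))].append(p)
        # Recompute each centroid as the mean of its group.
        centroids = [
            tuple(sum(coords) / len(g) for coords in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return groups
```

On this toy data the low-spending and high-spending customers separate cleanly into the two groups, which is the segmentation use described above.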
Data mining is a powerful tool that provides informed guidance in the decision-making processes of E-Commerce. It seeks patterns in the sea of unorganized Internet traffic and discovers valuable information to support decision making and strategy development.
Data mining is widely used in product positioning and purchasing-behavior analysis to formulate marketing strategies. It can also be applied to forecast the sales market by analyzing purchasing patterns. Currently, the major data companies have started to embed data mining functions into their own products; for example, giants such as IBM and Microsoft incorporate online analysis functions into their corresponding products. By mining customer information, including customers' visit behavior, visit content, and visit frequency, an E-commerce recommendation system based on data mining can analyze customer features and infer their visiting patterns in order to offer tailor-made services and product recommendations catering to customer needs.
Data mining techniques can uncover correlations among products by analyzing the portfolios in shopping carts and modeling customers' purchasing behavior accordingly, thereby generating marketing strategies for commodity display, bundled sales, and promotion. The major task of association analysis is to dig out the hidden correlations within the dataset. One example is that the purchase of bread and butter implies the purchase of milk: over 90% of customers who buy bread and butter purchase milk as well. The business owner can design better item bundles by analyzing the correlations among different goods.
A sales manager at Wal-Mart found a surprising fact: beer and nappies, two apparently unrelated products, were often purchased together [8]. This phenomenon was frequently observed among young fathers, who tended to pick up beer when asked to buy nappies at the supermarket. It motivated the store to move the beer aisle closer to the nappies, with an immediate increase in the sales of both.
In E-commerce, by discovering similar association rules through data mining, online vendors can recommend commodities to customers based on the products already in their shopping carts, thus enhancing cross-selling. Furthermore, personalized commodity information and advertisements are expected to increase customer interest and loyalty.
4. Application Cases
Data mining techniques are used in E-Commerce for analyzing online inquiries, online trades, and registration information. A project usually takes steps such as defining the business scope, data collection, data preprocessing, model construction and evaluation, and output analysis and evaluation [9]. These steps are usually repeated and iterated to obtain more accurate results.
Data mining is playing an increasingly important role in E-Commerce, and there are successful cases of applying data mining theory and technology to it [10]. This section discusses the application of data mining to customer segmentation on Taobao.com. Purchase behavior and sales behavior coexist on the Taobao platform. Experts suggest using the following 15 key factors and weights for classifying customers and predicting their behavior, as shown in Table 1.
5. Conclusion
E-Commerce is developing rapidly and generating vast amounts of data to analyze. Data mining enables businesses to predict market trends and customer behavior; it also helps to provide personalized services and push personalized advertisements. Businesses can enhance revenue by forming better strategies with the help of data mining analysis. Data mining in E-Commerce will enjoy further development with progress in hardware technology and algorithms research and with the accumulation of application experience.
References
[1] S. Z. Zhang, X. K. Qu, L. Zhang, Research on the Web data mining based on Electronic-Commerce, Modern Computer, 03 (2015), 12-17.
[2] H. M. Wu, Sales data mining technology and e-commerce application research, Guangdong University of Technology, 2014.
[3] Y. N. Zhang, Application of web data mining in e-commerce, Fujian Computer, 05 (2013), 138-140.
[4] J. X. Wu, Research on web data mining and its application in E-Commerce, Information System Engineering, 01 (2010), 15-18.
[5] X. J. Chen, Research on data mining in electronic commerce, Information and Computer, 05 (2014), 135.
[6] H. Y. Lu, Application of data mining techniques in e-commerce, Network and Information Engineering, (2014), 73-75.
[7] L. Huang, Research on the application of Web data mining in e-commerce, Hunan University, 2014.
[8] Y. Gao, Beer and diapers, Tsinghua University Press, 2008.
[9] S. Liu, Application of Web data mining technology for e-commerce analysis, Electronic Technology and Software Engineering, 07 (2014), 216-217.
[10] China statistics web, Application of data mining in e-commerce, http://www.itongji.cn/datamining/hangye/dianzishangwuzhongshujuwajuefangfadeyingyong/ 2010.
Fuzzy Systems and Data Mining II 71
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-71
Abstract. With the rapid development of Semantic Web research, the demand for representing and reasoning with uncertain information increases. Although ontologies are capable of modeling the semantics and knowledge in knowledge-based systems, classical ontology languages are not appropriate for dealing with uncertainty in knowledge, which is inherent in most real-world application domains. In this paper, we address this issue by extending the expressive power of current ontology languages: we propose a Fuzzy Multi-Entity Bayesian Networks ontology language that extends PR-OWL based on a combination of Fuzzy MEBN and ontologies, define and study its syntax and semantics, and show how domain knowledge is represented by RDF graphs. The proposed language, Fuzzy PR-OWL, moves beyond the current limitation of PR-OWL in modeling knowledge with fuzzy semantics or fuzzy relations. By providing a principled means of uncertainty representation and reasoning, Fuzzy PR-OWL can serve many applications involving fuzzy and probabilistic knowledge.
Introduction
1 Corresponding Author: Dun LI, School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China; E-mail: ielidun@zzu.edu.cn; iedli@zzu.edu.cn.
72 Z.-Y. Zheng et al. / A Fuzzy MEBN Ontology Language Based on OWL2
1. Related Research
The main models currently used for uncertainty representation and reasoning on the Semantic Web are probabilistic and Dempster-Shafer models, and fuzzy and possibilistic models [1]. The representative probabilistic models are BN and MEBN, and the ontology languages based on them are BayesOWL [7] and PR-OWL 2 [8-9].
BNs can deal with uncertain and probabilistic events and incomplete data sets according to causality or other types of relationships between events. However, standard BNs are limited in representing relational information. Figure 1a shows a BN that represents probabilistic knowledge about bronchitis: smoking may cause bronchitis, and colds, which may be brought on by factors such as bad weather, can also lead to airway inflammation. The BN clearly shows the causation of the patient's illness, but it cannot represent relational information such as the effect on the patient of harmful gas produced by others smoking. MEBN, by contrast, takes advantage of first-order logic, which lets it overcome these limitations of BNs. Figure 1b, where ovals represent resident nodes, trapezoids input nodes, and pentagons context nodes, shows that person and other are entities of the class Person, and the context rule other = peopleAround(person), which may link to another MFrag, defines other as the people around person. MEBN can thus represent relationships between entities and take the effect of others' smoking on the probability of the patient having bronchitis into account via the parent node getCold(other).
In reality, however, human experience and knowledge are characterized by fuzziness that MEBN cannot handle. In the example above, the impact of a slight cold must differ from that of a bad cold. Though MEBN can represent the possibility of having a cold, for instance getCold{true 1, false 0} where 1 and 0 are probabilities, it cannot represent the degree of the cold. Another situation concerns the state values of resident nodes. For example, suppose the weather has two states {sunny, cloudy}. MEBN assigns probabilities to these states, say {sunny 0.5, cloudy 0.5}, but situations like "partly cloudy" cannot be handled by MEBN.
2. Fuzzy PR-OWL
2.1. Elements
[Figure: overview of the elements of Fuzzy PR-OWL. Main classes/elements: FMFrag, Node, FMTheory, FRS, Probability Distribution, Fuzzy state, FRandom Variable. Subclasses: Domain and Context FMFrag, If-Then Rules, Then-Part, FExemplar Argument, OVArgument, FConstant Argument, FMapping Argument, State Assignment, Declarative Distribution, FPR-OWL table, Simple and TrueValue FMExpression, TrueValue Random Variable. Other elements: Ordinary Variable, Fuzzy LogicalOperator, FExemplar. Reified relationships: Membership, Probability Assignment, FArgument, conditioning.]
2.2. Syntax
An overview of the basic model of Fuzzy PR-OWL is depicted in Figure 3. In this diagram, ovals represent general classes and arrows represent the major relationships between them.
[Figure 3: basic model of Fuzzy PR-OWL. An FPR-OWL Table is built from nodes (hasNode) and has rules (hasFRS); it relates ResidentNode/InputNode, If-Then Rule, Probability Assignment, ConditioningState, StateAssignment (If-Part and Then-Part), FMExpression, FArgument, and RandomVariable, with the multiplicities shown in the diagram.]
• Fuzzy Expression
As shown in Figure 6, this part proposes the model of fuzzy expressions, which can represent constraints or fuzzy relationships between entities.
An expression represents a relationship between entities in Fuzzy PR-OWL. The class FMExpression can represent either the true-value expression of a context node or the simple expression of other kinds of nodes. The former denotes logical expressions based on FOFL, and the latter can be viewed as random variables of input nodes or resident nodes with some arguments. The class Exemplar here indicates the universal or existential quantifiers in a fuzzy expression, written in Skolem form. The structure of a fuzzy expression is defined below:
FMExpression ::= 'FMExpression(' FMExpression_id ',' [exists|forAll Exemplar_id ','] Expression ')';
Expression ::= Term ['and'|'or' Term] ['=' Term] ['implies'|'iff' Term];
Term ::= ['not'] RandomVariable_id ['(' Argument_id {',' Argument_id}+ ')'] | FMExpression_id | OrdinaryVariable_id;
RandomVariable ::= 'RandomVariable(' RandomVariable_id ', hasPossibleValues(' {URI_reference}+ ')' [', defineUncertaintyOf(' URI ')'] [', probDistr(' PrTable_id ')'] [', trueValue(' float ')'] ')';
OrdinaryVariable ::= 'OrdinaryVariable(' OrdinaryVariable_id ', class(' DomainClass_URI '))';
Argument ::= 'Argument(' Argument_id ',' ['type(' Thing ')'] [', typeOfData(' Literal ')'] [',' MembershipDegree] ')';
Exemplar ::= 'Exemplar(' Exemplar_id ',' ['type(' Thing ')'] [', typeOfData(' Literal ')'] [',' MembershipDegree] ')';
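As an illustration of the concrete syntax above, the core productions can be mirrored as a small object model that renders back to the grammar. The Python class and field names are ours and are not part of the Fuzzy PR-OWL specification:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Term:
    """A Term: an optionally negated random variable applied to arguments."""
    random_variable_id: str
    argument_ids: List[str] = field(default_factory=list)
    negated: bool = False

    def render(self) -> str:
        text = ("not " if self.negated else "") + self.random_variable_id
        if self.argument_ids:
            text += "(" + ",".join(self.argument_ids) + ")"
        return text

@dataclass
class FMExpression:
    """An FMExpression with an optional binary connective."""
    expr_id: str
    left: Term
    op: Optional[str] = None       # 'and' | 'or' | '=' | 'implies' | 'iff'
    right: Optional[Term] = None

    def render(self) -> str:
        body = self.left.render()
        if self.op and self.right is not None:
            body = f"{body} {self.op} {self.right.render()}"
        return f"FMExpression({self.expr_id}, {body})"
```

Rendering the context-node expression of the later use case, `FMExpression("FMExpression_CX1", Term("equalTo", ["CX1_1", "CX1_2"]))`, yields `FMExpression(FMExpression_CX1, equalTo(CX1_1,CX1_2))`.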
2.3. Semantics
values can be N ∪ {⊥}, and the latter can be either a real number in [0, 1] or a member of the chain E = ⟨ε₁, ε₂, …, ε_k⟩.
The random variables mentioned here can be represented as the expressions in Section 2.2. The probability and the membership degree of the possible values of functions are assigned by the joint probability distribution and by the If-Then rules, respectively. LF uses a phenomenal random variable with n-ary arguments to represent a function. The function Ā: Δ → U maps a vector of entity identifier symbols Δ = ⟨N₁, N₂, …, N_n⟩, such as input arguments, into a vector of identifier symbols U = ⟨Y₁, Y₂, …, Y_m⟩, such as fuzzy states or fuzzy value assignments; the values for the various arrangements of arguments and possible values are predefined in the language by the fuzzy interpretation of Ā. This interpretation can also be represented as a fuzzy relation [12] that yields the truth value of a relation over the input set, that is, ρ: ⟨Δ, U⟩ → ε ∈ {ε₁, ε₂, …, ε_k}. By matching domain entity identifier symbols with domain entities, the function or relation maps an n-ary vector of domain entities into entities for phenomenal random variables, or into truth values of domain assertions for logical random variables.
3. Use Case
In the equipment diagnosis problem, the belt status and the room temperature can affect the engine status. This problem, represented by an EngineStatus FMFrag, is shown in Figure 7. In the figure, isA(Machine, m) states that m is an instance of Machine; EngineStatus(m), BeltStatus(b), and RoomTemp(r) represent the engine status of machine m, the status of belt b, and the temperature of room r, respectively. Suppose that the engine status node has the local distribution shown in Table 2, where superscripts denote membership degrees.
Table 2. Local distribution of the EngineStatus FMFrag

RoomTemp(r)          BeltStatus(b)       EngineStatus(m)
(Normal^α1; Hot^α2)  (OK^β1; Broken^β2)  Satisfactory^α1  Overheated^α2
Normal               OK                  0.8              0.2            0
Normal               Broken              0.6              0.4            0
…                    …                   …                …              …
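One plausible reading of Table 2 (the excerpt does not spell it out) is that fuzzy parent states weight the crisp rows by their membership degrees. The sketch below follows that reading; the Hot rows, the omitted truncated third state, and all membership values are assumptions of ours:

```python
# Crisp rows in the spirit of Table 2; the Hot rows are invented.
cpt = {
    ("Normal", "OK"):     {"Satisfactory": 0.8, "Overheated": 0.2},
    ("Normal", "Broken"): {"Satisfactory": 0.6, "Overheated": 0.4},
    ("Hot", "OK"):        {"Satisfactory": 0.5, "Overheated": 0.5},
    ("Hot", "Broken"):    {"Satisfactory": 0.2, "Overheated": 0.8},
}

def fuzzy_engine_status(mu_temp, mu_belt):
    """Weight each crisp row by the product of the parents' membership
    degrees, then renormalize -- one plausible reading of the superscripts."""
    out = {"Satisfactory": 0.0, "Overheated": 0.0}
    total = 0.0
    for (t, b), dist in cpt.items():
        w = mu_temp[t] * mu_belt[b]
        total += w
        for state, p in dist.items():
            out[state] += w * p
    return {state: v / total for state, v in out.items()}
```

With crisp memberships (degree 1 on a single state per parent) this reduces to an ordinary CPT lookup; partial memberships blend the rows.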
[Figure 7: the EquipmentDiagnosis FMTheory, comprising the MachineLocation FMFrag and the EngineStatus FMFrag. Context nodes include isA(m, Machine), isA(r, Room), isA(b, Belt), m = BeltLocation(b), and r = MachineLocation(m); the nodes BeltStatus(b) and RoomTemp(r) are parents of EngineStatus(m).]
EngineStatus, which includes the probability assignment of states such as Overheated when the conditioning state of the parent BeltStatus is OK.
[Figure: RDF graph of the probability distribution of EngineStatus, relating fpr:ProbabilityDistribution, fpr:ProbabilityAssignment, fpr:StateAssignment, fpr:ConditioningState, and the input/resident node through properties such as fpr:hasProbabilityAssignment, fpr:hasCondNode, fpr:hasStateAssign, fpr:hasStateName (a string), and fpr:hasStateProb (a float); for example, the state Overheated has probability 0.2 when BeltStatus is OK.]
[Figure: RDF graph of a context-node expression. In es:DomainFMFrag.Enginestate, the context node es:ContextNode_C has the expression es:FMExpression_CX1, whose random variable es:equalTo (possible value/truth value 0.9) takes the arguments es:CX1_1 and es:CX1_2 (fpr:hasArgNumber 1 and 2); es:CX1_1 is substituted by the ordinary variable es:Enginestate_Mfrag.room (typed by the Room class URI), while es:CX1_2 is typed by the inner expression es:CX1_2_inner_FMExp over the random variable es:MachineLoc_FMExp of es:DomainFMFrag.BeltLocation, which is]
connected to another FMFrag, MachineLocation. The dark ovals constitute the main parts of the expression, including the logical connective equalTo with a truth value, and the arguments CX1_1 and CX1_2, which correspond respectively to the ordinary variable room in the EngineStatus FMFrag and to the random variable MachineLocation(m) in the BeltLocation FMFrag.
4. Conclusion
Acknowledgment
This work was funded by the Key Scientific and Technological Project of Henan Province (162102310616).
References
[1] P. Michael, Uncertainty Reasoning for the Semantic Web III, Springer International Publishing, 2013.
[2] K. J. Laskey and K. B. Laskey, Uncertainty Reasoning for the World Wide Web: Report on the
URW3-XG Incubator Group, International Workshop on Uncertainty Reasoning for the Semantic Web,
Karlsruhe, Germany, 2008.
[3] K. B. Laskey, MEBN: A language for first-order Bayesian knowledge bases, Artificial
Intelligence, 172(2008):140-178.
[4] K. Golestan, F. Karray, and M. S. Kamel, High level information fusion through a fuzzy extension to
Multi-Entity Bayesian Networks in Vehicular Ad-hoc Networks, International Conference on
Information Fusion, (2013):1180-1187.
[5] K. Golestan, F. Karray, and M. S. Kamel, Fuzzy Multi Entity Bayesian Networks: A Model for Imprecise
Knowledge Representation and Reasoning in High-Level Information Fusion, IEEE International
Conference on Fuzzy Systems, (2014):1678-1685.
[6] P. Hitzler, et al., OWL 2 Web Ontology Language Primer (Second Edition), (2015).
[7] Z. L. Ding, and Y. Peng, A Probabilistic Extension to Ontology Language OWL, Hawaii International
Conference on System Sciences, 4(2004):40111a-40111a.
[8] P. C. Costa, G. Da, K. B. Laskey and K. J. Laskey, PR-OWL: A Bayesian Ontology Language for the
Semantic Web, Uncertainty Reasoning for the Semantic Web I:, ISWC International Workshop, URSW
2005-2007, Revised Selected and Invited Papers, (2008):88-107.
[9] N. C. Rommel, K. B. Laskey, and P. C. G. Costa, PR-OWL2.0 – Bridging the Gap to OWL
Semantics, Uncertainty Reasoning for the Semantic Web II, Springer, Berlin Heidelberg, (2013):1-18.
[10] V. Novák, On the syntactico-semantical completeness of first-order fuzzy logic, Kybernetika
-Praha- 2(1990):47-66.
[11] N. F. Noy, et al, Creating semantic web contents with protégé-2000, IEEE Intelligent Systems, 16
(2001): 60–71.
[12] W. Gueaieb, Soft computing and intelligent systems design: Theory, tools and applications, Neural
Networks IEEE Transactions on, 17(2004):825-825.
Fuzzy Systems and Data Mining II 81
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-81
Abstract. The return voltage method (RVM) is a good method for studying the aging state of transformer insulation, but it is difficult to assess the insulation aging state accurately from a single characteristic quantity. In this paper, fuzzy rough set theory combined with RVM is proposed to assess the oil-paper insulation state of transformers, and an assessment system for transformer oil-paper insulation is constructed based on a large amount of test data. First, the evaluation indices of the oil-paper insulation status of a transformer are established from return voltage characteristic parameters. Then, the fuzzy c-means clustering algorithm is used to obtain the membership functions of the transformer test data along with a fuzzy partition of the characteristics. Moreover, the fuzzy attributes of the assessment table of oil-paper insulation status are reduced according to the discernibility matrix, and the evaluation rules for the oil-paper insulation condition are extracted. Finally, the examples in this paper demonstrate that the assessment system is effective and feasible, providing a new idea for the assessment of the transformer oil-paper insulation state. The research has practical value in engineering applications.
Introduction
Transformers play a vital role in the whole electrical power system. Due to a large
number of transformers within electric utilities are approaching the end of their design
life, there has been a growing interest in the condition assessment of transformer
insulation currently. The degradation of the main insulation system in transformer is
recognized to be one of the major causes of transformer breakdown [1- 3].
Methods based on the analysis of electrical polarization in dielectrics are often
used in the diagnostics of paper-oil insulation state. Three parameters customarily were
selected to assess the oil-paper insulation [4-5]. However, due to the characteristics of
insulation aging affected by a variety of factors, it is difficult to accurately assess the
insulation aging state by a single feature. The grey correlation method was introduced
for the insulation condition assessment [6], but did not consider the amount of
redundant characteristics in condition assessment of oil paper insulation, the assesse
process is complicated.
In this paper, fuzzy rough set theory is introduced and multiple characteristics are considered to comprehensively assess the condition of oil-paper insulation. The method addresses the problem that part of the information is incomplete or unknown. The fuzzy c-means clustering algorithm (FCM) is used to discretize the important data categories into classification attributes [7]. The fuzzy rules of the characteristics and the insulation assessment system are established based on a historical database.
1 Corresponding Author: De-Hua HE, College of Electrical Engineering and Automation, Fuzhou University, Fuzhou, China; E-mail: 153367542@qq.com.
82 D.-H. He et al. / State Assessment of Oil-Paper Insulation Based on Fuzzy Rough Sets
Rough set theory is a powerful tool in dealing with vague and uncertain information.
The basic idea of the fuzzy rough model is that a fuzzy similarity relation is used to
construct the fuzzy lower and upper approximations of a decision. The sizes of the
lower and upper approximations reflect the discriminating capability of a feature subset.
The union of fuzzy lower approximations forms the fuzzy positive region of decision.
Let the universe U be a finite nonempty set of objects. Each object in U is described by a set of attributes, denoted by A. The pair (U, A) is an information system (IS), where for every subset P ⊆ A there exists an associated similarity relation. Let μ_{R_P}(x, y) denote the similarity of objects x and y induced by the subset of features P. Given X ⊆ U, X can be approximated by the information contained in P through the construction of the P-lower and P-upper approximations of X, as defined in Eq. (1):
$$\mu_{\underline{R_P}X}(x)=\inf_{y\in U} I\big(\mu_{R_P}(x,y),\,\mu_X(y)\big),\qquad \mu_{\overline{R_P}X}(x)=\sup_{y\in U} T\big(\mu_{R_P}(x,y),\,\mu_X(y)\big) \tag{1}$$
where I is a fuzzy implicator, T is a t-norm, and R_P is the fuzzy similarity relation induced by the subset of features P. The degree of similarity of two objects with respect to a subset of features can be constructed from the per-feature relations using Eq. (2):

$$\mu_{R_P}(x,y)=\mathcal{T}_{a\in P}\,\mu_{R_a}(x,y) \tag{2}$$

where μ_{R_a}(x, y) is the degree to which objects x and y are similar for feature a. A quality measure termed the fuzzy-rough dependency function γ_P(Q) measures the dependency between two sets of attributes P and Q; it is defined by:
$$\gamma_P(Q)=\frac{\big|\mu_{POS_{R_P}}(Q)\big|}{|U|}=\frac{\sum_{x\in U}\mu_{POS_{R_P}(Q)}(x)}{|U|} \tag{3}$$
where the fuzzy positive region, which contains all objects of U that can be classified into classes of U/Q using the information in P, is defined as:

$$\mu_{POS_{R_P}(Q)}(x)=\sup_{X\in U/Q}\mu_{\underline{R_P}X}(x) \tag{4}$$
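A small numerical sketch may make Eqs. (1)-(4) concrete. This is an illustrative Python implementation, not the authors' code: it fixes the t-norm to min and the implicator to the Kleene-Dienes form I(a, b) = max(1 − a, b), and runs on toy data rather than transformer measurements.

```python
import numpy as np

def lower_approx(R, mu_X):
    """Fuzzy P-lower approximation of Eq. (1): inf_y I(R(x,y), X(y)),
    with the Kleene-Dienes implicator I(a, b) = max(1 - a, b)."""
    # R: (n, n) fuzzy similarity matrix; mu_X: (n,) membership of X
    return np.min(np.maximum(1.0 - R, mu_X[None, :]), axis=1)

def upper_approx(R, mu_X):
    """Fuzzy P-upper approximation of Eq. (1): sup_y T(R(x,y), X(y)), T = min."""
    return np.max(np.minimum(R, mu_X[None, :]), axis=1)

def dependency(R, decision_classes):
    """Dependency degree of Eq. (3): mean positive-region membership,
    where the positive region (4) is the sup of the lower approximations."""
    pos = np.max([lower_approx(R, mu) for mu in decision_classes], axis=0)
    return pos.mean()

# toy data: 3 objects, crisp decision classes {x1, x2} and {x3}
R = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
classes = [np.array([1.0, 1.0, 0.0]), np.array([0.0, 0.0, 1.0])]
print(dependency(R, classes))  # 2.5/3, about 0.833
```

A feature subset P is then judged by the size of γ_P(Q): subsets whose dependency equals that of the full attribute set are candidates for attribute reduction.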
Not all attributes are necessary for the assessment of the oil-paper insulation system; removing these redundant features and attribute values does not affect the original oil-paper insulation diagnostic effect. The discernibility matrix can be used to reduce the condition attributes and attribute values. The specific reduction steps are as follows:
Step 1: Calculate the similarity relation of fuzzy attribute C_k:

$$R_k(x_i,x_j)=\begin{cases}\min\{C_k(x_i),\,C_k(x_j)\}, & C_k(x_i)\ne C_k(x_j)\\ 1, & C_k(x_i)=C_k(x_j)\end{cases} \tag{5}$$

Step 2: Construct the discernibility matrix elements:

$$c_{ij}=\begin{cases}\{R_k : 1-R_k(x_i,x_j)\ge\lambda_i\}, & \lambda_i\ge\lambda_j\\ \varnothing, & \lambda_i<\lambda_j\end{cases} \tag{6}$$
3. Membership of Characteristic
In this paper, FCM is used to calculate the center of each cluster and the membership of the transformer test data. Let (U, P∪Q) be a fuzzy decision system with U = {x1, x2, ..., xn}, where the fuzzy condition attributes P are divided into three categories with cluster centers V = {v1, v2, v3}. The relationship between a sample and the cluster centers can be expressed by a membership degree. The membership function is obtained by the algorithm, and then the membership degree matrix μ is obtained:
$$\mu=\begin{bmatrix}\mu_{11}&\cdots&\mu_{1j}&\cdots&\mu_{1n}\\ \mu_{21}&\cdots&\mu_{2j}&\cdots&\mu_{2n}\\ \mu_{31}&\cdots&\mu_{3j}&\cdots&\mu_{3n}\end{bmatrix},\qquad j=1,\dots,n \tag{7}$$
$$\mu_{ij}=\frac{\big(1/\|x_j-v_i\|^2\big)^{1/(m-1)}}{\sum_{c=1}^{3}\big(1/\|x_j-v_c\|^2\big)^{1/(m-1)}} \tag{8}$$
$$\min\ J(\mu_{ij},v_i)=\sum_{i=1}^{3}\sum_{j=1}^{n}(\mu_{ij})^m\,\|x_j-v_i\|^2 \tag{9}$$
$$v_i=\frac{\sum_{j=1}^{n}(\mu_{ij})^m x_j}{\sum_{j=1}^{n}(\mu_{ij})^m},\qquad i=1,2,3 \tag{10}$$
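The alternating FCM updates of Eqs. (8) and (10) can be sketched as follows. This is an illustrative implementation on toy one-dimensional data, not the paper's transformer measurements; the number of clusters is left as the number of rows of the center array rather than fixed to three.

```python
import numpy as np

def fcm_memberships(X, V, m=2.0):
    """Membership update of Eq. (8): mu_ij proportional to
    (1/||x_j - v_i||^2)^(1/(m-1)), normalised over the centers."""
    # X: (n, p) samples; V: (c, p) cluster centers
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(-1)  # (c, n) squared dists
    d2 = np.maximum(d2, 1e-12)                           # guard against 0
    w = (1.0 / d2) ** (1.0 / (m - 1.0))
    return w / w.sum(axis=0, keepdims=True)              # columns sum to 1

def fcm_centers(X, U, m=2.0):
    """Center update of Eq. (10): v_i = sum_j mu_ij^m x_j / sum_j mu_ij^m."""
    Um = U ** m
    return (Um @ X) / Um.sum(axis=1, keepdims=True)

# two well-separated 1-D clusters; alternate Eqs. (8) and (10)
X = np.array([[0.0], [0.1], [1.0], [1.1]])
V = np.array([[0.2], [0.9]])
for _ in range(20):
    U = fcm_memberships(X, V)
    V = fcm_centers(X, U)
print(V.ravel())  # centers settle near 0.05 and 1.05
```

In the paper's setting, each column of the converged membership matrix μ gives the degree to which one test sample belongs to the three condition categories.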
According to Eqs. (1) and (4), the most important attribute for the assessment is P4, followed by P3, P5, P1, and P2; the reduced attribute set is obtained by attribute reduction.
Traf  Model     Years  tcdom  Urmp  Srmax  Rg/GΩ  Cg/nF   Furfural  State
T1    SFSE-220  1      2314   230   27     15.74  90.35   0.06      Good
T2    SFP-220   14     1449   243   45     1.795  364.2   0.74      Bad
T3    cub-/220  22     3328   289   24     4.027  95.48   0.99      Bad

Traf  P1(H)   P2(L)   P3(M)   P3(H)   P4(L)   P4(H)   P5(H)   Rule  Result
T1    0.6684  0.0034  0.0123  0.0007  0.0001  0.9980  0.0236  1     G
T2    0.0052  0.0132  0.0418  0.0014  0.9883  0.0050  0.9932  6     B
T3    0.9787  0.0341  0.0245  0.0016  0.0007  0.0000  0.0170  9     B
5. Conclusion
To avoid a single characteristic affecting the correctness of the insulation condition assessment, fuzzy rough set theory combined with RVM is proposed and used to assess the oil-paper insulation of transformers. The results demonstrate that the assessment system is effective and feasible, providing a new approach for the assessment of transformer oil-paper insulation.
References
[1] T. K. Saha, Review of modern diagnostic techniques for assessing insulation condition in aged
transformers, IEEE Trans. Dielectr. Electr. Insul. 10(2003), 903-917.
[2] M. de Nigris, R. Passaglia, R. Berti, L. Bergonzi and R. Maggi, Application of modern techniques for the
condition assessment of power transformers, CIGRE Session 2004, France, Paper A2-207, 2004.
[3] W. G. Chen, J. Du, Y. Ling, et al. Air-gap discharge process partition in oil-paper insulation based on
energy-wavelet moment feature analysis. Chinese Journal of Scientific Instrument, 34(2013):1062-1069.
[4] Y. Zou, J. D. Cai. Study on the relationship between polarization spectrum characteristic quantity and
insulation condition of oil-paper transformer. Chinese Journal of Scientific Instrument, 36(2015): 608-
614.
[5] R. J. Liao, H. G. Sun, Q. Yuan, et al. Analysis of oil-paper insulation aging characteristics using Return
voltage method. High Voltage Engineering, 37(2011): 136-142.
[6] J. D. Cai and Y. Huang. Study on Insulation Aging of Power Transformer Based on Gray Relational
Diagnostic Model. High Voltage Engineering, 41(2015): 3296- 3301.
[7] S. H. Gao, L. Dong, Y. Gao, et al. Mid-long term wind speed prediction based on rough set theory.
Proceedings of the CSEE, 32(2012): 32-37.
Fuzzy Systems and Data Mining II 87
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-87
Introduction
Networked control systems (NCSs) are feedback control systems closed over a communication network. As is well known, NCSs have many advantages, such as ease of maintenance, low cost, and greater flexibility [1]. In recent years, a number of papers have reported on the analysis and control of NCSs [2-4]. To design network-based control, Gao obtained a new delay system approach using LMIs [5]. In [6], Walsh et al. considered the asymptotic stability of nonlinear NCSs. For NCSs with long communication delay, a network-based optimal controller was designed in [7]. Yue et al. considered the H∞ control problem of NCSs with uncertainty [8].
As a useful approach, fuzzy control is often used to design robust controllers for nonlinear systems. With the well-known T-S approach, many papers have been published on the stabilization and control problem for nonlinear delay systems [9-10]. In [11], considering the insertion of the network, a new two-step approach was introduced to ensure system properties. For nonlinear NCSs, the input-to-state stability problem was considered in [12]. However, the results of the above papers have focused only on the asymptotic stability of dynamic systems; few papers have considered the finite-time stability of nonlinear NCSs. Therefore, the finite-time control problem of nonlinear NCSs is worth investigating, which motivates this paper.
1
Corresponding Author. He-Jun YAO, School of Mathematics and Statistics, Anyang Normal
University, 455000, Anyang, Henan, China; E-mail addresses: yaohejun@126.com.
88 H.-J. Yao et al. / Finite-Time Stabilization for T-S Fuzzy Networked Systems
In this paper, using the Lyapunov functional and LMI approaches, we obtain a finite-time stability condition and a fuzzy controller design method.
1. Problem formulation
Assumption 1 [14]. The sensor is time-driven, while the controller and actuator are event-driven. The sensor-to-controller delay is τ_sc and the controller-to-actuator delay is τ_ca; therefore, the communication delay is τ = τ_sc + τ_ca.
With the network inserted and the communication delay τ taken into account, the control system of Fig. 1 becomes

$$\dot x(t)=\sum_{i=1}^{r}\mu_i(z(t))\big[A_ix(t)+A_{di}x(t-d)+B_iu(t-\tau)+G_i\omega(t)\big] \tag{4}$$
$$x(t)=\phi(t),\qquad t\in[-\bar d,\,0]$$
In this paper, we design the following controller:
$$u(t)=\sum_{i=1}^{r}\mu_i(z(t))K_ix(t) \tag{5}$$
Substituting controller (5) into the networked system (4), we obtain the closed-loop system:

$$\dot x(t)=\sum_{i=1}^{r}\sum_{j=1}^{r}\mu_i(z(t))\mu_j(z(t))\big[A_ix(t)+A_{di}x(t-d)+B_iK_jx(t-\tau)+G_i\omega(t)\big] \tag{6}$$
$$x(t)=\psi(t),\qquad t\in[-\bar d,\,0]$$
We suppose the initial state x(t) = ψ(t) is a smooth function on [−d̄, 0], where d̄ = max{τ, d}, so that ‖ψ(t)‖ ≤ ψ̄ for t ∈ [−d̄, 0], where ψ̄ is a positive constant.
Definition 1 [15]. For the given positive scalars c₁, c₂, T and positive matrix R, the time-delay NCSs (6) (setting ω(t) ≡ 0) is finite-time stable if

$$x^{T}(0)Rx(0)\le c_1\ \Longrightarrow\ x^{T}(t)Rx(t)<c_2,\qquad \forall t\in[0,T] \tag{7}$$

Definition 2 [16]. For the given positive scalars c₁, c₂, T and positive matrix R, the time-delay NCSs (6) is finite-time stabilizable by the state feedback controller if the following condition holds for the closed-loop system:

$$x^{T}(0)Rx(0)\le c_1\ \Longrightarrow\ x^{T}(t)Rx(t)<c_2,\qquad \forall t\in[0,T] \tag{8}$$
2. Main Results
Theorem 1. For the given positive scalars c₁, c₂, T and positive matrix R, the NCSs (6) is finite-time stabilizable if there exist a scalar α ≥ 0, matrices K_i ∈ R^{m×n}, and positive matrices P, Q, T ∈ R^{n×n}, S ∈ R^{l×l} such that the following matrix inequalities hold:

$$\begin{bmatrix}\Xi & PA_{di} & PB_iK_j & PG_i\\ * & -Q & 0 & 0\\ * & * & -T & 0\\ * & * & * & -\alpha S\end{bmatrix}<0 \tag{9}$$

$$\frac{c_1\big(\lambda_{\max}(\bar P)+h\,\lambda_{\max}(\bar Q)+\tau\,\lambda_{\max}(\bar T)\big)+\lambda_{\max}(S)\,\big(1-e^{-\alpha T}\big)}{\lambda_{\min}(\bar P)}<c_2\,e^{-\alpha T} \tag{10}$$

where

$$\Xi=PA_i+A_i^{T}P+Q+T-\alpha P,\qquad \bar P=R^{-1/2}PR^{-1/2},\qquad \bar Q=R^{-1/2}QR^{-1/2},\qquad \bar T=R^{-1/2}TR^{-1/2},$$

and λ_max(·) and λ_min(·) denote the maximum and minimum eigenvalues.
Proof. For the positive matrices P, Q, T in Theorem 1, we choose the Lyapunov functional [13]:

$$V(x(t))=x^{T}(t)Px(t)+\int_{t-h}^{t}x^{T}(\theta)Qx(\theta)\,d\theta+\int_{t-\tau}^{t}x^{T}(\theta)Tx(\theta)\,d\theta \tag{11}$$
3. Numerical Example
The temperature control system of a polymerization reactor is an inertia link with time delay. The state-space model of the polymerization reactor is usually written as [6]

$$\dot x_1(t)=x_2(t),\qquad \dot x_2(t)=-a_1x_1(t)-a_2x_2(t)+bu(t),\qquad y(t)=x_1(t)$$
It is impossible to avoid external disturbance and time delay. We consider the nonlinear delay system with norm-bounded uncertainties as follows:

$$\dot x(t)=A_ix(t)+A_{di}x(t-d)+B_iu(t-\tau),\qquad x(t)=\psi(t),\quad -\bar d\le t\le 0$$

where

$$A_1=\begin{bmatrix}-30&0\\0&-20\end{bmatrix},\ A_2=\begin{bmatrix}3&12\\1&0\end{bmatrix},\ A_{d1}=\begin{bmatrix}-2&0.5\\0.5&-2\end{bmatrix},\ A_{d2}=\begin{bmatrix}-3&1\\0.1&-1\end{bmatrix},\ B_1=\begin{bmatrix}1\\-2\end{bmatrix},\ B_2=\begin{bmatrix}0\\1\end{bmatrix},\ \psi(t)=\begin{bmatrix}1\\-1\end{bmatrix},$$

with d = 0.2 and τ = 0.5.
Solving the LMIs (17), the gain matrices can be obtained:

K₁ = K̄₁P⁻¹ = [3.4529  1.6837],  K₂ = K̄₂P⁻¹ = [8.6183  4.3602]

With the state feedback controller (5) in Theorem 2, and choosing the initial condition ψ(t) = [2  0.5]^T, the simulation results are shown in Figures 2-3.
Figure 2. x₁(t) of the system.
Figure 3. x₂(t) of the system.
From the above figures, one can see that the system is finite-time stable.
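The behavior in Figures 2-3 can be sanity-checked with a simple forward-Euler simulation of the closed loop (6). This is only a sketch under several assumptions: the minus signs of the printed matrices and gains did not fully survive extraction, so stabilizing (negative) feedback gains are assumed here, and the T-S membership function μ₁(z) = 1/(1 + x₁²) is an illustrative choice not given in the paper.

```python
import numpy as np

# Forward-Euler simulation of the closed loop (6):
#   x'(t) = sum_{i,j} mu_i mu_j [A_i x(t) + A_di x(t-d) + B_i K_j x(t-tau)]
# Signs of the gains and the membership function are assumptions.
A  = [np.array([[-30.0, 0.0], [0.0, -20.0]]), np.array([[3.0, 12.0], [1.0, 0.0]])]
Ad = [np.array([[-2.0, 0.5], [0.5, -2.0]]), np.array([[-3.0, 1.0], [0.1, -1.0]])]
B  = [np.array([[1.0], [-2.0]]), np.array([[0.0], [1.0]])]
K  = [np.array([[-3.4529, -1.6837]]), np.array([[-8.6183, -4.3602]])]
d, tau, dt, T = 0.2, 0.5, 1e-3, 2.0

nd, ntau = int(d / dt), int(tau / dt)
hist = [np.array([2.0, 0.5])] * (max(nd, ntau) + 1)  # constant pre-history psi
x = hist[-1].copy()
for _ in range(int(T / dt)):
    xd, xtau = hist[-nd], hist[-ntau]            # delayed states x(t-d), x(t-tau)
    m1 = 1.0 / (1.0 + x[0] ** 2)                 # assumed membership mu_1(z(t))
    mu = np.array([m1, 1.0 - m1])
    dx = np.zeros(2)
    for i in range(2):
        for j in range(2):
            dx += mu[i] * mu[j] * (A[i] @ x + Ad[i] @ xd + (B[i] @ K[j]) @ xtau)
    x = x + dt * dx
    hist.append(x.copy())
print(np.linalg.norm(x))
```

Only the finite-time-style boundedness of the trajectory over [0, T] is checked here; reproducing the exact curves of Figures 2-3 would require the paper's true signs and membership functions.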
4. Conclusion
In this paper, by introducing the Lyapunov functional approach and a new finite-time stability analysis, a finite-time stabilization condition has been obtained. Based on this condition, the state feedback fuzzy controller has been designed using LMIs.
Acknowledgments
This work was supported by Anyang Normal University Innovation Foundation Project
under Grant ASCX/2016-Z113.
References
[1] Y. Xia, Y. Gao, Recent progress in networked control systems-a survey, International Journal of
Automation and Computing, 12(2015), 343-367.
[2] G. Chen, Q. Lin, Finite-time observer based cooperative tracking control of networked large range
systems, Abstract and Applied Analysis, 2014, Article ID 135690.
[3] B. Chen, W. Zhang, Distributed fusion estimation with missing measurements, random transmission
delays and packet dropouts. IEEE Transactions on Automatic Control, 59(2014), 1961-1967.
[4] J. Chen, H. Zhu, Finite-time H∞ filtering for a class of discrete-time Markovian jump systems with partly
unknown transition probabilities. International Journal of Adaptive Control and Signal Processing,
28(2014), 1024-1042.
[5] H. Gao, T. Chen, J. Lam, A new delay system approach to network-based control, Automatica, 44(2008),
39-52.
[6] G. C. Walsh, H. Ye, L G. Bushnell, Stability analysis of networked control systems, IEEE Trans on
Control Systems Technology, 10(2002), 438-446.
[7] S. Hu, Q. Zhu, Stochastic optimal control and analysis of stability of networked control systems with
long delay, Automatica, 39(2003),1877–1884.
[8] D. Yue, Q. L. Han, and J. Lam, Network-based robust H∞ control of a system with uncertainty,
Automatica, 41(2005), 999-1007.
[9] Z. H. Guan, J. Huang, G. R. Chen, Stability Analysis of Networked Impulsive Control Systems, Proc. 25th
Chinese Control Conference, 2006, 2041-2044.
[10] Y. Tian, Z. Yu, Multifractal nature of network induced time delay in networked control systems,
Physics Letter A, 361(2007), 103-107.
[11] G. C. Walsh, O. Beldiman, L. G. Bushnell, Asymptotic behavior of nonlinear networked control
systems, IEEE Transactions on Automatic Control, 46(2001), 1093–1097.
[12] D. Nesic, Observer design for wired linear networked control systems using matrix inequalities,
Automatica, 44(2008), 2840-2848.
[13] S. He, H. Xu, Non-fragile finite-time filter design for time-delayed Markovian jumping systems via T-S
fuzzy model approach, Nonlinear Dynamic, 80(2015), 1159-1171.
[14] D. Huang, S. Kiong, State feedback control of uncertain networked control systems with random time
delays, IEEE Transactions on Automatic Control, 53(2008), 829-834.
[15] F. Amato, M. Ariola, P. Dorate, Finite-time stabilization via dynamic output feedback, Automatica,
42(2006), 337-342.
[16] F. Amato, M. Ariola, C, Cosentino, Finite-time control of discrete- time linear systems: Analysis and
design conditions, Automatica, 46(2010), 919-924.
94 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-94
Introduction
The main idea of multiple attribute decision making (MADM) problems is to rank the
alternatives or choose the optimal solution. However, the available information is often imprecise or vague. In this case, a better solution is to use fuzzy numbers. Fuzzy theory [1] is able to address many decision problems that experts and decision makers struggle to respond to because of a lack of information. Over the years, many theories and applications have been proposed for solving FMADM problems [2-3]. To deal with these fuzzy situations, experts are usually encouraged to use the trapezoidal fuzzy number, which subsumes the triangular number and the interval number. At the same time, ranking fuzzy numbers [4-5] is very important in real-time decision-making applications.
Therefore, there is a need for a procedure that can rank fuzzy numbers under more conditions.
1 Corresponding Author: Zhi-Ying Lv, College of Mathematics, University of Electronic Science and Technology of China, Chengdu 611731, China; College of Management, Chengdu University of Information Technology; E-mail: lvZhiying1979@163.com.
Z.-Y. Lv et al. / A Trapezoidal Fuzzy Multiple Attribute Decision Making Based on Rough Sets 95
Ref. [6] gives a way to rank trapezoidal fuzzy numbers based on the circumcenter of centroids. This is a very practical method, which incorporates the importance of using the mode and spreads of fuzzy numbers.
Studies have found that correlations among the attributes seriously affect the scientific objectivity and fairness of the evaluation, so attribute reduction [7-8] is an essential subject in MADM. Usually, rough set theory is a useful tool for studying the attribute reduction problem. This theory was initiated by Pawlak in 1982 [9]. However, few studies have been conducted on the problem of attribute reduction in fuzzy decision making.
In this paper, a new FMADM method is presented, in which the distance between two trapezoidal fuzzy numbers is defined and a fuzzy-number attribute reduction method based on the TOPSIS method and rough sets [10] is proposed.
1. Preliminaries
In this section, we give the concepts of rough sets and trapezoidal fuzzy numbers and
their extensions.
Below, we briefly review the definition of the trapezoidal fuzzy number and the ranking method.
Definition 1. The membership function of a trapezoidal fuzzy number P̃ = (a, b, c, d; ω) is given by:
$$\mu_{\tilde P}(t)=\begin{cases}\dfrac{\omega\,(t-a)}{b-a}, & a\le t\le b\\[4pt] \omega, & b\le t\le c\\[4pt] \dfrac{\omega\,(d-t)}{d-c}, & c\le t\le d\\[4pt] 0, & \text{otherwise}\end{cases}$$

where −∞ < a ≤ b ≤ c ≤ d < ∞ and 0 ≤ ω ≤ 1. If ω = 1, then P̃ is normalized and can be denoted by P̃ = (a, b, c, d), as shown in Figure 1.
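The membership function above can be transcribed directly; the helper below is an illustrative sketch, not code from the paper.

```python
def trapezoid_membership(t, a, b, c, d, w=1.0):
    """Membership of the trapezoidal fuzzy number (a, b, c, d; w)."""
    if a <= t <= b:
        # rising edge; for a == b the trapezoid degenerates to a step
        return w if a == b else w * (t - a) / (b - a)
    if b <= t <= c:
        return w          # flat top of the trapezoid
    if c <= t <= d:
        return w if c == d else w * (d - t) / (d - c)
    return 0.0

# a number from the example matrix below: (0.7, 0.72, 0.82, 0.9)
print(trapezoid_membership(0.71, 0.7, 0.72, 0.82, 0.9))  # midway up the rising edge
```

Setting b = c recovers a triangular number and a = b, c = d an interval, which is why the trapezoidal form subsumes both.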
Given the fuzzy and rough theories described above, the proposed FMADM procedure is defined as follows:
Step 1. Construct the circumcenter-of-centroid matrix O = ((x_ij, y_ij)) of P̃.
Step 2. Construct the value matrix Q = (q_ij) of P̃.
Step 3. Determine the positive ideal and negative ideal solutions using the following steps:
$$\tilde p_j^{+}=\Big\{\tilde p_{ij}: i\in N,\ q_{ij}=\max_{i\in N} q_{ij}\Big\}\qquad\text{and}\qquad \tilde p_j^{-}=\Big\{\tilde p_{ij}: i\in N,\ q_{ij}=\min_{i\in N} q_{ij}\Big\} \tag{4}$$

Then,

$$A^{+}=\{p_1^{+},p_2^{+},\dots,p_m^{+}\}\qquad\text{and}\qquad A^{-}=\{p_1^{-},p_2^{-},\dots,p_m^{-}\} \tag{5}$$
Step 4. The distances between p̃_ij and the positive and negative ideal values are defined as:

$$d_{ij}^{+}=d(\tilde p_{ij},\tilde p_j^{+})=\sqrt{(x_j^{+}-x_{ij})^2+(y_j^{+}-y_{ij})^2},\qquad d_{ij}^{-}=d(\tilde p_{ij},\tilde p_j^{-})=\sqrt{(x_j^{-}-x_{ij})^2+(y_j^{-}-y_{ij})^2} \tag{6}$$

where (x_j^+, y_j^+) and (x_j^-, y_j^-) are the circumcenters of the centroids of p̃_j^+ and p̃_j^- respectively. Then calculate the similarity degrees t_ij between p̃_ij and the ideal solutions and construct the matrix T = (t_ij)_{m×n}, where

$$t_{ij}=\frac{d_{ij}^{-}}{d_{ij}^{+}+d_{ij}^{-}} \tag{7}$$
Step 5. Construct a judgment matrix M = (m_ij)_{8×6} from T = (t_ij)_{m×n}, where

$$m_{ij}=\begin{cases}0, & 0\le t_{ij}<0.3\\ 1, & 0.3\le t_{ij}<0.6\\ 2, & 0.6\le t_{ij}\le 1\end{cases} \tag{8}$$
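Steps 4 and 5 can be sketched as follows. This is an illustrative implementation: the ideal points below are stand-ins taken from the first column of the circumcenter matrix O, whereas the paper selects them through the value matrix Q, which is not reproduced in this excerpt.

```python
import math

def similarity_to_ideal(x, y, pos, neg):
    """t_ij of Eq. (7): closeness of circumcenter (x, y) to the positive
    ideal `pos` relative to the negative ideal `neg` (both (x, y) pairs)."""
    d_pos = math.hypot(pos[0] - x, pos[1] - y)   # Eq. (6), distance to A+
    d_neg = math.hypot(neg[0] - x, neg[1] - y)   # Eq. (6), distance to A-
    return d_neg / (d_pos + d_neg)

def grade(t):
    """m_ij of Eq. (8): coarse three-level judgment of t_ij."""
    return 0 if t < 0.3 else (1 if t < 0.6 else 2)

# illustrative ideal points for the first attribute column of O
pos, neg = (0.7600, 0.4155), (0.5767, 0.4159)
t = similarity_to_ideal(0.7400, 0.4146, pos, neg)
print(t, grade(t))  # close to the positive ideal, so grade 2
```

Applying the two functions entry by entry over O yields the matrices T and M used in the attribute reduction step.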
approximation of U about B is defined by apr(U) = {x_i : [x_i]_{R_B} ⊆ [x_i]_{C_R}, i ∈ N}; the approximate quality is then r_B^U = |apr(U)| / |U|. Because r_C^U = 1, if there exists c_k ∈ C such that r_{C−{c_k}}^U = 1, then c_k is a reduction of C.
Step 7. Give the weight vector ω = (ω₁, ω₂, ..., ω_t) of the set of all non-superfluous attributes, then calculate the values of all alternatives:

$$d_i=\sum_{j=1}^{t}\omega_j t_{ij}\qquad (i=1,2,\dots,n) \tag{9}$$

Then choose the best alternative based on the ranking of the values d_i.
In this section, we present an example to show how the given model works in practice.
A fuzzy multiple attribute decision with trapezoidal fuzzy number involves a company
making an investment decision. Let us consider an investment company, which wants
to make the best investment decision for a given sum of money.
There is a panel with eight possible alternatives U = {x1, x2, ..., x8} in which the company can invest. Each alternative is assessed on six attributes C = {c1, c2, ..., c6}. The decision makers compare these eight companies with respect to the attributes and then construct the decision matrix P̃ = (p̃_ij)_{8×6}, shown below:
x1: (0.7,0.72,0.75,0.8)  (0.4,0.45,0.6,0.63)  (0.7,0.72,0.82,0.9)  (0.5,0.5,0.64,0.72)  (0.18,0.19,0.2,0.21)  (0.09,0.1,0.14,0.17)
x2: (0.54,0.57,0.59,0.6) (0.5,0.52,0.6,0.63)  (0.5,0.62,0.62,0.7)  (0.5,0.5,0.54,0.6)   (0.18,0.19,0.2,0.21)  (0.09,0.09,0.098,0.1)
x3: (0.7,0.73,0.78,0.79) (0.5,0.52,0.6,0.63)  (0.6,0.72,0.8,0.9)   (0.8,0.85,0.9,0.92)  (0.21,0.23,0.25,0.27) (0.1,0.1,0.15,0.2)
x4: (0.6,0.63,0.66,0.73) (0.4,0.45,0.6,0.63)  (0.7,0.72,0.86,0.9)  (0.44,0.5,0.66,0.7)  (0.17,0.18,0.18,0.19) (0.09,0.12,0.15,0.18)
x5: (0.72,0.75,0.77,0.8) (0.7,0.73,0.81,0.83) (0.7,0.72,0.8,0.83)  (0.7,0.7,0.74,0.8)   (0.19,0.21,0.24,0.26) (0.1,0.16,0.18,0.22)
x6: (0.54,0.57,0.59,0.6) (0.4,0.46,0.5,0.56)  (0.7,0.75,0.8,0.92)  (0.4,0.5,0.54,0.62)  (0.18,0.19,0.2,0.21)  (0.1,0.12,0.13,0.13)
x7: (0.6,0.63,0.69,0.71) (0.5,0.52,0.7,0.74)  (0.41,0.45,0.5,0.51) (0.44,0.5,0.66,0.7)  (0.18,0.19,0.21,0.23) (0.12,0.18,0.21,0.22)
x8: (0.72,0.75,0.77,0.8) (0.5,0.52,0.6,0.63)  (0.71,0.72,0.86,0.9) (0.7,0.7,0.74,0.8)   (0.19,0.21,0.24,0.26) (0.1,0.16,0.18,0.22)
Step 1. Using Eq. (1), construct the circumcenter-of-centroid matrix O = ((x_ij, y_ij)):

x1: (0.7400,0.4146) (0.5217,0.3933) (0.7800,0.4036) (0.5833,0.3964) (0.2267,0.4161) (0.1233,0.4146)
x2: (0.5767,0.4159) (0.5617,0.4907) (0.6133,0.4135) (0.5300,0.4143) (0.1950,0.4165) (0.0943,0.4166)
x3: (0.7517,0.4137) (0.5617,0.4907) (0.7567,0.3991) (0.8700,0.4127) (0.2400,0.4158) (0.1333,0.4135)
x4: (0.6517,0.4138) (0.5217,0.3933) (0.7933,0.3975) (0.5767,0.3887) (0.1800,0.4166) (0.1350,0.4148)
x5: (0.7600,0.4155) (0.7683,0.4097) (0.7617,0.4097) (0.7300,0.4143) (0.2250,0.4153) (0.1667,0.4146)
x6: (0.5767,0.4159) (0.4800,0.4119) (0.7867,0.4085) (0.5167,0.4092) (0.1950,0.4165) (0.1217,0.4165)
x7: (0.6583,0.4123) (0.6133,0.3867) (0.4700,0.4134) (0.5767,0.3887) (0.2017,0.4160) (0.1867,0.4147)
x8: (0.7600,0.4155) (0.5617,0.4097) (0.7950,0.3983) (0.7300,0.4143) (0.2250,0.4153) (0.1667,0.4146)
Step 2. Based on Eq. (2), construct the value matrix Q = (q_ij)_{8×6} of P̃ as follows:
R_{C−{c4,c5}} = {{x1,x3}, {x2}, {x4}, {x5}, {x6}, {x7}, {x8}},
R_{C−{c3,c5}} = {{x1}, {x2,x4}, {x3}, {x5}, {x6}, {x7}, {x8}},
R_{C−{c2,c5}} = {{x1}, {x2}, {x3}, {x4}, {x5,x8}, {x6}, {x7}},
R_{C−{c5,c6}} = {{x1}, {x2}, {x3,x8}, {x4}, {x5}, {x6}, {x7}},
R_C = {{x1}, {x2}, {x3}, {x4}, {x5}, {x6}, {x7}, {x8}}.
Thus, r_{C−{c5}}(X) = 1; therefore, c5 is a reduction of C, and core(C) = {c1, c2, c3, c4, c6}. We can therefore delete the fifth column (corresponding to c5) from the matrix T.
Step 7. Let ω = {0.18, 0.24, 0.16, 0.23, 0.19} be the weight vector of {c1, c2, c3, c4, c6}. Then, using Eq. (9), the values of the alternatives are calculated as follows:
d1 = 0.0511, d2 = 0.0393, d3 = 0.0551, d4 = 0.0560, d5 = 0.0691, d6 = 0.0507, d7 = 0.0774, d8 = 0.0691.
Therefore, we can conclude that the most desirable alternative is x7 .
4. Conclusion
In this article, a new fuzzy attribute decision making method is proposed, in which the attribute values are trapezoidal fuzzy numbers. An attribute reduction method is proposed based on the distance defined between two trapezoidal fuzzy numbers and rough sets, which can improve the accuracy of the evaluation. In future research, the decision model presented in this paper will be extended to interval type-2 fuzzy values based on Ref. [10].
Acknowledgment
This paper is supported by the National Natural Science Foundation of China (Nos. 61673285, 61203285, 41601141); the Soft Science Project of the Science Department of Sichuan Province (2016ZR0095); the Soft Science Project of the technology bureau in Chengdu (2015-RK00-00241-ZF); the high-level research team of the major projects division of Sichuan Province (Sichuan letter [2015] no. 17-5); and the Project of Chengdu University of Information Technology (No. CRF201508, CRF201615).
References
[6] P.B. Rao and N.R. Shanker, Ranking fuzzy numbers with an area method using circumcenter of centroids. Fuzzy Information and Engineering, 1(2013): 3-18.
[7] Z.Y. Lv, T.M. Huang and F.X. Jin. Fuzzy multiple attribute lattice decision making method based on the elimination of redundant similarity index. Mathematics in Practice and Theory, 43(10)(2013): 173-181.
[8] X.Y. Zhang and D.Q. Miao, Quantitative/qualitative region-change uncertainty/certainty in attribute reduction, Information Sciences, 334-335(2016): 174-204.
[9] Z. Pawlak, Rough Sets. International Journal of Computer and Information Science, 11(1982): 341-356.
[10] L. Dymova, P. Sevastjanov and A. Tikhonenko, An interval type-2 fuzzy extension of the TOPSIS method using alpha-cuts. Knowledge-Based Systems, 83(2015): 116-127.
102 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-102
Introduction
Stock market investing is a high-risk activity with a potentially high reward, requiring
complex decision making based on imprecise and incomplete information under
uncertainty. Typically, two analytical approaches are utilized in investment decision
making: fundamental analysis and technical analysis. Decisions based on fundamental
analysis primarily consider the business entity represented by a stock. The information
under consideration includes the nature of the business, its profitability, its
competitiveness, and most importantly its financial standing through detailed study of
its financial statements. For technical analysis, a stock is treated separately from the
business entity. Only stock price movements and patterns generated by them are used
in making trading decisions. Technical analysis views price movements as being
governed by supply and demand of market participants and aims to exploit them.
This paper proposes a technical analysis-based method that applies fuzzy rule-
based inference on stock price momentum and market capitalization (company size),
with different sets of rules for different prevailing market conditions. The method was
tested on the Stock Exchange of Thailand.
1
Corresponding Author: Ratchata PEACHAVANISH, Department of Computer Science, Thammasat
University, Pathum Thani, Thailand; E-mail: rp@cs.tu.ac.th.
R. Peachavanish / Fuzzy Rule-Based Stock Ranking 103
1. Related Works
There is a large and diverse body of research literature on computerized stock market
investing. Techniques in soft computing, fuzzy logic, machine learning, and traditional
data mining have been applied to address various aspects of stock trading, utilizing
both fundamental analysis and technical analysis. Support vector machine and genetic
algorithm were applied to business financial data to perform stock selection that can outperform the market benchmark [1, 2]. Fuzzy logic was applied to stock price movements to time stock trades [3], to create a new technical indicator incorporating investor risk tendency [4], and to assist in portfolio management [5, 6]. Machine learning experiments on technical analysis-based trading conducted by [7] did not outperform the market benchmark once transaction costs were included. In addition, using
sentiment data obtained from social networks to assist in stock market investing has
also been attempted [8, 9]. A recent comprehensive review of works using evolutionary
computing methods can be found in [10].
Stock markets in different regions have different rules and characteristics. Highly developed and efficient markets, such as the New York Stock Exchange, differ greatly from emerging markets like the Stock Exchange of Thailand. In smaller markets, extreme price movements are more common, as a few well-funded participants can dictate the market direction in the short term and affect market volatility. This is especially true for market participants classified as foreign fund flows [11]. Lack of regulation and enforcement against insider trading in emerging markets like Thailand also makes the market inefficient and unfair [12]. These differences make
comparisons among research studies difficult. A working strategy under one market
environment may not be effective in another. Nevertheless, the industry-standard way
of judging an investment strategy is to compare the investment return against the
market index benchmark. Most mutual funds, in the long term, failed to outperform the
market [13]. The method proposed in this paper provides a superior investment return
to the market index. It is described in the next section.
2. Method
The strategy proposed in this paper is based on a key technical analysis principle: price moves in trends and has momentum. This momentum effect, which implies that a stock price tends to continue in its current direction due to inertia, has been observed in stock markets [14, 15]. Price reversal then occurs after the momentum weakens. According to this principle, buying stocks with strong upward momentum is likely to give superior results to buying stocks with weaker or downward momentum. The strategy is then to make trading decisions based on a technical indicator that reflects stock price momentum, which by definition is computed from past price series. This reactive approach makes no attempt to explicitly forecast future prices, but rather takes actions based on past price behavior.
Additionally, past evidence suggests that a company's market capitalization, or its size, also determines the characteristics of its stock returns [16]. In general, stocks of small companies (so-called "small-cap" stocks) tend to be far more volatile than those of large, established companies ("big-cap" stocks). This is simply due to the tendency of small companies to grow faster, albeit with higher risk. During a bull market, small-cap stocks as a group far outperform big-cap stocks. On the other hand, investors prefer
the relative safety of big-cap stocks during an economic downturn or a bear market.
To see how trading using momentum and market capitalization can provide additional returns above the market index, experiments were performed on Thai stocks spanning January 2012 to July 2016. The pool of stocks for the experiments comprised all constituents of the Stock Exchange of Thailand's SET100 index. These stocks are the 100 largest and most liquid stocks in the market (SET100 members are updated semiannually). These relatively large stocks are considered investment grade and are least susceptible to manipulation. The daily closing price data of the stocks were obtained from the SETSMART system [17]. The experiments were conducted using custom-written software implemented in the C# language and Microsoft SQL Server.
The momentum indicator used in the experiment was the Relative Strength Index (RSI) [18], a standard technical indicator widely used by stock traders for measuring the strength of stock price movements. The RSI is a bounded oscillating indicator calculated from the past n-period closing price series, as shown in Eq. (1).
$$RSI = 100 - \frac{100}{1 + \dfrac{U_n}{D_n}} \tag{1}$$

where, for each period i within the past n periods,

$$U_i=\begin{cases}close_i-close_{i-1}, & close_i>close_{i-1}\\ 0, & close_i\le close_{i-1}\end{cases}\qquad D_i=\begin{cases}close_{i-1}-close_i, & close_{i-1}>close_i\\ 0, & close_{i-1}\le close_i\end{cases}$$

and U_n and D_n are the averages of U_i and D_i over the n periods.
The RSI is effectively a ratio of average gain to average loss during a given past n consecutive trading periods. An RSI value is bounded between 0 and 100, where a value higher than 50 indicates upward momentum and a value lower than 50 indicates downward momentum. An extreme value on either end indicates an overbought or oversold condition, often used by traders to identify points of price reversal. For this experiment, the 60-day RSI was chosen.
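Eq. (1) with simple (unsmoothed) n-period averages can be sketched as follows. Wilder's original RSI uses exponential smoothing, so treat this as one common variant consistent with the average-gain/average-loss description above.

```python
def rsi(closes, n=14):
    """Relative Strength Index over the last n periods of `closes`,
    using simple averages of gains and losses as in Eq. (1)."""
    window = closes[-(n + 1):]                  # n changes need n+1 prices
    gains = [max(b - a, 0.0) for a, b in zip(window, window[1:])]
    losses = [max(a - b, 0.0) for a, b in zip(window, window[1:])]
    avg_gain, avg_loss = sum(gains) / n, sum(losses) / n
    if avg_loss == 0.0:
        return 100.0                            # every period closed higher
    return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)

prices = [10.0, 10.5, 10.3, 10.8, 11.0, 10.9, 11.4]
print(rsi(prices, n=6))  # 85.0: strong upward momentum (well above 50)
```

In the experiments the same computation would simply be run with n = 60 over daily closes.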
For trading, the portfolio was given 100 million Thai Baht of cash for the initial
stock purchase. The algorithm selected a quartile of 25 stocks from the pool of 100
stocks ranked by 60-day RSI. They were then purchased on an equal weight basis using
all available cash and held on to for 20 trading days (one month). The process was then
repeated – the algorithm chose a new group of stocks and the portfolio was readjusted
to hold on only to them. Trading commission fees at retail rate were incorporated into
the experiments.
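The monthly selection step just described can be sketched as follows. The symbol names and whole-share rounding are illustrative, and commissions and the 20-day holding logic are omitted; this is not the paper's C# implementation.

```python
def rebalance(pool_rsi, cash, prices, top_k=25):
    """One monthly rebalance: rank the pool by 60-day RSI, keep the top
    `top_k` names, and split the cash equally among them (whole shares).
    `pool_rsi` maps symbol -> RSI value; `prices` maps symbol -> close."""
    ranked = sorted(pool_rsi, key=pool_rsi.get, reverse=True)[:top_k]
    budget = cash / len(ranked)                        # equal-weight basis
    return {sym: budget // prices[sym] for sym in ranked}

# toy pool of 4 symbols, selecting the top 2 by momentum
pool = {"AAA": 72.0, "BBB": 55.0, "CCC": 61.0, "DDD": 48.0}
px = {"AAA": 10.0, "BBB": 20.0, "CCC": 25.0, "DDD": 8.0}
print(rebalance(pool, 1000.0, px, top_k=2))  # AAA and CCC are selected
```

Repeating this step every 20 trading days, with the portfolio liquidated into cash between steps, reproduces the experiment's structure.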
Similarly, the same 100 stocks, this time ranked by market capitalization, were divided into four quartiles for the algorithm to choose from. However, since the weight distribution of stocks in the market is nonlinear, each of the four quartiles contained a different number of stocks: the first quartile comprised the 4 largest stocks in the market, the second quartile comprised the next 8 largest stocks, the third quartile comprised the next 16 largest stocks, and the last quartile comprised the remaining 72 stocks. In other words, each quartile carries approximately the same weight when the market capitalizations of its component stocks are summed.
The results of the experiments are shown in Table 1. Monthly trading based on the 60-day RSI momentum indicator significantly outperformed the market index. Small-cap stocks outperformed big-cap stocks.
Table 1. Portfolio returns based on monthly trading using momentum and market capitalization, compared to
the return of the SET100 market index benchmark.
Group              By Momentum    By Market Capitalization
First Quartile     126.61 %        9.40 %
Second Quartile     68.82 %       29.82 %
Third Quartile      32.12 %       76.96 %
Fourth Quartile     -5.29 %       65.31 %
SET100              40.40 %       40.40 %
Experiments using momentum and market capitalization provided the basis for
stock selection: buy small-cap stocks with high momentum. However, this strategy
does not work during market downtrends. While small-cap stocks as a group outperform
the market during normal times, they severely underperform during market downtrends
due to their lower liquidity. In addition, high momentum indicates that a stock may be
overbought, with a much greater chance of a sudden and strong price reversal.
Price momentum, company size as measured by market capitalization, and
prevailing market condition are the three dimensions that influence stock price
behavior. Each has inherently vague and subjective degrees of measure and so fuzzy
logic [19] is an appropriate tool to assist in the decision-making process. For the
proposed method, fuzzy rules were constructed based on these three factors with
membership functions shown in Figure 1 and fuzzy rule matrix shown in Figure 2. The
60-day RSI indicator was used to indicate both the momentum of stocks and the
prevailing market condition (bull market is characterized by a high RSI value, and vice
versa). There were three linguistic values expressing the momentum – “Weak”,
“Moderate”, and “Strong”, with a typical non-extreme 60-day RSI value ranging
between 40 and 60. For company size, relative ranking of market capitalization was
used instead of the absolute market capitalization of a company. The largest 50 stocks
out of 100 were considered “Large” and “Mid”, with overlapping fuzzy memberships.
The remaining half was considered “Mid” and “Small”, also with overlapping fuzzy
memberships. For output, there were five levels of stock purchase ratings in linguistic
terms: “Strong Buy” (SB), “Buy” (B), “Neutral” (N), “Sell” (S), and “Strong Sell” (SS),
having overlapping numerical scoring ranges between 0 and 10.
Figure 1. Fuzzy membership functions for momentum as measured by RSI (left), company size as measured
by market capitalization (middle), and purchase rating of stock (right).
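The overlapping membership functions of Figure 1 can be sketched with simple triangular memberships. The break-points below are assumptions read off the stated axis ranges; the paper's exact shapes may differ.

```python
def tri(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Assumed break-points for the three momentum terms over the RSI range
# 40-60, chosen to overlap as in Figure 1 (illustrative values only).
momentum = {
    "Weak":     lambda r: tri(r, 35, 40, 50),
    "Moderate": lambda r: tri(r, 40, 50, 60),
    "Strong":   lambda r: tri(r, 50, 60, 65),
}
```

With overlapping triangles, an intermediate RSI such as 45 belongs partly to "Weak" and partly to "Moderate", which is what lets the rule base blend adjacent conditions smoothly.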
Mamdani-type [20] fuzzy inference was used to determine stock purchase rating.
For each rule, the intersection between antecedents was evaluated. Consequents of
rules were then combined using Root-Sum-Square method and the Center of Gravity
defuzzification process was performed to obtain the final crisp stock purchase rating.
The Fuzzy Framework [21] C# library was used to implement the fuzzy logic rule-
based algorithm.
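The Root-Sum-Square combination and Center of Gravity defuzzification can be sketched as follows, using a singleton approximation of the output terms. The term centers are hypothetical values on the 0-10 rating scale, not the paper's.

```python
import math

def rss_combine(strengths):
    """Root-Sum-Square combination of the firing strengths that share
    one output term, used here in place of plain max-aggregation."""
    return math.sqrt(sum(w * w for w in strengths))

def centroid(term_strengths, term_centers):
    """Crisp purchase rating as the centre of gravity of the combined
    output terms (singleton approximation of the CoG defuzzifier)."""
    num = sum(term_strengths[t] * term_centers[t] for t in term_strengths)
    den = sum(term_strengths.values())
    return num / den if den else 0.0

# Hypothetical centres of the five rating terms on the 0-10 scale.
CENTERS = {"SS": 1.0, "S": 3.0, "N": 5.0, "B": 7.0, "SB": 9.0}
```

In a full Mamdani pipeline, each rule's antecedent memberships are first intersected (min), rules sharing a consequent are combined with `rss_combine`, and `centroid` then yields the final crisp rating.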
Figure 2. Fuzzy rules for different market conditions as measured by momentum (RSI): weak market (left),
moderate market (middle), and strong market (right).
During strong market conditions, money should be allocated first to small-cap
stocks with strong momentum and second to mid-cap stocks, also with strong
momentum. During weak market conditions, small-cap stocks should be avoided and
priority should be given to big-cap stocks with strong momentum. For moderate market
conditions, the desirability of a stock was decided by its momentum.
Portfolio readjustments were performed in the same manner as in the previous
experiments. The algorithm chose the top quartile of stocks with the best purchase
ratings computed from the fuzzy rules. The portfolio returned 161.76%, better than
the best return from the experiment using momentum alone (126.61%) or market
capitalization alone (76.96%). The fuzzy rule-based approach also outperformed both
the SET100 index benchmark (40.40%) and one of the best actively-managed mutual
funds in the industry ("BTP" by BBL Asset Management Co., Ltd., at 124.43%). The
results are shown in Figure 3.
Figure 3. Investment returns by algorithms: best result from momentum-only strategy (126.61%), best result
from market capitalization-only strategy (76.96%), and fuzzy rule-based method (161.76%). Returns of the
SET100 index benchmark and “BTP” mutual fund are shown for comparison.
This paper proposes a method that uses fuzzy rule-based inference to rank stocks based
on a combination of price momentum, company’s market capitalization, and prevailing
market condition. The method yields a superior return to both the market index
benchmark and an industry-leading mutual fund. The method can be further
improved in the future by incorporating the ability to hold cash during market
downturns. Additionally, short-term indicators may also be used to detect imminent
weakening or strengthening of momentum – information that is potentially useful in
making trading decisions.
References
[1] H. Yu, R. Chen, and G. Zhang, A SVM stock selection model within PCA, 2nd International Conference on
Information Technology and Quantitative Management, 2014.
[2] C. Huang, A hybrid stock selection model using genetic algorithms and support vector regression,
Applied Soft Computing, 12 (2012), 807-818.
[3] C. Dong, F. Wan, A fuzzy approach to stock market timing, 7th International Conference on Information,
Communications and Signal Processing, 2009.
[4] A. Escobar, J. Moreno, S. Munera, A technical analysis indicator based on fuzzy logic, Electronic Notes
in Theoretical Computer Science 292 (2013), 27-37.
[5] K. Chourmouziadis, P. Chatzoglou, An intelligent short term stock trading fuzzy system for assisting
investors in portfolio management, Expert Systems with Applications, 43 (2016), 298-311.
[6] M. Yunusoglu, H. Selim, A fuzzy rule based expert system for stock evaluation and portfolio
construction: An application to Istanbul Stock Exchange, Expert Systems with Applications, 40 (2013),
908-920.
[7] A. Andersen, S. Mikelsen, A novel algorithmic trading framework applying evolution and machine
learning for portfolio optimization, Master’s Thesis, Norwegian University of Science and Technology,
2012.
[8] J. Bollen, H. Mao, X. Zeng, Twitter mood predicts the stock market, Journal of Computational Science, 2
(2011), 1-8.
[9] L. Wang, Modeling stock price dynamics with fuzzy opinion networks, IEEE Transactions on Fuzzy
Systems, (in press).
[10] Y. Hu, K. Liu, X. Zhang, L. Su, E. W. T. Ngai, M. Liu, Application of evolutionary computation for
rule discovery in stock algorithmic trading: a literature review, Applied Soft Computing, 36 (2015), 534-
551.
[11] C. Chotivetthamrong, Stock market fund flows and return volatility, Ph.D. Dissertation, National
Institute of Development Administration, Thailand, 2014.
[12] W. Laoniramai, Insider trading behavior and news announcement: evidence from the Stock Exchange
of Thailand, CMRI Working Paper, Thai Stock Exchange of Thailand, 2013.
[13] C. Mateepithaktham, Equity mutual fund fees & performance, SEC Working Papers Forum, The
Securities and Exchange Commission, Thailand, 2015.
[14] N. Jegadeesh, S. Titman. Returns to buying winners and selling losers: implications for stock market
efficiency, Journal of Finance, 48 (1993), 65-91.
[15] R. Peachavanish, Stock selection and trading based on cluster analysis of trend and momentum
indicators, International MultiConference of Engineers and Computer Scientists, 2016.
[16] T. Bunsaisup, Selection of investment strategies in Thai stock market, Working Paper, Capital Market
Research Institute, Thailand, 2014.
[17] SETSMART (SET market analysis and reporting tool), http://www.setsmart.com.
[18] J. Welles Wilder, New concepts in technical trading systems, Trend Research, 1978.
[19] L. Zadeh, Fuzzy sets, Information and Control, 8 (1965), 338-353.
[20] E. Mamdani, S. Assilian, An experiment in linguistic synthesis with a fuzzy logic controller,
International Journal of Man-Machine Studies, 7 (1975), 1-13.
[21] Fuzzy Framework, http://www.codeproject.com/Articles/151161/Fuzzy-Framework.
108 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-108
Abstract. To control a 2-DOF joint robot, a 3D robot model is first established in
ADAMS, and the dynamic equation of the robot is then derived using the obtained
parameters. The dynamic model is combined with the control system model in
MATLAB/Simulink through the ADAMS/Controls module to establish a co-simulation
system. To eliminate the effect of modeling error and uncertainty signals, a
sliding-mode control is proposed. In this method, a linear sliding surface ensures
that the system reaches equilibrium on the sliding surface in finite time, and
fuzzy control compensates for the modeling error and uncertainty signals. The
equivalent control law and the switching control law are derived using the Lyapunov
stability criterion and the exponential reaching law. The fuzzy control law and
membership functions are established from fuzzy control rules. Through online
adaptive fuzzy learning, chattering (buffeting) is weakened. Simulation results
show that the control method is effective.
Introduction
To achieve accurate control of multi-joint robot systems subject to modeling
errors and uncertainty signals, many effective methods have been proposed. The
development of robot control theory has gone through three stages: traditional control,
modern control, and intelligent control. Traditional control theory mainly includes
PID control, feed-forward control, and so on; modern control theory mainly includes
robust control, sliding-mode control, and so on; intelligent control theory mainly
includes fuzzy control, neural network control, adaptive control, etc. [1-2]. Robot
control is divided into point-to-point (PTP) control and trajectory tracking control (or
continuous path control, CP). Point-to-point control only requires that the end effector
of the robot move from one point to another, without taking the motion
trajectory into account. In robot trajectory tracking control, the driving torque of each
joint is given so that the position, velocity, and other state variables of the robot track a
known ideal trajectory, which must be strictly controlled along the entire trajectory [3-6].
In recent years, fuzzy control and sliding-mode control have attracted more and
more attention for their strong robustness. In sliding-mode control, designing a stable
sliding surface ensures that the control system runs into
1 Corresponding Author: Jie YANG, School of Electrical Engineering, Qingdao University, 308
Ningxia Rd, Qingdao, Shandong 266071, China; E-mail: jackiey69@sina.com.
H. Niu et al. / Adaptive Fuzzy Sliding-Mode Control of Robot and Simulation 109
the surface from any initial state within a limited time and then moves near the
equilibrium point on the surface. However, the problem of chattering (buffeting) still
exists in the control system, and in this method the upper bound of the modeling error
and uncertainty signal must be known in advance, which is hard to obtain in actual
robot control [7]. Fuzzy control overcomes these deficiencies and is an effective way
to eliminate the chattering of a sliding-mode control system; its strong adaptive
learning capability can also be used to weaken the uncertain signal. Therefore,
combining sliding-mode control with fuzzy control is used to implement trajectory
tracking control, which ensures the stability and effectiveness of the control system.
The first part of this paper introduces the establishment of the 3D model and the
derivation of the dynamic equation of the robot; the second part introduces the design
of the sliding-mode control system; the third part introduces the design of the adaptive
fuzzy control; the fourth part presents the simulation experiment and simulation results
of the robot control system; a brief summary concludes the paper. These have a certain
reference value for future robot control.
First, the 3D model of the robot is established in the ADAMS/View module. The
robot has two arms and can realize 2-DOF rotary motion in the YOZ plane. The length
of each robotic arm is set to 0.225 m and the mass of each arm to 0.03 kg, as shown in
Figure 1.
The coordinates of the center of mass of the second link are x2 = 0.3375, y2 = 0.39,
z2 = 0. The inertial parameters of the robot are I_xx = 0.1732 and I_yy = 0.1588. The
Coriolis parameters are C_111 = C_212 = C_221 = C_222 = 0, C_112 = C_121 = C_122 =
0.0026325 cos q2 − 0.0022725 sin q2, and C_211 = −(0.0026325 cos q2 − 0.0022725 sin q2);
G(q) = [G1, G2]^T is the gravity matrix, with G1 = G2 = 0. U(t) is the modeling error
and uncertainty signal; it is generally set to the same form as the input signal, with an
amplitude of 2%–5% of the input signal [8].
The purpose of trajectory tracking control of the robot is to make the joint position
vector consistent with the desired joint angular displacement as much as possible [9-10].
Therefore, the sliding-mode surface is designed as Eq. (2):

s = ė + De    (2)

In Eq. (2), D is the constant of the sliding-mode surface, e = q − q_r is the tracking
error, and ė = q̇ − q̇_r is the derivative of the tracking error. The exponential reaching
law of the sliding-mode control is designed as ṡ = −M s/|s| − Ks, with M, K > 0.

Combining Eq. (2) with the reaching law yields Eq. (3):

τ = u_eq + u_vss    (3)

In Eq. (3):

u_eq = M(q)q̈_r + C(q, q̇)q̇ + G(q) + U(t) − D M(q)ė,   u_vss = −M M(q) s/|s| − K M(q) s;

K > ‖U(t)‖ + η, where η is any small positive number; M and K are the parameters of
the exponential reaching law.
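A scalar numerical sketch of the exponential reaching law ṡ = −M s/|s| − Ks shows that s reaches a small neighbourhood of the sliding surface in finite time. The parameter values and the step size are illustrative choices, not values from the paper.

```python
# Scalar sketch of the exponential reaching law s_dot = -M*sign(s) - K*s.
# M, K, dt and the threshold are illustrative values, not from the paper.
def simulate_reaching(s0, M=0.5, K=2.0, dt=1e-3, t_max=10.0):
    """Integrate the reaching law with forward Euler and return the
    time at which |s| first drops below a small threshold."""
    s, t = s0, 0.0
    while abs(s) > 1e-3 and t < t_max:
        sign = 1.0 if s > 0 else -1.0
        s += dt * (-M * sign - K * s)   # constant-rate term + exponential term
        t += dt
    return t
```

The constant term −M sign(s) guarantees a finite reaching time even for small s, while the −Ks term speeds up convergence when s is large; for these parameters the analytic reaching time from s0 = 1 is (1/K) ln(1 + K|s0|/M) ≈ 0.8 s.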
In multi-joint robot systems, the effect of modeling error and uncertainty signals
always exists. Therefore, combining sliding-mode control with fuzzy control is usually
used to weaken this effect, which ensures the stability and effectiveness of the control
system. Fuzzy reasoning is used to establish the fuzzy rules. The fuzzy sets are defined
as shown in Table 1:
Table 1. Fuzzy rules (rows: s; columns: ṡ).
 s \ ṡ   NB   NM   NS   ZO   PS   PM   PB
 PB      NB   NB   NM   PM   PB   PB   PB
 PM      NB   NM   NS   PS   PM   PB   PB
 PS      NM   NS   NS   PS   PS   PM   PB
 ZO      NM   NS   NS   ZO   PS   PS   PM
 NS      PB   PM   PS   PS   NS   NS   NM
 NM      PB   PB   PM   PS   NS   NM   NB
 NB      PB   PB   PB   PM   NM   NB   NB
In Table 1, NB represents negative big, NM negative medium, NS negative small,
ZO zero, PS positive small, PM positive medium, and PB positive big. The fuzzy rules
take the IF-THEN form:

R_m: IF s is A AND ṡ is B THEN the output is C
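Table 1 can be encoded directly as a lookup table. The helper below is our illustration of how a rule consequent is read off for a pair of linguistic values.

```python
# Table 1 encoded as a lookup: row = linguistic value of s,
# column = linguistic value of s_dot.
COLS = ["NB", "NM", "NS", "ZO", "PS", "PM", "PB"]
RULES = {
    "PB": ["NB", "NB", "NM", "PM", "PB", "PB", "PB"],
    "PM": ["NB", "NM", "NS", "PS", "PM", "PB", "PB"],
    "PS": ["NM", "NS", "NS", "PS", "PS", "PM", "PB"],
    "ZO": ["NM", "NS", "NS", "ZO", "PS", "PS", "PM"],
    "NS": ["PB", "PM", "PS", "PS", "NS", "NS", "NM"],
    "NM": ["PB", "PB", "PM", "PS", "NS", "NM", "NB"],
    "NB": ["PB", "PB", "PB", "PM", "NM", "NB", "NB"],
}

def rule_output(s_term, sdot_term):
    """Consequent term of the rule 'IF s is s_term AND s_dot is sdot_term'."""
    return RULES[s_term][COLS.index(sdot_term)]
```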
For the 2-DOF robot, it is assumed that the upper bound of the modeling error and
uncertainty signal is |U_i(t)| ≤ L_i. The optimal approximation parameter of the
adaptive law is

θ_i* = arg min_{θ_i ∈ R} [ sup | u_fi(x_i | θ_i) − (L_i + η) sign(s) | ],

and the adaptive error is θ̃_i = θ_i − θ_i*. Substituting into the derivative of the
Lyapunov function gives Eq. (5):

V̇ ≤ Σ_{i=1}^{2} [ −(L_i + η)|s_i| + U_i(t)s_i + ε_i|s_i| ] ≤ 0    (5)

The result of Eq. (5) shows that the control system has global stability.
4. Simulation Experiment
5. Conclusions
To address the position control of a 2-DOF joint robot subject to modeling error
and uncertainty signals in the control system, an adaptive fuzzy sliding-mode control is
proposed. A simulation experiment is conducted in MATLAB and ADAMS, and the
result of the adaptive fuzzy sliding-mode control is compared with that of PD control.
The simulation results show that the adaptive fuzzy sliding-mode control is effective
and robust, with no obvious chattering in the control system, and that its trajectory
tracking is more effective than that of PD control. The control policy is therefore
practically operable, and this study provides theoretically valuable guidance for practice.
Acknowledgment
This work is supported by the Science & Technology Project of College and University
in Shandong Province (J15LN41).
References
[1] J. X. Lv, Y. H. Li, X. Z. Wang, X. L. Bao, Mechanical structure optimization and power fuzzy control
design of picking robot end effector, Journal of Agricultural Mechanization Research, 38(2016): 36-40.
[2] S. H. Ju, Y. M. Li, Research on nonholonomic mobile robot based on self-adjusting universe fuzzy
control, Electronic Design Engineering, 24(2016), 103-106.
[3] Z. M. Ju, Fuzzy control is applied to wheel type robot target tracking, Computer Measurement &
Control, 22(2014): 614-616.
[4] J. L. Zhang, Comprehensive obstacle avoidance system based on the fuzzy control for cleaning robot,
Machine Tool & Hydraulics, 18(2014): 92-95.
[5] Z. B. Ma, Self-adjusting parameter fuzzy control for self-balancing two-wheel robots, Techniques of
Automation and Applications, 33(2014): 9-13.
[6] S. B. Hu, M. X. Lu, Fuzzy integral sliding mode control for three-links spatial robot, Computer
Simulation, 20(2012): 162-166.
[7] L. Lin, H. R. Wang, Y. N. Hu, Fuzzy adaptive sliding mode control for trajectory tracking of uncertain
robot based on saturated function, Machine Tool& Hydraulics, 36(2008): 137-140.
[8] C. Z. Xu, Y. C. Wang, Nonsingular terminal fuzzy sliding mode control for multi-link robots based on
back stepping, Electrical Automation, 34(2012): 8-9.
[9] W. D. Gao, Y. M. Fang, W. L. Zhang, Application of adaptive fuzzy sliding mode control to
servomotor system, Small& Special Electrical Machines, 37(2009): 32-36.
[10] T. W. Wu, Y. S. Yang, Research on simulation of adaptive sliding-mode guidance law, Modern
Electronics Technique, 34(2011): 23-25.
Fuzzy Systems and Data Mining II 115
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-115
Abstract. In this paper, by combining the hesitant fuzzy set with the bipolar-valued
fuzzy set, the concept of the hesitant bipolar fuzzy set is introduced, and a hesitant
bipolar fuzzy group decision making method based on TOPSIS is proposed. Our
study is the first to integrate fuzziness, hesitation, and incompatible bipolarity in a
multiple criteria decision making method. An illustrative case of chemical project
evaluation also demonstrates the feasibility, validity, and necessity of the proposed method.
Keywords. Fuzzy set, Bipolar-valued fuzzy set, Hesitant fuzzy set, Multiple criteria
decision making, Incompatible bipolarity
Introduction
As an extension of the fuzzy set [1], the hesitant fuzzy set (HFS) was introduced by
Torra and Narukawa to describe the case in which the membership degree of an element
to a given set has a few different values, which arises from the hesitation the decision
makers hold [2]. A growing number of studies focus on the HFS, and some extensions
have been presented, such as the interval-valued HFS [3], the possible-degree
generalized HFS [4], and the linguistic HFS [5].
On the other hand, in recent years, incompatible bipolarity has attracted researchers'
attention, and some instructive results have been devoted to it [6,7]. In fact,
incompatible bipolarity is inevitable in the real world. Consider the example of the
psychological disease bipolar disorder: a patient suffering from bipolar disorder has
episodes of mania and depression, and the two poles may simultaneously reach extreme
cases, i.e., the sum of the positive pole value and the negative pole value is bigger than 1.
The bipolar-valued fuzzy set (BVFS) has been pointed out to be suitable for handling
incompatible bipolarity [8,9].
The aforementioned HFS and its extensions cannot accommodate incompatible
bipolarity. Considering that the BVFS is adept at modeling incompatible bipolarity, by
combining the BVFS with the HFS, the hesitant bipolar fuzzy set (HBFS) is introduced
in this paper, and a hesitant bipolar fuzzy multiple criteria group decision making
(MCGDM) method based on TOPSIS [10] is presented. Our study is the first to
accommodate fuzziness, hesitation, and incompatible bipolarity in fuzzy set theory and
multiple criteria decision making.
The rest of the paper is structured as follows. In Section 1, some related notions are
reviewed, the concept of the HBFS is introduced, and some related properties are
discussed. In Section 2, a hesitant bipolar fuzzy group decision making method based
on TOPSIS is presented. In Section 3, an illustrated case of chemical project evaluation
shows the feasibility, validity, and necessity of the theoretical results obtained. Finally,
the paper is concluded in Section 4.
1 Corresponding Author: Ying Han, B-DAT & CICAEET, Nanjing University of Information Science and
Throughout the paper, denote I^P = [0, 1] and I^N = [−1, 0]. The set X always
represents the finite universe of discourse.
In this section, firstly, some related notions are reviewed. Then, the concept of HBFS is
introduced and some related properties are discussed.
In [2], Torra and Narukawa suggested the concept of the HFS, which permits the
membership degree of an element to a set to be presented as several possible values in
I^P. In [11], the bipolar-valued fuzzy set B in X is defined as B = {< x, B(x) =
(B^P(x), B^N(x)) > | x ∈ X}, where the functions B^P : X → I^P, x ↦ B^P(x), and
B^N : X → I^N, x ↦ B^N(x), define the satisfaction degree of the element x ∈ X to the
corresponding property and to the implicit counter-property of the BVFS B in X, respectively.
Denote L = {α = (α^P, α^N) | α^P ∈ I^P, α^N ∈ I^N}; then α is called a bipolar-valued
fuzzy number (BVFN) in [9]. For any α = (α^P, α^N) and β = (β^P, β^N), the preference
order relation is defined as α ≤ β if and only if α^P ≤ β^P and α^N ≤ β^N. This
preference order relation is partial. Denote α^M = (α^P + α^N)/2; if α ≤ β, then
α^M ≤ β^M, so all the BVFNs can be ranked according to their mediation values [9].
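The mediation-value ranking can be sketched as follows; the example shows two BVFNs that are incomparable under the partial order yet still rankable by their mediation values. BVFNs are modeled as plain `(aP, aN)` tuples, an implementation choice of ours.

```python
def mediation(alpha):
    """Mediation value of a BVFN alpha = (aP, aN), aP in [0,1], aN in [-1,0]."""
    aP, aN = alpha
    return (aP + aN) / 2.0

def leq(alpha, beta):
    """Partial preference order: alpha <= beta componentwise."""
    return alpha[0] <= beta[0] and alpha[1] <= beta[1]

# (0.8, -0.7) and (0.6, -0.2) are incomparable under the partial order,
# but their mediation values 0.05 and 0.2 still give a total ranking.
```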
Next, the concept of the HBFS is introduced, accommodating fuzziness, hesitation,
and incompatible bipolarity in fuzzy set theory for the first time.
Definition 1 The hesitant bipolar fuzzy set in X is defined as Ã = {< x, h̃_Ã(x) > | x ∈ X},
where h̃_Ã(x) is a set of some different BVFNs in L, representing the possible bipolar
membership degrees of the element x ∈ X to the set Ã. For convenience, h̃_Ã(x) is called
a hesitant bipolar fuzzy element (HBFE), the basic unit of the HBFS.
Inspired by the work on the HFS by Xia et al. [12], for an HBFE h̃_Ã(x) it is
necessary to arrange the BVFNs in h̃_Ã(x) in increasing order according to the
mediation value. Suppose that l(h̃_Ã(x)) stands for the number of BVFNs in the HBFE
h̃_Ã(x) and h̃_Ã^{σ(j)}(x) is the jth largest BVFN in h̃_Ã(x). Given two different HBFSs
Ã, B̃ in X, denote l_x = max{l(h̃_Ã(x)), l(h̃_B̃(x))}. If l(h̃_Ã(x)) ≠ l(h̃_B̃(x)), then the
shorter one should be extended by adding its largest value until it has the same length
as the longer one.
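The ordering-and-padding convention for HBFEs can be sketched as follows; BVFNs are again modeled as `(aP, aN)` tuples, and the function names are ours.

```python
def sort_hbfe(h):
    """Arrange the BVFNs in an HBFE in increasing mediation order."""
    return sorted(h, key=lambda a: (a[0] + a[1]) / 2.0)

def align(h1, h2):
    """Pad the shorter HBFE with its largest BVFN until both have
    the same length, as required before comparing two HBFEs."""
    h1, h2 = sort_hbfe(h1), sort_hbfe(h2)
    while len(h1) < len(h2):
        h1 = h1 + [h1[-1]]   # repeat the largest (last) BVFN
    while len(h2) < len(h1):
        h2 = h2 + [h2[-1]]
    return h1, h2
```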
In the following, all HBFSs in X are denoted by F̃(X). An HBFE is denoted by h̃
for simplicity, and the set of all h̃ is denoted by L̃. The preference order relation in L̃
is defined in the following definition.

Definition 2 Let h̃_1, h̃_2 ∈ L̃. Define the preference order relation in L̃ as follows:
h̃_1 ≤ h̃_2 if and only if (h̃_1^{σ(j)})^P ≤ (h̃_2^{σ(j)})^P and (h̃_1^{σ(j)})^N ≤ (h̃_2^{σ(j)})^N for
each j, where h̃_i^{σ(j)} is the jth largest BVFN in h̃_i (i = 1, 2) according to the
mediation value.
Y. Han et al. / Hesitant Bipolar Fuzzy Set and Its Application in Decision Making 117
Definition 5 For any Ã, B̃, C̃ ∈ F̃(X), if the operation d̃ : F̃(X) × F̃(X) → I^P
satisfies the following conditions: 1° 0 ≤ d̃(Ã, B̃) ≤ 1, and d̃(Ã, B̃) = 0 if and only if
Ã = B̃; 2° d̃(Ã, B̃) = d̃(B̃, Ã); 3° d̃(Ã, C̃) ≤ d̃(Ã, B̃) + d̃(B̃, C̃); then d̃ is called a
distance in F̃(X).
In this section, based on the theoretical results of the previous section, a hesitant
bipolar fuzzy MCGDM method based on TOPSIS is presented.
Consider a MCGDM problem with hesitant bipolar fuzzy information. Let
{x_1, · · · , x_m} be the set of alternatives, {c_1, · · · , c_n} the set of evaluation criteria,
and let t experts be invited to make the evaluation. The hesitant bipolar fuzzy
evaluation value of alternative x_i with respect to criterion c_j given by the sth expert
is denoted by the HBFE h̃_ij^s; then we can derive the hesitant bipolar fuzzy matrix
(HBFM) given by the sth expert as H̃^s = (h̃_ij^s)_{m×n} (i = 1, · · · , m; j = 1, · · · , n;
s = 1, · · · , t). Suppose all the BVFNs in h̃_ij^s are arranged in increasing order
according to the mediation value. The weight vector of the experts is supposed to be
known as w = (w_1, · · · , w_t), satisfying w_s ∈ I^P and Σ_{s=1}^{t} w_s = 1, and the weight
vector of the criteria is supposed to be known as ω = (ω_1, · · · , ω_n), satisfying
ω_j ∈ I^P and Σ_{j=1}^{n} ω_j = 1.
The hesitant bipolar fuzzy multiple criteria decision making method based on
TOPSIS is given as follows:
Step 1. Use (1) to aggregate the HBFMs H̃^s to get the comprehensive HBFM
H̃ = (h̃_ij)_{m×n} (i = 1, · · · , m; j = 1, · · · , n), where h̃_ij = HBFWG(h̃_ij^1, h̃_ij^2, · · · , h̃_ij^t).
Step 2. Denote l_j = max_{i=1,···,m} {l(h̃_ij)}. For j = 1, · · · , n, if l(h̃_ij) < l_j, add
the largest value to h̃_ij until its length equals l_j. Then compute

(h̃_j)^* = { (max_i (h̃_ij^{σ(1)})^P, max_i (h̃_ij^{σ(1)})^N), · · · ,
            (max_i (h̃_ij^{σ(l_j)})^P, max_i (h̃_ij^{σ(l_j)})^N) }    (3)

and

(h̃_j)_* = { (min_i (h̃_ij^{σ(1)})^P, min_i (h̃_ij^{σ(1)})^N), · · · ,
            (min_i (h̃_ij^{σ(l_j)})^P, min_i (h̃_ij^{σ(l_j)})^N) }    (4)

where the maxima and minima are taken over i = 1, · · · , m. Then h̃^* =
{(h̃_1)^*, · · · , (h̃_n)^*} is the positive ideal point and h̃_* = {(h̃_1)_*, · · · , (h̃_n)_*} is
the negative ideal point.
Step 3. Denote h̃_i = {h̃_i1, · · · , h̃_in}. For each i = 1, · · · , m, compute the
distances (d̃_i)^* and (d̃_i)_* between h̃_i and h̃^* and h̃_*, respectively, by (2).
Step 4. Compute ξ_i = (d̃_i)_* / ((d̃_i)_* + (d̃_i)^*), i = 1, · · · , m.
Step 5. Rank the alternatives according to the principle that the bigger ξ_i is, the
better the alternative x_i is.
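Steps 3-5 can be sketched with the closeness coefficient of Step 4. The distances below are those reported in the case study of Section 3; with this definition, a larger ξ corresponds to an alternative farther from the negative ideal relative to the positive one.

```python
def closeness(d_neg, d_pos):
    """Relative closeness of an alternative: distance to the negative
    ideal over the sum of its distances to both ideal points."""
    return d_neg / (d_neg + d_pos)

# Distances reproduced from the case study (rounded to four decimals).
d_pos = [0.1850, 0.0199, 0.1116, 0.1934]   # to the positive ideal point
d_neg = [0.1178, 0.2878, 0.1977, 0.1093]   # to the negative ideal point
xi = [closeness(n, p) for n, p in zip(d_neg, d_pos)]
```

Recomputing from the rounded distances reproduces the reported ξ values up to rounding, and alternative x_2, with the smallest distance to the positive ideal, obtains the largest closeness coefficient.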
3. Case Study
Example 2 Consider a chemical project evaluation problem. Suppose there are four
chemical projects {x_1, x_2, x_3, x_4} to be evaluated, and two experts are invited to
make the evaluation; c_1: economy, c_2: environment, and c_3: society are the evaluation
criteria. Consider the economy criterion of a project: in the short term it may bring
huge benefits to the company, resulting in a positive evaluation value of 0.8; on the
other hand, in the long run, the pollution needs a huge amount of money to fix,
resulting in a negative evaluation value of 0.7. The sum of the two pole magnitudes is
1.5, bigger than 1, i.e., there exists incompatible bipolarity. Moreover, when making an
evaluation, experts may hesitate among several memberships; thus, the evaluation
value of alternative x_i with respect to criterion c_j given by the sth expert is denoted
by the HBFE h̃_ij^s. Suppose all the BVFNs in h̃_ij^s are arranged in increasing order
according to the mediation value. The HBFMs given by Expert 1 and Expert 2 are
presented in Table 1 and Table 2, respectively. The weight vectors of the experts and
criteria are given as w = (0.7, 0.3) and ω = (0.3, 0.5, 0.2), respectively. Next, we will
see how to use the proposed method to make the evaluation.
Step 1. Use (1) to aggregate the HBFM H̃^s given by the sth expert to get the
comprehensive HBFM H̃ = (h̃_ij)_{4×3}. The comprehensive HBFM is given in Table 3.
Step 2. Compute the positive and negative ideal points h̃^* and h̃_* by (3) and (4).
By (3), we have h̃^* = {([0.8688, −0.1231]), ([0.8000, −0.2158], [0.7686, −0.2158]),
([0.8688, −0.1390], [0.8000, −0.1000], [0.8688, −0.1231], [0.8000, −0.1390])}.
By (4), we have h̃_* = {([0.6000, −0.6684]), ([0.5281, −0.5681], [0.5531, −0.4277]),
([0.6684, −0.4000], [0.6684, −0.4000], [0.6684, −0.4000], [0.6684, −0.4000])}.
Step 3. Compute the distances (d̃_i)^* and (d̃_i)_* between h̃_i and h̃^* and h̃_*,
respectively, by (2), i = 1, 2, 3, 4.
By (2), we have (d̃_1)^* = 0.1850, (d̃_2)^* = 0.0199, (d̃_3)^* = 0.1116, (d̃_4)^* = 0.1934;
(d̃_1)_* = 0.1178, (d̃_2)_* = 0.2878, (d̃_3)_* = 0.1977, (d̃_4)_* = 0.1093.
Step 4. Compute ξ_i = (d̃_i)_* / ((d̃_i)_* + (d̃_i)^*), i = 1, 2, 3, 4.
We have ξ1 = 0.3890, ξ2 = 0.9352, ξ3 = 0.6392, ξ4 = 0.3610.
Step 5. Rank the alternatives according to the principle.
4. Conclusions
In this paper, by combining the hesitant fuzzy set with the bipolar-valued fuzzy set, the
concept of the hesitant bipolar fuzzy set is introduced, and a hesitant bipolar fuzzy
group decision making method is presented. Our study is the first to accommodate
fuzziness, hesitation, and incompatible bipolarity in information processing. In future
work, we will try to combine rough set theory with the hesitant bipolar fuzzy set.
Acknowledgements
This work was supported in part by the Joint Key Grant of National Natural Science
Foundation of China and Zhejiang Province (U1509217), the National Natural Sci-
ence Foundation of China (61503191) and the Natural Science Foundation of Jiangsu
Province, China (BK20150933).
References
[1] L.A. Zadeh, Fuzzy sets, Inform. and Control, 8 (1965) 338–353.
[2] V. Torra and Y. Narukawa, On hesitant fuzzy sets and decision, in: the 18th IEEE International Confer-
ence on Fuzzy Systems, Korea, 2009, 1378–1382.
[3] N. Chen, Z.S. Xu and M.M. Xia, Correlation coefficients of hesitant fuzzy sets and their applications to
clustering analysis, Applied Mathematical Modeling, 37 (2013) 2197–2211.
[4] Y. Han, Z.Z. Zhao, S. Chen and Q.T. Li, Possible-degree generalized hesitant fuzzy set and its Applica-
tion in MADM, Advances in Intelligent Systems and Computing, 27 (2014) 1–12.
[5] F.Y. Meng and X.H. Chen, A hesitant fuzzy linguistic multi-granularity decision making model based
on distance measures, Journal of Intelligent and Fuzzy Systems, 28 (2015) 1519–1531.
[6] J. Montero, H. Bustince, C. Franco, J.T. Rodríguez, D. Gómez, M. Pagola, J. Fernández and E. Bar-
renechea, Paired structures in knowledge representation, Knowledge-Based Systems, 100 (2016) 50–58.
[7] C.G. Zhou, X.Q. Zeng, H.B. Jiang, L.X. Han, A generalized bipolar auto-associative memory model
based on discrete recurrent neural networks, Neurocomputing, 162 (2015) 201–208.
[8] H. Bustince, E. Barrenechea, M. Pagola, J. Fernandez, Z.S. Xu, B. Bedregal, J. Montero,H. Hagras,
F. Herrera and B.D. Baets, A historical account of types of fuzzy sets and their relationships, IEEE
Transactions on Fuzzy Systems, 24 (2016) 179–194.
[9] Y. Han, P. Shi and S. Chen, Bipolar-valued rough fuzzy set and its applications to decision information
system, IEEE Transactions on Fuzzy Systems, 23 (2015) 2358–2370.
[10] Y.J. Lai, T.Y. Liu and C.L. Hwang, TOPSIS for MODM, European Journal of Operational Research, 76
(1994) 486–500.
[11] W.R. Zhang, Bipolar fuzzy sets and relations: a computational framework for cognitive modeling and
multiagent decision analysis, Proceedings of IEEE Conf., 1994: 305–309.
[12] M.M. Xia and Z.S. Xu, Hesitant fuzzy information aggregation in decision making, International Journal of Approximate Reasoning, 52 (2011) 395–407.
Fuzzy Systems and Data Mining II 121
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-121
Introduction
Nowadays, support vector machines (SVMs) are considered one of the most effective
learning methods for classification. The main idea of this classification technique is to
map the data to a higher-dimensional space with some kernel method and then
determine a hyperplane separating the binary classes with maximal margin [1,2].
Binary data classification methods have made breakthrough progress in recent years.
Mangasarian et al. [3] proposed the generalized eigenvalue proximal support vector
machine (GEPSVM). Different from the canonical SVM, GEPSVM aims to find two
optimal nonparallel planes such that each hyperplane is closer to its own class and as
far as possible from the other class. Motivated by GEPSVM, Jayadeva et al. [4]
proposed the twin support vector machine (TWSVM) to solve the classification of
binary data. The main idea of TWSVM is to generate two nonparallel planes with
properties similar to those in GEPSVM; but, different from GEPSVM, the two planes
in TWSVM are obtained from two related programming problems. At the same time,
the ν-TWSVM [5] was proposed as an extension of TWSVM for handling outliers.
Some extensions of the TWSVM can be found in [6].
The above-mentioned methods implicitly assume that the parameters in the training data
sets are known exactly. However, in real-world applications, parameters are
perturbed because they are estimated from measured data and are subject to statistical error
[7]. For instance, real data points always incorporate uncertain information
in automatic acoustic identification and other imbalanced data problems [8]. For uncertain
data points, several SVM models for processing uncertainty have been pro-
posed as developments of the earlier models. Trafalis et al. [9] proposed a robust opti-
mization model for the case where the noise of the uncertain data is norm-bounded. Robust optimiza-
tion [10] was also introduced for the case of chance constraints. The purpose of using robust op-
timization with chance constraints is to keep the probability of misclassification under
uncertainty small: a maximum-margin linear classifier is constructed so that the probability
that points of one class are assigned to the other class is bounded by an
extremely small value. Ben-Tal et al. [11,12] employed moment information of the uncertain
training data to develop a chance-constrained SVM (CC-SVM) model. How-
ever, to the best of our knowledge, chance-constrained optimization has not yet been
considered for the TWSVM. It is therefore interesting and important to study a
TWSVM with chance constraints for classifying uncertain data. The main
purpose of this paper is to make an attempt in this direction.
Combining the ability of chance constraints to handle uncertainty with
the benefits of TWSVM, we propose in this paper a chance-constrained twin support
vector machine (CC-TWSVM). The main technique of this paper is to transform the
chance-constrained programs into second-order cone programs by using the moment
information of the uncertain data. Section 1 recalls SVM and TWSVM briefly. In Section 2, we
introduce the CC-TWSVM model. Experimental results on uncertain data sets are
presented in Section 3. Conclusions are provided in Section 4.
1. Preliminaries
In this section, we briefly recall some concepts of SVM and TWSVM for binary classi-
fication problem.
1.1. SVM
Let us consider the linearly separable classification problem. Given a training set
{(x_i, y_i)}_{i=1}^{l} with x_i ∈ R^n and y_i ∈ {+1, −1},
SVM aims to find an optimal hyperplane w^T x + b = 0 which separates the data into
two classes by maximizing the distance 2/||w|| between the two support hyperplanes,
which can be formulated as follows
B.-Z. Yang et al. / Chance Constrained Twin SVM for Uncertain Pattern Classification 123
min_{w,b}  (1/2)||w||^2 + C ∑_{i=1}^{l} ξ_i
s.t.  y_i(w^T x_i + b) ≥ 1 − ξ_i,        (1)
      ξ_i ≥ 0,  i = 1, ..., l.
After solving (1), a new point is classified as class +1 or class −1 according to the
final decision function f(x) = sgn(w^T x + b).
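As a minimal illustration (not code from the paper), the decision rule f(x) = sgn(w^T x + b) can be sketched in a few lines; the weight vector and bias below are hypothetical stand-ins for a trained model.

```python
def svm_predict(w, b, x):
    """Linear SVM decision function: f(x) = sgn(w^T x + b)."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hypothetical trained hyperplane w = (1, -1), b = 0.
print(svm_predict([1.0, -1.0], 0.0, [2.0, 0.5]))   # 1
print(svm_predict([1.0, -1.0], 0.0, [-1.0, 3.0]))  # -1
```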
1.2. TWSVM
Consider a binary classification problem with l_1 positive points and l_2 negative points (l_1 +
l_2 = l). Suppose the data points belonging to the positive class are collected in A ∈ R^{l_1×n}, where
each row A_i ∈ R^n (i = 1, ..., l_1) represents a data point with label +1. Similarly, B ∈ R^{l_2×n}
collects all the data points with label −1. The TWSVM determines two nonparallel
hyperplanes

x^T w_+ + b_+ = 0  and  x^T w_− + b_− = 0,        (2)

which are obtained by solving the pair of related quadratic programs

min_{w_+, b_+}  (1/2)||A w_+ + e_+ b_+||^2 + C_1 e_−^T ξ
s.t.  −(B w_+ + e_− b_+) ≥ e_− − ξ,  ξ ≥ 0        (3)

and

min_{w_−, b_−}  (1/2)||B w_− + e_− b_−||^2 + C_2 e_+^T η
s.t.  A w_− + e_+ b_− ≥ e_+ − η,  η ≥ 0,        (4)

where C_1, C_2 are pre-specified penalty factors and e_+ ∈ R^{l_1}, e_− ∈ R^{l_2} are vectors of ones
of the corresponding dimensions. The nonparallel hyperplanes (2) are obtained by solving (3)
and (4). Then a new point is classified by the following decision function
class(x) = arg min_{r=+,−} |x^T w_r + b_r|,        (5)

i.e., x is assigned to the class whose hyperplane is nearer to it.
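The nearest-plane rule can be sketched as follows; the two plane parameters are hypothetical, only the decision logic of rule (5) is illustrated.

```python
def twsvm_predict(planes, x):
    """TWSVM rule (5): assign x to the class r minimizing |x^T w_r + b_r|.

    planes maps a class label to its (w, b) pair."""
    def dist(label):
        w, b = planes[label]
        return abs(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return min(planes, key=dist)

# Hypothetical nonparallel planes for the two classes.
planes = {1: ([1.0, 0.0], -1.0), -1: ([0.0, 1.0], 1.0)}
print(twsvm_predict(planes, [1.2, 3.0]))    # 1: nearer to the +1 plane
print(twsvm_predict(planes, [0.5, -1.05]))  # -1
```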
2. Chance Constrained Twin Support Vector Machine

In this section, we briefly introduce chance-constrained programming (CCP) and propose
a chance-constrained twin support vector machine (CC-TWSVM) to process uncertain
data points.
When uncertain noise exists in the dataset, the TWSVM model needs to be modified
to contain the uncertainty information. Suppose there are l_1 and l_2 training data points in
R^n. Let Ã_i = [Ã_i1, ..., Ã_in], i = 1, ..., l_1, denote the uncertain data points with the
positive label +1, and let B̃_i = [B̃_i1, ..., B̃_in], i = 1, ..., l_2, denote the uncertain data points
with the negative label −1, respectively. Then Ã = [Ã_1, ..., Ã_{l_1}]^T and B̃ = [B̃_1, ..., B̃_{l_2}]^T
represent the two data sets. The chance-constrained program determines two nonparallel
planes such that each hyperplane is close to its own class in the sense of expectation and is
as far as possible from the other class in probability. The chance-constrained TWSVM
formulations are
min_{w_+, b_+}  (1/2) E{||Ã w_+ + e_+ b_+||^2} + C_1 ∑_{i=1}^{l_1} ξ_i
s.t.  P{−(B̃_i w_+ + b_+) ≤ 1 − ξ_i} ≤ ε,        (6)
      ξ_i ≥ 0,  i = 1, ..., l_1

and

min_{w_−, b_−}  (1/2) E{||B̃ w_− + e_− b_−||^2} + C_2 ∑_{i=1}^{l_2} η_i
s.t.  P{(Ã_i w_− + b_−) ≤ 1 − η_i} ≤ ε,        (7)
      η_i ≥ 0,  i = 1, ..., l_2,
where E{·} denotes the expectation under the corresponding distribution, C_1, C_2 are user-
given regularization parameters, 0 < ε < 1 is a parameter close to 0, and P{·} is the prob-
ability under the distribution of the uncertain data points of the two classes. The objective functions
of the model minimize the average distance between each hyperplane and its own class.
The chance constraints impose an upper bound on the probability that a point is assigned
to the other class; they thus guarantee correct classification with high probability,
and the resulting maximum-margin planes are robust to
uncertainty in the data. However, the two quadratic optimization problems (6) and (7) with chance
constraints are non-convex, so the model is difficult to solve directly. Using
probability bounds is an effective technique for dealing with CCP: when the
mean and covariance matrix of the uncertain data points are known, a multivariate bound
[13,14,15] can be adopted to express the chance constraints via robust optimization.
Let X ∼ (μ, Σ) denote a random vector X with mean μ and covariance matrix Σ. The
multivariate Chebyshev inequality states that, for any closed convex set S, the supremum
of the probability that X takes a value in S is

sup_{X∼(μ,Σ)} P{X ∈ S} = 1/(1 + d^2),  where  d^2 = inf_{X∈S} (X − μ)^T Σ^{−1} (X − μ).        (8)
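A small numeric sketch (an illustration under stated assumptions, not the paper's code) of how such a bound turns a chance constraint at level ε into a deterministic margin with safety factor k = sqrt((1 − ε)/ε); a diagonal covariance is assumed so the matrix square root is elementwise.

```python
import math

def chebyshev_k(eps):
    """Safety factor k = sqrt((1 - eps) / eps) induced by the Chebyshev bound."""
    return math.sqrt((1.0 - eps) / eps)

def robust_margin_holds(mu, sigma_diag, w, b, eps):
    """Deterministic counterpart of the chance constraint for a negative-class
    point with mean mu and diagonal covariance sigma_diag (slack omitted):
    -(mu . w + b) >= 1 + k * ||Sigma^{1/2} w||."""
    k = chebyshev_k(eps)
    spread = math.sqrt(sum(s * wj * wj for s, wj in zip(sigma_diag, w)))
    return -(sum(m * wj for m, wj in zip(mu, w)) + b) >= 1.0 + k * spread

print(round(chebyshev_k(0.25), 3))  # 1.732: a smaller eps forces a wider margin
```

Note how decreasing ε inflates k, which matches the experimental observation later in the paper that smaller ε pushes the planes closer to their own classes.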
Assume the first- and second-moment information of the random variables Ã_i and B̃_i is
known. Let μ_i^+ = E[Ã_i] and μ_i^− = E[B̃_i] be the mean vectors, and let
Σ_i^+ = E[(Ã_i − μ_i^+)^T (Ã_i − μ_i^+)] and Σ_i^− = E[(B̃_i − μ_i^−)^T (B̃_i − μ_i^−)] be the covariance matrices of
the uncertain points of the two data sets, respectively. Then problems (6) and (7) can be
reformulated respectively as:
min_{w_+, b_+}  (1/2) w_+^T G^+ w_+ + w_+^T μ^{+T} b_+ + (1/2) l_1 b_+^2 + C_1 ∑_{i=1}^{l_1} ξ_i
s.t.  −(μ_i^− w_+ + b_+) ≥ 1 − ξ_i + k ||(Σ_i^−)^{1/2} w_+||,        (9)
      ξ_i ≥ 0

and

min_{w_−, b_−}  (1/2) w_−^T G^− w_− + w_−^T μ^{−T} b_− + (1/2) l_2 b_−^2 + C_2 ∑_{i=1}^{l_2} η_i
s.t.  μ_i^+ w_− + b_− ≥ 1 − η_i + k ||(Σ_i^+)^{1/2} w_−||,        (10)
      η_i ≥ 0,

where k = √((1 − ε)/ε) and
G^+ = ∑_{i=1}^{l_1} (μ_i^{+T} μ_i^+ + Σ_i^+),   μ^+ = ∑_{i=1}^{l_1} μ_i^+,

with

G^− = ∑_{i=1}^{l_2} (μ_i^{−T} μ_i^− + Σ_i^−),   μ^− = ∑_{i=1}^{l_2} μ_i^−.
Let

H^+ = (1/2) [ G^+  μ^{+T} ;  μ^+  l_1 ].        (11)
Then the matrix H^+ is positive semidefinite. To ensure the strict convexity of problem
(9), we can always append a perturbation εI (ε > 0, I the identity matrix) such that the
matrix H^+ + εI is positive definite. Without loss of generality, suppose that H^+ is positive
definite.
The dual problems of chance-constrained TWSVM models (9) and (10) can be for-
mulated as the following models
max_{λ, ν}  ∑_{i=1}^{l_1} λ_i − (1/2) s_+^T H_1^{+T} G^+ H_1^+ s_+ − (1/2) l_1 s_+^T H_2^{+T} H_2^+ s_+ − μ^+ H_1^+ s_+ H_2^+ s_+
s.t.  s_+ = ∑_{i=1}^{l_1} λ_i (−μ_i^{−T} + k (Σ_i^−)^{1/2} ν),        (12)
      0 ≤ λ_i ≤ C_1,  ||ν|| ≤ 1

and

max_{γ, υ}  ∑_{i=1}^{l_2} γ_i − (1/2) s_−^T H_1^{−T} G^− H_1^− s_− − (1/2) l_2 s_−^T H_2^{−T} H_2^− s_− − μ^− H_1^− s_− H_2^− s_−
s.t.  s_− = ∑_{i=1}^{l_2} γ_i (μ_i^{+T} + k (Σ_i^+)^{1/2} υ),        (13)
      0 ≤ γ_i ≤ C_2,  ||υ|| ≤ 1,
where

(H^+)^{−1} = [H_1^+, H_2^+],   (H^−)^{−1} = [H_1^−, H_2^−].
3. Numerical Experiments
In this section, the CC-TWSVM model is illustrated by numerical tests on two
types of data sets. The first test verifies the performance of our CC-
TWSVM on artificial data. In the second test, we evaluate the CC-
TWSVM model on real-world classification data sets from the UCI Machine Learning Repos-
itory. All results were averaged over 10 train-test experiments and obtained with Matlab
R2012a on a machine with a 2.5 GHz CPU and 2.5 GB of usable RAM. The SeDuMi 3 software was employed to
solve the SOCP problems of CC-TWSVM.
Figure 1. The performance of CC-TWSVM on the artificial uncertain data sets: (a) ε = 0.1; (b) ε = 0.01.
Figure 1 shows the performance of CC-TWSVM on two uncertain data sets.
In the numerical experiments, the data points are generated from their respective dis-
tributions: one class is drawn from a normal distribution (μ^+, Σ^+)
and the other from (μ^−, Σ^−). Each class has 50 points; 20 points are randomly picked
as training points and the remaining points are used as test points. In Figure 1, the blue stars are
the points of class +1, while the red circles are those of class −1. For simplicity, we set ε to
0.1 and 0.01, respectively. The penalty parameters C_1 and C_2 are selected from the set
3 http://sedumi.ie.lehigh.edu/
{10^i | i = −5, ..., 5}. After 10 experiments, we obtain the parameters of the two
hyperplanes and take the averaged parameters as the final result. The blue and red
lines are the separating hyperplanes (in 2-D) that we look for. In fact, the value of the parame-
ter ε also affects the determination of the two hyperplanes: when the parameter ε decreases
from 0.1 to 0.01, the average accuracy of the classifier becomes higher and the planes lie
closer to their corresponding classes. Figures 1(a) and 1(b) show the effect of the various
parameters.
This section also presents numerical results on two real data sets. The following
datasets were used in the experiments:
• WBCD: the Wisconsin Breast Cancer Diagnostic dataset, obtained from the UCI
repository [16]. WBCD data are 10-dimensional. The data set has 699 samples;
the 444 benign samples are labeled as class +1 and the remaining malignant samples as
class −1.
• IONOSPHERE: the Ionosphere dataset, also collected from the UCI repository. Ionosphere
data are 34-dimensional. The data set has 351 samples; the 225 good samples
are labeled as class +1 and the remaining samples as class −1.
The distribution properties are often unknown and need to be estimated from the data
points. If an uncertain point x̃_i = [x̃_i1, ..., x̃_in]^T has N samples x_ik, k = 1, ..., N, then the
sample mean

x̄_i = (1/N) ∑_{k=1}^{N} x_ik

is used to estimate the mean vector μ_i = E[x̃_i], and the sample covariance

S_i = (1/(N − 1)) ∑_{k=1}^{N} (x_ik − x̄_i)(x_ik − x̄_i)^T

is used to estimate Σ_i = E[(x̃_i − μ_i)(x̃_i − μ_i)^T].
These estimates can introduce errors, and in some situations the mean
vector μ_i and covariance matrix Σ_i may not be obtainable. Pardalos et al. [17] have
discussed ways to handle these special cases. In our experiments, similarly
to Pardalos, we employ the methods mentioned there to modify the estimation.
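The plain sample-moment estimates above can be sketched directly (a generic illustration, not the modified estimator of [17]):

```python
def sample_moments(samples):
    """Sample mean and unbiased sample covariance of N observations
    of an uncertain data point (each observation is a list of n features)."""
    n_obs, dim = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n_obs for j in range(dim)]
    cov = [[sum((s[j] - mean[j]) * (s[k] - mean[k]) for s in samples) / (n_obs - 1)
            for k in range(dim)] for j in range(dim)]
    return mean, cov

mean, cov = sample_moments([[0.0, 0.0], [2.0, 2.0]])
print(mean)  # [1.0, 1.0]
print(cov)   # [[2.0, 2.0], [2.0, 2.0]]
```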
Since the data sets are uncertain, the measures of performance are worth studying.
Ben-Tal et al. [11] proposed using the nominal error and the optimal error to evaluate perfor-
mance. In our experiments, we use these indices to calculate the accuracy of our model.
The formula of NomErr is

NomErr = (∑_i 1[y_i^pre ≠ y_i] / number of training data) × 100%.
The optimal error (OptErr) is defined on the basis of the misclassification probabili-
ty. The chance constraints in models (6) and (7) can be reformulated as (9) and (10), from which we
can derive the least feasible value of ε, denoted ε_opt. The OptErr of a data point x_i is defined as
OptErr_i = 1 if y_i^pre ≠ y_i;  OptErr_i = ε_opt if y_i^pre = y_i,

and

OptErr = (∑_i OptErr_i / number of training data) × 100%.
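The two error measures can be computed as follows (a sketch; the labels and the ε_opt value are made up for illustration):

```python
def nom_err(y_true, y_pred):
    """NomErr: percentage of points whose predicted label differs from the true one."""
    wrong = sum(1 for yt, yp in zip(y_true, y_pred) if yt != yp)
    return 100.0 * wrong / len(y_true)

def opt_err(y_true, y_pred, eps_opt):
    """OptErr: a misclassified point contributes 1, a correct one contributes eps_opt."""
    total = sum(1.0 if yt != yp else eps_opt for yt, yp in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)

y_true, y_pred = [1, 1, -1, -1], [1, -1, -1, -1]
print(nom_err(y_true, y_pred))        # 25.0
print(opt_err(y_true, y_pred, 0.05))  # 28.75
```

Since every correctly classified point still contributes ε_opt > 0, OptErr is always at least NomErr, which matches the experimental observation below.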
We tested on the WBCD first. Because each data point in WBCD has 10 attributes,
solving the SOCPs directly would take too much time. We used principal com-
ponent analysis (PCA) to extract the two most important features. Then 80% of the data points
were used for training and the remainder as test data. For the parameter ε,
the three values {0.1, 0.05, 0.01} were adopted separately. As in the exper-
iments on artificial data, the penalty parameters C_1 and C_2 were selected from the set
{10^i | i = −5, ..., 5}.
Figure 2. The performance of CC-TWSVM in the Wisconsin breast cancer data set.
The average results over 10 runs are shown in Figure 2. In Figure 2(a), it is obvi-
ous that NomErr decreases slightly when ε descends from 0.1 to 0.01, because
ε represents the upper bound on the misclassification probability. The same holds for OptErr
in Figure 2(b): when ε decreases from 0.1 to 0.01, the average OptErr rate decreases from
approximately 5.4% to 5.3%. We can therefore conclude that classification accuracy
improves as the parameter ε decreases. From the definitions of OptErr and NomErr, it is
not difficult to see in the two plots that OptErr is larger than NomErr. In
addition, the model takes more time as ε increases, because the solution process of the
second-order cone programming problem depends heavily on the parameters.
Figure 3. The performance of CC-TWSVM in the Ionosphere data set (ε = 0.1, 0.05, 0.01).
The average results for the Ionosphere set over 10 runs are shown in Figure 3. Similarly
to the process for WBCD, we obtained 3 principal attributes of Ionosphere by PCA. Based
on these principal components, 80% of the data points were used for training and the
remainder as test data. For the parameter ε, the three values
{0.1, 0.05, 0.01} were adopted, and the penalty parameters C_1 and C_2 were selected from
the set {10^i | i = −5, ..., 5}, respectively. We again conclude that classification
accuracy improves as the parameter ε decreases, and in this experiment it is also easy to
see that OptErr is larger than NomErr. Moreover, because the SeDuMi
software is used to solve the SOCPs, the model takes more time as ε increases.
We also compared our model with previous models, namely TWSVM and
CC-SVM. The experimental data sets were "Bliver", "Heart-c", "Hepatitis", "Ionosphere",
"Votes", and "WBCD", selected from the UCI repository. In the experiments,
the penalty parameters in the three models were all the same; they were selected from the set
{10^i | i = −5, ..., 5}, respectively. The parameter ε in the CC-SVM and CC-TWSVM models
was selected from the set {0.1, 0.05, 0.01}, and 80% of the data points were
used for training with the remainder as test data. A comparison of the previous models and
our model is given in Table 1. It is easy to see that the average misclassification rate
of CC-TWSVM is better than that of the original TWSVM. Furthermore, the performance of CC-
TWSVM is better than that of CC-SVM. This is consistent with the observation that two nonparallel
planes have advantages over a single hyperplane.
4. Conclusions
A new chance-constrained twin support vector machine (CC-TWSVM), formulated via chance-con-
strained programming, was proposed; it can handle data sets with mea-
surement noise efficiently. This paper studied twin support vector machine classification
when the data are statistically uncertain. With chance-constrained programming (CCP) in the
model, CC-TWSVM ensures a low probability of classification error
under uncertainty. The CC-TWSVM model can be transformed into a second-order cone
program (SOCP) using the moment information of the uncertain points, and
the dual problem of the SOCP model was also introduced. The twin hyper-
planes are then obtained by solving the dual problem. In addition, we demonstrated the performance
of the CC-TWSVM model on artificial and real data by numerical experiments. In
future work, we will consider how to make the model more robust; dealing with
nonlinear classification under chance constraints is also of interest.
Acknowledgement
This work was supported by the joint Foundation of the Ministry of Education of China
and China Mobile Communication Corporation (MCM20150505) and the Fundamental
Research Funds for the Central Universities of Sichuan University (skqy201646).
References
[1] B. Scholkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimiza-
tion, and Beyond, MIT press, Cambridge, 2002.
[2] B. Z. Yang, M. H. Wang, H. Yang, T. Chen, Ramp loss quadratic support vector machine for classifica-
tion, Nonlinear Analysis Forum, 21 (2016), 101-115.
[3] O. Mangasarian, E. Wild, Multisurface proximal support vector classification via generalized eigenval-
ues, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28 (2006), 69-74.
[4] Jayadeva, R. Khemchandani, S. Chandra, Twin support vector machines for pattern classification, IEEE
Transactions on Pattern Analysis and Machine Intelligence, 29 (2007), 905-910.
[5] X. J. Peng, A v-twin support vector machine (v-TWSVM) classifier and its geometric algorithms, Infor-
mation Sciences, 180 (2010), 3863-3875.
[6] Y. J. Lee, O. L. Mangasarian, SSVM: a smooth support vector machine for classification, Computational
Optimization and Applications, 20 (2001), 5-22.
[7] D. Goldfarb, G. Iyengar, Robust convex quadratically constrained programs, Mathematical Program-
ming, 97 (2003), 495-515.
[8] Paul Bosch, Julio López, Héctor Ramı́rez, Hugo Robotham, Support vector machine under uncertainty:
An application for hydroacoustic classification of fish-schools in Chile, Expert Systems with Applica-
tions, 40 (2013), 4029-4034.
[9] T. B. Trafalis, R. C. Gilbert, Robust classification and regression using support vector machines, Euro-
pean Journal of Operational Research, 173, (2006), 893-909.
[10] C. Bhattacharyya, L. R. Grate, M. I. Jordan, G. L. El, I. S. Mian, Robust sparse hyperplane classifier:
application to uncertain molecular profiling data, Journal of Computational Biology, 11 (2004), 1073-
1089.
[11] A. Ben-Tal, S. Bhadra, C. Bhattacharyya, J.S. Nath, Chance constrained uncertain classification via
robust optimization, Mathematical Programming, 127 (2011), 145-173.
[12] A. Ben-Tal, A. Nemirovski, Selected topics in robust convex optimization, Mathematical Programming,
112 (2008), 125-158.
[13] D. Bertsimas, I. Popescu, Optimal inequalities in probability theory: a convex optimization approach,
SIAM Journal on Optimization, 15 (2005), 780-804.
[14] A. W. Marshall, I. Olkin, Multivariate Chebyshev inequalities, The Annals of Mathematical Statistics, 31
(1960), 1001-1014.
[15] A. Nemirovski, A. Shapiro, Convex approximations of chance constrained programs, SIAM Jour-
nal on Optimization, 17 (2006), 969-996.
[16] A. Frank and A. Asuncion, UCI Machine Learning Repository, 2010. Available at
http://archive.ics.uci.edu/ml.
[17] X. Wang, N. Fan, P. M. Pardalos, Robust chance-constrained support vector machines with second-order
moment information. Annals of Operations Research, (2015), 10.1007/s10479-015-2039-6
Fuzzy Systems and Data Mining II 131
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-131
Abstract. This paper deals with non-algebraic binary relational semantics, called
here set-theoretic Kripke-style semantics, for monoidal t-norm (based) logics. For
this, we first introduce the system MTL (Monoidal t-norm logic) and some of its
prominent axiomatic extensions, and then their corresponding Kripke-style seman-
tics. Next, we provide set-theoretic completeness results for them.
Keywords. relational semantics, (set-theoretic) Kripke-style semantics, substructural
logic, fuzzy logic, t-norm (based) logics
After algebraic semantics for t-norm (based) logics were introduced, their corresponding
Kripke-style semantics were introduced as well. For instance, after Esteva and Godo intro-
duced algebraic semantics for monoidal t-norm (based) logics in [4], the correspond-
ing Kripke-style semantics were introduced by Montagna and Ono [6], Montagna and
Sacchetti [7], and Diaconescu and Georgescu [3]. These semantics share one important
feature:
• While such semantics are called Kripke-style semantics in the sense that they
are given via forcing relations, they are still algebraic in the sense
that their completeness results are obtained using the fact that such semantics are
equivalent to algebraic semantics.
Because of this fact, Yang [8,9,10] called these semantics algebraic Kripke-style
semantics. Although non-algebraic Kripke-style semantics, where the “non-algebraic”
means that their completeness results are provided without using the above fact, were
provided for some particular systems (see e.g. [9]), such semantics have not yet been
established for basic fuzzy logics in general.
The aim of this paper is to provide set-theoretic Kripke-style semantics for basic core
fuzzy logics2 . As its starting point, we investigate set-theoretic Kripke-style semantics
for the logic system MTL (Monoidal t-norm logic) and its most prominent axiomatic
1 Corresponding Author: Eunsuk Yang, Department of Philosophy & Institute of Critical Thinking and
Writing, Chonbuk National University, Rm 307, College of Humanities Blvd. (14-1), Jeonju, 54896, KOREA
Email: eunsyang@jbnu.ac.kr.
2 Here, fuzzy logics are logics complete with respect to (w.r.t.) linearly ordered algebras and core fuzzy logics
are logics complete w.r.t. the real unit interval [0, 1] (see [1,2]).
132 E. Yang / Set-Theoretic Kripke-Style Semantics for Monoidal T-Norm (Based) Logics
extensions. For this, first, in Section 2, we discuss monoidal t-norm (based) logics and
their corresponding Kripke-style semantics. Next, in Section 3, we provide set-theoretic
completeness results for them.
For convenience, we adopt notations and terminology similar to those in [1,7,8,9,10]
and assume reader familiarity with them (together with results found therein).
Monoidal t-norm (based) logics are based on a countable propositional language with the
set of formulas FOR built inductively from a set of propositional variables VAR, the propo-
sitional constants ⊤ and ⊥, and the binary connectives →, &, ∧, and ∨. Further connectives are
defined as follows:
Well-known monoidal t-norm logics are axiomatic extensions (extensions for short)
of MTL. We introduce some prominent examples.
Definition 2. The following are famous monoidal t-norm logics extending MTL:
For easy reference, we let Ls be a set of the monoidal t-norm logics defined previ-
ously.
Lemma 1. (Hereditary condition, HC) Let X be a Kripke frame. For every formula ϕ
and for any two nodes a, b ∈ X, if a ⊩ ϕ and b ≤ a, then b ⊩ ϕ.
Proof. Here we consider the formulas (PL), (DIV ), (DNE), (CT R) and (CAN) as exam-
ples.
(PL): By the condition (∨), it is sufficient to prove that ⊩ ϕ → ψ or ⊩ ψ → ϕ. By
Proposition 1, we can instead show that ⊤ ≤ v(ϕ → ψ) or ⊤ ≤ v(ψ → ϕ). Proposition 1
also ensures v(ϕ → ψ) = v(ϕ) → v(ψ) for all formulas ϕ and ψ. If v(ϕ) ≤ v(ψ), then
⊤ ∗ v(ϕ) ≤ v(ψ) and thus ⊤ ≤ v(ϕ → ψ). If v(ψ) ≤ v(ϕ), then ⊤ ∗ v(ψ) ≤ v(ϕ) and
thus ⊤ ≤ v(ψ → ϕ).
(DIV): Lemma 2 ensures that, in order to prove ⊩ (ϕ ∧ ψ) → (ϕ&(ϕ → ψ)),
it is sufficient to show that for each node a ∈ X, if a ⊩ ϕ ∧ ψ, then a ⊩ ϕ&(ϕ → ψ). By
Proposition 1, we can instead assume a ≤ v(ϕ ∧ ψ) and show a ≤ v(ϕ&(ϕ → ψ)). Note
that Proposition 1 also ensures v(ϕ ∧ ψ) = min{v(ϕ), v(ψ)} and v(ϕ&ψ) = v(ϕ) ∗ v(ψ)
for all formulas ϕ and ψ. Then, since min{v(ϕ), v(ψ)} ≤ v(ϕ) ∗ (v(ϕ) → v(ψ)) by
(DIV^F), we have a ≤ v(ϕ&(ϕ → ψ)).
(DNE): As above, it is sufficient to prove that for each a ∈ X, if a ⊩ ¬¬ϕ, then
a ⊩ ϕ. By Proposition 1, we instead assume a ≤ v(¬¬ϕ) and show a ≤ v(ϕ). Note that
v(¬ϕ) = v(ϕ → ⊥) = ¬v(ϕ). Then, since a ≤ v(¬¬ϕ) = ¬¬v(ϕ) and ¬¬v(ϕ) ≤ v(ϕ)
by (DNE^F), we have a ≤ v(ϕ).
(CTR): As above, it is sufficient to prove that for each a ∈ X, if a ⊩ ϕ, then a ⊩ ϕ&ϕ. Let
a ⊩ ϕ. By Proposition 1, we have a ≤ v(ϕ). Then, using monotonicity and (CTR^F),
we also have a ≤ a ∗ a ≤ v(ϕ) ∗ v(ϕ). Hence, by the condition (&) and Proposition 1, we
obtain a ⊩ ϕ&ϕ.
(CAN): We need to show that either ⊩ ϕ → ⊥ or ⊩ (ϕ → (ϕ&ψ)) → ψ. Ob-
viously, v(ϕ) = ⊥ ensures ⊤ ≤ v(ϕ → ⊥) since v(⊥ → ⊥) = v(⊥) → v(⊥) = v(⊤).
Thus, by Proposition 1, we have ⊩ ϕ → ⊥ in case v(ϕ) = ⊥. Let v(ϕ) ≠ ⊥. In or-
der to prove ⊩ (ϕ → (ϕ&ψ)) → ψ, we assume a ⊩ ϕ → (ϕ&ψ) and show a ⊩ ψ.
By Proposition 1, we instead assume a ≤ v(ϕ → (ϕ&ψ)) and show a ≤ v(ψ). Then,
since a ≤ v(ϕ → (ϕ&ψ)) = v(ϕ) → v(ϕ&ψ) = v(ϕ) → (v(ϕ) ∗ v(ψ)) and ⊤ ≤ (v(ϕ) →
(v(ϕ) ∗ v(ψ))) → v(ψ) by (CAN^F), we have a ≤ v(ψ).
We leave the proofs for the other cases to the interested reader.
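On a finite linearly ordered frame with forcing defined by a ⊩ ϕ iff a ≤ v(ϕ), the hereditary condition of Lemma 1 can be checked mechanically. The sketch below is a toy illustration under that assumption, not the paper's construction: the valuation is a simple threshold on a five-element chain.

```python
def forces(a, v_phi):
    """a ||- phi iff a <= v(phi), on a linearly ordered frame."""
    return a <= v_phi

def hereditary(nodes, v_phi):
    """HC: whenever a ||- phi and b <= a, also b ||- phi."""
    return all(forces(b, v_phi)
               for a in nodes if forces(a, v_phi)
               for b in nodes if b <= a)

print(hereditary(range(5), 3))  # True: forcing is downward closed on the chain
```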
Now, we provide completeness results for Ls. A theory T is said to be linear if, for
each pair ϕ, ψ of formulas, we have T ⊢ ϕ → ψ or T ⊢ ψ → ϕ. By an L-theory, we
mean a theory T closed under the rules of L. By a regular L-theory, we mean an L-theory
containing all of the theorems of L. Since we have no use for irregular theories, by an
L-theory we henceforth mean an L-theory containing all of the theorems of L.
Let T be a linear L-theory. We define the canonical L frame determined by T as a
structure X = (Xcan, ⊤can, ⊥can, ≤can, ∗can), where ⊤can = T, ⊥can = {ϕ : T ⊢L ⊥ →
ϕ}, Xcan is the set of linear L-theories extending ⊤can, ≤can is ⊇ restricted to Xcan, i.e.,
a ≤can b iff {ϕ : a ⊢L ϕ} ⊇ {ϕ : b ⊢L ϕ}, and ∗can is defined by a ∗can b := {ϕ&ψ : for
some ϕ ∈ a, ψ ∈ b}, satisfying the integral commutative monoid properties corresponding
to L frames on (Xcan, ⊤can, ≤can). Notice that we construct the base ⊤can as a linear
L-theory that excludes nontheorems of L, i.e., excludes any formula ϕ such that ⊬L ϕ.
The linear orderedness of the canonical L frame depends on ≤can restricted to Xcan.
First, we can easily show the following.
Proof. It is easy to show that a canonical L frame is partially ordered. We show that this
frame is connected and hence linearly ordered. Suppose toward contradiction that neither
a ≤can b nor b ≤can a. Then there are ϕ, ψ such that ϕ ∈ b, ϕ ∉ a, ψ ∈ a, and ψ ∉ b. Note
that, since ⊤can is a linear theory, ϕ → ψ ∈ ⊤can or ψ → ϕ ∈ ⊤can. Let ϕ → ψ ∈ ⊤can
and thus ϕ → ψ ∈ b. Then, by (mp), we have ψ ∈ b, a contradiction. The case where
ψ → ϕ ∈ ⊤can is analogous.
Next, let vcan be a canonical evaluation function from formulas to sets of formulas,
i.e., vcan(ϕ) = {ϕ}. We define a canonical evaluation as follows:
Lemma 3. ⊤can ⊩can ϕ → ψ iff, for each a ∈ Xcan, if a ⊩can ϕ, then a ⊩can ψ.
Proof. By (a), we need to show that ϕ → ψ ∈ ⊤can iff, for all a ∈ Xcan, if ϕ ∈ a, then
ψ ∈ a. For the left-to-right direction, we assume ϕ → ψ ∈ ⊤can and ϕ ∈ a, and show
ψ ∈ a. The definition of ∗can ensures (ϕ → ψ)&ϕ ∈ ⊤can ∗can a = a. Since L proves
((ϕ → ψ)&ϕ) → ψ, we have ((ϕ → ψ)&ϕ) → ψ ∈ ⊤can and thus ((ϕ → ψ)&ϕ) → ψ ∈
a. Therefore, we obtain ψ ∈ a by (mp). We prove the other direction contrapositively.
Suppose ϕ → ψ ∉ ⊤can. We set a0 = {Z : there exists X ∈ ⊤can such that ⊤can ⊢L (X&ϕ) → Z}.
Clearly, a0 ⊇ ⊤can and ϕ ∈ a0, but ψ ∉ a0. (Otherwise, ⊤can ⊢L (X&ϕ) → ψ and thus
⊤can ⊢L X → (ϕ → ψ); therefore, since ⊤can ⊢L X, by (mp), we have ⊤can ⊢L ϕ → ψ, a
contradiction.) Then, by the Linear Extension Property of Theorem 12.9 in [2], we have
a linear theory a ⊇ a0 with ψ ∉ a; therefore ϕ ∈ a but ψ ∉ a.
Property, we can obtain a linear theory b such that b0 ⊆ b and a ∗can b = {Z : there is
X ∈ a such that ⊤can ⊢L (X&ϕ) → Z}; therefore, ϕ ∈ b but ψ ∉ a ∗can b.
We call a model M for L an L model. Using Lemma 4, we can show that the canoni-
cally defined (X, vcan) is an L model. Then, by construction, ⊤can excludes our chosen
nontheorem ϕ, and the canonical definition of |= agrees with membership. Therefore, we
can say that, for each nontheorem ϕ of L, there exists an L model in which ϕ is not
valid, i.e., not ⊤can |= ϕ. This gives us the following weak completeness of L.
Theorem 1. (Weak completeness) For any formula ϕ, if ϕ is valid in every L frame, then
L ϕ.
Furthermore, using Lemma 4 and the Linear Extension Property, we can show the
strong completeness of L as follows.
Theorem 2. (Strong completeness) L is strongly complete w.r.t. the class of all L frames.
4. Concluding Remarks
Acknowledgments: This work was supported by the Ministry of Education of the Repub-
lic of Korea and the National Research Foundation of Korea (NRF-2016S1A5A8018255).
References
[1] P. Cintula, R. Horčı́k and C. Noguera, Non-associative substructural logics and their semilinear exten-
sions: axiomatization and completeness properties, Review of Symbolic Logic 6 (2013), 394-423.
[2] P. Cintula, R. Horčı́k and C. Noguera, The quest for the basic fuzzy logic, in: Petr Hájek on Mathematical
Fuzzy Logic, F. Montagna, ed., Springer, Dordrecht, 2015, pp. 245-290.
[3] D. Diaconescu and G. Georgescu, On the forcing semantics for monoidal t-norm-based logic, Journal
of Universal Computer Science 13 (2007), 1550-1572.
[4] F. Esteva and L. Godo, Monoidal t-norm based logic: towards a logic for left-continuous t-norms, Fuzzy
Sets and Systems 124 (2001), 271-288.
[5] P. Hájek, Metamathematics of Fuzzy Logic, Kluwer, Amsterdam, 1998.
[6] F. Montagna and H. Ono, Kripke semantics, undecidability and standard completeness for Esteva and
Godo’s Logic MTL∀, Studia Logica 71 (2002), 227-245.
[7] F. Montagna and L. Sacchetti, Kripke-style semantics for many-valued logics, Mathematical Logic
Quarterly 49 (2003), 629-641.
[8] E. Yang, Algebraic Kripke-style semantics for relevance logics, Journal of Philosophical Logic 43
(2014), 803-826.
[9] E. Yang, Two kinds of (binary) Kripke-style semantics for three-valued logic, Logique et Analyse 231
(2015), 377-394.
[10] E. Yang, Algebraic Kripke-style semantics for substructural fuzzy logics, Korean Journal of Logic 19
(2016), 295-322.
Data Mining
Fuzzy Systems and Data Mining II 141
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-141
Introduction
Recently, intensive research has focused on association rule mining, one of the main
functions of data mining [1]. Association rules were first introduced by Agrawal et
al. [2]; a rule states that X% of the customers who buy item A also buy item B, and is
denoted A→B. Association rules are meant to find the impact of one set of items on
another set of items. The frequency of an itemset (items that co-occur in a transaction)
is referred to as its support count, i.e., the number of transactions that contain the
itemset. An itemset is frequent if its support count satisfies the minimum support
(minsup) threshold [3]. The confidence of an association rule X→Y is the ratio of the
number of transactions that contain both X and Y to the number of transactions that
contain X [2, 4].
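The two measures can be sketched in plain Python on a toy database (the item names and function names below are ours, for illustration):

```python
def support_count(transactions, itemset):
    """Number of transactions that contain every item of the itemset."""
    s = set(itemset)
    return sum(1 for t in transactions if s.issubset(t))

def confidence(transactions, x, y):
    """Confidence of the rule X -> Y: support(X and Y) / support(X)."""
    return support_count(transactions, set(x) | set(y)) / support_count(transactions, x)

# Toy database of five transactions (each transaction is a set of items).
db = [{"A", "C", "D", "F"},
      {"A", "C", "E", "F", "G"},
      {"A", "B", "C", "F", "H"},
      {"B", "F", "G"},
      {"B", "C"}]

print(support_count(db, {"A"}))      # 3: itemset {A} occurs in 3 of 5 transactions
print(confidence(db, {"A"}, {"F"}))  # 1.0: every transaction containing A also contains F
```

An itemset is frequent here when `support_count` divided by the number of transactions meets the minsup threshold.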
Association rule mining has two main steps: 1) finding frequent itemsets/patterns,
and 2) generating association rules [5]. The first step is the more expensive one, and
several algorithms have been proposed to find the frequent itemsets in huge databases.
The most classical one is the Apriori algorithm, which uses a candidate
generation-and-test approach. Other subsequent algorithms using Apriori-like
techniques were introduced in [6-12]. FP-Growth [4] and Matrix Apriori [13, 14] are
more recent algorithms that try to overcome the drawbacks of candidate generation and
multiple database scans.
1
Corresponding Author: Belgin ERGENÇ, Computer Engineering Department, Izmir Institute of
Technology, Urla, Izmir, Turkey; Email: belginergenc@iyte.edu.tr.
142 N. Abuzayed and B. Ergenç / Dynamic Itemset Mining Under Multiple Support Thresholds
The Dynamic MIS algorithm provides a solution to the problem of dynamic itemset
mining under multiple support thresholds by maintaining a dynamic MIS-tree and two
header tables that keep the support counts of all items of the database. Frequent pattern
generation from the tree is done by the related module of the CFP-Growth++ algorithm [18].
Throughout this section, we use the following running example. Table 1 presents
a sample database D, and Table 2 gives the user-specified multiple item supports (MIS)
for each item in decreasing order, together with each item's actual support in D. In the
rightmost column of Table 1, the items of each transaction are ordered according to the
MIS values given in Table 2.
Table 1. Transaction database D [17].

TID   Item bought       Item bought (ordered)
100   D, C, A, F        A, C, D, F
200   G, C, A, F, E     A, C, E, F, G
300   B, A, C, F, H     A, B, C, F, H
400   G, B, F           B, F, G
500   B, C              B, C

Table 2. MIS and actual support of each item in D [17].

Item          A    B    C    D    E    F    G    H
MIS (%)       80   80   80   60   60   40   40   40
Support (%)   60   60   80   20   20   80   40   20
To build the MIS-tree, the MIS-tree builder algorithm shown in Figure 1 is used.
First, the MISsorted list is created from the MIS values in Table 2, ordered in
decreasing order (Line 1); then the root node of the tree is created (Line 2). Primary and
secondary header tables are created (Line 3) as shown in Figure 2.
INPUT: Database D, Minimum item supports MIS
OUTPUT: MISsorted, MIS-tree
BEGIN
1 Build MISsorted list (in decreasing order)
2 Create the root of MIS-tree as null
3 Create primary and secondary header tables
4 Insert items into primary table (count=0)
5 Scan D
6 FOR each transaction T in D do:
7 Sort all items in T (as MISsorted)
8 Add T to the tree
9 END FOR
10 Calculate the support of items in D
11 Update the supports in the tables
12 Relocate items between header tables
END
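A minimal Python sketch of the builder above (node links and the step-by-step table bookkeeping of Lines 10-12 are reduced to a final relocation pass; class and function names are ours):

```python
class Node:
    def __init__(self, item, parent):
        self.item, self.parent, self.count, self.children = item, parent, 0, {}

def build_mis_tree(db, mis):
    """Figure 1, simplified: build the tree, then split items between header tables."""
    order = sorted(mis, key=lambda i: -mis[i])        # Line 1: MISsorted, decreasing MIS
    rank = {item: r for r, item in enumerate(order)}
    root = Node(None, None)                           # Line 2: root of the MIS-tree
    counts = {i: 0 for i in mis}                      # Lines 3-4: item counts start at 0
    for t in db:                                      # Lines 5-9: insert each transaction
        node = root
        for item in sorted(t, key=rank.__getitem__):  # Line 7: sort T as MISsorted
            node = node.children.setdefault(item, Node(item, node))
            node.count += 1                           # Line 8: add T to the tree
        for item in t:
            counts[item] += 1                         # Line 10: support of items in D
    min_mis, n = min(mis.values()), len(db)           # Lines 11-12: relocate by MIN MIS
    primary = {i: c for i, c in counts.items() if c / n >= min_mis}
    secondary = {i: c for i, c in counts.items() if c / n < min_mis}
    return root, primary, secondary

# Database D of Table 1 with the MIS values of Table 2 (as fractions).
db = [{"D", "C", "A", "F"}, {"G", "C", "A", "F", "E"},
      {"B", "A", "C", "F", "H"}, {"G", "B", "F"}, {"B", "C"}]
mis = {"A": .8, "B": .8, "C": .8, "D": .6, "E": .6, "F": .4, "G": .4, "H": .4}
root, primary, secondary = build_mis_tree(db, mis)
print(sorted(secondary))  # ['D', 'E', 'H']: their 20% supports fall below MIN MIS (40%)
```

With the example data, D, E, and H (support 20%) land in the secondary header table, as in Figure 2.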
BEGIN
1 Scan d
2 FOR each transaction T in d do:
3 Sort items in T (like MISsorted )
4 Add T to the tree
5 END FOR
6 Calculate the support of items
7 Update the supports in the tables
8 Relocate items between header tables
END
Figure 3. Update process in Dynamic MIS for additions. Figure 4. Dynamic MIS-tree after adding d.
The pseudo code of the update process for additions is given in Figure 3. When new
transactions (Table 3) arrive, they are scanned and added to the tree (Lines 2-5). First,
the items of each new transaction are sorted in the descending order of the MISsorted
list; then the transactions are added to the tree one by one, as seen in Figure 4. The
count of each item in the transaction is incremented in the primary table. Then, the
nodes of the same item are linked throughout the tree and to the header tables of the
same figure. The supports of the items are calculated and then updated in the header
tables (Lines 6-7). Lastly, items are relocated between the header tables by comparing
each item's support with the MIN MIS value (Line 8). Here, items A and G are
transferred from the primary to the secondary header table, because their supports
become less than the MIN MIS value (40%).
The pseudo code of the update process for additions with new items is given in Figure
5. Let us explain this process using the MIS-tree shown in Figure 2, the incremental
database (with new items J, K, L) given in Table 5, and the MIS values of the new items
given in Table 4. The first step is combining the new MIS values in Table 4 with the
MIS values of the old items in Table 2 to obtain Table 6.
Table 4. MIS values for new items in d.

Item       J    K    L
MIS value  70%  35%  30%

Table 5. The incremental database d with new items J, K, L.

TID  Item bought        Item bought (ordered)
1    C, B, K, J, H, L   B, C, J, H, K, L
2    K, H               H, K
3    K, B, C            B, C, K

Table 6. MIS values of all items.

Item     A   B   C   J   D   E   F   G   H   K   L
MIS (%)  80  80  80  70  60  60  40  40  40  35  30
When new items appear, MISsorted is updated by adding the new MIS values in
descending order (Line 1). After that, the new items in MISnew are appended to the
primary header table with a count of 0 (Line 2). These two lines are the main
difference between additions and additions with new items.
BEGIN
1 Build MISsorted (MISsorted + MISnew)
2 Insert new items into primary header
table (count=0)
3 Scan d
4 FOR each transaction T in d do:
5 Sort items in T (like MISsorted )
6 Add T to the tree
7 END FOR
8 Calculate the support of all items
9 Update the supports in the tables
10 Relocate items between header tables
END
Figure 5. Dynamic MIS for additions with new items. Figure 6. Dynamic MIS-tree after adding d.
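Line 1 of Figure 5 (combining the old MISsorted with MISnew) can be sketched as follows; with the values of Tables 2 and 4 it reproduces the ordering of Table 6 (the function name is ours):

```python
def merge_mis(mis_old, mis_new):
    """Build the new MISsorted: union of old and new MIS values, decreasing order."""
    merged = {**mis_old, **mis_new}
    # Python's sort is stable, so items with equal MIS keep their original order.
    return dict(sorted(merged.items(), key=lambda kv: -kv[1]))

old = {"A": 80, "B": 80, "C": 80, "D": 60, "E": 60, "F": 40, "G": 40, "H": 40}
new = {"J": 70, "K": 35, "L": 30}
print(list(merge_mis(old, new)))
# ['A', 'B', 'C', 'J', 'D', 'E', 'F', 'G', 'H', 'K', 'L']  (the order of Table 6)
```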
At the end, some items are transferred between the two header tables. Here, item G is
transferred from primary to secondary because its new support (25%) is less than the
new MIN MIS value (30%), and item H is transferred from secondary to primary
because its new support (37%) now exceeds it. Figure 6 presents the MIS-tree after
adding the three new transactions.
Let us explain the pseudo code of the update process for deletions, shown in Figure 7,
using the increment of deletions shown in Table 7. This example is applied to the tree
of Figure 2. The transactions in d are scanned and then deleted from the tree, as seen in
Figure 8. Some items' counts are decremented; the corresponding supports are
recalculated and updated in the header tables. According to the new supports, some
items are relocated between the header tables. In this example, the support of item G
drops to 33.3%, which is less than the MIN MIS value (40%), so it is moved into the
secondary header table.
BEGIN
1 Scan d
2 FOR each transaction T in d do:
3 Sort items in T (like MISsorted)
4 Delete T from the tree
5 END FOR
6 Calculate the support of all items
7 Update the supports
8 Relocate items between header tables
END
Figure 7. Update process in Dynamic MIS for deletions. Figure 8. Dynamic MIS-tree after deletions.
Nodes with count 1 are decremented to zero and deleted from the tree, but their
records are kept in the corresponding header table. The resulting Dynamic MIS-tree is
illustrated in Figure 8.
2. Performance Evaluation
Dynamic MIS is compared with the popular tree-based algorithm CFP-Growth++ [18].
Several experiments are executed on 4 datasets with different properties (T: average
size of the transactions, D: number of transactions, N: number of items), as shown in
Table 8. D1 and D4 are real and D2 and D3 are synthetic datasets. The density of a
dataset indicates the similarity of its transactions. D3 is generated to be used only in the
experiment on additions with new items.
Table 8. Properties of datasets.
All experiments are run on an Intel(R) Core i7-5500U CPU @ 2.40 GHz with 8 GB
main memory and the Microsoft Windows 10 operating system. All programs are
implemented in C#.
For our experiments, we use two formulas [16] to assign MIS values to the items in
the datasets:

M(i) = β · f(i)

MIS(i) = M(i) if M(i) > LS, and MIS(i) = LS otherwise,

where f(i) is the actual frequency of item i in the data, LS is the user-specified lowest
minimum item support allowed, and β (0 ≤ β ≤ 1) is a parameter that controls how the
MIS values of items relate to their frequencies. If β = 0, we have only one minimum
support, LS, which is the same as in traditional association rule mining. If β = 1 and
f(i) ≥ LS, then f(i) is the MIS value of i [16]. This formula is used to generate MIS
values for algorithms that use multiple support thresholds, as in [16, 17, 18, 28].
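The assignment rule above can be sketched as follows (the item frequencies below are illustrative, not from the experimental datasets):

```python
def assign_mis(freq, beta, ls):
    """MIS(i) = M(i) if M(i) > LS else LS, with M(i) = beta * f(i)."""
    return {i: beta * f if beta * f > ls else ls for i, f in freq.items()}

freq = {"A": 0.6, "D": 0.2, "F": 0.8}          # actual item frequencies f(i)
print(assign_mis(freq, beta=0.5, ls=0.25))     # A: 0.3, D: floored to LS=0.25, F: 0.4
```

With β = 0 every item receives MIS(i) = LS, recovering the single-threshold setting described above.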
The computational complexity of building the initial tree is the same for both
algorithms: O(T × V), where T is the number of transactions and V the average
transaction length. The complexity of the pruning procedure in CFP-Growth++ is
O(N × C), where N is the number of nodes holding the items to be pruned and C the
number of their children. The merging procedure in CFP-Growth++ is O(N² × K),
where N is the number of nodes in the tree and K the number of node links. In Dynamic
MIS, however, the pruning and merging procedures are replaced by the procedure that
relocates items between header tables, which has linear complexity O(N), where N is
the number of items to be transferred. The complexity of adding increments to the tree
is O(T × V), where T is the number of incremental transactions and V the average
transaction length.
Figure 9. Speed-up on Retail with additions. Figure 10. Speed-up with additions
The speed-up obtained by running Dynamic MIS instead of re-running CFP-Growth++
when the database is updated is shown in Figure 9 and Figure 10. The speed-up of
Dynamic MIS ranges from 22.21 to 55.94 on D1 (Figure 9), from 1.56 to 1.35 on D2,
and from 37.67 to 5.69 on D4, as seen in Figure 10. The reasons behind these
speed-ups are: 1) Dynamic MIS runs only on the increment, whereas CFP-Growth++
runs from the beginning; 2) Dynamic MIS generates frequent patterns from the items of
the primary header table only, whereas CFP-Growth++ requires pruning and merging
of the MIS-tree.
Figure 11. Speed-up with additions with new items. Figure 12. Speed-up with deletions.
The last comparison determines how the size of deletions affects the performance of
the algorithm. Each split contains 20% of the transactions of the original dataset, and
the MIS values are kept constant. The speed-up obtained by running Dynamic MIS
instead of re-running CFP-Growth++ when the database is updated with deletions can
be seen in Figure 12. The speed-up increases from 2.26 to 44.88 on D1, from 2.06 to
40.16 on D4, and from 1.12 to 1.25 on D2 as the split size decreases.
3. Conclusion
The single support threshold and the dynamic nature of databases bring additional
challenges to frequent itemset mining algorithms. The Dynamic MIS algorithm is
proposed as a solution to the dynamic update problem of frequent itemset mining under
multiple support thresholds. It is tree based, handles increments of additions, additions
with new items, and deletions, and is faster especially on large sparse databases.
Acknowledgements
This work is partially supported by the Scientific and Technological Research Council
of Turkey (TUBITAK) under ARDEB 3501 Project No. 114E779.
References
[1] M. Chen, J. Han, P. S. Yu, Data mining: An overview from a database perspective. IEEE Transaction
on knowledge and Data Engineering, 8(1996), 866–883.
[2] R. Agrawal, T. Imielinski, A. Swami, Mining association rules between sets of items in large databases,
In: ACM SIGMOD International conference on Management of data, USA (1993), 207–216.
[3] J. Han, M. Kamber, J. Pei, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers,
(2006), 157–218.
[4] J. Han, J. Pei, Y. Yin, Mining frequent patterns without candidate generation, In: ACM SIGMOD
International Conference on Management of Data, ACM New York, USA (2000), 1–12.
[5] R. Agrawal, R. Srikant, Fast algorithms for mining association rules in large databases, In: The 20th
International Conference on Very Large Data Bases, San Francisco, CA, USA (1994), 487–499.
[6] H. Mannila, H. Toivonen, A.I. Verkamo, Efficient algorithms for discovering association rules, In:
AAAI Workshop on KDD, Seattle, WA, USA (1994), 181–192.
[7] R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, A.I. Verkamo, Fast discovery of association rules, In
Advances in KDD. MIT Press, 12(1996), 307–328.
[8] A. Savasere, E. Omiecinski, S.B. Navathe, An efficient algorithm for mining association rules in large
databases, In: The 21st VLDB Conference, Zurich, Switzerland (1995), 432–443.
[9] J.S. Park, M. Chen, P.S. Yu, An effective hash-based algorithm for mining association rules, In: ACM
SIGMOD International Conference on Management of Data, San Jose, CA, USA (1995), 175–186.
[10] R. Srikant, Q. Vu, R. Agrawal, Mining association rules with item constraints, In: ACM KDD
International Conference, Newport Beach, CA, USA (1997), 67–73.
[11] R.T. Ng, L.V.S. Lakshmanan, J. Han, A. Pang, Exploratory mining and pruning optimizations of
constrained associations rules, In: ACM-SIGMOD Management of Data, USA (1998), 13–24.
[12] G. Grahne, L. Lakshmanan, X. Wang, Efficient mining of constrained correlated sets, In: The 16th
International Conference on Data Engineering, San Diego, CA, USA (2000), 512–521.
[13] J. Pavon, S. Viana, S. Gomez, Matrix Apriori: Speeding up the search for frequent patterns, In: The
24th IASTED International Conference on Database and Applications, Austria (2006), 75–82.
[14] B. Yıldız, B. Ergenç, Comparison of two association rule mining algorithms without candidate
generation, In: The 10th IASTED International Conference on Artificial Intelligence and Applications,
Innsbruck, Austria (2010), 450–457.
[15] H. Mannila, Database methods for data mining, Tutorial for the 4th ACM SIGKDD International
Conference on KDD, New York, USA (1998).
[16] B. Liu, W. Hsu, Y. Ma, Mining association rules with multiple minimum supports, In: The 5th ACM
SIGKDD International Conference on KDD, San Diego, CA, USA (1999), 337–341.
[17] Y. Hu, Y. Chen, Mining association rules with multiple minimum supports: a new mining algorithm
and a support tuning mechanism, Decision Support Systems, 42(2006), 1–24.
[18] R.U. Kiran, P.K. Reddy, Novel techniques to reduce search space in multiple minimum supports-based
frequent pattern mining algorithms, In: The 14th International Conference on Extending Database
Technology, ACM, New York, USA, (2011), 11–20.
[19] S. Darrab, B. Ergenç, Frequent pattern mining under multiple support thresholds, In: The 16th Applied
Computer Science Conference, WSEAS Transactions on Computer Research, Turkey, 4(2016), 1–10.
[20] D.W. Cheung, J. Han, V.T. Ng, C.Y. Wong, Maintenance of discovered association rules in large
databases, An incremental updating technique, In: The 12th IEEE International Conference on Data
Engineering, New Orleans, Louisiana, USA, (1996), 106–114.
[21] D.W. Cheung, S.D. Lee, B. Kao, A general incremental technique for maintaining discovered
association rules, In: The 5th International Conference on Database Systems for Advanced
Applications, Melbourne, Australia, (1997), 185–194.
[22] D. Oğuz, B. Ergenç, Incremental itemset mining based on Matrix Apriori, DEXA-DaWaK, Vienna,
Austria, (2012), 192–204.
[23] D. Oğuz, B. Yıldız, B. Ergenç, Matrix based dynamic itemset mining algorithm, International Journal
of Data Warehousing and Mining, 9(2013), 62–75.
[24] Y. Aumann, R. Feldman, O. Lipshtat, H. Mannila, Borders: An efficient algorithm for association
generation in dynamic databases, Journal of Intelligent Information Systems, 12(1999), 61–73.
[25] S. Shan, X. Wang, M. Sui, Mining Association Rules: A continuous incremental updating technique,
In: International Conference on WISM, IEEE Computer Society, Sanya, China (2010), 62–66.
[26] B. Dai, P. Lin, iTM: An efficient algorithm for frequent pattern mining in the incremental database
without rescanning, In: The 22nd International Conference on Industrial, Engineering and Other
Applications of Applied Intelligent Systems, Tainan, Taiwan (2009), 757–766.
[27] W. Cheung, O.R. Zaiane, Incremental mining of frequent patterns without candidate generation or
support constraint, In: IDEAS, Hong Kong, China (2003), 111–116.
[28] F.A. Hoque, M. Debnath, N. Easmin, K. Rashad, Frequent pattern mining for multiple minimum
supports with support tuning and tree maintenance on incremental database, Research Journal of
Information Technology, 3(2011), 79–90.
[29] Frequent Itemset Mining Implementations Repository, http://fimi.ua.ac.be/data/
Fuzzy Systems and Data Mining II 149
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-149
Abstract. In this study, two major applications are introduced to develop advanced
deep learning methods for credit card data analysis. Credit card information is con-
tained in two datasets: a credit approval dataset and a card transaction dataset. Each
dataset poses its own problem. For the credit approval dataset, it is necessary to
combine multiple models, each referring to a different clustered group of users. For
the card transaction dataset, since actual unauthorized credit card use is very rare,
imprecise solutions do not allow the appropriate detection of fraud. To solve these
problems, we propose deep learning algorithms for the credit card datasets. The
proposed methods are validated in benchmark experiments against other machine
learning methods, using two credit card datasets: the credit approval dataset from
the UCI Machine Learning Repository and a credit transaction dataset constructed
with random values. The experiments confirm that deep learning exhibits accuracy
comparable to that of the Gaussian kernel support vector machine (SVM). The
proposed methods are also validated using a large-scale transaction dataset. More-
over, we apply the proposed method to a time-series benchmark dataset. Deep
learning parameter adjustment is difficult; by optimizing the parameters, it is
possible to increase the learning accuracy.
Keywords. Data Mining, Deep Learning, Credit Approval Dataset, Card Transaction
Dataset
Introduction
Deep learning is a state-of-the-art research topic in the machine learning field with ap-
plications for solving various problems [1, 2]. This paper investigates the application of
deep learning in credit card data analysis.
Credit card data are mainly used in user and transaction judgments. User judgment
determines whether a credit card should be issued to a user satisfying particular criteria.
Transaction judgment, on the other hand, determines whether a transaction is valid [3].
We determined the deep learning processes required for solving each of these problems
and proposed appropriate deep learning methods [4, 5].
To verify our proposed methods, we use benchmark experiments with other machine
learning methods, which confirm that the accuracy of the deep learning methods is
similar to that of the Gaussian kernel SVM. In the final section of this paper, we
provide suggestions for future deep learning experiments.
1 Corresponding Author: Ayahiko Niimi, Faculty of Systems Information Science, Future University
In previous work, we used only a small-scale transaction dataset for the evaluation
experiment and did not use a large-scale dataset [6]. In this paper, the proposed
methods are also validated using a large-scale transaction dataset. Moreover, we apply
our proposed method to a time-series benchmark dataset.
First, in section 1, we introduce the characteristics of the credit card datasets. Then,
in section 2, we introduce deep learning. In section 3, we discuss the data processing
infrastructure that is suitable for the analysis of credit card data. In section 4, we
describe the experiments, and the results are shown in section 5. We discuss the results
in section 6. Finally, in section 7, we describe conclusions and future work.
For each user submitting a credit card creation application, there is a record of the
decision to issue the card or to reject the application. This decision is based on the
user's attributes, in accordance with general usage-trend models.
However, to reach this decision, it is necessary to combine multiple models, each
referring to a different clustered group of users.
In actual credit card transactions, the data is complex, constantly changing, and
continuously arrives online. Therefore, credit card transaction data can be precisely
called a data stream. However, even if we use data mining for such data, an operator
can monitor only around 2,000 transactions per day; we therefore have to detect
suspicious transactions effectively by analyzing less than 0.02% of the total number of
transactions. In addition, fraud detection from massive amounts of transaction data is
extremely difficult, because real fraud occurs at an extremely low rate, i.e., within
0.02% to 0.05% of all the transaction data.
A. Niimi / Deep Learning with Large Scale Dataset for Credit Card Data Analysis
In a previous paper, transaction data in CSV format were described as attributes in
time order [3]. Credit card transaction data have 124 attributes; 84 of them are called
transactional data and include an attribute used to discriminate whether the record
refers to fraud, while the others are called behavioral data and refer to credit card
usage. The inflow file size is approximately 700 MB per month.
Mining the credit card transaction data stream presents inherent difficulties, since it
requires performing efficient calculations on an unlimited data stream with limited com-
puting resources. Therefore many streams mining methods seek an approximate or prob-
abilistic solution instead of an exact one. However, since the actual unauthorized credit
card use is very small, these imprecise solutions do not allow the appropriate detection
of fraud.
2. Deep Learning
Deep learning is a new technology that has recently attracted much attention in the
field of machine learning. It significantly improves the accuracy of abstract
representations by constructing deep structures resembling the neural circuitry of the
human brain. Deep learning algorithms have been honored in various competitions,
such as the International Conference on Learning Representations.
Deep learning is a generic term for multilayer neural networks, which have been
researched for a long time [1, 2, 7]. Multilayer neural networks decrease the overall
calculation time by performing calculations on hidden layers. However, they were
prone to excessive overtraining, as an intermediate layer was often used for
approximately every single layer.
Technological advances have since suppressed overtraining, while GPU utilization
and parallel processing have increased the number of hidden layers.
A sigmoid or a tanh function was commonly used as the activation function (see
Equations 1 and 2), although recently the maxout function has also been used
(section 2.1). The dropout technique is implemented to prevent overtraining
(section 2.2).
2.1. Maxout
Given an input x ∈ R^d, a maxout hidden unit computes

h_i(x) = max_{j ∈ [1,k]} z_ij, where z_ij = x^T W_{·ij} + b_ij,

and W ∈ R^{d×m×k} and b ∈ R^{m×k} are learned parameters. In a convolutional
network, a maxout feature map can be constructed by taking the maximum across k
affine feature maps (i.e., pooling across channels, in addition to spatial locations).
When training with dropout, we perform the element-wise multiplication with the
dropout mask immediately prior to the multiplication by the weights, in all cases;
inputs are not dropped to the max operator. A single maxout unit can be interpreted as
a piecewise linear approximation of an arbitrary convex function. Maxout networks
learn not just the relationship between hidden units, but also the activation function of
each hidden unit.
Maxout abandons many of the mainstays of traditional activation function design.
The representation it produces is not sparse at all, though the gradient is highly sparse,
and the dropout will artificially sparsify the effective representation during training. Al-
though maxout may learn to saturate on one side or another, this is a measure zero event
(so it is almost never bounded from above). Since a significant proportion of parame-
ter space corresponds to the function delimited from below, maxout learning is not con-
strained at all. Maxout is locally linear almost everywhere, whereas many popular acti-
vation functions have significant curvature. Given all of these deviations from standard
practice, it may seem surprising that maxout activation functions work at all, but we find
that they are very robust, easy to train with dropout, and achieve excellent performance.
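A pure-Python sketch of one maxout layer, following the h_i(x) = max_j z_ij definition above (the example weights are ours, chosen so the unit computes an easily checked convex function):

```python
def maxout(x, W, b):
    """h_i(x) = max_j (x . W[:, i, j] + b[i][j]) for a d-input, m-unit, k-piece layer."""
    d, m, k = len(W), len(W[0]), len(W[0][0])
    return [max(sum(x[t] * W[t][i][j] for t in range(d)) + b[i][j]
                for j in range(k))
            for i in range(m)]

# One maxout unit (m=1) with k=2 pieces over d=2 inputs: with these weights it
# computes max(x1, x2), a simple convex piecewise-linear function.
W = [[[1.0, 0.0]],   # W[0][0]: weights of input x1 for pieces j=0, j=1
     [[0.0, 1.0]]]   # W[1][0]: weights of input x2 for pieces j=0, j=1
b = [[0.0, 0.0]]
print(maxout([1.0, 2.0], W, b))  # [2.0]
```

With k = 2 and one piece fixed at zero, the same unit reduces to a ReLU-like max(w·x + c, 0), illustrating how maxout learns the shape of its own activation function.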
2.2. Dropout
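The dropout technique referred to above can be sketched as "inverted" dropout, a common formulation (this is our own illustration, not the paper's implementation): during training, each hidden unit is zeroed with probability p and the survivors are rescaled by 1/(1 − p), so no rescaling is needed at test time.

```python
import random

def dropout(h, p, training=True, rng=random):
    """Inverted dropout: zero each activation with prob. p, scale survivors by 1/(1-p)."""
    if not training or p == 0.0:
        return list(h)                      # test time: identity
    keep = 1.0 - p
    return [0.0 if rng.random() < p else v / keep for v in h]

rng = random.Random(0)
out = dropout([1.0] * 8, p=0.5, rng=rng)
# Each surviving unit becomes 2.0; on average half of the units are dropped,
# so the expected value of each activation is unchanged.
```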
3. Data Processing Infrastructure
In this section, we consider the data processing infrastructure that is suitable for the
analysis of credit card data, as well as the applications of deep learning to credit card
data analysis.
3.1. R
R is a language and environment for statistical computing and graphics [8]. It is a GNU
project similar to the S language and environment, which was developed at Bell
Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and
colleagues. R can be considered a different implementation of S; there are some
important differences, but much code written for S runs unaltered under R. R is
available as free software and is widely used. It includes many useful libraries, for
example for multivariate analysis and machine learning, and it is suitable for data
mining.
However, R performs processing in memory and is therefore not suitable for
processing large amounts of data.
Google BigQuery [9] and Amazon Redshift [10] are systems for querying large
amounts of data. These cloud systems can easily store large amounts of data and
process them at high speed, so we can use them to analyze data trends interactively.
However, their support for data processing such as machine learning still needs to be
developed further.
Apache Hadoop is also a platform for handling large amounts of data [11]. It divides
processing into Map and Reduce phases, which operate in parallel: Map processes the
data, whereas Reduce summarizes the results. In combination, these processes realize
high-speed processing of large amounts of data. However, since processing is
performed in batches, the Map/Reduce cycle cannot be completed before all the data
are stored, and it is difficult to apply separate algorithms to different Map/Reduce
batches. In particular, it is difficult to run an algorithm repeatedly over the same data,
as is required in machine learning.
Apache Storm is designed to process data streams [12]: data conversion is executed on
incessantly flowing data. The data source is called the Spout, and the part that performs
the conversion process is called the Bolt. Apache Storm performs processing by a
combination of Bolts fed from a Spout.
Apache Spark is also a platform that processes large amounts of data [13]. Apache
Spark generalizes Map/Reduce processing. It caches the working set in memory and is
designed to execute efficient iterative algorithms by keeping shared data, used for
repeated processing, in memory. In addition, machine learning and graph algorithm
libraries are provided, making it easy to build an environment for stream data mining.
H2O is a deep learning library that runs on Spark [14, 15].
SparkR is an R package that provides a lightweight frontend for Apache Spark
from R [16]. In Spark 1.5.0, SparkR provides a distributed data frame implementation
that supports operations such as selection, filtering, and aggregation, similar to R data
frames and dplyr, but on large datasets. SparkR also supports distributed machine
learning using MLlib.
In the present paper, we perform credit card data analysis using R and Spark. This
makes it possible to use R's extensive libraries while gaining high performance from
Spark's parallel and distributed processing.
4. Experiments
We used the credit approval dataset from the UCI Machine Learning Repository to
evaluate the experimental results [4].
All attribute names and values were reassigned to meaningless symbols to protect
the confidentiality of the data.
In addition, the original dataset contains missing values; in the experiment, we use
a pre-processed dataset [17], as presented in Table 1.
Deep learning uses the R library of H2O [14, 15]. H2O is a library for Hadoop and
Spark, but it also provides an R package.
For comparison, we also use five typical machine learning algorithms. In addition,
the deep learning parameters (activation functions and dropout parameters) are varied
five times. In this experiment, the hidden layer neurons are set to (100, 100, 200) for
deep learning. The parameters used are shown in Table 2.
XGBoost is an optimized general-purpose gradient boosting library [18]. The
library is parallelized and provides an optimized distributed version. It implements
machine learning algorithms under the gradient boosting framework, including a
generalized linear model and gradient boosted decision trees. XGBoost can also be
distributed and scaled to terascale data.
The activation functions used here are summarized in Table 3 [15].
Moreover, to ascertain whether there is a bias between the results on the training data
and the test data, we perform 10-fold cross-validation using the entire dataset. In this
experiment, the hidden layer neurons are set to (200, 200, 200).
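The 10-fold protocol can be sketched as follows (pure Python; only the index bookkeeping, not the H2O training itself, and the record count of 690 for the UCI credit approval set is our assumption):

```python
import random

def k_fold_indices(n, k=10, seed=42):
    """Yield (train, test) index lists: each sample appears in exactly one test fold."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)          # shuffle once, then slice into k folds
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

splits = list(k_fold_indices(690))            # assumed size of the credit approval set
print(len(splits))                            # 10
```

Averaging the per-fold accuracies gives the cross-validated estimate reported in Table 5.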
In the experiment, we use the following environment.
In this paper, the proposed methods are also validated using a large-scale transaction
dataset. We constructed a dataset from the actual card transaction dataset that contains
the same number of attributes (130 attributes), with the value of each attribute
generated at random within the same range. The dataset has about 300,000 transactions,
which include about 3,000 illegal usages, and covers six months of data. Because this
dataset has random values, it cannot be used to evaluate accuracy; we used it to
estimate machine specifications and calculation times.
The percentage of fraud in the dataset is very low, so we used all illegal usages
(approximately 3,000) and a sample of normal usages (approximately 3,000) in the
experiment.
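The balancing step described above (keep all fraud rows, undersample normals to match) can be sketched as follows (function name and the synthetic data are ours):

```python
import random

def balance(rows, labels, seed=7):
    """Keep every fraud row (label 1) and an equal-size random sample of normals."""
    fraud = [r for r, y in zip(rows, labels) if y == 1]
    normal = [r for r, y in zip(rows, labels) if y == 0]
    sample = random.Random(seed).sample(normal, len(fraud))
    return fraud + sample, [1] * len(fraud) + [0] * len(sample)

# Synthetic example: 1% fraud, mirroring the paper's ~3,000-in-300,000 ratio.
rows = list(range(1000))
labels = [1 if i < 10 else 0 for i in rows]
X, y = balance(rows, labels)
print(len(X), sum(y))  # 20 10: a 50/50 balanced sample
```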
We used an Amazon EC2 r3.8xlarge instance (32 cores, 244 GB memory) for the
experiment. As a preliminary experiment, deep learning parameters of hidden layer
neurons (100, 100, 200) and epochs (200) were used, but the learning did not converge.
Therefore, in the experiment, hidden layer neurons (2048, 2048, 4096), epochs (2000),
and hidden dropout ratios (0.75, 0.75, 0.7) were used, with "Maxout with Dropout" as
the activation function.
The experimental results are currently being analyzed.
For comparison, we also evaluate our proposed method on public time-series
benchmark data: the gas sensor dataset from the UCI Machine Learning Repository
[4, 19, 20]. We are going to run the experiment, tune the parameters, and analyze the
obtained results.
5. Experimental Results
Teble 4 shows the experimental results. We run each algorithms five times and the Table
4 presents the average. Because the machine learning algorithms that we used have no
initial value dependent, the results of the algorithms are the same, all five times.
The deep learning results depend on the initial parameters. The accuracy of deep learning with Maxout with Dropout is close to that of the Gaussian kernel SVM.
Table 5 shows the results of the 10-fold cross-validation, where N and Y are the class attributes. Stable results are obtained regardless of the dataset.
6. Considerations
The presently conducted experiments confirm that deep learning has the same accuracy
as the Gaussian kernel SVM.
In addition, the 10-fold cross-validation experiment indicates that deep learning offers higher precision.
In this experiment, we used the H2O library for deep learning, and the deep learning modules written in Java were started anew each time. Therefore, we cannot assess the execution time.
Deep learning parameter adjustment is difficult. By optimizing the parameters, it is
possible to increase the learning accuracy.
There are other approaches for time-series datasets [21, 22]. Although they differ from the proposed method, they may be useful for improving it.
7. Conclusion
In this paper, we considered the application of deep learning to credit card data analysis. We introduced two major applications and proposed deep learning methods for them. To verify the proposed methods, we ran benchmark experiments against other machine learning algorithms. These experiments confirmed that deep learning achieves the same accuracy as the Gaussian kernel SVM. The proposed methods were also validated using a large-scale transaction dataset.
In the future, we will consider evaluation experiments using the transaction data and real datasets.
Acknowledgment
The authors would like to thank Intelligent Wave Inc. for many comments on the credit card transaction datasets.
References
[1] Y. Bengio. Learning Deep Architectures for AI. Foundations and Trends in Machine Learning, 2(2009):1-127.
[2] I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. Courville, and Y. Bengio. Maxout Networks. ArXiv
e-prints, Feb., 2013.
[3] T. Minegishi and A. Niimi. Detection of Fraud Use of Credit Card by Extended VFDT, in World
Congress on Internet Security (WorldCIS-2011), London, UK, Feb., (2011), 166–173.
[4] M. Lichman. UCI Machine Learning Repository. (2013), (Access Date: 15 September, 2015). [Online].
Available: http://archive.ics.uci.edu/ml
[5] T. J. OZAKI. Data scientist in ginza, tokyo. (2015), (Access Date: 15 September, 2015). [Online]. Avail-
able: http://tjo-en.hatenablog.com/
[6] A. Niimi. Deep Learning for Credit Card Data Analysis, in World Congress on Internet Security
(WorldCIS-2015), Dublin, Ireland, Oct., (2015), 73–77.
[7] Q. Le. Building High-Level Features using Large Scale Unsupervised Learning. in Acoustics, Speech
and Signal Processing (ICASSP), 2013 IEEE International Conference on, May, (2013), 8595–8598.
[8] R: The R project for statistical computing. (Access Date: 15 September, 2015). [Online]. Available:
https://www.r-project.org/
[9] Google cloud platform. what is BigQuery? - Google BigQuery. (Access Date: 15 September, 2015).
[Online]. Available: https://cloud.google.com/bigquery/what-is-bigquery
[10] AWS Amazon Redshift. Cloud Data Warehouse Solutions. (Access Date: 15 September, 2015). [Online].
Available: https://aws.amazon.com/redshift/
[11] Apache Hadoop. Welcome to Apache Hadoop! (Access Date: 15 September, 2015). [Online]. Available:
https://hadoop.apache.org/
[12] Apache Storm. Storm, distributed and fault-tolerant realtime computation. (Access Date: 15 September,
2015). [Online]. Available: https://storm.apache.org/
[13] Apache Spark. Lightning-Fast Cluster Computing. (Access Date: 15 September, 2015). [Online]. Avail-
able: https://spark.apache.org/
[14] 0xdata — H2O.ai — Fast Scalable Machine Learning. (Access Date: 15 September, 2015). [Online].
Available: http://h2o.ai/
[15] A. Candel and V. Parmar. Deep Learning with H2O. H2O, (2015), (Access Date: 15 September, 2015).
[Online]. Available: http://learnpub.com/deeplearning
[16] SparkR (R on Spark) — Spark 1.5.0 Documentation. (Access Date: 15 September, 2015). [Online].
Available: https://spark.apache.org/docs/latest/sparkr.html
[17] T. J. OZAKI. Credit Approval Data Set, modified. (2015), (Access Date: 15 September, 2015).
[Online]. Available: https://github.com/ozt-ca/tjo.hatenablog.samples/tree/
master/r_samples/public_lib/jp/exp_uci_datasets/card_approval
[18] dmlc XGBoost extreme Gradient Boosting. (Access Date: 15 September, 2015). [Online]. Available:
https://github.com/dmlc/xgboost
[19] A. Vergara, S. Vembu, T. Ayhan, M. Ryan, M. Homer, and R. Huerta. Chemical Gas Sensor Drift Com-
pensation using Classifier Ensembles. Sensors and Actuators B: Chemical, 166(1), (2012), 320–329.
[20] I. Rodriguez-Lujan, J. Fonollosa, A. Vergara, M. Homer, and R. Huerta. On the Calibration of Sensor Ar-
rays for Pattern Recognition using the Minimal Number of Experiments. Chemometrics and Intelligent
Laboratory Systems, 130, (2014), 123–134.
[21] S. Yin, X. Xie, J. Lam, K. C. Cheung, and H. Gao. An Improved Incremental Learning Approach for
KPI Prognosis of Dynamic Fuel Cell System. IEEE Transactions on Cybernetics, PP(99), (2015), 1–10.
[22] S. Yin, H. Gao, J. Qiu, and O. Kaynak. Fault Detection for Nonlinear Process with Deterministic Dis-
turbances: A Just-In-Time Learning Based Data Driven Method. IEEE Transactions on Cybernetics,
PP(99), (2016), 1–9.
Fuzzy Systems and Data Mining II 159
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-159
Abstract. Uncertain data is data accompanied by occurrence probabilities, which makes frequent itemset mining more challenging. Given the data size n, computing the probabilistic support needs O(n(log n)²) time and O(n) space. This paper focuses on the problem of mining probabilistic frequent itemsets over uncertain databases and proposes the PFIMSample algorithm. We employ the Chebyshev inequality to estimate the frequency of itemsets, which reduces part of the computation from O(n(log n)²) to O(n). In addition, we propose a sampling technique to improve the performance. Our extensive experimental results show that our algorithm achieves significantly improved runtime and memory costs with high accuracy.
Introduction
The restraint of physical factors, data preprocessing, and data privacy protection methods all introduce uncertainty into data, which is significant for continuously arriving data [1]. By introducing the probability of data occurrence, we can improve the robustness of data mining methods and guarantee that the data analysis achieves exact and precise knowledge, which is valuable for user decisions. Frequent itemset mining algorithms over certain databases have achieved many good results [2-4]. Nevertheless, the uncertainty of data [5, 6] brings new challenges.
According to the different definitions of frequent itemsets over uncertain data, the mining methods can be categorized into two types: one is based on the expected support and the other on the probabilistic support [7-22]. The methods based on the expected support mainly use the expectation of the itemset support to evaluate whether an itemset is frequent; the methods based on the probabilistic support consider an itemset frequent when its support is larger than the minimum
support with a specified high probability. If the database size is n, the former has O(n) time complexity and O(1) space complexity, while the latter has O(n(log n)²) time complexity and O(n) space complexity [11]. Clearly, the former has much higher performance. The latter, however, can represent the probabilistic characteristics of frequent itemsets.
¹ Corresponding Author: Hai-Feng LI, School of Information, Central University of Finance and Economics, Beijing 100081, China; E-Mail: mydlhf@cufe.edu.cn.
160 H.-F. Li et al. / Probabilistic Frequent Itemset Mining Algorithm
Since computing the probabilistic support is complicated, it is the more challenging case. In this paper, we focus on this problem and propose a frequency estimation method based on the Chebyshev inequality and sampling to compute the probabilistic support approximately, guaranteeing the accuracy by theoretical analysis. We also verify the method experimentally.
This paper is organized as follows. Section 1 introduces the preliminaries of frequent itemset mining. Section 2 presents our PFIMSample algorithm in detail. Section 3 presents the experimental results. Section 4 concludes this paper.
1. Preliminaries
We call an itemset X with size k a k-itemset. Assuming X contains items x_t (1 ≤ t ≤ k), each with probability p_t, then X is an uncertain itemset, denoted as X = {x1, p1; x2, p2; ...; xk, pk}. For an uncertain dataset UD = {UT1, UT2, ..., UTv}, each UTi (i = 1...v) denotes a transaction over the set of items, which has an id and a corresponding itemset X, denoted as (tid, X). Figure 1 shows a simple uncertain dataset which, under the possible world model, can be converted into multiple certain datasets, each with a probability; each certain dataset is called a possible world.
Definition 1 (Count Support): Given the uncertain database UD and an itemset X, the occurrence count of X is called the count support of X, denoted as Λ_UD(X), or Λ(X) for short.
Definition 2 (Possible World) [9]: Given the uncertain database UD, a generated possible world PW has |UD| transactions, each transaction Ti being a subset of UTi; it is denoted as PW = {T1, T2, ..., T|UD|}, in which Ti ⊆ UTi.
Provided the uncertain transactions are independent, the probability of a possible world, p(PW), can be computed as follows. If an item x exists in both Ti and UTi, we take its occurrence probability p(x); if x exists in UTi but not in Ti, we take its absence probability p(x̄) = 1 - p(x). Multiplying all these probabilities over all transactions gives
p(PW) = Π_i ( Π_{x ∈ UTi \ Ti} p(x̄) · Π_{x ∈ Ti} p(x) ).
Using Ψ to denote the set of possible worlds generated from UD, the size of Ψ increases exponentially w.r.t. the size of UD. That is, if UD has m transactions and transaction i has n_i items, then Ψ has 2^(Σ_{i=1}^{m} n_i) possible worlds.
The left part of Figure 1 shows an uncertain dataset with 2 transactions, each containing two items. As shown in the right part of Figure 1, there are 2^(2+2) = 16 possible worlds, each with an occurrence probability. For example, possible world PW6 has two transactions T1 and T2 that both equal {A}. The probability of PW6 is
p(PW6) = p(A ∈ T1) · p(B ∉ T1) · p(A ∈ T2) · p(C ∉ T2) = 0.6 × 0.3 × 0.2 × 0.7 = 0.0252.
As can be seen, the probabilities of all the possible worlds sum to 1.
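The possible-world semantics can be checked with a small enumeration. The item probabilities below are inferred from the worked PW6 example (the factors 0.6, 0.3, 0.2, 0.7) and are an assumption, since the actual Figure 1 data are not reproduced in the text.

```python
from itertools import product

# Uncertain dataset: two transactions, each with two probabilistic items.
# These probabilities are back-solved from the PW6 example (0.6*0.3*0.2*0.7)
# and are an assumption, not the actual Figure 1 data.
UD = [{"A": 0.6, "B": 0.7},   # UT1
      {"A": 0.2, "C": 0.3}]   # UT2

def possible_worlds(ud):
    """Enumerate every possible world with its occurrence probability."""
    per_tx = []
    for ut in ud:
        items = list(ut)
        choices = []
        # Every subset of a transaction's items is a possible instantiation.
        for mask in product([False, True], repeat=len(items)):
            present = {x for x, keep in zip(items, mask) if keep}
            p = 1.0
            for x in items:
                p *= ut[x] if x in present else (1.0 - ut[x])
            choices.append((present, p))
        per_tx.append(choices)
    worlds = []
    for combo in product(*per_tx):
        txs = [t for t, _ in combo]
        p = 1.0
        for _, q in combo:
            p *= q
        worlds.append((txs, p))
    return worlds

worlds = possible_worlds(UD)
assert len(worlds) == 16                                # 2^(2+2) worlds
assert abs(sum(p for _, p in worlds) - 1.0) < 1e-9      # probabilities sum to 1
# The world where both transactions reduce to {A} has probability 0.0252.
p_pw6 = next(p for txs, p in worlds if txs == [{"A"}, {"A"}])
assert abs(p_pw6 - 0.0252) < 1e-9
```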
In the uncertain database UD, frequent itemsets are defined via the possible world model. If itemset X has support Λ_PW(X) in a possible world PW, then its associated probability p_PW(X) is the probability of that world, p(PW). We can use a 2-tuple <Λ_PW(X), p_PW(X)> to denote it. In UD, X has 2^(Σ_{i=1}^{m} n_i) such tuples, which can be summarized into the probability vector P_Λ(X) by summing the probabilities of equal support values.
Definition 3 (Probabilistic Frequent Itemset) [10, 23]: Given the uncertain database UD, the minimum support λ and the minimum probabilistic confidence τ, an itemset X is a (λ, τ)-probabilistic frequent itemset if its probabilistic support Λ_τ^P(X) ≥ λ, in which Λ_τ^P(X) = Max{ i | P(Λ(X) ≥ i) > τ }.
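Given the distribution of Λ(X) over all possible worlds, Definition 3 reduces to a scan over tail probabilities. A minimal sketch (the pmf values below are made up for illustration):

```python
def probabilistic_support(pmf, tau):
    """Return Max{ i | P(support >= i) > tau } for a support pmf.

    pmf maps a support value i to the probability that the support equals i.
    """
    n = max(pmf)
    tail = 0.0
    # Walk support values from high to low, accumulating P(support >= i);
    # the first (largest) i whose tail exceeds tau is the probabilistic support.
    for i in range(n, 0, -1):
        tail += pmf.get(i, 0.0)
        if tail > tau:
            return i
    return 0

# Hypothetical pmf: P(sup=0)=0.05, P(sup=1)=0.15, P(sup=2)=0.5, P(sup=3)=0.3.
pmf = {0: 0.05, 1: 0.15, 2: 0.5, 3: 0.3}
# With tau = 0.7: P(sup>=3) = 0.3, but P(sup>=2) = 0.8 > 0.7, so the
# probabilistic support is 2; X is frequent iff the minimum support lambda <= 2.
assert probabilistic_support(pmf, 0.7) == 2
```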
2. The PFIMSample Algorithm
For an uncertain database of size n, the probabilistic support can be computed with a divide-and-conquer method [11] of time complexity O(n(log n)²) and space complexity O(n). As can be seen, n is the key factor determining the computing efficiency; if we can decrease n, the runtime cost decreases accordingly.
According to the law of large numbers, when n is large enough, the data tend to fit the normal distribution. Based on this, we propose our sampling-based mining algorithm PFIMSample. The details are as follows.
1) Scan the database to obtain the statistical characteristics of the data, that is, the average and the variance of the itemset probabilities.
2) Scan the database and compute the count support and the expected support, where the expected support is the sum of the probabilities.
3) For a given sampling parameter, randomly sample the database so that the acquired data fit the normal distribution. Since we assume the uncertain database itself fits the normal distribution when the data are massive enough, we use simple systematic sampling, which guarantees the mining efficiency with a similar distribution. On the other hand, sampling reduces the database size, which may reduce the mining accuracy; we evaluate this in our experiments and find that the accuracy is not related to the sampling rate. For each item, we scan the sampled database and compute its probabilistic support; if it is larger than the minimum support, the item is a 1-frequent itemset.
4) Match all the n-probabilistic frequent itemsets to generate the (n+1)-candidate itemsets, and compute their probabilistic supports to determine whether they are frequent.
5) Repeat phase 4) until no new probabilistic frequent itemsets are generated, then output the results.
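The steps above can be sketched as a levelwise loop. The helper names below are illustrative, and the naive O(n²) dynamic-programming check stands in for the faster divide-and-conquer computation of [11]; this is a readable sketch, not the authors' implementation.

```python
from itertools import combinations

def occ_prob(tx, itemset):
    """Probability that all items of `itemset` occur in uncertain transaction tx."""
    p = 1.0
    for x in itemset:
        p *= tx.get(x, 0.0)
    return p

def is_prob_frequent(db, itemset, minsup, tau):
    """Exact Poisson-binomial check: P(support >= minsup) > tau.

    Naive O(n^2) dynamic programming over transactions; the paper relies on
    a faster divide-and-conquer method [11], this is only a stand-in.
    """
    dist = [1.0]                       # dist[k] = P(support == k) so far
    for tx in db:
        q = occ_prob(tx, itemset)
        nxt = [0.0] * (len(dist) + 1)
        for k, p in enumerate(dist):
            nxt[k] += p * (1 - q)      # itemset absent from this transaction
            nxt[k + 1] += p * q        # itemset present in this transaction
        dist = nxt
    return sum(dist[minsup:]) > tau

def pfim_sample(db, sample, minsup_rate, tau, max_size=3):
    """Levelwise sketch of PFIMSample: find 1-itemsets on the sample, then
    grow candidates and verify them (steps 3-5 in the text)."""
    items = sorted({x for tx in db for x in tx})
    minsup = max(1, int(minsup_rate * len(sample)))
    frequent = [frozenset([x]) for x in items
                if is_prob_frequent(sample, [x], minsup, tau)]
    result = list(frequent)
    size = 1
    while frequent and size < max_size:
        cand = {a | b for a, b in combinations(frequent, 2) if len(a | b) == size + 1}
        frequent = [c for c in cand if is_prob_frequent(sample, c, minsup, tau)]
        result += frequent
        size += 1
    return result

# Tiny illustrative uncertain database; here we "sample" the whole database.
db = [{"A": 0.9, "B": 0.8}, {"A": 0.85, "B": 0.1}, {"A": 0.95, "C": 0.9}]
found = pfim_sample(db, db, minsup_rate=0.5, tau=0.5)
assert frozenset(["A"]) in found
```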
In the PFIMSample algorithm, when the single items are generated, we scan the full database rather than the sampled database to compute the probabilistic support, in order to guarantee accuracy; consequently, that computing cost remains O(n(log n)²). We use a heuristic-rule-based pruning strategy in phase 3).
According to the Chebyshev inequality, a variable X with expectation E(X) and standard deviation D(X) satisfies, for any constant ε > 0, P(|X - E(X)| ≥ ε) ≤ D(X)²/ε². That is, in an arbitrary dataset, the fraction of the distribution lying within m·D(X) of the expected support is at least 1 - 1/m², where m is a positive number larger than 1. For example, if m = 5, then with probability at least 1 - 1/25 = 96% the support is larger than E(X) - 5D(X). Thus, before determining the frequency of a given X, we first compute its expected support and standard deviation; if E(X) - m·D(X) is larger than the minimum support, then X is a probabilistic frequent itemset with probability 1 - 1/m². Since computing the expected support costs only O(n), far less than computing the probabilistic support, we can prune itemsets efficiently, and this benefit grows with larger n.
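The pruning test can be written in a few lines. This is an illustrative sketch of the check described above, with hypothetical numbers:

```python
import math

def chebyshev_prune(probs, minsup, m=5.0):
    """Chebyshev-style pre-check sketched from the text: the support of an
    itemset with per-transaction occurrence probabilities `probs` has
    expectation E = sum(p) and standard deviation D = sqrt(sum(p*(1-p))).
    If E - m*D >= minsup, the itemset is frequent with probability at least
    1 - 1/m^2, so the O(n (log n)^2) probabilistic-support computation can
    be skipped for it."""
    E = sum(probs)
    D = math.sqrt(sum(p * (1 - p) for p in probs))
    return E - m * D >= minsup

# Hypothetical itemset occurring in 1000 transactions with probability 0.9:
# E = 900, D = sqrt(1000 * 0.09) ~ 9.49, so E - 5D ~ 852.6.
probs = [0.9] * 1000
assert chebyshev_prune(probs, minsup=800)        # safely frequent, skip full check
assert not chebyshev_prune(probs, minsup=880)    # inconclusive, needs full check
```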
To keep the memory cost low, we use a prefix tree to maintain the itemsets together with their count supports, expected supports, and probabilistic supports. Note that our algorithm does not store the probability density function of each itemset, because its space complexity is O(n) and many itemsets would result in massive memory usage. Since the probabilistic support needs to be computed only once, the probability density functions can be deleted once the probabilistic support is obtained, which significantly improves the performance.
3. Experimental Results
We compared the performance and the accuracy with the minimum probabilistic confidence set to 0.9. Our algorithm was implemented in Python 2.7 under Windows 7 and run on an i7-4790M 3.6 GHz CPU with 4 GB memory. Two uncertain datasets were used to evaluate our algorithm: one is GAZELLE, which contains real e-commerce click-stream data; the other is the synthetic dataset T25I15D320K generated by the IBM generator. We assigned each item a probability drawn from a Gaussian distribution, which is widely accepted in current research on uncertain data [16]. The characteristics of the two datasets are shown in Table 1. Since our sampling method is a framework that can be applied to existing algorithms, we employed the state-of-the-art method TODIS [11] as the benchmark algorithm; that is, TODIS is equivalent to running PFIMSample with a sampling rate of 1.
Table 1. The Characteristics of Uncertain Datasets
We first ran the PFIMSample algorithm on the two datasets with different sampling rates. From Figures 2 and 3 we can see that, for a fixed minimum support, the runtime grows with the sampling rate. When the sampling rate is 0.01, the mining cost is very low; the runtime can be up to 100-fold lower. Furthermore, reducing the minimum support yields the same performance trend, which is more significant on the T25I15D320K dataset because T25I15D320K is denser than GAZELLE.
Figure 2. Runtime VS Sampling rate (GAZELLE) Figure 3. Runtime VS Sampling rate (T25I15D320K)
Figures 4 and 5 compare the memory usage over different sampling rates. We can see that the memory cost grew, but not significantly, as the sampling rate increased. On the other hand, the memory usage was not related to the minimum support, because we used a relative minimum support. Moreover, the memory usage is low when mining the sparse dataset GAZELLE.
Figure 4. Memory cost VS Sampling rate Figure 5. Memory cost VS Sampling rate
(GAZELLE) (T25I15D320K)
3.3. Precision and Recall
We used precision and recall to evaluate the accuracy of our algorithm. For the original mining results D and the results D' obtained with sampling, we define Precision = |D ∩ D'|/|D| and Recall = |D ∩ D'|/|D'|. The larger the precision and the recall, the higher the accuracy of our algorithm. Table 2 shows the precision and the recall for different sampling rates over the two datasets. As can be seen, when the minimum support is 0.08, our algorithm achieves 100% accuracy; it also achieves more than 90% accuracy on T25I15D320K in most cases. In addition, the accuracy of our algorithm is not related to the sampling rate since we use random samples.
Table 2. Precision and Recall
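The two measures can be computed directly from the result sets. Note that the paper's definitions divide precision by the original set D and recall by the sampled set D' (the reverse of the common convention); the sketch below follows the paper, with hypothetical result sets:

```python
def precision_recall(original, sampled):
    """Precision = |D ∩ D'| / |D|, Recall = |D ∩ D'| / |D'|, as defined
    in the text (note: precision over the original set D, recall over the
    sampled set D')."""
    inter = len(original & sampled)
    return inter / len(original), inter / len(sampled)

# Hypothetical result sets: the sampled run misses {B} and adds nothing spurious.
D = {frozenset({"A"}), frozenset({"B"}), frozenset({"A", "B"})}
D_prime = {frozenset({"A"}), frozenset({"A", "B"})}
p, r = precision_recall(D, D_prime)
assert abs(p - 2 / 3) < 1e-12   # 2 of the 3 original itemsets were recovered
assert r == 1.0                 # every sampled itemset is a true result
```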
4. Conclusions
This paper studied probabilistic frequent itemset mining over uncertain databases. The proposed algorithm PFIMSample employs the Chebyshev inequality to estimate the frequency of itemsets, and can thus partly reduce the computing cost from O(n(log n)²) to O(n). Moreover, we use a sampling method to improve the performance while keeping high accuracy. Our extensive experimental results on two datasets show that our algorithm is effective and efficient.
Acknowledgement
References
[1] B. Babcock, S. Babu, M. Datar, et al. Models and issues in data stream systems. Proceedings of PODS,
2002.
[2] J. Han, H. Cheng, D. Xin, et al. Frequent pattern mining: current status and future directions. Data
Mining & Knowledge Discovery. 15(2007):55-86.
[3] J. Chen, Y. Ke, W. Ng. A survey on algorithms for mining frequent itemsets over data streams.
Knowledge and Information System, 16(2008), 1-27.
[4] C. C. Aggarwal, P. S. Yu. A survey of uncertain data algorithms and applications. IEEE Transaction on
Knowledge and Data Engineering, 21(2009), 609-623.
[5] A. Y. Zhou, C. Q. Jin, G. R. Wang, et al. A survey on the management of uncertain data. Chinese
Journal of Computers, 31(2009).
[6] J. Z. Li, G. Yu, A. Y. Zhou. Challenge of uncertain data management. Chinese Computer
Communications, 5(2009).
[7] J. Xu, N. Li, X. J. Mao, et al. Efficient probabilistic frequent itemsets mining in big sparse uncertain data.
Proceedings of PRICAI, 2014.
[8] Y. Konzawa, T. Amagasa, H. Kitagawa. Probabilistic frequent itemset mining on a gpu cluster. IEICE
Transactions of Information and Systems, E97-D(2014) , 779-789.
[9] Q. Zhang, F. Li, K. Yi, Finding frequent items in probabilistic data. Proceedings of SIGMOD, 2008
[10] T. Bernecker, H. P. Kriegel, M. Renz, et al, Probabilistic frequent itemset mining in uncertain
databases. Proceedings of SIGKDD, 2009.
[11] L. Sun, R. Cheng, D. W. Cheung, et al, Mining uncertain data with probabilistic guarantees.
Proceedings of SIGKDD, 2010.
[12] T. Bernecker, H. P. Kriegel, M. Renz, et al, Probabilistic frequent pattern growth for itemset mining in
uncertain databases. Proceedings of SSDM, 2012.
[13] L. Wang, R. Cheng, S. D. Lee, et al, Accelerating probabilistic frequent itemset mining: a model-based
approach. Proceedings of CIKM, 2010.
[14] L. Wang, D. Cheung, R. Cheng, et al. Efficient mining of frequent item sets on large uncertain
databases. IEEE Transaction on Knowledge and Data Engineering, 24(2012), 2170-2183.
Introduction
With the change of traffic models, large-scale data center networks (DCNs) are often deployed in the Fat-Tree architecture as non-blocking networks, which over-provisions network resources and uses power inefficiently. Thus, the goal of network power conservation is to make the power consumption of networking devices proportional to the traffic load [1]. Many researchers have investigated energy saving for DCNs from different aspects. The article [2] proposed energy-saving routing based on ElasticTree. In [3], the authors proposed a data center energy-efficient network-aware scheduler. The article [4] presented an energy-efficient routing algorithm that balances network load and saves energy. In [5], the authors proposed a bandwidth-guaranteed energy-efficient DCN scheme from the perspective of routing and flow scheduling. The article [6] aimed to reduce the power consumption of DCNs from the routing perspective while meeting throughput requirements.
In DCNs, the network delay is also an important parameter reflecting network performance [7]. Traffic with high priority usually has strict transmission-delay demands. In this paper, a new energy-efficient routing algorithm is proposed that takes traffic priority into account. Its basic idea is to ensure that higher-priority traffic gets shorter routes, combined with bandwidth constraints, balancing energy consumption against traffic priority demands.
¹ Corresponding Author: Hu-Yin ZHANG, Shenzhen Research Institute of Wuhan University, Shenzhen, China; School of Computer Science, Wuhan University, Wuhan, China; E-mail: zhy2536@whu.edu.cn.
168 H.-Y. Zhang et al. / Priority Guaranteed and Energy Efficient Routing in Data Center Networks
Figure 1 shows the Fat-Tree architecture, which contains three tiers of switch modules: the core switches C_k, the aggregation switches A_k, and the edge switches S_k, where the subscript runs from one to the number of switches in the tier. This is conventionally denoted as a v(c, a, s) network. In order to save energy, as few links as possible should be used, so that as many switches as possible can work in sleep mode. Eq. (1) describes the minimum link number, which is intended to use a minimum number of switches in the v(c, a, s) network.
R_w in Eq. (1) is obtained from the array R by summing the nodes and taking the linear transform in Eq. (2); it expresses the number of active switches in each tier. In the array R, C_k, A_k, and S_k represent the names of the active switches in each tier, respectively. The problem is how to obtain the optimal array R under priority-guaranteed traffic while establishing the fewest links.
In order to establish the fewest links, the bandwidth utilization of the used links must be maximized. In array R, the higher-priority traffic chooses the shorter routing path. However, we then encounter the problem shown in Figure 2.
When traffic 1 (higher priority) uses the path A->B->E, traffic 2 and 3 have no path to use, and failure bandwidth (FB) occurs. If we analyze the traffic requirements globally, optimize the routing, and move traffic 1 onto the path A->D->E, then traffic 2 and 3 can both have their paths and the FB is 0. Although the higher-priority traffic takes a new route, the number of forwarding hops does not increase, so we can regard this change as adding no transmission delay.
2. PER Algorithm
The scheme is to compute transmission paths for all flows in the DCN topology and to reduce the energy consumption of the switches in this topology as much as possible.
• Step 2, update the priority parameter, and then configure the lower-priority traffic.
• Step 3, check whether there is any failure bandwidth; if yes, jump to Step 4, else jump to Step 6.
• Step 4, run the priority-guaranteed optimization algorithm.
• Step 5, check whether the optimized new route of the higher-priority traffic is longer than the existing one. If yes, the higher-priority traffic keeps its existing route and the lower-priority traffic takes the longer path; if no, the optimized new routing is executed. Then repeat Step 3.
• Step 6, judge whether all configurations are completed. If not, repeat Step 3; if so, generate the energy-efficient routing topology, and then turn the idle switches into sleep mode.
This algorithm is designed to select routing paths for flows with different priorities. Each selected path should eliminate failure bandwidth and make the link bandwidth utilization rate as high as possible. If there are many available paths for a flow, the problem can be converted to an undirected graph G = (S, E), where the weight of a link is the bandwidth left on it; the more bandwidth left, the bigger the weight of the link. We need to find the shortest path from the source node to the destination node while maximizing link bandwidth utilization. We use the following path selection rules:
• Rule 1, set an auxiliary matrix SA; each of its components SA[i][j] represents the weight of the link from source node S_i to node A_j.
• Rule 2, the state of SA: if there is a link from node S_i to A_j, SA[i][j] represents the weight of this link; if there is no link, SA[i][j] = -1. We choose the pair (S_i, A_j) such that SA[i][j] = Min{ SA[i][j] | A_j ∈ V }. If there are links with the same weight, we choose the node with the minimum subscript.
• Rule 3, set another auxiliary matrix AC; each of its components AC[j][k] represents the weight of the link from node A_j to C_k.
• Rule 4, the state of AC: if there is a link from node A_j to C_k, AC[j][k] represents the weight of this link; if there is no link, AC[j][k] = -1. We choose the pair (A_j, C_k) such that AC[j][k] = Min{ AC[j][k] | C_k ∈ V }. If there are links with the same weight, we choose the node with the minimum subscript.
• Rule 5, we store the nodes selected by Rules 1 to 4.
• Rule 6, if there is any failure bandwidth, all flows on the links related to the failure bandwidth are reconfigured according to Rules 1 to 4. We first select the routing path for the flow that caused the failure bandwidth, and then for the other flows in descending order of priority.
• Rule 7, all the nodes included in the routing generated by Rule 6 are stored in the auxiliary array D; then we compare the arrays D and R.
• Rule 8, if the higher-priority traffic uses more nodes in array D than in R, the higher-priority traffic preserves its status in array R and we copy R to D; otherwise the higher-priority traffic adopts the status in array D and we copy D to R.
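Rules 1 to 4 amount to a best-fit choice: among existing links that can carry the flow, pick the one with the smallest remaining bandwidth (smallest weight), breaking ties by the lowest node index, which packs flows tightly and frees whole switches for sleeping. A small sketch with a hypothetical weight row (the function name and data are illustrative, not the authors' code):

```python
def pick_link(weights, demand):
    """Best-fit link choice in the spirit of Rules 1-4 (illustrative sketch):
    among existing links (weight != -1) that can still carry `demand`, pick
    the one with minimum remaining bandwidth; ties go to the node with the
    smallest subscript. Returns the chosen column index, or None if every
    link fails (a 'failure bandwidth' situation, handled by Rule 6)."""
    best = None
    for j, w in enumerate(weights):
        if w == -1 or w < demand:      # no link, or not enough bandwidth left
            continue
        if best is None or w < weights[best]:
            best = j                   # strict < keeps the smallest index on ties
    return best

# Hypothetical residual bandwidths from one edge switch to three aggregation
# switches: no link to A0, 4 units left to A1, 7 units left to A2.
SA_row = [-1, 4, 7]
assert pick_link(SA_row, demand=3) == 1      # best fit: tightest feasible link
assert pick_link(SA_row, demand=5) == 2      # A1 too small, fall back to A2
assert pick_link(SA_row, demand=9) is None   # failure bandwidth: reconfigure
```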
When the path selections for all traffic are completed, the higher-priority flows are configured with fewer routing nodes, and the array R stores the switch nodes used on the links. Therefore, we can put the idle switches to sleep in order to save data center energy.
3. Evaluations
We evaluate our PER algorithm on Fat-Tree topologies using the Matlab 7.11 platform, and compare the results with random routing without priority guarantees. We use a simulation model with the network v(16, 32, 32), which includes eighty switch nodes. The available bandwidth of each link is randomly generated and does not exceed 10 Mbps. We select twelve traffic flows and set their priorities and flow capacities randomly. To simplify the simulation, we assume that the data processing abilities of each layer are the same, and set the transmission delay from the current node to the next node randomly between 30 and 50 ms.
Figure 5. Energy consumption (0-80 scale) of the Fat-Tree, Random, and PER topologies.
Figure 5 shows the energy consumption of the three topologies based on the network v(16, 32, 32). In the Fat-Tree topology, all eighty switches remain active even if some of them carry no traffic. With random routing, almost half of the switches are used, so nearly half can be turned into sleep mode. With PER, thanks to the increased link bandwidth utilization, about 75% of the switches can be turned into sleep mode, which greatly reduces the energy consumption.
4. Conclusion
In this paper, we address the power saving problem in DCNs from a routing perspective. We establish the network model and introduce the priority-guaranteed, energy-efficient routing problem. We then propose a routing algorithm that reduces energy consumption in DCNs while guaranteeing traffic priorities. The evaluation results demonstrate that, compared with random routing, our algorithm can effectively reduce both the transmission delay of higher-priority traffic and the power consumption of DCNs.
Acknowledgements
This work was supported by the National Natural Science Foundation of China under
Grant No. 61540059, and the Shenzhen science and technology projects under Grant
No. JCYJ20140603152449639.
References
[1] L.A. Barroso, U. Hölzle, The case for energy-proportional computing, Computer, 40(2010):33–37.
[2] B. Heller, S. Seetharaman, P. Mahadevan, et al., ElasticTree: Saving energy in data center networks. Proc. of the 7th USENIX Symp. on Networked Systems Design and Implementation (NSDI 10). New York: ACM, 2010:249–264.
[3] D Kliazovich, P Bouvry, S.U. Khan, DENS: Data Center Energy-Efficient Network-Aware Scheduling,
Cluster Computing, 16(2013):65–75.
[4] S Dong, R Li, X Li, Energy Efficient Routing Algorithm Based on Software Defined Data Center
Network, Journal of Computer Research and Development, 52(2015): 806–812.
[5] T. Wang, B. Qin, Z. Su, Y. Xia, M. Hamdi, et al., Towards bandwidth guaranteed energy efficient data center networking, Journal of Cloud Computing, 4(2015):1–15.
[6] M Xu, Y Shang, D Li, X Wang, Greening data center networks with throughput-guaranteed power-aware
routing, Computer Networks, 57(2013):2880–2899.
[7] W. Lao, Z. Li, Y. Bai, Methodology and Realization of Measure on Network Performance Parameter,
Computer Applications & Software, 21(2004).
Fuzzy Systems and Data Mining II 173
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-173
Keywords. data mining, DRAM, yield analysis, artificial neural network, fault detection.
Introduction
Due to the lengthy manufacturing process of a dynamic random access memory (DRAM) chip [1-2], it would be beneficial if any manufacturing error could be detected early, before the whole process is completed. To do so, on-line machine-fault detection should be performed to prevent further damage to the wafers in process [3-5]. In general, a yield rate much lower than average indicates a possible fault with high probability. Therefore, an on-line prediction of the yield rate would be helpful for machine-fault detection.
For any integrated circuit, the machining parameters in each step of the manufacturing process are usually specified. However, no physical or mathematical model exists that relates the machining parameters to the yield rate. To cope with this model-free problem, data mining techniques can be used to investigate this relationship by extracting information from the manufacturing data [6-7]. Therefore, in this paper, we propose using an ANN to build the functional relationship between the machining parameters and the yield rate, and to use the constructed ANN as a yield rate predictor [8-10]. The training algorithm for the proposed ANN will be introduced, and the ANN will be trained with real manufacturing data. To investigate the
1 Corresponding Author: Chun-Wei CHANG, Department of Electrical Engineering, Chang Gung University, Kwei-Shan, Tao-Yuan 333, Taiwan; E-mail: shinylin@mail.cgu.edu.tw.
174 C.-W. Chang and S.-Y. Lin / Yield Rate Prediction of a DRAM Manufacturing Process
effect of the size of the data set on training the ANN, the prediction accuracy of ANNs trained with different sizes of data sets will be investigated in this paper.
This paper is organized in the following manner. Section 1 presents the proposed
ANN. Section 2 presents the test results of the proposed ANN. Section 3 draws a
conclusion.
1. Construction of ANN
There are two parts to constructing an ANN as a yield rate predictor. The first part is to collect the machining parameters and the corresponding yield rates of DRAM wafers to serve as a training data set. The second part is to use the training data set to train the ANN.
There are hundreds to thousands of processing steps in manufacturing a DRAM chip. Each DRAM wafer may repeatedly visit the same machine, but with a different setup of machining parameters. To train the ANN, a pair of input and output data is collected for each wafer. The collected input data are the machining parameters, which consist of the following types: average thickness of the oxide coating, range of thickness of the oxide coating, average nitride thickness, range of nitride thickness, polish time of chemical mechanical planarization, photo dose, photo focus, etc. The output data is the yield rate of the wafer, e.g., 90%. Therefore, each input-output pair is formed by multiple input data and a single output datum, and the collected input-output pairs will serve as the training data set for the ANN.
Let $\mathbf{x} = [x_1, \ldots, x_N]^T$, where $x_1, \ldots, x_N$ represent the $N$ machining parameters. Let $y(\mathbf{x})$ represent the yield rate of the wafer, which is a function of the vector of machining parameters $\mathbf{x}$. Let $M$ denote the number of input-output pairs in the collected training data set. We employ a feed-forward back-propagation ANN that consists of an input layer, one hidden layer, and an output layer [11]. Fig. 1 shows the three-layer ANN consisting of $N$ input neurons, $q$ hidden-layer neurons, and one output neuron, where $\omega_{i,j}$, $i = 1, \ldots, q$, $j = 1, \ldots, N$, and $\beta_k$, $k = 1, \ldots, q$, represent the arc weights.
The $N$ neurons in the input layer correspond to $\mathbf{x}$, and the single output neuron is for $y(\mathbf{x})$. The input-layer neurons directly distribute each component of $\mathbf{x}$ to the neurons of the hidden layer. The hyperbolic tangent sigmoid function shown in Eq. (1) is used as the activation function of the hidden-layer neurons:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \qquad (1)$$
Figure 1. A three-layer feed-forward back-propagation ANN.
The procedure for training the ANN with the training data set, i.e., the $M$ input-output pairs of collected data, can be stated as follows. For a given input $\mathbf{x}_i$ to the ANN presented in Fig. 1, we let the corresponding output of the ANN be denoted by $\hat{y}(\mathbf{x}_i \mid \boldsymbol{\omega}, \boldsymbol{\beta})$, which can be calculated by the following formula:

$$\hat{y}(\mathbf{x}_i \mid \boldsymbol{\omega}, \boldsymbol{\beta}) = \sum_{k=1}^{q} \beta_k \tanh\Big( \sum_{j=1}^{N} \omega_{k,j} x_{ij} \Big) \qquad (2)$$

The arc weights are then determined by minimizing the mean squared error between the actual and predicted yield rates over the training data set:

$$\min_{\boldsymbol{\omega}, \boldsymbol{\beta}} \; \frac{1}{M} \sum_{i=1}^{M} \big\{ y(\mathbf{x}_i) - \hat{y}(\mathbf{x}_i \mid \boldsymbol{\omega}, \boldsymbol{\beta}) \big\}^2 \qquad (3)$$
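The training loop behind Eqs. (2)-(3) can be sketched with plain gradient descent on synthetic data. Since the real machining data are proprietary, the dimensions, learning rate, and target function below are illustrative assumptions, and the paper's actual training algorithm (e.g., Levenberg-Marquardt [12]) may differ:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (the paper uses N = 78 inputs, q = 150 hidden neurons).
N, q, M = 4, 8, 200

# Synthetic stand-in for the proprietary machining-parameter data.
X = rng.uniform(-1, 1, size=(M, N))
y = np.tanh(X @ rng.uniform(-1, 1, N))      # placeholder "yield rate" target

omega = rng.normal(0, 0.5, size=(q, N))     # hidden-layer arc weights omega_{k,j}
beta = rng.normal(0, 0.5, size=q)           # output-layer arc weights beta_k

def predict(X, omega, beta):
    # Eq. (2): y_hat = sum_k beta_k * tanh(sum_j omega_{k,j} * x_j)
    return np.tanh(X @ omega.T) @ beta

mse0 = np.mean((predict(X, omega, beta) - y) ** 2)   # error before training

lr = 0.05
for epoch in range(1000):                   # epoch count as termination criterion
    H = np.tanh(X @ omega.T)                # hidden activations, M x q
    err = H @ beta - y
    # Both gradients follow from differentiating the MSE of Eq. (3).
    beta -= lr * 2 * H.T @ err / M
    omega -= lr * 2 * ((err[:, None] * (1 - H ** 2) * beta).T @ X) / M

mse = np.mean((predict(X, omega, beta) - y) ** 2)
```

The gradient for the hidden weights uses the chain rule through the tanh activation, i.e., the factor $(1 - \tanh^2)$.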
2. Test Results
In this section, the prediction accuracy of the trained ANN is investigated. In addition,
the relationship between the size of training data set and the prediction accuracy is also
investigated. Therefore, five test cases with various sizes of training data sets are set up. In all test cases, the number of machining parameters is N = 78. The employed three-layer ANN consists of 78 input neurons, 150 (= q) hidden-layer neurons, and one
output neuron. An epoch count exceeding 1000 is chosen as the termination criterion for training the ANN. The value of M, which is the number of input-output pairs in the training data set, is set to M = 50 for case 1, 150 for case 2, 400 for case 3, 700 for case 4, and 1000 for case 5. For each case, we collect 2M pairs of input-output data from the real manufacturing process and separate them into two sets: the training and testing data sets. The training data set is used to train the employed three-layer ANN, and the testing data set is used to test the prediction accuracy of the trained ANN.
The prediction accuracy of the trained ANN is defined as the average percentage of the absolute error between the actual and predicted yield rates, denoted by $\bar{e}$ and calculated by the following equation:

$$\bar{e} = \frac{1}{M} \sum_{i=1}^{M} \left| \frac{y(\mathbf{x}_i) - \hat{y}(\mathbf{x}_i \mid \boldsymbol{\omega}, \boldsymbol{\beta})}{y(\mathbf{x}_i)} \right| \times 100\% \qquad (4)$$

The corresponding standard deviation of the prediction error is

$$\sigma_e = \sqrt{ \frac{1}{M} \sum_{i=1}^{M} \big( e_i - \bar{e} \big)^2 } \qquad (5)$$
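Eqs. (4)-(5) amount to a mean absolute percentage error and its population standard deviation; a direct sketch (the function name is ours, not the paper's):

```python
import numpy as np

def prediction_accuracy(y_true, y_pred):
    """Eq. (4)-(5): average absolute percentage error e_bar and its
    population standard deviation sigma_e over the test pairs."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    e = np.abs((y_true - y_pred) / y_true) * 100.0  # per-pair error in %
    return e.mean(), e.std()                        # e_bar, sigma_e
```

For instance, actual yields of 90% and 80% predicted as 81% and 84% give $\bar{e} = 7.5$ and $\sigma_e = 2.5$.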
Table 2. Standard deviation of the prediction accuracy of the trained ANN for the five cases

case          1       2       3      4      5
Size, M       50      150     400    700    1000
σe (%)        36.00   15.13   9.08   5.55   4.73
Figure 2. Distribution of the prediction error over the tested input-output pairs (number of input-output pairs versus percentage of prediction error, -30% to 30%).
From Figure 2, we see that most of the tested input-output pairs have very small prediction errors, which confirms that the proposed ANN can serve as a good yield rate predictor for the DRAM manufacturing process.
3. Conclusion
In this paper, a three-layer feed-forward back-propagation ANN is presented and used as a predictor of the yield rate of a DRAM manufacturing process. The proposed ANN is trained and tested using real manufacturing data. The test results reveal that the prediction errors are very small and that, as the size of the training data set increases, the prediction accuracy of the ANN increases and the associated standard deviation decreases. Therefore, the presented ANN is qualified to serve as a yield rate predictor for the future purpose of fault detection.
Acknowledgments
This research work is supported in part by Chang Gung Memorial Hospital under grant
BMRP29.
178 C.-W. Chang and S.-Y. Lin / Yield Rate Prediction of a DRAM Manufacturing Process
References
[1] K. Chandrasekar, S. Goossens, C. Weis, M. Koedam, B. Akesson, N. Wehn and K. Goossens, Exploiting
expendable process-margins in DRAMs for run-time performance optimization, Design, Automation &
Test in Europe Conference & Exhibition, 2014, 1-6.
[2] P. S. Huang, M. Y. Tsai, C. Y. Huang, P. C. Lin, L. Huang, M. Chang, S. Shih and J. P. Lin, Warpage,
stresses and KOZ of 3D TSV DRAM package during manufacturing process, 14th International
Conference on Electronic Materials and Packaging, 2012, 1-5.
[3] S. Hamdioui, M. Taouil and N. Z. Haron, Testing open defects in memristor-based memories, IEEE
Trans. on Computers, 64(2015), 247-259.
[4] R. Guldi, J. Watts, S. PapaRao, D. Catlett, J. Montgomery and T. Saeki, Analysis and modeling of
systematic and defect related yield issues during early development of a new technology, Advanced
Semiconductor Manufacturing Conference and Workshop, 4(1998), 7-12.
[5] L. Shen and B. F. Cockburn, An optimal march test for locating faults in DRAMs, Records of the 1993
IEEE International Workshop on Memory Testing, 1993, 61-66.
[6] A. Purwar and S. K. Singh, Issues in data mining: a comprehensive survey, IEEE International
Conference on Computational Intelligence and Computing Research, 2014, 1-6.
[7] J. Han and M. Kamber. Data mining concepts and techniques. 2nd ed. Morgan Kaufmann Publishers,
2006.
[8] B. Dengiz, C. Alabas-Uslu and O. Dengiz, Optimization of manufacturing systems using a neural
network metamodel with a new training approach, Journal of the Operational Research Society,
60(2009), 1191-1197.
[9] N. Alali, M. R. Pishvaie and V. Taghikhani, Neural network meta-modeling of steam assisted gravity
drainage oil recovery processes, Journal of Chemistry & Chemical Engineering, 29(2010), 109-122.
[10] T. Chen, H. Chen and R. Liu, Approximation capability in C(Rn) by multilayer feed-forward networks
and related problems, IEEE Transactions on Neural Networks, 6(1995), 25-30.
[11] J. A. Anderson. An introduction to neural network. MIT Press, Boston, USA, 1995.
[12] B. M. Wilamowski and H. Yu, Improved computation for Levenberg-Marquardt training, IEEE Trans.
On Neural Network, 21(2010), 930-937.
Fuzzy Systems and Data Mining II 179
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-179
Introduction
Frequent itemset mining is one of the important techniques in data mining; it discovers patterns from databases to support commercial decisions. Recently, new applications have been developed on web sites, the Internet, and wireless networks that generate much uncertain data; that is, each datum is attached with a probability indicating its existence [3]. Table 1 shows an example of an uncertain database with 4 items {a, b, c, d}. In such cases, the traditional exact frequent itemset mining algorithms studied in recent years [1] are no longer effective, since this new feature brings new challenges; thus, new methods need to be designed to handle this data environment. The existing uncertain frequent itemset mining methods fall into two categories: one achieves the results based on the expected support [2], and the other discovers probabilistic frequent itemsets according to the definition of probabilistic support [4]. Probabilistic frequent itemsets, in comparison to expected frequent itemsets, better represent the probability of the itemsets; thus, this mining problem has received more attention. Nevertheless, the mining is hard, since converting the uncertain database to an exact database is an NP-hard problem, and computing the probabilistic support of a single itemset takes $O(n \log^2 n)$ time and $O(n)$ space. Clearly, when the database size $n$ is large, the mining cost is huge.
1 Corresponding Author: Hai-Feng Li, School of Information, Central University of Finance and Economics,
In this paper, we focus on this mining problem and present an approximate method to convert the uncertain database to an exact database so that the runtime can be reduced. The rest of the paper is organized as follows. Section 1 presents the preliminaries and the challenge of the problem. Section 2 introduces our method. Section 3 evaluates the performance with our experimental results. Finally, Section 4 concludes the paper.
1.1. Preliminaries
In this paper, we will discover all the probabilistic frequent itemsets from the uncertain database for the given λ and minimum probabilistic confidence.
H.-F. Li and Y. Wang / Mining Probabilistic Frequent Itemsets with Exact Methods 181
To address this problem, much research has been conducted. Zhang et al. first introduced the concept of probabilistic frequent items [4] and employed the dynamic programming (DP) technique to perform the mining, which was improved by Bernecker et al. [5] using the Apriori rule for further pruning. With this method, the time complexity is $O(n^2)$ and the space complexity is $O(n)$. Sun et al. improved the method by regarding the probability computation as the convolution of two vectors and thus used the divide-and-conquer (DC) method [6] to conduct the mining, in which the fast Fourier transform reduces the computing complexity from $O(n^2)$ to $O(n \log^2 n)$. The probabilistic frequent itemset and the expected frequent itemset were proved to be related in [7] based on the standard normal distribution. Tong et al. surveyed all these methods in [8].
As can be seen, even the most efficient method for computing the probabilistic support has a significantly high cost, which reduces the effectiveness of the mining method in real applications. We develop a novel method that, rather than improving the probabilistic support computation itself, designs a framework for mining probabilistic frequent itemsets with traditional exact frequent itemset mining methods. In this framework, we build a relationship between uncertain data and exact data with a supplied parameter, called the minimum confidence.

With the minimum confidence, we can convert the uncertain database to the exact one as follows. We scan the uncertain database; if the probability of an item is smaller than the minimum confidence, we consider it as not existing in the exact database, and otherwise as existing. The reason behind this is an intuitive consideration: an item with a small probability contributes little to an effective probability of high occurrence counts. Once an exact database is generated, we can employ a traditional frequent itemset mining algorithm to discover the results. The pseudocode is shown in Algorithm 1. As an example, when we set the minimum confidence to 0.5, the uncertain database in Table 1 can be converted to the database in Table 3, in which all the items with probability smaller than 0.5 are removed directly.
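The conversion step described above is a simple threshold filter; a minimal sketch, in which the toy database and function name are ours for illustration (not Table 1 of the paper):

```python
def convert_uncertain_db(uncertain_db, min_conf):
    """Convert an uncertain database to an exact one: an item whose
    existence probability is smaller than the minimum confidence is
    treated as not existing; otherwise it is kept."""
    return [{item for item, p in tx.items() if p >= min_conf}
            for tx in uncertain_db]

# Hypothetical uncertain database: each transaction maps item -> probability.
udb = [{'a': 0.9, 'd': 0.6, 'b': 0.3},
       {'a': 0.8, 'c': 0.7},
       {'b': 0.5, 'c': 0.4}]
edb = convert_uncertain_db(udb, 0.5)
```

Any traditional exact frequent itemset miner can then be run over `edb` directly.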
In this paper, we ignore τ for two reasons. On the one hand, Tong et al. [8] evaluated that τ has little impact on the mining results in their experiments; we also conducted experiments with the state-of-the-art algorithm TODIS, whose results are shown in Table 2. As can be seen, when we fix the minimum support, the runtime cost and the memory cost remain almost unchanged no matter how τ changes. On the other hand, we employ a novel framework to convert the uncertain database to an exact one, over which the traditional mining methods can be used; thus τ is not useful and can be ignored accordingly.
Table 3. The Database Converted from the Uncertain Database when the Minimum Confidence is 0.5
ID Transaction
1 ad
2 ac
3 bc
4 abcd
5 d
Analysis: Our method is linear in the database size n; that is, the conversion from uncertain data to exact data needs O(n) time. Furthermore, another advantage is that the conversion can be performed while reading the data into memory, so its cost can almost be ignored. On the other hand, since the final mining is based on the exact database, the mining speed can be much improved. Suppose the count of itemsets that need to be computed is m; then our method has time complexity O(mn), whereas the most effective method to directly discover the probabilistic frequent itemsets needs O(mn log² n). Clearly, when the database size n is large, the mining speed is improved significantly.

Even though the performance can be improved, the mining results are approximate. The minimum confidence is the key parameter that determines how approximate the mining results are; consequently, how to decide the minimum confidence is the main problem. Table 4 shows the precision and recall of our method when we set the minimum support to 0.1, 0.08, and 0.06, and the minimum confidence to 0.9, 0.8, and 0.7. As can be seen, the precision and the recall reach their highest values at a particular minimum confidence. That is to say, if we find this particular minimum confidence, the accuracy will be high.
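The precision and recall reported in Table 4 compare the approximate result set against the exact probabilistic frequent itemsets; a small helper (names and inputs hypothetical):

```python
def precision_recall(mined, truth):
    """Precision and recall of an approximate mining result against
    the exact probabilistic frequent itemsets."""
    mined, truth = set(mined), set(truth)
    tp = len(mined & truth)                      # true positives
    precision = tp / len(mined) if mined else 1.0
    recall = tp / len(truth) if truth else 1.0
    return precision, recall
```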
To address this problem, we employ a sampling method to find this parameter. Before we convert the uncertain database, we take samples from the database as a sub-database, which is converted first so that we can use our method to determine the minimum confidence; then, the mining can be conducted over the entire database.
Table 4. Precision and recall of our method under different minimum supports and minimum confidences

                           Precision               Recall
Data          Minsup    0.9    0.8    0.7       0.9    0.8    0.7
GAZELLE       0.1       100%   100%   100%      100%   75%    75%
GAZELLE       0.08      80%    100%   100%      100%   100%   100%
GAZELLE       0.06      85%    100%   100%      100%   70%    70%
T25I15D320K   0.1       11%    88%    100%      100%   72%    56%
T25I15D320K   0.08      25%    95%    100%      100%   74%    64%
T25I15D320K   0.06      24%    98%    100%      100%   81%    70%
uncertain data data size avg. size min. size max. size item count mean variance item corr.
T25I15D320K 320,002 26 1 67 994 0.87 0.27 38
GAZELLE 59,602 3 2 268 497 0.94 0.08 166
3. Experiments
showed that our method was more efficient over the sparse datasets. Moreover, we presented the memory cost of our method. In Figure 2, the memory cost was not impacted by the minimum support. When the minimum confidence was small, the memory usage was high; it was, however, still smaller than that of the UMiner algorithm.
4. Conclusions
Acknowledgement
References
[1] J. Han, H. Cheng, D. Xin, and X. Yan, Frequent pattern mining: current status and future directions, Data Mining and Knowledge Discovery, 15(2007), 55-86.
[2] C.K. Chui, B. Kao, and E. Hung, Mining Frequent Itemsets from Uncertain Data, Proceedings of PAKDD'2007.
[3] C.C. Aggarwal and P.S. Yu, A survey of uncertain data algorithms and applications, IEEE Transactions on Knowledge and Data Engineering, 21(2009), 609-623.
[4] Q. Zhang, F. Li, and K. Yi, Finding Frequent Items in Probabilistic Data, Proceedings of SIGMOD'2008.
[5] T. Bernecker, H.P. Kriegel, M. Renz, F. Verhein, and A. Zuefle, Probabilistic Frequent Itemset Mining in Uncertain Databases, Proceedings of SIGKDD'2009.
[6] L. Sun, R. Cheng, D.W. Cheung, and J. Cheng, Mining Uncertain Data with Probabilistic Guarantees, Proceedings of KDD'2010.
[7] T. Calders, C. Garboni, and B. Goethals, Approximation of Frequentness Probability of Itemsets in Uncertain Data, Proceedings of ICDM'2010.
[8] Y. Tong, L. Chen, Y. Cheng, and P.S. Yu, Mining Frequent Itemsets over Uncertain Databases, Proceedings of VLDB'2012.
[9] P. Tang and E.A. Peterson, Mining Probabilistic Frequent Closed Itemsets in Uncertain Databases, Proceedings of ACMSE'2011.
186 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-186
Introduction
With more and more satellites being sent into space in recent years, ground-based in-orbit management has to handle challenges such as high control precision, various working modes, and high system complexity. As advanced technologies and new materials are utilized in satellites [1, 2], sudden failure is no longer the primary failure mode for most satellites; it has been replaced by performance degradation. The theory of analyzing satellite performance degradation focuses only on the overall performance of equipment, regardless of failure modes, which differs from the analysis of sudden failures.
In 2001, the University of Wisconsin and the University of Michigan, together with 40 other industry partners, jointly established the Intelligent Maintenance Systems (IMS) research center under the U.S. National Science Foundation. Since then, many methods of performance degradation assessment have been proposed, such as the pattern discrimination model (PDM) based on a cerebellar model articulation controller (CMAC) neural network [3], self-organizing map (SOM) and back-propagation neural network methods [4], and the hidden Markov model (HMM) and hidden semi-Markov model (HSMM) [5]. However, these methods are deficient in some respects. For example, the results of the CMAC assessment method are greatly influenced by parameter settings,
1 Corresponding Author: De-Chang PI, College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, 29 Yudao Street, Nanjing, Jiangsu, 210016, China. E-mail: dc.pi@nuaa.edu.cn.
F. Zhou et al. / Performance Degradation Analysis Method Using Satellite Telemetry Big Data 187
and the assessment results of the SOM, neural network, and HMM methods cannot directly reflect the degradation degree. In order to accommodate the characteristics of assessment for different key components, the analysis theory of performance degradation has developed from a single degradation variable toward a more diverse and practical direction. Although some new theories and methods have emerged, research on the performance degradation of satellites is still limited. M. Tafazoli [6] studied the in-orbit failures of more than 130 different spacecraft and revealed that spacecraft are vulnerable to failures occurring in key components. Ma et al. [7] analyzed the space radiation environment of thermal coatings and proposed degradation models for their optical properties. However, these methods mainly focus on failure data and also require relevant experience.
The conventional analysis methods for satellite performance degradation have shortcomings such as experimental difficulty and high cost. Satellite telemetry big data contains monitoring information, abnormal states, the space environment, and other information that reflects the operational status and payload of satellites. A novel analysis method for satellite performance degradation with telemetry big data is proposed in this paper. This method uses data mining techniques and provides a quantitative description of the satellite performance degradation process.
Most recently presented performance degradation methods are based on physical rules or models [8, 9]; these methods need to understand the internal structure of the satellite, which is difficult for analysts. In contrast, our proposed method uses the data sampled during satellite operation to analyze the satellite performance degradation process without needing to determine the relationships of the equipment accurately. What is more, our proposed method studies the characteristics of historical data, summarizes the regularities of change, and analyzes the performance degradation process automatically. To the best of our knowledge, a similar approach to the performance degradation of satellites has not appeared yet. Furthermore, it can also be extended to failure prediction.
1. Related Concepts
$$d\big[ X_i^m, X_j^m \big] = \max_{k = 0, \ldots, m-1} \big| x(i+k) - x(j+k) \big|, \quad j = 1, \ldots, N-m+1, \ j \neq i \qquad (2)$$

$$\Phi^m(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} C_i^m(r) \qquad (3)$$

4) When the dimension expands to $m+1$, steps 1-3 are repeated to find $\Phi^{m+1}(r)$. The theoretical value of the SamEn is defined as follows:

$$\mathrm{SamEn}(m, r) = \lim_{N \to \infty} \Big\{ -\ln \big[ \Phi^{m+1}(r) \,/\, \Phi^m(r) \big] \Big\} \qquad (4)$$
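A direct (unoptimized) Python reading of Eqs. (2)-(4), with the limit replaced by the finite-N estimate and the tolerance r taken, as is common, as a fraction of the signal's standard deviation; this is a sketch, not the authors' implementation:

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """Sample entropy of series x with embedding dimension m and
    tolerance r (given as a fraction of the standard deviation)."""
    x = np.asarray(x, dtype=float)
    tol = r * np.std(x)
    N = len(x)

    def phi(m):
        # Count template pairs of length m within tolerance under the
        # max-norm distance of Eq. (2), excluding self-matches (Eq. 3).
        templates = np.array([x[i:i + m] for i in range(N - m + 1)])
        count = 0
        for i in range(len(templates)):
            d = np.max(np.abs(templates - templates[i]), axis=1)
            count += np.sum(d <= tol) - 1
        return count

    # Eq. (4): SamEn = -ln(Phi^{m+1}(r) / Phi^m(r))
    return -np.log(phi(m + 1) / phi(m))
```

A regular signal (e.g., a sine wave) yields a much lower sample entropy than white noise of the same length.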
Support Vector Data Description (SVDD) [12] is inspired by the support vector classifier. The method is robust against outliers in the training set and is capable of tightening the description by using negative examples.
A hypersphere is sought that contains all or most samples of the target class $X = \{x_1, x_2, \ldots, x_n\}$. The hypersphere is determined by its core $a$ and radius $R$. If the hypersphere covers all the training samples of the target class, the empirical error is zero, and the structural error is defined as $\varepsilon(a, R) = R^2$.

As the distance from $x_i$ to the core $a$ should not be larger than the radius $R$ for all samples of the target class $X$, the constraint of the minimization problem can be described as $\|x_i - a\|^2 \le R^2$.

To account for the possibility of outliers in the training set, the distance between $x_i$ and the core $a$ should not be strictly smaller than $R$, but larger distances should be penalized. Therefore, slack variables $\xi_i$ are introduced, and the minimization problem is transformed into
$$\min \; \varepsilon(R, a, \xi) = R^2 + C \sum_{i=1}^{N} \xi_i \quad \text{s.t.} \quad \|x_i - a\|^2 \le R^2 + \xi_i, \ \xi_i \ge 0, \ i = 1, 2, \ldots, N \qquad (5)$$
The penalty factor $C$ makes a trade-off between the volume and the errors. The minimization problem in Eq. (5) can be solved via the Lagrangian in Eq. (6):

$$L(R, a, \alpha_i, \xi_i) = R^2 + C \sum_i \xi_i - \sum_i \alpha_i \big\{ R^2 + \xi_i - (x_i \cdot x_i - 2 a \cdot x_i + a \cdot a) \big\} - \sum_i \gamma_i \xi_i, \quad \alpha_i \ge 0, \ \gamma_i \ge 0 \qquad (6)$$

In Eq. (6), $\alpha_i$ and $\gamma_i$ are the Lagrange multipliers. $L$ should be minimized with respect to $R$, $a$, and $\xi_i$, and maximized with respect to $\alpha_i$ and $\gamma_i$. Setting the corresponding partial derivatives to zero yields the constraints in Eq. (7):

$$\sum_i \alpha_i = 1, \qquad a = \sum_i \alpha_i x_i, \qquad C - \alpha_i - \gamma_i = 0 \qquad (7)$$
Substituting Eq. (7) into Eq. (6) gives

$$\max L = \sum_i \alpha_i (x_i \cdot x_i) - \sum_{i,j} \alpha_i \alpha_j (x_i \cdot x_j) \qquad (8)$$

Thus, with a Gaussian kernel, the optimization problem can be further transformed into Eq. (9):

$$\max L = 1 - \sum_{i,j} \alpha_i \alpha_j K_G(x_i, x_j, \sigma) \quad \text{s.t.} \quad 0 \le \alpha_i \le C, \qquad K_G(x, y, \sigma) = \exp\big( -\|x - y\|^2 / \sigma^2 \big) \qquad (9)$$
Eq. (9) shows that the core of the hypersphere is a linear combination of the objects. Only objects $x_i$ with $\alpha_i > 0$ are needed in the description; therefore, these objects are called the support vectors (SVs) of the description. To test an object $z$, the distance to the core of the hypersphere and the radius $R$ are calculated by Eq. (10):

$$d^2 = \|z - a\|^2 = K_G(z, z) - 2 \sum_i \alpha_i K_G(z, x_i) + \sum_{i,j} \alpha_i \alpha_j K_G(x_i, x_j)$$
$$R^2 = \|x_{sv} - a\|^2 = 1 - 2 \sum_i \alpha_i K_G(x_i, x_{sv}) + \sum_{i,j} \alpha_i \alpha_j K_G(x_i, x_j) \qquad (10)$$

The test object $z$ is accepted when this distance is not greater than the radius (i.e., $d \le R$).
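Given trained multipliers $\alpha_i$ and support vectors, the acceptance test of Eq. (10) is a direct computation. The sketch below assumes the $\alpha_i$ have already been obtained by solving Eq. (9); the support vectors and multipliers here are made up for illustration:

```python
import numpy as np

def k_gauss(x, y, sigma):
    # Gaussian kernel of Eq. (9): K_G(x, y, sigma) = exp(-||x - y||^2 / sigma^2)
    return np.exp(-np.sum((np.asarray(x) - np.asarray(y)) ** 2) / sigma ** 2)

def svdd_dist2(z, svs, alphas, sigma):
    """Squared distance from test object z to the hypersphere core, Eq. (10)."""
    cross = sum(a * k_gauss(z, x, sigma) for a, x in zip(alphas, svs))
    quad = sum(ai * aj * k_gauss(xi, xj, sigma)
               for ai, xi in zip(alphas, svs)
               for aj, xj in zip(alphas, svs))
    return k_gauss(z, z, sigma) - 2 * cross + quad

# Made-up support vectors and multipliers (sum(alphas) = 1, as Eq. (7) requires).
svs = [np.array([0.0]), np.array([1.0])]
alphas = [0.5, 0.5]
r2 = svdd_dist2(svs[0], svs, alphas, sigma=1.0)  # evaluating at a boundary SV gives R^2
# a test object z is accepted when svdd_dist2(z, svs, alphas, sigma) <= r2
```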
The SamEn of a time period is taken as its performance feature, and the vector composed of the performance features of the parameters within the same time period is called the performance eigenvector.

In this study, the parameters are not limited to those of the objective equipment; they also include a number of closely related equipment parameters. As parameter selection requires specialized knowledge, it is conducted based on domain and expert knowledge.
Definition 2 (Health Model)
With the SVDD method, the model obtained by training on the performance eigenvectors of the satellite in the healthy status is called the health model (model).
According to the theory of SVDD, the model described in Definition 2 is composed of the support vectors of the healthy state vectors (model.SV), the corresponding coefficients (model.α), the number of support vectors (model.len), and the hypersphere given by its core (model.a) and radius (model.R).
Definition 3 (Performance Degradation Degree)
Let dec denote the distance between the performance eigenvector of the satellite and the core of the hypersphere. The performance degradation degree deg, which reflects the "health condition" [13], is defined as the difference between dec and the radius of the hypersphere model.R, that is, deg = dec - model.R (see Figure 1).
This means that a performance degradation process of the objective equipment may occur when the value of deg is larger than 0; when the value increases monotonically, the speed of the performance degradation process of the objective equipment increases accordingly. As the degree cannot be negative, we set deg = 0 when dec - model.R < 0.
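Definition 3 reduces to a clipped difference; a minimal sketch (function name ours):

```python
def degradation_degree(dec, model_R):
    """Definition 3: deg = dec - model.R, set to 0 when negative,
    since the degradation degree cannot be negative."""
    return max(dec - model_R, 0.0)
```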
Figure 2 shows the overall framework of the analysis of satellite performance degradation presented in this study, which has four main steps.
Step 1. Select parameters of the satellite according to expert knowledge. Then, a median filter is used to reduce the noise in the satellite telemetry big data so as to generate a new clean dataset.
Step 2. Extract the performance features from the parameters selected in Step 1 according to Definition 1, and compose the final set of performance eigenvectors.
Step 3. Select the performance eigenvectors in the healthy status as the training set, and build a health model with the SVDD method.
Step 4. To measure the degradation status of a new performance eigenvector, calculate the performance degradation degree according to Definition 3 and the model obtained in Step 3.
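Steps 1-2 above can be sketched with a simple sliding-window median filter and per-group feature extraction. The window size, group length, and the variance feature below are illustrative assumptions; in the paper, the sample entropy of Definition 1 plays the role of the per-group feature:

```python
import numpy as np

def median_filter(x, k=5):
    """Step 1: sliding-window median filter with edge padding."""
    pad = k // 2
    xp = np.pad(np.asarray(x, dtype=float), pad, mode='edge')
    return np.array([np.median(xp[i:i + k]) for i in range(len(x))])

def group_features(raw, group_len, feature=np.var):
    """Step 2: denoise one telemetry parameter, split it into fixed-length
    groups, and compute one feature per group (sample entropy in the
    paper; np.var here as a stand-in)."""
    clean = median_filter(raw)
    n = len(clean) // group_len
    groups = clean[:n * group_len].reshape(n, group_len)
    return np.array([feature(g) for g in groups])
```

Stacking the feature sequences of the selected parameters, group by group, then yields the performance eigenvectors fed to SVDD in Step 3.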
Figure 2. Overall framework of the analysis: satellite telemetry data are processed with parameters selected by expert knowledge; sample entropy extraction yields eigenvectors in healthy states and eigenvectors for analysis; Support Vector Data Description builds the health model; and the performance degradation degree is computed.
The telemetry big data of one satellite is used as the experimental data; it was recorded from 2011-05-01 00:00:00.0 to 2011-12-29 18:16:59.987 and comprises 14 million data frames that contain several failures and performance degradation information. In our experiments, seven important parameters in this dataset are selected by expert knowledge.
The telemetry big data is stored in Oracle 11g, and the algorithms are coded in Java. The operating system is Windows Server 2008 R2 Standard, running on an Intel(R) Xeon(R) eight-core E5606 processor with 8 GB of RAM.
each group are extracted by Definition 1. Finally, seven performance feature sequences, each with a length of 800, are obtained. The performance eigenvector is composed of the features of the seven parameters in the groups with the same number.
(1) The performance eigenvectors under healthy status are selected as the training data, the SVDD method is applied with σ = 1 in this experiment, and the health model of the satellite is established.
(2) The remaining dataset is used as the test data to verify the obtained health model, and the degradation degree is calculated according to Definition 3. Figure 3 shows the final results.
The degradation degrees are unsteady, and the curve is not smooth but fluctuates. This is mainly due to the recognition accuracy of SVDD and cyclical factors in the original data, which do not affect the overall reflection of the degradation process of the satellite. In order to reduce the interference of these factors, a wavelet denoising algorithm is employed, and the denoised sequence is obtained as Figure 3 shows. Overall, the average degradation degree presents an increasing trend. Given the long period, accidental factors cannot influence the degradation degree all the time. Therefore, we conclude that the satellite has entered the performance degradation state based on Definition 3.
Figure 3. Degradation degree versus group number.
4. Conclusions
A method for analyzing satellite performance degradation with telemetry big data is proposed in this paper, while existing studies on this problem are limited. The experimental analysis shows that the proposed method can extract effective state information from the parameters and provide a quantitative description of satellite performance degradation. Moreover, the analysis of satellite performance degradation with telemetry big data is of significant value for the in-orbit research and management of satellites.
Our definitions may have some limitations; for example, the degradation degree in the experiment is unstable and fluctuates. The sample entropy algorithm may also take much time to trim redundant parameters in massive data, which will be improved in our future work.
Acknowledgment
This paper is supported by the National Natural Science Foundation of China (Grant
No. U1433116).
References
[1] Z.Z. Zhong, D.C. Pi D. Forecasting Satellite Attitude Volatility Using Support Vector Regression with
Particle Swarm Optimization. Iaeng International Journal of Computer Science, 41(2014), 153-162.
[2] F. Zhou, D.C. Pi. Prediction Algorithm for Seasonal Satellite Parameters Based on Time Series Decom-
position. Computer Science, 43(2016), 9-12 (in Chinese).
[3] J. Lee. Measurement of machine performance degradation using a neural network model. Computers in
Industry, 30(1996), 193-209.
[4] R. Huang, L. Xi, et al. Residual life predictions for ball bearings based on self-organizing map and back
propagation neural network methods. Mechanical Systems and Signal Processing, 21(2007), 193-207.
[5] X.S. Si, W. Wang, C.H. Hu, et al. Remaining useful life estimation–A review on the statistical data driv-
en approaches. European Journal of Operational Research, 213(2011), 1-14.
[6] M. Tafazoli. A study of on-orbit spacecraft failures. Acta Astronautica, 64(2009), 195-205.
[7] W. Ma, Y. Xuan, Y. Han, et al. Degradation Performance of Long-life Satellite Thermal Coating and Its
Influence on Thermal Character. Journal of Astronautics, 2(2010), 43-45.
[8] G. Jin, D.E. Matthews, Z. Zhou. A Bayesian framework for on-line degradation assessment and residual
life prediction of secondary batteries in spacecraft. Reliability Engineering & System Safety, 113(2013),
7-20.
[9] X. Hu, J. Jiang, D. Cao, et al. Battery Health Prognosis for Electric Vehicles Using Sample Entropy and
Sparse Bayesian Predictive Modeling. IEEE Transactions on Industrial Electronics, 63(2015), 2645-
2656.
[10] S.M. Pincus. Assessing serial irregularity and its implications for health. Annals of the New York Acad-
emy of Sciences, 954(2001), 245-267.
[11] D. Weinshall, A. Zweig, et al. Beyond novelty detection: Incongruent events, when general and specific
classifiers disagree. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(2012), 1886-
1901.
[12] G. Yan, F. Sun, H. Li, et al. CoreRank: Redeeming “Sick Silicon” by Dynamically Quantifying Core-
Level Healthy Condition. IEEE Transactions on Computers, 65(2016), 716-729.
194 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-194
Abstract. This study first proposes the Meta-Investment Strategy, derived from the
concepts of Meta-Search on the web and Meta-Cognition in psychology. We liken
all A shares in China to the enormous information on the web, stock selection to
the process of searching for information, and equity funds to search engines.
Based on sector rotation theory and a decision tree model, and through the
construction of an indicator system and a statistical model, stock selection rules
can be extracted from fund information. After classifying the period from 2016.02
to 2016.04 as recovery, we selected the finance industry. A decision tree model is
constructed by importing 12 stock indicators of all the component stocks in the
finance industry as input variables and whether a stock is heavily held by equity
funds as the target variable. Finally, by entering the data of the last quarter of
2015, the predictive classification results are obtained. The results show that the
Meta-Investment Strategy outperformed the CSI300 and the CSI300 Finance
Sector index (000914) and obtained significant excess returns from 2016.02.01 to
2016.04.30.
Introduction
In each surge of the stock market in China, there are always hot industries that lead the
upward trend periodically. If investors can seize these fleeting investment opportunities
in hot industries, their portfolios can acquire excess returns. Sector rotation has
therefore become one of the most important topics in stock market investment research.
Sector rotation refers to the phenomenon that, in each phase of the business cycle
and the stock market cycle, different industries take turns outperforming the market.
Research on sector rotation abroad is more mature than domestic research. It originated
from the famous "Investment Clock" [1], which classified the business cycle into four
phases and summarized the performance of different industries in each. Sassetti
and Tani [2] outperformed market returns by using 3 market-timing techniques on 41
1 Corresponding Author: Zhen-Hua Zhang, School of Economics and Trade, Guangdong University of Foreign Studies, Guangzhou 510006, China; E-mail: zhangzhenhua@gdufs.edu.cn.
L.-M. He et al. / A Decision Tree Model for Meta-Investment Strategy of Stock 195
funds of the Fidelity Select Sector family over the period 1998 to 2003. Domestic
research on sector rotation focuses on the phenomenon itself and its underlying causes,
including the business cycle, the monetary cycle, the industrial cycle and behavioral finance.
However, only a few studies probe into sector rotation as an investment
strategy. Peng and Zhang [3] empirically analyzed the sector rotation effect in Chinese
stock market and proved the feasibility of a sector investment strategy. By adopting an
association rules algorithm, many strong association rules of the stock market were
mined from a massive amount of data [4]; in that research, the manufacturing and
petrochemical industry stock indexes (the cores of the association rules) are closely
related to the other sector indexes (except for finance, real estate, food & beverage and media).
In addition, because the Chinese capital market is immature, irrational investments
contribute to the instability of the stock market, which leads to a divergence between
market performance and economic fundamentals. In this situation, stock funds, as
representatives of professional investors, can often forecast the direction of the
financial market. That is why we put forward the concept of "Meta-Investment".
We first present the concept of Meta-Investment based on Meta-Search and Meta-
Cognition. Then, by fusing Meta-Investment with the sector rotation strategy, we
apply the concept to stock investment according to the investment results of funds
and institutions. To obtain comprehensible rules, we adopt a decision tree model to
construct the final investment strategy. Simulation results show the advantages of
the presented method.
Yang [5] proposed that sector rotation is in essence an economic phenomenon: the
factors influencing the business cycle, including investment, monetary shocks, external
shocks and the consumption of durable goods, also induce sector rotation in the capital
market. In his study, by introducing the phases of the business cycle as dummy
variables into the classical CAPM model, the sector rotation strategy gained 0.2%
excess Jensen's alpha returns. Dai and Lin [6] put forward the innate logic connecting
sector rotation and the business cycle: the business cycle is determined by external
shocks, while the industrial structure decides the internal form of the business cycle.
The process is shown in Figure 1.
Figure 1. Business Cycle → Financial Situation of Different Industries → Sector Rotation.
Monetary policy is an important contributory factor in the stock market. In the long run,
the performance of the stock market rests on the real economy; in the short run,
however, changes in liquidity resulting from shifts in monetary policy can influence the
stock market. The interpretation of sector rotation based on monetary policy is that
different industries have different sensitivities to liquidity. Conover, Jensen, Johnson
and Mercer [7] used the Fed discount rate as an indicator of monetary policy to build a
sector rotation strategy based on the monetary environment. After classifying the
monetary phases, the sensitivity of different industries to liquidity was tested;
cyclical industries sensitive to liquidity were then invested in during monetary easing,
while noncyclical industries were invested in during monetary tightening. This strategy
gained excess returns.
1.4. Conclusion
2. Meta-Investment Strategy
A meta-search engine submits a user's query to several underlying search engines at
once and merges their results. Therefore, sufficient data is gathered, formatted by rank
and presented to the users.
It is well known that Meta-Cognition means "cognition about cognition" or, more
informally, "thinking about thinking", a term defined by the American developmental
psychologist Flavell [12] as knowledge about cognition and control of cognition. It
derives from the root word "meta", meaning beyond. Meta-Cognition can take many
forms, including knowledge about when and how to use particular strategies for
learning or problem solving. It generally has two components: knowledge about
cognition and regulation of cognition.
Meta-Memory, defined as knowing about memory and mnemonic strategies, is an
especially important form of Meta-Cognition. Differences in Meta-Cognitive
processing across cultures have not been widely studied, but could provide better
outcomes in cross-cultural learning between teachers and students.
Some evolutionary psychologists hypothesize that Meta-Cognition is used as a
survival tool, which would make Meta-Cognition the same across cultures. Writings on
Meta-Cognition can be traced back at least as far as On the Soul and the Parva
Naturalia of the Greek philosopher Aristotle.
As representatives of professional investors, stock funds can explore the intrinsic
values of investment targets ahead of the market. Therefore, applying the investment
results of stock funds, investing by "standing on the shoulders of giants", is a
brand-new idea. The Meta-Investment Strategy in this study is based on funds: it likens
all A shares to the enormous information on the web, stock selection to the search
process, and equity funds to search engines. Through the construction of an indicator
system and the building of a statistical model, the stock selection rules of stock funds
can be extracted for portfolio construction.
An increasing number of data mining and machine learning methods have been applied
to the financial field. Many stock selection models exist, such as Neural Networks,
Random Forests, Support Vector Machines (SVM), Genetic Algorithms (GA), Rough
Set Theory and Concept Lattices.
The aim of this research is to probe into the Meta-Investment Strategy (based on sector
rotation theory) by searching for a proper data mining or machine learning algorithm.
To realize this goal, the comprehensibility of investment strategies must first be
considered; therefore, an algorithm that can extract understandable rules is the main
approach of this study.
However, Neural Networks and Random Forests are more suitable for large samples.
Besides, Neural Networks cannot be used for rule extraction. SVM is applicable to
relatively small samples, but it is also difficult to extract rules from. In summary,
Neural Networks, SVM and Genetic Algorithms (GA) are suitable for prediction rather
than rule extraction. Thus, Decision Tree, Rough Set and Concept Lattice methods are
more suitable than the other prediction methods for the research purpose.
Some researchers have applied decision trees [13, 14, 15] and random forests [16] in
the field of investment decisions. For example, Hu and Luo [14] applied the decision
tree model to sector selection, Sorensen et al. [15] utilized the decision tree approach
for stock selection, and Liu et al. [16] proposed a random forest model for forecasting
bulk-holding stocks.
However, no available research has directly applied the investment results of stock
funds to investment practice. In addition, although there has been related research on
bulk-holding stocks [16], it was not directly combined with investment practice.
Most importantly, because the Meta-Investment Strategy is first proposed in this
study, there is no specialized algorithm for it at present. After comparison, the C5.0
decision tree, from which understandable rules are easy to extract, is preferred.
Secondly, the conditional attributes in the data set are continuous. Applying Rough
Sets or Concept Lattices for rule extraction requires a discretization step and hence a
proper discretization model. The C5.0 decision tree, which needs no separate
discretization process, is comparatively easier to implement.
Moreover, traditional methods for extracting comprehensible rules, which work
directly on massive original data, are difficult to apply in this research for several
reasons: (1) massive data, a large number of indicators and scattered information make
it difficult to extract rules; (2) the implicit rules of investment vary across periods
because of changing financial conditions and policies, so prediction accuracy is limited
and rules are likely to contradict each other; (3) operational speed is relatively slow
when coping with massive data.
We aim to use the Meta-Investment Strategy and a rule extraction algorithm to solve
the aforesaid problems; relevant research in this field is limited. Because the strategy is
built on existing investment results, accuracy is improved: the data size and the amount
of conflicting information are relatively small, which makes the extracted rules more
reasonable. Therefore, C5.0 is chosen for rule extraction.
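The rule-extraction workflow can be illustrated in code. The paper uses the proprietary C5.0 algorithm; the sketch below substitutes scikit-learn's CART-based DecisionTreeClassifier and purely synthetic data, so all names and numbers here are illustrative assumptions, not the paper's results:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for the paper's data set: 149 samples,
# 13 indicators, binary "heavily held by stock funds" target.
rng = np.random.default_rng(42)
X = rng.standard_normal((149, 13))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # toy decision boundary

# 70% / 30% split, as described in the paper
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)

# Human-readable if/then rules, analogous in spirit to C5.0's rule output
feature_names = [f"ind_{i}" for i in range(13)]
rules = export_text(tree, feature_names=feature_names)
print(rules)
print("test accuracy:", tree.score(X_te, y_te))
```

The printed tree has the same if/then structure as the rule listing shown later in the paper, which is the comprehensibility property that motivates the choice of a decision tree.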
(Research flow: selection of industry → data → back-testing.)
1), by which the business cycle is divided into four stages: recovery, overheat,
stagflation and recession. Classification results of the business cycle are shown in Table 2.
Table 1. Classification of the Four Stages in a Business Cycle
(indicators by phase: Recession | Recovery | Overheat | Stagflation)
Since 2009, mainstream securities firms have studied the investment clock in China.
Typical investigations include Guotai Junan Securities (2009), Shenyin Wanguo
Securities (2009), Orient Securities (2009) and Guoxin Securities (2010). Their
methodologies, including the classification of the domestic business cycle and the
statistical processing of different industries, are similar despite different industry
classification benchmarks and time ranges.
The Chinese capital market is immature, and the Chinese financial market changes
greatly with different policies in different periods. For the period from 2016.02.01 to
2016.04.30, data is relatively comprehensive and timely, so the extracted rules are more
likely to comply with the implicit rules of the Chinese capital market. In addition, there
are 51 stocks in the finance industry at present; if we chose data before 2015, the data
size would be greatly reduced. For example, Guoxin Securities (002736) went public in
December 2014, while Orient Securities (600958), Guotai Junan Securities (601211),
Dongxing Securities (601198) and Shenwan Hongyuan Group (000166) went public in
2015. In conclusion, the chosen timeframe is considered from three aspects: sector
rotation theory, data size and timeliness.
It is important to note that this study judges this period (2016.02.01-2016.04.30) to be
recovery. Subsequently, the training samples are confined to the first three quarters of
2015. Financial and technical indicators are imported as input variables. Because of the
time lag of financial indicators, whether a stock in the financial industry is heavily held
in the next quarter is set as the target variable. Classification rules are produced
through C5.0 Decision Tree. Then the data of the last quarter in 2015 is imported for
classification and prediction and a portfolio is constructed with each chosen stock
weighted equally. Finally, performance of this portfolio is back tested from 2016.02.01
to 2016.04.30.
Table 3 summarizes the findings of the four studies mentioned above.
Since the research period in this study (2016.02.01 to 2016.04.30) is classified as
recovery, the finance industry is chosen according to the conclusion above.
All the input variables and target variable are shown in Table 4.
We manually chose input variables from four dimensions, profitability, operating
capacity, technical factors and indicators per share, according to financial statement
theory and previous research [18-21].
In this study, the sample size for model construction is 149. Samples are divided into a
training set (70%) and a test set (30%). According to the Industry Classification of the
China Securities Regulatory Commission (CSRC), the number of stocks in China's
financial industry is about 50. Because the data used for model construction is confined
to the first three quarters of 2015, 149 samples in total are available after excluding
invalid samples.
This study does not adopt a traditional stock selection model. Instead, we combine the
sector rotation strategy and the Meta-Investment Strategy. Therefore, after classifying
the business cycle and choosing the finance industry and the research period (the first
three quarters of 2015 for model construction), the size of the data to be processed and
the noise in it are greatly reduced.
Table 4. Description of Variables
The research period of this study is confined to the first three quarters of 2015. Because
twelve of the input variables come from lagging financial statements, this study sets the
rules shown in Table 5.
Table 5. Usage of different Types of Report
By importing 13 stock indicators of all the component stocks in the finance industry in
the first three quarters of 2015 as input variables, and whether a stock is heavily held by
stock funds in the next quarter as the target variable, a set of inference rules is
generated as follows. Detailed rules are shown in the appendix.
Rule 1 - estimated accuracy 89.22% [boost 96.1%]
NCFIATTM <= -4.903720 [ Mode: 1 ] => 1.0
NCFIATTM > -4.903720 [ Mode: 0 ]
Momentum <= -0.0665 [ Mode: 0 ] => 0.0
Momentum > -0.0665 [ Mode: 1 ]
Momentum <= 0.3515 [ Mode: 1 ]
PCFTTM <= 3.331410 [ Mode: 0 ]
PCFTTM <= -178.095000 [ Mode: 1 ] => 1.0
PCFTTM > -178.095000 [ Mode: 0 ] => 0.0
PCFTTM > 3.331410 [ Mode: 1 ] => 1.0
Momentum > 0.3515 [ Mode: 0 ] => 0.0…
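For clarity, the displayed portion of the rule listing can be transcribed as an ordinary predicate. Note that the listing is truncated ("…"), so this covers only the excerpt shown above, not the full boosted model:

```python
def select_stock(ncfia_ttm, momentum, pcf_ttm):
    """Transcription of the rule excerpt above: returns 1 if the stock
    is classified as heavily held (chosen), else 0."""
    if ncfia_ttm <= -4.903720:
        return 1                    # Rule 1: strongly negative NCFIA TTM
    if momentum <= -0.0665:
        return 0
    if momentum <= 0.3515:
        if pcf_ttm <= 3.331410:
            return 1 if pcf_ttm <= -178.095000 else 0
        return 1
    return 0
```

Reading the rules this way makes the two dominant splits, NCFIATTM and three-month momentum, immediately visible.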
Hence, we extract some rules and explain them.
For example, the first rule: NCFIATTM <= -4.903720 [ Mode: 1 ] => 1.0.
It means that a stock whose Net Cash Flow from Investment Activities (Trailing
Twelve Months) is at most -4.903720 is chosen (1.0). In commercial bank management,
banks have a fixed demand for asset allocation; in China, the main investment activity
of commercial banks is purchasing treasury bonds. Because of the expansion of a
bank's assets, the more negative the Net Cash Flow from Investment Activities
(NCFIA) is, the faster the expansion. For example, suppose a bank has assets of RMB
100 Yuan in 2015 and RMB 120 Yuan in 2016, 30% of which is allocated to one-year
treasury bonds each year, with an annual rate of return of 3%. In its 2016 financial
statement, the Net Cash Flow from Investment Activities is then -5.1
(-120 × 0.3 + 100 × 0.3 × 1.03). A minus sign means capital outflow, while a positive
sign means capital inflow.
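Under the stated assumptions (30% allocation each year, 3% annual return on the maturing bonds), the cash flow arithmetic in this example can be checked directly:

```python
# Worked example: assets of 100 in 2015 and 120 in 2016, with 30%
# allocated to one-year treasury bonds each year at 3% p.a.
purchase_2016 = -120 * 0.30             # new bonds bought in 2016 (outflow)
maturity_2016 = 100 * 0.30 * 1.03       # 2015 bonds maturing in 2016 (inflow)
ncfia_2016 = purchase_2016 + maturity_2016   # net investment cash flow, ≈ -5.1
```

The negative sign confirms a net capital outflow, consistent with an expanding balance sheet.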
The second rule: Momentum > -0.0665 [ Mode: 1 ].
It means that a stock whose three-month momentum is greater than -0.0665 is chosen
(1.0). In short-term investment there is a "momentum effect": the rate of return of a
stock tends to follow its original trend.
From the above explanations of the two most important rules, we can see that the
extracted rules are reasonable. The other rules, shown in the appendix, can be
explained similarly.
4. Results Analysis
By importing the 12 financial indicators of the last quarter of 2015 (up to 2015.12.31)
and the prior three-month momentum before 2016.02.01 of all the component stocks in
the finance industry, classification results are produced. From the stocks with
classification value "1", those with a confidence level higher than 90% are chosen.
Subsequently, a portfolio is constructed with each chosen stock weighted equally.
Finally, the performance of this portfolio is back-tested from 2016.02.01 to 2016.04.30.
The classification results and the performance of the portfolio are shown in Table 6.
Table 6. Classification Results
Figure 4. Performance
Period: 2016.02.01-2016.04.29
Cumulative Return of CSI300 (399300): 9.12%
Cumulative Return of CSI300 Finance Sector (000914): 9.45%
Cumulative Return of Portfolio Based on Meta-Investment Strategy: 12.53%
Winning Rate (Ratio of Days Outperforming CSI300 to Total Days): 68.97%
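The two performance measures reported here, cumulative return and winning rate, can be computed from daily return series as follows (a generic sketch; the function names are ours, not the paper's):

```python
import numpy as np

def cumulative_return(daily_returns):
    """Total compounded return over the back-test window."""
    return float(np.prod(1.0 + np.asarray(daily_returns)) - 1.0)

def winning_rate(portfolio_daily, benchmark_daily):
    """Share of trading days on which the portfolio outperforms
    the benchmark (the ratio reported above)."""
    p = np.asarray(portfolio_daily)
    b = np.asarray(benchmark_daily)
    return float(np.mean(p > b))
```

With the portfolio's and the CSI300's daily returns over 2016.02.01-2016.04.29 as inputs, these two functions would reproduce the table's 12.53% and 68.97% figures.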
In this study, the 149 samples are divided into a training set (70%) and a test set (30%).
The number of boosting trials is five; boosting is used to make fuller use of the limited
sample and to enhance accuracy.
After selecting the training and test sets and running five iterations, the overall
accuracy reaches 96.1%. It is necessary to note that the samples differ somewhat across
iterations: the first model is built on equal-probability sampling of the training set, the
second model is mainly based on the samples the first model classified incorrectly, the
third model focuses on the samples the second model classified incorrectly, and so
forth. Therefore, the estimated accuracy differs among rules.
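The iteration scheme described in this paragraph can be sketched as a reweighted-resampling loop. This is an illustrative approximation of boosting in the spirit of the description, not C5.0's exact algorithm; the doubling of misclassified-sample weights and the use of scikit-learn trees are our assumptions:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost(X, y, n_trials=5):
    """Five boosting trials: round 1 samples the training set with
    equal probability; each later round resamples with weight
    concentrated on the examples the previous model got wrong."""
    rng = np.random.default_rng(0)
    n = len(y)
    w = np.full(n, 1.0 / n)                  # equal-probability sampling
    models = []
    for _ in range(n_trials):
        idx = rng.choice(n, size=n, p=w)     # weighted resample
        model = DecisionTreeClassifier(max_depth=2).fit(X[idx], y[idx])
        wrong = model.predict(X) != y
        w = np.where(wrong, w * 2.0, w)      # focus on misclassified samples
        w /= w.sum()
        models.append(model)
    return models

def vote(models, X):
    """Majority vote over the boosted models."""
    preds = np.mean([m.predict(X) for m in models], axis=0)
    return (preds >= 0.5).astype(int)
```

Because each round's training sample differs, per-rule accuracy estimates naturally differ across rounds, as noted above.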
It is also necessary to explain that the purpose of setting "whether it is heavily held by
equity funds" as the target variable is not to forecast the bulk-holding stocks of stock
funds. Instead, our purpose is to apply the investment results of stock funds, extract
principles and rules, and invest by "standing on the shoulders of giants". Therefore, this
study compares the cumulative return of the portfolio with the cumulative returns of
the CSI300 Finance Sector and the CSI300 to test the effects of the extracted rules and
the stock selection model.
5. Conclusions
Acknowledgments
This paper is supported by the National Natural Science Foundation of China (No.
71271061), the National Students Innovation Training Program of China (No.
201511846058), Student Science and Technology Innovation Cultivating Projects &
Climbing Plan Special Key Funds in Guangdong Province (No. pajh2016b0174),
Philosophy and Social Science Project (No. GD12XGL14) & the Natural Science
Foundations (No. 2014A030313575, 2016A030313688) & the Soft Science Project
(No. 2015A070704051) of Guangdong Province, Science and Technology Innovation
Project of Education Department of Guangdong Province (No. 2013KJCX0072),
Philosophy and Social Science Project of Guangzhou (No. 14G41), Special Innovative
Project (No. 15T21) & Major Education Foundation (No. GYJYZDA14002) & Higher
Education Research Project (No. 2016GDJYYJZD004) & Key Team (No. TD1605) of
Guangdong University of Foreign Studies.
References
[4] Y. Ye, The cointegration analysis to stock market plate indexes based on association rules, Statistical
Education 9 (2008), 56-58.
[5] W. Yang, Research of sector rotation across the business cycle in the Chinese A share market, Wuhan:
Huazhong University of Science & Technology, 2011.
[6] X. Lin, J. Dai, Quantitative and structural analysis of Guoxin investment clock (Report), Shenzhen China
(2012).
[7] C. M. Conover, G. R. Jensen, R. R. Johnson, et al., Is fed policy still relevant for investors? Financial
Analysts Journal 61 (2005), 70-79.
[8] H. Chen, Industry allocation in active portfolio management, Wuhan: Huazhong University of Science &
Technology (2011).
[9] M. Su, Y. Lu, Investigation on sector rotation phenomenon in Chinese A share market—from a
perspective of business cycle and monetary cycle, Study and Practice 27 (2011), 36-40.
[10] C. He, Analysis of sector rotation phenomenon in Chinese A share market, Economic research 47
(2001), 82-87.
[11] E. W. Glover, S. Lawrence, W. P. Birmingham, et al., Architecture of a metasearch engine that
supports user information needs, Conference on Information and Knowledge Management, 1999.
[12] J. H. Flavell, Metacognition and cognitive monitoring: A new area of cognitive-developmental inquiry,
American Psychologist 34 (1979), 906 – 911.
[13] W. Xue, H. Chen., SPSS modeler--the technology and methods of data mining, Beijing: Publishing
House of Electronics Industry (2014).
[14] H. Hu, J. Luo, Profitability and momentum are the key factors of selection of industries—the
exploration of the decision tree applied in sector selection (Report), Shenzhen China (2011).
[15] E. H. Sorensen, K. L. Miller, C. K. Ooi, The decision tree approach to stock selection, Journal of
Portfolio Management 27 (2000), 42-52.
[16] W. Liu, L. Luo, H. Wang, A forecast of bulk-holding stock based on random forest, Journal of Fuzhou
University (Natural Science Edition), 36 (2008), 134-139.
[17] L. Zhang, C. Wang, The investigation of Chinese business cycle and sector allocation on the
macroeconomic perspective (Report), Shenzhen China (2009).
[18] L. Zhang, Stock Selection Base on Multiple-Factor Quantitative Models, Shijiazhuang: Hebei
University of Economics and Business (2014).
[19] J. Zhao, Sector Rotation Multi-factor Stock Selection Model and Empirical Research on its
Performance, Dalian: Dongbei University of Finance and Economics (2015).
[20] P. Wang, J. Yu, Analysis of Financial Statements, Beijing: Tsinghua University Press (2004).
[21] H. Peng, X.Y. Liu, Sector Rotation Phenomenon Based on Association Rules, Journal of Beijing
University of Posts and Telecommunications, 18 (2016), 66-71.
208 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-208
Abstract. This paper deals with the security problems facing next-generation
computing environments. As the method, a Virtualized Security Defense System
(VSDS) is proposed on top of the web application ‘Trello’ for online patient
networks. It deals with the following problems: (1) blurred security boundaries
between attackers and protectors; (2) group key management; (3) secret
collaborative work and sensitive information sharing among group members;
(4) privacy preservation; (5) rendering of 3D images (member indicator, high
level of security). Consequently, although the current IT paradigm is becoming
more ‘complicated’, ‘overlapped’ and ‘virtualized’, VSDS makes it possible to
share information securely through collaborative work.
Keywords. Blurred Security Boundaries, Virtualized Security Defense System,
PatientsLikeMe, Trello, Group key, Reversed hash key chain, VR/AR, Member
indicator, Pseudonym
Introduction
0.1. Computing Environments for Next Generation and Problem Identification
ered a clear objective in traditional IT environments, which were divided into two
groups, attackers and protectors, with security specialists taking responsibility for
preventing attacks and threats from outsiders using their knowledge of the security
architecture. At present, by contrast, the changing IT paradigm has blurred the
information boundaries between attackers and protectors.
The characteristics of the next-generation computing era (IT paradigm) can be summa-
rized as follows: (1) the increase of collaborative work through network connections;
(2) the increase of information sharing in an information-oriented society; (3) blurred
security boundaries, caused by virtualized IT resources and migration policies; (4) the
increase of 3D data such as VR/AR.
Therefore, in this paper, to solve the problem of ‘blurred security boundaries’, the
Virtualized Security Defense System (VSDS) is proposed on top of the web application
‘Trello’ to construct online patient networks very similar to ‘PatientsLikeMe’.
to the given (developed) equation algorithm. Then, a total of five sub-group keys are
made as each member's group session key.
Hence, every member has a different group key in each session. Nevertheless, every
result of authentication (including encryption/decryption) equals the result obtained
with the fundamental group key under the computation of the developed protocol
(equation algorithm).
Most importantly, because all group keys yield the same result value as the
fundamental group key, no re-keying process is needed whenever membership changes.
3. VR/AR Technique: a new concept, a 3D Video Image Mobile Security Technology
Solution, is proposed. As a member indicator, the three-dimensional realistic model
decided at registration time must be rendered during the log-in process for the user to
be authenticated as legitimate [4].
4. VSDS preserves privacy. (1) Anonymity and pseudonymity: a pseudonym is used in
every session; although perfect anonymity cannot be provided, pseudonymity can.
(2) Unlinkability: in every session users log in with different pseudonyms (Pd) and use
different encryption keys (each member's group key), so VSDS achieves a level of
security similar to ‘One-Time Encryption’. (3) Unobservability: all information is
encrypted, and the pseudonym is changed every session by a reversed hash chain [5].
5. Access Control by Cryptographic Techniques and VR/AR Technique
6. VSDS is scalable to other group project systems on the web. The application
scenario here is a patient network, but VSDS is extendable to other secure group
projects.
As for multi-user setting researches based on the reversed hash key chain, there are [3]
and [8]. [3] proposed two practical approaches, efficiency and group search in a cloud
datacenter, where the authors defined the group search secrecy requirements including
backward accessibility. [8] suggested a protocol for designated message encryption for
a designated decryptor, so that a server sees only the corresponding message in a cloud
service system, based on onion modification and a reversed hash key chain.
As for multi-user setting researches not based on the reversed hash key chain, there
have been the following works. [6] proposed common secure indices that let multiple
users securely obtain a group's encrypted documents without re-encrypting them,
based on keyword fields, dynamic accumulators, Paillier's cryptosystem and blind
signatures; the authors formally defined the common secure index for conjunctive
keyword-based retrieval over encrypted data (CSI-CKR) and its security requirements.
The next year, they proposed another scheme of keyword-field-free conjunctive
keyword searches on encrypted data in the dynamic group setting [9], whereby they
solved the open problem posed by Golle et al. In [10], Kawai et al. showed a flaw in
Yamashita and Tanaka's scheme SHSMG and suggested a new concept of Secret
Handshake scheme, monotone condition Secret Handshake with Multiple Groups
(mc-SHSMG), for members to authenticate each other under a monotone condition.
[11] suggested a new effective fuzzy keyword search over encrypted cloud data in a
multi-user system; it supports differential searching privileges based on attribute-based
encryption and edit distance, achieving optimized storage and representation overheads.
In this paper, VSDS generates a group session key for each user, composed of five
sub-keys, by using a reversed hash key chain and random numbers. Under the
developed encryption/decryption algorithm, the group key requires no re-keying
process whenever membership changes happen.
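The paper's five-sub-key construction is not given in this excerpt, but the reversed hash key chain it relies on follows a standard pattern: compute the chain forward once from a secret seed, then consume the links in reverse order, so that a disclosed session key never reveals a future one. A minimal sketch (SHA-256 and the function names are our assumptions):

```python
import hashlib

def build_hash_chain(seed: bytes, length: int):
    """Forward-compute h1 = H(seed), h2 = H(h1), ... The links are
    consumed in REVERSE order: session 1 uses the last link, session 2
    the one before it, and so on; since H is one-way, a revealed key
    cannot be used to derive any later session's key."""
    chain, h = [], seed
    for _ in range(length):
        h = hashlib.sha256(h).digest()
        chain.append(h)
    return chain

def session_key(chain, session_no):
    """Key (or pseudonym seed) for session_no, counting from 1."""
    return chain[len(chain) - session_no]

def verify(prev_key, new_key):
    """Each disclosed key commits to the next: H(new) must equal prev."""
    return hashlib.sha256(new_key).digest() == prev_key
```

Changing the pseudonym every session then amounts to stepping one link back along the chain.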
1.2. Application
‘PatientsLikeMe’ is an online patient network; however, VSDS is not applied to the
‘PatientsLikeMe’ website directly. The substantial application for VSDS is the web
application ‘Trello’. The intent is that VSDS, applied to Trello with cryptographic and
security techniques, can accomplish the goal and functional roles of PatientsLikeMe.
Hence, we need to know both websites.
Trello. Trello is a web-based project management application. The basic service is
generally free, except for the Business Class service. Projects are represented by boards
containing lists (corresponding to task lists); lists contain cards, which progress from
one list to the next. Users and boards are grouped into organizations. Trello's website is
accessible from most mobile web browsers. Trello supports various uses such as real
estate management, software project management, school bulletin boards, and so
on [12].
PatientsLikeMe. This online patient network has the goal of connecting patients
with one another, improving their outcomes, and enabling research. PatientsLikeMe
started the first ALS (amyotrophic lateral sclerosis) online community in 2006.
Thereafter, the company began adding other communities such as organ transplantation,
multiple sclerosis (MS), HIV, Parkinson’s disease, and so on. Today the website covers
more than 2,000 health conditions. The approach combines scientific literature reviews
and data-sharing with patients to identify outcome measures, symptoms, and treatments
through answering questions [13].
212 H.-A. Park / VSDS for Blurred Boundaries of Next Generation Computing Era
Using the web project application ‘Trello’, the security defense system VSDS is
constructed for patients with any disease all over the world, just like
‘PatientsLikeMe’. The reasons are: (1) PatientsLikeMe does not deal with all kinds of
diseases. Although the company has added other communities such as MS and Parkinson’s
disease, many other patients want to join such a website and be helped more easily.
(2) PatientsLikeMe supports only English. Patients in non-English-speaking regions find
it difficult to sign in and use the site. The system is for group members who want to
get help through information-sharing. The information scope is health conditions and
patient profiles. Most of the sensitive information can be shared, but some secret
personal data in a patient profile should not be revealed to anyone. One more important
point is that the system is a virtualized SDS using 3D image rendering for the next
generation of computing.
The details are as follows: a board is assigned to one group, and a list containing
cards is assigned to a user. Each member uploads his/her conditions or information to a
card, and then the information is shared.
VSDS has three parties: users, the SM (Security Manager), and the VSDS server. The SM
is a kind of client granted the special role of a security manager. The SM is assumed
to be a TTP (trusted third party) and is located in front of the VSDS server. The SM
controls the group key and key-related information, all sensitive information, and all
other events, with powerful computational and storage abilities. Fig. 1 shows the
system configuration of VSDS. Every user should first register at the SM; thereafter
they go through the authentication process in every session before starting any
actions. When some information is shared with other patients (that is, a shared card is
generated), the card is encrypted with the group’s encryption key. Only legitimate
users (who registered at the SM and have stored the information given by the SM for
authentication on their devices) can pass the authentication processes and learn the
shared information. In the last
2.1. Notations
• K_G : the fundamental group key of group G
• m : the number of group G’s members; j : session number; i : each member of group G
• km_i^j : group session key for each member i in the j-th session
• K_{i,1}^j, K_{i,2}^j, K_{i,3}^j, K_{i,4}^j, K_{i,5}^j : five subkeys for i’s group key km_i^j
• α_i^j : random number of member i in the j-th session
• pd_i^j : pseudonym of member i in the j-th session
• h(·) : hash function; f(·) : pseudorandom function
• C, E : encryption function; D : decryption function
• V_i^q : video image information for member i to render at the q-th session; R_V : a rendered image of V
Therefore, the first session’s random number of member i is α_i^1, and the s-th
session’s random number of member i is α_i^s; h(α_i^{s+1}) = α_i^s = h^{q−s}(α_i^q). To
make member i’s group session keys, α_i^j is changed to α_i^{j+1}, 1 ≤ j ≤ q − 1, in the
member’s group key. With these different random numbers, we can make all group keys
different for each member and each session, respectively.
The one-way hash function h() plays an important role in the group information-sharing
system of VSDS. A one-way hash key chain is generated by randomly selecting the last
value, to which the one-way hash function h() is repeatedly applied; the initially
selected value becomes the last value of the key chain. A one-way hash chain has two
properties: 1. Anyone can deduce an earlier value k_i from a later value k_j of the
chain by computing h^{j−i}(k_j) = k_i. 2. An attacker cannot find a later value k_j from
the latest released value k_i, because h is one-way. These two properties make it
possible that a leaving member cannot compute new keys after leaving the group, while
any newly joining member can obtain all previous keys and information by applying the
current key to the hash function h() repeatedly.
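The reversed-chain construction above can be sketched in a few lines. This is a minimal illustration, assuming SHA-256 stands in for the paper's h() and a byte-string seed plays the randomly selected last value α_i^q:

```python
import hashlib

def h(value: bytes) -> bytes:
    """One-way hash step; SHA-256 stands in for the paper's h()."""
    return hashlib.sha256(value).digest()

def build_reversed_chain(seed: bytes, q: int) -> list:
    """Randomly select the LAST value (seed plays alpha_i^q), then hash
    backwards so that h(alpha^{s+1}) = alpha^s. chain[s-1] holds alpha_i^s."""
    chain = [seed]
    for _ in range(q - 1):
        chain.append(h(chain[-1]))   # each hash step yields an EARLIER value
    chain.reverse()                  # now ordered alpha^1 ... alpha^q
    return chain

chain = build_reversed_chain(b"random-seed", q=5)

# Property 1: a current (later) value reveals every earlier value.
assert h(chain[3]) == chain[2]       # alpha^3 = h(alpha^4)
# Property 2 (one-wayness of h) is what prevents a leaving member
# from computing any later value.
```

A member joining at session s is handed α_i^s and can walk the chain backwards to read earlier material, while computing α_i^{s+1} would require a preimage of h.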
We assume that the encryption method for a message M with the group key K_G is
C = g^{h(K_G) f(K_G)} · M. For simplicity, we write K_{i,1}^q, K_{i,2}^q, K_{i,3}^q,
K_{i,4}^q, K_{i,5}^q as K_1, K_2, K_3, K_4, K_5 and f_{K_G}(K_G) as f(K_G). Then the
encryption method with each member’s group key km_i^j, for example in the last session
(i.e., j = q, km_i^q), is as follows:

C = E_{km_i^q}(M) = K_3^{K_1} · g^{K_2} · M = (g^{f(K_G)})^{h(K_G)α_i^q} · g^{h(K_G) f(K_G)(1−α_i^q)} · M = g^{h(K_G) f(K_G)} · M.

We can check that the result of encryption with the group key K_G is the same as that
with each member’s group key km_i^j, that is, with K_1, K_2, K_3, K_4, K_5.

The decryption method with the group key K_G is D = C · g^{−h(K_G) f(K_G)} = M. Then the
decryption method with each member’s group key km_i^q in the last session is:

D = C · K_3^{K_4} · g^{K_5} = g^{h(K_G) f(K_G)} M · (g^{f(K_G)})^{−(h(K_G)+α_i^q)} · g^{f(K_G)α_i^q} = g^{h(K_G) f(K_G) − f(K_G)h(K_G) − f(K_G)α_i^q + f(K_G)α_i^q} · M = M.

We can also check that the result of decryption with the group key K_G is the same as
that with each member’s group key km_i^j. Because of these properties of the developed
encryption and decryption algorithm, VSDS needs no re-keying process whenever
membership changes happen.
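The exponent cancellation above can be checked numerically. The sketch below works in a small prime field with placeholder integers standing in for h(K_G), f_{K_G}(K_G), and α_i^j; a real deployment would use actual hash/PRF outputs and a cryptographically large group:

```python
# Toy check of the VSDS subkey encryption/decryption identity in Z_p*.
# h_, f_, alpha stand in for h(K_G), f_{K_G}(K_G), alpha_i^j (placeholders).
p, g = 467, 2            # small prime modulus and base (demo only)
M = 123                  # message, encoded as an element of Z_p*
h_, f_, alpha = 19, 7, 5

# Member i's five subkeys for one session:
K1 = h_ * alpha
K2 = h_ * f_ * (1 - alpha)
K3 = pow(g, f_, p)
K4 = -(h_ + alpha)
K5 = f_ * alpha

n = p - 1                # exponents live modulo the group order (Fermat)
C = pow(K3, K1 % n, p) * pow(g, K2 % n, p) * M % p   # E_{km_i}(M)
assert C == pow(g, (h_ * f_) % n, p) * M % p         # equals g^{h f} * M

D = C * pow(K3, K4 % n, p) * pow(g, K5 % n, p) % p   # decryption with subkeys
assert D == M            # the subkeys undo the group encryption exactly
```

Because the exponents cancel for any α, every member's subkey set encrypts and decrypts to the same values as the fundamental group key, which is why membership changes need no re-keying.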
At the registration stage, the SM assigns j 3-dimensional real models
(3 ≤ j ≤ t, where t depends on the condition and policy of the system) to each member i
of each group A, and the members keep the given j 3D real models for later
authentication. Every group has its own particular j models (3D-shaped objects),
respectively. V_{A,i}^s denotes the video image information for the 3D model of member
i in group A at the s-th session. In every session, the SM selects one of the group’s
3D models and challenges the member of the group with V_{A,i}^s. Then, the member
renders the 3D real model for V_{A,i}^s.
2.5.1. Registration.
As the first process, every user should register at the Security Manager (SM). In this
registration stage, pseudonyms, group members’ group keys, group session keys, and
other information including member indicators are generated for each user, so that the
system can be used safely.
Then, every user is given some information by the SM. Users store it on their own
devices, such as a smartphone or PC, and keep V_{A,i}^s together with the j 3D real
models. The given information for each member i is as follows: h(E_{km_i^1}(pd_i^1 || V)),
pd_i^1, km_i^1, {h(E_{km_i^j}(pd_i^j)), 1 ≤ j ≤ q}.
The SM should also store some information for each member: α_i^q and the values for the
pseudonym hash key chain {h(pd_i^j), pd_i^j, 1 ≤ j ≤ q}.
Fig. 2 shows the whole process of VSDS from Registration.
Registration
[The First Session]
Log-in Stage (User ↔ SM)
1. The user computes f_{pd_i^1}(km_i^1) and h(pd_i^1), then sends {1(s), h(pd_i^1), f_{pd_i^1}(km_i^1), h(E_{km_i^1}(pd_i^1||V_i^1))} to SM.
2. SM finds 1(s), h(pd_i^1) → α_i^q, pd_i^1 and decrypts D(f_{pd_i^1}(km_i^1)) = km_i^1. SM computes h^{q−1}(α_i^q) = α_i^1 and km_i^1′ = {K_1^1, K_2^1, K_3^1, K_4^1, K_5^1}, verifies km_i^1 = km_i^1′, then computes and verifies h(E_{km_i^1′}(pd_i^1||V_i^1)) =? h(E_{km_i^1}(pd_i^1||V_i^1)).
3. SM computes α_i^2 and km_i^2, then computes and sends f_{pd_i^1}(km_i^2, pd_i^2), f_{pd_i^2}(pd_i^1||V_i^1).
4. The user decrypts D(f_{pd_i^1}(km_i^2, pd_i^2)) = km_i^2′, pd_i^2′, computes and verifies h(E_{km_i^2′}(pd_i^2′)) =? h(E_{km_i^2}(pd_i^2)) and h(pd_i^2′) = pd_i^1. Then km_i^2′ → km_i^2, pd_i^2′ → pd_i^2.
5. The user decrypts D(f_{pd_i^2}(pd_i^1||V_i^1)) = pd_i^1||V_i^1 and renders R(V_i^1) at a card.
6. SM verifies the card: R(V_i^1) = R_{V_i^1}.
Action Stage (User ↔ VSDS Server)
7. [member i] encrypts and uploads M: C_i^1 = E_{km_i^1}(M) = (K_{i,3}^1)^{K_{i,1}^1} · g^{K_{i,2}^1} · M = g^{h(K_G) f(K_G)} M.
8. [member j] downloads C_i^1 from the VSDS server.
9. Member j decrypts C_i^1: D = C_i^1 · (K_{j,3}^1)^{K_{j,4}^1} · g^{K_{j,5}^1} = M.
[The 2nd Session]
Log-in Stage (User ↔ SM)
1. The user computes and sends {2(s), h(pd_i^2), f_{pd_i^2}(km_i^2), h(E_{km_i^2}(pd_i^2||V_i^1))}.
2. SM finds 2(s), h(pd_i^2) → α_i^q, pd_i^2 and decrypts D(f_{pd_i^2}(km_i^2)) = km_i^2. SM computes h^{q−2}(α_i^q) = α_i^2 and km_i^2′ = {K_1^2, K_2^2, K_3^2, K_4^2, K_5^2}, verifies km_i^2 = km_i^2′, then computes and verifies h(E_{km_i^2′}(pd_i^2||V_i^1)) =? h(E_{km_i^2}(pd_i^2||V_i^1)).
3. SM computes α_i^3 and km_i^3, then computes and sends f_{pd_i^2}(km_i^3, pd_i^3), f_{pd_i^3}(pd_i^2||V_i^2).
4. The user decrypts D(f_{pd_i^2}(km_i^3, pd_i^3)) = km_i^3′, pd_i^3′, computes and verifies h(E_{km_i^3′}(pd_i^3′)) =? h(E_{km_i^3}(pd_i^3)) and h(pd_i^3′) = pd_i^2. Then km_i^3′ → km_i^3, pd_i^3′ → pd_i^3.
5. The user decrypts D(f_{pd_i^3}(pd_i^2||V_i^2)) = pd_i^2||V_i^2 and renders R(V_i^2) at a card.
6. SM verifies the card: R(V_i^2) = R_{V_i^2}.
Action Stage
Same as the 1st session.
Figure 2. The Whole Process of VSDS.
Then, SM verifies whether km_i^1 = km_i^1′. Again, SM computes h(E_{km_i^1}(pd_i^1||V_i^1))
with km_i^1 and checks whether it is the same as the received value
h(E_{km_i^1}(pd_i^1||V_i^1)). Here, E_{km_i^1}(pd_i^1||V_i^1) has the same meaning as in
step 1 above.
3. SM computes α_i^2 by applying α_i^q to the hash function (q−2) times, and then
computes km_i^2:
K_1^2 = h(K_G)α_i^2, K_2^2 = h(K_G) f_{K_G}(K_G)(1 − α_i^2), K_3^2 = g^{f_{K_G}(K_G)},
K_4^2 = −(h(K_G) + α_i^2), K_5^2 = f_{K_G}(K_G)α_i^2.
SM computes and sends f_{pd_i^1}(km_i^2, pd_i^2), f_{pd_i^2}(pd_i^1||V_i^1). Here,
pd_i^2 is the stored value.
4. With the value pd_i^1, member i decrypts the received value:
D(f_{pd_i^1}(km_i^2, pd_i^2)) = km_i^2, pd_i^2. With the obtained values km_i^2, pd_i^2,
group member i computes h(E_{km_i^2}(pd_i^2)) and verifies whether it is the same as the
received h(E_{km_i^2}(pd_i^2)). Because km_i^2 is member i’s group key, the encryption
method is also the same as in step 1. Then, i hashes the value pd_i^2 and verifies
h(pd_i^2) = pd_i^1. If the verifications are successful, km_i^2 and pd_i^2 are accepted.
5. With this pd_i^2, group member i also decrypts: D(f_{pd_i^2}(pd_i^1||V_i^1)) = pd_i^1||V_i^1.
With the decrypted V_i^1, i renders R(V_i^1) and then uploads the image of R(V_i^1) to a card.
6. SM verifies whether the rendered card image R(V_i^1) is the same as R_{V_i^1} (the 3D
real model). In the first session’s verification, the member indicator’s authentication
is processed. If SM’s verification succeeds, member i can begin to act (log-in allowed).
Actions mean uploading, reading (decryption), and downloading.
[The First Session — Action Stage]
7. A member i encrypts the message M: C_i^1 = E_{km_i^1}(M) = (K_{i,3}^1)^{K_{i,1}^1} · g^{K_{i,2}^1} · M = g^{h(K_G) f(K_G)} M.
Then, member i uploads M to his card.
8. Another member j downloads the encrypted message C_i^1 from the VSDS board (server).
9. Member j decrypts C_i^1 with his first group session key:
D = C_i^1 · (K_{j,3}^1)^{K_{j,4}^1} · g^{K_{j,5}^1} = g^{h(K_G) f(K_G)} M · (g^{f(K_G)})^{−(h(K_G)+α_j^1)} · g^{f(K_G)α_j^1} = g^{h(K_G) f(K_G) − f(K_G)h(K_G) − f(K_G)α_j^1 + f(K_G)α_j^1} · M = M.
[The Second Session]
From the second session on, most processes are similar to the first session. As the
session changes, the corresponding pseudonym keys and group session keys also change.
As for the video image information V for the 3D real model, a member sends the
information V^1 kept from the first session to SM, and then SM challenges the member
with newly selected information V^2 in the third step. Lastly, the member renders the
3D real model R(V^2) at his card. The action stage is also similar to the first session.
From the third session, all processes go through the same paths as the second session.
3. Discussion
3.1. Efficiency
3.1.1. Strength.
In secure group information-sharing communication, ‘group re-keying’ is an important
task when a user joins or leaves the group: the group keys need to be updated to
maintain forward and backward secrecy [14]. However, in the proposed system VSDS,
by the computation of the developed protocol, every authentication result is the same
as the authentication result with the fundamental group key. Therefore, no re-keying is
needed for membership changes.
3.1.2. Weakness.
In the last steps of the first session’s authentication (5, 6), a 3-dimensional image
V_i^s is rendered. R(V) plays the role of a member indicator decided by SM in advance;
its purpose is to improve security. If a 3-dimensional image is inefficient in the real
world, a 2-dimensional image is recommended instead.
However, Google’s project ‘Tango’ has recently been showcased with indoor mapping and a
VR/AR platform [4]. ‘Tango’ technology makes it possible for a mobile device to measure
the physical world. Tango-enabled devices (smartphones, tablets) are used to capture
the dimensions of physical space to create 3D representations of the real world.
‘Tango’ gives the Android device platform a new capability for spatial perception.
Therefore, the proposition of VSDS is timely, keeping abreast of ‘Tango’s AR/VR
technique for mobile devices.
3.2. Security
VSDS is a reversed-hash-key-chain-based group-key management system. Message
confidentiality is one of the most important features in secure information sharing for
group members. The group key security requirements are:
1. Group Key Secrecy: it should be computationally infeasible for a passive adversary
to discover any secret group key.
2. Forward Secrecy: any passive adversary with a subset of old group keys cannot
discover any subsequent (later) group key.
3. Backward Secrecy: any passive adversary with a subset of subsequent group keys
cannot discover any preceding (earlier) group key.
4. Key Independence: any passive adversary with any subset of group keys cannot
discover any other group key [3, 15].
However, a group-key-based information sharing and service system does not follow all
of these requirements, because a new joiner to the group should be able to search all
of the previous information in order to be helped. Namely, backward secrecy is not a
suitable security requirement for VSDS. The system VSDS satisfies Group
Information-sharing Secrecy as follows:
1. Forward Secrecy: for any group G_T and a dishonest participant p ∈ G_T^j, the
probability that p can generate a valid group key and pseudonym for the (j+1)-th
authentication is negligible² when p knows the group key km_i^j and pseudonym pd_i^j,
where p ∉ G_T^{j+1} and 0 < j < q. This means that members who leave a group can no
longer access any of the group’s subsequent information or documents.
2. Backward Accessibility: for any group G_T and a participant p ∈ G_T^j, the
probability that p can generate a valid group key and pseudonym for the (j−l)-th
authentication is 1 − η(n) when p knows the group key km_i^j and pseudonym pd_i^j,
where p ∉ G_T^{j−l} and 0 < l < j. Namely, all members joining a group can access
² The term negligible function refers to a function η : N → R such that for any c ∈ N,
there exists n_c ∈ N such that η(n) < 1/n^c for all n ≥ n_c [16].
4. Conclusion
VSDS was proposed for patients all over the world who want to get help and share
information, like the web service ‘PatientsLikeMe’. The system guarantees security and
privacy, because most health and private information is sensitive. Therefore, VSDS is
safely scalable to other groups’ project applications. Moreover, it is firmly believed
that the identified problems between next-generation collaborative computing and
security, and the approaches to them, should also be managed as Integrated Security
Management (ISM).
References
[1] H.A.Park, Secure Chip Based Encrypted Search Protocol In Mobile Office Environments, International
Journal of Advanced Computer Research, 6(24), 2016
[2] Y.Hu, A.Perrig, D.B.Johnson, Efficient security mechanisms for routing protocols, In the proceedings of
Network and Distributed System Security Symposium (2003), 57-73
[3] H.A.Park, J.H.Park, and D.H.Lee, PKIS: Practical Keyword Index Search on Cloud Datacenter,
EURASIP Journal on Wireless Communications and Networking, 2011(1), 84(2011), 1364-1372
[4] G.Sterling, Google to showcase Project Tango indoor mapping and VR/AR platform at Google I/O,
http://searchengineland.com/google-showcase-project-tango-indoor-mapping-vrar-platform-google-io-
249629, 2016
[5] H.A.Park, J.Zhan, D.H.Lee, PPSQL: Privacy Preserving SQL Queries, In the Proceedings of ISA(2008),
Taiwan, 549-554
[6] P.Wang, H.Wang, and J.Pieprzyk, Common Secure Index for Conjunctive Keyword-Based Retrieval over
Encrypted Data, SDM 2007 LNCS 4721(2007), 108-123
[7] H.A.Park, J.W.Byun, D.H.Lee, Secure Index Search for Groups, TrustBus 05 LNCS 3592(2005), 128-
140
[8] H.A.Park, J.H.Park, J.S.Kim, S.B.Lee, J.K.Kim, D.G.Kim, The Protocol for Secure Cloud-Service Sys-
tem. In the Proceedings of NISS(2012), 199-206
[9] P.Wang, H.Wang, and J.Pieprzyk, Keyword Field-Free Conjunctive Keyword Searches on Encrypted
Data and Extension for Dynamic Groups, CANS 2008 LNCS 5339(2008), 178-195
[10] Y.Kawai, S.Tanno, T.Kondo, K.Yoneyama, N.Kunihiro, K.Ohta, Extension of Secret Handshake Proto-
cols with Multiple Groups in Monotone Condition. WISA 2008 LNCS 5379(2009), 160-173
[11] J.Li, X.Chen, Efficient multi-user keyword search over encrypted data in cloud computing, Computing
and Informatics 32 (2013), 723-738
[12] http://lifehacker.com/how-to-use-trello-to-organize-your-entire-life-1683821040
[13] https://www.patientslikeme.com/
[14] R.V.Rao, K.Selvamani, R.Elakkiya, A secure key transfer protocol for group communication, Advanced
Computing: An International Journal, 3(2012), 83-90
[15] A.Gawanmeh, S.Tahar, Rank Theorems for Forward Secrecy in Group Key Management Protocols, In
the Proceedings of 21st AINAW(2007), 18-23
[16] D.Boneh, B.Waters, Conjunctive, Subset, and Range Queries on Encrypted Data, In the Proceedings of
4th TCC(2007), 535-554
220 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-220
Abstract. In most existing text-mining schemes for customer reviews, explicit
features are usually considered while implicit features are ignored, which probably
leads to incomplete or incorrect results. In fact, it is necessary to consider implicit
features in customer review mining. Focusing on the identification of implicit
features, a novel scheme based on hybrid rules is proposed, which mixes statistical
rules, dependency parsing, and conditional probability. Explicit product features are
first extracted by the FP-tree method and clustered. Then, association pairs
are obtained based on the dependency parsing method and the product of frequency
and PMI. Finally, implicit features are identified by considering the association
pairs and the conditional probability of verbs, nouns, and emotional words. The
proposed scheme is tested on a public cellphone review corpus. The results show
that our scheme can effectively find implicit features in customer reviews.
Therefore, our research can obtain more accurate and comprehensive results from
customer reviews.
Introduction
¹ Corresponding Author: Yong WANG, Chongqing University of Posts and
Telecommunications, No.2 Chongwen Road, Nan’an District, Chongqing City, China;
E-mail: wangyong1@cqupt.edu.cn.
Y. Wang et al. / Implicit Feature Identification in Chinese Reviews Based on Hybrid Rules 221
In recent years, some scholars have been studying implicit feature extraction. In
most proposals, implicit features are identified on the basis of emotional words. Qiu
et al. [3] proposed a novel approach to mine implicit features based on the k-means
clustering algorithm and F2 statistics. Hai et al. [4] identified implicit features via
co-occurrence association rule (CoAR) mining. Zeng et al. [5] proposed a
classification-based method for implicit feature identification. Zhang et al. [6] used
an explicit multi-strategy property extraction algorithm and similarity to detect
implicit features. Furthermore, Wang et al. [7] proposed a hybrid association rule
mining method to detect implicit features.
To identify implicit features, we propose a novel scheme based on hybrid rules,
which consists of three different methods. Compared with previous research results, the
presented scheme has two advantages: (1) by considering semantic association degree and
statistical association degree together, we obtain more accurate <feature cluster,
emotional word> association pairs. (2) In Chinese reviews, some emotional words can
qualify more than one feature, such as “好” (good) and “差” (bad). Thus, it is not
accurate to consider only the association between emotional words and features. To
solve this problem, the association between verbs, nouns, and features is also
considered.
1. Scheme Design
Figure 1 depicts the framework of our scheme, which is composed of several parts.
In this stage, explicit features are extracted. The detailed steps are as follows:
• Word segmentation and POS (part-of-speech) tagging are performed on the reviews
via ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis
System). Then, the nouns and noun phrases from the annotated corpus of comments
are stored in a transaction file.
• Frequent itemsets obtained by the FP-tree method are regarded as candidate explicit
features I0.
• Candidate explicit features I1 are obtained after pruning all single words in I0.
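The three steps above can be sketched as follows. This toy uses plain frequency counting in place of the FP-tree method and assumes segmentation and POS tagging (e.g. via ICTCLAS) have already reduced each sentence to its nouns and noun phrases; the transactions and `min_support` value are hypothetical:

```python
from collections import Counter
from itertools import combinations

# Toy "transaction file": each review sentence reduced to its nouns /
# noun phrases (segmentation + POS tagging assumed already done).
transactions = [
    ["screen", "resolution"],
    ["screen", "battery"],
    ["battery", "price"],
    ["screen", "battery"],
    ["price"],
]

min_support = 2   # hypothetical support threshold

# Frequent itemsets (simple counting stands in for the FP-tree method) -> I0
counts = Counter()
for t in transactions:
    for r in (1, 2):
        for itemset in combinations(sorted(set(t)), r):
            counts[itemset] += 1
I0 = {s for s, c in counts.items() if c >= min_support}

# Prune all single words -> I1 (candidate explicit features)
I1 = {s for s in I0 if len(s) > 1}
print(sorted(I1))   # -> [('battery', 'screen')]
```

An FP-tree implementation would produce the same frequent itemsets more efficiently on a large corpus; the counting loop here only illustrates the I0 → I1 pipeline.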
1.2. Explicit association pairs <explicit feature cluster, emotional word> extraction

P_{f&w} = Σ_{i=1}^{n} Co_occurrence(f_i, w) / R    (4)

where n is the number of features in a feature cluster, f_i is the i-th feature in the
feature cluster f, Co_occurrence(f_i, w) is the number of co-occurrences of f_i and w
in explicit sentences, and R is the number of explicit sentences.
• Syntax analysis tools are used to obtain all dependency relationships in the
sentences. If an “nsubj” relationship exists between a feature cluster and an
emotional word, there is a modified relation between them. If a feature in a
feature cluster has a modified relation with an emotional word, we consider that
the feature cluster has a modified relation to the emotional word.
• A threshold p is set. The association pairs with a frequency*PMI value larger
than p, or with a frequency*PMI value smaller than p but with an existing modified
relation, are chosen as final association pairs. In this paper, p is −0.00009.
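A minimal sketch of the frequency*PMI filter described above, with the dependency-parse check abstracted to a boolean. The counts passed to `keep_pair` are hypothetical; R = 1870 (explicit sentences) and p = −0.00009 come from the paper:

```python
import math

R = 1870                  # number of explicit sentences (from the paper)
p_threshold = -0.00009    # threshold p used in the paper

def pmi(co_occurrence: int, freq_f: int, freq_w: int) -> float:
    """Pointwise mutual information from sentence-level counts,
    with P(f & w) estimated as in Eq. (4)."""
    p_fw = co_occurrence / R
    return math.log(p_fw / ((freq_f / R) * (freq_w / R)))

def keep_pair(co: int, ff: int, fw: int, has_nsubj_relation: bool) -> bool:
    """A pair survives if frequency*PMI clears the threshold, or if a
    dependency-parse 'nsubj' modified relation exists anyway."""
    score = co * pmi(co, ff, fw)
    return score > p_threshold or has_nsubj_relation

# Toy counts (hypothetical): a feature cluster and emotional word
# co-occurring in 40 of the explicit sentences.
print(keep_pair(co=40, ff=194, fw=60, has_nsubj_relation=False))
```

Rarely co-occurring pairs get a large negative score and are dropped unless the parser found an explicit "nsubj" modified relation, which mirrors the two-way rule in the bullet above.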
where D is a weight coefficient, set to 0.7 after several experiments. Then the
representative feature of the feature cluster that appears in the association pair with
the highest score is chosen as the implicit feature.
Y1N2
Step 1 is the same as the first step of Y1N1.
Step 2: the representative feature of the feature cluster that appears in the candidate
association pair with the highest frequency*PMI value is chosen as the implicit feature.
Y2N1
Step 1: verbs and nouns in the implicit sentence are extracted and treated as the
notional word set. Then, we use Eqs. (5) and (6) to calculate every explicit feature
cluster’s average conditional probability under the condition of these words.
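Since Eqs. (5) and (6) are not reproduced in this excerpt, the sketch below assumes a plausible estimator, P(cluster | word) ≈ co-occurrences(cluster, word) / occurrences(word), averaged over the sentence's notional words; all counts are invented for illustration:

```python
# Sketch of the Y2N1 idea: rank explicit feature clusters by their average
# conditional probability given the implicit sentence's verbs and nouns.
# The estimator below is an assumption standing in for Eqs. (5)-(6).
co = {                                   # toy co-occurrence counts
    ("battery", "charge"): 30, ("battery", "last"): 25,
    ("screen", "charge"): 2,  ("screen", "last"): 1,
}
word_count = {"charge": 40, "last": 30}  # toy occurrence counts

def avg_cond_prob(cluster: str, notional_words: list) -> float:
    """Average of P(cluster | word) over the notional words."""
    probs = [co.get((cluster, w), 0) / word_count[w] for w in notional_words]
    return sum(probs) / len(probs)

words = ["charge", "last"]               # extracted from an implicit sentence
best = max(["battery", "screen"], key=lambda c: avg_cond_prob(c, words))
print(best)   # -> battery
```

The cluster whose average conditional probability is highest under the sentence's verbs and nouns is then taken as the implicit feature.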
2. Experiment Evaluation
Six hundred reviews about one kind of cellphone were downloaded from a public website
called Datatang.com. In order to evaluate the performance of the scheme, the data set
was manually annotated. In the data set, there are 1870 explicit sentences and 413
implicit sentences. Three traditional metrics, precision, recall, and F-measure, are
used to evaluate the performance of the scheme.
89 explicit product features are obtained by the method described in Section 1.1. The
top 5 features most discussed by customers are shown in Table 1. The precision of this
method is 70.8%, the recall is 73.3%, and the F-measure is 72%. 1285 association pairs
are extracted from explicit sentences by the approach described in Section 1.2. Five
association pairs are shown in Table 2. As seen from the table, the performance of the
approach is good.
Table 1. Top 5 product features results
rank  feature            PMI       frequency
1     智能 (intelligence)  0.0       14
2     软件 (software)      -0.10005  42
3     号码 (number)        -0.44418  30
4     屏幕 (screen)        -0.6529   194
5     价格 (price)         -0.79837  34
Table 2. Association pairs
Implicit features are identified by the approach described in Section 1.3. Table 3
shows a partial result. Comparative results against Ref. [4] on the same data are given
in Table 4.
Table 3. Partial results of implicit feature identification
Implicit sentence                        implicit feature
"900 Ma is difficult to meet the needs"  battery
"too expensive"                          price
"very slow and very troublesome"         reaction
"Very beautiful"                         appearance
"shape looks like hard"                  appearance
Table 4. Comparative results
Evaluation index   our scheme   Ref. [4]
precision          67.49%       41.55%
recall             65.86%       37.53%
F-measure          66.67%       39.44%
It can be seen from the above tables that the proposed algorithm is far superior to
the algorithm in [4]. Our scheme can better meet the needs of practical applications.
The algorithm proposed in this paper takes both statistical analysis and semantic
analysis into account, which allows it to find more associations between emotional
words and explicit feature clusters. The research in [4] only focused on mining product
features from the statistical point of view. Therefore, our method has more advantages
in performance.
3. Conclusion
Implicit features in customer reviews have an important effect on text mining
results, and are also an important factor for customers or enterprises making wise
decisions. In this paper, we proposed a scheme combining several rules to extract
implicit features, from word segmentation through to identification. Compared with
conventional methods, our scheme not only obtains the association between emotional
words and product features based on statistics and semantics, but also considers the
effect of emotional words, verbs, and nouns on the final results. Experimental results
show that our scheme lays a good basis for the application of network reviews.
Acknowledgments
References
[1] B. Liu, M. Hu, J. Cheng. Opinion observer: analyzing and comparing opinions on the web. In:
Proceedings of the 14th International Conference on World Wide Web (WWW’05), ACM, New York,
NY, USA, 2005, 342–351.
[2] H. Xu, F. Zhang, W. Wang. Implicit feature identification in Chinese reviews using explicit topic mining
model. Knowledge-Based Systems, 76(2014):166–175.
[3] Y. F. Qiu, X. F. Ni, L. S. Shao. Research on extracting method of commodities implicit opinion targets.
Computer Engineering and Applications, 51(2015):114-118.
[4] Z. Hai, K. Chang, J.-j. Kim, Implicit feature identification via co-occurrence association rule mining. In:
Computational Linguistics and Intelligent Text Processing, Lecture Notes in Computer Science,
6608(2011), 393–404.
[5] L. Zeng, F. Li. A Classification-Based Approach for Implicit Feature Identification.
In: Chinese Computational Linguistics and Natural Language Processing Based on
Naturally Annotated Big Data. Springer Berlin Heidelberg, 2013:190-202.
[6] L. Zhang, X. Xu. Implicit Feature Identification in Product Reviews. New Technology of Library and
Information Service. 2015, (12):42-47.
[7] W. Wang, H. Xu, and W. Wan. Implicit feature identification via hybrid association rule mining. Expert
Systems with Applications, 40(2013):3518–3531.
[8] K W Church, et al. Word association norms, mutual information and lexicography. In: Proceedings of
the 27th Annual Conference of the Association of Computational Linguistics, New Brunswick, NJ:
Association for Computational Linguistics, 1989: 76–83.
[9] J. L. Tian, W. Zhao. Words Similarity Algorithm Based on Tongyici Cilin in Semantic Web Adaptive
Learning System. Journal of Jilin University (Information Science Edition), 28(2011):602-608.
226 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-226
Abstract. Studies of traditional cascade events, such as avalanche and sandpile
models, have only examined the power-law distribution over the whole time span. In
fact, the speed of virus propagation differs in each time period. In this paper,
through empirical observations we find that the number of infected people behaves as
a power law for Guinea, Liberia, and Sierra Leone, respectively, over different time
periods. The government could therefore use the different power exponents of the
number of infected people in different periods of the disease's spread to set the
speed of drug manufacturing.
Introduction
¹ Corresponding Author: Hao YANG, School of Information Engineering, Yancheng Teachers
University, Yancheng, Jiangsu, China; School of Software and TNList, Tsinghua
University, Beijing, China; E-mail: classforyc@163.com.
K.-M. Tang et al. / Characteristics Analysis and Data Mining of Uncertain Influence 227
period. For Guinea, Liberia, and Sierra Leone, we make some empirical observations
about the spread of the disease over different time periods.
1. The Measurement of the Spread of the Disease Based on a Power Law Model
We suppose the sand number increases with n. One possibility is that the number of
sand grains per unit length of the cycles, λ, is constant. Put k/2 grains
(k = 2πλ·ctg θ, an even number) on the 0-th cycle and k grains on the 1st cycle; the
number of grains on the n-th cycle should then be nk. Likewise, we assume a falling
grain undergoes an inelastic collision with the resting grains, in which case the
grains slide together after the collision. We arrange the sliding grains so that
(n² − n + 1) grains evenly meet 2n resting grains on the n-th cycle. There are
(n² − n + 1)k grains in the n-th generation (b_n = (n² − n + 1)k, d_n = 0), so
N(t) ~ n(t)² ~ (t²)² ~ t⁴ [16].
We model the population with susceptible people (S), latent people (L), infected
people (I), and dead people (D). The transformation among the four nodes is shown in
Figure 1.
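A minimal discrete-time reading of the S → L → I → D transformation can be sketched as below; the transition rates are placeholder assumptions, not values from the paper:

```python
# Toy discrete-time S -> L -> I -> D update; beta, sigma, mu are
# placeholder rates assumed for illustration only.
beta, sigma, mu = 0.3, 0.2, 0.1   # infection, onset, death rates (assumed)
S, L, I, D = 9990.0, 0.0, 10.0, 0.0
N = S + L + I + D

for _ in range(100):               # 100 time steps
    new_latent = beta * S * I / N  # susceptibles exposed by the infected
    new_infected = sigma * L       # latent people becoming infectious
    new_dead = mu * I              # infected people dying
    S -= new_latent
    L += new_latent - new_infected
    I += new_infected - new_dead
    D += new_dead

assert abs((S + L + I + D) - N) < 1e-6   # compartments conserve the total
print(round(D))
```

Logging I at each step of such a simulation would give the infected-count time series whose piecewise power-law behavior the paper measures empirically.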
2. Model Evaluations
We collected data on the number of all cases and the number of infected people from a
website (http://www.cdc.gov/vhf/ebola/outbreaks/2014-west-africa/whats-new.html). In
our experiments, a linear function was fitted to the linear ranges of the log-log
plotted distributions to estimate the value of the power exponent. Figures 2-4 show the
distributions of I for Guinea, Liberia, and Sierra Leone, respectively (with the values
of the Pearson correlation coefficient R and standard deviation SD). Our method
considers the number of infected people from February 4, 2014 to March 25, 2015.
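The exponent estimation described above (a linear fit on the log-log plot) can be sketched with a hand-rolled least-squares slope; the data below are synthetic, not the Ebola counts:

```python
import math

def loglog_slope(ts, ys):
    """Least-squares slope of log(y) vs log(t): the power-law exponent B
    in y ~ A * t^B, i.e. a linear fit on the log-log plot."""
    xs = [math.log(t) for t in ts]
    ws = [math.log(y) for y in ys]
    n = len(xs)
    mx, mw = sum(xs) / n, sum(ws) / n
    num = sum((x - mx) * (w - mw) for x, w in zip(xs, ws))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Synthetic check: data generated from y = 3 * t^2 should give slope ~2.
ts = [1, 2, 5, 10, 20, 50, 100]
ys = [3 * t ** 2 for t in ts]
print(round(loglog_slope(ts, ys), 6))   # -> 2.0
```

In the paper, this fit is applied separately to each time range (e.g. 0 < t < 100, 100 < t < 229, ...), yielding one exponent B per period.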
The log-log plots of the numbers of infected and dead people are demonstrated:
Figure 2. Log-log plot of the number of infected people for Guinea ((a) 0<t<100, (b) 100<t<229, (c) 229<t<318).
Figure 3. Log-log plot of the number of infected people for Liberia ((a) 2<t<100, (b) 100<t<267, (c) 267<t<318).
Figure 4. Log-log plot of the number of infected people for Sierra Leone ((a) 64<t<271, (b) 271<t<318).
Figure 5. Log-log plot of the number of dead people for Guinea ((a) 0<t<100, (b) 100<t<318).
Figure 6. Log-log plot of the number of dead people for Liberia ((a) 0<t<100, (b) 100<t<318).
Figure 7. Log-log plot of the number of dead people for Sierra Leone.
Table 1. Values of R (Pearson correlation coefficient) and SD (standard deviation) for the log-log plots of the number of people infected for the three countries.

       Guinea                        Liberia                       Sierra Leone
t      0-100    100-229   229-318   2-100    100-267   267-318    64-271   271-318
R      0.96003  0.98086   0.93949   0.88703  0.9855    0.98225    0.97084  0.98431
SD     0.06164  0.06282   0.01599   0.18816  0.10405   0.00326    0.15811  0.00396
Table 2. Values of R (Pearson correlation coefficient) and SD (standard deviation) for the log-log plots of the number of people dead for the three countries.

       Guinea              Liberia             Sierra Leone
t      0-100    100-318    2-100    100-318    64-318
R      0.95263  0.9909     0.79757  0.94552    0.96776
SD     0.06817  0.03521    0.15197  0.17199    0.15719
Figures 2-4 show the distributions of the number of people infected for the three
countries, and Figures 5-7 show the distributions of the number of people dead for
the three countries. We use R and SD to illustrate the feasibility of our model (the
values of the Pearson correlation coefficient R and standard deviation SD are listed in
Tables 1-2). If 0.95 is taken as a minimal reliable value, we can state a power law for
the numbers of infected and dead people.
Through the above analysis, we obtain power relations describing how the number of
people infected or dead changes over time. The values of A and B are shown in Tables 3-4,
respectively. Of course, the correlation between these parameters and the number of
people is not straightforward; we simply report the numerical results.
Table 3. Values of A and B of the people infected for the three countries.

       Guinea                        Liberia                        Sierra Leone
t      0-100    100-229   229-318   2-100     100-267   267-318    64-271    271-318
A      0.36666  3.1281    0.99661   0.82688   5.1708    0.73092    4.14169   1.01616
B      1.3384   -4.46411  0.53758   -0.18821  -8.61879  1.87354    -6.20484  1.336
Table 4. Values of A and B of the people dead for the three countries.

       Guinea              Liberia              Sierra Leone
t      0-100    100-318    2-100     100-318    64-318
A      0.37032  1.92976    0.45961   3.71858    3.59267
B      1.61866  -1.52369   0.4246    -5.38156   -5.30488
3. Conclusion
According to our model, the transmission speed of the virus is slow at the beginning, but
accelerates after a period, which draws enough attention to the virus for people to take
relevant measures to prevent its spread; the speed then decreases relatively. Our power-law
model is shown to be reasonable through a simplified sandpile model and analysis of the
empirical data. Data on latent people could not be collected, so we only analyze the data
on infected and dead people in the model. Our experiments suggest that the model is
realistic, sensible, and useful, and can be applied to help eradicate Ebola.
Acknowledgements
This work is supported by the National High Technology Research and Development
Program (863 Program) of China (2015AA01A201), National Science Foundation of
China under Grant No. 61402394, 61379064, 61273106, National Science Foundation of
Jiangsu Province of China under Grant No. BK20140462, Natural Science Foundation of
the Higher Education Institutions of Jiangsu Province of China under Grant No.
14KJB520040, 15KJB520035, China Postdoctoral Science Foundation funded project
under Grant No. 2016M591922, Jiangsu Planned Projects for Postdoctoral Research
Funds under Grant No. 1601162B, JLCBE14008, and sponsored by Qing Lan Project.
References
[1] P. Bak, C. Tang, K. Wiesenfeld, Self-organized criticality, Physical review A, 38 (1988): 364.
[2] M. L. Sachtjen, B.A. Carreras, V.E. Lynch, Disturbances in a power transmission system, Physical Review
E, 61(2000): 4877.
[3] A. E. Motter, Cascade control and defense in complex networks, Physical Review Letters, 93(2004)
098701.
[4] J. Wang, L.-L. Rong, L. Zhang, Z. Zhang, Attack vulnerability of scale-free networks due to cascading
failures, Physical A, 387(2008): 6671.
[5] S.V. Buldyrev, R. Parshani, G. Paul, et al. Catastrophic cascade of failures in interdependent networks,
Nature, 464(2010): 1025.
[6] R. Parshani, S.V. Buldyrev, S. Havlin, Interdependent networks: reducing the coupling strength leads to a
change from a first to second order percolation transition, Physical Review Letters, 105(2010): 048701.
[7] T. Zhou, B.H. Wang, Maximal planar networks with large clustering coefficient and power-law
degree distribution, Chinese Physics Letters, 22 (2005): 1072.
[8] K. Lied, Avalanche studies and model validation in Europe, European research project SATSIE
(EU Contract no. EVG1-CT2002-00059), 2006.
[9] D.A. Noever, Himalayan sandpiles, Physical Review E, 47(1993): 724.
232 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-232
Introduction
1 Corresponding Author: Kang-Wei Liu, Engineer of Sinopec Safety Engineering Institute, No. 339,
Songling Road, Qingdao, Shandong, China; E-mail: liukw.qday@sinopec.com.
K.-W. Liu et al. / Hazardous Chemicals Accident Prediction Based on Accident State Vector 233
chemicals is very grim, with all kinds of explosion, fire, leakage and poisoning accidents
occurring from time to time. According to statistics, there are more than 96,000 chemical
enterprises in China, of which more than 24,000 are dangerous chemicals production
enterprises; more than 100,000 species of chemicals are in use, yet more than 4,600
chemical accidents have occurred in the past decade. As devices become large-scale and
intensive, material and economic losses occur whenever an accident happens, and death
and disability losses in particular lead to losses of healthy life. Therefore, it is particularly
important to forecast hazardous chemical accidents and to develop appropriate safety
measures on this basis.
Accident prediction is based on known information and data, and forecasts the security
of the forecast object, as shown in Figure 1. Accident prediction methods have gradually
become a hot research topic in recent years, because the change trend and hidden security
dangers of accidents can be analyzed through such methods. According to incomplete
statistics, there are now more than 300 kinds of forecasting methods, and the development
of modern forecasting methods is often accompanied by cross-analysis and mutual
penetration among them, so it is difficult to classify them absolutely. The common
accident prediction methods can be grouped into six types: situational analysis, regression
prediction, time-series prediction, Markov chain prediction, grey prediction and nonlinear
prediction. Traditional accident prediction work tends to emphasize model establishment
and algorithm improvement, while the collection and curation of prior accident data is
frequently overlooked. Limited by the difficulty of prior data collection and the complexity
of models, accident prediction models are usually based on a small number of factors
assumed to have a strong causal relationship to hazardous chemical accidents, such as the
number of accidents, the death toll and the amount and type of hazardous chemicals,
ultimately leading to incomplete and inaccurate forecasting results.
Figure 1. Establishment of the accident prediction model (prior data of accidents feed the prediction model).
Support Vector Machine (SVM) was developed by Vapnik and co-workers [1]. It is
an excellent machine learning method that has empirically been shown to give good
generalization performance on a wide variety of problems. SVM is an implementation
of statistical learning theory: it pursues not only accuracy on the training sample but
also low complexity of the learning space, i.e., it adopts a compromise between spatial
complexity and learning precision on the samples, so that the resulting models possess
good generalization ability on unknown samples.
In view of the analysis and summary of previous methods, an improved hazardous
chemicals accident prediction method based on Support Vector Machine is proposed
in this paper. It defines the accident state vector along three dimensions, namely human
factors, physical state factors and environmental factors, based on the principle of
accident causation. According to the geometric distribution characteristics of support
vectors, the samples most likely to become support vectors are selected from the
incremental samples to form a boundary vector set by a vector-distance pre-extraction
method, on which support vector training and the accident prediction model are built.
The validity of the predictive model is ensured because the various causal factors of
accidents are fully considered by the accident state vector, and the advantages of support
vector machines in high-dimensional, multi-factor, large-sample machine learning are
exploited.
1. Overview of SVM
The core of SVM is finding a hyperplane that separates a set of positive examples from
a set of negative examples with maximum margin [1,2,3]. The training of a Support
Vector Machine can be reduced to maximizing a convex quadratic program subject to
linear constraints. Given a training sample
{(x_i, y_i) | i = 1, ..., l; x_i ∈ R^n, y_i ∈ {+1, -1}},
in the linearly separable case the goal of SVM is to find a hyperplane
⟨w, x⟩ + b = 0
which divides the sample set exactly. There is generally more than one such hyperplane.
The hyperplane with the largest margin between the two kinds of samples, the optimal
classification hyperplane, attains the best generalization capacity. The optimal hyperplane
is determined only by the samples closest to it and does not depend on the other samples;
these samples are the so-called support vectors, which is the origin of the term "support
vector" [4,5,6,7].
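As a concrete illustration of the maximum-margin idea (not the authors' implementation), a primal soft-margin SVM can be trained by subgradient descent; the toy data and hyperparameters below are invented for the sketch.

```python
import numpy as np

# Two linearly separable toy clusters in R^2 (hypothetical data).
X = np.array([[2.0, 2.0], [2.5, 1.5], [3.0, 2.5],
              [-2.0, -2.0], [-2.5, -1.5], [-3.0, -2.5]])
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

# Minimize 0.5*||w||^2 + C * sum_i max(0, 1 - y_i * (<w, x_i> + b))
# by subgradient descent; the first term maximizes the margin 2/||w||.
w, b, C, lr = np.zeros(2), 0.0, 1.0, 0.01
for _ in range(5000):
    viol = y * (X @ w + b) < 1              # samples violating the margin
    w -= lr * (w - C * (y[viol, None] * X[viol]).sum(axis=0))
    b -= lr * (-C * y[viol].sum())

margins = y * (X @ w + b)
support = np.argsort(margins)[:2]            # the samples closest to the hyperplane
print((np.sign(X @ w + b) == y).all())
```

The samples with the smallest functional margin are the support-vector candidates; as the text notes, the optimal hyperplane is determined by them alone.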
The accident causation theory is used to illustrate the causes of accidents, the process
of their development and their consequences, so that the occurrence and development of
accident phenomena can be analyzed definitely. It is an accident mechanism and model
extracted from the essence of a large number of typical accidents; it reflects the regularity
of accidents, provides a scientific and complete theoretical basis for accident prediction
and prevention, and improves safety management owing to its capacity for quantitative
and qualitative analysis of accident causes.
In accordance with the accident causation theory, the insecure elements of human
beings, the insecure status of objects and the insecure impact of the environment can all
lead to the occurrence of accidents, so an accident can be described by three categories
of indicators: subjective evaluation indicators (human factors), objective inherent
indicators (physical factors) and environmental indicators (environment factors), as
shown in Figure 2.
set of M. Obviously, the learning target is to find the classifier and the corresponding
support vector set.
Based on the geometric character of support vectors, determining whether one sample
can become a support vector should consider two quantities: one is the distance between
the sample and the hyperplane; the other is the distance between the sample and the
center of its class [13,14,17]. So we do our best to select the samples likely to become
support vectors as the new training set. There may be such samples in both the original
and the newly-increased sample sets. We select the samples which are close to the
separating hyperplane and lie between the class center and the hyperplane as
newly-increased samples; the samples whose distance to the hyperplane is less than their
distance to the center plane form the edge sample set T. The union of these sets is taken
as the final training set of incremental learning.
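One plausible reading of the vector-distance pre-extraction rule above can be sketched as follows; the hyperplane (w, b), the data and the function name are illustrative assumptions, not the paper's code.

```python
import numpy as np

def preextract_boundary_set(X, y, w, b):
    """Vector-distance pre-extraction (a sketch of one plausible reading):
    keep samples that lie closer to the separating hyperplane than to the
    center of their own class; they are the likeliest support vectors."""
    d_hyp = np.abs(X @ w + b) / np.linalg.norm(w)        # distance to hyperplane
    keep = np.zeros(len(X), dtype=bool)
    for label in (+1, -1):
        idx = np.where(y == label)[0]
        center = X[idx].mean(axis=0)
        d_cen = np.linalg.norm(X[idx] - center, axis=1)  # distance to class center
        keep[idx] = d_hyp[idx] < d_cen
    return keep

# Hypothetical data: points at varying distance from the plane x1 + x2 = 0.
X = np.array([[0.5, 0.2], [3.0, 3.0], [-0.4, -0.3], [-3.0, -3.0]])
y = np.array([1, 1, -1, -1])
keep = preextract_boundary_set(X, y, np.array([1.0, 1.0]), 0.0)
print(keep)  # the two points near the hyperplane are kept, the far ones dropped
```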
4. Experimental Results
We apply this algorithm to establish the model for the prediction of hazardous
chemicals accidents. We compare the ASV-SVM algorithm with the traditional SVM
learning algorithm and the KNN (k-Nearest Neighbor) algorithm. A simple description
of the three algorithms is as follows:
Classical SVM algorithm: the traditional SVM algorithm. It combines the original
samples and the newly-increased samples, and does the learning again on all of the
training samples.
Classical KNN algorithm: KNN is a memory-based method. Prediction on a new
instance is performed using the labels of similar instances in the training set.
ASV-SVM algorithm: the proposed algorithm, which selects support vectors based
on vector distance for incremental learning.
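For reference, the memory-based KNN baseline can be sketched in a few lines; the data and k below are made up.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Plain k-Nearest Neighbor: label a new instance by majority vote of
    the k closest training samples under Euclidean distance."""
    d = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(d)[:k]]
    return Counter(nearest.tolist()).most_common(1)[0][0]

# Hypothetical two-class training data in R^2.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([-1, -1, 1, 1])
print(knn_predict(X_train, y_train, np.array([0.95, 1.0]), k=3))  # -> 1
```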
In this experiment, the accident state vector is defined from multi-modal data. The
method is as follows:
(1) Collect and maintain data on 619 typical hazardous chemicals accidents occurring
within the last ten years, including accident reports, accident cause analyses, accident
consequences and influence.
(2) Crawl related data on these accidents using a web crawler built with open-source
tools. The web crawler is an internet bot that systematically browses known hazardous
chemical accident websites to collect multi-modal accident data, such as the weather
conditions, geographical situation, population density, etc. at the time of the accident.
(3) For the comparative test, we collected two to three sets of non-accident status data
at other times at each place where an accident occurred; 1288 non-accident state data
were formed in this way.
(4) The data collected above are in multiple formats, such as authoritative data, accident
reports, webpages, images, video, speech, etc. In order to define the accident state vector
easily, we divide the multi-modal data into three dimensions based on the principle of
accident causation. We use open-source big data tools, together with manual screening,
to structure the data, adding as many attribute labels as possible to each item so that
the non-structural data become structured. Frankly, for unstructured data such as video
and images, most of this is done by manual recognition, as the accuracy and availability
of automatic machine recognition is not yet satisfactory.
(5) All the multi-modal data carry many attribute labels after the structuring process.
These attribute labels are categorized into three dimensions based on the principle of
accident causation: the human factors, physical state factors and environmental factors.
(6) The accident state vector is determined to have 265 dimensions, each attribute label
representing one dimension: the human vector P (185 dimensions), divided into
leadership and safety culture, safety, process safety information for process control,
inspection and human performance; the state vector D (49 dimensions), divided into the
fire index of hazardous substances, explosive index, toxicity index, process index,
equipment index, safety facility index, etc.; and the environment vector E (31
dimensions), comprising the meteorological index (We) and geographic information
index (Gi).
(7) Transfer the accident state and the non-accident state into the vector form as
follows [15]:
<label> <index1>:<value1> <index2>:<value2> … <indexn>:<valuen>
Label is the result of the accident state: 1 is the accident state and -1 the non-accident
state. Index is the attribute label, value is the weight or description of the attribute label,
and n is equal to 265.
(8) From these vectors, 1000 are selected as the test set, 1000 are used as the initial
training set, and the remaining 907 are randomly divided into 3 sets used as incremental
sets.
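The sparse vector format of step (7) can be produced with a helper like this sketch (the function name and the sample attribute values are hypothetical):

```python
def to_libsvm_line(label, features):
    """Serialize one accident state vector to the sparse format
    '<label> <index1>:<value1> ...' used by LIBSVM. `features` maps
    1-based attribute indices to values; zero entries are omitted."""
    parts = [str(label)]
    for idx in sorted(features):
        if features[idx] != 0:
            parts.append(f"{idx}:{features[idx]}")
    return " ".join(parts)

# Hypothetical accident state: a few of the 265 attribute dimensions set.
line = to_libsvm_line(1, {3: 0.8, 17: 1, 265: 0.25})
print(line)  # -> "1 3:0.8 17:1 265:0.25"
```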
After the pretreatment, the accident information is transferred into the form of vectors.
Then we use the three algorithms to do the learning. All of the algorithms are carried out
with the LibSvm-mat-5.20 toolbox [16]. The experimental platform is an E7-4830 v2,
and the operating system is Windows Server 2012. In the experiment, the kernel is the
RBF function and C = 1. The results of the experiment are shown in Tables 1-3.
Table 1. Experiment results of the classical SVM and ASV-SVM algorithms (for each incremental
set: set number, number of samples, time/s, and precision on the test set).
5. Conclusion
The hazardous chemicals industry is a high-risk industry; explosion, fire, leakage and
poisoning accidents occur frequently. This paper analyzes the influence of human factors,
physical factors and environmental factors on the occurrence of hazardous chemicals
accidents, and defines the accident state vector along these three dimensions. In view of
the analysis and summary of previous methods, an improved hazardous chemicals
accident prediction method based on the Accident State Vector (ASV-SVM) is proposed.
A high-dimensional vector is used to define the accident state, so that as many relevant
factors as possible are considered. Using the improved support vector machine learning
algorithm (the ASV-SVM algorithm), an accident prediction model is established from
the accident state vectors. A sample test on hazardous chemical accidents shows that the
method proposed in this paper can differentiate accident states accurately and efficiently,
and makes a positive contribution to accident prediction for hazardous chemicals.
Acknowledgement
This work was supported by the National Natural Science Foundation of China (Grant
No. 31201133).
References
[1] V.N. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, (2000), 332-350.
[2] N. Cristianini, J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-based
Learning Methods. Cambridge University Press, (2004), 543-566.
[3] R. Xiao, J.C. Wang, Z.X. Sun, An Incremental SVM Learning Algorithm. Journal of Nan Jing
University (Natural Sciences), 38(2002), 152-157.
[4] N. Ahmed, S. Huan, K. Liu, K. Sung, Incremental learning with support vector machines. The International
Joint Conference on Artificial Intelligence, Morgan Kaufmann Publishers, (1999), 352-356.
[5] P. Mitra, C.A. Murthy, S.K. Pal, Data Condensation in Large Databases by Incremental Learning with
Support Vector Machines. Proceedings of the International Conference on Pattern Recognition, (2000),
2708-2711.
[6] C. Domeniconi and D. Gunopulos, Incremental Support Vector Machine Construction. Proceedings of the
IEEE International Conference on Data Mining (ICDM), (2001), 589-592.
[7] G. Cauwenberghs , T. Poggio, Incremental and Decremental Support Vector Machine Learning. Ad-
vances in Neural Information Processing Systems,(2000),122-127.
[8] S. Katagiri , S. Abe, Selecting Support Vector Candidates for Incremental Training. Proceeding of IEEE
International Conference on Systems, Man, and Cybernetics (SMC), (2005),1258-1263,.
[9] D. M. J. Tax, R. P. W. Duin, Outliers and Data Descriptions. Proceeding of Seventh Annual Conference
of the Advanced School for Computing and Imaging, (2001),234-241.
[10] L.M. Manevitz and M. Yousef, One-class SVMs for document classification. Journal of Machine Learn-
ing Research, 2 (2001), 139-154.
[11] R. Debnath, H. Takahashi, An improved working set selection method for SVM decomposition method.
Proceeding of IEEE International Conference Intelligence Systems, Varna, Bulgaria, 21-24(2004), 520-
523.
[12] R. Debnath, M. Muramatsu, H.Takahashi, An Efficient Support Vector Machine Learning Method with
Second-Order Cone Programming for Large-Scale Problems. Applied Intelligence, 23(2005), 219-239.
[13] W.D. Zhou, L. Zhang, L.C. Jiao, An Analysis of SVMs Generalization Performance. Acta Electronica
Sinica, 29(2001), 590-594.
[14] J. Heaton, Net-Robot: Java Programming Guide. Publishing House of Electronics Industry, 22(2002),
1-141.
[15] C.W. Hsu, C.J. Lin, A simple decomposition method for support vector machines. Machine Learning,
46(2002), 291-314.
[16] C.C. Chang , C. Lin, LIBSVM : a library for support vector machines, 2001. Software available at
http://www.csie.ntu.edu.tw/~cjlin/libsvm
[17] C.H. Li, K.W. Liu, H.X. Wang. The incremental learning algorithm with support vector machine based
on hyperplane distance, Applied Intelligence, 46(2009):145-152
Fuzzy Systems and Data Mining II 241
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-241
Introduction
1 Corresponding Author: Guoqi Liu; School of Computer and Information Engineering, Henan
1. Background
\[ \frac{\partial \phi}{\partial t} = -\left( (I - c_1)^2 - (I - c_2)^2 + \mu K \right)\delta(\phi) \tag{2} \]
A data fitting energy is defined in LBF [6], which locally approximates the image
intensities on the two sides of the contour. This energy is then incorporated into a
variational level set formulation, and a curve evolution equation is derived.
G.-Q. Liu and H.-F. Li / Regularized Level Set for Inhomogeneity Segmentation 243
\[ e = \int_\Omega \sum_{i=1}^{2} \lambda_i e_i(x)\,dx \tag{3} \]
\[ e_1(x) = \int_\Omega K_\sigma(y-x)\,|I(y)-f_1(x)|^2\,H\,dy, \qquad e_2(x) = \int_\Omega K_\sigma(y-x)\,|I(y)-f_2(x)|^2\,(1-H)\,dy \tag{4} \]
$K_\sigma$ serves as a kernel function. $f_1(x)$ and $f_2(x)$ are computed as follows:
\[ \frac{\partial \phi}{\partial t} = -\frac{\partial E}{\partial \phi} = -\delta(\phi)(e_1 - e_2) + \lambda\,\delta(\phi)\,\mathrm{div}\!\left( \frac{\nabla \phi}{|\nabla \phi|} \right) \tag{7} \]
Similar to [12], the evolution equation of LBF is also computed by minimizing the
following energy functional:
\[ E = \lambda \int |\nabla \phi|\,dx + \int (e_2 - e_1)\,\phi\,dx \tag{10} \]
where $g = \frac{1}{1+|\nabla I|^2}$ is the edge stopping function and $u$ is the characteristic
function with $0 \le u \le 1$. Since the above energy is convex but poses a constrained
minimization problem, an unconstrained convex energy is obtained by introducing an
exact penalty function:
\[ E(u, f_1, f_2, \lambda, \alpha) = \lambda\,TV_g(u) + \int_\Omega (e_2 - e_1)u + \alpha p_f(u)\,dx \tag{12} \]
In order to obtain the solution of the energy functional (12), a regularized method is
utilized in this letter. By introducing an auxiliary variable $v$, the regularized energy
functional is computed as follows:
\[ E(u, v, f_1, f_2, \lambda, \alpha) = \lambda\,TV_g(u) + \frac{\mu}{2}\|u - v\|_F^2 + \int_\Omega (e_2 - e_1)v + \alpha p_f(v)\,dx \tag{13} \]
\[ u = v - \frac{1}{\mu}\,\mathrm{div}\,p \tag{14} \]
\[ g(x)\,\nabla\!\left( \frac{1}{\mu}\,\mathrm{div}\,p - v \right) - \left| \nabla\!\left( \frac{1}{\mu}\,\mathrm{div}\,p - v \right) \right| p = 0. \tag{15} \]
The above equation can be solved by a fixed point method, which is given in [13].
Similarly, $v$ is obtained by minimizing the following energy:
\[ v = \arg\min_v \left\{ \frac{\mu}{2}\|u - v\|_F^2 + \int_\Omega (e_2 - e_1)v + \alpha p_f(v)\,dx \right\} \tag{16} \]
whose closed-form solution updates $v$ as
\[ v = \min\!\left( \max\!\left( u - \frac{1}{\mu}(e_2 - e_1),\, 0 \right),\, 1 \right) \tag{17} \]
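As reconstructed here, the v-update of Eq. (17) is a pointwise gradient step on the fitting term followed by clipping to [0, 1]; a numpy sketch with made-up arrays and μ = 1:

```python
import numpy as np

def update_v(u, e1, e2, mu):
    """One v-update of the splitting scheme (Eq. 17 as reconstructed here):
    a gradient step on the fitting term, then clipping to [0, 1]."""
    return np.clip(u - (e2 - e1) / mu, 0.0, 1.0)

# Hypothetical 2x2 images of u and the two fitting energies e1, e2.
u = np.array([[0.2, 0.9], [0.6, 0.4]])
e1 = np.array([[0.1, 0.5], [0.3, 0.2]])
e2 = np.array([[0.8, 0.1], [0.2, 0.9]])
v = update_v(u, e1, e2, mu=1.0)
print(v)
```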
4. Conclusions
In this paper, we first introduce the CV model and the LBF model. Then we propose
our model to improve the efficiency of contour evolution. There are two contributions:
one is that an energy term with edge information is added into LBF; the other is a fast
algorithm introduced to obtain the solution. Experimental results confirm that the
proposed method obtains similar segmentations while keeping weak edges, and achieves
faster evolution of the contour.
Acknowledgements
References
[1] V. Caselles, R. Kimmel, and G. Sapiro. Geodesic active contours, International journal
of computer vision, 22(1997), 61-79.
[2] S. Kichenassamy, A. Kumar, P. Olver, A. Tannenbaum, and A. Yezzi: Gradient flows and
geometric active contour models, Proc. 5th Int. Conf. Comput. Vis., 1995, 810-815.
[3] R. Kimmel, A. Amir, and A. Bruckstein. Finding shortest paths on surfaces using level set
propagation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(1995),
635-640.
[4] R. Malladi, J. A. Sethian, and B. C.Vemuri. Shape modeling with front propagation: A
level set approach, IEEE Transactions on Pattern Analysis and Machine Intelligence,
17(1995), 158-175.
[5] T. Chan and L. Vese. Active contours without edges, IEEE Transactions on Image Pro-
cessing, 10(2) (2001), 266-277.
[6] C. Li, C. Kao, J. C. Gore, and Z. Ding. Minimization of region-scalable fitting energy for
image segmentation, IEEE Transactions on Image Processing, 17(2008), 1940-1949.
[7] C. Li, Huang R., Ding Z., Gatenby C., Metaxas DN., Gore JC. A level set method for
image segmentation in the presence of intensity inhomogeneities with application to MRI,
IEEE Transactions on Image Processing, 20(7) (2011), 2007-2016.
[8] X.F. Wang, H. Min. A level set based segmentation method for images with intensity in-
homogeneity, Emerging Intelligent Computing Technology and Applications, with Aspects
of Artificial Intelligence, 2009, 670-679.
[9] F.F. Dong, Z.S. Chen and J.W. Wang, A new level set method for inhomogeneous image
segmentation, Image and Vision Computing, 31(2013), 809-822.
[10] K.H. Zhang, H.H. Song and L. Zhang, Active contours driven by local image fitting energy,
Pattern Recognition, 43(2010), 1199-1206.
[11] C. Li, C. Xu, C. Gui, MD. Fox, Level set evolution without re-initialization: A new vari-
ational formulation, Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2005,
430-436.
[12] A. Chambolle, An algorithm for total variation minimization and applications, Journal of
Mathematical Imaging and Vision, 20(2004), 89-97.
[13] X. Bresson, S. Esedoglu, P. Vandergheynst, et al. Fast global minimization of the active
contour /snake model, Journal of Mathematical Imaging and Vision, 28(2007), 151-167.
[14] E.S. Brown, T.F. Chan, X. Bresson. Completely convex formulation of the Chan-Vese
image segmentation model, International journal of computer vision, 98(2012), 103-121.
[15] C. Li, R. Huang, Z. Ding, C. Gatenby, D. Metaxas, J. Gore, A variational level set approach
to segmentation and bias correction of images with intensity inhomogeneity, Proceedings
of Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2008, Part II,
LNCS 5242, 1083-1091.
248 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-248
Introduction
In the 1990s, Judea Pearl first described the Bayesian network [1], a kind of inference
network based on probabilistic uncertainty. A particularly restricted model, Naive
Bayes (NB), is a powerful classification technique. Many restricted Bayesian classifiers
[2] have been proposed to extend the dependence structure of NB, such as Tree-augmented
Naive Bayes (TAN) [3] and the k-dependence Bayesian classifier (KDB) [4].
Madden [2] finds that unrestricted Bayesian classifiers [5] learned using likelihood-
based scores are comparable to TAN. In this paper, a novel unrestricted k-dependence
Bayesian classifier (UKDB) is proposed, built from the perspective of the Markov blanket.
Local mutual information and conditional local mutual information are applied to build
a local graph structure UKDB_L for each test instance. UKDB_L can be considered a
complementary part of UKDB_G, which is learned from the training set.
1 Corresponding Author: LiMin Wang, Key Laboratory of Symbolic Computation and Knowledge
Engineering of Ministry of Education, Jilin University, ChangChun City 130012, P. R. China; E-mail:
wanglim@jlu.edu.cn.
M.-H. Li and L.-M. Wang / Exploring the Non-Trivial Knowledge Implicit in Test Instance 249
The rest of the paper is organized as follows. Section 1 briefly introduces information
theory and Markov blanket. Section 2 introduces related Bayesian classifiers. Section
3 presents the learning procedure of UKDB and basic idea of local learning. Section 4
provides the experimental results and comparisons. Section 5 concludes the findings.
In the 1940s, Claude E. Shannon introduced information theory, the theoretical basis of
modern digital communication. Many commonly used measures are based on information
theory and are used in a variety of classification algorithms.
The mutual information (MI) [6] I(X;Y) measures the reduction of uncertainty
about variable X when all the values of variable Y are known. Conditional mutual
information (CMI) [6] I(X;Y|Z) measures the mutual dependence between X and Y
given Z. Local mutual information (LMI) I(X;y) measures the reduction of uncertainty
about variable X after observing that Y = y. Conditional local mutual information
(CLMI) I(x;y|Z) measures the mutual dependence between two attribute values x and y
given Z.
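These scores are straightforward to estimate from discrete data with a plug-in estimator; the synthetic attributes below (one copying the class, one independent of it) and the base-2 logs are assumptions for illustration, not details from the paper.

```python
import numpy as np
from itertools import product

def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in bits from two discrete sample arrays."""
    mi = 0.0
    for vx, vy in product(set(x), set(y)):
        pxy = np.mean((x == vx) & (y == vy))        # joint frequency
        px, py = np.mean(x == vx), np.mean(y == vy)  # marginal frequencies
        if pxy > 0:
            mi += pxy * np.log2(pxy / (px * py))
    return mi

# Synthetic data: X1 copies the class, X2 is independent noise, so a
# mutual-information ranking would pick X1 as the informative attribute.
rng = np.random.default_rng(1)
C = rng.integers(0, 2, 2000)
X1 = C.copy()
X2 = rng.integers(0, 2, 2000)
print(mutual_information(X1, C) > mutual_information(X2, C))  # -> True
```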
Definition 1. [1] The Markov blanket (MB) for variable C is the set of nodes composed
of C's parents X_pa, its children X_ch, and its children's parents X_cp. Suppose that X =
{X_pa, X_ch, X_cp}; Markov blanket Bayesian classifiers approximate P(x, c) as follows:
\[ P(c, \mathbf{x}) = P(x_{pa})\,P(c|x_{pa})\,P(x_{cp}|x_{pa}, c)\,P(x_{ch}|x_{cp}, x_{pa}, c) \tag{1} \]
Eq. (1) presents a more general case. The Markov blanket of C shields C from the effects
of the attributes outside it and is the only knowledge needed to predict its behavior.
where $X_r$ denotes the root node and $\{X_{j(i)}\} = Pa(X_i)\backslash\{C\}$ for any $i \neq r$. An
example of TAN is shown in Figure 1(b).
KDB further relaxes NB's independence assumption by allowing every attribute to
be conditioned on the class and, at most, k other attributes [4]. Then
\[ P(c|\mathbf{x}) \propto P(c)\,P(x_1|c) \prod_{i=2}^{n} P(x_i|c, x_{i_1}, \cdots, x_{i_p}) \tag{4} \]
where $\{X_{i_1}, \cdots, X_{i_p}\}$ are the parent attributes of $X_i$ and $p = \min(i-1, k)$. Figure
1(c) shows an example of KDB when k=2.
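Eq. (4) can be made concrete with a toy k = 1 structure; all conditional probability tables below are made-up numbers, not values learned from data.

```python
# Toy KDB-style scoring of Eq. (4) with hypothetical CPTs, k = 1:
# P(c|x) ∝ P(c) * P(x1|c) * P(x2|c, x1). All table entries are invented.
P_c = {0: 0.6, 1: 0.4}
P_x1_c = {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.8}       # key: (x1, c)
P_x2_cx1 = {(0, 0, 0): 0.9, (1, 0, 0): 0.1, (0, 0, 1): 0.4,         # key: (x2, c, x1)
            (1, 0, 1): 0.6, (0, 1, 0): 0.5, (1, 1, 0): 0.5,
            (0, 1, 1): 0.3, (1, 1, 1): 0.7}

def posterior(x1, x2):
    """Normalize the product of CPT entries over the two classes."""
    score = {c: P_c[c] * P_x1_c[(x1, c)] * P_x2_cx1[(x2, c, x1)] for c in (0, 1)}
    z = sum(score.values())
    return {c: s / z for c, s in score.items()}

post = posterior(x1=1, x2=1)
print(max(post, key=post.get))  # the class with the larger posterior
```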
UKDB can output two kinds of sub-classifiers, i.e., UKDBG and UKDBL , which de-
scribe the causal relationships implicated in training set and test instance, respectively.
UKDB uses I(Xi ; C) and I(Xi ; Xj |C) simultaneously to measure the comprehensive
effect of class C and other attributes (e.g., Xj ) on Xi .
The learning procedures of UKDBG are described as follows:
———————————————————————————————————
Algorithm 1 UKDBG
———————————————————————————————————
Input: Pre-classified training set, DB, and the k value for the maximum allowable
degree of attribute dependence.
1. Let the global Bayesian classifier being constructed, UKDBG , begin with a single
class node C. Let the used attribute list S be empty.
2. Select k attributes {X1 , · · · , Xk } as Xpa that correspond to the maximum of
I(X1 , · · · , Xk ; C).
3. Add {X1 , · · · , Xk } to S. Add k nodes to UKDBG representing {X1 , · · · , Xk }
as the parents of C. Add k arcs from {X1 , · · · , Xk } to C in UKDBG .
4. Repeat until S includes all domain attributes
• Select attribute $X_i$ that corresponds to the maximum value of $I(X_i;C) + \sum_{j=1}^{q} I(X_i;X_j|C)$, where $X_i \notin S$, $X_j \in S$ and $q = \min(|S|, k)$.
• Add Xi to S. Add a node that represents Xi to UKDBG . Add an arc from C
to Xi . Add q arcs from q distinct attributes Xj in S to Xi .
5. Compute the conditional probability tables inferred by the structure of UKDBG
by using counts from DB, and output UKDBG .
———————————————————————————————————
The learning procedures of UKDBL are described as follows:
———————————————————————————————————
Algorithm 2 UKDBL
Input: Test instance (x1 , · · · , xn ), estimates of probability distributions on training
set and the k value for the maximum allowable degree of attribute dependence.
1. Let the local Bayesian classifier being constructed, UKDBL , begin with a single
class node C. Let the used attribute list S be empty.
2. Select k attributes {X1 , · · · , Xk } as Xpa that correspond to the maximum of
I(x1 , · · · , xk ; C).
3. Add {X1 , · · · , Xk } to S. Add k nodes to UKDBL representing {X1 , · · · , Xk }
as the parents of C. Add k arcs from {X1 , · · · , Xk } to C.
4. Repeat until S includes all domain attributes
• Select attribute $X_i$ that corresponds to the maximum value of $I(x_i;C) + \sum_{j=1}^{q} I(x_i;x_j|C)$, where $X_i \notin S$, $X_j \in S$ and $q = \min(|S|, k)$.
• Add Xi to S. Add a node that represents Xi to UKDBL . Add an arc from C
to Xi . Add q arcs from q distinct attributes Xj in S to Xi .
5. Compute the conditional probability tables inferred by the structure of UKDBL
by using counts from DB, and output UKDBL .
———————————————————————————————————
For UKDB_G and UKDB_L, estimate the conditional probabilities $\hat{P}_G(c_p|\mathbf{x})$ and
$\hat{P}_L(c_p|\mathbf{x})$ that instance $\mathbf{x}$ belongs to class $c_p$ ($p = 1, 2, \cdots, t$), respectively. The class
label of $\mathbf{x}$ is determined by the average of the two conditional probabilities.
In order to better verify the efficiency of the proposed UKDB, experiments have been
conducted on 15 datasets from the UCI machine learning repository [7]. Table 1 sum-
marizes the characteristics of each dataset. Table 2 presents for each dataset the average
zero-one loss. The following algorithms are compared:
• NB, standard Naive Bayes.
• TAN [3], Tree-augmented Naive Bayes applying incremental learning.
• KDB (k=2), standard k-dependence Bayesian classifier.
• UKDBG (Global UKDB, k=2), a variant UKDB describes global dependencies.
• UKDBL (Local UKDB, k=2), a variant UKDB describes local dependencies.
• UKDB (k=2), a combination of global UKDB and local UKDB.
Statistically a win/draw/loss record (W/D/L) is computed for each pair of competi-
tors A and B with regard to a performance measure M . The record represents the number
of datasets in which A respectively beats, loses to, or ties with B on M . Finally, related
algorithms are compared via one-tailed binomial sign test with a 95% confidence level.
Table 3 shows the W/D/L records respectively corresponding to average zero-one loss.
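The one-tailed binomial sign test used here can be computed directly. A minimal sketch (assuming, as is usual for the sign test, that draws are discarded and the null hypothesis is a win probability of 0.5; the 12/2/1 record below is a hypothetical example, not one of the paper's results):

```python
from math import comb

def sign_test_p(wins, losses):
    """One-tailed binomial sign test: probability of at least `wins`
    successes in wins + losses trials under the null hypothesis p = 0.5."""
    n = wins + losses
    return sum(comb(n, i) for i in range(wins, n + 1)) / 2 ** n

# hypothetical record: A beats B on 12 datasets, loses on 2, draws on 1
p = sign_test_p(12, 2)
assert p < 0.05  # significant at the 95% confidence level
```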
Demšar [8] recommends the Friedman test [9] for comparisons of multiple algorithms. For any pre-determined level α, the null hypothesis is rejected if F > χ²_α, the upper-tail critical value with t − 1 degrees of freedom. The critical value of χ²_α for α = 0.05 is 9.49. The Friedman statistic for zero-one loss in our experiments is 16.64, so the null hypothesis is rejected. By comparing these results, we can draw the following conclusions:
M.-H. Li and L.-M. Wang / Exploring the Non-Trivial Knowledge Implicit in Test Instance 253
For the different classifiers, the average ranks of zero-one loss on all datasets are {NB(4.66), TAN(3.74), KDB(3.56), UKDBG (3.45), UKDBL (3.58), UKDB(2.01)}. UKDB and UKDBG perform the best among all classifiers in terms of zero-one loss. From Table 3, UKDB has lower zero-one loss more often than the other classifiers, and the differences are significant. UKDBG also has relative advantages; however, the differences are not significant. The performance of UKDBL is similar to that of TAN. UKDB can make full use of the information supplied by both the training set and the test instance, and thus achieves robust performance.
5. Conclusion
The working mechanisms of NB, TAN and KDB were analysed and summarised. The proposed algorithm, UKDB, applies local learning and the Markov blanket to improve classification accuracy. Local learning makes the final model more flexible, and the Markov blanket relaxes the strict restriction on the parent variables.
Fifteen datasets were selected from the UCI machine learning repository, and 10-fold cross-validation was used for the zero-one loss comparison. Overall, the findings reveal that the UKDB model substantially outperformed NB, TAN and KDB. To clarify the working mechanism of UKDB, its two components, global UKDB and local UKDB, are also implemented and compared.
Acknowledgements
This work was supported by the National Science Foundation of China (Grant No.
61272209, 61300145) and the Postdoctoral Science Foundation of China (Grant No.
2013M530980), Agreement of Science & Technology Development Project, Jilin
Province (No. 20150101014JC).
References
[1] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kauf-
mann, Palo Alto, CA, 1988.
[2] M.G. Madden, On the classification performance of TAN and general Bayesian networks, Knowledge-
Based Systems, 22 (2009), 489–495.
[3] R.A. Josep, Incremental Learning of Tree Augmented Naive Bayes Classifiers, in Proceedings of the 8th
Ibero-American Conference on Artificial Intelligence, Seville, Spain, 2002, 32–41.
[4] M. Sahami, Learning limited dependence Bayesian classifiers, in Proceedings of the 2nd International
Conference on Knowledge Discovery and Data Mining, 1996, 335–338.
[5] F. Pernkopf, Bayesian network classifiers versus selective k-NN classifier, Pattern Recognition,
38(2005), 1–10.
[6] C.E. Shannon, A Mathematical Theory of Communication, Bell System Technical Journal, 1948, 379–
423.
[7] UCI Machine Learning Repository. Available online: https://archive.ics.uci.edu/ml/datasets.html.
[8] J. Demšar, Statistical comparisons of classifiers over multiple data sets, Journal of Machine Learning
Research, 7 (2006), 1–30.
[9] M. Friedman, The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of
Variance, Journal of the American Statistical Association, 32 (1937), 675–701.
254 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-254
Introduction
1 Corresponding Author: Ling-xi PENG, School of Mechanical and Electrical Engineering, Guangzhou University, Guangzhou, P.R. China. Email: xysoc@gzhu.edu.cn.
Y. Xie et al. / The Factor Analysis’s Applicability on Social Indicator Research 255
effect will be limited. On the contrary, with stronger correlation, factor analysis could
largely reduce the dimensionality, produce superior performance, and improve the
interpretability [11].
Nowadays, factor analysis is implemented in most statistical software. But the software cannot understand the underlying meaning of each variable, so researchers need to name the cryptic factors, give them a practical interpretation, and check the applicability of factor analysis to the dataset. Quite often, the applicability is not tested at all, and researchers assume applicability by default. This is one of the key reasons why absurd factor analysis results are not uncommon in many statistical textbooks and articles. Many authors do not examine the raw data before conducting the factor analysis.
Specifically, this article uses an example from Statistics (fourth edition, Renmin University of China Press) to illustrate the importance of checking the applicability of factor analysis. This textbook is widely used in China, is recommended by the National Statistics Committee and the Ministry of Education, and comes with a comprehensive supporting teaching database. In fact, similar misuses can be found in many other statistical textbooks, including another popular textbook, Multivariate Statistical Analysis [8].
1. A Case Study
The following example uses factor analysis to rank the economic development of Chinese provinces. "Based on the data of six major economic indicators for 31 provinces, municipalities and autonomous regions in 2006, conduct factor analysis, explain the factors, and calculate the factor scores [9]." (Quoted and translated from pages 256-269 of the original book, Chapter 12, Principal Component Analysis and Factor Analysis):
Table (data rows omitted in this excerpt). Columns: Region; Gross Regional Product Per Capita (yuan); Government Revenue (10000 yuan); Total Investment in Fixed Assets (100 million yuan); Total Population (10000 persons); Household Consumption Expenditure (yuan per capita); Total Retail Sales of Consumer Goods (100 million yuan).
                                       Component
                                         1       2
Gross Regional Product Per Capita      .112    .981
Government Revenue                     .755    .622
Total Investment in Fixed Assets       .931    .247
Total Population                       .941   -.213
Household Consumption Expenditure      .117    .980
Total Retail Sales of Consumer Goods   .922    .349
Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
Then, the author weighted each factor according to its variance contribution rate and summed the weighted factor scores. In this way, the textbook calculated a total score for each region and used it to reflect regional economic development. The resulting ranking from the textbook is shown in Table 4.
According to the textbook, the result of the KMO test, shown in Table 5, is statistically significant, which would mean that the result of the factor analysis is meaningful. However, the result is highly suspect. For example, Beijing is significantly under-ranked and Henan is over-ranked, and Guangdong being ranked first is inconsistent with the actual state of economic development.
Table 5 KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy     .695
Bartlett's Test of Sphericity  Approx. Chi-Square   277.025
                               Sig.                 .000
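For reference, the KMO measure can be computed from the correlation matrix alone. The sketch below is our own illustration (not the textbook's SPSS output): KMO compares the squared correlations with the squared partial correlations obtained from the inverse of the correlation matrix.

```python
import numpy as np

def kmo(R):
    """Kaiser-Meyer-Olkin measure of sampling adequacy from a correlation
    matrix R: sum of squared off-diagonal correlations divided by that sum
    plus the sum of squared off-diagonal partial correlations."""
    R = np.asarray(R)
    inv = np.linalg.inv(R)
    d = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / d                     # partial correlation matrix
    off = ~np.eye(len(R), dtype=bool)      # off-diagonal mask
    r2 = (R[off] ** 2).sum()
    p2 = (partial[off] ** 2).sum()
    return r2 / (r2 + p2)
```

Data driven by one strong common factor yields a high KMO, while a KMO near 0.5 or below signals that factor analysis is questionable.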
The problem of the above analysis lies in the raw data. The example selects a few
variables to reflect the economic development. However, these variables are not on the
same scale. The GDP per capita is on the "individual" scale, while the "total population
at the end of the year", "investment in fixed assets", "total retail sales of social
consumer goods" and "government revenue" are all on the "population" or "overall"
scale. Because of the mismatched scale, it is inappropriate to combine these variables
into meaningful factors. In fact, to compare the level of economic development, the
"total population" is not even a proper indicator, as it gives advantages to regions with
larger populations in the ranking system. Obviously, a large population does not necessarily indicate a prosperous economy. For example, Beijing, the capital of China, has a much smaller population than Henan province, but Beijing's economy is much more developed than Henan's.
To overcome this problem, a more appropriate approach is to examine the raw data before factor analysis. To evaluate economic development, per capita variables are more reasonable. Using the data from the textbook, we calculate per capita values for each variable except "total population", and then apply factor analysis to the transformed data. The results (Table 6) show that the first extracted factor explains more than 80% of the variation, indicating that the per capita economic indicators are strongly correlated. We use the component matrix (Table 7) to recalculate the scores.
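The per capita correction and the single-factor extraction can be sketched as follows. This is an illustration under our own assumptions (principal components of the correlation matrix), not the textbook's SPSS run:

```python
import numpy as np

def per_capita_factor(X, population):
    """Convert 'overall'-scale variables to per capita values, standardize,
    and extract the first principal component as a single factor score.
    X: (regions x variables) raw totals; population: (regions,) totals.
    Returns the factor-1 scores and the share of variance it explains
    (cf. the 84.2% reported in Table 6)."""
    Xpc = X / population[:, None]             # per capita scale
    Z = (Xpc - Xpc.mean(0)) / Xpc.std(0)      # standardize
    corr = np.corrcoef(Z, rowvar=False)
    vals, vecs = np.linalg.eigh(corr)         # eigenvalues in ascending order
    explained = vals[-1] / vals.sum()         # share of variance of factor 1
    scores = Z @ vecs[:, -1]                  # factor-1 score per region
    return scores, explained
```

When the per capita indicators move together, the first factor explains most of the variance and its scores give a defensible one-dimensional ranking.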
Table 6 New Total Variance Explained
            Initial Eigenvalues                   Extraction Sums of Squared Loadings
Component   Total   % of Variance  Cumulative %   Total   % of Variance  Cumulative %
1           4.210   84.210          84.210        4.210   84.210         84.210
2            .592   11.833          96.042
3            .139    2.776          98.818
4            .039     .770          99.588
5            .021     .412         100.000
Extraction Method: Principal Component Analysis.
The final regional rank of the economic level (Fac 1) is shown below.
Clearly, the ranking in Table 8 agrees with the actual economic situation in China.
The more developed regions are on the top.
2. Conclusions
Factor analysis is a widely taught and used statistical method, especially in the field of social indicator research. Various professional statistical software packages (such as SPSS and SAS) integrate factor analysis modules that automate the process. But without careful examination of the raw data, erroneous conclusions are unavoidable. The quality of a factor analysis result depends strongly on the original variables, the data sources, and the analysis method.
The KMO test is often employed to check whether the data are suitable for factor analysis, but it cannot tell whether the data themselves are reasonable for the analysis. Most of the time, the factor analysis results in the textbook only give mathematical expressions without a clear interpretation. When researchers and teachers used factor
Acknowledgements
This work was supported by the National Social Science Fund 15AZD077.
References
[1] H. H. Harman, Modern Factor Analysis, 3rd ed. Chicago: University of Chicago Press. 1976.
[2] N. Cressie, Statistics for spatial data. John Wiley & Sons, 2015.
[3] J. L. Devore, Probability and Statistics for Engineering and the Sciences. Cengage Learning, 2015.
[4] D. R. Anderson, D. J. Sweeney, T. A. Williams, et al. Statistics for business & economics. Nelson
Education, 2016.
[5] J. Pearl, M. Glymour, N. P. Jewell. Causal Inference in Statistics: A Primer. John Wiley & Sons, 2016.
[6] J. R. Schott, Matrix analysis for statistics. John Wiley & Sons, 2016.
[7] D. C. Howell, Fundamental statistics for the behavioral sciences. Nelson Education, 2016.
[8] X. Q. He, Multivariate Statistical Analysis, Renmin University of China Press, 2011, 143-173.
[9] J. P. Jia, Statistics, Renmin University of China Press, 2011, 254-270.
[10] J. Kim and C. W. Mueller. Factor Analysis: What it is and how to do it. Beverly Hills and London:
Sage Publications, 1978.
[11] P. Kline, An easy guide to factor analysis. London: Routledge, 1994.
260 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-260
Introduction
Weapon target allocation (WTA) is the problem of optimally distributing our forces and weapons, according to the characteristics and quantity of incoming targets, for the best operational effectiveness. WTA is a typical constrained combinatorial optimization problem and is NP-hard. A WTA model based on multi-objective optimization is more realistic and is a hot research topic. At present, intelligent optimization methods [1-3], such as the genetic algorithm (GA), the particle swarm algorithm (PSA), the ant colony algorithm (CA), and simulated annealing (SA), are widely employed to solve WTA.
These intelligent algorithms have been shown to find better solutions than the classic ones. However, they are still not fast enough to satisfy the real-time requirements of air defense. In this paper, we focus on designing a new gene coding to improve computational efficiency. A popular genetic coding length is n*m, corresponding to assigning n weapons to m targets. In our study, a sequence of weapons serves as the gene coding, to which two other codes are attached: a target code and a capacity code. This coding has length n and accommodates the constraints of WTA effectively. On the other hand, the maximum
1 Corresponding Author: Yan-Sheng ZHANG, Lecturer, Ordnance Engineering College, No.97 Heping West Road, Shijiazhuang City, Hebei Province, China; E-mail: zhang_sheng_74@163.com.
Y.-S. Zhang et al. / Research on Weapon-Target Allocation Based on Genetic Algorithm 261
1. Mathematical Model
Our anti-aircraft equipment is represented by A=[a1, a2,…, an], in which ai denotes the ith (1≤i≤n) weapon. R=[r1, r2,…, rn] represents the ammunition capacities corresponding to A=[a1, a2,…, an], and ri is the quantity of ammunition for ai. The target set is T=[t1, t2,…, tm], where tj (1≤j≤m) is the jth incoming target. D=[d1, d2,…, dm] gives the threat levels corresponding to T=[t1, t2,…, tm], and dj represents the threat degree of tj. P=[pij]n×m is the matrix of intercept probabilities, where pij is the intercept probability of ai against tj. The decision matrix is X=[xij]n×m, where xij is the number of missiles that ai fires at tj.
Operational effectiveness f1(X) is expressed in Eq. (1) [4]. The total number of missiles consumed, f2(X), is given in Eq. (2). The optimization of WTA is to make f1(X) as large as possible and f2(X) as small as possible. The multiple objectives can be combined into a single one, shown in Eq. (3), where f(X) is the objective function and L1 and L2 are weights. We want f(X) to be as large as possible, as shown in Eq. (4).
f1(X) = Σ_{j=1}^{m} d_j (1 - Π_{i=1}^{n} (1 - p_ij)^{x_ij})    (1)

f2(X) = Σ_{i=1}^{n} Σ_{j=1}^{m} x_ij    (2)

f(X) = L1·f1(X) - L2·f2(X)    (3)

max f(X)    (4)

s.t.  Σ_{j=1, j≠k}^{m} x_ij = 0    (5)

      Σ_{i=1}^{n} x_ij ≥ 1    (6)

      1 ≤ x_ij ≤ r_i    (7)
Usually, there are some constraints on f(X). The number of weapons is not smaller than the number of targets, i.e., n ≥ m. A weapon ai is allowed to be allocated to only one target tk, as indicated in Eq. (5); consequently, there is only one nonzero element in each row of X. Any target is assigned at least one weapon, so at least one element of each column of X is nonzero, as shown in Eq. (6). The number of missiles xij that ai allocates to tj must not exceed the ammunition capacity ri, as given in Eq. (7).
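Under these definitions, the objective of Eqs. (1)-(3) can be sketched in Python. This is our own illustration; the weights L1 and L2 are assumptions, with L2 = 0.05 chosen because it reproduces f = f1 − 0.05·f2 for the values later listed in Table 1:

```python
import numpy as np

def objective(X, d, p, L1=1.0, L2=0.05):
    """f(X) = L1*f1(X) - L2*f2(X) for an n x m decision matrix X,
    threat levels d (length m) and intercept probabilities p (n x m)."""
    survive = np.prod((1.0 - p) ** X, axis=0)   # P(target j survives all shots)
    f1 = float(np.sum(d * (1.0 - survive)))     # Eq. (1): operational effectiveness
    f2 = float(X.sum())                         # Eq. (2): missiles consumed
    return L1 * f1 - L2 * f2, f1, f2            # Eq. (3)
```

For example, firing one missile with p = 0.5 at each of two targets of threat 1.0 gives f1 = 1.0, f2 = 2, and f = 0.9 with these weights.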
The decision matrix X is the solution of the objective function. It is very complicated to perform gene crossover and mutation if X is directly encoded as the gene particle. Instead, a sequence of weapons, A=[a1, a2,…, an], serves as the gene coding, with 1, 2,…, n representing a1, a2,…, an respectively. Additionally, each gene particle carries two other codes. Corresponding to A=[a1, a2,…, an], one is the target code T=[t1, t2,…, tn] and the other is the ammunition quantity code C=[c1, c2,…, cn]. Each ti takes a value in {1, 2,…, m}, and c1, c2,…, cn satisfy c1 ≤ r1, c2 ≤ r2,…, cn ≤ rn. For example, a gene coding and its additional codes are shown in Figure 1.
Figure 1. Gene coding example.
Gene code       W = [2 4 8 1 5 7 3 6 9]
Target coding   T = [ ]
Capacity coding C = [2 1 3 3 1 4 4 1 1]
Corresponding decision matrix X (rows: weapons 1-9, columns: targets 1-6):
      1  2  3  4  5  6
  1   0  0  0  3  0  0
  2   2  0  0  0  0  0
  3   0  0  0  4  0  0
  4   0  1  0  0  0  0
  5   0  0  0  0  1  0
  6   0  1  0  0  0  0
  7   0  0  0  0  0  4
  8   0  0  3  0  0  0
  9   0  0  0  0  1  0
The roulette method is used to generate the parent population Q1, and some of its members are selected for gene recombination by crossing with probability p1. The cross point k of the two parent genes is a random number between 1 and n. Figure 3 shows the crossing process.
W1 and W2 are the gene particles to be crossed, and their affiliated codes (T1, C1, T2 and C2) are also listed in Figure 3(a). Suppose k=4; the crossing of W1(1) can then be explained as follows.
Step 1: Search W1 for the same value as W2(1)=1. The search result is W1(5)=W2(1)=1.
Step 2: The value of W1(1) is interchanged with that of W1(5). As a result, W1'(1)=1 and W1'(5)=8.
Step 3: Accordingly, the value of C1(1) is interchanged with that of C1(5). As a result, C1'(1)=2 and C1'(5)=4.
Figure 3. Crossing example (k = 4).
(a) Before crossing:
W1 = [8 4 3 2 1 9 5 6 7]    W2 = [1 6 9 5 3 2 7 8 4]
T1 = [1 2 3 4 5 6 4 1 6]    T2 = [1 2 3 4 5 6 1 1 4]
C1 = [4 1 1 2 2 1 1 2 4]    C2 = [3 1 1 2 4 2 3 3 2]
(b) After crossing:
W1' = [1 6 9 5 8 3 2 4 7]   W2' = [8 4 3 2 9 5 7 1 6]
T1 = [1 2 3 4 5 6 4 1 6]    T2 = [1 2 3 4 5 6 1 1 4]
C1' = [2 2 1 1 4 1 2 1 4]   C2' = [3 2 4 2 1 2 3 3 1]
The crossing of W1(2), W1(3) and W1(4) is similar to that of W1(1). Likewise, W2(1)–W2(4) undergo the same transformation. The two new genes W1' and W2' derived by crossing are shown in Figure 3(b).
Superficially, if k=1 the first bits of W1 and W2 would be swapped, and if k=9 W1 and W2 would be completely swapped. In fact, the traditional position swap, in which the two particles exchange segments at the crossing point, is not adopted, because duplicate values could then appear in a new gene code, violating the constraint of Eq. (5). In this paper, the crossover operator specifies that element swaps occur only within one gene code. The result of exchanging elements within a gene is that the first k elements of W1' are the same as those of W2, and likewise for W2'. In general, W1'≠W1 and W2'≠W2. After the gene crossing, the new population is generated and named Q2.
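A Python sketch of this within-gene crossover (our own rendering of the step-by-step description above; the capacity code travels with its weapon, and each child remains a permutation so Eq. (5) is never violated):

```python
def cross_one(w, c, prefix):
    """Place `prefix` into the first positions of gene w by internal swaps,
    carrying the capacity code c along; w stays a permutation throughout."""
    w, c = list(w), list(c)
    for pos, val in enumerate(prefix):
        j = w.index(val)
        w[pos], w[j] = w[j], w[pos]
        c[pos], c[j] = c[j], c[pos]   # capacity code follows its weapon
    return w, c

def crossover(w1, c1, w2, c2, k):
    """Each child takes the other parent's first k elements via internal
    swaps; no elements are exchanged between the two genes."""
    child1 = cross_one(w1, c1, w2[:k])
    child2 = cross_one(w2, c2, w1[:k])
    return child1, child2
```

Applied to the worked example with k = 4, this procedure yields children whose first four elements match the other parent's first four elements, consistent with the steps above.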
A few particles in the population are selected to mutate with probability p2. Suppose W1 is the mutation particle, with affiliated codes T1 and C1 as in Figure 4. The mutation operation is designed as shown in Figure 4.
Figure 4. Mutation example.
(a) Before mutation:
W1 = [8 4 3 2 1 9 5 6 7]
T1 = [1 2 3 4 5 6 4 1 6]
C1 = [4 1 1 2 2 1 1 2 4]
(b) After mutation:
W1' = [8 6 3 2 1 9 5 4 7]
T1 = [1 2 3 4 5 6 4 1 6]
C1' = [4 3 1 2 2 1 1 2 4]
Matrix P (rows 1-9, columns 1-6):
      1     2     3     4     5     6
  1   0.91  0.16  0.96  0.66  0.32  0.45
  2   0.13  0.97  0.66  0.17  0.95  0.65
  3   0.91  0.96  0.04  0.71  0.03  0.71
  4   0.63  0.49  0.85  0.03  0.44  0.75
  5   0.10  0.80  0.93  0.28  0.38  0.28
  6   0.28  0.14  0.68  0.05  0.77  0.68
  7   0.55  0.42  0.76  0.10  0.80  0.66
  8   0.96  0.92  0.74  0.82  0.19  0.16
  9   0.96  0.79  0.39  0.69  0.49  0.12
Step 1: k1 and k2 are random integers from 1 to n. The value of W1(k1) is exchanged with that of W1(k2). Setting k1=2 and k2=8, W1'(k1)=6 and W1'(k2)=4 after the exchange.
Let P0 be the initial population and Pi the ith generation, each with M particles. The steps for generating the next generation Pi+1 are as follows:
Step 1: Pi undergoes selection, crossing and mutation, and becomes Q3.
Step 2: Q3 and Pi are mixed together to form the population P' with 2M particles. The objective values of these particles are then calculated.
Step 3: According to these values, the particles are sorted in descending order, and the first M particles are selected to form the next generation Pi+1.
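A compact sketch of this elitist generation step (our own illustration: selection is a roulette pick, mutation is simplified to a single swap, which keeps each gene a permutation, and fitness is assumed positive):

```python
import random

def next_generation(pop, fitness, M):
    """One elitist step: build M offspring by roulette selection plus a
    swap mutation, merge parents and offspring into 2M particles, sort by
    descending objective value, and keep the best M."""
    total = sum(fitness(w) for w in pop)
    def roulette():
        r, acc = random.uniform(0, total), 0.0
        for w in pop:
            acc += fitness(w)
            if acc >= r:
                return list(w)
        return list(pop[-1])
    offspring = []
    for _ in range(M):
        child = roulette()
        i, j = random.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]   # swap keeps the permutation
        offspring.append(child)
    merged = pop + offspring                      # 2M particles
    merged.sort(key=fitness, reverse=True)        # descending objective value
    return merged[:M]
```

Because the parents are merged back in before truncation, the best particle found so far can never be lost between generations.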
3. Sample
Figure panels (a)-(d): objective value f(X) versus the number of iterations for four runs, rising from about 2.0 to about 2.4 within 200 iterations; in one run the optimum f(X) = 2.408 is reached at iteration 53 (cf. Table 1).
A. The higher a weapon's intercept probability against a target, the more likely that weapon is to be assigned to that target.
B. More equipment and more ammunition tend to be given to targets with high threat levels.
C. More equipment tends to be given to a target against which all equipment has poor intercept probability.
D. The application run shown takes about 0.40 seconds, an order of magnitude less than a similar-size example in the literature [6].
Table 1. The optimums of the four runs
Order         1         2         3         4
Iteration     146       104       53        131
f(X)          2.4011    2.4066    2.4078    2.4078
f1(X)         2.8511    2.8566    2.9078    2.9078
f2(X)         9         9         10        10
Time          0.3541s   0.4107s   0.4732s   0.3988s
X (rows 1-9)  000100    000100    001000    001000
              000010    010000    000010    000010
              010000    000100    010000    010000
              000001    000001    000001    000001
              001000    001000    010000    010000
              000001    000010    000001    000001
              000010    000010    000010    000010
              000100    000100    000200    000200
              100000    100000    100000    100000
4. Conclusions
The WTA model proposed in this paper conforms well to air defense operations, according to Conclusions A, B and C in the sample.
A new genetic coding method is presented, and its crossing, mutation and selection operators are designed. Compared with the traditional coding method, the coding length is n, shortened by a factor of m for assigning n weapons to m targets. The suggested algorithm therefore yields a significant improvement in computational efficiency, as confirmed in Conclusion D of the sample. The sample shows that the algorithm is good at this optimization. It should be noted that the weights L1 and L2 play an important role in balancing f1(X) and f2(X); determining these weights is a topic for our further research.
References
[1] M. Dorigo and C. Blum, Ant Colony Optimization Theory: A Survey, Theoretical Computer Science, 344(2005):243-278.
[2] S. Chen and T. Hu, Weapon-target Assignment with Multi-objective Non-dominated Set Ranking
Genetic Algorithm, Ship Electronic Engineering, in Chinese, 35(2015):54-57.
[3] C. L. Fan, Q. H. Xing, M. F. Zheng, et al. Weapon-target allocation optimization algorithm based on
IDPSO, Systems Engineering and Electronics, in Chinese, 37(2015):336-342.
[4] O. Karasakal, Air defense missile-target allocation models for a naval task group, Computers &
Operations Research, 35(2008): 1759-1770.
[5] C. G. Xue, Enterprise Information System Adaptability Optimization Based on Cloud Co-evolution
Algorithm, Industrial Engineering and Management, in Chinese, 18(2013): 47-53.
[6] C. L. Fan, Q. H. Xing, M. F. Zheng and Z. J. Wang, Weapon-target allocation optimization algorithm
based on IDPSO, Systems Engineering and Electronics, in Chinese, 37(2015):336-342.
Fuzzy Systems and Data Mining II 267
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-267
Introduction
1 Corresponding Author: Xiao-Fei LI, Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai; The College of Mathematics and Computer, WuYi University, FuJian, China; E-mail: lixiaofei_73@163.com.
268 X.-F. Li et al. / PMDA-Schemed EM Channel Estimator for OFDM Systems
A baseband equivalent OFDM system is depicted in Figure 1. Each data stream is first fed into a serial-to-parallel (S/P) converter and modulates the corresponding sub-carriers by MPSK or MQAM; an M-point inverse fast Fourier transform (IFFT) is then applied, with a cyclic prefix (CP) added in front. To maximize capacity or minimize BER under given constraints, the modulation on each sub-carrier can be altered by various schemes. Here, for simplicity, only QPSK is used on all the sub-carriers. The modulated data stream, denoted by the complex-valued symbols X(0),...,X(m),...,X(M−1), is transformed by the IFFT, and the output symbols are x(0),...,x(k),...,x(M−1).
To avoid intersymbol interference (ISI), CP symbols replicating the tail of the IFFT output are added to the head of each frame. After being shifted back to a serial data stream, the data are conveyed over the frequency-selective channel.
After discarding the prefix, the received time-domain samples are

y(k) = Σ_{l=0}^{L-1} ξ_l x(k-l) + η(k), 0 ≤ k ≤ M-1, (1)

where x(k) = (1/√M) Σ_{m=0}^{M-1} X(m) e^{j2πmk/M}, 0 ≤ k ≤ M-1. The CIR taps ξ_l (0 ≤ l ≤ L-1) form an i.i.d. complex-valued Gaussian random sequence, η(k) (0 ≤ k ≤ M-1) are additive white Gaussian noise (AWGN) variables with zero mean and variance σ² for both real and imaginary components, and L is the length of the time-domain CIR. The received data frame in the frequency domain is

Y(m) = (1/√M) Σ_{k=0}^{M-1} y(k) e^{-j2πmk/M}, 0 ≤ m ≤ M-1. (2)
In this paper, the PMDA algorithm yields an estimate of the entire observed posterior of H in order to specify a normal approximation to it, instead of just a maximizer and the curvature at that point. To compute the observed posterior, a sample X1, ···, XM is drawn from g(Y|X, H). The weights are assigned as follows:
ω_i = g(Y|X_i) / Σ_{i=1}^{M} g(Y|X_i), 1 ≤ i ≤ M,

and the Monte Carlo approximation Q(H|H^(it)) = Σ_{i=1}^{M} log g(Y|X_i, H) is replaced with the weighted form

Q(H|H^(it)) = Σ_{i=1}^{M} ω_i log g(Y|X_i, H),

where the original sample is updated with the new information at iteration it through the weights [32].
The iterative algorithm is described as follows:
1. Initialize the sample size M and H^(0); generate X1, ···, XM ∼ g(Y|X, H) via a Monte Carlo algorithm.
At iteration it + 1:
2. Compute the importance weights ω_i, 1 ≤ i ≤ M.
3. E-step: estimate Q(H|H^(it)) by the weighted average Q(H|H^(it)) = Σ_{i=1}^{M} ω_i log g(Y|X_i, H) / Σ_{i=1}^{M} ω_i.
4. M-step: maximize Q(H|H^(it)) to obtain H^(it+1) = argmax_H Q(H|H^(it)).
5. Compute the difference |Ĥ^(it+1) − Ĥ^(it)| between successive estimates; if it is below a predetermined threshold, terminate the iteration and output the final decision X̂, otherwise set it ← it + 1 and repeat from step 2.
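The weighted E-step can be illustrated on a toy model. The sketch below is our own drastic simplification (a real scalar channel h with BPSK symbols standing in for the OFDM setting): one importance-weighted Monte Carlo EM step weights a sample drawn from the prior by the likelihood instead of sampling from the exact posterior, which is the idea behind the PMDA-style weighting.

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_em_step(y, h, sigma, M=500):
    """One importance-weighted Monte Carlo EM step for y = h*x + noise,
    x in {-1, +1}: draw X_i from the prior, weight each draw by the
    likelihood g(y | X_i, h), then maximize the weighted Q over h
    (a weighted least-squares M-step)."""
    X = rng.choice([-1.0, 1.0], size=(M, len(y)))       # X_i ~ prior
    logw = -((y - h * X) ** 2).sum(axis=1) / (2 * sigma ** 2)
    w = np.exp(logw - logw.max())                       # numerically stable
    w /= w.sum()                                        # normalized weights
    num = (w[:, None] * X * y).sum()                    # weighted M-step
    den = (w[:, None] * X * X).sum()
    return num / den
```

Starting from a rough initial h, a few such steps drive the estimate toward the true channel gain, mirroring the convergence behavior reported for the PMDA estimator.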
Figure 2. BER vs. SNR for Rayleigh fading channels with fdT ≤ 0.01
Figure 3. MSE vs. SNR for Rayleigh fading channels with fdT ≤ 0.01
than the EM estimator. Therefore, fewer iterations are needed to reach convergence when PMDA is used as a starting point for the data augmentation estimates.
4. Conclusion
In this paper, a PMDA-based estimator is proposed to efficiently estimate the CIR in an OFDM system. The PMDA-based channel estimator yields an approximation to the observed posterior to reduce the computational load of the E-step. The simulations reveal that the BER and MSE performance of PMDA is very close to that of the theoretic estimator, much better than that of the EM estimator, and approaches the CRLB at high SNR; moreover, the PMDA estimator converges faster than the EM estimator. The simulation results show that the performance is acceptable when the SNR is larger than 10 dB. In the low-SNR region, a channel coding scheme can be used to improve the performance of the PMDA estimator. This confirms that the PMDA algorithm is more efficient than the EM estimator. Our further work is to estimate multiple-input multiple-output (MIMO) channels in MIMO-OFDM systems.
Acknowledgements
This research work is supported by the Important National Science and Technology
Specific Project of China under Grant No.2016ZX03001022-006, National Natural
Science Foundation of China under Grant Nos.91438113, 61571064 and 61370176,
Education Department A Class Project in FuJian Province under Grant No. JA15515,
and Science Research Project of WuYi University under Grant No. XL201012.
References
[1] W. W. Ren, L. Z. Liu. A novel iterative symbol detection of ofdm systems in time and frequency selective
channels. IEEE Conference General Assembly and Scientific Symposium (URSI GASS), (2014):1-4.
[2] S. Zettas, S. Kasampalis, P. Lazaridis; Z. D. Zaharis, J. Cosmas. Channel estimation for OFDM systems
based on a time domain pilot averaging scheme.16th International Symposium on Wireless Personal
Multimedia Communications (WPMC). (2013):1-6.
[3] Y. S. Liu, Z. H. Tan, H. J. Hu, et al. Channel estimation for ofdm. IEEE communications surveys &
tutorials, 16(2014):1891-1908.
[4] S. M. Riazul Islam, Kyung Sup Kwak. Two-stage channel estimation with estimated windowing for MB-
OFDM UWB system. IEEE Communications Letters, 20(2016): 272-275.
[5] M. Hajjaj, W. Chainbi, R. Bouallegue. Low-rank channel estimation for MIMO MB-OFDM UWB
system over spatially correlated channel. IEEE Wireless Communications Letters, 5(2016): 48-51.
[6] V. Pohl, P. H. Nguyen, V. Jungnickel, and C. V. Helmolt. How often channel estimation is needed in
MIMO systems, in Proc. IEEE Global Telecommun. Conf., San Francisco, Calif, USA, (2003): 814-
818.
[7] D. He. Chaotic Stochastic Resonance Energy Detection Fusion Used in Cooperative Spectrum Sensing,
IEEE Transactions on Vehicular Technology, 62 (2013):620-627.
[8] X.Q. Ma, H. Kobayashi and S. C. Schwartz. EM-Based Channel Estimation algorithm for OFDM.
Journal on Applied Signal Processing, 10(2004):1460-1477.
[9] R. Carvajal, B. I. Godoy, J. C. Aguero, J. I. Yuz, and W. Creixell. EM-based ML channel estimation in
OFDM systems with phase distortion using RB-EKF. 17th International Symposium on Wireless
Personal Multimedia Communications (WPMC2014). (2014): 232-237.
[10] J. W. Choi, S. C. Kim, J. H. Lee, Y. H. Kim. Joint channel and phase noise estimation for full-duplex
systems using the EM algorithm. 2015 IEEE 81st Vehicular Technology Conference (VTC Spring):1-5.
[11] R. Carvajal, J. Agüero, B. I. Godoy, D. Katselis. EM-based sparse channel estimation in OFDM
systems with q−norm regularization in the presence of phase noise and frequency offset. 2015 7th IEEE
Latin-American Conference on Communications (LATINCOM). (2015): 1 - 6
[12] A. Assra; J. X. Yang; B. Champagne. An EM approach for cooperative spectrum sensing in
Multiantenna CR networks. IEEE Transactions on Vehicular Technology. 65(2016): 1229-1243
[13] R. Carvajal, J. C. Aguero, B. I. Godoy, and D. Katselis. A MAP approach for q-norm regularized
sparse parameter estimation using the EM algorithm, in Proc. of the 25th IEEE Int. Workshop on
Mach. Learning for Signal Process (MLSP 2015), Boston, USA, (2015): 1-6
[14] C. H. Cheng; H. L. Hung; J. H. Wen. Application of expectation-maximisation algorithm to channel
estimation and data detection techniques in ultra-wideband systems. IET Communications. 6(2012):
2480 - 2486
[15] M. L. Ku, W. C. Chen, C. C. Huang: EM-based iterative receivers for OFDM and BICM /OFDM
systems in doubly selective channels, IEEE Transaction Wireless. Communication, 10(2011):1405–
1415.
[16] M. Marey, M. Samir, and O. A. Dobre: EM-based joint channel estimation and IQ imbalances for
OFDM systems, IEEE Transaction and Broadcasting, 58 (2012): 106–113
[17] R. Carvajal, J. C. Aguero, B. I. Godoy, G. C. Goodwin: EM-Based Maximum-Likelihood Channel
Estimation in Multicarrier Systems With Phase Distortion. IEEE Transactions on Vehicular
Technology. 62(2013): 152-160.
[18] M. Hajjaj; W. Chainbi; R. Bouallegue: Two-step LMMSE channel estimator for Unique Word MB-
OFDM based UWB systems. 2015 International Wireless Communications and Mobile Computing
Conference (IWCMC). (2015): 1012-1016.
[19] V. Savaux; Y. Louet; F. Bader. Low-complexity approximations for LMMSE channel estimation in
OFDM/OQAM. 23rd International Conference on Telecommunications (ICT). (2016): 1-5.
[20] S. M. Key: Fundamentals of Statistical Signal Processing: Estimation Theory, Prentice-Hall, 1998:595.
[21] S. Haykin: Adaptive Filter Theory, Prentice-Hall, 3td Edition, 1996:989.
[22] Y. Srivastava, H. C. Keong, H. W. F. Patrick, S. Sumei: Robust MMSE channel estimation in OFDM
systems with practical timing synchronization, 2004 iEEE Wireless Communications and Networking
Conference, 2(2004):711-716.
[23] B. Dulek; O.Ozdemir; P. K.Varshney; W.Su.Distributed Maximum Likelihood Classification of Linear
Modulations Over Nonidentical Flat Block-Fading Gaussian Channels. IEEE Transactions on Wireless
Communications. 14(2015): 724-737
[24] G. X. Zhou, W. Xu, G. Bauch. Efficient Maximum Likelihood Detection with Imperfect Channel State
Information for Interference-limited MIMO Systems. SCC 2015; 10th International ITG Conference on
Systems, Communications and Coding; Proceedings of. 2015:1-6
[25] I. Ngebani, Y. B. Li, X. G. Xia, M. J. Zhao. EM-based phase noise estimation in vector ofdm systems
using linear MMSE receivers. IEEE Transactions on Vehicular Technology. 65(2016): 110-122.
[26] Hayder Al-Salihi; Mohammad Reza Nakhai. An enhanced whitening rotation semi-blind channel
estimation for massive MIMO-OFDM. 2016 23rd International Conference on Telecommunications
(ICT). 2016:1-6
[27] W. Feng; J. L. Li; L. Zhang. Blind channel estimation combined with matched field processing in
underwater acoustic channel. OCEANS 2016 - Shanghai, 2016: 1-4
[28] W. Feng, W. P. Zhu, M. N. S. Swamy, A semiblind channel estimation approach for MlMO-OFDM
Systems, IEEE Transactions on Signal Processing., 56(2008):2821-2834.
[29] G. C. G. Wei and M. A. Tanner: A Monte Carlo Implementation of the EM Algorithm and the Poor
Man’s Data Augmentation, Journal of the American Statistical Association, 85(1990): 699-704.
[30] M. A. Tanner and W. W. Hung: The Calculation of Posterior Distributions by Data Augmentation.
Journal of the American Statistical Association, 82(1987):528-540, B.D. Ripley: Stochastic Simulation,
New York: John Wiley.
[31] G. J. McLachaln and T. Krishnan. The EM Algorithm and Extensions (second Edition). Wiley. A John
Wiley & Son, Inc., Publication. 2008.
[32] L. Tierney, R. E. Kass, and J. B. Kadane: Fully Exponential Lapalace Approximations to Expectation
and Variance of Nonpositive Functions, Journal of the American Statistical Association: theory and
method, (1989):710-716.
[33] X.Q. Ma; H. Kobayashi; S. C. Schwartz: An EM-based estimation of OFDM signals. 2002 IEEE
Wireless Communications and Networking Conference. 1(2002): 228 - 232.
274 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-274
Soil Heavy Metal Pollution Research
W.-W. Sun and X.-P. Sheng
Abstract. Heavy metals in soil not only affect the growth of plants but also harm
people's health through the food chain; they may cause problems such as air and
water pollution and impair the ecological function of urban soil. Therefore, in order
to improve the living environment and solve the pollution problem thoroughly, the
causes of heavy metal pollution must be identified. In this article, a mathematical
model is established through pollution index evaluation, statistical analysis, and
BP-network spatial interpolation. The degree, causes, and source locations of soil
heavy metal pollution are obtained, providing an important basis for environmental
protection and urban development.
Introduction
Soil is an important part of the urban ecosystem. It is necessary to verify anomalies
of the soil environment and to evaluate urban environmental quality from large amounts
of data. Research on the evolution of the urban soil environment under the influence of
human activity is of great significance for urban ecological construction, agricultural
food safety, people's physical health, and sustainable development. In this paper, the
content and spatial distribution of soil heavy metal elements such as Cu, Zn, Pb, Cd, Ni,
Cr, As, and Hg in a certain city are discussed by statistical analysis. The locations of
pollution sources are determined using a BP network. The influencing factors of heavy
metal pollution and its potential hazard to the environment are identified. The
distributions of heavy metal forms in soil under different urban activities provide a
reference for evaluating the environmental effects of heavy metals and safeguarding the
physical and mental health of urban residents.
1. Samples Collection
Since the influence of human activities on the environment differs among areas, the
city is divided into five functional areas: living area, industrial area, mountainous area, main
1
Corresponding Author: Wei-Wei SUN, lecturer, School of Mathematics and Statistics, Fuyang Normal
College, Fuyang, Anhui, China 236041; E-mail: 93692849@qq.com.
W.-W. Sun and X.-P. Sheng / Soil Heavy Metal Pollution Research 275
road area, and park green area [1], denoted as class 1 to class 5 areas, respectively. In
order to comprehensively analyze the urban soil heavy metal pollution problem, soil
samples are collected first. To ensure good representativeness, a total of 319 soil
samples are taken across the functional areas using an approximate grid method
(1 km x 1 km). The location, altitude, and functional area of each sample are recorded
by GPS, as shown in Table 1. Then the concentrations of the heavy metals Cu, Zn, Pb,
Cd, Ni, Cr, As, and Hg in each sample are measured with special equipment, as shown
in Table 2. In addition, samples are taken in natural areas away from crowds and
industry, at intervals of two kilometers, to serve as the soil element background values
of the city, as shown in Table 3.
Table 1. Location and functional area of samples

Table 2. Concentration of heavy metals in each sample

Sample   As       Cd        Cr       Cu        Hg         Ni       Pb        Zn
number   (μg/g)   (ng/g)    (μg/g)   (μg/g)    (ng/g)     (μg/g)   (μg/g)    (μg/g)
1        7.84     153.80    44.31    20.56     266.00     18.20    35.38     72.35
2        5.93     146.20    45.05    22.51     86.00      17.20    36.18     94.59
3        4.90     439.20    29.07    64.56     109.00     10.60    74.32     218.37
4        6.56     223.90    40.08    25.17     950.00     15.40    32.28     117.35
5        6.35     525.20    59.35    117.53    800.00     20.20    169.96    726.02
6        14.08    1092.90   67.96    308.61    1040.00    28.20    434.80    966.73
7        8.94     269.80    95.83    44.81     121.00     17.80    62.91     166.73
8        9.62     1066.20   285.58   2528.48   13500.00   41.70    381.64    1417.86
9        7.41     1123.90   88.17    151.64    16000.00   25.80    172.36    926.84
……
318      7.56     63.50     33.65    21.90     60.00      12.50    41.29     60.50
319      9.35     156.00    57.36    31.06     59.00      25.80    51.03     95.90
Table 3. Background values of soil elements

Element     Average   Standard deviation   Scope
As (μg/g)   3.6       0.9                  1.8~5.4
Cd (ng/g)   130       30                   70~190
Cr (μg/g)   31        9                    13~49
Cu (μg/g)   13.2      3.6                  6.0~20.4
Hg (ng/g)   35        8                    19~51
Ni (μg/g)   12.3      3.8                  4.7~19.9
Pb (μg/g)   31        6                    19~43
Zn (μg/g)   69        14                   41~97
2. Pollution Evaluation
At present, common evaluation methods for soil heavy metal pollution include the
accumulated index method, the pollution index method, the potential ecological harm
index method, and so on [2]. In this paper, the single-factor index method and the
Nemerow comprehensive pollution index method are used to evaluate the heavy metal
pollution of the different soil types. The formulas are [3]
$$p_i = \frac{c_i}{s_i}, \qquad p_c = \sqrt{\frac{\left(\max_i p_i\right)^2 + \left(\frac{1}{n}\sum_{i=1}^{n} p_i\right)^2}{2}} \qquad (1)$$
where $c_i$ is the measured concentration of heavy metal $i$, $s_i$ is its background
(standard) value, and $n$ is the number of elements.
Table 4. Grading of the Nemerow comprehensive pollution index

Comprehensive index   p_c ≤ 1    1 < p_c ≤ 2   2 < p_c ≤ 3   3 < p_c ≤ 5          p_c ≥ 5
Pollution degree      security   cordon        light         moderate pollution   high pollution
                                               pollution
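The evaluation of Eq. (1) and the Nemerow grading can be sketched as follows. This is an illustrative Python sketch (the paper's own calculations were done in MATLAB); it uses the sample-1 concentrations from Table 1 and the background values from Table 3, and the function names are ours, not the paper's.

```python
# Single-factor index p_i = c_i / s_i and Nemerow comprehensive index p_c (Eq. 1),
# with the grading thresholds of the Nemerow index.

def single_factor(c, s):
    """p_i = c_i / s_i for each element."""
    return [ci / si for ci, si in zip(c, s)]

def nemerow(p):
    """p_c = sqrt((max(p)^2 + mean(p)^2) / 2)."""
    mean_p = sum(p) / len(p)
    return ((max(p) ** 2 + mean_p ** 2) / 2) ** 0.5

def grade(pc):
    """Pollution degree of a comprehensive index value."""
    if pc <= 1:
        return "security"
    if pc <= 2:
        return "cordon"
    if pc <= 3:
        return "light pollution"
    if pc <= 5:
        return "moderate pollution"
    return "high pollution"

# Sample 1 (Table 1) and background values (Table 3), order As Cd Cr Cu Hg Ni Pb Zn:
c = [7.84, 153.80, 44.31, 20.56, 266.00, 18.20, 35.38, 72.35]
s = [3.6, 130, 31, 13.2, 35, 12.3, 31, 69]
p = single_factor(c, s)
pc = nemerow(p)
```

For this single sample the Hg index dominates (p = 266/35 = 7.6), pushing the comprehensive index into the high-pollution band.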
W.-W. Sun and X.-P. Sheng / Soil Heavy Metal Pollution Research 277
Combined with the data, the single-factor index p_i and Nemerow comprehensive
index p_c of the eight heavy metals for the five functional areas of the city are
obtained by Eq. (1), as shown in Table 5.
Table 5. Pollution index of soil heavy metal

Area               As     Cd     Cr     Cu     Hg     Ni     Pb     Zn     p_c
Living area        1.53   2.28   1.70   3.57   5.96   1.39   1.93   2.84   4.516
Industrial area    1.59   2.40   1.80   4.59   9.69   1.44   2.01   3.16   7.248
Mountainous area   1.48   2.08   1.34   2.58   5.92   1.31   1.68   1.99   4.487
Main road area     1.57   2.33   1.73   4.17   8.58   1.40   1.99   2.92   6.448
Park green area    1.56   2.27   1.70   3.49   5.80   1.39   1.91   2.78   4.498
Comparing the results, there are some differences in soil heavy metal pollution
among the five functional areas. Specifically:
(1) Hg in all five areas belongs to high pollution; in particular, its pollution
indices in the industrial and main road areas are 9.69 and 8.58, far exceeding the
standard. The four elements As, Cr, Ni, and Pb all belong to light pollution. Cd and Zn
both belong to moderate pollution. Cu is high pollution in the industrial and main road
areas and moderate pollution in the other three.
(2) From the comprehensive pollution index it can be seen that the industrial area
and main road area belong to high pollution, while the remaining three areas are
moderate pollution. In decreasing order of pollution degree: industrial area, main road
area, living area, park green area, mountainous area.
(3) Integrating all the data, the concentrations of all heavy metals in the industrial
and main road areas are significantly higher than in the other functional areas. This
shows that industrial and traffic pollution has become the dominant pollution source in
the city.
3. Cause of Pollution
Heavy metals in soil not only affect the growth of plants but also harm people's health
through the food chain, and may cause problems such as air and water pollution.
Therefore, in order to improve the living environment and thoroughly solve the
pollution problem, the causes of heavy metal pollution must be identified.
Table 6. Statistics of heavy metal content in soil

                            As     Cd       Cr     Cu      Hg      Ni     Pb      Zn
Standard deviation          3.02   225.27   8.36   12.75   42.30   9.93   49.98   22.39
Variation coefficient (%)   53     74       11     23      14      57     81      11
Overstandard rate (%)       77.4   79.6     80.6   88.4    66.5    75.2   81.5    79.0
Background value            3.6    130      31     13.2    35      12.3   31      69
First, the mean, standard deviation, and variation coefficient of the eight heavy metals
in the city's soil are calculated using MATLAB; the specific results are shown in
Table 6.
From the data in Table 6 we can see the following:
(1) The average contents of all eight heavy metals in the soil are higher than the
background values. As, Cd, Cr, Cu, Hg, Ni, Pb, and Zn are respectively 1.58, 2.33, 1.73,
4.17, 8.56, 1.40, 1.99, and 1.40 times the background value. Cu and Hg have
accumulated to a considerable extent, mainly from industrial activity and human
traffic activities.
(2) According to the rough classification rule for the variation coefficient, values
from 28.8% to 60.62% belong to moderate variation. In this city, As, Cd, and Pb in soil
show strong variation; Ni shows medium variation; Cr, Cu, Hg, and Zn show weak
variation. The variation coefficients of the eight elements range from 11% to 81%, a
very large spread. It is apparent that the soil pollution may be affected by an
unreasonable layout of human activities and by the influence of enterprises and road
traffic.
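The statistics behind Table 6 can be sketched as follows. This is a Python illustration rather than the paper's MATLAB code; the nine As values are only the first nine samples of Table 1, so the resulting numbers are illustrative and do not reproduce Table 6, which uses all 319 samples.

```python
# Mean, (population) standard deviation, variation coefficient, and
# over-standard rate for one element across a set of samples.

def describe(values, background):
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = var ** 0.5
    cv = std / mean * 100                                   # variation coefficient, %
    over = sum(v > background for v in values) / n * 100    # over-standard rate, %
    return mean, std, cv, over

# As concentrations (ug/g) of samples 1-9 from Table 1; background 3.6 from Table 3:
as_values = [7.84, 5.93, 4.90, 6.56, 6.35, 14.08, 8.94, 9.62, 7.41]
mean, std, cv, over = describe(as_values, background=3.6)
```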
Second, geochemical studies show that elements with similar origins often have
good correlation [5]. Therefore, heavy metal elements with higher statistical correlation
have similarities in origin. Factor analysis with the SPSS statistical software yields the
correlation coefficients between the heavy metals shown in Table 7.
Table 7. Correlation coefficient of heavy metal content in soil
Element As Cd Cr Cu Hg Ni Pb Zn
As 1
Cd 0.2547 1
Cr 0.1890 0.3524 1
Cu 0.1597 0.3967 0.5316 1
Hg 0.0644 0.2647 0.1032 0.4167 1
Ni 0.3166 0.3294 0.7158 0.4946 0.1029 1
Pb 0.2899 0.6603 0.3828 0.5200 0.2981 0.3068 1
Zn 0.2469 0.4312 0.4243 0.3873 0.1958 0.4364 0.4937 1
Table 7 shows that the correlation between Cr and Ni is the strongest, indicating
that their sources are roughly the same. Their correlation coefficient is the maximum,
0.7158, indicating that Cr and Ni in the soil have the closest relationship and that their
contents influence each other. Next, Cd and Pb are significantly positively correlated,
implying that a similar process controls the distribution features of these heavy metal
elements in the soil.
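The entries of Table 7 are plain Pearson correlation coefficients. A minimal sketch (Python rather than the paper's SPSS; the Cr and Ni vectors here are only the first nine samples of Table 1, so the value differs from the 0.7158 computed over all 319 samples):

```python
# Pearson correlation coefficient between two concentration vectors.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Cr and Ni (ug/g) of samples 1-9 from Table 1:
cr = [44.31, 45.05, 29.07, 40.08, 59.35, 67.96, 95.83, 285.58, 88.17]
ni = [18.20, 17.20, 10.60, 15.40, 20.20, 28.20, 17.80, 41.70, 25.80]
r = pearson(cr, ni)   # strongly positive on this small subset
```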
4. Location of Pollution Sources
If the concentration of a heavy metal reaches its maximum value at a certain place,
that place is the location of the pollution source [6]. Following this idea, we try to
establish a functional relation between the concentration U of a heavy metal and the
three-dimensional coordinates (x, y, h) of the sample points, using the 319 data points
to interpolate and fit the ternary function U(x, y, h) and then calculating the maximum
of the function. However, since MATLAB cannot perform common interpolation and
fitting of four-dimensional scattered data, we first considered a statistical regression
model. To find the relation between the concentration U and the coordinates x, y, h,
three scatter plots of U against x, y, and h are made from the data; those for element As
are shown in Figure 1.
To make full use of the sample data and to locate the spatial position of the maximum
concentration of every heavy metal in the urban area, this article adopts a BP neural
network for densified spatial interpolation [7]. A BP network can learn and store a
large number of input-output mappings without the mathematical equations describing
the mapping relationship being revealed in advance [8]. Therefore, the elevation h can
be effectively integrated into the network, improving its stability and precision. The
specific algorithm is as follows:
(1) Determine the topology of the BP network. The input and output nodes are
determined by the problem: there are three input nodes, the three-dimensional
coordinates x, y, h of a sample point, and eight output nodes, the concentrations of the
eight heavy metals. The number of hidden layer nodes is given by the empirical
formula [9]
$$n = \sqrt{n_i + n_o} + E \qquad (2)$$
where $n_i$ and $n_o$ are the numbers of neurons in the input and output layers,
respectively, and E is an integer between 1 and 10.
(2) Initialization. The initial weights, learning rate, error accuracy, and maximum
number of iteration steps are set.
(3) For each training sample, the forward error is calculated. If the error is greater
than the required accuracy, the weights are modified backwards layer by layer. When
the error accuracy or the maximum iteration step is reached, the BP algorithm stops.
Thus the mapping between the heavy metal concentrations and the spatial location is
determined.
(4) Use the trained BP network for spatial interpolation. The sampling interval
between sample points is refined to 10 m, and the coordinates x, y, h of all densified
points are input into the BP network; the network then automatically calculates the
concentrations of the eight heavy metals.
(5) From the maxima of the heavy metal concentrations over all interpolation
points and sample points, the position coordinates of the pollution sources are
determined.
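Steps (1)-(3) can be sketched as follows. This is an illustrative pure-Python network with six sigmoid hidden units and, for brevity, a single linear output trained on a synthetic target; it is not the paper's MATLAB implementation or its full 3-6-8 topology, and all data are made up.

```python
import math
import random

random.seed(0)

N_IN, N_HID = 3, 6   # three inputs (x, y, h); six hidden nodes, cf. Eq. (2)

W1 = [[random.uniform(-1, 1) for _ in range(N_IN)] for _ in range(N_HID)]
b1 = [random.uniform(-1, 1) for _ in range(N_HID)]
W2 = [random.uniform(-1, 1) for _ in range(N_HID)]
b2 = random.uniform(-1, 1)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x):
    """Hidden activations and the (linear) output for one input point."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    return h, sum(w * hi for w, hi in zip(W2, h)) + b2

def train_step(x, t, lr=0.1):
    """One forward pass plus backward weight modification, step (3)."""
    global b2
    h, y = forward(x)
    err = y - t
    for j in range(N_HID):
        delta_h = err * W2[j] * h[j] * (1.0 - h[j])   # hidden delta (old W2)
        W2[j] -= lr * err * h[j]                      # output-layer update
        for i in range(N_IN):
            W1[j][i] -= lr * delta_h * x[i]           # hidden-layer update
        b1[j] -= lr * delta_h
    b2 -= lr * err
    return 0.5 * err * err

# Synthetic target: "concentration" rises with x and y and falls with height h.
data = [((x, y, h), 0.5 * x + 0.1 * y - 0.3 * h)
        for x in (0.0, 0.5, 1.0) for y in (0.0, 0.5, 1.0) for h in (0.0, 0.5, 1.0)]

loss_before = sum(0.5 * (forward(x)[1] - t) ** 2 for x, t in data)
for epoch in range(2000):
    loss = sum(train_step(x, t) for x, t in data)
```

Once trained, the same `forward` pass plays the role of step (4): it can be evaluated on a dense grid of (x, y, h) points to interpolate concentrations.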
250 sample data are randomly chosen as the training set, and the remaining 69 samples
as test data. The number of hidden layer nodes is taken as six by Eq. (2), so the
topology of the BP network here is 3-6-8. All initial network weights take random
values within [-1, 1], the learning rate is η = 0.9, the error accuracy is set to 0.0001,
and the maximum number of iteration steps is 10000. The results are shown in Tables 8
and 9.
Table 8. The experimental results of the BP network

Algorithm    Iteration steps   Error of training (1000 steps)   Recognition rate of test sample
BP network   304               4.5951e-08                       99.98%
Table 9. Height and functional area of the pollution source of each element

         As   Cd   Cr   Cu   Hg   Ni   Pb   Zn
h (m)    41   43   42   43   44   41   45   46
Area     4    4    4    2    2    4    1    4
From Table 9, we can see that the pollution sources of As, Zn, Cr, Ni, and Cd are all in
class 4 areas, namely the main road area. The pollution sources of Cu and Hg are both
in class 2 areas, namely the industrial area. The pollution source of Pb is in a class 1
area, namely the living area.
5. Conclusions
In this paper, statistical analysis and BP-network higher-dimensional interpolation are
used to study heavy metal pollution in soil. The pollution degree, causes, source
locations, and transmission characteristics are concluded. The model can not only be
applied to other heavy metal pollutants not mentioned in this article, but also extends
to other problems such as air and water pollution.
Acknowledgment
The first author is grateful to Associate Professor Hai Wu of Fuyang Normal College
for helpful discussions on soil pollution. The related works are supported by Natural
Science Research Project in Anhui Universities (2015KJ003, KJ2015A161) and
Natural Science Foundation in Anhui province (1508085MA12).
References
[1] C. M. Li. Spatial distribution characteristics of soil heavy metal in urban and influencing factors. Journal
of Jinzhong University, 31(2014):24-27.
[2] J. Tang, C. Y. Chen, H. Y. Li, et al. Assessment on potential ecological hazard and human health risk of
heavy metals in urban soil of Daqing city. Geographical Science, 31(2011):118-122.
[3] J. Yin, Y. L. Liu. Spatial Distribution and Pollution Evaluation of Heavy Metal in Shanghai
Urban-Suburb Soil. Modern Agricultural Science and Technology, 10(2010):251-255.
[4] H. D. Wang, F. M. Fang, H. F. Xie, et al. Pollution evaluation and source analysis of heavy metal in
urban soil of Wuhu city. Urban Environment and Urban Ecology, 23 (2010):36-40.
[5] Y. Qian, W. Zhang, D. C. Ran. The chemical speciation and influencing factors of heavy metals in
Qingdao urban soils. Environmental Chemistry, 30(2011): 652-657.
[6] J. J. Chen, H. H. Zhang, J. M. Liu, et al. Spatial distributions and controlled factors of heavy metals in
surface soils in Guangdong based on the regional geology. Ecology and Environmental Sciences,
20(2011):646-651.
[7] D. W. Hu, X. M. Bian, S. Y. Wang et al. Study on spatial distribution of farmland soil heavy metals in
Nantong City based on BP -ANN modeling. Journal of Safety and Environment, 7(2007):91-95.
[8] G. Li and P. Niu. An enhanced extreme learning machine based on ridge regression for regression.
Neural Computing and Application, 22(2013): 804-810.
[9] R. Zhang, Z. B. Xu, G. B. Huang. Global convergence of online BP training with dynamic learning rate.
IEEE Transactions on Neural Networks and Learning systems, 23(2012): 330-33.
282 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-282
An Improved Kernel Extreme Learning Machine for Bankruptcy Prediction
M.-J. Wang et al.
Abstract. In this study, a novel parameter tuning strategy for a kernel extreme
learning machine (KELM) is constructed using an improved particle swarm
optimization method based on differential evolution (EPSO). First, the proposed
EPSO is used to obtain the global optimum by introducing the differential
evolution mutation strategy. Then, the EPSO is used to construct an effective and
stable KELM model for bankruptcy prediction. The resultant EPSO-KELM model
is compared to two other competitive KELM methods based on traditional particle
swarm optimization and the genetic algorithm via a 10-fold cross validation
analysis. The experimental results indicate that the proposed method achieved
superior results compared to the other two methods when applied to two financial
datasets. When applied to the Polish bankruptcy dataset, the EPSO-KELM
achieved a classification accuracy (ACC) of 83.95%, an area under the receiver
operating characteristic curve (AUC) of 0.8443, and Type I error and Type II error
of 13.15% and 16.61%, respectively. In addition, the proposed method achieved an
ACC of 87.10%, AUC of 0.8716, and Type I error and Type II error of 15.53%
and 10.13%, respectively, when applied to the Australian dataset. Therefore, the
proposed EPSO-KELM model could be effectively used as an early risk warning
system for bankruptcy predication.
Introduction
1
Corresponding author: Hui-Ling CHEN, College of Physics and Electronic Information Engineering,
Wenzhou University, 325035, Wenzhou, China; E-mail: chenhuiling.jlu@gmail.com.
M.-J. Wang et al. / An Improved Kernel Extreme Learning Machine for Bankruptcy Prediction 283
[1], corporate life cycle [2], and corporate credit rating [3] models. However, ELMs
can yield inaccurate results when applied to most practical tasks. In order to improve
the accuracy of ELMs, Huang et al. [4] developed the kernel extreme learning machine
(KELM). In a KELM, the connection weights between hidden layers and the input
layer do not have to be generated randomly, improving the performance and training
speed of the decision-making process. Since its development, the KELM has been
widely applied to various fields.
However, the kernel bandwidth γ and penalty parameter C of a KELM
significantly influence its performance. The penalty C controls the relationship between
the complexity and fitting error minimization results of the model. The kernel
bandwidth γ defines the non-linear mapping from the input space to some
high-dimensional feature space. In recent years, methods inspired by biology, such as
the genetic algorithm [5], particle swarm optimization (PSO) [6], and artificial bee
colony [7] methods, have been used to determine these two key parameters. In this
study, an enhanced PSO strategy (EPSO) is developed by introducing the mutation
strategy based on differential evolution (DE) [8] in order to more effectively tune the
kernel bandwidth γ and penalty parameter C. In addition, the classification accuracy
(ACC), area under the receiver operating characteristic curve (AUC), Type I error,
and Type II error of the proposed EPSO-KELM model are compared to those of the
original PSO optimized KELM (PSO-KELM) and genetic algorithm optimized KELM
(GA-KELM). The experimental results indicate that the proposed EPSO-KELM
method more effectively detected enterprises at risk of bankruptcy.
The remainder of this paper is structured as follows. A brief description of the
proposed EPSO-KELM is presented in Section 1. The experimental design of the
proposed method is provided in Section 2. The results and discussion are presented in
Section 3. Lastly, the conclusions and recommendations for future studies are discussed
in Section 4.
1. Proposed Method
1.1. EPSO
During the PSO process, position updates are completed based on the conventional
strategy, in which each particle simply moves around the Pbest and Gbest without
re-diversifying the particle. Because the current best position of the Gbest may not be
the global optimum, each particle moves around itself before moving toward the Pbest
and Gbest. Thus, the following DE mutation strategy was implemented in the PSO
before performing position updates:
$$X_i^k = X_{r_2}^k + F \cdot \left(X_{r_3}^k - X_{r_4}^k\right) \qquad (1)$$
In this equation, $r_2$, $r_3$, and $r_4$ denote randomly generated particle indices in
the range [1, 2, ..., D] that are not equal to the current index i; D denotes
the number of particles in the PSO; and F is a mutation parameter called the scaling
factor. The scaling factor F, which controls the amplification of the difference and
prevents stagnation during the global search, was defined as 0.7 herein.
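The mutation of Eq. (1) can be sketched as follows. The swarm values are hypothetical and the helper name `de_mutate` is an illustrative choice, not from the paper:

```python
import random

F = 0.7   # scaling factor, as defined in the text

def de_mutate(positions, i, rng=random):
    """DE mutant of Eq. (1) for particle i:
    X_i = X_r2 + F * (X_r3 - X_r4), with r2, r3, r4 distinct and != i."""
    candidates = [j for j in range(len(positions)) if j != i]
    r2, r3, r4 = rng.sample(candidates, 3)
    return [positions[r2][d] + F * (positions[r3][d] - positions[r4][d])
            for d in range(len(positions[i]))]

random.seed(1)
# hypothetical 2-D particle positions, e.g. candidate (C, gamma) pairs:
swarm = [[random.uniform(-5.0, 5.0), random.uniform(-5.0, 5.0)] for _ in range(8)]
mutant = de_mutate(swarm, 0)   # re-diversified position for particle 0
```

In the full EPSO loop this mutant would replace (or compete with) the particle's position before the usual velocity and position updates.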
1.2. EPSO-KELM
In this study, a novel EPSO-KELM model was developed for parameter optimization
problems using a KELM with an RBF kernel. The proposed model consisted of two
procedures, including the inner parameter optimization and outer performance
evaluation procedures. During the inner parameter optimization procedure, the
parameters of the KELM were tuned dynamically using the EPSO strategy via a 5-fold
cross validation (CV) analysis. Then, the obtained optimal parameters were substituted
into the KELM prediction model and used to perform bankruptcy prediction
classification tasks in the outer loop via a 10-fold CV strategy.
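The two nested procedures can be sketched as follows; `tune` and `evaluate` are hypothetical stand-ins for the EPSO parameter search and the KELM classifier, and the fold-splitting helper is an illustrative implementation:

```python
# Nested cross validation: an inner 5-fold CV tunes parameters on each
# outer training split; the outer 10-fold CV measures performance.

def k_folds(items, k):
    """Split items into k roughly equal folds."""
    folds = [[] for _ in range(k)]
    for idx, item in enumerate(items):
        folds[idx % k].append(item)
    return folds

def nested_cv(samples, tune, evaluate, outer_k=10, inner_k=5):
    scores = []
    for outer in k_folds(samples, outer_k):
        train = [s for s in samples if s not in outer]
        params = tune(k_folds(train, inner_k))   # inner 5-fold parameter search
        scores.append(evaluate(params, train, outer))
    return sum(scores) / len(scores)

# toy stand-ins for the optimizer and classifier:
samples = list(range(100))
avg = nested_cv(samples,
                tune=lambda folds: {"C": 1.0, "gamma": 0.5},
                evaluate=lambda p, tr, te: 0.8)
```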
The classification accuracy was considered in the design of the fitness function,
written as:
$$f = \mathrm{avgAcc} = \frac{1}{K}\sum_{i=1}^{K} \mathrm{testAcc}_i \qquad (2)$$
where avgAcc denotes the average test accuracy achieved by the KELM classifier
according to the 5-fold cross validation strategy. The pseudo code of the proposed
method is detailed as follows:
Pseudo-code of the parameter optimization procedure
Begin
Set the initialization parameters, including the number of particles, the maximum/minimum of the
search space and velocity, and the maximum number of iterations;
For i = 1 to the number of particles
Initialize the position and velocity of each particle;
Code the C and γ according to the position of each particle and calculate the fitness
simultaneously;
C = position (i, 1);
γ = position (i, 2);
Fitness (i) = Function(C, γ);
end
Find the Pbest and Gbest at the current situation, save them for the next comparison;
[global_fitness ,bestindex] = max (Fitness);
Gbest = position (bestindex, :);
Pbest = position;
Local_fitness = Fitness;
For j = 1 to the max iterations
For k = 1 to the number of particles
Adopt the mutation strategy to diverse each particle;
end
For l = 1 to the number of particles
Update the positions and velocity of each particle;
end
For m =1 to the number of particles
Control the search space of each particle to avoid going out of the boundary of position and
velocity;
end
For n =1 to the number of particles
2. Experimental Studies
In this study, the Wieslaw dataset [9] was used to construct the decision system. The
Wieslaw dataset is comprised of 240 cases with 30 financial ratios. Of the 240 Polish
enterprises, 112 declared bankruptcy from 1997 to 2011. The remaining 128 enterprises
did not declare bankruptcy during this period. All of the observations in this period
occurred 2 to 5 years before bankruptcy. In order to further illustrate the performance
of the proposed method, a slightly larger financial dataset, the Australian credit dataset,
was also used. This dataset consists of 307 creditworthy applicants and 383
non-creditworthy applicants.
The proposed EPSO-KELM, PSO-KELM, and GA-KELM models were
implemented in the MATLAB platform. In order to prevent numerical difficulties when
performing the calculations, the data was scaled to a range of [-1, 1] before
constructing the model. In order to obtain unbiased results, the ACC, AUC, Type I
error, and Type II error of the EPSO-KELM, PSO-KELM, and GA-KELM were
obtained via a 10-fold CV. Then, the average results were used to compare the
performance of the methods. The same number of generations and population swarm
size were adopted in the EPSO, PSO, and GA in order to ensure the accuracy of the
results. According to the results of this preliminary experiment, all of the methods
yielded satisfactory classification results when 280 generations and a swarm size of 8
were adopted. The values of C and γ varied within the ranges C ∈ {2^-5, 2^-3, 2^-1, ..., 2^5}
and γ ∈ {2^-5, 2^-3, 2^-1, ..., 2^5}, respectively. The maximum velocities of the EPSO and PSO
were both approximately 65% of the dynamic ranges of the variable on each dimension,
with an acceleration coefficient of approximately 2.05 and an inertial weight of 1. The
mutation and crossover probabilities of the GA were approximately 0.7 and 0.4,
respectively.
The classification accuracy (ACC), AUC, Type I error, and Type II error results were
used to test the performance of the proposed EPSO-KELM model. These criteria can
be written as:
$$\mathrm{Acc} = \frac{TP + TN}{TP + FP + TN + FN} \times 100\% \qquad (3)$$

$$\text{Type I error} = \frac{FP}{FP + TN} \times 100\% \qquad (4)$$

$$\text{Type II error} = \frac{FN}{TP + FN} \times 100\% \qquad (5)$$
where TP denotes the number of true positives, FN denotes the number of false
negatives, TN denotes the number of true negatives, and FP denotes the number of
false positives. The AUC, the area under the ROC curve, is one of the most accurate
methods of comparing classifiers in two-class problems. The Type I error, defined as
FP/ (FP+TN), calculates the proportion of bankrupt cases incorrectly defined as
non-bankrupt cases. The Type II error, defined as FN/ (TP+FN), calculates the
proportion of non-bankrupt cases incorrectly defined as bankrupt cases.
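The four criteria can be computed from confusion-matrix counts as follows. This is an illustrative Python sketch with hypothetical counts, not results from the paper's experiments:

```python
# Accuracy and Type I / Type II errors from TP, FP, TN, FN counts.

def metrics(tp, fp, tn, fn):
    acc = (tp + tn) / (tp + fp + tn + fn) * 100
    type_i = fp / (fp + tn) * 100    # bankrupt cases classified as non-bankrupt
    type_ii = fn / (tp + fn) * 100   # non-bankrupt cases classified as bankrupt
    return acc, type_i, type_ii

# hypothetical confusion counts for one CV fold:
acc, t1, t2 = metrics(tp=90, fp=15, tn=100, fn=20)
```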
Table 1 displays the average ACC, AUC, Type I error, and Type II error results
obtained by the EPSO-KELM, PSO-KELM and GA-KELM using the two datasets.
According to the results, the performance of the proposed EPSO-KELM was superior
to that of the PSO-KELM and GA-KELM methods for both the Polish and Australian
datasets. The EPSO-KELM yielded an ACC of 83.95%, AUC of 0.8443, Type I error
of 13.15%, and Type II error of 16.61% when applied to the Polish dataset, and an
ACC of 87.10%, AUC of 0.8716, Type I error of 15.53%, and Type II error of 10.13%
when applied to the Australian dataset. In contrast, the PSO-KELM yielded an ACC of
82.19%, AUC of 0.8361, Type I error of 14.63%, and Type II error of 17.76% when
applied to the Polish dataset, and an ACC of 86.37%, AUC of 0.8638, Type I error of
16.62%, and Type II error of 10.60% when applied to the Australian dataset.
Furthermore, the GA-KELM yielded an ACC of 80.37%, AUC of 0.8078, Type I
error of 16.10%, and Type II error of 22.23% when applied to the Polish dataset, and
an ACC of 85.94%, AUC of 0.8583, Type I error of 17.06%, and Type II error of
11.34% when applied to the Australian dataset. These results indicate that the
EPSO-KELM achieved a higher classification accuracy than the other methods when
applied to bankruptcy prediction. The results also demonstrate that the solution quality
of the proposed EPSO-KELM was superior to that of the PSO-KELM and GA-KELM.
Table 1. Average ACC, AUC, Type I error, and Type II error of the two datasets.

                       Polish dataset                                Australian dataset
Methods      ACC      AUC      Type I error   Type II error   ACC      AUC      Type I error   Type II error
EPSO-KELM    0.8395   0.8443   0.1315         0.1661          0.8710   0.8716   0.1553         0.1013
PSO-KELM     0.8219   0.8361   0.1463         0.1776          0.8637   0.8638   0.1662         0.1060
GA-KELM      0.8037   0.8078   0.1610         0.2223          0.8594   0.8584   0.1706         0.1124
The standard deviation reflects whether the performance of a model is reliable. The
classification accuracy and standard deviation of each model after 10 runs of the
10-fold CV with the Polish dataset are displayed in Figure 1. In this figure, the vertical
coordinate of each node represents the ACC, while the length of the bar represents the
standard deviation.
In Figure 1, the green line represents the results obtained by the GA-KELM. As
shown, a relatively high degree of fluctuation was observed in the GA-KELM results.
The results obtained by the PSO-KELM, denoted by the blue line, also exhibited a high
degree of fluctuation. In contrast, the results obtained by the EPSO-KELM, denoted by
the red line, were relatively reliable, with significantly lower standard deviations than
the other methods. Figure 2 displays the experimental results obtained using the
Australian dataset. As shown in this figure, the EPSO-KELM achieved higher ACC
values and lower standard deviations than the other two methods. According to the
above analysis, the proposed EPSO-KELM approach yielded more reliable and robust
results than the PSO-KELM and GA-KELM methods.
Figure 1. ACC and standard deviation after 10 runs of the 10-fold CV with the Polish dataset
Figure 2. ACC and standard deviation after 10 runs of the 10-fold CV with the Australian dataset
The evolutionary processes of the EPSO-KELM, PSO-KELM, and GA-KELM
meta-heuristic optimization methods were recorded using the Polish dataset in order to
analyze their optimization procedures. As shown in Figure 3, the three fitness curves
gradually improved from iteration 1 to iteration 280. However, no obvious
improvements were noted in the EPSO-KELM, PSO-KELM, or GA-KELM results
after iterations 44, 76, and 130. According to the above analysis, the proposed
EPSO-KELM demonstrated efficient convergence toward the global optimum, with an
average ACC of 83.95%. Thus, the performance of the EPSO-KELM was superior to
that of the PSO-KELM and GA-KELM when applied to bankruptcy prediction.
Figure 3. Average best fit results of the EPSO-KELM, PSO-KELM, and GA-KELM during the training
stage after one run of the 10-fold CV
In this study, an effective and accurate approach, the EPSO-KELM, was developed to
detect companies at risk of bankruptcy precisely. In the proposed EPSO-based
approach, the generalization capability of the KELM classifier is combined with
EPSO to achieve optimal parameter tuning for financial decisions. The
experimental results indicated that the ACC, AUC, Type I error, and Type II error of
the KELM constructed with EPSO were superior to those of two other advanced
KELM bankruptcy prediction models constructed with PSO and GA. Therefore, the
proposed EPSO-KELM method could be used as an effective early warning system in
financial decision-making applications. In future studies, the efficacy of the proposed
EPSO-KELM will be tested using other datasets. In addition, the EPSO-KELM will be
applied to other financial problems.
Acknowledgements
This study was financially supported by the National Natural Science Foundation of
China (61303113) and the Science and Technology Plan Project of Wenzhou, China
(G20140048).
References
[1] Q. Yu, Y. Miche, A. Lendasse, et al., Bankruptcy prediction using extreme learning machine and
financial expertise. Neurocomputing, 128(2014): 296-302.
[2] S. J. Lin, C. Chang, and M. F. Hsu, Multiple extreme learning machines for a two-class imbalance
corporate life cycle prediction. Knowledge-Based Systems, 39(2013): 214-223.
[3] H. Zhong, C. Miao, Z. Shen, et al., Comparing the learning effectiveness of BP, ELM, I-ELM, and SVM
for corporate credit ratings. Neurocomputing, 128(2014): 285-295.
[4] G. B. Huang, H. Zhou, X. Ding, et al., Extreme learning machine for regression and multiclass
classification. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 42(2012):
513-529.
[5] B. Liu, J. Tang, J. Wang, et al., 2-D defect profile reconstruction from ultrasonic guided wave signals
based on QGA-kernelized ELM. Neurocomputing, 128(2014): 217-223.
[6] L. Zhang and J. Yuan, Fault Diagnosis of Power Transformers using Kernel based Extreme Learning
Machine with Particle Swarm Optimization. Applied Mathematics & Information Sciences, 9(2015):
1003-1010.
[7] C. Ma, J. H. Ouyang, H. L. Chen, et al., A novel kernel extreme learning machine algorithm based on
self-adaptive artificial bee colony optimisation strategy. International Journal of Systems Science,
2014: 1-16.
[8] K. Price, R.M. Storn, and J. A. Lampinen, Differential evolution: a practical approach to global
optimization. 2006: Springer Science & Business Media.
[9] W. Pietruszkiewicz, Dynamical systems and nonlinear Kalman filtering applied in classification. In:
Cybernetic Intelligent Systems (CIS 2008), 7th IEEE International Conference on, IEEE, 2008.
290 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-290
Introduction
Dynamic Bayesian networks (DBNs) are a useful and general representation to model
complex temporal processes, and are widely applied in bioinformatics for modeling
various biological networks including gene regulatory networks and metabolic
networks [1-3]. Learning a dynamic Bayesian network structure means identifying the
probabilistic relationships in time-series data, and it is one of the most challenging
problems in the field. The major methods for learning DBNs are primarily adapted from
approaches for learning static Bayesian networks (SBNs), namely the score-and-search
algorithm and Markov chain Monte Carlo (MCMC) simulation.
Several researchers have addressed this problem with score-and-search algorithms.
Reference [4] combined multiple scoring criteria, such as BDe and BIC, with heuristic
search strategies, including greedy search, simulated annealing, and genetic
algorithms, to design a number of structure learning algorithms. Reference [5] modified
the EGA-DBN algorithm, replacing the genetic algorithm with an immune algorithm,
and achieved good convergence. Reference [6] proposed PS_DBN, based on the
particle swarm algorithm. Reference [7] built the network stepwise and proposed a
novel DBN structure learning algorithm based on particle swarm optimization.
Reference [8] proposed an unsupervised genetic algorithm in which mutual information
is used to select the initial population and thereby reduce the search space. Furthermore,
1
Corresponding Author: Li-Ning XING, College of Information Systems and Management, National
University of Defense Technology, Changsha 410073, China; E-mail: xing2999@qq.com.
G.-L. Li et al. / Novel DBN Structure Learning Method Based on MIC 291
this paper provided a new structure representation that requires no acyclicity test, as
well as a novel search algorithm for BIC scores that uses family inheritance to enhance
efficiency.
Searching only for the single highest-scoring network, as the score-and-search
approach does, may be questionable, especially for small samples: the posterior is then
likely to be relatively flat, so no network stands out uniquely with the highest score.
Therefore, in many cases it is more appropriate to consider the full posterior
distribution over network models or, in practice, a set of high-scoring networks.
MCMC methods are used to sample networks directly from the posterior and are
applied as optimization or estimation procedures.
Dirk Husmeier first applied the MCMC method to dynamic Bayesian network
learning, and then explored the factors that affect the sensitivity and specificity of the
learned networks [9]. Reference [10] relaxed the time-invariance assumption and
introduced a new type of graphical model called non-stationary dynamic Bayesian
networks, together with an MCMC sampling algorithm that learns the model structure
from time-series data; both simulated and biological data were used to test the
effectiveness of the algorithm. Reference [11] improved the MCMC-based DBN
structure learning framework with evolutionary algorithms, effectively improving the
convergence rate.
In the Markov chain Monte Carlo approach, the Metropolis-Hastings acceptance
probability for a transition from state A to state B is calculated, where each state is a
DBN representing the whole structure. Unlike the MCMC approach, the score-and-search
approach learns the inter-network and the intra-network separately. However,
both approaches ignore some structural constraints of DBNs. In this paper, these
structural constraints are identified and analyzed to transform the structure learning
problem into one of discovering associations among variables.
To search for pairs of closely related variables in a dataset, a measure of relevance
can be calculated for each pair; the pairs are then ranked by their scores, and the
high-scoring pairs are examined. For this procedure, the statistic used to measure
relevance should have two heuristic properties: generality and equitability.
Reference [12] presented a measure of dependence for two-variable relationships,
the maximal information coefficient (MIC). The authors mathematically proved that
MIC is general, and tested its equitability through simulations. On the basis of their
tests, MIC was found to be useful for identifying and characterizing relevant
relationships in data.
The maximal information coefficient (MIC) was first applied to learning Bayesian
network structure in reference [13]. Reference [14] proposed a novel MIC-based
approach for data discretization and created a new method for mutual information-
based structure learning with DBNs. Reference [15] presented a novel algorithm named
MIC-BPSO (Maximal Information Coefficient - Binary Particle Swarm Optimization)
to learn Bayesian networks from data. This algorithm first uses MIC in the network
construction phase to enhance the quality of the initial populations, and then exploits
the decomposability of the scoring function to update the BPSO algorithm.
The remainder of this paper is organized as follows. Preliminaries on dynamic
Bayesian networks and the maximal information coefficient are presented in Section 1.
In Section 2, the structure learning method based on MIC is proposed. Section 3 then
presents experimental results on several benchmark datasets with known structures.
Finally, Section 4 draws conclusions and outlines future research.
1. Preliminaries
Assume max_G I(X;Y) is the maximum mutual information over all grids G of size
|X|×|Y|, and let I*_{|X|,|Y|}(X;Y) = max_G I(X;Y). We define MIC as follows:
MIC(X;Y|D) = max_{|X|·|Y| < B(n)} { I*_{|X|,|Y|}(X;Y|D) / log min(|X|,|Y|) }   (1)
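The grid-based computation behind this definition can be sketched as follows (an approximation: the true MIC maximises over all grids of total size below B(n), whereas this sketch only considers equal-frequency partitions; `mic_approx` and `b_max` are illustrative names):

```python
import math
from collections import Counter

def _equifreq_bins(values, k):
    """Assign each value to one of k roughly equal-frequency bins. (The true
    MIC maximises over all grids; equal-frequency grids are an approximation.)"""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = min(k - 1, rank * k // len(values))
    return bins

def mutual_information(xb, yb):
    """I(X;Y) in bits for two discrete label sequences."""
    n = len(xb)
    pxy = Counter(zip(xb, yb))
    px, py = Counter(xb), Counter(yb)
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def mic_approx(x, y, b_max=None):
    """Max over grid sizes a*b <= B(n) of I(X;Y) / log2(min(a, b))."""
    n = len(x)
    if b_max is None:
        b_max = max(4, int(n ** 0.6))   # B(n) = n^0.6, following Reshef et al.
    best = 0.0
    for a in range(2, b_max + 1):
        for b in range(2, b_max // a + 1):
            i_ab = mutual_information(_equifreq_bins(x, a), _equifreq_bins(y, b))
            best = max(best, i_ab / math.log2(min(a, b)))
    return best

# A noiseless functional relationship scores (near) the maximum value of 1.
score = mic_approx(list(range(100)), list(range(100)))
```

Because I(X;Y) never exceeds log2 of the smaller grid dimension, the normalized score always lies in [0, 1].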
Eq. (2) enforces an association between the same variable in consecutive
timeslices (in fact, this constraint can be dropped if there is no need to associate each
variable with itself in the previous timeslice). Eq. (3) forces the variables
X_1, …, X_n representing the previous slice to have no parents.
When obtaining inter-timeslice arcs in a DBN, MIC is applied as follows: if the
MIC between two variables from consecutive slices is high, the two nodes are directly
associated with each other; if the MIC is very low, they are independent of each other,
meaning there is no connecting edge between them in the DBN structure.
MIC is thus a useful tool to measure the degree of dependence between two
variables in consecutive slices. To obtain inter-timeslice arcs, the MIC was calculated
between every variable and every variable of the next slice in the DBN. Then, the
maximum MIC (MaxMIC) for each variable was recorded. A threshold of α = 0.9
times the maximum MIC for each variable was found appropriate to include most of
the true arcs [21]. If either of the conditions in (4) was satisfied, an undirected arc was
inserted between the two variables.
MIC(X,Y) ≥ α · MaxMIC(X)   or   MIC(Y,X) ≥ α · MaxMIC(Y)   (4)
The pseudo-code for obtaining the DBN structure with only inter-timeslice arcs is as
follows:
Obtain inter-timeslice arcs based on MIC
Input: V = {X_1, …, X_n, X'_1, …, X'_n} - the variable set; D - the dataset
Output: G - the DBN structure
Begin Procedure
  Compute MIC(X_i, X'_j) (i ≠ j; i, j = 1, 2, …, n);
  Find MaxMIC(X_i) for each variable X_i (i = 1, 2, …, n);
  Select node pairs with threshold α = 0.9:
    arc(X_i, X'_j) = 1 if MIC(X_i, X'_j) ≥ α · MaxMIC(X_i), else 0;
    or
    arc(X_i, X'_j) = 1 if MIC(X_i, X'_j) ≥ α · MaxMIC(X'_j), else 0;
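The arc-selection step of this procedure can be sketched as follows (a simplified sketch assuming the MIC values have already been computed into a hypothetical matrix `mic`; the i ≠ j restriction of the pseudo-code is omitted for brevity):

```python
def inter_timeslice_arcs(mic, alpha=0.9):
    """Keep arc X_i -> X'_j when MIC(X_i, X'_j) reaches alpha times the
    MaxMIC of either endpoint, following Eq. (4)."""
    n = len(mic)
    max_row = [max(row) for row in mic]                             # MaxMIC(X_i)
    max_col = [max(mic[i][j] for i in range(n)) for j in range(n)]  # MaxMIC(X'_j)
    return {(i, j)
            for i in range(n) for j in range(n)
            if mic[i][j] >= alpha * max_row[i]
            or mic[i][j] >= alpha * max_col[j]}

# Toy MIC matrix: mic[i][j] stands for MIC(X_i, X'_j).
mic = [[0.95, 0.10, 0.20],
       [0.15, 0.90, 0.05],
       [0.30, 0.25, 0.85]]
arcs = inter_timeslice_arcs(mic)   # only the strong arcs survive
```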
3. Experiment
To evaluate the proposed method, the Dynamic Asia network was first selected as a
test sample, followed by two other networks. The method was compared with the MI,
K2, and MCMC structure learning algorithms in terms of both accuracy and
efficiency [22].
To evaluate the accuracy of a learned structure, the F-score was introduced as a
synthetic indicator of the precision and recall of the structure learning algorithm [23].
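Treating the learned and true structures as sets of arcs, the F-score can be computed as follows (a minimal sketch; `structure_f_score` is an illustrative name):

```python
def structure_f_score(true_arcs, learned_arcs):
    """F-score of a learned structure against ground truth:
    precision = correct arcs / learned arcs, recall = correct arcs / true arcs."""
    true_arcs, learned_arcs = set(true_arcs), set(learned_arcs)
    correct = len(true_arcs & learned_arcs)
    if correct == 0:
        return 0.0
    precision = correct / len(learned_arcs)
    recall = correct / len(true_arcs)
    return 2 * precision * recall / (precision + recall)

truth   = {("A", "B"), ("B", "C"), ("C", "D")}
learned = {("A", "B"), ("B", "C"), ("A", "D")}
f = structure_f_score(truth, learned)   # precision = recall = 2/3, so F = 2/3
```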
In this study, all experiment programs were implemented in Matlab, based on
the BNT toolkit developed by K. Murphy. The platform was a PC with a Pentium (R) 4
3.20 GHz CPU and 1 GB RAM, running Windows XP.
All experiments were repeated 10 times for each sample size and each Bayesian
network benchmark: ten datasets were randomly generated for each sample size of the
BN, and the average performance indicator was taken as the final result. For each
dataset, a dynamic Bayesian network was learned by each of these methods.
Table 1 and Figure 2 show the structure learning results for the Dynamic
Asia network with only inter-timeslice arcs. Compared with the other methods, the
proposed MIC-based method obtained better results, achieving a higher F-score.
Table 1. Results of different methods for Dynamic Asia network structure learning
Figure 2. Results of F-score for different sample sizes and methods
Two other dynamic networks were also tested: the WATER network and the BAT
network. The WATER network is a dynamic network that monitors biochemical
processes in water supply plants. It has 12 attributes, and the corresponding two-
timeslice transition network includes a total of 24 vertices and 26 edges, as shown in
Figure 3(a). The BAT network is a dynamic Bayesian network for highway traffic
monitoring. It has 28 attributes, and the corresponding two-timeslice transition network
includes a total of 56 vertices and 42 edges, as shown in Figure 3(b).
4. Conclusion
References
[1] M. Grzegorczyk, D. Husmeier. Improvements in the Reconstruction of Time varying Gene Regulatory
Networks: Dynamic Programming and Regularization by Information Sharing Among Genes.
Bioinformatics, 27(2011): 693–699.
[2] N. X. Vinh, M. Chetty, R. Coppel, et al. Global MIT: Learning Globally Optimal Dynamic Bayesian
Network with the Mutual Information Test Criterion. Bioinformatics, 27(2011): 2765–2766.
[3] B. Wilczynski, N. Dojer. BNFinder: Exact and Efficient Method for Learning Bayesian Networks.
Bioinformatics, 25(2009): 286–287.
[4] J. Yu, V. A. Smith, P. Wang, A. J. Hartemink, et al. Advances to Bayesian network inference for
generating causal networks from observational biological data. Bioinformatics, 20(2004): 3594-3603.
[5] H. Jia, D. Liu, P. Yu. Learning dynamic bayesian network with immune evolutionary algorithm.
Guangzhou, China: Institute of Electrical and Electronics Engineers Computer Society, (2005): 2934-
2938.
[6] X. Heng, Q. Zheng, T. Lei, et al. Research on Structure Learning of Dynamic Bayesian Networks by
Particle Swarm Optimization. Proceedings of the 2007 IEEE Symposium on Artificial Life (CI-ALife
2007), 2007: 85-91.
[7] Y. Lou, Y. Dong, H. Ao. Structure Learning Algorithm of DBN Based on Particle Swarm Optimization.
2015 14th International Symposium on Distributed Computing and Applications for Business
Engineering and Science (DCABES). IEEE, 2015: 102-105.
[8] J. Dai, J. Ren. Unsupervised evolutionary algorithm for dynamic Bayesian network structure learning.
Workshop on Advanced Methodologies for Bayesian Networks. Springer International Publishing,
2015: 136-151.
[9] D. Husmeier. Sensitivity and specificity of inferring genetic regulatory interactions from microarray
experiments with dynamic Bayesian networks. Bioinformatics, 19(2003): 2271–2282.
[10] J. Robinson, A. Hartemink. Learning non-stationary dynamic Bayesian networks. Journal of Machine
Learning Research, 11(2010): 3647–3680.
[11] W. Hao, Y. Kui, H. Yang. Learning dynamic Bayesian networks using evolutionary MCMC.
Piscataway, NJ, USA: IEEE, 2006: 2934-2938.
[12] D. N. Reshef, Y.A. Reshef, H. K. Finucane, et al. Detecting Novel Associations in Large Data Sets.
Science, 334(2011): 1518–1524.
[13] Y. Zhang, W. Zhang. A novel Bayesian network structure learning algorithm based on maximal
information coefficient. Proceedings of the Fifth International Conference on Advanced Computational
Intelligence. IEEE, 2012: 862–867.
[14] N. X. Vinh, M. Chetty, R. Coppel, et al. Data Discretization for Dynamic Bayesian Network Based
Modeling of Genetic Networks. ICONIP 2012, Part II, LNCS 7664, 2012: 298–306.
[15] G. Li, L. Xing, Y. Chen. A New BN Structure Learning Mechanism Based on Decomposability of
Scoring Functions. Bio-Inspired Computing-Theories and Applications. Springer Berlin Heidelberg,
2015: 212-224.
[16] N. Friedman, K. Murphy, S. Russell. Learning the structure of dynamic probabilistic networks. In
Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI), San
Francisco, CA: Morgan Kaufmann Publishers, 1998: 139–147.
[17] G. Li, X. Gao, R. Di. DBN structure learning based on MI-BPSO algorithm. In: 13th IEEE/ACIS
International Conference on Computer and Information Science, 2014: 245–250.
[18] B. Wilczynski, N. Dojer. BNFinder: exact and efficient method for learning Bayesian networks.
Bioinformatics, 2009, 25(2): 286–287.
[19] N. Dojer, Learning Bayesian Networks Does Not Have to Be NP-Hard. In Proceedings of International
Symposium on Mathematical Foundations of Computer Science, 2006: 305–314.
[20] S.A. Ramsey, S.L. Klemm, D.E. Zak, et al. Uncovering a macrophage transcriptional program by
integrating evidence from motif scanning and expression dynamics. PLOS Computational Biology,
4(2008).
[21] Y. Zhang, W. Zhang, Y. Xie. Improved heuristic equivalent search algorithm based on Maximal
Information Coefficient for Bayesian Network Structure Learning. Neurocomputing, 117(2013): 186–
195.
[22] G. F. Cooper, E. Herskovits. A Bayesian method for the induction of probabilistic networks from data.
Machine Learning, 9(1992): 309-347.
[23] E. Patrick, K. Kevin, L. Frederic. Information-Theoretic Inference of Large Transcriptional Regulatory
Networks. EURASIP Journal on Bioinformatics and Systems Biology, 2007.
Fuzzy Systems and Data Mining II 299
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-299
Introduction
Illustration images are used in animation works and comics. Such illustration images
are also called "Anime illust". Figure 1 shows an example of an illustration image.
1
Corresponding Author: Kenji KITA, Tokushima University, 2-1, Minamijosanjima-cho, Tokushima
770-8506, Japan; E-mail:kita@is.tokushima-u.ac.jp.
2
© 2015 Ell ( http://elleneed.blog102.fc2.com/ )
300 A. Fujisawa et al. / Improvement of the Histogram
experience viewing the illustration images. Thus, it is difficult for anyone other than
illustration enthusiasts to perceive the style of an illustration image in detail. While a
few studies have focused on illustration images [1-2], previous researchers did not
propose a method for automatically identifying the style. In this paper, we study
illustration style with the following aims:
• To investigate image features that are effective for classifying the style.
• To construct style-based classifiers for illustration images.
In Section 1, we review previous research. In Section 2, we describe “IF-hist” and
present an improved histogram, “vIF-hist”. In Section 3, we describe an experiment
that investigates the effectiveness of the proposed method and discuss the experimental
results. Finally, we present the conclusions of this research in Section 4.
1. Previous Research
This section introduces previous research. Kuriyama [3-4] created a recognition
model based on various image features, such as Local Binary Pattern features and the
HSV model. Moreover, he created a global feature per image by integrating local
features. To study the similarity of touch in illustration images, Kadokura [5] used the
number of distinct colors. Besides this, a number of studies have targeted clip art [6-8].
Clip art is a kind of illustration image; however, it is used for icons or symbols and is
therefore very simple compared with the illustration images targeted in our study.
Because illustration images such as “Anime Illust” include more subtle and
complicated expressions, illustrations classified as clip art should be distinguished
from our research target.
2. Proposed Method
2.1. IF-hist
Fujisawa et al. [9] assumed that the colors representing a style are determined
relatively. Based on this assumption, they proposed a color histogram that focuses on
infrequently appearing colors, called the Infrequency Histogram, or IF-hist. Fujisawa
et al. ranked the color frequencies according to histogram value and used the ranking
as a threshold.
IF-hist is created by changing the histogram values of gradations that appear more
frequently than a certain frequency of appearance. This operation is carried out
according to Eq. (1), where IF-hist_n and hist_n denote the histogram values of
gradation n, and θ_rank denotes the ranking threshold. This calculation does not
change the color information below the threshold. However, for colors above a certain
frequency of appearance, only the information on their presence or absence in the
target image is kept. As a result, information on how often the colors
appeared in the image is lost. This method creates a color histogram that preserves
information about the infrequent colors. However, it has a weakness: the histogram
values of gradations that originally appeared infrequently can also be changed. Figure 2
shows a concrete example of this problem with IF-hist.
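The rank-threshold operation, and the weakness just described, can be sketched as follows (an assumed form of Eq. (1): the presence flag value and the tie-breaking of equal frequencies are our assumptions, not taken from the original):

```python
def if_hist(hist, rank_threshold, flag=1):
    """IF-hist sketch (assumed form): the values of the `rank_threshold` most
    frequent gradations are collapsed to a constant presence flag; the other
    gradations keep their original histogram values."""
    order = sorted(range(len(hist)), key=lambda n: hist[n], reverse=True)
    top = set(order[:rank_threshold])
    return [flag if (n in top and hist[n] > 0) else hist[n]
            for n in range(len(hist))]

h1 = if_hist([120, 45, 3, 2, 0], rank_threshold=2)   # -> [1, 1, 3, 2, 0]
# With a larger rank threshold, the genuinely rare gradation (value 3) is
# also flattened, which is the weakness described above:
h2 = if_hist([120, 45, 3, 2, 0], rank_threshold=3)   # -> [1, 1, 1, 2, 0]
```

Because the threshold counts ranks rather than values, rare gradations can be caught whenever few gradations dominate the image.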
This subsection describes the improved IF-hist, which is created using a different
threshold from the original IF-hist. We refer to the proposed histogram as vIF-hist. The
creation of vIF-hist is shown in Eq. (2), where θ_value denotes the threshold on the
histogram value itself. vIF-hist is obtained by applying this threshold. Figure 3 shows a
sample vIF-hist. The left histogram is a normal color histogram, the same as in
Figure 2. The right histogram is a vIF-hist with θ_value equal to 0.005. Compared with
the IF-hist in Figure 2, in the vIF-hist the histogram values of gradations that appeared
infrequently were not changed. In addition, the other histogram values of gradations,
those surrounded by a square, were changed as intended.
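The value-threshold operation of Eq. (2) can be sketched as follows (an assumed form: clipping frequent gradations to the threshold itself is our choice of presence flag, not taken from the original):

```python
def vif_hist(hist, value_threshold):
    """vIF-hist sketch (assumed form): gradations whose normalised histogram
    value exceeds `value_threshold` are clipped to the threshold itself as a
    presence flag; values at or below the threshold are kept unchanged."""
    return [value_threshold if v > value_threshold else v for v in hist]

h = vif_hist([0.30, 0.004, 0.006, 0.0], value_threshold=0.005)
# -> [0.005, 0.004, 0.005, 0.0]: only genuinely frequent gradations change
```

Unlike the rank threshold, a value threshold can never alter a gradation that was already infrequent, which is exactly the behavior intended for vIF-hist.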
3. Evaluation Experiment
Illustration images used in the experiment were collected from “Rakuten Kobo3,”
a shop that sells digital books. Table 1 shows the amount of data used in the evaluation
experiment. We used 30-fold cross-validation. The library used in this experiment was
LIBSVM [10].
Table 1. Experimental Data
Recall_k = (number of images of style k classified correctly) / (number of images of style k)   (3)
Precision_k = (number of images of style k classified correctly) / (number of images classified as style k)   (4)
3
Rakuten Kobo( https://store.kobobooks.com/ )
4. Result
Table 2 and Table 3 show the experimental results of classifying the styles “For boys”
and “For girls”.
Table 2. Experimental result using normal color histogram.
When using the normal color histogram, the difference between precision and
recall was large. In addition, the F-measure for “For boys” was low. From this, we
considered that the classification results were biased. When using vIF-hist, the overall
precision was over 80%. Similarly, the recall was higher than with the normal color
histogram. Therefore, we consider that our proposed histogram can effectively classify
an illustration image by its style. In addition, the F-measure of vIF-hist was better than
that of IF-hist. Thus, we consider the improvement of IF-hist to be successful, with
better classification as a result.
4.1. Discussion
In the additional experiment, we used the same data as in the evaluation
experiment. We divided the data into training and test sets so that images by the same
illustrator would not appear in both. Using the training data, we built a classifier with
an SVM, and we then used this classifier to classify the test data by style. The number
of images used for the experiment was 1,219. Table 4 shows a breakdown of the
additional experiment data.
Table 4. Training and test data used in additional experiment.
5. Conclusion
In this study, we aimed to classify illustration images based on their styles. To do so,
we focused on colors that are used infrequently. IF-hist is a color histogram focused on
colors that appear infrequently in an image; it uses the appearance-frequency rank of a
color as a threshold. However, this histogram had a weakness. To overcome it, we
proposed using the histogram value directly as the threshold, and we named the
improved IF-hist vIF-hist.
To evaluate vIF-hist, we experimented with classifying illustration images into the
styles “For boys” and “For girls”. In terms of the F-measure, the result with vIF-hist
was better than the result with IF-hist regardless of the style. From this, we consider
that the weakness of IF-hist is resolved by vIF-hist.
In future studies, we would like to investigate the relationships among colors with
low appearance frequency in an illustration image. We would also like to further
investigate the colors that are not used in an illustration image to represent its
particular style.
Acknowledgments
This work was supported by JSPS KAKENHI Grant Numbers 15K00425, 15K00309, and
15K16077.
References
[1] T. Itamoti, M. Miwa, K. Taura, and T. Chikayama, An identification algorithm of illustration artists,
Proc.74th National Convention of IPSJ, 74(2012), 2.209-2.210.
[2] S. Aoki, and R. Miyamoto, Feature point extraction from a manga-style illustration of a facial image
using sift and color features as a feature vector, IEICE Tech. Rep., 115(2016), SIS2015-59, 63-68.
[3] S. Kuriyama, Cognitive classification for styles of illustrations based recognition model, IPSJ SIG
Technical Report 2013-CG-152(2013), 1-7.
[4] T. Furuya, S. Kuriyama, and R. Ohbuchi, An unsupervised approach for comparing styles of
illustrations, Oral paper, Proc.13th International Workshop on Content-Based Multimedia Indexing
(CBMI) 2015,1-6.
[5] K. Kadokura and Y. Osana. Image(Artwork) Retrieval based on Similarity of Touch, Proc.75th
National Convention of IPSJ 2013, 603-604.
[6] M. J. Fonseca, B. Barroso, P. Ribeiro, and J. A. Jorge, Retrieving ClipArt Images by Content, Proc.
the 3rd International Conference on Image and Video Retrieval (CIVR) 2004, 500-504.
[7] E. Garces, A. Agarwala, D. Gutierrez, and A. Hertzmann, A similarity measure for illustration style,
ACM Transactions on Graphics (SIGGRAPH 2014), 33(2014).
[8] P. Martins, R. Jesus, M. Fonseca, and N. Correia, Clip art retrieval combining raster and vector
methods, 11th International Workshop on Content-Based Multimedia Indexing(CBMI) 2013, 35-40.
[9] A. Fujisawa, K. Matsumoto, M. Yoshida, and K. Kita, An Illustration Image Classification Focusing
on Infrequent Colors, International Journal of Advanced Intelligence, 8(2016), 84-98.
[10] C. C. Chang and C. J. Lin. LIBSVM: A library for support vector machines, ACM Transactions on
Intelligent Systems and Technology, 2(2011).
306 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-306
Introduction
In wireless digital communication systems, the LDPC code is an important class of
forward error correction codes. Its decoding performance is the closest to the Shannon
limit, and it offers high data throughput [1]. The LDPC code has therefore been widely
adopted in many wireless communication standards, such as WLAN and WiMAX in
the field of broadband wireless access [2].
With the broad application of LDPC codes, multi-standard universal LDPC
encoders will also be widely used in the future. The implementation of such an encoder
is therefore required not only to provide some flexibility, but also to approach a
dedicated chip in area and power consumption. An application-specific instruction set
processor (ASIP) is usually selected to meet these demanding requirements.
Based on an ASIP architecture, paper [2] designs an architecture-aware LDPC
(AA-LDPC) encoder whose core computing component is a matrix multiplier. Papers
[3-6] design several high-speed QC-LDPC encoders with special structures in the
computing unit. In this paper, we design a 4-bit-wide RAM scheme that uses fewer
resources to implement the cyclic shift function for multiple standards. The encoder
calculates the cyclic shifts in parallel, which increases the speed and the throughput.
*
Corresponding Author: Qian YI, Department of Information Science and Technology, Taishan
University, Taian, China ; E-mail: bjkdqy@126.com.
Q. Yi and H. Jing / Design and Implementation of a Universal QC-LDPC Encoder 307
Section 1 of this paper analyses the QC-LDPC coding algorithm, Section 2
describes the system structure and the main calculation modules of the encoder, and
Section 3 gives the simulation results and analysis.
The QC-LDPC code is based on a structured construction. It reduces the coding
complexity and lends itself to semi-parallel hardware implementation. Communication
standards such as WLAN [8], WiMAX [5], DTMB [3], and CCSDS [7] have adopted
it. The quasi-lower-triangular encoding algorithm was proposed by Richardson and
Urbanke (the RU algorithm). The algorithm makes full use of the sparsity of the check
matrix and rearranges the rows and columns of the parity matrix to obtain an
approximately lower-triangular H matrix, which reduces the complexity of linear
encoding [8]. To encode the LDPC code efficiently, the matrix H is partitioned
according to the RU algorithm into the blocks A, B, C, D, E, and T, which are
composed of g×g circulant permutation submatrices. Here g is the expansion factor
and z is an integer much smaller than n.
Assume the parity code word is v = (s, P_1, P_2), where s is the information part
and P_1 and P_2 are the first and second parts of the check code. Then, over GF(2):
P_1^T = Φ^{-1} (E T^{-1} A + C) s^T   (1)
P_2^T = T^{-1} (A s^T + B P_1^T)   (2)
Φ = E T^{-1} B + D   (3)
According to the block partition, the lengths of s, P_1, and P_2 are n-m, z, and m-z,
respectively. As a known parameter, Φ can be input directly to the encoder and does
not need to be calculated.
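Equations (1)-(3) can be checked numerically with a dense GF(2) sketch (for illustration only: a real encoder exploits the sparsity and circulant structure instead of dense inversion, and the small matrices below are arbitrary invertible examples, not taken from any standard):

```python
import numpy as np

def gf2_inv(M):
    """Invert a binary matrix over GF(2) by Gauss-Jordan elimination."""
    n = len(M)
    aug = np.concatenate([M % 2, np.eye(n, dtype=int)], axis=1)
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r, col])
        aug[[col, pivot]] = aug[[pivot, col]]
        for r in range(n):
            if r != col and aug[r, col]:
                aug[r] = (aug[r] + aug[col]) % 2
    return aug[:, n:]

def ru_encode(s, A, B, C, D, E, T):
    """RU-style encoding over GF(2): returns the code word v = (s, P1, P2)."""
    Tinv = gf2_inv(T)
    phi = (E @ Tinv @ B + D) % 2                          # Eq. (3)
    p1 = (gf2_inv(phi) @ ((E @ Tinv @ A + C) @ s)) % 2    # Eq. (1)
    p2 = (Tinv @ (A @ s + B @ p1)) % 2                    # Eq. (2)
    return np.concatenate([s, p1, p2])

# Arbitrary small invertible example blocks (g = 2), for checking only.
I2 = np.eye(2, dtype=int)
A, B, T = I2, np.array([[1, 1], [0, 1]]), np.array([[1, 0], [1, 1]])
C, D, E = np.array([[0, 1], [1, 0]]), I2, I2
s = np.array([1, 0])
v = ru_encode(s, A, B, C, D, E, T)
# v satisfies both parity constraints: A s + B P1 + T P2 = 0 and
# C s + D P1 + E P2 = 0 (mod 2), i.e. H v = 0 for H = [[A, B, T], [C, D, E]].
```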
Analyzing the algorithm from the viewpoint of hardware realization, the encoder
needs to calculate equations (1) and (2). According to the analysis of the algorithms in
paper [4], the core operations of the two equations can be decomposed into the
multiplication of a g×g matrix with a g-dimensional vector, and the modulo-2 addition
of two g-dimensional vectors. The modulo-2 addition is generally implemented with
XOR circuits.
According to a lemma of matrix theory, if an identity matrix is cyclically right-
shifted by x bits and then multiplied with a column vector, the result equals the column
vector cyclically shifted upward by x bits. Therefore, a cyclic shifter is implemented to
perform the multiplication of a g×g matrix with a g-dimensional vector. If the cyclic
shifter shifts N bits, the result is usually obtained after N clocks. In paper [5], a
logarithmic cyclic shifter outputs the result after log2(N) clocks. These cyclic shifters
are suitable when the shift length is constant. Even within the same communication
standard, however, g often takes different values: in IEEE 802.16e, g equals 24, 28,
32, and so on, while in IEEE 802.11n it is 27, 54, 81, and so on. If a general cyclic
shifter or a logarithmic cyclic shifter is used to perform the cyclic shift of the
g-dimensional vector, the data width of the shifter must equal the maximum of all the
g values. Consequently, when the g value of the actual operation is smaller, chip
resources are wasted and long delays occur. To preserve the accuracy and throughput
of the subsequent operations, an additional control circuit is also needed to handle the
redundant or invalid bits in the final result.
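The shift lemma itself is easy to verify numerically (a small sketch; `cyclic_shift_matrix` is an illustrative name):

```python
import numpy as np

def cyclic_shift_matrix(g, x):
    """g x g identity matrix cyclically right-shifted by x bits (a circulant
    permutation matrix, as used for the QC-LDPC submatrices)."""
    return np.roll(np.eye(g, dtype=int), x, axis=1)

# Lemma: multiplying by the right-shifted identity equals shifting the vector
# upward cyclically by x positions, so a shifter can replace the multiplier.
g, x = 6, 2
v = np.array([1, 1, 0, 1, 0, 0])
P = cyclic_shift_matrix(g, x)
assert ((P @ v) % 2 == np.roll(v, -x)).all()
```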
While remaining universal, this paper combines two design ideas, cyclic shifting
and instruction string operations, to extract a special instruction set for P_1 and P_2. In
this way, the modules of the chip can run in parallel within this structure, increasing
the resource utilization and the running speed of the encoder.
Since the bit length of s ranges from hundreds to thousands, only a parallel structure
can increase the coding efficiency and flexibility. The design of the MIMD parallel
architecture is shown in Figure 1.
The MIMD architecture has multiple processing units that perform different operations
under different control flows. The units can process different data and achieve spatial
parallelism, so the MIMD architecture is often used for special-purpose computing.
Figure 1. The MIMD parallel architecture: processing unit groups (each with processing
units, local memory, data bus and control bus), an instruction decode unit and an
instruction memory.
The instruction decode unit decodes each instruction: if it is a local instruction, it
sends the control signals to the processing unit; if it is a global instruction, it
sends the control signals to the control unit.
The control unit coordinates the processing tasks among the processing unit groups,
manages the synchronization signal, the external handshake signal and so on.
After the instruction decode unit shown in Figure 1 decodes the MVMs instruction, the
controller module sends out the control signal, which causes the matrix vector
multiplier to complete the matrix vector multiplication. The hardware structure of
matrix vector multiplier is shown in Figure 2.
Figure 2. The hardware structure of the matrix-vector multiplier: a g-dimensional
vector memory, a g×g matrix memory, registers #0-#2, a selector, a controller and a
local memory.
In the g×g matrix memory, the elements are stored row by row: addresses 0-23 hold
A0,0-A0,23, address 24 holds A1,0, and so on.
If mod(N, 4) = 0, N is divisible by 4, and the data in register #0 and register #1 can
be output directly without concatenation.
If mod(N, 4) = 1, then in one clock cycle {register #0 [2:0], register #1 [3]} is
written into the local memory, and in the next clock cycle {register #1 [2:0],
register #0 [3]} is written into the local memory. This is equivalent to shifting the
data by one bit.
If mod(N, 4) = 2, then in one clock cycle {register #0 [1:0], register #1 [3:2]} is
written into the local memory, and in the next clock cycle {register #1 [1:0],
register #0 [3:2]} is written into the local memory. This is equivalent to shifting the
data by two bits.
If mod(N, 4) = 3, then in one clock cycle {register #0 [0], register #1 [3:1]} is
written into the local memory, and in the next clock cycle {register #1 [0],
register #0 [3:1]} is written into the local memory. This is equivalent to shifting the
data by three bits.
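The four cases follow one pattern: each word written to local memory takes the lower (4−r) bits of one register and the upper r bits of the other, where r = mod(N, 4). A minimal sketch of this concatenation rule (bit strings, MSB-first as in the slices above; illustrative only):

```python
def concat(word_a, word_b, r):
    """Form {word_a[3-r:0], word_b[3:4-r]} as one 4-bit word.

    word_a and word_b are 4-character bit strings whose leftmost character
    is bit [3] (MSB), matching the Verilog-style slices in the text;
    r = N mod 4 is the residual bit shift.
    """
    if r == 0:
        return word_a                 # mod(N, 4) = 0: no concatenation needed
    return word_a[r:] + word_b[:r]    # lower (4-r) bits of a, upper r bits of b

# r = 1 reproduces {register #0 [2:0], register #1 [3]}:
assert concat("abcd", "efgh", 1) == "bcde"
# r = 2 reproduces {register #0 [1:0], register #1 [3:2]}:
assert concat("abcd", "efgh", 2) == "cdef"
```

Applied word after word along the vector, this produces the bit stream cyclically shifted by r bits.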
The space-time task description of the matrix-vector multiplier is:
On the first clock cycle, the data at address (0) of the g-dimensional vector memory is
written into register #0 and register #2.
On the second clock cycle, the data at address (1) of the g-dimensional vector memory
is written into register #1. The two data groups in register #0 and register #1 are
concatenated into a new 4-bit word according to one of the four cases above, and on the
next clock cycle the new 4-bit word is written into the address (0) unit of the local
memory.
On the third clock cycle, the data at address (2) of the g-dimensional vector memory is
written into register #0. The two data groups in register #0 and register #1 are
concatenated, and on the next clock cycle the new 4-bit word is written into the
address (1) unit of the local memory, and so on.
Since the first data group is also stored in register #2, on the last clock cycle
register #2 replaces register #0 or register #1 in the concatenation to produce the
last 4-bit word, which is written into the last unit on the next clock.
The entire working process requires ⌈N/4⌉ + 3 clock cycles.
The specific instruction length is 25 bits. Hardware support for the dedicated
instructions was introduced in the previous section. The main commands are shown in
Table 1.
Table 1. The main instructions
When the PEGs specified by "SYN" receive the synchronization signal, they continue to
execute the next instruction; otherwise they wait. The instruction "MVMs, N" completes
the multiplication of a matrix and a vector, where N is the size of the sub-matrix. The
instruction "OUTS N" outputs a codeword of length N.
Acknowledgements
The research work was supported by Tai'an Science and Technology Development Plan
# 201330629 and Shandong Provincial Natural Science Foundation, China
#ZR2013FL030.
References
[1] R. G. Gallager. Low density parity check codes. IRE Trans, Inform. Theory, 8(1962), 21-28.
[2] A. C. Vikram, J. J. Sarah, G. Lechner. Memory-efficient quasi-cyclic spatially coupled low-density
parity-check and repeat-accumulate codes. IET Communications, 8(2014), 3179 – 3188.
[3] X. J. Zhang. Research on Encoder/Decoder of AA-LDPC Codes Based on ASIP. East China Normal
University, PhD dissertation, 2011.
[4] M. Zhao and L. Li. High throughput in-system programmable quasi cyclic LDPC encoder architecture.
Journal of Tsinghua University (Science and Technology), 49(2009), 1041-1044.
[5] Y. Zhang, X. M. Wang, H. W. Chen. FPGA based design of LDPC encoder. Journal of Zhejiang
University (Engineering Science), 45(2011), 1582-1586.
[6] R. J. Yuan, FPGA-based Joint Design of LDPC Encoder and Decoder. Journal of Electronics &
Information Technology, 34(2012), 38-44.
[7] P. W. Qiu, P. Bai, M. Y. Li. Design of Dynamically Reconfigurable LDPC Encoder Based on FPGA
According to CCSDS Criteria. Video Engineering. 36(2012), 59-62.
[8] H. Y. He, Principle and Application of LDPC. Beijing: The People's Posts and Telecommunications
Press, 2009.
312 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-312
Introduction
It is well known that the relay nodes play an import role in conventional cellular
networks to help enlarge the coverage of base station or increase the overall cell
throughput compared to 3GPP LTE [1]. It is an effective and important technology to
solve the problem of cell coverage shortage and the throughput of cell-user especially
cell-edge user and improve the whole wireless system performance [2-3]. Relay
technology are very important in other cooperative networks too, such as ad hoc
networks. As a part of important network design, relay selection, power control,
spectrum resource allocation have been widely researched in previous period. Relay
selection, which is an important part of the application of relaying systems, influences
the performance of relay nodes effectively. Most of the relay selection researches base
on certain function of channel state information (CSI), which is considering distance
between the source and destination, path loss or SNR [4]. In this scenario, the receiver
knows all the CSI between the source and relay and the CSI between relay and
destination thus chooses the relay with the best performance based on some function of
CSI [5]. However, selecting single relay in wireless networks may have the
disadvantage of imbalance of low utilization of resource, and moreover, the
1 Corresponding author: Zhi-Jie SHANG, State Grid Information & Telecommunications Branch, Beijing, China; E-mail: shangzhijie@163.com
F.-G. Lai et al. / Quantum Inspired Bee Colony Optimization 313
Figure 1. The multiple-relay cooperative system model: channels f_i from the source to
relay i and g_i from relay i to the destination, i = 1, ..., R.
y = √P Σ_{i=1}^{R} ( a_i f_i g_i √(P_i) / √(1 + |f_i|² P) ) s
    + Σ_{i=1}^{R} ( a_i g_i √(P_i) / √(1 + |f_i|² P) ) u_i + w        (1)
where w is the receiver's AWGN and u_i = v_i e^{j·arg(f_i)}, while v_i is the i-th
relay's AWGN. All of the noises are assumed to be i.i.d. complex Gaussian random
variables with zero mean and unit variance. Obviously, u_i and v_i have the same
distribution. The average SNR of the communication system is
γ = P ( Σ_{i=1}^{R} a_i |f_i g_i| √(P_i) / √(1 + |f_i|² P) )²
    / ( 1 + Σ_{i=1}^{R} a_i² |g_i|² P_i / (1 + |f_i|² P) )            (2)
The power efficiency maximization problem is

    max_{a_1, ..., a_R}  η = γ / ( P + Σ_{i=1}^{R} a_i² P_i ),   s.t. a_i ∈ {0, 1}.
Assuming that the receiver knows all the CSI, this problem is equivalent to maximizing
the SNR or the power efficiency, similar to the problem in [12]. Here, however, power
control is not taken into consideration. Instead, each relay has only two choices: to
take part in the cooperation with full power, or not to take part at all. Since every
relay has two choices, the SNR and power efficiency problems are general 0-1
optimization problems. Exhaustive search can solve the problem, but its complexity
increases exponentially with the number of relays. So QBCO is used to solve the
multiple relay selection problems and obtain a better solution, as presented in
Section II.
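For small R, the exhaustive search baseline can be sketched as follows; γ follows the reconstructed Eq. (2), with real-valued channels and made-up parameter values purely for brevity:

```python
import itertools
import math
import random

def snr(a, f, g, P, Pr):
    """SNR gamma of Eq. (2) for a 0-1 selection vector a."""
    R = len(a)
    num = sum(a[i] * abs(f[i] * g[i]) * math.sqrt(Pr[i])
              / math.sqrt(1 + abs(f[i]) ** 2 * P) for i in range(R))
    den = 1 + sum(a[i] ** 2 * abs(g[i]) ** 2 * Pr[i]
                  / (1 + abs(f[i]) ** 2 * P) for i in range(R))
    return P * num ** 2 / den

random.seed(0)
R, P = 6, 2.0
f = [random.gauss(0, 1) for _ in range(R)]  # source -> relay channels
g = [random.gauss(0, 1) for _ in range(R)]  # relay -> destination channels
Pr = [1.0] * R                              # per-relay powers P_i

# Exhaustive search: evaluate all 2^R candidate selections.
best = max(itertools.product([0, 1], repeat=R),
           key=lambda a: snr(a, f, g, P, Pr))
```

This makes the exponential cost concrete: the `itertools.product` loop runs 2^R times, which is exactly what QBCO is meant to avoid for large R.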
It was shown in [9] that for wireless relay networks with more than two relays, where
all relays either cooperate or do not cooperate at all, there exists no optimal relay
ordering. The schemes proposed in [9] are therefore not globally optimal; they only
obtain sub-optimal solutions. We therefore propose multiple relay selection schemes
based on the QBCO algorithm.
In this paper, QBCO, inspired by the social behavior of bees, is used to solve multiple
relay selection problems. Three groups of bees, namely quantum employed bees, quantum
onlooker bees and quantum scout bees, constitute the quantum bee colony. They look for
food sources (represented by quantum positions) in an R-dimensional space according to
their own and their partners' historical experience, where R is the dimension of the
optimization problem. In QBCO,
quantum coding is used to represent the probabilistic state, and the quantum velocity
can be represented by a string of quantum bits. One quantum bit can be written as a
pair of numbers (α, β), where α² + β² = 1; α² gives the probability that the quantum
bit is in the '0' state and β² the probability that it is in the '1' state. The i-th
quantum bee's quantum velocity is

    v_i = [ α_i1  α_i2  ...  α_iR ]
          [ β_i1  β_i2  ...  β_iR ]
The quantum colony consists of h quantum bees flying in an R-dimensional space.
x_i = (x_i1, x_i2, ..., x_iR), i = 1, 2, ..., h, represents the i-th quantum bee's bit
position in the space, and v_i = (v_i1, v_i2, ..., v_iR) its quantum velocity. The best
bit position found so far by the i-th quantum bee (the local optimal bit position) is
p_i = (p_i1, p_i2, ..., p_iR), i = 1, 2, ..., h, and the global optimal bit position
found by the whole bee colony is p_g = (p_g1, p_g2, ..., p_gR). At each iteration, the
quantum rotation angle, the quantum velocity and the bit position of the i-th quantum
bee are updated by the following quantum moving equations:
    θ_ij^{t+1} = e_1 (p_ij^t − x_ij^t) + e_2 (p_gj^t − x_ij^t)

    v_ij^{t+1} = √(1 − (v_ij^t)²),   if p_ij^t = x_ij^t = p_gj^t and r_1 < c_1;
    v_ij^{t+1} = | v_ij^t cos θ_ij^{t+1} − √(1 − (v_ij^t)²) sin θ_ij^{t+1} |,   otherwise
where r_1 ∈ [0,1] and γ_ij ∈ [0,1] are uniform random numbers, c_1 is a constant in
[0, 1/R], and (α_ij^{t+1})², the squared velocity component, defines the selection
probability of the bit position state in the (t+1)-th generation. The values of e_1 and
e_2 represent the relative importance of p_i^t and p_g^t.
After updating the quantum velocity and bit position, the fitness of each quantum bee
is calculated with the chosen objective, i.e., (3) or (4). If the fitness of x_i^{t+1}
is better than that of p_i^t, then p_i^{t+1} is updated to x_i^{t+1}. If the fitness of
p_i^{t+1} is better than that of p_g^t, then p_g^{t+1} is updated to p_i^{t+1}.
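One iteration of the (reconstructed) moving equations for a single bee can be sketched as follows; the rule that (v_ij^{t+1})² is the probability of sampling bit '1' follows the usual quantum-inspired convention and is an assumption where the extraction is ambiguous:

```python
import math
import random

def update_bee(v, x, p, pg, e1=0.06, e2=0.03, c1=1/300):
    """Update one bee's quantum velocity and bit position (one generation).

    v: velocity components (the alpha_ij values), x: current bit position,
    p: local optimal bit position, pg: global optimal bit position.
    """
    v_new, x_new = [], []
    for vj, xj, pj, gj in zip(v, x, p, pg):
        theta = e1 * (pj - xj) + e2 * (gj - xj)          # quantum rotation angle
        if pj == xj == gj and random.random() < c1:
            vj1 = math.sqrt(1 - vj ** 2)                 # mutation branch
        else:
            vj1 = abs(vj * math.cos(theta)
                      - math.sqrt(1 - vj ** 2) * math.sin(theta))
        v_new.append(vj1)
        # (v_ij^{t+1})^2 is the probability that the new bit is '1'
        x_new.append(1 if random.random() < vj1 ** 2 else 0)
    return v_new, x_new

v, x = [1 / math.sqrt(2)] * 4, [0, 1, 0, 1]
v1, x1 = update_bee(v, x, p=[1, 1, 0, 0], pg=[1, 0, 0, 1])
```

Note that the rotated component stays in [0, 1] automatically, since v cos θ − √(1 − v²) sin θ has the form cos(φ + θ) with cos φ = v.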
The quantum onlooker bees' quantum positions are based on the selected quantum employed
bees' quantum positions. The selection probability of the k-th quantum employed bee is
calculated by the following equation:

    p_k^t = Υ(x_k) / Σ_{i=1}^{h} Υ(x_i)                                (10)

where Υ(x_k) represents the fitness of x_k, i.e., γ in (3) or η in (4).
At each iteration, assuming that the k-th quantum employed bee is selected as the guide
of the i-th quantum onlooker bee, its quantum rotation angle and velocity are updated
by:

    θ_ij^{t+1} = e_1 (p_kj^t − x_ij^t) + e_2 (p_gj^t − x_ij^t)

    v_ij^{t+1} = √(1 − (v_ij^t)²),   if p_kj^t = x_ij^t = p_gj^t and r_1 < c_1;
    v_ij^{t+1} = | v_ij^t cos θ_ij^{t+1} − √(1 − (v_ij^t)²) sin θ_ij^{t+1} |,   otherwise
After updating the velocity and position of each quantum onlooker bee, its fitness is
computed as in the employed bee process.
When the fitness of a quantum employed bee or a quantum onlooker bee has not improved
within a limit number of iterations, it becomes a quantum scout bee, which searches for
new food sources: its quantum position is re-selected randomly.
From the above analysis, the process of quantum bee colony optimization for multiple
relay selection is as follows:
Step 1: Suppose that the receiver knows the CSI f_1, f_2, ..., f_R and g_1, g_2, ..., g_R.
Step 2: Create an initial quantum bee colony randomly based on quantum coding.
Step 3: For all quantum bees, calculate the fitness (i.e., γ or η).
Step 4: Update each quantum bee's quantum velocity and quantum position through the
evolutionary processes of the three kinds of quantum bees.
Step 5: Update the local optimal position of each quantum bee and the global optimal
position of the whole quantum bee colony.
Step 6: If the maximum iteration count is reached, stop and output the relay selection
result; otherwise, go to Step 3.
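Steps 1-6 can be condensed into a compact loop; the scout trigger, roulette selection (Eq. (10)) and bit-sampling rule below are reconstructed from the description above, and the fitness used here is a toy stand-in for γ or η:

```python
import math
import random

def qbco(fitness, R, h=20, iters=100, e1=0.06, e2=0.03, c1=1/300, limit=10):
    """Skeleton of Steps 1-6 (illustrative reconstruction, not the authors' code)."""
    random.seed(1)                                      # reproducibility of the sketch
    v = [[1 / math.sqrt(2)] * R for _ in range(h)]      # quantum velocities
    x = [[random.randint(0, 1) for _ in range(R)] for _ in range(h)]
    p = [xi[:] for xi in x]                             # local optimal bit positions
    pg = max(p, key=fitness)[:]                         # global optimal bit position
    stall = [0] * h

    def move(i, guide):
        for j in range(R):
            theta = e1 * (guide[j] - x[i][j]) + e2 * (pg[j] - x[i][j])
            if guide[j] == x[i][j] == pg[j] and random.random() < c1:
                v[i][j] = math.sqrt(1 - v[i][j] ** 2)   # mutation branch
            else:
                v[i][j] = abs(v[i][j] * math.cos(theta)
                              - math.sqrt(1 - v[i][j] ** 2) * math.sin(theta))
            x[i][j] = 1 if random.random() < v[i][j] ** 2 else 0

    for _ in range(iters):
        for i in range(h):                              # employed bees: follow own best
            move(i, p[i])
        total = sum(fitness(xi) for xi in x) or 1.0
        for i in range(h):                              # onlookers: roulette of Eq. (10)
            r, acc, k = random.random() * total, 0.0, 0
            while acc + fitness(x[k]) < r and k < h - 1:
                acc += fitness(x[k]); k += 1
            move(i, p[k])
        for i in range(h):                              # greedy update + scout reset
            if fitness(x[i]) > fitness(p[i]):
                p[i], stall[i] = x[i][:], 0
            else:
                stall[i] += 1
            if stall[i] > limit:                        # scout: random restart
                x[i] = [random.randint(0, 1) for _ in range(R)]
                stall[i] = 0
        pg = max(p + [pg], key=fitness)[:]
    return pg

# Toy fitness: prefer selecting relays with positive "gain" (hypothetical values);
# the constant keeps the fitness positive for roulette selection.
gains = [0.5, -0.2, 0.8, -0.1, 0.3]
best = qbco(lambda a: sum(ai * gi for ai, gi in zip(a, gains)) + 1.0, R=5)
```

With h = 20 bees and 100 iterations the number of fitness evaluations is fixed (about 100 × 20 per bee group), independent of 2^R.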
In this section, we compare the simulated γ and η of the proposed QBCO scheme with the
relay-ordering multiple relay selection schemes (complexity R), the exhaustive search
scheme (complexity 2^R) and the single relay selection scheme (complexity 1). In the
simulation, all channels and noises at the relays and the destination are i.i.d.
complex Gaussian random variables with zero mean and unit variance. For QBCO, we set
the maximal generation to 100, h = 20, e_1 = 0.06, e_2 = 0.03 and c_1 = 1/300 (the
complexity is 100 × 20, independent of R).
Firstly, 15 relays with the same power value P_i are adopted in the simulation.
Figure 2 shows the simulation results. We can see that the SNR increases with the
power. From Figure 2(a), we can also see that the three relay-ordering multiple relay
selection schemes perform almost the same: multi Best-Worst Channel Selection performs
the worst and multi SNR-based Selection the best among the three, but QBCO performs
better than all of the relay-ordering schemes and matches exhaustive search. The gap
between QBCO and the other schemes is obvious. It is also obvious that multiple-relay
selection is much more effective than single-relay selection.
Then, with the number of relays set to 20, Figure 2(b) shows the simulation results.
QBCO again performs better than the other relay selection schemes, and comparing with
Figure 2(a), we can see that as the relay number increases, QBCO still finds a
near-optimal solution while the other algorithms fall further behind.
Figure 2. The comparison of SNR versus P(W) for the QBCO scheme and other schemes:
(a) 15 relays; (b) 20 relays.
Now let us consider the power efficiency problem. Figure 3 shows the simulation
results.
Figure 3. The comparison of power efficiency η for the QBCO scheme and other schemes:
(a) 15 relays; (b) 20 relays.
Figure 3(a) considers the case of 15 relays, while Figure 3(b) considers 20 relays.
Among the three relay-ordering multiple relay selection schemes, the Best-Worst scheme
performs the worst and the SNR-based scheme the best, consistent with Figure 2. Our
algorithm, QBCO, performs better than the other relay selection schemes and matches
exhaustive search when R = 15. Comparing Figure 3(a) with Figure 3(b), we can see that
as the relay number increases, the advantage of our algorithm becomes much more
obvious.
From Figures 2 and 3, the differences between the QBCO multiple relay selection scheme
and the relay-ordering schemes that maximize SNR or power efficiency are obvious, and
the advantage of the QBCO-based scheme grows with the relay number.
This paper has proposed two multiple relay selection schemes based on QBCO, maximizing
SNR and power efficiency respectively, for cooperative multiple relay networks. The
proposed schemes show a clear advantage in SNR and power efficiency over the other
schemes in the literature.
References
[1] 3GPP TR 36.814, Further Advancement for E-UTRA Physical Layer Aspects, v 1.5.2, Dec. 2009.
[2] N. Laneman, D. N. C. Tse, and G. W. Wornell, Cooperative diversity in wireless networks: efficient
protocols and outage behavior, IEEE Transactions on Information Theory, 50(2004), 3062-3080.
[3] A. Nosratinia, T. Hunter, and A. Hedayat, Cooperative communication in wireless networks, IEEE
Communications Magazine, 42(2004), 68-73.
[4] V.Sreng, H.Yanik, D.Falconer, Relayer Selection Strategies in Cellular Networks with Peer-to-Peer
Relaying, VTC 2003-Fall. 2003 IEEE 58th, 3(2003):1949-1953.
[5] A. Bletsas, A. Khisti, D. P. Reed, and A. Lippman, A simple cooperative diversity method based on
network path selection, IEEE Journal on Selected Areas in Communications,24(2006), 659-672.
[6] X. Lin and L. Cuthbert, Load Based Relay Selection Algorithm for Fairness in Relay Based OFDMA
Cellular Systems, in Wireless Communications and Networking Conference, 2009. WCNC 2009.IEEE,
2009, 1-6.
[7] H. Eghbali, S. Muhaidat, S. A. Hejazi and Y. Ding, Relay Selection Strategies for Single-Carrier
Frequency-Domain Equalization Multiple relay Cooperative Networks, in IEEE Transactions on
Wireless Communications, 12(2013), 2034-2045.
[8] T. Zhang, S. Zhao, L. Cuthbert and Y. Chen, Energy-efficient cooperative relay selection scheme in
MIMO relay cellular networks, in Proc. of IEEE Int’l Conf. on Commun. Systems (ICCS), 269-273,
Nov.2010.
[9] Y. Jing and H. Jafarkhani, Single and multiple relay selection schemes and their achievable diversity
orders, IEEE Trans. on Wireless Communications, 8(2009), 1414-1423.
[10] J. Kennedy and R. Eberhart, Discrete binary version of the particle swarm optimization, in Proc. IEEE
International Conference on Computational Cybernetics and Simulation, 4104-4108, 1997.
[11] H. Y. Gao, J. L. Cao, Quantum-inspired bee colony optimization algorithm and its application for
cognitive radio spectrum allocation, Journal of Central South University, 43(2012):4743-4749.
[12] Y. Jing and H. Jafarkhani, Network beamforming using relays with perfect channel information,
submitted for publication, 2006.
Fuzzy Systems and Data Mining II 321
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-321
Introduction
1 Corresponding Author: Yi-Lei WANG, College of Mathematics and Computer Science, Fuzhou University,
1. Related works
Speed-up methods for Collaborative Filtering: Most previous work focused on
memory-based CF. [2] designed a similarity metric optimized for hardware to speed up
the k-nearest neighbor computation. [9] used a k-nearest neighbor graph to retrieve the
most similar users or items. Our work, different from the previous two, is a speed-up
method for neural-network-based CF.
Neural-network-based CF: Restricted Boltzmann Machine (RBM) [10] CF was one of the
first neural-network-based CF models. Recently, autoencoder-based CF [12,14], a
subclass of neural-network-based CF, has achieved state-of-the-art performance. But
autoencoder-based CF trained with large sparse targets is often slow because it does
not exploit the sparsity structure of the ratings. Our approach solves this problem.
2. Autoencoders
An autoencoder [3] is a type of neural network used for learning efficient codings. The
aim of an autoencoder is to learn a representation from a set of data. A typical
autoencoder uses a narrow bottleneck layer to force dimensionality reduction of the
data. The network consists of two parts: an encoder, which maps the input to the
bottleneck representation, and a decoder, which reconstructs the input from it.
Given N users, M items and the sparse rating matrix R, the rating r_ij is the rating of
the i-th user on the j-th item. A user profile can be described by a sparse vector u_i,
the i-th row of R; similarly, an item profile can be described by a sparse vector v_j,
the j-th column of R. The goal of collaborative filtering is to complete the sparse
vectors, so we have two autoencoders to complete R:
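As an illustration, a minimal item-side autoencoder (hypothetical dimensions and sigmoid activations; the paper's exact model may differ) maps a sparse item profile v_j to a dense reconstruction:

```python
import math
import random

random.seed(0)
N, K = 6, 3                      # N users, bottleneck width K (illustrative)
W1 = [[random.gauss(0, 0.1) for _ in range(N)] for _ in range(K)]  # encoder weights
W2 = [[random.gauss(0, 0.1) for _ in range(K)] for _ in range(N)]  # decoder weights

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def nn(v):
    """Autoencoder forward pass: encode to the bottleneck, then decode."""
    h = [sigmoid(sum(W1[k][i] * v[i] for i in range(N))) for k in range(K)]
    return [sum(W2[i][k] * h[k] for k in range(K)) for i in range(N)]

v_j = [5, 0, 0, 3, 0, 0]         # sparse column of R: observed ratings of item j
pred = nn(v_j)                   # dense vector of predicted ratings for all users
```

The user-side autoencoder is symmetric, operating on the rows u_i of R instead of its columns.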
5. Experiments
In this section, we test the speed and accuracy of I-CFAE on 2 real world datasets. Movie-
Lens 1M, MovieLens 10M, we randomly select 10% of the ratings for testing and 90%
for training. Prediction Errors are measured by Root Mean Squared Error(RMSE):
324 W.-Z. Tang et al. / A Speed up Method for Collaborative Filtering with Autoencoders
Figure 1. Row density histogram of MovieLens 1M and MovieLens 10M (on log scale)
    RMSE = √( (1/‖R_test‖) Σ_{i=1}^{N} Σ_{j : x_test,i,j ≠ 0} ( nn(x_train,i)_j − x_test,i,j )² )
Because most of the rows in MovieLens 1M and MovieLens 10M are extremely sparse
(<1% non-zeros), our speed-up implementation gives an order-of-magnitude speedup over
I-CFN, and even a 3-4× speedup compared to the dedicated sparse implementation
I-AutoRec [12].
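The speedup exploits exactly this row sparsity: in the first layer only the observed entries of the input contribute, so the dense sum over all N users can be replaced by a sum over the non-zeros. A sketch of the idea (hypothetical sizes; not the authors' implementation):

```python
import math
import random

random.seed(0)
N, K = 1000, 32                                 # users, bottleneck width (hypothetical)
W1 = [[random.gauss(0, 0.01) for _ in range(K)] for _ in range(N)]  # one row per user

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def encode_sparse(ratings):
    """First-layer activations from a sparse profile {user_index: rating}.

    Cost is O(|ratings| * K) rather than O(N * K) for the dense product,
    which is where the speedup on <1%-dense rows comes from.
    """
    h = [0.0] * K
    for i, r in sorted(ratings.items()):        # visit observed entries only
        for k in range(K):
            h[k] += W1[i][k] * r
    return [sigmoid(z) for z in h]

h = encode_sparse({3: 5.0, 40: 3.0, 999: 4.0})
```

The same idea applies in the backward pass: gradients are computed only for the rows of W1 (and the output units) touched by observed ratings.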
Because of the GPU's huge advantage on dense implementations, we must lower the
threshold from T = 0.01·M to T = 0.005·M to obtain good results. Since sparse
computation can only be done on the CPU, data must be transferred between the CPU and
the GPU, so the speedup ratio is lower than on the CPU; it is nevertheless 2-4× faster
than the full dense implementation (I-CFN).
Table 3. Test Set RMSE on MovieLens 1M

Method          Test set RMSE
RBM             0.854
ALS-WR          0.843
LLORMA          0.837
I-CFN           0.838
I-AutoRec       0.831
I-CFAE (ours)   0.836

Table 4. Test Set RMSE on MovieLens 10M

Method          Test set RMSE
RBM             0.825
ALS-WR          0.783
LLORMA          0.794
I-CFN           0.776
I-AutoRec       0.782
I-CFAE (ours)   0.779
MovieLens 10M is a much bigger dataset than MovieLens 1M, so we set much lower weight
decays and learning rates for it. On the MovieLens 10M dataset our model beats strong
baselines such as ALS-WR and even I-AutoRec, and its performance is similar to I-CFN's.
6. Conclusions
References
[1] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean,
M. Devin, et al. Tensorflow: Large-scale machine learning on heterogeneous systems, 2015. Software
available from tensorflow. org, 1, 2015.
[2] J. Bobadilla, F. Ortega, A. Hernando, and G. Glez-de Rivera. A similarity metric designed to speed up,
using hardware, the recommender systems k-nearest neighbors algorithm. Knowledge-Based Systems,
51:27–34, 2013.
[3] L. Deng, M. L. Seltzer, D. Yu, A. Acero, A.-r. Mohamed, and G. E. Hinton. Binary coding of speech
spectrograms using a deep auto-encoder. In Interspeech, pages 1692–1695. Citeseer, 2010.
[4] G. E. Hinton. Training products of experts by minimizing contrastive divergence. Neural computation,
14(8):1771–1800, 2002.
[5] D. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980,
2014.
[6] Y. Koren, R. Bell, C. Volinsky, et al. Matrix factorization techniques for recommender systems. Com-
puter, 42(8):30–37, 2009.
[7] J. Lee, S. Kim, G. Lebanon, and Y. Singer. Local low-rank matrix approximation. ICML (2), 28:82–90,
2013.
[8] G. Linden, B. Smith, and J. York. Amazon.com recommendations: Item-to-item collaborative filtering.
IEEE Internet Computing, 7(1):76–80, 2003.
[9] Y. Park, S. Park, W. Jung, and S.-g. Lee. Reversed cf: A fast collaborative filtering algorithm using a
k-nearest neighbor graph. Expert Systems with Applications, 42(8):4022–4028, 2015.
[10] R. Salakhutdinov, A. Mnih, and G. Hinton. Restricted boltzmann machines for collaborative filtering.
In Proceedings of the 24th international conference on Machine learning, pages 791–798. ACM, 2007.
[11] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Item-based collaborative filtering recommendation
algorithms. In Proceedings of the 10th international conference on World Wide Web, pages 285–295.
ACM, 2001.
[12] S. Sedhain, A. K. Menon, S. Sanner, and L. Xie. Autorec: Autoencoders meet collaborative filtering. In
Proceedings of the 24th International Conference on World Wide Web, pages 111–112. ACM, 2015.
[13] N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: a simple way
to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1):1929–1958,
2014.
[14] F. Strub and J. Mary. Collaborative filtering with stacked denoising autoencoders and sparse inputs. In
NIPS Workshop on Machine Learning for eCommerce, 2015.
[15] Y. Zhou, D. Wilkinson, R. Schreiber, and R. Pan. Large-scale parallel collaborative filtering for the
netflix prize. In International Conference on Algorithmic Applications in Management, pages 337–348.
Springer, 2008.
Fuzzy Systems and Data Mining II 327
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-327
Introduction
1 Corresponding Author: Zhi-Wei GAO, No.110 Dong Guan Zhuang Road, Tianhe District, Guangzhou
510610, Guangdong Province, China; E-mail: Gaozw@ceprei.org.
328 W.-D. Fang et al. / Analysis of NGN-Oriented Architecture for Internet of Things
transmission network (or access network), the Internet of Things can transmit this
information to the users' terminals, such as personal computers, PADs, data servers and
so on, in order to meet the requirements of various applications and thereby realize
the goal of ubiquitous computing.
Currently, the IoT is gradually being applied in many fields, including the smart city
and digital city [1], intelligent transportation [2] and so on. Additionally, the IoT
is studied by many institutes and standardization organizations. Recently, at the
Third-Generation Partnership Project's (3GPP's) Radio Access Network Plenary Meeting
69, it was decided to standardize the Narrow-Band Internet of Things (NB-IoT) [3]. This
standardization focuses on providing improved indoor coverage, support for a massive
number of low-throughput devices, low latency sensitivity, ultra-low device cost, low
device energy consumption and an optimized network architecture.
Although the IoT’s standardizations have been launched, and its applications have
been deployed in many fields, the Next Generation Network (NGN) is converged
heterogeneous network, which has various topologies and standards, multi-networks
co-existing, and superior fault tolerance and resilience. At present, there is no an
objective architecture of IoT for NGN. In this paper, the applications of IoT are firstly
identified and summarized in section 1. Then, in section 2 a more holistic overview of
NGN-oriented architecture in IoT is given. These architectures are divided into four
categories: based on the Ubiquitous Sensor Networks (USN), the Open system
Interconnection (OSI) model and the methodology, as well as converged network-
oriented. Along the way, the pros and cons of proposed architectures are analyzed in
each category qualitatively. Lastly, the conclusions are represented.
The Next Generation Network (NGN) is a broad concept that involves a variety of changes
in how networks are constructed. At present, research on designing and proposing
Internet of Things architectures is mainly classified into four categories: based on
the USN's high-level architecture, referencing the OSI model, according to logic or
semantics, and converged-network-oriented.
1.1. USN-based
The Ubiquitous Network (UN) was defined in Y. 2002 of ITU [4]: “The ability for
persons and/or devices to access services and communicate while minimizing technical
restrictions regarding where, when and how these services are accessed, in the context
of the service(s) subscribed to.”
At present, the industry generally believes that "IoT + Internet" is almost equal to
the ubiquitous network. It is a service network for information perception,
transmission, storage, cognition, decision-making and use between people, objects and
things, based on the needs of individuals and society. The ubiquitous network has
superior environment-awareness, content-awareness and intelligence, and provides
ubiquitous information services and applications for individuals and society.
As an important part of ubiquitous networks, the high-level architecture of the
ubiquitous sensor network was proposed in Y.2221 of the ITU [5]. The architecture shown
in Figure 1 was divided into five parts based on the high-level architecture of the
ubiquitous sensor networks. It included the underlying sensor network, the ubiquitous sensor
Generally, a system architecture based on the Open System Interconnection (OSI) model
is designed with three layers: the Perception Layer, the Network Layer and the
Application Layer. This architecture's design comes mainly from the requirements of
industrial applications; the interfaces between the different layers (data interfaces
or physical interfaces) are seldom mentioned. Wu [6] proposed a new IoT architecture
based on the three layers that better explains the features and connotation of the IoT;
the difference is that the Application Layer is refined into a processing layer, an
application layer and a business layer.
Figure 2. The three-layer IoT architecture: ubiquitous sensing, control and
transmission; diversified services and applications; and generic technologies
(interfaces, middleware).
1.3. Methodology-based
To abstract the heterogeneity of devices, Kiljande [8] presented novel semantic level
interoperability architecture for pervasive computing and IoT. This architecture had
two main principles. The first one was that information and capabilities of devices were
represented with semantic web knowledge representation technologies, and interaction
with devices and the physical world was achieved by accessing and modifying their
virtual representations. Second, for global IoT, it was divided into numerous local
smart spaces, which were managed by a Semantic Information Broker (SIB), which
provided a means to monitor and update the virtual representation of the physical world.
To connect things with each other, or users with the physical world, Pu [9] proposed an
intelligent interaction architecture based on context fusion in the IoT.
In addition, since services in the IoT have four characteristics different from
traditional Internet services, namely environment perception, event-driven operation,
service coordination and initiative execution, Lan [10] proposed an Event-Driven
Service-Oriented Architecture (EDSOA) for the IoT. The EDSOA supports real-time,
event-driven and active service execution. Bergmann and Robinson [11] proposed the
Server-Based Internet-of-Things Architecture (SBIOTA).
The convergence of heterogeneous networks has created huge potential for new business.
To fully realize this potential, a common way to design the architecture is needed, so
converged-network-oriented architectures have become a focus.
Figure 3. Enhanced IMS QoS architecture of converged IoT and 3GPP LTE-A network
Yang [12] proposed an enhanced IMS (IP Multimedia Subsystem) QoS (Quality of Service)
architecture to support the convergence of the IoT and the 3GPP (Third Generation
Partnership Project) LTE-A (Long Term Evolution-Advanced) network. This architecture
can provide flexible services with dynamic requirements to both the IoT and the LTE-A
network. In Figure 3, higher-layer connections among all MTC (Machine Type
Communication) devices can be provided by attaching to fixed or mobile stations. In
addition, Zhang and Liang [13] proposed an architecture for the converged IoT based on
the VN (Vector Network).
From the application point of view, most applied IoT systems use a three-layer
architecture: sensing, transmission and application. Information is sensed in the
front-end networks, which have complex and dynamic topologies such as tree/cluster,
star, peer-to-peer, and mesh topologies. This information is forwarded via the base
station (BS) or sinks into the backbone/core network. Finally, via the edge router
and firewall, the information is used by end-users.
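The three-layer flow above can be sketched as a toy pipeline; every name in this sketch is illustrative and not part of any cited architecture:

```python
# Minimal sketch of the three-layer IoT data path described above
# (sensing -> transmission -> application); all names are illustrative.

def sense(front_end_nodes):
    """Front-end networks (tree/star/mesh, etc.) produce raw readings."""
    return [{"node": n, "value": 20.0 + i} for i, n in enumerate(front_end_nodes)]

def transmit(readings):
    """A base station or sink forwards readings into the backbone/core network."""
    return [dict(r, route="BS->core") for r in readings]

def application(packets):
    """End users consume the data behind the edge router and firewall."""
    return {p["node"]: p["value"] for p in packets}

result = application(transmit(sense(["s1", "s2", "s3"])))
```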
The convergence of heterogeneous networks is the IoT's evolutionary trend, because
2G/3G/4G, WiFi, WSN and ad hoc networks co-exist. In fact, there is little
specialized research along the ITU's technology roadmap for the Internet of Things;
how to achieve communication between persons and things, and between things and
things, which is an important function of IoT adopted by IoT research systems, still
requires effort from all parties.
In this section, we briefly present some key technologies for the design of the IoT
architecture and analyze their necessity.
Cognizing, sensing and controlling in IoT depend on many different types of sensors
and electronic tags. These sensor nodes have the following characteristics:
• Limited or no power supply
• Restricted computing and storage
• Small user interface
• Tiny volume and limited communication range
• Open and diverse application scenarios
These inherent constraints give rise to the following security particularities:
• Resource consumption must be minimized while security performance is maximized.
• WSN deployment exposes more link attacks, ranging from passive eavesdropping to
active interfering.
• In-network processing involves intermediate nodes in end-to-end information
transfer.
• Wireless communication characteristics render traditional wired security schemes
unsuitable.
• Large scale and node mobility make the matter more complex.
• Node addition and failure make the network topology dynamic.
Although there is much research in this field, security methods and algorithms
seldom consider implementation complexity and energy consumption. As mentioned
above, the limited power supply is one of the most important constraints, and the
complexity of an algorithm is directly related to its energy cost. We have to
balance low algorithm complexity against security requirements, so there is still a
long way to go.
Heterogeneity is an important feature of ubiquitous transmission; the diversity of
network standards and topologies is a good example. The ubiquitous transmission of
IoT differs from common digital transmission in that it needs low latency and QoS
guarantees. On one hand, interfaces become the key points of data congestion
because of the scheduling policies and priority management of the different network
standards. On the other hand, the electrical characteristics of the different
interfaces also contribute significantly to forwarding latency.
Interface technology for the ubiquitous network includes not only unified
electrical interface specifications but also optimized interface protocols. After
all, this is not a small problem. As is well known, the Internet does not provide
Quality of Service (QoS) guarantees, although some access networks, such as 3G and
HSDPA (High Speed Downlink Packet Access), can. There is no denying that the
research and standardization of interface technology for IoT's ubiquitous
transmission are important issues under the premise of guaranteeing quality of
service.
3. Conclusions
Many researchers have long focused on the design of system architectures,
especially the IoT architecture. In February 2004, ITU-T SG13 decided to
standardize the Next Generation Network and its architecture. In the last ten
years, many heterogeneous networks have continued to emerge. As the transmission
networks for sensed information, they face the challenges of complex scenarios and
unattended environments in IoT. On the other hand, rational design of the
architecture helps solve the transmission bottleneck. In the front end of an IoT
system, information is collected by many wireless sensor nodes. In a wireless
sensor network, the transmission bottleneck can unbalance the load of the entire
network; meanwhile, the sink nodes' energy is consumed rapidly, which shortens the
lifetime of the entire wireless sensor network. In this paper, we review the
NGN-oriented architecture in IoT and present some conclusions and open research
issues to facilitate the design of IoT systems.
According to the ITU's description, "Interconnect Any Thing" is the expansion of
capacity, services and applications in the next generation network. Therefore, we
recommend that the Internet of Things be adopted into the research field of the
next generation network and be implemented in its technology development roadmap,
relying on its existing research achievements. In the near future, we will further
research the functional architecture, the system framework and the specific
configuration model of the Internet of Things.
Acknowledgment
This work is partially supported by the National Natural Science Foundation of China
(No. 61471346, No. 61302113), the Science and Technology Service Network Program
of Chinese Academy of Sciences (No. kfj-sw-sts-155), the Shanghai Municipal Science
and Technology Committee Program (No. 15DZ1100400), the Shanghai Natural
Science Foundation (No. 14ZR1439700), the National Science and Technology
Infrastructure Program (No. 2015BAH26F00) and the NSFC-Guangdong Joint Fund
(No. U1401257).
References
[1] A. Monzon. Smart cities concept and challenges: Bases for the assessment of smart city projects. In:
Proceeding SMARTGREENS, Lisbon, Portugal, (2015), 1-11.
[2] M. Jiang, H. Liu, L. Niu. The Evaluation Studies of Regional Transportation Accessibility Based on
Intelligent Transportation System: Take the Example in Yunnan Province of China. In: Proceeding
ICITBS, Halong Bay, Vietnam, (2015), 862-865.
[3] J. Gozalvez. New 3GPP Standard for IoT. IEEE VEH TECHNOL MAG. 11 (2016), 14-20.
[4] ITU-T. Recommendation Y. 2002. Overview of ubiquitous networking and of its support in NGN.
Geneva: ITU, (2010).
[5] ITU-T. Recommendation Y.2221. Requirements for support of ubiquitous sensor network (USN)
applications and services in NGN environment. Geneva: ITU, (2010).
[6] M. Wu, T. J. Lu, F. Y. Ling, J. Sun, H. Y. Du. Research on the architecture of Internet of Things. In:
Proceeding ICACTE, Chengdu, China, (2010), V5-484 - V5-487.
[7] W. Fang, L. Shan, Z. Shi, G. Jia, X. Wang. A Spatial Architecture Model of Internet of Things Based on
Triangular Pyramid, Lecture Notes in Electrical Engineering, Springer, 237 (2014), 825-832.
[8] J. Kiljander, A. D'Elia, F. Morandi, P. Hyttinen, J. Takalo-Mattila. Semantic Interoperability Architecture
for Pervasive Computing and Internet of Things. IEEE ACCESS, 2 (2014), 856-873.
[9] H. Pu, J. Lin, F. Liu, L. Cui. An intelligent interaction system architecture of the internet of things based
on context. In: Proceeding ICPCA, Maribor, Slovenia, (2010), 402 – 405.
[10] L. Lan, F. Li, B. Wang, L. Zhang, R. Shi. An Event-Driven Service-Oriented Architecture for the
Internet of Things, In: Proceeding APSCC, Fuzhou, China, (2014), 68 – 73.
[11] N. W. Bergmann, P. J. Robinson. Server-based Internet of Things Architecture. In: Proceeding IEEE
CCNC, Las Vegas, NV, USA, (2012), 360 – 361.
[12] S. Yang, X. Wen, W. Zheng, Z. Lu. Convergence architecture of Internet of Things and 3GPP LTE-A
network based on IMS. In: Proceeding GMC, Shanghai, China, (2011), 1 – 7.
[13] J. Zhang, M. Liang. A New Architecture for Converged Internet of Things. In: Proceeding ITAP,
Wuhan, China, (2010), 1-4.
334 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-334
Abstract. Traditional clustering methods cluster data with a pairwise graph and
usually suffer information loss. In this paper, we propose a novel spectral
clustering method that combines a hypergraph with sample self-representation.
Specifically, the proposed algorithm employs a sample self-representation loss
function based on the ℓ2,1-norm, which is row sparse, to weaken the effect of
noise. A hypergraph regularization term is then imposed to construct the hypergraph
Laplacian, which fully considers the complex similarity relationships of the data.
Experimental results on benchmark datasets indicate that the proposed algorithm
prominently outperforms compared state-of-the-art algorithms, such as SRC and LSR,
in terms of clustering error (CE).
Introduction
1 Corresponding Author: Shi-Chao Zhang, Guangxi Normal University, Guilin, China; E-mail:
zhangsc@gxnu.edu.cn.
S.-C. Zhang et al. / Hypergraph Spectral Clustering via Sample Self-Representation 335
results. In the following, unless otherwise stated, we use the term graph to denote
a simple graph.
The HGSR algorithm can be described in detail as follows. First, we construct a
hypergraph that fully considers the relations among samples and obtain its
hypergraph Laplacian matrix. Second, we conduct row-sparse self-representation for
all samples by utilizing an ℓ2,1-norm loss function, and meanwhile put the
hypergraph Laplacian into the regularization to preserve the local structure of
each sample. In this way, similar samples are clustered into the same cluster.
Finally, we obtain an affinity matrix with which to conduct clustering.
The contributions of this work are summarized as follows:
• By imposing the hypergraph in the regularization, more information is introduced
into the clustering model. Specifically, the hypergraph Laplacian utilizes
higher-order relationships to preserve the local structure and thus improves the
clustering performance.
• HGSR utilizes the ℓ2,1-norm, i.e., ‖X − XZ‖_{2,1}, to measure the sample
self-representation error. It is row sparse, and the representation coefficients of
every sample depend on all of the other samples, so HGSR is robust to noise and
outliers.
• Experimental results on benchmark datasets (face image clustering, motion
segmentation, etc.) show that HGSR surpasses state-of-the-art algorithms such as
LSR and SSC.
1. Proposed algorithm
1.1. Notations
Throughout the paper, lowercase letters and bold italic capital symbols are
respectively used to denote vectors and matrices. tr(A) denotes the trace of a
square matrix A. A^T and A^{-1} respectively represent the transpose and the
inverse of A. [A]_j represents the j-th column of A, and [A]^i the i-th row of A.
‖A‖_1 = Σ_j Σ_i |A_ij|, ‖A‖_F and ‖A‖_{2,1} respectively denote the ℓ1-norm, the
Frobenius norm and the ℓ2,1-norm (Σ_i ‖[A]^i‖_2, the sum of the ℓ2-norms of the
rows) of A. Rank(Z) and ‖Z‖_* respectively denote the rank and the sum of the
singular values (the nuclear norm) of Z.
1.2. Hypergraph
A sample hypergraph of some animals from the Zoo dataset [15] has the hyper-edges:

E = { e1 = {v1, v2, v9, v10}, e2 = {v5, v8, v12}, e3 = {v5, v6, v7, v11},
      e4 = {v2, v3, v4, v6, v10} }

with the corresponding incidence matrix H (rows indexed by vertices, columns by
hyper-edges):

        e1  e2  e3  e4
  v1     1   0   0   0
  v2     1   0   0   1
  v3     0   0   0   1
  v4     0   0   0   1
  v5     0   1   1   0
  v6     0   0   1   1
  v7     0   0   1   0
  v8     0   1   0   0
  v9     1   0   0   0
  v10    1   0   0   1
  v11    0   0   1   0
  v12    0   1   0   0

D_e is the diagonal edge-degree matrix. In this paper, the hyper-edge weight matrix
W_H is a diagonal matrix with all-one diagonal elements.
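As a concrete check, the incidence matrix above and the degree matrices that appear later in Eq. (2) can be built directly; this is a small NumPy sketch of our own, with variable names chosen to mirror the text:

```python
import numpy as np

# Incidence matrix H of the Zoo-animal hypergraph above:
# rows are vertices v1..v12, columns are hyper-edges e1..e4.
H = np.array([
    [1, 0, 0, 0],  # v1
    [1, 0, 0, 1],  # v2
    [0, 0, 0, 1],  # v3
    [0, 0, 0, 1],  # v4
    [0, 1, 1, 0],  # v5
    [0, 0, 1, 1],  # v6
    [0, 0, 1, 0],  # v7
    [0, 1, 0, 0],  # v8
    [1, 0, 0, 0],  # v9
    [1, 0, 0, 1],  # v10
    [0, 0, 1, 0],  # v11
    [0, 1, 0, 0],  # v12
], dtype=float)

W_H = np.eye(4)                      # all hyper-edge weights set to 1
Dv = np.diag(H @ W_H @ np.ones(4))   # vertex degrees d(v) = sum_e w(e) h(v, e)
De = np.diag(H.sum(axis=0))          # edge degrees delta(e) = |V(e)|
```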
Recent studies [10] indicate that the local structure of data is beneficial to
clustering analysis. The local geometric structure of the data tends to be the
local neighborhood relationship of the samples, which can be represented by the
k-nn graph of each sample. Intuitively, similar samples should have similar
representation coefficients. We combine this idea with the hypergraph and define
the hypergraph-based regularization as follows:
Ω(Z) = (1/2) Σ_{e∈E} Σ_{x_i,x_j∈V(e)} [w(e) h(x_i,e) h(x_j,e) / δ(e)]
       ‖z_i/√d(x_i) − z_j/√d(x_j)‖_2^2
     = tr(Z L̂_H Z^T) = tr(Z (I_{|V|} − D_v^{−1/2} H W_H D_e^{−1} H^T D_v^{−1/2}) Z^T)   (2)

where W_H is the weight matrix of the hyper-edges (we set the weight of each
hyper-edge to 1), Z is the representation coefficient matrix, H ∈ R^{|V|×|E|} is
the incidence matrix of the hypergraph, L̂_H is the hypergraph Laplacian matrix,
δ(e) and d(x) are the edge and vertex degrees, and V(e_i) denotes the vertices
incident to the hyper-edge e_i. The regularization Ω(Z) ensures that similar
samples x_i and x_j have similar or equal representation coefficients z_i and z_j.
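The equality in Eq. (2) between the pairwise sum and the trace form can be verified numerically. The sketch below is our own illustration: it reuses the sample Zoo hypergraph, builds the normalized hypergraph Laplacian, and checks both sides on a random coefficient matrix:

```python
import numpy as np

# Sample-hypergraph incidence matrix (vertices v1..v12, hyper-edges e1..e4).
H = np.array([
    [1,0,0,0],[1,0,0,1],[0,0,0,1],[0,0,0,1],[0,1,1,0],[0,0,1,1],
    [0,0,1,0],[0,1,0,0],[1,0,0,0],[1,0,0,1],[0,0,1,0],[0,1,0,0],
], dtype=float)
W_H = np.eye(4)                     # unit hyper-edge weights
dv = H @ W_H @ np.ones(4)           # vertex degrees
de = H.sum(axis=0)                  # edge degrees

# Normalized hypergraph Laplacian from Eq. (2).
L_hat = (np.eye(12)
         - np.diag(dv ** -0.5) @ H @ W_H @ np.diag(1.0 / de)
           @ H.T @ np.diag(dv ** -0.5))

Z = np.random.default_rng(1).standard_normal((3, 12))  # columns z_i per vertex
trace_form = np.trace(Z @ L_hat @ Z.T)

# Pairwise form of Eq. (2), summed over vertices incident to each hyper-edge.
pairwise = 0.0
for e in range(4):
    verts = np.flatnonzero(H[:, e])
    for i in verts:
        for j in verts:
            d = Z[:, i] / np.sqrt(dv[i]) - Z[:, j] / np.sqrt(dv[j])
            pairwise += W_H[e, e] / de[e] * (d @ d)
pairwise *= 0.5
```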
In general, we expect the clustering model to satisfy two properties: the sample
self-representation coefficients of similar data points should also be similar, and
the model should be robust to noise and outliers. However, existing methods are
usually sensitive to noise and outliers, and some relations between samples are
lost. For instance, the model in LSR [8] is:

min_Z J(Z) = ‖X − XZ‖_F^2 + λ tr(ZZ^T)
           = ‖X − XZ‖_F^2 + (λ/2n) Σ_{i=1}^{n} Σ_{j=1}^{n} ‖z_i − z_j‖_2^2
             + (λ/n) ‖Z^T e‖_2^2                                           (3)
where e is a vector whose elements are all one. This model assigns equal weights to
all representation coefficients, so whether the representation coefficients are
similar to each other is neglected. Since the ℓ2,1-norm is robust to outliers [11],
we utilize it as the loss function, and the hypergraph Laplacian is then used to
constrain the representation coefficient matrix Z, i.e., Eq. (2). This ensures that
the representation coefficients of similar samples are also similar. Finally, the
objective function of HGSR is
min_Z J(Z) = ‖X − XZ‖_{2,1} + λ tr(Z L̂_H Z^T)                              (4)
where L̂_H is the hypergraph Laplacian of the hypergraph G_H. Since the loss
function of Eq. (4) is not quadratic, outliers become less important than under
‖X − XZ‖_F^2. With the help of the hypergraph-based trace-norm constraint on Z,
local information is introduced into the model to ensure that the representation
coefficients of similar samples are similar, which ultimately improves the
clustering performance.
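The paper's own solver is not shown in this excerpt, but an objective of the form of Eq. (4) can be minimized with a standard iteratively reweighted least squares (IRLS) scheme: the ℓ2,1 loss is majorized by a row-reweighted Frobenius loss, which turns each step into a Sylvester equation. The sketch below is our own illustration under that assumption; it requires X^T G X to be nonsingular (e.g., d ≥ n with full-column-rank X):

```python
import numpy as np
from scipy.linalg import solve_sylvester

def hgsr_irls(X, L_hat, lam=0.1, iters=20, eps=1e-8):
    """Sketch: min_Z ||X - XZ||_{2,1} + lam * tr(Z L_hat Z^T) via IRLS.

    Each step solves the Sylvester equation
        (X^T G X) Z + lam * Z L_hat = X^T G X,
    where G holds the l2,1 row weights 1 / (2 ||row_i(X - XZ)||_2).
    """
    n = X.shape[1]
    Z = np.zeros((n, n))
    for _ in range(iters):
        E = X - X @ Z
        g = 1.0 / (2.0 * np.maximum(np.linalg.norm(E, axis=1), eps))
        A = X.T @ (g[:, None] * X)            # X^T G X
        Z = solve_sylvester(A, lam * L_hat, A)
    return Z
```

For a positive semi-definite L_hat and positive definite X^T G X, each Sylvester system has a unique solution, so the iteration is well defined.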
As in SRC and related methods, after the optimal coefficient matrix Z* is obtained,
we use the following affinity matrix to conduct spectral clustering [12]:

W = (|Z*| + |Z*^T|) / 2                                                     (5)
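Given Z*, the clustering step of Eq. (5) is standard. The sketch below is our own toy example: a block-diagonal Z* stands in for an HGSR solution, and the symmetrized affinity is fed to scikit-learn's spectral clustering:

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Toy coefficient matrix: two self-representing groups of three samples each.
Z_star = np.block([
    [np.full((3, 3), 0.9), np.zeros((3, 3))],
    [np.zeros((3, 3)), np.full((3, 3), 0.8)],
])
# Eq. (5), plus a small constant so the affinity graph stays connected.
W_aff = 0.5 * (np.abs(Z_star) + np.abs(Z_star.T)) + 0.01
labels = SpectralClustering(
    n_clusters=2, affinity="precomputed", random_state=0,
).fit_predict(W_aff)
```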
2. Experimental Analyses
We compared HGSR with recent graph-based spectral clustering algorithms such as LSR
and SSC on motion segmentation (Hopkins155 [7]), face clustering (Extended Yale
Face Database B [13] and ORL [14]) and animal clustering (Zoo dataset [15]). All of
these datasets are commonly used benchmarks for evaluating spectral clustering
algorithms. The details of the datasets are as follows:
• Hopkins155 [7] includes 155 video sequences. Every sequence contains two or three
motions, and every sequence is a clustering task.
Table 2 shows the clustering errors on Extended Yale Face B, ORL and Zoo. The
improvement achieved by HGSR over the other methods is noteworthy. It can also be
seen that the hypergraph-based HGSR is better than the graph-based GSR. On each
dataset, the clustering error of the graph-based GSR is smaller than that of the
other four graph-based methods, which indicates that the ℓ2,1-norm is more robust
to noise than the F-norm.
Table 2. The CE (%) achieved by each algorithm on Extended Yale Face B, ORL and Zoo.
3. Conclusion
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China
(Grants No: 61263035, 61573270, and 61672177), the China Key Research Program
(Grant No: 2016YFB1000905); the Guangxi Collaborative Innovation Center of Multi-
Source Information Integration and Intelligent Processing; Innovation Project of
Guangxi Graduate Education under grants YCSZ2016046 and YCSZ2016045.
References
[1] A. Jain, M. Murty, P. Flynn, Data clustering: a review. ACM Computing Surveys, 31 (1999), 264-323.
[2] F. Zhao, L. Jiao, H. Liu, et al. Spectral clustering with eigenvector selection based on entropy ranking.
Neurocomputing, 73 (2010), 1704-1717.
[3] J. A. Hartigan, M. A. Wong. A k-means clustering algorithm. Applied Statistics, 28 (1979), 100-108.
[4] E. Elhamifar, R. Vidal. Sparse subspace clustering. In CVPR, 2009, 2790–2797.
[5] C. Y. Lu, H. Min, Z. Q. Zhao, et al. Robust and efficient subspace segmentation via least squares
regression. In ECCV, 2012, 347–360.
[6] G. Liu, Z. Lin, S. Yan, et al. Robust recovery of subspace structures by low-rank representation. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 35 (2013), 171–184.
[7] H. Hu, Z. Lin, J. Feng, et al. Smooth representation clustering. In CVPR, 2014, 3834-3841.
[8] S. R. Bulo, M. Pelillo. A Game-Theoretic Approach to Hypergraph Clustering. IEEE Transactions on
Pattern Analysis & Machine Intelligence, 35 (2013), 1312-1327.
[9] J.B. MacQueen. Some Methods for Classification and Analysis of Multi Variate Observations. Berkeley
Symposium on Math, 1967, 281-297.
[10] J. Sivic, B. C. Russell, A. A. Efros, et al. Discovering Objects and Their Location in Images. In ICCV, 1
(2005), 370-377.
[11] X. Zhu, L. Zhang, Z. Huang. A sparse embedding and least variance encoding approach to hashing.
IEEE Transactions on Image Processing, 23 (2014), 3737-3750.
[12] U. von Luxburg. A tutorial on spectral clustering. Statistics & Computing, 17 (2007), 395-416.
[13] Y. Gao, M. Wang, D. Tao, et al. 3-D object retrieval and recognition with hypergraph analysis. IEEE
Transactions on Image Processing a Publication of the IEEE Signal Processing Society, 21 (2012),
4290-303.
[14] M. R. Franjoine, J. S. Gunther, M. J. Taylor. Pediatric Balance Scale: a modified version of the Berg-
Balance Scale for the school age child with mild to moderate motor impairment. Pediatric Physical
Therapy, 15(2003), 114-128.
[15] X. Zhu, H. I. Suk, S. W. Lee, et al. Subspace regularized sparse multi-task learning for multi-class
neurodegenerative disease identification. IEEE Transition Biomed Engineering, 63 (2016), 607-618.
Fuzzy Systems and Data Mining II 341
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-341
Abstract. In order to improve safety risk early-warning management for metro
construction, we first built early-warning indicators covering four aspects: human,
machinery, environment and management. We then quantified the early-warning
indicators using 12 metro construction projects and the Delphi method. Third, based
on factor analysis, we reduced the 30 early-warning indicators to 7. Finally, the
obtained factors were used as the input of a neural network, a BP_Adaboost warning
model was established, and subway construction safety risk early warning was
performed. The results show that optimizing the input of the BP_Adaboost neural
network early-warning model through factor analysis not only raised the speed of
warning but also improved the precision of subway construction safety risk early
warning.
Introduction
The rapid development of urban rail transit promotes the development of the city
and improves the efficiency of business agglomeration. However, the metro
construction process is uncertain, hard to predict and complex. Coupled with
inadequate understanding of subway construction safety risks and an imperfect early
warning management system, landslides and other safety accidents occur frequently.
Therefore, building a feedback-based safety risk early warning index system and a
set of operational early warning methods is an effective measure to strengthen
metro construction safety control.
In 2000, J. Reilly expounded safety risk management in the tunnel construction
process [1]; in 2002, M. H. Faber applied a safety evaluation method to the
construction process [2]. Metro construction safety risk assessment has also been
based on the hierarchical analysis method [3-5], on factor analysis and BP neural
networks [6], and on the fuzzy comprehensive evaluation method.
* Corresponding Author: Hong-De WANG, School of Civil & Safety Engineering, Dalian Jiaotong
University; Tunnel & Underground Structure Engineering Center, Dalian Jiaotong University, Dalian,
Liaoning, China. E-mail: whdsafety@126.com.
342 H.-D. Wang et al. / Safety Risk Early-Warning for Metro Construction
Based on system theory, the safety accidents were analyzed, and the safety risk
factors were obtained for the four aspects of human, machine, environment and
management [9-10]. According to the metro construction technology standards [11],
the metro construction safety management standards [12] and expert engineering
experience, a metro construction safety risk early warning index system based on
these four aspects was established, yielding 30 warning indicators [3], as shown in
Table 1. The warning levels include severe alarm (4), moderate alarm (3), light
alarm (2) and no alarm (1).
2. Factor analysis
First, the reverse early warning indicators were converted into positive
indicators: each reverse index x_i was transformed into a positive index x'_i by
x'_i = 1/x_i. The statistical software SPSS 19 was used for the analysis. By
Bartlett's sphericity test and the KMO test, the measure value was greater than 0.6
and the significance level was 0, so there was strong correlation among the early
warning indicators, which could be reduced by factor analysis. As shown in Table 2,
there were 7 common factors, whose cumulative contribution to the sample variance
was 94.195% in all. The 30 indicators of metro construction warning were thus
reduced to 7 core indicators, which are able to represent the information of all
early warning indicators.
Table 2. Total variance explained

Factor | Initial eigenvalue (Total / Variance % / Cumulative %) | Extraction sums of squared loadings (Total / Variance % / Cumulative %) | Rotation sums of squared loadings (Total / Variance % / Cumulative %)
1 | 14.004 / 46.681 / 46.681 | 14.004 / 46.681 / 46.681 | 10.565 / 35.215 / 35.215
2 | 4.321 / 14.402 / 61.083 | 4.321 / 14.402 / 61.083 | 5.588 / 18.626 / 53.841
3 | 3.010 / 10.033 / 71.116 | 3.010 / 10.033 / 71.116 | 3.536 / 11.787 / 65.629
4 | 2.290 / 7.634 / 78.750 | 2.290 / 7.634 / 78.750 | 2.305 / 7.683 / 73.312
5 | 2.069 / 6.895 / 85.645 | 2.069 / 6.895 / 85.645 | 2.248 / 7.494 / 80.806
6 | 1.428 / 4.760 / 90.405 | 1.428 / 4.760 / 90.405 | 2.141 / 7.136 / 87.942
7 | 1.137 / 3.790 / 94.195 | 1.137 / 3.790 / 94.195 | 1.876 / 6.254 / 94.195
8 | 0.851 / 2.836 / 97.031 | -- | --
F_i = Σ_{j=1}^{n} C_ij x_j                                                  (1)

where F_i is the i-th extracted common factor, C_ij is the factor score
coefficient, and x_j is the standardized value of the original variable.
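The same pipeline (reverse-index transform, standardization, extraction of 7 common factors, and factor scores as in Eq. (1)) can be sketched with scikit-learn instead of SPSS. The data below are random stand-ins for the Delphi-graded project scores, and the choice of reverse columns is illustrative:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.random((12, 30)) * 4.0 + 1.0   # 12 projects x 30 indicators (toy data)

reverse_cols = [0, 5]                  # illustrative "reverse" indicators
X[:, reverse_cols] = 1.0 / X[:, reverse_cols]   # x'_i = 1 / x_i

Xs = StandardScaler().fit_transform(X)          # standardize before factoring
fa = FactorAnalysis(n_components=7, random_state=0).fit(Xs)
F = fa.transform(Xs)                   # factor scores, as in Eq. (1)
```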
A BP neural network can realize a nonlinear mapping from input to output, but it
sometimes lacks simple and effective parameter settings, the algorithm is not
stable, and a single BP network easily falls into local optima, especially with a
large amount of training data. In order to speed up convergence, improve learning
efficiency and avoid local optima, 10 BP networks were used. The structure of each
BP neural network was 10-7-1: 10 input nodes, 7 hidden-layer nodes and 1 output
node. Each BP neural network was trained 20 times.
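An ensemble of BP networks combined in AdaBoost fashion can be sketched as follows. scikit-learn's MLPClassifier does not accept sample weights, so this sketch uses weighted resampling instead; the network sizes and all hyper-parameters are illustrative, not the paper's:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def bp_adaboost_fit(X, y, n_nets=10, seed=0):
    """AdaBoost-style ensemble of small MLPs ("BP networks") via resampling."""
    rng = np.random.default_rng(seed)
    n = len(X)
    w = np.full(n, 1.0 / n)
    nets, alphas = [], []
    for _ in range(n_nets):
        idx = rng.choice(n, size=n, replace=True, p=w)  # weight-proportional resample
        net = MLPClassifier(hidden_layer_sizes=(7,), solver="lbfgs",
                            max_iter=500, random_state=seed).fit(X[idx], y[idx])
        miss = net.predict(X) != y
        err = min(max(np.sum(w * miss), 1e-10), 1.0 - 1e-10)
        alpha = 0.5 * np.log((1.0 - err) / err)
        w = w * np.exp(alpha * miss)                    # up-weight mistakes
        w = w / w.sum()
        nets.append(net)
        alphas.append(alpha)
    return nets, np.array(alphas)

def bp_adaboost_predict(nets, alphas, X):
    """Weighted vote of the member networks."""
    classes = np.unique(np.concatenate([net.classes_ for net in nets]))
    votes = sum(a * (net.predict(X)[:, None] == classes)
                for net, a in zip(nets, alphas))
    return classes[np.argmax(votes, axis=1)]
```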
Table 4. Item score
Item 1 2 3 4 5 6 7 Alarm level
1 -0.92460 -0.56238 -1.05038 -0.81879 -0.51597 -0.30275 -0.49870 1
2 0.30931 0.34129 -0.72078 -0.90587 0.24190 -1.81012 0.57436 3
3 0.13676 0.22341 -0.42382 0.18261 3.04584 0.23712 -0.11627 3
4 -0.68143 0.30668 -0.97917 0.94920 -0.50854 -0.70859 0.61760 2
5 0.01271 -0.22087 0.82655 -0.00148 -0.21424 0.40999 2.80254 3
6 -0.18278 -0.23775 0.81107 0.92725 -0.26456 0.19955 -0.0713 2
7 0.15807 -0.08450 1.64599 1.67091 -0.02558 -1.06484 -0.97398 2
8 2.58489 -1.50741 -0.57458 -0.13913 -0.46455 0.46719 -0.33454 4
9 -1.07828 -0.49040 -0.46676 0.35577 -0.07380 2.14917 -0.28563 1
10 0.91079 2.75912 -0.21622 -0.13125 -0.71299 0.80764 -0.40766 4
H.-D. Wang et al. / Safety Risk Early-Warning for Metro Construction 345
Projects 5 and 6 were the test samples; the other items were training samples. The
output values were produced by the BP neural network. The prediction results are
shown in Table 5 and Figure 1.
4. Conclusion
For metro construction projects under construction and completed in large and
medium-sized cities in China, the original samples were surveyed, the indexes were
graded using the Delphi method, the 30 original indicators were reduced in
dimension through factor analysis, and the resulting indexes were used for
BP_Adaboost neural network training and testing. The conclusions are as follows:
Based on the factor analysis method, the 30 warning indicators of the metro
construction project were reduced to 7 core early warning indicators, which serve
as the input of the BP_Adaboost neural network early warning model.
A metro construction safety risk early warning model based on the BP_AdaBoost
network realizes the classification of nonlinear indexes. Compared with the plain
BP neural network algorithm, it has better anti-noise ability, smaller error
against the actual values, and higher prediction accuracy.
References
[1] J. J. Reilly. Management Process for complex underground and tunneling Projects. Tunneling &
Underground Space Technology. 2000:31-44.
[2] M. H. Faber, J. D. Sørensen. Indicators for inspection and maintenance planning of concrete
structures. Structural Safety, 24 (2002), 377-396.
346 H.-D. Wang et al. / Safety Risk Early-Warning for Metro Construction
[3] Ren-hui Liu, Bo Yu, Zhen Jin. Study on Index System of Safety Risk Evaluation for Subway
Construction Based on Interval Estimation. Prediction.2012, 31 (2): 62-66.
[4] Ghosh Sid, Jintanapakanont Jakkapan. Identifying and assessing the critical risk factors in an
underground rail project in Thailand: A factor analysis approach. International Journal of Project
Management, 2004, 22(8):633-643.
[5] Shapira A, Simcha M. AHP –based weighting of factors affecting safety on construction sites with tower
cranes. Journal of Construction Engineering and Management, 2009, 135(4):307-318.
[6] Fan Chen, Hong-tao Xie. Subway Construction Safety Early Warning Research Based on Factor
Analysis and BP Network. China Safety Science Journal.2012,08:85-91.
[7] Zheng Xin, Hai-Ma Feng. Metro Construction Safety Risk Assessment Based on the Fuzzy AHP and the
Comprehensive Evaluation Method. Applied Mechanics and Materials, 2014, Vol.3307 (580),
pp.1243-1248.
[8] M. R. Hallowell, J. A. Gambatese. Activity-based safety risk quantification for concrete formwork
construction. Journal of Construction Engineering and Management, 2009, 135(10): 990-998.
[9] Sheng-Li Zhu, Wen-Bin Wang, Wei-Ning Liu. Risk management of metro engineering construction.
Urban Express Rail Transit, 2008, 01: 56-60.
[10] Feng Lv, Research on construction risk warning in mountain areas large section highway tunnel. Chong
qing Jiaotong university. 2010.
[11] GB50299-2003, Underground railway engineering construction and acceptance specification.
[12] GB/T50326-2001, Construction project management norms.
[13] Wei-Ke Chen, Xing-Hua Wang. Design and analysis of metro construction disaster early warning index
system. Urban Rail Transit.2007,10:25-29.
[14] Li Xue-Mei. Study on the early warning index system of construction safety risk of subway
engineering . Hubei. Huazhong University of Science and Technology.2011.
[15] Hong-Lin Wang. Study on construction risk management of urban rail transit project. China Mining
University. 2014.
[16] Hai-Li Yu. Safety risk analysis and evaluation of engineering construction based on human factors.
Wuhan: Huazhong University of Science and Technology, 2012.
Fuzzy Systems and Data Mining II 347
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-347
Introduction
As the core part of tax inspection, tax inspection cases-choice is the method by
which tax authorities select objects and projects for inspection. Cases-choice uses
a manual mode and a computer-analysis mode, based on the present tax rules and
methodologies [1]. Its informatization has evolved from an electronic stage that
simulated manual operations to the current information stage, which includes the
general management of tax inspection. However, the tax inspection information
system cannot manage and integrate the accumulated massive basic data, which
include the basic information and filing information of taxpayers [2]. Therefore,
it is necessary to apply data mining technology to extract credible and valuable
information from massive random data.
At present, some local tax departments in China have tried to utilize related data
mining technologies, such as data warehouses, to carry out inspection cases-choice.
A comparison of the C5.0 decision tree model with binary-classification logistic
regression shows that the decision tree model can greatly improve the efficiency
and effect of inspection cases-choice work [3]. For
1 Corresponding Author: Jing ZHUO, University of Macau, Faculty of Law, Avenida da Universidade,
Taipa, Macau, China; E-mail: jzhuo@outlook.com.
348 J.-H. She and J. Zhuo / The Method Study on Tax Inspection Cases-Choice: Improved SVM
another example, the significant effects of data mining technology on tax
inspection cases-choice have been studied with the following methods:
Self-Organizing Maps (SOM) [4, 5], association rules [6], the combination of
Support Vector Machines (SVM) and SOM [1, 7], and the Generalized Regression Neural
Network (GRNN) model [8], among others.
The above literature proves that data mining, as a technology for tax inspection
cases-choice, is clearly better than conventional mathematical statistics methods.
However, when these data mining technologies are applied to the cases-choice
problem on vector-mode data, their biggest disadvantage is that they are limited by
the data type and ignore the mutual relations among the indexes, or
characteristics, of inspection cases-choice. Moreover, in the actual operation of
tax inspection cases-choice, researchers are always concerned with the temporal
dynamics of the alternative cases, whereas SVM often lacks the dynamic features
that the data should have.
s.t. y_i (w^T x_i + b) ≥ 1 − ξ_i,
     ξ_i ≥ 0, i = 1, …, l.                                                  (1)
C is the penalty parameter: the bigger C is, the larger the penalty on
misclassification. The so-called maximum margin principle is to maximize the
distance between the two planes w^T x + b = 1 and w^T x + b = −1 on either side of
the separating plane w^T x + b = 0. Since the number of variables in optimization
problem (1) is related to the dimensionality of the sample points x_i, (1) is not
solved directly; instead, its dual problem is solved. The dual problem of
optimization problem (1) is as follows:
min_α (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j y_i y_j x_i^T x_j − Σ_{i=1}^{l} α_i
s.t. Σ_{i=1}^{l} α_i y_i = 0,
     0 ≤ α_i ≤ C, i = 1, …, l.                                              (2)
Obviously, (2) is a convex quadratic programming problem, so any convex quadratic
programming method can be used to work out its optimum solution α*; w* and b* are
then recovered through the relations:
w* = Σ_{i=1}^n α_i* y_i x_i,   b* = y_j − Σ_{i=1}^n α_i* y_i (x_i · x_j)        (3)
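The dual (2) and the recovery relations (3) can be checked numerically. The sketch below is illustrative only: it fits scikit-learn's SVC on tiny synthetic data (not the paper's tax data) and recovers w* and b* from the fitted dual coefficients.

```python
import numpy as np
from sklearn.svm import SVC

# Toy linearly separable data (illustrative only).
X = np.array([[0., 0.], [0., 1.], [1., 0.],
              [3., 3.], [3., 4.], [4., 3.]])
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=10.0).fit(X, y)

# dual_coef_ stores alpha_i* y_i for the support vectors, so relation (3)
# w* = sum_i alpha_i* y_i x_i becomes a single matrix product.
w = (clf.dual_coef_ @ clf.support_vectors_).ravel()
b = clf.intercept_[0]

# The recovered (w*, b*) must reproduce the library's decision function.
assert np.allclose(X @ w + b, clf.decision_function(X))
```

The same check works with any kernel by replacing the dot products with kernel evaluations, as in (8) below.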
s.t.  Σ_{i=1}^n α_i y_i = 0,
      0 ≤ α_i ≤ C,  i = 1, ⋯, n.        (7)
Here, K(X_i, X_j) = ⟨Φ(X_i), Φ(X_j)⟩ is the kernel function of the matrices X_i
and X_j. The right-hand side is an inner product of matrices, and Φ is a mapping of
the matrix space that maps the sample points X from the matrix space into a
higher-dimensional matrix feature space. In the higher-dimensional matrix feature
space, all the sample points can be linearly separated, so the final problem is
converted to searching for a linear decision function in the feature space. The
final decision function is:
f(X) = sgn(g(X)) = sgn( Σ_{i=1}^n α_i* y_i K(X_i, X) + b* )        (8)
Here, α_i* and b* are the optimum solutions of optimization problem (7).
• Kernel Function of the Matrix
Observing (7) and (8), the mappings Φ always appear in pairs, in the form of inner
products, so defining the mapping Φ reduces to defining the kernel function K.
Some scholars have given the following definition of a matrix kernel function [12]:
K(X, Y) = K₁(X, Y) K₂(X, Y)        (9)
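That a product of two kernels such as (9) is itself a valid kernel follows from the Schur product theorem: the elementwise product of two positive semidefinite Gram matrices is positive semidefinite. A minimal numerical check, using two illustrative RBF kernels over flattened matrix samples (an assumption for illustration, not necessarily the kernels K₁, K₂ of [12]):

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_gram(V, gamma=0.5):
    """Gram matrix of an RBF kernel over row-vector samples V."""
    d2 = ((V[:, None, :] - V[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Six matrix-mode samples, flattened two different ways to mimic K1 and K2
# acting on a matrix and on its transpose (illustrative assumption).
samples = rng.normal(size=(6, 4, 7))
G1 = rbf_gram(samples.reshape(6, -1))
G2 = rbf_gram(samples.transpose(0, 2, 1).reshape(6, -1))

# Elementwise (Schur) product of the two Gram matrices, as in (9).
G = G1 * G2

# The Schur product theorem guarantees G is again positive semidefinite,
# i.e. K = K1 * K2 is a valid kernel.
eigvals = np.linalg.eigvalsh(G)
assert eigvals.min() > -1e-10
```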
X̃ = [ (1/2)I₁    X  ;
       0     (1/2)I₂ ]        (11)
The matrices I₁ and I₂ are the unit matrices whose sizes are, respectively, the same
as those of the matrices X and X^T.
Combining (7) with (9), a new kind of classification model based on matrix data is
obtained. This model can be directly applied to the inspection cases-choice problem
of tax-payment credit. Finally, (8) is used to predict the tax-payment credit level
of new sample points.
3. Case Experiment
In this paper, the inspection of value-added tax (VAT) is selected for the
experimental study of inspection cases-choice. The data of this study are mainly
drawn from the VAT Payment Return, the Annex of the VAT Payment Return, the
Input/Output Tax Amount List of VAT, the Special Tax Payment Letter, other
tax-declaration information, and the balance sheet and profit statement over three
consecutive years.
By referring to the indexes required by the practical experience of tax inspection
and internal documents, as well as the achievements of the research literature, the
indexes² of VAT inspection cases-choice are determined (Table 1).
Table 1. Indexes of VAT Inspection Cases-Choice
The data of this study come from the VAT tax inspection records of a tax bureau.
In total, 140 commercial enterprises were randomly inspected. The inspection found
that 60 of them were non-honest taxpayers, and the other 80
² The index selection is mainly based on the indexes from Chen (2004) [4]. The VAT cases-choice
indexes are strictly selected by the stepwise discriminant analysis method. In fact, the selection of
each kind of cases-choice index differs somewhat and is of great significance in cases-choice;
however, it is set aside here, since this study is aimed only at the data mining technology.
enterprises were honest taxpayers. According to the cases-choice indexes in Table 1,
the related ratio indexes need to be standardized before the kernelled support
matrix machine model is applied.
The experiment was run in the MATLAB 2010a environment; the SVM and toolbox
functions were called to realize the system simulation and tests. In the collected
data set, 20 sample points of each class (honest and non-honest) were randomly
selected as the training set, and the remaining sample points were taken as the test
set, i.e., there were 40 sample points in the training set and 100 in the test set.
According to the above discussion, in SMM (7) the 13 values
2⁻⁶, 2⁻⁵, ⋯, 2⁵, 2⁶ were taken as candidate values of the parameter C, and the
parameter in the kernel function (9) was valued as 1, 2, … and 10 (with a step
length of 1). With the sample points randomly re-selected each time, the experiment
was run through SMM 10 times, and the average of the 10 accuracy values was
computed (the highest accuracies are shown in Table 2) to obtain the final
classification accuracy.
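The search protocol above (13 candidate values of C, kernel parameters 1–10, 10 random splits, averaged accuracy) can be sketched as follows. The data here are a synthetic stand-in, and an ordinary RBF-kernel SVC replaces the paper's matrix-kernelled SMM, so only the protocol, not the model, is reproduced:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the tax data (140 samples, two classes).
X, y = make_classification(n_samples=140, n_features=28, random_state=0)

C_grid = [2.0 ** k for k in range(-6, 7)]   # 13 values 2^-6 ... 2^6
gamma_grid = range(1, 11)                   # kernel parameter 1 ... 10

best = (0.0, None)
for C in C_grid:
    for gamma in gamma_grid:
        accs = []
        # 10 random splits; 40 training points (20 per class) as in the paper.
        for seed in range(10):
            Xtr, Xte, ytr, yte = train_test_split(
                X, y, train_size=40, random_state=seed, stratify=y)
            accs.append(SVC(C=C, gamma=gamma).fit(Xtr, ytr).score(Xte, yte))
        mean_acc = float(np.mean(accs))
        if mean_acc > best[0]:
            best = (mean_acc, (C, gamma))

print("best mean accuracy %.3f at C=%g, gamma=%g"
      % (best[0], best[1][0], best[1][1]))
```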
Table 2. Performance comparison of SMM methods
Finally, the divided segments were laid out as the row vectors of a matrix to form
the final 4×7 matrix-mode data. Comparing these results with the accuracies of
other conventional methods (Table 3) shows that the kernelled SMM, especially
SMM (Method II), has a reliable superiority.
It can be seen from Table 3 that when the kernelled SMM method is used on this data
set, its prediction accuracy is higher than that of the other three methods.
SMM (Method II) keeps the dynamic nature of time, so the effect it obtains is
closer to the ideal.
4. Conclusion
By improving the SMM and the method of constructing its data, this paper optimized
the kernelled SMM method for matrix data. Applying actual data collected by a tax
department for tax inspection cases-choice, and after standardizing the data set,
this paper used two methods to construct the matrix-mode data. The experiment shows
that the kernelled SMM is clearly better than conventional data mining methods such
as SVM. The method can effectively solve the nonlinear and dynamic evaluation
problem of taxpayer honesty, has high value for improving the efficiency of tax
inspection cases-choice, and is worthy of further study and pilot practical
application.
Acknowledgements
The work described in this paper was supported by the National Social Science
Foundation of China (No. 14BGL214) and the Science Foundation of the Ministry of
Education of China (No. 13YJA630073).
References
[1] S. H. Hong, Study on tax inspection cases-choice method, Master Thesis of Shantou University, (2011), 1.
[2] Y. H. Zhang, Tax inspection cases-choice study based on data mining technology: take the corporate
income tax as example, Master Thesis of Guangdong Business College, (2012), 1.
[3] S. H. Chen et al., Research on tax inspection based on C 5.0 decision tree, Journal of Lianyungang
Technical College, 03 (2011), 21-23.
[4] Y. Chen, Research on the method of tax inspection cases-choice, Tianjin University, (2004), 42-44.
[5] X. Wu et al., The method and application of tax inspection cases-choice based on neural network,
Journal of Xidian University (Social Science Edition), 05 (2007), 63-69.
[6] S. G. Xu, Research on tax compliance problem in China, Huazhong University of Science and
Technology, (2011), 1-110.
[7] H. Xia et al., Study on the corner detecting method based on SVM, Computer Applications and Software,
01 (2009), 230-231, 276.
[8] W. G. Lou et al., The modeling and empirical assessment of generalized regression neural network in tax
evaluation, Systems Engineering, 11 (2013), 74-80.
[9] C. Cortes and V. Vapnik, Support-vector networks, Machine Learning, 20 (1995), 273-293.
[10] D. Cai et al., Support tensor machines for text categorization, Technical Report of Department of
Computer Science, University of Illinois at Urbana-Champaign, No. 2714, 2006.
[11] D. Cai et al., Learning with tensor representation, Technical Report of Department of Computer
Science, University of Illinois at Urbana-Champaign, No. 2716, 2006.
[12] M. Signoretto, L. De Lathauwer and J.A.K. Suykens, A kernel-based framework to tensorial data
analysis, Neural Networks, 24 (2011), 861-874.
Fuzzy Systems and Data Mining II 353
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-353
Abstract. An information system is considered in this paper. The nucleus of this
system is a software component used for the numerical calculation of a wave
process, namely the propagation of non-stationary waves in a homogeneous solid
under dynamic loading, and for the visualization of the numerical solutions in
the form of the stress tensor and the velocity waveforms. Many information
systems and software packages are usually developed on the basis of numerical
methods such as the finite element and boundary element methods. To solve the
problem in this paper we use the bicharacteristics method combined with the
ideas of the splitting method, which constitutes the novelty and originality of
this case study. The user can make predictions and perform engineering analysis
with the help of the developed system. These predictions and estimates can be
used for building structures in engineering practice, in mechanical engineering,
and for scientific research in engineering generally. The practical significance
of the developed system is that using it helps organizations shorten the project
development cycle, reduce costs and improve product quality.
Introduction
The rapid development of computer technology and its implementation in almost all
spheres of life means that today a qualified specialist in any field of knowledge
must command computer-aided design (CAD), computer-aided manufacturing (CAM) and
computer-aided engineering (CAE) systems; modern engineering is not possible
without them. CAD/CAM systems such as AutoCAD, DUCT, Pro/Engineer, Unigraphics and
SolidWorks are widely used for the computer modeling of complex shapes [1], with
the subsequent release of drawings and generation of control programs [2].
¹ Corresponding Author: Zhanar AKHMETOVA, Department of Information Systems, L.N. Gumilyov
Eurasian National University, Satpayev st. 2, Astana, Kazakhstan; E-mail: zaigura@mail.ru.
354 Z. Akhmetova et al. / Development of the System
However, these specialized numerical modeling packages have not developed means of
engineering analysis [3,4]. This paper considers such an information system. Its
nucleus is a software component used for the numerical calculation of the wave
process, i.e. the propagation of non-stationary waves in a homogeneous solid under
dynamic loading, and for the visualization of the numerical solutions in the form
of the stress tensor and the velocity waveforms. We used the numerical
bicharacteristics method, which incorporates the ideas of the splitting method, to
develop this component.
Many software programs are usually developed on the basis of numerical methods
such as the finite element and boundary element methods [5]. The novelty and
originality of this case study is that the software component was developed using
the numerical bicharacteristics method together with the ideas of the splitting
method. The advantage of the proposed method is that it brings the dependence
domain of the finite-difference equation as close as possible to the dependence
domain of the initial differential equation. This makes it one of the most
convenient methods for creating software and applications [6,7]. In this paper, we
consider one of the problem statements that are solved using the developed
information system.
A flat semi-strip of finite width, made from a linear elastic material whose
properties are characterized by the density ρ₁ and the speeds of propagation of
longitudinal (a₁) and transverse (b₁) waves, occupies the area |x₂| ≤ 1,
0 ≤ x₁ < ∞ in the fixed rectangular coordinates x₁Ox₂ (Figure 1).
The rest of the homogeneous boundary of the body is free from stress:
σ₂₂ = 0,  σ₁₂ = 0  on  |x₂| = 1,  0 ≤ x₁ < ∞        (3)
A difference method using the method of spatial characteristics was proposed by
Clifton in [8] for the study of planar dynamical problems and was developed by
Recker in [9] for the study of elastic wave propagation in isotropic bodies of
rectangular shape [4]. In this paper, we solve the non-stationary dynamic problem
for a homogeneous body with the bicharacteristics method [10]. The body is
described in a Cartesian coordinate system. To explain the essence of the
bicharacteristics method, consider the deformation of an elastic semi-strip of
finite width that occupies the area x₁ ≤ l in the Cartesian system (Figure 2).
σ₁₂(t) = 0,  σ₁₁(t) = 0,  x₁ = 0,
σ₂₁(t) = 0,  σ₂₂(t) = 0,  0 ≤ x₁ ≤ l,  x₂ = l        (6)
In order to solve the problem, along with the initial and boundary conditions, we
used the system of equations consisting of the equations of motion and the
relations of the generalized Hooke's law [13]:
ρ ∂²u_i/∂t² = σ_iβ,β        (7)
σ_ij = λ u_β,β δ_ij + μ (u_i,j + u_j,i)        (8)
where ρ is the density, λ and μ are Lamé's constants, and δ_ij is the Kronecker
delta; the required dimensionless quantities are then introduced (9).
After the transition to non-dimensional variables, the equations of motion (7) and
the time-differentiated relations of the generalized Hooke's law (8) take the
form (10):
ϑ̇₁ = σ₁₁,₁ + σ₁₂,₂,
ϑ̇₂ = σ₂₁,₁ + σ₂₂,₂,
σ̇₁₁ = ϑ₁,₁ + γ₁₁ ϑ₂,₂,        (10)
σ̇₂₂ = γ₁₁ ϑ₁,₁ + ϑ₂,₂,
σ̇₁₂ = γ₁₂² (ϑ₁,₂ + ϑ₂,₁).
In order to obtain the bicharacteristics equations and the conditions along them,
let us split the two-dimensional system (10) into one-dimensional ones. Applying
the ideas of K.A. Bagrinovski and S.K. Godunov on splitting multidimensional
t-hyperbolic systems into one-dimensional systems with x_k = const [11], we obtain
the system (11):
ϑ̇_i = σ_ij,j + a_ij,
σ̇_ij = λ_ij ϑ_i,j + b_ij        (11)
Its bicharacteristics and the conditions along them are:
dx_j = ±λ_ij dt,
dσ_ij ∓ λ_ij dϑ_i = (b_ij ∓ λ_ij a_ij) dt        (12)
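A one-dimensional analogue makes the idea behind (11)-(12) concrete: for the rod equations ρv_t = σ_x, σ_t = Ev_x, the Riemann invariants J± = σ ± ρcv are constant along the characteristics dx/dt = ∓c, so the solution can be transported exactly along them. The sketch below is a simplified 1D illustration of transport along characteristics, not the paper's 2D scheme; all names and values are illustrative.

```python
import numpy as np

# 1D analogue: rho*v_t = sigma_x, sigma_t = E*v_x, with c = sqrt(E/rho).
# J+ = sigma + rho*c*v satisfies J_t = +c*J_x (moves left);
# J- = sigma - rho*c*v satisfies J_t = -c*J_x (moves right).
rho, E = 1.0, 4.0
c = np.sqrt(E / rho)            # wave speed = 2
n, h = 200, 1.0
tau = h / c                     # step chosen so characteristics hit grid nodes

x = np.arange(n) * h
sigma = np.exp(-((x - 50.0) / 4.0) ** 2)   # localized stress pulse
v = np.zeros(n)

Jp = sigma + rho * c * v
Jm = sigma - rho * c * v

for _ in range(20):             # exact transport on the characteristic grid
    Jp = np.roll(Jp, -1)        # J+ advects left one cell per step
    Jp[-1] = 0.0
    Jm = np.roll(Jm, 1)         # J- advects right one cell per step
    Jm[0] = 0.0

sigma = 0.5 * (Jp + Jm)         # pulse splits into two half-amplitude pulses
print(int(np.argmax(Jm)), int(np.argmax(Jp)))  # → 70 30
```

Because the time step is matched to the characteristic slope (tau = h/c), the transport is exact, which is precisely the property that makes characteristic-based schemes attractive for wave problems.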
According to the bicharacteristics method, for the selection of a point scheme and
a stencil the studied body is divided into square cells with sides
Δx₁ = Δx₂ = h. At the double points, the function values ϑ_i, σ_ij are sought at
various time points with step τ. The dot grid on which the difference scheme is
built contains, besides the mentioned double points, the points formed by the
intersection of the bicharacteristics with the hyperplanes t = const. The accepted
stencil consists of the node O and the points E_ij, separated from the point O by
the distance λ_ij τ (Figure 3) [13].
With the help of the developed software component of the information system, we
were able to visualize the numerical solution of the problem considered in this
paper (Figure 4). Panel a) of Figure 4 schematically shows the types of waves
determining the state of stress at points of the body. Panels b) and c) of
Figure 4 show the waveforms of the longitudinal (ϑ₁) and transverse (ϑ₂) particle
velocities, the normal stresses σ_jj (j = 1, 2) and the tangential stress σ₁₂ over
the time interval at four fixed observation points: 1 (x₁ = 0h, x₂ = 5h);
2 (x₁ = 5h, x₂ = 5h); 3 (x₁ = 10h, x₂ = 5h); 4 (x₁ = 15h, x₂ = 5h).
Analysis of the results shows that the two-dimensional nature of the wave process
is clearly observed in the semi-strip, and the difference scheme used does not
lose stability over fairly long periods of time (studied up to t = 420τ) [14,15].
Furthermore, these solutions can be used as comparative benchmarks when solving
more complex tasks.
Figure 4. Visualization of the numerical solutions obtained using the developed system (panels a, b, c)
4. Conclusion
With the help of the developed system, we were able to solve a number of tasks that
are useful in engineering practice: 1) the numerical solution of non-stationary
wave propagation in solids was obtained using the bicharacteristics method; 2) the
visualization of the numerical solution of the propagation of non-stationary
waves occurring in a solid under dynamic loading was obtained; 3) the results
were analyzed with the help of this visualization; 4) an information system was
developed whose database stores all the numerical solutions and visualizations.
The multipurpose orientation of the system, its hardware independence (from PCs to
workstations and supercomputers), full compatibility with the Windows operating
system and "friendly" interface make it possible not only to perform high-quality
simulation of the wave process, but also to produce analyses and forecasts from
these visualizations [16,17].
The practical significance of this system is that it supports development
organizations in shortening the development cycle, reducing the cost of products
and improving product quality [18,19].
References
[1] B. Kantarci, H. T. Mouftah and S. Oktug. Availability analysis and connection provisioning in
overlapping shared segment protection for optical networks. Computer and Information Sciences,
2008. ISCIS'08. 23rd International Symposium on. IEEE, 2008.
[2] G. Tarabrin, Metallurgical science, Moscow, 3(1979), 193-199.
[3] A. Zhidkov, Application of the ANSYS system to the challenges of geometric and finite element
modeling, Nizhny Novgorod, 4(2006), 4-5.
[4] G. Tarabrin, Mechanics and calculation of constructions, Moscow, 4(1981), 38-43.
[5] G. Tarabrin, Construction mechanics and calculation of constructions, Moscow, 3(1979), 193-199.
[6] G. Tidwell, Development of user interfaces, Trans. from English, 2008, 416.
[7] I. Medvedkov, Y. Bugaev, S. Nikonov, The Database, Voronezh, 2014, 67-73.
[8] R.Clifton, Mechanics, Moscow, 1(1968), 103-122.
[9] V. Reker, Applied mechanics. Series E, Moscow, 1(1970), 121-129.
[10] O. Syuntyurenko, Electron. Lib. Electronic information resources: new technologies and applications.
1(2011), 214-230.
[11] Z. Akhmetova, S. Zhuzbayev, S. Boranbayev, Acta Physica Polonica A, Polska Akademia Nauk,
129(2016), 352-354.
[12] S. Dzuzbayev, B. Sarsenov, Dynamic stress state of the half-strip in a side pulse pressure,
Almaty, 3(2003), 55-62.
[13] Z. Akhmetova, S. Boranbayev, S. Zhuzbayev, Advances in Intelligent Systems and Computing,
Springer International Publishing Switzerland,448(2016), 473-482.
[14] S. Boranbayev, S. Altayev, A. Boranbayev, Proceedings of the 12th International Conference on
Information Technology: New Generations, Las Vegas, 2015, 796-799.
[15] A. Boranbayev, S. Boranbayev, Proceedings of the 7th IEEE International Conference on Application
of Information and Communication Technologies, Astana, 2014, 1282-1284.
[16] D. Raskin, Interface: New directions in designing of computer systems, Transl. from English, 2007,
272.
[17] G. Druzhinin, I. Sergeev, Maintenance of information systems, Marshrut, 2013, 124-128.
[18] Q. Mao, Micro-UIDT: A user interface development tool. Eurographics Association, 1989, 3-14.
[19] A.I. Molina, M. Redondo, M. Ortega, A methodological approach for user interface development of
collaborative applications: A case study. Science of Computer Programming, 74(2009), 754-776.
360 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-360
Keywords. Bushing type cable terminal, infrared image, Radon transform, Fourier-
Mellin transform, BP neural network, feature extraction, image recognition
Introduction
The bushing type cable terminal, as the cable accessory connecting cable lines and
overhead lines, has been widely used in 110 kV and 220 kV city grids for its mature
technology and stable running record. It may malfunction because of poor design,
manufacture, installation or operating environment, so it is necessary to carry out
regular inspections and preventive tests, among which infrared detection is the
main one. Infrared detection shows the surface thermal distribution of the terminal
as an image visible to the human eye, which helps diagnose the existence of a
defect and distinguish its properties, location and severity, so that the
corresponding measures can be taken to eliminate it. Nearly four years of
statistics on cable terminal infrared images in Guangzhou show that abnormal
heating of terminals is concentrated in the clamp, the stress cone, the tail and so
on. Because the heating type and diagnostic criteria differ between heating parts,
it is essential to recognize the patterns of infrared images automatically to
overcome the low efficiency of human analysis and diagnostic methods.
¹ Corresponding Author: Hai-Qing NIU, Associate Professor, School of Electric Power, South China
University of Technology, Guangzhou, China; E-mail: niuhq@scut.edu.cn.
H.-Q. Niu et al. / Infrared Image Recognition of Bushing Type Cable Terminal 361
Feature extraction is the key to image pattern recognition. A favorable feature is
not affected by light, noise or geometric transformation. As image recognition has
developed, new features have constantly been proposed, and moment features have
received wide attention. According to the characteristics of the projection basis
function, moments can be divided into non-orthogonal and orthogonal moments.
Orthogonal moments include the Fourier-Mellin moments [1-2], which are strongly
robust to noise and good at image reconstruction, but lack scale invariance [3-6]
and introduce resampling and requantization errors.
To avoid the shortcomings of the orthogonal moments, the Radon transform [7-8] can
be used to process the gray image first, and the analytic Fourier-Mellin transform
is then applied to the result. This turns a scale change of the original image into
an amplitude change and a rotation into a phase change. Rotation- and
scale-invariant functions are defined on the basis of these transforms and applied
to extract rotation- and scale-invariant image features. In this paper, the
features of infrared images of abnormal heating of cable terminals are extracted
with the method above, and the images are recognized using a BP neural network.
Because the infrared images collected by an infrared thermal imager are color
images, it is necessary to convert them to grayscale for convenient computer
processing. Infrared images of cable terminals with abnormal heating of the clamp,
stress cone and tail are selected, and their gray images are obtained with the
rgb2gray function, as shown in Figure 1.
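For reference, MATLAB's rgb2gray is the weighted luminance sum 0.2989 R + 0.5870 G + 0.1140 B (the ITU-R BT.601 weights). A minimal NumPy equivalent, applied here to synthetic data, might look like:

```python
import numpy as np

def rgb2gray(img):
    """Luminance conversion with the BT.601 weights that MATLAB's
    rgb2gray also uses: 0.2989 R + 0.5870 G + 0.1140 B."""
    return img[..., :3] @ np.array([0.2989, 0.5870, 0.1140])

# Synthetic 4x4 color image with values in [0, 1], standing in for a
# captured infrared frame.
rng = np.random.default_rng(1)
color = rng.random((4, 4, 3))
gray = rgb2gray(color)

assert gray.shape == (4, 4)
assert gray.min() >= 0.0 and gray.max() <= 1.0   # weights sum to ~1
```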
The topology of a BP neural network [9] includes an input layer, hidden layers and
an output layer. Figure 2 shows a network topology with two hidden layers.
The images must be coded when a BP neural network is used to identify abnormal
heating: the network outputs 00, 01 and 10 stand respectively for abnormal heating
images of the clamp, the stress cone and the tail, so the output layer of the
network has 2 neurons. Since the extracted feature vector is used as the network
input, the input layer has 4 neurons. By repeated debugging, the hidden part of
the network was determined to have two hidden layers, with 9 neurons in each of
the first and second hidden layers. So the structure of the BP neural network is
4-9-9-2.
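The forward pass of such a 4-9-9-2 network can be sketched directly. The snippet below only illustrates the topology, with random weights and sigmoid activations (an assumption; the paper's training by backpropagation is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes of the 4-9-9-2 network: 4 feature inputs, two hidden layers
# of 9 neurons, 2 output neurons for the codes 00 / 01 / 10.
sizes = [4, 9, 9, 2]
weights = [rng.normal(size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    """Plain feed-forward pass through all layers."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(a @ W + b)
    return a

out = forward(rng.random(4))
code = (out > 0.5).astype(int)   # threshold to a 2-bit code such as 01
assert out.shape == (2,)
```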
As shown in Figure 1, five abnormal heating infrared images of each part of the
terminal are selected, and 5400 images, obtained by rotating and enlarging the
originals, are used as training samples for the constructed network. Fifty
abnormal heating infrared images are then selected as test samples; their feature
values are extracted and input to the trained neural network, and the forecast
output is obtained. If the network output (00, 01, 10) agrees with the
corresponding heating type, the identification is correct. The recognition effect
of the BP neural network is shown in Table 2.
Table 2. Recognition effect of BP neural network
From Table 2 it can be seen that the method has a good recognition effect, with an
average recognition rate of 97.3%. In addition, according to the analysis of the
wrongly recognized infrared images, an infrared image background with excessive
brightness may lead to unsuccessful recognition, so operators should pay attention
to the influence of ambient light when shooting infrared images.
In order to verify the validity of the method, the invariant moment recognition
method [10-11] is used for comparison. The sample data, the structure of the BP
neural network and the training process are consistent with Section 2.2. Table 3
compares the recognition results of the method presented in this paper with those
of the invariant moment feature recognition method.
Table 3. Comparison of recognition results of different methods

Infrared image   Number of samples   Invariant moments /%   Proposed method /%
Clamp            50                  93                     100
Stress cone      50                  86                     94
Tail             50                  91                     98
Total            150                 90                     97.3
It can be seen from Table 3 that the recognition rate based on the Radon transform
and Fourier-Mellin transform is better than that of the invariant moment feature
recognition method, which proves the validity of the method in this paper.
In order to test the robustness to noise of the feature extraction method based on
the Radon and Fourier-Mellin transforms, salt-and-pepper noise and white Gaussian
noise are used to pollute the images respectively, and the effect of noise on the
recognition results is studied. The training and test samples are the same as in
Section 2.2; salt-and-pepper noise of different densities and white Gaussian noise
of different variances are added to these samples, yielding infrared images with
different signal-to-noise ratios. Figure 3 shows an image with salt-and-pepper
noise.
As shown in Table 4, the infrared image recognition rate decreases as the density
of the salt-and-pepper noise increases. Table 5 shows that the recognition rate
likewise falls as the variance of the white Gaussian noise increases. However, the
recognition rate remains relatively good even against a strong noise background,
which proves that the recognition method is strongly robust to noise.
Table 4. Recognition results for images containing salt-and-pepper noise

Density of noise   Recognition rate /%
                   Clamp   Stress cone   Tail of bushing
0.01               100     92            98
0.03               98      90            94
0.05               94      86            92
Table 5. Recognition results for images containing white Gaussian noise

Variance of noise   Recognition rate /%
                    Clamp   Stress cone   Tail of bushing
5                   98      94            96
10                  96      90            92
15                  92      84            88
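The two noise models used in the robustness test above can be sketched as follows. The image here is a synthetic stand-in on the 0-255 gray scale, and interpreting the tabulated variances on that scale is an assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

def salt_pepper(img, density):
    """Set a 'density' fraction of pixels to black (0) or white (255)."""
    out = img.copy()
    mask = rng.random(img.shape) < density
    out[mask] = rng.choice([0.0, 255.0], size=int(mask.sum()))
    return out

def white_gaussian(img, variance):
    """Add zero-mean white Gaussian noise with the given variance."""
    return img + rng.normal(0.0, np.sqrt(variance), size=img.shape)

# Synthetic 8-bit-range gray image standing in for an infrared sample.
img = rng.random((32, 32)) * 255.0
noisy_sp = salt_pepper(img, density=0.05)      # densities 0.01 / 0.03 / 0.05
noisy_wg = white_gaussian(img, variance=5.0)   # variances 5 / 10 / 15

changed = np.mean(noisy_sp != img)
assert abs(changed - 0.05) < 0.03   # roughly 5% of pixels corrupted
```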
4. Conclusion
The features of the infrared images of the clamp, stress cone and tail of the
terminal are extracted based on the Radon and Fourier-Mellin transforms and
composed into a feature vector, which is input to the BP neural network for
identification. Some conclusions can be drawn.
(1) The recognition rates of abnormal heating infrared images of the clamp, stress
cone and tail are 100%, 94% and 98% respectively, and the average recognition rate
is 97.3%, which proves the validity of the recognition method.
(2) Salt-and-pepper noise of different densities and white Gaussian noise of
different variances were added to the infrared images, and the same feature
extraction method was used to recognize them. The recognition results show that
the recognition rate falls as the variance of the white Gaussian noise and the
density of the salt-and-pepper noise increase, and the greater the density or
variance of the noise, the greater the drop in recognition rate. However, the
recognition rate remains relatively good under a strong noise background, which
proves that the recognition method is strongly robust to noise.
References
[1] S. Derrode, G. Faouzi. Robust and efficient Fourier-Mellin transform approximations for gray-level
image reconstruction and complete invariant description. Computer Vision and Image Understanding,
83(2001): 57-78.
[2] K. Zhang, H. Q. Chen, Q. W. Liang, et al. Improvement of Fourier-Mellin moments-based edge detection
algorithm. Journal of Huazhong University of Science and Technology(Natural Science Edition),
38(2010): 53-56.
[3] X. Wang, B. Xiao, J. F. Ma. Scaling and rotation invariant analysis approach to object recognition based
on radon and analytic Fourier-Mellin transforms. Journal of Image and Graphics, 13(2008): 2157-2162.
[4] L. S. Fu, P. W. Liu, D. D. Li. Improved moment invariant characteristics and object recognition.
Computer Engineering and Applications, 48(2012): 183-185.
[5] G. L. Xu, J. Xu, B. Wang, et al. CIBA Moment invariants and their use in spacecraft recognition
algorithm. Acta Aeronautica Et Astronautica Sinica, 35(2014): 857-867.
[6] L. H. Jiang, H. Chen, Z. B. Zhuang, et al. Recognition of low-level wind shear by wavelet invariant
moments. Infrared and Laser Engineering, 43(2014): 3783-3787.
[7] Y. M. Wang, W. Yan, S. Q. Yu. Moment feature extraction of image based on radon transform and its
application in image recognition. Computer Engineering, 27(2001): 82-89.
[8] L. Wang, Q. Chang, K. Zhang, et al. Radon transform for line segment detection in low SNR image,
Infrared and Laser Engineering, 32(2003): 163-166.
[9] G. B. Zhang, X. Luo, Y. Y. Shen, et al. Effect of atmosphere condition on discharge characteristics of air
gap and the application of neural network. High Voltage Engineering, 40(2014):564-571
[10] M. K. Hu. Visual pattern recognition by moment invariants. IEEE Transactions on Information Theory,
8(1962): 179-182.
[11] J. Flusser. On the independence of rotation moment invariants. Pattern Recognition, 35(2002): 3015-
3017.
Fuzzy Systems and Data Mining II 367
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-367
Abstract. To improve the performance of face recognition with only one sample
per person, a novel face recognition method based on virtual images with
multiple poses and micro-expressions is presented here. First, a quadratic
function is used to create virtual images as training samples, so as to
enrich the classification information of the single training sample. Then,
effective discriminative features are extracted through bidirectional
two-dimensional PCA in the residual space, which effectively cuts down the
influence of illumination differences on face recognition. Experiments on the
ORL and Yale datasets show the effectiveness of the proposed method and an
improved face recognition rate.
Introduction
Because of its special merits of convenience, speed and easy collection, face
recognition technology has received much attention in the biometrics field in
recent years [1-2]. Many effective face recognition methods have been developed,
such as PCA, LDA and 2DPCA [3-6]. These methods usually use a number of
representative face images of each person as training samples to extract
discriminative features that adapt to pose and illumination variability [7-8].
However, it is difficult to obtain various sample images of one person in
practice; generally only one image per person is available, such as the photo on
a personal identification card, student certificate or passport. Face recognition
from such limited samples per face under complex illumination and various poses
and expressions has become a challenging task [9].
Fortunately, using virtual images produced from the given image in various poses
and expressions as training samples is a possible and effective way to solve the
problem of insufficient training samples [10]. The (PC)²A method proposed by
Wu [11] fuses the original image and its integral projection into a new image,
but its recognition rate is low. Xu [12] developed a method that generates
virtual images by rotating the original image. Wu [13] makes full use of the
global and local information of the samples by dividing the face into sub-blocks,
which also overcomes some of the influence of pose on the recognition effect.
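Xu's idea of generating virtual samples by rotation [12] can be sketched in a few lines; the angle set and the use of scipy.ndimage.rotate are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np
from scipy.ndimage import rotate

def virtual_images(face, angles=(-10, -5, 5, 10)):
    """Generate virtual training samples by rotating the single sample
    image (one simple instance of the rotation idea)."""
    samples = [face]
    for a in angles:
        # reshape=False keeps every virtual sample the same size
        samples.append(rotate(face, angle=a, reshape=False, mode="nearest"))
    return np.stack(samples)

face = np.random.default_rng(0).random((32, 32))
stack = virtual_images(face)
assert stack.shape == (5, 32, 32)   # original plus four rotated copies
```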
¹ Corresponding Author: Zhi-Bo GUO, School of Information Engineering, Yangzhou University, No. 196,
Huayang Western Road, Yangzhou City, Jiangsu Province, China; E-mail: zhibo_guo@163.com.
368 Z.-B. Guo et al. / Face Recognition with Single Sample Image per Person
In this paper a novel method is proposed for face recognition using virtual
images in different poses and micro-expressions, reconstructed as training
samples from a single sample image per person. First, the wavelet transform and
a unitary quadratic function are used to create the virtual images as training
samples. Then the RS-2DPCA algorithm, based on residual space and bidirectional
2DPCA, is designed for face recognition; the influence of varying illumination
is decreased in the residual space. Experimental results on the ORL and Yale
face datasets show that the proposed method is more effective and accurate than
the corresponding existing methods.
To save computation time, the sample image is first compressed. The image is
decomposed by the wavelet transform as shown in Figure 1. The original image is
transformed into four sub-bands labeled LL, LH, HL and HH, where LL has low
frequencies in both the horizontal and vertical directions, LH has low
frequencies horizontally and high frequencies vertically, HL has high
frequencies horizontally and low frequencies vertically, and HH has high
frequencies in both directions. The upper-left band LL is a coarser
approximation of the original image. The upper-right band HL and the lower-left
band LH record the changes of the image along the horizontal and vertical
directions respectively, while the lower-right band HH corresponds to the
highest-frequency components of the image. The information of the LL part is
essential, although the other parts also play an indispensable role in face
recognition. Noise is mainly contained in the high-frequency parts, so each of
the four blocks after the wavelet transform carries a different weight of face
discrimination information, and fusing the four sub-bands with different weights
is necessary for face recognition.
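The single-level decomposition and weighted fusion described above can be sketched with a hand-rolled Haar transform; the weights below are illustrative placeholders, since the combination actually chosen in Figure 2(c) is not restated here:

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2-D Haar decomposition into LL, LH, HL, HH sub-bands."""
    img = img.astype(float)
    # pairwise averages/differences along columns (horizontal direction)
    lo_r = (img[:, 0::2] + img[:, 1::2]) / 2.0   # horizontal low-pass
    hi_r = (img[:, 0::2] - img[:, 1::2]) / 2.0   # horizontal high-pass
    # then along rows (vertical direction)
    LL = (lo_r[0::2, :] + lo_r[1::2, :]) / 2.0   # low/low
    LH = (lo_r[0::2, :] - lo_r[1::2, :]) / 2.0   # horizontal low, vertical high
    HL = (hi_r[0::2, :] + hi_r[1::2, :]) / 2.0   # horizontal high, vertical low
    HH = (hi_r[0::2, :] - hi_r[1::2, :]) / 2.0   # high/high
    return LL, LH, HL, HH

def weighted_fusion(LL, LH, HL, HH, w=(0.7, 0.1, 0.1, 0.1)):
    """Fuse the four sub-bands into one half-size image with given weights
    (the weight vector here is an assumed example, not the paper's choice)."""
    return w[0] * LL + w[1] * LH + w[2] * HL + w[3] * HH
```

For a w×h image this yields a (w/2)×(h/2) fused image, which is what makes the subsequent feature extraction cheaper.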
Figure 1. Face image using the wavelet transform: (a) the original image; (b) the image after wavelet decomposition
With different weight combinations, experiments were conducted on the ORL face
dataset using the improved bidirectional two-dimensional PCA recognition
algorithm (detailed in Section 3). For each person, five images are used as
training samples and the rest for testing. The experimental results are shown in
Table 1. When the weight of the high-frequency parts is large, noise is
introduced and recognition performance suffers; if the weight of the
high-frequency parts is 0, the high-frequency facial information is lost and
performance also suffers. Table 1 shows that the combination of Figure 2(c) is
the best.
A single face sample is not enough to train a high-performance classifier for
face recognition. Several face images with different poses and expressions can
be created as training samples by transforming the single face sample; the
unitary quadratic function is used here for the transformation.
For a pixel with coordinate (x, t) in an image, the new coordinate after
transformation is (f(x), t), where f(x) = ax2 + bx + c and c indicates the
transformation angle. The face deflection is usually kept within -30° to +30°;
beyond that the face looks seriously deformed and recognition becomes very
difficult. Furthermore, a, b and c should satisfy formulas (1)-(3):
a = c/d  (1)
b = 1 - c/k  (2)
Δ = b2 - 4ac ≥ 0  (3)
where d and k are constants that can be set according to the image size.
Suppose a face image X of size w×h. Let (m, n) denote the coordinates of an
arbitrary pixel in the image, where m = 1, 2, …, w and n = 1, 2, …, h. Based on
the idea of polynomial fitting, various poses of the human face can be created
by left or right rotation: the column position is changed under the unitary
quadratic function while the row position stays the same. The unitary quadratic
polynomial function is as follows:
m' = m
n' = (2v/h)n2 + [1 - 2v(h+1)/h]n + 2v  (4)
where v controls the rotation degree; its value ranges from -0.5 to 0.5,
deflecting through -38° to +38° relative to the frontal face, and (m', n') is
the virtual pixel position. The image would look severely deformed and lose much
discrimination information if it were deflected too far.
In the transform, the new position of a pixel may exceed h or even become
negative. If such pixels are simply set to zero, black lines appear in the image
as shown in Figure 3(b), which hinders feature extraction and recognition. A new
method is presented here to solve this problem: each zero-valued pixel is
replaced by the average of its four adjacent pixels above, below, left and
right. The improved image is shown in Figure 3(c).
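A minimal sketch of the column remapping, assuming the endpoint-preserving reading of Eq. (4), n' = n + (2v/h)(n-1)(n-h), together with the four-neighbour hole filling described above; rounding to the nearest integer column is an implementation assumption:

```python
import numpy as np

def pose_virtual_image(img, v):
    """Create a pose-rotated virtual image by remapping column positions.

    Each pixel (m, n) keeps its row m and moves to column
    n' = n + (2v/h)(n-1)(n-h), so the first and last columns stay fixed
    and interior columns shift quadratically.
    """
    w, h = img.shape
    out = np.zeros_like(img, dtype=float)
    cols = np.arange(1, h + 1, dtype=float)            # 1-based column index n
    new_cols = cols + (2.0 * v / h) * (cols - 1) * (cols - h)
    new_idx = np.clip(np.rint(new_cols).astype(int) - 1, 0, h - 1)
    out[:, new_idx] = img                              # scatter columns to n'
    # fill untouched (zero) pixels with the mean of their 4-neighbours
    filled = np.copy(out)
    for i, j in np.argwhere(out == 0):
        neigh = [out[x, y] for x, y in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                 if 0 <= x < w and 0 <= y < h]
        if neigh:
            filled[i, j] = sum(neigh) / len(neigh)
    return filled
```

With v = 0 the mapping is the identity, and for any v the first and last columns are preserved, matching the endpoint behaviour of Eq. (4).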
Similarly, the row position can be changed under the unitary quadratic function
while the column position stays the same:
n' = n
m' = (2u/w)m2 + [1 - 2u(w+1)/w]m + 2u  (5)
where u expresses the degree of loosening of the facial muscles. The muscle
movement can be divided into two directions, up and down, producing
micro-expression images. The value of u ranges from -0.3 to +0.3; otherwise the
face would look severely deformed.
The images obtained by formulas (4) and (5) are called virtual images and can be
used as face training samples. For a given sample image A, 10 virtual images are
produced. The virtual image creation algorithm, named AVIC, is as follows:
Step 1: Using formula (4), v is set to each of {+0.3, +0.15, -0.15, -0.3}, and
A is transformed into A1, A2, A3, A4 respectively; the corresponding deflection
degrees are +20°, +10°, -10°, -20°.
Step 2: Construct the mirror images A5 and A6 from A2 and A3 respectively.
Step 3: Using formula (5), u is set to each of {+0.3, +0.15, -0.15, -0.3}, and
A is transformed into A7, A8, A9, A10 respectively.
As is well known, micro-expression changes are caused by the orbicular muscles
constricting or dilating. According to the principle of orbicular movement, the
muscle motion texture is a group of curves around the facial contour [11], and
the change is small when the polynomial function is used to fit the shape of the
contour curves. To obtain effective distinctive features, u is therefore chosen
relatively large. The experimental results are shown in Figure 4.
Figure 4. Virtual image creation: original image A and virtual images A1-A10
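The three AVIC steps can be sketched as follows; `pose_transform` and `expression_transform` are stand-ins for the Eq. (4) and Eq. (5) remappings, and mirroring is a horizontal flip:

```python
import numpy as np

def avic(B, pose_transform, expression_transform):
    """Generate the 10 virtual training images of algorithm AVIC."""
    virtual = []
    for v in (0.3, 0.15, -0.15, -0.3):       # Step 1: A1..A4 (pose)
        virtual.append(pose_transform(B, v))
    virtual.append(virtual[1][:, ::-1])       # Step 2: A5 = mirror of A2
    virtual.append(virtual[2][:, ::-1])       #         A6 = mirror of A3
    for u in (0.3, 0.15, -0.15, -0.3):       # Step 3: A7..A10 (expression)
        virtual.append(expression_transform(B, u))
    return virtual                            # 10 virtual images
```

Together with the original sample this gives the 11 training images per person used later in the RS-2DPCA training step.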
2DPCA was proposed by Yang [6]. This algorithm uses the matrix of the original
image directly, so the inherent structure of the image data is well preserved;
the recognition rate is improved while feature extraction is also sped up.
However, 2DPCA only performs the PCA transformation on rows without considering
columns. A novel face recognition algorithm based on residual space and
bidirectional two-dimensional PCA (named RS-2DPCA) is presented here.
Suppose there are N training image samples {X1, X2, …, XN}, each of size w×h,
belonging to L classes, with N1, N2, …, NL training samples per class. The
training images of the cth class are denoted {X1^c, X2^c, …, X_Nc^c}, where
Xi^c ∈ R^{w×h}, i = 1, 2, …, Nc, c = 1, 2, …, L.
Step 1: Compute the mean Tc of the cth class's training samples as in formula (6):
Tc = (1/Nc) Σ_{i=1}^{Nc} Xi^c,  c = 1, 2, …, L  (6)
Step 2: Compute the image scatter matrices in the row and column directions
respectively as in formulas (7) and (8):
Gr = (1/(Nw)) Σ_{c=1}^{L} Σ_{i=1}^{Nc} (Xi^c - Tc)^T (Xi^c - Tc)  (7)
Gc = (1/(Nh)) Σ_{c=1}^{L} Σ_{i=1}^{Nc} (Xi^c - Tc)(Xi^c - Tc)^T  (8)
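A sketch of the bidirectional projection built from the class-mean scatter matrices of Eqs. (7) and (8); the retained eigenvector counts p and q are free parameters here, and projecting each image as Y = U^T X V is the standard bidirectional-2DPCA construction rather than the paper's exact pipeline:

```python
import numpy as np

def bidirectional_2dpca(samples, labels, p, q):
    """Project each w-by-h image X to a p-by-q feature matrix Y = U.T @ X @ V."""
    samples = [np.asarray(X, dtype=float) for X in samples]
    w, h = samples[0].shape
    N = len(samples)
    classes = sorted(set(labels))
    means = {c: np.mean([X for X, l in zip(samples, labels) if l == c], axis=0)
             for c in classes}
    Gr = np.zeros((h, h))                     # row-direction scatter, Eq. (7)
    Gc = np.zeros((w, w))                     # column-direction scatter, Eq. (8)
    for X, l in zip(samples, labels):
        D = X - means[l]
        Gr += D.T @ D / (N * w)
        Gc += D @ D.T / (N * h)
    # leading eigenvectors of the symmetric scatter matrices
    _, Vr = np.linalg.eigh(Gr)
    _, Vc = np.linalg.eigh(Gc)
    V = Vr[:, ::-1][:, :q]                    # h-by-q row projection
    U = Vc[:, ::-1][:, :p]                    # w-by-p column projection
    return [U.T @ X @ V for X in samples]
```

Because `eigh` returns eigenvalues in ascending order, the columns are reversed before keeping the top p and q eigenvectors.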
Suppose there are L persons' images {A1, A2, A3, …, AL}, with only one image per
person.
Step 1: The L images {A1, A2, …, AL} are transformed into {B1, B2, …, BL} by the
wavelet transform, with the sub-band weight combination of Figure 2(c).
Step 2: For each person, 10 virtual images {X1^i, X2^i, …, X10^i} are produced
from Bi, and {X0^i, X1^i, …, X10^i} are used as that person's training samples,
where X0^i = Bi. For i = 1, 2, …, L, the number of training samples is thus
expanded from L to N = L × 11. All the training samples are
{X0^1, X1^1, …, X10^1, X0^2, X1^2, …, X10^2, …, X0^L, X1^L, …, X10^L}.
Step 3: Compute the difference image Qj of each sample Xk^i, i = 1, 2, …, L,
k = 0, 1, 2, …, 10, with j = (i - 1) × 11 + k, by formulas (6)-(11).
Step 4: Compute the scatter matrix of all the difference images:
Sb = Σ_{j=0}^{N-1} Qj Qj^T  (12)
3. Experiment Results
The experiments were conducted in MATLAB 2012 on the ORL and Yale face datasets.
First, virtual face images are created as training samples by the proposed AVIC
algorithm. The recognition rates in Tables 2 and 3 are defined as the number of
correctly recognized samples divided by the total number of test samples.
The Yale dataset contains 15 individuals, each providing 11 different images
with rich facial expressions: open or closed eyes, smiling or non-smiling,
glasses or no glasses; the illumination on the face also varies. The 11 sample
images of one person from the Yale dataset are shown in Figure 5.
The ith image (i = 1, 2, …, 11) of each person is selected as the single face
sample per person. This sample and its 10 virtual images are used as training
samples, and the remaining 10 images of each person for testing. The total
number of training samples is therefore 165 and that of the testing samples 150.
The experimental results are shown in Table 2; the mean face recognition rate is
84.36%.
Table 2. Recognition rate (%) on Yale for each choice of sample image i
i=1: 86.67  i=2: 86.00  i=3: 83.33  i=4: 86.00  i=5: 84.67  i=6: 84.00  i=7: 87.33  i=8: 85.33  i=9: 82.67  i=10: 81.33  i=11: 80.67
The ORL dataset consists of 40 persons, each with 10 different images, rich in
expression and varied in pose. The 10 sample images of one person from the ORL
dataset are shown in Figure 6.
Table 3. Recognition rate (%) on ORL for each choice of sample image i
i=1: 85.00  i=2: 82.78  i=3: 83.06  i=4: 77.78  i=5: 88.06  i=6: 81.11  i=7: 83.61  i=8: 83.33  i=9: 75.56  i=10: 80.83
From Table 3, the mean face recognition rate is 82.11%. Experiments in the same
environment with different algorithms are shown in Table 4; the results of the
different PCA methods in Table 4 were obtained using the same training samples,
test samples and recognition-rate calculation as the proposed method.
Table 4 shows that the proposed method is more effective than the corresponding
methods, for two reasons. One is that the virtual images of multiple poses and
micro-expressions are closer to real situations and effectively remedy the
shortage of poses and expressions of the original image; the better results on
the Yale dataset arise because its lighting variation is pronounced, and the
present method reduces lighting effects, so the test results improve. The other
is that the influence of varying illumination is reduced in the residual space.
4. Conclusions
Face recognition from one sample per person is an important but challenging
problem both in theory and in real-world applications. In this paper, 10 virtual
images of multiple poses and micro-expressions were created from a single face
sample and used as training samples. Experimental results show good face
recognition performance, so the proposed algorithm is efficient. But the
one-sample problem is far from solved: more research is needed for face
recognition with large variations in expression, lighting and pose. How many
training samples are enough to extract sufficient features for face recognition
is also a problem worth studying.
Acknowledgment
This work was sponsored by the Prospective Joint Research Project of Jiangsu
Province (BY201506-01), the Six Talent Peaks Project of Jiangsu (2013DZXX-023),
the Huai'an 533 Project, and supported in part by the Major Program for
Scientific and Technological Research in University of China under Grant
No. 311024.
References
[1] Y. Q. Hu, A. S Mian, R. Owens. Face Recognition Using Sparse Approximated Nearest Points between
Image Sets, IEEE Transactions Pattern Analysis and Machine Intelligence, 10(2012), 1992 ~ 2012
[2] Y.F. Jin, K.M. Geng, Y.P.Wang. Efficient Feature Reduction Algorithm based on mPCA and Rough Set,
International Journal of Advancements in Computing Technology, 15(2012), 504 ~ 511.
[3] Z. Wang, Q. Ruan, G. An. Facial expression recognition using sparse local Fisher discriminant
analysis, Neurocomputing, 174(2016), 756-766.
[4] M. Kan, S. Shan, H. Zhang, S. Lao, X. Chen. Multi-view Discriminant Analysis, IEEE Transactions
on Pattern Analysis and Machine Intelligence, 1(2016), 808-821.
[5] J. Yang, Z. Gu, N. Zhang, J. Xu. Median–mean line based discriminant analysis, Neurocomputing,
123(2016), 233-246
[6] J. Yang, D. Zhang, J. Y. Yang. Two-Dimensional PCA: A New Approach to Appearance-Based Face
Representation and Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence,
1(2004), 131 ~ 137.
[7] Z. Y. Yu, S. B. Gao. Fuzzy Two-dimensional Principal Component Analysis and Its Application to Face
Recognition, Advances in Information Sciences and Service Sciences, 11(2011), 335 ~ 341.
[8] Y. Zeng, D. Z Feng. The Face Recognition Method of the Two-direction Variation of 2DPCA,
International Journal of Digital Content Technology and its Applications, 2 (2011), 216 ~ 223.
[9] P. Viola, M. Jones. Robust Real-time Face Detection, International Journal of Computer Vision,
2(2004), 137 ~ 154.
[10] X. Y. Tan, S. C. Chen, Z. H. Zhou. Face Recognition from a single image per person: a survey, Pattern
Recognition, 9(2006), 1725 ~ 1745.
[11] J. X. Wu, Z. H. Zhou. Face Recognition with One Training Image Per Person, Pattern Recognition
Letters, 14(2002), 1711 ~ 1719.
[12] X. Y. Xu. Face Recognition Method for Single Sample Based on Virtual Image, Computer
Engineering, 1(2012), 143 ~ 145.
[13] P. Wu, J. L. Zhou, X. H. Li. Sub-block Face Recognition Based on Virtual Information with One
Training Image Per Person, Computer Engineering and Applications, 19(2009), 146 ~ 149.
Fuzzy Systems and Data Mining II 377
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-377
Introduction
The decision of personnel optimization scheduling is to complete tasks with
scarce resources while minimizing or maximizing an objective function. The
earliest research concerned transportation systems and can be traced back to
Edie's study [1] of toll-station traffic jams. The application then gradually
extended to many fields, such as health care, communications, financial
services, manufacturing, and high and
1 Corresponding Author: Juan Du, National Geomatics Center of China, 28 Lianhuachi West Road,
Haidian District, Beijing, China; E-mail: mhwgo_jane@163.com.
378 J. Du et al. / Cloud Adaptive Parallel Simulated Annealing Genetic Algorithm
new technology services. Accordingly, as the problems became more complicated,
research methods developed from traditional single algorithms, such as linear
programming and integer programming, to modern heuristic and hybrid algorithms.
With the rise of algorithms based on biology, physics and artificial
intelligence in the 1990s, the genetic algorithm (GA) [2-3], simulated annealing
(SA) [4], evolutionary algorithms (EA), tabu search (TS), ant colony algorithms
(ACA) [5] and others have been widely used in optimization scheduling problems,
the genetic algorithm in particular, owing to its strong global optimization
ability, robustness and generality.
GA was proposed by Holland in the late 1960s and early 1970s. Its mechanism is
to:
(1) simulate the natural selection and natural genetics processes of
reproduction, crossover and mutation;
(2) keep a set of candidate solutions in each iteration and select better
individuals according to certain indicators;
(3) combine these individuals to produce a new generation using the genetic
operators (selection, crossover and mutation);
(4) repeat the process until some stopping condition is satisfied.
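The four steps can be condensed into a minimal binary GA; the operators used here (roulette-wheel selection, one-point crossover, bit-flip mutation) are the textbook choices, not necessarily the exact ones used later in the paper:

```python
import random

def genetic_algorithm(fitness, n_bits, pop_size=20, pc=0.8, pm=0.05,
                      generations=100, seed=0):
    """Minimal binary GA: roulette selection, one-point crossover, bit flips."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        fits = [fitness(ind) for ind in pop]
        total = sum(fits) or 1.0
        def select():                          # fitness-proportional roulette
            r = rng.uniform(0, total)
            acc = 0.0
            for ind, f in zip(pop, fits):
                acc += f
                if acc >= r:
                    return ind
            return pop[-1]
        nxt = []
        while len(nxt) < pop_size:
            a, b = select()[:], select()[:]
            if rng.random() < pc:              # one-point crossover
                cut = rng.randrange(1, n_bits)
                a[cut:], b[cut:] = b[cut:], a[cut:]
            for ind in (a, b):                 # bit-flip mutation
                for i in range(n_bits):
                    if rng.random() < pm:
                        ind[i] ^= 1
            nxt += [a, b]
        pop = nxt[:pop_size]
        best = max(pop + [best], key=fitness)  # track the best ever seen
    return best
```

On a toy "onemax" problem (fitness = number of 1 bits) this loop quickly drives the best individual toward the all-ones string.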
But GA has its own drawbacks, such as premature convergence, slow convergence
speed and poor local optimization ability, and various improvements have been
put forward. For selection, the breeding pool [6] and Boltzmann selection were
introduced on the basis of the widely used roulette selection, but these
remained prone to premature convergence and stagnation. Because the crossover
and mutation rates are constant during evolution, Chen et al. [7] raised a new
perspective of superiority inheritance based on Srinivas et al. [8], Wang and
Cao [9] and Zheng et al. [10], which solved the premature-convergence problem to
some extent. In addition, SA and methods such as gradient descent, hill climbing
and list optimization have strong local search ability; adding them to GA's
search process can improve running efficiency and solution quality. Taking all
of the above into consideration, Peng et al. [11] combined fuzzy control and SA
on top of the standard genetic algorithm (SGA), yielding the fuzzy adaptive
simulated annealing genetic algorithm (FASAGA). Furthermore, Dong et al. [12]
introduced a cloud model [13-14], which takes fuzziness and randomness into
account, together with a parallel mechanism [15-16], and proposed a
cloud-model-based adaptive parallel simulated annealing genetic algorithm
(PCASAGA) with faster convergence and better optimization results.
Owing to unreasonable staffing in the production process of most units
(enterprises) in the Geographical Conditions Census (CGC), the task allocation
between producers and quality inspectors has been determined by experience
according to engineering task quantity on a national scale, and a "fire brigade"
dispatch mode has been universal. Geared to the needs of normalized National
Geographic Conditions Monitoring (NGCM), this paper focuses on PCASAGA to
improve staff scheduling. We establish an optimization scheduling model linked
to the characteristics of CGC's production process, and finally illustrate the
good performance of the algorithm on the optimization model through an instance
analysis.
f(h) = f / (1 + k·h2)  (1)
CGC includes production and quality inspection tasks: producers organize
production, then quality inspectors check the completed production by sampling.
In this paper, the target is to find the distribution between the two kinds of
workers that minimizes the time needed under a certain production quota. To
reach this target, the assumptions are as follows:
(1) P and QC are the numbers of producers and quality inspectors, with
P + QC = c, where c is a constant;
(2) In general, the production task can be measured by number of patches, sheet
quantity or mission area; we adopt the last, represented as A. The sampling
proportion of quality inspection is s, a decimal between 0 and 1, so the
inspection task is s·A. If quality inspectors detect errors, the work is
returned to producers for modification and the inspectors check all errors
again; if errors remain, the process repeats, and all errors are fixed after n
rounds, where n is a natural number. The error rate of producers is assumed to
be e ∈ (0, 1], while quality inspectors make no mistakes.
Objective function:
Min TIME = A(1 + e + ⋯ + e^n)/(Σ_{i=1}^{P} r1·x_i) + s·A(1 + e + ⋯ + e^n)/(Σ_{j=1}^{QC} r2·y_j)  (2)
where r1 and r2 are the efficiency limits of producers and quality inspectors
respectively, and x, y are their skill scores.
Constraint conditions: P + QC = c; P, QC ∈ N+ with 0 < P < c and 0 < QC < c;
0 ≤ x, y ≤ 10.
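Under one reconstructed reading of the objective (total production work A(1 + e + ⋯ + e^n), inspection work s times that, team capacity the sum of efficiency limit times skill score), the completion time can be sketched as:

```python
def completion_time(x_scores, y_scores, A, s, e, n, r1=1.0, r2=1.0):
    """Estimated completion time for a producer/inspector split.

    x_scores, y_scores: skill scores of producers and quality inspectors;
    A: task area; s: sampling proportion; e: producer error rate;
    n: number of return-to-modify rounds; r1, r2: efficiency limits.
    """
    rework = sum(e ** k for k in range(n + 1))        # 1 + e + ... + e^n
    produce_capacity = sum(r1 * x for x in x_scores)  # producers
    inspect_capacity = sum(r2 * y for y in y_scores)  # quality inspectors
    return A * rework / produce_capacity + s * A * rework / inspect_capacity
```

As expected, the time grows with the error rate e and shrinks as skill scores rise, which is the trade-off the scheduling model optimizes over.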
GA suffers from premature convergence, slow convergence speed and poor local
optimization ability, primarily because selection is carried out in proportion
to fitness: better individuals occupy a higher proportion of the group early on
because their fitness is much higher than average, resulting in premature
convergence. Meanwhile, as offspring inherit from parents, the diversity of the
population declines; in later generations the individuals' fitness values are
close together, genetic operators struggle to select better offspring,
convergence slows and local optimization ability is poor. Therefore PCASAGA
regulates the selection and mutation operators with cloud adaptive adjustment
and introduces SA, which has strong local search ability, into GA. The starting
point of SA is the similarity between the annealing process of solid matter in
physics and general combinatorial optimization problems. It starts from a high
initial temperature and searches the solution space randomly for the global
optimum of the objective function, using its probabilistic jump feature as the
temperature falls; SA can jump out of local optima with some probability and
ultimately reach the global optimum. A parallel mechanism is also used to
improve the algorithm's efficiency.
In the genetic evolution process, the crossover rate Pc and mutation rate Pm
control global and local search in the search space. In GA, Pc and Pm are
constants determined by experience, but values that are too small or too large
harm the genetic process. Srinivas et al. therefore put forward an adaptive
genetic algorithm, and scholars at home and abroad have conducted much research
to improve it. Although these adaptive mechanisms consider individual
differences, they ignore the situation of the whole group. As a result, Peng
Yong-gang (FASAGA) considered the differences of both individual and population,
obtaining Pc and Pm through fuzzy control. However, a fuzzy inference system
uses precise membership functions to describe the uncertainty of qualitative
concepts: it fuzzifies input values into a fuzzy set, stimulates membership
functions to get membership values, and obtains outputs through the fuzzy
inference machine and defuzzification. Therefore the same inputs always give the
same results, because the excitation level, the inference machine and the
defuzzification are invariant. A cloud inference system, by contrast, yields
uncertain results because it has no precise membership function. Its definition
is: let C be a linguistic value on domain U. If x ∈ U is a random realization of
C, and the certainty degree of x with respect to C is a random number with a
stable tendency μ(x) ∈ [0, 1], μ(x): U → [0, 1], ∀x ∈ U, then the distribution
of x on domain U is called a cloud model. A cloud model has three digital
characteristics, expectation (Ex), entropy (En) and hyper-entropy (He), which
reflect the characteristics of the qualitative concept. Expectation Ex is the
barycenter position of the cloud, the most representative value of the
qualitative concept on the domain. Entropy En reflects, on the one hand, the
scope that can be accepted by the linguistic value and, on the other, the
probability that points in the domain represent the linguistic value; En
represents the randomness of the cloud droplets of the qualitative concept and
reveals the correlation between fuzziness and randomness. Hyper-entropy He is
the uncertainty measure of the entropy, namely the entropy of entropy; it is the
coherence of the uncertainty measures of all droplets representing the same
linguistic value. When μ(x) is normally distributed, the model is called a
normal cloud model, written (Ex, En, He) [21-23]. Thus the outputs of a cloud
model are generated by a random process. From this standpoint, PCASAGA
introduces cloud reasoning based on population differences, individual
differences and fuzzy inference rules.
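A forward normal cloud generator matching this description can be sketched as follows; each droplet is a pair (x, μ) drawn by first sampling an entropy En' around En with spread He:

```python
import math
import random

def normal_cloud_drops(Ex, En, He, n, seed=0):
    """Generate n droplets (x, mu) of the normal cloud (Ex, En, He).

    Each droplet draws En' ~ N(En, He), then x ~ N(Ex, |En'|), and the
    certainty degree is mu = exp(-(x - Ex)**2 / (2 * En'**2)).
    """
    rng = random.Random(seed)
    drops = []
    for _ in range(n):
        En_p = rng.gauss(En, He)
        x = rng.gauss(Ex, abs(En_p))
        mu = math.exp(-((x - Ex) ** 2) / (2 * En_p ** 2)) if En_p else 1.0
        drops.append((x, mu))
    return drops
```

Because En' itself is random, two droplets at the same x can carry different certainty degrees, which is exactly the "same inputs, uncertain outputs" property distinguishing cloud reasoning from fuzzy inference.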
The population differences (E1, comprising the population crossover difference
E1c and population mutation difference E1m) and individual differences (E2,
comprising the individual crossover difference E2c and individual mutation
difference E2m) are given by Eqs. (3) and (4):
E1 = (f_max - f̄) / f_max ∈ [0, 1]  (3)
E2c = (f' - f̄) / f_max,  E2m = (f - f̄) / f_max ∈ [-1, 1]  (4)
where f_max and f̄ are the maximum and average fitness values of each
generation, f' is the larger of the fitness values of the two individuals to be
crossed, and f is the fitness value of the individual to be mutated.
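Assuming the reading of Eqs. (3) and (4) in which both differences are normalized by the maximum fitness, the quantities fed to the cloud reasoner can be computed as:

```python
def adaptive_differences(fits, f_pair_max, f_mut):
    """Population and individual differences of Eqs. (3) and (4).

    fits: positive fitness values of the current generation;
    f_pair_max: larger fitness of the two individuals to be crossed;
    f_mut: fitness of the individual to be mutated.
    """
    f_max = max(fits)
    f_avg = sum(fits) / len(fits)
    E1 = (f_max - f_avg) / f_max          # population difference, in [0, 1]
    E2c = (f_pair_max - f_avg) / f_max    # crossover individual difference
    E2m = (f_mut - f_avg) / f_max         # mutation individual difference
    return E1, E2c, E2m
```

A small E1 signals low diversity, and a negative E2 signals a below-average individual; together they drive the rules of Table 1.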
The cloud reasoning model uses the same inference rules as FASAGA; Table 1 shows
the adaptive adjustment rules for Pc, and Pm is handled in the same way. E1, Pc
and Pm are divided into the linguistic set {large, medium, small}, and E2 into
{positive, zero, negative}. If diversity is poor (E1 is small) and the
individual is below average (E2 is negative), then Pc and Pm should be large;
the other inference rules are shown in Table 1. The concept of each linguistic
term is described by an expectation Ex, entropy En and hyper-entropy He. The
adaptive values of Pc and Pm are obtained by the following procedure: 1) take
any given input vector as cloud titration values without uncertainty
information; 2) feed the cloud titration values into the cloud generator
constructed from the qualitative rule library; 3) obtain output cloud droplets
with certainty information, and then concretize all outputs to a single value.
The process is shown in Figure 2.
Table 1. The rules' table of the adaptive crossover operator. Each linguistic
term is a normal cloud (Ex, En, He), with μ = 1 on the saturated interval of the
outer terms:
E1: small = (0.15, 0.1, 0.002), μ = 1 on [0, 0.15]; medium = (0.45, 0.35/3, 0.002); large = (0.8, 0.35/3, 0.002)
E2: negative = (-0.65, 0.65/3, 0.005), μ = 1 on [-1, -0.65]; zero = (0, 0.35/3, 0.005); positive = (0.65, 0.35/3, 0.005), μ = 1 on [0.65, 1]
Pc: small = (0.15, 0.1, 0.002), μ = 1 on [0, 0.15]; medium = (0.45, 0.35/3, 0.002); large = (0.8, 0.35/3, 0.002)
The parallel model is usually divided into three categories: master-slave,
coarse-grained and fine-grained. We choose the coarse-grained model and
introduce a migration operator: a number of initial populations are generated
randomly and evolve independently on different processors, and in each
generation, before crossover and mutation, the optimal individual among all
populations replaces the worst individual in each population.
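The migration step can be sketched as follows (`fitness` maps an individual to its fitness value):

```python
def migrate(populations, fitness):
    """Coarse-grained migration: copy the globally best individual over the
    worst individual of every population before crossover and mutation."""
    best = max((ind for pop in populations for ind in pop), key=fitness)
    for pop in populations:
        worst = min(range(len(pop)), key=lambda i: fitness(pop[i]))
        pop[worst] = best[:]               # replace worst with a copy of best
    return populations
```

Copying (rather than moving) the best individual keeps every island population at full size while spreading the current optimum.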
At the same time, we apply SA to fitness stretching and to the changes of the
crossover and mutation operators. The algorithm proceeds as follows:
(1) Stretch the fitness of each individual according to Eq. (5);
(2) Select the individual with the largest fitness among all populations as the
best individual after comparison; otherwise, select the larger of the
individual with the largest fitness in the current generation and the best
individual so far;
(3) Apply roulette selection after stretching the fitness;
(4) Obtain the crossover rate from the cloud reasoning system, and accept new
individuals through the Boltzmann mechanism;
(5) Operate the mutation rate as in step (4);
(6) Increase k to k + 1 and cool T; if the termination conditions are not met,
return to step (1).
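Steps (4)-(6) rest on Boltzmann acceptance and a cooling schedule; the logarithmic schedule T_k = T0/ln(1 + k) below is an assumed stand-in, not necessarily the paper's exact formula:

```python
import math
import random

def boltzmann_accept(f_old, f_new, T, rng=random):
    """Metropolis/Boltzmann acceptance: always keep an improvement, accept a
    worse individual with probability exp((f_new - f_old) / T)."""
    if f_new >= f_old:
        return True
    return rng.random() < math.exp((f_new - f_old) / T)

def cool(T0, k):
    """Assumed logarithmic cooling schedule T_k = T0 / ln(1 + k)."""
    return T0 / math.log(1.0 + k)
```

At high temperature almost any offspring is accepted (global exploration); as T falls, acceptance of worse individuals becomes rare and the search turns local.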
3. Application Instance
For the target of shortest time, we take the reciprocal of the objective
function (Eq. (2)) as the fitness and use binary encoding, where 1 represents a
producer and 0 a quality inspector. To test the performance of the algorithm, we
simulate under different values of n, s and e, and compare PCASAGA with SGA. The
main parameters of PCASAGA are: 3 populations, each of size 10; 100 generations;
initial temperature 100. SGA's generation number and initial temperature are the
same, and its crossover and mutation rates are 0.8 and 0.1 respectively.
(1) Convergence performance and optimization ability
Generations needed to converge under different parameter combinations (columns: e = 0.1, 0.3, 0.5, 0.7, 0.9):
n=1, s=0.2  PCASAGA: 45, 23, 11, 48, 25;  SGA: 67, 38, 83, 61, 69
n=3, s=0.6  PCASAGA: 7, 1, 8, 3, 5;  SGA: 15, 20, 64, 12, 26
n=5, s=1    PCASAGA: 3, 27, 11, 73, 2;  SGA: 4, 43, 20, 94, 12
Table 3. The comparison of optimal time needed (days) under the same genetic generation number (columns: e = 0.1, 0.3, 0.5, 0.7, 0.9)
n=1, s=0.2  PCASAGA: 24.6466034862413, 26.0910670029848, 27.5128935182451, 28.9176323078687, 30.3090832026231
            SGA: 24.7194057952400, 26.1109679545236, 27.5128935182451, 28.9285882615977, 30.3093889128568
n=3, s=0.6  PCASAGA: 31.0809181212216, 37.1152116514881, 46.1194461945105, 59.0271264010922, 76.7745213589466
            SGA: 31.0809181212216, 37.1173633750822, 46.1194461945105, 59.0294575681828, 76.7745213589466
n=5, s=1    PCASAGA: 36.2018982793111, 46.5114911802478, 64.1453026589571, 95.8284392743847, 152.6646799673810
            SGA: 36.2019593040322, 46.5171956243194, 64.1473770869863, 95.8284392743847, 152.6646799673810
[Plot of staff ratio P/QC and time T against e for n = 1, 3, 5 at s = 0.2]
Figure 6. Staff ratio and time change with a fixed sampling rate in the homogeneous case
[Plot of staff ratio P/QC and time T against e for s = 0.2, 0.6, 1 at n = 3]
Figure 7. Staff ratio and time change with a fixed number of return-to-modify rounds in the homogeneous case
[Plot of staff ratio P/QC and time T against e for n = 1, 3, 5 at s = 0.2]
Figure 8. Staff ratio and time change with a fixed sampling rate in the heterogeneous case
[Plot of staff ratio P/QC and time T against e for s = 0.2, 0.6, 1 at n = 3]
Figure 9. Staff ratio and time change with a fixed number of return-to-modify rounds in the heterogeneous case
(3) Analysis
The above results yield the following findings:
1) Taking the same precision (0.0001) and generation number (100) as stopping
conditions, the generation counts and optimal completion times under different
parameter combinations show that PCASAGA converges faster and optimizes better;
2) Comparing the homogeneous and heterogeneous cases, the former presents a
regular trend: the ratio between producers and quality inspectors stays flat or
falls as any influence factor increases, meaning that to finish the task in the
optimal time one should keep the same structure or allocate more quality
inspectors. The latter shows irregular changes, so one cannot blindly keep
producers unchanged or reduce them; the strategy should be combined with the
actual production situation;
3) As n, s and e increase, the optimal time increases obviously in both the
homogeneous and heterogeneous situations. This is because lower professional
skill increases the amount of rework and hence the completion time.
Our study shows that PCASAGA can be applied to the model; National Geographic
Conditions Monitoring (NGCM) should be committed to improving staff quality and
arranging production tasks based on the actual situation.
4. Conclusion
References
[1] L. C. Edie, Traffic Delays at Toll Booths, Journal of the Operations Research Society of America, 2(1954), 107-138.
[2] Y. J. Ma, W. X. Yun, Research Progress of Genetic Algorithm, Application Research of Computers 4
(2012), 1201-1210.
[3] X. Bian, L. Mi, Development on Genetic Algorithm Theory and Its Applications, Application Research
of Computers 7(2010),2425-2434.
[4] H. G. Chen, J. S. Wu, J. L. Wang, et al. Mechanism Study of Simulated Annealing Algorithm, Journal
of Tongji University(Natural Science) 6(2004),802-805.
[5] Q. H. Wu, Y. Zhang, Z. M. Ma, Review of Ant Colony Optimization, Microcomputer Information
3(2011), 1-5.
[6] F. Gao, Y.P. Shen, L.X. Li. Optimal Design of Piezo-electric Actuators for Plate Vibroacoustic
Control using Genetic Algorithms with Immune Diversity , Smart Materials and Structures
9(2000),485-491.
[7] S. Z. Chen, G. D. Liu, X. Pu, et al. Adaptive Genetic Algorithm Based on Superiority Inheritance,
Journal of Harbin Institute of Technology 7(2007), 1021-1024.
[8] M. Srinivas, L. M. Patnaik, Adaptive Probabilities of Crossover and Mutation in Genetic Algorithm,
IEEE Transaction on Systems, Man, and Cybernetics 4(1994), 656 - 667.
[9] X. P. Wang, L. M. Cao, Genetic Algorithm Theory, Application and Software Implementation, Xi 'an
Jiaotong University Press, Xi 'an,2002.
[10] J. Zheng, J. Zhu, Image Matching based on Adaptive Genetic Algorithm, Journal of Zhejiang
University (Engineering Science) 6(2003), 689 -692.
[11] Y. G. Peng, X. P. Luo, W. Wei, New Fuzzy Adaptive Simulated Annealing Genetic Algorithm,
Control and Decision 6(2009), 843-848.
[12] L. L. Dong, G.H. Gong, N. Li, et al. Adaptive Parallel Simulated Annealing Genetic Algorithms
based on Cloud Models, Journal of Beijing University of Aeronautics and Astronautics 9(2011),
1132-1136.
[13] D.Y. Li, Y. Du, Artificial Intelligence with Uncertainty, National Defense Industry Press ,
Beijing,2005.
[14] D. R. Li, S. L. Wang, D. Y. Li, Theory and Application of Spatial Data Mining (Second Edition),
Science Press, Beijing, 2013.
[15] T. C. Guo, C. D. Mu, The Parallel Drifts of Genetic Algorithms, Systems Engineering& Theory
Practice 2(2002), 15-23, 41.
[16] J. Q. Gao, G. X. He, A Review of Parallel Genetic Algorithms, Journal of Zhejiang University of
Technology 2(2007), 56-59, 72.
[17] O. Ngwenyama, A. Guergachi, T. Mclaren, Using the Learning Curve to Maximize IT Productivity: A
Decision Analysis Model for Timing Software Upgrades, International Journal of Product Economics
2(2007),524-535.
[18] J.G. Walter, K. Stefan, R. Peter R, et al. Multi-objective Decision Analysis for Competence-oriented
Project Portfolio Selection, European Journal of Operational Research 3(2010),670-679.
[19] J.G. Walter, Optimal Dynamic Portfolio Selection for Projects under A Competence Development
Model, OR Spectrum 33(2011), 173-206.
[20] Y. Zhang, Knowledge works scheduling based on stochastic ability promotion, Xian University of
electronic science and technology, 2012.
[21] D. Y. Li, H. J. Meng, X. M. Shi, Membership clouds and membership cloud generators, Journal of
Computer Research and Development 6(1995), 15-20.
[22] C. H. Dai, Y.F. Zhu, W. R. Chen, Adaptive genetic algorithm based on cloud theory, Control Theory
and Applications 4(2007), 646-650.
[23] H. Chen, B. Li, Approach to Uncertain Reasoning Based on Cloud Model, Journal of Chinese
Computer Systems 12(2011), 2449-2455.
[24] M. S. Wang, M. Zhu, Evaluating Intensive Land Use Situation of Development Zone based on Cloud
Models, Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE)
10(2012), 247-252.
390 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-390
Introduction
Superior product quality is the constant pursuit of industry, and product quality prediction is valuable during manufacturing: it can detect defective products early, ensure product quality and effectively improve yield. However, with the traditional quality prediction method based on Statistical Process Control (SPC) [1] it is difficult to establish a complete and accurate prediction model for modern industry. It is therefore of great significance to realize intelligent production quality prediction with modern methods based on Artificial Intelligence (AI), which can overcome the limitations of the traditional approach.
An Artificial Neural Network (ANN) [2], an information processing model simulating the structure and function of the human brain, has strong adaptive learning ability and nonlinear function approximation ability, which suits today's complex, nonlinear production processes. The proposed model, PCA-BPNN, which optimizes a Back Propagation Neural Network (BPNN) using the Principal Component Analysis (PCA) algorithm, can resolve the prediction difficulties caused by high-dimensional production parameters.
1 Corresponding Author: Hong ZHOU, Ph.D. Program in Engineering Science, Chung-Hua University, 707, Sec.2, WuFu Rd., Hsinchu, Taiwan; E-mail: d10424004@chu.edu.tw.
H. Zhou and K.-M. Yu / Quality Prediction in Manufacturing Process Using a PCA-BPNN Model 391
1. Literature Review
The Back Propagation Neural Network (BPNN), a kind of multi-layer feedforward neural network, is built on the error Back Propagation (BP) algorithm and has become one of the most widely used neural network models [3]. The learning process of a BPNN comprises two passes. One is forward propagation, which transmits the input signal from the input layer through the hidden layer to the output layer. The other is backward error propagation, which transmits errors in reverse and adjusts the connection weights and biases using the gradient descent algorithm.
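The two-pass procedure described above (forward propagation, then backward error propagation with gradient descent) can be sketched minimally as follows; the toy data, layer sizes, learning rate and epoch count are illustrative choices for this sketch, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 4 samples, 3 input features, 1 output target.
X = rng.normal(size=(4, 3))
y = np.array([[0.0], [1.0], [1.0], [0.0]])

# Weights: input -> hidden (3x5), hidden -> output (5x1).
W1, b1 = rng.normal(scale=0.5, size=(3, 5)), np.zeros(5)
W2, b2 = rng.normal(scale=0.5, size=(5, 1)), np.zeros(1)

lr = 0.1
for epoch in range(2000):
    # Forward propagation: input layer -> hidden layer -> output layer.
    h = np.tanh(X @ W1 + b1)
    out = np.tanh(h @ W2 + b2)

    # Backward error propagation: chain-rule gradients of 0.5 * MSE.
    err = out - y
    d_out = err * (1 - out ** 2)          # derivative of tanh at the output
    d_h = (d_out @ W2.T) * (1 - h ** 2)   # error pushed back to the hidden layer

    # Gradient descent update of connection weights and biases.
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(0)
```

After training, the forward pass on the same inputs reproduces the targets closely, illustrating how the two passes cooperate.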
A BPNN shows outstanding fault tolerance, self-learning ability and nonlinear dynamic processing, and is a good choice for fuzzy, loosely specified or incomplete problems. However, it has inherent defects: its convergence is slow and it easily falls into a local minimum. These can be mitigated by introducing other algorithms or by optimizing the network structure.
In modern industry, massive production data are produced continuously during the production process, and they are usually imprecise, incomplete and redundant. Hence, if a BPNN is applied directly to such large-scale, high-dimensional data, it tends to require a long training time, become trapped in a local optimum, or even oscillate. These defects can be addressed by introducing the PCA algorithm to optimize the BPNN model, as shown in Figure 1.
PCA is responsible for the dimension reduction of the production features, namely the input variable X. The BPNN is in charge of the product quality prediction: it takes the new production variable P obtained from PCA as input and outputs the predicted product quality y1. The flow chart of production quality prediction using the PCA-BPNN model is shown in Figure 2.
To ensure that the prediction results are meaningful, the analysis should be carried out on a clean dataset, i.e. the data should be accurate and complete. Therefore a simple and effective algorithm, K Nearest Neighbors (KNN) [5], is used to impute the missing values in the UCI Secom dataset (described in Section 3). The main steps are: define the sample; select the K nearest neighbors by minimum Euclidean distance; and estimate each missing value as the weighted average over the K nearest neighbors.
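The imputation steps above can be sketched as follows; inverse-distance weighting is one common choice for the weighted average and is an assumption of this sketch, as is the function name:

```python
import numpy as np

def knn_impute(X, k=3):
    """Fill NaNs with the weighted mean of the k nearest complete rows,
    where distance is the Euclidean distance over the observed features."""
    X = X.astype(float).copy()
    complete = X[~np.isnan(X).any(axis=1)]      # rows with no missing values
    for row in X:
        miss = np.isnan(row)
        if not miss.any():
            continue
        # Distance to each complete row, using only this row's observed columns.
        d = np.sqrt(((complete[:, ~miss] - row[~miss]) ** 2).sum(axis=1))
        idx = np.argsort(d)[:k]                 # k nearest neighbors
        w = 1.0 / (d[idx] + 1e-9)               # inverse-distance weights (assumed scheme)
        row[miss] = (w[:, None] * complete[idx][:, miss]).sum(0) / w.sum()
    return X
```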
The PCA algorithm is employed to reduce the production variables; the specific steps are as follows:
1. Define the original input variables as the matrix X = (x1, x2, ..., xp)T, where p = 590 is the dimension of the production features, and the production samples (n = 1567) as xi = (xi1, xi2, ..., xip), i = 1, 2, ..., n.
2. Normalize the original matrix X into the standard matrix Z using two methods separately. One is the min-max method of Eq. (1), which maps zij into [-1, +1]:
zij = 2(xij - minj) / (maxj - minj) - 1    (1)
The other is the zero-mean method of Eq. (2), which gives mean 0 and standard deviation 1:
zij = (xij - x̄j) / σj    (2)
In this way the effects of differences in magnitude and dimension among the inputs are eliminated, ensuring the comparability of the inputs.
Note: zij (i = 1, 2, ..., n; j = 1, 2, ..., p) are the standardized variables; maxj and minj are the maximum and minimum values of each production feature; x̄j is the arithmetic mean and σj the standard deviation of each production feature.
3. Construct the covariance matrix R based on the matrix Z, composed of rij (i, j = 1, 2, ..., p), the correlation coefficient between zi and zj, defined in Eq. (3):
rij = Σk=1..n (zki - z̄i)(zkj - z̄j) / √(Σk=1..n (zki - z̄i)² · Σk=1..n (zkj - z̄j)²)    (3)
The cumulative variance contribution of the first i principal components is
Mi = (Σk=1..i λk) / (Σk=1..p λk), i = 1, 2, ..., p    (4)
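The steps above can be sketched as follows; the 85% cumulative-contribution threshold and the function name are illustrative choices of this sketch, not values stated in the paper:

```python
import numpy as np

def pca_reduce(X, threshold=0.85):
    """Zero-mean normalize (Eq. 2), form the correlation matrix (Eq. 3),
    and keep the first m components whose cumulative contribution M_i
    (Eq. 4) reaches the threshold."""
    Z = (X - X.mean(0)) / X.std(0)           # Eq. (2): zero-mean normalization
    R = np.corrcoef(Z, rowvar=False)         # Eq. (3): correlation matrix
    lam, W = np.linalg.eigh(R)               # eigenvalues and eigenvectors of R
    order = np.argsort(lam)[::-1]            # sort eigenvalues in descending order
    lam, W = lam[order], W[:, order]
    M = np.cumsum(lam) / lam.sum()           # Eq. (4): cumulative contribution
    m = int(np.searchsorted(M, threshold)) + 1
    return Z @ W[:, :m], m                   # projected data and component count
```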
6. Transform the principal component matrix W using the equation yij = pij · wij.
The dataset adopted, called Secom and provided by the University of California, Irvine [6], was acquired from a real-world semiconductor manufacturing process. It includes 1567 production instances, 590 associated measured features and a pass/fail (-1/1) yield label. The simulation runs on Matlab R2015b with an Intel(R) Core(TM) i5-3317U 1.70 GHz CPU and 4 GB of RAM.
In the established BPNN [7, 8], the number of input-layer neurons is defined as N, whose value depends on the number of principal components chosen by PCA: N = 7 if min-max normalization is applied to PCA, and N = 31 if zero-mean normalization is applied. The number of output-layer neurons is defined as O = 1. The number of hidden-layer neurons H follows the empirical formula H = √(N + O) + a, where a = 10 is the optimal value found by trial; thus H is 13 or 16 depending on the number of inputs. The transfer function is the hyperbolic tangent sigmoid function "tansig"; the training function is the gradient descent with momentum backpropagation function "traingdm"; and the learning function is the gradient descent with momentum weight and bias learning function "learngdm".
The training parameters are set as follows: the maximum number of training epochs is 10000; the performance goal is 0.00001; the learning rate is 0.01; the maximum number of validation failures is 20; and the minimum performance gradient is 1e-5.
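The empirical hidden-layer formula and the gradient-descent-with-momentum update behind "traingdm" amount to the following sketch; the momentum coefficient 0.9 and the function names are illustrative assumptions, not values reported in the paper:

```python
import math
import numpy as np

def hidden_size(N, O=1, a=10):
    """Hidden-layer size from the empirical formula H = sqrt(N + O) + a."""
    return round(math.sqrt(N + O) + a)

def gdm_step(w, grad, velocity, lr=0.01, momentum=0.9):
    """One gradient-descent-with-momentum update in the style of traingdm:
    the new step blends the previous velocity with the current gradient."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity
```

With N = 7 the formula gives H = 13, and with N = 31 it gives H = 16, matching the two cases in the text.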
5-fold cross-validation is used to partition the samples obtained from PCA into 5 disjoint subsamples, randomly and evenly. Of these, a single subsample is retained as validation data to evaluate the prediction accuracy, and the remaining 4 subsets together form the training set used to establish the model. This cross-validation process is repeated 5 times, each subset serving exactly once as the validation set.
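The partitioning just described can be sketched as follows; the function name and seed are illustrative:

```python
import numpy as np

def five_fold_indices(n, seed=0):
    """Randomly partition n sample indices into 5 disjoint, nearly even
    folds; yield (train, validation) index pairs, each fold serving
    once as the validation set."""
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, 5)
    for i in range(5):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(5) if j != i])
        yield train, val
```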
In this section, the Mean Square Error (MSE) and the correct ratio are used to evaluate the prediction performance of the PCA-BPNN model. The correct ratio (CR) is the percentage of samples predicted correctly out of the total sample. Note in particular that a predicted quality output is marked as qualified when it is smaller than 0, and otherwise as unqualified.
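The two evaluation measures amount to the following; the thresholding rule follows the paper's description, while the function names are illustrative:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Square Error between targets and predictions."""
    return float(np.mean((y_true - y_pred) ** 2))

def correct_ratio(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label.
    Outputs below 0 are marked qualified (-1), otherwise unqualified (+1)."""
    labels = np.where(y_pred < 0, -1, 1)
    return float(np.mean(labels == y_true))
```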
three models is roughly equal, the biggest MSE of the plain BPNN (2.7993) is more than 10 times that of the other two models (0.27298). Therefore, introducing PCA into the BPNN model improves not only the stability but also the accuracy of the quality prediction.
Table 2. Differences between three quality prediction models
4. Conclusion
The PCA-BPNN model is designed and simulated to predict product quality for modern industry with its high-dimensional production parameters. The simulation results show that, when facing massive, high-dimensional data, the PCA-BPNN model can reduce the input dimension effectively using PCA without significant information loss. In this way the scale of the neural network is reduced, convergence is accelerated, and prediction accuracy is improved. Compared with the traditional BPNN, the performance of PCA-BPNN is quite stable, and the problem of performance oscillation is solved effectively.
Although PCA-BPNN shows a prominent superiority over BPNN, it still has defects to be improved; for example, it is worth researching how to optimize the model performance and avoid being trapped in local optima.
Acknowledgement
This research was supported in part by the Ministry of Science and Technology of
R.O.C. under contract MOST 105-2221-E-216 -015 -MY2.
References
[1] W. H. Woodall and C. M. Douglas, Research issues and ideas in statistical process control, Journal of
Quality Technology, 31 (1999), 376.
[2] I. A. Basheer and M. Hajmeer, Artificial neural networks: fundamentals, computing, design, and
application, Journal of Microbiological Methods, 43 (2000), 3-31.
[3] A. T. C. Goh, Back-propagation neural networks for modeling complex systems, Artificial Intelligence in
Engineering, 9 (1995), 143-151.
[4] I. T. Jolliffe, Principal Component Analysis, New York: Springer-Verlag, 1986.
[5] G. E. Batista and M. C. Monard, A study of k-nearest neighbour as an imputation method, HIS, 87 (2002),
251–260.
[6] M. C. Michael and A. Johnston, Secom Data Sets of UCI Machine Learning Repository,
https://archive.ics.uci.edu/ml/datasets.html, 2008.
[7] P. G. Benardos and G. C. Vosniakos, Optimizing feedforward artificial neural network architecture.
Engineering Application of Artificial Intelligence, 20 (2007): 365-382.
[8] C. Macbeth and H. Dai, Effects of learning parameters on learning procedure and performance of a
BPNN. Neural Networks the Official Journal of the International Neural Network Society, 10 (1997),
1505-1521.
Fuzzy Systems and Data Mining II 397
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-397
Introduction
Online academic advising systems can provide prompt, effective and efficient advice, enhance the student experience and save institutional resources.
In [1], a prototype of a web-based intelligent student advising system using collaborative filtering [2, 3] was developed as a proof of concept. In this system, students are sorted into groups; if a student is determined to be similar to a group of students, a course preferred by that group may be recommended to the student [1]. Real student data, with complete records of all 743 students enrolled in over 50 courses in the Bachelor of Computing Systems (BCS), were anonymized and used for training and testing the prototype of the system. This made it easy to integrate the system with our current student management system, so our students do not need to create a profile to use it. Students intending to complete the BCS program of study could use the system to help them decide the pathway and papers to enroll in, as it gives advice that considers their current application, academic transcripts and cultural background. The solution is cost-effective, scalable and easily accessible by students,
1 Corresponding Author: Xiao-Song Li, Practice Pathway of Computer Science, Unitec Institute of Technology, Auckland, New Zealand; E-mail: xli@unitec.ac.nz.
398 X. Li / The Study of an Improved Intelligent Student Advising System
lecturers and data analysts who can benefit from the long-term investment in smart
tools.
The data were divided into two sets: training data (372 records) and testing data (371 records) [1]. The following attributes were defined based on the existing student records [1]:
• GPA: relevant to performance.
• Age: relevant to family stress; e.g. mature students are more likely to have family commitments.
• Ethnicity: relevant to English competency and family background.
• Gender: relevant to learning style and how students cope with the provided learning facilities.
The K-means algorithm, rather than other clustering techniques, was chosen to determine the similarity of the students [4, 5] for the prototype, due to its ease of use and fitness for purpose. K = 7 was identified as the most informative and effective value for the K-means algorithm used in this system [1]. The algorithm was implemented using the C# procedure provided by [6].
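A minimal version of the K-means clustering used here can be sketched as follows (plain Lloyd's algorithm over numeric feature vectors; this is an illustration, not the C# procedure from [6]):

```python
import numpy as np

def kmeans(X, k=7, iters=50, seed=0):
    """Lloyd's algorithm: assign each record to the nearest centroid,
    then recompute centroids, until the centroids stop moving."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Distance of every record to every centroid.
        d = np.linalg.norm(X[:, None, :] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster went empty.
        new = np.array([X[labels == j].mean(0) if (labels == j).any()
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids
```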
For verification, the training data were used to predict the preferences of the testing data with K = 7. The results were very close for all clusters except cluster 3, where there were around 10% differences in course preference between the two datasets across the three pathways, Software Development (SD), Networking and Security (NS) and Business Intelligence (BI), and the remaining courses (Other).
To investigate further how to improve the initial system and address the above issue, other clustering methods were considered, such as Cobweb. In [7], the Cobweb algorithm was combined with K-means: Cobweb was used to produce a balanced tree with sub-clusters at the leaves, and K-means was then applied to the resulting sub-clusters.
An outline of the procedure to provide guidelines for students' course and pathway selection was given in [1]; however, it was not completely implemented in the initial system.
An improved system was designed and implemented. Its main objectives are to: improve the user interface of the initial system for robustness and usability; recommend popular pathways for different groups of students; provide pathway and course selection for a student logged into the system; and improve the recommendation quality of the initial system. This paper describes the results of the first phase of the improved system.
In the rest of this paper, the improved system is described, the experimental results of the K-means and Cobweb algorithms are compared, and a summary and future work are given at the end.
The improved system includes three major improvements: a) the initial system was
integrated with WEKA (Waikato Environment for Knowledge Analysis); b) both K-
means algorithm and Cobweb algorithm were implemented for training and testing data
sets; and c) course and pathway recommendations were implemented.
WEKA is a widely used open-source machine learning and data mining package developed in Java. To assure the correctness of the test results and to experiment with more machine learning algorithms efficiently, WEKA was integrated with the initial intelligent student advising system, which is implemented in ASP.NET; the IKVM software was used for the integration. The K-means algorithm was re-implemented using WEKA (training and testing).
The recommendation procedure [1] was implemented for the K-means algorithm with K = 7. Given a student record xi, the study pathway the student could take, or the courses the student could take next semester, are recommended as follows:
1) Generate clusters C = {ci}, i = 1, ..., k, by running the K-means algorithm on the whole dataset, where k = 7.
2) Identify which cluster xi belongs to, say cm, where 1 <= m <= 7.
3) Find the 12 most popular courses in cm, eliminate those xi has taken, and recommend the rest to xi.
4) Calculate the average mark of every course taken by the students in cluster cm, select the top 12 courses, eliminate those xi has taken, and recommend the remaining courses with the highest average marks.
5) Recommend the most popular pathway (major) in cluster cm to xi. This is particularly useful for a new student or for a student who wishes to change pathway.
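Steps 3) to 5) can be sketched as follows; the record fields "courses", "marks" and "pathway" are illustrative assumptions about the data layout, not the system's actual schema:

```python
from collections import Counter

def recommend(student, cluster_records, n_top=12):
    """Rank courses in the student's cluster by popularity and by
    average mark, drop courses already taken, and return the most
    popular pathway in the cluster."""
    taken = set(student["courses"])

    # Step 3: the n_top most popular courses, minus those already taken.
    pop = Counter(c for r in cluster_records for c in r["courses"])
    by_popularity = [c for c, _ in pop.most_common(n_top) if c not in taken]

    # Step 4: courses ranked by average mark in the cluster.
    marks = {}
    for r in cluster_records:
        for c, m in r["marks"].items():
            marks.setdefault(c, []).append(m)
    by_avg = sorted(marks, key=lambda c: -sum(marks[c]) / len(marks[c]))
    by_marks = [c for c in by_avg[:n_top] if c not in taken]

    # Step 5: the most popular pathway (major) in the cluster.
    pathway = Counter(r["pathway"] for r in cluster_records).most_common(1)[0][0]
    return by_popularity, by_marks, pathway
```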
Figure 1 shows the recommendations after a Software Development pathway student logged in. As the student's record was taken in his third year, he had taken most of the lower-level software courses; therefore the majority of the recommended courses were third-year software courses, such as Java Enterprise Programming, Mobile Software Development and Data Warehousing. The recommended study pathway was Software Development, which is correct. The courses recommended on the basis of the highest average marks also included some popular networking courses; further investigation is required.
Figure 2 shows the K-means results from the initial system with K = 7, and Figure 3 the K-means results from the improved, WEKA-based system with K = 7. To verify that the results from both systems are correct, their K-means results were compared for K = 1, ..., 7, using the same datasets and attributes. The whole set was divided 50:50 into the training set and the testing set; see Table 1.
Table 1. The testing datasets
The Cobweb method was developed for clustering objects in an object-attribute dataset. It yields a classification tree that characterizes each cluster with a probabilistic description. Clusters are built by incrementally adding instances to the tree, merging an instance into an existing cluster when this yields a higher "Category Utility (CU)" value than giving the instance its own cluster; if the need arises, an existing cluster may also be split into two new clusters when this benefits the CU value [9]. Unlike K-means, Cobweb finds a number of clusters that depends on the parameter settings chosen by the user.
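The CU criterion can be sketched as follows for nominal attributes (a simplified form of the standard category utility, ignoring the acuity handling needed for numerical attributes; all names are illustrative):

```python
from collections import Counter

def category_utility(clusters):
    """Category utility over nominal attributes: the expected gain in
    attribute-value predictability from knowing the cluster, averaged
    over the k clusters. Each cluster is a list of attribute tuples."""
    all_rows = [row for c in clusters for row in c]
    n, k = len(all_rows), len(clusters)
    n_attrs = len(all_rows[0])

    def sq_prob_sum(rows):
        # Sum over attributes of the squared value probabilities.
        total = 0.0
        for a in range(n_attrs):
            counts = Counter(r[a] for r in rows)
            total += sum((v / len(rows)) ** 2 for v in counts.values())
        return total

    base = sq_prob_sum(all_rows)            # predictability without clusters
    gain = sum(len(c) / n * (sq_prob_sum(c) - base) for c in clusters)
    return gain / k
```

A clustering that groups identical rows together scores higher than one that mixes them, which is the signal Cobweb uses when deciding to merge or split.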
The improved system adopted the Cobweb functionality of WEKA. Experiments were conducted to discover the most suitable parameter settings for the improved system, which helped Cobweb discover 4 clusters from the three datasets. The same datasets as for the K-means algorithm were used. The experiments required configuring the following three parameters:
• Acuity: the minimum standard deviation of a cluster attribute; it only matters for numerical attributes. The default acuity is 1.0.
• Cutoff: the minimum category utility of a cluster attribute. The default cutoff is 0.0028209479177387815.
• Seed: the random number seed to be used. The default seed is 42.
According to [9], the CU is the main value that determines which cluster an instance belongs to; if the dataset has no numerical attributes, the Acuity parameter does not need to be adjusted to find the aimed-for set of clusters. The datasets used in these experiments have both numerical and nominal attributes; therefore both Cutoff and Acuity were configured.
The number of clusters increased as the Cutoff was increased, up to the maximum number of clusters that could be found, and decreased when the Cutoff was decreased. Unlike with the Cutoff, configuring the Acuity and Seed showed no steady trend: as the Acuity was increased or decreased, the number of clusters discovered was sometimes smaller and sometimes larger than with the previous setting.
Figure 4 shows the Cobweb results based on WEKA. Figure 5 shows the Cobweb
tree structure and the resulting clusters for the training dataset; Figure 6 shows the
Cobweb tree structure and the resulting clusters for the testing dataset; Figure 7 shows
the Cobweb tree structure and the resulting clusters for the whole dataset.
New instances may be added dynamically, and Cobweb will then seek updated values for its parameters to produce a new set of clusters. This makes Cobweb less efficient and therefore not likely to be recommended in preference to K-means.
The two algorithms were compared on the WEKA-based improved system when 4 clusters were produced by each. The same method as in [1] was used to verify K-means and Cobweb on the improved system: the training data were used to predict the preferences of the testing data for the four clusters, and the predictions were then compared.
Comparing the differences between the testing data and the training data for the K-means algorithm in Table 2, the biggest difference was in cluster 4 of the BI pathway, i.e. 6.68%; for the Cobweb algorithm, on the other hand, the maximum difference was around 5% over all clusters and pathways. This suggests that recommendations based on the Cobweb algorithm could be more reliable.
Table 2. The experiment results on different datasets
This paper has described an improved intelligent student advising system. Compared to the initial system, the improvements include: integrating the initial system with WEKA; implementing both the K-means and Cobweb algorithms for the training and testing datasets; and implementing course and pathway recommendations.
The recommendations given by the improved system were based on the K-means algorithm, and the results were meaningful. However, the quality of the recommendations could be improved; for that purpose, the Cobweb algorithm was also tried. The results showed that it is hard to identify proper parameters for Cobweb to produce meaningful clusters, and suggested that Cobweb is less efficient than K-means for this system, though more reliable. Future research should focus on improving the efficiency of the Cobweb algorithm.
To improve the quality of the recommendations, two new attributes, learning styles and personal interests, were considered. Thirty-five learning styles were introduced, including self-motivation, curiosity and adaptability; personal interests included cooking, sports and reading. The improved system lets students select their learning styles and personal interests after they log in, and saves these into the database along with the other data in each student's record. This function can be used in the future for data collection, to include the two attributes in the data model of the recommendation system.
Acknowledgment
The author would like to thank the “Intelligent Student Advising System” BCS project
team (Xingyu Liu, Jehee Hwang, Obert Ye and Xianbo Lu) for their implementation
and experiments.
References
[1] K. Ganeshan and X. Li, An Intelligent Student Advising System using collaborative filtering, Proceedings
of Frontiers in Education, 2015, 2194–2201.
[2] T. Jones. Recommender systems, Part 1: Introduction to approaches and algorithms. IBM Developer
Works: 12 December 2013.
[3] Collaborative Filtering. Web Whompers. Retrieved 10th April 2014 from:
http://webwhompers.com/collaborative-filtering.html.
[4] K means Clustering. OhMyPHD. Retrieved 23rd March 2014 from: http://www.onmyphd.com/?p=k-
means.clustering#h3_badexample.
[5] A. K. Jain, Data clustering: 50 years beyond K-means. Pattern recognition letters, 31(8), (2010), 651-
666.
[6] J. McCaffrey, K Means Clustering Using C#. Visual Studio Magazine. (2013) Retrieved 23rd March
2014 from: http://visualstudiomagazine.com/articles/2013/12/01/k-means-data-clustering-using-c.aspx
[7] M. Li, G. Holmes and B. Pfahringer, Clustering large datasets using cobweb and K-means in tandem,
Proceedings of the 17th Australian Joint Conference on Artificial Intelligence, G.I. Webb & Xinghuo Yu
(Eds.), Cairns, Australia, December 4-6, 2004, pp. 368-379. Berlin: Springer.
[8] J. Hu (2012). Clustering – An unsupervised learning method. Retrieved from
http://www.aboutdm.com/2012/12/clustering-unsupervised-learning-method.html
[9] P. Spronck (2005). Lab 7: Clustering. Tilburg centre for Cognition and Communication, Tilburg
University, The Netherlands.
Fuzzy Systems and Data Mining II 407
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-407
Abstract. The safety of access control systems has gained more and more attention, yet current general access control systems do not consider whether the user's terminal equipment meets the requirements of the security policy. This paper presents a security access control model based on the 802.1x protocol, extended by adding authentication information, to strengthen policy control over authenticated users and achieve security control of accessing users. The model is implemented using plaintext from the Kaspersky anti-virus software as the extension information; the server realizes the authentication by a simple string comparison. The test results show that the model can check whether a host meets the policy before it accesses the network, restrict non-compliant hosts, and effectively enhance network security control.
Introduction
With the popularity of networks and enterprise informatization, most companies now choose Ethernet because it is simple, flexible and cheap. However, the inherent weaknesses of Ethernet security make user-level access control necessary. 802.1x is a standard defined by the IEEE for Port-Based Network Access Control [1]. The IEEE 802.1x protocol inherits the advantages of the IEEE 802 LAN and provides a measure to authenticate and authorize a user who has connected to the local area network [2].
The currently popular security access control model is based on the 802.1x protocol, which authenticates a user name and password and allows only legitimate users to access the network [3]. Authentication and key agreement schemes are widely adopted in many applications [4]. The Ruijie authentication system, for example, combines account, password, IP address and MAC address, effectively preventing illegal users from accessing the internal network. In the practical application of such systems, we have found that some users have the right to enter the
1 Corresponding Author: Han-Ying CHEN, College of Information Science and Technology, Jinan University, 510632, Guangzhou, China; E-mail: jackchenhy@hotmail.com.
408 H.-Y. Chen and X.-L. Liu / An Enhanced Identity Authentication Security Access Control Model
network, but their computers may not be suitable for accessing the network. As we know, computers and other mobile devices are easily infected with viruses, trojans and other malicious code; when such a device re-accesses the network, it inadvertently brings malicious code into the internal network environment, affecting network security [5-6].
This model only solves the problem of user identity, without taking the safety of the terminal equipment into account. We look forward to a security access control system that can check whether a host meets the security policy before it accesses the network, and restrict a suspicious host's access until it has taken appropriate safety measures. This not only helps the host avoid becoming the target of malicious code attacks, but also helps it avoid becoming a source of malicious code. In this regard, this paper presents a security access control model that uses the client's security situation (such as the installation and running status of anti-virus software) as extended authentication information, to strengthen the security policy control of terminal equipment.
In the proposed model, the client adds the various pieces of information collected on the host as extension information to the user name frame; the device then packages it and forwards it to the server. The extended information is verified first; if it meets the requirements, the user name and password are then authenticated, otherwise the authentication request is rejected [12]. The authentication process is shown in Figure 1.
The authentication process of the model:
• When the user needs to access the network, he opens the 802.1x client program and enters the registered user name and password to initiate the connection request. The client program sends an EAPOL-Start frame requesting authentication to the device, triggering authentication.
3. Model Realization
The concrete realization of the client program is based on the open-source ThorClient2 program, to which extension-information gathering functions are added. The overall structure of the improved authentication client program is shown in Figure 2.
The main security goals of such a client program are authentication and privacy. The protocol allows a client to authenticate and establish a secure session key with a server over an insecure channel [13]. On the login screen the user enters the user name and password and selects the network adapter, and may call the configuration module to configure further details if necessary. The configuration information is saved for later authentication. The EAPOL module is the core of the software and implements the PAE functions of the 802.1x authentication client. The data transmit/receive module captures the authentication data frames of the authenticator system; authentication request/response frames are constructed according to the state of the requesting PAE state machine and passed back to the data transmit/receive module for forwarding. The MD5 module encrypts the user name and password.
The extension-information collection module collects various kinds of system information in accordance with preset policies and sends it to the EAPOL module. This paper simplifies the module to test whether the local system has installed and is running the Kaspersky anti-virus software, and to generate the extended information accordingly. The concrete method is to read the latest start time and the last update time of Kaspersky from the Windows registry, check whether avp.exe processes are running as the SYSTEM user and the current user in the system process list, and then transmit the information to the EAPOL module.
The EAPOL module adds the user name from the user interface and the extension information from the information gathering module to the EAP-Response/Identity frame; the two are separated by a 0 byte. To facilitate processing by the Radius proxy, we set the user-name field and the extension-information field to fixed lengths, padding with 0 when the content is shorter.
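This fixed-length layout can be sketched in a few lines; the 32- and 64-byte field lengths and the function names below are illustrative assumptions, not values taken from the paper.

```python
# Sketch of the EAP-Response/Identity payload layout described above:
# a fixed-length user-name field, a 0-byte separator, and a fixed-length
# extension-information field, each zero-padded. Field lengths are assumed.
USERNAME_LEN = 32
EXTINFO_LEN = 64

def pack_identity(username: bytes, ext_info: bytes) -> bytes:
    """Pack user name and extension info into one fixed-length payload."""
    if len(username) > USERNAME_LEN or len(ext_info) > EXTINFO_LEN:
        raise ValueError("field too long")
    return (username.ljust(USERNAME_LEN, b"\x00")    # user-name field
            + b"\x00"                                # separator byte 0
            + ext_info.ljust(EXTINFO_LEN, b"\x00"))  # extension field

def unpack_identity(payload: bytes) -> tuple:
    """Inverse operation, as the Radius proxy would perform it."""
    name = payload[:USERNAME_LEN].rstrip(b"\x00")
    ext = payload[USERNAME_LEN + 1:].rstrip(b"\x00")
    return name, ext
```

Fixed-length fields let the proxy split the frame by offset alone, without scanning for the separator.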
In this proposal, the Radius agency's main task is to parse and process the received UDP packets.
According to the direction of the UDP packets, the program is divided into two threads: 1) For packets from the device, the agency determines whether the packet is a Radius Access-Request packet with EAP-Response/Identity. If it is, the agency authenticates its extension information and forwards packets meeting the requirements to the server; if a packet does not meet the requirements, the agency replies with a Radius Access-Reject packet to signal the refused authentication and terminates the authentication. EAP-Response/MD5-Challenge packets are forwarded to the server, while other non-EAP-Response/Identity packets are discarded (the process is shown in Figure 3 under the Radius agency process). 2) Packets from the server are directly forwarded to the device. This program simplifies the extension-information authentication to a string comparison.
The purpose of this test is to verify the transmission of the extension information and the authentication results.
Test environment: 1) We use a Netgear GSM7312 switch as the device, with its server address set to point at the Radius agency; 2) We set the Winradius server address within the Radius agency, and the policy requires that the client be running Kaspersky anti-virus software updated within the last 7 days; 3) We use Winradius as the authentication server; 4) The client program discussed in the previous section runs on the client host (Winpcap 4.0 required); 5) The Radius agency, Winradius and the client program all run in a Windows XP SP3 environment.
Test I: The client host had Kaspersky Anti-Virus software installed but not running, and we entered the correct password in the client program. From the test results (Figure 4, Test I (RADIUS agent)) we can see that the Radius agent found the extension information did not meet the security policy while authenticating it, and replied with an Access-Reject packet to reject the authentication request. The Winradius server did not give any response. The client displayed an authentication failure and prompted the user to check the anti-virus software. The client host still could not access the network.
5. Conclusion
In this paper, we proposed a security access control model that realizes the transmission and authentication of extended authentication information from client to server over the 802.1x protocol. The model is implemented using plaintext information about the Kaspersky Anti-Virus software as the extension information, and the server realizes the authentication by simple string comparison. In fact, the extension information could also describe the client's operating system, software environment, network conditions and other data, in ciphertext or plaintext, and the server-side authentication of the extension information can likewise be varied. Through the authentication of extension information, this security access control model allows only networking equipment that meets the security policy to access the network, reducing security threats from inside the network. The model can be applied in systems similar to the test environment above and can resist attacks when the extension information is simple or single-sourced; if the extended information consists of a variety of data sources, the present model does not yet work. In the future, we will carry out more experiments to improve the performance of the model. The model still has some defects to be dealt with: the access control model is not comprehensive enough and should be made applicable to a wider range of systems, and another limitation is how to resist increasingly complex attacks. All these problems need to be studied in the future.
References
[1] N. Hoque, Monowar H. Bhuyan, R.C. Baishya, D.K. Bhattacharyya, J.K. Kalita, Network attacks:
Taxonomy, tools and systems, Journal of Network and Computer Applications, 40(2014),307-324.
[2] X. Wang , Research on the 802.1x Authentication Mechanism and Existing Defects, International
Conference on Advanced Mechanical Engineering (AME 2012),192(2012), 385-389.
[3] D. Yadav, A. Sardana, Authentication Process in IEEE 802.11: Current Issues and Challenges, Advances
in Network Security and Applications, 196(2011), 100-112.
[4] I.P. Chang, T.F. Lee, T.H. Lin, C.M. Liu, Enhanced Two-Factor Authentication and Key Agreement Using Dynamic Identities in Wireless Sensor Networks, Sensors, 15(2015), 29841-29854.
[5] J. Soryal, T. Saadawi, IEEE 802.11 DoS attack detection and mitigation utilizing Cross Layer Design. Ad
Hoc Networks, 14(2014), 71-83.
[6] M. Cheminod, L. Durante, L. Seno, A. Valenzano, Detection of attacks based on known vulnerabilities in
industrial networked systems, Journal of Information Security and Applications, 2016.
[7] S. Hong, J. Park, S. Han, J. Pyun, J. Lee. Design of WLAN Secure System against Weaknesses of the
IEEE 802.1x, Advances in Hybrid Information Technology, 4413(2007), 617-627.
[8] C. Chen, J. Zhang, J. Liu, Design in the Authentication and Billing System Based on Radius and 802.1x
Protocol, International Symposium on Computers and Informatics (ISCI), 13(2015),1438-1443.
[9] A.K. Dalai, S.K. Panigrahy, S.K. Jena, A Novel Approach for Message Authentication to Prevent
Parameter Tampering Attack in Web Applications, Procedia Engineering, 38(2012), 1495-1500.
[10] B. Eshmurzaev, G. Dalkilic, Analysis of EAP-FAST Protocol, Proceedings of the ITI 2012 34th International Conference on Information Technology Interfaces (ITI), (2012), 417-422.
[11] J. Hur, C. Park, H. Yoon, An Efficient Pre-authentication Scheme for IEEE 802.11-Based Vehicular
Networks , Advances in Information and Computer Security, 4752(2007), 121-136.
[12] S. Wijesekera, X. Huang, D. Sharma, Utilization of Agents for Key Distribution in IEEE
802.11,Advances in Intelligent Decision Technologies, 4(2010),435- 443.
[13] M. Farash, M. Attari, An efficient client-client password-based authentication scheme with provable security, Journal of Supercomputing, 70(2014), 1002-1022.
414 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-414
Abstract. Designing the E-R model is still the key process in database design. This paper proposes a novel method to recommend entities for the E-R model by means of OWL ontology. OWL ontologies are open, shared knowledge that anybody can easily obtain from the internet. We adopt ontology reasoning techniques to capture domain concepts according to a set of predefined vocabularies. Corresponding concepts in the OWL ontology are first pinpointed by a preprocessing module. These concepts are then used as seeds that are extended, deleted and modified by an operation module. Through this series of processes we acquire reasonable entities for the E-R model. Our experimental study shows that the entities recommended by this method approach the expert level.
Introduction
The E-R model introduced by Peter Chen [1] is a form of knowledge representation. With the development of the semantic web and research on knowledge representation, domain ontology has gradually become one of the main representations of domain knowledge, for instance the FOAF ontology [2], the GO ontology [3] and the SWEET ontology [4]. In addition, description logic languages [5] can not only express the explicit knowledge in a domain but also entail implicit knowledge.
Many researchers have devised methods to construct relational databases, and some of them build on domain ontologies. For example, Storey et al. [6] presented an ontology that could classify terms into several categories. Sugumaran et al. [7] proposed a methodology for supporting database design and evaluation. Gali et al. [8] gave a method for querying ontologies, which stored ontology information in relational tables. Vysniauskas et al. [9] put forward algorithms to convert well-formed ontologies to relational databases. In this paper, a new method is proposed to recommend entities for the E-R model: we obtain relevant concepts from a domain ontology by description logic reasoning.
1 Corresponding Author: Jie LIU, College of Software, Jilin University, Changchun 130012, China; E-mail: liu_jie@jlu.edu.cn.
X.-X. Xu et al. / Recommending Entities for E-R Model by Ontology Reasoning Techniques 415
1. Resolved Framework
2. Preprocessing Module
The expression of a concept provided by customers may differ from the expression of the knowledge in the domain ontology. Besides, the requirement terms provided by customers may not be comprehensive, so we need to extend them further. We first call the Hierarchy API of the reasoner to generate the concept hierarchy tree of the domain ontology, by which database designers can grasp the whole hierarchy of the OWL ontology. Then, using WordNet [10] combined with semantic similarity comparison [11], we obtain synonyms of the requirement terms in the ontology by calculating the similarity value of two concepts. The similar concepts gained from the ontology and these requirement terms are pinpointed on the concept hierarchy tree. In this way we realize the initial extension of the requirement terms.
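A minimal sketch of this pinpointing step follows, in which a plain string-similarity measure from Python's difflib stands in for the WordNet-based semantic similarity of [10-11]; the 0.6 threshold and all names here are our assumptions.

```python
# Map each requirement term to its most similar ontology concept, forming
# PTSet. difflib's ratio is only a stand-in for the WordNet-based semantic
# similarity used in the paper; threshold and names are illustrative.
from difflib import SequenceMatcher

def similarity(a, b):
    """Crude lexical similarity in [0, 1] between two concept names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def pinpoint_terms(requirement_terms, ontology_concepts, threshold=0.6):
    """Return PTSet: ontology concepts pinpointed by the requirement terms."""
    ptset = set()
    for term in requirement_terms:
        best = max(ontology_concepts, key=lambda c: similarity(term, c))
        if similarity(term, best) >= threshold:   # below threshold: no match
            ptset.add(best)
    return ptset
```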
For designers and customers it is inconvenient to pinpoint or add a concept in a large chart, so we select a part of the chart from the whole concept hierarchy tree:
(1) owl:Thing and its direct child concepts
(2) All concepts in PTSet
(3) The direct child concepts and father concepts of concepts in PTSet
(4) Concepts at the same level as concepts in PTSet
owl:Thing and its direct child concepts define the scope of the concepts. The concepts in PTSet are the nearest ones to the customers' requirements, and their child and father concepts are very important for refining the entities of the E-R model. Last, the concepts at the same level as those in PTSet give approximately the concepts and entities of similar granularity.
Algorithm. getPartHierarchies
Input: HierarchiesTree, PTSet
Output: PartHierarchies: the part concept hierarchy tree
root ← HierarchiesTree
for each concept in root
    PartHierarchies.add(concept);
    for each sub in concept
        PartHierarchies.add(sub);
    end for
end for
for each concept in PTSet
    PartHierarchies.add(concept);
    for each sub in concept
        PartHierarchies.add(sub);
    end for
    for each sup in concept
        PartHierarchies.add(sup);
        buildRelation(sup, root.sub);
    end for
    concept* ← getSameLayer(concept);
    PartHierarchies.add(concept*);
    buildRelation(concept*, root.sub);
end for
return PartHierarchies;
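The algorithm can be rendered as runnable code under an assumed representation in which the hierarchy tree is a dict from each concept to its direct children; all names here are ours, and the buildRelation bookkeeping is folded into plain set insertion.

```python
# Executable sketch of getPartHierarchies. The hierarchy tree is modeled as
# a dict mapping each concept to a list of its direct children, with "Thing"
# as the root; this representation is an assumption, not the paper's.
def get_part_hierarchies(tree, ptset, root="Thing"):
    """Select the part of the concept hierarchy tree described in the text:
    the root's direct children, plus each PTSet concept with its children,
    its father, and the concepts at its level (the father's children)."""
    # Invert the tree once so fathers can be looked up directly.
    parents = {c: p for p, children in tree.items() for c in children}
    part = {root}
    part.update(tree.get(root, []))            # owl:Thing + direct children
    for concept in ptset:
        part.add(concept)                      # the PTSet concept itself
        part.update(tree.get(concept, []))     # its direct child concepts
        father = parents.get(concept)
        if father is not None:
            part.add(father)                   # its father concept
            part.update(tree.get(father, []))  # same-level concepts
    return part
```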
Combining the structural features of the E-R model with the concepts in PTSet, we give some principles to delete concepts from the part hierarchy tree, saving the deleted concepts in a set of deleted concept terms (DTSet) that is later used to modify entities.
Principle 1 If there is an inclusion relation between concepts, the father concept has only one child concept, and the two concepts have different relations R with other concepts in the part hierarchy tree, then retain both concepts.
Principle 2 If there is an inclusion relation between concepts and the concepts have the same relations R with other concepts, then retain the concept appearing in PTSet.
Principle 3 If there is an inclusion relation between concepts, the father concept has more than one child concept, and the concepts have the same relations R with other concepts, then retain the father concept and delete the child concepts.
Principle 4 If there is an inclusion relation between concepts, the father concept has more than one child concept, and the concepts have different relations R with others, then delete the child concepts and retain the father concept, but keep those child concepts whose relations differ from the father concept's.
Through the steps above we have basically established the entities to recommend for the E-R model. However, we cannot exclude that there are isolated entities (entities that have no relations with the others). Directly deleting isolated entities may cause the loss of useful information, so we propose the following principles for modifying isolated entities.
Principle 1 If there are isolated entities, use the related concepts in DTSet (such as the father concept or a child concept, not considering Thing or Nothing) in place of the isolated entities and obtain the relations R among the entities again. If the replacements are no longer isolated, keep these related concepts in place of the isolated entities.
Principle 2 If the related concepts are still isolated, continue to look for related concepts of those related concepts until there are no concepts left to substitute; then delete the isolated entities and all their related concepts.
Principle 3 If there are no isolated entities, do not modify the entities.
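A one-level simplification of these modification principles can be sketched as follows (the paper's Principle 2 iterates further through related concepts of related concepts); the data shapes and names are assumptions.

```python
# Hypothetical sketch of the isolated-entity repair principles. `relations`
# maps each entity to the set of entities it relates to; `dt_related` maps
# an entity to its related concepts recorded in DTSet. Names are ours.
def repair_isolated(entities, relations, dt_related):
    """Replace isolated entities with related DTSet concepts that do have
    relations; drop isolated entities that cannot be repaired. This is a
    one-level simplification: no recursive search as in Principle 2."""
    result = set(entities)
    for e in entities:
        if relations.get(e):                 # Principle 3: not isolated
            continue
        replaced = False
        for cand in dt_related.get(e, []):   # Principle 1: try DTSet concepts
            if relations.get(cand):          # candidate restores a relation
                result.discard(e)
                result.add(cand)
                replaced = True
                break
        if not replaced:                     # simplified Principle 2: delete
            result.discard(e)
    return result
```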
2 https://sweet.jpl.nasa.gov
3 http://www.geneontology.org/
For the geological domain, the statistical information is shown in Table 1. We first computed the numbers of entities recommended by the designer and by experts, listed in columns 1 and 2 respectively. Then, through matching, we calculated the numbers of repetitive entities and relevant entities in columns 3 and 5.
Table 1. The number and rate of entities recommended in geological domain
5. Conclusions
References
[1] P.P.S. Chen, The entity-relationship model—toward a unified view of data. ACM Transactions on Database Systems (TODS) 1(1) (1976), 9-36.
[2] M. Graves, A. Constabaris, D. Brickley, FOAF: Connecting people on the semantic web. Cataloging & Classification Quarterly 43 (2007), 191-202.
[3] Gene Ontology Consortium, The Gene Ontology (GO) database and informatics resource. Nucleic Acids Research 32(suppl 1) (2004), D258-D261.
[4] R. Raskin, M. Pan, Semantic web for earth and environmental terminology (SWEET). Proc. of the Workshop on Semantic Web Technologies for Searching and Retrieving Scientific Data, 2003.
[5] F. Baader, The Description Logic Handbook, Cambridge University Press, Cambridge, 2003.
[6] V.C. Storey, D. Dey, H. Ullrich, et al., An ontology-based expert system for database design. Data & Knowledge Engineering 28(1) (1998), 31-46.
[7] V. Sugumaran, V.C. Storey, The role of domain ontologies in database design: An ontology management and conceptual modeling environment. ACM Transactions on Database Systems (TODS) 31(3) (2006), 1064-1094.
[8] A. Gali, C.X. Chen, et al., From ontology to relational databases. International Conference on Conceptual Modeling, 2004, 278-289.
[9] E. Vysniauskas, L. Nemuraite, Transforming ontology representation from OWL to relational database. Information Technology and Control 35(3) (2015).
[10] G.A. Miller, WordNet: a lexical database for English. Communications of the ACM 38(11) (1995), 39-41.
[11] D. Lin, Automatic retrieval and clustering of similar words. Association for Computational Linguistics, 1998, 768-774.
420 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-420
Introduction
In recent years, UWSNs have drawn considerable attentions from both academy and
industry. It facilitates a wide range of aquatic applications such as undersea exploration,
assisted navigation, environmental monitoring, disaster prevention and mine
reconnaissance. And most of them require that nodes have consistent time. However,
the local clock of nodes has an intrinsic drift because nodes go out of sync as time
elapses. Therefore, time synchronization is a necessary prerequisite for underwater
network system [1-4]. Unlike terrestrial wireless sensor networks, there are many
characteristics in UWSNs such as large and unstable propagation delay, node mobility
and severe signal attenuation, which increase the difficulty of time synchronization.
Furthermore, wireless sensors are powered by battery, and replacing the battery is very
1
Corresponding Author: Meng-Na Zhang, 127West Youyi Road, Xian, Shaanxi, China; Email:
mengnazhang@163.com.
2
Corresponding Author: Xiao-Hong Shen, 127West Youyi Road, Xian, Shaanxi, China;
E-mail: xhshen@nwpu.edu.cn.
M.-N. Zhang et al. / V-Sync: A Velocity-Based Time Synchronization 421
1. Algorithm Design
In a distributed network, a node calculates its local time with its crystal oscillator. However, different nodes' crystals have slightly different frequencies due to hardware and manufacturing variations. Thus they have different internal clocks, so the times of different nodes are not synchronous. The relationship between the standard time and the local time is
T = αt + β (1)
where T is the standard time, t is the node's local time, and α and β are the clock frequency skew and the time offset. When there is no deviation, α and β are 1 and 0 respectively.
The basic idea of time synchronization is that the slave node estimates the clock frequency skew and time offset by exchanging messages, then compensates for the deviation between the local and standard time to keep them synchronized. The sync error is the difference between the two after synchronization.
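This idea can be illustrated with a toy two-point estimate under the clock model T = αt + β; the sketch below ignores propagation delay and node mobility, which the paper's four-message scheme accounts for.

```python
# Toy illustration of the synchronization idea: from two (local, standard)
# timestamp pairs, estimate the skew alpha and offset beta of T = alpha*t
# + beta, then map local readings onto the standard timeline. Propagation
# delay and mobility are deliberately ignored in this sketch.
def estimate_clock(t0, T0, t1, T1):
    """Solve T = alpha*t + beta from two observation pairs."""
    alpha = (T1 - T0) / (t1 - t0)   # slope of the linear clock model
    beta = T0 - alpha * t0          # intercept
    return alpha, beta

def to_standard(t_local, alpha, beta):
    """Compensate a local reading onto the standard timeline."""
    return alpha * t_local + beta

# A clock running 40 ppm fast with no initial offset:
alpha, beta = estimate_clock(100.0, 100.0040, 200.0, 200.0080)
# alpha is approximately 1.00004, beta approximately 0
```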
Through the clock model, the timestamps are related as follows:
B0 = α(A0 + d0) + β (2)
B1 = α(A1 − d1) + β (3)
B2 = α(A2 + d2) + β (4)
B3 = α(A3 − d3) + β (5)
where A0, B0, B1, A1, A2, B2, B3 and A3 are the eight timestamps obtained from the four message exchanges between the father and child node, and di denotes the corresponding propagation delay. A0 can be obtained in the process of establishing the stratified node model, which saves one message delivery. To account for node mobility, the relationships among the propagation delays can be written as:
di = Li / vs (6)
Δdi = ΔLi / vs = (Li − Li−1) / vs (7)
d1 − d0 = (v / vs)[(A1 − d1) − (A0 + d0)] (8)
d2 − d1 = (v / vs)[(A2 + d2) − (A1 − d1)] (9)
d3 − d2 = (v / vs)[(A3 − d3) − (A2 + d2)] (10)
where Li is the distance between the nodes at the i-th message event, vs is the underwater propagation speed, and v is the node's velocity. Writing k = v/vs and solving Eqs. (8)-(10) for the delays gives:
d1 = [k(A1 − A0) + (1 − k)d0] / (1 + k) (11)
d2 = k(A2 − A0) / (1 − k) + d0 (12)
d3 = [k(A3 − A0) + (1 − k)d0] / (1 + k) (13)
α = (1/2)(B1 + B3 − B0 − B2) / (A1 + A3 − A0 − A2) (14)
β = (1/4)[B0 + B1 + B2 + B3 − (B1 + B3 − B0 − B2)(A0 + A1 + A2 + A3) / (A1 + A3 − A0 − A2)] (15)
and then the time deviation is compensated to achieve synchronization using the formula t = (B − β) / α. Furthermore, we found that there is no need to calculate k in this process, so no additional measurement of the specific value of v is required.
In addition, the network is prone to packet collisions, so this paper adds a collision-avoidance procedure to reduce the possibility of collisions when sending packets. After the child node receives a packet from its father node, it waits for a period of time whose length is determined by the size of the timestamp and the data transmission rate. If a node does not receive the packet from the corresponding node after waiting for this time, it re-sends the last packet until it receives a response packet.
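The wait-and-retransmit rule might be sketched as follows; the bit counts and the callable stubs for sending and receiving are illustrative assumptions (the simulation below uses a 5 s wait and at most 3 retransmissions).

```python
# Sketch of the collision-avoidance wait and retransmission rule described
# above. Sending and receiving are abstracted into callables so the control
# logic stays testable; the numeric parameters are illustrative.
def wait_time(timestamp_bits, data_rate_bps):
    """Waiting period derived from timestamp size and transmission rate."""
    return timestamp_bits / data_rate_bps

def exchange(send, recv_response, max_retries=3):
    """Re-send the last packet until a response arrives or retries run out.
    Returns the attempt number on success, or None on failure."""
    for attempt in range(1, max_retries + 1):
        send()                       # (re-)send the last packet
        if recv_response():          # response arrived within the window
            return attempt
    return None                      # gave up after max_retries attempts
```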
Tp / Tn+1 = 1 + ε (16)
Tn = a·Tp + (b / (n − 2)) Σ_{i=1}^{n−2} Ti (17)
As can be seen from the above formulas, Tn is a weighted average of the predicted period Tp and the historical periods, where a and b are weighting coefficients with a + b = 1. We also set an error threshold, and nodes do not synchronize if ε is less than the threshold. By the
In this section, we first give a detailed account of the simulation process, and then the
simulation results are analyzed from multiple aspects so as to evaluate the performance
of V-Sync.
We use the OPNET simulation platform with an underwater channel to run our scheme. The simulation is based on a mesh network topology with one common anchor node and 60 ordinary nodes. The errors encountered in message exchange and processing are modeled by a Gaussian distribution. All times are taken from MAC-layer timestamps. This paper compares V-Sync with an algorithm extended from Tri-Message, which we call E-Tri-Message. The other parameters used in our simulation are as follows:
• Clock initial skew is 40 ppm.
• Clock initial offset is 100 μs.
• Clock jitter is 15 μs.
• Propagation speed is 1500 m/s.
• Wait time is 5 s.
• Maximum retransmit count is 3.
• The maximum speed (Vmax) of a sensor is 5 m/s.
2.2. Analysis
First of all, we study the effect of node level on the two algorithms. As can be seen from Figures 2 and 3, the skew error and offset error of V-Sync are much smaller than those of E-Tri-Message; the raw data are presented in Tables 1 and 2. This is because Tri-Message does not consider the effect of node mobility on the propagation delay and supposes the packet propagation delay is equal every time, so its error increases greatly, while the error of V-Sync remains very small. The results also show that the larger the node level, the larger the error, because error accumulates as the number of hops increases.
[Figure 2. Skew error (ppm, log scale) versus node level for V-Sync and E-Tri-Message.]
[Figure 3. Offset error (μs, log scale) versus node level for V-Sync and E-Tri-Message.]
[Figure: synchronization error (μs, log scale) versus node velocity V (m/s).]
[Figure: error (s) versus time after synchronization (s) for V-Sync and E-Tri-Message.]
3. Conclusion
Acknowledgments
This work was sponsored by the National Natural Science Foundation of China
(61571365), the Seed Foundation of Innovation and Creation for Graduate Students in
Northwestern Polytechnical University (Z2016056).
References
[1] I. F. Akyildiz, D. Pompili, T. Melodia. Underwater acoustic sensor networks: research challenges. Ad
hoc networks 3 (2005): 257-279.
[2] J. H. Cui, J. Kong, M. Gerla, et al. The challenges of building mobile underwater wireless networks for
aquatic applications. IEEE Network 20 (2006): 12-18.
[3] J. Heidemann, W. Ye, J. Wills, et al. Research challenges and applications for underwater sensor
networking. IEEE Wireless Communications and Networking Conference. WCNC 2006. 1(2006).
[4] J. Partan, J. Kurose, B. N. Levine. A survey of practical issues in underwater networks. ACM
SIGMOBILE Mobile Computing and Communications Review 11 (2007): 23-33.
[5] F. Sivrikaya, B. Yener. Time synchronization in sensor networks: a survey. IEEE network 18(2004): 45-
50.
[6] J. Elson, L. Girod, D. Estrin. Fine-grained network time synchronization using reference broadcasts.
ACM SIGOPS Operating Systems Review 36 (2002): 147-163.
[7] S. Ganeriwal, R. Kumar, M. B. Srivastava. Network-wide time synchronization in sensor networks.
Center for Embedded Network Sensing (2003).
[8] S. Ganeriwal, R. Kumar, M. B. Srivastava. Timing-sync protocol for sensor networks. Proceedings of
the 1st international conference on Embedded networked sensor systems. ACM, 2003.
[9] M. Maróti, B. Kusy, G. Simon, et al. The flooding time synchronization protocol. Proceedings of the
2nd international conference on Embedded networked sensor systems. ACM, 2004.
[10] A. A. Syed, J. S. Heidemann. Time Synchronization for High Latency Acoustic Networks. INFOCOM.
2006.
[11] C. Tian, H. Jiang, X. Liu, et al. Tri-message: a lightweight time synchronization protocol for high
latency and resource-constrained networks. 2009 IEEE International Conference on Communications.
IEEE, 2009.
[12] N. Chirdchoo, W. S. Soh, K. C. Chua. MU-Sync: a time synchronization protocol for underwater
mobile networks. Proceedings of the third ACM international workshop on Underwater Networks.
ACM, 2008.
[13] J. Liu, Z. Zhou, Z. Peng, et al. Mobi-Sync: efficient time synchronization for mobile underwater sensor
networks. IEEE Transactions on Parallel and Distributed Systems 24 (2013): 406-416.
[14] F. Lu, D. Mirza, C. Schurgers. D-sync: Doppler-based time synchronization for mobile underwater
sensor networks. Proceedings of the Fifth ACM International Workshop on Under Water Networks.
ACM, 2010.
Fuzzy Systems and Data Mining II 429
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-429
Introduction
1 Corresponding Author: Ying-Hua HAN, Department of Computer and Communication Engineering, Northeastern University at Qinhuangdao, Qinhuangdao, Hebei, 066004, China; E-mail: yhhan723@126.com.
430 H. Liu and Y.-H. Han / An Electricity Load Forecasting Method
At present, researchers have developed many forecasting methods and models. Traditional short-term load forecasting methods include time series, regression forecasting and gray system theory. Modern, intelligent forecasting methods include the gray system theory method, the Artificial Neural Network (ANN) model, fuzzy inference prediction, genetic algorithms, the Support Vector Machine (SVM) model, the wavelet analysis method and so on [5-6]. Gray system theory has many advantages: it can be applied to nonlinear changes of the load, irrespective of distribution law and change tendency, and it is easy to operate. But it requires the load to have an exponential change tendency, and data with a greater degree of dispersion lead to worse prediction accuracy [7]. The advantages of artificial neural networks are adaptivity, self-learning, strong computing power, sophisticated mapping and a variety of intelligent processing capabilities. However, with an ANN forecasting method it is difficult to determine the network structure scientifically, learning is slow, and there are local minimum points [8]. The wavelet analysis method has higher prediction accuracy, but it cannot consider the impact of weather, temperature, humidity and many other factors, and the forecasting results depend greatly on the choice of wavelet basis [9]. SVMs can be used for short-term load forecasting with higher accuracy than conventional methods, taking full account of the various factors affecting the load; they converge relatively quickly and can find the global optimal solution [10]. In practice, it is difficult to determine the most ideal mathematical model among the above methods and to establish a dependable relational expression between the load and its influencing factors.
There are many uncertain factors that directly affect the accuracy of load forecasting in a smart grid power system. Practice shows a variety of external factors affecting load forecasting, such as date type, load level, weather conditions, seasonal factors, and the social-economic environment. Weather conditions include temperature, humidity, wind speed, rainfall and sunshine; the social-economic environment also includes several aspects, and the date type can likewise be divided into several categories. The forecasting input therefore has many data types, and the amount of data in a smart grid is very large, which affects the accuracy and running time of load forecasting.
In order to solve the above problems, by analyzing short-term load forecasting methods and association rule analysis algorithms, a new forecasting method is presented in this paper. The presented method, the Fuzzy Neural Network based on Association Rules (FNNAR) method, can increase the accuracy of the forecast values and also reduce the running time. In this model, correlations between external factors and changes of the electric load are found in massive data by using association rule analysis. Attribute reduction based on association rule analysis can be carried out on the influencing factors of load forecasting, which reduces the interference of noise data, eliminates redundant attributes, and improves the effectiveness of the load forecast. The association rules are mined by the proposed Association Rules Mining algorithm based on Quantitative Concept Lattice (ARMQCL algorithm).
In the presented FNNAR method, attribute reduction based on association rule analysis is first carried out on the input of the load forecasting. Then the load forecast is obtained by the FNN model using the reduced attributes as input.
Since some irrelevant or unimportant factors can interfere with load forecasting accuracy in the smart grid, attribute reduction based on association rules is necessary; it reduces the influence of noise data on the forecast accuracy. So a new Apriori algorithm fit for the smart grid should be proposed first.
Association rule analysis is used to reveal interesting links between the basket data
transaction items. Later, it is used to find hidden relationships between the data [11].
Consider database D which includes h transactions. Each transaction is composed of a
number of items. If all itemsets are expressed as I, I= {i1, i2, … ,in}, the association rule
can be expressed as the format of A o B (A I, B I, A B ). The properties of the
association rules can be described by the following two parameters.
Support: If s% of all transactions simultaneously contain the itemsets A and B,
then s% is called the support of A → B. Equivalently, the support is the ratio of the
number of transactions containing both itemsets to the total number of transactions.
Support represents the frequency of the rule and is written sup(A → B) = sup(A ∪ B).
The minimum support is denoted min_sup.
Confidence: The probability of seeing the consequent of the rule in transactions
that also contain the antecedent is called the confidence. If c% of the transactions
containing A also contain B, then c% is the confidence of A → B. Confidence represents
the strength of the rule and is written con(A → B) = sup(A ∪ B)/sup(A). The minimum
confidence is denoted min_con.
Association rule analysis is used to determine the rules in transaction database D
that satisfy the given thresholds min_sup and min_con. It is generally divided into the
following two steps [12]:
(1) Determine all frequent itemsets in the transaction database.
(2) Generate association rules from the frequent itemsets. That is, if B ⊂ A, B ≠ ∅,
and con(B → (A − B)) ≥ min_con, then the itemsets B and A − B constitute the
association rule B → (A − B).
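As a concrete reading of the two definitions, support and confidence can be computed directly over a small transaction database (a minimal Python sketch; the item names are illustrative, not taken from the paper):

```python
# Toy transaction database D; each transaction is a set of items.
# The item names are invented for illustration only.
D = [
    {"temp_high", "holiday", "load_up"},
    {"temp_high", "load_up"},
    {"temp_low", "workday"},
    {"temp_high", "workday", "load_up"},
]

def sup(itemset):
    """Support: fraction of transactions in D that contain the itemset."""
    return sum(itemset <= t for t in D) / len(D)

def con(antecedent, consequent):
    """Confidence of antecedent -> consequent: sup(A ∪ B) / sup(A)."""
    return sup(antecedent | consequent) / sup(antecedent)

print(sup({"temp_high", "load_up"}))    # 0.75: three of the four transactions
print(con({"temp_high"}, {"load_up"}))  # 1.0: every temp_high transaction also has load_up
```

A rule such as temp_high → load_up would then be reported only if both values clear the min_sup and min_con thresholds.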
Definition 1: Each node in the concept lattice (CL) is a tuple called a concept, such
as C = (A, B), where A is the extension of the concept and B is the intension,
respectively denoted Extension(C) and Intension(C).
Definition 2: For C = (A, B), C′ = (|A|, B) is the quantitative concept of C, where
|A| is the cardinality of A. The lattice constituted by quantitative concepts is defined
as the Quantitative Concept Lattice (QCL). The QCL quantifies the extension of the CL
and ignores the specific extension information of the CL.
1.1.1.1. Construct Quantitative Concept Lattice
Based on the above discussion, the algorithm that constructs the QCL can be proposed.
Insert(QCL, C): insert a quantitative concept C = (N, B) into the QCL.
(1) If some concept C1i in the QCL satisfies Extent(C1i) = Extent(C), the intension
of C is merged into the intension of C1i, and the insertion is finished; that is,
Intent(C1i) = Intent(C1i) ∪ Intent(C).
(2) Otherwise, on the assumption that C0 is a direct sup-concept of C, C is first
inserted as a sub-concept of C0 into the QCL, and the following operations are performed:
Make C the direct sup-concept of each C1i (C1i < C0, C1i < C), and remove the
link between C1i and C0.
Insert into the QCL each Ck = (Nk, Bk) generated by joining C with the existing
concepts of the QCL.
The generation algorithm of the QCL, Create_In_Attr(QCL), is described as follows:
Initialization: generate the QCL with the complete concept (O, {}) and the null
concept ({}, all).
For i = 1 to n do: for Ci ∈ C_ai do Insert(QCL, Ci).
Enqueue(QCL, C0);
WHILE NOT Empty(QCL) DO
BEGIN
  L_node := Outqueue(QCL);
  C_node := First-direct-sub-concept(L, L_node);
  WHILE ((C_node ≠ null) AND (C_node ≠ empty-concept node)) DO
  BEGIN
    IF NOT (C_node in QCL) THEN
    BEGIN
      IF |Extent(C_node)| / |Extent(Ci)| ≥ Confidence_threshold
        THEN output association rules;
      Enqueue(QCL, C_node);
    END;
    C_node := Next-direct-sub-concept(L, L_node);
  END;
END.
In the new FNNAR model, the first step is attribute reduction based on the
association rules detected by the Association Rules Mining algorithm based on
Quantitative Concept Lattice (ARMQCL algorithm). Then, the FNN model uses the
reduced attributes as input to forecast the electric load in the smart grid.
The Fuzzy Neural Network (FNN) model is used to obtain the load forecast, which
can improve the accuracy of load forecasting in the smart grid.
An FNN embeds fuzzy input signals and fuzzy weight values into a conventional
neural network (such as a feed-forward neural network, a Hopfield neural network, etc.).
A fuzzy BP model is used in this paper. The specific implementation process is
described in detail in the literature [14]; this paper only describes it briefly:
(1) Identify the input and output factors of the neural network;
(2) Fuzzify and normalize the relevant data;
(3) Train the network until obtaining a stable output;
(4) Establish the mathematical model;
(5) Forecast load at the target time.
2. Simulation Analysis
Hourly (24-point) historical load data from February to July of Hebei Province are
used to simulate the load forecasting.
Figure 1. Comparison of the actual load (load/MW over 24 hours) with the FNNAR and BP predictive values.
Figure 2. Relative error (%) of the forecast over 24 hours, with and without the attribute reduction.
Figure 2 shows the comparison of the relative error with and without the attribute
reduction based on the detected association rules. Attribute reduction excludes the
influence of some irrelevant factors and reduces the interference of noise data. As can
be seen, when the forecasting attributes are reduced, the forecast accuracy is enhanced.
Table 1. The running time of the models

Date   Without attribute reduction   With attribute reduction
d      9.384                         1.692
d+1    8.684                         1.569
d+2    8.326                         1.487
d+3    8.544                         1.501
d+4    8.366                         1.493
d+5    8.621                         1.546
d+6    8.457                         1.472
Attribute reduction based on association rule analysis significantly reduces the
size of the input set of the FNN model, and therefore its running time. Table 1 shows
that the running time of the model using attribute reduction is far shorter than that
of the other one.
3. Conclusions
In this paper, a new load forecasting method adapted to the smart grid, called the
Fuzzy Neural Network based on Association Rule mining (FNNAR), is proposed. The
FNNAR model first carries out attribute reduction based on association rule analysis
in order to exclude the influence of irrelevant or unimportant factors on load
forecasting, reduce the interference of noise data, and significantly reduce the size
of the input set of the FNN model. The Association Rules Mining algorithm based on
Quantitative Concept Lattice (ARMQCL algorithm) is proposed to extract the association
rules. Experimental results indicate that the proposed method offers better forecast
precision, with smaller forecast errors, and also needs less running time. The
experimental results prove that the FNNAR model is efficient and feasible, and can be
applied to the real conditions of an electricity market in the smart grid.
Acknowledgment
This work is partially supported by the National Natural Science Foundation of China
under Grant No.61104005 and 61374097, by Natural Science Foundation of Liaoning
Province under Grant No.201202073, and by the Central University Fundamental
Research Foundation under Grant No. N142304004.
References
[1] J. W. Cao, et al., Information system architecture for smart grid, Chinese Journal of Computers, 1(2013),
143-167.
[2] X. Fang, S. Misra, G. L. Xue, and D. J. Yang, Smart Grid-The New and Improved Power Grid: A Survey,
IEEE Communications Surveys & Tutorials, 4(2012), 944-980.
[3] Y. Ye, Y. Qian, H. Sharif, and D. Tipper, A Survey on Smart Grid Communication Infrastructures:
Motivations, Requirements and Challenges, IEEE Communications Surveys & Tutorials, 1(2013), 5-20.
[4] S. Y. Chen, S. F. Song, L. X. Li, and J. Shen, Survey on Smart Grid Technology, Power System
Technology, 8(2009), 1-7.
[5] H. Y. Zhao, L. C. Cai, and X. J. Li, Review of Apriori algorithm on Association Rules Mining, Journal
of Sichuan University &Engineering, 1(2011), 66-70.
[6] N. H. Liao, Z. H. Hu, Y. Y. Ma, and W. Y. Lu, Review of the short-term load forecasting methods of
electric power system, Power System Protection and Control, 1(2011), 147-152.
[7] P. R. Ji, J. Chen, and W. C. Zheng, Theory of grey systems and its application in electric load forecasting,
Proc. of Cybernetics and Intelligent Systems, 2008 IEEE Conference on IEEE, Chengdu, China, 2008,
1374-1378.
[8] X. H. Du, T. Feng, and S. Tan, Study of Power System Short-term Load Forecast Based on Artificial
Neural Network and Genetic Algorithm, Proc. of International Conference on Computational Aspects of
Social Networks, CASoN 2010, Taiyuan, China, 2010, 725-728.
[9] D. H. Zhang, and S. F. Jiang, Power Load Forecasting Algorithm Based on Wavelet Packet Analysis,
Proc. of Electric Power System & Automation, 2(2004), 987-990.
[10] M. G. Zhang and L. R. Li, Short-term load combined forecasting method based on BPNN and LS-SVM,
Power Engineering and Automation Conference (PEAM), 2011 IEEE, 1(2011), 319-322.
[11] S. Liu, Y. J. Yang, D. X. Chang, and W. Qiu, Improved fuzzy association rule and its mining algorithm,
Computer Engineering and Design, 4(2015), 942-946.
[12] S. Mallik, A. Mukhopadhyay, and Ujjwal Maulik, RANWAR: Rank-Based Weighted Association Rule
Mining from Gene Expression and Methylation Data, IEEE Transactions on Nanobioscience, 1(2015),
59-66.
[13] D. X. Wang, X. G. Hu, H. Wang, Algorithm of mining association rules based on Quantitative Concept
Lattice, Journal of Hefei University of Technology (Natural Science), 5(2002), 678-682.
[14] H. Y. Yu, F. L. Zhang, Short-Term Load Forecasting Based on Fuzzy Neural Network, Power System
Technology, 3(2007), 68-72.
438 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-438
Introduction
The projection pursuit (PP) algorithm is a statistical method for processing non-normal
data [1]. Its idea is to project high-dimensional data onto a low-dimensional subspace
according to the needs of the practical problem, to measure how well a projection
reveals some structure of the data by using a projection index function, and then to
analyze the structural features of the high-dimensional data based on the projected
values [2-3]. However, the computational cost of PP is very high. The authors therefore
propose a new differential evolution particle swarm optimization projection pursuit
algorithm (DEPSOPP) by combining VWPSO and DE, and use this algorithm to optimize the
projection direction of PP. Experimental results show that DEPSOPP performs
satisfactorily in terms of optimization accuracy, convergence, and robustness.
1
Corresponding Author: Bin ZHU, School of Electronic Information Engineering, Yangtze Normal
University, Chongqing, 408100, China; Email: zb8132002@163.com.
B. Zhu and W.-D. Jin / The Improved Projection Pursuit Evaluation Model 439
In Eq. (1), v represents the flight speed of the particle, pBest is the individual
(local) optimum, and gBest is the global optimum. The inertia weight is w, and c1 and
c2 are the acceleration coefficients. rand1d and rand2d are two random numbers on the
interval [0, 1] in Eq. (1). In each iteration, the particle is updated by the individual
extremum pBest and the global extremum gBest.
Since PSO is prone to premature convergence and tends to oscillate in the vicinity
of the global optimal solution in late iterations [6], the authors propose the VWPSO,
in which the weight changes according to Eq. (3). The initial inertia weight is the
largest, which is conducive to a global search; the late inertia weight is smaller,
which is conducive to an effective search in the vicinity of the current local extreme
points. Thus, both the global and the local search abilities of PSO are enhanced.
w = wmax − (wmax − wmin) · (t − 1) / (tmax − 1),  t ≥ 1 (3)
where wmax denotes the maximum weight, wmin is the minimum weight, tmax is the
maximum number of iterations, and t is the current iteration number.
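The linearly decreasing schedule of Eq. (3) is easy to check numerically (a minimal Python sketch; the wmax = 0.9 and wmin = 0.4 values are common PSO defaults, not taken from the paper):

```python
def inertia_weight(t, t_max, w_max=0.9, w_min=0.4):
    """Eq. (3): w = w_max - (w_max - w_min) * (t - 1) / (t_max - 1)."""
    return w_max - (w_max - w_min) * (t - 1) / (t_max - 1)

# The weight starts at w_max and decreases linearly to w_min at the last iteration.
print(inertia_weight(1, 100))    # 0.9
print(inertia_weight(100, 100))  # 0.4 (up to floating-point rounding)
```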
Although PSO has many advantages, it also has problems, such as easily falling into
local extrema, low search accuracy, and slow convergence in the late evolutionary stage.
Therefore, in order to improve the optimization ability and robustness of PSO, the
authors introduce DE, hoping to improve the particle swarm algorithm through the
superiority of DE in maintaining population diversity and search ability.
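The DE operators grafted onto PSO can be sketched with the classic DE/rand/1/bin mutation and binomial crossover (a generic sketch of standard DE, not the paper's exact DEPSO hybrid; the F, CR, and bound values are illustrative):

```python
import random

random.seed(1)

def de_trial(pop, i, F=0.5, CR=0.9, lo=-5.0, hi=5.0):
    """Build a DE/rand/1/bin trial vector for individual i of the population."""
    n = len(pop[i])
    r1, r2, r3 = random.sample([k for k in range(len(pop)) if k != i], 3)
    jrand = random.randrange(n)          # ensure at least one mutated component
    trial = []
    for j in range(n):
        if j == jrand or random.random() < CR:
            v = pop[r1][j] + F * (pop[r2][j] - pop[r3][j])  # differential mutation
        else:
            v = pop[i][j]                                   # crossover keeps parent gene
        trial.append(min(hi, max(lo, v)))                   # clamp to the search bounds
    return trial

pop = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(6)]
print(de_trial(pop, 0))
```

In a DE step the trial vector replaces the parent only if it has a better fitness, which is how population diversity is maintained.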
Consider the minimum optimization problem of a function f(x). Let lk and uk denote
the lower and upper search bounds of the k-th dimensional variable, and let n be the
variable dimension. Assume that xi(t) = [xi1(t), xi2(t), …, xin(t)] is the position of
particle i at iteration t.
x(i, j) = (x*(i, j) − xmin(j)) / (xmax(j) − xmin(j)),  when x*(i, j) is a benefit index. (5)

x(i, j) = (xmax(j) − x*(i, j)) / (xmax(j) − xmin(j)),  when x*(i, j) is a cost index. (6)

z(i) = Σ_{j=1}^{np} a(j) · x(i, j),  i = 1, 2, …, n (7)

f(a) = s_z · r_zy (8)

a = arg max_a {s_z · r_zy},  s.t. Σ_{j=1}^{np} a(j)² = 1,  0 ≤ a(j) ≤ 1 (9)
Based on the above assumptions, the problem of weight solving has been
transformed into the extremum problem of the objective function. In this case, the
weight corresponds to the projection direction vector a of the PP algorithm.
Fortunately, the hybrid DEPSO algorithm is good at seeking the extremum of the
objective function. After solving the extremum of the objective function presented in
Eq. (9), we can get the projection direction a = (a(1), a(2), …, a(np)), and then the
weight of each evaluation index.
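Evaluating the objective for a candidate direction a follows directly from Eqs. (5)-(8) (a minimal Python sketch with synthetic data; reading s_z as the standard deviation of the projected values and r_zy as their correlation with the evaluation target follows the usual PP formulation and is an assumption here):

```python
import math

def normalize(col, benefit=True):
    """Min-max normalization of one index column, Eqs. (5)-(6)."""
    lo, hi = min(col), max(col)
    if benefit:
        return [(v - lo) / (hi - lo) for v in col]
    return [(hi - v) / (hi - lo) for v in col]

def projection_index(X, y, a):
    """f(a) = s_z * r_zy for projections z(i) = sum_j a(j) x(i, j), Eqs. (7)-(8)."""
    z = [sum(aj * xj for aj, xj in zip(a, row)) for row in X]
    n = len(z)
    mz, my = sum(z) / n, sum(y) / n
    s_z = math.sqrt(sum((v - mz) ** 2 for v in z) / n)
    s_y = math.sqrt(sum((v - my) ** 2 for v in y) / n)
    cov = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y)) / n
    r_zy = cov / (s_z * s_y)
    return s_z * r_zy

# Two toy indices (one benefit, one cost) for four samples, with a toy target y.
X = list(zip(normalize([1.0, 2.0, 3.0, 4.0]),
             normalize([8.0, 6.0, 4.0, 2.0], benefit=False)))
y = [0.1, 0.4, 0.6, 0.9]
a = [0.6, 0.8]                     # candidate direction satisfying sum a(j)^2 = 1
print(projection_index(X, y, a))
```

DEPSO's role in Eq. (9) is then to search over directions a (subject to the unit-norm constraint) maximizing this index.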
The performance of an optimization algorithm should be tested before it is applied.
Therefore, three typical test functions were selected for the test analysis: the
Rosenbrock, Rastrigin, and Griewank functions.
The Rosenbrock function is a unimodal function. Seeking its extremum is slow,
which makes it hard for an optimization algorithm to reach the global optimum; it can
therefore be used to investigate the local optimization ability of an algorithm
effectively. The Rastrigin and Griewank functions are multimodal; the optimization of
such multi-extremum functions easily falls into local extrema, so they can effectively
test the global optimization capability of DEPSO. Based on the above analysis, we
carried out the following experiment.
Table 1. Results of all algorithms on the benchmark functions

Benchmark  Fitness  SPSO        VWPSO       DE          DEPSO
f1         best     8.7682E+00  3.0159E-02  1.0109E-02  3.8032E-05
           worst    3.7592E+01  9.4936E+00  8.7380E+00  1.4919E-01
           mean     1.8756E+01  3.6217E+00  3.5090E+00  2.5069E-02
           Std.     7.6947E+00  2.3592E+00  2.6193E+00  3.5839E-02
f2         best     2.9028E+00  9.9772E-01  3.1064E+00  0.0000E+00
           worst    2.0098E+01  2.3879E+01  1.0203E+00  3.9798E+00
           mean     1.0004E+01  1.1741E+01  6.6814E+00  1.3012E+00
           Std.     3.8116E+00  4.6106E+00  1.8781E+00  1.3031E+00
f3         best     6.1226E-02  4.6796E-02  6.0007E-03  0.0000E+00
           worst    5.3671E-01  3.3443E-01  7.2649E-02  4.4947E-02
           mean     2.4422E-01  1.5402E-01  2.7571E-02  5.6192E-03
           Std.     1.2565E-01  7.3302E-02  1.6244E-02  1.0732E-02
3. Conclusions
A new feature evaluation model (the DEPSOPP model) is presented in this paper. The
authors optimize the projection vectors of the PP algorithm by combining the VWPSO and
DE algorithms. The simulation results show that the algorithm has good convergence,
robustness, and accuracy. The aim of this paper is to solve the evaluation problem of
radar emitter signal features, for which there is no real-time requirement; therefore,
the complexity of the algorithm is given less consideration.
Acknowledgments
This work was supported by the state key program of national natural science of China
(Grant No. 61134002), the natural science foundation of Chongqing City (Grant No.
CSTC2013JCYJA70010) and the science research project of Chongqing Education
Commission (Grant No. KJ1401224).
References
[1] M. Yan, Y. X. Zhao, L. G. Wu, et al. Navigability analysis of magnetic map with projecting pursuit-
based selection method by using firefly algorithm, Neurocomputing 159(2015), 288-297.
[2] H. L. Zhang, C. Wang, W. H. Fan, A projection pursuit dynamic cluster model based on a memetic
algorithm, Tsinghua Science and Technology 20(2015), 661-671.
[3] Y. Su, S. G. Shan, X. L. Chen, et al. Classifiability-based discriminatory projection pursuit, IEEE
Transactions on Neural Networks 22(2011), 2050-2061.
[4] I. Musa, S. Gadoue, B. Zahawi, Integration of Induction Generator Based Distributed Generation in
Power Distribution Networks Using a Discrete Particle Swarm Optimization Algorithm, Electric Power
Components and Systems 44(2016), 268-277.
[5] M. J. Mahmoodabadi, M. B. S. Mottaghi, A. Mahmodinejad, Optimum design of fuzzy controllers for
nonlinear systems using multi-objective particle swarm optimization, Journal of Vibration and Control
22(2016), 769-783.
[6] S. Mirhadi, N. Komjani, M. Soleimani, Ultra-wideband antenna design using discrete Green's functions
in conjunction with binary particle swarm optimization, IET Microwaves, Antennas and Propagation
10(2016), 184-192.
444 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-444
Abstract. When attempting to recognize mental stress using heart rate variability
(HRV), single classification models tend to detect the different stress levels with
lower accuracy and are prone to overfitting, which degrades the accuracy of stress
recognition. This study employed the ensemble
learning method of random forests (RF) and proposed a method to recognize stress
by using HRV. By analyzing the short-term (120-180 sec) electrocardiography
(ECG) data of the subjects during a stress-inductive video game, we extracted their
HRV readings using a time-domain method, frequency-domain method, and
non-linear method. Next we constructed a stress recognition model based on the
RF technique, which was able to identify low, medium, and high level of stress.
Then the model was applied to 200 groups of stress level data collected from the
10 subjects. The results showed that, compared to traditional k-nearest neighbor
(KNN) and logistic regression (LR) methods, the RF model could be used to
automatically detect and identify stress of different levels with a higher level of
accuracy, and with 90% accuracy in recognizing higher levels of stress.
Introduction
1
Corresponding Author: Gang ZHENG, Professor, School of Computer and Communication Engineering,
Tianjin University of Technology, Room 317, #7 Building, Tianjin, China; E-mail:
kenneth_zheng@vip.163.com
G. Zheng et al. / HRVBased Stress Recognizing by Random Forest 445
subjects to extract the HRV features for the time and frequency domains for further
analysis, and proved that HRV could be used to predict mental stress by detecting
changes in the autonomous nervous system (ANS). Construction of stress recognition
models refers to using computer analysis to establish a HRV-stress relationship model
to analyze and detect stress. A well-developed model will usually have higher accuracy
in detecting stress, when compared to the observation of statistical analysis. Currently,
analysis methods that utilize HRV in detecting stress include: K-nearest neighbors
(KNN), probabilistic neural networks (PNN), linear discriminant analysis (LDA), and
fuzzy clustering. Karthikeyan et al. [8] adopted the Stroop color word test, applied PNN
and KNN classifiers to classify features of short-term (32 s) HRV and ECG readings,
and achieved a 91% accuracy rate in recognizing the two statuses: stressed and normal.
However, most HRV-based stress recognition methods apply a traditional, single
classification model to detect stress, which is likely to cause overfitting and affect
the accuracy of stress recognition.
This article introduced an RF technique into the ECG-based stress analysis,
combining the feature set of HRV extracted from time-domain, frequency-domain, and
non-linear methods to construct a model of mental stress recognition, to identify
different levels of stress.
Traditional laboratory stimuli that are used to induce stress include color word-based
tests, mental arithmetic tasks, picture tests, video tests, and video game tasks [12-14].
Given that induction of mental stress is likely to be affected by the differences in
factors such as personal experience and psychological quality, it is hard to determine
the label of stress levels in most traditional stress induction experiments. In order to
scientifically determine the labels, the study employed a video game task, as it has a
higher stimulating effect on the senses. Multiple difficulty levels were set to
induce mental stress. At the same time, the facial expression of the subjects and the
game parameters were recorded to assist the assessment of stress labels. In addition to
using participants’ subjective answers to a questionnaire to evaluate their stress level,
their facial expressions, and the game parameters were also recorded. Human
observation on the participants’ facial expression was applied in the experiment to
assess their level of commitment. If the participant looked absent-minded, then the data
would be removed from the data set to avoid label confusion. Game parameters refer to
the information that could reveal the subjects’ stress level. One such example was the
number of mistakes that was made by a subject, as making an excessive amount of
mistakes tends to cause greater mental stress. Therefore, we set difficulty levels for the
game, subsequently introducing facial expressions and game parameters as objective
indicators to further verify the results of subjects’ self-reported stress levels in the
questionnaire, so as to achieve more accurate labels for stress levels.
In order to recognize mental stress from HRV signals, features that reflect HRV
information need to be defined and acquired from the original ECG readings.
Therefore, preprocessing of the ECG data is needed for this study, as well as extracting
feature parameters of the HRV signals.
The ECG signal of the human body is a weak signal with a low signal-to-noise ratio
(SNR). A normal ECG signal's frequency ranges from 0.05 to 100 Hz, whilst 90% of the
spectral energy of the signal is distributed between 0.25 and 35 Hz [15]. The acquisition
of ECG signals is subject to interference from various sources of noise. Therefore, it
is necessary to remove the noise generated during the ECG acquisition process in order
to effectively detect and locate the R-wave peaks in the ECG signals and to obtain
accurate HRV signals. Hence, we preprocessed the original ECG signal for our study.
Figure 1 shows processing steps of the original ECG signal: First, an ECG was used to
obtain the original ECG readings. Then, a wavelet thresholding technique [16] was
adopted to remove noise from the original signals. The coif4 wavelet function was used
and an initial threshold was set to remove the noise and baseline drift from the original
ECG signal. Next, we applied a window thresholding method to detect the R-wave
peaks in the denoised ECG waveform. Lastly, the RR intervals were derived from the
R-wave peak positions, the ectopic beats were removed, and the time series of the RR
intervals, i.e. the HRV signal, was obtained.
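The last two steps, R-peak detection and RR-interval extraction, can be sketched on a synthetic signal (a toy Python sketch: a simple amplitude threshold stands in for the paper's wavelet denoising and window thresholding, and the sampling rate is an illustrative assumption):

```python
import math

FS = 250  # assumed sampling rate in Hz

# Synthetic "ECG": a slow low-amplitude baseline with sharp R-like spikes
# every 0.8 s, i.e. a steady 75 bpm rhythm.
sig = [0.05 * math.sin(2 * math.pi * i / FS) for i in range(5 * FS)]
for k in range(1, 6):
    peak = int(k * 0.8 * FS)
    if peak < len(sig):
        sig[peak] += 1.0

def r_peaks(signal, thresh=0.5):
    """Indices of local maxima above an amplitude threshold."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i] > thresh and signal[i] >= signal[i - 1] and signal[i] > signal[i + 1]]

peaks = r_peaks(sig)
rr = [(b - a) / FS for a, b in zip(peaks, peaks[1:])]  # RR intervals (s) -> the HRV series
print(peaks)  # [200, 400, 600, 800, 1000]
print(rr)     # four intervals of 0.8 s each
```

Time-, frequency-, and non-linear-domain features are then computed from this RR time series.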
In our study, the training set represented by the features of HRV and the stress label is
expressed as
T = {(x1, y1), (x2, y2), …, (xN, yN)} (1)
wherein xi ∈ χ ⊆ R^n is the feature vector of HRV and yi ∈ γ = {c1, c2, …, cm} is
the classification of the HRV information: if m = 2, then c1 = 0 indicates a state of
relaxation and c2 = 1 represents a state of stress (low, medium, or high). We employed
the RFC model to establish the relationship between HRV and the states of mental stress,
so that various states of stress can be recognized from HRV data, i.e., an HRV feature
vector x can be mapped to its classification y. The following steps were adopted to
construct the stress recognition model based on RF:
1) From the given HRV training set T, a training sample T* was acquired through N
sub-samplings with replacement.
2) For each acquired sample T*, one classification tree model h(x, Θi) was
established. During the model construction process, supposing the sample training set
has M HRV features, Mtry (Mtry < M) features are randomly selected from the M features
at each node of each tree, and one of the Mtry features is selected to split on,
according to the principle of minimum node impurity. This process was repeated for each
node of the classification tree until the HRV vector of every sample could be accurately
classified, or all the features were used, at which point the growth of the
classification tree ceased.
3) Steps 1) and 2) were repeated to establish k classification trees. No pruning is
performed after the trees are established.
4) Each classification tree model predicts a class for the input HRV feature
vector, yielding k results.
5) The final classification result y is decided by voting over the predictions of
the k classification trees:
y = arg max_{c_j} Σ_{i=1}^{k} I(h_i(x) = c_j),  j = 1, 2, …, m (2)
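The vote of Eq. (2) amounts to a majority count over the k tree outputs (a minimal Python sketch; the class labels and tree predictions are invented for illustration):

```python
from collections import Counter

def rf_vote(tree_predictions):
    """Eq. (2): return the class c_j that the most trees voted for."""
    return Counter(tree_predictions).most_common(1)[0][0]

# k = 7 hypothetical classification trees voting on one HRV feature vector.
print(rf_vote(["stress", "relaxed", "stress", "stress",
               "relaxed", "stress", "stress"]))  # stress
```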
In order to investigate the performance of HRV stress recognition with the RFC, we
conducted a comparative analysis between our model and the KNN and LR models, which
were used in the most relevant studies. The three models were employed to identify
the state pairs relaxed/low-stress, relaxed/medium-stress, and relaxed/high-stress
separately. For every instance of recognition, a 5-fold cross-validation method was
adopted for parameter adjustment. The data were randomly divided into five portions,
of which four were used as training sets and one was reserved as a testing set. The
receiver operating characteristic (ROC) [21] curve and the average accuracy rate over
twenty 5-fold cross-validation runs were used as the performance evaluation indices
for the model.
Figure 2 shows the ROC curves and the area under the curve (AUC) of the three
models, KNN, LR, and RFC, when the recognition target is set as relaxed and low-stress,
relaxed and medium-stress, and relaxed and high-stress respectively. The diagonal
dotted line shows the result of a random guess, for which the AUC is 0.5.
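The AUC summarizing each ROC curve can be computed from scores and labels via the rank (Mann-Whitney) formulation (a generic sketch, not the paper's evaluation code; the scores below are invented):

```python
def auc(scores, labels):
    """AUC = probability that a random positive scores higher than a random
    negative; tied scores count one half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# A classifier that ranks every stressed sample above every relaxed one has AUC 1.0;
# random scoring would hover around 0.5.
print(auc([0.9, 0.8, 0.7, 0.3, 0.2], [1, 1, 1, 0, 0]))  # 1.0
```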
The closer the ROC curve is to the upper left corner, the larger the AUC, and the
better the performance of the model. Whereas the closer the ROC curve is to the
diagonal dotted line, the closer AUC is to 0.5, and the closer the performance of the
model is to a random result. According to Figure 2, when recognizing the states of
relaxed and low-stress, the AUC of KNN, LR, and RFC model were 0.62, 0.75, and
0.78 respectively. The performance of the LR and RFC was close, whilst the
performance of the KNN was relatively poorer than the other two models. The overall
performance of the three models in stress recognition was not particularly accurate.
When recognizing the states of relaxation and medium-stress, the AUC of the KNN,
LR, and RFC models were 0.72, 0.90, and 0.95 respectively. The RFC model achieved
a good recognition performance, followed by the LR model, whilst the recognition
performance of the KNN was still comparatively poor. When recognizing the states of
relaxation and high-stress, the AUC of the KNN, LR, and RFC models were 0.88, 0.96,
and 0.96 respectively. In this situation, the recognition performances of all three
models were acceptable. Thus, as the stress level increased, the recognition
performance of the three models improved accordingly, and the RFC and LR had better
recognition performance than the KNN.
Table 1. Classification results of the three classifiers for the relaxed state and different levels of stress

State                   Classifier  Accuracy rate (%)  Key parameters
Relaxed/Low Stress      KNN         60.38              K=7
                        LR          75.43              penalty='l2', tol=0.0001
                        RFC         73.35              Tree_nums=1000, Mtry=4
Relaxed/Medium Stress   KNN         64.46              K=7
                        LR          89.94              penalty='l2', tol=0.0001
                        RFC         91.03              Tree_nums=1000, Mtry=4
Relaxed/High Stress     KNN         84.51              K=7
                        LR          93.45              penalty='l2', tol=0.0001
                        RFC         93.88              Tree_nums=1000, Mtry=4
The model parameters were adjusted with 5-fold cross-validation. Table 1 shows
the optimal parameters and recognition accuracy of the three models when recognizing
relaxed and low-stress, relaxed and medium-stress, and relaxed and high-stress.
In the above table, an l2 penalty was used as the regularization constraint, and
"tol=0.0001" is the stopping tolerance of training. "Tree_nums" of the RFC is the
number of element classifiers in the RF, and "Mtry" is the number of randomly selected
features considered at each split.
Figure 2. The ROC curves of the KNN, LR, and RFC models when recognizing relaxed and
low stress, relaxed and medium stress, and relaxed and high stress.
4. Conclusions
This study provided an automatic stress recognition method based on the RF technique
and HRV signals, which can help individuals monitor and recognize mental stress. A game
task was applied to induce a state of relaxation and low-level, medium-level, or
high-level mental stress; the HRV of the ECG signals was obtained, and features of the
HRV were extracted. The KNN, LR, and RFC models were then used to recognize the relaxed
state and the three stress levels, and their recognition accuracies were obtained. The
computational complexity of KNN and LR is O(n) and that of RFC is O(mn log n), where n
is the number of samples and m is the number of features. KNN's complexity grows only
with n, so its computational cost is rather low and it is easy to apply. The complexity
of RFC depends on both n and m, which means its computation time is longer than KNN's,
but it is still acceptable. The results showed
that HRV can be used to recognize changes in the state of mental stress, and is
especially sensitive when the stress levels are high. In addition, a comparative analysis
of the three classifiers revealed that when using HRV to recognize mental stress, the
RFC model appeared to have better overall recognition performance, and its
recognition accuracy was greater than 90%, when the stress levels were high.
452 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-452
Keywords. Ricci flow, hyperbolic Ricci flow, optimization, wireless sensor network
Introduction
In WSNs, routing is an inherent challenge that strongly impacts the quality of network service [1], such as timely information inquiry; that is, building an available route in time is one of the crucial problems. The greedy strategy suits the features of WSNs [2]. Accordingly, virtual coordinate systems [3] have been proposed for greedy routing schemes in WSNs.
In practice, communication links in WSNs are volatile due to the inherent characteristics of sensors. In this case, nodes can be mapped to virtual coordinates to build routes, since the geometric topology of the network may be more stable. In this paper, we employ Ricci flow to achieve this conformal map and obtain "virtual nodes". Furthermore, a simple greedy forwarding scheme can then discover routes easily in the virtual coordinate system. However, in practice greedy routing may fail to find a route due to forwarding voids: a well-known problem with geographical forwarding is that packets may get stuck at nodes with no neighbor closer to the destination, since the geometric topology of the network may include holes. To address this problem, the applicable domain of Ricci flow is generalized from Euclidean space to hyperbolic space.
The essential merit of this method is that it can generate a route quickly. Unfortunately, the built route may be unavailable; in detail, the source cannot reach the destination through the route [4-5].
¹ Corresponding Author: Hao YANG, School of Information Engineering, Yancheng Teachers University, Yancheng, Jiangsu, China; School of Software and TNList, Tsinghua University, Beijing, China; E-mail: classforyc@163.com
K.-M. Tang et al. / Ricci Flow for Optimization Routing in WSN 453
1. Preliminary
Hamilton [6] first introduced the theory of Ricci flow on smooth surfaces for Riemannian manifolds. Ricci flow deforms a Riemannian metric according to given curvatures; that is, a distorted or uneven geometric object can be morphed so that all curvatures become the same. [8] further extended the traditional circle packing metric and proposed the generalized discrete Ricci flow. Through this extension, discrete Ricci flow enables the effective construction of geometric routing [9] in wireless sensor networks. In particular, [7] adopted this technique in WLANs, validating that it can also be applied to wireless sensor networks. However, the corresponding computational expense becomes unacceptable as the scale of the network grows. In other words, optimized solutions should be considered, which is the focus of this paper.
2. Optimized Methodology
of classes, the suitable vertex set is V = ∪V_i, where i ∈ {1, …, |C|} and |C| is the number of classes. Obviously, this scheme does not consider the function of internal nodes. In this case, the built route may not be the global optimum but only an approximate solution, since some nodes inside clusters may be needed. Therefore, we propose another improved scheme with more comprehensive considerations.
Scheme two appends the neighbor nodes of the cluster boundaries; the built route is therefore probably better than with the former scheme, although the energy consumption is slightly greater. It constructs the neighbor set Nei of C_i: v ∈ Nei if v ∈ C_i and v is a neighbor of some v_b ∈ B_i. Accordingly, it constructs the vertex set V_i: for each v ∈ C_i that is a neighbor of v_b (v_b ∈ B_i), if v is also a neighbor of some v_nei ∈ Nei, then add v_nei instead of v_b. In this case, the suitable vertex set is obtained. This scheme considers the nodes that are likely to be selected in the process of building routes.
According to our optimization schemes, the number of nodes that need to be transformed to virtual coordinates is reduced, and the energy cost of building routes in WSNs decreases greatly.
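As a rough illustration, the scheme-two candidate-set construction described above can be sketched as follows. The set-based data structures (adjacency map, cluster set C_i, boundary set B_i) and the function name are assumptions for illustration, not the authors' implementation.

```python
def scheme_two_candidates(cluster, boundary, adjacency):
    """Candidate vertex set for one cluster C_i under scheme two.

    cluster   -- set of nodes in C_i
    boundary  -- set of boundary nodes B_i (subset of cluster)
    adjacency -- dict mapping each node to the set of its neighbors
    """
    # Nei: in-cluster neighbors of the boundary nodes.
    nei = set()
    for vb in boundary:
        nei |= adjacency[vb] & cluster
    nei -= boundary
    # Scheme two appends these neighbors to the boundary set, so the
    # built route can also use nodes just inside the cluster.
    return boundary | nei
```

For example, on a four-node chain 1-2-3-4 with boundary nodes {1, 4}, the candidate set grows to {1, 2, 3, 4}, reflecting the larger (but probably better) search set of scheme two.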
The detail of the optimized hyperbolic Ricci flow is as follows:

Algorithm: Optimized hyperbolic Ricci flow
Input: triangular mesh M of genus 0 with holes.
Output: virtual coordinates for the sensors, with all boundaries circular.
1. Select a candidate set by either of the above schemes.
2. Calculate the length ℓ of the longest boundary.
3. If ℓ < the communication radius of the corresponding sensors, then
4. For each vertex v_i, label v_i as un-accessed. Take the first face f_ijk and embed it onto the plane:
   R_i = (0, 0), R_j = (l_{ij}, 0), R_k = (l_{ik} \cos\theta_i^{jk}, l_{ik} \sin\theta_i^{jk}),
   then label v_i, v_j and v_k as accessed.
5. For each un-accessed vertex v_k, check all its neighboring faces f_ijk. Once v_k's neighboring vertices v_i and v_j are accessed, embed this face onto the plane by computing
   R_k = R_i + \frac{l_{ik}}{l_{ij}} e^{i\theta_i^{jk}} (R_j - R_i).
6. Find a point whose distance from the origin lies in [1.8ℓ, 2ℓ], and make it the center of the Upper Half-plane model.
7. For each vertex v_i, map its coordinate R_i to the corresponding coordinate in the Upper Half-plane model.
8. End If

Note that all the vertex planar coordinates R_i, R_j and R_k are treated as complex numbers.
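The face-embedding step (steps 4 and 5) can be sketched with complex arithmetic. The function names and the example edge lengths here are illustrative assumptions; the corner angle is recovered from the edge lengths by the law of cosines.

```python
import cmath
import math

def corner_angle(l_ij, l_ik, l_jk):
    """Angle at vertex i of a triangle with the given edge lengths (law of cosines)."""
    return math.acos((l_ij**2 + l_ik**2 - l_jk**2) / (2 * l_ij * l_ik))

def embed_third_vertex(R_i, R_j, l_ij, l_ik, l_jk):
    """Given embedded R_i, R_j, place R_k = R_i + (l_ik / l_ij) e^{i theta} (R_j - R_i)."""
    theta = corner_angle(l_ij, l_ik, l_jk)
    return R_i + (l_ik / l_ij) * cmath.exp(1j * theta) * (R_j - R_i)

# First face (step 4): R_i at the origin, R_j on the x-axis, l_ij = 1.
R_i, R_j = 0 + 0j, 1 + 0j
# Embed the third vertex of an equilateral triangle (all edges length 1).
R_k = embed_third_vertex(R_i, R_j, 1.0, 1.0, 1.0)
```

When R_i = (0, 0) and R_j = (l_ij, 0), this formula reduces to the explicit coordinates of step 4, which is why the same expression can be reused face by face in step 5.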
3. Performance Evaluations
The proposed schemes are verified in our sensor network, a continuously operating system deployed for forest monitoring. With up to 124 nodes deployed in the wild, this system provides an excellent platform for validating the availability of this method. Figure 1 plots the real topology of the sensor network. The sink is deployed at the lower right corner and the communication links are plotted.
Figure 3 illustrates the ratio of the energy cost of both schemes to that of TRF along the built route. In the results, the value for TRF is normalized to 1 as the evaluation baseline. Scheme two always costs the same energy as TRF, while scheme one costs less as the number of nodes increases, since the former builds the identical route to TRF based on classification and the latter obtains only an approximate solution. The above experiments demonstrate that each scheme has its advantages, and we can choose between them according to our demands.
Finally, we validate the global optimality of our schemes. As discussed, both of our schemes can build the route successfully since they guarantee delivery, but they do not always guarantee the global optimum. The experimental results are shown in Fig. 4. When the number of nodes is not large (e.g., the network scale is no more than 50 sensors), the routes of all three methods reach the global optimum. As the number of nodes increases (e.g., the network scale exceeds 80 sensors), scheme one may not build an optimal route every time. In other words, the route built by scheme one is an approximation of the optimal solution constructed by both scheme two and TRF.
Figure 4. The relationship between the success rate of reaching the global optimum and the number of sensors
4. Conclusions
In this paper, we propose two optimized hyperbolic Ricci flow schemes to construct virtual coordinates for geographic routing, which immensely reduces the energy cost of the iterative process. With our methods, sensors are mapped to virtual coordinates to discover a proper route. Experiments demonstrate that our optimized schemes are feasible and effective in practice and outperform existing Ricci flow-based routing schemes.
Acknowledgements
This work is supported by the National High Technology Research and Development
Program (863 Program) of China (2015AA01A201), National Science Foundation of
China under Grant No. 61402394, 61379064, 61273106, National Science Foundation of
Jiangsu Province of China under Grant No. BK20140462, Natural Science Foundation of
the Higher Education Institutions of Jiangsu Province of China under Grant No.
14KJB520040, 15KJB520035, China Postdoctoral Science Foundation funded project
under Grant No. 2016M591922, Jiangsu Planned Projects for Postdoctoral Research
Funds under Grant No. 1601162B, JLCBE14008, and sponsored by Qing Lan Project.
References
[1] T. Meng, F. Wu, Z. Yang, et al. Spatial reusability-aware routing in multi-hop wireless networks, IEEE Transactions on Computers, 65(2016), 244-255.
[2] H. Huang, H. Yin, Y. Luo, et al. Three-dimensional geographic routing in wireless mobile ad hoc and sensor networks, IEEE Network, 30(2016), 82-90.
[3] D. Zhang, E. Dong. A Virtual Coordinate-Based Bypassing Void Routing for Wireless Sensor Networks, IEEE Sensors Journal, 15(2015), 3853-3862.
[4] R. Sarkar, X. Yin, J. Gao, F. Luo, and X. D. Gu. Greedy routing with guaranteed delivery using Ricci flows, Proc. 8th Int. Symp. Inf. Process. Sensor Netw., 2009, 121-132.
[5] K. Cai, Z. Yin, H. Jiang, et al. OnionMap: A scalable geometric addressing and routing scheme for 3D sensor networks, IEEE Transactions on Wireless Communications, 14(2015), 57-68.
[6] R. S. Hamilton. The Ricci flow on surfaces, Mathematics and General Relativity, Contemporary Mathematics, 71(1988), 237-262.
[7] H. Yang, K. M. Tang, J. J. Yu, L. C. Zhu, H. Xu, Y. Y. Cao. Virtual Coordinates in Hyperbolic Space Based on Ricci Flow for WLANs, Applied Mathematics and Computation, 243(2014), 537-545.
[8] Y. L. Yang, R. Guo, F. Luo, S. M. Hu, and X. Gu. Generalized discrete Ricci flow, Computer Graphics Forum, 28(2009).
[9] R. Guo. Local rigidity of inversive distance circle packing, Transactions of the American Mathematical Society, 363(2011), 4757-4776.
doi:10.3233/978-1-61499-722-1-458
Abstract. As we all know, the Internet of Things (IoT) has a promising future. However, as a system or technology, the Internet of Things is still an ongoing research and development (R&D) effort. Currently, there is no strict definition of an IoT system by the International Telecommunication Union (ITU), so most architecture designs for IoT come from the requirements of specific applications. In this paper, we focus on the architecture for IoT, especially the application-driven architecture. Firstly, we identify and summarize the applications in IoT. Then we give a more holistic overview of IoT's application-driven architecture, which is divided into three categories based on Radio Frequency Identification (RFID), Wireless Sensor Network (WSN) and Machine-to-Machine (M2M) respectively. Along the way, we qualitatively analyze the pros and cons of the proposed architectures in each category. In addition, we analyze the techniques and methods in these categories, and point out the open research issues and directions in this field.
Introduction
The Internet of Things (IoT) refers to uniquely identifiable objects (things) and their virtual representations in an Internet-like structure [1]. In 2005, the International Telecommunication Union (ITU) defined it as: "The connectivity for anything, by embedding short-range mobile transceivers into a wide array of additional gadgets and everyday items, enabling new forms of communication between people and things, and between things themselves." [2] In our understanding, IoT is a multi-disciplinary and advanced technology set.
At present, more and more applications based on IoT technology are emerging. These applications involve logistics management [3], the Intelligent Transport [4],
¹ Corresponding Author: Wei CHEN, School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu, 221116, China; E-mail: chenw@cumt.edu.cn.
W.-D. Fang et al. / Research on the Application-Driven Architecture in Internet of Things 459
the Smart City [5], the Smart Home [6], and so on. The IoT has not only industrial value but also research significance. Many governments and research institutions have invested heavily in IoT research, and many projects have been carried out based on the existing wireless sensor network (WSN). Recently, special funds have been established to facilitate research on theories, methods and key technologies in the IoT field; many research results have been produced, and application demonstration and industrialization based on some of these results have been launched. Gradually, the IoT is becoming an indispensable aspect of the next-generation broadband wireless communication network, with huge opportunities for industrial R&D.
Although the applications of the Internet of Things have a promising future, the Internet of Things is an ongoing R&D effort. Since the IoT system is not strictly defined by the ITU, most architecture designs for IoT come from the requirements of specific applications. In this paper, we first identify and summarize the applications of IoT in Section 1. Then, in Section 2, we give a more holistic overview of the IoT's application-driven architecture, which is divided into three categories based on RFID, WSN and M2M respectively. Along the way, we qualitatively analyze the pros and cons of the proposed architectures in each category. In addition, we analyze the techniques and methods in these categories, and point out the open research issues and directions in this field.
In this section, the application-driven system architecture of the IoT is divided into three typical categories based on RFID, WSN and M2M respectively.
1.1. RFID-based
Electronic tags, which transform "things" into intelligent things, may be the most flexible approach. Tagging mobile and fixed assets is their major application, used for tracking and managing commodities. Khanam believed that RFID, just like the punch card, keyboard and barcode, is an information input approach and belongs to the IoT's category [7]. As an extension of this application technology, RFID improves the efficiency of information input and reduces costs. For coding, the Auto-ID Centre proposed the EPCGlobal system [8] for all electronic encoding, with RFID serving only as the carrier of the code.
As shown in Figure 1, EPCGlobal has proposed five technical components of the Auto-ID system: the Electronic Product Code (EPC) tag, the RFID reader, the Application Level Event (ALE) middleware for information filtering and gathering, the EPC Information Service (EPCIS), and the EPCIS Discovery Service (including the Object Name Service (ONS) and the Physical Mark-up Language (PML)). The EPC only identifies the "tag"; all useful information about the product is described by a new XML-based (eXtensible Markup Language) standard named PML. Because of ONS and PML, the RFID-based EPC system truly moves from a Network of Things to the Internet of Things. Based on ONS and PML, enterprise applications of RFID can evolve from internal closed-loop applications to open-loop supply chain applications. Zhang et al. proposed an extended six-layer architecture of IoT based on RFID [9]. In this architecture, the perception layer was divided into a coding layer, an information acquisition layer and an information access layer, from bottom to top. The coding layer is the base of the Internet of Things; the things' coding information is obtained from barcodes, two-dimensional codes and EPC. Liu et al. proposed a simple radio frequency identification (RFID) based architecture to preserve the privacy of the target object [10]. The proposed architecture can effectively hide the presence of the target object and preserve its location information by simply transferring the ID information.
1.2. WSN-based
In general, Sensor Networks include the Wireless Sensor Network (WSN), the Visual Sensor Network (VSN) and the Body Sensor Network (BSN). In this sub-section, we mainly discuss the WSN, which is made up of a set of autonomous and auto-configuring wireless sensors. A wireless node consists of a sensor, an RF transceiver, an MCU (Micro Controller Unit), memory, batteries and a UART (Universal Asynchronous Receiver/Transmitter). The sensor nodes sense information, process it into data packets, and transmit the packets to the sinks. The sinks aggregate the packets and transmit them to the BS (Base Station). Finally, these data packets are delivered to the user via a wide area network.
Although the wireless sensor network is a hot research topic, there are few successful real-world cases in the industrial field. This is because most research focuses on the WSN's lower layers, such as ZigBee, TinyOS and 6LoWPAN (IPv6 over Low-power Wireless Personal Area Networks), as well as on energy efficiency. It is noteworthy that 6LoWPAN was created for exactly this purpose, since the standard IP stack was unsuitable for such low-power wireless embedded devices due to their lack of resources [11]. On the other hand, the new trend is to turn sensor nodes into smart things and allow them to be accessed via the Internet [12]. Based on the above assumptions, the architecture of the flattened network is given in Figure 2.
1.3. M2M-based
In general, the M2M concept covers a wide range, involving parts of EPCGlobal and the wireless sensor network, as well as both wired and wireless communication. A typical M2M system architecture is shown in Figure 3. However, M2M still lacks unified standardization and architecture comparable to ONS and PML. Although there have been some attempts, unified standardizations have not yet been formed. In general, as key technologies of the IoT's architecture, ONS and PML have broad application prospects. In addition, Magdum et al. proposed a low-cost M2M architecture to improve the existing city bus public transport system, providing real-time arrival time prediction and approximate seat availability for buses [16].
Additionally, the technology architectures for the wireless sensor network and machine-to-machine have not yet been raised to the level of the ONS/PML technology system of the Internet of Things. We think WSN and M2M will refer to the ONS/PML technology system architecture on the road towards the Internet of Things.
As a large set of technologies, the IoT has many key technologies in theory and application. In this section, we will analyze some representative technologies and present some future issues in combination with the architectures proposed in the previous section.
The Internet of Things is made up of many co-existing heterogeneous networks. On the other hand, as the important basis of information sensing, the function of the perception layer embodies diverse aspects, involving the overlap of cross-system, cross-cell and different access technologies in particular applications. Meanwhile, we have to take full account of the different application characteristics, which contain the following items:
• Unified structural design
• Diversity of standards and protocols
• Interaction of hardware and software
• Intersection of function implementation and task management
Therefore, through the application-driven analysis (see Section 2), we synthesize the network structures of RFID, WSN and M2M, and then present some future issues in the next sub-section.
3. Conclusions
There are different perspectives on the R&D of the Internet of Things. One question is whether IoT has its own technology architecture or not. Different people hold different views on this issue. Some have a passive attitude: they claim that the IoT only integrates existing technologies, without a technology architecture of its own. Others hold the opposite opinion, the viewpoint of an "Internet of Things pan-technology theory": they argue that IoT technologies have been widely used in all aspects of industrial application, related to various fields of IT R&D. In this paper, we identify and summarize the applications in IoT, and then give a more holistic overview of IoT's application-driven architecture, which is divided into three categories based on RFID, WSN and Machine-to-Machine (M2M) respectively. Along the way, we qualitatively analyze the pros and cons of the proposed architectures in each category. In addition, we analyze the techniques and methods in these categories, and point out the open research issues and directions in this area.
We believe that, although the Internet of Things encompasses computer, communications, networking and control technologies, the simple integration of these technologies cannot constitute a flexible, efficient and useful IoT. Based on the convergence of the above-mentioned existing technologies, the IoT will form its own technical architecture through further R&D and application.
In the foreseeable future, things will become smaller and smaller, and more and more intelligent. They will have their own IP addresses and be able to exchange information autonomously via the IoT. Furthermore, with the development of Smart Manufacturing and Industry 4.0, the industrial structure will transition from vertical to flat, and from centralized to decentralized design. This will inevitably require different IoT architectures to meet different application requirements. Through our contribution in this paper, we hope that our conclusions and the proposed open research issues can facilitate system design in the IoT field.
Acknowledgment
This work is partially supported by the National Natural Science Foundation of China
(61471346, 61302113), the Shanghai Municipal Science and Technology Committee
Program (15DZ1100400), the Science and Technology Service Network Program of
Chinese Academy of Sciences (kfj-sw-sts-155), the Science and Technology
Commission of Shanghai Municipality (14ZR1439700), the National Natural Science
Foundation and Shanxi Provincial People's Government Jointly Funded Project of
China for Coal Base and Low Carbon (U1510115), the State administration of work
safety accident prevention technology project (shandong-0006-2014AQ, shandong-
0001-2014AQ), the independent innovation projects of Ji'nan University(201401210),
the Science and technology project of Housing Urban and rural construction in
Shandong Province (201419, 2015RK030), the Safety production science and
technology development plan of Shandong Province (201409, 201417) and Project
funding for young teachers of higher education in Shandong Province.
References
[1] M. M. Kashef, H. Yoon, M. Keshavarz, J. Hwang. Decision support tool for IoT service providers for
utilization of multi clouds. IEEE ICACT, Pyeongchang, Korea (South). 2016, 91-96.
[2] International Telecommunication Union (ITU), ITU Internet Reports 2005: The Internet of Things.
[3] H. Martin, T. Marek, H. Romana. The methodology of demand forecasting system creation in an industrial company: the foundation of logistics management. IEEE ICALT, Valenciennes, France. 2015, 12-15.
[4] K. Ben, G.-M. Susan. Sustainability assessment approaches for intelligent transport systems: the state of
the art. IET Intelligent Transport Systems 10(2016), 287-297.
[5] M. Andres. Smart cities concept and challenges: Bases for the assessment of smart city projects. IEEE
SMARTGREENS, Lisbon, Portugal. 2015, 1-11
[6] K. Xu, X. Wang, W. Wei, H. Song, B. Mao. Toward software defined smart home. IEEE
Communications Magazine 54 (2016), 116-122.
[7] S. Khanam, M. Mahbub, A. Mandal, M.S. Kaiser, S.A. Mamun. Improvement of RFID tag detection
using smart antenna for tag based school monitoring system. IEEE ICEEICT, Dhaka, Bangladesh. 2014,
1-6.
[8] F. Alessandro, M. Luca, P. Luigi, V. Roberto. An EPC-based middleware enabling reusable and flexible
mixed reality educational experiences. IEEE SoftCOM, Split-Primosten, Croatia. 2013, 1-6.
[9] M. Zhang, F. Sun, X. Cheng. Architecture of Internet of Things and Its Key Technology Integration
Based-On RFID. ISCID, Hangzhou, China, 2012, 294 – 297.
[10] D. Wu, J. Du, D. Zhu, S. Wang. A Simple RFID-Based Architecture for Privacy Preservation. IEEE
Trustcom/BigDataSE/ISPA, Helsinki, Finland. 2015, 1224-1229.
[11] S. C. Mukhopadhyay, N. K. Suryadevara. Internet of Things: Challenges and Opportunities. Smart Sensors, Measurement and Instrumentation, Vol. 9, ISBN 978-3-319-04222-0, Springer-Verlag, 2014, 1-18.
[12] C.P. Dan, M. Hedley, T. Sathyan. A manifold flattening approach for anchorless localization. Wireless
Networks, 18(2012), 319-333.
[13] T. Mohamed, D. Camille. A low-cost many-to-one WSN architecture based on UWB-IR and DWPT.
IEEE CoDIT, Metz, France. 2014, 712-718.
[14] C. Fu, Z. Ni. The application of embedded system in Supervisory Control and Data Acquisition System
(SCADA) over wireless sensor and GPRS networks. IEEE ASID, Xiamen, China. (2015), 81-85.
[15] I. Ungurean, V.G. Gaitan, N.-C. Gaitan. Integration of Information Acquired from Industrial Processes
in a Data Server Based on OPC.NET Specification. Global Journal on Technology, 3(2013), 553-558.
[16] N. Magdum, S. Patil, A. Maldar, S. Tamhankar. A low cost M2M architecture for intelligent public
transit. IEEE ICPC, Pune, India. 2015, 1-5.
doi:10.3233/978-1-61499-722-1-466
Abstract. In wireless video transmission, due to the complexity and variability of wireless communication channels, the bitrate must be adjusted to match the dynamic wireless channel. An analysis of video frame quality can serve as an important basis for adjusting the wireless video bitrate. This paper aims at detecting and recognizing transmission bitrates based on video frame quality for wireless networks. According to the distribution characteristics of video frame quality, this paper proposes a GOP-level bitrate clustering recognition algorithm (GLBCR) that uses the structural features of video coding GOPs and the temporal continuity of video frames to recognize different bitrates for wireless video. GLBCR uses the PSNR between each pair of original and terminal decoded frames as the feature to quantify the degradation of video frame quality. The algorithm extracts the PSNR values of all I-frames with a peak detector function, then uses a PSNR similarity measure to recursively split the frame interval into subintervals. Finally, the different video bitrates can be recognized by GLBCR. The proposed algorithm is evaluated on the LIVE mobile video quality assessment (VQA) database. The results show that the proposed algorithm can recognize changes of video bitrate by analyzing video frame quality; it is well consistent with the real bitrate changes in wireless video transmission and requires only a small amount of computation.
Introduction
According to the Cisco Visual Networking Index Global Mobile Data Traffic Forecast
Update, mobile video traffic accounted for 55% of total mobile data traffic in 2015 and
will generate 75% of total mobile data traffic by the end of 2020 [1]. Moreover,
wireless systems are rapidly replacing present-day wire-line systems, and the wireless
video services will play a major role in our daily lives [2]. Despite growing maturity in
broadband mobile networks, the time-varying wireless channel qualities often cause the
¹ Corresponding Author: Wen-Juan SHI, School of New Energy and Electrical Engineering, Yancheng Teachers University, Yancheng, 224051, China; School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou 221116, China; E-mail: winterswj@126.com.
W.-J. Shi et al. / A GLBCR Algorithm for Wireless Video Transmission 467
channel to be relatively unreliable and can lead to the loss of transmission data; this can
seriously affect the image and video quality.
Much work has been done to study the impact of frame rate on perceptual video quality [3-6]. Moorthy et al. [3] conducted subjective experiments to assess mobile video quality; the experimental results on the relationship between different bitrates and subjective evaluation scores indicated that humans prefer higher bitrates. Chen et al. found that a frame rate around 15 Hz seems to be the most widely preferred, but the exact acceptable frame rate varies depending on video content and viewers [4]. Ou et al. explored the impact of frame rate and quantization on the perceptual quality of a video [5]. Zhan Ma et al. proposed a rate model and a quality model expressed as the product of separate functions of quantization step size and frame rate [6]. These works focus on perceptual quality modeling and rate modeling for video.
In fact, a wireless channel is subject to radio interference, multipath fading and
shadowing, and sudden and severe fluctuations in the wireless bandwidth; all of these
factors can cause the traffic patterns of the compressed video streaming to change
dynamically and can significantly degrade the received video quality [7, 8]. Therefore,
it is important to recognize different bitrates by extracting the features of video and
analyzing the characteristics of the features under the conditions of terminal decoded
videos and unknown coder parameters. This paper focuses on automatically
recognizing video bitrates variation based on PSNR of each video frame. Considering
the structural and consecutive feature of video frames and the similarity measure of
neighboring frames, a bitrate clustering recognition algorithm named GLBCR is
proposed to partition the frames into clusters.
The rest of this paper is organized as follows. Section 1 gives the PSNR computation of I-frames using a peak detector function. Section 2 proposes the framework of GLBCR and details the proposed algorithm. Section 3 shows the experimental results on the LIVE mobile VQA database. Section 4 gives the conclusion.
PSNR (Peak Signal to Noise Ratio) is widely used as a quality metric or performance
indicator in image and video processing [9-11], which is defined as:
PSNR = 10 \log_{10} \frac{255^2}{MSE} \quad (1)

MSE = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} [X(i, j) - Y(i, j)]^2 \quad (2)

where X(i, j) and Y(i, j) are the pixel values of the reference and decoded frames of size m × n.
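Equations (1) and (2) can be computed directly; below is a minimal NumPy sketch (the frame size and error magnitude are illustrative, and the function assumes 8-bit pixel values, so the peak is 255):

```python
import numpy as np

def psnr(reference: np.ndarray, decoded: np.ndarray) -> float:
    """PSNR in dB between a reference frame X and a decoded frame Y, per Eqs. (1)-(2)."""
    x = reference.astype(np.float64)
    y = decoded.astype(np.float64)
    mse = np.mean((x - y) ** 2)               # Eq. (2): mean squared pixel difference
    if mse == 0.0:
        return float("inf")                   # identical frames
    return 10.0 * np.log10(255.0 ** 2 / mse)  # Eq. (1), 8-bit peak value 255

# A 720p frame and a copy with a uniform error of 5 gray levels: MSE = 25,
# so PSNR = 10 * log10(65025 / 25) ≈ 34.15 dB.
x = np.full((720, 1280), 128, dtype=np.uint8)
y = x + 5
print(round(psnr(x, y), 2))
```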
We have conducted an experiment on the LIVE mobile VQA database and found
that the mean PSNR value increases and the standard deviation decreases as the bitrate
increases, which indicates that the video frame quality improves as the bitrate increases.
For example, the PSNR values of four videos transmitted at different bitrates [R1,
R2, R3 and R4 (R1<R2<R3<R4)] are extracted; the results are illustrated in Figure 1. The
horizontal axis is the frame number. The vertical axis is the PSNR value of each video
frame. The PSNR mean and standard deviation values are shown in Table 1. From
Figure 1 and Table 1, it can be observed that the mean PSNR increases and the
standard deviation decreases as the bitrate increases.
468 W.-J. Shi et al. / A GLBCR Algorithm for Wireless Video Transmission
The PSNR values of a rate-changing video are shown in Figure 2. The horizontal
axis is the frame number, and the vertical axis is the PSNR value of each video frame.
Three types of bitrates can be clearly observed, distributed over three consecutive
frame intervals.
From Figure 1, it can be observed that the PSNR fluctuates with the structure of
the Groups of Pictures (GOPs). Since an I-frame is a full-frame, intra-coded frame
that provides the most information and serves as the reference for decoding the other
frames in its GOP, the bitrate variation is recognized from the PSNR of neighboring
I-frames.
2. GLBCR Algorithm
s(x_i, x_{i+1}) = 1 - \frac{|x_i - x_{i+1}|}{x_{max} - x_{min}} \quad (6)

where x_i and x_{i+1} are the i-th and (i+1)-th values, and x_{max} and x_{min} are the
maximum and minimum values in an interval. The value of the similarity measure lies
in the range [0, 1]. As the value approaches 1, the neighboring PSNR values are more
similar. The
smaller the similarity value, the more likely it is that the corresponding point will be a
discontinuity point that can divide a frame interval into two frame subintervals.
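A sketch of Eq. (6) over a sequence of PSNR values (the function name and sample values are ours):

```python
def psnr_similarity(values):
    """Eq. (6): similarity of each neighboring pair; returns len(values) - 1 numbers in [0, 1]."""
    span = max(values) - min(values)
    if span == 0:                 # all values equal: every neighboring pair is identical
        return [1.0] * (len(values) - 1)
    return [1.0 - abs(values[i] - values[i + 1]) / span
            for i in range(len(values) - 1)]

# The large PSNR drop between the 3rd and 4th values yields the smallest similarity,
# marking a candidate discontinuity point.
sims = psnr_similarity([34.0, 34.2, 33.9, 28.0, 28.1])
print([round(s, 3) for s in sims])
```

The index of the minimum similarity is the candidate discontinuity point used by the clustering below.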
In this paper, a PSNR-similarity-based clustering algorithm is proposed to search for
the particular frames where the bitrate changes. Note that in the first level, the
frame interval is composed of all I-frame numbers, not all the video frames. In the first
level, according to the similarity measure discussed in Section 2.1, the PSNR similarity
of neighboring I-frames is computed, and the discontinuity points are listed in
ascending order. Then, the I-frame number with the smallest similarity value is
selected as the discontinuity point. It must then be judged whether the frame interval
should be divided into two subintervals, one on each side of the selected discontinuity
point.
The mean difference between the frame intervals on the two sides of the discontinuity
point is compared against the m-threshold. Since a frame interval corresponds to a
cluster, if the difference between the frame intervals on the two sides of the
discontinuity point is greater than the m-threshold, the two frame intervals represent
two clusters; otherwise, they belong to one cluster and the frame interval is not
partitioned.
Note that the frame interval is recursively split at the frame with the smallest
similarity value in each splitting step, following the principle of a binary tree.
After obtaining the I-frame number where the bitrate changes, in the second level, the
particular frame number within the GOP in which the obtained I-frame lies needs to be
computed. According to the similarity measure discussed in Section 2.1, we calculate
the neighboring PSNR similarity within the obtained GOP and take the frame number
with the smallest similarity value in the GOP, which is exactly the frame where the
bitrate changes.
Since different parts of the transmission may occur at the same bitrate, it is
essential to compare the mean of each frame interval with the others; however,
neighboring frame intervals must belong to different clusters, according to the
principles of the GLBCR method. Therefore, to reduce the required calculations, one
need only compute the mean difference between each frame interval and all other frame
intervals except its neighbors. For example, given four frame intervals Interval1,
Interval2, Interval3 and Interval4, the required calculations are the mean differences
between Interval1 and Interval3, between Interval1 and Interval4, and between
Interval2 and Interval4. If the difference between the compared intervals is greater
than the m-threshold, they are considered two clusters; otherwise, they belong to the
same cluster and are merged into one.
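The pruning described above can be sketched as follows: with intervals numbered 1..n, only non-neighboring pairs are compared (the function name is illustrative). For four intervals it yields exactly the three comparisons listed.

```python
from itertools import combinations

def comparison_pairs(n_intervals):
    """Interval index pairs whose means must be compared: all pairs except neighbors."""
    return [(a, b) for a, b in combinations(range(1, n_intervals + 1), 2)
            if b - a > 1]

print(comparison_pairs(4))  # [(1, 3), (1, 4), (2, 4)]
```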
GLBCR is described in Algorithm 1.
Algorithm 1 GLBCR
Input: video frames
Output: clustering number, subintervals and clustering type
1. Calculate the PSNR of each video frame, and define the interval [N1, Nmax];
2. Extract the PSNR of the I-frames, and define the new interval [nI1, nIMax];
3. Calculate the PSNR similarity between neighboring I-frames in the interval [nI1, nIMax];
4. Detect the discontinuity points disci and define the set disc = {disc1, disc2, ..., discN} of discontinuity
points sorted in ascending order;
5. Calculate the difference d between the means of the PSNR values within the interval [nI1, nIMax] on
the two sides of the discontinuity point disc1;
6. if d > the m-threshold do
7. Divide the frame interval on the two sides of the discontinuity point disci into two subintervals
[nI1, nIi] and [nIi+1, nIMax];
8. Calculate the particular frame number Ni between the neighboring nIi-th and nIi+1-th GOPs in which
the bitrate changed, and divide the interval [N1, Nmax] into two frame subintervals [N1, Ni] and
[Ni+1, Nmax];
9. Renew the frame interval, and repeat steps 3-9;
10. else
11. Break;
12. end if
13. Merge the intervals with almost the same bitrate;
14. Define the bitrate clusters {cluster1, cluster2, ..., clusterX}.
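Steps 3-12 of Algorithm 1 amount to a recursive binary split at the least-similar neighboring pair, accepted only when the mean difference exceeds the m-threshold; the final merging of non-neighboring intervals (step 13) is omitted here. A simplified sketch operating directly on a list of I-frame PSNR values (names and the threshold value are illustrative):

```python
def split_intervals(psnr, lo, hi, m_threshold, result):
    """Recursively split the half-open interval [lo, hi) of PSNR values into clusters."""
    if hi - lo < 2:
        result.append((lo, hi))
        return
    span = max(psnr[lo:hi]) - min(psnr[lo:hi])
    if span == 0:                      # constant quality: nothing to split
        result.append((lo, hi))
        return
    # Discontinuity point: the neighboring pair with the smallest Eq. (6) similarity.
    cut = min(range(lo, hi - 1),
              key=lambda i: 1.0 - abs(psnr[i] - psnr[i + 1]) / span)
    left, right = psnr[lo:cut + 1], psnr[cut + 1:hi]
    mean = lambda xs: sum(xs) / len(xs)
    if abs(mean(left) - mean(right)) > m_threshold:  # accept the split: two clusters
        split_intervals(psnr, lo, cut + 1, m_threshold, result)
        split_intervals(psnr, cut + 1, hi, m_threshold, result)
    else:                                            # one cluster: stop splitting here
        result.append((lo, hi))

# Three bitrate levels: ~34 dB, ~28 dB, ~34 dB again.
psnr_values = [34.1, 34.0, 34.2, 28.0, 28.2, 28.1, 34.0, 34.1]
clusters = []
split_intervals(psnr_values, 0, len(psnr_values), 2.0, clusters)
print(clusters)  # [(0, 3), (3, 6), (6, 8)]
```

The first and last clusters would then be merged by step 13, since their means differ by less than the m-threshold.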
3. Experimental Results

In this paper, we use the LIVE mobile VQA database [2] to evaluate the performance
of GLBCR; the database simulates video distortions in a heavily trafficked wireless
network [3]. It consists of 10 source videos and 200 distorted videos at
720p (1280×720) resolution. All videos in the database are 15 seconds long with a
frame rate of 30 fps. The distortions include compression, wireless channel transmission
losses, frame-freezes, rate adaptation and temporal dynamics.
The rate-adaptation videos and rate-switch videos in the database are tested in this
paper. The rate-adaptation videos are defined as follows: the video starts at a bitrate
WRx; after n seconds, the bitrate switches to a higher bitrate WRy; after another
n seconds, it switches back to the original bitrate. Three different bitrate switches are
simulated: (1) WR1-WR4-WR1, (2) WR2-WR4-WR2, (3) WR3-WR4-WR3, named in
turn s14, s24 and s34.
The rate-switch videos are defined as follows: the bitrate is varied between
WR1 and WR4 multiple times. Five different rate switches are simulated: (1) WR1-
WR4-WR1-WR4-WR1-WR4, (2) WR1-WR2-WR4, (3) WR1-WR3-WR4, (4) WR4-WR2-
WR1, (5) WR4-WR3-WR1, named in turn t14, t124, t134, t421 and t431.
Unlike the regular rate-adaptation videos, the bitrates of the rate-switch videos
change irregularly.
The GLBCR method is tested on the LIVE mobile VQA database. The accuracy of
GLBCR is evaluated by comparing the frame intervals and corresponding clusters of
the actual bitrate categories with the bitrate clusters recognized by GLBCR. The
different types of videos are divided into clusters separately in the experiments. The
actual category, the recognized bitrate clusters and the corresponding frame intervals
of the video called “dv” in the LIVE mobile VQA database are illustrated in Table 3.
Table 3 shows that the recognized bitrate clusters are approximately the same as the
actual bitrate categories. A frame interval includes the frames at the same bitrate; for
example, the frame interval [1, 150] indicates the frames from the first to the 150th,
all at the same bitrate. The correlation coefficient between the recognized clusters
and the actual frame intervals is approximately 1, which means that GLBCR produces
a recognition that is consistent with the actual bitrate.
According to the features of the PSNR values of videos at different bitrates, the
m-threshold of the video called “dv” is set to 2.
Table 3. Comparison between the actual category of bitrates and the recognized clusters, based on GLBCR
of the video called “dv”.
The most well-known and commonly used partitioning methods are K-Means and K-
Medoids [13-16]. The performance of K-Means, K-Medoids and GLBCR is compared
using the LIVE mobile VQA database; the results are given in Table 4. From Table 4,
it can be observed that the performance of the GLBCR algorithm exceeds that of
K-Medoids and is close to that of K-Means. Although the K-Medoids and K-Means
algorithms work well for finding spherical-shaped clusters in small- to medium-size
databases, they are limited by the given number of clusters: their common disadvantage
is that users must specify the number of clusters. GLBCR clusters automatically, using
prior knowledge of the PSNR mean values at different bitrates instead. The time
complexity of the K-Means algorithm is O(nkt), where n is the total number of objects,
k is the number of clusters, and t is the number of iterations. The time complexity of
every iteration of the K-Medoids algorithm is O(k(n-k)^2), where n and k are as in the
K-Means method. The time complexity of the proposed GLBCR algorithm is
O(n log n), where n is the total number of objects.
Table 4. The performance comparison of K-means, K-medoids and GLBCR using the LIVE mobile VQA
database.
4. Conclusion
A GOP-level bitrate clustering recognition algorithm called GLBCR for wireless video
is presented. The algorithm recognizes video bitrate variations by analyzing the frame
quality of wireless video. Compared with the K-Means and K-Medoids algorithms, the
results demonstrate that the proposed GLBCR algorithm is effective and produces
results consistent with the real bitrates.
Acknowledgements
References
[1] Cisco. Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2015-2020, Cisco,
2016, 01.
[2] A. K. Moorthy, K. Seshadrinathan, R. Soundararajan, et al. Wireless Video Quality Assessment: A
Study of Subjective Scores and Objective Algorithms. IEEE Transactions on Circuits and Systems for
Video Technology, 20(2010), 587-599.
[3] A. K. Moorthy, L. K. Choi, A. C. Bovik, et al. Video quality assessment on mobile devices: subjective,
behavioral and objective studies. IEEE Journal of selected topics in signal processing, 6(2012), 652-
671.
[4] J. Y. C. Chen, and J. E. Thropp. Review of low frame rate effects on human performance. IEEE Trans.
on systems, 37(2007), 1063-1076.
[5] Y. F. Ou, Z. Ma, T. Liu et al. Perceptual quality assessment of video considering both frame Rate and
quantization Artifacts. IEEE Transactions on circuits and systems for video technology, 21(2011), 286-
298.
[6] Z. Ma, M. Xu, Y. Wang. Modeling of rate and perceptual quality of compressed video as functions of
frame rate and quantization stepsize and its applications. IEEE transactions on circuits and systems for
video technology, 22(2012), 671-682.
[7] X. Q. Zhu, B. Girod. Distributed Media-Aware Rate Allocation for Wireless Video Streaming. IEEE
Transactions on Circuits and Systems for Video Technology, 20(2010), 1462-1474.
[8] Y. F. Su, Y. H. Yang, Meng-Ting Lu, et al. Smooth Control of Adaptive Media Playout for Video
Streaming. IEEE Transactions on Multimedia, 11(2008), 1331-1339.
[9] T. S. Zhao, J. H. Wang, Z. Wang, et al. PSNR-Based Coarse-Grain Scalable Video Coding. IEEE
Transactions on Broadcasting, 61(2015), 210-221.
[10] R. Raju, S. A P. PSNR Based Video Coding Using 2D-DWT. 2014 International Conference on Control,
Instrumentation, Communication and Computational Technologies, Kanyakumari, 2014, 954-957.
[11] C. L. Yang, D. Q. Xiao. Improvements for H.264 Intra Mode Selection Based on SSE and PSNR.
Journal of Electronics & Information technology, 33(2011), 289-294.
[12] Jamali, S., et al., Detecting changes in vegetation trends using time series segmentation. Remote
Sensing of Environment, 156(2015), 182-195.
[13] J. W. Han, M. Kamber. Data Mining: concepts and techniques (Third Edition). Morgan Kaufmann
Publishers, 2012, 451-457.
[14] J. Macqueen. Some methods for classification and analysis of multivariate observations. In Proc. 5th
Berkeley Symposium on Mathematical Statistics and Probability, 1967, 281-297.
[15] L. M. Xue, W. X. Luan. Improved K-means Algorithm in User Behavior Analysis. 2015 ninth
International Conference on Frontier of Computer Science and Technology, Dalian, 2015, 339-342.
[16] U. Agrawal, S. K. Roy, U. S. Tiwary, et al. K-Means Clustering for Adaptive Wavelet Based Image
Denoising. 2015 International Conference on Advances in Computer Engineering and Applications,
Ghaziabad, 2015, 134-137.
476 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-476
Abstract. There are forty to fifty old streets spread around Taiwan. Each of
Taiwan’s old streets has its own story: some are famous for their delicacies, others
for their unique scenery, and still others promote their local cultures through the
combination of local industries and festivals. A few of Taiwan’s old streets have
found their position over time, while others have not and are fading like dried
leaves. This study therefore aims to find both the position and the development
direction of Taiwan’s old streets. From a tourism-experience perspective, an
evaluation system is formed according to six aspects (Landscape image, Historical
site image, Cultural image, Shopping experience, Gourmet experience, Marketing
experience) in order to classify Taiwan’s old streets and construct development
strategies. The connections among the old-street-forming criteria are constructed
using the Fuzzy Cognitive Map (FCM). The old streets are classified according to
their quality, and their competitive strategies are also established. Hopefully, this
study will be of great value in assisting the governmental authorities in finding the
characteristics of Taiwan’s old streets and improving their development.
Keywords. old street, cognitive image, tourism experience, fuzzy cognitive map
(FCM), network relation map (NRM)
Introduction
1 Corresponding Author: Chia-Li LIN, Department of Recreation Management, Shin Chien University, 200
University Road, Neimen, Kaohsiung, 845, Taiwan; E-mail: linchiali0704@yahoo.com.tw.
C.-L. Kuo and C.-L. Lin / The Analysis of Cognitive Image and Tourism Experience 477
of local cultural development, but they also witness the rise and decline of economic
activities in a place. Although urban regeneration and suburban development have
solved the housing needs of the growing population of the modernized city, they have
also resulted in the demolition of many old streets and buildings. Therefore, we should
look for the significance of old streets in these fast-changing times and understand
their value for the new age. This study attempts to find the image connotations of old
streets by analyzing old-street experiences, through which the researcher discovers the
value of old streets in people’s minds. The reconfirmation of old-street images is used
as the basis for the re-utilization of old streets. On the one hand, through the
preservation of old-street cultures, local residents’ sentimental attachment to the old
streets is sustained; on the other hand, through the re-utilization of old streets, their
economic value is raised. Hence, it is an important issue for the government to ensure
that, with both sentimental and economic values considered, the old streets can be
included in the new trend of city and town development.
However, rapid development also brings a dilemma: urban regeneration and
the preservation of historical monuments are often in conflict. It is therefore very
difficult to keep the value of historical monuments in the rapidly changing process
of urban development, and the issue needs to be treated carefully. On the one hand, to
promote the economic development of an area, some old streets and historical
monuments need to be demolished. On the other hand, choosing to preserve the
cultural resources may limit the development of the area and its economic growth.
With the improvement of material life, people’s need for travel and leisure has risen,
which has also promoted local tourism. In particular, nostalgic tours, which provide
customers with knowledge of historical monuments, have become more and more
popular, and recent studies have been paying attention to this issue. Our study uses
service-experience perception as its starting point. We created an evaluation system
for old-street classification and development strategies based on six aspects
(sightseeing, historic monument visiting, culture experiencing, shopping, food tasting
and marketing). We use the Fuzzy Cognitive Map (FCM) to construct the system,
which presents the relationships between old-street characteristics and criteria. Some
important old streets in northern and central Taiwan are used as our study cases.
This study is divided into five sections: the second section discusses cognitive
image and tourism experience, the third presents the research method, the fourth
reports an empirical study based on Taiwan’s old streets, and the final section
concludes. In the end, we would like to identify the key success factors for developing
Taiwan’s old streets.
1. Cognitive Image and Tourism Experience

In a study of tourist characteristics and tourists’ image cognition of traveling places, a
particular relationship was found between cognitive image and the motives of tourists,
which leads to three results: (1) motives influence the affective image; (2) holiday
traveling experiences are clearly relevant to the cognitive and affective images;
(3) socio-demographic features influence the evaluation of cognitive and affective
images [1].
In a study on the relationship between travel image and holiday experience, travel
image was found to have a direct impact on perceived quality, satisfaction and the
motivation to revisit the place, verifying the role of image in the marketing of tourist
destinations. The relationship shows that good service quality has a positive influence
on tourists’ satisfaction and their inclination to revisit [6].
are the local cuisines or snacks; the more special they are, the more they attract tourists.
Local specialties are the special products of an area, which tourists can taste on site or
take home as gifts for family and friends. The dining environment facet emphasizes
the condition of the dining area, which determines whether tourists decide to dine
there. The hygiene facet suggests that good hygiene conditions encourage customers to
dine in or take away. The service quality facet points out that good-quality service
makes customers want to visit the shop again.
2. Research Method
The FCM (Fuzzy Cognitive Map) approach was proposed by Kosko (1988) and
Sekitani and Takahashi (2001). It was developed from the original model of Axelrod
(1976) by incorporating fuzzy measures, yielding a flexible and feasible method for
resolving the fuzzy network relation structure among objects in a complicated system.
The FCM approach has been widely applied to enterprise management, political
decision-making, industrial analysis, and system control [10-24]. The research process
includes five steps: (1) evaluate the initial average matrix (A); (2) evaluate the direct
influence matrix (D); (3) evaluate the state matrix (C); (4) evaluate the influence
relation structure of the aspects/criteria; and (5) draw the Network Relation Map (NRM).
Aspects LI HI CI SE GE ME Total
Landscape image (LI) 0.000 2.780 2.780 2.203 2.237 2.525 12.525
Historical site image (HI) 2.831 0.000 3.508 2.068 2.169 2.678 13.254
Cultural image (CI) 2.627 3.322 0.000 1.983 2.254 2.780 12.966
Shopping experience (SE) 2.153 1.695 2.186 0.000 2.729 3.068 11.831
Gourmet experience (GE) 1.966 1.763 2.475 2.712 0.000 2.983 11.898
Marketing experience (ME) 2.441 2.712 2.610 2.949 2.949 0.000 13.661
Total 12.017 12.271 13.559 11.915 12.339 14.034
(2) Evaluate the direct influence matrix (D)
The direct influence matrix (D) can be calculated by Eqs. (1) and (2): the initial
average influence matrix (A) produces the direct influence matrix (D) through the
process of Eqs. (1) and (2). Matrix D represents each direct influence; in this matrix,
the numbers on the diagonal are 0, and the sum of each column and each row is at
most 1 (only one equals 1). Adding the sums of each row and column of the matrix
gives the direct influence value:
D = sA, \quad s > 0 \quad (1)

where

s = \min\left[ \frac{1}{\max_{1 \le i \le n} \sum_{j=1}^{n} a_{ij}}, \; \frac{1}{\max_{1 \le j \le n} \sum_{i=1}^{n} a_{ij}} \right], \quad i, j = 1, 2, \ldots, n \quad (2)

and \lim_{m \to \infty} D^m = [0]_{n \times n}, where D = [x_{ij}]_{n \times n}, with
0 \le \sum_{j=1}^{n} x_{ij} \le 1 and 0 \le \sum_{i=1}^{n} x_{ij} \le 1, and at least one row sum
\sum_{j} x_{ij} or column sum \sum_{i} x_{ij} equals one, but not all. Thus we can guarantee
\lim_{m \to \infty} D^m = [0]_{n \times n}.
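As a check, applying Eqs. (1) and (2) to the initial average matrix A transcribed from the table above reproduces the direct influence matrix that follows (a plain-Python sketch; the row/column order is LI, HI, CI, SE, GE, ME):

```python
# Initial average matrix A (off-diagonal entries from the table; order LI, HI, CI, SE, GE, ME).
A = [
    [0.000, 2.780, 2.780, 2.203, 2.237, 2.525],  # LI
    [2.831, 0.000, 3.508, 2.068, 2.169, 2.678],  # HI
    [2.627, 3.322, 0.000, 1.983, 2.254, 2.780],  # CI
    [2.153, 1.695, 2.186, 0.000, 2.729, 3.068],  # SE
    [1.966, 1.763, 2.475, 2.712, 0.000, 2.983],  # GE
    [2.441, 2.712, 2.610, 2.949, 2.949, 0.000],  # ME
]

n = len(A)
max_row_sum = max(sum(row) for row in A)                             # 13.661 (ME row)
max_col_sum = max(sum(A[i][j] for i in range(n)) for j in range(n))  # 14.034 (ME column)
s = min(1.0 / max_row_sum, 1.0 / max_col_sum)                        # Eq. (2)
D = [[s * a for a in row] for row in A]                              # Eq. (1): D = sA

print(round(D[0][1], 3))  # LI -> HI entry: 0.198, matching the direct influence matrix
```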
Aspects LI HI CI SE GE ME Total
Landscape image (LI) 0.000 0.198 0.198 0.157 0.159 0.180 0.893
Historical site image (HI) 0.202 0.000 0.250 0.147 0.155 0.191 0.944
Cultural image (CI) 0.187 0.237 0.000 0.141 0.161 0.198 0.924
Shopping experience (SE) 0.153 0.121 0.156 0.000 0.194 0.219 0.843
Gourmet experience (GE) 0.140 0.126 0.176 0.193 0.000 0.213 0.848
Marketing experience (ME) 0.174 0.193 0.186 0.210 0.210 0.000 0.973
Total 0.856 0.874 0.966 0.849 0.879 1.000
482 C.-L. Kuo and C.-L. Lin / The Analysis of Cognitive Image and Tourism Experience
The influence relationship between aspects or criteria can be calculated through the
following equation:

C^{(t+1)} = f(C^{(t')} D), \quad C^{t'} = C^{t} + C^{0}, \quad C^{(0)} = I_{n \times n} \quad (6)

where I_{n \times n} represents the identity matrix.
Aspects LI HI CI SE GE ME Total
Landscape image (LI) 1.397 1.597 1.710 1.520 1.564 1.741 9.529
Historical site image (HI) 1.638 1.508 1.826 1.583 1.635 1.830 10.020
Cultural image (CI) 1.601 1.672 1.597 1.554 1.612 1.805 9.842
Shopping experience (SE) 1.456 1.465 1.598 1.318 1.521 1.690 9.048
Gourmet experience (GE) 1.455 1.477 1.620 1.486 1.365 1.694 9.097
Marketing experience (ME) 1.642 1.691 1.809 1.658 1.704 1.703 10.206
Total 9.189 9.410 10.161 9.118 9.401 10.463 -
The limit state matrix C* can be derived from Eq. (4) or (5); Table 4 shows the
calculated C*, whose elements are obtained as indicated in Eq. (6). Let {d_i} be the
vector of row sums of C* and {r_j} the vector of column sums. Setting i = j,
{d_i + r_i} is the sum of the row and column values: the higher d_i + r_i is, the
stronger the network relation structure of the aspect/criterion. The difference
d_i - r_i indicates the net influence relation structure: if d_i - r_i > 0, the degree to
which the aspect influences others is stronger than the degree to which it is influenced.
The ME (Marketing experience) aspect has the highest full influence value
(d6 + r6 = 20.669), and the SE (Shopping experience) aspect has the lowest
(d4 + r4 = 18.166). The HI (Historical site image) aspect has the highest net influence
value (d2 - r2 = 0.610). The other net influences, in order, are: the LI (Landscape
image) aspect (d1 - r1 = 0.340), the SE (Shopping experience) aspect
(d4 - r4 = -0.070), the ME (Marketing experience) aspect (d6 - r6 = -0.257), the GE
(Gourmet experience) aspect (d5 - r5 = -0.304) and, last, the CI (Cultural image)
aspect (d3 - r3 = -0.319), as shown in Table 5.
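The full-influence and net-influence values quoted above follow directly from the row sums {d_i} and column sums {r_j} of the limit state matrix C* in Table 4. A sketch using the table's (rounded) entries, which reproduces the quoted values to within ±0.001:

```python
# Limit state matrix C* transcribed from Table 4 (order LI, HI, CI, SE, GE, ME).
C_star = [
    [1.397, 1.597, 1.710, 1.520, 1.564, 1.741],  # LI
    [1.638, 1.508, 1.826, 1.583, 1.635, 1.830],  # HI
    [1.601, 1.672, 1.597, 1.554, 1.612, 1.805],  # CI
    [1.456, 1.465, 1.598, 1.318, 1.521, 1.690],  # SE
    [1.455, 1.477, 1.620, 1.486, 1.365, 1.694],  # GE
    [1.642, 1.691, 1.809, 1.658, 1.704, 1.703],  # ME
]
aspects = ["LI", "HI", "CI", "SE", "GE", "ME"]

n = len(C_star)
d = [sum(row) for row in C_star]                             # row sums {d_i}
r = [sum(C_star[i][j] for i in range(n)) for j in range(n)]  # column sums {r_j}
full = [d[i] + r[i] for i in range(n)]  # d_i + r_i: full influence
net = [d[i] - r[i] for i in range(n)]   # d_i - r_i: net influence (> 0: mainly influences)

# ME has the highest full influence (~20.67; the paper reports 20.669 from unrounded
# values) and HI the highest net influence (0.610).
i_full = max(range(n), key=lambda i: full[i])
i_net = max(range(n), key=lambda i: net[i])
print(aspects[i_full], round(full[i_full], 3))
print(aspects[i_net], round(net[i_net], 3))
```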
The diagonal entries of the net influence matrix are all 0; in other words, the matrix
decomposes into a strictly upper triangular matrix and a strictly lower triangular
matrix, as shown in Table 6. Moreover, the corresponding values of the strictly upper
and strictly lower triangular parts are equal in magnitude but opposite in sign. This
property means that we need only consider one of the two strictly triangular parts.
Table 4 shows the matrix at the limit steady state, and Eq. (10) produces the net limit
state matrix shown in Table 6. Using the values of (d + r) and (d - r) in Table 5 as
the X axis and Y axis, respectively, the NRM can be drawn as shown in Figure 1; the
data in Table 7 can likewise be used to draw it. The HI (Historical site image) aspect is
the major influencing aspect in the net relation structure, while the CI (Cultural image)
aspect is the major aspect being influenced. The ME (Marketing experience) aspect has
the greatest aggregate relation structure, while the SE (Shopping experience) aspect has
the smallest, as shown in Figure 1.
Table 6. The net influence matrix of cognitive image and tourism experience
Aspects LI HI CI SE GE ME
Landscape image (LI) -
Historical site image (HI) 0.041 -
Cultural image (CI) -0.109 -0.154 -
Shopping experience (SE) -0.063 -0.118 0.045 -
Gourmet experience (GE) -0.110 -0.158 0.008 -0.035 -
Marketing experience (ME) -0.099 -0.140 0.004 -0.032 0.010 -
[Figure: NRM with (d + r) on the horizontal axis and (d - r) on the vertical axis, plotting the six aspects (HI, LI, SE, CI, GE, ME) together with the net influence values between them.]
Figure 1. The improvement strategy map for cognitive image and tourism experience
3. Conclusions
Six aspects were considered: the LI (Landscape image), HI (Historical site image), CI
(Cultural image), SE (Shopping experience), GE (Gourmet experience), and ME
(Marketing experience) aspects. Experts were invited to score the network relation
structure among the aspects, and the NRM (Network Relation Map) was analyzed
based on the FCM (Fuzzy Cognitive Map) approach; the network relation matrix can
be derived from Eq. (9). Among the six aspects, the HI (Historical site image) and LI
(Landscape image) aspects are the more influential, while the SE (Shopping
experience), GE (Gourmet experience), CI (Cultural image) and ME (Marketing
experience) aspects are the major dimensions being influenced. The ME (Marketing
experience) aspect has the highest full influence, while the SE (Shopping experience)
aspect has the smallest.
References
[1] A. Beerli and J. D. Martin, Tourists' characteristics and the perceived image of tourist destinations: a
quantitative analysis - a case study of Lanzarote, Spain, Tourism Management 25 (2004), 623-636.
[2] W. M. Choi, A. Chan, and J. Wu, A qualitative and quantitative assessment of Hong Kong's image as a
tourist destination, Tourism Management 20 (1999), 361-365.
[3] C. N. Buzinde and C. A. Santos, Representations of slavery, Annals of Tourism Research 35 (2008), 469-
488.
[4] S. C. H. Cheung, The meanings of a heritage trail in Hong Kong, Annals of Tourism Research 26(1999),
570-588.
[5] M. Chaudhary, India's image as a tourist destination - a perspective of foreign tourists, Tourism
Management 21 (2000), 293-297.
[6] J. E. Bigne, M. I. Sanchez, and J. Sanchez, Tourism image, evaluation variables and after purchase
behaviour: inter-relationship, Tourism Management 22 (2001), 607-616.
[7] M. Asplet and M. Cooper, Cultural designs in New Zealand souvenir clothing: the question of
authenticity, Tourism Management 21 (2000) 307-312.
[8] J. Chang, B.-T. Yang and C.-G. Yu, The moderating effect of salespersons' selling behaviour on
shopping motivation and satisfaction: Taiwan tourists in China, Tourism Management 27 (2006), 934-
942.
[9] A.-T. Hsieh and J. Chang, Shopping and Tourist Night Markets in Taiwan, Tourism Management, 27
(2006) 138-145.
[10] G. A. Banini and R. A. Bearman, Application of fuzzy cognitive maps to factors affecting slurry
rheology, International Journal of Mineral Processing, 52 (1998), 233-244.
[11] S. Bueno and J. L. Salmeron, Fuzzy modeling Enterprise Resource Planning tool selection, Computer
Standards & Interfaces 30 (2008), 137-147.
[12] B. Kosko, Hidden patterns in combined and adaptive knowledge networks, International Journal of
Approximate Reasoning 2 (1988), 377-393.
[13] K. C. Lee, J. S. Kim, N. H. Chung, and S. J. Kwon, Fuzzy cognitive map approach to web-mining
inference amplification, Expert Systems with Applications 22 (2002), 197-211.
[14] S. Lee and I. Han, Fuzzy cognitive map for the design of EDI controls, Information & Management, 37
(2000) 37-50.
[15] S. Lee, B. G. Kim, and K. Lee, Fuzzy cognitive map-based approach to evaluate EDI performance: a
test of causal model, Expert Systems with Applications 27 (2004), 287-299.
[16] K. S. Park and S. H. Kim, Fuzzy cognitive maps considering time relationships, International Journal
of Human-Computer Studies 42 (1995), 157-168.
[17] L. Rodriguez-Repiso, R. Setchi, and J. L. Salmeron, Modelling IT projects success with Fuzzy
Cognitive Maps, Expert Systems with Applications 32 (2007), 543-559.
[18] W. Stach, L. Kurgan, W. Pedrycz, and M. Reformat, Genetic learning of fuzzy cognitive maps, Fuzzy
Sets and Systems, 153 (2005) 371-401.
[19] M. A. Styblinski and B. D. Meyer, Signal Flow Graphs vs Fuzzy Cognitive Maps in application to
qualitative circuit analysis, International Journal of Man-Machine Studies 35 (1991), 175-186.
[20] R. Taber, Knowledge processing with Fuzzy Cognitive Maps, Expert Systems with Applications 2
(1991), 83-87.
[21] Z. Wei, L. Lu, and Z. Yanchun, Using fuzzy cognitive time maps for modeling and evaluating trust
dynamics in the virtual enterprises, Expert Systems with Applications 35 (2008)1583-1592.
[22] G. Xirogiannis, J. Stefanou, and M. Glykas, A fuzzy cognitive map approaches to support urban design,
Expert Systems with Applications 26 (2004), 257-268.
[23] B. S. A. Yeoh and S. Huang, The conservation-redevelopment dilemma in Singapore: The case of the
Kampong Glam historic district, Cities 13 (1996), 411-422.
[24] R. Yu and G.-H. Tzeng, A soft computing method for multi-criteria decision making with dependence
and feedback, Applied Mathematics and Computation 180 (2006), 63-75.
Fuzzy Systems and Data Mining II 487
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-487
Introduction
1
Corresponding Author: Cheng WANG, College of Computer Science and Technology, Huaqiao
University, Xiamen 361021, China; E-mail: wangcheng@hqu.edu.cn.
488 S.-Q. Wen et al. / A Collaborative Filtering Recommendation Model
idea is to predict the preferences of target users according to their historical rating data,
and to select a number of items with high predicted scores as the recommendation
result [2].
However, the traditional collaborative filtering algorithm suffers from low accuracy,
partly because it ignores the fact that different items have different influences on, and
contributions to, the recommendation result when the user-item rating matrix is used
to predict scores [3].
From the perspective of the long-tail theory, every item has a long tail in terms of influence and importance, and every item is valuable; the difference lies in the magnitude of that value and in the type of audience group.
The traditional recommendation algorithm is more concerned with the impact of popular items, while ignoring the value of the unpopular items in the tail. Assigning weights to unpopular items is a commonly used way to improve their influence, but these weighting strategies are all empirical and require additional prior knowledge. In addition, they introduce a new parameter-optimization problem, and they can improve recommendation accuracy from only one aspect.
Lai et al. [4] proposed a collaborative filtering recommendation algorithm that incorporates changes in user interest: they designed a time-weight function and introduced a new method of calculating the similarity between items. You et al. [5] proposed a recommendation algorithm combining an item-clustering method with the weighted Slope One scheme. These algorithms improve recommendation accuracy to a certain extent, but they ignore the personalization of the recommendations.
User\Item   i1      …   ib      …   in
u_a         r_a,1   …   r_a,b   …   r_a,n
…           …           …           …
u_m         r_m,1   …   r_m,b   …   r_m,n
The calculation of item similarity is one of the key steps of a collaborative filtering algorithm; commonly used similarity measures include cosine similarity, correlation similarity and modified cosine similarity.
Cosine similarity regards the user ratings as multi-dimensional vectors; the similarity between item i_a and item i_b can be computed from the angle between the two vectors.
sim(i_a, i_b) = \frac{i_a \cdot i_b}{|i_a|\,|i_b|} = \frac{\sum_{k \in I_a \cap I_b} r_{k,a}\, r_{k,b}}{\sqrt{\sum_{l \in I_a} (r_{l,a})^2}\;\sqrt{\sum_{j \in I_b} (r_{j,b})^2}}    (1)
In formula (1), r_{l,a} with l ∈ I_a denotes vector l's score on component a, and r_{j,b} with j ∈ I_b denotes vector j's score on component b. r_{k,a} and r_{k,b} with k ∈ I_a ∩ I_b denote vector k's scores on the components that a and b have in common.
Xu et al. [6] used recall, accuracy, coverage, popularity, etc., as indicators for verification in the sparse-data case. Comparing the Jaccard coefficient, the Euclidean distance, the Pearson coefficient and cosine similarity, they found that cosine similarity gives the best results; cosine similarity is therefore the basis of many kinds of weighted improvements.
However, these algorithms do not take into account the importance of unpopular items or the influence of the most frequently recommended items [7], which affects the accuracy and personalization of the recommendations.
According to the ratings of the target item's nearest neighbors, the scores of unrated items are predicted, and a number of the highest-scoring items are selected as recommendation results for the target user. User u_v's rating on item i_a can be predicted from u_v's ratings on item i_a's nearest neighbors:

P_{u_v, i_a} = \frac{\sum_{i_b \in N} sim(i_a, i_b)\, r_{v,b}}{\sum_{i_b \in N} sim(i_a, i_b)}    (2)
Here, i_b ∈ N means that item i_b is one of item i_a's nearest neighbors, sim(i_a, i_b) is the similarity between items i_a and i_b, and r_{v,b} is user u_v's rating on item i_b.
w_{i_a, i_b} = \begin{cases} Q/T, & Q < T \\ 1, & Q \ge T \end{cases}    (3)
The correlation-weighted similarity simc(i_a, i_b) is shown in Eq. (4):

simc(i_a, i_b) = w_{i_a, i_b} \cdot sim(i_a, i_b)    (4)

If T is set to 1, simc(i_a, i_b) coincides with the traditional similarity; setting T too small or too large usually fails to achieve the best result.
Here, i_b ∈ N means that item i_b is one of item i_a's nearest neighbors, simc(i_a, i_b) is the correlation-weighted similarity between items i_a and i_b, and r_{v,b} is user u_v's rating on item i_b.
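Eqs. (3) and (4) can be sketched as follows. Since the definitions of Q and T fall outside this excerpt, Q is assumed here to be the number of users who rated both items and T the chosen threshold; both names are taken from the formulas above.

```python
def correlation_weight(q, t):
    """Correlation weight w_{ia,ib} of Eq. (3): Q/T when Q < T, else 1.

    q is assumed here to be the number of co-rating users and t the
    threshold T; their exact definitions lie outside this excerpt.
    """
    return q / t if q < t else 1.0

def correlation_weighted_similarity(q, t, sim_ab):
    """Eq. (4): simc(ia, ib) = w_{ia,ib} * sim(ia, ib)."""
    return correlation_weight(q, t) * sim_ab
```

With t = 1, every item pair with at least one common rater gets weight 1, so simc reduces to the traditional similarity, as the text observes.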
simcc(i_a, i_b) = \frac{i_a \cdot i_b}{|i_a|\,|i_b|} = \frac{\sum_{k \in I_a \cap I_b} (r_{k,a} w_a)(r_{k,b} w_b)}{\sqrt{\sum_{l \in I_a} (r_{l,a} w_a)^2}\;\sqrt{\sum_{j \in I_b} (r_{j,b} w_b)^2}} = \frac{w_a w_b \sum_{k \in I_a \cap I_b} r_{k,a}\, r_{k,b}}{w_a w_b \sqrt{\sum_{l \in I_a} (r_{l,a})^2}\;\sqrt{\sum_{j \in I_b} (r_{j,b})^2}} = \frac{\sum_{k \in I_a \cap I_b} r_{k,a}\, r_{k,b}}{\sqrt{\sum_{l \in I_a} (r_{l,a})^2}\;\sqrt{\sum_{j \in I_b} (r_{j,b})^2}} = sim(i_a, i_b)    (6)
Pcc_{u_v, i_a} = \frac{\sum_{i_b \in N} simcc(i_a, i_b)\, r_{v,b}\, w_b}{\sum_{i_b \in N} simcc(i_a, i_b)\, w_b} = \frac{\sum_{i_b \in N} sim(i_a, i_b)\, r_{v,b}\, w_b}{\sum_{i_b \in N} sim(i_a, i_b)\, w_b}    (7)
Here, Pcc_{u_v, i_a} is the predicted rating of user u_v on item i_a, N is the set of item i_a's nearest neighbors, simcc(i_a, i_b) is the optimal-weighted item similarity, sim(i_a, i_b) is the traditional item similarity, r_{v,b} is user u_v's rating on item i_b, and w_b is item i_b's weight.
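The weighted prediction of Eq. (7) then differs from Eq. (2) only in the per-item weights w_b; a sketch:

```python
def predict_rating_weighted(sim, neighbor_ratings, weights):
    """Weighted prediction of Eq. (7).

    sim: i_b -> sim(i_a, i_b); neighbor_ratings: i_b -> r_{v,b};
    weights: i_b -> w_b, the weight of neighbor item i_b.
    """
    num = sum(sim[b] * neighbor_ratings[b] * weights[b] for b in sim)
    den = sum(sim[b] * weights[b] for b in sim)
    return num / den if den else 0.0
```

With all weights equal, the result reduces to the unweighted prediction of Eq. (2).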
2.3.3. Impact of the Fusion Model on Nearest-Neighbor Selection and Rating Prediction
P_{fusion\_u_v, i_a} = \frac{\sum_{i_b \in N} sim_{fusion}(i_a, i_b)\, r_{v,b}\, w_b}{\sum_{i_b \in N} sim_{fusion}(i_a, i_b)\, w_b}    (9)
\min \mathrm{MAE} = \min \frac{\sum_{u_v=1}^{m} |p_{u_v, i_a} - r_{v,a}|}{m}    (10)

s.t. \quad sim_{fusion}(i_a, i_b) = w_{i_a, i_b}\, sim(i_a, i_b) = w_{i_a, i_b}\, \frac{\sum_{k \in I_a \cap I_b} r_{k,a}\, r_{k,b}}{\sqrt{\sum_{l \in I_a} (r_{l,a})^2}\;\sqrt{\sum_{j \in I_b} (r_{j,b})^2}}

\quad\quad P_{fusion\_u_v, i_a} = \frac{\sum_{i_b \in N} sim_{fusion}(i_a, i_b)\, r_{v,b}\, w_b}{\sum_{i_b \in N} sim_{fusion}(i_a, i_b)\, w_b}
In the experiment we use the MovieLens dataset provided by the GroupLens team at the University of Minnesota, which contains 100,000 ratings (on a 1-5 scale) given by 943 users to 1,682 movies; each user has rated at least 20 films. The dataset is very sparse, since the density of the actual rating data is 100000/(943*1682) ≈ 6.3%.
In this paper, the data are taken from the file u.data, and the similarity calculations are based on the fields UserID, MovieID and Rating; the format is shown in Table 2.
Table 2. Table u.data format
UserID MovieID Rating Timestamp
To verify the effectiveness of the algorithm, we use MAE, recall, precision, average popularity and coverage as evaluation indicators. R(u) denotes the items we recommend to a user, T(u) denotes the items the user actually rated, and M denotes the number of items.
\mathrm{MAE} = \frac{\sum_{u_v=1}^{m} |P_{u_v, i_a} - r_{v,a}|}{m}
Here, P_{u_v, i_a} is the predicted rating of user u_v on item i_a, r_{v,a} is user u_v's actual rating on item i_a, and m is the size of the test dataset.
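The MAE computation can be sketched directly from the formula:

```python
def mean_absolute_error(predicted, actual):
    """MAE over the test set: the mean of |P_{uv,ia} - r_{v,a}|."""
    assert len(predicted) == len(actual) and predicted
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(predicted)
```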
Acknowledgements
This work was financially supported by National Natural Science Foundation of China
(Grant No.51305142, 61572204), Fujian province science and technology plan
(No.2017H01010065), project of Xiamen science and technology plan
(3502Z20151239), Postgraduate Scientific Research Innovation Ability Training Plan
Funding Projects of Huaqiao University (No.1511314023).
References
[2] Z. L. Zhao, C. D. Wang, Y. Y. Wan, et al. Pipeline Item-Based Collaborative Filtering Based on Map
Reduce, Proceedings of the 2015 IEEE Fifth International Conference on Big Data and Cloud
Computing. IEEE Computer Society, (2015), 9-14.
[3] J. J. Castro-Schez, R. Miguel, D. Vallejo, et al. A highly adaptive recommender system based on fuzzy logic for B2C e-commerce portals. Expert Systems with Applications, 38 (2011), 2441-2454.
[4] W. Lai, H. Deng. An improved collaborative filtering algorithm adapting to user interest changes,
Information Science and Service Science and Data Mining (ISSDM), IEEE, (2012), 598-602.
[5] H. You, H. Li, Y. Wang, et al. An improved collaborative filtering recommendation algorithm
combining item clustering and slope one scheme. Lecture Notes in Engineering & Computer Science,
2215(2015), 18-20.
[6] X. Xu, X. F. Wang. Optimization Method of Similarity Degree in Collaborative Filter Algorithm.
Computer Engineering, 36(2010), 52-54.
[7] S. Y. Wei, Y. Ning, X. B. Yang. Collaborative Filtering Algorithm Combining Item Category and
Dynamic Time Weighting. Computer Engineering, 40(2014), 206-210.
[8] H. Wu, Y. J. Wang, Z. Wang, et al. Two-Phase Collaborative Filtering Algorithm Based on
Co-Clustering. Journal of Software, 21(2010), 1042-1054.
[9] Z. Y. Xiong, F. J. Zhang, Y. F. Zhang. Item Clustering Recommendation Algorithm Based on Particle
Swarm Optimization. Computer Engineering, 35(2009), 178-180.
[10] C. Lu, A. Hong, J. Gong, et al. Research on collaborative filtering recommendation method
based on PSO algorithm. Computer Engineering & Applications, 50(2014), 101-107.
Fuzzy Systems and Data Mining II 501
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-501
Abstract. This paper studies Cayley's theorem for regular double Stone algebras. We introduce the concept of a regular ternary class and show that each regular double Stone algebra is isomorphic to a subalgebra of the algebra associated with some regular ternary class of functions over a set, analogous to the results for Stone algebras.
1. Introduction
We start by constructing a regular double Stone algebra. Let X be a set and S(X) = {(A, B) ∈ 2^X × 2^X | A ⊆ B}. We define the following operations on S(X):
(A1 , B1 ) ∪ (A2 , B2 ) = (A1 ∪ A2 , B1 ∪ B2 );
1 Corresponding Author : Cong-wen Luo , College of Science, China Three Gorges University, Yichang ,
3. Representation by Functions
g(x, x, x) = x,
for all x ∈ X.
4. Any two g, h ∈ D commute:
h(g(x1 , x2 , x3 ), g(y1 , y2 , y3 ), g(z1 , z2 , z3 )) = g(h(x1 , y1 , z1 ), h(x2 , y2 , z2 ), h(x3 ,
y3 , z3 )) for all xi , yi , zi ∈ X, i = 1, 2, 3.
5. Each g ∈ D is diagonal:
g(g(x1 , x2 , x3 ), g(y1 , y2 , y3 ), g(z1 , z2 , z3 )) = g(x1 , y2 , z3 ),
for all xi , yi , zi ∈ X, i = 1, 2, 3.
6. If g(z, x, x) = h(z, x, x) and g(z, z, x) = h(z, z, x), then g(x, y, z) = h(x, y, z)
for all x, y, z ∈ X.
In what follows we always use B to denote the class of regular ternary classes over
a set X.
Let D ∈ B, we define the operations ∨, ∧,0 and + on D as follows. For any g, h ∈ D
and x, y, z ∈ X,
(g ∨ h)(x, y, z) = g(x, h(x, y, y), h(x, y, z)),
(g ∧ h)(x, y, z) = g(h(x, y, z), h(y, y, z), z),
g 0 (x, y, z) = g(z, z, x),
g + (x, y, z) = g(z, x, x).
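As an illustration, treating the members of D as Python functions of three arguments, the four operations can be transcribed literally; the projection functions used in the usage note are only illustrative members of such a class.

```python
def join(g, h):
    # (g ∨ h)(x, y, z) = g(x, h(x, y, y), h(x, y, z))
    return lambda x, y, z: g(x, h(x, y, y), h(x, y, z))

def meet(g, h):
    # (g ∧ h)(x, y, z) = g(h(x, y, z), h(y, y, z), z)
    return lambda x, y, z: g(h(x, y, z), h(y, y, z), z)

def prime(g):
    # g'(x, y, z) = g(z, z, x)
    return lambda x, y, z: g(z, z, x)

def plus(g):
    # g^+(x, y, z) = g(z, x, x)
    return lambda x, y, z: g(z, x, x)
```

For the projections p1(x, y, z) = x and p3(x, y, z) = z, for example, join(p1, p3) agrees with p1 and meet(p1, p3) agrees with p3.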
C.-W. Luo / A Cayley Theorem for Regular Double Stone Algebras 503
a + b = (a_1, a_2, a_3) \cdot \begin{pmatrix} 1 & 0 & 0 \\ b_1 & b_2 + b_3 & 0 \\ b_1 & b_2 & b_3 \end{pmatrix},    (1)

ab = (a_1, a_2, a_3) \cdot \begin{pmatrix} b_1 & b_2 & b_3 \\ 0 & b_1 + b_2 & b_3 \\ 0 & 0 & 1 \end{pmatrix}.    (2)
Moreover, we define
a0 = (a3 , 0, a1 + a2 ), a+ = (a2 + a3 , 0, a1 ) and let 0 = (0, 0, 1), 1 = (1, 0, 0).
Note that the two constants 0, 1 ∈ ML . Also, ML is closed under the above opera-
tions.
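As a concrete check, for L a field of subsets of X the matrix products (1) and (2) can be expanded with · read as intersection and + as union. The following sketch verifies, for example, that a · a^0 = 0 and a + a^+ = 1 on triples whose components partition X; the function names are ours.

```python
def plus_op(a, b):
    """a + b = (a1, a2, a3) · [[1, 0, 0], [b1, b2+b3, 0], [b1, b2, b3]],
    Eq. (1), expanded over subsets of X (| is union, & is intersection)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a1 | (a2 & b1) | (a3 & b1),
            (a2 & (b2 | b3)) | (a3 & b2),
            a3 & b3)

def times_op(a, b):
    """ab = (a1, a2, a3) · [[b1, b2, b3], [0, b1+b2, b3], [0, 0, 1]], Eq. (2)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a1 & b1,
            (a1 & b2) | (a2 & (b1 | b2)),
            (a1 & b3) | (a2 & b3) | a3)

def prime_op(a):
    """a^0 = (a3, 0, a1 + a2)."""
    return (a[2], frozenset(), a[0] | a[1])

def plus_unary(a):
    """a^+ = (a2 + a3, 0, a1)."""
    return (a[1] | a[2], frozenset(), a[0])
```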
Theorem 3.3. Let L ∈ D, then
< ML , +, ·,0 ,+ , 0, 1 >∈ A.
Proof. Define
According to Proposition 3.2 in [6], DL satisfies the conditions (1)-(5) of the regular
ternary class. Now we will show that the condition (6) holds, hence DL ∈ A.
Suppose ifa (z, x, x) = ifb (z, x, x) and ifa (z, z, x) = ifb (z, z, x), where a =
(a1 , a2 , a3 ) and b = (b1 , b2 , b3 ) in ML , then a1 z + (a2 + a3 )x = b1 z + (b2 + b3 )x
and (a1 + a2 )z + a3 x = (b1 + b2 )z + b3 x. Setting x = 1, z = 0 and x = 0, z = 1
respectively, we have a3 = b3 , a2 + a3 = b2 + b3 and a1 = b1 , a1 + a2 = b1 + b2 . Thus
a2 = b2 from the fact that L ∈ D and a = b and then ifa (x, y, z) = ifb (x, y, z).
Next, we define
ϕ : ML → DL , a → ifa .
It is easy to show that ϕ is an isomorphism. In fact, if ifa = ifb , then, for all x, y, z ∈ L,
ifa (x, y, z) = ifb (x, y, z), that is to say, a1 x + a2 y + a3 z = b1 x + b2 y + b3 z. Setting
x = 1, y = z = 0; x = z = 0, y = 1 and x = y = 0, z = 1, respectively, we have
ai = bi , i = 1, 2, 3. Thus ϕ is one-to-one. Obviously, ϕ is onto. Furthermore,
and
ifa0 (x, y, z) = (ifa )0 (x, y, z), ifa+ (x, y, z) = (ifa )+ (x, y, z).
Hence ML ∈ A .
For each set X, the subset-pair algebra S(X) is isomorphic to the algebra ML ,
where L is the field of all subsets of X. Indeed, the function
S(X) → ML
is an isomorphism.
Lemma 3.4. Let A ∈ A. Then A can be embedded in S(X), where X = {I | I is a prime ideal of A}.
Proof. Let Xa = {I is a prime ideal of A|a ∈ I} . Since a++ ≤ a00 , Xa++ ⊆ Xa00 ,
we have (Xa++ , Xa00 ) ∈ S(X). Define ϕ : A → S(X), a → (Xa++ , Xa00 ). It is easy
to see that ϕ is a homomorphism. Since Xa0 = X\Xa00 , Xa+ = X\Xa++ , we have
ϕ(a0 ) = (ϕ(a))0 , ϕ(a+ ) = (ϕ(a))+ . The fact A is regular implies ϕ is one-to-one.
Therefore, A can be embedded in S(X).
Corollary 3.5. Let A ∈ A. Then A embeds in ML .
Theorem 3.6. A ∈ A iff there exists D ∈ B such that A can be embedded in the algebra associated with D.
Proof. If A embeds in D ∈ B, then A ∈ A, by Theorem 3.2. Conversely, suppose A ∈ A; then A can be embedded in S(X). But S(X) ≅ ML and ML ≅ DL, where L ∈ D.
For each A ∈ A, let X be the set of all prime ideals of A and let L be the field of all subsets of X. Define the map ϕ : A → B by ϕ(A) = { if_{(X_{a++}, X_{a00})} | a ∈ A }.
[Figure 3 depicts the Hasse diagram of ML for X = {x1, x2}: the top element (X, ∅, ∅); below it ({x1}, {x2}, ∅) and ({x2}, {x1}, ∅); then ({x1}, ∅, {x2}), (∅, X, ∅) and ({x2}, ∅, {x1}); then (∅, {x1}, {x2}) and (∅, {x2}, {x1}); and the bottom element (∅, ∅, X).]
Figure 3. The regular double Stone algebra ML from the distributive lattice L.
[An isomorphic (≅) Hasse diagram for DL, with if(X,∅,∅) at the top; if(x1,x2,∅) and if(x2,x1,∅) below it; then if(x1,∅,x2), if(∅,X,∅) and if(x2,∅,x1); then if(∅,x1,x2) and if(∅,x2,x1); and if(∅,∅,X) at the bottom, mirroring the diagram of Figure 3.]
Semantic Web (SW) technologies, initiated by Tim J. Berners-Lee, allow the addition of meaning to information through the use of a semantic formalization called an ontology. An ontology therefore corresponds to a vocabulary containing a hierarchy of semantic
concepts and properties employed for the definition of the knowledge in a given
domain. Concepts and properties are used to annotate the content of the application.
Semantic annotation consists in the creation of links between concepts and their
instances in the content. In this context, an information search can be related with
content, semantic relations or both. For example, if the user is looking for an expert
located in Paris the system can see that Paris is a French city, and if there are no experts
living in the capital it can propose experts located in other cities in France. In order to
achieve this outcome, the system makes an inference using the subsumption relations
1 Corresponding Author: Daniel BURGOS, UNESCO Chair on eLearning, Universidad Internacional de La Rioja (UNIR). Gran Via Rey Juan Carlos I, 41, 26002 Logroño, La Rioja, Spain; E-mail: daniel.burgos@unir.net.
508 D. Burgos / ARII-eL: An Adaptive, Informal and Interactive eLearning Ontology Network
present in the ontology. There are reports in the literature which highlight the benefits
of inference in ontology-based user models [1].
In order to implement the foundations of Semantic Web, W3C has defined several
formalisms which allow the conceptual model to be represented and the knowledge
base to be managed. The logical representation is based on information triples defined
in the RDF (Resource Description Framework) language [2]. In this language each
triple is formed by a subject (for example a concept), a predicate (generally a semantic
property) and an object (the value of a resource). From this logical model, and using
either the formalism RDFS (Resource Description Framework Schema) [3] or OWL
(Ontology Web Language) [4], the ontology can be defined containing only concepts
and the semantic relations between them. Starting from an ontology, annotations can be
instanced (with RDF) in order to describe the content.
Ontologies and annotations are, in general, stored in a repository and form a graph of
triples that can be queried using SPARQL [5], the query language for RDF. It is also
possible to apply semantic rules defined with SWRL (Semantic Web Rule Language)
[6].
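As a sketch of the triple model described above, a tiny in-memory store with a wildcard pattern matcher can stand in for a SPARQL basic graph pattern; the ex: names below are hypothetical and echo the expert-in-Paris example.

```python
# Hypothetical triples: (subject, predicate, object).
triples = {
    ("ex:Paris", "rdf:type", "ex:FrenchCity"),
    ("ex:alice", "ex:locatedIn", "ex:Paris"),
    ("ex:alice", "rdf:type", "ex:Expert"),
}

def match(s=None, p=None, o=None):
    """Return the triples matching a subject/predicate/object pattern;
    None acts as a wildcard, mimicking a basic SPARQL triple pattern."""
    return [(ts, tp, to) for (ts, tp, to) in triples
            if (s is None or ts == s)
            and (p is None or tp == p)
            and (o is None or to == o)]
```

A real application would of course query a triple store with SPARQL rather than filter tuples in memory; the sketch only illustrates the data model.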
Using ontologies, several vocabularies have been proposed to describe important
domains. These vocabularies are public and permit semantic interoperability across the
Web. Interoperability issues of ontologies in educational applications have been
analysed in [7], where the authors couple two complementary systems through the use
of a common ontology.
In the following sections two of these vocabularies, which are of great importance for
user-profile modelling, will be described.
1.1. FOAF
FOAF (Friend of a Friend) [8] is based on decentralized technology and has been
designed to allow data integration through a variety of applications and Web services.
In order to achieve this goal FOAF has taken a different approach for data interchange.
It neither requires the user to specify anything about himself or herself or about others, nor limits what can be said about the user or the variety of semantic vocabularies that may be used.
Personal data are located in the category FOAF Basics. Personal Info contains
information like age, interests, etc. An important object property is knows, which
allows for the representation of interrelations between people. This property can be
very important for social networks. The category Online Accounts / IM contains the
identifiers used to connect to most extended chats. Projects and Groups contain the
projects and organizations the person belongs to. Documents and Images hold
references to documents or images browsed or in which the user has shown an interest.
In this category all the resources browsed by the user can be located. FOAF contains a
series of important concepts useful for any user model.
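For illustration, a FOAF-style profile can be written down as a handful of triples; the ex: identifiers are hypothetical, while foaf:name, foaf:age and foaf:knows are actual FOAF properties.

```python
# A hypothetical user description using FOAF properties as predicates.
profile = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:age", 34),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:knows", "ex:carol"),
]

def contacts(person, triples):
    """The people a given person foaf:knows."""
    return [o for (s, p, o) in triples if s == person and p == "foaf:knows"]
```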
1.2. SIOC
many occasions, the place where the sought information is found. SIOC is intended to
link online communities using semantic Web technologies to describe the information
online communities have about their structure and content. Developers can use SIOC to
express the information contained in online communities in a simple and extensible
way.
Other ontologies, like Dublin Metadata Core, FOAF, etc., can be mixed with
SIOC terms. The SIOC kernel defines classes like Community, which defines a
community, and Container, a general class parent of the class Forum. The class Item is
the parent of the classes Post, User, and so on.
SIOC developers have defined a basic kernel and new concepts or extensions to
existing ones have been added as modules. This way the kernel is kept simple and
legible. At present, SIOC provides three modules: Access, Services and Types. Access
models access permissions, roles, etc. This module defines the classes Permission and
Status. Types module contains classes that extend to types like Forum and Post. This
module contains a large set of specialized classes. Developers are encouraged to add
new classes to the ontology here.
It is common for online communities to publish Web service interfaces. These interfaces allow programmatic search, provided that services for content management are used, usually SOAP and/or RESTian. Classes to deal with these are included in the Service module.
User models are in general complex but at the same time need great flexibility. The
user model should represent user characteristics but also context characteristics. For
example, documents, activities or social interactions are important criteria to describe
the user. Semantic Web technologies seem to be the most suitable to satisfy user-model needs and requirements. ARII-eL (Adaptive, Informal and Interactive eLearning Ontology Network), the conceptual model defined in this paper, is a fully open and flexible logical model based on the composition of different vocabularies. The power
of this ontology allows users to:
x Develop a conceptual model independent of the application;
x Reuse public vocabularies;
x Extend the conceptual model with new vocabularies;
x Achieve interoperability concerns between the descriptions’ different
contents;
x Search information using semantic criteria;
x Enlarge the knowledge base with a reasoning feature, e.g., using the semantic
rules; and
x Manage system information and make recommendations also using the
semantic rule.
The definition of ontologies needs to follow basic elements of the methodology.
The first element consists of the development of an ontology network to represent the
diversity of data and content. In order to organize the ontology network, the network
kernel must be formed by generic concepts and the most important relations between
them. This approach allows a focus on the main concepts, the user in this case, and
extends them by adding specialized ontologies like a knowledge domain. There are also
other criteria to assure the quality of the defined ontologies and the semantic coherence
of the model.
The required ontologies must have different functionalities in the system; these
functionalities can be classified into three categories:
x Ontologies representing the conceptual model for content annotation;
x Ontologies that extend the knowledge base through the reasoning process;
x Ontologies used for information classification and personalization.
Annotation consists of the instantiation of the ontologies in semantic descriptions
(semantic annotation). Every new element of content or user action can be associated
with a description. Part of this description can be generated automatically by the
system (automatic annotation) and the other part can be made by the user (manual
annotation). Ontologies that contain the inferences made from annotations can also be
defined.
Finally, ontologies defining criteria for information personalization will also be
used. Inferences can be applied using explicit criteria selected by the user in his or her
profile.
The Activity ontology models the activities performed by the user. These activities can include social interactions, media interaction, etc. This ontology is also related to the domain ontology, a media ontology (if necessary), and an ontology related to user interaction. The goal is to annotate any activity performed by the user; to achieve this, any number of ontologies which model the user activities can be connected.
The Preference ontology has been defined to model all user preferences but those
related to accessibility. The Preference class is defined as the parent class of classes
like Cognitive Preference and Learning Preference. The Accessibility ontology
contains the Accessibility Preference class, which inherits from Preference and is the
parent of the classes Content Accessibility, Context Accessibility, Control Accessibility
and Display Accessibility. Audio Accessibility, Video Accessibility, Keyboard Accessibility, etc., inherit from Content Accessibility. Other classes could be added and
existing ones incremented according to the needs of an application. It does not make
sense to define a complete accessibility ontology, with all the associated effort, and
then to use a minimal part of it. So, a simple ontology should define basic concepts and
create a framework for future additions and modifications.
The modular definition of ARII-eL allows any ontology to be disconnected where
it is not useful for a given application. This should not affect the rest of the network.
This flexibility should boost applicability to other software projects.
with semantic rules to generate new information should be less complex than
an explicit model. In any case, the ARII-eL ontology network is applicable to
any of these approaches and it should be considered that at present almost all
models are a mix between both approaches.
x Short term vs. long term. It is clear that this aspect should not affect the ARII-
eL ontology network at all. The only difference is regarding the persistence
strategy: if the information is stored in working memory, then it is a short-
term model; if the information is stored in the disk, then it is a long-term
model.
x Another advantage of ARII-eL is how easy it is to extend it. This is because of
its modularity, which also gives a high degree of comprehensibility to the
ARII-eL ontology network. These characteristics should increase the
possibilities to apply the ontology network to a range of projects.
At present, the ARII-eL ontology network is being used as part of the project iLIME, a
software engine based on a conceptual and personalized learning model, L.I.M.E.,
which is based on four ponderable categories: Learning (L), Interaction (I), Mentoring
(M) and Evaluation (E) [10]. These contributions are the pillars for any learning
scenario in their formal and informal settings. Our approach provides the student with
adaptive tutoring and support via a simple and fine-grain configurable rule system. In
addition, students are monitored along the course of their interaction through the
eLearning platform, which efficiently gathers necessary inputs [11, 32] like actions,
decisions, grades, communication, and so on. By combining rules, tracking data,
categories and settings, students finally get personalized counselling about their
academic path.
The model also provides an added value from other recommender approaches [13,
34] in online education by delegating to the teacher/tutor/manager the following
actions: a) design of the rule set; b) distribution of a percentage contribution to each
category and setting; and c) configuration of site inputs and monitoring strategies. In
short, iLIME is a tutor-assisted framework for student guidance; as with other
recommender systems, its goal is to improve learning efficiency. The Recommendation
engine in iLIME, Meta-Mender [15], provides the management of the information and
knowledge of the user, which becomes the basis for adaptive recommendation. This
user tracking is taken from the Learning Management System, which hosts a
knowledge data-base to be used by expert users.
In the case of the iLIME project, the ARII-eL ontology network offers support for highly specialized communities. The main feature of this type of community is, in addition to the inherent specialization, which ranges from resident students to highly experienced surgeons, the almost complete absence of time that individuals have to share within the community. So, in this case implicit techniques are fundamental in order to get all
available information from the user while trying to interfere as little as possible in the
user’s interaction with the application. In this case the application of semantic inference
rules is the key factor to personalize the user model. Another important factor is the
presence of a massive number of media, the majority of them videos of surgical
operations. The low availability of time implies that users will in most cases look for
specific knowledge while skipping less interesting material; comments and posts added
by experienced users will also be of great importance, so the learning will be in a large
part informal. As long as ARII-eL is conceived of as a user model for applications with
an important component of adaptive, informal and interactive eLearning, it is suitable
for the iLIME application.
In order to run a validation of the ontology and the user model, we designed and
implemented an application case (learning scenario) of the ARII-eL ontology network
applied to the iLIME project from 2 to 29 July 2012. We used a graduate course (in
Spanish) on “Design and management of research projects”, in the Master of Science in
eLearning and Social Networks, an online, official Master’s degree at Universidad
Internacional de La Rioja (UNIR). This course took place over four weeks, with 49
enrolled students. All the students but one took part in the experiment. Therefore, we
had 48 graduate students, between 35 and 45 years old, from two countries (Spain, 45
students; Colombia, three students) and two continents (Europe, South America), with
a gender distribution of 28 females and 20 males. The support group consisted of a
teacher, an online tutor, an academic coordinator, and a director. In addition, other
cross-support departments might have provided some assistance (i.e., administrative,
legal, counselling, research, library, etc.). The environment was executed for every user
only if (s)he agreed with the terms described in a formal document, so that the
recording and tracking of their private data were explicitly authorized.
We split the base group into two equally distributed groups (24 members for each
group). Group A (experimental) was engaged with the ARII-eL ontology network and
received personalized recommendations based on a number of inputs, including
traditional ones (e.g., teacher, tutor, admin staff). Group B (control) followed the
course without ARII-eL, and received traditional support only. The distribution of
learners between Group A and Group B was based on previous academic records.
The aim of this implementation was to prove the validity of the conceptual model
in a self-contained way. It was not our purpose to insert any disruptive element into the
development of a subject along a timeline to show significant progress in learning
assets or results in relation to every learner. On the contrary, we tested the model in a
split classroom to retrieve and analyse the learners’ track records on Learning,
Interaction, Mentoring and Evaluation, so that we might demonstrate whether or not
the conceptual model was a valid option to provide personalized feedback that might
lead to an increase in user performance.
The application of the ARII-eL user model to the described scenario showed a clear
and positive progress of the users in Group A, those who received recommendations
supported by ARII-eL. The overall average of inputs, categories and students shows a
final positive difference of 10.53% between the experimental group and the control
group (66.72% - 56.19%), in addition to a maximum difference between corners of
37.37% (81.41% - 44.04%).
After the implementation of the learning scenario we distributed a questionnaire
designed for evaluation by users of the learning scenario. We collected responses from
21 users from the experimental group (Group A, n=21).
Scores followed a modified Likert scale, from 1 (strongly disagree) to 5 (strongly
agree); 0 meant “completely against”. The questionnaire combined five categories for a
total of 12 questions. Categories and Questions are shown in Figure 2.
Figure 2. Evaluation by end-users of the ARII-eL user model and the iLIME project. Score by question and
category
The questionnaire was aimed at extracting useful information from the guided
users in Group A, those who received the recommendations provided by the LIME
model.
The results of this survey show a clear approval of the recommendation approach
and a strong influence on personal performance. The overall average of 3.95 out of a
maximum of 5 shows 79.05% positive feedback (Figure 2).
By category, Content shows the highest score with 4.36 points out of 5 (87.14%),
while Adaptation shows the lowest with 3.62 points out of 5 (72.38%). The highest-
scored question was #8 (Content: “The recommendation length, was it appropriate?”)
with 89.52%. The lowest score appeared not only in the lowest category (#10,
Adaptation: “Do you think that the recommendation took all the related factors of your
contribution to the subject?”) with 65.71%, but also in the category Usefulness (#1,
Usefulness: “Would you implement this recommendation model as a general service at
this university?”), also with 65.71%. The other best-scored questions were #7 (Performance:
“Was the provided recommendation accurate and did it provide good counselling?”,
86.67%), #9 (Content: “The recommendation, was it properly written and easy to
understand?”, 84.76%), and #4 (Usability: “Did the recommendation focus your
attention on the screen?”, 83.81%).
These results yield a number of insights. Firstly, most of the users approved of
the user model and the experience. They found the learning scenario a valid model to
apply, and useful for their learning experience. In addition, they found the
provided recommendations and their presentation on the screen appropriate and
accurate. However, we conclude that the concept of adaptation might not have been
completely understood. Since the final scores of the evaluation questionnaire are
lower in questions related to adaptation, we think users’ expectations were not
completely met. Since they liked the model, the system and the experience, it is quite
likely that the definition of adaptation and/or of the guidance provided was not
explained well enough. Nevertheless, questions on Adaptation mostly scored 3,
4 or 5, which indicates a result with room for improvement but one that is nonetheless
remarkable.
4. Conclusions
In this paper, we have presented an ontology network for user modelling
focused on Adaptive, Informal and Interactive eLearning. The developed ontology
network is simple, modular and flexible. The simplicity comes from the fact that each
class has the most important properties necessary to represent the acquired knowledge.
The relations between classes are clear without large inheritance hierarchies. ARII-eL
is modular, because each ontology is functional by itself, and does not depend on other
ontologies to express its concepts and the relations between them. Each ontology
contains a set of classes with clear relations to each other. Finally, the flexibility is
explained by the ease of extending any class and the fact that new ontologies can be
added at any time without greatly affecting the overall network. Several types of user
models can be implemented with the proposed ontology-based model.
The presented ontology network will help fill a gap in user modelling related to the
support of applications with an important informal learning component. The
importance of social networks as a means to socialize and share knowledge and
experience must be taken into account by developers and designers of educational
applications. The lack of user models supporting this type of learning could hinder
the development of applications able to take advantage of this emerging trend.
For validation we designed and implemented a learning scenario at the
Universidad Internacional de La Rioja (UNIR), in the context of the official academic
Master’s programme of Science in eLearning and Social Networks, in July 2012. We
used the subject “Design and management of research projects” and a specific software
implementation called the iLIME project, supported by the LIME conceptual model.
The scope of this scenario covered scheduled, regular activities (e.g., knowledge
tests) and informal learning activities (e.g., user interaction in group debates), up to 30
different inputs. Over four weeks we took weekly measurements (milestones M8, M15,
M22, M29) from two groups of 24 students: experimental (A) and control (B).
The results of the application case showed positive progress over the four weeks,
with a final positive difference of 10.53% between the groups, and a maximum
difference of 37.37%, favouring the experimental group receiving support from
ARII-eL. In addition, we distributed an online questionnaire among the members of the
experimental group (A). The results showed clear support, with 79.05% satisfaction
among the 21 respondents. The results were concentrated in responses 4 and 5
(73.40%) on a modified Likert scale from 1 (completely disagree) to 5 (completely
agree), including the value 0 (completely against). The survey grouped a total of 12
questions in five categories: Usefulness, Usability, Performance, Content, and
Adaptation.
These results, from the users’ performance and the questionnaires, represent tangible
proof of the success of the ARII-eL user model and its implementation in the iLIME
application, based on a large number of objective measurements. Therefore, they back
up the conceptual design with practical experience.
5. Acknowledgements
We thank Emmanuel Jamin and Vicente Romero for their contribution to the original
conceptual work on the design of the ontology and the Meta-Mender recommendation
system. This work is partially funded by UNIR Research (http://research.unir.net),
Universidad Internacional de La Rioja (UNIR, http://www.unir.net, Spain), under the
Research Support Strategy [2013-2015], Research Group TELSOCK on Technology-
enhanced Learning and Social Networks.
References
[1] S.E. Middleton, N.R. Shadbolt and D.C. De Roure, Ontological user profiling in recommender systems,
ACM Transactions on Information Systems (TOIS), 22 (2004), 54–88.
[2] R. Denaux, V. Dimitrova and L. Aroyo, Interactive ontology-based user modeling for personalized
learning content management, AH 2004: Workshop Proceedings Part II, 338–347, 2004.
[3] RDF, http://www.w3.org/RDF/
[4] RDFS, http://www.w3.org/TR/rdf-schema/
[5] OWL, http://www.w3.org/TR/owl-features/
[6] SPARQL Query Language for RDF, http://www.w3.org/TR/rdf-sparql-query/
[7] SWRL, http://www.w3.org/Submission/SWRL/
[8] J. Breslin, A. Harth, U. Bojars and S. Decker, Towards semantically-interlinked online communities,
Proceedings of the 2nd European Semantic Web Conference, 2005.
[9] FOAF, http://xmlns.com/foaf/spec/20100101.rdf
Abstract. A system produces massive status data during its runtime, which
contains rich status information. In this work, we aim to detect system faults as
early as possible based on system status data sequences. Firstly, we formalize
system fault detection as a classification problem, in which different types of
status data are integrated to reflect the system status. Secondly, we devise a
detection method to predict the class of a status sequence before its full length is
available. Finally, a series of experiments is conducted to verify the proposed
method’s effectiveness.
Introduction
Over the last few decades, data-driven applications have received much attention
because the data collection and processing capabilities of computers have improved
enormously. For a complex system, massive status data is generated over time, which
contains rich information for diagnosing the system’s status. Based on such data, it is
possible for users to detect the system’s faults in time, or even to predict faults before
they happen.
In recent years, data-driven system fault detection techniques have been proposed for
maintaining systems, such as in [1,2,3]. More specifically, Neuhaus et al. [4] developed
Vulture to analyze the correlation of historical data on source-code modifications, bug
reports and software structure by mining a vulnerability database. Alhazmi et al. [5]
built a model to predict the number of undiscovered bugs by using the rate of
discovering bugs. For fault detection in software and hardware systems, constructing a
relationship graph between different artifacts is a common method, as in Hipikat [6]
and FRAN [7]. Traditional mechanism-analysis diagnosis methods depend on complex
nonlinear dynamic mathematical models, and therefore have limited timeliness and
accuracy for early fault diagnosis of complicated systems. In fact, system status data
constitutes various data sequences with respect to different systems or purposes.
Generally, a system’s status can be represented by values such as temperatures and CPU
workload. These values can be used independently for single components of the system,
or they can be merged into one value to reflect the system’s state. For
1 Corresponding Author: Yu-Ming LIN, Guilin University of Electronic Technology, No. 1, Jinji Road,
Qixing District, Guilin City, Guangxi Province, 541004, China; E-mail: ymlinbh@163.com
520 Y. Li and Y.-M. Lin / Early Prediction of System Faults
these cases, burst detection techniques such as sliding windows [8] can find the faults
effectively. However, some values, like temperatures, increase slowly, which makes the
traditional methods less effective. Furthermore, if faults can be predicted from early
harbingers, the system’s risk and damage can be reduced significantly.
In this work, we tackle the problem of predicting a system’s faults from its system
status data series. The diagnosis of system faults is treated as a classification problem
over the status data series, in which the label indicates whether there is a system fault or
not. In this scenario, a classifier trained on labeled samples can overcome the above
limitations. As time passes, the classifier makes online predictions of system faults
for the current status data series. In summary, our main contributions are as follows:
1. We formalize system fault detection as a supervised learning problem, in
which the objective is to predict the system status accurately and as early as possible.
2. A system fault prediction algorithm based on early classification is proposed,
by which system faults can be identified before the full-length series is
available.
3. Extensive experiments are conducted to verify the proposed method’s effectiveness.
1. Problem Statement
Assume S = {s1, s2, ..., sn} is the set of status data sequences. For convenience
of narration, we list the symbols used in this work in Table 1.
For each data sequence, we try to predict its class as early as possible. In other words,
we try to find a j as small as possible at which we can predict the data sequence’s class
accurately. Formally, our target can be stated as follows:

C^* = \arg\min_{C \in H,\, j \le Len_{s_i}} \sum_{i=1}^{n} loss(C(s_i[1 \ldots j]), L_{s_i})   (1)

where H is the set of classification hypotheses, C(x) denotes the prediction made for
sample x, and loss is a predefined loss function such as the 0-1 loss or the hinge loss.
A system’s status can be reflected by various factors, such as temperature and workload.
However, these factors are measured in different units. For example, the CPU’s
temperature could be 30 degrees Celsius at a certain time point, while the CPU’s
workload could be 70% at the same time. In order to integrate the various factors into
one value, the first step is to normalize them; in this work, we use Z-score normalization.
Moreover, each factor plays a different role in predicting the system’s overall status.
Weight parameters are used to regulate the contributions of the different factors, and
they need to be tuned by experts experimentally. Thus, a value of the status data
sequence can be calculated by the following formula:
s_i^j = \sum_{k=1}^{n} \theta_k \frac{a_k^j - \mu_k}{\sigma_k}   (2)

where \theta_k is the parameter tuning the k-th indicator’s weight, a_k^j is the j-th
value of the k-th indicator, \mu_k is the mean of the k-th indicator’s values, and
\sigma_k is the standard deviation of the k-th indicator.
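As an illustration, the weighted z-score merge of Eq. (2) can be sketched as follows (a minimal sketch; the indicator names, readings and weights are hypothetical, and in practice the weights θ_k are tuned by experts):

```python
import statistics

def merge_status(indicators, weights):
    """Merge per-indicator readings into one status value per time step,
    following Eq. (2): each indicator is z-score normalized over its own
    history and then combined using its weight theta_k."""
    mus = [statistics.mean(x) for x in indicators]
    sigmas = [statistics.pstdev(x) for x in indicators]
    merged = []
    for j in range(len(indicators[0])):
        merged.append(sum(w * (x[j] - mu) / sigma
                          for x, w, mu, sigma
                          in zip(indicators, weights, mus, sigmas)))
    return merged

# Hypothetical readings: CPU temperature (deg C) and CPU workload (fraction)
temps = [30.0, 40.0, 50.0]
loads = [0.5, 0.7, 0.9]
status = merge_status([temps, loads], weights=[0.6, 0.4])
```

Since both hypothetical indicators rise monotonically here, the merged value rises as well; in a real deployment one sequence value would be produced per sampling step.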
If the system status is made up of many status data sequences of fixed length, predicting
the system status can be treated as a binary classification problem. For example, the
label "1" means the system works normally, and "-1" means the system might not work properly.
In such a scenario, existing classification methods can be used to make predictions.
The nearest neighbor algorithm is one of the most frequently used algorithms for
classification, in which a sample’s label is determined by the samples closest to it. This
method is simple, parameter-free, and does not require feature selection or discretization.
However, it cannot make a prediction until the fixed-length sequence has been generated
completely, whereas some system statuses can be diagnosed from only part of the status
data. Based on early classification [9], we propose a technique to predict the system’s
status as early as possible.
Let N^l_{s_i} be the set of the data sequence s_i’s nearest neighbors in the training set
Tr, that is, N^l_{s_i} = \{t \mid t = \arg\min_{t \in Tr} dist(t[1, l], s_i[1, l])\},
where s_i[1, l] is the sequence s_i’s prefix subsequence of length l, and dist(a, b) is a
distance function measuring the distance between two data sequences, such as the
Euclidean distance. Let R^l(t) be the set of reverse nearest neighbors of t[1, l], i.e. the
sequences that treat t[1, l] as their nearest neighbor: R^l(t) = \{t' \mid t \in N^l_{t'}\}.
Based on the definitions above and the conclusions in [9], the data sequence s_i’s
Minimum Prediction Length is MPL(s_i) = k if for any l (k \le l \le n),
R^l(s_i) = R^n(s_i) \ne \emptyset and R^{k-1}(s_i) \ne R^n(s_i), where n is the full
length of data sequence s_i. Then, we can make a prediction for a testing sequence with
the 1-nearest neighbor classification algorithm as soon as its length reaches the MPL
value. However, this method places strict requirements on the stability of a training
sample’s reverse nearest neighbor set after the time point MPL. Moreover, it is prone to
overfitting the training set.
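The MPL computation and the resulting early 1-NN prediction can be sketched as follows (a simplified illustration of the idea from [9], not the authors' exact implementation; the training sequences, labels and test prefixes are hypothetical):

```python
import numpy as np

def nn_index(pool, seq, l, skip=None):
    """Index in `pool` of the sequence whose length-l prefix is closest
    (Euclidean) to seq's length-l prefix, optionally skipping one index."""
    best, best_d = None, float("inf")
    for i, t in enumerate(pool):
        if i == skip:
            continue
        d = np.linalg.norm(t[:l] - seq[:l])
        if d < best_d:
            best, best_d = i, d
    return best

def reverse_nn(train, idx, l):
    """Training sequences that take `idx` as their 1-NN on length-l prefixes."""
    return {i for i in range(len(train)) if i != idx
            and nn_index(train, train[i], l, skip=i) == idx}

def mpl(train, idx, n):
    """Minimum Prediction Length: smallest k such that the reverse-NN set
    stays equal to the (non-empty) full-length set for every l >= k."""
    rn = reverse_nn(train, idx, n)
    if not rn:
        return n
    k = n
    for l in range(n - 1, 0, -1):
        if reverse_nn(train, idx, l) != rn:
            break
        k = l
    return k

# Hypothetical tiny training set of length-4 status sequences
train = [np.array([0.0] * 4), np.array([1.0] * 4),
         np.array([5.0] * 4), np.array([6.0] * 4)]
labels = ["normal", "normal", "abnormal", "abnormal"]
mpls = [mpl(train, i, 4) for i in range(4)]

def early_predict(prefix, l):
    """Emit the 1-NN label once the matched neighbour's MPL is reached."""
    j = nn_index(train, prefix, l)
    return labels[j] if mpls[j] <= l else None
```

On streaming data, `early_predict` would be called each time a new value arrives, returning `None` until the matched neighbour's MPL is reached.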
To improve robustness, a sequence’s label should be generated by a cluster of
samples rather than by a single sample. A clustering algorithm can therefore be used to
group the sequences. The MPL of a cluster G of n-length sequence data is k if, for any
l \ge k, R^l(G) = R^n(G) and G is 1-nearest-neighbor consistent [10], while for
l = k - 1 the first two conditions do not hold simultaneously [9]. A testing sample
belonging to a cluster can then be predicted by the dominant label in the cluster
according to the cluster’s MPL.
3. Experiments
We constructed a data set to verify the proposed method’s effectiveness, based on
computers’ status data covering several factors: the CPU temperature, the CPU
workload, the hard disk rotational speed, the graphics card temperature and the memory
usage. We measured and recorded the values of these factors every five seconds.
According to Eq. (2), we merged the factor values into one value to describe the
system’s status. Further, we constructed a system status sequence with
fifty continuous merged values. Each sequence was labeled "normal" or "abnormal"
manually. In total, the data set consists of 6000 labeled system status sequences, of
which 644 are labeled "abnormal".
To measure the effectiveness of system fault detection, accuracy (ACC) is used as an
evaluation indicator; it reflects the proportion of correct predictions. However,
considering prediction accuracy alone is not enough, because the abnormal sequences
form only a tiny proportion of all sequences: if we predicted every sequence to be
normal, the accuracy would still be very high, yet completely useless for users. Since
our target is to find the abnormal sequences, we also treat precision (Pf), recall (Rf)
and F1-score (F1) on the abnormal sequences as important indicators.
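For clarity, these indicators can be computed as follows (a self-contained sketch; the label convention, with -1 marking an abnormal sequence, and the sample labels are hypothetical):

```python
def fault_metrics(y_true, y_pred, fault=-1):
    """Overall accuracy, plus precision/recall/F1 restricted to the fault
    class. Focusing on the fault class matters because abnormal sequences
    are rare: predicting everything 'normal' still yields high accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == fault and p == fault for t, p in pairs)  # true positives
    fp = sum(t != fault and p == fault for t, p in pairs)  # false alarms
    fn = sum(t == fault and p != fault for t, p in pairs)  # missed faults
    acc = sum(t == p for t, p in pairs) / len(pairs)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, prec, rec, f1

acc, pf, rf, f1 = fault_metrics([1, 1, 1, -1, -1], [1, 1, -1, -1, 1])
```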
In the first experiment, we compared EPSS with the full-length 1-nearest neighbor
(1NN); the results are shown in Table 2. Both EPSS and 1NN make correct predictions
for most samples. However, 1NN cannot make a prediction until the status data has been
generated completely, whereas EPSS’s average prediction length is 39.61. This is very
important for system maintenance, because we learn the system’s status earlier.
Especially in cases of system fault, maintainers can diagnose and troubleshoot the
system earlier, which significantly reduces the risk caused by the faults. Moreover, we
investigated the predictions on samples labeled "abnormal", since such samples are
crucial for the system and make up a tiny percentage of all samples. EPSS also achieved
high prediction precision and recall with a small prediction length, which means the
proposed method is effective at predicting the system’s faults.
Table 2. The prediction effectiveness comparison between the 1-nearest neighbor and the EPSS
The first experiment shows that EPSS works well given enough samples. The
second experiment therefore focused on the influence of the number of training samples,
which was increased from 200 to 1000; we investigated how the effectiveness of EPSS
changed. Overall, the prediction effectiveness improves as the number of training
samples increases. Thus, as more and more training samples are collected, the proposed
method achieves better effectiveness for system status prediction. On the other hand, the
third line of Table 3 shows the average time taken to make a prediction for one sample.
The prediction process is approximately real-time, which helps system managers
diagnose the system status as early as possible.
Table 3. EPSS effectiveness with different numbers of training samples

# of training samples    200    300    400    500    600    700    800    900    1000
ACC                      0.963  0.972  0.977  0.978  0.983  0.984  0.987  0.987  0.991
prediction time (sec.)   0.013  0.022  0.031  0.038  0.055  0.060  0.069  0.071  0.078
Rf                       0.874  0.881  0.914  0.943  0.925  0.932  0.953  0.949  0.951
Pf                       0.802  0.867  0.880  0.867  0.892  0.922  0.929  0.937  0.965
F1                       0.836  0.874  0.897  0.904  0.908  0.927  0.940  0.943  0.958
4. Conclusion
It is very important for maintainers to predict and diagnose a system’s faults early,
which significantly reduces the risk. In this work, we treated system status prediction
as a sequence classification problem. We then proposed a framework to predict system
faults, by which status predictions can be made as early as possible. Moreover, we
constructed a real-world data set and carried out a series of experiments to verify the
proposed method’s effectiveness. As future work, we will further reduce the proposed
method’s sensitivity to noisy data and explore a parallel solution for system status
prediction based on the MapReduce framework.
Acknowledgment
This work is supported by the Guangxi Key Laboratory of Automatic Detecting
Technology and Instruments (No. YQ14109), NSFC grants (No. 61562014, U1501252
and 61262008), the Guangxi Natural Science Foundation (No. 2015GXNSFAA139303),
the project of the Guangxi Key Laboratory of Trusted Software, the Guangxi colleges
and universities high-level innovation team and outstanding scholars program, and the
program for innovative research teams of Guilin University of Electronic Technology.
References
[1] A.S. Raj, N. Murali, Early classification of bearing faults using morphological operators and fuzzy
inference, IEEE Transactions on Industrial Electronics, 2 (2013), 567–574.
[2] S. Yin, G. Wang, H.R. Karimi, Data-driven design of robust fault detection system for wind turbines,
Mechatronics, 4 (2014), 298–306.
[3] N. Subrahmanya, Y.C. Shin, A data-based framework for fault detection and diagnostics of non-linear
systems with partial state measurement, Engineering Applications of Artificial Intelligence, 1 (2013),
446–455.
[4] T. Zimmermann, P. Weissgerber, S. Diehl, A. Zeller, Mining version histories to guide software changes,
IEEE Transactions on Software Engineering, 6 (2005), 429–445.
[5] O. Alhazmi, Y. Malaiya, I. Ray, Security vulnerabilities in software systems: A quantitative perspective,
IFIP Annual Conference on Data and Applications Security and Privacy, 2005, 281–294.
[6] D. Čubranić, G.C. Murphy, J. Singer, et al., Hipikat: A project memory for software development, IEEE
Transactions on Software Engineering, 6 (2005), 446–465.
[7] Z.M. Saul, V. Filkov, P. Devanbu, C. Bird, Recommending random walks, Proceedings of the 6th Joint
Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the
Foundations of Software Engineering, 2007, 15–24.
[8] Y. Zhu, D. Shasha, Efficient elastic burst detection in data streams, Proceedings of the Ninth
International Conference on Knowledge Discovery and Data Mining, 2003, 336–345.
[9] Z. Xing, J. Pei, S.Y. Philip, Early prediction on time series: A nearest neighbor approach, Twenty-First
International Joint Conference on Artificial Intelligence, 2009, 1297–1302.
[10] C. Ding, X. He, K-nearest-neighbor consistency in data clustering: incorporating local information into
global optimization, Proceedings of the 2004 ACM Symposium on Applied Computing, 2004, 584–589.
Fuzzy Systems and Data Mining II 525
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-525
Abstract. Though node mobility allows quick network setup for Mobile Ad Hoc
Networks (MANETs), it can lead to route failure if improper movements are taken.
Therefore, how to maintain the reliability of routes has always been a challenge in
MANETs. Since routes are composed of relaying nodes hop by hop, selecting
reliable relay nodes is of great importance. Existing routing protocols mainly use
the hop count or Received Signal Strength (RSS) as the metric, which ignores
interference from other nodes and cannot indicate link quality well. To address this
problem, we propose an extension of the OLSR protocol, named QoS aware
Hierarchical Routing Protocol (QHRP), which replaces OLSR’s relay-node
selection policy. Specifically, we use a new metric combining the estimated
signal-to-interference-plus-noise ratio (SINR) and link duration (LD) instead of the
hop count used in OLSR. Since SINR considers the accumulated interference from
neighbor nodes, and LD considers the lifetime of each link, routes generated by
QHRP are expected to be more reliable. Extensive simulations show that QHRP
achieves outstanding performance in terms of route calculation, overhead and
packet drop ratio.
Introduction
1
Corresponding Author: Yan-Ling WU, Dongguan University of Technology, Dongguan, Guangdong,
523808, China; E-mail: wu_yanling@hotmail.com.
526 Y.-L. Wu et al. / QoS Aware Hierarchical Routing Protocol
only when desired. Once a route has been created, it is maintained by a route
maintenance procedure during data delivery; examples include Dynamic Source Routing
(DSR) for Mobile Ad Hoc Networks for IPv4 [4] and Ad hoc On-demand Distance
Vector (AODV) routing [5]. However, the latency to determine a route can be quite
significant if no route is available between the source node and the destination node.
Quality of Service (QoS) has always been a challenge in MANETs due to interference
and node mobility. Throughput, end-to-end latency, lifetime, available bandwidth and
packet delivery ratio are usually evaluated as the key QoS parameters [6-23]. In this
paper, we propose an extension of the OLSR protocol, named QoS aware Hierarchical
Routing Protocol (QHRP), which replaces OLSR’s relay-node selection policy.
Specifically, we use a new metric combining the estimated signal-to-interference-plus-
noise ratio (SINR) and link duration (LD) instead of the hop count used in OLSR. Since
SINR considers the accumulated interference from neighbor nodes, and LD considers
the lifetime of each link, paths generated by QHRP are expected to be more reliable.
Only nodes whose SINR is higher than a given threshold can become candidates for
relay nodes, and the candidate with the longest LD wins the selection. Similar to OLSR,
once the relay nodes (hereafter renamed parent-nodes in QHRP) are elected according
to our new policy, Topology Control (TC) messages are generated and diffused by the
parent-nodes in the network. This strategy allows more reliable routes to be established
and eliminates most of the flooding overhead in MANETs.
The rest of the paper is organized as follows. In Section 1, related works are reviewed.
The details of QHRP are given in Section 2, followed by the performance evaluation
in Section 3. Finally, we conclude the paper in Section 4.
1. Related Works
In this section, we review the existing routing protocols for MANETs from the
perspective of QoS, and summarize them into two categories: strategies based on
reservation of available bandwidth [7, 8], [18], [24, 25]; and strategies based on link
state [15, 16], [19, 20], [23].
Guimarães et al. [7] introduced a mechanism called Bandwidth Reservation over
Ad hoc Wireless Networks (BRAWN), where the available bandwidth was calculated
by each node of the network to estimate the suitable rate allocation. A cross-layer
routing protocol was proposed in [8], which applied two different methods, "Listen" and
"Hello", to estimate residual bandwidth, then integrated a QoS-aware mechanism into
the route discovery procedure and provided feedback to the application. Lei et al.
[18] proposed an available bandwidth estimation method for MANETs that considers
concurrent transmissions.
Rubin and Liu [15] studied in depth the statistical properties of link stability under
four movement patterns, i.e. the Random Destination, Random Walk, Random Movement
and Swarm Movement patterns. The lifetime distribution of links was analyzed and
different models were developed for the different movement patterns. Al-Akidi et al.
[19] proposed three schemes offering a novel mechanism for establishing and
maintaining routes based on the on-demand routing technique. Heading direction
information was applied instead of GPS information, which was not always available.
2. System Model
SINR_i = \lg \frac{RSS_i}{\sum (\text{total RSS excluding that from } i)}   (1)

RSS_i = \frac{P_{T_i} \cdot G_i \cdot G_j \cdot \lambda^2}{4\pi^2 \cdot d_{i,j}^2 \cdot L}   (2)

where P_{T_i} is the transmission power of node i, G_i and G_j are the gains of the
antennas of i and j, respectively, and \lambda, L and d_{i,j} represent the wavelength,
the system loss, and the distance between i and j, respectively.
Based on Eqs. (1) and (2), for the node j, SINR_i can be written as:

SINR_i = \lg \frac{P_{T_i} \cdot G_i \cdot G_j \cdot \lambda^2}{4\pi^2 \cdot d_{i,j}^2 \cdot L \cdot \sum (\text{total RSS excluding that from } i)}   (3)
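As a numeric illustration, Eqs. (1)-(3) can be sketched as follows (this follows the paper's 4π² denominator rather than the standard (4π)² Friis form; all parameter values below are hypothetical):

```python
import math

def rss(pt, gi, gj, lam, d, L=1.0):
    """Received signal strength per Eq. (2): pt is the transmit power,
    gi/gj the antenna gains, lam the wavelength, d the distance, L the
    system loss."""
    return pt * gi * gj * lam ** 2 / (4 * math.pi ** 2 * d ** 2 * L)

def sinr(rss_i, rss_others):
    """Eq. (1)/(3): log-ratio of node i's RSS at the receiver to the
    summed RSS of all other heard transmitters (the interference)."""
    return math.log10(rss_i / sum(rss_others))

lam = 3e8 / 2.4e9                            # wavelength at 2.4 GHz (~0.125 m)
wanted = rss(1e-3, 2.0, 2.0, lam, 50.0)      # signal from node i at 50 m
others = [rss(1e-3, 2.0, 2.0, lam, d) for d in (120.0, 150.0)]  # interferers
value = sinr(wanted, others)
```

Note that with identical transmit powers and gains the common factors cancel, so the SINR here depends only on the relative distances of the wanted transmitter and the interferers.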
Since the coordinates of each node are known in QHRP, the LD between two one-hop
neighbor nodes i and j can be estimated following the proposition introduced by Su et
al. [26]. Let v_i and v_j be the speeds, and \theta_i and \theta_j
(0 \le \theta_i, \theta_j \le 2\pi) the moving directions, of i and j.
The HELLO message is extended to carry the SINR and the LD between one-hop
neighbors in QHRP. Nodes whose SINR is greater than SINR_thr are considered
candidates for parent-node. Once the candidates are determined, the one with the longest
LD is elected as the parent-node.
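The election policy just described can be sketched as follows (a minimal sketch; the neighbor tuples stand in for values gathered from extended HELLO messages, and the identifiers and threshold are hypothetical):

```python
def elect_parent(neighbors, sinr_thr):
    """Parent-node election in QHRP: keep only neighbors whose SINR
    exceeds the threshold, then pick the candidate with the longest
    estimated link duration (LD)."""
    candidates = [(nid, s, ld) for nid, s, ld in neighbors if s > sinr_thr]
    if not candidates:
        return None          # no neighbor qualifies as a relay
    return max(candidates, key=lambda c: c[2])[0]

# (node, SINR, LD) as reported in HELLO messages -- hypothetical values
neighbors = [("g", 5.0, 12.0), ("h", 2.0, 30.0), ("e", 6.0, 20.0)]
parent = elect_parent(neighbors, sinr_thr=4.0)
```

Here "h" has the longest LD but fails the SINR threshold, so "e" wins among the remaining candidates.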
Figure 2(a). Format of the HELLO message sent by i. The header carries the Reserved,
Htime and Willingness fields, followed by the Link Code, Reserved and Link Message
Size fields; for each one-hop neighbor x (here g, h, e and j), the message then carries
the Interface Address of x, the SINR of x, and LD_{i,x}.
Once the parent-node is determined, a Parent_Update message is generated and
sent to the parent-node by each child-node. The Parent_Update message allows each
parent-node to collect information about its children-nodes. Afterwards, information
is exchanged between each child-node/parent-node pair during Htime.
The format of the Parent_Update message is illustrated in Figure 3.

Figure 3. Format of the Parent_Update message: Parent_Election_Timer, Htime and Reserved fields.
In QHRP, the main function of parent-nodes is to relay messages between nodes
and to establish suitable routes to the destination node. Each parent-node periodically
diffuses the list of its children-nodes by broadcasting TC messages, which allow the
routing information in the network to be created and maintained. TC messages can only
be generated and reconstructed by parent-nodes. A modified TC message is introduced
in QHRP, which includes the SINR and LD. This modification helps discover more
stable routes. The format of the modified TC message is illustrated in Figure 4.

Figure 4. Format of the modified TC message: ANSN and Reserved fields, followed, for each advertised
child-node, by its Main Address with the corresponding SINR and LD.
3. Performance Evaluation
Kitasuka and Tagashira [27] proposed a method (denoted Shared MPR) to reduce
traffic by finding more shared MPRs to decrease the MPR ratio, defined as the number
of MPR nodes divided by the total number of nodes in the network. Consequently,
fewer TC messages are generated and the overhead in the network is reduced. To
reduce the number of TC messages, the MPR computation algorithm is modified: each
node tries to select as an MPR a node that has already been selected as an MPR by its
neighbors, provided the size of the MPR set is not larger than that of the conventional
MPR selection.
To demonstrate the effectiveness of QHRP, the performance evaluation is carried
out in comparison with the Shared MPR selection.
The main simulation parameters are listed in Table 1.
Simulations are carried out with Matlab 7.0.4 in terms of:
• overhead: the ratio of the number of control packets to the total number of
packets generated in the network;
• packet drop ratio: the ratio of the number of lost packets to the total number of
packets generated in the network.
Table 1. Main simulation parameters

Parameter                    Value
network size                 1000 m × 1000 m
mobility model               RWP
pause time                   5 s
HELLO interval               2 s
TC interval                  5 s
hold time                    15 s
physical layer               802.11
transmit power               1 mW
gain of antenna              3 dBi
radio frequency              2.4 GHz
transmission range of node   150 m
SINRthr                      -91 dBm
velocity of nodes            uniform [1, 10] m/s
number of nodes              100
simulation time              3000 s
The comparison of overhead for the two protocols is illustrated in Figure 5.

Figure 5. Overhead (%) of QHRP and Shared MPR versus velocity of nodes (m/s).
It is well known that for OLSR extensions, reducing the size of the MPR set helps
reduce the protocol overhead. In theory, the method of Kitasuka and Tagashira [27]
should outperform QHRP, since its purpose is to minimize the size of the MPR set by
finding more shared MPRs in the network. However, as shown in Figure 5, its
performance is not actually better than QHRP’s. When the velocity of nodes is greater
than 5 m/s, QHRP performs much better than Shared MPR. The main reason is that the
nodes are assumed to be static in Kitasuka and Tagashira [27], so the created routes are
vulnerable once nodes move. As nodes move more rapidly, more control packets are
generated to maintain and re-discover the network topology. In contrast, the routes
created in QHRP are more stable, since mobility has already been taken into account: in
QHRP, route establishment is based not only on the SINR between two nodes, but also
on the estimated link duration between them.
The evaluation of the packet drop ratio for the two protocols is shown in Figure 6.

Figure 6. Packet drop ratio (%) of QHRP and Shared MPR versus velocity of nodes (m/s).
The routes established by QHRP are more reliable than those of shared MPR, since SINR and LD are considered during route establishment.
4. Conclusion
Route failure resulting from node mobility has always been a challenge in MANETs. Many extensions of the link-state routing protocol OLSR have been proposed. However, since interference cannot be predicted, selecting relay nodes mainly by counting hops, as OLSR does, does not yield reliable routes in MANETs. In this paper, a QoS-aware Hierarchical Routing Protocol (QHRP) is proposed. In QHRP, we use a new metric combining SINR and LD to replace the traditional hop count used in OLSR, and exchange these metrics among one-hop neighbors via periodically broadcast HELLO messages. Upon receiving HELLO messages from its one-hop neighbors, a node considers only those neighbors whose SINR exceeds a predefined threshold as relay candidates, and selects the candidate with the longest LD as its parent node. Then, as in OLSR, TC messages are diffused by parent nodes. Extensive simulation results confirm the effectiveness of the proposed approach in terms of routing-table calculation, overhead, and packet drop ratio.
Acknowledgements
References
[1] E. M. Royer, C.-K. Toh. A review of current routing protocols for ad hoc mobile wireless networks. IEEE Personal Communications 6 (1999), 46-55.
[2] C. E. Perkins, P. Bhagwat. Highly dynamic destination-sequenced distance-vector routing (DSDV) for mobile computers. ACM SIGCOMM (1994), 234-244.
[3] S. Murthy, J. J. Garcia-Luna-Aceves. A routing protocol for packet radio networks. ACM MobiCom (1995), 86-94.
[4] D. Johnson, Y. Hu, D. Maltz. The Dynamic Source Routing Protocol (DSR) for Mobile Ad Hoc
Networks for IPv4. IETF RFC 4728, Feb. 2007.
[5] C. E. Perkins, E. M. Royer. Ad hoc on-demand distance vector routing. IETF RFC 3561, 2003.
[6] T. B. Reddy, I. Karthigeyan, B.S. Manoj, C. Siva Ram Murthy. Quality of service provisioning in ad hoc
wireless networks: a survey of issues and solutions. Ad Hoc Networks 4 (2006), 83-124.
[7] R. Guimarães, L. Cerdà, José M. Barceló, J. García, M. Voorhaen, C. Blondia. Quality of service through
bandwidth reservation on multirate ad hoc wireless networks. Ad Hoc Networks 7 (2009), 388-400.
[8] L. Chen, W. B. Heinzelman. QoS-aware routing based on bandwidth estimation for mobile ad hoc networks. IEEE Journal on Selected Areas in Communications 23 (2005), 561-572.
[9] M. Xie, M. Haenggi. Towards an end-to-end delay analysis of wireless multihop networks. Ad Hoc
Networks 7 (2009), 849-861.
[10] S. Kajioka, N. Wakamiya, H. Satoh, K. Monden, M. Hayashi, S. Matsui, M. Murata. A QoS-aware
routing mechanism for multi-channel multi-interface ad-hoc networks. Ad Hoc Networks 9 (2011), 911-
927.
[11] J. J. Liu, X. H. Jiang, H. Nishiyama, N. Kato and X. M. (Sherman) Shen. End-to-End Delay in Mobile
Ad Hoc Networks with Generalized Transmission Range and Limited Packet Redundancy. IEEE
Wireless Communications and Networking Conference (2012), 1731-1736.
[12] C. K. Toh, A. N. Le and Y.Z. Cho. Load balanced Routing Protocols for Ad Hoc Mobile Wireless
Networks. IEEE Communications Magazine 47 (2009), 78-84.
[13] I. T. Haque. On the Overheads of Ad Hoc Routing Schemes. IEEE System Journal 9 (2014), 605-614.
[14] Q. Xue, A. Ganz. Ad hoc QoS on-demand routing (AQOR) in mobile ad hoc networks. Journal of
Parallel and Distributed Computing 63 (2003), 154-165.
[15] I. Rubin, Y.-C. Liu. Link stability models for QoS ad hoc routing algorithms. IEEE 58th VTC Fall
(2003), 3084-3088.
[16] N. Sarma, S. Nandi. Route Stability Based QoS Routing in Mobile Ad Hoc Networks. Wireless
Personal Communication 54 (2010), 203-224.
[17] C. W. Yu, T. K. Wu, R. H. Cheng. A low overhead dynamic route repairing mechanism for mobile ad
hoc networks. Computer Communications 30 (2007), 1152-1163.
[18] L. Lei, T. Zhang, L. Zhou, X. M Chen, C.F. Zhang, and C. Luo. Estimating the Available Medium
Access Bandwidth of IEEE 802.11 Ad Hoc Networks with Concurrent Transmissions. IEEE
Transactions on vehicular technologies 64 (2015), 689-701.
[19] M. Al-Akaidi and M. Alchaita. Link stability and mobility in ad hoc wireless networks. IET
communications 1 (2007), 173-178.
[20] A. Moussaoui, F. Semchedine, A. Boukerram. A link-state QoS routing protocol based on link stability for Mobile Ad hoc Networks. Journal of Network and Computer Applications 39 (2014), 117-125.
[21] A. Nayebi, H. Sarbazi-Azad. Analysis of link lifetime in wireless mobile networks. Ad Hoc Networks
10 (2012), 1221-1237.
[22] C. Ma, Y. Y. Yang. A Battery-Aware Scheme for Routing in Wireless Ad Hoc Networks. IEEE
Transactions on vehicular technology 60 (2011), 3919-3932.
[23] T. Clausen, P. Jacquet. Optimized Link State Routing Protocol (OLSR). IETF RFC 3626, Oct. 2003.
[24] R. Belbachir, Z. M. Maaza, and A. Kies. The mobility issue in admission controls and available
bandwidth measures in MANets. Wireless Personal Communication 70 (2013), 743–757.
[25] C. Sarr, C. Chaudet, G. Chelius, and I. G. Lassous. Bandwidth estimation for IEEE 802.11-based ad
hoc networks. IEEE Transactions on Mobile Computing 7 (2008), 1228–1241.
[26] W. Su, S.-J. Lee, and M. Gerla. Mobility prediction and routing in ad hoc wireless networks. International Journal of Network Management 11 (2001), 3-30.
[27] T. Kitasuka, S. Tagashira. Finding More Efficient Multipoint Relay Set to Reduce Topology Control
Traffic of OLSR. IEEE 14th International Symposium and Workshops on a World of Wireless, Mobile
and Multimedia Networks (WoWMoM), (2013), 1-9.
Fuzzy Systems and Data Mining II 535
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-535
Abstract. Taking Sina meteorological microblogs as the data resource, with a focus on the Sina Shaanxi meteorological microblog, this paper presents the design of a system for extracting public opinion on hot microblog topics. We describe the system's overall workflow in detail, covering data extraction, the word segmentation system, and the hot-topic extraction method, and discuss the problems encountered during implementation and the aspects to be improved. As a next step, analysis of images will be added to obtain more accurate hot-topic extraction results.
Introduction
1 Corresponding Author: Fang REN, No. 36, North Commissioner Main Street, Lianhu District, Xi'an City, Shaanxi Province, China; E-mail: renfang829200@163.com.
536 F. Ren et al. / The Design and Implementation of Meteorological Microblog Public Opinion
1. Process of Discovery
The system extracts raw data from the microblog in real time using the API [2], applies a simple filter to every microblog message, and keeps only the text of the message and its forwarded content for word segmentation. The numbers of comments and forwards of the text, and of the forwarded content, are counted, and their statistical weights are then combined through an interdependency analysis. The word segmentation system, filtering system, and word segmentation algorithm are adapted from the open word segmentation system shootseg [3] developed by the ShootSun studio. Next, word-frequency statistics are computed for all topic words, and the topic words are sorted by their heat. The interdependency of the hot words is then analyzed; the analysis method is introduced briefly below. The output includes topic words, the words related to each topic word, and the related microblogs. The prototype system treats every hot word as a topic word and, by comparing the related words with the related microblogs, presents an intuitive check on the correctness of each topic word. The process is presented in Figure 1.
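The counting-and-ranking stage above can be sketched as follows. This is our illustration, not the paper's code: the weighting of comments and forwards, the `alpha`/`beta` coefficients, and the message format are all assumptions; the paper only states that comment and forward counts feed into the statistical weights.

```python
# Illustrative sketch of the hot-topic pipeline: count topic-word occurrences
# over segmented microblog texts, weight each message by its comment and
# forward counts (assumed linear weighting), and sort words by "heat".
from collections import Counter

def heat_ranking(messages, alpha=1.0, beta=1.0):
    """messages: list of (words, n_comments, n_forwards) tuples."""
    heat = Counter()
    for words, n_comments, n_forwards in messages:
        weight = 1.0 + alpha * n_comments + beta * n_forwards
        for w in set(words):          # each word counted once per message
            heat[w] += weight
    return heat.most_common()         # topic words sorted by heat

messages = [
    (["haze", "warning"], 10, 5),
    (["haze", "rainstorm"], 2, 1),
    (["rainstorm"], 0, 0),
]
print(heat_ranking(messages)[0][0])   # hottest topic word → haze
```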
The Sina microblog open platform provides plenty of APIs for download, adapted to various language environments. The official SDKs are developed by Sina itself with full official technical support and can perform all microblog-handling functions. The advantage of the official SDKs is their powerful functionality, but their disadvantage is equally obvious: their language support is poor, and C#, the programming language used by the system in this paper, is not supported by them, so a third-party SDK had to be adopted.
After comparing many APIs provided by the Sina microblog open platform, and considering the language environment of this paper, we used the Sina Microblog API together with a third-party C# SDK [4] shared by the netizen LinXuanchen. The third-party SDK adapts the original API without losing many functions; its complete functionality and help documentation were of great help and convenience to this work.
By registering as a developer on the Sina microblog platform and registering the application, the system acquired the basic information needed in practice: the App Key, the App Secret, and the authorized callback page. This information is an essential precondition for the applications developed on the system to connect successfully to the microblog server.
The main application modules built on the Sina microblog API are the login module and the information extraction module. The login module performs identity authentication of users on the open platform and obtains the verification result of the current user's login. The information extraction module exploits the fact that every account corresponds to one ID to obtain the user's personal information (avatar, nickname, labels, etc.), the latest microblogs of the current user and of the users the current user follows, and information on the sender of the current microblog status.
With the word segmentation dictionaries loaded into memory beforehand, the system reads the text line by line and judges the type of each item, using '/' as the delimiter. It first checks whether the current character is whitespace; whitespace is treated as noise and filtered out directly, while digits and letters are segmented as words. It then determines the type of the word; if the word cannot be matched against the phrase dictionary after its type is determined, a mark is set to indicate its type. The marks include a digit mark, a letter mark, and a name mark; the name mark typically appears when a word or character cannot be processed by traditional word segmentation. If a preceding word has already been typed and the current word is determined to be of a different type, the preceding word is separated off and the current word is marked; when the segmentation retrieval of the current word finishes, the word is inserted directly into the output string. The most complicated task in the algorithm is the handling of non-numeric characters: in their retrieval structure, the second-level hash table is searched first, looking ahead, and if the characters form a dictionary term, the term is inserted directly into the output string, with a mark indicating that segmentation has already been done so that the work is not repeated.
After word segmentation, in order to count topics the system must perform a coherence analysis on the keywords of every microblog, both to overcome the disorder, shortness, lack of structure, and decentralization of microblogs and to avoid problems of incomplete and interleaved messages. Common methods of word coherence analysis include: (1) analysis of keyword usage frequency [5-6]; (2) word frequency and word density [7-8]; (3) keyword position and form [8]; (4) distance between keywords [9-10]; (5) analysis of links and page weights [6].
To analyze topic words, the keyword-distance algorithm is needed, but this method has definite problems when applied to microblogs. Because microblog content is fragmented and covers a large range of subjects, many keywords are extracted from each microblog, so this method alone produces results of very high dimensionality, which complicates the computation. The system in this paper therefore adopts a compromise: it computes and displays abundant keywords by comprehensively combining keyword usage frequency, word frequency and word density, distance between keywords, and analysis of links and page weights.
The development platform of the system is Microsoft Visual Studio 2012, with Microsoft SQL Server 2008 as the database. The operating environment is a dual-core P4 at 2.5 GHz with 4 GB of RAM running XP SP3, on a 100 Mb/s network.
Before the system is run, it must be configured with the Sina microblog API settings: the App Key, the App Secret, and the CallbackURL. Once these three values pass verification, the program can connect to the Sina microblog server.
The system then verifies the password of the logged-in account by comparing the local value with the verified value, completing the login. After login, the system obtains the currently logged-in users and the latest microblogs they follow. Because microblogs captured in this way are ordered by time, the system downloads the microblog data in batches by setting time windows; after the download is complete, it starts to analyze the microblogs. The implementation modules of the system are presented in Figure 3.
By collecting data from the Shaanxi meteorological Sina microblog over three months, the system captured 671 pieces of effective microblog data; after data filtering, word segmentation, topic-word counting, and heat ranking, 356 original microblogs and their related comments were obtained. In the process, the word segmentation algorithm as amended in this paper was applied to the longer microblog texts, and popular Internet words were added to the dictionary, with good effect. We first verified the effectiveness of the topic extraction algorithm adopted in this paper; the experiment used a SQL Server 2008 database to build a corpus simulating real data. The top five outcomes of the hot-spot extraction, after being arranged and analyzed, are presented in Table 1.
To verify whether the topic extraction algorithm can respond to hot events on the microblog, we compared the hot spots extracted in the experiment with the hot topics on Sina microblog. All the events listed by the experiment appeared on the Sina microblog ranking list, meaning the experiment achieved a good result and verifying that the topic extraction algorithm of this paper is feasible and effective for extracting the focal topics discussed and commented on warmly by netizens.
Table 1. Hot words sorted

Ranking  Topic        Occurrence number  Frequency
1        haze         113                0.035
2        El Niño      89                 0.021
3        debris flow  77                 0.017
4        rainstorm    52                 0.012
5        hail         31                 0.007
6. Conclusions
The system completed the extraction of hot topics effectively, but it still has some shortcomings, chiefly the lack of image analysis results. The next step is further research in this direction, to obtain more accurate hot-topic extraction results.
References
[1] Y. Liu. Introduction of Research on the Network Public Opinion.Tianjin: Tianjin People Publishing
House, 2007.
[2] Sina .Sina Microblog. API Open Platform [EB/OL]. (2013-03-12).
http://open.t.sina.com.cn/wiki/index.hph.
[3] http://download.csdn.net/download/zeal27/3049486
[4] http://www.cnblogs.com/linxuanchen/p/5113233.html
[5] J. W. Han, M. Kamber; translated by M. Fan, X. F. Meng. Data Mining: Concepts and Techniques. Beijing: China Machine Press, 2007.
[6] J. W. Han, X. F. Meng, J. Wang. Web Mining Research. Journal of Computer Research and
Development, 38(2001): 405-414.
[7] G. L. Ji, K. Shuai, Z. H. Sun. Data Mining Technology and Its Application. Journal of Nanjing Normal
University (Natural Science Edition), 23(2000): 25-27.
[8] J. Wu. The Beauty of Mathematics. Google Research Institute, 2008.12
[9] G. M. Yu. Characteristics and Statistical Analysis of Network Public Opinion’s Hot Event. People
Forum(Chinese), 4(2010).
[10] Q. Y. Yao, G. S. Liu, X. Li. Text Clustering Algorithm Based on Vector Space Model (VSM). Computer Engineering, 9(2008): 40-44.
542 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-542
Introduction
Modern cities face a variety of transportation problems, such as congestion and travel delays. Sustainable development, and especially sustainable mobility (such as shared transport systems and combined modes of transport), is therefore gaining more and more attention, and several efficient solutions (such as car sharing, buses, and intelligent traffic systems (ITS), integrated systems consisting of communication technologies, vehicle sensing, and many other technologies) have been proposed to ease these problems [1].
1 Corresponding Author: Rui WANG, School of Information Engineering, Yangzhou University, Yangzhou,
In this context, carpooling and ride sharing are becoming two of the most promising approaches to sustainable mobility [2]. Furthermore, carpooling can also help save travel cost, alleviate traffic pressure, save fuel, and protect the environment.
Besides [1, 2], many other researchers have studied how to make carpooling more convenient and economical. In [3], by adopting a genetic algorithm, the authors propose a low-complexity, low-memory carpool matching method to solve the carpool service problem for passengers and help relieve traffic congestion. In [4], by analyzing a proposed Automated Wireless Carpooling System (AWCS) and Central Monitoring System (CMS), the authors find it is efficient at getting passengers into carpool vehicles in a city. In [5], applying a cloud computing framework, the authors propose an intelligent carpool system called BlueNet, which mainly consists of a Mobile Client module and Cloud Global Carpool Services; their experiments also show that this system dramatically reduces the processing time for obtaining carpooling results. Moreover, in [6], the authors research how to utilize positioning systems to support a dynamic network of car and taxi pool services and maximize the use of empty seats in cars and taxis.
Figure 1. The architecture of intelligent real-time route planning and carpooling system (The CSM and Driver
module are extracted from [3, 5])
Inspired by these studies, this paper focuses on the architecture of intelligent real-time route planning and carpooling in ITS. As shown in Figure 1, by utilizing a positioning system (GPS module), a storage module, an intelligent analysis system (ITS, AM, and CSM), a communication network, and a variety of smart mobile devices, this paper analyzes the intelligent real-time route planning and carpooling system. Meanwhile, this paper adopts a formal language, Performance Evaluation Process Algebra (PEPA), to model and evaluate the system. As shown in [7], the PEPA language has a great advantage in modeling and evaluating systems with a closed workflow. Moreover, we use fluid-flow approximation to conduct performance evaluation based on the model. From [8, 9], in the fluid approximation the discrete state space is treated as continuous, and the continuous-time Markov chain (CTMC) underlying a PEPA model is converted into ordinary differential equations (ODEs). Therefore, the performance of the corresponding PEPA model (i.e., the real-time route planning and carpooling system) can be obtained from the numerical solution of the ODEs.
544 J. Ding et al. / Modeling and Evaluating Intelligent Real-Time Route Planning
Section 1 gives the specific PEPA models of the process of intelligent route planning and carpooling and evaluates the performance of the model. Section 2 concludes this paper.
This section describes the general processing of intelligent real-time route planning and carpooling. From Figure 2, the whole process is divided into eight components: the Passengers module, Driver module, GPS, traffic acquisition devices (TAD), intelligent traffic system (ITS), analysis module (AM), carpool services module (CSM), and storage module (SM). In Figure 2, rectangles denote individual activities, rounded rectangles represent shared activities, and diamonds denote choices in PEPA. Three kinds of arrows are used in Figure 2: solid arrows describe the execution sequences of the different components; dotted arrows represent the specific execution sequences of the activities within each component; and short dotted arrows denote a choice among the execution sequences of some activities within the SM module.
The specific working process of the intelligent real-time route planning and carpooling system is as follows (here we list only the PEPA models of the Passengers module and the CSM; following the semantics of PEPA, readers can derive the corresponding PEPA models of the other modules):
Figure 2. The processing of intelligent real-time route planning and carpooling (The CSM and Driver module
are extracted from [3, 5])
1. Passengers Module: First sends a route planning and carpooling request (i.e., route_carpool_req) to the ITS module. Then it receives the detailed travel-route and carpooling results returned from the ITS module (i.e., route_carpool_rsp). Finally, it executes a reset operation (i.e., reset1) and prepares to send new requests to the ITS module. The corresponding PEPA models are:
Passenger1 = (carpool_req, r_carpool_req).Passenger2
Passenger2 = (carpool_rsp, r_carpool_rsp).Passenger3
Passenger3 = (reset1, r_reset1).Passenger1
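The fluid approximation of this three-state cycle can be sketched directly: each local state becomes a continuous variable counting the expected number of passengers in it, and each activity becomes a flow at rate 1/duration (durations from Table 1). This is our standalone illustration with forward-Euler integration; it treats the Passenger cycle in isolation and ignores the synchronizations with the ITS that the full PEPA model contains.

```python
# Fluid-flow sketch of the Passenger cycle: dP1/dt = f_reset - f_req, etc.
# Rates are 1/duration using Table 1 (route_carpool_req, route_carpool_rsp,
# reset1); the coupling to the other modules is deliberately omitted.
r_req, r_rsp, r_reset = 1 / 0.000370, 1 / 0.00351, 1 / 1.0

def simulate(n_passengers=100, t_end=2.0, dt=1e-4):
    p1, p2, p3 = float(n_passengers), 0.0, 0.0  # all start in Passenger1
    t = 0.0
    while t < t_end:
        f1, f2, f3 = r_req * p1, r_rsp * p2, r_reset * p3  # activity flows
        p1 += dt * (f3 - f1)
        p2 += dt * (f1 - f2)
        p3 += dt * (f2 - f3)
        t += dt
    return p1, p2, p3

p1, p2, p3 = simulate()
print(round(p1 + p2 + p3))  # total population is conserved → 100
```

Because reset1 is by far the slowest activity, almost all the fluid mass sits in Passenger3 at steady state; in the full model, the ODE solution of all modules together yields the response-time and throughput curves discussed later.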
2. ITS Module: The main functions of this module are as follows:
• Sends a location request (i.e., locate_req) to the GPS module and receives positioning results from the SM module (i.e., locate_rsp).
• Sends a request to the TAD module to acquire the corresponding traffic status and accepts the query results from the SM module (i.e., traffic_rsp).
• Integrates the location result and the corresponding traffic information (i.e., locate_traffic_comb), sends a request to the AM module for route planning (i.e., route_req), and accepts the corresponding route planning result (i.e., route_rsp).
• Sends a carpool planning request to the CSM (i.e., carpool_req) and receives the carpool result pushed by the CSM (i.e., carpool_rsp).
• Sends a carpool confirmation request to the Driver module (i.e., info_conf_req) and acquires the corresponding result (i.e., info_conf_rsp).
3. Driver Module: Upon receiving the request from the ITS module, this module first determines whether the corresponding travel information has already been generated (i.e., judge_3). If not, the driver first generates the travel information (i.e., info_generate) and returns it to the ITS module; otherwise, the module returns the information directly. Next, the Driver module executes a reset operation (i.e., reset2) and prepares to receive new requests.
4. GPS Module: After receiving a location request from the ITS module, this module first judges whether it is already connected to a usable satellite (i.e., judge_1). If not, the GPS connects to a usable satellite (i.e., satellite_link) and generates the specific location information (i.e., locate_generate); otherwise, it generates the position information directly.
5. TAD Module: Once it accepts a request from the ITS, this module first judges whether the traffic status has already been obtained (i.e., judge_2). If not, the TAD first acquires the traffic status data (i.e., traffic_obtain) and then transforms the acquired data into real-time traffic information (i.e., traffic_generate); otherwise, the TAD generates the real-time traffic information straight away.
6. AM: This module performs intelligent analysis on the location and traffic-status information provided by the ITS module (i.e., route_plan) and feeds the best route back to the ITS.
7. CSM: The main functions of this module are as follows:
• After receiving the carpool request from the ITS, the CSM first sends a request to the SM to obtain detailed information on available vehicles (i.e., car_info_req) and receives the vehicle query results from the SM (i.e., car_info_rsp).
• Based on the vehicle information and the route planning, the CSM performs an intelligent analysis, generates the optimal matching result (i.e., carpool_match), and returns the result to the ITS.
The corresponding PEPA models are:
CSM1 = (carpool_match_req, r_carpool_match_req).CSM2
CSM2 = (driver_info_req, r_driver_info_req).CSM3
CSM3 = (driver_info_rsp, r_driver_info_rsp).CSM4
CSM4 = (carpool_match, r_carpool_match).CSM5
CSM5 = (carpool_match_rsp, r_carpool_match_rsp).CSM1
8. SM: This module is responsible for the unified storage and retrieval of location, traffic, and vehicle information (i.e., locate_get, traffic_get, and car_info_get). The SM also provides this information to specific modules of the system (i.e., the ITS and CSM modules).
In this section, all activities and components involved in the PEPA models are listed in Table 1 and Table 2, respectively, together with their parameters. In Table 1, most parameters were obtained over the Internet through practical tests in debug mode. The duration denotes the execution time of each activity, and its unit is seconds. Because of equipment limitations, it is difficult to obtain specific figures for every component, so the amount of each component in Table 2 is assumed based on Internet resources and the actual situation.
Table 1. Description and duration of all activities

Action               Duration (s)   Action                Duration (s)
route_carpool_req    0.000370       route_carpool_rsp     0.00351
reset1               1.0            locate_req            0.000480
locate_rsp           0.000330       traffic_req           0.000350
traffic_rsp          0.000560       locate_traffic_comb   0.0025
route_req            0.000350       route_rsp             0.030
carpool_req          0.000480       carpool_rsp           0.000560
info_conf_req        0.00351        info_conf_rsp         0.000370
judge_1              0.001          satellite_link        1.499
locate_generate      0.670          locate_store          1.081
locate_get           0.713          traffic_store         1.081
traffic_get          0.713          car_info_req          0.000480
car_info_get         0.713          car_info_rsp          0.000560
judge_2              0.001          traffic_obtain        0.670
traffic_generate     0.670          route_plan            2.801
carpool_match        1.227          judge_3               1.0
info_generate        2.0            reset2                1.0
This section presents the dynamic performance of the real-time route planning and carpooling model (see Figure 3 and Figure 4): the response time of the whole process of the model and the throughput of car_info_get and locate_traffic_comb are analyzed in turn.
In Figure 3, the maximum number of passengers (in Table 2) is held constant, and we analyze the probability that passengers have finished the whole process of real-time route planning and carpooling. Figure 3 shows that as the number of passengers increases, the probability that passengers have completed the route planning and carpooling decreases, i.e., passengers spend more time completing the process.
Figure 3. Response time for passengers to complete the process of real-time route planning and carpooling
The throughput of car_info_get and locate_traffic_comb is analyzed in Figure 4. Both plots show that as the number of passengers increases (i.e., as the requests from passengers increase), the throughput of these activities also grows; however, the curves flatten as the throughput approaches the maximum value the system can sustain.
Analyzing the response time and throughput of the model helps assess the system's ability to process passengers' requests in practice and to maximize the utilization of the system.
2. Conclusion
This paper employs the formal language PEPA to model the whole process of real-time route planning and carpooling in ITS. In practice, an efficient ITS is of great significance for managing traffic information and reducing traffic congestion, and performance evaluation of the system can improve how the system is used. In future work, we will focus on intelligent decision-making and intelligent push of real-time traffic information; furthermore, we will look for a rational and effective intelligent decision-making algorithm and apply it in ITS.
Acknowledgements
The authors acknowledge financial support from the National Natural Science Foundation of China under Grant No. 61472343 and the Natural Science Foundation of Jiangsu Province under Grants No. BK20160543 and BM20082061507.
References
[1] A. Awasthi, and S. S. Chauhan: Using AHP and Dempster-Shafer theory for evaluating sustainable
transport solutions. Environmental Modelling & Software, 26 (2011), 787-796.
[2] E. Cangialosi, A. D. Febbraro, and N. Sacco: Designing a multimodal generalised ride sharing system.
Institution of Engineering and Technology, 10 (2016), 227-236.
[3] M. K. Jiau, and S. C. Huang: Services-Oriented Computing Using the Compact Genetic Algorithm
for Solving the Carpool Services Problem. IEEE Intelligent Transportation Systems Society, 16 (2015),
2711-2722.
[4] R. K. Megalingam, R. N. Nair, and V. Radhakrishnan: Automated Wireless Carpooling System for
an eco-friendly travel. 3rd International Conference on Electronics Computer Technology (ICECT), 4
(2011), 325-329.
[5] S. C. Huang, M. K. Jiau, and C. H. Lin: A Genetic-Algorithm-Based Approach to Solve Carpool Service
Problems in Cloud Computing. IEEE Transactions on Intelligent Transportation Systems, 16 (2015),
352-364.
[6] P. Lalos, A. Korres, C. K. Datsikas, G. S. Tombras and K. Peppas: A Framework for Dynam-
ic Car and Taxi Pools with the Use of Positioning Systems. Computation World: Future Comput-
ing, Service Computation, Cognitive, Adaptive, Content, Patterns (COMPUTATIONWORLD ’09),
DOI.10.1109/ComputationWorld.2009.55, (2009), 385-391.
[7] J. Ding: A Comparison of Fluid Approximation and Stochastic Simulation for Evaluating Content Adaptation Systems. Wireless Personal Communications, 84 (2015), 231-250.
[8] J. Ding: Structural and Fluid Analysis for Large Scale PEPA Models–With Applications to Content
Adaptation Systems. PhD Thesis, The University of Edinburgh, 2010.
[9] J. Ding and J. Hillston: Numerically Representing Stochastic Process Algebra Models. Computer Jour-
nal, 55 (2012), 1383-1397.
Fuzzy Systems and Data Mining II 549
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-549
Introduction
Multimode problems often occur when a discontinuity is introduced into a microwave structure, such as the coupled microstrip resonator structure that forms the substructure of a microwave equalizer used to linearize the gain characteristics of traveling-wave-tube amplifiers (TWTAs) [1]. While solid-state devices have linear gain characteristics, the output of a TWTA is nonlinear, with its maximum at the center point [2]. The main design problem is how to handle the multimode behavior caused by the discontinuities in the structure. The multimode problem of the equalizer is the same as that of the basic substructure, which consists of a single resonator coupled to the transmission line. The common approach to the multimode problem is to compute the mode fields of the structure precisely, but obtaining precise results is notoriously difficult, and for equalizer design the goal is the implementation, not the field computation. Choosing a suitable method to handle the multimode effect is therefore critical to realizing the equalizer precisely.
Without the multimode problem, as shown in figure 1, the equalizer's S-parameters
could be calculated directly with the cascade equations. But the multimode problem cannot
1
Corresponding Author: Ying ZHAO, PhD in communication engineering, Institute of Electrical
Engineering and Information Engineering, Lanzhou University of Technology. E-mail:
Zhying2005@163.com.
550 Y. Zhao et al. / Multimode Theory Analysis of the Coupled Microstrip Resonator Structure
be avoided because of the discontinuity in the structure. To accommodate the multimode
effect, a method of changing the structure itself is developed in this paper. As an
example, we analyze an equalizer that uses a branch microstrip resonator loaded with a
thin-film resistor, and investigate how the length of the microstrip trace determines the
fading behavior of the higher modes. Such an equalizer has a microstrip of finite length,
open at one end and loaded with thin-film resistors at the other [3][4]. It can provide a
good attenuation curve to compensate a TWTA's nonlinear response.
The equalizer's basic substructure is shown in figure 2: only one resonator is coupled
to the microstrip trace.
Figure 3. The equivalent cascaded network
Figure 4. S-parameter model of the cascaded network
The equivalent cascaded network is shown in figure 3, where [5][6]: [Sa] is the
equivalent network of the input continuous part, [Sb] that of the discontinuous part,
and [Sc] that of the output continuous part.
The S-parameter model of the cascaded network for the two cases is shown in figure 4.
We suppose the normalized impedance is matched at the output port in the first case
and at the input port in the second case; their S-parameters are then $[S]^1$ and $[S]^2$,
respectively [7], where:
$$b_1^1 = S_{11}^1 a_1^1 + S_{12}^1 a_2^1, \quad b_2^1 = S_{21}^1 a_1^1 + S_{22}^1 a_2^1$$
$$b_1^2 = S_{11}^2 a_1^2 + S_{12}^2 a_2^2, \quad b_2^2 = S_{21}^2 a_1^2 + S_{22}^2 a_2^2 \qquad (1)$$
Then $b_2^1 = a_1^2$ and $b_1^2 = a_2^1$, so the total S-parameters of the structure are as follows:
$$[S] = \begin{bmatrix}
S_{11}^1 + \dfrac{S_{12}^1 S_{21}^1 S_{11}^2}{1 - S_{22}^1 S_{11}^2} & \dfrac{S_{12}^1 S_{12}^2}{1 - S_{22}^1 S_{11}^2} \\[2mm]
\dfrac{S_{21}^1 S_{21}^2}{1 - S_{22}^1 S_{11}^2} & S_{22}^2 + \dfrac{S_{21}^2 S_{22}^1 S_{12}^2}{1 - S_{22}^1 S_{11}^2}
\end{bmatrix} \qquad (2)$$
$$\begin{cases}
S_{11} = S_{11}^a + S_{12}^a S_{11}^b S_{21}^a / (1 - S_{22}^a S_{11}^b) \\
S_{12} = S_{12}^a S_{12}^b / (1 - S_{22}^a S_{11}^b) \\
S_{21} = S_{21}^a S_{21}^b / (1 - S_{22}^a S_{11}^b) \\
S_{22} = S_{22}^b + S_{21}^b S_{22}^a S_{12}^b / (1 - S_{22}^a S_{11}^b)
\end{cases} \qquad (3)$$
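The cascade relations above amount to the standard cascade of two two-port S-matrices. As a quick numerical illustration (a minimal numpy sketch, not the authors' code, assuming a matched reference impedance at the internal connection):

```python
import numpy as np

def cascade(S1, S2):
    """Cascade two 2-port S-matrices: the output of network 1 feeds the
    input of network 2 (standard cascade formula, cf. Eq. (2))."""
    d = 1.0 - S1[1, 1] * S2[0, 0]          # common denominator 1 - S22^1 S11^2
    S11 = S1[0, 0] + S1[0, 1] * S1[1, 0] * S2[0, 0] / d
    S12 = S1[0, 1] * S2[0, 1] / d
    S21 = S1[1, 0] * S2[1, 0] / d
    S22 = S2[1, 1] + S2[1, 0] * S2[0, 1] * S1[1, 1] / d
    return np.array([[S11, S12], [S21, S22]])

# Sanity check: cascading with an ideal through line (S11 = S22 = 0,
# S12 = S21 = 1) must return the other network unchanged.
thru = np.array([[0.0, 1.0], [1.0, 0.0]])
Sa = np.array([[0.1 + 0.2j, 0.9], [0.9, 0.1 - 0.1j]])
print(np.allclose(cascade(Sa, thru), Sa))  # True
```

The through-line check is a convenient self-test: any deviation indicates a sign or index error in the cascade terms.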
Ignoring the multimode effect, the total S-parameters of figure 3 would be the cascade
of [Sa], [Sb], and [Sc] with Eq. (3). But the discontinuity is near the resonator, so
Eq. (3) cannot be used to compute the total S-parameters. We therefore suppose the
S-parameters are as follows [7]:
$$\begin{bmatrix} V_{1,\#1}^- \\ V_{2,\#1}^- \\ V_{1,\#2}^- \\ V_{2,\#2}^- \end{bmatrix} =
\begin{bmatrix} [S]^{\#1,\#1} & [S]^{\#1,\#2} \\ [S]^{\#2,\#1} & [S]^{\#2,\#2} \end{bmatrix}
\begin{bmatrix} V_{1,\#1}^+ \\ V_{2,\#1}^+ \\ V_{1,\#2}^+ \\ V_{2,\#2}^+ \end{bmatrix} \qquad (4)$$
The S-parameters of the uniform parts far from the discontinuity can be treated as
those of a uniform transmission line, so the key to the total S-parameters is [Sb]. The
equivalent circuit of the discontinuous part is shown in figure 3, where [Sa], [Sb], and
[Sc] have the same meaning as before. The S-parameter model of the multimode
circuits is shown in figure 5:
Figure 5. S-parameter model of the multimode circuits
$$[S]^{\#i,\#i} = \begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix} \qquad (5)$$
$$[S]^{\#i,\#j} = \begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix}, \quad i \ne j \qquad (6)$$
$$\begin{bmatrix} V_{1,\#1}^- \\ V_{2,\#1}^- \\ V_{1,\#2}^- \\ V_{2,\#2}^- \\ \vdots \\ V_{1,\#n}^- \\ V_{2,\#n}^- \\ \vdots \end{bmatrix} =
\begin{bmatrix}
[S]^{\#1,\#1} & [S]^{\#1,\#2} & \cdots & [S]^{\#1,\#n} & \cdots \\
[S]^{\#2,\#1} & [S]^{\#2,\#2} & \cdots & [S]^{\#2,\#n} & \cdots \\
\vdots & \vdots & \ddots & \vdots & \\
[S]^{\#n,\#1} & [S]^{\#n,\#2} & \cdots & [S]^{\#n,\#n} & \cdots \\
\vdots & \vdots & & \vdots & \ddots
\end{bmatrix}
\begin{bmatrix} V_{1,\#1}^+ \\ V_{2,\#1}^+ \\ V_{1,\#2}^+ \\ V_{2,\#2}^+ \\ \vdots \\ V_{1,\#n}^+ \\ V_{2,\#n}^+ \\ \vdots \end{bmatrix} \qquad (7)$$
where $V_{j,\#i}^{\pm}$ ($j = 1, 2$) is the mode voltage of the incident or reflected wave of the
$i$th mode at the $j$th physical port, and $S_{11}$ and $S_{21}$ are the reflection and transmission
coefficients, respectively. The mutual S-parameters represent the effect of the higher
modes on the main mode of the transmission structure.
From Eq. (7) we can see that without the multimode effect the total S-parameters have
no mutual terms and reduce to the main-mode S-parameters of the transmission
structure. So if we can remove the multimode effect from the structure, the difficulty in
computing the total S-parameters disappears and the cascade equation can be used for
the cascaded structure. An 'equivalent removing' method is therefore developed in this
paper.
The substructures shown in figure 6 were fabricated for the analysis. The HFSS
simulator from Ansoft Corporation was also used to analyze the multimode effect of
the substructures [8]; the software includes post-processing commands for analyzing
this behavior in detail. The measurements and the simulations gave the same results.
Figure 6. The equivalent circuits of the discontinuity (a) (b) (c) and (d)
The results are shown in figure 7 (only comparisons of the transmission coefficient
S21 are shown). We analyzed the difference in transmission characteristics, especially
S21, between the multimode structure and the multimode-removing structure shown in
figure 6(b), i.e. the layout of the proposed branch resonator coupled to a 50 Ω
microstrip line.
Figure 7. The multimode data and the non-multimode data
Figure 8. S21 of the multimode and the different multimode removing data
The results in figure 7 show that the resonant frequency of the multimode data is
lower than that of the multimode-removing structure, while its attenuation is larger.
The main reason is that the higher modes increase the loaded capacitance of the
resonator, which equivalently lengthens the branch and lowers the resonant frequency;
they also consume energy of the main mode, which increases the attenuation. The data
for both-side and single-side multimode removing are shown in figure 8, where 'm'
denotes the multimode data, 'srm' the single-side multimode-removing data, and 'brm'
the both-side multimode-removing data. The trend of the multimode change is the
same.
For the structure shown in figure 6(c), if we divide it along line a-b, the multimode
coupling between the two parts cannot simply be omitted, so the cascade formulation
(4) cannot be used to obtain the total S-parameters, which makes the design difficult.
To simplify the design, the 'equivalent removing' method is developed in this paper
and verified with numerical simulations. The structure of figure 6(c) is divided into two
parts, and the substitute structure of figure 6(b) is used when the trace is long enough.
The first part of the divided structure is called the single-side multimode-removing
equivalent structure, and the second part can be substituted with the reversed structure
of figure 6(d).
In this way, the multimode effect in the middle of the structure in figure 6(c) is
removed equivalently, and the total S-parameters of structure (c) can be obtained from
the cascade formulation (4). In the formulation, the S-parameters of structure (b) are
substituted for those of structure (a), which removes the multimode effects
equivalently. The results of applying the method to the equalizer design are shown
below; figure 9 compares the originally computed S-parameters with the measured
data.
In figure 9, the calculated data are obtained by dividing structure (c) into two equal
parts as in (a) and using the cascade method with the original S-parameters of
structure (a) to compute the total S-parameters of structure (c). The figure shows that
a large error is introduced, so the cascade method cannot be used this way to calculate
the total S-parameters of structure (c).
Figure 9. The original and the calculated result
Figure 10. The adjusted and the calculated result
When the fading character of the higher modes is exploited and the improved
'equivalent removing' method is used, good results are obtained, as shown in figure 10.
This time the equivalent structure and the corresponding S data are used to compute
the total S-parameters with formulation (4). In figure 10, the curve of the computed
data and the curve of the data measured with the HP8255 almost overlap, which means
the method is suited to the cascade design. The errors between the computed and
measured data are shown in figure 11: for 95% of the points considered above the
error is below 1 dB, and for 80% of the points it is below 0.7 dB. This close match
between computed and measured data shows that the method is correct and useful for
the design.
3. Conclusion
We have investigated, through numerical simulations with HFSS and through
experiments, the influence of the microstrip trace length and the cascade state on the
discontinuity problem in microstrip equalizer design. The numerical results agree very
well with the experimental results. They show that when the first higher mode fades by
90 dB, the cascade result of the both-side removing structure meets the equalizer
design requirement well. In the single-side removing structure, the removing side is
designed so that the first higher mode fades by 90 dB, while on the other side, which
retains the multimode, the first higher mode fades by only 6 dB [9][10]. The resonant
frequency is also closely related to the dimensions of the thin-film-resistor-loaded
resonator, and the attenuation is closely related to both the resonator and the trace.
References
[1] D. J. Mellor. On the Design of Matched Equalizers of Prescribed Gain Versus Frequency Profile. IEEE
MTT-S International Microwave Symposium Digest, 1997, 308-311.
[2] Broadband MIC Equalizers TWTA Output Response. IEEE Design Feature, Oct 1993.
[3] J. Y. Chi, G. X. Zhang, G. Huang. CAD and Experimental Research of the Microstrip Equalizer for
TWT Amplifier. Journal of Electronics & Information Technology, April 1989.
[4] M. Sankara Narayana. Gain Equalizer Flattens Attenuation Over 6-18 GHz. Applied Microwave &
Wireless, November 1998, 74-78.
[5] J. D. Baena, J. Bonache, F. Martin, R. M. Sillero, F. Falcone. Equivalent-Circuit Models for Split-Ring
Resonators and Complementary Split-Ring Resonators Coupled to Planar Transmission Lines. IEEE
Transactions on Microwave Theory and Techniques, 53(2005): 1451-1461.
[6] V. Sanz, A. Belenguer, A. L. Borja, J. Cascon, H. Esteban. Broadband Equivalent Circuit Model for a
Coplanar Waveguide Line Loaded with Split Ring Resonators. International Journal of Antennas &
Propagation, 4(2012): 1238-1241.
[7] O. Pitzalis, R. A. Gilson. Tables of Impedance Matching Networks Which Approximate Prescribed
Attenuation Versus Frequency Slopes. IEEE Transactions on Microwave Theory and Techniques,
19(1971): 381-386.
[8] M. Shattuck. EM-Based Models Improve Circuit Simulators. Microwaves & RF, June 2000, 97-108.
[9] P. Heymann, H. Prinzler, and F. Schnieder. De-embedding of MMIC transmission-line measurements.
1994 IEEE MTT-S Digest, 1045-1048.
[10] G. D. Vendelin, A. M. Pavio, U. L. Rohde. Microwave Circuit Design Using Linear and Nonlinear
Techniques. Wiley, 37(2005): 973-974.
Fuzzy Systems and Data Mining II 555
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-555
Abstract. This paper demonstrates a method for obtaining woodcut renderings of
images. Four steps produce the woodcut rendering simulation image: boundary
extraction, histogram matching, image edge enhancement, and binarization. First,
we use the Roberts operator to extract the image boundary. Second, we use
histogram matching to adjust the color distribution of the gray image. Third, we
fuse the image boundary with the adjusted image. Finally, we binarize the image
by setting a threshold to obtain the woodcut rendering result. Experimental results
show that the algorithm has low computational complexity and very good
real-time behavior, and that it produces woodcut renderings with excellent artistic
effects.
Introduction
1
Corresponding Author: Shu-Wen WANG, College of Electrical Engineering, Northwest University for
Nationalities, China; E-mail: shuwenwang@163.com.
556 H.-Q. Zhang et al. / A Method for Woodcut Rendering from Images
problems: 1) artists use sketches to describe the overall shape and main profile; 2)
artists engrave the dark areas to express the image's tone and level in different regions.
Considering both issues, this paper presents a stepwise woodcut rendering method
with four steps: boundary extraction, histogram matching, image edge enhancement,
and binarization.
1. Boundary Extraction
Image boundary data provides basic information for photo processing [4]. Boundary
information is essential in photo processing, because many important image details lie
on the image boundary.
The Roberts operator [5] is a very simple algorithm that uses a local difference
operator to detect edges. It has high positioning accuracy, but the drawback of being
sensitive to noise. Figure 1 shows the original image; the boundary extracted with the
Roberts operator is shown in Figure 2.
The Roberts operator convolution kernels are as follows:
$$G_x = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \quad G_y = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \qquad (1)$$
Formula for calculating the gradient magnitude:
$$G = \sqrt{G_x^2 + G_y^2} \qquad (2)$$
The specific calculation is as follows:
$$G(x, y) = |f(x, y) - f(x+1, y+1)| + |f(x, y+1) - f(x+1, y)| \qquad (3)$$
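The boundary-extraction step can be sketched directly from Eq. (3); a minimal numpy sketch (not the authors' code), using the |Gx| + |Gy| absolute-difference form rather than the square root of Eq. (2):

```python
import numpy as np

def roberts_edges(f):
    """Roberts cross gradient magnitude per Eq. (3): for each pixel,
    |f(x,y) - f(x+1,y+1)| + |f(x,y+1) - f(x+1,y)|.
    The output is one row and one column smaller than the input."""
    f = f.astype(np.float64)
    gx = f[:-1, :-1] - f[1:, 1:]   # diagonal difference (Gx kernel)
    gy = f[:-1, 1:] - f[1:, :-1]   # anti-diagonal difference (Gy kernel)
    return np.abs(gx) + np.abs(gy)

# A vertical intensity step: the gradient is nonzero only along the step.
img = np.array([[0, 0, 9],
                [0, 0, 9],
                [0, 0, 9]])
print(roberts_edges(img))
```

Only the column adjacent to the step yields a nonzero response (value 18 = |0-9| + |9-0|), illustrating the operator's edge localization.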
2. Histogram Matching
this way, the image information is concentrated, so a unified threshold can be set for
the binarization step. The histogram matching results are shown in Figure 4.
This article uses the following functions as the distribution curve:
$$p_1(v) = \begin{cases} \dfrac{1}{\sigma} e^{-\frac{u_c - v}{\sigma}} & u_b \le v \le u_c \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$
$$p_2(v) = \begin{cases} \dfrac{1}{u_b - u_a} & u_a \le v \le u_b \\ 0 & \text{otherwise} \end{cases} \qquad (5)$$
where $u_a = 105$, $u_b = 225$, $u_c = 255$, and $\sigma = 9$. The function curve is shown in
Figure 5.
Figure 3. The gray histograms of three different photos
Figure 4. The gray histograms after histogram matching
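The target distribution of Eqs. (4)-(5) can be tabulated directly; a minimal numpy sketch (not the authors' code), assuming the exponential branch peaks at u_c — the sign of the exponent is not legible in the source — and that the combined curve is normalized to a probability histogram:

```python
import numpy as np

u_a, u_b, u_c, sigma = 105, 225, 255, 9

def p1(v):
    """Exponential branch (Eq. (4)): mass concentrated just below u_c."""
    return np.where((v >= u_b) & (v <= u_c),
                    np.exp(-(u_c - v) / sigma) / sigma, 0.0)

def p2(v):
    """Uniform branch over [u_a, u_b] (Eq. (5))."""
    return np.where((v >= u_a) & (v <= u_b), 1.0 / (u_b - u_a), 0.0)

v = np.arange(256)
target = p1(v) + p2(v)
target /= target.sum()          # normalize to a probability histogram
print(v[np.argmax(target)])     # the mass peaks at v = 255, i.e. at u_c
```

This target histogram is what the gray image's histogram is matched against, pushing most pixel values toward the bright end before thresholding.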
As shown in Figure 6(b), the image has very high brightness and obvious distortion, so
its details need to be strengthened. The image boundary carries important details, so
enhancing the boundary information enhances the image [8]. In this paper we enhance
the image of Figure 6(b) by edge enhancement [9]: the image in Figure 6(c) is obtained
by adding the edge information to Figure 6(b). The detail of (c) is significantly
improved compared with that of (b).
As shown in Figure 7, a woodcut image is a binary image with only two colors, black
and white. To achieve the woodcut rendering we therefore need binarization [10]. To
reduce the computational complexity, we realize the binarization by setting a threshold
value [11].
$$p(v) = \begin{cases} 1 & v \ge u \\ 0 & v < u \end{cases} \qquad (7)$$
The threshold value $u$ controls the result of the binarization. Since the image
information was concentrated by histogram matching and, as shown in Figure 5, lies in
the gray levels above 230, the threshold $u$ is set to 230.
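The thresholding of Eq. (7) with u = 230 is a one-line array operation; a minimal numpy sketch (not the authors' code):

```python
import numpy as np

def binarize(gray, u=230):
    """Eq. (7): pixels at or above threshold u become white (1),
    the rest black (0) -- the two tones of a woodcut print."""
    return (gray >= u).astype(np.uint8)

# Values straddling the threshold: 250 and 231 map to white, 120 and 229 to black.
gray = np.array([[250, 120],
                 [231, 229]])
print(binarize(gray))
```

Because the histogram was first matched to concentrate mass above 230, this single fixed threshold suffices for all three image types tested.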
We applied the woodcut rendering to three different types of images: (a) natural
scenery, (b) cityscape, and (c) portrait photography, and obtained very good rendering
results, as shown in Figure 8.
Comparing the original image with result (a), the rendered image is very detailed
and the details are very accurate; for example, the reflection in the river is not just a
black shadow but contains some white spots.
Comparing the original image with result (b), the shop signs on the wall and the
structure and texture of the building are very clear. However, only part of the clouds is
visible; this is a problem that still needs to be solved.
Comparing the original image with result (c), the portrait is very distinct, and the
woman's hat is very clear as well.
6. Conclusion
In this paper we realized a simple method for woodcut rendering. First, we used the
Roberts operator to extract the image boundary. Second, we used histogram matching
to adjust the color distribution of the gray image. Third, we fused the image boundary
with the adjusted image. Finally, we set a threshold for binarization to obtain the
woodcut rendering result. Experimental results show that the algorithm has low
computational complexity and very good real-time behavior. The proposed method
obtains woodcut renderings with good artistic effects and excellent universality. Future
research will include improving the accuracy of the rendering, the rendering quality,
and the artistic quality of the rendered image.
Acknowledgements
We would like to express our gratitude to the National Natural Science Foundation of
China (No.61261042) and the scientific research innovation team about the key
technologies of Internet of Things of Northwest University for Nationalities. These
projects have brought a lot of help to the thesis.
References
[1] S. Mizuno and T. Kasaura, et al. Automatic generation of virtual woodblocks and multicolor woodblock
printing. Computer Graphics Forum, 19(2000), 51-58.
[2] V. Mello, C. R. Jung, et al. Virtual woodcuts from images. 5th international conference on Computer
graphics and interactive techniques, 2007: 103-109.
[3] J. Li, D. Xu. A scores based rendering for Yunnan out-of-print woodcut. 14th International Conference
on Computer-Aided Design and Computer Graphics, 2015: 214-215.
[4] J. H. Yeom, M. Y. Jung, Y. Kim. Line-based paddy boundary extraction using the Rapid Eye satellite
image. IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Milan, 2015: 3397-
3400.
[5] R. L. Duan, Q. X. Li, Y. H. Li. Summary of image edge detection. Optical Technique, 31(2005): 415-
419.
[6] C. Q. Huang, Q. Zhang, H. Wang, et al. A low power and low complexity automatic white balance
algorithm for amoled driving using histogram matching. Journal of Display Technology, 11(2015): 53-
59.
[7] J. Wu, L. Lu, D. Dong. Fusion of multispectral and high resolution images using IHS transform and
histogram equilibrium. Journal of Wuhan University of Technology (Transportation Science &
Engineering), 28(2004): 55-58.
[8] J. Jin, S. Y. Tang, Y. Shen. An innovative image enhancement method for edge preservation in wavelet
domain. 2015 IEEE International Instrumentation and Measurement Technology Conference (I2MTC)
Proceedings, Pisa, 2015, 52-56.
[9] J. Chen, X. X. Cui, J. Xiao, et al. Properties of image edge enhancement using radial hilbert transform.
Acta Photonica Sinica, 40(2011): 483-486.
[10] B. Wu, Z. Y. Qin. New approaches for the automatic selection of the optimal threshold in image
binarization. Journal of Institute of Surveying and Mapping, 18(2001): 283-286.
[11] M. Soua, R. Kachouri, M. Akil. Improved Hybrid Binarization based on Kmeans for Heterogeneous
document processing. 9th International Symposium on Image and Signal Processing and Analysis
(ISPA), Zagreb, 2015, 210-215.
562 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-562
Introduction
1
Corresponding Author: Yi-Nan LU, College of Computer Science and Technology, Jilin University, Jilin
130000, China; E-mail: luyn@jlu.edu.cn.
T.-W. Yuan et al. / Research on a Non-Rigid 3D Shape Retrieval Method 563
shape classes. However, a recent study conducted by Lian et al. [4] suggests that some
global descriptors can also effectively identify similarities among shapes. In another
study, Bronstein et al. [5] developed the shape descriptor BoF-HKS based on the bags
of words (BoW) framework. However, the BoF-HKS shape descriptor yielded
unsatisfactory results when applied to a non-rigid 3D shape retrieval system in
SHREC’15 [6].
Some parts may not undergo large shape deformations, so they preserve their original
local characteristics. Ran et al. [7] and Li et al. [8] presented several partial matching
methods; however, these methods cannot be applied directly to shape retrieval. In a
more recent method, proposed in [9] and [10], global and partial features were combined.
According to the results, this method outperformed other methods in content-based 3D
shape retrieval tasks. In another study, Sipiran et al. [11] suggested that a 3D generic
shape can be represented as a linear combination of a global descriptor and a set of
partial descriptors. First, the local features of the 3D Harris feature points were
computed. Then, a clustering approach was used to identify points in the same cluster
with similar features. However, points in the same cluster could be disjointed without
location information. Katz et al. [12] developed a novel hierarchical mesh segmentation
algorithm using Core Extraction. The segmentation results were invariant to both the
pose of the model and the differing proportions of the model’s components.
Furthermore, Zhang et al. [13] proposed a region-growing approach, in which the
mean-shift curvature is used to cluster points. The points within a cluster are then
compiled into mesh faces, which are connected to form subgraphs. These methods can
also be used to generate joint segmentation.
Therefore, in this study, a new approach is developed by combining the global and
partial descriptions for non-rigid 3D shape retrieval. This approach is fundamentally
different from most existing approaches to 3D shape retrieval. In addition, a
segmentation method is adopted to detect points using the improved DoG based on
curvature flow. Furthermore, an algorithm is used to measure the similarity among
segmented parts of different objects.
1. Method
Few approaches utilize 3D partitions as input in non-rigid retrieval tasks. In this section,
a segmentation approach based on Core Extraction is adopted. The complete and joint
partitions included in this approach are thought to be more applicable to non-rigid 3D
shape retrieval.
• Keypoint extraction: the feature points that satisfy the following condition are
identified and used to guide the segmentation process:
$$\sum_{\nu_i \in S} \mathrm{GeodDist}(\nu, \nu_i) > \sum_{\nu_i \in S} \mathrm{GeodDist}(\nu_n, \nu_i) \qquad (1)$$
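Eq. (1) can be checked on a toy example; a hypothetical sketch (not the authors' code), assuming a precomputed all-pairs geodesic distance matrix `geod` and an adjacency list `neighbors` — both illustrative names — so that a vertex is selected when its summed geodesic distance to the sample set S exceeds that of every neighbor:

```python
import numpy as np

def feature_points(geod, neighbors, sample):
    """Select vertices satisfying Eq. (1): the summed geodesic distance
    from vertex v to the sample set S exceeds that of every neighbor,
    i.e. v is a local maximum of geodesic prominence.

    geod      -- precomputed all-pairs geodesic distance matrix (n x n)
    neighbors -- adjacency list: neighbors[v] = list of adjacent vertices
    sample    -- indices of the sample set S
    """
    score = geod[:, sample].sum(axis=1)   # sum over v_i in S of GeodDist(v, v_i)
    return [v for v in range(len(score))
            if all(score[v] > score[n] for n in neighbors[v])]

# Toy path graph 0-1-2-3-4: the two endpoints are the geodesic extremities.
geod = np.abs(np.subtract.outer(np.arange(5), np.arange(5))).astype(float)
nbrs = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(feature_points(geod, nbrs, sample=[0, 1, 2, 3, 4]))  # [0, 4]
```

On a real mesh the same criterion picks out extremities such as hands, feet, and ears, which matches the repeatable detections reported below.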
Figure 1. Feature Points Detected by SP, Mesh-DoG, and the Proposed Algorithm.
Note that for a non-rigid 3D shape retrieval system, a robust feature point extraction
approach is needed to detect repeatable points across models in different poses. The
proposed algorithm, based on Mesh-DoG and SP, is presented in Algorithm 1.
1.3. Description
The proposed description can be divided into two types: a global descriptor of the
entire 3D model and a set of partial descriptors.
BoF-HKS was used as the global feature since it has proven robust and effective
against non-rigid transformations.
The following algorithm was used to select the set of partial descriptors:
• Given a partial descriptor, a set of feature points C can be obtained directly.
First, the number of feature points in C is determined. If the number of feature
points is less than a predetermined size, the set is discarded, since the
performance of the set decreases as the number of feature points decreases.
• The neighbors surrounding the feature points are added to C in order to
improve the retrieval results.
• The BoW method is used to generate the feature representation of the partial
descriptor and normalize the feature vector.
In order to determine how similar two objects are, the distances between pairs of
descriptors must be computed using a dissimilarity measure. In addition, a linear
combination is applied between the global and partial descriptor distance as suggested
by Sipiran et al. [11]. The global and partial descriptions of two 3D mesh models O and
Q can be expressed as:
$$D_O = \{(G_O, P_O) \mid G_O \in \mathbb{R}^n,\; P_O = \{p_O^1, p_O^2, \ldots, p_O^m\},\; p_O^i \in \mathbb{R}^n\} \qquad (2)$$
$$D_Q = \{(G_Q, P_Q) \mid G_Q \in \mathbb{R}^n,\; P_Q = \{p_Q^1, p_Q^2, \ldots, p_Q^k\},\; p_Q^i \in \mathbb{R}^n\} \qquad (3)$$
In addition, the matching distance can be expressed as:
$$d(D_O, D_Q) = \mu \|G_O - G_Q\|_1 + (1 - \mu) \|P_O - P_Q\|_1 \qquad (4)$$
where $\mu$ weighs the degree of correspondence between the two distances, $G_i$ is the
global description, and $P_i$ is the partial description.
When computing the global-to-global distance, the L1 norm was applied instead of L2
[11]; the retrieval results with the L1 and L2 distances are compared in the following
section. When computing the part-to-part distance, the degree of correspondence
between the two sets {PO} and {PQ} is unknown. In this section, when dealing with
two parts, all possible corresponding distances were computed, but only the best and
second-best distances were considered, as suggested by Lowe [2]. When the difference
between the best and second-best distance was less than 0.2, the part was assumed to
correspond to the other part; otherwise, the part was assumed to have properties
similar to the other part, such as those shared by the right and left hands.
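The matching distance of Eq. (4) can be sketched as follows; this is a simplified illustration (not the authors' code) in which the part-to-part term accumulates, for each part of O, the best L1 distance to any part of Q, omitting the best/second-best acceptance rule described above; all names are illustrative:

```python
import numpy as np

def match_distance(G_O, G_Q, P_O, P_Q, mu=0.7):
    """Eq. (4): d = mu * ||G_O - G_Q||_1 + (1 - mu) * (part-to-part term).
    The part term here simply takes, for each part of O, the smallest
    L1 distance to a part of Q (a simplification of the ratio test)."""
    d_global = np.abs(np.asarray(G_O) - np.asarray(G_Q)).sum()
    d_part = sum(min(np.abs(np.asarray(p) - np.asarray(q)).sum() for q in P_Q)
                 for p in P_O)
    return mu * d_global + (1 - mu) * d_part

# Toy descriptors: global L1 distance is 2, the single part of O matches
# a part of Q exactly, so d = 0.7 * 2 + 0.3 * 0.
print(match_distance([1, 0], [0, 1], [[1, 1]], [[1, 1], [0, 0]]))  # 1.4
```

With mu = 0.7 (the optimum reported below), the global term dominates while exact partial matches still lower the distance.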
2. Experiment
The feature extraction results are shown in Figure 2. The models were obtained from
the database provided in [16], and a total of 900 points were extracted from each mesh.
The first four models shown in Figure 2 are in various poses; the next three models
have shot noise, local scaling, and global scaling with a similar pose. Feature points on
the ears, hands, and feet were repeatedly extracted.
Table 1. Repeatability at radius=5 of my approach and the Mesh-DoG (mean curvature) feature detection
algorithm. Average number of detected points: 392

                             Strength
Transf.       Method        1      2      3      4      5
Isometry      My Appro.     91.69  96.44  93.86  90.93  92.84
              Mesh-DoG      97.75  98.13  97.92  97.14  97.70
Scaling       My Appro.     97.02  96.89  95.50  95.34  94.63
              Mesh-DoG      98.00  98.00  98.00  98.00  98.00
Shot Noise    My Appro.     99.23  98.85  98.46  98.10  97.99
              Mesh-DoG      98.25  98.00  98.00  97.87  97.75
Average       My Appro.     95.98  97.39  95.94  94.79  95.15
              Mesh-DoG      98.00  97.84  97.97  97.67  97.82
Table 2. Repeatability at radius=5 of my approach and the SP feature detection algorithm. Average number
of detected points: 205

                             Strength
Transf.       Method        1      2      3      4      5
Isometry      My Appro.     88.48  93.31  91.9   89.79  91.79
              SP            79.01  83.5   83.9   84.33  84.79
Scaling       My Appro.     93.86  92.83  92.16  91.5   91.33
              SP            84.68  82.36  80.77  78.98  77.42
Shot Noise    My Appro.     98.28  96.57  95.59  94.37  93.14
              SP            77.78  73.31  66.06  62.25  59.68
Average       My Appro.     93.54  94.24  93.22  91.89  92.09
              SP            80.49  79.72  76.91  75.19  73.96
my approach takes second place considering all evaluation measures. Although it
performs slightly worse than Mesh-DoG, it clearly outperforms it on shot noise, and it
yields better results than SP under the same condition: operating directly on the
coordinates.
An HKS interval of 60 and a visual dictionary size of 40 were used in the retrieval
process. The segmentation database provided in [19], which includes 380 meshes
across 19 object categories, was used to compare the retrieval performance of the
proposed model, SP, and Mesh-DoG based on their precision-recall curves for
different values of μ. Mean Average Precision (MAP), First Tier (FT), and Nearest
Neighbor (NN) measurements were also used to evaluate the three methods [6].
Unfortunately, the Core Extraction code could not be obtained for segmentation; thus,
the three measurements were implemented using the database provided in [19].
The recall-precision plot (PR curve) shown in Figure 3 illustrates that the value of
μ influences the retrieval results. The model yielded poor retrieval results when all of
the matching distance was determined by the partial descriptors (μ=0), likely because
two models within the same class lack common parts, resulting in faulty segmentation.
However, when μ=1, BoF-HKS was directly applied to the dataset as in SHREC'15,
yielding acceptable results. The optimum results were obtained when μ=0.7. As the
results show, the performance of the proposed method was higher than that of
BoF-HKS (μ=1) and Part-HKS (μ=0). Therefore, only μ values of 0, 0.7, and 1 were
considered in the analysis.
The MAP, FT, NN, and global difference results are displayed in Figure 4. The
results obtained when μ=0 were identical because the global distance has no influence
there. In addition, when computing the global distance, L1 yielded better results than
L2. Furthermore, the proposed method performed better than BoF-HKS (μ=1).
Therefore, the proposed method can improve the retrieval performance of non-rigid
3D shape systems.
3. Conclusion
In this paper, a new feature point detector based on curvature flow was developed.
The proposed method operates on coordinates using curvature information but does
not require the computation of extra curvature information. In addition, a new
distance combining global and partial representations is proposed for non-rigid 3D
shape retrieval. The Core Extraction method was used to generate complete, joint, and
functional parts, and the best and second-best distances were used to evaluate the
correspondence between parts of different models.
According to the experimental data, when applied to non-rigid 3D shape retrieval
the proposed method outperformed approaches that use only a global or only a partial
representation. However, its performance was limited when two models within the
same class did not share enough parts, possibly due to faulty segmentation.
Regardless, the proposed method offers new representational capabilities for non-rigid
3D shape retrieval.
Acknowledgement
T.-W. Yuan et al. / Research on a Non-Rigid 3D Shape Retrieval Method 569
570 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-570
Introduction
another server. In research on virtual machine relocation, some work sets resource utilization as the target, some is driven by profit, and other work sets energy consumption as the goal. Many relocation methods that consider energy consumption [4] reduce it without treating the on-demand service quality of virtual machines as a key concern. In some practical application scenarios, the relocation process must reduce energy consumption while also ensuring a good user experience. The contributions of this paper are: (1) a virtual machine relocation strategy that coordinates energy and performance in the cloud is introduced; (2) the strategy effectively avoids unnecessary migration by using an autoregressive model to predict performance in the next period.
The rest of this paper is organized as follows: Section 1 discusses related work; Section 2 presents the strategies for VM relocation; Section 3 conducts experiments to verify the strategy; Section 4 concludes the paper.
1. Related Work
Figure 1 shows a virtual machine relocation architecture consisting of three parts: the Decider, the Polynomial with Lasso Energy model based on Resource Utilization (PLERU), and the Performance Monitoring and Tracing module (PMT). The Decider module determines which virtual machines need to be relocated and where they should go. In each period, PLERU collects the usage of virtual and physical machines, including CPU and memory utilization rates, to model the energy consumption. PMT monitors the performance indicators of each virtual machine. Based on the output of the PLERU and PMT modules, the Decider module calculates the number of virtual machines that need to be relocated and their destinations.
Figure 1. The Architecture of Virtual Machine Relocation with Energy-Performance Awareness
In the Decider module, this paper designs a virtual machine relocation strategy. When choosing the virtual machines to be relocated, we use an autoregressive time-series prediction model to predict the SLA value in the coming period. The prediction considers the influence of a relocation on the remaining virtual machines, so that a virtual machine whose SLA violation is only a temporary effect of a load mutation is not relocated. At the same time, the virtual machine that violates its SLA the most is relocated, the original host's resources are released, and the other virtual machines on that host can use more resources. The strategy thus spares other virtual machines from unnecessary migration, reducing overall energy consumption and improving the service quality of the virtual machines.
2.1. Select Virtual Machine Using Autoregressive Model Based on Time Series
When choosing virtual machines, in order to avoid SLA violations, this paper adopts a time-series autoregressive model to predict the performance of virtual machines over the next period. The response time is used as the key performance indicator for virtual machines and physical servers. An n-order autoregressive (AR) model is used to predict the next response-time value: the response time T_t at moment t depends only on the values T_{t-1}, T_{t-2}, ..., T_{t-n} before t, so T_t can be expressed as Formula (1):
T_t = φ_1 T_{t-1} + φ_2 T_{t-2} + ... + φ_n T_{t-n} + a_t   (1)
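Formula (1) can be fitted by ordinary least squares on the recorded response-time history. The following is a minimal plain-Python sketch for n = 2; the synthetic series and the coefficient values (0.6, 0.3) are assumptions for illustration, not measured data.

```python
# A minimal least-squares fit of the AR model of Formula (1) with n = 2.
# The synthetic response-time series below is an illustrative assumption:
# it follows T_t = 0.6*T_{t-1} + 0.3*T_{t-2} exactly, so the fit should
# recover the coefficients and predict the next value.

def fit_ar2(series):
    """Fit T_t = p1*T_{t-1} + p2*T_{t-2} by solving the 2x2 normal equations."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for t in range(2, len(series)):
        x1, x2, y = series[t - 1], series[t - 2], series[t]
        a11 += x1 * x1; a12 += x1 * x2; a22 += x2 * x2
        b1 += x1 * y; b2 += x2 * y
    det = a11 * a22 - a12 * a12
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det

series = [100.0, 110.0]                    # response times in ms (assumed)
for _ in range(18):
    series.append(0.6 * series[-1] + 0.3 * series[-2])

p1, p2 = fit_ar2(series)
pred = p1 * series[-1] + p2 * series[-2]   # predicted next response time
print(p1, p2, pred)
```

A prediction above the SLA threshold would flag the virtual machine as a relocation candidate; a prediction back within bounds indicates only a temporary load mutation.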
X. Li et al. / VM Relocating with Combination of Energy and Performance Awareness 573
In host selection, the principle is to choose a host that will not violate the SLA and that has the maximum residual energy. To calculate the host energy, the polynomial with lasso energy model based on resource utilization (PLERU) is adopted.
Usually, CPU and memory energy consumption are the main components of total server energy consumption, accounting for 58% and 28% of the total, respectively [18]. Therefore, our work mainly considers CPU and memory consumption. CPU energy consumption and memory energy consumption are related to CPU utilization and memory usage, respectively, but not through a simple linear relationship. An energy consumption model based on a multivariate regression strategy is therefore designed. In PLERU, a multiple linear regression model of energy consumption over CPU utilization and memory utilization is established. The energy consumption is given as Formula (2):
y_i = β_0 + β_1 x_i^cpu + β_2 x_i^mem + ε_i,  i = 1, ..., n,
E(ε_i) = 0,  Var(ε_i) = σ²   (2)
where y_i stands for the measured energy consumption; x_i^cpu and x_i^mem are the measured CPU utilization and memory utilization, respectively; ε_i stands for unobservable random errors, i = 1, 2, ..., n; and β_0, β_1, β_2 are the regression coefficients.
The absolute values of the model's regression coefficients are used as a penalty (the lasso) to compress the coefficients, so that coefficients with small absolute values are automatically compressed to 0. In this way, the selection of significant variables and the estimation of the corresponding parameters are achieved simultaneously. During training, a balance between overfitting and underfitting must be achieved.
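The lasso fit described above can be sketched with a small coordinate-descent routine. The utilization data, the λ value, and the extra irrelevant feature below are illustrative assumptions; the point is that the irrelevant coefficient is compressed toward zero while the CPU and memory coefficients are recovered (with a slight shrinkage controlled by λ).

```python
# Minimal coordinate-descent lasso for the energy model of Formula (2):
# y = b0 + b1*cpu + b2*mem. The synthetic data and the lambda value are
# illustrative assumptions; a third, irrelevant feature shows how the L1
# penalty compresses small coefficients toward exactly zero.

def soft_threshold(z, g):
    """S(z, g) = sign(z) * max(|z| - g, 0)."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def lasso_cd(X, y, lam, iters=500):
    """Minimize 0.5*||y - b0 - X.beta||^2 + lam*||beta||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    b0 = sum(y) / n
    for _ in range(iters):
        # Update the unpenalized intercept.
        b0 = sum(y[i] - sum(X[i][j] * beta[j] for j in range(p))
                 for i in range(n)) / n
        for j in range(p):
            # Correlation of feature j with the partial residual.
            rho = sum(X[i][j] * (y[i] - b0 - sum(X[i][k] * beta[k]
                      for k in range(p) if k != j)) for i in range(n))
            zj = sum(X[i][j] ** 2 for i in range(n))
            beta[j] = soft_threshold(rho, lam) / zj
    return b0, beta

cpu = [0.2, 0.5, 0.8, 0.3, 0.9, 0.6, 0.4, 0.7]     # CPU utilization (assumed)
mem = [0.3, 0.4, 0.6, 0.7, 0.5, 0.2, 0.8, 0.35]    # memory utilization (assumed)
irr = [0.5, 0.1, 0.9, 0.4, 0.2, 0.8, 0.3, 0.6]     # feature unrelated to energy
y = [50 + 120 * c + 40 * m for c, m in zip(cpu, mem)]
X = [[c, m, a] for c, m, a in zip(cpu, mem, irr)]
b0, beta = lasso_cd(X, y, lam=1.0)
print(b0, beta)
```

Raising λ strengthens the compression (more underfitting); λ = 0 reduces to ordinary least squares (risking overfitting), which is the balance the text refers to.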
3. Experiments
The response time output by the TPC-W benchmark, collected by PMT, is regarded as the response time of the virtual machine. The experimental results are shown in Figures 3 to 5.
by a certain strategy, the response times of the virtual machines are shortened. The RelocatEP strategy guarantees that the response time does not exceed 5000 ms after virtual machine relocation in all three situations, effectively ensuring the user experience. From Figure 5, it can be seen that its number of migrated virtual machines is the smallest. This is because RelocatEP uses the autoregressive prediction process to forecast the performance of a virtual machine in the coming period, which helps it handle performance violations caused by temporary load mutations.
In the experiments, compared with the MM+GAPA and MMT+PABFD methods, our strategy saves more energy than MM+GAPA but less than MMT+PABFD. However, in terms of response time and the number of virtual machine migrations, RelocatEP outperforms the others by at least 30% and 10%, respectively.
In summary, compared to MM+GAPA and MMT+PABFD, RelocatEP is good at reducing energy consumption and avoiding unnecessary virtual machine migration while guaranteeing the SLA.
4. Conclusions
In this paper, we propose the RelocatEP strategy. It uses a three-threshold method to choose the virtual machines to relocate. It then adopts an autoregressive model based on time series to predict SLA violations in the next time period, so as to detect whether a virtual machine's SLA violation is caused merely by a temporary load mutation, thereby avoiding unnecessary migration. When choosing a host, the performance and energy consumption of the host are balanced. The experimental results show that RelocatEP guarantees the user's SLA, provides an effective way to reduce energy consumption, and avoids unnecessary virtual machine migration. In future work, we will study cost optimization across different virtual machine migration strategies.
Acknowledgements
References
[1] J. S. Yan, S. Ali, S. Kun, et al. State-of-the-art research study for green cloud computing. The Journal of
Supercomputing, 65(2013): 445-468.
[2] J. Dong, X. Jin, H. Wang, et al. Energy-saving virtual machine placement in cloud data centers//
Proceedings of 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
(CCGrid), IEEE, 2013: 618-624.
[3] Y. Wang, X. Wang, M. Chen, et al. Partic: Power-aware response time control for virtualized web servers.
IEEE Transactions on parallel and distributed systems, 22(2011): 323-336.
[4] J. W. Jang, M. Jeon, H. S. Kim, et al. Energy Reduction in Consolidated Servers through Memory-Aware
Virtual Machine Scheduling. IEEE Transactions on Computers, 60(2011): 552-564.
[5] A. K. Das, T. Adhikary, M. A. Razzaque, et al. An intelligent approach for virtual machine and QoS
provisioning in cloud computing// Proceedings of The International Conference on Information
Networking 2013 (ICOIN). IEEE, 2013: 462-467.
[6] D. Hu, N. Chen, S. Dong, et al. A user preference and service time mix-aware resource provisioning
strategy for multi-tier cloud services. AASRI Procedia, 2013, 5: 235-242.
[7] V. Mann, A. Vishnoi, A. Iyer, et al. VMPatrol: Dynamic and automated QoS for virtual machine
migrations// Proceedings of the 8th International Conference on Network and Service Management.
International Federation for Information Processing, 2012: 174-178.
[8] R. K. Sharma, P. Kamal, S. P. Singh. A latency reduction mechanism for virtual machine resource
allocation in delay sensitive cloud service//Green Computing and Internet of Things (ICGCIoT), 2015
International Conference on. IEEE, 2015: 371-375.
[9] A. Beloglazov, R. Buyya. Managing overloaded hosts for dynamic consolidation of virtual machines in
cloud data centers under quality of service constraints. IEEE Transactions on Parallel and Distributed
Systems, 24(2013): 1366-1379.
[10] S. H. Wang, P. P. W. Huang, C. H. P. Wen, et al. EQVMP: Energy-efficient and QoS-aware virtual
machine placement for software defined datacenter networks// Proceedings of The International
Conference on Information Networking 2014 (ICOIN2014). IEEE, 2014: 220-225.
[11] Y. Kessaci, N. Melab, E. G. Talbi. A Pareto-based metaheuristic for scheduling HPC applications on a
geographically distributed cloud federation. Cluster Computing,16(2013):451-468.
[12] T. Mastelic, A. Oleksiak, H. Claussen, et al. Cloud computing: Survey on energy efficiency. ACM
Computing Surveys, 47(2015): 33.
[13] A. Kansal, J. Liu, A. Singh, et al. Semantic-less coordination of power management and application
performance. ACM SIGOPS Operating Systems Review,44(2010): 66-70.
[14] G. Jung, M. A. Hiltunen, K. R. Joshi, et al. Mistral: Dynamically managing power, performance, and
adaptation cost in cloud infrastructures// Proceedings of IEEE 30th International Conference on
Distributed Computing Systems (ICDCS), IEEE, 2010: 62-73.
[15] A. Beloglazov, R. Buyya. Optimal online deterministic algorithms and adaptive heuristics for energy
and performance efficient dynamic consolidation of virtual machines in cloud data centers.
Concurrency and Computation: Practice and Experience, 24(2012): 1397-1420.
[16] A. Beloglazov, J. Abawajy, R. Buyya. Energy-aware resource allocation heuristics for efficient
management of data centers for Cloud computing. Future Generation Computer Systems,28(2012):755-
768.
[17] L. L. Xiao, H. Z. Xi. A virtualized cloud computing data center energy aware resource allocation
mechanism. Computer Application, 33(2013): 3586-3590. (in Chinese)
[18] N. Quang-Hung, P. D. Nien, N. H. Nam, et al. A genetic algorithm for power-aware virtual machine
allocation in private cloud//Proceedings of Information and Communication Technology-EurAsia
Conference. Springer Berlin Heidelberg, 2013: 183-191.
Fuzzy Systems and Data Mining II 579
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-579
Introduction
Many real-world physical, biological, and social complex systems, such as transportation, the nervous system, and social relations, can be abstracted as topological network models with nodes representing individuals and edges representing interactions.
Since the 1990s, many models have been proposed to characterize the properties of complex networks, such as the small-world and scale-free properties. Among them, the small-world model proposed by Watts and Strogatz [1] and the growth preferential-attachment model proposed by Barabási and Albert (the BA model) [2] are recognized as pioneering works. (In fact, as early as 1965, Price proposed the degree-based preferential attachment mechanism in his study of citation relations [3].) Inspired by the BA model, many network evolution models have been proposed that consider local preferential attachment [4], age-based preferential attachment [5], and fitness [6].
Beyond the topological structure of networks, some recent studies take the locations of nodes into consideration. Dynamic random geometric graphs are a basic framework for such models, and the networks they generate are more similar to real complex networks. Krioukov et al. found that hyperbolic geometry is the hidden geometry of networks with a power-law degree distribution, and built a network navigation algorithm [7]. Xie et al. constructed geometric graph models for citation, cooperation, and Internet networks, using geometric distance to describe the correlations between nodes [8].
Note that most studies rarely consider the various interactions between the nodes of a network, which are hard to describe by simple links. In protein networks, different types of proteins have two diametrically opposite effects, mutual promotion or suppression; and in a social network, the edges between individuals can represent both cooperation and competition. In this paper, under the framework of game theory, strategies are used to mimic the relationships between nodes in networks. A network evolution model is proposed in which a new node first selects its neighbors by preferential attachment; then, in order to reach an expected payoff, it changes some neighbors according to the coordination game. Two neighbor-changing methods are considered in the model: one only breaks links (NEB), while the other rebuilds links (NER). Theoretical analysis and experimental results show that networks generated by both the NEB and NER models follow power-law degree distributions in the large-degree regime. By controlling the expected payoff rate, the fraction of small-degree nodes and the clustering coefficient can be made more similar to those of real-world networks.
1 Corresponding Author: En-Ming Dong, National University of Defense Technology, Changsha, Hunan,
1. The Model
Cooperation and competition are common phenomena in real life. Game theory, an important branch of operations research with applications in economics, the military, and psychology, studies them mathematically. A game model can take many forms depending on the real-life situation, but it is essentially composed of three basic elements: the players, the strategy set, and the payoff matrix. Players are the game participants, who choose strategies; at least two players are needed in one game. Each player i can take a strategy s_i. When all players' strategies are chosen, a game situation x is formed, and the payoff of each player is a function of the situation x. A basic game model is defined below.
Definition 1 A basic game model is a triple Γ = (N, S, P), where N is a non-empty finite set of players, S is a non-empty strategy set, and P is the payoff matrix of two or more players under different strategies [9].
         n2
n1        C      D
C        a, a   0, b
D        b, 0   c, c
In real life, people are usually influenced by their neighbors: for example, they consult their neighbors when buying a product, and consider compatibility with others when buying software. Such conformity can be described by a majority game, in which people are pushed to take actions or strategies consistent with the majority of their neighbors. In fact, people can take two actions to achieve consistency with their neighbors: one is to change their own actions or strategies; the other is to change their neighbors, choosing neighbors with the same interests. Online social software helps people choose friends with similar views, behaviors, and hobbies. The coordination game efficiently reflects the payoff of such consistency. A coordination game is a special kind of game in which only players with the same strategy gain income; that is, income comes from the coordination, not the conflict, of strategies. Taking the two-person game as an example, the payoff matrix is defined above.
E.-M. Dong et al. / Network Evolution via Preference and Coordination Game 581
A network evolution mechanism based on the coordination game is proposed; first, we define the node label strategy.
In particular, the label vector and label strategy enrich the network. In a social network, for example, if one component of L is basketball, the corresponding strategy component of node i will be 1 if node i likes basketball and 0 otherwise. According to the BA model, when a new node enters a network, it prefers to link to famous existing nodes, namely nodes with large degree. However, the well-known nodes may not share the same interests and hobbies as the new node; for example, it would be odd to follow a baseball star when you are absorbed in football. So we use the coordination game to refine the selection of neighbors.
Consider a simple coordination game whose payoff matrix is defined below: if two nodes take the same strategy, they both gain 1, and otherwise 0. New nodes can therefore choose neighbors based on the label strategy so as to make their payoff rate higher than a given threshold α.
         n2
n1        C      D
C        1, 1   0, 0
D        0, 0   1, 1
For a given network G = (V, E) with label vector L = (l_1, l_2, ..., l_j), the label strategy of an existing node k ∈ V is S_k, and the new node i's label strategy is S_i. Its neighbors selected by preferential attachment are N = (n_1, n_2, ..., n_m), with corresponding label strategies S_{n_1}, S_{n_2}, ..., S_{n_m}. From a neighbor n_c, node i gets payoff

u_{ic} = Σ_{d=1}^{j} u(S_{n_c}(d), S_i(d)),

in which u(S_{n_c}(d), S_i(d)) is the payoff given by the payoff matrix of the coordination game. So the payoff rate of the new node is

R_{ui} = (Σ_{c=1}^{m} u_{ic}) / (m j).
When R_{ui} is smaller than α, the new node breaks the link with the neighbor of least payoff, repeating until R_{ui} > α. The network evolution algorithm NEB is as follows.
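The payoff and payoff rate above can be computed directly. A minimal sketch with assumed label strategies: under the 0/1 coordination payoff matrix, u_{ic} is simply the number of labels on which node i and neighbor n_c agree.

```python
# Sketch of the payoff-rate computation: under the simple coordination game,
# a new node earns 1 per label on which its strategy matches a neighbor's.
# The label strategies below are illustrative assumptions.

def payoff_rate(s_new, neighbor_strategies):
    """R_ui = sum_c u_ic / (m * j) for the 0/1 coordination payoff matrix."""
    m, j = len(neighbor_strategies), len(s_new)
    total = sum(sum(1 for a, b in zip(s_new, s_n) if a == b)
                for s_n in neighbor_strategies)
    return total / (m * j)

s_i = [1, 0, 1]                                   # new node's label strategy
neighbors = [[1, 0, 1], [0, 0, 1], [1, 1, 0]]     # payoffs 3, 2, 1
rate = payoff_rate(s_i, neighbors)
print(rate)                                       # (3 + 2 + 1) / 9
```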
Algorithm 1
Step 1 Initialize: randomly generate an initial network G = (V, E) with m_0 nodes, a label vector L = (l_1, l_2, ..., l_k), and the corresponding label strategies S_j, j = 1, 2, ..., m_0;
Step 2 Termination condition: if the total number of nodes is larger than N, stop; else go to Step 3;
Step 3 Preferential attachment: a new node i enters the network with label strategy S_i; it first selects m (m < m_0) nodes by preferential attachment, namely the probability of choosing an existing node j is

P_j = d_j / Σ_x d_x;

Step 4 Coordination game: for all neighbors N = (n_1, n_2, ..., n_m) with label strategies S_{n_1}, S_{n_2}, ..., S_{n_m}, calculate the payoffs u_{ic} and the overall payoff rate R_{ui}; for a given threshold α, if R_{ui} > α, go to Step 2; else break the link with the node of least payoff and repeat Step 4.
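The NEB procedure can be sketched as follows. The network size, label count, and parameter values are small illustrative assumptions; note also one reading choice, flagged in the code, that the payoff rate is computed over the remaining neighbors after each break (a literal fixed denominator mj could never rise as links are broken).

```python
import random

# A compact sketch of the NEB evolution (Algorithm 1). Sizes and parameters
# are illustrative assumptions. Links are only broken, not rebuilt (NEB, not
# NER), and the payoff rate is taken over the *remaining* tentative neighbors,
# an interpretation of the algorithm's stopping rule.

def payoff(s_a, s_b):
    # 0/1 coordination game summed over labels.
    return sum(1 for x, y in zip(s_a, s_b) if x == y)

def neb_evolve(n_total, m0=5, m=3, k_labels=2, alpha=0.5, seed=7):
    rng = random.Random(seed)
    strat = [[rng.randint(0, 1) for _ in range(k_labels)] for _ in range(n_total)]
    # Step 1: initial network, here a clique on m0 nodes.
    edges = {(a, b) for a in range(m0) for b in range(a + 1, m0)}
    deg = [m0 - 1] * m0 + [0] * (n_total - m0)
    for i in range(m0, n_total):                      # Step 2: grow to n_total
        # Step 3: preferential attachment, P_j proportional to d_j.
        pool = [j for j in range(i) for _ in range(deg[j])]
        nbrs = set()
        while len(nbrs) < m:
            nbrs.add(rng.choice(pool))
        # Step 4: break least-payoff links until the payoff rate exceeds alpha.
        while nbrs:
            pays = {c: payoff(strat[i], strat[c]) for c in nbrs}
            rate = sum(pays.values()) / (len(nbrs) * k_labels)
            if rate > alpha:
                break
            nbrs.remove(min(nbrs, key=lambda c: pays[c]))
        for c in nbrs:
            edges.add((min(i, c), max(i, c)))
            deg[i] += 1
            deg[c] += 1
    return deg, edges

deg, edges = neb_evolve(200)
print(max(deg), len(edges))
```

Only the new node's tentative links are broken; existing edges are never removed, so early high-degree nodes keep accumulating links, which is what drives the power-law tail.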
To test the performance of the algorithm, we consider the simple case of k = 2. The strategies of the nodes on the network labels are generated randomly. The degree distributions are shown in Figure 1, where the different colors represent the degree distributions for different α, with initial m_0 = 30, m = 20, and N = 10000. When d > 20, the degree distributions of all the generated networks follow power laws. When d ≤ 20, the fraction of small-degree nodes increases as α increases.
[Figure 1: degree distributions on log-log axes; x-axis: degree k.]
It can also be proved that the degree distributions of networks generated by Algorithm 1 follow a power law.
Theorem 1 When a new node i enters the network, the probability that an existing node q is selected as a neighbor of i is independent of α.
Proof: First, node q can be selected by preferential attachment with probability P_q^P = d_q / Σ_j d_j. Then suppose that in the coordination game process node q is retained with probability P_q^G, so the total probability that node q is selected is P_q = P_q^P P_q^G.
Note that Σ_j d_j is twice the number of links in the network, so if we only consider the preferential attachment process, Σ_j d_j ≈ 2m for large m. When the coordination game is also considered,
P_q = P_q^P P_q^G = (d_q / (2m P_q^G)) P_q^G = d_q / (2m).
In Algorithm 1, if the payoff of a built link is 0, the total payoff of the new node is unchanged. However, people care more about the total amount of payoff than about the payoff rate. So after breaking a link, the new node selects one existing node with nearly the same strategy, keeping the total number of neighbors m unchanged. For a payoff rate α, the new node then obtains payoff αm.
In most cases, nodes do not know the strategy information of the other nodes in the network; they can only get information from their neighbors, so new neighbors are selected from the neighborhood of the existing neighbors. The NER algorithm is nearly the same as Algorithm 1; the only difference is that in Step 4, after breaking a link, the new node i chooses one node from the neighborhood of its neighbors according to the payoff.
[Figure 2: degree distributions on log-log axes; x-axis: degree K.]
The degree distribution of networks generated by the NER algorithm also follows a power law; the theoretical proof is similar to that of the model proposed by Holme [10]. Experimental results are shown in Figure 2, in which the fraction of small-degree nodes again increases as α grows. The clustering coefficient of the generated networks is shown in Figure 3: it remains stable as N increases, and it can be adjusted by α, growing as α increases.
[Figure 3: clustering coefficient (0 to 0.3) versus the number of nodes N (0.5 to 3 ×10^4).]
Conclusion
References
[1] D. J. Watts, S. H. Strogatz, Collective dynamics of small-world networks, Nature 393 (1998), 440-442.
[2] A. L. Barabási, R. Albert, and H. Jeong, Emergence of scaling in random networks, Science 286 (1999),
509-512.
[3] D. J. Price, Networks of scientific papers, Science 149(1965), 510-515.
[4] X. Li, G. Chen, A local-world evolving network model, Physica A 328 (2003), 274-286.
[5] S. N. Dorogovtsev, P. L. Krapivsky, J. F. F. Mendes, Transition from small to large world in growing
networks, Europhys. Lett. 81 (2007), 226-234.
[6] R. Albert, A. L. Barabási, Statistical Mechanics of Complex Networks, Rev. Mod. Phys. 74 (2002), 47.
[7] M. Boguna, D. Krioukov, K. Claffy. Navigability of Complex Networks, Nature Physics 5(2009), 74-80.
[8] Z. Xie, Z. Ouyang, J. Li, A geometric graph model for coauthorship networks, Journal of Informetrics
10(2016), 299-311.
[9] D. Fudenberg, Game Theory, The MIT Press 60(1991), 841-846.
[10] P. Holme, B. J. Kim, Growing scale-free networks with tunable clustering, Physical Review E 65(2002),
95-129.
Fuzzy Systems and Data Mining II 585
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-585
Introduction
A wireless sensor network (WSN), which usually consists of a large number of static sensor nodes, has been widely applied in military and civilian fields [1-2]. Target tracking is considered one of the important applications of WSNs, for example the monitoring of wild animals, intruder detection, and surveillance in military areas.
How to improve energy efficiency and tracking performance in WSNs has attracted more and more attention [3]. Sensor node selection plays a significant role in improving both [4-5]. The prediction-based scheme [6] is one such method for selecting the sensor nodes for the next sampling period. Based on the prediction-based scheme, various methods have been proposed to improve energy efficiency, increase tracking performance, and shorten the latency between the sensors and the sink to improve communication performance [7-9]. These methods can be roughly divided into two categories. The first is clustering management methods [2, 10-15], which create multiple clusters of sensor nodes, each consisting of a cluster head node and several cluster member nodes. Cluster
1 Corresponding Author: Yong-Jian YANG, Aeronautics and Astronautics Engineering College, Air Force Engineering University, Xi'an, Shaanxi, China; Email: yangyongjian_king@126.com.
586 Y.-J. Yang et al. / Sensor Management Strategy with Probabilistic Sensing Model
1. Problem Formulation
Suppose a WSN, in which the sensor locations S_i = (x_i^S, y_i^S), i = 1, 2, ..., N_s, are assumed known, is used to track the state of a moving target whose motion equation and observation equation can be expressed as follows:

x(k+1) = Φ(k+1|k) x(k) + w(k)   (1)
z_i(k) = H_i(k) x(k) + v_i(k)   (2)

where x(k) ∈ R^n is the target state vector and z_i(k) ∈ R^m is the measurement vector obtained from the i-th sensor (when the target can be detected by that sensor). w(k) ~ N(0, Q(k)) ∈ R^n and v_i(k) ~ N(0, R_i(k)) ∈ R^m are the process and measurement noise, respectively. Φ(k+1|k) and H_i(k) are constant matrices of suitable dimensions.
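Equations (1)-(2) can be simulated directly. A minimal sketch for a constant-velocity target with state [position, velocity]; the noise levels and the position-only observation H_i = [1, 0] are assumptions for illustration.

```python
import random

# A minimal simulation of the motion/observation model (1)-(2) for a
# constant-velocity target with state x = [position, velocity]. The noise
# levels and H_i = [1, 0] (position-only sensing) are assumptions.

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def simulate(steps, q=0.1, r=0.5, seed=3):
    rng = random.Random(seed)
    dt = 1.0
    phi = [[1.0, dt], [0.0, 1.0]]          # Phi(k+1|k)
    x = [0.0, 1.0]                          # initial position and velocity
    states, measurements = [], []
    for _ in range(steps):
        x = matvec(phi, x)                                          # Phi x(k)
        x = [x[0] + rng.gauss(0.0, q), x[1] + rng.gauss(0.0, q)]    # + w(k)
        z = x[0] + rng.gauss(0.0, r)                                # H x + v(k)
        states.append(list(x))
        measurements.append(z)
    return states, measurements

states, measurements = simulate(20)
print(len(states), measurements[0])
```

Setting q = r = 0 recovers the deterministic trajectory x(k+1) = Φ x(k), which is a convenient sanity check.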
According to the distributed Kalman filtering fusion with feedback [21], the estimated state and its covariance in the fusion center are as follows:

x̂(k|k) = x̂(k|k−1) + P(k|k) Σ_{i=1}^{N} [H_i^T(k) (R_i(k))^{-1} z_i(k)]   (3)

(P(k|k))^{-1} = Σ_{i=1}^{N} (P_i(k|k))^{-1} − (N−1) (P(k|k−1))^{-1}   (4)
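Formula (4) can be checked numerically in the scalar case. The sketch below assumes H_i = 1 and that each local filter starts from the fused prior (the feedback); the fused covariance then coincides with the centralized information-filter posterior. The numbers are illustrative assumptions.

```python
# A scalar (1-D, H_i = 1) sanity check of the covariance fusion rule (4):
# combining N local posteriors that share the fused prior P(k|k-1) should
# reproduce the centralized information-filter posterior.

def fuse_covariance(P_prior, local_posteriors):
    """Formula (4): P(k|k)^-1 = sum_i P_i(k|k)^-1 - (N-1) * P(k|k-1)^-1."""
    n = len(local_posteriors)
    info = sum(1.0 / p for p in local_posteriors) - (n - 1) / P_prior
    return 1.0 / info

P_prior = 4.0                      # fused prediction covariance P(k|k-1)
R = [1.0, 2.0, 0.5]                # per-sensor measurement noise variances
# Local Kalman updates with feedback: each filter starts from the fused prior.
P_local = [1.0 / (1.0 / P_prior + 1.0 / r) for r in R]
P_fused = fuse_covariance(P_prior, P_local)
P_central = 1.0 / (1.0 / P_prior + sum(1.0 / r for r in R))
print(P_fused, P_central)
```

The (N−1) correction removes the prior, which would otherwise be counted once per local filter; without it the fused covariance would be overconfident.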
To the best of our knowledge, existing work on collaborative target tracking in WSNs focuses on using sensors in an energy-efficient way. These methods assume that the target will be detected whenever it is within the sensing range of a sensor node, that sensor nodes cannot move, and, furthermore, that sensor nodes are redundant. However, it is possible that no sensor is available to track the target if the distance between the target and the sensors is large. Therefore, we assume that the WSN includes both static and mobile sensors: a static sensor node cannot move, while a mobile node can move in any direction with limited velocity.
The procedure of the sensor management strategy with mobile sensor nodes in a WSN is described as follows.
Step 1 Potential sensor node selection. Select the sensor nodes whose sensing range includes the predicted target location as the potential sensor nodes of time k. Calculate N_p(k), the number of potential sensor nodes, and set N_u(k) = N_p(k), where N_u(k) is the number of useful sensor nodes. A useful sensor node is one that detected the target, i.e., received echoes from it.
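Step 1 can be expressed compactly. The sensor layout, predicted position, and sensing range below are illustrative assumptions.

```python
# Step 1 as code: select the sensors whose sensing range covers the predicted
# target position. Sensor layout, range, and predicted position are assumed.

def potential_nodes(sensors, predicted, sensing_range):
    px, py = predicted
    return [i for i, (sx, sy) in enumerate(sensors)
            if (sx - px) ** 2 + (sy - py) ** 2 <= sensing_range ** 2]

sensors = [(0.0, 0.0), (10.0, 0.0), (3.0, 4.0), (20.0, 20.0)]
selected = potential_nodes(sensors, predicted=(2.0, 2.0), sensing_range=6.0)
n_p = len(selected)        # N_p(k)
n_u = n_p                  # initially N_u(k) = N_p(k)
print(selected, n_p)
```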
Step 2 Mobile sensor node location update. When the potential sensor nodes include mobile sensor nodes, update the positions of these mobile nodes as follows:
θ = atan( (x̂_1(k|k−1) − x_i^Sm(k−1)) / (x̂_2(k|k−1) − y_i^Sm(k−1)) )   (7)

x_i^Sm(k) = x_i^Sm(k−1) + v sin(θ)
y_i^Sm(k) = y_i^Sm(k−1) + v cos(θ)   (8)
where x̂_1(k|k−1) and x̂_2(k|k−1) represent the predicted target position at time k along the x and y directions, respectively; x_i^Sm(k−1) and y_i^Sm(k−1) represent the position of the mobile sensor node at time k−1 along the x and y directions, respectively; and v is the velocity of the mobile sensor.
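Formulas (7)-(8) can be sketched as follows; atan2 is used in place of the bare arctangent of Formula (7) to keep the heading correct in all quadrants, an implementation choice beyond the paper's formula.

```python
import math

# Sketch of the mobile-node update (7)-(8): steer toward the predicted target
# position and move a distance v along that heading. atan2(dx, dy) replaces
# the bare atan(dx/dy) of Formula (7) so that the heading is correct in every
# quadrant (an assumption beyond the formula as printed).

def move_toward(sensor, predicted, v):
    dx = predicted[0] - sensor[0]       # x-offset, numerator of Formula (7)
    dy = predicted[1] - sensor[1]       # y-offset, denominator of Formula (7)
    theta = math.atan2(dx, dy)
    return (sensor[0] + v * math.sin(theta),   # Formula (8)
            sensor[1] + v * math.cos(theta))

print(move_toward((0.0, 0.0), (3.0, 4.0), v=1.0))   # one step toward (3, 4)
```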
Step 3 Local estimated state and error covariance update. Under the detection probability modeled as (5), update the estimated state and the corresponding error covariance matrix using Kalman filtering, and return these data to the fusion center. When an observation is missed, set N_u(k) = N_u(k) − 1 and return no data to the fusion center.
Step 4 Target trajectory update. Update the estimated state and corresponding covariance of the target using distributed Kalman filtering fusion with feedback, then broadcast the fused estimate and covariance to the potential sensor nodes.
The proposed sensor management strategy does not consider energy efficiency, but it is very useful for improving tracking performance. In some applications, such as battlefield surveillance, sensor nodes are not redundant and are generally rechargeable; thus, energy efficiency is not a priority.
3. Simulation Results
low detection probability for these sensor nodes. In addition, because of the target's maneuvers and the sensor nodes' detection probability, the observations acquired by each sensor node are intermittent.
[Figure 1: the distribution of nodes and the estimated target trajectory, showing the mobile nodes, static nodes, and selected tracking nodes, together with the true target trajectory, the mobile-node trajectories, and the target trajectory estimated using mobile nodes; x-axis: X (m). Figure 2: the indexes of the sensor nodes that detected the target; x-axis: Time (s).]
Figure 3. RMSEs of position, velocity and acceleration. Figure 4. The average number of useful sensor nodes.
Figure 3 shows the RMSEs of position, velocity and acceleration when using mobile sensor nodes and when using only static sensor nodes, averaged over 100 independent simulation runs. The estimated position and velocity of collaborative tracking using static sensor nodes diverge after a target maneuver occurs, whereas collaborative tracking using mobile sensor nodes markedly improves the precision of the estimated target state.
Figure 4 shows the average number of useful sensor nodes over the 100 simulation runs. The number of useful sensor nodes is smaller than the number of potential sensor nodes because of the probabilistic detection of each node. The number of useful sensor nodes when mobile sensor nodes are used is greater than when only static sensor nodes are used, which indicates the higher collaborative detection probability achieved with mobile sensor nodes.
From these simulation results, the following conclusions can be drawn. (1) The number of useful sensor nodes is always less than the number of potential sensor nodes, because the detection probability of each sensor node is less than 1. (2) The collaborative tracking results using static sensor nodes are worse than those using mobile sensor nodes; in particular, when a target maneuver occurs, the tracking results using static sensor nodes even diverge. (3) When some mobile sensor nodes are added to the WSN, the sensor management strategy with mobile sensor nodes proposed in this paper improves the collaborative detection probability and the precision of the estimated state.
4. Conclusions
Focusing on collaborative target tracking in wireless sensor networks with a probabilistic sensing model, this paper has proposed using mobile sensor nodes to counteract the shortcomings resulting from the probabilistic sensing model. Specifically, leveraging the mobile sensor nodes, a sensor management strategy is proposed to increase the number of useful sensor nodes, the collaborative detection probability and the precision of the estimated state. Extensive simulations have been conducted, and the results verify that the proposed sensor management strategy with mobile sensor nodes, based on distributed Kalman filtering fusion with feedback, achieves high performance in terms of the number of useful sensors, the collaborative detection probability and the tracking results.
592 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-592
Introduction
1 Corresponding Author: Yong LI, The 54th Research Institute of China Electronic Technology Group
However, the SCM system is easily plagued by ISI under DS channels, while the OFDM system is impaired by the significant time variations over DS channels with high Doppler [5, 6].
To this end, in this paper we propose the generalized hybrid carrier modulation (GHCM) system with partial fast Fourier transform (FFT) demodulation to mitigate the ISI and ICI over DS channels. The GHCM system merges components of the SCM and OFDM systems. Numerical simulations demonstrate that, with partial FFT demodulation, the GHCM system outperforms both the SCM and OFDM systems over the same DS channels.
This paper is organized as follows. The first section presents the preliminaries used to derive the multi-weighted-type fractional Fourier transform (M-WFRFT) and its key property. The GHCM system with partial FFT demodulation is presented in section two, simulations and discussions are given in section three, and the last section concludes the paper.
1. Preliminary
There are various forms of the multi-WFRFT [7], such as the multi-WFRFT based on the classical fractional Fourier transform (CFRFT) and the one based on the generalized classical fractional Fourier transform (GCFRFT) [8–12]. However, the M-WFRFT based on the 4-weighted-type fractional Fourier transform (4-WFRFT) is popular due to its structure [2]. Moreover, the authors of [7] have provided a theoretical explanation of the multi-WFRFT, but its application to wireless communication has not been explored. In this paper, we focus on the M-WFRFT based on the 4-WFRFT.
Since the M-WFRFT is based on the 4-WFRFT, we first give the definition of the 4-WFRFT of the original signal. For a set of N-length symbols X = {x₁, x₂, ..., x_N}, the α₄-order 4-WFRFT of X can be defined as:

S = F₄^{α₄}[X] = W₄^{α₄} Xᵀ,    (1)

W₄^{α₄} = Σ_{ρ=0}^{3} A_ρ(α₄) F^ρ,    (2)

where F is the Fourier transform matrix, and the weighting coefficient is A_ρ(α₄) = (1/4) Σ_{m=1}^{4} exp(−jmπ(α₄ − ρ + 1)/2), ρ = 0, 1, 2, 3. As mentioned previously, there are many different methods for defining the multi-WFRFT [7]. Based on the 4-WFRFT, we can define the M-WFRFT of the signal X as follows:
594 Y. Li et al. / Generalized Hybrid Carrier Modulation System
F_M^{α_M}[X] = W_M^{α_M} X, M > 4,    (3)

with

W_M^{α_M} = Σ_{ρ=0}^{M−1} B_ρ(α_M) W₄^{(4ρ/M)}, M > 4,    (4)

where

B_ρ(α_M) = (1/M) · (1 − exp[−2iπ(α_M − ρ)]) / (1 − exp[−2iπ(α_M − ρ)/M]), M > 4.    (5)

Property 1 (additivity):

W_M^{α+β} = W_M^{α} W_M^{β} = W_M^{β} W_M^{α}.    (6)
According to the additive property of the 4-WFRFT [2, 5], the proof of Property 1 is straightforward and omitted here. Furthermore, the M-WFRFT degenerates to the 4-WFRFT when M = 4.
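As a quick numerical sanity check, the weighting coefficients A_ρ(α) and B_ρ(α_M) defined above can be evaluated directly. The sketch below uses our own function names (not from the paper) and verifies that the M-WFRFT weights B_ρ(α_M) sum to one when α_M is not an integer:

```python
import cmath

def a_rho(alpha, rho):
    """A_rho(alpha) = (1/4) * sum_{m=1}^{4} exp(-j*m*pi*(alpha - rho + 1)/2),
    the 4-WFRFT weighting coefficient accompanying Eq. (2)."""
    return sum(cmath.exp(-1j * m * cmath.pi * (alpha - rho + 1) / 2)
               for m in range(1, 5)) / 4

def b_rho(alpha_m, rho, big_m):
    """B_rho(alpha_M) of Eq. (5); valid when the denominator is nonzero,
    i.e. when alpha_M - rho is not an integer multiple of big_m."""
    num = 1 - cmath.exp(-2j * cmath.pi * (alpha_m - rho))
    den = 1 - cmath.exp(-2j * cmath.pi * (alpha_m - rho) / big_m)
    return num / den / big_m

# For M = 8, alpha_M = 0.8 (the setting used in the simulations below),
# the M-WFRFT weights sum to 1.
total = sum(b_rho(0.8, rho, 8) for rho in range(8))
```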
The baseband model of the GHCM system with partial FFT demodulation is shown in Figure 1. When M = 4, the GHCM system degenerates to the HCM system based on the 4-WFRFT. Assume that a set of N-length symbols S = {s₁, s₂, ..., s_N} lies in the α-order M-WFRFT domain. The signal S is then converted
to the time domain through the −α-order M-WFRFT. The GHCM block inserts a cyclic prefix (CP) of duration Tp ≥ Tl, where Tl is the maximum multipath delay spread. The transmitted
GHCM signal can be given as follows:
D = W_M^{−α} S.    (7)
The signal D is then transmitted serially over the DS channel. After removing the CP, the received signal can be expressed as:

V = Ht S + n,    (8)

where Ht denotes the time-varying channel convolution matrix [1, 13] and n is a vector of additive white Gaussian noise (AWGN) samples.
At the receiver, the time-domain signal V can be converted to the frequency domain through the FFT. However, to mitigate the ICI to some degree, in this paper we employ partial FFT demodulation, which first divides the whole sampling interval [0, T] into P non-overlapping intervals, the pth interval being [(p − 1)T/P, pT/P]. Each interval is processed by an FFT. Let V = {v₁, v₂, ..., v_N} be the signal at the receiver; the output of the partial FFT demodulation is:
Yp = FVp (9)
with
and
In order to further suppress the ICI, an appropriate equalizer is needed in the GHCM system. Assume U_k = {u_{k,1}, u_{k,2}, ..., u_{k,P}} is the optimal compensation for the kth subcarrier. The estimate of X_k can then be expressed as:
in which

U_k = [ Σ_{l=1}^{N} (H_{f,l} u_{l−k})² + (N₀/P) I_P ]^{−1} H_{f,k} u₀,    (13)
Figure 2. BER performance comparison between the GHCM system (M = 8, α = 0.8), SCM and OFDM systems over DS channels
with α + β = −1.
To verify the effectiveness of the GHCM system, we present a simulation over typical DS channels, comparing it with the OFDM and SCM schemes with partial FFT demodulation. We set M = 8 and a division number of 32, which trades off computational complexity against performance. The normalized Doppler frequency is f_d T_d = 0.27, where f_d denotes the maximum Doppler frequency of the signal and T_d is the symbol sampling duration. To simulate DS channels, we assume a ten-tap wide-sense stationary uncorrelated scattering (WSSUS) channel [14].
The simulation results in Figure 2 show that the GHCM system (M = 8, α = 0.8) performs better than the OFDM system with partial FFT demodulation when the division number is 32. Furthermore, the GHCM system with partial FFT exhibits better BER performance than the SCM system with partial FFT for SNR > 15 dB.
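The partial FFT demodulation used above (Eq. (9)) can be sketched as follows. This is an illustrative implementation, not the authors' code: a naive DFT is used for clarity, and the splitting exposes the defining property that the P partial outputs sum to the ordinary full-interval DFT:

```python
import cmath

def dft(x):
    """Naive O(N^2) DFT, adequate for a small illustrative example."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def partial_fft(v, P):
    """Partial FFT demodulation sketch: split the N samples into P
    non-overlapping time intervals, zero the samples outside each
    interval, and take a full-length DFT of every masked copy
    (Y_p = F V_p, Eq. (9)). Assumes len(v) is divisible by P."""
    n = len(v)
    seg = n // P
    outputs = []
    for p in range(P):
        vp = [v[t] if p * seg <= t < (p + 1) * seg else 0 for t in range(n)]
        outputs.append(dft(vp))
    return outputs
```

An equalizer such as Eq. (13) then recombines the P partial outputs per subcarrier; summing them with unit weights simply recovers the ordinary FFT output.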
4. Conclusion
In this paper, we proposed a GHCM system based on the M-WFRFT with partial FFT demodulation over doubly selective channels. We derived the structure of the GHCM system after the preliminaries and explored the properties of the M-WFRFT. Under the considered DS channel, the GHCM system with a proper modulation order is more robust than the OFDM and SCM systems with partial FFT in the moderate-to-high SNR region.
Acknowledgements
This work is supported by the fund of Science and Technology on Communication Net-
works Laboratory under grant number EX156410046 and the 973 Program under Grant
No.2013CB329003. Moreover, this work is also supported by the fund of National Key
Laboratory of Science and Technology on Communications.
References
University, Guangzhou, China
b University of Chinese Academy of Sciences, Beijing, China
Abstract. With the explosive increase in mobile services and user demands, cellular networks will very likely be overloaded and congested in the near future. To cope with this explosive growth in traffic demands given the limited capacity of current networks, opportunistic networks are used to offload traffic from cellular networks to free device-to-device networks. Network coding can make full use of the mobility of nodes and improve the performance of opportunistic traffic offloading. In this paper, we investigate the benefits of applying a form of network coding known as random linear coding (RLC) to unicast applications in opportunistic traffic offloading, and we establish a mathematical model to analyze the benefits of RLC. RLC schemes achieve faster information propagation at the price of a greater number of transmissions and considerable storage consumption. To optimize the scheme, we utilize the survival time to control the number of packets in the network and free up storage space at the nodes.
Introduction
Mobile Internet access is becoming ever more popular and today provides various services and applications, including audio, images and video. Mobile Social Networks have begun to attract increasing attention in recent years, and currently a large percentage of mobile data traffic is generated by these mobile social applications and mobile broadband-based PCs [1]. Therefore, mobile data traffic is growing at an unprecedented rate, which causes many problems for network providers. Many studies show that the traffic load on cellular networks may soon reach the networks' critical limit.
To cope with the explosive traffic demands and the limited capacity provided by the current cellular networks, many schemes based on opportunistic networks have been proposed [2][3][4]. An opportunistic network is a self-organizing network that exploits contact opportunities between mobile nodes to achieve data communication; it does not require that a complete communication path exist between source and destination.
1 Corresponding Author: Da-Ru PAN, School of Physics and Telecommunication Engineering, South China
By utilizing delay-tolerant networking, service providers may deliver the information to only a small fraction of selected users
to reduce mobile data traffic, and the selected users then help to further propagate the information among all the subscribed users through their social participation. When the non-selected nodes are within the communication range of the selected nodes, the selected nodes disseminate the information to them; the non-selected nodes in turn disseminate the messages to other nodes that have not yet received them. However, these schemes cannot make full use of the nodes' mobility characteristics: in a delay-tolerant network, information dissemination depends not only on the probability of node encounters but also on the effective information carried by the nodes. Encoding can increase the effectiveness of the information carried by nodes when they are within transmission range.
Our main contribution is that, by extending the results of Xiaolan Zhang et al. [4], we analyze the multiple-source single-destination case and apply this scheme to offloading mobile data. Moreover, we establish a mathematical framework to analyze the packet delivery rate. Another contribution is a new scheme to reduce the number of packets in the network. Under the RLC scheme, when two nodes meet, the probability that they carry useful packets to exchange is higher, so the number of packet copies in the network is larger. Existing replication control schemes mainly assign a certain number of tokens to each packet, but their flooding speed is slow. In our scheme, packets flood until a threshold calculated from the probability model we derive, and are then deleted according to their TTL. The advantage of this scheme is that it achieves much lower latency while controlling the number of packets in the network.
This paper is organized as follows: we briefly review related work in Section 2. Section 3 analyzes offloading mobile traffic with the RLC scheme and optimizes it. Finally, Section 4 concludes the paper.
1. Related Work
A detailed study of linear coding was done by Zhang et al. [4], who focus on the benefits of applying random linear coding to unicast applications in opportunistic networks. Their work exhibits the benefits of RLC schemes through simulations. In our work, we do further research on RLC schemes and apply them to offloading mobile data.
Groenevelt et al. [5] built a stochastic model using the Laplace-Stieltjes transform to analyze the message delay and the number of copies in the network. The work in [6] develops a rigorous, unified framework based on ordinary differential equations (ODEs) to study epidemic routing and its variations, and Lin et al. [7] developed ODE models to analyze the group delivery delay for a single group of single-source single-destination packets under random linear coding and no coding. We establish a mathematical model to analyze the relationship between delivery rate and simulation time.
To control the number of packets in RLC schemes, Zhang et al. [8] introduced a token-based RLC scheme that extends binary spray-and-wait [9][10]. This scheme associates a token number with each generation, limiting the number of times combinations are transmitted, and is implemented in two steps. The first step is token reallocation, which redistributes the tokens of the generations in proportion to the ranks of the nodes; the second
600 J.-K. Jiao et al. / On the Benefits of Network Coding for Unicast Application
step transmits one combination: if the meeting nodes have useful information for each other, they transmit a random linear combination and their token counts decrease by one. In contrast, we utilize the survival time to control the number of packets in the network and free up storage space at the nodes.
In this section, we describe the scheme that uses random linear coding to deliver messages, and apply it to offload mobile data.
Random Linear Coding, as its name implies, generates linear combinations of the encoded packets, each viewed as a vector of symbols from a finite field GF_q of size q [11]. An S-bit packet can be viewed as a vector of d = ⌈S/log₂q⌉ symbols from GF_q. So, we assume that the data can be divided into K packets P_i, i = 1, 2, ..., K, each of size S bits. We view P_i as a vector m_i, i = 1, 2, ..., K, of d symbols from GF_q. Combining the K packets linearly, the encoded packet x can be written as

x = Σ_{i=1}^{K} α_i m_i = (α₁, α₂, ..., α_K) × (m₁, m₂, ..., m_K)ᵀ = α × M    (1)
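Eq. (1) can be illustrated with a short sketch. We take GF_q to be the integers modulo the prime q = 257 purely for illustration (the paper does not fix a field), and the function name is ours:

```python
import random

Q = 257  # a prime, so arithmetic modulo Q forms the finite field GF(Q)

def rlc_encode(packets, rng=random):
    """Random linear coding (Eq. (1)): draw random coefficients alpha_i
    from GF(Q) and form x = sum_i alpha_i * m_i symbol by symbol.
    Each packet is a list of d symbols in the range [0, Q)."""
    d = len(packets[0])
    alphas = [rng.randrange(Q) for _ in packets]
    coded = [sum(a * m[j] for a, m in zip(alphas, packets)) % Q
             for j in range(d)]
    return alphas, coded
```

A destination collects coded packets together with their coefficient vectors and recovers the original K packets by Gaussian elimination over GF(Q) once K linearly independent combinations have arrived.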
We consider a network with N identical mobile nodes. Among the N nodes there are N₁ initial nodes, N₂ intermediary nodes and N₃ destination nodes. The size of the data is M and the size of a packet is m, so the number of packets M/m equals N₁, i.e. N₁ = M/m. In our scheme, every initial node carries one packet and each destination node needs to collect the N₁ packets. If a destination node cannot receive the initial N₁ packets from other nodes within a specified "tolerable" duration, which is related to the data lifetime, it can directly request the packets from the cellular network. We denote by L the total number of packets sent in the core network, by Q the number of useful packets a destination node has received, by W the number of destination nodes that have received the complete data, and by U the number of packets the core network must send. If Q = N₁, the destination node belongs to W and does not need to receive packets from the base station.
2.3. Analysis of the multiple-source single-destination RLC routing scheme via a mathematical model
Next we analyze the multiple-source single-destination RLC scheme. Compared with epidemic routing, its advantage is that it increases the effectiveness of the exchanged information and thus reduces the amount of cellular traffic. In the RLC scheme, each node may carry packets the destination needs, so the Poisson intensity with which a node meets the other nodes at the next moment is Nλ. We assume the probability that a node carries a useful packet when two nodes encounter each other is 1/2, so the effective Poisson intensity is Nλ/2. As introduced above, the probability that a received packet is useful is at least 1 − 1/q, which depends only on the size of the finite field. A destination node recovers the data only once it has collected N₁ linearly independent packets, so it needs to receive at most N₁/(1 − 1/q) packets. Hence, within a period of time T, the probability that a target node receives one packet is P_rlc = 1 − e^{−NλT/2}, and the probability that a target node successfully recovers the initial, undivided data is

P_RLC = (1 − e^{−NλT/2})^{N₁/(1−1/q)}    (2)
Next, we consider the benefits of offloading mobile data under the RLC scheme and epidemic routing. We use relatively small data packets of m bits that can be successfully transmitted within the transmission range, and we calculate the total size of mobile data offloaded and the mobile traffic consumed. In this paper, we use the total number of packets sent in the core network to measure the pros and cons of each routing mechanism. Without a delay-tolerant network, U = N₁ × N₃. Under the multiple-source single-destination RLC scheme, W = N₃ × P_RLC and

U = N₁ + Σ_{i=1}^{N₃−W} (N₁ − Q_i)    (3)
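Eq. (3) is then straightforward to evaluate; a tiny sketch (function and argument names are ours):

```python
def core_network_cost(n1, q_received):
    """Eq. (3): packets the core network must send under the RLC scheme.
    n1 is the number of initial packets; q_received lists, for each
    destination that failed to decode, the useful packets it already holds.
    """
    return n1 + sum(n1 - q for q in q_received)
```

For example, with N₁ = 10 and two unsatisfied destinations holding 4 and 7 useful packets, the core network sends 10 + 6 + 3 = 19 packets.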
In this simulation, we divide the data into 10 packets and encode them (i.e., the number of initial nodes is 10, each carrying one packet according to our scheme),
then set 10 target nodes and select 80 volunteer nodes to help with transmission. Figure 1 plots the delivery rate of epidemic routing and RLC. The flooding speed of the multiple-source RLC routing is faster, and all target nodes successfully receive the data in less time than under epidemic routing. Figure 2 plots the total number of packet copies in the network. The number of packets in the nodes' buffers under RLC routing is larger than under epidemic routing. Another phenomenon shown in Figure 2 is that when the growth of the curves saturates (i.e., every node has all the packets), the final number of packets under RLC routing exceeds that under epidemic routing. The reason is that the RLC scheme may produce linearly correlated encoding vectors during rapid flooding, so it takes up a lot of storage space.
Figure 1. Delivery rate of two schemes Figure 2. The number of packets in the network
To control the number of packets in the network and free up storage space of node,
a scheme is proposed in the following section.
While flooding-based routing schemes have a high delivery probability, they waste a lot of energy and incur a great number of transmissions. To reduce the number of copies in the network, a frequently used scheme is binary spray-and-wait. Zhang et al. [8] proposed a token-based RLC scheme that extends binary spray-and-wait, in which nodes redistribute their tokens in proportion to their ranks.
However, limiting the number of tokens affects the flooding speed. We propose a new replication control scheme based on the TTL of packets. In our scheme, every initial packet generated at the source node is assigned a survival time T_ttl. When two nodes are within transmission range, a new copy with survival time T_ttl is generated and handed over to the other node. This scheme trades a larger number of transmissions for fewer copies in the network. Considering that node mobility shows a very high degree of temporal and spatial regularity, and that each individual returns to a few highly frequented locations with significant probability, the packets stored at the nodes are nearly saturated after the message has flooded for a period of time, and the messages carried by the nodes waste a mass of storage space. Our scheme consists of two phases: a flooding phase and a delete phase. In the flooding phase, given a probability P_RLC, we calculate T based on
the probability model P_RLC = (1 − e^{−NλT/2})^{N₁/(1−1/q)} that we derived. T is used to control the duration of packet flooding. In the delete phase (i.e., after time T), the packets are removed according to their T_ttl, which is determined from P_rlc = 1 − e^{−NλT_ttl/2}, where P_rlc is the probability that one node successfully receives a packet.
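The flooding threshold T can be obtained in closed form by inverting P_RLC = (1 − e^{−NλT/2})^{N₁/(1−1/q)}. A sketch, assuming 0 < P_RLC < 1 (function and parameter names are ours):

```python
import math

def flooding_threshold(p_target, n_nodes, lam, n1, q):
    """Invert P_RLC = (1 - exp(-N*lam*T/2)) ** (N1 / (1 - 1/q)) for T,
    the time at which flooding stops and TTL-based deletion begins."""
    exponent = n1 / (1 - 1 / q)
    p_single = p_target ** (1 / exponent)  # required per-packet probability
    return -2 / (n_nodes * lam) * math.log(1 - p_single)
```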
Figure 3. Packets receiving from base station Figure 4. The number of packets in the network
Figure 5. The number of packets in the network Figure 6. Packets receiving from base station
Figure 3 plots the number of packets that must be received from the base station under three schemes. Our scheme outperforms the other two (binary spray-and-wait, and rank-based packet control): in the same simulation time, it achieves a higher delivery rate. Figure 4 depicts the number of packets in the network for four schemes. The number of packets keeps growing and saturates after a period of time. For the rank-based and binary spray-and-wait schemes we set token = 64; because their flooding speed is slow, the number of packets is relatively small and the delivery rate is low. In our scheme, there is a significant decline at the threshold T, which controls the duration of packet flooding; after time T, the number of packets is relatively stable. Figures 5 and 6 plot the number of packets in the network and the number of packets received from the base station under different values of the survival time T_ttl. A higher T_ttl achieves a higher delivery rate but meanwhile consumes a large amount of storage space.
4. Conclusion
In this paper, we have studied the problem of offloading mobile data traffic from overloaded cellular networks using an RLC scheme in opportunistic networks. Because of its higher degree of randomness compared with non-coding schemes, the RLC scheme increases the delivery rate and reduces the load on the base station. In particular, we analyzed the relationship between delivery rate and simulation time by establishing a mathematical model. To optimize the scheme, we utilized the survival time to control the number of packets in the network and free up storage space at the nodes. We plan to investigate the influence of packet size on the RLC scheme in future work.
Acknowledgements
This study was supported by the National Natural Science Foundation of China under Grant Nos. 61471175 and U1301251, and by the Supporting Plan for New Century Excellent Talents of the Ministry of Education under Grant No. NCET-13-0805.
References
Abstract. Since the number of annually published papers in some citation networks grows linearly, a geometric model is proposed to reproduce some statistical features of those networks, in which the academic influence scopes of the papers are represented by specific geometric areas related to time and space. In the model, nodes (papers) are uniformly and randomly sprinkled onto a cluster of circles of the Minkowski space whose centers are on the time axis. Edges (citations) are linked according to an influence mechanism, whereby an existing paper is cited by any new paper located in its influence zone. To account for citations among papers in different disciplines, an interdisciplinary citation mechanism is added to the model, in which some papers, chosen with a small probability, cite existing papers randomly and uniformly. Different from most existing models, which only study the scale-free tail of the in-degree distribution, this model characterizes the overall in-degree distribution. Moreover, the model can also predict the scale-free tail of the out-degree distribution, which indicates that it is a good tool for research on the evolutionary mechanisms of citation networks.
Keywords. Citation network, Influence mechanism, Interdisciplinary citation
mechanism, Power-law distribution
Introduction
The research of citation networks has drawn increasing attention and been applied to
many fields. It can help scientists find useful academic papers, help inventors find inter-
esting patents, or help judges discover relevant past judgements. The scientific citation
networks considered in this paper are directed graphs, in which nodes represent papers,
while edges stand for the citation relationships between them. Since new papers can only cite already published papers [1], these graphs are acyclic.
Degree distribution is a fundamental research object of citation networks, and a series of models have been proposed to illustrate it. The Price model appears to be the first to discuss cumulative advantage in the context of citation networks and their in-degree distributions [2,3]. The idea is that the rate at which a paper gets new citations
1 Corresponding Author: Qi LIU, National University of Defense Technology, Changsha, China ; E-mail:
liuqi@smail.nju.edu.cn.
606 Q. Liu et al. / A Geometric Graph Model of Citation Networks
should be proportional to the citations that it already has, which leads to a scale-free distribution according to the Price model [4]. Cumulative advantage is also known as preferential attachment in other literature [5]. Eom et al. [1] investigated the microscopic mechanism of the evolution of citation networks by proposing a linear preferential attachment whose initial attractiveness depends on time. Their model reproduces the tails of the in-degree distributions and the phenomenon called "burst": the citations received by a paper increase rapidly in the early years after publication. The above-mentioned models study only the tail of the in-degree distribution, while the two-mechanism model proposed by George et al. [4] characterizes the overall in-degree distribution.
In the study of real-world networks (e.g. citation networks), random geometric graphs have become a hot topic in recent years. Xie et al. [6] define the academic influence scope as a geometric area and present an influence mechanism, whereby an existing paper is cited by any new paper located in its influence zone. Based on this mechanism, they further propose the concentric circles model (CC model), which can fit the scale-free tails of the in-degree distributions of citation networks with exponentially growing numbers of nodes. Nevertheless, the forepart of the in-degree distribution and the out-degree distribution cannot be well fitted by this model.
In reality, node-increment in many current citation networks enjoys a linear growth,
e.g. Cit-HepPh, Cit-HepTh (Figure 1). Therefore, a model with linearly growing node-
increment is proposed. The edges in the model are still linked according to the influence
mechanism, whereas they are revised in that the influence scopes of papers are deter-
mined by their topics and ages (the time that has passed since publication). Differen-
t from the previous models that only focus on the tails of in-degree distributions, the
improved model can well predict the overall in-degree distributions of the real citation
networks. In consideration of the citations among different disciplines in real citation
networks, a mechanism that is referred to as the interdisciplinary citation mechanism is
proposed. Under appropriate parameters, these mechanisms can reproduce the scale-free
tail of the out-degree distribution of citation networks. These results show that our model
can be used as a medium to study the intrinsic mechanism of citation networks.
The structure of this paper is as follows. The model is described in Section 1. The
out-degree and in-degree distributions are analyzed in Sections 2 and 3, and finally the
conclusion is provided in the last section.
Table 1. Some statistical indices of the citation networks.
Networks Nodes Links CC AC AC-In AC-Out PG MO
Cit-HepTh 27770 352807 0.165 -0.030 0.041 0.096 0.987 0.650
Cit-HepPh 34546 421578 0.149 -0.006 0.077 0.112 0.999 0.724
Modeled network 33165 162080 0.393 -0.068 0.316 0.166 0.970 0.967
1 The first two networks are extracted from arXiv and cover papers from January 1993 to April 2003
(124 months) in high energy physics theory and high energy physics phenomenology, respectively [7,8].
The last network is generated according to the generating process of the model, with parameters
m = 15, T = 66, β0 = 0.035, λ = 0.001, S = 66, p = 1, η = 2.5, α = 1.3, r = 0.01, ξ = 2.7, k0 = 6.
2 In the table header, CC, AC, AC-In, AC-Out, PG and MO denote the clustering coefficient, the
assortativity coefficient, the in-assortativity coefficient, the out-assortativity coefficient, the node
proportion of the giant component, and the modularity, respectively.
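As an illustration, the indices of Table 1 can be computed with the networkx library (a hypothetical tooling choice; the paper does not say how they were obtained). The toy citation graph below stands in for the real data, with edges pointing from citing to cited paper:

```python
# Sketch: computing the Table 1 indices with networkx on a toy citation graph.
import networkx as nx
from networkx.algorithms import community

G = nx.DiGraph([(1, 2), (2, 3), (1, 3), (4, 3), (4, 2)])  # citing -> cited
U = G.to_undirected()

cc = nx.average_clustering(U)                                      # CC
ac = nx.degree_assortativity_coefficient(U)                        # AC
ac_in = nx.degree_assortativity_coefficient(G, x="in", y="in")     # AC-In
ac_out = nx.degree_assortativity_coefficient(G, x="out", y="out")  # AC-Out
giant = max(nx.connected_components(U), key=len)
pg = len(giant) / U.number_of_nodes()                              # PG
mo = community.modularity(U, community.greedy_modularity_communities(U))  # MO
```

On the real networks one would build `G` from the arXiv edge lists [7,8] instead of the toy edges.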
Q. Liu et al. / A Geometric Graph Model of Citation Networks 607
[Figure: three panels of monthly paper counts versus time (horizontal axes in months).
(a) Cit-HepTh, fitted by y = 1.543t + 210.7, R2 = 0.7292; (b) Cit-HepPh, fitted by
y = 2.277t + 208.2, R2 = 0.7287; (c) modeled network, fitted by y = 15t, R2 = 1.]
Figure 1. The changing trends of the monthly numbers of papers for the data in Table 1. (a) the
trend for the papers of Cit-HepTh; (b) the trend for the papers of Cit-HepPh; (c) the trend for the
modeled network. All are fitted by linear functions, and the coefficient of determination (R2) is
used to measure the goodness of fit.
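The linear fits of Figure 1 amount to ordinary least squares plus the R2 formula. The sketch below uses synthetic noiseless monthly counts (the actual arXiv counts are not reproduced in the text):

```python
# Sketch: a linear fit and its coefficient of determination, as in Figure 1.
import numpy as np

t = np.arange(1, 125, dtype=float)            # 124 months
counts = 1.543 * t + 210.7                    # noiseless stand-in for monthly counts

slope, intercept = np.polyfit(t, counts, 1)   # least-squares line y = slope*t + intercept
pred = slope * t + intercept
ss_res = float(np.sum((counts - pred) ** 2))
ss_tot = float(np.sum((counts - counts.mean()) ** 2))
r2 = 1 - ss_res / ss_tot                      # coefficient of determination
```

With real, noisy counts the same code yields the R2 values of roughly 0.73 reported in the figure.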
1. The model
Since many journals and databases publish papers monthly or yearly, and papers in the
same issue normally cannot cite each other, models like the Price model or the copy
model, which publish one paper per time step, do not capture the growth trends of
papers. Xie et al. [6] pay attention to citation networks in which the number of papers
published each year grows exponentially. However, in some real citation networks (e.g.
Cit-HepPh and Cit-HepTh), the monthly or annual numbers of published papers grow
linearly (Figure 1). To capture the evolution and features of these networks, a geometric
graph model, in which the node increment per time unit grows linearly, is proposed here.
In our model, a simple spacetime is considered, namely (2+1)-dimensional Minkowski
spacetime with two spatial dimensions and one temporal dimension, so that the time
characteristics of the nodes in citation networks can be modeled. The nodes of the model
are sprinkled uniformly at random onto a cluster of concentric circles (whose centers lie
on the time axis), and their spatial coordinates represent the research contents of papers.
Nodes on different circles are generated in different time units, while those on the same
circle represent papers published in the same issue. The number of nodes on a circle is a
linearly increasing function of the circle's temporal coordinate. In the spacetime, nodes
are identified by their locations (R(t), θ, t), where t is the generation time of the node,
R(t) is the radius of the circle born at time t, and θ is the angular coordinate. Since the
radius R(t) and the time t are in one-to-one correspondence, each node is identified by
its time coordinate t and angular coordinate θ only. The edges of the model are linked
according to the influence mechanism and the interdisciplinary citation mechanism.
Supposing that a modeled network has N(t) = mt papers (m ∈ Z+ ) published at time
t (t = 1, 2, ..., T ∈ Z+ ), including some interdisciplinary papers, the generating process
of the model is listed as follows.
1. Generate a new circle Ct with radius R(t) = N(t)/(2πδ ) (δ ∈ R+ ) centered at
point (0, 0,t) at each time t = 1, 2, ...T ∈ Z+ , sprinkle N(t) nodes (papers) on it
randomly and uniformly, and give each node a coordinate, e.g. the coordinate of
node i is (θi ,ti ).
2. For each node with coordinate (θ, t), the influence zone (academic influence
scope) of the node is defined as an interval of angular coordinate with center θ
and arc-length D = β(θ)/t^α, where α ∈ (1, 2) is used to tune the exponent of
where β0 ∈ R+, λ > 0, S ∈ Z+, and [θ0, θ1], ..., [θS−1, θS] is a specific partition of
[0, 2π] satisfying Δ(θi+1, θi) = 2π(β0 + iλ)^(−η) / Σ_{j=0}^{S−1} (β0 + jλ)^(−η),
i = 0, 1, ..., S − 1, with η > 0, θ0 = 0 and θS = 2π. The aging of the papers' influences
is ignored here due to the short time span of the empirical data (around ten years)
(Table 1).
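Steps 1-2 above can be sketched as follows. The sector partition and the influence-zone test follow the formulas in the text; Steps 3-4 (the interdisciplinary mechanism) are lost to the page break and therefore omitted, and the parameter defaults are illustrative rather than the Table 1 values:

```python
# Sketch of the model's generating process (Steps 1-2 only).
import math
import random

def generate(m=15, T=10, beta0=0.035, lam=0.001, S=66, eta=2.5, alpha=1.3, p=1.0, seed=0):
    rng = random.Random(seed)
    # staircase beta(theta): S sectors with widths proportional to (beta0 + i*lam)^-eta
    w = [(beta0 + i * lam) ** (-eta) for i in range(S)]
    total = sum(w)
    bounds, acc = [0.0], 0.0
    for wi in w:
        acc += 2 * math.pi * wi / total
        bounds.append(acc)

    def beta(theta):
        for i in range(S):
            if theta < bounds[i + 1]:
                return beta0 + i * lam
        return beta0 + (S - 1) * lam

    nodes, edges = [], []            # node = (theta, t, beta(theta)); edge = (t_citing, t_cited)
    for t in range(1, T + 1):
        batch = [(rng.uniform(0, 2 * math.pi), t) for _ in range(m * t)]  # N(t) = m*t papers
        for theta_i, t_i in batch:
            for theta_j, t_j, b_j in nodes:          # only papers from earlier issues
                D = b_j / t_j ** alpha               # angular influence width
                gap = abs(theta_i - theta_j)
                gap = min(gap, 2 * math.pi - gap)    # circular angular distance
                if gap < D and rng.random() < p:
                    edges.append((t_i, t_j))
        nodes.extend((th, tt, beta(th)) for th, tt in batch)
    return nodes, edges

nodes, edges = generate()
```

The test `gap < D` treats D as an angular width centered on the cited paper, matching the condition Δ(θi, θj) < β(θj)/tj^α used in the derivations below.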
[Figure: log-log out-degree distributions with generalized Poisson fits of their foreparts.
(a) Modeled network: a = 3.518, b = 0.2939, R2 = 0.9989, RMSE = 0.001742;
(b) Cit-HepTh: a = 2.636, b = 0.8646, R2 = 0.9815, RMSE = 0.002605;
(c) Cit-HepPh: a = 3, b = 0.788, R2 = 0.9893, RMSE = 0.00211.]
Figure 2. Out-degree distribution. (a) the out-degree distribution of the modeled network; (b)
the out-degree distribution of Cit-HepTh; (c) the out-degree distribution of Cit-HepPh. The fitting
functions of their foreparts are f1(k) = a(a + bk)^(k−1) e^(−a−bk)/k!. The root mean squared error
(RMSE) and the coefficient of determination (R2) are used to measure the goodness of fit.
2. Out-degree Distribution
The out-degree distributions of the real citation networks (Table 1) have scale-free tails
and curved foreparts (Figures 2b, 2c). The curves in the forepart of the out-degree
distributions can be well fitted by the generalized Poisson distribution. In reality, whether
paper j cites paper i is influenced by the number of citations [2,3] and the popularity of
paper i's authors. At the same time, a citation can be viewed as a low-probability event
(the reference list of paper j is very short compared with the large number of papers).
These conditions are suitable for the generalized Poisson distribution. Now the formulas
for the forepart and tail of the out-degree distribution of the modeled network (Table 1)
are derived to reveal the mathematical mechanism behind the phenomenon that our
model generates a similar curve and scale-free tail (Figure 2a).
The edges in the model are linked according to the influence mechanism and the
interdisciplinary citation mechanism. Firstly, the non-interdisciplinary paper i with
coordinate (θi, ti) is considered. For a prior node j with coordinate (θj, tj), tj < ti, node i
is located in the influence zone of node j if Δ(θi, θj) < β(θj)/tj^α. When β(θj)/tj^α is
small enough, β(θi) ≈ β(θj), because β(·) is a staircase function. Then the expected
out-degree of node i is as follows:
k+(θi, ti) = Σ_{tj=1}^{ti−1} [β(θi)p / tj^α] R(tj)δ ≈ mβ(θi)p ti^(2−α) / (2π(2 − α)),   (1)
p(k+(θi, ti) = k) = [k+(θi, ti)]^k e^(−k+(θi, ti)) / k!,   (2)
which is the probability that node i has out-degree k, with the temporal density ρ (ti ). In
this model, ρ (ti ) ≈ 2ti /T 2 . So the out-degree distribution is
pnon(k+ = k) ≈ (1/2π) ∫_0^{2π} ∫_0^{T} ρ(ti) [k+(θi, ti)]^k e^(−k+(θi, ti)) / k! dti dθi,
with k+(θi, ti) = mβ(θi)p ti^(2−α) / (2π(2 − α)).   (3)
It is a mixture Poisson distribution, which is similar to that of the real citation networks.
Moreover, we use the generalized Poisson distribution to fit the curve in the forepart of
the modeled out-degree distribution, and obtain a good result (Figure 2a).
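The mixture character of Eq. (3) can be checked numerically: draw publication times from ρ(t) = 2t/T² and out-degrees from a Poisson with the mean of Eq. (1). The sketch below holds β constant (a simplifying assumption) and compares the sample mean with the analytic mean of the mixture:

```python
# Sketch: sampling the Poisson mixture of Eq. (3) with constant beta.
import math
import random

rng = random.Random(1)
m, beta, p, alpha, T = 15, 0.05, 1.0, 1.3, 66   # illustrative parameters
n = 20000

def poisson(lam, rng):
    # Knuth's product method; adequate for the small means involved here
    L = math.exp(-lam)
    k, prod = 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= L:
            return k
        k += 1

c = m * beta * p / (2 * math.pi * (2 - alpha))
samples = []
for _ in range(n):
    t = T * math.sqrt(rng.random())              # t ~ rho(t) = 2t/T^2 (inverse transform)
    samples.append(poisson(c * t ** (2 - alpha), rng))

avg = sum(samples) / n
# analytic mixture mean: E[k+] = c * integral (2t/T^2) t^(2-alpha) dt = 2c*T^(2-alpha)/(4-alpha)
expected = 2 * c * T ** (2 - alpha) / (4 - alpha)
```

The histogram of `samples` shows the curved forepart seen in Figure 2a; the sample mean agrees with the analytic value to within sampling error.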
The interdisciplinary papers fatten the tail of the modeled out-degree distribution
(Step 4) (Figure 2a). Thus, in combination with the non-interdisciplinary papers, the
out-degree distribution is
p(k+ = k) ≈ (1 − r) pnon(k+ = k) + r f(k),   (4)
where r expresses the proportion of interdisciplinary papers, and f(k) refers to the
power-law distribution mentioned in Step 4.
[Figure: log-log in-degree distributions with generalized Poisson fits of the foreparts and
power-law fits of the tails.
(a) Modeled network: a = 1.225, b = 0.6643, R2 = 0.9994, RMSE = 0.000816; tail:
γ = 2.47, xmin = 12, p = 0.0120, gof = 0.0157, d = 1.579, R2 = 0.9931, RMSE = 0.000095.
(b) Cit-HepTh: a = 1.7, b = 0.8358, R2 = 0.9991, RMSE = 0.000766; tail: γ = 2.7,
xmin = 55, p = 0.0060, gof = 0.0265, d = 70.75, R2 = 0.9107, RMSE = 0.000075.
(c) Cit-HepPh: a = 1.825, b = 0.8789, R2 = 0.9992, RMSE = 0.000546; tail: γ = 3.47,
xmin = 94, p = 0.3500, gof = 0.0237, d = 2610.15, R2 = 0.7899, RMSE = 0.000044.]
Figure 3. In-degree distribution. (a) the in-degree distribution of the modeled network; (b) the
in-degree distribution of Cit-HepTh; (c) the in-degree distribution of Cit-HepPh. The fitting
functions of their foreparts are f1(k) = a(a + bk)^(k−1) e^(−a−bk)/k!, and of their tails
f2(k) = ck^(−γ) (fitted by the method in Ref. [9]).
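The tail fits use the method of Clauset et al. [9], whose core is a maximum-likelihood exponent estimator. A minimal sketch with the standard (xmin − 1/2) correction, checked on exact power-law quantiles (synthetic data, not the networks of Table 1):

```python
# Sketch of the ML power-law exponent estimator of Clauset et al. [9].
import math

def ml_exponent(tail, xmin):
    # gamma_hat = 1 + n / sum(ln(k_i / (xmin - 1/2)))
    n = len(tail)
    return 1.0 + n / sum(math.log(k / (xmin - 0.5)) for k in tail)

# deterministic quantile sample of a power law with gamma = 2.5, xmin = 12
gamma_true, xmin, n = 2.5, 12, 10000
tail = [(xmin - 0.5) * ((i + 0.5) / n) ** (-1.0 / (gamma_true - 1.0)) for i in range(n)]
gamma_hat = ml_exponent(tail, xmin)
```

The full method of [9] additionally selects xmin by minimizing the Kolmogorov-Smirnov distance and assesses goodness of fit by bootstrap, which this sketch omits.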
3. In-degree Distribution
The in-degree distributions of the real citation networks have been investigated, with the
result that the curves in the forepart of the in-degree distributions can be well fitted by
the generalized Poisson distribution (Figures 3b, 3c). Indeed, the citations of a paper are
affected by the new papers of its authors, and the probability of a paper receiving
citations (being selected from a large number of papers) is small and differs from paper
to paper. These are the conditions under which the generalized Poisson distribution
applies. Besides, the in-degree distributions of the real citation networks have scale-free
tails (Figures 3b, 3c), which are usually interpreted as a result of preferential attachment
[2,3]. In this model, this phenomenon is caused by highly cited papers with large
influence zones. Now, an expression for the forepart and tail of the in-degree distribution
of the modeled network is derived to reveal the mathematical mechanism behind the
phenomenon that our model generates a similar curve and scale-free tail (Figure 3a).
A modeled paper i can receive citations from papers located inside or outside of its
influence zone. Therefore, the expected in-degree of paper i with coordinate (θi, ti) is
k−(θi, ti) = Σ_{s=ti+1}^{T} [β(θi)p / ti^α] R(s)δ + r Σ_{s=ti+1}^{T} 2/(s − 1)
           ≈ mβ(θi)p (T² − ti²) / (4π ti^α) + 2r ln((T − 1)/ti).   (5)
When ti is small, the first term in formula (5) dominates the second, so k−(θi, ti) ≈
mβ(θi)pT²/(4π ti^α). The in-degree distribution in the large in-degree region, obtained
by averaging the Poisson distribution, is
p(k− = k) ≈ (1/2π) ∫_0^{2π} [2 a2^(2/α) / (αT²)] k^(−(1+2/α))
            × ∫ exp[−(τ − k + (α+2)/α)² / (2(k − (α+2)/α))] / √(2π(k − (α+2)/α)) dτ dθi,   (6)
where a2 = mβ(θi)pT²/(4π) and τ = a2/ti^α. The Laplace approximation and Stirling's
approximation are used in this derivation. It can be proven that the integral over τ is
approximately independent of k. In this way, the modeled in-degree distribution has a
scale-free tail with exponent 1 + 2/α in the large-k region.
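As a small numerical check of the approximation used in Eq. (5), the interdisciplinary sum r Σ 2/(s − 1) is close to its logarithmic approximation 2r ln((T − 1)/ti) for the Table 1 parameters:

```python
# Sketch: harmonic sum vs. its logarithmic approximation from Eq. (5).
import math

r, T, ti = 0.01, 66, 5
exact = r * sum(2.0 / (s - 1) for s in range(ti + 1, T + 1))
approx = 2 * r * math.log((T - 1) / ti)
```

The two values agree to within a few percent, and the agreement improves as ti and T grow.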
When ti is large, the time derivative of the influence zone of paper i is considered:
(β(θi)/ti^α)′ = −αβ(θi) ti^(−α−1) ≈ 0. This means that the influence zone of paper i in this
model is approximately constant when ti is large. Hence, it is assumed that β(θi)/ti^α ≈ D
(a constant). The expected in-degree of paper i is then k−(θi, ti) ≈ mDp(T² − ti²)/(4π) +
2r ln((T − 1)/ti), and thus the in-degree distribution in the small in-degree region is
p(k− = k) ≈ [4cπ / (mDpT²)] · (mDp(T − 1)²/(4π))^k e^(−mDp(T−1)²/(4π)) / k!
          + [(1 − c)/r] · e^(−2 ln(T−1)) (2r ln(T − 1))^k e^(−2r ln(T−1)) / k!.   (7)
It indicates that the in-degree distribution of the modeled network in the small in-degree
region is a mixture Poisson distribution, which is similar to that of the real citation net-
works. Moreover, the fitting shows that the in-degree distribution of the modeled network
in the small in-degree region can be well fitted by the generalized Poisson distribution
(Figure 3a).
4. Conclusion
References
[1] YH Eom, S Fortunato, Characterizing and modeling citation dynamics. PLoS ONE 6 (2011): e24926.
[2] DJ Price, Networks of scientific papers. Science 149 (1965): 510-515.
[3] DJ Price, A General Theory of Bibliometric and Other Cumulative Advantage Processes. J Am Soc Inf
Sci Technol 27 (1976): 293.
[4] GJ Peterson, S Pressé, KA Dill, Nonuniversal power law scaling in the probability distribution of scien-
tific citations. Proc Natl Acad Sci USA 107 (2010): 16023-16027.
[5] MEJ Newman, Clustering and preferential attachment in growing networks. Phys Rev E 64 (2001):
025102.
[6] Z Xie, ZZ Ouyang, PY Zhang, DY Yi, DX Kong, Modeling the citation network by network cosmology.
PLoS ONE 10 (2015): e0120687.
[7] J Leskovec, J Kleinberg, C Faloutsos, Graph Evolution: Densification and Shrinking Diameters. ACM
TKDD 1 (2007): 2.
[8] J Gehrke, P Ginsparg, J Kleinberg, Overview of the 2003 KDD Cup. SIGKDD Explorations 5 (2003):
149-151.
[9] A Clauset, CR Shalizi, MEJ Newman, Power-law distributions in empirical data. SIAM Rev 51 (2009):
661-703.
612 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-612
Introduction
Since the concept of general systems was first hinted at by von Bertalanffy in the 1920s,
the contents of systems science have come to cover a very wide range of topics, including
human behavioral systems, social systems, mechanical systems, computer and network
systems, intelligent systems, simulation systems, biological systems, aerospace systems,
the earth system, commercial systems, administrative systems, etc. [1]. Research on
complexity is one of the foci of systems science [2,3].
A complex system consists of interconnected or interwoven parts. To understand the
behavior of a complex system one must understand not only the behaviors of the parts,
but also how they act together to form the behavior of the whole [4]. Different methods
have been established to resolve the various kinds of complexity in different systems.
Operations research, information theory, control theory and cybernetics, dissipative
structures, synergetics, complex networks and so on offer theories and methods for
investigating natural and social systems.
This paper aims to analyze the role of complex systems in scientific knowledge. The
Proceedings of the National Academy of Sciences (PNAS, http://pnas.org) is an
important scientific journal and knowledge dataset that publishes highly cited research
reports, commentaries, reviews, perspectives and letters. A data set comprising the
52,025 papers published in PNAS in the years 1999-2012 is used to achieve this aim;
we have analyzed and modeled it to some extent in previous work [5]. Remarkably, the
papers containing both 'complex' and 'system' account for 47% of the total. Research
into the major topics of complex systems, such as complex networks, emergence,
complexity, and uncertainty, also shows similar upward trends.
1 Corresponding Author: Zong-Lin Xie, National University of Defense Technology, Changsha, Hunan,
Coverage in PNAS broadly spans the biological, physical, and social sciences, so we can
analyze the effects of complex systems on research topics in each science. Co-word
occurrence analysis is a content analysis technique, used here to identify the strength of
associations between topic words based on their co-occurrence in the same document.
The resulting co-occurrence frequency matrix can be expressed as a weighted network.
We find that the words 'complex' and 'system' are the core nodes in the co-word
network of each science, which shows the important role of complex systems in every
research field.
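The co-word construction just described can be sketched as follows: count, for every pair of topic words, the number of documents in which both occur, and read the result as a weighted network (toy documents below, not the PNAS corpus):

```python
# Sketch: co-occurrence counts of topic words as a weighted network.
from itertools import combinations
from collections import Counter

topic_words = {"complex", "system", "network", "model"}
docs = [
    {"complex", "system", "model"},
    {"complex", "system", "network"},
    {"system", "model"},
    {"complex", "network"},
]

weights = Counter()                       # edge weight = number of co-occurring documents
for d in docs:
    for a, b in combinations(sorted(d & topic_words), 2):
        weights[(a, b)] += 1

strength = Counter()                      # node strength = sum of incident edge weights
for (a, b), w in weights.items():
    strength[a] += w
    strength[b] += w
```

Core nodes such as 'complex' and 'system' are exactly those with the largest strength in this weighted network.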
[Figure: yearly frequency trends. (a) number of papers co-containing 'complex' and
'system' and co-containing 'complex' and 'network'; (b) frequencies of the words
'model' and 'control'.]
Figure 1. Panel (a) shows the frequency trend of the co-occurrence of 'complex' and 'system' and that of
'complex' and 'network'. Panel (b) shows the frequency trend of the words 'model' and 'control'.
[Figure: yearly frequency trends of hot topics in complex systems (panel (a)) and in
complex networks (panel (b)); legend labels include complex network, complexity,
feedback, emergence, nonlinear, network control and network community.]
Figure 2. The frequency trend of some hot topics in complex systems (panel (a)) and in complex networks
(panel (b)). It can be seen that, compared with network structure, network control has received more attention
from researchers.
To explain the general and fundamental postulates of different systems jointly, be they
natural or social, complex systems theory (or complexity) provides a unified framework
for describing the intrinsic mechanisms underlying such systems and the interactions
between individual systems and their environments. In the complex systems field, many
methodologies, although derived from specific disciplines, offer a possible way to unify
physical, biological and social systems.
In the corpus, there are 40,262 biological papers, which account for 77% of the total.
Meanwhile, PNAS recruits papers and publishes special features in the physical sciences
(8,997 papers in the corpus, including 3,647 papers in biophysics) and social sciences
(1,192 papers in the corpus).
By the analysis above, the PNAS data can mainly be divided into three categories:
biological, physical, and social sciences. We now use the co-word technique to analyze
the relationship between complex systems and the topics in each of those sciences.
Firstly, we use the Natural Language Toolkit (NLTK, www.nltk.org) to build the
wordlist for the corpus by selecting nouns and the words whose synsets contain nouns.
Secondly, we divide most of the papers in the corpus (46,766 papers) into three
overlapping clusterings, i.e., biological, physical, and social sciences, based on the
discipline tags given by the authors; the others are tagged as acknowledgment, letter,
symposium, etc.
Z.-L. Xie et al. / Complex System in Scientific Knowledge 615
Thirdly, for each paper clustering, we select the highly frequent words, which occur in
more than 10% of the papers in the clustering. Fourthly, domain experts in each science
select the most meaningful words as topic words based on their professional knowledge.
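The first step can be sketched as below. The synset lookup is injected so the snippet is self-contained; with NLTK installed one would pass `lambda w: wordnet.synsets(w, pos=wordnet.NOUN)` instead, and the stub wordlist here is invented purely for illustration:

```python
# Sketch: keep words whose WordNet synsets contain nouns (lookup injected).
def build_wordlist(tokens, noun_synsets):
    # noun_synsets(w) -> list of noun synsets for w (empty if none)
    seen, wordlist = set(), []
    for w in tokens:
        lw = w.lower()
        if lw not in seen and noun_synsets(lw):
            seen.add(lw)
            wordlist.append(lw)
    return wordlist

# stub standing in for nltk.corpus.wordnet in this self-contained sketch
fake_wordnet = {"system": ["system.n.01"], "complex": ["complex.n.01"], "of": []}
words = build_wordlist(["system", "of", "Complex", "system"],
                       lambda w: fake_wordnet.get(w, []))
```

Injecting the lookup keeps the filtering logic testable without downloading the WordNet corpus.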
The resulting co-occurrence frequency matrix can be expressed as a weighted network.
The layouts depicted in Figures 3, 4 and 5 are generated by the free software Gephi
(www.gephi.github.io) using the force-directed algorithm proposed by Fruchterman and
Reingold [9]. We find that the words 'complex' and 'system' are the core nodes in the
co-word network of each science (Figures 3, 4 and 5), which reveals the importance of
complex systems in every research field. In what follows, we briefly discuss the role of
complex systems in each science.
Social complexity and its emergent properties are central recurring themes throughout
the historical development of social thought and the study of social change. The early
founders of sociological theory all examined the exponential growth and increasing
interrelatedness of social encounters and exchanges. This emphasis on inter-connectivity
in social relationships and on the emergence of new properties within society is found in
theoretical thinking in multiple areas of sociology. Complexity in the social and
behavioral sciences, referring mainly to complex systems, is found in the study of
modern organizations and management. The focus of related objects of study in topics
of social complexity is also shown in Figure 3.
[Figure: co-word network layout; 'complex' and 'system' are central, surrounded by
labels such as 'social', 'network', 'population', 'human', 'behavior', 'development',
'community' and 'organization'.]
Figure 3. Co-word network of the highly frequent and meaningful words based on the 1,192 social science
papers in the PNAS data (1999-2012).
We randomly select 10 papers from the 450 that belong to social science and contain
'complex' and 'system'. These studies concern the under-representation of women, the
reasons for the Maya abandonments, collaboration in social networks, territorial
expansion, the emergence of segregation in social networks, climate change mitigation,
emerging sign languages, the factors affecting gene expression, climate negotiations,
and the dynamics of the relationship between two peoples.
The complexity of biological science lies not only in the emergence and evolution of
organisms and species, but also in both the structure and function of biological
organisms, with emphasis placed on the complex interactions and the fundamental
relations and relational patterns that are essential to life. A complete definition of
complexity for individual organisms, species, ecosystems, biological evolution and the
biosphere is still an open issue. Specifically, the topics related to complex systems
biology include, but are not limited to: DNA, structure and function, relations and
evolution of organisms and species, interactions among species, evolution theories,
self-reproduction, computational gene models, autopoiesis, protein folding, cell
signaling, signal transduction networks, complex neural nets, and genetic networks.
Many related keywords appear in Figure 4.
[Figure: co-word network layout; visible labels include 'virus', 'infection', 'allele',
'patient', 'drug', 'ligand', 'matrix' and other biological terms.]
Figure 4. Co-word network of the highly frequent and meaningful words based on the 40,262 biological
science papers in the PNAS data (1999-2012).
We randomly select 10 papers from the 19,515 that belong to biological science and
contain 'complex' and 'system'. These studies concern genetic code expansion, the
dynamics of an RNA virus, a disease treatment, the mechanism of DNA transcription,
the effect of the expression of intracellular signaling molecules, the behavior of insects,
changes in gene expression, the factors influencing bacteria growth, the neural
processing of emotional faces, and yeast commitment complex formation.
In the physical world, classical dynamics presents a reversible and symmetric image.
Nowadays, many methodologies from physics offer possible approaches to unifying
natural and social systems. Examples include classical mechanics, friction, patterned
ground, statistical mechanics, electrical networks, temperature, and convection. Most of
the laws of physics themselves appear to be the most fundamental principles in the
universe, raising the question of what might be the most fundamental law of physics
from which all others emerged.
The co-word network (Figure 5) shows the current interest in physical complexity.
Significantly, as Figure 5 shows, some biological topics, such as protein, cell and
molecule, are also hot topics in physical science, because they are hot topics in
biophysics, whose papers account for 42% of the physical science papers in the corpus.
We randomly select 10 papers from the 4,703 that belong to physical science and
contain 'complex' and 'system'. These studies relate to the control of protein crystal
nucleation, intermolecular forces, the permeation mechanism of a mammalian urea
transporter, dynamic force spectroscopy of adhesion bonds, nucleic acid-triggered
catalytic drug release, biological macromolecules, the lipid matrix and tensile force,
protein-protein interactions, the coupling of protein and hydration-water dynamics in
biological membranes, and molecular dynamics simulations.
[Figure: co-word network layout; visible labels include 'proton', 'statistic', 'ray' and 'mass'.]
Figure 5. Co-word network of the highly frequent and meaningful words based on the 8,997 physical science
papers (including 3,647 biophysics papers) in the PNAS data (1999-2012).
3. Conclusion
To show the role of complex systems in scientific knowledge, we apply co-word
occurrence analysis to the corpus of papers published in PNAS during 1999-2012.
Surprisingly, the papers containing 'complex' and 'system' account for 47% of the
total, and the papers containing 'complex' and 'network' account for 13.5%. Research
on the major topics of complex systems, such as complex networks, emergence,
complexity and uncertainty, also shows upward trends. The papers in the corpus mainly
belong to the biological, physical, and social sciences, and we further analyze the effect
of complex systems on research topics in each of those sciences. Frequent and
meaningful words are selected as topics by domain experts, and the co-occurrence
frequency matrix of those topic words is expressed as a weighted network. We find that
'complex' and 'system' are the core nodes in the co-word networks of all sciences,
which shows the importance of complex systems in every research field.
References
[1] D. B. Kenneth, Fifty Years of Systems Science: Further Reflections, Systems Research and Behavioral
Science, 22 (2005), 355-361.
[2] D. Chu, R. Strand, R. Fjelland, Theories of complexity, Complexity, 8 (2003), 19-30.
[3] D. Harel, Statecharts: a visual formalism for complex systems, Science of Computer Programming, 8
(1987), 231-274.
[4] Z. Xie, X. Duan, Z. Ouyang, P. Zhang, Quantitative Analysis of the Interdisciplinarity of Applied
Mathematics, PLoS ONE, 10 (2015), e0137424.
[5] Z. Xie, Z. Ouyang, J. Li, A geometric graph model for coauthorship networks, Journal of Informetrics,
10 (2016), 299-311.
[6] K. K. Mane, K. Börner, Mapping Topics and Topic Bursts in PNAS, Proceedings of the National
Academy of Sciences of the United States of America, 101 (2004), 5287-5290.
[7] A. L. Barabási, The network takeover, Nature Physics, 8 (2012), 14-16.
[8] Y. Y. Liu, J. J. Slotine, A. L. Barabási, Controllability of complex networks, Nature, 473 (2011), 167-
173.
[9] T. M. J. Fruchterman, E. M. Reingold, Graph drawing by force-directed placement, Software-Practice
and Experience, 21 (1991), 1129-1164.
618 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-618
Introduction
In various types of optical metrology, such as digital holography [1], digital speckle
pattern interferometry [2] and X-ray imaging [3], an important problem is that the phase
recovered from the intensity suffers from wrapping caused by the arctangent function,
which confines the obtained phases to the interval [-π, π] as the modulo-2π values of
the true phases. The inverse process, called phase unwrapping [4], aims to recover the
missing integer multiples of 2π from the wrapped phase map and thereby remove the
discontinuities in the phase map.
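The wrapping operation described above, and the goal of unwrapping, can be illustrated in a few lines (a generic sketch, not the method of this paper):

```python
# Sketch: phase wrapping via the complex argument, and 1-D unwrapping.
import numpy as np

true_phase = np.linspace(0, 12 * np.pi, 1000)    # smooth ramp spanning many cycles
wrapped = np.angle(np.exp(1j * true_phase))      # wrapped into (-pi, pi]
recovered = np.unwrap(wrapped)                   # restores the lost 2*pi multiples
```

One-dimensional unwrapping succeeds here because neighboring samples differ by less than π; the spatial methods discussed next generalize this neighborhood reasoning to 2-D phase maps.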
Many algorithms can be used to deal with the phase unwrapping problem, such as the
branch-cut algorithm [5] and the quality-guided method [6]. These methods can be
categorized as spatial phase unwrapping (SPU). In general, for these SPU methods, the
unwrapped phase is derived from a single wrapped phase map only, according to the
neighborhood characteristics of the pixel phase values.
However, these methods share a common limitation: they cannot deal with phase maps
that contain discontinuities, and they fail especially for large-step objects and isolated
objects. Compared to SPU methods, another type of phase unwrapping method, temporal
phase unwrapping (TPU), has been reported [7] for the purpose of resolving the wrapped
phase of objects with large-step discontinuities and separations. Usually, more than one
wrapped phase map, or an additional black/white pattern, is utilized to provide additional
information about the fringe patterns. The most competitive advantage of TPU is that it
is capable of analyzing large-step objects and isolated objects, and noise uncertainties do
not propagate through the entire unwrapping process.
Corresponding Author: Cheng ZHANG, Key Laboratory of Intelligent Computing & Signal Processing, Anhui University, Hefei 230039, China; E-mail: question1996@163.com.
C. Zhang et al. / Two-Wavelength Transport of Intensity Equation for Phase Unwrapping 619
In this paper, a novel phase unwrapping method based on the two-wavelength transport
of intensity equation (TW-TIE) is proposed. Different from the classical transport of
intensity equation (TIE), two illuminations with different wavelengths, rather than two
different propagation distances, are used to obtain two different intensity images, which
has the benefit that no precise translation is needed. Numerical demonstrations are given
to show the effectiveness of the proposed method.
1. Principles
Under the paraxial approximation, the classical TIE can be derived from the parabolic
wave equation [8]:
∇·[I_z(r)∇φ_z(r)] = −k ∂I_z(r)/∂z,   (1)
respectively. Note that the FFT-based solver in Eq. (2) has a very fast and
memory-efficient implementation, since only two FFTs are required.
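Since the full derivation of Eq. (2) is not reproduced here, the following sketch shows a two-FFT TIE solver for the simplest, uniform-intensity case, in the spirit of the FFT-based solvers of Refs. [9,10]; function and parameter names are our own:

```python
# Sketch: two-FFT TIE solver for nearly uniform intensity I0, where Eq. (1)
# reduces to a Poisson equation  laplacian(phi) = -(k/I0) * dI/dz.
import numpy as np

def tie_solve(dIdz, wavelength, I0, pixel):
    k = 2 * np.pi / wavelength
    ny, nx = dIdz.shape
    FX, FY = np.meshgrid(np.fft.fftfreq(nx, d=pixel), np.fft.fftfreq(ny, d=pixel))
    denom = 4 * np.pi ** 2 * (FX ** 2 + FY ** 2)   # Fourier symbol of -laplacian
    denom[0, 0] = 1.0                              # avoid division by zero at DC
    phi_hat = (k / I0) * np.fft.fft2(dIdz) / denom
    phi_hat[0, 0] = 0.0                            # the mean phase is undetermined
    return np.real(np.fft.ifft2(phi_hat))          # two FFTs in total
```

A forward FFT, a division by the Laplacian symbol, and an inverse FFT: exactly the two transforms the text refers to.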
H(fx, fy; z, λ2) = exp[i(2π/λ2) z √(1 − (λ2 fx)² − (λ2 fy)²)]
                 ≈ exp[i(2π/λ1) z(1 − Δλ21/λ2) √(1 − (λ1 fx)² − (λ1 fy)²)]   (6)
                 = H(fx, fy; z′, λ1).
Here, z′ = z(1 − Δλ21/λ2), with Δλ21 = λ2 − λ1. Under proper approximations, the
intensity images corresponding to the two different wavelengths λ1 and λ2 are obtained
as follows:
I(λ1) = |F⁻¹(F[u0] · H(z, λ1))|²,   (7)
I(λ2) = |F⁻¹(F[u0] · H(z, λ2))|² ≈ |F⁻¹(F[u0] · H(z′, λ1))|²,   (8)
where F and F⁻¹ denote the Fourier transform and its inverse, and u0 is the object wave.
Thus the intensity I(λ2) can be considered as a defocused intensity image for wavelength
λ1 at a different distance z′. Then, according to Eq. (4), the longitudinal derivative for
the two different wavelengths can be expressed as:
∂I_z(r)/∂z ≈ [I(λ1) − I(λ2)] / Δz′ = [I(z, λ1) − I(z − zΔλ21/λ2, λ1)] / Δz′,   (9)
where Δz′ = z − z′ = zΔλ21/λ2.
Once Eq. (9) is computed, the phase distribution can be recovered using various TIE
solvers, i.e., the unwrapped phase is finally obtained.
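The bookkeeping of Eqs. (6) and (9) reduces to computing the equivalent defocus distance; a minimal sketch with illustrative wavelengths and our own variable names:

```python
# Sketch: equivalent defocus distance and finite-difference derivative of Eq. (9).
lam1, lam2, z = 500e-9, 520e-9, 1e-3      # metres; illustrative values only
dlam = lam2 - lam1                        # delta-lambda_21
z_prime = z * (1 - dlam / lam2)           # Eq. (6): z' = z*(1 - dlam/lam2)
dz_prime = z - z_prime                    # effective defocus separation z*dlam/lam2

def longitudinal_derivative(I1, I2):
    # I1 = I(z, lam1), I2 = I(z, lam2): forward difference of Eq. (9)
    return (I1 - I2) / dz_prime
```

The resulting derivative is then fed to a TIE solver such as the two-FFT solver sketched in Section 1 to yield the unwrapped phase directly.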
3. Numerical Simulation
In this section, numerical examples are presented to verify the feasibility and
effectiveness of TW-TIE for phase unwrapping, tested on two synthetic phase fields.
In the first simulation, a 128×128 noiseless unwrapped phase is generated using the
4. Conclusion
Acknowledgements
References
[1] D. Parshall, M. K. Kim. Digital holographic microscopy with dual-wavelength phase unwrapping.
Applied Optics, 45(2006), 451-459.
[2] B. Bhaduri, N. K. Mohan, M. P. Kothiyal, et al. Use of spatial phase shifting technique in digital speckle
pattern interferometry (DSPI) and digital shearography (DS). Optics Express, 14(2006): 11598-11607.
[3] A. Momose. Recent advances in X-ray phase imaging. Japanese Journal of Applied Physics, 44(2005):
6355-6367.
[4] J. C. Estrada, M. Servin, J. Vargas. 2D simultaneous phase unwrapping and filtering: A review and
comparison. Optics and Lasers in Engineering, 50(2012): 1026-1029.
[5] R. M. Goldstein, H. A. Zebker, C. L. Werner. Satellite radar interferometry: Two-dimensional phase
unwrapping. Radio Science, 23(1988): 713-720.
[6] M. Zhao, L. Huang, Q. Zhang, et al. Quality-guided phase unwrapping technique: comparison of quality
maps and guiding strategies. Applied Optics, 50(2011): 6214-6224.
[7] A. Davila, J. M. Huntley, C. Pallikarakis, et al. Simultaneous wavenumber measurement and coherence
detection using temporal phase unwrapping. Applied Optics, 51(2012): 558-567.
[8] M. R. Teague. Deterministic phase retrieval: a Green’s function solution. JOSA, 73(1983): 1434-1441.
[9] T. E. Gureyev, K. A. Nugent. Rapid quantitative phase imaging using the transport of intensity equation.
Optics Communications, 133(1997): 339-346.
[10] D. Paganin, K. A. Nugent. Noninterferometric phase imaging with partially coherent light. Physical
review letters, 80(1998): 2586-2589.
624 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-624
1. Introduction
1 Corresponding Author: Andy Kyung-yong YOON, CTO, Professor, 623 Gangnamdae-ro, Seocho-gu, Seoul
06524, Korea; E-mail: xperado@yonsei.ac.kr.
Y.H. Jin et al. / A Study of Filtering Method for Accurate Indoor Positioning System 625
installation in most cases. Moreover, the RSSI at a fixed position can fluctuate greatly
with noise, depending on the environment. Many recent studies have therefore tried to
reduce RSSI error; the most commonly applied methods are the Bayesian recursive
algorithm [3], the Kalman filter [4], and signal weighting [5]. Position-tracking
methods include TOA, TDOA, AOA, and RSSI. The schemes other than RSSI, however,
require antennas for position tracking and are thus unsuitable for location tracking with
Bluetooth. RSSI, on the other hand, does not require synchronization between devices,
which makes it suitable for location tracking using BLE beacons [6]. This paper studies
the accuracy of an IPS that uses Kalman-filtered beacon fingerprinting, and how
updating the fingerprint database with a Gaussian filter affects positioning accuracy.
As methods to implement an IPS, trilateration and signal fingerprinting are widely
known. The trilateration technique uses the geometry of signal measurements to
determine the relative position of the object. Its advantage is that the position can be
computed easily from geometry; its disadvantage is that additional filtering is required
because of signal distortion [7]. The fingerprinting technique exploits the RSSI pattern
that is unique to each location: the RSSI is first collected in an offline phase to build a
signal fingerprint database, and then, in the online phase, the received signal pattern is
compared against that database to find the most similar pattern [8].
The Kalman filter is a probability-based inference method for estimating a continuous
state. It removes noise from time-series measurements and reduces the uncertainty they
contain, and it is used in computer vision, robotics, radar, and many other fields,
including indoor positioning. In this study, the Kalman filter is applied to the signals
received from each access point when constructing the fingerprint database.
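A minimal sketch of such a filter applied to a raw RSSI stream, assuming a constant-signal state model (the noise variances q and r are illustrative values, not parameters tuned from this study's data):

```python
def kalman_rssi(measurements, q=0.005, r=4.0):
    """One-dimensional Kalman filter for a (nearly) constant RSSI.

    q: process noise variance, r: measurement noise variance
    (both illustrative, not taken from the paper)."""
    x, p = measurements[0], 1.0          # initial state and covariance
    out = []
    for z in measurements:
        p = p + q                        # predict (state model: x_k = x_{k-1})
        k = p / (p + r)                  # Kalman gain
        x = x + k * (z - x)              # update with measurement z
        p = (1 - k) * p
        out.append(x)
    return out
```

For example, `kalman_rssi([-72, -75, -70, -74])` returns a smoothed sequence that tracks the running estimate instead of each noisy reading, which is the behavior wanted when averaging fingerprint entries.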
The Gaussian smoothing filter removes noise using a Gaussian distribution. Its basic
principle is to give the most weight to the kernel center and progressively less weight
to surrounding samples as the distance increases, so that high-probability sample
values are retained while low-probability ones are suppressed. In the experiments, the
fingerprint database is created by applying a Gaussian filter to the data collected in the
offline training phase, and Gaussian interpolation is used to estimate signal data at
locations where no collection occurred [9].
2. Proposed Method
The Kalman filter is applied to the RSSI collected at each reference point (RP), and the
fingerprint database is constructed from the averaged signal values. The beacon signal
strengths collected at all RPs make up the database; the entry $\mathrm{RSSI}_{m,n}$ represents
the RSSI collected from the $n$-th beacon at the $m$-th RP. Eq. (1) shows the structure
of the fingerprint database.
Eq. (2) describes the process of finding the Nearest Neighbor (NN). The average RSSI
of the fingerprint at position i is a 9-dimensional vector, and the average RSSI of the
Test Point (TP) at position k is likewise a 9-dimensional vector. The RP with the
smallest Euclidean distance to the TP is then selected.
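The NN search over 9-dimensional average-RSSI vectors can be sketched as below; the dictionary layout and all RSSI values are made-up illustrations, not the paper's data.

```python
import math

def nearest_rp(fingerprints, tp_rssi):
    """Return the RP id whose average RSSI vector (9-dimensional here)
    is closest in Euclidean distance to the test point's vector."""
    best_rp, best_d = None, float("inf")
    for rp, rp_rssi in fingerprints.items():
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(rp_rssi, tp_rssi)))
        if d < best_d:
            best_rp, best_d = rp, d
    return best_rp

# Toy database: 9 beacon RSSI averages per RP (values are invented)
db = {
    "RP1": [-60, -70, -80, -65, -72, -78, -69, -74, -81],
    "RP2": [-75, -62, -79, -73, -66, -77, -80, -68, -70],
}
```

A TP vector close to RP1's averages, e.g. `nearest_rp(db, [-61, -69, -81, -66, -71, -79, -70, -73, -80])`, resolves to `"RP1"`.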
Determine the value of σ in proportion to the distance from the kernel. Obtain the
signal data around the kernel needed for smoothing in accordance with σ. Apply
smoothing to the RP signals in the kernel using the signal data calculated in the
previous step. Lastly, update the fingerprint database. The following is the normal
distribution formula for Gaussian smoothing.
$$G(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left( -\frac{(x-\mu)^2}{2\sigma^2} \right) \qquad (3)$$
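The steps above, with the Gaussian weight of Eq. (3) applied across neighboring RPs, smooth each beacon's signal map before the database update. A minimal sketch follows; the kernel radius and σ are assumed for illustration, since the actual kernel parameters are not stated here.

```python
import math

def gaussian_kernel(radius, sigma):
    """Weights from the normal-distribution formula, normalized to sum to 1."""
    w = [math.exp(-(i * i) / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))
         for i in range(-radius, radius + 1)]
    s = sum(w)
    return [v / s for v in w]

def smooth_row(rssi_row, radius=1, sigma=1.0):
    """1-D Gaussian smoothing of one row of the fingerprint grid,
    replicating edge values at the borders."""
    k = gaussian_kernel(radius, sigma)
    n = len(rssi_row)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in zip(range(i - radius, i + radius + 1), k):
            acc += w * rssi_row[min(max(j, 0), n - 1)]
        out.append(acc)
    return out
```

Smoothing a constant row leaves it unchanged (the kernel is normalized), while an isolated RSSI spike is pulled toward its neighbors, which is the compensation effect described for unstable signals.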
For the experiment, a beacon deployment layout was designed. Beacons are placed in
3 rows of 3, for a total of 9 beacons in a 14-meter by 14-meter indoor environment, as
shown in Figure 1; the distance between rows is 7 meters. Reference Points are located
inside the 6-meter by 6-meter beacon area, with a shortest RP spacing of about 2.33
meters. Test Points are randomly selected inside the experiment area, with at least 3
TPs inside each of the 4 beacon square areas, as shown in Figure 1. The device used to
retrieve Bluetooth RSSI data must be identical at every RP, the retrieval duration must
be long enough at every RP to yield a substantial sample, and the height of the
collection device must be fixed. Scanned data may be filtered for noise reduction.
Figure 2. RP Deployment
4. Experiment
The beacons' TX power was set to 0 dBm (Lv6). The collected Bluetooth RSSI ranged
from -50 to -90. The mobile device used for this experiment was a Samsung Galaxy
Note 5 smartphone running Android 5.1, with a 2.1 GHz ARM Cortex-A57 processor
and 4 GB of RAM.
Figure 3. Beacon RSSI measurement for Kalman filter (RSSI versus count of incoming signal)
In this experiment, a TP represents the location of the actual target. One signal data set
collected at the TP was selected at random and compared with the previously collected
RP data using the RSSI difference; the most closely matching RP was then reported as
the positioning result. The success ratio of two different test cases was measured for
the analysis. The first case tests whether a TP lies inside a 3.2-meter-diameter cell
formed by four neighboring RPs. In the second case we measured the NN success ratio
of the TP; the distance to the nearest RP is 0.6 to 0.85 meters.
Lastly, we analyzed the collected data to measure error distance and positioning
success rate. After the first experiment, the same experiment was conducted with
Gaussian smoothing to analyze the effect of filtered data, as shown in Figure 5 and
Table 1. Table 1 shows the positioning results for each TP. The final results showed an
average error distance of 1.74 meters, with NN and In-Cell accuracy rates of 46.04%
and 94.58% respectively.
We therefore conclude that the actual positioning error distance is about 1 m. There
is a slight increase in the accuracy rate when Gaussian smoothing is applied; TP 6 in
Table 1 shows a significant difference between the Gaussian-smoothed and
unsmoothed data. This confirms that Gaussian smoothing can compensate for unstable
signals and increase positioning accuracy. Although sample data sometimes showed a
decrease in accuracy rate, the overall results showed improvement. As mentioned
previously, the distance from a TP to its nearest RPs is 0.6 to 0.85 meters.
5. Conclusion
In this study, a beacon fingerprinting experiment using BLE signals for an IPS was
performed as an alternative to Wi-Fi fingerprinting. Our results confirmed that
positioning accuracy at the level of the RP spacing can be ensured using only a simple
NN algorithm, so a simple, repeatable method can be used to construct a low-cost BLE
IPS. The Kalman filter was used to reduce the measurement error caused by the
instability of the BLE signal; in addition, the signal data is accumulated repeatedly and
its average is applied. The results confirmed the practical use of low-cost BLE for an
IPS, and showed a slight improvement in both error distance and hit ratio when
Gaussian smoothing was applied. The reliability of the fingerprint database is low
when it is constructed from data collected in a short time at low cost. In this case the
References
[1] K. Kamol, K. Prashant, Modeling of indoor positioning systems based on location fingerprinting,
INFOCOM 2004, Twenty-third Annual Joint Conference of the IEEE Computer and Communications
Societies 2 (2004), IEEE, 1012-1022.
[2] V. Gabriel, et al. Monitoring and detection platform to prevent anomalous situations in home care,
Sensors 14(2014), 9900-9921.
[3] R. Aswin N., et al. Accurate mobile robot localization in indoor environments using Bluetooth, Robotics
and Automation (ICRA), 2010 IEEE International Conference on (2010), IEEE, 4391-4396.
[4] P. Nazemzadeh, D. Fontanelli, D. Macii, T. Rizano, and L. Palopoli, Design and performance analysis
of an indoor position tracking technique for smart rollators, Indoor Positioning and Indoor Navigation
(IPIN) 2013 International Conference on, IEEE, 1–10.
[5] L.J. Liu and H.J. Ma, Study on wireless sensor network boundary localization based on rssi, Wireless
Communication and Sensor Network (WCSN), 2014 International Conference on, IEEE , 232–235.
[6] Y. Wang, X. Yang, Y. Zhao, Y. Liu, and L. Cuthbert, Bluetooth positioning using rssi and triangulation
methods, Consumer Communications and Networking Conference (CCNC), 2013 IEEE, 837–842.
[7] H. Guangjie, C. Deokjai, L. Wontaek, A novel reference node selection algorithm based on trilateration
for indoor sensor networks. In: Computer and Information Technology 2007, CIT 2007, 7th IEEE
International Conference on, IEEE, (2007), 1003-1008.
[8] Z. Peng, et al. Collaborative WiFi fingerprinting using sensor-based navigation on smartphones.
Sensors 15 (2015), 17534-17557.
[9] K. In-Cheol, C. Eun-Mi, O. Hui-Kyung, Gaussian Interpolation-Based Pedestrian Tracking in
Continuous Free Spaces, The KIPS Transactions: PartB 19 (2012), 177-182.
Fuzzy Systems and Data Mining II 633
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
Subject Index
3D mesh retrieval 562
access control 407
accident prediction 232
accident state vector 232
adaptive learning 507
aggregation operator 37
applicability 254
architecture 327, 458
area compensation 58
artificial neural network 173
association pair extraction 220
association rule mining 141, 429
attribute reduction 94
authentication client 407
auto-encoder 321
autoregressive model 570
avalanche 226
bankruptcy prediction 282
battery equalization 28
beacon 624
big data 65
bipolar-valued fuzzy set 115
bitrate clustering recognition 466
Bluetooth Low Energy (BLE) 624
blurred security boundaries 208
Bonferroni mean 37
BP network 274, 341
BP neural network 360
BPNN 390
bushing type cable terminal 360
card transaction dataset 149
case 65
centroid points 94
chance constraints 121
citation network 605
cloud reasoning 377
clustering 397
clustering algorithm 466
cobweb 397
cognitive image 476
collaborative filtering 397, 487
collaborative target tracking 585
color histogram 299
comment mining 220
complex network 612
complex system 612
computation times 241
computational method 3
conditional probability introduction 220
cooperative relaying 312
coordination game 579
correlation analysis 274
correlation-weighted 487
course 397
credit approval dataset 149
curvature flow 562
data analysis 254
data center networks 167
data mining 65, 149, 159, 173, 179, 194, 248, 347, 612
data mining visualization 353
data sequence 519
database design 414
decision tree model 194
deep learning 149
delay 87
detection probability 585
differential evolution 438
DoG 562
DRAM 173
Dynamic Bayesian Networks (DBN) 290
dynamic itemset mining 141
early prediction 519
early warning index 341
Ebola 226
e-commerce 65
energy awareness 570
energy saving 167
energy utilization 28
engineering 353
equalizer 549
equivalent removing 549
E-R model 414
evaluation model 438
Author Index
Abdullah, L. 45
Abuzayed, N. 141
Akhmetova, Z. 353
Bao, W.-X. 618
Boranbayev, S. 353
Burgos, D. 507
Cai, J.-D. 81
Cao, H.-Y. 220
Cao, Q. 466
Cao, Q.-L. 121
Chang, C.-W. 173
Chen, H.-L. 282
Chen, H.-Y. 407
Chen, Lin 535
Chen, Ling 51
Chen, N.-J. 570
Chen, S. 115
Chen, S.-D. 194
Chen, W. 458
Chen, X. 542
Chen, X.-H. 267
Chen, Yan-Hui 444
Chen, Yao-Hua 254
Chen, Y.-W. 290
Cheng, D.-B. 334
Cheng, H. 618
Chi, H.-X. 487
Chi, J.-R. 108
Cosenza, C.A.N. 11
Dai, M. 444
Deng, Z.-Y. 334
Ding, J. 542
Dong, E.-M. 579, 605
Du, J. 377
Duan, X.-J. 612
El Moudani, W. 11
Ergenç, B. 141
Fan, X.-G. 585
Fang, J. 618
Fang, W.-D. 327, 458
Fujisawa, A. 299
Gangwar, S.S. 3
Gao, J.-J. 420
Gao, Z.-W. 327
Guo, Z.-B. 367
Han, C. 618
Han, Y. 115
Han, Y.-H. 429
Han, Z.-Z. 232
He, D. 267
He, D.-H. 81
He, L.-M. 194
He, W. 327, 458
He, X.-R. 37
Hsu, Y.-C. 22
Hu, K.-W. 28
Hu, Y. 194
Huang, F. 65
Huang, N.-J. 121
Huang, P. 94
Jang, W. 624
Ji, Y. 28
Jiang, H.-Y. 194
Jiao, J.-K. 598
Jin, W.-D. 438
Jin, Y.H. 624
Jing, H. 306
Jing, J.-H. 260
Kamal, C.W.R.A.C.W. 45
Kang, X. 186
Kita, K. 299
Krykhtine, F. 11
Kumar, S. 3
Kuo, C.-L. 476
Kwon, S.J. 624
Lai, F.-G. 312
Li, B. 624
Li, D. 71
Li, G.-L. 290
Li, H.-F. 159, 179, 241
Li, J.-P. 579, 605, 612
Li, L. 71
Li, M. 525
Li, M.-H. 248
Li, Q. 282
Li, S. 466