
Journal of Computers

ISSN 1796-203X Volume 7, Number 9, September 2012

Contents
REGULAR PAPERS

Study on Anti-worm Diffusion Strategies Based on B+ Address Tree
Dewu Xu, Jianfeng Lu, and Wei Chen .... 2093

Business Rules Modeling for Business Process Events: An Oracle Prototype
Rajeev Kaula .... 2099

Orientation Selectivity for Representing Dynamic Diversity of Facial Expressions
Hirokazu Madokoro and Kazuhito Sato .... 2107

Baldwin Effect based Particle Swarm Optimizer for Multimodal Optimization
Ji Qiang Zhai and Ke Qi Wang .... 2114

Acoustic Emission Signal Feature Extraction in Rotor Crack Fault Diagnosis
Kuanfang He, Jigang Wu, and Guangbin Wang .... 2120

Integrability of the Reduction Fourth-Order Eigenvalue Problem
Shuhong Wang, Wei Liu, and Shujuan Yuan .... 2128

Effluent Quality Prediction of Wastewater Treatment System Based on Small-world ANN
Ruicheng Zhang and Xulei Hu .... 2136

Nonlinear Evolution Equations for Second-order Spectral Problem
Wei Liu, Shujuan Yuan, and Shuhong Wang .... 2144

Analysis of Causality between Tourism and Economic Growth Based on Computational Econometrics
Liangju Wang, Huihui Zhang, and Wanlian Li .... 2152

Multi-robot Task Allocation Based on Ant Colony Algorithm
Jian-Ping Wang, Yuesheng Gu, and Xiao-Min Li .... 2160

Parameter Auto-tuning Method Based on Self-learning Algorithm
Chaohua Ao and Jianchao Bi .... 2168

Efficient Graduate Employment Serving System based on Queuing Theory
Hui Zeng .... 2176

Staying Cable Wires of Fiber Bragg Grating/Fiber-Reinforced Composite
Jianzhi Li, Yanliang Du, and Baochen Sun .... 2184

Research and Development of Intelligent Motor Test System
Li Li, Jian Liu, and Yuelong Yang .... 2192

Application of Intelligent Controller in SRM Drive
Baojian Zhang, Yanli Zhu, Jianping Xie, and Jianping Wang .... 2200

Simulation of Rolling Forming of Precision Profile Used for Piston Ring based on LS_DYNA
Jigang Wu, Xuejun Li, and Kuanfang He .... 2208

Bargmann and Neumann System of the Second-Order Matrix Eigenvalue Problem
Shujuan Yuan, Shuhong Wang, Wei Liu, Xiaohong Liu, and Li Li .... 2216

Privacy-preserving Judgment of the Intersection for Convex Polygons
Yifei Yao, Shurong Ning, Miaomiao Tian, and Wei Yang .... 2224

Medical Equipment Utility Analysis based on Queuing Theory
Xiaoqing Lu, Ruyu Tian, and Shuming Guan .... 2232

Research and Application of Electromagnetic Compatibility Technology
Hong Zhao, Guofeng Li, Ninghui Wang, Shunli Zheng, and Lijun Yu .... 2240

An Improved Mixed Gas Pipeline Multivariable Decoupling Control Method Based on ADRC Technology
Zhikun Chen, Yutian Wang, Ruicheng Zhang, and Xu Wu .... 2248

A Robust Scalable Spatial Spread-Spectrum Video Watermarking Scheme Based on a Fast Downsampling Method
Cheng Wang, Shaohui Liu, Feng Jiang, and Yan Liu .... 2256

Blocking Contourlet Transform: An Improvement of Contourlet Transform and Its Application to Image Retrieval
Jian Wu, Zhiming Cui, Pengpeng Zhao, and Jianming Chen .... 2262

Speech Recognition Approach Based on Speech Feature Clustering and HMM
Xinguang Li, Minfeng Yao, and Jianeng Yang .... 2269

Electronic Nose for the Vinegar Quality Evaluation by an Incremental RBF Network
Hong Men, Lei Wang, and Haiping Zhang .... 2276

Intelligent Recognition for Microbiologically Influenced Corrosion Based on Hilbert-Huang Transform and BP Neural Network
Hong Men, Jing Zhang, and Lihua Zhang .... 2283

Research on Diagnosis of AC Engine Wear Fault Based on Support Vector Machine and Information Fusion
Lei Zhang and Yanfei Dong .... 2292

Optimal Kernel Marginal Fisher Analysis for Face Recognition
Ziqiang Wang and Xia Sun .... 2298

Hybrid Cloud Computing Platform for Hazardous Chemicals Releases Monitoring and Forecasting
Xuelin Shi, Yongjie Sui, and Ying Zhao .... 2306

Quantum Competition Network Model Based on Quantum Entanglement
Yanhua Zhong and Changqing Yuan .... 2312

BP Neural Network based on PSO Algorithm for Temperature Characteristics of Gas Nanosensor
Weiguo Zhao .... 2318

Hybrid SVM-HMM Diagnosis Method for Rotor-Gear-Bearing Transmission System
Qiang Shao and Changjian Feng .... 2324

A Real-Time Information Service Platform for High-Speed Train
Ruidan Su, Tao Wen, Weiwei Yan, Kunlin Zhang, Dayu Shi, and Huaiyu Xu .... 2330

Research on the Grey Assessment System of Dam Failure Risk
Ying Jiang and Qiuwen Zhang .... 2334

Research on the Dynamic Relationship between Prices of Agricultural Futures in China and Japan
Qizhi He .... 2342

JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012

2093

Study on Anti-worm Diffusion Strategies Based on B+ Address Tree


Dewu Xu
College of Mathematics, Physics & Information Engineering, Zhejiang Normal University, Jinhua, Zhejiang, 321004, China. Email: xdw_zjnu@126.com

Jianfeng Lu and Wei Chen


College of Mathematics, Physics & Information Engineering, Zhejiang Normal University, Jinhua, Zhejiang, 321004, China. Email: {lujianfeng, chen_wei}@zjnu.cn

Abstract: To improve the efficiency of anti-worms in resisting malicious worms and to enhance their diffusibility, diffusion strategies for anti-worms based on a B+ address tree are proposed, in order to speed up the anti-worm diffusion rate in the network and to reduce the influence of anti-worms on the network system while they diffuse. The diffusion strategies are simulated in Scilab. Results show that anti-worms using the B+ address tree strategies diffuse faster and impose less traffic on the network than those using traditional strategies.

Index Terms: anti-worm, propagation mode, diffusion strategies, B+ address tree, active diffusion, detection host

I. INTRODUCTION

In recent years, network worm threats to computer systems and network security have increased rapidly. Active detection worms, represented by CodeRed, Blaster, and Slammer, and E-mail worms, represented by Melissa, LoveLetter, and MyDoom, live longer, cover a wider area, and have caused tremendous damage to information systems [1]. Researchers have developed a variety of approaches [2-4] to protect the network from damage caused by worms, but a defensive approach is, after all, a temporary solution, and a manageable method of proactive protection is urgently needed. Anti-worms can proactively fix the vulnerabilities of hosts before the outbreak of worms, or during the early stage of an outbreak, in order to control the scope of the worm outbreak. The use of anti-worms to combat malicious worms is becoming a new emergency measure. At present, anti-worm diffusion strategies and proactive counter strategies need to be improved. The current process of anti-worms countering malicious worms uses the same diffusion strategies as the malicious worms themselves. Such strategies have had a serious impact on the network during the counter process and have put this
Manuscript received October 8, 2011; revised December 29, 2011; accepted January 18, 2012. Project numbers: Y201120829, Y1110483, and 60873234. Contact author: Dewu Xu.

technology under heated debate in the long term; thus the pace of the research has slowed [5]. Based on the counter idea of anti-worms, this paper designs practical B+ address tree diffusion strategies to reduce network traffic.

At present, worm diffusion strategies include uniform random diffusion, local-priority diffusion, diffusion based on a target list, diffusion based on the K-Way algorithm [6], diffusion based on search engines, passive diffusion, and so on. Some of these strategies can easily lead to network traffic overload and congestion; therefore, effective strategies and algorithms to control the diffusion method and speed of anti-worms are becoming more and more essential. A uniform random diffusion algorithm randomly generates IP addresses from the address space to be detected and carries out the diffusion, which produces a large amount of abnormal traffic. In contrast, a local-priority diffusion algorithm generates IP addresses from the subnet of the infected host, which increases the ability to hit target hosts and reduces abnormal flows. Additionally, such a strategy can exclude the unallocated and reserved IP addresses of the address space to be detected. A diffusion algorithm based on a target list generates a pre-test target address list, and tentative diffusion is then carried out according to the destination addresses in the list [7]. The algorithm generates the target address list from routing table information, and the resulting worm diffusion rate is 3.5 times that of the random scan algorithm; the drawback is that the IP address library must be carried along during diffusion, which increases the traffic load. The K-Way diffusion algorithm was originally used in the flash worm. Its basic idea is to organize all hosts to be probed into M K-Way trees. Each tree contains 1/K of the total nodes, and these 1/K node sets are mutually disjoint. Each node is an internal node of one tree, every internal node has at most K branches, and the internal nodes of one tree are leaf nodes of the other trees. This greatly improves the stability of flash worm

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2093-2098

diffusion, but at a price: the code becomes complex, and the number of scans of the target nodes becomes larger. Increasingly powerful search engines have led to the emergence of Google Hacking, which represents this kind of attack means and method. Santy, the first smart worm that used a search engine to find its attack targets, shows that the idea of a smart worm has been realized. The development of smart worms signifies that network security has turned from scattered countermeasures into an overall security alliance. With the passive diffusion algorithm, worms first lie latent in the infected hosts and monitor network data packets to obtain activity information about other users; hosts with loopholes that actively contact the infected hosts are thereby found as new target hosts. Therefore, in the process of finding target hosts, no abnormal network traffic is generated, which makes the worms difficult for a detection system to find. The drawback is that the diffusion rate is slow when the number of such contacts is small or the scan frequency is low, and fast only when the scan frequency is high. There are four factors that affect the propagation speed of worms: the selection of the target address space, whether a multi-thread search for vulnerable hosts is adopted, whether there is a list of susceptible hosts, and the diversity of propagation. The main difference between the diffusion strategies lies in the selection of the target address space, and the key to fast, light-load propagation lies in the design of the diffusion strategy. Compared with the passive anti-worm strategy proposed in [8], anti-worms that include a proactive information collection module can automatically look for hosts with loopholes in the network, penetrate the target hosts, and propagate a duplicate copy and repair tools. The propagation of anti-worms under this counteractive strategy is very fast.
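The strategies surveyed above differ mainly in how they generate target addresses. As a rough illustration (our own Python sketch, not code from the paper), the two simplest generators, uniform random scanning and local-priority scanning, can be contrasted as follows:

```python
import ipaddress
import random

def uniform_random_target() -> str:
    """Uniform random diffusion: draw any 32-bit IPv4 address.
    Most probes miss live hosts, producing much abnormal traffic."""
    return str(ipaddress.IPv4Address(random.getrandbits(32)))

def local_priority_target(infected_ip: str, prefix: int = 24) -> str:
    """Local-priority diffusion: draw an address from the infected
    host's own subnet, raising the hit rate and reducing abnormal flows."""
    net = ipaddress.ip_network(f"{infected_ip}/{prefix}", strict=False)
    return str(net[random.randrange(net.num_addresses)])
```

A target-list strategy would instead iterate over a precomputed address list, trading scan traffic for the cost of carrying the address library along with the worm.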
The difference is that, after the penetration of the hosts, anti-worms carry out tasks such as antivirus and repair; this is called the counteractive model of proactive diffusion.

II. DIFFUSION STRATEGY OF ANTI-WORM B+ ADDRESS TREES

A. Definition of Anti-worm Proactive Diffusion and the B+ Address Tree

The strategy of parallel transmission of information with proactive diffusion is used in this study. Control information is carried in the process of information transmission so that the anti-worms stay in a controllable diffusion state. The B+ tree is the general multi-way extension of the binary search tree. It is used to manage and maintain large-scale data indexes, offering highly effective random queries, low update overhead, self-balancing, and so on. The B+ tree further requires all "keys" to appear only in the leaf nodes and links up all the leaf nodes in a list in order to improve query efficiency. This study uses these advantages of the B+ tree to design the diffusion strategies. The definition of the B+ address tree is as follows:

Definition 1: B+ address tree
1. Any single IP address group is a 0-order B+ address tree. When the number of detection hosts is M, the order of the B+ address tree is L;
2. Assuming Ti is a T0-rooted i-order B+ address tree, adding an IP address group to every node of Ti yields Ti+1, a T0-rooted (i+1)-order B+ address tree;
3. Let i = i + 1. When i >= L, the Server and Center no longer allot IP address groups to newly penetrated hosts; in drawings, such penetrated hosts are placed below the existing hosts and do not participate in the detection. When i < L, go to 2;
4. Only trees obtained by rule 1 and a finite number of applications of rules 2 and 3 are called N-order B+ address trees. The maximum order that participates in the detection of a B+ address tree is therefore L.

Assume that hosts not yet affected by worms, or not yet penetrated by anti-worms, are all loophole hosts, and that the loopholes are 0-day loopholes. Even without a patch, technicians can take temporary protective measures, applied through the anti-worms, to guard the loophole hosts until the patch is released. To study this problem, we further assume that all node hosts within the network can visit each other and that the propagation delay between any two points is almost the same. A broadband network, a switching network, or a fully connected internal switching fabric can be regarded as such an environment.

This paper describes the diffusion in the manner of a target list. Divide the whole list of IP detection addresses into N groups with n IP addresses in each group. Assume that, in the process of anti-worm diffusion, the maximum number of detection hosts is M (M < N), and every penetrated host is allocated an IP address group until the number of detection hosts reaches M. A detection host that has been allocated an IP address group only detects hosts within that group and sends feedback to the Center (or Server) after the detection.
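The target-list division just described, splitting the detection list into groups of n addresses with the leftover addresses forming one final group, can be sketched as follows (our own illustration; `make_address_groups` is a hypothetical helper, not from the paper):

```python
def make_address_groups(ip_list, n):
    """Split the detection address list into IP address groups of n
    addresses each; the leftover addresses (fewer than n) form the
    final group."""
    return [ip_list[i:i + n] for i in range(0, len(ip_list), n)]
```

For example, 25 addresses with n = 10 yield three groups of sizes 10, 10, and 5.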
Figure 1 shows 2-order and 4-order B+ address trees consisting of IP address groups with 4 IP addresses each. Each small box represents a host, and an IP address group is represented by a rectangle composed of 4 small squares. The solid lines represent loophole hosts that have already been detected by detection hosts, and the dashed arrows indicate the feedback from the penetrated hosts to the Server (the 4-order address tree in the figure omits the feedback arrows). The dashed boxes indicate penetrated hosts within an IP address group that do not participate in the detection.

a) 2-order B+ address tree



b) 4-order B+ address tree with 4 detection hosts

Figure 1. Schematic diagram of the 2-order and 4-order B+ address trees.

It can be seen from Fig. 1, Definition 1, and the definition of the ET tree that the ET tree is a special case of the B+ address tree: when every IP address group contains a single IP address, the B+ address tree is an ET tree. Because this study requires all penetrated hosts to send feedback to the Server, the B+ address tree is more stable than the ET tree. The related theorems and proofs are given as follows:

Theorem 1: Let T be the B+ address tree with maximum order N, and let the order of T be L when the number of detection hosts reaches M. Then the total number of nodes of T is 2^L + M*(N-L), and the total number of IP addresses is n*(2^L + M*(N-L)) - n.

Proof: 1. When N = 0, then M = L = 0, so 2^L + M*(N-L) = 2^0 + 0*(0-0) = 1; in this case the theorem holds. 2. Assume the theorem holds for N = i (i != 0), so the number of nodes of an i-order B+ address tree is 2^L + M*(i-L). For N = i+1, according to Definition 1, the order participating in the scan is still L, and the (i+1)-order B+ address tree has exactly M more nodes, none of them involved in the scan, than the i-order tree. So the total number of nodes of T is 2^L + M*(i-L) + M = 2^L + M*(i+1-L) = 2^L + M*(N-L). Every node except the original node, namely the Server node, contains an IP address group of n addresses, so the total number of IP addresses is n*(2^L + M*(N-L)) - n. The theorem is thus proved by mathematical induction.

Theorem 2: Let T denote the N-order B+ address tree and M denote the total number of detection hosts (the number of detection hosts grows from 1 to M in the B+ address tree manner; here N also counts the IP address groups). It takes

    ⌈n*(N - Σ_{j=1}^{L} a_j)/M⌉ + L

amount of time to diffuse the anti-worm duplicate from the information source to the IP address group of the Nth-order leaf of the B+ address tree, and

    ⌈n*(N - Σ_{j=1}^{L} a_j)/M⌉ + L + n - 1

amount of time to diffuse the anti-worm duplicate to the whole IP address list space.

Proof: It can be seen from Theorem 1 that the system needs L detection steps before the number of hosts involved in the detection reaches M, during which the order of the B+ address tree grows from 0 to L. After the number of detection hosts has reached M, the N-order B+ address tree still has N - Σ_{j=1}^{L} a_j IP address groups left, which need ⌈n*(N - Σ_{j=1}^{L} a_j)/M⌉ amount of time to detect; so it takes ⌈n*(N - Σ_{j=1}^{L} a_j)/M⌉ + L amount of time to diffuse the anti-worm duplicate from the information source to the IP address group of the leaf. Moreover, it takes a further n - 1 amount of time to finish detecting and penetrating the Lth order of the B+ address tree. Therefore, the theorem is proved.

Theorem 3: If the number of hosts that participate in the detection is M, then:
1. If L < n, the relationship between L and M is 2^(L+1) - L >= M + 2 > 2^L - L + 1;
2. If L >= n, the relationship between L and M is 2^(L+1) - 2^(L-n+1) - n >= M > 2^L - 2^(L-n) - n.

Proof: Suppose the first IP address of an IP address group in the Lth order has just been detected and participates in detecting, and the number of hosts participating in detection is M. It can be seen from Theorem 2 that at this moment the B+ address tree is of order L. The number of hosts involved in detection in this B+ address tree is then as follows:

1. If L < n, it can be seen from Theorem 2 that only orders up to (L-n+1 < n-n+1 = 1) of the B+ address tree could have been fully detected, so the total number of hosts participating in detection is a_L + 2*a_{L-1} + 3*a_{L-2} + ... + L*a_1, where a_j denotes the number of nodes of order j. These satisfy:

    a_L = Σ_{j=0}^{L-1} a_j = A_{L-1},  L >= 1,  a_0 = 1
    A_L = Σ_{j=0}^{L} a_j = a_L + Σ_{j=0}^{L-1} a_j = 2*a_L = 2*A_{L-1} = 2^L,  L >= 1
    a_L = 2^(L-1),  L >= 1

All nodes except the original node, namely the Server node, carry an IP address group, so the number of IP address groups that an L-order B+ address tree contains is

    Σ_{j=0}^{L} a_j - 1 = A_L - 1 = 2^L - 1.

So the following expression holds: a_L + 2*a_{L-1} + 3*a_{L-2} + ... + L*a_1 = 2^(L-1) + 2*2^(L-2) + 3*2^(L-3) + ... + L*2^0 = 2^(L+1) - 2 - L >= M. Then the relationship between L and M is 2^(L+1) - L >= M + 2; and since M can take the maximum value of the above expression, M + 2 > 2^L - L + 1. This part of the theorem is proved.

2. If L >= n, then the orders in which all allocated IP address groups have been fully detected run up to (L-n+1), so there are (L - (L-n+1) = n-1) orders of nodes whose groups have not yet been fully detected. From the derivation above, the number of IP addresses already detected in the groups still being detected is a_L + 2*a_{L-1} + 3*a_{L-2} + ... + (n-1)*a_{L-n+2} = 2^(L-1) + 2*2^(L-2) + 3*2^(L-3) + ... + (n-1)*2^(L-n+1) = 2^L*[2 - (n+1)*2^(-n+1)], and the number of IP addresses in the fully detected groups is n*(Σ_{j=1}^{L-n+1} a_j) = n*(2^(L-n+1) - 1). So the total number is


    2^L*[2 - (n+1)*2^(-n+1)] + n*(2^(L-n+1) - 1)
        = 2^(L+1) - (n+1)*2^(L-n+1) + n*2^(L-n+1) - n
        = 2^(L+1) - 2^(L-n+1) - n >= M.

Similarly, since M can take the maximum value of the above expression, M > 2^L - 2^(L-n) - n. This theorem is proved.

Theorem 4: Suppose the number of IP addresses comprised by the whole IP address list is n*(2^L + M*(N-L)) - n. Comparing the diffusion strategies based on the B+ address tree, the Flash tree diffused in the K-Way manner in [6], and the exponential tree diffused in the ET manner in [9], the diffusion speeds have the following relationship:
1. If K <= M, the propagation rate of the B+ address tree is the same as that of the exponential tree in the ET manner, and faster than that of the Flash tree in the K-Way manner;
2. If M < K < N, the B+ address tree diffuses at a uniform rate.

The proof of the first part of this theorem can be found in [9] and is omitted here; we now prove the second part. Since the maximum number of machines involved in detection and network penetration in the B+ address tree is M, no new loophole host is assigned an IP address group after this maximum is reached, and the number of machines executing diffusion tasks in the whole address tree is only M. So, at every time step, only M machines convert loophole hosts into penetrated ones.

It can be seen from Theorem 1 that the number of IP address groups of a B+ address tree that comprises M detection hosts is 2^L + M*(N-L) - 1, and the total number of IPs in the IP address list is n*(2^L + M*(N-L)) - n. In fact, however, the number of IP addresses can be any value, and it changes dynamically during the process of detection and penetration. Therefore, we introduce the concept of the B+ address deformed tree so that such cases can still diffuse according to a B+ address tree. The definition is as follows:

Definition 2: B+ address deformed tree
Let the total number of IP addresses to be detected be S, with every IP address group containing n IPs. If S = n*(2^L + M*(N-L)) - n, then according to Definition 1 all the IP address groups generate a corresponding N-order B+ address tree; otherwise, with n*(2^L + M*(N-L)) - n < S < n*(2^L + M*(N-L+1)) - n, the remaining IP address groups beyond those of the complete N-order B+ address tree are added separately to different sub-nodes of the N-order B+ address tree. This formation is called an N-order B+ address deformed tree.

The basic idea of a B+ address deformed tree is to let 2^L + M*(N-L) - 1 of the IP address groups constitute a complete N-order B+ address tree and to connect the remaining groups as extra nodes of that tree. It can be seen from Definitions 1 and 2 that, no matter which form the address tree takes, it satisfies Theorems 1-4; all that is needed is to adjust the B+ address deformed tree as follows:

Figure 2. Schematic diagram of a 2-order B+ address tree (above left), a 2-order B+ address deformed tree (above right), and a 3-order B+ address tree (below).

Fig. 2 is the schematic diagram of a 2-order B+ address tree (above left), a 2-order B+ address deformed tree (above right), and a 3-order B+ address tree (below); the feedback dotted lines are omitted. The Server point in the figure is the initial node and does not have an IP address group. The dashed boxes represent the nodes that are added when the original B+ address tree is modified into a B+ address deformed tree.

B. Dynamic B+ Address Tree Generation Algorithm

Because the number of IP addresses in reality can be any value, the number of groups (the number of nodes) after the division of the IP address list can also be any value. During the generation of a B+ address tree, IP address groups are allocated dynamically. The steps of the dynamic B+ address tree generation algorithm are as follows:
1. Divide all the IP addresses to be tested into groups with n IP addresses in each group, treating the leftover IP addresses (fewer than n) as one group; the total number of groups is the total address count divided by n, rounded up.
2. The Server retrieves an IP address group according to the sequence of the IP address group list, then selects one of its IP addresses (the loophole host at this address is marked A) and begins the detection.
3. After a successful detection, the Server penetrates the loophole host A. The duplicate copy of the anti-worm in A feeds information back to the Server and establishes a connection. The Server then allocates IP addresses as follows: it adds A to the penetrated host list and the detection host list, allocates the IP address group in which A is located to A, and lets A detect the rest of the loophole hosts within the group.
4. As a detection host, A detects the remaining loophole hosts in this group in turn. If the detection and penetration are successful, the penetrated host (denoted B) sends a request to the Server. If by now the total number of detection hosts has not yet reached M, the Server will add B into the penetrated host


list and detection host list and then allocate an undetected IP address group to B; if the total number has reached M, the Server will not allocate a group and will only add B into the penetrated host list; if B does not send a request to the Server, the Server will add B into the corresponding host list according to the detection feedback from A.
5. The Server and the detection hosts repeat steps 2-4 above until all IP address groups have been detected and penetrated.
Because the process by which the Server selects the Center, and the process by which the Server periodically sends confirmation information to the detection hosts, have no influence on the constitution of the B+ address tree, they are not described further here. The dynamic B+ address tree generation algorithm has the following good properties:
1. In the process of dynamic generation of the diffusion tree, nodes that enter the diffusion model at any time do not change the relationships established before they entered, and the balance of the whole tree is not changed;
2. Nodes that enter the model earlier are sure to get the information earlier than, or at the same time as, those that enter later;
3. For a given network, the structure of the corresponding B+ address tree is fixed; therefore the track of information diffusion can be known in advance.

C. Analysis of Stability

It can be seen from the above algorithm that the B+ address tree only needs the IP addresses to be divided into groups before the detection; the construction of the whole tree is completed naturally in the process of detection and does not need to be done in advance. ET trees and K-Way-based flash trees have no such advantage.
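The order-by-order growth of the detection-host population that the generation algorithm induces, roughly doubling per order until the cap M or the supply of groups is exhausted, can be sketched in Python as follows (our own simplified model, consistent with a_L = 2^(L-1) from Theorem 3; the function and its names are illustrative, not from the paper):

```python
def bplus_orders(num_groups, M):
    """Return (a, hosts, assigned): a[j] is the number of new detection
    hosts added at order j+1, hosts is the final number of detection
    hosts, and assigned is the number of IP address groups handed out.
    Each existing host penetrates one new host per order, and each new
    host receives a fresh group while capacity M and groups remain."""
    a, hosts, assigned = [], 1, 1
    while hosts < M and assigned < num_groups:
        new = min(hosts, M - hosts, num_groups - assigned)
        a.append(new)
        hosts += new
        assigned += new
    return a, hosts, assigned
```

With ample groups and a large cap, the per-order increments come out as 1, 2, 4, 8, ..., i.e. a_L = 2^(L-1), matching the recurrence used in the proof of Theorem 3.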
There are some problems with the stability of the ET trees in [9] and the K-Way-based flash trees in [6]: diffusion trees built within the topology of a virtual network over the Internet carry many uncertainties, and if some node in the spanning tree becomes a bad node because of the network or other reasons, the child nodes below it lose the chance of being infected. Under the above algorithm, when a detected suspicious host is a bad node, that host is simply not penetrated; no duplicate copy of the anti-worm sends a query to the Server, and no follow-up action is performed, so the problem affecting ET trees and flash trees does not arise. When a detection host that has already been allocated an IP address group becomes a bad node during the detection, then, because the Server periodically sends confirmation information to the detection hosts, the only consequence is a delay in the detection of the IP address group that should have been allocated.

III. SIMULATION OF DIFFUSION

This section uses Scilab to test the diffusion strategies of the B+ address tree. The number of detection hosts is denoted by M. It can be known from Theorem 4 that the amount of time required for the duplicate copy of

anti-worm to spread through the whole B+ address tree is

    ⌈n*(⌈S/n⌉ - Σ_{j=1}^{L} a_j)/M⌉ + L + n - 1,

where S is the total number of IP addresses to be detected. Fig. 3 and Table 1 show the relationship between M and the diffusion time of the duplicate copy of anti-worms spreading through the whole B+ address tree under the conditions S = 10000 and n = 10:
Figure 3. Relationship between M and the diffusion time of anti-worms: A) diffusion time for M = 1-50; B) for M = 51-300; C) for M = 301-600; D) for M = 601-1000.

TABLE I. RELATIONSHIP BETWEEN THE ORDER OF THE B+ ADDRESS TREE AND M

    Order of B+ address tree    Number of detection hosts
    1                           1-3
    2                           4-6
    3                           7-13
    4                           14-28
    5                           29-59
    6                           60-122
    7                           123-249
    8                           250-504
    9                           505-1000
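The dependence of diffusion time on M can be reproduced qualitatively with a small Python model of the two phases: exponential growth up to M detection hosts, then parallel detection of the remaining groups. This is our own simplification of the Theorem 2 formula, not the authors' Scilab program, and the exact values depend on the modeling choices:

```python
import math

def diffusion_time(total_ips, n, M):
    """Phase 1: detection hosts roughly double each time unit (capped
    at M), each new host taking one IP address group. Phase 2: the
    hosts work through the leftover groups in parallel, n time units
    per group, plus n - 1 units to finish the last order."""
    G = math.ceil(total_ips / n)        # number of IP address groups
    hosts, steps, assigned = 1, 0, 1    # the source scans one group
    while hosts < M and assigned < G:
        new = min(hosts, M - hosts, G - assigned)
        hosts += new
        assigned += new
        steps += 1
    remaining = G - assigned
    tail = math.ceil(n * remaining / hosts) if remaining > 0 else 0
    return steps + tail + n - 1
```

With total_ips = 10000 and n = 10, the time drops steeply for small M but flattens into plateaus (for instance, M = 500 and M = 512 give the same time in this model), which matches the paper's observation that increasing M does not always decrease the detection time.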

According to Fig. 3 and Table 1, we draw the following conclusions: 1. Increasing the number of detection hosts does not necessarily accelerate the propagation of anti-worms or decrease the detection time; 2. The number of detection hosts can be chosen according to the situation of the malicious worms. (1) If malicious worms have just begun to propagate, a value of M in the range of panel B


can be chosen so that there is little impact on network traffic, and the propagation of the malicious worms can be inhibited without influencing the operation of the network. In the process of anti-worms detecting and penetrating, a user system that has already been patched can be used as a patch host. (2) If the diffusion of malicious worms is serious, a value of M in the range of panel C can be chosen so that the average number of detections per detection host is about 20. Since the IP addresses are contiguous, every segment needs only one host to eradicate its malicious worms, without influencing the other segments. (3) Under normal conditions the range of panel B should not be chosen because, as Table 1 shows, in the late period of the propagation of anti-worms few malicious worms are left, yet many detection hosts remain; the network data generated by the anti-worms themselves during detection then becomes the main burden on the network. If a highly secure network is expected to eradicate malicious worms quickly, the range of panel D can be selected. Increasing the number of detection hosts will on the whole shorten the diffusion time of anti-worms, but in some situations such an increase does not shorten it. Therefore, to meet a given diffusion time, the number of detection hosts should be selected appropriately so as to reduce the impact of the anti-worms themselves on the network while they diffuse.

IV. CONCLUSIONS

Diffusion strategies based on the B+ address tree are proposed in this study; they reduce the impact of anti-worms on the network system during diffusion and enhance the diffusibility of anti-worms as well. It can be seen from the simulation that diffusion algorithms based on a B+ address tree are quite stable in terms of network performance, and the B+ address tree itself is also very stable.
ACKNOWLEDGEMENT

This research is financially supported by the National Natural Science Foundation of China under Grant No. 60873234, the Education Department Foundation of Zhejiang Province under Grant No. Y201120829, and the Natural Science Foundation of Zhejiang Province under Grant No. Y1110483. Thanks to the reviewers for their valuable comments, which helped to improve the quality of the manuscript.

REFERENCES
[1] Wang Xiu-ying, Shao Zhi-qing, Liu Bai-xiang. Worm Detection Algorithm Under P2P Circumstances [J]. Computer Engineering, 2009(3): 173-175.
[2] J. W. Lockwood, J. Moscola, M. Kulig, et al. Internet worm and virus protection in dynamically reconfigurable hardware [C]. ACM CCS Workshop on Rapid Malcode (WORM 2003), Washington, 2003.
[3] N. Weaver, V. Paxson, S. Staniford, et al. Large scale malicious code: A research agenda [OL]. http://www.cs.berkeley.edu/~nweaver/, 2003.
[4] N. Provos. A virtual honeypot framework [R]. Center of Information Technology Integration, University of Michigan, Tech. Rep. citi-tr-03-1, 2003.
[5] Wang Bailing, Fang Binxing. A New Friendly Worm Propagation Strategy Based on Diffusing Balance Tree [J]. Journal of Computer Research and Development, 2006(9): 1593-1602.
[6] S. Staniford, D. Moore, V. Paxson, et al. The Top Speed of Flash Worms [C]. Proc. ACM CCS Workshop on Rapid Malcode, Washington, DC, USA, 2004: 33-42.
[7] S. Staniford, V. Paxson, N. Weaver. How to own the Internet in your spare time [C]. Proc. of the 11th USENIX Security Symposium. San Francisco: USENIX, 2002: 149-167.
[8] Deng Ying-yi. Research on P2P worms and defense technology [D]. Chengdu: University of Electronic Science and Technology of China, 2007.
[9] Wang Bai-ling. Friendly Worm Based Active Countermeasure Technology to Contain Network Worm [D]. Harbin Institute of Technology, 2006.

Dewu Xu received his M.S. degree from the School of Information Science and Technology at East China Normal University in 2005. He is now a lecturer in the College of Mathematics, Physics and Information Engineering at Zhejiang Normal University. His research interests include Cryptology and Distributed System Security.

Jianfeng Lu received his B.S. degree from the College of Computer Science and Technology at Wuhan University of Science and Technology in 2005, and his PhD degree from the College of Computer Science and Technology at Huazhong University of Science and Technology in 2010. He is a lecturer in the College of Mathematics, Physics and Information Engineering at Zhejiang Normal University. His research interests include Distributed System Security and Access Control.

Wei Chen received his PhD degree from the Beijing University of Posts and Telecommunications in 2006. He is now an associate professor and graduate supervisor in the College of Mathematics, Physics and Information Engineering at Zhejiang Normal University. His research interests include Cryptology and Intrusion Detection.

2012 ACADEMY PUBLISHER

JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012

2099

Business Rules Modeling for Business Process Events: An Oracle Prototype


Rajeev Kaula
Computer Information Systems Department, Missouri State University, Springfield, MO 65897 (USA) E-Mail: RajeevKaula@missouristate.edu

Abstract: Business rules guide business process activities and events, besides impacting associated business entity types. This paper outlines an approach to model business rules associated with business process events through traditional data modeling techniques such as entity-relationship diagrams, and then transform the data model into application procedures that can effect business process events. A prototype that models a sample set of business rules pertaining to a business process event in a relational database is provided to demonstrate the application of the concept. The paper utilizes the Oracle database for illustrating the concepts; the prototype is explained through an application in Oracle's PL/SQL Server Pages.

Index Terms: Business Rules, Business Process, Data Modeling, Entity-Relationship Model, PL/SQL Server Pages, Relational Model, Web Application.

Figure 1. Sample Business Rule

I. INTRODUCTION

A business process is a collection of structured activities to complete some task. Business process activities transform a set of inputs into outputs (a product or service) for another process or event. Examining business processes helps an organization determine bottlenecks and identify outdated, duplicate, and smooth-running processes. To stay competitive, organizations automate and optimize their business processes [2, 6, 14, 20]. Any automation of a business process is influenced by the business rules that guide its activities and events, besides impacting associated business entity types.

Business rules are abstractions of the policies and practices of a business organization [4, 5, 7, 18]. Organizations that consolidate their business rules and automate their implementation often derive strategic advantage [15]. In general, business rules facilitate specification of some business guideline expressed declaratively in condition-action terminology as IF condition THEN action. A condition is some constraint, while the action clause reflects the resulting action. Figure 1 shows an example of a business rule that describes a set of constraints applicable for handling a new hotel reservation event in a hotel reservation business process.

Business rules' influence on business processes is accomplished through their inclusion in the business applications utilized by such processes to complete their tasks or activities [1, 7, 8]. Even though business rules may be outlined declaratively, from a development standpoint they are often expressed as an addendum to database development in the form of either a proprietary product or some DBMS-encoded trigger and/or procedure [1, 4, 5, 7]. Business rules that are embedded within application logic are often difficult to document and change [19]. Since any business process needs to be adaptive to remain competitive, it is necessary to develop business rules that can be expressed declaratively, yet modeled for change and implementation automatically.

In this context, data modeling of business rules provides an approach that facilitates standardized representation of business rules. Several attempts have been made to apply the standard techniques of traditional entity-relationship modeling to structure business rules as a rule repository in a relational database management system [8, 9, 10, 11, 12, 16, 17]. However, these approaches are limited either to a singular representation of business rules' semantic definition within a conceptual framework in a database context, or to utilization of an expert system framework in a database context. There is a lack of an approach that can (i) structure business rules with multiple constraints in the database, and (ii) also associate their specification with database operations from a business process perspective.

This paper outlines such an approach by modeling business rules associated with business process events through traditional data modeling techniques like entity-relationship diagrams (ERD), and then transforming the data model into application procedures that can effect business process events. Such a model represents a specialized logical schema for structuring business rules for storage in a relational database.
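Before turning to the data model, the condition-action form itself can be made concrete. The following Python sketch (our illustration, not part of the paper's prototype; the names are made up, and the constraint values mirror one of the paper's sample hotel reservation rules) holds a declarative rule as data, separate from application logic, and applies it to a business process event record:

```python
# Illustrative sketch of a declarative IF/THEN business rule held as
# data rather than hard-coded in application logic.
import operator

OPS = {">=": operator.ge, "<=": operator.le, "=": operator.eq,
       ">": operator.gt, "<": operator.lt}

# IF num_adults >= 2 AND num_adults <= 4 AND num_kids = 0 THEN rooms = 2
rule = {
    "constraints": [("num_adults", ">=", 2),
                    ("num_adults", "<=", 4),
                    ("num_kids", "=", 0)],
    "action": ("rooms", 2),
}

def apply_rule(rule, event):
    """Perform the action when every AND-ed constraint holds."""
    if all(OPS[op](event[attr], val) for attr, op, val in rule["constraints"]):
        attr, value = rule["action"]
        event[attr] = value
    return event

reservation = {"num_adults": 3, "num_kids": 0, "rooms": 1}
apply_rule(rule, reservation)   # rooms becomes 2
```

Because the rule is data rather than code, it can be stored, queried, and changed without touching the evaluating logic, which is the premise of the rule-repository approach.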
As shown in Figure 2, while business process events are governed through business rules, such events also impact business process entity types. Consequently, any modeling of business rules should also explore the association between the business rules structure and the associated entity types.

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2099-2106

Figure 2. Business Process Event Modeling Perspective

Developing the logical schema for modeling business rules for business process events involves (i) ERD representation of business rules associated with business process events, (ii) transformation of the ERD into a relational model, (iii) representation of the relationship between the business rules and the associated entity type, and (iv) development of an application procedure that can impact the business process event. The approach is illustrated through an Oracle 11g database using a prototype in PL/SQL Server Pages [3, 13]. PL/SQL Server Pages is a server-side scripting approach for developing database-driven dynamic Web pages; it uses Oracle's primary database language, PL/SQL, as a scripting language along with HTML to generate database-driven Web pages.

As each business rule is atomic and represents one complete thought, it cannot be decomposed further [7]. Consequently, the paper focuses on a business rule structure where the constraints are expressed using the AND operator and there is only one action specification in the THEN clause. A business rule statement expressed with the OR operator can therefore be represented through separate atomic rules involving the AND operator. The paper now briefly reviews existing research, then outlines the modeling and transformation of the business rule structure, followed by its implementation through a Web prototype.

II. REVIEW OF EXISTING RESEARCH

Utilization of business rules in the context of the entity-relationship model and business process working can be categorized through four approaches.

The first approach [8, 9, 10] focuses on how entity-relationship diagrams can themselves be represented as business rules through the notions of term, fact, derivation, and constraint. However, the approach does not explore the representation of relationships that can exist among entity attributes in the formation of declarative IF/THEN business rules.
Besides, there is no specification of the impact of business rules on business events.

The second approach [11] expresses business rules as part of cardinality specification. In this approach business rules represent changes in the states of business entities (entity life histories) in response to business events. However, as the approach's illustration is limited to business rules with only one constraint, it is not clear how multiple constraints are addressed.

The third approach [16] focuses on expressing business rules through an expert-system-like knowledge base, then integrating it with the database through entity-relationship diagrams. However, the work is more about developing a business rules system than about interaction with existing database entity types.

The fourth approach [17] outlines a meta-schema of an entity-relationship model in the form of an extended ER model that does include rules and events that can result in database triggers and procedures. However, the outline is largely conceptual and lacks implementation details.

III. BUSINESS RULES MODELING

The entity-relationship modeling of business rules involves the following elements: (i) the entity type structure of business rules, (ii) the relational table representation, and (iii) the business rules' entity association with the business process event entity type. Each element builds on the others.

A. Entity Type Structure

The business rule structure is represented through a schema of entity types. Such entity types essentially follow the abstract structure of a business rule statement as shown in Figure 3. In the figure, each constraint-i operator-i value-i clause is some constraint, while the entity action clause is the action taken when the constraint conditions are true.

Figure 3. Abstract Structure of Business Rule

The entity action clause may specify a descriptive statement such as create purchase order, or can be specific about what attribute value in the affected business process event entity needs to be changed, as shown in Figure 1. In this paper, the focus is on business rules that impact individual business process entity types. The business rule structure is represented through a collection of two entity types having an identifying relationship with each other, as shown in Figure 4. The Business Rule entity type in the figure refers to the entity action clause of the business rule statement and is the strong entity type. Its attribute descriptions are as follows: the RuleID attribute is the primary key; the Description attribute provides a brief description of the business rule; the Rule Type attribute specifies which business process event operation (insert, update, or delete) triggers the business rule; the Action Attribute is the name of the business process transactional entity type attribute as expressed in the entity action clause; the Action Value is the value that will be assigned to the Action Attribute in the entity action clause; the Action Description describes the nature of the action that will be performed; and the Action Unit is the name of the program unit that will implement the business rule.

Figure 4. Business Rules Entity Schema

The Business Rule Details entity type in the figure is a weak entity type representing the various constraint clauses of the business rule. The Constraint attribute is the name of the constraint entry in the constraint clause, the Operator attribute is the condition operator in the constraint clause, while the ConstraintValue attribute is the value assigned to the constraint condition.

For instance, business rules pertaining to the creation of a hotel reservation event in the hotel reservation business process, as shown in Figure 1, are represented through the business rule entity structure shown in Figure 5. The Hotel Reserve BR entity type holds the details of the various rules along with the entries in the THEN clause of the business rule, while Hotel Reserve BR Details contains the details of the associated IF clauses. Figure 6 shows the entity instances for the business rule of Figure 1: the rule is represented through a single Hotel Reserve BR entity instance containing the values of the THEN clause, associated with three Hotel Reserve BR Details entity instances for the three clauses in the IF part. The Hotel Reserve BR entity type is the strong entity type having an identifying relationship with the weak Hotel Reserve BR Details entity type.

Figure 5. Business Rule Model for Hotel Reservation Event

Figure 6. Entity Instances of Business Rule Specification for Hotel Reservation Event

B. Relational Table Representation

The Business Rule entity type and the Business Rule Details entity type are represented as separate tables in a relational database. For example, Figure 7 shows the table structure of the Figure 6 entity instances of the Hotel Reserve BR and Hotel Reserve BR Details entity types. The Hotel Reserve BR Details entity type, being the weak entity type, has a composite primary key. The foreign key Hotel_Reserv_RuleID in the Hotel Reserve BR Details table represents the 1:N relationship with the Hotel Reserve BR table. The 1:N relationship between the strong and weak entity types binds the THEN entity action clause instance with the multiple IF constraint clauses of a business rule statement.


Figure 7. Relational Database Table Representation for Hotel Reservation Event

C. Business Rule Entity Association with Business Process Event Entity Type

As business rules can be represented and stored in a relational database, it is now possible to establish a binary relationship between the business rule entity types and the business process event entity type. This relationship can then be utilized to record which rows in the event entity type are changed by individual business rules. The binary relationship is many-to-many (M:N), as shown in Figure 8, since one business rule may influence attribute values in more than one business process event entity instance, and vice versa.

Figure 8. Database Relationship between Business Rules Entity Types and Business Process Event Entity Type

The M:N binary relationship is optional on both sides, implying that a business process event entity may not always be affected by a business rule, and that business rules may exist without having impacted any instance of the business process event entity type. The relational model representation transforms the M:N relationship between the business rule entity type and the business process event entity type into two 1:N relationships with a new associate business process rule link entity type, as shown in Figure 9.

Figure 9. Transformation of Database Relationship between Business Rules Entity Types and Business Process Event Entity Type

For instance, the Hotel Reserve BR entity type of Figure 5 is associated with the business process entity type Hotel Reservation utilized during the hotel reservation event. This association, shown in Figure 10, is represented through the Hotel Reserve BR Link associate entity type. Once the business rules are stored in relational database tables, it is possible to query the Hotel Reserve BR Link table to determine the use of a business rule for individual Hotel Reservation table rows. Further, if business rules change or are dropped, the Hotel Reserve BR Link table can be utilized to maintain the changes.

Figure 10. Database Relationship Representation of Hotel Reservation Business Rules and Hotel Reservation Event Entity

IV. BUSINESS RULES MODELING IMPLEMENTATION PROTOTYPE


A prototype that transforms business rules associated with business process event entity types into Oracle database program units is now illustrated. The prototype utilizes a small database schema associated with the hotel reservation business process event. The data model of the hotel reservation schema is shown in Figure 11; the entity types are hotels, hotel_details, travel_customer, and hotel_reservation. The prototype is limited to business rules associated with only one business process event entity type, hotel_reservation, and utilizes sample business rules associated with the hotel reservation business process. A Web application illustrating business rule input into the database and its transformation into database trigger program units is shown thereafter through Oracle's PL/SQL Server Pages technology. Once the business rules are implemented, their application is enforced during business process operations.

Figure 11. Hotel Reservation Database Schema

A. Business Rules Representation

As the prototype focuses on a single business process entity type, hotel_reservation, the business rules are limited to new hotel reservations. A sample set of four business rules associated with the hotel reservation entity type is listed below. These business rules are activated during the creation of a new hotel reservation; other combinations of attributes for business rules can be similarly developed based on business process operating guidelines. In the business rule specifications below, DQ means double queen.

Business Rule 1: IF num_adults >= 2 AND num_adults <= 4 AND num_kids = 0 THEN rooms = 2
Business Rule 2: IF num_kids > 0 AND num_kids <= 2 AND num_adults <= 2 THEN bed_type = DQ
Business Rule 3: IF baby_crib = Yes AND num_kids = 3 AND num_adults = 2 THEN bed_type = DQ
Business Rule 4: IF num_adults <= 2 AND num_kids >= 2 THEN rooms = 2

The ERD of these business rules is similar to Figure 5. The relational table representation of these business rules is listed in Figure 12 and Figure 13. Appendix A lists the SQL script for creating these tables, including the Hotel_Reserve_BR_Link table.

Figure 12. Hotel Reservation Table

Figure 13. Hotel Reservation Details Table
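To make the two-table representation tangible, the following sketch stores Business Rule 1 in SQLite (standing in for Oracle here; the column lists are abbreviated from the Appendix A schemas) and reads the THEN row back together with its three AND-ed IF rows:

```python
# Sketch: Business Rule 1 stored in the two rule tables and read back
# with a join -- one hotel_reserve_br row per THEN clause, one
# hotel_reserve_br_details row per constraint.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table hotel_reserve_br (
  hotel_reserv_ruleid integer primary key,
  rule_type text, action_attr text, action_value text);
create table hotel_reserve_br_details (
  hotel_reserv_ruleid integer references hotel_reserve_br,
  rule_detailid integer, rule_constraint text,
  rule_operator text, constraintvalue text,
  primary key (hotel_reserv_ruleid, rule_detailid));
""")
con.execute("insert into hotel_reserve_br values (1, 'insert', 'rooms', '2')")
con.executemany(
    "insert into hotel_reserve_br_details values (1, ?, ?, ?, ?)",
    [(1, "num_adults", ">=", "2"),
     (2, "num_adults", "<=", "4"),
     (3, "num_kids", "=", "0")])

# One THEN row joined to its three AND-ed IF rows:
rows = con.execute("""
  select b.action_attr, b.action_value,
         d.rule_constraint, d.rule_operator, d.constraintvalue
  from hotel_reserve_br b
  join hotel_reserve_br_details d using (hotel_reserv_ruleid)
  order by d.rule_detailid""").fetchall()
```

The join reproduces the rule exactly as entered, which is what the prototype's trigger-generation step later relies on.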

B. Web Prototype

The Web prototype illustrates the entering of business rules in declarative format, and their eventual (i) representation in the tables listed above, and (ii) generation of the relevant database trigger program units. The prototype consists of two Web pages; their interaction within the Web architecture is shown in Figure 14.


Figure 14. Prototype Web Architecture

The prototype is currently limited to business rules pertaining to the insert operation (Rule Type value insert). It is possible to perform similar operations with respect to delete or update operations.

PL/SQL server pages are stored as Web procedures within the Oracle database. The first page, titled input_br, displays a Web input form to enter business rules, as shown in Figure 15. The entries in the Web page are similar to the entity attributes in Figure 4. The prototype is currently limited to a maximum of three constraints; a more flexible input Web form could be developed to accept more constraints.

Figure 15. Input_br Web Form

The input_br Web page submits the input form data to the second Web page, create_br. The second Web page first stores the business rule structure in the two business rule tables hotel_reserve_br and hotel_reserve_br_details. Thereafter, the Web page constructs two database triggers and returns the message "Business Rule successfully created. Program Unit successfully created." Every business rule has its own separate set of database triggers. The logic of the create_br Web procedure and its associated PL/SQL procedure is shown in Figure 16.

Figure 16. Create_br Web page logic

The Figure 16 logic description is as follows:
1. Insert the form input data into the hotel_reserve_br and hotel_reserve_br_details tables.
2. Call a procedure create_hr_br_trig with input parameters for the entity table name value, the hotel_reserve_ruleid value, and the database trigger program unit name value. The procedure create_hr_br_trig utilizes Oracle's dynamic SQL capability to create two database triggers as follows:
a. The first database trigger is named after the database trigger program unit name input parameter value, and it (i) defines the business rule as stored in the hotel_reserve_br and hotel_reserve_br_details tables, and (ii) creates a row in the hotel_reserve_br_link table where the hotel_reservation table primary key value is null.
b. The second database trigger is named the same as the database trigger program unit name input parameter value with a number appended, and it (i) checks for the row in the hotel_reserve_br_link table where the hotel_reservation table key value is null and the hotel_reserve_ruleid is the same as the applied rule, and (ii) updates the hotel_reserve_br_link table with the new hotel_reservation primary key value. The purpose of the second database trigger is to store the primary key of the business process event table (EntityID) for the associated business rule (RuleID), as shown in Figure 9. The two database triggers generated by the prototype are listed in Appendix B.

Once these business rules have been stored in the database along with their database trigger program units, whenever a new hotel reservation is created through a SQL insert statement, the rules are executed through their associated triggers.

V. CONCLUSIONS

Entity-relationship modeling of business rules for business process events in declarative format, and representing such business rules through automated database program units, provides better structuring and management of business rules for a business process. As each business rule is atomic, each such representation in an entity-relationship diagram is also a formal representation of a single derivation or constraint on the business. Storage of business rules as a rule repository in a relational DBMS also enables utilization of services similar to those provided for a transactional database, such as conceptually centralized management, access optimization, and recovery and concurrency controls. The rule repository schema can facilitate some additional features:
- Any given user need be aware only of those rules that are pertinent to that user (just as any given user needs to be aware only of the portion of the data in a given database that is pertinent to that user).
- Rules shall be queryable and updatable.
- Rule consistency shall be maintained.
- Rules shall be sharable and reusable across applications and users.
As the current research is limited to generating program units that handle attributes associated with one business process entity type or rule calculation/function, further research is ongoing to handle more complex business rules. Such complexity will take the form of database program units for business rules that reference attributes in more than one business process entity type, and of business rules that impact more than one business process event. This could take the form of multiple business process event entity types associated with one business rule.

APPENDIX A
SQL SCRIPT FOR PROTOTYPE BUSINESS RULES TABLES

create table hotel_reserve_br (
  hotel_reserv_ruleid integer constraint hotel_reserve_rule1_pk primary key,
  rule_desc varchar2(255),
  rule_type varchar2(10),
  action_attr varchar2(30),
  action_value varchar2(100),
  action_desc varchar2(255),
  action_unit varchar2(512));

create table hotel_reserve_br_details (
  hotel_reserv_ruleid integer constraint hotel_reserv_rule1_details_fk
    references hotel_reserve_br,
  rule_detailid integer,
  rule_constraint varchar2(255),
  rule_operator varchar2(10),
  constraintvalue varchar2(255),
  constraint hotel_reserve_rule1_details_pk
    primary key (hotel_reserv_ruleid, rule_detailid));

create table hotel_reserve_br_link (
  hr_br_id integer constraint hotel_reserve_br_link primary key,
  hotel_reserv_ruleid integer constraint hotel_reserve_br_link_fk1
    references hotel_reserve_br,
  reserve_no integer constraint hotel_reserve_br_link_fk2
    references hotel_reservation);

create sequence hotel_reserv_br_seq increment by 1 nocache;
create sequence hr_br_detailid_seq increment by 1 nocache;
create sequence hr_br_seq increment by 1 nocache;

APPENDIX B
PROTOTYPE GENERATED DATABASE TRIGGERS


create or replace trigger brule_web_trig
before insert on HOTEL_RESERVATION
for each row
begin
  if :new.NUM_ADULTS >= 2 and :new.NUM_ADULTS <= 4
     and :new.NUM_KIDS = 0 then
    :new.ROOMS := 2;
    insert into hotel_reserve_br_link
      values (hr_br_seq.nextval, 3, null);
  end if;
end;

create or replace trigger brule_web_trig2
after insert on HOTEL_RESERVATION
declare


  new_reserve_no integer;
  hr_br_ctr integer;
  hrbr_no integer;
  hotel_ruleid_no integer;
begin
  select max(reserve_no) into new_reserve_no from hotel_reservation;
  select count(*) into hr_br_ctr from hotel_reserve_br_link
    where reserve_no is null and hotel_reserv_ruleid = 3;
  if hr_br_ctr > 0 then
    select hr_br_id, hotel_reserv_ruleid into hrbr_no, hotel_ruleid_no
      from hotel_reserve_br_link
      where reserve_no is null and hotel_reserv_ruleid = 3;
    if hotel_ruleid_no = 3 then
      update hotel_reserve_br_link
        set reserve_no = new_reserve_no
        where hr_br_id = hrbr_no;
    end if;
  end if;
end;
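The first trigger above is regular enough that its generation from the stored rule can be sketched in a few lines. The following Python function is our illustration of the dynamic-SQL idea behind create_hr_br_trig (the actual procedure is PL/SQL, typically issuing the DDL via dynamic SQL; the function name and parameters here are invented for illustration):

```python
# Hedged sketch: assemble a BEFORE INSERT trigger's DDL string from a
# stored rule, mirroring the structure of brule_web_trig above.
def build_trigger(trig_name, table, constraints, action_attr, action_value, ruleid):
    # constraints: list of (attribute, operator, value) from the details table
    conds = " and ".join(f":new.{a} {op} {v}" for a, op, v in constraints)
    return (
        f"create or replace trigger {trig_name}\n"
        f"before insert on {table} for each row\n"
        f"begin\n"
        f"  if {conds} then\n"
        f"    :new.{action_attr} := {action_value};\n"
        f"    insert into hotel_reserve_br_link values (hr_br_seq.nextval, {ruleid}, null);\n"
        f"  end if;\n"
        f"end;")

ddl = build_trigger(
    "brule_web_trig", "HOTEL_RESERVATION",
    [("NUM_ADULTS", ">=", 2), ("NUM_ADULTS", "<=", 4), ("NUM_KIDS", "=", 0)],
    "ROOMS", 2, 3)
```

The assembled string matches the Appendix B trigger for Business Rule 1; each stored rule yields its own trigger pair in the same way.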

REFERENCES
[1] Y. Amghar, M. Meziane, and A. Flory, Modeling of Business Rules for Active Database Application Specification, in Advanced Topics in Database Research, K. Siau, Ed., Hershey, PA: Idea Group Publishing, pp. 135-156, 2002.
[2] M. Attaran, Exploring the relationship between information technology and business process reengineering, Information & Management, Vol. 41, No. 5, pp. 585-596, 2004.
[3] S. Boardman, M. Caffrey, S. Morse, and B. Rosenzweig, Oracle Web Application Programming for PL/SQL Developers, Upper Saddle River, NJ: Prentice-Hall, 2003.
[4] C.J. Date, What Not How: The Business Rules Approach to Application Development, Reading, MA: Addison-Wesley, 2000.
[5] C.J. Date and H. Darwen, Foundation for Future Database Systems: The Third Manifesto, 2nd ed., Reading, MA: Addison-Wesley, 2000.
[6] M.J. Earl, The new and the old of business process redesign, Journal of Strategic Information Systems, Vol. 3, No. 1, pp. 5-22, 1994.
[7] B. V. Halle, Business Rules Applied, New York, NY: John Wiley & Sons, 2002.

[8] D.C. Hay, A Repository Model - Business Rules - Part I (Structural Assertions and Derivations), The Data Administration Newsletter, Issue 19, January 2002. Available: http://www.tdan.com
[9] D.C. Hay, A Repository Model - Business Rules - Part II (Action Assertions), The Data Administration Newsletter, Issue 20, April 2002. Available: http://www.tdan.com
[10] D.C. Hay, Modeling Business Rules: What Data Models Do, The Data Administration Newsletter, Issue 27, January 2004. Available: http://www.tdan.com
[11] H. Herbst, G. Knolmayer, T. Myrach, and M. Schlesinger, The Specification of Business Rules: A Comparison of Selected Methodologies, in Methods and Associated Tools for the Information System Life Cycle, A.A. Verrijn-Stuart and T.W. Olle, Eds., Amsterdam: Elsevier, pp. 29-46, 1994.
[12] P. Kardasis and P. Loucopoulos, Expressing and organising business rules, Information and Software Technology, Vol. 46, No. 11, pp. 701-718, 2004.
[13] R. Kaula, Oracle 10g: Developing Web Applications with PL/SQL Server Pages, New York, NY: McGraw-Hill, 2006.
[14] A. Lindsay, D. Downs, and K. Lunn, Business processes - attempts to find a definition, Information and Software Technology, Vol. 45, No. 15, pp. 1015-1019, 2003.
[15] D. Loshin, Business Intelligence: The Savvy Manager's Guide, San Francisco: Morgan Kaufmann, 2003.
[16] A. Maciol, An application of rule-based tool in attributive logic for business rules modeling, Expert Systems with Applications, Vol. 34, No. 3, pp. 1825-1836, 2008.
[17] S.B. Navathe, A.K. Tanaka, and S. Chakravarthy, Active Database Modeling and Design Tools: Issues, Approach, and Architecture, Data Engineering, Vol. 15, No. 1-4, pp. 6-9, 1992.
[18] S. Ram and V. Khatri, A comprehensive framework for modeling set-based business rules during conceptual database design, Information Systems, Vol. 30, pp. 89-118, 2005.
[19] F. Rosenberg and S. Dustdar, Business rules integration in BPEL - a service-oriented approach, Proceedings of the Seventh IEEE International Conference on E-Commerce Technology, pp. 476-479, 2005.
[20] A. Scheer, ARIS - Business Process Modeling, Berlin: Springer-Verlag, 2000.


Orientation Selectivity for Representing Dynamic Diversity of Facial Expressions


H. Madokoro and K. Sato
Department of Machine Intelligence and Systems Engineering, Faculty of Systems Science and Technology, Akita Prefectural University, Yurihonjo, Japan Email: {madokoro, ksato}@akita-pu.ac.jp

Abstract: This paper presents a representation method of facial expression changes using Adaptive Resonance Theory (ART) networks. Our method extracts orientation selectivity of Gabor wavelets on ART networks, which are unsupervised and self-organizing neural networks that contain a stability-plasticity tradeoff. The classification ability of ART is controlled by a parameter called the attentional vigilance parameter. However, the networks often produce redundant categories. The proposed method produces suitable vigilance parameters according to classification granularity using orientation selectivity. Moreover, the method can represent the appearance and disappearance of facial expression changes to detect dynamic, local, and topological feature changes from obtained whole facial images.

Index Terms: Orientation Selectivity, Adaptive Resonance Theory, Gabor Wavelets.

I. INTRODUCTION

People with rich facial expressions are robust to uncertain situations or adverse circumstances. The roles of facial expressions in communication among people are important and various. Especially in a close relationship, we can mutually understand feeling and intensity from the information in facial expressions. In the field of human communication, computer recognition of facial expressions has been studied for realizing a natural and flexible Man-Machine Interface (MMI) that can interpret the feeling or intensity of users [1].

Akamatsu defined facial diversity of two types [2]. Facial components such as the eyes, eyebrows, and mouth are different for each person; the features of those facial components (position, size, location, etc.) are also different. This is called static diversity. On the other hand, we move facial muscles to express internal emotions unconsciously or to express emotions as a message. Facial expressions are produced by the facial components and their transition from a normal facial expression. This is called dynamic diversity. For face recognition in the field of facial image processing, the use of static diversity alone is sufficient to obtain good results. Facial expression recognition, however, requires not only static diversity but also dynamic diversity as a time series to cope with facial pattern transitions.
This paper is based on "Orientation Selectivity for Representation of Facial Expression Changes," by H. Madokoro and K. Sato, which appeared in the Proceedings of the 2007 IEEE International Joint Conference on Neural Networks (IJCNN 2007), Orlando, Florida, USA, Aug. 2007. © 2007 IEEE.

Nishiyama et al. [3] proposed facial scores, a method to describe facial expression rhythms. They pointed out that the facial expressions describable with the Facial Action Coding System (FACS) by Ekman [4] capture only static features. Therefore, they did not use the Action Units (AUs) of FACS; instead they originally used manually set feature points, because FACS cannot describe time-series transitions of facial expressions. On the other hand, humans recognize facial expressions by detecting movements of local facial components within the entire structure of the face. We do not need to detect facial elements as movements of characteristic points; we can automatically detect dynamic, local, and topological feature changes of facial expressions from whole facial changes.

Ekman defined six basic expressions (anger, sadness, disgust, happiness, surprise, and fear) based on six basic feelings [4]. However, the number of categories to express is unknown, because facial expressions exist that are invalid or which reflect several mixed feelings. In this paper, we introduce Adaptive Resonance Theory (ART) networks [5] as a method to represent the detection of dynamic, local, and topological changes of facial expressions. ART, proposed by Grossberg et al., is a theoretical model of an unsupervised and self-organizing neural network that forms categories adaptively in real time while maintaining stability and plasticity. Using the incremental learning of ART, the method can classify facial expressions without presetting the number of categories. In addition, facial expressions that are controlled by feelings change over time through aging; we consider that ART, which can learn over time, is useful for dealing with time-series movements of facial expressions. However, setting the parameters of ART networks is complex; furthermore, classification results depend strongly on the settings and combinations of parameters.
In particular, a parameter called the attentional vigilance parameter strongly influences classification granularity. In addition, ART networks generate redundant categories even when the setting of the vigilance parameter is the same. In this paper, we specifically describe the orientation selectivity of Gabor wavelets for analyzing the classification granularity of ART networks. The method can detect dynamic, local, and topological changes of facial expressions as category changes of ART networks. Moreover, the method can prevent redundant categories through the use of orientation selectivity.

© 2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2107-2113


JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012

II. ADAPTIVE RESONANCE THEORY 2

Actually, ART has many variations: ART1, ART1.5, ART2, ART2-A, ART3, ARTMAP, Fuzzy ART, Fuzzy ARTMAP, etc. [6]. We use ART2 [5], which accepts analog-valued inputs. Figure 1 shows the ART network architecture. The network consists of two fields: Field 1 (F1) for feature representation and Field 2 (F2) for category representation, connected by bottom-up weights Zij and top-down weights Zji. F1 consists of six sub-layers: pi, qi, ui, vi, wi, and xi. These sub-layers realize Short Term Memory (STM), which enhances features of the input data and filters out noise. F2 realizes Long Term Memory (LTM) based on finer or coarser recognition categories.

Figure 1. Architecture of an ART2 network.

Figure 2. Three-dimensional representations of Gabor wavelet filters: (a) real part; (b) imaginary part.

The algorithm of ART2 is the following.

1) The top-down weights Zji and bottom-up weights Zij are initialized as

  Zji(0) = 0,  Zij(0) = 1 / ((1 − d)√M).  (1)

2) The sub-layers of F1 are initialized as

  pi(t) = qi(t) = ui(t) = vi(t) = wi(t) = xi(t) = 0.  (2)

3) The input data Ii are presented to F1. The sub-layers are propagated as

  wi(t) = Ii(t) + a·ui(t − 1),  (3)
  xi(t) = wi(t) / (e + ||w||),  (4)
  vi(t) = f(xi(t)) + b·f(qi(t − 1)),  (5)
  ui(t) = vi(t) / (e + ||v||),  (6)
  qi(t) = pi(t) / (e + ||p||),  (7)
  pi(t) = ui(t) if F2 is inactive;  ui(t) + d·ZJi(t) if F2 is active,  (8)

where

  f(x) = 0 if 0 ≤ x < θ;  x if x ≥ θ.  (9)

4) Search for the maximum active unit TJ as

  Tj(t) = Σi pi(t)·Zij(t),  (10)
  TJ(t) = max_j (Tj(t)).  (11)

5) The weights Zji and Zij are updated as follows:

  (d/dt) ZJi(t) = d[pi(t) − ZJi(t)],  (12)
  (d/dt) ZiJ(t) = d[pi(t) − ZiJ(t)].  (13)

6) The output value ri(t) is calculated as

  ri(t) = (ui(t) + c·pi(t)) / (e + ||u|| + ||cp||).  (14)

The reset condition is defined as

  ρ / (e + ||r||) > 1.  (15)

7) If eq. (15) is true, the active unit is reset; go back to 4) to search again. If no active unit exists, a new category is created; return to 3). If eq. (15) is not true, repeat 3) and 5) until the change of F1 is sufficiently small, then return to 2).

The parameters are the following: a and b are coefficients on the feedback loops from ui to wi and from qi to vi; c is a coefficient from pi to ri; d is a learning rate; cd/(1 − d) ≤ 1 is the constraint between them; and θ is a parameter that controls the noise-detection level in layer v.

III. GABOR WAVELETS

Visual information captured by the retina is conveyed to Visual area 1 (V1) in the occipital lobe via the Lateral Geniculate Nucleus (LGN). V1 consists of two kinds of visual cells: simple cells and complex cells. The LGN and simple cells have receptive fields, which respond to particular stimulus attributes such as size, length, direction, movement direction, color, and frequency. This is called response selectivity. Since the time Hubel and Wiesel [8] discovered orientation selectivity of receptive fields in their electrophysiological experiment using


anesthetized cats, orientation selectivity has become the best known form of response selectivity. Various methods based on visual-cortex information-processing models have been proposed to develop image processing or computer vision systems [2], [7], [9]. The representation of Gabor wavelets, which can emphasize an arbitrary characteristic through inner parameters, is close to that of receptive fields. Therefore, Gabor wavelets are applied in various fields such as character recognition, texture classification, and facial image processing [14], [15]. Gabor wavelets are functions that combine a plane wave propagating in one direction with a Gaussian window. A three-dimensional (3D) representation of Gabor wavelets is shown in Fig. 2. Let λ be the wavelength, let σx and σy respectively denote the widths of the Gaussian window in the horizontal and vertical directions, and let θ be the angle between the direction of the plane wave and the horizontal axis. The output of the Gabor wavelets G(x, y) is given as

  G(x, y) = exp{−(1/2)(Rx²/σx² + Ry²/σy²)} exp(i·2πRx/λ),  (16)

Figure 3. Gabor wavelet output images for combinations of λ (2.0 ≤ λ ≤ 10.0, horizontal axis) and S (0.5 ≤ S ≤ 1.0, vertical axis).

where

  Rx = x·cos θ + y·sin θ,
  Ry = −x·sin θ + y·cos θ.  (17)

When Euler's formula, exp(iφ) = cos φ + i sin φ, is applied, formula (16) becomes

  G(x, y) = Rm(x, y) + i·Im(x, y),  (18)

where

  Rm(x, y) = exp{−(1/2)(Rx²/σx² + Ry²/σy²)} cos(2πRx/λ),  (19)
  Im(x, y) = exp{−(1/2)(Rx²/σx² + Ry²/σy²)} sin(2πRx/λ).  (20)

Figure 4. Gabor wavelet output images of θ (0 ≤ θ ≤ 180 degrees, λ = 4.0, and S = 0.7).
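As a concrete illustration (not the authors' code), the Gabor wavelet of Eqs. (16)-(20), together with the magnitude output of Eq. (22) and the σ = S·λ relation of Eq. (23), can be sketched in NumPy. The function name and the default values (λ = 4.0, S = 0.7, matching the settings reported in Section IV-B) are ours:

```python
import numpy as np

def gabor_magnitude(x, y, lam=4.0, theta=0.0, Sx=0.7, Sy=0.7):
    """Gabor wavelet magnitude at (x, y); names and defaults are ours."""
    sx, sy = Sx * lam, Sy * lam                        # Eq. (23): sigma = S * lambda
    Rx = x * np.cos(theta) + y * np.sin(theta)         # Eq. (17): rotated coordinates
    Ry = -x * np.sin(theta) + y * np.cos(theta)
    env = np.exp(-0.5 * (Rx**2 / sx**2 + Ry**2 / sy**2))   # Gaussian envelope
    Rm = env * np.cos(2.0 * np.pi * Rx / lam)          # Eq. (19): real part
    Im = env * np.sin(2.0 * np.pi * Rx / lam)          # Eq. (20): imaginary part
    return np.sqrt(Rm**2 + Im**2)                      # Eq. (22): magnitude
```

Note that the magnitude of Eq. (22) equals the Gaussian envelope itself, since cos² + sin² = 1; it is 1 at the origin and decays away from it regardless of the carrier phase.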

The final output is

  G(x, y) = √(Rm²(x, y) + Im²(x, y)).  (22)

Suitable values of σx and σy are reported as a function of λ [11], so that

  σx = Sx·λ,  σy = Sy·λ,  (23)

where Sx and Sy are coefficients.

IV. EXPERIMENT

The purpose of this experiment is to detect facial expression changes as category changes of ART networks from datasets that include both expressive and normal faces. Moreover, we evaluate the orientation selectivity obtained from the categorical changes of ART.

A. Target Images

For this experiment, our evaluation targets are the six basic facial expressions (anger, sadness, disgust, happiness, surprise, and fear) defined by Ekman. We took 600 facial images of each person at a frame rate of 10 frames per second; each facial expression comprises 100 images. The images, at 320 × 240 pixels resolution, were taken using a CCD camera placed in front of the face. We manually clipped a facial region of 92 × 110 pixels from the images. For automatic facial detection, we plan to use a method based on Haar-like features by Papageorgiou et al. [12]. The targeted person is a woman in her 20s, a university

TABLE I. TARGET FRAMES THAT PORTRAY FACIAL EXPRESSIONS.

  Facial expression   1st     2nd     3rd
  Anger               18-30   50-57   76-82
  Sadness             11-24   40-48   65-78
  Disgust             15-32   52-65   90-100
  Happiness           21-47   64-78   -
  Surprise            16-26   52-60   81-92
  Fear                16-34   56-63   85-98
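Returning to the ART2 network of Section II: one presentation of an input vector, with the parameter values reported in Section IV-B (a = b = 10, c = 0.225, d = 0.8, θ = 0.01, e = 0.0001), can be sketched as follows. This is our own simplification, not the authors' implementation: a fixed number of F1 settling iterations replaces the "until the change of F1 is sufficiently small" loop, and the weight update of Eqs. (12)-(13) is omitted.

```python
import numpy as np

def art2_present(I, Z_bu, Z_td, a=10.0, b=10.0, c=0.225, d=0.8,
                 e=1e-4, theta=0.01, rho=0.970, settle=10):
    """One input presentation (Eqs. 3-15); returns (winner J, reset flag).

    Z_bu: bottom-up weights, shape (n_categories, M); Z_td: top-down, same shape.
    """
    M = I.size
    f = lambda x: np.where(x >= theta, x, 0.0)         # Eq. (9): noise threshold
    u = q = np.zeros(M)
    for _ in range(settle):                            # Eqs. (3)-(8), F2 inactive
        w = I + a * u
        x = w / (e + np.linalg.norm(w))
        v = f(x) + b * f(q)
        u = v / (e + np.linalg.norm(v))
        p = u
        q = p / (e + np.linalg.norm(p))
    T = Z_bu @ p                                       # Eq. (10): category inputs
    J = int(np.argmax(T))                              # Eq. (11): winning unit
    p = u + d * Z_td[J]                                # Eq. (8), F2 active
    r = (u + c * p) / (e + np.linalg.norm(u) + np.linalg.norm(c * p))  # Eq. (14)
    return J, bool(rho / (e + np.linalg.norm(r)) > 1.0)  # Eq. (15): reset test

# Weights initialized as in Eq. (1)
M, n_cat, d = 16, 3, 0.8
Z_bu = np.full((n_cat, M), 1.0 / ((1.0 - d) * np.sqrt(M)))
Z_td = np.zeros((n_cat, M))
J, reset = art2_present(np.random.RandomState(0).rand(M), Z_bu, Z_td)
```

With zero top-down weights the matched pattern r stays close to u, so no reset occurs; a mismatch between a learned category and the input shrinks ||r|| and triggers the reset of Eq. (15).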


Figure 5. The number of categories in each direction from 0 to 180 degrees in 5 degree steps (ρ = 0.970): (a) Anger, (b) Sadness, (c) Disgust, (d) Happiness, (e) Surprise, (f) Fear. In each panel, the horizontal axis is the direction in degrees and the vertical axis is the number of categories.

graduate school student. She repeatedly alternated between an expressed face and a normal face for each facial expression. Therefore, this image dataset consists of two face types for each facial expression: one type of expressive face and a normal face. The facial expressions are intentional; however, the timing of expression is idiosyncratic: the targeted person decided that timing. After the dataset acquisition, we specified the appearance and disappearance points of all facial expression datasets. The appearance and disappearance points are summarized in Table I.

B. Parameters

We evaluated the parameters of the Gabor wavelets and the ART2 networks. Figure 3 shows the relationship between λ and S in the case of θ = 0. The respective ranges of λ and S are 2.0 ≤ λ ≤ 10.0 and 0.5 ≤ S ≤ 1.0. We selected λ = 4.0 and S = 0.7 (σ = 2.8) because these representations are sparse features. In this case, we set the parameters subjectively; optimizing the parameters using automatic and objective setting methods is a subject for our future work [13]. Because electrophysiological knowledge indicates that the visual range of receptive fields yielding a response to an input stimulus is 15 degrees, the parameter θ is set in steps of five degrees in each case. Figure 4 shows a two-dimensional representation of Gabor wavelets from 0 to 180 degrees in 5 degree steps. Moreover, we set the parameters of the ART2 networks, θ = 0.01, a = b = 10, c = 0.225, d = 0.8, e = 0.0001, based on our experience and Grossberg's original paper [5].

C. Results and Discussion

Figure 5 shows the number of categories in each direction from 0 to 180 degrees in 5 degree steps. The vigilance parameter ρ is set to 0.970. Directions with a large number of categories mean that facial feature

changes are large in the input images. The number of categories for surprise, which features the degree of mouth opening, is larger at and near 90 degrees than in the other directions (Fig. 5e). This result means that the category changes in these directions are remarkable. The number of categories for sadness, which features wrinkles between the eyebrows, is larger at and near 0 and 180 degrees than in the other directions (Fig. 5b). Sadness yields a large number of categories compared with the other facial expressions. This result means that the category changes in these directions are also remarkable. Figure 6 shows the category changes in the case of ρ = 0.970. This figure shows the generation of new categories as filled rectangles and transitions to existing categories as empty rectangles. The vertical lines in each graph are the appearance or disappearance points of facial expressions as specified in Section IV-A. These lines, which correspond to Table I, show the changing frames of appearance and disappearance between the normal expression and each facial expression. Category changes are not required exactly on these lines because the appearance and disappearance continue for a few frames before and after the specified frames. The appearance of anger is represented by categorical changes of the ART networks at all three times (Fig. 6a). The directions at the first appearance are only three: 5, 10, and 15 degrees. The directions at the second and third appearances include wide ranges, meaning that the range of orientation selectivity is narrow at the first point and wide at the second and third points. The first and second disappearances of anger can be detected, but the third cannot. The appearance and disappearance of sadness are represented (Fig. 6b). However, the setting value of ρ is too high, because many categories occurred outside the transition points. The expression of disgust shows a weak response (Fig. 6c). This response means the classification granularity is low,


Figure 6. Categorical changes of the ART2 networks for ρ = 0.970 in each facial expression: (a) Anger, (b) Sadness, (c) Disgust, (d) Happiness, (e) Surprise, (f) Fear. Filled rectangles represent the generation of new categories; empty rectangles represent transitions to existing categories. Vertical lines in each graph show the appearance or disappearance of facial expressions corresponding to Table I. The arrows show the points at which orientation selectivity is represented. In each panel, the horizontal axis is frames (0-100) and the vertical axis is direction in degrees (0-180).

although slight orientation selectivity is apparent. The categorical changes of happiness are redundant (Fig. 6d). The classification granularity seems insufficient because the second expression can be detected in only two directions: 100 and 105 degrees. In this case, the setting value of ρ cannot be increased. The degree of mouth opening differs before and after the 34th frame of the first appearance; that difference is detectable from the category changes. The degree of mouth opening in surprise is characteristic (Fig. 6e). The categorical changes are noticeable around 90 degrees. However, apart from the expected facial appearance points, the result strongly reflects eye blinking. The categorical changes of fear are not detected (Fig. 6f). This

result indicates that the vigilance parameter, ρ = 0.970, is too small. Next, Fig. 7 shows the results of sadness, disgust, and fear with changed vigilance parameters. The vigilance parameter of sadness was turned down in steps of 0.010 because the categories are redundant in Fig. 6b. The vigilance parameters of disgust and fear were turned up to 0.980 because the classification granularity is insufficient in Figs. 6c and 6f. The redundant categories decreased in the result for sadness in the case of ρ = 0.960 (Fig. 7a). The category changes are seen at the first appearance. Moreover, in the case of ρ = 0.950, the category changes are more apparent at the first and second appearances (Fig.


Figure 7. Categorical changes of the ART2 networks: (a) Sadness (ρ = 0.960), (b) Sadness (ρ = 0.950), (c) Disgust (ρ = 0.980), (d) Fear (ρ = 0.980). The arrows show the points at which orientation selectivity is represented. In each panel, the horizontal axis is frames (0-100) and the vertical axis is direction in degrees (0-180).

7b). The category changes appeared in all directions at the 98th and 99th frames. The cause is eye blinking, which occurred in those frames. We consider that the features of eye blinking are easy to separate from other features because eye blinking occurs in almost all directions. The appearance and disappearance of disgust appeared in the case of ρ = 0.980, especially at the second occurrence (Fig. 7c). In the case of ρ = 0.980 for fear, all appearances were detected (Fig. 7d); in particular, the second appearance was detected in a wide range of directions. The method can detect facial appearance points even at a lower setting of ρ by using orientation selectivity. In other words, the method can reduce redundant categories with a lower setting of ρ. Moreover, if the classification granularity is insufficient for the problem to be solved, the method can detect facial expression changes while avoiding redundant categories by increasing the setting of ρ within the range of orientation selectivity. We consider that the method can realize an advanced type of facial expression recognition, as a next step classifying facial expressions using the patterns of category changes with orientation selectivity.

V. CONCLUSION

This paper presented a method for representing facial expression changes using the orientation selectivity of Gabor wavelets on ART networks. The method produced

suitable vigilance parameters according to classification granularity using orientation selectivity. Moreover, the method represented the appearance and disappearance of facial expression changes, detecting dynamic, local, and topological feature changes from whole facial images. Future studies must evaluate other forms of response selectivity, such as wavelength, amplitude, frequency, and direction of motion. In addition, we will examine the formation of categories for long-term facial changes, the implementation of forgetting mechanisms, fusion with context information, etc., to realize a natural and flexible MMI. In our method, we selected the best size of category maps; the suitable training data differ for each problem to be solved. Automatic setting of the size of category maps is a subject for our future work. Moreover, we will apply our method to large-scale problems.

ACKNOWLEDGMENT

This work was supported by a Grant-in-Aid for Young Scientists (B) No. 21700257 from the Ministry of Education, Culture, Sports, Science and Technology of Japan.

REFERENCES
[1] M. Pantic and L.J.M. Rothkrantz, "Automatic Analysis of Facial Expressions: The State of the Art," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1424-1445, Dec. 2000.



[2] M.J. Lyons, J. Budynek, and S. Akamatsu, "Automatic Classification of Single Facial Images," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 12, pp. 1357-1362, Dec. 1999.
[3] M. Nishiyama, H. Kawashima, T. Hirayama, and T. Matsuyama, "Facial Expression Representation based on Timing Structures in Faces," IEEE Intl. Workshop on Analysis and Modeling of Faces and Gestures, pp. 140-154, 2005.
[4] P. Ekman and W.V. Friesen, Unmasking the Face: A Guide to Recognizing Emotions from Facial Clues, Malor Books, 2003.
[5] G.A. Carpenter and S. Grossberg, "ART 2: Stable Self-Organization of Pattern Recognition Codes for Analog Input Patterns," Applied Optics, vol. 26, pp. 4919-4930, 1987.
[6] G.A. Carpenter and S. Grossberg, Pattern Recognition by Self-Organizing Neural Networks, The MIT Press, 1991.
[7] G. Donato, M.S. Bartlett, J.C. Hager, P. Ekman, and T.J. Sejnowski, "Classifying Facial Actions," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 10, pp. 974-989, Oct. 1999.
[8] D.H. Hubel and T.N. Wiesel, "Functional Architecture of Macaque Monkey Visual Cortex," Proc. Royal Soc. B (London), vol. 198, pp. 1-59, 1978.
[9] C. Liu, "Gabor-Based Kernel PCA with Fractional Power Polynomial Models for Face Recognition," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 10, pp. 974-989, Oct. 1999.
[10] T.S. Lee, "Image representation using 2D Gabor wavelets," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 18, no. 10, pp. 959-971, Oct. 1996.
[11] A.K. Jain and S.K. Bhattacharjee, "Address block location on envelopes using Gabor filters: supervised method," 11th IAPR International Conference on Pattern Recognition, vol. II, pp. 264-267, Sep. 1992.
[12] C.P. Papageorgiou, M. Oren, and T. Poggio, "A general framework for object detection," Proc. International Conference on Computer Vision, pp. 555-562, 1998.
[13] T. Randen and J.H. Husoy, "Filtering for texture classification: a comparative study," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 4, pp. 291-310, Apr. 1999.
[14] D. Shan and R.K. Ward, "Statistical Non-Uniform Sampling of Gabor Wavelet Coefficients for Face Recognition," Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, pp. 73-76, 2005.
[15] C. Liu and H. Wechsler, "Gabor Feature Based Classification Using the Enhanced Fisher Linear Discriminant Model for Face Recognition," IEEE Trans. Image Processing, vol. 11, no. 4, pp. 467-476, 2002.

Kazuhito Sato received the ME degree in electrical engineering from Akita University in 1975 and joined Hitachi Engineering Corporation. He moved to Akita Prefectural Industrial Technology Center and Akita Research Institute of Advanced Technology in 1979 and 2005, respectively. He received the PhD degree from Akita University in 1997. He is currently an associate professor at the Department of Machine Intelligence and Systems Engineering, Akita Prefectural University. He is engaged in the development of equipment for noninvasive inspection of electronic parts, various kinds of expert systems, and MRI brain image diagnostic algorithms. His current research interests include biometrics, medical image processing, facial expression analysis, and computer vision. He is a member of the Medical Information Society, the Medical Imaging Technology Society, the Japan Society for Welfare Engineering, the Institute of Electronics, Information and Communication Engineers, and the IEEE.

Hirokazu Madokoro received the ME degree in information engineering from Akita University in 2000 and joined Matsushita Systems Engineering Corporation. He moved to Akita Prefectural Industrial Technology Center and Akita Research Institute of Advanced Technology in 2002 and 2006, respectively. He received the PhD degree from Nara Institute of Science and Technology in 2010. He is currently an assistant professor at the Department of Machine Intelligence and Systems Engineering, Akita Prefectural University. His research interests include machine learning and robot vision. He is a member of the Robotics Society of Japan, the Japan Society for Welfare Engineering, the Institute of Electronics, Information and Communication Engineers, and the IEEE.


Baldwin Effect based Particle Swarm Optimizer for Multimodal Optimization


Ji Qiang Zhai
College of Mechanical and Electrical Engineering, Northeast Forestry University, Harbin, 150040, P. R. China School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, P. R. China Email: zhaijiqiangnfus@163.com

Ke Qi Wang
College of Mechanical and Electrical Engineering, Northeast Forestry University, Harbin, 150040, P. R. China

Abstract—Particle Swarm Optimization (PSO) is an effective optimization technique. However, it often suffers from being trapped in local optima when solving complex multimodal optimization problems because it exploits the feasible solution space inefficiently. This paper proposes a Baldwin effect based learning particle swarm optimizer (BELPSO) to improve the performance of PSO on complex multimodal optimization problems. The Baldwin effect based learning strategy utilizes historical beneficial information to increase the potential search range and retains the diversity of the particle population to discourage premature convergence. On the other hand, the exemplars provided by the Baldwin effect based learning strategy can flatten out the fitness landscape close to the optima and hence guide the search path towards the optimal region. Experimental simulations show that BELPSO has a wider search range of the feasible solution space than PSO. Furthermore, a performance comparison between BELPSO and a number of population-based algorithms on sixteen well-known test problems shows that BELPSO achieves better solution quality.

Index Terms—Particle Swarm Optimization; Baldwin effect; Swarm intelligence; Population-based algorithm; Computational intelligence

I. INTRODUCTION

Optimization has been an active area of research for several decades. Since many practical problems arising in almost every field of science, engineering, and business can be formulated as multimodal optimization problems, many algorithms and approaches have been presented for solving these complex optimization problems. The particle swarm optimizer (PSO), proposed by Kennedy and Eberhart [1, 2] in 1995, has gained growing interest and has been widely applied to numerous engineering applications [3, 4]. However, the performance of PSO depends greatly on its parameters, and it often suffers from being trapped in local optima. Thus, a large number of PSO variants have been proposed to improve its performance. Shi and Eberhart first proposed an inertia weight that decreases linearly during the search [5], and designed fuzzy methods to

nonlinearly change the inertia weight [6]. In Ref. [7], a self-adaptive approach for changing each particle's inertia weight is proposed. Clerc and Kennedy [8] introduced a constriction factor into PSO to guarantee convergence and improve convergence speed. Improving the performance of PSO by combining it with other search techniques has been an active research direction. The selection operator of evolutionary algorithms has been used in PSO to preserve the best particles and thus ensure convergence [9]. Besides, the mutation operator has been used to retain swarm diversity and thus avoid becoming trapped in a local optimum [10]. Bergh and Engelbrecht proposed a cooperative approach in which particles search each dimension separately and the results are combined [11]. CLPSO [12] introduces a comprehensive learning strategy into PSO, whereby all other particles' historical best information is used to update a particle's velocity. The Baldwin effect is a natural phenomenon whereby individuals survive longer by learning from others to fit the environment better, thus improving the entire evolutionary process [13]. An advantage of Baldwinian learning is that it can flatten out the fitness landscape around the optimal regions, and hence help find the global optimum even in a dynamic environment [14, 15]. The first work exploring the Baldwin effect can be traced back to the 1980s, when Hinton and Nowlan [16] proposed a hybrid algorithm combining a genetic algorithm and a Baldwinian learning strategy for developing simple neural networks. Baldwinian learning has attracted increasing interest, and a number of further investigations and models based on the Baldwin effect have been presented [17-20].
This paper aims to alleviate the premature convergence of the PSO algorithm and to further improve solution quality on complex multimodal problems. Based on the Baldwin effect, a novel Baldwin effect based learning particle swarm optimizer (BELPSO) is presented. The learning strategy utilizes historical beneficial information to increase the search

© 2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2114-2119


range and retains the diversity of the particles to discourage premature convergence. On the other hand, the exemplars provided by Baldwin effect based learning can flatten out the fitness landscape approaching the optimum and hence guide the search path towards the optimal region. Experimental results show that BELPSO performs well on most of the test problems and is an effective algorithm for complex multimodal optimization.

II. BALDWIN EFFECT BASED LEARNING PARTICLE SWARM OPTIMIZER

A. Baldwin Effect Based Learning Strategy for Velocity Updating

This new learning strategy not only flattens out the optimal region but also keeps the diversity of the particle population. The following velocity updating equation is used in BELPSO:

  Vi^d ← w(k)·Vi^d + c·randi^d·(pbest_baldwini^d − Xi^d),  (1)

where c is an acceleration constant, randi^d is a random number selected from the range [0, 1], and w(k) is the inertia weight at the kth generation. pbest_baldwini^d is the dth dimension of the ith particle's exemplar pbest_baldwini, which is obtained as follows:

  pbest_baldwini = pbesti, if suc(i) ≤ LG;
  pbest_baldwini = pbesti + s · Σ_{<j,l> ∈ randgen(ps)} sup(pbestj, pbestl) / infe(pbestj, pbestl), if suc(i) > LG,  (2)

where pbesti = (pbesti^1, pbesti^2, ..., pbesti^D) denotes the ith particle's pbest, ps is the size of the particle population, randgen(ps) generates a set consisting of a random number of unique tuples from {1, 2, ..., ps}, e.g., (<1,2>, <3,4>, <5,6>) from {1, 2, 3, ..., 6}, and sup(x, y) and infe(x, y) are the superior and inferior of x and y, respectively. Taking a minimization problem as an example, sup(x, y) and infe(x, y) are the minimum and maximum of x and y, respectively. s ∈ [0, 1] is the Baldwin learning strength. suc(i) counts the successive generations without improvement of the ith particle's pbest, and LG is the learning gap, which controls the local search ability by minimizing, to some extent, the time wasted in poor search directions. When suc(i) > LG, the particle executes Baldwinian learning to alter the search space and thereby provides good exemplars towards the optimal regions.

B. The Proposed BELPSO

The novel algorithm is implemented as shown in Fig. 1. In BELPSO, not only a particle's own pbest but all particles' pbests and some neighbors can potentially be the learning exemplars, whereas only the particle's own pbest and gbest are the exemplars in simple PSO. Besides, there is only one exemplar, pbest_baldwin, to be learned in every generation in BELPSO, instead of the two exemplars pbest and gbest in simple PSO.

Step 1 BELPSO initialization. For each particle i in the population, randomly generate Xi and Vi, evaluate f(Xi) to initialize pbesti, and set k = 1.
Step 2 Repeat until the termination criterion is satisfied. If the termination criterion is satisfied, stop the iteration and output the best solution pS such that f(pS) ≤ f(pbesti) for all i; else set k = k + 1 and go to Step 2.1.
Step 2.1 If i ≥ ps, set i = 0 and go to Step 2; else set i = i + 1 and go to Step 2.2.
Step 2.2 If suc(i) > LG, go to Step 2.3; else go to Step 2.4.
Step 2.3 Baldwin effect based learning. Update pbest_baldwini as stated in Eq. (2), then go to Step 2.5.
Step 2.4 Update pbest_baldwini = pbesti.
Step 2.5 First update Vi^d as in Eq. (1), then apply the following equation to control the flying step of particle i:

  Vi^d ← min(Vmax^d, max(−Vmax^d, Vi^d)),

where Vmax^d is a positive constant value specified by the user; in our study, it is set to twenty percent of the maximum search range of each dimension.
Step 2.6 Update Xi^d.
Step 2.7 Evaluate f(Xi) if Xi is in the feasible search range; else go to Step 2.1.
Step 2.8 If f(Xi) ≤ f(pbesti), update pbesti = Xi and set suc(i) = 0; else set suc(i) = suc(i) + 1. Then go to Step 2.1.

Figure 1. Flowchart of the BELPSO algorithm
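The two core updates of Fig. 1 can be sketched in NumPy as follows. This is our reading of the printed (and partly garbled) Eqs. (1)-(2), not the authors' code: randgen(ps) is implemented as a random number of disjoint index pairs, sup/infe are resolved by a fitness comparison for minimization, and the acceleration constant default is our own choice.

```python
import numpy as np

rng = np.random.default_rng(0)

def baldwin_exemplar(pbest, fvals, i, s=0.5):
    """Eq. (2): Baldwinian exemplar for particle i (minimization case).

    Assumes strictly positive pbest coordinates so the elementwise
    sup/infe ratio is well defined.
    """
    ps, D = pbest.shape
    idx = rng.permutation(ps)
    n_pairs = int(rng.integers(1, ps // 2 + 1))        # random number of tuples
    acc = np.zeros(D)
    for k in range(n_pairs):
        j, l = idx[2 * k], idx[2 * k + 1]              # one unique tuple <j, l>
        sup, infe = (j, l) if fvals[j] <= fvals[l] else (l, j)
        acc += pbest[sup] / pbest[infe]                # fitter over less fit
    return pbest[i] + s * acc

def update_velocity(V, X, exemplar, w, c=1.49445, vmax=0.2):
    """Eq. (1) plus the Step 2.5 clamp to [-Vmax, Vmax]."""
    r = rng.random(V.shape)
    V_new = w * V + c * r * (exemplar - X)
    return np.clip(V_new, -vmax, vmax)
```

When suc(i) ≤ LG the exemplar is simply pbesti, so Eq. (1) reduces to a gbest-free PSO update; the Baldwinian branch only engages after LG stagnant generations.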

III. SIMULATION EXPERIMENTS

In this section, we use experiments to evaluate the performance of BELPSO by solving sixteen function optimization problems [21-24].



Figure 2. Sensitivity in relation to the learning parameters LG and s: (a), (b), (c), (d) are BELPSO's function values against LG and s when optimizing f2, f6, f12, and f15, respectively.

A. Sensitivity in Relation to Parameters

We investigate the effects of the main parameters of the Baldwin effect based learning of BELPSO by applying it to a unimodal function, an unrotated multimodal function, a rotated multimodal function, and a composition function with various values of the learning gap LG and learning strength s. The experimental results of BELPSO in optimizing f2, f6, f12, and f15, with the learning gap LG increased from 1 to 10 in steps of 1 and the learning strength s from 0.1 to 1 in steps of 0.1, are shown in Fig. 2. The values of the other parameters are as follows: the problem dimension D is 10, the population size is set at 10, and the maximum number of FEs is set at 30000. The inertia weight w(k) at the kth generation is as follows:

  w(k) = w0 − (w0 − w1) · k / max_gen,  (3)

where k denotes the generation and max_gen is the maximum number of generations, set at 3000 in our study; w0 and w1 are specified as 0.9 and 0.4, respectively, the same as in [12]. Fig. 2 shows the statistical average values obtained from 30 independent runs. From Fig. 2, we observe that the learning gap and learning strength influence the performance of BELPSO. For f2, we obtained a faster convergence velocity and better results when LG is set at 1 and s is set at 1. For f6, f12, and f15, too small a value of LG makes the algorithm become trapped in local optima; better results were obtained when LG is 6-10 and s is 0.1-0.6. The results demonstrate that too much or too little learning may slow convergence when dealing with complex multimodal problems, which complies with the Baldwin effect [16]. Hence, in our study, the learning gap and learning strength are set at 7 and 0.5, respectively, for all test functions.
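The linearly decreasing inertia weight of Eq. (3) can be written as a one-line helper; the function name is ours, and the defaults match the experimental settings above:

```python
def inertia_weight(k, w0=0.9, w1=0.4, max_gen=3000):
    """Eq. (3): linear decrease from w0 at k = 0 to w1 at k = max_gen."""
    return w0 - (w0 - w1) * k / max_gen
```

With these settings, w(0) = 0.9, w(1500) = 0.65, and w(3000) = 0.4, so early iterations favor exploration and later iterations favor exploitation.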


Figure 3. Convergence of BELPSO and simple PSO versus iterations on all test functions with D=10. (a), (b), (c), (d), (e), (f), (g) and (h) show the results of the two algorithms on functions 1, 2, 3, 4, 5, 6, 7 and 8, respectively.

Figure 3. (Continued) Convergence of BELPSO and simple PSO versus iterations on all test functions with D=10. (i), (j), (k), (l), (m), (n), (o) and (p) show the results of the two algorithms on functions 9, 10, 11, 12, 13, 14, 15 and 16, respectively.


B. The Statistical Results on the Test Functions

Following the sensitivity analysis, the main parameters of BELPSO are set as follows: the learning gap is 7, the learning strength is 0.5, and w is the same as above. For the 10-D functions the population size is 10 and the maximum number of FEs is 30000; for the 30-D functions the population size is 30 and the maximum number of FEs is 2000000. All experiments were run 30 times. Fig. 3 compares the convergence characteristics of BELPSO and simple PSO on all test functions, where each curve is the median of 20 independent runs with D=10; the parameters of both algorithms are the same as above. Fig. 3 shows that BELPSO outperforms simple PSO on almost all of the test problems. Because Baldwin effect based learning provides wide potential search spaces that maintain population diversity and thus alleviate premature convergence, better results are obtained when this learning strategy is applied to PSO on complex multi-modal problems, especially the composition problems. In addition, owing to the ability of Baldwin effect based learning to smooth out the optimal region, the solutions found by BELPSO are of higher quality than those of simple PSO.
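The per-function statistics reported in this section (mean and standard deviation over independent runs, with median curves in Fig. 3) can be computed in a few lines; the run values below are synthetic placeholders, not measured results:

```python
import numpy as np

# Hypothetical final best-fitness values from 30 independent runs of one
# algorithm on one test function (synthetic numbers for illustration only).
rng = np.random.default_rng(42)
final_values = rng.lognormal(mean=-6.0, sigma=1.5, size=30)

mean = final_values.mean()                 # Table-I style statistic
std = final_values.std(ddof=1)             # sample standard deviation
median = np.median(final_values)           # Fig.-3 style statistic
print(f"mean={mean:.2e}  std={std:.2e}  median={median:.2e}")
```

The median is the usual choice for convergence curves because a single diverging run would distort a mean curve on a log scale.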

The reported results of CSA and DEA are taken from [27] for direct comparison. Table I compares BELPSO with CSA and DEA on the 10-D functions. It indicates that BELPSO surpasses CSA and DEA on functions 1, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13 and 14 (twelve of the sixteen functions), and improves the results on functions 1, 3, 5 and 9 significantly. This comparison shows that BELPSO performs better than CSA and DEA on most of the test functions with D=10.
CONCLUSIONS

By incorporating a novel Baldwin effect based learning strategy into the particle swarm optimizer, a new algorithm, termed BELPSO, is presented for solving complex multi-modal problems. BELPSO was applied to sixteen test problems. Performance comparisons of BELPSO with other PSO variants and with several population-based algorithms, including CSA and DEA, indicated that BELPSO performs better on most of the multi-modal test functions.

TABLE I. MEAN AND STANDARD DEVIATION VALUES OBTAINED BY CSA, DEA AND BELPSO ON ALL TEST FUNCTIONS WITH D=10
(Group A: f1-f2; Group B: f3-f8; Group C: f9-f14; Group D: f15-f16. Entries are mean / standard deviation.)

Function      CSA                      DEA                      BELPSO
f1  (A)       3.54e+000 / 1.53e+000    9.55e-013 / 1.32e-012    1.03e-041 / 2.40e-041
f2  (A)       1.69e+000 / 0.63e+000    1.02e-002 / 8.60e-003    3.53e+000 / 6.29e-001
f3  (B)       1.83e+000 / 0.36e+000    4.80e-017 / 3.58e-007    3.55e-015 / 4.32e-031
f4  (B)       0.91e+000 / 0.10e+000    4.30e-001 / 7.19e-002    2.70e-002 / 1.04e-002
f5  (B)       1.92e+000 / 0.55e+000    2.20e+001 / 4.60e+000    2.38e-001 / 4.04e-001
f6  (B)       1.17e+000 / 0.13e+000    7.31e-004 / 2.87e-004    0 / 0
f7  (B)       1.90e+000 / 0.61e+000    1.37e+001 / 2.08e+000    1.70e-001 / 4.10e-001
f8  (B)       7.66e+000 / 3.82e+000    1.28e-002 / 4.45e-002    1.59e+002 / 1.44e+002
f9  (C)       2.45e+000 / 0.36e+000    5.98e-007 / 3.66e-007    4.74e-015 / 1.83e-015
f10 (C)       0.90e+000 / 0.11e+000    5.37e-001 / 9.71e-002    6.84e-002 / 4.45e-002
f11 (C)       2.71e+000 / 0.57e+000    1.10e-002 / 4.50e-003    3.55e-005 / 4.21e-005
f12 (C)       1.61e+001 / 3.66e+001    3.62e+001 / 5.49e+001    5.77e+000 / 1.88e+000
f13 (C)       1.05e+001 / 2.99e+000    2.10e+001 / 4.24e+001    7.25e+000 / 7.51e-001
f14 (C)       4.81e+002 / 1.43e+002    9.00e+002 / 4.47e+002    1.98e+002 / 1.42e+002
f15 (D)       6.05e+000 / 3.51e+000    1.73e-012 / 2.55e-012    3.20e+000 / 9.64e+000
f16 (D)       1.20e+001 / 2.99e+001    1.39e+001 / 1.11e+001    4.56e+000 / 7.89e+000

C. Performance Comparisons between BELPSO and Some Heuristic Population-based Algorithms

In this section, we compare the performance of BELPSO with CSA [25] and DEA [26].

REFERENCES
[1] R. Eberhart and J. Kennedy, "A new optimizer using particle swarm theory," in Proceedings of the Sixth International Symposium on Micro Machine and Human Science, 1995, pp. 39-43.


[2] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proceedings of the IEEE International Conference on Neural Networks, 1995, pp. 1942-1948.
[3] S. Meshoul and T. Al-Owaisheq, "QPSO-MD: a quantum behaved particle swarm optimization for consensus pattern identification," in Computational Intelligence and Intelligent Systems, vol. 51, 2009, pp. 369-378.
[4] P. M. Pradhan, V. Baghel, G. Panda, and M. Bernard, "Energy efficient layout for a wireless sensor network using multi-objective particle swarm optimization," in IEEE International Advance Computing Conference, 2009, pp. 65-70.
[5] Y. Shi and R. Eberhart, "A modified particle swarm optimizer," in Proceedings of the 1998 IEEE International Conference on Evolutionary Computation, New York, NY, USA, 1998, pp. 69-73.
[6] Y. Shi and R. C. Eberhart, "Parameter selection in particle swarm optimization," in Evolutionary Programming VII: Proceedings of the 7th International Conference (EP98), Berlin, Germany, 1998, pp. 591-600.
[7] B. Liu, L. Wang, Y. H. Jin, F. Tang, and D. X. Huang, "Improved particle swarm optimization combined with chaos," Chaos, Solitons & Fractals, vol. 25, pp. 1261-1271, 2005.
[8] M. Clerc and J. Kennedy, "The particle swarm - explosion, stability, and convergence in a multidimensional complex space," IEEE Transactions on Evolutionary Computation, vol. 6, pp. 58-73, 2002.
[9] P. J. Angeline, "Using selection to improve particle swarm optimization," in Proceedings of the IEEE World Congress on Computational Intelligence, 1998, pp. 84-89.
[10] M. Lovbjerg, T. K. Rasmussen, and T. Krink, "Hybrid particle swarm optimiser with breeding and subpopulations," in Proceedings of the Genetic and Evolutionary Computation Conference, 2001, pp. 469-476.
[11] F. van den Bergh and A. P. Engelbrecht, "A cooperative approach to particle swarm optimization," IEEE Transactions on Evolutionary Computation, vol. 8, pp. 225-239, 2004.
[12] J. J. Liang, A. K. Qin, P. N. Suganthan, and S. Baskar, "Comprehensive learning particle swarm optimizer for global optimization of multimodal functions," IEEE Transactions on Evolutionary Computation, vol. 10, pp. 281-295, 2006.
[13] J. M. Baldwin, "A new factor in evolution," The American Naturalist, vol. 30, no. 354, pp. 441-451, 1896.
[14] D. Whitley, V. S. Gordon, and K. Mathias, "Lamarckian evolution, the Baldwin effect and function optimization," in Parallel Problem Solving from Nature - PPSN III, Berlin, Germany, 1994, pp. 6-15.
[15] R. W. Anderson, "Learning and evolution: a quantitative genetics approach," Journal of Theoretical Biology, vol. 175, pp. 89-101, 1995.
[16] G. E. Hinton and S. J. Nowlan, "How learning can guide evolution," in Adaptive Individuals in Evolving Populations: Models and Algorithms, 1996, pp. 447-454.
[17] R. K. Belew, "When both individuals and populations search: adding simple learning to the genetic algorithm," in Proceedings of the 3rd International Conference on Genetic Algorithms, Morgan Kaufmann, 1989, pp. 34-41.
[18] J. A. Bullinaria, "Exploring the Baldwin effect in evolving adaptable control systems," in Connectionist Models of Learning, Development and Evolution, 2000, pp. 231-242.
[19] I. Harvey, "Puzzle of the persistent question marks: a case study of genetic drift," Australian Electronics Engineering, vol. 27, pp. 15-15, 1994.
[20] J. Huang and H. Wechsler, "Visual routines for eye location using learning and evolution," IEEE Transactions on Evolutionary Computation, vol. 4, pp. 73-82, 2000.
[21] X. Yao, Y. Liu, and G. Lin, "Evolutionary programming made faster," IEEE Transactions on Evolutionary Computation, vol. 3, pp. 82-102, 1999.
[22] C.-Y. Lee and X. Yao, "Evolutionary programming using mutations based on the Levy probability distribution," IEEE Transactions on Evolutionary Computation, vol. 8, pp. 1-13, 2004.
[23] Z. Tu and Y. Lu, "A robust stochastic genetic algorithm (StGA) for global numerical optimization," IEEE Transactions on Evolutionary Computation, vol. 8, pp. 456-470, 2004.
[24] S. C. Esquivel and C. A. C. Coello, "On the use of particle swarm optimization with multimodal functions," in Proceedings of the Congress on Evolutionary Computation, 2003, pp. 1130-1136.
[25] L. N. de Castro and F. J. Von Zuben, "Learning and optimization using the clonal selection principle," IEEE Transactions on Evolutionary Computation, vol. 6, pp. 239-251, 2002.
[26] R. Storn and K. Price, "Differential evolution - a simple and efficient heuristic for global optimization over continuous spaces," Journal of Global Optimization, vol. 11, pp. 341-359, 1997.
[27] L. Zhang, M. Gong, L. Jiao, and J. Yang, "Improved clonal selection algorithm based on Baldwinian learning," in IEEE World Congress on Computational Intelligence, 2008, pp. 519-526.


Acoustic Emission Signal Feature Extraction in Rotor Crack Fault Diagnosis


Kuanfang He
Hunan Provincial Key Laboratory of Health Maintenance for Mechanical Equipment, Hunan University of Science and Technology, Xiangtan, 411201, China Email: hkf791113@163.com

Jigang Wu, Guangbin Wang


Engineering Research Center of Advanced Mine Equipment, Ministry of Education, Hunan University of Science and Technology, Xiangtan 411201, China Email: jgwuhust@gmail.com, jxxwgb@126.com

Abstract—On a rotor comprehensive fault simulation test bed, the characteristics of the acoustic emission signals of rotors with cracks of various depths and under various conditions are analyzed through acoustic emission extraction experiments. A noise-eliminating method for the acoustic emission signal based on the wavelet packet technique is investigated, and the acoustic emission characteristics of the rotor crack are obtained for fault diagnosis. Compared with vibration-based crack fault diagnosis, the advantages of the acoustic emission technique for early-stage crack fault diagnosis are highlighted. The diagnosis results are shown to be clear, reliable and accurate.

Index Terms—Rotor crack, acoustic emission, wavelet packet, fault diagnosis

I. INTRODUCTION

Rotor crack is a common fault in steam turbines, generators, compressors and other large rotating machinery. When such machinery fails suddenly in operation, it can cause huge economic losses and even catastrophic accidents, so rotor crack fault diagnosis technology has attracted attention from government, academia and industry. Reference [1] showed that acoustic emission, as a non-destructive testing method, has been applied widely since the middle of the last century. The conventional diagnosis method for rotor cracks processes and analyzes fault characteristics from the vibration signal. However, the vibration signals of rotating machinery faults are relatively complex and weak, and an early-stage failure sometimes produces no measurable vibration at all; identifying fault features by vibration methods under strong background noise is therefore a bottleneck problem [2, 3]. The relative motion of particles within a material, including atomic, molecular and particle-group motion, produces a transient release of strain energy in the form of an elastic wave, which can be used to identify and understand the internal state of the substance or structure. Currently, acoustic emission technology has been successfully

applied in rotating machinery fault diagnosis, including gearbox failures, rolling and sliding bearing failures, transmission component failures, rotor rub failures and so on. E. Govekar analyzed acoustic emission signals to monitor machining processes in 2000 [4]. Subsequently, L. D. Hall used acoustic emission to detect shaft-seal rubbing in large-scale turbines [5]. In [6], acoustic emission was introduced to diagnose continuous rotor-stator rubbing in large-scale turbine units, and the integration of acoustic emission with signal processing methods for fault diagnosis was proposed. To widen the range of applications of acoustic emission, H. M. Lei further studied the mechanism of the electromagnetic acoustic transducer for ultrasonic generation in ferromagnetic material [7]. As the technology developed further, [8] applied wavelet packets to acoustic emission signal feature extraction. References [9, 10] proved that, as equipment runs, the elastic wave within a metal structure is generated in the form of a Rayleigh wave and propagates along the surface of the material; this elastic wave can be detected by dedicated acoustic emission sensors. The acoustic emission technique can therefore be applied to rotor crack diagnosis, and its value for early crack diagnosis is highlighted by comparison with the time-domain graphs, power spectra and other tools of vibration-based crack fault diagnosis. Reference [11] proposed crack fault identification in a rotor shaft with an artificial neural network. In [12, 13], based on wavelet packet analysis, acoustic emission was applied to locate crack fault sources and to extract damage features of cables.
In this paper, the acoustic emission signal characteristics of rotor cracks of four different depths in the same material are studied, as well as the relationship between the acoustic emission signals and the depth, speed and load of the cracked rotor. The early-stage crack fault feature is extracted using the wavelet packet, which shows outstanding advantages in acoustic emission signal analysis and processing. The acoustic emission technique for detection

doi:10.4304/jcp.7.9.2120-2127


and fault diagnosis of rotor cracks in rotating machinery is of great significance.

II. EXPERIMENTAL METHODS AND DATA ACQUISITION

A. Experimental Condition and Method

Fig. 1 shows the rotor crack fault acoustic emission detection system. The entire experiment is performed on the MFSPK6 integrated fault simulation test bed from the United States. The acoustic emission (AE) acquisition system is the Danish SWAES full-waveform acoustic emission monitor shown in Fig. 2, and the acoustic emission acquisition

Figure 1. Rotor crack fault acoustic emission detection system diagram

software, shown in Fig. 3, was made by the Beijing Sublimation Company. Four home-made experimental shafts are shown in Fig. 4; their size (D x L) is 16 mm x 53 mm and the material is 1Cr18Ni9Ti. One of them is a good (uncracked) shaft; the remaining three were prepared with pre-cut wire cutting and fatigue testing equipment, with crack depths of 3 mm, 5 mm and 8 mm, a crack width of 0.12 mm, and the crack located near the rounded section at the centre of the shaft. A thin metal sheet with a thickness of 0.10 mm is embedded to simulate the opening and closing of the crack, and 502 glue is used to cement one side of the sheet to prevent it from falling off during the experiment. Fig. 5 shows the rotor testing equipment. The fault simulation platform of the U.S. SpectraQuest company was chosen as the cracked rotor's running environment. It can simulate 32 failure modes, such as bearing failure, gear failure, gearbox failure, shaft failure and electrical failure, each supported by typical fault conditions. Detecting the acoustic emission of rotor crack failures with this equipment is reliable and highly precise, and provides a very

Figure 2. Danish SWAES full-waveform acoustic emission detector
Figure 3. AE signal collection software interface
Figure 4. Four prepared shafts
Figure 5. American MFS integrated fault simulation test bed

valuable reference. The main components of the SWAES full-waveform acoustic emission detector include the SR15-type acoustic emission sensor, with a diameter of 19 mm, a weight of 34 g, a resonant frequency of 150 kHz, a sensitivity of 65 dB and an operating temperature range of -65~177. The signal acquisition element is a piezoelectric ceramic, which samples signals in the frequency range of 50~200 kHz. The PAI preamplifier has a frequency bandwidth of 0.02-2 MHz and a gain of 40 dB. The main SA4 amplifier covers the frequency range of 20-400 kHz with an input impedance of more than 50 MΩ. The 9812 data acquisition card is a 20 MHz, 4-channel analog card, and the portable computer is an IBM T43. The acoustic emission signal is obtained by the sensors, sent to the preamplifier and the main amplifier for processing, and then stored in the portable computer, which runs the acoustic emission data acquisition system and performs the subsequent signal processing and analysis. Another function of the main amplifier is to supply power to the acoustic emission acquisition card. In the experiment, two acoustic emission sensors are glued to the two ends of the bearing seats on the integrated fault simulation test bench. The cable is


laid out properly and does not contact the rotating shaft during the experiment.

B. Data Acquisition

In the experiment, we collect the acoustic emission signals of the four shafts under the same load at different speeds and under the same speed at different loads. The first step is to measure the noise signal: the good shaft is fitted and the integrated fault simulation test bed is run, and the signal collected in this state serves as a sample of the noise signal. The method for processing and denoising the crack acoustic emission signal is obtained by analyzing the characteristics of the sampled noise signal. The acoustic emission signals of the four shafts are then measured under the same load at different speeds and under the same speed at different loads. Acoustic emission signals usually have high amplitude compared with electromagnetic noise, so a reasonable threshold can be set in software to trigger signal collection. Fig. 6 shows the time-domain graphs of the original signals at a speed of 1800 r/min. Fig. 6(a) shows that the signal amplitude of the good shaft is low, in the range of 0.85 V~1 V, while Figs. 6(b), (c) and (d) show that the signal amplitude increases as the crack deepens; the time-domain signal of the good shaft is relatively flat, whereas the cracked shafts show clearly visible waveforms. Fig. 7 shows the time-domain graphs of the vibration signals of the same shafts at 1800 r/min; no significant difference between the good shaft and the cracked shafts can be seen. Acoustic emission detection therefore identifies failures, especially early-stage crack failures, more easily.
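The threshold-triggered collection described above can be sketched in a few lines. The sampling rate, the synthetic burst shape and the 0.5 V threshold below are illustrative assumptions, not the experiment's actual settings:

```python
import numpy as np

def detect_bursts(sig, fs, threshold, dead_time=1e-3):
    """Start times (s) of excursions above the amplitude threshold,
    merging re-triggers closer together than dead_time."""
    hits = np.flatnonzero(np.abs(sig) > threshold)
    starts, last = [], None
    for h in hits:
        if last is None or (h - last) / fs > dead_time:
            starts.append(h / fs)
        last = h
    return starts

fs = 1_000_000                                  # assumed 1 MHz sampling
rng = np.random.default_rng(0)
sig = 0.02 * rng.standard_normal(100_000)       # background noise
for t0 in (0.010, 0.050):                       # two synthetic AE bursts
    n0 = int(t0 * fs)
    t = np.arange(1000) / fs
    # decaying 150 kHz tone burst, mimicking an AE event
    sig[n0:n0 + 1000] += np.sin(2 * np.pi * 150_000 * t) * np.exp(-t / 3e-4)

starts = detect_bursts(sig, fs, threshold=0.5)
print([round(s, 4) for s in starts])            # -> [0.01, 0.05]
```

The dead-time merge keeps the many threshold crossings within one oscillating burst from being counted as separate events.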

Figure 6. Time-domain graphs of the AE signal at 1800 r/min: (a) good shaft; (b) 3 mm crack depth; (c) 5 mm crack depth; (d) 8 mm crack depth


Fig. 8 shows the curves of AE signal amplitude versus rotating speed with a load weight of 2.5 kg. The amplitude of the acoustic emission signal increases with the rotor speed no

Figure 7. Time-domain graphs of the vibration signal at 1800 r/min: (a) good shaft; (b) 3 mm crack depth; (c) 5 mm crack depth; (d) 8 mm crack depth
Figure 8. Effect of rotating speed on AE signal amplitude: (a) 3 mm shaft crack; (b) 5 mm shaft crack; (c) 8 mm shaft crack
Figure 9. Effect of load on AE signal amplitude: (a) 3 mm shaft crack; (b) 5 mm shaft crack; (c) 8 mm shaft crack


matter what the crack depth is. Fig. 9 shows the curves of AE signal amplitude versus load at a rotating speed of 1800 r/min.

III. FAULT FEATURE EXTRACTION OF THE CRACK ACOUSTIC EMISSION SIGNAL

Fig. 10(a) is the power spectrum of the vibration signal of the good shaft rotating at 1800 r/min, and Figs. 10(b), (c) and (d) are the power spectra of the original vibration signals of the cracked shafts at the same speed. No significant difference can be seen between Fig. 10(a) and Fig. 10(b), while the differences in Figs. 10(c) and 10(d) are obvious; this illustrates that vibration detection can hardly find the early-stage crack fault. Fig. 11(a) shows the power spectrum of the original acoustic emission signal of the good shaft rotating at the

Figure 10. Power spectra of the original vibration signals of the four shafts rotating at 1800 r/min: (a) good shaft; (b) 3 mm crack depth; (c) 5 mm crack depth; (d) 8 mm crack depth
Figure 11. Power spectra of the original AE signals of the four shafts rotating at 1800 r/min: (a) good shaft; (b) 3 mm crack depth; (c) 5 mm crack depth; (d) 8 mm crack depth


speed of 1800 r/min, and Figs. 11(b), (c) and (d) are the power spectra of the original acoustic emission signals of the cracked shafts at the same speed. Fig. 11 shows that the power spectra of the good shaft and the cracked shafts are very different: for the cracked shafts the frequency range is wider and the spectral lines at the characteristic frequencies are higher. The distribution of the acoustic emission signal energy and the concentration of the bands can be read from Figs. 11(b), (c) and (d), confirming the existence of the crack fault. However, the location and degree of the crack fault cannot be determined clearly from these figures, so it is hard to diagnose the crack fault precisely from the acoustic emission signals through the Fourier transform alone. In order to extract the crack fault features of the acoustic emission signal clearly, pretreatments such as trend elimination, mean removal and Gaussian noise filtering are applied to the original acoustic emission signal of the 3 mm deep crack shaft rotating at 1800 r/min. The wavelet packet feature extraction method is then used to decompose the denoised acoustic emission signal with Shannon entropy and a 4-layer db10 wavelet. The level-4 reconstructed signal is thus divided into 16 bands from high to low frequency, each band 15.625 kHz wide, and the wavelet packet decomposition coefficients of tree nodes (4,2) and (4,6) are selected for reconstruction. Fig. 12 shows the reconstructed acoustic emission signal and its power spectrum; a concentrated high-frequency spectrum between 55 kHz~70 kHz can be clearly seen, demonstrating that high-frequency elastic-wave acoustic emission occurs during crack propagation. Fig. 13 shows the original acoustic emission signal of the 3 mm deep crack shaft rotating at 3600 r/min; the amplitude of the time-domain signal is about 1.3 V. Fig. 14 is the corresponding power spectrum of the acoustic emission signal reconstructed after wavelet packet decomposition. It shows that the main spectral lines concentrate between 75 kHz and 95 kHz, lower-amplitude lines lie between 50 kHz and 75 kHz, and little frequency content is distributed above 100 kHz, illustrating that the crack propagation speed and the amplitude of the acoustic emission signals increase as the

Figure 12. Wavelet packet reconstruction of the AE signal and its power spectrum
Figure 13. Original AE signal of the 3 mm deep crack shaft at 3600 r/min
Figure 14. Power spectrum of the AE signal after wavelet packet decomposition and reconstruction
Figure 15. Original AE signal of the 3 mm deep crack shaft at 4500 r/min
Figure 16. Power spectrum of the AE signal after wavelet packet decomposition and reconstruction
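The band-splitting idea behind the 4-level wavelet packet decomposition shown in Figs. 12-16 can be sketched with a minimal two-channel filter bank. Haar analysis filters stand in for the paper's db10 wavelet purely to keep the example self-contained, and the sampling rate and test signal are invented for illustration:

```python
import numpy as np

# Orthonormal Haar analysis filters (db10 in the paper, Haar for brevity).
h = np.array([1.0, 1.0]) / np.sqrt(2.0)   # low-pass (approximation)
g = np.array([1.0, -1.0]) / np.sqrt(2.0)  # high-pass (detail)

def split(x):
    """One packet node split: filter, then downsample by two."""
    return np.convolve(x, h)[1::2], np.convolve(x, g)[1::2]

def packet_leaves(x, levels):
    """Leaves of a full wavelet-packet tree, in natural order (frequency
    ordering would additionally require a Gray-code permutation)."""
    nodes = [x]
    for _ in range(levels):
        nodes = [part for n in nodes for part in split(n)]
    return nodes

fs = 500_000                                   # assumed sampling rate, Hz
rng = np.random.default_rng(0)
t = np.arange(4096) / fs
x = np.sin(2 * np.pi * 60_000 * t) + 0.1 * rng.standard_normal(t.size)

bands = packet_leaves(x, levels=4)             # 16 sub-bands, fs/32 wide each
energy = np.array([float(np.sum(b * b)) for b in bands])
print(f"16 bands, energy preserved: {np.isclose(energy.sum(), np.sum(x*x))}")
```

Because the two filters form an orthonormal pair, the sub-band energies sum exactly to the signal energy; selecting and reconstructing individual nodes, as the paper does with nodes (4,2) and (4,6), isolates the bands where the AE energy concentrates.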


frequency becomes larger. Fig. 15 shows the original acoustic emission signal of the 3 mm deep crack shaft rotating at 4500 r/min; the amplitude of the time-domain signal reaches about 2 V, and the characteristics of the acoustic emission signal are obvious. Fig. 16 shows the power spectrum of the acoustic emission signal reconstructed after 4-layer wavelet packet decomposition: the high-frequency spectrum is mainly distributed between 90 kHz and 105 kHz, while a small number of low-amplitude spectral lines concentrate in the range from 50 kHz to 70 kHz and around 150 kHz. This demonstrates that the amplitude and the frequency of the acoustic emission signal increase as the rotating speed rises and the crack extension expands. The same study of the shafts with crack depths of 5 mm and 8 mm gives similar results: the fault characteristics become more obvious as the crack depth and the rotation speed increase. The above analyses show that acoustic emission is feasible for diagnosing early-stage crack failure accurately.

IV. CONCLUSIONS

(1) The comparison of acoustic emission and vibration signals applied to analyzing rotor crack faults shows that early-stage cracks are easier to detect by acoustic emission at high speed, whereas the vibration signal can identify the early-stage crack fault only at low rotation speed.

(2) The wavelet packet technique can extract the pulse signal that represents the fault source characteristics and obtain effective failure information from the acoustic emission signal. Applied to actual acoustic emission testing systems, it will greatly improve the relevant technical indicators, such as the fault coverage rate and the diagnosis rate, and reduce the false alarm rate.
(3) The features of the acoustic emission signals detected from shaft cracks in different states differ clearly over time, matching the actual crack propagation process. This is of great significance for the further application of acoustic emission technology to rotor crack fault diagnosis.

ACKNOWLEDGMENT

This work was supported in part by grants from the National Natural Science Foundation of China (51075140, 51005073), the Hunan Provincial Natural Science Foundation of China (11JJ2027), the Project of the Hunan Provincial Research Scheme (2011GK3052), the Scientific Research Fund of the Hunan Provincial Education Department (10C0682), the Ph.D. Start Fund (E51088), the CEEUSRO special plan of Hunan province (2010XK6066), the Industrial Cultivation Program of Scientific and Technological Achievements in Higher Educational Institutions of Hunan Province (10CY008), and the Aid Program for Science and Technology Innovative Research Teams in Higher Educational

Institutions of Hunan Province, which are gratefully acknowledged.

REFERENCES
[1] G. M. Liu, Non-destructive Testing Technology. Beijing: National Defence Industry Press, 2006.
[2] L. D. Hall and D. Mba, "Diagnosis of continuous rotor-stator rubbing in large scale turbine units using acoustic emissions," Ultrasonics, vol. 41, pp. 765-773, September 2004.
[3] E. Govekar, J. Gradisek and I. Grabec, "Analysis of acoustic emission signals and monitoring of machining processes," Ultrasonics, vol. 38, pp. 598-603, May 2000.
[4] M. W. Yang, Acoustic Emission Testing. Beijing: Mechanical Industry Press, 2005, pp. 108-117.
[5] R. Bagnoli and P. Citti, "Comparison of accelerometer and acoustic emission signals as diagnostic tools in assessing bearing," in Proceedings of the 2nd International Conference on Condition Monitoring, London, UK, pp. 117-125, May 1988.
[6] L. D. Hall and D. Mba, "The detection of shaft-seal rubbing in large scale turbines using acoustic emission," in 14th International Congress on Condition Monitoring and Diagnostic Engineering Management, Manchester, UK, pp. 21-28, September 2001.
[7] H. M. Lei, P. W. Que and Z. G. Zhang, "Mechanism study on electromagnetic acoustic transducer for ultrasonic generation in ferromagnetic material," Journal of Southeast University (English Edition), vol. 20, pp. 299-314, September 2004.
[8] C. J. Liao and X. L. Luo, "Wavelet packets apply in acoustic emission signal feature extraction," Electronic Measurement and Instrument, vol. 22, pp. 79-85, August 2008.
[9] C. Kumar and V. Rastogi, "A brief review on dynamics of a cracked rotor," International Journal of Rotating Machinery, vol. 32, pp. 1-6, February 2009.
[10] A. Alfyo, "Dual time-frequency-feature investigation and diagnostics of a cracked de-Laval rotor," vol. 2, pp. 1-7, September 2009.
[11] Y. Tao and H. Qingkai, "Crack fault identification in rotor shaft with artificial neural network," in 2010 Sixth International Conference on Natural Computation, China, pp. 1629-1633, March 2010.
[12] Z. G. Zhi and R. K. Sun, "Acoustic emission source location technique based on wavelet packet analysis," Journal of Jiangsu University, vol. 21, pp. 109-113, September 2010.
[13] D. Yang, Y. L. Ding and A. Q. Li, "Feature extraction of acoustic emission signals for cable damage based on wavelet packet analysis," Journal of Vibration and Shock, vol. 29, pp. 154-158, April 2010.

Kuanfang He was born on December 13, 1979, in Hunan province. He received his master's degree in 2006 and his Ph.D. in 2009 from the South China University of Technology; his research field is dynamic monitoring and control for electromechanical systems. He has been working at Hunan University of Science and Technology since 2009, mainly engaged in the theory and practice of teaching and in scientific research. His papers include: "Research on Controller of Arc welding Process Based on PID Neural Network," Control Theory and Applications (2008); "Fuzzy logic control strategy for


submerged arc automatic welding of digital controlling, China Welding (2008); Time-Frequency Entropy Analysis of Arc Signal in Non-stationary Submerged Arc Welding, Engineering (2011); Wavelet Analysis for Electronic Signal of Submerged Arc Welding Process, ICMTMA (2011); Modeling and Analysis of arc weld inverter based on Double closed-loop control, Applied Mechanics and Materials (2010); Prediction Model of Twin-Arc High Speed Submerged Arc Weld Shape Based on Improved BP Neural Network, Advanced Materials Research (2011), etc. His research interest is dynamic monitoring and control of electromechanical systems. Dr. He is a member of the Hunan Province Instrumentation Institute. In recent years, he has presided over one National Natural Science Foundation of China project and two provincial research projects, received three provincial academic awards, and published more than 20 academic papers. Jigang Wu, male, was born on August 3, 1978 in Hunan Province. He received his master's degree in 2004 and his Ph.D. in 2008 from the Huazhong University of Science and Technology; his research fields are dynamic monitoring and control of electromechanical systems and fault diagnosis. He has been working at Hunan University of Science and Technology since 2008, mainly engaged in the theory and practice of teaching and in scientific research. His published papers include: Subpixel Edge Detection of Machine Vision Image for Thin Sheet Part, China Mechanical Engineering (2009); Research on planar contour primitive recognition method based on curvature and HOUGH transform,

Journal of Electronic Measurement and Instrument (2010), etc. His research interests are dynamic monitoring and control of electromechanical systems and fault diagnosis. Dr. Wu is a member of the Hunan Province Instrumentation Institute. In recent years, he has presided over two provincial research projects, received three provincial academic awards, and published more than 10 academic papers. Guangbin Wang, male, was born in December 1974 in Linzhou City, Henan Province. He obtained his Ph.D. in 2010 from Central South University; his research fields are manifold learning and equipment fault diagnosis. He has been working at Hunan University of Science and Technology since 1999, mainly engaged in student management, teaching, and scientific research. His published papers include: Fault diagnosis based on kernel schur orthogonal local fisher discriminant, Chinese Journal of Scientific Instrument (2010); Rotor Fault Diagnosis Based on Orthogonal Iteration Local Fisher Discriminant, Journal of Vibration, Measurement and Diagnosis (2010); An improved noise reduction algorithm based on manifold learning and its application to signal noise reduction, Applied Mechanics and Materials (2010). His research interests are nonlinear signal processing and equipment condition monitoring and fault diagnosis. Dr. Wang is a member of the Hunan Province Instrumentation Institute. In recent years, he has presided over two provincial research projects, received two provincial academic awards, and published more than 20 academic papers.


Integrability of the Reduction Fourth-Order Eigenvalue Problem


Shuhong Wang
College of Mathematics, Inner Mongolia University for Nationalities, Tongliao, China Email: shuhong7682 @163.com

Wei Liu
Department of Mathematics and Physics, ShiJiaZhuang TieDao University, Shijiazhuang, China Email: Lwei_1981@126.com

Shujuan Yuan
Qinggong College, Hebei United University, Tangshan, China Email: yuanshujuan1980@163.com

Abstract: To study the reduced fourth-order eigenvalue problem, the Bargmann constraint of this problem is given, and the associated Lax pairs are nonlinearized. From the viewpoint of Hamiltonian mechanics, the Euler-Lagrange function and the Legendre transformations are derived, and a reasonable Jacobi-Ostrogradsky coordinate system is found. Then, the Hamiltonian canonical coordinate system equivalent to this eigenvalue problem is obtained on the symplectic manifold. It is proved to be an infinite-dimensional integrable Hamiltonian system in the Liouville sense. Moreover, the involutive representation of the solutions is generated for the evolution equation hierarchy corresponding to this reduced fourth-order eigenvalue problem.

Index Terms: constraint flow, Bargmann system, integrable system, involutive representation

I. INTRODUCTION

The theory of integrable systems has long been an interesting and important subject. Finite-dimensional integrable systems are used to describe problems in mathematical physics, mechanics, etc., such as the Kovalevskaya top, geodesic flows on the ellipsoid, the harmonic oscillator on the sphere, and the Calogero-Moser system [1-3]. The soliton equation, as an infinite-dimensional integrable system, is one of the most prominent subjects in nonlinear science. Solitons were first observed by J. Scott Russell in 1834 while riding on horseback beside the narrow Union Canal near Edinburgh, Scotland. There are a number of discussions in the literature describing Russell's observations. At the center of these observations is the discovery that these nonlinear waves can interact elastically and continue afterward almost as if there had been no interaction at all (see Fig. 1 - Fig. 5). Because of the analogy with particles, Zabusky and Kruskal named these special waves solitons. Zabusky and Kruskal's remarkable numerical discovery demanded an analytical

explanation and a detailed mathematical study of the underlying partial differential equations. In 1968, Lax put the inverse scattering method for solving the KdV equation into a more general framework, which subsequently paved the way to generalizations of the technique as a method for solving other partial differential equations [1-4]. The technique of the so-called nonlinearization of Lax pairs [5, 8-10] has been developed and applied to various soliton hierarchies, from which a large number of interesting finite-dimensional Liouville integrable Hamiltonian systems have been obtained. Recently, this method was generalized to the nonlinearization of Lax pairs and adjoint Lax pairs of soliton hierarchies [8-11]. The nonlinearization approach (or constrained flows) for eigenvalue problems or Lax pairs has been used to seek the relations between infinite- and finite-dimensional integrable systems. A series of finite integrable systems has been obtained by this approach [8-19].

Figure 1. Bell-shaped solitons at different times

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2128-2135


Figure 2. Bell-shaped solitons at different times

the viewpoint of Hamiltonian mechanics, the Euler-Lagrange function and the Legendre transformations are derived, and a reasonable Jacobi-Ostrogradsky coordinate system is found. Then, the Hamiltonian canonical coordinate system equivalent to the eigenvalue problem (1) is obtained on the symplectic manifold. It is proved to be an infinite-dimensional integrable Hamiltonian system in the Liouville sense. Moreover, the involutive representation of the solutions is generated for the evolution equation hierarchy corresponding to the reduced fourth-order eigenvalue problem (1).

II. MAIN RESULTS

Figure 3. Bell-shaped solitons at different times

A. Evolution Equations and Lax Pairs

Now, suppose $\Omega$ is the basic interval of the reduced eigenvalue problem (1). If the potentials $u, v$ in (1) and their derivatives with respect to $x$ all decay at infinity, then $\Omega = (-\infty, +\infty)$; if they are all periodic functions with period $T$, then $\Omega = [0, 2T]$.

Definition 1 [8, 9]. Assume that our linear space is equipped with the $L^2$ scalar product
$$(f, g)_{L^2(\Omega)} = \int_{\Omega} f g^* \, dx < \infty,$$

where the symbol $*$ denotes the complex conjugate.

Definition 2 [8, 9]. The operator $A^*$ is called the dual operator of $A$ if
$$(Af, g)_{L^2(\Omega)} = (f, A^* g)_{L^2(\Omega)},$$
and $A$ is called a self-adjoint operator if $A^* = A$.

Figure 4. Bell-shaped solitons at different times

We consider the following fourth-order operator $L$ on the interval $\Omega$:
$$L = \partial^4 + \partial u \partial + v,$$
where $u, v$ are the potential functions of the eigenvalue problem (1). $\lambda$ is called an eigenvalue of the eigenvalue problem (1), and $\varphi$ is called an eigenfunction for the eigenvalue $\lambda$, if $L\varphi = \lambda\varphi$,

Figure 5. Bell-shaped solitons at different times

The fourth-order eigenvalue problem
$$L\varphi = (\partial^4 + q\partial^2 + \partial^2 q + p\partial + \partial p + r)\varphi = \lambda\varphi$$
has been discussed, but not its reduced system $L\varphi = (\partial^4 + \partial u \partial + v)\varphi = \lambda\varphi$. In this paper, we consider the reduced fourth-order eigenvalue problem (1):
$$L\varphi = (\partial^4 + \partial u \partial + v)\varphi = \lambda\varphi,$$
where $\partial = \partial/\partial x$, the eigenparameter $\lambda \in \mathbb{R}$, and the potentials $u, v$ are real functions of $(x, t)$. By means of

$\varphi \in L^2(\Omega)$ is a non-trivial solution of the eigenvalue problem (1).

Theorem 1. $L = \partial^4 + \partial u \partial + v$ is a self-adjoint operator on $\Omega$.

Proof. By Definition 2 and repeated integration by parts, in which the boundary terms vanish under the decay (or periodicity) assumptions on $\Omega$, we obtain $(Lf, g)_{L^2(\Omega)} = (f, Lg)_{L^2(\Omega)}$, so Theorem 1 is easily derived.

The derived function $L'(\xi)$ of the differential operator $L = L(u)$ is defined by $L'(\xi) = \frac{d}{d\epsilon} L(u + \epsilon\xi)\big|_{\epsilon=0}$, where $u$ is the potential function of $L$. Then we have

Theorem 2. If $\varphi$ is an eigenfunction corresponding to the eigenvalue $\lambda$ of (1), then the functional gradient [20] is
$$\nabla\lambda = \begin{pmatrix} \nabla_u \lambda \\ \nabla_v \lambda \end{pmatrix} = \left( \int_\Omega \varphi^2 \, dx \right)^{-1} \begin{pmatrix} -\varphi_x^2 \\ \varphi^2 \end{pmatrix} \tag{2}$$

Proof. By (1), we have $(\delta L)\varphi + L\,\delta\varphi = (\delta\lambda)\varphi + \lambda\,\delta\varphi$ and


( L ) dx =

L dx = dx

( )

then hence

L =
L =

Hamilton operators by the definition of the bi-Hamilton operators. By complex computing, we can obtain (6), (7). Set JG0 = 0 , then
4 G0 = 0 or
u G0 = 4 So, we have
Ltm = [Wm , L] = [ L* ( KG j ) L* ( JG j ) L]Lm j
j =0 m

(8)

Namely: dx = ( L ) dx

= ( u 2 + ux + v ) dx = ( v 2 u 2 )dx

(9)

2 = u xx u ( x + xx ) + v 2 dx

so 2 u = = ( 2 dx) 1 2x v Definition 3 [21]. The commutator of operators W and L is defined as follows [W , L ] = WL LW . Set m 1 1 1 3 2 Wm = 4 b j 8 b jx + 8 ( 2a j + b jxx + 2ub j ) j =0 1 ( 3a jx + b jxxx + ub jx ) Lm j , m = 0,1, 2L 8 aj G j = , j = 0,1, 2,L m bj 0 J= 3 3 1 1 + u + ux 4 2 4 K11 K12 K = K 21 K 22 here 4 1 1 K11 = 3 + u + u x , 5 2 4 3 3 3 3 3 K12 = 5 + u 3 + u x 2 + u xx + v + vx , 8 8 4 8 4
3 3 3 1 K 21 = 5 + u 3 + u x 2 + v + vx , 8 8 8 4 1 7 1 5 5 4 3 1 3 K22 = + u + ux + uxx + u2 + v 3 8 4 8 8 4 4
here, L* ( ) =

= L* ( KGm ) L* ( JG0 ) Lm +1 = L* ( KGm ) = L* ( JGm +1 )

d d

q 2 + : R R is one to one =0 p

[5, 8-10]. By the Hamilton operators $J$ and $K$ in (4), (5), we have:

Theorem 4. Set $JG_0 = 0$; then the m-th-order evolution equations of (1) are
$$q_{t_m} = X_m = JG_{m+1} = KG_m \tag{10}$$
m

(3)

(4)

where $q = (u, v)^T$, $m = 0, 1, 2, \ldots$, and (10) becomes the isospectral compatibility condition of the following Lax pairs:
$$L\varphi = \lambda\varphi, \qquad \varphi_{t_m} = W_m \varphi \tag{11}$$
where $m = 0, 1, 2, \ldots$. So, we have the first evolution equation

(5)

$$u_{t_0} = u_x, \qquad v_{t_0} = v_x \tag{12}$$
and its Lax pairs
$$L\varphi = \lambda\varphi, \qquad \varphi_{t_0} = \varphi_x \tag{13}$$
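Equations (12)-(13) can be checked directly; a sketch, assuming the operator form $L = \partial^4 + \partial u \partial + v$ used above: differentiating $L\varphi = \lambda\varphi$ with respect to $t_0$ and substituting $\varphi_{t_0} = \varphi_x$ gives

```latex
L_{t_0}\varphi + L\varphi_x \;=\; \lambda\varphi_x \;=\; \partial(L\varphi)
\quad\Longrightarrow\quad
L_{t_0}\varphi \;=\; [\partial, L]\varphi \;=\; (\partial u_x \partial + v_x)\varphi ,
```

and comparing coefficients yields $u_{t_0} = u_x$ and $v_{t_0} = v_x$: the first flow is simply translation in $x$.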

3 9 1 1 7 + uxxx + uux + vx 2 + vxx + uxxxx 2 8 8 8 8 1 1 1 1 1 1 + uuxx + uv + ux2 + uvx + vux + vxxx 8 2 8 4 4 4 Theorem 3 Operators J and K are the bi-Hamilton operators [5, 8, 9], moreover if and are the eigenvalue and eigenfunction of (1), then JGm +1 = KGm m = 0,1, 2L (6) K = J (7) Proof. By means of the Definition of the bi-Hamilton operators, it is easily derived that J and K are the bi 2012 ACADEMY PUBLISHER

or the first evolution equation 5 3 ut0 = 4 u xxx 4 uu x + 3vx (14) 3 3 3 v = 3 u uu xxx u x u xx + vxxx + uvx xxxxx t0 8 8 8 4 and its Lax pairs L = (15) 3 3 t0 = 8 u x + 4 u x Remark 1The condition (8) generates the Bargmann system for the fourth-order eigenvalue problem (1); the condition (9) generates the C.Neumann system for the fourth-order eigenvalue problem (1) [1, 3-5].


B. Jacobi-Ostrogradsky Coordinates and Hamilton Canonical Forms of the Bargmann System

Now, suppose $\{\lambda_j, \varphi_j\}$ $(j = 1, 2, \ldots, N)$ are eigenvalues and eigenfunctions of the fourth-order eigenvalue problem (1), with $\lambda_1 < \lambda_2 < \cdots < \lambda_N$; then
$$L\varphi_j = (\partial^4 + \partial u \partial + v)\varphi_j = \lambda_j \varphi_j$$

I=

1 1 1 xx , xx + , , xx + , x 2 2 4 3 1 3 + , , 48 2

(23)

(16)

here, = diag (1 , 2 ,L , N ) , = diag (1 , 2 ,L , N )T . From (7), we have N N K j k j = J j k +1 j , j =1 j =1 here k = 0,1, 2,L . So k x , x k +1 x , x = J , K (17) k , k +1, here k = 0,1, 2,L . Set G0 = (4, 0)T , using (6) , we have:
3 3 u u2 + v G1 = 4 xx 8 u Now we define the constrained Bargmann system 3 3 u u2 + v x , x G1 = 4 xx 8 = , u

Theorem 5. The Bargmann system (21) for the fourth-order eigenvalue problem (1) is equivalent to the Euler-Lagrange equation
$$\frac{\delta I}{\delta \varphi} = 0 \tag{24}$$

Proof. By (22), we have

I I I I = + x x xx xx 1 2 = + 3 , + 4 x , x + 12 , xx 8 + 2 , x x + , xx + xxxx

so

I = L = 0
Set $y_1 = \varphi$, $y_2 = \varphi_x$, and $h = \sum_{j=1}^{2} \langle y_{jx}, z_j \rangle - I$; our

system to the

aims are to find the coordinates $z_1, z_2$ and the Hamilton function $h$ that satisfy the following Hamilton canonical equations:
$$y_{jx} = \{y_j, h\} = \frac{\partial h}{\partial z_j}, \qquad z_{jx} = \{z_j, h\} = -\frac{\partial h}{\partial y_j}, \qquad j = 1, 2, \tag{25}$$
where the symbol

(18)

then, we have the relation between the potential (u , v) and the eigenvector as follows , u =3 (19) v , xx + 1 x , x + 3 , 2 2 8 2 From (6), (17), (18) and (19), we have k 1 x , x , Gk = k 1, here k = 1, 2L . Based on the constrained system (19), the eigenvalue problem L = (20) is equivalent to the following system
xxxx + , xx + 2 , x x 1 3 3 + , xx + x , x + , 2 8 2
2

$\{\cdot, \cdot\}$ is the Poisson bracket in the symplectic space $\left(\mathbb{R}^{4N}, \omega^2 = \sum_{j=1}^{2} dy_j \wedge dz_j\right)$, and the Poisson bracket of Hamilton functions $F, H$ in the symplectic space is defined as follows [1, 3, 4, 5]:
$$\{F, H\} = \sum_{j=1}^{2} \sum_{k=1}^{N} \left( \frac{\partial F}{\partial y_{jk}} \frac{\partial H}{\partial z_{jk}} - \frac{\partial H}{\partial y_{jk}} \frac{\partial F}{\partial z_{jk}} \right) = \sum_{j=1}^{2} \left( \langle F_{y_j}, H_{z_j} \rangle - \langle F_{z_j}, H_{y_j} \rangle \right)$$
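As an illustration only (not from the paper), the canonical Poisson bracket defined above can be checked numerically with central finite differences; `F` and `H` are arbitrary callables of the phase-space vectors `(y, z)`, and all names here are hypothetical:

```python
def poisson_bracket(F, H, y, z, h=1e-6):
    """Numerically evaluate {F,H} = sum_j (dF/dy_j dH/dz_j - dF/dz_j dH/dy_j)
    at the phase-space point (y, z), using central finite differences."""
    def dy(f, j):  # partial derivative of f with respect to y[j]
        yp, ym = list(y), list(y)
        yp[j] += h; ym[j] -= h
        return (f(yp, z) - f(ym, z)) / (2 * h)
    def dz(f, j):  # partial derivative of f with respect to z[j]
        zp, zm = list(z), list(z)
        zp[j] += h; zm[j] -= h
        return (f(y, zp) - f(y, zm)) / (2 * h)
    return sum(dy(F, j) * dz(H, j) - dz(F, j) * dy(H, j)
               for j in range(len(y)))

# For F = y1*z1 and H = y1**2 the bracket is {F,H} = -2*y1**2,
# and the bracket is antisymmetric: {H,F} = -{F,H}.
F = lambda y, z: y[0] * z[0]
H = lambda y, z: y[0] ** 2
```

Antisymmetry, bilinearity, and the Jacobi identity of this bracket are exactly what make the involutivity conditions $\{h_m, h_n\} = 0$ below meaningful.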

By $h = \sum_{j=1}^{2} \langle y_{jx}, z_j \rangle - I$, then

(21)

$$dh = \sum_{j=1}^{2} \left( \langle y_{jx}, dz_j \rangle + \langle z_j, dy_{jx} \rangle \right) - dI.$$

Remark 2. We call the equation system (21) the Bargmann system for the fourth-order eigenvalue problem (1). In order to obtain the Hamilton canonical forms equivalent to the Bargmann system (21), the Lagrange

On the other hand, h = h( y j , z j ),( j = 1, 2) , so


$$dh = \sum_{j=1}^{2} \left( \left\langle \frac{\partial h}{\partial y_j}, dy_j \right\rangle + \left\langle \frac{\partial h}{\partial z_j}, dz_j \right\rangle \right)$$

function I [8-9] is defined as follows:


$$I = \int_\Omega \mathcal{I} \, dx \tag{22}$$

$$= \sum_{j=1}^{2} \left( -\langle z_{jx}, dy_j \rangle + \langle y_{jx}, dz_j \rangle \right)$$

so that

here,


$$dI = \langle z_1, dy_{1x} \rangle + \langle z_2, dy_{2x} \rangle + \langle z_{1x}, dy_1 \rangle + \langle z_{2x}, dy_2 \rangle = \langle z_1, d\varphi_x \rangle + \langle z_2, d\varphi_{xx} \rangle + \langle z_{1x}, d\varphi \rangle + \langle z_{2x}, d\varphi_x \rangle = \langle z_{1x}, d\varphi \rangle + \langle z_{2x} + z_1, d\varphi_x \rangle + \langle z_2, d\varphi_{xx} \rangle$$

(32) and the system of the Lax pairs for the evolution equation hierarchy (10) is equivalent to (33) Ytm = WmY where m = 0,1, 2,L , and

we have the relations:


1 1 1 1 z1 = xxx + 2 x , + 2 , x = xxx + 2 x u + 4 ux z = + 1 , = + 1 u xx xx 2 2 2

Wm = (ij m )44
m

m = 0,1, 2L

(34)

11m = b j 1xxx + 12 m

Theorem 6. The Jacobi-Ostrogradsky coordinates are as follows y1 = y2 = x 1 1 (26) z1 = xxx + u x + u x 2 4 1 z2 = xx + u 2 and the Bargmann system (21) for the fourth-order eigenvalue problem (1) is equivalent to the Hamilton canonical system h y jx = { y j , h} = z j (27) h z = { z , h} = j jx y j here j = 1, 2 , and the Hamilton function h is 1 1 1 2 h = y1 , y1 y2 , z1 + z2 , z2 y1 , y2 2 2 4 (28) 1 1 3 + y1 , y1 y1 , y2 + y1 , y1 2 16 Remark 3. Based on the Jacobi-Ostrogradsky coordinate system (26), the Bargmann constrained equations associated with the fourth-order eigenvalue problem (1) is u = y1 , y1 (29) 3 1 3 2 v = y1 , z2 + y2 , y2 y1 , y1 2 2 8

1 1 3 1 ub j 1x + ux b j 1 + a j 1x m j 8 16 16 8 j =0 m 1 1 1 = b j 1xx + ub j 1 + a j 1 m j 8 4 j =0 8 1 j =0 8 m 1 = b j 1 m j 4 j =0
m

13m = b j 1x m j 14 m

21m = b j 1xxxx ub j 1xx

1 j =0 8
m

1 8

5 u x b j 1x 32

1 3 1 vb j 1 a j 1xx ua j 1 m j 4 8 8

22 m = 23m

1 1 ub j 1x + a j 1x m j 8 j = 0 16 m 1 = a j 1 m j 4 j =0
m

24 m = b j 1x m j
31m = bj 1xxxxx +
1 j =0 16
m

1 j =0 8
m

3 1 1 1 uxbj 1xx + u2 + v + uxx bj 1x 32 8 16 32

1 1 1 1 + vx uu x bj 1 + aj 1xxx + ux aj 1 m j 32 4 16 8 m 1 bj 1x m j +1 8 j =0

32 m = b j 1xxxx u x b j 1x + ( v

1 j =0 8
m

1 8

1 4

1 2 u )b j 1 16

C. Hamilton Equations of Bargmann System Theorem 7 The Bargmann system (21) for the fourthorder eigenvalue problem (1) is equivalent to: Yx = MY (30) where, Y = ( y1 , y2 , y3 , y4 )T T (31) 1 1 1 = , x , xx + u, xxx + u x + u x 2 2 4 0 E 0 0 1 uE 0 E 0 2 M = 1 0 0 E ux E 4 1 1 1 1 + v u xx u 2 E u x E uE 0 4 4 4 2

m 1 1 + a j 1xx m j b j 1 m j +1 2 j =0 4


33m =
34 m

1 1 ub j 1x a j 1x m j 8 j = 0 16 m 1 1 1 = b j 1xx ub j 1 a j 1 m j 8 4 j =0 8
m

13m =
14 m

41m =

1 5 1 ubj 1xxxxxx ub j 1xxxx ux bj 1xxx 32 32 j = 0 32


m

1 j 1 y1 , y2 j =1 4 m 1 = j 1 y1 , y1 j =1 4
m
m

m j

21m =

1 3 1 3 1 uxx + v bj 1xx uux + vx + uxxx ) j 1x 4 16 32 32 32 1 1 2 1 1 vxx + u x b j 1 a j 1xxxx + ua j 1xx 64 16 8 16 1 1 1 + u 2 v + uxx a j 1 ]m j 4 16 16 m 1 1 + bj 1xx + a j 1 m j +1 4 8 j =0

22 m

1 j =1 4 m 1 = j =1 4
m

m j 1 j 1 y2 , z1 m j y1 , y1 m 2 j 1 y2 , z2 m j

23m =

24 m

1 j 1 y2 , y2 j =1 4 m 1 = j 1 y2 , y1 j =1 4
m

m j m

m j m j 1 m y1 , y2 2
m j

m 42

3 1 1 1 1 = bj 1xxxxx + uxbj 1xx + u2 + v + uxx bj 1x 32 8 16 32 j =0 16


m

31m =
32 m

1 1 1 1 + vx uu x bj 1 + a j 1xxx + ux aj 1 m j 8 32 4 16 m 1 m j +1 bj 1x j =0 8

1 j 1 z2 , z1 j =1 4 m 1 = j 1 z2 , z2 j =1 4
1 j 1 y2 , z 2 j =1 4 m 1 = j 1 y1 , z2 j =1 4
m

33m =

m j

43m = b j 1xxxx ub j 1xx +

1 j =0 8
m

1 8

5 1 u x b j 1x + vb j 1 32 4

m 3 1 1 + a j 1xx + ua j 1 m j b j 1 m j +1 8 8 j =0 4

44 m = b j 1xxx

1 1 3 1 ub j 1x u x b j 1 a j 1x m j 8 16 16 8 j =0 Proof. By direct computing, Theorem 7 is derived. Now, substituting (18)-(20) into (30) and (31), then Yx = MY (35) Ytm = WmY
m

there m = 0,1, 2,L ,

m j m m 1 41m = j 1 z1 , z1 m j + m +1 + y1 , z2 m j =1 4 3 2 + y1 , y1 m 8 m 1 1 42 m = j 1 z2 , z1 m j y1 , y2 m 2 j =1 4 m 1 1 43m = j 1 y2 , z1 m j + y1 , y1 m 2 j =1 4

34 m

M = and

0 E 1 0 y1, y1 E 2 1 y1, y2 E 0 2 3 1 1 2 y1, z2 y1, y1 E y1, y2 E 8 2 2

0 0 E 0 E y1, y1 E 0 (36) 0 (37)

44 m =

1 j 1 y1 , z1 j =1 4
m

m j

Wm = (ij m )
m

4 4

m = 0,1, 2L

So that, the following theorem holds: Theorem 8 On the Bargmann constrained equation (29), the evolution equation hierarchy (10) of the fourthorder eigenvalue problem (1) are nonlinearized as the following Hamilton canonical equation system h h Yx = Z Z x = Y , Y = hm Z = hm tm Z tm (38) Y there m = 0,1, 2,L , and

11m =
12 m

1 j 1 y1 , z1 m j 4 j =1 m 1 = j 1 y1 , z2 m j + m j =1 4


hm =

1 m +1 1 y1 , y1 + m y2 , z1 + m y1 , y2 y1 , y2 2 4 1 1 2 m m y1 , y1 y1 , z2 y1 , y1 y1 , y1 2 16 1 1 m m z2 , z2 + m j z1 , z1 j 1 y1 , y1 2 8 j =1

here m = 0,1, 2,L . here m, n = 0,1, 2,L

() {hm , hn } = 0 .

(42)

+ m j z2 , z2 m j y2 , z 2 +

j 1 y2 , y2 j 1 y2 , z2

) )

m j y1 , z1 j 1 y1 , z1

1 m m j z2 , z1 j 1 y1 , y2 m j y2 , z1 j 1 y1 , z2 4 j =1

D. Integrable System of Hamilton Equations Set (1) 1 1 1 2 2 2 E j = 2 j y1 j 2 y1 , z2 + 8 y1 , y1 y1 j 1 2 1 + y2 j z1 j z2 j + y1 , y2 y1 j y2 j 2 4 (2) 1 ( l , k ) E j = j , l , k = 1, 2 8 there

2 2 ( ) R 4 N , dy j dz j , h and R 4 N , dy j dz j , hm j =1 j =1 are integrable system in the Liouville sense. Proof. In fact, we obtain (40) by direct computing. So, (41) holds by Theorem 9 and (40). Using (40) and (41), we have (42). According to the Liouville theorem[1, 3, 4], () is holds. Theorem 11. If { y j , z j j = 1, 2} is an involutive

solution of the integrable systems (38), then (30)

(39)

u = y1 , y1 3 1 3 v = y1 , z2 + y2 , y2 y1 , y1 2 2 8

(jl , k ) =

Theorem 9. () { E

ylj 1 i =1, i j j i yli

zlj ykj zli yki

zkj , l , k = 1, 2 zki

is the involutive representation of the solutions of the evolution equation hierarchy (10).

Proof. Since (38) gives the Lax pairs of the evolution equation hierarchy (10), Theorem 11 holds by means of Theorem 10.

III. CONCLUSIONS

The Bargmann constraint problem is studied in this paper, and the associated Lax pairs are nonlinearized. From the viewpoint of Hamiltonian mechanics, the Euler-Lagrange function and the Legendre transformations are derived, and a reasonable Jacobi-Ostrogradsky coordinate system is found. Then, the Hamiltonian canonical coordinate system equivalent to this eigenvalue problem is obtained on the symplectic manifold. It is proved to be an infinite-dimensional integrable Hamiltonian system in the Liouville sense. Moreover, the involutive representation of the solutions is generated for the evolution equation hierarchy corresponding to this reduced fourth-order eigenvalue problem.

REFERENCES
[1] M. J. Ablowitz and P. A. Clarkson, Solitons, Nonlinear Evolution Equations and the Inverse Scattering, Cambridge: Cambridge University Press, 1991. [2] V. I. Arnold, Mathematical Methods of Classical Mechanics, Second ed., Springer-Verlag, New York, pp. 58-65, 1999. [3] M. J. Ablowitz and H. Segur, Solitons and the Inverse Scattering Transform, SIAM Studies in Appl. Math., Philadelphia, 1981. [4] Z. B. Li, Travelling Wave Solutions of Nonlinear Mathematical-Physical Equations, Beijing: Science Press, 2007. [5] C. W. Cao and X. G. Geng, Research Reports in Physics, Proc. Conf. on Nonlinear Physics, Springer-Verlag, New York, pp. 66-78, 1990. [6] C. W. Cao, A classical integrable system and the involutivity of solutions of the KdV equation, Acta Math. Sinica, New Series, vol. 7, pp. 5-15, 1991. [7] H. Flaschka, Relations between infinite-dimensional and finite-dimensional isospectral equations, Proc. RIMS Symp.

(m) j

, j = 1, 2,L , N ; m = 1, 2} are the

involutive system, i.e. {E (j m) , E (j n) } = 0 here j, i = 1, 2,L, N ; m = 1, 2 () H = here


hm = j m ( E (1) + E (2) ) , m = 0,1, 2L j j
N j =1
1 ( E (1) + E (2) ) = m1hm j j j =1 j m=0 N

Proof. () By the definition of the Poisson bracket, it easily is derived. () N N 1 1 H = ( E (1) + E (2) ) = 1 ( E (1) + E (2) ) j j j j j j =1 j j =1 1
=

j 0 j =1 m =
N

(E

(1) j

+ E (2) ) j

= m 1 j m ( E (1) + E (2) ) j j
N m=0

j =1

= m 1hm
m=0

here

hm = j m ( E (1) + E (2) ) , m = 0,1, 2L j j


N j =1

Theorem 10. () h, E (j k ) = 0

(40) (41)

here j = 1,2,L, N; k = 1,2 . () {h, hm } = 0 .


On Nonlinear Integrable Systems, World Sci., Singapore, Kyoto, Japan, pp. 219-239, 1983.
[8] Z. Q. Gu, The Neumann system for the 3rd-order eigenvalue problems related to the Boussinesq equation, Il Nuovo Cimento, vol. B 117(6), pp. 615-632, 2002.
[9] Z. Q. Gu, Complex confocal involutive systems associated with the solutions of the AKNS evolution equations, J. Math. Phys., vol. 32(6), pp. 1498-1504, 1991.
[10] C. W. Cao, Symplectic manifold and integrable system, Journal of Shijiazhuang Railway Institute, vol. 2, no. 4, 1989.
[11] W. Liu and S. J. Yuan, The second-order spectral problem with the speed energy and its completely integrable system, Journal of Shijiazhuang Railway Institute, vol. 22(1), 2009.
[12] D. S. Wang, Complete integrability and the Miura transformation of a coupled KdV equation, Appl. Math. Lett., vol. 23, pp. 665-669, 2010.
[13] Y. Q. Yao, Y. H. Huang, Y. Wei, and Y. B. Zeng, Some new integrable systems constructed from the bi-Hamiltonian systems with pure differential Hamiltonian operators, Applied Mathematics and Computation, vol. 218, pp. 583-591, 2011.
[14] D. S. Wang, Integrability of a coupled KdV system: Painlevé property, Lax pair and Bäcklund transformation, Appl. Math. and Comp., vol. 216, pp. 1349-1354, 2010.
[15] Y. C. Cao and D. S. Wang, Prolongation structures of a generalized coupled Korteweg-de Vries equation and Miura transformation, Commun. in Nonl. Sci. and Num. Simul., vol. 15, pp. 2344-2349, 2010.
[16] X. Zeng and D. S. Wang, A generalized extended rational expansion method and its application to (1+1)-dimensional dispersive long wave equation, Appl. Math. Comput., vol. 212, pp. 296-304, 2009.

[17] B. Q. Xia and R. G. Zhou, Integrable deformations of integrable symplectic maps, Phys. Lett. A, vol. 373, pp. 1121-1129, 2009. [18] D. S. Wang and Z. F. Zhang, On the integrability of the generalized Fisher-type nonlinear diffusion equations, J. Phys. A: Math. Theor., vol. 42, 035209, 2009. [19] B. Q. Xia and R. G. Zhou, Integrable deformations of integrable symplectic maps, Phys. Lett. A, vol. 373, pp. 1121-1129, 2009. [20] M. Adler, On a trace functional for formal pseudo-differential operators and the symplectic structure of the KdV type equations, Inventiones Mathematicae, pp. 219-284, 1979. [21] P. J. Olver, Applications of Lie Groups to Differential Equations, Second ed., Springer-Verlag, Berlin Heidelberg New York, pp. 458-463, 1999.

Shuhong Wang was born in Huludao, Liaoning, China, in February 1980. She received the M.S. degree in 2006 from the School of Sciences, Hebei University of Technology, Tianjin, China. Her current research interests include integrable systems and inequalities.

Wei Liu was born in Shijiazhuang, Hebei, in January 1981. He received the M.S. degree in 2007 from the School of Sciences, Hebei University of Technology, Tianjin, China. His current research interests include integrable systems and computational geometry.

Shujuan Yuan was born in Tangshan, Hebei, in September 1980. She received the M.S. degree in 2007 from the School of Sciences, Hebei University of Technology, Tianjin, China. Her current research interests include integrable systems and computational geometry.


Effluent Quality Prediction of Wastewater Treatment System Based on Small-world ANN


Ruicheng Zhang
College of Electrical Engineering, Hebei United University, Hebei Tangshan 063009, China Email: rchzhang@yahoo.com.cn

Xulei Hu
College of Electrical Engineering, Hebei United University, Hebei Tangshan 063009, China Email: hxlei_2010@163.com

Abstract: In order to provide a tool for predicting wastewater treatment performance and to form a basis for controlling the operation of the process, an NW multi-layer feedforward small-world artificial neural network soft-sensing model is proposed for wastewater treatment processes. The input and output variables of the network model were determined according to the wastewater treatment system. The multi-layer feedforward small-world artificial neural network model was built, and the hidden layer structure of the network model was studied. The results of model calculation show that the predicted values match the measured values well, so the model can simulate and predict effluent quality and be used to optimize the operating status. The established prediction model provides a simple and practical way for the operation and management of a wastewater treatment plant, and has good research and engineering practical value.

Index Terms: NW small-world networks, multi-layer feedforward neural networks, wastewater plant, modeling, wastewater treatment

I. INTRODUCTION

The increased concern about environmental issues has encouraged specialists to focus their attention on the proper operation and control of wastewater treatment plants (WWTPs). The characteristics of the influent to a WWTP vary from one plant to another depending on the type of community lifestyle. Therefore, the performance of any WWTP depends mainly on the local experience of a process engineer who identifies certain states of the plant [1]. The type of influent for any plant is also time-dependent, and it is difficult to have a homogeneous influent to a WWTP [2]. This may result in an operational risk impact on the plant. Serious environmental and public health problems may result from improper operation of a WWTP, as discharging contaminated effluent to a receiving water body can cause or spread various diseases to human beings. Accordingly, environmental regulations set restrictions on the quality of effluent that must be met by any WWTP. Better control of a WWTP can be achieved by developing a robust mathematical tool for predicting the plant performance based on past observations of certain
2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2136-2143

key parameters. However, modeling a WWTP is a difficult task due to the complexity of the treatment processes. The complex physical, biological and chemical processes involved in wastewater treatment exhibit non-linear behaviors which are difficult to describe by linear mathematical models. Owing to their high accuracy, adequacy and quite promising applications in engineering, artificial neural networks (ANNs) can be used for modeling such WWTP processes. An ANN can be used for better prediction of the process performance; it normally relies on representative historical data of the process. In a wastewater treatment plant, there are certain key parameters which can be used to assess the plant performance. These parameters could include biological oxygen demand (BOD), suspended solids (SS) and chemical oxygen demand (COD). Most of the available literature on the application of ANNs for modeling WWTPs utilizes these parameters. For example, Oliveira-Esquerre et al. [3] obtained satisfactory predictions of the BOD in the output stream of a local biological wastewater treatment plant for the pulp and paper industry in Brazil. Principal component analysis was used to preprocess the data in the back-propagation neural network. The Kohonen Self-Organizing Feature Map (KSOFM) neural network was applied by Hong et al. to analyze the multidimensional process data and to diagnose the inter-relationships of the process variables in a real activated sludge process. The authors concluded that the KSOFM technique provides an effective analysis and diagnostic tool to understand the system behavior and to extract knowledge from multidimensional data of a large-scale WWTP. Hamed et al. [2, 4] developed ANN models to predict the performance of a WWTP based on past information. The authors found that the ANN-based models provide an efficient and robust tool for predicting WWTP performance.
However, a large learning burden, slow convergence, and entrapment in local minima are observed in such neural networks. To address these problems in the wastewater treatment models mentioned above, this study proposes a new wastewater treatment model, an NW multilayer feedforward small-world artificial neural network model, which integrates NW small-world networks and multi-layer feedforward neural networks, combines their respective strengths to offset their weaknesses, and considerably improves precision, speed, and anti-interference ability.

II. SYSTEM FLOW DESCRIPTION

A schematic diagram of the plant is shown in Fig. 1. The crude sewage (CS) from different pumping stations is collected and screened for floating debris, and removal of grit is carried out by the grit collector and grit elevators. Primary settlement tanks (PST) are utilized to settle 65-75% of the solids. Settled solids are scraped down into the hoppers of the PST with the help of mechanically driven scrapers. These settled solids are removed by the hydro valves, which open into the consolidation sludge tank. Aerobic bacteria are activated by aeration and mixing with activated sludge to reduce the volume of mixed liquor. Primary treated effluent is mixed with the returned activated sludge from the secondary settlement tank and uniformly distributed in channels for aeration with the help of mechanically driven aerators. Mixed liquor out of the aeration tank is made to settle in the secondary settlement tanks. In the post-treatment, the secondary treated effluent is pre-chlorinated and lifted by screw pumps for uniform distribution to sand filters. The resulting stream, designated as final effluent (FE), flows down into the wet well.
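As an illustrative sketch only (not the authors' code), the NW construction can be realized by starting from a fully connected layered feedforward topology and adding random shortcut edges between non-adjacent layers with a small probability p, without removing any existing edge (unlike Watts-Strogatz rewiring); all names here are hypothetical:

```python
import random

def nw_smallworld_edges(layer_sizes, p, seed=0):
    """Edges of a layered feedforward net plus NW-style shortcuts.

    layer_sizes: nodes per layer, e.g. [4, 8, 8, 1]
    p: probability of adding a shortcut between nodes in non-adjacent layers
    Returns a list of ((layer, index), (layer, index)) directed edges.
    """
    rng = random.Random(seed)
    edges = []
    # the regular "lattice": full connections between adjacent layers
    for l in range(len(layer_sizes) - 1):
        for i in range(layer_sizes[l]):
            for j in range(layer_sizes[l + 1]):
                edges.append(((l, i), (l + 1, j)))
    # NW construction: add (never rewire) forward shortcuts with probability p
    for l1 in range(len(layer_sizes)):
        for l2 in range(l1 + 2, len(layer_sizes)):
            for i in range(layer_sizes[l1]):
                for j in range(layer_sizes[l2]):
                    if rng.random() < p:
                        edges.append(((l1, i), (l2, j)))
    return edges
```

With p = 0 this reduces to an ordinary multi-layer feedforward topology; the observation below that performance is best when the number of distant-layer connections is very small corresponds to choosing a small p.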

B. Decision of Hidden Layer Structure

A multi-layer neural network is composed of input, hidden and output layers. In such a network, all nodes in each layer are fully connected with those in their adjacent layers. Selecting a rational hidden structure is the most important problem in model selection. The NW small-world artificial neural network will have better performance if

TABLE I. ALL VARIABLES AND THEIR CORRESPONDING MEANINGS USED IN THIS PAPER

Variable | Meaning
I    | the number of input nodes
O    | the number of output nodes
S    | the number of hidden layers
M_s  | the number of nodes of the s-th hidden layer
N    | the number of given samples
X_k  | the k-th input sample
d_k  | the k-th teacher output
Y_k  | the k-th real output

In Table I, s = 1, 2, ..., S and k = 1, 2, ..., N.
Figure 1. Schematic diagram of the wastewater treatment system

the number of connections between nodes in distant layers is very small. So, we can determine the number of neurons and hidden layers of the small-world artificial neural network using the methods of a general neural network. Table I lists all variables and their corresponding meanings used in this paper.

The NW small-world artificial neural network is a multilayered feedforward neural network, by far the most extensively used type. Back propagation works by approximating the non-linear relationship between the input and the output by adjusting the weight values internally; a supervised back-propagation learning algorithm is used to establish the neural network model. A normal NW small-world artificial neural network model consists of an input layer, one or more hidden layers, and an output layer. The input samples are X = [X_1, X_2, \ldots, X_k, \ldots, X_N], an input sample is X_k = [x_{k1}, x_{k2}, \ldots, x_{kI}], the actual output of the network is Y_k = [y_{k1}, y_{k2}, \ldots, y_{kO}], and the expected output is d_k = [d_{k1}, d_{k2}, \ldots, d_{kO}]. The neuron weights and thresholds of the network model are unknown; the number of unknown quantities of the first hidden layer is (I+1)M_1, and similarly (M_n+1)M_{n+1} is the number of unknown quantities between the n-th and (n+1)-th hidden layers. The total number of unknown quantities L is given by (1).
III. CONSTRUCTION OF THE EFFLUENT QUALITY PREDICTION MODEL

A. The Input and Output Variables of the Model

The biological oxygen demand (BOD) detected at position E may be affected by the variable values at A, B, C, D and E some hours earlier (see Fig. 1). There are certain key parameters which can be used to assess the plant performance of a wastewater treatment plant; these include BOD, suspended solids and chemical oxygen demand.

L = (I+1)M_1 + \sum_{n=1}^{S-1} (M_n+1)M_{n+1} + (M_S+1)J.    (1)

2012 ACADEMY PUBLISHER

2138

JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012

Neural networks are usually based on supervised learning. The mean square error function is selected as the objective function of the algorithm, which can be written as

E(Z) = \frac{1}{2K} \sum_{k=1}^{K} \sum_{j=1}^{J} \left( d_k^j - y_k^j(Z) \right)^2,    (2)

where y_k^j(Z) is the actual output of the network. Back propagation works by approximating the non-linear relationship between the input and the output by adjusting the weight values internally along the gradient direction. Ideally, there exists a Z such that E(Z) = 0; then

E(Z) = \frac{1}{2K} \sum_{k=1}^{K} \sum_{j=1}^{J} \left( d_k^j - y_k^j(Z) \right)^2 = 0,    (3)

or

d_1^1 - y_1^1(Z) = 0, \; d_1^2 - y_1^2(Z) = 0, \; \ldots, \; d_1^J - y_1^J(Z) = 0,
d_2^1 - y_2^1(Z) = 0, \; \ldots, \; d_2^J - y_2^J(Z) = 0, \; \ldots,
d_K^1 - y_K^1(Z) = 0, \; \ldots, \; d_K^J - y_K^J(Z) = 0.    (4)

From (4) we can see that the number of equations is K \cdot J and the number of unknowns is L. According to the theory of algebraic equations, solvability requires

L = K \cdot J.    (5)

According to equations (1) and (5), we get

(I+1)M_1 + \sum_{n=1}^{S-1} (M_n+1)M_{n+1} + (M_S+1)J = K \cdot J.    (6)

From (6) we can derive the formulas that determine the number of nodes per hidden layer for S = 1, 2, 3, as follows.

a) When S = 1, from (6) we get

(I+1)M_1 + (M_1+1)J = K \cdot J.    (7)

Simplifying,

M_1 = \mathrm{int}\left[ \frac{J(K-1)}{I+J+1} \right].    (8)

b) When S = 2, from (6) we get

(I+1)M_1 + (M_1+1)M_2 + (M_2+1)J = K \cdot J.    (9)

Then

M_1 = \mathrm{int}\left[ \frac{J(K-1) - (J+1)M_2}{M_2+I+1} \right],    (10)

or

M_2 = \mathrm{int}\left[ \frac{J(K-1) - (I+1)M_1}{M_1+J+1} \right].    (11)

c) When S = 3, from (6) we get

(I+1)M_1 + (M_1+1)M_2 + (M_2+1)M_3 + (M_3+1)J = K \cdot J.    (12)

If we fix M_1 and M_2, then

M_3 = \mathrm{int}\left[ \frac{J(K-1) - (I+1)M_1 - (M_1+1)M_2}{M_2+J+1} \right].    (13)

According to the Kolmogorov theorem [8], a continuous function f: [0,1]^I \to R^J can be realized by a three-layer feedforward neural network with I neurons in the input layer, 2I+1 neurons in the middle layer, and J neurons in the output layer. So we know that M_1 = 2I+1 when S = 1, and M_s < 2I+1 when S = 2, 3, \ldots, where M_s is the number of nodes of the s-th hidden layer. Thus the maximal number of nodes of any hidden layer is

M_s \leq 2I+1.    (14)

In fact, very few hidden layers are usually needed to solve the applications. With the maximum number of hidden layers given, the hidden structure can be determined. In order to obtain the optimal hidden structure, we compare several network constructions, as Table II shows.

TABLE II.
COMPARISON OF SEVERAL NETWORK CONSTRUCTIONS

first hidden layer | second hidden layer | third hidden layer | iterations | average error of the network
2 | 4 | 5 | 7241 | 0.9712
3 | 2 | 4 | 5091 | 0.6461
3 | 3 | 12 | 2711 | 0.7182
4 | 4 | - | 2594 | 0.4257
4 | 5 | - | 1260 | 0.6452
3 | 3 | - | 1785 | 0.5564
2 | 3 | 8 | 1173 | 0.1332

From Table II, we can see the optimal structure: it contains 5 neurons in the input layer, one neuron in the output layer, 2 neurons in the first hidden layer, 3 neurons in the second hidden layer and 8 neurons in the third hidden layer. This network structure has a high rate of convergence and a low level of error.
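The integer formulas (8), (11) and (13) above are straightforward to sketch in code. The following Python fragment is an illustration only (it is not part of the paper's experiments); the sample call uses the paper's setting of 5 inputs, 1 output and 80 training samples.

```python
# Sketch of the hidden-layer sizing formulas (8), (11) and (13).
# I: input nodes, J: output nodes, K: number of training samples.
def m1_one_hidden(I, J, K):
    # Eq. (8): single hidden layer
    return (J * (K - 1)) // (I + J + 1)

def m2_two_hidden(I, J, K, M1):
    # Eq. (11): second hidden layer, with M1 fixed
    return (J * (K - 1) - (I + 1) * M1) // (M1 + J + 1)

def m3_three_hidden(I, J, K, M1, M2):
    # Eq. (13): third hidden layer, with M1 and M2 fixed
    return (J * (K - 1) - (I + 1) * M1 - (M1 + 1) * M2) // (M2 + J + 1)

# Paper's setting: I = 5 inputs, J = 1 output, K = 80 training samples.
print(m3_three_hidden(I=5, J=1, K=80, M1=2, M2=3))  # -> 11
```

Note that with these values formula (13) yields M_3 = 11, while Table II settles on 8 neurons empirically; the formulas bound the search rather than fix the final structure.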



IV. GENERATION AND ALGORITHM OF NW MULTILAYER FEEDFORWARD SMALL-WORLD ARTIFICIAL NEURAL NETWORKS

A. Model Generation Process

Based on the topology of the NW small-world networks, the model of NW multilayer feedforward small-world artificial neural networks is proposed. The model generation process is as follows. (1) Initially, neurons are connected feedforward, i.e. each neuron of a given layer connects to all neurons of the subsequent layer. (2) We make a random draw of two nodes which are connected to each other; we do not cut that old link. In order to create a new link, we make another random draw of two nodes. If those nodes are already connected to each other, we make further draws until we find two unconnected nodes; then we create a new link between those nodes. (3) In this way we create connections between nodes in distant layers, i.e. short-cuts, and the topology changes gradually (see Fig. 2).
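Step (2) of the generation process can be sketched as follows. This is a minimal, illustrative Python fragment, not the authors' implementation; the function name is assumed, and layer bookkeeping is omitted (so a short-cut may occasionally land inside one layer, whereas the paper emphasizes links between distant layers).

```python
import random

def add_shortcuts(edges, nodes, n_new, rng=None):
    """Repeatedly draw two currently unconnected nodes and link them.
    Existing feedforward links are never removed (no rewiring cuts an edge)."""
    rng = rng or random.Random(0)
    edges = set(edges)
    added = 0
    while added < n_new:
        a, b = rng.sample(nodes, 2)
        if (a, b) not in edges and (b, a) not in edges:
            edges.add((a, b))
            added += 1
    return edges

# Toy example: three layers of two neurons, feedforward links only at first.
layers = [[0, 1], [2, 3], [4, 5]]
ff = {(i, j) for l in range(2) for i in layers[l] for j in layers[l + 1]}
nodes = [n for layer in layers for n in layer]
grown = add_shortcuts(ff, nodes, n_new=2)
print(len(grown) - len(ff))  # 2 short-cuts were added
```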

Figure 2. The model of the small-world artificial neural network for prediction of wastewater quality: (a) p = 0; (b) 0 < p < 1; (c) p = 1

B. The Flow of the Algorithm

The NW small-world artificial neural network training process is as follows:

Step 1: Design the structure of the neural network and input the network parameters.

Step 2: Obtain the initial weights W and initial threshold values by randomization.

Step 3: Input the training data matrix X and the teacher output matrix d_k.

Step 4: Compute the output vector of each neural unit.
(a) Compute the input and output vectors of the first hidden layer:

u_{m_1} = \sum_{i=1}^{I} w_{i m_1} x_{ki},    (15)

v_{m_1} = f\left( \sum_{i=1}^{I} w_{i m_1} x_{ki} \right), \quad m_1 = 1, 2, \ldots, M_1.    (16)

Compute the input and output vectors of the s-th hidden layer:

u_{m_s} = \sum_{i=1}^{I} w_{i m_s} x_{ki} + \sum_{t=1}^{s-1} \sum_{m_t=1}^{M_t} w_{m_t m_s} v_{m_t},    (17)

v_{m_s} = f(u_{m_s}), \quad m_s = 1, 2, \ldots, M_s.    (18)

(b) Compute the output vector of the output layer:

u_o = \sum_{i=1}^{I} w_{io} x_{ki} + \sum_{t=1}^{S} \sum_{m_t=1}^{M_t} w_{m_t o} v_{m_t},    (19)

y_{ko} = f(u_o), \quad o = 1, 2, \ldots, O.    (20)

(c) Compute the error signal of the output-layer neurons:

E = \frac{1}{2} \sum_{o=1}^{O} \left[ d_{ko}(n) - y_{ko}(n) \right]^2.    (21)
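The forward computation of Step 4, eqs. (15)-(20), can be sketched in a few lines. This is an illustrative Python fragment under assumed names and weight layout (it is not the authors' code): every layer input sums the weighted outputs of all earlier layers for which a weight matrix exists, so a short-cut is simply an extra (source, target) entry with target > source + 1.

```python
import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def forward(x, weights):
    """weights[(s, t)][j][i] connects unit i of layer s to unit j of layer t;
    layer 0 holds the raw inputs, the largest t is the output layer."""
    outputs = {0: list(x)}
    last = max(t for (_, t) in weights)
    for t in range(1, last + 1):
        u = None  # accumulated pre-activation of layer t
        for (s, tt), W in weights.items():
            if tt != t:
                continue
            contrib = [sum(row[i] * outputs[s][i] for i in range(len(outputs[s])))
                       for row in W]
            u = contrib if u is None else [a + b for a, b in zip(u, contrib)]
        outputs[t] = [sigmoid(v) for v in u]
    return outputs[last]

# 2 inputs -> 2 hidden -> 1 output, plus one input-to-output short-cut.
W = {(0, 1): [[0.5, -0.2], [0.1, 0.4]],
     (1, 2): [[0.3, 0.7]],
     (0, 2): [[0.2, -0.1]]}  # the short-cut link
print(forward([1.0, 2.0], W))
```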

Step 5: Compute the local gradients.
(a) Compute the local gradient of the output layer:

\delta_o(n) = y_{ko}(n)\left[1 - y_{ko}(n)\right]\left[d_{ko}(n) - y_{ko}(n)\right], \quad o = 1, 2, \ldots, O.    (22)

(b) Compute the local gradient of the last (S-th) hidden layer:

\delta_{m_S}(n) = v_{m_S}(n)\left[1 - v_{m_S}(n)\right] \sum_{o=1}^{O} \delta_o(n) w_{m_S o}(n), \quad m_S = 1, 2, \ldots, M_S.    (23)

(c) Compute the local gradient of the s-th hidden layer (s < S), including the short-cut connections to the output layer:

\delta_{m_s}(n) = v_{m_s}(n)\left[1 - v_{m_s}(n)\right] \left[ \sum_{t=s+1}^{S} \sum_{m_t=1}^{M_t} \delta_{m_t}(n) w_{m_s m_t}(n) + \sum_{o=1}^{O} \delta_o(n) w_{m_s o}(n) \right], \quad m_s = 1, 2, \ldots, M_s.    (24)-(25)

Step 6: Renew the weights W and thresholds b:

w_{ji}^{s}(n) = w_{ji}^{s}(n-1) + \eta \, \delta_j^{s} \, y_i^{s-1}(n-1),    (26)

b_{ji}^{s}(n) = b_{ji}^{s}(n-1) + \eta \, \delta_j^{s} \, y_i^{s-1}(n-1),    (27)

where \eta is the learning rate.

Step 7: Repeat Step 3 to Step 6 until convergence.

C. Prediction Steps of the Model

The prediction process consists of nine steps:
a) Initialization: set the number of hidden layers and the number of neurons in each hidden layer.
b) Normalize the input data.
c) Carry out one round of training for every sample.
d) Determine whether the cycle has finished; if not, return to step c).
e) Compute the total error E to check whether the results meet the accuracy requirement: if E < \varepsilon, go on to step f); otherwise go to step c).
f) Check whether the iterations exceed the largest number of iterations: if yes, go on to step g); otherwise go to step c).
g) Record the parameters of the trained network and save them for the prediction of wastewater treatment.
h) Compute the predicted value of the sewage effluent quality.
i) Denormalize to obtain the real predicted data.

V. SIMULATIONS

Having established the prediction model of the NW multilayer feedforward small-world artificial neural network, a comparative test was made between the regular multilayer feedforward network and the NW small-world artificial neural network (p = 0.1) with respect to convergence speed, precision and stability, using MATLAB 7.0 as the simulation tool. In the test, we set up a five-layer neural network with one input layer, three hidden layers and one output layer. Following Table II, the first, second and third hidden layers have 2, 3 and 8 neurons, respectively. We set the rewiring probability p = 0.1 and ran 100 independent trials of up to 10000 iterations each. During the realization of the algorithm, we let the inertia coefficient \alpha = 0.9 and the learning rate \eta = 0.05. There are 5 inputs and one expected output, and the error goal is \varepsilon < 0.01.

Figure 3. Training curve of the regular multilayer feedforward network (performance 0.00996348, goal 0.01, 1173 epochs)

Figure 4. Training curve of the NW small-world artificial neural network (performance 0.00993065, goal 0.01, 381 epochs)

Fig. 3 and Fig. 4 show the training curves of the regular and the small-world artificial neural networks, respectively. 124 valid data points were collected and divided into two parts: a training sample (80 points) and a predictive sample (the remaining 44 points).
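Steps b) and i) of the prediction procedure (normalization and denormalization) together with the 80/44 split can be sketched as follows. This is an illustration with synthetic values, not the paper's data; all names are assumed.

```python
def minmax_fit(col):
    # learn the scaling range from a column of values
    return min(col), max(col)

def normalize(col, lo, hi):
    return [(v - lo) / (hi - lo) for v in col]

def denormalize(col, lo, hi):
    return [v * (hi - lo) + lo for v in col]

# 124 synthetic BOD-like values, split 80 / 44 as in the paper.
data = [20.0 + 0.1 * i for i in range(124)]
train, test = data[:80], data[80:]
lo, hi = minmax_fit(train)            # scale fitted on training data only
train_n = normalize(train, lo, hi)
restored = denormalize(train_n, lo, hi)
print(max(abs(a - b) for a, b in zip(train, restored)))
```

Fitting the scaling range on the training portion only avoids leaking information from the predictive sample into the model.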


From Fig. 3 and Fig. 4, we find that the small-world network with p = 0.1 gives faster convergence compared to the regular network.

Figure 5. Training result for effluent quality BOD5 of the regular multilayer feedforward neural network

Figure 6. Training result for effluent quality BOD5 of the multilayer feedforward small-world neural network

Fig. 5 and Fig. 6 show the training results for effluent quality BOD5 of the two networks above.

Figure 7. Error curve of the regular multilayer feedforward neural network

Figure 8. Error curve of the multilayer feedforward small-world artificial neural network

From Fig. 7 and Fig. 8, we can compare the error curves of the two networks above.

Figure 9. Predicting results of the regular multilayer feedforward neural network

Figure 10. Predicting results of the multilayer feedforward small-world artificial neural network

Fig. 9 and Fig. 10 show the predicting results for effluent quality BOD5 of the two networks above.



From the comparison of the two networks above, this paper sums up their differences in convergence rate, accuracy and robustness. Compared with the forecast results of the regular network, the convergence of the multilayer feedforward small-world artificial neural network is faster and its precision higher. Actual BOD values in the training and testing data are compared with the values predicted by the BP neural network model and by the NW multilayer feedforward small-world artificial neural network to evaluate the models' performance. Visual inspection indicates that the NW multilayer feedforward small-world artificial neural network model results in a good fit to the measured BOD data.

The limitation in the data, however, should be highlighted: if more data were collected, and if the data were less noisy, the predictive capability of the network would improve.

ACKNOWLEDGMENT

This work was supported by the National Natural Science Foundation of China (grant number: 61040012).

REFERENCES
[1] Y.-S. T. Hong, M. R. Rosen, R. Bhamidimarri, "Analysis of a municipal wastewater treatment plant using a neural network-based pattern analysis," Water Research, 37, 1608-1618, 2003.
[2] M. Hamed, M. G. Khalafallah, E. A. Hassanein, "Prediction of wastewater treatment plant performance using artificial neural networks," Environmental Modeling and Software, 19, 919-928, 2004.
[3] K. P. Oliveira-Esquerre, M. Mori, R. E. Bruns, "Simulation of an industrial wastewater treatment plant using artificial neural networks and principal components analysis," Brazilian Journal of Chemical Engineering, 19, 365-370, 2002.
[4] Farouq S. Mjalli, S. Al-Asheh, H. E. Alfadala, "Use of artificial neural network black-box modeling for the prediction of wastewater treatment plants performance," Journal of Environmental Management, 83, 329-338, 2007.
[5] Y. S. Xia, J. Wang, "A general methodology for designing globally convergent optimization neural networks," IEEE Transactions on Neural Networks, 9(6): 1331-1343, 1998.
[6] D. J. Watts, S. H. Strogatz, "Collective dynamics of 'small-world' networks," Nature, 393: 440-442, 1998.
[7] Li Xiaohu, Du Haifeng, Zhang Jinhua, "Multilayer feedforward small-world neural networks and its function approximation," Control Theory & Applications, 27(7): 836-842, 2010.
[8] Yang Ming, Xue Huifeng, "Analysis of G Knowledge Communication Network Based on Complex Network," Computer Simulation, 26(11): 122-123, 2009.
[9] Pengsheng Zheng, Wansheng Tang, Jianxiong Zhang, "A simple method for designing efficient small-world neural networks," Neural Networks, 23, 155-159, 2010.
[10] Wang Xiaofan, Li Xiang, Chen Guanrong, Complex Network Theory and Its Applications, Beijing: Tsinghua University Press, 2006.
[11] Jian Wang, Wei Wu, Jacek M. Zurada, "Deterministic convergence of conjugate gradient method for feedforward neural networks," Neurocomputing, 74(14): 2368-2376,

TABLE III.
THE MSE COMPARISON OF THE REGULAR NETWORK ALGORITHM AND THE SMALL-WORLD NEURAL NETWORK ALGORITHM

Network | MSE
regular multilayer feedforward network | 0.1332
multilayer feedforward small-world neural network | 0.0010
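The MSE figures in Table III follow the usual mean-squared-error definition, which can be stated in one line of Python (synthetic example values, not the paper's data):

```python
def mse(d, y):
    # mean squared error between teacher outputs d and network outputs y
    return sum((a - b) ** 2 for a, b in zip(d, y)) / len(d)

print(mse([1.0, 2.0, 3.0], [1.0, 2.5, 2.0]))  # (0 + 0.25 + 1.0) / 3 ~ 0.4167
```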

As Table III shows, in terms of MSE the multilayer feedforward small-world artificial neural network is superior to the regular network in both training accuracy and prediction accuracy. The prediction results indicate that the multilayer feedforward small-world artificial neural network has a good predictive effect for water quality and high adaptability. The small-world artificial neural network model runs quickly and is useful for coping with fluctuations that may happen at any time. Thus, an effective method for water quality prediction is provided.

VI. CONCLUSIONS

In this paper, a model based on the NW multilayer feedforward small-world artificial neural network is developed to predict the effluent BOD concentrations of a WWTP. The model is shown to fit the data precisely and to overcome several disadvantages of conventional BP neural networks, namely slow convergence, low accuracy and difficulty in finding the global optimum. A series of tests has been conducted on the samples. It has been shown that the NW multilayer feedforward small-world artificial neural network model provides good estimates for the BOD data sets. After the network is trained, it becomes simple, fast, and precise, with strong self-adaptability and anti-interference ability in disposing of data while predicting the wastewater treatment plant performance.

2011.
[12] Wilfredo J. Puma-Villanueva, Eurípedes P. dos Santos, Fernando J. Von Zuben, "A constructive algorithm to synthesize arbitrarily connected feedforward neural networks," Neurocomputing, in press, 10 August 2011.
[13] Syed Shabbir Haider, Xiao-Jun Zeng, "Simplified neural networks algorithm for function approximation on discrete input spaces in high dimension-limited sample applications," Neurocomputing, 72(4-6): 1078-1083, January 2009.
[14] Naimin Zhang, "An online gradient method with momentum for two-layer feedforward neural networks," Applied Mathematics and Computation, 212(2): 488-498, 2009.



[15] M. E. J. Newman, D. J. Watts, "Renormalization group analysis of the small-world network model," Physics Letters A, 263(4): 341-346, 1999.
[16] D. Simard, L. Nadeau, H. Kroger, "Faster learning in small-world neural networks," Physics Letters A, 336(1): 8-15, 2005.
[17] Chun Guang, "Memorizing morph patterns in small-world neuronal network," Statistical Mechanics and its Applications, 388(2-3): 240-246, 15 January 2009.
[18] Lei Chen, Guang-Bin Huang, Hung Keng Pung, "Systemical convergence rate analysis of convex incremental feedforward neural networks," Neurocomputing, 72: 2627-2635, 2009.

Ruicheng Zhang was born in March 1975. He is an associate professor and master's supervisor in control science and engineering, and holds a Ph.D. in control theory and control engineering from the University of Science and Technology Beijing. He is mainly engaged in research on rolling automation, intelligent production control and robust control. He was awarded the Hebei University of Technology Excellent Teacher third prize, as well as the honorary titles of outstanding graduate design instructor, excellent teacher and excellent script. In recent years, the students he guided in the "Freescale" Cup National University Smart Car Race won first and second prizes. He has published 30 papers in leading journals such as Journal of Vibration and Control, Control Theory and Application, Transactions of China Electrotechnical Society and Journal of University of Science and Technology Beijing, which have been indexed 11 times by SCI and EI and cited 50 times by others. Dr. Zhang has co-authored one book in Chinese. He currently chairs the National Natural Science Foundation project "Small-world artificial neural network model", a Ministry of Science and Technology service project on the JDC150 electronic digital multi-function tape measure, a Natural Science Foundation of Hebei Province project on ADRC control theory and its application to vibration control in rolling mill drive systems, and other projects.

Xulei Hu was born in December 1985. He is a master's degree student in control theory and control engineering at Hebei United University, mainly engaged in modeling and simulation of complex industrial control systems.



Nonlinear Evolution Equations for Second-order Spectral Problem


Wei Liu
Department of Mathematics and Physics, ShiJiaZhuang TieDao University, ShiJiaZhuang, China Email: Lwei_1981@126.com

Shujuan Yuan
Qinggong College, Hebei United University, TangShan, China Email: yuanshujuan1980@163.com

Shuhong Wang
College of Mathematics, Inner Mongolia University for Nationalities, TongLiao, China Email: shuhong7682@163.com

Abstract—Soliton equations are infinite-dimensional integrable systems described by nonlinear evolution equations. As one of the soliton equations, the long-wave equation has profound theoretical and practical significance. By using the method of nonlinearization, the relation between the long-wave equation and a second-order eigenvalue problem is generated. Based on the nonlinearized Lax pairs, the Euler-Lagrange function and Legendre transformations, a reasonable Jacobi-Ostrogradsky coordinate system is obtained. Moreover, by means of the Bargmann constraint between the potential functions and the eigenfunctions, the Lax pairs are equivalent to a matrix spectral problem. Furthermore, the involutive representations of the solutions of the long-wave equation are generated.

Index Terms—spectral problem, Hamilton canonical system, Bargmann constraint, integrable system, involutive solution

expansion, the Bäcklund transformation, the algebraic method and so on [1-7]. Using the inverse scattering method, one can obtain the N-soliton solutions of the KdV equation (see Fig. 1 and Fig. 2).

I. INTRODUCTION

In 1895, Korteweg and de Vries [1-2] derived a nonlinear evolution equation as follows:

\eta_t = \frac{3}{2} \sqrt{\frac{g}{h}} \left( \frac{1}{2}\eta^2 + \frac{2}{3}\alpha\eta + \frac{1}{3}\sigma\eta_{xx} \right)_x, \qquad \sigma = \frac{h^3}{3} - \frac{Th}{\rho g}.

By making the transformation

t' = \frac{1}{2}\sqrt{\frac{g}{h\sigma}}\, t, \qquad x' = -\sigma^{-1/2} x, \qquad u = \frac{1}{2}\eta + \frac{1}{3}\alpha,

the famous KdV equation u_t + 6uu_x + u_{xxx} = 0 is obtained. It aroused an increasing interest among researchers in mathematics and physics, and more and more scientists became interested in searching for various methods to obtain solutions of such partial differential equations. Many effective methods have been proposed, for example, Hirota's method, the inverse scattering transform, the Darboux transformation, the Painlevé
2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2144-2151

Figure 1. One-soliton solution of the KdV equation
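The one-soliton solution sketched in Fig. 1 is, in the normalization above, u = (c/2) sech^2(\sqrt{c}(x - ct)/2). This classical fact (not a result of the present paper) can be verified symbolically with sympy; the sample evaluation point is arbitrary:

```python
import sympy as sp

x, t, c = sp.symbols('x t c', positive=True)
# Classical one-soliton solution of u_t + 6*u*u_x + u_xxx = 0
u = c / 2 * sp.sech(sp.sqrt(c) / 2 * (x - c * t)) ** 2
residual = sp.diff(u, t) + 6 * u * sp.diff(u, x) + sp.diff(u, x, 3)
# The residual vanishes identically; check it numerically at a sample point.
print(abs(float(residual.subs({x: 0.4, t: 0.3, c: 1.5}).evalf())))
```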

In this paper, by nonlinearization [8-15] of spectral problems, we consider the spectral problem

L\varphi = (\partial^2 + q\partial + p)\varphi = \lambda\varphi_x.

The paper is structured as follows. In Sect. 2, the adjoint Lax pairs of the spectral problem are generated. In Sect. 3, based on the Euler-Lagrange equations and Legendre transformations, a suitable Jacobi-Ostrogradsky coordinate system is found. Sections 4 and 5 are devoted to establishing the Liouville integrability of the resulting Hamiltonian systems from the second-order spectral problem.


J = \begin{pmatrix} 0 & \partial \\ \partial & 0 \end{pmatrix},    (6)

K = \begin{pmatrix} 2\partial & \partial^2 + \partial q \\ -\partial^2 + q\partial & \partial p + p\partial \end{pmatrix}.    (7)

TABLE I.
THE LENARD SEQUENCE

G_j | a_j | b_j
j = -1 | 0 | 1
j = 0 | p | q
j = 1 | p_x + 2qp | 2p + q_x + q^2
\vdots | \vdots | \vdots

Figure 2. Two-soliton solution of the KdV equation

II. LAX PAIRS AND EVOLUTION EQUATIONS

Let us take the second-order problem

L\varphi = (\partial^2 + q\partial + p)\varphi = \lambda\varphi_x,    (1)

where q = q(x,t) \in R and p = p(x,t) \in R, q, p \not\equiv const, are potential functions, \lambda is a complex eigenparameter, \partial = \partial/\partial x, and \partial\partial^{-1} = \partial^{-1}\partial = 1. Suppose \Omega is the basic interval of (1). For the sake of simplicity, we assume that if the potentials q, p and all their derivatives with respect to x tend to zero, then \Omega = (-\infty, +\infty); if they are all periodic functions with period 2T, then \Omega = [0, 2T].

Definition 2.1 Assume that our linear space is equipped with the L^2 scalar product (\cdot, \cdot)_{L^2(\Omega)}:

(\varphi, h)_{L^2(\Omega)} = \int_\Omega \varphi h^* \, dx < \infty,

Now, we consider the auxiliary spectral problem

\varphi_{t_m} = \Omega_m \varphi, \quad m = 0, 1, 2, \ldots,    (8)

with

\Omega_m = \sum_{j=0}^{m} \left[ a_{j-1} - b_{j-1,x} + b_{j-1}\partial \right] \lambda^{m-j}.    (9)

Then the isospectral (i.e. \lambda_{t_m} = 0) compatibility condition

L_{t_m} = \Omega_{m,x} + [\Omega_m, L] = \Omega_{m,x} + \Omega_m L - L\Omega_m    (10)

of the Lax pairs

L\varphi = \lambda\varphi_x, \qquad \varphi_{t_m} = \Omega_m \varphi, \quad m = 0, 1, 2, \ldots    (11)

determines an (m+1)-order long-wave equation

where the symbol * denotes the complex conjugate.

Definition 2.2 The operator A^* is the adjoint operator of A if (A^*\varphi, h)_{L^2(\Omega)} = (\varphi, Ah)_{L^2(\Omega)}.

Using Definition 2.2, we get

L^*\psi = (\partial^2 - q\partial + p)\psi = -\lambda\psi_x.    (2)

In order to derive the evolution equations related to the spectral problem (1), we consider the stationary zero curvature equation

\Lambda_x + [\Lambda, L] = \Lambda_x + \Lambda L - L\Lambda = 0.    (3)

Take

\Lambda = \sum_{j=0}^{\infty} \left[ a_{j-1} - b_{j-1,x} + b_{j-1}\partial \right] \lambda^{-j}    (4)

and set

G_{-1} = \begin{pmatrix} a_{-1} \\ b_{-1} \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix};

therefore, we obtain the recursive relation

K G_{j-1} = J G_j, \quad j = 0, 1, 2, \ldots,    (5)

where J and K are given in (6) and (7). The corresponding hierarchy of evolution equations is

\begin{pmatrix} q \\ p \end{pmatrix}_{t_m} = J G_m = K G_{m-1}, \quad m = 0, 1, 2, \ldots    (12)

For example, when m = 1 and m = 2 we get the first and the second nonlinear systems; the results are shown in Table II. When m = 1, it is the long-wave equation. When m = 2 and q \equiv 0, it is exactly the famous KdV equation.

TABLE II.
THE FIRST AND THE SECOND NONLINEAR SYSTEMS

Evolution equation | m = 1 | m = 2 and q \equiv 0
q_{t_m} | q_{t_1} = 2p_x + q_{xx} + 2qq_x | 0
p_{t_m} | p_{t_1} = p_{xx} + 2(qp)_x | p_{t_2} = p_{xxx} + 3(p^2)_x

In order to give the constraints between the potentials and the eigenfunctions, it is necessary to calculate the functional gradient with respect to the potential functions.

Proposition 2.3: [11] i) If \lambda is an eigenvalue of (1), then \lambda is real.


ii) If \varphi is an eigenfunction of (1) and \psi is an eigenfunction of (2), then \varphi and \psi can be taken to be real functions.
iii) If \varphi is an eigenfunction corresponding to the eigenvalue \lambda of (1) and \psi is an eigenfunction corresponding to the eigenvalue \lambda of (2), then

\nabla\lambda = \begin{pmatrix} \nabla_q\lambda \\ \nabla_p\lambda \end{pmatrix} = \left( \int_\Omega \psi\varphi_x \, dx \right)^{-1} \begin{pmatrix} \psi\varphi_x \\ \psi\varphi \end{pmatrix}

and

K\nabla\lambda = \lambda J\nabla\lambda.    (13)

III. BARGMANN SYSTEMS AND THE HAMILTON CANONICAL FORMS

We suppose \lambda_1 < \lambda_2 < \cdots < \lambda_N are the eigenvalues of the eigenvalue problems (1) and (2), and \varphi_j, \psi_j are the eigenfunctions for \lambda_j (j = 1, 2, \ldots, N). Let

\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_N), \quad \Phi = (\varphi_1, \varphi_2, \ldots, \varphi_N)^T, \quad \Psi = (\psi_1, \psi_2, \ldots, \psi_N)^T.

Now, we consider the Bargmann constraint [16-18]

q = \langle \Psi, \Phi \rangle, \qquad p = -\langle \Psi, \Phi_x \rangle,    (14)

here

\langle \Psi, \Phi \rangle = \sum_{j=1}^{N} \psi_j \varphi_j;

namely,

G_0 = \begin{pmatrix} -\langle \Psi, \Phi_x \rangle \\ \langle \Psi, \Phi \rangle \end{pmatrix} = \begin{pmatrix} p \\ q \end{pmatrix}, \qquad G_j = \begin{pmatrix} -\langle \Lambda^j \Psi, \Phi_x \rangle \\ \langle \Lambda^j \Psi, \Phi \rangle \end{pmatrix}, \quad j = 0, 1, 2, \ldots,    (15)

so the eigenvalue problems (1) and (2) are equivalent to the systems

\Phi_{xx} + \langle \Psi, \Phi \rangle \Phi_x - \langle \Psi, \Phi_x \rangle \Phi = \Lambda \Phi_x,
\Psi_{xx} - \langle \Psi, \Phi \rangle \Psi_x - \langle \Psi, \Phi_x \rangle \Psi = -\Lambda \Psi_x,    (16)

and (16) are called the Bargmann systems of the eigenvalue problems (1) and (2). Let

\bar{I} = \int_\Omega I \, dx.    (17)

Proof of Proposition 2.3: In fact, from (1) and (2), we have

\lambda \int_\Omega \psi^* \varphi_x \, dx = \int_\Omega (L\varphi) \psi^* \, dx = \int_\Omega \varphi (L^* \psi)^* \, dx = \lambda^* \int_\Omega \psi^* \varphi_x \, dx,

so \lambda = \lambda^*, i.e. \lambda is real. If \varphi is a complex eigenfunction of (1) on \Omega and \varphi = a + ib, where a, b are real functions, then from L\varphi = \lambda\varphi_x and \lambda real we get La = \lambda a_x and Lb = \lambda b_x, so a and b are eigenfunctions of (1) on \Omega, and \varphi can be taken to be a real function. Similarly, \psi can be taken to be real. For iii), let

\delta h = \frac{d}{d\varepsilon} h(\lambda + \varepsilon\delta\lambda, \, q + \varepsilon\delta q, \, p + \varepsilon\delta p) \Big|_{\varepsilon = 0}.

From L\varphi = \lambda\varphi_x, \delta(L\varphi) = \delta(\lambda\varphi_x), that is,

(\delta L)\varphi + L\,\delta\varphi = \delta\lambda\,\varphi_x + \lambda\,\delta\varphi_x.

Multiplying by \psi, integrating over \Omega and using

\int_\Omega (L\,\delta\varphi)\psi \, dx = \int_\Omega \delta\varphi\,(L^*\psi) \, dx = -\lambda \int_\Omega \delta\varphi\,\psi_x \, dx = \lambda \int_\Omega \psi\,\delta\varphi_x \, dx,

we get

\delta\lambda \int_\Omega \psi\varphi_x \, dx = \int_\Omega ((\delta L)\varphi)\psi \, dx = \int_\Omega (\delta q\,\varphi_x + \delta p\,\varphi)\psi \, dx,

hence

\nabla_q\lambda = \left( \int_\Omega \psi\varphi_x \, dx \right)^{-1} \psi\varphi_x, \qquad \nabla_p\lambda = \left( \int_\Omega \psi\varphi_x \, dx \right)^{-1} \psi\varphi,

and by (6) and (7), (13) holds.
where the Lagrange function I is defined as follows:

I = \langle \Phi_x, \Psi_x \rangle + \frac{1}{2} \langle \Psi, \Phi \rangle \langle \Psi, \Phi_x \rangle + \frac{1}{2} \langle \Lambda\Phi_x, \Psi \rangle - \frac{1}{2} \langle \Lambda\Phi, \Psi_x \rangle - \frac{1}{2} \langle \Psi, \Phi \rangle \langle \Psi_x, \Phi \rangle.

Proposition 3.1: The Bargmann systems (16) of the eigenvalue problems (1) and (2) are equivalent to the Euler-Lagrange equation systems

\frac{\delta \bar{I}}{\delta \Phi} = 0, \qquad \frac{\delta \bar{I}}{\delta \Psi} = 0.    (18)

Proof: By (17), we have

\frac{\delta \bar{I}}{\delta \psi_j} = \varphi_{jxx} + \langle \Psi, \Phi \rangle \varphi_{jx} - \langle \Psi, \Phi_x \rangle \varphi_j - \lambda_j \varphi_{jx} = 0,

which is exactly the first equation of (16).



Similarly,

\frac{\delta \bar{I}}{\delta \varphi_j} = \psi_{jxx} - \langle \Psi, \Phi \rangle \psi_{jx} - \langle \Psi, \Phi_x \rangle \psi_j + \lambda_j \psi_{jx} = 0,

which is exactly the second equation of (16).

Now, the Poisson bracket [19] of real-valued functions F and H in the symplectic [20] space \left( \omega = \sum_{j=1}^{2} \sum_{k=1}^{N} dz_{jk} \wedge dy_{jk}, \; R^{4N} \right) is defined as follows:

\{F, H\} = \sum_{j=1}^{2} \sum_{k=1}^{N} \left( \frac{\partial F}{\partial y_{jk}} \frac{\partial H}{\partial z_{jk}} - \frac{\partial F}{\partial z_{jk}} \frac{\partial H}{\partial y_{jk}} \right) = \sum_{j=1}^{2} \left( \langle F_{y_j}, H_{z_j} \rangle - \langle F_{z_j}, H_{y_j} \rangle \right).
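The canonical bracket just defined is easy to experiment with symbolically. The following sympy sketch (one coordinate block with N = 2, all names illustrative and not from the paper) checks antisymmetry and the canonical relation {y_j, z_k} = \delta_{jk}:

```python
import sympy as sp

N = 2
ys = sp.symbols('y1:%d' % (N + 1))
zs = sp.symbols('z1:%d' % (N + 1))

def pbracket(F, H):
    # {F, H} = sum_j (dF/dy_j * dH/dz_j - dF/dz_j * dH/dy_j)
    return sum(sp.diff(F, ys[j]) * sp.diff(H, zs[j])
               - sp.diff(F, zs[j]) * sp.diff(H, ys[j]) for j in range(N))

F = ys[0] * zs[1] ** 2
H = ys[1] * zs[0]
print(sp.simplify(pbracket(F, H) + pbracket(H, F)))  # antisymmetry -> 0
```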

Based on the Euler-Lagrange equations (18), the Jacobi-Ostrogradsky coordinates can be found, and the Bargmann systems (16) can be written as Hamilton canonical equation systems [21]. Let

u_1 = \Phi, \quad u_2 = \Psi, \quad \bar{g} = \sum_{j=1}^{2} \langle u_{jx}, v_j \rangle - I.

Our aim is to find coordinates \{v_1, v_2\} and \bar{g} that satisfy the following Hamilton canonical equations:

u_{jx} = \{u_j, \bar{g}\} = \frac{\partial \bar{g}}{\partial v_j}, \qquad v_{jx} = \{v_j, \bar{g}\} = -\frac{\partial \bar{g}}{\partial u_j}, \quad j = 1, 2.

By direct computation, we have

v_1 = \Psi_x - \frac{1}{2} q\Psi + \frac{1}{2}\Lambda\Psi, \qquad v_2 = \Phi_x + \frac{1}{2} q\Phi - \frac{1}{2}\Lambda\Phi.

So, if we take the Jacobi-Ostrogradsky coordinates as

u_1 = \Phi, \quad u_2 = \Psi, \quad v_1 = \Psi_x - \frac{1}{2} q\Psi + \frac{1}{2}\Lambda\Psi, \quad v_2 = \Phi_x + \frac{1}{2} q\Phi - \frac{1}{2}\Lambda\Phi,    (19)

then the Bargmann systems (16) are equivalent to the Hamilton canonical systems

u_{jx} = \frac{\partial \bar{g}}{\partial v_j}, \qquad v_{jx} = -\frac{\partial \bar{g}}{\partial u_j}, \quad j = 1, 2,

where

\bar{g} = \langle v_1, v_2 \rangle + \frac{1}{2}\langle u_1, u_2 \rangle\langle u_2, v_2 \rangle - \frac{1}{2}\langle \Lambda u_2, v_2 \rangle - \frac{1}{2}\langle u_1, u_2 \rangle\langle u_1, v_1 \rangle - \frac{1}{4}\langle u_1, u_2 \rangle^3 + \frac{1}{2}\langle \Lambda u_1, v_1 \rangle + \frac{1}{2}\langle u_1, u_2 \rangle\langle \Lambda u_1, u_2 \rangle - \frac{1}{4}\langle \Lambda^2 u_1, u_2 \rangle.

By (19), the Jacobi-Ostrogradsky coordinates can be written in the following form:

y_1 = \Phi, \quad y_2 = \Phi_x + \frac{1}{2} q\Phi - \frac{1}{2}\Lambda\Phi, \quad z_1 = \Psi_x - \frac{1}{2} q\Psi + \frac{1}{2}\Lambda\Psi, \quad z_2 = \Psi.    (20)

Then we have:

Theorem 3.2: The Bargmann systems (16) of the eigenvalue problems (1) and (2) are equivalent to the following Hamilton canonical systems:

y_{jx} = \frac{\partial h}{\partial z_j}, \qquad z_{jx} = -\frac{\partial h}{\partial y_j}, \quad j = 1, 2,

where

h = \langle y_2, z_1 \rangle - \frac{1}{2}\langle y_1, z_2 \rangle\langle y_2, z_2 \rangle + \frac{1}{2}\langle \Lambda y_2, z_2 \rangle - \frac{1}{2}\langle y_1, z_2 \rangle\langle y_1, z_1 \rangle + \frac{1}{4}\langle y_1, z_2 \rangle^3 + \frac{1}{2}\langle \Lambda y_1, z_1 \rangle - \frac{1}{2}\langle y_1, z_2 \rangle\langle \Lambda y_1, z_2 \rangle + \frac{1}{4}\langle \Lambda^2 y_1, z_2 \rangle,

and h = \bar{g}.

IV. NONLINEARIZATION OF THE LAX PAIRS

From (20) and Theorem 3.2, the Bargmann systems (16) have the equivalent form

Y_x = MY, \qquad Z_x = -M^T Z,

where

Y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} \Phi \\ \Phi_x + \frac{1}{2} q\Phi - \frac{1}{2}\Lambda\Phi \end{pmatrix}, \qquad Z = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} \Psi_x - \frac{1}{2} q\Psi + \frac{1}{2}\Lambda\Psi \\ \Psi \end{pmatrix},



M = \begin{pmatrix} M_{11} & E \\ M_{21} & M_{22} \end{pmatrix},

where

M_{11} = -\frac{1}{2} qE + \frac{1}{2}\Lambda,
M_{21} = \frac{1}{4} q^2 E - pE - \frac{1}{2} q_x E - \frac{1}{2} q\Lambda + \frac{1}{4}\Lambda^2,
M_{22} = \frac{1}{2} qE - \frac{1}{2}\Lambda,
E = E_N = \mathrm{diag}(1, 1, \ldots, 1).

Under the Bargmann constraint, the entries become

M_{11} = -\frac{1}{2}\langle y_1, z_2 \rangle E + \frac{1}{2}\Lambda,
M_{21} = \frac{1}{4}\Lambda^2 + \frac{3}{4}\langle y_1, z_2 \rangle^2 E - \frac{1}{2}\langle y_1, z_1 \rangle E - \frac{1}{2}\langle y_2, z_2 \rangle E - \frac{1}{2}\langle y_1, z_2 \rangle\Lambda - \frac{1}{2}\langle \Lambda y_1, z_2 \rangle E,
M_{22} = \frac{1}{2}\langle y_1, z_2 \rangle E - \frac{1}{2}\Lambda.

Proposition 4.1: The Lax pairs (11) for the (m+1)-order evolution equation (12) are equivalent to

Y_x = MY, \quad Z_x = -M^T Z; \qquad Y_{t_m} = W_m Y, \quad Z_{t_m} = -W_m^T Z, \quad m = 0, 1, 2, \ldots,    (21)

where

W_m = \sum_{j=0}^{m} \begin{pmatrix} A_j & B_j \\ C_j & D_j \end{pmatrix} \lambda^{m-j},

A_j = a_{j-1} - \frac{1}{2} q b_{j-1} - b_{j-1,x} + \frac{1}{2} b_{j-1}\lambda,
B_j = b_{j-1},
C_j = \frac{1}{2} b_{j-1,xx} - \frac{1}{2} q_x b_{j-1} - p b_{j-1} + \frac{1}{4} q^2 b_{j-1} - \frac{1}{2} q b_{j-1}\lambda + \frac{1}{4} b_{j-1}\lambda^2,
D_j = -a_{j-1} - \frac{1}{2} q b_{j-1} + \frac{1}{2} b_{j-1}\lambda.

By (14) and (20), we have the Bargmann constraint

q = \langle y_1, z_2 \rangle, \qquad p = \langle y_1, z_1 \rangle - \frac{1}{2}\langle \Lambda y_1, z_2 \rangle + \frac{1}{2}\langle y_1, z_2 \rangle^2,    (22)

and

G_j = \begin{pmatrix} \langle \Lambda^j y_1, z_1 \rangle - \frac{1}{2}\langle y_1, z_2 \rangle\langle \Lambda^j y_1, z_2 \rangle + \frac{1}{2}\langle \Lambda^{j+1} y_1, z_2 \rangle \\ \langle \Lambda^j y_1, z_2 \rangle \end{pmatrix}, \quad j = 0, 1, 2, \ldots    (23)

Substituting the Bargmann constraint (22) and (23) into (21), the Lax pairs for the (m+1)-order evolution equation (12) are equivalent to the following forms:

Y_x = MY, \quad Z_x = -M^T Z; \qquad Y_{t_m} = W_m Y, \quad Z_{t_m} = -W_m^T Z, \quad m = 0, 1, 2, \ldots,    (24)

where M = \begin{pmatrix} M_{11} & E \\ M_{21} & M_{22} \end{pmatrix} as above, W_m = \begin{pmatrix} A_m & B_m \\ C_m & D_m \end{pmatrix}, and

A_m = \sum_{j=1}^{m} \left[ -\frac{1}{2}\langle \Lambda^{j-1} y_2, z_2 \rangle \right] \Lambda^{m-j} + \frac{1}{2}\Lambda^{m+1} - \frac{1}{2}\langle \Lambda^m y_1, z_2 \rangle,
B_m = \sum_{j=1}^{m} \left[ \langle \Lambda^{j-1} y_1, z_2 \rangle \right] \Lambda^{m-j} + \Lambda^m,
C_m = \sum_{j=1}^{m} \left[ \langle \Lambda^{j-1} y_2, z_1 \rangle \right] \Lambda^{m-j} - \frac{1}{2}\left( \langle y_1, z_1 \rangle + \langle y_2, z_2 \rangle \right)\Lambda^m - \frac{1}{4}\langle y_1, z_2 \rangle\Lambda^{m+1} - \frac{1}{4}\langle y_1, z_2 \rangle\Lambda^m + \frac{1}{4}\Lambda^{m+2} - \frac{1}{4}\langle \Lambda^{m+1} y_1, z_2 \rangle - \frac{1}{4}\langle \Lambda^m y_1, z_2 \rangle + \frac{1}{2}\langle y_1, z_2 \rangle\langle \Lambda^m y_1, z_2 \rangle + \frac{1}{4}\langle y_1, z_2 \rangle^2\Lambda^m,
D_m = \sum_{j=1}^{m} \left[ -\frac{1}{2}\langle \Lambda^{j-1} y_1, z_1 \rangle \right] \Lambda^{m-j} - \frac{1}{2}\Lambda^{m+1} + \frac{1}{2}\langle \Lambda^m y_1, z_2 \rangle.

Theorem 4.2: Under the Bargmann constraint (22), the nonlinearized Lax pairs (24) for the (m+1)-order long-wave equation (12) can be written as the following Hamilton systems [11-12]:

Y_x = \frac{\partial h}{\partial Z}, \quad Z_x = -\frac{\partial h}{\partial Y}; \qquad Y_{t_m} = \frac{\partial h_m}{\partial Z}, \quad Z_{t_m} = -\frac{\partial h_m}{\partial Y}, \quad m = 0, 1, 2, \ldots,    (25)

where

h_m = \frac{1}{2}\langle \Lambda^{m+1} y_1, z_1 \rangle + \frac{1}{2}\langle \Lambda^{m+1} y_2, z_2 \rangle + \frac{1}{4}\langle \Lambda^{m+2} y_1, z_2 \rangle + \frac{1}{4}\langle y_1, z_2 \rangle^2 \langle \Lambda^m y_1, z_2 \rangle - \frac{1}{4}\langle \Lambda^m y_1, z_2 \rangle\langle y_1, z_2 \rangle - \frac{1}{4}\langle \Lambda^{m+1} y_1, z_2 \rangle\langle y_1, z_2 \rangle + \langle \Lambda^m y_2, z_1 \rangle - \frac{1}{2}\langle \Lambda^m y_1, z_2 \rangle\left( \langle y_1, z_1 \rangle + \langle y_2, z_2 \rangle \right) + \sum_{j=1}^{m} \begin{vmatrix} \langle \Lambda^{j-1} y_1, z_2 \rangle & \langle \Lambda^{m-j} y_2, z_2 \rangle \\ \langle \Lambda^{j-1} y_1, z_1 \rangle & \langle \Lambda^{m-j} y_2, z_1 \rangle \end{vmatrix},

and h_0 = h.

Proof:

\frac{\partial h}{\partial z_1} = y_2 - \frac{1}{2}\langle y_1, z_2 \rangle y_1 + \frac{1}{2}\Lambda y_1 = y_{1x},



$$\begin{aligned}
\frac{\partial h_m}{\partial z_1} ={}& \tfrac12\Lambda^{m+1}y_1 - \tfrac12\langle\Lambda^{m}y_1,z_2\rangle y_1 + \Lambda^{m}y_2 \\
&+ \sum_{j=1}^{m}\Lambda^{m-j}\bigl[\langle\Lambda^{j-1}y_1,z_2\rangle y_2 - \langle\Lambda^{j-1}y_2,z_2\rangle y_1\bigr] = y_{1t_m},
\end{aligned}$$

$$\begin{aligned}
\frac{\partial h_m}{\partial z_2} ={}& \tfrac14\Lambda^{m+2}y_1 + \tfrac12\Lambda^{m+1}y_2 - \tfrac12\langle y_1,z_1\rangle\Lambda^{m}y_1 - \tfrac12\langle\Lambda^{m}y_1,z_2\rangle y_2 - \tfrac12\langle y_2,z_2\rangle\Lambda^{m}y_1 \\
&+ \tfrac14\langle y_1,z_2\rangle^{2}\Lambda^{m}y_1 + \tfrac12\langle y_1,z_2\rangle\langle\Lambda^{m}y_1,z_2\rangle y_1 - \tfrac14\langle\Lambda y_1,z_2\rangle\Lambda^{m}y_1 - \tfrac14\langle\Lambda^{m}y_1,z_2\rangle\Lambda y_1 \\
&- \tfrac14\langle\Lambda^{m+1}y_1,z_2\rangle y_1 - \tfrac14\langle y_1,z_2\rangle\Lambda^{m+1}y_1 \\
&+ \sum_{j=1}^{m}\Lambda^{m-j}\bigl[\langle\Lambda^{j-1}y_2,z_1\rangle y_1 - \langle\Lambda^{j-1}y_1,z_1\rangle y_2\bigr] = y_{2t_m},
\end{aligned}$$

so

$$Y_x = \frac{\partial h}{\partial Z},\qquad Y_{t_m} = \frac{\partial h_m}{\partial Z}.$$

Similarly, we have

$$-\frac{\partial h}{\partial y_1} = \tfrac12\Lambda z_1 - \tfrac12\langle y_1,z_2\rangle z_1 - \tfrac12\langle y_1,z_1\rangle z_2 - \tfrac12\langle y_2,z_2\rangle z_2 + \tfrac34\langle y_1,z_2\rangle^{2}z_2 + \tfrac14\Lambda^{2}z_2 - \tfrac12\langle\Lambda y_1,z_2\rangle z_2 - \tfrac12\langle y_1,z_2\rangle\Lambda z_2 = z_{1x},$$

$$-\frac{\partial h}{\partial y_2} = z_1 - \tfrac12\langle y_1,z_2\rangle z_2 + \tfrac12\Lambda z_2 = z_{2x},$$

$$\begin{aligned}
-\frac{\partial h_m}{\partial y_1} ={}& \tfrac12\Lambda^{m+1}z_1 - \tfrac12\langle\Lambda^{m}y_1,z_2\rangle z_1 - \tfrac12\langle y_1,z_1\rangle\Lambda^{m}z_2 - \tfrac12\langle y_2,z_2\rangle\Lambda^{m}z_2 + \tfrac14\Lambda^{m+2}z_2 \\
&+ \tfrac14\langle y_1,z_2\rangle^{2}\Lambda^{m}z_2 + \tfrac12\langle y_1,z_2\rangle\langle\Lambda^{m}y_1,z_2\rangle z_2 - \tfrac14\langle\Lambda^{m+1}y_1,z_2\rangle z_2 - \tfrac14\langle y_1,z_2\rangle\Lambda^{m+1}z_2 \\
&- \tfrac14\langle\Lambda y_1,z_2\rangle\Lambda^{m}z_2 - \tfrac14\langle\Lambda^{m}y_1,z_2\rangle\Lambda z_2 \\
&+ \sum_{j=1}^{m}\Lambda^{m-j}\bigl[\langle\Lambda^{j-1}y_2,z_1\rangle z_2 - \langle\Lambda^{j-1}y_2,z_2\rangle z_1\bigr] = z_{1t_m},
\end{aligned}$$

$$\begin{aligned}
-\frac{\partial h_m}{\partial y_2} ={}& \tfrac12\Lambda^{m+1}z_2 + \Lambda^{m}z_1 - \tfrac12\langle\Lambda^{m}y_1,z_2\rangle z_2 \\
&+ \sum_{j=1}^{m}\Lambda^{m-j}\bigl[\langle\Lambda^{j-1}y_1,z_2\rangle z_1 - \langle\Lambda^{j-1}y_1,z_1\rangle z_2\bigr] = z_{2t_m},
\end{aligned}$$

so

$$Z_x = -\frac{\partial h}{\partial Y},\qquad Z_{t_m} = -\frac{\partial h_m}{\partial Y}.$$

V. LIOUVILLE COMPLETELY INTEGRABLE SYSTEMS

Now we discuss the complete integrability of the Bargmann systems (25). Let

$$E_k^{(1)} = \tfrac12\lambda_k^{2}\bigl(y_{1k}z_{1k} + y_{2k}z_{2k}\bigr),$$

$$\begin{aligned}
E_k^{(2)} ={}& -\tfrac12 y_{1k}z_{2k}\bigl(\langle y_1,z_1\rangle + \langle y_2,z_2\rangle\bigr) + \lambda_k y_{2k}z_{1k} \\
&- \tfrac14\lambda_k y_{1k}z_{2k}\langle y_1,z_2\rangle + \tfrac14 y_{1k}z_{2k}\langle\Lambda y_1,z_2\rangle \\
&+ \tfrac14\lambda_k^{2}y_{1k}z_{2k} - \tfrac14 y_{1k}z_{2k}\langle y_1,z_2\rangle^{2} - \Gamma_k^{(1,2)},
\end{aligned}$$

where

$$\Gamma_k^{(1,2)} = \sum_{l=1,\,l\neq k}^{N}\frac{1}{\lambda_k-\lambda_l}\begin{vmatrix}y_{1k} & y_{1l} \\ y_{2k} & y_{2l}\end{vmatrix}\begin{vmatrix}z_{1k} & z_{1l} \\ z_{2k} & z_{2l}\end{vmatrix}.$$

Proposition 5.1:
i) $\{E_j^{(1)},E_k^{(1)}\} = 0,\ \{E_j^{(1)},E_k^{(2)}\} = 0,\ \{E_j^{(2)},E_k^{(2)}\} = 0,\quad j,k = 1,2,\ldots,N;$ (26)
ii) $\{dE_j^{(l)},\ j = 1,2,\ldots,N;\ l = 1,2\}$ are linearly independent.

Let

$$H(\lambda) = \sum_{j=1}^{N}\frac{1}{\lambda-\lambda_j}\bigl(E_j^{(1)} + E_j^{(2)}\bigr) = \sum_{m=0}^{\infty}\lambda^{-m-1}h_m, \quad(27)$$

$$h_m = \sum_{j=1}^{N}\lambda_j^{m}\bigl(E_j^{(1)} + E_j^{(2)}\bigr),\quad m = 0,1,2,\ldots \quad(28)$$

Theorem 5.2: The Bargmann systems (25) [8] are completely integrable systems in the Liouville sense; i.e.,

$$\{h, E_j^{(l)}\} = 0,\quad l = 1,2;\ j = 1,2,\ldots,N, \quad(29)$$
$$\{h_m, E_j^{(l)}\} = 0,\quad l = 1,2;\ j = 1,2,\ldots,N, \quad(30)$$
$$\{h_m, h_n\} = 0,\quad m,n = 0,1,2,\ldots \quad(31)$$
$$\{h, h_m\} = 0,\quad m = 0,1,2,\ldots \quad(32)$$

Proof: By (26) and (28), we have $\{h_m, h_n\} = 0$, $m,n = 0,1,2,\ldots$; from (27), it then follows that $\{H, H\} = 0$. Using $h = h_0$, we have

$$\{h, E_j^{(l)}\} = 0,\quad l = 1,2;\ j = 1,2,\ldots,N.$$
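For reference, the involutivity statements above are taken with respect to the canonical Poisson bracket on the phase space of the variables $(y_{ik}, z_{ik})$. The bracket's definition did not survive the typesetting here, so the standard convention used in this class of Bargmann-system papers is restated below (an assumption supplied by the editor, not a quotation of the original):

```latex
\{f,g\} = \sum_{k=1}^{N}\sum_{i=1}^{2}
  \left(\frac{\partial f}{\partial y_{ik}}\,\frac{\partial g}{\partial z_{ik}}
       -\frac{\partial f}{\partial z_{ik}}\,\frac{\partial g}{\partial y_{ik}}\right).
```

With this bracket, (29)-(32) state that $h$ and the $h_m$ form an involutive family together with the $2N$ integrals $E_j^{(l)}$.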

According to the above theorem, the Hamiltonian phase flows $g_{h_n}^{t_n}$ and $g_{h_m}^{t_m}$ commute. Now we arbitrarily choose an initial value $(y_i(0,0), z_i(0,0))^{T}$, $i = 1,2$, and let

$$\binom{y_i(t_m,t_n)}{z_i(t_m,t_n)} = g_{h_m}^{t_m}g_{h_n}^{t_n}\binom{y_i(0,0)}{z_i(0,0)} = g_{h_n}^{t_n}g_{h_m}^{t_m}\binom{y_i(0,0)}{z_i(0,0)}.$$

From (8), (9) and Theorem 4.2, the following theorem holds.

Theorem 5.3: Suppose $(y_1,y_2,z_1,z_2)$ is an involutive solution of the Hamiltonian canonical equation systems (25) [11]. Then

$$q = \langle y_1,z_2\rangle,\qquad p = \langle y_1,z_1\rangle - \tfrac12\langle\Lambda y_1,z_2\rangle + \tfrac12\langle y_1,z_2\rangle^{2}$$

satisfies the (m+1)-order long-wave equation (12).

Remark: By Theorem 5.2, soliton waves have the following property: when two of them interact, the larger soliton is shifted to the right of where it would have been had there been no interaction, and the smaller one is shifted to the left at the same time (see Fig. 3) [22-24].

Figure 3. Interaction of two solitary waves at different times
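As background for the phase-shift statement in the remark, the following is the standard two-soliton result for KdV-type equations; it is supplied here as well-known theory, not as part of the original derivation. For solitons with parameters $\kappa_1 > \kappa_2$, the interaction shifts the faster soliton forward and the slower one backward by

```latex
\Delta_1 = \frac{1}{\kappa_1}\ln\frac{\kappa_1+\kappa_2}{\kappa_1-\kappa_2},
\qquad
\Delta_2 = -\frac{1}{\kappa_2}\ln\frac{\kappa_1+\kappa_2}{\kappa_1-\kappa_2}.
```

This is the behavior sketched in Fig. 3: the trajectories emerge from the collision with their shapes and speeds intact, displaced only by these constant shifts.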

Especially, if $(y_1,y_2,z_1,z_2)$ satisfies

$$\binom{y_1}{y_2}_x = M\binom{y_1}{y_2},\quad \binom{z_1}{z_2}_x = -M^{T}\binom{z_1}{z_2},\qquad \binom{y_1}{y_2}_{t_1} = W_1\binom{y_1}{y_2},\quad \binom{z_1}{z_2}_{t_1} = -W_1^{T}\binom{z_1}{z_2},$$

where

$$M = \begin{pmatrix}\tfrac12\lambda - \tfrac12\langle y_1,z_2\rangle & E \\ M_{21} & -\tfrac12\lambda + \tfrac12\langle y_1,z_2\rangle\end{pmatrix},\qquad M_{21} = \tfrac14\lambda^{2} + \tfrac34\langle y_1,z_2\rangle^{2} - \tfrac12\langle y_1,z_1\rangle - \tfrac12\langle y_2,z_2\rangle,$$

$$W_1 = \begin{pmatrix}A_1 & B_1 \\ C_1 & D_1\end{pmatrix},$$

$$A_1 = \tfrac12\lambda^{2} - \tfrac12\lambda\langle y_1,z_2\rangle - \tfrac12\langle y_2,z_2\rangle,\qquad B_1 = \lambda + \langle y_1,z_2\rangle,$$

$$\begin{aligned}
C_1 ={}& \langle y_2,z_1\rangle - \tfrac12\bigl(\langle y_1,z_1\rangle + \langle y_2,z_2\rangle\bigr)\lambda + \tfrac14\lambda\langle y_1,z_2\rangle^{2} - \tfrac14\lambda^{2}\langle y_1,z_2\rangle + \tfrac14\lambda^{3} \\
&- \tfrac14\langle\Lambda^{2}y_1,z_2\rangle - \tfrac14\lambda\langle\Lambda y_1,z_2\rangle + \tfrac12\langle y_1,z_2\rangle\langle\Lambda y_1,z_2\rangle + \tfrac14\langle y_1,z_2\rangle^{2},
\end{aligned}$$

$$D_1 = -\tfrac12\lambda^{2} + \tfrac12\lambda\langle y_1,z_2\rangle - \langle y_1,z_1\rangle,$$

then

$$q = \langle y_1,z_2\rangle,\qquad p = \langle y_1,z_1\rangle - \tfrac12\langle\Lambda y_1,z_2\rangle + \tfrac12\langle y_1,z_2\rangle^{2}$$

satisfies the long-wave equation

$$q_{t_1} = 2p_x + q_{xx} + 2qq_x,\qquad p_{t_1} = p_{xx} + 2(qp)_x.$$
ACKNOWLEDGMENT

The authors are grateful to the anonymous referees for their valuable suggestions. This work is supported in part by a grant from the Youth Science Foundation of Hebei Province (Project No. A2011210017).

REFERENCES

[1] M. J. Ablowitz and P. A. Clarkson, Solitons, Nonlinear Evolution Equations and Inverse Scattering, Cambridge University Press, 1991.
[2] D. Y. Chen, Soliton Theory, Science Press, 2006.
[3] V. I. Arnold, Mathematical Methods of Classical Mechanics, 2nd ed., Springer-Verlag, New York, 1999.
[4] C. W. Cao, "A classical integrable system and involutive representation of solutions of the KdV equation," Acta Math. Sinica, 1991, pp. 436-440.
[5] D. S. Wang, "Integrability of a coupled KdV system: Painlevé property, Lax pair and Bäcklund transformation," Appl. Math. Comput., vol. 216, 2010, pp. 1349-1354.
[6] Y. H. Cao and D. S. Wang, "Prolongation structures of a generalized coupled Korteweg-de Vries equation and Miura transformation," Commun. Nonl. Sci. Numer. Simul., vol. 15, 2010, pp. 2344-2349.
[7] D. S. Wang, "Complete integrability and the Miura transformation of a coupled KdV equation," Appl. Math. Lett., vol. 23, 2010, pp. 665-669.
[8] X. Zeng and D. S. Wang, "A generalized extended rational expansion method and its application to (1+1)-dimensional dispersive long wave equation," Appl. Math. Comput., vol. 212, 2009, pp. 296-304.
[9] Y. T. Wu and X. G. Geng, "A finite-dimensional integrable system associated with the three-wave interaction equations," J. Math. Phys., vol. 40, 1999, pp. 3409-3430.
[10] C. W. Cao, Y. T. Wu, and X. G. Geng, "Relation between the Kadometsev-Petviashvili equation and the confocal involutive system," J. Math. Phys., vol. 40, 1999, pp. 3948-3970.
[11] Z. Q. Gu, "The Neumann system for 3rd-order eigenvalue problems related to the Boussinesq equation," Il Nuovo Cimento B, vol. 117, no. 6, 2002, pp. 615-632.
[12] W. X. Ma and R. G. Zhou, "Binary nonlinearization of spectral problems of the perturbation AKNS systems," Chaos, Solitons & Fractals, vol. 13, 2002, pp. 1451-1463.
[13] X. G. Geng and D. L. Du, "Two hierarchies of new nonlinear evolution equations associated with 3x3 matrix spectral problems," Chaos, Solitons & Fractals, vol. 29, 2006, pp. 1165-1172.
[14] X. G. Geng and H. H. Dai, "A hierarchy of new nonlinear differential-difference equations," Journal of the Physical Society of Japan, vol. 75, no. 1, 2006.
[15] J. S. He, J. Yu, Y. Cheng, and R. G. Zhou, "Binary nonlinearization of the super AKNS system," Mod. Phys. Lett. B, vol. 22, 2008, pp. 275-288.
[16] Z. Q. Gu and J. X. Zhang, "A new constrained flow for Boussinesq-Burgers hierarchy," Il Nuovo Cimento B, vol. 122, no. 8, 2007, pp. 871-884.
[17] W. Liu, S. J. Yuan, and Q. Wang, "The second-order spectral problem with the speed energy and its completely integrable system," Journal of Shijiazhuang Railway Institute, vol. 22, no. 1, 2009.
[18] Z. Q. Gu, J. X. Zhang, and W. Liu, "Two new completely integrable systems related to the KdV equation hierarchy," Il Nuovo Cimento B, vol. 123, no. 5, 2008, pp. 605-622.
[19] P. J. Olver, Applications of Lie Groups to Differential Equations, 2nd ed., Springer-Verlag, New York, 1999, pp. 13-53, 452-462.
[20] B. Q. Xia and R. G. Zhou, "Integrable deformations of integrable symplectic maps," Phys. Lett. A, vol. 373, 2009, pp. 1121-1129.
[21] Y. Q. Yao, Y. H. Huang, Y. Wei, and Y. B. Zeng, "Some new integrable systems constructed from the bi-Hamiltonian systems with pure differential Hamiltonian operators," Applied Mathematics and Computation, vol. 218, 2011, pp. 583-591.
[22] Y. Keskin and G. Oturanc, "Numerical solution of regularized long wave equation by reduced differential transform method," Applied Mathematical Sciences, vol. 4, 2010, pp. 1221-1231.
[23] J. X. Cai, "A multisymplectic explicit scheme for the modified regularized long-wave equation," Journal of Computational and Applied Mathematics, vol. 234, 2010, pp. 899-905.
[24] J. X. Cai, "Multisymplectic numerical method for the regularized long-wave equation," Computer Physics Communications, vol. 180, 2009, pp. 1821-1831.

Wei Liu was born in Shijiazhuang, Hebei, in January 1981. She received the M.S. degree in 2007 from the School of Sciences, Hebei University of Technology, Tianjin, China. She is mainly engaged in the control and application of differential equations. Her current research interests include integrable systems and computational geometry.

Shujuan Yuan was born in Tangshan, Hebei, in September 1980. She received the M.S. degree in 2007 from the School of Sciences, Hebei University of Technology, Tianjin, China. Her current research interests include integrable systems and computational geometry. She is a lecturer in the Qinggong College, Hebei United University.

Shuhong Wang, a native of Liaoning, was born in January 1980. She received the M.S. degree in Applied Mathematics from Hebei University of Technology, Tianjin, China, in 2006. She is mainly engaged in the domain of differential equations, inequalities, and related topics.

Analysis of Causality between Tourism and Economic Growth Based on Computational Econometrics
WANG Liangju
School of Business Administration, Anhui University of Finance and Economics, Bengbu, Anhui, P. R. China Email: acwlj@163.com

ZHANG Huihui
School of Accounting, Anhui University of Finance and Economics, Bengbu, Anhui, P. R. China Email:justinwlj@sohu.com

LI Wanlian
School of Business Administration, Anhui University of Finance and Economics, Bengbu, Anhui, P. R. China Email:justinwlj@163.com

Abstract—To investigate the causal relationship between China's domestic tourism and economic growth, this paper performs co-integration analysis and Granger causality tests using annual time series data from 1984 to 2009. Co-integration analysis indicates that there is a long-term, stable equilibrium relationship between the development of China's domestic tourism and economic growth. The results from the ECM model indicate that there is a short-term disequilibrium relationship between the two, and the ECM model reveals an adjustment mechanism from the short run to the long run in this relationship. In addition, bidirectional Granger causality between China's domestic tourism and economic growth is demonstrated: the development of China's domestic tourism is a Granger cause of economic growth, and China's economic growth is a Granger cause of the development of domestic tourism as well. Our findings imply that China may enhance its economic growth by strategically strengthening the tourism industry while not neglecting the other sectors that also promote growth.

Index Terms—economic growth, domestic tourism, co-integration analysis, error correction model, Granger causality test

I. INTRODUCTION

Tourism is one of the largest and most rapidly growing sectors in the world. The role of tourism in economic growth and in the progress of modern societies has become common awareness among political authorities worldwide. That tourism is an economic activity of primary value and importance for many countries is accepted by most. The tourism industry mainly consists of such factors as traveling, sightseeing, accommodation, food, shopping and entertainment. It is an industry with

strong comprehensiveness, high industrial relevance and a large pull effect. Tourism consumption directly stimulates the development of such traditional industries as civil aviation, railways, highways, commerce, food and accommodation. In addition, tourism can also promote the development of such modern service industries as international finance, logistics, information consulting, cultural creativity, movie production, entertainment, conferences and exhibitions, and so on. A general consensus has emerged that it not only increases foreign exchange income, but also creates employment opportunities, stimulates the growth of the tourism industry and, by virtue of this, triggers overall economic growth. As such, tourism development has become an important target for most governments. The development of the tourism industry contributes to a country's economic growth [1]. It is now considered an efficient tool for promoting economic growth of the host country. The domestic market is one of the largest of China's three tourism markets. China's domestic tourism industry started with reform and opening-up, and has grown rapidly since the 1990s. China's domestic tourist arrivals (DTA) increased to 2.1 billion person-trips in 2010 from 200 million person-trips in 1984, an average annual growth rate of 9.47 per cent. Meanwhile, China has kept long-term, rapid and stable economic growth since reform and opening-up. China's Gross Domestic Product (GDP) increased to 40.12 trillion Yuan in 2010 from 720.8 billion Yuan in 1984, an average annual growth rate of 9.86 per cent (we have eliminated the effect of inflation using consumer prices). From the data above, we can see intuitively that China's domestic tourism seems to share the same growth trend as China's economy; there seems to be a high positive correlation between the development of China's domestic tourism and economic growth. To tell

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2152-2159

the true story, it is necessary to perform empirical analysis of the relationship between the development of China's domestic tourism and economic growth.
TABLE I. DOMESTIC TOURIST ARRIVALS AND GROSS DOMESTIC PRODUCT

Year | DTA (billion) | Real GDP (constant-1978 CNY, trillion)
1984 | 0.200 | 0.601
1987 | 0.290 | 0.805
1990 | 0.280 | 0.863
1993 | 0.410 | 1.294
1996 | 0.639 | 1.656
1999 | 0.719 | 2.075
2002 | 0.878 | 2.776
2005 | 1.212 | 3.946
2008 | 1.712 | 5.752
2010 | 2.100 | 6.935
Source: National Bureau of Statistics of China and National Bureau of Tourism of China. Real GDP in 2010 is preliminary.
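The growth rates quoted in the introduction can be checked directly against the endpoints of Table I. A quick sketch (the 9.47% and 9.86% figures are compound average annual rates over the 26 years from 1984 to 2010):

```python
def cagr(v0: float, v1: float, years: int) -> float:
    """Compound average annual growth rate between two endpoint values."""
    return (v1 / v0) ** (1.0 / years) - 1.0

# Endpoints from Table I: DTA in billion person-trips,
# real GDP in trillion constant-1978 CNY.
dta_rate = cagr(0.200, 2.100, 2010 - 1984)   # ~0.0947
gdp_rate = cagr(0.601, 6.935, 2010 - 1984)   # ~0.0986

print(f"DTA: {dta_rate:.2%}, GDP: {gdp_rate:.2%}")
```

Both computed rates agree with the per-cent figures reported in the text.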

The remainder of this paper is organized as follows. The next section reviews some recent literature on the tourism-growth nexus. Section 3 discusses the methodology used in this paper. Section 4 explains the variables and the data, and presents the empirical results. Section 5 provides a concluding summary and discussion.

II. LITERATURE REVIEW

In recent years, the role of tourism in the economic development of a country has been the focus of many studies. There is an increasing and widely accepted belief that tourism can play a fundamental role for developing countries in achieving economic growth and development. This hypothesis is strongly supported by international organizations such as the World Tourism Organization (WTO) and the World Travel and Tourism Council (WTTC). The development of tourism has usually been considered a positive contribution to economic growth. Balaguer and Manuel (2002) examine the role of tourism in Spanish long-run economic development, testing the tourism-led growth hypothesis. The results indicate that, at least during the last three decades, economic growth in Spain has been sensitive to the persistent expansion of international tourism; the increase of this activity has produced multiplier effects over time. External competitiveness has also been shown in the model to be a fundamental variable for Spanish economic growth in the long run [2]. Dritsakis (2004) examines empirically the tourism impact on the long-run economic growth of Greece through causality analysis among real GDP, the real effective exchange rate and international tourism earnings. A multivariate autoregressive VAR model is applied for the examined period 1960:I-2000:IV. The results of the co-integration analysis suggest that there is one co-integrated vector among real GDP, the real effective exchange rate and international tourism earnings. Granger causality tests based on error correction models
(ECM) indicate that there is a strong Granger causal relation between international tourism earnings and economic growth, and a strong causal relation between the real exchange rate and economic growth, while the relation between economic growth and international tourism earnings is simply a causal relation, as is the relation between the real exchange rate and international tourism earnings [3]. Brida and Risso (2009) investigate possible causal relationships among tourism expenditure, the real exchange rate and economic growth using quarterly data from 1986 to 2007. The results indicate that economic growth in Chile has been sensitive to the expansion of international tourism during the last decades, and that the increase of this activity has produced multiplier effects over time; the empirical results support tourism-led economic growth [4]. Brida, Barquet and Risso (2010) investigate the causal relations between tourism growth, relative prices and economic expansion for Trentino-Alto Adige, a region of northeast Italy bordering Switzerland and Austria. Johansen co-integration analysis shows the existence of one co-integrated vector among real GDP, tourism and relative prices, with positive corresponding elasticities. Tourism and relative prices are weakly exogenous to real GDP. A variation of the Granger causality test developed by Toda and Yamamoto reveals unidirectional causality from tourism to real GDP, and impulse response analysis shows that a shock in tourism expenditure produces a fast positive effect on growth [5]. Kreishan (2010) examines the causality relations between tourism earnings and economic growth (GDP) for Jordan, using annual data covering the period 1970-2009. Developed time-series techniques are used, namely the Augmented Dickey-Fuller (ADF) test for unit roots, Johansen and Juselius (JJ) for co-integration, and the Granger causality test for causal relationships. The findings of the study show that there is a positive relationship between tourism development and economic development in the long run. Moreover, the Granger causality test results reveal the presence of unidirectional causality from tourism earnings to economic growth. The results suggest that government should focus on economic policies that promote international tourism as a potential source of economic growth in Jordan [6]. The analyses above are conducted on a single-country basis. Some other studies focus on the contribution of tourism to economic growth across several countries and regions. Fayissa, Nsiah and Tadasse (2007) use panel data for 42 African countries spanning 1995 to 2004 to explore the potential contribution of tourism to economic growth and development within the conventional neoclassical framework. Their results show that receipts from the tourism industry significantly contribute both to the current level of GDP and to the economic growth of Sub-Saharan African countries, as do investments in physical and human capital [7]. Lee and Chang (2008) apply the new heterogeneous panel co-integration technique to reinvestigate the long-run comovements and causal relationships between tourism

development and economic growth for OECD and non-OECD countries (including those in Asia, Latin America and Sub-Saharan Africa) for the 1990-2002 period. On the global scale, after allowing for the heterogeneous country effect, a co-integrated relationship between GDP and tourism development is substantiated. It is also determined that tourism development has a greater impact on GDP in non-OECD countries than in OECD countries and, when the variable is tourism receipts, the greatest impact is in Sub-Saharan African countries. Additionally, the real effective exchange rate has significant effects on economic growth. Finally, in the long run, the panel causality test shows unidirectional causality from tourism development to economic growth in OECD countries, bidirectional relationships in non-OECD countries, but only weak relationships in Asia [8]. Fayissa, Nsiah and Tadasse (2009) further use panel data for 17 Latin American countries (LACs) spanning 1995 to 2004 to investigate the impact of the tourism industry on the economic growth and development of Latin American countries within the framework of the conventional neoclassical growth model. Their empirical results show that revenues from the tourism industry positively contribute both to the current level of GDP and to the economic growth of LACs, as do investments in physical and human capital [9]. With the rapid development of China's tourism industry, several studies have examined the relationship between tourism and economic growth using Chinese data. Wu (2003) finds that the development of the tourism industry has largely promoted China's economic growth [10]. Yang (2006) finds that domestic tourism has little pulling effect on economic growth, but that economic growth has significant driving effects on domestic tourism [11]. Both studies directly apply regression analysis to non-stationary variables such as domestic tourism income, inbound tourism income and GDP; thus spurious regression may occur. Chen, Liu and Xu (2006) perform a Granger causality test on the relationship between the development of China's tourism industry and economic growth based on annual time series data from 1985 to 2003. Their study indicates that the development of China's tourism industry has significant promoting effects on China's economic growth, but that China's economic growth has little promoting effect on the development of China's tourism industry [12]. Making use of data on China's domestic tourism revenue, inbound tourism revenue and GDP from 1985 to 2005, Liu and Wu (2007) find that there are long-term, stable co-integration relationships among domestic tourism, economic growth and inbound tourism. Moreover, they find Granger causalities from economic growth to domestic tourism and inbound tourism [13]. Based on data on China's inbound revenue per capita, domestic revenue per capita and GDP per capita from 1978 to 2007, Wu, Xie and Quan (2009) investigate the causal relations between tourism growth and economic expansion for China's economy using the Johansen co-integration test approach

and the Granger causality test. They conclude that there is a long-term equilibrium between domestic tourism growth and economic expansion. They also find that there is no causal relationship between international tourism growth and economic expansion at the national level [14]. China has implemented sampling surveys of domestic tourism since 1993; therefore, the data on domestic tourism revenue after 1993 are incomparable with the data before 1993. In addition, the statistical method for international tourism revenue changed with the reform of China's foreign exchange management system, so the data on international tourism revenue are also incomparable with earlier years. Taking into account that the quality of the sample data in the studies above has serious defects, the conclusions drawn from them may be wrong. Some researchers have noted that the revenue data for China's domestic tourism before 1993 cannot be compared with the data after 1993. Based on co-integration theory, Zhang and Liu (2009) analyze the relationship between residents' tourism consumption and economic growth using annual time series data for the period from 1994 to 2006. They conclude that the tourism consumption of urban residents has a co-integration relationship with GDP as well as with the value added of the tertiary industry [15]. Based on data from 1993 to 2007, Liu and Hao (2009) examine whether there is co-integration between domestic tourism, inbound tourism and economic growth in China; the results indicate that both domestic tourism and inbound tourism have co-integration relationships with economic growth [16]. Making use of data from 1993 to 2009, Zhao and Quan (2011) examine the correlation between China's domestic tourism consumption and economic growth with a VAR econometric model. They find a long-term equilibrium relationship between domestic tourism consumption and economic growth through co-integration tests, Granger causality tests, impulse responses, and variance decomposition. In the short term, the role of domestic tourism consumption in enforcing economic growth is smaller than that of economic growth in enforcing domestic tourism consumption; in the long term, it is greater. Finally, their paper gives a series of suggestions on the interaction between domestic tourism consumption and economic growth [17]. However, the results drawn from these studies are all unreliable because the sample sizes are very small. On the whole, the empirical relationship between tourism revenue and economic growth, or more specifically the tourism-led growth hypothesis, has been extensively researched. Most studies indicate that there is a co-integration relationship between tourism and economic growth. However, the direction of the causality remains an unsolved conundrum. Knowing the direction of causality matters not just for understanding the process; it is also vital for designing appropriate policy. Therefore, examining the validity of the tourism-led

growth hypothesis, or vice versa, has become a pivotal issue for economists as well as policymakers. In this paper, we employ annual time series data on China's economic growth and China's domestic tourist arrivals, rather than domestic tourism revenue, for the period from 1984 to 2009, and examine whether there is a long-term equilibrium (co-integration) relationship between the development of China's domestic tourism and China's economic growth based on co-integration theory. In addition, this paper constructs an error correction model to analyze the short-term disequilibrium relationship between the development of China's domestic tourism and economic growth. Finally, we examine whether there is causality between them by performing Granger causality tests.

III. METHODOLOGY

Classical regression analysis is based on the hypothesis that the time series data are stationary. However, much time series data are non-stationary. Because nonstationary time series do not have finite variance and do not accord with the Gauss-Markov theorem, the Ordinary Least Squares (OLS) estimators are inconsistent, the spurious regression phenomenon may occur [18], and incorrect causality can be inferred [19]. Co-integration theory in dynamic econometric analysis overcomes this deficiency and deals with nonstationary time series effectively. The general steps of co-integration analysis are as follows. First, we perform a unit root test developed by Dickey and Fuller (1979, 1981) to investigate whether the series are stationary [20, 21]. If they are non-stationary, we introduce co-integration theory to analyze the relationship between them. On the basis of the co-integration test, we employ the Granger causality test to examine whether there is a causal relationship between the variables. Granger (1988) argues that there is one-way Granger causality at least if the variables are co-integrated [19].
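The first of these steps, unit root testing via the Dickey-Fuller regression formalized below, can be sketched numerically. The sketch regresses the first difference of a series on its own lag and inspects the t statistic of the slope; the random walk and white-noise series are illustrative assumptions, not the paper's data:

```python
import numpy as np

def df_tstat(x: np.ndarray) -> float:
    """t statistic of delta in the regression  dx_t = delta * x_{t-1} + e_t."""
    dx, lag = np.diff(x), x[:-1]
    delta = (lag @ dx) / (lag @ lag)            # OLS slope (no constant)
    resid = dx - delta * lag
    s2 = (resid @ resid) / (len(dx) - 1)        # residual variance
    return delta / np.sqrt(s2 / (lag @ lag))    # t statistic of the slope

rng = np.random.default_rng(0)
eps = rng.standard_normal(300)
walk = np.cumsum(eps)    # unit root: t statistic only mildly negative
noise = eps              # stationary: t statistic strongly negative

t_walk, t_noise = df_tstat(walk), df_tstat(noise)
print(t_walk, t_noise)
```

A strongly negative statistic, beyond the relevant Dickey-Fuller critical value, rejects the unit root; in practice the comparison uses DF critical values rather than the normal distribution, as the text explains next.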
For a time series x_t, establish the following model:

x_t = ρ x_{t-1} + ε_t,  or  Δx_t = δ x_{t-1} + ε_t,          (1)

where Δ denotes the first-difference operator, t denotes the time period, and δ = ρ − 1. The residuals ε_t are assumed to be normally distributed, serially uncorrelated white noise. If δ equals zero, x_t is nonstationary; that is to say, there is a unit root. Construct the t statistic

t_δ = δ̂ / s(δ̂).          (2)

Estimating model (1), we arrive at the value of t_δ. If the absolute value of t_δ is larger than the absolute value of the critical value at a given significance level, the hypothesis that the time series is nonstationary is rejected. This is the DF (Dickey-Fuller) test, also called the unit root test.

Because we cannot ensure that ε_t is white noise in the DF test, the estimated value of δ may be biased. Dickey and Fuller augmented the DF test in 1979 and 1981, forming the augmented DF test, namely the ADF test [22]. The ADF test and the DF test rest on the same principle; the general form of the ADF test is

Δx_t = α + βt + δ x_{t-1} + Σ_{i=1}^{p} φ_i Δx_{t-i} + ε_t.          (3)

Suppose a series is nonstationary but its first-order difference is stationary; then we call this series first-order integrated, denoted I(1). If a series is stationary after d-order differencing, the series is called d-order integrated, denoted I(d). Suppose the series x_t and y_t are both d-order integrated. If there is a vector a = (a_1, a_2) that makes z_t = a_1 y_t + a_2 x_t a (d−b)-order integrated series, that is, z_t ~ I(d−b) with d ≥ b > 0, then x_t and y_t are called (d, b) co-integrated, denoted x_t, y_t ~ CI(d, b), where a is the co-integration vector.

Two variables can be co-integrated only on the condition that they are integrated of the same order. If the series x_t and y_t are both nonstationary but both d-order integrated, we can judge whether x_t and y_t are co-integrated by examining whether the residual ε_t in model (4) is stationary:

y_t = b_0 + b_1 x_t + ε_t.          (4)

If ε_t is stationary, we can consider that there is a co-integration relationship between x_t and y_t.

The meaning of co-integration analysis is that it can examine whether there is a long-term equilibrium relationship between variables. If two variables are co-integrated, they will not drift far from each other in the long term; a shock can merely drive them apart in the short term, and in the long term they resume equilibrium automatically. The Engle-Granger two-step test (1987) can be used to verify whether variables are co-integrated [23].

The co-integration test can examine whether there is a long-term equilibrium relationship between variables; however, it cannot reveal whether there is causality between them. The existence of co-integration implies the existence of Granger causality in at least one direction (Granger, 1988). The Granger causality test provides a good method for dealing with this problem. We can consider that variable X is variable Y's Granger cause if the included lagged terms of X significantly improve the accuracy

of the predicted variable Y. Construct the following model:

TABLE II.
DESCRIPTIVE STATISTICS OF VARIABLES

y t =a+ i xt i + i y t j + u t .
i =1
j =1

(5)

Variable GDP DTA

Obs 7.276 22.696

Mean 4.915 16.623

Std. Dev. 2.000 6.009

Min 19.020 62.815

Max 7.276 22.696

where u t

denotes random error which represents increasing since 1984 with an exception of 1989. It seems that Chinas domestic tourism and economic growth keep an appropriately same trend of evolution on the whole. However, further study is necessary to examine whether there is a long-term and stable equilibrium relationship (or co-integration relationship) between the development of Chinas domestic tourism and Chinas economic growth. B. Unit Root Test The co-integration relationship between variables is
4.5 4 3.5 3 2.5 2 1.5 1 0.5 0 1984 1987 1990 1993 1996 1999 2002 2005 2008 2011 lnGDP lnDT A

omitted factors left out by the deterministic part of the model; , are coefficient. Null hypothesis that

H 0 : 1 = 2 = = j = 0 ( j = 1, 2, n ) means that X
is not Y s Granger cause. If we can not refuse the null hypothesis, then
,

y t =a+ i y t j + u t .
j =1

(6)

Let RSS1 and RSS2 denote residual sum of squares in model (5) and model (6). Thus, the ratio
F = ( RSS 2 RSS 1) / n RSS 1 /(T m n 1)

(7)

has an F distribution with n and T m n 1 degrees of freedom. Where T denotes sample size; m, n is the lagged length of Y and X, they are both determined on the rule of AIC (Akaike Information Criterion) or SC (Schwarz Criterion). IV. EMPIRICAL RESULTS A. Variables Definition and Data Specification In this study, we employ the following two indexes to measure the development of Chinas domestic tourism industry and Chinas economic growth. (1) The sign GDP denotes Chinas GDP (the unit is 100 billion Yuan), which is used for reflecting the aggregate macro-economy, and its change reflects economic growth. The data of Chinas GDP is adjusted by constant prices (1978=100) to eliminate the effect of inflation. (2) The sign DTA denotes Chinas domestic tourist arrivals (the unit is 100 million person times), which is considered as a proxy variable of the development of Chinas domestic tourism. Because the logarithmic transformation does not influence the co-integration relationship between the variables, Chinas GDP and domestic tourism arrivals are both transformed into natural logarithm form to avoid the obvious problems of heteroscedasticity. The signs lnGDP and lnDTA respectively denote Chinas GDP and domestic tourist arrivals after the transformation of natural logarithm. Because the data of Chinas domestic tourist arrivals before 1984 is unavailable, this paper covers the sample period from 1984 to 2009. The dataset are collected from The Yearbook of China Statistics and The Yearbook of China Tourism Statistics. The results of descriptive statistical analysis on lnGDP and lnDTA are reported in tab.2. The scatter diagram (fig.1) of lnGDP and lnDTA describe directly the relationship between Chinas domestic tourism and economic growth. We can find out that both domestic tourist arrivals and GDP has been
2012 ACADEMY PUBLISHER

Figure 1. Evolution trend of lnGDP and lnDTA

B. Unit Root Test

Co-integration analysis is based on the series having the same order of integration, so we first test the stationarity of the two series lnGDP and lnDTA by the unit root test. This paper tests the stationarity of lnGDP and lnDTA, as well as their orders of integration, by the ADF (Augmented Dickey-Fuller) test, with the lag order determined by the AIC rule. We employ the GPE2 package to perform the ADF test on the two series (the results are reported in tab.3). The results show that lnGDP and lnDTA are both non-stationary, because both ADF values exceed the critical value at the 10 percent significance level. Moreover, each ADF value of their first-order differences is less than the critical value at the 5 percent significance level, which shows that the first-order differences are both stationary; that is, lnGDP and lnDTA are both integrated of order one, lnGDP, lnDTA ~ I(1). Thus, we can perform co-integration analysis on the relationship between China's domestic tourism and economic growth.

C. Co-integration Test

The results of the unit root test indicate that lnGDP and lnDTA are both first-order integrated series. We now examine whether there is a co-integration relationship between China's domestic tourism and economic growth. We employ the Engle-Granger two-step test to examine whether there is


TABLE III. RESULTS OF UNIT ROOT TEST ON VARIABLES (ADF TEST)

Variable   Test type (c, t, p)   ADF value   1% critical   5% critical   10% critical
lnGDP      (c, t, 1)             -3.074      -4.394        -3.612        -3.243
lnDTA      (c, t, 1)             -2.501      -4.394        -3.612        -3.243
ΔlnGDP     (c, 0, 1)             -3.009      -3.753        -2.998        -2.639
ΔlnDTA     (c, 0, 0)             -4.289      -3.738        -2.992        -2.636

Note: (1) In the test type (c, t, p), c denotes the drift term, t the time trend, and p the lag length; (2) Δ denotes the first difference operator.

a co-integration relationship between lnGDP and lnDTA [18]. The first step is an OLS regression of lnGDP on lnDTA:

lnGDP_t = 7.945 + 1.041 lnDTA_t.    (8)
          (153.626) (38.021)
R² = 0.984, DW = 0.726

DW = 0.726 indicates first-order autocorrelation in model (8). By introducing lagged terms into model (8), we obtain the dynamic distributed lag model (9):

lnGDP_t = -0.455 + 0.244 lnDTA_t - 0.295 lnDTA_{t-1} + 1.063 lnGDP_{t-1}.    (9)
          (-0.768)  (3.390)        (-3.965)             (14.194)
R² = 0.998, DW = 1.567, SSE = 0.029, LM(1) = 0.752, LM(2) = 5.972, ARCH(1) = 1.723

The LM test for serial correlation shows that there is no autocorrelation in model (9), and the ARCH test indicates no heteroscedasticity. Thus model (9) can be considered the long-term, stable equilibrium (co-integration) relationship between China's domestic tourism and economic growth.

The second step is a unit root test on the residual series e_t of model (9) to test whether e_t is stationary; the results are reported in tab.4. The ADF value is less than the critical value at the 1 percent significance level, so the null hypothesis that e_t contains a unit root is rejected. The residual series e_t is therefore stationary, e_t ~ I(0), and consequently lnGDP and lnDTA are (1,1) co-integrated. Model (9) is indeed the long-term, stable equilibrium relationship between China's domestic tourism and economic growth. The long-term elasticity of lnGDP with respect to lnDTA is 0.810, obtained from (0.244 - 0.295)/(1 - 1.063), which indicates that China's GDP will increase 0.810 percent if China's domestic tourist arrivals increase one percent in the long term.

TABLE IV. RESULTS OF UNIT ROOT TEST ON RESIDUAL SERIES (ADF TEST)

Variable   Test type (c, t, p)   ADF value   1% critical value
e_t        (0, 0, 0)             -4.029      -2.665

D. Error Correction Model

The error correction model (ECM) is an econometric model with a specific form. The general form of the ECM was put forward by Davidson, Hendry, Srba and Yeo in 1978 and is therefore also called the DHSY model [23]. If two variables are co-integrated, the short-term disequilibrium relationship between them can be represented by an ECM (Engle & Granger, 1987). Employing the OLS method, we obtain the following ECM, model (10), to examine the short-term disequilibrium relationship between China's domestic tourism and economic growth:

ΔlnGDP_t = 0.229 ΔlnDTA_t - 0.280 ΔlnDTA_{t-1} + 1.500 ΔlnGDP_{t-1} - 0.436 ΔlnGDP_{t-2} - 1.025 ecm_{t-1}.    (10)
           (4.289)          (-3.767)             (7.754)              (-3.138)             (-4.313)
R² = 0.737, DW = 1.969, SSE = 0.023, LM(1) = 0.540, ARCH(1) = 0.163

where ecm_t (the error correction term) is given by:

ecm_t = lnGDP_t - 0.244 lnDTA_t + 0.295 lnDTA_{t-1} - 1.063 lnGDP_{t-1} + 0.455.    (11)

The relevant statistics indicate that the error correction model passes the significance tests. The ECM reveals how the equilibrium error impacts GDP in the short term. The coefficient of the ecm term is -1.025 (less than zero), in accordance with the reverse correction mechanism. The short-term elasticity of lnGDP with respect to lnDTA is 0.229, which indicates that China's GDP will increase 0.229 percent if China's domestic tourist arrivals increase one percent in the short term.

E. Granger Causality Test

The results of the co-integration test show that there is a long-term, stable equilibrium relationship between China's domestic tourism and economic growth. The existence of this long-term relationship signifies that the two variables are causally related in at least one direction. However, does China's domestic tourism development cause economic growth, or vice versa? We test for causality between China's domestic tourism and economic growth with the Granger causality test; the results are reported in tab.5.
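The Engle-Granger two-step procedure used above can be sketched in a few lines of NumPy. This is an illustration on synthetic data standing in for lnDTA and lnGDP, not the paper's dataset; the series, coefficients, and helper names are our own assumptions:

```python
import numpy as np

def ols(y, X):
    """OLS with intercept; returns coefficients and residuals."""
    Z = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta, y - Z @ beta

# Synthetic cointegrated pair: an I(1) regressor plus stationary noise.
rng = np.random.default_rng(1)
ln_dta = np.cumsum(rng.normal(size=300))            # random walk, I(1)
ln_gdp = 1.0 + 1.04 * ln_dta + rng.normal(scale=0.3, size=300)

# Step 1: static OLS regression, as in model (8).
beta, e = ols(ln_gdp, ln_dta)

# Step 2: ADF-style regression of the residuals, Δe_t = ρ e_{t-1} + u_t.
de, e_lag = np.diff(e), e[:-1]
rho = (e_lag @ de) / (e_lag @ e_lag)
se = np.sqrt(((de - rho * e_lag) ** 2).sum() / (len(de) - 1) / (e_lag @ e_lag))
t_stat = rho / se
print(t_stat)   # strongly negative -> residuals stationary -> cointegration
```

A t statistic well below the relevant critical value (e.g. -2.665 at 1 percent in tab.4) rejects a unit root in the residuals, which is exactly the second step above.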


TABLE V. RESULTS OF GRANGER CAUSALITY TEST (LAGS = 1)

Null hypothesis                        F-statistic   Probability
lnDTA does not Granger-cause lnGDP     5.206         0.033
lnGDP does not Granger-cause lnDTA     7.942         0.010

The null hypothesis that lnDTA does not Granger-cause lnGDP can be rejected at the 5 percent significance level. The null hypothesis that lnGDP does not Granger-cause lnDTA can be rejected at the 1 percent significance level. These results indicate a bidirectional Granger causality between the development of China's domestic tourism and economic growth: the development of China's domestic tourism is a Granger cause of economic growth, and China's economic growth is also a Granger cause of the development of China's domestic tourism. That is to say, the development of China's domestic tourism can pull China's economic growth, and China's economic growth can promote the development of China's domestic tourism.

V. CONCLUDING REMARKS

The main objective of this study is to investigate the real relationships between China's domestic tourism and economic growth. Employing co-integration theory and the Granger causality test, this paper arrives at three conclusions.

First, there is a long-term, stable equilibrium (co-integration) relationship between the development of China's domestic tourism and economic growth: China's GDP will increase 0.810 percent if China's domestic tourist arrivals increase one percent in the long term.

Second, there is a short-term disequilibrium relationship between the development of China's domestic tourism and economic growth: China's GDP will increase 0.229 percent if China's domestic tourist arrivals increase one percent in the short term. From the ECM model, we find an adjustment mechanism from the short term to the long term in this relationship.

Third, there is a bidirectional Granger causality between the development of China's domestic tourism and economic growth. The development of China's domestic tourism has significantly contributed to China's economic growth.
Meanwhile, China's economic growth has evidently promoted the development of China's domestic tourism.

The policy implication that may be drawn from this study is that China can improve its economic growth performance not only by investing in the traditional sources of growth, such as physical and human capital and trade, but also by strategically harnessing the contribution of the tourism industry and improving its governance performance. Over the past decades, many developing and developed countries have considered tourism an option for the sustainable development of their nations. Tourism has emerged from a relatively small-scale activity into one of the largest industries in the world and one of the fastest-growing sectors of the world economy from the 1960s onwards. The importance of tourism as a contributor to economic growth is so widely accepted that, year after year throughout the world, massive investment continues to pour into its development.

At present, China's domestic tourism market has become the largest in the world, and China's domestic tourism has entered a popular stage. The total revenue of China's tourism industry was about 1.57 trillion Yuan in 2010; of this, the revenue from the domestic tourism industry was 1.26 trillion Yuan, accounting for 80.25% of the total. The development of China's domestic tourism industry can increase China's domestic demand, promote the development of related industries, drive the adjustment of the industrial structure, and promote the transformation of the economic growth mode. China's tourism industry has played an important role in maintaining long-term economic growth since reform and opening-up, and in expanding domestic demand and adjusting the industrial structure since the international financial crisis. The prosperity of China's domestic tourism industry has laid a stable groundwork for the tourism industry to become a growth point of China's economy. Likewise, the sustained and stable growth of China's economy can provide a large amount of capital for tourism infrastructure construction, in favor of the development of China's domestic tourism. With the continuous, rapid and stable development of China's economy, residents' incomes rising steadily, residents' leisure time increasing step by step, and popular and diversifying demand for tourism products, a favorable opportunity is provided for the development of China's domestic tourism industry. The sustained and healthy development of China's economy will keep driving the development of China's domestic tourism industry.
ACKNOWLEDGMENT

This work was supported in part by the sustentation fund for young college teachers in Anhui Province, P.R. China (Grant No. 2008jqw059zd). We would like to thank Zhou Mo and Wang Yongpei (both economics Ph.D. candidates at Renmin University of China, Beijing, P.R. China) for helpful remarks and suggestions on the original edition of this paper. We also thank the editors and anonymous referees of this journal for constructive comments on the manuscript. Errors and omissions, if any, are strictly our own.

REFERENCES
[1] C-O. Oh, "The contribution of tourism development to economic growth in the Korean economy," Tourism Management, vol. 26, no. 1, pp. 39-44, February 2005.
[2] J. Balaguer and C-J. Manuel, "Tourism as a long-run economic growth factor: the Spanish case," Applied Economics, vol. 34, no. 7, pp. 877-884, May 2002.
[3] N. Dritsakis, "Tourism as a long-run economic growth factor: an empirical investigation for Greece," Tourism Economics, vol. 10, no. 3, pp. 305-316, September 2004.


[4] J. G. Brida and W. A. Risso, "Tourism as a factor of long-run economic growth: an empirical analysis for Chile," European Journal of Tourism Research, vol. 2, no. 2, pp. 178-185, October 2009.
[5] J. G. Brida, A. Barquet, and W. A. Risso, "Causality between economic growth and tourism expansion: empirical evidence from Trentino-Alto Adige," An International Multidisciplinary Journal of Tourism, vol. 5, no. 2, pp. 87-98, Autumn 2010.
[6] F. M. M. Kreishan, "Tourism and economic growth: the case of Jordan," European Journal of Social Sciences, vol. 15, no. 2, pp. 229-234, August 2010.
[7] B. Fayissa, C. Nsiah, and B. Tadasse, "The impact of tourism on economic growth and development in Africa," Middle Tennessee State University, Department of Economics and Finance, Working Papers, no. 200716, 2007.
[8] C-C. Lee and C-P. Chang, "Tourism development and economic growth: a closer look at panels," Tourism Management, vol. 29, no. 1, pp. 180-192, February 2008.
[9] B. Fayissa, C. Nsiah, and B. Tadasse, "Tourism and economic growth in Latin American countries (LAC): further empirical evidence," Middle Tennessee State University, Department of Economics and Finance, Working Papers, no. 200716, 2009.
[10] Guoxin Wu, "Analysis of coherency between tourism development and economic growth in China," Journal of Shanghai Institute of Technology, vol. 3, no. 4, pp. 238-241, December 2003.
[11] Zhiyong Yang, "An empirical analysis on the interaction effect between tourism consumption and economic growth," Journal of Inner Mongolia Finance and Economics College, vol. 4, no. 2, pp. 27-30, April 2006.
[12] Youlong Chen, Peilin Liu, and Chaojun Xu, "Research on causality between development of tourism and economic growth in China," Journal of Hengyang Normal University, vol. 27, no. 1, pp. 93-97, February 2006.
[13] Siwei Liu and Zhongcai Wu, "An empirical study on tourism and economic growth of China," Systems Engineering, vol. 25, no. 9, pp. 60-64, September 2007.
[14] Chunyou Wu, F. Y. Xie, and Hua Quan, "Contribution of tourism development to China's economic growth," Science-Technology and Management, vol. 11, no. 6, pp. 8-10, November 2009.
[15] Lifeng Zhang and Binde Liu, "Analysis on influence of tourism consumption on economic growth in China," Technology Economics, vol. 28, no. 5, pp. 81-85, May 2009.
[16] Yinhui Liu and Suo Hao, "Comparative analysis on the effect of domestic tourism and inbound tourism on economic growth in China," Statistics and Decision, vol. 25, no. 14, pp. 120-122, July 2009.
[17] Lei Zhao and Hua Quan, "An empirical study on relation between domestic tourism consumption and economic growth in China," On Economic Problems, vol. 33, no. 4, pp. 32-38, April 2011.
[18] C. W. J. Granger and P. Newbold, "Spurious regressions in econometrics," Journal of Econometrics, vol. 2, no. 2, pp. 111-120, July 1974.
[19] Jingshui Sun, The Tutorial to Econometrics, 2nd ed. Beijing, China: Tsinghua University Press, 2009.
[20] D. A. Dickey and W. A. Fuller, "Distribution of the estimators for autoregressive time series with a unit root," Journal of the American Statistical Association, vol. 74, no. 366, pp. 427-431, June 1979.

[21] D. A. Dickey and W. A. Fuller, "Likelihood ratio statistics for autoregressive time series with a unit root," Econometrica, vol. 49, no. 4, pp. 1057-1072, June 1981.
[22] Fumio Hayashi, Econometrics. Shanghai, China: Shanghai University of Finance and Economics Press, 2005.
[23] Tiemei Gao, Methods of Econometric Analysis and Constructing Model, 2nd ed. Beijing, China: Tsinghua University Press, 2009.

Wang Liangju is an economics Ph.D. candidate at Renmin University of China, Beijing, P.R. China. He earned a Master's degree in management at Anhui University of Finance and Economics, Bengbu, P.R. China, in January 2005. His main fields of study are market theory and industry policy, the effect of tourism on the economy, and New Empirical Industrial Organization (NEIO). He has worked in the School of Business Administration, Anhui University of Finance and Economics, as a lecturer since November 2006. He teaches courses such as tourism planning and development, service management, and tourism marketing planning. Currently, his research interests are agglomeration economy and economic growth, and the effect of tourism on the economy.

Zhang Huihui earned a Bachelor's degree in management at Anhui University of Finance and Economics, Bengbu, P.R. China, in June 2004. Her field of study is the theory and practice of accounting. She is now a graduate student in the School of Accounting, Anhui University of Finance and Economics, majoring in accounting. Her research interest is empirical accounting.

Li Wanlian earned a Doctor's degree in ecology at East China Normal University, Shanghai, P.R. China, in June 2005. Her field of study is the economic effects of tourism. She has worked in the School of Business Administration, Anhui University of Finance and Economics, as an associate professor since September 2008. She is a tutor for graduate students of tourism management and teaches courses in tourism planning and development and tourism psychology. Currently, her research interests are the effect of tourism on the economy and tourist behavior.



Multi-robot Task Allocation Based on Ant Colony Algorithm


Jian-Ping WANG
School of Information Engineering, Henan Institute of Science and Technology, Xinxiang, Henan, 453003, China
E-mail: xunji2002@163.com

Yuesheng Gu and Xiao-Min LI


School of Information Engineering and School of Technique and Electricity, Henan Institute of Science and Technology, Xinxiang, Henan, 453003, China
E-mail: hz34567@126.com, lxm0707@163.com

Abstract—With the development of information technology, the capabilities and application fields of robots have become wider. In order to complete a complex task, the cooperation and coordination of multiple robots need to be adopted. As the main problem of multi-robot systems, multi-robot task allocation (MRTA) reflects the organization form and operation mechanism of the robot system. Cooperation and allocation for large-scale multi-robot systems in loosely coupled environments is a hot issue. As a popular bionic intelligence method, the ant colony algorithm is powerful for solving MRTA. By analyzing the existing algorithms, this paper proposes a new solution for MRTA based on the ant colony algorithm, builds up the model of the algorithm, and describes the robot coalition and the high-level task allocation process in detail. Finally, we implement a simulation of the ant colony algorithm in MATLAB and compare the robustness and best incomes of four algorithms. The simulation results show that the ant colony algorithm solves MRTA with high capability and stability.

Index Terms—ant colony algorithm, multi-robot task allocation, robot coalition formation, multi-robot systems, MATLAB

I. INTRODUCTION

A. Background

Along with the development of robotics, multi-robot coordination has received more and more attention. Multi-robot systems can provide several advantages over single-robot systems: robustness, flexibility and efficiency, among others. To benefit from these potential aspects, the robots must cooperate to carry out a common mission. In particular, multi-robot task allocation (MRTA) has recently risen to prominence and become a key research topic in its own right. The general idea of multi-robot systems is that teams of robots, deployed to achieve a common project, are not only able to perform tasks that a single robot cannot, but can also outperform collections of individual robots in terms of efficiency and quality. MRTA is the basic problem of multi-robot systems: it means optimizing the task allocation scheme in order to improve the operation efficiency of the system. As tasks and robots increase, task allocation becomes more difficult.

MRTA is a typical combinatorial optimization problem [1]. The formulation of MRTA, with multiple robots of different types taking up a large number of tasks, involves several parameters that make it NP-hard. At present, some bionic algorithms have been proposed for solving NP-hard problems. In this paper, we propose a new methodology for MRTA based on the ant colony algorithm, build up the model of the algorithm, and describe the robot coalition and the high-level task allocation process in detail. Finally, we implement a simulation of the ant colony algorithm in MATLAB and compare the robustness and best incomes of four algorithms. The simulation results show that the ant colony algorithm solves MRTA with high capability and stability.

B. Overview of MRTA

A clear definition of MRTA was proposed by Gerkey [2]. In his article, MRTA is defined as follows: given m robots, each capable of executing one task, and n possibly weighted tasks, each requiring one robot, and given for each robot a nonnegative efficiency rating estimating its performance for each task (if a robot is incapable of executing a task, it is assigned a rating of zero for that task), the goal is to assign robots to tasks so as to maximize overall expected performance, taking into account the priorities of the tasks and the efficiency ratings of the robots.

At present, the following methods have been proposed for solving MRTA:

(1) Market-based approaches. The methods based on the market mechanism are the most popular way of solving MRTA, such as first-price auctions, dynamic role assignment, trade robots, Murdoch, Demircf [3], M+ and so on. In these approaches, each distributed agent computes a cost for completing a task and broadcasts a bid for that task. The auctioneer robot decides the best available bid, and the winning bidder attempts to perform the task won. They effectively meet the practical demands of robot teams, while producing efficient solutions by capturing
2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2160-2167


the respective strengths of both distributed and centralized approaches. First, they can distribute much of the planning and execution over the team and thereby retain the benefits of distributed approaches, including robustness, flexibility, and speed. They also have elements of centralized systems to produce better solutions: auctions concisely gather information about the team and distribute resources in a team-aware context. Because of their good extensibility, these methods are particularly suitable for the distributed robotics area. Theoretically, they can ensure optimal task allocation. However, if the communication costs are too high in the task allocation process, or if there are failures in robot communication, the performance degrades noticeably [4], so these methods fit small and medium-scale task allocation. M+ was proposed by Botelho and Alami in Ref. [5]; the algorithm uses a task allocation protocol based on the Contract Net protocol with formalized capabilities and task costs. The need to pre-define the capabilities and costs limits the applicability of the M+ algorithm to domains where these are known.

(2) Behavior-based approaches. These approaches rely on the behaviors of robots, like ALLIANCE, BLE and ASyMTRe. ALLIANCE [6] is an architecture that has been proposed for fault-tolerant instantaneous allocation and integrates impatience and acquiescence into each robot. ALLIANCE uses motivational behaviors such as robot impatience and robot acquiescence, so that robots perform tasks that cannot be done by other robots and give up tasks they cannot perform efficiently. BLE [7] stands for the broadcast of local eligibility technique. It is another behavior-based architecture, which uses cross-inhibition of behaviors between robots. It is based on a calculated task eligibility measure which robots compute individually and broadcast to the team. ASyMTRe [8] is a behavior-based architecture.
It is based on mapping environmental, perceptual, and motor control schemas to the required flow of information through the multi-robot system, automatically reconfiguring the connections of schemas within and across robots to synthesize valid and efficient multi-robot behaviors for accomplishing the team objectives. These approaches are stronger in real-time capability, fault tolerance and robustness, but still yield only partially optimal solutions for MRTA.

(3) Approaches based on linear programming. Gerkey and Mataric regard MRTA as a 0-1 linear programming problem [9]. In this approach, they find n² nonnegative integers α_ij in order to maximize

the overall utility. The formula is defined as (1):

max Σ_{i=1}^{n} Σ_{j=1}^{n} α_ij U_ij    (1)

where α_ij indicates whether robot i is assigned to task j and U_ij is the corresponding utility. In (1), the conditions satisfy (2):

Σ_{i=1}^{n} α_ij = 1, 1 ≤ j ≤ n
Σ_{j=1}^{n} α_ij = 1, 1 ≤ i ≤ n    (2)

The methods based on linear programming can handle only MRTA with single-robot tasks and single-task robots; they cannot handle a task that needs multiple robots to cooperate. Early on, the main methods for solving linear programs were the simplex method and the Hungarian method. These two methods are essentially matrix computations, so as tasks and robots increase, the computational complexity grows exponentially. Some mixed integer linear programming methods for MRTA can find the optimal solution successfully, but usually need to collect the information of all the tasks and robots; the extensibility and efficiency of these methods are weak.

(4) Approaches based on swarm intelligence. These approaches simulate the behaviors of insects to assign robot tasks. Swarm intelligence methods [10], including the threshold value method and the ant colony algorithm, are mainly used for robot systems in unknown environments. Because group cooperation among individuals is distributed, the failure of a few individuals cannot affect the solving of the entire task. Swarm intelligence methods have high robustness and scalability and are very suitable for distributed multi-robot systems. In this paper, we propose a solution for MRTA based on the ant colony algorithm.

II. ANT COLONY ALGORITHM

In the natural world, ants run randomly around their colony to search for food. Ants deposit a chemical substance called pheromone along the traveled paths, and other ants that find these paths tend to follow the trail. If they discover a food source, they return to the nest depositing pheromone along with the previous trail, strengthening the preceding path. Over time, however, the pheromone trail evaporates, reducing its attractive strength: the more time it takes for an ant to travel down the path and back again, the more time the pheromones have to evaporate. Pheromone evaporation also has the advantage of avoiding convergence to a locally optimal solution; if there were no evaporation at all, the paths chosen by the first ants would be excessively attractive to the following ones, and the exploration of the solution space would be constrained. Thus, when one ant finds a path from the colony to a food source, other ants are more likely to follow that path, and positive feedback eventually leads all the ants to follow a single path. The idea of the ant colony algorithm is to mimic this behavior with simulated ants walking around a graph representing the problem to solve. The ant colony algorithm is the result of research on computational intelligence approaches to combinatorial optimization originally conducted by Dr. Marco Dorigo, in collaboration with
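A brute-force sketch of the 0-1 assignment problem in (1)-(2) follows. The utility matrix and function are illustrative stand-ins, and enumeration is exponential in n; practical systems use the Hungarian method or mixed-integer solvers instead:

```python
import itertools
import numpy as np

def optimal_assignment(U):
    """Maximize sum of U[i, j] over one-to-one robot-task assignments.

    The permutation encodes alpha_ij: robot i is assigned task perm[i],
    which automatically satisfies the row/column constraints in (2).
    """
    n = U.shape[0]
    best_perm, best_val = None, float("-inf")
    for perm in itertools.permutations(range(n)):
        val = float(sum(U[i, perm[i]] for i in range(n)))
        if val > best_val:
            best_perm, best_val = perm, val
    return best_perm, best_val

# Hypothetical 3x3 utility matrix: U[i, j] = utility of robot i on task j.
U = np.array([[9.0, 2.0, 3.0],
              [4.0, 8.0, 1.0],
              [2.0, 3.0, 7.0]])
print(optimal_assignment(U))  # -> ((0, 1, 2), 24.0)
```

Here the diagonal assignment dominates every alternative, so each robot keeps the task it is best at.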



Alberto Colorni and Vittorio Maniezzo [11]. The first algorithm aimed to search for an optimal path in a graph, based on the behavior of ants seeking a path between their colony and a food source. In operations research, the ant colony algorithm is a probabilistic technique for solving computational problems that can be reduced to finding good paths through graphs. It belongs to the family of ant colony algorithms within swarm intelligence methods and constitutes a metaheuristic optimization. It is a bionic algorithm: it simulates ants' foraging behavior in nature, in which the ants find the optimal path from the nest to the food source through the exchange of information and collaboration between individuals. At present, the ant colony algorithm is mainly applied to the TSP, the quadratic assignment problem, job shop scheduling and network routing. It is also applied in pattern recognition, and many achievements have been made up to now.

A. The Basic Principles of the Ant Colony Algorithm

The basic principles are as follows. Ants communicate and cooperate with each other by releasing pheromones [12]; during movement, ants choose their path according to the pheromone concentration and release their own pheromones. The denser the pheromone concentration of a path, the more likely ants are to choose it. Therefore, the pheromone concentration will be greater on paths that ants usually travel. As time goes by, pheromone evaporates, and the concentration gets smaller and smaller on paths that ants seldom choose. When the number of ants is large, positive pheromone feedback appears, until the ants find the shortest path from the nest to the food source. The execution flow of the ant colony algorithm is described in Fig. 1. When the algorithm starts, it initializes the ant colony, and each ant individually builds up its search path.

When all the ants have finished building their search paths, the system sorts all the paths; otherwise, the ants continue building. After that, the pheromone left on the paths is updated. If the end condition is satisfied, the algorithm terminates; if not, it continues.

B. The Mathematical Model of the Ant Colony Algorithm

The ant colony algorithm was first proposed for solving the TSP (traveling salesman problem), i.e., seeking an optimal route for a salesman [13]. In the ant colony algorithm, salesmen are simulated as individual ants. In the walking process, each ant calculates its state transition probability according to the amount of pheromone on the various paths. The ant system simply iterates a main loop in which m ants construct their solutions in parallel and thereafter update the trail levels. Given ant k (k = 1, 2, ..., m), let p_ij^k(t) be the state transition probability from city i to city j at moment t, described as (3):

p_ij^k(t) = [τ_ij(t)]^α [η_ij(t)]^β / Σ_{s ∈ allowed_k} [τ_is(t)]^α [η_is(t)]^β, if j ∈ allowed_k; otherwise p_ij^k(t) = 0.    (3)

In (3), α is the pheromone heuristic factor, reflecting the relative importance of the trail, and β is the expectation heuristic factor, reflecting the relative importance of visibility. η_ij(t) is the heuristic function, described as (4):

η_ij(t) = 1 / d_ij    (4)

In (4), d_ij is the distance from city i to city j, so η_ij(t) expresses the expectation of an ant moving from city i to city j.
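Equations (3)-(4) can be sketched directly. This is a minimal NumPy illustration with made-up pheromone and distance values; the function and parameter names are our own:

```python
import numpy as np

def transition_probs(tau, d, current, allowed, alpha=1.0, beta=2.0):
    """Eq. (3): probabilities of moving from `current` to each allowed city.

    tau: pheromone matrix; d: distance matrix; eta = 1/d as in eq. (4).
    """
    eta = 1.0 / d[current, allowed]
    weights = tau[current, allowed] ** alpha * eta ** beta
    return weights / weights.sum()

# Tiny example: standing at city 0, cities 1 and 2 are still allowed.
tau = np.ones((3, 3))                  # uniform pheromone everywhere
d = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])
p = transition_probs(tau, d, current=0, allowed=[1, 2])
print(p)   # nearer city 1 gets probability 0.8, farther city 2 gets 0.2
```

With uniform pheromone the choice is driven entirely by visibility η = 1/d, raised to the power β, which is why the closer city is four times as likely.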
After a moment, all the ants complete a cycle. The pheromone left on each path will be adjusted as (5).
After a moment, all the ants complete a cycle, and the pheromone left on each path is adjusted as (5):

τ_ij(t + n) = (1 - ρ) τ_ij(t) + Δτ_ij(t)
Δτ_ij(t) = Σ_{k=1}^{m} Δτ_ij^k(t)    (5)

In (5), ρ is the pheromone evaporation factor and 1 - ρ the trail attenuation coefficient; Δτ_ij(t) is the pheromone increment left on the path from city i to city j, and Δτ_ij^k(t) is the amount of pheromone ant k leaves on the path from city i to city j.

According to the pheromone updating strategy, M. Dorigo proposed three different ant colony algorithm models: Ant-Cycle, Ant-Quantity and Ant-Density. The difference between the three algorithms lies in Δτ_ij^k(t): Ant-Quantity and Ant-Density use local information, while Ant-Cycle uses the
2012 ACADEMY PUBLISHER

Figure 1. Execution Flow of Ant Colony Algorithm

JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012


global information. In the Ant-Quantity and Ant-Density models, each ant lays its trail at every step, without waiting for the end of the tour. In the Ant-Density model, a quantity Q of trail is left on edge (i, j) every time an ant goes from i to j; therefore, Δτ_ij^k(t) is given by (6):

Δτ_ij^k(t) = { Q ,  if Ant_k traverses edge (i, j)
             { 0 ,  otherwise                       (6)

In the Ant-Quantity model, an ant going from i to j leaves a quantity Q/d_ij of trail on edge (i, j) each time; so, in the Ant-Quantity model, Δτ_ij^k(t) is given by (7):

Δτ_ij^k(t) = { Q / d_ij ,  if Ant_k traverses edge (i, j)
             { 0 ,         otherwise                       (7)

Figure 2. Hierarchy Architecture of System
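The trail update of (5), with the per-ant increments of (6) (Ant-Density) or (7) (Ant-Quantity), can be sketched as follows (the matrix layout and function name are illustrative assumptions):

```python
def update_pheromone(tau, tours, dist, rho=0.7, Q=1.0, model="ant-quantity"):
    """Eq. (5): tau_ij(t+n) = (1 - rho) * tau_ij(t) + sum_k dtau_ij^k(t).
    Each tour is the node sequence walked by one ant; the per-edge
    increment follows Eq. (6) (ant-density) or Eq. (7) (ant-quantity)."""
    n = len(tau)
    delta = [[0.0] * n for _ in range(n)]
    for tour in tours:
        for i, j in zip(tour, tour[1:]):   # consecutive edges of the tour
            delta[i][j] += Q if model == "ant-density" else Q / dist[i][j]
    for i in range(n):
        for j in range(n):
            tau[i][j] = (1 - rho) * tau[i][j] + delta[i][j]
    return tau
```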

III. SOLUTION OF MRTA BASED ON ANT COLONY ALGORITHM

A. Model of MRTA

Several models of multi-robot coordination have appeared in the literature. Ref. [14] proposed a formalism of information invariants that captures the information requirements of a coordination algorithm and provides a mechanism to perform reductions between algorithms. Ref. [15] developed a prescriptive control-theoretic model of multi-robot coordination and showed that it can produce precise multi-robot box-pushing. MRTA based on the ant colony algorithm is abstracted as the hierarchy model shown in Fig. 2. The low level is the task group composed of disparate ant individuals, each task being undertaken by different ants. The high level is the task assignment group formed by the task groups. At the low level, for a multi-robot task coalition, the robot coalition is formed by the ant colony algorithm [16]; at the high level, tasks are also assigned by the ant colony algorithm. The high-level ant colony algorithm aims at optimal task allocation: ants represent tasks and choose a holder for each task. The low level forms robot coalitions to generate solutions for tightly coupled tasks. The overall workflow of the system is as follows: at first the ants stand at the high level and, starting from the first task, choose a proper holder; when an ant encounters a tightly coupled task, it moves to the low level and calls the corresponding coalition formation algorithm, and the task is then accomplished cooperatively.
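The two-level workflow just described can be sketched as a dispatch skeleton (the task and robot representations and the two callbacks are placeholders standing in for the paper's low- and high-level ant colony searches):

```python
def allocate(tasks, robots, form_coalition, choose_holder):
    """Two-level dispatch: loose tasks get a single holder at the high
    level; tightly coupled tasks drop to the low level, where a robot
    coalition is formed. Both callbacks are assumed interfaces."""
    plan = {}
    for task in tasks:
        if task.get("tight_coupling"):
            plan[task["id"]] = form_coalition(task, robots)   # low level
        else:
            plan[task["id"]] = choose_holder(task, robots)    # high level
    return plan
```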

B. Coalition of Robots

In this paper, the ant colony algorithm is informally defined on a multi-robot system inspired by the observation of real ant colony behavior exploiting stigmergy. Tasks are seen as food, the robots are seen as individual ants, and coordination among ants is achieved through the stigmergic communication mechanism. Given m tasks assigned randomly over n robots, for Ant_k the probability of choosing Robot_j is (8).
p_ij^k = [τ_ij(t)]^α · [1/d_ij]^β / Σ_{u ∈ J_k} [τ_iu(t)]^α · [1/d_iu]^β ,    j ∈ J_k    (8)

In (8), J_k is the set of robots that Ant_k has not yet selected; τ_ij(t) is the residual pheromone on connection (i, j) at time t; d_ij (i, j = 1, 2, ..., n) is the distance from Robot_i to Robot_j, i.e. the communication cost. The two parameters α and β weight the intensity of accumulated pheromone on the path and the communication cost, respectively. After selecting a robot, if the ant finds that the present coalition can complete the task, it stops path searching. When all ants have completed a solution, one cycle is finished. The coalition that obtains the maximal income in the cycle is taken as the present optimal solution, and the pheromone intensity is updated as in (9) and (10).
τ_ij(t + 1) = ρ · τ_ij(t) + Σ_{k=1..m} Δτ_ij^k    (9)

Δτ_ij^k = { Inc(C_k) ,  if Robot_j ∈ Coalition C_k
          { 0 ,         otherwise                   (10)

In (10), Inc(C_k) is the income of the coalition formed by Ant_k. In the algorithm, for the parameters


α, β and ρ, experimental methods are used to determine the optimal combination; a fixed number of evolution generations is used as the stop condition, or the calculation is stopped when the evolutionary trend is no longer obvious. To estimate the complexity of the algorithm for MRTA, we take a qualitative-quantitative approach. For a project composed of m tasks and n robots distributed by the ant colony algorithm, with T the number of iterations, the time complexity [17] of the algorithm is described by (11):
T(o) = O(f(n)) = m(m − 1) · n · T / 2    (11)

Δτ_ij^k = { Q / Cost_kj ,  if Ant_k assigns Task_i to j
          { 0 ,            otherwise                     (14)

The algorithm flow is as follows. Given t = 0, NC = 0, τ_ij(0) = 0, Δτ_ij^k = 0, NumTask = s, NumAnt = m, NumRobot = n, the capacity needs of each task, and the abilities of each robot with the corresponding ability costs:

STEP 1 for i = 1 to s, for k = 1 to m do: Ant_k starts from the first task and determines whether the current Task_i is a tightly coupled task; if it is, go to STEP 6. If not, Ant_k chooses a task holder from J_i according to the probability p_ij^k of (13) and calculates

Given m < n and T = k · n, as n → ∞, T(o) is described by (12):

T(o) = O(n^4)    (12)

We can see that, as the number of tasks and robots increases, the time complexity of the algorithm becomes higher.
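The coalition-level pheromone reinforcement of (9) and (10) in Section III-B can be sketched as follows (the edge-keyed pheromone table is an assumed data layout, not the paper's):

```python
def reinforce_coalition(tau, coalitions, incomes, rho=0.7):
    """Eqs. (9)-(10): tau_ij(t+1) = rho * tau_ij(t) + sum_k dtau_ij^k,
    where dtau_ij^k equals the coalition income Inc(C_k) on the edges
    selected by ant k, and 0 elsewhere."""
    delta = {}
    for edges, inc in zip(coalitions, incomes):
        for edge in edges:                     # edges (i, j) walked by ant k
            delta[edge] = delta.get(edge, 0.0) + inc
    for edge in tau:
        tau[edge] = rho * tau[edge] + delta.get(edge, 0.0)
    return tau
```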

C. Task Allocation

The process of task allocation is as follows. At first, all the ants stand at the high level. For the first task, the algorithm chooses suitable ants as undertakers. If the current task can be accomplished by a single robot, the system finds the optimal solution directly. If the task cannot be accomplished by a single robot, i.e. it is tightly coupled, all the ants move to the low level and suitable ants are assigned to undertake the task. In MRTA, robots are regarded as ants and the ant colony algorithm is used to solve the problem. Let the number of ants be m, each task be node 0, and the candidate robots or robot coalitions be nodes 1..n; then, for Ant_k starting from node 0, the probability of choosing node j is (13).

the income of completing the task. Then Ant_k moves to the next task and repeats the above actions until all tasks have been assigned holders.

STEP 2 Calculate the total income of the task allocation corresponding to each ant; update the maximal income and the corresponding allocation plan.

STEP 3 Update the pheromone intensity [18] τ_ij(t + 1) according to (9) and (10).

STEP 4 Set (15) as follows:
t = t + 1,    NC = NC + 1,    Δτ_ij = 0    (15)

p_ij^k = [τ_ij(t)]^α · [1/Cost_ij]^β / Σ_{u ∈ J_i} [τ_iu(t)]^α · [1/Cost_iu]^β ,    j ∈ J_i    (13)

In (13), J_i is the collection of candidate robots or robot coalitions for Task_i, and Cost_ij is the cost for a robot or robot coalition to complete Task_i. For a single robot, the cost of completing a task comprises the distance between the robot and the task and the consumption of its ability; for a robot coalition, the cost of completing a task is Cost(C, t). For Ant_k, the first task node of the to-be-assigned task list is the starting point of the optimal path. After Ant_k chooses the holder for a task, it moves to the next task and chooses another holder for it. After Ant_k has chosen holders for all tasks, one task allocation is completed. When all ants have completed their solutions, one cycle is completed. The coalition that obtains the maximal income in the cycle is taken as the present optimal solution, and the pheromone intensity is updated as in (14).

STEP 5 If NC < NC_max [19], go to STEP 1; otherwise output the task allocation plan that obtains the maximal income.

STEP 6 Organize the robot coalition for the task using the ant colony algorithm, then return to STEP 1.

Through the six steps above, the high-level ant colony algorithm completes the task allocation of the whole system with the help of the low-level ant colony algorithm.
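The high-level loop of STEPs 1-5, using the cost-based selection rule of (13), can be sketched as follows (the cost-matrix layout, the parameter values, and the income-as-inverse-cost reinforcement are our simplifying assumptions):

```python
import random

def ant_colony_mrta(costs, n_ants=8, nc_max=50, alpha=1.5, beta=2.0, rho=0.7, seed=0):
    """Sketch of the high-level allocation loop. costs[i][j] is the cost
    of robot j completing task i (assumed layout). Each ant assigns every
    task a holder with the rule of Eq. (13); the cheapest plan of a cycle
    is kept, and the pheromone trail is reinforced along it."""
    rng = random.Random(seed)
    n_tasks, n_robots = len(costs), len(costs[0])
    tau = [[1.0] * n_robots for _ in range(n_tasks)]
    best_plan, best_cost = None, float("inf")
    for _ in range(nc_max):                      # STEP 5: stop after NC_max cycles
        plans = []
        for _ in range(n_ants):                  # STEP 1: each ant builds a plan
            plan = []
            for i in range(n_tasks):
                w = [(tau[i][j] ** alpha) * ((1.0 / costs[i][j]) ** beta)
                     for j in range(n_robots)]
                total = sum(w)
                plan.append(rng.choices(range(n_robots),
                                        weights=[x / total for x in w])[0])
            plans.append(plan)
        for plan in plans:                       # STEP 2: evaluate each plan
            c = sum(costs[i][j] for i, j in enumerate(plan))
            if c < best_cost:
                best_plan, best_cost = plan, c
        for i in range(n_tasks):                 # STEP 3: evaporation
            for j in range(n_robots):
                tau[i][j] *= rho
        for i, j in enumerate(best_plan):        # reinforce the best plan
            tau[i][j] += 1.0 / best_cost
    return best_plan, best_cost
```

On a toy 2-task, 2-robot cost matrix the loop converges to the obvious diagonal assignment within the first cycles.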

IV. SIMULATION PROCESS

A. Parameters of Experiment

To validate the algorithm, we take information transmission as the background. There are a number of tasks: some are loose tasks, i.e. completed by a single robot independently [20]; others are tightly coupled tasks, i.e. completed by several robots in cooperation. The numbers of robots and tasks are set as required, and each task and each robot has a corresponding capacity vector. The work space is 5 × 5, divided into four zones, as shown in Fig. 3. The algorithm is implemented in MATLAB, a numerical computing environment and


fourth-generation programming language developed by MathWorks. MATLAB supports dynamic-environment multi-agent control system simulation and provides visualization capabilities.

In this paper, we mainly contrast the robustness and speed of the four algorithms.
TABLE II. INFORMATION OF TASKS

Task  Required capacity  Location
T0    210                (4,3)
T1    300                (0,0)
T2    130                (5,3)
T3    120                (2,4)
T4    229                (4,1)
T5    112                (3,0)
T6    332                (0,4)
T7    106                (2,3)

Figure 3. Work Space of Robots

The robots are scattered at certain positions in the space; each robot can move in the space, and goods are stored at certain positions. The number of ants is m = 8, the maximum number of iterations is NC_max = 300, rew(T_i) = 700, Q = 1, α = 1.5, β = 2, ρ = 0.7 (the remaining weighting coefficients are set to 1 and 0.5). The capacity and cost vector of the robots are provided in Table I.

TABLE I. CAPACITY AND COST VECTOR OF ROBOTS

Robot  Capacity  Cost vector
R0     80        103
R1     102       160
R2     210       258
R3     120       178
R4     062       104
R5     087       115
R6     185       230
R7     164       210

B. Robustness

Robustness means the multi-robot system can continue to accomplish the task even if some robots fail. In the test, we take the failure times of task allocation as the measure of robustness for the four algorithms. In the process of cooperation, conflicts may appear, leading to errors in distribution and handling, so comparing collaboration efficiency is crucial to the distribution process. Comparing ALLIANCE, BLE, M+ and the ant colony algorithm, we find that the conflict times of the ant colony algorithm are fewest. The results are shown in Fig. 4.


Different situations exist: some tasks are completed by a single robot independently, while others are completed by several robots in cooperation. In the following, some tasks belong to single robots and some to multi-robot coalitions [21]. Table II provides the information about the tasks, where the ants stand. Using the parameters above, we run the experiment on the information transmission task and then estimate the performance of ALLIANCE, BLE, M+ and the ant colony algorithm for MRTA. The performance indicators of MRTA include robustness, speed, extensibility, heterogeneity, flexibility and so on.

Figure 4. Failure Times of allocations

Analyzing the experimental results, we find that the robustness of the ant colony algorithm is strong: as the generations increase, the failure times present only a linear growth trend.

C. The Best Incomes

Speed is the most important index for multi-robot systems, and the best income describes the task execution efficiency. Using the data provided in Table I


and Table II, we test ALLIANCE, BLE, M+ and the ant colony algorithm for MRTA. In the test, the population is set to 3; the best incomes of the algorithms are shown in Fig. 5.

Forty runs of the ant colony algorithm are made with the above parameters, and the results of robot coalition formation and high-level task allocation are obtained; the results are analyzed and compared in Fig. 6.

V. CONCLUSION

We simulated MRTA based on the ant colony algorithm in order to compare it with ALLIANCE, BLE and M+. Analyzing the experimental data of the four algorithms, we draw the following conclusions. (1) Robustness: testing the four algorithms shows that ALLIANCE is the strongest at the start. At the beginning, the robustness of M+ and the ant colony algorithm is the same; as the generations grow, the failure times of M+ grow rapidly while the ant colony algorithm remains more stable, so its robustness becomes stronger over the generations. (2) The best incomes: at the beginning the best income of the ant colony algorithm is lower, but as the generations grow its best income becomes higher. (3) The efficiency of solution evolving between the low level and the high level: testing the execution times of the two levels shows that forming robot coalitions takes longer; once the coalitions have been formed, task allocation becomes easier. We can therefore conclude that the ant colony algorithm is suitable for solving robot coalition problems because of its high ability and stability, while the corresponding high-level task allocation has relatively low efficiency when the numbers of ants and tasks are not limited. Considering the pheromone distribution problem, imbalance may arise in assigning tasks, leading to heavy workloads for some robots and light workloads for others; therefore, in actual use, the algorithm may need guidance.
With the development of robotics, improving the stability and efficiency of the algorithm becomes ever more important. Future work includes the implementation of ant colony algorithms on real robots, and the improvement of the stability and efficiency of the ant colony algorithm for MRTA. Fair distribution in MRTA based on the ant colony algorithm also needs further study.

ACKNOWLEDGMENT

This work was supported by a grant from the Social Science Association of Henan Province (No. SKL-2010-1146 and No. SKL-2011-2381).

Figure 5. The Best Incomes of Population 3

D. Results of the Experiments

At present, the ant colony algorithm has been used for solving robot cooperation, but mostly limited to loosely distributed tasks; solving large-scale distributed tasks with the ant colony algorithm is uncommon. This paper introduces a new methodology for MRTA based on the ant colony algorithm. The main contributions of this paper are the following: (1) We discuss the core problem of forming the robot alliance and use the ant colony algorithm to resolve the alliance formation of multiple robots. (2) We use the ant colony algorithm to solve large-scale task allocation. (3) We realize the task allocation process by building a hierarchy model, dividing the MRTA problem into a low level and a high level. (4) We implement the algorithm simulation program in MATLAB, run the experiment on the information transmission task, and estimate the performance of ALLIANCE, BLE, M+ and the ant colony algorithm for MRTA.

REFERENCES
[1] ZHANG Yu, LIU Shu-Hua, Survey of Multi-Robot Task Allocation, CAAI Transactions on Intelligent Systems, Vol.3 No.2, pp: 115-120, April 2008.

Figure 6. Average Solution Evolving Curve


[2] Gerkey B P, Mataric M J, A framework for studying multi-robot task allocation, In Proceedings of the NRL Workshop on Multi-robot Systems, pp: 17-19, 2003. [3] Kalra N, Martinoli A, A Comparative Study of Market-Based and Threshold-Based Task Allocation, Distributed Autonomous Robotic Systems, Tokyo: Springer, 2006. [4] Fang Tang, Automated Synthesis of Multi-Robot Task Solution through Software Reconfiguration, Proc of IEEE Int Conf on Robotics and Automation, Barcelona, Spain, pp: 1501-1508, 2005. [5] Botelho S, Alami R, M+: a scheme for multi-robot cooperation through negotiated task allocation and achievement, Proc of the 1999 IEEE International Conf on Robotics and Automation, Detroit, Michigan, pp: 1234-1239, 1999. [6] L. E. Parker, ALLIANCE: An Architecture for Fault Tolerant Multi-robot Cooperation, IEEE Transactions on Robotics and Automation, Vol. 14, No. 2, pp: 220-240, April 1998. [7] B. B. Werger, M. J. Mataric, Broadcast of Local Eligibility for Multi-Target Observation, in Distributed Autonomous Robotic Systems, Springer, pp: 347-356, 2001. [8] Fang Tang, Lynne E. Parker, ASyMTRe: Automated Synthesis of Multi-Robot Task Solutions through Software Reconfiguration, Proc of IEEE Int Conf on Robotics and Automation, Barcelona, Spain, May 2005. [9] M. Dorigo, M. Birattari, and T. Stützle, Ant Colony Optimization: Artificial Ants as a Computational Intelligence Technique, IEEE Computational Intelligence Magazine, November 2006. [10] Zlot R, Stentz A, Complex task allocation for multiple robots, Proc of IEEE Int Conf on Robotics and Automation, pp: 1515-1522, 2005. [11] Vig L, Adams J A, Market-based multi-robot coalition formation, Proc of the 8th Int Symposium on Distributed Autonomous Robotic Systems, Minneapolis, USA, pp: 227-236, 2006. [12] Liu S H, Zhang Y, Multi-robot task allocation based on particle swarm and ant colony optimization, Journal of Northeast Normal University, Vol. 41, No. 4, pp: 68-72, 2009.
[13] Huang Bo, Yan Li-Na, Superiority evaluation algorithm-based task allocation strategy of robot soccer systems, Journal of Huazhong University of Science and Technology, pp: 38-44, 2010. [14] Donald B, Jennings J and Rus D, Information invariants for distributed manipulation, The Intl. J. of Robotics Research, pp: 673-702, 1997. [15] Spletzer J R, Taylor C J, A Framework for Sensor Planning and Control with Applications to Vision Guided Multi-robot Systems, Proc of Computer Vision and Pattern Recognition Conf, Kauai, Hawaii, pp: 378-383, 2001. [16] Liu S H, Zhang Y, Multi-robot task allocation based on swarm intelligence, Journal of Jilin University, Vol. 40, No. 1, pp: 123-129, 2010. [17] Xu Ju-xiang, Liu Guo-dong, Ability Evaluation of Multi-Robot Based on Set-Pairs Proximity, Journal of Jiangnan University, pp: 56-72, 2009. [18] Xie Ping, Liu Zhi-jie, Error Compensation Method of Parallel Robot Based on Ant Colony Algorithm, Computer Engineering, pp: 56-70, 2011. [19] Xu Jiang-le, Xiao Zhi-tao, An Improved Intelligent Ant Colony Optimization Based on Genetic Algorithm, Microelectronics & Computer, pp: 67-74, 2011.

[20] Zhang Dan-dan, Adaptive Task Assignment for Multiple Mobile Robots Via Swarm Intelligence Approach, Robotics and Autonomous Systems, pp: 52-58. 2007. [21] GERKEY B P, MATARIC M J, A formal analysis and taxonomy of task allocation in multi-robot systems, International Journal of Robotics Research, pp: 939-954, 2004.

Jian-Ping WANG (1981-) was born in BaoJi, Shaanxi Province, China. He received his B.S. degree in 2004 from Shaanxi Normal University and his M.S. degree in 2010 from Nanjing University of Science and Technology. His current research areas are computer network technology and robot systems. He is a lecturer in the School of Information Engineering, Henan Institute of Science and Technology, Xinxiang, Henan Province, China.

Yuesheng Gu was born in 1973. He is an associate professor at Henan Institute of Science and Technology. His current research areas are computer network technology and artificial intelligence.

Xiao-Min Li received her bachelor's degree in Electrical Engineering from Southwest Jiaotong University and her master's degree in Electronics and Communication Engineering from Nanjing University of Science and Technology. She now teaches at Henan Institute of Science and Technology; more than ten of her papers have been published.


Parameter Auto-tuning Method Based on Self-learning Algorithm


Chaohua Ao
Dept. of Automation, Chongqing Industry Polytechnic College, Chongqing, China Email: aochaohua@yahoo.com.cn

Jianchao Bi
College of Automation, Chongqing University, Chongqing, China Email:bjc115@163.com

Abstract—The central air conditioning system is a complex system. Aimed at the difficulty that PID parameters set only once cannot keep the system at its optimal status, this paper proposes a parameter auto-tuning method for fuzzy PID based on a self-learning algorithm. It adopts a parameter auto-tuning technique to adjust the PID parameters in real time so as to ensure good control quality. It combines fuzzy logic control with classical PID control, performs fuzzy auto-tuning of the PID parameters on line, and then switches the adjusted system to its normal working status. Once the system performance changes beyond the specified range, the system automatically restarts the PID parameter tuning process. Engineering practice shows that the retuned parameters obtain a better control effect than before; the system performance is greatly enhanced, with higher control accuracy, better stability and stronger robustness.

Index Terms—parameter auto-tuning, fuzzy PID controller, self-learning algorithm

performance indexes better, but also saves energy to a great extent.

II. CYBERNETICS CHARACTERISTICS OF THE CONTROLLED OBJECT AND CONTROL STRATEGY SELECTION

A. Cybernetics Characteristics

The central air conditioning control system is mainly used to control the temperature and humidity in the rooms of a building, and it is a very complicated system. Because the building space is an open system, the temperature and humidity at each coordinate point in a room differ from each other; it is therefore a complex control system with multiple inputs and multiple outputs, and has become a hot spot of control theory and control engineering. To obtain better control effect and quality, the cybernetics characteristics of the controlled object must first be studied so as to find the correct control strategy and algorithm. Because the control of temperature and humidity in an open space is a very typical complex process, it is hard to describe the process characteristics by strict mathematical methods: the parameters are nonlinear and the system performance is time-varying, and the process also appears random, fuzzy and non-stationary. Up to now it is still difficult to build a strict mathematical model by a uniform mathematical method. Generally speaking, the process characteristics of the controlled object can be summarized as follows. 1) The process parameters are unknown or uncertainly known, vary with time, are random in pattern and decentralized in spatial distribution. 2) The time lag of the process is unknown or uncertainly known, and is always varying. 3) There is serious nonlinearity in the parameters and the process as time and spatial position change. 4) There are many correlations among the process parameters.

I. INTRODUCTION

The central air conditioning system is a complex multi-input, multi-output system with big lag, nonlinearity and big inertia, and also a huge energy consumer, accounting for over 50% of whole-building energy consumption. Furthermore, it is difficult to build a precise mathematical model of the system, so the traditional PID controller currently used is not a good choice: it is poor in control precision, bad in stability and reliability, and has difficulty satisfying user demand. These shortcomings produce an unsatisfying control effect and result in a huge waste of energy. This paper takes the engineering retrofit of the central air conditioning system of a tobacco factory as an example and discusses the method of related parameter auto-tuning. The retrofit adopts the parameter auto-tuning technique of a fuzzy PID controller. By means of online fuzzy auto-tuning, the PID parameters are modified in real time so that the system always runs at the optimal status. As a result, compared with the conventional algorithm, it not only satisfies the technical

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2168-2175


5) The disturbance of the process environment is unknown or uncertainly known, and takes many random forms. 6) The controlled object is a big-inertia system with uncertain lag. From the above, we can see that it is difficult to obtain a good control effect for the central air conditioning process by means of the traditional control strategy (PID) or the methods of modern control theory. It is therefore necessary to study the control strategy further.

B. Control Strategy Selection

There are many control strategies to select from, but many puzzles remain to be solved. For example, neural network control needs definite experimental samples; due to the influence of uncertainty, it is hard to obtain such samples from known experience and prior experiments, so because of this limitation it is generally difficult to realize effective control. The expert control system is based on knowledge, and because it is difficult to sample and express the characteristic information and to build a mature repository, the expert control system is also difficult to apply to air conditioning process control. Human control experience can be summarized and described in human language, depicted as fuzzy linguistic variables by means of fuzzy sets in fuzzy mathematics, and realized by sentences of the form IF condition THEN action; but because there are too many uncertainty factors, general fuzzy control is not necessarily a good choice for the air conditioning process. The basic property of HSIC (Human Simulated Intelligent Control) is to simulate the control behavior of a control expert: its control algorithm is multi-mode control, alternating among several control modes.
Such a property allows many contradictory demands on control quality to be harmonized in the control system; it may be a wiser choice, but because too many parameters must be adjusted, parameter tuning is very complex and it is therefore not selected. In this paper, we select the parameter auto-tuning method based on a self-learning algorithm for the fuzzy PID controller. Next we discuss the related problem of parameter selection for the PID controller.

III. INFLUENCE OF EACH PID PARAMETER ON THE SYSTEM

The following discusses the influence of each PID controller parameter on steady-state and dynamic performance.

A. Proportional Unit

The function of the proportional unit is to reduce the system deviation. If the proportional coefficient KP increases, the response speed quickens and the steady-state error is reduced, so the control precision is enhanced; but too big a KP results in bigger overshoot and system instability. If KP is too small, the overshoot is reduced, the

system stability margin will be magnified, but the control precision will be reduced and the transitional process will be long [1, 2].

B. Integral Unit

The integral unit eliminates the steady-state error of the system, but it slows the system response and increases the overshoot, possibly resulting in oscillation. If the integral coefficient KI increases, it helps reduce the steady-state error; but over-strong integral action intensifies the overshoot and may even produce oscillation. If KI decreases, it helps keep the system stable, avoids oscillation and reduces the overshoot, but it does not help eliminate the steady-state error [3, 4, 5].

C. Differential Unit

The differential unit reflects the change trend of the deviation signal. When the deviation changes greatly, it introduces an effective correcting signal at an early stage, which helps reduce the overshoot, overcome oscillation, make the system approach stability quickly, enhance the response speed and reduce the adjusting time; as a result the dynamic characteristics are improved. Its disadvantage is poor anti-jamming ability, and the response process is strongly influenced by the value of the differential coefficient KD. If KD increases, it quickens the system response, reduces the overshoot and increases the system stability, but it brings sensitivity to disturbance and weakens anti-jamming ability. If KD is too big, the response process brakes too early and the adjusting time is delayed; if KD is too small, the deceleration of the adjusting process is delayed, the overshoot increases, the response slows down, and the system stability gets worse [6, 7].

IV.
PARAMETER TUNING AND ITS PUZZLE

A. Tuning Principle of the PID Parameters

In a PID control system, most mathematical models can be simplified to a second-order system for analysis. The typical response curve is shown in Fig. 1, where the system deviation is e(t) = r(t) − y(t) and the deviation change rate is ec(t) = de(t)/dt. We can now analyze the tuning principle for each parameter of the routine PID controller over each subsection of the curve. 1) For subsection OA (e > 0, ec < 0): under the action of the unit-step signal this is the key transition phase from static to dynamic state, after which the system gradually turns to steady state. Owing to the influence of system inertia, this subsection of the curve can only ascend with a certain slope. In this phase e > 0 and decreases; since ec < 0, the absolute value of the deviation e must decrease. To obtain better control performance, gain scheduling should be adopted in subsection OA. If the fixed


proportional control mode is adopted, then when the output reaches the steady value it cannot hold there, and overshoot will certainly occur because of the system's own inertia. To make the system response fast without great overshoot, subsection OA should be divided into three parts for analysis, namely OI, IJ and JA. In subsection OI the deviation e is large: take a larger gain KP and a smaller KD to speed up the response and suppress the instantaneously large initial deviation e, and cancel the integral action (KI = 0) or reduce it to prevent integral saturation and avoid a large overshoot in the system response. In subsection IJ, the gains KP, KI and KD should not take too large values, to avoid overshoot: take a small KI and medium KP and KD to ensure the response speed of the system. In subsection JA, the deviation shows a good decreasing trend: reduce KP and enlarge KI to avoid oscillation around the set value; to enhance anti-jamming performance, a medium value of KD is generally better.
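The segment-based tuning just described for the rise phase can be sketched as a gain-scheduling table (the segment boundaries at 2/3 and 1/3 of the initial deviation, and the gain values, are illustrative assumptions, not the paper's numbers):

```python
def schedule_gains(e, e_start):
    """Gain scheduling over subsections OI, IJ, JA of the rise phase OA,
    keyed on how much of the initial deviation remains."""
    r = abs(e) / abs(e_start)
    if r > 2.0 / 3.0:        # OI: large KP, integral cancelled, small KD
        return dict(kp=5.0, ki=0.0, kd=0.5)
    if r > 1.0 / 3.0:        # IJ: medium KP and KD, small KI
        return dict(kp=3.0, ki=0.1, kd=1.0)
    return dict(kp=1.5, ki=0.5, kd=1.0)   # JA: smaller KP, larger KI
```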

coefficient KD can take a larger value; generally a medium value is better. 4) For subsection CD (e > 0, ec > 0): the output response of the system decreases and shows a bad trend of varying in the reverse direction, reaching the positive maximum at point D. In this situation the integral control action should dominate, to weaken the influence of the bad trend. 5) In subsection DE (e > 0, ec < 0): the system response shows a good trend, the deviation gradually decreasing. The control action therefore must not be too strong, otherwise overshoot will appear again; obviously the integral action should be reduced. The situation of each later time section is similar to the above and is not repeated here.

B. The Puzzle of Parameter Tuning

From the above, we can see that the choice of control parameters is very complex. The main problem is that parameters tuned once by the traditional PID tuning method cannot guarantee that the system always stays at the optimal status; it is therefore necessary to adopt an auto-tuning system for the parameters, a method presented by Åström [8, 9, 10]. Simply speaking, parameter auto-tuning means the controller can tune the PID parameter values itself; after tuning, the system automatically switches back to its normal working situation. Once the system performance changes or goes beyond the anticipated bound, the system automatically starts the PID parameter tuning process and retunes the parameters to obtain a better control effect. There are many parameter tuning methods. This paper adopts the online parameter auto-tuning method of the fuzzy control technique based on a self-learning system; its most outstanding advantages are better real-time performance and faster system response.

V.
PARAMETER AUTO-TUNING SYSTEM OF FUZZY PID BASED ON SELF-LEARNING FUNCTIONS A.Algorithm Design of Fuzzy PID In order to satisfy the special environment demand of the tobacco factory, the ideal room temperature range is from 23 to 24, the fluctuating time of overrunning 0.5 around 23 is not greater than 400s, and the rising time of temperature must be within 150s so as to avoid the pipe tobacco quality downgrading. For getting the optimal environment control effect of temperature and humidity, by means of parameter auto-tuning system of fuzzy PID controller, it first finds the fuzzy relation among three parameters, KP, KI, KD and system deviation e and its change rate etc. Then it modifies three parameters by means of increment principle of parameter adjusting so as to satisfy the different demand of control parameter when the system deviation e, and its change rate is different. Thereby it makes controlled object own the better steady and dynamic performance. The structure of fuzzy PID controller is shown as in Fig.2.

Figure 1. Typical response curve of 2-order system

2) For subsection AB (e < 0, ec < 0), the response output of the system has gone beyond the steady value and presents a bad trend toward increasing system deviation. Up to point B the deviation reaches its negative maximum, and a measure should be taken to reduce the control quantity. In subsection AB the control action is to reduce the overshoot: besides proportional control, the integral action should be strengthened, so that through the deviation integral the control action returns the system output to the steady value as soon as possible. To enhance the disturbance-rejection performance, a larger value of KD can be taken; generally a middle-sized value is best. 3) In subsection BC (e < 0, ec > 0), the deviation e begins to decrease. Under the control action, the system presents a better trend of varying toward the steady value. If the integral action were still added here, it would certainly make the control action too strong and system callback would appear. Therefore the integral action should be reduced or removed. To enhance the disturbance-rejection performance, the differential

2012 ACADEMY PUBLISHER

JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012


Figure 2. Fuzzy control system structure based on self learning function

The parameter auto-tuning mechanism adopts the increment adjusting principle. The Mamdani reasoning method is adopted for fuzzy inference and defuzzification, with control rules of the form
IF E is Ei AND EC is ECi THEN ΔKP is Ui   (i = 1, 2, …, 49)
Its fuzzy implication adopts the minimum (least value) method:

μU'i(ΔKP) = μEi(e) ∧ μECi(ec) ∧ μUi(ΔKP)   (1)

KP = KP0 + ΔKP   (4)
KI = KI0 + ΔKI   (5)
KD = KD0 + ΔKD   (6)


For the fuzzy composition, the maximum value method is adopted; the reasoning principle is

B. The Design of the Self-learning Unit
Reference [11] proposed a self-learning fuzzy control algorithm for multi-input single-output systems. Δy denotes the output modification quantity of the fuzzy controller, and a performance function is used to reflect the ideal response characteristic of the system. Assume the increment model of the controlled object to be

μU'(ΔKP) = μU'1(ΔKP) ∨ μU'2(ΔKP) ∨ … ∨ μU'49(ΔKP)   (2)

Δy(k) = M[Δeu(k − 1 − τ)]   (7)

At a certain sampling moment, ΔKP can be determined by the barycenter (centroid) of the fuzzy output U'.

In which Δy(k) is the output increment, Δeu(k) is the control increment, and τ is the beat number of the pure lag. The modification quantity Δeu(k−1−τ) of the control quantity can be computed from the increment model. Because the control quantities and observed values of every step are stored in memory, eu(k−1−τ) can be taken out of storage and the control quantity modified to eu(k−1−τ) + Δeu(k−1−τ); it is then transferred into the fuzzy quantity Au.

ΔKP = [Σ_{j=1}^{49} μPj(ΔKP)·ΔKP] / [Σ_{j=1}^{49} μPj(ΔKP)]   (3)

In which μPj(ΔKP) (j = 1, 2, …, 49) is the membership grade of ΔKP. In like manner, ΔKI and ΔKD can be obtained. The value obtained by fuzzy reasoning and defuzzification is multiplied by a proportional factor to give the increment adjustment of the PID parameters. By means of the adjusting formulas (4), (5) and (6), the results are taken as the control parameters of the PID controller. Here KP0, KI0 and KD0 are the initial values of the controller parameters, obtained by a conventional method. The adjusting process is shown in Fig. 3.
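A minimal sketch of the inference chain (1)–(6) — min–max Mamdani inference followed by centroid defuzzification and the increment adjustment — assuming toy triangular membership functions and a reduced 3-rule table in place of the paper's 49 rules:

```python
# Sketch of Eqs. (1)-(6): min-max Mamdani inference, centroid defuzzification
# (Eq. (3)), and the increment adjustment Kp = Kp0 + scale * dKp (Eq. (4)).
# Membership shapes and the rule table are simplified placeholders.

import numpy as np

universe = np.linspace(-6, 6, 121)          # fuzzy domain of dKp

def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    return np.maximum(np.minimum((x - a) / (b - a + 1e-12),
                                 (c - x) / (c - b + 1e-12)), 0.0)

def infer_dKp(e, ec, rules):
    """rules: list of (center_E, center_EC, center_U) triples."""
    agg = np.zeros_like(universe)
    for cE, cEC, cU in rules:
        w = min(tri(e, cE - 2, cE, cE + 2),          # mu_Ei(e) ^ mu_ECi(ec)
                tri(ec, cEC - 2, cEC, cEC + 2))
        agg = np.maximum(agg,                         # max-union, Eq. (2)
                         np.minimum(w, tri(universe, cU - 2, cU, cU + 2)))
    if agg.sum() == 0:
        return 0.0
    return float((universe * agg).sum() / agg.sum()) # centroid, Eq. (3)

rules = [(-4, -4, 4), (0, 0, 0), (4, 4, -4)]         # toy 3-rule table
dKp = infer_dKp(1.0, 0.5, rules)
Kp = 0.18 + 0.005 * dKp                               # Kp0 + scaling factor * dKp
print(round(Kp, 4))
```

With the symmetric toy rules above the aggregated set is symmetric about zero, so the centroid correction is essentially zero and Kp stays near its initial value.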


Figure 3. Fuzzy adjusting of PID parameter

The measured values before the (τ+1)-th step are also taken out and transferred into the corresponding fuzzy numbers A1, A2, …, Ak, and thus a new rule is formed. If a rule with the same precondition exists in the rule base, it is replaced by the new rule; otherwise the new rule is written into the rule base. This self-learning process is continuously repeated, and the control rules are gradually improved and perfected until no rule needs to be modified or added. By sampling the current measured values of e and ec, the control algorithm evaluates the control effect, and the action part of the control rules is modified on line, before the (τ+1)-th beat, by the reward-and-punishment learning algorithm using the varying-domain method. In this way the control rule base is modified and the large-lag characteristic of the system is improved. The dynamic characteristic of the system can be summarized as follows [12]: when e(k)·ec(k) > 0, the system has the trend of reducing the deviation.
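The rule-replacement step described above can be sketched with a dictionary keyed by rule preconditions; the label encoding of preconditions and consequents is an assumed illustration:

```python
# Sketch of the rule-replacement step: a new rule with the same precondition
# overwrites the old one, otherwise it is appended to the rule base.

rule_base = {('NS', 'PS'): 'ZE'}            # precondition -> consequent

def learn_rule(rule_base, precondition, consequent):
    """Write the new rule; same-precondition rules are replaced."""
    rule_base[precondition] = consequent    # dict semantics cover both cases
    return rule_base

learn_rule(rule_base, ('NS', 'PS'), 'PS')   # replaces the existing rule
learn_rule(rule_base, ('PM', 'NS'), 'NS')   # adds a new rule
print(len(rule_base), rule_base[('NS', 'PS')])   # -> 2 PS
```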


When e(k)·ec(k) < 0, the system has the trend of increasing the deviation. In terms of the above characteristic, the evaluation function C(k) can be expressed as in formula (8):
C(k) = e(k)·ec(k)   (8)

(10) Let f(k) = 1 + [|e(k)| + e_max^k + e_max^{k−1}]/(3·ST), then go to (11).
(11) Replace ω[num(k−1−τ)] by [the former value of ω[num(k−1−τ)]] × f(k), then go to (12).
(12) End this round of learning.

VI. SYSTEM SIMULATION
For convenience of explanation and intuition, the simulation experiment is divided into two parts. First the control methods are simulated so as to compare which control strategy is better; then, based on the optimized control method, the overall system is simulated. The following model is used:
W(s) = K e^{−τs} / (Ts + 1)
in which K is the gain coefficient, T is the time constant of the controlled object, and τ is the pure lag time of the system. For convenience, take K = 1, T = 1.2, τ = 2; then
W(s) = e^{−2s} / (1.2s + 1)
In the MATLAB environment, the system simulation model is built with Simulink. Under a unit-step input, the same controlled object is simulated with the PID controller and with the fuzzy PID controller, respectively. The response curves of the simulation are shown in Fig. 4.
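A rough discrete-time simulation of the example plant W(s) = e^{−2s}/(1.2s+1) under a unit-step set value, with a plain fixed-gain PID loop; the gains, step size and horizon here are illustrative assumptions, not the tuned values used in the paper:

```python
# Euler simulation of W(s) = e^{-2s}/(1.2s+1) in closed loop with a fixed PID.
# Gains are hypothetical; derivative acts on the measurement to avoid setpoint kick.

dt, T, K, tau = 0.01, 1.2, 1.0, 2.0
delay = int(tau / dt)            # transport delay as a sample buffer
buf = [0.0] * delay
y = prev_y = integ = 0.0
Kp, Ki, Kd = 0.5, 0.2, 0.3       # illustrative fixed gains
ys = []
for _ in range(int(40 / dt)):
    e = 1.0 - y                  # unit-step set value
    integ += e * dt
    u = Kp * e + Ki * integ - Kd * (y - prev_y) / dt
    prev_y = y
    buf.append(u)
    u_del = buf.pop(0)           # realizes e^{-tau s}
    y += dt * (K * u_del - y) / T   # first-order lag (Ts + 1)
    ys.append(y)
print(round(ys[-1], 2))          # settles near the set value 1.0
```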

When C(k) > 0, the corresponding control rule is rewarded; when C(k) < 0, it is punished.

C. Determining the Function of Reward and Punishment
In order to assure the on-line learning performance, the concept of a reward-and-punishment function is introduced from the angle of system stability. The function is established so that the deviation obtained on line gradually goes to zero, and is shown in formula (9):
f(k) = 1 ± [|e(k)| + e_max^k + e_max^{k−1}]/(3·ST)   (9)
in which the "+" sign is taken when C(k) > 0 and the "−" sign when C(k) < 0, and e_max^k, e_max^{k−1} are the maximum absolute values of the on-line deviation at times k and k−1, respectively. τ is the lag beat of the system delay. ST is the maximum of the object set value (here, temperature) and is usually a constant; in this paper the temperature set range is 18 to 25 °C, and ST is set equal to 25. The feasibility of the reward-and-punishment function has been proved through system response experiments.

D. The Algorithm Flow for Self-learning
The basic steps of the on-line reward-and-punishment self-learning algorithm based on the varying domain are as follows:
(1) Start up the self-learning system.
(2) Read e(k), ec(k) and e(k−1−τ) from the database.
(3) Read the sequence numbers num(k) and num(k−1−τ) of the control outputs u(k) (respectively ΔKP, ΔKI and ΔKD) and u(k−1−τ) of the fuzzy controller, according to the corresponding rules.
(4) Read the reward-and-punishment factor ω[num(k−1−τ)] in terms of num(k−1−τ) from the database, according to the corresponding control rule.
(5) Read e_max^k and e_max^{k−1} from the database.
(6) If e_max^k ≤ 0.1, go to (8); else go to (7).
(7) Compute C(k) = e(k)·ec(k); if C(k) = 0 go to (8), if C(k) < 0 go to (9), if C(k) > 0 go to (10).
(8) Let f(k) = 1, then go to (11).
(9) Let f(k) = 1 − [|e(k)| + e_max^k + e_max^{k−1}]/(3·ST), then go to (11).

Figure 4. System response curve

In Fig. 4, curves 1 and 2 are the response curves of the PID and the fuzzy PID controller, respectively. It can be seen that neither curve shows overshoot, but the rise time and adjusting time of the former are longer than those of the fuzzy PID controller; therefore the fuzzy PID controller has better control quality than the PID controller. The following are some comparisons of the simulation curves under a pulse disturbance. Fig. 5 shows the curves with a pulse disturbance at t = 4.5 s; from the comparison it can be seen that the fuzzy PID controller has better anti-interference performance.


When τ is changed from 2 s to 4 s with the other parameters unchanged, the response curves are as shown in Fig. 8. The overshoot of the PID controller is enlarged further, while the curve of the fuzzy PID controller hardly changes at all apart from a 2 s delay in the response time. Fig. 8 shows the result of the simulation.

Figure 5. Response of system with a pulse disturbance

Fig. 6 compares robustness against changes of the object parameters: supposing K = 1, T = 1.2, τ = 2, only the open-loop gain K is changed, from K = 1 to K = 2, with the others unchanged. From Fig. 6 it can be seen that after the gain is changed the overshoot of the PID controller is enlarged much more, while the fuzzy PID controller still shows no overshoot.

Figure 8. Response with time-lag changed

Figure 6. Curve of response after gain K changed

For other parameter changes: when T changes from 1.2 s to 2 s with the other parameters unchanged, a slight overshoot appears for the PID controller, but there is hardly any change in the response curve of the fuzzy PID controller, as shown in Fig. 7.

Figure 9. Curve of response with 2-order system

Figure 7. Curve of response after time constant changed

It is worth mentioning that if an inertia unit follows the controlled object, the transfer function becomes W(s) = K e^{−τs}/[(Ts + 1)(2s + 1)]; the result of the simulation is shown in Fig. 9. It can be seen that the overshoot of the PID controller is enlarged, while for the fuzzy PID controller there is hardly any change in the system response and, in particular, no overshoot at all. The above simulations show that when a disturbance appears, the fuzzy PID controller has better anti-interference performance than the PID controller; and when the controlled object changes, its response curve hardly changes, while the PID controller clearly shows overshoot and its rise and adjusting times become slower. Therefore the fuzzy PID controller has the better quality. Now the overall system simulation is made. Assume the input variables are the deviation e and its change rate ec; through the fuzzy inference machine the control


outputs ΔKp, ΔKI, ΔKd are obtained. The controlled object and its parameter values are shown in Tab. 1. The ideal output is 23 °C, the steady-state deviation is less than ±0.5 °C, the overshoot Mp ≤ 4 °C, the rise time tr ≤ 150 s, and the adjusting time ts ≤ 400 s. Suppose the initial parameters of the fuzzy PID controller are Kp0 = 0.18, KI0 = 0.00158, Kd0 = 1.

TABLE I. PARAMETER VALUES OF THE SYSTEM SIMULATION

Variable              e           ec          ΔKp             ΔKI                   ΔKd
Linguistic variable   E           EC          ΔKp             ΔKI                   ΔKd
Basic domain          [-15, 15]   [-30, 30]   [-0.03, 0.03]   [-6×10^-7, 6×10^-7]   [-0.102, 0.102]
Fuzzy subset          NB, NM, NS, ZE, PS, PM, PB (for every variable)
Fuzzy domain          [-6, 6]     [-3, 3]     [-6, 6]         [-6, 6]               [-6, 6]
Quantization factor   0.4         0.1         0.005           0.0000001             0.017

The controlled object is
G(s) = 10 e^{−30s} / [(60s + 1)(s + 1)]

If the routine PID is used, the precision of the system can be controlled within the error range of ±0.5 °C. If the fuzzy PID controller is adopted, the system response has excellent performance in overshoot, rise time and adjusting time. If the model parameters are changed, for example the plant gain from 10 to 13, then with the routine PID controller the adjusting time (ts = 750 s) exceeds the performance index; with the fuzzy PID controller some overshoot appears, but the maximum overshoot is Mp ≤ 4 °C, still within the allowable range, with rise time tr = 84 s and adjusting time ts = 172 s, so the performance index is still satisfied. If one term of the denominator of the transfer function changes from (60s + 1) to (50s + 1) and the lag time changes from 23 to 26, the routine PID is still able to become stable within 2000 s, but this seriously exceeds the performance index of the engineering demand; under the same conditions the adjusting time of the fuzzy PID controller is 202 s, and the overshoot and rise time both satisfy the performance index. Under a unit-step disturbance the routine PID controller cannot converge to the set value, so the system performance becomes bad; the fuzzy PID controller shows a certain overshoot, but it is only 1.2 °C, and about 230 s after the disturbance the system automatically converges to the steady value.

VII. REALIZATION OF SYSTEM CONTROL
For convenience, we take the room temperature control of a tobacco production workshop as an example to validate the correctness of the presented parameter auto-tuning method of the fuzzy PID controller. The main environment demand of the tobacco production workshop is

that the room must not exceed the ideal temperature range (23–24 °C), and the time during which the temperature fluctuates beyond ±0.5 °C of the ideal temperature (23 °C) must be no greater than 400 s, to avoid the product quality descending. In order to obtain the optimal environment control effect of temperature and humidity, the parameter auto-tuning method of the fuzzy PID controller based on self-learning is adopted in the central air-conditioner system. From the increment outputs ΔKP, ΔKI and ΔKD of the controller, through defuzzification and parameter computing, each control parameter of the PID controller for the next step is obtained; finally the control parameters are sent to the PID controller to carry through the system adjustment. If the deviation exceeds the allowable error, the self-learning system is started up and carries through the adjustments, such as reward and punishment and the fuzzy rules; otherwise the self-learning system is not entered, and parameter auto-tuning is carried out on line according to the current fuzzy rules. The fuzzy parameter auto-tuning system of PID combines the advantages of the higher control precision of PID control and the faster response of fuzzy control. The practical test and running effect of the system show that the system design is successful.

VIII. CONCLUSIONS
From the contrast research on the parameter auto-tuning system of the fuzzy PID controller based on self-learning and the routine PID controller, it can be seen that if the routine PID controller is adopted to control the central air conditioner, phenomena such as oscillation appear, and the strict demands on environment temperature and humidity in the technology of the tobacco production workshop cannot be satisfied. The parameter auto-tuning system of the fuzzy PID controller based on self-learning owns better self-adaptability and has obvious advantages in robustness and steady-state precision of the system; it is better able to satisfy working situations requiring high-precision control.

REFERENCES
[1] E. Cheres and J. Tuch, "Man-utility focused tuning algorithm," Electrotechnical Conference, 2002, pp. 396-397.
[2] Houde Han, Ankang Kan, and Lili Sha, "Application of fuzzy control technology on the marine heat exchanger control system," 6th International Symposium on Heating, Ventilating and Air Conditioning (ISHVAC 2009), 2009, pp. 311-317.
[3] C. Larbes, S. M. At Cheikh, T. Obeidi, and A. Zerguerras, "Genetic algorithms optimized fuzzy logic control for the maximum power point tracking in photovoltaic system," Renewable Energy, 2009, pp. 2093-2100.
[4] Yifei Chen, Ku Wang, and Wenguang Dai, "Natural ventilation control system by fuzzy control technology," Proceedings of the 2nd International Conference on Intelligent Networks and Intelligent Systems (ICINIS 2009), 2009, pp. 35-38.
[5] F. T. Tehrani, "Automatic control of mechanical ventilation. Part 2: The existing techniques and future trends," Journal of Clinical Monitoring and Computing, 2008, pp. 417-424.
[6] A. Avgelis and A. M. Papadopoulos, "Application of multicriteria analysis in designing HVAC systems," Energy and Buildings, 2009, pp. 774-780.
[7] Zhi-nong Wei, Xiao-yong Yu, Jia-jia Wu, Lian-shan Han, Xiang Xie, Dan Che, and Yue Wang, "The intelligent control of DFIG-based wind generation," Sustainable Power Generation and Supply, 2009, pp. 1-5.
[8] Jian Zhang, Xuhui Wen, and Lili Zeng, "Research of parameter self-learning fuzzy control strategy in motor control system for electric vehicles," International Conference on Electrical Machines and Systems (ICEMS 2009), 2009, pp. 1-5.
[9] S. Srivastava, V. Sukumar, P. S. Bhasin, and D. Arun Kumar, "A laboratory testbed for embedded fuzzy control education," IEEE Transactions, vol. 54, no. 1, 2011, pp. 14-23.
[10] Shiuh-Jer Huang and Hsin-Wei Shieh, "Motion control of a nonlinear pneumatic actuating table by using self-adaptation fuzzy controller," 2009, pp. 1-6.
[11] Jianxian Cai and Xiaogang Ruan, "Self-organization stochastic fuzzy control based on OCPFA and applied on self-balanced robot," 2010, pp. 4775-4780.
[12] Jianxian Cai and Xiaogang Ruan, "Self-balance control of inverted pendulum based on fuzzy Skinner operant conditioning," International Conference on Fuzzy, 2009, pp. 518-521.


Efficient Graduate Employment Serving System based on Queuing Theory


Zeng Hui
School of Sciences, Yanshan University, Qinhuangdao, China. Email: zenghui@ysu.edu.cn

Abstract—The mathematical model of a two-phase-service M/M/1/N queuing system with server breakdown and multiple vacations is established for the Graduate Employment Services system. Equations of the steady-state probabilities are derived by applying Markov process theory, and a matrix-form solution of the steady-state probabilities is obtained using the blocked (partitioned) matrix method. Finally, some performance measures of the system, such as the expected number of users in the system and in the queue, are presented.

Index Terms—queuing theory, mathematical model, Graduate Employment Services system

I. INTRODUCTION
Queuing theory is a branch of operations research. The main purpose of the study is to answer how to improve the service provided to an object so that a certain indicator target achieves the optimum [1]. Queuing theory originated in 1909 with A. K. Erlang of the Copenhagen Telephone Company, Denmark, in his famous paper "Probability theory and phone calls" [2], which created this applied mathematics subject and many basic principles of the discipline. At present, queuing theory is used domestically and internationally in the optimization of service windows and communication facilities, and many researchers have proposed estimation methods for the number of units in circulation and for network bandwidth based on queuing theory [3, 4]. In recent years, many scholars began to study real-life problems with queuing theory, such as the arrangement of hospital clinics and wards [5, 6], an optimized method for the loading/unloading system of port transportation [7], determining the number of bank teller windows and staff [8], supermarket checkout queue management [9, 10], and a variety of after-sales service systems [11]. However, we have not seen any papers analyzing the Graduate Employment Services system. The models above only study the case of one service per user provided by each server. In fact, in daily life we often encounter a server offering different services to the same user. In such queuing models, all users need the first phase of service and only part of them ask the server to provide a second phase of service; this is the two-phase-service queuing system. Recently, there have been several contributions

considering queuing systems in which the server may provide a second phase of service. Madan [12] studied an M/G/1 queue with a second optional service in which the first essential service time follows a general distribution but the second optional service is assumed to be exponentially distributed. Medhi [13] generalized the model by considering that the second optional service is also governed by a general distribution. Yue Dequan [14-17] studied an M/M/1/N queue with multiple vacations and obtained the matrix-form solution of the steady-state probabilities. The system considered in this paper is the two-phase server system mentioned above. The Graduate Employment Services system is an information service platform that facilitates students' employment; it contains many queues, and therefore the issue of seeking an optimal solution also exists. In this paper we take a college student employment service system that can accommodate a limited number of users as an example, and consider a model in which the server provides two phases of service and takes a vacation when the system becomes idle; once service begins, the service mechanism is subject to breakdowns. Parameters are calculated based on actual data and validated to study the performance of its services.

II. SYSTEM MODEL
The Graduate Employment Services system capacity is N. Suppose there is only one help desk in the system for serving users. Under normal circumstances, a user must first access and query the employment process information, that is, the user must first accept the first phase of service. Subsequently, the user may submit a resume online based on his needs, which means choosing to accept the second phase of service. The service mechanism may fail.

A. Input process
In a certain period of time, each user can repeatedly enter the system, so the source of users can be seen as an infinite population. Users arrive independently according to Poisson processes with different rates.
The arrival rate during vacation is λ0, the arrival rate during active service is λ1, and the arrival rate during breakdown is λ2.

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2176-2183


B. Queuing discipline
The first essential service is needed by all arriving users. The vacation times, uninterrupted service times, and repair times follow exponential distributions. The first service rate is μ1 and the second service rate is μ2. As soon as the first service of a customer is completed, with probability θ (0 < θ < 1) he may opt for the second service, in which case his second service immediately commences; otherwise, with probability 1 − θ, he leaves the system, in which case another customer at the head of the queue is taken up for his first essential service.

C. Service rules
The server goes on vacation instantly when the queue becomes empty, and continues to take vacations of exponential length until, at the end of a vacation, users are found in the queue. The vacation rate is v and the vacation time follows an exponential distribution. Service mechanism breakdowns occur only during the first active service, with breakdown rate b (0 < b < 1). The service mechanism then goes through a repair process of random duration; once the repair is completed, the server returns to the customer whose service was interrupted. The repair rate is r. The various stochastic processes involved in the system are assumed independent of each other.

III. STEADY-STATE PROBABILITY EQUATIONS
Let X(t) be the number of customers in the system at time t, and define C(t) as the state of the server at time t, as follows:
C(t) = 0 (the server is on vacation at time t); 1 (the server is giving the first service at time t); 2 (the server is giving the second service at time t); 3 (the server is in the breakdown/repair process at time t).
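The model of Sections II–III can be sanity-checked by a Monte-Carlo simulation of the four server states; the parameter values and the helper `simulate` are illustrative assumptions, not data from the paper:

```python
# Monte-Carlo sketch of the M/M/1/N two-phase queue with multiple vacations
# and breakdowns; returns the long-run fraction of time the server is busy.

import random

def simulate(T=20000.0, N=5, lam=(1.0, 1.0, 1.0), mu1=3.0, mu2=1.0,
             theta=0.5, v=1.0, b=0.5, r=1.0, seed=1):
    random.seed(seed)
    n, state, t, busy = 0, 0, 0.0, 0.0   # state: 0 vac, 1 phase1, 2 phase2, 3 repair
    while t < T:
        lam_now = lam[0] if state == 0 else lam[2] if state == 3 else lam[1]
        rates = {}
        if n < N: rates['arrive'] = lam_now        # blocked when full
        if state == 0 and n > 0: rates['vacation_end'] = v
        if state == 1: rates['phase1_done'] = mu1; rates['breakdown'] = b
        if state == 2: rates['phase2_done'] = mu2
        if state == 3: rates['repaired'] = r
        dt = random.expovariate(sum(rates.values()))
        t += dt
        if state in (1, 2): busy += dt
        ev = random.choices(list(rates), weights=list(rates.values()))[0]
        if ev == 'arrive': n += 1
        elif ev in ('vacation_end', 'repaired'): state = 1
        elif ev == 'breakdown': state = 3
        elif ev == 'phase1_done':
            if random.random() < theta:
                state = 2                          # same customer opts for phase 2
            else:
                n -= 1
                state = 1 if n > 0 else 0          # leave; vacation if empty
        elif ev == 'phase2_done':
            n -= 1
            state = 1 if n > 0 else 0
    return busy / t

print(round(simulate(), 3))                         # busy-period fraction PB
```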

(λ1 + μ1 + b) p1(1) = v p0(1) + μ1(1−θ) p1(2) + μ2 p2(2) + r p3(1)   (4)
(λ1 + μ1 + b) p1(n) = v p0(n) + λ1 p1(n−1) + μ1(1−θ) p1(n+1) + μ2 p2(n+1) + r p3(n),  2 ≤ n ≤ N−1   (5)
(μ1 + b) p1(N) = v p0(N) + λ1 p1(N−1) + r p3(N)   (6)
(μ2 + λ1) p2(1) = θ μ1 p1(1)   (7)
(μ2 + λ1) p2(n) = θ μ1 p1(n) + λ1 p2(n−1),  2 ≤ n ≤ N−1   (8)
μ2 p2(N) = θ μ1 p1(N) + λ1 p2(N−1)   (9)
(r + λ2) p3(1) = b p1(1)   (10)
(r + λ2) p3(n) = b p1(n) + λ2 p3(n−1),  2 ≤ n ≤ N−1   (11)
r p3(N) = b p1(N) + λ2 p3(N−1)   (12)
Σ_{n=0}^{N} p0(n) + Σ_{n=1}^{N} p1(n) + Σ_{n=1}^{N} p2(n) + Σ_{n=1}^{N} p3(n) = 1   (13)

IV. MATRIX FORM SOLUTION
In the following, we derive the steady-state probabilities by using the partitioned block matrix method. Let P = (p0(0), P0, P1, P2, P3) be the steady-state probability vector of the transition rate matrix Q, where
P0 = (p0(1), p0(2), …, p0(N)),  Pi = (pi(1), pi(2), …, pi(N)),  1 ≤ i ≤ 3.
Then {X(t), C(t), t ≥ 0} is a Markov process with state space
Ω = {(n, 0): 0 ≤ n ≤ N} ∪ {(n, j): 1 ≤ n ≤ N, j = 1, 2, 3}.
The steady-state probabilities of the system are defined as
p0(n) = lim_{t→∞} P(X(t) = n, C(t) = 0),  0 ≤ n ≤ N

pj(n) = lim_{t→∞} P(X(t) = n, C(t) = j),  1 ≤ n ≤ N,  j = 1, 2, 3.
Then the steady-state probability equations can be rewritten in the matrix form
PQ = 0,  Pe = 1   (14)

By applying Markov process theory, we can obtain the following set of steady-state probability equations:

λ0 p0(0) = μ1(1−θ) p1(1) + μ2 p2(1)   (1)
(v + λ0) p0(n) = λ0 p0(n−1),  1 ≤ n ≤ N−1   (2)
v p0(N) = λ0 p0(N−1)   (3)

where e is a column vector with 4N + 1 components, each component of e equal to one, and the transition rate matrix Q of the Markov process has the following blocked matrix structure:

      [ -λ0   α    0    0    0  ]
      [  0    A0   B0   0    0  ]
Q =   [  β    0    B1   C1   D1 ]
      [  γ    0    B2   C2   0  ]
      [  0    0    B3   0    D3 ]
D3 =
[ -(r+λ2)    λ2        0      ...    0   ]
[  0        -(r+λ2)    λ2     ...    0   ]
[  ...                                   ]
[  0         ...      -(r+λ2)        λ2  ]
[  0         ...       0            -r   ]
The diagonal sub-matrices are
B0 = v·diag(1, 1, …, 1),  B3 = r·diag(1, 1, …, 1),
C1 = θμ1·diag(1, 1, …, 1),  D1 = b·diag(1, 1, …, 1).
Here −λ0 is a scalar, α = (λ0, 0, …, 0) is a 1×N row vector, and β = (μ1(1−θ), 0, …, 0)^T, γ = (μ2, 0, …, 0)^T are N×1 column vectors. A0, Bi (0 ≤ i ≤ 3), Ci (i = 1, 2) and Di (i = 1, 3) are N×N square matrices:

A0 =
[ -(v+λ0)    λ0        0      ...    0   ]
[  0        -(v+λ0)    λ0     ...    0   ]
[  ...                                   ]
[  0         ...      -(v+λ0)        λ0  ]
[  0         ...       0            -v   ]

B1 =
[ -(λ1+μ1+b)   λ1            0      ...    0    ]
[  μ1(1-θ)    -(λ1+μ1+b)     λ1     ...    0    ]
[  ...                                          ]
[  0   ...     μ1(1-θ)   -(λ1+μ1+b)        λ1   ]
[  0   ...     0          μ1(1-θ)      -(μ1+b)  ]

B2 has μ2 on its subdiagonal and zeros elsewhere, i.e. B2(n+1, n) = μ2 for 1 ≤ n ≤ N−1.

C2 =
[ -(μ2+λ1)    λ1        0      ...    0   ]
[  0         -(μ2+λ1)   λ1     ...    0   ]
[  ...                                    ]
[  0          ...      -(μ2+λ1)       λ1  ]
[  0          ...       0           -μ2   ]

Eq. (14) is rewritten as follows:
−λ0 p0(0) + P1 β + P2 γ = 0   (15)
p0(0) α + P0 A0 = 0   (16)
P0 B0 + P1 B1 + P2 B2 + P3 B3 = 0   (17)
P1 C1 + P2 C2 = 0   (18)
P1 D1 + P3 D3 = 0   (19)
p0(0) + P0 eN + P1 eN + P2 eN + P3 eN = 1   (20)

where eN is a column vector with N components, each component equal to one. From Eq. (2) we get
p0(1) = λ0/(v+λ0) · p0(0)   (21)
p0(k) = (λ0/(v+λ0))^k · p0(0),  1 ≤ k ≤ N−1   (22)
p0(N) = (λ0/v)·(λ0/(v+λ0))^{N−1} · p0(0)   (23)

From Eq. (18), we get
P2 = −P1 C1 C2^{-1}   (24)
From Eq. (19), we get
P3 = −P1 D1 D3^{-1}   (25)
Substituting Eqs. (24) and (25) into (17), we get
P1 (B1 − C1C2^{-1}B2 − D1D3^{-1}B3) = −v P0
  = −v (λ0/(v+λ0), (λ0/(v+λ0))^2, …, (λ0/(v+λ0))^{N−1}, (λ0/v)(λ0/(v+λ0))^{N−1}) p0(0)   (26)

Let A = B1 − C1C2^{-1}B2 − D1D3^{-1}B3 = (aij). After some algebraic manipulation, the components of A are found to be:
aij = −(λ1+μ1+b) + θμ1μ2λ1/(μ2+λ1)^2 + br/(r+λ2),  i = j ≤ N−2
aij = −(λ1+μ1+b) + θμ1λ1/(μ2+λ1) + br/(r+λ2),  i = j = N−1
aij = −μ1,  i = j = N
aij = μ1(1−θ) + θμ1μ2/(μ2+λ1),  i = j+1 ≤ N−1
aij = μ1(1−θ) + θμ1 = μ1,  i = N, j = N−1
aij = λ1δ_{j,i+1} + θμ1μ2λ1^{j−i+1}/(μ2+λ1)^{j−i+2} + brλ2^{j−i}/(r+λ2)^{j−i+1},  i < j ≤ N−2
aij = λ1δ_{j,i+1} + θμ1λ1^{N−i}/(μ2+λ1)^{N−i} + brλ2^{N−1−i}/(r+λ2)^{N−i},  i < j = N−1
aij = λ1δ_{j,i+1} + bλ2^{N−i}/(r+λ2)^{N−i},  i < j = N
aij = 0,  otherwise.

Partition A as
A = [ r1   a1N ]
    [ Ã    r2  ]
where Ã is the (N−1)×(N−1) matrix obtained from A by deleting its first row and last column. Since aij = 0 for i > j + 1, Ã is upper triangular, with diagonal entries a21, a32, …, aN(N−1).

Here r1 = (a11, a12, …, a1(N−1)) is a 1×(N−1) row vector and r2 = (a2N, a3N, …, aNN)^T is an (N−1)×1 column vector. Let P1 = (p1(1), P̃1), where P̃1 = (p1(2), p1(3), …, p1(N)).

Theorem 1. Ã is an invertible matrix; its determinant is
|Ã| = ∏_{i=2}^{N} a_{i(i−1)} ≠ 0.

where P̃1 = (p1(2), p1(3), …, p1(N)). Eq. (26) is rewritten as follows:
p1(1) r1 + P̃1 Ã = −v (p0(1), p0(2), …, p0(N−1)) = −v ω p0(0)   (27)
p1(1) a1N + P̃1 r2 = −v p0(N) = −λ0 (λ0/(v+λ0))^{N−1} p0(0)   (28)
in which ω = (λ0/(v+λ0), (λ0/(v+λ0))^2, …, (λ0/(v+λ0))^{N−1}) is a 1×(N−1) row vector.
Proof of Theorem 1. Obviously Ã is an upper triangular matrix, so its determinant equals the product of its diagonal elements, that is, |Ã| = ∏_{i=2}^{N} a_{i(i−1)}. For λ0, λ1, λ2, μ1, μ2, v, b, r > 0, 0 < b < 1 and 0 < θ < 1, according to the above expressions for aij we have a_{i(i−1)} = μ1(1−θ) + θμ1μ2/(μ2+λ1) > 0 for i = 2, …, N−1 and a_{N(N−1)} = μ1 > 0, so |Ã| ≠ 0.

From Theorem 1 and Eq. (27), we get
P̃1 = −v p0(0) ω Ã^{-1} − p1(1) r1 Ã^{-1}   (29)
Substituting Eq. (29) into (28), we get
p1(1) = [v ω Ã^{-1} r2 − λ0 (λ0/(v+λ0))^{N−1}] / (a1N − r1 Ã^{-1} r2) · p0(0) = c p0(0)   (30)
where c = [v ω Ã^{-1} r2 − λ0 (λ0/(v+λ0))^{N−1}] / (a1N − r1 Ã^{-1} r2) is a constant.
Substituting Eq. (30) into (29), we get
P̃1 = −p0(0) (v ω Ã^{-1} + c r1 Ã^{-1})   (31)

Let q = −v ω Ã^{-1} − c r1 Ã^{-1}, which is a 1×(N−1) row vector, so that P̃1 = p0(0) q. So
P1 = (p1(1), P̃1) = p0(0)(c, q)   (32)

V. PERFORMANCE MEASURES OF THE SYSTEM

A. The Probability That the Service Station Is in a Busy Period
PB = Σ_{n=1}^{N} p1(n) + Σ_{n=1}^{N} p2(n) = p0(0) [(c, q) eN − θμ1 (c, q) C2^{-1} eN]

Substituting Eq. (32) into (24), we get


P2 = 1 p0 ( 0 )( c, q ) C21

B. The Probability That the System Service Station During Vacation Period (33)
N N 1 P = p0 ( n ) = 1 A01 n +1 V n =0 n=0

Substituting Eq. (32) into (25), we get


P3 = bp0 ( 0 )( c, q ) D
1 3

C. The Average Waiting Queue Length of the System (34)

Substituting Eq. (16), (33) and (34) into (20), we get


p0 ( 0 ) + p0 ( 0 ) A01eN
+ p0 ( 0 )( c, q ) eN + 1 p0 ( 0 )( c, q ) C21eN

E ( Lq ) = np0 ( n ) + np1 ( n + 1)
N N n =1 n =1

+ np2 ( n + 1) + np3 ( n + 1)
n =1 n =1

+bp0 ( 0 )( c, q ) D e = 1
1 3 N

So
p0 ( 0 ) =

( c, q ) n +1 + 1 ( c, q ) C2 1 n +1 = n + N 1 1 n =1 +b ( c, q ) D3 n +1 A0 n +1 N 1

D. The Average Queue Length of the System


1
1 2 N 1 3 N

1 + A e + ( c, q ) eN + 1 ( c, q ) C e + b ( c, q ) D e
1 0 N

(35)

E ( L ) = np0 ( n ) + np1 ( n )
n =1 n =1

Substituting Eq. (35) into (16), (33) and (34), we get the matrix solution of P0 , P2 , P3 . In summary, we have the following theorem. Theorem 2. Probability matrix of the steady state solution is:
p0 ( 0 ) = P0 = A01
N

+ np2 ( n ) + np3 ( n )
n =1 n =1

= n A01 n +1 + N
n =1
+ n ( c, q ) n + 1 ( c, q ) C2 1 n + b ( c, q ) D31 n n =1

N 1
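The matrix-analytic solution above can be cross-checked numerically. The sketch below is a deliberately simplified stand-in (a plain M/M/1/N birth-death queue with assumed rates `lam` and `mu`; the vacation, balking, and second-optional-service features of the full model are omitted). It shows the same two steps: solve the balance equations $\pi Q = 0$ together with the normalization $\sum_n \pi_n = 1$, then read off performance measures such as the busy probability and the average queue length.

```python
# Minimal sketch: steady state of a plain M/M/1/N queue (a simplification of
# the paper's model -- no vacations or second optional service phase).
import numpy as np

def mm1n_steady_state(lam, mu, N):
    """Solve pi Q = 0 with sum(pi) = 1 for the birth-death chain on states 0..N."""
    Q = np.zeros((N + 1, N + 1))
    for n in range(N):
        Q[n, n + 1] = lam      # arrival moves the chain up
        Q[n + 1, n] = mu       # service completion moves it down
    np.fill_diagonal(Q, -Q.sum(axis=1))
    # Replace one (redundant) balance equation by the normalization condition.
    A = np.vstack([Q.T[:-1], np.ones(N + 1)])
    b = np.zeros(N + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

pi = mm1n_steady_state(lam=1.0, mu=1.0, N=5)
P_busy = 1.0 - pi[0]                       # server busy probability (cf. P_B)
EL = sum(n * p for n, p in enumerate(pi))  # average queue length (cf. E(L))
print(P_busy, EL)                          # approx. 5/6 and 2.5 when lam = mu
```

The paper's richer chain would replace the simple tridiagonal generator here with the block-structured one built from $A_0$, $C_2$, and $D_3$, but the solve-then-normalize pattern is the same.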


Let $\eta_n$ be a column unit vector of order $N$ whose $n$-th component equals one and whose other components equal zero.

VI. M/M/1/N QUEUING MODEL OF THE GRADUATE EMPLOYMENT SERVICES SYSTEM: A CASE STUDY

Based on the above analysis, we obtain the average waiting queue length, the average queue length, and some other state indicators of the graduate employment services system. However, management decision makers need to know not only the steady-state indicators but also how the parameters affect these state indicators, so that the queuing system can be made as close to optimal as possible. We take $N = 5$ as an example, with $\sigma_0 = \mu_1 = \mu_2 = 1$, $\lambda_2 = 1$, $b = 0.5$, $v = 1$, $r = 1$, and examine the impact of $\lambda_1$ and the other parameters on the average queue length of the system.
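Before turning to the figures, the qualitative behavior can be previewed with a short sweep. This is a hedged simplification (a plain M/M/1/N queue with N = 5 and a single service rate `mu`; the vacation, balking, and second-optional-service features are ignored), so the absolute values differ from the figures, but the trend matches: the mean queue length grows with the arrival rate, and the growth flattens as the finite buffer saturates.

```python
# Sweep the arrival rate and watch the mean queue length saturate (cf. Figs. 1-2).
# Plain M/M/1/N: the steady-state probabilities are proportional to (lam/mu)^n.
def mean_queue_length(lam, mu, N):
    rho = lam / mu
    p = [rho ** n for n in range(N + 1)]   # unnormalized probabilities
    total = sum(p)
    return sum(n * pn for n, pn in enumerate(p)) / total

for lam in (0.5, 1.0, 2.0, 4.0, 8.0):
    # E(L) increases with lam but stays below the buffer size N = 5.
    print(lam, round(mean_queue_length(lam, mu=1.0, N=5), 3))
```

The printed values rise monotonically toward (but never reach) N = 5, mirroring the "fast growth, then gradual slowdown" seen in the E(L) and E(Lq) curves.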

In Figure 1, we fix $\mu_1 = \mu_2 = 1$, $\sigma_0 = \lambda_2 = 1$, $v = 1$, $b = 1$, $r = 1$, and the probability of choosing the second service at 0.5, and consider how the expected waiting queue length changes as the users' arrival rate $\lambda_1$ changes. Looking at Figure 1, as $\lambda_1$ increases, the number of customers in the steady-state system increases. When $\lambda_1 < 3$, $E(L_q)$ changes faster; then, as $\lambda_1$ increases further, the growth of $E(L_q)$ gradually slows down.

In Figure 2, we fix the same parameters and consider how the average queue length changes as the arrival rate $\lambda_1$ changes. Looking at Figure 2, as $\lambda_1$ increases, the number of customers in the steady-state system increases. When $\lambda_1 < 3$, $E(L)$ changes faster; then, as $\lambda_1$ increases further, the growth of $E(L)$ gradually slows down.

Figure 1. The expected waiting queue length $E(L_q)$ vs. the arrival rate $\lambda_1$

Figure 2. The expected queue length $E(L)$ vs. the arrival rate $\lambda_1$

In Figure 3, we fix $\sigma_0 = \lambda_1 = \lambda_2 = 1$, $\mu_2 = 1$, $v = 1$, $b = 1$, $r = 1$, and the probability of choosing the second service at 0.5, and consider how the expected waiting queue length changes as the first busy service rate $\mu_1$ changes. Looking at Figure 3, as $\mu_1$ increases, the attendant serves faster and faster and the number of customers in the steady-state system decreases: $E(L_q)$ first decreases rapidly; when $\mu_1 > 6$, the decrease of $E(L_q)$ gradually slows down.

In Figure 4, we fix the same parameters and consider how the average queue length changes as the first busy service rate $\mu_1$ changes. Looking at Figure 4, as $\mu_1$ increases, $E(L)$ decreases rapidly at first and then reaches equilibrium.

Figure 3. The expected waiting queue length $E(L_q)$ vs. the first busy service rate $\mu_1$

Figure 4. The expected queue length $E(L)$ vs. the first busy service rate $\mu_1$

In Figure 5, we fix $\sigma_0 = \lambda_1 = \lambda_2 = 1$, $\mu_1 = \mu_2 = 1$, $v = 1$, $b = 1$, $r = 1$, and consider how the expected waiting queue length changes as the probability that users choose the second service changes. Looking at Figure 5, as the number of users who choose the second service increases, $E(L_q)$ shows a linearly increasing trend.

In Figure 6, we fix the same parameters and consider how the average queue length changes as the probability that users choose the second service changes. Looking at Figure 6, as the number of users who choose the second service increases, $E(L)$ shows a linearly increasing trend.

Figure 5. The expected waiting queue length $E(L_q)$ vs. the probability that users choose the second service

Figure 6. The expected queue length $E(L)$ vs. the probability that users choose the second service

In Figure 7, we fix $\sigma_0 = \lambda_1 = \lambda_2 = 1$, $\mu_2 = 1$, $b = 0.5$, $v = 1$, $r = 1$, and vary $\mu_1$ from 0.5 to 2.5 and the probability of choosing the second service from 0 to 1. Looking at Figure 7, as $\mu_1$ increases, the attendant serves faster and faster and the number of customers in the steady-state system decreases. When $\mu_1$ is fixed, as the probability of choosing the second service increases, the average queue length gradually increases.

Figure 7. The expected queue length $E(L)$ vs. the service rate $\mu_1$ and the probability of choosing the second service

Figure 8. The expected queue length $E(L)$ vs. the service rate $\mu_1$ and $b$

In Figure 8, we fix $\sigma_0 = \lambda_1 = \lambda_2 = 1$, $\mu_2 = 1$, the probability of choosing the second service at 0.5, $v = 1$, $r = 1$, and vary $\mu_1$ from 0.5 to 2.5 and $b$ from 0 to 1. Looking at Figure 8, as $\mu_1$ increases, the attendant serves faster and faster, with the steady-state system reducing


the number of customers. When $\mu_1$ is fixed, as $b$ increases, the average queue length gradually increases. Through the above analysis, we can obtain a clearer understanding of how the parameters affect the performance of the queuing system. Using this result, service providers can design a reasonable service rate and vacation service rate so that the queuing system can be made optimal.

VII. CONCLUSION

A queuing model can be applied to the employment services system and its design, and can be used to optimize the actual system according to its specific requirements. Queuing theory is suitable for analyzing and studying random phenomena such as the services of the employment service system. In this paper, the graduate employment services system queuing model can effectively assess the service situation and support decision-making with regard to the management and services of the university employment service system.

REFERENCES
[1] Sun Ronghuan and Li Jianping, The Basis of Queuing Theory, Beijing: Science Press, 2002, pp. 1-7.
[2] Zhang Rui, "Analysis of the queuing theory of service industry," Journal of Qiqihar University (Philosophy), vol. 6, pp. 41-43, 2002.
[3] R. W. Wolff, Stochastic Modeling and the Theory of Queues, New York: Prentice Hall, 2000, pp. 23-30.
[4] Meng Yuke, Basic and Applied Queuing Theory, Shanghai: Tongji University, 1989, pp. 117-120.
[5] Yang Feng and Liu Di, "The application of queuing theory to improve patient queue management," University Science Research, vol. 26, pp. 128-129, 2010.
[6] Liu Zhan and Xu Yange, "The application of queuing theory to eye hospital beds," China New Technologies and Products, vol. 15, p. 253, 2011.
[7] Huang Daming, Wen Bing, and Jiang Shunmei, "An optimized method for the loading/unloading system of port transportation based on queuing theory," Journal of Guangxi University (Nat. Sci. Ed.), vol. 34, pp. 781-786, 2009.

[8] Sun Zhonghui, "The application of queuing theory in the bank teller window," Operation and Management, vol. 6, pp. 20-21, 2010.
[9] Qin Li, "The application of queuing theory in the supermarket checkout service system," Modern Economy, vol. 10, pp. 7-8, 2009.
[10] Gao Yingying, Zhou Jingzhen, and Qian Ting, "The application of queuing theory in the library's service marketing," SCI-TECH Information Development & Economy, vol. 24, pp. 3-5, 2010.
[11] Kong Xiangping, "The application of queuing theory in the library circulation services system," The Library Journal of Shandong, vol. 2, pp. 88-90, 2010.
[12] K. C. Madan, "An M/G/1 queue with second optional service," Queueing Systems, vol. 34, pp. 37-46, 2000.
[13] J. Medhi, "A single server Poisson input queue with a second optional channel," Queueing Systems, vol. 42, pp. 239-242, 2002.
[14] D. Yue and Y. Zhang, "Optimal performance analysis of an M/M/1/N queue system with balking, reneging and server vacation," International Journal of Pure and Applied Mathematics, vol. 28, pp. 101-115, 2006.
[15] R. Tian, D. Yue, and L. Hu, "M/H2/1 queuing system with balking, N-policy and multiple vacations," Operations Research and Management Science, vol. 4, pp. 56-60, 2007.
[16] D. Yue and Y. Sun, "The waiting time of the M/M/1/N queuing system with balking, reneging and multiple vacations," Chinese Journal of Engineering Mathematics, vol. 5, pp. 943-946, 2008.
[17] D. Yue and Y. Sun, "The waiting time of the M/M/C/N queuing system with balking, reneging and multiple synchronous vacations of partial servers," Systems Engineering Theory & Practice, vol. 2, pp. 89-97, 2008.

Hui Zeng was born in January 1982 in Wangqing, China. She graduated from Yanshan University, China, where she received a Master's degree in science. She is mainly engaged in research in the area of queuing theory. She is a lecturer.


Staying Cable Wires of Fiber Bragg Grating/Fiber-Reinforced Composite


Jianzhi Li*
Shijiazhuang Tiedao University/Key Laboratory of Structural Health Monitoring and Control, Shijiazhuang, China Email: lijianzhigang@163.com

Yanliang Du and Baochen Sun


Shijiazhuang Tiedao University/Key Laboratory of Structural Health Monitoring and Control, Shijiazhuang, China Email: {sunbch, duyl}@stdu.edu.cn

Abstract—Intelligent hybrid fiber-reinforced plastics (IHFRP) with fiber Bragg grating (FBG) sensors embedded in the HFRP were produced in situ. The FBG/HFRP interface bonding is discussed. Experimental results show that the interfacial bonding of FBG/HFRP is good, which ensures that it can transfer the external load to the FBG sensor well. The novel IHFRP has a higher strain-testing precision compared to existing products, which is necessary to meet the requirements of staying cables. The survival rate of the FBG sensor is high due to the packaged sensor used during the manufacturing process. In addition, a simple theoretical model to estimate shear and peel-off stress is proposed. According to the simulation results, the maximum shear and peel-off stresses are located at the ends of the FBG sensor, and this stress is less than the interfacial adhesion strength of the matrix material. Moreover, the larger the diameter of the packaged sensor is, the more additional stress is induced.

Index Terms—hybrid composites, smart materials, structural composites, interface, stress/strain curves

I. INTRODUCTION

Fiber-reinforced plastics (FRPs) have become an ideal future material for staying cables due to their low weight, great strength, corrosion and fatigue resistance, and low coefficient of thermal expansion; these advantages attract scholars and engineers in various countries to study their properties [1-3]. In particular, a railway cable bridge, whose dynamic load and dynamic response are large, needs a cable material with high tensile strength, a high elasticity coefficient, and appreciable toughness and damping performance. Research results [4] show that composites with only one type of fiber cannot meet the requirements of mechanical strength and dynamic characteristics. However, through tailoring, hybrid fiber-reinforced plastics (HFRP) attain properties that the individual components do not have. Moreover, HFRP also reduces the cost. Composite cables with high levels of tensile strength, toughness, and damping, in addition to a large elastic modulus, are prepared to meet these demands. In addition, cable-health monitoring is a key issue. In

recent research, the fiber Bragg grating (FBG) has been proven to be the most promising fiber-optic strain sensor due to its elaborate strain-sensing capability [5-9]. Numerous efforts have been undertaken to produce practical FBG strain-gage systems, either mounted on the surface or integrated within the structure. These research experiments provide a reliable guarantee of achieving intelligent functioning of the HFRP cable. For example, tension-monitoring systems based on FBG are highly attractive due to their wavelength-division multiplexing features. Zhang et al. [10-13] have designed an FBG system suitable for real-time, online cable monitoring. However, the FBG sensor in the above-cited studies was mounted on the surface, which reduced the FBG reliability. Therefore, Zhou et al. [14-16] carried out numerous experiments involving the embedding of bare FBGs in FRPs. Nevertheless, the survival rate of the FBG is lower without a stainless steel package. In addition, the cable-health monitoring process must possess a higher precision, and the strain errors must be less than a few microstrains. However, the cross-sensitivity of bare FBGs between temperature and strain seriously affects the accuracy of the strain-testing process. Therefore, increasing the test precision of intelligent hybrid fiber-reinforced plastics (IHFRP) wires is of utmost importance. In this study, we manufactured a novel IHFRP wire with a temperature-compensated FBG sensor embedded during the online production of the HFRP; moreover, the encapsulation of the bare FBG effectively protected the sensors from hostile attack. The novel IHFRP has a higher strain-testing precision compared to existing products, which is necessary to meet the requirements of staying cables. Meanwhile, a simple theoretical model to estimate shear and peel-off stress is proposed. This study will be an important guide to the further development of IHFRP cables.

II. MODELING AND SIMULATION

Whenever embedded in a material, an optical fiber is regarded as a foreign entity by the host structure, which inevitably perturbs the intrinsic structural morphology in a local continuum. Embedded FBG strain gages experience lateral response-induced radial and axial

*Corresponding author, Tel: 86-311-87936614; Fax: 86-311-87935012


Email address: lijianzhigang@163.com (Jianzhi Li)

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2184-2191


stresses. Thus, due to the mismatch of the mechanical properties between the embedded sensor and the host structure, when a load is applied to the structure along the sensor's longitudinal direction, the lateral mechanical response of the structure may induce further local stress on the embedded FBG sensor and, hence, may induce interfacial debonding. Therefore, referring to Fig. 1, we modeled an anisotropic structure with an embedded FBG sensor, which was developed by our research group [17-19]. A three-dimensional finite element model, with an FBG-sensor angle of 36°, was constructed, as seen in Fig. 2. Different values were assigned to the mechanical properties of the structure, while an identical strain was loaded onto the structure along the direction of the embedded FBG sensor, so that the strain evolution in the HFRP in the vicinity of the embedment could be evaluated along with the changes in the structural properties. The structural property parameters are listed in Table I, together with those of the FBG sensor.

Figure 1. Concentric cylinder structure model

Figure 2. The 3D finite element model with the angle of 36°

TABLE I. MECHANICAL PROPERTIES OF STRUCTURAL ELEMENTS

              EZ (MPa) [a]       t (mm) [b]        PRXZ [c]
FBG sensor    200                0.2/0.25/1/1.5    0.3
IHFRC         200/210/220/245    11                0.28

[a] EZ represents the elastic modulus along the z-direction (loading direction) of the FBG sensor and IHFRC. [b] t represents the thickness of the cylindrical sensor and the diameter of the rod. [c] PRXZ represents the major Poisson's ratio between the x- and z-directions.

A. The Influence of Elastic Modulus on Shear Stress

According to the numerical simulation result in Fig. 3, the maximum circumferential stress (6.98 MPa) is located at both ends of the sensor, namely along the L1 direction, and is less than the adhesive strength between the sensor and the epoxy resin (20 MPa). In addition, the shear stress declines rapidly away from the sensor end. From Fig. 4, as the modulus of the sensor increases, the shear stress along L1, L2, and L3 increases from 0 MPa to 6.98 MPa. From Fig. 5, the interfacial stress decreases gradually along the radius of the packaged FBG sensor, namely along the L2 direction; for example, the stress located in the core of the FBG sensor is far lower than that at the outer edge of the FBG sensor. From Fig. 6, the interfacial stress declines rapidly away from the sensor end, which is identical to the theoretical results. The above simulation results agree with the theoretical model. According to the simulation results, we can conclude that, due to the elastic modulus mismatch between the FBG and the host structure, the embedded sensor does experience an additional stress field when a load is applied along the sensor direction; the greater the mismatch, the greater the additional stress field induced.

Figure 3. Shear stress contour (Ecom=245 MPa, D=2 mm, d=1.6 mm, t=0.2 mm)

Figure 4. Shear stresses on L1 path for different elastic moduli

Figure 5. Shear stresses on L2 path for various elastic moduli


Figure 6. Shear stresses on L3 path for various elastic moduli

B. The Influence of the Thickness of the Packaged Sensor on the Shear Stress

From Fig. 7, we can conclude that, as the thickness of the sensor increases, the shear stress along L1, L2, and L3 increases from 7.13 MPa to 20.7 MPa. Thus, the thickness of the packaged sensor must be less than 3 mm. From Fig. 8, it is shown that the stress decreases significantly along the radius of the FBG sensor; for example, the stress located in the core of the FBG sensor is far lower than that at the outer edge. Hence, as long as the thickness of the FBG sensor is less than a certain value, interfacial debonding between the FBG sensor and the HFRP is not induced during the working period of the stayed cable. Therefore, based on the above analysis, the embedded sensor does not affect the mechanical properties of the IHFRP. According to the simulation, we can conclude that the thickness of the packaged sensor must be less than 3 mm; the larger the thickness, the more additional stress is induced; in addition, the core stress is far lower than the outer stress of the FBG sensor. Accordingly, taking the safest case, the diameter of the designed FBG sensor in this study is 2 mm.

C. Interface Stress Analysis between the Designed Sensor and the IHFRP

According to the numerical simulation results shown in Fig. 9, the maximum shear stress (16.6 MPa) is located at both ends of the sensor and is less than the adhesive strength between the sensor and the epoxy resin (20 MPa). In addition, the shear stress rapidly declines away from the sensor end. Hence, interfacial debonding between the FBG sensor and the HFRP is not induced during the working period of the staying cable wire. Figs. 10-12 show the shear-stress curves along the L1, L2, and L3 paths for three typical cases of different sensor diameters. The interfacial stress declines rapidly away from the sensor end, which is identical to the theoretical results in Fig. 10.
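The end-peaked interfacial shear described above can be illustrated with a classical Cox-type shear-lag profile. This is not the authors' finite element model: the functional form, the load-transfer parameter `beta`, and the sensor length used below are illustrative assumptions only.

```python
# Illustrative Cox-type shear-lag profile along an embedded sensor of length L:
#   tau(x) ~ sinh(beta * x) / cosh(beta * L / 2),  x in [-L/2, L/2].
# The interfacial shear is zero at mid-length and maximal at the two ends,
# matching the qualitative picture from the simulations.
import math

def shear_profile(L, beta, npts=101):
    xs = [-L / 2 + L * i / (npts - 1) for i in range(npts)]
    tau = [math.sinh(beta * x) / math.cosh(beta * L / 2) for x in xs]
    return xs, tau

# L = 10 mm sensor (as in this study); beta is an assumed load-transfer parameter.
xs, tau = shear_profile(L=0.01, beta=800.0)
peak = max(abs(t) for t in tau)
print(peak, abs(tau[len(tau) // 2]))  # peak magnitude at the ends, ~0 at the center
```

The rapid decay of |tau(x)| away from the ends is the shear-lag analogue of the "shear stress rapidly declines away from the sensor end" observation in the finite element results.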

Figure 7. Stress contour of sensor model ((a) Ecom=245 MPa, D=3 mm, d=2.6 mm, t=0.2 mm; (b) Ecom=245 MPa, D=3 mm, d=2 mm, t=0.25 mm; (c) Ecom=245 MPa, D=3 mm, d=1 mm, t=1 mm; (d) Ecom=245 MPa, D=3 mm, d=0 mm, t=1.5 mm)


Figure 8. Shear stresses on L3 path for various sensor thicknesses (t=0.2 mm, 0.5 mm, 1 mm, 1.5 mm)

From Fig. 11, it is shown that the stress gradually decreases along the radius of the FBG sensor; for example, the stress in the core of the FBG sensor is far lower than that at the outer edges. In the case of a 2-mm-diameter sensor, the core stress declines to 1.85 MPa. The maximum shear stress is located at both ends of the sensor (Fig. 10). The maximum shear stress of the FBG sensor with 2-mm diameter is 16.6 MPa, which is less than the bond strength between the sensor and the epoxy resin (20 MPa). In addition, the shear stress rapidly declines away from the sensor end. Hence, as long as the diameter of the FBG sensor is less than 2 mm, interfacial debonding between the FBG sensor and the HFRP is not induced during the working period of the staying cable. Therefore, based on the above analysis, the embedded sensor does not influence the mechanical properties of the IHFRP.

Figure 9. Shear stress contour of sensor model (Ecom=245 MPa)

Figure 10. Shear stress curve on L1 path for composites (Ecom=245 MPa, D=2 mm)

Figure 11. Shear stress curve on L2 path for composites (Ecom=245 MPa, D=2 mm)

Figure 12. Shear stress curve on L3 path for composites (Ecom=245 MPa, D=2 mm)


III. EXPERIMENTS AND RESULTS

A. FBG Optical Fiber Sensor Principle and Other Experimental Materials

An FBG is written into a photosensitive optical fiber by modulating the core refractive index periodically using the interference pattern created by ultraviolet light through a phase mask. When broadband light is transmitted through the optical fiber, the FBG written in the core reflects back a wavelength determined by the Bragg condition

$$\lambda_B = 2 n_e \Lambda \qquad (1)$$

where $n_e$ is the effective refractive index of the core and $\Lambda$ is the grating period. The shift in the reflected wavelength of the FBG sensor is approximately linear in any applied strain or temperature. Therefore, the Bragg wavelength shift $\Delta\lambda_B$ caused by a change of strain and a change of temperature $\Delta T$ can be expressed as

$$\frac{\Delta\lambda_B}{\lambda_{B0}} = \frac{\lambda_B - \lambda_{B0}}{\lambda_{B0}} = (1 - p_e)\,\varepsilon_z + \left(\alpha_f + \xi\right)\Delta T \qquad (2)$$

where $\lambda_{B0}$ is the Bragg wavelength at a reference state, and $p_e$ and $\xi$ are the strain-optic and thermo-optic coefficients of the fiber, respectively. A typical value is

$$\alpha_f + \xi \approx 7.5 \times 10^{-6} \qquad (3)$$

The strain-optic coefficient is given by

$$p_e = \frac{n_{eff}^2}{2}\left[p_{12} - \nu\left(p_{11} + p_{12}\right)\right] \qquad (4)$$

with the typical value

$$p_e \approx 0.22 \qquad (5)$$

When the FBG is embedded into a host material and both experience temperature changes, the Bragg equation is modified to account for the thermally induced axial strains in the fiber that result from the mismatch of the coefficients of thermal expansion between the optical fiber ($\alpha_f$) and the host material ($\alpha_m$). Therefore, a more generalized equation can be written as

$$\frac{\Delta\lambda_B}{\lambda_{B0}} = \frac{\lambda_B - \lambda_{B0}}{\lambda_{B0}} = (1 - p_e)\,\varepsilon_z + \left(\alpha_m - \alpha_f\right)\Delta T + \left(\alpha_f + \xi\right)\Delta T \qquad (6)$$

In the case where the embedded sensor and the host material are not subjected to any temperature variation, this can be further simplified to the following form:

$$\frac{\Delta\lambda_B}{\lambda_{B0}} = (1 - p_e)\,\varepsilon_z \qquad (7)$$

The Bragg wavelength is also affected by temperature changes. The relative change in the Bragg wavelength due to a temperature change is expressed as

$$\frac{\Delta\lambda_B}{\lambda_{B0}} = \left(\alpha_f + \xi\right)\Delta T \qquad (8)$$

The FBG used in this study is 10 mm long, with a reflectivity of more than 90% and a bandwidth of 0.2 nm. The packaged FBG sensor with a temperature-compensation function was made by our team; therefore, the wavelength does not shift with temperature changes. The FBG sensor used in this study is 2 mm in diameter and 10 mm in length. Moreover, the experimental materials included carbon fiber (Toray, Japan, 6K, M40J), glass fiber (Nanjing Fiberglass Research and Design Institute, 240 tex), aramid fiber (Kevlar-29, USA), E-618 epoxy resin, and methyl tetrahydrophthalic anhydride (curing agent). The material properties are shown in Table II.
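Eqs. (1), (7), and (8) can be checked with a few lines of arithmetic. In the sketch below, the effective index, grating period, and reference wavelength are illustrative assumptions (not values from this paper), while p_e ≈ 0.22 and α_f + ξ ≈ 7.5 × 10⁻⁶ are the coefficients quoted above.

```python
# Eq. (1): Bragg condition; Eq. (7): strain response; Eq. (8): temperature response.
n_e = 1.447                # effective core refractive index (assumed value)
grating_period = 535.6e-9  # grating period in metres (assumed value)
p_e = 0.22                 # strain-optic coefficient (from the text)
alpha_plus_xi = 7.5e-6     # alpha_f + xi, per deg C (from the text)

lambda_B0 = 2 * n_e * grating_period                  # Eq. (1), lands near 1550 nm
shift_per_microstrain = lambda_B0 * (1 - p_e) * 1e-6  # Eq. (7), metres per microstrain
shift_per_degC = lambda_B0 * alpha_plus_xi            # Eq. (8), metres per deg C

print(round(lambda_B0 * 1e9, 1))               # Bragg wavelength in nm
print(round(shift_per_microstrain * 1e12, 2))  # ~1.21 pm per microstrain
print(round(shift_per_degC * 1e12, 2))         # ~11.63 pm per deg C
```

These assumed values give a strain sensitivity near 1.2 pm per microstrain at 1550 nm, the usual rule of thumb for a bare FBG.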

TABLE II. THE PERFORMANCE PARAMETERS OF VARIOUS FIBERS

Fiber         Country   Strength/MPa   Modulus/GPa   Elongation/%   Density/(g/cm3)   Diameter/um
carbon fiber  Japan     4400           377           1.2            1.77              6
Kevlar-29     America   2800           63            3.6            1.45              12
glass fiber   China     4018           83.3          5.7            2.54              10

B. Specimen Fabrication

A cable is a one-dimensional rod suitable for the pultrusion process. The molding temperature was approximately 160 °C. To cure the resin completely, an online curing process was necessary after the molding; its temperature was about 170-180 °C, which is higher than the glass-transition temperature. The pultrusion speed and molding pressure were, respectively, 0.1 m/min and 0.4-0.5 MPa. To carry out a thorough investigation of the interface adhesion and the sensing properties of the intelligent HFRP with an embedded FBG strain gauge, two composite specimens were fabricated and studied, as shown in Fig. 13. Furthermore, to assess and characterize the overall behavior of the embedded FBG sensors, mechanical testing of the two pultruded IHFRPs was carried out by applying various loads to the cable wires while continuously monitoring the strain through the embedded FBG sensors and a standard extensometer clipped onto the pultruded rod. In addition, analysis of the interfacial adhesion between the FBG sensor and the HFRP was carried out by scanning electron microscopy (SEM).

C. Results and Discussion

The FBG sensor was embedded in an HFRP wire during molding. Encapsulation contributes greatly to the survival rate of the bare FBG during manufacture. Moreover, the


stress-sensing experiments show that the sensor can accurately reflect the wavelength changes. A photograph of an entire cross section of the cable is shown in Fig. 14, and an amplified SEM image of the FBG/HFRP interface is shown in Fig. 15. Figures 14 and 15 show that the diameter of the FBG sensor used in this study is approximately 2 mm and that the FBG sensor/HFRP interface bonding is good. Therefore, the interface can transfer the external load to the FBG well; moreover, the FBG sensor is able to accurately reflect the loading environment, which provides a reliable guarantee for achieving the intelligent functions of the IHFRP cable wire. Figures 16 and 17 show the stress-sensing curves of two HFRP rods with embedded FBG sensors. The Bragg wavelength of the sensor is linearly dependent on its axial strain, and the strain-sensitivity values of the two intelligent rods are, respectively, 1.35 and 1.4 pm/με, which tally with the theoretical strain sensitivity of the FBG sensor and are higher than those of the IHFRP rods in existing products. This also shows that the FBG can reflect the external loading environment accurately.
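As a usage sketch, the calibrated sensitivities of the two rods (1.35 and 1.4 pm/με) turn a wavelength shift read from an interrogator into axial strain; the 270 pm shift below is a made-up example value, not a measurement from the paper.

```python
# Convert an FBG wavelength shift into axial strain using the calibrated
# sensitivity of each intelligent rod (1.35 and 1.4 pm per microstrain).
sensitivities_pm_per_ue = {"rod_1": 1.35, "rod_2": 1.40}

def strain_from_shift(shift_pm, rod):
    """Return the axial strain in microstrain for a measured shift in picometres."""
    return shift_pm / sensitivities_pm_per_ue[rod]

shift_pm = 270.0  # hypothetical measured Bragg wavelength shift, in pm
print(round(strain_from_shift(shift_pm, "rod_1"), 1))  # -> 200.0 microstrain
```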

Figure 13. Photo of pultruded intelligent stayed cable wire

Figure 14. Cross-section photo of intelligent wire

Figure 15. SEM images of the interface between FBG and HFRP at different magnifications

Figure 16. Strain sensing properties of 1# intelligent HFRP rod embedded with FBG sensor

Figure 17. Strain sensing properties of 2# intelligent HFRP rod embedded with FBG sensor


IV. CONCLUSIONS

A simple theoretical model to estimate both shear and peel-off stress is proposed. The simulation results show: 1) Due to the mismatch of mechanical properties between the FBG and the host structure, particularly their elastic moduli, the embedded sensor does experience an additional stress field when a load is applied along the sensor direction; the greater the mismatch, the greater the additional stress field induced. 2) The interfacial stress decreases significantly along the radius of the FBG sensor for the various package thicknesses; the larger the thickness, the more additional stress is induced. 3) The maximum shear stress is 16.6 MPa for the FBG sensor with a 2-mm diameter, which is less than the interfacial adhesion strength. Neither fracture nor debonding was observed in the IHFRP rod produced by our team. In addition, the novel IHFRP produced in this study has a higher strain-test precision and can satisfy the cable-monitoring requirements. Neither fracturing nor interface debonding was observed during specimen fabrication and stress measurements, which proved the survivability of the embedded FBG sensor. The interface between the FBG and the HFRP is intact, so it can transfer the external load to the FBG well and reflect the external mechanical environment accurately. The Bragg wavelength of the packaged sensor is linearly dependent on its axial strain, and the strain-sensitivity values of the two intelligent rods are, respectively, 1.35 and 1.4 pm/με, which tally with the theoretical strain sensitivity of an FBG sensor and are higher than those of the intelligent FRP rods found in existing products.

ACKNOWLEDGMENTS

The authors would like to thank their collaborators and acknowledge the financial support of the National Natural Science Foundation of China (50778116), the Natural Science Foundation of Hebei Province (E2011210058), and the Education Department of Hebei Province.

REFERENCES
[1] C. Dumlao, K. Lauraitis, E. Abrahamson, B. Hurlbut, M. Jacoby, A. Miller, et al., "Demonstration of low-cost modular composite highway bridge," Proceedings of the First International Conference on Composites in Infrastructure, Arizona, USA, pp. 1141-1145, 1996.
[2] N. Taly, Design of Modern Highway Bridges, New York: McGraw Hill, pp. 20-25, 1998.
[3] S. Meiarashi, I. Nishizaki, and T. Kishima, "Life-cycle cost of all-composite suspension bridge," J. Compos. Constr., vol. 6, pp. 206-213, 2002.
[4] Dai Depei, Engineering Application of Damping Technology, Beijing: Tsinghua University Press, pp. 86-88, 1991.
[5] J. Frieden, J. Cugnoni, J. Botsis, and T. Gmür, "Low energy impact damage monitoring of composites using dynamic strain signals from FBG sensors - Part II: Damage identification," Composite Structures, in press.
[6] J. Frieden, J. Cugnoni, J. Botsis, and T. Gmür, "Low energy impact damage monitoring of composites using dynamic strain signals from FBG sensors - Part I: Impact detection and localization," Composite Structures,

in press.
[7] J. Frieden, J. Cugnoni, J. Botsis, and T. Gmür, "Vibration-based characterization of impact induced delamination in composite plates using embedded FBG sensors and numerical modeling," Composites Part B: Engineering, vol. 42, pp. 607-613, June 2011.
[8] J. Frieden, J. Cugnoni, J. Botsis, T. Gmür, and D. Coric, "High-speed internal strain measurements in composite structures under dynamic load using embedded FBG sensors," Composite Structures, vol. 92, pp. 1905-1912, July 2010.
[9] C. Rodrigues, C. Félix, A. Lage, and J. Figueiras, "Development of a long-term monitoring system based on FBG sensors applied to concrete bridges," Engineering Structures, vol. 32, pp. 1993-2002, August 2010.
[10] Zhang Xushe and Ning Chenxiao, "Application of fiber Bragg grating sensors on monitoring of cables' tension," 2007 8th International Conference on Electronic Measurement and Instruments (ICEMI), pp. 4232-4235, 2007.
[11] Zhang Xushe, Du Yanliang, Jin Xiumei, and Sun Baochen, Journal of Traffic and Transportation Engineering, vol. 3, pp. 22-24, 2003.
[12] Zhang Xushe, Du Yanliang, and Sun Baochen, "Study on monitoring of cables' tension with fiber Bragg sensor," China Safe Science Journal, vol. 14, pp. 98-100, 2004.
[13] Du Yanliang, Shao Lin, Li Jianzhi, and Sun Baochen, "Study on the intelligent hybrid composites suitable for stayed cable," Journal of Functional Materials, vol. 39, pp. 282-286, 2008.
[14] Zhou Zhi, Zhang Zhichun, Deng Nianchun, Zhao Xuefeng, Li Dongsheng, Wang Chuang, and Ou Jinping, "Applications of FRP-OFBG sensors on bridge cables," Proceedings of SPIE - The International Society for Optical Engineering, vol. 5765, pp. 668-677, 2005.
[15] Zhou Zhi, Zhou Hui, Huang Ying, and Ou Jinping, "R and D of smart FRP-OFBG based steel strand and its application in monitoring of prestressing loss for RC," Proceedings of SPIE - The International Society for Optical Engineering, vol. 6933, pp. 693313-693316, 2008.
[16] Yu Fan and Mojtaba Kahrizi, "Characterization of a FBG strain gage array embedded in composite structure," Sensors and Actuators A: Physical, vol. 121(2), pp. 297-305, 2005.
[17] Li Jianzhi, Du Yanliang, and Liu Chenxi, "FBG strain sensor based on line forming," Optics and Precision Engineering, vol. 17(9), pp. 2069-2075, 2009.
[18] Li Jianzhi, Du Yanliang, and Liu Chenxi, "FBG strain sensor based on thermal stress mechanism," Proceedings of the 2nd International Symposium on Intelligent Information Technology Application, vol. 1, pp. 640-642, 2008.
[19] Du Yanliang, Li Jianzhi, and Liu Chenxi, "A novel fiber Bragg grating temperature compensated strain sensor," Proceedings of the 1st International Conference on Intelligent Networks and Intelligent Systems, vol. 1, pp. 569-572, 2008.

Jianzhi Li was born in Dingzhou, China, on 13 April 1978. She received her Bachelor's degree in material science from the Guilin University of Technology in 1997. She was awarded the degree of Master of material science at Wuhan University of Technology and completed her PhD studies in 2009, graduating from Beijing Jiaotong

2012 ACADEMY PUBLISHER

JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012


University. Her research interests are strain measurement monitoring systems based on FBG (fiber Bragg grating) sensors and intelligent materials. She joined Shijiazhuang Railway University in 2004 after graduating from the Fiber Optic Sensing Technology Center, Wuhan University of Technology, undertaking research in novel optical instrumentation, especially fiber optic sensor development for physical sensing. This work has led to several fields including FBG-based strain and temperature sensor systems, intelligent materials, and smart structures. The work has been extensively published in the major journals and at international conferences in the field.

Yanliang Du was born in Shenze city of China on 10 October 1956. He was awarded the degree of doctor in 1999, graduating from Beijing University of Aeronautics and Astronautics. His research interest is strain measurement monitoring systems based on FBG (fiber Bragg grating) sensors and intelligent materials. Professor Du is currently vice-president of Shijiazhuang Railway University.

Baochen Sun was born in Zhuzhou city of China on 29 November 1961. He was awarded the degree of master in 1982, graduating from Northeast Heavy Machinery Institute. His research interest is strain measurement monitoring systems based on FBG (fiber Bragg grating) sensors and intelligent materials. Professor Sun is currently deputy dean of the Key Laboratory of Structural Health Monitoring and Control at Shijiazhuang Railway University.


Research and Development of Intelligent Motor Test System


Li Li
College of electrical and information engineering, Hunan Institute of Engineering, Xiangtan, China Email: xu9371@126.com

Jian Liu and Yuelong Yang


College of electrical and information engineering, Hunan Institute of Engineering, Xiangtan, China Email: whuliujian@hotmail.com, yyl1206@126.com

Abstract—This paper presents an intelligent motor test system designed to adapt to the development trend of larger data scales and more complicated acquisition environments, and to solve the low-efficiency problem of motor testing devices. It describes the scheme of the computer automatic data acquisition and processing system, the application program structure, and the distributed network group control strategy of the intelligent motor testing system. The network group control system performs automatic switching and automatic control under the various unit running conditions of motor type testing. The paper also describes in detail the equivalent circuit algorithm for calculating motor working characteristics and the non-linear LSM (least square method) data fitting approach for the experiment curves. The system takes full advantage of hybrid programming technology in interface development, and of period tracking technology, which solves the problems of measurement errors and low efficiency. It has been successfully applied in several motor manufactories.

Index Terms—nonlinear least square method, curve fitting, motor testing, intelligent

I. INTRODUCTION
Recently, with the rapid development of computer applications, the computer-based Automatic Testing System (ATS) for electrical motors has been greatly popularized [1]. Since the computer-based ATS has remarkable advantages in test functionality, measurement accuracy and other performance indexes, it has gradually replaced the traditional manual method and brought motor testing technology into a new stage. At present, computer-based ATS can be divided into two kinds: PLC-based systems and PC-based systems. In a PLC-based system, process control of the test is realized by the hardware platform, while the computer system only handles data processing and curve plotting. For example, the automatic motor testing system developed by Westinghouse Electric Corporation [2] is a typical PLC-based ATS. In [7], a PLC is adopted to act as both a local and a remote controller for the motors. For
This work is supported by 2009 Science and Technology Achievement Cultivation Project of Hunan province (09CY018)

a PC-based system, the personal computer controls not only the data processing but also the data acquisition and the entire testing process, and the PLC solely executes the commands from the PC. Research institutes such as the Shanghai Institute of Motor Technology, China, have achieved many results in testing systems for induction motors and PM motors. In [3], a distributed intelligent motor type test system based on ActiveX and COM techniques is introduced. In [4], a computer-aided type test system with the PC as the measure and control center is presented. In [5], a novel USB-based detecting system for a high-power propulsion motor is proposed. In practice, the 300-type process control computer manufactured by Siemens for the Munich University motor lab was an early design that greatly simplified the measurement of parameters in motor tests. The integrated motor performance test machines MDP101 and MDP102, designed by the Japan International Test Corporation, can automatically run the desired test projects and implement the data processing. Meanwhile, an expert system with automatic diagnosing ability for motor testing appeared in [6]. Although PC-based ATS has been widely applied in motor testing, this paper proposes a novel PC-based distributed group network system for motor testing, which uses touch-screen interfaces, PLC-based control, and configuration-software analog meter displays. With the integration of the power testing unit and the machine control unit, this paper describes in detail a scheme for designing the distributed network control system. By using hybrid programming technology, the problems of a large number of test projects, heavy workload and large testing errors are well solved. The computer-aided intelligence system is developed by fitting motor measurement curves with the non-linear least square method, and has been successfully applied in a middle-size motor testing system designed for the XTE Motor Manufactory.
The system can be used for both type tests and delivery tests of various motor categories, including asynchronous, synchronous, DC and special motors. Moreover, the proposed scheme can be reconfigured according to user requirements and be used to intellectualize the motor testing process.

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2192-2199


II. COMPOSITION OF THE MOTOR TESTING SYSTEM
The proposed motor network testing system is shown in Fig.1. It is mainly composed of a data acquisition and processing subsystem and unit drive control subsystems [7]. By integrating the testing power part and the controller part, the unit drive control subsystem performs as a distributed network control system. It provides stable DC excitation power to the units during the motor test and switches between different running states according to the test requirements. In the automatic data acquisition and processing subsystem, the acquired electrical and non-electrical quantities are sent to the PC through a field bus and preprocessed. The data are then processed through hybrid MATLAB/VB programming and a non-linear fitting method; the results are stored in the computer database and output by printer [8].

tachometer (CA27) and so on. The test circuit is remarkably simplified by the application of intelligent instruments with RS485 interfaces, which allow remote control [9,10]. The design requires analyzing the types, ranges and wiring requirements of the signals. The measured signals of the system include the three-phase voltage, three-phase current and rotary speed of the intermediate frequency generator, and the three-phase voltage and current and the DC excitation voltage and current of the vice exciter. The effective value of the AC voltage varies between 0 and 300 V for the intermediate frequency generator and 0~30 V for the vice exciter, respectively. The DC voltage range is 0~30 V; the AC current effective-value ranges are 0~100 A and 0~2 A, and the DC current range is 0~2 A. The amphibious AC/DC sensors used in this system are as follows: the SX1T500V050V7 voltage sensor (input 0~500 V, output 0~50 mA), used for the measurement of the voltage signals; the SE1T100C50V6 current sensor (input 0~100 A, output 0~50 mA), used for the measurement of the generator three-phase current; and the SG1T5C25V6 current sensor (input 0~5 A, output 0~25 mA), used for the measurement of the three-phase line current and DC excitation current of the vice exciter.

A. Period Tracking Technology
In an automatic motor test system, when the motor under test is a generator, the frequency of its output voltage is not constant [11]. If the sampling period Ts and the number of samples N per period are both fixed, the N samples will exceed or fail to cover one signal period, which introduces a large error. In our system, the sampling frequency is fixed while the actual sample count N is changed in real time to follow the variation of the voltage period T. Thus, the measurement error caused by period fluctuations of the signal is eliminated [12].
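The period-tracking idea above (fixed sampling rate, sample count recomputed from the measured period) can be sketched as follows. This is a minimal illustration with our own function and variable names, not the system's code:

```python
import numpy as np

def rms_period_tracked(samples, fs, f_est):
    """RMS over a whole number of signal periods at a fixed sampling
    rate fs. Instead of fixing the sample count N, N is recomputed from
    the measured period T = 1/f_est, so the analysis window always
    covers complete periods and the truncation error described above
    disappears. In the real system f_est would come from a period or
    zero-crossing measurement; here it is simply an argument.
    """
    T = 1.0 / f_est
    N = int(round(T * fs))            # samples per period at this frequency
    M = (len(samples) // N) * N       # largest whole number of periods
    x = np.asarray(samples[:M], dtype=float)
    return np.sqrt(np.mean(x ** 2))
```

For example, for a 50 Hz signal sampled at 10 kHz, N = 200, so a 1037-sample buffer is trimmed to 1000 samples (exactly five periods) before the RMS is computed.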
According to the frequency characteristics of the tested motors, the highest harmonic order to be analyzed is set to 30 and the highest signal frequency is 12 kHz. To improve real-time performance, sampling should be completed within two fundamental periods of the sampled signal. According to the non-integer-period sampling theory of periodic signals, the required sampling frequency for 10-channel parallel sampling is 254 kHz. This requirement is met by the adopted data acquisition card, a PCI9118 DAQ card [13].

B. Adoption of Intelligent Instruments and Meters
Both the type tests and delivery tests of motors include measuring current, voltage, network frequency, revolution speed and temperature. Intelligent instruments and meters with an RS485 interface have the advantages of remote controllability, isolation of the internal CPU system from external inputs, and robust anti-interference ability, thus simplifying the circuit. For example, the selection of the dual-channel intelligent digital display controller WP-MD807-200-1212 avoids using

Fig.1. Block diagram of the test system: 1 sensor and transmission circuit; 2 signal conditioning; 3 data acquisition card; 4 ACCESS database; 5 MATLAB and VB hybrid programming; 6 intelligent instrument array; 7 data server; 8 enterprise management network

III. DESIGN OF AUTOMATIC DATA ACQUISITION AND PROCESSING SYSTEM
In order to improve the measurement accuracy of the Voltage Harmonic Distortion (VHD) rate, the signals are first processed by the signal conditioning circuit and then sent to the DAQ card. After that, the testing data are processed by the PC and the results are either displayed or printed. The test-beds with embedded PCs form a TCP/IP LAN, and the data server is linked to the enterprise management network. In the system, the motor's electrical signals are measured by intelligent instruments, such as the two-way intelligent digital display controller (WP-MD807-200-1212), the industrial-grade digital

several kinds of meters. To satisfy the requirements of the test, the intelligent instruments and meters chosen in our system include: the technical-grade digital tachometer CA27 (with an RS-232 interface), a multi-channel temperature itinerant detector, the programmable 5 kV digital insulation measuring apparatus HT7050, the vibration-measuring analyzer VA-11, the noise spectrum analyzer AWA6270, the digital micrometer PC9A-1, the intelligent frequency display controller WP-LEQN1C6E1, etc. [14].

IV. DESIGN OF THE UNIT DRIVE CONTROL SYSTEM
In the design of the unit control system, we adopt a fully distributed control method with centralized management and distributed control of the units [15]. The 5 sets of excitation power supplies in the test station are separately controlled in closed-loop form by their corresponding control units, and each control cabinet supports both local and remote control modes. Meanwhile, the DC speed regulation system (6RA70) is configured as a complete double closed loop. During system operation, the upper computer is in charge of the unified management of the 5 sets of excitation power supplies and the pair of speed regulation systems. Through the PLC analog module, the upper PC sends control commands to the lower computers remotely, and each lower computer controls its power subsystem locally [16]. The operation parameters of each subsystem are sent back through the detection interface and displayed on the touch screen. The whole motion control system is composed of the PLC, the touch screen, the excitation control cabinets and the speed regulation subsystem in a distributed network control structure.

A. Design of Test Power System
The unit drive control system is composed of 5 motors, 5 suites of excitation power sources with MCUs, and a full-digital DC speed regulating device (Siemens 6RA70). The interfaces of the test project are shown in Fig.2.
In Fig.2, the notation 117D denotes the Siemens full-digital DC speed regulator 6RA70, B is the motor to be tested, P is the companion motor, TD is the AC synchronous motor, TF is the AC synchronous generator, ZF denotes the DC motor, LT is the exciter current feedback, and LV is the voltage feedback.

The project is tested at different operating frequencies such as 50 Hz and superposed frequency. The running state differs from test to test, as do the feedback signals. In manual control mode, inconvenient operations such as repeated adjustments are sometimes required, with a heavy workload. Taking into account the complexity of unit control and the need to switch running states during testing, closed-loop control is adopted for the motors' excitation. The tests use two DC SCR cabinets, two synchronous-motor separate-excitation cubicles, and a synchronous-motor separate-excitation SCR cabinet to control the excitation of the motors. The upper-level monitor management system is composed of the touch screen and the PLC, which automatically switches between different running states and control modes.

B. Composition of Unit Drive Control System
The Siemens full-digital DC speed regulator 6RA70 is very powerful and is commonly used in single-machine dual-closed-loop control systems. In our design, the control structure and the parameters of the test objects are essentially the same, so control parameter adjustment and performance tuning are very convenient. There are two different working modes for the speed regulating system: (1) stable voltage mode, which supplies armature voltage for two motors in double-motor drive; (2) speed regulating mode, which stabilizes ZF1's speed in single-motor drive. It would be a waste of hardware to use two power sources to meet the application requirements. Through demonstration and analysis, during double-unit drive control we use one Siemens full-digital DC speed regulator (6RA70) for the DC motors ZF1 and ZF2, adjusting the control parameters accordingly so as to ensure the system meets the control performance requirements. In the 50 Hz feedback pilot project, the power units work as follows.

Step 1. Motor TD is electrified and grid-connected, and works in the motoring state.
The synchronous motor TF1 is driven through the shaft. At the beginning it cannot reach the synchronous speed (f = 50 Hz), so TF1 cannot be connected to the excitation current. TF1 is coupled to the DC motor ZF1 through the shaft. ZF1 is not initially excited but starts from remanence in the generating state. Its excitation current is then adjusted to bring the system's speed to the synchronous rated speed, and TF1 is then connected to the excitation current and works in the generating state.

Step 2. The electricity generated by TF1 is sent to motor B for testing, and motor B works in the motoring state. The companion motor P, which is connected to B through the shaft and works as a generator, is used as the load of motor B.

Step 3. The electricity generated by P is sent to the synchronous motor TF2, forcing TF2 into the motoring state. The DC motor ZF2 is coupled to TF2 through the shaft and is in the generating state. After that, by disconnecting K1 and closing K2 and K3, motors ZF2 and ZF1 are connected in a closed loop.

Step 4. ZF2 receives the input current of the motor to be tested as its closed-loop feedback current. TD, ZF1 and ZF2 use their own excitation currents for closed-loop feedback

Fig.2 The interfaces of the test project


current control, and TF1 uses its own armature voltage for closed-loop feedback voltage control.

C. Composition of Network Control System
In order to achieve the control requirements, facilitate the operation of the main control loop, and communicate with the upper-level management PC, we integrate the testing power source with the unit control system to form a network control system. The upper-level machine is composed of a touch screen (TP270) and a PLC (S7-200), while the lower-level machines include a double-closed-loop full-digital DC regulator system (Siemens 6RA70) and five real-time excitation power control systems. The management PC is connected to the upper-level machine, receives management commands, and transfers the operation parameters of the power system; it communicates with the lower-level machines through a field bus to monitor the management process and transfer control parameters. The multi-level distributed network power control system provides a good test environment for motor testing and fulfils the test and control requirements well.

V. CURVE FITTING FOR EXPERIMENT DATA
In the motor type test and delivery test, we need to measure the motor speed n, power factor cosφ1, efficiency η, current I1 and electromagnetic torque T as functions of the output power P2 (the working characteristic curves). In our design, the chosen intelligent instruments are configured with an RS485 interface, which is convenient for remote control. The internal CPU system is isolated from the external signals, giving strong anti-interference ability, and the test circuit is greatly simplified while meeting the automatic test requirements. After the temperature rise test, the working characteristic test proceeds while the load varies between 0.25 PN and 1.25 PN. The widely used methods include direct torque-meter measurement, indirect measurement, and the circle diagram method of solving the induction motor equivalent circuit. The measured three-phase asynchronous motor YKKL355-4 has a high capacity, with 220 kW rated power, 6000 V rated voltage and 26.4 A rated current, which makes it suitable for the equivalent circuit method.

A. The Equivalent Circuit of the Three-phase Asynchronous Motor
The equivalent circuit of one phase of the 3-phase asynchronous motor is shown in Fig.3. R1 is the resistance of one stator winding phase; X1 is the stator leakage reactance; GFe is the equivalent conductance of the stator iron loss; Bm is the master excitation susceptance; S is the slip ratio; R2' is the rotor phase resistance converted to the stator side; X2' is the rotor leakage reactance converted to the stator side. The equivalent circuit method calculates the working characteristic through the equivalent circuit, based on the motor no-load and locked-rotor tests.

Fig. 3 The equivalent circuit of 3-phase induction motor
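As a numerical companion to the circuit of Fig. 3, the per-phase network can be evaluated directly with complex arithmetic. The sketch below follows the circuit topology (stator branch in series with the paralleled magnetizing and rotor branches); it is our illustration, not the paper's code, and all parameter values used with it are assumptions:

```python
def working_point(Un, s, R1, X1, Gfe, Bm, R2p, X2p):
    """Evaluate the per-phase equivalent circuit of Fig. 3 at slip s.

    Un: phase voltage; R1, X1: stator resistance and leakage reactance;
    Gfe, Bm: iron-loss conductance and magnetizing susceptance;
    R2p, X2p: rotor resistance and leakage reactance referred to the
    stator side. Returns input current, input power, iron loss, rotor
    copper loss and power factor.
    """
    Z2 = R2p / s + 1j * X2p           # rotor branch impedance
    Y = (Gfe - 1j * Bm) + 1 / Z2      # admittance of the parallel section
    Z = R1 + 1j * X1 + 1 / Y          # total impedance Z = R + jX
    I1 = Un / Z                       # stator (input) current, complex
    P1 = 3 * abs(I1) ** 2 * Z.real    # input power
    I2 = I1 / (Z2 * Y)                # rotor branch current
    Pfe = 3 * abs(I1) ** 2 * Gfe / abs(Y) ** 2   # iron loss
    Pcu2 = 3 * abs(I2) ** 2 * R2p                # rotor copper loss
    cos_phi = Z.real / abs(Z)         # power factor
    return abs(I1), P1, Pfe, Pcu2, cos_phi
```

These are the same per-phase quantities that the working-characteristic equations of Section V.B compute from the no-load and locked-rotor test parameters.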

B. Equivalent Circuit Measurement Method
Step 1. Empty-load test: set $U_0 = (1.1 \sim 1.3)U_n$, then measure the empty-load current $I_0$, empty-load voltage $U_0$, empty-load input power $P_0$, and the stator resistance $R_{10}$ at the end of the test. There are 7~9 sets of data in the test. If $U_O/U_N = 0$, the empty-load loss is equal to the iron loss:

$P_{OC} = P_{fe}$   (1)

If $U_O/U_N = 1$, the mechanical loss is equal to the rated empty-load loss minus the iron loss:

$P_{fw} = P_{OC} - P_{fe}$   (2)
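The paper does not spell out the loss separation behind (1) and (2) in code; a sketch of the standard practice is to subtract the stator copper loss from each no-load point and extrapolate the remainder against the squared voltage down to zero. The routine and its names below are ours, under that assumption:

```python
import numpy as np

def separate_losses(U0, I0, P0, R1):
    """Split the measured no-load loss into iron and mechanical parts.

    U0, I0, P0: arrays of per-phase no-load voltage, current and total
    input power over the 7~9 test points; R1: stator phase resistance.
    The constant (friction and windage) loss is the U^2 -> 0 intercept
    of (P0 - stator copper loss); the slope gives the iron loss as a
    function of voltage.
    """
    P_const = np.asarray(P0, float) - 3.0 * np.asarray(I0, float) ** 2 * R1
    slope, intercept = np.polyfit(np.asarray(U0, float) ** 2, P_const, 1)
    Pfw = intercept                       # mechanical loss, cf. eq. (2)
    Pfe_at = lambda U: slope * U ** 2     # iron loss at voltage U, cf. eq. (1)
    return Pfw, Pfe_at
```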

Step 2. Low-frequency short-circuit (locked-rotor) test: measure the three-phase voltage $U_{1k}$, three-phase current $I_{1k}$, input power $P_{1k}$ and the stator winding resistance $R_{1k}$. There are 7 sets of data in the test. The equivalent circuit parameters of the three-phase asynchronous motor are calculated by an iterative method. Let the motor reactances be $X_1$, $X_2'$, $X_m$, the master excitation susceptance $B_m$, the iron-loss equivalent conductance $G_{fe}$, the rotor resistance $R_2'$ and the slip ratio at the rated point $S_n$. The total impedance is

$Z = R + jX$   (3)

Step 3. Calculating the working characteristic: with the slip ratio $S_n$ of the rated point obtained in the previous step, let $S$ equal $0.25S_n$, $0.5S_n$, $0.75S_n$, $1.0S_n$ and $1.25S_n$ respectively. The working characteristic is then calculated as follows.

Input current:

$I_1 = U_n / |Z|$   (4)

Input power:

$P_1 = 3 I_1^2 R$   (5)

Stator copper loss:

$P_{cu1} = 3 I_1^2 R_{1ref}$   (6)

where $R_{1ref}$ is the stator winding resistance converted to the standard reference temperature. The iron loss is

$P_{fe} = 3 I_1^2 G_{fe} / |Y|^2$   (7)

Rotor copper loss:

$P_{cu2} = 3 I_2'^2 R_2'$   (8)

where $I_2' = I_1 / (Z_2' Y)$, with $Z_2' = R_2'/S + jX_2'$ the rotor branch impedance and $Y = G_{fe} - jB_m + 1/Z_2'$ the admittance of the parallel section. If $S = S_n$, the stray loss is

$P_s = 0.005 P_1$   (9)

or else

$P_s = 0.005 P_{1n} (I_1 / I_{1n})^2$   (10)

The total loss is

$\sum P = P_{cu1} + P_{cu2} + P_{fe} + P_{fw} + P_s$   (11)

Therefore, the efficiency is

$\eta = 1 - \sum P / P_1$   (12)

and the power factor is

$\cos\varphi = R / |Z|$   (13)

C. Non-linear LSM Data Fitting
We want to find a function $y = y(x; a)$ to approximate the working characteristic curve in Table 1. The two most commonly used methods are interpolation and fitting; here we adopt different fitting methods according to the characteristics of the curves in the proposed intelligent motor testing system. When fitting the motor's empty-load characteristic, since straight-line fitting and high-order polynomial fitting cannot meet the requirements, we use a quadratic and cubic polynomial two-segment fitting method to obtain the optimal characteristic curve. For $\cos\varphi = f(P_2)$, we use non-linear LSM data fitting with the Levenberg-Marquardt method to obtain good performance.

Suppose there is a functional relationship $\cos\varphi = y(P_2; a)$ between $\cos\varphi$, the variable $P_2$ and the parameter vector $a = (a_1, \ldots, a_m)^T$, and let $(P_{2i}, \cos\varphi_i)$, $i = 1, \ldots, n$ denote $n$ observations of $(P_2, \cos\varphi)$. The objective is to minimize

$\chi^2(a) = \sum_{i=1}^{n} \left[ \frac{\cos\varphi_i - y(P_{2i}; a)}{\sigma_i} \right]^2$   (14)

where $\sigma_i$ is the standard deviation of the measurement error at the $i$-th point $(P_{2i}, \cos\varphi_i)$. The Levenberg-Marquardt method is equivalent to solving the non-linear equation

$\nabla\chi^2(a) = 0$   (15)

Initializing $a$ to $a^{(0)}$ and using the improved Newton iteration method, we have

$a^{(q+1)} = a^{(q)} + \Delta a^{(q)}, \quad \nabla^2\chi^2(a^{(q)})\,\Delta a^{(q)} = -\nabla\chi^2(a^{(q)}) \quad (q = 0, 1, 2, \ldots)$   (16)

For a fixed $q$, let

$[\alpha] = \tfrac{1}{2}\nabla^2\chi^2(a^{(q)})$   (17)

$\beta = -\tfrac{1}{2}\nabla\chi^2(a^{(q)})$   (18)

$\delta a = \Delta a^{(q)}$   (19)

With the above definitions, (16) becomes

$[\alpha]\,\delta a = \beta$   (20)

Combining (19) and (20), according to the Levenberg-Marquardt method, we have

$([\alpha] + \lambda D)\,\delta a = \beta$   (21)

where $D = \mathrm{diag}(\alpha_{11}, \alpha_{22}, \ldots, \alpha_{mm})$. The gradient vector of the minimization function is

$\beta_j = -\tfrac{1}{2}\,\frac{\partial\chi^2(a)}{\partial a_j} = \sum_{i=1}^{n} \frac{\cos\varphi_i - y(P_{2i}; a)}{\sigma_i^2}\,\frac{\partial y(P_{2i}; a)}{\partial a_j}, \quad j = 1, \ldots, m$

Linearizing the Hessian matrix,

$\tfrac{1}{2}\,\frac{\partial^2\chi^2(a)}{\partial a_l\,\partial a_j} = \sum_{i=1}^{n} \frac{1}{\sigma_i^2}\left\{ \frac{\partial y(P_{2i}; a)}{\partial a_l}\,\frac{\partial y(P_{2i}; a)}{\partial a_j} - \left[\cos\varphi_i - y(P_{2i}; a)\right]\frac{\partial^2 y(P_{2i}; a)}{\partial a_l\,\partial a_j} \right\}$

so we have

$\alpha_{lj} = \sum_{i=1}^{n} \frac{1}{\sigma_i^2}\,\frac{\partial y(P_{2i}; a)}{\partial a_l}\,\frac{\partial y(P_{2i}; a)}{\partial a_j}, \quad l = 1, \ldots, m$

VI. VB SOFTWARE DESIGNING
VB (Visual Basic) is a high-efficiency visual software development platform. By taking advantage of VB in user interface development, we avoid using the more complicated C++. The VB subprogram MRQA performs one iteration step of the Levenberg-Marquardt method, and MRQB estimates the linear fitting matrices of the method (the linearized Hessian matrix and the gradient vector of the minimization function). We referred to the VB Algorithm Collection in developing the program, and the efficiency is greatly improved [17].
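The iteration (14)-(21) implemented by the VB routines MRQA/MRQB can be written compactly in other languages as well. The sketch below is ours, not a transcription of the paper's code; the model y and its Jacobian are supplied by the caller, and sigma is taken as a scalar:

```python
import numpy as np

def lm_fit(P2, cosphi, y, jac, a0, sigma=1.0, lam=0.01, iters=50):
    """Levenberg-Marquardt fit of cosphi = y(P2; a), following (14)-(21).

    y(P2, a): model values; jac(P2, a): n x m Jacobian dy/da;
    sigma: scalar measurement standard deviation. One loop pass mirrors
    one MRQA step: build the gradient vector and linearized Hessian
    (the MRQB job), damp the diagonal by lam, solve for the step, and
    adjust lam by the chi-square comparison.
    """
    a = np.asarray(a0, dtype=float)
    chi2 = lambda p: np.sum(((cosphi - y(P2, p)) / sigma) ** 2)
    for _ in range(iters):
        J = jac(P2, a)
        r = (cosphi - y(P2, a)) / sigma ** 2
        beta = J.T @ r                             # gradient vector beta_j
        alpha = (J / sigma ** 2).T @ J             # linearized Hessian alpha_lj
        A = alpha + lam * np.diag(np.diag(alpha))  # ([alpha] + lam*D) of (21)
        da = np.linalg.solve(A, beta)
        if chi2(a + da) < chi2(a):                 # accept step, relax damping
            a = a + da
            lam /= 10.0
        else:                                      # reject step, raise damping
            lam *= 10.0
    return a
```

With a hypothetical saturating model y = a0*P2/(a1+P2) (the paper does not state its model), the routine recovers the parameters of noise-free synthetic data.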


A. Application Program Structure of Intelligent Motor Testing System The application program structure of intelligent motor testing system is shown in Fig.4.

Fig. 4. Application program structure of the intelligent motor testing system

For this testing system, the master control software is responsible for command and coordination of the entire system and communicates with the operator directly through the man-machine interface. The software executes the tasks of receiving data, data processing, fitting and plotting test curves, generating and printing test reports, and test data access and management [18]. In the intelligent motor testing system, process control, data collection and display, and dynamic parameter setting are implemented in several classes, including dialog box classes for the empty-load test, the loaded test, the locked-rotor test, the temperature rise test, etc., all designed and implemented with data access and processing functions. Owing to the merits of VB programming, the user interface of the master control software offers a man-machine conversation channel and a friendly, easy-to-understand operating platform. Once the interface of each test is designed, the system is easily set up for real-time monitored testing. All the test data are saved in the database for further analysis of the running performance and quality of the motors.

B. Curve Fitting Sub-program Flow Chart
The main procedure of the intelligent motor testing system's load experiment includes: pilot project selection, testing preparation, load regulation, data acquisition and processing, thermal resistance sampling, motor working characteristic calculation, curve fitting, result printing, and curve drawing. The flow chart of the non-linear LSM curve fitting for the working characteristic curve cosφ = f(P2) is shown in Fig.5.
Fig. 5. Curve fitting sub-program flow chart. (The chart follows the Levenberg-Marquardt iteration: assign $a^{(0)}$ and calculate $\chi^2(a)$; assign $\lambda = 0.01$, $q = 0$; solve equation (21) to obtain $\delta a$ and calculate $\chi^2(a + \delta a)$; if $\chi^2(a + \delta a) > \chi^2(a)$, multiply $\lambda$ by 10 and solve again; if $\chi^2(a + \delta a) < \chi^2(a)$, divide $\lambda$ by 10 and set $a \leftarrow a + \delta a$; the iteration stops when the termination test is met.)

C. Hybrid Programming Technology
MATLAB is mathematical software with high performance in numerical analysis, signal processing and graphical display; it is highly efficient for implementing complex algorithms, but inefficient for developing man-machine interfaces [19][20]. Visual Basic (VB) is an efficient visual software platform which enables programmers to quickly develop friendly user interfaces. We use VB/MATLAB hybrid programming techniques to develop the test system software, where VB is used to develop the interface and MATLAB serves as the background computation engine [21]. Calling MATLAB as an executable (.exe) is a relatively simple method [22]. MATLAB itself provides a start-up M-file, "startup.m", so we first write the problem-solving algorithm in "startup.m". VB then calls MATLAB through Shell; MATLAB starts immediately, executes "startup.m", and reads the data file preprocessed by VB. When MATLAB has finished the data processing, it closes itself and returns to VB, and VB implements the virtual display of the data.


VII. TESTING RESULT
The tested motor (YKKL335-4) is a three-phase asynchronous motor with a rated voltage of 6000 V. We use the equivalent circuit method to measure the working characteristic. The testing and calculation data are shown in Table 1.
TABLE I. CALCULATION OF WORKING CHARACTERISTIC

U (V)        6000     6000     6000     6000     6000
I (A)        11.570   16.165   21.487   26.968   32.389
P1 (kW)      69.006   129.081  186.792  241.389  292.290
Sref (%)     0.247    0.494    0.741    0.9883   1.235
Pcu1 (kW)    1.008    1.968    3.478    5.479    7.904
Pfe (kW)     6.728    6.626    6.500    6.353    6.190
Pfw (kW)     6.09     6.09     6.09     6.09     6.09
Pm (kW)      61.268   120.487  176.813  229.555  278.196
Pcu2 (kW)    0.151    0.595    1.310    2.268    3.436
Ps (kW)      0.345    0.645    0.934    1.206    1.461
ΣP (kW)      14.323   15.926   18.313   21.399   25.082
P2 (kW)      54.682   113.156  168.478  220      267.208
η (%)        79.24    87.66    90.20    91.14    91.42
cosφ         0.573    0.768    0.836    0.861    0.868

TABLE II. EMPTY LOAD TEST

Uuv (V)   Uu4v4 (V)   Uf2 (V)   If2 (A)
35        5           0         0
120.5     19          6.3       0.34
180       29          11        0.58
200       32          12.6      0.67
220       35.2        14.8      0.78
240       38          17.3      0.91
260       41.5        21.5      1.12
280       44.5        27.8      1.43
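The quadratic/cubic two-segment fit used for the empty-load characteristic (Section V.C) can be sketched as follows; the split voltage and all names are our choices, not the paper's:

```python
import numpy as np

def two_segment_fit(U, I, split):
    """Fit an empty-load characteristic I = f(U) with a quadratic below
    the split voltage and a cubic above it, the segments being selected
    by the split point. A sketch of the two-segment method, not the
    paper's exact routine.
    """
    U, I = np.asarray(U, float), np.asarray(I, float)
    lo, hi = U <= split, U >= split
    p_lo = np.polyfit(U[lo], I[lo], 2)   # quadratic, lightly saturated region
    p_hi = np.polyfit(U[hi], I[hi], 3)   # cubic, saturated region
    def f(u):
        u = np.asarray(u, float)
        return np.where(u <= split, np.polyval(p_lo, u), np.polyval(p_hi, u))
    return f
```

For the Table II data one would fit If2 against Uuv with the split near the knee of the curve; at least three points are needed below the split and four above it.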

Fig. 7. The characteristic curve of Empty load test

Fig. 8 gives the characteristic curve of short-circuit test.

The characteristic curves of the three-phase asynchronous motor (YKKL335-4) are fitted by the non-linear least square method. The characteristic curve at rated voltage and frequency is shown in Fig. 6. The motor empty-load test measurement data at n = 4000 r/min are shown in Table 2, and the fitted characteristic curve of the empty-load test is shown in Fig. 7.

Fig. 8. The characteristic curve of short-circuit test

In the design and development of the intermediate frequency motor testing system, test data acquisition and processing and the virtual display of results are implemented using VB/MATLAB hybrid programming technology. The sinusoidal aberration rate of the voltage wave and the wave crest ratio from dynamic measurement are given in Fig. 9.

Fig. 6. The characteristic curve of rated voltage and frequency

Fig. 9. Section result of test data acquisition and processing


From Fig. 9, the running results show that, for the sinusoidal aberration rate of the voltage wave and the wave crest ratio of dynamic measurement, the precision of the measuring results reaches 0.01%, while the traditional VHD method only reaches 0.1%. By taking advantage of VB in user interface development, the drawing interface of the characteristic curves is very friendly, and the fitting result reflects the real data well, which meets the requirements of the intelligent motor testing system.

VIII. CONCLUSION
The computer-aided intelligence system designed in this paper has been used in the XTE motor manufactory and has been generalized to the type-test stations of many motor factories. The proposed testing system has been proved accurate and efficient by field experiments. The system resolves problems such as the heavy data processing workload, synchronous reading, and real-time multi-data display in intermediate frequency generator testing. Compared with manual measurement, the system meets the requirements of a motor testing system well, with less than 0.4% testing error, stable performance and a favorable display interface. It presents a new approach to the automated performance testing of motors.

REFERENCES
[1] Fan Ying, Yan Huaguang, Yu Haibo, and Zong Jianhua, Research on Induction Motor Energy Efficiency Evaluation Methods and the Intelligent Test System, Electrical Technology of China, vol.9, 2008.
[2] Charles L. Neft and Alvaro Caneino, Facility for Automated Testing of Induction Motors, Industry Applications Society Annual Meeting, Conference Record of the 1990 IEEE, 1990, pp. 116-121, vol.1.
[3] Qiu Sihai, Wang Yaonan, Huang Shoudao, et al., Study and Development of Intelligent Motor Test System, Small & Medium Electrical Machines, 2003, 30(1): 66-70.
[4] Dai Wenjin, Liu Huazhu, and Zhang Jiangming, Computer-aided Type Test System for Electric Machine, Small & Special Electrical Machines, 2000, 4: 36-38.
[5] Fang Hong, Xu Weizhuan, Xu Hemei, et al., Design of USB Based Detecting System on High Power Impellent Electromotor, Electronic Engineer, Apr 2004, vol.30, no.2: 10-12.
[6] E. Albas, T. Arikan, and C. Kuzkaya, In-Process Motor Testing Results Using Model Based Fault Detection Approach, Electrical Insulation Conference and Electrical Manufacturing & Coil Winding Conference Proceedings, 2001, pp. 643-647.
[7] Liu Yun and Han Ying, An Intelligent Testing System for Asynchronous Motors, Techniques of Automation & Applications, 2005, 25(3): 30-33.
[8] Shen Peihui and Chen Shumei, Application of Computer Intelligent Control on the Hydraulic Test System, Machine Tool & Hydraulics, 2009, 37(9): 146-192.
[9] Zhou Nawu, Computer Aided Test System for Intelligent Asynchronous Electric Machine, Electric Machines & Control Application, 2002, 29(2): 52-55.
[10] J.P. Hespanha, P. Naghshtabrizi, and Yonggang Xu, A Survey of Recent Results in Networked Control Systems, Proceedings of the IEEE, vol.95, pp. 138-162, Jan 2007.
2012 ACADEMY PUBLISHER


JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012

Application of Intelligent Controller in SRM Drive


Zhang Baojian, Zhu Yanli
School of Information and Engineering, Henan Institute of Science and Technology, Xinxiang 453003, China
Email: zbj@hist.edu.cn

Xie Jianping*
Institute of Technology, Lishui University, Lishui 323000, China

Wang Jianping
School of Information and Engineering, Henan Institute of Science and Technology Xinxiang, China 453003

Abstract: The switched reluctance motor (SRM) is gaining wider and wider application for its simple structure, low cost, reliability, controllability, and high efficiency compared with the commonly used permanent magnet synchronous motor (PMSM). The overall performance of an SRM relies on that of its control system. To overcome the disadvantages of the classical PID controller, an intelligent control system combining the PID algorithm and fuzzy logic is presented to meet the demands of the control system, and the SRM model is given in this paper. Simulation experiments have been carried out to test the performance of the fuzzy PID controller. The simulation results show that the compound fuzzy PID controller possesses better dynamic and static performance than the traditional PID controller and the fuzzy controller, and meets the performance requirements of the SRM control system.

Index Terms: SRM, PID, fuzzy logic, intelligent controller

I. INTRODUCTION

With the rapid progress of industrialization, the demand for automatic production equipment is becoming more and more urgent. As the main drivers of production equipment, electric motors have played an important role. Along with the remarkable growth of power electronics and automation, many studies of AC motors have progressed, and the application of switched reluctance motors (SRM) is attracting more and more attention. The main reasons for using the SRM are its simple, low-cost, robust structure, high ratio of torque to rotor volume, reliability, controllability, and high efficiency [1-3]. The switched reluctance motor is a rotating electric machine in which both the stator and the rotor have salient poles. The basic operating principle of the SRM is quite simple: as current is passed through one of the stator windings, torque is generated by the tendency of the rotor to align with the excited stator pole. The direction of the torque generated is a function of the rotor position with respect to the energized phase, and is independent of the direction of current flow through the phase winding. Continuous torque can be produced by intelligently synchronizing each phase's excitation with the rotor position. By varying the number of phases, the number of stator poles, and the number of rotor poles, many different SRM geometries can be realized. A few examples are shown in Figure 1.
* Corresponding author: Xie Jianping, Tel.: +86-578-2299323, E-mail: xjp1386@hotmail.com.

Figure 1. Various SRM Geometries

In addition, since no winding loss occurs in the rotor, in contrast with permanent magnet synchronous motors (PMSM), the efficiency of the SRM is better than that of induction motors, and thus it is suitable as a low-cost variable-speed drive in many industrial applications. In construction, the SRM is the simplest of all electrical machines: only the stator has windings. The rotor contains no conductors or permanent magnets; it consists simply of steel laminations stacked onto a shaft.

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2200-2207


It is because of this simple mechanical construction that SRMs carry the promise of low cost, which in turn has motivated a large amount of research on SRMs in the last decade. However, the mechanical simplicity of the device brings some limitations. Like the brushless DC motor, the SRM cannot run directly from a DC bus or an AC line, but must always be electronically commutated. Also, the saliency of the stator and rotor, necessary for the machine to produce reluctance torque, causes strongly nonlinear magnetic characteristics, complicating the analysis and control of the SRM. Industry acceptance of SRMs has been slow, owing to a combination of perceived difficulties with the SRM, the lack of commercially available electronics with which to operate them, and the entrenchment of traditional AC and DC machines in the marketplace. But SRMs offer some advantages along with their potential low cost. For example, they can be very reliable machines, since each phase of the SRM is largely independent physically, magnetically, and electrically from the other motor phases. Also, because of the lack of conductors or magnets on the rotor, very high speeds can be achieved relative to comparable motors. Several disadvantages are often cited for the SRM: it is difficult to control, it requires a shaft position sensor to operate, it tends to be noisy, and it has more torque ripple than other types of motors. These drawbacks have generally been overcome through a better understanding of SRM mechanical design and the development of algorithms that can compensate for them. Nevertheless, the SRM has not been put into wide application, since the problems of large torque ripple, acoustic noise, and low power factor remain. The whole performance of the SRM largely depends on the performance of its control system. Traditionally, the commonly used PID controller has been utilized in industrial fields.
But the PID algorithm is only fit for the design of linear systems; for a nonlinear plant like the SRM, it is impossible to arrive at a satisfactory control effect [4, 5]. The fuzzy control method has been widely applied in many engineering fields: it can easily turn a control engineer's experience into a control strategy, it does not need an accurate mathematical model of the controlled object, and its dynamic quality is superior to conventional control methods. In this paper we use the PID algorithm and fuzzy logic to design an optimal intelligent control system. The classical PID method is combined with the fuzzy logic scheme to construct a compound intelligent controller, and simulation experiments have been carried out to test the performance of the fuzzy PID controller. The simulation results show that the compound controller has better control effects than the traditional PID controller.

II. MODELLING OF SRM

In order to model the SRM, many advanced control strategies have been proposed, for example, nonlinear control, linear control, the iterative learning method, and neural networks. These control strategies lack flexibility and robustness, and it is difficult to obtain satisfactory results with them. The modeling of a motor is usually based on the flux-linkage-current characteristics at a properly chosen series of rotor position angles with respect to the stator, which can be obtained by measurements or by numerical field methods. The mathematical model of the SRM is given by the volt-ampere equation and the motion equation, and the static and dynamic behavior of the SRM can be described using these equations. Differential equations are used to model the static and dynamic performance of the SRM, which is derived from the interaction between the phases of the motor. For phase j of the SRM, the corresponding equation is [6, 7]:

$$U_j = R_j i_j + \frac{d\psi_j(\theta, i_j)}{dt} \qquad (1)$$

where:

$R_j$ is the phase resistance;
$\psi_j$ is the linkage flux;
$\theta$ is the position angle;
$U_j$ is the voltage supply for phase j;
$i_j$ is the current flowing through phase j.

For the solution of Eq. (1), it is necessary to model the magnetic nonlinearities in the form $i(\psi, \theta)$ rather than in the form $\psi(i, \theta)$, as shown in Figure 2.

Figure 2. The variation of flux-linkage with current

This is because, after each integration step, the solution of Eq. (1) yields a value of the flux which can be used to find the corresponding current value for the next integration step. The variation of flux linkage with current is the same for the remaining phases except for the angular dependence, which takes into account the physical inter-polar spacing. The experimental motor has a total of eight symmetrically located stator poles used by a total of four phases (two poles per phase). There are three coupled magnetic circuits in the SRM; for one phase in a system with three coupled magnetic circuits a, b, c, the equation is:

$$U_j = R_j i_j + \sum_k \left[ L_{jk}(i_j, \theta)\frac{di_k}{dt} + \frac{dL_{jk}(i_j, \theta)}{d\theta}\frac{d\theta}{dt}\, i_k \right] \qquad (2)$$

In accordance with the following equation:

$$\frac{d\theta}{dt} = \omega \qquad (3)$$

Equation (2) becomes

$$U_j = R_j i_j + \sum_k \left[ L_{jk}(i_j, \theta)\frac{di_k}{dt} + \omega \frac{dL_{jk}(i_j, \theta)}{d\theta}\, i_k \right] \qquad (4)$$

And there is the following equation:

$$E_j(\theta, i_j, \omega) = \omega \sum_k \frac{dL_{jk}(i_j, \theta)}{d\theta}\, i_k \qquad (5)$$

The coefficient of viscosity can be considered negligible; therefore the motion equation, which will be taken into account in the simulation, is as follows:

$$\frac{d\omega}{dt} = \frac{1}{J}\left(M - M_R\right) \qquad (6)$$

The electromagnetic torque produced by the motor can be obtained from the theorem of generalized forces:

$$M = \left.\frac{dW'(i, \theta)}{d\theta}\right|_{i=\mathrm{ct}} \qquad (7)$$

where the magnetic co-energy has the following expression:

$$W' = \int \psi(i, \theta)\, di \qquad (8)$$

The mean value of the electromagnetic torque developed by a phase is calculated according to the following equation:

$$M_{med} = \frac{1}{T_q}\int M(\theta)\, d\theta \qquad (9)$$

III. DESIGN OF INTELLIGENT CONTROLLER

The speed regulating system of the SRM is composed of the controller, the power converter, and the SRM itself. Traditionally, the PID algorithm has been used successfully in many industrial fields. However, the PID controller is only suitable for the analysis and design of linear systems and depends heavily on the model of the plant; it cannot arrive at a satisfactory performance when applied to a nonlinear system such as the SRM [8-10]. In order to acquire the desired performance in the SRM control system, we combine the classical PID method with fuzzy logic to form an intelligent controller, which we call a compound fuzzy PID controller. The integral and derivative operations are integrated into the fuzzy controller, as shown in Figure 3. Early studies showed the construction of the fuzzy logic controller (FLC) to be a sequence of the following steps:

(1) Development of the fuzzification interface, which involves the following functions: (i) measuring the values of the input variables; (ii) performing a scale mapping that transfers the range of values of the input variables into the corresponding universe of discourse; (iii) performing fuzzification, that is, converting input data into suitable linguistic values, which may be viewed as labels of fuzzy sets.

(2) Construction of the knowledge base, which comprises knowledge of the application domain and the attendant control goals. It consists of a "data base" and a "linguistic (fuzzy) control rule base": (i) the data base provides the necessary definitions used to define linguistic control rules and fuzzy data manipulation in the FLC; (ii) the rule base characterizes the control goals and control policy of the domain experts by means of a set of linguistic control rules.

(3) Development of the decision-making logic, which is the kernel of the FLC; it has the capability of simulating human decision making based on fuzzy concepts and of inferring fuzzy control actions employing fuzzy implication and the rules of inference in fuzzy logic.

(4) Construction of the defuzzification interface, which performs the following functions: (i) a scale mapping which converts the range of values of the output variables into the corresponding universe of discourse; (ii) defuzzification, which yields a nonfuzzy control action from an inferred fuzzy control action.

The inputs are the error between the given speed and the detected speed and the error variance.
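The fuzzification step described above can be illustrated with a small sketch. It assumes seven equally spaced triangular membership functions on a normalized universe [-6, 6], matching the NB...PB partition used for E and EC; the spans and centers are illustrative choices, not taken from the paper.

```python
# Minimal fuzzification sketch (assumed implementation, not from the paper):
# seven equal-span triangular membership functions over [-6, 6].

def tri(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

LABELS = ["NB", "NM", "NS", "ZO", "PS", "PM", "PB"]
CENTERS = [-6, -4, -2, 0, 2, 4, 6]   # assumed peaks, span of 2 on each side

def fuzzify(x):
    """Map a crisp value in [-6, 6] to a membership degree per label."""
    return {lab: tri(x, c - 2, c, c + 2) for lab, c in zip(LABELS, CENTERS)}
```

With this partition, any crisp input activates at most two adjacent sets and the activated degrees sum to one, which is the usual property of equal-span triangular partitions.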

Figure 3. Structure of the control system


When the inputs are fuzzified, the fuzzy controller may take some small errors as zero. Therefore a dead zone exists in the control system, and it is difficult to obtain high control accuracy because of the static error. The system is therefore switched to the conventional PID controller when the error is small, and to the fuzzy controller when the error exceeds the error threshold. The incremental PID algorithm is used in the system. Suppose the system input is $r_i$, the output is $y_0$, and the system error is $error(k) = r_i(k) - y_0(k)$. The algorithm is:

$$u(k) = k_p x_c(1) + k_i x_c(2) + k_d x_c(3) \qquad (10)$$

where

$$x_c(1) = error(k) - error(k-1)$$
$$x_c(2) = error(k)$$
$$x_c(3) = error(k) - 2\,error(k-1) + error(k-2)$$

$x_c(1)$, $x_c(2)$, and $x_c(3)$ are the three inputs of the PID control law. In the PID controller, the gradient descent algorithm is used to tune the PID control parameters $k_p$, $k_i$, and $k_d$. The tuning algorithm is

$$\Delta k_p = -\eta \frac{\partial E}{\partial k_p} = -\eta \frac{\partial E}{\partial y}\frac{\partial y}{\partial u}\frac{\partial u}{\partial k_p} = \eta\, error(k)\, \frac{\partial y}{\partial u}\, x_c(1) \qquad (11)$$

$$\Delta k_i = -\eta \frac{\partial E}{\partial k_i} = -\eta \frac{\partial E}{\partial y}\frac{\partial y}{\partial u}\frac{\partial u}{\partial k_i} = \eta\, error(k)\, \frac{\partial y}{\partial u}\, x_c(2) \qquad (12)$$

$$\Delta k_d = -\eta \frac{\partial E}{\partial k_d} = -\eta \frac{\partial E}{\partial y}\frac{\partial y}{\partial u}\frac{\partial u}{\partial k_d} = \eta\, error(k)\, \frac{\partial y}{\partial u}\, x_c(3) \qquad (13)$$

In formulas (11)-(13), $\eta$ is the learning rate, with $\eta \in (0, 1)$.

A fuzzy inference system is a method, based on fuzzy theory, which maps input values to output values. The mapping mechanism is based on a set of rules, a list of if-then statements. Fuzzy control involves fuzzification, a fuzzy rule base generalized from experts' experience, fuzzy inference, and defuzzification. The fuzzy inputs (the error and the variance rate of the error) are classified into seven equal-span triangular membership functions. NB, NM, NS, ZO, PS, PM, and PB are abbreviations, described as follows:

PB: Positive Big
PM: Positive Medium
PS: Positive Small
ZO: Zero
NS: Negative Small
NM: Negative Medium
NB: Negative Big

Each fuzzy set is defined by a linguistic variable (e.g., small, large), which is in turn defined by a multi-valued membership function (MF). An MF can have a range of shapes. Figure 4 shows the membership functions of E, and Figure 5 shows the membership functions of EC.

Figure 4. Membership function of error E

Figure 5. Membership function of variance rate of error EC

The fuzzy control rules are utilized to eliminate the error E and the variance rate of error EC. According to the experience of expert controllers, the control rules can be described as follows. (1) If the motor speed is higher than 800 r/min, then the torque reference should be reduced; the higher the speed, the more significant the reduction of the torque reference. (2) If the motor speed is equal to 800 r/min, the torque reference keeps invariant. (3) If the motor speed is lower than 800 r/min, then the torque reference should be increased; furthermore, the lower the motor speed, the more the reference torque is increased.
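The incremental PID law of Eq. (10), together with the error-threshold switching between the PID and fuzzy controllers described above, can be sketched compactly. The gains and the threshold value here are assumed for illustration; the paper does not give them.

```python
# Sketch of the incremental PID law of Eq. (10) and the small-error /
# large-error switching logic. All numeric values are assumptions.

class IncrementalPID:
    def __init__(self, kp=0.8, ki=0.2, kd=0.05):   # illustrative gains
        self.kp, self.ki, self.kd = kp, ki, kd
        self.e1 = 0.0   # error(k-1)
        self.e2 = 0.0   # error(k-2)

    def step(self, e):
        xc1 = e - self.e1                   # x_c(1) = error(k) - error(k-1)
        xc2 = e                             # x_c(2) = error(k)
        xc3 = e - 2 * self.e1 + self.e2     # x_c(3): second difference
        du = self.kp * xc1 + self.ki * xc2 + self.kd * xc3   # Eq. (10)
        self.e2, self.e1 = self.e1, e
        return du

ERROR_THRESHOLD = 5.0   # assumed switching threshold

def control(e, pid, fuzzy_step):
    """PID for small errors, fuzzy controller when the error is large."""
    if abs(e) <= ERROR_THRESHOLD:
        return pid.step(e)
    return fuzzy_step(e)
```

In a real drive, `fuzzy_step` would be the table-driven fuzzy controller and the gains would themselves be adapted on-line by the gradient-descent updates of Eqs. (11)-(13).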


Fuzzy logic control rules are given in Table 1 and can be described with if-then statements as follows.
(1) If E is NB and EC is NB then U is NB
(2) If E is NB and EC is NM then U is NB
(3) If E is NB and EC is NS then U is NB
(4) If E is NB and EC is ZO then U is NB
(5) If E is NB and EC is PS then U is NB
(6) If E is NB and EC is PM then U is NB
(7) If E is NB and EC is PB then U is ZO
(8) If E is NM and EC is NB then U is NB
...
(49) If E is PB and EC is PB then U is PB

In order to guarantee real-time control in a fuzzy controller used in engineering, the fuzzy control table is usually calculated off-line. According to the control experience of engineers, the fuzzy control rules, as described in the above if-then statements, are presented in Table 1. In the design of the fuzzy inference system, the Mamdani fuzzy inference method was used.

TABLE I. FUZZY CONTROL RULES TABLE

 U  |              E
 EC | NB  NM  NS  ZO  PS  PM  PB
 NB | NB  NB  NB  NB  NM  NS  ZO
 NM | NB  NB  NB  NM  NS  ZO  PS
 NS | NB  NB  NM  NS  ZO  PS  PM
 ZO | NB  NM  NS  ZO  PS  PM  PB
 PS | NB  NS  ZO  PS  PM  PB  PB
 PM | NB  ZO  PS  PM  PB  PB  PB
 PB | ZO  PS  PM  PB  PB  PB  PB

Fuzzy logic algorithms have been widely used in many control applications. Unlike a conventional proportional-integral-derivative (PID) controller, the FLC can achieve the goals of steady output and satisfactory transient performance simultaneously. However, the choices of rule sets and membership functions affect the achievement of these performance goals. System modeling based on conventional mathematical tools (e.g., differential equations) is not well suited for dealing with such systems. By contrast, a fuzzy system employing fuzzy if-then rules can model the qualitative aspects of human knowledge and reasoning processes without employing precise quantitative analysis. Increasingly, fuzzy systems, as a promising computational intelligence technique, have found many industrial applications. Different fuzzy models have been developed and successfully applied, such as the Mamdani fuzzy method and the Takagi-Sugeno (T-S) fuzzy method. The advantages of the Mamdani fuzzy inference system are that it is intuitive, has widespread acceptance, and is well suited to human cognition. The T-S fuzzy inference system works well with linear techniques and guarantees continuity of the output surface, but it has difficulties in dealing with multi-parameter synthetic evaluation and in assigning weights to each input and to the fuzzy rules.

TABLE II. FUZZY CONTROL DECISION TABLE

 U  |                        E
 EC | -6  -5  -4  -3  -2  -1   0   1   2   3   4   5   6
 -6 | -6  -6  -6  -6  -6  -6  -6  -5  -4  -3  -2  -1   0
 -5 | -6  -6  -6  -6  -6  -5  -5  -4  -3  -2  -1   0   1
 -4 | -6  -6  -6  -6  -6  -5  -4  -3  -2  -1   0   1   2
 -3 | -6  -6  -6  -5  -5  -4  -3  -2  -1   0   1   2   3
 -2 | -6  -6  -6  -5  -4  -3  -2  -1   0   1   2   3   4
 -1 | -6  -5  -5  -4  -3  -2  -1   0   1   2   3   4   5
  0 | -6  -5  -4  -3  -2  -1   0   1   2   3   4   5   6
  1 | -6  -4  -3  -2  -1   0   1   2   3   4   5   5   6
  2 | -6  -4  -2  -1   0   1   2   3   4   5   6   6   6
  3 | -6  -3  -1   0   1   2   3   4   5   5   6   6   6
  4 | -6  -3   0   1   2   3   4   5   6   6   6   6   6
  5 | -3  -1   1   2   3   4   5   5   6   6   6   6   6
  6 |  0   1   2   3   4   5   6   6   6   6   6   6   6
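The linguistic rule base of Table 1 can be encoded directly as a lookup, which is typically how the off-line control table is built. The sketch below is illustrative, not the authors' implementation; the `RULES` matrix transcribes Table 1 with rows indexed by EC and columns by E.

```python
# Rule base of Table 1 as a direct lookup. Row = EC label, column = E label,
# entry = output label U. Transcribed from Table 1 for illustration.

LABELS = ["NB", "NM", "NS", "ZO", "PS", "PM", "PB"]
RULES = [
    # E:  NB    NM    NS    ZO    PS    PM    PB       EC:
    ["NB", "NB", "NB", "NB", "NM", "NS", "ZO"],      # NB
    ["NB", "NB", "NB", "NM", "NS", "ZO", "PS"],      # NM
    ["NB", "NB", "NM", "NS", "ZO", "PS", "PM"],      # NS
    ["NB", "NM", "NS", "ZO", "PS", "PM", "PB"],      # ZO
    ["NB", "NS", "ZO", "PS", "PM", "PB", "PB"],      # PS
    ["NB", "ZO", "PS", "PM", "PB", "PB", "PB"],      # PM
    ["ZO", "PS", "PM", "PB", "PB", "PB", "PB"],      # PB
]

def rule_output(e_label, ec_label):
    """Return the output label U for given E and EC labels."""
    return RULES[LABELS.index(ec_label)][LABELS.index(e_label)]
```

Evaluating this lookup over a quantized input grid, with fuzzification and defuzzification applied at the boundaries, yields a numeric decision table of the kind shown in Table 2.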



Figure 6. Fuzzy PID controller in MATLAB

The Mamdani model can show its legibility and understandability to laypeople; the Mamdani fuzzy inference system shows its advantage in output expression and is used in this project. Fuzzy logic starts with the concept of a fuzzy set. A fuzzy set is a set without a crisp, clearly defined boundary; it can contain elements with only a partial degree of membership. A fuzzy set is defined by the expression below:

$$D = \{(x, \mu_D(x)) \mid x \in X\} \qquad (14)$$

where X represents the universal set, x is an element of X, D is a fuzzy subset of X, and $\mu_D(x)$ is the membership function of the fuzzy set D. In the design of the fuzzy inference system, the Mamdani fuzzy inference method was used, and the weighted average method is used in the fuzzy decision of the output signals:

$$U = \frac{\sum_{i=1}^{n} \mu_C(U_i)\, U_i}{\sum_{i=1}^{n} \mu_C(U_i)} \qquad (15)$$

where $U_i$ is a variable of the fuzzy output and $\mu_C(U_i)$ is the membership corresponding to $U_i$. In order to improve the response speed of the control system, the input data are usually discretized by means of off-line computation [11-13], and the results of the fuzzy decision are turned into the fuzzy control decision table, as shown in Table 2.

In order to test the effectiveness of the above intelligent controller, a simulation model was constructed in the Matlab/Simulink environment; the simulation model is shown in Figure 6. In the simulation, a 10 kW, three-phase SRM is utilized, and the given speed is 800 r/min. Three different kinds of controllers, the PID controller, the fuzzy controller, and the fuzzy PID controller, are used in order to make a comparison among them. The simulation results are presented in Figure 7. From the simulation results, it can be seen that the fuzzy PID controller possesses better dynamic and static performance than the classical PID controller and the fuzzy controller.

Figure 7. Speed response of different controllers
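The weighted-average defuzzification of Eq. (15) reduces to a few lines. This sketch is illustrative; the candidate outputs and membership values used in practice would come from the rule evaluation step.

```python
# Weighted-average defuzzification of Eq. (15): the crisp output is the
# membership-weighted mean of the candidate outputs U_i.

def defuzzify(candidates, memberships):
    """candidates: list of U_i; memberships: list of mu_C(U_i)."""
    num = sum(m * u for u, m in zip(candidates, memberships))
    den = sum(memberships)
    return num / den if den else 0.0   # guard against an empty rule firing
```

For example, two equally weighted candidates at 0 and 2 defuzzify to their midpoint, while unequal memberships pull the crisp output toward the more strongly activated candidate.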

IV. CONCLUSION

In this paper, we have discussed the application of an intelligent controller in the SRM drive. The switched reluctance motor has received considerable attention for its ruggedness and for variable-speed drive applications. Its simple construction, due to the absence of permanent magnets, rotor cages, and brushes, and its high efficiency over a wide range of speed, limited only by the bearings and by the losses in the rotor iron, make the SRM drive an alternative to DC permanent magnet and AC induction motor drives. With suitable current control, the desired torque/speed characteristics of the SRM can be obtained.


Many of the converters used with variable-speed drives have no input power factor correction circuits, which results in harmonic pollution of the utility supply and should be avoided. On the basis of a simple model of the SRM, a compound control system combining the PID algorithm and fuzzy logic is utilized in this paper, and the constructed intelligent control system can overcome the disadvantages of both the classical PID controller and the fuzzy controller. The proposed scheme was simulated in the MATLAB/Simulink software package to test the performance of the fuzzy PID controller. From the simulation results, it can be seen that the compound fuzzy PID controller has a faster response and better stability, possesses better dynamic and static performance than the common PID controller and the fuzzy controller, and improves the overall performance of the SRM control system.

REFERENCES
[1] P.J. Lawrenson, J.M. Stephenson, P.T. Blenkinsop, J. Corda, and N.N. Fulton, Variable-speed switched reluctance motors, IEE Proceedings, vol. 127, no. 4, 1980, pp. 253-265.
[2] P. Zhou, S. Stanton, and Z.J. Cendes, Dynamic modeling of three phase and single phase induction motors, Proceedings of the IEEE International Electric Machines and Drives Conference, 1999, pp. 556-558.
[3] A. Deihimi, S. Farhangi, and G. Hannegerger, A general nonlinear model of switched reluctance motor with mutual coupling and multiphase excitation, Electrical Engineering, vol. 84, pp. 143-158.
[4] F. Blaabjerg, P.C. Kjaer, P.O. Rasmussen, and C. Cossar, Improved digital current control methods in switched reluctance motor drives, IEEE Transactions on Power Electronics, vol. 14, pp. 563-572.
[5] J.J. Gribble, P.C. Kjaer, and T.J.E. Miller, Optimal commutation in average torque control of switched reluctance motors, IEE Proceedings, Electric Power Applications, vol. 146, pp. 210.
[6] M. Kowol, P. Mynarek, and D. Mrochen, Construction of a dynamic model for a switched reluctance motor, 2nd International Conference on Electrodynamic and Mechatronics, 2009, pp. 25-26.
[7] David Cajander and Hoang Le-Huy, Design and optimization of a torque controller for a switched reluctance motor drive for electric vehicles by simulation, Mathematics and Computers in Simulation, vol. 71, no. 6, 2006, pp. 333-344.
[8] Wang Mianhua and Liang Yuanyuan, Fuzzy-PI controller for direct torque control drive system of SRM, Electric Drive, vol. 40, no. 1, 2010, pp. 51-54.
[9] M. Ehsani, I. Husain, S. Mahajan, and K.R. Ramani, New modulation encoding techniques for indirect rotor position sensing in switched reluctance motors, IEEE Transactions on Industry Applications, vol. 30, no. 1, 1994, pp. 584-588.
[10] Leonid Reznik, Omar Ghanayem, and Anna Bourmistrov, PID plus fuzzy controller structures as a design base for industrial applications, Engineering Applications of Artificial Intelligence, 2000, pp. 419-430.
[11] Xiaodiao Huang and Liting Shi, Simulation on a fuzzy-PID position controller of the CNC servo system, Proceedings of the Sixth International Conference on Intelligent Systems Design and Applications (ISDA'06), 2006.
[12] M.T. DiRenzo, M.K. Masten, and C.P. Cole, Switched reluctance motor control techniques, Proceedings of the American Control Conference, 1997, pp. 272-277.
[13] S. Vijayan and S. Paramasivam, Intelligent speed controller for a switched reluctance motor drive using FPGA, International Journal of Intelligent Systems Technologies and Applications, vol. 7, no. 4, 2009, pp. 414-429.
[14] T.J.E. Miller, Switched reluctance motors and their control, Clarendon Press, Oxford, 1993.
[15] R. Krishnan, Switched reluctance motor drives, CRC Press, 2001.
[16] D.A. Torrey, X.M. Niu, and E.J. Unkauf, Analytical modelling of variable-reluctance machine magnetisation characteristics, IEE Proceedings, Electric Power Applications, vol. 142, no. 1, pp. 14-22, January 1995.
[17] H. Le-Huy and P. Brunelle, Design and implementation of a switched reluctance motor generic model for Simulink SimPowerSystems, Electronics 2005 Conference.
[18] W. Ding and D. Liang, Modeling of a 6/4 switched reluctance motor using adaptive neural fuzzy inference system, IEEE Transactions on Magnetics, vol. 44, no. 7, 2008, pp. 1796.
[19] Su Jian-qiang, Li Chang-bing, and Feng Liang, Switched reluctance motor nonlinear modeling and system simulation, Vehicle & Power Technology, 2009(3), pp. 19-22.
[20] D.A. Torrey, X.M. Niu, and E.J. Unkauf, Analytical modelling of variable-reluctance machine magnetisation characteristics, IEE Proceedings, Electric Power Applications, vol. 142, no. 1, pp. 14-22, January 1995.
[21] Elamvazuthi, P. Vasant, and J. Webb, The application of Mamdani fuzzy model for auto zoom function of a digital camera, International Journal of Computer Science and Information Security, vol. 6, no. 3, 2009.
[22] Liu Weiye and Ding Yixing, Research on a new kind of power circuit of an IPM-based SRM, Marine Electric & Electric Technology, April 2005.
[23] E.H. Mamdani, Application of fuzzy logic to approximate reasoning using linguistic synthesis, IEEE Transactions on Computers, vol. 26, no. 12, pp. 1182-1191, 1977.
[24] Tan Guo-jun, Han Yao-fei, Kuai Song-yan, and Wang Si-jian, Control system of high-power switched reluctance motor, Power Electronics, April 2006.
[25] Zhou Su-ying and Lin Hui, Overview of control strategies to minimize torque ripple of switched reluctance motor, Electric Drive, March 2008.
[26] Zhou Su-ying and Lin Hui, Modeling and simulating of switched reluctance motor based on RBF neural network with combined clustering algorithm, Small & Special Electrical Machines, October 2009.
[27] Liao Bei-rong and Yang Xu-li, Mechanism analysis of switching resistance motor, Electric Switchgear, May 2008.

Zhang Baojian (1969-), male, is an associate professor in the School of Information Engineering, Henan Institute of Science and Technology, Xinxiang 453003, China. He was born in Kaifeng, China. He received his B.S. degree in 1991 from Beijing Normal University and his M.S. degree in 2001 from the School of Computer Science, Shanxi University. He is now pursuing a Ph.D. at Wuhan University of Technology. His research interests are computer applications and information security.

Zhu Yanli was born in Xinxiang, China. She received her B.S. degree in Computer and Applications in 1999 from Henan Normal University in Xinxiang, and her M.S. degree in 2007 from the School of Computer Science, Sichuan Normal University. Currently she is a professor in the School of Information Engineering, Henan


Institute of Science and Technology, Xinxiang 453003, China. Her main publications include: A Practical Coursebook on C Language Programming (Beijing, China: Tsinghua University Press, 2010) and Photoshop CS4 Graphics and Image Processing (Changsha, China: National Defense University Press, 2010). Her research interests include data mining and digital image processing. She is a member of the China Computer Federation.

Xie Jianping, male, is an associate professor in the Institute of Technology, Lishui University. He was born in Lishui, China, and received his B.S. degree in 1990 from Zhejiang Sci-Tech University. He mainly engages in the study of signal and information processing.


Simulation of Rolling Forming of Precision Profile Used for Piston Ring based on LS_DYNA
Jigang Wu
Hunan University of Science and Technology/Hunan Provincial Key Laboratory of Health Maintenance for Mechanical Equipment, Xiangtan, 411201, China jgwuhust@gmail.com

Xuejun Li and Kuanfang He


Hunan University of Science and Technology/Engineering Research Center of Advanced Mine Equipment, Ministry of Education, Xiangtan, 411201, China hnkjdxlxj@163.com, hkf791113@163.com

Abstract: The rolling process of the precision profile used for piston rings is simulated with the large general-purpose explicit dynamic finite element analysis software ANSYS/LS-DYNA. The building of the finite element models, the selection of material models and element types, and the meshing of the model are introduced in detail. The metal flow rules of each rolling pass are obtained, and the deformation of the rolled piece and the distribution of the stress field are analyzed in depth. These results can provide guidance for the design of the forming roller and the optimization of the process.

Index Terms: piston ring, precision profile, rolling forming, explicit dynamic finite element, numerical simulation

I. INTRODUCTION

The piston ring is one of the key parts of an engine and is often compared to the heart of the engine. Meanwhile, the piston ring is also an easy-to-break part that needs to be changed frequently, and billions of piston rings are manufactured in China each year. The dimensions of the piston ring are small, and the requirements on dimensional accuracy and surface roughness are high. By the traditional processing approach, the piston ring has long been formed from cast iron through more than twenty procedures, such as turning, milling, grinding, and so on, and its production cost is very high. With the development of precision forming technology, the precision profile that satisfies the requirements on cross-section shape and dimensional accuracy of the piston ring is produced by cold rolling technology in developed countries; the precision profile is then coiled by a CNC forming machine into an elliptic ring with an opening of specific shape, so the piston ring can be formed at once. The former more than twenty procedures can be reduced to several, so not only is the processing cost reduced but the product performance is also improved. However, the machining and manufacturing of this precision profile is still a blank in our country, and in order to meet the production requirements of piston rings, the precision profile must be imported from abroad each year even though the price of
2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2208-2215

the imported precision profile is high. Therefore, the localization of the precision profile is of great significance.

During cold roll forming, the law of plastic deformation of the material, the friction between roller and rolled piece, the change of material microstructure, and the influence of reduction rate, roller diameter, rolling speed and so on are very complex issues. To investigate the deformation law of the metal in the rolling process, experimental research can provide accurate reference data for on-site production and accords with actual production. But the cycle of experimental research is long, because much work must be done around the experiment: preparation, the field experiment, and the processing of experimental results. Moreover, because of experimental uncertainty, it is often difficult to resolve all the issues under study in one experiment, and the probability of failure is high; a failed experiment wastes a great deal of manpower and material resources. At the same time, traditional experimental methods can hardly handle the rolling of a precision profile, which involves the quantitative calculation of distributed variables such as metal flow and the stress field. The explicit dynamic finite element method has been applied successfully to rolling research; it makes up for the deficiencies of traditional methods and provides a highly effective, low-cost approach for the further study of many issues in the rolling of precision profiles. Liu et al. investigated the simulation of the strip rolling process based on dynamic explicit FEM [1]. Diao et al. studied the springback of sheet V-bending based on dynamic explicit finite element analysis [2]. Xie et al. analyzed the strip rolling pressure distribution for different widths by explicit dynamic FEM [3]. Wu et al.
simulated billet rolling in an oval roll profile by FEM [4]. Niu et al. simulated the profiled billet deformation of an H-beam with slab by FEM [5]. Fan et al. researched the stress field of a rolling element bearing based on explicit dynamics FEA [6]. Li et al. studied the influence of

JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012


material properties on T-section ring rolling by a three-dimensional elastic-plastic FE method [7]. Hua et al. established a three-dimensional FE model of plastic penetration in L-section profile cold ring rolling in ABAQUS; based on this model, the expansion rules of the plastic zone in the roll gap were revealed by FE simulation, and three deformation behaviors of the L-section ring that exist in the rolling process were expounded [8]. Shuai et al. proposed a finite element model for simulating the forming process of shaped steel tube for driving shafts based on the deformation characteristics of a Y-type mill [9]. Chen et al. developed a comprehensive procedure to predict the degree of void closure using finite element analysis and a neural network [10]. But until now, very few studies have focused on the simulation of the roll forming of this precision profile. In this paper, the rolling process of the precision profile used for piston rings is simulated with explicit dynamic finite element technology, which can provide accurate reference data for the design and optimization of the rolling mill and of the rolling process parameters.

II. MODELING OF ROLLING PROCESS

A. Explicit Dynamic Finite Element

The general form of the finite element equation for dynamic analysis is

$[M]\{\ddot{U}\} + [C]\{\dot{U}\} = \{P\} - \{F\}$   (1)

where $\{\ddot{U}\}$ is the acceleration array of all nodes, $\{\dot{U}\}$ is the velocity array of all nodes, $[M]$ is the global mass matrix, $[C]$ is the global damping matrix, $\{P\}$ is the external node force array, and $\{F\}$ is the internal node force array. Mathematically, formula (1) is a second-order system of differential equations. If the equilibrium of the finite element system at the moment $t+\Delta t$ is considered, the following equation is obtained:

$[M]\,{}^{t+\Delta t}\{\ddot{U}\} + [C]\,{}^{t+\Delta t}\{\dot{U}\} = {}^{t+\Delta t}\{P\} - {}^{t+\Delta t}\{F\}$   (2)

Formula (2) is a nonlinear differential equation and needs to be solved with an implicit integration method. If, instead, the velocity and acceleration in formula (1) are approximated by displacements through an appropriate finite difference scheme, formula (1) becomes an explicit linear system of equations whose unknowns are ${}^{t+\Delta t}\{U\}$, which can then be worked out directly. This idea is realized by considering the equilibrium of the finite element system at the moment $t$:

$[M]\,{}^{t}\{\ddot{U}\} + [C]\,{}^{t}\{\dot{U}\} = {}^{t}\{P\} - {}^{t}\{F\}$   (3)

Approximating velocity and acceleration by displacements turns formula (3) into an explicit linear system in ${}^{t+\Delta t}\{U\}$, so the displacement at $t+\Delta t$ can be worked out directly. This solution method is called the explicit integration method. For the dynamic analysis of elastoplastic problems, an effective scheme is the central difference method, whose difference quotients are

${}^{t}\{\dot{U}\} = \frac{1}{2\Delta t}\left({}^{t+\Delta t}\{U\} - {}^{t-\Delta t}\{U\}\right)$   (4)

${}^{t}\{\ddot{U}\} = \frac{1}{\Delta t^{2}}\left({}^{t+\Delta t}\{U\} - 2\,{}^{t}\{U\} + {}^{t-\Delta t}\{U\}\right)$   (5)

Substituting (4) and (5) into (3) gives

$\left([M] + \frac{\Delta t}{2}[C]\right){}^{t+\Delta t}\{U\} = \Delta t^{2}\left({}^{t}\{P\} - {}^{t}\{F\}\right) + [M]\left(2\,{}^{t}\{U\} - {}^{t-\Delta t}\{U\}\right) + \frac{\Delta t}{2}[C]\,{}^{t-\Delta t}\{U\}$   (6)

Denoting

$[\bar{K}] = [M] + \frac{\Delta t}{2}[C]$   (7)

$\{\bar{R}\} = \Delta t^{2}\left({}^{t}\{P\} - {}^{t}\{F\}\right) + [M]\left(2\,{}^{t}\{U\} - {}^{t-\Delta t}\{U\}\right) + \frac{\Delta t}{2}[C]\,{}^{t-\Delta t}\{U\}$   (8)

formula (6) can be written as

$[\bar{K}]\,{}^{t+\Delta t}\{U\} = \{\bar{R}\}$   (9)

so ${}^{t+\Delta t}\{U\}$ can be worked out directly from formula (9). The calculation of the central difference method is conditionally stable: to ensure stability, the time step $\Delta t$ must be less than a critical time step $\Delta t_{cr}$, which is related to the minimum natural vibration period $T_{min}$ of the system. In finite element analysis, $T_{min}$ is the minimum vibration period of the finite element assembly, and ordinarily

$\Delta t_{cr} = \frac{T_{min}}{\pi}$   (10)

In the finite element analysis of plastic forming, the time step is ordinarily chosen as

$\Delta t = \alpha \Delta t_{cr}$   (11)

where $\alpha$ is a coefficient less than 1.0, ordinarily chosen between 0.5 and 0.8. In practical calculation, the critical time step can be estimated approximately from the element size: for every element,

$\Delta t_{cr} \le \frac{L_e}{c}$   (12)

where $c$ is the propagation velocity of the elastic wave; for metals,

$c = \sqrt{\frac{2G(1-v)}{\rho(1-2v)}}$   (13)
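As an illustration of the update (6)-(9), the following minimal sketch (not from the paper; the single-degree-of-freedom system and all numbers are illustrative) applies the central difference scheme to $M\ddot{u} + C\dot{u} = P - F$ with a linear internal force $F = ku$:

```python
import math

# Minimal sketch of the explicit central-difference update of Eqs. (6)-(9)
# for a single degree of freedom with linear internal force F = k*u.
# The system and all numbers are illustrative, not taken from the paper.
def central_difference(m, c, k, p, u0, v0, dt, steps):
    a0 = (p - c * v0 - k * u0) / m                # initial acceleration
    u_prev = u0 - dt * v0 + 0.5 * dt**2 * a0      # fictitious u at t = -dt
    u, history = u0, [u0]
    for _ in range(steps):
        # Eq. (6): (M + dt/2*C) u_{t+dt} = dt^2 (P - F)
        #          + M (2 u_t - u_{t-dt}) + dt/2 * C * u_{t-dt}
        lhs = m + 0.5 * dt * c
        rhs = dt**2 * (p - k * u) + m * (2 * u - u_prev) + 0.5 * dt * c * u_prev
        u_prev, u = u, rhs / lhs
        history.append(u)
    return history

# Undamped unit oscillator: the response should stay close to cos(t)
# as long as dt is well below the critical step of Eq. (10).
hist = central_difference(m=1.0, c=0.0, k=1.0, p=0.0,
                          u0=1.0, v0=0.0, dt=0.01, steps=700)
```

Because the scheme only solves the modified system (9) at each step, no global stiffness factorization is needed, which is what makes the explicit method attractive for contact-dominated rolling simulations.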


$L_e$ is the nominal length of the element and is related to the element type; its value can be taken as the distance between the two closest nodes in the element.
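With the rolled-piece steel data used below (E = 206 GPa, ν = 0.3, ρ = 7850 kg/m³) and the 1 mm element dimension used for the rolled-piece mesh, formulas (11)-(13) give a rough stable time step; this small sketch assumes α = 0.7, a value inside the stated 0.5-0.8 range:

```python
import math

# Rough element-wise stable time step from Eqs. (11)-(13), using the
# rolled-piece steel data from the paper (E = 206 GPa, nu = 0.3,
# rho = 7850 kg/m^3) and its 1 mm element size; alpha = 0.7 is assumed.
E, nu, rho = 206e9, 0.3, 7850.0
G = E / (2 * (1 + nu))                                  # shear modulus
c = math.sqrt(2 * G * (1 - nu) / (rho * (1 - 2 * nu)))  # wave speed, Eq. (13)
L_e = 1e-3                                              # element length [m]
dt_cr = L_e / c                                         # Eq. (12)
dt = 0.7 * dt_cr                                        # Eq. (11)
```

The resulting dilatational wave speed is roughly 5.9 km/s, so a 1 mm element implies a critical step on the order of 10⁻⁷ s, which is why explicit rolling simulations take many thousands of small steps.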
B. Establishment of the Model

To enhance the confidence and accuracy of the simulation, the dimensions of the geometric model are identical to those of the production equipment in the field, and the rolling process is idealized with the following assumptions: all roller diameters are equal; the rotational speeds of the rollers are the same, and all rollers are driven; the mechanical properties of the rolled piece are uniform; the rollers rotate at a constant angular velocity; the rolled piece moves toward the roll gap at a constant speed approaching or equal to the peripheral speed of the rollers until it is nipped into the gap; and the rolling process then proceeds by the friction force between roller and rolled piece. The blank is a round steel wire 2.7 mm in diameter, and the cross section of the precision profile is a rectangle 1.5 mm high and 3.5 mm wide, so the blank is rolled by a two-pass tandem process: it is first rolled in a two-roller mill to realize the thickness requirement of the profile basically, and the bloomed piece is then finish-rolled in a dislocated four-roller mill to obtain the formed precision profile. The material of the rolled piece is 50CrVA; its elastic modulus E is 206 GPa, tangent modulus Etan is 90 MPa, yield limit σs is 1127 MPa, density is 7850 kg/m³, and Poisson ratio is 0.3. The material of the rollers is 9CrSi; its elastic modulus E is 206 GPa, density is 7850 kg/m³, and Poisson ratio is 0.3. The diameter of the rollers D is 120 mm. The rolling process is modeled, solved and analyzed with the large general-purpose explicit dynamic finite element software ANSYS/LS-DYNA. As to element types, the explicit brick element SOLID164 is chosen for both the rolled piece and the rollers.
As to material models, the roller can be treated as rigid because its deformation is small under cold rolling conditions, so a rigid material model is chosen for the roller, and the Bilinear Kinematic (BKIN) material model is chosen for the rolled piece. In ANSYS/LS-DYNA the contact friction is based on the Coulomb formulation, and the friction coefficient is worked out with the following formula:

$\mu_c = \mu_d + (\mu_s - \mu_d)\,e^{-DC \cdot v}$   (14)

where $\mu_s$ is the static friction coefficient, $\mu_d$ is the dynamic friction coefficient, $DC$ is the exponential decay coefficient, and $v$ is the relative velocity between the contact faces. The maximum friction force can be limited with the viscous friction coefficient VC and the area of the contact region:

$F_{lim} = VC \cdot A_{cont}$   (15)
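The Bilinear Kinematic model chosen above can be sketched, for monotonic uniaxial loading, with the 50CrVA data listed earlier (E = 206 GPa, Etan = 90 MPa, σs = 1127 MPa); the kinematic shift of the yield surface on load reversal is not modeled in this simple sketch:

```python
# Schematic of the bilinear (BKIN) stress-strain curve for the rolled piece
# under monotonic loading, using the 50CrVA data from the paper:
# E = 206 GPa, tangent modulus Etan = 90 MPa, yield limit 1127 MPa.
# (The full BKIN model also shifts the yield surface kinematically on
# load reversal; that part is not sketched here.)
E, E_tan, sigma_y = 206e9, 90e6, 1127e6
eps_y = sigma_y / E                          # strain at first yield

def bkin_stress(eps):
    """Uniaxial stress for a monotonic strain eps >= 0."""
    if eps <= eps_y:
        return E * eps                       # elastic branch, slope E
    return sigma_y + E_tan * (eps - eps_y)   # hardening branch, slope Etan
```

The very small tangent modulus relative to E means the flow stress stays close to the 1127 MPa yield limit even at the large strains reached in the roll gap.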

The recommended viscous friction coefficient VC is

$VC = \frac{\sigma_0}{\sqrt{3}}$   (16)

in which $\sigma_0$ is the yield stress of the contact material. In order to avoid oscillation in the contact, the contact damping coefficient VDC can be used to apply damping normal to the contact faces. The applied damping $\xi$ is worked out with the following formulas:

$\xi = \frac{VDC}{100}\,\xi_{crit}$   (17)

$\xi_{crit} = 2m\omega$   (18)

$\omega = \sqrt{\frac{k\,(m_{slave} + m_{master})}{m_{slave}\, m_{master}}}$   (19)

$m = \min\{m_{slave},\, m_{master}\}$   (20)

Ordinarily the value of VDC is 20. The symmetric penalty function contact algorithm is selected, and the contact type is Automatic Surface-to-Surface Contact; the static friction coefficient μs is 0.482, the dynamic friction coefficient μd is 0.346, the viscous friction coefficient VC is 650.67 MPa, and the contact damping coefficient VDC is 20. The contacts between the rolled piece and all rollers are defined respectively. Since the roller is regarded as rigid, in order to reduce the element count and shorten the computation time, the solid roller is generally replaced with the roller surface in modeling. To approximate real production while reducing the element count, the real roller is simulated by a torus whose thickness and width are 5 mm, and the real rolled piece is simulated by a cylinder whose diameter is 2.7 mm and length is 5 mm. The geometric model of the two-roller mill and the dislocated four-roller tandem mill is shown in figure 1: the whole geometric model is shown in 1 (a), the roller arrangement of the dislocated four-roller mill in 1 (b), and the model of the rolled piece in 1 (c).
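Using the contact parameters listed above (μs = 0.482, μd = 0.346, σ0 = 1127 MPa), formulas (14) and (16) can be evaluated directly; the decay coefficient DC below is an illustrative value of our own, since the paper does not state it:

```python
import math

# Evaluate the Coulomb friction decay of Eq. (14) and the recommended
# viscous friction coefficient of Eq. (16) with the paper's parameters:
# mu_s = 0.482, mu_d = 0.346, sigma_0 = 1127 MPa.  DC is an assumed value.
mu_s, mu_d, sigma_0 = 0.482, 0.346, 1127e6
DC = 10.0                                   # decay coefficient (illustrative)

def mu_c(v):
    """Friction coefficient at relative sliding velocity v, Eq. (14)."""
    return mu_d + (mu_s - mu_d) * math.exp(-DC * v)

VC = sigma_0 / math.sqrt(3)                 # Eq. (16)
```

Note that Eq. (16) with the 1127 MPa yield limit reproduces the 650.67 MPa value of VC quoted above, so the coefficient is evidently taken as the shear yield stress of the rolled-piece material.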

(a) Whole geometry model


(b) Roller arrangement of dislocation four rollers rolling mill

(b) Rolled piece meshed model

(c) Model of rolled piece Fig. 1 Geometrical model of two roller rolling and dislocation four rollers tandem rolling

(c) Whole meshed model

C. Meshing of the Model

A mapped meshing method is used with hexahedral elements, and the meshing is done by manual sweeping. As to the roller, since the rolled piece model is quite short, less than a 1/4 arc of the roller contacts the rolled piece during the simulation. To reduce the element count and shorten the computation time, the whole roller is meshed into coarse grids and only the 1/4 arc that will contact the rolled piece is refined; the subdivision number in the width direction is 5, the subdivision number along the 1/4 arc is 100, the element dimension is 2 mm, and the element number is 23710. As to the rolled piece, the lines along the length and the circle on the end face are refined; the subdivision number in the length direction is 15, the subdivision number of the circle on the end face is 60, the element dimension is 1 mm, and the element number is 8745. The meshed model is shown in figure 2: one roller meshed model is shown in 2 (a), the rolled piece meshed model in 2 (b), the whole meshed model in 2 (c), and the meshed model of the rolled piece in 2 (d).

(d) Meshed model of rolled piece Fig. 2 Meshed model

III. ANALYSIS OF SIMULATION RESULTS

The stress distribution nephograms and the deformation of the rolled piece at the nipping stage, the stable rolling stage and the steel-throwing stage of the first pass are shown in (a), (b) and (c) of figure 3 respectively; those of the second pass are shown in (a), (b) and (c) of figure 4 respectively. The different color regions in the stress nephogram of the rolled piece correspond to different stress magnitudes; the stress scale on the right edge gives the correspondence between color and stress value in scientific notation, and its unit is Pa.
A. Results Analysis of the First Pass

Figure 3 (a) reveals that an obvious double swallow tail has appeared at the forehead of the rolled piece, the nipping region has been rolled into an obvious wedge shape, the elements in the region contacting the roller have deformed obviously, and the element grids have been elongated in the length direction,

(a) Roller meshed model


but the element grids at the heart and on both sides have not deformed obviously. The metal at the top and bottom surfaces of the forehead of the rolled piece contacts the roller and is pressed by it, while the end face of the forehead is a free surface, so the restraint on the metal deformation there is small. The metal at the heart and on both sides of the rolled piece is not pressed by the roller, so its deformation is small. The metal obeys the rule of flowing toward the place of least resistance, so the forehead of the rolled piece is rolled into an obvious wedge. The stress in the rolled region at the forehead and in the region contacting the roller is larger, while the stress on both sides of the forehead and in the unrolled region is smaller.

Figure 3 (b) reveals that the double swallow tail at the forehead is more obvious than at the nipping stage, the contact region and its neighborhood flow obviously and form a 1/4 spheroid shape, and the rolled region has been rolled to the shape of the pass. The element grids on the contact surface have been elongated in the length direction without obvious distortion in the width direction; the element grids at the heart of the rolled region have been elongated in the width direction without obvious distortion in the length direction; the element grids on both sides have not been distorted obviously. This indicates that at the stable rolling stage the superficial metal contacting the roller mainly extends in the length direction, while the heart metal of the rolled region mainly extends in the width direction. As the rolling advances, large stresses are distributed in the rolled region and on the top and bottom surfaces contacting the roller.
Figure 3 (c) reveals that the forehead and the tail of the rolled piece have been rolled into obvious double swallow tails, the centers of both sides of the rolled piece have extruded slightly, and the rolled piece has been rolled to the shape of the pass. The element grids at the heart have been elongated and flattened in the width direction, the element grids on both sides and on the contact surface have been elongated in the length direction, and the deformations of all grids are synchronous. The elongation coefficient of the rolled piece is 1.101 in the length direction and 1.270 in the width direction, so the distortion is fiercer in width than in length. The residual stress in the interior of the rolled piece, which stems from the elastic recovery of the metal, is 1.093 GPa.
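As a rough plausibility check of our own (not an analysis from the paper), volume constancy in plastic deformation implies that the product of the elongation coefficients in the three directions is 1, so the first-pass height reduction factor can be estimated from the two reported coefficients:

```python
# Rough volume-constancy check (our illustration, not the paper's analysis):
# with length elongation 1.101 and width elongation 1.270 reported for the
# first pass, plastic incompressibility gives the height reduction factor.
lam_length, lam_width = 1.101, 1.270
lam_height = 1.0 / (lam_length * lam_width)   # volume constancy
h_after = 2.7 * lam_height                    # starting from the 2.7 mm blank
```

The estimated intermediate height (about 1.9 mm) lies between the 2.7 mm blank diameter and the 1.5 mm final profile height, consistent with the finishing role attributed to the second pass.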

(a) Nipping stage


(b) Stable rolling stage

(c) Throwing steel stage Fig. 3 Stress nephogram of the first pass

B. Results Analysis of the Second Pass

Figure 4 reveals that the rules of metal flow and stress distribution of the second pass are basically the same as those of the first pass. The rolled piece was already formed in the height direction by the first pass, so the second pass only plays a finishing role for the height dimension, and the main rolling distortion occurs in the width direction: the half-arc parts on both sides of the rolled piece are flattened by rolling, the rolled piece is rolled to the shape of the pass completely, and its section is the expected rectangle. The rolled piece extends slightly in the length direction, with an elongation coefficient of 1.048. Residual stress still exists in the interior of the rolled piece after the second pass; its value, 0.971 GPa, is slightly smaller than that of the first pass. The reason is that the first pass is two-roller rolling, in which both sides of the rolled piece are unrestrained, while the second pass is four-roller rolling, in which all four sides are restrained; the restraint changes the three-dimensional stress state of the metal in the interior of the rolled piece.


(a) Nipping stage

(c) Throwing steel stage Fig. 4 Stress nephogram of the second pass

IV. CONCLUSIONS

Through plenty of exploration in both theory and practice, an appropriate explicit dynamic finite element model was built, the rolling process of the precision profile used for piston rings was simulated, and the flow law of the metal, the deformation of the rolled piece and the distribution of stress during rolling were analyzed. These results can provide guidance for the design and improvement of the forming rollers and the optimization of the process. An actual two-roller and dislocated four-roller

(b) Stable rolling stage


tandem rolling experiment was done in the production field, and the experimental results agree with the simulation results very well, so plenty of time and expense are saved. At the same time, this indicates that the explicit dynamic finite element method can be used very well for the three-dimensional simulation of the cold roll forming of the precision profile.

ACKNOWLEDGEMENT

Financial support from the CEEUSRO Special Plan of Hunan Province (2010XK6066), the Aid Program for Science and Technology Innovative Research Team in Higher Educational Institutions of Hunan Province, the Industrial Cultivation Program of Scientific and Technological Achievements in Higher Educational Institutions of Hunan Province (10CY008), the Natural Science Foundation of Hunan Province (11JJ9016), the Scientific Research Fund of Hunan Provincial Education Department (09C405), and the Ph.D. Start Fund (E50925) is gratefully acknowledged.

REFERENCES
[1] L.Z. Liu, X.H. Liu, and Z.Y. Jiang, "Strip Rolling Simulation by the Dynamic Explicit FEM," Journal of Plasticity Engineering, vol. 8, pp. 51-54, 2001.
[2] F.X. Diao and K.F. Zhang, "Dynamic explicit finite element analysis of springback of sheet V-bending," Materials Science and Technology, vol. 10, pp. 170-174, 2002.
[3] H.B. Xie, H. Xiao, and G.M. Zhang, "Analysis of strip rolling pressure distribution for different width by explicit dynamic FEM," Journal of Plasticity Engineering, vol. 10, pp. 61-64, 2003.
[4] D. Wu, X.M. Zhao, and J.C. Li, "Simulation of billet rolling in oval roll-profile by FEM," Journal of Plasticity Engineering, vol. 10, pp. 57-60, 2003.
[5] H.S. Niu, X.M. Zhao, and D. Wu, "Profiled billet deformation simulation of H-beam with slab by FEM," Journal of Plasticity Engineering, vol. 13, pp. 41-44, 2006.
[6] L. Fan, N.L. Tan, and D.P. Shen, "FEA on Stress Field of Rolling Element Bearing Based on Explicit Dynamics," Journal of Beijing Jiaotong University, vol. 30, pp. 109-112, 2006.
[7] L.Y. Li, H. Yang, and L.G. Guo, "Research on Interactive Influences of Parameters on T-shaped Cold Ring Rolling by 3D-FE Numerical Simulation," Journal of Mechanical Science and Technology, vol. 21, pp. 1541-1547, 2007.

[8] L. Hua, D.S. Qian, and L.B. Pan, "Deformation Behaviors and Conditions in L-section Profile Cold Ring Rolling," Journal of Materials Processing Technology, vol. 209, pp. 5087-5096, 2009.
[9] M.R. Shuai, S.B. Liu, and C.M. Gao, "Finite Element Simulation of Cold Rolling Process of Shaped Steel Tube for Driving Shaft," International Journal of Iron and Steel Research, vol. 17, pp. 25-29, 2010.
[10] J. Chen, K. Chandrashekhara, and C. Mahimkar, "Void Closure Prediction in Cold Rolling Using Finite Element Analysis and Neural Network," Journal of Materials Processing Technology, vol. 211, pp. 245-255, 2011.

Jigang Wu was born on August 3, 1978 in Hunan province. He received a master's degree from Wuhan University in 2004 and a Ph.D. from Huazhong University of Science and Technology in 2008; his research fields are advanced manufacturing technology, dynamics of machinery, and fault diagnosis. He has been working at Hunan University of Science and Technology since 2008, mainly engaged in the theory and practice of teaching and in scientific research. His papers include "Subpixel Edge Detection of Machine Vision Image for Thin Sheet Part," China Mechanical Engineering (2009), and "Research on Planar Contour Primitive Recognition Method based on Curvature and HOUGH Transform," Journal of Electronic Measurement and Instrument (2010). Dr. Wu is a member of the Hunan Province Instrumentation Institute. In recent years he has presided over two provincial research projects, obtained three provincial academic rewards, and published more than 10 academic papers.

Xuejun Li, male, born in 1969, received his Ph.D. from Central South University in 2003. He is currently working at Hunan University of Science and Technology, China. His research interests include fault diagnosis and dynamic monitoring and control of electromechanical systems.

Kuanfang He, male, born in 1979, received his Ph.D. from South China University of Technology in 2009. He is currently working at Hunan University of Science and Technology, China. His research interests include dynamic monitoring and control of electromechanical systems.


Bargmann and Neumann System of the Second-Order Matrix Eigenvalue Problem


Shujuan Yuan
Qinggong College, Hebei United University, Tangshan, China Email: yuanshujuan1980@163.com

Shuhong Wang
College of Mathematics, Inner Mongolia University for Nationalities, Tong Liao, China Email: shuhong7682@126.com

Wei Liu
Department of Mathematics and Physics, Shijiazhuang TieDao University, Shijiazhuang, China Email: Lwei_1981@126.com

Xiaohong Liu, Li Li
Qinggong College, Hebei United University, Tangshan, China Email: lily130203@126.com, lili2064@163.com

Abstract—This paper discusses a second-order matrix eigenvalue problem by means of the nonlinearization of Lax pairs, and gives the Bargmann and Neumann constraints of this problem. The relation between the potentials and the eigenfunctions is set up based on these constraints. By means of the nonlinearization of the Lax pairs, the systems associated with the eigenvalue problem are shown to be equivalent to Hamiltonian canonical systems in a real symplectic space. In the end, the infinite-dimensional dynamical systems are reduced to finite-dimensional Hamiltonian canonical systems in the symplectic space, and representations of the solutions of the evolution equations are obtained.

Index Terms—eigenvalue problem, evolution equation, Lax representation, integrable system

I. INTRODUCTION

The understanding of completely integrable Hamiltonian systems went through a tortuous process. Writing down and solving the equations of motion was the highest goal of early classical mechanics. With the rise of soliton theory, many nonlinear evolution equations have been shown to restrict to completely integrable systems on submanifolds, and their investigation is one of the most highlighted subjects in mathematical physics. Integrable systems are widely used in fluid mechanics, nonlinear optics and a series of other nonlinear sciences. Finding as many integrable systems as possible and studying their algebraic, geometric and other properties makes it possible to judge whether a system is completely integrable, and the results can be applied to nonlinear evolution equations. As is well known, the famous Liouville theorem laid a beautiful geometric foundation for the theory of finite-dimensional completely integrable Hamiltonian systems. According to the Liouville theorem, a 2n-
2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2216-2223

dimensional Hamiltonian system is completely integrable if it has n independent conserved integrals in mutual involution [1, 2]; in the infinite-dimensional situation, complete integrability requires an infinite number of independent conserved integrals in involution. In the past, systematic methods such as inverse scattering and the Darboux transformation were developed to study soliton equations. Finite-dimensional completely integrable systems are closely connected with the infinite-dimensional completely integrable systems described by soliton equations. For instance, when studying pole solutions of soliton equations, one finds that the equations of motion of the poles are finite-dimensional integrable systems; further examples are the stationary equations of soliton equations, which are also finite-dimensional integrable systems. Finite-dimensional integrable systems can thus be generated from infinite-dimensional ones. The resulting finite-dimensional integrable systems fall mainly into two kinds: the Bargmann type, which is free, and the Neumann type, which carries an additional constraint. The former has been investigated by many authors in a series of papers; by contrast, the latter is more difficult to treat owing to the additional constraint, and few examples, such as the cKdV, have been discussed. In our opinion, the Dirac bracket and Moser's constraint are effective in exploring Neumann-type finite-dimensional integrable systems via the so-called nonlinearization of Lax pairs or of eigenvalue problems. Therefore, seeking a new completely integrable system associated with nonlinear evolution equations is an interesting issue in the international mathematical physics community. This paper obtains a new finite-dimensional completely integrable system by nonlinearization of the eigenvalue problem.


II. LAX REPRESENTATION AND THE EVOLUTION EQUATION HIERARCHY RELATED TO THE EIGENVALUE PROBLEM

TABLE I. NOTE SYMBOLS

1. ∂ — the partial derivative with respect to x
2. Λ — the diagonal matrix diag(λ1, λ2, …, λN)
3. T — the transpose of a vector or matrix
4. ⟨ξ, η⟩ = Σ_{k=1}^{N} ξ_k η_k — the inner product of the N-dimensional vectors ξ and η
5. [W, L] = WL − LW — the commutator of the operators W and L
6. grad λ — the functional gradient of λ

From $W_x = [M, W]$, the entries $w_{11}$, $w_{12}$, $w_{21}$, $w_{22}$ of $W$ are determined recursively in terms of $a_{j-1}$, $b_{j-1}$ and their $x$-derivatives, with $w_{12} = a_{j-1}$.

Definition:
K K = 11 K 21 K 12 K 22

1 K11 = r + r , K12 = q 2 2 1 2 K 21 = q + , K 22 = 0 2
and the Lenard recursion

$K G_{j-1} = J G_j, \quad j = 0, 1, 2, 3, \ldots$   (2)

Figure1. The Jacobi fields

$G_j = \begin{pmatrix} a_j \\ b_j \end{pmatrix}, \quad j = -1, 0, 1, 2, \ldots$   (3)

Note: the operators K and J form a Hamiltonian pair [3]; that is, K and J are antisymmetric, bilinear and non-degenerate, and they satisfy the Jacobi identity. The second-order matrix $W_m$ is defined as follows:
Wm = 1 qa j 1 2 a j 1x + a j 1 j =0 1 qa j 1 b j 1x + a j 1 2
m

a j 1
1 qa j 1 + a j 1x 2

a j 1

Figure2. The period of unit mass in the potentials plotted versus energy

This paper has the following proposition.

Proposition 2.1: The evolution equation hierarchy [4] related to the eigenvalue problem is (with $u = (r, q)^T$):

$u_{t_m} = (r_{t_m}, q_{t_m})^T = J G_m, \quad m = 0, 1, 2, \ldots$   (4)

$M_{t_m} = (W_m)_x + W_m M - M W_m$   (5)

In other words, (4) is the spectrum-preserving consistency condition of the following two linear equations:

This paper considers the second-order matrix eigenvalue problem

$\begin{pmatrix} \varphi_1 \\ \varphi_2 \end{pmatrix}_x = M \begin{pmatrix} \varphi_1 \\ \varphi_2 \end{pmatrix}$   (1)

where the matrix $M$ depends on the spectral parameter $\lambda$ and the potential $u = (r, q)^T$. Let $W = \begin{pmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{pmatrix}$, with the compatibility conditions $\varphi_{x t_m} = \varphi_{t_m x}$ and $\lambda_{t_m} = 0$.

$\begin{pmatrix} \varphi_1 \\ \varphi_2 \end{pmatrix}_x = M \begin{pmatrix} \varphi_1 \\ \varphi_2 \end{pmatrix}, \qquad \begin{pmatrix} \varphi_1 \\ \varphi_2 \end{pmatrix}_{t_m} = W_m \begin{pmatrix} \varphi_1 \\ \varphi_2 \end{pmatrix}$   (6)

The functional derivative is denoted by the symbol δ.

Proposition 2.3: If $\varphi_1, \varphi_2$ is the eigenfunction of the eigenvalue problem (1) corresponding to $\lambda$, then the functional gradient [6, 7] is

$\operatorname{grad} \lambda = \left(\frac{\delta\lambda}{\delta r}, \frac{\delta\lambda}{\delta q}\right)^T = \left(\varphi_1^2,\; 2\varphi_1\varphi_2\right)^T$

Let
JG1 = 0 ,

so G1 = 4 , from KGm 1 = JGm ,


0

2q r G0 = 4q + 2r 4rqx + 4qrx 3rrx + 4qqx rxx 2qxx G1 = 1 4qqx rxx + qxx (qr ) x 2 Then this paper has the evolution equation:
$u_{t_0} = (r_{t_0}, q_{t_0})^T = (4r_x,\; 4q_x)^T$   (7)
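As reconstructed, the first flow (7) of the hierarchy is the linear transport system $u_{t_0} = (4r_x, 4q_x)^T$, whose exact solution is a translation $u(x, t) = u(x + 4t, 0)$ (up to the sign convention lost in the source). A small sketch of our own integrates one component with RK4 and periodic central differences and compares it with the exact translation:

```python
import math

# Our illustration of the first flow (7): r_t = 4 r_x is a linear transport
# equation whose exact solution is the translation r(x, t) = r(x + 4t, 0).
# Integrate on a periodic grid with central differences in x and RK4 in t.
N, c = 128, 4.0
dx = 2 * math.pi / N

def dxc(u):                              # periodic central difference
    return [(u[(i + 1) % N] - u[(i - 1) % N]) / (2 * dx) for i in range(N)]

def rk4_step(u, dt):
    def f(v): return [c * d for d in dxc(v)]
    k1 = f(u)
    k2 = f([u[i] + 0.5 * dt * k1[i] for i in range(N)])
    k3 = f([u[i] + 0.5 * dt * k2[i] for i in range(N)])
    k4 = f([u[i] + dt * k3[i] for i in range(N)])
    return [u[i] + dt / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
            for i in range(N)]

u = [math.sin(i * dx) for i in range(N)]
dt = 0.001
for _ in range(500):                     # integrate to t = 0.5
    u = rk4_step(u, dt)
exact = [math.sin(i * dx + c * 0.5) for i in range(N)]
err = max(abs(u[i] - exact[i]) for i in range(N))
```

The numerical solution tracks the shifted profile closely, confirming that the lowest flow of the hierarchy only translates the potentials; the genuinely nonlinear dynamics enter at the higher flows such as (9).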

Proof: 1 = q

2 x

1 + r q 2

(8)

r ut1 = t1 qt1 4qrx + 4rqx 3rrx rxx 2qxx (9) = 4qqx + (qr ) x + qxx 1 rxx 2 rt 2 ut 2 = qt 2

= M , 0 1 & 1 (1 , 2 ) M dx = 0 , 1 0 2 0 1 q (1 , 2 ) dx = 0 1 0 + r + q
2 2 2 [(21 2 1 + 2 ) 212 q 1 r ]dx = 0

The eigenfunction is normalized so that $\int \left(2\lambda\varphi_1\varphi_2 - \varphi_1^2 + \varphi_2^2\right) dx = 1$, so the conclusion holds.


Proposition 2.4: The functional gradient satisfies $K \operatorname{grad} \lambda = \lambda\, J \operatorname{grad} \lambda$.

3 15 rt 2 = 3qqx r 3r 2 qx 6qrrx + rqxx + r 2 rx 2 8 3 2 3 2 3 + q rx + rx + 6q 2 qx x qrxx 2 4 2 3 1 1 2 3qqxx 3qx + rrxx + rxxx qxxx 4 2 2 3 2 3 2 qt 2 = 3qqx r + q qx q rx + 2 2 3 3 2 3 3 3 3 qrrx + qx + r 2 qx qxx r qx rx qrxx 4 2 8 4 2 4 1 3 3 + qxxx + rx 2 + rrxx 2 8 8

Proof: K =
r + r q + 1 2 2 q 1 2 2 2 1 = 2 1 2 0

2 r 1 2 + 4 q 1 2 4 2 1 2 2 2 1 2 + 2 2 2 2 2 q 1 2 + 2 2 1 2 + 3 2 2 2 + 4 2 1 2

J =
2 2 1 = 1 2 1 2 2

2 r 1 2 + 4 q 1 2 4 2 1 2 2 2 1 2 + 2 2 2 2 2 q 1 2 + 2 2 1 2 + 3 2 2 2 + 4 2 1 2

so K = J
III.THE HAMILTON EQUATION AND ITS COMPLETE INTEGRABILITY UNDER THE BARGMANN CONSTRAINT Suppose
Figure3. Sample results

that

1 < 2 < L N

is

different

eigenvalues of (5). 1 j , 2 j are characteristic function of


Proposition 2.2: Let = (1, 2 ) If the trace [5] of
T

j ( j = 1, 2, L , N ), so
1 j 2 j

the second- order matrices M over the real is 0, so 0 1 & 1 (1 , 2 ) M dx = 0 1 0 2



= M ( u , j )

1 j , 2j

(10)


1 j 2 j

tm

= wm ( u , j )

1 j [5] , 2j

(11)

Let Λ = diag(λ1, λ2, …, λN), Ψ1 = (ψ11, …, ψ1N)ᵀ, Ψ2 = (ψ21, …, ψ2N)ᵀ, and

Hₘ = −(1/2)⟨ΛᵐΨ1, Ψ1⟩ + 4⟨Λᵐ⁺¹Ψ1, Ψ2⟩ + 2⟨Λᵐ⁺¹Ψ2, Ψ2⟩ + (1/2)⟨ΛᵐΨ1, Ψ1⟩⟨Ψ1, Ψ2⟩ + (1/2)Σⱼ₌₀ᵐ (⟨ΛʲΨ2, Ψ2⟩⟨Λᵐ⁻ʲΨ2, Ψ1⟩ − ⟨ΛʲΨ2, Ψ1⟩⟨Λᵐ⁻ʲΨ1, Ψ1⟩).   (17)

The generator Eₖ is as follows:

Eₖ = λₖψ1k²⟨Ψ1, Ψ2⟩ + 4ψ1kψ2k + (1/2)ψ1k²⟨Ψ1, Ψ1⟩ + 2ψ2k² − (1/2)ψ1k²⟨Ψ2, Ψ2⟩ + Bₖ,

among them, Bₖ = Σⱼ₌₁,ⱼ≠ₖᴺ (ψ1kψ2j − ψ2kψ1j)²/(λₖ − λⱼ).

From K grad λ = λ J grad λ, the result:

K(⟨ΛᵏΨ1, Ψ1⟩, ⟨ΛᵏΨ1, Ψ2⟩)ᵀ = J(⟨Λᵏ⁺¹Ψ1, Ψ1⟩, ⟨Λᵏ⁺¹Ψ1, Ψ2⟩)ᵀ.   (12)

From (7) and (11),

Gⱼ = (⟨ΛʲΨ1, Ψ1⟩, ⟨ΛʲΨ1, Ψ2⟩)ᵀ, j = 0, 1, 2, …,
so that

Gⱼ = (aⱼ₋₁⟨ΛʲΨ1, Ψ1⟩, bⱼ₋₁⟨ΛʲΨ1, Ψ2⟩)ᵀ, j = 1, 2, ….   (13)

The constraint condition is the Bargmann constraint condition:

G0 = (4q + 2r, −2q − r)ᵀ = (⟨Ψ1, Ψ1⟩, ⟨Ψ1, Ψ2⟩)ᵀ,

which gives

q = (1/4)⟨Ψ1, Ψ1⟩ + (1/4)⟨Ψ1, Ψ2⟩,  r = −(1/2)⟨Ψ1, Ψ1⟩ − (1/2)⟨Ψ1, Ψ2⟩.   (14)

Proposition 3.2: (1) {Eₖ, k = 1, 2, …, N} is an involutive system, so that {Eₖ, Eⱼ} = 0, k, j = 1, 2, …, N, and the differentials {dEₖ} are linearly independent;

(2) H = Σₖ₌₁ᴺ Eₖ/(λ − λₖ) = Σₘ₌₀^∞ λ⁻ᵐ⁻¹Hₘ, among them, Hₘ = Σₖ₌₁ᴺ λₖᵐEₖ [9], m = 1, 2, ….

Proof (1): From the definition of the Poisson bracket,

{ψ1k, ψ1l} = {ψ2k, ψ2l} = 0,  {ψ1k, ψ2l} = δₖₗ (equal to 1 if k = l and 0 if k ≠ l).

The Poisson bracket [8] of smooth functions H and F in the symplectic space (R²ᴺ, Σₖ₌₁ᴺ dψ2k ∧ dψ1k) is defined as follows:

{H, F} = Σₖ₌₁ᴺ (∂H/∂ψ2k · ∂F/∂ψ1k − ∂H/∂ψ1k · ∂F/∂ψ2k).

Proposition 3.1: (10) and (11) can be written as a finite-dimensional system:

Ψ1ₓ = ∂H/∂Ψ2, Ψ2ₓ = −∂H/∂Ψ1;  Ψ1ₜₘ = ∂Hₘ/∂Ψ2, Ψ2ₜₘ = −∂Hₘ/∂Ψ1, m = 1, 2, ….   (15)

H and Hₘ are Hamilton functions here:

H = (1/4)⟨ΛΨ1, Ψ1⟩⟨Ψ1, Ψ2⟩ − (1/8)⟨Ψ1, Ψ2⟩⟨Ψ1, Ψ2⟩ − (1/8)⟨Ψ1, Ψ1⟩⟨Ψ1, Ψ1⟩ + ⟨ΛΨ1, Ψ2⟩ + (1/2)⟨Ψ2, Ψ2⟩ + (1/2)⟨Ψ1, Ψ1⟩,   (16)

and Hₘ is given by (17).

Continuing Proof (1), one computes

(⟨Ψ1, Ψ1⟩, Bₖⱼ) = (⟨Ψ1, Ψ2⟩, Bₖⱼ) = (⟨Ψ2, Ψ2⟩, Bₖⱼ) = 0,

(Bₖ, Bₗ) = (Σⱼ≠ₖ (ψ1kψ2j − ψ2kψ1j)²/(λₖ − λⱼ), Σⱼ≠ₗ (ψ1lψ2j − ψ2lψ1j)²/(λₗ − λⱼ)) = 0,

(ψ1k², Bₗ) = 4(λₖ − λₗ)⁻¹ ψ1kψ1l Bₗₖ,
(ψ2k², Bₗ) = 4(λₖ − λₗ)⁻¹ ψ2kψ2l Bₖₗ,
(ψ1kψ2k, Bₗ) = 2(λₖ − λₗ)⁻¹ (ψ2kψ1l + ψ1kψ2l) Bₗₖ,

so (Eₖ, Eₗ) = 0.

Proof (2):

Σₖ₌₁ᴺ Eₖ/(λ − λₖ) = Σₖ₌₁ᴺ Eₖ Σₘ₌₀^∞ λₖᵐλ⁻ᵐ⁻¹ = Σₘ₌₀^∞ λ⁻ᵐ⁻¹ Σₖ₌₁ᴺ λₖᵐEₖ = Σₘ₌₀^∞ λ⁻ᵐ⁻¹Hₘ [10].

Proposition 3.3: (1) {H, Eₖ} = 0, k = 1, 2, …, N.

(2) {H, Hₘ} = 0, m = 0, 1, 2, …; (3) {Hₙ, Hₘ} = 0, m, n = 0, 1, 2, ….
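The expansion used in Proof (2) of Proposition 3.2 can be written out explicitly. Assuming only the stated definition Hₘ = Σₖ λₖᵐEₖ, it is the geometric-series identity:

```latex
\sum_{k=1}^{N} \frac{E_k}{\lambda-\lambda_k}
  = \sum_{k=1}^{N} E_k \sum_{m=0}^{\infty} \frac{\lambda_k^{\,m}}{\lambda^{\,m+1}}
  = \sum_{m=0}^{\infty} \lambda^{-m-1} \sum_{k=1}^{N} \lambda_k^{\,m} E_k
  = \sum_{m=0}^{\infty} \lambda^{-m-1} H_m ,
  \qquad |\lambda| > \max_k |\lambda_k| .
```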


(4) (R²ᴺ, Σₖ₌₁ᴺ dψ2k ∧ dψ1k, H) is a completely integrable system in the Liouville sense.

Proof: {H, Eₖ} = 0, k = 1, 2, …, N can be proved by calculation; {H, Hₘ} = 0, m = 0, 1, 2, … and {Hₙ, Hₘ} = 0, m, n = 0, 1, 2, … follow from the previous proposition; that (R²ᴺ, Σₖ dψ2k ∧ dψ1k, H) is completely integrable is proved by the Liouville theorem. Hence the proposition is concluded.

Proposition 3.4: If ψ1, ψ2 are involutive solutions of the Hamilton regular system [11], then (r, q)ᵀ with

r = −(1/2)⟨Ψ1, Ψ2⟩ − (1/2)⟨Ψ1, Ψ1⟩,  q = (1/4)⟨Ψ1, Ψ2⟩ + (1/4)⟨Ψ1, Ψ1⟩

is the solution of the evolution equation hierarchy (rₜₘ, qₜₘ)ᵀ = JGₘ, m = 0, 1, 2, ….

IV. THE HAMILTON EQUATION AND ITS COMPLETE INTEGRABILITY UNDER THE NEUMANN CONSTRAINT

Figure 4. Soliton solution

Figure 5. Soliton solution

Suppose that λ1 < λ2 < ⋯ < λN are distinct eigenvalues of Equation (2.5), and ψ1j, ψ2j are the characteristic functions [12] of λj (j = 1, 2, …, N), so that

(ψ1j, ψ2j)ₓᵀ = M(u, λj)(ψ1j, ψ2j)ᵀ,   (18)
(ψ1j, ψ2j)ₜₘᵀ = wₘ(u, λj)(ψ1j, ψ2j)ᵀ.   (19)

Let Λ = diag(λ1, λ2, …, λN), Ψ1 = (ψ11, …, ψ1N)ᵀ, Ψ2 = (ψ21, …, ψ2N)ᵀ. From K grad λ = λ J grad λ, we can get the result:

K(⟨ΛᵏΨ1, Ψ1⟩, ⟨ΛᵏΨ1, Ψ2⟩)ᵀ = J(⟨Λᵏ⁺¹Ψ1, Ψ1⟩, ⟨Λᵏ⁺¹Ψ1, Ψ2⟩)ᵀ.   (20)

From (14) and (19), we have

Gⱼ = (⟨Λʲ⁺¹Ψ1, Ψ1⟩, ⟨Λʲ⁺¹Ψ1, Ψ2⟩)ᵀ, j = −1, 0, 1, 2, …,

and at the same time

Gⱼ = (aⱼ₋₁⟨ΛʲΨ1, Ψ1⟩, bⱼ₋₁⟨ΛʲΨ1, Ψ2⟩)ᵀ, j = 1, 2, ….   (21)

G₋₁ vanishes, which gives the Neumann constraint condition [13, 14]:

⟨Ψ1, Ψ1⟩ = 4,  ⟨Ψ1, Ψ2⟩ = 0.

Then

q = (1/4)⟨ΛΨ1, Ψ1⟩ + (1/4)⟨ΛΨ1, Ψ2⟩,  r = −(1/2)⟨ΛΨ1, Ψ2⟩ − (1/4)⟨ΛΨ1, Ψ1⟩.   (22)

The Poisson bracket [15] of smooth functions H and F [16] in the symplectic space [17] (R²ᴺ, Σₖ₌₁ᴺ dψ2k ∧ dψ1k) is defined as before:

{H, F} = Σₖ₌₁ᴺ (∂H/∂ψ2k · ∂F/∂ψ1k − ∂H/∂ψ1k · ∂F/∂ψ2k).

Proposition 4.1: (18) and (19) can be written as a finite-dimensional system:

Ψ1ₓ = ∂H/∂Ψ2, Ψ2ₓ = −∂H/∂Ψ1;  Ψ1ₜₘ = ∂Hₘ/∂Ψ2, Ψ2ₜₘ = −∂Hₘ/∂Ψ1.   (23)

H and Hₘ are the Hamilton functions here, given by (24) and (25).


H = (1/4)⟨ΛΨ1, Ψ1⟩⟨Ψ1, Ψ2⟩ − (1/4)⟨ΛΨ1, Ψ2⟩⟨Ψ1, Ψ2⟩ + ⟨ΛΨ1, Ψ2⟩ + (1/8)⟨Ψ1, Ψ1⟩⟨Ψ1, Ψ1⟩ − (1/8)⟨Ψ2, Ψ2⟩⟨Ψ1, Ψ1⟩,   (24)

Hₘ = −(1/4)⟨Λᵐ⁺¹Ψ1, Ψ1⟩⟨Ψ2, Ψ2⟩ + (1/2)⟨Λᵐ⁺¹Ψ1, Ψ1⟩⟨Ψ1, Ψ1⟩ + ⟨Λᵐ⁺¹Ψ1, Ψ2⟩ + (1/2)Σⱼ₌₀ᵐ⁺¹ (⟨ΛʲΨ2, Ψ2⟩⟨Λᵐ⁻ʲ⁺¹Ψ1, Ψ1⟩ − ⟨ΛʲΨ2, Ψ1⟩⟨Λᵐ⁻ʲ⁺¹Ψ2, Ψ1⟩).   (25)

The generator [18] Eₖ is as follows:

Eₖ = −(1/2)λₖψ1k²⟨Ψ1, Ψ2⟩ + 4λₖψ1kψ2k − 4λₖψ1k² + (1/2)ψ1k²⟨Ψ1, Ψ1⟩ − (1/2)ψ1k²⟨Ψ2, Ψ2⟩ + Bₖ,

where Bₖ = Σⱼ₌₁,ⱼ≠ₖᴺ (ψ1kψ2j − ψ2kψ1j)²/(λₖ − λⱼ).

Proposition 4.2: (1) {Eₖ, k = 1, 2, …, N} is an involutive system [19], so that {Eₖ, Eⱼ} = 0, k, j = 1, 2, …, N, and the differentials {dEₖ} are linearly independent [20];

(2) H = Σₖ₌₁ⁿ Eₖ/(λ − λₖ) = Σₘ₌₀^∞ λ⁻ᵐ⁻¹Gₘ, with Gₘ = Σₖ₌₁ⁿ λₖᵐEₖ and Hₘ = Gₘ₊₁; among them, Hₘ = Σₖ₌₁ⁿ λₖᵐ⁺¹Eₖ, m = 1, 2, ….

Proof (1): From the definition of the Poisson bracket,

{ψ1k, ψ1n} = {ψ2k, ψ2n} = 0,  {ψ1k, ψ2n} = δₖₙ (equal to 1 if k = n and 0 if k ≠ n),

(⟨Ψ1, Ψ1⟩, Bₖₙ) = (⟨Ψ2, Ψ2⟩, Bₖₙ) = (⟨Ψ1, Ψ2⟩, Bₖₙ) = 0,  (Bₖ, Bₙ) = 0,

(ψ1k², Bₙ) = 4(λₖ − λₙ)⁻¹ ψ1kψ1n Bₙₖ,
(ψ2k², Bₙ) = 4(λₖ − λₙ)⁻¹ ψ2kψ2n Bₙₖ,
(ψ1kψ2k, Bₙ) = 2(λₖ − λₙ)⁻¹ (ψ2kψ1n + ψ1kψ2n) Bₙₖ.

Therefore, (Eₖ, Eₙ) = 0.

Proof (2): Σₖ₌₁ⁿ Eₖ/(λ − λₖ) = Σₘ₌₀^∞ λ⁻ᵐ⁻¹ Σₖ₌₁ⁿ λₖᵐEₖ = Σₘ₌₀^∞ λ⁻ᵐ⁻¹Gₘ.

Proposition 4.3: Let f1 = ⟨Ψ1, Ψ1⟩ − 4 and f2 = ⟨Ψ1, Ψ2⟩; then on the solution set: (1) {H, fⱼ} = 0, j = 1, 2; (2) {fⱼ, Eₖ} = 0, j = 1, 2, k = 1, 2, …, n; (3) det{fᵢ, fⱼ} ≠ 0, i, j = 1, 2; (4) {H, Eₖ} = 0, k = 1, 2, …, N; (5) {H, Hₘ} = 0, m = 0, 1, 2, …; (6) {Hₙ, Hₘ} = 0, m, n = 0, 1, 2, …; (7) (R²ⁿ, Σₖ₌₁ⁿ dψ2k ∧ dψ1k, H) is a completely integrable system in the Liouville sense [21].

Proof: (1)-(4) can be proved by calculation; (5) and (6) follow from the previous proposition; (7), the complete integrability of (R²ⁿ, Σₖ₌₁ⁿ dψ2k ∧ dψ1k, H), is proved by the Liouville theorem. Hence the proposition is concluded.

Proposition 4.4: If ψ1, ψ2 are involutive solutions of the Hamilton regular system, then (r, q)ᵀ with

r = −(1/2)⟨ΛΨ1, Ψ2⟩ − (1/2)⟨ΛΨ1, Ψ1⟩,  q = (1/4)⟨ΛΨ1, Ψ2⟩ + (1/4)⟨ΛΨ1, Ψ1⟩

is the solution of the evolution equation hierarchy (rₜₘ, qₜₘ)ᵀ = KGₘ, m = 0, 1, 2, ….


Figure 6. Soliton solution

Figure 7. Soliton solution

V. CONCLUSIONS

The results of this paper are algebraic in nature. We have considered the second-order matrix eigenvalue problem (1). In the first part of section 2, we gave the Lax representation and the evolution equation hierarchy related to the eigenvalue problem. In section 3, the Hamilton equation and its complete integrability under the Bargmann constraint are given. In section 4, the Hamilton equation and its complete integrability under the Neumann constraint are given.

ACKNOWLEDGMENT

The authors are grateful to Prof. Baocai Zhang and Prof. Zhuquan Gu for their valuable suggestions.

[5] C. W. Cao, A classical integrable system and involutive representation of solutions of the KdV equation, Acta Math. Sinica, 1991, pp. 436-440.
[6] D. Y. Chen, Soliton Theory, Science Press, 2006, pp. 194-199.
[7] Z. Q. Gu, The Neumann system for 3rd-order eigenvalue problems related to the Boussinesq equation, Il Nuovo Cimento, vol. 117B, no. 6, 2002, pp. 615-632.
[8] C. Y. Fu and T. C. Xia, Integrable couplings of the generalized AKNS hierarchy with an arbitrary function and its bi-Hamiltonian structure, International Journal of Theoretical Physics, vol. 46, 2007.
[9] D. Y. Chen, D. J. Zhang, and J. B. Bi, New double Wronskian solutions of the AKNS equation, Science in China Series A: Mathematics, vol. 51, 2008.
[10] F. C. Long, Nonlinear integrable systems and related topics, CNKI:CDMD:125.2009.
[11] C. G. Shi and F. S. Wu, Research on a class of solitons to KdV equations in Mathematica, Journal of Shanghai University of Electric Power, vol. 23, 2007.
[12] Q. L. S. R. D. R. J. Zha, Exact periodic wave solutions for the (2+1)-dimensional breaking soliton system, Journal of Inner Mongolia Normal University (Natural Science Edition), February 2006.
[13] L. Jiang, Exact solution of nonlinear partial differential equations and solitary wave interaction research, Beijing University of Posts and Telecommunications, 2007.
[14] H. M. Fu and Z. D. Dai, Two-dimensional S-K equation periodic solitary wave solution, Journal of Xuzhou Normal University (Natural Science Edition), April 2009.
[15] Q. L. Zha, Integrable system multiple soliton solutions and symbolic computation research, Journal of East China Normal University, 2009.
[16] W. J. Liu and B. Sang, The first integral method and its applications to nonlinear evolution equations, Journal of Liaocheng University (Natural Science Edition), February 2011.
[17] X. P. Xin and L. L. Zhang, A structure of Burgers and KP equation of soliton and periodic solutions method, Journal of Liaocheng University (Natural Science Edition), April 2009.
[18] S. Q. Cai and S. J. Gan, Progress in the study of the internal soliton in the northern South China Sea, Advances in Earth Science, vol. 16, 2001.
[19] Y. He, Some exact solutions to nonlinear evolution equations, Northwest University, 2011.
[20] X. C. Zhu, Nonlinear evolution equation solution method study, Beijing University of Posts and Telecommunications, 2009.
[21] B. Lu, Nonlinear differential equation in the constructive method and symbolic computation, Dalian University of Technology, 2010.
Shujuan Yuan was born in Tangshan, Hebei, in September 1980. She graduated from Hebei Normal University and received an M.S. degree in 2007 from the School of Sciences, Hebei University of Technology, Tianjin, China. Her current research interests include integrable systems and computational geometry. She is a lecturer in the department of Qinggong College, Hebei United University.

Shuhong Wang, a native of Liaoning, was born in January 1980. In 2006, she received her master's degree in Applied Mathematics at Hebei

REFERENCES
[1] P. J. Olver, Applications of Lie Groups to Differential Equations, 2nd ed., Springer-Verlag, New York Berlin Heidelberg, 1999, pp. 13-53, 452-462.
[2] C. W. Cao, Confocal involutive system and a class of AKNS eigenvalue problem, Henan Science, vol. 5, 1987, pp. 1-10.
[3] Z. Q. Gu, Two finite-dimensional completely integrable Hamiltonian systems associated with the solutions of the MKDV hierarchy, J. Math. Phys., vol. 32.
[4] V. I. Arnold, Mathematical Methods of Classical Mechanics, 2nd ed., Springer-Verlag, New York Berlin Heidelberg, 1999.

University of Technology, Tianjin, China. She is mainly engaged in the domain of differential equations, inequalities and so on.
Wei Liu was born in Shijiazhuang, Hebei, in January 1981. He received an M.S. degree in 2007 from the School of Sciences, Hebei University of Technology, Tianjin, China. He is mainly engaged in the control and application of differential equations; his current research interests include integrable systems and computational geometry. He is a lecturer in the department of mathematics and physics, Shijiazhuang Tiedao University.

Xiaohong Liu was born in Tangshan, Hebei province, China, in 1976. She graduated from Beijing Normal University and received the Master of Science degree at the same school. Her major

field of study is mathematical statistics and the application of the wavelet transform. She was a secondary school teacher for seven years. After graduating from Beijing Normal University, she became a lecturer in the department of Qinggong College, Hebei United University.
Li Li was born in 1979 in Handan City. He graduated from Hebei Normal University in 2002. He enrolled in Henan Polytechnic University in 2008 and began to study basic mathematics; three years later, in 2011, he received the Master of Science degree at the same school. He had been a secondary school teacher for six years before enrolling in Henan Polytechnic University. Now, he is a Teaching Assistant in Qinggong College, Hebei United University.


Privacy-preserving Judgment of the Intersection for Convex Polygons


Yifei Yao
School of Computer & Communication Engineering, University of Science & Technology Beijing, Beijing, China Email: yaoyifei@mail.ustc.edu.cn

Shurong Ning
School of Computer & Communication Engineering, University of Science & Technology Beijing, Beijing, China Email: fancyning@163.com

Miaomiao Tian and Wei Yang


National High Performance Computing Center at Hefei, Hefei, China Email: {qubit, miaotian}@mail.ustc.edu.cn

Abstract: Intersection and union of convex polygons are basic issues of computational geometry that arise in many fields, such as economics and military affairs, and privacy-preserving judgment of the intersection and union of convex polygons is a popular issue in information security. The traditional method of making the polygons public does not satisfy the requirements of personal privacy. In this paper, a method to compute the intersection and union of convex polygons in the secure two-party computation (STC) model is considered; both proportionate-partition and unproportionate-partition cases are studied. A scan-line algorithm is used to solve the geometric problem, while a secret comparison protocol is used to preserve privacy. A series of protocols for this problem is proposed, combining computational geometry and secure multi-party computation (SMC) techniques to achieve cooperative calculation without leaking much privacy. At last, the security, complexity and applicability of the protocols are also discussed.

Index Terms: STC, secret comparison protocol, privacy-preserving geometric computation, polygonal intersection, polygonal union

I. INTRODUCTION

As the importance of privacy becomes more and more prominent, the secure computation of basic algorithms in each field has become a popular question. Privacy-preserving techniques provide methods to find important messages correctly in shared data collections; they are attractive because they can seek more benefit for the participants [1]. Meanwhile, secure multi-party computation makes cooperative calculation private and prevents participants' data from leaking [2]. Polygonal intersection and union are basic operations of computational geometry and computer graphics, and they are of significance both in theory and practice [3]. Many issues need polygonal intersection, such as hidden-line removal, pattern
2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2224-2231

recognition, component positioning, linear programming and so on. Meanwhile, polygonal union can help one decide the architectural plane area of an ichnography. Methods to compute two objects' intersection or union privately and effectively will settle these problems. In former applications, people always collected the polygons' information together and solved the problem through a trusted third party (TTP). But the demand for privacy makes it hard to find such an agency trusted by both partners: each party wants the correct result while avoiding leaking his information to the other. In this paper, we study how to calculate polygonal intersection and union in the STC model. This solution helps in economy and military affairs. For example, a new company that hopes to build a shopping mall must review whether another company is working in the same area, and both of them want to know whether their orbits meet without leaking their own border information. Meanwhile, military affairs also often involve the intersection question. Fortunately, references [4] and [5] indicate that the SMC technique can help to achieve this goal. In this paper, we devise protocols to compute the intersection and union of convex polygons approximately, and then analyze their security, complexity and applicability. The paper is organized as follows. In section 2 we describe preliminaries. We introduce the basic comparison protocol in detail in section 3 and present the STC protocols in section 4. Then in section 5 we discuss the protocols' complexity and security. At last we conclude the paper in section 6.

II. PRELIMINARIES

A. Secure Multi-party Computation

In a multi-agent network, SMC helps two or more parties complete a synergic calculation without leaking private information. Generally speaking, SMC is a


distributed cooperation. In this work, each party holds a secret as input, and they want to implement the cooperative computation while knowing nothing about the others' data except the final result. Secure two-party computation (STC) was first investigated by A. C. Yao in reference [6]. Then, a general solution for SMC was proposed [4, 5]. From then on, the technology of SMC has come into more and more domains, many scholars have entered this field, and lots of articles on special uses of SMC have come into being, such as data mining [7], statistical analysis [8], scientific computation [9, 10], electronic commerce, private information retrieval (PIR), privacy-preserving computational geometry (PPCG) [11, 12], quantum oblivious transfer and so on. Secure multi-party computation for set union and join makes SMC useful in data mining [13, 14]. PIR borrows the SMC conception to retrieve answers without leaking other information. Privacy-preserving location determination of two geometric graphics imports SMC into military affairs [15, 16]. With the rapid development of the economy, scientific computation and statistical analysis will use the SMC technique as one of their basic security tools. Reference [5], by W. Du, introduced several applications of SMC, and it first brought forward the correlation and regression analysis problem of privacy-preserving statistical analysis; he gives a solution for the two-party instance with his matrix product protocol. Then L. Yehuda studied STC problems under a malicious condition [17]. Computational geometry is a subject studying plane and solid issues, which is important for settling matters in abstract problems. Geometric measurements, which include inner products, convex hulls, location judgment, and so on, are basic in production and daily life. To achieve the purpose of preserving personal privacy in computational geometry, D. Li designed an approximate convex hulls protocol [18] and Q. Wang proposed a convex hull algorithm for planar point sets in [19]. Reference [20] tells us how to determine the meeting points of two intersected circles. Meanwhile, reference [21] gives a method to share a unified location with end-user privacy control, and reference [22] gives a security analysis of the Louis protocol for location privacy. All this research balances privacy and efficiency: a method without any leaking does not really exist, but more efficient ways with limited disclosure of information are more useful and practical. Previous methods rely on a third party who is trusted by all parties. A TTP can get enough information to complete the calculation and broadcast the result. But the hypothesis itself is insecure and impractical. Therefore, an executable protocol which can preserve the participants' privacy becomes more and more desirable. It is known that any secure computation problem can be solved by a circuit protocol, but the size of the corresponding circuit is always too large to realize. So investigators choose to design special protocols for special uses instead of praying for a third party's secrecy.

B. Secret Comparison Protocol

In 1982, A. C. Yao brought forward the famous millionaires' problem: two millionaires, say Alice and Bob, want to know which is richer without revealing their respective wealth. To begin with, Alice and Bob need a strong public-key cryptographic system, for example RSA, and assume that both of their fortunes lie in a certain integral range. Alice is worth I millions and Bob J millions; they execute the millionaires' protocol through the public-key system and modular computation [6]. If the result is that Bob receives a value matching his J-th one, then I ≥ J; otherwise I < J. At last, Bob sends Alice the result. In order to avoid cheating, each party initiates the protocol once in turn. After that, another method for two-party comparison was brought forward. There are three parties taking part in that protocol: A, B, and an oblivious third party C who helps A and B check whether their private values a and b are equivalent or not. The method validates its security by computational indistinguishability through homomorphic encryption and a hiding assumption. It returns which value is greater than or equal to the other, while Yao's method cannot return the equality message. This protocol is computationally complex and safe against decoding, and it reduces the communication of the random data perturbation techniques in Yao's method. Recently, reference [23] put forward an efficient protocol for the private comparison problem, and this issue will likely remain a popular problem for several years [24].
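To make the description above concrete, here is a minimal, runnable sketch of Yao's 1982 millionaires' protocol. The toy RSA modulus, the wealth range, and all function and variable names are our own illustrative assumptions, not values from the paper; a real deployment would need cryptographically large keys.

```python
import random

# Toy RSA keypair for Alice (demo sizes only)
P_RSA, Q_RSA = 1000003, 999983
N = P_RSA * Q_RSA
E = 65537                                   # Alice's public exponent
D = pow(E, -1, (P_RSA - 1) * (Q_RSA - 1))   # Alice's private exponent

def _is_prime(m):
    return m > 1 and all(m % f for f in range(2, int(m ** 0.5) + 1))

def alice_richer_or_equal(i, j, max_wealth=10):
    """Alice's wealth i, Bob's wealth j (both in 1..max_wealth);
    returns True iff i >= j, following Yao's construction."""
    # Bob: pick random x, encrypt under Alice's key, send k - j + 1
    x = random.randrange(2, N)
    k = pow(x, E, N)
    received = k - j + 1
    # Alice: decrypt the candidate values y_u = D(k - j + u), u = 1..max_wealth
    ys = [pow((received + u - 1) % N, D, N) for u in range(1, max_wealth + 1)]
    # Alice: pick a prime p whose residues z_u are pairwise well separated
    while True:
        p = random.randrange(1 << 15, 1 << 16)
        if not _is_prime(p):
            continue
        zs = [y % p for y in ys]
        if all(abs(a - b) >= 2 for n, a in enumerate(zs) for b in zs[n + 1:]):
            break
    # Alice: leave z_u alone for u <= i, add 1 for u > i, send the list and p
    sent = [z if u <= i else (z + 1) % p for u, z in enumerate(zs, start=1)]
    # Bob: his j-th value matches x mod p exactly when i >= j
    return sent[j - 1] == x % p
```

Since y_j decrypts back to x, position j of the sent list equals x mod p precisely when Alice left it untouched, i.e. when j ≤ i; the separation check on p prevents accidental collisions at other positions.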

Figure 1. Areas decided by the peaks of convex polygons.

C. Models

Computational model: Generally speaking, there exist potential malicious attacks against any multi-party protocol [17]. In this paper, we study the problem under a semi-honest model, in which each semi-honest party follows the protocol, with the exception that he keeps a record of all his intermediate computations, and he will never try to interrupt the protocol or disturb it with dummy data [5]. The model is practical and useful, because everybody in the cooperation expects the right result rather than the others' private information.

Security model: We denote by IA and IB the input instances of Alice and Bob, and by OA and OB the corresponding outputs. C represents the computation


executed by the two partners; then (1) holds. A protocol for executing C is secure when it satisfies the two conditions below.

(OA, OB) = C(IA, IB).   (1)

1. There is an infinite set (2) such that (3) and (4) hold:

DA = {(IAi, OAi) | i = 1, 2, …},   (2)
(OA′, OB) = C(IA′, IB),   (3)
(IA′, OA′) ∈ DA.   (4)

2. There is an infinite set (5) such that (6) and (7) hold:

DB = {(IBi, OBi) | i = 1, 2, …},   (5)
(OA, OB′) = C(IA, IB′),   (6)
(IB′, OB′) ∈ DB.   (7)

At last, we pick the extreme points up and move out the void peaks at the margin (Fig. 1). Theorem 1: The intersection of convex polygons L and M will be found in (8):

(L + M).   (8)

The correctness of Theorem 1 can be found in reference [1]. An example is shown in Fig. 2.

Irreciprocal Protocol in Proportionate Partition: In order to avoid the irregular workload of unproportionate partition, we can use proportionate partition instead. It is like the one above, but equidistant and adjustable on demand. An example is shown in Fig. 3.

Basic Generating Set Protocol: The method in the irreciprocal protocol in unproportionate partition forms a proper subset of the approximate convex hull, as in Fig. 2. We modify it to get its generating convex hull. To achieve this goal, reference [1] uses the two outermost points in the same ordinate slip instead of the maximum point P. Obviously, this generates a convex hull including the proper subset. Similar to the error analysis, any point in our approximate convex hull but not in the proper subset lies within the distance (9) of the proper convex hull:

(xmax − xmin)/k.   (9)
Apparently, the more closely IA′ is associated with IA, the more information Bob will get about Alice, and vice versa. In the execution, it is inevitable to leak some message, but our protocol is robust to some extent: the adversary can obtain something through security analysis, but it is not enough for him to get the certain value. Although the protocol is not zero-knowledge, it is a desirable way to achieve high efficiency.

We modify it to form a protocol computing the generating set of the intersection and union of polygons as below. Firstly, we divide the area into equidistant intervals. Secondly, the maximum and minimum ordinates of the two polygons in each strip are compared. At last, the outermost point with the same ordinate is output instead of the primary maximum point P. An example is shown in Fig. 3.

Figure 2. Example of unproportional instance - subset.

D. Related Algorithms in Computational Geometry

Irreciprocal Protocol in Unproportionate Partition: Firstly, we find the maximal and minimal x-coordinate values, noted as a and b, then divide the region between a and b into k equidistant perpendicular bands. The k bands form a memory serial, and we distribute the n points of set S into the memory serial. At last, we pick the maximal and minimal y-coordinate values of each band and save them as set S*. S* has 2k + 4 points at most, and we construct its convex hull to form the approximate outline. Each sector and a convex polygon intersect into a quadrangle; that is to say, P and Q intersect into a quadrangle in each sector. We will find it in O(1). Then, a scanning pass in linear time can assemble these fragments.

Figure 3. Example of proportional instance - generating set.

III. BUILDING BLOCKS

In this section, we introduce the secure building blocks. The basic comparison protocol performs a comparison on one scanning bean; it is a basic tool for the latter protocols. We assume that Phigh and Plow belong to Alice, while Qhigh and Qlow belong to Bob. They are on the same scanning bean and are ranked by their y-coordinate. There will be four instances appearing on each bean, as follows.
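As a concrete illustration of the band-partition approximation described in Section II-D (divide the x-range into k equidistant bands and keep each band's extreme y-values as S*), here is a short sketch; the function and variable names are ours, not the paper's, and it assumes the points span at least two distinct x values.

```python
def band_extremes(points, k):
    """Split the x-range of `points` into k equal-width vertical bands
    and keep each band's highest and lowest point (by y), forming the
    approximate-outline set S* of the irreciprocal partition method."""
    a = min(x for x, _ in points)
    b = max(x for x, _ in points)
    width = (b - a) / k
    bands = [[] for _ in range(k)]
    for x, y in points:
        idx = min(int((x - a) / width), k - 1)  # clamp x == b into last band
        bands[idx].append((x, y))
    s_star = []
    for band in bands:
        if band:  # empty bands contribute nothing
            s_star.append(max(band, key=lambda p: p[1]))  # top extreme
            s_star.append(min(band, key=lambda p: p[1]))  # bottom extreme
    return s_star
```

Each point is binned in constant time, so the whole pass is linear in the number of points, matching the linear-time scanning claim above.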

Result 1: Phigh > Qhigh and Plow < Qlow : Then Qhigh and Qlow belong to polygonal intersection, Phigh and Plow belong to polygonal union (Fig.4).

Result 3: Phigh > Qhigh and Plow > Qlow : Result 3.1: if Qhigh > Plow then Qhigh and Plow belong to the polygonal intersection, Phigh and Qlow belong to the polygonal union (Fig.7).

Figure 4. Basic comparison protocol: Result 1.

Figure 7. Basic comparison protocol: Result 3.1.

Result 2: Phigh < Qhigh and Plow < Qlow :

Result 2.1: if Phigh > Qlow then Phigh and Qlow belong to polygonal intersection, Qhigh and Plow belong to polygonal union (Fig.5).

Result 3.2: if Qhigh < Plow then no one on this scanning bean belongs to the intersection, the four points all belong to the union (Fig.8).

Figure 5. Basic comparison protocol: Result 2.1.

Figure 8. Basic comparison protocol: Result 3.2.

Result 2.2: if Phigh < Qlow then no one on this scanning bean belongs to the intersection, the four points all belong to the union (Fig.6).

Result 4: Phigh < Qhigh and Plow > Qlow : Then Phigh and Plow belong to polygonal intersection, Qhigh and Qlow belong to polygonal union (Fig.9). We summarize the basic comparison protocol as below: Protocol 1: Basic Comparison Protocol Input: Alice has Phigh and Plow , while Bob has Qhigh and Qlow at each bargained scanning bean. Output: Both Alice and Bob know which of his point is on the polygonal borderline with no information leaking to the other. Alice cooperates with Bob to compare ( Phigh , Qhigh ) and ( Plow , Qlow ) using the secret comparison protocol in 2.2. Case 1: if Phigh > Qhigh and Plow < Qlow then we get result 1 and terminate. Case 2: if Phigh < Qhigh and Plow < Qlow then we continue to compare Phigh and Qlow : if Phigh > Qlow then we get result 2.1 , else we get result 2.2, and terminate.

Figure 6. Basic comparison protocol: Result 2.2.


Case 3: if Phigh > Qhigh and Plow > Qlow then we continue to compare Qhigh and Plow : if Qhigh > Plow then we get result 3.1 , else we get result 3.2, and terminate. Case 4: if Phigh < Qhigh and Plow > Qlow then we get result 4 and terminate.
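The case analysis of Protocol 1 can be summarized in code. The sketch below assumes an abstract `gt(a, b)` oracle standing in for the secret comparison subprotocol; the labels 'Ph', 'Pl', 'Qh', 'Ql' are our shorthand for the four endpoints, not notation from the paper.

```python
def classify_bean(p_high, p_low, q_high, q_low, gt):
    """On one scanning bean, decide which endpoints lie on the
    intersection outline and which on the union outline, following
    Protocol 1's cases; returns (intersection_labels, union_labels)."""
    hi = gt(p_high, q_high)          # first secret comparison
    lo = gt(p_low, q_low)            # second secret comparison
    if hi and not lo:                # Result 1: Q nested inside P
        return ('Qh', 'Ql'), ('Ph', 'Pl')
    if not hi and lo:                # Result 4: P nested inside Q
        return ('Ph', 'Pl'), ('Qh', 'Ql')
    if not hi:                       # Case 2: P sits lower than Q
        if gt(p_high, q_low):        # Result 2.1: partial overlap
            return ('Ph', 'Ql'), ('Qh', 'Pl')
        return (), ('Ph', 'Pl', 'Qh', 'Ql')   # Result 2.2: disjoint
    if gt(q_high, p_low):            # Case 3 / Result 3.1: partial overlap
        return ('Qh', 'Pl'), ('Ph', 'Ql')
    return (), ('Ph', 'Pl', 'Qh', 'Ql')       # Result 3.2: disjoint
```

The two initial comparisons always run, and cases 2 and 3 add at most one more, so the number of secret comparison invocations per bean is constant.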


Figure 9. Basic comparison protocol: Result 4.

Theorem 2: Protocol 1 completes the comparison on one scanning bean without compromising privacy.
Proof: Correctness follows from the figures above (Fig. 4 - Fig. 9). For security, we study what is leaked through the process. On each scanning bean, both parties learn which points belong to the convex borderline, but the values remain secret to the other side. Because only a constant number of secret comparisons are made, neither party can nose out the other's information by repetitious execution. And what each party learns can be deduced from its own view of the execution without any other assistance. This satisfies the demands of the security model.
Theorem 3: Protocol 1 has complexity O(1) times that of the secret comparison protocol.
Proof: On each scanning bean, the parties invoke the secret comparison protocol at most three times (the two initial comparisons and at most one more). So they can finish the process in O(1) invocations of the secret comparison protocol.

IV. PROTOCOL TO COMPUTE INTERSECTION AND UNION OF CONVEX POLYGONS APPROXIMATELY IN STC

A. Secure Protocol for Approximate Intersection of Convex Polygons in STC

In this section, we discuss the secure protocols for the approximate intersection of two polygons in unproportionate and proportionate partition.
Protocol in unproportionate partition: a protocol for unproportionate partition is proposed below.
Protocol 2: Secure Two-Party Protocol for Approximate Intersection of Two Polygons in Unproportionate Partition.
Input: Alice's and Bob's private convex polygons
Output: the approximate intersection of the two polygons

Step 1: Alice and Bob announce to each other the x-coordinates of each peak, or selected x-coordinates, to form the unproportionate-partition scanning beans.
Step 2: On each scanning bean, Alice has Phigh and Plow, and Bob has Qhigh and Qlow. They invoke the Basic Comparison Protocol to learn which points are on their approximate intersection without leaking any other information.
Step 3: They repeat step 2 until all the scanning beans are finished.
The benefit of this protocol is that it avoids computing the actual coordinates of the points of intersection. A point of intersection would leak apex information more or less, and it is dangerous to leak the outline of the polygons when scanning frequently. This protocol reduces the leaked information by avoiding calculating the apexes. Alice registers her point if it is on the outline; otherwise, she only knows that the point on this scanning bean belongs to Bob, without getting any other message. If there is only one apex, we treat it as the maximum and minimum value simultaneously.
Protocol in proportionate partition: The protocol under proportionate partition is similar to that of unproportionate partition; the only difference is in step 1, which we modify as below.
Step 1: Alice and Bob securely compare their own greatest and least x-coordinates to announce the maximum and minimum values, and negotiate about the number of regions. Then, they carve up n + 1 scanning beans proportionately between the two values. Thus it can be seen that the complexity of this protocol is proportional to the partition number n. Although reducing the partition number preserves the parties' privacy better, it also reduces precision.
Protocol of generating set: The protocol of the generating set is similar to that of the subset; the difference is in step 2. We obtain it by changing the Basic Comparison Protocol into the Basic Generating Set Protocol.
B. Secure Protocol for Approximate Union of Polygons

In this section, we discuss the secure protocols for the approximate union of two polygons in unproportionate and proportionate partition.
Protocol in unproportionate partition: a protocol for unproportionate partition is proposed below.
Protocol 3: Secure Two-Party Protocol for Approximate Union of Two Polygons in Unproportionate Partition
Input: Alice's and Bob's private convex polygons
Output: the approximate union of the two polygons
Step 1: Alice and Bob announce the x-coordinates of each peak, or selected x-coordinates, to form the unproportionate-partition scanning beans.
Step 2: On each scanning bean, Alice has Phigh and Plow, and Bob has Qhigh and Qlow. They invoke the Basic Comparison Protocol to learn which points are on their approximate union without leaking any other information.

JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012


Alice or Bob only learns whether her/his point is the maximum or minimum value, and nothing else. Step 3: They carry out Step 2 repeatedly until all the scanning beams are finished. Protocol in proportionate partition: the protocol under proportionate partition is similar to that under unproportionate partition, with Step 1 modified as in the intersection case. Protocol of generating set: the protocol for the generating set is similar to that for the subset, with the Basic Comparison Protocol replaced by the Basic Generating Set Protocol in Step 2. V. ANALYSIS In this section, we analyze the complexity and security of the protocols.
A. Complexity Analysis Conclusion 4: The secure two-party protocol that computes the intersection or union of convex polygons in unproportionate partition has time and communication complexity O(m + n) times that of the Basic Comparison Protocol. The corresponding protocols in proportionate partition have time and communication complexity O(l) times that of the Basic Comparison Protocol, where l is the number of regions the two parties bargained on. The protocols for the generating set behave in the same way. For the Basic Comparison Protocol, we established its security in Section III, and it can be finished in O(1) invocations of the secret comparison protocol. Because the protocols are comparable, we take Protocol 2 as an example. In Step 1, Alice and Bob decide the partition of the scanning area; they use O(1) message exchanges to announce their x-coordinates. In Step 2, each scanning beam costs them O(1) instances of the secret comparison problem. In Step 3, they need O(m + n) invocations of the Basic Comparison Protocol to scan all the beams. Meanwhile, the protocol in proportionate partition has complexity O(l) comparisons, where l is the number of regions the two parties negotiated, since they compare once on each scanning beam. B. Security Analysis Conclusion 5: Protocol 1 (Protocol 2, Protocol 3) can be executed securely without leaking privacy. Now we analyze the information leaked in each case of the Basic Comparison Protocol. On each scanning beam, Alice gets to know whether her Phigh and Plow are on the outline. In Case 1, Alice sees that Bob's Qhigh and Qlow are between Phigh and Plow; Bob learns that his Qhigh and Qlow are not on the outline, and thereby that Phigh > Qhigh and Plow < Qlow. In Case 2, if 2.1 happens, they see that Phigh is seated between Qhigh and Qlow, and that Qlow is greater than Phigh. The rest may be
2012 ACADEMY PUBLISHER

deduced by analogy. So Alice or Bob only knows the relative position of her/his point and the other's, but not its value. Considering this problem, some information is predetermined to leak. If either party knows his point is on the outline, he immediately sees that the other's corresponding point is not on the outline. The acceptance or rejection indeed discloses some greater-or-smaller information on the same scanning beam, but this is inescapable. Our method cannot protect this kind of information; it only prevents leaking any needless information. Because the four points are independent and there is no rule relating them, neither party can infer the other's information through the secure intersection or union protocol. This preserves the parties' privacy.
C. Applicability Analysis

The protocol of this paper is an approximate algorithm for calculating an outline. Although the approximation avoids leaking vertex information when computing the intersection and union, it comes at the cost of precision. Especially under proportional partition, the result is distorted if a figure changes suddenly in some area, as in Fig. 10. Therefore, the scheme in this paper is not suitable for work requiring high precision.

Figure 10. Mutant Example.

Meanwhile, each point on the outline belongs to one party and is not shared by both, so the outline cannot be used for computing the area of the intersection or union. VI. SUMMARY Privacy-preserving computational geometry is important for secure multi-party computation; it offers basic tools for convenient calculation and is useful in scientific research and engineering technology. The intersection and union of convex polygons are basic issues in computational geometry, and the demand for privacy preservation calls for secure protocols in special fields. We have proposed protocols to compute the approximate intersection and union of convex polygons in the STC model.

Detailed analyses of security and complexity are also presented. We combine computational geometry and SMC techniques to solve the problem. With the help of secret comparison, the protocols use the Basic Comparison Protocol as a sub-protocol and gain privacy and efficiency at an appropriate price in precision. Along with the development of SMC, our future work will address the problem in more complex settings, such as the multi-party model and malicious behavior. ACKNOWLEDGMENT The authors wish to thank Professor L. Huang for providing excellent notes of the discussion. We would like to thank the participants in the National High Performance Computing Center for their helpful comments. We are also very grateful to Professor Y. Luo of the Department of Computer Science and Technology at Anhui Normal University for useful comments and for suggesting some corrections. This work is jointly supported by NSFC under Grant No. 60903067 and the Funding Project for Beijing Excellent Talents Training under Grant No. 2011D009006000004. REFERENCES
[1] V. A. Oleshchuk and V. Zadorozhny, "Secure Multi-party Computations and Privacy Preservation: Results and Open Problems," Telektronikk, Norway, pp. 20-26, February 2007.
[2] O. Goldreich, "Secure Multi-party Computation (working draft)," available from www.wisdom.weizmann.ac.il/home/oded/public html/foc.html, 1998.
[3] M. Berg, O. Cheong, and M. Kreveld, Computational Geometry: Algorithms and Applications, 3rd ed., Springer-Verlag, Berlin, Heidelberg, 2008, pp. 45-62. doi: 10.1007/978-3-540-77974-2.
[4] S. Li and Y. Dai, "Secure Two-Party Computational Geometry," Journal of Computer Science and Technology, Beijing, China, vol. 20(2), pp. 259-263, 2005.
[5] W. Du and Z. Zhan, "A Practical Approach to Solve Secure Multi-party Computation Problems," New Security Paradigms Workshop, Virginia Beach, USA, pp. 127-135, September 2002.
[6] A. C. Yao, "Protocol for Secure Computations (extended abstract)," 21st Annual IEEE Symposium on the Foundations of Computer Science, IEEE Press, New York, USA, pp. 160-164, 1982.
[7] C. Clifton, M. Kantarcioglu, X. Lin, and M. Y. Zhu, "Tools for Privacy Preserving Distributed Data Mining," SIGKDD Explorations Newsletter, New York, USA, vol. 4, issue 2, December 2002.
[8] Y. Yao, L. Huang, W. Yang, Y. Luo, W. Jing, and W. Xu, "Privacy-preserving Technology and Its Applications in Statistics Measurements," Second International Conference on Scalable Information Systems, Suzhou, China, article 74, June 2007.
[9] W. Du and J. A. Mikhail, "Privacy-preserving Cooperative Scientific Computation," 14th IEEE Computer Security Foundations Workshop, Nova Scotia, Canada, pp. 273-282, 2001.
[10] Y. Yao, L. Huang, and Y. Luo, "Privacy-preserving matrix rank computation and its applications," Chinese Journal of Electronics, Beijing, China, vol. 17(3), pp. 481-486, 2008.
[11] R. Dowsley, J. Graaf, D. Marques, and C. A. Nascimento, "A Two-Party Protocol with Trusted Initializer for Computing the Inner Product," Information Security Applications, Brazil, vol. 6513, pp. 337-350, 2011. doi: 10.1007/978-3-642-17955-6_25.
[12] D. Eppstein, M. T. Goodrich, and R. Tamassia, "Privacy-preserving data-oblivious geometric algorithms for geographic data," GIS '10: Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, New York, USA, pp. 13-22, 2010. doi: 10.1145/1869790.1869796.
[13] B. Hawashin, F. Fotouhi, and T. M. Truta, "A privacy preserving efficient protocol for semantic similarity join using long string attributes," PAIS '11: Proceedings of the 4th International Workshop on Privacy and Anonymity in the Information Society, New York, USA, article 6, 2011. doi: 10.1145/1971690.1971696.
[14] J. Camenisch and G. M. Zaverucha, "Private Intersection of Certified Sets," Financial Cryptography and Data Security, Lecture Notes in Computer Science, Berlin, Heidelberg, vol. 5628, pp. 108-127, 2009. doi: 10.1007/978-3-642-03549-4_7.
[15] M. Hardt and K. Talwar, "On the geometry of differential privacy," STOC '10: Proceedings of the 42nd ACM Symposium on Theory of Computing, New York, USA, pp. 705-714, 2010.
[16] Y. Sun, H. Sun, H. Zhang, and Q. Wen, "A Secure Protocol for Point-Segment Position Problem," Web Information Systems and Mining, Lecture Notes in Computer Science, vol. 6318, pp. 212-219, October 2010. doi: 10.1007/978-3-642-16515-3_27.
[17] L. Yehuda and P. Benny, "An Efficient Protocol for Secure Two-Party Computation in the Presence of Malicious Adversaries," 26th Annual International Conference on Advances in Cryptology, Barcelona, Spain, pp. 52-78, 2007.
[18] D. Li, L. Huang, W. Yang, Y. Zhu, Y. Luo, and L. Li, "A Practical Solution for Privacy-Preserving Approximate Convex Hulls Problem," WRI International Conference on Communications and Mobile Computing, Kunming, China, vol. 3, pp. 539-544, January 2009.
[19] Q. Wang and Y. Zhang, "A Convex Hull Algorithm for Planar Point Set Based on Privacy Protecting," First International Workshop on Education Technology and Computer Science, Wuhan, China, vol. 3, pp. 434-437, March 2009.
[20] Y. Ye, L. Huang, W. Yang, and Z. Zhou, "A Secure Protocol for Determining the Meeting Points of Two Intersected Circles," International Conference on Information Technology and Computer Science, Kiev, Ukraine, vol. 2, pp. 40-44, July 2009.
[21] J. Xie and S. Wang, "A unified location sharing service with end user privacy control," Bell Labs Technical Journal, vol. 16, issue 2, pp. 5-20, 2011. doi: 10.1002/bltj.20499.
[22] A. Gupta, M. Saini, and A. Mathuria, "Security analysis of the Louis protocol for location privacy," Communication Systems and Networks and Workshops, Bangalore, India, pp. 1-8, January 2009. doi: 10.1109/2009.4808858.
[23] Y. Luo, L. Huang, W. Yang, and W. Xu, "An Efficient Protocol for Private Comparison Problem," Chinese Journal of Electronics, Beijing, China, vol. 18(2), pp. 205-209, April 2009.
[24] J. Qin, Z. Zhang, D. Feng, and B. Li, "A Protocol of Comparing Information without Leaking," Journal of Software, Beijing, China, vol. 15(3), pp. 421-427, 2004.


Yifei Yao was born in Jilin Province, China, in 1981. She received the Ph.D. degree from the Department of Computer Science and Technology, University of Science and Technology of China in 2008. She is currently a lecturer of the School of Computer and Communication Engineering at University of Science and Technology of Beijing. Her major research interests are information security and distributed computing. (Email:yaoyifei@mail.ustc.edu.cn).

Wei Yang was born in Anhui Province, China, in 1978. He received the Ph.D. degree from the Department of Computer Science and Technology, University of Science and Technology of China in 2007. He is currently a lecturer of the Department of Computer Science and Technology at University of Science and Technology of China. His major research interests are information security and quantum information. (Email: qubit@ustc.edu.cn).

Shurong Ning was born in Shanxi Province, China, in 1976. She received the Ph.D. degree from Beijing Institute of Technology in 2006. She is currently a professor of the School of Computer and Communication Engineering at University of Science and Technology of Beijing. Her major research interests are artificial life, intelligent control and computer animation. (Email: fancyning@163.com).

Miaomiao Tian was born in Anhui Province, China, in 1987. He received the Master degree from the Department of Computer Science and Technology, University of Science and Technology of China in 2010. He is currently a Ph. D. candidate of the Department of Computer Science and Technology at University of Science and Technology of China. His major research interests are information security. (Email: miaotian@mail.ustc.edu.cn).


Medical Equipment Utility Analysis based on Queuing Theory


Xiaoqing Lu
College of Science, Hebei United University, Tang Shan, China luxq2005@126.com

Ruyu Tian, Shuming Guan


College of Science, Hebei United University, Tang Shan, China try2012@126.com, smguan@126.com

Abstract—To improve the efficiency of the hospital, reduce patients' waiting time, and meet patients' satisfaction, the optimization of the patient queuing system based on queuing theory has practical significance for the use of medical equipment. Based on the relevant data on the usage of B-ultrasound in a hospital, the distribution model of the queuing system is analyzed with the χ² goodness-of-fit test, so that the probability law of the queuing system can be described in a scientific and exact way, and finally the application of the hospital instruments can be arranged in the most efficient way. The desired model can be decided by two indices: one is the average time patients wait in the queuing system, and the other is the idle rate of the medical equipment; the director can then identify the best configuration of the model. This method, on the one hand, can optimize the hospital service, improve efficiency, and supply scientific evidence and reference for the director to arrange material and human resources reasonably; on the other hand, it can find a balance between patients and the hospital through system optimization. In this way, it can not only shorten waiting time but also save the hospital's human and material resources. The paper also provides a theoretical basis for hospital administrators to manage and optimize medical equipment. Index Terms—queuing theory; medical equipment; fit test; optimization

I. THE BACKGROUND AND THE APPROACHES Since most advanced technical equipment and talent are clustered in large-scale hospitals, queuing for treatment is frequent and normal. Long waiting times not only result in low efficiency and a waste of equipment and human resources, but also lead to patients' dissatisfaction and departure. Therefore, how to improve service and lower operating costs, together with other issues in the queuing system, is a primary and cardinal management issue for a large-scale hospital to solve. To solve the queuing problem, what must be understood first is queuing theory. Queuing theory is a discipline concerned with the optimal design and control of service systems [3]; it is an important branch of operations research [4]. Queuing theory will be applied to the
2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2232-2239

hospital service by analyzing the routine in which patients come, are treated, and then leave the hospital; based on this, a qualitative and quantitative evaluation can be obtained, and the evaluation of clinic procedure efficiency can reveal the merits and demerits of resource distribution. When arranging the utilization of medical equipment with queuing theory, two optimization principles ought to be observed [12]: Patients' satisfaction: to settle the queuing issue for clinics and pharmacy, the principle of patients' satisfaction should be put in first place. That is, the hospital should truly and fully mobilize and utilize all existing resources and aim to meet every patient's requirements in the given time. The benefit principle: the hospital is supposed to take measures of low cost to achieve the goal of patients' satisfaction. At present, most hospitals face the pressure of self-financing, and how to reduce the operating costs of major medical equipment is an important orientation of hospital management. How to utilize and optimize all existing resources and improve efficiency and patient satisfaction is an important application of queuing theory discussed in this paper. Through case analysis, this paper analyzes the application of the hospital's medical equipment and then finds the best way to arrange and utilize it. The specific studies are as follows: Statistical inference for the B-ultrasound queuing system: first collect and record the customers' arrival times, service times, and other data outside the B-ultrasound room of a hospital during a certain period in the morning. Then process the obtained data with mathematical statistics. Finally, infer the probability law of the queuing system and establish an M/M/C queuing model.
Determine the optimal number of help desks: for the established queuing model, the actual data should be calculated and analyzed with the


formulas. As to the design or operational management of the queuing system, the interests of both the patients and the service side ought to be calculated, with the aim of optimizing the queuing system while achieving reasonable indices.

II. THEORETICAL BASIS

A. Chi-Square Goodness-of-Fit Test
The specific procedure is as follows [18]. Divide the whole value range of X into k non-overlapping intervals [a_{i-1}, a_i), i = 1, 2, …, k. Under the null hypothesis H0, estimate the unknown parameters of the hypothesized distribution by maximum likelihood; let r denote the number of unknown parameters. Under H0, calculate the probability

    p_i = P(a_{i-1} ≤ X < a_i)    (1)

and then obtain the theoretical frequency n·p_i. Suppose that x1, x2, …, xn are the sample observations of X; count how many of x1, x2, …, xn fall in the range [a_{i-1}, a_i), that is, the actual frequency f_i, i = 1, 2, …, k. According to the formulas

    χ² = Σ_{i=1}^{k} (f_i − f̂_i)² / f̂_i    (2)
    f̂_i = n·p_i    (3)

calculate the value of χ². For the given significance level α, look up the χ² distribution table with k − r − 1 degrees of freedom to obtain the critical value χ²_α(k − r − 1). If χ² ≥ χ²_α(k − r − 1), the null hypothesis H0 is rejected; otherwise, the null hypothesis H0 is accepted.

B. The Structure of the Queuing System
Actual queuing systems are various; however, in terms of their main determining factors, every queuing system has three primary components: the input process, the queuing rule, and the service facility.
1. Input process. The input process describes the source of the clients and the law by which clients arrive at the queuing system. Several different situations are covered by the input process, and they are compatible with each other. The quantity of clients (that is, the client source): the composition of the clients varies, and the source may be limited or infinite. The way of arrival: clients may arrive continuously or discretely, one by one or in large batches; the intervals between successive clients can be deterministic or random. If all the parameters (such as expectation, variance, and so on) describing the distribution of the inter-arrival intervals are independent of time, the input process is called stationary; otherwise it is called non-stationary.
2. Queuing rule. The queuing rule refers to whether the service system permits clients to queue, whether clients are willing to queue, and, if waiting is permitted, in what sequence clients are served. The queuing rule is usually divided into the losing system, the waiting system, and the mixing system [18]. Losing system: when a client arrives, all the service desks are occupied and the service agent does not allow the client to wait, so the client has to leave to seek service elsewhere or gives up the demand. If clients leave at once, the system is called an instant or losing system. Waiting system: when a client arrives and all the service desks are busy, the client joins the queue and waits until served. In this system, several service orders can be adopted. First come, first served: the service system offers service according to the clients' arrival sequence; this is the most common rule. Last come, first served: the service system offers service in the order opposite to arrival; for example, in an information system the latest information is often the most important and has to be handled first. Priority service: among the waiting clients, some service objects must be treated first owing to their distinctiveness. Random service: a client is chosen at random for service, regardless of the arrival sequence. In the study of queuing systems, the length of the queue is not related to the service order, but the waiting and sojourn times are; consequently, different service orders directly affect the time spent in the queuing system.
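The first-come-first-served waiting system described above can be made concrete with the classical Lindley recursion: each client waits until the desk frees up. A minimal sketch (the function name is ours):

```python
def fcfs_waits(arrivals, services):
    """Waiting time of each client at a single desk under first come,
    first served (Lindley recursion)."""
    waits, free_at = [], 0.0
    for t, s in zip(arrivals, services):
        w = max(0.0, free_at - t)      # time until the desk is free
        waits.append(w)
        free_at = max(free_at, t) + s  # desk busy until service ends
    return waits

# Clients arriving at t = 0, 2, 4 with 3-minute services wait 0, 1, 2 min.
print(fcfs_waits([0, 2, 4], [3, 3, 3]))   # → [0.0, 1.0, 2.0]
```

The same recursion with the arrival order reversed would model a last-come-first-served discipline, which changes individual waits but not the queue length process.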


Mixing system: it combines the losing system and the waiting system; the service agent allows only a limited number of clients to wait, and extra clients have to leave. In addition, some clients are not willing to wait when they see a long queue but stay to wait when the queue is short; or a client's waiting time cannot exceed some value T, after which he leaves.
3. The service agent. The main aspects describing the service agent are as follows: the quantity of service desks and, if there are several, whether they are arranged in series or in parallel; what kind of probability distribution the service time of clients obeys; whether the service times of different clients are independent; and whether service is offered per person or per group, and so on.
C. The Classification and Symbolization of Queuing Models
The ordinary classification of queuing models, put forward by D. G. Kendall, is X/Y/Z/A/B/C, where X refers to the distribution of the clients' inter-arrival times, Y represents the service time distribution, Z refers to the number of desks, A stands for the capacity of the system, that is, the largest number of clients it can hold (including those being served and those waiting), B represents the size of the client source, and C refers to the service rule. In other words, the criteria of classification are the main characteristics of the queuing system: the distribution of the inter-arrival intervals, the service time, and the quantity of service desks.
D. The Optimal Design of the Queuing System
To optimize the design of a queuing system, on the one hand, considering the patients, their stay should be as short as possible; on the other hand, when devices are added, the idleness of the devices may increase and cause waste [14]. Therefore, adding equipment is conditional.
We can use the tools of economic analysis to minimize the cost:

    Total cost = service cost + waiting loss    (4)

When the total cost is smallest, the corresponding service level is optimal. The optimization problem can also be considered from the service side. For example, in a loss-type service system, the service organization receives a certain income each time a customer is served. The higher the service rate, the smaller the loss of patients and hence the greater the income; on the other hand, the higher the service rate, the greater the expense. One then finds the optimal service rate that maximizes the server's net income. Suppose the cost per unit time of each piece of equipment is h, and the loss per unit time of each patient staying in the system is b. Then the cost function f is the service cost per unit time plus the waiting loss [13]:

    f(C) = h·C + b·L(C)    (5)

where L(C) is the number of patients staying in the system when C sets of equipment are installed. Adopting a marginal analysis, the necessary conditions for the cost function to attain its minimum at C* are

    f(C*) ≤ f(C* − 1)    (6)
    f(C*) ≤ f(C* + 1)    (7)

Substituting (5) and simplifying gives

    L(C*) − L(C* + 1) ≤ h/b ≤ L(C* − 1) − L(C*)    (8)

By computer simulation, we can calculate the successive differences of L(1), L(2), L(3), …, observe between which two differences the constant h/b is located, and then determine the optimal solution C*, that is, the optimal quantity of equipment, which minimizes the total of the patients' waiting loss and the hospital's service cost [6][10].

III. THE SITUATION ANALYSIS OF THE APPLICATION OF MEDICAL EQUIPMENT

A. Cases and Related Data
As for data collection, the quantity of medical devices, the costs of medical equipment, the assignment of the operators, and so on are direct and available, and do not need random sampling. To obtain the average arrival rate of patients in different periods, the average service rate, and the probability distributions of patients' waiting time and service time, the following sample data must be collected: the quantity of patients, the waiting time, and the duration of examination. We now collect, arrange, and analyze the data on the arrival times of patients and the times at which they receive service, and calculate the quantity of patients arriving at the hospital per unit time and the distribution of service time (read Table I and Table II). The unit time in the tables is one hour; the service time is the time used to complete a patient's examination, counted in minutes.

TABLE I. THE NUMBER OF B-ULTRASOUND PATIENTS WHICH REACHES IN UNIT TIME (PER HOUR)

    Arrivals:            0   1   2   3   4   5   6   7   8   9  10   Total
    Observed frequency:  2   4  18  23  22  15  12   9   4   3   1     113
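The marginal analysis (5)-(8) can be sketched for an M/M/C queue with L(C) = Lq + λ/μ. The arrival and service rates below match the case study later in the paper; the cost figures h and b are hypothetical illustrations, and both function names are ours:

```python
import math

def mmc_L(lam, mu, c):
    """Average number of patients in an M/M/c system: L(c) = Lq + lam/mu."""
    a = lam / mu                       # offered load
    rho = a / c                        # service intensity; needs rho < 1
    s = sum(a**k / math.factorial(k) for k in range(c))
    p0 = 1.0 / (s + a**c / (math.factorial(c) * (1 - rho)))
    lq = a**c * rho / (math.factorial(c) * (1 - rho)**2) * p0
    return lq + a

def optimal_servers(lam, mu, h, b, c_max=50):
    """Smallest c with L(c) - L(c+1) <= h/b, i.e. condition (8)."""
    c = math.floor(lam / mu) + 1       # smallest stable number of servers
    while c < c_max and mmc_L(lam, mu, c) - mmc_L(lam, mu, c + 1) > h / b:
        c += 1
    return c

# lam = 0.07 and mu = 0.0596 patients/min come from the case study below;
# h = 1 cost unit per machine and b = 5 per waiting patient are made up.
print(optimal_servers(0.07, 0.0596, 1.0, 5.0))   # → 3
```

With these assumed costs the bracketing condition (8) is first satisfied at C* = 3, which agrees with the paper's later conclusion that three machines are optimal.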


TABLE II. THE SERVICE TIME (MINUTES) DISTRIBUTION OF B-ULTRASOUND PATIENTS

    Service time (min):   0-15   15-30   30-45   45-60   Total
    Observed frequency:     58      27      15       2     102

B. The Basic Characteristics of the Queuing System [1, 7]
1. Input process: the patients' arrivals and B-ultrasound examination times are random, so the ultrasound examination system is a typical stochastic service system. The system has the following characteristics: the overall number of patients is unlimited; all parameters (such as expectation, variance, etc.) of the distribution of patients' inter-arrival times are independent of time, so arrivals form a stationary input process; and patients arrive one by one and independently, without affecting each other.
2. Queuing rules: at the help desks, patients line up in a single queue with no restriction on queue length, and the rule of first come, first served applies on arrival. If a desk is available, the patient is examined; if all desks are occupied, the patient waits in line.
3. Help desks: suppose N desks are arranged in parallel and each desk works independently; the average service rate is the same for all desks, and each desk can serve only one client at a time.
4. Service times: the service time each patient receives is random, its law can be described by a probability distribution, and the service times of different patients are independent.
In summary, the service system is a multi-server M/M/C/∞/∞ queuing system.

Figure 1. Frequency histogram of B-ultrasound arrival time.

Figure 2. Frequency distribution histogram of service time.

Figure 3. B-ultrasound queuing model (input → single queue → C parallel service devices → leave).

As shown in Figure 3 [8, 16], the queuing system is a single queue with C sets of serving equipment in parallel. Each service device is staffed by a medical doctor, and a single desk on its own can also form a queuing system.

C. Fitting Test
In order to understand and master the law of an operating queuing system, we need multiple observations and data. The collected data are then processed with mathematical statistics to identify the type of model of the queuing system, after which an appropriate theory can be used to study and solve the queuing problem. The statistical inference of the queuing system in this case applies the χ² goodness-of-fit test to determine which distribution model the given queuing system follows [2].
The number of patients arriving per unit time is subjected to a fitting test of the Poisson distribution (read Table III) [5, 15]. The average arrival rate of patients per unit time is

    λ = Σ i·f_i / 113 = 4.2 persons/hour    (9)

The data in Table III are obtained by the following formulas. Probability:

    P_i = λ^i · e^(−λ) / i!    (10)

Theoretical frequency:

    f̂_i = 113·P_i    (11)

    χ² = Σ_{i=0}^{10} (f_i − f̂_i)² / f̂_i    (12)

Therefore, the quantity of patients arriving for B-ultrasound examination in unit time is subject to the Poisson distribution with parameter 4.2.
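Formulas (9)-(12) can be reproduced directly from the Table I counts; a minimal sketch (small differences from the printed table come only from rounding λ to 4.2):

```python
import math

observed = [2, 4, 18, 23, 22, 15, 12, 9, 4, 3, 1]   # Table I, i = 0..10
n = sum(observed)                                    # 113 patients
lam = round(sum(i * f for i, f in enumerate(observed)) / n, 1)  # Eq. (9): 4.2/hour

# Eq. (10)-(12): Poisson probabilities, theoretical frequencies, chi-square
probs = [lam**i * math.exp(-lam) / math.factorial(i) for i in range(len(observed))]
expected = [n * p for p in probs]
chi2 = sum((f - e)**2 / e for f, e in zip(observed, expected))

# chi2 ≈ 3.85 < 16.919, the critical value at alpha = 0.05 with
# df = 11 - 1 - 1 = 9 (one estimated parameter), so the fit is accepted.
print(lam, round(chi2, 2))
```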


TABLE III. THE POISSON DISTRIBUTION χ² FITTING TEST FOR THE NUMBER OF PATIENTS PER UNIT TIME

    i (patients)        0        1        2        3        4        5        6        7       8       9      10   Total
    Observed f_i        2        4       18       23       22       15       12        9       4       3       1     113
    Probability P_i  0.0150   0.0630   0.1323   0.1852   0.1944   0.1633   0.1143   0.0686  0.0360  0.0168  0.0071
    Theoretical f̂_i  1.6945   7.1169  14.9455  20.9237  21.9699  18.4547  12.9183   7.7510  4.0693  1.8990  0.7976
    (f_i−f̂_i)²/f̂_i   0.0551   1.3651   0.6243   0.2060   0.0001   0.6467   0.0653   0.2013  0.0012  0.6384  0.0514   χ² = 3.8549

Take α = 0.05 and degrees of freedom = 11 − 2 = 9. Looking up the table gives χ²_{0.05,9} = 16.919; since χ² = 3.8549 < χ²_{0.05,9}, P > 0.05, and the Poisson hypothesis is accepted.

Figure 4. A Poisson distribution of patients arriving in B-ultrasound examination.

The service time is subjected to a fitting test of the negative exponential distribution. The average service time is

    1/μ = Σ X_i·f_i / 102    (13)

where X_i is the midpoint value of each group. The solution is μ = 0.0596 persons/minute. The data in Table IV are obtained by the following formulas. Probability:

    P_i = P(x_i < ξ ≤ x_{i+1}) = e^(−μ·x_i) − e^(−μ·x_{i+1})    (14)

Theoretical frequency:

    f̂_i = 102·P_i    (15)

    χ² = Σ_{i=1}^{4} (f_i − f̂_i)² / f̂_i    (16)

TABLE IV. FITTING TEST FOR THE NEGATIVE EXPONENTIAL DISTRIBUTION OF SERVICE TIME

    Service time (min)    0-15     15-30     30-45     45-60   Total
    Observed f_i            58        27        15         2     102
    Probability P_i     0.5913    0.2417    0.0988    0.0404
    Theoretical f̂_i    60.3111   24.6500   10.0748    4.1177
    (f_i−f̂_i)²/f̂_i     0.3126    0.2241    2.4077    1.0891   χ² = 4.0335

Take α = 0.05 and degrees of freedom = 4 − 2 = 2. Checking the table gives χ²_{0.05,2} = 5.99; since χ² = 4.0335 < χ²_{0.05,2}, P > 0.05, so the B-ultrasound service time follows a negative exponential distribution.

Figure 5. A negative exponential distribution of B-ultrasound service time.

D. Index Calculation and Optimization of the System
From the above fitting-test results, the queuing system is in line with the M/M/C/∞/∞ queuing model [3], with indices λ = 4.2 persons/hour = 0.07 persons/minute, μ = 0.0596 persons/minute, and C = 2. The calculations are as follows [3, 11].
Service intensity:

    ρ = λ / (C·μ) = 0.59    (17)

Idle probability:

    P0 = [ Σ_{K=0}^{C−1} (1/K!)·(λ/μ)^K + (1/C!)·(1/(1−ρ))·(λ/μ)^C ]^(−1) = 0.26    (18)

The average number of patients waiting for examination:

    Lq = ( (Cρ)^C · ρ / (C!·(1−ρ)²) ) · P0 = 0.64 persons    (19)

Once the values of the formulas are identified, the optimal number of help desks can be determined so as to minimize cost [15]. From the above results, when there are two B-ultrasound instruments, the system utilization is 59% and the idle probability is 0.26; 44% of patients have to wait in line after arrival; the average waiting time is 9.14 minutes; and the average length of stay is nearly 26 minutes. The system is busy, and it is difficult to guarantee the quality of service. To optimize the system, we now increase the number of instruments and observe the performance indices (see Table). When the number of devices is increased to three, the capacity utilization is 39%, the idle probability is 0.25, only 11% of patients have to wait in line after arrival, and the average waiting time is only 1 minute. If we increase to 4 sets of equipment, although the indices improve significantly, the capacity utilization is only 29%; while reducing the waiting cost of patients, this increases the cost of service, apparently resulting in some waste [20]. In summary, 2 sets of equipment are basically rational, but 3 sets are the optimal choice for the system [18].
E. Comparison of Patients with Two Kinds of Queuing Models [9, 17]
In the previous study, we assumed that patients can transfer among the B-ultrasound machines, regarded the patients' line as a "single queue," and established an M/M/C queuing model. If patients cannot transfer among the B-ultrasound machines, each desk becomes an independent queuing system, and the whole queuing system changes into n M/M/1 queuing models. We now compare the two queuing approaches for the situation in which the number of sets is 3. If we assume that no patient transfers between B-ultrasound machines, the average arrival rate is

The average number of patients staying in the system:

LS = Lq + C = 1.82 persons
The average waiting time of patients:

(20)

Wq =

Lq

= 9.14 minutes

(21)

Average length of stay of patients:

W s = Wq +

1 = 25.92 minutes

(22)

The probability of patients waiting for examination:

P C ,

C C n P0 Pn = C ! (1 ) = 0.44 (23) n =C

0.07 =0.023 people / minute = n 3

(24)
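The M/M/C indexes of (17)-(23) and the per-desk M/M/1 quantities that follow can be checked numerically. The sketch below uses plain Python (the function and variable names are ours, not from the paper); the printed values for C = 2 agree with the indexes quoted above, and the other rows can be compared with Table V:

```python
import math

lam, mu = 0.07, 0.0596  # patients per minute

def mmc(lam, mu, C):
    """Steady-state indexes of an M/M/C queue, following (17)-(23)."""
    r = lam / mu                      # offered load, lambda/mu
    rho = r / C                       # service intensity (17)
    p0 = 1.0 / (sum(r**k / math.factorial(k) for k in range(C))
                + r**C / (math.factorial(C) * (1 - rho)))          # (18)
    lq = r**C * rho / (math.factorial(C) * (1 - rho) ** 2) * p0    # (19)
    return {"rho": rho, "P0": p0, "Lq": lq,
            "Ls": lq + r,             # (20)
            "Wq": lq / lam,           # (21), minutes
            "Ws": lq / lam + 1 / mu,  # (22), minutes
            "P_wait": r**C / (math.factorial(C) * (1 - rho)) * p0}  # (23)

for c in (2, 3, 4):                   # indexes for 2, 3 and 4 instruments (cf. Table V)
    print(c, {k: round(v, 2) for k, v in mmc(lam, mu, c).items()})

# Independent M/M/1 desks (no transfers), each fed lambda/3 as in (24):
l1 = 0.023
rho1 = l1 / mu
lq1 = rho1**2 / (1 - rho1)            # (26)
print(round(lq1 / l1, 2), round(1 / (mu - l1), 2))  # Wq and Ws, cf. (28)-(29)
```

Small differences from the rounded values printed in the text arise because the paper carries ρ and P₀ to only two decimal places.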

TABLE V. SYSTEM OPERATION UNDER DIFFERENT INSTRUMENT QUANTITIES

Instrument quantity | ρ    | P₀   | L_q  | W_q (min) | P(n ≥ C)
2                   | 0.59 | 0.26 | 0.64 | 9.14      | 0.44
3                   | 0.39 | 0.25 | 0.07 | 1.00      | 0.11
4                   | 0.29 | 0.19 | 0.01 | 0.14      | 0.02

The average service rate remains unchanged, so the system becomes three M/M/1 queuing systems, each with λ = 0.023 patients/minute and μ = 0.0596 patients/minute.

Idle probability:

P₀ = 1 − λ/μ = 0.39   (25)

The average number of patients waiting for the check:

L_q = λ² / (μ(μ−λ)) = 0.24 patients   (26)

The optimization of a queuing system essentially comes down to determining the optimal number of service desks. Considered from the cost structure, there are usually two kinds of cost in a service system: one is the loss cost of each client waiting in the system; the other is the unit-time service cost of each desk.

The average number of patients staying in the system:

L_S = λ/(μ−λ) = 0.63 patients   (27)

The average waiting time of patients:


W_q = L_q / λ = 10.43 minutes   (28)

The average length of stay of patients:

W_S = 1/(μ−λ) = 27.32 minutes   (29)

Now we compare the main indexes of the M/M/C and M/M/1 queuing systems (see Table VI):

TABLE VI. M/M/C AND M/M/1 QUEUING SYSTEM PERFORMANCE COMPARISON

Model | P₀   | L_q  | W_q (min) | W_S (min)
M/M/C | 0.25 | 0.07 | 1.00      | 17.78
M/M/1 | 0.39 | 0.24 | 10.43     | 27.32

The figures in Table VI indicate that in the M/M/C (single-queue) system the queue length, the waiting time and the average stay time of patients are all significantly lower than in the M/M/1 (multi-queue) system. In the multi-queue system the idle rate of the service equipment is 39%, which also causes a waste of resources. The efficiency of a single queue in a multi-server system is therefore relatively good, which is consistent with the theoretical conclusion. The reason lies in the fact that the operating status and the indicators of the system differ under different queuing fashions: joint service (a single queue) is more effective than decentralized service (multiple queues). Therefore, the hospital should adopt a single-queue, multi-server system; that is, all patients needing the examination should line up in a single queue. Collisions and hustle between queues are thus avoided, and the hospital's service status and efficiency improve.

IV. CONCLUSIONS

Estimating the costs in practice, especially the loss cost of customer waiting, is very difficult. Therefore a desire model is developed, which determines the best configuration according to the level of intention. Determining the optimal number of desks C involves the following two indicators: the average waiting time of patients in the system, W_q, and the idle rate of the medical equipment, P₀. These two indexes become the critical values, and decision-makers can determine the range of the optimal number of devices according to the following two formulas:

1 − P₀(C) ≥ …   (30)

λ / (Cμ) < 1   (31)

Then, according to the number of devices, we can compute the patients' waiting time in the system one by one, taking the patients' average waiting time W_q in the system as the upper-bound value for determining the optimal number of equipment units. If no value of C satisfies both objectives, one of the goals needs to be corrected.

Based on the arrival pattern of the client flow, the service-system manager's task is to control and regulate the service system so that it stays in the best operating condition: meeting the clients' requirements while minimizing the total cost to the community (or maximizing the net income of the service organization), or bringing other indexes to their optimum. Therefore, hospital managers should continuously study the pattern of client arrivals with the knowledge of queuing theory. Such study will help to design and regulate the service level and other indicators, enabling the hospital to achieve the best service effect [19].

REFERENCES

[1] Gui Feng-Juan, Wang Ming-Jun and Yang Lin, Research on Optimization of Multi-queue Multi-server Serving System to Arrangement Beds, Computer and Modernization, no. 2, pp. 63-66, 2010.
[2] Qian Xiao-Hong and Hu Chun-Ping, The application of queuing theory in the hospital rehabilitation and physical therapy ultrashort wave, Medicine and Society, vol. 20, pp. 32-33, July 2007.
[3] Hu Yun-quan, Operational Research, Tsinghua University Press, 2003.
[4] Frederick S. Hillier, Data, Models and Decision-making, China Financial and Economic Publishing House, 2006.
[5] Qin Cai, Yang Zhang-Wei and Hu Jin-Hong, Optimization of service in Outpatient Pharmacy and Emergency Pharmacy by operational research and data statistics, Academic Journal of Second Military Medical University, vol. 21(10), pp. 983-986, October 2000.
[6] Jiang Shu-hua and Fu Xiao-Liang, Application of Checking Service Model of Supermarket Based on Queuing Theory, Logistics Sci-Tech, pp. 141-142, October 2008.
[7] Wang Xing-gui and Jiao Zheng-chang, Analysis on queuing theory in the bank line up problems, Journal of Xiangtan Normal University (Social Science Edition), vol. 30(1), pp. 58-60, January 2008.
[8] Chen Bin, Research on queuing theory in the hospital medical management system, Software Guide, vol. 8(9), pp. 107-108, September 2009.
[9] Liu Wei and Liu Zhi-Min, Applying Queuing Theory to Quantitative Analysis on Outpatient Billing Service in Hospital, Professional Forum, vol. 30(10), pp. 87-89, October 2009.
[10] Pan Quan-Ru and Zhu Yi-Jun, The Application of Queuing Theory in Toll Station Design and Management, OR Transactions, vol. 13(3), pp. 95-102, September 2009.
[11] Zhao Yuan, Ding Wen-Ying, Dong Shao-Hua, et al., How to Determine the Proper Quantity of Traveling Crane


Based on Queuing Theory, Logistics Technology, vol. 27(8), pp. 217-219, 2008.
[12] Wang Lei-Ping, The Design on the Application of the Queuing Theory in Management System of Physical Health Checking, Master's thesis, May 2008.
[13] Han Xin-Huan, Zhu Peng-Shu and Wu Jing, Optimization decision analysis of queuing model in the hospital management system, Journal of Mathematical Medicine, vol. 21(1), pp. 16-17, 2008.
[14] Lin Zheng-Xiong, Exploration of queuing theory in improved banking services system application, Modern Business Trade Industry, no. 1, pp. 167-168, 2010.
[15] Liu Jun-Lan, Li Ya-Fang, Xu Jing and Liu Zi-Xian, Optimizing operating department facility capacity with the queuing theory, Chinese Journal of Hospital Administration, vol. 27(6), pp. 413-416, June 2011.
[16] Zeng Hua and Sun Xia-Lin, The Model of Emergency Treatment in Hospital Based on Queue Theory, Value Engineering, vol. 29(11), pp. 126-128, 2010.
[17] Sun Hong-hua and Zeng Chao, Optimize research of hospital base on queue model, China Modern Educational Equipment, no. 5, pp. 59-63, 2010.
[18] Sun Hong-heng and Li Jin-ping, Queuing Theory Basis, Science and Technology Press, Beijing, 2002.
[19] Suresh Radhakrishnan and Kashi R. Balachandran, Service Capacity Decision and Incentive Compatible Cost Allocation for Reporting Usage Forecasts, European Journal of Operational Research, vol. 157, pp. 180-195, 2004.
[20] Cai Qin, Yang Zhang-wei, Hu Jin-hong and Ge Hai-yi, Optimization of service in Outpatient Pharmacy and Emergency Pharmacy by operational research and data statistics, Academic Journal of Second Military Medical University, vol. 21(10), pp. 983-986, 2000.


Research and Application of Electromagnetic Compatibility Technology


Hong Zhao
School of Electrical Engineering, Dalian University of Technology, Dalian, Liaoning, China Dalian Institute of Product Quality Supervision & Inspection, Dalian, Liaoning, China Email: zhaohong@dlzjs.org

Guofeng Li and Ninghui Wang


School of Electrical Engineering, Dalian University of Technology, Dalian, Liaoning, China

Shunli Zheng and Lijun Yu


Dalian Institute of Product Quality Supervision & Inspection, Dalian, Liaoning, China

Abstract—Although still in its primary phase, research on EMC science and its technological applications in modern industries has become more and more noticeable. In this paper the main subjects of EMC research are introduced; then some hot research topics concerning the application of EMC technology are briefly discussed, in such areas as the military field, industrial electronics and microelectronics, the power industry, onboard and airborne electronics, communications, home appliances and lighting equipment. The complex electromagnetic environment of the modern informationalized battlefield is then emphatically addressed. Finally, several outstanding problems in the development of electromagnetic compatibility technology are concisely introduced.

Index Terms—EMC research, EMC technology application, complex electromagnetic environment, electromagnetic environment effect

II. AN OVERVIEW OF EMC TECHNOLOGY

Zhang Linchang has defined EMC as follows: EMC is a science that studies how a variety of electrical or electronic equipment (and living things, in a broad sense) can operate smoothly and coexist within the limited resources of space, time and spectrum without performance degradation under such conditions [1]. EMC is a highly comprehensive boundary science and also an applied science of high practicality, based on the fundamental theories and technologies of the electrical and electronic sciences, and concerned with solving the theoretical and technical problems arising from natural or man-made electromagnetic interference (EMI). The final purpose of EMC research is to ensure the EMC performance of each system or subsystem [2]. Most modern industries have EMC problems to be solved, as in the power, telecommunication, transportation, spaceflight, computer, military and medical-equipment industries, etc.

III. RESEARCH SUBJECTS FOR EMC SCIENCE AND TECHNOLOGY

I. INTRODUCTION

The study of electromagnetic compatibility (EMC) is closely correlated with people's life and work, and the content of EMC study is comprehensive. In this paper the main research subjects of EMC science are first introduced; then hot research topics within some areas of EMC research and technology are described, involving battlefield electromagnetic environment effects, EMC control and management for industrial electronics and microelectronics, electromagnetic susceptibility (EMS) performance of electrical and electronic equipment in power systems, EMC performance of automobile electronics, EMI suppression for household appliances, etc. The complex electromagnetic environment of the modern informationalized battlefield under high-tech conditions is then emphatically addressed. Following a discussion of the development trends of EMC technology for military equipment in the information age, several noticeable problems in the future development of EMC technology are briefly described at the end.
2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2240-2247

A. Sources of Electromagnetic Disturbances

Whenever di/dt ≠ 0 or dv/dt ≠ 0, electromagnetic noise will be generated, and such noise makes up the major part of electromagnetic disturbance. Sources of electromagnetic disturbance can be of two kinds: natural or man-made. The former includes electromagnetic noises from outer space and the atmosphere (such as thunderbolts and ionosphere changes), electrostatic discharge, thermal noise, etc.; the latter includes those from radio-frequency devices for industrial, scientific or medical use, and from transmitters for communication, navigation, remote control or radio broadcasting, etc. The difficulties in the research on electromagnetic noise sources include the following.


The first is the generating mechanisms of electromagnetic noise, which are many and varied: for household appliances alone there are at least four kinds of noise-generating mechanism. The second is the inherent conflict between EMC control and the development trends of some technical areas; for example, one trend in the computer industry is that CPU clock frequencies go higher and higher, but the higher the noise frequency, the more difficult the noise control. The third difficult problem up to now is the digital modeling of the physical phenomenon of electromagnetic noise generation.

B. Propagation Characteristics of Electromagnetic Noises

Generally speaking, the transmission modes of electromagnetic noise can be divided into two categories: conducted emission and radiated emission. The former means transmission of electromagnetic noise energy through one or several conductors (such as power lines, signal lines, control lines, etc.). The latter means space transmission of noise energy in the form of electromagnetic waves, sometimes including induction phenomena. The research method for the propagation characteristics is to build mathematical models according to electromagnetic field theory. The difficulties in this research are as follows. The first is that in most of the mathematical models, both near-field and far-field effects of electromagnetic waves have to be considered at the same time, because the frequency bands of electromagnetic noises can be very wide. The second is that both the source and the channel of a noise have to be modeled in the same model. The third is that the boundary conditions of the models are usually complicated because of the need for the models to be practical, and idealizing the boundary conditions can be difficult to some degree.

C. Immunity Characteristics of Electronic Devices and Systems

The main concerns are the responses of receivers to electromagnetic noises and how to improve electromagnetic immunity. This kind of research involves many technical areas such as telecommunication, navigation, radar, broadcasting, television, information technology, remote control and remote sensing. Excessive dependence on human experience is the difficult point to overcome in this research.

D. Test Equipment, Test Methods and Statistical Methods

Electromagnetic noises are not normal sinusoidal voltages, but voltages taking on various shapes and different frequency spectrums, including pulse voltages. Therefore there are strict requirements on the test equipment, test sites and test methods. In the tests, attention is always paid to instrument parameters, measurement fields, measurement methods, and such matters as statistics and evaluation. Since the coefficients or performance parameters of test equipment or test sites may drift from their initial values with the lapse of time, it is recommended to perform routine checks (also called spot checks) of the test equipment and test sites in the daily operation of an EMC laboratory, say once per week or twice per month. Timely corrective measures should be taken when deviations occur, to ensure that uncertainties in the performance of the test equipment or test sites cannot lead to test-data distortion. Table I is part of a check list on the performance of an anechoic chamber; the judgement standard in this check list is that the difference between the spot-check value and the corresponding original value should be within 3 dB. Here the dB is a frequently used EMC unit; for example, the ratio between two values can be stated in dB as follows [3]:

dB = 10 log₁₀(P₂/P₁)   (ratio of power values)   (1)

TABLE I. CHECK LIST ON THE PERFORMANCE OF SOME ANECHOIC CHAMBER


dB = 20 log₁₀(v₂/v₁)   (ratio of voltage values)   (2)

dB = 20 log₁₀(i₂/i₁)   (ratio of current values)   (3)
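Formulas (1)-(3) are easy to exercise with a small helper; the function names below are illustrative, not from any standard:

```python
import math

def db_power(p2, p1):
    """Power ratio expressed in dB, per (1)."""
    return 10 * math.log10(p2 / p1)

def db_amplitude(x2, x1):
    """Voltage or current ratio expressed in dB, per (2) and (3)."""
    return 20 * math.log10(x2 / x1)

print(db_power(100, 1))     # a 100:1 power ratio is 20 dB
print(db_amplitude(10, 1))  # a 10:1 voltage ratio is also 20 dB
```

A spot-check criterion like the 3 dB rule mentioned above then amounts to checking that abs(db_amplitude(checked, original)) does not exceed 3.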

Figure 1. EMC view on electronic systems.

E. EMC Analysis, Prediction and EMC Design

EMC design must rely on EMC analysis and prediction. The key points in EMC analysis and prediction are the building of mathematical models and the programming for the computation and analysis of the EMI within a system or between systems. Although the accuracy of EMC analysis and prediction is not likely to be very high at present, it should reach a level that is serviceable in practice. For example, a basic mathematical model has been set up for the quantitative discussion of the shielding effectiveness of metal shielding shells, which relates the shielding effectiveness SE_dB to the reflection loss R_dB, the absorption loss A_dB and the multiple-reflection factor M_dB, as in (4):

SE_dB = R_dB + A_dB + M_dB   (4)

Furthermore, practical mathematical models have been built for determining exact values of the shielding effectiveness under the condition of a far-field source [3], as well as approximate values of R_dB and A_dB, as in (5), (6) and (7), with η₀ being the inherent impedance of free space, η the inherent impedance of the shielding layer, t the thickness of the shielding layer (in meters), δ the skin depth of the shielding material (in meters), σ_r the relative conductivity, μ_r the relative magnetic permeability and f the frequency (in Hz):

SE_dB ≈ 20 log₁₀(η₀ / (4η)) + 20 log₁₀ e^{t/δ} + M_dB   (5)

R_dB = 168 + 10 log₁₀(σ_r / (μ_r f))   (6)

A_dB = 131.4 t √(f μ_r σ_r)   (7)
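A numeric sketch of (6) and (7) follows; the material (copper, σ_r = μ_r = 1) and the sheet thickness are illustrative assumptions, not values from the text:

```python
import math

def shielding_effectiveness(f_hz, t_m, sigma_r, mu_r):
    """Far-field shielding effectiveness per (4), using the approximations
    (6) and (7); the multiple-reflection term M_dB is neglected, which is
    reasonable when the absorption loss exceeds roughly 10 dB."""
    r_db = 168 + 10 * math.log10(sigma_r / (mu_r * f_hz))  # reflection loss (6)
    a_db = 131.4 * t_m * math.sqrt(f_hz * mu_r * sigma_r)  # absorption loss (7)
    return r_db + a_db                                     # SE = R + A (+ M)

# A 0.5 mm copper sheet (sigma_r = 1, mu_r = 1) at 1 MHz:
# reflection loss 108 dB, absorption loss 65.7 dB.
print(round(shielding_effectiveness(1e6, 0.5e-3, 1.0, 1.0), 1))
```

The 108 dB reflection term matches the well-known figure for copper at 1 MHz, which is a quick sanity check on (6).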

IV. SOME HOT TOPICS IN EMC TECHNOLOGY APPLICATION RESEARCH

For EMC technology there are many research and application areas, such as weapon systems [4-7], industrial electronics or microelectronics [8-15], power systems [16, 17], onboard or airborne electronics [18-28], communications [29], home appliances [30] and lighting equipment [31]; see Fig. 1. In each area there can be several levels of concrete application: system, subsystem, interconnect, component, etc. The challenging EMC technology areas for all these applications can be restricted to three main themes: EMC design, EMC standardization and EM (electromagnetic) safety [32]. Hot research topics in some of the EMC technology application areas are presented in the following.

A. Electromagnetic Compatibility for Weapon Systems

With the development of technology for military equipment, the signal spectrum may be composed of signals from many kinds of electrical equipment, and as the usage schemes of electrical equipment change over time, the characteristics of the electromagnetic environment have become more and more complicated. Take a warship for instance: it can be considered a platform holding many electronic weapons and much electrical equipment, and electromagnetic environment effects may take various forms. For example, the radio-frequency interference may take on the characteristics of high amplitude and high density, the radiation from each electronic-warfare system may have a wide frequency spectrum, and low-frequency communication equipment may be harmed by strong magnetic-field environments or low-frequency heavy currents. In a future war, effective management, control and flexible use of the electromagnetic environment will become key factors in winning the initiative. Therefore, the key points of research with respect to battlefield electromagnetic environment effects may be listed as follows [4]:
1. Analysis techniques for the electromagnetic environment.
2. Design techniques for the electromagnetic safety of weapon systems.
3. Simulation techniques for the electromagnetic environment and strong electromagnetic environments.
4. Experimentation, test and evaluation of electromagnetic environment effects.
5. Techniques of control and management.
In the next section, the complex electromagnetic environment of the modern informationalized battlefield under high-tech conditions will be emphatically addressed.


B. EMC Control and Management for Industrial Electronics or Microelectronics

The EMC research on electrical and electronic devices is still in its primary phase. Research in recent years has focused on such issues as techniques for modeling and suppressing the EMI from power converters and motor drives, parasitic effects of EMI filters, layout optimization for PCBs, and numerical analysis techniques for EMI [8].

C. EMS Performance of the Equipment in Power Systems

Nowadays the application of digital instrument control systems has been a hot topic of research and development in the nuclear power industry, and EMI immunity testing of such systems has become a focus of the engineers' attention. To improve the EMC performance of the entire system, it is necessary to strengthen the anti-EMI design of such components as the power supply, the communication network, the input/output modules and the CPU module; reasonable design or corrective measures for the electromagnetic environment of the system, such as proper grounding, standard wiring or dust prevention, may also be necessary [16]. Full-scale micro-computerization of the protection devices in power systems has become an irreversible trend. How to improve the anti-EMI ability of these devices to ensure the stable operation of the power system has always been a hot topic in the research of microprocessor-based protection technology; here the anti-interference abilities of both the hardware and the software of such a device can be taken into consideration [17].

D. EMC Performance for Onboard or Airborne Electronics

Take electric vehicles for example. For various kinds of electric vehicles, such as fuel-cell vehicles and HEVs, electromagnetic compatibility problems have received much attention, including problems with car parts, control and management systems, and entire vehicles. EMC research on electric vehicles aims at ensuring that the onboard electrical and electronic equipment can work compatibly together in the running state, each piece of equipment being immune to interference from outside while causing no interference beyond the allowed range to other onboard or outside equipment.

E. EMI Suppression for Home Appliances

According to their operating principles and configurations, household electromotors can be sorted into two categories, i.e. the commutator type and the induction type. Electromagnetic disturbance from home appliances of the former type can be particularly serious; accordingly, anti-disturbance measures such as proper filtering, shielding and grounding should be considered in the early design phase [30].

V. ABOUT THE COMPLEX ELECTROMAGNETIC ENVIRONMENT OF THE BATTLEFIELD IN THE FUTURE

The electromagnetic environment of an informationalized battlefield under high-tech conditions should be recognized as a composite environment, characterized by the joint action of various forms of electromagnetic energy. In this composite environment coexist both sources of electromagnetic disturbance of natural origin, such as thunderbolts and electrostatics, and sources of serious man-made disturbance, such as radars of various powers, radio communication, navigation, antagonistic EW (electronic-warfare) equipment, new-concept electromagnetic weapons, etc.; see Fig. 2. The electromagnetic environment of any modern battlefield is mainly composed of various electromagnetic pulse fields [5].

Figure 2. Formation of the electromagnetic environment of the modern battlefield.

For the research of the complex electromagnetic environment, one should first get to know some basic concepts and the relationships between them, such as electromagnetic environment, complex electromagnetic environment, electromagnetic disturbance, EMI, EMC, electromagnetic shielding, spectrum management and electronic warfare [6].

A. Electromagnetic Environment and Complex Electromagnetic Environment

Electromagnetic environment commonly means the grand sum of all the electromagnetic phenomena existing in some place. On analyzing the formation and characteristics of the electromagnetic environment of a future battlefield, the electromagnetic environment under battlefield conditions can be defined as: in a specific battlefield space, the grand sum of the natural and man-made electromagnetic phenomena that may influence warfare operations. A complex electromagnetic environment may be defined as: in a specific battlefield space, a battlefield electromagnetic environment that has certain influences on equipment, fuels and personnel, being the superposition of different electromagnetic signals that may be densely distributed in complex modes, numerous in quantity, and dynamic and stochastic in the time, frequency, energy and space domains.

B. Electromagnetic Environment Effects

Electromagnetic environment effects refer to the effects of some factors, or of the collectivity, of the electromagnetic environment acting on equipment, volatile materials, organisms, etc. In Electronic


War published by the U.S. army in 2007, the concept of electromagnetic environment effects has been defined as the influences of electromagnetic environment on the operational capabilities of the armed forces, equipments, systems and platforms. C. Electromagnetic Disturbance and Electromagnetic Interference In GJB72A-2002, the concept of electromagnetic interference has been described as any conducted or radiated electromagnetic energy that can interrupt or hamper, even degrade or restrain the performance of radio communication or some other electrical or electronic equipment. The U.S. army defines electromagnetic interference as: any electromagnetic disturbance that can interrupt or hamper, degrade or restrain the working efficiency of electrical or electronic equipment. The concept of electromagnetic disturbance contains the two meanings of natural and man-made disturbances, and it is a neutral concept without particular attributes, commonly signifying the electromagnetic disturbance phenomena in the combat systems of own side or the ally side. However, electromagnetic interference usually means a kind of man-made purposeful electromagnetic behaviors, generally the behaviors between the two warring sides. D. Electromagnetic Compatibility, Electromagnetic Shielding and Spectrum Management Electromagnetic compatibility refers to a coexistence state that in a common electromagnetic environment, the respective functions of equipments, subsystems or systems can be exercised all together. There are two aspects of this concept. (1) When running in an expected electromagnetic environment, equipment, subsystem or system can fulfill its designed performance according to specified security margins, the performance not to be compromised or unacceptably degraded because of electromagnetic interference. 
(2) When smoothly running in an expected electromagnetic environment, an equipment, subsystem or system is not likely to bring unacceptable electromagnetic interference to the environment (or to other equipment). Electromagnetic shielding refers to the measures taken to avoid the influences or even damage that electro-explosive devices, fuels and personnel may suffer from electromagnetic environment effects (or electromagnetic hazard sources), as well as the technical measures for electronic equipment and subsystems to lower their electromagnetic sensitivity and gain the ability to resist electromagnetic interference or electromagnetic damage, especially as countermeasures against enemy electromagnetic attacks. The two research subjects of electromagnetic compatibility and electromagnetic shielding have coherence in their intensions and extensions, and the purpose of both is to improve the viability and

serviceability of any system in an expected electromagnetic environment. Military electromagnetic spectrum management is a collective term for a set of activities of military leading bodies and electromagnetic spectrum management organizations, including the establishment of policies and systems for electromagnetic spectrum management; the division, planning, distribution and allocation of frequencies and spacecraft-orbit resources; and the supervision, inspection and coordinating management of the usage of frequencies and spacecraft-orbit resources. In fact, spectrum management is also a sort of management and control measure for ensuring the electromagnetic compatibility of systems and equipment.

VI. DEVELOPMENT TRENDS OF EMC TECHNOLOGY APPLICATION

In industrial production and human life, the harm of EMI has been recognized more and more widely. For example, for fear of security threat to a flying civil aircraft, it is forbidden to use mobile phones in the capsule cabin, because electromagnetic disturbance from a mobile phone may be coupled in the cables then to the sensitive equipments in the aircraft, and the disturbance may also radiate outward through the cabin windows and then be directly received by the antennas and sensors, which are numerous on the fuselage. As another example, electromagnetic radiation may interfere with electroexplosive devices, making them mistakenly detonated. The American Ellen Sugarman depicted in his Warningthe Electricity Around you May be Hazardous to your Health: How to Protect Yourself From Electromagnetic Fields that many epidemiology researches have revealed that being exposed to a magnetic field of over 2 milligausses may add to the risk of contracting cancer. And if a residence is within 100 meters far away from high-tension cables, the risk for children to contract leukaemia or brain cancer will increase. With the fast increase of clock frequencies of integrated circuits in electronic systems (already up to 15 GHz in the year 2010), the number and variety of potential EMI sources and victims around us are going to increase exponentially in the near future. The question of how to control EMI is becoming a key issue in system design [32]. To meet the market needs for products that are safe to use, highly reliable and perfect in EMC performance, new methods and new means should be adopted to lower EMI from electrical components, interconnects and (sub)systems, meanwhile improve their immunity for EMI. Future electromagnetic environment needs to be controlled through the establishment of legislation and standardization, with new test methods, frequency bands and EMI limits. 
Zhao Gang has discussed the development trends of EMC technology for military equipment in the information age [7]; the discussion is also a useful reference for other EMC application areas and is generalized below.

1. Improvement of the viability and serviceability of the C4ISR system is a focus of EMC research in this century.

2012 ACADEMY PUBLISHER

JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012

2245

C4ISR is a man-machine system that integrates the subsystems of command, control, communication, computers, intelligence, surveillance and reconnaissance. By the comprehensive use of modern electronic and information technologies and military science theories in a battle command system, C4ISR can make decision-making scientific and the collection, transfer and processing of operational information automatic, thus ensuring efficient operational command.

2. Application of predictive analysis methods will feature in EMC design and EMC control for electronic equipment in this century. The so-called predictive analysis method means that, at the beginning of and throughout the whole course of weapon equipment development and design, numerical calculation methods complemented with appropriate tests are used, according to the equipment characteristics, antenna layouts and armament layouts, or the component characteristics and circuit layouts, to make predictive analyses of the electromagnetic characteristics of the systems, subsystems, equipment, parts and components; to distribute EMC targets reasonably among them; to set requirements and targets for the control devices, circuits and elements and continually correct and supplement them; and to solve EMC problems along with the development process.

3. Verification, evaluation and shielding of electromagnetic environment effects are hot problems to be solved at present. An electromagnetic environment effect is the influence of the electromagnetic environment on the serviceability of an electrical or electronic system, equipment or device. It covers all of the electromagnetic disciplines, including EMC, electromagnetic interference, electromagnetic vulnerability, electromagnetic pulses, electronic countermeasures, the harm of electromagnetic radiation to armament and volatile substances, and natural effects such as lightning and precipitation static.

4. Proper management and utilization of radio frequency spectrum resources are becoming increasingly important for EMC technology application in the information age.

Radio spectrum is a natural resource that is important, limited and non-consumable. With the rapid development of information technology, spectrum resources are being exploited ever more fully (see Fig. 3); meanwhile the increasingly overcrowded spectrum occupancy plus serious electromagnetic pollution is making the available spectrum resources increasingly scarce. As a result, how to meet the demands of the information society for spectrum resources, and how to make the best of and effectively manage the limited spectrum resources, have become critical problems to be solved in this information age.

5. Exploration and application of new techniques, new materials and new devices are the trends for EMC technology development. The technologies for EMI suppression and anti-interference reinforcement are backbone technologies for EMC control. Good electromagnetic compatibility of circuit components, equipment, systems or armament platforms rests on interdisciplinary and multidisciplinary joint research, on the persistent development and application of new EMC technologies, new materials and new devices, and meanwhile on the full use of the classic techniques of shielding, filtering, decoupling, isolating, grounding, etc.

Gao Yougang has enumerated several outstanding problems in the current development of environmental electromagnetics and EMC technology, briefly introduced as follows [33].

1. Electromagnetic compatibility prediction. EMC prediction is necessary in the development of a complicated equipment or system. At present, EMC predictions are carried out at three levels: 1) EMC predictions at chip level; 2) EMC predictions at component level; 3) EMC predictions at system level. Unfortunately, it is hard to make a perfect EMC prediction at any of these levels by now, and there has not been any serviceable prediction software.
As extremely complex electromagnetic boundary-value problems, EMC problems are hard to solve using conventional methods.

2. Development of shielded measurement technologies.

Figure 3. The frequencies we use.

Figure 4. A stirred-mode chamber.


Recently the development of stirred-mode reverberation chambers (see Fig. 4) has gained much attention. The need for reverberation chambers comes from the requirement for high EMI field intensity in test standards such as aerospace standards, military standards and motor vehicle standards.

3. Frequency spectrum allocation and management. Since the mid-twentieth century people have come to realize that the radio spectrum, composed of the three factors of space, time and radio frequency, is a unique kind of natural resource: limited but not consumable, important but neither visible nor touchable. This precious and special resource requires scientific management; meanwhile its use needs to be made most effective and strictly restricted, thus ensuring a desirable radio environment.

4. EMC problems of space vehicles. A space vehicle is a mixed system combining complex mechanical structures with complex electrical and electronic structures. EMC analysis and prediction for such a system can be divided into two aspects. 1) For the basic component units in the system that can be effectively analyzed, corresponding analysis software packages should be worked out and mathematical models made as perfect as possible, by means of various electromagnetic numerical analysis methods that may be relatively complex and relatively accurate. 2) The inside of a space vehicle can be divided into several relatively independent subsystems, each of which may be equivalent to an EMI source or an EMI victim. Following EMC analysis and prediction for each subsystem, corresponding mathematical models can be worked out.

5. EMC problems in radio communication technology. Some critical EMC problems need to be solved for mobile communication technologies in this century, so that the development of such technologies can rest on a reasonable foundation.
The research work should include two aspects, namely problems within a system and problems between systems. 1) EMC problems are very complicated inside a system, especially in a radio system equipped with high-frequency components and circuits. Special attention should be paid to systems for high-speed information transmission, which will be an important research direction for EMC in this century. 2) Some critical EMC problems between systems should also receive enough attention.

6. EMC problems in computers. A computer is a low-level electronic system that commonly works in a complex electromagnetic environment while radiating and leaking electromagnetic interference over a wide frequency band. Yet from the angle of electromagnetic compatibility, the computer is mainly a sensitive device. Generally speaking, the main measures for improving the anti-interference ability of computer systems can be outlined as follows.

1) The anti-interference ability of the system itself can be improved starting from the design of hardware and software. 2) Shielded grounding is an effective measure for improving the anti-interference ability of a computer. 3) Filtering is an important method to suppress interference coupled along transmission paths. 4) High-density information can be transmitted using optical technology, which is not easily affected by electromagnetic induction noise.

7. Ecological effects of electromagnetic fields. The influence of electromagnetic waves on biological tissues is strongly frequency dependent. Currently two frequency bands are being intensively studied: one is the electric power frequency (50 Hz~60 Hz), the other is the radio wave band (radio frequency). 1) Ecological effects of low-frequency fields. Foreign medical research results have shown that the electromagnetic fields of high-tension cables can produce harmful effects on human tissues. The factors causing such electromagnetic environment problems include the noise waves from corona discharges, ozone, and the magnetic and electric fields induced by the currents and voltages in the conductors. 2) Biological effects of radio frequency fields. The effects of electromagnetic waves on biological tissues can be divided into thermal effects and non-thermal effects. The mechanisms of thermal effects are relatively clear, while those of non-thermal effects are not yet. In conclusion, the effects of electromagnetic radiation on animals and human bodies are an important research subject; although there has been great progress, many problems are still unclear and need further research, especially research and tests that are systematic, long-term and compliant with scientific standards.

REFERENCES
[1] Zhang Linchang, Developing electromagnetic compatibility undertaking in our country, Transactions of China Electrotechnical Society, vol. 20(2), pp. 23-28, 2005.
[2] Ma Weiming, Several problems in the research for EMC prediction of independent system, Electromagnetic Compatibility Technology, vol. (2), pp. 1-4, 2005.
[3] Clayton R. Paul, Introduction to Electromagnetic Compatibility (Second Edition, translated by Wen Yinghong et al.), Beijing: Posts & Telecom Press, 2007.
[4] Hou Dongyun, Electromagnetic environment status and EME/EMC technology development, Electromagnetic Compatibility Technology, vol. (4), pp. 1-5, 2007.
[5] Liu Shanghe, Weaponry furnishment and electromagnetic environment effects, Electromagnetic Compatibility Technology, vol. (3), pp. 1-7, 2006.
[6] Liu Shanghe, Sun Guozhi, Analysis of the concept and effects of complex electromagnetic environment, Electromagnetic Compatibility Technology, vol. (2), pp. 1-6, 2009.
[7] Zhao Gang, Development trend of weaponry furnishment EMC technology in the informational age, Ship Electronic Engineering, vol. 27(1), pp. 20-22, 2007.


[8] Qian Zhaoming, Chen Henglin, State of art of electromagnetic compatibility research on power electronic equipment, Transactions of China Electrotechnical Society, vol. 22(7), pp. 1-11, 2007.
[9] Ali Alaeldine, Nicolas Lacrampe, Alexandre Boyer, et al., Comparison among emission and susceptibility reduction techniques for electromagnetic interference in digital integrated circuits, Microelectronics Journal, vol. 39(12), pp. 1728-1735, 2008.
[10] Zhou Changlin, Hu Mingxin, Lin Xin, et al., Electromagnetic compatibility analysis and design for digital signal controllers, Asia-Pacific International Symposium on Electromagnetic Compatibility, pp. 668-671, 2010.
[11] Boyuan Zhu, Junwei Lu, Erping Li, Electromagnetic compatibility benchmark-modeling approach for a dual-die CPU, IEEE Transactions on Electromagnetic Compatibility, vol. 53(1), pp. 91-98, 2011.
[12] Zhen Zhang, K. T. Chau, Zheng Wang, et al., Improvement of electromagnetic compatibility of motor drives using hybrid chaotic pulse width modulation, IEEE Transactions on Magnetics, vol. 47(10), pp. 4018-4021, 2011.
[13] M. S. Sarto, A. Tamburrano, Modelling approaches for nanotechnology applied to electromagnetic compatibility, Asia-Pacific International Symposium on Electromagnetic Compatibility, pp. 498-503, 2010.
[14] C. Fuentes, B. Allongue, G. Blanchot, et al., Optimization of DC-DC converters for improved electromagnetic compatibility with high energy physics front-end electronics, IEEE Transactions on Nuclear Science, vol. 58(4), pp. 2024-2031, 2011.
[15] Erping Li, Xingchang Wei, Andreas C. Cangellaris, et al., Progress review of electromagnetic compatibility analysis technologies for packages, printed circuit boards, and novel interconnects, IEEE Transactions on Electromagnetic Compatibility, vol. 52(2), pp. 248-265, 2010.
[16] Huang Wenjun, Yu Haoyang, Ao Chunbo, EMC test and design of I&C system in nuclear power plants, Nuclear Power Engineering, vol. 29(3), pp. 85-88, 2008.
[17] Peng Honghai, Zhou Youqing, Wang Hongtao, et al., Anti-interference technology for microprocessor-based protection, High Voltage Engineering, vol. 33(10), pp. 49-53, 2007.
[18] Lucas E. A. Chamon, Claudio H. G. Santos, Kenedy M. dos Santos, et al., Dielectric effects in electromagnetic compatibility experiments for automotive vehicles, 9th IEEE/IAS International Conference on Industry Applications, pp. 1-6, 2010.
[19] Y. Li, F. P. Dawalibi, R. Raymond, Electromagnetic compatibility analysis of power line and railway sharing the same right-of-way corridor: a practical case study, International Conference on Future Power and Energy Engineering, pp. 103-106, 2010.
[20] Wang Jian, Cai Bai-gen, Liu Jiang, et al., Electromagnetic compatibility design of multi-sensor based train integrated positioning system, International Conference on Electromagnetics in Advanced Applications, pp. 753-756, 2010.
[21] Mohamed Youssef, Jaber A. Abu Qahouq, Mohamed Orabi, Electromagnetic compatibility results for an LCC resonant inverter for the transportation systems, Twenty-Fifth Annual IEEE Applied Power Electronics Conference and Exposition, pp. 1800-1803, 2010.
[22] Emmanuelle Garcia, Electromagnetic compatibility uncertainty, risk, and margin management, IEEE Transactions on Electromagnetic Compatibility, vol. 52(1), pp. 3-10, 2010.
[23] R. De Maglie, A. Engler, Radiation prediction of power electronics drive system for electromagnetic compatibility in aerospace applications, Proceedings of the 14th European Conference on Power Electronics and Applications, pp. 1-9, 2011.
[24] Guo Dandan, Su Donglin, Xie Yongjun, et al., The complex network model of the airborne equipment electromagnetic compatibility, 9th International Symposium on Antennas Propagation and EM Theory, pp. 1019-1022, 2010.
[25] Mohamed Youssef, Jaber Abu-Qahouq, Mohamed Orabi, The electromagnetic compatibility design considerations of the input filter of a 3-phase inverter in a railway traction system, IEEE Energy Conversion Congress and Exposition, pp. 4210-4216, 2010.
[26] Liu Ying, Xie Yong-jun, Zhang Yong, Top-down design flow and its applications in multi-vehicle communication systems EMC design, Journal of University of Electronic Science and Technology of China, vol. 39(5), pp. 720-724, 2010.
[27] Zheng Yali, Yu Jihui, Wang Quandi, et al., Dynamic circuit model of the spark plug for EMC prediction of ignition system, Transactions of China Electrotechnical Society, vol. 26(2), pp. 8-13, 2011.
[28] An Jing, Wu Junfeng, Wu Yihui, Electromagnetic compatibility design of electronic control system for attitude control flywheel, Journal of Jilin University (Engineering and Technology Edition), vol. 41(4), pp. 998-1003, 2011.
[29] Nozad Karim, Jingkun Mao, Jun Fan, Improving electromagnetic compatibility performance of packages and SiP modules using a conformal shielding solution, Asia-Pacific International Symposium on Electromagnetic Compatibility, pp. 56-59, 2010.
[30] Liu Ping, Sha Fei, Improving the electromagnetic compatibility of home appliances with commutator electromotor, Journal of Northern Jiaotong University, vol. 26(6), pp. 64-68, 2002.
[31] Eugen Coca, Valentin Popa, Georgiana Buta, Compact fluorescent lamps electromagnetic compatibility measurements and performance evaluation, IEEE International Conference on Computer as a Tool, pp. 1-4, 2011.
[32] Van Doorn M., EMC technology roadmapping: a long-term strategy, IEEE International Symposium on EMC, pp. 156-159, 2006.
[33] Gao Yougang, et al., Forecast of electromagnetic compatibility technology, Electromagnetic Compatibility Technology, vol. (1), pp. 1-5, 2006.


An Improved Mixed Gas Pipeline Multivariable Decoupling Control Method Based on ADRC Technology
Zhikun Chen1, 2
1 Electrical Engineering College, Yan Shan University, Hebei Qinhuangdao 063009, China; 2 College of Electrical Engineering, Hebei United University, Hebei Tangshan 063009, China Email: zkchen@heuu.edu.cn

Yutian Wang1, Ruicheng Zhang2 and Xu Wu2


1 Electrical Engineering College, Yan Shan University, Hebei Qinhuangdao 063009, China; 2 College of Electrical Engineering, Hebei United University, Hebei Tangshan 063009, China Email: {y.t.wang@163.com, rchzhang@yahoo.com.cn, barbiewx@163.com}

Abstract: Aiming at the serious coupling, uncertainty, susceptibility to disturbance and nonlinearity of the butterfly-valve series system for gas mixing, a static decoupling technique based on active disturbance rejection control (ADRC) and a dynamic decoupling method based on the extended state observer (ESO) are proposed. In this method, the coupling from the other inputs, the time-varying parameters and the external disturbances are all regarded as a total disturbance; the total disturbance is estimated by the ESO and fed back to the controller, and decoupling is thereby achieved. In order to improve the response speed of the system and reduce the computation, the control system uses a nonlinear tracking differentiator (TD), the ESO and a nonlinear state error feedback control law (NLSEF) built on nonlinear functions. Simulation results show that with this method the decoupling is simplified and the requirements on the model are reduced. The method also requires less calculation, with faster response and better robustness, while keeping good tracking performance and disturbance rejection capability; the decoupling effect is good.

Index Terms: gas mixing; active disturbance rejection control (ADRC); extended state observer (ESO); decoupling

I. INTRODUCTION

For the steel industry, the gas compressor station is a very important unit: it is responsible for supplying the iron and steel group and the surrounding cities with mixed gas of the required calorific value and pressure. If the gas compression station stops, all units of the iron and steel group have to stop manufacturing. Owing to the characteristics of iron and steel production processes, the geographical layout of the business and the complexity of the production conditions, the control system of the mixed gas pressure station is required to cope with the following features: (1) the variability of the input and output pressures under the changing demand of the upstream and downstream users; (2) the medium is gas, and the process changes over a long term.

(3) the volatility characteristics of the pipe network. Obviously, we face a complex controlled object whose mathematical model is difficult to determine, with large delay and nonlinear parts. As systems become more and more complex, with characteristics such as uncertainty, interference, nonlinearity and non-minimum phase, more than one variable needs to be regulated, and the variables are interrelated. The traditional single-variable control system design approach is clearly unable to meet the requirements, so in engineering, multivariable decoupling is commonly introduced. There are two butterfly valves in the blast furnace gas pipeline and two butterfly valves in the coke oven gas pipeline; by adjusting these four valves, the flow ratio of coke oven gas to blast furnace gas is regulated, and thereby the calorific value and the pressure of the mixture. There is serious coupling between the flow control and the pressure regulation [1]; at the same time the adjustments of the four butterfly valves also interact, which implies a relatively strong coupling in this system. In recent years, along with the development of control theory, a variety of decoupling control methods have emerged to meet the requirements of the times. At present, most domestic pressure stations are controlled manually. There are three common methods of mixed gas pressure control in the automatic control of compression stations: the pressure and calorific value dual-loop controller, feed-forward compensation decoupling, and intelligent decoupling. The pressure and calorific value dual-loop adjustment method uses a variable matching circuit to match the appropriate regulating variables; this method is not accurate, and its ability to suppress disturbances is not strong. The feed-forward compensation decoupling method, due to the limitations of the control method, gives a poor control effect [1-4]. Along with the development of intelligent control, an intelligent decoupling control strategy was proposed by Ren Hailong [5]. By analyzing the regularities and the physical structure of the gas mixing and pressurizing process, they divided the control loop into two: the calorific value

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2248-2255


control loop and the pressure control loop. However, because the two valves in series on the blast furnace gas pipeline and the two valves in series on the coke oven gas pipeline are very close together, reference [1] pointed out that this is a typical strongly coupled system. For this reason, we explore a new control method, active disturbance rejection control (ADRC), and apply it to the mixed gas pipeline multivariable system in order to achieve better control results. ADRC has three parts: the tracking differentiator (TD), the extended state observer (ESO) and the nonlinear state error feedback control law (NLSEF). It is a kind of nonlinear controller that does not depend on the model: it attributes the model uncertainty and the unknown external disturbances (including the coupling terms) to a total disturbance of the system, which is estimated in real time and dynamically compensated by the ESO. It has many advantages, such as high precision, fast response, robustness and usability. ADRC adapts to the trend of digital control, absorbs the results of modern control theory, and develops the application of nonlinear effects into new practical technology. Therefore, in this paper, ADRC static decoupling technology and ESO dynamic decoupling technology are used to establish the multivariable control system of gas mixing and to realize decoupling control of the butterfly valves.

II. PROCESS INTRODUCTION

According to the law of energy conservation, the heat value of the mixed gas satisfies (1):

R3 Q3 = R1 Q1 + R2 Q2                                  (1)

where the subscripts 1, 2 and 3 denote the coke oven gas side, the blast furnace gas side and the mixed gas respectively, R is the gas calorific value (kJ/m^3), and Q is the gas flow (m^3/s).

The design of four-valve regulation of the mixed gas heating value was first proposed by the Japanese Nippon Steel Corporation; at present the vast majority of gas pressure regulating stations have adopted this scheme. The butterfly valve has been developing rapidly in China because it has the following advantages: (1) simple structure and short, easily laid-out installation: the installation length is far less than DN, and the ratio of installation length to nominal diameter is only about 0.2~0.1; (2) small size and light weight; (3) small flow resistance: when a low-pressure valve is fully open, the flow resistance coefficient is less than 1; (4) simple switching: the disc plate only needs to be turned through 90 degrees. The butterfly valve has many advantages but also a drawback: most valves cannot withstand high pressure. In a high-pressure environment, two butterfly valves are generally connected in series to divide the pressure; this is also the reason why gas compressor stations use butterfly valves in series. The butterfly valves regulate the gas flow, gas pressure and gas heating value mainly on the basis of equation (2):

Q = A sqrt(2 g ΔP / (ζ γ))                             (2)

where Q is the volume flow (m^3/s), i.e. the volume of gas flowing through the pipe cross-section per second; A is the flow cross-sectional area of the pipe (m^2); ζ is the dimensionless drag coefficient; g is the acceleration of gravity (m/s^2); ΔP = (p1 - p2) (kgf/cm^2) is the pressure differential, p1 and p2 being the pressures before and after the valve; and γ is the specific weight of the medium (kgf/m^3).

The simplified process of a gas mixing compression station is shown in Fig. 1: valves PV101 and PV103 sit on the blast furnace gas line and valves PV102 and PV104 on the coke oven gas line, feeding the gas mixing point. By adjusting these four valves, the flow ratio of coke oven gas to blast furnace gas is regulated, and thereby the calorific value and the pressure of the mixture. The mixed gas is pressurized by the compression machine; compressor stations generally install an inverter, and the pressure is adjusted by changing the speed of the pressure machine through the frequency converter.

Fig. 1. Process of a mixed gas compressor station.

For simplicity, excluding the effect of density before and after mixing, we obtain (3):

Q3 = Q1 + Q2                                           (3)

where the subscripts 1, 2 and 3 again denote coke oven gas, blast furnace gas and mixed gas. According to (3), because the total flow Q3 of the mixed gas is not controlled but is determined by the users, when one gas flow rate changes, the other gas flow must be changed to regulate the calorific value of the mixed gas.

III. THE MODEL OF THE GAS MIXING BUTTERFLY VALVE GROUP

During the pressurized mixing of blast furnace and coke oven gas, after the controller calculates the control values for the coke oven and blast furnace valve groups, the butterfly valve controller decides according to the series gain


matrix how to adjust the two valves of the corresponding valve group. Since the basic ideas of pressure distribution are exactly the same for the blast furnace valve group and the coke oven valve group, the coke oven butterfly valve group is taken as an example.

Fig. 2. Pipe system of pressure-flow (coke oven gas passes butterfly valve No. 1, with opening u1, and then butterfly valve No. 2, with opening u2).

In Fig. 2, P0 is the coke oven gas pressure before the number one butterfly valve, P1 is the pressure between the two valves (before the number two butterfly valve), P2 is the pressure after the number two butterfly valve, u1 and u2 are the openings of the number one and number two butterfly valves, and Q is the gas flow of the coke oven pipeline. According to reference [10], the pressure-flow process can be described as

Q = u1 (P0 - P1) = u2 (P1 - P2) = u1 u2 (P0 - P2) / (u1 + u2)          (4)

When both loops are open, the first amplification factor is

(∂Q/∂u1)|u2=const = [u2 / (u1 + u2)]^2 (P0 - P2)                       (5)

With the pressure loop closed (P1 = const), the first amplification factor is

(∂Q/∂u1)|P1=const = P0 - P1                                            (6)

According to the definition of the relative gain,

λ11 = (∂Q/∂u1)|u2=const / (∂Q/∂u1)|P1=const = u2 / (u1 + u2)           (7)

Solving (4) for u1 and u2 and substituting into (7), the gain can be represented by pressures:

λ11 = (P0 - P1) / (P0 - P2)                                            (8)

Similarly, the relative gain between u2 and the flow Q is

λ12 = (P1 - P2) / (P0 - P2)                                            (9)

If P1 is used to describe the pressure-flow system,

P1 = P0 - Q/u1 = P2 + Q/u2 = (u1 P0 + u2 P2) / (u1 + u2)               (10)

The other gains can be determined by taking partial derivatives of (10), which yields the relative gains of the two channels from u1 and u2 to P1. Then, for the pressure-flow system, the pairing of the outputs [Q, P1] with the inputs [u1, u2] is described by the relative gain matrix

Λ = [λ11 λ12; λ21 λ22]                                                 (11)

where

Λ = [ (P0-P1)/(P0-P2)  (P1-P2)/(P0-P2) ;
      (P1-P2)/(P0-P2)  (P0-P1)/(P0-P2) ]                               (12)

If in the system P1 is close to P2, Λ is very close to the unit matrix, indicating that it is appropriate to control the flow with valve 1 and the pressure P1 with valve 2. If P1 is close to P0, it is appropriate to control the flow with valve 2 and the pressure P1 with valve 1. If P1 is close to the midpoint between P0 and P2, adjusting both valves is best. The pressures of the coke oven gas pipeline in this system are P0 = 6.2 kPa, P1 = 5.5 kPa, P2 = 5.0 kPa, so the gain matrix is

Λ = [0.58 0.42; 0.42 0.58]

All λij are very close to 0.5, which shows that there is serious coupling in this system. From (11), the mathematical model of the control system can be expressed as

[y1(s), y2(s)]' = G(s) Λ [u1(s), u2(s)]'                               (13)

where y1(s) = Q(s), y2(s) = P(s), and G(s) = diag{G1(s), G2(s)} is the transfer function of the controlled object.

Fig. 3. Gas mixing process control block diagram (references R1(s), R2(s); relative gains λ11, λ12, λ21, λ22; plants G1(s), G2(s); outputs Y1(s), Y2(s)).
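As a quick numerical check (a Python sketch, not part of the original paper), the relative gain matrix of the two-valve system can be evaluated directly from the three line pressures; with the coke oven pipeline pressures quoted in the text it reproduces the matrix [0.58 0.42; 0.42 0.58]:

```python
# Relative gain matrix of the two-valve pressure-flow system:
# lambda11 = (P0-P1)/(P0-P2), lambda12 = (P1-P2)/(P0-P2), and the
# second row mirrors the first (each row and column sums to 1).

def relative_gain_matrix(p0, p1, p2):
    lam11 = (p0 - p1) / (p0 - p2)
    lam12 = (p1 - p2) / (p0 - p2)
    return [[lam11, lam12],
            [lam12, lam11]]

# Coke oven line pressures from the text: P0 = 6.2, P1 = 5.5, P2 = 5.0 kPa
L = relative_gain_matrix(6.2, 5.5, 5.0)
print([[round(x, 2) for x in row] for row in L])  # [[0.58, 0.42], [0.42, 0.58]]
```

The fact that both entries of each row are near 0.5 is exactly the "serious coupling" diagnosis: neither valve has a clearly dominant channel.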

Through trial-and-error debugging, G1(s) = K/(T0 s + 1), where K = 2 and T0 = 1.5. The gas mixing process control block diagram is shown in Fig. 3.

IV. ADRC DECOUPLING CONTROLLER DESIGN FOR THE BUTTERFLY VALVE SERIES SYSTEM

ADRC has three parts: the tracking differentiator (TD), the extended state observer (ESO) and the nonlinear state error feedback control law (NLSEF). The structure is shown in Fig. 4.

Fig. 4. Structure of ADRC (TD outputs z11, z12; ESO outputs z21, z22, z23; the errors e1, e2 feed the NLSEF, whose output u0 is combined with the disturbance estimate through the gains 1/b0 and b0 to form the control u applied to the object with output y).

The tracking differentiator (TD) is used to arrange the transition process; it tracks the input signal quickly and without overshoot, and has good differential characteristics. It avoids violent changes of the control signal and output overshoot when the set value jumps, and thus largely resolves the contradiction between system response speed and overshoot. Precisely because of this, applications of ADRC with very high speed requirements are subject to certain restrictions. The extended state observer (ESO) is the core part of ADRC: it attributes the internal and external factors acting on the system to a total disturbance, estimates all the state variables of the system together with the internal and external disturbances, and compensates appropriately, in order to achieve dynamic feedback linearization. The differences between the TD outputs and the ESO estimates give the state-variable errors of the system; these errors enter the nonlinear state error feedback control law (NLSEF), and after its operation the compensation from the ESO is added to obtain the final control signal for the controlled object. Because ADRC classifies objects according to the time scale of the system, the controller design does not need to consider whether the system is linear or nonlinear, time-varying or time-invariant, which simplifies the design.

A. ADRC Static Decoupling

ADRC static decoupling can be achieved with a conventional compensator. In actual operation the static coupling matrix Λ is uncertain; therefore a rough estimate Λ0 within its range of variation is taken, and the approximation error is attributed to the disturbance. Applying a static decoupling compensator to the controlled object, let

[v1(s), v2(s)]' = Λ0 [u1(s), u2(s)]'                                   (14)

where vi(s) is the virtual control signal of channel i. Then

[u1(s), u2(s)]' = Λ0^(-1) [v1(s), v2(s)]' = N [v1(s), v2(s)]'          (15)

so the static decoupling compensator is

N = Λ0^(-1)                                                            (16)

After the static decoupling compensation, (13) can be written as

[y1(s), y2(s)]' ≈ G(s) [v1(s), v2(s)]'                                 (17)
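The static decoupling step amounts to multiplying by the inverse of a rough coupling estimate Λ0. A small self-contained Python sketch (illustrative values, not from the paper) shows how the residual coupling Λ N seen by the loops stays close to the identity even when Λ0 is only approximate:

```python
# Static decoupling: with a rough estimate L0 of the coupling matrix,
# the compensator is N = inv(L0); the compensated plant sees L @ N,
# which is close to the identity when L0 is close to L.

def inv2(m):
    # inverse of a 2x2 matrix
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul2(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

L  = [[0.58, 0.42], [0.42, 0.58]]   # true coupling (assumed known here)
L0 = [[0.6, 0.4], [0.4, 0.6]]       # rough estimate used by the compensator
N  = inv2(L0)                       # static decoupling compensator
R  = matmul2(L, N)                  # residual coupling seen by the loops
print(R)  # near-identity; the leftover off-diagonal part acts as a disturbance
```

The leftover off-diagonal terms of R are exactly what the ESO is left to absorb as part of the total disturbance, which is why a rough, even nearly singular, estimate Λ0 suffices.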

Evidently, ADRC static decoupling breaks through the matrix-inversion limits of conventional decoupling methods: it only needs a rough estimate Λ0 of Λ, and possible singularity of Λ is not a restriction. Whatever approximation error arises from the uncertainty or singularity of Λ, ADRC treats it as a new disturbance, which is then automatically estimated and compensated. As long as the difference between Λ and Λ0 is not very large, a good decoupling objective can be achieved. Therefore the ADRC static decoupling method has a wider range of applicability and better robustness.

B. ADRC Decoupling Control Design

ADRC is built on the idea of formulating a robust control strategy. Although a linear model makes it feasible to use powerful classical control techniques, such as frequency-response-based analysis and design methods, it also limits the options to linear algorithms and makes the design overly dependent on the mathematical model of the plant. Instead of following the traditional design path of modeling, linearization and then linear controller design, the ADRC approach seeks to actively compensate for the unknown dynamics and disturbances in the time domain. Once the external disturbance is estimated, the control signal is used to actively compensate for its effect, and the system then becomes a relatively simple control problem. More details of this control concept and the associated algorithms can be found in [8]-[10]; a brief introduction is given below. In many practical applications, the performance of the controlled system is limited by how to extract the differential signal of a discretely measured signal with

2012 ACADEMY PUBLISHER


JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012

stochastic noise. In practice, the differential signal (velocity) is usually obtained by the backward difference of the given signal, which is very noisy and limits the overall performance. Han [8] developed a nonlinear tracking differentiator (TD) to solve this problem effectively. It is described as
    e0 = r1i − ri*
    ṙ1i = −R fal(e0, α0, δ0)        (18)

where R is the tracking speed parameter, δ0 is the control parameter, and r1i tracks the input ri*. The function fal is defined as

    fal(e, α, δ) = |e|^α sgn(e),   |e| > δ
    fal(e, α, δ) = e / δ^(1−α),    |e| ≤ δ        (19)
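For reference, (18)-(19) can be sketched directly; the gains R, α0, δ0 and the Euler step below are illustrative choices, not values taken from this paper.

```python
import math

def fal(e, alpha, delta):
    """Han's piecewise nonlinear gain (eq. 19): power law outside the
    linear band |e| <= delta, linear with slope delta**(alpha - 1) inside."""
    if abs(e) > delta:
        return abs(e) ** alpha * math.copysign(1.0, e)
    return e / delta ** (1.0 - alpha)

def track(r_star, R=50.0, alpha0=0.5, delta0=0.1, dt=0.001, steps=2000):
    """First-order tracking differentiator (eq. 18), Euler-discretized:
    r1 follows the set value r_star quickly and without overshoot."""
    r1 = 0.0
    for _ in range(steps):
        e0 = r1 - r_star
        r1 += dt * (-R * fal(e0, alpha0, delta0))
    return r1
```

Note that fal is continuous at |e| = δ (both branches give δ^α there), which is exactly why the linear band suppresses high-frequency chattering near zero error.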

The extended state observer (ESO) proposed by Prof. Han [8] is a unique nonlinear observer; it estimates all the state variables of the system and, at the same time, the total disturbance:

    εi = z1i − yi
    ż1i = z2i − β01 fal(εi, α, δ) + b0i vi        (20)
    ż2i = −β02 fal(εi, α, δ)        (21)

where β01 and β02 are the observer gains of each channel and b0 is the nominal value of b. Because the performance of the extended state observer directly determines whether the performance of the ADRC is good or bad, the selection of the ESO parameters is critical. Each speed loop of the system can be approximated by a first-order object, so only a second-order extended state observer needs to be designed; it can then estimate the system state variables and compensate the disturbance at the same time. In equation (21) the parameters that need tuning are only α, δ, β1 and β2. The parameter α lies between 0 and 1: the smaller α is, the stronger the nonlinearity and the stronger the ability of the ESO to adapt to model uncertainty and disturbance. It is usually taken as 0.25, 0.5 or 0.75; here we take α = 0.5. The parameter δ is the width of the linear interval of the nonlinear function; this linear range is designed to avoid the large high-frequency pulses caused by the large slope of the error curve near zero. If δ is too small, high-frequency pulses arise easily; if it is too large, the nonlinear feedback degrades to some extent into linear feedback. Here we take δ = 5. With α and δ determined, the parameters left to tune in the ESO are only β1 and β2. The dynamic performance of the system is strongly affected by β1 and β2: β1 mainly affects the estimate of the state variable, and β2 mainly affects the estimate of the disturbance; the larger β1 and β2 are, the faster the estimation. For a large-inertia system with a large time constant, the corresponding values of β1 and β2 should be larger; likewise, the larger the system disturbance, the larger β1 and β2 should be. If β1 and β2 are too large, however, the estimates may oscillate, so β1 and β2 must be adjusted in a coordinated way to ensure that the ESO estimates quickly and accurately while the estimated values do not oscillate during disturbance compensation. In this paper the ESO parameters were first tuned in a Matlab simulation to obtain a set of values, which were then refined on the actual system through many experiments; β1 and β2 are 90 and 2500.

Once the design of the TD and the ESO is accomplished, the error between the reference and the estimated state and its rate of change can be defined as ei and e0i. The nonlinear proportional-derivative (N-PD) law is used to synthesize the preliminary control action:

    ei = ri − z1i,   e0i = ėi
    v0i = kpi fal(ei, α1, δ1) + kdi fal(e0i, α2, δ1)        (22)

In formula (22), ei is the error term, e0i is the error rate, and kpi, kdi are the gains of the PD controller. Adding the compensation term that cancels the estimated total disturbance, the final control action of the ADRC is determined as

    vi = (v0i − z2i) / b0i        (23)
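Putting (19)-(23) together, the following minimal single-channel sketch closes the loop around an assumed first-order plant with an unknown constant load disturbance. The plant model, the step sizes and δ = 1 (chosen so the errors of this toy example stay inside the linear band) are illustrative assumptions; β01 = 90, β02 = 2500, α = 0.5 follow the tuning discussed above, and the derivative term of (22) is omitted because the example plant is first order.

```python
import math

def fal(e, alpha, delta):
    # Han's nonlinear gain (eq. 19)
    if abs(e) > delta:
        return abs(e) ** alpha * math.copysign(1.0, e)
    return e / delta ** (1.0 - alpha)

def adrc_channel(r=1.0, d=2.0, b0=1.0, kp=20.0,
                 beta01=90.0, beta02=2500.0, alpha=0.5, delta=1.0,
                 dt=0.001, steps=5000):
    """One ADRC channel around an assumed first-order plant
    y' = -y + d + b0*u, where d is an unknown constant load disturbance.
    The second-order ESO (eqs. 20-21) estimates the state (z1) and the
    total disturbance f = -y + d (z2); the control law (eqs. 22-23)
    cancels z2 and closes the loop on the estimated state."""
    y, z1, z2, u = 0.0, 0.0, 0.0, 0.0
    for _ in range(steps):
        # plant, Euler step (u held over the step)
        y += dt * (-y + d + b0 * u)
        # extended state observer, eqs. (20)-(21)
        eps = z1 - y
        z1 += dt * (z2 - beta01 * fal(eps, alpha, delta) + b0 * u)
        z2 += dt * (-beta02 * fal(eps, alpha, delta))
        # nonlinear feedback plus disturbance compensation, eqs. (22)-(23)
        v0 = kp * fal(r - z1, alpha, delta)
        u = (v0 - z2) / b0
    return y, z2

y, z2 = adrc_channel()
```

At equilibrium the observer drives z2 to the total disturbance f = −y + d, the compensation u = (v0 − z2)/b0 cancels it exactly, and the output settles on the set point with no static error, which is the behavior the simulation section below reports for the real two-channel system.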

The structure of the active disturbance rejection decoupling control system is given in Fig. 5.

V. SIMULATION

[Fig. 5. Architecture of ADRC for gas mixing butterfly valve group: each of the two channels has a linear ESO (states z1i, z2i) driven by the plant output and the control, a linear SEF law acting on the set-point error, and disturbance compensation through the gains b0i and 1/b0i; a decoupling block combines the two channel outputs into the plant inputs.]

To verify the feasibility of the control scheme, Fig. 5 also serves as the block diagram of the simulation test. The adjustable ADRC parameters were optimized with the Matlab nonlinear optimization toolbox; the ESO and NLSEF parameters of the two channels are the same. The ADRC controller parameters are: ESO parameters


are: β01 = 90, β02 = 2500, b0 = 4/3. The NLSEF parameters are: kp = 20, kd = 10.5.

A. Tracking Performance

For the nominal state, a unit step was applied to each of the two channels in turn; the simulation results are shown in Fig. 6. Figure 6(a) is the response to a step change of the flow set point: curve 1 is the flow and curve 2 is the pressure. Figure 6(b) is the response to a step change of the pressure set point: curve 1 is the pressure and curve 2 is the flow. Although the coupling of the two-valve series system is serious, after decoupling with the method based on the bidirectional growth-hormone regulation mechanism, when the flow rate Q changes the pressure remains almost unchanged and the output has no static error, so the decoupling works well, and vice versa.

(a) Flow set point step change
(b) Pressure set point step change
Fig. 6. Responses of gas mixing butterfly valve group

B. Disturbance Rejection Tests

At t = 500 s, a load disturbance signal with amplitude 0.5 was added to each channel. The simulation results are shown in Fig. 7. Figure 7(a) is the response of the flow channel to the added interference signal: curve 1 is the flow and curve 2 is the pressure. Figure 7(b) is the response of the pressure channel to the added interference signal: curve 1 is the pressure and curve 2 is the flow.

(a) Changes in flow channel
(b) Pressure channel change
Fig. 7. Responses of disturbance rejection

The interference has little effect on the butterfly valve series system, so it can be neglected.

VI. CONCLUSIONS

In summary, gas mixing is a nonlinear, strongly time-varying, uncertain and complex multivariable coupled production process. Fluctuations of the mixed gas pressure and calorific value are determined by a variety of factors, most of which are unpredictable and occur frequently, causing the pressure and calorific value of the mixed gas to fluctuate and affecting production. Traditional decoupling methods therefore cannot meet the requirements, and an intelligent decoupling method must be used. Active disturbance rejection control arose in response to this need. It adapts to the trend of digital control, absorbs the results of modern control theory, develops and enriches the spirit and essence of classical control (eliminating errors based on errors), and develops and applies nonlinear effects; it features small overshoot, fast response, high control precision, strong disturbance rejection and a simple algorithm. This paper uses a decoupling method based on ADRC to remove the coupling of the gas mixing system. The simulation results show that the control system not only has good tracking performance and disturbance rejection ability, but also decouples well, solving the adverse impact on the system of the strong coupling, time variation and confounding factors of the valves in series in the gas mixing system.

ACKNOWLEDGMENTS

This work was supported by the Natural Science Foundation of Hebei Province (grant number: F2009000796).

REFERENCES
[1] Qin Lu, "Decoupling and Smith compensator with a mixture of gas heating value control," Control Engineering, pp. 73-75, 2002.
[2] Meng Chun and Xia Chen, "SMAR intelligent regulator and control of mixed gas calorific value," Hebei Metallurgy, pp. 53-55, 2002.
[3] Wu Xiaofeng, Sun Hanfeng, and Wei Chengyong, "Automatic control of furnace gas pressure regulator," Metallurgical Automation, pp. 64-65, 2002.
[4] Huang Jimin and Cao Jianmei, "Computer-controlled system of gas mixing," Automation and Instrumentation, pp. 28-30, 2000.
[5] Ren Hailong and Li Shengyu, "Fuzzy decoupling control of mixture gas pressure and calorific value," Automation Instrumentation, pp. 65-67, 2004.
[6] Shao Liwei and Liao Xiaozhong, "ADRC in the voltage-type PWM rectifier," Journal of Beijing University of Technology, 2008.
[7] Cao Weihua, Wu Min, and Hou Shaoyun, "Method and application of intelligent decoupling control for the gas mixture pressurized process," Journal of Central South University, pp. 780-785, 2006.
[8] Han Jingqing, Active Disturbance Rejection Control Technique: A Technique for Estimating and Compensating Uncertainties. Beijing: National Defence Industry Press, 2008.
[9] Wu Meng, Zhu Xilin, E Shiju, Sun Mingge, and Tong Shaowei, "ADRC controller parameter tuning method," Journal of Beijing University of Technology, 2009.
[10] Jin Yihui and Fang Chongzhi, Process Control. Beijing: Tsinghua University Press, 1995.
[11] Guo Jiang, "Intelligent control of the pressure and calorific value of mixed blast furnace and coke oven gas," Journal of Xinjiang University: Engineering Edition, pp. 231-235, 2001.
[12] Li S. L., Yang X., and Yang D., "Active disturbance rejection control for high pointing accuracy and rotation speed," Automatica, pp. 1854-1860, 2009.
[13] Qing Zhang, "On active disturbance rejection control: stability analysis and applications in disturbance decoupling control," National University of Singapore, 2003.
[14] Qiu Xiaobo, Dou Lihua, Han Jingqing, and Zhou Qihuang, "Application of ADRC in tank maneuvering target state estimation," Ordnance Technology, 2009.

[15] Xu Xiangbo and Fang Jiancheng, "Angular acceleration disturbance observer for the gyro frame servo system," Journal of Beijing University of Aeronautics and Astronautics, 2009.
[16] Wu Meng, Zhu Xilin, E Shiju, Sun Mingge, and Tong Shaowei, "ADRC controller parameter tuning method," Journal of Beijing University of Technology, 2009.
[17] J. Timmis and M. Neal, "Artificial homeostasis: integrating biologically inspired computing," www.cs.kent.ac.ulc/pubs/2003/1586Zcontent.pall, Apr. 15, 2006.
[18] B. Liu, K. R. Hao, and Y. S. Ding, "A nonlinear optimized controller based on modulation of testosterone," in Proc. Third International Conference on Computational Intelligence, Robotics and Autonomous Systems (CIRAS 2005), Singapore, Dec. 14-16, 2005.
[19] B. Liu and Y. S. Ding, "A decoupling control based on the bi-regulation principle of growth hormone," in Proc. 2005 ICSC Congress on Computational Intelligence: Methods & Applications (CIMA 2005), Istanbul, Turkey, Dec. 15-17, 2005.
[20] B. Liu, Y. S. Ding, and J. H. Wang, "An intelligent controller inspired from the neuroendocrine-immune system," in Proc. 2006 International Conference on Intelligent Systems & Knowledge Engineering (ISKE 2006), Shanghai, China, Apr. 6-7, 2006.
[21] Wu M., Cao W. H., and He C. Y., "Integrated intelligent control of gas mixing-and-pressurization process," IEEE Transactions on Control Systems Technology, vol. 17, no. 1, pp. 68-77, 2009.
[22] Xia Weixing and Yang Xiaotong, "Filtering characteristics of the steepest tracking-differentiator based on robust filtering theory," 2010, 17(2).
[23] D. M. Keenan, J. Licinio, and J. D. Veldhuis, "A feedback-controlled ensemble model of the stress-responsive hypothalamo-pituitary-adrenal axis," PNAS, vol. 98, no. 7, pp. 4028-4033, 2001.

Zhikun Chen, born in October 1961, is a professor and master supervisor in control science and engineering. He received his M.S. in detection technology and automation devices from Northeastern University and his B.S. in metallurgical chemical automation instrumentation from Hebei Institute of Mining and Metallurgy. He is mainly engaged in research, teaching and education management in sensor detection technology, intelligent control and machine vision. He has been granted provincial, municipal and university titles of outstanding teacher and a first prize for school teaching achievement. He has published 40 papers in leading journals such as Chinese Journal of Scientific Instrument, Electronic Measurement Technology, Journal of Chongqing University of Posts and Telecommunications, and Transducer and Microsystem Technologies, 4 of which are indexed by EI, and he has co-authored one book in Chinese. Professor Chen currently hosts a Key Laboratory project of Tangshan City (a radial basis function neural network prediction system for the sintering end point) and a Natural Science Foundation of Hebei Province project (gas mixing intelligent decoupling control algorithm and its application research). He is a member of the Ministry of Education Steering Committee for Instrument Science and Technology Education, executive director of the Hebei Instrumentation


Institute, and Associate Dean of Computer and Automatic Control at Hebei Polytechnic University.

Yutian Wang, born in October 1952, is a professor and doctoral supervisor in control science and engineering, mainly engaged in research on sensor detection technology, intelligent control and machine vision.

Ruicheng Zhang, born in March 1975, Ph.D., is an associate professor and master supervisor in control science and engineering. He received his Ph.D. in control theory and control engineering from the University of Science and Technology Beijing, his M.S. in process equipment and control engineering from Lanzhou University of Technology, and his B.S. in process equipment and control engineering from Hebei University of Science and Technology. He is mainly engaged in research on rolling automation, intelligent production control and robust control. He was awarded the Hebei University of Technology Excellent Teacher third

prize, as well as the honorary titles of outstanding graduate design instructor, excellent teacher and excellent script. In recent years, students under his guidance have won first and second prizes in the "Freescale" Cup National University Smart Car Race. He has published 30 papers in leading journals such as Journal of Vibration and Control, Control Theory and Applications, Transactions of China Electrotechnical Society, and Journal of University of Science and Technology Beijing, indexed 11 times by SCI and EI and cited 50 times by others. Dr. Zhang has co-authored one book in Chinese. He currently chairs a National Natural Science Foundation project (small-world artificial neural network model), a Ministry of Science and Technology project funding the business operations of scientific personnel (the JDC150 electronic digital multi-function tape measure), and a Natural Science Foundation of Hebei Province project (ADRC control theory and applied research on vibration control in rolling mill drive systems), among other projects.

Xu Wu, born in October 1986, studied electrical engineering at Yanshan University from 2005 to 2009 and is a master's student in control science and engineering at Hebei Union University.


A Robust Scalable Spatial Spread-Spectrum Video Watermarking Scheme Based on a Fast Downsampling Method
Cheng Wang, Shaohui Liu, Feng Jiang, and Yan Liu
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China Email: shliu@hit.edu.cn

Abstract: This paper presents a robust spatial spread-spectrum (SS) video watermarking scheme with adaptive detection for H.264/AVC and SVC. We introduce a fast image size change method in the DCT domain to obtain a low-pass and downsized version of the video frame for embedding, which protects the watermarking signal from being damaged by downsampling during SVC encoding. We then embed watermark bits by the SS method, exploiting the Just Noticeable Distortion (JND) model to dynamically adjust the embedding strength. The watermark is spread to the original video after the upsampling of the embedded downsized version. Furthermore, we deduce a detection model based on a likelihood ratio test to realize adaptive detection, and we design a different detection strategy for the base layer of SVC-encoded video. Simulation results prove the validity of the detection model and show that the proposed watermarking algorithm has strong robustness not only for H.264/AVC-encoded I-frames, P-frames and B-frames, but also for SVC-encoded I-frames and P-frames, while preserving the perceptual quality.

Index Terms: fast downsampling, spread-spectrum video watermarking, JND model

I. INTRODUCTION Video watermarking has been proposed as a technology for authentication and copyright protection by embedding an imperceptible, yet detectable signal into the video sequence [1]. Currently, the video coding standard H.264/AVC with high compression efficiency and strong network adaptability is utilized in a wide range of applications. And in order to adapt to different network environment and user terminals, the Joint Video Team (JVT) has also standardized a Scalable Video Coding (SVC) extension of the H.264/AVC standard which aims at encoding video in a single bit stream from which multiple spatial and temporal resolutions at different quality levels can be extracted. With the rapid increase of distribution of video content, it is necessary to devise new watermarking schemes for the two standards. There are two embedding scenarios for video watermarking, embedding in raw data (namely spatial domain watermarking) and embedding in the coding process or directly in the compressed bitstream (namely compressed domain watermarking).

Recently, a few compressed domain watermarking algorithms for H.264/AVC have been proposed. In [2], the authors embed watermark information in the quantized residuals of I-frames and build a theoretical framework for watermark detection based on a likelihood ratio test. The authors in [3]-[4] modify the quantized DC coefficients of each DCT block; in [3] they propose a drift compensation algorithm, while in [4] they implement a blind video watermarking scheme. These compressed domain methods commonly offer real-time processing of the video; however, they may not survive video format conversion, including SVC coding, because the watermark is tied to the H.264/AVC standard. Spatial video watermarking methods resistant to H.264/AVC have received continuous attention due to their high capacity, as well as the avoidance of error drift in the compressed domain. In [5], the authors employ the JND model to further improve the performance of [6]; however, simply setting a threshold to detect the watermark causes higher error ratios under larger quantization steps. Compared to [5], Huang et al. [7] apply a 1D-DCT in the temporal domain and use quantization index modulation (QIM) to embed the watermark, but the scheme is not robust to some attacks such as scaling and frame deletion. For watermarking algorithms on SVC, few articles have appeared in the open literature. In [8], the authors propose a watermarking scheme using the Fourier-Mellin transform, but it is no longer robust when the image size is scaled from 1920×1080 to 854×480. The authors in [9] propose a robust compressed domain watermarking scheme for SVC, but only focus on inter-layer intra prediction. In this paper, we introduce the fast scheme for image size change in the DCT domain proposed in [10] to get a low-pass and downsized version of the video frame, and then embed watermark bits in the AC coefficients of the downsampled video frame, with the corresponding JND values adjusting the embedding strength.
These JND values of the downsampled video frame are calculated by the JND values of the original video frame through the same downsampling process. After the upsampling, the watermark will be spread to the original video frame. Note that our embedding method is essentially like [9] where the base layer watermark signal is upsampled to match the resolution of the enhancement layer data. For

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2256-2261


watermark detection, we follow the same theoretical framework as in [2] and derive a similar detection model. Furthermore, we design a detection strategy for the base layer of SVC-encoded video. The rest of the paper is organized as follows. In Section II, we briefly review the fast scheme for image size change in [10]. Section III analyses the correctness of our watermarking algorithm theoretically. In Section IV, we describe the details of the proposed embedding scheme. In Section V, we present the derived detection model and the detection strategy for the base layer of SVC-encoded video. Simulation results are provided in Section VI, followed by the conclusion in Section VII.

II. FAST SCHEME FOR IMAGE SIZE CHANGE

In order to keep the paper self-contained, we briefly describe the fast scheme for image size change in [10] as follows.

[Figure 1. Macroblock downsampling and upsampling: the four 8×8 DCT blocks B1, ..., B4 of a macroblock are reduced to their 4×4 low-pass parts B̂1, ..., B̂4, which together form one downsampled 8×8 block B̂.]

Let b1, b2, b3 and b4 denote four adjacent 8×8 blocks in the spatial domain and let B1, B2, B3 and B4 denote their DCTs, as shown in Fig. 1. Let B̂1, B̂2, B̂3 and B̂4 denote the 4×4 matrices containing the low-pass coefficients of B1, B2, B3 and B4, respectively, and let b̂1, b̂2, b̂3 and b̂4 denote the 4×4 inverse DCTs of B̂1, B̂2, B̂3 and B̂4. Then b̂ := [b̂1, b̂2; b̂3, b̂4] denotes the low-pass downsampled version of b := [b1, b2; b3, b4], and B̂ := DCT(b̂). Let T8 denote the 8-point DCT operator matrix, let T4 denote the 4-point operator matrix, and let TL and TR denote the first and last four columns of T8, respectively. Our goal is to obtain B̂ directly from B̂1, B̂2, B̂3 and B̂4. We have

    B̂ = T8 b̂ T8^t = [TL, TR] [T4^t B̂1 T4, T4^t B̂2 T4; T4^t B̂3 T4, T4^t B̂4 T4] [TL, TR]^t
      = (TL T4^t) B̂1 (TL T4^t)^t + (TL T4^t) B̂2 (TR T4^t)^t + (TR T4^t) B̂3 (TL T4^t)^t + (TR T4^t) B̂4 (TR T4^t)^t        (1)

Finally we have

    B̂ = (X + Y) C^t + (X − Y) D^t        (2)

where

    X = C(B̂1 + B̂3) + D(B̂1 − B̂3);  Y = C(B̂2 + B̂4) + D(B̂2 − B̂4);  TL T4^t = C + D;  TR T4^t = C − D        (3)

It should be noted that the matrices C and D in equations (3) are very sparse and can be computed in advance. During the upsampling, given B̂, we have

    B̂1 = (TL T4^t)^t B̂ (TL T4^t);  B̂2 = (TL T4^t)^t B̂ (TR T4^t)
    B̂3 = (TR T4^t)^t B̂ (TL T4^t);  B̂4 = (TR T4^t)^t B̂ (TR T4^t)        (4)

In the next section, the correctness of our watermarking algorithm based on this fast size change scheme is proved.

III. CORRECTNESS OF THE WATERMARKING ALGORITHM

Suppose we add a watermark Ŵ to B̂. Due to the linearity of the DCT, the following equation is easily derived from (1):

    B̂ + Ŵ = [TL, TR] [T4^t (B̂1 + Ŵ1) T4, T4^t (B̂2 + Ŵ2) T4; T4^t (B̂3 + Ŵ3) T4, T4^t (B̂4 + Ŵ4) T4] [TL, TR]^t
           = [TL, TR] [T4^t B̂w1 T4, T4^t B̂w2 T4; T4^t B̂w3 T4, T4^t B̂w4 T4] [TL, TR]^t        (5)

where Ŵi (i ∈ {1, 2, 3, 4}) denotes the change of B̂i caused by embedding and B̂wi is defined as B̂wi := B̂i + Ŵi. Observing (5) from left to right, we find that the watermark added to the downsampled video frame can propagate to the original video frame by upsampling.
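The downsampling formula (2)-(3) and the upsampling formula (4) can be checked numerically. The sketch below builds the operator matrices with plain Python lists, assuming the standard orthonormal DCT-II convention for T8 and T4, and is a toy illustration rather than the paper's implementation.

```python
import math

def dct_mat(n):
    # Orthonormal DCT-II operator matrix T (n x n): X = T x T^t for an n x n block x.
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos(math.pi * (2 * i + 1) * k / (2.0 * n))
             for i in range(n)] for k in range(n)]

def mm(A, B):
    # Plain-list matrix product.
    Bt = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in Bt] for row in A]

def tr(A):
    return [list(col) for col in zip(*A)]

def add(A, B):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(A, B)]

def sub(A, B):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(A, B)]

T8, T4 = dct_mat(8), dct_mat(4)
TL = [row[:4] for row in T8]            # first four columns of T8
TR = [row[4:] for row in T8]            # last four columns of T8
P, Q = mm(TL, tr(T4)), mm(TR, tr(T4))   # TL*T4^t = C + D, TR*T4^t = C - D, eq. (3)
C = [[(p + q) / 2.0 for p, q in zip(r, s)] for r, s in zip(P, Q)]
D = [[(p - q) / 2.0 for p, q in zip(r, s)] for r, s in zip(P, Q)]

def fast_downsample(B1, B2, B3, B4):
    # Eqs. (2)-(3): DCT of the downsampled 8x8 block, obtained directly
    # from the four 4x4 low-pass blocks.
    X = add(mm(C, add(B1, B3)), mm(D, sub(B1, B3)))
    Y = add(mm(C, add(B2, B4)), mm(D, sub(B2, B4)))
    return add(mm(add(X, Y), tr(C)), mm(sub(X, Y), tr(D)))

def fast_upsample(Bhat):
    # Eq. (4): recover the four 4x4 low-pass blocks from the 8x8 DCT.
    return [mm(tr(U), mm(Bhat, V)) for U, V in ((P, P), (P, Q), (Q, P), (Q, Q))]
```

Because T8 and T4 are orthonormal, (TL T4^t)^t (TL T4^t) = I and the cross products vanish, so `fast_upsample(fast_downsample(...))` returns the four low-pass blocks exactly; this exact round trip is what lets the watermark of (5) survive the size change.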


While observing (5) from right to left, we can obtain B̂ + Ŵ for detection through downsampling, by collecting the B̂wi from the DCT blocks of the watermarked video frame. Equation (5) forms the basis of our proposed algorithm for watermark embedding and extraction.

IV. WATERMARK EMBEDDING

In this section we describe the embedding scheme in detail and present several optional methods to get a low-pass and downsampled version of a video frame. We use a bipolar watermark W ∈ {−1, 1} with zero mean and variance one. The watermark embedding procedure is as follows:
1. Partition the luminance component into 8×8 blocks and then apply the 2D-DCT to each block;
2. Compute the JND value for every coefficient in the top-left 4×4 corner of each block using the JND model described in [11];
3. Implement the downsampling procedure for each 16×16 macroblock MBl to obtain B̂l. The corresponding JND values go through the same process to obtain Jl;
4. Select the location (j, k) in B̂l to embed the watermark with the SS method:

    B̂l(j, k) = B̂l(j, k) + α Jl(j, k) Wl        (6)

where α is the global watermark strength parameter, which is set to 1 in our experiments, and Wl is the l-th watermark bit;
5. Solve (4) to obtain B̂l1, B̂l2, B̂l3 and B̂l4, and then replace the top-left 4×4 corners of the corresponding blocks in MBl;
6. Apply the 2D-IDCT to each block to get the watermarked frame.
Note that the corresponding JND values go through the same downsampling process in step 3, which ensures that the change of the original video frame caused by embedding in the downsampled video frame is not perceived by the Human Visual System (HVS). In step 4, we find that the first four AC coefficients of B̂l are suitable for embedding in order to achieve good robustness. The selection of the embedding position can be controlled by a private key to enhance the security of the watermark information.
There are several optional methods to obtain the low-pass version of a video frame. We can use the top-left 2×2 corner of each block as B̂lx (x ∈ {1, 2, 3, 4}) in step 2 to get a 4×4 matrix B̂l as the low-pass version of MBl. And if we only use the top-left 1×1 corner of each block (i.e. the DC coefficient) as B̂lx, our embedding scheme is similar to [5]. Note that the matrices C and D then need to be recalculated.

V. WATERMARK DETECTION

Watermark detection can be formulated as a hypothesis test to choose between

    H0: Yl = B̂l(j, k)
    H1: Yl = B̂l(j, k) + Wl Jl(j, k)        (7)

where B̂l(j, k) is the selected DCT coefficient of B̂l, Jl(j, k) is the corresponding JND value, and Wl is the watermark bit. Here we adapt the theoretical framework built in [2] for watermark detection. Assume that the AC coefficients of the DCT have a generalized Gaussian distribution, which can be written as

    pYl(X) = a e^(−|b(X − m)|^c)        (8)

where a and b are defined as

    a = b c / (2 Γ(1/c)),    b = (1/σ) √(Γ(3/c) / Γ(1/c))        (9)

and Γ(·) is the gamma function and σ is the standard deviation of the DCT coefficients. The optimal detector compares the likelihood ratio to a threshold λ, deciding H1 when

    pY|H1(Y | H1) / pY|H0(Y | H0) > λ        (10)

and H0 otherwise, where λ controls the tradeoff between missed detections and false alarms. Assuming that the watermarked DCT coefficients are statistically independent, substituting the joint probability density into (10) gives the test (decide H1 when the left side is larger)

    ∏(l=1..N) a e^(−|b(Yl − Wl Jl(j, k))|^c)  vs.  λ ∏(l=1..N) a e^(−|b Yl|^c)        (11)

where N is the number of selected DCT coefficients from the video. Algebraic simplification reduces this to the equivalent test, deciding H1 when

    Y = Σ(l=1..N) Yl Wl Jl(j, k) > σ² Γ(1/c) ln λ / (2 Γ(3/c)) + N J̄² / 2        (12)

where J̄² = (1/N) Σ(l=1..N) Jl(j, k)².
Till now we have derived a detection model similar to that of [2]. In order to lower the complexity of watermark detection, we set λ = 1 (i.e. PD + PF = 1, where PD denotes the probability of detection and PF denotes the probability of a false alarm) and have: decide H1 when

    Y = Σ(l=1..N) Yl Wl Jl(j, k) > N J̄² / 2        (13)

In our experiments, we employ (13) to detect the watermark in H.264/AVC-encoded video and in the enhancement layer of H.264/SVC-encoded video. When we extend the detection model to the base layer of H.264/SVC-encoded video, in order to get Jl we implement the upsampling procedure for each B̂l to obtain M̃Bl, whose four corresponding blocks are defined as B̃lx := [B̂lx, O; O, O], where the B̂lx (x ∈ {1, 2, 3, 4}) are obtained by solving (4) and O denotes a 4×4 zero matrix. Note that computing Jl from the received video sequence is not exactly accurate because of the lossy encoding and the watermark embedding process.
The watermark detection procedure is as follows:
1. Partition the luminance component into 8×8 blocks and then apply the 2D-DCT to each block;
2. If the frame is H.264/AVC-encoded or an SVC-encoded enhancement layer, go to step 4; if the frame is an SVC-encoded base layer, go to step 3;
3. Implement the upsampling procedure by solving (4) for each B̂l to obtain M̃Bl; go to step 5;
4. Implement the downsampling procedure for each 16×16 macroblock MBl to obtain B̂l; go to step 5;
5. Compute the JND value for every coefficient in the top-left 4×4 corner of MBl using the JND model described in [11];
6. Compute Jl by passing the corresponding JND values through the downsampling process;
7. Use (13) to determine whether the watermark signal exists.
In the next section, we give the experimental results and compare them with the similar algorithm published in [5].

VI. SIMULATION RESULTS

The proposed watermarking algorithm is implemented in the H.264/AVC reference software version JM10.0 and the SVC reference software version 9.19.10. The first 100 frames of four standard video sequences (CITY, CREW, HARBOUR, SOCCER) in 4CIF format (704×576) are used in our experiments. We opt for always selecting the second 8×8-DCT AC coefficient of B̂l in zig-zag order as the embedding location.
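Before turning to the results, the embedding rule (6) and the simplified detector (13) can be run on synthetic data; the host coefficient distribution and the JND range below are illustrative assumptions, not measurements from the test sequences.

```python
import random

def embed(coeffs, jnd, bits, alpha=1.0):
    """Eq. (6): spread-spectrum embedding, with the strength of each
    watermark bit (bits are +/-1) scaled by its JND value."""
    return [c + alpha * j * w for c, j, w in zip(coeffs, jnd, bits)]

def detect(received, jnd, bits):
    """Eq. (13) with lambda = 1: correlate the received coefficients with
    the JND-weighted watermark and compare Y against N * Jbar^2 / 2."""
    Y = sum(y * w * j for y, w, j in zip(received, bits, jnd))
    threshold = sum(j * j for j in jnd) / 2.0   # = N * Jbar^2 / 2
    return Y > threshold

random.seed(42)
N = 1584                                               # payload used in the paper
coeffs = [random.gauss(0.0, 5.0) for _ in range(N)]    # synthetic host AC coefficients
jnd = [random.uniform(1.0, 4.0) for _ in range(N)]     # assumed JND values
bits = [random.choice((-1, 1)) for _ in range(N)]      # bipolar watermark
marked = embed(coeffs, jnd, bits)
```

On watermarked data the statistic Y concentrates around the full sum of squared JND values, twice the threshold, while on unwatermarked data (or with the wrong key) it concentrates around zero; this gap is what makes the threshold in (13) adaptive to the JND values rather than a fixed constant as in [5].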

TABLE I
PERFORMANCE OF H.264/AVC COMPRESSION WITH ALL I-FRAMES

Sequence  Payload    PSNR    QP20 PD / PF     QP24 PD / PF     QP28 PD / PF     QP32 PD / PF
CITY      1584 bits  43.9dB  >0.999 / 0.010   >0.999 / 0.010   >0.999 / 0.012   >0.999 / 0.014
CREW      1584 bits  47.5dB  >0.999 / 0.016   >0.999 / 0.015   >0.999 / 0.018    0.998 / 0.020
HARBOUR   1584 bits  42.0dB  >0.999 / 0.020   >0.999 / 0.021   >0.999 / 0.028   >0.999 / 0.028
SOCCER    1584 bits  43.8dB  >0.999 / 0.022    0.999 / 0.023    0.998 / 0.025    0.997 / 0.025

Four different experiments are conducted. In the first experiment, we validate the robustness of the watermarking method against H.264/AVC compression, with each watermarked and unwatermarked video sequence consisting of all I-frames. We test 10 randomly generated watermark messages to obtain an estimate of the error rates (shown in TABLE I). In the second experiment, we compare the detection performance of our scheme with the algorithm in [5]. Fig. 2 shows the watermark detection performance for the CITY and HARBOUR sequences under QP = 32. All the cases have the same group of pictures (GOP) structure, IBBPBB. We can see that simply setting a threshold is no longer valid

for detection. The performance of our algorithm is shown in TABLE II, together with a comparison of the watermark payload and the decrease of Y-PSNR at QP = 28. The proposed algorithm shows better performance, mainly because of the adaptive change of the threshold. For H.264/SVC compression, we focus on spatial resolution scalability, which is the most important for watermarking. In the third experiment, we test each sequence of all I-frames 10 times with and without the watermark. Inter-layer prediction is used in a macroblock-adaptive way, and both layers are coded with the same QP. The simulation results show that the watermark can be reliably detected in the base layer (L0, shown in TABLE


TABLE II
PERFORMANCE OF H.264/AVC COMPRESSION WITH GOP STRUCTURE OF IBBPBB

           Number of Errors, I/P/B (Total 200)      Payload                      Decrease of Y-PSNR (QP28)
Sequence   QP24     QP28     QP32                   Our Scheme  Method in [5]    Our Scheme  Method in [5]
CITY       0/0/0    0/0/6    0/1/16                 1584 bits   792 bits         0.3371 dB   0.0615 dB
CREW       0/0/0    1/0/0    1/0/1                  1584 bits   792 bits         0.2490 dB   0.2332 dB
HARBOUR    0/0/0    1/0/0    4/0/1                  1584 bits   792 bits         0.5913 dB   0.1666 dB
SOCCER     0/0/0    1/0/0    11/0/1                 1584 bits   792 bits         0.4774 dB   0.1761 dB

[Fig. 2. Correlation detection using the method in [5]: correlation (%) versus frame index for the watermarked and unwatermarked CITY and HARBOUR sequences at QP32.]

TABLE III
PERFORMANCE OF H.264/SVC COMPRESSION WITH ALL I-FRAMES

           PD (L0)                          PF (L0)
Sequence   QP20   QP24   QP28   QP32       QP20   QP24   QP28   QP32
CITY       0.986  0.985  0.972  0.953      0.005  0.005  0.006  0.007
CREW       0.995  0.991  0.978  0.956      0.012  0.014  0.015  0.006
HARBOUR    0.996  0.991  0.982  0.980      0.015  0.016  0.018  0.019
SOCCER     0.981  0.976  0.959  0.951      0.015  0.015  0.015  0.017

TABLE IV
PERFORMANCE OF H.264/SVC COMPRESSION WITH GOP STRUCTURE OF IPPP

           Number of Errors (L0)
           I-frames (Total 50)              P-frames (Total 150)
Sequence   QP20   QP24   QP28   QP32       QP20   QP24   QP28   QP32
CITY       0      0      1      1          1      5      10     17
CREW       0      0      0      0          1      3      0      3
HARBOUR    0      0      0      0          0      1      4      0
SOCCER     0      0      0      0          1      0      2      5

III). The detection results of the enhancement layer, not listed here, are only slightly different from those of the first experiment. Note that the probability of false alarm PF decreases in the base layer. For the detection of all I-frames, our algorithm works as well as that in [9], which only embeds the watermark in intra-coded blocks, while we can also detect the watermark from H.264/SVC-encoded P-frames with
high probability. Simulation results of the fourth experiment in TABLE IV, with the GOP structure of IPPP, show this clearly.

VII. CONCLUSION

Many robust spatial-domain watermarking methods cannot be applied to H.264/SVC-encoded video mainly
because the downsampling of the original-size video has a fatal impact on the embedded watermark signal. In this paper, a fast image size change method is used to obtain a low-pass, downsized version of the video frame for embedding, which protects the watermark signal from being damaged by downsampling during SVC encoding. The JND model is also exploited to dynamically adjust the embedding strength to preserve perceptual quality. We derive an adaptive detection model based on a likelihood ratio test and obtain good detection performance. The watermark can be detected with high probability not only from the H.264/AVC-encoded I-frames, P-frames and B-frames but also from the enhancement layer and base layer of H.264/SVC-encoded I-frames and P-frames.

ACKNOWLEDGMENT

This work was supported by the Natural Science Foundation of China (60803147), the New Teacher Program Foundation (200802131023), the Fundamental Research Funds for the Central Universities (HIT.NSRIF.2009068), the Development Program for Outstanding Young Teachers in Harbin Institute of Technology (HITQNJS.2008.048) and the Major State Basic Research Development Program of China (973 Program) (2009CB320905).

REFERENCES
[1] I.J. Cox, M. Miller, J. Bloom, J. Fridrich and T. Kalker, Digital Watermarking and Steganography, Morgan Kaufmann, 2007.
[2] M. Noorkami and R.M. Mersereau, "A framework for robust watermarking of H.264-encoded video with controllable detection performance," IEEE Transactions on Information Forensics and Security, vol. 2, no. 1, pp. 14-23, Mar 2007.
[3] X. Gong and H.M. Lu, "Towards fast and robust watermarking scheme for H.264 video," Proceedings of ISM08, IEEE, Berkeley, CA, USA, pp. 649-653, Dec 2008.
[4] D.W. Xu, R.D. Wang and J.C. Wang, "Blind digital watermarking of low bit-rate advanced H.264/AVC compressed video," IWDW 2009, Guildford, UK, vol. 5703, pp. 96-109, 2009.
[5] S.H. Liu, F. Shi, J.G. Wang and S.P. Zhang, "An improved spatial spread-spectrum video watermarking," International Conference on Intelligent Computation Technology and Automation 2010, Changsha, China, pp. 587-590, May 2010.
[6] T.H. Chen, S.H. Liu, H.X. Yao and W. Gao, "Spatial video watermarking based on stability of DC coefficients," ICMLC05, vol. 9, pp. 5273-5278, Aug 2005.

[7] H.Y. Huang, C.H. Yang and W.H. Hsu, "A video watermarking technique based on pseudo-3-D DCT and quantization index modulation," IEEE Transactions on Information Forensics and Security, vol. 5, no. 4, pp. 625-637, Dec 2010.
[8] R.V. Caenegem, A. Dooms, J. Barbarien and P. Schelkens, "Design of an H.264/SVC resilient watermarking scheme," Proceedings of SPIE, Multimedia on Mobile Devices 2010, vol. 7542, SPIE, San Jose, CA, USA, Jan 2010.
[9] P. Meerwald and A. Uhl, "Robust watermarking of H.264/SVC-encoded video: quality and resolution scalability," Proceedings of IWDW 2010, Seoul, Korea, Oct 2010.
[10] R. Dugad and N. Ahuja, "A fast scheme for image size change in the compressed domain," IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, no. 4, pp. 461-474, Apr 2001.
[11] A.B. Watson, "DCT quantization matrices visually optimized for individual images," Proc. SPIE Int. Conf. Human Vision, Visual Processing and Digital Display, San Jose, CA, vol. 1913, pp. 202-216, Feb 1993.

Cheng Wang was born in Hebei Province, China, in March 1985. He received his master's degree from the School of Computer Science and Technology, Harbin Institute of Technology. His research interest is data hiding in video.

Shaohui Liu was born in Hunan Province, China, in January 1977. He received his Ph.D. in Computer Application Technology from the School of Computer Science and Technology, Harbin Institute of Technology. His research interests include multimedia security, image processing, video coding, and video surveillance and tracking. He is a lecturer in the School of Computer Science and Technology, Harbin Institute of Technology.


Blocking Contourlet Transform: An Improvement of Contourlet Transform and Its Application to Image Retrieval
Jian Wu
The Institute of Intelligent Information Processing and Application Soochow University, Suzhou 215006, China Email: szjianwu@163.com

Zhiming Cui, Pengpeng Zhao, Jianming Chen


The Institute of Intelligent Information Processing and Application Soochow University, Suzhou 215006, China Email: szzmcui@suda.edu.cn

Abstract: Contourlet transform is an effective solution to two-or-higher-dimensional singularity and has good directionality and anisotropy. To address its weak ability to describe the spatial distribution of object edge information, this paper proposes a new image retrieval algorithm based on Contourlet transform, which blocks the indexed image and decomposes each sub-block image using Contourlet transform. First, weighted processing is carried out on the sub-band data of each sub-block image: features with high classification ability are extracted from the high- and low-frequency sub-band data, and those features are given greater weight. Then, according to the energy of each sub-block image, greater weight is given to sub-block images with strong texture characteristics. Finally, images are retrieved using the weighted Euclidean distance between two image feature vectors as the image similarity. Experimental results show that our algorithm has good retrieval performance.

Index Terms: edge-spatial distribution; Contourlet transform; image blocking; image retrieval

I. INTRODUCTION

At present, most image retrieval algorithms use the low-level characteristics of images to describe them, such as color, texture, appearance, etc. The main purpose of shoe image retrieval is to retrieve and return the shoes whose styles people are interested in. As shoe edge information is abundant, the texture features of shoes should be given more consideration. Texture features are usually obtained by statistical methods, structural methods, model methods and frequency-domain methods, including the co-occurrence matrix, Markov random field models, the wavelet transform, etc. [1]. In 2002, building on wavelet multi-scale analysis, Do and Vetterli put forward the Contourlet transform [2], a new non-adaptive, directional, multi-scale analysis that can achieve decomposition in any direction and at any scale. It describes the contours and directional texture information in images well, which makes up

for the shortcomings of the wavelet transform. Contourlet transform is a multi-resolution, local and directional method of image representation. It has unique advantages when used to express small directional segments and contours [3]. Contourlet transform is currently used in image segmentation [4,5], image denoising [6,7], image fusion [8,9] and other applications [10], but less in image retrieval. Literature [11] studied the use of Contourlet transform in image retrieval and proposed extracting feature quantities with high classification ability from the high- and low-frequency sub-band data, a great improvement over the traditional Contourlet transform. However, because the spatial distribution of directional texture information in the image is not considered, its retrieval effect is not good. In this paper, by blocking the searched image and decomposing each sub-block, we give greater weight to the sub-blocks with stronger texture features based on the energy of each sub-block. While considering the local characteristics of the image, we also take its overall characteristics into account. Experimental results show that the algorithm has good retrieval performance.

II. CONTOURLET TRANSFORM

Contourlet transform is also called the Pyramid Directional Filter Bank (PDFB). To achieve the Contourlet decomposition, two steps need to be completed: Laplacian Pyramid (LP) decomposition and the Directional Filter Bank (DFB). The synthesis transform is the inverse of the decomposition process [12]. Contourlet transform first uses the LP filter to do multi-scale image decomposition, in order to capture the singular points in images. One LP decomposition divides the original image signal into its low-frequency component and the difference between the original signal
and the low-frequency sampled signal, that is, the high-frequency component. We then continue the decomposition on the low-frequency component and finally obtain the entire multi-resolution image. A DFB filter is used to perform multi-directional decomposition on each high-frequency signal obtained by LP decomposition. The Contourlet transformation process is shown in Figure 1.
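The LP step just described (split the signal into a low-frequency approximation plus a high-frequency residual, then recurse on the low band) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the binomial smoothing kernel, decimation factor of 2 and function names are our assumptions:

```python
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # separable binomial low-pass

def smooth(img):
    """Separable low-pass filtering (our stand-in for the LP analysis filter)."""
    tmp = np.apply_along_axis(lambda r: np.convolve(r, KERNEL, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, KERNEL, mode="same"), 0, tmp)

def lp_decompose(image, levels=3):
    """Each LP level splits the signal into a low-frequency approximation
    (smooth + downsample) and a high-frequency residual, then recurses on
    the low band."""
    low, highs = image.astype(float), []
    for _ in range(levels):
        coarse = smooth(low)[::2, ::2]     # low-frequency component
        up = np.zeros_like(low)
        up[::2, ::2] = coarse
        up = 4 * smooth(up)                # interpolate the coarse band back
        highs.append(low - up)             # high-frequency component
        low = coarse
    return low, highs

low, highs = lp_decompose(np.random.rand(64, 64), levels=3)
print(low.shape, [h.shape for h in highs])  # (8, 8) [(64, 64), (32, 32), (16, 16)]
```

Each high-frequency band produced here is what the DFB then splits into directional sub-bands.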

Figure 2 is a three-level, eight-direction Contourlet transform rendering of an athletic shoe.

[Figure omitted: schematic showing the image split into a low-frequency sub-band (multiresolution) and directional sub-bands (multi-direction).]

Figure 1. Contourlet transform structure schematic.

The Contourlet transform can be extended from discrete space to continuous functions in the square-integrable space $L^2(\mathbb{R}^2)$. Just like the wavelet decomposition, the Contourlet transform in the continuous domain decomposes the whole space $L^2(\mathbb{R}^2)$ into a multi-scale, multi-directional sub-space sequence by the use of iterated filter banks. That is,

$$L^2(\mathbb{R}^2) = V_{j_0} \oplus \Big( \bigoplus_{j \le j_0} \bigoplus_{k=0}^{2^{l_j}-1} W_{j,k} \Big) \qquad (1)$$

Figure 2. All sub-bands after Contourlet transform

In (1), $\oplus$ denotes an orthogonal summation; the subspace $V_{j_0}$ is the approximation component at the coarsest scale, spanned by an orthogonal basis of scaled and translated scaling functions, and $W_{j,k}$ is the shift-invariant directional subspace. With $j$, $k$, $n$ used respectively as the scale, orientation and location parameters, the Contourlet function can be expressed as

$$\lambda_{j,k,n}(t) = \sum_{m \in \mathbb{Z}^2} g_k^{(l_j)}\big(m - S_k^{(l_j)} n\big)\, \mu_{j,m}(t) \qquad (2)$$

In (2), $g_k^{(l_j)}$ is the low-pass analysis filter, $\mu_{j,m}(t)$ is a frame defined in $\mathbb{R}^2$, and the over-sampling matrix $S_k^{(l_j)}$ is defined as

$$S_k^{(l_j)} = \begin{cases} \mathrm{diag}(2^{l_j-1},\, 2), & 0 \le k < 2^{l_j-1} \\ \mathrm{diag}(2,\, 2^{l_j-1}), & 2^{l_j-1} \le k < 2^{l_j} \end{cases} \qquad (3)$$

In (3), the parameter $k$ determines the direction of the DFB analysis. Mei Xue and others studied the effect of the Contourlet decomposition scale on the separation degree between different classes of targets, and found that the inter-class separation degree of the third-level sub-bands is the largest [13]. This paper uses a three-level LP decomposition, and the numbers of directional sub-bands per level are 4, 8 and 8 respectively.

III. FEATURE EXTRACTION ALGORITHM

A. Thinking of Image Blocking

Contourlet transform mainly takes the signal's global characteristics into account. In order to consider local as well as global image features, in this paper a blocking Contourlet transform algorithm is used to extract the sub-band coefficients of each sub-block of the image. The feature extraction algorithm is shown in Figure 3.
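The over-sampling matrix of equation (3) above can be checked with a short helper (an illustrative sketch; the function name is ours):

```python
import numpy as np

def oversampling_matrix(k, l):
    """Over-sampling matrix S_k^(l) of the DFB, per equation (3):
    diag(2^(l-1), 2) for the first half of the directions (0 <= k < 2^(l-1)),
    diag(2, 2^(l-1)) for the second half (2^(l-1) <= k < 2^l)."""
    if not 0 <= k < 2 ** l:
        raise ValueError("direction index k out of range")
    if k < 2 ** (l - 1):
        return np.diag([2 ** (l - 1), 2])
    return np.diag([2, 2 ** (l - 1)])

print(oversampling_matrix(1, 3))  # the matrix diag(4, 2)
print(oversampling_matrix(6, 3))  # the matrix diag(2, 4)
```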

[Figure omitted: schematic of the feature matrix, one row of sub-band statistics per sub-block, with N^2 rows in total.]

Figure 3. Feature extraction algorithm schematic based on blocking Contourlet transform

The procedure is as follows:

(1) Read the image.
(2) Divide the image into n x n blocks, obtaining n^2 sub-blocks:

image_strel{(i-1)*n + j} = imcrop(image, [(j-1)*width + 1, (i-1)*width + 1, width - 1, width - 1])   (4)

In formula (4), width is the side length of each block and imcrop is used to crop the image.
(3) Calculate each sub-block's Contourlet feature Character{i}. After blocking the image, apply the Contourlet transform to each sub-block, and then extract the mathematical characteristics of the directional sub-band coefficient distributions at every level in each sub-block as the feature vector.
(4) Give each sub-block a weight. After blocking, not every sub-block reflects the image texture information well, and their contributions to the description of the image's texture features differ. Effective sub-block weights can help effectively improve the retrieval precision.

B. High-frequency Sub-band Features

After blocking the image in Figure 2, we apply the Contourlet decomposition to it. Figure 4 shows the edge statistical information of three sub-band coefficient sets from levels 1, 2 and 3 of the upper-left sub-block of the image: (a) sub-band 1, layer 1; (b) sub-band 4, layer 2; (c) sub-band 7, layer 3. The abscissa represents the transform coefficients and the ordinate the frequency of the coefficients.

Figure 4. Statistical histogram of sub-band coefficients of every layer
As can be seen from the figure, the probability distribution of Contourlet coefficients has a very sharp peak at zero with long tails on both sides; as the number of layers grows, the values get closer to zero and the peak becomes more prominent. The remaining scales and directions of the image behave likewise. The distribution of the sub-band coefficients of an image's Contourlet transform is in line with the generalized Gaussian distribution. From the viewpoint of sparsity, the Contourlet transform can express the original image more sparsely. After applying the Contourlet transform to the image, sub-band coefficients in different directions and at different scales are obtained. The amplitudes of these coefficients characterize the energy of the image in different directions and at different scales. The formulas for calculating the mean $\mu_{kl}$ and standard deviation $\sigma_{kl}$ of the Contourlet decomposition coefficients of the sub-bands in every direction are as follows:

$$\mu_{kl} = \frac{1}{M \times N} \sum_{i=1}^{M} \sum_{j=1}^{N} |w_{kl}(i,j)| \qquad (5)$$

$$\sigma_{kl} = \sqrt{\frac{1}{M \times N} \sum_{i=1}^{M} \sum_{j=1}^{N} \big(w_{kl}(i,j) - \mu_{kl}\big)^2} \qquad (6)$$

In (5) and (6), $M \times N$ is the size of the Contourlet decomposition sub-band, $\mu_{kl}$ is the mean of sub-band $l$ in layer $k$, and $\sigma_{kl}$ is the standard deviation of sub-band $l$ in layer $k$.

C. Low-frequency Sub-band Features

As the directional filters only consider the high-frequency components, the low-frequency components are missed. In order to use all the features, the GLCM is used to extract low-frequency characteristics. Assume the original image is f(x, y), x = 1, 2, ..., M, y = 1, 2, ..., N; the size of the image is M x N and the grayscale is 256. In order to reduce the amount of computation, it is necessary to compress the image's grayscale and quantize the gray level to 16. Four independent directions (0°, 45°, 90°, 135°) are selected to calculate the second-order statistical
features of the image. In 1979, Haralick proposed 14 representative texture features derived from the GLCM, which include the four texture features used here:

a) Energy

$$Q_1 = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} p(i,j)^2 \qquad (7)$$

b) Entropy

$$Q_2 = -\sum_{i=0}^{n-1} \sum_{j=0}^{n-1} p(i,j) \ln p(i,j) \qquad (8)$$

c) Moment of inertia

$$Q_3 = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} (i-j)^2\, p(i,j) \qquad (9)$$

d) Correlation

$$Q_4 = \frac{\sum_{i=0}^{n-1} \sum_{j=0}^{n-1} i\, j\, p(i,j) - \mu_x \mu_y}{\sigma_x \sigma_y} \qquad (10)$$

In the formulas above,

$$\mu_x = \sum_{i=0}^{n-1} i \sum_{j=0}^{n-1} p(i,j), \qquad \mu_y = \sum_{j=0}^{n-1} j \sum_{i=0}^{n-1} p(i,j),$$

$$\sigma_x^2 = \sum_{i=0}^{n-1} (i-\mu_x)^2 \sum_{j=0}^{n-1} p(i,j), \qquad \sigma_y^2 = \sum_{j=0}^{n-1} (j-\mu_y)^2 \sum_{i=0}^{n-1} p(i,j).$$

After calculating the GLCM features, their means and standard deviations are calculated as the final feature vector.

D. Sub-block Weighting

Since the amplitudes of Contourlet coefficients reflect directional changes at different scales of the image, the energy reflects the image's texture features in a particular direction and at a particular scale: the smaller a sub-band's energy, the weaker the texture features. So the energy of each sub-block can be used to reflect its contribution to the search. The energy of the Contourlet transform can be approximated as:

$$E_k = \sum_{m=1}^{C_m} \frac{1}{n_{km}} \sum_{i} \sum_{j} \big[y_{km}(i,j)\big]^2 \qquad (11)$$

Here $E_k$ ($1 \le k \le N^2$) represents the energy sum of the $k$-th sub-block, $C_m$ represents the number of sub-bands in the $k$-th sub-block, $n_{km}$ represents the number of coefficients of the $m$-th sub-band in the $k$-th sub-block, and $y_{km}(i,j)$ represents the Contourlet coefficient at position $(i, j)$ of that sub-band. As the sub-block energy reflects the texture features, a sub-block with greater energy has stronger texture and should be given greater weight; otherwise, it should be given a smaller weight. Suppose the $k$-th sub-block has a weight of $\lambda_k$:

$$\lambda_k = \frac{E_k}{\sum_{k=1}^{N^2} E_k} \qquad (12)$$

IV. ALGORITHM IMPLEMENTATION

In this paper, a three-layer Contourlet transform is implemented, and the numbers of sub-bands per layer are 4, 8 and 8. This finally produces a 48-dimensional feature vector, which is used in image retrieval and expressed as:

$$character = [\mu_1, \mu_2, \ldots, \mu_{20}, \sigma_1, \sigma_2, \ldots, \sigma_{20}, q_1, q_2, \ldots, q_8] \qquad (13)$$

where $q_1, \ldots, q_8$ denote the means and standard deviations of the four GLCM features. The blocking Contourlet algorithm is used for feature extraction of every sub-block, yielding $N^2 \times 48$ features in total (one 48-dimensional row per sub-block).

As the statistics of the various features have different dimensions, in order to express the similarity better, we use a weighted Euclidean distance to measure the three types of features extracted in the previous section:

$$D(k) = \omega_1 \sum_{i=1}^{20} (x_i - m_i)^2 + \omega_2 \sum_{i=21}^{40} (x_i - m_i)^2 + \omega_3 \sum_{i=41}^{48} (x_i - m_i)^2 \qquad (14)$$

In the formula, $\omega_1$, $\omega_2$, $\omega_3$ represent the corresponding distinguishing weights, with $\omega_1 + \omega_2 + \omega_3 = 1$; $x_i$ represents the feature vector of the query image and $m_i$ the feature vector of the $m$-th image. $D(k)$ is the distance for the $k$-th sub-image; the similarity between two images can then be expressed as:

$$Dist = \sum_{k=1}^{N^2} \lambda_k D(k) \qquad (15)$$

V. EXPERIMENTAL RESULTS AND ANALYSIS

In order to test the performance of the algorithm proposed in this paper, we collected various kinds of shoes from several B2C websites, such as Letao.com, OkeyBuy.com and TaoXie.com. According to style, the shoes are divided into six classes: high-heeled shoes, boots, sport shoes, flats, slippers and sandals. One hundred images from each class are
extracted respectively, and the 600 test images make up the test image library. One of the important steps in the algorithm is dividing the image into several blocks. The retrieval performance of the block-based Contourlet algorithm is better than that of the non-block version, but different ways of dividing the image lead to different results. As the number of blocks increases, it becomes difficult to capture stable statistical properties of each sub-block, which also influences the final retrieval results. Three block-dividing methods are compared in the experiment: 2x2, 3x3 and 4x4. Ten images are randomly selected from each class, giving 60 queries. Table 1 shows the search results for the six kinds of shoes when the block numbers are 2x2, 3x3 and 4x4 and the number of returned images is 50.
TABLE I. RETURNED RESULTS WITH DIFFERENT DIVIDING METHODS

No.  Type of shoes       2x2   3x3   4x4
1    high-heeled shoes   58%   76%   70%
2    boots               99%   90%   85%
3    sport shoes         58%   74%   88%
4    flats               53%   73%   44%
5    slippers            89%   90%   85%
6    sandals             56%   58%   36%

Figure 5. The performance of the two algorithms: (a) comparison of precision rate as the number of returned images changes; (b) comparison of the relationship between precision rate and recall rate.

Table 1 shows that the different dividing methods differ in adaptability. As a compromise, the 3x3 method is chosen in the experiment. In this paper, precision rate and recall rate are used as the evaluation criteria for similarity retrieval. Under the same retrieval conditions, the higher the precision and recall rates, the better the corresponding algorithm. The algorithm in this paper is compared with the algorithm from literature [11]. Without loss of generality, 20 images are randomly selected from each class as query examples for each experiment, giving 120 queries. Figure 5 shows the retrieval results of the two algorithms.
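The evaluation rules can be made concrete with a tiny helper (an illustrative sketch; the ids and class size below are made up):

```python
def precision_recall(returned, relevant):
    """Precision and recall for one query: `returned` is the list of
    retrieved image ids, `relevant` the set of ids in the query's class."""
    hits = sum(1 for r in returned if r in relevant)
    return hits / len(returned), hits / len(relevant)

# toy example: 50 images returned, 40 of them correct, class size 100
returned = list(range(40)) + list(range(1000, 1010))
relevant = set(range(100))
print(precision_recall(returned, relevant))  # (0.8, 0.4)
```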

From Figure 5, we can see that the average precision rate of both algorithms decreases as the number of returned images increases, and that the algorithm proposed in this paper is much better than the algorithm proposed in literature [11]. Literature [11] partly solves the problem that the traditional Contourlet transform did not make full use of the coefficients of the high- and low-frequency sub-bands, but it has the drawback that the spatial distribution of directional texture information in the image is poorly represented, and it lacks the ability to describe the spatial distribution characteristics of the target's edge information. In order to visually analyze the retrieval results of the two algorithms, sport shoes and slippers are selected for testing. Figure 6 is a test for sport shoes and Figure 7 is a test for slippers. Figures 6 and 7 show the retrieval results for the same query instance when the two algorithms return 29 images. In each figure, the image in the top-left corner is the query instance and the remaining 29 are the results; the similarity between the query instance and the results decreases from left to right and top to bottom. Figures 6 and 7 also show that the results of the algorithm from literature [11] are not ideal: the correct images returned are just 20 and 22 respectively, for precision rates of 69% and 76%. In contrast, the numbers for the algorithm proposed in this paper are both 28, and the precision rate reaches 97%, much higher than literature [11], with retrieval results more consistent with human visual perception. Thus, this algorithm has higher performance.



To illustrate the retrieval results for the different classes using the two algorithms, we randomly select 30 images from each class as query images and perform 180 retrievals on the library. The average precision rate for each class is used to evaluate the retrieval results. Figure 8 shows the average precision rate of the six classes using the two algorithms when the number of returned images is 50.

(a) With the algorithm from literature [11]

Figure 8. The comparison of the precision rates of the different classes with the two algorithms

(b) With the algorithm proposed in the paper Figure 6. The retrieval results for the same query instance (sport shoes)

From Figure 8, we can see that the two algorithms differ distinctly in precision rate across the different classes of images. The algorithm in this paper is better than the algorithm proposed in literature [11], especially for the four classes No. 1, No. 2, No. 3 and No. 5, where it is obviously better.

VI. CONCLUSION

In order to overcome the weak ability to describe the spatial distribution of image edge information, a new image retrieval algorithm based on the Contourlet transform is proposed. The algorithm divides the image into sub-blocks and decomposes each sub-block with the Contourlet transform. First, the sub-band data of each sub-block are weighted: features with strong classification ability are extracted from the high- and low-frequency sub-band data and assigned a bigger weight. Second, according to the energy of each sub-block, blocks with clear texture features are given larger weights. Finally, we use the weighted Euclidean distance between feature vectors to measure the similarity of images. Experimental results show that the algorithm has good retrieval performance. Further research will be carried out along two lines. On one hand, the Contourlet transform itself is not stable, which limits its application in image retrieval. On the other hand, the feature index structure can be improved, so that retrieval in large image databases can be accelerated.

(a) With the algorithm from literature [11]

(b) With the algorithm proposed in this paper Figure 7. The retrieval results for the same query instance (slippers)


ACKNOWLEDGEMENT

This research was partially supported by the Natural Science Foundation of China under grants No. 60970015, 61003054 and 61170020, the 2009 Special Guiding Fund Project of Jiangsu Modern Service Industry (Software Industry) under grant No. [2009]332-64, the Program for Postgraduate Research Innovation in Universities of Jiangsu Province in 2011 under grant No. CXLX11_0072, the Applied Basic Research Project (Industry) of Suzhou City under grants No. SYJG0927 and SYG201032, and the Beforehand Research Foundation of Soochow University.

REFERENCES
[1] G. Rafiee, S.S. Dlay and W.L. Woo. A Review of Content-based Image Retrieval[J]. The Seventh International Symposium on Communication Systems, Networks and Digital Signal Processing, Newcastle upon Tyne, United Kingdom, July 21-23, 2010, 775-779.
[2] Do M.N. and Vetterli M. Contourlets: a New Directional Multiresolution Image Representation[C]. The Thirty-Sixth Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, Nov. 3-6, 2002, 497-501.
[3] Do M.N. and Vetterli M. The Contourlet Transform: an Efficient Directional Multiresolution Image Representation[J]. IEEE Transactions on Image Processing, 2005, 14(12): 2091-2106.
[4] Zhong Hua, Jiao Li-cheng and Hou Peng. Retinal Vessel Segmentation Using Nonsubsampled Contourlet Transform[J]. Chinese Journal of Computers, 2011, 34(3): 574-582.
[5] Gang Liu, Xiao-Geng Liang and Jingguo Zhang. Contourlet Transform and Improved Fuzzy C-means Clustering Based Infrared Image Segmentation[J]. Systems Engineering and Electronics, 2011, 33(2): 443-448.
[6] Wu Xiao-yue, Guo Bao-long and Li Lei-da. A New Image Denoising Method Combining the Nonsubsampled Contourlet Transform and Adaptive Total Variation[J]. Journal of Electronics and Information Technology, 2010, 32(2): 360-365.
[7] Dai Wei, Yu Sheng-lin and Sun Shuan. Image De-noising Algorithm Using Adaptive Threshold Based on Contourlet Transform[J]. Acta Electronica Sinica, 2007, 35(10): 1939-1943.
[8] Li Guang-xin and Wang Ke. Color Image Fusion Algorithm Using the Contourlet Transform[J]. Acta Electronica Sinica, 2007, 35(1): 112-117.

[9] Yang Xiao-hui, Jia Jian and Jiao Li-cheng. Image Fusion Algorithm in Nonsubsampled Contourlet Domain Based on Activity Measure and Closed Loop Feedback[J]. Journal of Electronics and Information Technology, 2010, 32(2): 422-426.
[10] Jian Wu, Zhiming Cui, Pengpeng Zhao and Jianming Chen. Research on Vehicle Tracking Algorithm Using Contourlet Transform[C]. The 14th International IEEE Annual Conference on Intelligent Transportation Systems, Washington DC, USA, Oct. 5-7, 2011, 1267-1272.
[11] Huang Chuanbo, Shao Jie, Wan Minghua and Jin Zhong. Image Retrieval Using Contourlet Transform[J]. Computer Engineering and Applications, 2009, 45(3): 24-27.
[12] Lin Li-yu, Zhang You-yan, Sun Tao, et al. Contourlet Transform: Image Processing Applications[M]. Beijing: Science Press, 2008: 28-42.
[13] Mei Xue and Xia Liang-zheng. Object Invariant Feature Extraction in Contourlet Field[J]. Computer Science, 2010, 37(11): 275-277.

Jian Wu was born in Nantong on 29 April 1979 and received his master's degree in computer application technology from Soochow University, Suzhou, China, in 2004. His main research directions are computer vision, image processing and pattern recognition. He has worked as a teacher at the same university since his master's graduation and is now pursuing a doctoral degree. He was awarded the Third Prize of the 2007 Suzhou City Science and Technology Progress Award and was a 2008-2009 Soochow University Graduate Scholarship Model.

Zhiming Cui was born in Shanghai in July 1961. Professor, PhD candidate supervisor. His main research directions are deep web and video mining.

Pengpeng Zhao was born in Suzhou in March 1980. Associate professor. His main research directions are deep web, web data extraction and mining.

Jianming Chen was born in Suzhou in February 1960. Associate professor, master supervisor. His main research directions are intelligent information processing and software engineering.


Speech Recognition Approach Based on Speech Feature Clustering and HMM


XinGuang Li
Guangdong University of Foreign Studies, Guangzhou, 510006, China Email: lxggu@163.com

MinFeng Yao and JiaNeng Yang


Guangdong University of Foreign Studies, Guangzhou, 510006, China Email: {hammersons, tizziyang}@gmail.com

Abstract: The paper presents a Segment-Mean method for reducing the dimension of the speech feature parameters. A K-Means function is used to group the dimension-reduced speech feature parameters, and the speech samples are then classified into different clusters according to their features. A cross-group training algorithm is proposed for speech feature parameter clustering, which improves the accuracy of the clustering function. When recognizing speech, the system uses a cross-group HMM model algorithm for pattern matching, which reduces the computation by more than 50% without reducing the recognition rate of the small-vocabulary speech recognition system.

Index Terms: HMM, Speech Feature Parameters, Segment-Mean, K-Means Clustering, Model Cross-group

I. INTRODUCTION

Breakthrough progress has been made in studies of speech recognition techniques in recent years, and these techniques have been applied for business purposes. A regular speech recognition system can, in general, be divided into four parts: speech pre-processing, feature extraction, speech recognition and semantic understanding. Speech pre-processing aims at noise elimination and endpoint detection with signal processing technology, while feature extraction is designed to extract the feature parameters of the input speech. Speech recognition refers to the process of finding the distance or probability between the feature vector sequence of the unknown speech and each speech sample, and hence the most analogous type, when matching the unknown speech features against the different training patterns. Finally, semantic understanding means giving a grammatical and semantic analysis of the result so as to obtain a proper one which conforms to grammatical rules [1]. With the development of speech recognition techniques, Dynamic Time Warping (DTW), Vector Quantization (VQ), the Hidden Markov Model (HMM) and Artificial Neural Networks (ANN) have successively and successfully been applied to speech recognition systems, which has substantially promoted their development [2]. Researchers continuously try to improve these algorithms and have made many innovative achievements.

Povey et al. describe an acoustic modeling approach in which all phonetic states share a common Gaussian Mixture Model structure, and the means and mixture weights vary in a subspace of the total parameter space. This style of acoustic model allows for a much more compact representation and gives better results than a conventional modeling approach, particularly with smaller amounts of training data [3]. Feng HongWei and Xue Lei introduced a new method for speech recognition which combined the HMM and algebraic neural networks; simulation results show that the algorithm is better than the traditional algorithm in convergence speed, robustness and recognition rate [4]. Almost all present-day large vocabulary continuous speech recognition (LVCSR) systems are based on the HMM [5]. The Hidden Markov Model works very well in time-series signal processing because of its double stochastic process [6]. This statistical model has been used extensively and successfully in speech recognition systems. In this paper we study a speech recognition system based on the integration of an HMM and a new speech feature clustering model. Compared with a speech recognition system based on a single HMM, the new hybrid model effectively improves the recognition speed. It provides a new reference method for small devices to run speech recognition applications that meet the requirements of real-time systems.

II. SPEECH FEATURE PARAMETERS

The system digitizes analog signals according to the Nyquist sampling criterion; the sampling frequency is set to 8 kHz. After pre-processing the input speech with several algorithms, the Mel-frequency cepstral coefficients (MFCC) are calculated.

A. Pre-processing the Input Speech

Due to the pronunciation mechanism, the high-frequency components of a speech signal are attenuated. In this paper, a pre-emphasis digital filter is used to enhance the high-frequency components.
The filter flattens the spectrum of the signal and makes it possible to calculate the spectrum with the same SNR throughout the band. The filter is defined as

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2269-2275


JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012

H(z) = 1 − a z^(−1),  0.9 ≤ a < 1

(1)

where a is the coefficient of the pre-emphasis filter, generally 0.92 or 0.94. In this paper, we use dual-threshold comparison to detect the endpoints of the input speech. It combines the short-term energy and short-term zero-crossing rate of the signal, which makes the detection more accurate. The dual-threshold comparison can also effectively exclude silent segments and noise, enhancing system performance in real-time speech signal processing. B. MFCC Calculation The most widely applied speech feature coefficients so far are linear predictive cepstral coefficients (LPCC) and MFCC. In this paper, we use MFCC as the feature parameters. The MFCC analysis consists of four steps. First, perform a fast Fourier transform on the input speech signal.
X[k] = Σ_{n=0}^{N−1} x[n] e^(−j2πnk/N),  k = 0, 1, 2, ..., N−1   (2)

where x(n) (n = 0, 1, 2, ..., N−1) is one frame of the discrete speech signal, N is the frame length, and X[k] is the resulting complex sequence of N points. We then take the modulus of X[k] to obtain the signal amplitude spectrum |X[k]|. Next, map the frequency axis to the Mel scale; the Mel frequency f_mel is computed from the frequency f as

f_mel(f) = 2595 lg(1 + f / 700 Hz)   (3)

Then calculate the amplitude spectrum output of each triangular filter applied to the signal:

F(l) = Σ_{k=f_o(l)}^{f_h(l)} w_l(k) |X[k]|,  l = 1, 2, ..., L   (4)

where

w_l(k) = (k − f_o(l)) / (f_c(l) − f_o(l)),  f_o(l) ≤ k ≤ f_c(l)
w_l(k) = (f_h(l) − k) / (f_h(l) − f_c(l)),  f_c(l) ≤ k ≤ f_h(l)   (5)

f_o(l) = (N / f_s) o(l),  f_h(l) = (N / f_s) h(l),  f_c(l) = (N / f_s) c(l)   (6)

Here F(l) denotes the filtered signal output, w_l(k) are the coefficients of the corresponding filter, and o(l), h(l) and c(l) represent the lower, higher and center frequencies of the corresponding filter on the actual frequency axis; f_s denotes the sampling rate and L the number of filters. Finally, perform the discrete cosine transform on the logarithm of the filter-bank energies and append the first-order differentials. We then obtain the expression of the MFCC:

M(i) = sqrt(2/L) Σ_{l=1}^{L} log F(l) cos[(l − 1/2) iπ / L],  i = 1, 2, ..., Q   (7)

Here M(i) denotes the MFCC parameters [7][8] and Q denotes the order of the MFCC; in this paper, Q is set to 24.

III. SPEECH RECOGNITION SYSTEM BASED ON SPEECH FEATURE CLUSTERING AND HMM

In a traditional HMM isolated-word speech recognition system, the classical Viterbi algorithm, which uses forward iteration, solves the hidden Markov model decoding problem perfectly. However, the required computation is still substantial when the Viterbi algorithm is used in a large vocabulary speech recognition system. Suppose a large vocabulary system recognizes 500 words and we establish a model for each word. Assuming the models have the same number of states and are connected into one big model, the state number of the big model is 500N. As the Viterbi algorithm needs computation on the order of N²T (T denotes the number of frames of input speech), the computation in the big model increases by roughly three orders of magnitude compared with the original model. Moreover, experiments find that the computing time is mainly spent calculating the Gaussian mixture probability distributions of the observation sequence [9]. In order to reduce the computational complexity, this paper presents a hybrid model based on speech feature clustering and HMM, whose validity is confirmed experimentally.

Figure 1. Structure of speech recognition system based on speech feature clustering and HMM.
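The Section II front end (pre-emphasis, Eq. (1), and the MFCC chain of Eqs. (2)-(7)) can be sketched in NumPy roughly as follows. This is a minimal sketch: the frame size, the number of mel filters, and the function names are illustrative assumptions, not values fixed by the paper.

```python
import numpy as np

def pre_emphasis(x, a=0.94):
    # Eq. (1): H(z) = 1 - a*z^(-1)  ->  y[n] = x[n] - a*x[n-1]
    return np.append(x[0], x[1:] - a * x[:-1])

def hz_to_mel(f):
    # Eq. (3): f_mel = 2595 * lg(1 + f/700)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_frame(frame, fs=8000, n_filters=26, q=24):
    """MFCC of one frame: |FFT| (Eq. 2) -> triangular mel filter
    bank (Eqs. 4-6) -> log -> DCT (Eq. 7), keeping q coefficients."""
    n = len(frame)
    spec = np.abs(np.fft.rfft(frame))
    # filter edge frequencies, equally spaced on the mel axis
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(fs / 2.0), n_filters + 2))
    bins = np.floor((n + 1) * edges / fs).astype(int)   # Eq. (6): bin indices
    fbank = np.zeros(n_filters)
    for l in range(n_filters):
        lo, c, hi = bins[l], bins[l + 1], bins[l + 2]
        w = np.zeros(len(spec))                          # Eq. (5): weights
        w[lo:c + 1] = np.linspace(0.0, 1.0, c - lo + 1)  # rising edge
        w[c:hi + 1] = np.linspace(1.0, 0.0, hi - c + 1)  # falling edge
        fbank[l] = np.dot(w, spec)                       # Eq. (4)
    logf = np.log(fbank + 1e-10)
    i = np.arange(1, q + 1)[:, None]
    l_idx = np.arange(1, n_filters + 1)[None, :]
    # Eq. (7): DCT of the log filter-bank energies
    return np.sqrt(2.0 / n_filters) * (np.cos(np.pi * i * (l_idx - 0.5)
                                              / n_filters) * logf).sum(axis=1)
```

A complete front end would additionally window each frame and append the first-order differentials mentioned above.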


Fig. 1 shows that, when recognizing, the system first calculates the features of the input speech and determines its cluster group k. While Viterbi decoding is in progress, only the HMM parameters in group k are evaluated. With an appropriate cluster group, the system saves a considerable amount of computation.

A. Speech Feature Dimension Reduction and the Cluster Cross-grouping Model

The speech feature parameters must be structured before clustering. This paper proposes a segment-mean algorithm to reduce the dimension of the speech feature parameters so that they keep the same order and frame length. After the dimension reduction, the proposed cluster cross-grouping model effectively improves the accuracy of speech feature clustering.

1) Segment-mean Algorithm

In studies of clustering algorithms in speech recognition, most of the literature treats clustering as a means of pattern classification [10][11]. In this paper, the improved speech feature clustering model clusters words with similar acoustic characteristics into the same group; when recognizing, only the HMM parameters in the selected group are calculated. A number of experiments show that the grouping accuracy obtained by directly applying the traditional K-means clustering algorithm to the speech features is not encouraging.

The segment-mean algorithm fragments the speech feature parameters into segments of the same dimension. Define the speech feature parameters as S(K, J), where K denotes the order of the MFCC parameters and J indexes the fragmented frames, and let T be the number of frames before fragmentation. Fragmenting the speech feature parameters into N segments gives

M(i) = S(K, J),  J = [T(i − 1)/N + 1], ..., [Ti/N]   (8)

where M(i) represents the i-th segment of the fragmented speech feature parameters. The value of N is set to the state number of the HMM. After fragmenting the speech feature parameters into equal segments, we continue to fragment each M(i) into M equal child segments (the value of M is set to the observation sequence number of the HMM); the child segments are computed as in (8). The mean of each child segment is denoted M(i)_k, k = 1, 2, ..., M. Merging all the child-segment means into one matrix gives the dimension-reduced speech feature parameters, denoted s'(K, T'); the size of s'(K, T') is MN × K.

Figure 2. Schematic diagram of the segment-mean algorithm.

The total numbers of parameters at each stage of Fig. 2 are shown in Table I. The segment-mean algorithm turns the size of the feature parameter matrix from T × K into MN × K; that is, it removes the frame length T from the matrix, so the reduced matrix keeps the same size regardless of T. The size of the feature parameter matrix is determined by K (the order of the speech feature parameters), N (the number of segments) and M (the number of child segments).

TABLE I. NUMBER OF PARAMETERS IN EACH STAGE

Stage                  Matrix size          Number of parameters
Original features      T × K                TK
N segments             N × (T/N × K)        TK
MN child segments      MN × (T/(MN) × K)    TK
Child-segment means    MN × K               MNK

2) Cluster Cross-grouping Model

This paper presents a new cluster cross-grouping model to enhance performance in speech feature clustering.

Figure 3. Schematic diagram of cluster cross-grouping.
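The segment-mean reduction of Section III-A-1 can be sketched as follows: a T × K feature matrix is cut into N segments of M child segments each, and each child segment is replaced by its mean, yielding a fixed MN × K matrix. The function name and the integer-split convention are this sketch's own assumptions.

```python
import numpy as np

def segment_mean(feat, n_seg, m_child):
    """Reduce a T x K feature matrix to an (n_seg * m_child) x K matrix
    by averaging n_seg segments of m_child child segments each."""
    t, k = feat.shape
    out = np.zeros((n_seg * m_child, k))
    for i in range(n_seg):
        # i-th segment, Eq. (8): frames [T*i/N, T*(i+1)/N)
        seg = feat[t * i // n_seg : t * (i + 1) // n_seg]
        s = len(seg)
        for j in range(m_child):
            child = seg[s * j // m_child : s * (j + 1) // m_child]
            # each child-segment mean removes the dependence on T
            out[i * m_child + j] = child.mean(axis=0) if len(child) else 0.0
    return out
```

Whatever the utterance length T, the output always has MN rows, which is what makes the reduced features directly comparable under K-means.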

As shown in Fig. 3, cluster cross-grouping consists of three steps:
(1) Cluster the features of the training speech samples with the K-means clustering algorithm.
(2) Calculate the distances between the training speech samples and the cluster centers with the dynamic time warping (DTW) algorithm. For each sample, the minimum distance determines its target group.
(3) Check whether the target group contains the training sample's word. If it does, the classification is correct; otherwise the word is added to the target group.

Set the number of cluster groups to K and the vocabulary size to N, and let S_k, k = 1, 2, ..., K, be the number of words in the k-th group. After the first clustering pass we have Σ_{k=1}^{K} S_k = N. Define the clustering coefficient of the cluster cross-grouping model as

η = (Σ_{k=1}^{K} S_k) / (KN)   (9)

It is easy to see that after the first clustering pass η = 1/K. The question is whether, after training the cluster groups a second time with the cross-grouping algorithm, Σ_{k=1}^{K} S_k tends to KN, that is, whether η tends to 1. If η tends to 1, the feature clusters have degraded to the ungrouped case.

The improved speech feature clustering model places words with similar acoustic characteristics in the same group. Combining this with the HMM parameters grouping model, we obtain the speech recognition system based on speech feature clustering and HMM, which greatly improves system efficiency.
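The cross-grouping bookkeeping and the clustering coefficient η of Eq. (9) can be sketched as follows. The data structures (dicts of word sets, a DTW-derived target-group map) are assumptions of this sketch, not the paper's implementation.

```python
def cross_group(clusters, target_of):
    """Cluster cross-grouping: `clusters` maps group index -> set of words
    from the first K-means pass; `target_of` maps (word, sample_id) pairs
    to the group whose center is nearest under DTW.  Any word whose target
    group does not already contain it is added to that group."""
    groups = {k: set(v) for k, v in clusters.items()}
    for (word, _sample), k in target_of.items():
        if word not in groups[k]:
            groups[k].add(word)          # cross-grouped word
    return groups

def clustering_coefficient(groups, n_vocab):
    # Eq. (9): eta = (sum of group sizes) / (K * N); eta -> 1 means the
    # grouping has degraded to "every word in every group" (ungrouped).
    k = len(groups)
    return sum(len(g) for g in groups.values()) / (k * n_vocab)
```

A small η (well below 1) after the second training pass means each group still holds only a fraction of the vocabulary, so grouped decoding remains cheaper than full decoding.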

B. System Design and Implementation

The system's improvable ratio is the key indicator for the improvement analysis. Experimental analysis of the improvable ratio, based on the pattern matching stage, shows that the proposed method is reasonable and effective.

1) System Improvable Analysis and HMM Parameters Grouping

There are two main stages when the system recognizes a word, as shown in Fig. 4: the pre-processing and MFCC calculation stage, and the pattern matching stage. The pattern matching stage takes up most of the recognition time. Table II shows the recognition time of ten different samples of the word "apple" in the system based on a single HMM. According to the statistics in Table II, pattern matching accounts for 96.4% of the total recognition time on average.

Figure 4. Speech recognition system based on single HMM.

TABLE II. RECOGNIZING TIME OF TEN DIFFERENT SAMPLES OF "APPLE"

id   Pre-processing (s)   Pattern matching (s)   Recognition (s)   Pattern matching / recognition time
1    0.03                 0.93                   0.96              96.4%
2    0.03                 0.77                   0.80              96.4%
3    0.03                 0.76                   0.79              96.5%
4    0.03                 0.92                   0.95              96.6%
5    0.03                 0.86                   0.89              96.6%
6    0.03                 0.75                   0.78              96.4%
7    0.03                 0.74                   0.77              96.5%
8    0.03                 0.71                   0.74              96.4%
9    0.03                 0.69                   0.72              96.4%
10   0.02                 0.46                   0.48              95.8%

As Fig. 5 shows, when the Viterbi algorithm performs decoding, all model parameters are involved in the computation. Assume the system vocabulary contains n words; then there are n sets of HMM parameters, and recognizing one word computes each output probability with the Viterbi algorithm against all n of them. Because each isolated word corresponds to a unique set of HMM parameters, the words in the feature cluster groups can be mapped to the corresponding HMM parameters. We therefore obtain the HMM parameters grouping model shown in Fig. 6.

Figure 5. Speech recognition system based on single HMM.

Figure 6. HMM parameters grouping model.

As the feature clustering algorithm groups well, the number of HMM parameter sets in a cluster group is always less than or equal to the vocabulary size. The improved speech feature clustering model also ensures a high grouping accuracy. Hence, this paper combines the feature clustering model and HMM into a hybrid model: the speech recognition system based on speech feature clustering and HMM (Fig. 1).

2) Implementation

According to the hybrid model, we implemented the system on the Matlab platform. The main interface of the system is shown in Fig. 7. The state number of the HMM is set to 8 [12].
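The grouped decoding described above can be sketched as follows; here `score` is a hypothetical stand-in for Viterbi log-likelihood evaluation of one word HMM, and the container shapes are assumptions of this sketch.

```python
import numpy as np

def recognize(feature, cluster_centers, hmm_groups, score):
    """Hybrid recognition: pick the cluster group whose center is nearest
    to the input feature, then score only the HMMs in that group instead
    of all vocabulary HMMs."""
    dists = [np.linalg.norm(feature - c) for c in cluster_centers]
    k = int(np.argmin(dists))            # selected cluster group
    best_word, best_score = None, -np.inf
    for word, hmm in hmm_groups[k].items():
        s = score(hmm, feature)          # Viterbi only within group k
        if s > best_score:
            best_word, best_score = word, s
    return best_word
```

Because only the HMMs of group k are evaluated, the pattern-matching cost shrinks roughly in proportion to the group size over the vocabulary size.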


Figure 7. Main interface of the system.

Figure 8. Recognition and data analysis module.

As Fig. 8 shows, the recognition and data analysis module reports the recognition results and the clustering items. Users can customize the number of cluster groups and compare different recognition cases.

IV. EXPERIMENTS

In order to verify the validity of the new model, we analyze the segment-mean algorithm, the cluster cross-grouping model and the hybrid model (speech feature clustering and HMM) through several experiments. The test set consists of 1,000 speech samples recorded by 10 individuals; the system vocabulary contains 20 words.

A. The Effect of the Segment-mean Algorithm

Figure 9. Schematic diagram of the speech feature parameters clustering experiment.

Following the procedure in Fig. 9, for the speech feature parameters whose dimension is not reduced, we use a resampling algorithm to warp their lengths to the same size. As the number of frames is generally 30 to 80 at an 8 kHz sampling frequency, we set the resample length to 50. The DTW algorithm is used to calculate the Euclidean distance between each test sample and the cluster centers, and the target group is the one with the minimum distance. We then check whether the target group contains the input sample's word; if it does, the classification is correct.

TABLE III. CLUSTERING RESULTS

Number of clusters   Direct MFCC clustering accuracy   Segment-mean clustering accuracy
1                    1.0000                            1.0000
2                    0.7000                            0.8400
3                    0.4000                            0.8900
4                    0.3800                            0.9400
5                    0.3000                            0.8200
6                    0.2000                            0.8450
7                    0.3500                            0.9150
8                    0.2000                            0.8850
9                    0.1500                            0.8750
10                   0.1000                            0.9000
11                   0.1000                            0.9050
12                   0.1500                            0.8800
13                   0.1000                            0.8900
14                   0.1000                            0.9000
15                   0.1000                            0.8850
16                   0.1000                            0.9100
17                   0.0500                            0.9050
18                   0.0500                            0.9200
19                   0.1000                            0.9100
20                   0.0500                            0.9050

The experimental data in Table III show that the average clustering accuracy of the untreated MFCC features is 23.40%. Such a low accuracy means these clusters cannot be used in the system. The data also show that the average clustering accuracy with the segment-mean algorithm applied to the MFCC features is much better, at 89.60%. Although the segment-mean clustering accuracy is relatively high, the clusters still cannot be used directly, because the grouping errors would still reduce the recognition rate of the system to a certain degree. We therefore continue with other methods to improve the accuracy of the speech feature clustering.

B. Analysis of the Cluster Cross-grouping Model

The experiment follows the schematic diagram of cluster cross-grouping in Fig. 3, with the number of clusters set to 3. There are 10 groups of test samples containing the speech recorded by 10 individuals, 200 word utterances in total. The group assigned to each word for each speaker is given in Table IV, and the clustering accuracy is 89.00%.

TABLE IV. CLUSTERING RESULTS

word     id: 1  2  3  4  5  6  7  8  9  10
Apple        2  2  2  2  2  2  2  2  2  2
Baby         1  1  1  1  1  1  1  1  1  1
Boy          3  3  3  3  3  3  3  3  3  3
Call         2  2  3  2  1  2  1  2  2  2
Car          2  2  2  2  2  2  2  2  2  2
Chew         2  1  2  2  1  1  1  2  2  1
Deny         2  1  2  2  1  1  2  2  2  2
Dress        2  2  2  2  2  1  2  2  2  2
Man          1  1  1  1  1  1  1  2  2  2
Many         1  1  1  1  1  1  1  1  1  1
Map          2  2  1  2  2  2  2  2  2  2
Movie        1  1  1  1  1  1  1  1  1  1
Name         1  1  1  1  1  1  2  2  1  1
Navy         1  1  1  1  1  1  1  1  1  1
New          1  1  1  1  1  1  1  1  1  1
One          3  3  3  3  1  1  3  3  3  3
Open         3  3  3  3  3  3  3  3  3  3
Over         3  3  3  3  3  3  3  3  3  3
Pay          1  1  1  1  1  1  1  2  2  1
Power        2  2  2  2  2  2  2  2  2  2
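The DTW distance used here to compare samples with cluster centers can be sketched in its standard form (Euclidean frame distance, no path constraints); this is a generic sketch rather than the paper's exact implementation.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping between two feature sequences
    (frames x dims).  d[i, j] holds the minimal accumulated cost of
    aligning the first i frames of `a` with the first j frames of `b`."""
    n, m = len(a), len(b)
    d = np.full((n + 1, m + 1), np.inf)
    d[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            # step pattern: insertion, deletion, or match
            d[i, j] = cost + min(d[i - 1, j], d[i, j - 1], d[i - 1, j - 1])
    return d[n, m]
```

Because DTW warps the time axis, two utterances of the same word with different lengths can still match with a small distance.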

TABLE V. GROUPS OF THE SEGMENT-MEAN MFCC CLUSTERING (3 GROUPS)

Group     Words in the cluster group
Group 1   Baby Chew Man Many Movie Name Navy New Pay
Group 2   Apple Call Car Deny Dress Map Power
Group 3   Boy One Open Over

The clustering results are quite stable after applying the segment-mean algorithm. As Table IV shows, there are 10 words in the test that never incur a grouping error. These 10 words are therefore not re-grouped by the cluster cross-grouping algorithm, which helps keep Σ_{k=1}^{K} S_k small, so the cross-grouping achieves better results.

TABLE VI. TRAINING RESULT OF THE CLUSTER CROSS-GROUPING ALGORITHM

Group     Words in the cluster group
Group 1   Baby Chew Man Many Movie Name Navy New Pay Apple Boy Call Deny Dress Map One
Group 2   Apple Call Car Deny Dress Map Power Boy Chew Man Name New One Over Pay
Group 3   Boy One Open Over Call

From the result in Table VI, after training with the cluster cross-grouping algorithm the total number of words over all groups is Σ_{k=1}^{K} S_k = 36, while KN = 3 × 20 = 60. So the clustering coefficient of the cluster cross-grouping model is η = Σ_{k=1}^{K} S_k / (KN) = 0.6. Hence, the cluster cross-grouping model performs well.

After using the cluster cross-grouping algorithm to retrain the speech feature clusters, the average clustering accuracy rises to 98.75%. The accuracy is 99.50% when the number of clusters is set to 3, greatly improved over the previous 89.00%. Fig. 10 shows the accuracy for different numbers of clusters.

Figure 10. Accuracy of cross-group training.

Thus, after applying the segment-mean algorithm and the cluster cross-grouping algorithm, the accuracy of speech feature clustering rises to above 98%.

C. Improved Effectiveness Analysis of the Speech Recognition System

TABLE VII. RECOGNITION TIME AND RECOGNITION RATE FOR DIFFERENT NUMBERS OF HMM PARAMETER CLUSTERS

Number of HMM parameter clusters   Recognition time (s)   Recognition rate   Improved / unimproved recognition time
1                                  0.712402641            99.50%             100.00%
2                                  0.587432793            99.50%             82.46%
3                                  0.471864087            99.00%             66.24%
4                                  0.393995398            99.50%             55.31%
5                                  0.323681682            99.50%             45.44%
6                                  0.30373318             99.00%             42.64%
7                                  0.284968684            97.50%             40.00%
8                                  0.230544148            96.50%             32.36%
9                                  0.256837057            97.00%             36.05%
10                                 0.192162851            99.00%             26.97%
11                                 0.184630581            97.50%             25.92%
12                                 0.184794962            99.00%             25.94%
13                                 0.147031782            98.00%             20.64%
14                                 0.144813982            97.50%             20.33%
15                                 0.135196623            98.00%             18.98%
16                                 0.117887713            98.00%             16.55%
17                                 0.108788616            97.00%             15.27%
18                                 0.104763784            98.00%             14.71%
19                                 0.104496738            97.50%             14.67%
20                                 0.092861028            97.00%             13.03%

As Table VII shows, when the number of HMM parameter clusters is set to 1, the system degrades to the unimproved case, so the recognition rate is not affected by the cluster cross-grouping algorithm at all. Keeping the recognition rate at 99.50%, with the number of HMM parameter clusters set to 5, the improved recognition time accounts for 45.44% of the unimproved recognition time.

V. SUMMARY

Through the analysis of the speech recognition system based on a single HMM, we pointed out that small devices cannot meet real-time requirements because of the enormous computation involved. To solve this problem, a new speech recognition system based on speech feature clustering and HMM is proposed. Its main techniques are: a new segment-mean algorithm for reducing the dimension of the speech feature parameters; the proposed cluster cross-grouping model; and HMM parameters grouping. The experimental results indicate that, with the proposed method, the improved recognition time is no more than 45.44% of the unimproved recognition time. The purpose of improving system efficiency is thus achieved.


ACKNOWLEDGMENT

This work is supported by GDSF 2011B031400003, GDNSF (915104200100017) and GDSF (2008B080701007).

REFERENCES

[1] Wang Xianbao, Chen Yong and Tang Liping, "Speech recognition research based on MFCC analysis and biomimetic pattern recognition," Computer Engineering and Applications, vol. 47, no. 12, 2011, pp. 20-22.
[2] Mehryar Mohri, Fernando Pereira and Michael Riley, "Weighted finite-state transducers in speech recognition," Computer Speech & Language, vol. 16, no. 1, Jan. 2002, pp. 69-88.
[3] Povey D., Burget L., Agarwal M., et al., "Subspace Gaussian Mixture Models for speech recognition," 2010 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Mar. 2010, pp. 4330-4333.
[4] Feng HongWei and Xue Lei, "Application of speech recognition system based on algebra algorithm and HMM," Computer Engineering and Design, vol. 31, no. 24, Dec. 2010, pp. 5324-5327.
[5] Mark Gales and Steve Young, "The application of hidden Markov models in speech recognition," Foundations and Trends in Signal Processing, vol. 1, no. 3, Jan. 2007, pp. 195-304.
[6] Rabiner L. R., "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, Feb. 1989, pp. 257-286.
[7] Ye Qingyun and Jiang Jia, "Improved extraction algorithm for MFCC feature," Journal of Wuhan University of Technology, vol. 29, no. 5, May 2007, pp. 150-152.
[8] Feng Yun, Jing Xinxing and Ye Mao, "Improving the MFCC features for speech recognition," Computer Engineering & Science, vol. 31, no. 12, 2009, pp. 146-168, doi: 10.3969/j.issn.1007-130X.2009.12.042.
[9] Yuan Jun, "The Viterbi algorithm optimization and application on continuous speech recognition based on HMM," Electronic Technology, vol. 2, 2001, pp. 48-51.
[10] Yu Xiangdong, Suo Xiuyun and Zhai Jianren, "Speech recognition based on fuzzy clustering," Fuzzy Systems and Mathematics, vol. 16, no. 1, Mar. 2002, pp. 75-79.
[11] Li Dongdong, Wu Zhaohui and Yang Yingchun, "Speaker recognition based on pitch-dependent affective speech clustering," Pattern Recognition and Artificial Intelligence, vol. 22, no. 1, Feb. 2009, pp. 139-140.
[12] Zhang Jie, Huang Zhitong and Wang Xiaolan, "Principle of selection of states number of HMM in speech recognition and its analysis," Computer Engineering and Applications, vol. 36, no. 1, 2009, pp. 67-69.

XinGuang Li was born in Hunan Province, China, in January 1963. He received his Ph.D. in Circuit and System from the School of Electronics and Informatics, South China University of Technology; his research interests include artificial intelligence. He is a professor at the Cisco School of Informatics, Guangdong University of Foreign Studies.

MinFeng Yao was born in Henan Province, China, in November 1977. He graduated from Leeds University with a Master's degree in Computer Science; his research interests include artificial intelligence. He is a lecturer at the Cisco School of Informatics, Guangdong University of Foreign Studies.

JiaNeng Yang was born in Guangdong Province, China, in May 1987. He holds a bachelor's degree in computer science and technology; his research interests include artificial intelligence and machine learning. He is a candidate for the Master of Business Management in the School of Management, Guangdong University of Foreign Studies.


Electronic Nose for the Vinegar Quality Evaluation by an Incremental RBF Network
Hong Men School of Automation Engineering, Northeast Dianli University, Jilin, 132012, China Email: menhong_bme@163.com
Lei Wang

School of Automation Engineering, Northeast Dianli University, Jilin, 132012, China Email: wanglei0510410@126.com
Haiping Zhang

School of Automation Engineering, Northeast Dianli University, Jilin, 132012, China Email: zhplxzlq@tom.com

Abstract—Pattern classification is an important part of RBF neural network applications. Where the electronic nose is concerned, it is in many cases difficult to obtain a fully representative sample set, which requires frequently updating the sample library and re-training the electronic nose. In addition, the gas detected in an online environment is not always one of the known gases in the training samples. This paper proposes an RBF neural network model for gas identification. The model uses the K-means clustering algorithm and has incremental learning ability; the network output nodes can be adjusted online to give the network high generalization ability and a degree of incremental learning ability. Finally, a classification system based on this algorithm is used to identify vinegar online. The results show that the algorithm converges quickly and that the network performs well at online classification.

Index Terms—radial basis function; K-means clustering; incremental learning; electronic nose

I. INTRODUCTION The electronic nose is an application of bionics; it uses a gas sensor array to identify and analyze gas molecules. It has been widely used in many fields, including food surveillance [1], medical tests [2], environmental monitoring [3] and explosives detection [4]. According to current electronic nose research, the identification accuracy for a single known gas has reached a considerable level (93% to 95%), but there is as yet no effective algorithm for the case where the tested gas is unknown. We treat an unknown gas as an unclassified gas (one that does not belong to any part of the training database) and design a clustering algorithm to identify it, in order to improve the gas recognition accuracy. Compared with traditional pattern recognition methods, neural network classifiers perform better when the characteristics of the identified gases are unknown, and they also generalize better.

For a classifier, performance largely depends on the training samples: the more representative the training samples, the better the classifier performs. In many cases, however, it is difficult to obtain a representative sample set, so incremental learning techniques are often needed; after training on the existing samples is complete, newly obtained samples are used to train the neural network classifier incrementally. Neural network classifiers are used in a wide range of areas, so research on neural network algorithms based on incremental learning is necessary, and many such algorithms exist [5-9]. Compared with the BP neural network, RBF is more suitable for incremental learning. The activation function of the BP network is the sigmoid, a globally responding function, while the activation function of the RBF network is the Gaussian, whose response is local, so the weight coefficients and hidden-layer nodes of the RBF network are linked only to parts of the sample space. Incremental learning of an RBF network therefore only needs to adjust the weight coefficients and hidden-layer nodes corresponding to the input sample region, which means the RBF network is suitable for incremental learning [10,11]. Based on this consideration, this paper presents an incremental learning algorithm for RBF networks based on the K-means clustering algorithm, in which the number of hidden-layer nodes is adjustable. It has the following characteristics: a. the training samples are first clustered with K-means and then adjusted so that subsets of different classes do not overlap, so that incremental learning on new samples avoids a great deal of repetitive training and improves training efficiency; b. after K-means clustering, the training samples become locally clustered subsets, and the RBF network assigns hidden-layer nodes only to subset centers rather than generating a hidden

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2276-2282


layer nodes for every sample. So the size of the network can be controlled in some extent. II. RBF NETWORK AND K-MEANS CLUSTERING ALGORITHM A. RBF Broom head and Lowe in 1988 proposed the RBF network, and then have been someone makes the RBF network extended and improved [12]. RBF network is usually a three-tier feed-forward network, the first layer is input layer, input the Eigen values of the input feature space mode, input nodes directly connect with the each neuron in the second layer; the second layer is the hidden layer in RBF network neuron, the number of hidden neurons can change with the complexity of the problem; the third layer is output layer [13]. B. K-means Clustering method can generally be divided into partition-based methods [14], based on a layered approach [15], density-based method [16] and grid-based method [17]. Among them, the Partition-based clustering algorithm in pattern recognition are the most common type of clustering algorithm, which Mac Queen proposed a real-time unsupervised clustering algorithm K-means most widely used [18,19]. The steps of the k-means clustering algorithm [20] (1) From n data objects arbitrary choice k objects as initial cluster centers; (2)According to each cluster object means (central object), calculated the distance between the each object and the object of these centers; and according to the minimum distance re-classification of the corresponding object, each object is (re) assigned to the closest class. (3) re-calculated for each (change) clustering means (central object) (4) Repeat (2) and (3) until no further change in each cluster, k-means algorithm attempts to find the kclustering which make the value of the squared error function to be minimum; it is defined as follows:

E = \sum_{i=1}^{k} \sum_{p \in C_i} \|p - m_i\|^2    (1)

where E is the sum of squared errors over all objects in the database, p is a point in space representing a given data object, and m_i is the mean of cluster C_i (both p and m_i are multi-dimensional). This criterion makes the resulting clusters as compact and as well separated as possible.

C. Improved K-means Clustering Algorithm

First, the improved hierarchical clustering method is used to obtain a k-way partition of the sample data. By calculating the mean of the objects in each of the k parts, the k initial cluster centers are obtained. The basic idea of the improved k-means clustering algorithm is as follows:
(1) Analyze the sample data with the hierarchical clustering method to obtain a k-way partition;
(2) Calculate the mean of each of the k parts, and use these means as the initial cluster centers for k-means;
(3) For each object, calculate the distance to each central object (the mean of each cluster) and assign the object to the nearest center;
(4) Re-calculate the mean (central object) of each changed cluster;
(5) Repeat (3) and (4) until the class of every object no longer changes.

Initializing k-means from the improved hierarchical clustering effectively removes the randomness that a random initialization introduces into the cluster centers, so the algorithm produces stable clustering results. Because the initial centers already represent the clusters well, the number of iterations decreases significantly and the running speed of the algorithm improves; this initialization also exploits the structural information produced by the hierarchical clustering, so the clustering quality improves markedly over the average quality of random initialization.

III. IMPROVED RBF NEURAL NETWORK MODEL

In pattern recognition applications, obtaining all of the data at once to train a neural network is often time-consuming and arduous, and sometimes unrealistic. In addition, when the sample size is large, training on all of the samples is often infeasible because of memory limitations, so the neural network must learn incrementally. An incremental-learning neural network can learn new sample information while retaining the knowledge of the original samples, and the learning process does not need the original samples. Incremental learning can be achieved by adjusting the network parameters or by adjusting the network topology.

Because the radial basis function (RBF) network has a single hidden layer with restructuring potential, researchers have studied its incremental learning ability [21, 23]. The RBF network model used in this paper applies the improved K-means algorithm to determine the cluster centers and widths, and uses an incremental learning algorithm to make the output nodes of the network adjustable online. The structure of the network model is shown in Figure 1. The procedure divides into a learning process and a recognition process; the specific algorithm is described as follows:

A. The Learning Process

1) Unsupervised learning algorithm
The improved K-means algorithm achieves dynamic adjustment of the hidden nodes. The input sample set is X_p, p = 1, 2, ..., N.
(1) Set the initial number of hidden nodes K(0) and the initial cluster centers, and initialize the hidden-layer weights W_{1,j}(0), j = 1, 2, ..., K(0);
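As an illustration, the hierarchical-clustering-seeded k-means described above can be sketched as follows. This is a minimal reading of the steps: a naive centroid-linkage agglomeration stands in for the paper's improved hierarchical method (which is not specified here), and all function names are ours.

```python
import numpy as np

def hierarchical_partition(X, k):
    """Steps (1)-(2): naive agglomerative (centroid-linkage) clustering.
    Every point starts as its own cluster; the closest pair of cluster
    centroids is merged repeatedly until k clusters remain."""
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > k:
        cents = [X[c].mean(axis=0) for c in clusters]
        best, pair = np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = np.linalg.norm(cents[a] - cents[b])
                if d < best:
                    best, pair = d, (a, b)
        a, b = pair
        clusters[a] += clusters.pop(b)
    return clusters

def seeded_kmeans(X, k, n_iter=100):
    """Steps (3)-(5): k-means started from the hierarchical-partition means,
    so the result does not depend on a random initialization."""
    centers = np.array([X[c].mean(axis=0) for c in hierarchical_partition(X, k)])
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)          # minimum-distance assignment
        new = np.array([X[assign == j].mean(axis=0) if np.any(assign == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):          # step (5): no further change
            break
        centers = new
    return centers, assign
```

Because the seeds are deterministic for a given data set, repeated runs give identical clusterings, which is the stability property claimed above.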

2012 ACADEMY PUBLISHER


JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012

(2) For j = 1, 2, ..., k and p = 1, 2, ..., N, calculate

d_{jp} = \|x_p - W_{1,j}(t)\|,    (2)

\mu_{jp} = 1 / \sum_{i=1}^{k} (d_{jp}/d_{ip})^{2/(m-1)},    (3)

where m is the weighting exponent (usually taken as 2), \mu_{jp} is the membership of the p-th sample in the j-th fuzzy set, W_{1,j}(t) is the j-th cluster center, and d_{jp} is the distance between x_p and fuzzy set j. The cluster centers are then updated by

W_{1,j}(t) = \sum_{p=1}^{N} (\mu_{jp})^m x_p / \sum_{p=1}^{N} (\mu_{jp})^m.    (4)

(3) Calculate

E = \sum_{j=1}^{c(t)} \|W_{1,j}(t) - W_{1,j}(t-1)\|^2.    (5)

If E > \varepsilon, return to step (2) and continue learning.

(4) Calculate the clustering index function

S(k) = \sum_{p=1}^{N} \sum_{j=1}^{c} \mu_{jp} ( \|x_p - W_{1,j}\|^2 - \|W_{1,j} - \bar{x}\|^2 ),    (6)

where

\bar{x} = (1/N) \sum_{p=1}^{N} x_p.    (7)

If S > \varepsilon_1, set k(t+1) = k(t) + 1 and re-learn; otherwise, stop learning, fix the number of clusters k and the cluster centers W_{1,j}, and turn to step (5).

(5) Calculate the radius of the kernel function:

\sigma_j^2 = (1/m_j) \sum_{x_p \in j} \|x_p - W_{1,j}(t)\|^2, j = 1, 2, ..., c,    (8)

where m_j is the number of samples belonging to the j-th cluster.

2) Supervised learning algorithm
The weight vector between the hidden layer and the output layer is adjusted by gradient descent, minimizing the mean square error; during the iteration, the learning rate \eta and the momentum factor a are adjusted automatically.
(1) Initialize the connection weights W_{2,i}(0) between the hidden layer and the output layer to random numbers, where i = 1, 2, ..., m indexes the output-layer nodes;
(2) For a training pair (x, t), the output of the j-th hidden-layer neuron is

u_j = \exp( -\|x - W_{1,j}\|^2 / (2\sigma_j^2) ), j = 1, 2, ..., k,    (9)

and the hidden-layer output vector is

u = (u_j)_{c \times 1}.    (10)

(3) The output of the i-th output-layer neuron is

y_i = W_{2,i}(t) \cdot u, i = 1, 2, ..., m.    (11)

(4) Calculate the output error:

e_i = t_i - y_i, i = 1, 2, ..., m.    (12)

(5) Correct the weights:

W_{2,i}(t+1) = W_{2,i}(t) + \Delta W_{2,i}(t+1),    (13)

\Delta W_{2,i}(t+1) = \eta e_i u + a \Delta W_{2,i}(t), i = 1, 2, ..., m.    (14)

(6) Return to step (2) until the end of the iteration.

Figure 1. Network structure

B. Recognition Process
The trained RBF network with M output nodes is used to identify input samples. When an input sample is found to belong to a new class, the network adds a hidden-layer node and an output node in accordance with the k-means clustering algorithm, and the network is then trained with samples from the new class so that it learns the new classification knowledge, meeting the requirements of incremental learning.

Step 1, initialization: use sample set A as the training set and establish the RBF neural network classifier S. The K-means algorithm described earlier performs cluster analysis on the initial sample set: the number of clusters is the number of initial hidden nodes of the RBF network, the cluster centers are the basis-function centers, the smoothing parameters of the basis functions are obtained from the cluster variances, and the connection weights are calculated by the least-squares algorithm. This yields the initial RBF network structure.

Step 2, incremental process: after the incremental data set B is loaded, B may contain not only normal data samples but also many types of heterogeneous samples. Because a multi-class problem can be transformed into several two-class problems, here we only


consider the basic two-class problem: if heterogeneous samples exist in B, B is treated as a set containing normal samples and one class of heterogeneous samples. The specific incremental training process is as follows:
(1) Inspect whether sample set B contains heterogeneous samples. If not, the algorithm stops and S is the incremental learning result; if so, divide B according to the test results into B1, the heterogeneous sample set, and B2, the normal sample set, and turn to step (2).
(2) Based on the original RBF neural network classifier S, add an output node, setting M = M + 1. Cluster the sample set B1 effectively with the K-means algorithm to determine the number of added hidden-layer nodes and the corresponding added centers and widths; meanwhile, randomly initialize the added connection weights between the hidden layer and the output layer. Use the steepest-descent method to learn the new class samples and correct the newly added weights. The change of the hidden-layer nodes after online learning is shown in Figure 2.
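To make the incremental step concrete, here is a minimal sketch of such a network: Gaussian hidden units whose centers and widths come from clustering, output weights trained by gradient descent with momentum as in eqs. (13)-(14), and an `add_class` operation that appends hidden units and one output row while leaving the old weights untouched. The class and method names are illustrative, not the authors' code.

```python
import numpy as np

class IncrementalRBF:
    """Sketch of the K-RBF classifier described above: hidden units are fixed
    by clustering; only the output weights W2 are trained online."""

    def __init__(self, centers, sigmas, n_classes, lr=0.1, momentum=0.5):
        self.centers = np.asarray(centers, dtype=float)   # W1: k x d centers
        self.sigmas = np.asarray(sigmas, dtype=float)     # one radius per unit
        self.W2 = np.random.default_rng(1).normal(0, 0.1, (n_classes, len(self.centers)))
        self.lr, self.momentum = lr, momentum
        self._dW2 = np.zeros_like(self.W2)

    def hidden(self, x):                                  # eq (9): u_j
        d2 = ((x - self.centers) ** 2).sum(axis=1)
        return np.exp(-d2 / (2.0 * self.sigmas ** 2))

    def forward(self, x):                                 # eq (11): y = W2 u
        return self.W2 @ self.hidden(x)

    def train_step(self, x, t):                           # eqs (12)-(14)
        u = self.hidden(x)
        e = t - self.forward(x)                           # output error
        self._dW2 = self.lr * np.outer(e, u) + self.momentum * self._dW2
        self.W2 += self._dW2

    def add_class(self, new_centers, new_sigmas):
        """Incremental step: append hidden units for the new class and one
        output node; old knowledge (old weights) is preserved."""
        self.centers = np.vstack([self.centers, new_centers])
        self.sigmas = np.concatenate([self.sigmas, new_sigmas])
        pad = np.zeros((self.W2.shape[0], len(new_sigmas)))
        self.W2 = np.hstack([self.W2, pad])
        new_row = np.random.default_rng(2).normal(0, 0.1, (1, self.W2.shape[1]))
        self.W2 = np.vstack([self.W2, new_row])
        self._dW2 = np.zeros_like(self.W2)
```

After `add_class`, only samples of the new class need to be presented; the zero padding and untouched old rows mean the previously learned mappings are not disturbed.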
Figure 2. RBF network structure before (a) and after (b) incremental learning; after learning, a new hidden-layer node and a new output-layer node have been added.

IV. SIMULATION

The test samples were five kinds of Chinese vinegar: River City White vinegar, River City Rice vinegar, Liubiju vinegar, Apple vinegar and Zilin Old vinegar. The system uses six TGS-8 series metal-oxide gas sensors from the Figaro company, models TGS822, TGS813, TGS821, TGS830, TGS831 and TGS832. The experimental equipment used a divider-type signal-conditioning circuit, so the gas concentration is reflected through the sensor voltage. Quantitative liquid sampling was used, with the sample volume set at 5 ml. The system collects data during the interval between the rapid increase and the slow decrease of the signal; generally, the signal changes significantly about 2 min after a sample is put into the container. To avoid the influence of the environment on the test results, we carried out intermittent testing for two weeks, and before each sample was measured, the built-in fan restored the sensors to their normal state. The experimental data set contains a total of 600 samples in five classes of 120 samples each. We extracted 53, 53, 54 and 53 samples from the first four classes as the training sample set, and the rest served as test samples. During training, the samples are presented to the network one after another; after each training sample is input, the network parameters are readjusted if the actual output of the RBF neural network does not reach the desired accuracy. The RBF network thus uses four categories of vinegar samples as the training set and five categories as the testing set. Table I shows the test results on the sample data. After the RBF neural network structure was re-learned, the testing sample set was re-tested. Comparative analysis of the test results listed in Table I shows that all five categories of vinegar samples were correctly identified; in particular, through incremental learning, the newly added category of vinegar samples (e.g. samples 5 and 6) was correctly identified.
This proves that the proposed improved K-RBF network with incremental learning ability can continue to improve gradually with practice and enhance the system's recognition ability. To understand the training condition of the network, we recorded the number of adjustments of the RBF neural network parameters for the fifth category of vinegar samples (Figure 3). In the initial phase the network has not yet adapted to the training samples, and the RBF network parameters are adjusted frequently and by large amounts; in the late stages of training, the network gradually adapts to the training samples, the fluctuations smooth out, and the parameters are adjusted about 5 times on average. To observe the convergence of the algorithm, we selected the representative fifth category of vinegar samples as the research object. Because the initial RBF network uses the initial parameters, the network output error is large in the initial phase; through adaptive adjustment of the network parameters, the error between the network output and the actual value gradually decreases, and after 31 re-tunings of the network parameters the output reaches the required accuracy, as shown in Figure 4. Figure 5 shows the approximation capability of the trained RBF network on 50 test samples; the red line is the target value and the blue line is the network's actual output value.
TABLE I RBF NETWORK OUTPUT BEFORE AND AFTER INCREMENTAL LEARNING

Sample number | RBF network output before incremental learning | Category | RBF network output after incremental learning | Category
1 | (1.015 0.007 0.006 0.002) | River City White vinegar | (1.015 0.007 0.006 0.002 0.001) | River City White vinegar
2 | (0.011 0.978 0.005 0.011) | River City Rice vinegar | (0.011 0.978 0.005 0.011 0.004) | River City Rice vinegar
3 | (0.020 0.006 0.932 0.002) | Liubiju vinegar | (0.020 0.006 0.932 0.002 0.010) | Liubiju vinegar
4 | (0.002 0.003 0.013 0.998) | Apple vinegar | (0.002 0.003 0.013 0.998 0.009) | Apple vinegar
5 | (0.009 0.002 0.001 0.001) | New Category | (0.009 0.002 0.001 0.001 1.009) | Zilin Old vinegar
6 | - | - | (0.007 0.004 0.003 0.004 0.974) | Zilin Old vinegar

Figure 4. Algorithm convergence (error rate versus iteration).

Figure 5. The approximation capability.

As can be seen from Table II, the four previously known categories of vinegar samples had recognition rates of 91%, 93%, 95% and 90% after RBF network training. After incremental learning of the fifth category of vinegar samples, these recognition rates did not change. This means the incremental learning model not only can accurately predict the new category but also does not forget the original knowledge.
TABLE II. THE CORRECT RATE OF THE ELECTRONIC NOSE IDENTIFYING THE NEW SAMPLE DATA

Vinegar category | Before incremental learning | After incremental learning
River City White vinegar | 91% | 91%
River City Rice vinegar | 93% | 93%
Liubiju vinegar | 95% | 95%
Apple vinegar | 90% | 90%
Zilin Old vinegar | - | 97%

Figure 3. The training condition of the network (number of parameter adjustments versus sample sequence number).

As can be seen from Figure 6, the performance of the incremental learning algorithm is not very sensitive to the initial sample volume or to the total amount of incremental data: when the initial sample volume changes from 50 to 100, the average recognition rate does not change, and when the total amount of incremental data changes from 50 to 100, the largest change in the average recognition rate is about 1%.

Figure 6. Different amounts of data and algorithm performance

V. CONCLUSION

In this paper an incremental RBF neural network model based on K-means clustering is studied. The model combines the K-means clustering algorithm with an incremental learning algorithm, so the network can effectively learn new sample modes while keeping its memory of the old sample patterns, giving it a progressive learning ability. The clustering-based initialization reduces the effect of the training sequence of the initial data set on the incremental RBF network, and the dynamic adjustment strategy for the hidden-layer nodes gives the incremental RBF network the ability to learn new information. Preliminary experimental results demonstrate the validity of the model.

ACKNOWLEDGMENT

This work was supported by the Jilin Province Education Department Research Program of China (2011 No.79) to Hong Men and Haiping Zhang.

REFERENCES
[1] C. Li, P. H. Heinemann, J. Irudayaraj, Detection of Apple Deterioration Using an Electronic Nose and zNose, Transactions of the ASABE, vol.50, No.4, pp1417-1425, 2007.
[2] Kateb Babak, Ryan M.A., Homer M.L., etc., Sniffing out cancer using the JPL electronic nose: A pilot study of a novel approach to detection and differentiation of brain cancer, NeuroImage, vol.47, No.2, pp5-9, 2009.
[3] Romain A.C., Godefroid D., Nicolas J., Monitoring the exhaust air of a compost pile with an e-nose and comparison with GC-MS data, Sensors and Actuators B, vol.106, No.1, pp317-324, 2005.
[4] Jehuda Yinon, Field detection and monitoring of explosives, Trends in Analytical Chemistry, vol.21, No.4, pp292-301, 2002.
[5] Polikar R, Udpa S S, Honavar V, Learn++: An incremental learning algorithm for supervised neural networks, IEEE Trans. Syst., Man, Cybern. C, vol.31, No.4, pp497-508, 2001.
[6] Fu Linmin, Hsu Huihuang, Principe J C, Incremental backpropagation learning networks, IEEE Trans. Neural Networks, vol.7, No.3, pp757-761, 1996.
[7] Carpenter G, Grossberg S, Markuzon N, etc., Fuzzy ARTMAP: a neural network architecture for incremental supervised learning of analog multidimensional maps, IEEE Trans. Neural Networks, vol.3, No.5, pp698-713, 1992.

[8] Platt J, A resource-allocating network for function interpolation, Neural Computation, vol.3, No.2, pp213-225, 1991.
[9] Yamauchi K, Yamaguchi N, Ishii N, Incremental learning methods with retrieving of interfered patterns, IEEE Trans. Neural Networks, vol.10, No.6, pp1351-1365, 1999.
[10] David Coufal, Incremental Structure Learning of Three-Layered Gaussian RBF Networks, Lecture Notes in Computer Science, vol.2331, pp584-593, 2008.
[11] Seiichi Ozawa, Keisuke Okamoto, A Fast Incremental Learning for Radial Basis Function Networks Using Local Linear Regression, IEEJ Transactions on Electronics, Information and Systems, vol.130, No.9, pp1667-1673, 2010.
[12] Zhang Linke, He Lin, Ben Kerong, etc., RBF Network Based on Fuzzy Clustering Algorithm for Acoustic Fault Identification of Underwater Vehicles, Proc. of the 2nd Int'l Symp. on Neural Networks, pp567-573, 2005.
[13] Nagabhushan T.N., Padma S.K., Prasad Bhanu, Performance of auto-configuring RBF networks trained with significant patterns, International Journal of Signal and Imaging Systems Engineering, vol.2, No.1-2, pp41-50, 2009.
[14] Kobrina Y, Turunen MJ, Saarakkala S, etc., Cluster analysis of infrared spectra of rabbit cortical bone samples during maturation and growth, Analyst, vol.135, No.12, pp3147, 2010.
[15] P. Krishna Prasad, C. Pandu Rangan, Privacy Preserving BIRCH Algorithm for Clustering over Arbitrarily Partitioned Databases, Lecture Notes in Computer Science, vol.4632, pp146-157, 2007.
[16] K. Anil Kumar, C. Pandu Rangan, Privacy Preserving DBSCAN Algorithm for Clustering, Lecture Notes in Computer Science, vol.4632, pp57-68, 2007.
[17] Shafto, Michael, CLIQUE: A FORTRAN IV program for the Needham-Moody-Hollis cluster-listing algorithm, Behavior Research Methods and Instrumentation, vol.6, No.1, pp58-59, 1974.
[18] Cuell Charles, Bonsal Barrie, An assessment of climatological synoptic typing by principal component analysis and k-means clustering, Theoretical and Applied Climatology, vol.98, No.3-4, pp361-373, 2009.
[19] Madhu Yedla, Srinivasa Rao Pathakota, Srinivasa T.M., Enhancing K-means Clustering Algorithm with Improved Initial Center, International Journal of Computer Science and Information Technologies, vol.1, No.2, pp121-125, 2010.
[20] Sameh H. Ghwanmeh, Applying Clustering of Hierarchical K-means-like Algorithm on Arabic Language, International Journal of Information Technology, vol.3, No.3, pp314-318, 2007.
[21] Yamauchi Koichiro, Hayami Jiro, Incremental Learning and Model Selection for Radial Basis Function Network through Sleep, IEICE Transactions on Information and Systems, vol.E90-D, No.4, pp722-735, 2009.
[22] Guang-Bin Huang, Lei Chen, Convex incremental extreme learning machine, Neurocomputing, vol.70, No.16-18, pp3056-3062, 2007.
[23] Constantinopoulos Konstantinos, Likas Aristidis, Semi-supervised and active learning with the probabilistic RBF classifier, Neurocomputing, vol.71, No.13-15, pp2489-2498, 2008.


Hong Men was born in Jilin, China, in 1973. He received a Bachelor of Electronics and Informational System from Northeast Normal University, Changchun, China, in 1996, a Master of Electric Power System and Automation from Northeast Institute of Electric Power Engineering, Jilin, China, in 2002, and a Doctor of Biomedical Engineering from Zhejiang University, Hangzhou, China, in 2005. He is now an associate professor at Northeast Dianli University. His current research areas are chemical sensor mathematics, microbiologically influenced corrosion, and pattern recognition techniques.

Lei Wang was born in Tianjin, China, in 1987. He received a Bachelor of Automation from the Computer Technology and Automation College, Tianjin Polytechnic University, China, in 2009. He is now a graduate student in the School of Automation Engineering, Northeast Dianli University. His research interest is the electronic nose.

Haiping Zhang was born in Liaoning Province, China, in 1962. He received a Bachelor of Chemical Analysis Instrument from Northeast Institute of Electric Power Engineering, Jilin, China, in 1984, and a Master of Chemical Analysis Instrument from the same institute in 1987. He is now an associate professor at Northeast Dianli University. His current research area is chemical sensors.


Intelligent Recognition for Microbiologically Influenced Corrosion Based on Hilbert-Huang Transform and BP Neural Network
Hong Men
School of Automation Engineering, Northeast Dianli University, Jilin, China Email: menhong_bme@163.com

Jing Zhang and Lihua Zhang


School of Automation Engineering, Northeast Dianli University, Jilin, China Email: zhangjing_092@163.com, zhanglihua_1039@163.com

Abstract—In this paper, the level of corrosion and the corrosion rate of 304 stainless steel induced by sulfate-reducing bacteria were studied using electrochemical noise. The noise data were analyzed in the time domain and frequency domain, combined with optical microscope observations, and the corrosion was divided into four categories: passivation, pitting induction period, pitting and uniform corrosion. Because the traditional method of electrochemical noise analysis suffers from lag, a feasibility study of an intelligent recognition method for microbiologically influenced corrosion based on the Hilbert-Huang Transform and a BP neural network was conducted. The results showed that feature extraction using the Hilbert-Huang Transform can characterize the level of corrosion; the BP neural network correctly identified passivation, the pitting induction period and pitting, while the recognition of uniform corrosion remains to be improved. This paper provides a feasible way of analyzing electrochemical noise data in real time and intelligently, and it is hoped that the method can provide a theoretical basis for identifying the extent of corrosion in practice so that preventive measures can be taken in time.

Index Terms—microbiologically influenced corrosion, Hilbert-Huang Transform, BP Neural Network, identification

I. INTRODUCTION

Microbiologically Influenced Corrosion (MIC) is prevalent in industrial cooling water systems. It can lead to sticky mud deposition and low heat-transfer efficiency in heat-transfer equipment; in serious cases, partial perforation of the pipeline can occur, resulting in industrial shutdown and huge economic losses[1]. Sulfate Reducing Bacteria (SRB) are the main strains in industrial cooling water and one of the main causes of MIC[2-3]. 304 stainless steel is the main material for heat exchangers, so the study of the corrosion of 304 stainless steel induced by SRB is important.

Electrochemical noise measurement is an in situ, non-destructive electrochemical measurement method: it imposes no disturbance signal on the measurement system and does not interfere with microbial growth and reproduction, so it is very suitable for studying microbially induced corrosion[4]. Electrochemical noise analysis is generally divided into time-domain analysis and frequency-domain analysis. The electrochemical noise data were analyzed by standard deviation, noise resistance, power spectral density and wavelet analysis[5-8], combined with optical microscope observation, and the corrosion of the stainless steel was classified according to the analysis results. However, basic electrochemical noise analysis must be carried out after the experiment; it is neither on-line nor real-time, so it has a significant lag. In this paper, a feasible way of identifying corrosion types intelligently based on the Hilbert-Huang Transform and a BP neural network is studied, with the aim of analyzing monitoring data in real time so that preventive measures can be taken promptly.

II. EXPERIMENTAL SYSTEM AND EXPERIMENTAL METHOD
2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2283-2291

The sulfate-reducing bacteria for the experiments were taken from the Songhua River and placed in an incubator (29 ± 1 °C) after enrichment and purification. The culture medium was API-RP38 medium[9]; its composition is shown in Table I. A CHI660C electrochemical workstation produced by Shanghai Chen Hua Instrument was used for the electrochemical noise measurement. The measurement circuit was the classic three-electrode system: the reference electrode was a saturated calomel electrode (SCE) and the working electrode was 304 stainless steel (its chemical composition is shown in Table II), with a working area of 1 cm². A comparison test was carried out, with electrodes placed in Erlenmeyer flasks containing culture medium inoculated with SRB and without any strains, respectively. The sampling interval for the electrochemical noise measurement was 1 second, and the test period was 29 days. Measurements were taken four times a day, and the measurement time for each sample was 2048 seconds.
TABLE I. THE COMPOSITION OF API MEDIUM

Molecular formula | Content / g·L⁻¹
Na2SO4 | 0.5
NH4Cl | 1.0
CaCl2 | 0.1
K2HPO4·3H2O | 0.5
MgSO4·7H2O | 2.0
C3H5NaO3 | 3.5

TABLE II. THE CHEMICAL COMPOSITION AND THE PROPORTION OF 304 STAINLESS STEEL

Element | C | Si | Mn | Cr | Ni | Fe
Wt /% | 0.07 | 0.48 | 1.55 | 18.5 | 9.35 | Bal.

III. ELECTROCHEMICAL NOISE

A. Principles of Electrochemical Noise
Electrochemical noise refers to the random, non-equilibrium fluctuations of the electrical state parameters of a dynamical electrochemical system as reactions evolve on the electrode surface. Electrochemical noise is generated by the electrochemical system itself; the measured signals are the fluctuations over time of the potential on the working electrode surface and of the current between the working electrodes. No external disturbance is imposed on the measured electrode, a disturbance that would otherwise alter the electrode reactions occurring on its surface. Electrochemical noise measurement is therefore an in situ, non-destructive, non-interfering method of detecting the electrode, and it is at the forefront of research on electrochemical measurements.

B. Analysis of Electrochemical Noise Data
There was significant DC drift in the electrochemical noise signal spectra, which obviously affects the analysis results, so the DC drift must be removed before the experimental data are analyzed[10-11]. Fig.1 shows the original electrochemical noise signal and the electrochemical noise signal after removing the DC drift. Standard deviation, noise resistance and power spectral density were obtained by time-domain and frequency-domain analysis of the drift-free data. The data analyzed in this paper were measured with the stainless steel immersed in the culture medium containing SRB.

Figure 1. The original electrochemical noise signal when stainless steel immersed in the culture medium containing SRB for 8 days (a), and the electrochemical noise signal after removing DC drift (b).

The standard deviation and noise resistance are shown in Fig.2 and Fig.3. It can be seen that from the 1st to the 6th day of immersion of the 304 stainless steel in the SRB-containing medium, the change of the current standard deviation was small and the noise resistance was large, indicating that the corrosion rate was essentially zero: a thin passive film formed on the surface of the stainless steel during the initial soaking, and the SRB, still in its adaptation period, grew slowly, so the stainless steel was not affected. From the 7th to the 23rd day, the standard deviation increased significantly and the noise resistance decreased: as the metabolism of the SRB sped up, the bacteria began to destroy the passive film. The damage was local, and the early part of this period was the pitting induction period. Significant white noise appeared in the power spectral density of the potential noise (Fig.4) after the stainless steel had soaked for 13 days, because pitting gradually increased with the increasing SRB metabolism and metastable pitting points became stable pits. The standard deviation decreased after 23 days of soaking, revealing a falling corrosion rate: the accumulation of metabolic products formed an adhesive layer on the stainless steel surface, and uniform corrosion occurred. Summing up the electrochemical noise analysis above, the corrosion during the whole process was divided into four types: passivation, pitting induction period, pitting and uniform corrosion.
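The time-domain quantities used in this classification can be computed directly from the sampled records; noise resistance is the ratio of the potential-noise standard deviation to the current-noise standard deviation, Rn = σV/σI. A minimal sketch follows, in which a linear detrend stands in for the DC-drift removal of refs. [10-11]; the function and array names are ours:

```python
import numpy as np

def noise_stats(potential, current):
    """Time-domain electrochemical-noise statistics: standard deviations of
    the drift-free potential and current records and the noise resistance
    Rn = sigma_V / sigma_I."""
    n = np.arange(len(potential))
    # remove linear DC drift from each record before computing statistics
    v = potential - np.polyval(np.polyfit(n, potential, 1), n)
    i = current - np.polyval(np.polyfit(np.arange(len(current)), current, 1),
                             np.arange(len(current)))
    sigma_v, sigma_i = v.std(), i.std()
    return sigma_v, sigma_i, sigma_v / sigma_i
```

Applied to each 2048-second record, a large Rn corresponds to the passive stage and a falling Rn with a rising current standard deviation marks the pitting stages described above.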

Figure 2. The standard deviation of current noise when stainless steel immersed in the culture medium containing SRB for different days.

Figure 3. Electrochemical noise resistance when stainless steel immersed in the culture medium containing SRB for different days.

Figure 4. The power spectral density of potential noise when stainless steel immersed in the culture medium containing SRB for 13 days.

IV. OPTICAL MICROSCOPE OBSERVATION

After the stainless steel had soaked in the SRB-containing medium, the bonding material on its surface was removed. Obvious corrosion spots could be seen on the surface of the stainless steel under the optical microscope (Fig.5).

V. FEATURE EXTRACTION BASED ON HILBERT-HUANG TRANSFORM

A. Empirical Mode Decomposition
The Hilbert-Huang Transform (HHT) is a new, self-adaptive time-frequency analysis method that can deal with non-linear, non-stationary signals[12]. HHT consists of empirical mode decomposition (EMD) and the Hilbert transform. EMD decomposes the analyzed signal into several Intrinsic Mode Functions (IMF). An IMF must satisfy two conditions: (1) over the whole record, the number of zero crossings and the number of extrema are equal or differ by at most one; (2) at any point, the mean of the envelope determined by the local maxima and the envelope determined by the local minima is zero[13]. The specific procedure is:

m = (1/2) [v_1(t) + v_2(t)]    (1)

where m is the mean, v_1(t) is the upper envelope and v_2(t) is the lower envelope.

s(t) - m = h    (2)

where h is treated as the new s(t), and the above operation is repeated until h meets the IMF conditions.
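A compact sketch of this sifting loop is given below; linear envelopes via `np.interp` stand in for the cubic-spline envelopes normally used in EMD, and the stopping rules are simplified, so it is an illustration of eqs. (1)-(2) rather than a production implementation:

```python
import numpy as np

def sift_once(s):
    """One sifting pass: build upper/lower envelopes through the local
    extrema and subtract their mean, eqs (1)-(2)."""
    idx = np.arange(len(s))
    maxima = [i for i in range(1, len(s) - 1) if s[i] >= s[i-1] and s[i] >= s[i+1]]
    minima = [i for i in range(1, len(s) - 1) if s[i] <= s[i-1] and s[i] <= s[i+1]]
    if len(maxima) < 2 or len(minima) < 2:
        return s, True                        # residual is monotonic: stop
    v1 = np.interp(idx, maxima, s[maxima])    # upper envelope
    v2 = np.interp(idx, minima, s[minima])    # lower envelope
    m = 0.5 * (v1 + v2)                       # eq (1)
    return s - m, False                       # eq (2): h = s - m

def emd(s, max_imfs=7, n_sift=10):
    """Decompose s into IMFs c_1..c_n and a residual r."""
    imfs, r = [], s.astype(float)
    for _ in range(max_imfs):
        h, mono = r, False
        for _ in range(n_sift):
            h, mono = sift_once(h)
            if mono:
                break
        if mono:
            break
        imfs.append(h)
        r = r - h                             # remove the extracted IMF
    return imfs, r                            # s = sum(imfs) + r (completeness)
```

Because each IMF is subtracted from the running residual, the sum of the IMFs plus the final residual reproduces the original signal, which is the completeness property used below.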


c_1 = h    (3)

and c_1 is taken as the first complete IMF. Then

s(t) - c_1 = r    (4)

where r is treated as the new s(t); repeating the above process yields c_2, c_3, .... The decomposition stops when r(t) is a monotonic trend or |r(t)| is very small. Then

s(t) = \sum_{i=1}^{n} c_i + r    (5)

so the original signal is decomposed into n IMFs c_1, c_2, ..., c_n and a residual component r. Equation (5) shows that EMD is a complete decomposition.

B. Hilbert Transform (HT)
For a real signal X(t), the Hilbert transform is

Y(t) = (1/\pi) P \int_{-\infty}^{+\infty} X(\tau) / (t - \tau) \, d\tau    (6)

and the inverse transform is

X(t) = -(1/\pi) P \int_{-\infty}^{+\infty} Y(\tau) / (t - \tau) \, d\tau    (7)

The resulting analytic signal is

Z(t) = X(t) + iY(t) = a(t) e^{i\theta(t)}    (8)

a(t) = [X(t)^2 + Y(t)^2]^{1/2}    (9)

where a(t) is the instantaneous amplitude;

\theta(t) = \arctan( Y(t) / X(t) )    (10)

where \theta(t) is the phase; and

f(t) = (1/2\pi) \, d\theta(t)/dt    (11)

where f(t) is the instantaneous frequency.

C. Hilbert Spectrum
Applying the HT to every IMF gives

s(t) = \mathrm{Re} \sum_{i=1}^{n} a_i(t) e^{i\theta_i(t)} = \mathrm{Re} \sum_{i=1}^{n} a_i(t) e^{i \int w_i(t) \, dt}    (12)

where the residual function r is omitted and Re denotes the real part. Equation (12) is called the Hilbert spectrum, denoted by

H(w, t) = \mathrm{Re} \sum_{i=1}^{n} a_i(t) e^{i \int w_i(t) \, dt}    (13)

HHT analyzes a signal based on its own scale features, which avoids human factors, so it can characterize the signal's features as early as possible[14]. This paper attempts to use HHT to extract signal features.

Figure 5. Optical microscope observation: (a) before soaking; (b) after soaking.

D. Feature Extraction
In this paper, feature extraction and intelligent recognition take the current noise data as an example. The current noise signal measured while the stainless steel was immersed in the SRB medium was decomposed by EMD. The EMD diagrams of the current noise on the 4th day (a), 8th day (b), 15th day (c) and 27th day (d) of soaking are shown in Fig.6. As the figure shows, the number of IMFs is not fixed; after extensive validation it ranged from 7 to 9, so IMF components beyond the 7th layer were added to the residual component. Feature extraction was then performed on the IMF energies: the seven IMF components of each sample were Hilbert-transformed, and according to

E_i = \sum_t |a_i(t)|^2    (14)

where a_i(t) is the instantaneous amplitude of the i-th IMF after the Hilbert transform, the amplitude energy of every IMF component was obtained and used to form a feature vector[15]. Each sample gives one feature vector containing 7 amplitude energies; thus there were 4 feature vectors per day and 116 feature vectors at the end of the experiment.
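Equations (8)-(9) and (14) can be realized with an FFT-based analytic signal; the sketch below mirrors the standard construction (the same one used by scipy.signal.hilbert) and then sums the squared instantaneous amplitude of one IMF. Function names are ours:

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal Z(t) = x(t) + i*H[x](t), eq (8): zero the
    negative frequencies and double the positive ones."""
    N = len(x)
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = 1.0
    if N % 2 == 0:
        h[N // 2] = 1.0
        h[1:N // 2] = 2.0
    else:
        h[1:(N + 1) // 2] = 2.0
    return np.fft.ifft(X * h)

def imf_energy(imf):
    """Eq (14): amplitude energy of one IMF, E_i = sum_t |a_i(t)|^2,
    where a_i(t) is the instantaneous amplitude from eq (9)."""
    a = np.abs(analytic_signal(imf))
    return float(np.sum(a ** 2))
```

Applying `imf_energy` to each of the seven retained IMF components of a sample yields the 7-dimensional feature vector described above.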



(a) EMD on the 4th day



(d) EMD on the 27th day

Figure 6. EMD diagrams when stainless steel was immersed in the culture medium containing SRB for different days. [Plot data removed; each panel shows the signal, the IMF components and the residual over 2000 sampling points.]


VI. IDENTIFICATION BASED ON BP NETWORK


(b) EMD on the 8th day



A. BP Network

The error back-propagation network (BP network), also known as the multilayer feed-forward network, is a feed-forward network without feedback, consisting of an input layer, a hidden layer and an output layer [16]. A BP network transforms the input-layer vector into the output-layer vector through the hidden layer, achieving a mapping from the input space to the output space. This mapping is associated with the weights of the network, and the BP network's task is to find suitable weights to achieve the desired mapping [17]. BP networks have self-learning ability and generalization ability, and are currently among the most widely used neural networks [18].
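The BP network described in this section, together with the training rules derived in the following subsections (Eqs. (15)-(35)), can be condensed into a minimal pure-Python sketch. The layer sizes, learning rate η, momentum factor α and the toy AND-gate data are illustrative choices, and the threshold (bias) updates follow the same error signals as the weights, which the text leaves implicit:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class BPNetwork:
    """Three-layer feed-forward network trained by error back-propagation
    with a momentum term (cf. the weight-update rules of this section)."""

    def __init__(self, m, n, l, eta=0.5, alpha=0.5, seed=0):
        rnd = random.Random(seed)
        self.eta, self.alpha = eta, alpha
        self.w_ij = [[rnd.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(m)]
        self.w_jk = [[rnd.uniform(-0.5, 0.5) for _ in range(l)] for _ in range(n)]
        self.th_j = [0.0] * n                       # hidden-layer thresholds
        self.th_k = [0.0] * l                       # output-layer thresholds
        self.dw_ij = [[0.0] * n for _ in range(m)]  # previous steps (momentum memory)
        self.dw_jk = [[0.0] * l for _ in range(n)]

    def forward(self, x):
        # hidden layer a_j = f(sum_i w_ij a_i + th_j); output y_k = f(sum_j w_jk a_j + th_k)
        h = [sigmoid(sum(self.w_ij[i][j] * x[i] for i in range(len(x))) + self.th_j[j])
             for j in range(len(self.th_j))]
        y = [sigmoid(sum(self.w_jk[j][k] * h[j] for j in range(len(h))) + self.th_k[k])
             for k in range(len(self.th_k))]
        return h, y

    def train_one(self, x, target):
        h, y = self.forward(x)
        # output-layer error signal: delta_k = y_k (1 - y_k)(y_pk - y_k)
        dk = [y[k] * (1.0 - y[k]) * (target[k] - y[k]) for k in range(len(y))]
        # hidden-layer error signal: delta_j = a_j (1 - a_j) sum_k delta_k w_jk
        dj = [h[j] * (1.0 - h[j]) * sum(dk[k] * self.w_jk[j][k] for k in range(len(dk)))
              for j in range(len(h))]
        # gradient step plus momentum (the previous step scaled by alpha)
        for j in range(len(h)):
            for k in range(len(dk)):
                step = self.eta * dk[k] * h[j] + self.alpha * self.dw_jk[j][k]
                self.w_jk[j][k] += step
                self.dw_jk[j][k] = step
        for i in range(len(x)):
            for j in range(len(dj)):
                step = self.eta * dj[j] * x[i] + self.alpha * self.dw_ij[i][j]
                self.w_ij[i][j] += step
                self.dw_ij[i][j] = step
        # threshold (bias) updates reuse the same error signals
        for k in range(len(dk)):
            self.th_k[k] += self.eta * dk[k]
        for j in range(len(dj)):
            self.th_j[j] += self.eta * dj[j]
        return 0.5 * sum((target[k] - y[k]) ** 2 for k in range(len(y)))

# Toy run: learn the AND function with a 2-3-1 network.
data = [([0, 0], [0]), ([0, 1], [0]), ([1, 0], [0]), ([1, 1], [1])]
net = BPNetwork(2, 3, 1)
first = sum(net.train_one(x, t) for x, t in data)
err = first
for _ in range(2000):
    err = sum(net.train_one(x, t) for x, t in data)
# the total error should decrease as training proceeds
```

The same class, sized 7-N-4, matches the identification experiment of this paper (7 IMF energy inputs, 4 corrosion-type outputs).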
The three-layer BP network is shown in Fig. 7. There are M nodes in the input layer and L nodes in the output layer; the single hidden layer has N nodes, and in general N > M > L. Let the output of the i-th input-layer node be a_i (i = 1, 2, ..., M), the output of the j-th hidden-layer node be a_j (j = 1, 2, ..., N), and the output of the k-th output-layer node be y_k (k = 1, 2, ..., L). The output vector of the neural network is y_m, and the desired output vector of the network is y_p.


(c) EMD on the 15th day

B. The Input-Output Relationship of Neural Nodes in the Network Layers

The input of the i-th node in the input layer is

  net_i = x_i + θ_i    (15)


where x_i (i = 1, 2, ..., M) is the input of the neural network and θ_i is the threshold of the i-th node. The corresponding output is

  a_i = f(net_i) = 1/(1 + exp(−net_i)) = 1/(1 + exp(−x_i − θ_i))    (16)

Figure 7. BP neural network. [Diagram removed; it shows the inputs x_1, x_2, ..., x_n, one hidden layer, the network outputs y_p1, ..., and the desired outputs y_m1, ....]

In the learning of the BP network, the learning of the nonlinearity is mainly completed by the hidden layer and the output layer. Generally,

  a_i = x_i    (17)

The input of the j-th node in the hidden layer is

  net_j = Σ_{i=1}^{M} w_ij a_i + θ_j    (18)

where w_ij are the weights of the hidden layer and θ_j is the threshold of the j-th node. The corresponding output is

  a_j = f(net_j) = 1/(1 + exp(−net_j)) = 1/(1 + exp(−Σ_{i=1}^{M} w_ij a_i − θ_j))    (19)

The input of the k-th node in the output layer is

  net_k = Σ_{j=1}^{N} w_jk a_j + θ_k    (20)

where w_jk are the weights of the output layer and θ_k is the threshold of the k-th node. The corresponding output is

  y_k = f(net_k) = 1/(1 + exp(−net_k)) = 1/(1 + exp(−Σ_{j=1}^{N} w_jk a_j − θ_k))    (21)

C. The Adjustment Rules of Weights in the BP Network

The quadratic error function of the input-output mode for each sample is

  E_p = (1/2) Σ_{k=1}^{L} (y_pk − a_pk)²    (22)

The error cost function of the system is

  E = Σ_{p=1}^{P} E_p = (1/2) Σ_{p=1}^{P} Σ_{k=1}^{L} (y_pk − a_pk)²    (23)

where P is the number of sample modes and L is the number of output nodes of the network. The question is how to adjust the connection weights so that the error cost function E is minimized.

(1) When calculating the nodes of the output layer, with a_pk = y_k, the network training rule makes E descend by gradient in each training cycle, and the weight correction formula is

  Δw_jk = −η ∂E_p/∂w_jk = −η ∂E/∂w_jk    (24)

For simplicity, the subscript of E_p is omitted; net_k refers to the network input of the k-th node in the output layer, and η is the search step size along the gradient, 0 < η < 1. Then

  ∂E/∂w_jk = (∂E/∂net_k)(∂net_k/∂w_jk) = (∂E/∂net_k) a_j    (25)

Define the error signal of back-propagation in the output layer as

  δ_k = −∂E/∂net_k = −(∂E/∂y_k)(∂y_k/∂net_k) = (y_pk − y_k) f′(net_k)    (26)

Taking the derivative of (21) on both sides gives

  f′(net_k) = f(net_k)(1 − f(net_k)) = y_k(1 − y_k)    (27)

Substituting Equation (27) into (26), we obtain

  δ_k = y_k(1 − y_k)(y_pk − y_k),  k = 1, 2, ..., L    (28)

(2) When calculating the hidden-layer nodes, the weight correction formula is

  Δw_ij = −η ∂E_p/∂w_ij = −η ∂E/∂w_ij    (29)

For simplicity, the subscript of E_p is again omitted; then

  ∂E/∂w_ij = (∂E/∂net_j)(∂net_j/∂w_ij) = (∂E/∂net_j) a_i    (30)

Define the error signal of back-propagation in the hidden layer as



  δ_j = −∂E/∂net_j = −(∂E/∂a_j)(∂a_j/∂net_j) = −(∂E/∂a_j) f′(net_j)    (31)

where

  −∂E/∂a_j = −Σ_{k=1}^{L} (∂E/∂net_k)(∂net_k/∂a_j) = −Σ_{k=1}^{L} (∂E/∂net_k) w_jk = Σ_{k=1}^{L} δ_k w_jk    (32)

and f′(net_j) = a_j(1 − a_j), so the error signal of back-propagation in the hidden layer is

  δ_j = a_j(1 − a_j) Σ_{k=1}^{L} δ_k w_jk    (33)

In order to improve the learning rate, the weight correction formulas of the output layer and the hidden layer are coupled with a momentum term. The weight correction formulas in the hidden layer and in the output layer then become (k here indexes the training iteration)

  w_ij(k+1) = w_ij(k) + η_j δ_j a_i + α_j (w_ij(k) − w_ij(k−1))    (34)

  w_jk(k+1) = w_jk(k) + η_k δ_k a_j + α_k (w_jk(k) − w_jk(k−1))    (35)

where η_j, η_k are the coefficients of the learning rate, i.e. the search step sizes of each layer along the gradient, and α_j, α_k are factors determining the impact of past weight changes on the current weight change, also known as memory factors.

D. The Training Steps of the BP Neural Network in the Identification of the System Model

(1) Set the initial values of the weights and thresholds: w_jk, w_ij and θ_j are small random numbers. The error cost function bound ε is assigned, and the number of cycles is R.
(2) Provide the learning material for training: the input matrix x_ki (k = 1, 2, ..., R; i = 1, 2, ..., M); the target output y_pk is obtained through the reference model, and y_k is obtained from the neural network. Then perform the following steps (3)-(5) for each k.
(3) Calculate the network output according to (21), and calculate the state a_j of the hidden units according to (19).
(4) Calculate the error value of the output layer according to (28), and the error value of the hidden layer according to (33).
(5) Amend the weight values of the hidden layer and the output layer according to (34) and (35), respectively.
(6) After each training pass, determine whether the accuracy requirement is met, i.e. whether the error cost function (22) satisfies E ≤ ε. If the requirement is met, go to step (7); otherwise determine whether the number of cycles has reached k = R. If the number of cycles equals R, go to step (7); otherwise go to step (2), read another set of samples, and continue training the network.
(7) The end.

The BP network turns the input-output problem into a nonlinear optimization problem. The number of adjustable parameters of the optimization problem increases as hidden nodes are added, so a more accurate solution can be obtained.

E. Identification Results

The number of feature vectors was 116: there were 4 feature vectors per day, with seven energy values in each vector. 100 randomly extracted feature vectors served as training samples, and the remaining 16 vectors served as prediction samples. The four output categories of the trained network corresponded to the four types of corrosion. The input-layer dimension of the BP network was therefore 7, and the output-layer dimension was 4. The corrosion results after BP intelligent recognition are shown in Table III, and the correct rates in Table IV.

VII. CONCLUSION

(1) The electrochemical noise data were analyzed in the time domain and the frequency domain. From the standard deviation, the noise resistance and the power spectral density curves, combined with optical-microscope observations, it could be concluded that: from the 1st to the 6th day of immersion in the bacterial culture medium, the surface of the stainless steel was in a passive state; from the 7th to the 12th day, the stainless steel was in the pitting-corrosion induction period; from the 13th to the 22nd day, the stainless steel was in the pitting period; and from the 23rd to the 29th day, the stainless steel underwent uniform corrosion. The corrosion of stainless steel induced by SRB was thus divided into four categories: passive, pitting induction period, pitting and uniform corrosion.

(2) A feasibility study of an intelligent corrosion-recognition method based on the Hilbert-Huang transform and a BP neural network was carried out. The results showed that, after removal of the DC drift, the current noise data underwent EMD and the Hilbert transform, and the amplitude energy of each IMF was extracted as a feature, giving 116 feature vectors. 100 random feature vectors were used to train the BP network, and the remaining 16 feature vectors were input to the trained network. The forecast results for passivation, the pitting induction period and pitting were good, while the prediction accuracy for uniform corrosion remains to be improved. The identification method in this paper provides a new research idea for monitoring MIC online and identifying corrosion types in real time in actual industrial cooling-water systems.

TABLE III. THEORETICAL TYPES OF CORROSION COMPARED WITH THE RESULTS OF BP IDENTIFICATION

Amplitude energies of the IMF components imf1-imf7 and the residual energy (all *E13) for the 16 prediction samples; the "theoretical type of corrosion" and "BP recognition result" columns did not survive text extraction:

  imf1    imf2    imf3    imf4    imf5    imf6    imf7    res.
  15500   12300   8810    1730    2050    805     11.4    2870
  18700   7900    7770    1600    340     171     0       510
  6430    14100   13100   2160    691     574     18.8    1280
  21900   18800   24800   4350    3740    2140    583     6470
  44700   8210    3760    5460    2030    1910    1420    5360
  45600   11000   8220    3940    1620    1930    1590    5130
  57200   4610    2060    2700    1800    1810    135     3750
  56400   5810    12400   7230    3790    3090    746     7620
  57600   8700    7980    3960    2820    607     196     3620
  57100   8010    16900   4440    794     1120    1280    3190
  59100   3490    2810    6360    877     419     68.2    1360
  1.93    11.1    4.72    1.37    0.724   0.135   0.0348  0.894
  2       2.14    0.624   0.43    0.251   0.126   0.0977  0.475
  3.68    1.19    5.47    5.39    6.89    247     5.29    7.19
  2.00    5.04    8.95    6.32    2.81    3.13    1.02    6.96
  1.88    1.04    1.5     1.41    5.89    1.97    4.86    8.35

REFERENCES

[1] J. J. Santana Rodríguez, F. J. Santana Hernández, and J. E. González González, "Comparative study of the behavior of AISI 304 SS in a natural seawater hopper, in sterile media and with SRB using electrochemical techniques and SEM," Corrosion Science, vol. 48, pp. 1265-1278, May 2006.
[2] S. Sh. Abedi, A. Abdolmaleki, and N. Adibi, "Failure analysis of SCC and SRB induced cracking of a transmission oil products pipeline," Engineering Failure Analysis, vol. 14, pp. 250-261, January 2007.
[3] H. A. Videla and L. K. Herrera, "Understanding microbial inhibition of corrosion. A comprehensive overview," International Biodeterioration & Biodegradation, vol. 63, pp. 896-900, October 2009.
[4] J. H. Wu, G. Z. Liu, and H. Yu, "Electrochemical methods for study of microbiologically influenced corrosion in marine environment," Corrosion and Protection, vol. 20, pp. 231-237, May 1999.
[5] J. Q. Zhang, Z. Zhang, C. N. Cao, J. M. Wang, and S. A. Cheng, "Analysis and application of electrochemical noise technology: application of electrochemical noise," Journal of Chinese Society for Corrosion and Protection, vol. 22, pp. 250-261, 2002.
[6] Y. B. Zhou, L. Y. Chao, and Y. T. Li, "Corrosion monitoring technology status and development trend," Marine Science, vol. 29, pp. 77-80, March 2005.
[7] R. Zhao, W. F. Deng, and S. Z. Song, "Electrochemical noise detection on alkaline corrosion of weld zone of 304 stainless steel," Journal of Chemical Industry and Engineering, vol. 59, pp. 1216-1222, May 2008.
[8] N. Zaveri, R. Sun, and N. Zufelt, "Evaluation of microbially influenced corrosion with electrochemical noise analysis and signal processing," Electrochimica Acta, vol. 52, no. 19, pp. 5795-5807, May 2007.
[9] A. Padilla-Viveros, E. Garcia-Ochoa, and D. Alazard, "Comparative electrochemical noise study of the corrosion process of carbon steel by the sulfate-reducing bacterium Desulfovibrio alaskensis under nutritionally rich and oligotrophic culture conditions," Electrochimica Acta, vol. 51, pp. 3841-3847, May 2006.
[10] U. Bertocci, F. Huet, and R. P. Nogueira, "Drift removal procedures in the analysis of electrochemical noise," Corrosion, vol. 58, pp. 337-347, April 2002.
[11] Y. B. Qiu, J. Y. Huang, and X. P. Guo, "Polynomial fitting to eliminate the DC drift of electrochemical noise," Huazhong University of Science and Technology, vol. 33, pp. 39-42, October 2005.
[12] D. J. De, J. S. Cheng, and Y. Yang, Hilbert-Huang Transform of Mechanical Fault Diagnosis. Beijing: Science Press, 2006.
[13] L. W. Liu, C. H. Liu, and C. H. Jing, "Novel EMD algorithm and its application," Journal of System Simulation, vol. 19, pp. 446-447, February 2007.
[14] N. E. Huang, Z. Shen, S. R. Long, H. H. Shih, N. C. Yen, C. C. Tung, et al., "The empirical mode decomposition and Hilbert spectrum for nonlinear and non-stationary time series analysis," Proceedings of the Royal Society of London, London, March 1998.
[15] Y. L. Zhou, Q. Wang, B. Sun, and Y. G. Zhang, "Gas-liquid two-phase flow pattern identification method based on Hilbert-Huang transform and Elman neural network," China Electrical Engineering, vol. 27, pp. 50-56, April 2007.
[16] A. J. Zou and Y. N. Zhang, Basis Function Neural Networks and Applications. Guangzhou: Zhongshan University Press, 2009.

TABLE IV. THE CORRECT PREDICTION RATE OF EACH CORROSION CATEGORY

  Types of corrosion          Accuracy
  Passive                     1.0000
  Pitting induction period    1.0000
  Pitting                     1.0000
  Uniform corrosion           0.8000

ACKNOWLEDGMENT

This work was supported by both the National Basic Research Program of China (973 Program, No. 2007CB206904) and the Natural Science Foundation of China (No. 51176028).


[17] Y. Shi, L. Q. Han, and X. Q. Lian, Design Method and Case Analysis of Neural Network. Beijing: Beijing University of Posts and Telecommunications Press, 2009.
[18] Y. B. Hou, J. Y. Du, and M. Wang, Neural Network. Xi'an: Xi'an University of Electronic Science and Technology Press, 2007.

Hong Men was born in Jilin, China, in 1973. He received the Bachelor of Electronics and Informational System from Northeast Normal University, Changchun, China, in 1996, the Master of Electric Power System and Automation from Northeast Institute of Electric Power Engineering, Jilin, China, in 2002, and the Doctor of Biomedical Engineering from Zhejiang University, Hangzhou, China, in 2005. He is now an associate professor at Northeast Dianli University. His current research areas are mathematics, microbiologically influenced corrosion, chemical sensors and pattern recognition techniques.

Jing Zhang was born in Shanxi, China, in 1985. She received the Bachelor of Biomedical Engineering from Jilin Medical College, China. She is now a graduate student in the School of Automation Engineering, Northeast Dianli University, China. Her research interests include mathematics, automation and microbiologically influenced corrosion.

Lihua Zhang was born in Inner Mongolia, China, in 1986. She received the Bachelor Degree of Biomedical Engineering from Jilin Medical College, China, in 2009. She is now a graduate student in the School of Automation Engineering, Northeast Dianli University, China. Her research interests include automation and microbiologically influenced corrosion.


Research on Diagnosis of AC Engine Wear Fault Based on Support Vector Machine and Information Fusion
Lei Zhang
Department of Electrical & Electronic Engineering, Henan University of Urban Construction, Pingdingshan 467036, China. Email: hncjzhanglei@yeah.net

Yanfei Dong
Department of Electrical & Electronic Engineering, Henan University of Urban Construction, Pingdingshan 467036, China. Email: dong yanzi@hncj.edu.cn

Abstract: Support Vector Machine (SVM) and information fusion technology based on D-S evidence theory are used to diagnose wear faults of AC engines. Firstly, based on a number of frequently used oil-sample analysis methods for detecting engine wear faults, corresponding SVM sub-classifiers are established. Each classifier reflects the mapping relation between fault symptoms and fault types and produces the result for a single diagnosis item. Then, D-S evidence theory is used to fuse the single-item diagnosis results so as to reach the final fault diagnosis. The method is tested on an example of AC engine wear-fault diagnosis. The result shows that, in comparison with conventional methods, the combination of SVM and information fusion technology is fast and effective, and suitable for the diagnosis of AC engine wear faults.

Index Terms: Support Vector Machine (SVM), AC engine, fault diagnosis, information fusion

I. INTRODUCTION

As an effective tool for solving nonlinear problems, the artificial neural network is extensively applied in the field of mechanical fault diagnosis [1]. It is another artificial intelligence (AI) technology applied to the diagnosis of engine wear faults, following the application of expert systems. However, neural network technology is merely a heuristic technique, reliant on experience and lacking a solid theoretical basis, and its learning process uses the empirical risk minimization principle, making it prone to local minima and weak in generalization. Moreover, the complexity of its algorithm is considerably subject to the complexity of the network structure and of the samples. These weak points have hindered its further application and development in smart fault diagnosis. In recent years, the Support Vector Machine (SVM) proposed by Vapnik has received extensive attention [2]. It is a new machine learning approach based on statistical learning theory and the structural risk minimization principle, and it is designed for small sample sizes. It

features a brief mathematical form, an intuitive geometric interpretation, and excellent learning performance and generalization ability. It can overcome the above-mentioned weaknesses of neural networks and has become a popular new research field in the realm of machine learning. Additionally, it has found successful application in such realms as pattern recognition, regression analysis and function approximation [3][4][5]. On the other hand, in practical fault diagnosis, oil-sample analysis has become the major method for diagnosing wear faults of AC engines. It includes ferrography analysis, spectral analysis, grain-counting analysis, and physical and chemical analysis. However, any single method is invariably subject to certain limitations in test accuracy, hence the unsatisfactory diagnostic accuracy. If the diagnosis information of these analysis methods can be used to the greatest extent possible, with fusion diagnosis making the results complement each other, the accuracy of fault diagnosis can be improved. SVM boasts rapid learning speed and strong generalization, while D-S evidence theory boasts a fairly good ability to process uncertain information [6]. Therefore, the present article proposes a fusion diagnosis technology for engine wear faults: firstly, SVM classifiers are used to realize diagnosis with each single method, and then the diagnosis results are fused based on D-S evidence theory so as to improve the accuracy and reliability of diagnosis.

II. SVM
SVM is a machine learning algorithm proposed by Vapnik in the 1990s. With its good theoretical background and the structural risk minimization principle, SVM provides a brand-new direction for machine learning. At first, SVM was used for solving pattern recognition problems. The subsets of training data chosen for the purpose of discovering decision rules with generalization ability are called support vectors. Best support vector

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2292-2297


separation is equivalent to separation of all the data. SVM evolves from the optimal hyperplane in the linearly separable case, as shown in Figure 1. By fixing the empirical risk, minimizing the confidence interval and mapping the input space to a high-dimension inner-product space, SVM effectively avoids the curse of dimensionality. SVM obtains the global optimum by solving a quadratic programming problem with linear constraints, hence there are no local minima, and fast algorithms ensure the rate of convergence. Typical SVMs for classification are shown in Figure 2.

Figure 1 Optimal Hyperplane. [Plot removed; it shows the optimal separating hyperplane with the maximum border between the two classes.]

Figure 2 SVM Structure Network Diagram. [Diagram removed; it shows the input vector (x_1, ..., x_m), the nonlinear mapping, the inner products k(x, x_i), the weight values (Lagrange multipliers) and the decision rule y = Σ_i y_i a_i K(x, x_i) + b.]

A. Classification Algorithm of SVM

For a given sample set {(x_i, y_i)}, i = 1, ..., N, with x_i ∈ R^m and y_i ∈ {±1}, SVM first uses a nonlinear mapping φ: R^m → R^n to map the input vectors into a high-dimension space. When the data are separable in the high-dimension space, SVM constructs a maximum-interval separating classification hyperplane (w · φ(x)) + b in R^n. It can be proven that w can be written as a linear combination of the φ(x_i): w = Σ_{i=1}^{N} a_i y_i φ(x_i), where the a_i are Lagrange multipliers, obtained by solving the following quadratic programming problem:

  max_a Σ_{i=1}^{N} a_i − (1/2) Σ_{i,j=1}^{N} a_i a_j y_i y_j (φ(x_i) · φ(x_j))
  s.t. Σ_{i=1}^{N} a_i y_i = 0,  a_i ≥ 0    (1)

It is known from the theory of Reproducing Kernel Hilbert Spaces that, for the inner product φ(x_i) · φ(x_j) in the high-dimension space, a kernel function meeting Mercer's conditions can always be found in the input space such that K(x_i, x_j) = φ(x_i) · φ(x_j); therefore, there is no need to know the specific form of the nonlinear mapping. Formula (1) can be adapted to the following form:

  max_a Σ_{i=1}^{N} a_i − (1/2) Σ_{i,j=1}^{N} a_i a_j y_i y_j K(x_i, x_j)
  s.t. Σ_{i=1}^{N} a_i y_i = 0,  a_i ≥ 0    (2)

Frequently used kernel functions include the polynomial kernel K(x, y) = [1 + (x · y)]^p, p = 1, ..., n, the Gaussian radial basis kernel K(x, y) = exp(−‖x − y‖² / 2σ²), and the two-layer feed-forward neural network kernel K(x, y) = tanh[k(x · y)].

For the points on the classification border, the corresponding a_i > 0; these are called support vector points. The number of support vector points is generally smaller than the sample size, and the support vector points are closely related to the generalization ability of the classifier. The threshold value b is obtained using the Karush-Kuhn-Tucker (KKT) conditions. The classification decision function obtained is therefore

  y = sgn( Σ_{i=1}^{N} y_i a_i K(x, x_i) + b )    (3)

When the data cannot be separated without error in the high-dimension space, SVM introduces non-negative slack variables ξ_i ≥ 0 into the constraints and solves the following quadratic programming problem to minimize the misclassification error:

  min_{w,b,ξ} (1/2) wᵀw + C Σ_{i=1}^{N} ξ_i
  s.t. y_i (wᵀφ(x_i) + b) ≥ 1 − ξ_i,  i = 1, ..., N    (4)

Here the slack variable ξ = (ξ_1, ..., ξ_N)ᵀ reflects the distance between the actual label y_i and the SVM output, and C is a margin factor that reflects the balance between the first and the second term. Solving Formula (4) can be converted to:

  max_a Σ_{i=1}^{N} a_i − (1/2) Σ_{i,j=1}^{N} a_i a_j y_i y_j K(x_i, x_j)
  s.t. Σ_{i=1}^{N} a_i y_i = 0,  0 ≤ a_i ≤ C    (5)

To solve Formula (5), the KKT conditions can likewise be used to obtain the threshold value and the decision function (3).

B. Algorithm of Multi-class SVM Classification

The basic principle of multi-class SVM is to convert the multi-class problem into a combination of two-class problems. In the present article, pairwise classification is adopted, i.e. for an N-class problem, N(N − 1)/2 SVM sub-classifiers are built, and each SVM is trained to separate two classes k and l using the following algorithm:

  min_{w^{kl}, b^{kl}, ξ^{kl}} (1/2) (w^{kl})ᵀ w^{kl} + C Σ_i ξ_i^{kl}
  s.t. (w^{kl})ᵀ φ(x_i) + b^{kl} ≥ 1 − ξ_i^{kl},  x_i ∈ class k
       (w^{kl})ᵀ φ(x_i) + b^{kl} ≤ −1 + ξ_i^{kl},  x_i ∈ class l
       ξ_i^{kl} ≥ 0    (6)

After the multi-class classifier has been trained on the data samples, each new test sample is input to every trained SVM sub-classifier for classification recognition, obtaining one class each time. Statistics are then made on the results of the N(N − 1)/2 classifications, and the class receiving the most votes becomes the class of the new sample. A multi-class classifier constructed with this approach features a small training size for each single SVM, easy realization in programming, accurate results and balanced training data.

III. FUSION DIAGNOSIS BASED ON D-S EVIDENCE THEORY

A. Basis of D-S Evidence Theory

D-S theory uses a recognition framework Θ to describe the set of all elements constituting the assumed space; it is made up of mutually exclusive and exhaustive elements. It defines a set function m: 2^Θ → [0, 1] meeting: (1) m(∅) = 0; (2) Σ_{A⊆Θ} m(A) = 1. m is called the basic probability assignment (BPA) function on the recognition framework Θ, and m(A) shows the extent of exact confidence of the evidence in A. For any proposition set, D-S theory also proposes the concept of the belief function:

  Bel(A) = Σ_{B⊆A} m(B),  ∀A ⊆ Θ    (7)

That is to say, the belief in A is the sum of the basic probabilities of all its subsets. The Bel function is also called the lower limit function and reflects full confidence in A. However, for the confidence in one specific proposition, description with the belief function alone is not all-around enough, because Bel(A) cannot reflect the extent of doubt about A. Therefore, a plausibility function is introduced, defined as Pl(A) = 1 − Bel(Ā), ∀A ⊆ Θ. The Pl function is also called the upper limit function or irrefutable function, reflecting the confidence that proposition A is not false; intuitively, it is an uncertainty metric of whether proposition A could possibly be true. Obviously, Pl(A) ≥ Bel(A) for all A ⊆ Θ.

B. D-S Combination Rule

For the same recognition framework, different evidence yields different BPA functions that are independent of each other. Suppose m1 and m2 are two basic probability assignment functions on the same recognition framework; m1 and m2 can be combined into a new BPA function m1 ⊕ m2. The corresponding belief function is expressed as Bel1 ⊕ Bel2, and according to the definition of the belief function, Bel1 ⊕ Bel2 can be calculated using m1 ⊕ m2. Define

  m(A) = (1/N) Σ_{B∩C=A} m1(B) m2(C)

as the BPA function of m1 ⊕ m2, where m(A) is the mass function on Θ and

  N = Σ_{B∩C≠∅} m1(B) m2(C) = 1 − Σ_{B∩C=∅} m1(B) m2(C) > 0

When N = 0, m1 ⊕ m2(A) makes no sense, representing that the two BPA functions m1 and m2 are in full conflict and cannot be combined.

IV. THE COMBINATION OF SVM AND D-S EVIDENCE THEORY

D-S evidence theory is an important method in the field of multi-sensor information fusion, but its advantage is not fully utilized because the BPA is difficult to obtain. SVM is a new learning algorithm based on statistical learning theory, but its hard decision output does not adequately facilitate multi-sensor information fusion. In order to apply SVM to information fusion, a two-class SVM with BPA output is proposed: by analyzing the essence and deficiency of Platt's model, the BPA is obtained by using the lower bound of the SVM precision to weight Platt's probability model, which achieves the combination of SVM and evidence theory in information fusion. The standard SVM output lies in {−1, 1}; it is a crisp judgment rather than a probability, and so cannot directly serve as the BPA of evidence theory. The SVM model for fault diagnosis is a multi-class classification problem, and the "one-versus-one" multi-class SVM is an effective way to solve it, using voting to judge the fault mode. For each of the L sensors, with the recognition framework Θ = {F_s}, s = 1, 2, ..., t, each "one-versus-one" multi-class SVM output corresponds to one piece of evidence; the ratio of the number of votes V(F_s) received by each class to the total number of votes t(t − 1)/2 can be taken as a probability


output; on this basis, the basic probability assignment function is constructed as

  m_l(F_s) = 2V(F_s) / (t(t − 1)),  F_s ∈ Θ, s = 1, 2, ..., t    (8)
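Equation (8) maps one-versus-one vote counts to a BPA; a minimal sketch (the fault names and vote counts below are invented for illustration):

```python
def vote_bpa(votes):
    """Eq. (8): m(F_s) = 2 V(F_s) / (t (t - 1)), where t is the number of
    fault classes and V(F_s) the votes class F_s received from the
    t(t-1)/2 pairwise SVM sub-classifiers. `votes` maps every class to its
    vote count, so t = len(votes)."""
    t = len(votes)
    total = t * (t - 1) / 2.0
    return {fault: v / total for fault, v in votes.items()}

# Hypothetical vote tallies over t = 4 fault classes (6 pairwise duels).
m = vote_bpa({"F1": 3, "F2": 2, "F3": 1, "F4": 0})
```

The masses sum to 1 because every pairwise duel casts exactly one vote.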

If a single data source is used for fault diagnosis, the sample point x is assigned to the fault with the largest probability; if evidence-theory information fusion of the data is used for fault diagnosis, each "one-versus-one" multi-class SVM output is one piece of evidence.

V. MODEL OF THE FAULT DIAGNOSIS SYSTEM

With respect to the four basic oil-sample analysis technologies for AC engine wear faults, namely ferrography analysis, spectral analysis, grain-counting analysis and physical and chemical analysis, the present article firstly takes some frequently occurring fault sets as the fault domain [7][8][9], and then, for each oil-sample analysis method, establishes support vector sub-classifiers corresponding to the fault sets, thus realizing the mapping between each fault symptom and fault type and finishing the preliminary diagnosis of fault types. After information fusion, the preliminary information is sent to a classifier (SVC) for fault classification, i.e. the final decision-diagnosis classification. As the various oil-sample analysis methods produce diagnosis data with different ranges and dimensions, direct fusion would be inconvenient. Therefore, the original symptom data are pretreated and converted into Boolean values 0 and 1. The basis of the conversion is to compare the original data obtained by the various methods against their corresponding standard limit values: if a datum is within the normal range, 0 is assigned; if not, 1 is assigned. The realization of decision information fusion is a process within the same recognition framework, in which different evidence bodies are combined into a new evidence body.
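The Boolean pretreatment described above can be sketched as follows (the readings and limit values are invented for illustration):

```python
def pretreat(values, limits):
    """Convert raw oil-sample readings into Boolean symptoms:
    0 if the reading is within its standard limit, 1 if it exceeds it."""
    return [0 if v <= lim else 1 for v, lim in zip(values, limits)]

# Hypothetical spectral readings (ppm) against hypothetical standard limits.
symptoms = pretreat([48.0, 3.1, 0.9], [40.0, 5.0, 2.0])  # -> [1, 0, 0]
```

Each sub-classifier then receives such a 0/1 symptom vector as its input, which removes the dimension mismatch between the four analysis methods.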
The specific fusion method is as follows. Suppose the symptom domain is S = {s | s = 1, 2, ..., q} and the recognition framework is Θ = {θ_j | j = 1, 2, ..., p}; for symptom domain s, local information fusion gives the result for mode j: m_s(θ_j). Suppose the confidence factor of symptom domain s for local diagnosis is R(s), R(s) ∈ (0, 1); the BPA function is then defined as mass_s(θ_j) = m_s(θ_j) R(s), j = 1, 2, ..., p, and mass_s(Θ) = 1 − R(s). With that determined, the combination rules of D-S evidence theory can be used to make global information fusion, judge the probability of occurrence of the various faults, and eventually obtain the final diagnosis of the various fault patterns. In order to show the fault type directly, the final decision diagnosis uses fault classifiers. As the SVM is a two-class classifier, the classification of multiple fault types requires the construction of a multi-class classifier to enable fault diagnosis classification. The model of the AC engine wear-fault diagnosis system based on support vector machines and D-S evidence-theory fusion is shown in Figure 3.
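The discounting by R(s) and the global D-S combination can be sketched together. The masses and reliabilities below are invented for illustration; propositions are frozensets over the recognition framework:

```python
def discount(m, reliability, theta):
    """mass(theta_j) = m(theta_j) * R(s); the remainder 1 - R(s) is assigned
    to the whole framework Theta, expressing ignorance about this source."""
    out = {a: v * reliability for a, v in m.items()}
    out[theta] = out.get(theta, 0.0) + (1.0 - reliability)
    return out

def combine(m1, m2):
    """Dempster's rule: m(A) = (1/N) * sum over B∩C=A of m1(B)m2(C),
    where N is the total non-conflicting mass."""
    raw, n = {}, 0.0
    for b, p in m1.items():
        for c, q in m2.items():
            inter = b & c
            if inter:
                raw[inter] = raw.get(inter, 0.0) + p * q
                n += p * q
    if n == 0.0:
        raise ValueError("evidence in full conflict; cannot combine")
    return {a: v / n for a, v in raw.items()}

# Two hypothetical local diagnoses over the framework {F1, F2, F3},
# discounted by hypothetical confidence factors R(s) = 0.9 and 0.8.
theta = frozenset({"F1", "F2", "F3"})
m_ferro = discount({frozenset({"F1"}): 0.7, frozenset({"F2"}): 0.3}, 0.9, theta)
m_spect = discount({frozenset({"F1"}): 0.6, frozenset({"F3"}): 0.4}, 0.8, theta)
fused = combine(m_ferro, m_spect)
best = max(fused, key=fused.get)  # the most supported fault hypothesis
```

Both sources lean towards F1, so after fusion the mass concentrates on F1 while the conflicting F2/F3 evidence is normalized away.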

The diagnosis process using the model of the AC engine wear-fault diagnosis system based on support vector machines and D-S evidence-theory fusion is as follows: (1) Prepare the training set and testing set: use known diagnosis results as sample data to train each sub-classifier. (2) After fusing the outputs of the various sub-classifiers by D-S evidence theory, take the fusion result as the input of the fault classifier to train the decision rules of the SVM. (3) Use the testing set to test the diagnosis system.

Figure 3 Model of AC Engine Wear Fault Diagnosis System

VI. ANALYSIS OF A FUSION DIAGNOSIS INSTANCE

A. Diagnosis Instance

To verify the validity of the diagnosis method based on SVM and evidence-theory fusion, the present article takes 6 frequently occurring wear faults of AC engines as the fault domain of the diagnosis. The 6 wear faults are: bearing wear failure (F1), bearing fatigue failure (F2), gear fatigue overload (F3), gear stuck or scratched (F4), non-conforming lubricant pollution (F5) and non-conforming physical and chemical analysis result of lubricant (F6). In this way the recognition framework of the evidence theory, Θ = {F1, F2, F3, F4, F5, F6}, is built. Therefore, the number of output types of each of the four sub-classifiers built for the fault domain - the ferrography classifier, the spectral classifier, the grain-counting classifier and the physical-and-chemical classifier - is 6. For ferrography analysis, the input vector is the percentage of abrasive grains of the various types: globular abrasive grains in large number (SF1); layered abrasive grains in large number (SF2); fatigue abrasive grains in large number (SF3); cutting abrasive grains in large number (SF4); severely slippery abrasive grains in large number (SF5); red oxide abrasive grains in large number (SF6); black oxide abrasive grains in large number (SF7). Therefore, the input node number of the ferrography sub-classifier is 7. For the spectral sub-classifier,

2012 ACADEMY PUBLISHER

2296

JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012

choose the concentrations of the elements Fe, Cr, Ni, Mo, Cu, V, Zn, Al and Ti as the original data for spectral analysis (for other machinery, depending on the structure and material of the abrasion parts, the elements chosen may vary). After pretreatment, the spectral data are turned into: non-conforming Fe element concentration (SS1); non-conforming Cr element concentration (SS2); non-conforming Ni element concentration (SS3); non-conforming Mo element concentration (SS4); non-conforming V element concentration (SS5); non-conforming Cu element concentration (SS6); non-conforming Zn element concentration (SS7); non-conforming Al element concentration (SS8); non-conforming Ti element concentration (SS9). So the input node number of the spectral sub-classifier is 9. For the grain counting sub-classifier, as the number of grains at a specific size level cannot be mapped to a fault pattern of the engine, it is only possible to conclude whether the oil sample pollution is non-conforming, i.e. the classifier input vector is: non-conforming pollution (SC1); so for the grain counting sub-classifier, the input node number is 1. For the physical and chemical analysis sub-classifier, the input vector is: non-conforming kinematic viscosity (SP1); non-conforming impurity content (SP2); non-conforming other physical and chemical specifications (SP3). So the input node number can be set as 3. Afterwards, train all the sub-classifiers and fault diagnosis classifiers of the system. To verify the validity of the algorithm herein, an example is provided. Suppose the fault symptom data of ferrography analysis are {0,0,0,1,0,0,0}; the fault symptom data of spectral analysis are {0,1,0,0,0,0,0,0,0}; the fault symptom data of pollution analysis are {1}; and the fault symptom data of physical and chemical analysis are {0,0,1}. The single-item diagnosis results of each sub-classifier are shown in Table I.
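For concreteness, the worked example's binary symptom vectors can be written down directly; the variable names are illustrative, not from the paper:

```python
# Binary symptom vectors of the worked example (1 = symptom present / non-conforming).
ferrography = [0, 0, 0, 1, 0, 0, 0]        # SF1..SF7: cutting abrasive grains (SF4)
spectral    = [0, 1, 0, 0, 0, 0, 0, 0, 0]  # SS1..SS9: Cr concentration (SS2)
pollution   = [1]                           # SC1: oil pollution non-conforming
phys_chem   = [0, 0, 1]                     # SP1..SP3: other specifications (SP3)

# Input node counts of the four sub-classifiers, as stated in the text
assert [len(v) for v in (ferrography, spectral, pollution, phys_chem)] == [7, 9, 1, 3]
```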
TABLE I. SINGLE-ITEM DIAGNOSIS AND FUSION DIAGNOSIS RESULTS

(Columns: ferrography analysis, pollution analysis, spectral analysis, conflict degree and fusion result; rows: the six fault patterns F1-F6. The fused confidence of gear stuck or scratched (F4) is 0.6626.)

TABLE II. INSTANCE OF SMALL CONFLICT FUSION

                            Ferrography   Pollution   Spectral   Conflict   Fusion
                            analysis      analysis    analysis   degree     result
 Bearing wear failure       0.8           0.9         1          0.28       0.972
 Non-bearing-wear failure   0.2           0.1         0          0.28       0.028

TABLE III. INSTANCE OF LARGE CONFLICT FUSION

                               Ferrography   Pollution   Spectral   Conflict   Fusion
                               analysis      analysis    analysis   degree     result
 Gear stuck or scratched       0.8           0.1         1          0.92       0.6626
 Non-gear-stuck-or-scratched   0.2           0.9         0          0.92       0.3373

B. Analysis

(1) Influence of the Kernel Function and Its Parameters on Diagnosis

The article uses two types of kernel functions (the polynomial kernel and the Gaussian radial basis kernel) to compare the influence of different kernel functions on the sub SVM classifiers and the fault diagnosis classifier. K-fold cross validation and grid search are used to choose the SVM parameters, with the tolerance set to 0.02. The results show that the width parameter σ² of the Gaussian radial basis kernel produces better results in the range [1, 5]; when the polynomial degree d ≥ 3, the classification abilities of the polynomial kernel and the Gaussian radial basis kernel are close to each other, though the calculation speed of the latter is slightly faster (0.06). Therefore, the radial basis kernel is used herein.
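The k-fold cross-validation grid search used for parameter selection can be sketched as follows. The `train_eval` callback interface and the parameter grid are hypothetical stand-ins for the actual SVM training routine, which is not given in the paper:

```python
import itertools
import random

def k_fold_indices(n, k, seed=0):
    # Shuffle sample indices and split them into k roughly equal folds
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def grid_search(train_eval, data, labels, grid, k=5):
    """Pick the parameter combination with the best mean k-fold accuracy.
    `train_eval(train_idx, test_idx, params)` is assumed to fit a classifier
    on the training indices and return its accuracy on the test indices."""
    folds = k_fold_indices(len(data), k)
    best, best_acc = None, -1.0
    for combo in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), combo))
        accs = []
        for test in folds:
            train = [j for f in folds if f is not test for j in f]
            accs.append(train_eval(train, test, params))
        mean_acc = sum(accs) / k
        if mean_acc > best_acc:
            best, best_acc = params, mean_acc
    return best, best_acc
```

A grid such as `{"C": [1, 10], "sigma": [1, 2, 5]}` then enumerates all (C, σ) pairs and keeps the one with the highest cross-validated accuracy.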




(2) Comparison between SVM and Neural Network Models

To compare the diagnosis ability of the neural network and SVM, a BP neural network is used here to make a diagnosis on the test data. The structures of the BP networks are 7-12-6, 9-12-6, 1-12-6 and 3-12-6, and the error tolerance is 0.001. The results show that although the BP network can also find the wear fault, its confidence degree is only 0.57, lower than the confidence degree of the SVM diagnosis. This also shows that the generalization ability of the BP network is not as strong as that of SVM.

VII. CONCLUSION

(1) The present article applies SVM to the diagnosis of AC engine wear faults, and the results show that SVM features strong generalization ability on problems with small sample sizes, making it particularly applicable to the diagnosis of AC engine faults, where wear fault samples are difficult to obtain.

(2) Sub SVMs are first used to make a preliminary diagnosis; D-S evidence theory is then used to perform decision fusion on the diagnosis results. In this way, relatively weak diagnosis decisions support relatively strong diagnosis decisions more effectively, and such issues as the diversity and multiplicity of information classes in practical projects can be solved.

(3) The combination of SVM and information fusion technology based on D-S evidence theory can realize effective diagnosis of wear faults of AC engines.
Lei Zhang was born in Henan Province, China, in November 1979. He received his Ph.D. from the School of Electric Information Engineering, Jiangsu University. His research interests include data mining and complex networks. He is a senior lecturer in the


Department of Electrical & Electronic Engineering, Henan University of Urban Construction.


Yanfei Dong was born in Henan Province, China, in February 1976. She graduated from the School of Electric Information Engineering, Wuhan University. Her research interests include data mining and complex networks. She is an associate professor in the Department of Electrical &

Electronic Engineering, Henan University of Urban Construction.


Optimal Kernel Marginal Fisher Analysis for Face Recognition

Ziqiang Wang
Henan University of Technology, Zhengzhou, 450001, China Email: wzqagent@126.com

Xia Sun
Henan University of Technology, Zhengzhou, 450001, China Email: sunxiamail@126.com

Abstract—Nonlinear dimensionality reduction and face classifier selection are two key issues of face recognition. In this paper, an efficient face recognition algorithm named OKMFA is proposed. The core idea of the algorithm is as follows. First, the high-dimensional face images are mapped into a lower-dimensional discriminating feature space by using feature vector selection-based optimal kernel marginal Fisher analysis (KMFA); then the multiplicative update rule-based optimal SVM classifier is applied to recognize the different facial images. Extensive experimental results on two benchmark face databases demonstrate the effectiveness and efficiency of the proposed algorithm.

Index Terms—face recognition, kernel marginal Fisher analysis, support vector machine

I. INTRODUCTION

Face recognition (FR) aims to assist a human expert in determining the identity of a test face. FR has attracted the extensive attention of researchers for more than two decades due to its wide range of applications in many fields, such as human-computer interfaces, image and video content analysis, multimedia surveillance, and so on. However, captured face image data often lie in a high-dimensional space, with dimensionality ranging from several hundred to thousands. Thus, it is necessary and beneficial to transform the face image data from the original high-dimensional space to a low-dimensional one to alleviate the curse of dimensionality. In the low-dimensional feature space, traditional classification algorithms can be applied to recognize different face images. As a result, numerous face recognition algorithms have been proposed, and surveys of this area can be found in [1]. How to extract discriminating facial features and how to classify a new face image based on the extracted features are two key issues of all these face recognition algorithms. Therefore, this work also focuses
Manuscript received September 3, 2011; revised October 17, 2011; accepted October 28, 2011. Project number: 70701013. Corresponding author: Ziqiang Wang.

on the issues of feature extraction and classifier selection. Principal component analysis (PCA) and linear discriminant analysis (LDA) are two well-known feature extraction and dimensionality reduction methods for face recognition[2]. PCA is an orthogonal basis transformation in which the new basis is found by an eigen-decomposition of the covariance matrix of a normalized data set; it aims to choose a linear transformation for dimensionality reduction that maximizes the scatter of all projected samples. However, PCA is an unsupervised learning method and does not utilize class label information. Thus, features extracted by PCA are optimal for face representation and reconstruction, but not for discriminating one face from others. Unlike PCA, LDA is a supervised method; it aims to find the optimal discriminant vectors by maximizing the ratio of the between-class distance to the within-class distance, thus achieving maximum class discrimination. The discriminant vectors can be readily computed by applying eigen-decomposition to the scatter matrices. Due to its utilization of label information, LDA is experimentally reported to outperform PCA for face recognition when sufficient labeled face images are provided. Despite the success of LDA in many pattern classification tasks, it often suffers from the small sample size problem when dealing with high-dimensional face data. Moreover, both PCA and LDA are designed to discover only the global Euclidean structure, whereas the local manifold structure is ignored. As traditional linear methods, they thus fail to discover the underlying nonlinear structure. One way to handle nonlinear face structure is provided by kernel theory[3]. Kernel-based dimensionality reduction methods have been extensively investigated in the literature.
For example, PCA is generalized to its kernel version, named as KPCA; kernel discriminant analysis (KDA) utilizes the kernel trick to extend the LDA for handling linearly inseparable classification problems. Although both KPCA and KDA have achieved great success in describing the complexity of face images, they fail to discover the intrinsic structure of face images if they are lying on or close to a submanifold of the ambient space. In fact, in many real-world classifications such as face

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2298-2305


recognition, the local manifold structure is more important than the global Euclidean structure[4]. To discover the intrinsic manifold structure of the face data, nonlinear dimensionality reduction algorithms such as ISOMAP[5], locally linear embedding (LLE)[6] and the Laplacian eigenmap (LE)[7] were recently developed. ISOMAP, a variant of MDS, aims to globally preserve the geodesic distances between any pair of data points. The goal of LLE is to discover the nonlinear structure via locally linear reconstructions. LE restates the nonlinear mapping problem as an embedding problem for the vertices of a graph and uses the graph Laplacian to derive a smooth mapping. Although all of these algorithms can discover the intrinsic manifold structure, they are defined only on the training data points, and the issue of how to map new test data remains difficult. Therefore, they are not directly suitable for face recognition. To solve the new-test-data mapping problem, He et al.[8] proposed a linear manifold learning algorithm named locality preserving projection (LPP), which is obtained by finding the optimal linear approximations to the eigenfunctions of the Laplace-Beltrami operator on the manifold. However, these algorithms are designed to best preserve data locality or similarity in the embedding space rather than to provide good discriminating capability. Therefore, these manifold learning algorithms might not be optimal for discriminating face images with different semantics, which is the ultimate goal of face recognition. Later, further manifold learning algorithms were proposed; analyses and interpretations of these algorithms are given from the viewpoint of the graph embedding framework [9]. The utility of manifold learning has been demonstrated in many pattern recognition applications. Among these various manifold learning algorithms, the graph embedding framework-based marginal Fisher analysis (MFA) has gained significant popularity due to its solid theoretical foundation and generalization performance[9,10].
Although MFA seems to be more efficient than other manifold learning algorithms for face recognition, it is still a linear technique in nature. It is therefore inadequate for describing the complexity of real face images, given the high variability of image content and style. In this paper, we discuss how to perform MFA in a Reproducing Kernel Hilbert Space (RKHS), which gives rise to kernel MFA for facial feature extraction. For face recognition, classifier selection is another key issue after facial feature extraction. At present, the k-nearest neighbor (KNN) algorithm is one of the most widely used classifiers. However, for large face image data sets, the computational demand of classifying face images using KNN can be prohibitive. To date, many classifier algorithms have been proposed for face recognition, such as the nearest feature line (NFL)[11], naive Bayes, neural networks and the support vector machine (SVM)[12]. In particular, the SVM classifier performs very well on pattern classification problems by minimizing the Vapnik-Chervonenkis dimension. The basic idea behind SVM is to find an optimal hyperplane in a high-dimensional feature space that maximizes the margin of separation between the closest training

examples from different classes. Although SVM has achieved great success in many pattern classification tasks, its time complexity is cubic in the number of training points, and it is thus computationally inefficient on massive face image data sets. In order to overcome the above shortcomings and fully use its advantages, such as higher classification accuracy and better generalization ability, we adopted the multiplicative update rule-based optimal training SVM as the face classifier. In this paper, the objective is to improve face recognition performance by simultaneously using kernel MFA and the optimal SVM. The rest of this paper is organized as follows. In Section II, we give a brief review of MFA. Section III deals with nonlinear dimensionality reduction for face recognition using the optimal kernel MFA. Section IV discusses the implementation of the optimal training SVM classifier. Experiments are reported in Section V. Finally, we give concluding remarks in Section VI.

II. BRIEF REVIEW OF MFA

Marginal Fisher analysis (MFA)[9] is a recently proposed manifold learning method for feature extraction and dimensionality reduction. It is based on the graph embedding framework and explicitly considers the local manifold structure and class label information with a margin criterion. MFA aims to preserve the within-class neighborhood relationship while dissociating the submanifolds of different classes from each other; it achieves good discriminating performance by integrating the information of intraclass geometry and interclass discrimination. Given the face image set X = {x_1, x_2, ..., x_n}, MFA

aims to design an intrinsic graph that characterizes the intraclass compactness and another penalty graph which characterizes the interclass separability. For the intrinsic graph, the intraclass compactness is measured as the sum of distances between each sample and its neighbors within the same class. The formal definition of the intraclass compactness is as follows:

$$ \tilde{S}_c = \sum_i \sum_{i \in N_{k_1}(j) \,\text{or}\, j \in N_{k_1}(i)} \left\| W^T x_i - W^T x_j \right\|^2 = 2 W^T X (D - S) X^T W \qquad (1) $$

$$ S_{ij} = \begin{cases} 1, & \text{if } i \in N_{k_1}(j) \text{ or } j \in N_{k_1}(i) \\ 0, & \text{otherwise} \end{cases} \qquad (2) $$

where $S$ is a similarity matrix defined on the data points in the intrinsic graph, $D_{ii} = \sum_j S_{ij}$, and $N_{k_1}(i)$ denotes the index set of the $k_1$ nearest neighbors of sample $x_i$ that are in the same class.

For the penalty graph, the interclass separability is measured as the sum of distances between margin points and their neighbor points from different classes. The formal definition of the interclass separability is as follows:


$$ \tilde{S}_p = \sum_i \sum_{(i,j) \in P_{k_2}(l(x_i)) \,\text{or}\, (i,j) \in P_{k_2}(l(x_j))} \left\| W^T x_i - W^T x_j \right\|^2 = 2 W^T X (D^p - S^p) X^T W \qquad (3) $$

$$ S_{ij}^p = \begin{cases} 1, & \text{if } (i,j) \in P_{k_2}(l(x_i)) \text{ or } (i,j) \in P_{k_2}(l(x_j)) \\ 0, & \text{otherwise} \end{cases} \qquad (4) $$

where $S_{ij}^p$ is a similarity matrix defined on the data points in the penalty graph, $D_{ii}^p = \sum_j S_{ij}^p$, $l(x_i)$ is the class label of data point $x_i$, and $P_{k_2}(l(x_i))$ is the set of the $k_2$ nearest pairs among the set $\{(i, j) \mid l(x_i) \neq l(x_j)\}$.
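As an illustration, the intrinsic and penalty graphs of (1)-(4) and the generalized eigen-problem they lead to can be sketched with numpy. The toy neighborhood sizes and the small regularizer added for invertibility are choices of this sketch, not of the original algorithm:

```python
import numpy as np

def mfa(X, y, k1=2, k2=2, dim=1):
    """Minimal MFA sketch. X: n x d samples (rows), y: class labels.
    Returns `dim` projection vectors (columns of W)."""
    n = X.shape[0]
    # Pairwise squared distances between all samples
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    S = np.zeros((n, n))   # intrinsic graph, Eq. (2)
    Sp = np.zeros((n, n))  # penalty graph, Eq. (4)
    for i in range(n):
        same = [j for j in range(n) if j != i and y[j] == y[i]]
        diff = [j for j in range(n) if y[j] != y[i]]
        for j in sorted(same, key=lambda j: d2[i, j])[:k1]:
            S[i, j] = S[j, i] = 1.0
        for j in sorted(diff, key=lambda j: d2[i, j])[:k2]:
            Sp[i, j] = Sp[j, i] = 1.0
    # Graph Laplacians give the quadratic forms X(D-S)X^T and X(D^p-S^p)X^T
    L = np.diag(S.sum(1)) - S
    Lp = np.diag(Sp.sum(1)) - Sp
    A = X.T @ L @ X
    B = X.T @ Lp @ X + 1e-8 * np.eye(X.shape[1])  # regularize for invertibility
    # Smallest eigenvalues of B^{-1} A solve the ratio objective
    w, V = np.linalg.eig(np.linalg.solve(B, A))
    order = np.argsort(w.real)
    return V[:, order[:dim]].real
```

On two well-separated classes, the returned direction compresses within-class neighbors while keeping the marginal between-class pairs far apart.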
Performing MFA means minimizing the intraclass compactness $\tilde{S}_c$ and maximizing the interclass separability $\tilde{S}_p$. This is equivalent to minimizing the following objective function:

$$ W_{MFA} = \arg\min_W \frac{\tilde{S}_c}{\tilde{S}_p} = \arg\min_W \frac{W^T X (D - S) X^T W}{W^T X (D^p - S^p) X^T W} \qquad (5) $$

Finally, the optimal transformation vectors of MFA are the eigenvectors associated with the smallest eigenvalues of the following generalized eigen-problem:

$$ X (D - S) X^T W = \lambda \, X (D^p - S^p) X^T W \qquad (6) $$

If $X (D^p - S^p) X^T$ is nonsingular, the optimal transformation vectors of MFA can be regarded as the eigenvectors of the matrix $\left( X (D^p - S^p) X^T \right)^{-1} X (D - S) X^T$ associated with the smallest eigenvalues. For face recognition, a problem arises in that the matrix $X (D^p - S^p) X^T$ cannot be guaranteed to be nonsingular, since the number of training face images is usually much smaller than the dimension of each face image. In this case, we can first apply PCA to remove the components corresponding to zero eigenvalues.

III. OPTIMAL KERNEL MFA

Although MFA seems to be more efficient than other dimensionality reduction algorithms for facial feature extraction, it often fails to deliver good performance when face images are subject to complex nonlinear changes due to large pose, expression or illumination variations, for it is a linear method in nature. Therefore, a nonlinear version of MFA is required to classify face images based on the nonlinear structure of their feature space. Employing a nonlinear face image representation algorithm can result in a reduction of the statistical and perceptual redundancy among representation elements. One way to handle nonlinear structure is provided by kernel theory. Inspired by the success of SVM, we introduce a similar scheme to kernelize the linear MFA. The main idea is to nonlinearly map the face image data into a high-dimensional feature space, and then perform MFA to obtain a semantic manifold in that space. Such a generalization is of great importance since the kernelized MFA generally achieves better recognition accuracy and relaxes the restriction of MFA being only a linear manifold learning algorithm.

The idea of kernel MFA (KMFA) is to solve the problem of MFA in an implicit feature space $F$ constructed by the kernel trick[12]. The intuition of the kernel trick is to map the input data $x$ from the original feature space into a higher-dimensional Hilbert space $F$ via the nonlinear mapping

$$ \phi : x \mapsto \phi(x) \in F \qquad (7) $$

in which the data may be linearly separable. Building linear MFA in the feature space then implements a nonlinear counterpart in the input data space. The map, rather than being given in an explicit form, is presented implicitly by specifying a kernel function $K(\cdot, \cdot)$ as the inner product between each pair of points in the feature space:

$$ K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j) \qquad (8) $$

Performing KMFA means minimizing the intraclass compactness $\tilde{S}_c$ and maximizing the interclass separability $\tilde{S}_p$ in the feature space $F$ simultaneously. According to (5), this is equivalent to minimizing the following objective function:

$$ W_{KMFA} = \arg\min_W \frac{\tilde{S}_c}{\tilde{S}_p} = \arg\min_W \frac{W^T \phi(X) (D - S) \phi(X)^T W}{W^T \phi(X) (D^p - S^p) \phi(X)^T W} \qquad (9) $$

where $\phi(X) = \left[ \phi(x_1), \phi(x_2), \ldots, \phi(x_n) \right]$ denotes the face image data matrix in the feature space $F$. Then, the eigenvector problem of MFA in the Hilbert space $F$ can be rewritten as follows:

$$ \phi(X) (D - S) \phi(X)^T W = \lambda \, \phi(X) (D^p - S^p) \phi(X)^T W \qquad (10) $$

where the optimal transformation vectors of KMFA are the eigenvectors associated with the smallest eigenvalues of the generalized eigen-problem (10). Since the eigenvectors of (10) must lie in the span of all the samples in the feature space $F$, there exist coefficients $\alpha_i$, $i = 1, 2, \ldots, n$, such that

$$ W = \sum_{i=1}^{n} \alpha_i \phi(x_i) = \phi(X) \alpha \qquad (11) $$

where $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_n]^T$.

By using (11) and (8), we can rewrite (10) as follows:


$$ K (D - S) K^T \alpha = \lambda \, K (D^p - S^p) K^T \alpha \qquad (12) $$

Then, the problem of KMFA is converted into finding the eigenvectors of the matrix $\left( K (D^p - S^p) K^T \right)^{-1} K (D - S) K^T$ associated with the smallest eigenvalues. For a new face image $x$, its projection onto $W$ in the feature space $F$ can be calculated as follows:

$$ f(x) = W^T \phi(x) = \sum_{i=1}^{n} \alpha_i K(x_i, x) \qquad (13) $$

In fact, the matrix $K (D^p - S^p) K^T$ is usually singular in face recognition, which stems from the fact that the dimension of the kernel feature space is usually much higher than that of the empirical feature space, a deficiency generally known as the small sample size (SSS) problem. One possible way to address the SSS problem is to perform a PCA projection to reduce the dimension of the feature space and make the two matrices nonsingular.

According to the above derivation of KMFA, we can observe that different kernel functions produce different implicit kernel feature spaces. However, how to choose a suitable kernel function for a given application is still an open problem. In this research, motivated by the fact that the inner product between two vectors can be considered as a similarity representation in the implicit feature space[13], we employ the normalized polynomial kernel function:

$$ K(x_i, x_j) = \frac{k(x_i, x_j)}{\sqrt{k(x_i, x_i) \, k(x_j, x_j)}} \qquad (14) $$

where $k(\cdot, \cdot)$ is the polynomial kernel. The degree of the polynomial kernel is set to 2, since this has achieved better performance in many pattern recognition tasks[14,15].

In addition, we can observe that the kernel trick-based KMFA algorithm is computationally expensive in the training phase, since its computational complexity is proportional to the number of training points needed to represent the transformation vectors in (11). In fact, the dimensionality of the data subspace spanned by $\phi(x_i)$ is given by the rank of the kernel matrix $K$, and $\mathrm{rank}(K) \leq n$ for a massive training data set. If we replace $n$ with $\mathrm{rank}(K)$ and select a corresponding subset of feature vectors in the feature space $F$, the computational efficiency of KMFA is greatly improved. Based on this consideration, we adopt the feature vector selection method[16] to accelerate the running speed of KMFA. The essential idea of feature vector selection is to find a subset which is sufficient to express all the data as a linear combination of the selected subset in the feature space $F$.

Let the selected feature vector subset $S = \{ \phi(x_{s1}), \phi(x_{s2}), \ldots, \phi(x_{sr}) \}$ in the feature space $F$ be known, where $r$ denotes the number of selected feature vectors; then we can estimate the mapping $\phi(x_i)$ of any input data $x_i$ as a linear combination of $S$ in the feature space $F$. The formal description is as follows:

$$ \hat{\phi}(x_i) = S \beta_i \qquad (15) $$

where $\beta_i = (\beta_{i1}, \beta_{i2}, \ldots, \beta_{ir})^T$ is the coefficient vector. The goal of feature vector selection is then to find the coefficients $\beta_i$ such that the estimated mapping $\hat{\phi}(x_i)$ approaches the real mapping $\phi(x_i)$ as closely as possible, which can be attained by minimizing the following objective function:

$$ \delta_i = \frac{\left\| \phi(x_i) - \hat{\phi}(x_i) \right\|^2}{\left\| \phi(x_i) \right\|^2} \qquad (16) $$

The above optimization problem is solved by setting the partial derivative of $\delta_i$ with respect to $\beta_i$ to zero. In matrix form, the optimal objective function of (16) can be rewritten as follows:

$$ \min \delta_i = 1 - \frac{K_{Si}^T K_{SS}^{-1} K_{Si}}{K_{ii}} \qquad (17) $$

where $K_{SS}$ is the square matrix of dot products of the selected vectors, and $K_{Si}$ is the vector of dot products between $x_i$ and the selected vector set $S$. The ultimate goal of the feature vector selection method is to make (17) apply to all the sample data, which can be summarized in the following form:

$$ \max_S J_S = \frac{1}{n} \sum_{x_i \in X} \frac{K_{Si}^T K_{SS}^{-1} K_{Si}}{K_{ii}} \qquad (18) $$

The above optimization problem can be solved with an iterative algorithm[16]; the algorithm stops when $K_{SS}$ is no longer invertible or the predefined number of selected vectors is reached.

IV. OPTIMAL TRAINING SVM CLASSIFIER

Once the discriminating facial features are extracted by KMFA, face recognition becomes a pattern recognition task. Pattern recognition systems employing the support vector machine (SVM) have drawn much attention due to their good performance in practical applications and their solid theoretical foundations. The essential idea of SVM is to find a linear separating hyperplane which achieves the maximal margin among different classes of data. Furthermore, one can extend SVM to build nonlinear separating decision hyperplanes by exploiting kernel techniques. Although SVM has achieved great success in many pattern classification tasks, its time complexity is cubic in the number of training points, and is thus

computationally inefficient on large-scale face image data sets. In order to overcome the above shortcomings and fully use its advantages, such as higher classification accuracy and better generalization ability, we adopted the multiplicative update rule-based optimal training SVM as the face classifier.

Consider $n$ face data points in the low-dimensional feature space extracted by KMFA that belong to two different classes:

$$ \{ (x_i, y_i) \}_{i=1}^{n}, \quad y_i \in \{-1, +1\} \qquad (19) $$

where $x_i$ is a low-dimensional feature vector and $y_i$ is the label of the class that the vector belongs to. SVM aims to separate the two classes of sample data by finding a hyperplane

$$ w^T x + b = 0 \qquad (20) $$

where $w$ is the normal vector to the hyperplane and $b$ is the corresponding bias term of the hyperplane. The optimization objective of SVM is to maximize the margin $2 / \|w\|$ and minimize the training error, which can be formally stated as the following optimization problem:

$$ \min_{w, b, \xi} \ \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i \qquad (21) $$

subject to

$$ y_i \left( w^T \phi(x_i) + b \right) \geq 1 - \xi_i, \quad \xi_i \geq 0 \qquad (22) $$

where $\phi$ is the nonlinear mapping function, $C$ is used to balance the tradeoff between maximizing the margin and minimizing the training error, and $\xi_i$ is the slack variable that quantifies the SVM training error. In the primal form, the Lagrangian of the above SVM optimization problem is as follows:

$$ L = \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i - \sum_{i=1}^{n} \alpha_i \left[ y_i \left( w^T \phi(x_i) + b \right) - 1 + \xi_i \right] - \sum_{i=1}^{n} \mu_i \xi_i \qquad (23) $$

where the Lagrange multipliers $\alpha_i \geq 0$ and $\mu_i \geq 0$ for all $i = 1, 2, \ldots, n$.

With the Lagrange multipliers and the Karush-Kuhn-Tucker (KKT) conditions, the solutions of (21) under constraint (22) can be obtained by solving the dual problem:

$$ Q(\alpha) = \max_{\alpha} \ \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i, j = 1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \qquad (24) $$

subject to

$$ \sum_{i=1}^{n} \alpha_i y_i = 0 \ \text{ and } \ 0 \leq \alpha_i \leq C, \quad i = 1, 2, \ldots, n \qquad (25) $$

where $K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$ is a kernel function satisfying Mercer's condition.

Once the optimal $\alpha$ is obtained by solving the quadratic programming (QP) problem of (24), the decision function of the SVM classifier is given as follows:

$$ f(x) = \mathrm{sgn} \left( w^T \phi(x) + b \right) = \mathrm{sgn} \left( \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b \right) \qquad (26) $$

Note that the above decision function depends only on the training samples with non-zero Lagrange multipliers $\alpha_i$; such training samples are known as the support vectors. Meanwhile, the threshold $b$ is computed by averaging $b = y_j - \sum_i \alpha_i y_i K(x_i, x_j)$ over all support vectors $x_j$ ($\alpha_j > 0$).

In addition, although the quadratic programming problem of (24) has the important computational advantage of not suffering from local minima, given $n$ training samples the naive implementation of a QP solver is of $O(n^3)$ computational complexity, which is computationally infeasible on very large face image data sets. Hence, a replacement for the naive method of solving the QP problems posed by the SVM classifier is highly desirable. To this end, we applied the multiplicative update rule-based method[17] to improve the training speed of the SVM classifier. The optimization problem of (24) can be boiled down to the general nonnegative quadratic programming problem

$$ F(v) = \min_{v} \ \frac{1}{2} v^T A v + b^T v \quad \text{subject to } v \geq 0 \qquad (27) $$

where the matrix $A$ is symmetric and positive semidefinite. Hence, the optimization of the objective function $F(v)$ is convex. Due to the nonnegativity constraints in (27), we adopt a multiplicative iterative update rule to obtain the optimal solution. The iterative update algorithm is implemented according to the positive and negative components of the matrix $A$ in (27). Their definitions are as follows:

$$ A_{ij}^{+} = \begin{cases} A_{ij}, & \text{if } A_{ij} \geq 0 \\ 0, & \text{otherwise} \end{cases} \qquad (28) $$

$$ A_{ij}^{-} = \begin{cases} \left| A_{ij} \right|, & \text{if } A_{ij} < 0 \\ 0, & \text{otherwise} \end{cases} \qquad (29) $$

From the above definitions, we can easily observe that $A = A^{+} - A^{-}$. Then, the multiplicative iterative update rule can be defined as follows in terms of nonnegative matrices:

$$ v_i \leftarrow v_i \left[ \frac{-b_i + \sqrt{b_i^2 + 4 \left( A^{+} v \right)_i \left( A^{-} v \right)_i}}{2 \left( A^{+} v \right)_i} \right] \qquad (30) $$

The remarkable advantage of the multiplicative iterative update rule in (30) is that it can be easily

implemented and never violate the nonnegativity condition constraints. Furthermore, it has been proved that the multiplicative update rule has the correct fixed points[17] and can monotonically improve the optimal objective function of (27). Especially, since the optimal objective function of SVM in (24) is a special case of (27), for the training of SVM with the multiplicative iterative update rule, we can make Aij = yi y j K xi , x j , bi = 1 . (31)

benchmark face databases including the FERET standard facial database [20] and Yale database [21] were tested.

Figure 2.

The sample image cropped from the face database FERET.

Then, the multiplicative update rule for solving the objective function of (27) in SVM can be written in the following form:

    \alpha_i \leftarrow \alpha_i \, \frac{1 + \sqrt{1 + 4 (A^{+} \alpha)_i (A^{-} \alpha)_i}}{2 (A^{+} \alpha)_i}    (32)

where A^{+} and A^{-} are defined as in (28) and (29).

Figure 3. Sample images cropped from the Yale face database.
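As an illustration, update (32) can be implemented directly in NumPy. This is a minimal sketch: it enforces only the nonnegativity of alpha as in the text, handles the box constraint alpha_i <= C by clipping (an assumption, not specified here), and ignores the equality constraint sum(alpha_i * y_i) = 0, which this excerpt does not address:

```python
import numpy as np

def svm_multiplicative_updates(K, y, C=1.0, n_iter=200):
    """Sketch of SVM training via the multiplicative update (32).

    Per (31) we set A_ij = y_i * y_j * K(x_i, x_j) and b_i = -1.
    The update preserves alpha_i >= 0; clipping to C is our own
    simplification for the box constraint."""
    A = np.outer(y, y) * K
    Ap = np.where(A > 0, A, 0.0)       # positive part A+  (28)
    Am = np.where(A < 0, -A, 0.0)      # negative part A-  (29), A = A+ - A-
    alpha = np.full(len(y), 0.5)       # strictly positive starting point
    for _ in range(n_iter):
        ap = Ap @ alpha                # (A+ alpha)_i
        am = Am @ alpha                # (A- alpha)_i
        alpha = alpha * (1.0 + np.sqrt(1.0 + 4.0 * ap * am)) / (2.0 * ap)
        alpha = np.minimum(alpha, C)   # clip to 0 <= alpha_i <= C
    return alpha
```

Because every factor in the update is nonnegative, the iterates stay in the feasible orthant without any projection step.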

In short, the face recognition procedure has three steps. First, we obtain the face subspace with the optimal manifold learning algorithm KMFA; then the new face image to be identified is projected into the face subspace; finally, the optimally trained SVM classifier is adopted to identify the new facial image. The outline of the proposed face recognition algorithm is shown in Figure 1.
Figure 1. The proposed face recognition algorithm: input the high-dimensional face images → obtain the face subspace with KMFA → project the new testing face images onto the face subspace → train the SVM with the multiplicative update rule in the subspace → use the obtained optimal SVM to identify new facial images → output the face recognition results.
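The three-step pipeline of Figure 1 can be sketched generically. KMFA itself is defined earlier in the paper and is not reproduced here, so a plain PCA projection stands in for the subspace step and a nearest-centroid rule stands in for the trained SVM; both substitutions are illustrative assumptions, not the paper's method:

```python
import numpy as np

def fit_subspace(X, dim):
    """Stand-in for KMFA: a plain PCA projection (illustrative only)."""
    mu = X.mean(axis=0)
    Xc = X - mu
    # eigenvectors of the scatter matrix, largest eigenvalues first
    w, V = np.linalg.eigh(Xc.T @ Xc)
    return mu, V[:, ::-1][:, :dim]

def classify(Xtr, ytr, Xte, dim=5):
    """Project into the subspace, then label each test image by the
    nearest class centroid (a stand-in for the trained SVM)."""
    mu, W = fit_subspace(Xtr, dim)
    Ztr, Zte = (Xtr - mu) @ W, (Xte - mu) @ W
    labels = np.unique(ytr)
    cents = np.stack([Ztr[ytr == c].mean(axis=0) for c in labels])
    dist = ((Zte[:, None, :] - cents[None, :, :]) ** 2).sum(-1)
    return labels[dist.argmin(axis=1)]
```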

V. EXPERIMENTAL RESULTS

In this section, we investigate the performance of our proposed optimal KMFA plus SVM (OKMFA for short) algorithm for face recognition. Its performance is compared with kernel PCA (KPCA) [18], kernel LDA (KLDA) [19], and kernel LPP (KLPP) [8], three of the most popular nonlinear dimensionality reduction algorithms for face recognition. In this experiment, two publicly available benchmark face databases, the FERET standard facial database [20] and the Yale database [21], were tested.

The FERET face database is a rather large database. It contains 13,539 face images of 1,565 subjects taken during different photo sessions, with variations in size, pose, illumination, facial expression, and even age. We test the four algorithms on a subset of the FERET database. This subset includes 1,400 images of 200 individuals (each with seven images, labeled ba, bc, bd, be, bf, bg, and bh). Some cropped sample face images from the FERET database are displayed in Figure 2. All of the gray-level images are aligned by fixing the locations of the two eyes, normalized in size to a resolution of 32×32 pixels, and preprocessed with histogram equalization. For each individual, p (= 2, 3, 4, 5) face images are randomly selected for training and the rest are used for testing. To reduce the variation in the recognition results, for each given p we computed the average recognition accuracy over 10 random splits. In general, the performance of the four recognition algorithms varies with the number of dimensions. We only report the best recognition accuracy and the optimal dimensionality obtained by KPCA, KLDA, KLPP, and OKMFA in Table I. As can be seen, our proposed OKMFA algorithm outperforms all the other algorithms with fewer features, and the KPCA algorithm gives relatively poor recognition accuracy.

The Yale face database contains 165 images of 15 individuals (each person has 11 different images) under various facial expressions and lighting conditions. In this experiment, preprocessing to locate the faces was applied. The images were aligned semi-automatically according to the eye positions of each facial image, using the eye coordinates. The facial images were cropped and then resized to a resolution of 32×32 pixels. Some cropped sample face images from the Yale database are displayed in Figure 3. Histogram equalization was used for the normalization of the facial image luminance.
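The preprocessing described above (normalization to 32×32 pixels plus histogram equalization) can be sketched as follows; nearest-neighbour downsampling is an assumption, since the resizing method is not stated in the text:

```python
import numpy as np

def preprocess(img, size=32):
    """Downsample a grey-level image to size x size (nearest neighbour,
    our assumption) and histogram-equalize its grey levels."""
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    small = img[np.ix_(rows, cols)].astype(np.uint8)
    # histogram equalization via the cumulative distribution of grey levels
    hist = np.bincount(small.ravel(), minlength=256)
    cdf = hist.cumsum()
    span = max(cdf.max() - cdf.min(), 1)
    lut = np.round(255.0 * (cdf - cdf.min()) / span).astype(np.uint8)
    return lut[small]
```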
A random subset with p (= 5, 6, 7, 8) face images per individual was taken with labels to form the training set, and the rest of the database was regarded as the testing set. For each given p, we average the results over 10 random splits. We only report the best recognition accuracy and the optimal dimensionality obtained by KPCA, KLDA, KLPP, and OKMFA in Table II. We can see that our proposed OKMFA algorithm achieves the best recognition accuracy.

In addition, to verify the efficiency of our proposed OKMFA algorithm, we recorded the computational time in the experiments. The running times of the four algorithms on the FERET and Yale face databases are listed in Table III and Table IV, respectively. As can be seen, the time ratio on the FERET face database for the four algorithms is approximately KPCA : KLDA : KLPP : OKMFA = 29 : 31 : 26 : 21, and the time ratio on the Yale face database is approximately KPCA : KLDA : KLPP : OKMFA = 17 : 19 : 23 : 15. These results show that the proposed OKMFA algorithm is much more efficient than the traditional kernel-based dimensionality reduction algorithms. The main reason is that the feature vector selection strategy accelerates the running speed of KMFA, and the multiplicative update rule-based method further improves the training speed of the SVM classifier. Therefore, our proposed OKMFA algorithm can dramatically reduce the computational time compared to the other three algorithms on large-scale face recognition problems.
TABLE I. RECOGNITION ACCURACY COMPARISONS ON THE FERET DATABASE (OPTIMAL DIMENSIONALITY IN PARENTHESES)

Algorithms   2 images     3 images     4 images     5 images
KPCA         74.6% (79)   82.7% (81)   88.6% (56)   91.8% (48)
KLDA         76.5% (42)   84.6% (46)   90.3% (40)   93.4% (69)
KLPP         79.1% (45)   87.9% (50)   91.5% (42)   94.2% (60)
OKMFA        84.7% (40)   92.8% (44)   93.2% (40)   96.7% (46)

TABLE II. RECOGNITION ACCURACY COMPARISONS ON THE YALE DATABASE (OPTIMAL DIMENSIONALITY IN PARENTHESES)

Algorithms   5 images     6 images     7 images     8 images
KPCA         56.3% (72)   64.7% (78)   70.5% (70)   78.2% (74)
KLDA         68.5% (14)   72.6% (14)   81.3% (14)   89.7% (14)
KLPP         79.8% (12)   84.1% (14)   91.6% (14)   95.8% (13)
OKMFA        85.6% (10)   92.5% (12)   95.7% (14)   98.3% (14)

In summary, the main observations from the above performance comparisons are:

(1) Our proposed OKMFA algorithm consistently outperforms the KPCA, KLDA, and KLPP algorithms in terms of recognition accuracy and computational time. Its superiority stems from two aspects: on the one hand, kernel MFA explicitly considers the local manifold structure and the class label information with a margin criterion during nonlinear dimensionality reduction, thus achieving maximum discrimination and improving the computational efficiency of the face classifier; on the other hand, the multiplicative update rule-based optimal SVM achieves better classification performance and lower computational requirements simultaneously. Hence, the proposed OKMFA algorithm performs much better than the other three algorithms.

(2) The manifold learning-based algorithms (KLPP and OKMFA) achieve much better performance than KPCA and KLDA, which demonstrates the importance of utilizing local manifold structure: KPCA and KLDA are designed to discover only the global Euclidean structure, and the local manifold structure is ignored.

(3) Although both OKMFA and KLPP are manifold learning algorithms, our proposed OKMFA algorithm performs better than KLPP. One possible explanation is that, although KLPP seeks to preserve the local neighbor structure, it does not explicitly exploit the class information for classification. By jointly considering the local manifold structure and the class label information with two graphs, OKMFA achieves much better face recognition performance than KLPP.

(4) The KPCA algorithm gives the worst recognition accuracy. One possible explanation is that KPCA is an unsupervised algorithm that ignores the valuable label information for classification; hence, the features extracted by KPCA are optimal for representation, but not optimal for classification.

VI. CONCLUSIONS

In this paper, we have proposed an enhanced face recognition algorithm called OKMFA that combines the advantages of optimal kernel MFA and SVM. The effectiveness and efficiency of this algorithm are evidenced by experimental comparisons with other well-known kernel-based dimensionality reduction algorithms on two benchmark face databases. In the future, we would like to apply our algorithm to further tasks in pattern recognition, data mining, and high-dimensional data processing.

ACKNOWLEDGMENT

This work is supported by the National Natural Science Foundation of China under Grant No. 70701013, the Natural Science Foundation of Henan Province under Grants No. 102300410020 and No. 0611030100, and the Natural Science Foundation of Henan University of Technology under Grants No. 08XJC013 and No. 09XJC016.

TABLE III. RUNNING TIME COMPARISONS ON THE FERET DATABASE

Algorithms   Training time (s)   Testing time (s)
KPCA         26.1                3.2
KLDA         28.7                2.5
KLPP         23.8                2.1
OKMFA        19.6                1.4

TABLE IV. RUNNING TIME COMPARISONS ON THE YALE DATABASE

Algorithms   Training time (s)   Testing time (s)
KPCA         14.9                2.2
KLDA         17.5                1.6
KLPP         21.4                1.8
OKMFA        13.8                1.3


REFERENCES

[1] S. R. Chellappa and C. L. Wilson, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, pp. 705–741, May 1995.
[2] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. Hoboken: Wiley-Interscience, 2000.
[3] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis. Cambridge, UK: Cambridge University Press, 2004.
[4] X. He, S. Yan, Y. Hu, P. Niyogi, and H.-J. Zhang, "Face recognition using Laplacianfaces," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, pp. 328–340, March 2005.
[5] J. B. Tenenbaum, V. de Silva, and J. C. Langford, "A global geometric framework for nonlinear dimensionality reduction," Science, vol. 290, pp. 2319–2323, December 2000.
[6] S. T. Roweis and L. K. Saul, "Nonlinear dimensionality reduction by locally linear embedding," Science, vol. 290, pp. 2323–2326, December 2000.
[7] M. Belkin and P. Niyogi, "Laplacian eigenmaps for dimensionality reduction and data representation," Neural Computation, vol. 15, pp. 1373–1396, June 2003.
[8] X. He and P. Niyogi, "Locality preserving projections," in Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press, 2003, pp. 385–392.
[9] S. Yan, D. Xu, B. Zhang, H.-J. Zhang, Q. Yang, and S. Lin, "Graph embedding and extensions: a general framework for dimensionality reduction," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, pp. 40–51, January 2007.
[10] D. Xu, S. Yan, D. Tao, S. Lin, and H.-J. Zhang, "Marginal Fisher analysis and its variants for human gait recognition and content-based image retrieval," IEEE Transactions on Image Processing, vol. 16, pp. 2811–2821, November 2007.
[11] S. Z. Li and J. W. Lu, "Face recognition using the nearest feature line method," IEEE Transactions on Neural Networks, vol. 10, pp. 439–443, March 1999.
[12] V. N. Vapnik, The Nature of Statistical Learning Theory. New York: Springer-Verlag, 1995.
[13] N. Cristianini, J. Shawe-Taylor, and J. Kandola, "Spectral kernel methods for clustering," in Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press, 2001, pp. 649–655.
[14] Q. Liu, H. Lu, and S. Ma, "Improving kernel Fisher discriminant analysis for face recognition," IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, pp. 42–49, January 2004.
[15] H. Li, T. Jiang, and K. Zhang, "Efficient and robust feature extraction by maximum margin criterion," IEEE Transactions on Neural Networks, vol. 17, pp. 157–165, January 2006.
[16] G. Baudat and F. Anouar, "Kernel-based methods and function approximation," in Proceedings of the International Conference on Neural Networks. Piscataway, NJ: IEEE Press, 2001, pp. 1244–1249.
[17] F. Sha, Y. Q. Lin, L. K. Saul, and D. D. Lee, "Multiplicative updates for nonnegative quadratic programming," Neural Computation, vol. 19, pp. 2004–2031, August 2007.
[18] M. H. Yang, N. Ahuja, and D. Kriegman, "Face recognition using kernel eigenfaces," in Proceedings of the International Conference on Image Processing. Piscataway, NJ: IEEE Press, 2000, pp. 37–40.
[19] G. Baudat and F. Anouar, "Generalized discriminant analysis using a kernel approach," Neural Computation, vol. 12, pp. 2385–2404, October 2000.
[20] P. J. Phillips, H. Wechsler, J. Huang, and P. Rauss, "The FERET database and evaluation procedure for face-recognition algorithms," Image and Vision Computing, vol. 16, pp. 295–306, April 1998.
[21] Yale Face Database, http://cvc.yale.edu/projects/yalefaces/yalefaces.html, 2002.

Ziqiang Wang was born in Zhoukou, Henan, China in 1973. He received the PhD degree from China University of Mining & Technology (Beijing) and the Master of Science degree from Xi'an Petroleum University, both in computer application, in 2011 and 2002, respectively. His research interests include data mining and pattern recognition. He is currently an associate professor at the Henan University of Technology.

Xia Sun was born in Xi'an, Shaanxi, China in 1978. She received the Master of Science degree in computer application from Huazhong University of Science and Technology in 2008. Her research interests include data mining and pattern recognition. She is currently a lecturer at the Henan University of Technology.


Hybrid Cloud Computing Platform for Hazardous Chemicals Releases Monitoring and Forecasting
Xuelin Shi
Beijing University of Chemical Technology, Beijing, China Email: shixl@mail.buct.edu.cn

Yongjie Sui and Ying Zhao


Beijing University of Chemical Technology, Beijing, China Email: suiyongj@yahoo.com.cn, zhaoy@mail.buct.edu.cn

Abstract—When hazardous chemical releases occur, monitoring and forecasting are very important for emergency planning and evacuation. To this end, huge numbers of wireless sensors may be distributed over a large area. The data gathered by the sensors are sent back to data centers for preprocessing and analysis, which requires massive computing utility. A highly effective computing paradigm is therefore essential for hazardous chemicals releases monitoring and forecasting. This paper presents a hybrid cloud computing platform to solve this problem. It is an Infrastructure as a Service (IaaS) platform that provides High Performance Computing (HPC) resources, virtualized computing resources, and storage resources for the processing and analysis of hazardous chemicals releases data. In particular, we designed a scheduling algorithm and a QoS policy to assure the efficiency of the platform. Finally, the platform's capability and performance are demonstrated by application scenarios.

Index Terms—cloud computing, hazardous chemicals, scheduling, QoS

I. INTRODUCTION

Releases of hazardous chemicals are a main cause of major accidents, including accidents in the chemical industry as well as terrorist attacks [1]. When hazardous chemical pollution occurs, monitoring and forecasting are very important for emergency planning and evacuation. To this end, huge numbers of wireless sensors may be distributed over a large area. The data gathered by the sensors are sent back to data centers for preprocessing and analysis, which requires massive computing. With the development of Information and Communication Technology, computing may one day become the 5th utility (after water, electricity, gas, and telephony) [2]. Computing resources are usually dispersed and connected by networks. Therefore a highly effective computing paradigm is essential for hazardous chemicals releases monitoring and forecasting. A number of computing paradigms have been proposed: cluster computing, Grid computing, and more recently cloud computing. Cloud computing is an emerging model which aims at allowing customers to
2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2306-2311

utilize computational resources and software hosted by service providers [3]. Buyya proposed the following definition: "A Cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resource(s) based on service-level agreements established through negotiation between the service provider and consumers." [2] According to this definition, clouds appear to be a combination of clusters and Grids. Clouds often manage large amounts of resources, including processors, memory, storage, visualization equipment, software, and so on.

In this paper we examine how monitoring and forecasting of hazardous chemicals can take advantage of cloud computing to offer timely emergency decision support. The data processing involves the analysis of chemical components, the tracing of chemical reactions, the simulation of computational fluid dynamics (CFD), and so on. It is complex, time consuming, and expensive. To lower costs and improve efficiency, cloud computing is an affordable solution in this area. Clouds shift the responsibility to install and maintain hardware and basic computational services away from the users (e.g., a laboratory or chemical scientist) to the cloud vendor; higher levels of the application stack and the administration of sharing remain intact and remain the users' responsibility [4].

Our key contributions are the following: (1) an IaaS platform for cloud computing named Hazardous Chemicals Monitoring Cloud (HCMC), a hybrid cloud supporting High Performance Computing, uniform computing services, and data storage services; (2) a resource scheduling algorithm encompassing both customer-driven service management and computational risk management to sustain Service Level Agreement (SLA)-oriented resource allocation; and (3) a QoS policy to meet different users' demands and improve scheduling efficiency.
The remainder of the paper is organized as follows. Section 2 gives the architecture of HCMC. In section 3,


we introduce the resource scheduling model and the evolutionary algorithm applied in HCMC. In Section 4, we show how a QoS policy can be used to assure that important jobs are processed with high priority. Section 5 gives an application scenario of HCMC. Finally, Section 6 gives concluding remarks.

II. ARCHITECTURE OF HCMC

Data sensing and processing are the key problems of hazardous chemicals releases monitoring and forecasting (HCRMF), whose computing service differs from that of commercial public clouds. Commercial public clouds, such as Amazon Elastic Compute Cloud (EC2) and Google App Engine (GAE), often provide several kinds of computing services to meet users' different needs based on virtualization technology. But an HCRMF computing platform should support not only HPC for massive data processing, but also virtualization technology for domain users' computing requirements. Therefore, this paper presents a hybrid cloud platform architecture, Hazardous Chemicals Monitoring Cloud (HCMC), which provides hybrid management of HPC and a Virtual Private Cloud. This section first describes the overall framework of HCRMF, including data sensing, the computing platform, the user interface, and so on. Secondly, the architecture of HCMC is given.

A. HCRMF Framework
The HCRMF system provides functions such as data collection, transmission, analysis, and monitoring. Fig. 1 gives the framework of HCRMF.

The sensing layer is responsible for hazardous chemicals monitoring and data collection; it consists of a Wireless Sensor Network (WSN) and other mobile terminals. The WSN is composed of many widely distributed automatic devices that use sensors to collect environmental data. Besides the WSN, other mobile terminals can also be used, such as police cars equipped with GPS, cameras, and detection instruments. These mobile terminals can become data-sensing nodes and transmit data to the base stations of the network layer.

The access layer includes base stations and access gateways, which provide control of sensing-layer devices, data collection, and protocol conversion. In this layer, the data from the sensing layer are transformed to suit transmission in the network layer, and the instructions from upper layers are distributed to the sensing layer.

The network layer carries the data transmission and can consist of the Internet, LANs, and 3rd-generation mobile communication networks. It connects the access layer and the cloud computing layer.

The cloud computing layer is the core of the HCRMF framework; it manages HPC servers, virtual machines, and data storage devices. It provides the computing utility for processing and analyzing the data gathered by the sensing-layer devices. In addition, it responds to computing requests submitted by domain users through the portal. This paper presents the hybrid cloud platform HCMC to achieve these functions.

The portal layer provides a monitoring interface and a user interface for users to monitor the environment and submit computing requests. Through the real-time monitoring interface, users can retrieve monitoring data and analyses and issue control instructions. As the cloud computing layer provides software services, such as CFD software and evacuation simulation software, users can submit computing requests through the user interface.

B. Hybrid Cloud Architecture

Our HCMC is a unified hybrid computing model of managed HPC and other computing resources. HCMC is not only a pool of virtual machines, but also a set of cluster-level servers. Running HPC workloads on virtual machines has been considered impractical for two key reasons: (1) the overhead of virtualization; and (2) virtual machine instances are not backed by a high-performance interconnect and storage [5]. Because hazardous chemicals monitoring needs massive computing utility, HPC cluster servers with special hardware, such as low-latency interconnects, storage-area networks, multi-core nodes, and hundreds of gigabytes of main memory, are necessary. Therefore HCMC provides two resource management mechanisms: the Workload and Resource Management System (WRMS) for HPC cluster servers, and the Dynamic Infrastructure Management Service (DIMS) for the private cloud, i.e., a pool of virtual machines. Fig. 2 gives the architecture of HCMC.

Figure 1. The HCRMF framework: portal layer (monitoring interface, user interface); cloud computing layer (HCMC); network layer (Internet, LAN, hosted network, 3G); access layer (gateway, base station); sensing layer (WSNs, mobile terminal).

HCRMF has five layers: sensing layer, access layer, network layer, cloud computing layer and portal layer.



Figure 2. The HCMC architecture: wireless sensors feed front processors; jobs from the front processors and the user interface enter the Job Scheduler, which dispatches them either to the Workload and Resource Management System (WRMS) managing HPC physical machines (PM) and the data store, or to the Dynamic Infrastructure Management Service (DIMS) managing the private cloud of virtual machines (VM).

HCMC resources are candidates for several roles in hazardous chemicals monitoring, ranging from compute services to data storage. Daily log processing often uses basic machines, while emergency planning and forecasting may rely on clusters of high-performance machines. As shown in Fig. 2, HCMC is a unified model of managed HPC and cloud resources. The working nodes of HCMC can be either traditional physical machines (with the operating system statically bound to the hardware), denoted by PM, or virtual machines (with the operating system dynamically bound to the hardware), denoted by VM. Jobs enter the HCMC through the Job Scheduler, which puts them in a submit queue and decides when to dispatch them to the WRMS or DIMS.

Distributed wireless sensors gather data on hazardous chemicals releases and send them to front processors. Abnormal or key data needing processing are submitted to the Job Scheduler. These jobs are fixed and should be executed on HPC resources according to a set of scheduling policies defined by the system administrators; the Job Scheduler therefore dispatches them to the WRMS. The WRMS supports three kinds of functionality: (1) resource management, including resource status and capability; (2) job management, including creation, queuing, monitoring, and control; and (3) job scheduling, i.e., mapping a job to a set of resources and allocating those resources to the job for a certain time. In HCMC, the WRMS is developed with Load Sharing Facility (LSF).

Furthermore, users (e.g., a laboratory or chemical scientist) can also submit jobs through the user interface. These jobs may be data mining of daily logs, simulation computing, and so on, which are often not urgent but throughput-intensive. The Job Scheduler therefore dispatches them to the DIMS. The DIMS supports two kinds of functionality: (1) physical resource management, including resource status and capability; and (2) service management, including service creation (and the related allocation of underlying physical resources), monitoring, migration, and destruction. In HCMC, the DIMS is implemented with OpenNebula, an open-source virtual infrastructure manager that deploys virtualized services on both a local pool of resources and external IaaS clouds [6]. HCMC also has a data store, which provides a distributed, transparent, and reliable cloud storage service.

III. RESOURCES SCHEDULING OF HCMC

We propose a scheduling model for HCMC and design an evolutionary algorithm for it. In the model, a priority weight is introduced to implement differentiated services. The algorithm is then explained in detail.
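The Job Scheduler's dispatch policy described in Section II, with fixed, sensor-triggered jobs going to the WRMS-managed HPC cluster and non-urgent user-submitted jobs going to the DIMS-managed virtual machines, can be sketched as follows (the names are illustrative assumptions, not an actual LSF or OpenNebula API):

```python
from dataclasses import dataclass

@dataclass
class Job:
    source: str        # "sensor" for front-processor data, "user" for portal jobs
    urgent: bool = False

def dispatch(job):
    """Illustrative Job Scheduler policy for HCMC: urgent jobs from the
    sensing layer run on HPC resources (WRMS); non-urgent,
    throughput-intensive user jobs run on virtual machines (DIMS)."""
    if job.source == "sensor" or job.urgent:
        return "WRMS"   # HPC cluster managed with Load Sharing Facility
    return "DIMS"       # private cloud of VMs managed with OpenNebula
```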

A. Scheduling Model

Consider C as a set of machines (PMs or VMs) in an HCMC environment, and assume that there are m machines in C. Also, suppose J is the set of jobs and there are n jobs in J. Following parameters commonly used in existing techniques [7], we define the parameters of jobs and machines as follows:

t_ij — total processing time required for the ith job if assigned to the jth machine
d_i — deadline of the ith job
r_i — ready time of the ith job
p_j — price per unit time of the jth machine
l_i — priority weight of the ith job

A job is the description of a compute task, which consists of its resource requirements (i.e., t_ij) and its timing constraints (i.e., d_i and r_i). Furthermore, we add l_i, a priority weight parameter, which can be predefined for the differentiated-services objective. A good cloud scheduler should not only minimize the cost of machines (i.e., t_ij · p_j), but also take d_i and l_i into account to avoid losses: a job with a large weight should be assigned to a machine preferentially so that it can be done before its deadline; otherwise a loss occurs. To quantize the loss, we use l_i × (time delay) to represent it. We thus obtain the cloud scheduling model for HCMC:

Problem: job scheduling problem.
Instance:
J = {j_1, j_2, …, j_n}, the set of n jobs.
C = {c_1, c_2, …, c_m}, the set of m machines.
T = [t_ij], the processing time matrix.
P = {p_1, p_2, …, p_m}, the set of prices per unit time for the m machines.
D = {d_1, d_2, …, d_n}, the set of deadlines for the n jobs.
R = {r_1, r_2, …, r_n}, the set of ready times for the n jobs.
L = {l_1, l_2, …, l_n}, the set of weights for the n jobs.


Output:

    minimize Σ_i ( t_ij · p_j + max(0, l_i · (s_i + t_ij − d_i)) )    (1)
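Objective (1) and the elite-preserving evolutionary loop of Section III-B can be sketched as follows; s_i is the start time of job i, and the single-gene mutation operator and fixed elite fraction are simplifying assumptions of this sketch:

```python
import random

def schedule_cost(assign, start, t, p, d, l):
    """Objective (1): for each job i on machine j = assign[i], add the
    machine cost t[i][j] * p[j] plus the weighted loss
    max(0, l[i] * (s_i + t[i][j] - d[i])) for finishing past the deadline."""
    total = 0.0
    for i, j in enumerate(assign):
        finish = start[i] + t[i][j]
        total += t[i][j] * p[j] + max(0.0, l[i] * (finish - d[i]))
    return total

def evolve(pop, fitness, m, elite_frac=0.5, generations=100):
    """Elite-preserving loop per newP = TP ∪ {x% elites in P}: sort the
    schedules by fitness (lower is better), keep the best fraction as
    elites, and refill by mutating one job-machine gene per child."""
    n_elite = max(1, int(elite_frac * len(pop)))
    for _ in range(generations):
        pop.sort(key=fitness)
        elites = pop[:n_elite]
        children = []
        for s in elites:
            c = list(s)
            c[random.randrange(len(c))] = random.randrange(m)  # mutate one gene
            children.append(c)
        pop = elites + children
    return min(pop, key=fitness)
```

With `fitness = lambda s: schedule_cost(s, start, t, p, d, l)`, `evolve` searches over assignments of n jobs to m machines; elitism guarantees the best schedule found never worsens across generations.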

where s_i is the start time of the ith job and max(0, l_i (s_i + t_ij − d_i)) is the loss of job i: if job i completes before its deadline, the loss is 0; otherwise, the loss is l_i (s_i + t_ij − d_i). This creates a priority among jobs. Σ ( t_ij · p_j + max(0, l_i (s_i + t_ij − d_i)) ) is the objective function.

B. Evolutionary Scheduling Algorithm

Having constructed the scheduling model, we designed an evolutionary algorithm to obtain the optimal solution. In this section we introduce its design and implementation. Genetic algorithms (GAs) are search algorithms based on the mechanics of natural selection and natural genetics [8]. Evolutionary algorithms (EAs) are GAs with special data structures, special encodings of solutions, or genetic operators adapted to the problem [8]. As an EA, the basic components of our algorithm are the chromosome encoding, the fitness function, and the population mutation and crossover.

To minimize Σ ( t_ij · p_j + max(0, l_i (s_i + t_ij − d_i)) ), it is clear that we must not only assign job i to the machine that executes it fastest (i.e., with the smallest t_ij), but also ensure that jobs are executed in time to avoid losses. Jobs with a higher weight (i.e., l_i) and a more urgent deadline (i.e., d_i) have a higher rank. Scheduling is equivalent to searching the combinations of jobs and machines for the optimized schedule. We therefore encode the combination of job i and machine j as a gene (i, j); sets of genes form chromosomes. Under this gene encoding, a chromosome in which each job appears once and only once is a raw schedule. The objective of our algorithm is to find an optimized schedule by selection over the chromosomes of the offspring subpopulations. To evaluate an offspring, we define the fitness function as follows:

    fitness = Σ ( t_ij · p_j + max(0, l_i (s_i + t_ij − d_i)) )    (2)

Usually a random population is used as the initial population in EAs, but if the quality of the initial population is better than the average random population, the efficiency of the EA can be increased. In our algorithm, to avoid premature convergence we still select genes randomly as the initial population P. The genes (i, j) of P are then sorted in non-decreasing order of their t_ij · p_j + max(0, l_i (s_i + t_ij − d_i)) to generate the template population TP. In each iteration, the new population newP is constructed by

    newP = TP ∪ {x% elites in P}    (3)

where x is a predefined parameter. In order to keep the elites of the parent population, x is set to 50% in our algorithm. Recombination and mutation operators change the genes of chromosomes and create chromosomes for the template populations; simple crossover and mutation are performed. The algorithm iterates until the stopping criterion is satisfied, i.e., the number of evolutionary generations reaches a predefined parameter.

IV. QOS POLICY OF HCMC

QoS concerns operational characteristics of a service that determine its utility in an application context. QoS offers a basis for service differentiation and is also an important factor of competition [9]. In the same way, QoS is an important concern in cloud computing. As users rely on cloud providers to supply more of their computing needs, they will require specific QoS to be maintained by their providers in order to meet their objectives and sustain their operations. Cloud providers will need to consider and meet the different QoS parameters of each individual consumer. In a highly competitive cloud environment, QoS is one of the crucial means of satisfying the various demands of resource users and providers [10]. Therefore an effective cloud scheduling policy should take QoS into account. In a dynamic environment like HCMC, it is necessary to assure that important jobs get computing resources with priority. To solve this problem we designed a QoS policy to improve scheduling efficiency. The QoS policy can sustain the operation of important jobs through different levels of QoS. Our QoS management framework consists of two components: QoS concepts and a QoS mechanism.

A. QoS Concepts

A consistent and widely applied set of QoS concepts is essential for effective QoS-aware service chaining. QoS concepts enable quality of service to be modeled, specified, communicated, negotiated, controlled, and managed. ISO 13236 (ISO/IEC, 1998) defines a generic set of QoS concepts that can be applied to model QoS in information technology (IT) systems that provide distributed processing services [9]. In our QoS management framework we define three concepts: user requirement, use pattern, and predictive resource reservation.

A user requirement is a quantifiable aspect of quality that is desirable in an interaction or that is necessary for user satisfaction. In our economic scheduling model, user requirements can be represented by the processing time of a job executing on some node of the cloud, the budget, and the penalty price. To promote service efficiency, it is very useful to be able to predict when a given computational resource will be idle and become available for grid applications. The resource Use Pattern Analysis (UPA) method is often used for this prediction, based on the assumption that resource availability at each node can be modeled [11]. Building on the concept of user requirement, we define a second QoS concept: use pattern. At present we consider only the CPU use pattern. The last QoS concept is predictive resource reservation: based on UPA, user requirements can be predicted, and so some resources can be reserved for important jobs. Resource reservation also expresses different levels
2012 ACADEMY PUBLISHER


JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012

of services: resources reserved for important jobs but not allocated to arrived jobs. In our QoS management framework, resource reserving is a threshold which defines when and how many resources should be reserved.

B. QoS Mechanism

After defining the QoS concepts, we designed the QoS mechanism: scheduling with predictive resource reserving, i.e., pre-reserving resources for important jobs based on UPA. An effective approach to acquiring resource usage patterns (UP) is categorizing resource-usage logs, such as CPU use, available RAM, disk space, swap space, and network and disk I/O. Most operating systems can generate such system logs. These logs are often semi-structured data, and categorization methods for semi-structured text can be generally classified into supervised learning (also called classification), unsupervised learning (also called clustering), and semi-supervised learning [12]. In our work, unsupervised learning is used for UPA mining. First, the original log data is preprocessed and represented in the Vector Space Model (VSM) [13]. In this model, each log record is identified by a feature vector in a space in which each dimension corresponds to a distinct log item associated with a numerical value indicating its weight. At present we consider only CPU use, aiming to find users' usage patterns over some period. The resulting representation of one node's log in one period is therefore an n-dimensional vector:

d = <(t1, w1), (t2, w2), ..., (tn, wn)>   (4)

In the vector, wj represents the numerical CPU usage of user tj in period d. As a result, we get a VSM representing the resource usage of all cloud nodes as a matrix X ∈ R^{n×d}, where n is the number of nodes in the cloud and d is the node log dimension. Each element Xij of this matrix is the CPU usage of user tj on node di. After constructing the resource usage matrix (simply named X), we use k-NN to classify every row of X, i.e., each vector d. k-NN finds the k nearest neighbors of the test document and then uses majority voting among the neighbors to decide the log category. Similarity between two vectors is used to decide whether neighbors are near or far, and it is measured by the cosine between the vectors:

sim(d_i, d_j) = ( Σ_{k=1}^{r} w_ik * w_jk ) / ( sqrt(Σ_{k=1}^{r} w_ik^2) * sqrt(Σ_{k=1}^{r} w_jk^2) )   (5)

When a new resource usage matrix is given to the k-NN algorithm, the similarities among the rows of X (the vectors d) are computed, and the rows are classified into several categories. With the above mechanism, the rules of resource usage are found from system logs. When scheduling jobs, these rules can be used to predict when a user will submit a job and how long the job will be processed on a node. Therefore resources can be reserved for important jobs to improve scheduling efficiency.

V. APPLICATION SCENARIOS

In this section, we describe how the scheduling algorithm and QoS policy are deployed on HCMC, and then give a scenario of an example job execution workflow. HCMC is a hybrid cloud platform providing HPC and virtual computing services, so job scheduling and the QoS policy work on top of WRMS and DIMS, implemented by a layered scheduling system. It consists of a global job scheduler, WRMS schedulers, and DIMS schedulers. As previously stated, HCMC provides hybrid computing resources, and this system is effective for managing different resources at distributed sites. Fig. 3 shows the deployment of the scheduling system on HCMC.

Figure 3. The deployment of HCMC (cloud consumers, both individual users and front processors, submit jobs to the global job scheduler, which drives cluster servers through LSF and virtual machine management through OpenNebula and VMware)

The key component of the system is the global job scheduler. All machines of HCMC are connected by a computer network. The global job scheduler makes top-down scheduling decisions in the cloud platform, while the HPC and Virtual Infrastructure (VI) managers wait for instructions from the job scheduler to execute jobs. The layered scheduling management architecture has the advantage of effectively controlling workflow execution on a hybrid cloud like HCMC, so our scheduling solution can be performed effectively. A scenario of an example job execution workflow is as follows: first, a user submits a computing request through the portal of HCMC, or a front processor sends data to HCMC for analysis. Second, the request is sent to the job scheduler. Third, the job scheduler sets the priority weight and other parameters of the job, and then dispatches the job to the HPC manager or the VI manager to perform the computation according to the job's requirements.
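The cosine measure (5) and the k-NN majority vote described in the QoS mechanism above can be sketched as follows. This is a minimal illustration; the function names and the label representation are our own assumptions, not the paper's implementation.

```python
import math
from collections import Counter

def cosine_sim(a, b):
    """Cosine similarity between two usage vectors, as in equation (5)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def knn_category(query, rows, labels, k=3):
    """Majority vote among the k rows of X most similar to `query`."""
    ranked = sorted(range(len(rows)),
                    key=lambda i: cosine_sim(query, rows[i]),
                    reverse=True)
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]
```

In practice each row would be one node's CPU-usage vector for a period, and the returned category would index a mined usage pattern.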


The HPC manager (i.e., LSF) or the VI manager (OpenNebula) locates one or more machines to execute the job. Finally, the computing results are returned to the user or administrators. Where the requests are submitted, how the computing requests are allocated, and how computing tasks are monitored and managed are all handled by HCMC transparently.

VI. CONCLUSION AND FUTURE WORK

This paper presented HCMC, a hybrid cloud computing platform. HCMC is an IaaS platform which provides HPC and virtualized computing resources, as well as storage resources, for hazardous chemicals releases data processing and analysis. In particular, we designed a scheduling algorithm and a QoS policy to assure the efficiency of the platform. Finally, the platform's capability and performance were demonstrated through application scenarios. HCMC is still at the theoretical research stage, but in the future it will be put into service for hazardous chemicals monitoring in some key areas. Many chemical domain specialists believe that clouds represent the next generation of mass computing services. HCMC is very suitable for processing data from huge numbers of distributed wireless sensors. Our work is a step toward realizing how cloud computing can be used in hazardous chemicals releases monitoring and forecasting to support emergency planning. In future work, a friendly cloud portal for users to access computing and storage resources is one key task; furthermore, the security of the cloud platform is also important.

ACKNOWLEDGMENT

This work was supported in part by the National Grand Fundamental Research 973 Program of China (No. 2011CB706900).

REFERENCES
[1] Xiaoping Zheng, Zengqiang Chen: Inverse Calculation Approaches for Source Determination in Hazardous Chemical Releases. Journal of Loss Prevention in the Process Industries, 2011, doi:10.1016/j.jlp.2011.01.002.
[2] Rajkumar Buyya, Chee Shin Yeo, Srikumar V., James B., Ivona B.: Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility. Future Generation Computer Systems 25, 599-616 (2009).
[3] Hong-Linh Truong, Schahram Dustdar: Composable cost estimation and monitoring for computational applications in cloud computing environments. Procedia Computer Science 1, 2169-2178 (2010).
[4] Arnon Rosenthal, Peter Mork, Maya Hao Li, et al.: Cloud Computing: A New Business Paradigm for Biomedical Information Sharing. Journal of Biomedical Informatics 43, 342-353 (2010).

[5] Gabriel Mateescu, Wolfgang Gentzsch, Calvin J. Ribbens: Hybrid Computing: Where HPC meets grid and Cloud Computing. Future Generation Computer Systems 27, 440-453 (2011).
[6] B. Sotomayor, R. Montero, I. Llorente, I. Foster: Virtual infrastructure management in private and hybrid clouds. IEEE Internet Computing 13(5), 14-22 (2009).
[7] Subodha Kumar, Kaushik Dutta, et al.: Maximizing Business Value by Optimal Assignment of Jobs to Resources in Grid Computing. European Journal of Operational Research 194, 856-872 (2009).
[8] D.E. Goldberg: Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, 1989.
[9] Richard Onchaga: Quality of service management framework for dynamic chaining of geographic information services. International Journal of Applied Earth Observation and Geoinformation 8, 137-148 (2006).
[10] Chunlin Li, Layuan Li: A distributed multiple dimensional QoS constrained resource scheduling optimization policy in computational grid. Journal of Computer and System Sciences 72(4), 706-726 (2006).
[11] Marcelo Finger, Germano C. Bezerra, Danilo R. Conde: Resource use pattern analysis for opportunistic grids. MGC'08, December 1-5, Leuven, Belgium (2008).
[12] Soumen Chakrabarti: Data mining for hypertext: A tutorial survey. SIGKDD Explorations 1(2), 1-11 (2000).
[13] G. Salton, A. Wong, C.S. Yang: A vector space model for automatic indexing. Communications of the ACM 18(11), 613-620 (1975).

Xuelin Shi (Beijing, China; born November 1977) received her Ph.D. in Computer Application Technology from the Department of Computer Science, Beijing Institute of Technology. Her research interests are cloud computing and resource scheduling. She is a lecturer at the School of Information Science and Technology, Beijing University of Chemical Technology.

Yongjie Sui (Beijing, China; born December 1987) is a graduate student at the School of Information Science and Technology, Beijing University of Chemical Technology.

Ying Zhao (Beijing, China; born September 1966) received his Ph.D. in Control Theory and Control Engineering from the School of Information Science and Technology, Beijing University of Chemical Technology. His research interests are computer networks and computing architecture. He is a professor at the School of Information Science and Technology, Beijing University of Chemical Technology.


Quantum Competition Network Model Based On Quantum Entanglement


Yanhua Zhong
Department of Electronics and Information Technology, Jiangmen Polytechnic, Jiangmen 529000, China. E-mail: zhflowers@163.com

Changqing Yuan
Aviation University of Air Force, Changchun 130022, China Email:ycq02@mails.tsinghua.edu.cn

Abstract—This paper proposes a quantum competitive neural network model and compares it with its classical counterpart, from the perspective that localized operations on the relative parts of a composite system do not change its entanglement measure. It shows that pseudo-states are an inevitable part of the quantum competitive model: after initialization of the quantum neural network, the quantum competitive network is capable of associative memory through local operations because of the existence of these pseudo-states. Furthermore, a quantum competitive learning algorithm is given, and finally an example of pattern recognition is simulated. Simulation results show that the quantum competitive learning algorithm is far better than the basic competitive artificial neural network in learning rate and convergence rate.

Index Terms—quantum competition network model; quantum associative memory; pattern recognition; quantum entanglement

I. INTRODUCTION

In the past ten years, academic papers and reports on quantum computing have been widely published, and the theory of quantum computing has made great progress. However, the physical implementation of a practical quantum computer remains very difficult; the main reason is the interaction of the quantum state with the outside world. On one hand, the interaction between quantum particles and the external environment destroys the superposition of qubits, resulting in errors. On the other hand, with the development of artificial neural networks, the limitations and shortcomings of neural computing have gradually become prominent; in particular, Radovan proved in 1997 that the expressive ability of the connectionist neural approach is equivalent to that of traditional symbolic-logic methods, so neural networks share the same limitations as traditional symbolic-logic methods. Since then, neural network research has receded. However, neural networks and the quantum-theoretical description of systems have a striking similarity, so the advantages of quantum computing can be used to compensate for the shortcomings of neural networks. In this sense, the combination of current computing methods and the emerging quantum neural networks has become an important direction for further development. The amazing potential and unusual features demonstrated by quantum computing all derive from quantum transformations beyond traditional computation, while neural computation is an information processing method that simulates biological behavior, and its dynamic characteristics have many similarities with quantum systems. The literature [1-3] also discusses quantum competitive learning algorithms, but those algorithms are relatively simple. The remainder of the paper introduces some key concepts from quantum mechanics, briefly discusses some well-known quantum algorithms, and then details a quantum version of competitive learning. Preliminary empirical results (obtained through simulation on a classical computer) are presented, and these results demonstrate that a quantum competitive learning system is indeed capable of performance that is impossible classically.

II. QUANTUM CONCEPTS

Quantum computation is based upon physical principles from the theory of quantum mechanics, which is in many ways counterintuitive, yet it has provided us with perhaps the most accurate physical theory ever devised. The theory is well established and is covered in its basic form by many textbooks. Several ideas are briefly reviewed here. Linear superposition is closely related to the familiar mathematical principle of linear combination of vectors. Quantum systems are described by a wave function ψ that exists in a Hilbert space. The Hilbert space has a set of states |φ_i> that form a basis, and the system is described by a quantum state

|ψ> = Σ_i c_i |φ_i>   (1)

The state |ψ> is said to be in a linear superposition of the basis states |φ_i>, and in the general case the coefficients c_i may be complex. Use is made here of the Dirac bracket notation, where the ket |ψ> is analogous to a column

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2312-2317


vector, and the bra <ψ| is analogous to the complex conjugate transpose of the ket.

Coherence and decoherence are closely related to the idea of linear superposition. A quantum system is said to be coherent if it is in a linear superposition of its basis states. According to quantum mechanics, if a coherent system interacts in any way with its environment, the superposition is destroyed. This loss of coherence is called decoherence and is governed by the wave function ψ. The coefficients c_i are called probability amplitudes, and |c_i|^2 gives the probability of |ψ> collapsing into state |φ_i> if it decoheres. Note that the wave function ψ describes a real physical system that must collapse to exactly one basis state. Therefore, the probabilities governed by the amplitudes c_i must sum to unity. This necessary constraint is expressed as the unitarity condition

Σ_i |c_i|^2 = 1   (2)

Consider, for example, a discrete physical variable called spin. The simplest spin system is a two-state system, called a spin-1/2 system, whose basis states are usually represented as |↑> (spin up) and |↓> (spin down). In this system the wave function ψ is a distribution over two values, and a coherent state |ψ> is a linear superposition of |↑> and |↓>. One such state might be

|ψ> = (2/√5)|↑> + (1/√5)|↓>   (3)

As long as the system maintains its quantum coherence it cannot be said to be either spin up or spin down; it is in some sense both at once. When this system decoheres the result is, for example, the |↑> state with probability (2/√5)^2 = 0.8. A simple two-state quantum system, such as the spin-1/2 system just introduced, is used as the basic unit of quantum computation. Such a system is referred to as a quantum bit or qubit, and renaming the two states |0> and |1>, it is easy to see why this is so.

Operators on a Hilbert space describe how one wave function is changed into another. Here they will be denoted by a capital letter with a hat, such as Â, and they may be represented as matrices acting on vectors. Using operators, an eigenvalue equation can be written

Â|φ_i> = a_i|φ_i>   (4)

where a_i is the eigenvalue. The solutions |φ_i> to such an equation are called eigenstates and can be used to construct the basis of a Hilbert space as discussed above. In the quantum formalism, all properties are represented as operators whose eigenstates are the basis for the Hilbert space associated with that property and whose eigenvalues are the quantum allowed values for that property. Operators in quantum mechanics must be linear; further, operators that describe the time evolution of a state must be unitary, so that Â†Â = ÂÂ† = I, where I is the identity operator and Â† is the complex conjugate transpose of Â.

Interference is a familiar wave phenomenon. Wave peaks that are in phase interfere constructively while those that are out of phase interfere destructively. This phenomenon is common to all kinds of wave mechanics from water waves to optics, and the well-known double slit experiment proves empirically that interference also applies to the probability waves of quantum mechanics.

III. THE GENERAL COMPETITIVE NEURAL NETWORK

An artificial neural network [6][8] is a system loosely modeled on the human brain. The field goes by many names, such as connectionism, parallel distributed processing, neuro-computing, natural intelligent systems, machine learning algorithms, and artificial neural networks. The architecture is inherently multiprocessor-friendly and, without much modification, goes beyond one or even two processors of the von Neumann architecture. It has the ability to account for any functional dependency: the network discovers (learns, models) the nature of the dependency without needing to be prompted, with no need to postulate a model or to amend it. Competitive learning is an important learning style of artificial neural networks; it can achieve pattern classification and associative memory. The Hamming neural network is a typical competitive learning model; we first introduce its network structure and competitive learning process. The Hamming neural network topology, shown in Figure 1, can be divided into two basic sections: an input layer built with neurons, all of which are connected to all of the network inputs, and an output layer called the MaxNet layer, in which the output of each neuron is connected to the input of each neuron of the same layer and, besides, every neuron is connected to exactly one neuron of the input layer.

Figure 1. Hamming neural network model
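The spin-1/2 example of equations (2) and (3) can be checked numerically. This is a classical simulation of the amplitudes, not quantum code; the variable names are our own.

```python
import math

# Coherent spin-1/2 state of equation (3): amplitudes of |up> and |down>.
amplitudes = [2 / math.sqrt(5), 1 / math.sqrt(5)]

# The probability of collapsing into each basis state is the squared
# amplitude |c_i|^2.
probs = [c * c for c in amplitudes]

# Unitarity condition of equation (2): the probabilities sum to one.
assert abs(sum(probs) - 1.0) < 1e-12
# The probability of observing |up> is (2/sqrt(5))**2, approximately 0.8.
```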


For example, suppose the dimension of the input patterns is m and the number of stored sample patterns (i.e., the network capacity) is n. The learning process can be described as follows.

Pattern matching: assuming the input pattern is X = {x1, x2, ..., xm} and the patterns stored in the network are P^j = (p_ij), 0 ≤ i ≤ m, 0 ≤ j ≤ n, calculate the Hamming distance between X and each P^j, that is,

HD(X, P^j) = (m - X·P^j) / 2,  0 ≤ j ≤ n   (5)

Here, X·P^j is the inner product of the two patterns. Simply put, if the patterns are expressed in binary, the Hamming distance is the number of vector components on which the two patterns take different (i.e., opposite) values; it measures the difference between two patterns. The smaller the value, the closer the two patterns, and vice versa; two identical patterns have Hamming distance zero, and m is the maximum distance. According to the minimum Hamming distance criterion, the winner is the neuron whose y_j = X·P^j is maximum, and the following iteration yields the classification result:

y_j(t + 1) = f( y_j(t) - ε Σ_{k≠j} y_k(t) ),  0 < ε < 1/n   (6)

IV. QUANTUM COMPETITION NEURAL NETWORK

Quantum computing uses the interference properties of quantum states so that the desired results are enhanced and the unwanted results are weakened; the desired results therefore occur with higher probability at measurement time. Over the whole calculation, a quantum algorithm is actually a series of successive unitary operations, describing the evolution of the quantum state from the initial state to the final state. We believe that quantum learning algorithms have similar characteristics. Modeled on the Hamming neural network, the quantum competition network model and its learning algorithm are given below.

A. Quantum Neural Network Model

Consider a discrete physical variable called spin. The simplest spin system is a two-state system, called a spin-1/2 system, whose basis states are usually represented as |0> (spin up) and |1> (spin down). In this system the wave function is a distribution over two values, and a coherent state is a linear superposition of |0> and |1>. One such state might be |ψ> = a|0> + b|1>, where a and b are complex numbers such that |a|^2 + |b|^2 = 1. A composite system with n qubits is described using N = 2^n independent states obtained through the tensor product of the Hilbert spaces associated with each qubit.

Figure 2. The quantum neural model (inputs |x1>, ..., |xi>, ..., |xn> enter the QNN block through the unitary operation U_s, producing the output |y(t)> and |d>)

To better describe the quantum competition network, the concept of the amount of entanglement is introduced. In quantum mechanics, entanglement describes properties between several parts of the state of the same system. Suppose a system is composed of parts A, B, C, and so on; general states of the composite system are described with the density matrix ρ. The amount of entanglement has several properties.

(1) If the state described by the density matrix ρ is not entangled, i.e., it is separable, it can be expressed as a tensor product of states belonging to the different parts:

ρ = Σ_i P_i ρ_i^A ⊗ ρ_i^B ⊗ ...   (7)

Here ρ_i^A, ρ_i^B, ... are the density operators describing the various parts, with P_i ≥ 0 and Σ_i P_i = 1. For non-entangled states the entanglement is zero, E(ρ) = 0.

(2) The entanglement is invariant under unitary transformations local to the various parts (local operations relative to each part). Such operations can be unitary transformations of the basis of one part, the implementation of general measurements, or expanding or discarding part of the Hilbert space; the total entanglement of the system remains unchanged:

E(ρ) = E( U_A ⊗ U_B ρ U_A† ⊗ U_B† )   (8)

B. Quantum Storage Mode

Traditional artificial neural networks (such as Hopfield networks) allow associative pattern recall, but their main drawback is limited storage capacity. For example, storing patterns of length n requires an n-neuron network, which can store only m ≤ kn patterns, with in general 0.15 ≤ k ≤ 0.5. The use of quantum associative memory can greatly expand the memory capacity: given incomplete or distorted input samples, the complete prototype can be restored with relatively large probability.
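The classical Hamming matching of equations (5) and (6) above can be sketched as follows. This is a minimal illustration assuming bipolar ±1-valued patterns; the function names are our own, and the explicit MaxNet iteration is replaced by directly selecting the winner it converges to.

```python
def hamming_distance(x, p):
    """Equation (5): HD(X, P) = (m - X.P) / 2 for +/-1-valued patterns."""
    m = len(x)
    inner = sum(a * b for a, b in zip(x, p))
    return (m - inner) // 2

def recall(x, prototypes):
    """Return the index of the stored pattern nearest to x, i.e. the
    winner that the MaxNet layer of equation (6) converges to."""
    return min(range(len(prototypes)),
               key=lambda j: hamming_distance(x, prototypes[j]))
```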


In storage mode, the difference from the quantum associative memory proposed by Ventura and Martinez is that each qubit corresponds to a neuron: taking the number of bits n of the longest pattern as the number of neurons, the Hadamard transformation H^(n) = H ⊗ H ⊗ ... ⊗ H (n factors) is applied to the initial state |0...0> to obtain an equal-weight superposition state of the qubits. Each basis state represents an initial pattern, so n qubits can store 2^n patterns. When the number of input patterns is less than 2^n, it is inevitable that there are some pseudo-states. For example, storing the three patterns |000>, |010>, |111> requires three qubits; there are five pseudo-states, and the probability amplitude of each pattern is 1/(2√2). Preparing all the qubits in the state |0>, applying the Hadamard transformation to the qubits is written as follows:

|s> = H^(n) |00...0> = (1/2^{n/2}) Σ_{x=0}^{2^n - 1} |x>   (9)

In general, the n qubits span a 2^n-dimensional Hilbert space with 2^n orthogonal states, and a general state is expressed in the 2^n basis states |x> as

|ψ> = Σ_{i=1}^{2^n} C_i |φ_i>   (10)

Note that |φ_i> is one of the 2^n basis states and C_i is its superposition coefficient; x is an n-bit string of 0s and 1s, and H^(n)|0...0> is an equal-weight superposition of the binary numbers from 0 to 2^n - 1. The state |s> can also be expressed as the following formula:

|s> = H^(n) ( |0^i> ⊗ |0> ⊗ |0^{n-i-1}> )
    = (1/2^{i/2}) Σ_{x=0}^{2^i - 1} |x_i> ⊗ (1/√2)(|0> + |1>) ⊗ (1/2^{(n-i-1)/2}) Σ_{x=0}^{2^{n-i-1} - 1} |x_{n-i-1}>
    = (1/2^{(n-1)/2}) Σ_{x=0}^{2^{n-1} - 1} |x_{n-1}> ⊗ (1/√2)(|0> + |1>)   (11)

Here x_j denotes a j-bit binary string. From the above equation, the positions of the binary digits in the quantum system are equivalent.

C. Response Mode

Now the system is divided into two parts: part A (the input layer) and part B (the competitive layer), as shown in Figure 3. Part A searches for the known part of the string, while part B may be in an incomplete state; the two parts are connected by a certain amount of entanglement. Suppose we want to recall the incomplete pattern |y1 y2 ... y_{i-1} ? y_{i+1} ... yn>, where the symbol ? means that bit i is unknown and must be recalled by association. By formula (8) we know that the system's total entanglement does not change when Grover's algorithm is used to find the matching string in the 2^{n-1}-dimensional Hilbert space of the known bits, however unclear the one unknown bit is. So after about T = (π/4)√(2^{n-1}) Grover iterations, the system state becomes

|s> = |y1 y2 ... y_{i-1}> ⊗ (1/√2)(|0> + |1>) ⊗ |y_{i+1} ... yn>   (12)

Then, upon measurement, the system collapses with 50% probability to the state |y1 ... y_{i-1} 0 y_{i+1} ... yn> or |y1 ... y_{i-1} 1 y_{i+1} ... yn>.

Figure 3. The quantum competition network model (inputs |x1>, |x2>, ..., |xi>, ..., |xn> in the input layer are connected through weighted unitary operations W U to the competitive-layer outputs |y1(t)>, ..., |ym(t)>)
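The equal-weight storage state of equation (9) and the 50/50 collapse of the unknown bit in equation (12) can be simulated classically as vectors of amplitudes. This is a sketch; the representation and names are our own.

```python
import math

def equal_superposition(n):
    """Equation (9): H^(n)|0...0> puts amplitude 2**(-n/2) on each of
    the 2**n basis states, so three qubits give amplitude 1/(2*sqrt(2))
    to every 3-bit pattern, stored patterns and pseudo-states alike."""
    amp = 1.0 / math.sqrt(2 ** n)
    return [amp] * (2 ** n)

def unknown_bit_probabilities():
    """After recall the unknown bit is left in (|0> + |1>)/sqrt(2), as in
    equation (12), so measuring it yields 0 or 1 with probability 1/2."""
    amp = 1.0 / math.sqrt(2)
    return {0: amp * amp, 1: amp * amp}
```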

V. PATTERN RECOGNITION USING THE QUANTUM COMPETITION NETWORK

To test the performance of the quantum competition network (QCNN) algorithm, we selected a typical pattern recognition example for simulation, comparing performance with the general competitive neural network. For example, storing the four patterns |000>, |010>, |111>, |110> requires three qubits; there are four pseudo-states, and the probability amplitude of each pattern is 1/(2√2). Now we give an incomplete pattern |11?>, whose first two qubits constitute the input part. Performing the Grover iteration algorithm [4-5] on the input part finds the matching state |11> in about N = (π/4)√(2^{3-1}) ≈ 1 iteration. Using formula (8), we obtain

|11> ⊗ (1/√2)(|0> + |1>) = (1/√2)|111> + (1/√2)|110>

The system then collapses with 50% probability to state |111> or |110>. We chose this typical pattern recognition example for simulation and used the general competitive neural network (GCNN) for performance comparison.

The nine sample points of the pattern recognition task are shown in Figure 4. This is a typical two-class classification problem, which can be seen as a generalization of the exclusive-or problem and is often used as a benchmark of a classification algorithm's ability. GCNN and QCNN were each used as a classifier with a 2-10-1 network structure; the number of iteration steps was limited to 15,000, the error precision was 0.01, and the learning rate took the values 0.1, 0.2, ..., 1. For each learning rate, QCNN and GCNN were each run 100 times, and the number of convergent runs was recorded, together with evaluation indices such as the maximum, minimum, and average numbers of iteration steps. As the learning rate changes, the convergence rates of the two models are compared in Figure 5 and the numbers of iteration steps in Figure 6.

Figure 4. The nine-point pattern distribution map

Figure 5. The comparison chart of GCNN and QCNN convergence rate (percentage rate of convergence versus learning rate from 0.1 to 0.9)

Figure 6. The comparison chart of QCNN and GCNN numbers of iteration steps (maximum, average, and minimum iteration steps versus learning rate from 0.1 to 0.9)

Figure 5 shows that as the learning rate changes, the QCNN convergence rate remains 100%, while the GCNN convergence rate is at minimum 22% and at most only 69%. Figure 6 shows that as the learning rate changes, the average number of QCNN iteration steps is at most 687.70 and at least 275.01, a fluctuation range of only 412.69, while the average number of GCNN iteration steps reaches 12335.45, about 6 times that of QCNN, with a fluctuation range of up to 10638.21, about 26 times that of QCNN. The simulation results show that QCNN not only needs few iteration steps but also has a high convergence rate, and it is strongly robust when the parameters change.

VI. CONCLUSIONS

Ideas from classical neural network theory are recast in a quantum computational framework, using the language of wave functions and operators. The unique characteristics of quantum systems are utilized to produce a quantum competitive learning network capable of storing exponentially more prototype patterns than is possible classically. This demonstrates that quantum computational ideas can be combined with concepts from the field of neural networks to produce useful and interesting results. Simulations using real-world data show that the quantum competitive learner performs very favorably during pattern recall. Ongoing work focuses on discovering new operators to improve performance by weighted feature discovery. Future work includes searching for further applications of quantum computation to neural networks and generally further developing the field of quantum computational learning.

ACKNOWLEDGMENT

This work was supported by the Natural Science Foundation of China under Grant No. 10902125.


REFERENCES

[1] Feynman, R. P.: Quantum Mechanical Computers. Foundations of Physics 16, 507-531 (1986).
[2] Deutsch, D.: Quantum computational networks. Proc. Roy. Soc. Lond. A 439, 553-558 (1992).
[3] Zhou Rigui: Quantum Competitive Neural Network. International Journal of Theoretical Physics 49(1), 110-119 (2010).
[4] Grover, L. K.: A Fast Quantum Mechanical Algorithm for Database Search. Proceedings of the 28th Annual ACM Symposium on the Theory of Computing, pp. 212-219, ACM, New York (1996).
[5] Grover, L. K.: Quantum Mechanics Helps in Searching for a Needle in a Haystack. Phys. Rev. Lett. 79, 325 (1997).
[6] Ventura, D., Martinez, T.: Quantum associative memory. Information Sciences 124(1-4), 273-296 (2000).
[7] Liu Yong, Ma Liang, Ning Ai-Bing: Quantum competitive decision algorithm and its application in TSP. Application Research of Computers 27(2), 586-589 (Feb. 2010).
[8] Sun, J. G., He, Y. G.: A Quantum Search Algorithm. Journal of Software 14(3), 334-344 (2003).
[9] Trugenberger, C. A.: Phase Transitions in Quantum Pattern Recognition. Phys. Rev. Lett. 89(27), 277903 (2002).
[10] Trugenberger, C. A.: Probabilistic Quantum Memories. Phys. Rev. Lett. 87(6), 067901 (2001).
[11] Li, C. Z.: Quantum Communication and Quantum Computation. The National University of Defense Technology Press (2000).
[12] Xie, G. J., Zhuang, Z. Q.: Research on Quantum Neural Computational Models. Journal of Circuits and Systems (China) 7, 83-88 (2002).
[13] Li, C. Z.: Quantum Communication and Quantum Computation. The National University of Defense Technology Press (2000).
[14] Gao Yang: BP Neural Network's Quantum Learning and its Application. Science Mosaic (7) (2010).
[15] Jen-Pin Yang, Yu-Ju Chen, Huang-Chu Huang, Sung-Ning Tsai, Rey-Chue Hwang: The Estimations of Mechanical Property of Rolled Steel Bar by Using Quantum Neural Network. Advances in Intelligent and Soft Computing 56, 799-806 (2009).
[16] Li Peng, Yan Yan, Zheng Wu-Jun, Jia Jian-Fu, Bai Bo: Short-term load forecasting based on quantum neural network by evidential theory. Power System Protection and Control 38(16), 49-53 (Aug. 2010).
[17] Long Bohua, An Yanghong, Xu Hui, Sun Lei, Wen Juan: Fault Diagnosis of Power Electronic Circuits Based on Quantum Neural Network. Transactions of China Electrotechnical Society 24(10) (2009).
[18] Zheng Ling-xiang, Zhou Chang-le: Using Quantum Associative Memory to Simulate the Brain Functions. Journal of Donghua University 27(2) (2010).
[19] McElroy, J., Gader, P.: Generalized Encoding and Decoding Operators for Lattice-Based Associative Memories. Neural Networks 20(10), 1674-1678 (Oct. 2009).
[20] Wei Sun, Yu Jun He, Ming Meng: A Novel Quantum Neural Network Model with Variable Selection for Short Term Load Forecasting. Applied Mechanics and Materials 20-23, 612-617 (2010).
[21] Lian, G. Y., Huang, K. L., Chen, J. H., Gao, F. Q.: Training algorithm for radial basis function neural network based on quantum-behaved particle swarm optimization. International Journal of Computer Mathematics 87(3), 629-641 (2010).
[22] Panchi Li, Kaoping Song, Erlong Yang: Quantum genetic algorithm and its application to designing fuzzy neural controller. International Conference on Natural Computation (ICNC), pp. 2994-2998 (2010).
[23] Christopher Altman, Roman R. Zapatrin: Backpropagation training in adaptive quantum networks. International Journal of Theoretical Physics 49, 2991 (2010).

Yanhua Zhong was born in Guangdong Province, China, in October 1974. She received the M.S. degree in Software Engineering from the Computer College, Guangdong University of Technology, China. Her research interests are intelligent algorithms and quantum computing. She is an associate professor at Jiangmen Polytechnic.

Changqing Yuan was born in Jilin Province, China, in October 1974. He received the Ph.D. degree in Mechanical Engineering from the School of Aerospace, Tsinghua University. His research interests are spacecraft dynamics and control. He is an associate professor at Aviation University of Air Force.

2012 ACADEMY PUBLISHER


JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012

BP Neural Network based on PSO Algorithm for Temperature Characteristics of Gas Nanosensor
Weiguo Zhao
Center of Education Technology, Hebei University of Engineering, Handan 056038, China Email: zwg770123@163.com

Abstract: To comprehensively understand the relationship between temperature and sensitivity in gas nanosensors, this paper develops a back-propagation (BP) neural network based on Particle Swarm Optimization (PSO), which is applied to fitting the temperature-sensitivity characteristic of an SnO2 gas nanosensor mixed with benzene. The simulation results show that PSO can effectively optimize the structure of the BP network: the fitting accuracy of the nanosensor temperature characteristic obtained with the optimized BP model is greatly improved, the optimized network has better generalization performance than the traditional BP network, and the acquired curve is both smooth and accurate. The study therefore shows that the BP-PSO neural network is effective for fitting the temperature characteristics of gas nanosensors.

Index Terms: gas nanosensor, temperature characteristics, neural network, PSO

I. INTRODUCTION

With the development of electronic and communication techniques, sensor technology has improved greatly and sensors have become indispensable equipment with widespread applications in industry, agriculture, transportation and so on. As one of the most important branches of sensor research, nanosensors have been studied by many institutions and a great number of researchers for some ten years. A nanosensor is a sensor built on the atomic scale, based on measurements in nanometers; nanosensors are biological, chemical, or surgical sensory points used to convey information about nanoparticles to the macroscopic world. Their uses mainly include various medicinal purposes and serving as gateways to building other nanoproducts, such as computer chips that work at the nanoscale, and nanorobots [1, 2]. Presently, there have been a number of advances in the research and development of nanosensors for a number of different applications. Some of the major applications are the medical field, national security, aerospace, integrated circuits, and many more. Along with the many different applications for nanosensors, there are also many different types of nanosensor and a number of ways to manufacture them [3]. Nanotechnology offers the
Manuscript received Sept. 10, 2011; revised Oct. 17, 2011; accepted Oct. 28, 2011. Project number: 2011138 and E2010001026

promise of improved gas sensors with low power consumption and fast response times, which will enable portability for a wide range of applications. Indeed, nanostructured materials such as nanotubes and nanowires have been shown to be suitable for sensing various gases [4]. Gas nanosensors are mainly used for testing gas density and humidity [5]. Currently, the majority of nanosensors are made of nano SnO2 film mixed with different heavy-metal particles, which greatly enhances their freedom and flexibility. In recent years, many nanosensor studies and applications have shown that gas nanosensors have various advantages such as good stability, high sensitivity, fast response, and high accuracy. But the sensitivity of a gas nanosensor is very sensitive to the temperature of the environment being tested, so sensitivity is considered an important parameter for the gas nanosensor and for measurement system simulation [6, 7]. There exists a nonlinear relationship between temperature and sensitivity in gas nanosensors, yet many applications have been based on a linear computation, which results in a large nonlinear error. So, to eliminate or compensate for the nonlinear error and to gain a thorough understanding of the relation between temperature and sensitivity, several fitting models have been established based on test data. Liu Haiyan [8] compared the cubic polynomial method and least-squares spline fitting for gas nanosensors; the simulation results show that the accuracy of the cubic polynomial method is better than that of least-squares spline fitting, and its fitted curve is smooth without any breaks at the connections. ZENG Zhezhao [7] introduced a neural network algorithm with Fourier basis functions for fitting the temperature characteristics of gas nanosensors; the experiments showed that the resulting fit is both smooth and accurate.
In recent decades, as a good nonlinear model, the Artificial Neural Network (ANN) has been widely applied to many complex nonlinear problems, and the BP neural network is one of the most widely used network types. It has a powerful capability to generalize the nonlinear relationship between input and desired output, but the traditional BP network can become trapped in local minima and converges slowly during training. Guo Wenxian [9] proposed a BP neural network model based on PSO that was

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2318-2323


applied to predict the river temperature of the Yangtze River; the study proved that the PSO-BP neural network model was effective for river temperature prediction. Wang Ping [10] established an improved algorithm called PSO-BP; the new algorithm fully exploits the nonlinear approximation ability of a multilayer feedforward network, improves the performance of the ANN, and provides a favorable basis for further online application of a comprehensive model. It was applied to mechanical property prediction of a strip model, showing excellent performance. To overcome the above shortcomings, in this paper Particle Swarm Optimization is used to train and optimize the BP network structure, including its weights and thresholds; this alleviates the over-fitting and local-minima problems of the BP neural network. The established BP network is then applied to fitting the relationship between temperature and sensitivity in the SnO2 gas nanosensor. The simulation results show that the fitting curve is accurate and that the BP network based on PSO is practical and effective.

Figure 1. Structure of BP neural network

II. BP NEURAL NETWORK


As a type of complex system, an artificial neural network is made up of a large number of neurons; it can simulate the way a human deals with a problem, process information in parallel, and perform nonlinear transformations. An artificial neural network handles information by first being trained, then composing linear functions and obtaining fitted weights, and finally completing the nonlinear processing of the various information. Even with missing samples and drifting parameters, it can still guarantee stable output. These characteristics of artificial neural networks have been successfully exploited in many fields, including pattern recognition, image processing, intelligent control, optimal calculation, artificial intelligence and so on [11]. Back propagation was proposed by generalizing the Widrow-Hoff learning rule to multiple-layer networks with nonlinear differentiable transfer functions. Input vectors and corresponding target vectors are used to train a network until it can approximate a function, associate input vectors with specific output vectors, or classify input vectors in an appropriate way as defined in this study. The back-propagation algorithm consists of two paths, a forward path and a backward path. The forward path contains creating a feed-forward network, initializing weights, simulation, and training the network. The network weights and biases are updated in the backward path [12]. A typical three-layer BP neural network with 4 inputs is shown in Fig. 1.

In general, the implementation of the training algorithm uses the following sigmoid, whose value lies in (0, 1), as the output function of each node:

f(x) = 1 / (1 + e^(-x))    (1)

The error between the actual output Y of the BP network and the expected output T is:

E = (1/2) Σ_{i=1}^{n} (y_i - t_i)^2    (2)

When we train a neural network with a gradient-descent method based on back propagation [13], the network is provided with a set of training samples along with their target outputs. One by one, each sample is placed into the inputs of the neural network. The resulting outputs are then compared to the target values, and an error is calculated for each node, starting with nodes in the output layer and propagating backward toward nodes in the input layer. The error at an output node i, with respect to its target t and output o, is

δ_i = o(1 - o)(t - o)    (3)

The error at a hidden node i, with respect to each of its downstream connections j, is

δ_i = o(1 - o) Σ_{j ∈ Downstream(i)} w_{ji} δ_j    (4)

Each weight w_{ji} between nodes i and j is adjusted based on a learning constant η, the calculated error δ_j at the target node j, and the output x_i of the source node i:

Δw_{ji} = η δ_j x_i    (5)
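The forward and backward paths of equations (1)-(5) can be sketched in a few lines of NumPy for a single hidden layer. This is a minimal illustration, not the authors' implementation; the function name, learning-rate default, and matrix shapes are our own assumptions.

```python
import numpy as np

def sigmoid(x):
    # Eq. (1): logistic output function, values in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(x, t, w_hidden, w_out, eta=0.1):
    """One gradient-descent update following Eqs. (2)-(5).
    x: input vector, t: target vector; the weight matrices are updated in place."""
    # forward path
    h = sigmoid(w_hidden @ x)            # hidden-layer activations
    y = sigmoid(w_out @ h)               # network output
    error = 0.5 * np.sum((y - t) ** 2)   # Eq. (2)
    # backward path
    delta_out = y * (1 - y) * (t - y)                 # Eq. (3)
    delta_hid = h * (1 - h) * (w_out.T @ delta_out)   # Eq. (4)
    # Eq. (5): adjust each weight by eta * delta_j * x_i
    w_out += eta * np.outer(delta_out, h)
    w_hidden += eta * np.outer(delta_hid, x)
    return error
```

Repeated calls on the same sample should drive the error of Eq. (2) toward zero, which is a quick sanity check of the signs in (3)-(5).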

III. PSO ALGORITHM


PSO is a method for performing numerical optimization without explicit knowledge of the gradient of the problem to be optimized [14, 15]. PSO was developed through simulation of bird flocking and fish schooling in two-dimensional space and is based on a simple concept, so the computation time is short and little memory is required. It was originally developed for nonlinear optimization problems with continuous variables and is easily extended to treat problems with discrete variables. The position of each individual (agent) is represented by its XY-axis position, and its velocity is expressed by vx (the velocity along the X axis) and vy (the velocity along the Y axis). Modification of the agent position is realized using the position and velocity information. An optimization technique based on the above concept can be described as follows: bird flocking optimizes a certain objective function. Each agent knows


its best value so far (pbest) and its XY position. Moreover, each agent knows the best value so far in the group (gbest) among the pbests. Each agent tries to modify its position using the following information: the current positions (x, y), the current velocities (vx, vy), and the distances between the current position and pbest and gbest. This modification can be represented by the concept of velocity. The velocity of each agent can be modified by the following equation:

v_i^(k+1) = w·v_i^k + c1·rand()·(pbest_i - x_i^k) + c2·rand()·(gbest - x_i^k)    (6)

where v_i^k is the velocity of agent i at iteration k, w the inertia weight, c_j a weight factor, rand() a random number between 0 and 1, x_i^k the current position of agent i at iteration k, and pbest_i the pbest of agent i. Using the above equation, a velocity that gradually approaches pbest and gbest can be calculated. The current position (the searching point in the solution space) can then be modified by the following equation:

x_i^(k+1) = x_i^k + v_i^(k+1)    (7)

IV. BP NETWORK BASED ON PSO

The BP network has excellent nonlinear approximation characteristics and has been widely used in many fields, but it suffers from an inherently slow searching rate and a tendency to get trapped in local minima. To improve the performance of the BP network, we use PSO to optimize all weights and thresholds in the BP network; that is, we adopt a global search for the optimum weights and thresholds instead of BP itself. This method improves the quality of the solutions and increases the convergence speed of the BP network. The flowchart of the BP network based on PSO is shown in Fig. 2, and the procedure is summarized as follows:

Step 1: Normalize the training samples and test samples into [-1, 1].
Step 2: Define the network structure of the BP network according to the input samples and the output samples.
Step 3: Initialize the parameters m, w, c1, c2, λ, where m is the population size, w the inertia weight, ci a weight factor, and λ a parameter of identification (coefficient of the nonlinear rectification equation); the velocity and position of each particle are initialized randomly.
Step 4: Each particle's velocity is updated according to (6) and each particle's position is updated according to (7).
Step 5: Each particle's fitness is evaluated. The mean square error (MSE) of the BP neural network is used as the fitness function to guide the particle population in searching for the optimum solution; all the training samples are run through forward propagation for each particle so as to generate that particle's training error.
Step 6: The personal best position pbest and the global best position gbest are updated.
Step 7: If the maximum number of iterations is reached or the optimum solution is acquired, the algorithm stops; otherwise return to Step 4.

Figure 2. The flowchart of the BP network based on PSO
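The velocity and position updates of Eqs. (6)-(7) and Steps 3-7 can be sketched as follows. This is an illustrative NumPy sketch, not the paper's code: the fitness callable stands in for the BP network's training MSE, and all names and default parameter values are our own assumptions.

```python
import numpy as np

def pso_minimize(fitness, dim, m=50, iters=100,
                 w_start=0.9, w_end=0.2, c1=2.0, c2=2.0, seed=0):
    """Global-best PSO with a linearly decaying inertia weight.
    `fitness` maps a position vector to a scalar to be minimized."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, (m, dim))        # positions (Step 3)
    v = rng.uniform(-0.1, 0.1, (m, dim))        # velocities (Step 3)
    pbest = x.copy()
    pbest_f = np.array([fitness(p) for p in x])
    gbest = pbest[np.argmin(pbest_f)].copy()
    for k in range(iters):
        w = w_start + (w_end - w_start) * k / iters   # decaying inertia weight
        r1 = rng.random((m, dim))
        r2 = rng.random((m, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)   # Eq. (6)
        x = x + v                                                   # Eq. (7)
        f = np.array([fitness(p) for p in x])       # Step 5: evaluate fitness
        improved = f < pbest_f                       # Step 6: update pbest/gbest
        pbest[improved] = x[improved]
        pbest_f[improved] = f[improved]
        gbest = pbest[np.argmin(pbest_f)].copy()
    return gbest, float(pbest_f.min())               # Step 7: best solution
```

With the BP network in place of the stand-in, each position vector would hold the flattened weights and thresholds and `fitness` would return the training MSE.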

V. EXPERIMENT AND RESULTS


We placed SnO2 gas nanosensors mixed with different concentrations of benzene into testing equipment in which the gas density and pressure are constant; the temperature was then changed in steps of 10 °C and the sensitivity of the nanosensor was recorded. The experimental results relating temperature and sensitivity are listed in [8]. In this paper, the BP network based on PSO is applied to fit the temperature characteristics of the gas nanosensor [16]. First, we need to determine the involved parameters, which are given as follows [9]:


a = 2i + 1    (8)

b = k(m - 1)/(x·i + k) - 1    (9)
where x is constantly set to 2, k is the number of neurons in the output layer, i is the number of neurons in the input layer, and m is the size of the training set; the number of neurons in the hidden layer, h, is searched in the range [a, b], so the number of parameters to be optimized by the PSO in the BP network is ih + h + hk + k. The PSO algorithm we use is the standard global version with inertia weight; the population size is set to 50, the maximum number of iterations is 100, the acceleration factors c1 and c2 are both 2.0, and a decaying inertia weight w starting at 0.9 and ending at 0.2 is used. For comparison, the standard BP network and the BP network based on PSO are both applied to the temperature characteristics of the gas nanosensor. The training parameters are set as follows: the learning rate is 0.02 and the momentum constant is 0.9; the weights and biases are initialized randomly. The two methods are trained with the same training samples, and the same test samples are used in testing. The 30 sensitivity values corresponding to different temperatures are used as the dataset, as shown in Table 1; 25 samples serve as training samples and the remaining 5 randomly selected samples as test samples for validation. We use the training samples to train the BP network based on PSO; the final number of hidden-layer neurons is 5, so the structure of both networks is 1-5-1. Fig. 3 and Fig. 4 demonstrate the convergence process of the two methods during training, Fig. 5 and Fig. 6 show their fitting results, and Table 2 summarizes the training results of the two methods on the training samples. It can be seen that the BP network optimized using PSO achieves the optimal error at iteration 26 with an MSE of 0.00239, while the BP network achieves the optimal error at epoch 802 with an MSE of 0.00526.
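The parameter count ih + h + hk + k stated above can be checked directly; for the 1-5-1 structure used here it gives 16 values for the PSO to optimize. The function name below is hypothetical.

```python
def pso_param_count(i, h, k):
    """Number of BP parameters optimized by PSO:
    input->hidden weights (i*h) + hidden thresholds (h)
    + hidden->output weights (h*k) + output thresholds (k)."""
    return i * h + h + h * k + k

# the 1-5-1 network used in this experiment
print(pso_param_count(1, 5, 1))  # 16
```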
It is clear that the BP network based on PSO requires less training time and has a faster convergence speed than the BP network; moreover, the training accuracy of the former is better than that of the latter. We have thus established a good prediction model based on the BP network optimized using PSO, which is applied to predict the sensitivity of the nanosensor at a desired temperature.
TABLE I. THE PERFORMANCE COMPARISON OF TWO METHODS

Method             Epoch (Iteration)    Training error (MSE)
BP                 802                  0.00526
BP based on PSO    26                   0.00239

Figure 3. Convergence process of BP network

Figure 4. Convergence of BP network based on PSO

Figure 5. Fitting result of BP network


Figure 6. Fitting result of BP based on PSO

Figure 8. Predicted value versus actual value for BP-PSO


The prediction results for the remaining 5 samples using the two methods are depicted in Fig. 7 and Fig. 8, respectively. They show that the predicted values of the BP network based on PSO lie nearer to the diagonal line, indicating higher prediction accuracy than its counterpart. Fig. 9 illustrates the comparison of the relative errors on the test samples for the two methods; the relative error of most of the test samples using BP-PSO is smaller than that using the BP network, except for the fifth test sample. The average relative error of the BP network is 0.0126, while that of BP-PSO is only 0.0093, which proves that the proposed method has better prediction ability than its counterpart. We can see that BP-PSO has better generalization performance, in both convergence speed and prediction ability. These results show that PSO has good global search capability and can refine the optimal parameters of the BP network structure, which then better reflects the nonlinear relationship between the temperature and the sensitivity of the nanosensor.

Figure 9. Comparison of relative error for two methods
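The average relative errors quoted above (0.0126 for BP versus 0.0093 for BP-PSO) are presumably the mean of |predicted - actual| / |actual| over the five test samples; a sketch of that metric, under this assumption (the function name is our own):

```python
import numpy as np

def average_relative_error(predicted, actual):
    """Mean relative error over the test samples:
    mean of |y_hat - y| / |y|."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean(np.abs(predicted - actual) / np.abs(actual)))
```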

VI. CONCLUSIONS

In this paper, we have proposed a BP neural network model whose structure is optimized using PSO, and it is applied to fitting the temperature characteristics of an SnO2 gas nanosensor mixed with benzene. The simulation results show that PSO can effectively refine the structure of the BP neural network; the proposed method achieves a large improvement in both accuracy and convergence speed over the traditional BP neural network, so it provides a practical and effective method for gas nanosensors. Our future research is to investigate an online fitting model and apply it to gas nanosensors.

ACKNOWLEDGEMENTS

This work is supported by the Science and Technology Research Project of Universities of Hebei Province, No. 2011138, and the Natural Science Foundation of Hebei Province of China, No. E2010001026.

Figure 7. Predicted value versus actual value for BP network

REFERENCES
[1] Foster LE, "Medical Nanotechnology: Science, Innovation, and Opportunity", Upper Saddle River: Pearson Education, 2006. [2] http://en.wikipedia.org/wiki/Nanosensors [3] http://www.rachelheil.com/courses/Nanotechnology/Nano. html


[4] Ting Zhang, Syed Mubeen, Bongyoung Yoo, Nosang V. Myung, Marc A. Deshusses, "A gas nanosensor unaffected by humidity", Nanotechnology, Vol. 20, 2009, pp. 1-5.
[5] ZHAI Lin, ZHONG Fei, LIU Peng-yi, "Advances in research on TiO2 gas sensors", Sensor World, Vol. 10, No. 12, 2005, p. 69.
[6] JIAN Qifei, LIU Haiyan, "The Fitting of the Characteristic Curve of Nanosensor Based on Cubic Spline Function", Chinese Journal of Sensors and Actuators, Vol. 18, No. 1, 2005, pp. 50-52.
[7] ZENG Zhezhao, ZHU Wei, SUN Xianghai, WANG Yaonan, "Approach Fitting the Temperature Characteristic Curve of Sensor with a High Accuracy Based on Neural Network Algorithm", Chinese Journal of Sensors and Actuators, Vol. 20, No. 2, 2007, pp. 325-328.
[8] Liu Haiyan, Jian Qifei, "Fitting the Sensitivity-temperature Curves of Nanosized Gas Sensor", Journal of South China University of Technology, Vol. 32, No. 6, 2004, pp. 27-30.
[9] Guo Wenxian, Wang Hongxiang, Xu Jianxin, Dong Wensheng, "PSO-BP Neural Network Model for Predicting Water Temperature in the Middle of the Yangtze River", 2010 International Conference on Intelligent Computation Technology and Automation, 2010, pp. 951-954.
[10] WANG Ping, HUANG Zhen-yi, ZHANG Ming-ya, ZHAO Xue-wu, "Mechanical Property Prediction of Strip Model Based on PSO-BP Neural Network", Journal of Iron and Steel Research, International, Vol. 15, No. 3, 2008, pp. 87-91.
[11] Sun ZL, Tan YZ, "Back-propagation binding model on colliery safety estimation", Progress in Safety Science and Technology, Vol. 6, Pts A and B, 2006, pp. 55-58.

[12] Amini, J., "Optimum learning rate in back propagation neural network for classification of satellite images", Scientia Iranica, Vol. 15, No. 6, 2008, pp. 558-567.
[13] Sam Gardner, Robbie Lamb, John Paxton, "An initial investigation on the use of fractional calculus with neural networks", Proceedings of Computational Intelligence, 2006, pp. 186-191.
[14] Kennedy J, Eberhart R, "Particle Swarm Optimization", Proceedings of the Fourth IEEE International Conference on Neural Networks, Perth, Australia, IEEE Service Center, 1995, pp. 1942-1948.
[15] Amany El-Zonkoly, "Particle Swarm Optimization for Solving the Problem of Transmission Systems and Generation Expansion", Mansoura Engineering, Vol. 30, 2005, pp. 10-15.
[16] Kewen Li, "Predicting Software Quality by Optimized BP Network Based on PSO", Journal of Computers, Vol. 6, No. 1, 2011, pp. 122-129.

Weiguo Zhao was born in Xingtai, Hebei Province, China, in 1977. He received the B.S. degree from the School of Information and Electronic Engineering, Hebei University of Engineering, in 2001, and the M.S. degree from the School of Computer Science and Software Engineering, Hebei University of Technology, in 2005. He is now a teacher at Hebei University of Engineering; his current research interests include intelligent computing, image processing, and intelligent fault diagnosis systems.


Hybrid SVM-HMM Diagnosis Method for Rotor-Gear-Bearing Transmission System


Qiang Shao
Department of Mechanical Engineering, University of Dalian Nationalities, Dalian, Liaoning Province, China
Email: sq@dlnu.edu.cn

Changjian Feng
Department of Mechanical Engineering, University of Dalian Nationalities, Dalian, Liaoning Province, China
Email: fcj@dlnu.edu.cn

Abstract: Nonstationary time series occur when a plant proceeds from a normal state to an abnormal state or a transient situation. It is therefore necessary to identify the type of fault during its early stages so that appropriate operator actions can be selected to prevent a more severe situation. This paper proposes a new architecture for identification of such time series. It converts the output of a support vector machine (SVM) into a posterior probability, computed by the combined use of a sigmoid function and a Gauss model, which acts as a probability evaluator in the hidden states of a hidden Markov model (HMM). Experiments show that the architecture is very effective.

Index Terms: HMM; SVM; identification; pattern recognition

I. INTRODUCTION

In modern, unmanned machining systems, including dedicated transfer lines, flexible manufacturing systems, and Reconfigurable Manufacturing Systems (RMS), one crucial component is a reliable and effective monitoring system to monitor process conditions and to take remedial action when failure occurs or is imminent. Vibration monitoring is adopted because of its cheapness and convenience. However, the monitored vibration signals are usually nonstationary time series, and their detection and identification belong to the problem of dynamic pattern recognition. Time often plays a secondary role: it should be incorporated in the feature-extraction procedure. For practical recognition tasks, the assumption of stationarity of the class distributions may not hold. Alternatively, information in sequences of feature vectors may be used for recognition. We will call both groups of problems dynamic pattern recognition problems. A dynamic pattern is a multidimensional pattern that evolves as a function of time. A set of feature vectors can be looked upon as the result of independent draws from a multidimensional distribution. All temporal information should then be present in each feature vector. The identification problem may

then be based on the dissimilarity of a set of newly measured feature vectors with respect to a set of known templates. HMMs have proved to be among the most widely used tools for learning probabilistic models of dynamical time series [1-3]. An HMM can model the variation of dynamical behaviours existing in the system through a latent variable (hidden states). HMMs are good at dealing with sequential inputs, while SVMs show superior performance in classification. Furthermore, the former approach usually provides an intra-class measure while the latter captures inter-class differences. Since these two classifiers use different criteria, they can be combined to yield an ideal one. The output of the SVM is converted into a posterior probability, computed by the combined use of a sigmoid function and a Gauss model, which acts as a probability evaluator in the hidden states of the HMM. This paper applies the SVM-HMM method to the identification of nonstationary time series of rolling-element bearings. The results show that the proposed method is effective.

II. SVM AND ITS OUTPUT PROBABILITY

A. SVM for Classification
The support vector machine (SVM) has been widely used in pattern recognition and regression due to its computational efficiency and good generalization performance. It originated from the idea of structural risk minimization, which was developed by Vapnik in the 1970s [4]. In order to introduce it into the HMM, we must convert the real-valued output of the SVM to a probability. The power of SVMs lies in their ability to transform data to a high-dimensional space where the data can be separated using a linear hyperplane. The optimization process for SVM learning therefore begins with the definition of a functional that needs to be optimized in terms of the parameters of a hyperplane. The functional is defined such that it guarantees good classification on the training data and also maximizes the margin [5-8]. The points that lie on the hyperplane satisfy

w·x + b = 0    (1)

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2324-2329


where w is the normal to the hyperplane and b is the bias of the hyperplane from the origin. Given a labelled training data set {x_i, y_i}, i = 1, …, N (x_i ∈ R^d and y_i ∈ {±1}), where x_i is the input vector and y_i is its class label, an SVM constructs the discriminant function for classification as

f(x) = sign(w·x + b)    (2)

In order to maximize the separating margin, the following optimization problem is solved. Minimize

Φ(w, ξ) = (1/2)(w·w) + C Σ_{i=1}^{n} ξ_i    (3)

subject to the constraints

y_i((w·x_i) + b) ≥ 1 - ξ_i  and  ξ_i ≥ 0,  i = 1, 2, …, n    (4)

The result of the above problem is found by maximizing the following function:

W(α) = Σ_{i=1}^{n} α_i - (1/2) Σ_{i,j=1}^{n} α_i α_j y_i y_j k(x_i, x_j)    (5)

subject to

0 ≤ α_i ≤ C,  i = 1, 2, …, n,  and  Σ_{i=1}^{n} α_i y_i = 0    (6)

where k(x_i, x_j) is the kernel function. The resulting discriminant function is

f(x) = sign[Σ_{i=1}^{n} α_i y_i k(x, x_i) + b]    (7)

B. SVM's Output Probability [4,9]
In general the outputs of an SVM are symbols that represent the class labels. Here only the real-valued outputs are considered and converted to an output probability. The real value is given by formula (2) as

g(x) = w·x + b    (8)

Noting that the training samples are all normalized, the closest points (support vectors) to the hyperplane satisfy g(x) = ±1, and the points on the hyperplane satisfy g(x) = 0; for other points,

g(x) = d·‖w‖    (9)

where d is the distance between the vector x and the hyperplane; the positive and negative signs denote samples on the two sides of the hyperplane. Then for any sample point x,

d_x = g(x)/‖w‖    (10)

and for the support vectors,

d_sv = 1/‖w‖    (11)

Obviously g(x) is the ratio of d_x and d_sv. Therefore we can obtain the output probability of the SVM through the sigmoid function as follows:

P(C_{+1} | x) = 1/(1 + e^(-g(x)))    (12)

P(C_{-1} | x) = 1/(1 + e^(g(x)))    (13)

C. Output Probabilities of Multi-class SVM [4,9]
Only the binary SVM has been discussed above. For the multi-class problem, the multi-class SVM can be transformed into a series of binary subtasks that are trained by binary SVMs. In this paper the one-against-one (OAO) decomposition strategy is adopted. The output probability of each binary SVM is calculated by the method described in equations (12) and (13). Construct the feature vector

V(x) = [P_{ia1}(C_i | x), …, P_{iaj}(C_i | x), …, P_{iaM}(C_i | x)]^T    (14)

where P_{iaj}(C_i | x) denotes the output probability of the binary SVM determined by the i-th type and the j-th type training samples (i ≠ j). The feature vector can be transformed into an output probability by a Gauss model as follows [9]:

P(C_i | x) = N(V(x), μ, Σ) = (2π)^(-d/2) |Σ|^(-1/2) exp[-(1/2)(V(x) - μ)^T Σ^(-1) (V(x) - μ)]    (15)

III. DESIGN OF IDENTIFICATION OF PATTERN

A. HMM for the Identification Problem
The identification problem for a time series is defined as the classification of signal type ω_j, given the sequential input pattern X_t at time t. The input pattern X_t is mathematically defined as an object described by a sequence of features at time t [10-12]:

X_t = (x_1, x_2, …, x_d)    (16)

The space of input patterns X_t consists of the set of all possible patterns: X_t ∈ R^d, where R^d is a d-dimensional real vector space. The k observed data up to time t are defined as

Ω_t^k = {X_{t-k+1}, …, X_{t-1}, X_t}    (17)

The set of possible signal classes ω_j forms the space of classes Ω:

Ω = {ω_1, ω_2, …, ω_c}    (18)

where c is the number of classes. The identification task can be considered to be the finding of a function f which maps the space of input patterns Ω_t^k to the space of classes Ω. Nonstationary time series often exhibit sequentially changing behaviours. If one short-time period is defined

2012 ACADEMY PUBLISHER


JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012

to a frame, the probability of a particular frame transition is different for each type of time series. Therefore, the probability of a frame's existence, and of a particular transition between frames, can be statistically modelled. The probability of a specific signal is already known, and is called the prior probability. When identifying a specific time series, a decision could be made simply by selecting the type of signal ω with the highest prior probability P(ω), but such a decision is probably unreasonable. It is more reasonable to determine the type of time series after observing the trend of its major variables, that is, to obtain the conditional probability P(ω | Φ_t^k). This conditional probability is called the posterior probability. Decision-making based on the posterior probability is more reliable, because it employs a priori knowledge together with the observed time-series data. Classification of an unknown pattern X_t corresponds to finding the optimal model that maximizes the conditional probability P(ω | Φ_t^k) over the whole time series of type ω. One can apply Bayes' rule to calculate the posterior probability:

P(ω | Φ_t^k) = max_ω [ P(Φ_t^k | ω) P(ω) / P(Φ_t^k) ]   (19)

The conditional probability P(Φ_t^k | ω) comes from comparing the shapes of the time-series models with the input observations, while the a priori probability P(ω) comes from the accident probability. Since P(Φ_t^k) is independent of ω,

P(ω | Φ_t^k) ∝ max_ω { P(Φ_t^k | ω) P(ω) }   (20)

In fact it is difficult to calculate an a priori probability P(ω) satisfying

Σ_{j=1}^{c} P(ω_j) = 1   (21)

The HMM can successfully treat the identification of nonstationary time series within a probabilistic or statistical framework. In this identification problem, the HMM is used to estimate the conditional probability P(ω | Φ_t^k).

B. Vibration Feature Extraction

Linear predictors are used to predict the value of the next sample of a signal as a linear combination of the previous samples. The next sample s_n is predicted as the weighted sum of the p previous samples s_{n-1}, s_{n-2}, …, s_{n-p}:

ŝ_n = a_1 s_{n-1} + a_2 s_{n-2} + … + a_p s_{n-p} = Σ_{i=1}^{p} a_i s_{n-i}   (22)

The residual error e_n is defined as the difference between the actual and predicted values of the next sample and can be expressed as

e_n = s_n − ŝ_n = s_n − Σ_{i=1}^{p} a_i s_{n-i}   (23)

The weighting coefficients, also referred to as the linear prediction coefficients (LPC) a_1, a_2, …, a_p, can be calculated by minimizing some functional of the residual signal e_n over each analysis window. Different methods can be used to find the linear prediction coefficients; the coefficients of linear predictors equal the coefficients of AR models [3,9]. Vibration signals are nonstationary, so the future behaviour of a vibration signal is unpredictable. However, when the signal is divided into several small windows, quasistationary behaviour can be observed in each window. Thus, the future behaviour of the vibration signal can be predicted separately in small windows, under the restriction that a different model is used for each window. In this approach, as illustrated in Fig. 1, the signal is divided into windows of equal length. Each window is coded into a feature vector consisting of the set of linear prediction coefficients for that window. The feature vectors for all windows are combined to form a feature matrix. We will use "observation matrix" and "feature matrix" interchangeably throughout the rest of the paper. In this way, the vibration signal becomes a feature (observation) matrix, which is then used for training the HMMs.

Fig. 1 Vibration feature extraction

The observation matrix is O = [o_1 | o_2 | o_3 | … | o_{T-1} | o_T], where o_i is the vector of linear prediction coefficients for the i-th window of the signal. It is equivalent to the X_t described by equation (16).

C. Application of HMM

By using the HMM, the pattern variability in the parameter space and in time can be modelled effectively. The HMM uses a Markov chain to model the changing statistical characteristics that exist in the actual observations of dynamic process signals. The Markov process is therefore a doubly stochastic procedure that enables the modelling not only of spatial phenomena but also of time-scale distances. The HMM parameters are estimated with the Baum-Welch algorithm: an HMM is trained for each specific time series from a set of training data by iterative maximum-likelihood estimation of the model parameters from the observed time-series data. Incoming observations are classified by calculating which model has the highest probability of producing such an observation.

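The windowed linear-prediction feature extraction of Section III-B (Eqs. (22)-(23) and Fig. 1) can be sketched as follows. This is an illustrative least-squares implementation, not the authors' code; the window length and predictor order p are free parameters here.

```python
import numpy as np

def lpc_coeffs(window, p):
    """Estimate LPC coefficients a_1..a_p for one window by least
    squares, per s_n ~= sum_i a_i * s_{n-i} (Eq. 22)."""
    s = np.asarray(window, dtype=float)
    # Each row of X holds the p previous samples; y is the current sample.
    X = np.column_stack([s[p - i:len(s) - i] for i in range(1, p + 1)])
    y = s[p:]
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a

def feature_matrix(signal, win_len, p):
    """Split the signal into equal-length windows and stack the LPC
    coefficient vectors o_i into an observation matrix O (Fig. 1)."""
    wins = [signal[k:k + win_len]
            for k in range(0, len(signal) - win_len + 1, win_len)]
    return np.column_stack([lpc_coeffs(w, p) for w in wins])
```

Each column of the returned matrix is one vector o_i of the observation matrix O.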



The following parameters are needed to define an HMM:

The number of states, N.

The transition probability distribution A = {a_ij}, where

a_ij = P{q_{t+1} = j | q_t = i}, 1 ≤ i, j ≤ N   (24)

Here q_t denotes the current state; a_ij is the probability of being in state j at time t+1 provided that the state at time t is i.

The observation probability distribution of each state, B = {b_j(k)}, where

b_j(k) = P(o_k | q_t = j), 1 ≤ j ≤ N, 1 ≤ k ≤ M   (25)

where o_k and M denote the k-th observation and the number of distinct observations, respectively. If the observations are modelled as continuous, a continuous probability density function must be specified for each state.

The initial state distribution π = {π_i}, where

π_i = P(q_1 = i), 1 ≤ i ≤ N   (26)

which denotes the probability of the i-th state being the initial state.

The compact notation λ = (A, B, π) is used to represent an HMM. Learning an HMM consists of two steps: (i) an inference step, in which the posterior distribution over hidden states is calculated; and (ii) a learning step, in which the parameters (initial state probabilities, state transition probabilities, and emission probabilities) are identified. The well-known forward-backward recursion allows us to infer the posterior over hidden states efficiently. More details on HMMs can be found in [13].

D. Identification Method by HMM

Assume S types of nonstationary time series exist; each type of time series is modelled by a distinct HMM, trained on a training set that constitutes an observation matrix O. The design of the identification involves the following steps.

Step 1: The first step is to build an HMM λ_s for each type of signal. In other words, we must estimate the model parameters λ = (A, B, π) that optimize the likelihood of the training-set observation sequence, i.e., maximize P(O | λ_s), the probability of the observation sequence O given model λ_s. The method is the Baum-Welch re-estimation algorithm, also known as the expectation-maximization (EM) approach.

Step 2: Given an unknown observation sequence, the probabilities of all possible models are calculated. The model with the highest likelihood is considered the best candidate for representing the specific time series, i.e. [3,9],

s* = arg max_{1 ≤ s ≤ S} P(O | λ_s)   (27)

IV. HYBRID SVM-HMM ARCHITECTURE

One significant drawback of SVMs is that they are inherently static classifiers: they do not implicitly model the temporal evolution of data. HMMs have the advantage of being able to handle dynamic data under certain assumptions of stationarity and independence. Taking advantage of the relative strengths of these two classification paradigms, we have developed a hybrid SVM-HMM architecture using our Baum-Welch training method. This system provided all components for the HMM portion of the hybrid architecture. For estimating the SVMs we used a publicly available toolkit, stprtool [2]. The flow chart of SVM-HMM training is shown in Fig. 2 (training set O → initial HMM → Viterbi algorithm for q_t → SVM probability for b_j(k) → Baum-Welch re-estimation → repeat until satisfactory → final HMM).

Fig. 2 Hybrid SVM-HMM training

The SVM is mainly used to calculate the observation probability of equation (25). Therefore it is necessary to assume that the number of sub-states of the j-th hidden state of the HMM is k.

V. EXPERIMENTS AND RESULTS

A. Experiment Setup

The major objective of this paper is the experimental investigation of vibration signatures due to localized wear/damage in a bearing outer race and a gear tooth. Vibration results were obtained for three combinations of bearing and gear: 1) the undamaged bearing and the gear set with no induced damage/wear; 2) the damaged bearing and the gear set with no induced damage/wear; 3) the damaged bearing and the gear set with one single-tooth-damaged gear. To perform a parametric study of the effects of bearing and gear damage on the vibration signatures of the system, vibration studies for the three cases above were carried out using the test rig shown in Figure 3. The test rig consists of two identical spur gears on two shafts, one attached to the electric motor driver while the other is attached to a water-braking system that loads the gears; each shaft is supported by two bearings. The driver of the gear test rig consists of a 75 hp motor connected through a belt-pulley driving system that can provide a maximum speed of up to 8000 rpm.
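The maximum-likelihood decision rule of Eq. (27) can be sketched for a discrete-observation HMM using the scaled forward algorithm. This is an illustration under simplified assumptions: the paper's HMMs use continuous observation probabilities supplied by the SVM, whereas here B is a plain discrete emission matrix.

```python
import numpy as np

def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(O | lambda) for lambda = (A, B, pi).
    obs is a sequence of observation indices (discrete HMM)."""
    alpha = pi * B[:, obs[0]]
    c = alpha.sum()
    ll = np.log(c)
    alpha = alpha / c
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        c = alpha.sum()
        ll += np.log(c)        # accumulate scaling factors to avoid underflow
        alpha = alpha / c
    return ll

def classify(obs, models):
    """Eq. (27): pick s* = argmax_s P(O | lambda_s)."""
    return int(np.argmax([log_likelihood(obs, pi, A, B)
                          for (pi, A, B) in models]))
```

With one trained model per signal type, `classify` returns the index of the best-matching HMM.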

Fig. 3 Rotor-gear-bearing test rig

Using a shaft speed of 20 Hz (1200 rpm), both the rotor speed and the bearing carrier speed were measured using optical encoders. Vibration data were acquired through a set of accelerometers (one in the x-direction and one in the y-direction on the bearing box) connected to a computer-based high-speed analog-to-digital system. The sampling rate of the vibration data was set to 6000 Hz, giving around 300 samples per revolution of the rotor (32768 in total). Vibration signals for approximately 109 revolutions were acquired and stored in the computer for fault-identification vibration signature analysis. Figure 4 shows the test gear with single-tooth damage; Figure 5 shows the test bearing with outer-race damage.

Fig. 4 Gear with single tooth damage

Fig. 5 Bearing with outer race damage

In the time-signal figures, only 1600 of the 32768 total points are shown. Time signals in the y-direction for the undamaged gear and undamaged bearing are shown in Figure 6, for the undamaged gear and damaged bearing in Figure 7, and for the damaged gear and damaged bearing in Figure 8. Only y-direction vibration signals are considered.

Fig. 6 Time signals for undamaged gear & undamaged bearing

Fig. 7 Time signals for undamaged gear & damaged bearing

Fig. 8 Time signals for damaged gear & damaged bearing

B. Feature Extraction by Wavelet Packet Decomposition

The features used for a specific fault must be correlated only with that fault and uncorrelated (or only negligibly correlated) with all other faults. To illustrate: if the energy of a certain frequency band is used as one of the features for a type-one fault, then the energy of this band must be affected only by the presence of the type-one fault and be unchanged (or minimally affected) by the presence of other faults. Thus a new feature extraction method is considered. The signal is divided into several small window segments of equal length, and wavelet packet decomposition (WPD) is applied to the segment of the signal in each window; a detailed review of WPD can be found in [7]. The feature vector for each window consists of selected node energies of the WPD. This process is illustrated in Figure 1, which shows the three-level wavelet packet decomposition; (i, j) represents the j-th node of the i-th layer, so that, for example, (3, 0) represents the 0th node of the third layer. We obtain the coefficients of all the nodes, and the time signals S_ij are reconstructed from the coefficients. All the nodes in the third layer are considered, and the whole signal is obtained as the sum of the reconstructed third-layer node signals.
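The node-energy features of this section (Eqs. (28)-(30)) can be sketched with an orthonormal Haar wavelet packet tree. This is an illustrative stand-in: the paper does not state which wavelet is used, and for an orthonormal wavelet the coefficient energies of the third-layer nodes equal the energies of the reconstructed signals S_3j by Parseval's relation, so no explicit reconstruction is needed here.

```python
import numpy as np

def haar_split(c):
    """One orthonormal Haar analysis step: approximation and detail."""
    c = np.asarray(c, dtype=float)
    return (c[0::2] + c[1::2]) / np.sqrt(2), (c[0::2] - c[1::2]) / np.sqrt(2)

def wpd_energies(x, levels=3):
    """Full wavelet packet tree down to `levels`; return the energy of
    each terminal node (E_30..E_37 for levels=3), cf. Eq. (29)."""
    nodes = [np.asarray(x, dtype=float)]
    for _ in range(levels):
        nxt = []
        for c in nodes:
            a, d = haar_split(c)
            nxt += [a, d]
        nodes = nxt
    return np.array([np.sum(c ** 2) for c in nodes])
```

The eight returned energies form the observation feature vector o_t of Eq. (30); since the transform is orthonormal, they sum to the energy of the windowed signal.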



S = S_30 + S_31 + … + S_37   (28)

Thus the energy of node (3, j) is

E_3j = ∫ |S_3j|^2 dt = Σ_{k=1}^{n} |x_jk|^2   (29)

where x_jk represents the amplitude of the reconstructed discrete signal. The observation feature vector is therefore constructed as

o_t = [E_30, E_31, …, E_37]   (30)

C. Experiment Results

The initial HMM is as in Fig. 2, and the number of components of the Gaussian mixture is 5. The results of 20 test runs of single-fault diagnosis are given in Table I.

TABLE I
THE RESULT OF DIAGNOSIS

test (20 times)   Normal   Fault-1   Fault-2
Normal              20        1         1
Fault-1              0       19         1
Fault-2              1        0        18

Normal, Fault-1 and Fault-2 represent, respectively, the three cases of Section V-A. The method is not tested on multiple simultaneous faults in this work.

VI. CONCLUSIONS

In this paper we constructed an identification classifier for nonstationary time series by integrating SVMs into HMMs. The proposed hybrid method uses the advantages of both the HMM, as a detector of time-varying characteristics, and the SVM, as a powerful binary classifier. The results on the signals of the rotor-gear-bearing system show that the hybrid model is practical and effective.

ACKNOWLEDGMENT

This work was financially supported by the Liaoning Province education department (L2010092) and the University of Dalian Nationalities talent import fund (20016202).

REFERENCES

[1] L. R. Rabiner and B. H. Juang, "An introduction to hidden Markov models," IEEE ASSP Magazine, vol. 3, pp. 4-16, January 1986.
[2] V. Franc and V. Hlavac, Statistical Pattern Recognition Toolbox, http://cmp.felk.cvut.cz, June 2004.
[3] H. Ocak and K. A. Loparo, "HMM-based fault detection and diagnosis scheme for rolling element bearings," Journal of Vibration and Acoustics, vol. 127, pp. 299-306, 2005.
[4] J. Platt, "Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods," in Advances in Large Margin Classifiers, MIT Press, Cambridge, MA, USA, 1999, pp. 61-73.
[5] R. Saad, S. K. Halgamuge, and J. Li, "Polynomial kernel adaptation and extensions to the SVM classifier learning," Neural Computing & Applications, vol. 17, pp. 19-25, 2008.
[6] X. Capron, D. L. Massart, and J. Smeyers-Verbeke, "Multivariate authentication of the geographical origin of wines: a kernel SVM approach," European Food Research and Technology, vol. 225, pp. 559-568, 2007.
[7] F. Smach, C. Lemaitre, J.-P. Gauthier, J. Miteran, and M. Atri, "Generalized Fourier descriptors with applications to objects recognition in SVM context," Journal of Mathematical Imaging and Vision, vol. 30, pp. 43-71, 2008.
[8] C. Angulo, F. J. Ruiz, L. Gonzalez, and J. A. Ortega, "Multi-classification by using tri-class SVM," Neural Processing Letters, vol. 23, pp. 89-101, 2006.
[9] Hua Jing, "Research on continuous speech recognition based on a hybrid HMM/SVM framework," Master thesis, Harbin Institute of Technology, 2006.
[10] K.-C. Kwon and J.-H. Kim, "Accident identification in nuclear power plants using hidden Markov models," Engineering Applications of Artificial Intelligence, vol. 12, pp. 491-501, 1999.
[11] A. Velivelli, T. S. Huang, and A. Hauptmann, "Video shot retrieval using a kernel derived from a continuous HMM," Proc. SPIE-IS&T, vol. 6073, 607311, pp. 1-10.
[12] X.-B. Li, F. K. Soong, T. A. Myrvoll, and R.-H. Wang, "Optimal clustering and non-uniform allocation of Gaussian kernels in scalar dimension for HMM compression," ICASSP 2005, pp. 669-672.
[13] Feng Chang-jian, Kang Jing, Wu Bin, and Hu Hong-ying, "Application in fault diagnosis of rotary machine based on theory of DHMM dynamic pattern recognition," Journal of Dalian Nationalities University, no. 3, pp. 12-15, May 2005.


A Real-Time Information Service Platform for High-Speed Train


Ruidan Su, Tao Wen
College of Information Science and Engineering, Northeastern University, Shenyang, China Email: suruidan@hotmail.com

Weiwei Yan, Kunlin Zhang, Dayu Shi, Huaiyu Xu*


Service Science Research Center, Shanghai Advanced Research Institute, Chinese Academy of Sciences, China. Email: huaiyu@cs.ucla.edu (*corresponding author)

Abstract—In this paper, a real-time information service platform for high-speed trains is proposed and implemented. With this platform, all train passengers can freely connect via Wi-Fi or other wireless networks to enjoy movies, music, train information and travel information. Besides, the platform is connected to an FPGA-based data processing center that collects train operation data, including instantaneous power, traction and braking force, and converter temperature. The server is designed on open protocols, and the data processing center is based on an FPGA system. We set up a specific scene to verify the feasibility of the platform and achieved good results.

Index Terms—Information service platform, FPGA, High-speed train

I. INTRODUCTION

Rail transport is the most commonly used mode of long-distance transportation in the People's Republic of China [1]. The number of rail passengers reached about 1.64 billion in 2010, and travel between the two major cities Beijing and Shanghai alone accounts for about 10% of it. With this huge number of passengers traveling every day, providing them with information services becomes a problem. Although the third-generation communication networks (3G) provided by the major mobile operators can reach download speeds of 3 Mbps [2], customers have to pay for the service. In this paper, a real-time information service platform for the Beijing-Shanghai high-speed train is designed and implemented. In this platform, a server is established on each train to provide Local Area Network (LAN) service. The platform not only provides content such as music, video, travel information and advertisements to passengers, but also collects important train operation data and provides administrators with visual results such as traction property curves and braking property curves. Besides, the platform integrates a Geographic Information System (GIS) for users. Users are

able to check the train information permitted for their user type, and some administrators are authorized to access a professional management system to read the train running data. Although the concept of the platform is attractive, there are still problems in its realization. How can it be adapted to different kinds of devices, such as smart phones (with different operating systems: Windows Mobile, Android, iOS), laptops and tablet PCs? How can information be provided both to passengers on the train and to users who may access the platform from anywhere in the world? In addition, how can the train's data be collected into the server in real time? In our work, we designed a web server based on open standards and widely available industry specifications so that these different kinds of access work well, while the data collecting system was implemented on an FPGA to process the large amount of transferred data. Users can easily access the platform by Ethernet or Internet. To greatly improve the communication speed between the PC and the appliances, an FPGA is applied in our system; thus, different interfaces can easily be integrated to make real-time data collection possible [9]. But how is the high-speed train connected to the Internet? Mobile operators such as China Mobile have started to set up Wi-Fi networks nationwide, and according to cablevision.com, Cablevision, the largest wireless Internet network provider in the USA, has offered Wi-Fi on NY trains [3]. These will allow the train to connect to the Internet to provide service, but this is not the major work of this manuscript.

The organization of this paper is as follows. The system overview and function definition are introduced in Section II. The detailed software design of the platform using Java and Ajax is illustrated in Section III. In Section IV, we introduce the hardware implementation of the remote control system, and a use case is given in Section V. The conclusion follows.

II. SYSTEM OVERVIEW



Figure 1. System Function Architecture

A. Platform

The information service platform on the high-speed train is shown in Figure 1. The server is connected to the Internet through Wi-Fi, a high-speed wireless communication network that can support the abundant data flow between users and the web server. A Browser/Server (B/S) structure is adopted in the software system: users can log in to the system using IE or other browsers. This is implemented with the Java language and Ajax.

B. Function Definition

1. Collecting train operation data. It is vital that the high-speed train runs safely, so the platform must monitor the important parameters while the train is running. Speed, power, traction and converter temperature are important parameters for evaluating the safety level of the running train. In this system, we collect these data through different kinds of interfaces, including RJ45, RS232, RS485, USB, etc. All the data are processed by the FPGA and sent to the platform; after analysis, traction property curves and braking property curves are presented in the system.

2. Travel information and advertisement distribution channel. In this system, the information distribution channel is assigned to the corresponding persons, including train administrators and advertisers. A train administrator can log in to publish announcements such as the train timetable, weather reports, remaining seats, etc. At the same time, the function is assigned to advertisers such as travel agencies. Travel agency staff can log in from their office through a browser to publish travel information according to the train's location; for example, they may introduce Tianjin travel information and hotel reservation offers to attract passengers. All the information is published on the screens of each cabin.

3. Passenger functions. Currently, trains offer only the same TV program to all passengers, without considering the passengers' interests. To offer better service, the server is able to offer a selection of music, videos, electronic books, and the latest news. Passengers can not only enjoy their favorite programs but are also free of the charges of the mobile operators.

4. Other visitor functions. Visitors can log in to check the train's GIS information, which is shown on a Google map. In this way, they know where they are and how much time remains to the destination.

5. GIS based on Google Maps. Google Maps offers a public, free map service [4], so we use it to provide a geographical view in our platform. On the basis of the Google Maps service, we implemented a Browser/Server based system: as long as users can connect to the Internet, they can use our platform. Besides, since the platform supports many different ways of communication between the server and other trains, users can easily add their own applications to the platform.

III. SOFTWARE DESIGN OF THE PLATFORM

Figure 2. The architecture of the software design

The architecture of the software design is shown above.

A. Client Design

To make it easy for passengers to access the information service platform, dynamic web pages are implemented using JSP and JavaScript; our user-friendly browser interface is shown in Fig. 3. Ajax is applied in the client terminal so that the user interacts with the server in real time, rather than as in traditional web applications; Ajax offers a smooth ride all the way, without page reloading. A string in the Set-Cookie field of the HTTP header carries the cookie information when the server sends the cookie to the client; the cookie can be acquired over the HTTP connection with getHeaderField.

B. Web Server Design

The server has three parts. First, there is the Google Map server, used to provide geographic information. Second, there is a Keyhole Markup Language (KML) server [5], used to dynamically create KML files and load them into the Google Map server. Third, there is an application server, implemented in the Java language, which is the core control part of our system. A database that stores user and "thing" information is also connected to the application server.
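The KML server's role can be illustrated with a minimal sketch that builds a KML Placemark for the train's current position using only the standard library. The function name and coordinates here are hypothetical; the actual server is implemented in Java and feeds such files to the Google Map server.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def train_position_kml(name, lon, lat):
    """Build a minimal KML document with one Placemark marking the
    train's current position (illustrative helper, not the paper's code)."""
    ET.register_namespace("", KML_NS)          # default KML namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    pm = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
    ET.SubElement(pm, f"{{{KML_NS}}}name").text = name
    pt = ET.SubElement(pm, f"{{{KML_NS}}}Point")
    # KML coordinates are "lon,lat,altitude"
    ET.SubElement(pt, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat},0"
    return ET.tostring(kml, encoding="unicode")
```

Regenerating such a file as the train moves, and reloading it in the Google Map server, yields the live GIS view described above.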



TABLE I. THE LIST USERINFO FOR REGISTERED USERS' INFORMATION

field      type          explanation
user       Varchar(30)   Primary key
password   Varchar(30)   Each user has one password

Figure 3. Client interface

IV. HARDWARE IMPLEMENTATION OF THE DATA COLLECTING SYSTEM

A. FPGA and USB Interface Implementation

Modern FPGAs have volumes of millions of equivalent gates and hundreds of hardware multipliers, and FPGA clock frequencies reach hundreds of megahertz. The FPGA works in fully parallel mode in our system and is chosen as the microcontroller for high-throughput data processing in our experiment.

As shown in Figure 4, the server monitor responds to HTTP requests from clients. The acquired information is processed and classified for transmission to the different Servlets. The process is described below.

Figure 5. Hardware Structure of the train information processing center.

Figure 4. Software Structure of the server

1. The mouse-over event handler triggers the JavaScript function that populates and sends the XMLHttpRequest. The URL contains the context root, the Servlet mapping configured in the web.xml deployment descriptor, and the populated query string associated with the request.
2. The Servlet receives the request, retrieves the item parameter of the request and creates/sends the appropriate response.
3. The XML response is sent back to the browser and the callback function is invoked.
4. The response is parsed, the popup values are populated and the popup is displayed.

C. Database Design

There are two entities in our system: users and things. Thus, we designed a table in our database for each. Considering that there will be many different kinds of things in the system and the data quantity will be huge, Microsoft SQL Server 2005 is adopted for the database. Table I shows the list UserInfo for registered users' information; it has two fields, user and password, with user appointed as the primary key.
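The UserInfo table of Table I can be exercised with a minimal sketch. This is illustrative only: the paper uses Microsoft SQL Server 2005, and sqlite3 stands in here so the example is self-contained.

```python
import sqlite3

# In-memory stand-in for the UserInfo table of Table I.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UserInfo ("
    " user VARCHAR(30) PRIMARY KEY,"   # 'user' is the primary key
    " password VARCHAR(30))"           # each user has one password
)
conn.execute("INSERT INTO UserInfo VALUES (?, ?)", ("admin", "secret"))
row = conn.execute(
    "SELECT password FROM UserInfo WHERE user = ?", ("admin",)
).fetchone()
```

The PRIMARY KEY constraint on user enforces the one-row-per-user rule stated in the table.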

Figure 5 shows the hardware design architecture. A communication system connects the PC server and the train information processing center. All drive controllers are implemented on an ALTERA EP1C12Q240C8N FPGA device [6].

Figure 6. A use case in the specific scene

The static base design on the FPGA includes a USB 2.0 interface, reconfiguration, memory management, communication and I/O functions. The USB 2.0 interface implements the connection to a PC or a higher-level control system by employing an external USB controller, the EZ-USB FX2 device by Cypress. This high-speed USB peripheral controller implements the physical layer and accomplishes several other tasks. Special FIFO memories of the USB controller are connected to the USB interface of the drive controller and deliver the USB data to the FPGA. The FX2 microcontroller contains an embedded USB 2.0 transceiver and handles all USB transfers with the upstream USB host. It presents a data bus to the FPGA with generic control signals that can be programmed to behave in a custom manner; this interface is called the GPIF (General Purpose Interface). The FX2 also handles all USB control requests (via endpoint 0), which all USB-enabled devices must support to fully comply with the USB standard; these include responses to device-capability interrogations and standard setup requests. Figure 7 is the prototype of the control system with USB interface. To process the abundant data collected from the train, our control system must provide a large data-processing capacity to increase transfer speeds; for example, the system collects the converter temperature 10 times a second and the power likewise 10 times a second. The FPGA easily handles this data transportation in our experiment and supports different interfaces (e.g. Bluetooth, Ethernet, RS232, etc.) connecting to different kinds of data.

V. A USE CASE IN A SPECIFIC SCENE: APPLICATION AT HOME

We developed an information service platform and applied it in a home scene as an experiment to demonstrate the feasibility of visiting and distributing information. First, we developed a user-friendly browser interface and recorded it in an XML file located in the server part. Second, we connected home appliances to the application server; the home appliances act as the important parts of the train and provide data. We set up a proxy server to establish a link between the home appliances and the application server. This proxy server is actually a personal computer that can surf the Internet. To connect the home appliances to the proxy server and control them, a field-programmable gate array (FPGA) is used. The home appliances access the FPGA system through different channels such as RJ45, RS232, Bluetooth, Wi-Fi or USB. In the system, the Bluetooth protocol is applied to connect the air conditioner to collect temperature, while Wi-Fi is used to connect the camera. Fig. 6 shows the experiment of our application.

VI. CONCLUSIONS

This paper presents the design, implementation, and an initial model of a high-speed train information service platform, through which different kinds of users can acquire and distribute information while the train is running. The abundant train data are collected and processed by the FPGA-based data processing system. Detailed implementations of the software and hardware designs are proposed. Ajax is applied in the client terminal so that users interact with the server in real time rather than as in traditional web applications. The FPGA is chosen as the data processing center for collecting and processing the abundant data in our platform; special FIFO memories of the USB controller are connected to the USB interface of the drive controller and deliver the USB data to the FPGA to guarantee real-time data processing.

REFERENCES
[1] www.wiki.org
[2] www.chinamobile.com
[3] http://www.cablevision.com
[4] Google, Inc., http://maps.google.com/ [last accessed on Aug. 5, 2010].
[5] Huaiyu Xu, Qing Ni, Ruidan Su, Xiaoyu Hou, and Chao Xiao, "A virtual community building platform based on Google Earth," Proceedings of the 9th International Conference on Hybrid Intelligent Systems (HIS 2009), vol. 2, pp. 349-352, 2009.
[6] M. Young, The Technical Writer's Handbook. Mill Valley, CA: University Science, 1989.
[7] Yong-Seok Kim, Hee-Sun Kim, and Chang-Goo Lee, "The development of USB home control network system," 8th Control, Automation, Robotics and Vision Conference (ICARCV 2004), vol. 1, pp. 289-293, Dec. 2004.
[8] Ping-Chi Wang, Kuochen Wang, and Lung-Sheng Lee, "A QoS scheme for digital home applications in IEEE 802.11e wireless LANs," IEEE 16th International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC 2005), vol. 3, pp. 1845-1849, Sept. 2005.
[9] Xu Huaiyu, Su Ruidan, Jiang Linying, and Jin Song, "WirelessLAN based distributed digital lighting system for digital home," 2009 International Conference on Computer Engineering and Technology (ICCET 2009), vol. 1, pp. 433-437, 2009.

Ruidan Su received his BE from South China Normal University in 2006 and his Master's degree from Northeastern University in 2010. He is currently pursuing his PhD in computer science and computer application technology at Northeastern University. He published more than 10 papers during his Master's study at Northeastern University.




Research on the Grey Assessment System of Dam Failure Risk


Jiang Ying and Zhang QiuWen
College of Hydropower and Information Engineering, Huazhong University of Science and Technology, Wuhan, China. E-mail: jerry_1982@126.com, qwzhang_hust@163.com

Abstract: The security of the dam is a vital problem for the reasonable use of water resources. It is important and meaningful to research methods of protecting the safety of dams, so dam safety assessment is related to the national economy and the people's livelihood, and it is necessary to build a system to evaluate the security risk of dams. The dam failure disaster risk is regarded as the main research object of this paper. Theories and methods such as dam engineering knowledge, risk analysis, analytic hierarchy processing and grey theory are introduced. This research performs a relatively detailed study on the methods of comprehensive risk assessment, the synthesis assessment structure system, the method of measuring the assessment indexes of the dam, and the development of the grey assessment system of dam failure risk.

Index Terms: Dam Failure, Risk Analysis, Analytic Hierarchy Processing, Grey Theory, Assessment System

I. INTRODUCTION

The grey theory was advanced by Prof. Deng Julong in 1982; it deals with decisions characterized by incomplete information and explores system behavior using relational analysis and model construction [1,2]. With a view to investigating the inherent laws of complex systems, the theory establishes models describing their dynamic variation characteristics based on their own historical data. In the field of control technologies, the tint of a color is often used as a metaphor for the amount of known information. Black denotes that nothing is known about a system's internal structure, parameters and characteristics, while white refers to complete knowledge of the information about a system. Between white and black is grey, which represents an incomplete understanding of system characteristics and structure. In grey theory, random variables are regarded as grey numbers, and a stochastic process is referred to as a grey process. A grey system is defined as a system containing information presented as grey numbers, and a grey decision is defined as a decision made within a grey system. Fields covered by grey theory include systems analysis, data processing, modeling, prediction, as well as
Corresponding author: Zhang QiuWen Email: qwzhang_hust@163.com

decision making and control.

Water is the fundamental natural resource and a strategic economic resource. Without water, the basic existence of people and the sustainable development of the social economy would be impossible, so taking action to protect the security of water resources has increasingly become a major issue studied by countries worldwide. Dams have been built in many countries to make full use of water resources, constructed for a number of purposes such as the provision of drinking and irrigation water as well as the generation of electric power, and the risks caused by their breakage can be very serious. Therefore the security of the dam is a vital problem for life, society and the environment, and it is necessary to build a system to analyze and evaluate the security risk and safety state of dams. A dam failure risk assessment system can help evaluate the dam safety condition, locate the weak links of the dam, find the key factors affecting dam safety, direct the management of dam security and guarantee dam safety. In this way, the security of water resources is assured by using the grey assessment system of dam failure risk.

Based on prototype observations, mathematical and mechanical methods are conventionally used for evaluating dam failure risk. These classical methods are very important for monitoring dam safety; however, they usually lay no strong emphasis on learning from experience and expert knowledge. In the risk assessment of dam failure, it is very difficult to effectively quantify the risk only by collecting and counting data, so it is significant to use the principles of grey theory to give some heuristics for the risk assessment system of dam failure. This paper proposes a grey assessment method in which the quantitative change and qualitative change of the dam safety property are integrated into a matter-element.
Based on the grey assessment module, the contradictory problem of dam safety assessment can be solved. A grey assessment system of dam failure risk was developed with this method. In practice, the proposed system has been used for evaluating dam failure risk successfully. The application showed that the bionics model is feasible and the proposed key technologies are effective. The system can supply technical support for improving the level of dam safety management, extending the normal running time of dams and avoiding dam failure.

2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2334-2341

II. THEORY AND METHODOLOGY

A. Grey Theory

Grey theory is mainly utilized to study systems that model uncertainty, analyze relations between systems, establish models, and make forecasts and decisions [3,4,5]. Grey theory emphasizes investigating the inherent law and is specialized in the quantitative analysis of complex systems with incomplete information. The grey-forecasting model (GM) is the core of grey theory. The grey model GM(n, h) predicts the trend of a series by a dynamic differential equation established on the historical data, where n is the order of the differential equation and h is the number of variables. The model GM(1,1) is often used in practical trend analysis and condition prediction. Supposing the historical data series is X^(0) = [x^(0)(1), x^(0)(2), ..., x^(0)(n)], the model GM(1,1) is:

  dX^(1)(t)/dt + a X^(1)(t) = u                                   (1)

Grey theory builds the model by applying a one-order Accumulated Generating Operation (AGO) to the primitive series, in order to provide the intermediate information for model building and to weaken the tendency of variation. The AGO result of the historical data series is:

  X^(1)(n) = sum_{i=1..n} X^(0)(i)                                (2)

After AGO, any non-negative series is translated into an increasing series whose randomness is reduced while its orderliness is strengthened. The vector a^ = [a, u]^T, composed of the coefficient a and the parameter u, can be solved by the least-mean-square (LMS) algorithm:

  a^ = [a, u]^T = (B^T B)^(-1) B^T Y                              (3)

where Y = [x^(0)(2), x^(0)(3), ..., x^(0)(n)]^T and

      [ -(x^(1)(1) + x^(1)(2))/2      1 ]
  B = [ -(x^(1)(2) + x^(1)(3))/2      1 ]
      [  ...                        ... ]
      [ -(x^(1)(n-1) + x^(1)(n))/2    1 ]                         (4)

Then the solution of GM(1,1) is:

  x^(1)^(k+1) = (x^(0)(1) - u/a) e^(-ak) + u/a

and the predictive equation of the historical series is:

  x^(0)^(k+1) = x^(1)^(k+1) - x^(1)^(k) = (x^(0)(1) - u/a)(1 - e^a) e^(-ak)   (5)

In intelligent diagnosis applications, the predictive vectors from GM(1,1) are usually put into the recognition module. According to grey theory, the relation degree evolves from the relation coefficient. The relation coefficient of two series X_i and X_j is represented by xi_ij(k), where k represents the sampling points:

  a_ij(k) = |X_j(k) - X_i(k)|,  k in {1, 2, ..., N}               (6)

  a_min = min_j min_k a_ij(k),  a_max = max_j max_k a_ij(k)       (7)

xi_ij(k) is defined as:

  xi_ij(k) = (a_min + m a_max) / (a_ij(k) + m a_max),  k in {1, 2, ..., N}   (8)

where m is a constant with a range from 0 to 1. The relation degree of the two series X_i and X_j is:

  r_ij = (1/N) sum_{k=1..N} xi_ij(k)                              (9)

The relation degree r_ij shows the comparability of the series X_i and X_j. It is often applied to grey clustering in practice.

B. Analytic Hierarchy Process

The Analytic Hierarchy Process (AHP) uses paired comparisons to derive a scale of relative importance for alternatives [6,7]. We investigate the effect of uncertainty in judgment on the stability of the rank order of alternatives; the uncertainty experienced by decision makers in making comparisons is measured by associating an interval of numerical values with each judgment.

Construct a pair-wise comparison matrix: The judgments are entered using the fundamental scale of the AHP. An attribute compared with itself is always assigned the value 1, so the main diagonal entries of the pair-wise comparison matrix are all 1. The numbers 3, 5, 7 and 9 correspond to the verbal judgments "moderate importance", "strong importance", "very strong importance" and "absolute importance" (with 2, 4, 6 and 8 for compromises between the previous values). Assuming M attributes, their pair-wise comparisons yield a square matrix A_{M x M}:

              A1    A2   ...  AM
       A1  [  1    a12  ...  a1M ]
  A =  A2  [ a21    1   ...  a2M ]
       ... [ ...   ...  ...  ... ]
       AM  [ aM1   aM2  ...   1  ]                                (10)

where a_ji = 1/a_ij denotes the relative importance of attribute j over attribute i.

Find the relative normalized weight: The relative normalized weight w_j of each attribute is found by calculating the geometric mean of each row and normalizing the geometric means of the rows in the comparison matrix:

  GM_j = ( prod_{i=1..M} a_ji )^(1/M)                             (11)

  w_j = GM_j / sum_{j=1..M} GM_j                                  (12)

This method makes it simple to find the maximum eigenvalue and reduces the inconsistency in judgments, so it is used here for finding the relative normalized weights of the attributes.

Calculate matrices: Calculate the matrices A3 and A4, where A3 = A1 x A2 and A4 = A3 / A2 (element by element), with A1 the comparison matrix and A2 = [w1, w2, ..., wM]^T the weight vector. The maximum eigenvalue lambda_max can then be found as the average of the elements of A4.

Calculate the consistency index: CI = (lambda_max - M) / (M - 1). The smaller the value of CI, the smaller the deviation from consistency. The consistency in the judgments of the relative importance of attributes reflects the knowledge of the analyst (i.e. the decision maker). The random index (RI) is obtained from the number of attributes and is used in decision making.

Calculate the consistency ratio: CR = CI / RI. Usually a CR of 0.1 or less is considered acceptable, and reflects an informed judgment attributable to the knowledge of the analyst about the problem under study.

C. Grey Assessment Method of Dam Failure Risk

This research aims to develop an assessment system of dam failure risk with theories and methods such as risk analysis, analytic hierarchy processing and grey theory. By the principle of selecting factors, correlation theory and criteria, the primary factors and sub-factors of the dam outburst are selected, and the multistage index system of dam-break disaster risk analysis is built. The standards for judging and the suitable subordinate functions can then be determined, so the assessment index system is established. Because the factors affecting dam risk are complicated and the standards for judging are fuzzy, the factors are classified into different types and different layers through the method of risk analysis. In the assessment of dam risk, different factors reflect different status and importance and make different contributions to the risk assessment; that is, each factor has its own weight. The weights of the factors are determined through the analytic hierarchy process. Then the risk degree of each factor is obtained, and the final result is gained by the grey assessment method of dam-break disaster risk.

Build the multistage index system: The first step in carrying out the assessment of dam risk is to set up the performance evaluation indexes and decision factors. Under an abominable working environment, the safety status of a dam changes dynamically, which emerges in quantitative and qualitative change manners. Therefore, quantitative and qualitative changes need to be considered comprehensively in the process of dam safety evaluation. In the extension evaluation method, professional knowledge in the dam field is combined with extension theory [8,9]. The quantitative change and qualitative change of dam safety status are integrated into a matter-element. According to the analysis, the risk sources and corresponding risk characteristics of the dam-break disaster are determined under uncertainty, and the multilevel risk tree model of dam-break disaster is constructed as in Fig.1.

[Figure 1. Diagnosis model of dam-break disaster risk: from the dam status, experience and rule knowledge, and possible risk geneses of the dam, through discretization of attribute values and diagnosis of dam problems, attribute reduction (potential risk geneses) and attribute least reduction (main risk geneses), to the fault tree and the dam disaster risk tree.]
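The GM(1,1) construction of Section II.A (eqs. (1)-(5)) and the relation degree of eqs. (6)-(9) can be sketched numerically. The snippet below is only an illustrative sketch, not part of the reported system; the sample series and the choice m = 0.5 are invented for demonstration:

```python
import numpy as np

def gm11(x0, horizon=2):
    """Fit GM(1,1) to a non-negative series and forecast `horizon` extra steps.

    Follows eqs. (1)-(5): AGO accumulation, least-squares estimate of [a, u],
    and the predictive equation x^(0)(k+1) = (x^(0)(1) - u/a)(1 - e^a)e^(-ak).
    """
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    x1 = np.cumsum(x0)                                # AGO series, eq. (2)
    # B of eq. (4): background values -(x1(k) + x1(k+1))/2 paired with ones
    B = np.column_stack((-(x1[:-1] + x1[1:]) / 2.0, np.ones(n - 1)))
    a, u = np.linalg.lstsq(B, x0[1:], rcond=None)[0]  # eq. (3)
    k = np.arange(n + horizon)
    x0_hat = (x0[0] - u / a) * (1.0 - np.exp(a)) * np.exp(-a * k)  # eq. (5)
    x0_hat[0] = x0[0]                                 # first value kept as-is
    return a, u, x0_hat

def grey_relation(ref, others, m=0.5):
    """Grey relation degrees of each series in `others` against `ref`, eqs. (6)-(9)."""
    ref = np.asarray(ref, dtype=float)
    diffs = [np.abs(np.asarray(o, dtype=float) - ref) for o in others]  # eq. (6)
    a_min = min(d.min() for d in diffs)                                 # eq. (7)
    a_max = max(d.max() for d in diffs)
    # eq. (8) coefficients, averaged over the sampling points as in eq. (9)
    return [float(np.mean((a_min + m * a_max) / (d + m * a_max))) for d in diffs]

series = [2.87, 3.28, 3.34, 3.62, 3.87]               # invented sample data
a, u, pred = gm11(series)
r = grey_relation(series, [[2.9, 3.2, 3.4, 3.6, 3.8], [1.0, 2.0, 4.0, 1.5, 5.0]])
print(np.round(pred, 3), [round(v, 3) for v in r])
```

For a growing series the fitted development coefficient a comes out negative, and the relation degree ranks the closely tracking comparison series above the erratic one.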

Establish the weight matrix: Suppose that there are n factors describing the status of the dam. Then we can define a finite set U = {u_i, i = 1, 2, ..., n} as the evaluation factor set, described as U = (u1, u2, ..., un).

The linguistic evaluation variable set V = (v1, v2, ..., vm) is used for describing the danger of the object performance under assessment. It is determined based on whether the problem is normal or dangerous; for instance, the dam safety status may be decomposed into three grades, V = (normal, little abnormal, dangerous). A voting matrix, one form per expert, facilitates evaluating each factor or factor item of the object performance in terms of the linguistic variable set V. The multi-variable weight W is confirmed by AHP, and the weight matrix A can be established as:

       u1 [ w1 ]
  A =  u2 [ w2 ]
       ...[ ...]
       un [ wn ]                                                  (13)

Calculate the whitenization weight function: The determination of the whitenization weight functions of the grey classes is a key link from qualitative analysis to quantitative modeling in the process of grey assessment. The whitenization weight functions are obtained as follows:

  f1(r) = r/5 for 0 <= r <= 5;  1 for r >= 5                      (14)
  f2(r) = r/4 for 0 <= r <= 4;  (5 - r)/1 for 4 < r <= 5          (15)
  f3(r) = r/3 for 0 <= r <= 3;  (5 - r)/2 for 3 < r <= 5          (16)
  f4(r) = r/2 for 0 <= r <= 2;  (5 - r)/3 for 2 < r <= 5          (17)
  f5(r) = r/1 for 0 <= r <= 1;  (5 - r)/4 for 1 < r <= 5          (18)

The grey evaluation coefficient is denoted as:

  eta_ij = sum_{k=1..p} f_j(r_ik) / sum_{l=1..n} sum_{k=1..p} f_l(r_ik)   (19)

where n is the number of grey classes, p is the number of estimators, and f_l are the whitenization weight functions. Then the grey assessment matrix B is:

             v1     v2    ...   vm
       u1 [ eta11  eta12  ...  eta1m ]
  B =  u2 [ eta21  eta22  ...  eta2m ]
       ...[  ...    ...   ...   ...  ]
       un [ etan1  etan2  ...  etanm ]                            (20)

Evaluate the risk grade: The result of the grey assessment C is denoted as:

  C = A . B                                                       (21)

The risk grade is confirmed based on the maximum membership grade principle.

III. DEVELOPMENT OF THE ASSESSMENT SYSTEM

A. Requirement Analysis of the System

According to the life process of a dam and the usual thought of dam safety diagnosis, Fig.2 describes the model of dam failure risk assessment.

[Figure 2. Model of dam failure risk assessment: sub-factors, primary factors and bearing factors are coordinated and analyzed into risk index values; a grained model is selected, weights are calculated, the grey matrix is established, the grey assessment is made, and the assessment result is delivered to the decision maker.]

The information on the external environment and structural characteristics of the dam is acquired in real time by the supervisor. Next, the observations are analyzed and abnormal symptoms are found by theories and methods integrating mathematics, mechanics and so on. With advanced data processing technologies, accurate and reliable data mined from a mass of observation data sources are input into the assessment system. Finally, based on a systemic and whole viewpoint, the status of dam risk is evaluated dynamically. So the assessment system should possess three functions. First, the user can put the risk data into the system, where they can be saved, amended and deleted. Second, the data should be managed in the system, and the assessment result can be gained correctly. Third, the result of the assessment can be shown to the user intuitively.

B. Main Modules of the System

According to the requirement analysis of the assessment system, the system should consist of the integration control module, risk index input module, weight calculation module, grey assessment module, assessment output module and project database (Fig.3).

[Figure 3. Main modules of the system: the integration control module coordinates the risk index input module, the weight calculation module (fed by the comparison matrix), the grey assessment module (fed by the risk data and weights), the assessment output module (which returns the result) and the project database.]

Integration control module: The integration control module is a multilevel control menu, which coordinates the risk index input module, weight calculation module, grey assessment module, assessment output module and project database.

Risk index input module: In the risk index input module, we can input the parameters of the risk indexes by level and select the grained model. In this module, the multilevel risk tree model of dam-break disaster is put into the system.

Weight calculation module: The weight calculation module calculates the risk index weights accurately, and the weight matrix is established.

Grey assessment module: The grey assessment module is used for analyzing and calculating the data of the dam and evaluating the dam failure risk; it is the most important module in the assessment system.

Assessment output module: The assessment output module is used for showing the assessment result. The result is shown graphically, so it is clear at a glance.

Project database module: The project database is used for storing the large number of project archives and observation data.

IV. IMPLEMENTATION OF THE MAIN MODULES

There are two especially important modules in the dam failure assessment system: the weight calculation module and the grey assessment module.

A. Development Environment of the System

After analysis and comparison, the common development platform Visual Studio was chosen, the development language VB was used in the risk assessment system, and the database system Access provides the data services.

B. Weight Calculation Module

In this paper the Analytic Hierarchy Process is applied in the weight calculation module, which is used for calculating the risk index weights. The flow chart of the AHP is shown in Fig.4, and Fig.5 shows the implementation interface of this module. When the multilevel risk tree of dam failure is constructed and the comparison matrix is imported into the system, the index weights can be calculated in this module swiftly.

[Figure 4. Flow chart of weight calculation: for each factor subset Uk the comparison matrix is normalized by rows, each line is added and normalized into the weight vector W, and a consistency check either confirms the weights or returns to the comparison matrix.]

[Figure 5. The interface of weight calculation]

C. Grey Assessment Module

The grey theory is used for evaluating the dam risk. The flow chart of the grey assessment method is described in Fig.6, and the interface of this module is shown in Fig.7. After all risk data are put into the system and the index weights are calculated, the risk grade of the dam can be assessed swiftly and shown with the corresponding color.

[Figure 6. Flow chart of grey assessment: the matrix Q and the whitenization weight functions f(R) give the grey assessment set B, which is multiplied by the weight matrix W; the resulting vector is analyzed into the assessment result.]

[Figure 7. The interface of grey assessment]

V. APPLICATION

The system is used for evaluating the failure risk of the Zhangjiawan dam in Anhui Province. According to field knowledge and experiential knowledge, the condition attributes and decision attributes are established. The decision table for diagnosing dam risk is built by collecting historical data about the status of this dam, and the continuous values of the attributes are discretized. Redundant attributes of the decision table for diagnosing dam cracks are eliminated and potential geneses of cracks are found with the attribute reduction algorithm; the main geneses of cracks are found with the attribute least reduction algorithm. The fault tree of the logical relation between dam risk and its geneses is drawn in Fig.8, and the values of the risk genes are determined as in Tab.I.
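The grey assessment chain of Section II.C (whitenization functions (14)-(18), grey assessment matrix (20) and evaluation (21)) can be sketched as follows. The expert scores and AHP weights below are invented placeholders for illustration, not the values actually used for the Zhangjiawan dam:

```python
import numpy as np

# Whitenization weight functions of eqs. (14)-(18), one per grey class l = 1..5
def f(l, r):
    c = 6 - l                        # turning point: 5, 4, 3, 2, 1 for l = 1..5
    if l == 1:
        return min(r / 5.0, 1.0)     # eq. (14)
    if 0 <= r <= c:
        return r / c                 # rising branch of eqs. (15)-(18)
    if c < r <= 5:
        return (5.0 - r) / (l - 1.0) # falling branch of eqs. (15)-(18)
    return 0.0

# Invented expert scores r_ik (3 factors x 4 estimators) and AHP weights
scores = np.array([[4.2, 4.0, 3.8, 4.1],
                   [3.0, 3.2, 2.8, 3.1],
                   [2.0, 2.4, 2.2, 1.9]])
weights = np.array([0.5, 0.3, 0.2])  # weight matrix A, eq. (13)

n_classes = 5
# Grey assessment matrix B, eqs. (19)-(20): row-normalized class memberships
B = np.zeros((scores.shape[0], n_classes))
for i, row in enumerate(scores):
    totals = np.array([sum(f(l, r) for r in row) for l in range(1, n_classes + 1)])
    B[i] = totals / totals.sum()

C = weights @ B                      # eq. (21): C = A . B
grade = np.argmax(C) + 1             # maximum membership grade principle
print(np.round(C, 3), grade)
```

Each row of B sums to one, so C is itself a membership distribution over the five grades, and the grade with the largest entry is reported.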

[Figure 8. Sketch map of fault-tree for risk genes]

TABLE I. VALUE OF RISK GENES
  genes: B11, B12, B13, B21, B22, B31, B32, B33
  value:

TABLE II. PAIR-WISE COMPARISON MATRIX OF THE FIRST LEVEL
  A     B1    B2    B3
  B1    1     1/5   1/7
  B2    5     1     1/2
  B3    7     2     1

TABLE III. PAIR-WISE COMPARISON MATRIX OF PRIMARY FACTORS
  B1    B11   B12   B13
  B11   1     2     5
  B12   1/2   1     3
  B13   1/5   1/3   1

TABLE IV. PAIR-WISE COMPARISON MATRIX OF SUB-FACTORS
  B2    B21   B22
  B21   1     3
  B22   1/3   1

TABLE V. PAIR-WISE COMPARISON MATRIX OF BEARING FACTORS
  B3    B31   B32   B33
  B31   1     2     3
  B32   1/2   1     2
  B33   1/3   1/2   1

TABLE VI. DAM RISK GRADE
  Risk grade          Threshold value
  safe                4 < H <= 5
  basic safe          3 < H <= 4
  slight dangerous    2 < H <= 3
  dangerous           1 < H <= 2
  very dangerous      0 < H <= 1

The pair-wise comparison matrices (Tab. II, III, IV, V) are constructed with the fundamental scale of the analytic hierarchy process. The dam risk status is decomposed into 5 grades, corresponding to 5 remarks, namely safe, basic safe, slight dangerous, dangerous and very dangerous, as described in Tab. VI. The risk grades, risk gene names and values were put into the assessment system in order. The dam failure risk was calculated and analyzed using the grey assessment method. As shown in Fig.9, the dam safety characteristic is basic safe.
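Using the first-level comparison matrix as reconstructed in Tab. II, the AHP computation of Section II.B (geometric-mean weights of eqs. (11)-(12), lambda_max, CI and CR) can be sketched as:

```python
import numpy as np

# First-level pair-wise comparison matrix (Tab. II): factors B1, B2, B3
A1 = np.array([[1.0, 1/5, 1/7],
               [5.0, 1.0, 1/2],
               [7.0, 2.0, 1.0]])

# Eqs. (11)-(12): geometric mean of each row, normalized to weights
gm = A1.prod(axis=1) ** (1.0 / A1.shape[0])
w = gm / gm.sum()                        # weight vector A2

# lambda_max as the average of A4 = (A1 @ A2) / A2, element by element
lam_max = float(np.mean((A1 @ w) / w))
M = A1.shape[0]
CI = (lam_max - M) / (M - 1)             # consistency index
RI = 0.58                                # Saaty's random index for M = 3
CR = CI / RI                             # consistency ratio
print(np.round(w, 3), round(lam_max, 3), round(CR, 3))
```

With these entries the consistency ratio is well below 0.1, so the judgments in Tab. II are acceptably consistent.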

[Figure 9. Results of the dam risk assessment]

VI. CONCLUSIONS

The grey evaluating model established in this research has shown itself to be quite valuable in practice. The grey assessment system of dam failure risk is an applied measure for optimizing the design, construction and operation of a dam and for ensuring dam safety. The application showed that the measure and design are viable. The inference efficiency of the system is improved by the intelligent inference models. The system can be used for advancing the modern level of dam safety management, which can lessen the burden of the managers' job.

ACKNOWLEDGEMENTS

The authors would like to appreciate the financial support for this study from the National Natural Science Foundation of China (Grants #41072199 and #40672179), the Natural Science Foundation for Outstanding Scholarship of Hubei Province in China (Grant #2008CDB364), the Program for New Century Excellent Talents of the Ministry of Education of China (Grant #NCET-07-0340), the National Basic Research Program of China (Grant #2007CB714107) and the National Key Technology R&D Program of China (Grant #2008BAC36B01).

REFERENCES

[1] Julong Deng, Grey Prediction and Decision-Making. Wuhan: Huazhong College of Technology Press, 1986.
[2] Julong Deng, Basic Method of Grey System. Wuhan: Huazhong College of Technology Press, 1988.
[3] Sifeng Liu and Yi Lin, An Introduction to Grey Systems Theory. Grove City: IIGSS Academic Publisher, 1998, pp. 48-52.
[4] Junhui Wang and Kangsheng Tian, "Application of grey algebraic curve model to passive angle tracking," Journal of Air Force Radar Academy, 2001, pp. 31-34.
[5] J. Deng, "Introduction to grey system theory," The Journal of Grey System, no. 1, 1989, pp. 95-99.
[6] T. L. Saaty, The Analytic Hierarchy Process. Printed by T. L. Saaty, U.S.A., 1988.
[7] Barbara Gaudenzi and Antonio Borghesi, "Managing risks in the supply chain using the AHP method," International Journal of Logistics Management, 2006, pp. 11.
[8] Humberto Marengo, "Considerations on dam safety and the history of overtopping events," Dam Engineering, 2000, pp. 29-59.
[9] Hyun-Han Kwon and Young-Il Moon, "Improvement of overtopping risk evaluations using probabilistic concepts for existing dams," Stochastic Environmental Research and Risk Assessment, vol. 20, May 2006, pp. 223.


Research on the Dynamic Relationship between Prices of Agricultural Futures in China and Japan
Qizhi HE
School of Finance, Anhui University of Finance and Economics, Bengbu 233030, China Email: happyhefei2000@qq.com

Abstract: Based on the classical regression model, the time-varying coefficient model, unit root and co-integration tests, the Granger causality test, VAR, impulse response and variance decomposition, the dynamic relationship between the prices of natural rubber futures in China and Japan is researched systematically. The following conclusions are obtained through the empirical research. Firstly, there is a stable co-integration relationship between the prices of natural rubber futures in China and Japan. Secondly, the time-varying coefficient model is superior to the classical regression model; the influence of the Japanese natural rubber futures price on the Chinese price is time-varying, and in the long run the impact of Japanese natural rubber futures on Chinese natural rubber futures has gradually increased. Thirdly, the influence of the Japanese price on the Chinese price is greater than the influence of the Chinese price on the Japanese price.

Index Terms: futures, price, time-varying coefficient model, co-integration

I. INTRODUCTION

Because natural rubber has excellent physical and chemical properties, it has a wide range of uses; more than 70,000 kinds of items are made partially or completely of natural rubber. From the supply side, China is one of the main origins of natural rubber in the world; from the demand side, China is one of the most important consumers of natural rubber in the world (Shanghai Futures Exchange, Trading Manual for Contracts of Natural Rubber Futures, 2008 edition) [1]. With the integration of the world economy and the deepening of Chinese openness, the relationship between the Chinese economy and foreign economies has become closer. In particular, China has joined the World Trade Organization and will gradually follow the relevant WTO agreements. The price dependence of commodities at home and abroad has become higher and higher, and in recent years China's consumption of natural rubber has become increasingly dependent on imports. With the continuous deepening and promotion of China's opening up, and China continuing to honor its WTO commitments after accession, the demand of China, gradually becoming the "World Factory", for natural rubber will continue to increase, and in later years China will be increasingly
2012 ACADEMY PUBLISHER doi:10.4304/jcp.7.9.2342-2350

dependent on imported natural rubber (Shanghai Futures Exchange, Trading Manual for Contracts of Natural Rubber Futures, 2008 edition) [1]. Thus, the trend of natural rubber futures prices and the dynamic dependencies between China and other major overseas markets have strategic importance. All this requires us to study the dynamic trends of the prices of China's natural rubber futures from an international perspective. At present, Japan's Tokyo Commodity Exchange (TOCOM) and Osaka Mercantile Exchange (OME), China's Shanghai Futures Exchange (SHFE), the Singapore Commodity Exchange Limited (SICOM), etc. are the main international trading sites for natural rubber futures. Many scholars' empirical studies show that there are certain dynamic dependencies and linkages between these rubber futures markets: a significant change in one country's rubber futures price often impacts, and leads to changes in, other countries' rubber futures prices. In the international exchange markets of natural rubber futures, the exchange markets in Japan account for a large market share and have great influence on the other exchange markets, and transactions in Japan can basically reflect the situation of world market prices of natural rubber futures (Shanghai Futures Exchange, Trading Manual for Contracts of Natural Rubber Futures, 2008 edition; Analysis on the main factors which influence the price volatility of China rubber futures, China Rubber) [1, 2]. In general, with the development of reform and opening up and the integration of the world economy, the link between the natural rubber futures markets at home and abroad is increasingly close, and the prices and yields of China's natural rubber futures are more and more dependent on those of a number of major international rubber futures markets.
The natural rubber futures market in Japan (TOCOM and OME) is one of the most representative international natural rubber futures markets. Therefore, this paper systematically studies the dynamic dependency and linkage effect between the natural rubber futures markets in China and Japan.

II. LITERATURE REVIEW

Current research on the relationships between the domestic and international futures markets is mostly limited to metal futures, such as copper, aluminum


and others. Renhai Hua and Baizhu Chen (2004) [3] studied empirically the dynamic relationship between domestic and foreign futures prices (copper, aluminum, soybean and wheat). Jin Tao, Miao Baiqi and Hui Jun (2005) [4] examined empirically the dynamic relationship of March copper futures prices between the Shanghai Futures Exchange and the London Metal Exchange by use of a causal relationship model. Zhao Liang and Liu Liya (2006) [5] studied the relationship between the copper futures prices of the Shanghai Futures Exchange and the London Metal Exchange using cointegration and the Granger causality test. Liu Qing-fu, Zhang Jin-qing and Hua Ren-hai (2008) [6] studied empirically the causal relationship between the metal futures (copper and aluminum) prices of the Shanghai Futures Exchange and the London Metal Exchange by building a model. However, there is still relatively little research on natural rubber futures. In fact, researching natural rubber has important theoretical and practical significance, because natural rubber belongs to agricultural futures from the angle of its source, while from the view of its wide range of industrial uses it also belongs to industrial futures. At the same time, research on the relationship between the futures markets at home and abroad is mainly confined to the traditional constant coefficient regression model. In fact, owing to changes of the global and regional economic environment, the world political situation, government intervention, transaction costs and exchange rates, and the frequent occurrence of climate and other natural disasters, the changes of the domestic and international futures markets are non-symmetrical, and their relationship is also time-varying. Thus the traditional constant coefficient regression model cannot fully and accurately measure the dynamic relationship between the domestic and foreign futures markets.
It is more suitable to the actual market situation to measure the relationship between the futures markets at home and abroad dynamically, by use of the time-varying coefficient model. In addition, the paper studies this relationship dynamically based on other methods, such as the Granger causality test, VAR, impulse response and variance decomposition. In a word, the paper has important theoretical and practical significance. The following parts of this paper are organized as follows. First, some of the econometric models and methods used in the later empirical study are introduced, such as the time-varying coefficient model, unit root, cointegration, the Granger causality test, VAR, impulse response and variance decomposition. Then, using the methods described above, the empirical study on the dynamic relationship between the Chinese and Japanese natural rubber futures price series is carried out. The last part gives the conclusions, which summarize the results of the empirical research, and offers some suggestions.

III. THE MODEL INTRODUCED

A. The Time-Varying Coefficients Model of the Relationship between Futures Prices at Home and Abroad

In order to measure dynamically the long-term relationship between Chinese and Japanese rubber futures prices, we construct a time-varying coefficients model of that relationship, based on the time-varying coefficients framework (E. Schlicht (1989) [7]; E. Schlicht (2005)):

  (ln C)_t = a_1t + a_2t (ln J)_t + u_t                           (1)

a1t a1t 1 v1t = + a2t a2t 1 v2t

(2)

Where (ln C )t is the dependent variable which represents Chinese rubber futures price. (ln J )t is the independent variable which represents Japanese rubber futures price. ( a1t ,

a2t ) ' is the coefficient vector, and

the disturbance terms, ut N (0, 2 ) v1t N (0, 12 )

v2t N (0, 22 ) , all obey normal distribution, and the covariance of ut , v1t , v2 t is 0, namely they are
independent of each other, and t is the time index. Model (1) and (2) are the promotion of the classical regression model, and if
2 12 and 2

are 0 then the model (1) and (2)

will degenerate into the classical regression model. B. Unit Root, Cointegration and Granger Causality Test In the empirical study of financial variables, firstly stationary test must be conducted to data sequences; otherwise, there may be "pseudo-regression" and other unexpected phenomena. The standard methods to check whether a sequence is stationary are unit root tests, which mainly including Augmented Dickey-Fuller test (ADF) test, Phillips-Perron (PP) test and etc. The original hypothesis of these tests is: the sequence is a unit root and the alternative hypothesis of these tests is: there is no unit root series. In using these methods to doing the unit root test, particular attention must be paid to determine rationally the lag order of the sequence (Gao Tiemei, 2008)[8]. Usually the AIC, SIC and HQ criteria are used to determine the reasonable delay order of a sequence. A variable sequence is considered a stationary process and can be empirically analyzed directly if it can pass through a unit root test. On the contrary, a variable sequence is considered a non-stationary process and can not be analyzed directly if it can not pass through a unit root test. At this time difference can be taken to the variable sequence to make the variable sequence become a stationary process, but this often causes a lot of loss of long-term information. In order to avoiding losing longterm information, we can verify if there is any long-term equilibrium relationship between the non-stationary variables based on co-integration tests, which mainly including Johansen co-integration test and co-integration test based on regression residuals. Because this article

2012 ACADEMY PUBLISHER

JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012

relates only to the Chinese and Japanese rubber futures price series, the paper uses the co-integration test based on regression residuals. The empirical study is divided into two steps. The first step is to carry out the regression analysis and obtain the regression residuals. The second step is to apply a unit root test to the residuals; if the residuals are stationary, this indicates that the Chinese and Japanese rubber futures price series are co-integrated. In empirical studies, for economic variables without a clear causal structure, their guiding relationship can be tested statistically through the Granger causality test (Yi huiwen, 2006) [9]. Granger (1969) put forward a testable definition of causal relationship and elucidated it through simple two-variable models [10]. The first step is to test H0: LnJ is not the cause of changes in LnC, by the following two formulas (Zou Ping, 2005; Yi huiwen, 2006) [11; 9]:
$(\ln C)_t = \alpha_0 + \sum_{i=1}^{m}\alpha_i(\ln C)_{t-i} + \sum_{j=1}^{m}\beta_j(\ln J)_{t-j} + u_t$  (3)

$(\ln C)_t = \gamma_0 + \sum_{i=1}^{m}\gamma_i(\ln C)_{t-i} + v_t$  (4)

where $u_t$ and $v_t$ are white noise processes; $\alpha_0, \alpha_i, \beta_j, \gamma_0, \gamma_i$ are respectively the coefficients; $n$ is the sample size; and $m$ is the lag number of the variables $(\ln C)_t$ and $(\ln J)_t$. Whether the coefficients $\beta_1, \beta_2, \ldots, \beta_m$ are jointly significantly different from 0 can be tested by an F-statistic calculated from ESS(3) and ESS(4), which are respectively the residual sums of squares of the regression equations (3) and (4). If the null hypothesis holds, then

$F = \dfrac{(ESS(4) - ESS(3))/m}{ESS(3)/(n-2m-1)} \sim F(m, n-2m-1)$.

In the specific empirical study, if the value of the F-statistic is greater than the critical value, the null hypothesis is rejected and the conclusion that LnJ causes the changes in LnC is obtained. The second step is to test H0: LnC is not the cause of changes in LnJ. The ideas and methods are the same as in the first step, but with LnJ and LnC exchanged, testing whether the coefficients of the lagged terms of LnC are significantly non-zero (Zou Ping, 2005) [11].

C. VAR, Impulse Response and Variance Decomposition

VAR is a non-structural method of modeling the relationship between variables. Although it does not describe the relationships between variables on the basis of economic theory, it sometimes performs even better than complex simultaneous-equation models and other structural methods (Gao Tiemei, 2008) [8]. Addressing the shortcomings of traditional structural modeling methods, Sims (1980) proposed the VAR method [12]. The general form of the VAR model can be expressed as follows (Zou Ping, 2005) [11]:

$Y_t = \sum_{i=1}^{p} A_i Y_{t-i} + E_t$  (5)

where $Y_t$ is the $n$-dimensional column vector of observations at term $t$, $p$ is the lag number, $A_i$ is an $n \times n$ coefficient matrix, and $E_t$ is an $n \times 1$ vector of random error terms in which each $e_i$ $(i = 1, 2, \ldots, n)$ obeys a white noise process and satisfies $E(e_{it}e_{jt}) = 0$ $(i, j = 1, 2, \ldots, n,\ i \neq j)$.

For example, with $n = 3$ and $p = 3$, the VAR model of equation (5) can be written as follows (Zou Ping, 2005) [11]:

$Y_{1t} = \alpha_{10} + \sum_{k=1}^{3}\left(\alpha_{1k}Y_{1,t-k} + \beta_{1k}Y_{2,t-k} + \gamma_{1k}Y_{3,t-k}\right) + e_{1t}$
$Y_{2t} = \alpha_{20} + \sum_{k=1}^{3}\left(\alpha_{2k}Y_{1,t-k} + \beta_{2k}Y_{2,t-k} + \gamma_{2k}Y_{3,t-k}\right) + e_{2t}$  (6)
$Y_{3t} = \alpha_{30} + \sum_{k=1}^{3}\left(\alpha_{3k}Y_{1,t-k} + \beta_{3k}Y_{2,t-k} + \gamma_{3k}Y_{3,t-k}\right) + e_{3t}$

In empirical studies the VAR model also has some drawbacks and shortcomings: first, it is difficult to explain the significance of the coefficient of each variable with economic and financial theory; second, it is difficult to judge how the other variables will change in the future when one variable changes (Zou Ping, 2005) [11]. To make up for these deficiencies and let the VAR model play its role, impulse response analysis and variance decomposition should be applied in the empirical research. In a vector autoregressive model, through the dynamic relationships between variables, a disturbance to one variable occurring at time $t$ produces a chain of effects on every variable after time $t$. The impulse response can measure the response of an explained variable to a unit shock and determine the interactions existing between the variables. The variance decomposition can measure the contribution of the shocks to each variable to the mean square error of prediction in the system (Shen Yue, Liu Hongyu, 2004) [13].

IV. EMPIRICAL RESEARCH

A. Statistical Analysis of Chinese and Japanese Rubber Futures Prices

The objects of the empirical research in this article are the data series of Chinese and Japanese rubber futures prices. For the Japanese rubber futures we have a ready-made continuous price series. For the Chinese natural rubber futures we do not have a ready-made continuous price series, but we have the price data of the natural rubber 0803, 0804, 0805, 0806, 0807, 0808, 0809, 0810, 0811, 0812 and 0901 contracts of the Shanghai futures market. To overcome the discontinuity of the Chinese natural rubber futures price series, the paper takes the main-contract approach, often used in the existing literature and in practice: a continuous futures series is produced by selecting, in each month, the futures contract which has the largest trading volume and trades most actively. In this study the time span is from January 1998 to December 2007, and the data frequency is monthly: each observation is the average closing price over the trading days of the month. Data source: Jiangsu Holly Futures Brokerage Co., Ltd. The software used is Eviews 5, Matlab 7.0 and VC version 5.6 (E. SCHLICHT, 2005). The unit of Japanese rubber futures contracts is yen/kg, and the unit of Chinese rubber futures is yuan (RMB)/ton. To overcome the different units of the Chinese and Japanese rubber futures contracts, to avoid heteroscedasticity in the futures price data, and also to make it convenient to explain the dynamic relationship between Chinese and Japanese futures prices in terms of elasticities, we take the natural logarithm of each price series. C represents the Chinese futures price and lnC = log(C) is its natural logarithm; J represents the Japanese futures price and LnJ = log(J) is its natural logarithm. For simplicity we still refer to lnC and LnJ as the Chinese and Japanese rubber futures prices. TABLE I and Figure 1 respectively show the statistical features and dynamic characteristics of Chinese and Japanese rubber futures prices from January 1998 to December 2007.
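The main-contract splicing and log transformation described above can be sketched as follows. This is an illustrative sketch with made-up contract records; the column names (month, contract, volume, close) are hypothetical, not the paper's actual data layout.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly records: one row per (month, contract), holding that
# month's average close and total trading volume.
records = pd.DataFrame({
    "month":    ["2007-01", "2007-01", "2007-02", "2007-02"],
    "contract": ["ru0803", "ru0804", "ru0803", "ru0804"],
    "volume":   [120, 340, 150, 90],
    "close":    [18500.0, 18620.0, 18800.0, 18750.0],
})

# Main-contract splicing: for each month keep the contract with the largest
# trading volume, giving one continuous price per month.
idx = records.groupby("month")["volume"].idxmax()
continuous = records.loc[idx].set_index("month")["close"]

# Natural logarithm, as done for lnC and LnJ in the paper.
ln_c = np.log(continuous)
print(continuous.tolist())
```

In January the ru0804 contract has the larger volume, in February ru0803, so the spliced series switches contracts between months exactly as the main-contract rule prescribes.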
TABLE I. THE STATISTICAL CHARACTERISTICS OF CHINESE AND JAPANESE RUBBER FUTURES PRICES

Series | Mean | Maximum | Minimum | Standard Deviation | Skewness | Kurtosis | JB
lnC | 9.37 | 10.16 | 8.79 | 0.39 | 0.36 | 1.82 | 9.59 [0.008]
LnJ | 4.83 | 5.72 | 4.19 | 0.45 | 0.51 | 2.04 | 9.83 [0.007]

Note: Skewness reflects the symmetry of the distribution of a series; kurtosis reflects the peakedness or flatness of the distribution; JB is a test statistic for testing whether a series obeys the normal distribution; values in square brackets are the p values of the test, namely the smallest significance level at which the null hypothesis is rejected (Eviews help files).

According to TABLE I, the skewnesses of the Chinese and Japanese rubber futures price series are both greater than the skewness of the normal distribution, 0, and thus both series are skewed to the right. Their kurtoses are both smaller than the kurtosis of the normal distribution, 3, and thus both series have low kurtosis. According to the JB statistic, both values are greater than the critical value of JB at the 5% significance level, $\chi^2(2) = 5.991$; namely, at the 5% significance level both the Chinese and the Japanese rubber futures price series reject the normal distribution. In general, the distributions of the Chinese and Japanese rubber futures price series are significantly unlike the normal distribution. Figure 1 shows the dynamic trends of the Chinese and Japanese rubber futures price series; from it we can see that the trends, and the changes in trends, of the two series are roughly the same.

Figure 1. The dynamic changes in trends of Chinese and Japanese rubber futures prices

B. The Unit Root Test

In the empirical study of the long-term relationship between Chinese and Japanese rubber futures prices, one must first study the stationarity of the two price series; if they are not stationary, a co-integration test must be carried out on them. TABLE II shows the results of the unit root tests. As can be seen from TABLE II, under the AIC, SIC and HQ criteria alike, the Chinese and Japanese rubber futures price series are not stationary, but their first differences are stationary, indicating that both are processes integrated of order one.

TABLE II. UNIT ROOT TESTS FOR THE CHINESE AND JAPANESE RUBBER FUTURES PRICE SERIES

Method | Chinese prices, original series | Chinese prices, first difference | Japanese prices, original series | Japanese prices, first difference
AIC | -0.81 (c,0,1) a | -8.86 (c,0,0) *** b | -0.02 (c,0,1) | -9.48 (c,0,0) ***
SIC | -0.58 (c,0,0) | -8.86 (c,0,0) *** | 0.12 (c,0,0) | -9.48 (c,0,0) ***
HQ | -0.81 (c,0,1) | -8.86 (c,0,0) *** | -0.02 (c,0,1) | -9.48 (c,0,0) ***

a. The first c in (c,0,1) indicates that the test equation contains an intercept, the 0 indicates that it contains no trend term, and the 1 is the lag number. b. *** indicates that the series rejects the unit root hypothesis at the 10%, 5% and 1% significance levels, namely the series is stationary at all three significance levels.

C. The Relationship between Domestic and Foreign Futures Prices Based on the Constant Coefficient Model

The foregoing unit root test results show that the Chinese and Japanese rubber futures price series are both non-stationary but are processes integrated of order one, so a co-integration test can be applied to them. Because the paper relates only to the relationship between two variables, it uses the two-step method of co-integration testing frequently used in empirical studies.


The first step is to establish the following constant-coefficient regression equation, taking the Japanese rubber futures price as the independent variable and the Chinese rubber futures price as the dependent variable:

$(\ln C)_t = a_1 + a_2(\ln J)_t + u_t$  (7)

From the data on Chinese and Japanese rubber futures prices, we obtain the values of the parameters $a_1$ and $a_2$ and the following regression equation:

$(\ln C)_t = 5.4424 + 0.8131(\ln J)_t + u_t$  (8)
        (44.22)  (32.05)

The values in parentheses are the t test values of the corresponding coefficients. According to them, both the constant and the coefficient of the independent variable in equation (8) are significant. At the same time the signs of the coefficients also meet our expectations: Chinese and Japanese rubber futures prices change in the same direction, and a 1% change in the Japanese rubber futures price leads to a 0.8131% change in the Chinese rubber futures price. As both price series are processes integrated of order one, whether equation (8) is meaningful also depends on the result of the residual unit root test. If the residual of equation (8) is stationary, this indicates a long-term stable equilibrium relationship between the two price series; otherwise equation (8) is a spurious regression and the series are not co-integrated. Following the proposal of Engle and Granger (1987), the ADF approach can be applied to the residual series, but modified ADF critical values should be used. The empirical test results are shown in TABLE III.

TABLE III. RESULTS OF THE RESIDUAL UNIT ROOT TEST

Sequence | ADF test value | Critical value | Conclusion
Ecm | -3.52 (0,0,1) | -1.94 | stationary

We can see from TABLE III that the residual series of equation (8) is stationary, so the rubber futures prices in China and Japan pass the co-integration test. This shows that equation (8) is meaningful and that a long-term equilibrium relationship exists between the rubber futures prices in China and Japan: in the long run, a change of 1 percentage point in the Japanese rubber futures price leads to a change of about 0.8 percentage points in the Chinese rubber futures price.

D. The Relationship between Futures Prices at Home and Abroad Based on the Time-varying Coefficients Model

In the former section, the paper studied the long-term relationship between Chinese and Japanese rubber futures prices through the constant coefficient model and concluded that a long-term equilibrium relationship exists, whether judged by the signs of the coefficients, the significance of the coefficients, or the residual test. But that model assumes that Chinese and Japanese rubber futures prices have had a stable structural relationship over the past several decades. In fact, over those decades the factors affecting the supply of and demand for natural rubber, such as the international economic and political environment, the Chinese and Japanese economic and political environments, Chinese and Japanese trading conditions and so on, have been changing all the time, and thus the relationship between Chinese and Japanese rubber futures prices should also change with time and circumstance. The time-varying coefficients model is therefore applied to measure the relationship between the two prices dynamically. The values of the relevant parameters of the time-varying coefficient model can be obtained from the actual Chinese and Japanese rubber futures price series. As the coefficients are time varying, the parameter values are many; to save space we omit the specific values and show the general movements of the relevant parameters graphically. We can see from Figures 2 and 3 that in the time-varying coefficient model the constant is stable at around 6.8, while the coefficient of the independent variable fluctuates more and takes different values at different times. The coefficient of the Japanese rubber futures price series reaches its minimum around June 1998 and its maximum around May 2006. It fluctuates between about 0.4 and 0.6, which means that every 1% change in the Japanese rubber futures price leads to a 0.4% to 0.6% change in the Chinese rubber futures price. We can also see from Figure 3 that the coefficient of the Japanese rubber futures price series gradually becomes larger from early 1998 to the end of 2007, showing that the impact of Japanese rubber futures prices on Chinese rubber futures prices has been increasing.

Figure 2. The constant of the time-varying model
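A filtering sketch of the random-walk coefficient model (1)-(2) is given below. It is a plain Kalman filter on synthetic data; the noise variances are assumed known here, whereas Schlicht's method estimates them from the data, which this sketch does not do. All numeric settings (drifting elasticity from 0.4 to 0.6, constant 6.8) are illustrative stand-ins for the paper's estimates.

```python
import numpy as np

def tv_coeff_kalman(y, x, sigma_u2, q, a0, p0):
    """Kalman filter for y_t = x_t . a_t + u_t with random-walk
    coefficients a_t = a_{t-1} + v_t, as in model (1)-(2)."""
    n, k = x.shape
    a, p = a0.copy(), p0.copy()        # filtered coefficient mean / covariance
    path = np.empty((n, k))
    for t in range(n):
        p = p + q                      # predict step: random-walk coefficients
        h = x[t]
        s = h @ p @ h + sigma_u2       # innovation variance
        gain = p @ h / s
        a = a + gain * (y[t] - h @ a)  # update with observation t
        p = p - np.outer(gain, h @ p)
        path[t] = a
    return path

rng = np.random.default_rng(1)
n = 200
ln_j = np.cumsum(rng.normal(0, 0.05, n)) + 5.0
a2_true = np.linspace(0.4, 0.6, n)           # slowly rising elasticity
y = 6.8 + a2_true * ln_j + rng.normal(0, 0.02, n)
x = np.column_stack([np.ones(n), ln_j])

path = tv_coeff_kalman(y, x, sigma_u2=0.02 ** 2,
                       q=np.eye(2) * 1e-4,
                       a0=np.zeros(2), p0=np.eye(2) * 10.0)
print(path[-1])  # filtered (a1, a2) at the last observation
```

The filtered coefficient path plays the role of the curves in Figures 2 and 3: a roughly constant intercept and a slowly drifting slope.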

Figure 3. The independent variable coefficient of the time-varying model

As with the classical regression model, since the Chinese and Japanese rubber futures price series are both processes integrated of order one, whether equation (9) is meaningful also depends on the result of the residual unit root test. If the residual of equation (9) is stationary, this indicates a long-term stable equilibrium relationship between the two price series; otherwise equation (9) is a spurious regression and the series are not co-integrated. The residual term of equation (9) is obtained by rearranging it:

$Ecm_t = (\ln C)_t - a_{1t} - a_{2t}(\ln J)_t$  (10)

As before, the paper uses the ADF test to determine whether the regression residual series is stationary. The empirical test results are shown in TABLE IV.

TABLE IV. RESULTS OF THE RESIDUAL UNIT ROOT TEST

Sequence | ADF test value | Critical value | Conclusion
Ecm | -3.00 (0,0,4) | -1.94 | stationary

We can see from TABLE IV that the residual series of equation (9) is stationary, so the rubber futures prices in China and Japan pass the co-integration test. This shows that equation (9) is meaningful and that a long-term equilibrium relationship exists between the rubber futures prices in China and Japan: in the long run, a change of 1 percentage point in the Japanese rubber futures price leads to a change of 0.4 to 0.6 percentage points in the Chinese rubber futures price in the same direction. To study further the advantages and disadvantages of the classical regression model and the time-varying coefficient model, the paper also calculates the residual sums of squares under the two methods: the residual sum of squares is 1.85 under the classical linear regression model and 1.8654e-009 under the time-varying coefficient model, much smaller than under the classical model. Judged by the model fitting results, the time-varying coefficient model is superior to the classical regression model.

E. Granger Causality Test

Whether using the classical regression model or the time-varying coefficient model, the previous empirical results have shown that a stable long-term relationship exists between the Chinese and Japanese rubber futures price series and that the two series are co-integrated. A Granger causality test can therefore be applied to them to further determine the causal relationship between them and to establish which one guides the other.

TABLE V. GRANGER CAUSALITY TEST RESULTS BETWEEN THE CHINESE AND JAPANESE RUBBER FUTURES PRICE SERIES

Null hypothesis | F-statistic | Probability
Japanese rubber futures prices are not the Granger cause of Chinese rubber futures prices | 5.13 | 0.025
Chinese rubber futures prices are not the Granger cause of Japanese rubber futures prices | 0.02 | 0.887

As can be seen from TABLE V, at the 5% significance level the null hypothesis that Japanese rubber futures prices are not the Granger cause of Chinese rubber futures prices is rejected, while the null hypothesis that Chinese rubber futures prices are not the Granger cause of Japanese rubber futures prices is not rejected. All this shows that there is a one-way causal relationship from Japanese rubber futures prices to Chinese rubber futures prices: Japanese rubber futures prices play a unidirectional guiding role for Chinese rubber futures prices, but not vice versa.

F. Impulse Response and Variance Decomposition Based on VAR

As the Chinese and Japanese rubber futures price series are not stationary, impulse response analysis and variance decomposition cannot be applied to them directly. Some scholars believe that co-integrated variables can also be analyzed in a VAR, but we stick to strict standards; moreover, the foregoing analysis is based on prices, while in the following empirical study we intend to use yields. We first transform the Chinese and Japanese rubber futures price series into return series by logarithmic differencing and then test their stationarity; if they are stationary, impulse response analysis and variance decomposition are applied to the return series.

Figure 4. Return series of Chinese and Japanese rubber futures

Figure 4 shows the return series of Chinese and Japanese rubber futures. Whether the ADF or the PP unit root test is used, and whether the lag order is chosen by the AIC, SIC or HQ information criterion, both Chinese and Japanese rubber futures return series are stationary processes. So we can apply impulse response analysis and variance decomposition to them on the basis of a VAR. The specific steps of the empirical study are as follows. The first step is to judge whether the VAR model is stable. According to Lutkepohl (1991) [14] and Gao Tiemei (2008) [8], if all the reciprocals of the roots of the estimated VAR model are less than 1, namely they lie inside the unit circle, then the estimated VAR model is stable. It can be seen from Figure 5 that the reciprocals of the roots of the VAR model composed of the Chinese and Japanese rubber futures return series are all less than 1; therefore this VAR model is a stable VAR model.
Figure 5. Reciprocals of the roots of the AR characteristic polynomial

The next step is the impulse response analysis. Considering model error, calculation error and model stability, the Monte Carlo method is used to calculate the dynamic responses of the corresponding futures return series to disturbances of the endogenous variables (Liu Bin, 2001) [15], and the number of repetitions is 10,000 when the impulse responses are calculated. In Figure 6 the solid line is the mean of the impulse responses simulated by Monte Carlo, and the dotted lines are the solid line plus/minus two standard errors of the impulse response. The test results are shown in Figure 6 and Figure 7: Figure 6 reflects unit shocks, while Figure 7 reflects cumulative shocks. It can be seen from Figure 6 that when Japanese rubber futures yields receive a positive shock, both Japanese and Chinese rubber futures yields respond positively and return to a stable state within the first three periods. When Chinese rubber futures yields receive a positive shock, Japanese rubber futures yields show almost no response, while Chinese rubber futures yields respond positively and return to a stable state within the first two to three periods. Figure 6 also shows that shocks to Chinese rubber futures yields have no effect on Japanese rubber futures yields, whereas shocks to Japanese rubber futures yields have a certain effect on Chinese rubber futures yields. This agrees with the conclusion of the earlier Granger causality test that there is a one-way causal relationship from Japanese rubber futures prices to Chinese rubber futures prices: Japanese rubber futures prices play a unidirectional guiding role for Chinese rubber futures prices, but not vice versa.

Figure 6. Unit impact map of the impulse responses of Chinese and Japanese rubber futures yields

Figure 7. Cumulative impact map of the impulse responses of Chinese and Japanese rubber futures yields

It can be seen from Figure 7 that a unit positive shock to Japanese rubber futures yields produces a cumulative response of 0.08 units in Japanese rubber futures yields and 0.05 units in Chinese rubber futures yields, while a unit positive shock to Chinese rubber futures yields produces a cumulative response of 0 units in Japanese rubber futures yields and 0.06 units in Chinese rubber futures yields. Figure 7 also shows that the cumulative impulse of Chinese rubber futures yields has no impact on Japanese rubber futures yields, whereas the cumulative impulse of Japanese rubber futures yields has a certain impact on Chinese rubber futures yields.


This again agrees with the conclusion of the earlier Granger causality test that there is a one-way causal relationship from Japanese rubber futures prices to Chinese rubber futures prices: Japanese rubber futures prices play a unidirectional guiding role for Chinese rubber futures prices, but not vice versa.
Figure 8. Variance decomposition maps of Chinese and Japanese rubber futures yields

While the impulse response function depicts the impact of a disturbance to one endogenous variable on the other variables in the VAR, the variance decomposition breaks the changes in each endogenous variable down into the component shocks of the VAR, and thus shows the relative importance of each random innovation to the variables in the VAR (Eviews help file). Figure 8 shows that, throughout the forecasting period, 99% of the forecast variance of RJ is due to disturbances of RJ itself and 1% is due to disturbances of RC, which shows that the forecast variance of RJ is mainly due to the impact of RJ. Throughout the forecasting period, 31% of the forecast variance of RC is due to disturbances of RJ and the remaining 69% is due to disturbances of RC, which shows that the forecast variance of RC is mainly due to the impact of RC (Pindyck, Rubinfeld, 1999 [16]; Eviews help file). From Figure 8 we also find that, throughout the forecast period, the part of RJ's forecast variance due to disturbances of RC is much smaller than the part of RC's forecast variance due to disturbances of RJ. This again agrees with the conclusion of the earlier Granger causality test that there is a one-way causal relationship from Japanese rubber futures prices to Chinese rubber futures prices, but not vice versa. V. CONCLUSIONS Based on the classical regression model, the time-varying coefficient model, unit root tests, co-integration, the Granger causality test, VAR, impulse response analysis and variance decomposition, the dynamic relationship between the prices of natural rubber futures in China and Japan has been researched systematically.
The main conclusions drawn from the empirical research, and some suggestions, are as follows. Firstly, there is a stable co-integration relationship between the prices of natural rubber futures in China and Japan. Secondly, whether in terms of the actual economic background or of the model fitting results, the time-varying coefficient model is superior to the classical regression model. Thirdly, the influence of the Japanese natural rubber futures price on the Chinese natural rubber futures price is time-varying: every 1% change in the Japanese rubber futures price leads to a 0.4% to 0.6% change in the Chinese rubber futures price, and in the long run the impact of natural rubber futures in Japan on natural rubber futures in China has gradually increased. Fourthly, the influence of the Japanese price on the Chinese price is greater than the influence of the Chinese price on the Japanese price: Japanese rubber futures prices are the Granger cause of Chinese rubber futures prices, but not vice versa. Fifthly, with the deepening of China's opening up and its gradual integration into the world economy after joining the World Trade Organization, the relationship between Chinese natural rubber futures prices and foreign prices is becoming closer. This further requires the relevant managers and investors in China to have an international outlook, pay close attention to international price fluctuations of the related futures, and prepare prevention and response measures in advance, so as to avoid excessive negative impacts of foreign futures price volatility on our country.

ACKNOWLEDGMENT

This work was supported in part by a grant from the National Social Science Fund of China (No. 11CJY080).

REFERENCES
[1] Shanghai Futures Exchange, Trading Manual for Contracts of Natural Rubber Futures, 2008 Edition.
[2] Feature Article, "Analysis on the main factors which influence the price volatility of China rubber futures," China Rubber, vol. 20, no. 7, pp. 26-27, 2004.
[3] Renhai Hua and Baizhu Chen, "International linkages of the Chinese futures markets," China Economic Quarterly, vol. 3, no. 3, pp. 727-742, 2004.
[4] Jin Tao, Miao Baiqi, and Hui Jun, "Investigating the causal relationship of the future copper price between LME and SHFE under the interaction," Operations Research and Management Science, vol. 14, no. 6, pp. 88-92, 2005.
[5] Zhao Liang and Liu Liya, "Research on the relationship between copper futures markets at home and abroad," Tongji yu Juece (Statistics and Decision), vol. 10, pp. 116-118, 2006.
[6] Liu Qing-fu, Zhang Jin-qing, and Hua Ren-hai, "Information transmission between LME and SHFE in copper futures markets," Journal of Industrial Engineering, vol. 122, no. 12, pp. 155-159, 2008.
[7] E. Schlicht, "Variance estimation in a random coefficients model," paper presented at the Econometric Society European Meeting, Munich, 1989, http://www.lrz.de/~ekkehart.
[8] Gao Tiemei, Econometric Analysis and Modeling: Application and Examples of EViews, Beijing: Tsinghua University Press, 2008.

JOURNAL OF COMPUTERS, VOL. 7, NO. 9, SEPTEMBER 2012

[9] Yi Huiwen, "Discussion on how to do Granger causality test," Journal of the Postgraduate of Zhongnan University of Economics and Law, vol. 5, pp. 34-36, 2006.
[10] C. W. J. Granger, "Investigating causal relations by econometric models and cross-spectral methods," Econometrica, vol. 37, no. 3, pp. 424-438, 1969.
[11] Zou Ping, Financial Econometrics, Shanghai: Shanghai University of Finance & Economics Press, pp. 150-183, 2005.
[12] C. A. Sims, "Macroeconomics and reality," Econometrica, vol. 48, no. 1, pp. 1-48, 1980.
[13] Shen Yue and Liu Hongyu, "Relationship between real estate development investment and GDP in China," Journal of Tsinghua University (Science and Technology), vol. 44, no. 9, pp. 1205-1208, 2004.

[14] H. Lütkepohl, Introduction to Multiple Time Series Analysis, New York: Springer-Verlag, 1991.
[15] Liu Bin, "Identification of the impact of monetary policy and empirical analysis on the effectiveness of China's monetary policy," Journal of Financial Research, vol. 7, pp. 1-9, 2001.
[16] R. S. Pindyck and D. L. Rubinfeld, Econometric Models and Economic Forecasts, translated by Qian Xiaojun et al., Beijing: China Machine Press, pp. 273-277, 1999.

Qizhi He received his doctor's degree in Management from Southeast University, China, in 2009. He is currently working at the College of Finance, Anhui University of Finance and Economics. His research interests include mathematical finance and financial engineering.

2012 ACADEMY PUBLISHER

Call for Papers and Special Issues


Aims and Scope
Journal of Computers (JCP, ISSN 1796-203X) is a scholarly peer-reviewed international scientific journal published monthly for researchers, developers, technical managers, and educators in the computer field. It provides a high-profile, leading-edge forum for academic researchers, industrial professionals, engineers, consultants, managers, educators and policy makers working in the field to contribute and disseminate innovative new work on all areas of computers. JCP invites original, previously unpublished research, survey and tutorial papers, plus case studies and short research notes, on both applied and theoretical aspects of computers. These areas include, but are not limited to, the following:

- Computer Organizations and Architectures
- Operating Systems, Software Systems, and Communication Protocols
- Real-time Systems, Embedded Systems, and Distributed Systems
- Digital Devices, Computer Components, and Interconnection Networks
- Specification, Design, Prototyping, and Testing Methods and Tools
- Artificial Intelligence, Algorithms, Computational Science
- Performance, Fault Tolerance, Reliability, Security, and Testability
- Case Studies and Experimental and Theoretical Evaluations
- New and Important Applications and Trends

Special Issue Guidelines


Special issues feature specifically aimed and targeted topics of interest contributed by authors responding to a particular Call for Papers or by invitation, edited by guest editor(s). We encourage you to submit proposals for creating special issues in areas that are of interest to the Journal. Preference will be given to proposals that cover some unique aspect of the technology and ones that include subjects that are timely and useful to the readers of the Journal. A Special Issue typically comprises 10 to 15 papers, each 8 to 12 pages in length.

The following information should be included as part of the proposal:

- Proposed title for the Special Issue
- Description of the topic area to be focused upon and justification
- Review process for the selection and rejection of papers
- Name, contact, position, affiliation, and biography of the Guest Editor(s)
- List of potential reviewers
- Potential authors for the issue
- Tentative time-table for the call for papers and reviews

If a proposal is accepted, the guest editor will be responsible for:

- Preparing the Call for Papers to be included on the Journal's Web site.
- Distributing the Call for Papers broadly to various mailing lists and sites.
- Getting submissions, arranging the review process, making decisions, and carrying out all correspondence with the authors. Authors should be informed of the Instructions for Authors.
- Providing us the completed and approved final versions of the papers, formatted in the Journal's style, together with all authors' contact information.
- Writing a one- or two-page introductory editorial to be published in the Special Issue.

Special Issue for a Conference/Workshop


A special issue for a Conference/Workshop is usually released in association with the committee members of the Conference/Workshop, such as the general chairs and/or program chairs, who are appointed as the Guest Editors of the Special Issue. A Special Issue for a Conference/Workshop typically comprises 10 to 15 papers, each 8 to 12 pages in length.

Guest Editors are involved in the following steps in guest-editing a Special Issue based on a Conference/Workshop:

- Selecting a title for the Special Issue, e.g. "Special Issue: Selected Best Papers of XYZ Conference".
- Sending us a formal Letter of Intent for the Special Issue.
- Creating a Call for Papers for the Special Issue, posting it on the conference web site, and publicizing it to the conference attendees. Information about the Journal and Academy Publisher can be included in the Call for Papers.
- Establishing criteria for paper selection/rejection. Papers can be nominated based on multiple criteria, e.g. rank in the review process plus the evaluation from the Session Chairs and the feedback from the Conference attendees.
- Selecting and inviting submissions, arranging the review process, making decisions, and carrying out all correspondence with the authors. Authors should be informed of the Author Instructions. Usually, the Proceedings manuscripts should be expanded and enhanced.
- Providing us the completed and approved final versions of the papers, formatted in the Journal's style, together with all authors' contact information.
- Writing a one- or two-page introductory editorial to be published in the Special Issue.

More information is available on the web site at http://www.academypublisher.com/jcp/.

BP Neural Network based on PSO Algorithm for Temperature Characteristics of Gas Nanosensor
Weiguo Zhao
2318

Hybrid SVM-HMM Diagnosis Method for Rotor-Gear-Bearing Transmission System
Qiang Shao and Changjian Feng
2324

A Real-Time Information Service Platform for High-Speed Train
Ruidan Su, Tao Wen, Weiwei Yan, Kunlin Zhang, Dayu Shi, and Huaiyu Xu
2330

Research on the Grey Assessment System of Dam Failure Risk
Ying Jiang and Qiuwen Zhang
2334

Research on the Dynamic Relationship between Prices of Agricultural Futures in China and Japan
Qizhi He
2342

(Contents Continued from Back Cover)

Efficient Graduate Employment Serving System based on Queuing Theory
Hui Zeng
2176

Staying Cable Wires of Fiber Bragg Grating/Fiber-Reinforced Composite
Jianzhi Li, Yanliang Du, and Baochen Sun
2184

Research and Development of Intelligent Motor Test System
Li Li, Jian Liu, and Yuelong Yang
2192

Application of Intelligent Controller in SRM Drive
Baojian Zhang, Yanli Zhu, Jianping Xie, and Jianping Wang
2200

Simulation of Rolling Forming of Precision Profile Used for Piston Ring based on LS_DYNA
Jigang Wu, Xuejun Li, and Kuanfang He
2208

Bargmann and Neumann System of the Second-Order Matrix Eigenvalue Problem
Shujuan Yuan, Shuhong Wang, Wei Liu, Xiaohong Liu, and Li Li
2216

Privacy-preserving Judgment of the Intersection for Convex Polygons
Yifei Yao, Shurong Ning, Miaomiao Tian, and Wei Yang
2224

Medical Equipment Utility Analysis based on Queuing Theory
Xiaoqing Lu, Ruyu Tian, and Shuming Guan
2232

Research and Application of Electromagnetic Compatibility Technology
Hong Zhao, Guofeng Li, Ninghui Wang, Shunli Zheng, and Lijun Yu
2240

An Improved Mixed Gas Pipeline Multivariable Decoupling Control Method Based on ADRC Technology
Zhikun Chen, Yutian Wang, Ruicheng Zhang, and Xu Wu
2248

A Robust Scalable Spatial Spread-Spectrum Video Watermarking Scheme Based on a Fast Downsampling Method
Cheng Wang, Shaohui Liu, Feng Jiang, and Yan Liu
2256

Blocking Contourlet Transform: An Improvement of Contourlet Transform and Its Application to Image Retrieval
Jian Wu, Zhiming Cui, Pengpeng Zhao, and Jianming Chen
2262

Speech Recognition Approach Based on Speech Feature Clustering and HMM
Xinguang Li, Minfeng Yao, and Jianeng Yang
2269

Electronic Nose for the Vinegar Quality Evaluation by an Incremental RBF Network
Hong Men, Lei Wang, and Haiping Zhang
2276

Intelligent Recognition for Microbiologically Influenced Corrosion Based On Hilbert-Huang Transform and BP Neural Network
Hong Men, Jing Zhang, and Lihua Zhang
2283

Research on Diagnosis of AC Engine Wear Fault Based on Support Vector Machine and Information Fusion
Lei Zhang and Yanfei Dong
2292

Optimal Kernel Marginal Fisher Analysis for Face Recognition
Ziqiang Wang and Xia Sun
2298

Hybrid Cloud Computing Platform for Hazardous Chemicals Releases Monitoring and Forecasting
Xuelin Shi, Yongjie Sui, and Ying Zhao
2306

Quantum Competition Network Model Based On Quantum Entanglement
Yanhua Zhong and Changqing Yuan
2312
