
2012 IEEE Symposium on Electrical & Electronics Engineering (EEESYM)

K-Nearest Neighbor LS-SVM Method for Multi-step Prediction of Chaotic Time Series

Dept. of Navigation, Dalian Naval Academy, Dalian, China
Dept. of Basics, Dalian Naval Academy, Dalian, China

Abstract—To reduce the complexity of training a least squares support vector machine (LS-SVM), a nearest-neighbors method is proposed to perform multi-step time series prediction. By selecting, for each testing instance, the training data with the smallest Euclidean distance and a similar changing trend, a reduced training dataset is defined. Experiments on chaotic datasets were conducted to compare the prediction performance with traditional iterative single-step methods. The results demonstrate that the proposed method outperforms the single-step methods, and the multi-step prediction ability remains promising even when noise is added.

Keywords—chaotic time series; least squares support vector machine; Euclidean distance; multi-step prediction

I. INTRODUCTION

The task of forecasting a time series over a long horizon is commonly tackled by iterating one-step-ahead predictors. Despite the popularity this approach has gained in the prediction community, its design is still plagued by a number of important unresolved issues, the most important being the accumulation of prediction errors. Another class of methods, the direct strategies, trains one prediction model for each prediction horizon based on the historical data, which puts more effort into the training stage but has no error-accumulation problem [1].

II. LEAST SQUARES SUPPORT VECTOR MACHINE

The least squares support vector machine (LS-SVM) approach has been widely used for nonlinear classification and function estimation, and has been successfully applied to time series prediction. LS-SVM was proposed by Suykens and Vandewalle (1999). Its basic idea is to map the data into a high-dimensional feature space and to perform linear regression in that space.

Given a data set {(x_i, y_i)}_{i=1}^{N}, where N is the total number of training data pairs, x_i ∈ R^n is the input vector and y_i ∈ R is the output signal. According to SVM theory, the input space R^n is mapped into a feature space Z, with a nonlinear function φ(x_i) being the corresponding mapping function. In the feature space, the unknown nonlinear function is estimated in the form of Eq. (1), where w and b are the parameters to be identified:

    y(x) = w^T φ(x) + b,   w ∈ Z, b ∈ R                              (1)

The optimization problem is defined as follows:

    min_{w,b,e} J(w, e) = (1/2) w^T w + (γ/2) Σ_{k=1}^{N} e_k²,  γ > 0   (2)

subject to

    y_k = w^T φ(x_k) + b + e_k,   k = 1, ..., N                      (3)

where γ is an adjustable constant and e_k is the error between the actual output and the predicted output of the k-th data point.

The LS-SVM model of the data set can be written as

    y(x) = Σ_{i=1}^{N} a_i φ(x_i)^T φ(x) + b                          (4)

where a_i ∈ R (i = 1, 2, ..., N) are the Lagrange multipliers. By Mercer's theorem, the mapping function can be replaced by a kernel function K(·,·), where K(x, x_i) (i = 1, 2, ..., N) may be any kernel function satisfying the Mercer condition [2]:

    Ω_ij = φ(x_i)^T φ(x_j) = K(x_i, x_j),   Ω ∈ R^{N×N}               (5)

Analytical solutions for the parameters a_i ∈ R and b can then be obtained, giving

    y(x) = Σ_{i=1}^{N} a_i K(x, x_i) + b                              (6)

Note that in the case of the RBF kernel,

    K(x, x_i) = exp(−‖x − x_i‖² / σ²)                                 (7)

there are only two additional tuning parameters: the kernel width σ in Eq. (7) and the regularization parameter γ in Eq. (2).

III. THE MULTI-STEP AHEAD ALGORITHM

The basic steps involved in the multi-step algorithm for L-step time series prediction are as follows:

(1) The algorithm starts with the construction of the training data matrix X_{Nm} by sliding the window iteratively.
978-1-1673-2365-9/12/$31.00 ©2012 IEEE 407


Here p is the length of the time series in each window, m is the number of data points, and Y_{Nm} is the objective matrix (i denotes the prediction step):

    X_{Nm} = | x_1          x_2          ...  x_p     |
             | x_2          x_3          ...  x_{p+1} |
             |  ...                           ...     |
             | x_{m-p-i+1}  x_{m-p-i+2}  ...  x_{m-i} |

    Y_{Nm} = | x_{p+1}    x_{p+2}    ...  x_{p+i}   |
             | x_{p+2}    x_{p+3}    ...  x_{p+i+1} |
             |  ...                       ...       |
             | x_{m-i+1}  x_{m-i+2}  ...  x_m       |

φ is the nonlinear mapping between the training data and the objective data in the feature space.

(2) The LS-SVM parameters (γ, σ) are selected by particle swarm optimization. The fitness function is defined as

    f = √( (1/L) Σ_{i=1}^{L} e_i² )                                   (9)

where f is the root-mean-square error of the L-step predicted results, which varies with the LS-SVM parameters. When the termination criterion is met, the individual with the best fitness corresponds to the optimal parameters of the LS-SVM.

(3) The optimal parameters (γ, σ²) obtained from step (2) are named (γ_best, σ²_best). Train the LS-SVM based on (γ_best, σ²_best) to get the final Lagrange multipliers a* and the parameter b*; the optimal set (a*, b*) in Eq. (6) is thus obtained.

(4) Use the LS-SVM model to predict the data one step ahead, d̂_{m+1}, with the parameters (a*, b*), using the testing set X_T = {d_{m−p+1}, d_{m−p+2}, ..., d_m}. Repeat this procedure iteratively until the L-step prediction is finished and d̂_{m+L} is obtained.

For similar inputs, it is assumed that their mapping relationships with their outputs should be similar as well. Hence, the k-NN method is employed to reduce the training dataset by selecting the k instances in the training dataset that are closest to the testing input instance. To measure the similarity of instances, the Euclidean distance is usually used as the distance metric. However, for a time series segment, the trend of the changing values should also be considered. Here, the first-order difference is used to describe the trend of a time series.

Given a testing input vector starting at time point T with length p, we first calculate its Euclidean distance to each instance in the training dataset, denoted L_0(k), as in Eq. (10):

    L_0(k) = √( (x_T − x_k)² + ... + (x_{T+p−1} − x_{k+p−1})² )       (10)

The first-order difference of the testing input vector is (x_{T+1} − x_T, ..., x_{T+p−1} − x_{T+p−2}), whose size is 1 × (p − 1) as well. Calculate the Euclidean distance between the differential testing input vector and each differential training input vector, denoted L_1(k):

    L_1(k) = √( (d_T − d_k)² + ... + (d_{T+p−2} − d_{k+p−2})² )       (11)

A combination of the normalized L_0(k) and L_1(k) is used as the distance metric for the k-NN method. The distance Dis(k) is defined by Eq. (12):

    Dis(k) = (L_0(k) − MIN(L_0)) / (MAX(L_0) − MIN(L_0))
           + (L_1(k) − MIN(L_1)) / (MAX(L_1) − MIN(L_1))              (12)

where MAX(L_0), MIN(L_0), MAX(L_1), and MIN(L_1) are the maximum and minimum values of L_0(k) and L_1(k), respectively. The k instances corresponding to the smallest distance measures are selected to generate a reduced training dataset for the LS-SVM.

IV. SIMULATION

A. Lorenz's system prediction simulation

We use the typical Lorenz chaotic system as a sample to test the validity of the algorithm:

    ẋ = 10(y − x)
    ẏ = (28 − z)x − y                                                 (13)
    ż = xy − (8/3)z

Take the initial values as x(0) = 0.005, y(0) = 0.01, z(0) = 0.8, and the integration time step as h = 0.1. Integrate equations (13) with the fourth-order Runge–Kutta method to get a chaotic time series of 3000 points. Omit the first 1000 points; take points 1–700 as the learning sample and points 701–1000 as the testing sample.
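The simulation setup above can be reproduced with a classical fourth-order Runge–Kutta integrator. A sketch assuming numpy; the first Lorenz equation, ẋ = 10(y − x), is the standard one for these parameter values, and the function names are ours.

```python
import numpy as np

def lorenz(state):
    """Lorenz system with the standard parameters (10, 28, 8/3)."""
    x, y, z = state
    return np.array([10.0 * (y - x),
                     (28.0 - z) * x - y,
                     x * y - (8.0 / 3.0) * z])

def rk4_series(n_points, h=0.1, state=(0.005, 0.01, 0.8)):
    """Integrate with fourth-order Runge-Kutta; return the x-component."""
    s = np.array(state, dtype=float)
    xs = []
    for _ in range(n_points):
        k1 = lorenz(s)
        k2 = lorenz(s + 0.5 * h * k1)
        k3 = lorenz(s + 0.5 * h * k2)
        k4 = lorenz(s + h * k3)
        s = s + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        xs.append(s[0])
    return np.array(xs)

series = rk4_series(3000)[1000:]     # omit the first 1000 transient points
train, test = series[:700], series[700:1000]
```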

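Putting Sections II and III together, the LS-SVM dual problem of Eqs. (2)–(6) reduces to a single linear system, and Eq. (12)'s combined distance drives the k-NN training-set reduction. A minimal end-to-end sketch assuming numpy; all function names are illustrative, not from the paper.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """Eq. (7): K(x, x_i) = exp(-||x - x_i||^2 / sigma^2)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / sigma2)

def lssvm_fit(X, y, gamma, sigma2):
    """Solve the LS-SVM KKT linear system for (b, alpha) of Eq. (6)."""
    n = len(X)
    K = rbf_kernel(X, X, sigma2)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma   # ridge term from the gamma penalty
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]              # b, alpha

def lssvm_predict(Xq, Xtrain, b, alpha, sigma2):
    """Eq. (6): y(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(Xq, Xtrain, sigma2) @ alpha + b

def knn_reduce(X, x_test, k):
    """Eq. (12): combined normalized distance on raw windows (L0)
    and on their first-order differences (L1); return the k nearest."""
    L0 = np.sqrt(((X - x_test) ** 2).sum(axis=1))
    L1 = np.sqrt(((np.diff(X, axis=1) - np.diff(x_test)) ** 2).sum(axis=1))
    norm = lambda d: (d - d.min()) / (d.max() - d.min() + 1e-12)
    return np.argsort(norm(L0) + norm(L1))[:k]
```

For step (4), one would fit on the reduced set, predict one step ahead, append the prediction to the input window, and repeat L times.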

B. Error measures

Let X be the m true values of a testing time series dataset, and X̂ the m predicted values obtained L steps ahead. Two error measures are used to evaluate the prediction performance. One is the root mean squared error (RMSE):

    RMSE = √( (1/L) Σ_{i=1}^{L} (x_{m+i} − x̂_{m+i})² )

The other is the symmetric mean absolute percentage error (SMAPE), which is based on relative errors:

    SMAPE = (1/L) Σ_{i=1}^{L} |x_{m+i} − x̂_{m+i}| / ((x_{m+i} + x̂_{m+i}) / 2)
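Both measures are straightforward to compute; a sketch assuming numpy, with the SMAPE denominator following the definition used in this paper:

```python
import numpy as np

def rmse(true, pred):
    """Root mean squared error over the L predicted points."""
    true, pred = np.asarray(true, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((true - pred) ** 2)))

def smape(true, pred):
    """Symmetric MAPE with the (x + x_hat)/2 denominator."""
    true, pred = np.asarray(true, float), np.asarray(pred, float)
    return float(np.mean(np.abs(true - pred) / ((true + pred) / 2.0)))
```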
    Prediction   RMSE                        SMAPE
    horizon L    iterative     k-NN based    iterative     k-NN based
                 single-step   LS-SVM        single-step   LS-SVM
    10           0.3125        0.2033        0.0164        0.0151
    40           0.6957        0.5486        0.0346        0.0193
    70           1.2872        1.0509        0.0758        0.0447
    100          2.1120        1.7198        0.1832        0.0676
    130          2.9219        1.9447        0.4738        0.1430
    160          5.7642        2.4520        0.9193        0.2446

Figure 2. 160-step prediction result of the Lorenz series by the k-NN based LS-SVM approach.

C. Conclusion
Due to the need for and challenge of long-term time series prediction, we presented a k-NN based LS-SVM framework for multi-step prediction. An advantage of the proposed prediction scheme is its simplicity of calculation.


REFERENCES

[1] M. Maralloo, A. Koushki, C. Lucas, and A. Kalhor, "Long term electrical load forecasting via a neurofuzzy model," in Proceedings of the 14th International CSI Computer Conference, pp. 35-40, October.
[2] K. Meng, Z. Dong, and K. Wong, "Self-adaptive radial basis function neural network for short-term electricity price forecasting," IET Generation, Transmission & Distribution, 3(4), pp. 325-335, April 2009.

Figure 1. Performance of the k-NN based LS-SVM on 140-step Lorenz time series prediction (using 700 points for training).