
Neurocomputing

journal homepage: www.elsevier.com/locate/neucom

mode decomposition and auto regression

Guo-Feng Fan a, Li-Ling Peng a, Wei-Chiang Hong b,*, Fan Sun a

a College of Mathematics & Information Science, Ping Ding Shan University, Ping Ding Shan 467000, Henan, China

b Department of Information Management, Oriental Institute of Technology, 58 Sec. 2, Sichuan Rd., Panchiao, New Taipei 220, Taiwan

ARTICLE INFO

Article history:
Received 26 December 2014
Received in revised form 5 June 2015
Accepted 20 August 2015
Bijaya Ketan Panigrahi
Available online 1 September 2015

Keywords:
Electric load forecasting
Support vector regression
Differential empirical mode decomposition
Auto regression

ABSTRACT

Electric load forecasting is an important issue for power utilities, associated with the management of daily operations such as energy transfer scheduling, unit commitment, and load dispatch. Inspired by the strong non-linear learning capability of support vector regression (SVR), this paper presents an SVR model hybridized with the differential empirical mode decomposition (DEMD) method and auto regression (AR) for electric load forecasting. The differential EMD method is used to decompose the electric load into several detail parts associated with high frequencies (intrinsic mode functions (IMFs)) and an approximate part associated with low frequencies. Electric load data from the New South Wales (NSW, Australia) market and the New York Independent System Operator (NYISO, USA) are employed to compare the forecasting performances of different alternative models. The results support the idea that the proposed model can simultaneously provide forecasts with good accuracy and interpretability.

© 2015 Elsevier B.V. All rights reserved.

1. Introduction

Electrical energy can hardly be stored; therefore, electric load forecasting plays a vital role in the daily operational management of power utilities, such as energy transfer scheduling, unit commitment, load dispatch, and so on. With the emergence of load management strategies, it is highly desirable to develop accurate, fast, simple, robust and interpretable load forecasting models for these electric utilities to achieve higher reliability and management efficiency [1].

In the past decades, researchers have proposed many methodologies to improve load forecasting accuracy. For example, Ardakani et al. [2] proposed linear regression models for electricity consumption forecasting; Arisoy et al. [3] applied a Grey prediction model for electricity demand in Turkey; Afshar and Bigdeli [4] proposed an improved singular spectral analysis method for short-term load forecasting (STLF) for the Iranian electricity market; and Kumar and Jain [5] applied three time series models (the Grey–Markov model, the Grey model with rolling mechanism, and singular spectrum analysis) to forecast the consumption of conventional energy in India. By employing artificial neural networks, Refs. [6–9] proposed several useful short-term load forecasting models. By hybridizing the popular method and evolutionary algorithm, the authors of [10–13] reported further improvements that could be made for energy forecasting. Though these methods can yield some significant improvements in terms of forecasting accuracy, they usually lack interpretability. Recently, expert systems, mainly developed by means of linguistic fuzzy rule-based systems, have made it possible to handle system modeling with good interpretability [14]. However, these models depend strongly on the expert and often cannot reach satisfactory forecasting accuracy. Therefore, hybrid models, which are based on existing methods such as expert systems and other techniques, have been proposed to achieve both high accuracy and interpretability.

Owing to its statistical learning capacity to handle high dimensional data, the support vector regression (SVR) model, which is especially suitable for small sample size learning, has become a popular algorithm for many forecasting problems [15–17]. However, the worst shortcoming of an SVR method is that it is easily trapped in a local optimum during the nonlinear optimization process, particularly while its three parameters are being determined. In addition, its robustness also requires some improvement. These issues are still open in SVR forecasting research [18]. On the other hand, in terms of finding the fluctuation tendency of a time series, the wavelet transform can achieve good time resolution in the high frequency region of a time series (signal). However, a shortcoming of the wavelet transform is that the computation is somewhat time consuming and, in particular, it cannot achieve fine resolutions in both the time domain and the frequency domain simultaneously, and it also struggles with the analysis of large data sets [19,20].

* Corresponding author.
E-mail address: samuelsonhong@gmail.com (W.-C. Hong).
http://dx.doi.org/10.1016/j.neucom.2015.08.051
0925-2312/© 2015 Elsevier B.V. All rights reserved.

G.-F. Fan et al. / Neurocomputing 173 (2016) 958–970 959

Empirical mode decomposition (EMD) with auto regression (AR), a fast, easy, and reliable unsupervised clustering algorithm, has been successfully applied to many fields, such as communication, economy, engineering, and so on [21–23], with good results. Meanwhile, the EMD method can effectively extract the components of the basic mode from non-linear or non-stationary time series [21,23–26]. By employing EMD, the original complex (multi-scale) time series can be locally separated into a sum of a low frequency part (residual) and a high frequency part (IMF), i.e., the time series can be transformed into a series with more apparent components by reducing noise [26]. However, the sifting process in the EMD modeling phase stops when the residual becomes either over-distorted or a monotonic function from which no further IMF can be extracted [27,28]. Therefore, Bhusana and Chris [23] proposed the differential empirical mode decomposition (DEMD) to overcome the fluctuation problem, which the original EMD method handles poorly. In their model, a derivative signal is obtained by differentiating the original signal several times, which eliminates the fluctuating gradient, so that the signal better meets the requirements of EMD. The new signal is then decomposed by EMD and integrated to obtain the intrinsic mode function (IMF) of each order and the residual of the original signal. The DEMD method is used to decompose the electric load into several detail parts associated with high frequencies (IMFs) and an approximate part associated with low frequencies. It can effectively reduce the interactions among many singular values and improve the forecasting performance of a single kernel function. Thus, it is useful to employ suitable kernel functions for forecasting the medium- and long-term tendencies of the time series.

In this paper, we present a new hybrid model with clear human-understandable knowledge of the training data to achieve satisfactory forecasting accuracy. The principal idea is to hybridize DEMD with SVR and AR, namely the DEMD–SVR–AR model, to obtain better forecasting performance. The rationale of our forecasting model is as follows: (1) the raw data can be divided into two parts by DEMD, one being the high frequency item and the other the residuals; (2) the high frequency item contains less redundant information and trend information than the raw data, because this information has gone to the residuals, so the SVR model is employed to forecast the high frequency item, and its accuracy is higher than that of the original SVR model, particularly around peak and valley values; (3) the residuals are monotonic and stationary, so the AR model is appropriate for forecasting the residuals; (4) the final forecasting results are obtained from the high frequency item and the residuals. The proposed DEMD–SVR–AR model is capable of smoothing and reducing the noise (inherited from DEMD), of filtering the data set and improving forecasting performance (inherited from SVR), and of effectively forecasting future tendencies (inherited from AR). The forecasting outputs of the hybrid method are described in the following section.

To show the applicability, generality and superiority of the proposed model, firstly, half-hourly electric load data (48 data points per day) from New South Wales (NSW, Australia) with two different sample sizes are employed to compare the forecasting performance of the proposed model with four alternative models from the literature, namely the PSO–BP model (BP neural network trained by a particle swarm optimization algorithm), the SVR model, the PSO–SVR model (SVR parameters determined by the PSO algorithm), and the AFCM model (an adaptive fuzzy combination model based on a self-organizing map and support vector regression). Secondly, hourly electric load data (24 data points per day) from the New York Independent System Operator (NYISO, USA), also with two different sample sizes, are used to further compare the forecasting performance of the proposed model with three other alternative models from the literature, namely the ARIMA model, the BPNN model (artificial neural network trained by the back-propagation algorithm), and the GA–ANN model (artificial neural network trained by a genetic algorithm). The experimental results indicate that the proposed DEMD–SVR–AR model has the following advantages: (1) it simultaneously achieves higher accuracy and interpretability; (2) it can tolerate more redundant information than the original SVR model and thus has better generalization ability.

The rest of this paper is organized as follows: in Section 2, the DEMD–SVR–AR forecasting model is introduced and the main steps of the model are given. In Section 3, the data description and the research design are outlined. The numerical results and comparisons are presented and discussed in Section 4. A brief conclusion of this paper and future research are provided in Section 5.

2. Support vector regression with differential empirical mode decomposition

2.1. Differential empirical mode decomposition (DEMD)

The EMD method is based on the simple assumption that any signal consists of different simple intrinsic modes of oscillation. Each linear or non-linear mode has the same number of extrema and zero-crossings. There is only one extremum between successive zero-crossings. Each mode should be independent of the others. Since the original work on EMD, several studies have been presented to improve EMD. One improvement is the differential EMD [23], which is described in this section. In this way, each signal can be decomposed into a number of intrinsic mode functions (IMFs), each of which should satisfy the following two definitions [25]:

a. In the whole data set, the number of extrema and the number of zero-crossings should either be equal or differ from each other by at most one.
b. At any point, the mean value of the envelope defined by the local maxima and the envelope defined by the local minima is zero.

An IMF represents a simple oscillatory mode, compared with the simple harmonic function. With this definition, any signal $x(t)$ can be decomposed by the following steps; the flowchart is shown in Fig. 1.

1. Identify all local extrema, and then connect all the local maxima by a cubic spline line as the upper envelope.

2. Repeat the procedure for the local minima to produce the lower envelope. The upper and lower envelopes should cover all the data between them.

3. The mean of the upper and lower envelopes is designated as $m_1$, and the difference between the signal $x(t)$ and $m_1$ is the first component, $h_1$, as shown in Eq. (1),

$$h_1 = x(t) - m_1 \tag{1}$$

Generally speaking, $h_1$ will not necessarily meet the requirements of an IMF, because $h_1$ is not a standard IMF. The sifting needs to be repeated $k$ times until the mean envelope tends to zero. Then, the first intrinsic mode function $c_1$ is obtained, which stands for the highest-frequency component of the original data sequence. At this point, the data can be represented as Eq. (2),

$$h_{1k} = h_{1(k-1)} - m_{1k} \tag{2}$$

where $h_{1k}$ is the datum after $k$ siftings and $h_{1(k-1)}$ is the datum after $k-1$ siftings. The standard deviation (SD) is used to determine whether the result of each sifting meets the IMF requirements or not. SD is defined as Eq. (3),


$$SD = \sum_{t=1}^{T} \frac{\left| h_{1(k-1)}(t) - h_{1k}(t) \right|^{2}}{h_{1(k-1)}^{2}(t)} \tag{3}$$

where $T$ is the length of the data.

The value of SD is limited to the range of 0.2 to 0.3, which means that when $0.2 < SD < 0.3$, the decomposition process can be finished. The reason for this criterion is that it should not only ensure that $h_k(t)$ meets the IMF requirements, but also control the number of decomposition iterations. In this way, the IMF components retain the amplitude modulation information of the original signal.

4. When $h_{1k}$ has met the SD requirement, with $c_1 = h_{1k}$, the first IMF component $c_1$ of the signal $x(t)$ is obtained directly, and a new series $r_1$ is obtained after removing the high frequency component. This relationship can be expressed as Eq. (4),

$$r_1 = x_1(t) - c_1 \tag{4}$$

The new sequence is treated as the original data and steps 1 to 3 are repeated to obtain the second intrinsic mode function $c_2$.

5. Repeat steps 1 to 4 until $r_n$ cannot be decomposed into a further IMF. The sequence $r_n$ is called the remainder of the original data $x(t)$; $r_n$ is a monotonic sequence that indicates the overall trend or mean of the raw data $x_1(t)$, and it is usually referred to as the trend item. It has a clear physical significance. The process is expressed as Eqs. (5) and (6):

$$r_1 = x_1(t) - c_1, \quad r_2 = r_1 - c_2, \quad \ldots, \quad r_n = r_{n-1} - c_n \tag{5}$$

$$x_1(t) = \sum_{i=1}^{n} c_i + r_n \tag{6}$$

The sifting process stops when the residual $r_n(t)$ becomes either over-distorted or a monotonic function from which no further IMF can be extracted. The power density of white Gaussian noise has a normal distribution, so eliminating the IMF that represents the normal distribution is assumed to cancel the white Gaussian noise. Next, the last IMF, i.e., the lagged IMF before the monotonic function emerges, is the most suitable because its local curves have a normal distribution. Subsequently, we subtract the last IMF, denoted as $c_0(t)$ in Eq. (7), from the original signal. Finally, the differential EMD is given by Eq. (7),

$$DEMD = x_n(t) - c_0(t) \tag{7}$$

where $x_n(t)$ refers to the dependent variables.

The original data can thus be expressed as the IMF components and the remainder.

Fig. 1. Differential EMD algorithm flowchart. (Box labels: start; input signal x(t); r = x(t), n = 1; determination of local maxima and minima of x(t); fitting the envelopes E1 and E2; m = (E1 + E2)/2; h = x(t) − m; x(t) = h; n = n + 1, c(n) = h, r = r − c(n); x(t) = r; if r is a monotonic function; end.)

2.2. Support vector regression

The basic concepts of support vector regression are introduced briefly. Consider a data set of $N$ elements $\{(X_i, y_i),\ i = 1, 2, \ldots, N\}$, where $X_i$ is the $i$-th element in $n$-dimensional space, i.e., $X_i = [x_{1i}, \ldots, x_{ni}] \in R^{n}$, and $y_i \in R$ is the actual value corresponding to $X_i$. A non-linear mapping function, $\varphi(\cdot): R^{n} \to R^{n_h}$, is defined to map the training (input) data $X_i$ into the so-called high dimensional feature space $R^{n_h}$ (which may have infinite dimensions). Then, in the high dimensional feature space, there theoretically exists a linear function, $f$, that formulates the non-linear relationship between input data and output data. Such a linear function, namely the SVR function, is shown as Eq. (8),

$$f(X) = W^{T} \varphi(X) + b \tag{8}$$

where $f(X)$ denotes the forecasting values; the coefficients $W$ ($W \in R^{n_h}$) and $b$ ($b \in R$) are adjustable. As mentioned above, the SVM method aims at minimizing the empirical risk, shown as Eq. (9),

$$R_{emp}(f) = \frac{1}{N} \sum_{i=1}^{N} \Theta_{\varepsilon}\left(y_i, W^{T} \varphi(X_i) + b\right) \tag{9}$$

where $\Theta_{\varepsilon}(y, f(X))$ is the $\varepsilon$-insensitive loss function, defined as Eq. (10),

$$\Theta_{\varepsilon}(y, f(X)) = \begin{cases} \left| f(X) - y \right| - \varepsilon, & \text{if } \left| f(X) - y \right| \ge \varepsilon \\ 0, & \text{otherwise} \end{cases} \tag{10}$$

In addition, $\Theta_{\varepsilon}(y, f(X))$ is employed to find an optimum hyperplane in the high dimensional feature space (Fig. 1b) that maximizes the distance separating the training data into two subsets. Thus, SVR focuses on finding the optimum hyperplane and minimizing the training error between the training data and the $\varepsilon$-insensitive loss function. The SVR then minimizes the overall errors, shown as Eq. (11),

$$\min_{W, b, \xi^{*}, \xi} R_{\varepsilon}(W, \xi^{*}, \xi) = \frac{1}{2} W^{T} W + C \sum_{i=1}^{N} \left( \xi_i^{*} + \xi_i \right) \tag{11}$$

with the constraints:

$$\begin{aligned} y_i - W^{T} \varphi(X_i) - b &\le \varepsilon + \xi_i^{*}, & i = 1, 2, \ldots, N \\ -y_i + W^{T} \varphi(X_i) + b &\le \varepsilon + \xi_i, & i = 1, 2, \ldots, N \\ \xi_i^{*} &\ge 0, & i = 1, 2, \ldots, N \\ \xi_i &\ge 0, & i = 1, 2, \ldots, N \end{aligned} \tag{12}$$

The first term of Eq. (11), employing the concept of maximizing the distance of two separated training data, is used to regularize


weight sizes, to penalize large weights, and to maintain the flatness of the regression function. The second term penalizes training errors between $f(X)$ and $y$ by using the $\varepsilon$-insensitive loss function. $C$ is the parameter that trades off these two terms. Training errors above $\varepsilon$ are denoted as $\xi_i^{*}$, whereas training errors below $-\varepsilon$ are denoted as $\xi_i$.

After the quadratic optimization problem with inequality constraints is solved, the parameter vector $W$ in Eq. (8) is obtained as Eq. (13),

$$W = \sum_{i=1}^{N} \left( \beta_i^{*} - \beta_i \right) \varphi(X_i) \tag{13}$$

where $\beta_i^{*}$ and $\beta_i$ are the Lagrangian multipliers, obtained by solving a quadratic program. Finally, the SVR regression function is obtained as Eq. (14) in the dual space:

$$f(X) = \sum_{i=1}^{N} \left( \beta_i^{*} - \beta_i \right) K(X_i, X) + b \tag{14}$$

where $K(X_i, X_j)$ is the kernel function; the value of the kernel equals the inner product of two vectors, $X_i$ and $X_j$, in the feature space, $\varphi(X_i)$ and $\varphi(X_j)$, respectively; that is, $K(X_i, X_j) = \varphi(X_i) \cdot \varphi(X_j)$. Any function that meets Mercer's condition [29] can be used as the kernel function.

There are several types of kernel function. The most used kernel functions are the Gaussian radial basis function (RBF) with a width of $\sigma$, $K(X_i, X_j) = \exp\left(-0.5 \lVert X_i - X_j \rVert^{2} / \sigma^{2}\right)$, and the polynomial kernel with an order of $d$ and constants $a_1$ and $a_2$, $K(X_i, X_j) = (a_1 X_i X_j + a_2)^{d}$. The Gaussian RBF kernel is not only easy to implement, but also capable of non-linearly mapping the training data into an infinite dimensional space; thus, it is suitable for non-linear relationship problems. Therefore, the Gaussian RBF kernel function is used in this study.

2.3. AR model

Eq. (15) expresses a $p$-th order autoregressive model, referred to as the AR($p$) model [30]. A stationary time series $\{X_t\}$ that satisfies the AR($p$) model is called an AR($p$) sequence. The vector $a = (a_1, a_2, \ldots, a_p)^{T}$ contains the regression coefficients of the AR($p$) model:

$$X_t = \sum_{j=1}^{p} a_j X_{t-j} + \varepsilon_t, \quad t \in Z \tag{15}$$

2.4. The full procedure of DEMD–SVR–AR model

The full procedure of the proposed DEMD–SVR–AR model is summarized as follows and is illustrated in Fig. 2.

Step 1. Decompose the input data by DEMD: each electric load series (input data) is decomposed into a number of intrinsic mode functions (IMFs), i.e., into two parts, one being the high frequency item and the other the residuals. Please refer to Section 2.1 and Fig. 1 for the details of DEMD.

Step 2. SVR modeling: the SVR model is employed to forecast the high frequency item; to look for the most suitable parameters, different sizes of fed-in/fed-out subsets are set in this stage. Please refer to Section 2.2 for the details of SVR.

Step 3. AR modeling: the residuals item is forecasted by the AR model because it is monotonic and stationary. Please refer to Section 2.3 for the details of AR modeling. Similarly, when the new parameters give a smaller MAPE value or the maximum iteration is reached, the three new parameters and their corresponding objective value are the solution in this stage.

Step 4. DEMD–SVR–AR forecasting: after the forecasting values of the high frequency item and the residuals item are received from the SVR model and the AR model, respectively, the final forecasting results are obtained from the high frequency item and the residuals.

Fig. 2. The full flowchart of DEMD–SVR–AR model. (Box labels: input (data); DEMD; IMF1, IMF2, IMF3, ..., IMFk; residuals; SVR; AR; prediction.)

3. Numerical examples

To show the applicability, superiority and generality of the proposed model, we employ two different electricity markets, the New South Wales (NSW) market in Australia (Case 1) and the New York Independent System Operator (NYISO) in the USA (Case 2). In addition, for each case, we use two sample sizes, a small sample and a large sample.

3.1. The experimental results of Case 1

For Case 1, firstly, the proposed model is trained on the electric load from 2 to 7 May 2007 (the training data set), and the testing electric load data is that of 8 May 2007. The electric load data is on a half-hourly basis (i.e., 48 data points per day). The data set contains only 7 days; to distinguish it from the other example with more sample data, this example is called the small sample size data, and it is illustrated in Fig. 3(a).

Secondly, overtraining on too large a training set should be avoided during the learning process of the SVR model. Therefore, the second experiment, with 23 days (1104 data points from 2 to 24 May 2007), is modeled by using only part of the samples as the training set, i.e., from 2 to 17 May 2007, and the testing electric load data is from 18 to 24 May 2007. This example is called the large sample size data, and it is illustrated in Fig. 3(b).

3.1.1. Results after DEMD in Case 1

After being decomposed by DEMD, the data can be divided into eight groups, which are shown in Fig. 4(a)–(h); the last group (Fig. 4(h)) is a trend term (residuals). The so-called high frequency item is obtained by adding the preceding seven groups. From Fig. 3(a) and (b), the trend of the high frequency item is the same as that of the original data, and its structure is more regular, i.e., it is more stable. The high frequency item (data-I) and the residuals (data-II) are then well fitted by the SVR and AR models, respectively, as described below.

3.1.2. Forecasting using SVR for data-I (the high frequency item in Case 1)

As shown in Fig. 3, the high frequency data and the raw data share the same characteristics, such as nonlinearity and chaos. The SVR model is well suited to such forecasting problems.

Firstly, for both the small sample and the large sample data, the high-frequency item is employed for SVR modeling, and the better performances on the training and testing (forecasting) sets are shown in Fig. 5(a) and (b), respectively. The correlation coefficients of the training effects are 0.9935 and 0.9927, respectively, and those of the


Fig. 3. (a) Half-hourly electric load in NSW from 2 to 8 May 2007; (b) half-hourly electric load in NSW from 2 to 24 May 2007. (Axes: electric load (MW) versus time (half hour); curves: original data and DEMD data-I.)

Fig. 4. For ease of presentation, the graphs (a)–(h) show the plots at different IMFs for the small sample size in Case 1. (Vertical axes: electric load (MW).)

forecast effects are 0.9976 and 0.9984, accordingly. This implies that the decomposition helps to improve the forecasting accuracy. The parameters of the SVR model for data-I are shown in Table 1, in which the forecasting error for the high-frequency item decomposed by the modified DEMD and forecasted by SVR has been reduced.

3.1.3. Forecasting using AR for data-II (the residuals in Case 1)

As shown in Fig. 4(h), the residuals are locally linear and stable, so the AR technique is very suitable for forecasting them.

Then, according to the geometric decay of the correlation coefficient and the partial correlation coefficients' fourth-order truncation


Fig. 5. Comparison of the data-I and the forecasted electric load of training and testing by the SVR model for the small sample and large sample data in Case 1: (a) one-day ahead prediction of 8 May 2007 performed by the model; (b) one-week ahead prediction from 18 to 24 May 2007 performed by the model.

for data-II (the residuals), it can be denoted as an AR(4) model. The parameters of the AR model for data-II are also shown in Table 2.

As shown in Fig. 6(a) and (b), the residuals, for both the small sample and the large sample data, lie almost on a straight line. In addition, it is not difficult to find the straight line in Fig. 4(h), which also reflects the superiority of DEMD. The good forecasting results are shown in Table 2, and the errors have reached the level of $10^{-5}$ for both the small and the large amount of data. This demonstrates the superiority of the AR model. In Table 2, the forecasting error of the residuals obtained by the improved decomposition DEMD has been significantly reduced.

Table 1
The SVR's parameters for data-I in Case 1.

Sample size              m     σ      C     ε        Testing MAPE
The small sample size    20    0.1    100   0.0047   9.72
The large sample size    20    0.24   128   0.0021   4.9

3.2. The experimental results of Case 2

For Case 2, firstly, the proposed model is trained on the electric load from 1 January 2015 to 12 January 2015 (the training data set), and the testing electric load data is from 13 to 14 January 2015. The electric load data is on an hourly basis (i.e., 24 data points per day). The data set contains only 14 days; to distinguish it from the other example with more sample data, this example is called the small sample size data, and it is illustrated in Fig. 7(a).

Secondly, the second experiment, with 46 days (1104 data points from 1 January to 15 February 2015), is modeled by using only part of the samples as the training set, i.e., from 1 January to 1 February 2015, and the testing electric load data is from 2 to 15 February 2015. This example is called the large sample size data, and it is illustrated in Fig. 7(b).

3.2.1. Results after DEMD in Case 2

After being decomposed by DEMD, similarly, the data can also be divided into eight groups, which are shown in Fig. 8(a)–(h); the last group (Fig. 8(h)) is a trend term (residuals). The high frequency item is also obtained by adding the preceding seven groups. From Fig. 7(a) and (b), the trend of the high frequency item is the same as that of the original data, and its structure is more regular, i.e., it is more stable. The high frequency item (data-I) and the residuals (data-II) are then well fitted by the SVR and AR models, respectively, as described below.

3.2.2. Forecasting using SVR for data-I (the high frequency item in Case 2)

As shown in Fig. 7, the high frequency data and the raw data share the same characteristics, such as nonlinearity and chaos. The SVR model is well suited to such forecasting problems.

Firstly, for both the small sample and the large sample data, the high-frequency item is employed for SVR modeling, and the better performances on the training and testing (forecasting) sets are shown in Fig. 9(a) and (b), respectively. The correlation coefficients of the training effects are 0.9901 and 0.9915, respectively, and those of the forecast effects are 0.9936 and 0.9957, accordingly. This implies that the decomposition helps to improve the forecasting accuracy. The parameters of the SVR model for data-I are shown in Table 3, in which the forecasting error for the high-frequency item decomposed by the modified DEMD and forecasted by SVR has been reduced.

3.2.3. Forecasting using AR for data-II (the residuals in Case 2)

As shown in Fig. 8(h), the residuals are locally linear and stable, so the AR technique is very suitable for forecasting them.

Then, according to the geometric decay of the correlation coefficient and the partial correlation coefficients' fourth-order truncation for data-II (the residuals), it can be denoted as an AR(4) model. The parameters of the AR model for data-II are also shown in Table 4.

As shown in Fig. 10(a) and (b), the residuals, for both the small sample and the large sample data, lie almost on a straight line. In addition, it is not difficult to find the straight line in Fig. 8(h), which also reflects the superiority of DEMD. The good forecasting results are shown in Table 4, and the errors have reached the level of $10^{-5}$ for both the small and the large amount of data. This demonstrates the superiority of the AR model. In Table 4, the forecasting error of the residuals obtained by the improved decomposition DEMD has been significantly reduced.

This section focuses on the efficiency of the proposed model with respect to computational accuracy and interpretability. To account for the small sample size modeling ability of the SVR model and to conduct fair comparisons, we perform the two real experimental cases mentioned in Section 3, both of which use a relatively small sample size for the first experiment. The second experiment, with 1104 data points, is focused on illustrating the relationship between sample size and accuracy.


Table 2
Summary of results of the AR forecasting model for data-II in Case 1.

Sample size              Error                        AR(4) model
The small sample size    $9.7725 \times 10^{-5}$      $x_n = 5523.894 + 1.01 x_{n-1} + 0.372176 x_{n-2} + 0.002791 x_{n-3} - 0.791445 x_{n-4}$
The large sample size    $7.5921 \times 10^{-5}$      $x_n = 5538.269 + 1.0022 x_{n-1} + 0.369828 x_{n-2} + 0.001914 x_{n-3} - 0.753692 x_{n-4}$
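The AR(4) equations in Table 2 have the form of a constant plus a weighted sum of the four previous values. The paper does not state how these coefficients were estimated, so the following is only a minimal sketch of one common choice, ordinary least squares on lagged values. The function names `solve_linear`, `fit_ar` and `forecast_ar` and the toy series are assumptions made for illustration; they are not taken from the paper, and the toy data are not electric load data.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ar(series, p):
    """Least-squares fit of x_t = c + a_1 x_{t-1} + ... + a_p x_{t-p}.

    Returns [c, a_1, ..., a_p]. Assumed estimation method, not the paper's.
    """
    rows = [[1.0] + [series[t - j] for j in range(1, p + 1)]
            for t in range(p, len(series))]
    targets = [series[t] for t in range(p, len(series))]
    k = p + 1
    # Normal equations: (A^T A) beta = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear(ata, aty)


def forecast_ar(series, beta, steps):
    """Iterate the fitted recursion forward, feeding forecasts back in."""
    p = len(beta) - 1
    history = list(series)
    out = []
    for _ in range(steps):
        nxt = beta[0] + sum(beta[j] * history[-j] for j in range(1, p + 1))
        history.append(nxt)
        out.append(nxt)
    return out


# Toy demonstration on a noise-free AR(2) series (hypothetical data).
toy = [0.0, 1.0]
for _ in range(18):
    toy.append(1.0 + 0.5 * toy[-1] - 0.3 * toy[-2])
beta = fit_ar(toy, 2)
next_value = forecast_ar(toy, beta, 1)[0]
```

For the residual series of the paper one would call `fit_ar(residuals, 4)`. Solving the normal equations directly is kept here for brevity; for load values in the thousands of MW a QR-based or library least-squares solver would be numerically safer.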

Fig. 6. Comparison of the data-II and the forecasted electric load by the AR model for the two experiments in Case 1: (a) one-day ahead prediction of 8 May 2007 performed by the model; (b) one-week ahead prediction from 18 to 24 May 2007 performed by the model. (Axes: electric load trend (MW) versus time (half hour); curves: actual values and predicted values.)

Fig. 7. (a) Hourly electric load in NYISO from 1 to 14 January 2015; (b) hourly electric load in NYISO from 1 to 15 February 2015. [Axes: electric load (MW) versus time (hour); series: original, data-I.]

4.1. Forecasting evaluation methods

To evaluate the forecasting capability, we examine the forecasting accuracy with three statistical metrics: the root mean square error (RMSE), the mean absolute error (MAE) and the mean absolute percentage error (MAPE). The definitions of RMSE, MAE and MAPE are given in Eqs. (16)–(18):

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (P_i - A_i)^2}{n}} \tag{16}$$

$$\mathrm{MAE} = \frac{\sum_{i=1}^{n} \left| P_i - A_i \right|}{n} \tag{17}$$

$$\mathrm{MAPE} = \frac{\sum_{i=1}^{n} \left| \frac{P_i - A_i}{A_i} \right|}{n} \times 100 \tag{18}$$

where $P_i$ and $A_i$ are the i-th predicted and actual values, respectively, and $n$ is the total number of predictions.

4.2. Parameter settings of the employed forecasting models

As mentioned by Taylor [31], and to keep the same comparison conditions as Che et al. [32], some parameters of the employed forecasting models in Case 1 are set as follows. For the PSO–BP model, as in [32], 90% of all collected samples are used as the training set and the rest as the evaluation set. The parameters used in PSO–BP are: (i) for the BP neural network, the input layer dimension (indim) is 2, the hidden layer dimension (hiddennum) is 3, and the output layer dimension (outdim) is 1; (ii) for the PSO, as in [32], the maximum iteration number (itmax) is 300, the number of particles N is 40, the length of particle D is 3, and the weights c1 and c2 are both set to 2. Because the PSO–SVR model embeds the construction and prediction algorithm of SVR in the fitness value iteration step of PSO, training PSO–SVR on the full training dataset takes a long time. For this reason, we draw a small part of all training samples as the training set and use the rest as the evaluation set. The PSO parameters used in this case are as follows: for the small sample size, the maximum iteration number (itmax) is 50, the number of particles N is 20, the length of particle D is 3, and the weights c1 and c2 are set to 2; for the large sample size, the maximum iteration number (itmax) is 20, the number of particles N is 5, the length of particle D is 3, and the weights c1 and c2 are set to 2.

Regarding Case 2, to further verify the applicability, generality and superiority of the proposed model, the newest electric load data from NYISO are employed for modeling, and three alternative forecasting models from the literature (the ARIMA model, the BPNN model, and the GA–ANN model) are selected for comparison with the proposed model. Some parameters of these models are set as follows. For the BPNN model, the node numbers differ between the small and large sample sizes: for the former, the input layer dimension is 240, the hidden layer dimension is 12, and the output layer dimension is 48; for the latter, they are 480, 12, and 336, respectively. The parameters of the GA–ANN model used in this case are: the number of generations is 5, the population size is 100, the number of bits is 50, the mutation rate is 0.8, and the crossover rate is 0.05.

Fig. 8. For ease of presentation, the graphs (a)–(h) show the plots at different IMFs for the small sample size in Case 2. [Vertical axis of each panel: electric load (MW).]

4.3. Empirical results and analysis

For the first experiment in Case 1, the forecasting results (the electric load on 8 May 2007) of the original SVR model, the PSO–SVR model and the proposed DEMD–SVR–AR model are shown in Fig. 11(a). Notice that the forecasting curve of the proposed DEMD–SVR–AR model (red solid dots and red curve) fits better than those of the alternative models. For Case 2, the forecasting results (the electric load from 13 to 14 January 2015) of the ARIMA model, the BPNN model, the GA–ANN model, and the proposed DEMD–SVR–AR model are shown in Fig. 12(a). Similarly, the forecasting curve of the proposed DEMD–SVR–AR model (red solid triangles and red curve) also fits better than the others.

The second experiments in Cases 1 and 2 show the one-week-ahead forecasting for the large sample size data. The peak load values of the testing set are larger than those of the training set, as shown in Figs. 5(b) and 9(b), respectively. The detailed forecasting results of this experiment are shown in Figs. 11(b) and 12(b). They indicate that the results obtained from the DEMD–SVR–AR model fit the peak load values exceptionally well. In other words, the DEMD–SVR–AR model has better generalization ability than the three comparison models in both cases. For Case 1 in particular, local enlargements (peaks) of Fig. 11(a) and (b) are shown in Fig. 13(a) and (b), respectively. They show more clearly that the forecasting curve of the proposed DEMD–SVR–AR model (red solid dots and red curve) fits more precisely than those of the alternative models, i.e., it follows the changing trend of the data, including its fluctuation tendency.
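The error metrics of Eqs. (16)–(18), which are used to score the forecasts discussed here, can be sketched in a few lines. This is a minimal illustration of the stated definitions, with $P_i$ passed as `predicted` and $A_i$ as `actual`; the function names and the sample values are assumptions for demonstration and do not come from the paper's data.

```python
import math
from typing import Sequence


def _check(predicted: Sequence[float], actual: Sequence[float]) -> int:
    """Validate the inputs and return n, the number of predictions."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return len(actual)


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean square error, Eq. (16)."""
    n = _check(predicted, actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)


def mae(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute error, Eq. (17)."""
    n = _check(predicted, actual)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / n


def mape(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute percentage error in percent, Eq. (18)."""
    n = _check(predicted, actual)
    if any(a == 0 for a in actual):
        raise ZeroDivisionError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((p - a) / a) for p, a in zip(predicted, actual)) / n


# Hypothetical values, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
scores = (rmse(predicted, actual), mae(predicted, actual), mape(predicted, actual))
```

Because MAPE divides by the actual value, it is scale-free, whereas RMSE and MAE carry the unit of the load series (MW here), which is why the three metrics are usually read together.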


Fig. 9. Comparison of the data-I and the forecasted electric load of training and testing by the SVR model for the small sample and large sample data in Case 2: (a) one-day ahead prediction from 13 to 14 January 2015 performed by the model; (b) one-week ahead prediction from 2 to 15 February 2015 performed by the model. [Axes: data-I (MW) and forecast (MW) versus time (hour); series: original, predict.]

Table 3
The SVR's parameters for data-I in Case 2.

Sample size              m     σ      C     ε        Testing MAPE
The small sample size    24    0.12   113   0.0038   8.19
The large sample size    24    0.21   127   0.0019   5.37

The forecasting results in Cases 1 and 2 are summarized in Tables 5 and 6, respectively. The proposed DEMD–SVR–AR model is compared with four alternative models. It is found that our hybrid model outperforms all other alternatives in terms of all the evaluation criteria. One general observation is that the proposed model tends to fit closer to the actual values, with a smaller forecasting error.

The proposed model shows higher forecasting accuracy in terms of the three statistical metrics. Considering model effectiveness and efficiency as a whole, we conclude that the proposed model is quite competitive against the compared models, namely the ARIMA, BPNN, GA–ANN, PSO–BP, SVR, PSO–SVR, and AFCM models. In other words, the hybrid model leads to better accuracy and statistical interpretation.

In particular, as shown in Fig. 13, our method shows higher accuracy and good flexibility at peaks and inflection points, because the little redundant information could be used to statistical […] Therefore, the forecasting accuracy increases significantly.

Several further observations can be made from the results. First, the comparisons among these models show that the proposed model outperforms the alternative models. Second, the DEMD–SVR–AR model has better generalization ability for different input patterns, as shown in the second experiment. Third, from the comparison between the different sample sizes of the two experiments, we conclude that the hybrid model can tolerate more redundant information and construct the model for the larger sample size data set. Finally, since the proposed model generates good results with good accuracy and interpretability, it is robust and effective, as shown in Tables 5 and 6. Overall, the proposed model provides a powerful and easily implemented tool for electric load forecasting.

Furthermore, to verify the significance of the accuracy improvement of the DEMD–SVR–AR model, the forecasting accuracy of the original SVR, PSO–SVR, PSO–BP, AFCM, ARIMA, BPNN, GA–ANN and DEMD–SVR–AR models in both cases is compared by a statistical test, namely a one-tailed Wilcoxon signed-rank test at the 0.025 and 0.05 significance levels. The test results are shown in Tables 7 and 8. Clearly, the proposed DEMD–SVR–AR model is significantly superior to the other alternative models at the 0.05 significance level.
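As a rough illustration of the Wilcoxon signed-rank test mentioned above, the sketch below computes the statistic W from paired forecasting errors of two models and compares it with a critical value supplied by the caller. It is a textbook version of the test, not the authors' procedure: the function names and the example error values are assumptions, and the critical value 6 used in the example is simply the one listed for α = 0.05 in Tables 7 and 8.

```python
from typing import Sequence, Tuple


def wilcoxon_signed_rank(
    x: Sequence[float], y: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (W_plus, W_minus, W) for paired samples x and y.

    Zero differences are dropped; tied absolute differences share the
    average of the ranks they span. W is min(W_plus, W_minus).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    diffs = [a - b for a, b in zip(x, y) if a != b]
    if not diffs:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within tie groups.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, min(w_plus, w_minus)


def is_significant(w: float, critical_value: float) -> bool:
    """Reject the null hypothesis when W does not exceed the critical value."""
    return w <= critical_value


# Hypothetical absolute errors of two models on the same eight test points.
errors_a = [12.0, 9.5, 14.0, 8.0, 11.5, 10.5, 13.0, 10.0]
errors_b = [10.0, 9.0, 11.0, 8.25, 9.0, 9.0, 10.0, 9.0]
w_plus, w_minus, w = wilcoxon_signed_rank(errors_a, errors_b)
significant = is_significant(w, critical_value=6)
```

In this toy case model B has the smaller error on seven of the eight points, so almost all of the rank mass falls on one side and W is small, which is the pattern the test treats as evidence of a systematic difference.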


Table 4
Summary of results of the AR forecasting model for data-II in Case 2.

The small sample size    $6.7345 \times 10^{-5}$    $x_n = 10372.441 - 0.998x_{n-1} + 0.65218x_{n-2} - 0.3316x_{n-3} + 0.00072x_{n-4}$
The large sample size    $7.8579 \times 10^{-5}$    $x_n = 11013.26 + 0.9782x_{n-1} + 0.11x_{n-2} - 0.4783x_{n-3} + 0.36437x_{n-4}$

Fig. 10. Comparison of the data-II and the forecasted electric load by the AR model for the two experiments in Case 2: (a) one-day ahead prediction from 13 to 14 January 2015 performed by the model; (b) one-week ahead prediction from 2 to 15 February 2015 performed by the model. [Axes: electric load trend (MW) versus time (hour); series: data-II (electric load trend), forecast.]

Fig. 11. Comparison of the original data and the forecasted electric load by the DEMD–SVR–AR model, the SVR model and the PSO–SVR model for: (a) the small sample size (one-day ahead prediction of 8 May 2007 performed by the models); (b) the large sample size (one-week ahead prediction from 18 to 24 May 2007 performed by the models). (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.) [Axes: electric load (MW) versus time (half hour); series: raw data, forecasted load by DEMD–SVR–AR, by SVR, and by PSO–SVR.]

Fig. 12. Comparison of the original data and the forecasted electric load by the DEMD–SVR–AR model, the ARIMA model, the BPNN model and the GA–ANN model for: (a) the small sample size (one-day ahead prediction from 13 to 14 January 2015 performed by the models); (b) the large sample size (one-week ahead prediction from 2 to 15 February 2015 performed by the models). (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.) [Axes: electric load (MW) versus time (hour); series: ARIMA(4,1,4), BPNN, GA–ANN, DEMD–SVR–AR.]

Fig. 13. The local enlargement (peak) comparison of the DEMD–SVR–AR model, the SVR model and the PSO–SVR model for (a) the small sample size; (b) the large sample size. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.) [Vertical axis: electric load (MW).]

Table 5
Summary of results of the forecasting models in Case 1.

Algorithm        MAPE      RMSE      MAE       Time (s)
For the first experiment (small sample size)
Original SVR     11.6955   145.865   10.9181   180.4
PSO–SVR          11.4189   145.685   10.6739   165.2
PSO–BP           10.9094   142.261   10.1429   159.9
AFCM [24]        9.9524    125.323   9.2588    75.3
EMD–SVR–AR       9.8595    117.159   9.0967    80.7
DEMD–SVR–AR      9.7162    110.159   8.7459    76.8
For the second experiment (large sample size)
Original SVR     12.8765   181.617   12.0528   116.8
PSO–SVR          13.503    271.429   13.0739   192.7
PSO–BP           12.2384   175.235   11.3555   163.1
AFCM [26]        11.1019   158.754   10.4385   160.4
EMD–SVR–AR       5.100     134.201   9.8215    162.0
DEMD–SVR–AR      4.826     130.118   9.5440    163.3

Table 6
Summary of results of the forecasting models in Case 2.

Algorithm        MAPE     RMSE     MAE
For the first experiment (small sample size)
ARIMA(4,1,4)     45.33    320.45   25.72
BP–ANN           31.76    219.43   21.69
GA–ANN           23.89    220.96   23.55
EMD–SVR–AR       14.31    158.11   17.44
DEMD–SVR–AR      8.19     140.16   12.79
For the second experiment (large sample size)
ARIMA(4,1,4)     60.65    733.22   54.05
BP–ANN           42.5     479.48   50.39
GA–ANN           33.12    450.63   44.35
EMD–SVR–AR       11.29    289.21   20.76
DEMD–SVR–AR      5.37     160.58   15.82

5. Conclusions

The proposed model achieves superiority and significantly outperforms the original SVR model when forecasting based on unbalanced data. In addition, the goal of training the model is not to learn an exact representation of the training set itself, but rather to set up a statistical model that generalizes to better forecasting values for new inputs. In practical applications of an SVR model, if the model is overtrained on some sub-classes of overwhelming size, it memorizes the training data and generalizes poorly to other sub-classes of small size. The DEMD term of the proposed DEMD–SVR–AR model has been employed in the present research, details of which have been discussed in the sections above.

The interest in applying DEMD forecast systems arises from the fact that such systems consider both the accuracy and the comprehensibility of the forecast result simultaneously. To this end, a hybrid model has been proposed, and its effectiveness in forecasting electric load data has been compared with that of three other alternative models. In this study, various data characteristics of electric load are identified for which the proposed model performs better than the other algorithms in terms of forecasting capability. For example, in Case 2, the electric load from NYISO fluctuates more strongly, and the DEMD algorithm can significantly overcome the fluctuation problem. Based on the obtained experimental results, we conclude that the proposed DEMD–SVR–AR model can generate not only human-understandable rules, but also better forecasting accuracy levels. Our proposed model also outperforms the alternative models in terms of interpretability, forecasting accuracy and generalization ability, which is especially true for forecasting with unbalanced data and very complex systems. In particular, the analyzed sequence can be decomposed accurately by the improved DEMD, thereby improving


the forecasting accuracy of the SVR model. Meanwhile, even when interference is decomposed into the residuals, the AR model still achieves good forecasting performance.

Table 7
Wilcoxon signed-rank test in Case 1.

Compared models                    α = 0.025; W = 4    α = 0.05; W = 6
DEMD–SVR–AR vs. original SVR       8                   3^a
DEMD–SVR–AR vs. PSO–SVR            6                   2^a
DEMD–SVR–AR vs. PSO–BP             6                   2^a
DEMD–SVR–AR vs. AFCM               6                   2^a
DEMD–SVR–AR vs. EMD–SVR–AR         6                   2^a

^a Denotes that the DEMD–SVR–AR model significantly outperforms other alternative models.

Table 8
Wilcoxon signed-rank test in Case 2.

Compared models                    α = 0.025; W = 4    α = 0.05; W = 6
DEMD–SVR–AR vs. ARIMA              6                   2^a
DEMD–SVR–AR vs. BPNN               6                   2^a
DEMD–SVR–AR vs. GA–ANN             6                   2^a
DEMD–SVR–AR vs. EMD–SVR–AR         6                   2^a

^a Denotes that the DEMD–SVR–AR model significantly outperforms other alternative models.

Acknowledgments

This work was supported by the Startup Foundation for Doctors (No. PXY-BSQD-2014001), Educational Commission of Henan Province of China (No. 15A530010), The Youth Foundation of Ping Ding Shan University (No. PXY-QNJJ-2014008), and Ministry of Science and Technology, Taiwan (NSC 100-2628-H-161-001-MY4 and MOST 104-2410-H-161-002).

References

[1] J.T. Bernard, D. Bolduc, N.D. Yameogo, S. Rahman, A pseudo-panel data model of household electricity demand, Resour. Energy Econ. 33 (2010) 315–325.
[2] F.J. Ardakani, M.M. Ardehali, Long-term electrical energy consumption forecasting for developing and developed economies based on different optimized models and historical data types, Energy 65 (2014) 452–461.
[3] I. Arisoy, I. Ozturk, Estimating industrial and residential electricity demand in Turkey: A time varying parameter approach, Energy 66 (2014) 959–964.
[4] K. Afshar, N. Bigdeli, Data analysis and short term load forecasting in Iran electricity market using singular spectral analysis (SSA), Energy 36 (2011) 2620–2627.
[5] U. Kumar, V.K. Jain, Time series models (Grey–Markov, Grey Model with rolling mechanism and singular spectrum analysis) to forecast energy consumption in India, Energy 35 (2010) 1709–1716.
[6] P. Li, Y. Li, Q. Xiong, Y. Zhang, Application of a hybrid quantized Elman neural network in short-term load forecasting, Int. J. Electr. Power Energy Syst. 66 (2014) 1–8.
[7] A. Kavousi-Fard, H. Samet, F. Marzbani, A new hybrid modified firefly algorithm and support vector regression model for accurate short term load forecasting, Expert Syst. Appl. 41 (2014) 6047–6056.
[8] F. Rodrigues, The daily and hourly energy consumption and load forecasting using artificial neural network method: a case study using a set of 93 households in Portugal, Energy Procedia 62 (2014) 220–229.
[9] S. Kouhi, F. Keynia, S.N. Ravadanegh, A new short-term load forecast method based on neuro-evolutionary algorithm and chaotic feature selection, Int. J. Electr. Power Energy Syst. 62 (2014) 862–867.
[10] J. Geng, M.-L. Huang, M.-W. Li, W.-C. Hong, Hybridization of seasonal chaotic cloud simulated annealing algorithm in a SVR-based load forecasting model, Neurocomputing 151 (2015) 1362–1373.
[11] W.-C. Hong, Electric load forecasting by seasonal recurrent SVR (support vector regression) with chaotic artificial bee colony algorithm, Energy 36 (2011) 5568–5578.
[12] M. Yesilbudak, S. Sagiroglu, I. Colak, A new approach to very short term wind speed prediction using k-nearest neighbor classification, Energy Convers. Manag. 69 (2013) 77–86.
[13] H. Peng, F. Liu, X. Yang, A hybrid strategy of short term wind power prediction, Renew. Energy 50 (2013) 590–595.
[14] X. An, D. Jiang, C. Liu, M. Zhao, Wind farm power prediction based on wavelet decomposition and chaotic time series, Expert Syst. Appl. 38 (2011) 11280–11285.
[15] Y. Lei, J. Lin, Z. He, M.J. Zuo, A review on empirical mode decomposition in fault diagnosis of rotating machinery, Mech. Syst. Signal Process. 35 (2013) 108–126.
[16] P. Wong, Q. Xu, C. Vong, H. Wong, Rate-dependent hysteresis modeling and control of a piezostage using online support vector machine and relevance vector machine, IEEE Trans. Ind. Electron. 59 (2012) 1988–2001.
[17] Z. Wang, L. Liu, Sensitivity prediction of sensor based on relevance vector machine, J. Inf. Comput. Sci. 9 (2012) 2589–2597.
[18] W.-C. Hong, Intelligent Energy Demand Forecasting, Springer, London, UK, 2013.
[19] Z.K. Peng, P.W. Tse, F.L. Chu, A comparison study of improved Hilbert–Huang transform and wavelet transform: Application to fault diagnosis for rolling bearing, Mech. Syst. Signal Process. 19 (2005) 974–988.
[20] H. Li, B. Xu, Y. Zuo, G. Wu, The comparative study of the signal trend extraction based on Wavelet Transformation and EMD method, Instrum. Anal. Monit. 3 (2013) 28–30.
[21] B. Huang, A. Kunoth, An optimization based empirical mode decomposition scheme, J. Comput. Appl. Math. 240 (2013) 174–183.
[22] G. Fan, S. Qing, Z. Wang, Shi, W.-C. Hong, L. Dai, Study on apparent kinetic prediction model of the smelting reduction based on the time series, Math. Probl. Eng. 2012 (2012) 1–15, http://dx.doi.org/10.1155/2012/720849.
[23] P. Bhusana, T. Chris, Improving prediction of exchange rates using differential EMD, Expert Syst. Appl. 40 (2013) 377–384.
[24] X. An, D. Jiang, M. Zhao, C. Liu, Short-term prediction of wind power using EMD and chaotic theory, Commun. Nonlinear Sci. Numer. Simul. 17 (2012) 1036–1042.
[25] Y. Huang, F.G. Schmitt, Time dependent intrinsic correlation analysis of temperature and dissolved oxygen time series using empirical mode decomposition, J. Mar. Syst. 130 (2014) 90–100.
[26] G. Rilling, P. Flandrin, P. Gonçalvès, On empirical mode decomposition and its algorithms, in: Proceedings of the 6th IEEE/EURASIP Workshop on Nonlinear Signal and Image Processing (NSIP'03), Grado, Italy, 2003.
[27] W. Huang, Z. Shen, N.E. Huang, Y.C. Fung, Nonlinear indicial response of complex nonstationary oscillations as pulmonary hypertension responding to step hypoxia, Proc. Natl. Acad. Sci. USA 96 (1996) 1834–1839.
[28] N.E. Huang, N.O. Attoh-Okine, The Hilbert Transform in Engineering, CRC Press, Taylor & Francis Group, Florida, USA, 2005.
[29] V. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, NY, USA, 1995.
[30] H.L. Koul, X. Zhu, Goodness-of-fit testing of error distribution in nonparametric ARCH(1) models, J. Multivar. Anal. 137 (2015) 141–160.
[31] J.W. Taylor, Short-term load forecasting with exponentially weighted methods, IEEE Trans. Power Syst. 27 (2012) 458–464.
[32] J. Che, J. Wang, G. Wang, An adaptive fuzzy combination model based on self-organizing map and support vector regression for electric load forecasting, Energy 37 (2012) 657–664.

Guo-Feng Fan was born in Shanxi Province, China, on May 29th, 1985. He received his doctoral degree from the Engineering Research Center of Metallurgical Energy Conservation and Emission Reduction, Ministry of Education, Kunming University of Science and Technology, Kunming, in 2013. His research interests are ferrous metallurgy, energy forecasting, optimization, and system identification.

Li-Ling Peng was born in Hunan Province, China, on February 15th, 1985. She received her master's degree from the Faculty of Science, Kunming University of Science and Technology, Kunming, in 2013. Her research interests are in pattern recognition in images and computing, and she is especially skilled in the recognition and prediction of meteorology.

Wei-Chiang Hong received his Ph.D. degree in Management from Da-Yeh University, Taiwan, in 2008. Since September 2006, he has been with the Department of Information Management of the Oriental Institute of Technology, where he is currently a professor. His research interests mainly include applications of forecasting technology and computational intelligence. He is currently the Editor-in-Chief of the International Journal of Applied Evolutionary Computation and is also on the editorial boards of several journals, including Neurocomputing, Applied Soft Computing, The Scientific World Journal, Journal of Applied Mathematics, and Energy Sources Part B: Economics, Planning, Policy.

Fan Sun was born in Henan, China, on November 13th, 1972. She received her B.S. degree in mathematics education from Henan University, China, in 1996. Her research interests are mathematics education and applied mathematics.
