Progress in Forecasting by Neural Networks

P. CAIRE, G. HATABIAN, C. MULLER
Electricité de France, Direction des Etudes et Recherches, 1, avenue du Général de Gaulle, 92141 Clamart cedex, France
ABSTRACT
Forecasting electricity consumption is a basic concern for a company like Electricité de France, the French electricity supply company. Our purpose in this document is to study forecasting by neural networks (NN). The great advantage of NN forecasting is not just its performance, which is as good as that of traditional models (ARIMA models), but also the possibility of including exogenous variables, the results obtained when forecasting more than one step ahead, and the possibility of changing the minimization criterion according to economic conditions.
INTRODUCTION
Accurate short-term forecasting of consumption is very important for the efficient management of electrical power production. For this reason our company, the French electricity supply board, needs to study every new model which may offer an improvement in forecasting quality. Neural networks appear to be a potential alternative to traditional methods. First, the data are described. Secondly, the traditional and NN methods are compared. The traditional method, known as the Box and Jenkins method, is described and its best model is shown. Different NN approaches are then tried and the resulting models presented. Next, the results of the various models are compared. Finally, we discuss some of the NN properties which are useful for forecasting, such as the use of new minimization criteria which can include economic characteristics, and the good results obtained with NN models for forecasting more than one step ahead.
2
DATA
For our first study of forecasting by NN, we have chosen a relatively easy time series, with the aim of identifying the difficulties of the NN method. The data are the daily electricity consumption for the whole of France from 1986 to 1990 inclusive. In addition to being easy to handle, this series had not yet been modelled by a traditional method. The traditional and NN models were therefore built at the same time, which makes the comparison between the two methods that much easier. Two other series are known: firstly the temperatures, more precisely the maxima and minima for six cities representative of the temperatures in France, and secondly the nebulosities, more precisely the maxima and minima for the same six cities.
3
THE MODELS
Both of the approaches and the various models are presented in this section.
0-7803-0559-0/92 $3.00 © 1992 IEEE
3.1
The Traditional Approach
The forecasting methods used until now have been based on the following system. Observations are first corrected for the weather effect, and individual values are smoothed out. A forecast at normal temperature is then made using an ARIMA model. Finally, further corrections are made according to the weather forecast and the characteristics of the day for which the forecast is being made. The corrections for the weather effect are made using a sophisticated model which introduces non-linearity between consumption and temperature. The forecasting of the corrected observations is made using the following ARIMA model:
The greatest difficulty in this method lies not so much in finding a good ARIMA model and its parameters, although this does require a good knowledge of the Box and Jenkins method, as in the preliminary corrections, which are essential and require a very good knowledge of the effects of the weather on electricity consumption. On the other hand, the computing time needed is low.
3.2
The Neural Approach
The neural model used here is a multi-layer perceptron; learning is done with a back-propagation algorithm, and the transfer function is sigmoid. Contrary to the traditional approach (see Section 3.1), the neural network forecasts are made directly from the observations without any corrections. Exogenous variables, such as temperature and nebulosity, are introduced directly as network inputs. The output is always a single neuron which gives the forecast consumption one step ahead.
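The forward pass of such a network can be sketched as follows. This is only an illustration under stated assumptions: the bias terms, the sigmoid on the output neuron, the weight initialization range and the function names are not given in the text, and inputs and targets are assumed to be scaled so that a forecast in (0, 1) is meaningful.

```python
import math
import random

def sigmoid(x):
    """Logistic transfer function."""
    return 1.0 / (1.0 + math.exp(-x))

def init_network(n_inputs, n_hidden=3, seed=0):
    """Small random weights; each weight row ends with a bias term (assumption)."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden + 1)]
    return hidden, output

def forward(inputs, hidden, output):
    """Return the single output neuron's value for one input vector."""
    activations = [
        sigmoid(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
        for row in hidden
    ]
    return sigmoid(sum(w * a for w, a in zip(output[:-1], activations)) + output[-1])

# Hypothetical usage with 18 scaled inputs and 3 hidden neurons.
hidden, output = init_network(n_inputs=18)
forecast = forward([0.5] * 18, hidden, output)
print(round(forecast, 3))
```

Back-propagation would then adjust `hidden` and `output` to minimize the chosen error criterion over the learning set.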
NNs built on different ideas are studied. Firstly, most of the variables which are correlated with the forecast consumption are introduced as input; this NN is called the maximum model. Because of the size of such a network and the computation time needed for learning, two additional models are prepared. The first additional model follows the opposite idea: aiming at the same level of results as the maximum model, only the most important variables are retained as input, i.e. those most closely related to the consumption to be forecast. This second NN is the minimum model. We then try to reduce the number of connections of the maximum model by using another minimization criterion (Weigend et al. 90). This last model is called the reduced maximum model.
The maximum model
Input: The number of input neurons is 134. They are divided as follows: the last 35 consumption values (35 previous days); 7 boolean neurons, one for each day of the week; 72 temperature values, i.e. the maxima and the minima for the six French cities for the last 5 days and the forecasting day; 12 values of nebulosity, i.e. the maxima and minima for the six cities for the forecast day; and the 7 last errors made by the network.
Architecture: The maximum model has only one hidden layer with 3 neurons.
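The input groups listed above can be tallied with a few lines of code. Note that the groups as listed add up to 133, one fewer than the stated 134; the text does not say what accounts for the difference. The dictionary keys are descriptive labels chosen here for illustration.

```python
# Input groups of the maximum model, with the sizes given in the text.
input_groups = {
    "past consumption (35 previous days)": 35,
    "day-of-week booleans": 7,
    "temperature max/min, 6 cities, 5 past days + forecast day": 2 * 6 * 6,
    "nebulosity max/min, 6 cities, forecast day": 2 * 6,
    "last errors made by the network": 7,
}

total_listed = sum(input_groups.values())
stated_total = 134  # number of input neurons stated in the text

for name, size in input_groups.items():
    print(f"{size:3d}  {name}")
print("listed:", total_listed, "stated:", stated_total)
```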
The minimum model
Input: This network has 18 input neurons, i.e. the last 5 consumption observations; the last 5 temperature averages; the temperature forecast average; and the 7 boolean neurons, one for each day of the week.
Architecture: The minimum model and the maximum model both have one hidden layer with 3 neurons.
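As an illustration, the 18 inputs of the minimum model could be assembled as below. The function name, the weekday numbering (0 = Monday) and the sample values are hypothetical, and any scaling of the inputs, which a sigmoid network would normally need, is left out.

```python
def minimum_model_inputs(consumption, temp_avg, temp_forecast_avg, weekday):
    """Build the 18 inputs: 5 consumptions, 5 temperature averages,
    1 forecast temperature average and 7 day-of-week booleans.

    weekday: 0 = Monday ... 6 = Sunday (hypothetical convention).
    """
    if len(consumption) < 5 or len(temp_avg) < 5:
        raise ValueError("need at least 5 past observations of each series")
    if not 0 <= weekday <= 6:
        raise ValueError("weekday must be in 0..6")
    day_flags = [1.0 if d == weekday else 0.0 for d in range(7)]
    return (list(consumption[-5:]) + list(temp_avg[-5:])
            + [temp_forecast_avg] + day_flags)

# Made-up sample values, for illustration only.
x = minimum_model_inputs(
    consumption=[1010.0, 1025.0, 990.0, 1040.0, 1060.0, 1055.0],
    temp_avg=[8.0, 7.5, 9.0, 6.0, 5.5, 6.5],
    temp_forecast_avg=7.0,
    weekday=3,
)
print(len(x))
```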
The reduced maximum model
Input: The reduced maximum model has the same input as the maximum model.
Architecture: The purpose is to reduce the number of connections in comparison with the maximum model. An algorithm based on weight elimination is used, which reduces the number of connections by 30%. The new criterion E1, defined from the quadratic error E0, aims to drive to zero the connections which carry no information. In addition to the performance of this network, one of its advantages is the possibility of explaining some connections and hidden neuron rules, whereas the maximum network is difficult to interpret after convergence. For instance, the interpretation of the weights of the reduced maximum model between the 7 day-of-week input neurons and the first hidden neuron shows that Thursday and Friday together are diametrically opposed to Saturday and Sunday, and that Monday, Tuesday and Wednesday are 'average' (figure 1). This result is stable, as we obtain approximately the same results with different initial weights.
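The exact expression of the weight-elimination criterion is not reproduced here. A minimal sketch, assuming the standard form from Weigend et al., E1 = E0 + lambda * sum_i (w_i^2 / w0^2) / (1 + w_i^2 / w0^2), is given below; the values of `lam` and `w0` and the function names are illustrative assumptions, not the authors' settings.

```python
def weight_elimination_penalty(weights, w0=1.0):
    """Sum of (w/w0)^2 / (1 + (w/w0)^2) over all connection weights.

    Each term is close to (w/w0)^2 for small weights and saturates
    towards 1 for large ones, so the penalty roughly counts the
    connections that are still significantly different from zero.
    """
    total = 0.0
    for w in weights:
        ratio = (w / w0) ** 2
        total += ratio / (1.0 + ratio)
    return total

def criterion_e1(e0, weights, lam=0.01, w0=1.0):
    """Quadratic error e0 plus the weighted penalty (assumed form)."""
    return e0 + lam * weight_elimination_penalty(weights, w0)

# Hypothetical weights: two nearly dead connections, two active ones.
weights = [0.01, -0.02, 1.5, -3.0]
print(weight_elimination_penalty(weights))
print(criterion_e1(e0=0.25, weights=weights))
```

Connections whose weights end up close to zero after learning can then be removed, which is one way to obtain a reduction in the number of connections such as the 30% reported above.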
Figure 1: The week effect (weights between the day-of-week input neurons and the first hidden neuron).
In comparison with the traditional method, the NN requires less prior knowledge of the process. We do not need to know exactly how the weather acts on consumption. Nevertheless, prior knowledge makes the search for good networks easier. For the NN approach, the computation time for the learning phase is much greater than for the traditional approach, especially for networks with a lot of connections.
3.3
The results
Table 1: Results of the four models for the criteria C1 and C2

                          1986-1989          1990
                          C1       C2        C1       C2
ARIMA model               2.11%    2.75%     1.80%    2.64%
maximum model             1.75%    2.51%     1.75%    2.68%
minimum model             1.79%    2.67%     1.84%    3.08%
reduced maximum model     1.57%    2.38%     1.72%    2.60%
Table 1 shows firstly that the minimum model does not generalize well, secondly that both the maximum and the reduced maximum models are better than the ARIMA model, and finally that the reduced maximum model is the best one. However, the differences between all these results are fairly small. These two criteria are too global to give a good idea of the forecasting qualities of the various models. We must therefore study the distribution of the error, which shows that the forecasts are unbiased for the ARIMA model, as we already knew, and also for the NN models. The difference between the distributions is in the tails. The absolute error maxima of the NN models are much higher than those of the ARIMA model (236 GWh as against 145 GWh). On the other hand, the number of absolute errors over 80 GWh is smaller for the NN models than for the ARIMA model (22 against 36). We also look at the badly forecast days of each model. These are the same for the 3 NN models, and most of them are public holidays. The badly forecast days of the ARIMA model, by contrast, are different from those of the NN: most of them are not public holidays, and the errors are very difficult to explain. We expect that a new NN model, including public holidays, will improve the results.
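The error-distribution checks described above (bias, largest absolute error, number of absolute errors over 80 GWh) can be computed along the following lines. The function name and the sample values are hypothetical; only the 80 GWh threshold comes from the text.

```python
def error_summary(actual, forecast, threshold=80.0):
    """Summarize forecast errors (same unit as the inputs, e.g. GWh)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    errors = [f - a for a, f in zip(actual, forecast)]
    abs_errors = [abs(e) for e in errors]
    return {
        "bias": sum(errors) / len(errors),            # mean signed error
        "max_abs_error": max(abs_errors),             # tail of the distribution
        "n_over_threshold": sum(1 for e in abs_errors if e > threshold),
    }

# Made-up daily values, for illustration only.
actual = [1000.0, 1100.0, 1200.0, 900.0]
forecast = [1010.0, 1000.0, 1190.0, 1000.0]
summary = error_summary(actual, forecast)
print(summary)
```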
4
DISCUSSION
The NN results are only slightly better than those of the ARIMA model. The advantages of the NN lie not only in the results but also in the properties of the approach. In Section 3 we underlined the fact that less prior knowledge of the process is needed and that exogenous variables are introduced directly into the input layer. We now present two other advantages: firstly, the good results of the NN for forecasting more than one step ahead, and secondly, the possibility of introducing economic characteristics into the minimization criterion.
4.1
p Step Ahead Forecasting
The different estimated models are optimized for one-day-ahead forecasting, but they can also be used to forecast more distant days. We study the forecasts of the 4 models from d+1 to d+20; the evolution of their quality is shown in figure 3.
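The text does not spell out how a one-step model is applied to more distant days. One common approach, assumed here purely for illustration, is to feed each forecast back as the most recent observation and iterate. The placeholder model and the function names are hypothetical, and exogenous inputs (temperature, nebulosity, day of the week) are left out for brevity.

```python
def iterate_forecast(history, one_step_model, n_lags, horizon):
    """Forecast `horizon` steps ahead by feeding forecasts back as inputs."""
    if len(history) < n_lags:
        raise ValueError("history is shorter than the number of lags")
    window = list(history[-n_lags:])
    forecasts = []
    for _ in range(horizon):
        y = one_step_model(window)
        forecasts.append(y)
        window = window[1:] + [y]  # drop the oldest value, append the forecast
    return forecasts

def mean_of_last_two(window):
    """Placeholder one-step model standing in for a trained network."""
    return (window[-1] + window[-2]) / 2.0

result = iterate_forecast([10.0, 20.0], mean_of_last_two, n_lags=2, horizon=3)
print(result)  # [15.0, 17.5, 16.25]
```

With this scheme the forecast errors accumulate over the horizon, which is why the error standard deviation is examined step by step from d+1 to d+20.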
Figure 3: Evolution of the error standard deviation for the 4 models (p step ahead forecasting).
The ARIMA model shows a pseudo-periodicity of 7 with a sharp increase at d+14, which the NN models do not. The minimum model is worse than the others regardless of the horizon studied, although at d+20 its result is close to that of the ARIMA model. Up to d+14 the ARIMA, the maximum and the reduced maximum models are more or less equivalent. From then on the NN models are always better. The two maximum models evolve in parallel with each other, but the reduced maximum is always better. In conclusion, in the short term the performances are equivalent, but for forecasting more than 14 steps ahead the NN models are much more reliable.
4.2
The Minimization Criterion
Most of the time the NN weights are determined by minimizing the average quadratic error on the learning set (see the results in Section 3.3). But for the electricity forecasting problem, the real cost function is not y = x^2. For instance, if we want to reduce the number of the highest errors, we can use a new minimization criterion y = x^k, with k even and greater than 2.
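The effect of the exponent k can be illustrated with a short sketch. The function names and the sample errors are hypothetical; the point shown is only that, as k grows, the largest error accounts for a growing share of the criterion, so learning concentrates on reducing it.

```python
def power_cost(errors, k=2):
    """Average of error**k over the sample, with k a positive even integer."""
    if k <= 0 or k % 2 != 0:
        raise ValueError("k must be a positive even integer")
    return sum(e ** k for e in errors) / len(errors)

def power_cost_gradient(error, k=2):
    """Derivative of error**k with respect to the error, as used in back-propagation."""
    return k * error ** (k - 1)

def largest_error_share(errors, k):
    """Fraction of the total cost contributed by the largest absolute error."""
    worst = max(abs(e) for e in errors)
    return worst ** k / sum(e ** k for e in errors)

# Made-up scaled errors: three small ones and one large one.
errors = [0.1, 0.1, 0.1, 0.9]
for k in (2, 8):
    print(k, power_cost(errors, k), largest_error_share(errors, k))
```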
Figure 4: Absolute errors bar chart for k = 2 and k = 8.
As figure 4 shows, the maxima of the absolute errors are much lower for the minimization function y = x^8 than for y = x^2, but the average absolute error increases too quickly as k increases. Two conclusions can be drawn on this point. Firstly, we can consider a model with two networks, the first for the "standard" days (k = 2) and the second for days which are difficult to forecast (k > 2). The problem will then be to classify the days into "standard" and "difficult". Secondly, we can determine the real cost function precisely and use it as the minimization criterion.
5
CONCLUSION
This study shows the advantage of forecasting using NN. The first result, which is absolutely essential to validating the NN approach, is that the forecasting quality of the NN is equivalent to that of the traditional approach. Secondly, several properties of the NN make them especially advantageous, such as the ease of including exogenous variables, the good results with p step ahead forecasting, and the introduction of economic properties into the minimization criterion. These characteristics do not exist in the traditional approach.
REFERENCES
[de Groot & Würtz 90] C. de Groot and D. Würtz, "Analysis of Univariate Time Series with Connectionist Nets: A Case Study of Two Classical Examples", Report of Neural Networks for Statistical and Economic Data, Dublin, December 1990.
[Park et al. 91] D.C. Park, M.A. El-Sharkawi, R.J. Marks, L.E. Atlas, M.J. Damborg, "Electric Load Forecasting Using an Artificial Neural Network", IEEE Transactions on Power Systems, Vol. 6, No. 2, May 1991.
[Canu et al. 90] S. Canu, R. Sobral, R. Lengelle, "Formal Neural Networks as an Adaptative Model for Water Demand", INNC Paris 1990.
[Varfis & Versino 90] A. Varfis, C. Versino, "Univariate Economic Time Series Forecasting by Connectionist Methods", INNC Paris 1990.
[Weigend et al. 90] A.S. Weigend, B.A. Huberman, D.E. Rumelhart, "Predicting the Future: a Connectionist Approach", International Journal of Neural Systems, Vol. 1, No. 3 (1990).