ARIMA and Neural Networks. An Application To The Real GNP Growth Rate and The Unemployment Rate of U.S.A.

Eleftherios Giovanis
Abstract
This paper compares the estimation and forecasting performance of ARIMA models
with that of some of the most popular neural network models. Specifically, we
provide estimation results for the AR-GRNN (generalized regression neural
network), the AR-RBF (radial basis function) and the AR-MLP (multilayer
perceptron). We show that the neural network models outperform ARIMA in
forecasting. The best model for real US GNP is the AR-GRNN, and for the US
unemployment rate it is the AR-MLP.
Keywords: ARIMA; Radial basis function; Multilayer perceptron; Generalized regression neural networks; stationarity; unit root
1 Introduction
Artificial neural networks are computational networks which attempt to
simulate the nerve cells (neurons) of the biological nervous system of humans or
animals (Graupe, 2007). The difference between neural networks and other estimation
and approximation methods is that neural networks include hidden layers, in which
the input variables are transformed by a special function, such as the logistic or
the negative exponential. With these hidden layers and the synaptic functions, the
approach can prove very efficient for modelling and estimating nonlinear processes
(McNelis, 2005). In this paper we deal with two macroeconomic series, which are
characterized by trend and cyclicality.
Aryal and Yao-Wu (2003) applied an MLP network with 3 hidden layers to
forecast the Chinese construction industry and compared its forecasting
performance with that of ARIMA. They found that the RMSE of the MLP estimation is
49 percent lower than that of its ARIMA counterpart. Maasoumi et al. (1996) applied
a back-propagation ANN model to forecast GDP and the unemployment rate, among
other series. The network they apply is a single-hidden-layer feedforward network
with hidden units. Swanson and White (1997a, 1997b) applied neural networks to
forecast nine seasonally adjusted US macroeconomic time series and found that
neural networks generally outperform linear models. Tkacz and Hu (1999) applied
neural networks to forecast Canadian GDP growth and found that the improvement in
forecast accuracy is statistically significant at the 4-quarter horizon, while the
performance at the 1-quarter horizon is poor. They also found that the best neural
network models outperform the best linear models by 15 to 19 percent at the
4-quarter horizon. Tkacz (2001) found that neural networks produce lower
forecasting errors for the yearly growth rate of real Canadian GDP than linear and
univariate models.
2 Data
The data are quarterly series of real gross national product (GNP) and the
unemployment rate for the US economy over the period 1948-2006. The data were
obtained from the Federal Reserve Bank of St. Louis.
3 Methodology
a. Autoregressive moving average
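The first model we estimate is the ARMA, whose process (Gujarati, 2004)
is defined as
$$ y_t = c + \sum_{i=1}^{p} \alpha_i y_{t-i} + u_t + \sum_{j=1}^{q} \beta_j u_{t-j} \tag{1} $$
where $c$ is a constant, $\alpha_i$ are the autoregressive coefficients, $\beta_j$ are the
moving-average coefficients and $u_t$ is a white-noise error term. This is the
ARMA(p,q) process. If the series are not stationary in levels, that is, they are
not I(0), then we have to estimate an ARIMA(p,d,q) process (Gujarati, 2004).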
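The simplest case used later in the paper, an ARMA(1,0), can be illustrated with a few lines of code. The following is a minimal sketch, assuming ordinary least squares as the estimator (the paper does not state how its ARMA models were estimated); the function names and the noise-free toy series are hypothetical and are not the paper's data.

```python
def simulate_ar1(c, phi, y0, n):
    """Generate n values of the noise-free recursion y_t = c + phi * y_{t-1}."""
    series = [y0]
    for _ in range(n - 1):
        series.append(c + phi * series[-1])
    return series


def fit_ar1(series):
    """Least-squares estimates (c, phi) of y_t = c + phi * y_{t-1} + u_t."""
    x = series[:-1]   # lagged values y_{t-1}
    y = series[1:]    # current values y_t
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def difference(series):
    """First difference: the d = 1 step that reduces ARIMA(p,1,q) to ARMA(p,q)."""
    return [b - a for a, b in zip(series, series[1:])]


# Toy series (assumption): c = 1.0, phi = 0.5, no noise.
series = simulate_ar1(c=1.0, phi=0.5, y0=0.0, n=12)
c_hat, phi_hat = fit_ar1(series)
print(round(c_hat, 6), round(phi_hat, 6))
```

Because the toy series contains no noise, the fitted coefficients reproduce the generating values; with real data the estimates would carry sampling error.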
b. Generalized Regression Neural Networks
The GRNN is defined as
$$ E[y \mid x] = \frac{\int_{-\infty}^{\infty} y\, g(x,y)\, dy}{\int_{-\infty}^{\infty} g(x,y)\, dy} \tag{2} $$
where $E[y \mid x]$ is the expected value of y given x and $g(x,y)$ is the Parzen
probability density estimator. If the value of $g(x,y)$ is unknown, then it can be
estimated from a sample of observations of x and y.
The predicted output obtained by GRNN is:
$$ \hat{y}(x) = \frac{\sum_{i=1}^{n} y_i \exp\left(-\frac{\|x - x_i\|^2}{2\sigma^2}\right)}{\sum_{i=1}^{n} \exp\left(-\frac{\|x - x_i\|^2}{2\sigma^2}\right)} \tag{3} $$
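The GRNN prediction in equation (3) is a kernel-weighted average of the training targets, which can be sketched directly. This is an illustrative sketch only: the function name `grnn_predict` and the two-point training set are hypothetical, and the squared Euclidean distance is assumed as in equation (3).

```python
import math


def grnn_predict(train_x, train_y, x, sigma):
    """Kernel-weighted average of the training targets, as in equation (3)."""
    weights = []
    for xi in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    denominator = sum(weights)                                  # simple sum
    numerator = sum(w * yi for w, yi in zip(weights, train_y))  # weighted sum
    return numerator / denominator


# Hypothetical training set with one input dimension.
train_x = [[0.0], [1.0]]
train_y = [1.0, 3.0]
print(grnn_predict(train_x, train_y, [0.5], sigma=1.0))  # query halfway between the points
print(grnn_predict(train_x, train_y, [0.0], sigma=0.1))  # query on a training point
```

A small sigma makes the prediction follow the nearest training target closely, while a large sigma averages over many targets. For queries far from every training point and a very small sigma, all weights can underflow to zero, so a production implementation would need to guard the division.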
Usually the GRNN consists of four layers. In the first layer, which holds the
input data, the synaptic and activation functions are linear. In the second layer,
the pattern layer, the synaptic function is radial and the activation function is
the negative exponential. The third layer, the summation layer, has linear synaptic
and activation functions, like the first layer. The last layer, the output, has
division as its synaptic function and a linear activation function. More
specifically, the input layer receives the input vector X and distributes the data
to the pattern layer. Each neuron i in the pattern layer generates an output
$$ \theta_i = \exp\left(-\frac{\|x - x_i\|^2}{2\sigma^2}\right) $$
and presents the result to the summation layer. In this layer the numerator and
denominator neurons compute the weighted and the simple sums of the $\theta_i$: the
numerator is $S_j = \sum_i w_{ij}\theta_i$ and the denominator is $S_d = \sum_i \theta_i$. In the
output layer the outputs are computed as $Y_j = S_j / S_d$. The hidden layer consists
of 24 units.
The smoothing rate is set at 0.01 for GNP and at 0.05 for the unemployment rate,
based on the lowest training and test errors. In our case we propose the AR-GRNN
model (Li et al., 2007), in which the output is the vector of data $y_t$ and the
inputs are its lags $y_{t-1}, y_{t-2}, \ldots, y_{t-p}$. The general form of the AR-GRNN is
defined as
$$ y_t = F(y_{t-1}, y_{t-2}, \ldots, y_{t-p}) \tag{4} $$
where F is the function produced by the GRNN network. In the case of unemployment
we consider the first differences, because the KPSS test indicates that the
unemployment rate is probably not stationary, so we apply the following AR(p)
function
$$ \Delta y_t = F(\Delta y_{t-1}, \Delta y_{t-2}, \ldots, \Delta y_{t-p}) \tag{5} $$
We apply relations (4) and (5) to all neural network models; specifically, we apply
AR(1) for GNP and AR(2) for the first differences of the unemployment rate. The
technique is the following. Suppose that we have quarterly output data for a
period, e.g. 1948:Q1-2006:Q4, which is the variable $y_t$. With AR(1) we form
$y_{t-1}$, the output data with one lag. This lagged series refers to the same data,
aligned to the period 1948:Q2-2007:Q1, which means that we do not discard the last
observation but carry it forward to the next period. The same process is followed
for AR(2). So in this paper we estimate over the period 1948:Q1-2006:Q4 and then
forecast for the period 2007:Q1-2008:Q1. This definition is also applied to the
other two neural network models. A general GRNN architecture is presented in
figure 1. In all neural network estimations the training sample is the period
1948:Q1-1990:Q4 and the testing sample is the period 1991:Q1-2006:Q4.
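The construction of the autoregressive inputs in relations (4) and (5) can be sketched as follows. This is a hedged illustration: the helper names `make_lagged` and `first_difference` and the short series of levels are hypothetical, and the chronological train/test split only mimics the paper's split by period.

```python
def make_lagged(series, p):
    """Return (inputs, targets) with inputs[k] = [y_{t-1}, ..., y_{t-p}] for target y_t."""
    inputs, targets = [], []
    for t in range(p, len(series)):
        inputs.append([series[t - k] for k in range(1, p + 1)])
        targets.append(series[t])
    return inputs, targets


def first_difference(series):
    """Delta y_t = y_t - y_{t-1}, used for the unemployment rate in relation (5)."""
    return [b - a for a, b in zip(series, series[1:])]


# Hypothetical levels, not data from the paper.
levels = [3.8, 3.9, 4.1, 4.0, 4.3, 4.2]
diffs = first_difference(levels)
inputs, targets = make_lagged(diffs, p=2)   # AR(2) on first differences

# Chronological split: earlier observations train, later ones test.
n_train = 2
train = (inputs[:n_train], targets[:n_train])
test = (inputs[n_train:], targets[n_train:])
print(len(train[0]), len(test[0]))
```

Each row of `inputs` can then be passed to any of the three networks as the input vector, with the matching entry of `targets` as the desired output.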
Figure 1. General GRNN architecture (input layer X1, ..., Xk; pattern layer; summation layer with numerator and denominator neurons; output layer Y1, ..., YJ)
c. Radial Basis Function
The radial basis function is defined (Bishop, 1995) as
$$ y_k(x) = \sum_{j=1}^{M} w_{kj}\,\phi_j(x) + w_{k0} \tag{6} $$
where $w_{kj}$ are the weights, $w_{k0}$ are the biases and $\phi_j(x)$ is given by
$$ \phi_j(x) = \exp\left(-\frac{\|x - \mu_j\|^2}{2\sigma_j^2}\right) \tag{7} $$
where $\mu_j$ is the centre and $\sigma_j$ the width of basis function j.
The RBF network consists of three layers: the input layer, whose synaptic and
activation functions are linear; the hidden layer, where the synaptic and
activation functions are radial and negative exponential respectively; and the
output layer, which has linear synaptic and activation functions, as in the input
layer. The hidden layer in the RBF estimation has 11 units. The radius for both GNP
and the unemployment rate has been set at 50, based on the lowest training and test
errors, as in the GRNN estimation. The AR-RBF function is defined as in the
AR-GRNN case. We present a general RBF illustration in figure 2.
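The forward pass of an RBF network with a single output can be sketched from equations (6) and (7). This is a minimal sketch under stated assumptions: the basis functions are taken to be Gaussian with a squared Euclidean distance, as in Bishop (1995), and the function name, centres, widths and weights below are hypothetical values, not the fitted network of the paper.

```python
import math


def rbf_output(x, centers, widths, weights, bias):
    """Single-output RBF network: equation (6) with basis functions from (7)."""
    total = bias
    for mu, sigma, w in zip(centers, widths, weights):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, mu))
        phi = math.exp(-sq_dist / (2.0 * sigma ** 2))   # radial basis activation
        total += w * phi                                # linear output layer
    return total


# Hypothetical network with two hidden units and one input (e.g. y_{t-1}).
centers = [[0.0], [1.0]]
widths = [50.0, 50.0]
weights = [0.4, 0.6]
print(rbf_output([0.5], centers, widths, weights, bias=0.1))
```

With a large width such as 50 the basis functions are nearly flat over inputs of this scale, so the output varies only slightly with x.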
Figure 2. General RBF architecture (input x; weights; radial basis functions; linear weights; output)
d. Multilayer perceptron
The last model we estimate is the multilayer perceptron (MLP), which differs
from the RBF in two respects (McNelis, 2005). First, the RBF has at most one
hidden layer, while the MLP can have more. Second, the activation function in the
RBF computes the Euclidean distance between the input vector and the center of
that unit, while the MLP computes the inner product of the inputs and the weights
for that unit.
The first layer of the MLP, the input, has linear synaptic and activation
functions, as does the last layer, the output. The hidden layers, of which there
are three in our case, have linear synaptic functions and hyperbolic activation
functions. For networks with binary units, an MLP with one hidden layer has been
shown to suffice. In our case we have continuous variables, so we prefer three
hidden layers. In the first phase the back-propagation method is applied. Each
layer consists of units which receive their input from the units of the layer
directly below and send their output to the units of the layer directly above. The
Ni inputs are fed into the first layer of Nh,1 hidden units (Krose & Smagt, 1996).
The mathematical concept of the back-propagation method is
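$$ y_k^t = f(s_k^t) \tag{1} $$
where
$$ s_k^t = \sum_j w_{jk}\, y_j^t + \theta_k \tag{2} $$
Here $w_{jk}$ is the weight from unit j to unit k and $\theta_k$ is the bias of unit k.
To get the delta rule we must set
$$ \Delta_t w_{jk} = -\gamma \frac{\partial E^t}{\partial w_{jk}} \tag{3} $$
where $\gamma$ is the learning rate. The error measure $E^t$ is defined as the total
squared error for pattern t at the output units:
$$ E^t = \frac{1}{2}\sum_{i=1}^{N_o} (d_i^t - y_i^t)^2 \tag{4} $$
where $d_i^t$ is the desired output for unit i and pattern t. Then we can write, by
the chain rule,
$$ \frac{\partial E^t}{\partial w_{jk}} = \frac{\partial E^t}{\partial s_k^t}\,\frac{\partial s_k^t}{\partial w_{jk}} \tag{5} $$
From equation (2), the second factor on the right-hand side of equation (5) is
$$ \frac{\partial s_k^t}{\partial w_{jk}} = y_j^t \tag{6} $$
and we define the first factor as
$$ \delta_k^t = -\frac{\partial E^t}{\partial s_k^t} \tag{7} $$
so equation (3) can be written as
$$ \Delta_t w_{jk} = \gamma\, \delta_k^t\, y_j^t \tag{8} $$
To compute $\delta_k^t$ we write the partial derivative, by applying the chain rule, as
the product of two factors. One factor in relation (9) reflects the change in the
error as a function of the output of the unit, while the other reflects the change
in the output as a function of changes in the input. Relation (9) is defined as
$$ \delta_k^t = -\frac{\partial E^t}{\partial s_k^t} = -\frac{\partial E^t}{\partial y_k^t}\,\frac{\partial y_k^t}{\partial s_k^t} \tag{9} $$
The second factor of (9) is
$$ \frac{\partial y_k^t}{\partial s_k^t} = f'(s_k^t) \tag{10} $$
which is the derivative of the function f for the kth unit. For the first factor
we first assume that k is an output unit, k = i. In this case we have
$$ \frac{\partial E^t}{\partial y_i^t} = -(d_i^t - y_i^t) \tag{11} $$
and then
$$ \delta_i^t = (d_i^t - y_i^t)\, f'(s_i^t) \tag{12} $$
for any output unit i. Second, if k is a hidden unit and not an output unit, which
means that k = h, then the error measure can be written as a function of the net
inputs from the hidden to the output layer, and we use the chain rule:
$$ \frac{\partial E^t}{\partial y_h^t} = \sum_{i=1}^{N_o} \frac{\partial E^t}{\partial s_i^t}\,\frac{\partial s_i^t}{\partial y_h^t} = \sum_{i=1}^{N_o} \frac{\partial E^t}{\partial s_i^t}\,\frac{\partial}{\partial y_h^t}\sum_{j=1}^{N_h} w_{ji}\, y_j^t = \sum_{i=1}^{N_o} \frac{\partial E^t}{\partial s_i^t}\, w_{hi} = -\sum_{i=1}^{N_o} \delta_i^t\, w_{hi} \tag{13} $$
$$ \delta_h^t = f'(s_h^t) \sum_{i=1}^{N_o} \delta_i^t\, w_{hi} \tag{14} $$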
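The delta computations for output and hidden units can be checked numerically on a very small network. The sketch below is illustrative only and makes several assumptions not taken from the paper: one hidden layer with two units, a hyperbolic tangent activation in the hidden layer, a linear output unit, and hypothetical weights. It compares the back-propagated gradient with a central finite difference of the squared error.

```python
import math


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Return hidden net inputs s_h, hidden outputs y_h and the network output."""
    s_h = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(w_hidden, b_hidden)]
    y_h = [math.tanh(s) for s in s_h]
    y_o = sum(w * y for w, y in zip(w_out, y_h)) + b_out   # linear output unit
    return s_h, y_h, y_o


def error(x, d, w_hidden, b_hidden, w_out, b_out):
    """Squared error for one pattern: E = 0.5 * (d - y)^2."""
    _, _, y_o = forward(x, w_hidden, b_hidden, w_out, b_out)
    return 0.5 * (d - y_o) ** 2


def backprop_gradients(x, d, w_hidden, b_hidden, w_out, b_out):
    """Gradients dE/dw obtained from the deltas of output and hidden units."""
    s_h, y_h, y_o = forward(x, w_hidden, b_hidden, w_out, b_out)
    delta_o = (d - y_o) * 1.0                  # output delta; f'(s) = 1 for a linear unit
    delta_h = [(1.0 - math.tanh(s) ** 2) * delta_o * w   # hidden delta: f'(s_h) * delta_o * w_ho
               for s, w in zip(s_h, w_out)]
    grad_out = [-delta_o * y for y in y_h]     # dE/dw_jk = -delta_k * y_j
    grad_hidden = [[-dh * xi for xi in x] for dh in delta_h]
    return grad_out, grad_hidden


# Hypothetical pattern and weights.
x, d = [0.5, -1.0], 0.3
w_hidden = [[0.1, -0.2], [0.4, 0.3]]
b_hidden = [0.0, 0.1]
w_out = [0.7, -0.5]
b_out = 0.05

grad_out, grad_hidden = backprop_gradients(x, d, w_hidden, b_hidden, w_out, b_out)

# Central finite difference for one hidden weight as an independent check.
eps = 1e-6
w_plus = [row[:] for row in w_hidden]
w_minus = [row[:] for row in w_hidden]
w_plus[0][1] += eps
w_minus[0][1] -= eps
numeric = (error(x, d, w_plus, b_hidden, w_out, b_out)
           - error(x, d, w_minus, b_hidden, w_out, b_out)) / (2 * eps)
print(abs(numeric - grad_hidden[0][1]) < 1e-7)
```

A weight update in the sense of the delta rule would then subtract the learning rate times each gradient from the corresponding weight.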
In the first phase we use the back-propagation method; in the second phase we use
the Levenberg-Marquardt algorithm (Bishop, 1995). Suppose that we have the error
function
$$ E = \frac{1}{2}\sum_n (\varepsilon^n)^2 \tag{15} $$
where $\varepsilon^n$ is the error for the nth pattern. Let $W_A$ be the old point in weight
space and $W_B$ the new one. Then we can expand the error vector $\varepsilon$ to first order
in a Taylor series:
$$ \varepsilon(W_B) = \varepsilon(W_A) + Z\,(W_B - W_A) \tag{16} $$
where Z is the matrix defined as
$$ Z_{ni} = \frac{\partial \varepsilon^n}{\partial w_i} \tag{17} $$
So the error function (15) can be written as
$$ E = \frac{1}{2}\left\|\varepsilon(W_A) + Z\,(W_B - W_A)\right\|^2 \tag{18} $$
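One Levenberg-Marquardt update can be sketched for a model with only two parameters. The sketch is hedged: it uses the standard damped form of the step, W_B = W_A - (Z'Z + lam I)^(-1) Z' eps(W_A), in which the damping parameter `lam` is part of the usual algorithm but is not written out in the text above; the line model y = a x + b, the data and the function name `lm_step` are hypothetical.

```python
def lm_step(params, xs, ys, lam):
    """One Levenberg-Marquardt update for the line model y = a * x + b."""
    a, b = params
    errors = [a * x + b - y for x, y in zip(xs, ys)]   # error vector at the old weights
    z = [(x, 1.0) for x in xs]                         # rows of Z: derivatives of each error w.r.t. (a, b)
    # Normal equations (Z'Z + lam * I) * step = -Z' * errors for a 2x2 system.
    m11 = sum(r[0] * r[0] for r in z) + lam
    m12 = sum(r[0] * r[1] for r in z)
    m22 = sum(r[1] * r[1] for r in z) + lam
    g1 = sum(r[0] * e for r, e in zip(z, errors))
    g2 = sum(r[1] * e for r, e in zip(z, errors))
    det = m11 * m22 - m12 * m12
    step_a = (-g1 * m22 + g2 * m12) / det
    step_b = (-g2 * m11 + g1 * m12) / det
    return a + step_a, b + step_b


# Hypothetical data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
print(lm_step((0.0, 0.0), xs, ys, lam=0.0))   # undamped step
print(lm_step((0.0, 0.0), xs, ys, lam=1.0))   # damped, shorter step
```

Because this toy model is linear in its parameters, the first-order expansion of the errors is exact and the undamped step lands on the least-squares solution; for a neural network the expansion is only local, which is why the damping term is used to keep the step short.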
In this paper we estimate an MLP network with three hidden layers of three
units each. The learning rate is set at 0.01 and the momentum at 0.3. The number of
epochs is 100 in the first phase and 500 in the second. The AR-MLP is defined as in
the other two neural network models, the AR-GRNN and the AR-RBF. A general MLP
illustration with three hidden layers is presented in figure 3.
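The roles of the learning rate and the momentum can be illustrated with a toy update. This is a sketch under assumptions: it uses the common form of the momentum rule, in which a fraction of the previous weight change is added to the current gradient step, and a hypothetical one-dimensional error E(w) = w^2; only the values 0.01 and 0.3 come from the text.

```python
def momentum_step(w, grad, prev_step, learning_rate=0.01, momentum=0.3):
    """Return (new_w, step) for gradient descent with a momentum term."""
    step = -learning_rate * grad + momentum * prev_step
    return w + step, step


# Toy error surface (assumption): E(w) = w ** 2, so dE/dw = 2 * w.
w, step = 1.0, 0.0
for _ in range(3):
    grad = 2.0 * w
    w, step = momentum_step(w, grad, step)
print(w)
```

The momentum term reuses part of the previous step, which smooths the trajectory when successive gradients point in similar directions.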
Figure 3. MLP architecture with three hidden layers (Ni input units, three hidden layers of units h, No output units)
We also apply unit root tests to examine whether the series are I(0) or not,
in other words, whether they are stationary in levels or only in first or higher
differences. We apply these tests to determine whether we have an ARMA(p,q) or an
ARIMA(p,d,q) process. We apply two tests, the DF-GLS version of the Dickey-Fuller
test (Greene, 2003) and the KPSS test (Kwiatkowski et al., 1992). For the DF-GLS
test we examine the regression with constant and trend,
$$ y_t = \mu + \delta t + \varphi\, y_{t-1} + \varepsilon_t \tag{19} $$
and we test the hypotheses
H0: φ = 1, δ = 0 => yt ~ I(1) with drift
H1: |φ| < 1 => yt ~ I(0) with deterministic time trend
which means that if we fail to reject the null hypothesis the series is
non-stationary in levels, so it is I(1); if we reject the null hypothesis the
series is stationary, I(0). For the KPSS test the hypotheses are
H0: stationary
H1: non-stationary
The KPSS test is based on the residuals from the OLS regression of yt on the
exogenous variables xt. Specifically,
yt = α + βt + γΖt (20)
If γ equals zero, then the process is stationary if β = 0 and trend stationary if
β ≠ 0. Let $e_t$ denote the OLS residuals, $e_t = y_t - \hat{\alpha} - \hat{\beta} t$. The KPSS
statistic is
$$ KPSS = \frac{\sum_{t=1}^{T} E_t^2}{T^2\, \hat{\sigma}^2} $$
where $E_t = \sum_{i=1}^{t} e_i$ are the partial sums of the residuals,
$$ \hat{\sigma}^2 = \frac{\sum_{t=1}^{T} e_t^2}{T} + 2\sum_{j=1}^{L}\left(1 - \frac{j}{L+1}\right) r_j , \qquad r_j = \frac{\sum_{s=j+1}^{T} e_s\, e_{s-j}}{T} $$
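The KPSS statistic with a constant and trend can be computed in a few lines. The following is a minimal sketch, assuming the Bartlett-weighted long-run variance with a user-chosen number of lags L; the function name `kpss_trend_stat`, the lag choice of 4 and the simulated series are hypothetical and do not reproduce the statistics reported in the paper.

```python
import random


def kpss_trend_stat(y, lags):
    """KPSS statistic for trend stationarity with Bartlett-weighted variance."""
    T = len(y)
    t = list(range(1, T + 1))
    mean_t = sum(t) / T
    mean_y = sum(y) / T
    beta = (sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
            / sum((ti - mean_t) ** 2 for ti in t))
    alpha = mean_y - beta * mean_t
    e = [yi - alpha - beta * ti for ti, yi in zip(t, y)]   # OLS residuals

    partial, running = [], 0.0
    for ei in e:
        running += ei
        partial.append(running)                            # partial sums E_t

    sigma2 = sum(ei ** 2 for ei in e) / T                  # residual variance
    for j in range(1, lags + 1):
        r_j = sum(e[s] * e[s - j] for s in range(j, T)) / T
        sigma2 += 2.0 * (1.0 - j / (lags + 1)) * r_j       # Bartlett weights

    return sum(p ** 2 for p in partial) / (T ** 2 * sigma2)


# Simulated series (assumption): white noise and its cumulative sum, a random walk.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(400)]
walk, level = [], 0.0
for shock in noise:
    level += shock
    walk.append(level)

print(kpss_trend_stat(noise, lags=4))
print(kpss_trend_stat(walk, lags=4))
```

The statistic is compared with the critical values of the test: large values lead to rejection of the stationarity null, which is the expected outcome for a random walk.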
To compare the forecasting performance between the models we examine, we
apply two statistical measures, the RMSE (root mean square error) and the MAE
(mean absolute error).
4 Results
Table 1
Unit root tests for real GNP and unemployment rate of USA
Test     Series     t-statistic   Critical values
DF GLS   GNP        -10.466       -3.46 (1%), -2.92 (5%), -2.62 (10%)
DF GLS   Un. rate   -2.43         -3.46 (1%), -2.92 (5%), -2.62 (10%)
Test     Series     LM-stat       Critical values
KPSS     GNP        0.0299        0.216 (1%), 0.146 (5%), 0.119 (10%)
KPSS     Un. rate   0.2375        0.216 (1%), 0.146 (5%), 0.119 (10%)
From table 1 we conclude that real GNP is I(0), so it is stationary in levels
according to both tests. For the unemployment rate, the KPSS test rejects
stationarity in levels, and table 2 shows that the first difference is stationary,
so we conclude that the series is I(1).
Table 2
KPSS unit root test for first difference of unemployment rate
Test   Series     LM-stat   Critical values
KPSS   Un. rate   0.0338    0.216 (1%), 0.146 (5%), 0.119 (10%)
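The decision rules behind Tables 1 and 2 can be written out explicitly. This is a small sketch using the reported statistics and the 5% critical values from the tables; the function names are hypothetical, and the direction of each test (left-tailed for DF GLS, right-tailed for KPSS) follows the standard definition of the two tests.

```python
DFGLS_CRITICAL_5PCT = -2.92
KPSS_CRITICAL_5PCT = 0.146


def dfgls_rejects_unit_root(stat, critical=DFGLS_CRITICAL_5PCT):
    """DF GLS is left-tailed: reject the unit-root null when stat < critical."""
    return stat < critical


def kpss_rejects_stationarity(stat, critical=KPSS_CRITICAL_5PCT):
    """KPSS is right-tailed: reject the stationarity null when stat > critical."""
    return stat > critical


# Statistics as reported in Tables 1 and 2.
print("GNP, DF GLS rejects unit root:", dfgls_rejects_unit_root(-10.466))
print("GNP, KPSS rejects stationarity:", kpss_rejects_stationarity(0.0299))
print("Un. rate, DF GLS rejects unit root:", dfgls_rejects_unit_root(-2.43))
print("Un. rate, KPSS rejects stationarity:", kpss_rejects_stationarity(0.2375))
print("Un. rate first difference, KPSS rejects stationarity:",
      kpss_rejects_stationarity(0.0338))
```

Under these rules GNP is treated as I(0), while the unemployment rate is non-stationary in levels and stationary after first differencing, that is, I(1).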
According to the three information criteria, Akaike, Hannan-Quinn and
Schwarz, we have an ARMA(1,0) process for GNP and an ARIMA(2,1,3) for the
unemployment rate. So we apply an AR(1) for the three neural networks in the case
of GNP and an AR(2) for the unemployment rate. From table 3 we conclude that the
neural network models perform better, with the AR-GRNN having the lowest RMSE and
MAE. So we prefer neural networks for forecasting the real GNP of the USA.
Specifically, the forecasting RMSE of the neural network models is 7 to 17 percent
lower than that of the ARIMA counterpart, and the MAE is 9 to 22 percent lower
than the MAE of ARIMA.
Table 3
Forecasting comparison between ARIMA and neural networks for the real GNP of USA for the period 2007:Q1-2008:Q1
Model       RMSE    MAE
ARMA(1,0)   0.554   0.502
GRNN        0.460   0.393
RBF         0.500   0.433
MLP         0.515   0.455
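The percentage reductions quoted for GNP follow directly from Table 3. The short sketch below recomputes them; the helper name `percent_lower` is hypothetical, and the inputs are the rounded values printed in the table.

```python
def percent_lower(model_error, benchmark_error):
    """Percentage by which model_error is lower than benchmark_error."""
    return 100.0 * (1.0 - model_error / benchmark_error)


# (RMSE, MAE) pairs as reported in Table 3.
arma_rmse, arma_mae = 0.554, 0.502
table3 = {"GRNN": (0.460, 0.393), "RBF": (0.500, 0.433), "MLP": (0.515, 0.455)}

for name, (model_rmse, model_mae) in table3.items():
    print(name,
          round(percent_lower(model_rmse, arma_rmse), 1),
          round(percent_lower(model_mae, arma_mae), 1))
```

The smallest and largest reductions across the three networks give the ranges stated in the text for the RMSE and the MAE.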
In table 4 the conclusions are almost the same as those for GNP. The neural
network models are again more reliable and present lower RMSE and MAE than the
ARIMA(2,1,3). In particular, the AR-MLP and then the AR-GRNN are the best models.
In the case of unemployment the RMSE and MAE of the neural networks are
respectively 45 to 62 and 56 to 67 percent lower than their ARIMA counterparts. In
table 5 we present the actual values of real US GNP and the predicted values
generated by the four models.
Table 4
Forecasting comparison between ARIMA and neural networks for the real unemployment rate of USA for the period 2007:Q1-2008:Q1
Model          RMSE    MAE
ARIMA(2,1,3)   0.217   0.202
GRNN           0.107   0.089
RBF            0.120   0.084
MLP            0.081   0.066
Table 5
Forecasting values for GNP with the four models
Period    Actual   ARMA(1,0)   GRNN    RBF     MLP
2007:Q1   0.164    0.76379     0.860   0.513   0.729
2007:Q2   0.983    0.80734     0.390   0.941   0.893
2007:Q3   1.411    0.82153     1.001   0.695   0.936
2007:Q4   0.462    0.82615     0.209   0.974   0.790
2008:Q1   0.044    0.82765     0.060   0.643   0.864
Table 6
Forecasting values for unemployment rate with ARIMA(2,1,3) and neural networks
Period    Actual   ARIMA(2,1,3)   GRNN     RBF      MLP
2007:Q1   0.567    0.390          0.659    0.629    0.550
2007:Q2   -0.367   -0.176         -0.300   -0.328   -0.280
2007:Q3   0.234    0.391          0.197    0.269    0.384
2007:Q4   -0.100   -0.232         0.100    0.155    -0.144
2008:Q1   0.700    0.342          0.749    0.722    0.667
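The two accuracy measures can be recomputed from the forecasts in Table 6. The sketch below uses the standard definitions, RMSE as the square root of the mean squared forecast error and MAE as the mean absolute forecast error; the function names are hypothetical and the inputs are the rounded values printed in the table.

```python
import math


def rmse(actual, predicted):
    """Root mean square error of the forecasts."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error of the forecasts."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


# Actual and predicted first differences of unemployment, from Table 6.
actual = [0.567, -0.367, 0.234, -0.100, 0.700]
arima = [0.390, -0.176, 0.391, -0.232, 0.342]
mlp = [0.550, -0.280, 0.384, -0.144, 0.667]

print(round(rmse(actual, arima), 3), round(mae(actual, arima), 3))
print(round(rmse(actual, mlp), 3), round(mae(actual, mlp), 3))
```

The recomputed values are close to the RMSE and MAE reported in Table 4 for ARIMA(2,1,3) and MLP; small discrepancies in the third decimal are to be expected because the table entries are rounded.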
In table 6 we present the actual and predicted first differences of US
unemployment with ARIMA(2,1,3) and the three neural network models. Figure 4
presents the forecasts for US real GNP during the period 2007:Q1-2008:Q1, while
figure 5 presents the forecasting results for US unemployment for the same period.
Figure 4. Actual against forecast values for US GNP in the period 2007:Q1-2008:Q1 with: (a) ARMA(1,0), (b) GRNN, (c) RBF and (d) MLP
Figure 5. Actual against forecast values for US unemployment first differences in the period 2007:Q1-2008:Q1 with: (a) ARIMA(2,1,3), (b) GRNN, (c) RBF and (d) MLP
5 Conclusion
We examined the forecasting performance of a traditional time series
method, the ARIMA process, in comparison with three neural network models. We
considered three of the most common models: the generalized regression neural
network (GRNN), the radial basis function (RBF) and the multilayer perceptron
(MLP). We used the autoregressive (AR) form of these neural models, which means
that the input data are just the output data with time lags. We set the AR(p) order
as determined by the unit root tests and the information criteria, so we have
AR(1) for real gross national product (GNP) and AR(2) for the unemployment rate of
the US economy. We show that all neural models outperform the ARIMA process, so we
conclude that traditional time series and econometric methods are not always the
best or the only choice; more sophisticated approaches, such as neural networks,
which are able to capture non-linear processes successfully, should also be
considered.
REFERENCES
Aryal R.D. & Yao-Wu W. (2003). Neural Network Forecasting of the Production
Level of Chinese Construction Industry. Journal of Comparative
International Management, 29, 319-33
Bishop C.M. (1995). Neural Networks for Pattern Recognition. pp. 164-170, 290-291.
Oxford: Clarendon Press
Graupe D. (2007). Principles of Artificial Neural Networks. 2nd Edition, pp. 1.
USA: World Scientific Publishing
Greene W. H. (2003). Econometric Analysis. Fifth Edition, pp. 637-640. New
Jersey: Pearson Education
Gujarati D. (2004). Basic Econometrics. Fourth Edition, pp. 839-840. USA:
McGraw-Hill
Krose B. & van der Smagt P. (1996). An introduction to neural networks. Eighth
edition. pp. 33-37. The University of Amsterdam
Kwiatkowski, D., P.C.B. Phillips, P. Schmidt and Y. Shin (1992). Testing the
Null Hypothesis of Stationarity against the Alternative of a Unit Root.
Journal of Econometrics, 54, 159-178.
Li W., Luo Y., Zhu Q., Liu J. & Le J. (2007). Applications of AR*-GRNN model
for financial time series forecasting. Neural Computing & Applications,
London: Springer
Maasoumi E., Khotanzad A., and Abaye A. (1996). Artificial neural networks for
some macroeconomic series: a first report. Econometric Reviews, 13 (1),
105-122
McNelis P. D. (2005). Neural Networks in Finance: Gaining Predictive Edge in the
Market. pp. 21. USA: Elsevier Academic Press
Swanson, N.R., and White, H. (1997a). A model selection approach to real time
macroeconomic forecasting using linear models and artificial neural
networks. Review of Economics and Statistics, 79, 540-50.
Swanson, N.R., and White, H. (1997b). Forecasting economic time series using
adaptive versus non-adaptive and linear versus nonlinear econometric
models. International Journal of Forecasting, 13, 439-61.
Tkacz G. and Hu, S. (1999). Forecasting GDP Growth Using Artificial Neural
Networks. Working Paper, Bank of Canada, 99-3
Tkacz G. (2001). Neural network forecasting of Canadian GDP growth.
International Journal of Forecasting, 17, 57-69.