ARMA-Stochastic Time Series Modeling

Contents
Abstract
Chapter 1 Critical Reviews:

1.1 Stochastic Time Series Modeling, Simulation & Prediction
1.2 Regression Analysis Time Series Modeling & Simulation
1.3 Chaotic Time Series without Rule Based Fuzzy Logic (FL), Mackey Glass Simulation with FL and Prediction
1.4 Rule Based Fuzzy Logic Time Series Prediction, Modeling and Simulation
1.5 Artificial Neural Network Time Series (ANNTS) Modeling, Simulation & Prediction
1.6 Thesis Plan

1.1 Stochastic Time Series Modeling, Simulation & Prediction

A method of forecasting wind power output a few hours in advance, from a wind power generator that supplies a power and energy system, is required to ensure efficient utilization of the power. Time series modeling of wind speed has been the subject of many discussions because of the interest in wind as an alternative form of energy. When the records of wind speed are incomplete or of too short a duration, or when the handling and storage of large volumes of data are not desirable, a time series model is needed. Since wind power is a function of wind speed, simulations of power are generally derived from simulations of speed. Wind speed simulations can be done with Monte Carlo methods that rely solely on the estimated parameters of the marginal distribution of wind speeds.

The multiplicative ARMA (autoregressive moving average) models used to generate hourly series of global radiation by Mora-Lopez and Sidrach-de-Cardona (1998), the stochastic simulation using ARIMA (autoregressive integrated moving average) modeling of solar irradiation by Craggs et al (1999) and the time dependent autoregressive Gaussian model (TAG) for generating synthetic hourly radiation by Aguiar and Collares Pereira (1992) are important contributions from the modeling and simulation point of view. Lalarukh and Jafri (1999) used an ARMA process on hourly global radiation data, performed stochastic modeling through an MTM (Markov transition matrix) and generated synthetic sequences of hourly global solar irradiation for Quetta, Pakistan. They found the MTM approach relatively better as a simulator than ARMA modeling. However, their ARMA analysis to simulate and forecast hourly averaged wind speed for Quetta, Pakistan also yielded good results Lalarukh and Jafri (1997). Several non-Gaussian distributions have been suggested as appropriate models for wind speed.
These models include the inverse Gaussian distribution Bardsley (1980), the log normal distribution Luna and Church (1974), the gamma distribution Sherlock (1951), the Weibull distribution Hennessey (1977); Justus et al (1976); Stewart and Essenwanger (1978) and Takle and Brown (1978), and the squared normal distribution Carlin and Haslett (1982). We have seen from our previous studies Nasir et al (1991); Raza and Jafri (1987) and Brown (1981) that the Weibull distribution fits the actual wind speed frequencies quite well. However, the use of the inverse Gaussian distribution on wind data Bardsley (1980) ignores the positive correlations between consecutive observations of wind speed. Failure to take this autocorrelation into account leads to underestimation of the variances of the time averages of wind speeds. Moreover, the long runs of high and low wind speeds that are characteristic of such data do not occur frequently enough in simulated data when wind speeds are assumed to be uncorrelated over time. To overcome this problem, Chou and Corotis (1981) and Goh and Nathan (1979) attempted to incorporate autocorrelation into wind speed models, but they did not consider the Gaussian shape of transformed wind speed distributions and its corresponding statistics. Some of the studies have neglected the non-Gaussian shape of the wind speed distribution. Brown et al (1984) suggested methods to take into account the autocorrelated nature of wind speed, the diurnal non-stationarity and the non-Gaussian shape of the wind speed distribution so that forecasting of hourly averaged wind speed could be done. Brown et al (1982), in their previous study, also indicated the need for standardization to remove diurnal non-stationarity. Diurnal variations in wind speed occur as a natural phenomenon Jafri et al (1989) and, as mentioned in a paper by Kamal and Jafri (1996), standardization corresponds to smoothing of a profile, such as that of a Gaussian distribution obtained after transforming a non-Gaussian shape to an approximately Gaussian shape, i.e., by bringing scattered data points close to the profile. We accomplished this standardization procedure in the present study for hourly averaged wind data of Quetta, Pakistan for a period of twenty years, i.e., 1985-2004, before using the ARMA process. Jafri (1996)a established that the hierarchical random process is a Markovian random process, which can be characterized by a scaling probability distribution.
A generating function for such a process was obtained. These observations can be successfully applied to chaotic time series Jafri (1996)b to overcome the non-stationarity in the ARMA process, but this would require handy stochastic simulation techniques. Jafri (1996)b suggested that the chaotic time series is deterministic in both Bayesian and non-Bayesian statistics. Jafri (1995) developed a first order Markov transition matrix (MTM) for the non-Gaussian wind speed of Quetta for 1985 and suggested a Gaussian form of MTM sequence to yield HAWS (hourly averaged wind speed) sequences. The same work was extended to wind speed data for a period of twenty years, i.e., 1985-2004. The simulation of wind data using MTM Jafri (2001) is relatively difficult compared to the simulation of solar radiation data Lalarukh and Jafri (1999). The number of iterations exceeds a certain limit, causing the HAWS and DAWS (daily averaged wind speed) sequences to become cumbersome and entangled. Jafri (1995; 2001) also found the autocorrelation coefficient for wind data, which shows levels of persistence in wind speed frequencies and wind speed magnitudes when compared with diurnal variations over DAWS sequences.

Blanchard and Desrochers (1984) and Brown et al (1984) employed a class of parametric time series models called autoregressive moving average (ARMA) processes of Box and Jenkins (1976). Such processes have been employed to model many meteorological time series Katz and Skaggs (1981). The model of Blanchard and Desrochers (1984) takes into account high autocorrelation and allows a time series to be generated which preserves all the main characteristics of the data; it does not require any assumption about the wind speed distribution. In fact, a larger class of seasonal models includes ARIMA models Blanchard and Desrochers (1984). Sfetsos (2002) studied linear ARIMA models and feed forward artificial neural networks (FFANN). In the ARIMA process, he selected the model order by minimizing the evaluation set error. He suggested multi-step forecasting and subsequent averaging to generate mean hourly predictions of wind data. The ARIMA models have been critically analyzed by Jain and Lungu (2002). They considered both non-seasonal and seasonal ARIMA models by using stochastic components. They also sought to determine the persistence patterns, if any, of the stochastic components.
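The earlier remark that ignoring autocorrelation underestimates the variance of time averages can be illustrated with a small calculation. The sketch below is not taken from any of the cited studies: it assumes a stationary first order autoregressive series whose lag-k autocorrelation is phi**k, and the value phi = 0.8 and the 24-hour averaging window are illustrative assumptions only.

```python
def variance_of_mean_iid(n, sigma2=1.0):
    """Variance of the mean of n uncorrelated values with variance sigma2."""
    return sigma2 / n


def variance_of_mean_ar1(n, phi, sigma2=1.0):
    """Variance of the mean of n consecutive values of a stationary AR(1)
    series with lag-k autocorrelation phi**k and marginal variance sigma2.

    Var(mean) = sigma2 / n * [1 + 2 * sum_{k=1}^{n-1} (1 - k/n) * phi**k]
    """
    correction = 1.0 + 2.0 * sum((1.0 - k / n) * phi ** k for k in range(1, n))
    return sigma2 / n * correction


n_hours = 24   # averaging window: one day of hourly values
phi = 0.8      # assumed lag-1 autocorrelation, for illustration only
naive = variance_of_mean_iid(n_hours)
actual = variance_of_mean_ar1(n_hours, phi)
print(f"variance ratio (autocorrelated / uncorrelated): {actual / naive:.2f}")
```

With positive autocorrelation the ratio is greater than one, which is the underestimation referred to above when the data are treated as uncorrelated.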
The model of Chou and Corotis (1981) is based on the Weibull distribution and does not require stationarity in the data. McWilliams and Sprevak (1982)a described a new version of an existing time series modeling procedure Box and Jenkins (1976) from which the distributions of wind speeds and wind directions are obtained McWilliams et al (1979) and McWilliams and Sprevak (1982)b. Their model incorporates the diurnal variations observed in wind speed in such a manner that the time series of wind speed components remain stationary; the sample autocorrelation functions for the series have identical stochastic behavior as far as the second order statistics are concerned, thus reducing the problem to modeling a single Gaussian series. This model is corrected for autocorrelation functions to account for diurnal variations. One point is obvious: they did not use a transformation of hourly averaged wind speed. Instead, they considered the annual deterministic variations μ(t) and σ²(t), which are modeled by a harmonic series representation to account for the diurnal variation of wind speed. With regard to our conjecture, diurnal variation Jafri et al (1989) should be employed in model development in a manner similar to McWilliams and Sprevak (1996)b.

We followed the approach of Daniel & Chen (1991), which consists of first fitting ARMA processes of various orders to hourly averaged wind speed (HAWS) data which have been transformed to make their distribution approximately Gaussian and standardized to remove the so-called diurnal non-stationarity. Although we did not favor the procedures of transformation and standardization, we preferred this approach because the model is capable of using wind data of more than one year. The primary advantage of including more than one year of data in the model development is the increased reliability of the estimates of the model parameters. We used MINITAB (version 11) for ARMA, non-seasonal ARIMA and seasonal ARIMA modeling and simulation. ARIMA models are used to model a special class of non-stationary series. Seasonal ARIMA (SARIMA) models are used to incorporate cyclic components in models. In other words, ARIMA models are, in theory, the most general class of (parsimonious) models for forecasting a time series which can be stationarized by transformations such as differencing and logging. SARIMA has the same structure as ARIMA. We used both non-seasonal and seasonal models on hourly averaged wind data of 1985-2004.
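The two preprocessing steps described above (a transformation toward an approximately Gaussian shape, followed by standardization to remove the diurnal cycle) can be sketched as follows. This is a minimal illustration, not the procedure of Daniel & Chen (1991) or of MINITAB: the power exponent, the toy data and the function names `power_transform` and `standardize_by_hour` are assumptions made for this example.

```python
import statistics


def power_transform(speeds, exponent):
    """Raise each non-negative wind speed to a fixed power."""
    return [v ** exponent for v in speeds]


def standardize_by_hour(values, hours_per_day=24):
    """Subtract the mean and divide by the standard deviation computed
    separately for each hour of the day (the record starts at hour 0)."""
    means, sds = [], []
    for hour in range(hours_per_day):
        group = values[hour::hours_per_day]
        means.append(statistics.mean(group))
        sds.append(statistics.pstdev(group))
    if any(sd == 0 for sd in sds):
        raise ValueError("an hour-of-day group has zero spread")
    return [(v - means[i % hours_per_day]) / sds[i % hours_per_day]
            for i, v in enumerate(values)]


# Toy record: 3 days with 4 "hours" per day, purely illustrative values.
toy_speeds = [2.0, 4.0, 6.0, 3.0,
              2.5, 5.0, 7.0, 3.5,
              1.5, 4.5, 6.5, 2.5]
transformed = power_transform(toy_speeds, exponent=0.5)   # assumed exponent
standardized = standardize_by_hour(transformed, hours_per_day=4)
```

After this step every hour-of-day group has zero mean and unit standard deviation, so the ARMA process is fitted to a series without the deterministic daily pattern.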
For non-seasonal ARIMA modeling and simulation, the six options, i.e., random walk (ARIMA(0,1,0)), differenced first order autoregressive model (ARIMA(1,1,0)), ARIMA(0,1,1) with constant, linear exponential smoothing (LES) without constant (ARIMA(0,2,1) or (0,2,2)) and mixed ARIMA(1,1,1), are tried for each month and for the four seasons. The non-seasonal ARIMA(0,1,1) model, which deals with exponential growth and a constant, incorporates the simple exponential smoothing (SES) model. The MA(1) coefficient corresponds to 1 - α in the SES formula. The term α is called the training parameter. For LES without constant, the MA(1) coefficient corresponds to 2β, where β = 1 - α. For seasonal ARIMA (SARIMA) modeling and simulation, the seven options, i.e., SARIMA(0,1,1)x(0,1,1)12, SARIMA(0,0,0)x(0,1,0)12 with constant, SARIMA(0,1,0)x(0,1,0)12, SARIMA(1,0,1)x(0,1,1)12 with constant, SARIMA following SES with α = 0.4772 and Brown's SARIMA (LES) with α = 0.2106, are tried for each month only. The most often used SARIMA model is SARIMA(0,1,1)x(0,1,1)12, which strictly follows seasonal exponential smoothing. SARIMA(0,1,0)x(0,1,0)12 is also known as the seasonal random trend (SRT) model. The alternative to the SRT model is the seasonal random walk model, i.e., SARIMA(1,0,0)x(0,1,0)12. There is, of course, a difference between seasonal and simple exponential models. The value β = 1 - α is used in exponential smoothing formulas. The best option is selected by considering the minimum chi-squared value at the 5% significance level.
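The correspondence between simple exponential smoothing and ARIMA(0,1,1) without constant can be checked numerically. The sketch below assumes the Box-Jenkins sign convention, in which the one-step ARIMA(0,1,1) forecast is the last observation minus θ times the last forecast error; with θ = 1 - α this reproduces the SES recursion. The function names and the short series are illustrative assumptions, while α = 0.4772 is the SES value quoted above.

```python
def ses_forecasts(y, alpha, initial):
    """One-step SES forecasts: F[t+1] = F[t] + alpha * (y[t] - F[t])."""
    forecasts = [initial]
    for obs in y:
        forecasts.append(forecasts[-1] + alpha * (obs - forecasts[-1]))
    return forecasts


def arima011_forecasts(y, theta, initial):
    """One-step ARIMA(0,1,1) forecasts without constant:
    F[t+1] = y[t] - theta * e[t], where e[t] = y[t] - F[t]."""
    forecasts = [initial]
    for obs in y:
        error = obs - forecasts[-1]
        forecasts.append(obs - theta * error)
    return forecasts


alpha = 0.4772                                  # SES value quoted in the text
series = [5.1, 4.8, 6.0, 5.5, 7.2, 6.9, 6.1]    # illustrative wind speeds
ses = ses_forecasts(series, alpha, initial=series[0])
arima = arima011_forecasts(series, 1.0 - alpha, initial=series[0])
max_gap = max(abs(a - b) for a, b in zip(ses, arima))
print(f"largest difference between the two forecast sequences: {max_gap:.2e}")
```

The two sequences agree up to floating point rounding, which is why the MA(1) coefficient can be read as 1 - α.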
1.2 Regression Analysis Time Series Modeling & Simulation

Regression is strictly correlation analysis, carried out with time series and sometimes without them. The modern interpretations and fundamental concepts of regression analysis are thoroughly presented by Gujarati (1988), Siegel (1997), Rawlings (1988) and Newton (1988). All kinds of regression analysis can be accomplished by the least squares regression technique, which minimizes the discrepancy between the data points and the fit Chapra and Canale (1990). It comprises linear regression (LR), polynomial regression (PR), multiple linear regression (MLR), general linear least squares (GLLS) and non-linear regression (NLR). For NLR, the least squares technique is used. The Gauss-Seidel technique cannot be employed because the normal equations are not diagonally dominant. NLR analysis is sometimes very useful for fitting, but it also requires minimization of the sum of the squares of the residuals (SSR). This analysis is carried out on a single independent variable only; therefore, multiple interrelated parameters, such as in MLR, cannot be studied. However, NLR analysis has the advantage over PR because it exploits iteration. For NLR analysis the Gauss-Newton method has some shortcomings such as slow convergence, wide oscillations, i.e., changing directions, and sometimes divergence Draper and Smith (1981). These shortcomings were overcome by other methods such as the steepest descent and the Levenberg-Marquardt techniques Trabea and Shaltout (2000). However, PR can be applied in some cases, especially when the data are distributed like a parabola or a cubic polynomial, because it depends on a single variable, such as PRATS in our case.

Trabea and Shaltout (2000) studied the correlation of global solar radiation with meteorological parameters such as mean daily maximum temperature, mean daily relative humidity, mean daily sea level pressure, mean daily vapour pressure and hours of bright sunshine, by using MLR analysis. The correlation, the regression coefficient and the standard error were estimated, but they did not consider the interdependence of the meteorological parameters. Rapti (2000) developed mathematical correlations of atmospheric turbidity with specific humidity and of diffuse radiation with atmospheric turbidity for maritime and continental air masses. This study does not include any statistical correlations. Ilyas and Nasir (2000) developed a relationship between humidity and temperature and found a Gaussian trend. The best fit to the experimental data, as suggested by them, is as follows:
Hth =
Ho 2 ln e
To k
_____________________ (1)
where Hth is the theoretical humidity, Ho and To are the experimental values of humidity and temperature, respectively, and k is a constant for the fit. Hussain, Jafri and Kamal [10] used regression modeling of weather data and found PRATS relatively better than PR. Ilyas (2000) found an inverse Gaussian relationship for the percentage cumulative frequency of sunshine hours and solar energy, i.e.,
2 E fcum (%) = k exp 0.5 Eth

where
___________ (2)
2 x E k = exp x ln fcum and x = -----------(3) n Eth

In eq. 2, the symbols E, Eth, x and n represent solar energy, threshold solar energy, the square of the ratio of solar energy to its threshold value, and the total number of data, respectively. The overall behavior of humidity on temperature and of solar energy on its cumulative frequency of sunshine hours shows a reversal, i.e., the former is Gaussian and the latter is inverse Gaussian. We tried to establish the best fit to our diverse data by using regression analysis. Kamal and Jafri (1999) developed stochastic modeling and generated synthetic sequences of hourly global solar irradiation. They also found the Markov transition matrix (MTM) approach relatively better as a simulator than the autoregressive moving average (ARMA) process. The time series models to simulate and forecast hourly averaged wind speed (HAWS) were presented by Kamal and Jafri (1997). They also used simulation of the Weibull distribution of HAWS Kamal and Jafri (1996). Using the triangulation method and statistical correlation from regression equations, solar radiation was estimated at locations where there was no observatory, and the estimates were found to be very reliable Raza and Kamal (2002). Jafri recently performed fuzzy logic time series (FLTS) prediction modeling on HAWS (2007). Regression modeling, despite its many shortcomings, is a better predictor. Fuzzy regression analysis is defined as a model which includes fuzziness (uncertainty) in itself Tanaka and Ishibuchi (1992). Ozawa et al (1997) used the fuzzy autoregressive (AR) model to describe fuzzy time series, which cannot be dealt with by stochastic models. Fuzzy time series analysis was proposed by Watada (1992).
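The iterative character of non-linear regression mentioned in this section can be illustrated with a bare Gauss-Newton loop that minimizes the sum of the squares of the residuals for a model in a single independent variable. This is only a sketch: the model y = a exp(b x), the synthetic data, the starting values and the function name `gauss_newton_exp` are assumptions chosen for the example, and no step control is included, so the shortcomings noted above (oscillation or divergence from poor starting values) apply to it as well.

```python
import math


def gauss_newton_exp(xs, ys, a, b, iterations=20):
    """Fit y = a * exp(b * x) by Gauss-Newton iteration.

    Each pass solves the 2x2 normal equations (J^T J) d = J^T r for the
    parameter update d, where r are the residuals and J the Jacobian."""
    for _ in range(iterations):
        s_aa = s_ab = s_bb = g_a = g_b = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = y - a * e                 # residual
            ja, jb = e, a * x * e         # partial derivatives wrt a and b
            s_aa += ja * ja
            s_ab += ja * jb
            s_bb += jb * jb
            g_a += ja * r
            g_b += jb * r
        det = s_aa * s_bb - s_ab * s_ab
        if abs(det) < 1e-300:             # singular normal equations
            break
        a += (g_a * s_bb - g_b * s_ab) / det
        b += (s_aa * g_b - s_ab * g_a) / det
    return a, b


# Synthetic, noise-free data generated from a = 2.0 and b = 0.7.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.7 * x) for x in xs]
a_hat, b_hat = gauss_newton_exp(xs, ys, a=1.5, b=0.5)
print(f"a = {a_hat:.4f}, b = {b_hat:.4f}")
```

Steepest descent or a Levenberg-Marquardt damping term would replace the plain update step in a more robust version.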
1.3 Chaotic Time series without Rule Based Fuzzy logic (FL), Mackey Glass Simulation with FL and Prediction
The original fuzzy logic (FL) pioneered by Lotfi Zadeh (1965) has been around for forty years, and yet it is unable to handle uncertainties. Zadeh introduced the concept of a fuzzy set, a set whose boundary is not sharp or precise. This concept contrasts with the classical concept of a set, recently called a crisp set, whose boundary is required to be precise. Probability and fuzzy sets describe different kinds of uncertainty. Probability deals with the likelihood of relevant events or with the expectation of a future event based on something now known (the outcome of a random event), whereas fuzziness is not uncertainty of expectation. Fuzzy set theory, on the other hand, is not concerned with events; it is concerned with concepts. A rule based fuzzy logic system (FLS) is a powerful design methodology to minimize the effect of uncertainty Mendel (2001). Artificial neural networks (ANN) and fuzzy logic (FL) are model free designs. The fuzzy logic rules are extracted from numerical data and are then combined with linguistic knowledge. The richness of fuzzy logic is that there are enormous numbers of possibilities that lead to many non-linear mappings of an input data vector into a scalar output. In model free approaches, the associated model is a representation of an architecture to solve a specific problem. With a model approach in fuzzy logic, one can endeavor to reach the truth or a close approximation to it. FLSs employ 500 rules for the one pass (OP) design and sixteen rules for the back propagation (BP) steepest descent design, respectively. We followed a model free approach, i.e., fuzzy logic on hourly wind speed data, to predict future values, i.e., consequents from antecedents (past values). A single stage forecasting for the chaotic wind data time series will be used.
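The contrast between a crisp set and a fuzzy set can be made concrete with two membership functions. The sketch below is illustrative only: the concept "moderate wind", its limits of 4 to 8 m/s and the Gaussian centre and spread are assumed values, not quantities taken from the wind data of this study.

```python
import math


def crisp_membership(x, low, high):
    """Crisp set: an element either belongs (1.0) or does not (0.0)."""
    return 1.0 if low <= x <= high else 0.0


def gaussian_membership(x, mean, sigma):
    """Fuzzy set with a Gaussian membership function: the grade of
    membership falls off smoothly with distance from the centre."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


# Hypothetical concept "moderate wind": crisp limits 4 to 8 m/s,
# fuzzy version centred at 6 m/s with a spread of 2 m/s.
for speed in (6.0, 7.9, 8.1, 10.0):
    crisp = crisp_membership(speed, 4.0, 8.0)
    fuzzy = gaussian_membership(speed, 6.0, 2.0)
    print(f"{speed:5.1f} m/s  crisp = {crisp:.0f}  fuzzy = {fuzzy:.3f}")
```

The crisp grade jumps from 1 to 0 between 7.9 and 8.1 m/s, while the fuzzy grade changes only slightly, which is the imprecise boundary referred to above.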
1.4 Rule based Fuzzy Logic Time series Prediction, Modeling and Simulation
Rule based fuzzy logic systems (FLS), a powerful design methodology, minimize the effect of uncertainty Mendel (2001). The two most popular FLSs used by engineers today are the Mamdani and Takagi-Sugeno-Kang (TSK) systems. Both are characterized by IF-THEN rules and have the same antecedent structures. They differ in the structure of the consequents. The consequent of a Mamdani rule is a fuzzy set, whereas the consequent of a TSK rule is a function. The type-1 TSK FLSs have been widely used in control and other applications Terano et al (1994). The output of a type-1 TSK forecaster is obtained without a defuzzification step. Liang and Mendel (1999; 2000) developed type-2 TSK FLSs. The FLS forecasters comprise singleton type-1 (with virtually no uncertainties), non-singleton type-1 (with uncertainties), singleton type-2, type-1 non-singleton type-2, type-2 non-singleton type-2, type-1 TSK and type-2 TSK Mendel (2001). The rule based FLSs, both type-1 and type-2, handle uncertainties because modeling and minimization of uncertainties can be accomplished. If all uncertainties disappear, type-2 FL reduces to type-1 FL, in much the same manner that if randomness disappears, probability reduces to determinism.

For basic singleton type-1 FLSs, we assume that there are no uncertainties; all fuzzy sets are of type-1, and measurements are perfect and treated as crisp values, i.e., as singletons. In a non-singleton FLS, by contrast, the measurements are not treated as crisp values, i.e., uncertainties are inherently present. A FLS that is described completely in terms of type-1 fuzzy sets is called a type-1 FLS. Type-1 FLSs are unable to directly handle rule uncertainties, because they use type-1 fuzzy sets that are certain. Therefore, a better way to handle uncertainties is to use a type-2 FLS. However, a non-singleton type-1 FLS is a type-1 FLS whose inputs are modeled as type-1 fuzzy numbers; hence, it can be used to handle uncertainties. Moreover, the type-1 FL, in its applications, deciphers rule based systems as a powerful design methodology. The rules of a non-singleton type-1 FLS are the same as those of a singleton type-1 FLS Mendel (2001). The difference lies in the fuzzifier, which treats the inputs as type-1 fuzzy sets, and in the effect of this on the inference block. The output of the inference block will again be a type-1 fuzzy set. The type-1 FLS, both singleton and non-singleton, is shown in Fig. 1. The defuzzifiers described for a singleton type-1 FLS therefore apply as well to a non-singleton type-1 FLS Mendel (2001). We know that non-stationarity (randomness) inherently exists in our wind data Jafri (2005); Kamal and Jafri (1996); therefore, uncertainties or randomness cannot be reduced. It can be handled properly with a non-singleton type-1 FLS; therefore, there appears to be no reason to use a type-2 FLS. We recently performed fuzzy logic (FL) time series prediction modeling on hourly averaged wind speed (HAWS) data of 1985-2004 for Quetta, Pakistan, and used Mackey-Glass simulation. We shall use the same results of wind data with the applications of rule based type-1 FLS.
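The difference between the singleton and non-singleton fuzzifiers can be sketched for the common special case of Gaussian antecedent membership functions, product inference and a height defuzzifier. This is an illustrative sketch and not the MATLAB software used in this study. It assumes that every noisy input is modeled by a Gaussian membership function with one common standard deviation `input_sigma`; under product inference the effect is that the antecedent variance is enlarged by the input variance, and `input_sigma = 0` recovers the singleton case. The rule base (two Gaussian sets per antecedent, giving 16 rules, as in the BP design described later in this section) and all numerical values are hypothetical.

```python
import math
from itertools import product


def firing_strength(x, means, sigmas, input_sigma=0.0):
    """Firing strength of one rule with Gaussian antecedents and product
    inference; input_sigma = 0 corresponds to singleton fuzzification."""
    strength = 1.0
    for xi, m, s in zip(x, means, sigmas):
        spread2 = s * s + input_sigma * input_sigma
        strength *= math.exp(-0.5 * (xi - m) ** 2 / spread2)
    return strength


def fls_output(x, rules, input_sigma=0.0):
    """Height defuzzifier: firing-strength weighted average of the
    consequent centres of all rules."""
    num = den = 0.0
    for means, sigmas, y_center in rules:
        f = firing_strength(x, means, sigmas, input_sigma)
        num += y_center * f
        den += f
    if den == 0.0:
        raise ValueError("no rule fired for this input")
    return num / den


# Hypothetical rule base: two Gaussian sets (centres 0.3 and 0.7) for each
# of four antecedents gives 2**4 = 16 rules. The consequent centres are
# set to the mean of the antecedent centres purely for illustration.
low, high, sigma = 0.3, 0.7, 0.2
rules = []
for means in product((low, high), repeat=4):
    rules.append((means, (sigma,) * 4, sum(means) / 4.0))

x = (0.35, 0.40, 0.55, 0.60)
y_singleton = fls_output(x, rules)
y_nonsingleton = fls_output(x, rules, input_sigma=0.1)
print(f"singleton: {y_singleton:.4f}   non-singleton: {y_nonsingleton:.4f}")
```

The rules and the defuzzifier are shared by both variants; only the fuzzifier changes, which is the point made above.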
We used the MATLAB M-files which are available at: URL:http://sipi.usc.edu/~mendle/software. The M-files are available in three folders: type-1 FLS, general type-2 FLSs and interval type-2 FLSs. We used, in this study, the following type-1 FLSs:

Singleton Mamdani type-1 FLS
sfls_type1.m: computes the output(s) of a singleton type-1 FLS when the antecedent membership functions are Gaussian.
train_fls_type.1.m: tunes the parameters of a singleton type-1 FLS when the antecedent membership functions are Gaussian, using some input-output training data.

Non-singleton Mamdani type-1 FLS
nsfls_type1.m: computes the output(s) of a non-singleton type-1 FLS when the antecedent membership functions are Gaussian and the input sets are Gaussian.
train_nsfls_type1.m: tunes the parameters of a non-singleton type-1 FLS when the antecedent membership functions are Gaussian, using some input-output training data.

We avoid the extraneous matter on the development and historical background of rule-based FLSs because we are concerned only with the use of FLSs in time series. An exhaustive literature survey and critical review of rule-based FLSs is available in the book by Mendel (2001). However, we shall deliberate on the fundamental rules extracted from the data under consideration. The rules in fuzzy logic time series are usually extracted by designing the FLSs. Prior to 1992, all FLSs reported in the open literature fixed the parameters, such as the type of fuzzification, composition, implication, t-norm (operators for fuzzy intersection), defuzzification (which produces a crisp output) and membership functions, arbitrarily, e.g., the locations and spreads of the membership functions were chosen by the designer independently of the numerical training data. Then, at the first IEEE conference on Fuzzy Systems, held in San Diego in 1992, three different groups of researchers, i.e., Horikawa et al (1992), Jang (1992) and Wang and Mendel (1992), presented the same idea: tune the parameters of a FLS using the numerical training data. Since that time, quite a few adaptive training procedures have been published. Because tuning of free parameters had been done in feed forward neural networks (FFNN) long before it was done in a FLS, a tuned FLS has also come to be known as a neural fuzzy system.
Designing a FLS Mendel and Mouzouris (1997) can be viewed as approximating a function or fitting a complex surface in a multidimensional space. Given a set of input-output pairs, tuning is essentially equivalent to determining a system that provides an optimal fit to the input-output pairs with respect to a cost function (tuning algorithm). Utilizing concepts from real analysis, Mouzouris and Mendel have proven that a non-singleton FLS can uniformly approximate any continuous function on a compact set. Although the proof of approximation Mendel and Mouzouris (1997) provides some insight, it does not tell us how to choose the parameters of the non-singleton FLS, nor does it tell us how many basis functions will be needed to achieve such performance. The latter are accomplished through design. The design of FLSs may use one-pass (OP), least squares, back-propagation (steepest descent, BP), SVD-QR (a matrix tool in numerical linear algebra used in signal processing and for extracting, reducing and modeling fuzzy rules) and iterative design methods. The forecasting of time series with rule-based FLSs employs only two of these methods, i.e., the one pass (OP) and back propagation (BP) methods. The OP design constructs 500 rules for the antecedent and consequent membership functions. We set the standard deviation equal to 0.1 for all Gaussians in a pre-defined OP design. The OP design is, however, exhaustive compared to the BP design. In contrast, the BP design constructs only 16 rules for the antecedent and consequent membership functions. The initial values of the standard deviations of the Gaussian membership functions are all set equal to 0.5240 in a pre-defined BP design. The BP design is, in many respects, better than the OP design Mendel (2001). The predefined values of all four antecedent membership functions and of the centers of the consequent membership functions (y^l, height defuzzifier) for each of the corresponding 16 rules in a BP design are used in the form of a matrix as an input. We take the height defuzzifier (y^l, the centers of the consequent membership functions) to be a random number from the interval (0,1). After training with the BP design, the FLS forecaster was fixed. We use the learning parameter α = 0.2 in the BP design. With tractable learning laws, we set the learning parameters. Alpha stable statistics model impulsiveness as a parameterized family of probability density functions. Additive fuzzy systems can filter impulsive noise from signals. With α < 2 one gets impulsive noise, and the noise has infinite variance.
The alpha in statistics is an exponent parameter. With α = 2, we get the classical Gaussian case, i.e., exponential tail and finite variance. The predefined initial mean (center) values of the antecedent membership functions, along with the height defuzzifiers (mean values of the consequent membership functions) and the standard deviations of the Gaussian antecedent membership functions, in the form of a matrix, as shown in tables 1 and 2, are used for determining the values of the singleton consequent membership functions, i.e., f_s(s_k), for 600 hourly training wind data and 120 or 144 testing wind data, respectively. The predefined final mean (center) values of the antecedent membership functions, along with the height defuzzifiers (mean values of the consequent membership functions) and the standard deviations of the Gaussian antecedent membership functions, in the form of a matrix, after six epochs of training, as shown in tables 2 and 3, are used for determining the values of the non-singleton consequent membership functions, i.e., f_ns(s_k), for 600 hourly training data and 120 or 144 testing data, respectively. In both cases, the 600 training wind data and 120 or 144 testing data for all four antecedent membership functions are used as an input matrix, X, in sfls_type1.m and nsfls_type1.m, respectively. For training as well as testing data, we calculated the predicted values Jafri (2005); Jafri et al (2005). It is difficult to reproduce all predicted values and the values of the consequent membership functions for singleton and non-singleton type-1 FLSs in this manuscript. Therefore, we will compare the root mean square errors, i.e., RMSEs (BP) with RMSEns (BP), only for testing data:

$$\mathrm{RMSE}_{s} = \sqrt{\frac{1}{120}\sum_{k=600}^{719}\left[s(k+1) - f_{s}(\mathbf{x}(k))\right]^{2}}, \qquad \mathrm{RMSE}_{ns} = \sqrt{\frac{1}{120}\sum_{k=600}^{719}\left[s(k+1) - f_{ns}(\mathbf{x}(k))\right]^{2}} \qquad (4)$$

where

$$\mathbf{x}(k) = [x(t-18),\ x(t-12),\ x(t-6),\ x(t)]^{T}, \qquad s(k+1) = x(t+6) \qquad (5)$$

It is worth mentioning that the training pairs are obtained with the testing data; therefore, the analysis of testing data will be the same as for training data. We input the predefined initial mean values of all antecedent membership functions (table 1) in the case of a singleton type-1 FLS because we assume that there are no uncertainties in the data. However, we cannot totally ignore the noisy measurement environment; therefore, we tested our final FLS forecasters on noisy testing data, i.e.,
$$x(k) = s(k) + n(k) \qquad (6)$$
where n(k) is 0 dB (decibel) uniformly distributed noise. We accomplished this task for a Monte Carlo set of 60 realizations. After each epoch we used the testing data to see how each FLS performed by computing RMSEs (BP) and RMSEns (BP), respectively, using equation (4). This entire process was repeated 60 times using 60 independent sets of mean and standard deviation of 720 or 744 hourly averaged wind data. The predefined RMSEs (BP) Mendel (2001) for each of the six epochs of tuning are:

RMSEs (BP) = {0.0548, 0.0431, 0.0322, 0.0261, 0.0237, 0.0232}    (7)

The non-singleton FLS shares most of the same parameters as the singleton FLS, so we shall use the partially dependent BP design approach. In the BP design we use only two fuzzy sets for each of the four antecedents, so that there are only 16 rules. Each rule is characterized by eight antecedent membership function parameters (the mean and standard deviation for each of the four Gaussian membership functions) and one consequent parameter, y. More specifically, we initially chose the means of each antecedent's two Gaussian membership functions as m_x - 2σ_x or m_x + 2σ_x, respectively, and the standard deviations of these membership functions as 2σ_x. For the non-singleton type-1 FLS, we modeled each of the four noisy input measurements using a Gaussian membership function. Two choices are possible: (1) use a different standard deviation for each of the four input measurement membership functions, or (2) use the same standard deviation for each of the four input measurement membership functions. We tried both approaches and got similar results because the additive noise n(k) is stationary. The predefined average values and standard deviations Mendel (2001) of RMSEs (BP) and RMSEns (BP) are shown in fig. 2 for each of the 6 epochs.
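The data handling described in this section (four lagged inputs predicting the value six hours ahead, additive uniform noise at 0 dB, and the RMSE over the last 120 testing pairs) can be sketched as follows. This is a hedged illustration, not the procedure of the MATLAB M-files: the sinusoidal series is a stand-in for the wind data, the persistence forecaster is a placeholder for a trained FLS, and 0 dB is assumed to mean that the noise variance equals the signal variance. All function names are hypothetical.

```python
import math
import random
import statistics


def make_pairs(series, lags=(18, 12, 6, 0), horizon=6):
    """Build (input vector, target) pairs: the inputs are the values at
    t-18, t-12, t-6 and t, and the target is the value at t+6."""
    pairs = []
    for t in range(max(lags), len(series) - horizon):
        pairs.append(([series[t - lag] for lag in lags], series[t + horizon]))
    return pairs


def add_uniform_noise(series, snr_db=0.0, seed=0):
    """Add zero-mean uniform noise; at 0 dB the noise variance is assumed
    to equal the variance of the signal."""
    rng = random.Random(seed)
    noise_var = statistics.pvariance(series) / (10.0 ** (snr_db / 10.0))
    half_width = math.sqrt(3.0 * noise_var)   # Var of U(-a, a) is a**2 / 3
    return [s + rng.uniform(-half_width, half_width) for s in series]


def rmse(targets, predictions):
    """Root mean square error between targets and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared) / len(squared))


# Stand-in for one month (744 hours) of hourly averaged wind speed.
clean = [5.0 + 2.0 * math.sin(2.0 * math.pi * t / 24.0) for t in range(744)]
noisy = add_uniform_noise(clean, snr_db=0.0)

test_pairs = make_pairs(noisy)[-120:]              # last 120 pairs for testing
targets = [target for _, target in test_pairs]
persistence = [x[-1] for x, _ in test_pairs]       # placeholder forecaster
print(f"RMSE of the placeholder forecaster: {rmse(targets, persistence):.4f}")
```

Repeating the noise generation with different seeds gives the Monte Carlo realizations over which the RMSE values are averaged.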
1.5 Artificial Neural Network Time Series (ANNTS) Modeling, Simulation & Prediction
The McCulloch-Pitts neuron is the earliest artificial neuron, described with fixed weights, a threshold activation function and a fixed discrete (non-zero) time step for the transmission of a signal from one neuron to the next McCulloch and Pitts (1943). A processing unit is termed a neuron or node. An artificial neural network (ANN) is an information processing paradigm inspired by the way biological nervous systems, such as the brain, process information. A biological neuron has three types of components that are of particular interest in understanding an artificial neuron: its dendrites, soma and axon. The dendrites receive signals from neighboring neurons. The signals are electric impulses that are transmitted across a synaptic gap by means of a chemical process. The synapse is a connection between neurons where their membranes almost touch and signals are transmitted from one to the other by chemical neurotransmitters. The soma, or cell body, sums the incoming signals, fires when sufficient input is received and transmits a signal over its axon to other cells. The axon is a long fiber over which a biological neuron transmits its output signals to other neurons. Neural networks are computer algorithms that process information in a manner patterned on the nervous system. They learn from the past to predict the future and offer solutions when explicit algorithms and models are unavailable or too cumbersome. Representative data are gathered for the neural network, and training algorithms are invoked to automatically learn the structure of the data. There are many types of networks, ranging from simple Boolean networks (perceptrons), to complex self-organizing networks (Kohonen networks), to networks modeling thermodynamic properties (Boltzmann machines) Haykin (1994). There are nearly as many training methods as there are network types, but some of the more popular ones include back propagation, the delta rule and Kohonen learning.
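The McCulloch-Pitts neuron described at the start of this section (fixed weights and a threshold activation) is simple enough to write down directly. The sketch below is illustrative: the weights and thresholds are assumed values chosen so that the neuron reproduces the logical AND, OR and AND NOT functions, and the function name is hypothetical.

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Binary threshold neuron: fires (1) when the weighted sum of its
    binary inputs reaches the threshold, otherwise stays silent (0)."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation >= threshold else 0


binary_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Fixed weights and thresholds (assumed values) for three logic functions.
and_out = [mcculloch_pitts(x, (1, 1), threshold=2) for x in binary_inputs]
or_out = [mcculloch_pitts(x, (1, 1), threshold=1) for x in binary_inputs]
# x1 AND NOT x2: the second connection is inhibitory (negative weight).
and_not_out = [mcculloch_pitts(x, (2, -1), threshold=2) for x in binary_inputs]

print("AND    :", and_out)
print("OR     :", or_out)
print("AND NOT:", and_not_out)
```

Because the weights are fixed, nothing is learned here; training methods such as the delta rule or back propagation adjust the weights from data instead.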
A standard network architecture consists of several layers of neurons. An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process. Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons. This is true of ANNs as well. We shall concentrate on ANN simulations, which appear to be a recent development, although the discipline itself was established before the advent of computers. Many important advances in ANNs were reported during the five decades since their introduction in 1943, but the period also brought frustration among researchers Fausset (1994). Recently, neural networks (NNs) have enjoyed a resurgence of interest and have begun to emerge as an entirely novel approach to the modeling of complex and non-linear phenomena Hertz et al (1991); Bishop (1995); Candill and Butler (1993); Whitley (1995); Connor et al (1994); Dorffner (1996); Ababarnal et al (1993); Gershenfeld and Weigend (1993); Fahlman and Lebiere (1990); Kanter et al (1995); Eisentein et al (1995); Bengio et al (1995); Fessant et al (1995); Ruiz-Suarez et al (1995) and Boznar et al (1993). NNs are particularly useful when problems are driven by data rather than by concept or theory. To date, NNs have yielded many successful applications in areas as diverse as finance, medicine, engineering, geology and physics; indeed, wherever there are problems of prediction or classification, neural networks are being introduced. ANN models have been applied to problems involving runoff forecasting and weather prediction Kang et al (1993). ANNs have also been applied to groundwater reclamation problems Ranjethan and Eheart (1993), predicting average air temperature Cook and Wolfe (1991), predicting precipitation Kalogirou et al (1998) and forecasting price increments Castiglioue (2002). There has been intensive research on NNs Engel and Broeck (2000); Kinzel (1999); Gardner and Dorling (1998); Kulkarni et al (1997); Edwards et al (1997); Geva (1998); Giles et al (2001); Khotanzad et al; Biehl and Caticha (2001); Schroder and Kinzel (1998); EinDor and Kanter (1998); Priel and Kanter (2000); Al-Hujazi and Al-Nashash (1996); Hertz and Krogh (1991); Andreas et al (1994) and Azoff (1994). Prediction of time series is an important application of NNs.
Since 1995, time series prediction by NNs has been extensively studied Kalogirou et al (1998); Castiglioue (2002); Engel and Broeck (2000); Kinzel (1999); Gardner and Dorling (1998); Kulkarni et al (1997); Edwards et al (1997); Geva (1998); Giles et al (2001); Khotanzad and Abaye (1997); Biehl and Caticha (2001); Schroder and Kinzel (1998); EinDor and Kanter (1998); Priel and Kanter (2000); Al-Hujazi and Al-Nashash (1996); Hertz et al (1991); Andreas et al (1994); Azoff (1994); Gately (1996); Refenes et al (1997); Mohandes et al (1998); Zhand et al (1998) and Hill et al (1996). Detecting trends and patterns in financial data is of great interest to the business world, where time series forecasting, e.g., with neural networks, supports the decision making process Lin et al (1995). Generally, wind speed is a highly non-linear phenomenon Kamal and Jafri (1996a) and Kamal and Jafri (1997). ANNs have recently been used successfully in the prediction of wind speed and energy Mohandes et al (1998); Kariniotakis et al (1996); Li et al (1997); Shuhui et al (2001); Sfetsos (2002) and Kamal (2004). ANNs trained on a time series are supposed to achieve two things: first, to predict the series many time steps ahead and, second, to learn the rule that produced it. Prediction and learning are not necessarily related to each other, especially for chaotic time series Freking et al (2005). Burney (1999) studied artificial neural networks (ANNs) with emphasis on predictive data mining. Burney and Jilani (2001) applied ANN methods to the forecasting of the stock exchange. They used supervised ANNs to predict stock exchange share rates Burney and Jilani (2003). A notable contribution on ANNs was the comparison of first and second order algorithms Burney et al (2004). More and Deo (2003) employed neural networks to forecast daily, weekly and monthly wind speed. Both feed-forward (FF) and recurrent networks (RN) were used, trained on past data in an autoregressive (AR) manner using the back propagation (BP) and cascade correlation (CC) algorithms. They concluded that the CC algorithm yields more accurate forecasts than BP. On the basis of our critical analysis and review of ANNs, we are of the opinion that ANNs yield better forecasts than the traditional stochastic time series model ARIMA. We have not been able to find any relevant research article pertaining to ANNs in the Journal of the American Statistical Association over the last two decades. Recent research suggests that forecasting with ANNs can be a promising alternative to the traditional ARMA structure. Zhang (2003) presented a hybrid ARMA and neural network model.
Org et al (2005) worked on model identification of ARIMA using genetic algorithms. Pai and Lin (2005) obtained stock price forecasts using a hybrid ARIMA and support vector machine model. With hybridization of intelligent techniques such as ANNs, fuzzy systems and evolutionary algorithms, one could expect relatively better time series prediction. Valenzuela et al (2008) exploited the hybridization of intelligent techniques and ARIMA models for time series prediction. A critical survey of neural networks in business forecasting, which discusses the modeling issues in forecasting applications, is given by Zhang (2004).
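The autoregressive use of a feed-forward network with back propagation, mentioned above in connection with More and Deo (2003), can be sketched as follows. This is only an illustration under our own assumptions, not the configuration used in any of the cited studies: the series is a synthetic sine wave rather than wind data, and the lag order (3), the hidden layer size (4), the learning rate (0.05) and all names in the code are hypothetical choices.

```python
import math
import random


def make_lagged(series, p):
    """Build autoregressive training pairs: p past values -> next value."""
    inputs = [series[t - p:t] for t in range(p, len(series))]
    targets = [series[t] for t in range(p, len(series))]
    return inputs, targets


class TinyFFN:
    """One tanh hidden layer and a linear output, trained by online back propagation."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        return h, y

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, lr):
        h, y = self.forward(x)
        err = y - target  # derivative of 0.5 * err**2 with respect to y
        for j, hj in enumerate(h):
            # Error propagated back through the tanh unit (uses the old w2[j]).
            grad_h = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
            self.b1[j] -= lr * grad_h
        self.b2 -= lr * err


def mse(net, inputs, targets):
    return sum((net.predict(x) - t) ** 2
               for x, t in zip(inputs, targets)) / len(targets)


# Synthetic stand-in for an hourly series (assumption: not real wind data).
series = [math.sin(0.3 * t) for t in range(120)]
inputs, targets = make_lagged(series, p=3)

net = TinyFFN(n_in=3, n_hidden=4)
mse_before = mse(net, inputs, targets)
for _ in range(100):  # training epochs
    for x, t in zip(inputs, targets):
        net.train_step(x, t, lr=0.05)
mse_after = mse(net, inputs, targets)

if __name__ == "__main__":
    print("MSE before training:", mse_before)
    print("MSE after training: ", mse_after)
```

The lagged inputs play the role that past observations play in an AR model; the difference is that the mapping from past values to the forecast is non-linear and is learned from the data.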
1.6 Thesis Plan
Through a critical analysis and review of various approaches to time series modeling, simulation and prediction, we have been able to identify neglected areas of research as well as areas that have been overemphasized. It has been realized that statistical techniques like ARMA, ARIMA, non-seasonal ARIMA and seasonal ARIMA have limited capabilities when modeling time series data. Likewise, regression analysis time series modeling and simulation has considerable limitations. In this situation, we shall generalize the statistical techniques and accomplish modeling of time series wind data.
We shall compare MTM (Markov Transition Matrices) with stochastic time series models. From the comparison of statistical and generalized techniques for stochastic time series, we expect to obtain pertinent and useful results. The finer statistical details are useful for identifying a proper stochastic time series model; these include the comparison of MTM with ARMA as a simulator, the suitability for short-range versus long-range prediction, the stochastic simulator in ARIMA and the heteroscedasticity/homoscedasticity tests in regression analysis time series modeling, partly on some weather data.
Recent work on the modeling and simulation of time series is concentrated on the feed-forward back propagation neural network (FFBPNN); we shall therefore apply FFBPNN to our data.
We shall apply singleton and non-singleton type-1 back propagation (BP) designed sixteen-rule fuzzy logic systems (FLS) to hourly averaged wind data, which, to our knowledge, nobody has attempted to date.
We shall also use design-free fuzzy logic to obtain predictions on wind data, which, again to our knowledge, has never been done on wind data to date. We shall perform Mackey Glass simulation on wind data. There are diverse categories of time series, like neuro fuzzy logic Burney et al (2006), Burney and Jilani (2007), second order modeling of fuzzy time series Tsai & Wu (1999), multivariate fuzzy logic Jilani and Burney (2007), autoregressive fuzzy logic Kezuhiro et al (1997), fuzzy predictor by extrapolating a time series and parallel structure fuzzy system Kim et al (2001), which would, of course, have extensive applications in business and trade related activities, risk assessments and small scale weather or climate predictions.
