Gary G. Yen, USAF Phillips Laboratory, Structures and Controls Division, Kirtland AFB, New Mexico 87117

ABSTRACT

A decentralized neural control system is advocated for flexible multibody structures. The proposed neural controller is designed to achieve trajectory maneuvering of structural members as well as vibration suppression for precision pointing capability. The motivation for this innovation is to pursue a real-time implementation of a robust and fault tolerant structural controller. The proposed decentralized control architecture, which takes advantage of the geometric distribution of PZT sensors and actuators, provides tremendous freedom from computational complexity. In the spirit of model reference adaptive control, we utilize adaptive time-delay radial basis function networks as a building block to allow the neural network to function as an indirect closed-loop controller. The horizon-of-one predictive controller regulates the dynamics of the nonlinear structure to follow a prespecified reference model asymptotically. The proposed control strategy is validated in the experimental facility, called the Planar Articulating Controls Experiment, which consists of a two-link flexible planar structure constrained to move over a granite table. This paper addresses the theoretical foundation of the architecture and demonstrates its applicability via a realistic structural test bed.

1. INTRODUCTION

Modern space structures, which are likely to be highly nonlinear with time-varying structural parameters and poorly modeled dynamics, pose serious difficulties for all currently advocated methodologies (e.g., robust, adaptive, and optimal controls) as summarized in [1] and [2]. These control system design difficulties arise in a broad spectrum of aerospace applications, e.g., surveillance satellites, military robots, and space vehicles.
Current control techniques often rely on the assumption of a high fidelity dynamic model containing identified system parameters. Furthermore, these synthesis algorithms require a priori fixed design constraints, where the loading and material properties need to be specified in advance. Consequently, synthesis procedures to achieve the desired stability, robustness, and dynamic response for highly nonlinear structures with unknown parameters are incomplete. Literature surveys have shown that there is no systematic modeling technique for space structures which can effectively capture all of the spatiotemporal interactions among the structural members. The ultimate autonomous control, intended to sever the dependence of space structures on a priori programming, perfect communication, and flawless operation while maintaining acceptable performance over an extended operating range, can be especially difficult to accomplish due to factors such as high dimensionality, multiple inputs and outputs, complex performance criteria, operational constraints, imperfect measurements, as well as the unavoidable failures of various actuators, sensors, or other components. Therefore, the controller needs either to be exceptionally robust or adaptable after deployment [3]. In the present paper, we propose to design and to validate a decentralized neural control system which is capable of withstanding structural failures, component deviation, and unpredictable perturbations. The innovative use of adaptive time-delay radial basis function networks in a distributive manner is proposed to fulfill critical needs in various operating envelopes on a real-time basis. Neural networks which employ the well known back-propagation learning algorithm are capable of approximating any continuous function (e.g., nonlinear plant dynamics and complex control laws) with an arbitrary degree of accuracy [4].
Similarly, radial basis function networks [5] are also shown to be universal approximators [6]. These model-free neural network paradigms are more effective at memory usage in solving control problems than conventional learning control approaches. A typical example is the BOXES algorithm, a memory-intensive approach, which partitions the control law in the form of a look-up table [7]. A neural network control system offers the capability of real-time adaptation and generalization, while a look-up table approach would only provide discrete controller solutions in a lengthy and sequential search. Our goal is to approach structural autonomy by extending the control system's operating envelope, which has traditionally required vast memory usage. Connectionist systems, on the other hand, deliver less memory-intensive solutions to control problems and yet provide a sufficiently generalized solution space. In vibration suppression problems, we utilize the adaptive time-delay radial basis function network (to be discussed in Section 3) as a building block to allow the connectionist system to function as an indirect closed-loop controller. The decentralized nature of the control system provides tremendous computational power to suppress the vibration modes, which can be identified by experimental modal testing. Prior to training the compensator, a neural identifier based on an ARMA model is utilized to identify the open-loop system. The horizon-of-one predictive controllers then cooperatively regulate the dynamics of the nonlinear structure to follow a prespecified reference system asymptotically, as depicted in Figure 1 (i.e., the model reference adaptive control architecture) [8]. The reference model, which can be easily specified through an input-output relationship, describes all desired features associated with the control task, e.g., a linear and highly damped model to suppress the vibration.

Figure 1.
Decentralized model reference adaptive control with adaptive time-delay radial basis function networks

Each control subsystem, which is dedicated to one set of PZT actuator and sensor, is utilized to suppress a specified vibration mode, so that the control task can be executed on a real-time basis. The function of the neural control system is to map the system states into corresponding control actions in order to force the plant dynamics to match an output behavior which is specified by the reference model. However, we cannot directly apply an energy minimization procedure (e.g., gradient descent, conjugate gradient, or the Newton-Raphson method) to adjust the interconnection weights of the neural controllers, because the desired outputs of the neural controllers are not available. In [9], a specialized learning algorithm which treats the plant as an additional unmodifiable layer of the network is proposed. However, the authors fail to suggest an effective way to approach the approximation. In [10], the inverse Jacobian of the plant needs to be evaluated at each weight adjustment, which results in a complicated and computationally expensive learning procedure. Moreover, since the plant is often not well modeled because of modeling uncertainties, the exact partial derivatives cannot be determined. In [11], a dynamic sign approximation is utilized, assuming qualitative knowledge of the plant. This is not necessarily available in space structure applications. To achieve true gradient descent of the square of the error, we use dynamic back-propagation [12] to accurately approximate the required partial derivatives. An adaptive time-delay radial basis function network is first trained to identify the open-loop system. The resulting neural identifier then serves as extended unmodifiable layers to train a set of neural controllers.
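The idea of back-propagating the tracking error through a frozen identifier, because the plant's own partial derivatives are unavailable, can be illustrated with a deliberately tiny sketch. Everything here is hypothetical and not from the paper: `f_hat` is a toy one-state stand-in for a pre-trained identifier, and the "controller" is a single adjustable parameter `k`, so the chain rule dE/dk = -e * (df_hat/du) * (du/dk) can be written out by hand.

```python
import numpy as np

def f_hat(y, u):
    """Frozen, pre-trained identifier standing in for the unknown plant:
    y_{n+1} = f_hat(y_n, u_n). The plant itself is never differentiated."""
    return 0.8 * y + 0.5 * np.tanh(u)

def f_hat_du(y, u):
    """Identifier's partial derivative w.r.t. the control input, obtained by
    back-propagating through the frozen network (here, analytically)."""
    return 0.5 * (1.0 - np.tanh(u) ** 2)

def train_controller(y_ref=0.5, eta=0.5, epochs=200):
    """Horizon-of-one training of a one-parameter controller u = k:
    gradient step dE/dk = -e * (df_hat/du) * (du/dk), with du/dk = 1."""
    k, y = 0.0, 0.0
    for _ in range(epochs):
        u = k
        y_next = f_hat(y, u)          # one-step-ahead prediction
        e = y_ref - y_next            # one-step tracking error
        k += eta * e * f_hat_du(y, u) # descend through the frozen identifier
        y = y_next
    return k, y

k, y = train_controller()
```

The key point the sketch preserves is that only the identifier, never the real plant, supplies the sensitivity df_hat/du used in the weight update.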
If the structural dynamics change as a function of time, the neural identifier would require the learning algorithm to periodically update the network parameters accordingly. The proposed efforts address several issues needed to achieve a decentralized fault tolerant control system in space structures. In Section 2, the adaptive time-delay back-propagation network is covered, providing the underlying issues pertaining to the learning algorithm. Based on the results developed in Section 2, the interconnecting topology and learning algorithm of the adaptive time-delay radial basis function network are discussed in Section 3. The proposed control strategy is validated in Section 4 on the experimental facility, called the Planar Articulating Controls Experiment (i.e., PACE), which consists of a two-link flexible planar structure constrained to move over a granite table. The test article is well equipped with distributed piezoceramic sensors and actuators, as well as torque motors for providing slew torque and encoders to measure angular velocities. The paper is concluded with a few pertinent observations in Section 5.

2. ADAPTIVE TIME-DELAY BACK-PROPAGATION NETWORK

Biological studies have shown that variable time-delays do occur along axons due to different conduction times and different lengths of axonal fibers. In addition, temporal properties such as temporal decays and integration occur frequently at synapses. Inspired by this observation, the time-delay back-propagation network was proposed by Waibel et al. for solving the phoneme recognition problem [13]. In this architecture, each neuron takes into account not only the current information from all the neurons of the previous layer, but also a certain amount of past information from those neurons due to delays on the interconnections. However, a fixed amount of time-delay throughout the training process has limited its usage, mainly due to the mismatch of the temporal location in the input patterns.
To overcome this limitation, Lin et al. [14] have developed an adaptive time-delay back-propagation network to better accommodate varying temporal sequences, and to provide more flexibility for optimization tasks.

Figure 2. Adaptive time-delay back-propagation network

A given adaptive time-delay back-propagation network can be completely described by its interconnecting topology, neuronic characteristics, temporal delays, and learning rule (see Figure 2). Each individual processing unit performs its computations based only on local information. The output of the jth sigmoidal neuron at time $t_n$ in the kth layer of the network is defined by

$$u_j^k(t_n) = \sum_{i=1}^{N^{k-1}} \sum_{l=1}^{L_{ji}^{k-1}} w_{jil}^{k-1} \, v_i^{k-1}(t_n - \tau_{jil}^{k-1}), \qquad (1a)$$

$$v_j^k(t_n) = g_j^k(u_j^k(t_n)), \quad j = 1, \ldots, N^k, \quad k = 2, \ldots, N, \qquad (1b)$$

where $g_j^k: (-\infty,\infty) \to (-1,1)$ is a sigmoidal function (i.e., continuously differentiable, monotonically increasing, and $g(0)=0$), $N^{k-1}$ is the number of neurons in the (k-1)st layer, $w_{jil}^{k-1}$ is an adjustable weight representing the strength of the connection from the output of the ith neuron of the (k-1)th layer to the input of the jth neuron of the kth layer with an independent time-delay $\tau_{jil}^{k-1}$, $L_{ji}^{k-1}$ denotes the number of delay connections from the ith neuron of the (k-1)th layer to the jth neuron of the kth layer, and $v_i^{k-1}(t_n - \tau_{jil}^{k-1})$ is the activation level of the ith neuron in the (k-1)th layer at $t_n - \tau_{jil}^{k-1}$. The adaptation of the weights and delays is derived based on the gradient descent method and error back-propagation to minimize the cost function

$$E(t_n) = \frac{1}{2} \sum_{j=1}^{N^{N}} \left[ d_j(t_n) - v_j^{N}(t_n) \right]^2, \qquad (2)$$

where $N$ denotes the output layer and $d_j(t_n)$ indicates the desired value of the jth output neuron at time $t_n$. The weights and time-delays are updated step by step, proportional to the opposite direction of the error gradient:

$$\Delta w_{jil}^{k-1} = -\eta_1 \frac{\partial E}{\partial w_{jil}^{k-1}}, \qquad \Delta \tau_{jil}^{k-1} = -\eta_2 \frac{\partial E}{\partial \tau_{jil}^{k-1}}, \qquad (3,4)$$

where $\eta_1$ and $\eta_2$ are the learning rates. The derivation of this learning algorithm was addressed explicitly in [14]; the resulting error terms take the standard back-propagated form, evaluated at the appropriately delayed time instants, for output and hidden neurons respectively.

3.
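As a concrete reading of Eqs. (1a)-(1b), the forward pass of one adaptive time-delay layer can be sketched as below. This is an illustrative sketch only (the paper gives no implementation): the delays are taken as integer sample offsets into a stored activation history, `tanh` plays the role of the sigmoidal g, and `atd_layer_forward` is a hypothetical helper name.

```python
import numpy as np

def atd_layer_forward(history, weights, delays):
    """Forward pass of one adaptive time-delay layer at the newest time step.

    history : (T, N_in) array of past activations v_i(t_0) ... v_i(t_{T-1})
    weights : (N_out, N_in, L) array, w[j, i, l] per Eq. (1a)
    delays  : (N_out, N_in, L) integer array of time-delays tau[j, i, l]

    Returns the (N_out,) vector v_j(t_{T-1}) = tanh(u_j(t_{T-1})).
    """
    T, n_in = history.shape
    n_out, _, L = weights.shape
    u = np.zeros(n_out)
    for j in range(n_out):
        for i in range(n_in):
            for l in range(L):
                # delayed activation v_i(t_n - tau); clip to the start of the record
                t = max(T - 1 - delays[j, i, l], 0)
                u[j] += weights[j, i, l] * history[t, i]
    return np.tanh(u)  # sigmoidal g: (-inf, inf) -> (-1, 1), g(0) = 0
```

The triple loop mirrors the double summation of Eq. (1a) over source neurons i and delay taps l; a production version would vectorize it.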
ADAPTIVE TIME-DELAY RADIAL BASIS FUNCTION NETWORK

A radial basis function (RBF) network is a two-layer neural network whose outputs form a linear combination of the basis functions derived from the hidden neurons. The basis function in the hidden layer produces a localized response to the input stimulus, as do locally-tuned receptive fields in our nervous systems. The Gaussian function network, a realization of the RBF network using Gaussian kernels, is widely used in pattern classification and function approximation. The output of a Gaussian neuron in the hidden layer is defined by

$$u_j^1 = \exp\!\left( -\frac{\|x - w_j^1\|^2}{(\sigma_j^1)^2} \right), \quad j = 1, \ldots, N^1, \qquad (7)$$

where $u_j^1$ is the output of the jth neuron in the hidden layer, $x$ is the input vector, $w_j^1$ denotes the weighting vector for the jth neuron in the hidden layer (i.e., the center of the jth Gaussian kernel), $\sigma_j^1$ is the normalization parameter of the jth neuron (i.e., the width of the jth Gaussian kernel), and $N^1$ is the number of neurons in the hidden layer. Equation (7) produces a radially symmetric output with a unique maximum at the center, dropping off rapidly to zero at large radii. That is, it produces a significant nonzero response only when the input falls within a small localized region of the input space. Inspired by the adaptive time-delay back-propagation network, the output equation of ATDRBF networks is described by

$$v_j(t_n) = \sum_{i=1}^{N^1} \sum_{l=1}^{L_{ji}} w_{jil} \, u_i^1(t_n - \tau_{jil}), \qquad (8)$$

where $w_{jil}$ denotes the connection between the output of the ith neuron of the hidden layer and the input of the jth neuron of the output layer with an independent time-delay $\tau_{jil}$, $u_i^1(t_n - \tau_{jil})$ is the output from the hidden layer at time $t_n - \tau_{jil}$, and $L_{ji}$ denotes the number of delay connections from the ith neuron of the hidden layer to the jth neuron of the output layer.
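The localized response of Eq. (7) is easy to verify numerically. The sketch below (with the hypothetical helper name `gaussian_hidden_layer`) evaluates the whole hidden layer at once: the response is exactly 1 when the input sits on a kernel center and decays rapidly with distance.

```python
import numpy as np

def gaussian_hidden_layer(x, centers, widths):
    """Hidden-layer response of a Gaussian RBF network, Eq. (7):
    u_j = exp(-||x - w_j||^2 / sigma_j^2), one value per Gaussian kernel."""
    d2 = np.sum((centers - x) ** 2, axis=1)  # squared distance to each center
    return np.exp(-d2 / widths ** 2)
```

For example, an input at the first center returns 1.0 for that neuron, while a center 5 units away with width 5 responds with exp(-1).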
Shared with generic radial basis function networks, adaptive time-delay Gaussian function networks have the property of undergoing local changes during training, unlike adaptive time-delay back-propagation networks, which experience global weight adjustments due to the characteristics of sigmoidal functions. The localized influence of each Gaussian neuron allows the learning system to refine its functional approximation in a successive and efficient manner. The hybrid learning algorithm [5], which employs K-means clustering for the hidden layer and the least mean square (LMS) algorithm for the output layer, further ensures faster convergence and often leads to better performance and generalization. The combination of locality of representation and linearity of learning offers tremendous computational efficiency in real-time adaptive control. The K-means algorithm is perhaps the most widely known clustering algorithm because of its simplicity and its ability to produce good results. The normalization parameters, $\sigma_j^1$, are obtained once the clustering algorithm is complete. They represent a measure of the spread of the data associated with each cluster. The cluster widths are determined by the average distance between the cluster centers and the training samples,

$$\sigma_j^1 = \frac{1}{M_j} \sum_{x \in \Theta_j} \|x - w_j^1\|, \qquad (9)$$

where $\Theta_j$ is the set of training patterns belonging to the jth cluster and $M_j$ is the number of samples in $\Theta_j$. This is followed by applying an LMS algorithm to adapt the time-delays and interconnecting weights in the output layer. The training set consists of input/output pairs, but now the input patterns are pre-processed by the hidden layer before being presented to the output layer. The adaptation of the output weights and time delays is derived based on error back-propagation to minimize the cost function

$$E(t_n) = \frac{1}{2} \sum_j \left[ d_j(t_n) - v_j(t_n) \right]^2, \qquad (10)$$

where $d_j(t_n)$ indicates the desired value of the jth output neuron at time $t_n$.
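The two stages of the hybrid learning algorithm can be sketched as follows. This is an illustrative sketch under simplifying assumptions, not the paper's implementation: `fit_hidden_layer` runs a plain K-means with deterministic seeding and then applies Eq. (9) for the widths, and `lms_step` performs one LMS update of the linear output weights (the time-delay adaptation of the output layer is omitted here).

```python
import numpy as np

def fit_hidden_layer(samples, n_centers, n_iter=20):
    """Hybrid learning, stage 1: K-means picks the Gaussian centers; each width
    sigma_j is then the average distance from center j to its cluster, Eq. (9)."""
    # simple deterministic seeding; real K-means is initialization-sensitive
    centers = samples[np.linspace(0, len(samples) - 1, n_centers).astype(int)].copy()
    labels = np.zeros(len(samples), dtype=int)
    for _ in range(n_iter):
        # assign each training pattern to its nearest center
        dists = np.linalg.norm(samples[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(n_centers):
            if np.any(labels == j):
                centers[j] = samples[labels == j].mean(axis=0)
    widths = np.array([
        np.linalg.norm(samples[labels == j] - centers[j], axis=1).mean()
        if np.any(labels == j) else 1.0
        for j in range(n_centers)])
    return centers, widths

def lms_step(w, u, d, eta=0.1):
    """Hybrid learning, stage 2: one LMS update of the linear output weights,
    driven by the error between the desired d and the linear output y = w @ u."""
    y = w @ u
    return w + eta * np.outer(d - y, u)
```

Because stage 2 is linear in the weights, each `lms_step` is a cheap rank-one update, which is the source of the real-time efficiency claimed above.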
The weights and time-delays are updated step by step, proportional to the opposite direction of the error gradient,

$$\Delta w_{jil} = -\eta_1 \frac{\partial E}{\partial w_{jil}}, \qquad \Delta \tau_{jil} = -\eta_2 \frac{\partial E}{\partial \tau_{jil}}, \qquad (11,12)$$

where $\eta_1$ and $\eta_2$ are the learning rates. The mathematical derivation of this learning algorithm is straightforward and parallels that of Section 2.

4. PACE SIMULATION STUDY

The autonomous control of precision space structures requires a distributed computational architecture that provides the ability to perform system identification and dynamic control after orbital deployment. A neural-network-based decentralized control system provides an alternative way to reduce the need for a priori knowledge of structural qualitative behavior. The USAF Phillips Laboratory's Planar Articulating Controls Experiment (i.e., PACE) arm (see Figure 3) offers a feasibility test bed for validating the proposed control strategy.

Figure 3. The PACE test article

Researchers have previously conducted classical control experiments on the PACE test article [15], [17]. Those experiments were limited to only one mode of vibration. In this study, we consider five vibration modes identified by simple modal testing with reasonable accuracy. Each subsystem consists of one set of PZT actuator and sensor (see Figure 4). This decentralized control is essential for precision space structures having a large array of actuators and sensors, so that the control task can be achieved in a timely fashion.

Figure 4. A decentralized decomposition of the PACE arm

The plant model (i.e., the equation of motion) is derived by using Hamilton's principle and the assumed mode method [16]. Three dynamic systems are integrated in the model: the DC motor, the arm dynamics, and the vibration. The model has 12 states, including the arm angle, the arm angular velocity, five strain terms (for five modes) and five strain rate terms (for five modes).
The state vector is converted to five output signals, including the arm angle, the arm angular velocity, and the outputs from three PZT sensors. The five modes of the simulated PACE test article are listed in Table 1. Experience indicates that if we want to suppress the 5th mode (i.e., 219 Hz), the sampling frequency should be set above 2 KHz to achieve reasonable accuracy in system identification of the plant dynamics. Due to equipment constraints at the Laboratory, we have limited the sampling frequency to 1 KHz.

Table 1. The first five modes of the PACE arm

Mode              1    2    3    4    5
Frequency (Hz)    5   25   67  132  219

Because this plant is stiff and of high dimension, integration of the model using the adaptive 4th-order Runge-Kutta method results in a computationally expensive procedure. The flexible arm is divided into 3 identical subsystems for vibration control. The state of each subsystem consists of one measured state and one control input. System identification is simulated by a single-layer adaptive time-delay Gaussian function network with 100 hidden neurons, while vibration suppression is performed by three single-layer adaptive time-delay Gaussian function networks with 50 hidden neurons each. The number of tapped delays needs to be determined experimentally by balancing approximation power against computational time. Ten tapped delays are chosen for both the neural identifier and the neural controllers. Prior to training the compensators, a neural identifier based on an ARMA model is utilized to identify the open-loop system. The horizon-of-one neural controllers then cooperatively regulate the dynamics of the nonlinear plant to follow a prespecified reference system asymptotically (i.e., the model reference adaptive control architecture). To train the neural controllers, we generated 20 data sets with 1000 data points in each sequence. Three quarters of the data points are used for training while one quarter of the data are used for cross validation.
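The data handling just described can be sketched as follows. `split_sequences` and `percentage_error` are hypothetical helper names: the first performs the 75/25 per-sequence split, and the second computes the error metric used for the cross-validation results, i.e., the mean absolute prediction error expressed as a percentage of the average state magnitude in the training set.

```python
import numpy as np

def split_sequences(sequences, train_frac=0.75):
    """Split each recorded sequence 75/25 into training and cross-validation
    portions, as done for the 20 sequences of 1000 points each."""
    cuts = [int(len(s) * train_frac) for s in sequences]
    train = np.concatenate([s[:c] for s, c in zip(sequences, cuts)])
    valid = np.concatenate([s[c:] for s, c in zip(sequences, cuts)])
    return train, valid

def percentage_error(predicted, actual, train_states):
    """Cross-validation error as a percentage of the average magnitude of the
    states in the training set."""
    scale = np.mean(np.abs(train_states))
    return 100.0 * np.mean(np.abs(predicted - actual)) / scale
```

Splitting each sequence, rather than the pooled data, keeps the temporal structure of every maneuver intact on both sides of the split.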
We achieve a cross validation error of 7.6% after 500 epochs of training. The accuracies of the three neural controllers on cross validation are shown in Table 2. They are computed as a percentage error with respect to the average magnitude of the states in the training set.

Table 2. Cross validation errors of the neural controllers

Subsystem                     vibration 1   vibration 2   vibration 3
Cross validation error (%)        7.2           7.7           7.9

The average 7.6% error of the vibration subsystems is adequately accurate, but it will be difficult to achieve high controller performance for specific missions. There is potential to improve the accuracy of the neural network model for the vibration system by utilizing a recurrent type of neural network. This will be pursued in our future research.

5. CONCLUSIONS

The architecture proposed for the decentralized neural control system successfully demonstrates the feasibility and flexibility of our proposed solution for precision space structural platforms. The salient features associated with the proposed learning control are discussed. In a similar spirit, the proposed architecture can be extended to the dynamic control of aeropropulsion engines, underwater vehicles, chemical processes, power plants, and manufacturing scheduling. The applicability of the present methodology to large realistic CSI structural test beds will be pursued in our future research.

6. REFERENCES

1. D. A. White and D. A. Sofge, Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, Van Nostrand Reinhold, New York, NY, 1992.
2. P. J. Antsaklis and K. M. Passino, An Introduction to Intelligent and Autonomous Control, Kluwer Academic, Hingham, MA, 1992.
3. G. G. Yen, Reconfigurable Learning Control in Large Space Structures, IEEE Transactions on Control Systems Technology, to appear; also Proceedings of IEEE Conference on Decision and Control, pp. 744-749, December 1993.
4. K. Hornik, M. Stinchcombe and H. White, Multilayer Feedforward Networks are Universal Approximators, Neural Networks, Vol. 2, No. 5, pp. 359-366, 1989.
5. J. Moody and C. J. Darken, Fast Learning in Networks of Locally-tuned Processing Units, Neural Computation, Vol. 1, No. 2, pp. 281-294, Summer 1989.
6. E. J. Hartman, J. D. Keeler and J. M. Kowalski, Layered Neural Networks with Gaussian Hidden Units as Universal Approximations, Neural Computation, Vol. 2, No. 2, pp. 210-215, Summer 1990.
7. D. Michie and R. A. Chambers, BOXES: An Experiment in Adaptive Control, In: Machine Intelligence (E. Dale and D. Michie, Editors), pp. 137-152, 1968.
8. K. J. Astrom and B. Wittenmark, Adaptive Control, Addison-Wesley, New York, NY, 1989.
9. D. Psaltis, A. Sideris and A. A. Yamamura, A Multilayered Neural Network Controller, IEEE Control Systems Magazine, Vol. 8, No. 3, pp. 17-21, April 1988.
10. R. Elsey, A Learning Architecture for Control Based on Back-propagation Neural Networks, Proceedings of IEEE International Conference on Neural Networks, pp. 587-594, July 1988.
11. M. Saerens and A. Soquet, A Neural Controller, Proceedings of IEE International Conference on Artificial Neural Networks, pp. 211-215, October 1989.
12. K. S. Narendra and K. Parthasarathy, Identification and Control of Dynamical Systems Using Neural Networks, IEEE Transactions on Neural Networks, Vol. 1, No. 1, pp. 4-27, March 1990.
13. A. Waibel, T. Hanazawa, G. Hinton, K. Shikano and K. Lang, Phoneme Recognition: Neural Networks Versus Hidden Markov Models, Proceedings of IEEE Conference on Acoustics, Speech and Signal Processing, pp. 107-110, April 1988.
14. D. T. Lin, J. E. Dayhoff and P. A. Ligomenides, Adaptive Time-delay Neural Network for Temporal Correlation and Prediction, Proceedings of SPIE Conference on Biological, Neural Net, and 3-D Methods, pp. 170-181, November 1992.
15. M. K. Kwak, M. J. Smith and A. Das, PACE: A Test Bed for the Dynamics and Control of Flexible Multibody Systems, Proceedings of NASA/NSF/DoD Workshop on Aerospace Computational Control, pp. 100-105, August 1992.
16. K. K. Denoyer and M. K. Kwak, Dynamic Modeling and Vibration Suppression of a Slewing Active Structure Utilizing Piezoelectric Sensors and Actuators, Proceedings of SPIE Conference on Smart Structures and Intelligent Systems, pp. 882-894, February 1993.
17. M. K. Kwak, K. K. Denoyer and D. Sciulli, Dynamics and Control of a Slewing Active Beam, Proceedings of 9th VPI&SU Symposium on Dynamics and Control of Large Structures, pp. 34-41, May 1993.