3, JULY 2006
1735
I. INTRODUCTION
Manuscript received December 16, 2004; revised September 28, 2005. This work was supported in part by CNPq, in part by FAPESP (Brazil), and in part by the Universidad Tecnológica de Pereira (Colombia). Paper no. TPWRD-00592-2004.
H. Salazar and R. Gallego are with the Universidad Tecnológica de Pereira, Pereira-Risaralda A.A. 097, Colombia (e-mail: hsi@utp.edu.co; ragr@utp.edu.co).
R. Romero is with the Electrical Engineering Department, FEIS-UNESP, Ilha Solteira SP 15385-000, Brazil (e-mail: ruben@dee.feis.unesp.br).
Digital Object Identifier 10.1109/TPWRD.2006.875854
two groups [18]: 1) heuristic algorithms and/or classic mathematical optimization techniques and 2) algorithms based on artificial intelligence (AI), in which some metaheuristics can be incorporated. The algorithms belonging to group 1) generally determine only one solution for a specific load condition, which is normally reached through an iterative process. Artificial neural networks (ANNs) belong to group 2) and can provide a set of good-quality topologies for each load pattern in real-time mode without executing an iterative process (i.e., the ANN can provide solutions when a change in the load profile occurs without executing an extensive iterative procedure).
Methods that employ classical optimization techniques can be found in [19], [20], [24]–[26]. All of these approaches propose different heuristic methods to minimize the computational effort needed to obtain a solution. Reference [22] employs the ANN technique and presents important contributions; however, the method proposes a large number of grouped neural networks, which makes solving large systems almost impossible due to the amount of required training. Another limitation is the output layer, which restricts the solution to one of the topologies found during the training phase, limiting the generalization capacity of a neural network (NN).
Another ANN implementation is proposed in [27]. The main
difference with the previously mentioned study is the use of
a Hopfield neural network that determines a topology after
reaching its equilibrium phase. The major problem is related to the dynamic nature of the neural network, because of which only small systems can be solved.
This paper proposes an ANN that addresses the problems mentioned above. The approach employs only one neural network and provides the ability to determine the most suitable topologies. In order to improve the performance and structure of the ANN, clustering techniques are used to reduce the number of inputs of the ANN. The best performance is achieved by employing clustering techniques for the training sets, which results in a more effective information source for the ANN. The result is an ANN with enhanced generalization ability and with the possibility of determining high-quality topologies with lower losses.
This paper is organized as follows: Section II provides a
mathematical formulation of the problem. A review of neural
networks and clustering techniques is presented in Section III.
Section IV explains how a neural network along with clustering
techniques can be used to solve the problem. Numerical results
are reported in Section V and the conclusions are in Section VI.
$$\sum_{j\in\Omega_i} S_{ij} = D_i, \qquad i = 1,\dots,n \qquad (2)$$
$$V_i - V_j = Z_{ij}\,I_{ij}, \qquad \forall\, ij \qquad (3)$$
$$|S_{ij}| \le S_{ij}^{\max}, \qquad \forall\, ij \qquad (4)$$
$$V^{\min} \le V_i \le V^{\max}, \qquad i = 1,\dots,n \qquad (5)$$
$$P_k \le P_k^{\max}, \qquad \forall\, k \qquad (6)$$
$$y_i = 1, \qquad \forall\, \text{terminal buses } i \qquad (7)$$
where $n$ is the number of electrical nodes in the system, and $S_{ij}$, $I_{ij}$, and $Z_{ij}$ are the apparent power flow, the electrical current at branch $ij$, and the impedance of the branch $ij$, respectively. $D_i$ is the load at bus $i$, $V_i$ is the voltage at bus $i$, $P_k$ is the power flow at feeder $k$, and $y_i$ is equal to 1 if there is only one path between terminal bus $i$ and a source; otherwise, $y_i$ is zero.
In the proposed mathematical formulation, the objective function (1) represents the active losses in the system; constraints (2) and (3) represent Kirchhoff's laws; constraints (4) and (5) require the power flows at the branches and the bus voltages to be kept within their limits; constraint (6) enforces the operational limits of the substation transformers; and (7) is the constraint that preserves the radiality of the system. The corresponding model is a mixed-integer nonlinear programming (MINLP) problem.
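The radiality constraint (7) and the loss objective can be checked for a candidate configuration with elementary operations. The following is a minimal sketch, not the authors' implementation: function names, bus indexing, and data are illustrative. Radiality is tested as a spanning-tree condition with a union-find pass, and the losses are summed over the closed branches using the resistive part of the branch impedance.

```python
import numpy as np

def is_radial(n_buses, closed_branches):
    """Constraint (7): a configuration is radial iff the closed branches
    form a spanning tree, i.e., exactly n_buses - 1 branches and no loops
    (checked here with union-find)."""
    if len(closed_branches) != n_buses - 1:
        return False
    parent = list(range(n_buses))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for a, b in closed_branches:
        ra, rb = find(a), find(b)
        if ra == rb:  # this branch would close a loop
            return False
        parent[ra] = rb
    return True

def active_losses(branch_resistance, branch_current):
    """Loss objective: total active losses, sum of R * |I|^2 over the
    closed branches (R is the resistive part of the branch impedance)."""
    r = np.asarray(branch_resistance, dtype=float)
    i = np.asarray(branch_current)
    return float(np.sum(r * np.abs(i) ** 2))
```

A reconfiguration algorithm would search over the binary open/closed status of the branches subject to this radiality test, which is what makes the model a MINLP.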
III. NEURAL NETWORK AND CLUSTERING TECHNIQUES
A. Neural Networks
The ANN, also known as a neural network (NN) for short, represents an organized system of densely connected nodes, usually arranged in a feedforward way. Each connection between neurons is associated with a number that is referred to as a weight. The NN described in this paper is a standard multilayer perceptron. It is shown in Fig. 1 and is also called an $N_i$–$N_h$–$N_o$ net: it represents an NN with $N_i$ neurons in the input layer, which receives information from the external world; $N_h$ neurons in the hidden layer, which propagate the information; and $N_o$ neurons in the output layer, which provides the response of the NN.
The output of an NN can be defined as a function of the input $x$ and the weights $w$ in the form $y = F(x, w)$, in which $F$ represents the mapping function defined by the NN.
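As an illustration, the mapping $y = F(x, w)$ of a multilayer perceptron can be sketched as follows. The layer sizes and random weights are illustrative, not from the paper; the sigmoid hidden layer and linear output layer follow the transfer functions described later for the trained networks.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W1, b1, W2, b2):
    """y = F(x, w): a multilayer perceptron with one sigmoid hidden
    layer and a linear output layer."""
    h = sigmoid(W1 @ x + b1)  # hidden-layer activations
    return W2 @ h + b2        # linear output layer

# illustrative N_i-N_h-N_o = 4-3-2 net with random weights
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4)); b1 = np.zeros(3)
W2 = rng.normal(size=(2, 3)); b2 = np.zeros(2)
y = mlp_forward(np.ones(4), W1, b1, W2, b2)
```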
The NN learning process consists of adjusting the weights $w$ so that a good mapping of the learning data is achieved. Therefore, for learning data with an unknown mathematical relation, the NN provides a mapping. The set of data for which the mapping is executed is represented in the following form:
$$T = \{(x^p, d^p),\ p = 1,\dots,P\} \qquad (8)$$
where $(x^p, d^p)$ represents the input and desired output for a training pattern $p$. The training algorithm looks for a function with the ability to fit and generalize the training set [generalization is the ability of an NN to provide an adequate output for an input that does not belong to the training set (8)]. Based on an input $x^p$ and its corresponding desired output $d^p$, the training process minimizes the energy function $E(w)$, which is the mean quadratic error between $y^p$ and $d^p$ for all elements in the training set [2]–[5]
$$E(w) = \frac{1}{P}\sum_{p=1}^{P} \|d^p - y^p\|^2. \qquad (9)$$
As $E(w) \to 0$, the algorithm tries to find a set of weights $w^*$ such that $F(x^p, w^*) = d^p$ for the whole training set (i.e., the ideal condition is a combination of weights that makes the cost function equal to zero). However, as the mapping function is nonlinear due to the transfer functions of the different layers, the training algorithm is an iterative process that minimizes (9) to a given tolerance; any algorithm employed for optimizing unconstrained nonlinear problems may be used.
The training algorithms change the weights according to a search direction $d_k$, as in (10). The direction is determined by using classical first- and second-order algorithms. In the first case, $d_k$ is found by using the gradient of the objective function (9); such methods are known as backpropagation or gradient-descent algorithms. These algorithms are fast, and the only drawback is that backpropagation can only be applied to networks with differentiable activation functions
$$w_{k+1} = w_k + \alpha_k d_k. \qquad (10)$$
Another group is formed by algorithms such as conjugate gradient, quasi-Newton, and Levenberg–Marquardt, which present good results even when simplifications are employed in order to avoid the higher-order derivatives [5]. Heuristic algorithms that present good performance with little computational effort also exist; Delta-Bar-Delta [8], QuickProp [6], and Rprop [7] are a few examples.
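A minimal sketch of the gradient-descent update (10) for a sigmoid-hidden, linear-output network, with the search direction taken as the negative gradient of the error computed by backpropagation. The single training pair, step size, and layer sizes are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, d, W1, b1, W2, b2, alpha=0.1):
    """One update w_{k+1} = w_k + alpha * d_k with d_k = -grad E
    (gradient descent / backpropagation), for a sigmoid-hidden,
    linear-output network and E = 0.5 * ||d - y||^2."""
    h = sigmoid(W1 @ x + b1)
    y = W2 @ h + b2
    err = y - d                          # dE/dy
    gW2 = np.outer(err, h); gb2 = err    # output-layer gradients
    dh = (W2.T @ err) * h * (1.0 - h)    # backpropagate through sigmoid
    gW1 = np.outer(dh, x); gb1 = dh      # hidden-layer gradients
    return (W1 - alpha * gW1, b1 - alpha * gb1,
            W2 - alpha * gW2, b2 - alpha * gb2)

# illustrative single training pair and random initial weights
rng = np.random.default_rng(1)
x, d = np.array([0.5, -0.2]), np.array([1.0])
W1 = rng.normal(size=(3, 2)); b1 = np.zeros(3)
W2 = rng.normal(size=(1, 3)); b2 = np.zeros(1)

def energy(W1, b1, W2, b2):
    y = W2 @ sigmoid(W1 @ x + b1) + b2
    return 0.5 * float((d - y) @ (d - y))

e0 = energy(W1, b1, W2, b2)
for _ in range(50):
    W1, b1, W2, b2 = train_step(x, d, W1, b1, W2, b2)
e1 = energy(W1, b1, W2, b2)
```

Repeating the step drives the energy function down toward the tolerance, which is the iterative process described in the text.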
B. Clustering Techniques
The clustering techniques represent mathematical tools to analyze and create groups with similar characteristics as well as to
identify and recover unknown features of each identified group.
For a particular set of data, as shown in Fig. 2, the clustering algorithms identify a $c$-group set, each group with its most representative element. For the case in the figure, three groups are identified, each with a central element that is the most representative.
This work proposes a clustering method based on demand values; it does not take the geographical location of the loads into account. Thus, only the active and reactive power were considered. This approach results in a small group of loads that completely represents all demands of the system.
A consequence of the application of the clustering technique
is the significant reduction of the neurons in the input layer,
which increases the NN performance.
Clustering techniques consider two fundamental criteria:
the proximity measurement and the grouping criterion. The
proximity measurement represents the degree of similarity
between two points and takes into account the characteristics of
the system in order to avoid the dominance of one characteristic
over the others. The grouping criterion may be represented by
a cost function of the algorithm or by another criterion that
allows the clustering by using the proximity measurement.
The fuzzy c-means (FCM) algorithm is the most relevant partition algorithm in the literature [12]–[14], and it is equivalent to the hard c-means (HCM) algorithm [30]. For a data set $X = \{x_1, x_2, \dots, x_N\}$, the algorithm groups the elements into $c$ clusters, in which one element belongs to different groups with different degrees of membership. The basic objective of the algorithm is to find the $c$ most representative elements of the groups, which will represent the center of each group or partition. These elements are found by iteratively minimizing function (11), considering the membership value $u_{ik}$ that represents the membership of element $x_k$ to group $i$, whose center is $v_i$. The objective function of the FCM algorithm is as follows:
$$J_m = \sum_{i=1}^{c}\sum_{k=1}^{N} u_{ik}^{m}\,\|x_k - v_i\|^2 \qquad (11)$$
where $c$ is the number of clusters, $N$ is the number of available data, $m$ is the defuzzification factor that determines the overlapping degree of the fuzzy sets, and $\|x_k - v_i\|$ is the metric that defines the distance between $x_k$ and $v_i$. Different metrics are presented in [16] and, in this paper, the Euclidean distance is used as the measure of proximity. It should be noted that $x_k$ and $v_i$ are active or reactive power demands at different nodes of the system.
The FCM algorithm executes the following steps to determine the centers $v_i$ and membership values $u_{ik}$ that minimize (11).
Step 1) Initialize the membership values $u_{ik}$ so that they satisfy the constraint
$$\sum_{i=1}^{c} u_{ik} = 1, \qquad k = 1,\dots,N. \qquad (12)$$
Step 2) Compute the centers of the clusters
$$v_i = \frac{\sum_{k=1}^{N} u_{ik}^{m}\,x_k}{\sum_{k=1}^{N} u_{ik}^{m}}. \qquad (13)$$
Step 3) Update the membership values
$$u_{ik} = \left[\sum_{j=1}^{c}\left(\frac{\|x_k - v_i\|}{\|x_k - v_j\|}\right)^{2/(m-1)}\right]^{-1}. \qquad (14)$$
Step 4) Repeat steps 2 and 3 until the function $J_m$ in (11) does not present a significant modification for two consecutive iterations.
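The four steps above can be sketched as a compact, illustrative implementation (the data layout, tolerance, and initialization are assumptions, not the paper's code):

```python
import numpy as np

def fcm(X, c, m=2.0, tol=1e-6, max_iter=200, seed=0):
    """Fuzzy c-means: alternate the center update and the membership
    update until the objective stabilizes. X has one row per data point;
    returns centers V (c x dim), memberships U (c x N), and objective J."""
    X = np.asarray(X, dtype=float)
    N = len(X)
    rng = np.random.default_rng(seed)
    U = rng.random((c, N))
    U /= U.sum(axis=0)                      # columns sum to 1 (step 1)
    prev_J = np.inf
    for _ in range(max_iter):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)   # centers (step 2)
        D = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)  # c x N
        D = np.maximum(D, 1e-12)            # guard against zero distance
        U = D ** (-2.0 / (m - 1.0))
        U /= U.sum(axis=0)                  # memberships (step 3)
        J = float(np.sum((U ** m) * D ** 2))  # objective value
        if abs(prev_J - J) < tol:           # stopping test (step 4)
            break
        prev_J = J
    return V, U, J
```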
A difficulty in the FCM algorithm is the determination of the number of clusters $c$. An inadequate choice may generate unreliable centers $v_i$ that do not represent the data. A way to get around this problem is to employ validation indices that allow the comparison of the partitions or clusters determined by the algorithm.
The quality of a partition, or of the number of groups, is evaluated based on two criteria: compression and separation. Compression establishes that the members of each group should be as close to each other as possible and can be achieved by minimizing the variance of the group. Separation, on the other hand, establishes that the groups should be adequately separated from one another.
Several validation indices with different compression and separation criteria have been proposed [16], [17]. The most cited is the proposal of Xie–Beni [15]. The method consists of employing a validation function $S$, which determines the separation and compression of a partition. The proposed function is as follows:
$$S = \frac{\sum_{i=1}^{c}\sum_{k=1}^{N} u_{ik}^{2}\,\|x_k - v_i\|^2}{N \cdot \min_{i \neq j} \|v_i - v_j\|^2}. \qquad (15)$$
The function above is independent of the algorithm used to establish the partition. Nevertheless, if the FCM algorithm is employed with $m = 2$ and the Euclidean distance, the numerator of (15) is the objective function (11), thus giving the following relation:
$$S = \frac{J_m}{N \cdot d_{\min}^2} \qquad (16)$$
where $d_{\min}$ is the minimum distance between the centers of the clusters. Simple calculation and little computational effort are advantages provided by (16). The best partitioning is found by minimizing $S$, which provides a partitioning that meets the compression and separation criteria [15].
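The index can be computed directly from a data set, its cluster centers, and its membership matrix. A minimal sketch (function name and data layout are illustrative), with the exponent left as a parameter and defaulting to the $m = 2$ case:

```python
import numpy as np

def xie_beni(X, V, U, m=2.0):
    """Xie-Beni validity index: compactness (the FCM objective for the
    given memberships) divided by N times the minimum squared distance
    between cluster centers; smaller values indicate better partitions."""
    X, V, U = (np.asarray(a, dtype=float) for a in (X, V, U))
    N = len(X)
    D = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)  # c x N
    compactness = float(np.sum((U ** m) * D ** 2))
    c = len(V)
    d_min2 = min(float(np.sum((V[i] - V[j]) ** 2))
                 for i in range(c) for j in range(c) if i != j)
    return compactness / (N * d_min2)
```

With well-placed centers the compactness term is small and the center separation large, so a good partition scores lower than a poor one.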
$$c^{*} = \arg\min_{c}\, S(c) \qquad (17)$$
which finds the lowest value of the Xie–Beni index. The algorithm is given as follows.
1) Initialize $c$, the limit $c_{\text{limit}}$, and $S_{\text{best}} = \infty$.
2) Initialize the membership values $u_{ik}$ so that they satisfy the constraint (12).
3) Execute the FCM algorithm.
4) Calculate the Xie–Beni validation index $S(c)$.
5) If $S(c) < S_{\text{best}}$, set $S_{\text{best}} = S(c)$ and $c^{*} = c$.
6) If the optimal solution is not found, then return to step 2.
7) If $c \ge c_{\text{limit}}$, stop. Otherwise, go to 8.
8) Set $c = c + 1$ and return to step 2 until the maximum number of clusters is reached.
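The search over the number of clusters can be sketched end to end as follows. This is a self-contained illustration with compact re-implementations of FCM and the Xie–Beni index; the synthetic data with three separated load groups, the seed, and the iteration count are assumptions.

```python
import numpy as np

def fcm(X, c, m=2.0, iters=100, seed=0):
    # compact fuzzy c-means: returns centers V, memberships U, distances D
    rng = np.random.default_rng(seed)
    U = rng.random((c, len(X)))
    U /= U.sum(axis=0)                       # membership constraint
    for _ in range(iters):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)          # centers
        D = np.maximum(np.linalg.norm(X[None] - V[:, None], axis=2), 1e-12)
        U = D ** (-2.0 / (m - 1.0))
        U /= U.sum(axis=0)                   # membership update
    return V, U, D

def xie_beni_index(V, U, D, m=2.0):
    # compactness over N times the minimum squared center separation
    d_min2 = min(float(np.sum((V[i] - V[j]) ** 2))
                 for i in range(len(V)) for j in range(len(V)) if i < j)
    return float(np.sum((U ** m) * D ** 2)) / (U.shape[1] * d_min2)

def best_partition(X, c_max, m=2.0):
    """Sweep c = 2..c_max, score each FCM partition with the Xie-Beni
    index, and keep the number of clusters with the lowest index."""
    best = None
    for c in range(2, c_max + 1):
        V, U, D = fcm(X, c, m=m)
        s = xie_beni_index(V, U, D, m=m)
        if best is None or s < best[0]:
            best = (s, c, V)
    return best

# three well-separated synthetic load groups (illustrative data)
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(loc, 0.2, size=(20, 2)) for loc in (0.0, 10.0, 20.0)])
score, c_star, centers = best_partition(X, c_max=5)
```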
IV. PROPOSED METHODOLOGY
Under the NN approach, the distribution system reconfiguration problem may be modeled as a pattern recognition problem in which there is a topology with minimum losses for each load pattern. The training set should present sufficient information so that the NN can provide an adequate mapping of the problem. The construction of the training set should also be independent of the system size in order for the NN to be applicable to realistic distribution systems.
The state of the system is described by a vector with the active and reactive demand at each system bus. Therefore, for each
system state, one or several radial topologies may exist that minimize the objective function. This concept is shown in Fig. 3.
Thus, the training pair is formed by the load at the buses (input
data) and the topology that minimizes the losses (output data).
A training pair of the set (8) presents the input for the state
Fig. 7. NN structure.
TABLE I
VALIDATION INDEX – 14-BUS SYSTEM
Fig. 5. Representative loads obtained by using the FCM algorithm with five clusters.
A. 14-Bus System
The system data can be found in [20], and the loads are classified into three groups (residential, commercial, and industrial) as in [22]. The loads are further discretized into four levels (100%, 85%, 70%, and 50%), which results in 64 states. For each state, a radial topology that minimizes the system losses has been identified with the reconfigurator presented in [21].
The number of clusters is found by searching over an increasing number of load groups; since there are few load buses, a small maximum number of clusters has been chosen. The indices are found by using the Xie–Beni algorithm (Section III-B) with the fuzzification parameter $m = 2$ [15]. The results are shown in Table I.
In Table I, the best results are obtained for the number of clusters that provides the smallest values of (16); a neighboring value also presents excellent performance, and for this reason, this parameter was also used in the simulation. The size of the input vector decreases to 46% of its original size, since the 26 elements (representing 13 buses with two data for each bus) reduce to 12 elements, given by two load centers for each of the three load groups with two elements each (active and reactive demand).
and reactive demand). The next step is to find the center of the
clusters for the 64 states and to determine the training set. The
first training set is similar to the one employed in [22] and the
other one is the proposal presented in Section IV.
The first NN presents a structure with 12 input neurons, corresponding to the dimension of the training vector; 20 neurons in the hidden layer; and seven neurons in the output layer, which represents the number of different topologies found. The transfer function in the hidden layer was the sigmoid, and the linear transfer function was used for the output layer. The training
data are available from the authors. Four algorithms were employed in the training phase: 1) backpropagation with an adaptive rate (BP-RA); 2) conjugate gradient with the Polak–Ribière update (CG-PR); 3) resilient backpropagation (RBP); and 4) conjugate gradient (CG). The training was executed with 54 of the 64 available cases.
TABLE II
TRAINING RESULTS – NN 12-20-7
The NN generalization ability was verified by
using 30 cases (ten of them chosen from the training set, ten
known situations but not belonging to the training set, and ten
cases generated randomly).
The training results are presented in Table II, in which the column "Iter." shows the total number of iterations needed to reach the specified tolerance, and the "Efficiency" column shows the percentage of cases learned by the NN. The remaining columns show the performance of the NN for the 30 cases employed for verifying the generalization ability; in particular, the "Gener." column indicates the performance of the NN for the 30 cases used for verification.
For example, an NN trained with RBP provides the same
results as a reconfiguration algorithm for 29 of 30 cases used
for verification. The only difference is observed for case 26 in
which the standard reconfigurator and the NN identifies the radial topologies number 3 and 5, respectively. Therefore, the results shown in Table II indicate that all algorithms employed for
the training phase present an efficiency of 100%. In the generalization phase, three of them present 96.67% of efficiency (i.e.,
29 of 30 present the same result for the standard approach and
the NN approach). It should be noted that, even with the difference observed in case 26, the NN provides a suboptimal topology that also decreases power losses.
An NN with topology 12-20-16 is also presented. The only
difference from the previous NN is in the output layer with 16
neurons (the number of branches in the system). The results for
this system are presented in Table III. Two columns were added:
Loops and Island, which are the number of unfeasible topologies
with loops and topologies with islanded buses, respectively. The
results were also promising, with 100% efficiency in the training
phase and in the generalization phase. The main difference with
this algorithm is the very low processing time, which makes
the approach very interesting for use in real-time mode. It is
important to note that the proposed approach outperforms other similar proposals presented in the technical literature.
B. 136-Bus System
The 136-bus system data can be found in [28] and represent realistic data from Três Lagoas, Brazil. The loads were classified and discretized in the same way as for the previous 14-bus system, creating the 64 states for the training set.
In order to decrease the size of the input vector, an optimal number of clusters for each load group was found. The results are shown in Fig. 8, in which the validation index (16) is calculated for different numbers of load centers.
TABLE III
TRAINING RESULTS – NN 12-20-16
TABLE IV
RESULTS WITH NN 28-150-156 – TRAINING PHASE
TABLE V
RESULTS WITH NN 28-150-156 – GENERALIZATION
TABLE VI
PROCESSING TIME (MEAN)
VI. CONCLUSION
This paper showed the feasibility of using the NN for solving
the distribution system reconfiguration problem for real systems. The application of clustering techniques associated with
validation techniques allows the construction of a reduced
training set but with sufficient information in order to provide
adequate learning by an NN. Several tests with different training
algorithms allowed the creation of an efficient and very fast NN that presents at least the same performance as reconfiguration programs that employ traditional optimization approaches.
The results also show that the proposed NN algorithm presents performance superior to that of other NN-based approaches to the same problem. Finally, the processing
times with NN are very low, making this approach suitable for
online applications, such as the distribution system restoration
problem.
REFERENCES
[1] M. Riedmiller, "Advanced supervised learning in multi-layer perceptrons: from backpropagation to adaptive learning algorithms," Comput. Standards Interfaces, Special Issue on Neural Networks, no. 5, 1994.
[23] N. I. Santoso and O. T. Tan, "Neural-net based real-time control of capacitors installed on distribution systems," IEEE Trans. Power Del., vol. 5, no. 1, pp. 266–272, Jan. 1990.
[24] D. Shirmohammadi and H. W. Hong, "Reconfiguration of electric distribution networks for resistive losses reduction," IEEE Trans. Power Del., vol. 4, no. 2, pp. 1492–1498, Apr. 1989.
[25] W. M. Lin, F. S. Cheng, and M. T. Tsay, "Distribution feeder reconfiguration with refined genetic algorithm," Proc. Inst. Elect. Eng., Gen., Transm. Distrib., vol. 147, no. 6, pp. 349–354, 2000.
[26] K. Nara, A. K. Deb, A. Shiose, M. Kitagawa, and T. Ishihara, "Implementation of genetic algorithm for distribution systems loss minimum reconfiguration," IEEE Trans. Power Syst., vol. 7, no. 3, pp. 1044–1051, Aug. 1992.
[27] D. Bouchard, V. L. Chikhani, V. L. John, and M. M. A. Salama, "Applications of Hopfield neural networks to distribution feeder reconfiguration," in Proc. 2nd IEEE Int. Forum on Applications of Neural Networks to Power Systems, 1993, pp. 311–316.
[28] R. A. Gallego, R. Romero, and A. Monticelli, "Optimal capacitor placement in radial distribution networks," IEEE Trans. Power Syst., vol. 16, no. 4, pp. 630–637, Nov. 2001.
[29] M. F. Moller, "A scaled conjugate gradient algorithm for fast supervised learning," Neural Netw., vol. 6, no. 4, pp. 525–533, 1993.
[30] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973.
[31] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed. Upper Saddle River, NJ: Prentice-Hall, 1999.
Harold Salazar (S'06) received the B.Sc. and M.Sc. degrees in electrical engineering from the Universidad Tecnológica de Pereira, Pereira-Risaralda, Colombia, in 1998 and 2003, respectively, and is currently pursuing the Ph.D. degree at Iowa State University, Ames.
Currently, he is an Assistant Professor at the Universidad Tecnológica de Pereira, Pereira, Colombia. His research interests include power system economics, financial markets, and intelligent systems applied to power systems.
Ramón Gallego received the B.Sc. degree in electrical engineering from the Universidad Tecnológica de Pereira (UTP), Risaralda, Colombia, in 1981, the M.Sc. degree from the Universidad Nacional de Colombia, Bogotá, in 1985, and the Ph.D. degree from the State University of Campinas, Campinas, Brazil, in 1997.
Currently, he is a Professor in the Electrical Engineering Department, Universidad Tecnológica de Pereira, Pereira, Colombia. His research interests include power system planning, power system optimization, and optimization models for production planning.
Rubén Romero (M'93) received the B.Sc. and P.E. degrees from the National University of Engineering, Lima, Perú, in 1978 and 1984, respectively, and the M.Sc. and Ph.D. degrees from the State University of Campinas, Campinas, Brazil, in 1990 and 1993, respectively.
Currently, he is a Professor of electrical engineering with the Paulista State University (FEIS-UNESP), Ilha Solteira, Brazil.