ANN+GIS: An automated system for property valuation: Noelia García, Matías Gámez, Esteban Alfaro

ARTICLE IN PRESS
Neurocomputing 71 (2008) 733–742

www.elsevier.com/locate/neucom
ANN+GIS: An automated system for property valuation

Noelia García, Matías Gámez, Esteban Alfaro
Faculty of Business and Economics, University of Castilla-La Mancha, Plaza de la Universidad 1, 02071 Albacete, Spain
Abstract
Although property valuation models have become an important paradigm in real estate market research, the results of the most well-known approaches are limited due to various data-related problems such as the non-linearity of relationships, the presence of noise, or the absence of necessary information. This paper focuses on overcoming these obstacles. We introduce an automated system for property valuation that combines artificial neural network models with a geographic information system, and both tools have shown their potential usefulness in the field of economic research. The artificial neural network models used in this work are the multilayer perceptron, the radial basis function, and Kohonen's maps.
© 2007 Elsevier B.V. All rights reserved.
Keywords: Artificial neural networks; Geographic information systems; Housing prices
1. Introduction

The task of valuing housing properties has been largely developed within real estate market analysis. This regression problem has usually been tackled econometrically through hedonic and repeat-sales models, both belonging to the transaction-based approach [23]. The hedonic pricing model was first developed by Rosen [18] in 1974 and constitutes a linear regression approach in which the property price is determined as the weighted sum of the different characteristics of which the property is made up. The other approach, the repeat-sales model, was introduced by Bailey et al. [3] in 1963. This model has been applied far less than the hedonic model due to the difficulty of finding the information required to implement it.

More recently, there have been some successful attempts from the geostatistics field [8]. Additionally, since the pioneering work of Borst [5] in 1991, artificial neural network (ANN) models have become a very attractive alternative to the more traditional econometric models. The main advantage of these techniques is the ability to deal with non-linear relationships or initially unknown models. Other works that must be mentioned are: Tay and Ho [20], Do and Grudnitski [6], Worzala et al. [22], McCluskey and Borst [14], and Nguyen and Cripps [16]. This latter work concludes that ANN performs better than multiple regression analysis if a sufficient training data size is provided. On the other hand, whichever approach is used, the analysis can be improved through integration with a geographic information system (GIS). As Thurston [21] stated, ANN linked to GIS can be used to simulate how the human brain processes spatial data problems. There are many applications in which ANN coupled to GIS has turned out to be very useful, for instance: land use, oceanography, forestry, consumer movement, airport noise evaluation and so on.

The aim of this work is to show how different ANN models and a geographic information system can be combined to constitute a very powerful tool for economic research, specifically for the design of an automated property appraisal system and for other complex tasks related to the real estate market (e.g., the objective assignation of a quality level to each property, which clearly has a large impact on the market value). In order to reach these goals, we will use three of the most well-known ANN models [4,10,13,17]: the multilayer perceptron (MLP), radial basis function networks (RBF), and self-organizing feature maps (SOFM), also known as Kohonen's maps [12]. The first two models represent a very interesting alternative to traditional methods with regard to regression and supervised classification, whereas SOFM are specially designed for clustering tasks. So, we will use the MLP and RBF models for the regression task of estimating housing prices, and MLP and Kohonen's maps for intermediate tasks relating to the imputation of missing values for various qualitative variables such as the quality of the property.

The second section of this article describes the crucial problem to be solved, i.e. the estimation of free housing prices in the city of Albacete (Spain), and provides essential details of the sample as well as preliminary data processing. Section 3 deals with the implementation of neural models and underlines the most important results. In Section 4, we combine the best model in terms of accuracy with the geographic information system by creating a computer program in the SciViews graphics environment of the free software R, and this combination provides the automated valuation system. Finally, Section 5 discusses some experimental results and suggests some changes in the design of databases related to real estate markets so that the generalized use of the procedure proposed in the previous section could be possible.

Corresponding author. Tel.: +34 967 599 200; fax: +34 967 599 220.
E-mail addresses: Noelia.Garcia@uclm.es (N. García), Matias.Gamez@uclm.es (M. Gámez), Esteban.Alfaro@uclm.es (E. Alfaro).
0925-2312/$ - see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.neucom.2007.07.031

Fig. 1. Street map of Albacete.

Table 1
Confusion matrix for MLP and SOFM models

                  Assigned class (MLP)          Assigned class (SOFM)
Real class        Bad  Standard  Very good      Bad  Standard  Very good  No class
Training
  Bad              43      1         0           30     10         0         4
  Standard          7    203         2            5    198         5         4
  Very good         0      8        24            2     15        14         1
Validation
  Bad              14      5         0           13      6         0         0
  Standard          5     96         0            6     89         5         1
  Very good         2      7         8            0     13         4         0
Test
  Bad              17      3         0           14      5         1         0
  Standard          5     91         3            1     90         3         5
  Very good         0      3        15            0      9         8         1
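The overall hit rate of each block of Table 1 can be recomputed from its diagonal. The sketch below does this for the test rows only; the function name `accuracy` and the list layout are illustrative choices, not part of the original study, and SOFM cases left in the "No class" column are counted as misclassified (an assumption about how the percentage is formed).

```python
# Test-set blocks of Table 1. Rows are the real classes
# (Bad, Standard, Very good); columns are the assigned classes.
mlp_test = [
    [17, 3, 0],
    [5, 91, 3],
    [0, 3, 15],
]
# The SOFM block has a fourth column: cases assigned to no class.
sofm_test = [
    [14, 5, 1, 0],
    [1, 90, 3, 5],
    [0, 9, 8, 1],
]

def accuracy(matrix):
    """Share of cases on the diagonal (real class == assigned class)."""
    correct = sum(row[i] for i, row in enumerate(matrix))
    total = sum(sum(row) for row in matrix)
    return correct / total

mlp_acc = accuracy(mlp_test)    # 123 correct out of 137 cases
sofm_acc = accuracy(sofm_test)  # 112 correct out of 137 cases
print(f"MLP: {mlp_acc:.1%}, SOFM: {sofm_acc:.1%}")
```

Both blocks contain 137 cases, so the two rates are directly comparable.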
2. Problem description

As we have already mentioned, the main objective of this paper is to develop an automated valuation system. Such a system must be able to estimate the market value of a property from information about its location and other characteristics that may have some influence on it. The starting point is to compile the database. The available information was obtained by a sampling procedure from data supplied by real estate agencies, due to the lack of any official information relating to relevant characteristics such as quality, parking, heating, etc. After much hard work, we obtained 591 sample cases corresponding to real market transactions conducted in Albacete in 2002. For more details, see [9]. The sample records contain a wealth of information about the following explanatory variables:

• Property type: a qualitative variable with the value 0 (apartments) and 1 (single family houses).
• Location: the postal address is converted into two numerical variables, CoordX and CoordY, using the location of the exact point on the geo-referenced¹ street map.
• Age: expressed in years.
• Surface: measured in usable square metres.
• Number of bedrooms: number of rooms apart from the living room.
• Bathrooms: numerical variable resulting from adding one point for a complete bathroom and half a point for an incomplete bathroom without a bath or shower.
• Lift: a qualitative variable with a value of 1 if there is a lift and 0 if not.
• Balcony: a qualitative variable with a value of 1 if there is a balcony with more than 15 m².
• Heating: a qualitative variable with a value of 0 if there is no heating system and 1 if there is.
• Quality: a qualitative variable with three categories:
  ◦ Bad, for old houses built with poor quality materials and in a bad state of repair.
  ◦ Standard, for new or semi-new houses that were built according to standard levels of quality and for old renovated properties.
  ◦ Very good, for top quality new houses.
• Parking: quantitative variable that indicates the number of parking spaces.
• Storage room: a qualitative variable with a value of 0 if there is no storage room and 1 if there is.
• Gablod: in addition to the information provided by the agencies, the use of GIS has enabled us to include another variable that measures the distance to the well-accepted city centre, i.e. Plaza de Gabriel Lodares.²
• Total housing price³: from this variable we have obtained a new one, the square meter price, as the ratio between the total price and the usable surface. These variables constitute the dependent variables, i.e. the outputs in the different neural models that will be designed.

¹ This tool was provided by the Teledetection and GIS section of the Instituto de Desarrollo Regional, having received the necessary authorization from the Mayor of Albacete.
² The square Plaza de Gabriel Lodares has been taken as the city centre after ruling out other points such as the Plaza del Altozano or the City Hall. The reason for this is that these historical centres have been displaced by the growth of the city, which has been restricted towards the north by the railway lines.
³ It is important to point out that these prices come from true buying and selling operations rather than the offer prices that are largely used in work on this subject.

Fig. 2. MLP 14:16–5–1:1 architecture (inputs: Property type, Coord. X, Coord. Y, Age, Surface, Bedrooms, Bathrooms, Lift, Balcony, Heating, Quality, Parking, Storage room, GabLod; output: Total housing price).

Fig. 1 reproduces the electronic map of Albacete, with the dots representing the 591 sample records.

It is worth mentioning that further information was requested from the agencies, such as the main orientation of the building or the presence of recreational areas, swimming pools, gardens, etc. Unfortunately, however, much of the information about these variables was found to be missing and so could not be used. For the remaining variables with a reasonable number of omitted values, we decided to complete them using the k-nearest neighbours technique for the quantitative variables and ANN models⁴ (MLP and SOFM) for classification tasks for the qualitative variables.

⁴ The neural models have been estimated using TRAJAN software.

In what follows, we will briefly describe the procedure for the quality variable. This variable gathers a great deal of information and can often be very subjective. For 29 cases in the total sample, the agencies had not labelled the level of quality and there was not enough information to assign it (home improvements, condition of the floors, carpentry, windows, etc.), and so we used the remaining 562 cases to develop a network capable of estimating the missing data. To measure the true error and select the networks with the best generalization capacity, the available samples were divided into three different sets: the training set (288 cases), the validation set (137 cases), and the test set (137 cases). After many trials, the two best networks were selected: a 14:14–6–3:1 MLP (14 nodes in the input layer, a hidden layer with six nodes, and one node for each quality level in the output layer), and a 14–49 SOFM⁵ (14 units in the input layer and 49 nodes in the competition layer). Table 1 shows the confusion matrix for both models.

From the diagonal form of the confusion matrix in Table 1, we can conclude that there is hardly any confusion between the extreme classes. Paying special attention to the results for the test set, the percentage of cases correctly classified by the MLP is close to 90%, while Kohonen's map is only capable of classifying 82% of the cases. Since the MLP results were more satisfactory than those of Kohonen's map, missing data were substituted with the MLP predictions.⁶

Fig. 3. Correlation between target and estimated prices.

Fig. 4. Error histogram for the test set (x-axis: Error; y-axis: Frequencies).

Table 2
Regression statistics

                                      Price/m² (€/m²)          Total price (€)
                                      MLP         RBF          MLP          RBF
Data mean                             1285.9390   1292.1970    139 967.7    135 204.4
Data SD                               296.0914    301.5658     46 032.14    47 511.7
Error mean                            18.3310     24.8083      271.8061     1183.905
Error SD                              139.3686    153.6088     13 049.79    17 601.32
Absolute error mean                   98.6936     116.802      9085.213     13 220.65
Relative error mean (%)               7.6748      9.0390       6.4909       9.7783
SD ratio                              0.4707      0.5093       0.2835       0.3705
Correlation (target and estimation)   0.8833      0.8606       0.9591       0.9317
R²                                    0.8129      0.7406       0.9196       0.8628
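As a small consistency check on Table 2, the SD ratio row can be recomputed as the error SD divided by the data SD for each of the four models. The dictionary layout and the helper name `sd_ratio` below are illustrative assumptions; the numbers are copied from the table.

```python
# (data SD, error SD) for each model, copied from Table 2.
table2 = {
    "Price/m2, MLP": (296.0914, 139.3686),
    "Price/m2, RBF": (301.5658, 153.6088),
    "Total price, MLP": (46032.14, 13049.79),
    "Total price, RBF": (47511.7, 17601.32),
}

def sd_ratio(data_sd, error_sd):
    """Ratio of prediction-error SD to data SD.

    A value of 1.0 is what a constant mean estimator would achieve;
    values well below 1.0 indicate useful regression performance.
    """
    return error_sd / data_sd

ratios = {name: sd_ratio(*sds) for name, sds in table2.items()}
best = min(ratios, key=ratios.get)  # model with the lowest ratio

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.4f}")
print("Lowest SD ratio:", best)
```

All four ratios fall well below 1.0, and the lowest one belongs to the MLP estimating the total price.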
3. Estimation of free housing prices

Having completed the sample, we now come to the main objective of this research, that is, the estimation of property prices in Albacete. This task is approached from two perspectives, the square meter price and the total price, and for each one we trained a large number of MLP and RBF networks. Two models were then selected for each dependent variable. Table 2 shows the main results for the four models.

The most significant value is the prediction error standard deviation. If this measure is not better than the training data standard deviation, then the network has performed no better than a simple mean estimator. We can analyse the explained variance of the model through the ratio of the prediction error SD to the training data SD. A value significantly lower than 1.0 indicates good regression performance. In light of the previous regression statistics, we could conclude that the best results were obtained for the MLP estimating the total price. In what follows, we will study the process of designing and training this model in detail.

The MLP architecture proposed here was selected after a large number of trials. The number of nodes in the input and the output layer was determined by the structure of our analysis, i.e. the number of explanatory and output variables, respectively. On the other hand, the number of hidden layers and elements in each one was chosen by taking as a criterion the construction of a network with the least possible complexity. This objective resulted in our selecting a 14:16–5–1:1 network, i.e. an input layer with 14 nodes, pre-processed into 16 nodes,⁷ a hidden layer with five elements, and finally an output layer with one. This architecture is shown in Fig. 2.

The following training decisions were set. Firstly, the sample was divided into three subsets: a training set (50%), a validation set (25%), and a test set (25%). The activation functions were selected to be linear in the input layer and sigmoid in the hidden and output layers. Weights and thresholds were then randomly set. The network was trained with the Delta-Bar-Delta⁸ algorithm with the sum of squared errors (SSE) as the error function. The details of the algorithm were:

• Maximum number of epochs: 2500.
• Initial learning rate: 0.001.
• Increment: 0.07.
• Decay: 0.5.
• Smoothing⁹: 0.5.

Once the network has been completely trained after 2346 epochs, the model must be validated and for this, we will reanalyze the results in Table 2. For the test set, the percentage of the variance that has been explained is close to 92%, the correlation coefficient between the estimated and the target output is 0.96, and the absolute error mean is about €9085.

It is, however, worth specifying two points. Firstly, we know that a high correlation coefficient does not imply coincidence between the estimated price and the target. However, if we look at the graph in Fig. 3, the idea of coincidence is not unreasonable since the trend line almost coincides with the diagonal; in other words, most points are close to the diagonal line where the estimated prices and the target prices are the same. Secondly, as Fig. 3 shows, only a few points are a long way from the trend line for the test set. Moreover, an analysis of the error histogram in Fig. 4 reveals that only by eliminating six extreme errors do the results change significantly. More specifically, the absolute error mean and the relative mean error drop to €7811 and 5.65%, respectively.

Once the model has been satisfactorily validated, the relative contribution of each input variable to the global performance can be assessed by means of a sensitivity analysis, which entails testing network performance as if the input variable were unavailable. Table 3 shows the ratio for each variable. This ratio relates the error obtained when the corresponding input variable is unavailable to the error obtained when all variables are available. A ratio of one or lower therefore means that pruning this input variable has no adverse effect on network performance.

Table 3
Sensitivity analysis

Ranking  Variable   Ratio       Ranking  Variable   Ratio
1        Surface    2.6978      8        Quality    1.4074
2        Heating    2.1082      9        Balcony    1.2087
3        Lift       1.9416      10       Age        1.1859
4        GabLod     1.7586      11       Coord. Y   1.1514
5        Type       1.6852      12       Bedroom    1.1129
6        Parking    1.5713      13       Coord. X   1.0972
7        Storage    1.4987      14       Bathroom   1.0486

It can be seen that all input variables passed the sensitivity analysis.

Although an artificial neural network model is usually considered a black box, in the sense that it becomes very difficult to explain the relationship between each input and the output, this task can be resolved through response surface graphs. Fig. 5 shows two of the most interesting response surfaces. The graph on the left-hand side analyses the network response for the age and the distance to the city centre, and the graph on the right-hand side analyses the response for surface and age.

Fig. 5. Response surfaces (left: GabLod and Age; right: Surface and Age).

As expected, according to the monocentric assumption,¹⁰ the slope of the response to GabLod was negative. However, it is worth pointing out that in centrally located areas, the oldest properties command the highest prices. On the other hand, the response had a non-linear behaviour further from the centre, in such a way that the highest prices were found for both extremes of age. The response graph for surface and age shows that for the largest properties, the price tends to decrease with age, whereas for medium-sized properties, the price only decreases with age until a certain point after which it starts to increase. This suggests an interesting non-linear behaviour in the effect of age, as the univariate response graph in Fig. 6 also shows.

Fig. 6. Network response to the age control (x-axis: Age, 0 to 50 years; y-axis: Price, 90000 to 100000).

Fig. 6 shows that when the other characteristics are kept at a constant value that is equal to their mean, the lowest price is reached for 22-year-old houses. This result is in line with the one shown in [6], where the authors found evidence of the existence of a reversal in the relationship between age and value, since the negative sign only holds for the first 16–20 years of the property's life. The reason for this behaviour was theorized by Sabella [19], who argued that after this point the property price starts to increase due to the increasing value of the plot of land where the property is built. At this point, it is worth stressing the usefulness of flexible models such as ANN for modelling non-linear relationships as in this case.

Fig. 7. Initial application page.
Fig. 8. Data editor to enter property characteristics.
Fig. 9. Zoom into the area where the property in the example is located.

⁵ Although self-organizing feature maps were primarily designed to solve clustering tasks, they can also be used for supervised classification tasks. Once the SOFM has been trained, we can label each competition node. The labels allow us to compute the error as the percentage of well-classified cases, as is normal in supervised classification. In this work, one restriction has been imposed: at least 50% of the cases where one node is the winner must belong to the same class for this node to be labelled with this class. With this restriction, we attempt to strike a balance between the number of nodes that will remain unlabelled and trust in the labelled neurons. In this case, the result is that nine cases in the training set were not classified, one in the validation set and six in the test set.
⁶ Nevertheless, an exhaustive comparison was carried out which concluded that agreement between both procedures exceeded 75%.
⁷ All variables were pre-processed before being introduced into the network. The numerical variables were scaled to produce new variables in the range 0–1. The qualitative variables were encoded by the two-state technique in a single input variable, except for the variable quality, which was converted using the one-of-N method. This technique uses a set of variables, one for each possible nominal value. In this case, there were three categories of quality, so the total number of variables changes from 14 to 16.
⁸ The Delta-Bar-Delta rule, proposed in [11], is an improvement of standard back propagation. The objective is to accelerate the convergence of the learning process from the following idea: since the error surface may have different gradients along the direction of each weight, it might be desirable to allow the learning rates to differ for each adjustable parameter in the network and to allow these rates to be adaptable during the epochs in the learning process. The learning rate should be higher when changes for one weight in consecutive epochs occur in the same direction but lower when the signs of those changes are opposite.
⁹ It is worth mentioning that the increments on the learning rate are linear whereas the decays are set to be exponential. This is necessary in order to prevent the learning rate from growing too fast while allowing it to decay rapidly if needed and guaranteeing its positive sign.
¹⁰ This assumption has been widely studied since the pioneering research by Alonso [1], and refers to the existence of a central business district (CBD) where economic and social activities are concentrated. Since property prices are supposedly higher in this area, the relationship between the distance to the CBD and value should have a negative slope. After Alonso's work, many authors' research has been driven by this assumption: e.g. Mok, Chan and Cho [15]; Atack and Margo [2] or Dunse and Jones [7]. Nevertheless, there have been a considerable number of studies that have replaced the monocentric assumption with the non-monocentric or polycentric one, assuming the presence of a group of sub-centres that makes it impossible to observe an inverse relation between the value and the distance to the main centre. However, we consider the polycentric assumption to be more plausible in cities which are larger than Albacete, which has nearly 160 000 inhabitants living in 1234 km².
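The Delta-Bar-Delta rule summarized in footnotes 8 and 9 can be illustrated with a minimal sketch. It is not the TRAJAN implementation used in the study: the function name, the parameter names `kappa`, `phi` and `theta`, and the toy quadratic error surface are assumptions made for illustration. Only the default values (initial learning rate 0.001, increment 0.07, decay 0.5, smoothing 0.5) and the qualitative behaviour (linear increments, exponential decays, one learning rate per weight) come from the text.

```python
def delta_bar_delta(grad, w, epochs, lr0=0.001, kappa=0.07, phi=0.5, theta=0.5):
    """Gradient descent with one adaptive learning rate per weight.

    grad  : function returning the gradient (list of floats) at w
    kappa : additive increment used while the gradient keeps its sign
    phi   : multiplicative decay used when the gradient changes sign
    theta : smoothing factor of the running average of past gradients
    """
    w = list(w)
    lr = [lr0] * len(w)
    bar = [0.0] * len(w)  # exponentially smoothed past gradients
    for _ in range(epochs):
        g = grad(w)
        for i in range(len(w)):
            agreement = bar[i] * g[i]
            if agreement > 0:        # same direction: grow linearly
                lr[i] += kappa
            elif agreement < 0:      # opposite direction: shrink exponentially
                lr[i] *= 1.0 - phi
            w[i] -= lr[i] * g[i]
            bar[i] = (1.0 - theta) * g[i] + theta * bar[i]
    return w, lr

# Toy error surface (an assumption): E(w) = 0.5 * (1*w0^2 + 10*w1^2),
# chosen so that the two weights see very different curvatures.
CURVATURE = [1.0, 10.0]

def loss(w):
    return 0.5 * sum(c * x * x for c, x in zip(CURVATURE, w))

def grad(w):
    return [c * x for c, x in zip(CURVATURE, w)]

w_start = [1.0, 1.0]
w_final, lr_final = delta_bar_delta(grad, w_start, epochs=200)
print(loss(w_start), loss(w_final), lr_final)
```

Because the decay is multiplicative, each learning rate stays positive however often the gradient changes sign, which is the property footnote 9 points out.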
4. GIS and neural model integration: automated and intelligent valuation system

Once the model has been estimated and validated, it can be used to estimate prices for properties from the sample, and this task should be done in the easiest way possible. In order to do this, the estimated neural model can be combined with the geographical information system so that it is only necessary to click on a dot on the Albacete map to select a property and complete a data editor with its particular characteristics. In this work, this has been done in the R SciViews graphics environment, and the application has been personalized by creating a series of buttons to allow non-specialists to use the program. Fig. 7 shows the initial page with a brief description of the most important buttons.

The process can be summarized as follows. The first step is to click on the maptools button so as to load the different layers of the geographical information system (streets, blocks, etc.) and also the necessary tools for map navigation. The following step is to click on the Albacete map button to visualize it and to use the mouse and the zoom buttons to move through the city in search of a property whose price needs to be estimated. Once the house has been located, we must click on the point estimation button and place the pointer in the exact location. A data editor such as the one shown on the left-hand side of Fig. 8 then appears so that the property characteristics may be entered. It is worth mentioning that variables such as the location or the distance to the city centre are shown automatically thanks to the use of the GIS.

Let us take as an example the estimation of the price of a flat located at 10 Cristóbal Lozano Street. We imagine that the flat is 17 years old, 95 m², and standard quality, and has three bedrooms, a full bathroom, a toilet, a lift, no balcony, a heating system, one parking space, and no storage room. The estimated price given by the neural network system, in the example €127 592, appears in a box on the location of the property in the map (Fig. 9).

Finally, a more complete output can be ordered by clicking on the Make a report button. In this report (Fig. 10), we can find the main data for the example and the respective estimated price.

Fig. 10. Final report of the estimation process.

5. Conclusions

In this work, we have presented the construction of an automated valuation system through the combination of an artificial neural model and the geographical information system. The combination of ANN and GIS has proved to be a very powerful and useful tool for the task of real estate valuation, and these results could surely be extended to any problem dealing with spatial data.

With regard to the particular results of our empirical work, the objective has been reached satisfactorily, and the MLP model has performed better than SOFM in the quality level assignation problem. Comparing the performance of MLP and RBF models for property price estimation, the best results were achieved by the MLP estimating the total price. This network yielded an R² of 0.92 and, once six extreme errors were removed, a relative mean error of 5.65%. One possible reason for the weaker RBF results is the size of the available sample (591 records), which is small for RBF network requirements.

The sensitivity analysis showed that one of the most important variables was the distance to the central business district (in this work, Plaza Gabriel Lodares, variable Gablod), with a negative slope according to the monocentric assumption. Another important result was the non-linear behaviour of the effect of age on the housing price. In this sense, it is worth mentioning the capability of neural models to extract non-linear relationships that cannot be detected by more traditional models.

Our work is by no means finished, and in the future we will explore the possibilities of the GIS and include additional inputs in the analysis such as accessibility, existence of green spaces, distance to educational centres, and other social, economic and geographical factors. For this, it is imperative that geographical information is regularly updated so that data relating to the housing market is easily available. This is the only way to keep the automated valuation system up-to-date, for if it is not, the system will lose its usefulness.

References

[1] W. Alonso, Location and Land Use, Harvard University Press, Cambridge, MA, 1964.
[2] J. Atack, R.A. Margo, Location, location, location! The price gradient for vacant urban land: New York, 1835–1900, J. Real Estate Financ. Econ. 16 (2) (1998) 151–172.
[3] M.J. Bailey, R.F. Muth, H.O. Nourse, A regression method for real estate price index construction, J. Am. Stat. Assoc. 58 (1963) 933–942.
[4] C.M. Bishop, Neural Networks for Pattern Recognition, Clarendon Press, Oxford, 1995.
[5] R.A. Borst, Artificial neural networks: the next modelling/calibration technology for the assessment community?, Prop. Tax J. 10 (1) (1991) 69–94.
[6] A.Q. Do, G. Grudnitski, A neural network analysis of the effect of age on housing values, J. Real Estate Res. 8 (2) (1993) 253–264.
[7] N. Dunse, C. Jones, A hedonic price model of office rents, J. Prop. Val. Invest. 16 (3) (1998) 297–312.
[8] M. Gámez, J.M. Montero, N. García, Kriging methodology for regional economic analysis: estimating the housing price in Albacete, Int. Adv. Econ. Res. 6 (3) (2000) 438–450.
[9] N. García, Diseño de Redes Neuronales Artificiales para el Mercado Inmobiliario. Aplicación a la ciudad de Albacete, Ph.D. Thesis, Department of Business and Economics Sciences, University of Castilla-La Mancha, 2004.
[10] S. Haykin, Neural Networks. A Comprehensive Foundation, Prentice Hall, Englewood Cliffs, NJ, 1994.
[11] R.A. Jacobs, Increased rates of convergence through learning rate adaptation, Neural Networks 1 (4) (1988) 295–307.
[12] T. Kohonen, Self-Organizing Maps, second ed., Springer, Berlin, Heidelberg, 1997.
[13] B. Martín del Brío, A. Sanz, Redes Neuronales y Sistemas Borrosos, RA-MA, 1997.
[14] W. McCluskey, R. Borst, An evaluation of MRA, comparable sale analysis, and ANNs for the mass appraisal of residential properties in North Ireland, Assess. J. 4 (1) (1997) 47–55.
[15] H.M.K. Mok, P.P.K. Chan, Y.S. Cho, A hedonic price model for private properties in Hong-Kong, J. Real Estate Financ. Econ. 10 (1995) 37–48.
[16] N. Nguyen, A. Cripps, Predicting housing value: a comparison of multiple regression analysis and artificial neural networks, J. Real Estate Res. 22 (3) (2001) 314–326.
[17] B.D. Ripley, Pattern Recognition and Neural Networks, Cambridge University Press, Cambridge, 1999.
[18] S. Rosen, Hedonic prices and implicit markets: product differentiation in pure competition, J. Polit. Econom. 82 (1) (1974) 34–55.
[19] E.M. Sabella, Determining the relationship between the property's age and its market value, Assessors J. 9 (1974) 81–85.
[20] D.P.H. Tay, D.K.K. Ho, Artificial intelligence and the mass appraisal of residential apartments, J. Prop. Val. Invest. 10 (1991/1992) 524–540.
[21] J. Thurston, GIS & Artificial Neural Networks: Does your GIS Think?, GISVision Magazine, 2002.
[22] E. Worzala, M. Lenk, A. Silva, An exploration of neural networks and its application to real estate valuation, J. Real Estate Res. 10 (2) (1995) 185–201.
[23] C.Y. Yiu, C.S. Tam, A review of recent empirical studies on property price gradients, J. Real Estate Lit. 12 (3) (2004) 307–322.

Noelia García teaches Statistics at the Faculty of Economic and Business Sciences in the University of Castilla-La Mancha. She got her degree in Economics at the University of Madrid (UAM) in 1996 and completed her Ph.D. in Economics in 2004 on the construction of an intelligent and automated system for property valuation through the combination of neural nets and a geographic information system (GIS). Current research deals with spatial statistics and the combination of classifiers (decision trees and neural nets) for solving hot topics in economics.

Matías Gámez teaches Statistics at the Faculty of Economic and Business Sciences in the University of Castilla-La Mancha. He got his degree in Mathematics at the University of Granada in 1991 and finished a Master's degree in Applied Statistics a year later. He completed his Ph.D. in Economics at the University of Castilla-La Mancha in 1998 on the application of geo-statistical techniques to the estimation of housing prices. Current research deals with spatial statistics and the combination of classifiers (decision trees and neural nets) for solving hot topics in economics.

Esteban Alfaro teaches Statistics at the Faculty of Economic and Business Sciences in the University of Castilla-La Mancha. He completed his degree in Business in 1999 and got his Ph.D. in Economics in 2005, both in the University of Castilla-La Mancha. His thesis dealt with the application of ensemble classifiers to corporate failure prediction. Current research deals with spatial statistics and the combination of classifiers (decision trees and neural nets) for solving hot topics in economics.
