
This paper may be cited as:

M. A. Pasha, S. A. Hussain, M. Akhlaq, M. T. Khan, "Using Bayesian Neural Network for Modeling Users in Location-Tracking Pervasive Applications", in Proc. of National Conference on Information Technology and Applications, Balochistan, Pakistan, April 21-22, 2005.

Using Bayesian Neural Network for Modeling Users in Location-Tracking Pervasive Applications

Dr. Pasha M. A. R, Dr. Hussain S. A, Zia K, Akhlaq M, Khan M. T
Punjab University College of Information Technology
University of the Punjab, Lahore, Pakistan-54000
{marpasha, asadhussain, kashif, akhlaq, taimoor}@pucit.edu.pk

Abstract

Location-tracking is an important aspect of context-aware pervasive computing applications. Due to technological and cost constraints, the location-time pairs provided by current technologies may be minutes or hours apart. It is therefore necessary to predict the location of a user from previous information. Most researchers predict the location of a user from the history of the object's direction, last reported location and speed. Others have emphasized the need to include user models to estimate the location more effectively. Managing user models requires techniques to acquire and model the knowledge. A Bayesian network is a statistical inference technique that predicts the probability of an event on the basis of available information. Neural networks are parameterized, non-linear models used for empirical regression and classification modeling. In this paper we investigate a case study to evaluate the performance of Bayesian learning for neural networks in location-tracking applications for pervasive computing.

1 Introduction

Mark Weiser's vision of ubiquitous computing emphasized the need to unify computers and humans seamlessly in an environment rich with computing. The opening sentences of his seminal article "The Computer for the 21st Century"1 truly depict the essence of his vision: "The most profound technologies are those that disappear. They weave themselves into the fabric of everyday life until they are indistinguishable from it". A pervasive system has to be context-aware in order to be minimally intrusive2. Location-tracking is an important aspect of context-aware applications. It enables pervasive applications to serve the user based on his location, thus minimizing the need for user intrusion to demand the appropriate service.

Many applications based on location-tracking have been developed in the last decade, particularly under the flag of mobile computing3. These applications mostly use the Global Positioning System (GPS)4 as the location-sensing system. It is important to realize that not all objects (devices, users, sensors) need to be location-aware in a pervasive environment. Objects not capable of sensing their location directly can use location-sensing techniques based on 'Triangulation', 'Proximity' or 'Scene Analysis' to sense their location with respect to nearby location-aware objects3. Moreover, irrespective of the location-sensing system employed, it is mostly not possible to sense all location-time coordinates due to technological or cost constraints. Therefore a location-tracking system must deal with techniques for predicting the location of an object in between the reported location-time pairs, which may be minutes or hours apart. It is mandatory to predict the location of an object at a finer temporal point if the application needs it. Prediction leads to uncertainty. Hence managing uncertainty within tolerable bounds is critical for any decent location-aware pervasive application.

Researchers have criticized the use of traditional techniques in estimating the current location of an object5. These techniques utilize the history of moving objects to estimate the current location, using the object's direction, last reported location and speed. Computer scientists have claimed that this approach is too simplistic to meet real-time dynamic requirements5. In addition, they have emphasized the need to take the object's goals, personality and tasks into active consideration. In other words, there is a need to build user models to get a more precise estimation of an object's location.

2 User Models

User models are a means of representing a user (object)'s habits, preferences and behavior. For example, a cab driver may have a favorite route when driving from a typical source to a specific destination. This information can be modeled using a causality relationship represented by a directed graph, as shown in Fig. 1.

[Source and Destination each point to Route.]

Fig. 1. Directed graph representing the causality relationship

User models based on the above relationship would be instances of the blueprint presented by the graph. For example:
Cab1, City, Railway Station, Route1
Cab1, City, Airport, Route2
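The user-model instances above can be captured as simple records keyed by the causal blueprint of Fig. 1. A minimal sketch in Python; the class and field names are our own illustration, not from the paper:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UserModelInstance:
    """One instance of the Source/Destination -> Route blueprint of Fig. 1."""
    cab: str
    source: str
    destination: str
    route: str

# The two example instances for Cab1 listed above.
instances = [
    UserModelInstance("Cab1", "City", "Railway Station", "Route1"),
    UserModelInstance("Cab1", "City", "Airport", "Route2"),
]

def preferred_route(cab, source, destination, instances):
    """Look up the cab's learned route preference for a source-destination pair."""
    for inst in instances:
        if (inst.cab, inst.source, inst.destination) == (cab, source, destination):
            return inst.route
    return None  # no preference recorded yet

print(preferred_route("Cab1", "City", "Airport", instances))  # prints: Route2
```

A learning mechanism would add or overwrite such instances as each cab's reported behavior diverges from the initial allocation.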

National Conference on Information Technology & Applications, 05


To implement the causality relationship model, we initially need a prior assumption of route choice, which can be based on logical reasoning. In this particular example, from a source A to a destination B, a route choice may be based on the least distance between A and B, the time required to reach B, or the condition of the roads on the route. After deciding on the initial allocation, the user model should be updated based on each cab driver's preferences. So a learning mechanism is needed which can ensure a consistent user model based on reported facts.

As presented in Fig. 1, we need a network to represent the user models. The nodes of the network denote the states we are interested in, and the arcs represent causal connections. A directed acyclic graph is enough to represent this network. User models are full or partial instances of this network. The example network on which we have formulated our case study is borrowed from W. Abdelsalam and Y. Ebrahim's efforts5, with slight modification, and is shown in Fig. 2.

[Event and Time of day point to Source and Destination; Source and Destination point to Route; Weather Condition and Route point to Speed.]

Fig. 2. An example network for a taxicab location-tracking application

The network in Fig. 2 is an example for a taxicab location-tracking application. The network nodes and connecting arcs show the factors that might affect the current location of a certain taxicab. For example, starting from the nodes having no parents, we have three determining factors: Event, Time of day, and Weather Condition. Event is a list of events that can occur in the city, e.g. a holiday or a festival. The occurrence of an event has a direct relationship with the passenger's Destination, which can be an entertainment spot in case of a festival. Similarly, it has a relationship with passengers' Sources. Time of day determines the flow of passengers from typical Sources to Destinations. Weather Condition understandably affects the Speed of a cab. Further, the Route is a direct consequence of the Source and the intended Destination. To study the significance of user models based on this taxicab network, we formulated a case study depicted in Fig. 3.

3 The Case Study

The case study shown in Fig. 3 consists of two types of nodes. The nodes with a black background represent the factors affecting the choice of Speed and Route. The node Weather Condition affects only the Speed of the cab. The lower portion of Fig. 3 represents a map of the city. The nodes with a white background are the locations in the city; each can be treated as either Source or Destination. The black-and-white strips are roads connecting the spots. A combination of these roads from a typical Source to a Destination constitutes a Route. It is clear from the map that there can be more than one route between two nodes. Event and Time of day affect the selection of Source and Destination, and consequently the Route. Speed and Route, when combined, determine the current location of the taxicab.

To realize the case study, first of all we have to store the information of the map in our database. Every source/destination (Place) is interconnected, so there would be at least Place² basic datasets. Since a single combination can have more than one route, the total number of combinations would be larger. Some example datasets are:
Route (City, Railway Station, [R6, R5])
Route (City, Railway Station, [R6, R7, R2, R3])
Route (City, Railway Station, [R8, R7, R5])
Route (City, Railway Station, [R8, R2, R3])
This also covers the situation in which the source is Railway Station and the destination is City, if visualized in reverse order.

Having resolved the relationships in the map, which leads to the datasets describing the Routes, we are ready to take on the other factors. Let Event have three values for this example; the relations can be stored as:
Event (Festival)
Event (Holiday)
Event (No Event)
The datasets for Time of day can be:
Time of day (Morning)
Time of day (Evening)
The datasets for Weather Condition can be:
Weather Condition (Rainy)
Weather Condition (Normal)
The datasets of the above three factors have only one attribute, as they have no parent.

As we move on to Source, it is evident from Fig. 2 that it has two parents, i.e. Event and Time of day. So its datasets would be different combinations of Events and Times, such as:
Source (Festival, Morning, Railway Station)
Source (Festival, Evening, Airport)
Similar is the case with Destination:
Destination (No Event, Morning, Gulberg)
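The Route datasets listed above can be generated mechanically from the road map. The sketch below assumes a simplified excerpt of the map in which the intermediate junctions carry the hypothetical names J1-J3 (the full Fig. 3 map has more places and roads); under this assumed topology a depth-first search reproduces exactly the four example datasets:

```python
# Illustrative road graph: each road links two spots. Junction names J1-J3
# are our own placeholders, chosen so the Route datasets above come out.
ROADS = {
    "R6": ("City", "J1"), "R8": ("City", "J2"),
    "R7": ("J1", "J2"),   "R5": ("J1", "Railway Station"),
    "R2": ("J2", "J3"),   "R3": ("J3", "Railway Station"),
}

def routes(source, destination):
    """Enumerate all simple road sequences from source to destination (DFS)."""
    adjacency = {}
    for road, (a, b) in ROADS.items():
        adjacency.setdefault(a, []).append((road, b))
        adjacency.setdefault(b, []).append((road, a))
    found = []
    def dfs(place, visited, path):
        if place == destination:
            found.append(list(path))
            return
        for road, nxt in adjacency[place]:
            if nxt not in visited:
                visited.add(nxt)
                path.append(road)
                dfs(nxt, visited, path)
                path.pop()
                visited.discard(nxt)
    dfs(source, {source}, [])
    return found

for r in sorted(routes("City", "Railway Station")):
    print("Route (City, Railway Station, %s)" % r)
```

Reversing each road list likewise yields the Railway Station-to-City routes, matching the "visualized in reverse order" remark.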



[Fig. 3 combines the network of Fig. 2 (Event, Time of day, Weather Condition, Source/Destination, Route, Speed) with a map of the city. The map shows the places City, Railway Station, Jallo, Airport, Mozang and Gulberg, connected by the roads R1-R9; scale: 1 cm = 5 km.]

Fig. 3. Case study - taxicab location-tracking application
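As described above, Speed and Route together determine the current location of the taxicab between two reports. A minimal interpolation sketch in Python; the road lengths are illustrative assumptions (on the real map they would be read off at 1 cm = 5 km), chosen here so that the worked example of Section 3.1 lands on R4:

```python
# Hypothetical road lengths in km (assumed, not taken from the map).
ROAD_LENGTH = {"R6": 10.0, "R5": 8.0, "R3": 9.0, "R4": 20.0}

def locate(route, speed_kmh, minutes_elapsed):
    """Return (road, km along that road) after travelling at a constant
    speed along the given route since the last reported location."""
    distance = speed_kmh * minutes_elapsed / 60.0
    for road in route:
        if distance <= ROAD_LENGTH[road]:
            return road, distance
        distance -= ROAD_LENGTH[road]
    # Past the end of the route: clamp to the far end of the last road.
    return route[-1], ROAD_LENGTH[route[-1]]

print(locate(["R6", "R5", "R3", "R4"], speed_kmh=60, minutes_elapsed=30))
# prints: ('R4', 3.0) -> 30 km from the source puts the cab 3 km along R4
```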

It is noteworthy that Gulberg is a place in which most of the offices are located. Route can be explained by combinations of sources and destinations as described earlier. The last factor is Speed, which is quite easy to represent as it depends on Weather Condition and Route:
Speed (Rainy, [R6, R5], 40)
Speed (Normal, [R6, R5], 60)
Having recorded all the necessary information in the database, we are now ready to execute a scenario.

3.1 Example Scenario

We have a recorded location of Cab1 at 10 A.M. The next update will be available at 11 A.M. Our location-tracking application calculates the location of Cab1 at 10:30 A.M. Let's say the following information is available at 10 A.M.:
Source = City
Weather Condition = Normal
Event = Festival
Time of day = Morning
Given the above data, it is predicted that the user would be heading to Jallo (an entertainment spot) from City. He would take the route R6 → R5 → R3 → R4, since it seems to be the shortest (initially inferred by the system). Since the weather is fine, he would be travelling at 60 km/h. Multiplying 60 km/h by 30/60 h gives 30 km. We can conclude that he would have travelled 30 km from City along that route, and would be at R4.

Let's say at 11 A.M. we got his location, and he proved to have been at R2 at 10:30. This means that he took the route R6 → R7 → R2 → R4 for some reason. The behavior of Cab1 has changed due to its own preferences, and our system should learn from this behavior. For that we have to create an instance of our model for Cab1 and store this information against Cab1. A learning mechanism should be adopted against this instance to ensure cab-specific decisions in the future.

4 Learning Methodology

Managing user models requires techniques to acquire and model the knowledge. Predicate calculus has been a useful way of acquiring and modeling knowledge. The problem with predicate calculus is that the predicates must be written down knowing exactly how different variables affect other variables. In context-aware applications we cannot judge the exact behavior of variables in advance. The Bayesian network (BN) has been proposed5 as an alternative which does not need prior information about how variables affect each other. BNs implement a statistical inference technique that predicts the probability of an event on the basis of available information. Compared with predicate calculus, BNs are a more realistic choice for modeling users because they do not require exact knowledge of dependencies in advance. Moreover, the probability of an event can be refined by learning from history. Still, BNs require a static initial allocation of probabilities, which makes them susceptible to error-prone decisions for a substantial time when the true probabilities turn out to be almost opposite to the initial allocation. Researchers have exploited the overlap between the fields of neural networks and statistics, using the strengths of both worlds to guarantee appropriate learning in all situations6.

Artificial neural networks (ANNs) are parameterized non-linear models used for empirical regression and classification modeling. Their flexibility enables them to discover more complex relationships in data than traditional statistical models8. Conventional approaches of



learning in ANNs are based on the minimization of an error function, and are often motivated by some underlying principle such as maximum likelihood. Such approaches can suffer from a number of deficiencies, including the problem of determining the appropriate level of model complexity. More complex models (e.g. ones with more hidden units or with smaller values of the regularization parameters) give better fits to the training data, but if the model is too complex it may generalize poorly (the phenomenon of over-fitting)9.

There is considerable overlap between the fields of neural networks and statistics7. Work on Bayesian learning for neural networks shows that Bayesian methods allow complex neural network models to be used without fear of the over-fitting that can occur with traditional neural network learning methods10. For many practical problems, the hierarchical Bayesian neural network approach has been shown to be superior to one based on a hierarchical Bayesian logistic regression model, as well as to classical feed-forward neural networks11.

4.1 Bayesian Learning for Neural Networks

In Bayesian learning for neural networks, probability is used to represent uncertainty about the relationship being learned. Before we have seen any data, our prior opinions about what the true relationship might be can be expressed in a probability distribution over the network weights that define this relationship. After we look at the data, our revised opinions are captured by a posterior distribution over the network weights12.

5 Simulation

The evidence for the usefulness of Bayesian learning for neural networks is gathered by simulating a two-class classifier in MATLAB13. Classification was chosen as the test because the identification of user models based on history is a typical classification problem. The open-source MATLAB library Netlab14 was used to implement the statistical behaviors, along with the Neural Network Toolbox of MATLAB. We are thankful for Ian T. Nabney's efforts in coding "Bayesian classification for the MLP" in Netlab, which was used for the simulation with modest refinements.

We generated a synthetic dataset with two-dimensional input sampled from a mixture of four Gaussians. Each class is associated with two of the Gaussians, so that the optimal decision boundary is non-linear. The plot of the points generated by the two classes is shown in Fig. 4.

Fig. 4. Two-dimensional input sampled (black and gray) from a mixture of four Gaussians

We created a two-layer multilayer perceptron (MLP) network with 6 hidden units and one logistic output. A separate inverse-variance hyperparameter was used for each group of weights (inputs, input bias, outputs, output bias), and the weights were optimised with the scaled conjugate gradient algorithm. After every 100 iterations, the hyperparameters were re-estimated twice. There were eight cycles of the whole algorithm. Further, we trained an MLP without Bayesian regularisation on the same dataset using 400 iterations of scaled conjugate gradient. We plotted the functions represented by the trained networks. We show the decision boundaries (output = 0.5) and the optimal decision boundary given by applying Bayes' theorem to the true data model, as shown in Fig. 5. Note how the regularised network's predictions are closer to the optimal decision boundary, while the unregularised network is over-trained.

Fig. 5. Plot of the Bayes, regularized and unregularized network decision boundaries
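For reference, the synthetic dataset and the optimal Bayes decision can be reproduced along the following lines in Python with NumPy. The component means and the common variance below are our own assumptions (the paper does not list them); they are chosen so that each class is a two-Gaussian mixture whose optimal boundary is non-linear, as in Fig. 4:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes, two isotropic Gaussian components each (means are assumptions).
MEANS = {0: [(-1.0, -1.0), (1.0, 1.0)], 1: [(-1.0, 1.0), (1.0, -1.0)]}
SIGMA = 0.4  # common standard deviation, also an assumption

def sample(n_per_component):
    """Draw n points from each of the four components; return inputs and labels."""
    xs, ys = [], []
    for label, centres in MEANS.items():
        for centre in centres:
            pts = rng.normal(loc=centre, scale=SIGMA, size=(n_per_component, 2))
            xs.append(pts)
            ys.append(np.full(n_per_component, label))
    return np.vstack(xs), np.concatenate(ys)

def bayes_classify(points):
    """Optimal decision via Bayes' theorem on the true mixture densities
    (equal priors and component weights, so shared constants cancel)."""
    def class_density(pts, centres):
        d = 0.0
        for c in centres:
            diff = pts - np.asarray(c)
            d = d + np.exp(-np.sum(diff**2, axis=1) / (2 * SIGMA**2))
        return d
    return (class_density(points, MEANS[1]) > class_density(points, MEANS[0])).astype(int)

X, y = sample(50)  # 200 points: 50 per component, 100 per class
accuracy = float(np.mean(bayes_classify(X) == y))  # high for these well-separated components
```

A regularised MLP trained on (X, y), as in the Netlab simulation, could then be compared against this Bayes-optimal rule along the decision boundary.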



6 Conclusion and Future Work

User models are effective in estimating the location of a user in location-tracking pervasive applications. An appropriate learning mechanism is necessary to refine user models to reflect the current preferences of users, so that location-based applications can adapt their behavior according to context. Bayesian learning for neural networks provides a combined technique that utilizes the strengths of statistics and AI. It is shown through simulation that this technique is better than both Bayesian networks and neural networks implemented individually.

The case study presented in this paper can be a useful test bed for location-aware pervasive applications. In the future, we plan to present a framework to implement this case study with all essential ingredients, in addition to the learning techniques for user models addressed in this paper. Other issues include the structure of the datasets, inference techniques, moving-object databases, querying databases, physical map implementation, etc., which need further investigation.

References

[1]. M. Weiser, "The Computer for the 21st Century", Scientific American, Sept. 1991.
[2]. M. Satyanarayanan, "Pervasive Computing: Vision and Challenges", IEEE Personal Communications, 2001.
[3]. J. Hightower, G. Borriello, "Location Systems for Ubiquitous Computing", IEEE Computer, Aug. 2001.
[4]. A. El-Rabbany, "Introduction to GPS: The Global Positioning System", 2002.
[5]. W. Abdelsalam, Y. Ebrahim, "Managing Uncertainty: Modeling Users in Location-Tracking Applications", IEEE Pervasive Computing, Jul-Sep 2004.
[6]. D. J. C. MacKay, "Probable Networks and Plausible Predictions - A Review of Practical Bayesian Methods for Supervised Neural Networks", 1995.
[7]. B. Cheng, D. M. Titterington, "Neural Networks: A Review from a Statistical Perspective", Statistical Science, 9, 2-54, 1994.
[8]. D. J. C. MacKay, "Bayesian Non-linear Modeling for Neural Networks", 1995.
[9]. A. Weigend, "On Overfitting and the Effective Number of Hidden Units", Proceedings of the 1993 Connectionist Models Summer School, 335-342, 1994.
[10]. P. Mueller, D. R. Insua, "Issues in Bayesian Analysis of Neural Network Models", Neural Computation, 10, 571-592, 1995.
[11]. M. Ghosh, "Hierarchical Bayesian Neural Networks: An Application to Prostate Cancer Study", Journal of the American Statistical Association, September 2004.
[12]. http://www.faqs.org/faqs/ai-faq/neural-nets
[13]. http://www.mathworks.com
[14]. http://www.ncrg.aston.ac.uk/netlab/

