
Training a Simulated Soccer Agent How to Shoot Using Artificial Neural Networks

Mir Hossein Dezfoulian, Nima Kaviani, Amin Nikanjam, Mostafa Rafaie-Jokandan


RoboSina Simulation Research Group
Department of Computer Engineering, Bu-Ali Sina University
Hamedan, Iran
{dezfoulian,nima,nikanjam,rafaie}@basu.ac.ir
http://basu.ac.ir/~robosina

Abstract. One of the main objectives in a soccer game is to score goals. It is, therefore, important for a robotic soccer agent to have a clear policy about whether it should attempt to score in a given situation and, if so, which point in the goal it should aim at. In this paper we describe the implementation of a learning algorithm used in the RoboSina agent to satisfy this requirement. Using this method, the agent adopts an optimal scoring policy. Practical results of the implementation are also included and discussed.
Keywords: RoboCup, Artificial Neural Networks, Soccer Simulation, Learning Algorithm.

1. Introduction
A soccer game is the joint effort of several agents on the pitch to achieve the ultimate goal of winning the game, in which scoring is the key point and attracts much attention. RoboCup Simulation models a real soccer game and provides a standardized domain for Artificial Intelligence research and learning algorithms. In such a domain, defining an optimal scoring policy that strongly improves the agent's chance of success is of great importance, and it can be treated as a problem to which learning algorithms can be applied. The Optimal Scoring Problem is stated as follows: given a chance to shoot, the agent must find the point in the goal where the probability of scoring is highest [1]. As the optimal scoring problem is well suited to machine learning techniques, previous work has been carried out in this area, each effort testing a specific family of machine learning algorithms along with new ideas and methods [1, 5]. In this paper we introduce a new approach and discuss its results; more on the topic can be found in [4]. Section 2 gives the background of the problem and some definitions. Section 3 describes our approach to the scoring problem and the learning method, together with empirical results. Section 4 concludes the paper.

2. Background
The purpose of the RoboCup Simulated Soccer League is to provide a standardized problem domain for Artificial Intelligence research based on a soccer simulation called the RoboCup Soccer Server [2]. Teams of soccer agents programmed by researchers from all over the world compete with each other using this simulator. Each team consists of 11 autonomous agents that closely simulate human roles in a real soccer match. Following the definition of the optimal scoring problem, a scoring situation is defined by the distance and angle of the ball with respect to the goal, the position of the ball, and the position of the goal keeper. These parameters are collected from the subjective world model maintained by the attacker itself, with positions in Cartesian form. The scoring policy is a function of the scoring situation that determines two things: whether a shot at the goal is feasible, and the position to which the ball should be shot. In the simulated soccer environment the motion of the ball can be regarded as a geometrically constrained continuous-time Markov process, so learning algorithms can readily be applied in this domain. The Artificial Neural Network is a popular machine learning method widely used in RoboCup. Its ability in function estimation, generalization, and offline training, and its suitability for such a Markov model, led us to use it for training the agent.
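The scoring-situation vector described above can be sketched in Python. The goal coordinates and function names here are illustrative assumptions, not taken from the RoboSina code; the soccer server places the opponent goal centre at (52.5, 0).

```python
import math

# Assumed goal-centre coordinates (soccer server convention).
GOAL_X, GOAL_Y = 52.5, 0.0

def scoring_situation(ball, keeper):
    """Build the six-element input vector: distance and angle of the
    ball to the goal centre, ball (x, y), and goal keeper (x, y)."""
    bx, by = ball
    kx, ky = keeper
    dist = math.hypot(GOAL_X - bx, GOAL_Y - by)
    angle = math.atan2(GOAL_Y - by, GOAL_X - bx)
    return [dist, angle, bx, by, kx, ky]
```

These six values correspond directly to the six input nodes of the network described in Section 3.2.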

3. Application
We divided our application into three phases: providing appropriate data and optimal target values for the supervised neural network; creating a suitable network and training it on the data; and evaluating the performance of the network after the training procedure.

3.1 Data Gathering (Mapping the Scoring Problem)

The first step in mapping the scoring problem is to discretize the problem space into several classes. The main idea is to divide the goal line into sections. The width of the goal is 14.02 m, so we divide it into 29 equal sections of roughly 0.48 m each. Each section represents a unique class in the problem space; Figure 1 shows this partitioning. Sections are numbered S1 to S29. When the ball crosses the goal line (a goal is scored) in a specific section, the corresponding class is regarded as a successful class. To recognize the successful classes in a specific situation (a goal attempt), we shot the ball at the sections sequentially (S1 to S29), once per section. Consecutive successful classes form a successful group (e.g. S1-S3, where S1, S2 and S3 are all successful). The largest successful group is called the proper group, and the mid-class of the proper group is taken as the best section, the one actually shot at. For example, in a given scoring situation S1-S3 and S21-S29 may be the only successful groups (groups in which the ball crosses the goal line); S21-S29 is the larger group, so it is the proper group, and its middle section, S25, is the best section. If no successful group exists, the situation is recorded as a Fail; otherwise it is a Success. Since all 29 tries must start from identical conditions, a tool is needed to generate similar training samples; a trainer can generate such situations [2], and here we used the RoboSina Trainer.

Figure 1: Stating the problem space

3.2 Learning
Given the ability of neural networks in classification and generalization, we used a two-layer feed-forward network trained with backpropagation (Figure 2) to learn the scoring policy. The scoring situation parameters (the distance and angle of the ball with respect to the goal, the position of the ball, and the position of the goal keeper) are fed to the network as input, giving 6 nodes in the input layer. The network must meet two major requirements of scoring. First, it must determine whether the ball can enter the goal, so one output node is devoted to this: if the situation is a Success, the output is 1, otherwise 0. Second, the network must identify the best class. Devoting a separate node to each class would produce a large output layer and slow the network's convergence, so to reduce the output layer size we chose a binary coding for the classes. With 29 separate classes, 5-bit codes suffice, so 5 nodes form the class part of the output layer. For example, for a best class S23 we defined the target output as 101111, in which the right-most digit indicates that the ball enters the goal and the remaining digits encode the number of the target class. Using this coding we reduced the output layer from 30 nodes to 6. For a Fail, the right-most digit is 0 and the other digits are treated as don't-care values (i.e. #####0), so the corresponding weights are not changed during learning. Since there is no exact method to determine the right number of hidden layers and nodes, finding them is an experimental task; using a heuristic search we found that a hidden layer of 5 nodes gives good convergence.
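The discretization and best-section selection described in Section 3.1 can be sketched as follows. The section width is derived from the stated 14.02 m goal width split into 29 classes, and all names are illustrative assumptions, not the actual RoboSina code.

```python
# Goal-line discretization (assumed values, derived from the text).
GOAL_WIDTH = 14.02
N_SECTIONS = 29
SECTION_LEN = GOAL_WIDTH / N_SECTIONS  # roughly 0.48 m

def section_of(y):
    """Map a goal-line y coordinate (goal centred at y = 0) to a
    section number S1..S29, or None if the ball missed the goal."""
    idx = int((y + GOAL_WIDTH / 2) // SECTION_LEN)
    return idx + 1 if 0 <= idx < N_SECTIONS else None

def best_section(successful):
    """Given the set of successful class numbers, find the largest run
    of consecutive classes (the proper group) and return its middle
    element (the best section), or None for a Fail situation."""
    runs, run = [], []
    for s in range(1, N_SECTIONS + 1):
        if s in successful:
            run.append(s)
        else:
            if run:
                runs.append(run)
            run = []
    if run:
        runs.append(run)
    if not runs:
        return None          # no successful group: a Fail situation
    proper = max(runs, key=len)
    return proper[len(proper) // 2]
```

On the worked example from the text, the groups S1-S3 and S21-S29 yield S21-S29 as the proper group and S25 as the best section.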

During the training phase, each sample was presented to the network and the weights were updated using the backpropagation algorithm. This process continued until the mean squared error fell below 0.05.
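The 6-bit target coding used during this training can be sketched as below. The helper names and the don't-care representation are assumptions for illustration; the paper only fixes the coding itself (five class bits plus a right-most success bit).

```python
# Don't-care outputs (Fail situations) contribute no weight update;
# representing them as None is an assumption for this sketch.
DONT_CARE = None

def encode_target(best_class):
    """Return the 6 target values for the output layer. `best_class`
    is the best section number (e.g. 23 for S23), or None for a Fail."""
    if best_class is None:
        return [DONT_CARE] * 5 + [0]       # the "#####0" pattern
    bits = [int(b) for b in format(best_class, '05b')]
    return bits + [1]                      # e.g. S23 -> 1 0 1 1 1 1

def decode_target(outputs, threshold=0.5):
    """Invert the coding on real-valued network outputs: return the
    predicted class number, or None if the success bit is off."""
    if outputs[-1] < threshold:
        return None
    bits = ''.join('1' if o >= threshold else '0' for o in outputs[:5])
    return int(bits, 2)
```

During learning, outputs marked `DONT_CARE` would simply be skipped when computing the error, so the corresponding weights stay unchanged, as described above.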

Figure 2: The model of the neural network

3.3 Evaluation
To generate episodes that end and restart automatically and are easy to handle, we used the Keep Away mode of the soccer server [3]. Having collected 5500 training samples, we applied them to the network. Once the network was trained, we embedded it in the agent's code as a high-level skill and placed the trained agent back in the same keepaway environment; this time, the shoot decision was made based on the trained network's output. Table 1 compares the trained shot with a computational method previously used in the RoboSina agent, which shoots the ball at the point on the goal line furthest from the goal keeper. There are 1500 testing samples per algorithm.

Algorithm        Tries    Success    Success %
Computational    1500     923        61.53
Learning         1500     1162       77.46

Table 1: Results of comparison

It can be inferred from Table 1 that the ratio of successful shots to scoring attempts is significantly better for the agent equipped with the neural-network shooting skill.

4. Conclusion
This paper outlined a technique that uses data obtained from simulated soccer games for supervised neural network learning. The benchmark used for testing the approach is the Optimal Scoring Problem. The problem was tackled by decomposing the goal line into several classes, providing a new classification method, and generating target data in a binary coding. The results show that learning with neural networks delivers a promising performance in comparison to a computational model. Moreover, the statistical analysis of goal shots provided in this paper should be of interest to anyone dealing with simulated robot soccer.

5. References
[1] J. Kok, R. de Boer, and N. Vlassis. Towards an optimal scoring policy for simulated soccer agents. In M. Gini, W. Shen, C. Torras, and H. Yuasa, editors, Proc. 7th Int. Conf. on Intelligent Autonomous Systems, pages 195-198, IOS Press, California, March 2002. Also in G. Kaminka, P. U. Lima, and R. Rojas, editors, RoboCup 2002: Robot Soccer World Cup VI, Fukuoka, Japan, pages 296-303, Springer-Verlag, 2002.
[2] M. Chen, K. Dorer, E. Foroughi, F. Heintz, Z. Huang, S. Kapetanakis, K. Kostiadis, J. Kummeneje, J. Murray, I. Noda, O. Obst, P. Riley, T. Steffens, Y. Wang, and X. Yin. Users Manual: RoboCup Soccer Server, for Soccer Server Version 7.07 and later. February 11, 2003.
[3] P. Stone and R. S. Sutton. Keepaway soccer: a machine learning testbed. In A. Birk, S. Coradeschi, and S. Tadokoro, editors, RoboCup-2001: Robot Soccer World Cup V, Springer-Verlag, Berlin, 2002.
[4] N. Kaviani and M. Rafaie-Jokandan. RoboSina from Scratch. B.Sc. Thesis, Department of Computer Engineering, Bu-Ali Sina University, Hamedan, Iran, May 2005.
[5] A. Rettinger. Learning from recorded games: a scoring policy for simulated soccer agents. In U. Visser, H.-D. Burkhard, P. Doherty, and G. Lakemeyer, editors, Proceedings of the ECAI 2004, 16th European Conference on Artificial Intelligence, Workshop 8: Agents in Dynamic and Real-Time Environments, August 2004.
