
ICROS-SICE International Joint Conference 2009 August 18-21, 2009, Fukuoka International Congress Center, Japan

Librarian Robot Controlled by Mathematical AIM Model


Masahiko Mikawa1, Masahiro Yoshikawa1, Takeshi Tsujimura2 and Kazuyo Tanaka1

1 University of Tsukuba, 1-2 Kasuga, Tsukuba-shi, Ibaraki, 305-8550 Japan (Phone: +81-29-859-1446; E-mail: {mikawa, yosikawa, ktanaka}@slis.tsukuba.ac.jp)
2 NTT Access Network Service Systems Laboratories, 1-7-1 Hanabatake, Tsukuba-shi, Ibaraki, 305-0805 Japan (E-mail: tujimura@ansl.ntt.co.jp)

Abstract: This paper presents a librarian robot that has sleep and wake functions. The robot is equipped with a laser range finder for tracking library users' behaviors, a microphone for conversation with a user, and a stereo vision system. Many processes run in parallel in this system, and their operations are controlled by our proposed mathematical Activation-Input-Modulation (AIM) model, which can express consciousness states such as wake or sleep based on stimuli detected by the external sensors. In a waking state, sensory information containing stimuli is mainly processed. In a sleep state, most processing is paused, or information stored in memories during waking is mainly processed. Moreover, the system has two kinds of memories: one is stored when external stimuli are detected, and the other is stored when no stimulus is detected. These functions allow both dynamic and gradual changes of sensory information to be stored.

Keywords: Librarian robot, Consciousness model, Memory functions, Parallel processing

1. INTRODUCTION
We are developing a librarian robot. The robot is equipped with three kinds of external sensors, a stereo vision system, a laser range finder and a microphone, and has a speech recognition function for communicating with a library user in natural language, a bibliographic information retrieval function that queries our university library database through the Internet, and a presentation function for showing the search results to a user.

Many mobile robots that work in libraries have been proposed. Mobile robots equipped with a hand can handle books on a bookshelf [1][2]. The autonomous mobile robot LUCAS [3] has a graphical interface displaying a human-like animated character and acts as a guide for library users. Our librarian robot, which is fixed in a library, has a simpler configuration and is closer to reception robots [4][5][6].

The librarian robot has the following two unique features. One is that the robot has a state model that expresses consciousness states such as wake and sleep. For example, when there is no user around the robot, there is no need for her to work, so she goes to sleep. The other is that the robot stores environmental information as memories while awake and organizes the stored memories while asleep. These functions are realized with our proposed mathematical Activation-Input-Modulation (AIM) model [7]. This model can express consciousness states such as wake and sleep, and controls the ratio between external and internal information processing, which run in parallel, based on the degrees of stimuli detected by the external sensors. In a waking state, external information obtained by the external sensors is mainly processed in real time. In a sleep state, most processing is paused, or internal information stored in memories is mainly processed. Internal information means memories stored during the waking state. Moreover, this system has two kinds of memories. One is an explicit memory for storing external information when

stimuli are detected by the external sensors; the other is an implicit memory for storing it when no stimulus is detected. These memory functions enable the system to store both dynamic and gradual changes in environmental information. Experimental results demonstrate the validity and effectiveness of our librarian robot equipped with the mathematical AIM model.

2. MATHEMATICAL AIM MODEL


2.1 AIM Model

Figure 1 shows the AIM state-space model proposed by Hobson [8]. Human consciousness states can be expressed by the levels of the following three elements: Activation controls processing power; Input switches information sources; Modulation switches between external and internal information processing modes. In a waking state, external information obtained by the external sensory organs is processed actively. In rapid eye movement (REM) sleep, a human dreams, and it is said that stored internal information is processed actively. In non-REM sleep, the input and output gates are closed and the processing power declines overall. We have designed a mathematical AIM model [7] for controlling the sensory information processing system of a robot based on this model.
[Figure 1: the AIM state space. Axes: Activation (low to high), Input (external to internal), Modulation (aminergic to cholinergic); plotted states: Wake, Relax, Non-REM sleep, REM sleep.]

Fig. 1 AIM state-space model
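As a rough illustration of how the three AIM levels can be held and interpreted in software, the following Python sketch stores the Activation, Input and Modulation levels and maps them to a coarse consciousness state. The class name, field semantics and thresholds are our own assumptions for illustration, not part of Hobson's model or the authors' implementation.

# Minimal sketch of an AIM state holder (illustrative only; names and
# thresholds below are assumptions, not the authors' implementation).
from dataclasses import dataclass

@dataclass
class AIMState:
    activation: float  # processing power, 0.0 (low) .. 1.0 (high)
    input_gate: float  # 0.0 = internal sources only .. 1.0 = external sources only
    modulation: float  # 0.0 = cholinergic (internal mode) .. 1.0 = aminergic (external mode)

    def coarse_state(self) -> str:
        """Map the three levels to a coarse consciousness state."""
        if self.activation < 0.25:
            return "non-REM sleep"   # gates closed, processing power low
        if self.input_gate < 0.5 and self.modulation < 0.5:
            return "REM sleep"       # high activation but internally driven
        return "wake"                # externally driven processing

print(AIMState(activation=0.8, input_gate=0.9, modulation=0.9).coarse_state())  # wake
print(AIMState(activation=0.7, input_gate=0.2, modulation=0.2).coarse_state())  # REM sleep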


[Figure 2: block diagram. External sensory organs feed data sampling processes, external information processes and WM/STM/LTM storing processes; the mathematical AIM model (elements S, A, I and M, with sub-signals s_ex/s_in, a_ex/a_in, i_ex/i_in and m_ex/m_in) also controls the internal data sampling and information processes, which operate on the explicit and implicit WM, STM and LTM.]

Fig. 2 Mathematical AIM model, external and internal information processing systems and memory architecture

2.2 Perceptual Information Processing System Controlled by Mathematical AIM Model

Figure 2 shows the relation among the mathematical AIM model, the external and internal information processing systems and the memory system. An external information processing system consists of an external sensory device, a data sampler, external information processors, and working- and short-term-memory storing processors. It is important for the external information processing system to process the data obtained by the sensory device in real time. An internal information processing system consists of a data sampler, an internal information processor and long-term-memory storing processors. The internal information processing system handles the following kinds of data: data that are difficult to process in real time because processing takes too long, data that need not be processed in real time, and large amounts of data, such as time-series data, that must be processed all together. The details of the memory system are described in the next subsection.

The mathematical AIM model controls the ratios between the external and internal data samplers and information processors. The element S calculates stimuli, such as changes of perceptual information, extracted from the sampled data. A decides the execution frequency of each information processor. I decides the parameters used when stimuli are calculated in S. M decides the execution frequency of each data sampler. Each element consists of two sub-elements; the subscripts ex and in mean that a sub-element relates to external and internal processing, respectively.

Figure 3 shows an example of the variation of the sub-elements with time. Each element changes depending on external stimuli. While the external stimulus s_ex is larger than a threshold th_s_ex, the sub-elements related to the external are higher than those related to the internal; this is the waking state. After s_ex falls below th_s_ex, the state shifts to non-REM sleep through the relax state. After that, a pattern alternating between REM and non-REM sleep is generated. In REM sleep, the sub-elements related to the internal are higher than those related to the external.
[Figure 3: time courses of a/i/m_ex(t) and a/i/m_in(t) (level 0 to 1) through the Wake, Relax, non-REM and REM phases, with transitions at times t_0 ... t_7 governed by the durations T_w, T_a and T_n and by whether the smoothed external stimulus sm_ex is above or below the threshold th_s.]
Fig. 3 Variations of sub-elements of A, I and M with time

The sub-element a_ex(t) is given by the following equation:

a_ex(t) =
    L_w + b,                                                  t < t_1
    ((L_w - L_a)/2) (1 + cos(2π(t - t_1)/f_w)) + L_a + b,     t_1 ≤ t < t_2
    L_a + b,                                                  t_2 ≤ t < t_4
    ((L_a - L_n)/2) (1 + cos(2π(t - t_4)/f_a)) + L_n + b,     t_4 ≤ t < t_5
    L_n + b,                                                  t_5 ≤ t < t_6
    (o_r/2) (1 - cos(2π(t - t_6)/f_r)) + L_n + b,             t_6 ≤ t          (1)

Since the other sub-elements can be expressed in the same way, their equations are omitted here. The parameters T_w, T_a, T_n, L_w, L_a, L_n, f_w, f_a, f_r, o_r and b are constants and are independent of each other. Several sleep patterns can be designed by choosing these constant parameters appropriately, depending on the application. In this paper, the sleep pattern shown in Fig. 3 was designed based on a human's normal sleep pattern by using the following values:

L_w = 0.75, L_a = 0.50, o_r = 0.25, L_n = b = 0.00   (2)

Although we chose these parameter values to imitate the normal sleep pattern in this paper, abnormal consciousness states such as hallucination or coma can also be expressed by choosing other parameters.
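For illustration, the following Python sketch evaluates a piecewise profile of this form using the constants of Eqs. (2) and (3). It is our own minimal reconstruction, under the assumption that the transition instants t_1 ... t_6 follow from the plateau durations T_w, T_a, T_n and half-cosine transitions of length f_w/2 and f_a/2; it is not code from the paper.

import math

# Constants from Eqs. (2) and (3); the derivation of t1..t6 below is our assumption.
Lw, La, Ln, o_r, b = 0.75, 0.50, 0.00, 0.25, 0.00
Tw = Ta = Tn = 5.0
fw = fa = 20.0
fr = 60.0
t1 = Tw                 # end of the waking plateau
t2 = t1 + fw / 2.0      # wake -> relax transition finished
t4 = t2 + Ta            # end of the relax plateau
t5 = t4 + fa / 2.0      # relax -> non-REM transition finished
t6 = t5 + Tn            # end of the first non-REM plateau

def a_ex(t: float) -> float:
    """Piecewise activation level of the external sub-element (cf. Eq. (1))."""
    if t < t1:
        return Lw + b
    if t < t2:
        return (Lw - La) / 2.0 * (1.0 + math.cos(2.0 * math.pi * (t - t1) / fw)) + La + b
    if t < t4:
        return La + b
    if t < t5:
        return (La - Ln) / 2.0 * (1.0 + math.cos(2.0 * math.pi * (t - t4) / fa)) + Ln + b
    if t < t6:
        return Ln + b
    # REM / non-REM alternation with period fr once the robot is asleep
    return o_r / 2.0 * (1.0 - math.cos(2.0 * math.pi * (t - t6) / fr)) + Ln + b

# The execution frequency of each external information processor is then chosen
# in proportion to a_ex(t), e.g. period = base_period / max(a_ex(t), eps).
for t in (0.0, 10.0, 20.0, 40.0, 70.0):
    print(f"t = {t:5.1f} s  a_ex = {a_ex(t):.2f}")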


The external and internal information processing systems and the memory system are controlled based on the behaviors of the sub-elements of the AIM model, as shown in Fig. 2. The execution frequencies of the external and internal information processing are determined in proportion to the levels of a_ex and a_in, respectively. For example, when a_ex becomes higher, the external information processing frequency increases, so more external information is processed. When a_in becomes higher, the internal information processing frequency increases. In the same way, the external and internal data sampling frequencies are determined by M. The element I decides the threshold th_s and the resolution rs_s with which the element S detects stimuli from the external sensors or the internal memory. Here, the resolution determines how many data in a data frame are skipped during processing. The thresholds and resolutions vary in proportion to the levels of i_ex and i_in. For example, when i_ex increases, th_s decreases and rs_s increases, which means that the system becomes more sensitive to smaller changes.

2.3 Memory System

The memory system of the human brain is complex and has not yet been completely unraveled, so we have designed a new memory system by rearranging parts of human memory functions. The memory system consists of an internal memory and memory storing processors, as shown in Fig. 2. Like a human's, the internal memory consists of the following three kinds of memories. One is a working memory (WM), which holds sensor information for a few seconds. Another is a short-term memory (STM), which temporarily stores information in which stimuli are detected by an external sensor. The third is a long-term memory (LTM), which stores important information permanently. Each memory is classified as either explicit or implicit. Since a human's explicit memory involves conscious recollection, the explicit memories in our system are used for storing dynamic changes detected as external stimuli. Since a human's implicit memory involves unconscious recall, the implicit memories in this system are used for storing gradual changes or static environmental information.

The human brain stores encoded perceptual information in memory. In our system, however, the raw signals obtained by the external sensory devices are stored in the internal memory. All information obtained by the external sensory devices is stored in the WMs in real time. Since the memory size of the WMs has an upper limit, old information is sequentially overwritten by newer information. When external stimuli are detected by the external sensory devices (s_ex > th_s_ex), all the contents of the explicit WM are transferred into the explicit STM. When no external stimulus is detected, all the contents of the implicit WM are transferred into the implicit STM periodically, at a longer interval than that of the explicit memory. Redundant or useless information is included in the STM; the contents of the

STM are therefore organized and transferred into the LTM by the LTM storing processors. When no external stimulus is detected, the mathematical AIM model starts to execute the LTM storing processors. Since the explicit STM/LTM storing processes respond to detected external stimuli, dynamic information is stored in the explicit memories. Since the implicit STM storing process stores information periodically when no external stimulus is detected, the implicit LTM storing process tries to detect changes within the implicit STM. As a result, static or gradually changing information can be stored in the implicit LTM. Some examples of information stored in the LTMs are shown in Section 4.
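The following Python sketch illustrates this WM to STM to LTM flow. It is a simplified sketch under our own assumptions; the buffer horizon, interval constants and function names are illustrative and not the authors' implementation.

from collections import deque

WM_SECONDS = 3.0            # working-memory horizon for image frames (see Sec. 3.2)
IMPLICIT_INTERVAL = 1.0     # implicit STM storing period when no stimulus is present

class MemorySystem:
    def __init__(self):
        self.explicit_wm = deque()   # (timestamp, frame) pairs, bounded by WM_SECONDS
        self.implicit_wm = deque()
        self.explicit_stm, self.implicit_stm = [], []
        self.explicit_ltm, self.implicit_ltm = [], []
        self._last_implicit_store = 0.0

    def push_frame(self, t, frame, stimulus, th_s):
        """Store a new frame into both WMs and trigger STM transfers."""
        for wm in (self.explicit_wm, self.implicit_wm):
            wm.append((t, frame))
            while wm and t - wm[0][0] > WM_SECONDS:   # overwrite old data
                wm.popleft()
        if stimulus > th_s:
            # external stimulus detected: move the whole explicit WM into the explicit STM
            self.explicit_stm.extend(self.explicit_wm)
            self.explicit_wm.clear()
        elif t - self._last_implicit_store >= IMPLICIT_INTERVAL:
            # no stimulus: periodically copy the implicit WM into the implicit STM
            self.implicit_stm.extend(self.implicit_wm)
            self._last_implicit_store = t

    def consolidate(self, changed):
        """During REM sleep: keep only STM entries in which `changed` detects a change."""
        self.explicit_ltm.extend(e for e in self.explicit_stm if changed(e))
        self.implicit_ltm.extend(e for e in self.implicit_stm if changed(e))
        self.explicit_stm.clear()
        self.implicit_stm.clear()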

3. LIBRARIAN ROBOT
3.1 System Configuration
[Figure 4: hardware configuration. The 7-DOF manipulator carries a stereo camera and an LCD monitor; the LRF, microphone and speaker connect through IEEE 1394 and USB audio interfaces to PCs running Linux (Fedora 9, Intel Core 2 Quad 2.66 GHz; Fedora Core 6, Pentium 4 3.8 GHz), Mac OS X (Core 2 Duo 2.0 GHz) and Windows XP (Pentium 4 3.2 GHz), which reach the university library system (Tulips) through a router.]

Fig. 4 Hardware configuration of librarian robot system


[Figure 5: software configuration. The AIM model and an IPC server coordinate, over TCP/IP, the vision processes (data sampler / WM storing, change detection, STM storing, LTM storing), the voice processes (data sampler / WM storing, change detection, STM storing, LTM storing, speech recognition), the LRF processes (change detection, human tracking), the dialogue engine, the finite state machine, the manipulator control, the display for library users and the interface to the university library system (Tulips).]

Fig. 5 Software configuration of librarian robot system

Figure 4 shows the hardware configuration of our librarian robot system. A manipulator (PA10-7C, Mitsubishi Heavy Industries, Ltd.) is used as the body of the librarian robot. The system has three kinds of external sensors: a laser range finder (LRF), a stereo vision system and a microphone. The LRF (URG-04LX, Hokuyo Automatic Co., Ltd., range from 60 to 4095 [mm]) is fixed in front of the robot at a height of about 1 [m]. A stereo camera (FCB-EX470L, Sony Corp.) is attached to the arm tip. Analog video signals output from the stereo camera are converted to digital video (DV) signals through media converters (ADVC-200TV, Canopus Co., Ltd.) and captured by a personal computer (PC) through an IEEE 1394 interface board. Audio signals from a microphone (SM58, Shure Inc.) are captured by the PCs through USB audio interfaces (UA-101, Roland Corp. and Sound Blaster Extigy, Creative Technology Ltd.).


Figure 5 shows the software configuration. A box with a solid line denotes a process, and a box with a dotted line denotes a thread. The state of the mathematical AIM model is decided based on the external stimuli detected by the external sensory devices, as described in Section 2.2, and controls the operations of the information processing systems and the memory systems. However, since the robot must hold a complex dialogue with a library user as a librarian, it is necessary to break the waking state into several parts. We therefore also designed a finite state machine, described in Section 3.3, consisting of states that change depending on the behaviors of library users or on the content of the conversation between a user and the robot.

3.2 Information Processing Systems Controlled by the AIM Model

Four kinds of external stimuli are detected in this system. The first is a change in the distance information measured by the LRF. The second is a change in brightness (the Y component of the color images) between two captured time-series images. The third is a change in the amplitude of the captured sound. The fourth is a change in the frequency characteristics of the sound; spectral envelopes calculated with the cepstrum method [9] are used for detecting these changes. All of these changes are detected by comparing two time-series data sets.

The explicit and implicit WM storing processes are included in the audio and video capturing threads. Image frames are held in the WMs for 3 [sec] and audio frames for 5 [sec]. The STMs related to vision are stored in DV format on a hard disk (HD), and the LTMs related to vision are converted by the LTM storing processes and stored in YUV format. The WMs, STMs and LTMs related to audio are stored in WAVE format. The implicit STM storing threads are executed constantly, at an interval of 1 [sec], in all states of the AIM model, and store the captured images on the HD. When sm_ex ≥ th_s, the explicit STM storing process transfers all the explicit WMs to the HD as the explicit STM. When the system is in REM sleep (a_in ≥ 0.375), the explicit and implicit LTM storing threads are executed and store the vision and audio frames in which changes are detected by comparing each pixel of two time-series STMs.

The following values, together with Eq. (2), were used for the constant parameters of the mathematical AIM model given by Eq. (1). Each unit is seconds, and these values determine the pace of the transitions from one state of the AIM model to another.

T_w = T_a = T_n = 5, f_w = f_a = 20, f_r = 60   (3)
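As a rough illustration of the brightness-change detection described above, the sketch below compares two time-series frames pixel by pixel, sampling with a stride (a simplified stand-in for the resolution rs_s chosen by element I) and comparing the mean difference against the threshold th_s. The function names and default values are our own, not the authors' code.

# Minimal sketch of frame-difference stimulus detection (assumed interface).
# Frames are 2-D lists of Y (brightness) values in [0, 255].

def brightness_stimulus(prev_frame, curr_frame, stride=4):
    """Mean absolute Y difference, sampling every stride-th pixel in each direction."""
    total, count = 0.0, 0
    for y in range(0, len(curr_frame), stride):
        row_prev, row_curr = prev_frame[y], curr_frame[y]
        for x in range(0, len(row_curr), stride):
            total += abs(row_curr[x] - row_prev[x])
            count += 1
    return total / max(count, 1)

def stimulus_detected(prev_frame, curr_frame, th_s=10.0, stride=4):
    """True when the brightness change between two frames exceeds the threshold th_s."""
    return brightness_stimulus(prev_frame, curr_frame, stride) > th_s

# Example with two tiny 4x4 synthetic frames
prev = [[10] * 4 for _ in range(4)]
curr = [[40] * 4 for _ in range(4)]
print(stimulus_detected(prev, curr))  # True: mean difference of 30 exceeds th_s = 10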

3.3 Finite State Machine

Our librarian robot has a finite state machine that decides the state of the robot depending on library users' behaviors or on the content of a conversation between a user and the robot. Figure 6 shows the state transition diagram of the finite state machine. Since the actual transitions are more complex, some state transitions are omitted in Fig. 6 for simplification.

When there are no people around the robot, the state changes to Empty; since the AIM model detects no external stimulus at the same time, the system goes to sleep.
[Figure 6: finite state machine states: Empty, Approaching person, Standing person, Leaving person, Conversation (Say hello, Self-introduce, Faculty topic, Weather topic, Say goodbye) and Book search (search keyword input, search results display), grouped into the Reception, Dialogue and Book search categories.]

Fig. 6 State transition diagram of finite state machine

When there are people around the robot, the external sensors detect external stimuli, so the state of the AIM model changes to waking, and the state of the finite state machine changes to Standing, Approaching or Leaving person depending on the person's behavior as measured by the LRF. When the state is Standing person and a person is standing in front of the robot, the state of the finite state machine changes to Conversation. The state then changes to a state for giving the standing person information depending on his/her question. These states are classified into three categories: Reception, Dialogue and Book search.

3.4 Functions Under Each State

3.4.1 Reception: People tracking using LRF

Before the robot begins a conversation with a library user, it is important for her to create a user-friendly mood as a receptionist. Multiple persons can be tracked at the same time using the LRF, as shown in Fig. 7. When several persons approach the robot, the robot turns her head, on which the cameras are mounted, toward the nearest person to make eye contact. When a person comes within 2 [m] of the robot, the robot says hello to him/her. If the person leaves the robot, she says goodbye.
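As an illustration of this reception behavior, the sketch below picks the nearest tracked person from the LRF tracks and decides whether to greet. The data layout, pan-angle computation and greeting logic are our own assumptions, not the authors' tracking code (which is based on the Roboceptionist software [6][13]).

import math

GREETING_DISTANCE = 2.0  # [m], from Sec. 3.4.1

def nearest_person(tracks):
    """tracks: list of (x, y) positions [m] of tracked persons relative to the robot."""
    if not tracks:
        return None
    return min(tracks, key=lambda p: math.hypot(p[0], p[1]))

def reception_step(tracks, already_greeted):
    """Return (pan angle [rad], utterance or None) for the current LRF tracks."""
    target = nearest_person(tracks)
    if target is None:
        return 0.0, None
    pan = math.atan2(target[1], target[0])          # turn the camera head toward the person
    if math.hypot(*target) <= GREETING_DISTANCE and not already_greeted:
        return pan, "Hello. Welcome to the library."
    return pan, None

print(reception_step([(1.5, 0.5), (3.0, -1.0)], already_greeted=False))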

[Figure 7: LRF scan showing tracked walkers, walls and the position of the librarian robot.]

Fig. 7 People tracking using LRF

3.4.2 Dialogue: Natural language dialogue system

Julius, an open-source large vocabulary continuous speech recognition engine [10], is used for Japanese speech recognition, and AquesTalk [11] is used as the Japanese text-to-speech engine. This means that the current librarian robot can understand and speak only Japanese. When a person stands in front of the robot, the robot speaks to him/her first: "Good morning (afternoon / evening). Welcome to the library." Then the robot waits


for the person to speak. This dialogue system can handle topics about self-introduction, weather information, the faculty, and the book stock. A person can obtain the latest weather information for major cities, collected through the Internet. It is said that many Japanese people like talking about the weather; it is a standard way to start a conversation in Japan.

3.4.3 Book search: Book search and information display

When the robot is asked to search for a book by a library user, the system asks him/her for a search keyword and retrieves books based on the keyword using the Tsukuba University Library digitized Information Public Service (Tulips) [12] through the Internet. When the search results are returned, they are displayed on the LCD monitor.
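As a rough sketch of how recognized utterances might be routed to these topics, the loop below dispatches on keywords. Here recognize(), speak() and search_tulips() are hypothetical placeholders standing in for the Julius front end, the AquesTalk output and the Tulips query; they are not actual APIs of those systems.

# Hypothetical dialogue dispatch (illustrative sketch only).

def handle_utterance(text, recognize, speak, search_tulips):
    """Route one recognized utterance to a dialogue topic."""
    if "book" in text or "search" in text:
        speak("Please tell me a keyword.")
        keyword = recognize()                # wait for the next utterance
        results = search_tulips(keyword)     # query the library catalogue
        speak(f"I found {len(results)} books. Please look at the monitor.")
        return "book_search", results
    if "weather" in text:
        speak("Today's weather information is on the monitor.")
        return "weather", None
    if "who are you" in text or "yourself" in text:
        speak("I am the librarian robot of this library.")
        return "self_introduce", None
    speak("Sorry, could you say that again?")
    return "unknown", None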

[Figure 8: (a) main window, (b) book search results.]

Fig. 8 GUI for information display

Figure 8 shows the graphical user interface (GUI). Ordinarily the main window is displayed on the screen, as shown in Fig. 8(a). When the robot is in the book search state, it changes to the window for displaying search results, as shown in Fig. 8(b). Weather information can also be displayed in this window.

4. EXPERIMENTAL RESULTS

4.1 Workflow as library service

A sample of the library services performed by our librarian robot and the operation of the AIM model are shown in Fig. 9. The movie can be viewed by clicking on Fig. 9.

[Figure 9: workflow snapshots; the user says "Hello. Please search a book." and the robot replies "Good afternoon. Welcome to our library."]

Fig. 9 Workflow as library service

When the robot finds an approaching library user with the LRF, she says hello to him/her. The robot retrieves bibliographic information from Tulips through the Internet at the user's request and shows the book search results on the LCD monitor. When the user leaves the robot, she says goodbye. Since the AIM model detects no stimulus from the robot's external sensors after the user leaves, the state of the AIM model shifts to sleep, and the robot falls asleep. The LTM storing processes then begin to work and organize the STM during REM sleep. When the robot finds a person again, the system wakes up and begins to work.

4.2 Environmental information stored in LTM

Figure 10(a) shows a part of the images stored in the explicit LTM. When a person was found by the LRF, the robot head, on which the stereo camera is mounted, turned toward the nearest person. When brightness changes were detected in the time-series images, the changes were stored in the explicit STM. When the system shifted to REM sleep, the explicit STM was transferred to the explicit LTM while redundant images were deleted. Another part of the explicit LTM is shown in Fig. 10(b). When a person was out of range of the LRF, the LRF could not detect any external stimuli, but the camera could still detect brightness changes. As a result, images of a person getting off an elevator were stored in the explicit LTM. Figure 10(c) shows a part of the images stored in the implicit LTM. Gradual changes of a sunset sky could be detected. Although it is difficult to detect such gradual changes with the explicit memories, which are processed in real time, it is easy to detect them using the implicit memories. These results also indicate that the consciousness states of our robot and its perceptual system controlled by the AIM model work well.

5. CONCLUSION

We proposed a librarian robot that can communicate with a user in natural language and search for books at his/her request. The most distinctive feature of the robot is that, controlled by the mathematical AIM model, it can sleep. The advantage of this system is that computing resources are used effectively, because the required processes are executed only when needed. The validity and effectiveness of our proposed system were confirmed by the experimental results. We confirmed the basic functions of the robot as a librarian, and its ability to collect environmental information in response to external stimuli detected by the external sensors through the operation of our proposed AIM model. However, the current librarian robot has a poor stock of conversation topics. Moreover, since the book search results are only displayed on the LCD monitor, the interface is not very user-friendly. In future work, we will enrich the dialogue with attractive and useful content, and add easily understandable techniques for showing the location of a book to a user by means of gestures of the robot's body.


[Figure 10 panels with capture timestamps:
(a) Dynamic changes stored in explicit memory (in range of LRF): 05/June/2009, 12:26:14.940; 12:26:17.828; 12:26:19.535.
(b) Dynamic changes stored in explicit memory (out of range of LRF): 05/June/2009, 17:56:44.384; 17:56:46.119; 17:56:47.821.
(c) Gradual changes stored in implicit memory: 05/June/2009, 17:33:31.000; 18:25:11.000; 19:04:15.000.]

Fig. 10 Images stored in explicit and implicit memories

ACKNOWLEDGMENTS
This research is partially supported by Project No. 21500185, Grant-in-Aid for Scientific Research (C), Japan Society for the Promotion of Science. We would like to express our gratitude to Research Professor Reid Simmons of the Robotics Institute, Carnegie Mellon University, USA, who provided the IPC library [13] and the people tracking software using an LRF that are used in the Roboceptionist system [6].

REFERENCES
[1] Jackrit Suthakorn, Sangyoon Lee, Yu Zhou, Rory Thomas, Sayeed Choudhury, and Gregory S. Chirikjian, "A robotic library system for an off-site shelving facility," in Proceedings of the 2002 IEEE International Conference on Robotics and Automation, 2002, pp. 3589-3594.
[2] Tetsuo Tomizawa, Akihisa Ohya, and Shinichi Yuta, "Book extraction for remote book browsing robot," Journal of Robotics and Mechatronics, vol. 16, no. 3, pp. 264-270, 2004.
[3] J. Derek T. O'Keeffe, "The development of an autonomous service robot. Implementation: Lucas the library assistant robot," Intelligent Service Robotics, vol. 1, no. 1, pp. 73-89, 2008.
[4] Naoto Kawauchi, Yoshihiro Koketsu, Tadashi Nagashima, Ken Onishi, and Ryota Hiura, "Home-use robot wakamaru" (in Japanese), Mitsubishi Juko Giho, vol. 40, no. 5, pp. 270-273, 2003.
[5] Takuya Hashimoto, Masaru Senda, Taichi Shiiba, and Hiroshi Kobayashi, "Development of the interactive receptionist system by the face robot," in SICE Annual Conference 2004 in Sapporo: Proceedings, 2004, pp. 1404-1408.
[6] Rachel Gockley, Reid Simmons, and Jodi Forlizzi, "Modeling affect in socially interactive robots," in The 15th IEEE International Symposium on Robot and Human Interactive Communication (ROMAN06), 2006, pp. 558-563.
[7] Masahiko Mikawa, Masahiro Yoshikawa, Takeshi Tsujimura, and Kazuyo Tanaka, "Intelligent perceptual information parallel processing system controlled by mathematical AIM model," in Proceedings of the 2007 7th IEEE-RAS International Conference on Humanoid Robots, 2007, pp. 389-403.
[8] J. Allan Hobson, The Dream Drugstore. The MIT Press, 2001.
[9] Douglas O'Shaughnessy, Speech Communications: Human and Machine, 2nd Edition. Wiley-IEEE Press, 1999, ch. 6, pp. 173-227.
[10] Akinobu Lee, Tatsuya Kawahara, and Kiyohiro Shikano, "Julius, an open source real-time large vocabulary recognition engine," in Proceedings of the 7th European Conference on Speech Communication and Technology (EUROSPEECH 2001), 2001, pp. 1691-1694.
[11] http://www.a-quest.com/aquestalk/index.html (in Japanese).
[12] https://www.tulips.tsukuba.ac.jp/portal/.
[13] http://www.cs.cmu.edu/ipc/.
