
Applied Artificial Intelligence, 19:413-431

Copyright © 2005 Taylor & Francis Inc.

ISSN: 0883-9514 print/1087-6545 online
DOI: 10.1080/08839510590910237



Martin Klesen
German Research Center for Artificial Intelligence,
Saarbrücken, Germany

Role-plays with educational agents make it possible to embed information and educational goals into a
narrative context. These agents not only have to come across as believable and lifelike, they also
have to embody the pedagogical principles and educational goals of the application domain. In this
paper, we argue that theatrical concepts, namely concepts derived from improvisational theater and
meta-theater, can guide the design of educational agents. We illustrate this by presenting two
systems: Puppet provides young children with improvisational agents to promote early learning
through educational role-plays, and CrossTalk uses interactive performances with virtual actors
to demonstrate the basic principles that govern automatic dialogue generation. For both systems,
we show how the theatrical concepts support a systematic approach to character design, behavior
modeling, and user interaction and how they help to increase the believability and lifelikeness of
our characters in an educational context.


Animated agents are now used in a wide range of application areas, including virtual training
environments (Swartout et al. 2001), interactive fiction (Mateas and Stern 2003), and storytelling
systems (Ryokai et al. 2002), as well as in interactive installations where computer agents
play the role of product presenters and sales assistants (Klesen et al. 2003). Over the last few
years, we have conducted a number of R&D projects at DFKI to explore the potential of animated
agents as actors in role-plays (André and Rist 2000). Role-plays with synthetic characters make it
possible to embed information and educational goals into a narrative context, to present
information from different points of view, and to convey social aspects such as interpersonal
relationships between characters (Prendinger and Ishizuka 2001; Rist and Schmitt 2002).

The Esprit, LTR, i3-ESE project Puppet was funded by the European Community under grant
number EP-29335. The project partners were: the University of Aalborg, Denmark (Laboratory of
Computer Vision and Media Technology), the University of Aarhus, Denmark (Department of
Dramaturgy), the German Research Center for Artificial Intelligence GmbH, and the University of
Sussex, UK (School of Cognitive and Computing Sciences). CrossTalk has been built upon
contributions from the MIAU project funded by the German Ministry for Education and Research and
from the EU-funded IST projects NECA, SAFIRA, and MAGICSTER. CrossTalk is a joint effort with
contributions from Patrick Gebhard, Michael Kipp, Thomas Rist, Stephan Baldes, Markus Schmitt,
and Thomas Schleiff. We would also like to thank our graphics designer, Peter Rist, for providing
us with the virtual actors Cyberella, Tina, and Ritchie.
Address correspondence to Martin Klesen, German Research Center for Artificial Intelligence,
DFKI GmbH, Stuhlsatzenhausweg 3, D-66123, Saarbrücken, Germany. E-mail:

In order to be successful in their respective application domain, these
agents have to meet certain behavioral objectives. Animated agents should
appear lifelike and believable, e.g., they should respond to changes in their
environment instantly and appropriately without mechanical repetition.
Their behavior should also be intelligible, i.e., the observer should be able
to read the agent's goals and intentions and to interpret its behavior
accordingly. The challenge when building educational agents lies in the fact
that they not only have to come across as believable and lifelike, but
also have to embody the pedagogical principles and educational goals of the
application domain. Educational agents therefore also have to meet certain
structural objectives. The interaction between agents and the user should
follow a dramaturgical development or author-defined narrative structure
in order to achieve the pedagogical goals and to create a satisfying interaction experience for the user. The question is: How can we systematically
design synthetic agents so that their behavior meets our behavioral objectives and how can we systematically structure the interaction between synthetic agents and the user so that it meets our structural objectives?
Research in the agent community that has focused on creating lifelike
and believable agents has used models and principles from animation
design, biology, and psychology to endow synthetic characters with expressive bodies, physical needs, motivational drives, and computational models
of personality and emotions (Cassell et al. 2000; Kline and Blumberg 1999;
Trappl and Petta 1997). These concepts tend to control the behavior of an
agent on a local level, e.g., by generating appropriate actions and responses
on a moment-by-moment basis. To model and control the behavior of interactive animated agents on a more global level, some researchers in the last
few years have looked at the theater as a source of inspiration (Murray
2000). Using the "computers as theater" metaphor (Laurel 1993), drama
is not seen primarily as the performance of stories in front of an audience,
but as a general means of structuring the interaction between characters.
Dramaturgy provides concepts and methods for characterizing an actor's
role (beliefs, desires, intentions, etc.) and for structuring the interaction
between actors to convey the play's dramaturgical structure (conflicts,
transformations, status changes, etc.).
In the Puppet and CrossTalk projects, we have adopted this dramaturgical view. Instead of synthetic individuals, we want to create synthetic actors,
and instead of using motivational drives (fatigue, boredom, hunger, etc.),
we use theatrical concepts like status and attitude, role and meta-role as

Concepts for Role-Plays


behavior determinants. In this paper, we argue that theatrical concepts,
namely, concepts derived from improvisational theater and meta-theater,
provide methods for creating and directing synthetic actors, thereby supporting a systematic approach to character design, behavior modeling,
and user interaction.
In this section, we introduce the theatrical concepts that guided the
design of our educational agents in the Puppet and CrossTalk projects.
Our choice of concepts was motivated by the different application scenarios. Puppet aims at extending current forms of early learning through
educational role-plays (Scaife et al. 1999). The challenge was to create situations that would allow children to interact constantly and to build significance cumulatively. Improvisational theater seems to be well-suited for
this kind of scenario by providing a theoretical framework for character
interaction based on improvisational rules.
In CrossTalk, we use interactive performances with virtual actors to demonstrate the basic principles that govern automatic dialogue generation. We
wanted to give our actors the possibility to react to user feedback during a performance and to talk to each other when off-duty. Meta-theater was both a
powerful metaphor and a valuable source of inspiration as it allowed us to
address these problems by creating roles and meta-roles for each of our agents.
Improvisational Theater
Theater is usually seen as a highly structured event that has been
rehearsed many times, and can be repeated. Improvisational theater
(Johnstone 1981; Spolin 1999) is different because there is no predefined
narrative structure (no script) and no one really knows what might happen
next. To improvise means to create stories through interaction. Theatrical
improvisation leaves a lot of decisions to the performers, i.e., their behavior
is constrained but not completely specified by the instructions given to them
beforehand. Improvisation therefore relies on "emergent narrative [...] in
which explicit narrative structure is absent but narrative frequently emerges
through character interaction" (Aylett 1999). An improvised story is most
often uninteresting as a plotline in its raw form, but it can have all the elements of a good story, e.g., a conflict, a love scene, or a change in status.
The art of improvisational theater is the art of framing: creating interesting
scenarios for the improvisers to be and act in. The improvisational frame must
provide an adequate amount of information. It is important that the participants do not get lost; each of them needs clues as to what is happening. The improvisational frame establishes a shared interest model upon which the actors
can base their decisions. To be able to improvise, one should know whom one is
playing (character, role), where the events take place (setting, props),
what happens (task, status), and when (time, duration). Improvisational theater is often based on rules defining a set of dramaturgical constraints. Improvisational rules can be either explicit, giving a clear description of the
improvisational task, or implicit, meaning that the task has to be discovered during the
improvisation. They can be shared, i.e., all participants get the same information, or discrete, i.e., some improvisers get exclusive information. Improvisational rules are often used to create conflict-oriented dramaturgical structures
by provoking interesting clashes of opposed wills.
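
The notions of an improvisational frame and of explicit/implicit and shared/discrete rules can be sketched as a small data structure. The following is a minimal illustration only: the class names (`ImprovFrame`, `ImprovRule`, `Visibility`), the `briefing` helper, and the example rules are assumptions made for this sketch and are not taken from any system described in this paper.

```python
from dataclasses import dataclass, field
from enum import Enum


class Visibility(Enum):
    SHARED = "shared"      # every participant receives the rule
    DISCRETE = "discrete"  # only selected participants receive it


@dataclass(frozen=True)
class ImprovRule:
    description: str
    explicit: bool                       # True: task is spelled out; False: must be discovered
    visibility: Visibility
    recipients: frozenset = frozenset()  # only used for discrete rules

    def visible_to(self, actor: str) -> bool:
        return self.visibility is Visibility.SHARED or actor in self.recipients


@dataclass
class ImprovFrame:
    who: dict    # actor name -> role
    where: str   # setting and props
    what: str    # task
    when: str    # time or duration
    rules: list = field(default_factory=list)

    def briefing(self, actor: str) -> list:
        """Rule descriptions this actor is told beforehand (explicit rules only)."""
        return [r.description for r in self.rules
                if r.explicit and r.visible_to(actor)]


# Hypothetical frame; the rule texts are invented for illustration.
frame = ImprovFrame(
    who={"farmer": "keeper of order", "cow": "freedom-loving animal"},
    where="farmyard",
    what="conflict over leaving the pen",
    when="open-ended",
    rules=[
        ImprovRule("Stay inside the farmyard.", True, Visibility.SHARED),
        ImprovRule("Bring the cow back to the pen.", True, Visibility.DISCRETE,
                   frozenset({"farmer"})),
        ImprovRule("Find out what the other character wants.", False,
                   Visibility.SHARED),
    ],
)

print(frame.briefing("farmer"))
print(frame.briefing("cow"))
```

In this sketch an implicit rule is part of the frame but never appears in a briefing, which mirrors the idea that the task has to be discovered during the improvisation.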
Another set of concepts used in improvisational theater is the status
and the attitude of a character. Status refers to a character's intrinsic way
of behaving both toward others and toward the surrounding space. It determines how dominantly or submissively someone behaves in an interaction.
Johnstone teaches actors to play high status by erect posture, making
gestures of authority, and using or even abusing objects (Johnstone
1981). Status often parallels social status but is not identical to it ("be low but play
high"). It can be exploited for dramatic effect, e.g., through status changes
within a scene, and it can be used to determine the outcome of a conflict in
an improvisational scenario as shown in the next section. Attitude, on the
other hand, is a rather broad concept. Literally, it means "the position or
manner of standing of the body" and "a manner of feeling and behaving."
We focus on the second meaning. In improvisational theater, attitude is
used as a practical device and as a means of reducing complexity. It helps
the actors in an improvisation if they know with which attitude to pursue
their goals and to interact with other participants. Attitude can therefore
be viewed as a device for choosing between different strategies when pursuing some goal. Defining both status and attitude for each participant allows
a dramaturge to frame a situation and to create a specific constellation
between characters without having to give detailed instructions.
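
As a rough sketch of how these two parameters could act as behavior determinants, the code below lets attitude select a strategy and lets status decide the outcome of a conflict. The `Actor` class, the `conflict_winner` function, the tie handling, and the strategy labels are all assumptions made for this illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    LOW = 0
    HIGH = 1


class Attitude(Enum):
    NEGATIVE = "negative"
    POSITIVE = "positive"


@dataclass
class Actor:
    name: str
    status: Status
    attitude: Attitude
    strategies: dict  # Attitude -> strategy label (illustrative only)

    def choose_strategy(self) -> str:
        # Attitude acts as a device for choosing between strategies.
        return self.strategies[self.attitude]


def conflict_winner(a: Actor, b: Actor):
    """Higher status wins; equal status leaves the conflict open (None).

    The tie handling is an assumption of this sketch.
    """
    if a.status is b.status:
        return None
    return a if a.status.value > b.status.value else b


master = Actor("master", Status.HIGH, Attitude.NEGATIVE,
               {Attitude.POSITIVE: "persuade", Attitude.NEGATIVE: "command"})
servant = Actor("servant", Status.LOW, Attitude.POSITIVE,
                {Attitude.POSITIVE: "comply", Attitude.NEGATIVE: "grumble"})

print(master.choose_strategy(), servant.choose_strategy())
print(conflict_winner(master, servant).name)
```

Setting only two values per actor is enough to produce a distinct constellation between characters, which is the point of framing a situation without giving detailed instructions.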
Meta-Theater
In conventional theater, the actors on stage are separated from the
audience by an invisible "fourth wall." An actor's role confines its existence
to the fictional world of the play, i.e., the portrayed character is not aware of
the theatrical performance and the audience. Meta-theater is a variety of
conventional theater that recognizes and exploits its own fictionality. It creates a complex mixture of illusion and reality by providing actors with roles
and meta-roles. In meta-theater, actors may suddenly interrupt their performance and start arguing with their fellow actors about the author's
intentions or address the audience by giving explanations about the play's
or the characters' background. Actors stepping out of their roles convey a
feeling of authenticity, temporarily firing the attention of the audience by
making them believe that the actor is speaking by and for himself, although
this part is just as preconceived as the rest of the play, the actor just having
switched from role to meta-role. Actors can use their meta-role to address
the audience directly, something which is usually not possible when in
character. Brecht, for example, used this "distancing effect" in his epic theater to let the audience reflect on the play's development and alternative
actions of the characters, instead of trying to achieve immersion and
empathy. Later, we show how to use the meta-roles to counteract user
expectations and to enhance the illusion of life.
The first project in which we applied theatrical concepts was "The Educational Puppet Theater of Virtual Worlds" (Puppet for short), which
started in October 1998 and ended in January 2002. This ESPRIT Long
Term Research project was funded by the European Commission within
the i3 initiative (intelligent information interfaces) on Experimental
School Environments. Focusing on early learning, typically in the age
range of 4 to 8, i3-ESE projects looked at the development of a range of
skills, such as creativity, imagination, self-expression, sharing, teamwork,
and learning to learn.
Puppet belongs to the projects that put their emphasis on particular
aspects of early learning, namely learning through storytelling and drama.
It is based on a theoretical framework of learning through externalization.
This theory of external cognition (Scaife and Rogers 1996) focuses on the role
of externalization in cognitive development: the process of articulating (making explicit) and acting out concepts. The main objective in Puppet was therefore to develop a Virtual Puppet Theater (VPT) for open-ended story worlds
that would extend current forms of early learning through improvisational play (Scaife et al. 1999). The rationale for using a theater setting
to promote early learning was that dramatic symbolic play is an attractive
and educationally appropriate activity for 4- to 8-year-old children.
The VPT tries to achieve its objectives by offering the possibility of real-time interaction between a user-controlled avatar and a group of synthetic
characters in a 3D virtual world. The child can adopt the role of the audience, the actor, the scriptwriter, or the editor. As an observer, the child can watch
the activities in the 3D virtual environment from any point of view. As an
actor, the child is an active participant in the improvisational scenarios
and can explore cause and effect through avatar actions and agent reactions. As a scriptwriter, the child can partially script the verbal behavior of all
characters in a scene by recording spoken dialogue, and as an editor, the child
can re-record his or her own lines offline. After the editing phase, the child can
re-enter the virtual world either as actor or as audience to hear the re-recorded sounds in context (Puppet 2002). In this section, we will focus
on the child as co-actor in an improvisational scenario.
Improvisation in Puppet
We have stated that Puppet aims to extend current forms of early learning
through educational role-plays. The challenge was thus to create situations
that would allow children to interact constantly with the synthetic actors in
the 3D virtual world and to build significance cumulatively during this interaction. Improvisational theater seems to be well-suited for this kind of scenario
by providing a theoretical framework for character interaction based on
improvisational rules. When talking about embodied, animated agents used
as improvisational actors, we share the view of Barbara Hayes-Roth: "[The
agent] would accept directions from one or more exogenous sources, either
in real-time or in advance of a performance. The directions would constrain,
but not completely specify its behavior, for example assign the agent a role,
endow it with personality features, change its mood, or instruct it to perform
a kind of behavior" (Hayes-Roth and van Gent 1997). In our case, these directions are mainly given by the improvisational frame. The improvisational
frame defines the basic characteristics of a character, e.g., its role, task, and
status. We view the improvisational frame as a collection of contextual constraints for a character's behavior, i.e., as a good means of restricting choices
while preserving the agent's autonomy.
The scenario we have chosen for the VPT is a virtual farmyard inhabited
by two autonomous agents, a freedom-loving cow that wants to become
more humanlike and a farmer that wants to maintain order. A third character, the sheep, serves as the child's avatar. To ensure that there is a reasonable amount of interaction between the virtual characters even when the
child is passive, we decided to investigate a conflict-oriented play structure.
This is achieved by providing farmer and cow with conflicting goals. The
cow continually attempts to escape the confines of its pen, heading either
toward the farmer's old gramophone to listen and dance to the music or
toward a bookshelf to recite some poems. The farmer, on the other hand,
is trying to recapture the cow when he detects that it has escaped. Such a
scenario is less likely to stagnate as each move by one agent will provoke
a counter move by the other agent. Based on these conflicting goals, a narrative cycle emerges, intending to capture the attention of the children and
get them to reflect upon the dissonance.
The behavior of the farmer and cow is determined by their current
status and attitude. A character's status can be high or low and determines
the outcome of the conflict. A character's attitude can be positive or negative and expresses a manner of feeling and behaving towards the other
agent. Both parameters are dynamic and change during the course of interaction as explained in "Behavior Modeling in Puppet."
Character Design in Puppet
The main objective for the character design in Puppet was that the
child must be able to read the characters' behavior, i.e., their goals, their
status, and their attitude must be clearly expressed. Studies have shown that
there may be substantial differences between children and adults in the
ways that they perceive information and the visual cues that they rely on
when assigning meaning to a character's behavior (George and McIllhagga
2000). Virtual characters are primarily identified and categorized based on
their shape, i.e., how similar their shape is to that of their real counterparts, quite
independent of the level of realism (e.g., making a cow wear glasses or
having a red sheep seems to be fun). Another important factor is a
character's proportions. Young children find characters with childlike
proportions more attractive, i.e., characters that have an exaggerated size
of head (compared to adult proportions) are easier to process for children
because the primary visual cues (e.g., gaze and eyebrows) are easier to recognize. This is in line with another finding, namely, that children seem
to rely primarily on facial expressions when trying to infer a character's
intentions and emotions (Reichenbach and Masters 1983). We therefore
decided to use 3D cartoon-style characters and to exaggerate the postures,
gestures, and facial expressions. Since the status and the attitude play
a dominant role in our scenario, they have to be clearly noticeable. Figure 1
shows the four different combinations of status and attitude for the cow.
Status is mainly expressed through a character's posture, as this is a strong
visual cue and can be displayed and recognized easily even on fairly simple cartoonlike characters. Attitude is mainly expressed by facial expression, e.g., the
cow's face changing color to a light or deep red in negative attitude. Status
and attitude are expressed not only through posture and facial expression but also through gait
and gibberish talk. We use different walking animations and idle time movements for each combination of status and attitude and some character-specific
animations (e.g., cow scraping hoof, farmer luring with right hand) to convey
their feelings and intentions. The sounds used by each agent have been
recorded by drama students based on available animations (e.g., farmer herding with little stick) and on verbal cues (e.g., cow in good mood on its way to
one of its favorite places).

FIGURE 1 Cow in different status and attitude.

Behavior Modeling in Puppet
We have already stated that the farmer and the cow have conflicting
goals in our improvisational scenario. They also have different strategies
to achieve their respective goals. The farmer can try to lure the cow back
into the pen or he can try herding, e.g., by threatening it with a stick.
The cow, on the other hand, can ignore the farmer, try to avoid him, or
confront him. The choice of strategy depends on the character's attitude. To create variations and build tension, we have implemented a set of
improvisational rules that change these parameters during the course of
interaction. These rules take as input parameters the number of successes
(e.g., cow reaching one of its favorite places or farmer restoring order) and
the status and attitude of each character. We further defined an encounter as
a single interaction between farmer and cow. The minimum requirement
for an encounter is an action, e.g., the farmer shouts, and a reaction,
e.g., the cow stops and turns around. Encounters are used to increase variation when the agents meet several times within the same status and attitude. One improvisational rule, for example, states that if the cow in high
status with negative attitude (H-) has been approached once (encounter
> 1) by the farmer in high status with negative attitude (H-), it will lower
its status if approached a second time.
Each rule also has an encounter-based action script associated with it that
gets executed when the rule is applied. These action scripts are usually
combinations of facial, bodily, and vocal expressions and define an agent's
behavior during the encounter.
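
The rule described above, together with its action script, might be sketched as follows. This is an illustration under stated assumptions, not the rule format used in Puppet: the `Agent` and `Scene` classes, the encoding of status and attitude as short strings, the threshold (the rule fires once the same constellation has already been met at least once), and the contents of the action scripts are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    status: str    # "H" (high) or "L" (low)
    attitude: str  # "+" (positive) or "-" (negative)

    @property
    def state(self) -> str:
        return self.status + self.attitude


@dataclass
class Scene:
    cow: Agent
    farmer: Agent
    # (cow state, farmer state) -> number of encounters in that constellation
    encounters: dict = field(default_factory=dict)

    def approach(self) -> list:
        """The farmer approaches the cow once; returns the action script played."""
        key = (self.cow.state, self.farmer.state)
        seen = self.encounters.get(key, 0)
        self.encounters[key] = seen + 1
        # Assumed rule: a cow in H- that was already approached by a farmer
        # in H- lowers its status on the next approach.
        if key == ("H-", "H-") and seen >= 1:
            self.cow.status = "L"
            return ["cow: lower head", "cow: step back", "cow: subdued sound"]
        # Minimal encounter from the text: an action and a reaction.
        return ["farmer: shout", "cow: stop and turn around"]


scene = Scene(cow=Agent("cow", "H", "-"), farmer=Agent("farmer", "H", "-"))
first = scene.approach()
second = scene.approach()
print(first, second, scene.cow.state)
```

Counting encounters per constellation, rather than globally, is one way to obtain variation when the agents meet several times within the same status and attitude.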
We have used these rules to divide the scenario into two improvisational
cycles. In cycle one, the farmer will eventually succeed in bringing the
cow back to the pen (raising his status, cow dropping in status), and in
cycle two, the cow will finally reach the gramophone or the bookshelf.
Transitions between these two cycles occur when one of the agents
achieves its goal. Within each cycle, the status and attitude change in a
predefined way to reflect the dramaturgical development that leads to a
climax when both agents confront each other in high status and with
negative attitude: The farmer with a red face wields a big club and yells
at the cow, which in turn scrapes its hoof and moves forward, thereby trying
to intimidate the farmer. In the first improvisational cycle, the cow finally
accepts defeat, lowers its status and follows the farmer back to the pen.
This marks the beginning of the second cycle in which the cow sneaks
out and tries to reach the gramophone or the bookshelf. The escape is
eventually detected by the farmer and the cow follows him reluctantly back
to the pen, since he is still in high status. After having been defeated several times, the cow will finally raise its status and attack the farmer. This
time, however, the farmer gives in and the cycle ends with the farmer in
low status with positive attitude and the cow in high status with positive
attitude. Each improvisational cycle takes approximately five minutes to
complete, i.e., after approximately ten minutes the whole story will be
repeated due to its cyclic quality.
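
The alternation between the two improvisational cycles can be pictured as a two-state loop. The sketch below is a simplification under stated assumptions: the function names are invented, a cycle is assumed to end only when the agent who prevails in that cycle achieves its goal, and the five-minute duration is treated as exact although the text gives it only as an approximation.

```python
CYCLE_MINUTES = 5  # approximate duration per cycle, taken as exact here

# Which agent's success ends each cycle (cycle one: farmer, cycle two: cow).
GOAL_ACHIEVER = {1: "farmer", 2: "cow"}


def next_cycle(cycle: int, achiever=None) -> int:
    """Switch cycles when the agent that prevails in this cycle reaches its goal."""
    if achiever == GOAL_ACHIEVER[cycle]:
        return 2 if cycle == 1 else 1
    return cycle


def cycle_at(minutes: float) -> int:
    """Cycle that is running after the given playing time, ignoring user input."""
    return int(minutes // CYCLE_MINUTES) % 2 + 1


print(next_cycle(1, "farmer"), next_cycle(2, "cow"))
print([cycle_at(m) for m in (0, 4, 6, 11)])
```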
User Interaction in Puppet
The child can adopt the role of a co-actor at any point in time during
the two improvisational cycles by selecting the sheep as avatar. The child
controls the sheep's movements in the 3D virtual world with an ordinary
computer mouse. In addition, a concept keyboard is used to select the avatar's viewpoint, mood, and actions. Figure 2 shows the laminated overlay
that we use for our target user group of 4- to 8-year-old children and that
is placed on top of the touch-sensitive area of the concept keyboard.
The top row allows the child to select the avatar's viewpoint, the icons in
the middle change the avatar's mood, and the two icons at the bottom
select an action. There are two types of actions: the avatar can make a
sound (left icon), or it can make a sound accompanied by an
animation (right icon). Which sound and animation are played depends
on the avatar's mood, which can be positive, neutral, or negative (middle
row from left to right). To give the child a strong visual cue of which
mood it has selected, the sheep avatar changes its default idle and walking
animations as well as its appearance (textures) for face, ears, and body.
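
The mapping from concept keyboard icons to avatar behavior could look roughly like the following sketch. The row layout follows the description above; everything else is assumed for illustration, including the `Avatar` class, the asset names, the zero-based icon indices, and the neutral default mood.

```python
MOODS = ("positive", "neutral", "negative")  # middle row, left to right

# Hypothetical asset names, one sound and one animation per mood.
SOUNDS = {m: f"sheep_{m}.wav" for m in MOODS}
ANIMATIONS = {"positive": "hop", "neutral": "nod", "negative": "stomp"}


class Avatar:
    def __init__(self):
        self.mood = "neutral"  # assumed default
        self.viewpoint = 0

    def press(self, row: str, index: int):
        """Handle one icon press; returns (sound, animation) for action icons."""
        if row == "top":        # viewpoint selection
            self.viewpoint = index
            return None
        if row == "middle":     # mood selection
            self.mood = MOODS[index]
            return None
        if row == "bottom":     # index 0: sound only, index 1: sound + animation
            animation = ANIMATIONS[self.mood] if index == 1 else None
            return SOUNDS[self.mood], animation
        raise ValueError(f"unknown row: {row}")


avatar = Avatar()
avatar.press("middle", 2)        # select the negative mood
print(avatar.press("bottom", 0))
print(avatar.press("bottom", 1))
```

Because both action icons consult the current mood, the same two buttons yield six distinct behaviors, which keeps the overlay small enough for young children.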



FIGURE 2 Concept keyboard to select viewpoint, mood, and actions.

If the child initiates an avatar action by pressing an action icon on the
concept keyboard, the farmer and the cow will respond by showing character-specific reactions that depend on their current status and attitude but
also on the avatar's mood, the type of action, and the avatar's current position. The agents will engage in a prolonged interaction (e.g., following
or avoiding the avatar), which lasts until the avatar has been inactive for
some time, turns away, or changes mood. These prolonged interactions
have a designed dramaturgical development, which ranges from less intense
reactions (e.g., making a sound or an angry face) to dramatic actions (e.g.,
turning around and fleeing from the avatar). Additionally, improvisational
rules like the one described earlier will change the agents' status and attitude, e.g., if the cow has been intimidated by the avatar and is fleeing, it will
do this in low status and negative attitude.
The child can therefore influence the improvisational scenario by interfering with the agents' goals (e.g., blocking their way) or by changing their
status and attitude during the prolonged interactions. For example, the
avatar can distract the farmer to help the cow reach one of its favorite
places, or it can lure the cow back into the pen, thus helping the farmer
to succeed with his goal. If the avatar is not doing anything, the agents will
resume their previous activities and their endless struggle continues.
External Evaluation
During its development, Puppet was repeatedly evaluated by a team of
psychologists from the School of Cognitive and Computing Sciences
(COGS) at the University of Sussex. In this section, we present some of the
results from the final evaluation of the Puppet system. A more detailed
description is given in Marshall et al. (2002).
The evaluation aimed to investigate young children's ability to understand the behaviors and motives of the characters, to determine how well
they were able to understand and engage in the interaction between
farmer and cow, and to see how recording spoken dialogue for each
character would stir reflection about the characters' activities. The evaluation was conducted in a local school in Sussex. Children played in pairs
with the system for between 15 and 30 minutes, and each pair of children
had four sessions with the system over the course of three weeks. During all
sessions, an experimenter was present who guided the children's interactions and encouraged them to talk about the system throughout. All sessions were videotaped for later transcription and analysis. Each child
took on, in turn, the role of the audience, the actor, the scriptwriter,
and the editor.
The evaluation of the Puppet system showed that children were able
to develop a quite sophisticated level of understanding of the drama
and to reflect upon this. In particular, they could understand the underlying improvisational cycles in terms of the conflicting goals. Children also
seemed capable of understanding the attitude of agents, while having
some difficulties recognizing and interpreting status reversals and associated posture shifts correctly. This may have been because attitude maps
well onto the basic emotions of happiness and anger, whereas status is a
more sophisticated concept. The children's interactions with the agents
through the avatar were mostly physical: they tried to run into the agents using the mouse.
Only a few of them used the concept keyboard to change the avatar's
mood and initiate avatar actions. Therefore, the children's interactions
with the agents were rarely as prolonged as we intended. The recording
session, on the other hand, in which they could record their own lines
for each character, was quite successful, allowing the children to step
back from the acting and to reflect about the feelings and motives of
the characters.
Based on their findings, the team at COGS suggested that direct physical interactions might be one way to lead children into more elaborate
interactions with characters and that children should be explicitly given a
goal by the experimenter (explicit framing) to sustain their interest and
to promote interaction in a virtual environment of this sort. Alternatively,
the goal should be more easily discovered by providing appropriate cues
(implicit framing). It also became clear that the character design plays a
crucial role for children of this age group, as approximately 50% of the
comments were about visual appearance and far fewer about movements,
sounds, and (inter-)actions.




CrossTalk is an infotainment installation that needs no personnel: it is
self-explanatory and runs in an endless loop, which makes it ideal for information presentation in public spaces, such as a trade fair or a product information kiosk in a department store. CrossTalk provides visitors with a
spatially extended interaction experience by offering two virtual spaces
on separate screens, one displaying Cyberella, the installation's hostess,
and the other displaying Tina and Ritchie, two virtual actors. Together with
the visitor console up front, it creates an interaction triangle (Figure 3).
Conversations between virtual characters that cross the limits of physical
screens inspired the name "CrossTalk." The system has been conceived
as a generic exhibition framework to stage technology in the field of
embodied characters by embedding an exhibit seamlessly in an overall
story/scenario. CrossTalk's current exhibit is CarSales, a program that automatically generates car sales dialogues according to the personal interests
of a user. However, the dialogues are not just a simple enumeration of facts.
Instead, information about the car is presented in tactical question-answer
games where the customer asks questions addressing issues of user interest.
The underlying design principles and a full system description can be
found in Klesen et al. (2003).
CrossTalk has two main objectives. It should attract visitors, and it
should inform them about the exhibit, sustaining their interest for an
extended period of time. Both aspects are crucial for an infotainment system, in general, and for the design of educational agents, in particular.

FIGURE 3 Main components and spatial layout of the CrossTalk installation.

In CrossTalk, we address these aspects by making a distinction between two
modes of operation: ON mode and OFF mode. When no user is present,
the system is in OFF mode. In this mode, the task for the agents on the background screens is to attract passersby. As soon as a visitor enters the installation by pressing the start button on the frontal touch screen, the
application switches to ON mode. In this mode, the agents should do their
best to fulfill the educational goals of the installation by explaining and
demonstrating key features of the exhibit.
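
The two modes of operation amount to a small state machine. The sketch below is illustrative only: the `Installation` class and its method names are invented, and the `visitor_left` trigger is an assumption, since this section states only that the system is in OFF mode when no user is present and does not say how the absence of a user is detected.

```python
from enum import Enum


class Mode(Enum):
    OFF = "attract passersby"
    ON = "explain and demonstrate the exhibit"


class Installation:
    def __init__(self):
        self.mode = Mode.OFF  # no user present at start-up

    def press_start(self) -> None:
        """A visitor presses the start button on the touch screen."""
        self.mode = Mode.ON

    def visitor_left(self) -> None:
        """Assumed trigger for returning to OFF mode (not specified in the text)."""
        self.mode = Mode.OFF

    def agent_task(self) -> str:
        """Task the agents pursue in the current mode."""
        return self.mode.value


installation = Installation()
print(installation.agent_task())
installation.press_start()
print(installation.agent_task())
```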
Meta-Theater in CrossTalk
Nowadays all users are aware that animated agents are programmed to
fulfill the tasks they have been designed for, e.g., acting as a virtual tutor,
tour guide, team mate, etc. In CrossTalk, Cyberella has the role of a fair
hostess and stage director whereas Tina and Ritchie adopt the roles of salesperson and customer in a generated car sales dialogue. In CrossTalk, however, we systematically counteract audience expectations by letting the
characters openly admit that they only play their role, that they are in
reality actors performing something a programmer has thought up. Of
course, this meta-role of the agents as actors is just as much a product
of programming. And although users easily understand this when
thinking about it, the commonplace simple illusion of life is here padded
by another layer of life, the meta-role, so that life and authenticity are
taken for granted on an intuitive level without deeper reflection.
In CrossTalk, the characters play out their meta-role when the system is
in OFF mode, i.e., when no user is present. In OFF mode, all three characters look around, shift posture, think about what to do next (using
cartoon-style think bubbles), or chat with each other. CrossTalk's context
memory allows them to use context-dependent knowledge in their simulated small talk, e.g., the current location, time of day, or the number of
demos given in the last few hours. This behavior is likely to encourage new visitors to approach and enter the installation. In ON mode, the characters
faithfully fulfill their role. Cyberella welcomes the user and offers a demo
performance of automatic presentation generation enacted by Tina and
Ritchie. We establish links between the role and the meta-role of an actor
by inserting unexpected out-of-character intermezzi during the performance in ON mode or by simulating a rehearsal of parts of the sales dialogue
in OFF mode. During rehearsals, Tina and Ritchie temporarily adopt their
roles as salesperson and customer but step out again when they forget their
lines or start arguing about the script. Intermezzi, on the other hand, are
pre-scripted scenes emphasizing their meta-roles as virtual actors (e.g., fixing a technical problem with the background screens), as explained in
the next section.


Character Design in CrossTalk

In CrossTalk, we wanted to visually separate the role (salesperson/customer) and meta-role (virtual actors) of Tina and Ritchie. As we
read body postures easily and immediately assign meaning to them, we
decided that we should focus on the difference of posture. When they
are themselves as actors, they have a relaxed body posture; whereas when
they are playacting, they straighten up and look more tense (Figure 4). This
is in accord with observations in psychotherapy where body posture
was found to indicate separate topics or points of view in a conversation
(Scheflen 1964). It is also in line with our findings in Puppet. Characters
in CrossTalk have been modeled by a professional graphics and animation
designer. They offer a large repertoire of about 35 actions derived from
analyzing a German TV show with manual gesture annotation (Kipp
2001). This opens a wide range of possibilities to visually support the conversation. Since we use the Microsoft Agent technology, which supports
only fixed animation sequences, each action was modeled twice, once for
each of the two basic body postures.
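Because only fixed animation sequences are available, the posture has to be part of the lookup for every action. The sketch below illustrates this under stated assumptions: the `<action>_<posture>` naming scheme and the function names are hypothetical and do not come from the CrossTalk system or the Microsoft Agent API.

```python
from enum import Enum


class Posture(Enum):
    RELAXED = "relaxed"  # meta-role: the agents as themselves, i.e., as actors
    TENSE = "tense"      # role: salesperson or customer


def animation_for(action: str, in_role: bool) -> str:
    """Return the name of the fixed animation sequence to play.

    The "<action>_<posture>" naming scheme is an assumption for illustration.
    """
    posture = Posture.TENSE if in_role else Posture.RELAXED
    return f"{action}_{posture.value}"


def library_size(num_actions: int) -> int:
    # Every action exists once per basic body posture.
    return num_actions * len(Posture)
```

With a repertoire of about 35 actions, this design implies roughly 70 modeled sequences per character, which is the price of keeping role and meta-role visually distinct.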
Posture is, however, not the only means of expressing role shifts. Each
time Ritchie and Tina communicate with Cyberella or address the user, they
break the frame of the fourth wall that confines their virtual stage.
Another way to underline the meta-theatrical flavor of the installation is
to create the illusion that the agents have full control over the system. This
is achieved by simulating technical problems (e.g., the screens go black, the
virtual stage has the wrong background image, etc.), which are then
solved by the agents themselves. We found that these effects, if used
sparingly, can have a strong impact on the audience and enhance the agents'
believability and lifelikeness.

FIGURE 4 Ritchie and Tina in role (left) and meta-role (right).

Behavior Modeling in CrossTalk
Agents in CrossTalk are not autonomous as they are in the Puppet
system. Their verbal and nonverbal behavior both in their role and in
their meta-role is specified through a multimodal script. This script is
divided into scenes, which specify for each agent what to say and what
to do. Transitions between scenes are defined in a scene flow graph.
The scene flow tells the system which scene should be played next during an interactive performance, e.g., when the user presses a button or
does not respond to a question (timeout). The scenes are either written
by a human author in a screenplay-like language or they are generated by
the system at runtime based on a domain and dialogue model. The simulated small talk in OFF mode is based on a multitude of pre-scripted
scenes (over 100 each in English and German), whereas the simulated car
sales dialogue of the exhibit is based on automatically generated scenes.
A detailed description of the authoring process can be found in Gebhard
et al. (2003).
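The relation between scenes and the scene flow can be illustrated with a small event-labeled graph. This is a minimal sketch under our own assumptions, not the authoring format described in Gebhard et al. (2003): the classes `Scene` and `SceneFlow`, the event names, and the example lines are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    """One scripted scene: an ordered list of (agent, line) pairs."""
    name: str
    lines: list = field(default_factory=list)


class SceneFlow:
    """Directed graph whose edges are labeled with events (hypothetical API)."""

    def __init__(self, start: str) -> None:
        self.scenes = {}
        self.transitions = {}
        self.current = start

    def add_scene(self, scene: Scene) -> None:
        self.scenes[scene.name] = scene

    def add_transition(self, source: str, event: str, target: str) -> None:
        self.transitions[(source, event)] = target

    def handle(self, event: str) -> Scene:
        # Follow the edge for this event; stay in place if none is defined.
        self.current = self.transitions.get((self.current, event), self.current)
        return self.scenes[self.current]


# Illustrative content only; these lines are not taken from the CrossTalk script.
flow = SceneFlow(start="welcome")
flow.add_scene(Scene("welcome", [("Cyberella", "Welcome! Shall we start the demo?")]))
flow.add_scene(Scene("sales_dialogue", [("Ritchie", "How fast is this car?")]))
flow.add_scene(Scene("prompt_again", [("Cyberella", "Press a button when you are ready.")]))
flow.add_transition("welcome", "button_pressed", "sales_dialogue")
flow.add_transition("welcome", "timeout", "prompt_again")
```

Separating the scene content from the transitions means that pre-scripted and generated scenes can be mixed freely, since the flow only refers to scenes by name.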
In OFF mode, in their meta-role, the agents should be themselves,
displaying rich personalities and a rich technical, geographical, and cultural background. For this purpose, we created a distinct character profile for each of the agents, which the human author used to ensure
that their behavior was situationally and individually appropriate in all
scenes. Ritchie, for example, was characterized as professional and
slightly presumptuous. His jokes are sometimes anti-women and he likes
teasing Tina and questioning Cyberella's directions. In ON mode, however, the characters' personalities are chosen by the user, who can decide, for example,
that Ritchie should play the salesperson as being polite and ill-tempered
or as being impolite and good-humored. There is, however, no clash or
misconception since we have a clear separation between role and
meta-role. This separation also makes it possible to exchange the exhibit, e.g., letting the agents discuss the best investment strategy for a given customer
profile instead of talking about cars. Exchanging the exhibit requires
creating new roles for Tina and Ritchie in ON mode. Their meta-roles, however, do not change, and this is what makes CrossTalk a
generic exhibition framework providing a virtual stage for interactive performances.


User Interaction in CrossTalk

Infotainment is information and entertainment. The "presenting
through performing" paradigm (André and Rist 2000) relieves the user
to some extent from the pressure to interact. By watching the characters
interact with each other, the user becomes familiar with the system
before initiating actions of their own. Reducing interaction inhibition is highly
important for installations meant for public spaces that should attract visitors.
During the simulated car sales dialogues, the frontal touch screen
(Figure 3) shows three feedback buttons labeled "applause," "boo,"
and "?" (question mark). The user can press any of them at any time to
give positive or negative feedback or to request a short explanation from
Cyberella about what is currently going on. Such feedback may cause unexpected (meta-theatrical) behavior. For instance, if a visitor submits a boo,
the actors may get nervous and forget their lines. In contrast, applause
makes them smile proudly or bow to the user. Thus, by giving frequent
feedback, the user can influence the behavior of the characters and trigger
specific feedback scenes. If a visitor presses the boo button (negative
feedback) repeatedly in quick succession, Cyberella becomes more and more distressed. Eventually she aborts the performance and suggests that the user
change the parameter settings before instructing Tina and Ritchie to
start all over again.
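The mapping from feedback buttons to feedback scenes, including the escalation after repeated boos, might be sketched as follows. The scene names, the threshold of three boos, and the ten-second window that stands in for "quick succession" are assumptions for illustration; the text does not state the actual values.

```python
class FeedbackHandler:
    """Hypothetical mapping from feedback buttons to feedback scenes."""

    def __init__(self, abort_after: int = 3, window: float = 10.0) -> None:
        self.abort_after = abort_after  # assumed number of quick boos before aborting
        self.window = window            # assumed "quick succession" window in seconds
        self.boo_times = []

    def press(self, button: str, now: float) -> str:
        """Return the name of the feedback scene to play for this button press."""
        if button == "applause":
            return "actors_bow"
        if button == "?":
            return "cyberella_explains"
        if button == "boo":
            # Keep only the boos that fall inside the recent time window.
            self.boo_times = [t for t in self.boo_times if now - t <= self.window]
            self.boo_times.append(now)
            if len(self.boo_times) >= self.abort_after:
                self.boo_times.clear()
                return "cyberella_aborts_performance"
            return "actors_forget_lines"
        raise ValueError(f"unknown button: {button!r}")
```

Because the handler only returns scene names, the same logic can be reused when the exhibit is exchanged, which matches the generic nature of the feedback scenes.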
The feedback scenes are seamlessly integrated into the running demo.
They are kept generic by letting the characters step out of their roles when
reacting to user feedback. The reactions of the characters in their meta-roles are consistent with their author-defined personalities and do not
interfere with their roles as salesperson and customer. By exploiting the
meta-theater metaphor, we can therefore create a unique interaction
experience quite independent of the current exhibit.
Informal Evaluation
CrossTalk has been presented on various occasions, most prominently
at the world's largest computer fair, CeBIT 2002, in Hannover. Since
then, people have interacted with the system at the Pacific Rim International
Conference on Artificial Intelligence in Tokyo (2002), at the Information
Society Technologies Conference in Copenhagen (2002), at the First International Conference on Technologies for Interactive Digital Storytelling
and Entertainment in Darmstadt (2003), as well as at several in-house presentations at DFKI. On these occasions, we logged user input data for offline
analysis. These data include the overall time spent with the installation, the
parameter choices for the car sales dialogues, and the type and frequency
of feedback given during the performance. Though we have not yet
conducted a formal evaluation to test our hypotheses, these log files give
us some evidence about the system's effect on users.
The activities of the agents in OFF mode, i.e., their meta-role, captured
the attention of the visitors and attracted them to the installation. The
cross-screen conversations between Cyberella and our two virtual actors,
Tina and Ritchie, achieved a high level of believability, letting visitors
watch/interact for as long as half an hour without detecting repetitive patterns.
Some people actually thought that the characters did truly respond to comments or jokes made by the others, thereby demonstrating that scripting
character behavior across screens is a powerful means to enhance the
illusion of life. And even though the car sales dialogues of the current exhibit are neither very realistic nor very informative, some users spent more
than 15 minutes watching them multiple times with different parameter settings to see how the agents would change their conversational behavior or
to see how they would react to feedback during the performance. This confirms our assumption that virtual actors in interactive performances can be
a strong motivating factor. It was, however, not much of a surprise that most
people liked Tina's and Ritchie's meta-roles much better than their roles as
salesperson and customer. The scenes for the meta-roles contain many
humorous technical remarks about their virtual life (e.g., living on hard
disk sectors, traveling via Internet connections, measuring time in CPU
cycles, etc.) and references to the users' cultural background (famous
movies, pop stars, etc.), in contrast to the technical question-answer games
in the simulated sales dialogues.
In this article, we have presented two rather different systems. Puppet is
a 3D virtual world inhabited by autonomous agents and allows unrestricted
user interaction via an avatar. The target user group is young children and
the pedagogical goal is to promote early learning through educational role-plays. CrossTalk, on the other hand, is basically 2D, the characters are completely scripted, and user interaction is limited to a small number of menus
and buttons displayed on a touch screen. The target audience consists of visitors to
a trade fair and the educational goal is to attract people and to inform
them about exhibits featuring embodied conversational agents. Despite
these differences, we have applied, in both cases, theatrical concepts,
namely, concepts derived from improvisational theater and meta-theater,
to their design. We showed how these concepts support a systematic
approach to character design, behavior modeling, and user interaction,
and how they can help to increase the believability and lifelikeness of these
characters in an educational context.


We are convinced that a dramaturgical perspective is a very fruitful one
when designing applications with educational agents that should interact
with and motivate users. Interaction isn't interesting in itself. A coherent
framework, a set of rules, or a carefully chosen scenario is needed if the
user is going to find it worthwhile to become interactive at all. We have
shown that the improvisational frame provides such a coherent framework
for Puppet. It also provided us with a shared view and a common terminology in our multidisciplinary endeavor, facilitating the discussions
between developmental psychologists, drama pedagogues, and computer
scientists. A similar argument holds for the meta-theater metaphor used
in CrossTalk. A common understanding of what the agents were supposed
to do in ON and OFF mode, in role and meta-role, guaranteed that
the overall behavior of the character was consistent and situationally
and individually appropriate. Role-plays with virtual actors are, however,
only one application domain. Theatrical concepts could also be applied
to a wide range of other areas, including interactive fiction, games, and
virtual training environments.
1. Langenscheidt-Longman. Dictionary of Contemporary English, Paul
Proctor (editor-in-chief), München, 1981.
2. This site is no longer being maintained.
I3net ended February 28, 2002.
André, E. and T. Rist. 2000. Presenting through performing: On the use of multiple animated
characters in knowledge-based presentation systems. In Proceedings of the IUI 2000, pages 1–8,
ACM Press.
Aylett, R. 1999. Narrative in virtual environments: Towards emergent narrative. In Proceedings of the AAAI
Fall Symposium on Narrative Intelligence, Cape Cod, MA, USA.
Cassell, J., J. Sullivan, S. Prevost, and E. Churchill, eds. 2000. Embodied Conversational Agents. Cambridge,
MA: The MIT Press.
Gebhard, P., M. Kipp, M. Klesen, and T. Rist. 2003. Authoring scenes for adaptive, interactive performances. In Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems,
New York: ACM Press.
George, P. and M. McIllhagga. 2000. The communication of meaningful emotional information
for children interacting with virtual actors. Lecture Notes in Artificial Intelligence, vol. 1814,
Berlin, Heidelberg, New York: Springer-Verlag.
Hayes-Roth, B. and R. van Gent. 1997. Story-making with improvisational puppets. In Proceedings of the
First International Conference on Autonomous Agents, New York: ACM Press.
Johnstone, K. 1981. IMPRO: Improvisation and the Theatre. New York: Routledge.
Kipp, M. 2001. From human gesture to synthetic action. In Proceedings of the Workshop on Multimodal
Communication and Context in Embodied Agents held in conjunction with the Fifth International Conference
on Autonomous Agents, pages 9–14, New York: ACM Press.
Klesen, M., M. Kipp, P. Gebhard, and T. Rist. 2003. Staging exhibitions: Methods and tools for modelling
narrative structure to produce interactive performances with virtual actors. Virtual Reality 7(1):
Kline, C. and B. Blumberg. 1999. Building believable synthetic characters. In Proceedings of the i3 Spring
Days Workshop on Behavior Planning for Life-Like Characters and Avatars, pages 63–71.
Laurel, B. 1993. Computers as Theatre. Reading, MA: Addison-Wesley.
Marshall, P., Y. Rogers, and M. Scaife. 2002. Puppet: A virtual environment for children to act and direct
interactive narratives. In Proceedings of the Second International Workshop on Narrative and Interactive
Learning Environments, Edinburgh, Scotland.
Mateas, M. and A. Stern. 2003. Integrating plot, character and natural language processing in the
interactive drama Façade. In Proceedings of the Technologies for Interactive Digital Storytelling and
Entertainment (TIDSE) Conference, pages 139–151.
Murray, J. H. 2000. Hamlet on the Holodeck: The Future of Narrative in Cyberspace. Cambridge, MA: The MIT Press.
Prendinger, H. and M. Ishizuka. 2001. Social role awareness in animated agents. In Proceedings of the Fifth
Conference on Autonomous Agents, pages 270377, New York: ACM Press.
Puppet. 2002. Final Report, ESPRIT, LTR, i3-ESE Project EP-29335 Puppet.
Reichenbach, I. and J. Masters. 1983. Children's use of expressive and contextual cues in judgments of
emotion. Child Development 54:102141.
Rist, T. and M. Schmitt. 2002. Applying socio-psychological concepts of cognitive consistency to
negotiation dialog scenarios with embodied conversational characters. In Proceedings of the AISB'02
Symposium on Animated Expressive Characters for Social Interactions, pages 79–84.
Ryokai, K., C. Vaucelle, and J. Cassell. 2002. Literacy learning by storytelling with a virtual peer. In
Proceedings of Computer Support for Collaborative Learning, Boulder, CO, pages 352–360.
Scaife, M. and the Puppet Project Team. 1999. Imagination, creativity and new forms of learning:
Designing a virtual theatre for young children. In Proceedings of the i3 Annual Conference, pages
Scaife, M. and Y. Rogers. 1996. External cognition: How do graphical representations work? International
Journal of Human-Computer Studies 45:185–213.
Scheflen, A. E. 1964. The significance of posture in communication systems. Psychiatry 26:316–331.
Spolin, V. 1999. Improvisation for the Theater, 3rd edition. Evanston, IL: Northwestern University Press.
Swartout, W., et al. 2001. Towards the holodeck: Integrating graphics, sound, character and story. In
Proceedings of the Fifth Conference on Autonomous Agents, pages 409–416, New York: ACM Press.
Trappl, R. and P. Petta, eds. 1997. Creating personalities for synthetic actors: Towards autonomous
personality agents. Lecture Notes in Computer Science, vol. 1195, Berlin, Heidelberg, New York: