
2011 23rd IEEE International Conference on Tools with Artificial Intelligence

Managing Uncertainty in Text-To-Sketch Tracking Problems


Matthew D. Schmill and Tim Oates
Computer Science and Electrical Engineering Department
University of Maryland Baltimore County
Baltimore, Maryland
schmill@umbc.edu, oates@umbc.edu

Abstract: Text-to-Sketch (T2S) is a class of problems in which geolocation is performed using natural language descriptions of a location or locations as input. This is a challenging problem due to the many sources of uncertainty inherent to the task: there is often syntactic and semantic ambiguity present in the input observations, as well as referential ambiguity when the language used to describe the scene may refer to many possible objects or locations in the world. Tracking problems, in which the Text-to-Sketch paradigm is extended to incorporate multiple locations and movements over a temporal dimension, introduce additional uncertainty. We describe a tool for managing the uncertainty in Text-to-Sketch problems called MUTTS. The MUTTS system combines traditional natural language processing (NLP) tools with algorithms used to manage uncertainty in mobile robot navigation, allowing the temporal and geographical constraints in the text to incrementally reduce the overall uncertainty of a subject's location and to produce high quality sketches of the subject's location and movements over time.

Keywords: interactive systems; particle filters; uncertainty; natural language processing; text to sketch

I. INTRODUCTION

The goal of Text to Sketch (T2S) systems is to produce sketches from natural language descriptions. Exactly what constitutes a sketch varies from system to system. Existing approaches generate 2D topological maps based on textual descriptions of physical features (buildings, etc.) [1]. Those maps can then be matched against satellite imagery to provide geolocation services; one might imagine an agent trying to orient herself in a foreign environment, and utilizing an intelligent T2S system to provide geolocation details based on a spoken description of her surroundings. Another application of text to sketch is robot navigation based on qualitative specifications [2]. An extension to the 2D version of T2S that is of particular interest to the intelligence community introduces a temporal dimension, allowing temporally extended sketches that can represent not just location but movement and routes [3]. This extension allows us to consider not just geolocation or map building, but tracking. Extending the T2S paradigm in this way, however, has a significant impact on how T2S is executed, due to how it affects the handling of uncertainty. A key issue in T2S systems is how to manage and represent uncertainty. Natural language is often imprecise and ambiguous, especially the terms we use to refer to
1082-3409/11 $26.00 © 2011 IEEE. DOI 10.1109/ICTAI.2011.70

space, locations, time, and duration. Among the sources of uncertainty in T2S:
- Explicit imprecision refers to the use of language that explicitly represents uncertainty, such as "near his house" or "around 3 o'clock".
- Syntactic ambiguity refers to sentence structure for which more than one parse is possible given the language.
- Semantic ambiguity arises when there are multiple legal word senses given the syntactic interpretation.
- Referential ambiguity is possible when a word or phrase could refer to more than one known physical location, object, or person.
- Spatial imprecision results when words are used that are geographically non-specific, such as "the park" or "downtown".

Text to sketch systems, in order to be useful, must represent these uncertainties and, when possible, use constraints present in the data and background knowledge to reduce them. Furthermore, a T2S system must be able to present sketches and uncertainty in a manner that is useful to the human user, and in the best case, allow the user to supply background knowledge that will improve the results. In our work, we consider the task of tracking a subject as he moves around an urban environment. The goal is to produce a sketch of the subject's locations, movements, and the routes he has taken based on natural language observations, either in real time or post hoc. The textual descriptions may include eye-witness accounts, overheard conversations (possibly from the subject himself), police reports, and so on, and may refer to the movements of the subject as well as landmarks and locations that he has encountered. Some examples of the types of accounts we might expect:
- "We saw him near the pizza restaurant." (eye-witness account)
- "Subject walking north for one half mile. Subject turns east and continues for 5 minutes." (police report)
- "I am meeting Jerry at the hospital." (overheard)

The introduction of multiple speakers adds an additional level of uncertainty to the task: there may be irrelevant information in the text stream and some text may refer to

Figure 1. An overview of the MUTTS pipeline: text → parser → parse tree → semantic engine → TMR → particle filter → particle distribution.

the particle filter and how it is adapted for text-to-sketch. We follow with details of the natural language processing that MUTTS performs when there is input to be processed. Finally, we conclude with a discussion of domain knowledge and how its ubiquity in computer systems can be used to enhance text to sketch.

A. Particle Filters

One of the key insights that makes the work described here possible is that text to sketch shares properties with a well-studied problem in robotics called localization. The goal of localization in mobile robotics is to integrate noisy, time series sensor and odometry data with a map to produce a probability distribution over possible robot locations. Sensing decreases uncertainty about the robot's location because it enables reasoning about where on the map such readings might be produced. Movement increases uncertainty about the robot's location because the dead reckoning algorithms typically used to estimate change in location are inaccurate; it is impossible to know exactly how far the robot has moved due to effector noise and environmental factors. However, repeatedly sensing and moving can dramatically decrease uncertainty about the robot's location. For example, a robot whose sonar detects a doorway on the left could be just inside or just outside any office in an office building, but it becomes clear that the robot is in a hallway when it observes a second doorway on the left while moving in a straight line. To better understand the relationship between mobile robot localization and text-to-sketch, consider Mr. Jones, who is known to be in Washington, DC. Being told that Jones is "near the memorial" enables reasoning about Jones' location. We don't know which memorial Jones is near, nor do we know his precise location relative to the memorial due to the use of the word "near", but we can represent his location as a probability density with values that increase the closer you get to anything on a map of Washington labeled as a monument.
This is very much like the robot above that knows its approximate distance (due to sensor noise) to a door, but has no idea which door. Next, suppose we're told "Jones walked north for 20 minutes." To account for this information, the probability density describing possible locations is shifted north by the average distance that a person can walk in 20 minutes, but it is also spread out (reflecting an increase in uncertainty about Jones' location) to account for the fact that Jones could have been walking faster or slower than average, or could have diverged from a due north trajectory. Again, this uncertainty about the distance traveled is precisely the problem faced by mobile robots with noisy effectors. A number of algorithms exist that solve the localization problem efficiently and, in some cases, optimally from the standpoint of using available information to maximally reduce uncertainty about location. We use an approach known as particle filtering [5]. A particle is, roughly, a point on

the subject in the first person, third person, or not at all. Speakers may use colloquialisms, be non-native speakers of the target language, and there even exists the possibility of adversarial intelligence: false observations intended to make the tracking task more difficult. In this paper, we present a system we have developed called MUTTS, which combines a mix of off-the-shelf and in-house NLP tools with a probabilistic framework called a particle filter [4] to tackle the problem of Text-to-Sketch for subject tracking. In the sections that follow, we describe the MUTTS system and its components, including the particle filter, a variant of which we have adapted to T2S, and show how it can be used to represent, reduce, and visualize uncertainty. We present the text processing elements of MUTTS, and show how it processes and displays information in a manner useful for intelligence analysis in a usage example. We conclude with a discussion of our ongoing and future efforts to improve the tool and its underlying AI components.

II. SYSTEM

Our system for generating sketches for tracking analysis is called MUTTS: Managing Uncertainty in Text To Sketch. MUTTS is a web-based application, written using the Google Web Toolkit (GWT), which compiles pure Java down to a combination of JavaScript and external libraries. The GWT offers access to a suite of Google functionality that includes Google Maps, Local Search, and Directions, all of which are used at various stages of processing and visualization. Supplementary road data is also available from the Census Bureau's TIGER/Line® data files. MUTTS takes natural language textual¹ accounts of a subject's locations and movements as input, and provides as output visualizations of the most likely waypoints and routes that the subject took during the time period being tracked. A rough overview of the MUTTS system is shown in figure 1. The T2S process is treated as a pipeline in MUTTS.
First, natural language processing tools are utilized to produce representations that encode syntactic structure and roles in the text. Those representations are then queried to infer semantics, producing text meaning representations (TMRs), and finally, those representations are passed to an adapted particle filter, which updates its own internal models of where the subject might be and how he might have gotten there. In the remainder of this section, we start by describing
¹Adding automated speech recognition would be a straightforward extension that would introduce an additional source of uncertainty.

the map that carries a quantum of probability mass. The more particles there are in an area of the map, the higher the probability that the person of interest is there. The particle filter algorithm updates the positions of the particles in response to new information (e.g., the fact that Jones walked north for 20 minutes). The number of particles can be chosen to trade off computational cost against the resolution of the probability density, but since computation is linear in the number of particles (i.e., constant per particle per update) and is easily made parallel, it is not unusual to have tens or even hundreds of thousands of particles. After each update, particles are redistributed by sampling (importance sampling [6]) from the density they approximate, so that particles will die out in low probability (unimportant) areas and become more concentrated in high probability (important) areas. In this way, particles initially allocated to, say, parses or reference resolutions that make subsequent observations improbable will be reallocated near particles based on parses or reference resolutions that are supported by subsequent observations. The MUTTS system implements an adapted particle filter as a probabilistic framework for representing uncertainty about a subject's location on a map. Next, we consider how analogs for sensing and moving are generated in the system.

B. Text Processing in MUTTS

Incoming text is first processed by the Stanford Parser [7], [8], which we use to produce structured syntactic representations from raw text. The representations used by MUTTS are parse trees and typed dependency lists. A parse tree is a tree structure that represents the syntax of a sentence. Words are grouped into phrases, and their roles in the sentence can be determined by examining the path back to the root. The typed dependency list is generated from a parse tree and expresses how words relate to one another in a sentence.
Consider the following sentence: "The subject was seen near a Popeye's Fried Chicken." The most probable parse for this sentence follows:
(ROOT
  (S
    (NP (DT the) (NN subject))
    (VP (VBD was)
      (VP (VBN seen)
        (PP (IN near)
          (NP
            (NP (DT a) (NNP Popeye) (POS 's))
            (NNP Fried) (NNP Chicken)))))))

Note that with this representation, the trained eye (or computer) can quickly identify important parts of the sentence, such as the verb phrase "was seen". The typed dependency list for the parse tree above is as follows:

det(subject-2, the-1)
nsubjpass(seen-4, subject-2)
auxpass(seen-4, was-3)
det(Popeye-7, a-6)
poss(Chicken-10, Popeye-7)
nn(Chicken-10, Fried-9)
prep_near(seen-4, Chicken-10)

Note that the typed dependency list provides a convenient representation for locating words related to one another; for instance, that "Fried" is a compound noun modifier for "Chicken". Together, the parse tree and typed dependency list provide enough information for the next phase of processing to begin. In this phase, the sentence structure is examined to produce a text meaning representation that can be used to update the particle filter in downstream processing. The module responsible for this process is called the semantic interpretation engine (SIE), as shown in figure 1. We have designed and developed the MUTTS SIE as a rule-based template-matching system. The SIE comprises a lexicon of English words, organized into an ontology to allow for generalizing across word classes, and a set of rules. On the left-hand side of the rules are mechanisms for matching parse trees and typed dependency lists, and on the right-hand side is code for extracting semantics to generate a useful TMR. For example, suppose we wanted to generate a rule to catch the phrase above. A rule is constructed first to look for the passive voice (a passive verb is used as the root of the verb phrase), next to check for a verb that is a descendant of "observation" in the lexical ontology (by extracting "seen" from the auxpass relation in the typed dependency list), and finally to require that there is a prepositional phrase beginning with a spatial preposition (again, using a combination of the parse tree, typed dependency list, and information in the lexical ontology). The pattern of the rule described above would match our sample sentence, and the right-hand side of the rule would be used to generate a text meaning representation. The first step is to decide whether the sentence represents a sensing event or a movement event, in the sense described in section II-A.
Does the sentence refer to movement, in which case the TMR will be used to update particle locations in a dead-reckoning-style update, or does the sentence reference a landmark or location, in which case the sentence is analogous to sensing in particle filter localization? Currently, classification of a sentence as a sense or movement event is hardwired into the rule. In this case, the combination of an observation verb and a spatial preposition indicates a sense event. Sense and movement events have parameters that are filled out during the processing of the right-hand side of a rule. In the case of sense events, the primary objective of the rule is to extract a landmark or location reference, and the secondary objective is to determine the specificity of the reference. The primary objective is achieved by first extracting the object of the spatial preposition ("Chicken"), then pulling in all modifiers of the object that indicate they

belong together ("Popeye's" and "Fried"). The secondary objective, in this case, is achieved by looking up a specificity level for the preposition being used, which is part of the lexical ontology. In this case, the use of "near" implies some uncertainty about the actual proximity to the landmark, whereas "at" would indicate relative certainty that the subject is actually at the landmark. In the case of our sentence, the TMR might look like this:²
(SENSE :specificity moderate :landmark "Popeyes Fried Chicken")

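To illustrate how a SENSE TMR of this kind could drive a particle update, consider the following minimal Python sketch. This is not the MUTTS implementation (which is written in Java); the mapping from specificity levels to spatial spread, the planar coordinates, and all numbers are hypothetical.

```python
import math
import random

# Hypothetical mapping from a TMR's :specificity level to the spread
# (standard deviation, in meters) of the likelihood around a landmark.
SPECIFICITY_SIGMA = {"high": 30.0, "moderate": 150.0, "low": 500.0}

def sense_update(particles, landmarks, specificity):
    """Reweight particles by proximity to the nearest candidate landmark,
    then importance-resample so probability mass concentrates there."""
    sigma = SPECIFICITY_SIGMA[specificity]
    weights = []
    for (x, y) in particles:
        # Likelihood is highest near the closest candidate landmark.
        d = min(math.hypot(x - lx, y - ly) for (lx, ly) in landmarks)
        weights.append(math.exp(-0.5 * (d / sigma) ** 2))
    total = sum(weights)
    if total == 0.0:   # degenerate case: keep the old particle set
        return particles
    probs = [w / total for w in weights]
    # Importance resampling: draw N particles in proportion to weight.
    return random.choices(particles, weights=probs, k=len(particles))

# Usage: a uniform prior over a 1 km square and two candidate landmark
# locations (planar coordinates in meters, standing in for lat/lon).
random.seed(0)
particles = [(random.uniform(0, 1000), random.uniform(0, 1000))
             for _ in range(5000)]
landmarks = [(200.0, 300.0), (800.0, 700.0)]
posterior = sense_update(particles, landmarks, "moderate")
```

After the update, particles far from both candidates die out, while mass concentrates near the two landmarks, mirroring the reweight-and-resample behavior described in section II-A.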
These text meaning representations are ready for the particle filter to process as sense and movement events, as described in section II-A.

C. Applying Domain Knowledge

Using Google Maps, Local Search, and TIGER/Line® allows MUTTS to bring a great deal of domain knowledge to bear on managing the uncertainty in T2S, a much broader range than any human analyst could be expected to have. The strength of automated text to sketch is the amount of domain knowledge available, encoded in search engines and databases, and the challenge is to exploit this knowledge while performing adequately where human intelligence excels: in natural language processing, commonsense reasoning, and so on. In this section we describe a tool called path verification that is made possible by Google Directions and augments the utility of MUTTS in just such a manner. The complex geometry of high-resolution maps, coupled with the surface features that go along with these maps (transportation networks, waterways, green space, and so on), creates a conundrum for the particle filter when performing a dead reckoning update. If an observation comes in that has an agent driving or walking to the northeast for half a mile, then the particles must all be translated roughly a half mile, roughly to the northeast. The most basic update would simply move the particles, regardless of road networks or geographical features, and then the sketch might involve the agent driving over the Chesapeake Bay. On the other end of the spectrum, the particle filter could be tasked with incorporating all the various map features, conducting a search over the road network (and incorporating footpaths in the case of walking directions), and producing only legal particle updates that respect the rules of the road and the lay of the land. While the former approach is obviously too naive, the latter appears quite daunting. Fortunately, Google Directions essentially accomplishes exactly that task.
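As a rough sketch of how such a path-verification check might look in code, consider the following. The routing function here is a stub standing in for Google Directions; the straight-line walking-speed model and the tolerance factor are illustrative assumptions, not the values used by MUTTS.

```python
import math

def route_minutes(p_from, p_to):
    """Stand-in for a routing service: return the best-route travel time
    in minutes between two points. Here, a straight-line model at walking
    speed (5 km/h) over planar coordinates in meters."""
    meters = math.hypot(p_to[0] - p_from[0], p_to[1] - p_from[1])
    return meters / (5000.0 / 60.0)

def path_verified(p_from, p_to, stated_minutes, tolerance=2.0):
    """A particle's move is verifiable if the routed travel time is within
    a factor of `tolerance` of the duration stated in the text."""
    routed = route_minutes(p_from, p_to)
    return routed <= stated_minutes * tolerance

# The text says "10 minutes", but a 2 km displacement needs roughly
# 24 minutes on foot, so this particle's path fails verification and
# would be resampled.
ok = path_verified((0.0, 0.0), (2000.0, 0.0), stated_minutes=10)
```

A real implementation would query the routing service for the actual road or footpath route, and would compare the routed duration and distance against the full distributions derived from the text rather than a simple tolerance factor.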
To produce particle updates for movement events, MUTTS performs the simplistic dead reckoning update, and uses Google Directions to verify whether or not the updated particle location is realistic given the features of the map. This is path verification. If utterance u_i moves particle p from location p_i to p_i+1, we conclude that the update is verifiable if the distance and time Google Directions derives for p_i → p_i+1 are probable given the duration and distance distributions derived from the processing of u_i. Said differently, if the source text says "10 minutes," but Google Directions returns a best route that takes 20 minutes, then the particle path is not verifiable, and it should be resampled.

III. USAGE CASE

In this section we detail a typical usage of the system and describe some of the investigative features and visualizations

This representation is almost actionable by the particle filter. There remains the question of where, exactly, Popeye's Fried Chicken is. Particles are represented by latitude and longitude, not by a common name. To resolve this mapping between landmark or location names and points of latitude and longitude, we use Google's Local Search API. The functionality Local Search offers is to provide points that match a keyword search. In the case of "Popeye's", and if our area of interest (AOI) is Baltimore, Local Search will return 13 Popeye's locations, complete with latitude, longitude, and a variety of other information in hypertext format. The landmark field of the sense event can then be replaced by the corresponding points in the search results. MUTTS allows the user to define an area of interest outside of which search results will be ignored. Generating movement events is a somewhat simpler process. Consider the following text: "He walked east on Reisterstown Road, for maybe 15 minutes." The TMR for a movement action includes direction, distance, and duration (any of which may be extracted from the text, derived by computation, or set to defaults), any known road references, and uncertainties associated with the direction, distance, and duration fields. Rules and templates are written to identify movement events, and the typed dependency list is interrogated to fill in the TMR.
(MOVE :specificity approximate
      :direction (0.0 0.1)
      :duration (15 3.0)
      :distance (0.75 0.15)
      :onroad "Reisterstown Rd.")

Note the introduction of a list notation to represent normal distributions. In the above TMR, the duration is expressed as a normal distribution with a mean of 15 (minutes) and a standard deviation of 3. The mean here is drawn directly from the text ("15 minutes"), while the standard deviation is derived from a combination of the typical inaccuracy of a human observer and any uncertainty modifiers present in the text (in this case, "maybe").
²Those familiar with LISP will find this symbology familiar, even though MUTTS is not implemented in LISP.
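A dead-reckoning update driven by such a MOVE TMR might be sketched as follows. This is a hypothetical Python illustration, not the MUTTS (Java) implementation: the heading convention (radians, 0 = east), the mile-to-meter conversion, and the field names are our own assumptions.

```python
import math
import random

def move_update(particles, heading, distance):
    """Translate each particle by a heading/distance drawn from the TMR's
    normal distributions, spreading the cloud to reflect the added
    uncertainty that movement introduces."""
    h_mean, h_std = heading    # (mean, std) in radians
    d_mean, d_std = distance   # (mean, std) in meters
    moved = []
    for (x, y) in particles:
        h = random.gauss(h_mean, h_std)
        d = max(0.0, random.gauss(d_mean, d_std))
        moved.append((x + d * math.cos(h), y + d * math.sin(h)))
    return moved

# The TMR's (0.75 0.15) miles converted to meters: mean ~1207, std ~241.
# Direction (0.0 0.1): mean heading with a small angular spread.
random.seed(1)
particles = [(0.0, 0.0)] * 1000
moved = move_update(particles, heading=(0.0, 0.1),
                    distance=(1207.0, 241.0))
```

Starting from a single point, the moved cloud is centered roughly three quarters of a mile along the mean heading but is dispersed both along and across the direction of travel, which is exactly the spreading effect described in the text.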

Figure 3. Particles distributed around annotated search results for "pizza restaurant" in the Arlington area of Baltimore.

Figure 2. A screenshot of the MUTTS application.

that exist in MUTTS. Recall that MUTTS is a web application built using Google's GWT framework, and incorporates a suite of online tools to support the operations necessary for geolocation and visualization. A screenshot of the full MUTTS application can be seen in figure 2; it contains a map view, a tree view for breaking down the text input, an interaction panel for visualizing search and sketch results, and text areas to input data and otherwise interact with MUTTS. The discussion here is based on a tutorial developed for users of the system. The use case here is that someone (whom we will refer to as the analyst) has received a collection of textual observations that refer to the locations and movements of a subject. What the analyst would like is to provide an automated system with the text, and get back a detailed map of the subject's locations at all times throughout the observation period; ideally, this would be a path through the map, annotated with all the subject's stops. Due to the various sources of uncertainty in the text stream, a single, true, accurate sketch cannot generally be known. Therefore, MUTTS generates sketches probabilistically, and allows the analyst to consider and visualize the possibilities. Analysis of a tracking problem begins with the analyst constraining the area of interest. In this case, the AOI is the Arlington/Mount Washington area of Baltimore, Maryland. The initial configuration of the particle filter places the particles uniformly distributed over the AOI. Particles are rendered to the MUTTS map view as triangles representing the hypothesized location and direction of movement. The analyst begins the process by collecting the textual accounts and entering them as input to MUTTS. Consider the following collection of descriptions of a subject's whereabouts: (9:30pm): "We saw him near the pizza restaurant."

Figure 4. A movement event has introduced uncertainty in the location of the subject.

(9:41pm): "Subject walking north for one half mile. Subject turns east and continues for 5 minutes." (9:52pm): "I am meeting Jerry at the hospital." This is what one might expect in a typical tracking scenario (though typically one would have more data). We have three textual accounts, from different sources, with approximate timestamps. In this example, the observations come from an eye witness, from a police report, and from an overheard conversation of the subject. MUTTS will begin by processing the first observation, which it will classify as a sense event, with search query "pizza restaurant". The query returns 6 hits, labeled A–G in figure 3 (D is off the screen). Note that the particle filter has processed the sense event: those particles consistent with the locations of the pizza restaurants are given more weight, while those inconsistent are given lower weight or resampled to locations consistent with a 2-dimensional normal distribution, centered at the nearest pizza restaurant, and consistent with models of the term "near".

Figure 5. A sensing event that removes uncertainty.

Figure 6. A sketch that is consistent with the text.

The second record is then processed. MUTTS will generate two observations, both movement events, for the second report. Movements are processed by the particle filter as described in sections II-A and II-C. Essentially, a dead reckoning update is performed, and path verification is used to quantify the likelihood that the movement could have happened. Movement events either contain explicit distance information, or distance can be derived from duration language and models of movement. In this case, the subject was observed walking, and MUTTS can model the translation described in the observations by a normal distribution consistent with a model of walking. The resulting particle distribution is shown in figure 4. Note the spreading effect that a movement event has on the particles, expressing the uncertainty introduced when a subject begins moving. Not only may "one half mile" be a rough estimate, but the subject may have taken a number of different routes and side streets in traversing that distance. The third record is spoken in the first person and is processed as a sense event. The search query, "hospital", is highly specific, as is evidenced by the updated particle filter shown in figure 5. There is only one hospital, and all particles that are not in the vicinity of the hospital after the prior update are resampled to reflect the relative certainty that at 9:52pm, the subject is at that particular hospital. At this point, having incorporated four events into the tracking problem, it is reasonable to start considering what a sketch looks like, along with how it is generated, visualized, and evaluated. A sketch is generated by iterating over the particles in the particle filter, retrieving each particle's history as its position and orientation have changed in response to processing the text, and generating routes with the help of Google Directions.
Thus, each particle tracked by the filter has a corresponding sketch, and each such sketch can be scored and ranked according to total distance traveled, duration, or a believability ranking, which incorporates the particle's weight over its history as well as external

measures, such as the path verification scores for the various segments of the sketch. The analyst sees a ranked list of particle sketches, along with direction, duration, and believability, and begins viewing the sketches in order to envision the possible scenarios. Two sketches are shown in figures 6 and 7. The former figure contains a sketch with low duration and distance traveled, and high believability. The high believability score is derived from two main factors. First, the particle weight remains high over the duration of the sketch, indicating that when sense events were processed (in particular, the meeting at the hospital), the particles were already in close proximity to where the subject was suspected to be. Second, the path verifier found that the duration and distance traveled in all segments of the sketch could be reasonably expected given the corresponding movement events. The sketch shown in figure 7, in contrast, has a longer overall duration and distance traveled, and a lower believability score. This is in large part due to the particle's initial position at the pizza restaurant labeled F in figure 3. It is unlikely that this is the restaurant referred to by the eye witness, given subsequent movements and the eventual meeting at the unambiguously located hospital. The particle weights are correspondingly low in this sketch. In addition, the paths required to arrive at the hospital are unlikely. The location of Woodberry Woods and the Jones Falls Expressway prevent the subject from having a clear and timely route to the hospital, and this is precisely the role of the path verifier: to flag routes as unlikely given the movements described in the text. The cycle of adding observations, visualizing the sketches, and evolving a picture of the most likely tracking scenarios can continue as long as there is additional data.
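One way such a believability ranking might be computed is sketched below. The combination rule (mean historical weight times the product of per-segment verification scores) and all numbers are illustrative assumptions, not the scoring actually used by MUTTS.

```python
def believability(weight_history, segment_scores):
    """Score a particle's sketch: the particle's average weight over its
    history, discounted by per-segment path-verification scores (each
    assumed to lie in [0, 1])."""
    avg_weight = sum(weight_history) / len(weight_history)
    path_score = 1.0
    for s in segment_scores:
        path_score *= s
    return avg_weight * path_score

# Sketch A stayed near the suspected locations and its segments verified
# well; sketch B (e.g., one starting at a far-away restaurant) did not.
sketches = {
    "A": believability([0.9, 0.8, 0.95], [1.0, 0.9]),
    "B": believability([0.4, 0.2, 0.1], [0.6, 0.3]),
}
ranked = sorted(sketches, key=sketches.get, reverse=True)
```

Ranking by this score surfaces sketches whose particles remained consistent with the sense events and whose movements survived path verification.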
We view the MUTTS system as increasingly mixed-initiative, allowing the analyst to participate in the process by manually ruling out or adding landmarks and routes, as well as by providing input to the language processing pipeline. Improving

Figure 8. Advice-giving in the MUTTS pipeline: when the semantic engine fails to match a TDL, it passes a target TDL back to the parser as advice; on a match, it produces a TMR.

Figure 7. A sketch that is unlikely given the input and background knowledge about travel times.

the interactivity between the analyst and MUTTS is an ongoing area of development.

IV. FUTURE WORK

There is still much that can be done to improve MUTTS. Future work falls into two categories: refining the tool, and basic research. MUTTS is currently in alpha, and initial usability testing and evaluations are being performed by intelligence analysts. The feedback is still preliminary as of this writing, but adding to the mixed-initiative capabilities, as well as improving the rule base of the semantic interpretation engine (to cover more constructions), are obvious ways to improve performance and enhance the utility of the tool. We feel that the semantic interpretation engine would also benefit from transitioning from a home-grown ontology to a larger-scale, established product such as WordNet³, and an opportunity exists for analysts to teach MUTTS new semantic templates when new language constructs are observed by the system. Indeed, the goal is automated data acquisition from internal reports as well as the field, and we must expect to receive unusual linguistic constructions from a variety of speakers with various backgrounds. Basic research goals include those areas where good, working solutions to the AI aspects of T2S are not established. Here, we are not looking to make incremental improvements to the parser, for example, but to explore new avenues where advances to the field in general may be made. While we are always trying to incorporate new methods for managing uncertainty, we are particularly encouraged by a novel learning paradigm that is well-suited to the problem of T2S for tracking, and to MUTTS in particular. Recall the pipeline diagram in figure 1. In actuality, since the parsing process can also be viewed as a pipeline, the pipeline is somewhat longer, consisting of: a part-of-speech
³http://wordnet.princeton.edu/

tagger, a named-entity recognizer, a k-best parser, a typed dependency generator, the semantic interpretation engine, and finally, the particle filter and path verifier. Many of these processes are trainable components, based on supervised or semi-supervised learning from labeled examples. Consider the following scenario. Rules and their corresponding templates have been generated to cover a variety of possible textual constructions. In the course of processing a large stream of text, MUTTS encounters the following two sentences: "He walked north for 8 minutes." ... "... then, he walked east for 8 minutes." These two sentences, in a prior release of the parser, were treated differently.⁴ Here is the typed dependency list for the north observation:
nsubj(walked-1, north-2)
num(minutes-5, 8-4)
prep_for(north-2, minutes-5)

The SIE contains a rule that matches on a movement verb and a duration specification ("walked" and "minutes", respectively), and creates a movement event that can be filled out by searching for the num dependency in the TDL. But the east instance was processed differently:
advmod(walked-1, east-2)
prep_for(walked-1, 8-4)
nsubj(walked-1, minutes-5)
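To make the contrast concrete, the following toy sketch shows how a template rule of this kind might complete on the first TDL but not the second. This is our illustration only: the rule format, function names, and event dictionary are hypothetical, not the actual MUTTS rule base.

```python
def parse_tdl(lines):
    """Parse 'rel(gov-i, dep-j)' strings into (rel, gov, dep) word triples."""
    triples = []
    for line in lines:
        rel, args = line.split("(", 1)
        gov, dep = args.rstrip(")").split(", ")
        triples.append((rel, gov.rsplit("-", 1)[0], dep.rsplit("-", 1)[0]))
    return triples

def movement_event(tdl):
    """Fire on a movement verb plus a duration word; complete the event
    only if a num dependency supplies the amount. Returns None otherwise."""
    words = {g for _, g, _ in tdl} | {d for _, _, d in tdl}
    if "walked" not in words or "minutes" not in words:
        return None  # rule does not fire at all
    for rel, gov, dep in tdl:
        if rel == "num" and gov == "minutes":
            return {"action": "walk", "duration": (dep, "minutes")}
    return None  # rule almost fires: movement verb found, duration missing

north = parse_tdl(["nsubj(walked-1, north-2)",
                   "num(minutes-5, 8-4)",
                   "prep_for(north-2, minutes-5)"])
east = parse_tdl(["advmod(walked-1, east-2)",
                  "prep_for(walked-1, 8-4)",
                  "nsubj(walked-1, minutes-5)"])

print(movement_event(north))  # event completes: duration filled from num
print(movement_event(east))   # incomplete: num dependency absent
```

The rule "almost fires" on the second TDL in exactly the sense described below: the movement verb and duration word are present, but the missing num dependency leaves the event unfilled.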
The absence of the num dependency prevents the rule from completing the movement event in the way that is most useful to the particle filter. Though this particular pathology no longer occurs in the parser, it is illustrative of a general condition: the parser is a large, complex system that may not always parse sentences in the manner most convenient for our semantic interpretation engine, especially when dealing with the unorthodox constructions found in casual speech. In these cases, we would like to invoke the learning components opportunistically to improve performance. Since the SIE and the parser are coupled in the MUTTS pipeline, and since the SIE has an existing rule that almost fires completely, it is possible for the SIE to express its ideal input as a training instance and pass it back along the pipeline as advice for upstream components to learn from. Ideally, the upstream component would then generate new
output closer to the SIE's target. This process is shown diagrammatically in figure 8. Each process in the pipeline that receives advice may take the advice itself, retrain, and emend its output, or, upon examining the advice, may decide to pass the advice further upstream for other components to consider. In this particular example, the parser may be able to consider lower-ranked parse trees in the k-best set, compute their TDLs, and determine whether more usable output could be provided to the SIE. If a preferable TDL is found, the parser can then update its own scoring metric to better reflect the preference of the SIE. Here, the proper trees were available in the k-best set, so the correct output could be provided and the adjustments could be made. We are enthusiastic that this approach will improve the robustness not only of the MUTTS system, but of other pipelined machine learning systems with supervised and semi-supervised learning components.

V. CONCLUSIONS

We have presented MUTTS: a web application that performs automated text-to-sketch for tracking problems. The tool combines state-of-the-art natural language processing algorithms with an adaptation of a mobile robot localization algorithm, the particle filter, to manage the many sources of uncertainty in tracking from textual descriptions. We have demonstrated how off-the-shelf syntactic processing, coupled with a special-purpose semantic ontology and template-matching system, can generate sensing and movement events that correspond to sensing and acting in mobile robot navigation and localization. The system also leverages vast amounts of existing spatial knowledge in the form of Google Maps, Local Search, and Directions, as well as TIGER/Line® road data, to bridge the gap between textual observations and geolocation and tracking.

By presenting a use case, we have illustrated the utility of MUTTS as an analyst's assistant.
It provides the ability to iteratively reduce uncertainty about the sketch by adding observations and providing mixed-initiative constraints, and it provides visualizations and scoring metrics for assessing the likelihood of individual sketches. We finished by outlining directions for future development and areas in which progress can be made on the intelligence aspects of the tool.
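As a rough illustration of the particle-filter adaptation summarized above, the sketch below shows how a textual movement event can drive a particle update and how a sensing event can reweight and resample the population. This is a minimal sketch under our own assumptions; the function names, noise model, and weighting function are invented for illustration, not the MUTTS implementation.

```python
import random

random.seed(0)  # fixed seed for reproducibility of the sketch

def move(particles, dx, dy, noise=0.1):
    """Apply a movement event (e.g. 'walked north') with Gaussian noise."""
    return [(x + dx + random.gauss(0, noise),
             y + dy + random.gauss(0, noise)) for x, y in particles]

def resample(particles, weight):
    """Reweight by a sensing event and resample (an SIR-style step)."""
    weights = [weight(p) for p in particles]
    return random.choices(particles, weights=weights, k=len(particles))

# 500 candidate (x, y) locations, uniformly uncertain to start.
particles = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(500)]

# "He walked north" shifts every particle; "near the fountain" (assumed to
# be at (5, 5)) concentrates probability mass around that landmark.
particles = move(particles, dx=0.0, dy=1.0)
near_landmark = lambda p: 1.0 / (1.0 + (p[0] - 5) ** 2 + (p[1] - 5) ** 2)
particles = resample(particles, near_landmark)
```

Each textual observation plays the role a sensor reading or odometry step plays in mobile robot localization, which is what lets the geographic constraints in the text incrementally shrink the uncertainty over the subject's location.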

The MUTTS tool is currently being alpha tested by the intelligence community, and we are enthusiastic about its potential as both an analyst's tool and a platform for machine learning and natural language research.

ACKNOWLEDGMENT

This project was supported by a grant from the Intelligence Community Postdoctoral Research Fellowship Program through funding from the Office of the Director of National Intelligence.

REFERENCES
[1] I. Sledge and J. Keller, "Mapping natural language to imagery: Placing objects intelligently," in IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), Aug. 2009, pp. 518–523.
[2] T. S. Levitt and D. T. Lawton, "Qualitative navigation for mobile robots," Artificial Intelligence, vol. 44, no. 3, pp. 305–360, 1990. [Online]. Available: http://www.sciencedirect.com/science/article/pii/000437029090027W
[3] B. Tversky and P. U. Lee, "Pictorial and verbal tools for conveying routes," in Spatial Information Theory: Cognitive and Computational Foundations of Geographic Information Science, C. Freksa and D. Mark, Eds. Springer, 1999, pp. 51–64.
[4] N. Metropolis and S. Ulam, "The Monte Carlo method," Journal of the American Statistical Association, vol. 44, no. 247, pp. 335–341, September 1949.
[5] D. Fox, S. Thrun, F. Dellaert, and W. Burgard, "Particle filters for mobile robot localization," in Sequential Monte Carlo Methods in Practice, A. Doucet, N. de Freitas, and N. Gordon, Eds. New York: Springer Verlag, 2000.
[6] D. B. Rubin, "Using the SIR algorithm to simulate posterior distributions," in Bayesian Statistics 3: Proceedings of the Third Valencia International Meeting, J. Bernardo, M. Degroot, D. Lindley, and A. Smith, Eds. Oxford: Oxford University Press, 1987, pp. 385–402.
[7] D. Klein and C. D. Manning, "Accurate unlexicalized parsing," in Proceedings of the 41st Meeting of the Association for Computational Linguistics, 2003, pp. 423–430.
[8] M.-C. de Marneffe, B. MacCartney, and C. D. Manning, "Generating typed dependency parses from phrase structure parses," in LREC 2006, 2006.
