BCA - 601
ARTIFICIAL INTELLIGENCE
This SIM has been prepared exclusively under the guidance of Punjab Technical
University (PTU), reviewed by experts and approved by the concerned statutory
Board of Studies (BOS). It conforms to the syllabi and contents as approved by the
BOS of PTU.
Information contained in this book has been published by VIKAS Publishing House Pvt. Ltd. and has
been obtained by its Authors from sources believed to be reliable and correct to the best of their
knowledge. However, the Publisher and its Authors shall in no event be liable for any errors, omissions
or damages arising out of use of this information and specifically disclaim any implied warranties of
merchantability or fitness for any particular use.
SECTION I
Introduction to AI: Definitions, AI problems, the underlying assumption and AI techniques, level of the model, criteria for success. (Unit 1: What is Artificial Intelligence?, Pages 3-30)
Problems, Problem Space and Search: Defining the problem as a state space search, production system, problem characteristics, production system characteristics, issues in the design of search programs. (Unit 2: Problems, Problem Spaces and Search, Pages 31-62)
SECTION II
Knowledge Representation Issues: Representation and mapping, approaches to knowledge representation, issues in knowledge representation, the frame problem. (Unit 3: Knowledge Representation Issues, Pages 63-83)
Knowledge Representation Using Predicate Logic: Representing simple facts in logic, representing instance and isa relationships, resolution. (Unit 4: Using Predicate Logic, Pages 85-112)
SECTION III
Weak Slot-and-Filler Structures: Semantic nets, frames as sets and instances. (Unit 5: Weak Slot and Filler Structures, Pages 113-132)
Strong Slot-and-Filler Structures: Conceptual dependency, scripts, CYC. (Unit 6: Strong Slot and Filler Structures, Pages 133-153)
Natural Language Processing: Syntactic processing, semantic analysis, discourse and pragmatic processing. (Unit 7: Natural Language Processing, Pages 155-190)
UNIT 1 WHAT IS ARTIFICIAL INTELLIGENCE?
Structure
1.0 Introduction
1.1 Unit Objectives
1.2 Basics of Artificial Intelligence (AI)
1.2.1 AI Influence on Communication Systems
1.2.2 Timeline of Telecommunication-related Technology
1.2.3 Timeline of AI-related Technology
1.2.4 Timeline of AI Events
1.2.5 Modern Digital Communications
1.2.6 Building Intelligent Communication Systems
1.3 Beginning of AI
1.3.1 Definitions
1.3.2 AI Vocabulary
1.4 Problems of AI
1.5 Branches of AI
1.6 Applications of AI
1.6.1 Perception
1.6.2 Robotics
1.6.3 Natural Language Processing
1.6.4 Planning
1.6.5 Expert Systems
1.6.6 Machine Learning
1.6.7 Theorem Proving
1.6.8 Symbolic Mathematics
1.6.9 Game Playing
1.7 Underlying Assumption of AI
1.7.1 Physical Symbol System Hypothesis
1.8 AI Technique
1.8.1 Tic-Tac-Toe
1.8.2 Question Answering
1.9 Level of the AI Model
1.10 Criteria for Success
1.11 Conclusion
1.12 Summary
1.13 Key Terms
1.14 Answers to Check Your Progress
1.15 Questions and Exercises
1.16 Further Reading
1.0 INTRODUCTION
In this unit, you will learn about the basics of artificial intelligence (AI), a key
technology today, used in many applications. AI is also known as Computational
Intelligence.
With the development of the digital electronic computer in 1941, the
technology to create machine intelligence was finally available. The term 'Artificial
Intelligence' was first used in 1956 at the Dartmouth conference, and since then AI
has expanded with theories and principles developed by dedicated researchers.
Though advances in the field of AI have been slower than first estimated, progress
continues to be made. During the last four decades a variety of AI programs have
been developed in sync with other technological advancements.
In 1941, the development of the digital computer revolutionized every aspect
of storage and processing of information in the US and Germany. The early computers
required large, separate air-conditioned rooms and were a programmer's nightmare:
getting a program running involved separate configurations of thousands of wires.
The 1949 innovation, the stored-program computer, made the job of
entering a program easier. Further advancements in computer theory led to
developments in computer science and eventually to AI. With the invention of an
electronic means of processing data came a medium that made AI possible.
Humans communicate with intelligent systems for solving problems requiring higher mental processes.
Humans communicate with experts to seek specialized opinion.
Humans communicate with logic machines to seek guidance.
Humans communicate with reasoning systems to get new knowledge.
Humans communicate for many more things.
Advances in communication technologies have led to increased worldwide
connectivity, while new technologies like the cell phone have increased mobility.
1.2.2 Timeline of Telecommunication-related Technology
Roots of Communication: The development of communication systems began
two centuries ago with wire-based electrical systems called the telegraph and the
telephone. Before that, messages were carried by human messengers on foot or
horseback; Egypt and China built messenger relay stations.
Electric Telegraph was invented by Samuel Morse in 1831.
Morse code was invented by Samuel Morse in 1835, a method for transmitting
telegraphic information.
Typewriter was invented by Christopher Latham Sholes in 1867, the first
practical typewriting machine for business offices.
Telephone was invented by Alexander Graham Bell in 1875, an instrument
through which speech sounds, not voice, were first transmitted electrically.
Telephone Exchange, a rotary dialling system, became operational in New
Haven, Connecticut in 1878.
Kodak Camera was invented in 1888 by George Eastman, who also introduced
Rolled Photographic Film.
Telegraphone was invented by Valdemar Poulsen in 1899; it was a tape
recorder whose recording medium was a magnetized steel tape.
Wireless telegraphy was invented by Guglielmo Marconi in 1902; it
transmitted MF band radio signals across the Atlantic Ocean, from Cornwall
to Newfoundland.
Audion vacuum tube was invented by Lee De Forest in 1906, a two-electrode
detector device and later, in 1908, a three-electrode amplifier device.
Cross-continental telephone call was made by Alexander Graham Bell in 1914,
marking the evolution of the telegraph into the telephone; the Greek word tele
means 'from afar', and phone means 'voice'.
Radios with tuners came in 1916, a technological revolution of their time; pilots
directed gunfire of artillery batteries on the ground through a wireless operator
attached to each battery.
Iconoscope was invented by Vladimir Zworykin in 1923, a tube for television
camera needed for TV transmission.
Television system was invented by John Logie Baird in 1925; it transmitted TV
signals in 1927 between London and Glasgow over a telephone line.
Radio networks came in 1927, a system which distributed programming
(contents) to multiple stations simultaneously, for the purpose of extending
total coverage beyond the limits of a single broadcast station.
Radio broadcasting is traditionally through the air as radio waves; it can also be
via cable, FM, local wire networks, satellite and the Internet.
Stations are linked in radio networks, either in syndication or simulcast or both.
1.2.3 Timeline of AI-related Technology
Roots of AI: The development of artificial intelligence actually began centuries
ago, long before the computer.
Roman Abacus (5000 years ago): machine with memory.
Pascaline (1642): a calculating machine that mechanized arithmetic.
Difference Engine (1849): a mechanical calculating machine programmable to
tabulate polynomial functions.
Boolean algebra (1854): An Investigation of the Laws of Thought; a symbolic
language of calculus.
Turing machine (1936): an abstract symbol-manipulating device that could be
adapted to simulate any computational logic; the first computer, invented on paper only.
Von Neumann architecture (1945): a computer design model with a processing
unit and a shared memory structure holding both instructions and data.
ENIAC (1946): the Electronic Numerical Integrator and Calculator, the first
electronic general-purpose digital computer, by Eckert and Mauchly.
1.2.4 Timeline of AI Events
The concept of AI as a true scientific pursuit is very new; for centuries it remained
a plot for popular science fiction stories. Most researchers place the beginning of
AI with Alan Turing.
Turing test, by Alan Turing in 1950 in the paper 'Computing Machinery and
Intelligence', is used to measure machine intelligence.
Intelligent behaviour: Norbert Wiener in 1950 observed the link between human
intelligence and machines, and theorized about intelligent behaviour.
Logic Theorist, a program by Allen Newell and Herbert Simon in 1955, supported
the claim that machines can contain minds just as human bodies do; it proved 38
of the first 52 theorems in Principia Mathematica.
Birth of AI: the Dartmouth Summer Research Conference on Artificial Intelligence
in 1956, organized by John McCarthy, who is regarded as the father of AI.
Seven years later, in 1963, AI began to pick up momentum; the field was
still undefined, and the ideas formed at the conference were re-examined.
In 1957, the General Problem Solver (GPS) was tested. GPS was an extension of
Wiener's feedback principle and was capable of solving, to a great extent, common
sense problems.
In 1958, the LISP language was invented by McCarthy and was soon adopted as
the language of choice among most AI developers.
In 1963, the DoD's Advanced Research Projects Agency started research at MIT
on Machine-Aided Cognition (artificial intelligence), drawing computer
scientists from around the world.
In 1968, the micro-world program SHRDLU, at MIT, controlled a robot arm
operating above a flat surface scattered with play blocks.
In the mid-1970s, expert systems for medical diagnosis (Mycin), chemical data
analysis (Dendral) and mineral exploration (Prospector) were developed.
During the 1970s, computer vision (CV), the technology of machines that see,
emerged. David Marr was the first to model the functions of the visual system.
In 1972, Prolog, by Alain Colmerauer, introduced logic programming: the use of
logic as both a declarative and a procedural representation language.
1.2.5 Modern Digital Communications
In 1947, Shannon created a mathematical theory, which formed the basis for modern
digital communications. Since then the developments were:
1960s: three geosynchronous communication satellites were launched by NASA.
1961: packet switching theory was published by Leonard Kleinrock at MIT.
1965: a wide-area computer network over a low-speed dial-up telephone line was
created by Lawrence Roberts and Thomas Merrill.
1966: Optical fibre was used for transmission of telephone signals.
Late 1966: Roberts went to DARPA to develop the computer network concept
and put forward his plan for the Advanced Research Projects Agency Network
(ARPANET), which he presented at a conference in 1967. There, Paul Baran
and others at RAND presented a paper on packet switching networks for
secure voice communication in military use.
1968: Roberts and DARPA revised the overall structure and specifications
for the ARPANET and released an RFQ for development of a key component: the
packet switches called Interface Message Processors (IMPs).
Architectural design by Bob Kahn.
Network topology and economics design and optimization by Roberts with
Howard Frank and his team at Network Analysis Corporation.
Data networking technology and network measurement system preparation by
Kleinrock's team at the University of California, Los Angeles (UCLA).
The year 1969 saw the beginning of the Internet era with the development of
ARPANET, an unprecedented integration of the capabilities of telegraph,
telephone, radio, television, satellite, optical fibre and computer.
In 1969, a day after Labour Day, UCLA became the first node to join the
ARPANET: the UCLA team connected the first switch (IMP) to the first host
computer (a minicomputer from Honeywell).
A month later the second node was added at SRI (Stanford Research Institute)
and the first Host-to-Host message on the Internet was launched from UCLA.
It worked in a clever way, as described below:
Programmers logging on to the SRI host from the UCLA host typed in 'log'
and the system at SRI added 'in', thus creating the word 'login'.
Programmers could communicate by voice as the message was transmitted,
using telephone headsets at both ends.
Programmers at the UCLA end typed in the 'l' and asked SRI if they received
it; back came the voice reply, 'got the l'.
By 1969, they had connected four nodes (UCLA, SRI, UC Santa Barbara
and the University of Utah). UCLA served for many years as the ARPANET
Measurement Centre.
In the mid-1970s, UCLA controlled a geosynchronous satellite by sending
messages through ARPANET from California to an East Coast satellite dish. By
1970, they had connected ten nodes. In 1972, the International Network Working
Group (INWG) was formed to further explore packet switching concepts and
internetworking, as there would be multiple independent networks of arbitrary
design.
In 1973, Kahn developed a protocol that could meet the needs of an open
architecture network environment. This protocol is popularly known as TCP/IP
(Transmission Control Protocol/Internet Protocol).
Note: TCP/IP is named after two of the most important protocols in it.
IP is responsible for moving packet of data from node to node.
TCP is responsible for verifying delivery of data from client to server.
Sockets are subroutines that provide access to TCP/IP on most systems.
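The division of labour in the note above can be sketched in code. The following Python fragment (Python and all names in it are our choice; the standard socket module is one common interface to TCP/IP) opens a TCP connection on the local machine and echoes a message back; TCP gives the two endpoints a reliable byte stream, while IP underneath moves the individual packets.

```python
import socket
import threading

# Create a listening TCP socket; the OS picks a free port for us.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]

def serve_once():
    # Accept one connection and echo whatever arrives back to the sender.
    conn, _ = srv.accept()
    data = conn.recv(1024)
    conn.sendall(data)
    conn.close()

t = threading.Thread(target=serve_once)
t.start()

# Client side: TCP verifies delivery of the bytes between client and server,
# while IP is responsible for moving each packet from node to node.
with socket.create_connection(("127.0.0.1", port)) as cli:
    cli.sendall(b"hello ARPANET")
    reply = cli.recv(1024)

t.join()
srv.close()
```

Both endpoints here live in one process for illustration only; the same calls work unchanged between machines.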
In 1976, X.25 protocol was developed for public packet networking.
In 1977, the first internetwork was demonstrated by Cerf and Kahn. They
connected three networks with TCP: the Radio network, the Satellite network
(SATNET) and the Advanced Research Projects Agency Network
(ARPANET).
In the 1980s, ARPANET evolved into the Internet, defined officially as the
set of networks using TCP/IP.
On 1 January 1983, the ARPANET and every other network attached to it
officially adopted the TCP/IP networking protocol. From then on, all networks
that use TCP/IP have been collectively known as the Internet. The
standardization of TCP/IP allowed the number of Internet sites and users to
grow exponentially.
Today, the Internet has millions of computers and hundreds of thousands of
networks. Network traffic is dominated by its ability to promote people-to-people
interaction.
1.2.6 Building Intelligent Communication Systems
Modern telecommunications are based on coordination and utilization of individual
services, such as telephony, cellular, cable, microwave terrestrial and satellite, and
their integration into a seamless global network. Successful design, planning,
coordination, management and financing of a global communications network require
a broad understanding of these services, their costs, and their advantages and
limitations. The next generation of wireless and wired communication systems and
networks has requirements for many AI and AI-related concepts, algorithms,
techniques and technologies. Research groups are currently exploring and using
Semantic Web languages in monitoring, learning, adaptation and reasoning.
To present an idea of intelligent communication systems, examples of an
intelligent mobile platform (Magitti), voice recognition across the mobile phone
(Lingo) and the project Knowledge Based Networking by DARPA are briefly illustrated.
Ex. 1 Intelligent Mobile Platform: Magitti
The mobile phone is no longer a simple two-way communication device. Intelligent
mobiles infer our behaviour and suggest appropriate lists of restaurants, stores and
events.
What differentiates Magitti from other GPS-enabled mobile applications
is artificial intelligence: Magitti is an intelligent personal assistant.
Magitti's specification has not been released, but mobile phones are becoming
increasingly powerful, with sensors, entertainment tools, accelerometers, GPS, etc.
Perhaps AI on the phone will make even more sense in the not-so-distant future.
Ex. 2 Voice Recognition across the Mobile Phone: Lingo
Mobile phones can do lots of things, but a majority of people never use them for
more than calls and short text messages. A voice recognition-correction interface
across mobile phone applications is coming to market to provide speech recognition.
Now you can talk to your phone and the phone understands you.
Ex. 3 Project Knowledge Based Networking by DARPA
Military research aims to develop self-configuring, secure wireless nets. Academic
concepts of Artificial Intelligence and Semantic Web, combined with technologies
such as the Mobile Ad-hoc Network (MANET), Cognitive Radio and Peer-to-Peer
networking provide the elements of such a network. This project by DARPA is
intended for soldiers in the field.
Ex. 4 Semantic Web
Web 1.0 is the first generation of the World Wide Web and contains static HTML pages.
Web 2.0 is the second generation of the World Wide Web, focused on the ability for
people to collaborate and share information online. It is dynamic, serves applications
to users and offers open communications. The Web requires a human operator, using
computer systems to perform the tasks required to find, search and aggregate its
information. It is impossible for a computer to do these tasks without human guidance
because Web pages are specifically designed for human readers.
Ex. 5 Mobile Ad-hoc Network (MANET)
Traditional wireless mobile networks have been infrastructure based: mobile
devices communicate with access points, such as base stations, connected to the
fixed network. Typical examples of this kind of wireless network are GSM, WLL,
WLAN, etc. Approaches to the next generation of wireless mobile networks are
infrastructure-less, and MANET is one such network. A MANET is a collection of
wireless nodes that can dynamically form a network to exchange information without
using any existing fixed network infrastructure. This is very important because in
many contexts information exchange between mobile units cannot rely on any fixed
network infrastructure, but on rapid configuration of wireless connections on the
fly. The MANET is a self-configuring network of mobile routers and associated
hosts connected by wireless links, the union of which forms an arbitrary topology
that may change rapidly and unpredictably.
1.3 BEGINNING OF AI
Although the computer provided the technology necessary for AI, it was not until
the early 1950s that the link between human intelligence and machines was really
observed. Norbert Wiener was one of the first Americans to make observations on
the principle of feedback theory. The most familiar example of feedback theory is
the thermostat: It controls the temperature of an environment by gathering the actual
temperature of the house, comparing it to the desired temperature, and responding
by turning the heat up or down. What was so important about his research into
feedback loops was that Wiener theorized that all intelligent behaviour was the
result of feedback mechanisms, mechanisms that could possibly be simulated by
machines. This discovery influenced much of the early development of AI.
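The thermostat loop described above can be sketched as code. The following Python fragment is a minimal simulation, not a real controller; the half-degree deadband and the one-degree-per-hour room model are our own simplifying assumptions.

```python
def thermostat_step(current, desired, heater_on):
    """One feedback cycle: compare the measured temperature with the goal
    and respond by switching the heat on or off (a small deadband keeps
    the heater from chattering around the set point)."""
    if current < desired - 0.5:
        return True          # too cold: turn the heat up
    if current > desired + 0.5:
        return False         # too warm: turn the heat down
    return heater_on         # inside the deadband: leave it alone

def simulate(hours, start, desired):
    """Run the feedback loop against a crude room model: the room gains
    one degree per hour when heated and loses one otherwise."""
    temp, heater = start, False
    history = []
    for _ in range(hours):
        heater = thermostat_step(temp, desired, heater)
        temp += 1.0 if heater else -1.0
        history.append(round(temp, 1))
    return history

trace = simulate(12, start=15.0, desired=20.0)
```

The trace climbs toward the set point and then oscillates in a narrow band around it, which is exactly the self-correcting behaviour Wiener identified with feedback.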
In late 1955, Newell and Simon developed the Logic Theorist, considered
by many to be the first AI program. The program, representing each problem as a
tree model, would attempt to solve it by selecting the branch most likely to lead
to the correct conclusion. The impact the Logic Theorist made on both the
public and the field of AI has made it a crucial stepping stone in developing the AI
field.
In 1956 John McCarthy, regarded as the father of AI, organized a conference to draw on
the talent and expertise of others interested in machine intelligence for a month of
brainstorming. He invited them to New Hampshire for 'The Dartmouth
Summer Research Project on Artificial Intelligence'. From that point, because of
McCarthy, the field would be known as artificial intelligence. Although not a huge
success, the Dartmouth conference did bring together the founders of AI and served
to lay the basis for the future of AI research.
The conference at Dartmouth, New Hampshire, discussed the
possibilities of simulating human intelligence and thinking in computers. Today
artificial intelligence is a well-established and expanding branch of computer
science. It is the science and engineering of making intelligent machines, especially
intelligent computer programs. It is related to the similar task of using computers to
understand human intelligence: understanding language, learning, reasoning and
solving problems. But AI does not have to confine itself to methods that are
biologically observable.
We will discuss alternative definitions of Artificial Intelligence below.
1.3.1 Definitions
Artificial Intelligence is the study of how to make computers do things which, at the
moment, people do better. This definition is ephemeral, as it refers to the current state
of computer science, and it excludes major problem areas that cannot be solved well
either by computers or by people at the moment.
Alternative - I
Artificial Intelligence is the branch of computer science that is concerned with the
automation of intellectual performance. AI is based upon the principles of computer
science, namely data structures used in knowledge representation, the algorithms
needed to apply that knowledge and the languages and programming techniques
used in their implementation.
Alternative - II
Artificial Intelligence is a field of study that encompasses computational techniques
for performing tasks that seem to require intelligence when performed by humans.
Alternative - III
Artificial Intelligence is the field of study that seeks to explain and emulate intellectual
performance in terms of computational processes.
Alternative - IV
Artificial Intelligence is about generating representations and procedures that
automatically or autonomously solve problems heretofore solved by humans.
Alternative - V
Artificial Intelligence is that part of computer science concerned with designing
intelligent computer systems, i.e. computer systems that exhibit the characteristics
we associate with intelligence in human behaviour: understanding language, learning,
reasoning and solving problems.
There are various definitions of Artificial Intelligence according to different
authors. Most of these definitions take a very technical direction and avoid the
philosophical problems connected with the idea that AI's purpose is to construct an
artificial human. These definitions fall into four categories:
1. Systems that think like humans (focus on reasoning and human framework)
2. Systems that think rationally (focus on reasoning and a general concept of
intelligence)
3. Systems that act like humans (focus on behaviour and human framework)
4. Systems that act rationally (focus on behaviour and a general concept of
intelligence)
What is rationality? Simply speaking, doing the right thing.
Definitions:
1. The art of creating machines that perform functions that require intelligence
when performed by humans (Kurzweil). This involves cognitive modelling: we
have to determine how humans think in a literal sense.
2. GPS, a General Problem Solver (Newell and Simon), deals with 'right
thinking' and dives into the field of logic. It uses logic to represent the world
and relationships between objects in it, and to come to conclusions about it.
Problems: it is hard to encode informal knowledge into a formal logic system, and
theorem provers have limitations (if there is no solution to a given logical
notation).
3. Turing defined intelligent behaviour as the ability to achieve human-level
performance in all cognitive tasks, sufficient to fool a human interrogator (the Turing
Test). Physical contact with the machine is avoided, because physical
appearance is not relevant to exhibiting intelligence. However, the Total Turing
Test includes appearance by encompassing visual input and robotics as well.
4. The rational agent approach: achieving one's goals given one's beliefs. Instead of
focusing on humans, this approach is more general, focusing on agents (which
perceive and act); it is also more general than the strict logical approach (i.e. thinking
rationally).
In practical ways, the Intelligence of a Computer System provides:
Ability to automatically perform tasks that currently require human operators
More autonomy in computer systems; fewer requirements for human
interference or monitoring
Flexibility in dealing with inconsistency in the environment
Ability to understand what the user wants from limited instructions
Increase in performance by learning from experience
1.3.2 AI Vocabulary
Intelligence relates to tasks involving higher mental processes, e.g. creativity, solving
problems, pattern recognition, classification, learning, induction, deduction, building
analogies, optimization, language processing, knowledge and many more.
Intelligence is the computational part of the ability to achieve goals.
Intelligent behaviour is depicted by perceiving one's environment, acting in
complex environments, learning and understanding from experience, reasoning to
solve problems and discover hidden knowledge, applying knowledge successfully
in new situations, thinking abstractly, using analogies, communicating with others,
and more.
Science based goals of AI pertain to developing concepts, mechanisms and
understanding biological intelligent behaviour. The emphasis is on understanding
intelligent behaviour.
Engineering based goals of AI relate to developing concepts, theory and practice
of building intelligent machines. The emphasis is on system building.
AI Techniques depict how we represent, manipulate and reason with knowledge in
order to solve problems. Knowledge is a collection of facts. To manipulate these
facts by a program, a suitable representation is required. A good representation
facilitates problem solving.
Learning means that programs can learn only those facts or behaviours that their
formalisms can represent. Learning denotes changes in the system that are adaptive;
in other words, it enables the system to do the same task(s) more efficiently next time.
Applications of AI refer to problem solving, search and control strategies, speech
recognition, natural language understanding, computer vision, expert systems, etc.
1.4 PROBLEMS OF AI
Intelligence does not imply perfect understanding; every intelligent being has limited
perception, memory and computation. Many points on the spectrum of intelligence
versus cost are viable, from insects to humans. AI seeks to understand the
computations required for intelligent behaviour and to produce computer systems
that exhibit intelligence. Aspects of intelligence studied by AI include perception,
communication using human languages, reasoning, planning, learning and memory.
Let us consider some of the problems that Artificial Intelligence is used to
solve. Early examples are game playing and theorem proving, which involves
resolution. Common sense reasoning formed the basis of GPS, a general problem
solver. Natural language processing met with early success; then the limited power of
computers hindered progress, but currently this topic is experiencing a revival. The
question of expert systems is interesting because it represents one of the best examples
of an application of AI that appears useful to non-AI people. An expert
system solves particular subsets of problems using knowledge and rules about a
particular topic.
The following questions are to be considered before we can step forward:
1. What are the underlying assumptions about intelligence?
2. What kinds of techniques will be useful for solving AI problems?
3. At what level can human intelligence be modelled?
4. How will it be known when an intelligent program has been built?
Check Your Progress
1. Who created the mathematical theory that formed the basis for modern digital
communications?
2. Web 2.0 is the second generation of the World Wide Web, focused on the ability
for people to collaborate and share information online. (True or False)
3. A MANET is a collection of wireless nodes that can dynamically form a network
to exchange information without using any existing fixed network infrastructure.
(True or False)
1.5 BRANCHES OF AI
A list of branches of AI is given below. However, some branches are surely missing,
because no one has identified them yet. Some of these may be regarded as concepts
or topics rather than full branches.
Logical AI
In this approach, what the program knows about the world in general, the facts of the
specific situation in which it must act, and its goals are all represented by sentences
of some mathematical logical language. The program decides what to do by inferring
that certain actions are appropriate for achieving its goals.
Search
Artificial Intelligence programs often examine large numbers of possibilities, for
example, moves in a chess game or inferences by a theorem-proving program.
Discoveries are frequently made about how to do this more efficiently in various
domains.
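The systematic examination of possibilities can be sketched with breadth-first search. In the following Python fragment, the 'moves' (add one or double the number) are a toy domain of our own choosing, standing in for chess moves or inference steps.

```python
from collections import deque

def breadth_first_search(start, goal, neighbours):
    """Examine possibilities level by level, remembering visited states
    so that no possibility is explored twice; returns a shortest path."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbours(path[-1]):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None  # the goal is not reachable

# A toy move generator: from each number we may add 1 or double it.
def moves(n):
    return [n + 1, n * 2] if n <= 20 else []

path = breadth_first_search(1, 11, moves)
```

Even in this tiny domain the searcher silently examines many states it never uses, which is why the efficiency discoveries mentioned above matter so much in large domains like chess.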
Pattern Recognition
When a program makes observations of some kind, it is often planned to compare
what it sees with a pattern. For example, a vision program may try to match a
pattern of eyes and a nose in a scene in order to find a face. More complex patterns,
such as those in a natural language text, a chess position or the history of some event,
require quite different methods than the simple patterns that have been studied
the most.
Representation
Usually languages of mathematical logic are used to represent the facts about the
world.
Inference
From some facts, others can be inferred. Mathematical logical deduction is sufficient
for some purposes, but new methods of non-monotonic inference have been added
to the logic since the 1970s. The simplest kind of non-monotonic reasoning is default
reasoning, in which a conclusion is inferred by default, but the conclusion can
be withdrawn if there is evidence to the contrary. For example, when we hear of a
bird, we infer that it can fly, but this conclusion can be reversed when we hear that
it is a penguin. It is the possibility that a conclusion may have to be withdrawn that
constitutes the non-monotonic character of the reasoning. Normal logical reasoning
is monotonic, in that the set of conclusions that can be drawn is a monotonically
increasing function of the set of premises. Circumscription is another form of
non-monotonic reasoning.
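The bird-and-penguin example can be sketched directly. In this minimal Python fragment (the fact format is our own invention), adding a new fact withdraws a conclusion that was previously drawn, which is exactly what monotonic logic forbids.

```python
def can_fly(bird, known_facts):
    """Default reasoning: conclude that a bird flies unless there is
    evidence to the contrary, in which case the default is defeated."""
    if ("penguin", bird) in known_facts:
        return False          # contrary evidence defeats the default
    return True               # default conclusion: birds fly

facts = set()
before = can_fly("tweety", facts)   # no contrary evidence yet: flies

facts.add(("penguin", "tweety"))    # a new premise arrives
after = can_fly("tweety", facts)    # the earlier conclusion is withdrawn
```

Note that enlarging the set of premises shrank the set of conclusions: the reasoning is non-monotonic.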
Common sense knowledge and Reasoning
This is the area in which AI is farthest from the human level, in spite of the fact that
it has been an active research area since the 1950s. While there has been considerable
progress in developing systems of non-monotonic reasoning and theories of action,
yet more new ideas are needed.
Learning from experience
There are some rules expressed in logic for learning. Programs can only learn what
facts or behaviour their formalisms can represent, and unfortunately learning systems
are almost all based on very limited abilities to represent information.
Planning
Planning starts with general facts about the world (especially facts about the effects
of actions), facts about the particular situation and a statement of a goal. From
these, planning programs generate a strategy for achieving the goal. In the most
common cases, the strategy is just a sequence of actions.
Epistemology
This is a study of the kinds of knowledge that are required for solving problems in
the world.
Ontology
Ontology is the study of the kinds of things that exist. In AI the programs and
sentences deal with various kinds of objects, and we study what these kinds are and
what their basic properties are. Ontology has assumed importance in AI since the 1990s.
Heuristics
A heuristic is a way of trying to discover something, or an idea embedded in a
program. The term is used variously in AI. Heuristic functions are used in some
approaches to search to measure how far a node in a search tree seems to be from
a goal. Heuristic predicates, which compare two nodes in a search tree to see if one is
better than the other (i.e., constitutes an advance toward the goal), may be more
useful.
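As a concrete sketch, here is a common heuristic function (Manhattan distance for the 8-puzzle, which appears later in this material) together with a heuristic predicate built on it. The names and the particular goal layout are illustrative assumptions, not from the text:

```python
# Heuristic function: estimate how far an 8-puzzle state seems to be
# from the goal. States are 9-tuples read row by row; 0 is the blank.

GOAL = (1, 2, 3, 8, 0, 4, 7, 6, 5)

def manhattan(state, goal=GOAL):
    """Sum of horizontal and vertical displacements of every tile."""
    dist = 0
    for tile in range(1, 9):                  # the blank is not counted
        i, j = state.index(tile), goal.index(tile)
        dist += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return dist

def better(node_a, node_b):
    """Heuristic predicate: does node_a look closer to the goal?"""
    return manhattan(node_a) < manhattan(node_b)

print(manhattan(GOAL))  # 0: the goal is at zero estimated distance
```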
Genetic programming
Genetic programming is an automated method for creating a working computer
program from a high-level statement of a problem: starting from a description of
what needs to be done, it automatically creates a program to solve the problem.
It is being developed by John Koza's group.
1.6 APPLICATIONS OF AI
AI has applications in all fields of human study, such as finance and economics,
environmental engineering, chemistry, computer science, and so on. Some of the
applications of AI are listed below:
Perception
Machine vision
Speech understanding
Touch ( tactile or haptic) sensation
Robotics
Natural Language Processing
Natural Language Understanding
Speech Understanding
Language Generation
Machine Translation
Planning
Expert Systems
Machine Learning
Theorem Proving
Symbolic Mathematics
Game Playing
1.6.1 Perception
Perception is defined as the formation, from a sensory signal, of an internal
representation suitable for intelligent processing. Computer perception is an
example of artificial intelligence; it focuses on vision and speech. Since it involves
incident time-varying continuous energy distributions prior to interpretation in
symbolic terms, perception is seen to be distinct from intelligence.
1.6.1.1 Machine vision
It is easy to interface a TV camera to a computer and get an image into memory; the
problem is understanding what the image represents. Vision takes a great deal of
computation; in humans, roughly 10 per cent of all calories consumed are burned in
vision computation.
1.6.1.2 Speech understanding
Speech understanding is available now. Some systems must be trained for the
individual user and require pauses between words. Understanding continuous speech
with a larger vocabulary is harder.
1.6.1.3 Touch (tactile or haptic) sensation
Touch sensation is important for robot assembly tasks.
1.6.2 Robotics
Although industrial robots have been expensive, robot hardware can be cheap. The
limiting factor in application of robotics is not the cost of the robot hardware itself.
What is needed is perception and intelligence to tell the robot what to do;
blind robots are limited to very well-structured tasks (like spray painting car
bodies).
1.6.3 Natural Language Processing
Natural language processing is explained below.
1.6.3.1 Natural language understanding
Natural languages are human languages such as English. Making computers
understand English allows non-programmers to use them with little training. Getting
computers to generate human or natural languages can be called natural language
generation; it is much easier than natural language understanding. A text written in
one language and then generated in another language by means of computers is
called machine translation. It is important for organizations that operate in many
countries.
1.6.3.2 Natural language generation
Natural language generation is easier than natural language understanding, and it
can serve as an inexpensive output device.
1.6.3.3 Machine translation
Usable translation of text is available now. It is important for organizations that
operate in many countries.
1.6.4 Planning
Planning attempts to find an ordered sequence of actions for achieving goals. Planning
applications include logistics, manufacturing scheduling and planning manufacturing
steps to construct a desired product. There are huge amounts of money to be saved
through better planning.
1.6.5 Expert Systems
Expert Systems attempt to capture the knowledge of a human and present it through
a computer system. There have been many successful and economically valuable
applications of expert systems.
1.6.5.1 Benefits
Reducing the skill level needed to operate complex devices
Diagnostic advice for device repair
Interpretation of complex data
Cloning of scarce expertise
Capturing knowledge of expert who is about to retire
Combining knowledge of multiple experts
Intelligent training
1.6.6 Machine Learning
Machine learning has been a central part of AI research from the beginning. In
machine learning, unsupervised learning is a class of problems in which one seeks
to determine how the data is organized. Unsupervised learning is the ability to find
patterns in a stream of input. Supervised learning includes both classification
(determines what category something belongs in) and regression (given a set of
numerical input/output examples, discovers a continuous function that would
generate the outputs from the inputs).
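The regression case described above can be sketched concretely: given numerical input/output examples, discover a continuous function that would generate the outputs from the inputs. This minimal sketch fits a line by ordinary least squares; the function name and data are illustrative:

```python
# Supervised learning as regression: fit y = a*x + b by least squares.

def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # slope = covariance / variance of the inputs
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx                   # slope and intercept

# The examples were generated by y = 2x + 1; the learner recovers it.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(round(a, 6), round(b, 6))  # 2.0 1.0
```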
1.8 AI TECHNIQUE
Artificial Intelligence research during the last three decades has concluded that
intelligence requires knowledge. To compensate for this overwhelming quality,
knowledge possesses some less desirable properties:
A. It is huge.
B. It is difficult to characterize correctly.
C. It is constantly varying.
D. It differs from data by being organized in a way that corresponds to its
application.
E. It is complicated.
An AI technique is a method that exploits knowledge that is represented so that:
The knowledge captures generalizations: situations that share properties are
grouped together, rather than being given separate representations.
It can be understood by the people who must provide it, even though for many
programs the bulk of the data comes automatically from readings. In many AI
domains, the people who understand the domain must supply the knowledge to
the program.
It can be easily modified to correct errors and to reflect changes in real conditions.
It can be widely used even if it is incomplete or inaccurate.
It can be used to help overcome its own sheer bulk by helping to narrow the
range of possibilities that must usually be considered.
In order to characterize an AI technique let us consider initially OXO or tic-
tac-toe and use a series of different approaches to play the game.
The programs increase in complexity, in their use of generalizations, in the clarity
of their knowledge and in the extensibility of their approach. In this way they move
towards being representative of AI techniques.
1.8.1 Tic-Tac-Toe
1 2 3
4 5 6
7 8 9
The board is a nine-element vector in which an element contains the value 0 for
blank, 1 for X and 2 for O. A MOVETABLE vector of 19,683 elements (3^9) is needed,
where each element is itself a nine-element vector. The contents of the vector are
specially chosen to help the algorithm.
The algorithm makes moves by pursuing the following:
1. View the vector as a ternary number. Convert it to a decimal number.
2. Use the decimal number as an index into MOVETABLE and access the vector
stored there.
3. Set BOARD to this vector, indicating how the board looks after the move.
This approach is efficient in time but it has several disadvantages: it takes more
space, and calculating the 19,683 table entries requires considerable effort. The
method is also specific to this game and cannot be generalized.
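Step 1 of the algorithm above can be sketched directly: treat the nine-element board vector as a ternary number and convert it to a decimal index. The function name is illustrative:

```python
# MOVETABLE indexing sketch: a board cell holds 0 (blank), 1 (X) or
# 2 (O); reading the nine cells as ternary digits gives an index in
# the range 0 .. 3**9 - 1 = 19682.

def board_index(board):
    index = 0
    for cell in board:              # most significant digit first
        index = index * 3 + cell
    return index

print(board_index([0] * 9))   # 0: the empty board
print(board_index([2] * 9))   # 19682: the largest possible index
print(board_index([1, 0, 0, 0, 0, 0, 0, 0, 0]))  # 6561 = 3**8
```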
Event2                        Event2
  instance: finding             instance: liking
  tense: past                   tense: past
  agent: Rani                   modifier: much
  object: Thing1                object: Thing1

Thing1
  instance: coat
  colour: red
The question is stored in two forms: as input and in the above form.
1.8.2.8 Algorithm
Convert the question to a structured form using knowledge of English; then use
a marker to indicate the substring (such as 'who' or 'what') of the structure that
should be returned as an answer. If a slot-and-filler system is used, a special
marker can be placed in more than one slot.
The answer appears by matching this structured form against the structured
text: the requested segments of the question are returned.
1.8.2.9 Examples
Both questions 1 and 2 generate answers, 'a new coat' and 'a red coat' respectively.
Question 3 cannot be answered, because there is no direct response.
1.8.2.10 Comments
This approach is more meaningful than the previous one and so is more effective.
The extra power gained must be paid for by additional search time in the knowledge
bases. A warning must be given here: generating an unambiguous English
knowledge base is a complex task and must be left until later in the course. The
problems of handling pronouns are difficult. For example:
Rani walked up to the salesperson: she asked where the toy department was.
Rani walked up to the salesperson: she asked her if she needed any help.
Whereas in the original text the linkage of 'she' to Rani is easy, linkage of
'she' in each of the above sentences to Rani or to the salesperson requires additional
knowledge about the context, namely the roles of the people in a shop.
Method 3
[Figure: a world model of the shopping transaction. The customer C selects the
products to purchase and contacts the salesperson S to learn the product details
and the payment scheme; the materials are delivered by S after the payment is
cleared. The Admin Dept contains M (Merchandize), D (Money/Dollars) and
L (Store Details) to control the whole shopping transaction.]
1.8.2.12 Algorithm
Convert the question to a structured form using both the knowledge contained in
Method 2 and the world model, generating even more possible structures, since
even more knowledge is being used. Sometimes filters are introduced to prune the
possible answers. To answer a question, the scheme followed is:
Convert the question to a structured form as before, but use the world model
to resolve any ambiguities that may occur. The structured form is matched against
the text and the requested segments of the question are returned.
1.8.2.13 Example
Both questions 1 and 2 generate answers, as in the previous program.
Question 3 can now be answered. The shopping script is instantiated and
from the last sentence the path through step 14 is the one used to form the
representation.
M is bound to the red coat. 'Rani buys a red coat' comes from step 10, and the
integrated text generates the answer that she bought a red coat.
1.8.2.14 Comments
This program is more powerful than both the previous programs because it has
more knowledge. Thus, like the last game program it is exploiting AI techniques.
However, we are not yet in a position to handle any English question. The major
omission is that of a general reasoning mechanism known as inference to be used
when the required answer is not explicitly given in the input text. But this approach
can handle, with some modifications, questions of the following form. Given the text:
Saturday morning Rani went shopping.
Her brother tried to call her but she did not answer.
Question
Why couldn't Rani's brother reach her?
Answer
Because she was not in.
This answer is derived because we have supplied the additional fact that a person
cannot be in two places at once. This patch is not sufficiently general to work in all
cases, and it does not provide the type of solution we are really looking for.
Are we trying to produce programs that do the tasks the same way that people
do? OR
Are we trying to produce programs that simply do the tasks the easiest way
that is possible?
Programs in the first class attempt to solve problems that a computer can
easily solve and do not usually use AI techniques. AI techniques usually include
search (since no direct method is available), the use of knowledge about the objects
involved in the problem area, and abstraction, which allows an element of pruning
to occur and enables a solution to be found in real time; otherwise, the data could
explode in size. An example of the trivial problems in the first class, which are now
of interest only to psychologists, is EPAM (Elementary Perceiver and Memorizer),
which memorized nonsense syllables.
The second class attempts to solve problems that are non-trivial for a computer and
uses AI techniques. We wish to model human performance on these:
1. To test psychological theories of human performance, e.g. PARRY [Colby,
1975], a program to simulate the conversational behaviour of a paranoid
person.
2. To enable computers to understand human reasoning for example, programs
that answer questions based upon newspaper articles indicating human
behaviour.
3. To enable people to understand computer reasoning. Some people are reluctant
to accept computer results unless they understand the mechanisms involved
in arriving at the results.
4. To exploit the knowledge gained by people who are best at gathering
information. This persuaded the earlier workers to simulate human behaviour,
the SB part of AISB (simulated behaviour). Examples of this type of approach
led to GPS (General Problem Solver), discussed in detail later, and also to
natural language understanding, which will be covered in later lessons.
The test is conducted by two people, one of whom is the interrogator; the
other sits with a computer C in another room. The only communication is by typing
messages on a simple terminal (as envisaged around 1950). The interrogator is given
the task of determining whether a response comes from the person or from the
computer C.
The goal of the machine is to fool the interrogator into believing that the
machine is the person, so that the interrogator will believe the machine can
think. The machine can trick the interrogator, for instance by deliberately giving a
wrong answer to 12324 times 73981, as a human might.
Some believe a computer will never pass the Turing Test, never being able to
maintain the kind of dialogue, due to Turing, that Turing believed a computer
would need to exhibit to pass the test.
Recently computers have convinced five people out of ten that the programs
loaded inside them were of human intelligence. Also, Chinook, a computer program,
became the draughts world champion.
This is a good example of an AI technique, and it identifies three important AI
techniques:
Search
A search program finds a solution for a problem by trying various sequences of
actions or operators until a solution is found.
Advantages
To solve a problem using search, it is only necessary to code the operators that can
be used; the search will find the sequence of actions that will provide the desired
result. For example, a program can be written to play chess using search if one
knows the rules of chess; it is not necessary to know how to play good chess.
Disadvantages
Most problems have search spaces so large that it is impossible to search the whole
space. Chess has been estimated to have 10^120 possible games. The rapid growth of
combinations of possible moves is called the combinatorial explosion problem.
Use of knowledge
Use of knowledge provides a way of solving complex problems by exploiting the
structures of the objects that are concerned.
Abstraction
Abstraction provides a way of separating important features and variations from
the many unimportant ones that would otherwise overwhelm any process.
1.11 CONCLUSION
We live in an era of rapid change, moving towards the information, knowledge or
network society. We require seamless, easy-to-use, high-quality, affordable
communication between people and machines, anywhere and anytime. The creation
of intelligence in a machine, replicating the functionality of the human mind, has
been a long-cherished desire.
Intelligent information and communications technology (IICT) emulates and
employs some aspects of human intelligence in performing a task. The IICT based
systems include sensors, computers, knowledge-based software, human-machine
interfaces and other devices. IICT enabled machines and devices anticipate
requirements and deal with environments that are complex, unknown, unpredictable
and bring the power of computing technology into our daily lives and business
practices. Intelligent systems were first developed for use in traditional industries,
such as manufacturing, mining, etc., enabling the automation of routine or dangerous
tasks to improve productivity and quality. Today, an intelligent systems application
exists virtually in all sectors where they deliver social as well as economic benefits.
After World War II, a number of people started working on intelligent
machines. The English mathematician Alan Turing was probably the first: in a 1947
lecture he suggested that AI was best researched by programming computers
rather than by building machines. By the late 1950s, there were many researchers
in AI, and most of them based their work on programming computers.
Daniel Dennett's book Brainchildren has an excellent discussion of the
Turing test and of the various partial Turing tests that have been implemented, i.e. with
restrictions on the observer's knowledge of AI and on the subject matter of questioning.
It turns out that some people are easily led into believing that a rather dumb program
is intelligent.
Alan Turing's 1950 article 'Computing Machinery and Intelligence'
set out conditions for considering a machine to be intelligent. He argued that if
the machine could successfully pretend to be human to a knowledgeable observer,
then we should certainly consider it intelligent. This test would satisfy most
people, but not all philosophers. The observer could interact with the machine
and with a human by teletype (to prevent the machine from imitating the appearance
or voice of the person); the human would try to persuade the observer that he or she
was human, and the machine would try to fool the observer.
The Turing test is a one-sided test. A machine that passes the test should
certainly be considered intelligent, but a machine could still be considered intelligent
without knowing enough about humans to imitate one.
1.12 SUMMARY
In this unit, you have learned about the various concepts related to AI, beginning
with its basics. It was in late 1955 that Newell and Simon developed The Logic
Theorist, considered by many to be the first AI program. The program, representing
each problem as a tree model, would attempt to solve it by selecting the branch that
would most likely result in the correct conclusion.
You have also learned the various definitions of AI according to different
authors. Most of these definitions take a very technical direction and avoid the
philosophical problems connected with the idea that AI's purpose is to construct an
artificial human. AI seeks to understand the computations required for intelligent
behaviour and to produce computer systems that exhibit intelligence. This unit also
explained in detail the applications of AI, which include problem solving, search and
control strategies, speech recognition, natural language understanding, computer
vision, expert systems, etc. Aspects of intelligence studied by AI include perception,
communication using human languages, reasoning, planning, learning and memory.
Another important topic that you have learned is AI techniques. The latter
depict how we represent, manipulate and reason with knowledge in order to solve
problems. Knowledge is a collection of facts. To manipulate these facts by a
program, a suitable representation is required; a good representation facilitates
problem solving. The level of the AI model and the criteria for success have also
been discussed.
Software defined radio: It is a radio communication system where
components that have typically been implemented in hardware (e.g., mixers, filters,
amplifiers, modulators/demodulators, detectors, etc.) are implemented using
software.
Artificial intelligence: It is the branch of computer science that is concerned
with the automation of intellectual performance.
AI techniques: These depict how we represent, manipulate and reason with
knowledge in order to solve problems. Knowledge is a collection of facts.
UNIT 2 PROBLEMS, PROBLEM SPACES AND SEARCH
Structure
2.0 Introduction
2.1 Unit Objectives
2.2 Problem of Building a System
2.2.1 AI - General Problem Solving
2.3 Defining Problem in a State Space Search
2.3.1 State Space Search
2.3.2 The Water Jug Problem
2.4 Production Systems
2.4.1 Control Strategies
2.4.2 Search Algorithms
2.4.3 Heuristics
2.5 Problem Characteristics
2.5.1 Problem Decomposition
2.5.2 Can Solution Steps be Ignored?
2.5.3 Is the Problem Universe Predictable?
2.5.4 Is Good Solution Absolute or Relative?
2.6 Characteristics of Production Systems
2.7 Design of Search Programs and Solutions
2.7.1 Design of Search Programs
2.7.2 Additional Problems
2.8 Summary
2.9 Key Terms
2.10 Answers to Check Your Progress
2.11 Questions and Exercises
2.12 Further Reading
2.0 INTRODUCTION
In the preceding unit, you learnt about the basic concepts and applications of AI. In
this unit, you will learn about problems, their definitions, characteristics and the design
of search programs. You will learn to define problems in a state space search. A
state space represents a problem in terms of states and operators that change states.
You will also learn about problem solving, which is a process of generating solutions
from observed or given data. This unit also examines production systems, which
provide appropriate structures for performing and describing search processes.
Other important topics discussed in this unit are problem characteristics
and production system characteristics. The latter provide us with a good way of
describing the operations that can be performed in a search for a solution to the
problem. Finally, you will also learn about the design of search programs and solutions.
[Figure: graphs and trees]
Problem solving
The term, Problem Solving relates to analysis in AI. Problem solving may be
characterized as a systematic search through a range of possible actions to
reach some predefined goal or solution. Problem-solving methods are
categorized as special purpose and general purpose.
A special-purpose method is tailor-made for a particular problem and often
exploits very specific features of the situation in which the problem is
embedded.
A general-purpose method is applicable to a wide variety of problems. One
general-purpose technique used in AI is means-end analysis: a step-by-step,
or incremental, reduction of the difference between the current state and the
final goal.
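Means-end analysis can be sketched on a toy numeric problem: at every step, apply the operator that most reduces the difference between the current state and the goal. The operator set and names here are illustrative inventions for the sketch, not from the text:

```python
# Means-end analysis sketch: incrementally reduce the difference
# between the current state (a number) and the goal.

OPERATORS = {"add10": lambda x: x + 10,
             "add1":  lambda x: x + 1,
             "sub1":  lambda x: x - 1}

def means_end(current, goal):
    plan = []
    while current != goal:
        # pick the operator whose result lands nearest the goal
        name, op = min(OPERATORS.items(),
                       key=lambda item: abs(item[1](current) - goal))
        current = op(current)
        plan.append(name)
    return plan

print(means_end(0, 22))  # ['add10', 'add10', 'add1', 'add1']
```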
11. (four, three) if four < 4 -> (4, three - diff), where diff = 4 - four: pour water
into the four-litre jug from the three-litre jug until it is full.
12. (four, three) if three < 3 -> (four - diff, 3), where diff = 3 - three: pour water
into the three-litre jug from the four-litre jug until it is full.
A solution is given below as a table of four, three and the rule applied.
Fig. 2.2 Production Rules for the Water Jug Problem
The problem is solved by using the production rules in combination with an appropriate
control strategy, moving through the problem space until a path from an initial state
to a goal state is found. In this problem-solving process, search is the fundamental
concept. For simple problems it is easy to achieve this by hand, but there will
be cases where this is far too difficult.
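The combination of production rules and a control strategy can be sketched as a small program. This sketch uses breadth-first control and assumes the usual form of the water jug problem (a 4-litre and a 3-litre jug, goal: exactly 2 litres in the 4-litre jug); all function names are illustrative:

```python
from collections import deque

# States are (four, three): litres in the 4-litre and 3-litre jugs.

def successors(four, three):
    """The production rules: fill, empty or pour between the jugs."""
    yield (4, three)                      # fill the 4-litre jug
    yield (four, 3)                       # fill the 3-litre jug
    yield (0, three)                      # empty the 4-litre jug
    yield (four, 0)                       # empty the 3-litre jug
    pour = min(three, 4 - four)           # pour 3-litre -> 4-litre
    yield (four + pour, three - pour)
    pour = min(four, 3 - three)           # pour 4-litre -> 3-litre
    yield (four - pour, three + pour)

def solve(start=(0, 0), goal_four=2):
    """Breadth-first control strategy over the production rules."""
    frontier, parent = deque([start]), {start: None}
    while frontier:
        state = frontier.popleft()
        if state[0] == goal_four:         # goal test
            path = []
            while state is not None:      # walk the parent links back
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(*state):
            if nxt not in parent:         # never revisit a state
                parent[nxt] = state
                frontier.append(nxt)

print(solve())
# -> [(0, 0), (4, 0), (1, 3), (1, 0), (0, 1), (4, 1), (2, 3)]
```

The returned path is one of the classic solutions: fill the 4-litre jug, pour it into the 3-litre jug, empty the 3-litre jug, pour the remaining litre across, refill the 4-litre jug and top up the 3-litre jug, leaving 2 litres.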
Example: Eight puzzle (8-Puzzle)
The 8-puzzle is a 3 × 3 array containing eight square pieces, numbered 1 through 8,
and one empty space. A piece can be moved horizontally or vertically into the empty
space, in effect exchanging the positions of the piece and the empty space. There
are four possible moves: UP (move the blank space up), DOWN, LEFT and RIGHT.
The aim of the game is to make a sequence of moves that will convert the board
from the start state into the goal state:
Start:          Goal:
1 3 4           1 2 3
8 6 2           8   4
7   5           7 6 5
This example can be solved by the operator sequence UP, RIGHT, UP, LEFT, DOWN.
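The stated operator sequence can be checked mechanically. In this sketch a state is a 9-tuple read row by row with 0 for the blank, and each move shifts the blank (boundary checks are omitted for brevity; the names are illustrative):

```python
# Apply the operator sequence UP, RIGHT, UP, LEFT, DOWN to the start
# state of this example and verify that the goal state is reached.

MOVES = {"UP": -3, "DOWN": 3, "LEFT": -1, "RIGHT": 1}

def apply_move(state, move):
    blank = state.index(0)                # locate the blank
    target = blank + MOVES[move]          # cell it swaps with
    board = list(state)
    board[blank], board[target] = board[target], board[blank]
    return tuple(board)

state = (1, 3, 4, 8, 6, 2, 7, 0, 5)      # start state, row by row
for move in ["UP", "RIGHT", "UP", "LEFT", "DOWN"]:
    state = apply_move(state, move)

print(state == (1, 2, 3, 8, 0, 4, 7, 6, 5))  # True: goal reached
```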
Example: Missionaries and Cannibals
The Missionaries and Cannibals problem illustrates the use of state space search for
planning under constraints:
Three missionaries and three cannibals wish to cross a river using a two-
person boat. If at any time the cannibals outnumber the missionaries on either side
of the river, they will eat the missionaries. How can a sequence of boat trips be
performed that will get everyone to the other side of the river without any missionaries
being eaten?
State representation:
1. BOAT position: original (T) or final (NIL) side of the river.
2. Number of Missionaries and Cannibals on the original side of the river.
3. Start is (T 3 3); Goal is (NIL 0 0).
Operators:
(MM 2 0) Two Missionaries cross the river.
(M 1 0) One Missionary.
(C 0 1) One Cannibal.
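The safety constraint behind this state representation can be sketched as a predicate: with counts of missionaries and cannibals on the original side, a state is safe when cannibals outnumber missionaries on neither bank. The function name is illustrative:

```python
# Missionaries and Cannibals safety test: m and c are the numbers of
# missionaries and cannibals on the original side (3 of each in total).

def safe(m, c):
    left_ok = m == 0 or m >= c                      # original side
    right_ok = (3 - m) == 0 or (3 - m) >= (3 - c)   # far side
    return left_ok and right_ok

print(safe(3, 3))  # True: everyone is still on one bank
print(safe(2, 3))  # False: 2 missionaries face 3 cannibals
print(safe(0, 2))  # True: no missionaries left to eat
```

A full solver would generate successor states by applying the crossing operators and keep only states that pass this test.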
[Figure: Missionaries/Cannibals search graph. Each node records the number of
missionaries on the left bank, the number of cannibals on the left bank and the
boat position; arcs are labelled with the crossing made (MM, MC, CC, M or C),
from the start node (3 missionaries, 3 cannibals, boat on the left) down to the
goal node (0, 0, boat on the far side).]
Tree structure
A tree is a way of organizing objects related in a hierarchical fashion.
A tree is a type of data structure in which each element is attached to one or
more elements directly beneath it.
The connections between elements are called branches.
Trees are often called inverted trees because they are drawn with the root at the top.
The elements that have no elements below them are called leaves.
A binary tree is a special type: each element has at most two branches below it.
Properties
Tree is a special case of a graph.
The topmost node in a tree is called the root node.
At root node all operations on the tree begin.
A node has at most one parent.
The topmost node (root node) has no parent.
Each node has zero or more child nodes, which are below it.
The nodes at the bottommost level of the tree are called leaf nodes.
Since leaf nodes are at the bottommost level, they do not have children.
A node that has a child is called the child's parent node.
The depth of a node n is the length of the path from the root to the node.
The root node is at depth zero.
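The properties above can be sketched in a few lines: each node keeps at most one parent reference, and depth is the length of the path back to the root. The class and attribute names are illustrative:

```python
# Tree node sketch: at most one parent; the root has none.

class Node:
    def __init__(self, parent=None):
        self.parent = parent

    def depth(self):
        """Length of the path from the root to this node."""
        d, node = 0, self
        while node.parent is not None:    # walk up towards the root
            node = node.parent
            d += 1
        return d

root = Node()                  # the root node: no parent, depth zero
child = Node(parent=root)
leaf = Node(parent=child)
print(root.depth(), leaf.depth())  # 0 2
```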
Stacks and Queues
Stacks and queues are data structures that maintain last-in, first-out and
first-in, first-out order respectively. Both stacks and queues are often
implemented as linked lists, but that is not the only possible implementation.
Stack - Last In First Out (LIFO) lists
An ordered list; a sequence of items, piled one on top of the other.
Insertions and deletions are made at one end only, called the Top.
If Stack S = (a[1], a[2], . . . , a[n]), then a[1] is the bottommost element.
Any intermediate element a[i] is on top of element a[i-1], 1 < i <= n.
In a stack, all operations take place at the Top.
The Pop operation removes an item from the top of the stack.
The Push operation adds an item on top of the stack.
Queue - First In First Out (FIFO) lists
An ordered list; a sequence of items; there are restrictions on how items
can be added to and removed from the list. A queue has two ends.
All insertions (enqueue) take place at one end, called the Rear or Back.
All deletions (dequeue) take place at the other end, called the Front.
If a queue has a[n] as its rear element, then a[i+1] is behind a[i], 1 < i <= n.
All operations take place at one end of the queue or the other.
The Dequeue operation removes the item at the Front of the queue.
The Enqueue operation adds an item to the Rear of the queue.
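The LIFO and FIFO behaviour described above can be shown directly. In Python a list serves as a stack, and `collections.deque` as a queue:

```python
from collections import deque

# Stack: Push and Pop both happen at the Top (end of the list).
stack = []
stack.append(1)          # Push
stack.append(2)
stack.append(3)
print(stack.pop())       # 3: last in, first out

# Queue: Enqueue at the Rear, Dequeue at the Front.
queue = deque()
queue.append(1)          # Enqueue
queue.append(2)
queue.append(3)
print(queue.popleft())   # 1: first in, first out
```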
Search
Search is the systematic examination of states to find path from the start / root state
to the goal state.
Search usually results from a lack of knowledge.
Search explores knowledge alternatives to arrive at the best answer.
Search algorithm output is a solution, that is, a path from the initial state to a
state that satisfies the goal test.
Search is an approach for general-purpose problem solving.
Search deals with finding nodes having certain properties in a graph that
represents search space.
Search methods explore the search space intelligently, evaluating
possibilities without investigating every single possibility.
Examples:
For a robot this might consist of PICKUP, PUTDOWN, MOVEFORWARD,
MOVEBACK, MOVELEFT and MOVERIGHT, until the goal is reached.
Puzzles and games have explicit rules: e.g., the Tower of Hanoi puzzle.
[Fig. 2.4 (a) Start (b) Final]
This puzzle involves a set of rings of different sizes that can be placed on
three different pegs.
The puzzle starts with the rings arranged as shown in Figure 2.4(a).
The goal of this puzzle is to move them all to the arrangement of Figure 2.4(b).
Condition: Only the top ring on a peg can be moved, and it may only be
placed on a larger ring, or on an empty peg.
In this Tower of Hanoi puzzle:
Situations encountered while solving the problem are described as states.
Set of all possible configurations of rings on the pegs is called problem space.
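The Tower of Hanoi has a well-known recursive solution: to move n rings from the source peg to the target peg, move n-1 rings to the spare peg, move the largest ring, then move the n-1 rings back on top of it. A minimal sketch (names are illustrative):

```python
# Recursive Tower of Hanoi: collects the moves as (from_peg, to_peg).

def hanoi(n, source, target, spare, moves):
    if n == 0:
        return
    hanoi(n - 1, source, spare, target, moves)   # clear the way
    moves.append((source, target))               # move the nth ring
    hanoi(n - 1, spare, target, source, moves)   # stack the rest on top

moves = []
hanoi(3, "A", "C", "B", moves)
print(len(moves))  # 7: a tower of n rings takes 2**n - 1 moves
print(moves[0])    # ('A', 'C'): the smallest ring moves first
```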
States
A state is a representation of the elements at a given moment.
A problem is defined by its elements and their relations.
At each instant of a problem, the elements have specific descriptors and
relations; the descriptors indicate how to select elements.
Among all possible states, there are two special states called:
Initial state the start point
Final state the goal state
State Change: Successor Function
A successor function is needed for state change. The successor function
moves one state to another state.
Successor Function:
It is a description of possible actions; a set of operators.
It is a transformation function on a state representation, which converts
that state into another state.
It defines a relation of accessibility among states.
It represents the conditions of applicability of a state and the corresponding
transformation function.

Check Your Progress
1. What does a search refer to?
2. Problem solving is a process of generating solutions from observed or
given data. (True or False)
3. What are the two characteristics by which a problem is defined?
State space
A state space is the set of all states reachable from the initial state.
A state space forms a graph (or map) in which the nodes are states and
the arcs between nodes are actions.
In a state space, a path is a sequence of states connected by a sequence of
actions.
The solution of a problem is part of the map formed by the state space.
Structure of a state space
The structures of a state space are trees and graphs.
A tree is a hierarchical structure in a graphical form.
A graph is a non-hierarchical structure.
A tree has only one path to a given node;
i.e., a tree has one and only one path from any point to any other point.
A graph consists of a set of nodes (vertices) and a set of edges (arcs). Arcs
establish relationships (connections) between the nodes; i.e., a graph has
several paths to a given node.
The Operators are directed arcs between nodes.
A search process explores the state space. In the worst case, the search explores
all possible paths between the initial state and the goal state.
Problem solution
In the state space, a solution is a path from the initial state to a goal state or,
sometimes, just a goal state.
A solution cost function assigns a numeric cost to each path; it also gives
the cost of applying the operators to the states.
A solution quality is measured by the path cost function; and an optimal
solution has the lowest path cost among all solutions.
The required solution can be any solution, an optimal solution, or all solutions; the importance of cost depends on the problem and the type of solution asked for.
Problem description
A problem consists of the description of:
The current state of the world,
The actions that can transform one state of the world into another,
The desired state of the world.
The following are taken into account when describing a problem:
State space, defined explicitly or implicitly
Preconditions, which provide a partial description of the state of the world that must be true in order to perform an action
Instructions, which tell the user how to create the next state
Operators, which should be as general as possible, so as to reduce their number
Elements of the domain that have relevance to the problem
Knowledge of the starting point
This can also be explained with the help of an algebraic function, as given below.
Algebraic Function
A function may take the form of a set of ordered pairs, a graph or an equation. Regardless of the form it takes, a function must obey the condition that no two of its ordered pairs have the same first member with different second members.
Relation: A set of ordered pair of the form (x, y) is called a relation.
Function: A relation in which no two ordered pairs have the same x-value but
different y-value is called a function. Functions are usually named by lower-case
letters such as f, g, and h.
For example, f = {(-3, 9), (0, 0), (3, 9)} is a function, while g = {(4, -2), (4, 2)} is not, since in g the x-value 4 is paired with two different y-values.
Domain and Range: The domain of a function is the set of all the first members of
its ordered pairs, and the range of a function is the set of all second members of its
ordered pairs.
If function f = {(a, A), (b, B), (c, C)}, then its domain is {a, b, c} and its range is {A, B, C}.
Function and mapping: A function may be viewed as a mapping or a pairing of one set with elements of a second set such that each element of the first set (called the domain) is paired with exactly one element of the second set (called the co-domain).
For example, if a function f maps {a, b, c} into {A, B, C, D} such that a → A (read: a is mapped into A), b → B, c → C, then the domain is {a, b, c} and the co-domain is {A, B, C, D}. Since a is paired with A in the co-domain, A is called the image of a. Each element of the co-domain that corresponds to an element of the domain is called the image of that element.
The set of image points, {A, B, C}, is called the range. Thus, the range is a subset of the co-domain.
Onto mappings: Set A is mapped onto set B if each element of set B is the image of an element of set A. Thus, every function maps its domain onto its range.
Describing a function by an equation: The rule by which each x-value gets paired with the corresponding y-value may be specified by an equation. For example, the function described by the equation y = x + 1 requires that for any choice of x in the domain, the corresponding range value is x + 1. Thus, 2 → 3, 3 → 4, and 4 → 5.
Restricting domains of functions: Unless otherwise indicated, the domain of a function is assumed to be the largest possible set of real numbers. Thus:
The domain of y = x / (x^2 - 4) is the set of all real numbers except 2 and -2, since for these values of x the denominator is 0.
The domain of y = (x - 1)^(1/2) is the set of real numbers greater than or equal to 1, since for any value of x less than 1 the radical has a negative radicand, so the radical does not represent a real number.
Example: Find which of the following relations describe functions:
(a) y = x^(1/2), (b) y = x^3, (c) y > x, (d) x = y^2
Equations (a) and (b) produce exactly one value of y for each value of x. Hence, equations (a) and (b) describe functions.
Relation (c), y > x, does not represent a function since it contains ordered pairs such as (1, 2) and (1, 3), where the same value of x is paired with different values of y.
Equation (d), x = y^2, is not a function since ordered pairs such as (4, 2) and (4, -2) satisfy the equation but have the same value of x paired with different values of y.
Function notation: For any function f, the value of y that corresponds to a given
value of x is denoted by f(x).
If y = 5x - 1, then f(2), read as "f of 2", represents the value of y when x = 2: f(2) = 5 · 2 - 1 = 9;
when x = 3, f(3) = 5 · 3 - 1 = 14.
In an equation that describes function f, f(x) may be used in place of y; for example, f(x) = 5x - 1. If y = f(x), then y is said to be a function of x.
Since the value of y depends on the value of x, y is called the dependent variable
and x is called the independent variable.
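The ordered-pair condition above can be checked mechanically. A minimal Python sketch (the relation pairs below are illustrative examples, not taken from the text):

```python
def is_function(relation):
    """A relation is a function if no x-value is paired with two different y-values."""
    mapping = {}
    for x, y in relation:
        if x in mapping and mapping[x] != y:
            return False   # same first member, different second member
        mapping[x] = y
    return True

print(is_function({(-3, 9), (0, 0), (3, 9)}))   # True: each x has one y
print(is_function({(4, -2), (4, 2)}))           # False: x = 4 has two y-values
```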
2.4.2 Search Algorithms
Many traditional search algorithms are used in AI applications. For complex
problems, the traditional algorithms are unable to find the solutions within some
practical time and space limits. Consequently, many special techniques are
developed, using heuristic functions.
The algorithms that use heuristic functions are called heuristic algorithms.
Heuristic algorithms are not really intelligent; they appear to be intelligent
because they achieve better performance.
Heuristic algorithms are more efficient because they take advantage of
feedback from the data to direct the search path.
Uninformed search algorithms, or brute-force algorithms, search through the search space for all possible candidate solutions, checking whether each candidate satisfies the problem's statement.
Informed search algorithms use heuristic functions that are specific to the problem and apply them to guide the search through the search space, trying to reduce the amount of time spent in searching.
A good heuristic will make an informed search dramatically outperform any uninformed search: for example, in the Travelling Salesman Problem (TSP), the goal is to find a good solution instead of finding the best solution.
In such problems, the search proceeds using current information about the problem to predict which path is closer to the goal and follows it, although this does not always guarantee finding the best possible solution. Such techniques help in finding a solution within reasonable time and space (memory). Some prominent intelligent search algorithms are stated below:
1. Generate and Test Search
2. Best-first Search
3. Greedy Search
4. A* Search
5. Constraint Search
6. Means-ends Analysis
There are some more algorithms. They are either improvements or combinations of
these.
Hierarchical Representation of Search Algorithms: A Hierarchical
representation of most search algorithms is illustrated below. The
representation begins with two types of search:
Uninformed Search: Also called blind, exhaustive or brute-force search, it
uses no information about the problem to guide the search and therefore may
not be very efficient.
Informed Search: Also called heuristic or intelligent search, this uses information about the problem to guide the search; it usually guesses the distance to a goal state and is therefore more efficient, but such a search may not always be possible.
[Figure: Search algorithms over G(State, Operator, Cost). With no heuristics, an imposed fixed depth limit leads to iterative deepening DFS; with user heuristics h(n), a priority queue ordered by h(n) leads to A* and AO* search.]
The first requirement of a control strategy is that it causes motion: in a game-playing program it moves pieces on the board, and in the water jug problem pouring water fills the jugs. Control strategies that do not cause motion will never lead to a solution.
The second requirement is that it is systematic; that is, it corresponds to the need for global motion as well as for local motion. Clearly, it would be neither rational to fill a jug and empty it repeatedly, nor worthwhile to move a piece round and round the board in a cyclic way in a game. We shall initially consider two systematic approaches to searching. Searches can be classified by the order in which operators are tried: depth-first, breadth-first, bounded depth-first.
2.4.2.1 Breadth-first search
A search strategy that expands all nodes at the current depth before moving on to nodes at the next depth level is called Breadth-first search (BFS).
[Figure: Node expansion order in breadth-first search (level by level) and depth-first search (one branch at a time).]
Advantages
1. Guaranteed to find an optimal solution (in terms of the shortest number of steps to reach the goal).
2. Can always find a goal node if one exists (complete).
Disadvantages
1. High storage requirement: exponential with tree depth.
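The breadth-first strategy can be sketched in a few lines of Python; the graph and node names below are invented for illustration:

```python
from collections import deque

def breadth_first_search(start, goal, successors):
    """Expand the shallowest unexpanded node first; returns a shortest path."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()          # FIFO queue: oldest node first
        node = path[-1]
        if node == goal:
            return path
        for nxt in successors(node):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None                            # complete: None only if no path exists

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": ["G"], "E": []}
print(breadth_first_search("A", "G", lambda n: graph.get(n, [])))  # ['A', 'B', 'D', 'G']
```

Note that the frontier holds whole paths, which is what makes the storage requirement grow exponentially with depth.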
2.4.2.2 Depth-first search
A search strategy that extends the current path as far as possible before backtracking
to the last choice point and trying the next alternative path is called Depth-first
search (DFS).
This strategy does not guarantee that the optimal solution has been found.
In this strategy, search reaches a satisfactory solution more rapidly than breadth
first, an advantage when the search space is large.
Algorithm
Depth-first search applies operators to each newly generated state, trying to drive
directly toward the goal.
1. If the starting state is a goal state, quit and return success.
2. Otherwise, do the following until success or failure is signalled:
a. Generate a successor E to the starting state. If there are no more successors,
then signal failure.
b. Call Depth-first Search with E as the starting state.
c. If success is returned signal success; otherwise, continue in the loop.
Advantages
1. Low storage requirement: linear with tree depth.
2. Easily programmed: function call stack does most of the work of maintaining
state of the search.
Disadvantages
1. May find a sub-optimal solution (one that is deeper or more costly than the
best solution).
2. Incomplete: without a depth bound, may not find a solution even if one exists.
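The recursive algorithm above translates almost line for line into code. A sketch, again with an invented graph:

```python
def depth_first_search(state, goal, successors, visited=None):
    """Follow one path as deep as possible before backtracking."""
    if visited is None:
        visited = set()
    if state == goal:
        return [state]
    visited.add(state)
    for nxt in successors(state):          # generate successors one at a time
        if nxt not in visited:
            rest = depth_first_search(nxt, goal, successors, visited)
            if rest is not None:           # success is signalled back up the stack
                return [state] + rest
    return None                            # no successor worked: backtrack

graph = {"A": ["B", "C"], "B": ["D"], "C": ["G"], "D": []}
print(depth_first_search("A", "G", lambda n: graph.get(n, [])))  # ['A', 'C', 'G']
```

The function call stack does the bookkeeping, which is exactly the "easily programmed" advantage noted above.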
2.4.2.3 Bounded depth-first search
Depth-first search can spend much time (perhaps infinite time) exploring a very
deep path that does not contain a solution, when a shallow solution exists. An easy
way to solve this problem is to put a maximum depth bound on the search. Beyond
the depth bound, a failure is generated automatically without exploring any deeper.
Problems:
1. It's hard to guess how deep the solution lies.
2. If the estimated depth is too deep (even by 1), the computer time used is dramatically increased, by a factor of b^extra (where b is the branching factor and extra is the number of excess levels).
3. If the estimated depth is too shallow, the search fails to find a solution; all
that computer time is wasted.
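Imposing the depth bound is a small change to the recursive search; in this sketch (with an invented graph), exceeding the bound generates failure automatically:

```python
def depth_limited_search(state, goal, successors, limit):
    """Depth-first search that fails automatically beyond the depth bound."""
    if state == goal:
        return [state]
    if limit == 0:
        return None                        # cut off: do not explore any deeper
    for nxt in successors(state):
        rest = depth_limited_search(nxt, goal, successors, limit - 1)
        if rest is not None:
            return [state] + rest
    return None

graph = {"A": ["B"], "B": ["C"], "C": ["G"]}
succ = lambda n: graph.get(n, [])
print(depth_limited_search("A", "G", succ, limit=2))  # None: bound too shallow
print(depth_limited_search("A", "G", succ, limit=3))  # ['A', 'B', 'C', 'G']
```

The two calls illustrate problem 3 above: a bound of 2 misses the solution that lies at depth 3.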
2.4.3 Heuristics
A heuristic is a method that improves the efficiency of the search process. Heuristics are like tour guides: they are good to the extent that they point in generally interesting directions, and bad to the extent that they may neglect points of interest to particular individuals. Some heuristics help in the search process without sacrificing any claims to completeness that the process might previously have had. Others may occasionally cause an excellent path to be overlooked; by sacrificing completeness, they increase efficiency. Heuristics may not find the best solution every time but aim to guarantee finding a good solution in a reasonable time. They are particularly useful in solving tough and complex problems whose solutions would otherwise require practically infinite time, i.e., far longer than a lifetime, and which cannot be solved in any other way.
2.4.3.1 Heuristic search
To find a solution in proper time rather than a complete solution in unlimited time
we use heuristics. A heuristic function is a function that maps from problem state
descriptions to measures of desirability, usually represented as numbers. Heuristic
search methods use knowledge about the problem domain and choose promising
operators first. These heuristic search methods use heuristic functions to evaluate
the next state towards the goal state. For finding a solution, by using the heuristic
technique, one should carry out the following steps:
1. Add domain-specific information to select the best path along which to continue searching.
2. Define a heuristic function h(n) that estimates the "goodness" of a node n. Specifically, h(n) = estimated cost (or distance) of the minimal cost path from n to a goal state.
3. The term "heuristic" means "serving to aid discovery"; a heuristic is an estimate, based on domain-specific information computable from the current state description, of how close we are to a goal.
Finding a route from one city to another city is an example of a search problem in
which different search orders and the use of heuristic knowledge are easily
understood.
1. State: The current city in which the traveller is located.
2. Operators: Roads linking the current city to other cities.
3. Cost Metric: The cost of taking a given road between cities.
4. Heuristic information: The search could be guided by the direction of the
goal city from the current city, or we could use airline distance as an estimate
of the distance to the goal.
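As a sketch of this formulation, the following greedy best-first search always expands the city whose heuristic estimate looks closest to the goal; the cities, roads and airline-distance estimates below are invented for illustration:

```python
import heapq

def greedy_best_first(start, goal, roads, h):
    """Expand the city with the smallest heuristic estimate h(n) first."""
    frontier = [(h[start], start, [start])]    # priority queue ordered by h
    visited = set()
    while frontier:
        _, city, path = heapq.heappop(frontier)
        if city == goal:
            return path
        if city in visited:
            continue
        visited.add(city)
        for nxt in roads.get(city, []):
            heapq.heappush(frontier, (h[nxt], nxt, path + [nxt]))
    return None

roads = {"P": ["Q", "R"], "Q": ["S"], "R": ["S"], "S": ["T"]}
h = {"P": 10, "Q": 7, "R": 4, "S": 2, "T": 0}  # airline-distance estimates to T
print(greedy_best_first("P", "T", roads, h))    # ['P', 'R', 'S', 'T']
```

Because only h(n) is consulted, the search is fast but, as the text notes, it does not guarantee the cheapest route.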
Heuristic search techniques
For complex problems, the traditional algorithms, presented above, are unable to
find the solution within some practical time and space limits. Consequently, many
special techniques are developed, using heuristic functions.
Blind search is not always possible, because it requires too much time or space (memory).
Heuristics are rules of thumb; they do not guarantee a solution to a problem.
Heuristic search is a weak technique but can be effective if applied correctly; it requires domain-specific information.
Characteristics of heuristic search
Heuristics are knowledge about the domain, which help search and reasoning in that domain.
Heuristic search incorporates domain knowledge to improve efficiency over blind search.
A heuristic is a function that, when applied to a state, returns a value as the estimated merit of the state with respect to the goal.
Heuristics might (for various reasons) underestimate or overestimate the merit of a state.
Algorithm: Travelling Salesman Problem
1. Select a city at random as a starting point.
2. Repeat:
3. Select the next city from the list of all the cities to be visited, choosing the one nearest to the current city, and go to it,
4. until all cities are visited.
This produces a significant improvement, reducing the time from order N! to order N^2. Our goal is to find the shortest route that visits each city exactly once. Suppose the cities to be visited and the distances between them are as shown below:
              Hyderabad  Secunderabad  Mumbai  Bangalore  Chennai
Hyderabad         -          15          270      780       740
Secunderabad     15           -          280      760       780
Mumbai          270         280           -       340       420
Bangalore       780         760          340       -        770
Chennai         740         780          420      770        -
(Distances in kilometres)
One possibility is that the salesman starts from Hyderabad. In that case, one path might be followed as shown below:
Hyderabad -> Secunderabad (15 km) -> Chennai (780 km) -> Mumbai (420 km) -> Bangalore (340 km) -> Hyderabad (780 km)
TOTAL: 2335 km
Here the total distance is 2335 km. But this may not be the best solution to the problem; other paths may give a shorter route.
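Applied to the distance table above, the nearest-neighbour heuristic can be sketched as follows; starting from Hyderabad it happens to find a 2145 km tour, shorter than the 2335 km sample path, but still with no guarantee of optimality:

```python
CITIES = ["Hyderabad", "Secunderabad", "Mumbai", "Bangalore", "Chennai"]
D = {  # symmetric road distances in kilometres, from the table above
    ("Hyderabad", "Secunderabad"): 15, ("Hyderabad", "Mumbai"): 270,
    ("Hyderabad", "Bangalore"): 780, ("Hyderabad", "Chennai"): 740,
    ("Secunderabad", "Mumbai"): 280, ("Secunderabad", "Bangalore"): 760,
    ("Secunderabad", "Chennai"): 780, ("Mumbai", "Bangalore"): 340,
    ("Mumbai", "Chennai"): 420, ("Bangalore", "Chennai"): 770,
}

def dist(a, b):
    return D.get((a, b)) or D[(b, a)]

def nearest_neighbour_tour(start):
    """Greedily visit the nearest unvisited city, then return to the start."""
    tour, unvisited = [start], set(CITIES) - {start}
    while unvisited:
        nxt = min(unvisited, key=lambda c: dist(tour[-1], c))
        tour.append(nxt)
        unvisited.remove(nxt)
    tour.append(start)                      # close the tour
    total = sum(dist(a, b) for a, b in zip(tour, tour[1:]))
    return tour, total

tour, total = nearest_neighbour_tour("Hyderabad")
print(tour, total)  # total = 2145 km
```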
It is sometimes possible to place a bound on the error in the answer, although in general such an error bound cannot be made. In real problems, the value of a particular solution is trickier to establish; this problem is easier when the value is measured in miles, while other problems have only vague measures.
Although heuristics can be created for unstructured knowledge, producing a coherent analysis is another issue, and this means that the solution lacks reliability. Rarely is this an optimal solution, since the required approximations are usually insufficient.
Although heuristic solutions can be bad in the worst case, the worst case occurs very infrequently.
Formal Statement:
Problem solving is based on a set of statements describing the desired states, expressed in a suitable language, e.g., first-order logic.
The solution of many problems (like chess, or noughts and crosses) can be described by finding a sequence of actions that leads to a desired goal.
Each action changes the state, and
The aim is to find the sequence of actions that lead from the initial (start)
state to a final (goal) state.
A well-defined problem can be described by the example given below:
Example
Initial State: (S)
Operator or successor function: for any state x, returns s(x), the set of states reachable from x with one action.
State space: all states reachable from the initial one by any sequence of actions.
Path: sequence through state space.
Path cost: function that assigns a cost to a path; cost of a path is the sum of
costs of individual actions along the path.
Goal state: (G)
Goal test: test to determine if at goal state.
Search notations
Search is the systematic examination of states to find a path from the start/root state to the goal state.
The notations used for this purpose are given below:
Evaluation function f(n): estimates the least cost solution through node n
Heuristic function h(n): estimates the least cost path from node n to the goal node
Cost function g(n): estimates the least cost path from the start node to node n
f(n) = g(n) + h(n)
[Figure: a path from the Start node to the Goal through node n; g(n) is the actual cost from Start to n, h(n) is the estimated cost from n to the Goal, and f(n) spans the whole path.]
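The combination f(n) = g(n) + h(n) is what drives the A* search listed earlier; a minimal sketch with an invented weighted graph and an underestimating heuristic:

```python
import heapq

def a_star(start, goal, edges, h):
    """Expand the node with the lowest f(n) = g(n) + h(n) first."""
    frontier = [(h[start], 0, start, [start])]   # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nxt, cost in edges.get(node, []):
            g2 = g + cost                        # actual cost so far
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h[nxt], g2, nxt, path + [nxt]))
    return None, float("inf")

edges = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 12)], "B": [("G", 5)]}
h = {"S": 7, "A": 6, "B": 2, "G": 0}             # underestimates of cost to G
print(a_star("S", "G", edges, h))                # (['S', 'A', 'B', 'G'], 8)
```

Because h never overestimates here, the first path returned (cost 8) is optimal, which anticipates the discussion of underestimates that follows.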
The notations ^f, ^g, ^h are sometimes used to indicate that these values are estimates of f, g, h:
^f(n) = ^g(n) + ^h(n)
If h(n) ≤ the actual cost of the shortest path from node n to the goal, then h(n) is an underestimate.
Estimate cost function g*
The estimated least cost path from the start node to node n is written as g*(n).
g* is calculated as the actual cost, so far, of the explored path.
g* is known exactly by summing all path costs from the start to the current state.
If the search space is a tree, then g* = g, because there is only one path from the start node to the current node.
In general, the search space is a graph.
If the search space is a graph, then g* ≥ g; g* can never be less than the cost of the optimal path, it can only overestimate the cost.
g* can be equal to g in a graph if the path is chosen properly.
Estimate heuristic function h*
The estimated least cost path from node n to a goal node is written h*(n).
h* is heuristic information; it represents a guess at how hard it is to reach the goal state from the current node.
h* may be estimated using an evaluation function f(n) that measures the goodness of a node.
h* may have different values; the values lie between 0 ≤ h*(n) ≤ h(n), and different values correspond to different search algorithms.
If h* = h, it is a perfect heuristic; no unnecessary nodes are ever expanded.

Check Your Progress
4. Production systems provide appropriate structures for performing and describing search processes. (True or False)
5. Stacks and Queues are data structures that maintain the order of first-in, first-out and last-in, first-out respectively. (True or False)
6. Heuristic algorithms are more efficient than traditional algorithms. (True or False)
7. What is a Breadth-first search (BFS)?
8. Heuristics may not find the best solution every time but guarantee that they find a good solution in reasonable time. (True or False)

2.5 PROBLEM CHARACTERISTICS

Heuristics cannot be generalized, as they are domain specific. Production systems provide ideal techniques for representing such heuristics in the form of IF-THEN rules. Most problems requiring simulation of intelligence use heuristic search extensively. Some heuristics are used to define the control structure that guides the search process, as seen in the example described above. But heuristics can also be encoded in the rules to represent the domain knowledge. Since most AI problems make use of knowledge and of guided search through that knowledge, AI can be described as the study of techniques for solving exponentially hard problems in polynomial time by exploiting knowledge about the problem domain.
To use heuristic search for problem solving, we suggest analysing the problem for the following considerations:
Decomposability of the problem into a set of independent smaller subproblems
Possibility of undoing solution steps, if they are found to be unwise
Predictability of the problem universe
Possibility of obtaining an obvious solution to a problem without comparison of all other possible solutions
Type of the solution: whether it is a state or a path to the goal state
Role of knowledge in problem solving
Nature of the solution process: with or without interacting with the user
The general classes of engineering problems such as planning, classification, diagnosis, monitoring and design are generally knowledge intensive and use a large amount of heuristics. Depending on the type of problem, suitable knowledge representation schemes and control strategies for search are to be adopted. Combining heuristics with the two basic search strategies has been discussed above. There are a number of other general-purpose search techniques which are essentially heuristics based. Their efficiency primarily depends on how they exploit the domain-specific knowledge to prune undesirable paths. Such search methods are called weak methods, since the progress of the search depends heavily on the way the domain knowledge is exploited. A few such search techniques, which form the core of many AI systems, are briefly presented in the following sections.
2.5.1 Problem Decomposition
Suppose the expression to be solved is the integral:
∫ (x^3 + x^2 + 2x + 3 sin x) dx = x^4/4 + x^3/3 + x^2 - 3 cos x + C
This problem can be solved by breaking it into smaller problems, each of
which we can solve by using a small collection of specific rules. Using this technique
of problem decomposition, we can solve very large problems very easily. This can
be considered as an intelligent behaviour.
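The decomposition can be sketched as a toy term-by-term integrator: the sum is split into independent subproblems, each solved by one rule from a small rule table. The string representation and the rule table below are illustrative only and cover just the terms of this example:

```python
# Tiny symbolic integrator illustrating problem decomposition: a sum is
# decomposed into independent subproblems, each solved by a specific rule.
RULES = {
    "x**3": "x**4/4",
    "x**2": "x**3/3",
    "2*x": "x**2",
    "3*sin(x)": "-3*cos(x)",
}

def integrate_sum(expr):
    """Decompose "t1 + t2 + ..." into terms and solve each with a known rule."""
    terms = [t.strip() for t in expr.split("+")]
    return " + ".join(RULES[t] for t in terms)

print(integrate_sum("x**3 + x**2 + 2*x + 3*sin(x)"))
```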
2.5.2 Can Solution Steps be Ignored?
Suppose we are trying to prove a mathematical theorem: first we proceed on the assumption that proving a lemma will be useful. Later we realize that it is not at all useful. We then start with another approach to prove the theorem, simply ignoring the first attempt.
Consider the 8-puzzle problem to solve: we make a wrong move and realize
that mistake. But here, the control strategy must keep track of all the moves, so that
we can backtrack to the initial state and start with some new move.
Consider the problem of playing chess. Here, once we make a move we never recover
from that step. These problems are illustrated in the three important classes of
problems mentioned below:
1. Ignorable, in which solution steps can be ignored. E.g.: theorem proving.
2. Recoverable, in which solution steps can be undone. E.g.: the 8-Puzzle.
3. Irrecoverable, in which solution steps cannot be undone. E.g.: chess.
2.5.3 Is the Problem Universe Predictable?
Consider the 8-Puzzle problem. Every time we make a move, we know exactly
what will happen. This means that it is possible to plan an entire sequence of moves
and be confident what the resulting state will be. We can backtrack to earlier moves
if they prove unwise.
Suppose we want to play Bridge. We need to plan before the first play, but we
cannot play with certainty. So, the outcome of this game is very uncertain. In case
of 8-Puzzle, the outcome is very certain. To solve uncertain outcome problems, we
follow the process of plan revision as the plan is carried out and the necessary
feedback is provided. The disadvantage is that the planning in this case is often very
expensive.
2.5.4 Is Good Solution Absolute or Relative?
Consider the problem of answering questions based on a database of simple facts
such as the following:
1. Siva was a man.
2. Siva was a worker in a company.
3. Siva was born in 1905.
4. All men are mortal.
5. All workers in the company died when there was an accident in 1952.
6. No mortal lives longer than 100 years.
Suppose we ask a question: Is Siva alive?
By representing these facts in a formal language, such as predicate logic, and
then using formal inference methods we can derive an answer to this question easily.
There are two ways to answer the question shown below:
Method I:
1. Siva was a man.
2. Siva was born in 1905.
3. All men are mortal.
4. Now it is 2008, so Siva's age would be 103 years.
5. No mortal lives longer than 100 years.
Method II:
1. Siva was a worker in the company.
2. All workers in the company died in 1952.
Answer: So Siva is not alive. Both methods lead to this answer.
Self-Instructional
56 Material
We are interested in answering the question; it does not matter which path we follow. If we follow one path successfully to the correct answer, there is no reason to go back and check whether another path leads to the same solution.
2.6 CHARACTERISTICS OF PRODUCTION
SYSTEMS
Production systems provide us with good ways of describing the operations that can
be performed in a search for a solution to a problem.
At this time, two questions may arise:
1. Can production systems be described by a set of characteristics? And how
can they be easily implemented?
2. What relationships are there between the problem types and the types of
production systems well suited for solving the problems?
To answer these questions, first consider the following definitions of classes
of production systems:
1. A monotonic production system is a production system in which the application
of a rule never prevents the later application of another rule that could also
have been applied at the time the first rule was selected.
2. A nonmonotonic production system is one in which this is not true.
3. A partially commutative production system is a production system with the property that if the application of a particular sequence of rules transforms state P into state Q, then any permutation of those rules that is allowable also transforms state P into state Q.
4. A commutative production system is a production system that is both monotonic and partially commutative.
Table 2.1 Four Categories of Production Systems

                            Monotonic            Non-monotonic
Partially commutative       Theorem proving      Robot navigation
Not partially commutative   Chemical synthesis   Bridge
Since the same state may be reachable by different sequences of moves, the
state space may in general be a graph. It may be treated as a tree for simplicity, at
the cost of duplicating states.
2.7.1.2 Solving problems using search
Given an informal description of the problem, construct a formal description
as a state space:
Define a data structure to represent the state.
Make a representation for the initial state from the given data.
Write programs to represent operators that change a given state
representation to a new state representation.
Write a program to detect terminal states.
Choose an appropriate search technique:
How large is the search space?
How well structured is the domain?
What knowledge about the domain can be used to guide the search?
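Following this checklist, the water jug problem mentioned earlier can be formalized; the capacities (4 and 3 litres) and the goal (2 litres in the larger jug) are the classic textbook values:

```python
from collections import deque

CAP = (4, 3)                # capacities of the two jugs (data structure: a pair)
START, GOAL = (0, 0), 2     # both jugs empty; want 2 litres in the 4-litre jug

def successors(state):
    """Operators: fill a jug, empty a jug, or pour one jug into the other."""
    x, y = state
    states = {(CAP[0], y), (x, CAP[1]), (0, y), (x, 0)}         # fill / empty
    pour = min(x, CAP[1] - y); states.add((x - pour, y + pour))  # pour x into y
    pour = min(y, CAP[0] - x); states.add((x + pour, y - pour))  # pour y into x
    states.discard(state)
    return states

def solve():
    """Breadth-first search over the state space returns a shortest solution."""
    frontier, visited = deque([[START]]), {START}
    while frontier:
        path = frontier.popleft()
        if path[-1][0] == GOAL:            # terminal-state (goal) test
            return path
        for s in successors(path[-1]):
            if s not in visited:
                visited.add(s)
                frontier.append(path + [s])

print(solve())                             # a 6-action plan (7 states)
```

Each step of the checklist appears in the code: the state representation (a pair of integers), the initial state, the operator programs, and the goal test.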
2.7.2 Additional Problems
The additional problems are given below.
2.7.2.1 The Monkey and Bananas Problem
The monkey is in a closed room in which there is a small table. There is a bunch of
bananas hanging from the ceiling but the monkey cannot reach them. Let us assume
that the monkey can move the table and if the monkey stands on the table, the
monkey can reach the bananas. Establish a method to instruct the monkey on how
to capture the bananas.
1. Task Environment
A. Information
Set of places: {on the floor, place1, place2, under the bananas}
B. Operators
CLIMB
Pretest: the monkey is at the table's place
Move: the monkey is on the table
WALK
Variable x is in the set of places
Move: the monkey's place becomes x
MOVE TABLE
Variable x is in the set of places
Pretest: the monkey's place is in the set of places
The monkey's place is the table's place
Move: the monkey's place becomes x
The table's place becomes x
GET BANANAS
Pretest: the table's place is under the bananas
The monkey's place is on the table
Move: the contents of the monkey's hand are the bananas
C. Differences
d1 is the monkey's place
d2 is the table's place
d3 is the contents of the monkey's hand
D. Differences ordering
d3 is harder to reduce than d2, which is harder to reduce than d1.
Specific Task
Goal: transform the initial object into the desired object
Objects
Initial: the monkey's place is place1; the table's place is place2; the contents of the monkey's hand are empty
Desired: the contents of the monkey's hand are the bananas
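The task environment above can be turned into a searchable state space. In this sketch a state is the tuple (monkey's place, table's place, monkey-on-table, has-bananas); the place names are abbreviated, and "on the floor" versus "on the table" is modelled as a boolean flag:

```python
from collections import deque

PLACES = ["place1", "place2", "under-bananas"]

def successors(state):
    """Apply WALK, MOVE TABLE, CLIMB and GET BANANAS where their pretests hold."""
    monkey, table, on_table, has = state
    result = []
    if not on_table:
        for p in PLACES:                                  # WALK to any place
            result.append((p, table, False, has))
        if monkey == table:                               # pretests satisfied
            for p in PLACES:                              # MOVE TABLE to p
                result.append((p, p, False, has))
            result.append((monkey, table, True, has))     # CLIMB onto the table
    elif table == "under-bananas" and not has:
        result.append((monkey, table, True, True))        # GET BANANAS
    return result

def plan(start):
    """Breadth-first search until the monkey's hand contains the bananas."""
    frontier, visited = deque([[start]]), {start}
    while frontier:
        path = frontier.popleft()
        if path[-1][3]:
            return path
        for s in successors(path[-1]):
            if s not in visited:
                visited.add(s)
                frontier.append(path + [s])

steps = plan(("place1", "place2", False, False))
print(len(steps))  # 5 states: start, walk to table, move it under bananas, climb, grab
```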
2.8 SUMMARY
In this unit, you have learned about problems, their definitions, characteristics and the design of search programs. You have also learned how to define a problem as a state space search. A state space represents a problem in terms of states and operators that change states. You have further learned about problem solving, which is a process of generating solutions from observed or given data.
In this unit, you have also studied the two key steps that must be taken
towards designing a program to solve a particular problem:
1. First, accurately defining the problem means, specify the problem space, the
operators for moving within the space, and the starting and goal states.
2. Second, analyse the problem for determining where it falls.
Finally, consider the last two steps for developing a program to solve that
problem:
3. Identify and represent the knowledge required by the task.
4. Choose one or more techniques to solve the problem and apply those
techniques.
Gradually, the relationship between the problem characteristics and specific techniques should become clearer and clearer.
UNIT 3 KNOWLEDGE REPRESENTATION ISSUES
Structure
3.0 Introduction
3.1 Unit Objectives
3.2 Knowledge Representation
3.2.1 Approaches to AI Goals
3.2.2 Fundamental System Issues
3.2.3 Knowledge Progression
3.2.4 Knowledge Model
3.2.5 Knowledge Typology Map
3.3 Representation and Mappings
3.3.1 Framework of Knowledge Representation
3.3.2 Representation of Facts
3.3.3 Using Knowledge
3.4 Approaches to Knowledge Representation
3.4.1 Properties for Knowledge Representation Systems
3.4.2 Simple Relational Knowledge
3.5 Issues in Knowledge Representation
3.5.1 Important Attributes and Relationships
3.5.2 Granularity
3.5.3 Representing Set of Objects
3.5.4 Finding Right Structure
3.6 The Frame Problem
3.6.1 The Frame Problem in Logic
3.7 Summary
3.8 Key Terms
3.9 Questions and Exercises
3.10 Further Reading
3.0 INTRODUCTION
In the preceding unit, you learnt about problems, their definitions, characteristics
and design of search programs. In this unit, you will learn about the various aspects
of knowledge representation. Let us start with the observation that we have no
perfect method of knowledge representation today. This stems largely from our
ignorance of just what knowledge is. Nevertheless many methods have been worked
out and used by AI researchers.
You will learn that the knowledge representation problem concerns the mismatch between human and computer memory, i.e., how to encode knowledge so that it is a faithful reflection of the expert's knowledge and can be manipulated by computers.
For mapping, we call these representations of knowledge 'knowledge bases', and the manipulative operations on these knowledge bases are known as inference engine programs. In this unit you will also learn about the frame problem, which is the problem of representing the facts that change as well as those that do not change.
For example, consider a table with a plant on it under a window. Suppose it is moved to the centre of the room. Here, it must be inferred that the plant is now in the centre but the window is not.
3.1 UNIT OBJECTIVES
After going through this unit, you will be able to:
Explain the different knowledge representation issues
Understand representation and mapping
Describe the various approaches to knowledge representation
Explain the issues in knowledge representation
Describe the frame problem
Knowledge is a description of the world. It determines a system's
competence by what it knows.
Representation is the way knowledge is encoded. It defines the performance
of a system in doing something.
Different types of knowledge require different kinds of representation.
The knowledge representation models/mechanisms are often based on:
logic, rules, frames and semantic nets.
Different types of knowledge require different kinds of reasoning.
Knowledge is a general term. Knowledge is a progression that starts with
data, which is of limited utility. By organizing or analysing the data, we understand
what the data means, and this becomes information. The interpretation or evaluation
of information yields knowledge. An understanding of the principles within the
knowledge is wisdom.
3.2.3 Knowledge Progression
[Figure: Data → (organizing/analysing) → Information → (interpretation/evaluation) → Knowledge → (understanding principles) → Wisdom]
Fig. 3.1 Knowledge Progression
Events
Events are actions.
E.g.: Vinay played the guitar at the farewell party.
Performance
Playing the guitar involves behaviour, i.e., knowledge about how to do things.
Meta-knowledge
Knowledge about what we know.
To solve problems in AI, we must represent knowledge and must deal with the
entities.
Facts
Facts are truths about the real world and what we represent. This can be considered
the knowledge level.
3.2.4 Knowledge Model
The knowledge model shows that as the level of connectedness and
understanding increases, we progress from data through information and
knowledge to wisdom.
The model represents transitions and understanding:
The transitions are from data to information, from information to knowledge,
and finally from knowledge to wisdom.
Understanding supports the transitions from one stage to the next.
The distinctions between data, information, knowledge and wisdom are not
very discrete. They are more like shades of grey rather than black and white.
Data and information deal with the past and are based on gathering facts and
adding context.
Knowledge deals with the present and enables us to perform.
Wisdom deals with the future: a vision of what will be rather than of what is
or was.
[Figure: Data, Information, Knowledge and Wisdom plotted against increasing
degrees of connectedness and understanding, with the transitions labelled
Relations, Patterns and Principles. A second diagram shows the knowledge
conversion cycle: Socialization, Externalization, Combination and
Internalization.]
Principles are the basic building blocks of theoretical models and allow for making
predictions and drawing implications. These artifacts support the knowledge
creation process for creating two knowledge types, declarative and procedural,
which are explained below.
Knowledge Type
Cognitive psychologists sort knowledge into Declarative and Procedural
categories and some researchers add Strategic as a third category.
Procedural knowledge:
- Examples: procedures, rules, strategies, agendas, models.
- Focuses on tasks that must be performed to reach a particular objective or goal.
- Knowledge about how to do something; e.g., to determine if Peter or Robert is
older, first find their ages.
Declarative knowledge:
- Examples: concepts, objects, facts, propositions, assertions, semantic nets,
logic and descriptive models.
- Refers to representations of objects and events; knowledge about facts and
relationships.
- Knowledge about whether something is true or false; e.g., a car has four tyres;
Peter is older than Robert.
Note:
There is some disparity of views about procedural knowledge.
One view is that it is close to tacit knowledge; it manifests itself in the doing
of something, yet cannot be expressed in words; e.g., we read faces and moods.
Another view is that it is close to declarative knowledge; the difference is
that a task or method is described instead of facts or things.
All declarative knowledge is explicit knowledge; it is knowledge that can be
and has been articulated.
Strategic knowledge is considered to be a subset of declarative knowledge.
3.3.1 Framework of Knowledge Representation
A computer requires a well-defined problem description to process and to provide
a well-defined, acceptable solution. To collect fragments of knowledge, we first need
to formulate a description in our spoken language and then represent it in a formal
language so that the computer can understand it. The computer can then use an
algorithm to compute an answer. This process is illustrated below.
Problem → (represent) → Formal Representation → (compute) → Output → (interpret) → Solution
(the direct Problem → Solution path is the informal 'solve' step)
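This pipeline can be sketched in a few lines of Python. The function names and the toy arithmetic encoding below are illustrative assumptions, not part of any standard library:

```python
# A minimal sketch of the represent -> compute -> interpret cycle,
# for the toy problem "sum of 2 and 3".

def represent(problem):
    """Map an informal problem statement to a formal representation."""
    words = problem.split()              # "sum of 2 and 3"
    return ('+', int(words[2]), int(words[4]))

def compute(formal):
    """Run an algorithm on the formal representation."""
    op, a, b = formal
    return a + b if op == '+' else None

def interpret(output):
    """Map the formal output back to an informal solution."""
    return "The answer is %d" % output

solution = interpret(compute(represent("sum of 2 and 3")))
```

Here represent and interpret are the two mapping steps, while compute is the only step the machine performs, entirely at the formal level.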
We need a means to manipulate the knowledge, and this requires some
formalism for what we represent.
Thus, knowledge representation can be considered at two levels:
(a) the knowledge level, at which facts are described, and
(b) the symbol level, at which the representations of the objects, defined in terms
of symbols, can be manipulated in the programs.
Note: A good representation enables fast and accurate access to knowledge and
understanding of the content.
3.3.2 Representation of Facts
Representations of facts are those which we manipulate. This can be regarded as
the symbol level since we normally define the representation in terms of symbols
that can be manipulated by programs. We can structure these entities in two levels:
The knowledge level
At which the facts are going to be described.
The symbol level
At which representations of objects are going to be defined in terms of symbols that
can be manipulated in programs.
Natural language (or English) is one way of representing and handling facts.
Logic enables us to consider the following fact:
'Spot is a dog', represented as dog(Spot).
We infer that all dogs have tails:
∀x: dog(x) → has_a_tail(x)
According to the logical conclusion:
has_a_tail(Spot)
Using a backward mapping function, the sentence
'Spot has a tail' can be generated.
The available mapping functions are not always one-to-one but are often
many-to-many, which is a characteristic of English representations. The sentences
'All dogs have tails' and 'Every dog has a tail' both say that each dog has a tail,
but from the first sentence one could also read that each dog has more than one
tail (try substituting 'teeth' for 'tails'). When an AI program manipulates the
internal representation of facts, the new representations should also be
interpretable as representations of facts.
Consider the classic problem of the mutilated chessboard. In a
normal chessboard, two diagonally opposite corner squares have been eliminated.
The given task is to cover all the squares on the remaining board with dominoes so
that each domino covers two squares. Overlapping of dominoes is not allowed.
Consider three data structures:
The first two data structures are shown in the diagrams above, and the third
data structure is the number of black squares and the number of white squares. The
first diagram loses the colour of the squares, and a solution is not easy to see. The
second preserves the colours, and the fact that the numbers of black and white
squares are not equal points towards the answer. Counting the number of squares
of each colour, 32 black and 30 white, gives the negative answer directly: a domino
must cover one white square and one black square, so the numbers of squares must
be equal for a positive solution.
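The counting argument can be checked mechanically. The sketch below (the coordinates and colour convention are our own assumptions) removes two same-coloured corner squares and compares the counts:

```python
# Mutilated chessboard: remove two diagonally opposite corners and
# count the remaining squares of each colour.

removed = {(0, 7), (7, 0)}   # opposite corners; both the same colour
squares = [(r, c) for r in range(8) for c in range(8)
           if (r, c) not in removed]

black = sum(1 for r, c in squares if (r + c) % 2 == 0)   # 32
white = len(squares) - black                             # 30

# Each domino covers one black and one white square, so a perfect
# tiling would need black == white; the unequal counts rule it out.
tiling_possible = (black == white)
```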
3.3.3 Using Knowledge
We have briefly discussed above where we can use knowledge. Let us consider how
knowledge can be used in various applications.
Learning
Acquiring knowledge is learning. It simply means adding new facts to a knowledge
base. New data may have to be classified prior to storage for easy retrieval, and
integrated with the existing facts for interaction and inference, while avoiding
redundancy and replication in the knowledge. These facts should also be kept up
to date.
Retrieval
The representation scheme used has a critical effect on the efficiency of
retrieval. Humans are very good at it, and many AI methods have tried to model them.
Reasoning
Get or infer facts from the existing data.
If a system only knows that:
Ravi is a Jazz musician.
All Jazz musicians can play their instruments well.
If the questions are like:
'Is Ravi a Jazz musician?' OR
'Can Jazz musicians play their instruments well?'
then the answer is easy to get from the data structures and procedures.
However, a question like
'Can Ravi play his instrument well?'
requires reasoning. The above activities are all related; for example, it is fairly
obvious that learning and reasoning involve retrieval.
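The difference between lookup and reasoning can be sketched as follows; the tuple encoding and the rule function are illustrative assumptions:

```python
# Stored facts answer some questions directly; others need one
# inference step through a rule.

facts = {("jazz_musician", "Ravi")}

def plays_well(person):
    """Rule: all jazz musicians can play their instruments well."""
    return ("jazz_musician", person) in facts

# "Is Ravi a Jazz musician?" -- direct lookup in the knowledge base.
is_jazz = ("jazz_musician", "Ravi") in facts

# "Can Ravi play his instrument well?" -- not stored; derived by the rule.
ravi_plays_well = plays_well("Ravi")
```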
'James is mortal'
is not sound, whereas
'Tom is mortal'
is sound.
3.4.1.3 Inferential efficiency
A knowledge representation scheme should be tractable, i.e., it should make
inferences in reasonable time. Unfortunately, any knowledge representation scheme
with interesting expressive power is not going to be efficient; often, the more general
knowledge representation schemes are less efficient. We have to use knowledge
representation schemes tailored to the problem domain, i.e., less general but more
efficient. It should also be possible to direct the inferential mechanisms in the most
productive directions by storing appropriate guides.
3.4.1.4 Acquisitional efficiency
New knowledge should be acquired using automatic methods wherever possible,
rather than relying on human intervention. To date, no single system optimizes all
of these properties. Now we will discuss some of the representation schemes.
3.4.1.5 Well-defined syntax and semantics
It should be possible to tell:
Whether any construction is grammatically correct.
How to read any particular construction, i.e. no ambiguity.
Thus a knowledge representation scheme should have well-defined syntax. It
should be possible to precisely determine for any given construction, exactly what
its meaning is. Thus a knowledge representation scheme should have well-defined
semantics. Here syntax is easy, but semantics is hard for a knowledge representation
scheme.
3.4.1.6 Naturalness
A knowledge representation scheme should closely correspond to our way of
thinking, reading and writing, and it should allow a knowledge engineer to read
and check the knowledge base. Simply put, knowledge representation is the
problem of representing what the computer knows.
3.4.1.7 Frame problem
The frame problem is the problem of representing the facts that change as well as
those that do not change. For example, consider a table with a plant on it under a
window. Suppose we move it to the centre of the room. Here we must infer that the
plant is now in the centre but the window is not. Frame axioms are used to describe
all the things that do not change when an operator is applied to go from one state,
say n, to the next state, n + 1.
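The effect of frame axioms can be sketched with a simple state update; the fact encoding and the move operator below are illustrative assumptions:

```python
# State n: facts before the operator is applied.
state_n = {("at", "table", "window"),
           ("on", "plant", "table"),
           ("at", "window", "wall")}

def move(state, obj, dest):
    """Apply a move operator: only 'at' facts about obj change;
    every other fact persists unchanged (the frame axioms)."""
    changed = {f for f in state if f[0] == "at" and f[1] == obj}
    return (state - changed) | {("at", obj, dest)}

# State n+1: the table (with the plant on it) is now at the centre;
# the window fact carries over untouched.
state_n1 = move(state_n, "table", "centre")
```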
3.4.2 Simple Relational Knowledge
A simple way to store facts.
Each fact about a set of objects is set out systematically in columns.
There is little opportunity for inference.
It provides the knowledge basis for inference engines.
[Figure: a semantic net in which Jazz and Avant-Garde Jazz are linked 'isa'
Musician, Musician is linked 'isa' Adult Male (age 35), and Ravi and John are
instances of Jazz and Avant-Garde Jazz respectively, each with 'bands' links.]
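Knowledge of this kind can be sketched as relational tables plus isa/instance links; the dictionaries and the attribute lookup below are illustrative assumptions in the spirit of simple relational knowledge:

```python
# Simple relational knowledge with isa/instance links and
# property inheritance up the hierarchy.

isa = {"Jazz": "Musician",
       "Avant-Garde Jazz": "Musician",
       "Musician": "Adult Male"}
instance = {"Ravi": "Jazz", "John": "Avant-Garde Jazz"}
attrs = {"Adult Male": {"age": 35}}

def lookup(entity, attr):
    """Climb the instance/isa chain until the attribute is found."""
    cls = instance.get(entity, entity)
    while cls is not None:
        if attr in attrs.get(cls, {}):
            return attrs[cls][attr]
        cls = isa.get(cls)
    return None

age_of_ravi = lookup("Ravi", "age")   # inherited from Adult Male
```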
3.5 ISSUES IN KNOWLEDGE REPRESENTATION
Disadvantages
1. We can't identify a major clue in some situations.
2. It is difficult to anticipate which clues are important and which are not.
(2) Revising the choice when necessary
Once we find a structure and it doesn't seem to be appropriate, we
opt for another choice. The different ways in which this can be done
are:
1. Select fragments of the current structure and match them against the
other alternatives available. Choose the best one.
2. Make an excuse for the failure of the current structure and continue to
use it. There are heuristics, such as the fact that a structure is more
appropriate if a desired feature is missing than if it has an
inappropriate feature.
Example: A person with one leg is more plausible than a person with a
tail.
3. Refer to specific links between the structures in order to follow new
directions to explore.
4. If the knowledge is represented in an isa hierarchy, move upward
until a structure is found that does not conflict with the evidence. Use
this structure if it provides the required knowledge, or create a new
structure below it.
3.7 SUMMARY
In this unit, you have learned about the various aspects of knowledge representation.
There is no perfect method of knowledge representation today. This stems largely
from our ignorance of just what knowledge is. Nevertheless, many methods have
been worked out and used by AI researchers.
You have also learned about mapping. The knowledge representation problem
concerns the mismatch between human and computer memory, i.e., how to encode
knowledge so that it is a faithful reflection of the expert's knowledge and can be
manipulated by computer. For mapping, we call these representations of knowledge
'knowledge bases', and the manipulative operations on these knowledge bases,
inference engine programs.
You have also learned about the frame problem which is a problem of
representing the facts that change as well as those that do not change.
In this unit, you have also studied knowledge representation and reasoning
formalism, which can lead to new conclusions from the knowledge we
have. It could be reasoned that we have enough knowledge to build a
knowledge-based agent. However, propositional logic is a weak language; there are many things
that cannot be expressed. To express knowledge about objects, their properties and
the relationships that exist between objects, we need a more expressive language:
first-order logic.
3.10 FURTHER READING
1. www.plato.stanford.edu
2. www.indiastudychannel.com
3. www.egeria.cs.cf.ac.uk
4. www.cs.cardiff.ac.uk
5. www.guthulamurali.freeservers.com
6. www.meanderingpassage.com
7. www.jpmf.house.cern.ch
8. www.mm.iit.uni-miskolc.hu
9. www.bookrags.com
10. www.stewardess.inhatc.ac.kr
11. www.seop.leeds.ac.uk
12. www.nwlink.com
Using Predicate Logic
4.0 INTRODUCTION
In the preceding unit, you learnt about the various aspects of knowledge
representation. In this unit, you will learn about the various concepts relating to the
use of predicate logic. You will also learn about how to use simple facts in logic.
The unit will also discuss resolution and natural deduction.
Predicate logic is one in which a statement from a natural language like English
is translated into symbolic structures comprising predicates, functions, variables,
constants, quantifiers and logical connectives.
The syntax of predicate logic is determined by the symbols that are allowed
and the rules of combination. The semantics is determined by the interpretations
assigned to predicates. The symbols and rules of combination of a predicate logic
are:
(1) Predicate Symbol
(2) Constant Symbol
(3) Variable Symbol
(4) Function Symbol
In addition to these symbols, parentheses, delimiters and commas are also used.
Predicate Symbol: This symbol is used to represent a relation in the domain
of discourse. For example, 'Valmiki wrote Ramayana' is written wrote(Valmiki,
Ramayana); here, wrote is a predicate. Similarly, 'Rama loves Sita' is written
loves(Rama, Sita).
Constant Symbol: A constant symbol is a simple term and is used to represent
objects or entities in the domain of discourse. These objects may be physical objects
or people or anything we name, e.g., Rama, Sita, Ramayana, Valmiki.
Variable Symbol: These symbols are permitted to have indefinite values
about the entity being referred to or specified, e.g., 'X loves Y' is written loves(X,
Y); here X and Y are variables.
Function Symbol: These symbols are used to represent a special type of
relationship or mapping, e.g., 'Rama's father is married to Rama's mother' is written
married(father(Rama), mother(Rama)); here father and mother are functions
and married is a predicate.
Constants, variables and functions are referred to as terms, and predicates are
referred to as atomic formulas or atoms. The statements in predicate logic are
termed well-formed formulas. A predicate with no variables is called a ground
atom.
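These symbol types can be sketched as nested tuples, with the first element naming the predicate or function and the rest its arguments; the encoding, and the convention of writing variables as '?x', are illustrative assumptions:

```python
# Predicate over constants: 'Valmiki wrote Ramayana'.
fact1 = ("wrote", "Valmiki", "Ramayana")

# Predicate over function terms:
# "Rama's father is married to Rama's mother".
fact2 = ("married", ("father", "Rama"), ("mother", "Rama"))

def is_ground(term):
    """A ground atom contains no variable symbols."""
    if isinstance(term, tuple):
        return all(is_ground(t) for t in term[1:])
    return not term.startswith("?")          # variables written as "?x"

ground1 = is_ground(fact1)                   # no variables
ground2 = is_ground(fact2)                   # functions of constants
ground3 = is_ground(("loves", "?x", "?y"))   # contains variables
```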
4.3.1 Logic
Logic is concerned with the truth of statements about the world.
Generally each statement is either TRUE or FALSE.
Logic includes: Syntax, Semantics and Inference Procedure.
Syntax: Specifies the symbols in the language and how they can be combined
to form sentences. The facts about the world are represented as sentences in logic.
Semantics: Specifies how to assign a truth value to a sentence based on its
meaning in the world. It specifies what facts a sentence refers to. A fact is a claim
about the world, and it may be TRUE or FALSE.
Inference Procedure: Specifies methods for computing new sentences from
existing sentences.
Note:
Facts are claims about the world that are True or False.
A representation is an expression (sentence) that stands for objects and relations.
Sentences can be encoded in a computer program.
Logic as a Knowledge Representation Language: Logic is a language for
reasoning, a collection of rules used while doing logical reasoning. Logics are
studied as KR languages in artificial intelligence.
Logic is a formal system in which the formulas or sentences have true
or false values.
The problem of designing a KR language is a tradeoff between a language
that is:
i. Expressive enough to represent important objects and relations in
a problem domain.
ii. Efficient enough in reasoning and answering questions about
implicit information in a reasonable amount of time.
Logics are of different types: propositional logic, predicate logic,
temporal logic, modal logic, description logic, etc.
They represent different things and allow more or less efficient inference.
Propositional logic and Predicate logic are fundamental to all logic.
Propositional Logic is the study of statements and their connectivity.
Predicate Logic is the study of individuals and their properties.
4.3.1.1 Logic representation
The Facts are claims about the world that are True or False.
Logic can be used to represent simple facts.
To build a Logic-based representation:
User defines a set of primitive symbols and the associated semantics.
Logic defines ways of putting symbols together so that user can define legal
sentences in the language that represent TRUE facts.
Logic defines ways of inferring new sentences from existing ones.
Sentences that are either TRUE or FALSE, but not both, are called propositions.
A declarative sentence expresses a statement with a proposition as content.
Example:
The declarative 'snow is white' expresses that snow is white; further, 'snow is
white' expresses that 'snow is white' is TRUE.
In this section, first Propositional Logic (PL) is briefly explained and then
the Predicate logic is illustrated in detail.
4.3.2 Propositional logic (PL)
A proposition is a statement, which in English would be a declarative sentence.
Every proposition is either TRUE or FALSE.
Examples: (a) The sky is blue. (b) Snow is cold. (c) 12 * 12 = 144.
Propositions are sentences that are either true or false, but not both.
A sentence is the smallest unit in propositional logic.
If a proposition is true, then its truth value is true;
if a proposition is false, then its truth value is false.
Example:
Sentence                        Truth value   Proposition (Y/N)
'Grass is green'                true          Yes
'2 + 5 = 5'                     false         Yes
'Close the door'                -             No
'Is it hot outside?'            -             No
'x > 2', where x is a variable  -             No (since x is not defined)
'x = x'                         -             No (we don't know what x and = are;
                                              '3 = 3', 'air is equal to air' or
                                              'water is equal to water' has no meaning)
Propositional logic is fundamental to all logic.
Propositional logic is also called Propositional calculus, sentential
calculus, or Boolean algebra.
Propositional logic describes the ways of joining and/or modifying entire
propositions, statements or sentences to form more complicated
propositions, statements or sentences, as well as the logical relationships
and properties that are derived from the methods of combining or altering
statements.
Statement, variables and symbols
These and a few more related terms, such as connective, truth value, contingencies,
tautologies, contradictions, antecedent, consequent and argument, are explained
below.
Statement: Simple statements (sentences), TRUE or FALSE, that do not
contain any other statement as a part, are basic propositions; lower-case letters,
p, q, r, are symbols for simple statements.
Large, compound or complex statements are constructed from basic
propositions by combining them with connectives.
Connective or Operator: The connectives join simple statements into
compounds, and join compounds into larger compounds.
Table 4.1 indicates the five basic connectives and their symbols,
listed in decreasing order of operation priority;
an operation with higher priority is solved first.
Example of a formula: ((a ∧ b) ∨ c → d) ↔ (a ∨ c)
Table 4.1 Connectives and Symbols in Decreasing Order of Operation Priority

Connective    Symbol   Read as
negation      ¬        'not'
conjunction   ∧        'and'
disjunction   ∨        'or'
implication   →        'implies'
equivalence   ↔        'if and only if'

Note: The propositions and connectives are the basic elements of propositional
logic.
Truth value
The truth value of a statement is its TRUTH or FALSITY.
Example:
p is either TRUE or FALSE; ¬p is either TRUE or FALSE;
p ∨ q is either TRUE or FALSE; and so on. Use T or 1 to mean TRUE and
F or 0 to mean FALSE.
Truth table defining the basic connectives:

p  q  ¬p  ¬q  p∧q  p∨q  p→q  p↔q  q→p
T  T  F   F   T    T    T    T    T
T  F  F   T   F    T    F    F    T
F  T  T   F   F    T    T    F    F
F  F  T   T   F    F    T    T    T
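The table can be regenerated with Python's boolean operators, encoding p → q as (not p) or q and p ↔ q as p == q:

```python
from itertools import product

def implies(p, q):
    """Material implication: p -> q is (not p) or q."""
    return (not p) or q

# One row per assignment, columns in the order of the table above:
# p, q, ~p, ~q, p^q, pvq, p->q, p<->q, q->p
rows = [(p, q, not p, not q, p and q, p or q,
         implies(p, q), p == q, implies(q, p))
        for p, q in product([True, False], repeat=2)]
```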
Tautologies
A proposition that is always true is called a tautology, e.g., (P ∨ ¬P) is always true
regardless of the truth value of the proposition P.
Contradictions
A proposition that is always false is called a contradiction, e.g., (P ∧ ¬P) is always
false, regardless of the truth value of the proposition P.
Contingencies
A proposition is called a contingency if it is neither a tautology nor a
contradiction, e.g., (P ∨ Q) is a contingency.
Antecedent, Consequent
In a conditional statement p → q, the first statement or 'if' clause (here p) is
called the antecedent, and the second statement or 'then' clause (here q) is called
the consequent.
Argument
Any argument can be expressed as a compound statement. Take all the premises,
conjoin them, and make that conjunction the antecedent of a conditional with
the conclusion as the consequent. This implication statement is called the
corresponding conditional of the argument.
Note:
Every argument has a corresponding conditional, and every implication
statement has a corresponding argument.
Because the corresponding conditional of an argument is a statement, it is
therefore either a tautology, or a contradiction or a contingency.
An argument is valid if and only if its corresponding conditional is a
tautology.
Two statements are consistent if and only if their conjunction is not a
contradiction.
Two statements are logically equivalent if and only if their truth table columns
are identical, i.e., if and only if the statement of their equivalence using '↔' is a
tautology.
Note
Truth tables are adequate to test validity, tautology, contradiction, contingency,
consistency and equivalence.
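Validity via the corresponding conditional can be tested by brute force over all truth assignments; the helper below is an illustrative sketch:

```python
from itertools import product

def is_tautology(formula, num_vars):
    """True if the formula holds under every truth assignment."""
    return all(formula(*vals)
               for vals in product([True, False], repeat=num_vars))

# Modus ponens: premises (p -> q) and p, conclusion q.
# Corresponding conditional ((p -> q) and p) -> q is a tautology,
# so the argument is valid.
valid = is_tautology(lambda p, q: not ((not p or q) and p) or q, 2)

# Affirming the consequent: premises (p -> q) and q, conclusion p.
# Its corresponding conditional is NOT a tautology (fails at p=F, q=T).
invalid = is_tautology(lambda p, q: not ((not p or q) and q) or p, 2)
```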
Here we will highlight major principles involved in knowledge representation.
In particular, predicate logic will be met in other knowledge representation schemes
and reasoning methods. The following are the standard logic symbols we use in this
topic:
There are two types of quantifiers:
1. ∀ : for all
2. ∃ : there exists
Connectives:
→ : implies
¬ : not
∨ : or
∧ : and
Let us now look at an example of how predicate logic is used to represent
knowledge. There are other ways but this form is popular.
Propositional logic is not powerful enough for all types of assertions.
Example: The assertion 'x > 1', where x is a variable, is not a proposition
because it is neither true nor false unless the value of x is defined. For 'x > 1' to
be a proposition,
either we substitute a specific number for x,
or change it to something like
'There is a number x for which x > 1 holds,'
or 'For every number x, x > 1 holds.'
Consider the following example:
All men are mortal.
Socrates is a man.
Then Socrates is mortal.
These cannot be expressed in propositional logic as a finite and logically
valid argument (formula).
We need languages that allow us to describe properties (predicates) of objects,
or a relationship among objects represented by the variables. Predicate logic satisfies
the requirements of a language.
Predicate logic is powerful enough for expression and reasoning.
Predicate logic is built upon the ideas of propositional logic.
Predicate:
Every complete sentence contains two parts: a subject and a predicate. The subject
is what (or whom) the sentence is about. The predicate tells something about the
subject;
Example:
Consider the sentence 'Judy runs'.
The subject is Judy and the predicate is 'runs'.
A predicate always includes a verb and tells something about the subject. A
predicate is a verb-phrase template that describes a property of objects or a relation
among objects represented by the variables.
Example:
The car Tom is driving is blue;
The sky is blue;
The cover of this book is blue.
The predicate 'is blue' describes a property.
Predicates are given names; let B be the name for the predicate 'is blue'.
A sentence is represented as B(x), read as 'x is blue';
x represents an arbitrary object.
Predicate logic expressions:
The propositional operators combine predicates, as in
If (p(....) && ( !q(....) || r (....) ) )
Examples of logic operators: disjunction (OR) and conjunction (AND).
Consider the expression, with the respective logic symbols || and &&:
x < y || (y < z && z < x)
Substituting truth values, the form true || (true && true) evaluates to true by
the truth table.
With the assignment x = 3, y = 2, z = 1, the expression becomes
3 < 2 || (2 < 1 && 1 < 3)
which is false.
Predicate Logic Quantifiers
* read 'for all x in a, p holds'
* a is the universe of discourse
* x is a member of the domain of discourse
* p is a statement about x
In propositional form, it is written as: ∀x P(x)
* read 'for all x, P(x) holds' or
'for each x, P(x) holds' or
'for every x, P(x) holds', where P(x) is a predicate, ∀x means all the objects x in
the universe, and P(x) is true for every object x in the universe.
Example: English language to propositional form
* All cars have wheels
∀x: car x has wheels
* ∀x P(x), where the predicate P(x) tells 'x has wheels'; x is a variable for the
objects (cars) that populate the universe of discourse.
Apply the Existential quantifier 'There Exists' (∃)
Existential quantification allows us to state that an object exists without naming it.
Existential quantification: ∃x: a · p
* read 'there exists an x such that p holds'
* a is the universe of discourse
* x is a member of the domain of discourse
* p is a statement about x
In propositional form it is written as: ∃x P(x)
* read 'there exists an x such that P(x)' or 'there exists at least one x such that P(x)'
* where P(x) is a predicate, ∃x means at least one object x in the universe, and
P(x) is true for at least one object x in the universe.
Example: English language to propositional form
* Someone loves you
∃x: x loves you
* ∃x P(x), where the predicate P(x) tells 'x loves you'; x is a variable for the
object (someone) that populates the universe of discourse.
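Over a finite universe of discourse, ∀ and ∃ behave like Python's all() and any(); the domains and predicates below are illustrative assumptions:

```python
# Universal quantification over a finite domain: all cars have wheels.
cars = ["sedan", "truck", "coupe"]
def has_wheels(x):
    return True                      # true of every car in the domain

forall_holds = all(has_wheels(x) for x in cars)      # forall x P(x)

# Existential quantification: someone loves you.
people = ["Anu", "Ravi", "Sita"]
def loves_you(x):
    return x == "Sita"               # true of at least one person

exists_holds = any(loves_you(x) for x in people)     # exists x P(x)
```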
Formula:
In mathematical logic, a formula is a type of abstract object, a token of which is a
symbol or string of symbols which may be interpreted as any meaningful unit in a
formal language.
Terms:
Terms are defined recursively as variables, constants, or functions like f(t1, . . . , tn),
where f is an n-ary function symbol and t1, . . . , tn are terms. Applying predicates
to terms produces atomic formulas.
Atomic formulas:
4.5 COMPUTABLE FUNCTIONS AND PREDICATES
The objective is to define a class of functions C computable in terms of a base
set F. This is expressed as C{F}, and is explained below using two examples:
(1) evaluating factorial n, and
(2) an expression for triangular functions.
Ex: (1) A conditional expression to define factorial n, or n!
Expression:
If p1 then e1, else if p2 then e2, . . . , else if pn then en,
i.e., (p1 → e1, p2 → e2, . . . , pn → en)
Here p1, p2, . . . , pn are propositional expressions taking the values T or F for
true and false respectively.
The value of (p1 → e1, p2 → e2, . . . , pn → en) is the value of the e corresponding
to the first p that has value T.
The expressions defining n! recursively are:
n! = n × (n − 1)! for n ≥ 1
5! = 1 × 2 × 3 × 4 × 5 = 120
0! = 1
The above definition incorporates the convention that the product of no numbers
is 1, i.e., 0! = 1; only then does the recursive relation
(n + 1)! = n! × (n + 1) work for n = 0.
Now, use conditional expressions
n! = (n = 0 → 1, n ≠ 0 → n · (n − 1)!)
to define functions recursively.
Ex: Evaluate 2! according to the above definition.
2! = (2 = 0 → 1, 2 ≠ 0 → 2 · (2 − 1)!)
   = 2 × 1!
   = 2 × (1 = 0 → 1, 1 ≠ 0 → 1 · (1 − 1)!)
   = 2 × 1 × 0!
   = 2 × 1 × (0 = 0 → 1, 0 ≠ 0 → 0 · (0 − 1)!)
   = 2 × 1 × 1
   = 2
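The conditional expression transcribes directly into a recursive Python function:

```python
def factorial(n):
    # n! = (n = 0 -> 1, n != 0 -> n * (n - 1)!)
    return 1 if n == 0 else n * factorial(n - 1)

result = factorial(5)   # 1 x 2 x 3 x 4 x 5 = 120
```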
4.6 RESOLUTION
Resolution is a procedure used in proving that arguments that are expressible in
predicate logic are correct. It produces proofs by refutation or contradiction, and
it provides a theorem-proving technique for sentences in propositional logic and
first-order logic.
Resolution is a purely syntactic, uniform proof procedure. It does not
consider what predicates may mean, but only what logical conclusions may be
derived from the axioms. The advantage is that resolution is universally applicable
to problems that can be described in first-order logic. The disadvantage is that
resolution by itself cannot make use of any domain-dependent heuristics. Despite
many attempts to improve the efficiency of resolution, it often takes exponential
time.
Resolution is a rule of inference.
Resolution is a computerized theorem prover.
Resolution so far has been defined only for propositional logic.
The strategy is that the resolution techniques of propositional logic be adopted
in predicate logic.
1. A proof in predicate logic is composed of a variety of processes including
modus ponens, the chain rule, substitution, matching, etc.
2. However, we have to take variables into account; e.g., parent_of(Ram,
Sam) AND ¬parent_of(Ram, Lakshman) cannot be resolved.
3. In other words, the resolution method has to look inside predicates to
discover if they can be resolved.
4.6.1 Algorithm: Convert to Clausal Form
1. Eliminate →:
P → Q ≡ ¬P ∨ Q
2. Reduce the scope of each ¬ to a single term:
¬(P ∨ Q) ≡ ¬P ∧ ¬Q
¬(P ∧ Q) ≡ ¬P ∨ ¬Q
¬∀x: P ≡ ∃x: ¬P
¬∃x: P ≡ ∀x: ¬P
¬¬P ≡ P
3. Standardize variables so that each quantifier binds a unique variable:
(∀x: P(x)) ∨ (∃x: Q(x)) ≡ (∀x: P(x)) ∨ (∃y: Q(y))
4. Move all quantifiers to the left without changing their relative order:
(∀x: P(x)) ∨ (∃y: Q(y)) ≡ ∀x: ∃y: (P(x) ∨ Q(y))
5. Eliminate ∃ (Skolemization):
∃x: P(x) becomes P(c), where c is a Skolem constant
∀x: ∃y: P(x, y) becomes ∀x: P(x, f(x)), where f is a Skolem function
6. Drop ∀:
∀x: P(x) becomes P(x)
7. Convert the formula into a conjunction of disjuncts:
(P ∧ Q) ∨ R ≡ (P ∨ R) ∧ (Q ∨ R)
8. Create a separate clause for each conjunct.
9. Standardize apart the variables in the set of obtained clauses.
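The first two steps of this conversion can be sketched in code. The following Python sketch implements step 1 (eliminate →) and step 2 (drive ¬ inward) on a small formula representation; the tuple encoding of formulas is our own illustrative assumption, not part of the original text.

```python
# Formulas as nested tuples: ('->', p, q), ('and', p, q), ('or', p, q),
# ('not', p), ('forall', var, p), ('exists', var, p), or a string atom.

def eliminate_implications(f):
    """Step 1: rewrite P -> Q as (not P) or Q, recursively."""
    if isinstance(f, str):
        return f
    op = f[0]
    if op == '->':
        return ('or', ('not', eliminate_implications(f[1])),
                eliminate_implications(f[2]))
    if op in ('and', 'or'):
        return (op, eliminate_implications(f[1]), eliminate_implications(f[2]))
    if op == 'not':
        return ('not', eliminate_implications(f[1]))
    if op in ('forall', 'exists'):
        return (op, f[1], eliminate_implications(f[2]))
    raise ValueError(op)

def push_negation(f):
    """Step 2: reduce the scope of each negation to a single term."""
    if isinstance(f, str):
        return f
    op = f[0]
    if op == 'not':
        g = f[1]
        if isinstance(g, str):
            return f                      # already a literal
        if g[0] == 'not':                 # ~~P == P
            return push_negation(g[1])
        if g[0] == 'and':                 # ~(P & Q) == ~P | ~Q
            return ('or', push_negation(('not', g[1])),
                    push_negation(('not', g[2])))
        if g[0] == 'or':                  # ~(P | Q) == ~P & ~Q
            return ('and', push_negation(('not', g[1])),
                    push_negation(('not', g[2])))
        if g[0] == 'forall':              # ~forall x: P == exists x: ~P
            return ('exists', g[1], push_negation(('not', g[2])))
        if g[0] == 'exists':              # ~exists x: P == forall x: ~P
            return ('forall', g[1], push_negation(('not', g[2])))
    if op in ('and', 'or'):
        return (op, push_negation(f[1]), push_negation(f[2]))
    if op in ('forall', 'exists'):
        return (op, f[1], push_negation(f[2]))
    return f

f = eliminate_implications(('->', ('and', 'P', 'Q'), 'R'))
print(push_negation(f))   # ('or', ('or', ('not', 'P'), ('not', 'Q')), 'R')
```

Running the sketch on (P ∧ Q) → R yields ¬P ∨ ¬Q ∨ R, matching the equivalences given in steps 1, 2 and 7.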
Example 1:
∀x: [Roman(x) ∧ know(x, Marcus)] → [hate(x, Caesar) ∨ (∀y: ∃z: hate(y, z) → thinkcrazy(x, y))]
1. Eliminate →:
∀x: ¬[Roman(x) ∧ know(x, Marcus)] ∨ [hate(x, Caesar) ∨ (∀y: ¬∃z: hate(y, z) ∨ thinkcrazy(x, y))]
2. Reduce the scope of ¬:
∀x: [¬Roman(x) ∨ ¬know(x, Marcus)] ∨ [hate(x, Caesar) ∨ (∀y: ∀z: ¬hate(y, z) ∨ thinkcrazy(x, y))]
3. Standardize variables:
∀x: P(x) ∨ ∀x: Q(x) converts to ∀x: P(x) ∨ ∀y: Q(y)
4. Move quantifiers.
Resolution Example 1
2. P(f(y)) [Copy of ¬Conclusion]
3. □ [1, 2 Resolution {x/f(y)}]
Resolution Example 2
|- ∃x ∀y ∀z ((P(y) ∨ Q(z)) → (P(x) ∨ Q(x)))
1. P(f(x)) ∨ Q(g(x)) [¬Conclusion]
2. ¬P(x) [¬Conclusion]
3. ¬Q(x) [¬Conclusion]
4. ¬P(y) [Copy of 2]
5. Q(g(x)) [1, 4 Resolution {y/f(x)}]
6. ¬Q(z) [Copy of 3]
7. □ [5, 6 Resolution {z/g(x)}]
4.6.2 The Basis of Resolution
The resolution procedure is a simple iterative process: at every step, two clauses (called the parent clauses) are compared, and a new clause is inferred from them. The new clause represents ways in which the two parent clauses interact with each other.
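This interaction of two parent clauses can be illustrated in Python. The encoding (clauses as sets of string literals, with '~' marking negation) and the helper names are our own illustrative assumptions, not from the original text.

```python
def complement(lit):
    """The complementary literal: ~P for P, and P for ~P."""
    return lit[1:] if lit.startswith('~') else '~' + lit

def resolvents(c1, c2):
    """All clauses obtainable by resolving parent clauses c1 and c2
    on one complementary pair of literals."""
    out = []
    for lit in c1:
        if complement(lit) in c2:
            # Drop the complementary pair, keep everything else.
            out.append((c1 - {lit}) | (c2 - {complement(lit)}))
    return out

# Resolving P | Q with ~Q | R yields the new clause P | R.
rs = resolvents(frozenset({'P', 'Q'}), frozenset({'~Q', 'R'}))
print(rs == [frozenset({'P', 'R'})])   # True
```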
4.6.2.1 Soundness and completeness
The notation p |= q is read 'p entails q'; it means that q holds in every model in which p holds.
The notation p |-m q means that q can be derived from p by some proof mechanism m.
A proof mechanism m is sound if p |-m q implies p |= q.
A proof mechanism m is complete if p |= q implies p |-m q.
Resolution for the predicate calculus is:
Sound: If the empty clause is derived by resolution, then the original set of clauses is unsatisfiable.
Complete: If a set of clauses is unsatisfiable, resolution will eventually derive the empty clause. However, this is a search problem and may take a very long time.
We are generally not willing to give up soundness, since we want our conclusions to be valid. We might be willing to give up completeness: if a sound proof procedure will prove the theorem we want, that is enough.
Resolution: Overview
Another type of proof system, based on refutation.
Better suited to computer implementation than systems of axioms and rules (it can give correct 'no' answers).
Generalizes to first-order logic.
The basis of Prolog's inference method.
To apply resolution, all formulae in the knowledge base and the query must be in clausal form (cf. Prolog clauses).
Resolution Rule of Inference
The resolution rule: from P ∨ Q and ¬Q ∨ R, infer P ∨ R.
If the empty clause is derived, answer yes (the query follows from the knowledge base); otherwise answer no (the query does not follow from the knowledge base).
Check Your Progress
2. List four types of logic.
3. Define atomic formula.
4. List the two natural deduction methods.
Resolution: Example 1
(G ∨ H) → (¬J ∧ ¬K), G |- ¬J
Clausal form of (G ∨ H) → (¬J ∧ ¬K) is
{¬G ∨ ¬J, ¬H ∨ ¬J, ¬G ∨ ¬K, ¬H ∨ ¬K}
1. ¬G ∨ ¬J [Premise]
2. ¬H ∨ ¬J [Premise]
3. ¬G ∨ ¬K [Premise]
4. ¬H ∨ ¬K [Premise]
5. G [Premise]
6. J [¬Conclusion]
7. ¬G [1, 6 Resolution]
8. □ [5, 7 Resolution]
Resolution: Example 2
P → Q, Q → R |- P → R
Recall P → R ≡ ¬P ∨ R
Clausal form of ¬(P → R) is {P, ¬R}
1. ¬P ∨ Q [Premise]
2. ¬Q ∨ R [Premise]
3. P [¬Conclusion]
4. ¬R [¬Conclusion]
5. Q [1, 3 Resolution]
6. R [2, 5 Resolution]
7. □ [4, 6 Resolution]
Resolution: Example 3
((P ∨ Q) ∧ ¬P) → Q
Clausal form of ¬(((P ∨ Q) ∧ ¬P) → Q) is {P ∨ Q, ¬P, ¬Q}
1. P ∨ Q [¬Conclusion]
2. ¬P [¬Conclusion]
3. ¬Q [¬Conclusion]
4. Q [1, 2 Resolution]
5. □ [3, 4 Resolution]
Soundness and Completeness Again
Resolution refutation is sound, i.e. it preserves truth (if a set of premises are all true, any conclusion drawn from those premises must also be true).
Resolution refutation is complete, i.e. it is capable of proving all consequences of any knowledge base (not shown here).
For propositional logic, resolution refutation is also decidable, i.e. there is an algorithm implementing resolution which, when asked whether S |- P, can always answer yes or no (correctly).
Heuristics in Applying Resolution
Clause elimination: certain types of clauses can be disregarded.
Pure clauses: contain a literal L where ¬L does not appear elsewhere.
Tautologies: clauses containing both L and ¬L.
Subsumed clauses: another clause exists containing a subset of the literals.
Ordering strategies:
Resolve unit clauses (those with only one literal) first.
Start with the query clauses.
Aim to shorten clauses.
4.6.2.2 Resolution strategies
Different strategies have been tried for selecting the clauses to be resolved. These include:
a. Level saturation or two-pointer method: the outer pointer starts at the negated conclusion; the inner pointer starts at the first clause. The two clauses denoted by the pointers are resolved, if possible, with the result added to the end of the list of clauses. The inner pointer is incremented to the next clause until it reaches the outer pointer; then the outer pointer is incremented and the inner pointer is reset to the front. The two-pointer method is a breadth-first method that will generate many duplicate clauses.
b. Set of support: one clause in each resolution step must be part of the negated conclusion or a clause derived from it. This can be combined with the two-pointer method by putting the clauses from the negated conclusion at the end of the list. Set of support keeps the proof process focused on the theorem to be proved rather than trying to prove everything.
c. Unit preference: clauses are prioritized, with unit clauses preferred, or more generally shorter clauses preferred. Our goal, the empty clause, has zero literals and can only be obtained by resolving two unit clauses; resolution with a unit clause makes the result smaller.
d. Linear resolution: one clause in each step must be the result of the previous step. This is a depth-first strategy. It may be necessary to back up to a previous clause if no resolution with the current clause is possible.
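The set-of-support strategy in (b) can be sketched as a simple loop. This is an illustrative Python sketch under our own clause encoding (sets of string literals, '~' for negation), not a production prover.

```python
def complement(lit):
    """The complementary literal: ~P for P, and P for ~P."""
    return lit[1:] if lit.startswith('~') else '~' + lit

def resolvents(c1, c2):
    """All clauses obtainable by resolving c1 and c2 on one pair."""
    out = set()
    for lit in c1:
        if complement(lit) in c2:
            out.add(frozenset((c1 - {lit}) | (c2 - {complement(lit)})))
    return out

def refute_sos(axioms, negated_goal):
    """Set-of-support resolution refutation: every resolution step
    uses a clause drawn from the set of support, which initially
    holds the negated goal and then its descendants.
    Returns True if the empty clause is derived."""
    usable = set(axioms)
    sos = set(negated_goal)
    while sos:
        given = sos.pop()
        usable.add(given)
        new = set()
        for other in usable:
            for r in resolvents(given, other):
                if not r:
                    return True      # empty clause: contradiction found
                if r not in usable and r not in sos:
                    new.add(r)
        sos |= new
    return False

# From P -> Q and P, with negated goal ~Q: refutation succeeds, so Q follows.
axioms = [frozenset({'~P', 'Q'}), frozenset({'P'})]
print(refute_sos(axioms, [frozenset({'~Q'})]))   # True
```

Because every step involves a descendant of the negated goal, the search stays focused on the theorem, exactly as strategy (b) describes.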
4.6.3 Resolution in Propositional Logic
Consider the following axioms, with their conversion to clause form:
P gives P (1)
(P ∧ Q) → R gives ¬P ∨ ¬Q ∨ R (2)
(S ∨ T) → Q gives ¬S ∨ Q (3)
and ¬T ∨ Q (4)
T gives T (5)
To prove R, add its negation ¬R to the set of clauses and resolve:
¬P ∨ ¬Q ∨ R with ¬R gives ¬P ∨ ¬Q
¬P ∨ ¬Q with P gives ¬Q
¬T ∨ Q with ¬Q gives ¬T
¬T with T gives the empty clause, a contradiction; hence R follows.
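The derivation above can be replayed mechanically. In this sketch the encoding (clauses as Python sets of literals, '~' for negation) and the resolve helper are our own illustrative assumptions.

```python
def resolve(c1, c2, lit):
    """Resolve c1 (containing lit) with c2 (containing its complement)."""
    neg = lit[1:] if lit.startswith('~') else '~' + lit
    assert lit in c1 and neg in c2
    return (c1 - {lit}) | (c2 - {neg})

c2_ = {'~P', '~Q', 'R'}                   # clause (2): (P & Q) -> R
c4_ = {'~T', 'Q'}                         # clause (4)

step1 = resolve({'~R'}, c2_, '~R')        # {'~P', '~Q'}
step2 = resolve(step1, {'P'}, '~P')       # {'~Q'}
step3 = resolve(step2, c4_, '~Q')         # {'~T'}
step4 = resolve(step3, {'T'}, '~T')       # empty clause
print(step4)    # set()
```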
4.6.4 Unification
We can get the inference immediately if we can find a substitution θ such that King(x) and Greedy(x) match King(John) and Greedy(y):
θ = {x/John, y/John} works.
In general, Unify(α, β) = θ if αθ = βθ. Some examples:
p = Knows(John, x), q = Knows(John, Jane): θ = {x/Jane}
p = Knows(John, x), q = Knows(y, OJ): θ = {x/OJ, y/John}
p = Knows(John, x), q = Knows(y, Mother(y)): θ = {y/John, x/Mother(John)}
p = Knows(John, x), q = Knows(x, OJ): fail
The last pair fails because x would have to be bound to both John and OJ. Standardizing apart eliminates this overlap of variables, e.g. rewriting the second literal as Knows(z17, OJ).
To unify Knows(John, x) and Knows(y, z): θ = {y/John, x/z} or θ = {y/John, x/John, z/John}.
The first unifier is more general than the second.
There is a single most general unifier (MGU) that is unique up to renaming of variables:
MGU = {y/John, x/z}
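A unification algorithm along these lines can be sketched in Python. The term encoding (lower-case strings for variables, capitalised strings for constants, tuples for compound terms) is our own illustrative assumption.

```python
def unify(x, y, theta=None):
    """Most general unifier of two terms.
    Returns a substitution dict, or None on failure."""
    if theta is None:
        theta = {}
    if x == y:
        return theta
    if isinstance(x, str) and x[0].islower():     # x is a variable
        return unify_var(x, y, theta)
    if isinstance(y, str) and y[0].islower():     # y is a variable
        return unify_var(y, x, theta)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for a, b in zip(x, y):                    # unify argument by argument
            theta = unify(a, b, theta)
            if theta is None:
                return None
        return theta
    return None                                   # constant clash

def unify_var(var, t, theta):
    if var in theta:
        return unify(theta[var], t, theta)
    if occurs(var, t, theta):                     # occurs check
        return None
    return {**theta, var: t}

def occurs(var, t, theta):
    if t == var:
        return True
    if isinstance(t, str) and t[0].islower() and t in theta:
        return occurs(var, theta[t], theta)
    if isinstance(t, tuple):
        return any(occurs(var, a, theta) for a in t)
    return False

print(unify(('Knows', 'John', 'x'), ('Knows', 'y', ('Mother', 'y'))))
# {'y': 'John', 'x': ('Mother', 'y')}
```

Note that the returned binding x/Mother(y), taken together with y/John, corresponds to the substitution x/Mother(John) in the table once the bindings are composed. The pair Knows(John, x) and Knows(x, OJ) correctly fails.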
4.6.5 Resolution in Predicate Logic
Algorithm: Resolution
1. Convert all the propositions of F to clause form.
2. Negate P and convert the result to clause form. Add it to the set of clauses obtained in step 1.
3. Repeat until a contradiction is found, no progress can be made, or a predetermined amount of effort has been expended:
a. Select two clauses. Call these the parent clauses.
b. Resolve them together. The resolvent will be the disjunction of all the literals of both parent clauses, with appropriate substitutions performed, and with the following exception: if there is one pair of literals T1 and ¬T2 such that one of the parent clauses contains T1 and the other contains ¬T2, and if T1 and T2 are unifiable, then neither T1 nor ¬T2 should appear in the resolvent. If there is more than one pair of complementary literals, only one pair should be omitted from the resolvent.
c. If the resolvent is the empty clause, then a contradiction has been found.
If it is not, then add it to the set of clauses available to the procedure.
Example:
1. Marcus was a man.
2. Marcus was a Pompeian.
3. All Pompeians were Romans.
4. Caesar was a ruler.
5. All Pompeians were either loyal to Caesar or hated him.
6. Everyone is loyal to someone.
7. People only try to assassinate rulers they are not loyal to.
8. Marcus tried to assassinate Caesar.
Axioms in clause form are:
1. man(Marcus)
2. Pompeian(Marcus)
3. ¬Pompeian(x1) ∨ Roman(x1)
4. ruler(Caesar)
5. ¬Roman(x2) ∨ loyalto(x2, Caesar) ∨ hate(x2, Caesar)
6. loyalto(x3, f1(x3))
7. ¬man(x4) ∨ ¬ruler(y1) ∨ ¬tryassassinate(x4, y1) ∨ ¬loyalto(x4, y1)
8. tryassassinate(Marcus, Caesar)
NOTES Prove: hate(Marcus, Caesar)
Begin with the negated goal, ¬hate(Marcus, Caesar), and resolve:
with 5, {Marcus/x2}: ¬Roman(Marcus) ∨ loyalto(Marcus, Caesar)
with 3, {Marcus/x1}: ¬Pompeian(Marcus) ∨ loyalto(Marcus, Caesar)
with 2: loyalto(Marcus, Caesar)
with 7, {Marcus/x4, Caesar/y1}: ¬man(Marcus) ∨ ¬ruler(Caesar) ∨ ¬tryassassinate(Marcus, Caesar)
with 1, 4 and 8 in turn: the empty clause
The contradiction establishes hate(Marcus, Caesar).
Fig. 4.3 An Unsuccessful Attempt at Resolution
4.8 SUMMARY
In this unit, you have learned how predicate logic can be used as the basis of a technique for knowledge representation, and how resolution is applied in knowledge representation and problem solving. There are two ways to approach the computational goal. The first is to search for good heuristics that can inform a theorem-proving program. The second is to change the representation of the data given to the program, rather than the program itself. A difficulty with the use of theorem proving in AI systems is that some kinds of information are not easily represented in predicate logic. Consider the following examples:
It is very cool today. How can a relative degree of coolness be represented?
Black-haired people often have grey eyes. How can the amount of certainty be represented?
It is better to have more pieces on the board than the opponent has. How can we represent this kind of heuristic information?
I know Dravid thinks Sachin will win, but I think they are going to lose. How can several different belief systems be represented at once?
For these examples, the representations we have seen so far are not adequate. They primarily involve knowledge bases that are incomplete, although other problems also exist, such as the difficulty of representing continuous phenomena in a discrete system. You will learn about their solutions in the next units. You have also learned in this unit that:
First-order logic allows us to speak about objects, properties of objects
and relationships between objects.
It also allows quantification over variables.
First-order logic is quite an expressive knowledge representation language;
much more so than propositional logic.
However, you need to add things like equality if you wish to be able to do things like counting.
You have also traded decidability for expressiveness.
How much of a problem is this?
If you add the (Peano) axioms for arithmetic, then you encounter Gödel's famous incompleteness theorem (which is beyond the scope of this course).
You have now investigated one knowledge representation and reasoning
formalism.
This means you can draw new conclusions from the knowledge you have:
you can reason.
You have enough to build a knowledge-based agent.
However, you have also learned that propositional logic is a weak language;
there are many things that cannot be expressed.
To express knowledge about objects, their properties and the relationships that exist between objects, you need a more expressive language: first-order logic.
e. Use resolution to answer the question 'What food does Myke eat?'
3. Trace the operation of the unification algorithm on each of the following
pairs of literals.
NOTES
a. f(Marcus) and f(Caesar)
b. f(x) and f(g(x))
c. f(Marcus, g(x, y)) and f(x, g(Caesar, Marcus))
4. Suppose that we are attempting to resolve the following clauses:
loves(mother(a), a)
¬loves(y, x) ∨ loves(x, y)
a. What will be the result of the unification algorithm?
b. What must be generated as a result of resolving these two clauses?
c. What does the example show about the order in which the substitutions
determined by the unification procedure must be performed?
5. Assume the following facts:
Steve only likes easy courses.
Computing courses are hard.
All courses in Sociology are easy.
'Society is Evil' is a Sociology course.
Represent these facts in predicate logic and answer the question:
What course would Steve like?
6. Find out what knowledge representation schemes are used in the STRIPS
system.
7. Translate the following sentences into propositional logic:
(i) If Jane and John are not in town we will play tennis.
(ii) It will either rain today or it will be dry today.
(iii) You will not pass this course unless you study.
To do the translation you will need to
(a) Identify a scheme of abbreviation
(b) Identify logical connectives
8. Convert the following formulae into Conjunctive Normal Form (CNF):
(i) P → Q
(ii) (P → ¬Q) → R
(iii) ¬(P ∧ ¬Q) → (¬R ∨ ¬Q)
9. Show using the truth table method that the following inferences are valid:
(i) P → Q; ¬Q |= ¬P
(ii) P → Q |= ¬Q → ¬P
(iii) P → Q; Q → R |= P → R
10. Repeat question 9 using resolution. In this case we want to show:
(i) P → Q; ¬Q |- ¬P
(ii) P → Q |- ¬Q → ¬P
(iii) P → Q; Q → R |- P → R
11. Determine whether the following sentences are valid (i.e., tautologies) using truth tables:
(i) ((P ∨ Q) ∧ ¬P) → Q
(ii) ((P → Q) ∧ ¬(P → R)) → (P → Q)
(iii) ¬(¬P ∧ P) ∧ P
(iv) (P ∨ Q) → ¬(¬P ∧ ¬Q)
12. Repeat question 11 using resolution. In this case we aim to show:
(i) ((P ∨ Q) ∧ ¬P) → Q
(ii) ((P → Q) ∧ ¬(P → R)) → (P → Q)
(iii) ¬(¬P ∧ P) ∧ P
(iv) (P ∨ Q) → ¬(¬P ∧ ¬Q)
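Questions 9 and 11 can be checked mechanically by enumerating all truth assignments. A small Python sketch (the helper names are our own, not from the text):

```python
from itertools import product

def implies(p, q):
    """Material implication: p -> q."""
    return (not p) or q

def valid(formula, names):
    """A formula is valid (a tautology) iff it is true under
    every assignment of truth values to its variables."""
    return all(formula(*vals)
               for vals in product([True, False], repeat=len(names)))

# Question 11 (i): ((P | Q) & ~P) -> Q is a tautology.
print(valid(lambda P, Q: implies((P or Q) and not P, Q), 'PQ'))   # True
# Question 11 (iii): ~(~P & P) & P is not (it is false when P is false).
print(valid(lambda P: (not (not P and P)) and P, 'P'))            # False
```

The same helper settles question 9; for example, (P → Q) ∧ ¬Q → ¬P checks out as valid under all four assignments.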
UNIT 5 WEAK SLOT AND FILLER STRUCTURES
Structure
5.0 Introduction
5.1 Unit Objectives
5.2 Semantic Nets
5.2.1 Representation in a Semantic Net
5.2.2 Inference in a Semantic Net
5.2.3 Extending Semantic Nets
5.3 Frames
5.3.1 Frame Knowledge Representation
5.3.2 Distinction between Sets and Instances
5.4 Summary
5.5 Key Terms
5.6 Answers to Check Your Progress
5.7 Questions and Exercises
5.8 Further Reading
5.0 INTRODUCTION
In the preceding unit, you learnt the various concepts relating to the use of predicate
logic and how to use simple facts in logic. In this unit, you will learn about the
various concepts about weak slot and filler structures.
There are two attributes that are of very general significance, the use of which
you have already studied in the previous units: instance and isa. These attributes
are important because they support property inheritance. They are called a variety
of things in AI systems, but the names do not matter. What does matter is that they
represent class membership and inclusion and that class inclusion is transitive. In
slot and filler systems, these attributes are usually represented explicitly. In logic-
based systems, these relationships may be represented this way or they may be
represented implicitly by a set of predicates describing a particular class.
This is introduced as a device to support property inheritance along isa and
instance links. This is an important aspect of these structures. Monotonic inheritance
can be performed substantially more efficiently with such structures than with pure
logic, and non-monotonic inheritance is easily supported. The reason that inheritance
is easy is that the knowledge in slot and filler systems consists of structures as a set
of entities and their attributes. These structures turn out to be useful for other reasons
besides the support for inheritance.
Weak slot and filler structures are a data structure and the reasons to use this data
structure are as follows:
It enables attribute values to be retrieved quickly.
Assertions are indexed by the entities.
Binary predicates are indexed by first argument.
Properties of relations are easy to describe.
It allows ease of consideration as it embraces aspects of object-oriented programming.
This structure is called a weak slot and filler structure because:
A slot is an attribute value pair in its simplest form.
A filler is a value that a slot can take: it could be a numeric, string (or any other data type) value, or a pointer to another slot.
A weak slot and filler structure does not consider the content of the
representation.
We describe two kinds of such structures: semantic nets and frames. In this unit, you will learn about the representation of these structures and techniques for reasoning with them. We call these knowledge-poor structures 'weak', by analogy with the weak methods for problem solving.
A simple semantic net (Fig. 5.1) might contain:
Person isa Mammal
Person has_part Head
Dhoni instance Person
Dhoni team India
Dhoni team_colours Black/Blue
These values can also be represented in logic as: isa(person, mammal), instance(Dhoni, person), team(Dhoni, India).
We have already seen how conventional predicates such as lecturer(Rao) can be written as instance(Rao, lecturer). Recall that isa and instance represent inheritance and are popular in many knowledge representation schemes. But we have a problem: how can we have predicates of more than two places in semantic nets? E.g. score(India, Australia, 236). Solution:
Create new nodes to represent new objects either contained or alluded to in the knowledge, game and fixture in the current example. Relate information to nodes and fill up slots (see Fig. 5.2).
Fixture 3 isa Cricket Match
Fixture 3 home-team India
Fixture 3 away-team Australia
Fixture 3 score 236
Fig. 5.2 A Semantic Network for n-Place Predicate
As a more complex example consider the sentence: 'Sam gave Ravi a book.' Here we have several aspects of an event (see Fig. 5.3):
event1 instance gave
event1 agent Sam
event1 object book13
event1 receiver Ravi
Another example:
bird action fly
bird has_part wings
emu instance bird
emu action run
In making certain inferences we will also need to distinguish between the kind of link that defines a new entity and holds its value, and the kind of link that relates two existing entities. Consider the example shown, where the height of two people is depicted and we also wish to compare them:
Sam height 160
Ravi height 170
We need extra nodes for the concept as well as its value. Special procedures are needed to process these nodes, but without this distinction the analysis would be very limited. With value nodes:
Sam height H1
Ravi height H2
H1 value 160
H2 value 170
H2 greater than H1
A network for an elephant instance:
elephant skin grey
elephant tail 1
elephant trunk 1
elephant legs 4
e1 isa elephant
e1 name clyde
Using inheritance
To find the value of a property of e1, first look at e1.
If the property is not attached to that node, climb the isa link to the node's parent and search there.
isa signifies set membership; ako signifies the subset relation.
Repeat, using isa/ako links, until the property is found or the inheritance hierarchy is exhausted.
Sets of things in a semantic network are termed types; individual objects are termed instances.
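The inheritance lookup just described can be sketched with the net from Fig. 5.1 stored as a Python dictionary; the encoding is our own illustrative assumption, not from the text.

```python
# A semantic net as a dict of nodes, each mapping link names to values.
net = {
    'mammal': {'has_part': 'head'},
    'person': {'isa': 'mammal'},
    'Dhoni':  {'instance': 'person', 'team': 'India',
               'team_colours': 'Black/Blue'},
}

def get_property(node, prop):
    """Look at the node; if the property is absent, climb the
    instance/isa links until it is found or the hierarchy ends."""
    while node is not None:
        slots = net.get(node, {})
        if prop in slots:
            return slots[prop]
        node = slots.get('instance', slots.get('isa'))
    return None

print(get_property('Dhoni', 'team'))       # India
print(get_property('Dhoni', 'has_part'))   # head
```

The second lookup shows inheritance in action: has_part is found only after climbing from Dhoni through person to mammal.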
Examples of Semantic Networks
State: I own a tan leather chair.
chair ako furniture
chair part seat
me isa person
my-chair isa chair
my-chair owner me
my-chair colour tan
my-chair covering leather
tan isa brown
A give event:
event7 isa give
event7 agent John
event7 object book23
book23 isa book
Mary isa person
Bilbo isa hobbit
hobbit ako person
A partitioned space for a belief (John believes that the earth is flat):
event1 instance believes
event1 agent John
event1 object space1
Within space1: object1 instance earth, prop1 instance flat, object1 has_property prop1
Now consider the quantified expression: 'Every parent loves their child.' To represent this we:
Create a general statement, GS, as a special class.
Make node g an instance of GS.
Every element will have at least two attributes:
a form that states which relation is being asserted;
one or more ∀ (forall) or ∃ (exists) connections; these represent universally quantified variables in such statements, e.g. x, y in
∀x: parent(x) → ∃y: child(y) ∧ loves(x, y)
Here we have to construct two spaces, one for each of x and y.
Note: We can express the variables as existentially quantified variables and express the event of love as having an agent p and receiver b for every parent p, which could simplify the network.
Also, if we change the sentence to 'Every parent loves a child', then the node of the object being acted on (the child) lies outside the form of the general statement.
Thus it is not viewed as an existentially quantified variable whose value may depend on the agent. So we could construct a partitioned network as in Figure 5.8.
Fig. 5.8 A Partitioned Semantic Network (the general statement gs2 has its form in space2, a forall connection to the parent node p1 in space1, and an exists connection to the child node c2, with agent and receiver links)
1. A dog bit a mail carrier:
b instance Bite, with assailant d (an instance of Dogs) and victim m (an instance of Mail Carriers).
2. Every dog has bitten a mail carrier:
a general statement GS whose space SA contains the assertion that Dogs Bite a mail carrier.
3. Every dog in town has bitten the mail carrier:
as above, but the quantified set is Town Dogs, a subset of Dogs; the bite assertion (d assailant, m victim) sits inside the space SA of the general statement g and its form.
Frames can inherit slots from parent frames. For example, Man might inherit properties from Ape or Mammal (a parent class of Man).
Properties
Frames implement semantic networks.
They add procedural attachment.
A frame has slots and slots have values.
A frame may be generic, i.e. it describes a class of objects.
A frame may be an instance, i.e. it describes a particular object.
Frames can inherit properties from generic frames.
Frames are a variant of nets that are one of the most popular ways of
representing non-procedural knowledge in an expert system. In a frame, all the
information relevant to a particular concept is stored in a single complex entity,
called a frame. Superficially, frames look pretty much like record data structures.
However frames, at the very least, support inheritance. They are often used to capture
knowledge about typical objects or events, such as a typical bird or a typical restaurant
meal.
We could represent some knowledge about elephants in frames as follows:
Mammal
subclass: Animal
warm_blooded: yes
Elephant
subclass: Mammal
* colour: grey
* size: large
Clyde
instance: Elephant
colour: pink
owner: Fred
Nellie:
instance: Elephant
size: small
A particular frame (such as Elephant) has a number of attributes or slots such
as colour and size where these slots may be filled with particular values, such
as grey. We have used a * to indicate those attributes that are only true of a typical
member of the class, and not necessarily every member. Most frame systems will
let you distinguish between typical attribute values and definite values that must be
true.
[Rich & Knight in fact distinguish between attribute values that are true of
the class itself, such as the number of members of that class, and typical attribute
values of members]
In the above frame system, we would be able to infer that Nellie was small,
grey and warm blooded. Clyde is large, pink and warm blooded and owned by Fred.
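The elephant frames above, including the default (*) slots and the inferences about Nellie and Clyde, can be sketched as Python dictionaries; the encoding and the lookup helper are our own illustrative assumptions, not from the text.

```python
# Frames as dicts; a '*' prefix marks a default (typical) slot,
# as in the textbook notation above.
frames = {
    'Mammal':   {'subclass': 'Animal', 'warm_blooded': 'yes'},
    'Elephant': {'subclass': 'Mammal', '*colour': 'grey', '*size': 'large'},
    'Clyde':    {'instance': 'Elephant', 'colour': 'pink', 'owner': 'Fred'},
    'Nellie':   {'instance': 'Elephant', 'size': 'small'},
}

def lookup(frame, slot):
    """Own (definite) value first; otherwise inherit a definite or
    default value from parent frames via instance/subclass links."""
    while frame in frames:
        f = frames[frame]
        if slot in f:
            return f[slot]
        if '*' + slot in f:
            return f['*' + slot]
        frame = f.get('instance', f.get('subclass'))
    return None

print(lookup('Nellie', 'colour'))        # grey  (default, inherited)
print(lookup('Clyde', 'colour'))         # pink  (own value overrides default)
print(lookup('Clyde', 'warm_blooded'))   # yes   (inherited from Mammal)
```

This reproduces the inferences stated in the text: Nellie is small and grey, while Clyde is large, pink and warm blooded.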
Objects and classes inherit the properties of their parent classes UNLESS they have an individual property value that conflicts with the inherited one.
Inheritance is simple where each object/class has a single parent class, and where slots take single values. If slots may take more than one value, it is less clear whether to block inheritance when you have more specific information. For example, if you know that a mammal has_part head, and that an elephant has_part trunk, you may still want to infer that an elephant has a head. It is therefore useful to label slots according to whether they take single or multiple values.
If objects/classes have several parent classes (e.g. Clyde is both an elephant
and a circus animal), then you may have to decide which parent to inherit from
(maybe, elephants are by default wild, but circus animals are by default tame).
There are various mechanisms for making this choice, based on choosing the most
specific parent class to inherit from.
In general, both slots and slot values may themselves be frames. Allowing
slots to be frames means that we can specify various attributes of a slot. We might
want to say, for example, that the slot size always must take a single value of type
size-set (where size-set is the set of all sizes). The slot owner may take multiple
values of type person (Clyde could have more than one owner). We could specify
this in the frames:
Size:
instance: Slot
single_valued: yes
range: Size-set
Owner:
instance: Slot
single_valued: no
range: Person
The attribute value Fred (and even large and grey, etc.) could be represented
as a frame, e.g:
Fred:
instance: Person
occupation: Elephant-breeder
One final useful feature of frame systems is the ability to attach procedures to slots. So, if we don't know the value of a slot, but know how it could be calculated, we can attach a procedure to be used, if needed, to compute the value of that slot. Maybe we have slots representing the length and width of an object and sometimes need to know the object's area: we would write a (simple) procedure to calculate it, and put that in place of the slot's value. Such mechanisms of procedural attachment are useful, but perhaps should not be overused, or else our nice frame system would consist mainly of lots of procedures interacting in an unpredictable fashion.
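Procedural attachment of this kind can be sketched with a Python property: if the area slot has no stored filler, an if-needed procedure computes it from length and width. The BoxFrame class is a hypothetical example of our own, not from the text.

```python
class BoxFrame:
    """A frame whose 'area' slot carries an attached procedure:
    when no value is stored, the value is computed on demand."""

    def __init__(self, length, width, area=None):
        self.length = length
        self.width = width
        self._area = area              # stored slot filler, if any

    @property
    def area(self):
        if self._area is not None:
            return self._area          # use the stored filler
        return self.length * self.width   # if-needed procedure

b = BoxFrame(length=4, width=3)
print(b.area)   # 12, computed on demand
```

A stored filler, if one is supplied, takes precedence over the attached procedure, mirroring the "use the stored value if present, otherwise compute it" behaviour described above.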
Frame systems, in all their full glory, are pretty complex and sophisticated things. More details are available in [Rich & Knight, sec 9.2] or [Luger & Stubblefield, sec 9.4.1] (or in less detail in [Bratko]). The main idea to get clear is the notion of inheritance and default values. Most of the other features are developed to support inheritance reasoning in a flexible but principled manner. As we saw for nets, it is easy to get confused about what slots and objects really mean. In frame systems we partly get round this by distinguishing between default and definite values, and by allowing users to make slots first-class citizens, giving a slot particular properties by writing a frame-based definition.
Advantages of frames
1. A frame collects information about an object in a single place in an organized
fashion.
2. By relating slots to other kinds of frames, a frame can represent typical
structures involving an object; these can be very important for reasoning based
on limited information.
3. Frames provide a way of associating knowledge with objects.
4. Frames may be a relatively efficient way of implementing AI applications.
5. Frames allow data that are stored and data that are computed to be treated in a uniform manner. For example, a student's class might be stored, while a grade might be computed from the marks.
Disadvantages of frames
1. Slot fillers must be real data. For example, it is not possible to say that Ram
is a player or a student, since there is no way to deal with a disjunction in a
slot filler.
2. It is not possible to quantify over slots. For example, there is no way to
represent some students make 100 on the Examination.
3. It is necessary to repeat the same information to make it usable from different
viewpoints, since methods are associated with slots or particular object types.
For example, it may be easy to answer, Whom does Sam love?, but hard to
answer Who loves Nisha?
Frames are good for applications in which the structure of the data and the
available information are relatively fixed. The procedures or methods associated
with slots provide a good way to associate knowledge with a class of objects. Frames
are not particularly good for applications in which deep deductions are made from
a few principles as in theorem proving.
Object oriented programming and frame systems have much in common:
Class / Instance structure
Inheritance from higher classes
Ability to associate programs with classes and call them automatically
Because of these similarities it is not possible to draw a hard distinction
between object oriented systems and frames. However, there are some differences
that are typically found.
Frame systems typically have the following features in contrast to the typical
object oriented system:
1. Richer Slots: A slot can contain more kinds of information. E.g.:
documentation, default value, restrictions on slot fillers.
2. Slot orientation: Usually there are no messages that are not associated
with slots.
3. More complex structure: A frame system could potentially have instance
values, default values, and if needed, methods for the same slot. When all
Self-Instructional
Material 125
Weak Slot and Filler these are combined with multiple inheritance paths, the result can be
Structures
complex.
There are certain problems with frames:
Inheritance causes trouble.
Inference tends to become complex and ill-structured.
Frames are good for representing static situations; they are not so good for representing things that change over time.
An example of a simple frame is shown below:
(Ravi
  (profession (value professor))
  (age (value 42))
  (wife (value Rani))
  (children (value A B))
  (address (street (value Nagarampalem))
           (city (value Guntur))
           (state (value AP))
           (PIN (value 522 004))))
The general frame template structure is shown below:
(<frame name>
  (<slot 1> (<facet 1> <value 1> ... <value k1>)
            (<facet 2> <value 1> ... <value k2>))
  (<slot 2> (<facet 1> <value 1> ... <value km>))
  ...)
From this general structure it can be seen that a frame may have any number of slots, and a slot may have any number of facets, each with a number of values. This provides a general framework from which to build a variety of knowledge structures. The slots in a frame may hold general or specific knowledge.
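The Ravi frame above can be sketched as nested Python dictionaries, mapping each slot to its facets and their values; the encoding is an illustrative assumption of our own.

```python
# The Ravi frame as nested dicts: slot -> facet -> list of values.
ravi = {
    'profession': {'value': ['professor']},
    'age':        {'value': [42]},
    'wife':       {'value': ['Rani']},
    'children':   {'value': ['A', 'B']},
    'address':    {'street': {'value': ['Nagarampalem']},
                   'city':   {'value': ['Guntur']},
                   'state':  {'value': ['AP']},
                   'PIN':    {'value': ['522 004']}},
}

print(ravi['children']['value'])           # ['A', 'B']
print(ravi['address']['city']['value'])    # ['Guntur']
```

Each slot can hold any number of facets, and each facet any number of values, matching the general template.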
Now we will discuss:
Frame knowledge Representation
Interpreting frames
5.3.1 Frame Knowledge Representation
Consider the example first discussed in Semantic Nets:
Person
isa: Mammal
Cardinality:
Adult-Male
isa: Person
Cardinality:
Cricket-Player
isa: Adult-Male
Cardinality:
Height:
Weight:
Position:
Team:
Team-Colours:
Back
isa: Cricket-Player
Cardinality:
Tries:
Dhoni
instance: Back
Height: 6-0
Position: Centre
Team: India
Team-Colours: Black
Cricket-Team
isa: Team
Cardinality:
Team-size: 10
Coach:
Indian-Team
instance: Cricket-Team
Team-size: 10
Coach: Amarnadh
Players: {Sachin, Yuvaraj Singh, Srinath, Harbajan, ...}
Here, the frames Person, Adult-Male, Cricket-Player and Cricket-Team are all classes, and the frames Dhoni and Indian-Team are instances.
Note:
The isa relation is in fact the subset relation.
The instance relation is in fact set membership (element of).
The isa attribute possesses a transitivity property. This implies: Dhoni is a Back, a Back is a Cricket-Player, who in turn is an Adult-Male and also a Person.
Both isa and instance have inverses, which may be called subclasses and all-instances.
There are attributes that are associated with the class or set such as cardinality
and on the other hand there are attributes that are possessed by each member
of the class or set.
5.3.2 Distinction between Sets and Instances
It is important that this distinction is clearly understood.
India can be thought of as a set of players or as an instance of a Cricket-Team.
If India were a class then
its instances would be players
it could not be a subclass of Cricket-Team; otherwise, its elements would
be members of Cricket-Team (that is, teams), which we do not want.
Instead we make it a subclass of Cricket-Player; this allows the players to
inherit the correct properties, while still enabling the Indian team to inherit
information about teams.
This means that Indian is an instance of Cricket-Team.
But there is a problem here:
A class is a set and its elements have properties.
We wish to use inheritance to bestow values on its members.
But there are properties that the set or class itself has such as the manager
of a team.
This is why we need to view Indian as a subset of one class players and an
instance of teams. We seem to have a CATCH 22. Solution:
MetaClasses.
A metaclass is a special class whose elements are themselves classes.
Now consider our cricket teams as:
Class
instance: Class
isa: Class
Cardinality:
Team
Instance: class
Isa: class
Cardinality: the number of teams
Team Size: 10
Cricket-Team
Isa: Team
Cardinality: the number of cricket teams
Team Size: 10
Coach: Amarnath
India
Instance: Cricket-team
Team Size: 10
Coach: Amarnath
Amarnath
Instance: Back
Height: 6-0
Position: Slip
Team: India
Team-colours: Blue
Fig. 5.10 A Metaclass Frame System
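The metaclass idea, with India simultaneously an instance of Cricket-Team (so team-level slots like Coach apply to India itself) and a subclass of Cricket-Player (so player properties flow down to its members), can be sketched as follows. This is a hypothetical encoding; the two helper names are our own:

```python
frames = {
    "Cricket-Team":   {"Team-Size": 10},
    "Cricket-Player": {"Team-Colours": "Blue"},
    "India": {
        "instance": "Cricket-Team",  # team-level facts apply to India itself
        "isa": "Cricket-Player",     # player facts apply to India's members
        "Coach": "Amarnath",
    },
    "Sachin": {"instance": "India"},
}

def class_attr(name, slot):
    """A slot on the object itself or on the class it is an instance of."""
    for n in (name, frames[name].get("instance")):
        if n and slot in frames[n]:
            return frames[n][slot]
    return None

def member_attr(name, slot):
    """A slot a member inherits via its class's isa chain."""
    n = frames[name].get("instance")
    while n:
        if slot in frames[n]:
            return frames[n][slot]
        n = frames[n].get("isa")
    return None

print(class_attr("India", "Team-Size"))       # a fact about the team itself
print(member_attr("Sachin", "Team-Colours"))  # a fact inherited by each player
```

The two lookup functions make the set/instance distinction concrete: Team-Size reaches India through its instance link, while Team-Colours reaches Sachin through India's isa link.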
5.4 SUMMARY
In this unit, you have learned the various concepts about weak slot and filler
structures. There are two attributes that are of very general significance, the use of
which you have already learned in the previous units. These attributes are important
because they support property inheritance. This unit also discussed in detail about
semantic nets and frames. The idea of a frame system as a way to represent declarative
knowledge has been encapsulated in a series of frame oriented representation
languages, whose features have evolved and been driven by an increased
understanding of the representation issues involved.
Semantic networks start out by encoding relationships like isa and ako but
can represent quite complicated information on this simple basis.
Frames organize knowledge around concepts considered to be of interest
(like person and patient in the example code).
A frame can be a generic frame (or template) or an instance frame.
Frames also allow procedural attachment; that is, demons can be attached
to slots so that the mere act of creating a frame or accessing a slot can cause
significant computation to be performed.
STRONG SLOT AND FILLER STRUCTURES
Structure
6.0 Introduction
6.1 Unit Objectives
6.2 Conceptual Dependency
6.3 Scripts
6.4 CYC
6.4.1 Motivation
6.5 CYCL
6.6 Global Ontology
6.7 Summary
6.8 Key Terms
6.9 Answers to Check Your Progress
6.10 Questions and Exercises
6.11 Further Reading
6.0 INTRODUCTION
In the previous unit, you learnt about slot and filler structures in very general outlines.
In this unit, you will learn about strong slot and filler structures. Individual semantic
networks and frame systems may have specialized links and inference procedures,
but there are no hard and fast rules about what kinds of objects and links are good
in general for knowledge representation. Such decisions are left up to the builder of
the semantic network or frame system.
Strong slot and filler structures typically:
Represent links between objects according to more rigid rules
Provide specific notions of what types of objects and relations between them
are allowed
Represent knowledge about common situations
We have two types of strong slot and filler structures:
1. Conceptual Dependency (CD)
2. Scripts
For a specific need, we can place the restrictions on the structure, that is, we can
have a meaningful structure with specific structural elements and primitives. This
will enable us to derive specific methods for using the structure.
Conceptual dependencies for dealing with natural languages
Low-level representation capturing textual information
Primitives:
Actions: transfer, movement, mental, sensory
Objects; modifiers of actions; modifiers of objects
Tenses, directionality
Note that many sentences can form the same CD because they convey the
same information, just expressed differently:
I gave the man a book.
I gave the book to a man.
That was the man that I gave the book to.
If a sentence is ambiguous, the Conceptual Dependency must also contain
the ambiguity, or we could create multiple Conceptual Dependencies, one
for each interpretation.
CDs can be very lengthy for even a very simple idea.
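The paraphrase property can be sketched in Python by building the same dependency structure for each surface sentence. This is a hypothetical encoding (real CD uses graphical arrow notation), and the stub parse simply stands in for a genuine CD parser:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CD:
    actor: str      # the PP performing the act
    act: str        # a primitive act, e.g. ATRANS (transfer of possession)
    obj: str
    source: str
    recipient: str

def parse(sentence):
    # Stand-in for a real CD parser: all three paraphrases below would
    # yield this one conceptualization.
    return CD(actor="I", act="ATRANS", obj="book", source="I", recipient="man")

sentences = ["I gave the man a book.",
             "I gave the book to a man.",
             "That was the man that I gave the book to."]
distinct_cds = {parse(s) for s in sentences}
print(len(distinct_cds))  # 1: the three sentences share one CD
```

Collapsing the three surface forms into a single frozen dataclass value is exactly the point of CD: inference rules need only be written once, against the canonical structure.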
Scripts for planning or reasoning over stereotypical events.
Captures the typical sequence of actions along with the actors and props
of a typical event.
Includes: Preconditions, Post conditions, roles, props, settings and
variations.
Used primarily for answering the questions and summarizing the input
text.
What about a typical action?
CYC an experiment dealing with common sense knowledge
Most systems have no common sense reasoning abilities at all; this has
been a criticism of AI.
CYC An experiment to program common sense knowledge.
Use simple inferencing strategies (e.g. resolution) to derive new common
sense facts.
CYC systems consist of perhaps 10 million individual common sense
facts.
Effort of 5 years and many people.
Uses a frame-based representation and is capable of generating new
generalizations.
Functional Representation for reasoning over device functions and behaviour.
How do we represent functionality and device behaviour? How do we
reason about how things work?
We can use a functional based representation to perform hypothetical
reasoning.
Elements of a functional representation:
i. Devices are described by a Function, structure and behaviour.
ii. Function: The purposes of component X.
iii. Structure: How the components are physically connected.
iv. Behaviour: The sequence of states that X undergoes to bring about the
function.
Examples of Functional Representation:
Physical Objects:
Doorbell Ringer System
Chemical Processing Unit
Human Body subsystem (e.g. Gate)
Abstract Objects:
Computer program
Combat Mission Plan
Diagnostic reasoning process
Uses of functional representations
Given functional representation of how something works
Generate a fault tree for diagnosis
Perform Hypothetical reasoning
Produce predictions and simulations
Perform redesign
11. Represents the relationship between a PP and a state in which it started and
another in which it ended.
12. The two forms of the rule describe the cause of an action and the cause of a
state change.
13. Describes the relationship between a conceptualization and the time at which
the event it describes occurred.
The two-way arrow (⇔) links an object (actor), PP, and an action, ACT, i.e.
PP ⇔ ACT. The triple arrow (⇛) is also a two-way link, but between an object,
PP, and its attribute, PA, i.e. PP ⇛ PA. It represents isa-type dependencies,
e.g. Rao ⇛ lecturer (Rao is a lecturer).
Primitive states are used to describe many state descriptions such as height,
health, mental state, physical state. There are many more physical states than
primitive actions. They use a numeric scale.
E.g. John ⇛ height (+10)          John is the tallest
John ⇛ height (< average)         John is short
Frank ⇛ health (-10)              Frank is dead
Dave ⇛ mental_state (-10)         Dave is sad
Vase ⇛ physical_state (-10)       the vase is broken
You can also specify things like the time of occurrence in the relationship.
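The numeric state scale can be sketched as a small table plus a predicate. This is an illustrative sketch; the dictionary keys and the helper name is_dead are our own choices, using the conventional health(-10) encoding of death from the examples above:

```python
# Primitive states on a numeric scale, mirroring the examples above.
states = {
    ("John", "height"):          +10,  # John is the tallest
    ("Frank", "health"):         -10,  # Frank is dead
    ("Dave", "mental_state"):    -10,  # Dave is sad
    ("Vase", "physical_state"):  -10,  # the vase is broken
}

def is_dead(person):
    """health(-10) is the conventional encoding of death."""
    return states.get((person, "health")) == -10

print(is_dead("Frank"))  # True
print(is_dead("John"))   # False: no health(-10) fact is recorded
```

Keeping states numeric rather than symbolic is what lets CD compare them (e.g. "< average") and describe transitions between them.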
Self-Instructional
Material 139
For example: 'John gave Mary the book yesterday.'
[CD diagram: John ⇔ (p) ATRANS, with object (o) Book and a recipient link (R)
to Mary, from John]
Now, let us consider a more complex sentence: 'Since smoking can kill you, I
stopped.' Let's look at how we represent the inference that smoking can kill:
Use the notion of 'one' (an arbitrary actor) to apply the knowledge to.
Use the primitive act of INGESTing smoke from a cigarette, applied to 'one'.
Killing is a transition from being alive to dead. We use triple arrows to indicate
a transition from one state to another.
Use a conditional (c) causality link. The triple arrow indicates dependency
of one concept on another.
[CD diagram: one ⇔ INGESTs smoke, with (R) from a cigarette; a conditional
causality (c) link shows the state of 'one' changing to health(-10), i.e. from
being alive to being dead]
Advantages of CD
Using these primitives involves fewer inference rules.
Many inference rules are already represented in the CD structure.
The holes in the initial structure help to focus on the points still to be
established.
Disadvantages of CD
Knowledge must be decomposed into fairly low-level primitives.
Impossible or difficult to find the correct set of primitives.
A lot of inference may still be required.
Representations can be complex even for relatively simple actions. Consider:
Dave bet Frank five pounds that Wales would win the Rugby World Cup.
Complex representations require a lot of storage.
Applications of CD
MARGIE
(Meaning Analysis, Response Generation and Inference on English) A model of
natural language understanding.
SAM
(Script Applier Mechanism) Scripts to understand stories (see the next section).
PAM
(Plan Applier Mechanism) Uses plans and goals to understand stories. Schank
et al. developed all of the above.
6.3 SCRIPTS
A script is a structure that prescribes a set of circumstances which could be expected
to follow on from one another.
It is similar to a thought sequence or a chain of situations which could be
anticipated. It could be considered to consist of a number of slots or frames but with
more specialized roles.
A Script can be viewed as a frame for a sequence of actions. Understanding
natural language often requires knowledge of typical sequences.
Example:
Shyam went to a restaurant.
He ordered an exclusive dinner.
He had forgotten his wallet.
He had to wash dishes.
The sequence of sentences mentions only the parts of the story that are different
from what might otherwise be expected. Understanding such a sequence requires
knowledge of a restaurant script that specifies typical sequences of actions involved
in going to a restaurant.
In contrast to the relatively static slots of a frame, a script may have a directed
graph of events composing the script.
A script is a structure that describes a stereotyped sequence of events in a
particular context. A script consists of slots. Each slot contains some information
about the kinds of values it may contain, as well as a default value to be used if no
other information is available.
Scripts are frame-like structures used to represent commonly occurring
situations such as going to the movies, shopping in a supermarket, eating in a
restaurant, visiting a dentist, reading about baseball or politics. Scripts are used in
natural language understanding systems to organize a knowledge base in terms of
the situations that the system is to understand.
When people enter a restaurant they know what to expect and how to act.
They are met at the entrance by restaurant staff or by a sign indicating that they
should continue in and be directed to a seat. Either a menu is available at the table
or presented by the waiter or the customer asks for it. These are the routines for
ordering food, eating, paying and leaving.
In fact the restaurant script is quite different from other eating scripts, such as
the fast-food model or the formal family meal. In the fast-food model the customer
enters, gets in line to order, pays for the meal (before eating), waits for a
tray with the food, takes the tray, tries to find a clean table, and so on. These are
two different stereotyped sequences of events, and each has a potential script.
Scripts are beneficial because:
Events tend to occur in known runs or patterns.
Causal relationships between events exist.
Entry conditions exist which allow an event to take place.
Prerequisites exist for events taking place, e.g. when a student progresses
through a degree scheme or when a purchaser buys a house.
The components of a script include:
Entry Conditions
Entry conditions are descriptors of the world that must be true for the script to be
called. From the above script, these include an open restaurant and a hungry customer
who has some money; in other words, conditions that must, in general, be satisfied before the
events described in the script can occur.
Results
Those results or facts that are true once the script has terminated. For example, the
customer is full and poorer, and the restaurant owner has more money; in other words, conditions that
will, in general, be true after the events described in the script have occurred.
Props
Props are the things that support the content of the script. These might include
tables, waiters and menus. The set of props supports reasonable default assumptions
about the situation; a restaurant is assumed to have tables and chairs, unless stated
otherwise.
OR
These are the slots representing the objects that are involved in the events
described in the scripts. The presence of these objects can be inferred even if they
are not mentioned explicitly.
Roles
Roles are the actions that the individual participants perform. The waiter takes the
order, delivers food and presents the bill. The customers order, eat and pay.
OR
These are the slots representing the people, who are involved in the events
described in the script. If specific individuals are mentioned, they can be inserted
into the appropriate slots.
Track
The specific variation on one of the more general patterns that is represented by
this particular script.
Scenes
The script is broken into a sequence of scenes, each of which presents a temporal
aspect of the script: the sequence of events that occur. Events are represented in
conceptual dependency form. In the restaurant script there is entering, ordering, eating, etc.
Example:
Restaurant Script Structure
SCRIPT : Restaurant
TRACK : Oberoi
PROPS : Tables, Chairs, Menu, Money, Food
ROLES : Customer, Waiter, Cashier, Owner, Cook
Entry Conditions
(a) Customer is hungry
(b) Customer has money
Scene 1:
Entering
(a) Customer PTRANS into Restaurant.
(b) Customer ATTEND eyes to the tables.
(c) Customer MBUILD where to sit.
(d) Waiter PTRANS customer to an empty table.
(e) Customer MOVES to sitting position.
Scene 2:
Ordering
(a) Waiter PTRANS the menu.
(b) Customer MBUILD choice of food.
(c) Customer MTRANS signal to waiter.
(d) Waiter PTRANS to table.
(e) Customer MTRANS I want food to waiter.
Scene 3:
Eating
(a) Cook ATRANS food to waiter.
(b) Waiter ATRANS food to customer.
(c) Customer INGESTS food.
(Option: Return to scene 2 to order more: otherwise, go to scene 4)
Scene 4:
Exiting
(a) Customer MTRANS to waiter.
(b) Waiter PTRANS to customer.
(c) Waiter ATRANS bill to customer.
(d) Customer ATRANS tip to waiter.
(e) Customer PTRANS to cashier.
(f) Customer ATRANS money to cashier.
(g) Customer PTRANS out of restaurant.
Results
(a) Customer is no longer hungry.
(b) Customer is satisfied.
(c) Owner gets money.
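A script like the one above can be sketched as plain Python data, with simple questions answered by finding the scene in which an event occurs. This is a minimal sketch: the dictionary layout and the helper name scene_of are our own, and only a few events from each scene are encoded:

```python
restaurant_script = {
    "props": ["Tables", "Chairs", "Menu", "Money", "Food"],
    "roles": ["Customer", "Waiter", "Cashier", "Owner", "Cook"],
    "entry": ["Customer is hungry", "Customer has money"],
    "scenes": {  # each scene is a list of (actor, primitive act, detail)
        "Entering": [("Customer", "PTRANS", "into restaurant")],
        "Ordering": [("Customer", "MTRANS", "'I want food' to waiter")],
        "Eating":   [("Waiter", "ATRANS", "food to customer"),
                     ("Customer", "INGEST", "food")],
        "Exiting":  [("Customer", "ATRANS", "money to cashier")],
    },
    "results": ["Customer is no longer hungry", "Owner gets money"],
}

def scene_of(script, actor, act):
    """Return the scene in which a given (actor, act) event occurs."""
    for scene, events in script["scenes"].items():
        if any(a == actor and p == act for a, p, _ in events):
            return scene
    return None

# 'When did the customer eat?' is answered from the script, not the story:
print(scene_of(restaurant_script, "Customer", "INGEST"))  # Eating
```

This is how scripts fill gaps in stories: even if a story never mentions paying, the Exiting scene licenses the inference that money changed hands.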
Example 2:
A Supermarket script structure:
SCRIPT : Food market
TRACK : Super market
PROPS : Shopping cart, Market items, Check-out stands, Money
ROLES : Shopper, Dairy Attendant, Seafood Attendant, Check-out
Clerk, Sacking Clerk, Cashier, Other Shoppers
Entry Condition:
(a) Shopper needs groceries.
(b) Food market is open.
Scene 1:
(a) Shopper enters into market.
(b) Shopper PTRANS from one location to another into market.
(c) Shopper PTRANS shopping_cart to shopper.
Scene 2:
(a) Shopper shops for items.
(b) Shopper moves (MOVES).
(c) Shopper focuses eyes on display items (ATTEND).
(d) Shopper transfers items to shopping_cart (PTRANS).
Scene 3:
(a) Check-out.
(b) Shopper MOVES shopper to check-out stand.
(c) Shopper WAITS for Shopper turn.
(d) Shopper ATTENDS eyes to charge.
(e) Shopper ATRANS money to Cashier.
(f) Sales Boy ATRANS bags to Shopper.
Scene 4:
(a) Exit market
(b) Shopper PTRANS shopper to exit market.
Results
(a) Shopper has less money.
(b) Shopper has grocery items.
(c) Market has less grocery items.
(d) Market has more money.
Scripts are useful in describing certain situations such as robbing a bank. This might
involve:
Getting a gun
Holding up a bank
Escaping with the money
Here the Props might be
Gun, G
Loot, L
Bag, B
Getaway car, C
The Roles might be:
Robber, R
Cashier, M
Bank Manager, O
Policeman, P
The Entry Conditions might be:
R is poor.
R is destitute.
The Results might be:
R has more money.
O is angry.
M is in a state of shock.
P is shot.
There are 3 scenes: obtaining the gun, robbing the bank and the getaway.
Script: Robbery
Track: Successful Snatch
Props:
G = Gun
L = Loot
B = Bag
C = Get away Car
Roles:
R = Robber
M = Cashier
O = Bank Manager
P = Police Man
Entry Conditions:
R is poor
R is destitute
Scene 1:
Getting a gun
R PTRANS R into Gun Shop.
R MBUILD R choice of G.
R MTRANS choice.
R ATRANS buys G.
Scene 2:
Holding up the bank
R PTRANS R into bank.
R ATTEND eyes M, O and P.
R MOVE R to M positions.
R GRASP G
R MOVE G to point M.
R MTRANS 'Give me the money or else' to M.
P MTRANS 'Hold it! Hands up!' to R.
R PROPEL Shoots G.
P INGEST bullet from G.
M ATRANS L to R.
R PTRANS L into bag B.
R PTRANS exit.
O ATRANS raises the alarm.
Scene 3:
The getaway
R PTRANS R into C.
Results
R has more money.
O is angry.
M is in a state of shock.
P is shot.
Some additional points to note on Scripts:
If a particular script is to be applied it must be activated and the activating
depends on its significance.
If a topic is mentioned in passing, then a pointer to that script could be held.
If the topic is important then the script should be opened.
The danger lies in having too many active scripts, much as one might have
too many windows open on the screen or too many recursive calls in a program.
Provided events follow a known trail we can use scripts to represent the
actions involved and use them to answer detailed questions.
Different trails may be allowed for different outcomes of Scripts (e.g. The
bank robbery goes wrong).
Advantages of Scripts:
Ability to predict events.
A single coherent interpretation may be built up from a collection of
observations.
Disadvantages:
Less general than frames
May not be suitable to represent all kinds of knowledge
6.4 CYC
What is CYC?
An ambitious attempt to form a very large knowledge base aimed at capturing
commonsense reasoning.
Initial goal: to capture the knowledge underlying a hundred randomly selected
articles in the Encyclopedia Britannica.
Both Implicit and Explicit knowledge encoded.
Emphasis on the study of underlying information (assumed by the authors but
not explicitly told to the readers).
Example:
Suppose we read that Sam learned of Napoleon's death.
Then we (humans) can conclude that Napoleon never knew that Sam had died.
How do we do this?
We require special implicit knowledge or commonsense such as:
We only die once.
You stay dead.
You cannot learn of anything when dead.
Time cannot go backwards.
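The implicit axioms above can be sketched as a tiny check applied to the Napoleon example. This is a toy sketch only (CYC's inference is vastly richer), and the death years are illustrative data we have invented for 'Sam':

```python
# Hypothetical facts: year of death for each individual.
death_year = {"Napoleon": 1821, "Sam": 1995}

def can_learn_of(person, event_year):
    """Encodes 'you cannot learn of anything when dead' and 'you stay dead':
    a person can only learn of an event occurring before their own death."""
    return event_year <= death_year[person]

# Sam learning of Napoleon's death (1821) is possible:
print(can_learn_of("Sam", death_year["Napoleon"]))   # True
# Napoleon learning of Sam's death (1995) is ruled out:
print(can_learn_of("Napoleon", death_year["Sam"]))   # False
```

The point is not the arithmetic but that the conclusion a human draws instantly requires explicitly encoded background axioms before a machine can draw it at all.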
6.4.1 Motivations
Why should we build large knowledge bases at all? There are many reasons:
Brittleness
Specialized knowledge bases are brittle: it is hard to encode new situations, and
they degrade non-gracefully in performance. Commonsense-based knowledge bases
should have a firmer foundation.
Form and Content
Knowledge representation may not be suitable for AI. Commonsense strategies
could point out where difficulties in content may affect the form.
Shared Knowledge
Should allow greater communication among systems with common bases and
assumptions. Building an immense knowledge base is an overwhelming task;
however, we should ask whether there are any methods for acquiring this knowledge
automatically. Here there are two possibilities:
1. Machine Learning: Current techniques permit only modest extensions of a
program's knowledge when compared with older techniques. In order for a
system to learn a great deal, it must already know a great deal. In particular
systems with a lot of knowledge will be able to employ powerful analogical
reasoning.
2. Natural Language Understanding: Humans can extend their knowledge
by reading books and talking with other humans. Now we have online versions
of dictionaries and encyclopaedias, so why don't we feed these into an AI
program and have it assimilate all the information automatically? Although
there are many methods for building language understanding systems, these
methods are themselves very knowledge intensive.
The approach taken by CYC is to hand-code the 10 million or so facts that make up
commonsense knowledge. It may then be possible to develop more automatic methods.
CYC is coded using the following methods:
By hand
In special CYC languages (chiefly CYCL) that are:
1. LISP-like
2. Frame-based
3. Support multiple inheritance
4. Slots are full-fledged objects
5. Generalized inheritance: along any link, not just isa and instance
Example:
Shyam learnt of Napoleon's death.
Then a human can conclude that Napoleon never knew that Shyam had died. For
doing this, we need special implicit knowledge of common sense, such as:
We only die once.
You stay dead.
You cannot learn when dead.
Time cannot go backwards.
Sometimes, we have to build large knowledge bases. The reasons to construct large
knowledge bases are as follows:
There are multiple ways for representing knowledge. They are:
1. Simple relational knowledge
2. Inheritable knowledge
3. Inferential knowledge
4. Procedural knowledge
1. Simple relational knowledge: The simplest way to represent declarative facts
is as a set of relations of the same sort. This sort of representation provides very
weak inferential capabilities. The following table shows an example of simple
relational knowledge.
Player   Height   Weight (pounds)   Bats    Throws
ABC      6-0      180               Left    Right
XYZ      6-1      170               Right   Right
PQR      6-2      190               Right   Left
EFG      5-10     215               Left    Left
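The weak inferential capability of simple relational knowledge amounts to selection over stored facts, as a short sketch shows (the tuple layout mirrors the table above):

```python
# The table above as a list of (player, height, weight, bats, throws) tuples.
players = [
    ("ABC", "6-0",  180, "Left",  "Right"),
    ("XYZ", "6-1",  170, "Right", "Right"),
    ("PQR", "6-2",  190, "Right", "Left"),
    ("EFG", "5-10", 215, "Left",  "Left"),
]

# The only inference available is filtering and lookup over stored facts:
left_handed_batters = [name for name, _, _, bats, _ in players
                       if bats == "Left"]
print(left_handed_batters)  # ['ABC', 'EFG']
```

Nothing here can conclude anything not explicitly stored, which is exactly why inheritable and inferential representations are needed on top of it.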
6.5 CYCL
CYC's knowledge is encoded in a representation language called CYCL. CYCL is
a frame-based system that incorporates most of the usual frame techniques, such as
multiple inheritance, slots as full-fledged objects, transfers-through, mutually-disjoint-with,
etc. CYCL generalizes the notion of inheritance so that properties can be inherited
along any link, not just isa and instance. For example:
1. All birds have two legs.
2. All of Mary's friends speak Spanish.
We can encode the first fact using standard inheritance: any frame with Bird on its
instance slot inherits the value 2 on its legs slot. The second fact can be encoded in
a similar fashion if we allow inheritance to proceed along the friend relation.
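Generalized inheritance, i.e. inheriting along any named link rather than only isa and instance, can be sketched by making the link a parameter of the lookup. This is an illustrative sketch in Python, not CYCL syntax, and the function name inherit_along is our own:

```python
frames = {
    "Bird":   {"legs": 2},
    "Tweety": {"instance": "Bird"},
    "Mary":   {"speaks": "Spanish"},
    "John":   {"friend": "Mary"},
}

def inherit_along(name, slot, links=("instance", "isa")):
    """Inherit a slot value along any of the named links."""
    seen = set()
    while name and name not in seen:  # guard against cyclic links
        seen.add(name)
        frame = frames[name]
        if slot in frame:
            return frame[slot]
        name = next((frame[l] for l in links if l in frame), None)
    return None

print(inherit_along("Tweety", "legs"))                     # via instance link
print(inherit_along("John", "speaks", links=("friend",)))  # via friend link
```

The first call is the standard 'all birds have two legs' inheritance; the second shows the same machinery deriving 'John speaks Spanish' along the friend relation.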
Check Your Progress
1. Who developed the theory of conceptual dependency?
2. Script is similar to a thought sequence or a chain of situations that could be anticipated. (True or False)
3. CYCL generalizes the notion of inheritance so that properties cannot be inherited along any link, not just isa and instance. (True or False)

In addition to frames, CYCL contains a constraint language that allows the
expression of arbitrary first-order logical expressions. The time at which the default
reasoning is actually performed is determined by the direction of the
slotValueSubsumes rules. If the direction is backward, the rule is an if-needed rule,
and it is invoked whenever someone inquires. If the direction is forward, the rule is
an if-added rule, and additions are automatically propagated. While forward rules
can be very useful, they can also require substantial time and space to propagate
their values. If a rule is entered as backward then the system defers reasoning until
the information is specifically requested. CYC maintains a separate background
process for accomplishing forward propagations. A knowledge engineer can continue
entering knowledge while its effects are propagated during idle keyboard time.

Frame-based inference is very efficient, while general logical reasoning is
computationally hard. CYC actually supports about twenty types of efficient inference
mechanism, each with its own truth maintenance facility. The constraint language
allows for the expression of facts that are too complex for any of these mechanisms
to handle.

The constraint language also provides an elegant, abstract layer of
representation. In reality, CYC maintains two levels of representation: the
Epistemological Level (EL) and the Heuristic Level (HL). The EL contains facts stated
in the logical constraint language, while the HL contains the same
facts stored using efficient inference templates. There is a translation
program for automatically converting an EL statement into an efficient HL
representation. The EL provides a clean, simple functional interface to CYC so that
users and computer programs can easily insert and retrieve information from the
knowledge base. The EL / HL distinction represents one way of combining the
formal neatness of logic with the computational efficiency of frames.
In addition to frames, inference mechanisms and the constraint language,
CYCL performs consistency checking and conflict resolution.
6.7 SUMMARY
In this unit, you have learned about the various concepts related to strong slot and
filler structures. You have studied CYC, which is a multi-user system that
provides each knowledge enterer with a textual and graphical interface to the
knowledge base. Users' modifications to the knowledge base are transmitted to a
central server, where they are checked and then propagated to other users. We do not
have much experience with the engineering problems of building and maintaining
very large knowledge bases. In future, we should have many tools that check
consistency in the knowledge base. In this unit, you have also learned in detail
how conceptual dependency works out in various kinds of scenarios; you also
learned (with several examples) how to write scripts and scenes in various situations.
Short-Answer Questions
1. Suggest a semantic net to describe the main organs of the human body.
2. Convert the following statements to Conceptual dependencies.
I gave a pen to my friend.
Ram ate ice cream.
I borrowed a book from your friend.
While going home, I saw a car.
3. Construct a script for going to a movie from the viewpoint of the movie goer.
4. Would conceptual dependency be a good way to represent the contents of a
typical issue of National Geographic?
5. Construct CD representation of the following:
(i) John begged Mary for a pencil.
(ii) Jim stirred his coffee with a spoon.
(iii) Dave took the book from Jim.
(iv) On my way home, I stopped to fill my car with petrol.
(v) I heard strange music in the woods.
(vi) Drinking beer makes you drunk.
(vii) John killed Mary by strangling her.
Long-Answer Questions
1. Try capturing the differences between the following in CD:
John slapped Dave, John punched Dave.
Sue likes Prince, Sue adores Prince.
2. Rewrite the script given in the lecture so that the bank robbery goes wrong.
3. Write a script to allow for both outcome of the bank robbery: Getaway and
going wrong and getting caught.
4. Write a script for enrolling as a student.
5. Find out about how MARGIE, SAM and PAM are implemented. In particular
pay attention to their reasoning and inference mechanisms with the knowledge.
6. Find out how the CYCL language represents knowledge.
7. What are the two levels of representation in the constraints of CYC?
8. Find out the relevance of Meta-Knowledge in CYC and how it controls the
interpretation of knowledge.
NATURAL LANGUAGE PROCESSING
Structure
7.0 Introduction
7.1 Unit Objectives
7.2 The Problem of Natural Language
7.2.1 Steps in the Process
7.2.2 Why is Language Processing Difficult?
7.3 Syntactic Processing
7.4 Augmented Transition Networks
7.5 Semantic Analysis
7.6 Discourse and Pragmatic Processing
7.6.1 Conversational Postulates
7.7 General Comments
7.8 Summary
7.9 Key Terms
7.10 Answers to Check Your Progress
7.11 Questions and Exercises
7.12 Further Reading
7.0 INTRODUCTION
In the preceding unit, you learnt about the various concepts related to strong slot
and filler structures. In this unit, you will learn about the fundamental techniques of
natural language processing and evaluate some current and potential applications.
This unit will also help you to develop an understanding of the limits of these
techniques and of current research issues.
Before getting into detail on the several components of the natural language
understanding process, it is useful to survey all of them and see how they fit together.
7.2.1 Steps in the Process
1. Morphological Analysis: Individual words are analysed into their components
and non-word tokens, for example, punctuation marks, are separated from
the words.
2. Syntactic Analysis: Linear sequences of words are transformed into structures
that show how the words relate to each other. Some word sequences may be
rejected if they violate the language's rules for how words may be
combined. For example, an English syntactic analyser would reject the
sentence 'Boy the go the to store.'
3. Semantic Analysis: The structures created by the syntactic analyser are
assigned meanings. In other words, a mapping is made between the syntactic
structures and objects in the task domain. Structures for which no such
mapping is possible may be rejected. For example, the sentence 'Colourless
green ideas sleep furiously' would be rejected as semantically anomalous.
4. Discourse Integration: The meaning of an individual sentence may depend
on the sentences that precede it and may influence the meaning of the sentences
that follow it. For example, the word 'it' in the sentence 'Shyam wanted it'
depends on the prior discourse context, while the word 'Shyam' may influence
the meaning of subsequent sentences (such as 'He always had').
5. Pragmatic Analysis: The structure representing what was said is reinterpreted
to determine what was actually meant. For example, the sentence 'Do you
know what time it is?' should be interpreted as a request to be told the time.
7.2.1.1 Syntax
The stage of syntactic analysis is the best understood stage of natural language
processing. Syntax helps us understand how words are grouped together to make
complex sentences, and gives us a starting point for working out the meaning of the
whole sentence. For example, consider the following two sentences:
1. The dog ate the bone.
2. The bone was eaten by the dog.
The rules of syntax help us work out that it's the bone that gets eaten and not
the dog. A simple rule like 'it's the second noun that gets eaten' just won't
work.
Syntactic analysis allows us to determine possible groupings of words in a
sentence. Sometimes, there will only be one possible grouping, and we will
be well on the way to working out the meaning. For example, in the following
sentence:
3. The rabbit with long ears enjoyed a large green lettuce.
We can work out from the rules of syntax that 'the rabbit with long ears'
forms one group (a noun phrase), and 'a large green lettuce' forms another
noun phrase group. When we get down to working out the meaning of the
sentence we can start off by working out the meaning of these word groups,
before combining them together to get the meaning of the whole sentence.
Self-Instructional
160 Material
In other cases, there may be many possible groupings of words. For example,
in the sentence 'John saw Mary with a telescope', there are two different
readings based on the following groupings:
(i) John saw (Mary with a telescope), i.e., Mary has the telescope.
(ii) John (saw Mary with a telescope), i.e., John saw her with the telescope.
When there are many possible groupings, then the sentence is syntactically
ambiguous. Sometimes we will be able to use general knowledge to work out
which is the intended grouping. For example, consider the following sentence:
4. I saw the Forth Bridge flying into Edinburgh.
We can probably guess that the Forth Bridge isn't flying! So, this sentence is
syntactically ambiguous, but unambiguous if we bring to bear general
knowledge about bridges. The 'John saw Mary...' example is more seriously
ambiguous, though we may be able to work out the right reading if we know
something about John and Mary (is John in the habit of looking at girls through
a telescope?).
Anyway, rules of syntax specify the possible organizations of words in sentences.
They are normally specified by writing a grammar for the language. Of course, just
having a grammar isn't enough to analyse a sentence; we need a parser to use the
grammar to analyse the sentence. The parser should return possible parse trees for
the sentence, indicating the possible groupings of words into higher-level syntactic
sections. The next section will describe how simple grammars and parsers may be
written, focussing on Prolog's built-in definite clause grammar (DCG) formalism.
Language is meant for communicating about the world. If we succeed at
building a computational model of language, we should have a powerful tool for
communication about the world. So we can exploit knowledge about the world, in
combination with linguistic facts, to build computational natural language systems.
The largest part of human linguistic communication occurs as speech. Written
language is a recent invention and plays a lesser role than speech in most activities.
Communication
Intentional exchange of information brought about by the production and perception
of signs drawn from a shared system of conventional signs.
Humans use language to communicate most of what is known about the world.
The Turing test is based on language.
Communication as Action
Speech act
Language production viewed as an action
Speaker, hearer, utterance
Examples:
Query: Have you smelled the wumpus anywhere?
Inform: There's a breeze here in [3,4].
Request: Please help me carry the gold.
I could use some help carrying this.
Acknowledge: OK
Promise: I'll shoot the wumpus.
Fundamentals of Language
Formal language: A (possibly infinite) set of strings
Grammar: A finite set of rules that specifies a language
Rewrite rules:
non-terminal symbols (S, NP, etc.)
terminal symbols (he)
S → NP VP
NP → Pronoun
Pronoun → he
Chomsky's Hierarchy
Every language is specified by a particular grammar. Therefore, the classification
of languages is based on the classification of the grammars used to specify them.
Consider a grammar G = (VN, Σ, P, S), where VN and Σ are the sets of non-terminal
and terminal symbols and S ∈ VN. For classifying the grammar, you need to depend
on P, which is the set of productions of the grammar.
Four classes of grammatical formalisms:
Recursively enumerable grammar
Unrestricted rules: both sides of the rewrite rules can have any number of
terminal and non-terminal symbols, e.g., AB → C
Context-sensitive grammar
The RHS must contain at least as many symbols as the LHS, e.g., ASB → AXB
Context-free grammar (CFG)
LHS is a single non-terminal symbol, e.g., S → XYa
Regular grammar, e.g., X → a, X → aY
In terms of the forms of production, Chomsky classified the grammar hierarchy
into four types, which are as follows:
Type 0 Grammar
Type 1 Grammar
Type 2 Grammar
Type 3 Grammar
Type 0 Grammar
In formal language theory, type 0 grammar is the superset of all other grammar
classes and is defined as a phrase structure grammar that has no restrictions. In
other words, in type 0 grammar, there are no restrictions on the left and the right
sides of the productions of the grammar. Type 0 grammar is also known as
unrestricted grammar.
In type 0 grammar, the productions used for generating strings are known as type 0
productions. Type 0 productions are not associated with any restrictions. The
languages specified by type 0 grammar are known as type 0 languages.
In a production of the form αAβ → αγβ, with A being a variable, α and β are
respectively known as the left context and the right context, while the string γ is
known as the replacement string. Again, a production having the form αAβ → αγβ is
known as a type 1 production when γ ≠ ε. If you consider a production abcAcd →
abcABcd, then abc is the left context, while cd is the right context. Similarly, in
AC → A, A is the left context, while ε is the right context, and the production here
simply erases C.
Theorem 1: If G is a type 0 grammar, then you can find an equivalent grammar G1
where each production is either of the form α → β or A → a. Here, α and β are
strings of variables, A is a variable and a is a terminal.
Proof: To construct G1, consider a production α → β in G where α or β contains
terminals. In both α and β, let a new variable Ca replace each occurrence of the
terminal a, producing α′ and β′. Now, for every α → β where α and β contain
terminals, there is a corresponding α′ → β′ together with productions of the form
Ca → a for each terminal a that appears in α or β. Hence, the productions obtained
from the above construction are the productions of G1. Also, the variables of G along
with the new variables of the form Ca are the variables of G1. Similarly, the terminals
and the start symbol of G1 are the same as those of G. Thus, G1 satisfies the required
conditions for a grammar and it is equivalent to G. Therefore, L(G) = L(G1).
Note: This theorem also holds for grammars of types 1, 2 and 3.
Type 1 Grammar
A grammar is termed a type 1 grammar when all of its productions are type 1
productions. In other words, you can define a grammar G = (VN, Σ, P, S) where all
the productions are of the form αAβ → αγβ with γ ≠ ε. Type 1 grammar is also known as
context-sensitive or context-dependent grammar. The formal language that can be
generated using a type 1 grammar is known as a type 1 or context-sensitive language.
In type 1 grammar, a production of the form S → ε is allowed; however, here, S does
not appear on the right-hand side of any of the productions.
A production αAβ → αγβ in type 1 grammar does not increase the length of the
working string, as |αAβ| ≤ |αγβ| since γ ≠ ε. However, if a production α → β is such
that |α| ≤ |β|, then it is not necessarily of type 1. For example,
AB → BC is not of type 1.
Again, a grammar G = (VN, Σ, P, S) is called monotonic when every production in P
is of the form α → β with |α| ≤ |β|, or is S → ε. In the second situation, S does not
appear on the right-hand side of any production of G.
Theorem: Every monotonic grammar G is equivalent to a type 1 grammar.
Proof: For the grammar G, you can apply Theorem 1 to get an equivalent grammar
G1. Now, consider a production A1A2...Am → B1B2...Bn, where n ≥ m, in G1. For
m = 1, this production is of type 1. Now, consider m ≥ 2. Then you can introduce
new variables C1, C2, ..., Cm to construct the following type 1 productions
corresponding to A1A2...Am → B1B2...Bn:
A1A2...Am → C1A2...Am
C1A2...Am → C1C2A3...Am
C1C2A3...Am → C1C2C3A4...Am
...
C1C2...Cm-1Am → C1C2...CmBm+1Bm+2...Bn
C1C2...CmBm+1Bm+2...Bn → B1C2...CmBm+1Bm+2...Bn
B1C2...CmBm+1...Bn → B1B2C3...CmBm+1...Bn
...
B1B2...Bm-1CmBm+1...Bn → B1B2...BmBm+1...Bn
Now, explaining the above construction: the production A1A2...Am → B1B2...Bn
itself is not of type 1, since more than one symbol on its left-hand side is replaced
at once. In the chain of productions above, A1, A2, ..., Am-1 are replaced by
C1, C2, ..., Cm-1, while Am is replaced by CmBm+1...Bn. Similarly, C1, C2, etc. are
then correspondingly replaced by B1, B2, etc. Since only one variable is replaced
at a time in these productions, these productions are of type 1.
Now, for each production in G1 that is not of type 1, you can repeat this construction
process. This gives rise to a new grammar G2 whose variables are those of G1
together with the new variables. The productions of G2 are of type 1 and the terminals
and the start symbol of G2 are those of G1.
Therefore, you can conclude that G2 is of type 1 or context-sensitive, i.e.,
L(G2) = L(G1) = L(G)
Type 2 Grammar
In formal language theory, a grammar is called a type 2 grammar if it only contains
type 2 productions. A type 2 production is of the form A → α, where A is in VN and α is
in (VN ∪ Σ)*. In a type 2 production, the right-hand side of the production does not
have any right or left context. For example, S → Ba, A → ab and B → abc are
examples of type 2 productions.
Type 2 grammar is often termed context-free grammar and its productions are
called context-free productions. If a language is generated using a type 2 grammar,
then it is known as a type 2 language or context-free language.
Type 3 Grammar
A grammar is termed a type 3 grammar if it only contains productions of type
3. Here, a production of type 3 is of the form A → a or A → aB, where
A and B are in VN, while a is in Σ. In type 3 grammar, a production of the form
S → ε is allowed; however, in such cases, the right-hand side of the productions
does not contain S. Type 3 grammar is generally known as regular grammar. The
languages specified by type 3 grammars are known as type 3 languages or regular
languages.
Example 1. Find the highest type number that can be applied to the following
productions:
1. S → A0, A → 1 | 2 | B0, B → 012
2. S → ASB | b, A → bA | c
3. S → bS | bc
Solution:
1. Here, S → A0, A → B0 and B → 012 are of type 2, while A → 1 and A →
2 are of type 3. Therefore, the highest type number is 2.
2. Here, S → ASB is of type 2, while S → b, A → bA and A → c are of type 3.
Therefore, the highest type number is 2.
3. Here, S → bS is of type 3, while S → bc is of type 2. Therefore, the highest
type number is 2.
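The classification used in the solution above can be checked mechanically. Below is a small sketch (the helper names `production_type` and `highest_type` are invented for this illustration, not from the text), writing non-terminals as uppercase letters and terminals as lowercase letters or digits; the type number of a grammar is the minimum over its productions, since one less restricted production lowers the whole grammar's type:

```python
def production_type(lhs, rhs):
    """Highest Chomsky type number (0-3) that applies to the production lhs -> rhs.
    Non-terminals are single uppercase letters; anything else is a terminal."""
    is_var = lambda c: c.isupper()
    is_term = lambda c: not c.isupper()      # lowercase letter or digit
    if len(lhs) == 1 and is_var(lhs):
        if len(rhs) == 1 and is_term(rhs[0]):
            return 3                          # A -> a
        if len(rhs) == 2 and is_term(rhs[0]) and is_var(rhs[1]):
            return 3                          # A -> aB
        return 2                              # A -> alpha (context-free)
    if len(lhs) <= len(rhs):
        return 1                              # monotonic, as in type 1
    return 0                                  # unrestricted

def highest_type(productions):
    """Type number of a whole grammar: the minimum over its productions."""
    return min(production_type(l, r) for l, r in productions)
```

For instance, `highest_type([("S", "A0"), ("A", "1"), ("A", "2"), ("A", "B0"), ("B", "012")])` reproduces the answer 2 from part 1 of the example.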
Chomsky Normal Form
In the Chomsky Normal Form (CNF), there are certain restrictions on the length
and the type of symbols to be used in the RHS of the productions. A grammar G is
said to be in the CNF if every production defined in the grammar is in one of the
following forms:
A → a
A → BC
S → ε, if ε ∈ L(G)
For example, the grammar G with the productions S → AB, S → ε, A → a and B →
b is in the CNF. The grammar G with the productions S → ABC, S → aC, A → b, B →
a and C → c is not in the CNF because the productions S → ABC and S → aC
are not in the forms required by the definition of the CNF. You can reduce a CFG
into an equivalent grammar G2, which is in the CNF, by performing the following steps:
Step 1: Eliminate all the null productions and then eliminate all the unit
productions from the given grammar. Represent the grammar obtained after
the elimination of all the null and unit productions by G = (VN, Σ, P, S).
Step 2: Eliminate all the terminals that appear together with some non-terminals on
the RHS of the productions. For this, you need to define the grammar G1 =
(VN′, Σ, P1, S). P1 and VN′ in G1 are constructed in the following ways:
o The productions in P which are of the forms A → a and A → BC
are placed in P1. Also, all the non-terminal symbols in VN are included
in VN′.
o Consider a production A → X1X2...Xn with some terminal
symbols on the RHS. If Xi represents a terminal symbol, say ai,
then add a new non-terminal Cai to VN′ and a new production Cai →
ai to P1. Also, all the terminal symbols in the production of the
form A → X1X2...Xn are replaced by the corresponding new non-
terminal symbols and the resulting production is included in P1.
Step 3: Restrict the number of non-terminals on the RHS of a production.
The RHS of a production in P1 consists either of a single terminal symbol or of
one or more non-terminal symbols. Thus, you need to define the grammar
G2 = (VN″, Σ, P2, S) in the following ways:
o All the productions in P1 which are already in the required form are included
in P2. All the non-terminals in VN′ are also included in VN″.
o For the productions of the type A → A1A2...Am, where m ≥ 3, add new
productions of the following form to P2:
A → A1C1
C1 → A2C2
...
Cm-2 → Am-1Am
where C1, C2, ..., Cm-2 are the new non-terminal symbols added
to VN″.
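Step 3 above is purely mechanical. As an illustration, here is a small sketch (the function name `binarize` and the pair-of-lists representation are invented for this example) that introduces the new variables C1, C2, ..., Cm-2 exactly as described:

```python
def binarize(lhs, rhs):
    """Replace A -> A1 A2 ... Am (m >= 3) with the chain A -> A1 C1,
    C1 -> A2 C2, ..., Cm-2 -> Am-1 Am, as in Step 3.
    rhs is a list of symbol names; C1, C2, ... are freshly introduced."""
    m = len(rhs)
    if m <= 2:                       # already in the required form
        return [(lhs, rhs)]
    rules = [(lhs, [rhs[0], "C1"])]
    for i in range(1, m - 2):        # middle links of the chain
        rules.append((f"C{i}", [rhs[i], f"C{i+1}"]))
    rules.append((f"C{m-2}", [rhs[m - 2], rhs[m - 1]]))
    return rules
```

Applied to S → CbBCaA (as in Example 2 below), `binarize("S", ["Cb", "B", "Ca", "A"])` yields the three productions S → CbC1, C1 → BC2 and C2 → CaA.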
Example 2. Reduce the grammar G to the CNF with the following productions:
S → bBaA, A → aA | a, B → bB | b
Solution: Since the grammar does not contain any null and unit productions, you
need to proceed with step 2.
Step 2: Let G1 = (VN′, Σ, P1, S)
Now, construct P1 and VN′ as follows:
Add the productions A → a and B → b to P1.
Eliminate the terminal symbols from the productions S → bBaA, A → aA | a and B →
bB | b by using the following productions:
S → CbBCaA
A → CaA
B → CbB
Ca → a
Cb → b
All these productions are added to P1 and VN′ = {S, A, B, Ca, Cb}
Step 3: The length of the RHS of the production S → CbBCaA is four; therefore, it
should be replaced by the following productions:
S → CbC1
C1 → BC2
C2 → CaA
Therefore, the productions in P2 are as follows:
S → CbC1, C1 → BC2, C2 → CaA, A → CaA, B → CbB, Ca → a, Cb → b, A → a, and
B → b.
VN″ = {S, A, B, Ca, Cb, C1, C2}
Σ = {a, b}
Thus, the grammar in the CNF is represented by G2 = (VN″, Σ, P2, S).
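The result of Example 2 can be checked against the CNF definition. The sketch below (helper name `in_cnf` invented for this illustration; terminals written in lowercase, non-terminal names starting with an uppercase letter) tests whether every production has one of the allowed forms:

```python
def in_cnf(productions, start="S", empty_in_lang=False):
    """True when every production is A -> a, A -> BC, or S -> epsilon
    (the last only when epsilon is in L(G)). An empty RHS list stands
    for epsilon; terminals are lowercase, non-terminals start uppercase."""
    for lhs, rhs in productions:
        if rhs == []:                                   # S -> epsilon
            ok = (lhs == start and empty_in_lang)
        elif len(rhs) == 1:                             # A -> a
            ok = rhs[0].islower()
        elif len(rhs) == 2:                             # A -> BC
            ok = rhs[0][0].isupper() and rhs[1][0].isupper()
        else:
            ok = False
        if not ok:
            return False
    return True

# The productions P2 obtained in Example 2:
P2 = [("S", ["Cb", "C1"]), ("C1", ["B", "C2"]), ("C2", ["Ca", "A"]),
      ("A", ["Ca", "A"]), ("B", ["Cb", "B"]), ("Ca", ["a"]), ("Cb", ["b"]),
      ("A", ["a"]), ("B", ["b"])]
```

Here `in_cnf(P2)` holds, while the original production S → CbBCaA (RHS of length four) fails the check, which is why Step 3 was needed.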
SPEAKER:
Intention:
Know (H, ¬Alive (Wumpus, S3))
Generation:
The wumpus is dead
Synthesis:
[thaxwahmpaxsihzdehd]
HEARER:
Perception:
The wumpus is dead
Analysis (Parsing):
(Semantic Interpretation): ¬Alive (Wumpus, Now)
Tired (Wumpus, Now)
(Pragmatic Interpretation): ¬Alive (Wumpus1, S3)
Tired (Wumpus1, S3)
Disambiguation:
¬Alive (Wumpus1, S3)
Incorporation:
TELL (KB, ¬Alive (Wumpus1, S3))
Writing Grammar
Natural language grammar specifies allowable sentence structures in terms of basic
syntactic categories such as nouns and verbs, and allows us to determine the structure
of a sentence. It is defined in a similar way to a grammar for a programming language,
though it tends to be more complex, and the notations used are somewhat different.
Because of the complexity of natural language, a given grammar is unlikely to
cover all possible syntactically acceptable sentences.
[Note: In natural language we don't usually parse language in order to check
that it is correct. We parse it in order to determine the structure and help work out
the meaning. But most grammars are just concerned with the structure of correct
English, as it gets much more complex to parse if you allow bad English.]
A starting point for describing the structure of a natural language is to use a
context-free grammar (as often used to describe the syntax of programming
languages). Suppose we want a grammar that will parse sentences like:
1. John ate the biscuit.
2. The lion ate the schizophrenic.
3. The lion kissed John.
But we want to exclude incorrect sentences like:
1. Ate John biscuit the.
2. Schizophrenic the lion the ate.
3. Biscuit lion kissed.
A simple grammar that deals with this is the following:
1. sentence --> noun_phrase, verb_phrase.
2. noun_phrase --> proper_name.
3. noun_phrase --> determiner, noun.
4. verb_phrase --> verb, noun_phrase.
proper_name --> [mary].
proper_name --> [john].
noun --> [schizophrenic].
noun --> [biscuit].
verb --> [ate].
verb --> [kissed].
determiner --> [the].
The notation is similar to what is sometimes used for grammars of programming
languages. A sentence consists of a noun phrase and a verb phrase. A noun phrase
consists of either a proper noun (using rule 2) or a determiner (e.g., the, a) and a
noun. A verb phrase consists of a verb (e.g., ate) and a noun phrase. The rules at the
end are really like dictionary entries, which state the syntactic category of different
words. Basic syntactic categories such as noun and verb are terminal symbols in
the grammar, as they cannot be expanded into lower-level categories. (We put words in
square brackets because that's the way it is generally done in Prolog's built-in grammar
notation.)
If we consider the example sentences just stated, the sentence 'John ate the
biscuit' consists of a noun phrase 'John' and a verb phrase 'ate the biscuit'. The
noun phrase is just a proper noun, while the verb phrase consists of a verb 'ate' and
another noun phrase ('the biscuit'). This noun phrase consists of a determiner 'the'
and a noun 'biscuit'. The incorrect sentences will be excluded by the grammar. For
example, 'biscuit lion kissed' starts with two nouns, which is not allowed by the grammar.
However, some odd sentences will be allowed, such as 'the biscuit kissed John'.
This sentence is syntactically acceptable, just semantically odd, so it should still be
parsed.
For a given grammar, we can illustrate the syntactic structure of the sentence
by giving the parse tree, which shows how the sentence is broken down into different
syntactic constituents. This kind of information may be useful for later semantic
processing. Anyway, given the above grammar, the parse tree for John ate the lion
would be:
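The parse tree figure is not reproduced in this text. As a sketch, the tree implied by rules 1 to 4 can be written as nested tuples (a hypothetical Python rendering for illustration, not the book's notation), with the syntactic category first and its constituents after it:

```python
# Parse tree for "John ate the lion" under rules 1-4 of the grammar above.
tree = ("sentence",
        ("noun_phrase", ("proper_name", "john")),
        ("verb_phrase",
         ("verb", "ate"),
         ("noun_phrase", ("determiner", "the"), ("noun", "lion"))))

def leaves(t):
    """Read the words back off the tree, left to right."""
    if isinstance(t, str):
        return [t]
    words = []
    for child in t[1:]:            # t[0] is the category label
        words.extend(leaves(child))
    return words
```

Reading the leaves left to right recovers the original sentence, which is one simple sanity check on a parse tree.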
Of course, the grammar given above is not really adequate to parse natural
language properly. Consider the following two sentences:
Mary eat the lion.
Mary eats the ferocious lion.
If we have 'eat' and 'eats' categorized as verbs, then, given the simple grammar
above, the first sentence will be acceptable according to the grammar, while the
second won't, since we don't have any mention of adjectives in our grammar. To deal
with the first problem we need to have some method of enforcing number/person
agreement between subjects and verbs, so that things like 'I am...' and 'We are...' are
accepted, but 'I are...' and 'We am...' are not. To deal with the second problem, we
need to add further rules to our grammar.
To enforce subject-verb agreement, the simplest method is to add arguments
to our grammar rules. If we're only concerned about singular vs plural nouns (and
assume that we don't have any first or second person pronouns), we might get
rules and dictionary entries which include the following:
sentence --> noun_phrase(Num), verb_phrase(Num).
noun_phrase(Num) --> proper_name(Num).
noun_phrase(Num) --> determiner(Num), noun(Num).
verb_phrase(Num) --> verb(Num), noun_phrase(_).
proper_name(sing) --> [mary].
noun(sing) --> [lion].
noun(plur) --> [lions].
determiner(sing) --> [the].
determiner(plur) --> [the].
verb(sing) --> [eats].
verb(plur) --> [eat].
Note that, strictly, we no longer have a context-free grammar, having added these extra
arguments to our rules. In general, getting the agreement right in a grammar is much
more complex than this. We need fairly complex rules, and we also need to put more
information in dictionary entries. A good dictionary will not state everything
explicitly, but will exploit general information about word structure, such as the
fact that, given a verb such as 'eat', the third person singular form generally involves
adding an s: hit/hits, eat/eats, like/likes, etc. Morphology is the area of natural
language processing concerned with such things.
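The regularity just mentioned can be sketched in one line; the function name is invented for this illustration, and a real morphological analyser needs many more rules and exception lists (go/goes, have/has, be/is, and so on):

```python
def third_singular(verb):
    """Naive sketch of the regularity noted above: the third person
    singular of a verb generally just adds -s (hit -> hits, eat -> eats).
    Irregular verbs and spelling rules (e.g. watch -> watches) are ignored."""
    return verb + "s"
```

Even this crude rule covers the examples in the text, which is why a good dictionary stores the regular pattern once rather than listing every inflected form.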
To extend the grammar to allow adjectives we need to add an extra rule or
two, e.g.:
noun_phrase(Num) --> determiner(Num), adjectives, noun(Num).
adjectives --> adjective, adjectives.
adjectives --> adjective.
adjective --> [ferocious].
adjective --> [ugly].
etc.
That is, noun phrases can consist of a determiner, some adjectives and a noun.
Adjectives consist of an adjective and some more adjectives, OR just an adjective.
We can now parse the sentence the ferocious ugly lion eats Mary.
Another thing we may need to do to the grammar is to extend it so we can
distinguish between transitive verbs that take an object (e.g., likes) and intransitive
verbs that don't (e.g., talks). ('Mary likes the lion' is OK while 'Mary likes' is not;
'Mary talks' is OK while 'Mary talks the lion' is not.) This is left as an exercise for
the reader.
Our grammar so far (if we put all the bits together) still only parses sentences
of a very simple form. It certainly wouldn't parse the sentences I'm currently writing!
We can try adding more and more rules to account for more and more of English:
for example, we need rules that deal with prepositional phrases (e.g., 'Mary likes
the lion with the long mane') and relative clauses (e.g., 'The lion that ate Mary
kissed John'). These are left as another exercise for the reader.
As we add more and more rules to allow more bits of English to be parsed,
we may find that our basic grammar formalism becomes inadequate, and we
need a more powerful one to allow us to concisely capture the rules of syntax.
There are lots of different grammar formalisms that have been developed (e.g.,
unification grammar, categorial grammar), but we won't go into them.
Formal Grammar
The lexicon for E0:
Noun → stench | breeze | glitter | wumpus | pit | pits | gold | ...
Verb → is | see | smell | shoot | stinks | go | grab | turn | ...
Adjective → right | left | east | dead | back | smelly | ...
Adverb → here | there | nearby | ahead | right | left | east | ...
Pronoun → me | you | I | it | ...
Name → John | Mary | Bush | Rocky | ...
Article → the | a | an
Preposition → to | in | on | near
Conjunction → and | or | but
Digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
The grammar for E0:
S → NP VP (I + feel a breeze)
 | S Conjunction S (I feel a breeze + and + I smell a wumpus)
NP → Pronoun (I)
 | Name (John)
 | Noun (pits)
 | Article Noun (the + wumpus)
 | Digit Digit (3 4)
 | NP PP (the wumpus + to the east)
 | NP RelClause (the wumpus + that is smelly)
VP → Verb (stinks)
 | VP NP (feel + a breeze)
 | VP Adjective (is + smelly)
 | VP PP (turn + to the east)
 | VP Adverb (go + ahead)
PP → Preposition NP (to + the east)
RelClause → that VP (that + is smelly)
Parts of speech
Open class: noun, verb, adjective, adverb
Closed class: pronoun, article, preposition, conjunction
Grammar
Overgenerate: Me go Boston
Undergenerate: I think the wumpus is smelly
7.2.1.2 Parsers
Having a grammar isn't enough to parse natural language; you need a parser. The
parser should search for possible ways the rules of the grammar can be used to
parse the sentence, so parsing can be viewed as a kind of search problem. In general,
there may be many different rules that can be used to expand or rewrite a given
syntactic category, and the parser must check through them all to see if the sentence
can be parsed using them. For example, in our mini-grammar above there were two
rules for noun_phrases; a parse of a sentence may use either one or the other. In
fact we can view the grammar as defining an AND-OR tree to search: alternative
ways of expanding a node give OR branches, while rules with more than one syntactic
category on the right-hand side give AND branches.
So, to parse a sentence we need to search through all these possibilities,
effectively going through all possible syntactic structures to find one that fits the
sentence. There are good ways and bad ways of doing this, just as there are good
and bad ways of parsing programming languages. One way is basically to do a
depth-first search through the parse tree. When you reach the first terminal node in
the grammar (i.e., a primitive syntactic category, such as noun) you check whether
the first word of the sentence belongs to this category (e.g., is a noun). If it does,
then you continue the parse with the rest of the sentence. If it doesn't, you backtrack
and try alternative grammar rules.
As an example, suppose you were trying to parse 'John loves Mary', given
the following grammar:
sentence --> noun_phrase, verb_phrase.
verb_phrase --> verb, noun_phrase.
noun_phrase --> det, noun.
noun_phrase --> p_name.
verb --> [loves].
p_name --> [john].
p_name --> [mary].
You might start off expanding sentence to a noun phrase and a verb phrase. Then
the noun phrase would be expanded to give a determiner and a noun, using the third
rule. A determiner is a primitive syntactic category (a terminal node in the grammar),
so we check whether the first word (John) belongs to that category. It doesn't
(John is a proper noun), so we backtrack, find another way of expanding
noun_phrase and try the fourth rule. Now, as John is a proper name this will work
OK, so we continue the parse with the rest of the sentence ('loves Mary'). We
haven't yet expanded the verb phrase, so we try to parse 'loves Mary' as a verb
phrase. This will eventually succeed, so the whole thing succeeds.
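The walkthrough above is exactly a depth-first search with backtracking. A minimal sketch in Python (hypothetical code standing in for Prolog's built-in search; as in the grammar above, no det or noun dictionary entries are given, so that branch always fails and forces backtracking to the p_name rule):

```python
GRAMMAR = {                                  # the rules from the walkthrough
    "sentence":    [["noun_phrase", "verb_phrase"]],
    "verb_phrase": [["verb", "noun_phrase"]],
    "noun_phrase": [["det", "noun"],         # tried first, always fails here
                    ["p_name"]],
}
LEXICON = {"verb": {"loves"}, "p_name": {"john", "mary"}}

def parse(cat, words):
    """Try to parse a prefix of `words` as category `cat`; yield leftover words.
    Trying the next alternative after a failure is the backtracking."""
    if cat in LEXICON:                       # terminal category: match one word
        if words and words[0] in LEXICON[cat]:
            yield words[1:]
        return
    for rhs in GRAMMAR.get(cat, []):         # OR branches: alternative rules
        rests = [words]
        for sub in rhs:                      # AND branches: the rule body
            rests = [r2 for r in rests for r2 in parse(sub, r)]
        for r in rests:
            yield r

def accepts(sentence):
    """A sentence parses when some search branch consumes every word."""
    return any(rest == [] for rest in parse("sentence", sentence.split()))
```

With this sketch, `accepts("john loves mary")` succeeds via the fourth rule after the det/noun attempt fails, mirroring the walkthrough.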
It may be clear by now that in Prolog the parsing mechanism is really just
Prolog's built-in search procedure. Prolog will just backtrack to explore the different
possible syntactic structures. The extra arguments which Prolog (internally) adds to
DCG rules allow it to parse a bit of the sentence using one rule, then the rest of the
sentence using another.
Note that this kind of simple top-down parser can be implemented reasonably
easily in other languages that don't support backtracking by using an agenda-based
search mechanism. An item in the agenda may be the remaining syntactic categories
and bits of sentence left to parse, given this branch of the search tree (e.g., [[verb,
noun_phrase], [loves, mary]]).
Simple parsers are often inefficient because they don't keep a record of all
the bits of the sentence that have been parsed. Using simple depth-first search with
backtracking can result in useful bits of parsing being undone on backtracking: if
you have a rule a -> b, c, d and a rule a -> b, c, e then you may find the first part
of the sentence consists of b and c constituents, go on to check if the rest is a d
constituent, but when that fails a Prolog-like system will throw away its conclusions
about the first half when backtracking to try the second rule. It will then duplicate
the parsing of the b and c bits. Anyway, better parsers try to avoid this. Two
important kinds of parsers used in natural language are transition network parsers
and chart parsers. A transition network parser makes use of a more flexible network
representation for grammar rules to partially avoid this problem. (The network
diagram for the rules above is not reproduced here.)
The transition network parser would basically traverse such a network, checking
that words in the sentence match the syntactic categories on each arc.
A chart parser goes further in avoiding duplicated work by explicitly recording
in a chart all the possible parses of each bit of the sentence.
Parse Tree (figure not reproduced here)
First, let's go back to our syntactically ambiguous sentences and see how
semantics could help:
Time flies like an arrow.
Fruit flies like a banana.
If we have some representation of the meanings of the different words in the sentence
we can probably rule out the silly parse. We might look up 'banana' (maybe in some
frame or semantic net system) and find that it is a fruit, and fruits generally don't
fly. We might then be able to throw out the reading 'flies like a banana' if we made
sure that sentences which mean 'X does something like Y' require that X and Y can
do that thing!
Sometimes ambiguity is introduced at the stage of semantic analysis. For
example:
John went to the bank
Did John go to the river bank or the financial bank? We might want to make this
explicit in our semantic representation, but without contextual knowledge we have
no good way of choosing between them. This kind of ambiguity occurs when a
word has two possible meanings, but both of them may, for example, be nouns. To
obtain a semantic representation it helps if you can combine the meanings of the
parts of the sentence in a simple way to get at the meaning of the whole (the term
compositional semantics refers to this process). For those familiar with lambda
expressions, one way to do this is to represent word meanings as complex lambda
expressions, and just use function application to combine them. To combine a noun
phrase 'John' and a verb phrase 'sleeps' we might have:
Verb phrase meaning: λX.sleeps(X)
Noun phrase meaning: john
Apply VP meaning to NP meaning: λX.sleeps(X)(john) = sleeps(john)
The output of the semantic analysis stage may be anything from a semantic net to
an expression in some complex logic. It will partially specify the meaning of the
sentence. From 'He went to the bank' we might have two possible readings,
represented in predicate logic as:
∃X ∃Y male(X) ∧ wentto(X, Y) ∧ financialbank(Y)
∃X ∃Y male(X) ∧ wentto(X, Y) ∧ riverbank(Y)
Good representations for sentence meaning tend to be much more complex than
this, to properly capture tense, conditionals, etc., but this gives the general flavour.
Semantic Interpretation:
Semantics: Meaning of utterances
First-order logic as the representation language
Compositional semantics: Meaning of a phrase is composed of the meanings of
the constituent parts of the phrase
Exp(x) → Exp(x1) Operator(op) Exp(x2) {x = Apply(op, x1, x2)}
Exp(x) → ( Exp(x) )
Exp(x) → Number(x)
Number(x) → Digit(x)
Number(x) → Number(x1) Digit(x2) {x = 10 × x1 + x2}
Digit(x) → x {0 ≤ x ≤ 9}
Operator(x) → x {x ∈ {+, −, ×, ÷}}
Example:
John loves Mary
Loves (John, Mary)
(λy λx Loves(x, y)) (Mary) ⇒ λx Loves(x, Mary)
(λx Loves(x, Mary)) (John) ⇒ Loves(John, Mary)
S(rel(obj)) → NP(obj) VP(rel)
VP(rel(obj)) → Verb(rel) NP(obj)
NP(obj) → Name(obj)
Name(John) → John
Name(Mary) → Mary
Verb(λy λx Loves(x, y)) → loves
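The two function applications in the example can be mirrored directly with curried lambdas. In this sketch strings merely stand in for logical terms (an illustration of compositional combination, not an implementation of first-order logic):

```python
# "loves" as a curried lambda, mirroring (λy λx Loves(x, y)):
# it takes the object y first, then the subject x.
loves = lambda y: lambda x: f"Loves({x}, {y})"

vp = loves("Mary")   # VP meaning: λx Loves(x, Mary)
s = vp("John")       # S meaning:  Loves(John, Mary)
```

Applying the verb meaning to the object and then to the subject yields the sentence meaning, exactly as in the derivation above.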
Pragmatics
Pragmatics is the last stage of analysis, where the meaning is elaborated based on
contextual and world knowledge. Contextual knowledge includes knowledge of
the previous sentences (spoken or written), general knowledge about the world and
knowledge of the speaker.
One important task at this stage is to work out the referents of expressions. For
example, in the sentence 'he kicked the brown dog' the expression 'the brown dog'
refers to a particular brown dog (say, Fido). The pronoun 'he' refers to the particular
guy we are talking about (Fred). A full representation of the meaning of the sentence
should mention Fido and Fred.
We can often find this out by looking at the previous sentence, e.g.:
Fred went to the park.
He kicked the brown dog.
We can work out from this that 'he' refers to Fred. We might also guess that
the brown dog is in the park, but to figure out that we mean Fido we'd need some
extra general or contextual knowledge: that the only brown dog that generally
frequents the park is Fido. In general, this kind of inference is pretty difficult, though
quite a lot can be done using simple strategies, like looking at who's mentioned in
the previous sentence to work out who 'he' refers to. Of course, sometimes there
may be two people (or two dogs) that the speaker might be referring to, e.g.:
There was a brown dog and a black dog in the park. Fred went to the park
with Jim. He kicked the dog.
In cases like this we have referential ambiguity. It is seldom quite as explicit
as this, but in general it can be a big problem. When the intended referent is unclear, a
natural language dialogue system may have to initiate a clarification subdialogue,
asking for example 'Do you mean the black one or the brown one?'
Anyway, another thing that is often done at this stage of analysis (pragmatics)
is to try and guess at the goals underlying utterances. For example, if someone asks
how much something is, you generally assume that they have the goal of (probably)
buying it. If you can guess at people's goals you can be a bit more helpful in
responding to their questions. So, an automatic airline information service, when
asked when the next flight to Paris is, shouldn't just say '6 pm' if it knows the flight
is full. It should guess that the questioner wants to travel on it, check whether this is
possible, and say '6 pm, but it's full. The next flight with an empty seat is at 8 pm.'
Pragmatic Interpretation
Adding context-dependent information about the current situation to each
candidate semantic interpretation
Indexicals: Phrases that refer directly to the current situation
Ex.: "I am in Boston today"
("I" refers to the speaker and "today" refers to now)
Generation
We have so far only discussed natural language understanding. However, you should
be aware also of the problems in generating natural language. If you have something
you want to express (e.g. eats (John, chocolate)), or some goal you want to achieve
(e.g. get Fred to close the door), then there are many ways you can achieve that
through language:
He eats chocolate.
It's chocolate that John eats.
John eats chocolate.
Chocolate is eaten by John.
Close the door.
Its cold in here.
Can you close the door?
A generation system must be able to choose appropriately from among the
different possible constructions, based on knowledge of the context. If a complex
text is to be written, it must further know how to make that text coherent.
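The choice among alternative constructions can be sketched as a simple lookup keyed on the context. This is an illustrative toy, not a real generation system; the context flags (formal, hint_only) and the function name are assumptions made for this example:

```python
# Hypothetical sketch: picking a surface form for the goal
# "get the hearer to close the door", based on context flags.
# The flags and the chosen forms are illustrative assumptions.

def realize_close_door(context):
    """Pick one of several surface forms for the same underlying goal."""
    if context.get("formal"):
        return "Can you close the door?"   # polite indirect request
    if context.get("hint_only"):
        return "It's cold in here."        # merely hint at the goal
    return "Close the door."               # direct imperative

print(realize_close_door({"formal": True}))   # Can you close the door?
```

A real generator would also have to order the content and choose referring expressions, but the basic shape, many forms for one goal, selected by context, is the same.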
Anyway, that's enough on natural language. The main points to understand are roughly what happens at each stage of analysis (for language understanding), what the problems are and why, and how to write a simple grammar in Prolog's DCG formalism.
Language Generation
The same DCG can be used for parsing and generation
Parsing:
Given: S(sem, [John, loves, Mary])
Return: sem = Loves(John, Mary)
Generation:
Given: S(Loves(John, Mary), words)
Return: words = [John, loves, Mary]
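The two directions above can be mimicked in a toy sketch. This is written in Python rather than Prolog's DCG notation, and the grammar is flattened to a lookup table purely for illustration:

```python
# Illustrative sketch: one grammar, two directions. Each entry pairs a
# semantic form with its word list, so the same table serves both
# parsing (words -> semantics) and generation (semantics -> words).

GRAMMAR = [
    ("Loves(John, Mary)", ["John", "loves", "Mary"]),
    ("Loves(Mary, John)", ["Mary", "loves", "John"]),
]

def parse(words):
    """Parsing: given the words, return the semantics (or None)."""
    for sem, w in GRAMMAR:
        if w == words:
            return sem
    return None

def generate(sem):
    """Generation: given the semantics, return the words (or None)."""
    for s, w in GRAMMAR:
        if s == sem:
            return w
    return None

print(parse(["John", "loves", "Mary"]))   # Loves(John, Mary)
print(generate("Loves(John, Mary)"))      # ['John', 'loves', 'Mary']
```

A DCG achieves the same bidirectionality not by table lookup but because its rules are relations between semantic terms and word lists, runnable in either direction.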
Ambiguity
Lexical ambiguity
"the back of the room" vs "back up your files"
"In the interest of stimulating the economy, the government lowered the interest rate."
Syntactic ambiguity (structural ambiguity)
"I smelled a wumpus in 2,2"
Semantic ambiguity
"the IBM lecture"
Pragmatic ambiguity
"I'll meet you next Friday"
Discourse Understanding
Discourse: Multiple sentences
Reference resolution: The interpretation of a pronoun or a definite noun phrase
that refers to an object in the world.
"John flagged down the waiter. He ordered a ham sandwich."
"He" refers to John.
"After John proposed to Mary, they found a preacher and got married. For the honeymoon, they went to Hawaii."
Who are "they"? Whose honeymoon is "the honeymoon"?
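The simple "most recently mentioned matching entity" strategy for reference resolution can be sketched as follows. The entity records and feature tags are assumptions for illustration; note that on the waiter example, recency alone picks the wrong referent, which is exactly why fuller discourse models are needed:

```python
# A minimal sketch of pronoun resolution by recency: scan the mentioned
# entities from most recent to least recent and return the first one
# whose features are compatible with the pronoun. The entity list and
# feature tags below are illustrative assumptions.

ENTITIES = [
    {"name": "John", "gender": "male", "number": "sing"},
    {"name": "the waiter", "gender": "male", "number": "sing"},
]

PRONOUNS = {
    "he":   {"gender": "male", "number": "sing"},
    "she":  {"gender": "female", "number": "sing"},
    "they": {"number": "plur"},
}

def resolve(pronoun, mentioned):
    """Return the most recent mention compatible with the pronoun."""
    features = PRONOUNS[pronoun]
    for entity in reversed(mentioned):
        if all(entity.get(k) == v for k, v in features.items()):
            return entity["name"]
    return None

# "John flagged down the waiter. He ordered a ham sandwich."
# Recency picks "the waiter", which is wrong here: real resolution
# also needs discourse salience and world knowledge (waiters take
# orders, customers place them).
print(resolve("he", ENTITIES))
```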
Structure of coherent discourse: Sentences are joined by coherence relations
Examples of coherence relations between S1 and S2:
Enable or cause: S1 brings about a change of state that causes or enables
S2
I went outside. I drove to school.
Explanation: The reverse of enablement, S2 causes or enables S1 and is
an explanation for S1.
I was late for school. I overslept.
Exemplification: S2 is an example of the general principle in S1.
"This algorithm reverses a list. The input [A, B, C] is mapped to [C, B, A]."
Etc.
7.2.2 Why is Language Processing Difficult?
Consider trying to build a system that would answer e-mails sent by customers to a retailer selling laptops and accessories via the Internet. This might be expected to handle queries such as the following:
Has my order number 4291 been shipped yet?
Is FD5 compatible with a 565G?
What is the speed of the 565G?
Assume the query is to be evaluated against a database containing product and
order information, with relations such as the following:
ORDER
Here S = Sentence
NP = Noun Phrase
VP = Verb Phrase
NAME = Name
ART = Article
To construct such descriptions, you must know what structures are legal for English. A set of rules called rewrite rules describes the tree structures that are allowable. These rules say that a certain symbol may be expanded in the tree by a sequence of other symbols. For instance, a set of rules that would allow the tree structure shown above is the following:
S → NP VP
VP → VERB NP
NP → NAME
NP → ART NOUN
Here, S may consist of an NP followed by a VP, a VP may consist of a VERB followed by an NP, and so on. Grammars consisting entirely of rules of the form
<Sym> → <Sym>1 ... <Sym>n, for n >= 1
are called Context-Free Grammars (CFGs).
Symbols that cannot be further decomposed in a grammar, such as NOUN, ART
and VERB in the above example, are called terminal symbols. The other symbols
such as NP, VP, S are called non-terminal symbols.
Two common and simple parsing techniques for CFGs are known as top-down parsing and bottom-up parsing. Top-down parsing begins with the symbol S and rewrites it, say, to NP VP. These symbols may then themselves be rewritten as per the rewrite rules. Finally, terminal symbols such as NOUN may be rewritten by a word, such as rat, which is marked as a noun in the lexicon. Thus, in top-down parsing you use the rules of the grammar in such a way that the right-hand side of a rule always rewrites the symbol on its left-hand side. In bottom-up parsing, we start with the individual words and replace them with their syntactic categories.
A possible top-down parse of "Shyam ate the bun" will be
S ⇒ NP VP
⇒ NAME VP
⇒ Shyam VP
⇒ Shyam VERB NP
⇒ Shyam ate NP
⇒ Shyam ate ART NOUN
⇒ Shyam ate the NOUN
⇒ Shyam ate the bun
A possible bottom-up parse of the sentence "Shyam ate the bun" may be
NAME ate the bun
⇒ NAME VERB the bun
⇒ NAME VERB ART bun
⇒ NAME VERB ART NOUN
⇒ NP VERB ART NOUN
⇒ NP VERB NP
⇒ NP VP
⇒ S
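The top-down strategy can be sketched as a small recursive-descent recognizer for this toy grammar. This is a hypothetical illustration; the lexicon entries are assumed for the example:

```python
# A sketch of top-down (recursive-descent) parsing for the toy grammar
# above: try each rewrite rule for a non-terminal, and match terminal
# categories against the categories of the input words.

RULES = {
    "S":  [["NP", "VP"]],
    "VP": [["VERB", "NP"]],
    "NP": [["NAME"], ["ART", "NOUN"]],
}
LEXICON = {"Shyam": "NAME", "ate": "VERB", "the": "ART", "bun": "NOUN"}

def parse(symbol, words, pos):
    """Try to expand `symbol` starting at words[pos].
    Return the position after the matched span, or None on failure."""
    if symbol in RULES:                      # non-terminal: try each rule
        for rhs in RULES[symbol]:
            p = pos
            for sym in rhs:
                p = parse(sym, words, p)
                if p is None:
                    break                    # this rule failed; try next
            else:
                return p                     # whole right-hand side matched
        return None
    # terminal category: match the current word's lexical category
    if pos < len(words) and LEXICON.get(words[pos]) == symbol:
        return pos + 1
    return None

sentence = ["Shyam", "ate", "the", "bun"]
print(parse("S", sentence, 0) == len(sentence))  # True
```

The recursion mirrors the derivation shown above: S expands to NP VP, NP tries NAME first, and so on, with the lexicon supplying the final word-to-category step. A full parser would also build the tree and handle backtracking over partial matches, omitted here for brevity.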
Another grammar representation is often more convenient for visualizing the grammar. This formalism is based upon the notion of a transition network consisting of nodes and labelled arcs. Consider the network named S in the following grammar. Each arc is labelled with a word category. Starting at a given node, you can traverse an arc if the current word in the sentence is in the category on the arc. If the arc is followed, the current word is updated to the next word. This network recognizes the same set of sentences as the following context-free grammar:
S → NOUN S1
S1 → VERB S2
S2 → NOUN
Example:
"Shyam ate mango" can be accepted by the above grammar.
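The network can be represented as a table of arcs, each labelled with a word category, and traversed exactly as described. The following is a sketch, with the word-category assignments assumed for the example:

```python
# A sketch of the simple transition network above. Each arc maps
# (state, word category) to the next state; reaching FINAL after
# consuming all words means the sentence is accepted.

ARCS = {
    ("S",  "NOUN"): "S1",
    ("S1", "VERB"): "S2",
    ("S2", "NOUN"): "FINAL",
}
CATEGORIES = {"Shyam": "NOUN", "ate": "VERB", "mango": "NOUN"}

def accepts(words):
    """Traverse the network, consuming one word per arc."""
    state = "S"
    for word in words:
        state = ARCS.get((state, CATEGORIES.get(word)))
        if state is None:          # no arc for this word here: reject
            return False
    return state == "FINAL"

print(accepts(["Shyam", "ate", "mango"]))  # True
```

Note that "Shyam ate" is rejected even though every word matches an arc, because the traversal ends in S2 rather than the final state.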
The Simple Transition Network (STN) formalism is not powerful enough to
describe all languages that can be described by a CFG. To get the descriptive power
of CFGs, we need a notion of recursion in the network grammar. A Recursive
Transition Network (RTN) is like a simple transition network, except that it allows
arc labels that refer to other networks rather than word categories.
But the method doubles the size of the grammar. In addition, to check some other feature, you would need to double the size of the grammar again. Besides the problem of size, this approach also forces you to make the singular/plural distinction even when it is not necessary to check it.
A better solution is to allow words and syntactic structures to have features as well as a basic category. We can do so by using the slot-value list notation introduced earlier for syntactic structure. This notation allows you to store number information, as well as other useful information about the word, in a data structure called the lexicon. The following table shows a lexicon with the root and number information for some words, in which 3S means third person singular and 3P means third person plural.

Word	Representation
cats	(NOUN ROOT cat NUM {3P})
cat	(NOUN ROOT cat NUM {3S})
the	(ART ROOT the NUM {3S 3P})
a	(ART ROOT a NUM {3S})
cried	(VERB ROOT cry NUM {3S 3P})
hates	(VERB ROOT hate NUM {3S})
hate	(VERB ROOT hate NUM {3P})
Shyam	(NAME ROOT Shyam)

Check Your Progress
3. List the two common techniques for CFGs.
4. The simple transition network (STN) formalism is not powerful enough to describe all languages that can be described by a CFG. (True or False?)
5. An augmented transition network (ATN) is a bottom-up parsing procedure that allows various kinds of knowledge to be incorporated into the parsing system so it can operate efficiently. (True or False?)
Now extend the RTN formalism by adding a test to each arc in the network. A test is simply a function that succeeds if it returns a non-empty value, such as a set or an atom, and fails if it returns the empty set or nil. If a test fails, its arc is not traversed.
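An arc test for number agreement can be sketched using a lexicon in the slot-value style of the table above; the entries here are illustrative assumptions. The test returns the NUM values shared by the article and the noun, so a non-empty set means success and the empty set means failure:

```python
# A sketch of an RTN arc test for number agreement: the test succeeds
# with the (non-empty) intersection of the NUM features, and fails with
# the empty set, as described above. The lexicon entries are assumed.

LEXICON = {
    "a":    {"CAT": "ART",  "ROOT": "a",   "NUM": {"3S"}},
    "the":  {"CAT": "ART",  "ROOT": "the", "NUM": {"3S", "3P"}},
    "cat":  {"CAT": "NOUN", "ROOT": "cat", "NUM": {"3S"}},
    "cats": {"CAT": "NOUN", "ROOT": "cat", "NUM": {"3P"}},
}

def number_test(article, noun):
    """Arc test: the NUM values the two words agree on."""
    return LEXICON[article]["NUM"] & LEXICON[noun]["NUM"]

print(number_test("a", "cat"))    # {'3S'}  -> test succeeds, arc taken
print(number_test("a", "cats"))   # set()   -> test fails, arc not taken
```

Because "the" carries both 3S and 3P, it agrees with either noun form, while "a cats" is correctly rejected without duplicating any grammar rules.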
Elements of sets.
Ex. "The decals we have in stock are stars, the moon, item and a flag.
I'll take two moons."
To understand the second sentence, all that is required is that we use the context of the first sentence to establish that the word "moons" means moon decals.
Names of individuals.
Ex. "John went to the movie."
"John" should be understood to be some person named John.
Causal chains.
Ex. There was a big snow storm yesterday.
The schools were closed today.
The snow should be recognized as the reason that the schools were closed.
Planning Sequences.
Ex. Shalu wanted a new car.
She decided to get a job.
Shalu's sudden interest in a job should be recognized as arising out of her desire for a new car and thus for the money to buy a car.
Illocutionary force.
Ex. It sure is cold in here.
In many circumstances, this sentence should be recognized as having, as its
intended effect, that the hearer should do something like close the window or
turn up the thermostat.
Implicit presuppositions.
Ex. "Did Sam fail CS101?"
The speaker's presuppositions, including the facts that CS101 is a valid course, that Sam is a student, and that Sam took CS101, should be recognized.
In order to be able to recognize these kinds of relationships among sentences, a
great deal of knowledge about the world is required. Programs that can do multiple
sentence understanding rely either on large knowledge bases or on strong constraints
on the domain of discourse so that only a more limited knowledge base is necessary.
The way this knowledge is organized is critical to the success of the understanding
program. We should focus on the use of the following kinds of knowledge:
1. The current focus of the dialogue.
2. A model of each participants current beliefs.
3. The goal-driven character of dialogue.
4. The rules of conversation shared by all participants.
7.6.1 Conversational Postulates
Unfortunately, this analysis of language is complicated by the fact that we don't say exactly what we mean. We often use indirect speech acts, such as "It sure is cold in here." This regularity gives rise to a set of conversational postulates, which are rules
about conversation that are shared by all speakers. Usually, these rules are followed. Some of these conversational postulates are:
Sincerity condition
Ex: Can you open the door?
Reasonableness conditions
Ex: Can you open the door?
Why do you want it open?
Appropriateness Conditions
Ex: Who won the race?
Someone with long, dark hair.
I thought you knew all the runners.
Assuming a cooperative relationship between the parties to a dialogue, the shared
assumption of these postulates greatly facilitates communication.
7.8 SUMMARY

In this unit, you have learned about the tough problem of language understanding. One interesting way to summarize the natural language understanding problem is
to view it as a constraint satisfaction problem. Unfortunately, many more kinds of constraints must be considered. But constraint satisfaction does provide
a reasonable framework in which to view the whole collection of steps that together
create a meaning for a sentence. Essentially, each of the steps described in the unit
exploits a particular kind of knowledge that contributes a specific set of constraints
that must be satisfied by any correct final interpretation of a sentence.
You have also learned about syntactic processing which contributes a set of
constraints derived from the grammar of the language. Semantic processing
contributes an additional set of constraints derived from the knowledge it has about
entities that can exist in the world. The unit also discussed discourse processing
which contributes a further set of constraints that arise from the structure of coherent
discourses. And finally, pragmatics contributes yet another set of constraints. There
are many important issues in natural language processing. By combining
understanding and generation systems, it is possible to attack the problem of machine
translation, by which we understand text written in one language and then generate
it in another language.
7.11 QUESTIONS AND EXERCISES
1. What is natural language processing?
2. Write the production rules necessary to check the syntax of an English noun.
The grammar shall include both proper and common nouns.
3. What is a simple transition network (STN)?
4. Describe syntactic grammars and semantic grammars.