Artificial Intelligence (AI)

Computational Linguistics (CL)

The TALK project

Learning software project

Artificial Intelligence:
An easy introduction from a
computational linguistic perspective
Ciprian-Virgil Gerstenberger
Saami Language Technology, Giellatekno, University of Tromsø, Norway

24.11.2011 University of Murmansk

Outline

Artificial Intelligence (AI)
Computational Linguistics (CL)
The TALK project
Learning software project

What is Artificial Intelligence?

What is artificial?
anything made by humans

What is intelligence?
the capacity to learn and solve problems
(Webster dictionary)

the ability to think and act rationally

Definitions
Artificial Intelligence is a branch of Science which deals with helping
machines find solutions to complex problems in a more human-like
fashion. This generally involves borrowing characteristics from human
intelligence, and applying them as algorithms in a computer friendly
way.
http://ai-depot.com/Intro.html

Artificial Intelligence (AI) is the area of computer science focusing on
creating machines that can engage on behaviors that humans
consider intelligent.
http://library.thinkquest.org/2705/

Measuring intelligence: Turing


The Turing Test (1950)
The computer is interrogated by a human via a teletype. It passes
the test if the human cannot tell whether there is a computer or a
human at the other end.

http://www.cs.cmu.edu/afs/cs/academic/class/15381-s07/www/slides/011607comboIntro.pdf

Is this test sufficient?

Measuring intelligence: Searle

The Chinese Room Argument, John Searle (1980)

An Englishman who knows no Chinese is locked in a room with Chinese
symbols (a database) and a book of instructions for manipulating the
symbols (the program). He receives Chinese symbols which, unknown
to him, are questions in Chinese (the input). By following the
instructions in the program, the man is able to pass out Chinese
symbols which are correct answers to the questions (the output).
The program enables the person in the room to pass the Turing
Test for understanding Chinese, but he does not understand a word of
Chinese.

Intelligent systems

http://www.cs.cmu.edu/afs/cs/academic/class/15381-s07/www/slides/011607comboIntro.pdf

Intelligent systems
Key steps of a knowledge-based agent (Craik, 1943):
the stimulus must be translated into an internal
representation
humans: sensory organs vs. machines: sensors

the representation is manipulated by cognitive processes
to derive new internal representations
humans: which representation?
memory

these in turn are translated into action
complex in humans
sometimes unpredictable
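
The three steps above can be sketched as a small perceive-reason-act loop. This is only an illustration under assumed names: the thermostat scenario and the functions `perceive`, `reason` and `act` are invented for the example and are not taken from Craik.

```python
from dataclasses import dataclass


@dataclass
class Percept:
    """Internal representation of a raw stimulus (hypothetical)."""
    temperature: float


def perceive(raw_reading: str) -> Percept:
    # Step 1: translate the stimulus into an internal representation.
    return Percept(temperature=float(raw_reading))


def reason(percept: Percept, target: float = 21.0) -> str:
    # Step 2: manipulate the representation to derive a new one (a decision).
    if percept.temperature < target - 1.0:
        return "heat"
    if percept.temperature > target + 1.0:
        return "cool"
    return "idle"


def act(decision: str) -> str:
    # Step 3: translate the derived representation into an action.
    actions = {"heat": "heater on", "cool": "fan on", "idle": "do nothing"}
    return actions[decision]


def agent_step(raw_reading: str) -> str:
    """Run one full stimulus -> representation -> action cycle."""
    return act(reason(perceive(raw_reading)))


if __name__ == "__main__":
    for reading in ["18.5", "21.2", "24.0"]:
        print(reading, "->", agent_step(reading))
```

A machine agent of this kind is fully predictable; the slide's point is that the same three steps are far more complex in humans.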

Complexity and Efficiency

Solving problems
huge computational complexity
Does the intelligent system answer at all?

space-time trade-offs
Does the intelligent system answer in reasonable time?

optimizing the search by use of domain knowledge
heuristics
pruning
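
How much a heuristic helps can be seen in a small experiment. The sketch below is an assumed example, not part of the talk: it runs best-first search on an empty grid, once with a zero heuristic (uniform-cost search) and once with the Manhattan distance (A*), and counts the expanded nodes. Paths that are no better than one already known are pruned.

```python
import heapq


def manhattan(node, goal):
    """Domain knowledge: on a grid, the goal is at least this far away."""
    return abs(node[0] - goal[0]) + abs(node[1] - goal[1])


def no_heuristic(node, goal):
    return 0


def search(start, goal, size, heuristic):
    """Best-first search on an empty size x size grid.

    Returns (path_length, expanded_nodes). With a zero heuristic this is
    uniform-cost search; with an admissible heuristic it is A*.
    """
    frontier = [(heuristic(start, goal), 0, start)]
    best_cost = {start: 0}
    expanded = 0
    while frontier:
        _, cost, node = heapq.heappop(frontier)
        if cost > best_cost.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was found meanwhile
        expanded += 1
        if node == goal:
            return cost, expanded
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= nxt[0] < size and 0 <= nxt[1] < size):
                continue
            new_cost = cost + 1
            # Pruning: ignore paths that do not improve on a known one.
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                priority = new_cost + heuristic(nxt, goal)
                heapq.heappush(frontier, (priority, new_cost, nxt))
    return None, expanded


if __name__ == "__main__":
    blind = search((0, 0), (9, 0), 10, no_heuristic)
    informed = search((0, 0), (9, 0), 10, manhattan)
    print("uniform-cost (length, expanded):", blind)
    print("A* with Manhattan (length, expanded):", informed)
```

Both runs find a path of the same length, but the informed search expands far fewer nodes, which is the difference between answering in reasonable time and not answering at all on larger problems.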

Approaches to AI: Classification

Bottom-Up:
the machine will discover the world on its own,
the way humans do
Top-Down:
learning occurs from what is already known

What is the bottom?
observed data

What is the top?
abstractions; data models

Basic tasks
Searching
filtering material of a certain type
Recognizing patterns
abstracting; classifying
Constraint solving
satisfying a number of limitations
Reasoning (with uncertain information)
drawing conclusions; deduction; induction
Learning
world changing; maintaining an accurate model; dynamicity
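
As an illustration of constraint solving, here is a minimal backtracking sketch. The lecture-scheduling scenario and all names in it are assumptions made for the example.

```python
def solve(variables, domains, conflicts, assignment=None):
    """Backtracking search.

    Assign a value to every variable so that no two conflicting
    variables share a value. Returns a dict, or None if impossible.
    """
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):
        return assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        # Constraint check: the value must differ from all conflicting ones.
        if all(assignment.get(other) != value for other in conflicts[var]):
            result = solve(variables, domains, conflicts,
                           {**assignment, var: value})
            if result is not None:
                return result
    return None  # dead end: the caller tries its next value


# Hypothetical problem: three lectures that share students must not
# be scheduled on the same day.
LECTURES = ["AI", "CL", "NLP"]
CONFLICTS = {"AI": ["CL", "NLP"], "CL": ["AI", "NLP"], "NLP": ["AI", "CL"]}


def schedule(days):
    domains = {lecture: list(days) for lecture in LECTURES}
    return solve(LECTURES, domains, CONFLICTS)


if __name__ == "__main__":
    print(schedule(["Mon", "Tue", "Wed"]))
    print(schedule(["Mon", "Tue"]))  # too few days: no solution
```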

Approaches to AI: Examples


Pattern Recognition
Expert Systems
Human-Computer Interaction
Games
Auction design
Diagnosis
Neural Networks and Parallel Computation
Evolutionary Computation and Planning
Genetic Algorithms
Logic Programming
Robotics

AI Foundations
Involving cross-disciplinary studies:
Philosophy: logic, philosophy of mind, philosophy of
science, philosophy of mathematics
Mathematics: logic, probability theory, theory of
computability
Psychology: behaviorism, cognitive psychology,
neuroscience
Linguistics: theory of grammar, syntax, semantics
Information Theory, Computer Science, Engineering:
hardware, algorithms, computational complexity theory
mechanical engineering
...

AI and Computational Linguistics

Computational Linguistics: central role within AI


Automatic Speech Recognition
one of the oldest pattern recognition tasks

Machine Translation
ETAP-1: Russian-English (starting in the 1970s)
Google services

Human-Machine Interaction
dialogue modeling
dialogue systems

Definitions

Computational linguistics is the science of language with particular
attention given to the processing complexity constraints dictated by
the human cognitive architecture. Like most sciences, computational
linguistics also has engineering applications.
http://www.cs.tcd.ie/courses/csll/CSLLcourse.html

Computational linguistics is the study of computer systems for
understanding and generating natural language.
(Ralph Grishman, Computational Linguistics: An Introduction, Cambridge University Press 1986. )

Natural Language Processing (NLP)

Spoken Language
Automatic Speech Recognition (ASR)
Speech Synthesis, e.g., Text-To-Speech (TTS)

Natural Language Processing (NLP)

Written Language
Natural Language Analysis (NLA)
input utterance → abstractions
surface form → meaning(s)

Natural Language Generation (NLG)
abstraction → output utterance
meaning → surface form(s)
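
The two directions can be illustrated with a toy example. The sketch below is an assumption made for this purpose: a single hand-written pattern and the hypothetical functions `analyse` and `generate`, which map a surface form to a meaning frame and a meaning frame back to several surface forms.

```python
import re

# Hypothetical pattern covering utterances like "play X by Y".
PATTERN = re.compile(r"^play (?P<title>.+?) by (?P<artist>.+)$", re.IGNORECASE)


def analyse(utterance):
    """NLA: surface form -> meaning (a small frame), or None if not covered."""
    match = PATTERN.match(utterance.strip())
    if match is None:
        return None
    return {
        "act": "play",
        "title": match.group("title"),
        "artist": match.group("artist"),
    }


def generate(meaning):
    """NLG: meaning -> surface form(s). One meaning, several wordings."""
    title, artist = meaning["title"], meaning["artist"]
    return [
        f"Play {title} by {artist}",
        f"Please play the song {title} by {artist}",
    ]


if __name__ == "__main__":
    frame = analyse("play Yesterday by The Beatles")
    print(frame)
    for sentence in generate(frame):
        print(sentence)
```

The second generated wording is not covered by the analysis pattern, which already hints at the coverage problem of hand-written rules.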

Approaches in CL
Rule-Based
explicit encoding of linguistic knowledge
usually consisting of a set of hand-crafted, grammatical
rules
easy to test and debug
require considerable human effort
often based on limited inspection of the data with an
emphasis on prototypical examples
often fail to reach sufficient domain coverage
often lack sufficient robustness when input data are noisy
http://www.sfs.uni-tuebingen.de/~fr/teaching/ws05-06/icl/slides/lecture2.pdf

Approaches in CL
Data-Driven
implicit encoding of linguistic knowledge
often using statistical methods or machine learning
methods
require less human effort
require large-scale data sources
coverage directly proportional to the richness of the data
source
more adaptive to noisy data
http://www.sfs.uni-tuebingen.de/~fr/teaching/ws05-06/icl/slides/lecture2.pdf
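
The contrast between the rule-based and the data-driven approach can be made concrete with a toy part-of-speech tagger built both ways. Everything in the sketch is an assumption for illustration: the three tags, the suffix rules and the tiny annotated corpus.

```python
from collections import Counter, defaultdict


def rule_based_tag(word):
    """Explicit, hand-crafted linguistic knowledge."""
    w = word.lower()
    if w in {"the", "a", "an"}:
        return "DET"
    if w.endswith("ing") or w.endswith("ed"):
        return "VERB"
    return "NOUN"


def train(tagged_corpus):
    """Implicit knowledge: remember the most frequent tag of each word."""
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word.lower()][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}


def data_driven_tag(model, word, default="NOUN"):
    # Unseen words fall back to a default: coverage depends on the data.
    return model.get(word.lower(), default)


# Hypothetical annotated data.
CORPUS = [
    ("the", "DET"), ("dog", "NOUN"), ("barked", "VERB"),
    ("the", "DET"), ("king", "NOUN"), ("sings", "VERB"),
    ("king", "NOUN"),
]

if __name__ == "__main__":
    model = train(CORPUS)
    for word in ["king", "walked"]:
        print(word, rule_based_tag(word), data_driven_tag(model, word))
```

The rule "ends in -ing, so it is a verb" mislabels "king", a case the prototypical examples did not anticipate, while the data-driven tagger mislabels "walked" because the word never occurred in its training data.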

Application Areas

machine translation
speech recognition
speech synthesis
text generation
man-machine interfaces
intelligent word processing: spelling correction, grammar
correction
document management: information retrieval, information
extraction, text summarization

The Project

Talk and Look: Tools for Ambient Linguistic Knowledge


funded by the EU as project No. IST-507802 within the 6th
Framework Programme
cooperation between
German Research Center for Artificial Intelligence (DFKI)
University of Saarland, Germany (USAAR)
BOSCH
BMW

The Aims

focusing on the development of new technologies for
adaptive multimodal and multilingual human-computer
dialogue systems
make dialogue interfaces more conversational, robust,
intuitive, and user-adaptive
Long-term vision: users interacting naturally with devices and
services, in the home or car, using speech, graphics, or a
combination of the two

The SAMMIE Corpus

Collecting Data
multimodal MP3 player interaction experiment
car driving simulation and interaction with an MP3 player at
the same time
Wizard-of-Oz study
different human wizards decide whether to pose a
clarification request multimodally or to use
speech alone

The Experiment

The SAMMIE Corpus

Goals
learning policies from multimodal interaction based on
factors such as long vs. short song lists, interaction with
the MP3 player not being the main focus
learning multimodal presentation strategies
learning multimodal clarification strategies

The SAMMIE System

Developing a prototype of a dialogue system


to show natural, intuitive mixed-initiative interaction
particular emphasis on multimodal turn-planning
particular emphasis on flexible natural language generation

The SAMMIE Multimodal System Architecture

System Architecture

Description
the classical approach of a pipelined architecture
multimodal fusion and fission modules as parts of the
dialogue manager
dialogue manager decides on the next system move,
based on its model of the tasks, the current context and
the result of the song database query
generation of an appropriate message to the driver
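
The pipelined flow described above can be sketched in a few functions. This is not the SAMMIE implementation: the song list, the move names and all function names are assumptions, and fusion and fission are written as separate functions only for readability, although the slide places them inside the dialogue manager.

```python
# Hypothetical song database.
SONGS = [
    {"title": "Yesterday", "artist": "The Beatles"},
    {"title": "Help!", "artist": "The Beatles"},
    {"title": "Imagine", "artist": "John Lennon"},
]


def fuse(speech, gui_selection=None):
    """Multimodal fusion: merge spoken input and an optional GUI selection."""
    request = {"artist": None}
    words = speech.lower()
    for song in SONGS:
        if song["artist"].lower() in words:
            request["artist"] = song["artist"]
    if gui_selection is not None:
        request["artist"] = gui_selection  # the explicit selection wins
    return request


def dialogue_manager(request):
    """Decide on the next system move from the request and the query result."""
    if request["artist"] is None:
        return {"move": "clarify", "songs": []}
    hits = [s["title"] for s in SONGS if s["artist"] == request["artist"]]
    if len(hits) == 1:
        return {"move": "play", "songs": hits}
    return {"move": "present_list", "songs": hits}


def fission(move):
    """Multimodal fission: split the move into a spoken and a graphical part."""
    if move["move"] == "clarify":
        return {"speech": "Which artist do you mean?", "screen": []}
    if move["move"] == "play":
        return {"speech": f"Playing {move['songs'][0]}.", "screen": move["songs"]}
    count = len(move["songs"])
    return {"speech": f"I found {count} songs.", "screen": move["songs"]}


def pipeline(speech, gui_selection=None):
    """Input -> fusion -> dialogue manager -> fission -> output message."""
    return fission(dialogue_manager(fuse(speech, gui_selection)))


if __name__ == "__main__":
    print(pipeline("show me songs by the beatles"))
    print(pipeline("play this one", gui_selection="John Lennon"))
    print(pipeline("play music"))
```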

Interaction
A typical interaction with the SAMMIE system

Multimodality
Screenshot of the final in-car showcase system's GUI,
main menu

push-to-talk (not barge-in!)

System integration: How?

Nuance Java API for speech recognition
MySQL Database
Dialogue Manager
MARY Text-To-Speech Java API for speech synthesis
...

Application Programming Interface (API)
+
Middleware: The Open Agent Architecture (OAA)
OAA 2.x agent libraries for:
Prolog
ANSI C/C++
Java
Compaq's Web Language

Example of integration
Generic MySQL query OAA agent
connect( host, port, user, pass, database, Result )

Tourist information scenario


retrieve all hotels which are cheaper than 30:
oaa_Solve( sendGenericDBQuery( [id, name], [type=hotel,
price<30], Result ), [] )

and print their ids and names:

sendGenericDBQuery( [id, name], [type=hotel,
price<30], [ ['H1', 'HOTEL PRIMUS'], ['H5', 'ART HOUSE HOTEL'],
['H6', 'ALEXANDER HOTEL'] ] )
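
What such a generic query agent might do internally can be sketched in Python. The sketch makes several assumptions: it uses the standard-library `sqlite3` module with an in-memory database instead of MySQL and OAA, the function `generic_db_query`, the table `places` and all prices are hypothetical, and only the three hotel ids and names come from the example above.

```python
import sqlite3

ALLOWED_COLUMNS = {"id", "name", "type", "price"}
ALLOWED_OPERATORS = {"=", "<", ">", "<=", ">="}


def generic_db_query(conn, columns, conditions):
    """Select `columns` from the places table where all `conditions` hold.

    `conditions` is a list of (column, operator, value) triples. Column
    names and operators are checked against whitelists; values are
    passed as bound parameters.
    """
    for col in columns:
        if col not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {col}")
    clauses, params = [], []
    for col, op, value in conditions:
        if col not in ALLOWED_COLUMNS or op not in ALLOWED_OPERATORS:
            raise ValueError(f"bad condition: {col} {op}")
        clauses.append(f"{col} {op} ?")
        params.append(value)
    sql = f"SELECT {', '.join(columns)} FROM places"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY id"
    return [list(row) for row in conn.execute(sql, params)]


def make_demo_db():
    """Build a small in-memory database with invented prices."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE places (id TEXT, name TEXT, type TEXT, price REAL)"
    )
    conn.executemany("INSERT INTO places VALUES (?, ?, ?, ?)", [
        ("H1", "HOTEL PRIMUS", "hotel", 25.0),
        ("H2", "GRAND HOTEL", "hotel", 80.0),        # invented entry
        ("H5", "ART HOUSE HOTEL", "hotel", 28.0),
        ("H6", "ALEXANDER HOTEL", "hotel", 29.5),
        ("R1", "CAFE CENTRAL", "restaurant", 12.0),  # invented entry
    ])
    return conn


if __name__ == "__main__":
    db = make_demo_db()
    conditions = [("type", "=", "hotel"), ("price", "<", 30)]
    print(generic_db_query(db, ["id", "name"], conditions))
```

The point of the generic agent is the same as in this sketch: other components state what they want as a list of columns and conditions and never write SQL themselves.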

OAA Example

The Oahpa! project

OAHPA!
a web-based, language learning program for North Saami
Computer-Assisted/Aided Language Learning (CALL)

http://oahpa.uit.no

North Saami Oahpa!

South Saami Oahpa!

The Learning Units


Simple learning units
Numra: practise numerals
Leksa: train vocabulary
MorfaS: train word inflection
More complex learning units
MorfaC: train word inflection in the context of a well-formed
sentence
Vasta: give answers to random questions
Sahka: interactive dialogues

The Infrastructure

General software
Django: a high-level Python Web framework for rapid
development of Web applications
MySQL database
JavaScript for polishing some web features not
manageable in Django alone

The Infrastructure
Special software
Xerox tools
twolc: for morphophonology
lexc: for morphology
xfst: for compiling transducers
lookup: for analysis and generation
Finite-State Morphology → Finite-State Automata

VISLCG: parser for Constraint Grammar
Parsing → Syntax Analysis
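
The idea behind finite-state morphology, one transducer used for both analysis and generation, can be shown with a hand-built toy. The sketch does not use the Xerox tools or their file formats; the two-word English lexicon, the tags `+N`, `+Sg`, `+Pl` and the function names are assumptions for illustration.

```python
import re

# Arcs: (source state, upper symbol, lower symbol, target state).
# The upper side carries lemma and tags, the lower side the word form.
# An empty string stands for epsilon (nothing on that side).
ARCS = [
    (0, "c", "c", 1), (1, "a", "a", 2), (2, "t", "t", 3),
    (0, "d", "d", 4), (4, "o", "o", 5), (5, "g", "g", 3),
    (3, "+N", "", 6),
    (6, "+Sg", "", 7),
    (6, "+Pl", "s", 7),
]
FINAL_STATES = {7}


def apply_fst(symbols, side):
    """Run the transducer; side 0 reads the upper side, side 1 the lower."""
    results = []
    stack = [(0, 0, [])]  # (state, position in input, output so far)
    while stack:
        state, pos, out = stack.pop()
        if state in FINAL_STATES and pos == len(symbols):
            results.append("".join(out))
        for src, upper, lower, dst in ARCS:
            if src != state:
                continue
            inp, outp = (upper, lower) if side == 0 else (lower, upper)
            if inp == "":
                stack.append((dst, pos, out + [outp]))  # epsilon input
            elif pos < len(symbols) and symbols[pos] == inp:
                stack.append((dst, pos + 1, out + [outp]))
    return results


def generate(analysis):
    """Generation: 'cat+N+Pl' -> ['cats']."""
    return apply_fst(re.findall(r"\+[A-Za-z]+|.", analysis), side=0)


def analyse(word_form):
    """Analysis: 'cats' -> ['cat+N+Pl']."""
    return apply_fst(list(word_form), side=1)


if __name__ == "__main__":
    print(generate("cat+N+Pl"))
    print(analyse("cats"))
    print(analyse("dog"))
```

Real morphological transducers are compiled from lexicon and rule descriptions rather than written arc by arc, but they are applied in the same two directions.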

The Games

Options
teaching book
word class
dialectal form
level of difficulty

The Games

Game Rating
for simple games, just match against the correct answers
from the database
for more complex games, expert knowledge stored in
different modules:
help information on request
XML file for giving feedback on specific errors
special parsers for dialogue moves for the dialogue unit
Sahka, based on Constraint Grammar
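
Rating an answer with error-specific feedback could look roughly like the sketch below. It is not Oahpa!'s code: the XML layout, the error labels and the English toy task (give the plural of "child") are assumptions; only the idea of matching against stored correct answers and reading feedback messages from an XML file comes from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical feedback file content.
FEEDBACK_XML = """
<feedback>
  <message error="regular_plural">This noun has an irregular plural.</message>
  <message error="singular">You gave the singular form; the plural was asked for.</message>
  <message error="unknown">That is not the expected answer.</message>
</feedback>
"""


def load_feedback(xml_text):
    """Read error-type -> message pairs from the XML text."""
    root = ET.fromstring(xml_text.strip())
    return {m.get("error"): m.text.strip() for m in root.findall("message")}


def rate(answer, correct_answers, known_errors, feedback):
    """Return (is_correct, message) for a learner's answer."""
    normalised = answer.strip().lower()
    if normalised in correct_answers:
        return True, "Correct!"
    # Expert knowledge: some wrong answers are typical and get a hint.
    error_type = known_errors.get(normalised, "unknown")
    return False, feedback[error_type]


# Hypothetical task data for the question "plural of 'child'?".
CORRECT = {"children"}
KNOWN_ERRORS = {"childs": "regular_plural", "child": "singular"}

if __name__ == "__main__":
    messages = load_feedback(FEEDBACK_XML)
    for attempt in ["Children", "childs", "child", "kids"]:
        print(attempt, "->", rate(attempt, CORRECT, KNOWN_ERRORS, messages))
```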

Rating vocabulary training

Rating dialogue answers

Sahka
Overview of the analysis process

Ideas
AI and CL for education projects

useful
reasoning: assumptions about the student's knowledge of
a specific subject
specialized databases
language: special terminology; a mix of natural language
and special coding (mathematics, physics, chemistry, etc.)
specialized parsers

Think up your own project!

Conclusion

"intelligent" program: a matter of interpretation
plenty of free software on the web to start your own project
you need a clear idea of what it should be about
software integration via API and middleware
specialised software for natural language input and output
your contribution: your knowledge of a specific domain