
Statistical Natural Language Processing

What is NLP?

Natural Language Processing (NLP), or Computational Linguistics, is concerned with theoretical and practical issues in the design and implementation of computer systems for processing human languages.
It is an interdisciplinary field which draws on other areas of study such as computer science, artificial intelligence, linguistics and logic.

Applications of NLP
Natural language interfaces to databases
Programs for classifying and retrieving documents by content
Explanation generation for expert systems
Machine translation
Advanced word-processing tools

What makes NLP a computational challenge?
Ambiguous nature of natural language.
There are varied applications for language technology.
Knowledge representation is a difficult task.
There are different levels of information encoded in our language.

What is statistical NLP?


Statistical NLP aims to perform statistical inference for the field of NLP.
Statistical inference consists of taking some data generated in accordance with some unknown probability distribution and making inferences about that distribution.

Motivations for Statistical NLP


Cognitive modeling of human language processing has not reached a stage where we can have a complete mapping between the language signal and the information content.
A complete mapping is not always required.
A statistical approach provides the flexibility required to make the modeling of a language more accurate.

Idea behind Statistical NLP


View language processing as information transmission over a noisy channel.
The approach requires a model that characterizes the transmission by giving, for every message, the probability of the observed output.
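
The noisy channel view can be illustrated with a minimal sketch. It assumes a toy spelling task in which the intended message is one of two words and the observed output may be a misspelling. The words, the prior and the channel probabilities are all hypothetical values chosen for illustration, and combining the channel model with a prior over messages is the usual way of decoding rather than something the slide spells out.

```python
# Toy noisy-channel decoder. All probabilities are made-up illustrative values.

# Prior P(message): how likely each intended message is before any output is seen.
prior = {"their": 0.6, "there": 0.4}

# Channel model P(observed | message): for every message, the probability
# of each output that may be observed after transmission.
channel = {
    "their": {"their": 0.80, "thier": 0.15, "there": 0.05},
    "there": {"there": 0.90, "thier": 0.02, "their": 0.08},
}

def decode(observed):
    """Return the message maximizing P(message) * P(observed | message), plus all scores."""
    scores = {m: prior[m] * channel[m].get(observed, 0.0) for m in prior}
    return max(scores, key=scores.get), scores

best, scores = decode("thier")
print(best, scores)
```

With these assumed numbers the decoder prefers "their" for the observed output "thier", because both the prior and the channel probability favor it.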

Statistical Modeling and Classification
Primitive acoustic features
Quantization
Maximum likelihood and related rules
Class conditional density function
Hidden Markov Model Methodology

Details.
Primitive acoustic features are used to estimate the speech spectrum on the basis of its statistical properties.
By means of quantization, a typical speech signal can be represented as a sequence of symbols and mapped, using statistical decision rules, into a multidimensional acoustic feature space, thus classifying the signal.
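
A minimal sketch of the quantization step, assuming a nearest-neighbor rule against a small codebook (vector quantization). The slides do not say which quantizer is used, so this choice is an assumption, and the codebook entries and frame values are hypothetical stand-ins for real acoustic feature vectors.

```python
# Hypothetical 2-D codebook: each entry is a prototype feature vector.
# The index of an entry serves as the symbol for that region of feature space.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.5)]

def quantize(frame):
    """Return the index of the codebook entry closest to the frame (squared Euclidean distance)."""
    def dist2(entry):
        return sum((f - e) ** 2 for f, e in zip(frame, entry))
    return min(range(len(codebook)), key=lambda i: dist2(codebook[i]))

# Hypothetical feature vectors, one per frame of the signal.
frames = [(0.1, -0.2), (0.9, 1.2), (3.5, 0.4), (1.1, 0.8)]

# The signal represented as a sequence of symbols.
symbols = [quantize(f) for f in frames]
print(symbols)  # [0, 1, 2, 1]
```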

Maximum Likelihood
Although there is no direct method for computing the probability of a phonetic unit given its acoustic features, we can use Bayes' rule to estimate the probability of a phonetic class given its features from the likelihood of the features given the class.
This method leads to the maximum likelihood classifier, which assigns an unknown vector to the class whose class-conditional probability density function has the maximum value at that vector.
Another variant of the maximum likelihood methodology is clustering.
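
The classifier described above can be sketched as follows, assuming one-dimensional features and Gaussian class-conditional densities. The class names, means, standard deviations and priors are hypothetical, and the Gaussian form is an assumption made only to keep the example short.

```python
import math

# Hypothetical Gaussian class-conditional densities p(x | class): (mean, std dev).
class_params = {"vowel": (1.0, 0.5), "fricative": (3.0, 1.0)}
# Hypothetical class priors P(class).
priors = {"vowel": 0.5, "fricative": 0.5}

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std dev at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def ml_classify(x):
    """Maximum likelihood rule: pick the class with the largest p(x | class)."""
    return max(class_params, key=lambda c: gaussian_pdf(x, *class_params[c]))

def posterior(x):
    """Bayes' rule: P(class | x) = p(x | class) * P(class) / p(x)."""
    joint = {c: gaussian_pdf(x, *class_params[c]) * priors[c] for c in class_params}
    evidence = sum(joint.values())  # p(x), the sum over all classes
    return {c: joint[c] / evidence for c in joint}

print(ml_classify(1.2), posterior(1.2))
```

When the priors are equal, as assumed here, the class with the highest posterior is the same as the class chosen by the maximum likelihood rule.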

Hidden Markov Models


A Hidden Markov Model is a set of states (lexical categories in our case) with directed edges labeled with transition probabilities. Each transition probability indicates the probability of moving to the state at the end of the directed edge, given that one is now in the state at the start of the edge.
The states are also labeled with a function which indicates the probabilities of outputting different symbols while in that state (in each state, one outputs a single symbol before moving to the next state).
In our case, the symbol output from a state (lexical category) is a word belonging to that lexical category.
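
A minimal sketch of such a model, assuming two lexical categories (N and V) and using the Viterbi algorithm to find the most probable state sequence for a word sequence. The start, transition and output probabilities are hypothetical, and Viterbi decoding is one common way to use an HMM for tagging, not necessarily the method these slides had in mind.

```python
# Toy HMM over lexical categories; every probability is an illustrative assumption.
states = ["N", "V"]
start = {"N": 0.9, "V": 0.1}                    # P(first state)
trans = {"N": {"N": 0.3, "V": 0.7},             # P(next state | current state)
         "V": {"N": 0.6, "V": 0.4}}
emit = {"N": {"John": 0.6, "walks": 0.4},       # P(word | state)
        "V": {"walks": 1.0}}

def viterbi(words):
    """Return (probability, state sequence) of the most probable path for the words."""
    # best[s] holds the probability and path of the best path ending in state s.
    best = {s: (start[s] * emit[s].get(words[0], 0.0), [s]) for s in states}
    for word in words[1:]:
        new_best = {}
        for s in states:
            # Choose the predecessor state that maximizes the path probability into s.
            prob, path = max(
                ((best[p][0] * trans[p][s], best[p][1]) for p in states),
                key=lambda item: item[0],
            )
            new_best[s] = (prob * emit[s].get(word, 0.0), path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])

prob, tags = viterbi(["John", "walks"])
print(tags, prob)
```

Under these assumed probabilities the best path tags "John" as N and "walks" as V.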

Class Conditional Density Function
All statistical methods of speech recognition depend on the class conditional density function.
These, in turn, depend on the existence of a sufficiently large, correctly labeled training set and well understood statistical estimation techniques.
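
As an illustration of estimating class conditional densities from a labeled training set, here is a minimal sketch that assumes one-dimensional features and a Gaussian density per class. The training values and the labels "a" and "b" are hypothetical, and a real training set would need to be far larger.

```python
from collections import defaultdict
from statistics import fmean, pstdev

# Hypothetical labeled training set: (feature value, class label).
training = [(0.9, "a"), (1.1, "a"), (1.0, "a"),
            (2.8, "b"), (3.2, "b"), (3.0, "b")]

def fit_gaussians(samples):
    """Maximum likelihood estimates of (mean, std dev) for each class."""
    by_class = defaultdict(list)
    for x, label in samples:
        by_class[label].append(x)
    # pstdev is the population standard deviation, which is the
    # maximum likelihood estimate of the std dev for a Gaussian.
    return {label: (fmean(xs), pstdev(xs)) for label, xs in by_class.items()}

params = fit_gaussians(training)
print(params)
```

A mislabeled sample would shift the estimated mean and spread of the wrong class, which is why the training set has to be labeled correctly.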

How does statistics help?
Disambiguation may be achieved by using stochastic context-free grammars.
It helps in providing degrees of grammaticality.
Naturalness
Structural preference
Error tolerance

Example using stochastic CFG
For example, consider the sentence
John Walks
The grammar is as follows:
1  S  -> NP V    0.7
2  S  -> NP      0.3
3  NP -> N       0.8
4  NP -> N N     0.2
5  N  -> John    0.6
6  N  -> Walks   0.4
7  V  -> Walks   1.0
The numbers on the right represent the weights for each rule. The weight of an analysis is the product of the weights of the rules used in its derivation.

Predicting the right sentence that is perceived is based on these weights.
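
The weights of the two possible analyses of "John Walks" can be worked out with a short sketch. The rule numbers and weights are taken from the grammar above. The names verb_reading and noun_reading are labels introduced here for illustration only.

```python
from math import prod

# Rule weights from the grammar, keyed by rule number.
weights = {1: 0.7, 2: 0.3, 3: 0.8, 4: 0.2, 5: 0.6, 6: 0.4, 7: 1.0}

# Each analysis is the list of rules used in its derivation.
analyses = {
    # S -> NP V, NP -> N, N -> John, V -> Walks
    "verb_reading": [1, 3, 5, 7],
    # S -> NP, NP -> N N, N -> John, N -> Walks
    "noun_reading": [2, 4, 5, 6],
}

# Weight of an analysis = product of the weights of its rules.
scores = {name: prod(weights[r] for r in rules) for name, rules in analyses.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

The verb reading has weight 0.7 x 0.8 x 0.6 x 1.0 = 0.336, while the noun-noun reading has weight 0.3 x 0.2 x 0.6 x 0.4 = 0.0144, so the analysis in which "Walks" is a verb is preferred.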

Degrees of grammaticality
Traditional approaches to NLP do not accommodate gradations of grammaticality. A sentence is either correct or not.
In some cases, acceptability may vary with the structure and context of the sentence.

Structural Preference
Consider the sentence
The emergency crews hate most is domestic violence.
The correct interpretation is:
The emergency [that the crews hate most] is domestic violence.
These preferences can be seen as structural preferences rather than parsing preferences.
Statistical approaches can easily handle such structural preferences.

Error Tolerance
A remarkable property of human language comprehension is error tolerance.
Many sentences that the traditional approach classifies as ungrammatical can actually be interpreted by statistical NLP techniques.

Conclusions
Free and commercial software that provides many NLP features is now available (e.g. Microsoft XP has speech recognition software with which users can control menus and execute commands).
A lot of research is going into developing new applications and investigating new techniques and approaches that will make Statistical NLP more feasible in the near future.
