Ahmad M. Bakr, Computer and Systems Engineering Department, Faculty of Engineering, Alexandria University, Egypt
Agenda
Introduction.
Introduction
NLP is a branch of artificial intelligence that deals with analyzing, understanding and generating the languages that humans use naturally in order to interface with computers.
Natural language processing aims to teach computers to understand the way humans learn and use language.
Introduction
Speech processing: get flight information or book a hotel over the phone.
Information extraction: discover names of people and events they participate in, from a document.
Machine translation: translate a document from one human language into another.
Question answering: find answers to natural language questions in a text collection or database.
Summarization: generate a short biography of Noam Chomsky from one or more news articles.
Text Processing
Text processing is the manipulation of text, especially the transformation of text from one format to another.
Usually this means going from plain text (a set of paragraphs) to a form that is easy to use in calculations.
The Vector Space Model (VSM) is one of the forms used by applications to represent a document as a vector of its words.
dj = {W1, W2, W3, ..., Wn}
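The vector representation above can be sketched in a few lines of Python. This is a minimal illustration, assuming that each weight is a raw word count and that words are separated by whitespace; the names `build_vocabulary` and `to_vector` are hypothetical and not part of the slides.

```python
from collections import Counter

def build_vocabulary(documents):
    """Return the sorted list of distinct lowercase words across all documents."""
    return sorted({word for doc in documents for word in doc.lower().split()})

def to_vector(document, vocabulary):
    """Represent a document as word counts, one slot per vocabulary word."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in vocabulary]

docs = ["the cat sat", "the dog sat on the mat"]
vocab = build_vocabulary(docs)               # shared word list for all documents
vectors = [to_vector(d, vocab) for d in docs]
```

Other weighting schemes (for example TF-IDF) could replace the raw counts without changing the overall shape of the vectors.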
Information Retrieval
Information retrieval is the activity of obtaining
Information Retrieval
Information is usually indexed to speed up queries.
The inverted index is one of the primary ways to index text based on its words.
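A minimal sketch of an inverted index, assuming whitespace tokenization and documents identified by their position in a list. The function names `build_inverted_index` and `search` are illustrative only.

```python
from collections import defaultdict

def build_inverted_index(documents):
    """Map each lowercase word to the set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())   # intersect posting sets
    return result

docs = ["natural language processing", "language models", "image processing"]
index = build_inverted_index(docs)
hits = search(index, "language processing")
```

Looking up a word costs one dictionary access instead of a scan over every document, which is why the index speeds up queries.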
Information Retrieval
Can we use an inverted index to search for the sentence "A B C"?
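The difficulty can be seen in a small sketch, assuming a word-level inverted index that stores only document ids. The toy documents are made up for illustration.

```python
from collections import defaultdict

docs = ["A B C", "C B A", "A C"]

# Word-level inverted index: word -> ids of documents containing it.
index = defaultdict(set)
for doc_id, text in enumerate(docs):
    for word in text.split():
        index[word].add(doc_id)

# Intersecting the posting sets finds documents that contain all three words,
# but the index keeps no information about word order.
candidates = index["A"] & index["B"] & index["C"]

# The exact sentence can only be confirmed by going back to the text
# (a naive substring check is enough for this toy data).
exact = {d for d in candidates if "A B C" in docs[d]}
```

Document 1 ("C B A") is returned as a candidate although it does not contain the sentence, so a plain inverted index needs an extra verification step for sentence queries.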
Information Retrieval
Document Index Graph
Sentiment Analysis
Sentiment analysis or opinion mining refers to the
application of natural language processing, computational linguistics, and text analytics to identify and extract subjective information in source materials.
Sentiment Analysis
Techniques:
Maintaining a list of words for each class
Example: "This is a nice movie", "This is a bad movie"
Using classifiers trained on sentences for each class separately
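The word-list technique can be sketched as follows. The two word lists are tiny, hypothetical examples, and the scoring rule (count of positive words minus count of negative words) is an assumption, not a method stated in the slides.

```python
# Hypothetical word lists, one per class.
POSITIVE = {"nice", "good", "great"}
NEGATIVE = {"bad", "poor", "awful"}

def classify(sentence):
    """Label a sentence by comparing counts of positive and negative words."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

label_nice = classify("This is a nice movie")
label_bad = classify("This is a bad movie")
```

A sentence with no listed words, or with equal counts from both lists, falls back to "neutral" in this sketch.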
Named entity recognition seeks to locate and classify atomic elements in text into predefined categories such as the names of persons, organizations, locations, expressions of times, quantities, monetary values, percentages, etc.
Question Answering
What is Question Answering
Question answering (QA) is a discipline within the fields of information retrieval and natural language processing (NLP) that is concerned with building systems that automatically answer questions posed by humans in a natural language.
Question Answering
A QA implementation, usually a computer
program, may construct its answers by querying a structured database of knowledge or information, usually a knowledge base. More commonly, QA systems can pull answers from an unstructured collection of natural language documents.
Question Answering
Question Classification
Question classifier module determines the type of
into the type of human (individual)
Examples:
2) Where is Alexandria located? should be classified into the type of place
types
The question is put into the form of a parse tree to capture the relationships between its entities (i.e., subjects, objects, etc.).
The main purpose of the parse tree is to understand the question and the links between its entities.
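As a rough illustration of question classification, the sketch below maps the leading question word to a type. This is a simplifying assumption, not the classifier described in the slides: the "human" and "place" types come from the examples above, while the "time" entry and the name `classify_question` are hypothetical.

```python
# Leading question word -> question type. "who" and "where" follow the slide
# examples; "when" is a hypothetical extra entry.
QUESTION_TYPES = {"who": "human", "where": "place", "when": "time"}

def classify_question(question):
    """Return the question type suggested by the first word of the question."""
    words = question.lower().strip(" ?").split()
    first = words[0] if words else ""
    return QUESTION_TYPES.get(first, "unknown")

question_type = classify_question("Where is Alexandria Located ?")
```

A real classifier would need more than the first word, which is where the parse tree of the question becomes useful.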
Question Answering
Query Formulation
Apply text processing techniques to form a query
weights)
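One possible way to form a query from a question is sketched below. It assumes the query is a set of content words with term-frequency weights; the stop-word list, the weighting scheme and the name `formulate_query` are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"is", "the", "a", "an", "of", "in", "where", "who", "what"}

def formulate_query(question):
    """Turn a question into a dict of content words and their counts."""
    words = [w.strip("?.,!").lower() for w in question.split()]
    terms = [w for w in words if w and w not in STOP_WORDS]
    return dict(Counter(terms))

query = formulate_query("Where is Alexandria located ?")
```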
Question Answering
Search knowledge base
The main target is to identify the paragraphs that possibly contain answers to the user's question.
The knowledge base is usually indexed.
Answers Extraction
Parse the candidate paragraphs to extract sentences with possible answers
Construct the parse tree of the matching sentences
The parse tree gives insights about the relationships between the entities of a candidate sentence
Rank the possible answers based on their relevance to the question
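The ranking step can be illustrated with a much simpler relevance measure than parse trees. The sketch below assumes relevance is the number of query terms a sentence shares with the question; this word-overlap score is a stand-in for illustration, and `rank_sentences` is a hypothetical name.

```python
def rank_sentences(question_terms, sentences):
    """Order candidate sentences by how many question terms they contain."""
    terms = set(question_terms)
    scored = []
    for sentence in sentences:
        words = {w.strip(".,?!").lower() for w in sentence.split()}
        scored.append((len(terms & words), sentence))
    # Stable sort: sentences with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop sentences that share no term with the question.
    return [sentence for score, sentence in scored if score > 0]

candidates = [
    "Alexandria is located in Egypt.",
    "Cairo is the capital.",
    "Alexandria has a library.",
]
ranked = rank_sentences(["alexandria", "located"], candidates)
```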