AUTOMATIC INDEXING
AND ABSTRACTING
OF DOCUMENT TEXTS
by
Marie-Francine Moens
Katholieke Universiteit Leuven, Belgium
KLUWER ACADEMIC PUBLISHERS
Boston / Dordrecht / London
CONTENTS
PREFACE
ACKNOWLEDGEMENTS
PART I
THE INDEXING AND ABSTRACTING ENVIRONMENT
1. THE NEED FOR INDEXING AND ABSTRACTING TEXTS
1. Introduction
2. Electronic Documents
3. Communication through Natural Language Text
4. Understanding of Natural Language Text: The Cognitive Process
5. Understanding of Natural Language Text: The Automated Process
6. Important Concepts in Information Retrieval and Selection
7. General Solutions to the Information Retrieval Problem
8. The Need for Better Automatic Indexing and Abstracting Techniques
2. THE ATTRIBUTES OF TEXT
1. Introduction
2. The Study of Text
3. An Overview of Some Common Text Types
4. Text Described at a Micro Level
5. Text Described at a Macro Level
6. Conclusions
3. TEXT REPRESENTATIONS AND THEIR USE
1. Introduction
2. Definitions
3. Representations that Characterize the Content of Text
4. Intellectual Indexing and Abstracting
5. Use of the Text Representations
6. A Note about the Storage of Text Representations
7. Characteristics of Good Text Representations
8. Conclusions
PART II
METHODS OF AUTOMATIC INDEXING AND ABSTRACTING
4. AUTOMATIC INDEXING: THE SELECTION OF NATURAL
LANGUAGE INDEX TERMS
1. Introduction
2. A Note about Evaluation
3. Lexical Analysis
4. Use of a Stoplist
5. Stemming
6. The Selection of Phrases
7. Index Term Weighting
8. Alternative Procedures for Selecting Index Terms
9. Selection of Natural Language Index Terms: Accomplishments and Problems
10. Conclusions
5. AUTOMATIC INDEXING: THE ASSIGNMENT OF
CONTROLLED LANGUAGE INDEX TERMS
1. Introduction
2. A Note about Evaluation
3. Thesaurus Terms
4. Subject and Classification Codes
5. Learning Approaches to Text Categorization
6. Assignment of Controlled Language Index Terms: Accomplishments and Problems
7. Conclusions
6. AUTOMATIC ABSTRACTING: THE CREATION OF TEXT
SUMMARIES
1. Introduction
2. A Note about Evaluation
3. The Text Analysis Step
4. The Transformation Step