AUTOMATIC INDEXING AND ABSTRACTING OF DOCUMENT TEXTS

by
Marie-Francine Moens
Katholieke Universiteit Leuven, Belgium

KLUWER ACADEMIC PUBLISHERS
Boston / Dordrecht / London

CONTENTS

PREFACE
ACKNOWLEDGEMENTS

PART I  THE INDEXING AND ABSTRACTING ENVIRONMENT

1. THE NEED FOR INDEXING AND ABSTRACTING TEXTS
   1. Introduction
   2. Electronic Documents
   3. Communication through Natural Language Text
   4. Understanding of Natural Language Text: The Cognitive Process
   5. Understanding of Natural Language Text: The Automated Process
   6. Important Concepts in Information Retrieval and Selection
   7. General Solutions to the Information Retrieval Problem
   8. The Need for Better Automatic Indexing and Abstracting Techniques

2. THE ATTRIBUTES OF TEXT
   1. Introduction
   2. The Study of Text
   3. An Overview of Some Common Text Types
   4. Text Described at a Micro Level
   5. Text Described at a Macro Level
   6. Conclusions

3. TEXT REPRESENTATIONS AND THEIR USE
   1. Introduction
   2. Definitions
   3. Representations that Characterize the Content of Text
   4. Intellectual Indexing and Abstracting
   5. Use of the Text Representations
   6. A Note about the Storage of Text Representations
   7. Characteristics of Good Text Representations
   8. Conclusions

PART II  METHODS OF AUTOMATIC INDEXING AND ABSTRACTING

4. AUTOMATIC INDEXING: THE SELECTION OF NATURAL LANGUAGE INDEX TERMS
   1. Introduction
   2. A Note about Evaluation
   3. Lexical Analysis
   4. Use of a Stoplist
   5. Stemming
   6. The Selection of Phrases
   7. Index Term Weighting
   8. Alternative Procedures for Selecting Index Terms
   9. Selection of Natural Language Index Terms: Accomplishments and Problems
   10. Conclusions

5. AUTOMATIC INDEXING: THE ASSIGNMENT OF CONTROLLED LANGUAGE INDEX TERMS
   1. Introduction
   2. A Note about Evaluation
   3. Thesaurus Terms
   4. Subject and Classification Codes
   5. Learning Approaches to Text Categorization
   6. Assignment of Controlled Language Index Terms: Accomplishments and Problems
   7. Conclusions

6. AUTOMATIC ABSTRACTING: THE CREATION OF TEXT SUMMARIES
   1. Introduction
   2. A Note about Evaluation
   3. The Text Analysis Step
   4. The Transformation Step