
The Effects of Part-of-Speech Tagsets on Tagger Performance

A thesis presented
by
Andrew MacKinlay
to
The Department of Computer Science and Software Engineering
in partial fulfillment of the requirements
for the degree of
Bachelor of Science with Honours

University of Melbourne
Melbourne, Australia
October 2005

© 2005 - Andrew MacKinlay
All rights reserved.

Thesis advisor(s)
Timothy Baldwin
Steven Bird

Author
Andrew MacKinlay

The Effects of Part-of-Speech Tagsets on Tagger Performance

Abstract
In natural language processing (NLP), a crucial subsystem in a wide range of applications is a part-of-speech (POS) tagger, which labels (or classifies) unannotated words
of natural language with part-of-speech labels corresponding to categories such as noun,
verb or adjective. Mainstream approaches are generally corpus-based: a POS tagger
learns from a corpus of pre-annotated data how to correctly tag unlabelled data.
Previous work has tended to focus on applying new algorithms to the problem or
adding hand-tuned features to assist in classifying difficult instances. Using these methods, a number of distinct approaches have plateaued at similar accuracy figures of
96.9 ± 0.3%.
Here we approach the problem of improving accuracy in POS tagging from a different
angle. We use a representative set of tagging algorithms and attempt to optimise performance by modifying the inventory of tags (or tagset) used in the pre-labelled training
data. We modify tagsets by systematically mapping the tags of the training data to a
new tagset. Our aim is to produce a tagset which is more conducive to automatic POS
tagging by more accurately reflecting the underlying linguistic distinctions which should
be encoded in a tagset.
The mappings are reversible, enabling the original tags to be trivially recovered, which
facilitates comparison with previous work and between competing mappings. We explore
two broad sources of these mappings. Our primary focus is on using linguistic
insight to determine potentially useful distinctions which we can then evaluate empirically.
We also evaluate an alternative data-driven approach for extracting patterns of regularity
in a tagged corpus.
Our experiments indicate the approach is not as successful as we had predicted. Our
most successful mappings were data-driven, giving improvements of approximately
0.01% in token-level accuracy over the development set using specific taggers, with increments of 0.03% over the test set. We show a wide range of linguistically motivated
modifications which cause a performance decrement, while the best linguistic approaches
approximately maintain performance over the development data and produce up to 0.05%
improvement over the development data. Our results lead us to believe that this line of
research is unlikely to provide significant gains over conventional approaches to POS
tagging.

Contents

Title Page
Abstract
Table of Contents
Citations to Previously Published Work
Acknowledgments

1 Background
  1.1 Introduction
  1.2 Parts of Speech
    1.2.1 Terminology
    1.2.2 POSs Defined
    1.2.3 POS Tagging
    1.2.4 Natural Language Corpora

2 Literature Review
  2.1 Overview of Tagging Algorithms
  2.2 Using Linguistic Insight to Optimise NLP Applications
  2.3 Linguistic Resources for Modifying the Tagset
    2.3.1 The Brown Corpus Tagset
    2.3.2 The Penn Treebank Tagset
    2.3.3 Other Tagsets in Use
  2.4 Algorithms for POS Tagging
    2.4.1 POS Tagging with Transformation-Based Learning
    2.4.2 Support Vector Machine-based POS Tagging
    2.4.3 Maximum Entropy POS Tagging

3 Methodology
  3.1 General Outline
  3.2 Motivation
  3.3 Experimental Setup
    3.3.1 Data Sampling
    3.3.2 A Data-Driven Alternative
    3.3.3 Evaluation Metrics

4 Experimental Evaluation
  4.1 Benchmark and Baseline
  4.2 Clustering
  4.3 Linguistically Motivated Modifications
    4.3.1 Notational Conventions
    4.3.2 Syntactically-Conditioned Modifications of Closed Classes
    4.3.3 Syntactically-Conditioned Modifications of Open Classes
    4.3.4 Lexically-Conditioned Modifications of Closed Classes
    4.3.5 Lexically-Conditioned Modifications of Open Classes
  4.4 Overall Summary

5 Conclusion
  5.1 Discussion
  5.2 Further Work

A The Penn Tagset

B Complete Results
  B.1
  B.2 rbdeg[ld] Mapping
  B.3 rbdeg[s] Mapping
  B.4 vbcop[s] + rbdeg[s] Mapping
  B.5 vbcop[lm] + rbdeg[s] Mapping
  B.6
  B.7
  B.8 insub[s] Mapping
  B.9
  B.10
  B.11 nnms[l] + dtnum[l] Mapping
  B.12 vbrp[s] Mapping
  B.13
  B.14 inrp[l] Mapping
  B.15 inrp[l] + insub[s] Mapping
  B.16
  B.17
  B.18
  B.19 vbrp[ld] Mapping
  B.20
  B.21
  B.22
  B.23 vbtr[s] Mapping
  B.24
  B.25 to:in Mapping
  B.26
  B.27
  B.28 vbinf[lm] Mapping
  B.29
  B.30
  B.31
  B.32
  B.33
  B.34
  B.35
  B.36
  B.37
  B.38
  B.39
  B.40
  B.41
  B.42
  B.43
  B.44 rp(clc) Mapping
  B.45
  B.46 in(clc,s) Mapping
  B.47
  B.48
  B.49
  B.50
  B.51
  B.52
  B.53
  B.54
  B.55
  B.56
  B.57 rbr(clc,s) Mapping
  B.58
  B.59 in(clc) Mapping
  B.60 rbloc[lm] Mapping
  B.61
  B.62
  B.63
  B.64
  B.65 rbloc[s] Mapping
  B.66
  B.67
  B.68 vbcop[s] Mapping
  B.69 dtnum[l] + jjnum[l] Mapping
  B.70 to:in + insub[s] Mapping
  B.71 vbcop[lm] Mapping
  B.72 in/rp/rb[l] Mapping
  B.73

Citations to Previously Published Work

Portions of Chapter 3, Chapter 4 and Chapter 5 are to appear in the following paper:

Andrew MacKinlay and Timothy Baldwin (Forthcoming). POS Tagging with a More Informative Tagset. In Proceedings of the Australasian Language Technology Workshop 2005, Sydney, Australia.

Acknowledgments
Kate, and all my other friends for putting up with my absence over the year.
My secondary supervisor Steven for providing occasional but very helpful support
whenever it was needed.
And most importantly my primary supervisor Tim, who was extremely generous with
his time and assistance, for being a highly effective source of inspiration and information
as well as an editor and proofreader.

Chapter 1
Background
1.1 Introduction

Part-of-speech (POS) tagging is a well-studied problem in natural language processing, in which the aim, given a natural language text, is to label each word in that
sample with a POS tag such as noun, verb or adjective. In one of the earlier successful
approaches to POS tagging in the framework which is now mainstream in NLP, Church
(1988) determined that it was possible to achieve high accuracy in POS tagging using an
impoverished feature set derived from no more than two words in the immediate local
context.
In the following decade or so, subsequent work generally involved applying new algorithms to essentially the same task (Brill 1995; Ratnaparkhi 1996; Daelemans et al. 1996;
Nakagawa et al. 2001), with these diverse approaches all settling on a similar set of features to those used by Church, and never using anything more extensive than highly
localised context. There was very little successful novel feature engineering in general;
variants on this small set of features are tacitly regarded as optimal for tractable POS
tagging. A wide range of approaches rapidly asymptoted to a glass ceiling of performance, achieving accuracy of 96.8 ± 0.2% over the standard dataset used for the task,
the Penn Treebank (Marcus et al. 1993).
More recently, approaches have been targeted more at reducing the running time
of an existing approach (Giménez and Màrquez 2003; Ngai and Florian 2001), with a
concomitant further reduction in focus on improving the feature set. The only departure
from this was by Toutanova and Manning (2000), who added a selection of language-specific, hand-tuned features, achieving improvements in accuracy of approximately 0.3%,
bringing accuracy to just above the top end of the range quoted above.
There has been very little improvement in accuracy in POS tagging in the last decade;
diverse approaches have reached a similar plateau. The application of different algorithms
to the task has already been extensively explored, and feature engineering is unlikely to
help much more than has already been shown to be possible. This motivates the need
for a new approach to the problem, which is the focus of this thesis.
This thesis reports an investigation into an alternative approach to POS tagging.
Rather than the application of a new algorithm or addition of new features, it is an
examination of the effect of modifying the tagset, the inventory of tags assigned by the
tagger. We create new tagsets by systematically mapping words tagged in the original
tagset to a new, finer-grained tagset, which is designed to more accurately reflect the
underlying linguistic distinctions which should be made explicit in a well-designed tagset.
This finer-grained tagset will provide more specific contextual information with which
to determine the identity of surrounding tags. We empirically evaluate whether this additional information facilitates more accurate tagging. This evaluation stage is facilitated
by the use of reversible mappings which enable the original tags to be recovered so that
comparison between different mappings and with previous work can be easily achieved.
Thus we are explicitly discarding any increased linguistic utility of the new distinctions
(although this may be valuable) and examining the effect of these new distinctions on
accuracy. The primary focus is to investigate a wide range of mappings designed using
linguistic knowledge; however, we also demonstrate an alternative approach where we attempt to use machine learning techniques to infer salient groupings from a tagged corpus
of text.
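
The idea of a reversible mapping can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in this thesis: the refined tag name IN-SUB, the word list SUBORDINATORS and the function names refine and restore are all hypothetical, chosen to show how a finer-grained tag can be introduced in the training data and later collapsed back to the original tag for evaluation.

```python
# Hypothetical sketch of a reversible tagset mapping.
# "IN-SUB" and the word list below are illustrative assumptions only.

SUBORDINATORS = {"because", "although", "while", "if"}

# Each refined tag remembers the original tag it was split from.
REVERSE = {"IN-SUB": "IN"}


def refine(tagged):
    """Map (word, tag) pairs from the original tagset to a finer-grained one."""
    refined = []
    for word, tag in tagged:
        if tag == "IN" and word.lower() in SUBORDINATORS:
            refined.append((word, "IN-SUB"))
        else:
            refined.append((word, tag))
    return refined


def restore(tagged):
    """Collapse refined tags back to the original tagset."""
    return [(word, REVERSE.get(tag, tag)) for word, tag in tagged]


sentence = [("She", "PRP"), ("stayed", "VBD"), ("because", "IN"),
            ("it", "PRP"), ("rained", "VBD"), ("in", "IN"), ("Paris", "NNP")]

refined = refine(sentence)
print(refined)
print(restore(refined) == sentence)
```

Because restore is a simple many-to-one lookup, any output produced in the refined tagset can be scored against the original gold-standard tags.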
To conduct a thorough investigation it is necessary to have a small and well-defined
focus. The sole source of gold-standard data is the Penn Treebank corpus; we use this as
a case study to investigate the extent to which performance is affected by a finer-grained
tagset. We leave it as an open question how applicable our findings are to different tagsets
and languages, as this is a question for further empirical evaluation.
This thesis is structured as follows: The remainder of this chapter introduces background material and terminology. Chapter 2 contains a broad survey of previous work on
POS tagging and information about various alternative tagsets used for different natural
language corpora. Chapter 3 explains the general setup of the experiments performed and
evaluation metrics, and the motivation for attempting to improve performance in POS
tagging, as well as the method for extracting useful groupings of lexical items from the
data. Chapter 4 gives detailed explanations of the mappings used in various experiments
along with summaries of results for each. Finally, in Chapter 5 we discuss the significance
of the results demonstrated and outline opportunities for further work.

1.2 Parts of Speech

1.2.1 Terminology

At this point it will be useful to define some terms relevant to both linguistics and
to computational linguistics, taking definitions mostly from Crystal (1987). Syntax essentially refers to the arrangement of words in a sentence and how this arrangement
influences the semantic relationships between words. Most theories of syntax posit a
hierarchical underlying structure for a sentence: at the base level there are words which
combine with each other to form phrases, and these phrases recursively combine with
each other to form sentences. This captures our intuition that in a sentence like "The
cat watched the fantastic performers" there is a closer relationship between, for example,
"fantastic" and "performers" than there is between "fantastic" and "cat". Syntacticians therefore
often represent sentences in terms of parse trees, such as that in Figure 1.1.

(Sentence
  (NounPhrase (Determiner the) (Noun cat))
  (VerbPhrase
    (Verb watched)
    (NounPhrase (Determiner the) (Adjective fantastic) (Noun performers))))

Figure 1.1: An example of a syntactic parse tree

Morphology is another linguistic term which refers to how words are built up from
smaller units of meaning known as morphemes. A word such as cats could be segmented
into two smaller meaning-bearing units: cat, which has the meaning associated with
its dictionary definition, and the suffix -s indicating plural. Both of these segments
contribute some component of meaning to the overall meaning of cats; however, these
segments cannot be divided up further into any smaller units which are associated with
an intrinsic meaning, and for this reason they are generally accepted as being morphemes.
Similarly, a word such as dispassionately can be segmented as dis + passion + ate +
ly.
Morphology is often classified into two sub-types: inflectional and derivational. Inflectional morphology is concerned with how words inflect in a particular grammatical
context. So, the distinction between cat and cats is determined by how the word is used
in a particular instance, in this case whether the intended referent is singular or plural.
The underlying meaning of the word has changed very little. Other inflectional suffixes
in English include the past tense suffix for verbs, as in play + ed. Meanwhile, derivational
morphology is typified by affixes of the kind seen above in dispassionately. Each of the
affixes causes a change in meaning and creates a new word with a substantially different
meaning which often performs an entirely different syntactic function.
It will also be necessary to introduce some terminological distinctions from the computational linguistics literature. This thesis deals with words of natural language, but the
term word is often not sufficiently precise for our purposes. For the purposes of dividing
up text into units, it will suffice to accept, unless mentioned otherwise, the definition of
word as understood by literate speakers of English: an orthographic unit comprised of a
string of alphabetical or numeric characters separated from other words by whitespace
or punctuation marks. We will abstract away from the large number of complications to
this simplistic definition. Similarly, we will also oversimplify in defining a sentence as an
orthographic unit terminated by a full stop, exclamation mark or question mark which
is not being used for another purpose such as an abbreviation.¹

¹ The task of classifying a full stop as a genuine or spurious sentence boundary is in fact an area of NLP
study in itself (e.g. Mikheev 2000, Ratnaparkhi 1996); however, we will assume that this classification
task has already taken place.

Additionally, we will adopt the following more specific terminology as defined by
Jurafsky and Martin (2000). The term wordform refers to a word as it appears in the text
including attached inflectional morphemes, while lemma refers to an abstract uninflected
word stem of the type we might find in a dictionary. Thus in the sentence in Figure 1.1
the set of lemmas is {the, cat, watch, fantastic, performer} and the set of wordforms
is {the, cat, watched, fantastic, performers}. The term types, referring to the distinct
wordforms in some text, is often contrasted with tokens, the running wordforms in a
sample of text. The example sentence contains 6 tokens (the, cat, watched, the, fantastic
and performers) and 5 types (the, cat, watched, fantastic and performers).
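
The distinction between tokens, types and lemmas can be checked mechanically for the example sentence. The following is a minimal sketch; the small LEMMAS lookup table is a hypothetical stand-in for a real lemmatiser and covers only the words of this sentence.

```python
# Tokens are the running wordforms; types are the distinct wordforms.
tokens = "the cat watched the fantastic performers".split()
types = set(tokens)

# Hypothetical lemma lookup covering only the inflected forms in this sentence.
LEMMAS = {"watched": "watch", "performers": "performer"}
lemmas = {LEMMAS.get(token, token) for token in tokens}

print(len(tokens))     # number of tokens
print(len(types))      # number of types
print(sorted(lemmas))  # distinct lemmas
```

The word "the" occurs twice, so it contributes two tokens but only one type.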

1.2.2 POSs Defined

The term part of speech has been used since the beginnings of grammatical study to
denote broad categories of words in natural language. As discussed in Crystal (1987),
traditional grammars usually recognise eight to ten POSs: noun, verb, adjective, pronoun,
preposition, conjunction, adverb, interjection, and possibly article and participle. These
POSs were generally defined by broad semantic criteria, e.g. a noun is a person, place
or thing, a verb is an action and an adjective is a quality of something.
As Crystal (1987) observes, linguists have more recently identified a number of shortcomings of such an approach. While such definitions work acceptably for particular members
of the class such as the noun chair, it is trivial to find examples which we would like
to group with a particular POS but which do not meet the requisite semantic criteria of these
simplistic definitions. It is difficult to argue that idea is a thing, or that to wish is an
action, and yet it is clearly desirable to group these with nouns and verbs respectively.
Additionally, for some word classes such as articles (a, an and the), it is not at all clear
what semantic criteria could define the class. Essentially there is a mismatch between the
semantic criteria being used to define the POSs and the underlying syntactic distinctions
they are meant to reflect.
Modern linguistic approaches often use the term word class rather than part of speech
to avoid a term with too many pre-existing associations. Recognising the shortcomings
of the semantic definitions of POSs, they tend to identify word classes by formal criteria.
One such criterion is based on syntax: word classes should be comprised of words which
tend to occur in a particular position relative to other words in a sentence. For example,
the fact that in English chair and idea can both be substituted for X in "I like the
X" is evidence that they belong in the same word class. Another useful criterion is
morphological: since chair and idea can both have the plural morpheme -s appended to
them as a suffix, there is stronger justification for grouping them together.
These criteria can be used with reasonable effectiveness in many languages to identify
salient groupings of words. However, as often occurs in natural language, the picture
is complicated somewhat by less canonical examples. A word like sheep does not occur
with the plural suffix -s even when the intended meaning is plural, so there is a case
on morphological grounds for placing it in a different word class to chair. Looking at
syntactic criteria, most adjectives can freely occur preceding nouns or following be, but
some adjectives such as asleep can only occur in the latter construction: "the man is asleep"
versus "*the asleep man".² These so-called predicative adjectives could be grouped with
other adjectives or placed in a class of their own. Accounting for every irregularity in
a language in this way potentially leads to thousands of word classes; however, ignoring
too many of the distinctions evident from these criteria obscures patterns of syntactic
regularity. It is thus an open question how many of these morphological and syntactic
differences we wish to account for when we divide up the lexicon of the language into
word classes. This is a point we return to in Section 1.2.3.

² The * here is a shorthand commonly seen in linguistics literature to denote a sentence which is
ungrammatical, i.e. not a valid sample of the language.
It is usual to recognise two broad categories of word classes: open classes and closed
classes. Open classes have a number of characteristics: they freely admit new members,
and the words contained in them tend to make up a large percentage of the lexicon of the
language and be readily modified by morphological processes. In most languages including
English, classes like nouns, verbs and adjectives tend to be open classes. Prototypical
closed classes have an opposing set of characteristics: they contain a small number of
lexical entries (even though they make up a large percentage of tokens in running text), do
not readily admit new members and contain words which allow little or no morphological
modification.

1.2.3 POS Tagging

Broadly, POS tagging is a classification task in NLP where the input is an unlabelled
sample of text (or corpus) and each word token contained in it is assigned a POS. Here
the term "POS" is more similar to what we described as word classes above than to the
POSs of traditional grammars; however, in our use of the term we follow the general
practice in NLP literature.
There are important differences between the POSs in POS tagging and the word
classes of modern linguists beyond those already mentioned. In NLP, POS tags are
assigned to tokens in a sample of text, not to abstract lexical entries. An important
consequence is that it is not always clear what the POS of a particular token is. A large
number of wordforms in English are ambiguous: the string of characters constituting the
word does not uniquely determine the actual POS. So a noun like chair could also be a
verb in a sentence like "I would like to chair the meeting". As we shall see in Section 2.1,
there are a number of strategies to overcome this ambiguity: we usually assign the tag
to a token based on past evidence of the most likely tag for the wordform, augmenting
this with information from surrounding words.
Another difference is that word classes such as noun and verb tend to be associated with a lemma rather than a wordform; however, this thesis follows the standard
approach in NLP where the POSs we assign distinguish between different wordforms corresponding to the same lemma. Thus, as will be discussed more extensively in Section 2.3,
a plural noun like cats is tagged differently from a singular noun like cat, and a past tense
verb like ate is distinguished from the present tense eat.
The focus on wordforms is the main reason why the size of even small computational
tagsets (i.e. the inventory of tags distinguished in a given POS tagging task) tends to be
somewhat larger than the set of broad word classes used by linguists; each distinction
corresponding to an inflectional suffix is marked. Additionally, a tagset can be designed
to distinguish any number of lexical, morphological or syntactic differences between wordforms. It is the differences here which account for much of the variation between tagsets.
This thesis investigates the extent to which such distinctions in tagsets are useful, primarily from the standpoint of tagging accuracy.
POS tagging is important since in some form it is a component of many applications in
NLP and related domains. Tasks such as information retrieval, parsing and corpus-based
linguistics can all depend to some degree on POS tagging. The emphasis on downstream
applications will be relevant later when we look at evaluation metrics (Section 3.3.3) but
in general, we consider POS tagging in isolation.

1.2.4 Natural Language Corpora

Up until this point, the definition of correct POS tag has been left unspecified. This
thesis follows the NLP mainstream of corpus-based POS-tagging. This presupposes the
existence of a gold-standard training corpus pre-annotated with what we will accept as
correct tags. Essentially the task of POS-tagging becomes a task of specialised machine
learning: the corpus is divided up into training and test data, and, using the gold-standard
tags on the training data, a tagger is trained to assign POS tags. Evaluation is conducted
by running the trained tagger over the held-out test data and comparing the results with
the gold-standard.
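
The comparison against the gold standard amounts to computing token-level accuracy. The sketch below is an illustration only; the function name token_accuracy is an assumption, and the predicted tag sequence is invented for the example (one tag deliberately differs from the gold standard).

```python
def token_accuracy(gold_tags, predicted_tags):
    """Proportion of tokens whose predicted tag matches the gold-standard tag."""
    if len(gold_tags) != len(predicted_tags):
        raise ValueError("gold and predicted sequences must be the same length")
    if not gold_tags:
        raise ValueError("cannot compute accuracy over an empty sequence")
    correct = sum(g == p for g, p in zip(gold_tags, predicted_tags))
    return correct / len(gold_tags)


# Gold-standard tags for a ten-token sentence, and an invented tagger output
# in which the sixth token is mistagged (VBD instead of VBN).
gold = ["DT", "NNP", "CC", "NNP", "VBP", "VBN", "IN", "NNP", ",", "NNP"]
predicted = ["DT", "NNP", "CC", "NNP", "VBP", "VBD", "IN", "NNP", ",", "NNP"]

print(token_accuracy(gold, predicted))
```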
There are a number of tagged corpora in existence; of particular relevance is
the corpus which is the primary focus of this paper, the Penn Treebank (Marcus et al.
1993), with tagged sentences such as:

Both  Clarcor  and  Anderson  are  based  in  Rockford  ,  Ill
DT    NNP      CC   NNP       VBP  VBN    IN  NNP       ,  NNP

As well as being the de facto standard for a range of NLP tasks including POS tagging, it is
significant because it is also annotated with syntactic parse trees, as shown in Figure 1.2.

(S
  (NP-SBJ (DT Both) (NNP Clarcor) (CC and) (NNP Anderson))
  (VP (VBP are)
    (VP (VBN based)
      (PP-LOC-CLR (IN in)
        (NP (NP (NNP Rockford)) (NP (NNP Ill)))))))

Figure 1.2: A parse tree annotation from the Penn Treebank

Chapter 2
Literature Review
2.1 Overview of Tagging Algorithms

A large proportion of word types in English are unambiguous in terms of POS tags.
According to DeRose (1988), roughly 88.5% of word types receive only one tag in the
Brown corpus. However the tagging problem is not quite as easy as figures such as this
might tend to suggest. More frequent words tend, on average, to have a larger number of
possible tags. DeRose also points out that a given character string uniquely determines
the POS for only 60% of tokens in the Brown corpus. Thus we would expect that for
roughly 40% of tokens which a tagger encounters in a unseen data, we will need some way
to disambiguate the possible tags. In the investigation phase of this project we calculated
the average number of possible tags in sections 23 and 24 of the Penn Treebank WSJ
Corpus to be 2.2 tags per word token, when the tagger was trained on sections 0 to 22.
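The gap between type-level and token-level ambiguity can be illustrated with a small sketch. The function `ambiguity_stats` and the toy corpus are assumptions made for illustration; they are not the procedure used to obtain the figures above.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def ambiguity_stats(tagged_tokens: List[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (share of unambiguous word types, share of unambiguous tokens)."""
    tags_for: Dict[str, Set[str]] = defaultdict(set)
    for word, tag in tagged_tokens:
        tags_for[word].add(tag)
    unambiguous_types = sum(len(tags) == 1 for tags in tags_for.values())
    unambiguous_tokens = sum(len(tags_for[word]) == 1 for word, _ in tagged_tokens)
    return (unambiguous_types / len(tags_for),
            unambiguous_tokens / len(tagged_tokens))


# Toy corpus (an assumption): "can" is both frequent and ambiguous.
toy = [("the", "DT"), ("can", "NN"), ("can", "MD"), ("can", "MD"),
       ("rust", "VB"), ("dog", "NN")]
type_share, token_share = ambiguity_stats(toy)
```

In the toy corpus three of the four word types are unambiguous, but only half of the tokens are, because the single ambiguous type is also the most frequent one.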
A problem which is perhaps even more challenging, if less frequent, is that of unknown
words. Words will inevitably appear in the test data which were not in the training data.
Approximately 2.5% of tokens in sections 23 and 24 of the WSJ corpus were not observed
anywhere else in the corpus. Many taggers deal with unknown words effectively by using
features derived from the character strings at the beginning or end of the word to determine
the correct tag.
The simplest and most obvious method for dealing with tag ambiguity yields surprisingly accurate performance. A probability distribution is built based on the tags
observed for each word. In the test data, each word is assigned the tag with the highest
probability according to the distribution. This is known as unigram tagging, as it assigns
a tag based on a word unigram, that is, a sequence of exactly one word. According to Charniak
et al. (1993) such an approach (with some smoothing[1]) gives an accuracy of 91.5% on
the Brown corpus. The results from preliminary investigation here suggest something
similar: an unsmoothed unigram tagger trained on sections 0 to 22 of the WSJ corpus
achieves 91.5% accuracy on sections 23 and 24.
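A minimal sketch of an unsmoothed unigram tagger of this kind is given below. The function names and the fallback tag NN for unseen words are assumptions for illustration only; the tagger used in the preliminary investigation is not reproduced here.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def train_unigram_tagger(training: List[Tuple[str, str]]) -> Dict[str, str]:
    """Map each word seen in training to its most frequent tag."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for word, tag in training:
        counts[word][tag] += 1
    return {word: tag_counts.most_common(1)[0][0]
            for word, tag_counts in counts.items()}


def tag_unigram(model: Dict[str, str], words: List[str],
                default: str = "NN") -> List[str]:
    """Tag each word independently; unseen words get the fallback tag (assumed)."""
    return [model.get(word, default) for word in words]


training = [("the", "DT"), ("can", "MD"), ("can", "MD"),
            ("can", "NN"), ("rusted", "VBD")]
model = train_unigram_tagger(training)
predicted = tag_unigram(model, ["the", "can", "rusted", "zebra"])
```

Because each word is tagged in isolation, can always receives its majority tag MD here, whatever the surrounding words are.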
However, the most significant aspect is how modern state-of-the-art taggers achieve
higher accuracy. Almost all taggers since Church (1988) have used a similar observation:
nearby tokens can be highly informative about the POS of a particular token. A determiner such as the is highly likely to be followed by an adjective or noun, while it is almost
never followed by a verb or a modal. For example, the word can is ambiguous between
noun and modal verb (among others). So if a word sequence such as the can is observed,
a tagger can predict that can should be tagged as a noun rather than as a modal, which is
what the unigram method would predict. Taggers which use features derived from the
local context are described by Church (1988), Brill (1995), Ratnaparkhi (1996), Brants
(2000) and Giménez and Màrquez (2004), inter alia.

[1] Smoothing is the reassignment of probability mass to deal sensibly with low-frequency word types:
simply assigning the probability of a tag given a word to be the observed frequency seen in the training
data means that no allowance will be made for a word receiving a tag it did not receive in the training
data, but this is actually a very real possibility, especially for low-frequency words.

2.2 Using Linguistic Insight to Optimise NLP Applications

There are a number of prior examples in the literature showing that in NLP, informed
linguistic insight can be more effective at improving performance in a particular task than
applying a new algorithm.
The approach taken by Klein and Manning (2003) is perhaps the most informative.
The task in question was parsing, i.e. assigning the correct parse tree to a set of sentences.
While it is a separate task from POS-tagging, there are obvious parallels between the
two. Rather than learning how to assign a POS label from a text annotated with tags, in
Treebank-style parsing a grammar is induced from a training set annotated with a parse
tree for each sentence. In the testing phase, a parser, instead of selecting the most likely
tag from the tagset, determines the most likely parse tree for each sentence from those
allowed by the induced grammar.
Klein and Manning looked only at unlexicalized parsing, where the lexical identity of
a word is used only to determine the possible POSs. They demonstrated that substantial
increases in parsing accuracy were possible by using a finer-grained set of annotations
for nodes in the tree. The original Treebank annotations were split into subcategories
based on nearby features in the parse tree, either from descendants of the node or from
features of parents or siblings. The authors used linguistic insight to identify potentially
useful categorial distinctions, absent from the original annotation, which they could
introduce to improve accuracy. For example, each node was annotated with the
basic node label of the parent, which captured the fact that the distribution of lexical
items for a given POS is often dependent on the category of the parent node of the token.
Applying a suite of similar splits on the node labels, they improved accuracy from 77.77%
to 86.32%.
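The parent annotation just described can be sketched as a simple tree transformation. The nested-list tree representation, the function name `annotate_parents`, the `^` separator and the `ROOT` label for the top node are all assumptions made for this illustration, not details taken from Klein and Manning's implementation.

```python
from typing import List, Union

Tree = Union[str, List]  # a leaf word, or [label, child, child, ...]


def annotate_parents(tree: Tree, parent: str = "ROOT") -> Tree:
    """Return a copy of the tree with each node label rewritten as LABEL^PARENT."""
    if isinstance(tree, str):  # leaf: the word itself is left untouched
        return tree
    label, children = tree[0], tree[1:]
    # Children are annotated with the basic (unannotated) label of this node.
    return [f"{label}^{parent}"] + [annotate_parents(child, label)
                                    for child in children]


tree = ["S", ["NP", ["DT", "the"], ["NN", "can"]], ["VP", ["VBD", "rusted"]]]
annotated = annotate_parents(tree)
```

After the transformation an NP directly under S carries the label NP^S, so a grammar induced from the annotated trees can treat it differently from an NP found under a VP.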
A different but related method was used by Toutanova and Manning (2000), who
applied linguistically relevant observations to POS tagging. Here, rather than modifying
category labels and applying an existing algorithm, the authors modified the algorithm to
utilise features derived from the data which were designed to encode linguistically relevant
information, and in particular information which could help to distinguish between easily
confusable POSs.
For example, in English, there is a systematic ambiguity between past tense and past
participle forms of regular verbs (which receive the tags VBD and VBN respectively in
the WSJ corpus). So while an irregular verb like speak has past tense spoke and past
participle spoken, regular English verbs (the majority) show no morphological distinction
between the two, such as talk, which has talked in both cases. The lack of overt wordform
distinctions between these two classes means that it is a significant source of error for most
taggers: 6.9% of the total error in the baseline model used by Toutanova and Manning.
Based on the knowledge that past participles in English occur after the auxiliary verbs
have and be, the authors added a feature to the tagger which was activated if either of
these two words occurs in a (relatively large) preceding context window, and thus reduced
the number of VBD/VBN errors by 12.3%.
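A feature of this kind might be sketched as follows. The window size of 8 tokens, the list of inflected forms and the function name `aux_in_left_context` are assumptions for illustration; the text above says only that the window is relatively large.

```python
from typing import List

# Inflected forms of have and be (an illustrative, assumed list).
AUXILIARY_FORMS = {"have", "has", "had", "having",
                   "be", "am", "is", "are", "was", "were", "been", "being"}


def aux_in_left_context(words: List[str], index: int, window: int = 8) -> bool:
    """True if a form of have/be occurs within `window` tokens before `index`."""
    start = max(0, index - window)
    return any(word.lower() in AUXILIARY_FORMS for word in words[start:index])


sentence = "They have often talked about it".split()
after_aux = aux_in_left_context(sentence, sentence.index("talked"))
no_aux = aux_in_left_context("They talked about it".split(), 1)
```

When the feature fires, as for talked in the first sentence, it provides evidence for VBN over VBD even though the wordform itself is uninformative.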
By adding a suite of similar hand-engineered features, Toutanova and Manning increased the accuracy from the baseline of 96.72% to a final figure of 96.86%, and they
note that even when the accuracy figures for corpus-based POS taggers start to look
extremely similar, it is still possible to move performance levels up (Toutanova and
Manning 2000:69). This observation justifies the goal of this thesis, which is to improve
tagger performance using linguistic insight.

2.3 Linguistic Resources for Modifying the Tagset

We noted in Section 2.2 that a finer-grained set of category labels can markedly
improve performance in a related NLP application of parsing, and that there is potential
to improve POS tagging performance by adding new linguistically motivated features
to the tagger. These suggest that it may be possible to apply an analogous version of
Klein and Manning's method to POS tagging. If we alter the tagset to encode more
subtle distinctions within the present word classes, providing more useful information to
disambiguate surrounding words, we could potentially increase tagging accuracy.
The inspiration for these modifications can come from a number of sources. One
obvious set of resources is the various comprehensive descriptive grammars of English,
such as Huddleston and Pullum (2002) or Quirk et al. (1985). Grammars such as these
provide thorough descriptions of various linguistic phenomena such as salient syntactic
groupings of words. Yielding a different but overlapping set of possible modifications are
the tagsets used in various other corpora. Some of these tagsets, along with the Penn
Treebank tagset itself, are examined below.
It will be useful in the discussion to introduce some points noted by Leech (1997)
about the distinction between linguistic and computational requirements which influence
the design of tagsets. The linguistic quality of a tagset is determined by the extent
to which each tag denotes a set of words with a unique set of syntactic properties in
common with each other, while its computational tractability[2] is determined by how
easy it is to determine the tag for a particular word, and how much each tag aids in the
disambiguation of surrounding words. The extreme cases of tagsets with either one tag
per word, or one tag for all words, are examples of tagsets which are highly tractable in
computational terms, but of very little use linguistically, which perhaps serves to indicate
that these requirements sometimes conflict. This is a point to which we will return in
Section 3.1.
[2] We will use this term as defined by Leech hereafter.

2.3.1 The Brown Corpus Tagset

The Brown Corpus (Francis and Kučera 1979), as one of the earliest digital corpora,
was annotated with a tagset which, according to Sampson (1987), was directly or indirectly the basis of most of the tagsets in existence today. The tagset has 87 simple
tags, but 186 compound tags, which are discussed below. There are several reasons why
even the number of simple tags is much larger than the eight traditional POSs discussed
above. Since most verbs show different syntactic distributions depending on how they are
inflected, tokens from the major word classes can each be assigned one of several different
POSs, depending on how they are inflected. For example, verb tokens can appear in one
of five different categories:
• VB for bare uninflected verb (present, imperative or infinitive), e.g. take
• VBZ for present tense, third person singular, e.g. takes
• VBD for simple past tense, e.g. took
• VBN for past participle, e.g. taken
• VBG for present participle, e.g. taking
Similarly nouns are divided into common and proper nouns, each of which is further
divided into singular and plural. Additionally there are variants of each for when the
token occurs with the possessive marker 's. Adjectives are also divided into absolute (e.g.
tall), comparative (e.g. taller) and superlative (e.g. tallest), while similar distinctions are
observed for adverbs. These open classes, along with the cardinal and ordinal numbers,
account for 23 of the tags in the tagset. The remainder are used on the various closed
classes, including:
• Tags for all of the inflected forms of the auxiliary verbs be, have and do (similar
  to the inflected forms listed above, but with the addition of negated versions of the
  past and present forms, so that is/BEZ is distinguished from isn't/BEZ*)
• Tags for various articles and determiners, such as a(n), the, this, each
• Tags for different wh-words, such as what, who, when etc.
• Tags for prepositions: in, at, from, over etc.
• Tags for various punctuation marks
along with several other closed classes, such as pronouns and modal verbs.
The tagset is still more complex, however, when we consider that there are two ways
of forming complex tags. First, two tags joined by a plus symbol indicate that the second item
is a contracted item appended to the host word. For example, John'll would be tagged
NP+MD for "proper noun and modal verb". The many possible combinations in
which this can occur bring the number of tags to 186. Second, there are a number
of tags denoting various additional pieces of information about word tokens, which are
joined with a hyphen to the primary tag: FW denotes a foreign word, TL denotes a title,
HL denotes a headline and NC denotes part of a multi-word phrase. In an extreme case,
L'Arcade receives the tag FW-AT+NN-TL (where AT denotes
article). These extra tags bring the total number of simple and complex tags to 472.
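The two ways of building complex tags can be made concrete with a small sketch that takes such a tag apart again. The function `split_brown_tag` is an assumption for illustration, and it ignores the complication that some Brown punctuation tags themselves consist of these characters.

```python
from typing import List


def split_brown_tag(tag: str) -> List[List[str]]:
    """Split on '+' (contracted items), then on '-' (extra-information tags)."""
    return [part.split("-") for part in tag.split("+")]


parts = split_brown_tag("FW-AT+NN-TL")
simple = split_brown_tag("NP+MD")
```

For FW-AT+NN-TL this yields one group per contracted item, with the extra-information tags FW and TL kept alongside the primary tags AT and NN.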

2.3.2 The Penn Treebank Tagset

The tagset for the Penn Treebank (Marcus et al. 1993), which we reproduce in Appendix A, is based on the original Brown tagset;[3] however, its comparatively small size is
a deliberate design decision. The corpus differs from earlier corpora in two ways: it is
designed more for NLP, and the sentences it contains are not only tagged with POSs but
are also parsed. In this spirit Marcus et al. set out to create "A Simplified POS Tagset
for English" to alleviate problems of sparse data when used probabilistically, and thus
increase its computational tractability (as defined above). There are two primary ways
this simplification is achieved. One is by avoiding compound tags. For example, while
the contracted auxiliaries 'll and 's and the contracted adverb n't would all be indicated
with compound tags on the host word in the Brown corpus (e.g. she'll/PPS+MD), in
the Penn Treebank these clitics[4] are split from the main noun or verb in the tokenisation
stage, and treated as standalone tokens with their own tags; the possessive 's is treated
similarly. This avoids having a separate tag for verb forms when they are negated or followed by an auxiliary, and for nouns when they are used in a possessive construction.
Thus, aren't is tagged the same way as are not, making the syntactic similarity of the
two apparent.
The other means by which Marcus et al. simplified the tagset, and the one which is
more relevant here, was with the notion of recoverability: if the distinctions between
several tags could be recovered from some other available information, the information
was not included in the tag.
The most obvious type of information which can be recovered is the lexical identity of
a word. Thus the Penn Treebank attempts to avoid classes with just a single lexical item,
and, unlike the Brown tagset and several other related tagsets, the Penn Treebank tagset
reserves no special tags for the auxiliary verbs be, have and do: they are treated just like
any other verb. Other lexically recoverable distinctions which were removed include those
between various articles and determiners, and between reflexive and personal pronouns
(e.g. the distinction between myself and me).
The other category of information available in the Penn Treebank stems from the fact
that it was designed as a parsed corpus, meaning that there is abundant syntactic information available. Again, Marcus et al. set out to design the tagset to remove some of the
redundancy present in the combination of syntactic structure and POS tags. The clearest example of this is with the IN category. In traditional grammars, there is a division
between subordinating conjunctions and prepositions. Subordinating conjunctions such
as that, because, since, etc., introduce subordinate clauses that are structurally similar
to main clauses: they have a subject, verb and (sometimes) object, and can often stand
alone as a main clause, as in He left [because he was angry], where he was angry is a
subordinate clause. Meanwhile prepositions such as in, to, on attach to noun phrases, as
in at the zoo.[5] The Penn Treebank IN tag conflates both of these categories despite their
differing syntactic functions.

[3] Note that the Penn Treebank actually contains a subset of the Brown corpus parsed and tagged in
Penn Treebank format; however, we will ignore this from now on: "the Brown corpus" refers to the original
described in Francis and Kučera (1979).
[4] These are items which have a status between words and affixes (Manning and Schütze 1999). It is
perhaps this intermediate status which explains their variable treatment between different tagsets.
[5] More modern grammars such as Huddleston and Pullum (2002) acknowledge but disagree with this
traditional division, preferring to group most of the subordinating conjunctions with prepositions, apart
from that, whether and conditional if, which they leave as subordinators. The reasons are too complex
to explain in detail here, but perhaps hint at some justification for Marcus et al.'s decision to conflate
the two classes. Nonetheless, it is still significant that in Huddleston and Pullum (2002) a distinction is
made.

However, Marcus et al. (1993:315) stress that all of this information is available to
users of the corpus by looking at these additional sources:
    "... the lexical and syntactic recoverability inherent in the POS-tagged version
    of the Penn Treebank corpus allows end users to employ a much richer tagset
    ... if the need arises."
Clearly the tagset was not designed to differentiate all possible distributional differences,
but in most work on POS-tagging the tagset is used in unaltered form; this additional
information is, in the case of syntactic information, not used at all, and in the case of
lexical information is only used as a side-effect of the taggers utilising features derived
from lexical context, as will be explained in more detail below. It seems that the possibility of making explicit certain syntactic regularities within the coarse Penn Treebank
word classes is never considered.
There are a few other points worth noting regarding the Treebank tagset. One is that
in certain situations the tagset stipulates that certain words should be tagged differently
depending on syntactic context, as opposed to the Brown tags, which are more invariant
for each lexical item. Thus in the Penn Treebank, one receives the tag NN (common
noun) when it is the head of a noun phrase (as in the one you mentioned), rather than
the tag CD (cardinal number) which it receives in a prenominal position. This concern
with syntactic function has led to the creation of one new category not in the Brown
tagset: present tense verbs with a non-third person singular subject (e.g. they run) are
tagged VBP, even though, except for be, they are morphologically identical to the
infinitival form (they like to run), which is tagged VB. Returning to Leech's distinction
between linguistic quality and computational tractability, it is clear that modifications
such as these make the tagset more linguistically useful, while they probably slightly
decrease computational tractability. In certain contexts a VB/VBP distinction could
help disambiguate surrounding tags, but this is more than likely offset by the difficulty
in distinguishing between POSs for which there is a systematic ambiguity for a given
orthographic form.
Marcus et al. note that further simplifications are possible; however, they "did not pursue [their] strategy of tagset reduction to its logical conclusion" (Marcus et al. 1993:315),
and it is not entirely clear why. There are a number of aspects of the tagset which do not
seem to conform to the stated objectives. For example, VBP (present tense, non-third
person singular) and VBD (past tense) for the purposes of POS-tagging show almost identical syntactic distributions, and the difference between them is almost always lexically
recoverable (the exceptions being a handful of irregular verbs like cut). There is therefore
a strong argument, using the criteria established, for conflating the two. A tag which is
more notable for not conforming to these criteria is the idiosyncratic Penn Treebank tag
TO, which is used solely to tag the token to. This word is used both as a preposition (cf.
IN for other prepositions), as in to the zoo, and to introduce an infinitive (e.g. they like
to run), so it contravenes two of Marcus et al.'s conditions: it is a tag for a single lexical
item (one of only two in the Treebank tagset excluding punctuation, the other being EX
for existential there), and it fails to differentiate between two syntactically different uses.
Some of these observations will be relevant in Chapter 4.
All of the modifications mentioned bring the tagset size to a comparatively small 36
tags for lexical items and a further 12 for punctuation, symbols etc. The one factor
complicating this is provision for tags indicating ambiguity. Ambiguous words are tagged
with all possible tags separated by a | (e.g. JJ|VBN ), giving a theoretically very large
maximum tagset size. Over the 1.1 million tokens of the WSJ corpus, 36 ambiguous tags
occur, and they are used on just 147 tokens in the corpus, which is so infrequent that any
probabilistic method will have almost no chance of learning their correct application.

2.3.3 Other Tagsets in Use

It is worthwhile to briefly examine some other tagsets for purposes of comparison and
as a possible source to inform modifications to the tagset. The differences between the
Brown and Penn tagsets have already hinted that there is more than one possible way to
divide up English words into POSs; the following should make that clearer still. Sampson
(1987) summarises several alternative tagsets.
The Lancaster-Oslo/Bergen Corpus of British English (LOB corpus hereafter; Johansson et al. 1978) was developed jointly by British and Norwegian researchers as a British
English corpus parallel to the Brown corpus. While there is an equivalent sample of
texts, the tagsets differ substantially between the two. The LOB tagset, like the later
Penn Treebank tagset, requires that cliticised modals, auxiliaries and negation particles
such as 'll and n't be split off from the host verb or (pro)noun, and tagged equivalently to
the corresponding expanded words. However, unlike the Penn Treebank, the possessive
's is left attached to the host noun and a $ is appended to the tag. Furthermore, the
Brown corpus's extra information tags for headlines, foreign words etc. are abandoned.
Thus, apart from roughly 20 possessive tags such as NN$ which are arguably compound tags, the LOB corpus is composed of atomic tags. While its tagset size of 132 is
far smaller than that of the corresponding 471-tag Brown corpus, the fact that these are all basic
tags makes it clear that it actually makes finer grained grammatical distinctions than the
Brown corpus, with only 87 basic tags (including possessive tags).
A further extension of the LOB tagset is referred to by Sampson (1987) as the Lancaster tagset, due to the affinities of the researchers involved. It introduces a number of
new distinctions not recognised in the LOB tagset, such as for adjectives which can only
be used attributively,[6] and adjectives which are purely predicative.[7]
Some later tagsets that are clearly related to the Lancaster tagsets are the CLAWS
tagsets used by the CLAWS tagging system described in Garside (1987). The CLAWS5
(C5) tagset which is used for the British National Corpus (Garside et al. 1997) is relatively
small but at 61 tags including punctuation still contains more tags than the 46 in the
Penn Treebank. Unlike the Penn Treebank, the C5 tagset distinguishes prepositions from
subordinating conjunctions (which are in turn distinguished from that), of from the other
prepositions, and reflexive pronouns from other pronouns. However, there are several
cases where the C5 tagset conflates categories which are distinct in the Penn Treebank:
in the C5 system, comparative and superlative adverbs (like more/most in a more/most
representative sample) are grouped with the other adverbs as AV0 (cf. RB, RBR and
RBS in the Penn Treebank), and there is only a single class for singular and plural proper
nouns (versus NNP/NNPS).

[6] As defined in Section 1.2.2, these are adjectives which can only appear in a prenominal position, such
as utter: He is an utter fool versus *His foolishness is utter.
[7] Those which can only be complements of verbs such as be, for example asleep: She is asleep versus
*There was an asleep woman.
A later development discussed in Garside et al. (1997) is the C7 tagset. At 146 tags,
it is more similar in magnitude to the aforementioned Lancaster tagset than anything
else we have discussed here. There is a corresponding similarity in the distinctions made,
and while there are a number of differences, they are often quite subtle and not relevant
here. A brief summary of the distributions of tags across the different tagsets is given
in Table 2.1, which lists by tagset the number of distinctions made in each broad
POS category. A few caveats must be mentioned with regard to the table. Obviously
it is a simplification of a large amount of information, and as such omits many of the
intricacies of the distinctions between the tagsets. One aspect of this is that the broad
POS categories do not exactly line up between the different tagsets, so if one tagset
makes more distinctions in a particular category, this does not necessarily mean that it
is making these distinctions over exactly the same set of words. For example, tomorrow
is tagged as a noun in the Penn Treebank, but as an RT "nominal adverb of time" in the
C7 tagset.

                              Brown    LOB   Lancaster     C5     C7   Penn
Articles                          1      2           2      1      2      3
Determiners[8]                    6      6          10      2     10      -
wh-determiners                    1      1           2      1      3      2
Prepositions                      1      1           5      2      4      2
Subordinating Conjunctions        1      1           6      2      5      -
Coordinating Conjunctions         1      1           3      1      2      1
Pronouns                          5     12          12      3     16      1
wh-Pronouns                       2      4           6      1      3      1
Adverbs                           4      5          10      1     11      3
wh-Adverbs                        2      1           4      1      2      1
Particles                         1      1           2      1      2      1
Cardinal Numbers                  1      3           4      1      4      1
Nouns                             4     16          25      3     15      2
Proper Nouns                      2      2           5      1      7      2
General Verbs                     5      5           7      6      8      6
Auxiliary verbs                  16     16          16     18     21      -
Modal verbs                       1      1           1      1      2      1
Adjectives                        4      6           9      3      4      3
Punctuation                       6     13          12      3     12     12
Other                            12     10          27     10     15     17
Total[9]                         75    105         166     61    146     48

Table 2.1: Distinctions made in broad POS categories for different tagsets

[8] The determiner category as we use it here includes possessive determiners such as their.
[9] The figures for the Brown and LOB tagsets differ from those quoted in the text as they do not
include possessive tags for words ending in 's, i.e. tags for nouns ending in $.

2.4 Algorithms for POS Tagging

It remains now to examine the algorithms which are frequently used for POS-tagging,
focussing especially on the implementations of those algorithms which are used here.
Of particular relevance are the exact features which are used by each tagger for the
disambiguation of ambiguous and unknown words, so these will be treated in some detail.

2.4.1 POS Tagging with Transformation-Based Learning

The transformation-based learning (TBL) paradigm as applied to POS tagging was first
described in Brill (1995). Like the other taggers described here, it is a machine learning
technique which takes a tagged corpus as input from which it can learn how to correctly
tag a test sample. The tagger learns from the training data a set of rules for assigning tags
(based on various features which will be explained below) which gives the fewest
possible errors, and applies them to test data.
The algorithm relies on the fact that for the task it is trivial to devise a very simple
tagging mechanism which achieves quite reasonable results, such as the unigram most-likely-tag method mentioned above. Such a method can be used to create an initial annotation
of the text. Of course while such an annotation will have a large percentage of tags
correct, there will still be a substantial proportion which are incorrectly tagged. The TBL
algorithm aims to correct these errors by successively applying rules which correct such
errors based on contextual and wordform-derived information. To use Brill's example,
if we have an incorrectly tagged sequence such as the following (where the tag MD on
can is incorrect):
    The/DT can/MD rusted/VBD [10]
the fact that can is preceded by a word with the tag DT is strong evidence that it should
be tagged as a noun (NN) even though most occurrences of can in the training corpus
would be tagged as modals (MD). So, this error will be corrected if we apply a rule like:
    Change the tag to NN if the current tag is MD and the previous tag is DT
The linguistic relevance of this rule is also transparent: the syntactic slot following a
determiner is more likely to be a noun than a modal verb from what we know about the
structure of noun phrases in English. However, consider a sentence like:
    This/DT can/MD be/VB difficult/JJ
In this case, the correct tag of MD would be erroneously changed to NN by the above
rule. So the utility of a particular rule is increased by the previously existing errors it corrects
and reduced by the new errors it introduces in its application. As we shall see below, the TBL
algorithm takes this into account by scoring the rules based on the difference between
the number of errors removed and the number of errors introduced.

[10] Penn Treebank tags have been used here. As a reminder, DT denotes determiner, MD denotes
modal verb and VBD denotes past tense verb.
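The scoring of a single rule can be sketched as follows, using the two example sentences above. The tuple encoding of a rule and the function names `apply_rule` and `score_rule` are assumptions for illustration; this is not Brill's implementation.

```python
from typing import List, Tuple

Rule = Tuple[str, str, str]  # (from_tag, to_tag, required previous tag)


def apply_rule(rule: Rule, tags: List[str]) -> List[str]:
    """Apply 'change from_tag to to_tag if the previous tag is prev_tag'."""
    from_tag, to_tag, prev_tag = rule
    new_tags = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            new_tags[i] = to_tag
    return new_tags


def score_rule(rule: Rule, current: List[str], gold: List[str]) -> int:
    """Errors corrected minus errors introduced by applying the rule."""
    proposed = apply_rule(rule, current)
    good = sum(c != g and p == g for c, p, g in zip(current, proposed, gold))
    bad = sum(c == g and p != g for c, p, g in zip(current, proposed, gold))
    return good - bad


rule = ("MD", "NN", "DT")
# "The can rusted" followed by "This can be difficult", as one flat tag sequence.
current = ["DT", "MD", "VBD", "DT", "MD", "VB", "JJ"]
gold = ["DT", "NN", "VBD", "DT", "MD", "VB", "JJ"]
score = score_rule(rule, current, gold)
```

On these two sentences the rule fixes one error and introduces one, so its net score is zero and it would not be worth learning from this data alone.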
As described in Brill (1995), the training module of a TBL tagger has access to a
set of tagged gold-standard training data for reference. As shown in Algorithm 1, the
first step is applying the initial-state annotation (usually the unigram most-likely tag, as
mentioned above) to the untagged training data. From here, the tagger generates a set
of transformational rules of the kind mentioned above. These are generated from a set of
rule templates, which are derived from POS-based and lexicalized (i.e. wordform-based)
features including:
Change the tag to B when the current tag is A and:
1. The preceding/following word is tagged T
2. The word two before/after is tagged T
3. One of the two preceding/following words is tagged T
4. The preceding/following word is tagged T and the following word is tagged S
5. The preceding/following word is W
6. The current word is W and the preceding/following word is X
7. The current word is W and the preceding/following word is tagged T
8. The current word is W
where A, B, S and T range over the possible POSs, W and X range over all word types
in the corpus, and alternatives separated by a / represent different versions of the rule rather
than disjunctive possibilities within one rule.
In addition to learning rules conditioned by these contextual features, there is also a
phase of the algorithm to deal with unknown words. The details are not relevant here;
it will suffice to note that it operates in a similar way but with different rule templates
based on word-initial or word-final character strings. When the tagger is run over test
data, after the initial annotation, these lexical rules can then be applied to correct errors
in unknown words, and finally the contextual rules can be applied.
This approach produces results very close to those which use far more mathematically
complex algorithms. Brill reports an overall accuracy of 96.6% on the Penn Treebank WSJ
corpus using 900,000 words of training data (split as 600,000 for learning contextual rules
and 350,000 for learning rules for unknown words). The only drawback of this scheme is
the long training time required, since in each iteration the counts for each possible rule
must be regenerated, as previous rule applications will have probably changed the score
of that rule since counts were last generated.
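The cost of regenerating the counts can be seen in a naive sketch of the training loop. Everything here is an illustrative assumption: only the single template "the preceding word is tagged T" is instantiated, the stopping threshold is lowered to 1 so that the toy data yields a rule, and the threshold is checked before a rule is added.

```python
from typing import List, Tuple

Rule = Tuple[str, str, str]  # (from_tag, to_tag, previous_tag)


def apply_rule(rule: Rule, tags: List[str]) -> List[str]:
    """Change from_tag to to_tag wherever the previous tag is previous_tag."""
    from_tag, to_tag, prev_tag = rule
    return [to_tag if i > 0 and tag == from_tag and tags[i - 1] == prev_tag else tag
            for i, tag in enumerate(tags)]


def train_tbl(initial: List[str], gold: List[str], threshold: int = 1) -> List[Rule]:
    """Greedily learn rules until the best net gain falls below `threshold`."""
    current, learned = list(initial), []
    tagset = sorted(set(gold) | set(initial))
    candidates = [(a, b, t) for a in tagset for b in tagset for t in tagset if a != b]
    while True:
        best_rule, best_score = None, 0
        # Every candidate is rescored on every pass: this is the expensive step.
        for rule in candidates:
            proposed = apply_rule(rule, current)
            good = sum(c != g and p == g for c, p, g in zip(current, proposed, gold))
            bad = sum(c == g and p != g for c, p, g in zip(current, proposed, gold))
            if good - bad > best_score:
                best_rule, best_score = rule, good - bad
        if best_rule is None or best_score < threshold:
            return learned
        current = apply_rule(best_rule, current)
        learned.append(best_rule)


initial = ["DT", "MD", "VBD", "PRP", "MD", "VB", "DT", "MD"]
gold = ["DT", "NN", "VBD", "PRP", "MD", "VB", "DT", "NN"]
rules = train_tbl(initial, gold)
```

Even with one template the candidate set grows with the cube of the tagset size, and all of it is rescored after each learned rule, which gives some intuition for the long training times reported.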
One of the most successful approaches to deal with this problem was that devised
by Ngai and Florian (2001), which vastly reduces the training time with no reduction
in accuracy. The basic idea is to avoid repetition by generating and storing good and
bad counts for each rule r once at the beginning, and updating the counts only if they
are modified by the application of another rule.[11] According to Ngai and Florian's figures, these optimisations reduce the running time over the Penn Treebank WSJ corpus
from 5880 minutes using Brill's 1995 tagger to 17 minutes using their own FastTBL
algorithm. This is the TBL implementation used here.

Algorithm 1 Algorithm to train a transformation-based learning tagger, where S denotes
the samples in the training corpus, and for each sample s ∈ S, C[s] denotes the posited
classification, C_init denotes the initial classification and T[s] denotes the true classification

    for all s ∈ S do
        C[s] ← C_init[s]    // apply initial-state rule to corpus
    end for
    L ← ∅    // list of rules
    repeat
        v_max ← 0
        for all p ∈ P do    // P the set of rule templates
            R ← all instantiations of p
            for all r ∈ R do
                good(r) ← |{s | s ∈ S ∧ C[s] ≠ T[s] ∧ C[r(s)] = T[s]}|
                bad(r) ← |{s | s ∈ S ∧ C[s] = T[s] ∧ C[r(s)] ≠ T[s]}|
                v(r) ← good(r) − bad(r)
                if v(r) > v_max then
                    r_best ← r
                    v_max ← v(r)
                end if
            end for
        end for
        for all s ∈ S do
            C[s] ← C[r_best(s)]    // apply rule to corpus
        end for
        append r_best to L    // add rule to list
    until v_max < 3
    output L

2.4.2 Support Vector Machine-based POS Tagging

Support vector machines (SVMs) were first applied to POS tagging in Nakagawa et al.
(2001). An SVM is a binary classification algorithm based on a geometric interpretation of
the feature values for each instance. As detailed in Cristianini and Shawe-Taylor (2000),
given a set of training instances each consisting of a vector of binary or numeric feature
values and a true classification y ∈ {−1, 1}, an SVM learns a classification function
f(x) which can be used to classify a test instance with feature vector x. In binary
classification problems, the classification rule is then sgn(f(x)). The function
f(x) is dependent on what is known as the kernel function, which effectively maps the
data into a higher dimensional feature space, allowing the correct classification of instances
which have non-linearly separable feature values in the original feature space.

[11] This includes not only tags which are directly affected by r, but also those tags on which r depends,
i.e. those in the context window used by the rule.
Giménez and Màrquez (2004), whose SVM-based tagger SVMTool-1.2.2 we utilise here,
extend binary support vector machines to cover multiclass classification using a strategy
known as one-per-class binarisation: an SVM is constructed for each POS which contains
ambiguous lexical items (reportedly 34 for the Penn Treebank), and in the classification
stage, the most confident prediction from all of the SVMs is selected as the tag for the
word.
The contextual features used in Giménez and Màrquez's tagger include unigrams,
bigrams and trigrams of words and POSs, derived from the tokens appearing in a context
window of 2 tokens on either side of the target. POS features for ambiguous words which
have not yet been tagged can be replaced with ambiguity classes. These nominal features
are binarised to act as input to the SVM in the usual way: a nominal feature with k
possible values is represented by k binary features, each of which is true when the original
feature takes one particular value and false otherwise. Thus we have features along
the lines of "preceding POS bigram is (DT,JJ)". The accuracy reported for SVMTool is
97.16% for all tokens and 89.01% for unknown tokens.
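
The binarisation of nominal features described above can be illustrated with a minimal sketch. It is not SVMTool's code: the function name, the feature names and the two toy instances are assumptions made for the example.

```python
def binarise(instances):
    """Give every (feature name, value) pair its own binary feature."""
    # Assign a column index to each distinct pair, in a stable order.
    index = {}
    for inst in instances:
        for pair in sorted(inst.items()):
            index.setdefault(pair, len(index))
    # Build one 0/1 vector per instance.
    vectors = []
    for inst in instances:
        vec = [0] * len(index)
        for pair in inst.items():
            vec[index[pair]] = 1
        vectors.append(vec)
    return index, vectors


instances = [
    {"prev_pos_bigram": ("DT", "JJ"), "word": "dog"},
    {"prev_pos_bigram": ("DT", "NN"), "word": "barks"},
]
index, vectors = binarise(instances)
```

Here the nominal feature `prev_pos_bigram` takes two values in the data, so it becomes two binary columns, one of which corresponds to "preceding POS bigram is (DT,JJ)".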

2.4.3 Maximum Entropy POS Tagging

A successful approach to POS tagging in the Maximum Entropy (ME) framework is described in Ratnaparkhi (1996). This is a probabilistic approach to classification tasks
based on the Principle of Maximum Entropy, which states that when choosing between
a number of different probabilistic models for a set of data, the most valid model is the
one which makes fewest arbitrary assumptions about the nature of the data.
The ME approach was shown to be effective on a range of NLP problems by Ratnaparkhi (1998). It uses a set of binary features¹² and pairs each of these with each possible
output label. In the training stage each feature-label pair is assigned a weighting (using
any of a number of algorithms) so that the entropy of the observed data is maximised,
and in the classification stage these weightings are multiplied with the corresponding feature to estimate the probability of a label for a particular instance. Finally, the highest
probability label is selected.
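
As a rough illustration of the classification stage, the sketch below assumes the usual log-linear formulation, in which the weights of the active feature-label pairs are summed and the result is normalised over all labels. It is not taken from any of the taggers discussed here; the feature names and weight values are invented purely for the example.

```python
import math


def label_probabilities(active_features, weights, labels):
    """Estimate P(label | instance) from the weights of active feature-label pairs."""
    # Each binary feature that fires contributes its weight for the given label.
    scores = {label: sum(weights.get((feat, label), 0.0) for feat in active_features)
              for label in labels}
    normaliser = sum(math.exp(s) for s in scores.values())
    return {label: math.exp(s) / normaliser for label, s in scores.items()}


# Hypothetical weights for two features and two candidate labels.
weights = {
    ("word=based", "VBN"): 1.2,
    ("word=based", "JJ"): 0.4,
    ("prev_pos=VBZ", "VBN"): 0.9,
}
probs = label_probabilities(["word=based", "prev_pos=VBZ"], weights, ["VBN", "JJ"])
best_label = max(probs, key=probs.get)
```

The label with the highest estimated probability (`best_label`) is the one that would be assigned.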
The features in this case are similar to the features used by the taggers described
in Sections 2.4.1 and 2.4.2, depending on tokens in a context window of size five. In
Ratnaparkhi (1996) the discriminating elements are the current word and the two words
on either side, and the two preceding POSs which have already been assigned.
Toutanova and Manning (2000) adapt Ratnaparkhi's approach by removing the lexical features based on preceding words and, as we saw in Section 2.2, by adding a number of
other hand-tuned features derived from a larger context window to assist in disambiguation of problematic words. We identify their tagger, which uses the Improved Iterative
Scaling algorithm (Malouf 2002) for parameter estimation, by StanME. These optimisations bring the accuracy from the baseline for all/unknown words of 96.76%/84.5% to
96.86%/86.91%.

¹² In ME literature such as Ratnaparkhi (1998) the term contextual predicate (CP) corresponds to
what is called a feature in most machine-learning literature, with feature reserved for a distinct but related
structure dependent on CPs. For consistency we retain the more widespread usage of feature throughout.


Chapter 3
Methodology

3.1 General Outline

Previous approaches to corpus-based POS tagging have tended to focus on applying
new algorithms to an existing task (Brill 1995; Ratnaparkhi 1996; Daelemans et al. 1996;
Nakagawa et al. 2001) or, in a few cases, on increasing accuracy by adding additional features to the common core of features used by almost all taggers (Toutanova and Manning
2000). This previous work has for the most part used the Penn Treebank in unaltered
form as the source of training and test data. In contrast, here we take an alternative
approach: we keep the algorithms and features fixed and investigate the effect of
treating the tagset as variable.
We investigate whether we can improve the information available to POS taggers by
introducing a finer-grained and therefore more informative tagset. If we subdivide the
tags in a linguistically sensible way, the extra information provided may help the tagger
make syntactic generalisations which are not apparent either from the coarse POS tags
or from the sparsely populated lexical feature vector. Effectively, we are attempting to
create a tagset which is more computationally tractable, so the tagger can more effectively
disambiguate ambiguous and unknown words using POS-based contextual information.
In creating a finer-grained tagset, we are almost inevitably increasing its linguistic
utility; we do not plan to introduce distinctions which have no basis in linguistic reality.
We have already mentioned in Section 2.3 that there is often tension between the requirements of linguistic utility and computational tractability. This work is thus an investigation
into whether they always conflict, i.e. whether there is some set of linguistically useful
modifications which also increase computational tractability.
However, it is clear that a more complex tagset could make the baseline tagging task
more difficult due to a potentially increased number of tags for a given word. This means
that any performance improvements achieved using a modified tagset could be obscured
by decreases in performance caused by increased ambiguity elsewhere. Therefore, to
enable comparison with previous work we will adopt the architecture shown in Figure 3.1.
For a given set of tagset modifications, the testing cycle is as follows: we map the tag of
each token in the training data appropriately to a particular new version of the tagset,
train the tagger on the mapped data, run the trained tagger over a test corpus, and for
purposes of comparison map the finer-grained POS tags back to the original Penn Treebank
tags before evaluating performance.
This method means that any increased linguistic utility of the mapped tags is discarded
before evaluation, but for the purposes of this experiment the linguistic utility is a means
for improving tagger performance rather than an end in itself.

Figure 3.1: The experimental architecture

To facilitate the final stage of mapping the tags back to original Penn tags, we place
certain restrictions on allowable modifications: the mapping function must either be
injective from the old to the new tags (i.e. each tag in the new tagset corresponds to
at most one in the original tagset), or any distinctions which are collapsed must be
unambiguously recoverable from the wordform so the equivalent tags from the original
tagset can be determined reliably.
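
The first of these restrictions can be checked mechanically. The sketch below is illustrative only and is not the module used in the experiments: it builds the inverse mapping from pairs of (original tag, new tag) and rejects any new tag that would correspond to more than one original tag. The function name and the example tags are assumptions, and the second case, where collapsed distinctions are recovered from the wordform, is not covered.

```python
def build_inverse(mapped_pairs):
    """Build a new-tag -> original-tag table, refusing ambiguous new tags."""
    inverse = {}
    for old, new in mapped_pairs:
        # setdefault stores `old` the first time `new` is seen and returns
        # the stored value afterwards, so any mismatch signals a conflict.
        if inverse.setdefault(new, old) != old:
            raise ValueError(
                f"new tag {new!r} maps back to both {inverse[new]!r} and {old!r}")
    return inverse


# Hypothetical forward mapping output: some IN and VBP tokens get new tags.
pairs = [("IN", "IN-TR"), ("IN", "IN"), ("VBP", "BEP"), ("VBP", "VBP")]
inverse = build_inverse(pairs)

# Tags absent from the table were never modified and map to themselves.
restored = [inverse.get(tag, tag) for tag in ["DT", "BEP", "IN-TR"]]
```

Because every new tag has exactly one source, mapping the tagger's output back to the original tagset is a simple table lookup.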

3.2 Motivation

It is worth addressing here the question of why a small performance improvement is worth striving for. By NLP standards, accuracy of 97.0% seems astoundingly high,
raising the question of whether there is any point in attempting to raise this figure by
a few fractions of a percent. However, according to word-by-word evaluation metrics,
POS tagging is actually quite a simple task: as noted by Charniak et al. (1993), the
unigram-based most-likely tag (MLT) baseline for the task is around 91%.
In getting a clearer understanding of POS tagger evaluation, it is worth considering
that POS tagging is generally a pre-processing phase in NLP, which acts as input to a
second stage such as sentence-level parsing. If we look at sentence-level accuracy (i.e. the
proportion of sentences in which all tokens are correctly tagged) the POS tagging task
seems harder: with an average sentence length of 24 words, and assuming errors occur
independently, we would expect a tagger which gives 97% accuracy over word tokens to
achieve 49% at the sentence level, while the corresponding figure for a tagger performing
at 98% is 62% of sentences tagged correctly.
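
Under the independence assumption, the expected sentence-level accuracy is simply the token-level accuracy raised to the power of the sentence length. The short sketch below (the function name is illustrative) evaluates this for a sentence of exactly 24 tokens. The results are close to the figures quoted above; the small difference at 97% would be consistent with the average sentence length having been rounded to 24, although that explanation is only an assumption.

```python
def expected_sentence_accuracy(token_accuracy, sentence_length):
    """Probability that all tokens in a sentence are correct, assuming independent errors."""
    return token_accuracy ** sentence_length


at_97 = expected_sentence_accuracy(0.97, 24)   # roughly 0.48
at_98 = expected_sentence_accuracy(0.98, 24)   # roughly 0.62
```

A one-point gain in token accuracy therefore translates into a gain of well over ten points at the sentence level.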
There are a number of reasons to believe there is still room for improvement in tagging
accuracy. As noted in Brill and Wu (1998), there is a high degree of complementarity
in errors made by maximum entropy and TBL-based taggers (among others), suggesting
that even though these taggers use similar contextual features, the differences in the way
these features are combined result in errors over different words. This implies that at
least some of the time, there is sufficient information available, but that the different
underlying algorithms fail to apply it correctly in certain cases.

3.3 Experimental Setup

The tag-mapping module is implemented in Python using the Natural Language
Toolkit (Loper and Bird 2002) and can conditionally map the POS tag of any token
in the Penn Treebank dependent on a conjunctive or disjunctive combination of lexical,
syntactic or collocational features.
The lexical features, as we have already seen, are those dependent on the wordform
itself. They are typically used by supplying a list of wordforms to be mapped to a new POS.
Syntactic features are those derived from the parsed annotation of the corpus; more
specifically, they are dependent on the phrasal categories of nearby nodes: parent, grandparent, immediately preceding or following siblings or all preceding or following siblings.
Collocational features are based on relative sentential position with no regard to tree
structure. In this case, they are determined by the two preceding and following POS
unigrams.
It is worth emphasising that using extra information sources, including syntactic information derived from possibly very distant elements in the sentence, does not represent
a departure from the mainstream methods of POS tagging in NLP using only local context. These features are purely being used to filter the POS tags in a preprocessing stage
in ways we hope more accurately reflect underlying patterns of regularity. While we are
modifying the data so that it is annotated in a way which is hopefully more conducive
to accurate tagging, the tagging algorithms themselves remain unchanged, so this method
does not give us an unfair advantage by using syntactic information in the tagging process
itself.
To illustrate the extraction of feature values, let us consider the parse tree previously shown in Section 1.2.4 and repeated in Figure 3.2 for convenience. The values of
these features for each of the tokens are shown in Table 3.1. The only lexical feature is
Wordform; syntactic features are all of the Sib and Par features, and the collocational
features are the NextPOS and PrevPOS features. Note that the preterminal POS label of the target token is ignored, while the POS labels of adjacent tokens are treated
like any other non-terminal node. This was the most sensible way to avoid redundancy
in the features.
We could use these features, for example, to map tags to create a class corresponding
to transitive prepositions, which we could denote IN-TR, as distinct from subordinators
(discussed in Section 2.3.2). The condition for this mapping would be [POS=IN &
RightSib=NP] (making use of a syntactic feature in this case). With lexical features
we might create a new class BEP representing the verb to be in the finite base form
(i.e. present tense, non-third person singular), which is equivalent to the C5 tag VBB,¹
using the condition [POS=VBP & Wordform ∈ {am, are}]. If we applied both of
these mappings, the sentence shown in Figure 3.2 would be rendered by the tag mapping
module as shown in Figure 3.3. Sentences with modified tags such as this would be used
as training data.

¹ This would in general be used in conjunction with a suite of corresponding modifications to create
other tags for the verb to be, such as BEZ for is, conditioned in a similar way.

[Parse tree diagram for the sentence "Both Clarcor and Anderson are based in Rockford , Ill"]
Figure 3.2: A parse tree annotation from the Penn Treebank, repeated (with minor notational changes) from Section 1.2.4 for convenience

Both  Clarcor  and  Anderson  are    based  in       Rockford  ,  Ill
DT    NNP      CC   NNP       *BEB*  VBN    *IN-TR*  NNP       ,  NNP
Figure 3.3: A sample sentence with some mapped tags (marked with asterisks)

There is of course a corresponding inverse mapping module enabling the tags of the
test data to be mapped back to the original tags for comparison with the gold-standard.
As described above, this ensures our results are not penalised by the possibly increased
difficulty of assigning tags in a more complex tagset. At the same time, we will still
be able to see any changes which, due to the more specific contextual information, have
improved accuracy over the original tags.
An example of the experimental process is illustrated in Figure 3.4. To use the gold
standard sentence shown as test data, the tags would first be stripped off. Following this,
a tagger trained on training data of the sort exemplified by Figure 3.3 might assign tags
as seen in the second row of Figure 3.4. These would be mapped back to the original
Penn Treebank tagset using the inverse mapping module, producing the tagged version
shown on the final line. Comparison with the gold-standard would reveal one tagging
error in the sentence.
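
The two example conditions given in this section, [POS=IN & RightSib=NP] and the lexical condition on am and are, could be expressed roughly as follows. This is a hedged sketch rather than the actual tag-mapping module: it does not use the Natural Language Toolkit, and the dictionary representation of a token and the function name are assumptions made for illustration. The feature names follow Table 3.1, and the new tag name BEP is the one used in the running text.

```python
def map_tag(token):
    """Return the (possibly modified) tag for a token described by its feature values."""
    # Syntactic condition: a preposition whose right sibling is an NP.
    if token["POS"] == "IN" and token.get("RightSib") == "NP":
        return "IN-TR"
    # Lexical condition: finite base forms of the verb 'to be'.
    if token["POS"] == "VBP" and token["Wordform"] in {"am", "are"}:
        return "BEP"
    # Every other token keeps its original Penn Treebank tag.
    return token["POS"]


# Three tokens from the example sentence, with feature values as in Table 3.1.
tokens = [
    {"Wordform": "are", "POS": "VBP", "RightSib": "VP"},
    {"Wordform": "based", "POS": "VBN", "RightSib": "PP-LOC-CLR"},
    {"Wordform": "in", "POS": "IN", "RightSib": "NP"},
]
mapped = [map_tag(t) for t in tokens]
```

Since each new tag is produced from exactly one original tag, the inverse mapping is a fixed lookup and needs no access to the parse tree.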

Wordform      are    based         in          Rockford  ,      Ill
POS           VBP    VBN           IN          NNP       CM     NNP
Par           VP     VP            PP-LOC-CLR  NP        NP     NP
GrandPar      S      VP            VP          NP        NP     NP
LeftSib                                                  NNP    CM
AllLeftSib                                               {NNP}  {CM}
RightSib      VP     PP-LOC-CLR    NP          CM        NNP
AllRightSib   {VP}   {PP-LOC-CLR}  {NP}        {CM,NNP}  {NNP}
PrevPOS       NNP    VBP           VBN         IN        NNP    CM
PrevPOS2      CC     NNP           VBP         VBN       IN     NNP
NextPOS       VBP    VBN           IN          NNP       CM
NextPOS2      VBN    IN            NNP         CM

Table 3.1: Feature values for a selection of tokens given the parse tree in Figure 3.2, with
the tag for comma listed as CM for clarity

Gold-standard sentence with original    IBM   is     based  in      Armonk  N.Y
Penn tags                               NNP   VBZ    VBN    IN      NNP     NNP
The sentence as it might be tagged by   IBM   is     based  in      Armonk  N.Y
our trained tagger                      NNP   BEZ²   JJ     IN-TR   NNP     NNP
The test sentence with tags mapped      IBM   is     based  in      Armonk  N.Y
back to original Penn tagset            NNP   VBZ    *JJ*   IN      NNP     NNP

Figure 3.4: An illustration of the testing process, showing the gold-standard sentence,
the sentence as it might be tagged by a tagger trained on data such as that in Figure 3.3,
and the tagged sentence with its tags converted back to the original tagset, revealing
one error, marked with asterisks.

3.3.1 Data Sampling

An initial round of experimentation with the corpus divided into training, test and
development sets as most commonly seen in the literature (i.e. sections 0-20 as a training
set and sections 21-23 as development/test sets) revealed that the results were quite
sensitive to the choice of development set. Since we were aiming for possibly quite small
increments in accuracy, this meant there was a risk of overfitting to the data, since any
changes in accuracy could have been due to peculiarities of the particular data set: even
with 200K words of test data, a global change of 0.05% corresponds to only 100 words,
or much less over a specific POS.
To alleviate this problem we used five-fold cross-validation: the data is divided into
five partitions and a complete training/test cycle involves five iterations. In each iteration,
a different partition is held out as test data and the remainder are used to train the
POS tagger. Five-fold cross-validation over sections 0 to 22 of the Penn Treebank WSJ
corpus effectively gives a development set of one-million words, the same size as the entire
training/development corpus.
² See footnote 1.


Ratnaparkhi (1996) noted the presence of inter-annotator inconsistencies in the Penn
Treebank, observing sharp changes in the distribution of certain POSs at boundaries
which correspond to changes in annotator. To avoid effects caused by these discrepancies masking any meaningful corpus-wide generalisations, we also departed from the
usual strategy of constructing the data partitions from contiguous sections of the corpus.
Rather, we divided the corpus into units of five sentences each and assigned the first
sentence of each group to the first partition, the second to the second partition and so
on. As noted by Manning and Schütze (1999:208) (albeit for the distinct but related task
of n-gram language modelling), this method tends to inflate performance figures; however,
we are purely looking for differences in accuracy relative to the baseline so this is not a
problem.
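
The interleaved partitioning scheme amounts to assigning sentence i to partition i mod 5. A minimal sketch follows; the function names are illustrative, and sentences are represented here simply as list items.

```python
def interleaved_folds(sentences, k=5):
    """Assign the i-th sentence to partition i mod k."""
    folds = [[] for _ in range(k)]
    for i, sentence in enumerate(sentences):
        folds[i % k].append(sentence)
    return folds


def train_test_splits(sentences, k=5):
    """Yield (training data, held-out test data) for each of the k iterations."""
    folds = interleaved_folds(sentences, k)
    for held_out in range(k):
        train = [s for j, fold in enumerate(folds) if j != held_out for s in fold]
        yield train, folds[held_out]


corpus = list(range(12))            # stand-in for 12 sentences
folds = interleaved_folds(corpus)
splits = list(train_test_splits(corpus))
```

Each partition therefore draws sentences from across the whole corpus, so that no partition is dominated by the habits of a single annotator.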

3.3.2 A Data-Driven Alternative

Our primary goal here is to apply linguistic intuition to the task of tagset modification. We also pursue an alternative, more data-driven line of investigation: we investigate
whether in a separate stage to training the taggers, we can use machine learning techniques to determine useful subdivisions (or clusters in machine-learning terms) in the
tagsets corresponding to patterns of syntactic regularity. The approach is almost identical to the primary approach described above, but replaces linguistic insight with machine
learning techniques.
We defined a range of features which could help in determining patterns of syntactic
regularity. Some of the features were syntactic, often corresponding to layers of annotation used by Klein and Manning (2003): phrasal categories of the parent, grandparent,
left sibling and right sibling, and binary-valued features for whether a given preterminal
corresponds to a phrasal head,³ or whether it is the only element in its phrase. There was
also a set of collocational features corresponding more closely to the features available to
the tagger, based on the two preceding and two following POS tags.
The approach we took was to conflate the nominal feature values extracted for each
token by word type and construct a frequency distribution of the values of each feature
for each word type. For example, if we had as input the sentences shown in
Figures 3.2 and 3.4 (naturally we would have access to the original unmodified gold-standard
parse tree, which for Figure 3.4 is not shown), and we were using the features
Par and PrecPOS, we would construct a number of frequency distributions, including
the following for based/VBN:

Par:      VP 2
PrecPOS:  VBZ 1   VBP 1

The frequency distribution for each feature with n non-zero values was then converted
into a set of n numeric features for the word type using maximum likelihood estimation.
So if we specify f syntactic or collocational features, the result of this is f groups of
features for each type, derived from f probability distributions composed of individual
features with values corresponding to the relative frequencies of these particular values
for the feature. For the above example, the corresponding set of derived features and
values for based/VBN would be:

Par is VP        1.0
PrecPOS is VBZ   0.5
PrecPOS is VBP   0.5

³ A phrasal head in English is the rightmost non-terminal in a phrase with a category which corresponds
to the phrasal category, e.g. the rightmost NN in an NP. These are regarded by linguists as being
particularly significant components of a phrase.

This method of combining feature values was the most principled way we could find of
capturing a large amount of distributional information manageably. These feature values
were then used as input for the implementation of the EM algorithm⁴ in the Weka toolkit
(Witten and Frank 2000).
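
The conversion from per-token feature values to relative-frequency features can be sketched as follows, reproducing the based/VBN example. The function name and the triple-based input format are assumptions made for this illustration; the clustering step itself (EM in Weka) is not shown.

```python
from collections import Counter, defaultdict


def distributional_features(observations):
    """Turn (word type, feature, value) observations into relative-frequency features."""
    # One frequency distribution per (word type, feature) pair.
    counts = defaultdict(Counter)
    for word_type, feature, value in observations:
        counts[(word_type, feature)][value] += 1
    # Maximum likelihood estimate: count of a value divided by the total for its feature.
    features = defaultdict(dict)
    for (word_type, feature), dist in counts.items():
        total = sum(dist.values())
        for value, n in dist.items():
            features[word_type][f"{feature} is {value}"] = n / total
    return dict(features)


# The two occurrences of based/VBN in Figures 3.2 and 3.4.
observations = [
    ("based/VBN", "Par", "VP"), ("based/VBN", "PrecPOS", "VBP"),
    ("based/VBN", "Par", "VP"), ("based/VBN", "PrecPOS", "VBZ"),
]
derived = distributional_features(observations)
```

Each word type thus ends up with one numeric feature per observed value, and the values within each feature group sum to one.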

3.3.3 Evaluation Metrics

Perhaps the most intuitively obvious method of evaluation in classification tasks is the
accuracy: the fraction of samples (in this case word tokens) which the algorithm classifies
correctly according to the gold standard, i.e. the number of correct classifications divided
by the total number of samples. For comparison with previous work we use the global
token-level accuracy metric since it is the most widely used metric in tagging research.
The token-level accuracy over unknown words (i.e. those which did not appear in the
training data) is also crucial since this is a major source of tagging errors: in our baseline
with an unmodified tagset, just 2.4% of the tokens in the test data were unknown but
they contributed 11-13% of errors. Another metric we will use which is less widespread
is the sentence-level accuracy metric described in Section 3.2. As noted, it is valuable as
a reflection of likely problems in downstream applications.
However, in a classification problem, it is also often instructive to look at performance
over individual classes, since two classifiers which produce identical global accuracy figures
could in fact have vastly different distributions of errors between different classes. In this
case, accuracy is often less interesting, as the figures tend to be dominated by the large
number of true negatives. If we instead focus on a particular target class and denote the
number of true positives (i.e. those samples correctly classified with the target label) by
$p_t$, false negatives (erroneously classified into a non-target class) by $n_f$, and true negatives
and false positives by $n_t$ and $p_f$ respectively, we can gauge the number of true positives
relative to false positives and false negatives using precision ($P$), recall ($R$) and F-score
($F$), which are defined by:

\[ P = \frac{p_t}{p_t + p_f} \qquad R = \frac{p_t}{p_t + n_f} \qquad F = \frac{2PR}{P + R} \]

The precision corresponds to the fraction of positive classifications which were actually
correct, and the recall corresponds to the fraction of true samples which were classified
positively, while the F-score combines the two using the harmonic mean, which heavily
penalises a system with a low score for either P or R. Where it seems relevant, we
will look at precision, recall and F-score over individual POSs to highlight points of
interest observable from variations in results which are not evident from the broad metrics
discussed above.
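
For a single target POS, these quantities can be computed directly from aligned gold-standard and predicted tag sequences. The sketch below is illustrative: the function name is an assumption, the toy tag sequences are invented, and returning 0.0 when a denominator is zero is a convention chosen for the example rather than something specified here.

```python
def precision_recall_f(gold, predicted, target):
    """Precision, recall and F-score for one target tag."""
    pairs = list(zip(gold, predicted))
    true_pos = sum(g == target and p == target for g, p in pairs)
    false_pos = sum(g != target and p == target for g, p in pairs)
    false_neg = sum(g == target and p != target for g, p in pairs)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    # Harmonic mean of precision and recall.
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return precision, recall, f_score


gold_tags = ["VBN", "VBN", "JJ", "VBN", "NN"]
predicted_tags = ["VBN", "JJ", "VBN", "VBN", "NN"]
p, r, f = precision_recall_f(gold_tags, predicted_tags, "VBN")
```

In this toy case two of the three predicted VBN tags are correct and two of the three true VBN tokens are found, so all three measures equal 2/3, whereas the token-level accuracy is 3/5.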

⁴ In fact the implementation included a wrapper which ran the algorithm multiple times, iteratively
increasing the number of clusters until the log-likelihood decreased, avoiding arbitrary selection of the
number of clusters.


Baseline
As well as showing the benchmark of accuracy achievable with an unmodified tagset,
for a point of comparison we will apply a suite of naively conceived modifications to act
as a baseline indicating how far it is possible to get with a simplistic approach. The
idea is borrowed from POS induction, which involves determining word clusters (i.e.
POSs) from unannotated data. The task here is similar except that we are looking for
patterns of regularity within a particular POS, so the baseline used by Clark (2003) may
be informative. To subdivide a POS into n subclasses, we assign each of the (n − 1)
most frequently seen word tokens from the class into (n − 1) separate new classes and
the remainder to a final subclass. For example, it can be determined empirically that the
most frequently observed token with class PRP is it. So, for n = 2 over the PRP class we
create a mapping (using lexical features) assigning it to one class and all other members
of PRP to another (or equivalently leave them in the original).
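
A minimal sketch of this frequency-based subdivision is given below. It is an illustration under stated assumptions rather than the code used in the experiments: the function names and the naming scheme for the new tags (PRP-1, PRP-2 and so on) are invented, and the remainder of the class is left with its original tag.

```python
from collections import Counter


def naive_subdivision(tagged_tokens, pos, n):
    """Map each of the (n - 1) most frequent wordforms tagged `pos` to its own new tag."""
    freq = Counter(word for word, tag in tagged_tokens if tag == pos)
    top = [word for word, _ in freq.most_common(n - 1)]
    return {word: f"{pos}-{i + 1}" for i, word in enumerate(top)}


def apply_subdivision(tagged_tokens, pos, mapping):
    """Retag tokens of class `pos` whose wordform has a new tag; leave the rest unchanged."""
    return [(word, mapping.get(word, tag) if tag == pos else tag)
            for word, tag in tagged_tokens]


toy_corpus = [("it", "PRP"), ("runs", "VBZ"), ("he", "PRP"), ("saw", "VBD"), ("it", "PRP")]
prp_mapping = naive_subdivision(toy_corpus, "PRP", n=2)
retagged = apply_subdivision(toy_corpus, "PRP", prp_mapping)
```

Because each new tag is derived from a single original POS, the mapping satisfies the reversibility requirement of Section 3.1.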
There are a number of reasons why we might expect such a baseline to show reasonable
performance. Often the most frequent members of the class will show a large degree of
syntactic irregularity compared to other members of their class. For example in the determiner class (DT), the two most frequent lexical items are the and a, which correspond
to what grammarians call articles and are unique among the determiners in being unable
to stand alone as a grammatical noun phrase, as is demonstrated by oppositions such as
I like that versus *I like the. Additionally, these most frequent lexical items are those
which will be most resistant to problems of data sparseness which we often introduce in
creating new categories.


Chapter 4
Experimental Evaluation
There is a very large range of potential modifications to any tagset, so it was
necessary to adopt an approach which involved tentatively testing as wide a range of
modifications as possible but only running a complete test cycle on the most promising
modifications. This enabled a broad search space initially but also kept the investigation
manageable in terms of CPU time.
We discuss several broad groups of modifications below: the baseline in Section 4.1,
clustering in Section 4.2 and linguistically motivated modifications described in Section 4.3. For each of these, we adopted the following incremental prototyping architecture. We selected fnTBL (Ngai and Florian 2001) as our first stage prototyping tool
for a set of tagset modifications, as it can complete a five-fold cross-validation test-cycle
in under two hours. Any modification which had a large negative impact on performance
at this stage was generally not investigated further, since the taggers use similar features,
and we were attempting to find universally useful distinctions. The SVM tagger SVMTool 1.2.2 (Giménez and Màrquez 2004), with a turnaround of under seven hours, was
used in subsequent experimentation. Only the Stanford NLP Maximum Entropy tagger
(Toutanova and Manning 2000) (StanME hereafter) had a prohibitive training time, so
for practical reasons was used minimally, for benchmarking and later-stage testing.
We have attempted to summarise as broad a range of modifications as possible within
the space constraints, meaning that in some cases there were one or more subtly different
variants of a mapping we cover here for which we do not report results, instead reporting
only on the most successful variant. Additionally a number of (generally less successful)
modifications are not reported here.
In all of the tables in the results section we quote figures for accuracy over all tokens to
five significant figures. Even though these figures are derived from over one million tokens,
the comparatively large number of significant figures probably implies an unjustified
degree of confidence in these numbers; however, we quote this many figures in order to
make visible the comparatively small changes in performance we were expecting. The
final digit in particular should not be interpreted as a confident reflection of performance
but rather as a rough indication. Other figures over which we expect more variability and
which are derived from a smaller number of samples are given to four significant figures.
                  TB       SV       MaxEnt
All Tokens        96.842   96.852   97.056
Unknown Tokens    81.94    84.62    87.34
Sentences         51.77    50.72    53.72
Table 4.1: Accuracy (%) of off-the-shelf taggers with default parameters

In each table, as well as showing the token and sentence-level accuracy figures, we also
show a selection of changes in F-score over individual POSs relative to the benchmark. To
keep the number of figures manageable, we determined the most significant changes using
the paired t-test (Witten and Frank 2000), which estimates the statistical significance of
differences between two tests using the figures for each fold of cross-validation. In each
table, we show the two most significant negative F-score changes, and the two most
significant positive changes, excluding uninteresting POSs: punctuation marks, as well
as FW, which occurs so infrequently that it is almost impossible to tag correctly, meaning
that any changes are almost certainly due to noise, despite the confidence of the paired
t-test.
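
For reference, the statistic underlying a paired t-test over cross-validation folds can be computed as below. This is the standard textbook form, written out under the assumption that it matches what the toolkit computes; the per-fold F-scores are invented for the example and are not results from these experiments. Converting the statistic to a significance level requires the t distribution with one fewer degree of freedom than the number of folds, which is not shown.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(benchmark_scores, modified_scores):
    """t statistic for the per-fold differences between two systems."""
    diffs = [m - b for b, m in zip(benchmark_scores, modified_scores)]
    # Mean difference divided by its standard error (sample standard deviation / sqrt(n)).
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical F-scores for one POS over five folds.
benchmark_f = [92.0, 92.5, 91.8, 92.2, 92.1]
modified_f = [92.3, 92.6, 92.2, 92.4, 92.5]
t_value = paired_t_statistic(benchmark_f, modified_f)
```

Because the test works on per-fold differences, a small but consistent change can yield a large statistic, which is why very rare classes such as FW can appear significant on what is likely to be noise.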

4.1 Benchmark and Baseline

As a benchmark, in Table 4.1, we show the global accuracy figures for each of our
taggers as run with default or recommended parameter settings using the experimental
setup described in Section 3.3 (i.e. five-fold cross-validation over sections 0-21 of the Penn
Treebank) with no mapping function applied to any of the tags. F-scores over specific
POSs from the same experiment are shown in Table 4.2, for reference in subsequent tables
which quote changes relative to these.
In Table 4.3 we show the results of our baseline test (as described in Section 3.3.3) of
naive frequency-based clustering over a selection of POSs judged to potentially benefit
from such mappings, with further testing using the more CPU-intensive taggers conducted
for the most successful modifications, as described above. Closed classes are intuitively
more likely to exhibit syntactic irregularities that could be captured by such a method;
however, a notable exception is the open class RB. Of all of the open classes, RB stands
out as a disparate class (a "dustbin" class according to Crystal (1987:92)) often used to
contain words which do not obviously belong in any other class. It is thus composed of
wordforms with widely differing syntactic functions such as not, as and home, and should
be an ideal candidate for modification in order to tease apart these distinct functions. We
will make use of this fact more extensively in Section 4.3 but for the moment we simply
note this as a motivating factor for further investigation of naive methods over the RB
class. In addition, we show results for the closed classes IN, PRP and DT.
These results are instructive for a number of reasons. The first point to note is that
even from these few data points it is clear that there is no simple correlation (either positive or negative) between the number of new classes introduced and the accuracy. There
seems to be an optimal number of classes in many cases which produces performance very
close to the benchmark, and having more or fewer classes degrades performance. The
fact that the best of the modifications produces very similar performance to the baseline
is a point of interest we will return to later.

        TBL             SVM             MaxEnt
POS     All     Unk     All     Unk     All     Unk
EX      97.38           97.26           97.05
IN      98.36           98.43           98.21   48.48
JJ      92.22   80.84   91.66   76.01   92.31   82.44
JJR     88.41   41.44   87.55   34.86   87.46   58.65
JJS     95.46   70.33   93.26   73.87   94.77   78.35
NN      96.12   73.75   96.13   71.79   96.48   78.99
NNP     96.29   89.65   97.01   88.57   96.96   92.35
NNPS    62.62   19.65   65.6    20.78   59.44   44.62
NNS     97.47   88.79   97.73   85.77   98.07   91.32
PDT     76.48           75.18           68.82
RB      92.8    85.63   92.57   82.29   92.32   88.62
RBR     71.86           70.47           68.37
RBS     86.04           78.55           82.58
RP      75.72           77.04           76.37
VB      95.65   82.27   95.42   79.71   96.27   88.41
VBD     95.46   75.25   95.06   72.25   96.32   82.34
VBG     93.28   86.42   91.79   79.21   93.08   87.76
VBN     89.56   75.54   88.84   74.00   90.33   80.25
VBP     93.06   44.44   92.97   55.46   94.16   68.31
VBZ     97.61   75.22   97.03   66.75   97.78   83.35

Table 4.2: Benchmark F-Score (%) over 1,047K tokens of text, for selected POSs

4.2 Clustering

This round of experimentation concerns the clustering approach described in Section 3.3.2. Two different sets of features were used as input to the clustering algorithm
in order to determine slightly different patterns of regularity. One trial used collocational
information derived from preceding and following POSs only. Since there is no
information here beyond what is already available to the taggers, this effectively acts as
a preprocessing stage to determine distributional similarities that the taggers had not
noted explicitly. The alternative approach used the same collocational information and
added to this the syntactic information, such as parent and sibling node labels from the
parse tree annotation of the corpus, in an attempt to capture deeper syntactic regularities
which could facilitate accurate tagging.
In each case, the clustering algorithm was run over a range of intermediate-sized
classes: DT, IN, JJR, JJS, PRP, RB, RBR, RBS and RP, ignoring classes containing
just a handful of items (such as EX) or a large number of items (such as VBN).
A qualitative examination of the output was used to select candidates for further
testing. A possible modification could be rejected for a number of reasons. One reason
was that it transparently reflected an easily reproducible pattern of syntactic regularity, but
as an imperfect version of a manually created mapping used elsewhere. For example, the
clustering for PRP using all features approximately reproduced the distinction between
nominative pronouns such as he, accusative pronouns such as them and reflexive pronouns
such as ourselves, with some irregularities such as dividing the reflexive pronouns into
several different clusters. Additionally, a clustering would be ignored if it produced only
one cluster or, in the case of larger classes such as RB, if there was no clear pattern of
regularity in the data and the observed distinctions seemed largely random.

                                    Most Significant F-score Changes from Benchmark (%)
POS:n   Alg  All     Unk    Sent   Negative                      Positive
Bench   TB   96.842  81.94  51.77
        SV   96.852  84.62  50.72
        ME   97.056  87.34  53.72
IN:2    TB   96.823  81.40  51.41  JJ_U -1.75     VBP_A -0.20    RP_A +0.38     RB_U +1.75
IN:3    TB   96.817  81.78  51.28  JJ_A -0.10     VBN_A -0.26    VBZ_A +0.10    VBG_U +1.21
IN:4    TB   96.819  81.57  51.55  NNS_U -0.44    JJS_A -0.26    PDT_A +0.12    NNPS_U +23.60
DT:2    TB   96.806  81.61  51.13  RB_A -0.09     NNP_A -0.05    PRP$_A +0.01   VBG_U +1.66
DT:3    TB   96.813  81.78  51.38  PDT_A -2.07    RB_A -0.11     VBG_U +1.47    VBG_A +0.12
DT:4    TB   96.806  81.64  51.28  CD_A -0.05     JJ_U -1.12     VBG_U +1.40    VBD_U +1.96
PRP:2   TB   96.839  81.81  51.69  VB_A -0.12     EX_A -0.26     JJR_A +0.40    VBG_A +0.18
        SV   96.851  84.60  50.67  VBN_A -0.04    NNS_U -0.17    VB_A +0.02     CC_A +0.01
        ME   97.050  87.32  53.63  VBN_A -0.05    NN_U -0.13     VBD_U +0.36    VB_A +0.02
PRP:3   TB   96.830  81.95  51.57  CD_U -0.44     CD_A -0.04     NNP_U +0.35    RB_U +1.95
PRP:4   TB   96.838  81.71  51.60  VBD_U -1.74    CD_U -0.32     PRP_A +0.01    VBP_A +0.13
RB:2    TB   96.834  81.67  51.49  VBN_A -0.15    JJ_A -0.04     RBR_A +0.91    RP_A +0.60
RB:3    TB   96.843  81.73  51.72  VB_U -0.75     NNPS_A -1.71   VBG_A +0.20    POS_A +0.04
        SV   96.855  84.67  50.71  VB_A -0.01     CD_A -0.00     VBP_A +0.04    NN_A +0.01
        ME   97.056  87.28  53.72  DT_A -0.00     JJR_A -0.03    NNS_A +0.01    VB_A +0.02
RB:4    TB   96.831  81.56  51.62  WDT_A -0.17    VBN_U -2.36    RP_A +0.32     VBZ_A +0.07

Table 4.3: Accuracy (%) over all tokens (All), unknown tokens (Unk) and sentences
(Sent), and largest changes in F-score over specific POSs (suffixes _A and _U denote all
and unknown tokens, respectively) for modifications described in Section 4.1 with naively
subdivided POSs
As described above, we selected the more promising results on the basis of experimentation with fnTBL for further investigation. The results are shown in Table 4.4, where
a clustering within a POS is identified by the POS followed, in parentheses, by "cl" (for clustering)
and one or more identifiers for the feature set(s) used, where "c"
denotes collocational and "s" denotes syntactic.
Some interesting points should be noted about the clustering results. The performance
of the RP cluster[2] is very similar to that of the benchmark; however, closer examination
reveals that the clusters correspond to the frequent lexical items in RP versus the infrequent items, many of which occur only once in the data and are likely to be annotation
errors. Thus, the clustering is little more than a form of data cleaning and it is probably
unsurprising that the results are very similar to the benchmark. Additionally, the
slight performance improvement produced by the in(clc) clustering in some cases is
probably due to overfitting, a result of using a very similar set of features to those
available to the tagger and extracting these features from the testing data;
the clusters produced show no discernible patterns of regularity. To show that there was
genuine improvement we could test this using the held-out test data (Sections 22-23 of
the WSJ) that until now has been unutilised; however, we will assume the result is not
significant.

[2] Empirically we found that in this case the same cluster was produced both by using collocational
features only and by using both collocational and syntactic features.

                                        Most Significant F-score Changes from Benchmark (%)
Mapping     Alg  All     Unk    Sent   Negative                      Positive
Benchmark   TB   96.842  81.94  51.77
            SV   96.852  84.62  50.72
            ME   97.056  87.34  53.72
in(clc)     TB   96.850  82.00  51.82  VB_U -1.18     VB_A -0.04     MD_A +0.01     RB_A +0.08
            SV   96.855  84.61  50.74  JJ_A -0.01     VBN_A -0.02    VBZ_A +0.01    CC_A +0.02
            ME   97.050  87.32  53.59  VBN_A -0.04    JJS_A -0.30    NNPS_U +3.26   CC_A +0.01
rp(clc)     TB   96.840  81.65  51.73  VBN_U -3.62    VBN_A -0.16    PRP_A +0.01    VBZ_A +0.08
            SV   96.852  84.65  50.73  CC_A -0.00     RB_A -0.02     VBD_U +0.62    WDT_A +0.03
            ME   97.053  87.30  53.68  NN_U -0.25     NNPS_A -0.38   CD_A +0.00     VBD_U +0.41
rbr(clc,s)  TB   96.822  81.77  51.41  NNP_A -0.04    VBN_A -0.07    VBG_A +0.17    VBG_U +2.19
in(clc,s)   TB   96.831  81.79  51.52  RB_A -0.09     CD_U -0.33     POS_A +0.03    VBD_A +0.07
            SV   96.865  84.64  50.90  RB_U -0.24     NNPS_U -3.23   RB_A +0.05     IN_A +0.06
            ME   97.065  87.32  53.78  WDT_A -0.28    JJR_A -0.08    CC_A +0.01     RB_A +0.11

Table 4.4: Accuracy figures over all tokens (All), unknown tokens (Unk) and sentences
(Sent), and largest changes in F-score over specific POSs (suffixes _A and _U denote all
and unknown tokens, respectively) for modifications described in Section 4.2

4.3 Linguistically Motivated Modifications

We show results here for a wide range of linguistically motivated modifications of the
type outlined in Section 3.3.

4.3.1 Notational Conventions

We use the following conventions for the mnemonic names used to identify the different
modifications to the tagset. Each modification name refers to a mapping applied only to a
specific POS. Where two or more mappings are used in combination they are shown joined
by a "+". For modifications which simply subdivide an existing tag (the majority), the
name of the modification is composed of two parts: the first shows the original POS
and is separated by a long dash from the second, which is an abbreviation
reflecting the new distinction being made. Groupings of closely related modifications
are identified by the affected POSs separated by a "/". Mappings which (partially)
collapse a distinction are identified by the original class and the target class separated
by a colon. In cases where a similar distinction is conditioned in different tests by different
types of features, these are disambiguated by a code in square brackets appended to the
end of the name indicating the type of features, where [s] denotes syntactic and
[l] denotes lexical. Additionally, these lexical features may be further specified as
[lm] or [ld], for manually created or data-extracted respectively, referring to the
source of the information used to create the lexical mappings, which will be explained more
fully in Section 4.3.5. For example, later we will see a tagset modification identified by
vbcop[lm]+rbdeg[s], indicating a manually created lexical mapping of the VB class
in conjunction with a syntactically conditioned modification of the RB class.
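
As an illustration of these conventions, the following sketch splits a modification label into its component mappings and decodes the feature-type code of each. The function name and the wording of the decoded descriptions are assumptions for illustration; names are treated as opaque strings and no attempt is made to separate the POS from the abbreviation.

```python
import re

# Feature-type codes as described in the text above.
FEATURE_CODES = {
    "s": "syntactic",
    "l": "lexical",
    "lm": "lexical, manually created",
    "ld": "lexical, data-extracted",
}

# A name, optionally followed by a bracketed feature-type code.
_NAME = re.compile(r"^(?P<name>[^\[\]]+?)(?:\[(?P<code>[a-z]+)\])?$")

def parse_modification(label):
    """Split a label such as 'vbcop[lm]+rbdeg[s]' into (name, feature type)
    pairs. The feature type is None when no code is given."""
    parts = []
    for piece in label.split("+"):
        match = _NAME.match(piece.strip())
        if match is None:
            raise ValueError("unrecognised modification name: " + piece)
        code = match.group("code")
        parts.append((match.group("name"), FEATURE_CODES.get(code) if code else None))
    return parts
```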

4.3.2 Syntactically-Conditioned Modifications of Closed Classes

Here we consider mappings which are conditioned on syntactic features and which affect
words in closed classes. One obvious candidate modification is reversing the idiosyncratic
conflation of prepositions such as in (which take NP (noun phrase) complements) and
subordinating conjunctions such as because (which take S (clause) complements), i.e. the
IN tag as described in Section 2.3.2. This could have been achieved lexically, by extracting
a list of lexemes which frequently act as subordinators in the training data, and mapping
the tags of the tokens accordingly. However, the most successful and principled approach
was using syntactic features for each token and thus deciding on a token-by-token basis.
This captures the fact that there are certain words that are ambiguous (using only word
unigrams) between the two. For example, before can act as a preposition in He left before
her and as a subordinating conjunction in He left before she got angry. We let the tagger
resolve such ambiguities as appropriate. Two syntactic features were used to determine
if a given IN token is a subordinating conjunction (as distinct from a preposition): an
SBAR parent node or an S immediate right sibling.[3] This modification is designated
insub[s].
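
A minimal sketch of this mapping and its inverse is given below. The Token structure, the new tag name IN-SUB and the function names are assumptions for illustration; node labels are assumed to have had any function tags stripped.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Token:
    word: str
    tag: str
    parent: str                    # label of the parent node in the parse tree
    right_sibling: Optional[str]   # label of the immediate right sibling, if any

SUB_TAG = "IN-SUB"  # hypothetical name for the new subordinator tag

def map_insub(tok: Token) -> str:
    """Return the training tag for `tok`: IN tokens with an SBAR parent or
    an S immediate right sibling are relabelled as subordinators."""
    if tok.tag == "IN" and (tok.parent == "SBAR" or tok.right_sibling == "S"):
        return SUB_TAG
    return tok.tag

def unmap_insub(tag: str) -> str:
    """Recover the original Penn tag from a (possibly mapped) tag."""
    return "IN" if tag == SUB_TAG else tag
```

Because the decision is made per token, a word such as before can receive either tag depending on its context, and applying unmap_insub to the tagger output restores the original tagset for evaluation.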
A similar modification in this domain was another reversal of a Penn Treebank idiosyncrasy: the conflation mentioned in Section 2.3.2 of the preposition to, which heads
a prepositional phrase as in She went to university, with the infinitival to, which
heads a non-finite verb phrase as in She likes to teach. According to Marcus et al.'s
criteria, it would make more sense to disambiguate these distinct uses of to and group
the prepositional use with the other prepositions under IN. The prepositional use can be distinguished by a right sibling with category NP or QP, while tokens which introduce an
infinitive are left with the original tag TO. This modification, identified by to:in, was
also run in combination with insub[s], since both were designed to label prepositions
more consistently.
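
Unlike a subdivision, this mapping moves tokens into an existing class, so the original tag can only be recovered by also consulting the word form. The sketch below shows one way this could work; the function names are assumptions for illustration, and the inverse relies on the Penn Treebank convention that every token of to is originally tagged TO.

```python
def map_to_in(word, tag, right_sibling):
    """Relabel prepositional 'to' (right sibling NP or QP) as IN."""
    if tag == "TO" and right_sibling in ("NP", "QP"):
        return "IN"
    return tag

def unmap_to_in(word, tag):
    """Recover the Penn tag: any 'to' tagged IN was originally TO."""
    if tag == "IN" and word.lower() == "to":
        return "TO"
    return tag
```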
Both of these modifications are reversals of characteristics of the Penn tagset which were
founded with a particular intention in mind but are clear examples of idiosyncrasies of the
Penn tagset. As is clear from Table 4.5, neither of them achieved the objective of
the experiment. Nonetheless, the results for insub[s] in particular suggest that there is an argument
for introducing this distinction in other NLP applications and retaining the mapped tags
in the output: while we do not gain anything in terms of performance, we also do
not lose anything, and the distinction is potentially useful in downstream applications. The to:in
modification induces a small performance penalty but does at least result in a more
consistent set of labels. Of course neither of these modifications would be of much value
if an analysis of errors in the finer-grained tags (i.e. without mapping the tags back to the
original tagset) revealed the classification was doing no better than chance. For example,
if the tagger was not doing particularly well at distinguishing between to/TO and
to/IN, then the distinction could introduce more problems than it solves through noise
in the data.

[3] These are theoretically equivalent but were included for robustness.

                                                    Most Significant F-score Changes from Benchmark (%)
Mapping                 Alg  All     Unk    Sent   Negative                      Positive
Benchmark               TB   96.842  81.94  51.77
                        SV   96.852  84.62  50.72
                        ME   97.056  87.34  53.72
insub[s]                TB   96.842  81.76  51.63  NNP_A -0.05    NNPS_A -1.74   JJR_A +0.25    VBG_A +0.20
                        SV   96.855  84.65  50.77  RP_A -0.49     EX_A -0.10     IN_A +0.02     NNP_A +0.01
                        ME   97.048  87.29  53.64  WDT_A -0.79    DT_A -0.02     JJ_A +0.04     NNS_A +0.01
to:in                   TB   96.833  81.67  51.60  IN_A -0.01     NNPS_A -0.82   VBZ_A +0.10    VBG_U +1.38
                        SV   96.855  84.51  50.72  VB_U -5.48     JJ_A -0.02     VBP_A +0.05    VBP_U +13.07
to:in + insub[s]        TB   96.834  81.88  51.54  VBD_U -2.67    VBN_U -2.59    WDT_A +0.22    VBG_U +1.68
                        SV   96.846  84.42  50.64  VB_U -7.10     NN_U -0.43     PRP$_A +0.02   NNS_U +0.16

Table 4.5: Accuracy figures over all tokens (All), unknown tokens (Unk) and sentences
(Sent), and most significant changes in F-score over specific POSs (suffixes _A and _U
denote all and unknown tokens, respectively) for modifications described in Section 4.3.2

4.3.3 Syntactically-Conditioned Modifications of Open Classes

The mappings we consider now are again syntactically conditioned but apply to open
classes such as RB. One modification is based on the observation that in the baseline
taggers, 5.8-6.4% of tagging errors were due to a gold-standard JJ (adjective) being
tagged VBN (verb past participle) or vice versa, with a further 1.9-2.0% of errors due
to the corresponding JJ/VBG (verb present participle) confusion. These distinctions
are notoriously difficult to make in certain cases: in an isolated sentence such as She
was offended, it is not possible even for a human to determine whether offended should
be tagged as JJ or VBN; similarly we cannot discriminate between JJ and VBG as the
correct tag for entertaining in They were entertaining. However, we should be able
to assist in discrimination by utilising the linguistic tests for distinguishing between the
two:[4] adjectives can be modified by degree adverbs such as very, while verbs cannot.
Thus, the presence of a degree adverb should indicate unequivocally that the head word is
an adjective. In practice there is no clear boundary between degree adverbs and the more
common verb-modifying adverbs, so the approach we took here was to allow ambiguity of
degree adverb membership and condition the tag mapping on syntactic features for each
token: an RB with either an RB or JJ as its right sibling, or an ADJP (adjective phrase)
as parent, was mapped to a degree adverb. This modification is denoted rbdeg[s].

[4] These are also recommended in the Penn Treebank tagging guidelines (Santorini 1990).


In line with the observation mentioned in Section 4.1 that the adverb or RB class
stands out as a particularly disparate class, we attempted some other similar modifications
to different subsets of the RB class. The rbloc[s] modification maps tokens tagged
RB which have ADVP-LOC or ADVP-DIR as parent to a new class. Empirically these
tend to be words such as here, home and abroad, which show quite different syntactic
distributions to other adverbs (for example, they can occur as complement to be, as in
He is abroad). However, due to the labelling conventions this also captures other adverbs
such as locally, which are distributed more like the larger class of non-locative adverbs.
There is also a natural parallel in the temporal domain for adverbs such as now and
sometimes, which we map to a new class when the tokens occur with ADVP-TMP as
parent; we denote this mapping rbtmp[s].
Another series of modifications of this type relies on the fact that verbs tend to
select for certain types of complements so knowledge of characteristics of the verb can
assist in disambiguating problematic words in the immediately following context. All
of the modifications of this type are in fact composed of a suite of six modifications
corresponding to each of the original Penn verb tags (VB, VBP, VBZ, VBD, VBN and
VBG), all of which must be mapped to a new tag. There were four modifications of this
type which are described below.
The first such modification represents an alternate approach to making the JJ/VBG
and JJ/VBN distinctions mentioned above and is again inspired by a test used by linguists
(and also recommended by Santorini (1990)) to distinguish between these problematic
cases. Adjectives can occur after be as well as after other so-called copular verbs such
as seem, become and appear. Thus while crushed in He was crushed is ambiguous
(without context) even to a human annotator, in He seemed crushed it is unambiguously
an adjective or JJ. This motivates the vbcop[s] modification, which is conditioned
on the label of any right sibling containing PRD, the Treebank notation for predicate
complement, i.e. the argument of a copular verb.
Another problematic set of distinctions (noted by authors such as Toutanova and Manning
(2000)) concerns words which are ambiguous between verb particles (RP), prepositions
(IN) and adverbs (RB).[5] For example, in could be an RP in She cashed in her shares,
an IN in She stood in the hallway and an RB in She went in. Toutanova and Manning
reduced this ambiguity using a lexically conditioned feature based on particular verb
tokens; in this case, however, we evaluate the utility of syntactic features for the same
purpose. In vbrp[s], a verb token is mapped to a new tag when one of its right siblings
is PRT (the Treebank annotation for the parent of a verb particle). This could help
in making the difficult RP/IN and RP/RB distinctions, since it will be more likely to
explicitly annotate verbs which we could expect to precede particles, and this information
could in turn be used to determine the identity of a subsequent ambiguous token.
The vbinf[s] modification is another syntactically conditioned mapping inspired by
a lexical feature used by Toutanova and Manning (2000), where a feature was added
which was sensitive to the presence of the verbs do, let, make or help, which frequently
take bare infinitival complements (i.e. a verb with tag VB, the bare, uninflected form,
not preceded by to). This is designed to help with the problematic VB/VBP distinction
made in the Penn Treebank which we described in Section 2.3.2. For example we should
be able to correctly tag find as VBP in They find it difficult and as VB in They do
find it difficult. Our strategy here differs slightly: we map verb tokens with an immediate
right sibling S, which corresponds to the parent of infinitival verb phrases, to investigate
whether it helps the tagger make the distinction.

[5] Huddleston and Pullum (2002) hold that the class of words to which we are referring, which are
tagged as RB in the Penn Treebank, should instead be treated as prepositions without noun phrase
complements for a range of reasons, including the systematic ambiguity mentioned above and the fact
that the phrases headed by these putative adverbs are distributed like prepositional phrases rather than
adverb phrases.

                                                    Most Significant F-score Changes from Benchmark (%)
Mapping                 Alg  All     Unk    Sent   Negative                      Positive
Benchmark               TB   96.842  81.94  51.77
                        SV   96.852  84.62  50.72
rbdeg[s]                TB   96.818  81.66  51.36  CD_A -0.63     JJR_A -4.15    POS_A +0.41    WP_A +0.66
                        SV   96.847  84.72  50.63  NNP_U -0.05    VBG_A -0.09    VBD_U +2.73    NNPS_A +0.17
rbloc[s]                TB   96.841  81.73  51.59  NNS_U -1.17    VBD_U -3.26    RB_A +0.09     VBG_U +1.87
                        SV   96.851  84.66  50.66  NN_A -0.01     VBZ_U -0.71    WDT_A +0.04    VB_A +0.02
rbtmp[s]                TB   96.807  81.92  51.29  CD_A -0.03     VBN_A -0.21    VBG_U +2.15    RB_U +4.05
vbcop[s] + rbdeg[s]     TB   96.822  81.72  51.43  NN_U -1.14     RB_A -0.13     VBG_U +2.30    VBZ_A +0.17
vbcop[s]                TB   96.839  81.78  51.62  VBP_A -0.21    NNPS_A -0.81   VBG_U +1.74    VBZ_A +0.20
                        SV   96.846  84.70  50.70  VBN_U -3.02    POS_A -0.03    RB_A +0.10     JJ_U +0.29
vbrp[s]                 TB   96.824  81.71  51.59  VB_A -0.21     VBN_A -0.34    VBG_U +2.04    VBZ_A +0.12
vbtr[s]                 TB   96.776  81.52  50.89  VBD_U -4.95    VBP_A -0.65    DT_A +0.02     VBZ_A +0.07

Table 4.6: Overall accuracy figures and most significant changes in F-scores over specific POSs for modifications described in Section 4.3.3 (see Table 4.5 for explanation of
symbols)
The final syntactically-conditioned modification of verbs is vbtr[s], reflecting the
distinction between transitive and intransitive verbs. Transitive verbs such as kill have
a complement corresponding to an object (in English, this is usually a noun following
the verb), which is prototypically the thing acted upon. Intransitive verbs such as die
lack such an argument. This mapping considers verb tokens as transitive if they have an
NP as one of their right siblings. The mapping was not aimed at one particular class
of ambiguity; however, it was a promising modification as it would create new classes
composed of large numbers of token instances.
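
The four verb modifications share the same shape: a condition on the labels of the verb token's right siblings, applied uniformly to all six Penn verb tags. The sketch below expresses them as a table of conditions. The suffix used to form the new tags, the function names and the representation of siblings as a left-to-right list of labels are assumptions for illustration; apart from the PRD check, labels are assumed to have had function tags stripped.

```python
VERB_TAGS = ("VB", "VBP", "VBZ", "VBD", "VBN", "VBG")

# Each condition takes the labels of the right siblings, nearest first.
CONDITIONS = {
    "vbcop[s]": lambda sibs: any("PRD" in s for s in sibs),   # predicate complement
    "vbrp[s]": lambda sibs: "PRT" in sibs,                    # verb particle
    "vbinf[s]": lambda sibs: bool(sibs) and sibs[0] == "S",   # immediate sibling S
    "vbtr[s]": lambda sibs: "NP" in sibs,                     # object noun phrase
}

SUFFIX = "*"  # hypothetical marker appended to a mapped verb tag

def map_verb(modification, tag, right_siblings):
    """Map any of the six verb tags to a new tag when the condition holds."""
    if tag in VERB_TAGS and CONDITIONS[modification](right_siblings):
        return tag + SUFFIX
    return tag

def unmap_verb(tag):
    """Recover the original Penn tag."""
    return tag[:-len(SUFFIX)] if tag.endswith(SUFFIX) else tag
```

Since each of the six verb tags gains a counterpart, a single modification of this kind adds six new tags to the tagset.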
As we can see in Table 4.6, many of these modifications were not as successful as we had
hoped. Some of the lack of success of the rbdeg[s] modification can be attributed to
inconsistencies in the data (examples such as very/RB alarmed/VBN directly contradict
the Treebank tagging guidelines), as well as the fact that there are too few instances of
JJ/VBN ambiguity following a degree adverb in the data for this extra information to
help. A more successful modification targeted at the same problem was vbcop[s], which
even produced some significant changes over the targeted POSs, although some of these,
such as VBN_U for SVMTool, were decreases rather than increases in F-score. rbloc[s]
produced some noticeable changes in F-score, although these are difficult to account
for. The least successful by far was vbtr[s], suggesting that verb valency is very
unpredictable and/or the extra information it provides is not useful in disambiguating
problematic cases, perhaps explaining why none of the tagsets discussed in Section 2.3
make such a distinction.

                                                    Most Significant F-score Changes from Benchmark (%)
Mapping                 Alg  All     Unk    Sent   Negative                      Positive
Benchmark               TB   96.842  81.94  51.77
                        SV   96.852  84.62  50.72
inrp[l]                 TB   96.818  81.52  51.43  VBN_U -1.32    VBD_U -2.64    VBG_A +0.07    VBZ_A +0.10
inrp[l] + insub[s]      TB   96.832  81.59  51.51  NN_U -1.72     JJ_U -1.36     VBD_A +0.11    VBG_U +1.21
                        SV   96.851  84.63  50.73  NNP_U -0.06    PDT_A -0.85    JJR_U +3.42    NNS_U +0.12
in/rp/rb[l]             TB   96.812  81.68  51.42  RB_A -0.08     NNP_A -0.05    VBZ_U +3.89    VBZ_A +0.13
in/rp/rb[l] + insub[s]  TB   96.801  81.57  51.15  NN_U -1.42     RB_A -0.10     VBZ_U +4.65    VBZ_A +0.16
dtnum[l] + jjnum[l]     TB   96.810  81.59  51.37  VBP_A -0.12    NNS_U -0.82    VBD_A +0.05    CC_A +0.05

Table 4.7: Overall accuracy figures and most significant changes in F-scores over specific
POSs for modifications described in Section 4.3.4 (see Table 4.5 for explanation of
symbols)

4.3.4 Lexically-Conditioned Modifications of Closed Classes

Here we move on to lexical mappings, i.e. those which have in their condition set a
list of words, focussing first on closed class words. One trial in this category was designed
to increase computational tractability with little reference to linguistic motivation. It
concerns the ambiguity between IN and RP (particle). As we have already noted in
Section 4.3.3, these POSs are notoriously difficult to distinguish, since many
words such as on are systematically ambiguous between the two. However, there are many
members of IN which have no homographs (distinct lexemes with the same spelling) in the
RP class. If we determine the ambiguous types in a preprocessing pass over the training
data[6] and map the ambiguous members of IN to a new class, we are explicitly indicating
to the tagger whether or not a word is ambiguous between the two POSs, which could
improve performance for these particular words. This mapping is designated inrp[l].
As shown in Table 4.7, in another trial this modification was used in conjunction with the
insub[s] modification described in Section 4.3.2, reflecting our intuition that since these
two modifications dealt with the same POSs there might be some interaction between
the two. It is clear from Table 4.7 that the two modifications in combination
performed better than the inrp[l] modification alone.
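
The preprocessing pass for inrp[l] can be sketched as follows: collect the word types observed with both tags, then relabel the IN tokens of those types. The function names, the new tag name IN-RP and the lower-casing of word forms are assumptions for illustration.

```python
from collections import defaultdict

def ambiguous_types(tagged_sents, tag_a="IN", tag_b="RP"):
    """Return the word types observed with both `tag_a` and `tag_b`."""
    seen = defaultdict(set)
    for sent in tagged_sents:
        for word, tag in sent:
            seen[word.lower()].add(tag)
    return {w for w, tags in seen.items() if tag_a in tags and tag_b in tags}

def map_inrp(tagged_sents, ambiguous, new_tag="IN-RP"):
    """Relabel IN tokens whose word type is ambiguous between IN and RP."""
    return [
        [(w, new_tag if t == "IN" and w.lower() in ambiguous else t) for w, t in sent]
        for sent in tagged_sents
    ]
```

The same counting pass extends naturally to the three-way RB/IN/RP case by recording which subset of the three tags each word type has been seen with.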
A related trial concerns the similar ambiguity of certain members of RB. A large
number of word types are unambiguously RB; however, there is a subset of types which
are ambiguous between RB and IN, and a subset of these containing types for which the
correct tag could be any of RB, IN or RP. We thus divide RBs into those which are
unambiguous (leaving them unchanged), those which are ambiguous between RB and IN
and those which exhibit a three-way ambiguity between the relevant classes. Similarly,
INs are divided into classes for types showing no ambiguity, RB/IN ambiguity and three-way ambiguity. This suite of closely related modifications is designated in/rp/rb[l]
according to the conventions noted in Section 4.3.1.

[6] A highly rigorous approach would determine these figures from the training data only. However, our
use of cross-validation means that such an approach would require a separate preprocessing phase for
each fold. The well-populated feature vectors we were examining meant that extracting
counts from the development data as well as the training data was unlikely to cause overfitting, so we adopted
this slightly less rigorous approach to avoid complexity of implementation.
Another modification in this domain is inspired by tagsets such as C7 which make
distinctions within the determiners and articles according to whether they indicate that
the following noun phrase is singular (e.g. a or this) or plural (e.g. these), or do not specify
the number of the noun phrase which they introduce (e.g. the). Related to these are
words such as many, few and countless, which are tagged JJ[7] in the Penn Treebank but,
like these determiners, explicitly indicate the number of the noun phrase they occur in
(which in this case is plural). The modifications are designated dtnum[l] and jjnum[l] respectively.
One of the most interesting aspects of the results shown in Table 4.7 is that the
inrp[l] modification performed better when combined with the insub[s] modification. We do not have a convincing explanation for the outcome of these rules interacting. While it does not support our hypothesis that we can improve performance over
the benchmark using carefully selected modifications, it does indicate that in certain
situations (i.e. with the inrp[l] modification as starting point) the addition of a motivated distinction to the tagset can improve accuracy. Interestingly, the addition of the
insub[s] distinction had the opposite effect for the in/rp/rb[l] modification. Finally,
we note that the dtnum[l] modification reduced performance significantly over NNS (plural common noun), one of the POSs where we would naively
expect better performance.

4.3.5 Lexically-Conditioned Modifications of Open Classes

Outlined here are some mappings of open classes based on lexical features. There are
a large number of modifications here which are variant versions of the modifications listed
in Section 4.3.3. They are intended to reflect similar patterns of syntactic regularity, but
instead of conditioning the mappings using syntactic features on a token-by-token basis, a
pre-prepared word-list was used. These word lists could be generated in one of two ways.
They could either be manually created from grammars such as Huddleston and Pullum
(2002) or generated in a preprocessing pass over the data using similar syntactic features
to extract counts for each word type and using these counts to create a list of word types
which had relative frequencies of occurrence of more than an arbitrarily chosen threshold
in the particular syntactic context.
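
The data-extracted variant can be sketched as a relative-frequency filter. Here each observation records a word type and whether that token occurred in the syntactic context of interest (for example, an RB token satisfying the rbdeg[s] condition). The function name, the default threshold of 0.5 and the optional minimum count are assumptions for illustration; the text above specifies only that the threshold was chosen arbitrarily.

```python
from collections import Counter

def extract_word_list(observations, threshold=0.5, min_count=1):
    """Return the word types whose relative frequency of occurrence in the
    target context exceeds `threshold`.

    `observations` is an iterable of (word, in_context) pairs, one per token.
    """
    total, hits = Counter(), Counter()
    for word, in_context in observations:
        total[word] += 1
        hits[word] += bool(in_context)   # True counts as 1, False as 0
    return {
        word for word in total
        if total[word] >= min_count and hits[word] / total[word] > threshold
    }
```

Raising min_count would exclude rare word types whose relative frequencies are unreliable, at the cost of a shorter list.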
Manually generated lists were applied to the following mappings (recall from Section 4.3.1 that [lm] is shorthand for manually created lexical mapping):
• The locative adverb mapping, rbloc[lm]
• The mapping for verbs which select for infinitives, denoted vbinf[lm] (these were
  derived from those used for the related feature by Toutanova and Manning (2000))
• The copular verb mapping, denoted vbcop[lm]
Lists generated from the data were used in the following variants:
• The degree adverb mapping, denoted rbdeg[ld]
• The mapping for verbs which are frequently followed by particles, vbrp[ld]

[7] These are technically open class words and thus should appear in Section 4.3.5; however, it was natural
to group this modification with the determiner mapping due to their similar functions.

                                                    Most Significant F-score Changes from Benchmark (%)
Mapping                 Alg  All     Unk    Sent   Negative                      Positive
Benchmark               TB   96.842  81.94  51.77
                        SV   96.852  84.62  50.72
rbloc[lm]               TB   96.826  81.74  51.53  NNS_U -0.84    VBP_A -0.18    RP_A +0.25     VBZ_A +0.14
vbinf[lm]               TB   96.842  81.89  51.61  CD_A -0.04     CD_U -0.39     RB_U +2.08     RB_A +0.10
                        SV   96.848  84.59  50.66  VBN_A -0.07    VBZ_U -1.08    POS_A +0.02    NNPS_A +0.10
vbcop[lm]               TB   96.827  81.73  51.45  NNPS_A -1.16   VBP_A -0.16    WDT_A +0.13    RP_A +0.30
vbcop[lm] + rbdeg[s]    TB   96.821  81.90  51.43  CD_A -0.03     WDT_A -0.14    MD_A +0.02     VBP_U +14.12
rbdeg[ld]               TB   96.825  81.44  51.51  VBN_U -2.78    PDT_A -1.16    VBG_A +0.07    VBZ_A +0.09
vbrp[ld]                TB   96.823  81.69  51.46  VBP_A -0.19    VBD_A -0.07    DT_A +0.02     POS_A +0.02
                        SV   96.852  84.60  50.71  NNS_U -0.19    VBG_A -0.05    VBP_A +0.03    IN_A +0.00
nnms[l] + dtnum[l]      TB   96.782  81.11  51.05  NN_U -2.91     JJ_A -0.32     CC_A +0.03     VBG_U +1.05
                        SV   96.858  84.41  50.73  NN_U -2.16     VBZ_U -2.94    RBS_A +0.42    CD_A +0.01

Table 4.8: Overall accuracy figures and most significant changes in F-scores over specific
POSs for modifications described in Section 4.3.5 (see Table 4.5 for explanation of
symbols)
Another lexical modification used external sources of information to create the word list. The LinGO ERG (Copestake and Flickinger 2000) is a freely available precision
grammar with a lexicon which encodes very fine-grained syntactic and semantic distinctions between wordforms. One of these features is noun countability: the distinction
between nouns which can be pluralised (the majority), such as chair, which becomes
chairs, and those which cannot, such as inspiration. We can extract from the lexicon a
list of non-count nouns and use this to create a mapping of count nouns for the NN class,
designated nnms[l]. Similarly we can also extract a (much smaller) list of nouns such as
armaments which only occur in the plural. This modification was combined with the
dtnum[l] mapping described in Section 4.3.4, since we would expect these two mappings,
which both deal with number in noun phrases, to interact with each other somewhat. We might
expect this mapping to reduce confusions between NN and NNS.
The results are shown in Table 4.8. There are again several modifications which
maintain benchmark performance with either fnTBL or SVMTool. fnTBL had no net
change with the vbinf[lm] modification; however, the POSs most affected are somewhat
surprising: the most significant changes were over RBs. Examination of the output
suggests that this is because of fewer RB/JJ confusions, which are perhaps distributed
differently near the targeted words. Another puzzling F-score change is over NN in
nnms[l] + dtnum[l]: while we might have hoped to tag NN more accurately
with this modification, performance in fact dropped appreciably.

                      TBL                      SVM                      MaxEnt
Mapping               All      Unk     Sent    All      Unk     Sent    All      Unk     Sent
Benchmark             96.842   81.94   51.77   96.852   84.62   50.72   97.056   87.34   53.72
Freq-based PRP:2      96.839   81.81   51.69   96.851   84.60   50.67   97.048   87.40*  53.51
Freq-based RB:3       96.843   81.73   51.72   96.855   84.67   50.71   97.056   87.28   53.72
Clust rp(clc)         96.840   81.65   51.73   96.852   84.65   50.73   97.053   87.30   53.68
Clust in(clc,s)       96.831   81.79   51.52   96.865*  84.64   50.90*  97.065*  87.32   53.78*
Clust in(clc)         96.850*  82.00*  51.82*  96.855   84.61   50.74   97.050   87.32   53.59
insub[s]              96.842   81.76   51.63   96.855   84.65   50.77   97.050   87.37   53.51
vbinf[lm]             96.842   81.89   51.61   96.848   84.59   50.66
vbcop[s]              96.839   81.78   51.62   96.846   84.70*  50.70
inrp[l] + insub[s]    96.832   81.59   51.51   96.851   84.63   50.73

Table 4.9: Accuracy (%) of the best-performing or most motivated tag modifications for
each of the broad methods discussed, with the highest accuracy figure in each column
marked with an asterisk

                   TBL                    SVM
Mapping            All    Unk    Sent     All    Unk    Sent
Benchmark          96.68  83.71  49.52    96.75  87.23  49.76
Clust in(clc,s)    96.68  83.59  49.91    96.78  87.38  50.04
insub[s]           96.70  84.07  50.00    96.77  87.32  49.94
vbinf[lm]          96.73  84.10  49.94    96.75  87.26  49.74

Table 4.10: Accuracy (%) of selected tag modifications from Table 4.9 over the held-out
129K-token test set of sections 22 and 23 of the WSJ corpus

4.4 Overall Summary

In Table 4.9, we evaluate some of the most promising modifications from each of
the broad categories. Over the training/development data, the clustering modifications
achieve higher accuracy in general than our linguistically motivated modifications. We
show results for selected modifications in Table 4.10 over the test set to determine the
extent of overfitting. Time constraints prevented testing with StanME; however, we would
expect from previous results that it would follow a similar pattern to SVMTool. These
figures come from a much smaller dataset of only 129K tokens and are therefore a less
accurate reflection of performance. The linguistic modifications if anything seem more
successful over this data; however, the size of the change over the comparatively small
dataset is such that it is unlikely to be statistically significant. In general it seems that
the linguistic modifications are less data-dependent while the data-driven modifications
may have a slight tendency to overfit. Nonetheless, the results
for SVMTool suggest that there may be some genuine improvement using the clustering
modifications, and the clustering modification we show provides the best performance over the test
set for that tagger.
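
As a rough check on the claim about statistical significance, the sketch below compares the largest SVMTool improvement in Table 4.10 with the binomial standard error of an accuracy estimate over a test set of this size. This is a back-of-envelope calculation under the simplifying assumption that tokens are tagged independently, not the analysis performed in this thesis; a paired test over the per-token outputs of the two taggers would be more appropriate but requires data not shown here. The token count of 129,000 is an approximation of the 129K figure.

```python
import math

def binomial_se(acc_pct, n_tokens):
    """Standard error, in percentage points, of an accuracy estimate."""
    p = acc_pct / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n_tokens)

n = 129_000                    # approximate size of the held-out test set
bench, clust = 96.75, 96.78    # SVM "All" accuracy: Benchmark vs Clust in(clc,s)

se = binomial_se(bench, n)
z = (clust - bench) / se       # difference measured in standard errors

print(f"standard error: {se:.3f} points, difference: {z:.2f} standard errors")
```

Under these assumptions the standard error is roughly 0.05 percentage points, so a 0.03-point difference is well under one standard error.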


Chapter 5
Conclusion

5.1 Discussion

This thesis has detailed a thorough investigation into the possibility of improving POS
tagging accuracy by subdividing the tagset used in ways which make it more informative.
We have shown the effect of three alternative types of modification: naive, linguistically
motivated and data-driven.
Our results in general have not supported the hypothesis that it is possible to achieve
significant performance improvements in POS tagging over the Penn Treebank by utilising
a finer-grained tagset. Even with a diverse range of modifications to both closed and
open classes, we have not found a mapping we could introduce which led to statistically
significant performance improvements, while we were able to come up with a number of
modifications which led to noticeable accuracy reductions.
These results could be seen to support the intuitions of Marcus et al. (1993) about the
desirability of a coarse tagset to avoid detrimental effects on accuracy of data sparseness.
It seems that the linguistic constructions in which particular modifications might be useful
are not frequent enough to improve performance, and that if these modifications do have
any positive effect on accuracy, these effects are more than counteracted by the effect of
the more sparsely populated feature vectors.
There are several obvious reasons for the difficulty in improving performance here. It is
the most difficult 3% of tokens which we are attempting to tag correctly. Among these are
words which probably cannot be tagged correctly with a small context window, words
for which humans would have difficulty agreeing on a tag, and words which are tagged
incorrectly in the gold standard (a fact which was explored in Ratnaparkhi (1996)).
This thesis lends weight to the argument that the 97.0% glass ceiling in tagger accuracy
probably has as much to do with the estimated 3% error rate quoted in the Penn Treebank
documentation as with a lack of specific contextual information.
However, despite this, there are still reasons to believe that there is room for improvement. As mentioned in Section 3.2, the observation of Brill and Wu (1998) that
there is a high degree of complementarity in the errors made by taggers including maximum
entropy and TBL suggests that, at least some of the time, sufficient information is
available and the problem lies in applying it correctly. Given this, the lack of success so far in
applying linguistic intuition was surprising. While the highest-performing modification
was the linguistically motivated reintroduction of subordinators, accuracy in this best
case was not significantly different from using an unmodified tagset. However, the worst
of the linguistically motivated modifications resulted in markedly lower accuracy than
the benchmark. Even modifications targeted at addressing a specific confusion (such as
rb-deg) reduced overall performance and failed to produce noticeable changes in F-scores
over the affected POSs.
The clustering was not designed on a particularly firm theoretical basis; rather, we
attempted it as a comparison with the linguistically motivated methods. Despite this, it
has produced some intra-POS clusters which, at least over the development set, improve
performance. As the results in Table 4.10 show, there is some overfitting occurring, but
there are still appreciable performance improvements.
We have evaluated a range of naive, data-driven and linguistically informed approaches
to tagset modification. Results from modifications in all of the areas ranged from appreciable performance deterioration to approximately constant performance, with just one
modification achieving a noticeable improvement over the test and development data.
The clustering clearly has the potential to produce substantial improvements when the
feature values are derived from the development data, while the linguistic modifications
have not produced such results. However, it seems from Table 4.9 and Table 4.10 that the
linguistic modifications are less data-dependent in that they perform equally well over
unseen data.

5.2 Further Work

A number of opportunities exist for future research. It is possible that some of the
modifications we added which kept performance at an approximately constant level (most
notably the insub[s] modification) would actually result in better performance in downstream applications such as chunk parsing. Evaluating modifications extrinsically within
applications was beyond the scope of the investigation but remains an open possibility.
Another area of potential research in this domain would be to follow more closely
the model established by Klein and Manning's (2003) approach to parsing. If we were
restricted to unlexicalised tagging, it is possible that new distinctions in the tagset could be far
more productive, since the baseline tagger would have more impoverished information.
However, such an approach would only be useful if an unlexicalised tagger could be shown
to have other advantages over a lexicalised tagger, such as shorter times for training or
tagging, or ability to be trained from smaller data sets.
A possible method for avoiding the aforementioned problems of data sparseness is
to use a two-tiered classification of POS tags. We conducted a preliminary investigation
of this, systematically adding delimiters to newly created tags, and adding contextual
features to the tagger (in this case SVMTool) dependent on the portion of the POS tag
preceding or following the delimiter. Multiple levels of classification of POS tags are
used successfully in the CLAWS tagging system (Garside et al. 1997) but do not appear
to have been applied to the Penn Treebank. This method should give the taggers
access to the more densely populated coarse-tag features when necessary, while keeping the
subtler distinctions we have added available when they are useful. The first stages of
investigation did not produce particularly promising results; however, far more extensive
experimentation is possible.
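
The two-tiered idea can be illustrated with a minimal sketch. The delimiter character `|`, the example tag `RB|DEG`, and the feature names are hypothetical choices made for illustration; this is not SVMTool's feature interface, only a generic picture of deriving coarse and fine features from a delimited tag.

```python
DELIM = "|"  # hypothetical delimiter separating coarse tag from refinement


def split_tag(tag, delim=DELIM):
    """Split a delimited tag into (coarse, fine); fine is None if absent."""
    coarse, sep, fine = tag.partition(delim)
    return coarse, (fine if sep else None)


def context_features(prev_tags, delim=DELIM):
    """Build features from preceding tags, nearest tag first.

    Each preceding tag contributes its full form, its coarse portion
    (densely populated), and, where present, its fine portion.
    """
    feats = {}
    for i, tag in enumerate(reversed(prev_tags), start=1):
        coarse, fine = split_tag(tag, delim)
        feats[f"tag-{i}"] = tag
        feats[f"coarse-{i}"] = coarse
        if fine is not None:
            feats[f"fine-{i}"] = fine
    return feats


if __name__ == "__main__":
    print(context_features(["DT", "RB|DEG"]))
```

With features of this shape, a classifier can fall back on the coarse portion when the refined tag is too sparse to be informative.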


Bibliography
Brants, Thorsten. 2000. TnT - a statistical part-of-speech tagger. In Proceedings of
the 6th Applied Natural Language Processing Conference, 224-231, Seattle, USA.
Brill, Eric. 1995. Transformation-based error-driven learning and natural language
processing: A case study in part-of-speech tagging. Computational Linguistics
21.543-565.
Brill, Eric, and Jun Wu. 1998. Classifier combination for improved lexical disambiguation.
In Proceedings of the 36th Annual Meeting of the Association for Computational
Linguistics and 17th International Conference on Computational Linguistics, 191-195,
Montreal, Canada.
Charniak, Eugene, Curtis Hendrickson, Neil Jacobson, and Mike
Perkowitz. 1993. Equations for part-of-speech tagging. In Proceedings of the
National Conference on Artificial Intelligence, 784-789, Washington, USA.
Church, Kenneth. 1988. A stochastic parts program and noun phrase parser for
unrestricted text. In Proceedings of the 2nd Conference on Applied Natural Language
Processing, 136-143, Austin, USA.
Clark, Alexander. 2003. Combining distributional and morphological information
for part of speech induction. In Proceedings of the 10th Conference of the European Chapter of the Association for Computational Linguistics, 59-66, Budapest,
Hungary.
Copestake, Ann, and Dan Flickinger. 2000. An open-source grammar development
environment and broad-coverage English grammar using HPSG. In Proceedings of
the Second conference on Language Resources and Evaluation (LREC-2000), Athens,
Greece.
Cristianini, Nello, and John Shawe-Taylor. 2000. An Introduction to Support
Vector Machines and other kernel-based learning methods. Cambridge, UK: Cambridge University Press.
Crystal, David. 1987. The Cambridge Encyclopedia of Language. Cambridge, UK:
Cambridge University Press.
Daelemans, Walter, Jakob Zavrel, Peter Berck, and Steven Gillis. 1996.
MBT: A memory-based part of speech tagger-generator. In Fourth Workshop on
Very Large Corpora, Copenhagen, Denmark.
DeRose, Steven J. 1988. Grammatical category disambiguation by statistical optimization. Computational Linguistics 14.31-39.
Francis, W. N., and H. Kučera, 1979. Brown Corpus Manual: Manual of Information
to accompany A Standard Corpus of Present-Day Edited American English for use
with Digital Computers. Brown University, Providence, USA.
Garside, Roger. 1987. The CLAWS tagging system. In A Computational Analysis
of English, ed. by Roger Garside, Geoffrey Leech, and Geoffrey Sampson, chapter 2.
Essex, England: Longman Group UK.
Garside, Roger, Geoffrey Leech, and Anthony McEnery (eds.) 1997. Corpus Annotation:
Linguistic Information from Computer Text Corpora. New York, USA: Addison
Wesley Longman Ltd.
Giménez, Jesús, and Lluís Màrquez. 2003. Fast and accurate part-of-speech tagging:
The SVM approach revisited. In Proceedings of the International Conference on
Recent Advances on Natural Language Processing, Borovets, Bulgaria.
Giménez, Jesús, and Lluís Màrquez. 2004. SVMTool: A general POS tagger generator based on support vector
machines. In Proceedings of the 4th International Conference on Language Resources
and Evaluation, Lisbon, Portugal.
Huddleston, Rodney, and Geoffrey K. Pullum (eds.) 2002. The Cambridge
Grammar of the English Language. Cambridge, UK: Cambridge University Press.
Johansson, Stig, Geoffrey Leech, and Helen Goodluck. 1978. Manual Of
Information To Accompany The Lancaster-Oslo/Bergen Corpus Of British English,
For Use With Digital Computers. Oslo, Norway: Department of English, University
of Oslo.
Jurafsky, Daniel, and James Martin. 2000. Speech and Language Processing.
Prentice-Hall Series in Artificial Intelligence. Upper Saddle River, USA: Prentice-Hall.
Klein, Dan, and Christopher D. Manning. 2003. Accurate unlexicalized parsing.
In Proceedings of the 41st Annual Meeting of the Association for Computational
Linguistics, 423-430, Sapporo, Japan.
Leech, Geoffrey. 1997. Grammatical tagging. In Corpus Annotation: Linguistic
Information from Computer Text Corpora, ed. by Roger Garside, Geoffrey Leech,
and Anthony McEnery, chapter 2. New York, USA: Addison Wesley Longman Ltd.
Loper, Edward, and Steven Bird. 2002. NLTK: The natural language toolkit. In
Proceedings of the ACL Workshop on Effective Tools and Methodologies for Teaching
Natural Language Processing and Computational Linguistics, 62-69, Philadelphia,
USA.
Malouf, Robert. 2002. A comparison of algorithms for maximum entropy parameter
estimation. In Proc. of the 6th Conference on Natural Language Learning (CoNLL-2002), 49-55, Taipei, Taiwan.
Manning, Christopher D., and Hinrich Schütze. 1999. Foundations of Statistical
Natural Language Processing. Cambridge, USA: The MIT Press.
Marcus, Mitchell, Beatrice Santorini, and Mary Ann Marcinkiewicz. 1993.
Building a large annotated corpus of English: The Penn Treebank. Computational
Linguistics 19.313-330.
Mikheev, Andrei. 2000. Tagging sentence boundaries. In Proceedings of the First
Meeting of the North American Chapter of the Association for Computational Linguistics, 264-271, Seattle, USA.
Nakagawa, Tetsuji, Taku Kudoh, and Yuji Matsumoto. 2001. Unknown word
guessing and part-of-speech tagging using support vector machines. In Proceedings
of the Sixth Natural Language Processing Pacific Rim Symposium, 325-331, Tokyo,
Japan.
Ngai, Grace, and Radu Florian. 2001. Transformation-based learning in the fast
lane. In Proceedings of the Second Meeting of the North American Chapter of the
Association for Computational Linguistics, 40-47, Pittsburgh, USA.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik.
1985. A Comprehensive Grammar of the English Language. New York, USA: Longman.
Ratnaparkhi, Adwait. 1996. A maximum entropy part-of-speech tagger. In Proceedings of the Conference on Empirical Methods in Natural Language Processing,
133-142, Philadelphia, USA.
Ratnaparkhi, Adwait, 1998. Maximum Entropy Models for Natural Language Ambiguity Resolution. University of Pennsylvania dissertation.
Sampson, Geoffrey. 1987. Appendix B: Alternative grammatical coding systems. In
A Computational Analysis of English, ed. by Roger Garside, Geoffrey Leech, and
Geoffrey Sampson. Essex, England: Longman Group UK.
Santorini, Beatrice, 1990. Part-of-Speech Tagging Guidelines for the Penn Treebank
Project, 2nd printing, 3rd edition.
Toutanova, Kristina, and Christopher D. Manning. 2000. Enriching the knowledge sources used in a maximum entropy part-of-speech tagger. In Proceedings of
the 2000 Joint SIGDAT Conference on Empirical Methods in NLP and Very Large
Corpora, 63-70, Hong Kong, China.
Witten, Ian H., and Eibe Frank. 2000. Data Mining: Practical Machine Learning
Tools and Techniques with Java Implementations. San Francisco, USA: Morgan
Kaufmann.


Appendix A
The Penn Tagset
We reproduce the Penn Tagset in full here, providing a range of examples selected
to indicate the scope of word types in the class. Note that many of the examples are
ambiguous with one or more other word types apart from the class under which they are
listed.

Tag    Description                                   Examples
$      dollar                                        $ -$ $ A$ C$ HK$ M$ NZ$ S$ U.S.$ US$
``     opening quotation mark
''     closing quotation mark
(      opening parenthesis                           ( [ {
)      closing parenthesis                           ) ] }
,      comma                                         ,
--     dash
.      sentence terminator                           . ! ?
:      colon or ellipsis                             : ; ...
CC     conjunction, coordinating                     and but or ...
CD     numeral, cardinal                             ten-thirty forty-three 25 one-tenth million
DT     determiner                                    all an another any both each many neither some that the these
EX     existential there                             there
FW     foreign word                                  ich jeux habeas alai je jour ...
IN     preposition or conjunction, subordinating     in out inside on by below within for until into than whether if because before that ...
JJ     adjective or numeral, ordinal                 third alarmed English resilient different financial early big average ...
JJR    adjective, comparative                        earlier greater sharper fewer older more less cleaner creamier ...
JJS    adjective, superlative                        latest most largest oldest loudest least dirtiest nearest ...
LS     list item marker                              A A. B B. C C. First One one two three ...
MD     modal auxiliary                               can could may might must need shall should will would ...
NN     noun, common, singular or mass                director yield exercise chairman cigarette percentage rate growth milk book ...
NNP    noun, proper, singular                        Lorillard Pacific McDermott Indianapolis January Rothschild Frederick Japan Tuesday ...
NNPS   noun, proper, plural                          Asians Cabernets States Airlines Democrats Protestants Rothschilds ...
NNS    noun, common, plural                          filters men workers ratepayers units people rights counterparts capsules quantities ...
PDT    pre-determiner                                all both half many quite such sure this
POS    genitive marker                               's
PRP    pronoun, personal                             hers herself him it I me myself ours ourselves he theirs them we ...
PRP$   pronoun, possessive                           her his mine my our their thy your
RB     adverb                                        heavily far not perhaps again still here often increasingly very relatively also then on over in ...
RBR    adverb, comparative                           more further earlier better closer less later harder ...
RBS    adverb, superlative                           most best hardest least ...
RP     particle                                      in on off up across even about along through ...
SYM    symbol                                        % & . ) ). * + ,. < = > @
TO     to as preposition or infinitive marker        to
UH     interjection                                  oh yes well heck quackw wow hey ...
VB     verb, base form                               be have make oversee treat prove remain seem refund get work offer share ...
VBD    verb, past tense                              were dumped poured had contracted was made upheld favored said became took gave ...
VBG    verb, present participle or gerund            being doing trying increasing running reducing predicting ...
VBN    verb, past participle                         been taken become based considered broken gotten managed surfaced given studied sold ...
VBP    verb, present tense, not 3rd person singular  am are have allow offer argue invest talk mention seem ...
VBZ    verb, present tense, 3rd person singular      is has does appears requires follows describes says makes ...
WDT    Wh-determiner                                 that what whatever which whichever
WP     Wh-pronoun                                    what who whom ...
WP$    Wh-pronoun, possessive                        whose
WRB    Wh-adverb                                     how however when where whereby why ...


Appendix B

Complete Results

Here we show the results for all of the trials we attempted, with mappings referred to in the
body of the thesis named explicitly in the heading, and the conditions under which the tags were
mapped listed before the results. We show precision, recall and F-score over individual POSs,
and accuracy over the global metrics. The column headed F chg denotes the change in
F-score or accuracy (as appropriate) relative to the benchmark figures. Finally, the
column headed p(F chg) shows the statistical significance of the change according to the
paired t-test evaluated over the five cross-validation folds.
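
As an illustration of the metrics described above, the sketch below computes per-tag precision, recall and F-score from parallel gold and predicted tag sequences, and the paired t statistic over per-fold scores. It assumes the F-score is the balanced harmonic mean of precision and recall; the function names are illustrative, and converting the t statistic to a p-value (via the t distribution with the returned degrees of freedom) is left out.

```python
from collections import Counter
from math import sqrt
from statistics import mean, stdev


def per_tag_scores(gold, predicted):
    """Return {tag: (precision, recall, f_score)} as percentages."""
    true = Counter(gold)            # occurrences of each tag in the gold data
    guessed = Counter(predicted)    # occurrences of each tag in the output
    correct = Counter(g for g, p in zip(gold, predicted) if g == p)
    scores = {}
    for tag in set(true) | set(guessed):
        prec = correct[tag] / guessed[tag] if guessed[tag] else 0.0
        rec = correct[tag] / true[tag] if true[tag] else 0.0
        # Assumption: balanced F-score (harmonic mean of precision and recall).
        f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[tag] = (100 * prec, 100 * rec, 100 * f)
    return scores


def paired_t(baseline, modified):
    """Paired t statistic and degrees of freedom over per-fold scores."""
    diffs = [m - b for b, m in zip(baseline, modified)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all fold differences are identical")
    return mean(diffs) / (sd / sqrt(len(diffs))), len(diffs) - 1


if __name__ == "__main__":
    # Toy data only, not taken from the experiments.
    gold = ["NN", "NN", "VB", "JJ"]
    pred = ["NN", "VB", "VB", "JJ"]
    print(per_tag_scores(gold, pred))
```

With five cross-validation folds, `paired_t` would be called with two lists of five scores and would return four degrees of freedom.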


B.1

(Pos = RB & Wd ∈ {absolutely, admirably, all, ...<153 omitted>..., wickedly, wildly,
wonderfully}) → (Pos ← RBJJ)

TBL     True     Prec    Rec     F(0.5)  TrueU  RecU
#       158      100.00  100.00  100.00  0      -
$       8103     100.00  99.98   99.99   1      0.00
''      7620     99.90   99.92   99.91   0      -
,       53640    100.00  99.99   99.99   0      -
-LRB-   1489     100.00  100.00  100.00  0      -
-RRB-   1505     100.00  100.00  100.00  0      -
.       43373    100.00  100.00  100.00  0      -
:       5335     100.00  99.94   99.97   0      -
CC      26227    99.63   99.37   99.50   1      0.00
CD      40132    99.34   99.44   99.39   3413   96.98
DT      90066    99.34   99.49   99.41   2      0.00
EX      951      95.43   98.84   97.11   0      -
FW      238      58.82   46.22   51.76   67     0.00
IN      108456   98.16   98.69   98.43   22     0.00
JJ      67085    91.97   91.19   91.58   4581   72.84
JJR     3621     84.59   90.80   87.59   85     20.00
JJS     2129     85.25   93.14   89.02   54     74.07
MD      10743    99.65   99.80   99.73   5      0.00
NN      146173   96.17   96.02   96.10   3934   68.84
NNP     100926   96.58   97.37   96.97   6075   94.24
NNPS    2917     63.04   66.20   64.58   234    18.38
NNS     65922    97.60   97.80   97.70   2353   86.10
PDT     397      70.51   80.10   75.00   0      -
POS     9529     98.94   99.50   99.22   0      -
PRP     19164    99.75   99.36   99.55   11     0.00
PRP$    9173     99.38   99.91   99.65   1      0.00
RB      33806    94.24   90.76   92.47   516    80.23
RBR     1905     75.96   65.51   70.35   7      0.00
RBS     486      61.30   36.83   46.02   2      0.00
RP      2879     78.39   75.72   77.03   0      -
SYM     59       80.33   83.05   81.67   1      0.00
TO      24551    99.99   100.00  99.99   0      -
UH      100      74.60   47.00   57.67   17     0.00
VB      29021    95.16   95.34   95.25   570    77.02
VBD     32941    95.71   94.48   95.09   426    64.32
VBG     16321    91.39   92.41   91.90   924    93.29
VBN     22177    86.90   90.90   88.86   716    83.66
VBP     13819    93.54   92.26   92.90   131    51.15
VBZ     23816    97.73   96.52   97.12   467    56.32
WDT     4745     96.61   95.62   96.11   4      0.00
WP      2604     99.08   99.62   99.35   0      -
WP$     183      100.00  100.00  100.00  0      -
WRB     2322     100.00  99.87   99.94   1      0.00
``      7811     100.00  99.99   99.99   1      0.00

Global             True      Accuracy  F chg   p(F chg)
SENT               43766     51.21     -1.17   0.031
TOKENS             1044667   96.800    -0.04   0.046
TOKENS (unknown)   24622     81.73     -0.26   0.445

B.2 rbdeg[ld] Mapping

(Pos = RB & Wd ∈ {absolutely, all, any, ...<33 omitted>..., utterly, very, wildly}) → (Pos ←
RBDEG)

TBL     True     Prec    Rec     F(0.5)  TrueU  RecU
#       158      100.00  100.00  100.00  0      -
$       8103     100.00  99.98   99.99   1      0.00
''      7620     99.87   99.91   99.89   0      -
,       53640    100.00  99.99   99.99   0      -
-LRB-   1489     100.00  100.00  100.00  0      -
-RRB-   1505     100.00  100.00  100.00  0      -
.       43373    100.00  100.00  100.00  0      -
:       5335     100.00  99.94   99.97   0      -
CC      26227    99.61   99.39   99.50   1      0.00
CD      40132    99.35   99.45   99.40   3413   97.07
DT      90066    99.31   99.48   99.40   2      0.00
EX      951      95.33   98.84   97.06   0      -
FW      238      55.50   46.64   50.68   67     0.00
IN      108456   98.15   98.70   98.42   22     0.00
JJ      67085    91.85   91.34   91.59   4581   73.02
JJR     3621     84.90   90.83   87.77   85     23.53
JJS     2129     87.91   96.62   92.06   54     74.07
MD      10743    99.67   99.80   99.73   5      0.00
NN      146173   96.19   96.07   96.13   3934   69.06
NNP     100926   96.71   97.28   96.99   6075   93.10
NNPS    2917     63.50   66.85   65.13   234    19.66
NNS     65922    97.58   97.87   97.72   2353   88.10
PDT     397      71.30   77.58   74.31   0      -
POS     9529     98.93   99.52   99.22   0      -
PRP     19164    99.74   99.37   99.55   11     0.00
PRP$    9173     99.39   99.91   99.65   1      0.00
RB      33806    94.13   91.06   92.57   516    79.07
RBR     1905     75.97   66.56   70.96   7      0.00
RBS     486      85.40   48.15   61.58   2      0.00
RP      2879     79.06   75.69   77.34   0      -
SYM     59       78.12   84.75   81.30   1      0.00
TO      24551    99.99   100.00  99.99   0      -
UH      100      74.60   47.00   57.67   17     0.00
VB      29021    95.38   95.33   95.36   570    75.79
VBD     32941    95.68   94.47   95.07   426    59.86
VBG     16321    91.62   92.10   91.86   924    91.67
VBN     22177    86.99   90.64   88.78   716    79.33
VBP     13819    93.60   92.16   92.87   131    50.38
VBZ     23816    97.66   96.57   97.12   467    56.96
WDT     4745     96.42   95.45   95.93   4      0.00
WP      2604     99.08   99.58   99.33   0      -
WP$     183      100.00  100.00  100.00  0      -
WRB     2322     100.00  99.87   99.94   1      0.00
``      7811     100.00  99.99   99.99   1      0.00

Global             True      Accuracy  F chg   p(F chg)
SENT               43766     51.51     -0.60   0.207
TOKENS             1044667   96.825    -0.02   0.399
TOKENS (unknown)   24622     81.44     -0.61   0.156
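
A lexically conditioned mapping of the kind listed for this section (Pos = RB with the word drawn from a fixed list, retagged as RBDEG) and its reversal can be sketched as follows. The word set contains only the six items visible in the condition, since the remainder of the list is elided; lower-casing words before lookup and the function names are assumptions made for illustration.

```python
# Only the words visible in the condition above; the full list is elided.
DEGREE_ADVERBS = {"absolutely", "all", "any", "utterly", "very", "wildly"}


def map_tags(tagged, words=DEGREE_ADVERBS, source="RB", target="RBDEG"):
    """Retag (word, tag) pairs: source-tagged words in the list get target."""
    return [
        # Assumption: matching is case-insensitive.
        (w, target if t == source and w.lower() in words else t)
        for w, t in tagged
    ]


def unmap_tags(tagged, source="RB", target="RBDEG"):
    """Reverse the mapping by collapsing the new tag back to the original."""
    return [(w, source if t == target else t) for w, t in tagged]


if __name__ == "__main__":
    sent = [("very", "RB"), ("quickly", "RB"), ("very", "JJ")]
    mapped = map_tags(sent)
    print(mapped)
    print(unmap_tags(mapped) == sent)
```

Because the new tag is introduced only by subdividing an existing one, collapsing it recovers the original tags, which allows results to be compared directly against the unmodified tagset.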

B.3 rbdeg[s] Mapping

(Pos = RB & SibR = RB) → (Pos ← RBDEG)
(Pos = RB & SibR = JJ) → (Pos ← RBDEG)
(Pos = RB & Par = ADJP) → (Pos ← RBDEG)

SVM     True     Prec    Rec     F(0.5)  TrueU  RecU
#       158      100.00  100.00  100.00  0      -
$       8103     100.00  99.98   99.99   1      0.00
''      7620     99.95   99.75   99.85   0      -
,       53640    100.00  99.99   100.00  0      -
-LRB-   1489     100.00  100.00  100.00  0      -
-RRB-   1505     100.00  100.00  100.00  0      -
.       43373    100.00  100.00  100.00  0      -
:       5335     99.96   99.94   99.95   0      -
CC      26227    99.67   99.41   99.54   1      0.00
CD      40132    99.28   99.57   99.43   3413   98.51
DT      90066    99.46   99.30   99.38   2      0.00
EX      951      95.18   99.58   97.33   0      -
FW      238      65.00   32.77   43.58   67     0.00
IN      108456   97.81   98.94   98.37   22     0.00
JJ      67085    92.24   92.18   92.21   4581   82.25
JJR     3621     85.03   92.10   88.43   85     24.71
JJS     2129     95.19   95.77   95.48   54     61.11
MD      10743    99.70   99.78   99.74   5      0.00
NN      146173   97.16   95.10   96.12   3934   67.77
NNP     100926   94.62   97.96   96.26   6075   98.17
NNPS    2917     54.76   73.40   62.72   234    11.97
NNS     65922    97.73   97.20   97.46   2353   90.35
PDT     397      69.07   84.38   75.96   0      -
POS     9529     98.75   99.65   99.20   0      -
PRP     19164    99.85   99.33   99.59   11     0.00
PRP$    9173     99.38   99.96   99.67   1      0.00
RB      33806    95.79   89.87   92.73   516    80.62
RBR     1905     76.92   67.87   72.11   7      0.00
RBS     486      87.79   84.36   86.04   2      0.00
RP      2879     75.34   75.89   75.62   0      -
SYM     59       78.18   72.88   75.44   1      0.00
TO      24551    99.99   100.00  99.99   0      -
UH      100      70.37   38.00   49.35   17     5.88
VB      29021    96.70   94.64   95.66   570    78.95
VBD     32941    94.97   95.94   95.46   426    72.77
VBG     16321    92.28   94.14   93.20   924    91.34
VBN     22177    88.25   90.87   89.54   716    73.60
VBP     13819    93.95   92.18   93.06   131    32.82
VBZ     23816    98.82   96.41   97.60   467    64.45
WDT     4745     97.20   96.42   96.80   4      0.00
WP      2604     98.90   100.00  99.45   0      -
WP$     183      100.00  100.00  100.00  0      -
WRB     2322     100.00  99.91   99.96   1      0.00
``      7811     100.00  99.99   99.99   1      0.00

Global             True      Accuracy  F chg   p(F chg)
SENT               43766     50.63     -0.18   0.264
TOKENS             1044667   96.847    -0.01   0.500
TOKENS (unknown)   24622     84.72     0.12    0.092

TBL     True     Prec    Rec     F(0.5)  TrueU
#       1        100.00  100.00  100.00  0
$       342      100.00  100.00  100.00  0
''      399      100.00  100.00  100.00  0
,       2571     99.92   99.96   99.94   0
-LRB-   52       100.00  100.00  100.00  0
-RRB-   55       100.00  100.00  100.00  0
.       1959     100.00  100.00  100.00  0
:       293      100.00  100.00  100.00  0
CC      1141     98.96   99.65   99.30   0
CD      1439     99.02   98.54   98.78   116
DT      4076     99.31   99.34   99.33   0
EX      49       96.08   100.00  98.00   0
FW      2        33.33   50.00   40.00   1
IN      4952     97.98   98.87   98.42   1
JJ      2992     91.76   91.58   91.67   250
JJR     157      83.12   84.71   83.91   3
JJS     93       92.39   91.40   91.89   3
MD      413      99.76   100.00  99.88   0
NN      6268     95.93   95.95   95.94   162
NNP     5202     97.33   97.44   97.39   384
NNPS    95       53.33   67.37   59.53   6
NNS     3004     97.71   97.97   97.84   115
PDT     9        43.75   77.78   56.00   0
POS     403      99.50   99.75   99.63   0
PRP     954      99.58   99.16   99.37   0
PRP$    409      98.55   99.76   99.15   0
RB      1490     94.09   91.81   92.93   21
RBR     86       68.92   59.30   63.75   1
RBS     19       66.67   63.16   64.86   0
RP      140      83.47   72.14   77.39   0
SYM     1        0.00    0.00    0.00    0
TO      1028     99.71   100.00  99.85   0
UH      1        0.00    0.00    0.00    0
VB      1196     94.22   96.74   95.46   28
VBD     1548     95.77   93.54   94.64   20
VBG     764      92.20   91.23   91.71   42
VBN     1031     87.89   90.79   89.31   44
VBP     727      95.26   91.20   93.18   16
VBZ     1133     98.57   97.18   97.87   20
WDT     204      98.01   96.57   97.28   1
WP      141      100.00  100.00  100.00  0
WP$     6        100.00  100.00  100.00  0
WRB     92       100.00  100.00  100.00  0
``      409      100.00  100.00  100.00  0

Global             True      Accuracy
SENT               1963      50.53
TOKENS             47356     96.833
TOKENS (unknown)   1234

B.4 vbcop[s] + rbdeg[s] Mapping

(Pos = VB & SibAllR ∈ {ADJP-PRD, ADJP-PRD-TPC,
ADJP-TPC-PRD, ...<47 omitted>..., UCP-LOC-PRD, UCP-PRD, UCP-PRD-LOC})
→ (Pos ← VC)
and similarly for other VB.*
(Pos = RB & SibR = RB) → (Pos ← RBDEG)
(Pos = RB & SibR = JJ) → (Pos ← RBDEG)
(Pos = RB & Par = ADJP) → (Pos ← RBDEG)

TBL     True     Prec    Rec     F(0.5)  TrueU  RecU
#       158      100.00  100.00  100.00  0      -
$       8103     100.00  99.98   99.99   1      0.00
''      7620     99.90   99.92   99.91   0      -
,       53640    100.00  99.99   99.99   0      -
-LRB-   1489     100.00  100.00  100.00  0      -
-RRB-   1505     100.00  100.00  100.00  0      -
.       43373    100.00  100.00  100.00  0      -
:       5335     100.00  99.94   99.97   0      -
CC      26227    99.63   99.37   99.50   1      0.00
CD      40132    99.31   99.45   99.38   3413   97.36
DT      90066    99.32   99.48   99.40   2      0.00
EX      951      95.53   98.95   97.21   0      -
FW      238      54.73   46.22   50.11   67     1.49
IN      108456   98.15   98.69   98.42   22     4.55
JJ      67085    91.94   91.25   91.60   4581   72.34
JJR     3621     84.88   90.42   87.56   85     29.41
JJS     2129     91.58   96.10   93.79   54     77.78
MD      10743    99.65   99.81   99.73   5      0.00
NN      146173   96.21   95.99   96.10   3934   68.20
NNP     100926   96.60   97.35   96.97   6075   94.12
NNPS    2917     63.08   66.23   64.62   234    19.23
NNS     65922    97.46   98.01   97.74   2353   88.27
PDT     397      71.53   77.83   74.55   0      -
POS     9529     98.90   99.53   99.22   0      -
PRP     19164    99.74   99.36   99.55   11     0.00
PRP$    9173     99.39   99.90   99.65   1      0.00
RB      33806    94.23   90.74   92.45   516    79.26
RBR     1905     74.43   66.93   70.48   7      28.57
RBS     486      86.34   68.93   76.66   2      0.00
RP      2879     78.31   76.10   77.19   0      -
SYM     59       80.33   83.05   81.67   1      0.00
TO      24551    99.99   100.00  99.99   0      -
UH      100      74.60   47.00   57.67   17     0.00
VB      29021    95.27   95.31   95.29   570    74.91
VBD     32941    95.65   94.53   95.09   426    62.68
VBG     16321    91.43   92.37   91.90   924    93.18
VBN     22177    86.92   90.60   88.72   716    82.26
VBP     13819    93.85   92.15   92.99   131    58.02
VBZ     23816    98.00   96.40   97.19   467    56.10
WDT     4745     96.31   95.62   95.96   4      0.00
WP      2604     99.08   99.62   99.35   0      -
WP$     183      100.00  100.00  100.00  0      -
WRB     2322     100.00  99.87   99.94   1      0.00
``      7811     100.00  99.99   99.99   1      0.00

Global             True      Accuracy  F chg   p(F chg)
SENT               43766     51.43     -0.76   0.106
TOKENS             1044667   96.822    -0.02   0.148
TOKENS (unknown)   24622     81.72     -0.27   0.174

B.5

vbcop[lm] + rbdeg[s] Mapping

53

0.00
0.00
78.74
82.61
73.21
74.15
84.29
35.44
81.98
0.00

83.96
25.00

84.11
82.13
70.78
66.22
70.75
86.29

p(FU)

24622

0.00
98.60

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.340
0.030
0.508
0.276
0.682
0.182
0.397
0.979
0.192
0.139
0.707
0.523
0.349
0.239
0.449
0.983
0.184
0.374
0.676
0.424
0.248
0.403

0.374
0.374
0.136
0.543
0.190
0.134
0.648
0.497
0.057
0.374

0.057
0.017

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.50
97.19 99.37
0.00 99.41
97.05
0.00 51.02
0.00 98.41
74.29 91.71
22.35 87.56
75.93 89.83
0.00 99.74
68.99 96.14
93.35 97.03
11.97 64.88
88.57 97.70
74.50
99.22
0.00 99.56
0.00 99.65
82.17 92.54
14.29 70.28
0.00 65.71
77.19
0.00 81.67
99.99
0.00 56.79
76.14 95.30
61.50 95.04
91.23 91.95
82.68 88.70
57.25 92.93
55.25 97.05
0.00 95.96
99.33
100.00
0.00 99.94
0.00 99.99

-0.19

0.57

0.91

-0.43
0.02
-13.90
-0.72

0.93

0.27
-2.65
0.64
-0.63
14.12
0.91

0.167

0.444

0.817

0.574
0.976
0.640
0.201

0.497

0.882
0.335
0.527
0.664
0.050
0.647

81.90

-0.05 0.911

RecU

-0.01

0.02
-0.03
0.01
-0.21
-1.38
-0.01
0.05
0.01
-3.68
0.02
0.01
0.01
-1.10
-0.03
-0.91
0.00
0.01
0.01
-0.04
-0.27
-16.35
0.20

0.00
-1.52
-0.13
-0.02
0.17
-0.15
-0.05
0.03
-0.14
-0.02

-0.76
-0.02

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.50
99.37
99.41
97.05
51.02
98.41
91.71
87.56
89.83
99.74
96.14
97.03
64.88
97.70
74.50
99.22
99.56
99.65
92.54
70.28
65.71
77.19
81.67
99.99
56.79
95.30
95.04
91.95
88.70
92.93
97.05
95.96
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.93
99.99
100.00
100.00
100.00
99.94
99.39
99.42
99.49
98.74
47.48
98.72
91.52
90.44
88.35
99.81
96.05
97.28
65.96
97.92
79.09
99.55
99.36
99.90
90.92
66.30
70.58
75.89
83.05
100.00
46.00
95.23
94.44
92.36
90.50
92.16
96.45
95.30
99.58
100.00
99.87
99.99
51.43
96.821

F chg

100.00
100.00
99.87
100.00
100.00
100.00
100.00
100.00
99.61
99.33
99.33
95.43
55.12
98.11
91.89
84.84
91.36
99.67
96.22
96.77
63.84
97.48
70.40
98.89
99.75
99.39
94.22
74.78
61.47
78.54
80.33
99.99
74.19
95.38
95.65
91.54
86.97
93.70
97.67
96.62
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = VB & Wd ∈ {appear, become, feel, look, remain, seem, smell, sound}) (Pos VJ)
(Pos = VBD & Wd ∈ {appeared, became, felt, looked, remained, seemed, smelled, smelt, sounded}) (Pos VJD)
and similarly for other VB.*
(Pos = RB & SibR = RB) (Pos RBDEG)
(Pos = RB & SibR = JJ) (Pos RBDEG)
(Pos = RB & Par = ADJP) (Pos RBDEG)

B.6


0.00
100.00
0.00
79.45
70.59
71.93
73.71
83.24
47.37
83.49
0.00

83.60
0.00

82.22
84.62
69.02
66.44
62.04
87.54

p(FU)

24622

0.00
98.72

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.832
0.394
0.537
0.555
0.145
0.413
0.128
0.336
0.814
0.366
0.236
0.197
0.473
0.413
0.361
0.995
0.371
0.178
0.223
0.953
0.527
0.895
0.178
0.374
0.196
0.346
0.726
0.835
0.304
0.970
0.167
0.059

0.313
0.405

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.48
96.92 99.39
0.00 99.39
97.00
1.49 50.22
0.00 98.44
72.23 91.59
14.12 87.40
75.93 93.54
0.00 99.73
69.70 96.10
94.32 96.98
11.54 65.37
88.10 97.74
74.08
99.22
0.00 99.55
0.00 99.65
81.01 92.48
0.00 70.40
0.00 74.09
77.03
0.00 83.61
99.99
0.00 55.00
75.44 95.36
64.55 95.09
93.07 91.81
81.01 88.74
51.15 92.96
55.67 97.11
0.00 96.26
99.35
100.00
0.00 99.94
0.00 99.99

-0.27

-0.45

-0.19
-0.15
-10.70
-0.04

-0.01

-1.29
1.36
0.07
-1.36
1.09
1.96

0.335

0.589

0.760
0.413
0.703
0.878

0.964

0.159
0.171
0.939
0.164
0.974
0.447

81.77

-0.21 0.397

RecU

-0.00
-0.02
-0.01
-0.27
-2.93
0.01
-0.08
-0.17
0.31
0.01
-0.03
-0.03
-0.36
0.01
-1.46
0.00
0.00
0.01
-0.10
-0.10
-5.68
-0.01
2.38
0.00
-4.63
-0.07
0.03
0.02
-0.11
-0.01
0.09
0.18

-0.53
-0.02

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.48
99.39
99.39
97.00
50.22
98.44
91.59
87.40
93.54
99.73
96.10
96.98
65.37
97.74
74.08
99.22
99.55
99.65
92.48
70.40
74.09
77.03
83.61
99.99
55.00
95.36
95.09
91.81
88.74
92.96
97.11
96.26
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.40
99.43
99.45
98.63
48.32
98.71
91.28
90.20
96.29
99.76
96.04
97.34
66.06
97.92
78.84
99.50
99.38
99.91
90.87
65.98
65.02
75.62
86.44
100.00
44.00
95.29
94.64
92.30
90.34
92.42
96.48
96.06
99.62
100.00
99.87
99.99
51.54
96.826

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.55
99.34
99.33
95.42
52.27
98.16
91.89
84.77
90.95
99.69
96.17
96.63
64.69
97.57
69.87
98.95
99.73
99.39
94.15
75.45
86.10
78.51
80.95
99.99
73.33
95.42
95.54
91.34
87.19
93.52
97.75
96.47
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = IN & Wd = than) (Pos INCMP)
(Pos = IN & Wd ∈ {til, although, as,...<9 omitted>...,whereas, whether, while}) (Pos INSUB)
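The two lexical rules above can be read as a word-conditioned retagging of IN that is undone by mapping each new tag back to IN. The following is a minimal sketch of that reading, not the thesis implementation: the function and variable names are hypothetical, case-insensitive matching is an assumption, and the word set contains only the entries visible in the rule (the nine elided words are not guessed).

```python
# Illustrative sketch only; all names here are hypothetical.

# Words visible in the INSUB rule; the elided entries are deliberately left out.
INSUB_WORDS = {"til", "although", "as", "whereas", "whether", "while"}

# Each new tag records the original tag it replaced, so the mapping can be reversed.
REVERSE = {"INCMP": "IN", "INSUB": "IN"}


def map_tag(word: str, tag: str) -> str:
    """Apply the two lexical rules to a single (word, tag) pair."""
    if tag != "IN":
        return tag
    w = word.lower()  # assumption: matching ignores case
    if w == "than":
        return "INCMP"
    if w in INSUB_WORDS:
        return "INSUB"
    return tag


def unmap_tag(tag: str) -> str:
    """Recover the original tag from a mapped tag."""
    return REVERSE.get(tag, tag)


sentence = [("bigger", "JJR"), ("than", "IN"), ("expected", "VBN"),
            ("while", "IN"), ("in", "IN"), ("Rome", "NNP")]
mapped = [(w, map_tag(w, t)) for w, t in sentence]
restored = [(w, unmap_tag(t)) for w, t in mapped]
```

Because both new tags map back to IN, tagger output in the modified tagset can be scored against the original annotation after applying `unmap_tag`.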

B.7


25.00
0.00
77.78
77.78
70.18
74.63
84.23
30.82
82.32

0.00
82.87
25.00
0.00

0.00
82.16
81.44
69.89
66.20
59.46
87.13

p(FU)

24622

0.00
98.26

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.474
0.385
0.287
0.794
0.049
0.938
0.302
0.599
0.823
0.479
0.643
0.473
0.203
0.725
0.320
0.549
0.107
0.374
0.945
0.869
0.434
0.331

0.374
0.374
0.606
0.031
0.668
0.863
0.491
0.236
0.217
0.374

0.029
0.359

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.50
97.36 99.39
0.00 99.39
97.16
1.49 49.57
0.00 98.43
72.95 91.62
24.71 87.42
74.07 92.47
0.00 99.71
69.09 96.14
92.76 97.00
20.94 64.64
87.63 97.72
74.46
99.21
0.00 99.54
0.00 99.65
81.59 92.57
14.29 70.38
0.00 71.40
77.49
0.00 81.67
99.99
0.00 55.95
75.96 95.40
63.85 95.14
91.45 91.84
78.77 88.82
50.38 92.92
56.53 97.14
0.00 96.27
99.33
100.00
0.00 99.94
0.00 99.99

-0.28

-0.96

-2.44

-0.05
-0.32
20.01
-1.02

-0.08

-0.96
-0.93
0.03
-2.79
-1.65
2.72

0.222

0.087

0.195

0.982
0.341
0.743
0.255

0.998

0.316
0.479
0.933
0.185
0.670
0.358

81.46

-0.59 0.153

RecU

-0.01

0.02
-0.02
-0.02
-0.11
-4.18
-0.00
-0.05
-0.15
-0.84
-0.01
0.01
-0.02
-1.46
-0.01
-0.95
-0.01
-0.01
0.01
-0.00
-0.14
-9.11
0.58

0.00
-2.98
-0.02
0.08
0.04
-0.02
-0.06
0.12
0.19
-0.02

-0.60
-0.01

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.50
99.39
99.39
97.16
49.57
98.43
91.62
87.42
92.47
99.71
96.14
97.00
64.64
97.72
74.46
99.21
99.54
99.65
92.57
70.38
71.40
77.49
81.67
99.99
55.95
95.40
95.14
91.84
88.82
92.92
97.14
96.27
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.93
99.99
100.00
100.00
100.00
99.94
99.39
99.49
99.50
98.84
48.32
98.62
91.26
89.89
94.08
99.76
96.02
97.24
66.85
97.90
78.59
99.51
99.35
99.90
91.16
66.72
66.26
76.76
83.05
100.00
47.00
95.47
94.58
92.24
90.70
92.18
96.62
96.27
99.58
100.00
99.87
99.99
51.51
96.831

F chg

100.00
100.00
99.87
100.00
100.00
100.00
100.00
100.00
99.61
99.29
99.27
95.53
50.88
98.23
91.98
85.08
90.92
99.66
96.25
96.75
62.58
97.54
70.75
98.92
99.73
99.39
94.03
74.46
77.40
78.23
80.33
99.99
69.12
95.34
95.71
91.43
87.02
93.66
97.68
96.27
99.08
100.00
100.00
100.00

F(0.5)

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Rec

Prec

True

(Pos = TO & SibR = NP) (Pos TOIN)
(Pos = TO & Par = QP) (Pos TOQ)
(Pos = IN & SibR = S) (Pos INSUB)
(Pos = IN & Par = SBAR) (Pos INSUB)

B.8

insub[s] Mapping


0.00
79.68
88.89
86.49
81.14
82.45
54.00
87.32

0.00
90.30
0.00

33.33
86.35
80.59
82.05
78.53
72.88
89.91

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.912
0.988
0.656
0.179
0.953
0.194
0.764
0.576
0.660

0.670
0.192
0.650
0.440
0.746
0.374
0.983
0.987
0.766
0.204
0.797
0.132

0.277
0.466
0.453
0.899
0.219
0.825
0.288

0.380
0.003

100.00
0.00 99.99
99.85
100.00
100.00
100.00
100.00
99.96
0.00 99.54
98.51 99.42
0.00 99.38
97.28
0.00 42.98
0.00 98.38
82.10 92.21
28.24 88.34
59.26 95.48
0.00 99.74
67.69 96.12
98.16 96.30
11.54 62.68
90.44 97.47
76.43
99.20
0.00 99.59
0.00 99.67
81.20 92.81
0.00 71.58
0.00 86.13
75.35
0.00 75.44
99.99
5.88 48.98
78.77 95.66
71.13 95.47
91.02 93.26
73.04 89.55
32.82 93.10
64.88 97.61
0.00 96.65
99.45
100.00
0.00 99.94
0.00 99.99

0.04
3.42

0.09
-0.03
-3.23
0.07

-0.14

0.14
0.41
-0.13
0.19
1.84
0.21

0.682
0.962

0.272
0.305
0.374
0.284

0.374

0.711
0.401
0.557
0.623
0.921
0.848

84.65

0.03 0.424

RecU

0.00
0.00
0.00
-0.10
-0.10
0.02
-0.00
-0.08
0.03

0.00
0.01
0.11
0.01
-0.07
0.01
0.00
0.00
0.01
-0.40
0.11
-0.49

0.01
0.02
-0.03
-0.01
0.04
0.00
-0.13

0.11
0.00

PrecU

100.00
99.99
99.85
100.00
100.00
100.00
100.00
99.96
99.54
99.42
99.38
97.28
42.98
98.38
92.21
88.34
95.48
99.74
96.12
96.30
62.68
97.47
76.43
99.20
99.59
99.67
92.81
71.58
86.13
75.35
75.44
99.99
48.98
95.66
95.47
93.26
89.55
93.10
97.61
96.65
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.75
99.99
100.00
100.00
100.00
99.94
99.37
99.57
99.34
99.58
32.77
98.80
92.08
92.29
95.77
99.79
95.09
97.97
73.19
97.22
84.13
99.65
99.33
99.96
90.63
66.56
84.36
75.65
72.88
100.00
36.00
94.61
95.94
94.16
90.90
92.24
96.41
96.88
100.00
100.00
99.87
99.99
50.77
96.855

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.98
99.70
99.27
99.41
95.08
62.40
97.96
92.35
84.71
95.19
99.69
97.17
94.69
54.81
97.73
70.02
98.75
99.85
99.39
95.10
77.41
87.98
75.05
78.18
99.99
76.60
96.74
95.01
92.37
88.24
93.98
98.84
96.41
98.90
100.00
100.00
100.00

F(0.5)

Rec

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = IN & SibR = S) (Pos INSUB)
(Pos = IN & Par = SBAR) (Pos INSUB)
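The syntactic rules above condition on tree context rather than on the word itself. The sketch below shows one way such rules could be applied to a parsed sentence, assuming that `SibR` refers to the label of the immediately following sibling and `Par` to the label of the parent node; both readings, the tuple-based tree encoding, and every function name are assumptions made for illustration only.

```python
# Illustrative sketch only; the tree encoding and names are hypothetical.
# A phrase is (label, [children]); a preterminal is (tag, word).

def is_preterminal(node):
    return isinstance(node[1], str)


def map_insub(node, parent_label=None, right_sibling_label=None):
    """Retag IN as INSUB when its right sibling is S or its parent is SBAR."""
    label, body = node
    if is_preterminal(node):
        if label == "IN" and (right_sibling_label == "S" or parent_label == "SBAR"):
            return ("INSUB", body)
        return node
    new_children = []
    for i, child in enumerate(body):
        # Label of the next sibling, or None for the last child.
        right = body[i + 1][0] if i + 1 < len(body) else None
        new_children.append(map_insub(child, label, right))
    return (label, new_children)


def tagged_leaves(node):
    """Flatten a tree into (word, tag) pairs in sentence order."""
    label, body = node
    if is_preterminal(node):
        return [(body, label)]
    pairs = []
    for child in body:
        pairs.extend(tagged_leaves(child))
    return pairs


tree = ("S", [
    ("NP", [("PRP", "She")]),
    ("VP", [
        ("VBD", "stayed"),
        ("PP", [("IN", "in"), ("NP", [("NN", "town")])]),
        ("SBAR", [("IN", "because"),
                  ("S", [("NP", [("PRP", "he")]), ("VP", [("VBD", "left")])])]),
    ]),
])
mapped_tree = map_insub(tree)
```

In this example only the IN that introduces the clause ("because") is retagged, while the IN heading the prepositional phrase ("in") is left unchanged.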


0.00
0.00
79.44
70.00
72.73
73.71
83.62
31.82
82.65

0.00
84.65
33.33

86.06
82.21
69.92
66.21
54.87
86.45

p(FU)

24622

98.55

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.271
0.653
0.100
0.598
0.288
0.867
0.605
0.091
0.393
0.688
0.846
0.041
0.079
0.803
0.833
0.168
0.736

0.559
0.657
0.409
0.186

0.374

0.877
0.204
0.075
0.885
0.240
0.180
0.286

0.313
0.992

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.51
97.60 99.40
0.00 99.38
97.11
0.00 50.90
0.00 98.42
71.93 91.63
24.71 87.76
74.07 95.04
0.00 99.72
69.14 96.12
94.01 96.97
11.97 64.46
89.08 97.72
75.29
99.24
0.00 99.55
0.00 99.64
81.20 92.56
14.29 70.70
0.00 84.22
77.36
0.00 81.67
99.99
0.00 57.67
75.79 95.43
62.91 95.12
93.07 91.97
81.01 88.83
47.33 92.84
57.39 97.13
0.00 96.17
99.35
100.00
0.00 99.94
0.00 99.99

-0.01

-0.68
4.76
-0.65

-0.60
-0.07
-16.30
-0.03

0.72

1.11
-1.35
0.81
-1.54
-8.37
3.34

0.920

0.533
0.981
0.517

0.578
0.650
0.483
0.915

0.545

0.468
0.375
0.125
0.411
0.406
0.178

81.76

-0.22 0.652

RecU

0.03
-0.01
-0.03
-0.16
-1.61
-0.00
-0.04
0.25
1.91
0.00
-0.01
-0.05
-1.74
-0.01
0.16
0.02
-0.00

-0.02
0.32
7.22
0.41

0.00

0.01
0.06
0.20
-0.01
-0.14
0.11
0.08

-0.36
0.00

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.51
99.40
99.38
97.11
50.90
98.42
91.63
87.76
95.04
99.72
96.12
96.97
64.46
97.72
75.29
99.24
99.55
99.64
92.56
70.70
84.22
77.36
81.67
99.99
57.67
95.43
95.12
91.97
88.83
92.84
97.13
96.17
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.39
99.49
99.49
98.84
47.48
98.63
91.23
90.91
95.82
99.78
96.02
97.32
65.68
97.92
80.60
99.54
99.36
99.90
91.16
66.25
81.28
76.24
83.05
100.00
47.00
95.37
94.60
92.34
90.55
92.20
96.59
96.04
99.62
100.00
99.87
99.99
51.63
96.842

F chg

Rec

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.62
99.31
99.26
95.43
54.85
98.22
92.03
84.82
94.27
99.67
96.22
96.62
63.28
97.52
70.64
98.95
99.73
99.38
94.00
75.80
87.39
78.51
80.33
99.99
74.60
95.49
95.65
91.61
87.17
93.49
97.68
96.30
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667


24622

0.00
99.21 98.80
0.00
100.00
62.50
82.47
81.25
88.10
79.56
89.33
56.76
90.09

5.97
22.73
82.45
45.88
68.52
0.00
78.09
95.62
35.90
92.69

0.00
0.00
92.36 84.30
0.00 0.00
0.00
100.00 100.00

90.56
84.44
85.48
80.94
74.31
89.93

0.00
85.79
80.28
89.83
79.47
61.83
78.37
0.00

0.00
0.00
87.29

p(FU)

0.00

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.983

0.034
0.109
0.014
0.239
0.338
0.015
0.069
0.096
0.285
0.618
0.953
0.457
0.739
0.042
0.316
0.104
0.420
0.374
0.708
0.525
0.049
0.908

0.374
0.518
0.929
0.235
0.717
0.396
0.827
0.007

0.287
0.136

RecU

0.00

-0.01
0.01
-0.02
0.20
-2.00
-0.05
0.04
-0.08
-0.13
-0.00
-0.00
-0.00
-0.08
0.01
0.88
-0.02
0.01
0.01
0.01
-0.14
-0.83
0.05

2.92
-0.01
-0.00
-0.04
0.02
0.03
0.00
-0.79

-0.16
-0.01

PrecU

100.00
99.99
99.77
100.00
100.00
100.00
100.00
99.97
99.53
99.41
99.35
97.25
40.49
98.17
92.35
87.39
94.65
99.66
96.48
96.96
59.39
98.08
69.43
99.01
99.51
99.54
92.32
68.27
81.89
76.41
70.48
99.99
42.11
96.26
96.32
93.04
90.35
94.18
97.79
95.38
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.57
99.99
100.00
100.00
100.00
99.94
99.31
99.48
99.35
98.42
27.73
98.50
92.84
90.11
94.65
99.56
96.59
97.72
54.10
98.51
65.49
99.63
99.17
99.95
91.00
62.52
80.04
73.36
62.71
100.00
28.00
95.10
96.55
91.94
89.66
93.69
96.72
96.48
100.00
100.00
99.87
99.99
53.64
97.048

F chg

Rec

100.00
100.00
99.97
100.00
100.00
100.00
100.00
100.00
99.75
99.35
99.34
96.10
75.00
97.84
91.86
84.82
94.65
99.75
96.36
96.20
65.83
97.66
73.86
98.40
99.85
99.15
93.68
75.19
83.84
79.73
80.43
99.99
84.85
97.46
96.10
94.17
91.04
94.68
98.87
94.29
98.90
100.00
100.00
100.00

F(0.5)

Prec

True

MAX
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

100.00
99.99
99.77
100.00
100.00
100.00
100.00
99.97
99.53
99.41
99.35
97.25
40.49
98.17
92.35
87.39
94.65
99.66
96.48
96.96
59.39
98.08
69.43
99.01
99.51
99.54
92.32
68.27
81.89
76.41
70.48
99.99
42.11
96.26
96.32
93.04
90.35
94.18
97.79
95.38
99.45
100.00
99.94
99.99

0.03

0.03
0.00
-1.62

-0.22
0.01
-1.44
0.06

-0.53

-0.34
-0.03
-0.19
-0.07
-1.19
0.48

0.591

0.826
0.756
0.374

0.321
0.774
0.740
0.609

0.112

0.220
0.889
0.274
0.744
0.378
0.309

-0.05 0.336

B.9


57.14
0.00
78.76
73.53
71.43
74.45
83.72
32.14
83.79
0.00

83.95
20.00

79.60
78.55
70.42
66.29
64.55
86.62

p(FU)

24622

0.00
98.95

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.128
0.397
0.062
0.276
0.791
0.131
0.178
0.291
0.362
0.996
0.269
0.363
0.164
0.497
0.474
0.986
0.568
0.606
0.035
0.461
0.368
0.461
0.374
0.374

0.030
0.685
0.448
0.690
0.702
0.161
0.496
0.374

0.084
0.057

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.53
97.10 99.39
0.00 99.41
97.05
5.97 51.04
0.00 98.44
72.06 91.53
29.41 87.34
74.07 91.12
0.00 99.72
69.19 96.10
93.84 96.98
19.23 64.75
87.89 97.75
74.49
99.22
0.00 99.55
0.00 99.64
79.07 92.50
14.29 70.01
0.00 73.02
77.15
0.00 82.64
99.99
0.00 57.67
76.67 95.32
63.62 95.10
92.75 91.75
81.01 88.77
54.20 92.99
58.24 97.13
0.00 96.19
99.33
100.00
0.00 99.94
0.00 99.99

-0.06

-0.99
20.52
-1.55

-0.09
-0.09
15.81
0.03

-1.04

-2.01
-2.70
1.07
-1.48
6.24
4.35

0.740

0.286
0.453
0.374

0.921
0.797
0.543
0.975

0.423

0.087
0.184
0.097
0.333
0.336
0.321

81.70

-0.29 0.367

RecU

0.05
-0.02
0.01
-0.21
-1.33
0.02
-0.15
-0.23
-2.30
0.00
-0.03
-0.03
-1.30
0.02
-0.91
0.00
-0.01
-0.01
-0.08
-0.65
-7.04
0.14
1.20
0.00

-0.11
0.04
-0.05
-0.08
0.03
0.11
0.11
-0.02

-0.76
-0.03

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.53
99.39
99.41
97.05
51.04
98.44
91.53
87.34
91.12
99.72
96.10
96.98
64.75
97.75
74.49
99.22
99.55
99.64
92.50
70.01
73.02
77.15
82.64
99.99
57.67
95.32
95.10
91.75
88.77
92.99
97.13
96.19
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.39
99.44
99.50
98.74
46.22
98.69
91.17
90.25
88.16
99.77
96.00
97.30
66.82
97.91
78.34
99.52
99.35
99.91
91.02
65.88
83.54
75.30
84.75
100.00
47.00
95.54
94.66
92.11
90.48
92.20
96.62
95.79
99.62
100.00
99.87
99.99
51.43
96.815

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.67
99.33
99.33
95.43
56.99
98.20
91.89
84.62
94.27
99.67
96.20
96.66
62.81
97.59
71.00
98.93
99.74
99.36
94.03
74.70
64.86
79.10
80.65
99.99
74.60
95.10
95.55
91.39
87.13
93.80
97.64
96.60
99.05
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = DT & Wd ∈ {a, an, another, each, every, that, this}) (Pos DT1)

B.10


33.33
0.00
79.30
80.00
71.93
74.85
83.40
42.42
80.64

14.29
85.54
0.00

81.85
80.53
70.58
65.05
58.56
82.68

p(FU)

24622

98.49

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.431
0.870
0.996
0.231
0.248
0.669
0.725
0.500
0.501
0.688
0.619
0.309
0.249
0.000
0.374
0.489
0.774
0.178
0.741
0.630
0.185
0.082

0.374

0.224
0.497
0.319
0.437
0.683
0.880
0.731

0.239
0.544

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.48
97.54 99.40
0.00 99.40
97.06
1.49 50.00
0.00 98.43
71.58 91.67
14.12 87.63
75.93 90.25
0.00 99.72
68.99 96.15
94.29 97.00
11.97 65.20
88.48 97.67
75.03
99.24
9.09 99.55
0.00 99.65
80.23 92.58
0.00 70.78
0.00 58.81
77.36
0.00 81.67
99.99
0.00 57.67
77.54 95.37
64.08 95.09
92.42 91.93
80.59 88.78
49.62 92.96
54.18 97.02
0.00 96.10
99.35
100.00
0.00 99.94
0.00 99.99

-0.07

-1.02

0.02
-0.06
-10.17
-1.62

0.62

-0.09
-1.22
1.04
-2.72
-3.14
-1.94

0.744

0.143

0.999
0.766
0.755
0.011

0.757

0.753
0.333
0.179
0.155
0.448
0.559

81.61

-0.41 0.179

RecU

-0.01

0.01
-0.00
0.00
-0.21
-3.35
0.00
0.01
0.09
-3.22
0.00
0.02
-0.02
-0.61
-0.06
-0.20
0.02
-0.01
0.01
0.01
0.43
-25.13
0.40

0.00

-0.05
0.03
0.15
-0.06
-0.01
-0.01
0.01

-0.64
-0.01

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.48
99.40
99.40
97.06
50.00
98.43
91.67
87.63
90.25
99.72
96.15
97.00
65.20
97.67
75.03
99.24
99.55
99.65
92.58
70.78
58.81
77.36
81.67
99.99
57.67
95.37
95.09
91.93
88.78
92.96
97.02
96.10
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.90
99.99
100.00
100.00
100.00
99.94
99.40
99.47
99.47
98.84
43.70
98.70
91.18
90.39
92.39
99.78
96.00
97.48
66.03
97.93
79.85
99.57
99.37
99.92
91.06
66.56
52.88
75.58
83.05
100.00
47.00
95.42
94.56
92.37
90.63
92.30
96.41
95.62
99.62
100.00
99.87
99.99
51.49
96.829

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.57
99.34
99.33
95.33
58.43
98.16
92.17
85.04
88.21
99.67
96.29
96.53
64.39
97.40
70.76
98.91
99.72
99.38
94.16
75.57
66.24
79.21
80.33
99.99
74.60
95.32
95.63
91.51
87.00
93.62
97.63
96.59
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = JJ & WdCs = [A-Z]*[a-z]* & Pos1Before = (?!ABSENT).*) (Pos JJP)

B.11

nnms[l] + dtnum[l] Mapping


25.00
0.00
77.42
87.10
81.40
86.14
82.41
56.00
86.55

0.00
89.68
0.00

33.33
83.15
80.00
80.04
76.87
66.15
89.44

p(FU)

24622

99.38

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.374
0.455
0.009
0.285
0.107
0.688
0.856
0.318
0.199
0.347
0.718
0.932
0.179
0.548
0.194
0.840
0.374
0.982
0.980
0.099
0.563
0.017
0.950

0.238
0.587
0.744
0.064
0.795
0.836
0.223
0.580

0.822
0.252

100.00
0.00 99.99
99.84
100.00
100.00
100.00
100.00
99.95
0.00 99.53
98.54 99.43
0.00 99.39
97.18
1.49 42.54
0.00 98.37
84.48 92.19
31.76 88.54
64.81 95.58
0.00 99.74
62.07 96.12
98.39 96.34
11.97 62.74
90.78 97.50
76.42
99.19
0.00 99.59
0.00 99.67
82.56 92.83
0.00 72.08
0.00 86.41
75.73
0.00 75.44
99.99
5.88 50.34
80.53 95.67
70.42 95.46
93.29 93.14
73.32 89.53
32.82 93.07
61.67 97.57
0.00 96.72
99.45
100.00
0.00 99.94
0.00 99.99

0.03

-0.06
12.33
2.61

-2.16
0.05
0.35
-0.20

0.41

-0.55
-0.46
-0.30
-0.65
-1.28
-2.94

0.178

0.596
0.298
0.332

0.002
0.389
0.972
0.386

0.659

0.585
0.519
0.631
0.298
0.624
0.032

84.41

-0.25 0.015

RecU

-0.01

-0.01
-0.01
0.01
0.01
-0.21
-1.10
0.00
-0.03
0.15
0.13
0.00
0.00
0.04
0.19
0.04
-0.08
-0.01
-0.00
0.00
0.03
0.30
0.42
0.02

2.78
0.01
0.00
-0.15
-0.03
0.01
-0.04
-0.05

0.03
0.01

PrecU

100.00
99.99
99.84
100.00
100.00
100.00
100.00
99.95
99.53
99.43
99.39
97.18
42.54
98.37
92.19
88.54
95.58
99.74
96.12
96.34
62.74
97.50
76.42
99.19
99.59
99.67
92.83
72.08
86.41
75.73
75.44
99.99
50.34
95.67
95.46
93.14
89.53
93.07
97.57
96.72
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.74
99.99
100.00
100.00
100.00
99.94
99.33
99.57
99.31
99.58
32.35
98.92
92.24
92.74
96.05
99.78
94.96
97.99
72.81
97.31
82.87
99.65
99.33
99.97
90.55
66.88
84.36
75.72
72.88
100.00
37.00
94.68
95.90
94.20
90.93
92.08
96.33
96.44
100.00
100.00
99.87
99.99
50.73
96.858

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.96
99.72
99.29
99.46
94.89
62.10
97.82
92.13
84.71
95.12
99.71
97.30
94.73
55.11
97.69
70.91
98.73
99.86
99.38
95.22
78.16
88.55
75.75
78.18
99.99
78.72
96.68
95.02
92.10
88.18
94.07
98.84
97.01
98.90
100.00
100.00
100.00

F(0.5)

Rec

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = DT & Wd ∈ {a, another, each, every, little, many, much}) (Pos DT1)
(Pos = DT & Wd ∈ {these, those}) (Pos DTP)
(Pos = NNS & Wd ∈ {acrobatics, adenoids, alms,...<66 omitted>...,tweezers, vicissitudes, waterworks}) (Pos NNSP)
(Pos = NN & Wd ∈ {abalone, abandon, abandonment,...<9266 omitted>...,zirconium, zoning, zoology}) (Pos NNM)
(Pos = JJ & Wd ∈ {countless, few, many, numerous, several}) (Pos JJP)


50.00
0.00
76.07
90.00
71.43
74.96
82.46
40.00
83.48

81.91
0.00

0.00
81.32
79.35
70.83
66.47
64.00
87.99

p(FU)

24622

98.51

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.070

0.285
0.111
0.638
0.447
0.417
0.436
0.021
0.556
0.805
0.366
0.026
0.120
0.118
0.467
0.688
0.536
0.627
0.629
0.934
0.547
0.534
0.807

0.347
0.361
0.845
0.231
0.180
0.839
0.331
0.872

0.007
0.013

100.00
0.00 99.99
99.87
99.99
100.00
100.00
100.00
99.97
0.00 99.51
96.92 99.38
0.00 99.40
97.16
1.49 49.45
0.00 98.41
73.70 91.37
10.59 87.45
74.07 93.76
0.00 99.73
65.12 96.02
94.01 96.92
17.95 65.19
87.00 97.70
75.03
99.20
0.00 99.55
0.00 99.64
81.59 92.57
0.00 70.76
0.00 76.48
77.00
0.00 81.67
99.99
0.00 57.67
77.89 95.32
63.15 94.97
91.99 91.78
80.59 88.59
48.85 92.93
53.32 97.04
0.00 95.97
99.33
100.00
0.00 99.94
0.00 99.99

-0.37

-1.51

-1.55

-2.91
-0.81
19.25
-0.66

-0.66

-0.17
-2.66
1.05
-1.56
-0.09
-0.53

0.089

0.183

0.374

0.005
0.102
0.302
0.324

0.670

0.817
0.299
0.214
0.149
0.872
0.948

81.11

-1.02 0.002

RecU

-0.04

0.03
-0.03
-0.01
-0.11
-4.41
-0.01
-0.32
-0.11
0.53
0.01
-0.11
-0.09
-0.63
-0.03
-0.20
-0.02
0.00
-0.01
-0.00
0.41
-2.63
-0.05

-0.11
-0.09
-0.02
-0.28
-0.04
0.01
-0.13
-0.02

-1.49
-0.06

PrecU

100.00
99.99
99.87
99.99
100.00
100.00
100.00
99.97
99.51
99.38
99.40
97.16
49.45
98.41
91.37
87.45
93.76
99.73
96.02
96.92
65.19
97.70
75.03
99.20
99.55
99.64
92.57
70.76
76.48
77.00
81.67
99.99
57.67
95.32
94.97
91.78
88.59
92.93
97.04
95.97
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.83
99.99
100.00
100.00
100.00
99.94
99.38
99.46
99.44
98.84
47.48
98.69
91.06
89.95
95.91
99.76
95.78
97.41
66.37
97.84
78.34
99.51
99.36
99.90
91.29
66.82
68.93
75.48
83.05
100.00
47.00
95.52
94.47
92.37
90.31
92.09
96.38
95.55
99.77
100.00
99.87
99.99
51.05
96.782

F chg

Rec

100.00
100.00
99.91
100.00
100.00
100.00
100.00
100.00
99.64
99.30
99.36
95.53
51.60
98.14
91.68
85.08
91.69
99.69
96.27
96.44
64.04
97.56
71.99
98.89
99.75
99.37
93.89
75.19
85.90
78.59
80.33
99.98
74.60
95.13
95.49
91.19
86.93
93.79
97.70
96.39
98.90
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

B.12


62.50
0.00
78.74
68.18
71.93
74.00
83.98
34.91
82.56

0.00
83.20
50.00

0.00
84.06
79.70
71.56
65.35
61.95
86.44

p(FU)

24622

98.40

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.535
0.146
0.389
0.374
0.794
0.178
0.871
0.814
0.844
0.077
0.579
0.519
0.558
0.680
0.071
0.747
0.203

0.890
0.412
0.527
0.530

0.374
0.374
0.023
0.134
0.070
0.046
0.234
0.012
0.457

0.489
0.465

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.49
97.25 99.38
0.00 99.41
97.21
7.46 52.25
0.00 98.41
72.91 91.65
17.65 87.59
75.93 93.42
0.00 99.75
69.47 96.11
93.68 97.03
15.81 65.15
87.34 97.72
74.19
99.22
0.00 99.54
0.00 99.64
82.56 92.57
14.29 70.99
0.00 74.07
76.57
0.00 81.67
99.99
0.00 57.32
74.91 95.22
61.74 94.96
92.86 92.01
78.77 88.53
53.44 92.88
58.67 97.15
0.00 96.01
99.35
100.00
0.00 99.94
0.00 99.99

-0.26

-0.40

-0.17
-0.01
4.74
-1.03

0.71

-0.61
-3.70
2.04
-3.47
3.45
4.71

0.221

0.563

0.856
0.996
0.965
0.181

0.474

0.444
0.058
0.023
0.074
0.728
0.175

81.71

-0.28 0.437

RecU

-0.01

0.02
-0.03
0.01
-0.05
1.01
-0.02
-0.01
0.05
0.18
0.04
-0.02
0.02
-0.68
-0.01
-1.31
-0.01
-0.01

-0.01
0.73
-5.71
-0.61

0.00
-0.61
-0.21
-0.11
0.23
-0.34
-0.10
0.12
-0.08

-0.45
-0.02

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.49
99.38
99.41
97.21
52.25
98.41
91.65
87.59
93.42
99.75
96.11
97.03
65.15
97.72
74.19
99.22
99.54
99.64
92.57
70.99
74.07
76.57
81.67
99.99
57.32
95.22
94.96
92.01
88.53
92.88
97.15
96.01
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.93
99.99
100.00
100.00
100.00
99.94
99.37
99.46
99.48
98.95
48.74
98.69
91.23
90.17
96.05
99.82
96.04
97.37
66.88
97.93
77.83
99.52
99.35
99.90
91.17
67.24
65.23
74.75
83.05
100.00
47.00
95.17
94.36
92.24
90.40
92.26
96.55
95.43
99.62
100.00
99.87
99.99
51.59
96.824

F chg

100.00
100.00
99.87
100.00
100.00
100.00
100.00
100.00
99.61
99.30
99.34
95.53
56.31
98.12
92.08
85.16
90.93
99.68
96.18
96.70
63.51
97.51
70.87
98.92
99.73
99.38
94.01
75.18
85.68
78.48
80.33
99.99
73.44
95.27
95.57
91.78
86.74
93.50
97.75
96.61
99.08
100.00
100.00
100.00

F(0.5)

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Rec

Prec

True

(Pos = VB & SibAllR = PRT) (Pos VBRP)
(Pos = VBG & SibAllR = PRT) (Pos VBGRP)
(Pos = VBD & SibAllR = PRT) (Pos VBDRP)
(Pos = VBN & SibAllR = PRT) (Pos VBNRP)
(Pos = VBP & SibAllR = PRT) (Pos VBPRP)
(Pos = VBZ & SibAllR = PRT) (Pos VBZRP)

vbrp[s] Mapping

B.13


20.00
0.00
80.78
76.00
70.69
73.78
82.74
30.34
83.40

0.00
84.31
0.00

0.00
81.32
80.42
69.32
65.89
57.85
87.13

p(FU)

24622

99.16

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.208

0.548
0.037
0.187
0.393
0.172
0.253
0.913
0.892
0.842
0.071
0.388
0.309
0.366
0.629
0.283
0.071
0.638
0.378
0.570
0.462
0.266
0.212

0.374
0.225
0.247
0.222
0.480
0.089
0.543
0.137
0.441

0.144
0.297

100.00
0.00 99.99
99.89
99.99
100.00
100.00
100.00
99.97
0.00 99.49
96.45 99.38
0.00 99.39
97.11
1.49 48.95
0.00 98.44
71.21 91.65
22.35 87.57
75.93 92.27
0.00 99.74
69.67 96.11
94.88 96.97
11.54 64.90
88.19 97.72
74.82
99.20
0.00 99.55
0.00 99.65
81.20 92.56
0.00 70.72
0.00 64.80
77.40
0.00 81.67
99.99
0.00 56.44
75.61 95.36
62.68 95.02
92.21 91.75
82.54 88.72
53.44 92.93
56.53 97.08
0.00 96.16
99.35
100.00
0.00 99.94
0.00 99.99

-0.30

-0.42
-0.91
-0.89

-0.16
-0.19
-19.54
-0.05

0.52

-1.69
-2.50
-0.08
-0.98
0.17
2.72

0.193

0.516
0.374
0.374

0.820
0.522
0.251
0.762

0.740

0.071
0.055
0.959
0.324
0.883
0.345

81.70

-0.30 0.213

RecU

-0.02

0.02
-0.03
-0.01
-0.16
-5.38
0.01
-0.01
0.03
-1.06
0.03
-0.02
-0.04
-1.07
-0.01
-0.47
-0.02
-0.00
0.01
-0.02
0.34
-17.50
0.47

0.00
-2.13
-0.07
-0.04
-0.04
-0.13
-0.04
0.06
0.07

-0.82
-0.02

PrecU

100.00
99.99
99.89
99.99
100.00
100.00
100.00
99.97
99.49
99.38
99.39
97.11
48.95
98.44
91.65
87.57
92.27
99.74
96.11
96.97
64.90
97.72
74.82
99.20
99.55
99.65
92.56
70.72
64.80
77.40
81.67
99.99
56.44
95.36
95.02
91.75
88.72
92.93
97.08
96.16
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.91
99.99
100.00
100.00
100.00
99.94
99.35
99.38
99.48
98.84
44.12
98.69
91.03
90.39
96.38
99.80
96.05
97.44
66.23
97.88
80.10
99.50
99.37
99.90
91.16
66.61
52.47
75.55
83.05
100.00
46.00
95.38
94.51
92.27
90.59
92.22
96.52
95.70
99.62
100.00
99.87
99.99
51.40
96.821

F chg

100.00
100.00
99.87
100.00
100.00
100.00
100.00
100.00
99.64
99.37
99.30
95.43
54.97
98.20
92.28
84.92
88.49
99.68
96.17
96.51
63.62
97.55
70.20
98.91
99.73
99.40
94.00
75.36
84.72
79.35
80.33
99.99
73.02
95.34
95.54
91.24
86.92
93.66
97.65
96.62
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = JJ & WdCs =re [A-Z]*[a-z]* & Pos1Before =re (?!ABSENT)(?)(?!LRB)(?! ).*) (Pos JJP)

B.14

inrp[l] Mapping


75.00
0.00
77.72
81.25
71.93
0.00
73.83
84.70
39.10
82.93

84.84
25.00

0.00
0.00
82.33
79.76
69.48
64.96
62.50
88.20

p(FU)

24622

98.36

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.115
0.099
0.441
0.399
0.903
0.885
0.071
0.975
0.346
0.422
0.648
0.494
0.186
0.733
0.687
0.285
0.783
0.374
0.685
0.420
0.292
0.756
0.374

0.374
0.673
0.717
0.062
0.290
0.796
0.031
0.497

0.050
0.077

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.53
96.78 99.36
0.00 99.41
96.85
4.48 52.09
0.00 98.43
72.17 91.55
30.59 87.56
75.93 90.76
0.00 99.73
68.86 96.12
92.66 97.00
26.07 65.01
88.14 97.72
75.06
99.24
0.00 99.55
0.00 99.65
80.23 92.55
14.29 70.00
0.00 67.65
77.16
0.00 80.99
99.99
0.00 55.90
76.84 95.40
62.91 95.09
93.40 91.86
83.38 88.77
53.44 92.99
57.60 97.12
0.00 96.04
99.35
100.00
0.00 99.94
0.00 99.99

-0.52

-1.55
27.49

-0.73
-0.08
50.54
-0.36

0.21

-0.27
-2.64
0.60
-1.32
3.88
4.40

0.104

0.144
0.374

0.315
0.838
0.194
0.580

0.856

0.660
0.040
0.321
0.039
0.648
0.225

81.52

-0.52 0.126

RecU

-0.00

0.06
-0.05
0.01
-0.42
0.70
0.00
-0.12
0.01
-2.68
0.01
-0.01
-0.02
-0.91
-0.01
-0.16
0.02
0.00
0.01
-0.03
-0.68
-13.87
0.14
-0.83

-3.07
-0.02
0.03
0.07
-0.08
0.02
0.10
-0.05

-0.75
-0.02

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.53
99.36
99.41
96.85
52.09
98.43
91.55
87.56
90.76
99.73
96.12
97.00
65.01
97.72
75.06
99.24
99.55
99.65
92.55
70.00
67.65
77.16
80.99
99.99
55.90
95.40
95.09
91.86
88.77
92.99
97.12
96.04
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.91
99.99
100.00
100.00
100.00
99.94
99.37
99.41
99.49
98.74
47.06
98.69
91.27
90.64
90.00
99.80
96.00
97.26
66.61
97.95
79.60
99.54
99.38
99.90
91.00
65.46
70.58
75.55
83.05
100.00
45.00
95.44
94.48
92.32
90.87
92.26
96.54
95.55
99.62
100.00
99.87
99.99
51.43
96.818

F chg

100.00
100.00
99.91
100.00
100.00
100.00
100.00
100.00
99.69
99.30
99.33
95.04
58.33
98.17
91.83
84.67
91.54
99.66
96.23
96.74
63.48
97.49
71.01
98.95
99.72
99.39
94.15
75.21
64.96
78.83
79.03
99.98
73.77
95.37
95.70
91.40
86.76
93.74
97.71
96.53
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = RP & Wd {about, across, along, ...<14 omitted>..., up, upon, with}) (Pos RPIN)
(Pos = IN & Wd {about, across, along, ...<14 omitted>..., up, upon, with}) (Pos INRP)
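As a rough illustration of how the two inrp rules above could be applied and later undone, the following Python sketch relabels (word, tag) pairs. It is not the implementation used in the thesis: the function names are hypothetical, lower-casing the word before lookup is an assumption, and the word set holds only the six words visible in the rule, because the remaining 14 are elided in the source.

```python
# Hypothetical sketch of the inrp mapping; not the thesis's own code.
# Only the six words printed in the rule are listed. The source elides
# 14 further words, which are deliberately not guessed here.
PARTICLE_PREPOSITIONS = {"about", "across", "along", "up", "upon", "with"}

# Forward relabelling and its inverse (each new tag has exactly one source tag).
FORWARD = {"RP": "RPIN", "IN": "INRP"}
REVERSE = {new: old for old, new in FORWARD.items()}


def apply_inrp(tagged):
    """Relabel RP/IN tokens whose word is in the shared word set."""
    mapped = []
    for word, tag in tagged:
        # Assumption: membership is tested case-insensitively.
        if tag in FORWARD and word.lower() in PARTICLE_PREPOSITIONS:
            tag = FORWARD[tag]
        mapped.append((word, tag))
    return mapped


def undo_inrp(tagged):
    """Recover the original tags by collapsing RPIN -> RP and INRP -> IN."""
    return [(word, REVERSE.get(tag, tag)) for word, tag in tagged]


sentence = [("He", "PRP"), ("looked", "VBD"), ("up", "RP"),
            ("in", "IN"), ("with", "IN")]
mapped = apply_inrp(sentence)
restored = undo_inrp(mapped)
```

Because each new tag comes from exactly one original tag, the reverse step needs no word list, which is what makes the original tags trivially recoverable.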

B.15 inrp[l] + insub[s] Mapping


0.00
0.00
79.67
88.89
88.89
80.97
82.41
53.85
87.49

0.00
90.11
0.00

33.33
86.02
80.65
82.23
78.51
76.36
89.85

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.374
0.766
0.616
0.926
0.957
0.836
0.464
0.700
0.758
0.577
0.374
0.633
0.747
0.552
0.861
0.056
0.374
0.374
0.374
0.843
0.721
0.909
0.195

0.671
0.398
0.324
0.859
0.787
0.986
0.236

0.922
0.725

100.00
0.00 99.99
99.84
100.00
100.00
100.00
100.00
99.95
0.00 99.54
98.51 99.42
0.00 99.37
97.38
0.00 43.21
0.00 98.38
81.95 92.21
28.24 88.36
59.26 95.48
0.00 99.73
67.82 96.11
98.16 96.29
11.97 62.72
90.35 97.47
75.83
99.20
0.00 99.59
0.00 99.67
81.20 92.81
0.00 71.73
0.00 86.01
75.52
0.00 75.44
99.99
5.88 48.98
78.77 95.66
70.42 95.48
91.13 93.24
73.46 89.57
32.06 93.06
64.45 97.61
0.00 96.63
99.45
100.00
0.00 99.94
0.00 99.99

-0.06
3.42
1.11

0.09
-0.06
-0.35
0.12

-0.24

-0.04
-0.08
0.04
0.48
1.61
-0.21

0.391
0.374
0.374

0.455
0.043
0.992
0.043

0.182

0.747
0.893
0.947
0.450
0.996
0.593

84.63

0.00 0.935

RecU

-0.01

-0.01
0.01
-0.00
0.00
0.00
0.46
0.01
-0.01
-0.06
0.03
-0.00
-0.00
-0.00
0.16
0.00
-0.85
0.01
-0.00
-0.01
0.01
-0.18
-0.03
-0.27

0.00
0.02
-0.04
0.01
0.00
-0.00
-0.15

0.02
-0.00

PrecU

100.00
99.99
99.84
100.00
100.00
100.00
100.00
99.95
99.54
99.42
99.37
97.38
43.21
98.38
92.21
88.36
95.48
99.73
96.11
96.29
62.72
97.47
75.83
99.20
99.59
99.67
92.81
71.73
86.01
75.52
75.44
99.99
48.98
95.66
95.48
93.24
89.57
93.06
97.61
96.63
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.74
99.99
100.00
100.00
100.00
99.94
99.39
99.56
99.34
99.58
32.77
98.78
92.06
92.21
95.77
99.79
95.09
97.97
73.19
97.21
83.38
99.65
99.33
99.96
90.63
66.93
84.16
75.96
72.88
100.00
36.00
94.60
95.96
94.14
90.89
92.20
96.40
96.88
100.00
100.00
99.87
99.99
50.73
96.851

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.96
99.69
99.28
99.41
95.27
63.41
97.98
92.36
84.81
95.19
99.68
97.15
94.67
54.87
97.72
69.54
98.75
99.85
99.38
95.08
77.27
87.96
75.08
78.18
99.99
76.60
96.74
95.00
92.35
88.28
93.95
98.84
96.37
98.90
100.00
100.00
100.00

F(0.5)

Rec

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = RP & Wd {about, across, along, ...<14 omitted>..., up, upon, with}) (Pos RPIN)
(Pos = IN & SibR = S) (Pos INSUB)
(Pos = IN & Par = SBAR) (Pos INSUB)
(Pos = IN & Wd {about, across, along, ...<14 omitted>..., up, upon, with}) (Pos INRP)


50.00
0.00
78.07
70.00
79.55
73.49
83.23
44.44
84.15

82.64
20.00
0.00
0.00
0.00
85.32
79.59
71.11
65.86
59.09
84.29

p(FU)

24622

0.00
98.72

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.508
0.899
0.230
0.251
0.680
0.938
0.268
0.785
0.981
0.330
0.123
0.757
0.210
0.498
0.462
0.417
0.379
0.374
0.317
0.907
0.926
0.821
0.823
0.374
0.374
0.639
0.112
0.918
0.809
0.601
0.200
0.641

0.131
0.597

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.47
97.45 99.41
0.00 99.37
96.80
1.49 51.21
0.00 98.43
72.12 91.61
24.71 87.63
64.81 93.13
0.00 99.73
67.84 96.08
94.86 97.01
17.09 65.11
87.55 97.75
74.79
99.24
0.00 99.55
0.00 99.65
81.20 92.51
14.29 70.52
0.00 77.94
77.10
0.00 81.97
99.99
0.00 55.95
75.44 95.40
63.15 95.17
91.88 91.80
80.03 88.87
49.62 92.90
56.32 97.12
0.00 96.17
99.35
100.00
0.00 99.94
0.00 99.99

0.01

-1.36

-3.31

-1.72
0.11
18.83
0.06

-0.46

0.46
-2.54
1.21
-2.36
-2.74
1.15

0.990

0.090

0.180

0.030
0.686
0.350
0.868

0.895

0.894
0.095
0.083
0.149
0.855
0.408

81.59

-0.44 0.321

RecU

-0.00

-0.01
0.00
-0.03
-0.48
-1.01
0.00
-0.06
0.09
-0.13
0.02
-0.05
-0.01
-0.74
0.02
-0.51
0.02
0.00
0.01
-0.07
0.07
-0.78
0.08
0.37
0.00
-2.98
-0.02
0.11
0.01
0.03
-0.08
0.10
0.08

-0.59
-0.01

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.47
99.41
99.37
96.80
51.21
98.43
91.61
87.63
93.13
99.73
96.08
97.01
65.11
97.75
74.79
99.24
99.55
99.65
92.51
70.52
77.94
77.10
81.97
99.99
55.95
95.40
95.17
91.80
88.87
92.90
97.12
96.17
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.91
99.99
100.00
100.00
100.00
99.94
99.39
99.49
99.47
98.53
44.54
98.63
91.35
90.78
91.73
99.80
95.96
97.42
66.16
97.88
79.60
99.54
99.37
99.90
91.03
65.93
82.51
76.03
84.75
100.00
47.00
95.41
94.66
92.02
90.53
92.21
96.59
96.23
99.62
100.00
99.87
99.99
51.51
96.832

F chg

Rec

100.00
100.00
99.91
100.00
100.00
100.00
100.00
100.00
99.56
99.33
99.28
95.13
60.23
98.23
91.86
84.69
94.58
99.67
96.20
96.61
64.10
97.62
70.54
98.94
99.74
99.39
94.04
75.80
73.85
78.21
79.37
99.99
69.12
95.39
95.68
91.58
87.26
93.59
97.66
96.11
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

B.16


50.00
0.00
77.80
70.97
71.93
74.23
83.82
43.09
83.50

84.19
0.00
0.00

83.72
86.42
70.45
65.32
63.96
80.60

p(FU)

24622

98.78

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.654
0.029
0.613
0.460
0.938
0.217
0.348
0.263
0.775
0.237
0.192
0.389
0.360
0.479
0.653
0.988
0.990
0.374
0.088
0.555
0.526
0.049

0.374
0.374
0.119
0.080
0.948
0.027
0.011
0.496
0.574

0.025
0.045

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.49
96.95 99.37
0.00 99.41
96.90
1.49 51.67
0.00 98.41
73.65 91.62
25.88 87.73
75.93 93.60
0.00 99.73
68.76 96.10
93.58 96.97
22.65 65.25
86.70 97.70
75.03
99.22
0.00 99.55
0.00 99.65
79.46 92.50
0.00 70.19
0.00 74.23
75.80
0.00 81.67
99.99
0.00 57.14
76.67 95.25
61.27 94.94
91.34 91.80
81.28 88.46
54.20 92.72
57.82 97.08
0.00 96.04
99.35
100.00
0.00 99.94
0.00 99.99

-0.23

-0.46

-0.55
-0.15
42.89
-0.81

-0.66

0.41
-0.76
0.43
-2.12
5.80
0.87

0.312

0.575

0.165
0.539
0.013
0.038

0.653

0.921
0.748
0.563
0.117
0.282
0.783

81.65

-0.35 0.098

RecU

0.01
-0.04
0.00
-0.37
-0.11
-0.02
-0.04
0.21
0.37
0.02
-0.03
-0.04
-0.53
-0.02
-0.20
0.00
-0.00
0.01
-0.08
-0.40
-5.50
-1.61

0.00
-0.91
-0.18
-0.13
0.00
-0.42
-0.27
0.06
-0.05

-1.11
-0.04

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.49
99.37
99.41
96.90
51.67
98.41
91.62
87.73
93.60
99.73
96.10
96.97
65.25
97.70
75.03
99.22
99.55
99.65
92.50
70.19
74.23
75.80
81.67
99.99
57.14
95.25
94.94
91.80
88.46
92.72
97.08
96.04
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.36
99.42
99.49
98.74
45.38
98.69
91.35
91.02
96.52
99.80
96.00
97.28
66.88
97.84
79.85
99.53
99.36
99.90
91.12
65.46
64.61
73.71
83.05
100.00
46.00
95.22
94.28
91.99
90.39
92.14
96.56
95.60
99.62
100.00
99.87
99.99
51.25
96.802

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.62
99.33
99.32
95.14
60.00
98.13
91.90
84.66
90.85
99.67
96.19
96.66
63.70
97.57
70.76
98.92
99.74
99.39
93.93
75.67
87.22
78.01
80.33
99.99
75.41
95.29
95.60
91.60
86.61
93.31
97.61
96.49
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = VB & SibAllR = PRT) (Pos VBRP)
(Pos = VBG & SibAllR = PRT) (Pos VBGRP)
and similarly for other VB.*
(Pos = RP & Wd {about, across, along, ...<14 omitted>..., up, upon, with}) (Pos RPIN)
(Pos = IN & Wd {about, across, along, ...<14 omitted>..., up, upon, with}) (Pos INRP)

B.17


100.00
0.00
77.94
60.61
71.93
73.93
84.64
32.82
80.93

82.74
0.00

0.00
83.49
83.54
70.16
66.47
53.17
87.25

p(FU)

24622

98.49

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.354
0.295
0.677
0.178
0.958
0.000
0.246
0.930
0.864
0.225
0.156
0.697
0.098
0.335
0.955
0.615
0.594
0.694
0.000
0.954
0.264
0.314
0.374

0.659
0.297
0.359
0.014
0.234
0.021
0.156

0.001
0.000

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.50
97.19 99.39
0.00 99.41
96.70
1.49 51.85
0.00 97.34
73.02 91.57
23.53 87.53
75.93 92.36
0.00 99.74
67.92 96.10
92.95 97.00
18.38 64.80
89.12 97.71
75.18
99.23
0.00 99.54
0.00 99.64
80.81 88.34
0.00 70.50
0.00 64.72
77.54
0.00 82.64
99.99
0.00 57.67
77.19 95.39
61.97 95.04
92.86 91.88
79.47 88.69
51.15 92.85
57.17 97.16
0.00 96.26
99.35
100.00
0.00 99.94
0.00 99.99

-0.25

-0.81
-2.77

-1.38
0.04
13.39
-1.09

-0.64

0.64
-1.51
0.90
-2.18
-5.99
3.49

0.305

0.204
0.796

0.054
0.867
0.811
0.068

0.594

0.695
0.305
0.371
0.075
0.431
0.208

81.49

-0.55 0.036

RecU

0.02
-0.02
0.01
-0.58
0.23
-1.10
-0.10
-0.02
-0.97
0.02
-0.03
-0.02
-1.22
-0.02
0.00
0.01
-0.01
-0.01
-4.57
0.04
-17.61
0.64
1.20

-0.04
-0.02
0.10
-0.16
-0.13
0.14
0.18

-5.05
-0.27

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.50
99.39
99.41
96.70
51.85
97.34
91.57
87.53
92.36
99.74
96.10
97.00
64.80
97.71
75.18
99.23
99.54
99.64
88.34
70.50
64.72
77.54
82.64
99.99
57.67
95.39
95.04
91.88
88.69
92.85
97.16
96.26
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.38
99.46
99.46
98.53
47.06
98.96
91.25
90.89
96.48
99.81
95.96
97.23
66.61
97.97
80.10
99.53
99.37
99.90
83.02
65.67
52.47
76.73
84.75
100.00
47.00
95.37
94.47
92.40
90.48
92.31
96.62
96.10
99.62
100.00
99.87
99.99
49.20
96.585

F chg

Rec

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.62
99.32
99.35
94.93
57.73
95.77
91.88
84.41
88.57
99.67
96.23
96.77
63.08
97.46
70.82
98.94
99.72
99.37
94.39
76.09
84.44
78.36
80.65
99.98
74.60
95.41
95.62
91.37
86.98
93.40
97.70
96.43
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

B.18


100.00
100.00 100.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 0.00 99.50
98.51 96.84 99.36
0.00 99.38
96.80
0.00 0.00 52.19
0.00 0.00 97.32
78.48 72.34 91.62
72.22 30.59 87.80
71.93 75.93 93.77
0.00 99.72
73.31 68.15 96.08
83.96 94.09 96.98
32.69 14.53 64.71
83.04 87.80 97.73
75.18
99.24
0.00 99.56
0.00 99.65
82.90 80.81 88.31
25.00 14.29 70.60
0.00 76.32
77.41
0.00 81.67
99.99
0.00 57.67
84.41 75.96 95.28
80.91 62.68 95.11
70.48 91.99 91.86
66.06 81.56 88.84
49.61 48.85 92.66
85.53 58.24 97.12
0.00 96.09
99.35
100.00
0.00 99.94
0.00 99.99

24622

81.52

p(FU)

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F chgU

F(0.5)U

0.374
0.373

0.525
0.061
0.095
0.182
0.756
0.000
0.489
0.378
0.803
0.688
0.091
0.124
0.212
0.794
0.905
0.071
0.180
0.374
0.000
0.827
0.531
0.410

0.374

0.040
0.352
0.130
0.985
0.044
0.113
0.994

0.000
0.000

RecU

0.01
-0.01

0.02
-0.05
-0.02
-0.48
0.89
-1.13
-0.04
0.29
0.55
0.00
-0.05
-0.03
-1.36
0.00
0.00
0.02
0.01
0.01
-4.60
0.19
-2.84
0.47

0.00

-0.15
0.05
0.07
-0.00
-0.34
0.10
-0.00

-5.00
-0.27

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.50
99.36
99.38
96.80
52.19
97.32
91.62
87.80
93.77
99.72
96.08
96.98
64.71
97.73
75.18
99.24
99.56
99.65
88.31
70.60
76.32
77.41
81.67
99.99
57.67
95.28
95.11
91.86
88.84
92.66
97.12
96.09
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.99
99.92
99.99
100.00
100.00
100.00
99.94
99.33
99.42
99.47
98.53
47.48
98.90
91.25
91.36
96.10
99.78
95.97
97.28
66.44
97.90
80.10
99.54
99.37
99.90
82.93
65.56
68.31
76.94
83.05
100.00
47.00
95.27
94.64
92.22
90.65
92.08
96.63
96.23
99.62
100.00
99.87
99.99
49.23
96.585

F chg

Rec

100.00
100.00
99.88
100.00
100.00
100.00
100.00
100.00
99.67
99.29
99.29
95.13
57.95
95.78
91.99
84.52
91.54
99.67
96.19
96.69
63.07
97.56
70.82
98.94
99.74
99.39
94.44
76.48
86.46
77.88
80.33
99.99
74.60
95.30
95.59
91.51
87.09
93.24
97.62
95.94
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

-0.42

-0.96
23.27

-1.60
0.19
-3.18
-0.48

-0.55

0.32
-2.24
0.76
-1.36
-11.24
3.81

0.048

0.381
0.374

0.013
0.329
0.845
0.283

0.760

0.731
0.283
0.332
0.430
0.016
0.334

-0.52 0.167

B.19 vbrp[ld] Mapping


0.00
79.64
88.89
84.21
81.02
82.48
53.85
86.95

0.00
90.30
0.00

33.33
86.07
81.77
82.31
78.04
75.44
89.33

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.178
0.208
0.421
0.374

0.022
0.329
0.925
0.915

0.327
0.099
0.767
0.329
0.374

0.374
0.374
0.815
0.379
0.625
0.858

0.096
0.642
0.075
0.251
0.054
0.099
0.639

0.913
0.899

100.00
0.00 99.99
99.85
100.00
100.00
100.00
100.00
99.96
0.00 99.53
98.51 99.43
0.00 99.38
97.33
0.00 43.02
0.00 98.37
82.12 92.23
28.24 88.40
59.26 95.46
0.00 99.74
67.69 96.12
98.12 96.29
11.97 62.63
90.35 97.46
76.34
99.20
0.00 99.59
0.00 99.68
81.20 92.80
0.00 71.93
0.00 86.13
75.71
0.00 75.44
99.99
5.88 48.98
79.12 95.68
69.48 95.45
91.13 93.23
73.46 89.53
32.82 93.09
62.74 97.58
0.00 96.78
99.45
100.00
0.00 99.94
0.00 99.99

0.02
3.42
-1.09

0.02
-0.02
-0.35
-0.19

-0.14

0.22
-0.16
0.09
0.19
2.93
-2.00

0.639
0.374
0.374

0.875
0.581
0.374
0.019

0.374

0.099
0.859
0.594
0.837
0.459
0.089

84.60

-0.03 0.130

RecU

-0.00
0.00
0.00
-0.05

0.00
0.01
-0.01
0.00

0.00
-0.01
0.03
-0.01
-0.18

0.00
0.01
-0.00
0.09
0.11
-0.02

0.02
-0.01
-0.05
-0.03
0.03
-0.03
0.01

-0.01
-0.00

PrecU

100.00
99.99
99.85
100.00
100.00
100.00
100.00
99.96
99.53
99.43
99.38
97.33
43.02
98.37
92.23
88.40
95.46
99.74
96.12
96.29
62.63
97.46
76.34
99.20
99.59
99.68
92.80
71.93
86.13
75.71
75.44
99.99
48.98
95.68
95.45
93.23
89.53
93.09
97.58
96.78
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.75
99.99
100.00
100.00
100.00
99.94
99.33
99.57
99.29
99.58
32.35
98.91
92.08
92.21
95.77
99.79
95.10
97.95
73.12
97.21
84.13
99.65
99.34
99.96
90.54
67.19
84.36
75.82
72.88
100.00
36.00
94.63
95.87
94.15
90.96
92.18
96.37
96.38
100.00
100.00
99.87
99.99
50.71
96.852

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.98
99.73
99.28
99.46
95.18
64.17
97.84
92.37
84.90
95.15
99.69
97.16
94.68
54.78
97.71
69.87
98.74
99.85
99.40
95.18
77.39
87.98
75.59
78.18
99.99
76.60
96.74
95.04
92.34
88.15
94.01
98.82
97.19
98.90
100.00
100.00
100.00

F(0.5)

Rec

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = VB & Wd {bail, beef, blot, ...<76 omitted>..., smooth, soak, speed}) (Pos VBRP)
(Pos = VBG & Wd {bailing, beefing, blotting, ...<76 omitted>..., smoothing, soaking, speeding}) (Pos VBGRP)
and similarly for other VB.*


100.00
100.00 100.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.51
98.84 97.28 99.40
0.00 99.42
97.11
0.00 50.46
0.00 0.00 98.43
78.71 73.37 91.68
70.00 16.47 87.53
71.93 75.93 90.29
0.00 99.73
73.99 68.76 96.11
84.13 93.71 97.01
38.27 13.25 64.75
81.60 88.61 97.72
74.85
99.24
0.00 99.56
0.00 99.65
83.63 82.17 92.65
0.00 0.00 71.01
0.00 58.76
0.00
77.33
0.00 81.67
99.99
0.00 57.67
85.08 74.04 95.40
81.73 61.97 94.99
69.75 92.32 91.83
65.55 79.47 88.60
53.03 53.44 92.80
87.58 55.89 97.13
0.00 96.10
99.35
100.00
0.00 99.94
0.00 99.99

24622

81.69

p(FU)

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F chgU

F(0.5)U

0.374
0.374

0.278
0.884
0.104
0.208
0.268
0.870
0.340
0.924
0.507
0.581
0.547
0.792
0.244
0.792
0.346
0.097
0.493
0.178
0.170
0.405
0.186
0.661

0.374

0.537
0.061
0.484
0.134
0.012
0.192
0.917

0.215
0.303

RecU

0.01
-0.01

0.03
-0.00
0.02
-0.16
-2.46
0.00
0.03
-0.02
-3.18
0.01
-0.02
-0.01
-1.29
-0.01
-0.43
0.02
0.01
0.01
0.08
0.76
-25.20
0.37

0.00

-0.03
-0.07
0.04
-0.26
-0.19
0.11
0.01

-0.69
-0.02

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.51
99.40
99.42
97.11
50.46
98.43
91.68
87.53
90.29
99.73
96.11
97.01
64.75
97.72
74.85
99.24
99.56
99.65
92.65
71.01
58.76
77.33
81.67
99.99
57.67
95.40
94.99
91.83
88.60
92.80
97.13
96.10
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.99
99.93
99.99
100.00
100.00
100.00
99.94
99.41
99.46
99.49
98.84
46.22
98.71
91.32
90.06
92.58
99.79
96.01
97.34
65.75
97.97
79.09
99.55
99.37
99.91
91.26
67.30
52.47
76.35
83.05
100.00
47.00
95.26
94.43
92.11
90.53
92.18
96.50
95.49
99.62
100.00
99.87
99.99
51.46
96.823

F chg

Rec

100.00
100.00
99.87
100.00
100.00
100.00
100.00
100.00
99.61
99.35
99.35
95.43
55.56
98.15
92.05
85.14
88.11
99.67
96.21
96.68
63.78
97.46
71.04
98.94
99.74
99.39
94.07
75.15
66.75
78.33
80.33
99.99
74.60
95.54
95.56
91.55
86.76
93.41
97.77
96.71
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

-0.03

-0.09
-23.51

-0.71
0.11
-5.28
-0.94

0.73

-0.67
-2.43
0.32
-2.92
-4.02
2.22

0.844

0.915
0.520

0.062
0.775
0.853
0.248

0.492

0.426
0.214
0.545
0.064
0.369
0.612

-0.31 0.172

B.20


100.00
100.00 100.00 99.99
99.89
99.99
100.00
100.00
100.00
99.97
0.00 99.48
98.87 97.30 99.40
0.00 99.42
97.26
33.33 1.49 51.67
0.00 0.00 98.42
77.68 72.78 91.50
75.00 21.18 87.85
76.47 72.22 91.52
0.00 99.73
73.89 68.84 96.06
84.12 93.40 96.98
36.55 22.65 64.83
82.51 87.59 97.68
0.00
74.11
99.17
0.00 0.00 99.55
0.00 99.65
84.91 78.49 92.45
0.00 0.00 71.25
0.00 68.02
77.21
0.00 0.00 80.99
99.99
0.00 57.67
81.59 75.44 95.00
83.33 62.21 95.02
70.28 91.88 91.70
65.65 80.87 88.65
69.70 52.67 92.76
84.18 56.96 97.02
0.00 95.86
99.33
100.00
0.00 99.94
0.00 99.99

24622

81.52

p(FU)

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F chgU

F(0.5)U

0.374
0.374

0.646
0.858
0.095
0.954
0.994
0.487
0.103
0.253
0.715
0.680
0.021
0.305
0.267
0.255
0.222
0.074
0.473
0.374
0.114
0.033
0.382
0.686
0.374
0.374

0.010
0.368
0.293
0.078
0.019
0.867
0.105
0.374

0.066
0.046

RecU

0.01
-0.01

0.00
-0.00
0.02
0.00
-0.11
-0.01
-0.17
0.35
-1.87
0.01
-0.07
-0.03
-1.17
-0.04
-1.42
-0.05
-0.01
0.01
-0.13
1.11
-13.41
0.22
-0.83
0.00

-0.44
-0.04
-0.11
-0.21
-0.22
-0.01
-0.24
-0.02

-1.31
-0.06

PrecU

100.00
99.99
99.89
99.99
100.00
100.00
100.00
99.97
99.48
99.40
99.42
97.26
51.67
98.42
91.50
87.85
91.52
99.73
96.06
96.98
64.83
97.68
74.11
99.17
99.55
99.65
92.45
71.25
68.02
77.21
80.99
99.99
57.67
95.00
95.02
91.70
88.65
92.76
97.02
95.86
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.99
99.88
99.99
100.00
100.00
100.00
99.94
99.40
99.48
99.48
98.95
45.38
98.69
91.37
90.89
92.20
99.80
95.95
97.28
66.58
97.89
78.59
99.52
99.36
99.90
90.82
66.88
65.64
75.72
83.05
100.00
47.00
95.07
94.30
91.96
90.61
91.93
96.46
95.17
99.58
100.00
99.87
99.99
51.14
96.782

F chg

Rec

100.00
100.00
99.91
100.00
100.00
100.00
100.00
100.00
99.56
99.33
99.36
95.63
60.00
98.15
91.64
85.02
90.84
99.65
96.18
96.69
63.18
97.47
70.11
98.83
99.73
99.39
94.14
76.24
70.58
78.76
79.03
99.99
74.60
94.94
95.75
91.43
86.77
93.60
97.58
96.56
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

0.00

-1.14

0.56

-0.72
-0.06
34.60
-0.93

-0.88

-1.65
-1.40
0.55
-2.08
8.18
1.78

0.993

0.286

0.954

0.021
0.909
0.645
0.172

0.146

0.207
0.443
0.411
0.083
0.036
0.594

-0.51 0.229

B.21


24622

0.00
98.66 97.07
0.00
100.00
0.00
78.00
66.67
69.64
74.54
83.73
33.77
83.37

1.49
0.00
73.00
30.59
72.22
0.00
68.38
93.50
22.22
86.91

0.00

0.00
0.00
84.22 79.65
33.33 14.29
0.00
0.00
100.00 100.00
0.00
0.00
80.56 76.32
82.12 63.62
70.38 92.32
66.25 81.98
65.69 51.15
88.89 58.24
0.00

0.00
0.00
81.59

p(FU)

0.00

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.339
0.358
0.188
0.251
0.512
0.942
0.022
0.715
0.352
0.418
0.146
0.427
0.247
0.772
0.771
0.131
0.994
0.374
0.004
0.884
0.373
0.724
0.374

0.034
0.767
0.846
0.173
0.176
0.135
0.938
0.374

0.054
0.052

RecU

0.02
-0.02
-0.03
-0.48
-1.31
-0.00
-0.09
0.09
-1.43
0.01
-0.04
-0.03
-1.30
-0.02
-0.16
0.04
0.00
0.01
-0.09
0.10
-9.75
-0.24
1.20

-0.21
0.02
-0.03
-0.08
-0.16
0.15
0.02
-0.02

-1.08
-0.04

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.50
99.39
99.37
96.80
51.05
98.43
91.57
87.63
91.93
99.73
96.09
96.98
64.75
97.71
75.06
99.26
99.55
99.65
92.49
70.54
70.89
76.85
82.64
99.99
57.67
95.23
95.08
91.77
88.77
92.82
97.17
96.10
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.36
99.45
99.47
98.53
45.80
98.64
91.38
90.89
92.25
99.80
95.97
97.30
66.78
97.83
80.35
99.53
99.38
99.90
90.98
65.93
70.16
75.20
84.75
100.00
47.00
95.26
94.58
92.12
90.49
92.13
96.66
96.14
99.58
100.00
99.87
99.99
51.26
96.806

F chg

Rec

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.64
99.33
99.28
95.13
57.67
98.22
91.77
84.60
91.60
99.67
96.22
96.66
62.84
97.59
70.42
98.99
99.72
99.39
94.05
75.85
71.64
78.58
80.65
99.98
74.60
95.20
95.58
91.42
87.11
93.52
97.69
96.06
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.50
99.39
99.37
96.80
51.05
98.43
91.57
87.63
91.93
99.73
96.09
96.98
64.75
97.71
75.06
99.26
99.55
99.65
92.49
70.54
70.89
76.85
82.64
99.99
57.67
95.23
95.08
91.77
88.77
92.82
97.17
96.10
99.33
100.00
99.94
99.99

-0.22

-0.79
20.29
-4.01

-0.65
-0.25
28.99
-0.78

-0.51

-1.67
-0.77
0.83
-0.98
3.69
5.43

0.308

0.217
0.471
0.374

0.051
0.648
0.448
0.493

0.699

0.185
0.427
0.075
0.169
0.366
0.182

-0.44 0.262

B.22


75.00
0.00
78.85
78.79
71.93
75.19
83.02
39.24
83.34

84.93
25.00

81.63
83.12
70.95
66.48
61.74
87.33

p(FU)

24622

0.00
98.28

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.381
0.450
0.303
0.793
0.191
0.805
0.112
0.951
0.987
0.979
0.991
0.041
0.393
0.246
0.231
0.463
0.996
0.374
0.081
0.831
0.950
0.656

0.374

0.189
0.518
0.387
0.726
0.506
0.074
0.189

0.233
0.452

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.49
97.22 99.38
0.00 99.39
97.16
4.48 49.78
0.00 98.42
72.04 91.56
30.59 87.57
75.93 93.11
0.00 99.72
68.81 96.13
94.80 96.96
13.25 65.03
88.02 97.75
75.44
99.24
0.00 99.55
0.00 99.64
80.81 92.45
14.29 70.55
0.00 78.17
77.14
0.00 81.67
99.99
0.00 57.67
77.19 95.30
62.44 95.12
91.99 91.85
82.54 88.87
54.20 92.95
56.10 97.14
0.00 96.24
99.35
100.00
0.00 99.94
0.00 99.99

-0.34

-0.95
26.40

0.10
-0.06
-4.67
-0.18

0.64

-0.45
-1.30
1.14
-0.49
4.08
2.34

0.272

0.096
0.384

0.826
0.781
0.806
0.593

0.572

0.743
0.402
0.090
0.716
0.633
0.314

81.85

-0.11 0.640

RecU

-0.00

0.02
-0.02
-0.01
-0.11
-3.78
-0.01
-0.10
0.02
-0.16
0.00
-0.00
-0.05
-0.88
0.02
0.35
0.02
-0.00
-0.01
-0.14
0.10
-0.49
0.13

0.00

-0.13
0.06
0.06
0.04
-0.02
0.12
0.15

-0.64
-0.02

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.49
99.38
99.39
97.16
49.78
98.42
91.56
87.57
93.11
99.72
96.13
96.96
65.03
97.75
75.44
99.24
99.55
99.64
92.45
70.55
78.17
77.14
81.67
99.99
57.67
95.30
95.12
91.85
88.87
92.95
97.14
96.24
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.91
99.99
100.00
100.00
100.00
99.94
99.37
99.46
99.50
98.84
47.06
98.62
91.19
90.47
92.02
99.77
96.00
97.34
66.13
97.92
80.86
99.55
99.36
99.89
90.97
66.19
82.51
76.14
83.05
100.00
47.00
95.43
94.62
92.24
90.63
92.31
96.60
96.19
99.62
100.00
99.87
99.99
51.49
96.827

F chg

100.00
100.00
99.91
100.00
100.00
100.00
100.00
100.00
99.62
99.30
99.28
95.53
52.83
98.23
91.94
84.85
94.23
99.67
96.26
96.59
63.96
97.58
70.70
98.93
99.74
99.38
93.98
75.51
74.26
78.17
80.33
99.99
74.60
95.18
95.61
91.46
87.19
93.60
97.69
96.29
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = IN & SibR = S) (Pos INSUB)
(Pos = IN & Par = SBAR) (Pos INSUB)
(Pos = IN & Wd {after, before, since, until} & SibR =re (?!S)) (Pos INTMP)
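The three rules above condition on parse-tree context as well as on the word itself. The Python sketch below shows one way such rules might be evaluated. It rests on several assumptions that the source does not confirm: the `Token` record and its field names are hypothetical, rules are tried in the order listed with the first match winning, and the pattern `(?!S)` is applied as an anchored match against the right sibling's label.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical token record; field names are illustrative only."""
    word: str
    pos: str
    parent: str     # label of the parent constituent (Par)
    right_sib: str  # label of the right sibling (SibR), "" if there is none


TEMPORAL_WORDS = {"after", "before", "since", "until"}
# Negative lookahead: succeeds only when the label does not start with "S".
NOT_S = re.compile(r"(?!S)")


def map_preposition(tok):
    """Return the mapped tag for a token, leaving non-IN tags untouched."""
    if tok.pos != "IN":
        return tok.pos
    if tok.right_sib == "S":        # (Pos = IN & SibR = S)
        return "INSUB"
    if tok.parent == "SBAR":        # (Pos = IN & Par = SBAR)
        return "INSUB"
    if tok.word.lower() in TEMPORAL_WORDS and NOT_S.match(tok.right_sib):
        return "INTMP"              # temporal word not followed by a clause
    return "IN"


examples = [
    Token("because", "IN", "SBAR", "S"),
    Token("after", "IN", "PP", "NP"),
    Token("after", "IN", "SBAR", "S"),
    Token("of", "IN", "PP", "NP"),
    Token("run", "VB", "VP", "NP"),
]
mapped_tags = [map_preposition(t) for t in examples]
```

Under these assumptions, "after" followed by a noun phrase becomes INTMP, while "after" introducing a clause becomes INSUB. Both new tags collapse back to IN when the mapping is reversed.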

B.23


25.00
0.00
78.07
63.89
71.43
73.59
83.59
31.78
80.97

0.00
83.50
0.00
0.00
0.00
81.14
82.30
72.11
67.93
68.60
89.12

p(FU)

24622

0.00
98.84

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.316
0.427
0.283
0.266
0.061
0.789
0.064
0.827
0.975
0.979
0.268
0.323
0.215
0.123
0.813
0.831
0.961
0.374
0.440
0.887
0.916
0.672

0.014
0.026
0.734
0.018
0.007
0.278
0.018
0.374

0.003
0.008

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.51
97.04 99.39
0.00 99.42
97.05
1.49 50.67
0.00 98.42
73.98 91.50
27.06 87.50
74.07 92.95
0.00 99.72
69.22 96.09
94.52 96.98
14.53 64.66
87.89 97.65
75.09
99.22
0.00 99.55
0.00 99.65
82.36 92.54
0.00 70.53
0.00 77.68
77.21
0.00 81.67
99.99
0.00 57.67
74.74 95.00
58.92 94.88
85.06 91.74
75.14 88.21
45.04 92.36
54.39 97.09
0.00 95.91
99.33
100.00
0.00 99.94
0.00 99.99

-0.15

-0.06
9.05
-1.55

-0.63
0.17
-4.03
-1.73

0.77

-2.38
-4.95
-1.46
-3.58
-1.96
1.20

0.398

0.933
0.622
0.374

0.290
0.646
0.789
0.019

0.536

0.166
0.005
0.526
0.082
0.487
0.758

81.52

-0.52 0.189

RecU

-0.01

0.03
-0.02
0.02
-0.21
-2.06
-0.00
-0.17
-0.05
-0.33
0.00
-0.04
-0.04
-1.43
-0.07
-0.12
-0.00
-0.00
0.01
-0.03
0.07
-1.11
0.21

-0.45
-0.19
-0.06
-0.70
-0.65
0.07
-0.19
-0.02

-1.79
-0.07

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.51
99.39
99.42
97.05
50.67
98.42
91.50
87.50
92.95
99.72
96.09
96.98
64.66
97.65
75.09
99.22
99.55
99.65
92.54
70.53
77.68
77.21
81.67
99.99
57.67
95.00
94.88
91.74
88.21
92.36
97.09
95.91
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.90
99.99
100.00
100.00
100.00
99.94
99.37
99.44
99.50
98.74
47.90
98.67
91.53
90.58
92.02
99.80
96.10
97.34
66.34
98.01
80.10
99.56
99.35
99.90
91.19
66.19
81.28
75.89
83.05
100.00
47.00
94.96
94.28
91.50
88.87
91.14
96.10
95.36
99.58
100.00
99.87
99.99
50.89
96.776

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.64
99.34
99.34
95.43
53.77
98.18
91.48
84.62
93.91
99.64
96.08
96.62
63.07
97.30
70.67
98.87
99.75
99.39
93.94
75.46
74.39
78.57
80.33
99.98
74.60
95.04
95.48
91.98
87.56
93.62
98.11
96.46
99.08
100.00
100.00
100.00

F(0.5)

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Rec

Prec

True

(Pos = VB & SibAllR = NP) (Pos VBTR)
(Pos = VBG & SibAllR = NP) (Pos VBGTR)
(Pos = VBD & SibAllR = NP) (Pos VBDTR)
(Pos = VBN & SibAllR = NP) (Pos VBNTR)
(Pos = VBP & SibAllR = NP) (Pos VBPTR)
(Pos = VBZ & SibAllR = NP) (Pos VBZTR)
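The six verb rules in this section share one pattern: a verb tag gains the suffix TR when an NP appears among its right siblings. The Python sketch below illustrates that pattern under stated assumptions. The function names are hypothetical, and `SibAllR = NP` is read here as "some right sibling is labelled NP", which the source does not spell out.

```python
# Hypothetical sketch of the vbtr rules; not taken from the thesis.
VERB_TAGS = {"VB", "VBG", "VBD", "VBN", "VBP", "VBZ"}


def apply_vbtr(tag, right_siblings):
    """Append TR to a verb tag when any right sibling is an NP.

    `right_siblings` is a list of constituent labels to the right of the
    verb under the same parent (an assumed reading of SibAllR).
    """
    if tag in VERB_TAGS and "NP" in right_siblings:
        return tag + "TR"
    return tag


def undo_vbtr(tag):
    """Strip the TR suffix, but only from tags this mapping can produce."""
    if tag.endswith("TR") and tag[:-2] in VERB_TAGS:
        return tag[:-2]
    return tag


transitive = apply_vbtr("VBD", ["NP", "PP"])  # verb followed by an object NP
intransitive = apply_vbtr("VBD", ["PP"])      # no NP sibling, tag unchanged
```

Reversal needs no tree information: stripping the suffix restores the original tag.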

B.24 vbtr[s] Mapping


0.00
79.48
88.46
88.89
80.45
82.47
53.85
87.45

0.00
90.30
0.00

33.33
86.45
80.32
82.14
78.36
73.53
89.47

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.298
0.700
0.119
0.374
0.437
0.040
0.005
0.707
0.374
0.178
0.772
0.410
0.718
0.297
0.552

0.374
0.374
0.847
0.697
0.374
0.247

0.374
0.069
0.997
0.197
0.960
0.002
0.484
0.927

0.791
0.752

100.00
0.00 99.99
99.85
100.00
100.00
100.00
100.00
99.96
0.00 99.54
98.48 99.42
0.00 99.39
97.33
0.00 41.69
0.00 98.38
82.19 92.20
27.06 88.38
59.26 95.43
0.00 99.75
67.87 96.12
97.98 96.29
11.97 62.64
90.35 97.47
76.68
99.20
0.00 99.59
0.00 99.68
81.20 92.80
0.00 71.92
0.00 85.92
75.63
0.00 75.44
99.99
5.88 48.65
70.53 95.59
70.89 95.46
91.56 93.31
73.32 89.56
38.17 93.10
65.52 97.61
0.00 96.77
99.45
100.00
0.00 99.94
0.00 99.99

-0.02

-0.04
0.00
1.11

-0.16
-0.10
-0.35
0.10

-0.14

-5.57
0.08
0.20
0.28
13.07
0.57

0.374

0.684
0.744
0.374

0.160
0.211
0.374
0.350

0.233

0.033
0.864
0.237
0.216
0.005
0.318

84.51

-0.13 0.097

RecU

0.01
-0.00
0.01
-0.05
-3.08
0.02
-0.02
-0.03
-0.03
0.01
0.00
-0.00
0.03
0.01
0.25

-0.00
0.01
-0.00
0.08
-0.14
-0.12

-0.68
-0.07
0.00
0.03
0.00
0.05
0.01
-0.00

-0.03
0.00

PrecU

100.00
99.99
99.85
100.00
100.00
100.00
100.00
99.96
99.54
99.42
99.39
97.33
41.69
98.38
92.20
88.38
95.43
99.75
96.12
96.29
62.64
97.47
76.68
99.20
99.59
99.68
92.80
71.92
85.92
75.63
75.44
99.99
48.65
95.59
95.46
93.31
89.56
93.10
97.61
96.77
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.75
99.99
100.00
100.00
100.00
99.94
99.32
99.56
99.32
99.58
31.09
98.92
92.05
92.21
95.68
99.79
95.11
97.95
73.19
97.21
84.89
99.65
99.33
99.96
90.50
67.03
84.16
75.62
72.88
100.00
36.00
94.41
95.91
94.23
91.00
92.24
96.43
96.56
100.00
100.00
99.87
99.99
50.70
96.854

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.98
99.76
99.28
99.46
95.18
63.25
97.84
92.35
84.85
95.19
99.71
97.15
94.69
54.74
97.74
69.92
98.74
99.85
99.40
95.21
77.58
87.77
75.64
78.18
99.99
75.00
96.80
95.01
92.41
88.17
93.98
98.83
96.97
98.90
100.00
100.00
100.00

F(0.5)

Rec

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = TO & Wd = to & SibR = NP) (Pos IN)


0.00
9.52
78.14
78.57
73.21
73.79
84.55
37.31
84.23

0.00
84.80
0.00

84.33
84.23
70.45
65.73
53.17
86.82

p(FU)

24622

98.69

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.208

0.644
0.450
0.476
0.216
0.090
0.334
0.549
0.892
0.840
0.990
0.456
0.449
0.364
0.340
0.177
0.404
0.989
0.178
0.464
0.586
0.260
0.438

0.374

0.119
0.062
0.061
0.791
0.213
0.029
0.151

0.329
0.677

100.00
0.00 99.99
99.89
99.99
100.00
100.00
100.00
99.97
0.00 99.48
97.16 99.39
0.00 99.41
97.00
0.00 49.64
9.09 98.41
73.26 91.63
12.94 87.53
75.93 92.25
0.00 99.72
68.71 96.12
93.15 96.99
32.05 65.04
87.63 97.77
74.35
99.24
0.00 99.55
0.00 99.65
82.17 92.61
0.00 70.72
0.00 64.37
77.30
0.00 81.67
99.99
0.00 57.67
74.56 95.36
62.68 95.15
92.64 91.91
81.98 88.82
51.15 92.89
57.82 97.17
0.00 96.01
99.35
100.00
0.00 99.94
0.00 99.99

-0.16

-0.52

0.91

-0.87
0.09
65.95
0.15

1.42

-0.71
-0.53
1.04
-1.41
-5.99
3.98

0.173

0.517

0.374

0.046
0.839
0.142
0.936

0.367

0.447
0.605
0.233
0.060
0.309
0.130

81.72

-0.27 0.377

RecU

-0.02

0.00
-0.02
0.01
-0.26
-4.05
-0.01
-0.03
-0.03
-1.09
0.00
-0.01
-0.03
-0.85
0.04
-1.10
0.02
0.00
0.01
0.04
0.36
-18.06
0.33

-0.00

-0.07
0.09
0.13
-0.02
-0.08
0.15
-0.08

-0.48
-0.01

PrecU

100.00
99.99
99.89
99.99
100.00
100.00
100.00
99.97
99.48
99.39
99.41
97.00
49.64
98.41
91.63
87.53
92.25
99.72
96.12
96.99
65.04
97.77
74.35
99.24
99.55
99.65
92.61
70.72
64.37
77.30
81.67
99.99
57.67
95.36
95.15
91.91
88.82
92.89
97.17
96.01
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.90
99.99
100.00
100.00
100.00
99.94
99.41
99.45
99.48
98.74
43.28
98.67
91.26
90.39
96.38
99.77
96.04
97.27
67.74
97.91
78.84
99.52
99.38
99.90
91.19
66.19
51.85
75.76
83.05
99.99
47.00
95.36
94.62
92.35
90.59
92.29
96.58
95.62
99.62
100.00
99.87
99.99
51.57
96.833

F chg

Rec

100.00
100.00
99.88
100.00
100.00
100.00
100.00
100.00
99.56
99.34
99.34
95.33
58.19
98.15
92.00
84.84
88.45
99.67
96.19
96.71
62.55
97.63
70.34
98.97
99.72
99.40
94.08
75.92
84.85
78.91
80.33
99.99
74.60
95.36
95.67
91.48
87.11
93.50
97.77
96.41
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

B.25 to:in Mapping


0.00
79.47
88.46
88.89
80.47
82.46
53.85
87.45

0.00
90.30
0.00

33.33
86.64
80.32
82.12
78.36
73.53
89.47

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.297
0.991
0.171
0.374
0.500
0.058
0.062
0.228
0.374
0.178
0.457
0.120
0.718
0.320
0.552

0.374
0.374
0.913
0.797
0.374
0.177

0.374
0.111
0.935
0.235
0.956
0.012
0.602
0.785

0.973
0.659

100.00
0.00 99.99
99.85
100.00
100.00
100.00
100.00
99.96
0.00 99.54
98.48 99.42
0.00 99.39
97.33
0.00 41.81
0.00 98.38
82.21 92.20
27.06 88.32
59.26 95.43
0.00 99.75
67.87 96.12
97.98 96.29
11.97 62.64
90.35 97.47
76.68
99.20
0.00 99.59
0.00 99.68
81.20 92.80
0.00 71.83
0.00 85.92
75.63
0.00 75.44
99.99
5.88 48.65
70.53 95.60
70.89 95.46
91.45 93.31
73.32 89.56
38.17 93.11
65.52 97.61
0.00 96.78
99.45
100.00
0.00 99.94
0.00 99.99

-0.02

-0.03
0.00
1.11

-0.15
-0.11
-0.35
0.10

-0.14

-5.48
0.08
0.13
0.28
13.07
0.57

0.374

0.707
0.744
0.374

0.182
0.130
0.374
0.350

0.233

0.034
0.864
0.498
0.216
0.005
0.318

84.51

-0.13 0.077

RecU

0.01
-0.00
0.01
-0.05
-2.81
0.02
-0.02
-0.10
-0.03
0.01
0.01
-0.00
0.03
0.01
0.25

-0.00
0.01
-0.00
-0.04
-0.14
-0.13

-0.68
-0.06
-0.00
0.03
0.00
0.05
0.00
0.01

0.00
0.00

PrecU

100.00
99.99
99.85
100.00
100.00
100.00
100.00
99.96
99.54
99.42
99.39
97.33
41.81
98.38
92.20
88.32
95.43
99.75
96.12
96.29
62.64
97.47
76.68
99.20
99.59
99.68
92.80
71.83
85.92
75.63
75.44
99.99
48.65
95.60
95.46
93.31
89.56
93.11
97.61
96.78
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.75
99.99
100.00
100.00
100.00
99.94
99.33
99.56
99.32
99.58
31.09
98.92
92.05
92.13
95.68
99.79
95.12
97.95
73.19
97.21
84.89
99.65
99.33
99.96
90.51
66.93
84.16
75.65
72.88
100.00
36.00
94.41
95.91
94.22
91.00
92.25
96.43
96.54
100.00
100.00
99.87
99.99
50.72
96.855

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.98
99.76
99.28
99.46
95.18
63.79
97.85
92.36
84.82
95.19
99.71
97.15
94.68
54.74
97.74
69.92
98.74
99.85
99.40
95.21
77.51
87.77
75.60
78.18
99.99
75.00
96.81
95.01
92.41
88.17
93.98
98.83
97.01
98.90
100.00
100.00
100.00

F(0.5)

Rec

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = TO & Wd = to & SibR = NP) (Pos IN)


(Pos = TO & Par = QP & Wd = to) (Pos IN)


57.14
10.53
78.59
75.00
73.21
73.80
84.04
36.59
83.53

0.00
83.01
0.00

84.57
80.47
70.86
65.92
58.20
83.92

p(FU)

24622

98.60

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.286
0.160
0.318
0.228
0.442
0.139
0.851
0.747
0.401
0.609
0.735
0.981
0.153
0.773
0.407
0.626
0.630

0.608
0.366
0.291
0.414
0.374
0.178

0.570
0.162
0.247
0.481
0.988
0.114
0.602

0.322
0.552

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.52
97.30 99.39
0.00 99.41
97.06
5.97 53.49
9.09 98.41
72.67 91.65
14.12 87.46
75.93 90.65
0.00 99.72
68.81 96.12
93.71 97.01
19.23 65.06
87.29 97.74
74.79
99.23
0.00 99.55
0.00 99.64
82.36 92.60
0.00 70.96
0.00 67.52
77.26
0.00 82.64
99.99
0.00 57.67
74.04 95.39
64.79 95.12
92.64 91.91
82.40 88.77
54.20 92.97
55.89 97.12
0.00 96.16
99.35
100.00
0.00 99.94
0.00 99.99

-0.13

-0.66

0.91

-0.79
0.05
21.32
-0.46

0.48

-0.95
-0.65
1.38
-1.02
1.20
0.51

0.367

0.527

0.374

0.467
0.929
0.243
0.366

0.661

0.352
0.716
0.016
0.669
0.988
0.787

81.67

-0.33 0.535

RecU

0.04
-0.01
0.01
-0.21
3.39
-0.01
-0.01
-0.10
-2.80
0.00
-0.01
0.00
-0.82
0.01
-0.51
0.01
-0.01

0.03
0.69
-14.04
0.28
1.20
-0.00

-0.04
0.06
0.12
-0.08
-0.00
0.10
0.08

-0.42
-0.01

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.52
99.39
99.41
97.06
53.49
98.41
91.65
87.46
90.65
99.72
96.12
97.01
65.06
97.74
74.79
99.23
99.55
99.64
92.60
70.96
67.52
77.26
82.64
99.99
57.67
95.39
95.12
91.91
88.77
92.97
97.12
96.16
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.36
99.47
99.49
98.84
48.32
98.69
91.24
90.06
89.90
99.77
96.04
97.33
67.19
97.86
79.60
99.52
99.37
99.90
91.22
67.09
70.16
75.69
84.75
100.00
47.00
95.28
94.62
92.29
90.65
92.42
96.52
95.81
99.62
100.00
99.87
99.99
51.60
96.833

F chg

Rec

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.67
99.31
99.33
95.33
59.90
98.13
92.06
85.01
91.40
99.67
96.21
96.70
63.06
97.62
70.54
98.94
99.72
99.38
94.02
75.31
65.08
78.89
80.65
99.98
74.60
95.50
95.63
91.53
86.96
93.52
97.73
96.52
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

B.26


0.00
79.68
88.89
86.49
81.14
82.45
54.00
87.32

0.00
90.30
0.00

33.33
86.35
80.59
82.05
78.53
72.88
89.91

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.912
0.988
0.656
0.179
0.953
0.194
0.764
0.576
0.660

0.670
0.192
0.650
0.440
0.746
0.374
0.983
0.987
0.766
0.204
0.797
0.132

0.277
0.466
0.453
0.899
0.219
0.825
0.288

0.380
0.003

100.00
0.00 99.99
99.85
100.00
100.00
100.00
100.00
99.96
0.00 99.54
98.51 99.42
0.00 99.38
97.28
0.00 42.98
0.00 98.38
82.10 92.21
28.24 88.34
59.26 95.48
0.00 99.74
67.69 96.12
98.16 96.30
11.54 62.68
90.44 97.47
76.43
99.20
0.00 99.59
0.00 99.67
81.20 92.81
0.00 71.58
0.00 86.13
75.35
0.00 75.44
99.99
5.88 48.98
78.77 95.66
71.13 95.47
91.02 93.26
73.04 89.55
32.82 93.10
64.88 97.61
0.00 96.65
99.45
100.00
0.00 99.94
0.00 99.99

0.04
3.42

0.09
-0.03
-3.23
0.07

-0.14

0.14
0.41
-0.13
0.19
1.84
0.21

0.682
0.962

0.272
0.305
0.374
0.284

0.374

0.711
0.401
0.557
0.623
0.921
0.848

84.65

0.03 0.424

RecU

0.00
0.00
0.00
-0.10
-0.10
0.02
-0.00
-0.08
0.03

0.00
0.01
0.11
0.01
-0.07
0.01
0.00
0.00
0.01
-0.40
0.11
-0.49

0.01
0.02
-0.03
-0.01
0.04
0.00
-0.13

0.11
0.00

PrecU

100.00
99.99
99.85
100.00
100.00
100.00
100.00
99.96
99.54
99.42
99.38
97.28
42.98
98.38
92.21
88.34
95.48
99.74
96.12
96.30
62.68
97.47
76.43
99.20
99.59
99.67
92.81
71.58
86.13
75.35
75.44
99.99
48.98
95.66
95.47
93.26
89.55
93.10
97.61
96.65
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.75
99.99
100.00
100.00
100.00
99.94
99.37
99.57
99.34
99.58
32.77
98.80
92.08
92.29
95.77
99.79
95.09
97.97
73.19
97.22
84.13
99.65
99.33
99.96
90.63
66.56
84.36
75.65
72.88
100.00
36.00
94.61
95.94
94.16
90.90
92.24
96.41
96.88
100.00
100.00
99.87
99.99
50.77
96.855

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.98
99.70
99.27
99.41
95.08
62.40
97.96
92.35
84.71
95.19
99.69
97.17
94.69
54.81
97.73
70.02
98.75
99.85
99.39
95.10
77.41
87.98
75.05
78.18
99.99
76.60
96.74
95.01
92.37
88.24
93.98
98.84
96.41
98.90
100.00
100.00
100.00

F(0.5)

Rec

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = IN & SibR = S) (Pos INSUB)


(Pos = IN & Par = SBAR) (Pos INSUB)

B.27


0.00
79.75
88.46
86.84
81.16
82.49
53.85
87.25

0.00
91.81
0.00
0.00

33.33
86.56
81.12
82.10
78.81
73.68
89.82

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374
0.804
0.375
0.698
0.345
0.374
0.554
0.789
0.926
0.374
0.978
0.494
0.163
0.087
0.896
0.866
0.374
0.374
0.374
0.545
0.961
0.914
0.631

0.227
0.002
0.698
0.290
0.980
0.951
0.681
0.071

0.374

0.414
0.613

100.00
0.00 99.99
99.85
100.00
100.00
100.00
100.00
99.95
0.00 99.54
98.51 99.42
0.00 99.38
97.23
0.00 43.45
0.00 98.37
82.21 92.21
27.06 88.42
61.11 95.48
0.00 99.74
67.77 96.12
98.16 96.27
11.97 62.73
90.48 97.47
76.33
99.20
0.00 99.59
0.00 99.67
80.43 92.74
0.00 71.87
0.00 86.04
75.65
0.00 75.44
99.99
5.88 50.67
79.12 95.67
71.60 95.46
91.34 93.25
73.74 89.56
32.06 93.06
64.24 97.60
0.00 96.80
99.45
100.00
0.00 99.96
0.00 99.99

0.15

2.00

0.16
-0.00
-0.35
0.05

0.14

0.50
1.08
0.07
0.86
0.53
-0.41

0.316

0.374

0.250
0.888
0.374
0.645

0.759

0.041
0.014
0.374
0.050
0.849
0.274

84.70

0.10 0.100

RecU

-0.01
0.00
0.00
0.00
-0.15
1.02
0.00
-0.01
0.01
0.03
-0.00
0.00
-0.03
0.18
-0.00
-0.20
0.01
-0.00
-0.01
-0.06
0.02
0.00
-0.09

3.44
0.02
0.00
-0.03
-0.00
0.00
-0.01
0.03

0.02

-0.14
-0.00

PrecU

100.00
99.99
99.85
100.00
100.00
100.00
100.00
99.95
99.54
99.42
99.38
97.23
43.45
98.37
92.21
88.42
95.48
99.74
96.12
96.27
62.73
97.47
76.33
99.20
99.59
99.67
92.74
71.87
86.04
75.65
75.44
99.99
50.67
95.67
95.46
93.25
89.56
93.06
97.60
96.80
99.45
100.00
99.96
99.99

TrueU

p(F)

100.00
99.98
99.75
99.99
100.00
100.00
100.00
99.94
99.41
99.56
99.29
99.58
32.77
98.93
92.18
92.38
95.77
99.78
95.09
97.96
73.40
97.20
84.89
99.65
99.33
99.96
89.91
67.14
84.36
75.93
72.88
100.00
38.00
94.62
95.94
94.18
90.95
92.16
96.41
96.42
100.00
100.00
99.91
99.99
50.64
96.848

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.96
99.67
99.28
99.46
94.98
64.46
97.81
92.23
84.79
95.19
99.70
97.16
94.63
54.77
97.73
69.34
98.75
99.85
99.38
95.76
77.33
87.79
75.38
78.18
99.99
76.00
96.74
95.00
92.34
88.20
93.99
98.82
97.20
98.90
100.00
100.00
100.00

F(0.5)

Rec

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = RB & SibR = RB) (Pos RBDEG)


(Pos = RB & SibR = JJ) (Pos RBDEG)
(Pos = RB & Par = ADJP) (Pos RBDEG)

B.28


0.00
79.64
88.00
86.49
81.02
82.47
51.92
87.24

0.00
90.32
0.00

33.33
85.74
80.70
82.02
78.41
72.88
89.22

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374
0.178
0.264
0.374

0.247
0.507
0.634
0.944

0.290
0.261
0.148
0.258

0.207
0.597
0.625
0.852
0.468
0.882
0.227

0.789
0.118
0.323
0.044
0.586
0.103
0.412

0.370
0.276

100.00
0.00 99.99
99.85
100.00
100.00
100.00
100.00
99.96
0.00 99.53
98.48 99.42
0.00 99.38
97.43
0.00 43.02
0.00 98.37
82.03 92.21
25.88 88.46
59.26 95.46
0.00 99.74
67.59 96.11
98.11 96.29
11.54 62.68
90.40 97.46
76.48
99.21
0.00 99.59
0.00 99.68
81.40 92.80
0.00 71.99
0.00 86.01
75.66
0.00 75.44
99.99
5.88 48.98
79.12 95.66
70.66 95.43
91.34 93.26
73.04 89.50
32.82 93.05
63.81 97.56
0.00 96.82
99.45
100.00
0.00 99.94
0.00 99.99

-0.02

-0.03
-3.48

-0.07
-0.04
-3.91
0.00

0.04
0.13
0.02
0.12
1.84
-1.08

0.374

0.440
0.374

0.255
0.153
0.282
0.996

0.813
0.700
0.944
0.616
0.213
0.072

84.59

-0.04 0.356

RecU

-0.00
-0.00
0.00
0.05

0.00
-0.01
0.05
0.00

-0.00
-0.01
0.10
-0.01

0.02
0.00
0.01
-0.00
0.18
-0.03
-0.08

0.01
-0.03
-0.02
-0.07
-0.01
-0.04
0.05

-0.11
-0.00

PrecU

100.00
99.99
99.85
100.00
100.00
100.00
100.00
99.96
99.53
99.42
99.38
97.43
43.02
98.37
92.21
88.46
95.46
99.74
96.11
96.29
62.68
97.46
76.48
99.21
99.59
99.68
92.80
71.99
86.01
75.66
75.44
99.99
48.98
95.66
95.43
93.26
89.50
93.05
97.56
96.82
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.75
99.99
100.00
100.00
100.00
99.94
99.33
99.56
99.29
99.58
32.35
98.91
92.05
92.27
95.77
99.79
95.07
97.96
73.26
97.21
84.38
99.65
99.34
99.96
90.54
67.19
84.16
75.76
72.88
100.00
36.00
94.65
95.93
94.19
90.85
92.16
96.35
96.46
100.00
100.00
99.87
99.99
50.66
96.848

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.98
99.73
99.27
99.47
95.37
64.17
97.83
92.37
84.95
95.15
99.69
97.18
94.68
54.77
97.70
69.94
98.77
99.85
99.40
95.17
77.53
87.96
75.57
78.18
99.99
76.60
96.70
94.93
92.35
88.19
93.96
98.81
97.18
98.90
100.00
100.00
100.00

F(0.5)

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Rec

(Pos = VB & Wd ∈ {do, help, let, make}) (Pos VBI)
(Pos = VBG & Wd ∈ {doing, helping, letting, making}) (Pos VBGI)
(Pos = VBD & Wd ∈ {did, helped, let, made}) (Pos VBDI)
(Pos = VBN & Wd ∈ {done, helped, let, made}) (Pos VBNI)
(Pos = VBP & Wd ∈ {do, help, let, make}) (Pos VBPI)
(Pos = VBZ & Wd ∈ {does, helps, lets, makes}) (Pos VBZI)

Prec

True

vbinf[lm] Mapping


80.00
0.00
78.79
78.26
70.69
73.69
84.35
35.25
82.99

85.09
0.00

0.00
85.71
81.29
71.09
66.89
59.17
87.25

p(FU)

24622

98.71

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.373

0.210
0.024
0.464
0.374
0.894
0.245
0.409
0.785
0.488
0.227
0.931
0.937
0.470
0.384
0.664
0.999
0.178

0.035
0.517
0.824
0.192

0.374
0.374
0.310
0.134
0.147
0.072
0.857
0.524
0.580
0.374

0.123
0.969

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.47
96.69 99.37
0.00 99.40
97.16
5.97 51.58
0.00 98.42
72.98 91.70
21.18 87.63
75.93 93.17
0.00 99.74
69.12 96.13
93.98 97.02
20.94 65.07
88.31 97.72
75.32
99.22
0.00 99.55
0.00 99.64
82.95 92.66
0.00 70.86
0.00 78.52
77.28
0.00 81.67
99.99
0.00 57.32
75.79 95.50
62.21 94.99
92.86 91.93
82.12 88.70
54.20 92.95
55.67 97.06
0.00 96.05
99.33
100.00
0.00 99.94
0.00 99.99

-0.39

-0.32

-0.89

-0.64
0.38
26.44
-0.23

2.08

0.93
-2.45
1.66
-0.37
2.00
1.83

0.061

0.500

0.374

0.220
0.194
0.691
0.595

0.051

0.586
0.156
0.101
0.794
0.889
0.582

81.89

-0.06 0.746

RecU

-0.01

-0.01
-0.04
-0.01
-0.11
-0.29
-0.01
0.05
0.09
-0.09
0.02
-0.00
0.00
-0.82
-0.00
0.20
0.00
-0.01

0.10
0.55
-0.04
0.30

0.00
-0.61
0.08
-0.07
0.15
-0.16
-0.02
0.03
-0.04
-0.02

-0.41
0.00

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.47
99.37
99.40
97.16
51.58
98.42
91.70
87.63
93.17
99.74
96.13
97.02
65.07
97.72
75.32
99.22
99.55
99.64
92.66
70.86
78.52
77.28
81.67
99.99
57.32
95.50
94.99
91.93
88.70
92.95
97.06
96.05
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.36
99.42
99.47
98.84
47.90
98.70
91.34
90.36
92.02
99.79
96.05
97.35
66.85
97.91
80.35
99.54
99.35
99.90
91.25
66.77
83.13
75.48
83.05
100.00
47.00
95.44
94.37
92.10
90.69
92.12
96.47
95.51
99.58
100.00
99.87
99.99
51.61
96.842

F chg

Rec

100.00
100.00
99.88
100.00
100.00
100.00
100.00
100.00
99.57
99.32
99.32
95.53
55.88
98.14
92.07
85.05
94.36
99.69
96.20
96.68
63.37
97.54
70.89
98.91
99.74
99.38
94.12
75.49
74.40
79.16
80.33
99.99
73.44
95.56
95.63
91.77
86.78
93.79
97.65
96.59
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

B.29


0.00
79.64
88.89
86.49
81.15
82.45
54.00
87.32

0.00
90.30
0.00

33.33
86.18
80.37
82.05
78.53
72.88
89.91

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.853
0.988
0.656
0.374
0.953
0.170
0.856
0.576
0.660

0.603
0.345
0.650
0.577
0.746

0.983
0.987
0.741
0.204
0.797
0.130

0.437
0.444
0.384
0.922
0.202
0.989
0.288

0.374
0.057

100.00
0.00 99.99
99.85
100.00
100.00
100.00
100.00
99.96
0.00 99.53
98.51 99.42
0.00 99.38
97.33
0.00 42.98
0.00 98.38
82.08 92.22
28.24 88.34
59.26 95.48
0.00 99.74
67.64 96.12
98.16 96.30
11.54 62.68
90.44 97.47
76.43
99.20
0.00 99.59
0.00 99.67
81.20 92.81
0.00 71.58
0.00 86.13
75.35
0.00 75.44
99.99
5.88 48.98
78.77 95.66
71.13 95.47
91.02 93.25
73.04 89.56
32.82 93.11
64.88 97.61
0.00 96.65
99.45
100.00
0.00 99.94
0.00 99.99

0.00
3.42

0.05
-0.03
-3.23
0.07

-0.14

0.05
0.29
-0.13
0.19
1.84
0.21

0.983
0.962

0.735
0.305
0.374
0.284

0.374

0.979
0.549
0.557
0.623
0.921
0.848

84.64

0.01 0.760

RecU

-0.00
0.00
0.00
-0.05
-0.10
0.02
-0.00
-0.08
0.03

0.00
0.01
0.11
0.01
-0.07

0.00
0.00
0.01
-0.40
0.11
-0.49

0.00
0.02
-0.03
-0.00
0.05
-0.00
-0.13

0.10
0.00

PrecU

100.00
99.99
99.85
100.00
100.00
100.00
100.00
99.96
99.53
99.42
99.38
97.33
42.98
98.38
92.22
88.34
95.48
99.74
96.12
96.30
62.68
97.47
76.43
99.20
99.59
99.67
92.81
71.58
86.13
75.35
75.44
99.99
48.98
95.66
95.47
93.25
89.56
93.11
97.61
96.65
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.75
99.99
100.00
100.00
100.00
99.94
99.37
99.57
99.34
99.58
32.77
98.80
92.07
92.29
95.77
99.79
95.09
97.97
73.19
97.22
84.13
99.65
99.33
99.96
90.63
66.56
84.36
75.65
72.88
100.00
36.00
94.61
95.94
94.16
90.91
92.24
96.40
96.88
100.00
100.00
99.87
99.99
50.77
96.854

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.98
99.69
99.27
99.41
95.18
62.40
97.96
92.36
84.71
95.19
99.69
97.17
94.69
54.81
97.73
70.02
98.74
99.85
99.39
95.09
77.41
87.98
75.05
78.18
99.99
76.60
96.73
95.02
92.36
88.24
93.99
98.84
96.41
98.90
100.00
100.00
100.00

F(0.5)

Rec

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = IN & SibR = S) (Pos INSUB)


(Pos = IN & Par = SBAR) (Pos INSUB)

B.30


0.00
0.00
78.67
68.00
69.64
74.20
83.84
35.19
81.31

83.20
0.00

84.89
85.00
70.48
66.09
48.61
84.14

0.00

p(FU)

24622

0.00
98.87

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.415
0.917
0.612
0.131
0.377
0.725
0.044
0.804
0.994
0.979
0.207
0.205
0.092
0.031
0.186
0.994
0.064
0.178
0.869
0.702
0.986
0.344

0.374

0.000
0.118
0.693
0.018
0.004
0.203
0.052

0.374

0.015
0.034

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.49
97.10 99.41
0.00 99.40
97.00
0.00 50.68
0.00 98.42
72.12 91.54
20.00 87.47
72.22 93.03
0.00 99.72
68.86 96.09
93.93 96.98
8.12 64.88
89.29 97.67
74.67
99.22
0.00 99.57
0.00 99.65
82.56 92.57
0.00 70.64
0.00 78.52
77.32
0.00 81.67
99.99
0.00 57.67
74.91 95.18
63.85 94.95
92.75 91.82
80.31 88.51
53.44 92.55
55.67 97.07
0.00 95.83
99.35
100.00
0.00 99.87
0.00 99.99

-0.10

-1.00

-4.01

-0.50
0.03

-0.76

0.71

-0.15
0.93
1.12
-2.02
-8.21
0.39

0.524

0.060

0.097

0.506
0.944

0.054

0.469

0.703
0.370
0.120
0.097
0.125
0.926

81.62

-0.40 0.207

RecU

-0.01

0.02
-0.00
-0.01
-0.26
-2.04
-0.01
-0.13
-0.08
-0.25
0.00
-0.04
-0.04
-1.10
-0.06
-0.67
0.00
0.02
0.01
-0.01
0.23
-0.04
0.36

0.00

-0.25
-0.11
0.03
-0.37
-0.46
0.05
-0.27

-0.06

-1.32
-0.05

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.49
99.41
99.40
97.00
50.68
98.42
91.54
87.47
93.03
99.72
96.09
96.98
64.88
97.67
74.67
99.22
99.57
99.65
92.57
70.64
78.52
77.32
81.67
99.99
57.67
95.18
94.95
91.82
88.51
92.55
97.07
95.83
99.35
100.00
99.87
99.99

TrueU

p(F)

100.00
99.98
99.93
99.99
100.00
100.00
100.00
99.94
99.39
99.46
99.48
98.74
47.06
98.70
91.22
90.36
91.83
99.79
96.02
97.30
65.48
97.93
79.09
99.51
99.36
99.90
91.20
66.30
83.13
75.44
83.05
100.00
47.00
95.16
94.17
92.09
90.40
91.95
96.50
95.17
99.62
100.00
99.87
99.99
51.14
96.798

F chg

100.00
100.00
99.87
100.00
100.00
100.00
100.00
100.00
99.60
99.35
99.32
95.33
54.90
98.14
91.87
84.77
94.26
99.65
96.17
96.66
64.29
97.41
70.72
98.94
99.77
99.40
93.98
75.58
74.40
79.30
80.33
99.99
74.60
95.20
95.75
91.56
86.70
93.15
97.65
96.50
99.08
100.00
99.87
100.00

F(0.5)

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Rec

(Pos = VB & SibR = S) (Pos VBI)
(Pos = VBG & SibR = S) (Pos VBGI)
(Pos = VBD & SibR = S) (Pos VBDI)
(Pos = VBN & SibR = S) (Pos VBNI)
(Pos = VBP & SibR = S) (Pos VBPI)
(Pos = VBZ & SibR = S) (Pos VBZI)

Prec

True

B.31


0.00
78.83
75.76
71.93
74.51
82.94
34.65
84.18
0.00
0.00
84.96
20.00

79.52
84.50
70.98
67.66
66.36
86.91

p(FU)

24622

0.00
98.78

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.349
0.416
0.602
0.550
0.762
0.102
0.090
0.782
0.180
0.637
0.508
0.185
0.602
0.738
0.063
0.578
0.369
0.629
0.064
0.592
0.180
0.481
0.374
0.374

0.072
0.275
0.080
0.846
0.994
0.319
0.521
0.374

0.067
0.023

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.50
97.25 99.39
0.00 99.41
97.21
0.00 52.46
0.00 98.41
72.10 91.57
29.41 87.62
75.93 90.54
0.00 99.71
68.89 96.11
94.58 96.98
18.80 65.24
87.08 97.72
73.62
99.23
0.00 99.54
0.00 99.64
81.01 92.47
14.29 70.26
0.00 59.98
76.90
0.00 81.30
99.99
0.00 57.67
74.91 95.26
65.26 95.18
92.64 91.91
82.40 88.86
54.20 92.97
55.46 97.10
0.00 96.02
99.33
100.00
0.00 99.94
0.00 99.99

-0.07

-0.92
21.54

-0.28
-0.21
17.31
-0.19

0.78

-3.22
1.93
1.47
0.41
7.58
1.44

0.619

0.164
0.425

0.646
0.429
0.915
0.685

0.628

0.075
0.362
0.121
0.657
0.461
0.706

81.78

-0.19 0.547

RecU

-0.01

0.02
-0.01
0.01
-0.05
1.41
-0.02
-0.10
0.08
-2.91
-0.00
-0.02
-0.03
-0.56
-0.01
-2.07
0.01
-0.01
-0.01
-0.11
-0.31
-23.65
-0.18
-0.45
0.00

-0.17
0.13
0.12
0.03
-0.00
0.08
-0.07
-0.02

-0.84
-0.03

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.50
99.39
99.41
97.21
52.46
98.41
91.57
87.62
90.54
99.71
96.11
96.98
65.24
97.72
73.62
99.23
99.54
99.64
92.47
70.26
59.98
76.90
81.30
99.99
57.67
95.26
95.18
91.91
88.86
92.97
97.10
96.02
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.93
99.99
100.00
100.00
100.00
99.94
99.38
99.46
99.47
98.84
47.06
98.69
91.26
90.78
92.86
99.77
95.98
97.36
67.26
97.86
77.33
99.53
99.36
99.90
90.94
65.41
53.50
74.99
84.75
100.00
47.00
95.36
94.82
92.32
90.42
92.28
96.50
95.51
99.62
100.00
99.87
99.99
51.38
96.813

F chg

100.00
100.00
99.87
100.00
100.00
100.00
100.00
100.00
99.63
99.33
99.35
95.63
59.26
98.13
91.87
84.67
88.34
99.66
96.23
96.61
63.33
97.58
70.25
98.94
99.72
99.37
94.06
75.88
68.24
78.91
78.12
99.99
74.60
95.17
95.55
91.49
87.36
93.67
97.71
96.53
99.05
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = DT & Wd = the) (Pos DT0)


(Pos = DT & Wd = a) (Pos DT1)
(Pos = DT & Wd = The) (Pos DT2)
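
The three rules above split the DT tag according to the word form. As a minimal sketch, assuming that a rule fires when both the original tag and the exact (case-sensitive) word match, and that each new tag is mapped back by a simple lookup, the mapping and its reversal might look as follows in Python. The names RULES, INVERSE, apply_mapping and reverse_mapping are illustrative only and are not taken from the thesis.

```python
# Rules copied from the listing above: (original tag, word) -> new tag.
RULES = {
    ("DT", "the"): "DT0",
    ("DT", "a"): "DT1",
    ("DT", "The"): "DT2",
}

# Inverse table: every new tag maps back to exactly one original tag,
# which is what makes the mapping reversible.
INVERSE = {new: old for (old, _word), new in RULES.items()}


def apply_mapping(tagged):
    """Rewrite (word, tag) pairs whose tag and word match a rule."""
    return [(word, RULES.get((tag, word), tag)) for word, tag in tagged]


def reverse_mapping(tagged):
    """Recover the original tags from a mapped sequence."""
    return [(word, INVERSE.get(tag, tag)) for word, tag in tagged]


# Hypothetical example sentence, not drawn from the corpus.
sentence = [("The", "DT"), ("dog", "NN"), ("saw", "VBD"),
            ("a", "DT"), ("cat", "NN"), ("this", "DT")]
mapped = apply_mapping(sentence)
restored = reverse_mapping(mapped)
```

In this sketch a DT token whose word is not listed (here "this") keeps its original tag, and reversing the mapped sequence yields the input unchanged.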

B.32


75.00
11.11
78.30
62.16
70.00
73.96
83.81
35.61
83.65

84.40
20.00

81.00
83.54
69.87
66.40
55.46
87.07

p(FU)

24622

0.00
98.64

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.660
0.917
0.179
0.316
0.496
0.613
0.460
0.546
0.138
0.291
0.271
0.758
0.245
0.628
0.212
0.993
0.593
0.374
0.549
0.890
0.413
0.192

0.374

0.664
0.860
0.780
0.521
0.197
0.433
0.197
0.374

0.159
0.225

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.49
97.39 99.40
0.00 99.38
96.80
4.48 50.34
9.09 98.42
72.56 91.61
27.06 87.49
77.78 93.02
0.00 99.73
68.02 96.09
93.96 97.00
20.09 64.95
87.21 97.71
75.27
99.22
0.00 99.55
0.00 99.64
81.78 92.55
14.29 70.38
0.00 78.15
76.77
0.00 81.67
99.99
0.00 57.67
77.02 95.40
61.97 95.07
92.86 91.82
80.59 88.75
50.38 92.86
54.82 97.05
0.00 95.96
99.33
100.00
0.00 99.94
0.00 99.99

-0.07

-0.91
8.15
-0.26

-1.28
0.03
23.60
-0.44

0.94

-0.94
-1.51
0.67
-1.62
-4.80
0.79

0.655

0.228
0.952
0.374

0.318
0.984
0.099
0.086

0.424

0.465
0.500
0.414
0.283
0.569
0.744

81.57

-0.46 0.321

RecU

0.01
-0.00
-0.02
-0.47
-2.70
-0.01
-0.06
-0.07
-0.26
0.02
-0.04
-0.01
-0.99
-0.02
0.12
0.00
-0.01
-0.01
-0.03
-0.14
-0.51
-0.36

0.00

-0.02
0.01
0.03
-0.09
-0.11
0.03
-0.13
-0.02

-0.52
-0.02

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.49
99.40
99.38
96.80
50.34
98.42
91.61
87.49
93.02
99.73
96.09
97.00
64.95
97.71
75.27
99.22
99.55
99.64
92.55
70.38
78.15
76.77
81.67
99.99
57.67
95.40
95.07
91.82
88.75
92.86
97.05
95.96
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.37
99.48
99.47
98.63
47.06
98.67
91.26
90.36
91.69
99.78
95.97
97.36
67.12
97.82
80.10
99.51
99.36
99.90
91.18
66.40
83.54
75.34
83.05
100.00
47.00
95.41
94.56
92.28
90.40
92.24
96.45
95.41
99.58
100.00
99.87
99.99
51.55
96.819

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.61
99.33
99.30
95.04
54.11
98.16
91.96
84.79
94.39
99.69
96.21
96.65
62.92
97.60
70.98
98.94
99.73
99.37
93.96
74.85
73.42
78.25
80.33
99.99
74.60
95.39
95.59
91.37
87.16
93.50
97.66
96.52
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = IN & Wd = of) (Pos IN0)


(Pos = IN & Wd = in) (Pos IN1)
(Pos = IN & Wd = for) (Pos IN2)

B.33


33.33
0.00
79.82
68.97
71.93
74.33
83.82
39.26
83.83

84.40
33.33

81.82
81.63
70.48
65.28
60.33
86.08

p(FU)

24622

0.00
98.66

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374
0.347
0.070
0.374
0.828
0.470
0.906
0.611
0.355
0.605
0.679
0.944
0.433
0.526
0.374
0.997
0.783

0.138
0.248
0.369
0.006

0.374

0.673
0.831
0.022
0.430
0.726
0.597
0.633
0.374

0.162
0.714

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.48
96.75 99.38
0.00 99.41
97.16
1.49 51.40
0.00 98.44
71.99 91.67
23.53 87.66
75.93 91.93
0.00 99.72
70.13 96.14
94.83 97.02
22.65 65.14
87.04 97.71
75.06
99.22
0.00 99.55
0.00 99.64
81.78 92.64
14.29 70.99
0.00 70.77
77.35
0.00 81.67
99.99
0.00 57.67
75.79 95.39
63.62 95.05
91.23 91.94
81.15 88.76
55.73 92.98
56.96 97.06
0.00 96.03
99.33
100.00
0.00 99.94
0.00 99.99

-0.39

-0.41
0.65

0.53
0.47
38.25
-0.42

0.94

-1.28
-1.03
0.40
-2.23
4.46
2.70

0.185

0.666
0.884

0.108
0.117
0.552
0.279

0.354

0.393
0.193
0.377
0.106
0.697
0.492

81.93

-0.02 0.933

RecU

0.00
-0.03
0.00
-0.11
-0.64
0.01
0.01
0.13
-1.42
0.00
0.01
0.00
-0.70
-0.01
-0.16
0.00
0.00

0.08
0.74
-9.90
0.40

0.00

-0.04
-0.01
0.16
-0.09
0.02
0.03
-0.06
-0.02

-0.37
-0.00

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.48
99.38
99.41
97.16
51.40
98.44
91.67
87.66
91.93
99.72
96.14
97.02
65.14
97.71
75.06
99.22
99.55
99.64
92.64
70.99
70.77
77.35
81.67
99.99
57.67
95.39
95.05
91.94
88.76
92.98
97.06
96.03
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.40
99.42
99.47
98.84
46.22
98.70
91.20
90.64
92.30
99.78
96.07
97.37
66.95
97.82
79.60
99.50
99.38
99.90
91.27
66.88
69.75
76.03
83.05
100.00
47.00
95.43
94.47
92.16
90.70
92.46
96.54
95.57
99.58
100.00
99.87
99.99
51.63
96.840

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.56
99.35
99.34
95.53
57.89
98.18
92.13
84.87
91.57
99.67
96.21
96.67
63.43
97.61
71.01
98.95
99.72
99.38
94.06
75.65
71.82
78.71
80.33
99.99
74.60
95.36
95.64
91.72
86.90
93.52
97.58
96.49
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103
''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322
``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = WRB & Wd = when) (Pos WRB0)


(Pos = WRB & Wd = where) (Pos WRB1)
(Pos = WRB & Wd = how) (Pos WRB2)

B.34


42.86
0.00
78.61
72.22
71.93
73.13
83.85
40.78
83.20

84.43
33.33

0.00
83.88
80.06
70.18
64.60
58.06
87.88

p(FU)

24622

98.78

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.531
0.526
0.987
0.568
0.307
0.849
0.145
0.211
0.204
0.605
0.665
0.053
0.415
0.006
0.561
0.796
0.369

0.318
0.432
0.262
0.411

0.713
0.094
0.193
0.075
0.253
0.037
0.075
0.374

0.063
0.129

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.50
96.89 99.40
0.00 99.40
97.16
4.48 50.34
0.00 98.42
71.01 91.58
30.59 87.79
75.93 89.74
0.00 99.72
69.75 96.12
93.56 96.96
17.95 65.21
88.82 97.79
74.88
99.22
0.00 99.55
0.00 99.64
79.84 92.62
14.29 70.83
0.00 64.05
77.18
0.00 81.67
99.99
0.00 57.67
75.79 95.44
63.15 94.94
92.21 91.85
81.56 88.60
54.96 92.87
55.89 97.16
0.00 96.12
99.33
100.00
0.00 99.94
0.00 99.99

-0.26

-1.84
23.27

-0.54
-0.15
19.96
0.18

-0.27

-0.10
-2.28
0.62
-2.58
1.82
2.35

0.154

0.133
0.374

0.417
0.541
0.222
0.526

0.818

0.830
0.338
0.246
0.030
0.651
0.183

81.52

-0.51 0.170

RecU

-0.01

0.02
-0.01
0.00
-0.11
-2.69
-0.00
-0.09
0.27
-3.77
0.00
-0.01
-0.05
-0.59
0.06
-0.39
-0.01
-0.00

0.05
0.51
-18.45
0.18

0.02
-0.12
0.07
-0.27
-0.11
0.14
0.04
-0.02

-0.91
-0.03

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.50
99.40
99.40
97.16
50.34
98.42
91.58
87.79
89.74
99.72
96.12
96.96
65.21
97.79
74.88
99.22
99.55
99.64
92.62
70.83
64.05
77.18
81.67
99.99
57.67
95.44
94.94
91.85
88.60
92.87
97.16
96.12
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.93
99.99
100.00
100.00
100.00
99.94
99.36
99.45
99.49
98.84
46.64
98.68
91.26
90.91
88.77
99.78
96.04
97.26
66.71
97.99
79.60
99.50
99.36
99.90
91.10
66.61
67.28
75.96
83.05
100.00
47.00
95.40
94.47
92.31
90.27
92.26
96.57
95.62
99.58
100.00
99.87
99.99
51.35
96.815

F chg

100.00
100.00
99.87
100.00
100.00
100.00
100.00
100.00
99.63
99.35
99.32
95.53
54.68
98.17
91.89
84.87
90.73
99.67
96.20
96.67
63.78
97.59
70.69
98.94
99.74
99.38
94.19
75.63
61.12
78.44
80.33
99.98
74.60
95.49
95.42
91.40
86.98
93.49
97.75
96.63
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103
''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322
``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = WP & Wd = who) (Pos WP0)


(Pos = WP & Wd = what) (Pos WP1)
(Pos = WP & Wd = What) (Pos WP2)

B.35


100.00
100.00 100.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.52
98.26 97.51 99.38
0.00 99.41
97.06
11.11 1.49 51.26
0.00 0.00 98.43
78.74 72.10 91.64
83.33 23.53 87.60
71.43 74.07 93.12
0.00 99.73
74.26 69.01 96.14
83.13 94.16 96.98
30.38 10.26 64.63
83.12 87.25 97.72
74.10
99.23
0.00 99.55
0.00 99.65
83.63 81.20 92.59
33.33 14.29 70.77
0.00 78.14
77.00
0.00 0.00 80.33
99.99
0.00 57.67
82.71 77.19 95.40
81.01 64.08 95.10
71.20 92.32 91.94
66.21 81.84 88.75
61.11 50.38 92.89
87.54 54.18 97.09
0.00 96.15
99.35
100.00
0.00 99.94
0.00 99.99

24622

81.62

p(FU)

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F chgU

F(0.5)U

0.374
0.374

0.262
0.237
0.644
0.266
0.777
0.759
0.609
0.779
0.988
0.288
0.820
0.334
0.258
0.806
0.077
0.703
0.471
0.374
0.457
0.588
0.954
0.882
0.374
0.374

0.721
0.148
0.005
0.404
0.326
0.017
0.490

0.409
0.779

RecU

0.01
-0.00

0.04
-0.02
0.01
-0.21
-0.92
0.00
-0.03
0.06
-0.15
0.02
0.01
-0.04
-1.48
-0.01
-1.43
0.01
-0.01
0.01
0.02
0.42
-0.52
-0.06
-1.64
0.00

-0.03
0.04
0.16
-0.09
-0.09
0.07
0.07

-0.44
-0.00

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.52
99.38
99.41
97.06
51.26
98.43
91.64
87.60
93.12
99.73
96.14
96.98
64.63
97.72
74.10
99.23
99.55
99.65
92.59
70.77
78.14
77.00
80.33
99.99
57.67
95.40
95.10
91.94
88.75
92.89
97.09
96.15
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.94
99.36
99.48
99.49
98.84
47.06
98.71
91.28
90.42
91.83
99.80
96.02
97.34
66.27
97.87
77.83
99.52
99.35
99.91
91.13
66.72
83.13
75.41
83.05
100.00
47.00
95.42
94.52
92.41
90.61
92.08
96.54
95.55
99.62
100.00
99.87
99.99
51.59
96.837

F chg

100.00
100.00
99.91
100.00
100.00
100.00
100.00
100.00
99.68
99.29
99.32
95.33
56.28
98.15
92.00
84.95
94.44
99.67
96.25
96.62
63.07
97.57
70.71
98.94
99.74
99.38
94.10
75.34
73.72
78.66
77.78
99.99
74.60
95.37
95.69
91.48
86.98
93.71
97.65
96.76
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103
''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322
``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = WDT & Wd = which) (Pos WDT0)


(Pos = WDT & Wd = that) (Pos WDT1)
(Pos = WDT & Wd = what) (Pos WDT2)

-0.20

-0.97
5.26
-1.55

-0.34
-0.30
-26.20
-0.74

0.13

0.18
-0.96
1.50
-1.08
-0.42
0.27

0.321

0.182
0.779
0.374

0.719
0.363
0.435
0.236

0.815

0.991
0.554
0.028
0.476
0.759
0.893

-0.40 0.308

B.36


50.00
0.00
79.11
68.42
73.08
0.00
74.00
83.45
37.25
83.49

82.94
0.00

0.00

82.24
81.46
69.79
65.41
70.00
87.00

p(FU)

24622

98.74

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.373

0.219
0.152
0.767
0.407
0.064
0.820
0.440
0.457
0.382
0.583
0.349
0.335
0.187
0.591
0.178
0.992
0.099

0.302
0.993
0.357
0.300
0.374
0.374

0.440
0.456
0.542
0.225
0.060
0.136
0.409
0.374

0.329
0.816

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.53
96.81 99.39
0.00 99.40
97.11
4.48 50.23
0.00 98.42
72.23 91.62
30.59 87.68
70.37 95.01
0.00 99.73
68.94 96.11
94.39 96.98
16.24 64.93
88.36 97.74
74.25
99.22
0.00 99.56
0.00 99.64
81.01 92.54
0.00 70.46
0.00 84.53
77.26
0.00 80.33
99.99
0.00 57.67
77.19 95.38
62.91 95.09
91.77 91.85
79.75 88.69
53.44 93.09
55.89 97.09
0.00 96.05
99.33
100.00
0.00 99.94
0.00 99.99

-0.32

-0.66
21.27
-2.95

-0.57
0.02
8.85
0.10

-0.40

-0.09
-1.74
0.10
-2.89
9.27
1.95

0.029

0.302
0.374
0.374

0.157
0.972
0.621
0.678

0.650

0.821
0.010
0.920
0.058
0.175
0.592

81.71

-0.28 0.051

RecU

-0.01

0.05
-0.02
-0.00
-0.16
-2.91
-0.00
-0.04
0.15
1.88
0.01
-0.02
-0.03
-1.02
0.01
-1.23
-0.00
0.01

-0.04
-0.02
7.61
0.28
-1.64
0.00

-0.04
0.03
0.06
-0.16
0.13
0.06
-0.04
-0.02

-0.43
-0.00

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.53
99.39
99.40
97.11
50.23
98.42
91.62
87.68
95.01
99.73
96.11
96.98
64.93
97.74
74.25
99.22
99.56
99.64
92.54
70.46
84.53
77.26
80.33
99.99
57.67
95.38
95.09
91.85
88.69
93.09
97.09
96.05
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.37
99.43
99.49
98.84
46.64
98.69
91.21
90.83
95.77
99.80
96.01
97.33
66.37
97.95
78.09
99.45
99.37
99.90
91.17
66.04
81.48
75.82
83.05
100.00
47.00
95.50
94.54
92.14
90.47
92.31
96.54
95.55
99.58
100.00
99.87
99.99
51.60
96.838

F chg

100.00
100.00
99.88
100.00
100.00
100.00
100.00
100.00
99.69
99.34
99.30
95.43
54.41
98.16
92.04
84.75
94.27
99.66
96.21
96.63
63.56
97.53
70.78
98.99
99.75
99.38
93.95
75.51
87.80
78.75
77.78
99.99
74.60
95.27
95.65
91.57
86.99
93.88
97.63
96.55
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103
''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322
``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = PRP & Wd = it) (Pos PRP0)


(Pos = PRP & Wd = he) (Pos PRP1)
(Pos = PRP & Wd = they) (Pos PRP2)

B.37


20.00
0.00
78.20
65.38
70.69
73.72
83.49
38.75
83.26
0.00

85.37
50.00

79.81
85.08
70.96
66.37
62.50
85.16

p(FU)

24622

98.86

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.267
0.427
0.934
0.393
0.381
0.345
0.079
0.146
0.213
0.366
0.459
0.054
0.241
0.481
0.135
0.694
0.609
0.178
0.038
0.835
0.257
0.393
0.374
0.374
0.374
0.086
0.456
0.763
0.274
0.375
0.494
0.431
0.374

0.026
0.069

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.51
96.84 99.38
0.00 99.40
97.11
1.49 50.23
0.00 98.44
72.89 91.56
20.00 87.42
75.93 90.06
0.00 99.73
68.73 96.10
93.88 96.97
13.25 64.86
87.08 97.71
73.71
99.21
0.00 99.55
0.00 99.65
81.40 92.49
14.29 70.34
0.00 66.48
76.82
0.00 82.64
99.99
0.00 56.79
75.61 95.34
62.91 95.11
93.07 91.82
82.40 88.74
53.44 93.02
56.53 97.08
0.00 96.02
99.33
100.00
0.00 99.94
0.00 99.99

-0.24

-0.74

-0.89

-0.90
-0.22
-4.98
-0.75

1.26

-2.57
0.12
1.66
-0.65
3.88
1.80

0.299

0.310

0.374

0.280
0.447
0.566
0.237

0.480

0.230
0.857
0.014
0.361
0.733
0.638

81.61

-0.41 0.296

RecU

-0.01

0.03
-0.02
0.00
-0.16
-2.90
0.01
-0.11
-0.15
-3.43
0.01
-0.03
-0.05
-1.13
-0.02
-1.95
-0.01
0.00
0.01
-0.09
-0.18
-15.37
-0.29
1.20
0.00
-1.52
-0.09
0.05
0.02
-0.10
0.06
0.06
-0.08
-0.02

-1.34
-0.04

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.51
99.38
99.40
97.11
50.23
98.44
91.56
87.42
90.06
99.73
96.10
96.97
64.86
97.71
73.71
99.21
99.55
99.65
92.49
70.34
66.48
76.82
82.64
99.99
56.79
95.34
95.11
91.82
88.74
93.02
97.08
96.02
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.91
99.99
100.00
100.00
100.00
99.94
99.38
99.42
99.48
98.84
45.38
98.69
91.22
90.39
88.54
99.75
96.00
97.32
66.10
97.88
77.33
99.52
99.37
99.88
91.05
65.93
71.60
75.06
84.75
100.00
46.00
95.44
94.51
92.37
90.52
92.31
96.58
95.74
99.62
100.00
99.87
99.99
51.13
96.806

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.63
99.35
99.33
95.43
56.25
98.19
91.90
84.64
91.64
99.70
96.20
96.61
63.67
97.54
70.41
98.90
99.73
99.42
93.98
75.39
62.03
78.67
80.65
99.99
74.19
95.24
95.71
91.28
87.04
93.74
97.59
96.29
99.05
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103
''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322
``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = DT & Wd = the) (Pos DT0)

B.38


71.43
0.00
78.83
73.91
73.21
74.19
83.97
38.89
82.59

12.50
83.80
0.00

0.00

82.67
85.90
70.72
65.26
59.13
90.00

p(FU)

24622

98.66

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.715
0.098
0.768
0.372
0.861
0.730
0.016
0.998
0.983
0.466
0.236
0.084
0.314
0.867
0.199
0.971
0.541

0.914
0.566
0.908
0.251
0.374
0.374
0.374
0.322
0.447
0.235
0.020
0.499
0.196
0.857

0.147
0.240

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.49
96.92 99.37
0.00 99.40
96.90
7.46 52.25
0.00 98.43
73.15 91.56
20.00 87.56
75.93 92.97
0.00 99.73
68.91 96.09
93.28 96.97
20.94 64.99
88.91 97.72
74.49
99.22
9.09 99.54
0.00 99.64
81.20 92.57
0.00 70.80
0.00 77.76
77.35
0.00 80.99
99.99
0.00 55.90
76.14 95.35
62.91 95.03
92.53 91.88
82.12 88.61
51.91 92.91
55.89 97.12
0.00 96.07
99.35
100.00
0.00 99.94
0.00 99.99

-0.30

-0.17

0.91

-0.46
-0.21
31.01
-0.16

0.23

-0.55
0.52
1.21
-1.73
-0.32
3.30

0.039

0.729

0.374

0.390
0.641
0.638
0.592

0.812

0.169
0.629
0.067
0.124
0.799
0.264

81.78

-0.20 0.463

RecU

-0.01

0.01
-0.04
-0.00
-0.37
1.00
0.01
-0.10
0.01
-0.30
0.01
-0.04
-0.04
-0.93
-0.00
-0.91
0.00
-0.01

-0.01
0.47
-1.01
0.40
-0.83
0.00
-3.07
-0.08
-0.03
0.09
-0.26
-0.06
0.10
-0.02

-1.04
-0.03

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.49
99.37
99.40
96.90
52.25
98.43
91.56
87.56
92.97
99.73
96.09
96.97
64.99
97.72
74.49
99.22
99.54
99.64
92.57
70.80
77.76
77.35
80.99
99.99
55.90
95.35
95.03
91.88
88.61
92.91
97.12
96.07
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.90
99.99
100.00
100.00
100.00
99.94
99.38
99.43
99.47
98.63
51.26
98.68
91.29
90.44
91.69
99.79
95.96
97.31
66.06
97.97
78.34
99.55
99.37
99.90
91.13
66.51
82.72
75.93
83.05
100.00
45.00
95.30
94.51
92.20
90.35
92.35
96.48
95.70
99.62
100.00
99.87
99.99
51.28
96.817

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.59
99.31
99.33
95.23
53.28
98.19
91.84
84.84
94.30
99.67
96.22
96.64
63.96
97.48
71.00
98.89
99.71
99.38
94.04
75.69
73.36
78.83
79.03
99.99
73.77
95.40
95.55
91.56
86.93
93.47
97.77
96.43
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103
''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322
``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = IN & Wd = of) (Pos IN0)


(Pos = IN & Wd = in) (Pos IN1)

B.39


75.00
11.11
78.57
64.86
71.93
74.14
83.26
31.82
83.52
0.00

84.79
0.00

0.00
33.33
80.45
82.32
70.95
67.05
62.39
86.10

0.00

p(FU)

24622

0.00
98.45
0.00

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.300
0.013
0.875
0.393
0.707
0.573
0.093
0.655
0.853
0.211
0.080
0.069
0.354
0.786
0.277
0.464
0.706
0.629
0.177
0.723
0.262
0.924

0.374
0.260
0.888
0.523
0.453
0.843
0.367
0.233
0.178

0.374

0.134
0.200

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.50
96.89 99.36
0.00 99.40
97.11
4.48 51.42
4.55 98.42
72.04 91.55
28.24 87.63
75.93 92.30
0.00 99.73
68.43 96.09
94.75 96.97
11.97 64.86
87.63 97.72
74.32
99.23
0.00 99.56
0.00 99.64
81.01 92.53
0.00 70.38
0.00 64.27
76.97
0.00 81.67
99.99
5.88 57.83
75.79 95.34
66.67 95.07
92.53 91.75
80.45 88.70
51.91 92.96
54.39 97.07
0.00 95.93
99.31
100.00
0.00 99.91
0.00 99.99

-0.42

-1.12
12.86

-0.86
0.08
-16.30
-0.28

0.68

-2.08
1.96
1.40
-1.16
2.17
-0.13

0.096

0.053
0.579

0.233
0.729
0.224
0.576

0.386

0.344
0.148
0.165
0.454
0.627
0.960

81.64

-0.37 0.295

RecU

0.03
-0.05
-0.00
-0.16
-0.61
-0.01
-0.12
0.09
-1.03
0.01
-0.05
-0.05
-1.13
-0.01
-1.14
0.01
0.01
-0.01
-0.05
-0.13
-18.18
-0.09

0.28
-0.09
0.01
-0.05
-0.15
-0.02
0.05
-0.17
-0.04

-0.02

-1.04
-0.04

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.50
99.36
99.40
97.11
51.42
98.42
91.55
87.63
92.30
99.73
96.09
96.97
64.86
97.72
74.32
99.23
99.56
99.64
92.53
70.38
64.27
76.97
81.67
99.99
57.83
95.34
95.07
91.75
88.70
92.96
97.07
95.93
99.31
100.00
99.91
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.38
99.42
99.47
98.84
49.58
98.69
91.20
90.86
96.52
99.77
95.97
97.35
66.10
97.87
79.09
99.54
99.36
99.90
91.05
65.93
51.44
75.48
83.05
100.00
48.00
95.41
94.60
92.22
90.35
92.24
96.48
95.53
99.58
100.00
99.87
99.99
51.28
96.806

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.63
99.31
99.33
95.43
53.39
98.15
91.91
84.62
88.43
99.69
96.20
96.60
63.67
97.56
70.09
98.93
99.75
99.37
94.06
75.48
85.62
78.53
80.33
99.98
72.73
95.27
95.55
91.28
87.11
93.69
97.67
96.32
99.05
100.00
99.96
100.00

F(0.5)

Rec

TBL
#
158
$
8103
''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322
``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = DT & Wd = the) (Pos DT0)


(Pos = DT & Wd = a) (Pos DT1)
(Pos = DT & Wd = this) (Pos DT2)

B.40


33.33
0.00
77.61
70.27
70.69
73.26
83.31
48.21
82.73

85.77
20.00

0.00
0.00
85.14
83.55
70.16
65.51
56.52
87.46

p(FU)

24622

98.74

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.373

0.313
0.272
0.667
0.216
0.682
0.489
0.100
0.718
0.822
0.374
0.245
0.439
0.054
0.723
0.912
0.715
0.380
0.374
0.989
0.956
0.528
0.196
0.374

0.867
0.155
0.236
0.101
0.023
0.436
0.880

0.084
0.297

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.50
96.78 99.38
0.00 99.40
97.00
1.49 51.53
0.00 98.43
71.97 91.58
30.59 87.64
75.93 93.67
0.00 99.72
69.09 96.10
93.93 97.00
11.54 65.11
87.76 97.72
75.15
99.23
0.00 99.55
0.00 99.64
81.78 92.58
14.29 70.50
0.00 76.29
77.34
0.00 80.99
99.99
0.00 57.67
74.39 95.41
60.80 94.98
92.64 91.90
79.05 88.58
49.62 92.78
56.75 97.07
0.00 96.11
99.35
100.00
0.00 99.94
0.00 99.99

-0.33

-1.75
22.26
-0.89

-0.94
-0.30
-10.39
-0.69

1.75

-0.39
-2.59
0.81
-3.19
-4.72
3.11

0.185

0.018
0.366
0.374

0.076
0.453
0.376
0.190

0.087

0.763
0.113
0.303
0.033
0.102
0.233

81.40

-0.66 0.012

RecU

-0.01

0.03
-0.02
-0.00
-0.26
-0.39
0.01
-0.08
0.10
0.45
0.00
-0.03
-0.02
-0.75
-0.01
-0.04
0.01
0.00
-0.01
0.00
0.04
-2.88
0.38
-0.83

-0.01
-0.09
0.11
-0.29
-0.20
0.04
0.03

-0.80
-0.02

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.50
99.38
99.40
97.00
51.53
98.43
91.58
87.64
93.67
99.72
96.10
97.00
65.11
97.72
75.15
99.23
99.55
99.64
92.58
70.50
76.29
77.34
80.99
99.99
57.67
95.41
94.98
91.90
88.58
92.78
97.07
96.11
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.39
99.42
99.49
98.74
49.58
98.67
91.24
90.94
95.96
99.77
96.04
97.37
65.86
97.90
80.35
99.48
99.38
99.90
91.15
65.93
68.52
76.21
83.05
100.00
47.00
95.25
94.39
92.34
90.23
92.18
96.51
95.62
99.62
100.00
99.87
99.99
51.41
96.823

F chg

100.00
100.00
99.88
100.00
100.00
100.00
100.00
100.00
99.62
99.35
99.31
95.33
53.64
98.20
91.94
84.57
91.49
99.67
96.16
96.63
64.38
97.54
70.58
98.98
99.73
99.37
94.04
75.75
86.05
78.50
79.03
99.98
74.60
95.57
95.57
91.46
86.98
93.38
97.63
96.61
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103
''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322
``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = IN & Wd = of) (Pos IN0)

B.41


66.67
0.00
79.04
65.22
71.93
74.48
84.18
37.86
83.01
0.00

84.48
0.00

33.33
84.45
82.42
70.75
66.25
62.83
86.80

p(FU)

24622

0.00
98.51

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.373

0.560
0.113
0.579
0.241
0.768
0.669
0.123
0.353
0.868
0.332
0.825
0.864
0.342
0.562
0.413
0.843
0.232
0.374
0.383
0.869
0.266
0.622
0.374
0.374
0.374
0.451
0.824
0.553
0.536
0.857
0.763
0.222

0.104
0.371

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.50
96.81 99.36
0.00 99.41
97.06
5.97 51.69
0.00 98.43
72.54 91.62
17.65 87.68
75.93 92.31
0.00 99.73
69.42 96.12
94.14 97.01
22.65 65.14
88.23 97.71
74.44
99.21
0.00 99.56
0.00 99.64
83.33 92.60
0.00 70.59
0.00 64.80
77.22
0.00 82.64
99.99
5.88 57.83
77.19 95.39
63.85 95.05
91.88 91.84
81.70 88.74
54.20 92.96
56.32 97.04
0.00 96.23
99.35
100.00
0.00 99.94
0.00 99.99

-0.44

-0.48

0.10
0.35
36.40
-0.27

1.95

1.19
-0.41
0.93
-1.13
4.93
2.34

0.086

0.240

0.893
0.118
0.169
0.358

0.030

0.653
0.748
0.170
0.455
0.514
0.202

81.95

0.00 0.962

RecU

-0.01

0.02
-0.04
0.01
-0.21
-0.09
0.01
-0.04
0.15
-1.01
0.02
-0.01
-0.00
-0.71
-0.01
-0.99
-0.01
0.01
0.00
0.03
0.17
-17.51
0.23
1.20
0.00
0.28
-0.04
-0.02
0.05
-0.11
-0.01
0.02
0.15

-0.49
-0.01

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.50
99.36
99.41
97.06
51.69
98.43
91.62
87.68
92.31
99.73
96.12
97.01
65.14
97.71
74.44
99.21
99.56
99.64
92.60
70.59
64.80
77.22
82.64
99.99
57.83
95.39
95.05
91.84
88.74
92.96
97.04
96.23
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.38
99.43
99.48
98.84
48.32
98.69
91.20
90.80
96.43
99.77
96.09
97.32
67.09
97.89
78.84
99.44
99.37
99.91
91.19
66.04
52.26
75.41
84.75
100.00
48.00
95.39
94.42
91.98
90.73
92.25
96.52
95.81
99.62
100.00
99.87
99.99
51.57
96.830

F chg

100.00
100.00
99.88
100.00
100.00
100.00
100.00
100.00
99.61
99.30
99.34
95.33
55.56
98.18
92.04
84.76
88.53
99.70
96.16
96.71
63.29
97.54
70.50
98.99
99.75
99.37
94.06
75.83
85.23
79.12
80.65
99.99
72.73
95.39
95.68
91.69
86.83
93.67
97.57
96.66
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103
''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322
``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = PRP & Wd = it) (Pos PRP0)


(Pos = PRP & Wd = he) (Pos PRP1)

B.42


0.00
79.67
88.46
86.49
81.07
82.53
54.00
86.99

0.00
90.32
0.00

33.33
85.90
79.52
82.06
78.58
72.41
89.88

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374
0.143

0.618

0.512
0.673
0.697
0.374

0.978
0.972
0.570
0.072
0.374
0.178

0.713
0.617
0.374
0.191

0.172
0.037
0.674
0.011
0.220
0.168
0.374

0.197
0.400

100.00
0.00 99.99
99.85
100.00
100.00
100.00
100.00
99.95
0.00 99.54
98.51 99.42
0.00 99.37
97.38
0.00 43.02
0.00 98.36
82.12 92.22
27.06 88.39
59.26 95.48
0.00 99.74
67.69 96.12
98.14 96.29
11.54 62.63
90.35 97.45
76.62
99.21
0.00 99.59
0.00 99.67
81.40 92.80
0.00 71.81
0.00 86.13
75.66
0.00 75.44
99.99
5.88 48.98
79.12 95.68
70.19 95.44
91.56 93.27
72.77 89.52
32.06 93.09
62.74 97.58
0.00 96.76
99.45
100.00
0.00 99.94
0.00 99.99

0.04

0.04
0.02
-3.23
-0.17

0.13
-0.91
0.15
0.03
0.00
-1.76

0.356

0.777
0.323
0.374
0.033

0.606
0.103
0.418
0.970
0.908
0.198

84.60

-0.03 0.289

RecU

-0.01
0.01

-0.00

-0.00
0.00
-0.02
0.03

0.00
0.00
0.02
-0.01
0.18
0.01

0.00
-0.08
0.11
-0.09

0.02
-0.02
-0.01
-0.04
0.03
-0.02
-0.01

-0.09
-0.00

PrecU

100.00
99.99
99.85
100.00
100.00
100.00
100.00
99.95
99.54
99.42
99.37
97.38
43.02
98.36
92.22
88.39
95.48
99.74
96.12
96.29
62.63
97.45
76.62
99.21
99.59
99.67
92.80
71.81
86.13
75.66
75.44
99.99
48.98
95.68
95.44
93.27
89.52
93.09
97.58
96.76
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.75
99.99
100.00
100.00
100.00
99.94
99.34
99.57
99.28
99.58
32.35
98.91
92.07
92.24
95.77
99.79
95.08
97.96
73.19
97.20
84.63
99.65
99.33
99.96
90.54
66.98
84.36
75.72
72.88
100.00
36.00
94.65
95.91
94.22
90.90
92.16
96.37
96.38
100.00
100.00
99.87
99.99
50.67
96.851

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.96
99.75
99.28
99.46
95.27
64.17
97.82
92.37
84.86
95.19
99.69
97.17
94.69
54.73
97.71
70.00
98.76
99.85
99.39
95.19
77.38
87.98
75.59
78.18
99.99
76.60
96.73
94.97
92.35
88.19
94.03
98.83
97.15
98.90
100.00
100.00
100.00

F(0.5)

Rec

SVM
#
158
$
8103
''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322
``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = PRP & Wd = it) (Pos PRP0)


10.00
5.00
78.95
70.59
74.55
74.05
83.76
40.22
83.85

84.18
25.00

0.00
82.40
81.04
70.78
66.70
62.93
83.95

p(FU)

24622

98.49

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.140
0.634
0.945
0.134
0.206
0.418
0.827
0.104
0.723
0.178
0.667
0.472
0.261
0.215
0.201
0.807
0.980
0.994
0.451
0.328
0.394
0.317

0.077
0.968
0.033
0.851
0.597
0.296
0.492

0.653
0.906

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.53
97.25 99.40
0.00 99.40
97.01
1.49 50.68
4.55 98.44
72.63 91.67
28.24 87.90
75.93 91.57
0.00 99.73
68.76 96.14
94.34 96.99
15.81 65.12
87.17 97.75
74.49
99.21
0.00 99.55
0.00 99.64
80.43 92.55
14.29 70.86
0.00 68.38
77.37
0.00 81.67
99.99
0.00 57.67
77.19 95.31
62.21 95.06
91.77 91.96
82.82 88.83
55.73 93.00
58.24 97.09
0.00 96.16
99.35
100.00
0.00 99.94
0.00 99.99

-0.22

-0.47
15.70
1.83

-0.67
0.19
9.24
-0.34

-0.04

0.00
-2.58
0.90
-0.15
6.58
3.03

0.334

0.442
0.633
0.374

0.282
0.576
0.941
0.395

0.990

0.950
0.148
0.199
0.836
0.228
0.352

81.81

-0.16 0.602

RecU

0.01

0.05
-0.01
-0.00
-0.26
-2.02
0.01
0.01
0.40
-1.81
0.01
0.01
-0.02
-0.73
0.02
-0.91
-0.01
-0.00
-0.00
-0.03
0.55
-12.95
0.43

-0.12
0.00
0.18
-0.01
0.03
0.07
0.07

-0.26
-0.00

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.53
99.40
99.40
97.01
50.68
98.44
91.67
87.90
91.57
99.73
96.14
96.99
65.12
97.75
74.49
99.21
99.55
99.64
92.55
70.86
68.38
77.37
81.67
99.99
57.67
95.31
95.06
91.96
88.83
93.00
97.09
96.16
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.37
99.47
99.49
98.84
46.64
98.70
91.31
91.25
92.34
99.78
96.04
97.33
66.37
97.89
78.34
99.49
99.37
99.89
91.11
65.93
65.84
75.72
83.05
100.00
47.00
95.35
94.47
92.31
90.73
92.42
96.62
95.70
99.62
100.00
99.87
99.99
51.69
96.839

F chg

Rec

100.00
100.00
99.91
100.00
100.00
100.00
100.00
100.00
99.69
99.32
99.31
95.24
55.50
98.17
92.04
84.78
90.81
99.67
96.25
96.65
63.92
97.60
71.00
98.94
99.73
99.39
94.04
76.59
71.11
79.10
80.33
99.98
74.60
95.27
95.66
91.60
87.00
93.57
97.57
96.62
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103
''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322
``
7811
SENT
43766
TOKENS 1044667


24622

0.00
99.15 98.77
0.00
100.00
72.73
82.45
80.85
90.00
79.72
89.35
57.72
90.02

5.97
36.36
82.36
44.71
66.67
0.00
78.06
95.60
36.75
92.73

0.00
0.00
92.95 84.30
0.00 0.00
0.00
100.00 100.00

90.64
84.35
85.42
81.00
73.87
89.51

0.00
86.67
80.99
90.04
79.19
62.60
78.59
0.00

0.00
0.00
87.32

p(FU)

0.00

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.177
0.308
0.295

0.558
0.287
0.633
0.701
0.281
0.178
0.967
0.098
0.867
0.811
0.228
0.626
0.999
0.705
0.155
0.419
0.185
0.516

0.374
0.020
0.202
0.327
0.018
0.373
0.417
0.957

0.155
0.065

RecU

-0.01
-0.00
-0.00

-1.45
-0.01
-0.01
-0.02
-0.18
-0.01
-0.00
-0.01
-0.06
0.00
0.51
-0.01
-0.00
-0.01
-0.03
0.22
-0.92
-0.10

2.92
0.02
-0.02
-0.03
-0.05
0.03
0.00
-0.00

-0.17
-0.01

PrecU

100.00
99.99
99.77
100.00
100.00
100.00
100.00
99.97
99.53
99.40
99.36
97.05
40.72
98.21
92.30
87.44
94.60
99.65
96.48
96.95
59.40
98.07
69.17
99.03
99.50
99.53
92.29
68.52
81.82
76.29
70.48
99.99
42.11
96.29
96.31
93.06
90.29
94.18
97.79
96.14
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.57
99.99
100.00
100.00
100.00
99.94
99.33
99.46
99.31
98.53
28.57
98.70
92.80
90.17
94.69
99.56
96.58
97.71
54.17
98.50
64.99
99.63
99.14
99.95
90.84
62.73
79.63
73.22
62.71
100.00
28.00
95.11
96.59
92.00
89.51
93.74
96.74
95.93
100.00
100.00
99.87
99.99
53.63
97.050

F chg

Rec

100.00
100.00
99.97
100.00
100.00
100.00
100.00
100.00
99.74
99.34
99.42
95.61
70.83
97.72
91.81
84.87
94.51
99.74
96.38
96.20
65.75
97.64
73.93
98.43
99.86
99.11
93.79
75.49
84.13
79.64
80.43
99.99
84.85
97.50
96.02
94.13
91.08
94.63
98.86
96.34
98.90
100.00
100.00
100.00

F(0.5)

Prec

True

MAX
#
158
$
8103
''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322
``
7811
SENT
43766
TOKENS 1044667

100.00
99.99
99.77
100.00
100.00
100.00
100.00
99.97
99.53
99.40
99.36
97.05
40.72
98.21
92.30
87.44
94.60
99.65
96.48
96.95
59.40
98.07
69.17
99.03
99.50
99.53
92.29
68.52
81.82
76.29
70.48
99.99
42.11
96.29
96.31
93.06
90.29
94.18
97.79
96.14
99.45
100.00
99.94
99.99

-0.01

-0.03
-1.83
-2.24

-0.13
0.02
0.65
0.04

-0.23

0.23
0.36
-0.11
-0.21
-0.80
0.41

0.678

0.802
0.374
0.178

0.057
0.710
0.635
0.754

0.182

0.380
0.207
0.392
0.590
0.374
0.284

-0.02 0.527

B.43


0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1
24622

98.57

50.00
0.00
79.04
81.25
71.93
74.05
83.69
42.20
82.45

83.23
0.00

82.79
81.85
71.42
66.40
57.26
86.21

p(FU)

0.373

0.571
0.035
0.962
0.095
0.584
0.656
0.204
0.880
0.373
0.466
0.905
0.475
0.384
0.099
0.637
0.577
0.647

0.950
0.368
0.381
0.165
0.178
0.374

0.204
0.977
0.040
0.327
0.683
0.232
0.092

0.112
0.145

F chgU

-0.01

0.01
-0.04
0.00
-0.21
-2.69
0.01
-0.04
-0.03
-1.38
0.01
-0.00
-0.02
-0.75
-0.04
0.31
0.01
0.01

0.00
-0.36
-9.39
0.72
2.38
0.00

-0.09
-0.00
0.22
-0.11
-0.03
0.07
-0.08

-0.61
-0.01

F(0.5)U

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.48
99.37
99.40
97.06
50.34
98.43
91.62
87.52
91.97
99.73
96.13
96.99
65.11
97.69
75.41
99.23
99.56
99.64
92.58
70.22
71.18
77.60
83.61
99.99
57.67
95.34
95.06
91.99
88.74
92.94
97.09
96.02
99.35
100.00
99.94
99.99

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.48
96.66 99.37
0.00 99.40
97.06
4.48 50.34
0.00 98.43
71.93 91.62
15.29 87.52
75.93 91.97
0.00 99.73
69.22 96.13
94.63 96.99
19.66 65.11
88.44 97.69
75.41
99.23
0.00 99.56
0.00 99.64
81.78 92.58
0.00 70.22
0.00 71.18
77.60
0.00 83.61
99.99
0.00 57.67
75.09 95.34
62.44 95.06
92.75 91.99
81.70 88.74
51.15 92.94
53.53 97.09
0.00 96.02
99.35
100.00
0.00 99.94
0.00 99.99

-0.48

-0.92
-26.16

-0.32
0.29
29.08
-0.50

0.25

-1.20
-1.95
1.88
-1.00
-2.58
-1.05

0.034

0.162
0.343

0.476
0.388
0.326
0.144

0.779

0.159
0.301
0.033
0.524
0.706
0.677

81.74

-0.25 0.033

RecU

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.41
99.42
99.48
98.84
46.64
98.69
91.22
90.36
92.30
99.79
96.01
97.38
66.27
97.93
80.35
99.56
99.35
99.90
91.21
65.72
70.37
76.17
86.44
100.00
47.00
95.32
94.56
92.24
90.55
92.29
96.45
95.47
99.62
100.00
99.87
99.99
51.51
96.828

PrecU

100.00
100.00
99.88
100.00
100.00
100.00
100.00
100.00
99.55
99.32
99.33
95.33
54.68
98.17
92.03
84.85
91.65
99.67
96.25
96.60
63.99
97.46
71.05
98.91
99.76
99.38
93.99
75.38
72.00
79.08
80.95
99.99
74.60
95.36
95.57
91.75
87.00
93.61
97.74
96.57
99.08
100.00
100.00
100.00

TrueU

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

p(F)

F chg

F(0.5)

Rec

Prec

True

(Pos = DT & Wd {either, half, many, neither}) (Pos DT0)
(Pos = DT & Wd = all) (Pos DT1)
(Pos = DT & Wd = no) (Pos DT2)
(Pos = DT & Wd = both) (Pos DT3)
(Pos = DT & Wd {any, every, some, these, those}) (Pos DT4)
(Pos = DT & Wd {del, la, le, nary, them}) (Pos DT5)
(Pos = DT & Wd {a, an, another, each, that, the, this}) (Pos DT6)

B.44 rp(clc) Mapping

0.00
79.64
88.89
88.89
81.11
82.52
53.85
87.25

0.00
90.13
0.00

33.33
86.18
80.64
81.96
78.88
72.41
89.85

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.374
0.178
0.374
0.699

0.749
0.603
0.761
0.180

0.869
0.380
0.395
0.291
0.374

0.254
0.688
0.374
0.607

0.667
0.681
0.921
0.622
0.938
0.608
0.070

0.754
0.994

100.00
0.00 99.99
99.84
100.00
100.00
100.00
100.00
99.95
0.00 99.53
98.51 99.42
0.00 99.37
97.38
0.00 43.02
0.00 98.36
82.06 92.21
28.24 88.42
59.26 95.50
0.00 99.74
67.67 96.12
98.14 96.30
11.97 62.68
90.44 97.47
76.62
99.20
0.00 99.59
0.00 99.67
81.40 92.78
0.00 71.91
0.00 86.13
75.79
0.00 75.44
99.99
5.88 48.98
78.77 95.66
71.36 95.45
91.45 93.28
73.04 89.55
32.06 93.06
64.45 97.61
0.00 96.80
99.45
100.00
0.00 99.94
0.00 99.99

-0.02
3.42
1.11

0.05
0.01
-0.35
0.03

-0.10

0.05
0.62
0.03
0.41
0.00
-0.21

0.826
0.374
0.374

0.559
0.608
0.374
0.574

0.374

0.627
0.138
0.882
0.390
0.995
0.374

84.65

0.03 0.071

RecU

-0.01

-0.01
-0.00
-0.00
0.00

-0.00
-0.01
0.01
0.05

0.00
0.00
0.10
0.00
0.18

-0.02
0.07
0.11
0.09

0.00
-0.01
0.00
-0.01
-0.00
0.00
0.03

0.02
0.00

PrecU

100.00
99.99
99.84
100.00
100.00
100.00
100.00
99.95
99.53
99.42
99.37
97.38
43.02
98.36
92.21
88.42
95.50
99.74
96.12
96.30
62.68
97.47
76.62
99.20
99.59
99.67
92.78
71.91
86.13
75.79
75.44
99.99
48.98
95.66
95.45
93.28
89.55
93.06
97.61
96.80
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.74
99.99
100.00
100.00
100.00
99.94
99.33
99.57
99.28
99.58
32.35
98.91
92.06
92.27
95.77
99.79
95.08
97.95
73.26
97.22
84.63
99.65
99.33
99.96
90.53
67.14
84.36
75.69
72.88
100.00
36.00
94.63
95.93
94.22
90.93
92.13
96.42
96.42
100.00
100.00
99.87
99.99
50.73
96.852

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.96
99.73
99.27
99.46
95.27
64.17
97.82
92.36
84.88
95.24
99.69
97.17
94.70
54.77
97.72
70.00
98.74
99.85
99.39
95.15
77.42
87.98
75.90
78.18
99.99
76.60
96.71
94.97
92.36
88.20
94.01
98.83
97.20
98.90
100.00
100.00
100.00

F(0.5)

Rec

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = RP & Wd {back, down, in, off, on, out, over, up}) (Pos RP0)
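
A rule of this form refines a tag according to the word it labels, and the refinement can be undone with a fixed inverse table, so the original tags are recoverable. The following is a minimal sketch of that idea, assuming the rule reads "if the tag is RP and the word is in the listed set, retag as RP0". The names `apply_mapping` and `invert_mapping` are hypothetical and are not taken from the thesis; exact, case-sensitive word matching is also an assumption.

```python
# Word list copied from the rule above; everything else is illustrative.
RP0_WORDS = {"back", "down", "in", "off", "on", "out", "over", "up"}

# Inverse table: each refined tag collapses back to its original tag.
INVERSE = {"RP0": "RP"}


def apply_mapping(tagged):
    """Retag (word, tag) pairs: RP becomes RP0 when the word is in RP0_WORDS."""
    mapped = []
    for word, tag in tagged:
        # Assumption: words are matched exactly as written in the rule.
        if tag == "RP" and word in RP0_WORDS:
            tag = "RP0"
        mapped.append((word, tag))
    return mapped


def invert_mapping(tagged):
    """Recover the original tagset by collapsing refined tags."""
    return [(word, INVERSE.get(tag, tag)) for word, tag in tagged]


sentence = [("He", "PRP"), ("gave", "VBD"), ("up", "RP"), ("along", "RP")]
mapped = apply_mapping(sentence)    # "up" becomes RP0, "along" stays RP
restored = invert_mapping(mapped)   # identical to the original sentence
```

Since every refined tag has exactly one source tag, inversion is a plain dictionary lookup, which is what allows results on the mapped tagset to be compared against the original one.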

0.00
0.00
78.92
84.21
71.93
0.00
74.17
83.83
31.63
82.51

83.40
50.00
0.00
0.00
25.00
81.87
81.65
69.63
64.98
63.11
85.29

p(FU)

24622

98.87

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.204
0.605
0.675
0.407
0.407
0.519
0.390
0.682
0.392
0.641
0.838
0.307
0.128
0.446
0.240
0.757
0.178
0.374
0.556
0.777
0.410
0.187
0.374
0.374
0.374
0.151
0.843
0.568
0.051
0.774
0.073
0.235

0.631
0.904

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.53
97.10 99.39
0.00 99.40
97.11
0.00 50.34
0.00 98.44
72.47 91.62
18.82 87.64
75.93 95.02
0.00 99.71
69.06 96.13
93.71 96.98
13.25 64.72
89.21 97.75
75.38
99.23
0.00 99.56
0.00 99.65
82.75 92.55
14.29 70.59
0.00 84.22
77.27
0.00 80.99
99.99
5.88 56.47
75.26 95.32
62.68 95.07
92.53 91.84
79.05 88.69
49.62 92.95
55.89 97.11
0.00 96.25
99.35
100.00
0.00 99.94
0.00 99.99

-0.10

-0.60

-0.36
-0.08
-10.13
-0.05

0.95

-1.61
-1.85
0.32
-3.62
0.17
1.16

0.577

0.251

0.777
0.710
0.204
0.877

0.377

0.155
0.403
0.603
0.008
0.947
0.654

81.65

-0.36 0.294

RecU

0.05
-0.01
-0.00
-0.16
-2.70
0.01
-0.04
0.10
1.89
-0.00
0.00
-0.03
-1.35
0.03
0.27
0.01
0.01
0.01
-0.03
0.17
7.22
0.29
-0.83
0.00
-2.08
-0.11
0.01
0.05
-0.16
-0.02
0.08
0.17

-0.18
-0.00

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.53
99.39
99.40
97.11
50.34
98.44
91.62
87.64
95.02
99.71
96.13
96.98
64.72
97.75
75.38
99.23
99.56
99.65
92.55
70.59
84.22
77.27
80.99
99.99
56.47
95.32
95.07
91.84
88.69
92.95
97.11
96.25
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.36
99.44
99.50
98.84
47.06
98.69
91.26
90.56
95.82
99.79
96.02
97.30
66.37
97.98
80.60
99.50
99.38
99.90
91.08
66.09
81.28
75.62
83.05
100.00
48.00
95.34
94.43
92.49
90.56
92.23
96.53
95.85
99.62
100.00
99.87
99.99
51.73
96.840

F chg

Rec

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.69
99.35
99.30
95.43
54.11
98.19
91.98
84.90
94.23
99.64
96.24
96.67
63.14
97.53
70.80
98.96
99.73
99.39
94.07
75.75
87.39
78.99
79.03
99.99
68.57
95.30
95.72
91.20
86.90
93.69
97.69
96.66
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

24622

0.00
99.26 98.80
0.00
100.00
72.73
82.50
80.85
86.05
79.49
89.36
57.62
90.12

5.97
36.36
82.34
44.71
68.52
0.00
78.11
95.52
37.18
92.69

0.00
0.00
92.77 84.50
0.00 0.00
0.00
100.00 100.00

90.93
84.18
85.63
81.00
73.21
89.22

0.00
86.14
81.22
90.26
79.19
62.60
77.94
0.00

0.00
0.00
87.30

p(FU)

0.00

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.716
0.101
0.194
0.719
0.203
0.414
0.806
0.931
0.533
0.374
0.983
0.243
0.172
0.991
0.159
0.374
0.374

0.554
0.870
0.268
0.992

0.374
0.402
0.569
0.262
0.552
0.205
0.766
0.667

0.590
0.633

RecU

-0.00
0.00
-0.00
0.05
1.75
-0.00
0.00
0.01
-0.07
-0.00
0.00
-0.01
-0.38
-0.00
0.53
0.01
0.00

-0.01
0.06
-0.60
0.02

2.92
0.01
-0.01
-0.02
-0.02
-0.03
0.00
-0.01

-0.08
-0.00

PrecU

100.00
99.99
99.77
100.00
100.00
100.00
100.00
99.97
99.54
99.41
99.36
97.10
42.04
98.21
92.31
87.47
94.70
99.66
96.48
96.95
59.21
98.07
69.18
99.04
99.50
99.53
92.30
68.41
82.08
76.39
70.48
99.99
42.11
96.28
96.32
93.06
90.32
94.13
97.79
96.12
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.57
99.99
100.00
100.00
100.00
99.94
99.33
99.47
99.31
98.53
29.41
98.71
92.82
90.20
94.93
99.57
96.58
97.70
53.96
98.49
64.74
99.63
99.15
99.95
90.91
62.57
79.63
72.59
62.71
100.00
28.00
95.10
96.60
91.99
89.55
93.65
96.74
95.91
100.00
100.00
99.87
99.99
53.68
97.053

F chg

Rec

100.00
100.00
99.97
100.00
100.00
100.00
100.00
100.00
99.74
99.35
99.41
95.71
73.68
97.72
91.82
84.90
94.48
99.74
96.37
96.21
65.58
97.64
74.28
98.45
99.85
99.12
93.75
75.44
84.68
80.60
80.43
99.99
84.85
97.49
96.03
94.17
91.10
94.61
98.86
96.34
98.90
100.00
100.00
100.00

F(0.5)

Prec

True

MAX
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

100.00
99.99
99.77
100.00
100.00
100.00
100.00
99.97
99.54
99.41
99.36
97.10
42.04
98.21
92.31
87.47
94.70
99.66
96.48
96.95
59.21
98.07
69.18
99.04
99.50
99.53
92.30
68.41
82.08
76.39
70.48
99.99
42.11
96.28
96.32
93.06
90.32
94.13
97.79
96.12
99.45
100.00
99.94
99.99

0.06

-0.02
-1.83
-2.63

-0.25
-0.02
1.29
0.08

-0.20

0.07
0.41
0.13
-0.21
-1.20
-0.18

0.178

0.826
0.374
0.202

0.105
0.710
0.587
0.630

0.179

0.727
0.076
0.127
0.369
0.310
0.858

-0.04 0.158

B.45

100.00
100.00 100.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.48
98.28 97.07 99.37
0.00 99.40
97.21
100.00 2.99 51.45
0.00 0.00 98.42
78.34 71.93 91.60
69.23 21.18 87.78
70.69 75.93 93.18
0.00 99.73
73.89 69.62 96.14
83.58 94.17 96.99
39.00 16.67 65.13
83.64 87.55 97.74
75.24
99.24
0.00 0.00 99.55
0.00 99.64
85.66 82.17 92.52
0.00 0.00 71.00
0.00 78.48
77.05
0.00 82.64
99.99
0.00 57.67
83.46 76.14 95.47
82.28 64.32 95.04
71.43 91.45 91.94
65.70 82.12 88.70
63.55 51.91 92.90
89.26 56.96 97.10
0.00 96.14
99.35
100.00
0.00 99.94
0.00 99.99

24622

81.75

p(FU)

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F chgU

F(0.5)U

0.374

0.807
0.104
0.785
0.558
0.779
0.747
0.142
0.333
0.412
0.471
0.779
0.069
0.448
0.690
0.823
0.288
0.374

0.401
0.264
0.712
0.962
0.374
0.374

0.364
0.267
0.192
0.022
0.313
0.271
0.641

0.333
0.536

RecU

0.01

0.00
-0.03
0.00
-0.05
-0.54
-0.00
-0.07
0.26
-0.09
0.01
0.01
-0.02
-0.73
0.01
0.08
0.02
-0.00

-0.06
0.75
-0.09
0.01
1.20
0.00

0.04
-0.03
0.16
-0.16
-0.08
0.07
0.05

-0.34
-0.01

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.48
99.37
99.40
97.21
51.45
98.42
91.60
87.78
93.18
99.73
96.14
96.99
65.13
97.74
75.24
99.24
99.55
99.64
92.52
71.00
78.48
77.05
82.64
99.99
57.67
95.47
95.04
91.94
88.70
92.90
97.10
96.14
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.99
99.92
99.99
100.00
100.00
100.00
99.94
99.39
99.45
99.49
98.95
48.32
98.68
91.29
90.94
92.06
99.75
96.03
97.35
66.64
97.89
80.35
99.55
99.36
99.90
90.97
66.46
82.92
75.41
84.75
100.00
47.00
95.41
94.39
92.18
90.74
92.34
96.52
95.70
99.62
100.00
99.87
99.99
51.64
96.835

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.57
99.29
99.32
95.53
55.02
98.17
91.91
84.83
94.32
99.70
96.24
96.64
63.68
97.59
70.73
98.94
99.73
99.38
94.13
76.22
74.49
78.77
80.65
99.99
74.60
95.53
95.69
91.70
86.74
93.46
97.68
96.58
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = RP & Wd {across, along, back, ... <4 omitted> ..., out, over, up}) (Pos RP0)

-0.41

-1.34
-6.97
-0.89

-0.13
-0.01
12.39
-0.26

1.92

-0.10
-0.07
1.26
-1.36
3.03
4.18

0.127

0.015
0.958
0.374

0.829
0.927
0.675
0.438

0.032

0.811
0.957
0.052
0.051
0.659
0.263

-0.24 0.385

B.46 in(clc,s) Mapping

0.00
79.53
88.89
84.62
81.30
82.50
54.00
87.23

0.00
90.11
0.00

33.33
85.74
81.67
81.91
78.74
71.93
89.38

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374
0.137
0.651
0.025
0.442
0.381
0.003
0.479
0.450
0.856

0.357
0.752
0.967
0.428
0.374
0.178
0.374
0.374
0.019
0.603
0.374
0.934

0.498
0.196
0.586
0.894
0.936
0.097
0.832

0.096
0.024

100.00
0.00 99.99
99.85
100.00
100.00
100.00
100.00
99.95
0.00 99.55
98.48 99.42
0.00 99.40
97.23
0.00 44.14
0.00 98.43
82.12 92.23
28.24 88.32
61.11 95.46
0.00 99.74
67.64 96.12
98.11 96.30
11.54 62.62
90.31 97.47
76.34
99.21
0.00 99.59
0.00 99.67
81.20 92.84
0.00 71.72
0.00 86.01
75.71
0.00 75.44
99.99
5.88 48.98
79.12 95.66
71.13 95.47
91.13 93.29
73.46 89.56
31.30 93.06
64.88 97.62
0.00 96.75
99.45
100.00
0.00 99.94
0.00 99.99

-0.02

-0.04
3.42
0.91

0.13
-0.02
-3.23
-0.05

-0.24

0.04
1.04
-0.17
0.62
-1.86
-0.04

0.374

0.577
0.374
0.594

0.468
0.413
0.374
0.558

0.181

0.914
0.132
0.406
0.306
0.605
0.829

84.64

0.01 0.774

RecU

-0.01
0.02
-0.00
0.02
-0.15
2.62
0.06
0.01
-0.10
0.00

0.01
0.00
0.01
0.00
-0.18
0.01
-0.00
-0.01
0.05
-0.20
-0.03
-0.02

0.01
0.01
0.01
0.01
0.00
0.02
-0.02

0.37
0.01

PrecU

100.00
99.99
99.85
100.00
100.00
100.00
100.00
99.95
99.55
99.42
99.40
97.23
44.14
98.43
92.23
88.32
95.46
99.74
96.12
96.30
62.62
97.47
76.34
99.21
99.59
99.67
92.84
71.72
86.01
75.71
75.44
99.99
48.98
95.66
95.47
93.29
89.56
93.06
97.62
96.75
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.75
99.99
100.00
100.00
100.00
99.94
99.37
99.57
99.36
99.58
34.03
98.92
92.08
92.24
95.82
99.79
95.09
97.96
73.12
97.21
84.13
99.65
99.33
99.96
90.50
66.82
84.16
76.00
72.88
100.00
36.00
94.64
95.91
94.22
90.95
92.16
96.45
96.69
100.00
100.00
99.87
99.99
50.90
96.865

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.96
99.74
99.27
99.44
94.98
62.79
97.93
92.37
84.73
95.10
99.69
97.18
94.69
54.76
97.73
69.87
98.76
99.85
99.38
95.31
77.39
87.96
75.42
78.18
99.99
76.60
96.71
95.03
92.38
88.21
93.98
98.82
96.81
98.90
100.00
100.00
100.00

F(0.5)

Rec

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = IN & Wd {@, a, and, ... <12 omitted> ..., v., vs., which}) (Pos IN0)
(Pos = IN & Wd = than) (Pos IN1)
(Pos = IN & Wd {til, ago, albeit, ... <13 omitted> ..., whereas, whether, while}) (Pos IN2)

0.00
16.67
79.30
68.42
71.93
74.24
83.87
31.25
81.71

84.19
20.00

83.94
78.47
69.84
67.13
57.39
87.25

p(FU)

24622

98.74

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.687
0.357
0.556
0.626
0.282
0.153
0.564
0.634
0.389
0.141
0.826
0.595
0.382
0.724
0.431
0.137
0.180
0.178
0.110
0.991
0.381
0.198

0.374

0.480
0.123
0.444
0.847
0.626
0.147
0.468
0.374

0.149
0.411

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.49
96.78 99.38
0.00 99.39
97.05
0.00 49.88
4.55 98.44
72.23 91.63
30.59 87.63
75.93 91.27
0.00 99.74
69.29 96.12
94.09 97.01
8.55 65.16
89.59 97.72
74.85
99.25
0.00 99.56
0.00 99.65
79.46 92.49
14.29 70.47
0.00 73.15
77.35
0.00 81.67
99.99
0.00 57.67
77.02 95.38
65.02 95.13
92.21 91.78
81.56 88.82
50.38 92.91
55.67 97.11
0.00 96.17
99.33
100.00
0.00 99.94
0.00 99.99

-0.33

-0.55
21.27

-0.15
0.14
-35.40
-0.35

-0.66

0.78
-1.57
0.34
-0.49
-3.25
1.83

0.200

0.499
0.387

0.805
0.280
0.278
0.640

0.472

0.762
0.427
0.675
0.677
0.607
0.347

81.79

-0.18 0.444

RecU

0.01
-0.03
-0.01
-0.22
-3.59
0.02
-0.03
0.09
-2.13
0.02
-0.01
-0.01
-0.67
-0.01
-0.43
0.03
0.01
0.01
-0.09
-0.01
-6.87
0.39

0.00

-0.05
0.07
-0.02
-0.02
-0.06
0.09
0.08
-0.02

-0.59
-0.01

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.49
99.38
99.39
97.05
49.88
98.44
91.63
87.63
91.27
99.74
96.12
97.01
65.16
97.72
74.85
99.25
99.56
99.65
92.49
70.47
73.15
77.35
81.67
99.99
57.67
95.38
95.13
91.78
88.82
92.91
97.11
96.17
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.34
99.42
99.47
98.63
42.44
98.71
91.22
90.89
88.45
99.81
96.06
97.36
66.20
97.93
79.85
99.54
99.37
99.91
90.92
65.88
83.54
75.89
83.05
100.00
47.00
95.39
94.69
92.22
90.56
92.18
96.54
96.04
99.58
100.00
99.87
99.99
51.52
96.831

F chg

Rec

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.63
99.34
99.31
95.52
60.48
98.18
92.05
84.60
94.29
99.67
96.19
96.66
64.15
97.51
70.44
98.96
99.74
99.39
94.11
75.74
65.06
78.85
80.33
99.99
74.60
95.37
95.57
91.35
87.15
93.66
97.69
96.30
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

24622

0.00
99.21 98.86
0.00
80.00
71.43
82.37
80.85
84.44
79.53
89.44
56.08
90.15

5.97
22.73
82.54
44.71
70.37
0.00
78.01
95.47
35.47
92.99

0.00
0.00
92.75 84.30
0.00 0.00
0.00
100.00 100.00

90.93
84.31
85.54
81.44
73.45
90.39

0.00
86.14
80.75
90.26
79.05
63.36
78.59
0.00

0.00
0.00
87.32

p(FU)

0.00

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.034
0.157
0.207
0.239
0.046
0.131
0.287
0.133
0.280
0.378
0.830
0.912
0.521
0.076
0.247

0.071
0.374
0.015
0.936
0.325
0.724
0.374

0.374
0.398
0.533
0.543
0.572
0.065
0.233
0.038

0.578
0.248

RecU

0.01
0.01
-0.01
0.20
5.04
0.02
0.04
-0.08
-0.17
-0.00
0.00
-0.00
-0.22
0.02
0.63

0.01
0.01
0.11
-0.03
-0.81
-0.06
1.73

2.92
-0.01
0.01
-0.01
0.02
0.07
0.02
-0.28

0.10
0.01

PrecU

100.00
99.99
99.77
100.00
100.00
100.00
100.00
99.97
99.55
99.41
99.36
97.25
43.40
98.24
92.34
87.39
94.61
99.66
96.48
96.96
59.30
98.09
69.25
99.04
99.51
99.54
92.42
68.35
81.90
76.33
71.70
99.99
42.11
96.26
96.34
93.07
90.35
94.22
97.80
95.87
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.57
99.99
100.00
100.00
100.00
99.94
99.36
99.49
99.31
98.42
31.09
98.68
92.84
90.17
94.83
99.58
96.58
97.70
54.20
98.52
65.24
99.62
99.16
99.95
91.03
62.57
79.63
73.19
64.41
100.00
28.00
95.08
96.62
92.01
89.58
93.80
96.77
95.89
100.00
100.00
99.87
99.99
53.78
97.065

F chg

Rec

100.00
100.00
99.97
100.00
100.00
100.00
100.00
100.00
99.74
99.34
99.41
96.10
71.84
97.80
91.85
84.78
94.39
99.73
96.38
96.23
65.47
97.66
73.79
98.45
99.86
99.13
93.85
75.30
84.31
79.75
80.85
99.99
84.85
97.47
96.05
94.16
91.14
94.65
98.86
95.85
98.90
100.00
100.00
100.00

F(0.5)

Prec

True

MAX
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

100.00
99.99
99.77
100.00
100.00
100.00
100.00
99.97
99.55
99.41
99.36
97.25
43.40
98.24
92.34
87.39
94.61
99.66
96.48
96.96
59.30
98.09
69.25
99.04
99.51
99.54
92.42
68.35
81.90
76.33
71.70
99.99
42.11
96.26
96.34
93.07
90.35
94.22
97.80
95.87
99.45
100.00
99.94
99.99

0.06

0.02
-1.83
-2.02

-0.29
0.00
-2.61
0.25

-0.33

0.07
0.19
0.08
-0.03
-0.41
0.87

0.284

0.718
0.374
0.208

0.183
0.906
0.271
0.170

0.159

0.919
0.691
0.500
0.876
0.565
0.446

-0.01 0.797

B.47

24622

0.00
98.87 97.19
0.00
100.00
0.00
79.85
66.67
71.93
73.75
83.29
25.64
82.13

1.49
0.00
71.38
30.59
75.93
0.00
69.78
94.12
8.55
88.48

0.00

0.00
0.00
84.87 80.43
0.00 0.00
0.00
100.00 100.00

82.33
84.23
69.85
66.40
60.53
86.93

0.00
76.84
62.68
91.77
81.70
52.67
56.96
0.00

0.00
0.00
81.67

p(FU)

0.00

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.587
0.908
0.709
0.374
0.458
0.522
0.274
0.047
0.375
0.612
0.459
0.255
0.033
0.637
0.290
0.185
0.591
0.994
0.123
0.032
0.380
0.348
0.482
0.374
0.374
0.925
0.110
0.567
0.155
0.514
0.047
0.731

0.117
0.203

RecU

-0.00
0.00
-0.01
-0.05
-1.80
-0.01
-0.09
0.34
-1.35
0.00
-0.03
-0.05
-1.78
0.01
-0.59
-0.03
-0.01
-0.00
-0.17
0.39
-9.29
0.15
2.20
0.00
-1.52
0.00
0.06
-0.03
-0.12
-0.07
0.11
0.05

-0.94
-0.03

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.47
99.41
99.40
97.21
50.80
98.42
91.57
87.84
92.00
99.72
96.11
96.97
64.43
97.74
74.73
99.19
99.55
99.64
92.42
70.75
71.25
77.16
83.46
99.99
56.79
95.43
95.11
91.76
88.73
92.90
97.13
96.14
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.38
99.47
99.48
98.95
46.64
98.67
91.15
91.11
92.34
99.75
96.07
97.31
65.55
97.98
78.59
99.44
99.37
99.88
90.81
66.09
70.37
75.76
89.83
100.00
46.00
95.44
94.62
92.08
90.45
92.26
96.53
95.79
99.62
100.00
99.87
99.99
51.33
96.817

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.57
99.35
99.31
95.53
55.78
98.16
92.01
84.81
91.66
99.69
96.14
96.63
63.35
97.50
71.23
98.94
99.72
99.40
94.09
76.12
72.15
78.62
77.94
99.99
74.19
95.41
95.62
91.45
87.08
93.56
97.75
96.50
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = PRP & Wd {I, he, she, they, we}) (Pos PRPN)
(Pos = PRP & Wd {her, him, me, them, us}) (Pos PRPA)
(Pos = PRP & Wd {herself, himself, itself, myself, ourselves, themselves, yourself, yourselves}) (Pos PRPRX)

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.47
99.41
99.40
97.21
50.80
98.42
91.57
87.84
92.00
99.72
96.11
96.97
64.43
97.74
74.73
99.19
99.55
99.64
92.42
70.75
71.25
77.16
83.46
99.99
56.79
95.43
95.11
91.76
88.73
92.90
97.13
96.14
99.35
100.00
99.94
99.99

-0.06

-0.83
20.29

-0.11
-0.22
-38.30
-0.68

0.36

-0.27
-0.53
0.15
-1.00
1.56
3.10

0.786

0.064
0.412

0.807
0.312
0.059
0.351

0.780

0.730
0.831
0.809
0.616
0.582
0.340

-0.33 0.371

B.48

0.00
79.80
88.46
84.21
81.05
82.48
53.85
87.36

0.00
90.32
0.00

33.33
86.51
80.11
82.20
78.73
72.88
89.68

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.676
0.997
0.835
0.071
0.787
0.223
0.641
0.468
0.944
0.374
0.985
0.649
0.980
0.555
0.199
0.374
0.983
0.987
0.773
0.453
0.374
0.145

0.885
0.956
0.184
0.267
0.415
0.749
0.195

0.668
0.165

100.00
0.00 99.99
99.85
100.00
100.00
100.00
100.00
99.96
0.00 99.53
98.51 99.42
0.00 99.38
97.23
0.00 42.54
0.00 98.38
82.10 92.21
27.06 88.34
59.26 95.46
0.00 99.73
67.74 96.11
98.14 96.29
11.97 62.62
90.48 97.47
76.71
99.20
0.00 99.59
0.00 99.67
81.40 92.81
0.00 71.71
0.00 86.13
75.35
0.00 75.44
99.99
5.88 48.98
78.77 95.66
70.89 95.46
91.45 93.24
72.91 89.53
32.82 93.08
65.10 97.60
0.00 96.64
99.45
100.00
0.00 99.94
0.00 99.99

0.11
0.00
-1.09

0.08
-0.01
-0.35
0.11

0.24
-0.04
0.19
0.22
1.84
0.29

0.308
0.744
0.374

0.549
0.456
0.374
0.032

0.363
0.988
0.402
0.707
0.821
0.788

84.67

0.06 0.112

RecU

-0.01
0.00
0.00
-0.15
-1.10
0.02
-0.00
-0.08
0.00
-0.00
-0.00
-0.00
0.01
0.00
0.30
0.01
0.00
0.00
0.01
-0.22
0.11
-0.49

0.00
0.00
-0.04
-0.03
0.02
-0.01
-0.14

0.04
-0.00

PrecU

100.00
99.99
99.85
100.00
100.00
100.00
100.00
99.96
99.53
99.42
99.38
97.23
42.54
98.38
92.21
88.34
95.46
99.73
96.11
96.29
62.62
97.47
76.71
99.20
99.59
99.67
92.81
71.71
86.13
75.35
75.44
99.99
48.98
95.66
95.46
93.24
89.53
93.08
97.60
96.64
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.75
99.99
100.00
100.00
100.00
99.94
99.37
99.57
99.33
99.58
32.35
98.80
92.06
92.24
95.77
99.79
95.09
97.96
73.09
97.23
84.63
99.65
99.33
99.96
90.63
66.72
84.36
75.65
72.88
100.00
36.00
94.61
95.95
94.15
90.89
92.22
96.39
96.88
100.00
100.00
99.87
99.99
50.74
96.851

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.98
99.69
99.27
99.42
94.98
62.10
97.96
92.37
84.75
95.15
99.68
97.17
94.68
54.78
97.72
70.15
98.75
99.85
99.39
95.10
77.50
87.98
75.05
78.18
99.99
76.60
96.72
94.98
92.35
88.21
93.95
98.84
96.39
98.90
100.00
100.00
100.00

F(0.5)

Rec

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = IN & SibR = S) (Pos INSUB)


(Pos = IN & Par = SBAR) (Pos INSUB)
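
These two rules condition on the syntactic context of a token rather than on its word form. The sketch below shows one way such rules could be applied to a parse tree. It assumes that `Par` denotes the label of the parent node and `SibR` the label of the right sibling, and it treats the two conditions as alternatives that both yield INSUB; both readings are assumptions, and the tree encoding and the function `retag_in` are hypothetical.

```python
# A tree node is (label, children); a leaf is (pos_tag, word).
def is_leaf(node):
    return isinstance(node[1], str)


def retag_in(tree):
    """Return (word, tag) pairs, retagging IN as INSUB when the parent is
    SBAR or the right sibling is S (assumed reading of Par and SibR)."""
    out = []

    def walk(node, parent_label, right_sibling_label):
        label, body = node
        if is_leaf(node):
            tag = label
            if tag == "IN" and (parent_label == "SBAR"
                                or right_sibling_label == "S"):
                tag = "INSUB"
            out.append((body, tag))
            return
        for i, child in enumerate(body):
            # Label of the next child, or None for the last child.
            sib = body[i + 1][0] if i + 1 < len(body) else None
            walk(child, label, sib)

    walk(tree, None, None)
    return out


tree = ("S", [
    ("NP", [("PRP", "He")]),
    ("VP", [
        ("VBD", "left"),
        ("SBAR", [
            ("IN", "because"),
            ("S", [("NP", [("PRP", "it")]), ("VP", [("VBD", "rained")])]),
        ]),
        ("PP", [("IN", "in"), ("NP", [("NN", "silence")])]),
    ]),
])
pairs = retag_in(tree)  # "because" becomes INSUB, "in" stays IN
```

As with the word-based rules, the original tag is recovered by collapsing INSUB back to IN.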

B.49

0.00
16.67
77.75
76.19
72.73
0.00
75.02
83.91
33.63
82.01

83.14
0.00

81.26
83.54
70.50
65.38
57.80
90.33

p(FU)

24622

0.00
98.37

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.199
0.559
0.811
0.583
0.274
0.324
0.167
0.241
0.505
0.809
0.387
0.180
0.312
0.809
0.628
0.568
0.989
0.374
0.512
0.434
0.188
0.124

0.374

0.143
0.124
0.385
0.717
0.076
0.038
0.091

0.273
0.327

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.50
97.28 99.39
0.00 99.40
97.21
0.00 50.33
4.55 98.44
72.32 91.57
18.82 87.76
74.07 90.33
0.00 99.72
68.48 96.11
93.51 96.99
16.24 64.84
88.91 97.72
75.12
99.20
0.00 99.55
0.00 99.65
83.14 92.54
0.00 70.98
0.00 59.12
77.29
0.00 81.67
99.99
0.00 57.67
74.56 95.31
64.32 95.13
92.32 91.87
80.73 88.86
48.09 92.80
58.03 97.16
0.00 96.30
99.35
100.00
0.00 99.94
0.00 99.99

-0.26

-1.42

-0.65

-0.26
-0.13
5.40
-0.52

1.03

-2.44
0.59
0.93
-2.37
-5.34
5.86

0.348

0.059

0.517

0.650
0.645
0.806
0.348

0.526

0.172
0.450
0.351
0.294
0.494
0.136

81.59

-0.44 0.099

RecU

0.02
-0.02
0.00
-0.05
-2.71
0.01
-0.09
0.25
-3.13
0.00
-0.02
-0.03
-1.16
-0.01
-0.08
-0.02
-0.00
0.01
-0.04
0.73
-24.73
0.32

0.00

-0.12
0.07
0.08
0.03
-0.18
0.14
0.22

-0.65
-0.02

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.50
99.39
99.40
97.21
50.33
98.44
91.57
87.76
90.33
99.72
96.11
96.99
64.84
97.72
75.12
99.20
99.55
99.65
92.54
70.98
59.12
77.29
81.67
99.99
57.67
95.31
95.13
91.87
88.86
92.80
97.16
96.30
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.38
99.46
99.48
98.95
47.90
98.70
91.21
90.72
92.63
99.80
95.96
97.30
66.23
97.94
79.85
99.50
99.37
99.91
91.06
66.46
52.67
75.89
83.05
100.00
47.00
95.40
94.62
92.42
90.72
92.29
96.56
95.93
99.62
100.00
99.87
99.99
51.49
96.820

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.62
99.32
99.33
95.53
53.02
98.17
91.94
84.99
88.15
99.65
96.27
96.68
63.51
97.50
70.92
98.91
99.73
99.38
94.06
76.17
67.37
78.74
80.33
99.99
74.60
95.22
95.64
91.32
87.08
93.33
97.78
96.67
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = NNP & Par = NP-TMP) (Pos NNPTMP)

B.50

0.00
79.64
88.46
86.49
81.06
82.51
52.94
87.28

0.00
90.32
0.00

33.33
85.88
80.53
82.02
78.53
72.41
89.29

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374
0.178
0.999
0.374

0.912
0.027
0.076
0.374

0.222
0.162
0.436
0.114

0.178

0.150
0.071
0.374
0.517

0.247
0.176
0.199
0.067
0.051
0.132
0.731

0.030
0.030

100.00
0.00 99.99
99.85
100.00
100.00
100.00
100.00
99.96
0.00 99.53
98.48 99.42
0.00 99.37
97.33
0.00 43.02
0.00 98.37
82.03 92.20
27.06 88.38
59.26 95.43
0.00 99.74
67.67 96.11
98.12 96.29
11.54 62.60
90.44 97.45
76.48
99.21
0.00 99.59
0.00 99.67
81.40 92.79
0.00 71.76
0.00 85.92
75.73
0.00 75.44
99.99
5.88 48.98
78.95 95.64
70.89 95.44
91.34 93.26
73.04 89.52
32.06 93.00
64.24 97.58
0.00 96.80
99.45
100.00
0.00 99.94
0.00 99.99

-0.02

-0.03

0.02
-0.01
-3.57
0.05

0.21
0.02
0.19

-0.66

0.374

0.527

0.701
0.374
0.322
0.674

0.199
0.944
0.374

0.245

84.62

-0.01 0.374

RecU

-0.00
-0.00
-0.00
-0.05

0.00
-0.01
-0.04
-0.02

-0.00
-0.01
-0.04
-0.01

0.01

-0.01
-0.14
-0.14
0.02

-0.02
-0.02
-0.03
-0.04
-0.06
-0.03
0.02

-0.27
-0.01

PrecU

100.00
99.99
99.85
100.00
100.00
100.00
100.00
99.96
99.53
99.42
99.37
97.33
43.02
98.37
92.20
88.38
95.43
99.74
96.11
96.29
62.60
97.45
76.48
99.21
99.59
99.67
92.79
71.76
85.92
75.73
75.44
99.99
48.98
95.64
95.44
93.26
89.52
93.00
97.58
96.80
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.75
99.99
100.00
100.00
100.00
99.94
99.33
99.56
99.28
99.58
32.35
98.91
92.04
92.29
95.73
99.79
95.08
97.96
73.26
97.19
84.38
99.65
99.33
99.96
90.54
66.82
84.16
75.72
72.88
100.00
36.00
94.59
95.95
94.19
90.86
92.11
96.38
96.44
100.00
100.00
99.87
99.99
50.58
96.845

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.98
99.73
99.27
99.47
95.18
64.17
97.83
92.37
84.78
95.14
99.69
97.16
94.67
54.64
97.72
69.94
98.76
99.85
99.39
95.15
77.48
87.77
75.75
78.18
99.99
76.60
96.71
94.93
92.34
88.23
93.91
98.80
97.15
98.90
100.00
100.00
100.00

F(0.5)

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Rec

Prec

True

(Pos = VB & Wd {do, help, let, make}) (Pos VBI)
(Pos = VBG & Wd {doing, helping, letting, making}) (Pos VBGI)
(Pos = VBD & Wd {did, helped, let, made}) (Pos VBDI)
(Pos = VBN & Wd {done, helped, let, made}) (Pos VBNI)
(Pos = VBP & Wd {do, help, let, make}) (Pos VBPI)
(Pos = VBZ & Wd {does, helps, lets, makes}) (Pos VBZI)

B.51

22.22
0.00
77.80
68.75
73.21
75.56
83.78
41.50
82.57
0.00
0.00
83.37
33.33

81.78
78.64
71.63
66.82
61.74
84.85

p(FU)

24622

98.84

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.080

0.380
0.428
0.260
0.447
0.240
0.460
0.019
0.609
0.990
0.210
0.014
0.165
0.507
0.054
0.093
0.567
0.209
0.995
0.032
0.881
0.974
0.488

0.374

0.370
0.081
0.492
0.114
0.335
0.878
0.095
0.589

0.033
0.031

100.00
0.00 99.99
99.86
99.99
100.00
100.00
100.00
99.97
0.00 99.51
96.95 99.39
0.00 99.39
97.16
2.99 49.09
0.00 98.41
74.37 91.48
25.88 87.63
75.93 93.11
0.00 99.73
66.47 96.02
93.71 96.96
26.07 65.26
87.59 97.67
74.40
99.20
0.00 99.54
0.00 99.64
81.59 92.51
14.29 70.54
0.00 78.29
77.15
0.00 81.67
99.99
0.00 57.67
77.19 95.32
62.21 94.92
93.18 91.86
82.96 88.61
54.20 92.90
53.96 97.01
0.00 95.88
99.37
100.00
0.00 99.94
0.00 99.99

-0.20

0.05

0.91

-1.48
-0.11
54.10
-0.89

0.21

-0.36
-3.86
2.26
0.02
4.08
-1.17

0.294

0.955

0.374

0.143
0.784
0.192
0.047

0.815

0.774
0.103
0.042
0.986
0.527
0.829

81.70

-0.30 0.508

RecU

-0.05

0.03
-0.02
-0.01
-0.11
-5.11
-0.02
-0.20
0.09
-0.16
0.01
-0.11
-0.06
-0.51
-0.06
-1.04
-0.02
-0.01
0.00
-0.07
0.09
-0.34
0.14

0.00

-0.11
-0.15
0.07
-0.26
-0.07
-0.01
-0.22
0.02

-1.60
-0.06

PrecU

100.00
99.99
99.86
99.99
100.00
100.00
100.00
99.97
99.51
99.39
99.39
97.16
49.09
98.41
91.48
87.63
93.11
99.73
96.02
96.96
65.26
97.67
74.40
99.20
99.54
99.64
92.51
70.54
78.29
77.15
81.67
99.99
57.67
95.32
94.92
91.86
88.61
92.90
97.01
95.88
99.37
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.86
99.99
100.00
100.00
100.00
99.94
99.38
99.44
99.43
98.84
45.38
98.70
91.32
90.67
91.97
99.79
95.79
97.26
67.57
97.89
77.58
99.51
99.35
99.91
91.14
66.35
82.72
75.41
83.05
100.00
47.00
95.46
94.41
92.46
90.38
91.99
96.34
95.28
99.85
100.00
99.87
99.99
50.99
96.783

F chg

100.00
100.00
99.87
100.00
100.00
100.00
100.00
100.00
99.63
99.35
99.35
95.53
53.47
98.11
91.64
84.79
94.27
99.67
96.26
96.66
63.11
97.44
71.46
98.90
99.73
99.37
93.92
75.28
74.31
78.97
80.33
99.99
74.60
95.19
95.43
91.27
86.91
93.84
97.70
96.48
98.90
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = DT & Wd {a, another, each, every, little, many, much}) (Pos DT1)
(Pos = DT & Wd {these, those}) (Pos DTP)
(Pos = NNS & Wd {acrobatics, adenoids, alms,...<66 omitted>...,tweezers, vicissitudes, waterworks}) (Pos NNSP)
(Pos = NN & Wd {abaci, aback, abaft,...<32532 omitted>...,zydeco, zygotic, zymurgy}) (Pos NNM)
(Pos = JJ & Wd {countless, few, many, numerous, several}) (Pos JJP)

B.52

40.00
0.00
77.21
75.00
73.21
74.93
84.20
34.19
82.37

83.37
0.00

0.00
0.00
78.64
79.82
74.20
65.82
67.27
84.90

p(FU)

24622

98.48

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.080

0.104
0.198
0.816
0.134
0.209
0.315
0.001
0.025
0.658
0.291
0.007
0.448
0.498
0.203
0.310
0.983
0.574
0.629
0.964
0.929
0.629
0.868
0.374

0.374
0.126
0.118
0.084
0.077
0.650
0.662
0.277
0.589

0.002
0.001

100.00
0.00 99.99
99.86
99.99
100.00
100.00
100.00
99.97
0.00 99.54
96.69 99.36
0.00 99.40
97.01
5.97 49.89
0.00 98.41
74.20 91.40
21.18 87.38
75.93 93.29
0.00 99.73
67.01 96.05
94.09 96.99
17.09 65.14
87.59 97.67
74.49
99.22
0.00 99.55
0.00 99.64
84.50 92.57
0.00 70.43
0.00 78.83
77.01
0.00 82.64
99.99
0.00 57.32
76.84 95.29
62.21 94.88
92.75 92.04
79.61 88.43
56.49 93.00
54.18 97.04
0.00 95.94
99.37
100.00
0.00 99.94
0.00 99.99

-0.51

-0.44

0.91

-1.45
0.34
9.69
-1.01

1.98

-2.48
-3.23
4.08
-2.63
10.73
-0.91

0.177

0.618

0.374

0.009
0.314
0.710
0.260

0.164

0.114
0.154
0.008
0.184
0.003
0.922

81.67

-0.34 0.192

RecU

-0.05

0.06
-0.05
-0.00
-0.26
-3.56
-0.02
-0.29
-0.19
0.03
0.02
-0.09
-0.03
-0.71
-0.06
-0.92
0.00
-0.01
-0.01
-0.00
-0.06
0.36
-0.05
1.20

-0.61
-0.14
-0.19
0.27
-0.46
0.04
0.02
-0.15
0.02

-1.78
-0.06

PrecU

100.00
99.99
99.86
99.99
100.00
100.00
100.00
99.97
99.54
99.36
99.40
97.01
49.89
98.41
91.40
87.38
93.29
99.73
96.05
96.99
65.14
97.67
74.49
99.22
99.55
99.64
92.57
70.43
78.83
77.01
82.64
99.99
57.32
95.29
94.88
92.04
88.43
93.00
97.04
95.94
99.37
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.86
99.99
100.00
100.00
100.00
99.94
99.38
99.41
99.43
98.84
48.32
98.71
91.28
90.20
92.06
99.77
95.82
97.33
67.09
97.81
77.58
99.52
99.38
99.90
91.22
66.46
83.54
74.92
84.75
100.00
47.00
95.55
94.28
92.51
90.12
92.16
96.44
95.38
99.85
100.00
99.87
99.99
50.90
96.785

F chg

100.00
100.00
99.87
100.00
100.00
100.00
100.00
100.00
99.70
99.30
99.37
95.24
51.57
98.10
91.51
84.74
94.55
99.70
96.28
96.65
63.29
97.53
71.63
98.93
99.71
99.37
93.96
74.91
74.63
79.21
80.65
99.98
73.44
95.03
95.48
91.58
86.81
93.87
97.65
96.50
98.90
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = DT & Wd {a, another, each, every, little, many, much}) (Pos DT1)
(Pos = DT & Wd {these, those}) (Pos DTP)
(Pos = NN & Wd {abaci, aback, abaft,...<37563 omitted>...,zydeco, zygotic, zymurgy}) (Pos NNM)
(Pos = JJ & Wd {countless, few, many, numerous, several}) (Pos JJP)

B.53

60.00
0.00
78.52
70.59
71.93
73.86
84.71
38.05
81.76

83.70
0.00

0.00

84.38
81.46
69.87
66.74
57.39
82.21

p(FU)

24622

98.46

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.374
0.332
0.937
0.374
0.524
0.658
0.144
0.555
0.378
0.762
0.467
0.405
0.225
0.395
0.203
0.458
0.817
0.374
0.217
0.180
0.343
0.127
0.374
0.374

0.775
0.771
0.837
0.042
0.219
0.318
0.572
0.374

0.186
0.490

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.47
97.25 99.38
0.00 99.40
97.16
4.48 50.79
0.00 98.43
73.50 91.62
14.12 87.67
75.93 94.05
0.00 99.72
69.04 96.11
92.72 97.00
18.38 64.84
88.02 97.69
74.76
99.24
0.00 99.55
0.00 99.65
81.59 92.62
0.00 71.12
0.00 81.28
77.50
0.00 80.99
99.99
0.00 57.67
75.79 95.43
62.91 95.05
92.86 91.81
82.68 88.70
50.38 92.87
57.39 97.08
0.00 96.17
99.33
100.00
0.00 99.94
0.00 99.99

-0.23

-0.11

-0.58
-0.04
19.27
-1.16

0.41

0.18
-1.74
0.67
-0.19
-3.25
1.26

0.262

0.779

0.341
0.901
0.836
0.310

0.743

0.956
0.412
0.270
0.687
0.657
0.626

81.67

-0.33 0.399

RecU

-0.00

-0.00
-0.03
0.00
-0.11
-1.81
0.00
-0.04
0.14
0.85
0.00
-0.02
-0.02
-1.16
-0.03
-0.55
0.02
0.00
0.01
0.05
0.91
3.48
0.60
-0.83
0.00

0.01
-0.01
0.01
-0.15
-0.11
0.05
0.08
-0.02

-0.63
-0.01

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.47
99.38
99.40
97.16
50.79
98.43
91.62
87.67
94.05
99.72
96.11
97.00
64.84
97.69
74.76
99.24
99.55
99.65
92.62
71.12
81.28
77.50
80.99
99.99
57.67
95.43
95.05
91.81
88.70
92.87
97.08
96.17
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.91
99.99
100.00
100.00
100.00
99.94
99.39
99.45
99.47
98.84
47.06
98.69
91.32
90.56
93.52
99.77
96.02
97.26
66.47
97.93
79.09
99.59
99.36
99.90
91.18
66.82
83.54
76.17
83.05
100.00
47.00
95.36
94.46
92.15
90.59
92.16
96.54
95.66
99.58
100.00
99.87
99.99
51.49
96.834

F chg

100.00
100.00
99.91
100.00
100.00
100.00
100.00
100.00
99.55
99.31
99.33
95.53
55.17
98.17
91.92
84.97
94.58
99.67
96.20
96.74
63.28
97.46
70.88
98.89
99.75
99.39
94.11
76.00
79.14
78.88
79.03
99.99
74.60
95.50
95.66
91.46
86.88
93.59
97.63
96.68
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = RB & Wd = n't) (Pos RB1)

B.54

100.00
0.00 0.00 99.98
99.85
100.00
100.00
100.00
100.00
99.96
0.00 99.53
99.35 98.51 99.42
0.00 99.37
97.43
0.00 0.00 43.02
0.00 98.36
79.69 82.21 92.22
88.89 28.24 88.45
86.49 59.26 95.48
0.00 99.73
81.21 67.69 96.12
82.54 98.12 96.29
53.85 11.97 62.65
87.27 90.35 97.47
76.48
99.20
0.00 0.00 99.59
0.00 99.67
90.30 81.20 92.82
0.00 71.98
0.00 0.00 86.13
75.71
0.00 75.44
99.99
33.33 5.88 48.98
86.04 78.95 95.64
80.75 70.89 95.47
82.10 91.34 93.28
78.59 73.32 89.57
72.88 32.82 93.10
89.58 64.45 97.61
0.00 96.75
99.45
100.00
0.00 99.94
0.00 99.99

24622

84.67

p(FU)

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F chgU

F(0.5)U

0.374

0.617
0.178
0.990
0.374

0.302
0.988
0.344
0.374
0.374
0.001
0.673
0.525
0.733

0.260
0.109
0.374
0.374

0.117
0.599
0.917
0.596
0.080
0.984
0.590

0.759
0.366

RecU

-0.01

-0.00
-0.00
-0.00
0.05

-0.00
0.00
0.05
0.03
-0.00
0.01
-0.00
0.05
0.00

0.02
0.16
0.11
-0.02

-0.01
0.01
0.00
0.01
0.04
-0.00
-0.02

-0.02
0.00

PrecU

100.00
99.98
99.85
100.00
100.00
100.00
100.00
99.96
99.53
99.42
99.37
97.43
43.02
98.36
92.22
88.45
95.48
99.73
96.12
96.29
62.65
97.47
76.48
99.20
99.59
99.67
92.82
71.98
86.13
75.71
75.44
99.99
48.98
95.64
95.47
93.28
89.57
93.10
97.61
96.75
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.75
99.99
100.00
100.00
100.00
99.94
99.33
99.56
99.28
99.58
32.35
98.91
92.07
92.32
95.77
99.78
95.09
97.95
73.23
97.21
84.38
99.65
99.33
99.96
90.55
67.14
84.36
75.72
72.88
100.00
36.00
94.61
95.91
94.22
90.97
92.24
96.41
96.33
100.00
100.00
99.87
99.99
50.71
96.855

F chg

100.00
99.99
99.95
100.00
100.00
100.00
100.00
99.98
99.74
99.27
99.47
95.37
64.17
97.82
92.36
84.89
95.19
99.69
97.18
94.69
54.74
97.72
69.94
98.74
99.85
99.39
95.19
77.56
87.98
75.69
78.18
99.99
76.60
96.70
95.03
92.36
88.21
93.98
98.84
97.17
98.90
100.00
100.00
100.00

F(0.5)

Rec

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = RB & Wd = n't) (Pos RB1)
(Pos = RB & Wd = also) (Pos RB2)

0.11
3.42

0.13
0.02
-0.35
-0.01

-0.14

0.09
0.33
0.07
0.43
1.84
-0.33

0.176
0.374

0.087
0.374
0.374
0.969

0.374

0.709
0.648
0.374
0.353
0.374
0.418

0.05 0.145

50.00
0.00
79.25
81.25
71.93
73.95
83.81
33.80
82.69

84.57
25.00

83.59
82.42
69.80
65.64
58.72
89.44

0.00

p(FU)

24622

98.72

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.230
0.653
0.056
0.407
0.272
0.647
0.472
0.200
0.791
0.226
0.676
0.783
0.129
0.379
0.691
0.015
0.255
0.374
0.660
0.673
0.530
0.343

0.374

0.148
0.181
0.025
0.445
0.228
0.027
0.661

0.374

0.413
0.876

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.52
97.33 99.40
0.00 99.41
97.11
1.49 49.56
0.00 98.43
71.62 91.64
30.59 87.77
75.93 93.53
0.00 99.74
68.99 96.12
93.76 97.01
20.51 64.48
89.50 97.75
75.03
99.26
0.00 99.56
0.00 99.65
81.78 92.59
14.29 70.69
0.00 74.32
77.21
0.00 81.67
99.99
0.00 57.67
75.09 95.35
63.85 95.12
92.32 91.98
80.59 88.81
48.85 92.82
58.03 97.18
0.00 96.14
99.35
100.00
0.00 99.91
0.00 99.99

-0.06

-1.01
27.49

-0.56
-0.07
22.87
0.22

1.04

-0.75
-0.41
0.36
-2.23
-3.84
5.45

0.738

0.137
0.374

0.352
0.793
0.803
0.527

0.464

0.129
0.783
0.605
0.231
0.435
0.121

81.73

-0.26 0.191

RecU

0.04
-0.01
0.01
-0.16
-4.20
0.01
-0.02
0.25
0.29
0.02
-0.01
-0.01
-1.71
0.02
-0.20
0.04
0.01
0.01
0.01
0.31
-5.39
0.21

0.00

-0.08
0.06
0.20
-0.03
-0.16
0.16
0.06

-0.02

-0.20
0.00

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.52
99.40
99.41
97.11
49.56
98.43
91.64
87.77
93.53
99.74
96.12
97.01
64.48
97.75
75.03
99.26
99.56
99.65
92.59
70.69
74.32
77.21
81.67
99.99
57.67
95.35
95.12
91.98
88.81
92.82
97.18
96.14
99.35
100.00
99.91
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.36
99.46
99.51
98.84
47.48
98.69
91.17
90.94
96.38
99.80
96.06
97.34
66.34
97.93
79.85
99.54
99.38
99.90
91.09
66.35
64.61
75.65
83.05
100.00
47.00
95.27
94.43
92.33
90.94
92.21
96.63
95.66
99.62
100.00
99.87
99.99
51.72
96.843

F chg

Rec

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.68
99.33
99.32
95.43
51.83
98.17
92.12
84.81
90.84
99.67
96.17
96.68
62.72
97.56
70.76
98.99
99.74
99.39
94.13
75.64
87.47
78.83
80.33
99.99
74.60
95.43
95.82
91.64
86.78
93.44
97.73
96.64
99.08
100.00
99.96
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

24622

0.00
99.21 98.77
0.00
100.00
72.73
82.55
81.25
90.24
79.38
89.43
56.76
90.02

5.97
36.36
82.19
45.88
68.52
0.00
78.29
95.51
35.90
92.78

0.00
0.00
92.77 84.50
0.00 0.00
0.00
100.00 100.00

90.28
83.66
85.51
81.14
73.64
89.46

0.00
86.32
80.52
90.04
79.33
61.83
78.16
0.00

0.00
0.00
87.28

p(FU)

0.00

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374
0.999
0.070
0.583
0.203
0.415
0.953
0.101
0.235
0.374
0.568
0.772
0.172
0.145
0.667
0.374
0.209
0.374
0.289
0.808
0.471
0.365

0.374
0.141
0.626
0.659
0.946
0.915
0.805
0.570

0.892
0.972

RecU

-0.00
0.00
-0.00
0.05
1.75
-0.00
0.00
-0.03
0.07
-0.00
0.01
0.00
-0.51
0.01
0.13
0.01
0.01
0.01
-0.03
-0.08
0.11
-0.17

2.92
0.02
-0.00
-0.01
0.00
-0.00
0.00
-0.01

-0.02
-0.00

PrecU

100.00
99.99
99.77
100.00
100.00
100.00
100.00
99.97
99.54
99.40
99.36
97.10
42.04
98.21
92.31
87.43
94.84
99.66
96.48
96.96
59.13
98.08
68.91
99.04
99.51
99.54
92.29
68.31
82.66
76.24
70.48
99.99
42.11
96.29
96.32
93.07
90.33
94.16
97.79
96.13
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.57
99.99
100.00
100.00
100.00
99.94
99.33
99.47
99.31
98.53
29.41
98.70
92.80
90.20
94.97
99.57
96.60
97.71
53.79
98.51
64.48
99.62
99.17
99.95
90.85
62.47
80.45
73.12
62.71
100.00
28.00
95.11
96.61
92.01
89.54
93.73
96.74
95.95
100.00
100.00
99.87
99.99
53.72
97.056

F chg

Rec

100.00
100.00
99.97
100.00
100.00
100.00
100.00
100.00
99.75
99.34
99.42
95.71
73.68
97.73
91.83
84.83
94.71
99.74
96.37
96.22
65.65
97.65
73.99
98.46
99.85
99.13
93.77
75.36
85.00
79.64
80.43
99.99
84.85
97.50
96.03
94.16
91.14
94.59
98.85
96.30
98.90
100.00
100.00
100.00

F(0.5)

Prec

True

MAX
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

100.00
99.99
99.77
100.00
100.00
100.00
100.00
99.97
99.54
99.40
99.36
97.10
42.04
98.21
92.31
87.43
94.84
99.66
96.48
96.96
59.13
98.08
68.91
99.04
99.51
99.54
92.29
68.31
82.66
76.24
70.48
99.99
42.11
96.29
96.32
93.07
90.33
94.16
97.79
96.13
99.45
100.00
99.94
99.99

0.01

-0.08
0.00
-0.58

-0.20
0.01
-1.44
0.07

-0.20

-0.18
-0.34
-0.05
-0.04
-1.60
0.09

0.577

0.528
0.888
0.374

0.276
0.613
0.546
0.558

0.374

0.474
0.178
0.671
0.851
0.131
0.809

-0.06 0.067

B.55

50.00
0.00
78.46
65.38
71.43
73.24
84.38
33.00
82.38

84.10
0.00

85.97
78.65
69.54
65.49
58.54
87.67

p(FU)

24622

0.00
98.78

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.633
0.290
0.714
0.095
0.596
0.403
0.083
0.478
0.420
0.983
0.490
0.175
0.205
0.688
0.424
0.643
0.743

0.557
0.298
0.446
0.198
0.178
0.374

0.416
0.957
0.387
0.383
0.079
0.196
0.014
0.374

0.377
0.500

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.49
97.04 99.38
0.00 99.40
97.06
1.49 50.83
0.00 98.42
72.34 91.59
20.00 87.75
74.07 94.87
0.00 99.72
69.22 96.10
93.43 96.98
14.10 64.69
88.44 97.72
74.88
99.21
0.00 99.55
0.00 99.64
81.01 92.55
0.00 71.00
0.00 83.86
77.29
0.00 83.61
99.99
0.00 57.67
75.26 95.47
63.15 95.07
92.64 91.87
80.59 88.71
54.96 92.86
56.32 97.09
0.00 95.92
99.33
100.00
0.00 99.94
0.00 99.99

-0.18

-0.97
-12.14
-1.55

-0.86
0.12
-4.90
-0.54

0.28

0.69
-3.04
0.29
-2.36
2.22
2.74

0.255

0.156
0.165
0.374

0.271
0.422
0.771
0.435

0.806

0.764
0.228
0.759
0.062
0.826
0.471

81.56

-0.47 0.143

RecU

0.01
-0.03
-0.00
-0.21
-1.74
-0.01
-0.08
0.23
1.73
0.00
-0.03
-0.04
-1.39
-0.01
-0.39
-0.01
-0.00

-0.03
0.74
6.76
0.32
2.38
0.00

0.05
0.00
0.09
-0.14
-0.12
0.07
-0.17
-0.02

-0.39
-0.01

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.49
99.38
99.40
97.06
50.83
98.42
91.59
87.75
94.87
99.72
96.10
96.98
64.69
97.72
74.88
99.21
99.55
99.64
92.55
71.00
83.86
77.29
83.61
99.99
57.67
95.47
95.07
91.87
88.71
92.86
97.09
95.92
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.37
99.44
99.47
98.84
44.96
98.69
91.32
90.80
95.58
99.77
96.01
97.27
66.27
97.94
79.60
99.51
99.35
99.90
91.07
66.56
81.28
75.79
86.44
100.00
47.00
95.26
94.48
92.30
90.63
92.24
96.48
95.41
99.58
100.00
99.87
99.99
51.62
96.831

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.61
99.33
99.32
95.33
58.47
98.15
91.86
84.90
94.17
99.67
96.20
96.69
63.19
97.49
70.69
98.92
99.74
99.38
94.07
76.06
86.62
78.86
80.95
99.99
74.60
95.69
95.66
91.45
86.87
93.49
97.71
96.44
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = RB & Wd = n't) (Pos RB1)
(Pos = RB & Wd = also) (Pos RB2)
(Pos = RB & Wd = not) (Pos RB3)

B.56

100.00
100.00 100.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.49
98.60 97.25 99.38
0.00 99.41
97.06
75.00 4.48 52.82
0.00 0.00 98.42
77.12 73.74 91.58
69.23 21.18 87.52
71.43 74.07 91.15
0.00 0.00 99.73
74.98 68.33 96.14
83.92 92.38 96.97
31.48 14.53 64.80
81.28 86.91 97.68
74.73
99.19
0.00 99.55
0.00 99.65
84.80 80.04 92.53
0.00 0.00 70.47
0.00 72.99
77.27
0.00 81.67
99.99
0.00 57.67
81.20 75.79 95.40
80.00 62.91 95.07
71.21 92.10 91.92
65.35 80.59 88.76
52.21 54.20 92.81
87.50 56.96 97.14
0.00 96.07
99.35
100.00
0.00 99.94
0.00 99.99

24622

81.29

p(FU)

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F chgU

F(0.5)U

0.374
0.373

0.649
0.112
0.262
0.241
0.602
0.882
0.305
0.832
0.368
0.297
0.694
0.161
0.067
0.254
0.413
0.193
0.997
0.374
0.658
0.976
0.370
0.017

0.374

0.578
0.811
0.161
0.407
0.018
0.064
0.887

0.135
0.252

RecU

0.01
-0.01

0.01
-0.03
0.01
-0.21
2.11
-0.00
-0.09
-0.03
-2.26
0.01
0.01
-0.04
-1.22
-0.05
-0.59
-0.03
-0.00
0.01
-0.04
-0.01
-7.08
0.29

0.00

-0.03
0.01
0.13
-0.08
-0.17
0.12
-0.02

-0.73
-0.02

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.49
99.38
99.41
97.06
52.82
98.42
91.58
87.52
91.15
99.73
96.14
96.97
64.80
97.68
74.73
99.19
99.55
99.65
92.53
70.47
72.99
77.27
81.67
99.99
57.67
95.40
95.07
91.92
88.76
92.81
97.14
96.07
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.99
99.92
99.99
100.00
100.00
100.00
99.94
99.36
99.45
99.49
98.84
49.16
98.68
91.32
90.75
88.30
99.80
96.02
97.28
66.34
97.85
79.35
99.50
99.36
99.90
90.97
65.88
83.13
75.86
83.05
100.00
47.00
95.37
94.68
92.27
90.48
92.24
96.60
95.62
99.62
100.00
99.87
99.99
51.44
96.820

F chg

100.00
100.00
99.88
100.00
100.00
100.00
100.00
100.00
99.63
99.32
99.33
95.33
57.07
98.17
91.85
84.52
94.19
99.66
96.26
96.67
63.34
97.51
70.63
98.89
99.74
99.39
94.16
75.74
65.06
78.73
80.33
99.99
74.60
95.42
95.46
91.56
87.11
93.38
97.69
96.53
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = RB & Wd {n't, not}) (Pos RB1)
(Pos = RB & Wd = also) (Pos RB2)

-0.16

-0.82
-6.97
-1.55

-0.40
-0.70
-4.31
-2.06

0.07

-1.64
-2.52
1.40
-2.48
-4.11
3.37

0.292

0.392
0.676
0.374

0.358
0.068
0.641
0.022

0.903

0.056
0.028
0.213
0.037
0.289
0.426

-0.79 0.066

B.57

rbr(clc,s) Mapping

57.14
0.00
78.03
68.00
71.43
74.25
83.31
43.24
83.34

85.07
0.00

81.80
82.87
71.61
67.34
63.96
87.93

p(FU)

24622

98.72

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.182
0.370
0.348
0.393
0.924
0.937
0.281
0.886
0.834
0.377
0.556
0.036
0.368
0.436
0.505
0.745
0.212
0.071
0.644
0.746
0.262
0.352

0.374

0.253
0.756
0.035
0.181
0.229
0.107
0.134

0.217
0.459

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.47
96.98 99.39
0.00 99.39
97.11
5.97 52.09
0.00 98.43
72.95 91.57
20.00 87.61
74.07 91.97
0.00 99.72
68.91 96.11
94.01 96.97
13.68 65.23
87.59 97.75
75.47
99.23
0.00 99.56
0.00 99.66
80.62 92.54
0.00 70.77
0.00 61.66
77.18
0.00 81.67
99.99
0.00 57.67
76.49 95.33
63.62 95.05
93.07 91.95
83.24 88.77
54.20 92.93
54.60 97.12
0.00 96.17
99.35
100.00
0.00 99.94
0.00 99.99

-0.24

-0.80
-11.34
-1.55

-0.42
-0.26
0.00
-0.41

0.60

-0.82
-0.38
2.19
0.61
5.80
0.93

0.201

0.416
0.829
0.374

0.621
0.354
0.775
0.504

0.605

0.418
0.605
0.026
0.528
0.235
0.770

81.77

-0.21 0.615

RecU

-0.01
-0.02
-0.01
-0.16
0.70
-0.00
-0.10
0.07
-1.38
0.00
-0.02
-0.04
-0.57
0.02
0.39
0.01
0.01
0.02
-0.04
0.42
-21.50
0.18

0.00

-0.09
-0.02
0.17
-0.07
-0.05
0.10
0.08

-0.78
-0.02

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.47
99.39
99.39
97.11
52.09
98.43
91.57
87.61
91.97
99.72
96.11
96.97
65.23
97.75
75.47
99.23
99.56
99.66
92.54
70.77
61.66
77.18
81.67
99.99
57.67
95.33
95.05
91.95
88.77
92.93
97.12
96.17
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.37
99.43
99.47
98.84
47.06
98.69
91.24
90.69
96.62
99.77
96.02
97.34
66.34
97.98
80.60
99.52
99.37
99.91
90.97
66.46
48.15
75.65
83.05
100.00
47.00
95.30
94.54
92.31
90.54
92.30
96.46
95.72
99.62
100.00
99.87
99.99
51.41
96.822

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.56
99.35
99.31
95.43
58.33
98.16
91.91
84.73
87.76
99.67
96.21
96.61
64.16
97.52
70.95
98.94
99.75
99.40
94.16
75.67
85.71
78.77
80.33
99.99
74.60
95.37
95.55
91.59
87.08
93.56
97.79
96.62
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = RBR & Wd = more) (Pos RBR0)


(Pos = RBR & Wd {better, faster, further, less}) (Pos RBR1)
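The two RBR rules above can be read as "if a token carries the source tag and its word is in the listed set, give it the new tag". The following is a minimal sketch of that reading, not the thesis implementation. The rule encoding (`RULES`), the function names `apply_mapping` and `reverse_mapping`, and the case-insensitive word match are all assumptions made for illustration. Reversal relies on each new tag having exactly one source tag, which is what makes the original tags recoverable.

```python
from typing import FrozenSet, List, Tuple

Token = Tuple[str, str]                 # (word, tag)
Rule = Tuple[str, FrozenSet[str], str]  # (source tag, word set, new tag)

# Hypothetical encoding of the two rules listed above.
RULES: List[Rule] = [
    ("RBR", frozenset({"more"}), "RBR0"),
    ("RBR", frozenset({"better", "faster", "further", "less"}), "RBR1"),
]

def apply_mapping(tokens: List[Token], rules: List[Rule] = RULES) -> List[Token]:
    """Retag each token with the first rule whose tag and word set match."""
    mapped = []
    for word, tag in tokens:
        new_tag = tag
        for src, words, dst in rules:
            # Assumption: word matching ignores case.
            if tag == src and word.lower() in words:
                new_tag = dst
                break
        mapped.append((word, new_tag))
    return mapped

def reverse_mapping(tokens: List[Token], rules: List[Rule] = RULES) -> List[Token]:
    """Map every introduced tag back to the tag it was split from."""
    back = {dst: src for src, _, dst in rules}
    return [(word, back.get(tag, tag)) for word, tag in tokens]

sentence = [("more", "RBR"), ("quickly", "RB"), ("Less", "RBR"), ("earlier", "RBR")]
mapped = apply_mapping(sentence)
restored = reverse_mapping(mapped)
```

Tokens whose word is outside every listed set (here `earlier`) keep the original RBR tag, so the mapping only subdivides the tag and never merges it with another.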

B.58

20.00
0.00
78.43
86.67
71.70
74.66
83.61
33.01
83.82

84.99
0.00

0.00
0.00
82.27
83.65
70.78
66.44
57.14
87.71

p(FU)

24622

98.78

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.427
0.428
0.141
0.583
0.497
0.626
0.813
0.782
0.709
0.090
0.778
0.810
0.306
0.913
0.817
0.986
0.990
0.178
0.315
0.921
0.389
0.180

0.374
0.118
0.730
0.346
0.312
0.098
0.044
0.031

0.292
0.500

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.50
97.30 99.38
0.00 99.42
97.21
1.49 51.09
0.00 98.43
73.17 91.65
15.29 87.65
70.37 91.72
0.00 99.74
69.04 96.12
94.47 97.01
14.53 64.96
87.38 97.72
75.03
99.22
0.00 99.55
0.00 99.65
81.20 92.62
0.00 70.58
0.00 70.16
77.39
0.00 81.67
99.99
0.00 57.32
74.91 95.34
61.27 95.03
92.53 91.87
81.56 88.64
51.91 92.85
56.53 97.12
0.00 96.16
99.35
100.00
0.00 99.94
0.00 99.99

-0.04

-0.40

-3.85

-0.07
0.16
-2.89
-0.24

0.92

-1.62
-2.10
1.26
-1.05
-1.92
2.99

0.867

0.618

0.416

0.948
0.652
0.896
0.692

0.465

0.412
0.214
0.122
0.484
0.780
0.342

81.83

-0.14 0.710

RecU

0.02
-0.02
0.01
-0.05
-1.24
0.01
-0.01
0.12
-1.65
0.02
-0.01
-0.01
-0.98
-0.00
-0.20
-0.00
-0.00
0.01
0.05
0.15
-10.69
0.44

-0.61
-0.09
-0.03
0.08
-0.22
-0.13
0.10
0.07

-0.59
-0.01

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.50
99.38
99.42
97.21
51.09
98.43
91.65
87.65
91.72
99.74
96.12
97.01
64.96
97.72
75.03
99.22
99.55
99.65
92.62
70.58
70.16
77.39
81.67
99.99
57.32
95.34
95.03
91.87
88.64
92.85
97.12
96.16
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.38
99.44
99.49
98.95
49.16
98.70
91.31
91.25
92.06
99.82
96.02
97.35
66.47
97.87
79.85
99.49
99.37
99.91
91.15
65.30
68.93
76.07
83.05
100.00
47.00
95.39
94.41
92.35
90.44
92.28
96.49
95.68
99.62
100.00
99.87
99.99
51.51
96.829

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.61
99.32
99.34
95.53
53.18
98.17
91.99
84.33
91.38
99.66
96.22
96.67
63.51
97.58
70.76
98.96
99.73
99.39
94.13
76.79
71.43
78.75
80.33
99.98
73.44
95.28
95.67
91.38
86.91
93.43
97.77
96.64
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = RBR & Wd {less, more}) (Pos RBRML)

B.59

in(clc) Mapping

0.00
79.53
88.46
86.49
80.99
82.55
54.90
87.32

0.00
90.13
0.00

33.33
85.88
80.48
81.99
78.50
73.68
89.91

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.046
0.374
0.743
0.178
0.181
0.237
0.040
0.369
0.685

0.779
0.566
0.160
0.315

0.120
0.667
0.585
0.454

0.951
0.273
0.308
0.048
0.957
0.099
0.370

0.594
0.389

100.00
0.00 99.99
99.85
100.00
100.00
100.00
100.00
99.96
0.00 99.55
98.51 99.42
0.00 99.37
97.28
0.00 43.89
0.00 98.38
82.03 92.20
27.06 88.46
59.26 95.43
0.00 99.74
67.67 96.11
98.09 96.30
11.97 62.70
90.44 97.47
76.48
99.20
0.00 99.59
0.00 99.67
81.40 92.83
0.00 71.92
0.00 85.92
75.69
0.00 75.44
99.99
5.88 48.98
78.95 95.65
70.66 95.45
91.13 93.27
72.91 89.54
32.06 93.06
64.88 97.62
0.00 96.82
99.45
100.00
0.00 99.94
0.00 99.99

-0.10

-0.02
0.00

0.07

-0.10

0.00
-0.12
0.07
0.53
0.21

0.172

0.870
0.832

0.237

0.374

0.959
0.183
0.583
0.859
0.374

84.61

-0.01 0.380

RecU

0.02
-0.00
0.00
-0.10
2.03
0.01
-0.01
0.06
-0.02

-0.00
0.00
0.14
0.00

0.04
0.08
-0.14
-0.03

0.00
-0.01
-0.01
-0.02
-0.00
0.01
0.04

0.04
0.00

PrecU

100.00
99.99
99.85
100.00
100.00
100.00
100.00
99.96
99.55
99.42
99.37
97.28
43.89
98.38
92.20
88.46
95.43
99.74
96.11
96.30
62.70
97.47
76.48
99.20
99.59
99.67
92.83
71.92
85.92
75.69
75.44
99.99
48.98
95.65
95.45
93.27
89.54
93.06
97.62
96.82
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.75
99.99
100.00
100.00
100.00
99.94
99.37
99.57
99.29
99.58
33.19
98.90
92.04
92.32
95.73
99.79
95.08
97.97
73.23
97.22
84.38
99.65
99.33
99.96
90.62
67.09
84.16
75.72
72.88
100.00
36.00
94.63
95.90
94.20
90.95
92.13
96.42
96.44
100.00
100.00
99.87
99.99
50.74
96.855

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.98
99.74
99.27
99.46
95.08
64.75
97.86
92.37
84.91
95.14
99.69
97.17
94.69
54.83
97.72
69.94
98.74
99.85
99.39
95.16
77.50
87.77
75.67
78.18
99.99
76.60
96.70
95.00
92.35
88.18
94.01
98.84
97.20
98.90
100.00
100.00
100.00

F(0.5)

Rec

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = IN & Wd {@, albeit, although,...<14 omitted>...,via, vs., whereas}) (Pos IN0)
(Pos = IN & Wd {are, complicated, including, once, then, till, which}) (Pos IN1)
(Pos = IN & Wd {til, a, aka,...<10 omitted>...,to, towards, underneath}) (Pos IN2)

0.00
79.53
68.42
71.93
74.48
83.41
50.00
83.10
0.00

84.85
25.00

50.00
81.58
84.49
70.49
66.82
60.53
86.01

p(FU)

24622

98.69

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.258
0.268
0.124
0.407
0.725
0.117
0.462
0.538
0.997
0.071
0.780
0.925
0.835
0.212
0.248
0.834
0.615

0.002
0.509
0.932
0.103

0.374
0.374
0.210
0.163
0.408
0.887
0.654
0.434
0.882

0.980
0.600

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.51
97.10 99.39
0.00 99.41
97.11
0.00 52.49
0.00 98.44
72.93 91.67
30.59 87.67
75.93 93.06
0.00 99.73
69.80 96.13
94.68 97.02
14.53 65.55
87.55 97.70
74.49
99.23
0.00 99.55
0.00 99.64
81.40 92.64
14.29 70.83
0.00 78.05
77.47
0.00 81.67
99.99
5.88 57.83
76.14 95.38
62.68 95.13
93.07 91.86
80.73 88.82
52.67 92.95
52.68 97.04
0.00 96.10
99.35
100.00
0.00 99.94
0.00 99.99

-0.19

0.10
21.27

0.39
0.14
8.36
-0.59

0.96

-1.18
-0.39
1.28
-1.20
1.56
-2.12

0.274

0.820
0.435

0.151
0.460
0.638
0.336

0.453

0.167
0.795
0.233
0.442
0.737
0.477

82.00

0.07 0.721

RecU

0.03
-0.01
0.01
-0.16
1.46
0.02
0.01
0.14
-0.21
0.01
0.00
0.00
-0.08
-0.03
-0.91
0.01
-0.00

0.08
0.51
-0.64
0.56

0.00
0.28
-0.04
0.08
0.08
-0.02
-0.03
0.02
0.01

-0.01
0.01

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.51
99.39
99.41
97.11
52.49
98.44
91.67
87.67
93.06
99.73
96.13
97.02
65.55
97.70
74.49
99.23
99.55
99.64
92.64
70.83
78.05
77.47
81.67
99.99
57.83
95.38
95.13
91.86
88.82
92.95
97.04
96.10
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.40
99.45
99.47
98.84
48.74
98.73
91.27
90.61
91.97
99.75
96.03
97.38
66.99
97.88
78.34
99.52
99.36
99.90
91.15
66.93
82.30
76.28
83.05
100.00
48.00
95.34
94.51
92.37
90.77
92.43
96.38
95.57
99.62
100.00
99.87
99.99
51.82
96.850

F chg

Rec

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.61
99.33
99.36
95.43
56.86
98.16
92.08
84.91
94.18
99.71
96.23
96.66
64.17
97.53
71.00
98.94
99.73
99.38
94.19
75.22
74.21
78.71
80.33
99.99
72.73
95.42
95.77
91.37
86.96
93.47
97.71
96.63
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

``
7811
SENT
43766
TOKENS 1044667


24622

0.00
99.26 98.77
0.00
80.00
83.33
82.41
81.25
84.44
79.70
89.38
59.46
90.02

5.97
22.73
82.43
45.88
70.37
0.00
78.04
95.60
37.61
92.73

0.00
0.00
92.95 84.30
0.00 0.00
0.00
100.00 100.00

90.74
83.70
85.68
80.80
71.93
89.49

0.00
85.96
80.75
90.04
79.33
62.60
78.37
0.00

0.00
0.00
87.32

p(FU)

0.00

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.025
0.374
0.390
0.374
0.238
0.683
0.587
0.503
0.012
0.374
0.679
0.097
0.278
0.776
0.354
0.374
0.374
0.374
0.546
0.390
0.058
0.236

0.805
0.660
0.240
0.005
0.205
0.889
0.078

0.012
0.006

RecU

0.01
0.00
-0.00
-0.05
2.81
0.00
-0.01
-0.03
-0.30
-0.00
-0.00
-0.02
-0.43
0.00
0.53
0.01
0.00
0.01
-0.02
-0.11
-1.38
0.15

0.00
-0.00
-0.03
-0.04
-0.04
-0.00
-0.11

-0.24
-0.01

PrecU

100.00
99.99
99.77
100.00
100.00
100.00
100.00
99.97
99.55
99.41
99.36
96.99
42.48
98.22
92.30
87.43
94.49
99.66
96.47
96.94
59.18
98.07
69.18
99.04
99.50
99.54
92.30
68.29
81.43
76.49
70.48
99.99
40.91
96.28
96.32
93.06
90.29
94.12
97.78
96.03
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.57
99.99
100.00
100.00
100.00
99.94
99.35
99.47
99.31
98.42
30.25
98.66
92.80
90.11
94.65
99.57
96.59
97.72
53.75
98.50
64.74
99.63
99.16
99.95
90.95
62.52
79.42
73.39
62.71
100.00
27.00
95.08
96.63
91.94
89.50
93.65
96.73
95.83
100.00
100.00
99.87
99.99
53.59
97.050

F chg

Rec

100.00
100.00
99.97
100.00
100.00
100.00
100.00
100.00
99.76
99.35
99.41
95.61
71.29
97.78
91.81
84.91
94.34
99.74
96.36
96.18
65.83
97.64
74.28
98.45
99.85
99.13
93.68
75.24
83.55
79.86
80.43
99.99
84.38
97.50
96.02
94.21
91.10
94.60
98.86
96.23
98.90
100.00
100.00
100.00

F(0.5)

Prec

True

MAX
#
158
$
8103

''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

``
7811
SENT
43766
TOKENS 1044667

100.00
99.99
99.77
100.00
100.00
100.00
100.00
99.97
99.55
99.41
99.36
96.99
42.48
98.22
92.30
87.43
94.49
99.66
96.47
96.94
59.18
98.07
69.18
99.04
99.50
99.54
92.30
68.29
81.43
76.49
70.48
99.99
40.91
96.28
96.32
93.06
90.29
94.12
97.78
96.03
99.45
100.00
99.94
99.99

0.04

-0.02

-2.02

-0.17
0.04
3.26
0.04

-0.23

-0.14
-0.17
0.05
-0.25
-2.01
0.25

0.353

0.831

0.208

0.083
0.436
0.093
0.732

0.182

0.682
0.374
0.374
0.237
0.037
0.483

-0.02 0.357

B.60 rbloc[lm] Mapping


50.00
0.00
78.91
66.67
71.43
73.58
84.13
44.87
82.42
0.00
0.00
83.64
0.00

81.68
82.87
70.41
65.98
55.91
88.70

p(FU)

24622

0.00
98.92

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.602
0.278
0.539
0.568
0.570
0.859
0.882
0.598
0.165
0.737
0.675
0.683
0.317
0.898
0.165
0.781
0.981
0.374
0.884
0.443
0.175
0.191
0.374
0.374

0.713
0.488
0.246
0.381
0.043
0.094
0.870
0.374

0.279
0.462

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.49
96.86 99.38
0.00 99.41
97.16
1.49 51.03
0.00 98.42
72.52 91.65
14.12 87.44
74.07 90.38
0.00 99.72
70.01 96.12
93.83 97.00
14.96 65.36
87.85 97.73
73.81
99.23
0.00 99.55
0.00 99.65
80.23 92.57
0.00 70.79
0.00 59.28
77.24
0.00 82.64
99.99
0.00 57.67
75.09 95.40
63.62 95.10
92.97 91.89
81.01 88.75
54.20 92.80
57.17 97.16
0.00 96.12
99.33
100.00
0.00 99.94
0.00 99.99

-0.20

-0.57
-33.16
-1.55

-0.06
0.17
7.97
-0.84

-0.48

-1.84
-0.38
1.16
-1.73
-0.76
4.16

0.236

0.596
0.226
0.374

0.943
0.579
0.658
0.026

0.767

0.208
0.921
0.370
0.214
0.821
0.269

81.74

-0.24 0.390

RecU

-0.01

0.02
-0.03
0.01
-0.11
-1.37
-0.00
-0.01
-0.12
-3.09
0.00
-0.01
-0.02
-0.37
0.00
-1.82
0.01
0.00
0.01
-0.01
0.45
-24.53
0.25
1.20
0.00

-0.03
0.04
0.10
-0.10
-0.18
0.14
0.03
-0.02

-0.56
-0.02

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.49
99.38
99.41
97.16
51.03
98.42
91.65
87.44
90.38
99.72
96.12
97.00
65.36
97.73
73.81
99.23
99.55
99.65
92.57
70.79
59.28
77.24
82.64
99.99
57.67
95.40
95.10
91.89
88.75
92.80
97.16
96.12
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.91
99.99
100.00
100.00
100.00
99.94
99.36
99.42
99.49
98.84
47.06
98.66
91.25
90.36
92.67
99.78
96.06
97.31
66.20
97.92
78.09
99.52
99.38
99.90
91.11
66.35
52.88
76.38
84.75
100.00
47.00
95.35
94.50
92.30
90.73
92.26
96.62
95.70
99.58
100.00
99.87
99.99
51.53
96.826

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.63
99.34
99.33
95.53
55.72
98.19
92.06
84.70
88.20
99.67
96.18
96.68
64.54
97.54
69.98
98.94
99.72
99.39
94.07
75.87
67.45
78.12
80.65
99.99
74.60
95.45
95.71
91.48
86.86
93.35
97.72
96.53
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = RB & Wd {aboard, about, above,...<106 omitted>...,westwards, whence, where}) (Pos RBLOC)

B.61


0.00
0.00
78.33
64.71
71.43
0.00
73.80
82.26
32.11
82.51

0.00
84.69
0.00

0.00

83.83
79.87
71.44
68.24
61.86
89.56

0.00

p(FU)

24622

98.38

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.373

0.123
0.147
0.160
0.667
0.001
0.249
0.042
0.928
0.175
0.306
0.102
0.080
0.098
0.021
0.224
0.750
0.847
0.178
0.243
0.799
0.245
0.462
0.208
0.374

0.004
0.031
0.074
0.001
0.001
0.904
0.067
0.374

0.374

0.003
0.003

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.53
97.57 99.40
0.00 99.41
97.31
0.00 49.11
0.00 98.40
73.87 91.47
12.94 87.52
74.07 89.50
0.00 99.70
69.75 96.07
94.90 96.95
14.96 64.51
87.04 97.67
74.41
99.23
0.00 99.55
0.00 99.65
80.43 92.45
0.00 70.60
0.00 63.18
76.64
0.00 80.00
99.99
0.00 57.67
74.56 94.91
55.87 94.69
82.03 91.50
72.91 88.03
45.80 92.21
56.96 97.03
0.00 95.73
99.33
100.00
0.00 99.91
0.00 99.99

-0.11

0.03

-1.55

-0.09
-0.50
-1.79
-1.23

0.26

-0.99
-9.00
-3.58
-4.75
-5.10
4.32

0.361

0.967

0.374

0.877
0.279
0.668
0.050

0.854

0.294
0.011
0.009
0.003
0.138
0.399

81.39

-0.67 0.145

RecU

-0.01

0.05
-0.01
0.01
0.05
-5.06
-0.02
-0.20
-0.03
-4.03
-0.01
-0.06
-0.07
-1.67
-0.06
-1.02
0.01
0.00
0.01
-0.13
0.18
-19.57
-0.53
-2.04
0.00

-0.54
-0.39
-0.32
-0.91
-0.82
0.01
-0.38
-0.02

-0.02

-2.85
-0.12

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.53
99.40
99.41
97.31
49.11
98.40
91.47
87.52
89.50
99.70
96.07
96.95
64.51
97.67
74.41
99.23
99.55
99.65
92.45
70.60
63.18
76.64
80.00
99.99
57.67
94.91
94.69
91.50
88.03
92.21
97.03
95.73
99.33
100.00
99.91
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.39
99.50
99.50
98.84
40.76
98.67
91.61
90.53
88.68
99.77
96.11
97.41
66.16
97.94
79.09
99.52
99.35
99.91
90.92
65.98
65.84
74.92
84.75
100.00
47.00
94.93
93.81
91.02
88.76
90.85
96.17
95.09
99.58
100.00
99.87
99.99
50.35
96.730

F chg

100.00
100.00
99.88
100.00
100.00
100.00
100.00
100.00
99.67
99.29
99.33
95.82
61.78
98.14
91.34
84.70
90.33
99.64
96.03
96.50
62.93
97.39
70.25
98.94
99.76
99.39
94.04
75.91
60.72
78.44
75.76
99.99
74.60
94.90
95.59
91.98
87.31
93.60
97.91
96.37
99.08
100.00
99.96
100.00

F(0.5)

TBL
#
158
$
8103

''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

``
7811
SENT
43766
TOKENS 1044667

Rec

Prec

True

(Pos = VB & SibAllR =re NP(.*)?) (Pos VBTR)
(Pos = VBG & SibAllR =re NP(.*)?) (Pos VBGTR)
(Pos = VBD & SibAllR =re NP(.*)?) (Pos VBDTR)
(Pos = VBN & SibAllR =re NP(.*)?) (Pos VBNTR)
(Pos = VBP & SibAllR =re NP(.*)?) (Pos VBPTR)
(Pos = VBZ & SibAllR =re NP(.*)?) (Pos VBZTR)

B.62


100.00
100.00 100.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.52
98.37 97.36 99.38
0.00 99.42
96.86
100.00 1.49 50.77
11.11 4.55 98.43
78.31 73.06 91.65
77.78 24.71 87.63
74.55 75.93 91.57
0.00 99.72
74.54 68.40 96.10
83.68 94.72 97.03
39.13 15.38 64.94
83.51 87.17 97.71
74.76
99.19
0.00 99.56
0.00 99.65
85.54 81.40 92.62
25.00 14.29 70.88
0.00 68.16
77.09
0.00 81.67
0.00
99.99
0.00 57.67
83.88 75.79 95.42
83.23 62.91 95.11
70.11 92.64 91.79
66.63 81.70 88.79
58.93 50.38 92.94
86.62 55.46 97.08
0.00 96.03
99.33
100.00
0.00 99.94
0.00 99.99

24622

81.85

p(FU)

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F chgU

F(0.5)U

0.374
0.178

0.187
0.132
0.115
0.237
0.422
0.817
0.810
0.814
0.727
0.759
0.478
0.663
0.216
0.377
0.470
0.149
0.494
0.374
0.271
0.338
0.388
0.924

0.968
0.406
0.875
0.691
0.554
0.173
0.078
0.374

0.493
0.673

RecU

0.01
-0.01

0.04
-0.03
0.02
-0.42
-1.85
0.00
-0.01
0.09
-1.81
0.00
-0.03
0.01
-1.01
-0.02
-0.55
-0.03
0.01
0.01
0.05
0.58
-13.23
0.06

-0.00
0.05
-0.01
-0.05
-0.03
0.06
-0.06
-0.02

-0.49
-0.01

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.52
99.38
99.42
96.86
50.77
98.43
91.65
87.63
91.57
99.72
96.10
97.03
64.94
97.71
74.76
99.19
99.56
99.65
92.62
70.88
68.16
77.09
81.67
99.99
57.67
95.42
95.11
91.79
88.79
92.94
97.08
96.03
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.99
99.92
99.99
100.00
100.00
100.00
99.94
99.36
99.45
99.49
98.84
48.32
98.71
91.30
90.47
92.30
99.78
96.00
97.41
66.51
97.82
79.09
99.49
99.36
99.91
91.14
66.77
65.84
75.69
83.05
100.00
47.00
95.34
94.59
92.12
90.63
92.31
96.51
95.49
99.58
100.00
99.87
99.99
51.57
96.831

F chg

100.00
100.00
99.87
100.00
100.00
100.00
100.00
100.00
99.68
99.30
99.34
94.95
53.49
98.15
92.01
84.96
90.85
99.67
96.20
96.65
63.44
97.60
70.88
98.90
99.75
99.38
94.15
75.53
70.64
78.55
80.33
99.98
74.60
95.50
95.63
91.45
87.03
93.58
97.66
96.57
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = RB & Par =re ADVPLOC(?:[A-Z]+)?) (Pos RBLOC)

-0.22

-0.55
7.57
1.83

-0.62
0.33
6.29
-0.55

1.36

-0.10
-0.82
0.76
-0.82
-2.06
1.30

0.276

0.337
0.717
0.374

0.501
0.331
0.908
0.104

0.333

0.838
0.326
0.088
0.677
0.701
0.476

-0.11 0.747

B.63


50.00
0.00
78.55
50.00
72.22
74.74
83.02
34.15
82.98
0.00

84.27
0.00

0.00
81.92
81.54
72.63
66.06
54.69
84.59

p(FU)

24622

0.00
98.81

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.379
0.410
0.465
0.568
0.782
0.280
0.143
0.517
0.413
0.225
0.227
0.051
0.411
0.475
0.375
0.690
0.966
0.995
0.654
0.782
0.436
0.676

0.820
0.755
0.757
0.145
0.793
0.975
0.080
0.498

0.117
0.115

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.48
96.98 99.39
0.00 99.41
97.16
1.49 51.22
0.00 98.41
73.00 91.56
16.47 87.46
72.22 94.90
0.00 99.74
68.91 96.08
94.52 96.96
11.97 65.09
87.63 97.70
74.46
99.23
0.00 99.55
0.00 99.64
81.01 92.52
0.00 70.53
0.00 83.92
77.09
0.00 81.67
99.99
0.00 57.67
76.32 95.42
62.21 95.05
91.34 91.77
80.45 88.74
53.44 92.95
52.89 97.02
0.00 95.83
99.31
100.00
0.00 99.94
0.00 99.99

-0.20

-0.45
-28.92
-2.24

-0.11
-0.19
-14.72
-0.61

0.38

-0.87
-2.32
2.16
-1.97
-2.54
-2.50

0.307

0.477
0.507
0.496

0.863
0.434
0.618
0.342

0.782

0.508
0.078
0.099
0.231
0.700
0.485

81.66

-0.35 0.433

RecU

-0.01

0.01
-0.02
0.01
-0.11
-0.98
-0.01
-0.11
-0.10
1.76
0.02
-0.05
-0.06
-0.78
-0.03
-0.95
0.01
-0.00
0.00
-0.05
0.08
6.83
0.06

-0.01
-0.01
-0.03
-0.11
-0.02
-0.00
-0.27
-0.04

-0.73
-0.03

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.48
99.39
99.41
97.16
51.22
98.41
91.56
87.46
94.90
99.74
96.08
96.96
65.09
97.70
74.46
99.23
99.55
99.64
92.52
70.53
83.92
77.09
81.67
99.99
57.67
95.42
95.05
91.77
88.74
92.95
97.02
95.83
99.31
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.91
99.99
100.00
100.00
100.00
99.94
99.40
99.44
99.45
98.84
48.32
98.71
91.27
90.80
95.63
99.81
95.93
97.35
66.64
97.90
78.59
99.52
99.35
99.91
91.06
65.83
81.07
75.10
83.05
100.00
47.00
95.40
94.56
92.20
90.44
92.37
96.36
95.24
99.62
100.00
99.87
99.99
51.44
96.818

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.57
99.35
99.37
95.53
54.50
98.11
91.84
84.35
94.17
99.67
96.23
96.57
63.61
97.51
70.75
98.94
99.75
99.37
94.03
75.95
86.98
79.19
80.33
99.98
74.60
95.44
95.54
91.34
87.09
93.55
97.70
96.44
99.01
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = DT & Wd {a, another, each, every, little, many, much}) (Pos DT1)
(Pos = DT & Wd {these, those}) (Pos DTP)
(Pos = NNS & Wd {acrobatics, adenoids, alms,...<66 omitted>...,tweezers, vicissitudes, waterworks}) (Pos NNSP)
(Pos = NN & Wd {abaci, aback, abaft,...<32532 omitted>...,zydeco, zygotic, zymurgy}) (Pos NNM)
(Pos = JJ & Wd {countless, few, many, numerous, several}) (Pos JJP)
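
As an illustration only, the two DT rules above can be read as a word-conditioned relabelling that is undone by collapsing the new tags back to DT. The following Python sketch makes that concrete. The function names, the lower-casing of words before lookup, and the example sentence are assumptions made for this sketch and are not taken from the thesis implementation.

```python
# Hypothetical sketch of a word-conditioned tag mapping and its inverse.
# Word lists are copied from the DT1 and DTP rules; everything else
# (names, case folding, example sentence) is an illustrative assumption.

DT_SUBTAGS = {
    "DT1": {"a", "another", "each", "every", "little", "many", "much"},
    "DTP": {"these", "those"},
}


def map_tag(word, tag):
    """Return the refined tag for (word, tag), or the tag unchanged."""
    if tag == "DT":
        for new_tag, words in DT_SUBTAGS.items():
            if word.lower() in words:  # case folding is an assumption
                return new_tag
    return tag


def unmap_tag(tag):
    """Collapse a refined tag back to the original tag."""
    return "DT" if tag in DT_SUBTAGS else tag


sentence = [
    ("These", "DT"), ("dogs", "NNS"), ("chase", "VBP"), ("every", "DT"),
    ("cat", "NN"), ("in", "IN"), ("the", "DT"), ("yard", "NN"),
]
mapped = [(w, map_tag(w, t)) for w, t in sentence]
restored = [(w, unmap_tag(t)) for w, t in mapped]

print(mapped)
print(restored == sentence)
```

The inverse is a plain many-to-one lookup, so it only recovers the original tags when the new tag names (here DT1 and DTP) do not already occur in the original tagset.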

B.64


33.33
0.00
79.23
74.29
71.93
73.87
83.85
31.68
82.30
0.00

90.32
20.00

82.06
80.95
72.22
66.06
65.45
90.39

p(FU)

24622

98.20

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.284
0.047
0.258
0.374
0.381
0.265
0.155
0.436
0.518
0.330
0.524
0.267
0.064
0.704
0.150
0.180
0.379
0.374
0.521
0.875
0.191
0.095

0.374

0.125
0.976
0.020
0.062
0.283
0.190
0.407
0.374

0.113
0.170

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.47
97.33 99.38
0.00 99.39
97.21
1.49 49.78
0.00 98.41
72.01 91.55
30.59 87.73
75.93 90.37
0.00 99.73
69.90 96.11
94.19 96.99
13.68 64.58
88.74 97.72
74.34
99.20
0.00 99.55
0.00 99.65
81.40 92.55
14.29 70.55
0.00 59.34
77.29
0.00 81.67
99.99
0.00 57.67
77.02 95.38
63.85 95.06
91.99 92.00
81.28 88.65
54.96 92.88
54.39 97.11
0.00 96.16
99.33
100.00
0.00 99.94
0.00 99.99

-0.32

-0.74
24.30

0.06
0.17
-8.06
-0.43

4.05

-0.32
-1.19
2.15
-1.51
7.73
1.74

0.073

0.335
0.414

0.916
0.593
0.583
0.482

0.004

0.689
0.469
0.016
0.270
0.325
0.559

81.92

-0.02 0.942

RecU

-0.01

-0.01
-0.03
-0.02
-0.05
-3.77
-0.01
-0.12
0.21
-3.10
0.02
-0.02
-0.03
-1.55
-0.01
-1.11
-0.02
0.00
0.01
-0.02
0.11
-24.46
0.32

0.00

-0.05
-0.00
0.22
-0.21
-0.10
0.09
0.08
-0.02

-1.02
-0.04

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.47
99.38
99.39
97.21
49.78
98.41
91.55
87.73
90.37
99.73
96.11
96.99
64.58
97.72
74.34
99.20
99.55
99.65
92.55
70.55
59.34
77.29
81.67
99.99
57.67
95.38
95.06
92.00
88.65
92.88
97.11
96.16
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.93
99.99
100.00
100.00
100.00
99.94
99.37
99.46
99.49
98.95
47.90
98.72
91.17
90.91
92.53
99.80
96.09
97.29
66.20
97.91
78.09
99.49
99.37
99.90
90.68
67.09
53.29
75.79
83.05
100.00
47.00
95.41
94.54
92.17
90.45
92.18
96.46
95.64
99.58
100.00
99.87
99.99
51.29
96.807

F chg

100.00
100.00
99.87
100.00
100.00
100.00
100.00
100.00
99.56
99.29
99.28
95.53
51.82
98.11
91.93
84.76
88.30
99.67
96.14
96.68
63.04
97.52
70.94
98.91
99.73
99.39
94.50
74.39
66.93
78.86
80.33
99.99
74.60
95.35
95.59
91.83
86.92
93.59
97.77
96.70
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = RB & Par = ADVP-TMP) (Pos RBTMP)

B.65 rbloc[s] Mapping


0.00
79.69
88.89
88.89
81.19
82.45
52.94
87.10

0.00
90.71
0.00

33.33
86.04
81.07
82.12
78.92
74.14
90.03

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.178
0.998
0.374
0.856
0.907
0.587
0.430
0.154
0.105
0.374

0.031
0.209
0.643
0.730
0.297
0.374

0.634
0.139
0.472
0.940

0.374
0.097
0.821
0.207
0.786
0.105
0.996
0.101

0.137
0.521

100.00
0.00 99.99
99.84
100.00
100.00
100.00
100.00
99.94
0.00 99.53
98.51 99.42
0.00 99.37
97.33
0.00 43.33
0.00 98.37
82.14 92.19
28.24 88.53
59.26 95.48
0.00 99.74
67.69 96.11
98.12 96.29
11.54 62.65
90.40 97.46
76.11
99.20
0.00 99.59
0.00 99.67
81.40 92.77
0.00 72.04
0.00 86.19
75.76
0.00 75.44
99.99
5.88 47.95
78.95 95.68
71.36 95.46
91.45 93.26
73.18 89.57
32.82 93.10
63.81 97.61
0.00 96.81
99.45
100.00
0.00 99.94
0.00 99.99

0.07
3.42
1.11

0.11
-0.05
-3.57
-0.08

0.20

0.09
0.87
0.13
0.53
2.38
-0.71

0.548
0.374
0.374

0.211
0.320
0.322
0.367

0.179

0.553
0.217
0.408
0.155
0.447
0.075

84.66

0.04 0.340

RecU

-0.01

-0.02
0.00
-0.00
-0.00
-0.05
0.74
0.00
-0.03
0.14
0.02

-0.01
-0.00
0.05
-0.00
-0.48
0.01

-0.03
0.24
0.17
0.05

-2.11
0.02
0.01
-0.02
0.01
0.04
0.00
0.04

-0.12
-0.00

PrecU

100.00
99.99
99.84
100.00
100.00
100.00
100.00
99.94
99.53
99.42
99.37
97.33
43.33
98.37
92.19
88.53
95.48
99.74
96.11
96.29
62.65
97.46
76.11
99.20
99.59
99.67
92.77
72.04
86.19
75.76
75.44
99.99
47.95
95.68
95.46
93.26
89.57
93.10
97.61
96.81
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.74
99.99
100.00
100.00
100.00
99.94
99.34
99.56
99.28
99.58
32.77
98.93
92.15
92.29
95.73
99.79
95.09
97.95
73.26
97.21
83.88
99.65
99.33
99.96
90.13
67.35
84.77
76.97
72.88
100.00
35.00
94.63
95.94
94.17
90.94
92.18
96.41
96.38
100.00
100.00
99.87
99.99
50.66
96.851

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.94
99.73
99.28
99.46
95.18
63.93
97.82
92.23
85.06
95.23
99.69
97.15
94.69
54.72
97.72
69.67
98.75
99.85
99.39
95.58
77.43
87.66
74.59
78.18
99.99
76.09
96.74
95.00
92.38
88.24
94.04
98.83
97.26
98.90
100.00
100.00
100.00

F(0.5)

Rec

SVM
#
158
$
8103

''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = RB & Par =re ADVP(?:LOCDIR)(?:[A-Z]+)?) (Pos RBLOC)


5.56
78.53
72.22
79.59
0.00
74.25
84.22
27.63
81.28

84.03
0.00

84.21
81.76
71.16
66.17
55.74
86.29
0.00

p(FU)

24622

98.81

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.341
0.953
0.858
0.223
0.077
0.617
0.966
0.553
0.372
0.330
0.944
0.186
0.160
0.662
0.733
0.231
0.972
0.374
0.109
0.329
0.395
0.844

0.374

0.208
0.695
0.125
0.252
0.383
0.172
0.513
0.374

0.156
0.945

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.50
97.39 99.41
0.00 99.40
96.86
0.00 49.23
4.55 98.42
73.15 91.66
15.29 87.66
72.22 94.06
0.00 99.73
69.65 96.13
93.33 96.99
8.97 64.62
88.57 97.72
75.27
99.24
0.00 99.55
0.00 99.65
81.59 92.65
0.00 70.86
0.00 81.12
76.96
0.00 81.67
99.99
0.00 57.67
75.79 95.33
61.03 95.05
93.18 91.95
81.15 88.76
51.91 92.88
55.25 97.12
0.00 96.15
99.33
100.00
0.00 99.94
0.00 99.99

0.02

-0.35
-27.59
2.51

0.13
-0.03
-34.80
-1.17

0.61

0.09
-3.26
1.87
-1.50
-3.08
0.91

0.961

0.670
0.501
0.179

0.858
0.930
0.105
0.010

0.438

0.987
0.086
0.014
0.115
0.784
0.838

81.73

-0.26 0.398

RecU

-0.00

0.03
-0.00
-0.00
-0.42
-4.84
-0.01
0.00
0.12
0.86
0.02
0.00
-0.02
-1.50
-0.01
0.12
0.02
0.00
0.01
0.09
0.55
3.27
-0.10

0.00

-0.10
-0.01
0.17
-0.09
-0.10
0.10
0.06
-0.02

-0.44
-0.00

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.50
99.41
99.40
96.86
49.23
98.42
91.66
87.66
94.06
99.73
96.13
96.99
64.62
97.72
75.27
99.24
99.55
99.65
92.65
70.86
81.12
76.96
81.67
99.99
57.67
95.33
95.05
91.95
88.76
92.88
97.12
96.15
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.91
99.99
100.00
100.00
100.00
99.94
99.37
99.48
99.48
98.84
47.06
98.72
91.40
90.61
93.33
99.81
96.05
97.27
65.51
97.96
80.10
99.53
99.38
99.90
91.00
66.56
83.54
76.59
83.05
100.00
47.00
95.34
94.47
92.24
90.47
92.14
96.53
95.81
99.58
100.00
99.87
99.99
51.59
96.841

F chg

Rec

100.00
100.00
99.91
100.00
100.00
100.00
100.00
100.00
99.64
99.34
99.32
94.95
51.61
98.12
91.93
84.89
94.80
99.66
96.21
96.72
63.74
97.48
70.98
98.96
99.72
99.39
94.37
75.75
78.83
77.34
80.33
99.99
74.60
95.32
95.63
91.67
87.11
93.63
97.72
96.50
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

``
7811
SENT
43766
TOKENS 1044667

B.66


0.00
0.00
78.89
73.08
71.93
0.00
73.71
83.69
42.86
82.88
0.00
0.00
14.29
84.10
33.33

83.90
83.33
69.71
66.41
58.47
85.06
0.00

p(FU)

24622

98.72

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.185
0.729
0.640
0.231
0.545
0.255
0.407
0.967
0.804
0.633
0.107
0.268
0.439
0.575
0.321
0.838
0.513
0.374
0.811
0.547
0.530
0.365
0.374
0.374

0.999
0.779
0.545
0.699
0.589
0.635
0.399
0.178

0.141
0.568

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.51
97.28 99.40
0.00 99.41
97.06
0.00 51.16
0.00 98.44
72.30 91.62
22.35 87.57
75.93 93.76
0.00 99.73
69.42 96.10
93.83 96.99
15.38 65.41
87.46 97.69
74.58
99.22
9.09 99.54
0.00 99.65
81.98 92.59
14.29 70.71
0.00 76.34
77.16
0.00 82.64
99.99
0.00 57.67
74.04 95.42
62.21 95.08
92.64 91.85
83.38 88.79
52.67 92.92
56.10 97.06
0.00 96.18
99.31
100.00
0.00 99.94
0.00 99.99

-0.09

-0.74
-1.80

-0.40
-0.11
8.96
-0.76

0.89

-1.32
-1.40
0.43
-0.10
-0.07
1.29

0.512

0.187
0.975

0.590
0.640
0.786
0.257

0.360

0.384
0.540
0.242
0.991
0.874
0.645

81.69

-0.31 0.342

RecU

-0.01

0.03
-0.00
0.01
-0.21
-1.10
0.01
-0.04
0.03
0.54
0.01
-0.03
-0.02
-0.29
-0.03
-0.79
-0.01
-0.01
0.01
0.01
0.34
-2.81
0.15
1.20
0.00

-0.00
0.02
0.06
-0.05
-0.05
0.04
0.09
-0.04

-0.44
-0.01

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.51
99.40
99.41
97.06
51.16
98.44
91.62
87.57
93.76
99.73
96.10
96.99
65.41
97.69
74.58
99.22
99.54
99.65
92.59
70.71
76.34
77.16
82.64
99.99
57.67
95.42
95.08
91.85
88.79
92.92
97.06
96.18
99.31
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.90
99.99
100.00
100.00
100.00
99.94
99.39
99.46
99.48
98.84
46.22
98.71
91.22
90.58
95.96
99.80
96.06
97.32
66.61
97.87
78.34
99.52
99.37
99.90
91.15
66.40
68.72
75.79
84.75
100.00
47.00
95.26
94.50
92.21
90.72
92.28
96.48
95.66
99.58
100.00
99.87
99.99
51.59
96.836

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.63
99.35
99.33
95.33
57.29
98.16
92.03
84.75
91.66
99.65
96.14
96.67
64.25
97.52
71.17
98.92
99.71
99.39
94.07
75.61
85.86
78.57
80.65
99.99
74.60
95.59
95.67
91.50
86.94
93.57
97.65
96.70
99.05
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = NN & Par =re NP-TMP(?:-[A-Z]+)?) (Pos NNTMP)
(Pos = NNP & Par =re NP-TMP(?:-[A-Z]+)?) (Pos NNPTMP)
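
The two rules above condition on the label of the parent node rather than on the word. A minimal Python sketch of one way to read them follows. It assumes the parent label is already available as a string and that the regular expression must match the whole label. The helper name `map_tag` and the sample labels are hypothetical and not drawn from the thesis code.

```python
import re

# Hypothetical sketch of a parent-conditioned mapping rule.
# The pattern is copied from the rule text; whole-label matching
# (fullmatch) is an assumption about how the rule is applied.
NP_TMP = re.compile(r"NP-TMP(?:-[A-Z]+)?")

# Original tag -> refined tag, used only when the parent is a temporal NP.
TMP_SUBTAGS = {"NN": "NNTMP", "NNP": "NNPTMP"}


def map_tag(tag, parent_label):
    """Return the refined tag when the parent label matches, else the tag."""
    if tag in TMP_SUBTAGS and NP_TMP.fullmatch(parent_label):
        return TMP_SUBTAGS[tag]
    return tag


# Illustrative (tag, parent label) pairs.
examples = [
    ("NN", "NP-TMP"),
    ("NNP", "NP-TMP-CLR"),
    ("NN", "NP-SBJ"),
    ("VB", "NP-TMP"),
]
for tag, parent in examples:
    print(tag, parent, "->", map_tag(tag, parent))
```

Because the condition uses treebank structure that is unavailable at tagging time, a rule of this kind can only be applied to the annotated training data. The tagger itself has to learn to predict the refined tag from the surface context.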

B.67


50.00
0.00
78.43
73.91
73.21
74.29
84.21
37.21
83.97
0.00

85.80
33.33

80.00
83.38
70.85
66.21
60.98
88.49

p(FU)

24622

98.74

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.373

0.379
0.030
0.179
0.241
0.784
0.095
0.009
0.153
0.716
0.694
0.515
0.322
0.329
0.015
0.228
0.642
0.991
0.374
0.478
0.203
0.386
0.576

0.374

0.093
0.064
0.140
0.092
0.420
0.036
0.526
0.374

0.167
0.521

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.48
96.54 99.37
0.00 99.41
97.06
4.48 52.49
0.00 98.45
72.54 91.56
20.00 87.79
75.93 91.49
0.00 99.72
69.27 96.14
94.17 97.00
20.51 65.06
89.08 97.79
74.07
99.24
0.00 99.55
0.00 99.65
80.81 92.61
14.29 71.15
0.00 68.01
77.20
0.00 81.67
99.99
0.00 57.67
75.79 95.28
64.79 95.02
92.32 91.90
81.28 88.68
57.25 92.91
57.60 97.19
0.00 96.16
99.33
100.00
0.00 99.94
0.00 99.99

-0.46

-0.85

0.91

-0.13
0.39
27.27
0.80

1.14

-2.35
0.92
1.21
-1.39
6.48
4.53

0.058

0.055

0.374

0.819
0.088
0.510
0.166

0.446

0.022
0.307
0.285
0.290
0.376
0.266

81.93

-0.01 0.958

RecU

-0.01

0.01
-0.04
0.01
-0.21
1.46
0.02
-0.11
0.28
-1.89
0.00
0.01
-0.01
-0.83
0.06
-1.47
0.02
-0.00
0.01
0.03
0.96
-13.42
0.20

0.00

-0.16
-0.04
0.11
-0.18
-0.06
0.17
0.07
-0.02

-0.66
-0.01

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.48
99.37
99.41
97.06
52.49
98.45
91.56
87.79
91.49
99.72
96.14
97.00
65.06
97.79
74.07
99.24
99.55
99.65
92.61
71.15
68.01
77.20
81.67
99.99
57.67
95.28
95.02
91.90
88.68
92.91
97.19
96.16
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.39
99.42
99.49
98.84
48.74
98.71
91.26
90.86
92.20
99.78
96.04
97.29
66.58
97.98
78.09
99.51
99.36
99.91
91.03
66.67
65.84
75.72
83.05
100.00
47.00
95.34
94.39
92.31
90.69
92.34
96.62
95.72
99.58
100.00
99.87
99.99
51.48
96.831

F chg

100.00
100.00
99.88
100.00
100.00
100.00
100.00
100.00
99.57
99.32
99.33
95.33
56.86
98.19
91.86
84.93
90.80
99.67
96.25
96.72
63.61
97.60
70.45
98.97
99.74
99.38
94.23
76.28
70.33
78.73
80.33
99.99
74.60
95.21
95.65
91.49
86.75
93.49
97.76
96.60
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

``
7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = NN & Wd {afternoon, evening, midsummer,...<13 omitted>...,winter, wintertime, yesterday}) (Pos NNTMP)
(Pos = NNP & Wd {advent, apr, apr.,...<62 omitted>...,wed., wednesday, xmas}) (Pos NNPTMP)

B.68 vbcop[s] Mapping


0.00
79.35
91.67
88.89
81.27
82.56
55.10
87.33

0.00
90.15
0.00

33.33
86.90
81.52
82.50
79.00
72.88
89.58

p(FU)

24622

99.38

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.765
0.621
0.407
0.224
0.686
0.519
0.029
0.367
0.897
0.771
0.513
0.268
0.678
0.164
0.460
0.037
0.178

0.021
0.179
0.696
0.045

0.317
0.073
0.099
0.345
0.055
0.514
0.040
0.958

0.720
0.081

100.00
0.00 99.99
99.84
100.00
100.00
100.00
100.00
99.96
0.00 99.53
98.51 99.42
0.00 99.37
97.53
0.00 42.90
0.00 98.37
82.89 92.26
25.88 88.46
59.26 95.45
0.00 99.74
67.95 96.12
98.14 96.29
11.54 62.48
90.48 97.43
76.85
99.17
0.00 99.60
0.00 99.67
81.59 92.90
0.00 72.33
0.00 85.98
75.99
0.00 75.44
99.99
5.88 47.62
79.12 95.59
70.42 95.41
91.34 93.23
68.30 89.35
32.82 93.03
64.45 97.48
0.00 96.77
99.45
100.00
0.00 99.94
0.00 99.99

0.01

0.29
-2.59
1.11

0.36
0.04
-2.89
0.09

0.03

0.68
0.42
0.32
-3.02
1.84
-0.33

0.374

0.001
0.599
0.374

0.079
0.230
0.486
0.413

0.830

0.040
0.430
0.199
0.029
0.374
0.184

84.70

0.09 0.012

RecU

-0.01

-0.00
-0.00
-0.00
0.15
-0.28
0.01
0.05
0.06
-0.00
0.00
0.00
-0.01
-0.21
-0.04
0.48
-0.03
0.01

0.10
0.65
-0.07
0.35

-2.78
-0.07
-0.05
-0.05
-0.23
-0.03
-0.13
-0.00

-0.04
-0.01

PrecU

100.00
99.99
99.84
100.00
100.00
100.00
100.00
99.96
99.53
99.42
99.37
97.53
42.90
98.37
92.26
88.46
95.45
99.74
96.12
96.29
62.48
97.43
76.85
99.17
99.60
99.67
92.90
72.33
85.98
75.99
75.44
99.99
47.62
95.59
95.41
93.23
89.35
93.03
97.48
96.77
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.74
99.99
100.00
100.00
100.00
99.94
99.34
99.57
99.28
99.58
32.35
98.92
92.20
92.10
95.63
99.74
95.11
97.93
73.23
97.18
84.89
99.66
99.34
99.96
90.71
67.72
84.57
75.79
72.88
100.00
35.00
94.51
96.07
94.09
90.14
92.24
96.26
96.35
100.00
100.00
99.87
99.99
50.70
96.846

F chg

Rec

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.98
99.72
99.27
99.46
95.56
63.64
97.83
92.33
85.10
95.27
99.75
97.15
94.70
54.49
97.68
70.21
98.68
99.85
99.39
95.19
77.62
87.45
76.19
78.18
99.99
74.47
96.70
94.76
92.39
88.57
93.84
98.73
97.19
98.90
100.00
100.00
100.00

F(0.5)

Prec

True

SVM
#
158
$
8103

''
7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

``
7811
SENT
43766
TOKENS 1044667


0.00
0.00
78.34
63.16
74.55
73.57
84.45
32.00
82.45

83.00
0.00

86.41
79.71
71.37
66.23
61.86
89.60
0.00

p(FU)

24622

98.86

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.374
0.476
0.179
0.417
0.369
0.286
0.550
0.659
0.872
0.331
0.459
0.930
0.356
0.751
0.271
0.714
0.616
0.374
0.419
0.876
0.906
0.226

0.374

0.813
0.470
0.124
0.450
0.029
0.009
0.735
0.374

0.411
0.859

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.51
96.89 99.39
0.00 99.41
97.31
0.00 50.23
0.00 98.44
72.80 91.62
14.12 87.64
75.93 93.27
0.00 99.73
69.40 96.10
93.40 97.01
13.68 65.07
88.65 97.74
75.48
99.22
0.00 99.55
0.00 99.65
81.40 92.61
0.00 70.56
0.00 78.56
77.42
0.00 81.67
99.99
0.00 57.67
74.74 95.41
64.55 95.02
92.53 91.99
84.08 88.70
55.73 92.78
57.17 97.22
0.00 96.05
99.33
100.00
0.00 99.94
0.00 99.99

-0.21

-0.72

1.83

-0.51
0.15
-7.78
-0.38

-0.12

0.55
-1.27
1.74
0.12
5.72
4.57

0.379

0.488

0.374

0.593
0.626
0.773
0.447

0.880

0.365
0.698
0.028
0.885
0.604
0.161

81.78

-0.20 0.557

RecU

-0.01

0.03
-0.02
0.01
0.05
-2.91
0.02
-0.05
0.10
0.01
0.02
-0.03
-0.00
-0.81
0.01
0.40
-0.01
-0.01
0.01
0.04
0.12
0.01
0.49

0.00

-0.01
-0.05
0.22
-0.15
-0.21
0.20
-0.04
-0.02

-0.39
-0.00

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.51
99.39
99.41
97.31
50.23
98.44
91.62
87.64
93.27
99.73
96.10
97.01
65.07
97.74
75.48
99.22
99.55
99.65
92.61
70.56
78.56
77.42
81.67
99.99
57.67
95.41
95.02
91.99
88.70
92.78
97.22
96.05
99.33
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.93
99.99
100.00
100.00
100.00
99.94
99.37
99.43
99.49
98.84
46.64
98.71
91.30
90.75
92.11
99.80
96.03
97.30
66.64
98.00
79.85
99.51
99.36
99.90
91.21
66.04
82.92
75.76
83.05
100.00
47.00
95.27
94.55
92.25
90.49
92.13
96.46
95.57
99.62
100.00
99.87
99.99
51.62
96.839

F chg

Rec

100.00
100.00
99.87
100.00
100.00
100.00
100.00
100.00
99.64
99.35
99.34
95.82
54.41
98.17
91.94
84.73
94.46
99.67
96.17
96.72
63.57
97.47
71.56
98.93
99.73
99.39
94.05
75.74
74.63
79.17
80.33
99.99
74.60
95.56
95.49
91.74
86.98
93.43
98.00
96.53
99.05
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

B.69 dtnum[l] + jjnum[l] Mapping


100.00
0.00
79.58
60.87
71.93
74.14
83.47
33.56
82.56

18.18
84.34
0.00

79.53
83.33
69.96
67.16
62.73
84.93

p(FU)

24622

98.63

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.120
0.377
0.763
0.134
0.772
0.934
0.175
0.372
0.832
0.418
0.333
0.193
0.356
0.137
0.807
0.694
0.187
0.986
0.332
0.455
0.263
0.923
0.374
0.374

0.204
0.373
0.835
0.823
0.079
0.487
0.849
0.178

0.196
0.258

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.52
97.04 99.38
0.00 99.41
97.01
1.49 52.28
0.00 98.43
71.80 91.55
16.47 87.38
75.93 92.22
0.00 99.73
69.17 96.09
93.99 96.97
20.94 64.90
87.72 97.69
74.94
99.23
18.18 99.53
0.00 99.64
81.40 92.53
0.00 70.25
0.00 64.37
77.07
0.00 82.64
99.99
0.00 57.67
77.02 95.29
63.38 95.11
91.99 91.81
82.54 88.81
52.67 92.85
53.10 97.06
0.00 96.07
99.31
100.00
0.00 99.94
0.00 99.99

-0.25

-0.69
-25.63

-0.31
-0.17
24.11
-0.82

0.66

-1.83
-0.35
0.34
0.07
3.24
-2.10

0.349

0.486
0.409

0.746
0.512
0.686
0.129

0.616

0.253
0.948
0.668
0.877
0.823
0.663

81.59

-0.43 0.370

RecU

-0.01

0.05
-0.02
0.00
-0.26
1.06
0.00
-0.13
-0.19
-1.11
0.01
-0.04
-0.04
-1.08
-0.04
-0.32
0.01
-0.02
0.00
-0.05
-0.32
-18.06
0.03
1.20
0.00

-0.14
0.05
0.02
-0.03
-0.12
0.04
-0.02
-0.04

-0.88
-0.03

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.52
99.38
99.41
97.01
52.28
98.43
91.55
87.38
92.22
99.73
96.09
96.97
64.90
97.69
74.94
99.23
99.53
99.64
92.53
70.25
64.37
77.07
82.64
99.99
57.67
95.29
95.11
91.81
88.81
92.85
97.06
96.07
99.31
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.90
99.99
100.00
100.00
100.00
99.94
99.37
99.43
99.48
98.84
45.80
98.71
91.06
90.53
96.38
99.80
96.00
97.33
66.99
97.89
79.09
99.52
99.38
99.90
91.06
65.88
51.85
75.16
84.75
100.00
47.00
95.52
94.68
92.12
90.49
92.05
96.43
95.64
99.58
100.00
99.87
99.99
51.37
96.810

F chg

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.67
99.34
99.33
95.24
60.89
98.15
92.03
84.44
88.41
99.66
96.18
96.62
62.93
97.49
71.20
98.94
99.68
99.38
94.04
75.24
84.85
79.06
80.65
99.99
74.60
95.06
95.55
91.50
87.20
93.67
97.71
96.51
99.05
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = DT & Wd ∈ {a, another, each, every, little, many, much}) → (Pos ← DT1)
(Pos = DT & Wd ∈ {these, those}) → (Pos ← DTP)
(Pos = JJ & Wd ∈ {countless, few, many, numerous, several}) → (Pos ← JJP)
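
The dtnum[l] + jjnum[l] rules listed here are purely lexical: each one fires on a tag plus a word list and assigns a new tag. A minimal sketch of how such rules could be applied and then undone is given below. The names `RULES`, `map_tag` and `unmap_tag` are hypothetical, and matching on the lowercased word is an assumption rather than something the rules state.

```python
# Hypothetical encoding of the three rules: (original tag, word set, new tag).
RULES = [
    ("DT", {"a", "another", "each", "every", "little", "many", "much"}, "DT1"),
    ("DT", {"these", "those"}, "DTP"),
    ("JJ", {"countless", "few", "many", "numerous", "several"}, "JJP"),
]


def map_tag(word, tag):
    """Return the mapped tag for (word, tag), or the tag unchanged if no rule fires."""
    for source_tag, words, target_tag in RULES:
        # Assumption: word lists are matched case-insensitively.
        if tag == source_tag and word.lower() in words:
            return target_tag
    return tag


# Every new tag is produced from exactly one original tag, so the mapping
# can be reversed with a plain lookup that needs no word information.
INVERSE = {target_tag: source_tag for source_tag, _, target_tag in RULES}


def unmap_tag(tag):
    """Recover the original tag from a mapped tag."""
    return INVERSE.get(tag, tag)


def map_sentence(tagged):
    """Apply the mapping to a list of (word, tag) pairs."""
    return [(word, map_tag(word, tag)) for word, tag in tagged]


def unmap_sentence(tagged):
    """Reverse the mapping on a list of (word, tag) pairs."""
    return [(word, unmap_tag(tag)) for word, tag in tagged]
```

Because the inverse depends only on the new tag, tagger output in the mapped tagset can be converted back to the original tags before scoring.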

B.70


0.00
79.36
88.89
88.89
80.19
82.37
55.10
87.40

0.00
90.30
0.00

33.33
86.83
79.84
82.12
78.14
72.31
89.71

p(FU)

24622

99.35

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.987
0.190
0.741
0.374
0.745
0.747
0.105
0.606
0.281
0.178
0.622
0.239
0.966
0.093
0.374
0.995
0.071
0.016
0.726
0.648
0.525
0.094

0.317
0.171
0.407
0.833
0.462
0.159
0.697
0.333

0.449
0.251

100.00
0.00 99.99
99.84
100.00
100.00
100.00
100.00
99.96
0.00 99.53
98.48 99.42
0.00 99.38
97.33
0.00 42.25
0.00 98.37
82.17 92.18
28.24 88.47
59.26 95.53
0.00 99.75
67.72 96.11
98.01 96.28
11.54 62.62
90.52 97.48
76.34
99.20
0.00 99.60
0.00 99.70
81.20 92.79
0.00 71.98
0.00 86.25
75.35
0.00 75.44
99.99
5.88 47.62
68.25 95.58
70.66 95.45
91.45 93.27
72.91 89.53
35.88 93.10
65.31 97.61
0.00 96.65
99.45
100.00
0.00 99.94
0.00 99.99

-0.02

-0.13
3.42
1.11

-0.43
-0.15
-2.89
0.16

-0.14

-7.10
-0.37
0.13
-0.14
7.91
0.49

0.374

0.517
0.962
0.374

0.029
0.046
0.402
0.004

0.233

0.025
0.531
0.667
0.844
0.307
0.369

84.42

-0.24 0.015

RecU

-0.01

0.00
-0.01
0.00
-0.05
-1.77
0.01
-0.04
0.07
0.07
0.01
-0.00
-0.01
0.00
0.01
-0.18
0.00
0.01
0.02
-0.01
0.17
0.24
-0.49

-2.78
-0.08
-0.01
-0.01
-0.04
0.04
0.00
-0.13

-0.14
-0.01

PrecU

100.00
99.99
99.84
100.00
100.00
100.00
100.00
99.96
99.53
99.42
99.38
97.33
42.25
98.37
92.18
88.47
95.53
99.75
96.11
96.28
62.62
97.48
76.34
99.20
99.60
99.70
92.79
71.98
86.25
75.35
75.44
99.99
47.62
95.58
95.45
93.27
89.53
93.10
97.61
96.65
99.45
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.74
99.99
100.00
100.00
100.00
99.94
99.37
99.57
99.34
99.58
31.51
98.79
92.02
92.27
95.82
99.79
95.12
97.95
73.19
97.22
84.13
99.65
99.35
99.96
90.65
67.03
84.57
75.55
72.88
100.00
35.00
94.35
95.90
94.16
90.95
92.26
96.43
96.90
100.00
100.00
99.87
99.99
50.64
96.846

F chg

100.00
100.00
99.95
100.00
100.00
100.00
100.00
99.98
99.70
99.26
99.42
95.18
64.10
97.96
92.35
84.97
95.24
99.71
97.12
94.67
54.72
97.73
69.87
98.74
99.85
99.44
95.03
77.72
88.01
75.16
78.18
99.99
74.47
96.85
95.00
92.40
88.14
93.95
98.83
96.39
98.90
100.00
100.00
100.00

F(0.5)

SVM
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Rec

Prec

True

(Pos = TO & Wd = to & SibR = NP) → (Pos ← IN)
(Pos = TO & Par = QP & Wd = to) → (Pos ← IN)
(Pos = IN & SibR = S) → (Pos ← INSUB)
(Pos = IN & Par = SBAR) → (Pos ← INSUB)
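
Unlike the purely lexical mappings, the to:in + insub[s] rules test syntactic context: the label of the parent constituent (Par) and of the right sibling (SibR). The following is a minimal sketch of applying them, assuming each token has already been annotated with those two labels from its parse tree. The `Token` class and `map_token` function are hypothetical names, and testing the rules against the original tag only (so that a TO remapped to IN is not then remapped to INSUB) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    word: str
    pos: str
    parent: Optional[str] = None         # label of the parent constituent (Par)
    right_sibling: Optional[str] = None  # label of the sibling to the right (SibR)


def map_token(tok):
    """Return the mapped tag for a token under the to:in + insub[s] rules."""
    # to:in rules: 'to' before an NP, or inside a QP, is retagged as IN.
    if tok.pos == "TO" and tok.word.lower() == "to":
        if tok.right_sibling == "NP" or tok.parent == "QP":
            return "IN"
    # insub rules: IN introducing a clause is retagged as INSUB.
    if tok.pos == "IN" and (tok.right_sibling == "S" or tok.parent == "SBAR"):
        return "INSUB"
    return tok.pos


def unmap_token(word, tag):
    """Recover the original tag from the word and the mapped tag.

    Assumption: the original annotation never tags the word 'to' as IN,
    so an IN on 'to' can only have come from the to:in rules.
    """
    if tag == "INSUB":
        return "IN"
    if tag == "IN" and word.lower() == "to":
        return "TO"
    return tag
```

The tree context is needed only when the mapping is applied to the training data. The reverse direction in this sketch uses just the word and the mapped tag.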

to:in + insub[s] Mapping


100.00
0.00
78.81
75.00
71.93
74.01
84.36
34.69
82.21

84.17
0.00

84.39
81.68
70.80
65.67
66.36
85.17

0.00

p(FU)

24622

0.00
98.69

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.374

0.445
0.648
0.726
0.917
0.206
0.370
0.735
0.512
0.395
0.979
0.485
0.860
0.259
0.747
0.958
0.094
0.748
0.205
0.364
0.626
0.388
0.126
0.374
0.178

0.822
0.138
0.970
0.546
0.343
0.097
0.036

0.374

0.186
0.344

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.50
97.10 99.40
0.00 99.40
97.21
1.49 49.57
0.00 98.41
72.65 91.64
21.18 87.43
75.93 92.04
0.00 99.72
69.57 96.11
93.94 97.02
14.53 64.98
88.99 97.74
75.23
99.24
0.00 99.55
0.00 99.66
81.40 92.53
0.00 70.16
0.00 71.52
77.47
0.00 82.64
99.99
0.00 57.67
74.91 95.41
61.74 95.12
93.40 91.79
79.89 88.78
54.20 93.05
57.82 97.15
0.00 96.30
99.35
100.00
0.00 99.89
0.00 99.99

-0.19

-0.54
-5.26

-0.09
0.37
-1.43
-0.35

0.56

-0.43
-2.67
1.68
-2.59
7.58
3.18

0.361

0.505
0.374

0.902
0.235
0.842
0.311

0.358

0.639
0.068
0.027
0.142
0.388
0.320

81.88

-0.08 0.779

RecU

-0.00

0.02
-0.01
-0.01
-0.05
-4.18
-0.02
-0.02
-0.14
-1.30
0.00
-0.02
0.01
-0.95
0.01
0.07
0.02
0.00
0.02
-0.05
-0.44
-8.95
0.55
1.20
-0.00

-0.01
0.06
-0.00
-0.06
0.08
0.13
0.22

-0.04

-0.53
-0.01

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.50
99.40
99.40
97.21
49.57
98.41
91.64
87.43
92.04
99.72
96.11
97.02
64.98
97.74
75.23
99.24
99.55
99.66
92.53
70.16
71.52
77.47
82.64
99.99
57.67
95.41
95.12
91.79
88.78
93.05
97.15
96.30
99.35
100.00
99.89
99.99

TrueU

p(F)

100.00
99.98
99.91
99.99
100.00
100.00
100.00
99.94
99.40
99.47
99.48
98.84
48.32
98.63
91.26
89.98
92.34
99.78
96.08
97.29
66.40
97.92
81.11
99.54
99.37
99.93
91.05
66.35
70.78
76.38
84.75
99.99
47.00
95.34
94.58
92.10
90.63
92.33
96.63
96.23
99.62
100.00
99.87
99.99
51.54
96.834

F chg

Rec

100.00
100.00
99.91
100.00
100.00
100.00
100.00
100.00
99.60
99.33
99.31
95.63
50.88
98.20
92.03
85.02
91.74
99.66
96.14
96.75
63.61
97.55
70.15
98.95
99.73
99.38
94.06
74.44
72.27
78.59
80.65
99.98
74.60
95.48
95.68
91.49
87.01
93.77
97.68
96.37
99.08
100.00
99.91
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

B.71 vbcop[lm] Mapping


7.14
78.34
73.33
70.00
74.72
84.11
36.07
80.54

84.35
0.00

85.63
82.57
69.63
66.21
63.06
85.52

p(FU)

24622

0.00
98.99

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.373

0.778
0.621
0.961
0.231
0.329
0.714
0.666
0.710
0.875
0.626
0.907
0.586
0.012
0.380
0.231
0.310
0.993

0.367
0.836
0.265
0.137

0.374
0.374
0.147
0.804
0.863
0.552
0.033
0.192
0.170

0.037
0.113

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.48
97.22 99.40
0.00 99.40
97.06
0.00 49.41
4.55 98.42
73.35 91.63
25.88 87.64
77.78 92.33
0.00 99.71
68.73 96.13
93.42 97.00
9.40 64.84
89.50 97.71
74.64
99.20
0.00 99.55
0.00 99.64
80.43 92.49
0.00 70.57
0.00 64.52
77.27
0.00 81.67
99.99
0.00 56.79
75.26 95.33
63.38 95.07
92.53 91.81
81.01 88.86
53.44 92.82
54.39 97.12
0.00 96.21
99.35
100.00
0.00 99.94
0.00 99.99

0.02

-0.33

-0.26

-0.26
-0.05
-28.22
-1.15

0.06

0.51
-0.74
0.32
-1.54
4.31
-0.39

0.937

0.556

0.374

0.685
0.855
0.098
0.046

0.991

0.704
0.469
0.723
0.205
0.652
0.894

81.73

-0.26 0.423

RecU

-0.01

0.00
-0.01
-0.00
-0.21
-4.49
-0.00
-0.03
0.11
-1.00
-0.00
0.00
-0.01
-1.16
-0.02
-0.71
-0.02
-0.00

-0.09
0.14
-17.86
0.30

0.00
-1.52
-0.10
0.01
0.01
0.02
-0.16
0.10
0.13

-0.71
-0.02

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.48
99.40
99.40
97.06
49.41
98.42
91.63
87.64
92.33
99.71
96.13
97.00
64.84
97.71
74.64
99.20
99.55
99.64
92.49
70.57
64.52
77.27
81.67
99.99
56.79
95.33
95.07
91.81
88.86
92.82
97.12
96.21
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.40
99.45
99.48
98.84
44.12
98.69
91.36
90.78
96.67
99.77
96.01
97.31
65.75
97.94
79.35
99.52
99.36
99.90
90.96
66.09
51.65
75.41
83.05
100.00
46.00
95.16
94.55
92.38
90.76
92.28
96.59
95.76
99.62
100.00
99.87
99.99
51.45
96.827

F chg

100.00
100.00
99.88
100.00
100.00
100.00
100.00
100.00
99.56
99.35
99.33
95.33
56.15
98.16
91.91
84.72
88.36
99.66
96.25
96.70
63.95
97.47
70.47
98.89
99.74
99.38
94.08
75.71
85.96
79.23
80.33
99.99
74.19
95.50
95.60
91.24
87.04
93.37
97.66
96.66
99.08
100.00
100.00
100.00

F(0.5)

Rec

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

Prec

True

(Pos = VBZ & Wd ∈ {appears, becomes, feels, looks, remains, seems, smells, sounds}) → (Pos ← VJZ)
(Pos = VB & Wd ∈ {appear, become, feel, look, remain, seem, smell, sound}) → (Pos ← VJ)
(Pos = VBP & Wd ∈ {appear, become, feel, look, remain, seem, smell, sound}) → (Pos ← VJP)
(Pos = VBN & Wd ∈ {appeared, become, felt, looked, remained, seemed, smelled, smelt, sounded}) → (Pos ← VJN)
(Pos = VBD & Wd ∈ {appeared, became, felt, looked, remained, seemed, smelled, smelt, sounded}) → (Pos ← VJD)

B.72 in/rp/rb[l] Mapping


80.00
0.00
78.75
81.25
70.69
0.00
74.03
83.29
33.02
83.90

83.43
20.00

0.00
83.53
82.28
70.25
66.36
55.12
88.12

p(FU)

24622

0.00
98.86

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.373

0.430
0.119
0.725
0.152
0.633
0.796
0.124
0.697
0.786
0.975
0.145
0.059
0.242
0.714
0.989
0.744
0.369
0.986
0.021
0.623
0.396
0.345

0.176
0.414
0.542
0.458
0.543
0.056
0.903

0.146
0.124

100.00
0.00 99.99
99.90
99.99
100.00
100.00
100.00
99.97
0.00 99.50
96.69 99.37
0.00 99.40
96.65
5.97 52.25
0.00 98.42
72.50 91.57
30.59 87.49
75.93 91.93
0.00 99.72
69.06 96.09
94.06 96.97
14.96 64.87
87.25 97.74
75.15
99.23
0.00 99.55
0.00 99.64
81.01 92.50
14.29 70.24
0.00 69.12
77.34
0.00 81.67
99.99
0.00 57.67
75.61 95.32
64.32 95.10
93.29 91.86
80.73 88.78
53.44 92.90
57.17 97.16
0.00 96.11
99.35
100.00
0.00 99.94
0.00 99.99

-0.32

-0.68
27.49
-0.89

-0.45
-0.25
-0.92
-0.26

-0.11

-0.42
-0.07
1.19
-1.57
-2.16
3.89

0.227

0.175
0.353
0.374

0.280
0.170
0.913
0.333

0.935

0.611
0.954
0.267
0.209
0.771
0.246

81.68

-0.32 0.150

RecU

-0.01

0.02
-0.04
-0.01
-0.63
1.01
-0.00
-0.10
-0.06
-1.42
0.00
-0.04
-0.05
-1.11
0.01
-0.04
0.01
-0.00
0.00
-0.08
-0.33
-12.00
0.39

-0.11
0.04
0.07
-0.07
-0.07
0.13
0.02

-0.78
-0.03

PrecU

100.00
99.99
99.90
99.99
100.00
100.00
100.00
99.97
99.50
99.37
99.40
96.65
52.25
98.42
91.57
87.49
91.93
99.72
96.09
96.97
64.87
97.74
75.15
99.23
99.55
99.64
92.50
70.24
69.12
77.34
81.67
99.99
57.67
95.32
95.10
91.86
88.78
92.90
97.16
96.11
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.37
99.41
99.46
98.53
48.74
98.70
91.21
90.61
93.14
99.79
95.97
97.30
66.85
97.87
80.35
99.53
99.36
99.90
90.97
65.72
65.64
76.66
83.05
100.00
47.00
95.43
94.53
92.36
90.62
92.10
96.59
95.81
99.62
100.00
99.87
99.99
51.42
96.812

F chg

Rec

100.00
100.00
99.88
100.00
100.00
100.00
100.00
100.00
99.62
99.33
99.33
94.84
56.31
98.15
91.93
84.58
90.76
99.65
96.20
96.63
63.00
97.61
70.58
98.93
99.73
99.38
94.09
75.42
73.00
78.04
80.33
99.98
74.60
95.21
95.68
91.37
87.01
93.72
97.73
96.42
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667

B.73


0.00
78.10
80.00
71.93
72.96
83.61
37.17
84.04

85.13
0.00

84.87
83.84
70.04
66.40
52.67
88.24

p(FU)

24622

98.80

F chgU

0
1
0
0
0
0
0
0
1
3413
2
0
67
22
4581
85
54
5
3934
6075
234
2353
0
0
11
1
516
7
2
0
1
0
17
570
426
924
716
131
467
4
0
0
1
1

F(0.5)U

0.765
0.093
0.667
0.136
0.579
0.951
0.135
0.696
0.544
0.299
0.033
0.162
0.149
0.551
0.599
0.519
0.032
0.110
0.021
0.798
0.330
0.487
0.374
0.374
0.374
0.339
0.300
0.293
0.119
0.116
0.007
0.628

0.033
0.079

100.00
0.00 99.99
99.91
99.99
100.00
100.00
100.00
99.97
0.00 99.48
96.45 99.37
0.00 99.40
96.60
0.00 51.40
0.00 98.43
72.84 91.58
23.53 87.61
75.93 91.41
0.00 99.73
68.71 96.07
94.11 96.98
17.95 64.68
86.83 97.74
75.00
99.24
0.00 99.52
0.00 99.61
81.01 92.48
0.00 70.24
0.00 69.36
77.37
0.00 82.64
99.99
0.00 56.79
75.79 95.33
64.55 95.00
92.10 91.73
80.03 88.66
52.67 92.79
57.82 97.18
0.00 96.16
99.35
100.00
0.00 99.94
0.00 99.99

-0.47

-0.84
4.31

-1.42
-0.02
16.50
-0.42

0.88

0.46
0.96
0.45
-1.93
-5.03
4.65

0.085

0.136
0.783

0.016
0.928
0.499
0.253

0.498

0.817
0.461
0.550
0.264
0.421
0.274

81.57

-0.46 0.094

RecU

0.01
-0.04
-0.01
-0.68
-0.64
-0.00
-0.09
0.07
-1.99
0.01
-0.06
-0.04
-1.41
0.01
-0.24
0.02
-0.03
-0.03
-0.10
-0.33
-11.70
0.42
1.20
0.00
-1.52
-0.10
-0.07
-0.07
-0.20
-0.20
0.16
0.07

-1.30
-0.04

PrecU

100.00
99.99
99.91
99.99
100.00
100.00
100.00
99.97
99.48
99.37
99.40
96.60
51.40
98.43
91.58
87.61
91.41
99.73
96.07
96.98
64.68
97.74
75.00
99.24
99.52
99.61
92.48
70.24
69.36
77.37
82.64
99.99
56.79
95.33
95.00
91.73
88.66
92.79
97.18
96.16
99.35
100.00
99.94
99.99

TrueU

p(F)

100.00
99.98
99.92
99.99
100.00
100.00
100.00
99.94
99.36
99.40
99.46
98.53
46.22
98.69
91.29
90.83
91.17
99.80
95.97
97.31
66.13
97.90
80.10
99.54
99.34
99.89
90.94
65.30
70.58
76.69
84.75
100.00
46.00
95.36
94.45
92.06
90.40
92.15
96.59
95.97
99.62
100.00
99.87
99.99
51.15
96.801

F chg

Rec

100.00
100.00
99.90
100.00
100.00
100.00
100.00
100.00
99.61
99.33
99.33
94.74
57.89
98.16
91.87
84.62
91.64
99.67
96.16
96.65
63.29
97.58
70.51
98.95
99.71
99.33
94.08
75.99
68.19
78.05
80.65
99.99
74.19
95.29
95.55
91.40
86.98
93.43
97.79
96.34
99.08
100.00
100.00
100.00

F(0.5)

Prec

True

TBL
#
158
$
8103

7620
,
53640
-LRB-
1489
-RRB-
1505
.
43373
:
5335
CC
26227
CD
40132
DT
90066
EX
951
FW
238
IN
108456
JJ
67085
JJR
3621
JJS
2129
MD
10743
NN
146173
NNP
100926
NNPS
2917
NNS
65922
PDT
397
POS
9529
PRP
19164
PRP$
9173
RB
33806
RBR
1905
RBS
486
RP
2879
SYM
59
TO
24551
UH
100
VB
29021
VBD
32941
VBG
16321
VBN
22177
VBP
13819
VBZ
23816
WDT
4745
WP
2604
WP$
183
WRB
2322

7811
SENT
43766
TOKENS 1044667
