Eligiendo El Mejor Estimador No Parametrico para Calcular Riqueza en Bases de Datos de Macroinvertebrados Bentonicos

ISSN 0373-5680 (impresa), ISSN 1851-7471 (en lnea) Rev. Soc. Entomol. Argent.
70 (1-2): 27-38, 2011 27
Choosing the best non-parametric richness estimator

for benthic macroinvertebrates databases
BASUALDO, Carola V.
Instituto de Biodiversidad Neotropical (IBN), Facultad de Ciencias Naturales e Instituto

Miguel Lillo Universidad Nacional de Tucumn. Miguel Lillo 205, C.P. 4000, SM de
Tucumn, Tucumn, Argentina; e-mail: carolabasualdo@gmail.com
Eligiendo el mejor estimador no paramtrico para calcular riqueza

en bases de datos de macroinvertebrados bentnicos
RESUMEN. Los estimadores no paramtricos permiten comparar la

riqueza estimada de conjuntos de datos de origen diverso. Empero, como su
comportamiento depende de la distribucin de abundancia del conjunto de
datos, la preferencia por alguno representa una decisin difcil. Este trabajo
rescata algunos criterios presentes en la literatura para elegir el estimador
ms adecuado para macroinvertebrados bentnicos de ros y ofrece algunas
herramientas para su aplicacin. Cuatro estimadores de incidencia y dos de
abundancia se aplicaron a un inventario regional a nivel de familia y gnero.
Para su evaluacin se consider: el tamao de submuestra para estimar la
riqueza observada, la constancia de ese tamao de submuestra, la ausencia
de comportamiento errtico y la similitud en la forma de la curva entre los
distintos conjuntos de datos. Entre los estimadores de incidencia, el mejor
fue Jack1; entre los de abundancia, ACE para muestras de baja riqueza y
Chao1, para las de alta riqueza. La forma uniforme de las curvas permiti
describir secuencias generales de comportamiento, que pueden utilizarse
como referencia para comparar curvas de pequeas muestras e inferir su
comportamiento y riqueza probable, si la muestra fuera mayor. Estos
resultados pueden ser muy tiles para la gestin ambiental y actualizan el
estado del conocimiento regional de macroinvertebrados.
PALABRAS CLAVE. Riqueza estimada. Ros neotropicales. Gestin

ambiental.
ABSTRACT. Non-parametric estimators allow to compare the

estimates of richness among data sets from heterogeneous sources. However,
since the estimator performance depends on the species-abundance distribution
of the sample, preference for one or another is a difficult issue. The present
study recovers and revalues some criteria already present in the literature in
order to choose the most suitable estimator for streams macroinvertebrates,
and provides some tools to apply them. Two abundance and four incidence
estimators were applied to a regional database at family and genus level. They
were evaluated under four criteria: sub-sample size required to estimate the
observed richness; constancy of the sub-sample size; lack of erratic behavior
and similarity in curve shape through different data sets. Among incidence
estimators, Jack1 had the best performance. Between abundance estimators,
ACE was the best when the observed richness was small and Chao1 when
Recibido: 9-XI-2010; aceptado: 3-II-2011
28 Rev. Soc. Entomol. Argent. 70 (1-2): 27-38, 2011
the observed richness was high. The uniformity of curves shapes allowed to
describe the general sequences of curves behavior that could act as references
to compare estimations of small databases and to infer the possible behavior
of the curve (i.e the expected richness) if the sample were larger. These results
can be very useful for environmental management, and update the state of
knowledge of regional macroinvertebrates.
KEY WORDS. Richness estimation. Neotropical streams. Environmental

management.
INTRODUCTION species, using a previously formulated non-

parametric model (e.g. Chao & Bunge, 2002;
Species richness is a widely used measure Srensen et al., 2002; Chiarucci et al., 2003;
of biodiversity because it is relatively Rosenzweig et al. 2003). There are two kinds
easy to measure and is well understood of non-parametric estimators: abundance
by researchers, managers and the society and incidence models.
(Hellmann & Fowler, 1999). Unfortunately, The abundance model considers the
it depends heavily on sample size, then if we number of individuals that represents each
make a simple count of the species present species in the sample, and the incidence
in a sample, surely we will fall into an model considers the number of samples
underestimation, unless we make a census in which each species is present (i.e.
of the community of interest (Hellmann & presence-absence data). These methods use
Fowler, 1999). Because exhaustive sampling the number of rare species to estimate the
is impractical or rarely supported (especially possible number of undiscovered species.
in tropical invertebrate, microbial or plant Since the abundance and incidence of
communities) (Hellmann & Fowler, 1999; rare species grow with the increasing of
Gotelli & Colwell, 2001), the estimation sampling effort, these methods are expected
of richness based on available biological to estimate different species richness as
inventories has received in the last decade sampling effort increases. Moreover, as
progressively more attention in both different populations have different species-
theoretical and empirical aspects (Walther & abundance distributions, the estimator
Morand, 1998; Heyer et al., 1999; Petersen performance should depend also on the
et al., 2003; Chao et al., 2005). Much of the species-abundance distribution of the data
importance of these methods lies in that they set (Bunge & Fitzpatrick, 1993; Sobern
make comparable the estimates of richness & Llorente, 1993; Colwell & Coddington,
among data sets from different regions, 1994; Walter & Morand, 1998). Thus, given
seasons, results of different methodologies or a particular data set, the preference for one
sampling effort (Jimnez-Valverde & Hortal, or another method should be based on the
2003). extent that they meet the characteristics of an
Among the methodologies currently ideal estimator.
used for this task, four groups can be Many authors (e.g. Palmer, 1990;
distinguished: non-parametric estimators, Hellmann & Flower, 1999; Walter & Morand,
fitting species-abundance distributions, 1998; Hortal et al., 2006) agree that some
species accumulation curves and species of these characteristics are: independence
area curves (Hortal et al., 2006). of sample size (amount of sampling effort
Here, I will focus only in the non- carried out); insensitivity to unevenness in
parametric estimators. These indices use species distributions; insensitivity to sample
the abundance or incidence of rare species order (Chazdon et al., 1998) and insensitivity
in the samples to estimate total number of to heterogeneity in the sample units used
BASUALDO, C. V. Assessing richness estimator for benthic invertebrates databases 29
among studies to compare richness values come from a variety of sampling methods.
obtained from different survey strategies, Other orders of macroinvertebrates present
which is often the case for macroecological in this database were discarded due to low
studies (Hortal et al., 2006). representativeness.
Unfortunately, to reach these conditions All collection sites available for each
is a very difficult challenge for an estimator, order at each taxonomic level were used
and also to assess them, especially if quick as units of sampling effort, regardless of the
answers about species richness estimation number of visits, season or sampling method
are needed. However, some simple and used (mist nets, light traps, hand picking, D
practical criteria can offer an overview of the frame net, Surber net, etc).
performance of estimators for taxa for which For the purposes of this work a site
exhaustive statistics assessments have not yet was defined as a stretch of a river or stream
been implemented. (rhithron) of approximately 200 m long, with
This study seek to recover and revalue data of geographic coordinates, elevation,
some criteria already present in the literature ecoregion, and date of collection. This unit
and to provide some tools to apply them in of sampling effort was chosen to account
order to assist researchers, managers, and for natural levels of sample heterogeneity
policy makers to choose the most suitable (patchiness) in the data (Gotelli & Colwell,
non-parametric estimator of richness of 2001). However, in order to make comparable
streams macroinvertebrates communities. estimations among different groups, graphics
Furthermore, such criteria are applied results were rescaled to the percentage of
to a large and heterogeneous regional sites (to family level) or individuals (to genera
database of benthic macroinvertebrates level) as a measure of sampling effort.
from a subtropical Andean area. Results are
exemplified at two hierarchical levels and Richness estimators
several levels of sampling efforts, making Six non-parametric estimators were
such tools applicable to a wide range of calculated: four incidence estimators (Jacknife
situations. Additionally, this work provides 1, Jacknife 2, Chao2 and Incidence-based
valuable information regarding the regional Coverage Estimator) and two abundance
state of knowledge of the taxa analyzed. estimators (Abundance-based Coverage
Estimator and Chao1).
Incidence or presence-absence estimators,
MATERIAL AND METHODS use counts of uniques and duplicates, i.e.
species that are present only in one or two
Data set samples respectively (first-order Jackknife or
A large database of benthic Jack 1 and Jackknife of second order or Jack
macroinvertebrates from the Yungas 2; Burnham & Overton 1978, 1979). The
ecoregion (NW of Argentina, Neotropical Chao2 estimator (Chao, 1987; Colwell 1997),
region) was used to calculate the richness takes into account rare species and the total
of Ephemeroptera, Trichoptera, Coleoptera, number of species observed in the sample to
Diptera and Prostigmata (Hydrachnidia) calculate its richness. The Incidence-based
at family and genera level. This database Coverage Estimator (ICE) (Lee & Chao, 1994;
belongs to the Instituto de Biodiversidad Colwell, 1997) calculates both rare and
Neotropical (IBN), and includes data of all the common species, considering as common
specimens collected during a decade by its those that occur in more than 10 samples (by
members, who made significant contributions default) or a value chosen by the user.
to the knowledge of macroinvertebrates of Abundance estimators base their
northwestern Argentina in both systematic calculations firstly in singletons and
and ecological aspects (Domnguez & doubletons species, i.e. the number of
Fernndez, 2009). Since the specimens were species represented for just one or two
collected with diverse purposes, records individuals (Chao1) (Chao, 1984; Colwell,
1997). Besides singletons and doubletons, the the minimum percentage of sampling effort).
Abundance-based Coverage Estimator (ACE) 3) Lack of erratic behavior in curve shape
(Chao et al. 1993; Colwell, 1997) considers (specifically large variations of estimates for
abundant species, i.e. those represented by closely similar sub-sample sizes) is considered
more than 10 individuals by default. a greater stability and therefore greater
To proceed, the program requires the reliability of the estimate. Melo & Froehlich
upload of the data in the form of an abundance (2001) made a qualitative categorization of
spreadsheet (to calculate abundance and / this criterion (good stability/bad stability)
or incidence estimators) or an incidence and here I propose a very simple quantitative
spreadsheet (only for incidence estimators). tool to measure this characteristic. This tool
This spreadsheet must specify the number consists in the sum of the absolute value of the
of samples and taxa and the abundance or differences between Sobs (i.e. the observed
presence of each taxon in each sample. The richness) and the three previous estimates
program admits six models of spreadsheet and the three posterior estimations of Sobs.
that meet these conditions, which is a very In this way a dimension of the stability of the
convenient option for the user. Data do not curve was obtained in a segment of seven
require previous statistical transformations. items, whose median position is Sobs.
All estimators were calculated with the Hereafter I will refer to these first three
program Estimates 7.0.1 (Colwell, 1997) and criteria as quantitative criteria.
their mathematical expressions are given in 4) Similarity in curve shape through
the Appendix. different data sets. This is a very appreciated
characteristic since it makes the behavior of
Evaluation of estimators the estimator predictable and easy to interpret
Bias and accuracy of the estimated when it is applied to new data sets. Although
richness are the most popular measures to it is mentioned by Melo & Froehlich (2001),
evaluate estimation methods. To use these some additional details presented here
measures it is necessary to choose an a priori would be useful. In this study, uniformity
sub-sample size, which is a difficult choice or similarity of the shape of the curves
in rich assemblages like macroinvertebrates was evaluated qualitatively, considering:
communities were the estimated richness tendency of the slope at different levels of
is strongly dependent on sample size. For sampling effort, presence of peaks of over
this reason, in a study of evaluation of non- or under estimation and position of the
parametric estimators applied to benthic maximum or minimum estimations respect
macroinvertebrates, Melo & Froehlich (2001) to level of sampling effort.
opted for not using such bias and accuracy The six non parametric-estimators were
statistics and instead they used four criteria applied to the dataset and then they were
they argued are more practical and realistic. evaluated under the four criteria mentioned
These criteria were: in order to provide further evidence to
1) Sub-sample size required to estimate decide on the convenience of applying them
the observed richness in the total sample. to datasets of benthic macroinvertebrates in
The smaller the sub-sample size, the subtropical areas.
better performance of the estimator. This Performance of estimators was assessed
characteristic can be interpreted as an according to the following categories:
indication of the capability of the estimator incidence estimators at family level;
to reach a reliable value of richness with a incidence estimators at genera level;
small sampling effort. abundance estimators at family level and
2) Constancy of the sub-sample size abundance estimators at genera level.
needed to estimate the total observed For each of the quantitative criteria (i.e.
richness, measured as 1 standard deviation percentage of samples to achieve Sobs,
(SD) of the previous criterion (i.e. SD of the standard deviation, stability) the incidence
estimation that equals observed richness in estimators were ranked from 1 to 4 according
to the results, being the score 1 assigned final value of estimated richness with a lower
to the one with the best performance (i.e. sampling effort in almost all cases.
estimator that obtained the lowest value in
this criterion) and 4 to the one with the worst Respect to similarity in curves shape, the
performance (i.e. estimator that obtained the behavior of estimators shows the following
highest value in this criterion). The final scores general sequences:
for each estimator were obtained adding
those ranking positions. The same procedure ICE (see Fig. 1a): General behavior:
was applied to abundance estimators except This estimator presented its maximum peak
that only 1 and 2 scores were possible. followed by a minimum peak in the first 10-
20% of samples and then showed a gradual
growth up to the final estimate. General
RESULTS sequence variations: In Coleoptera, after an
overestimation peak there was a progressive
Regarding the three quantitative criteria, increase in the slope of the curve until it
performance of estimators varied depending achieved the maximum estimated value.
on the taxa considered. In Tables, numbers
between parentheses represent the score Chao2 (see Fig. 1b): General behavior:
assigned to each estimator according to At family level, the three less sampled orders
the value obtained in each criterion. For (Diptera, Prostigmata and Coleoptera) curves
example, for incidence estimators at family showed a very similar pattern, although out
level under SD criterion for the order of phase in sampling effort. The overall
Coleoptera (Table I), Jack1 obtained the behavior sequence would be: pronounced
lowest value (1,68) and Chao2 the highest growth (in the first 10- 20% of the samples),
value (6,44), then their scores were (1) and overestimation plateau, and decrease to final
(4) respectively. Ranking position is the value (see Fig. 1). At genus level (see Fig. 1b)
result of the sum of all scores obtained for -except for Trichoptera- the curves described
each estimator, e.g. for ICE in Coleoptera this a similar pattern to that seen in families but
would be: (3)+(1)+(2)=6, as six is the lowest much more blurred by the large number of
value, ICE results the estimator with better small peaks in the range and because the
performance for that order. Estimators having sequence described occurs at very different
the best ranking positions in most orders are levels of sampling for each order. General
considered the estimators of best overall sequence variations: In the better sampled
performance. For each category, these were: orders (Trichoptera and Ephemeroptera),
- Incidence estimators at family level (see at family level the curves had a tendency
Table I): Jack1, is the only one that always to a continuous growth (i.e. without peaks)
took the first or second ranking position. from the 20% of sampling effort. Trichoptera
- Incidence estimators at genera level (see genera curve, on the other hand, did not
Table I): Jack1 took the first position in four show resemblance respect to those described
of the five cases, followed by ICE, with very above, as it showed cubic growth from the
close scores, 66% of samples toward the end of the axis
- Abundance estimators at family level x (R2=0.99).
(see Table II): ACE o Chao1 with very close
scores, Jack1 (see Fig. 1c): General behavior:
- Abundance estimators at genera level At genus level, there was no clear distinction
(see Table II): Chao1 took the first position between well or sub sampled orders, all
in all cases. showing a logarithmic growth. At family
Most of the estimators had a different level this growth was very pronounced in
performance depending on the sample under the three less sampled orders where the
study, except Chao1 -which was always the curves fit very well to a logarithmic curve
most stable- and Jack2 which reached the (R2=0.88 Diptera and R2= 0.99 Coleoptera
Table I. Quantitative evaluation of incidence estimators. Values 1 to 4 between parentheses indicate
the score assigned to the estimator in each criterion (% samples, stabitity, SD). Final ranking position of
estimators was obtained summarizing their scores (ss) in each order.
Coleoptera Diptera Ephemeroptera Prostigmata Trichoptera
Estimator % samples Stability SD % samples Stability SD % samples Stability SD % samples Stability SD % samples Stability SD
ICE 30,86 1,61 3,64 40,00 1,60 1,81 58,82 0,21 0,92 42,85 4,86 2,69 73,17 0,32 0,6
(3) (1) (2) (4) (1) (2) (3) (2) (3) (3) (4) (2) (3) (2) (3)
Chao 2 25,92 2,55 6,44 24,44 2,99 3,21 100 0,07 0,43 38,09 3,59 4,77 96,34 0,12 0,47
(2) (3) (4) (2) (3) (4) (4) (1) (1) (2) (3) (4) (4) (1) (1)
Family level
Jack1 38,27 1,75 1,68 26,66 1,86 1,27 54,90 0,40 0,48 42,86 3,23 1,56 45,12 0,39 0,53
(4) (2) (1) (3) (2) (1) (2) (3) (2) (3) (2) (1) (2) (3) (2)
Jack2 24,69 2,80 3,63 19,30 4,11 2,98 40,20 0,58 1,40 30,95 2,91 3,32 39,02 0,76 1,69
(1) (4) (2) (1) (4) (3) (1) (4) (4) (1) (1) (3) (1) (4) (4)
1. ICE (ss=6) 1. Jack1 (ss=6) 1. Chao2 (ss=6) 1. Jack2 (ss=5) 1. Chao2 (ss=6)
Ranking position
2. Jack1/Jack2 (ss=7) 2. ICE (ss=7) 2. Jack1 (ss=7) 2. Jack1 (ss=6) 2. Jack1 (ss=7)
ICE 35,13 3,08 9,31 31,11 5,95 6,82 48,57 2,30 3,71 43,55 5,01 4,18 50,61 2,10 3,01
(3) (1) (3) (3) (1) (2) (4) (2) (1) (3) (2) (1) (4) (2) (1)
Chao 2 22,97 12,88 15,12 13,33 15,05 18,54 12,38 7,83 13,01 29,03 11,83 10,90 20,99 2,43 8,44
(1) (4) (4) (1) (3) (4) (2) (4) (4) (1) (4) (4) (2) (3) (4)
Genus level
Jack1 45,94 5,63 8,06 33,33 8,38 4,74 30,08 1,39 2,53 40,32 4,54 3,35 30,86 1,76 3,19
(4) (2) (1) (4) (2) (1) (1) (1) (2) (2) (1) (2) (3) (1) (2)
Jack2 29,73 9,95 7,64 20,00 17,20 8,74 30,10 3,90 5,93 29,03 7,65 5,63 16,05 8,28 7,02
(2) (3) (2) (2) (4) (3) (1) (3) (3) (1) (3) (3) (1) (4) (3)
1. ICE /Jack1/Jack2 1. ICE (ss=6) 1. Jack1 (ss=4) 1.Jack1 (ss=5) 1. Jack1 (ss=6)
Ranking position
(ss=7) 2. Jack1 (ss=7) 2. Jack2 (ss=6) 2. ICE/Jack2 (ss=6) 2. ICE (ss=7)
Table II. Quantitative evaluation of abundance estimators. Values 1 and 2 between parentheses indicate
the score assigned to the estimator in each criterion (% samples, stabitity, SD). Final ranking position of
estimators was obtained summarizing their scores (ss) in each order.
Ephemeroptera Prostigmata Trichoptera Coleoptera Diptera
Estimator % samples Stability SD % samples Stability SD % samples Stability SD % samples Stability SD % samples Stability SD
ACE 100,00 0,07 0,00 46,43 1,63 1,83 40,90 0,66 1,42 35,82 4,61 6,93 38,09 1,90 3,81
Family level
(1) (1) (1) (1) (2) (2) (1) (2) (2) (1) (2) (1) (1) (1) (1)
Chao1 100,00 0,07 0,45 85,71 0,38 0,72 100,00 0,03 0,48 35,82 3,14 7,10 44,44 2,30 4,37
(1) (1) (2) (2) (1) (1) (2) (1) (1) (1) (1) (2) (2) (2) (2)
Ranking position ACE (ss=3) Chao1 (ss=4) Chao1 (ss=4) ACE/Chao1 (ss=4) ACE (ss=3)
ACE 39,70 2,87 5,13 9,37 5,02 3,48 80,30 1,75 1,61
Genus level
(1) (2) (1) (2) (2) (1) (2) (2) (1)

Chao1 39,70 1,67 5,74 43,75 4,76 5,83 66,66 1,49 3,86
(1) (1) (2) (1) (1) (2) (1) (1) (2)
Ranking position ACE/Chao1 (ss=4) Chao1 (ss=4) Chao1 (ss=4)
and Prostigmata at family level) that never curves, which anyway did not reverse the
became asymptotic. General sequence general tendency to a logarithmic growth.
variations: At family level in better sampled At genera level, such instability was not that
orders (Trichoptera and Ephemeroptera) and obvious and unlike the family level estimate,
at genus level for Coleoptera, the curves three orders achieved an asymptote (Diptera,
showed a much steeper growth in the first Ephemeroptera and Prostigmata). There
10% of the sampling where it begins a were no important exceptions to this general
gradual growth to reach the final value of behavior.
estimate.
ACE (see Fig. 2a1 and 2b1) and Chao1
Jack2 (see Fig. 1d): General behavior: (see Fig. 2a2 and 2b2): General behavior:
Curves behaved very similarly to Jack1 in at family and genera level they followed a
all aspects, except that at the family level logarithmic growth reaching or tending to
presented small peaks during the line of the asymptote at different levels of sampling
Fig. 1. Incidence estimators curves for family and genera. a) ICE, b) Chao2, 3) Jack1 and d) Jack2.
X axis is re-scaled to percentage of sampling effort: 100%=the maximum value of the larger sample
(Ephemeroptera). Y axis is rescaled to the percentage of richness estimated: 100%=final value of richness
in each order (fill line).
Fig. 2. Abundance estimators curves for family and genera. a) ACE and b) Chao1. X axis is re-scaled
to percentage of sampling effort: 100%=the maximum value of the larger sample (Ephemeroptera). Y
axis is rescaled to the percentage of richness estimated: 100%=final value of richness in each order (fill
line).
effort in the best sampled orders. In all cases 10% (Trichoptera) and 12% (Ephemeroptera
the growth was smooth, with no evidence of and Diptera) of unknown genera for these
peak fluctuation. inventories, while this value grew to 20% for
Prostigmata.
Application to the local data set Respect to Coleoptera, Jack1 suggested
Incidence data: At the family level (see a very high percentage of unknown families
Fig. 1a) ICE and Chao2- estimated a final (near 48%).
richness value equal to Sobs (Diptera, Abundance data: At family level (see
Ephemeroptera and Trichoptera), suggesting Fig. 2) estimates of ACE and Chao1 indices
a high level of completeness of these coincided or were nearly identical to the
inventories. For Prostigmata, ICE estimates observed richness (Sobs) for Ephemeroptera,
ranged between 15 and 17% above Sobs. For Trichoptera and Prostigmata. For Coleoptera
Coleoptera, Jack1 suggested the existence of and Diptera at the family level, both estimators
30% more families than currently known. suggested an unknown percentage of
At genus level (see Fig. 1a) ICE was close approximately 35% and 17% respectively.
to Sobs of Diptera and Trichoptera. The At genus level (see Fig. 2b), both indices
estimated richness suggests values between of abundance estimated very similar richness
values for each of the three studied orders, freely available on the web, it is very easy
being Prostigmata the group with the highest to use and does not require the user to make
percentage of unknown genera (15%). any statistical procedures for calculation.
In this study, among the incidence
DISCUSSION estimators, the one of better overall
performance was Jack1, followed by ICE and
The impact of human activities on Chao2. These results are slightly different
ecological systems is well illustrated by from those of Melo & Froehlich (2001)
changes in land use arising from urbanization, who firstly recommend Jack2 for benthic
whose consequences are a serious threat to studies. In the present study Jack2 had a
biodiversity conservation (Gonzalez-Oreja similar performance -but not as good as- ICE
et al. 2010). Ecologists face the challenge at genus level analysis. At the family level
of quantifying this impact proposing precise Jack2 performance was very poor. In Melo
tools and measures to preserve biodiversity. & Froehlich (2001), ICE and Chao2 show a
In this context, groups of organisms that act good performance, as in the present study.
as biodiversity indicators -like many benthic Gonzalez-Oreja e.t al. (2010) studied the
macroinvertebrates- are of great importance performance of non parametric estimators
as they enable these changes to be monitored with data of birds of urban areas in the city
(Valladares et al., 2010). of Puebla, Mexico. They concluded that
Non-parametric estimators can help to Jack1 was the estimator with better general
reach inventories in less time and with lower performance, even when they made the
costs (Petersen & Meier, 2003), since they evaluation under some criteria they called
can give a measure of how much sampling hard, i.e. criteria that are very rigorous
effort can be enough to reach a representative statistically.
inventory. Hence the importance to assess Regarding abundance estimators, the
their reliability to adequately adjust them analysis did not reveal a better overall
for each group and region. Moreover, when performance estimator, but performance
assessing the relevance of a single area in depended on whether they were applied to
terms of species richness, endemism, or family or genus level. Whereas the number
conservation status, the use of richness as a of samples had little or no difference among
tool for decision making can be misleading family and genus level within the same order
without a measure of how complete the lists (see Table II), numerically this could indicate
are (Sobern & Llorente, 1993). Measuring that some estimators have better performance
the completeness is then the safest approach at low richness samples (level of family) and
for dealing with taxonomic, geographic others are more effective when richness
or ecological biases. This is especially reaches higher values observed (genus
important when inventories are derived level). Then, according to the results of this
from heterogeneous sources such as non- study, a most advisable estimator for low
standardized samplings or bibliographic Sobs samples would be ACE, whereas Chao1
references (Valladares 2010). Anyway, would be the best for high Sobs samples.
even if the data were imperfect (e.g. The curves shapes analysis allow me not
museum collection lists), if the appropriate to propose the use of estimators of more
technique is used and a strict interpretation erratic behavior in the first portion of the
of the results is made, we can evaluate sample like ICE and Chao2- if the sample to
the quality of the lists of species to locate be analyzed is small, agreeing in this point
priority areas for conservation (Heyer et al., with Rico et al. (2005).
1999). The assessment of the completeness Nonetheless, from the description of
of inventories is a powerful tool that non- curve shapes it can be said that all are quite
parametric estimators provide. Moreover, a stable, being the variations related to the
widely recognized program to calculate them representativeness achieved by each order
(EstimateS in all versions, Colwell 1994) is in relation to sampling effort. This uniformity
in curves form across different data sets and very grateful to Dr. Hugo Fernndez, Dr.
hierarchic levels allowed to describe general Carlos Molineri, Dra. Laura Miserendino
sequences that turn out to be very useful and two anonymous reviewers for their
from a predictive point of view. That means, constructive comments on early versions of
for example, that they can act as references the manuscript. Preparation of this paper
to compare estimations of small databases was supported by Consejo Nacional de
of macroinvertebrates and then infer the Investigaciones Cientficas y Tecnolgicas
possible behavior of the curve and therefore (CONICET) and Agencia de Promocin
the expected richness- if the sample were Cientfica y Tecnolgica.
larger.
In a regional context, non-parametric
estimates showed good results, i.e. high LITERATURE CITED
completeness of the inventories of Trichoptera,
1. BUNGE, J. & M. FITZPATRICK. 1993. Estimating the number
Ephemeroptera and Prostigmata at both of species: a review. J. Am. Stat. Assoc. 88: 364-373.
hierarchical levels, while they demonstrated 2. BURNHAM, K. P. & W. S. OVERTON. 1978. Estimation of
a great lack of knowledge in Coleoptera. size of a closed population when capture probabilities
vary among animals. Biometrika 65: 625-633.
These results can be attributed to the fact that 3. BURNHAM, K. P. & W. S. OVERTON. 1979. Robust
only Elmidae is widely studied. Estimates for estimation of the size of a closed population when
capture probabilities vary among animals. Ecology 60:
Diptera produced very good results respect 927-936.
to inventory completeness, however these 4. CHAO, A. 1984. Nonparametric estimation of the number
of classess in a population. Scand. J. Statist. 11: 265-
results should be taken carefully, since it is 270.
a very diverse group and intuitively much 5. CHAO, A. 1987. Estimating the population size for capture-
greater richness than that recorded would recapture data with unequal catchability. Biometrics
43: 783-791.
be expected in this area. As in Coleoptera, a 6. CHAO, A. & J. BUNGE. 2002. Estimating the number of
possible explanation for this could be the low species in a stochastic abundance model. Biometrics
58: 531539.
level of determination that can be reached 7. CHAO, A., R. L. CHAZDON, R. K. COLWELL & T. J. SHEN.
for this group in the region. 2005. A new statistical approach for assessing similarity
of species composition with incidence and abundance
data. Ecol. Lett. 8: 148-159.
CONCLUSION 8. CHAO, A. M., M. C. MA & M. C. K. YANG. 1993. Stopping
rules and estimation for recapture debugging with
unequal failure rates. Biometrika 80: 193-201.
The results of this study allow to propose 9. CHAZDON, R. L., R. K. COLWELL, J. S. DENSLOW & M. R.
the non-parametric estimators Jack1, ACE GUARIGUATA. 1998. Statistical methods for estimating
species richness of woody regeneration in primary and
and Chao1 as excellent tools for both the secondary rain forests of NE Costa Rica. In: Dallmeier,
estimation of richness and as a measure F. & J. A. Comiskey (eds.), Forest Biodiversity Research,
Monitoring and Modeling: Conceptual Background
of the completeness of inventories of some and Old World Case Studies, Parthenon Publishing,
of the most conspicuous groups of benthic Paris, pp. 285-309.
macroinvertebrates. The application of the 10. CHIARUCCI, A. N., J. ENRIGHT, G. L. PERRY, B. P. MILLER &
B. B. LAMONT. 2003. Performance of nonparametric
estimators to the local data set shows that species richness estimators in a high diversity plant
the best known orders of the region are community. Divers. Distrib. 9: 283-295.
11. COLWELL, R. K. & J. A. CODDINGTON. 1994. Estimating
Trichoptera, Ephemeroptera and Prostigmata, terrestrial biodiversity through extrapolation. Phil.
and in a lower degree Coleoptera, while Trans. Royal Soc. London B. 345: 101-118.
12. COLWELL, R. K. 1997. EstimateS: Statistical Estimation of
the inventory of the Diptera presented Species Richness and Shared Species from Samples
intermediate values of representativeness. (Software and Users Guide), Versin 7.01. http://
viceroy.eeb.uconn.edu/estimates
13. DOMNGUEZ, E. & H. R. FERNNDEZ (eds.). 2009.
Macroinvertebrados bentnicos sudamericanos.
ACKNOWLEDGEMENTS Sistemtica y biologa. Fundacin Miguel Lillo,
Tucumn, Argentina.
14. GONZALEZ-OREJA, J. A., DE LA FUENTEDAZORDAZ,
I am indebted to all the taxonomists A. A., HERNNDEZSANTN, L., BUZOFRANCO,
D. & C. BONACHEREGIDOR. 2010. Evaluacin de
of Instituto de Biodiversidad Neotropical, estimadores no paramtricos de la riqueza de especies.
who identified the specimens included in Un ejemplo con aves en reas verdes de la ciudad de
the database used for this paper and I am Puebla, Mxico. Anim. Biodiv. Conserv. 33(1): 31-45.
15. GOTELLI, N. J. & R. K. COLWELL. 2001. Quantifying 24. PETERSEN, F. T., R. MEIER & M. N. LARSEN. 2003. Testing
biodiversity: procedures and pitfalls in the measurement species richness estimation methods using museum
and comparison of species richness. Ecol. Lett. 4: 379- label data on the Danish Asilidae. Biodivers. Conserv.
391. 12: 687-701.
16. HELLMANN, J. J. & G. W. FOWLER. 1999. Bias, precision, 25. RICO, A., J. P. BELTRN, A. LVAREZ & E. FLREZ.
and accuracy of four measures of species richness. 2005. Diversidad de araas (Arachnida: Araneae) en
Ecol. Appl. 9(3): 824834. el Parque Nacional Natural Isla Gorgona, Pacfico
17. HEYER, W. R., J. A. CODDINGTON, W. J. KRESS, D. C. Colombiano. Biota Neotrop (Special Issue) Vol
ACEVEDO, T. L. ERWIN, B. J. MEGGERS, M. POGUE, 5.1A. http://www.biotaneotropica.org.br/v5n1a/es/
R. W. THORINGTON, R. P. VARI, M. J. WEITZMAN & abstract?article+BN007051a2005
S. H. WEITZMAN. 1999. Amazonian biotic data and 26. ROSENZWEIG, M. L., W. R. TURNER, J. G. COX & T. H.
conservation decisions. Cien. Cult. 51 (5/6): 372-385. RICKLEFTS. 2003. Estimating diversity in unsampled
18. HORTAL, J., P. A. BORGES & C. GASPAR. 2006. Evaluating habitats of a biogeographical province. Conserv. Biol.
the performance of species richness estimators: 17: 864-874.
sensitivity to sample grain size. J. Anim. Ecol. 75: 27. SOBERN, J. & J. LLORENTE. 1993. The use of species
274287. accumulation functions for the prediction of species
19. JIMNEZ-VALVERDE, A. & J. HORTAL. 2003. Las curvas de richness. Conserv. Biol. 7: 480-488.
acumulacin de especies y la necesidad de evaluar 28. SRENSEN, L. L., J. A. CODDINGTON & N. SCHARFF.
la calidad de los inventarios biolgicos. Rev. Iber. 2002. Inventorying and estimating subcanopy spiders
Aracnol. 8: 151-161. diversity using semiquantitative sampling methods in
20. LEE, S-M. & A. CHAO. 1994. Estimating population size via an afromontane forest. Environ. Entomol. 31: 319-
sample coverage for closed capture-recapture models. 330.
Biometrics 50: 8897. 29. VALLADARES, L. F., BASELGA, A. & J. GARRIDO. 2010.
21. MELO, A. S. & C. G. FROEHLICH. 2001. Evaluation of Diversity of water beetles in Picos de Europa National
methods for estimating macroinvertebrate species Park, Spain: Inventory completeness and conservation
richness using individual stones in tropical streams. assessment. Coleopt. Bull. 64(3): 201-219.
Freshwat. Biol. 46: 711-721. 30. WALTER, B. A. & S. MORAND. 1998. Comparative
22. PALMER, M. W. 1990. The estimation of species richness by performance of species richness estimation methods.
extrapolation. Ecology 71: 1195-1198. Parasitology 116: 395-405.
23. PETERSEN, F. T. & R. MEIER. 2003. Testing species-richness
estimation methods on single-sample collection data
using the Danish Diptera. Biodivers. Conserv. 12: 667-
686.
Appendix. Adapted from EstimateS Users Guide. Appendix B

(http://viceroy.eeb.uconn.edu/EstimateSPages/EstSUsersGuide/EstimateSUsersGuide.htm#AppendixB)
Incidence estimators:
Where:
Q1 is the number of uniques
Q2: number of duplicates
n: Total number of samples
Sfreq: number of frequent species, i.e species present in more than k samples (by
default each found in more than 10 samples).
Sinfr: number of infrequent species, i.e species present in less than k samples (by
default each found in 10 or fewer samples).
Cice: Sample incidence coverage estimator
Ninfr: Total number of incidences (occurrences) of infrequent species
ninfr: Number of samples that have at least one infrequent species
Qj: Number of species that occur in exactly j samples (Q1 is the frequency of
uniques, Q2 the frequency of duplicates)
Q1: Total frequency of uniques
ice: Estimated coefficient of variation of the Qi for infrequent species
Abundance estimators:
Where:
A1: number of species singletons and
A2: number of species doubletons
Sabun: number of abundant species, i.e. species represented by more than k
individuals (by default each with more than 10 individuals).
Srare: number of non abundant species, i.e. species represented by less than k
individuals (by default each with 10 or fewer individuals)
CACE: Sample abundance coverage estimator
Nrare: Total number of individuals in rare species
Ai: Number of species that have exactly i individuals when all samples are pooled
(F1 is the frequency of singletons, F<sub>2</sub> the frequency of doubletons)
ACE: Estimated coefficient of variation of the Ai for infrequent species.

Eligiendo El Mejor Estimador No Parametrico para Calcular Riqueza en Bases de Datos de Macroinvertebrados Bentonicos

Diunggah oleh

Informasi Dokumen

Judul Asli

Hak Cipta

Format Tersedia

Bagikan dokumen Ini

Bagikan atau Tanam Dokumen

Opsi Berbagi

Apakah menurut Anda dokumen ini bermanfaat?

Apakah konten ini tidak pantas?

Hak Cipta:

Format Tersedia

Eligiendo El Mejor Estimador No Parametrico para Calcular Riqueza en Bases de Datos de Macroinvertebrados Bentonicos

Diunggah oleh

Hak Cipta:

Format Tersedia

ISSN 0373-5680 (impresa), ISSN 1851-7471 (en lnea) Rev. Soc. Entomol. Argent.

70 (1-2): 27-38, 2011 27

Choosing the best non-parametric richness estimator

Instituto de Biodiversidad Neotropical (IBN), Facultad de Ciencias Naturales e Instituto

Eligiendo el mejor estimador no paramtrico para calcular riqueza

RESUMEN. Los estimadores no paramtricos permiten comparar la

PALABRAS CLAVE. Riqueza estimada. Ros neotropicales. Gestin

ABSTRACT. Non-parametric estimators allow to compare the

KEY WORDS. Richness estimation. Neotropical streams. Environmental

INTRODUCTION species, using a previously formulated non-

(1) (2) (1) (2) (2) (1) (2) (2) (1)

Appendix. Adapted from EstimateS Users Guide. Appendix B

Anda mungkin juga menyukai