repertoire, and performs computations according to appropriate statistical rules. This white paper describes the basic principles for undertaking AFS using the Frame Sampling Tools software.
The art of AFS involves achieving the most accurate result for the least possible cost. Creating the sampling frame and defining the primary sampling unit are critical steps. AFS theory includes the choice between single and double sampling. Double sampling involves defining the sampling frame to include two levels of sampling and is termed Nested Area Frame Sampling (NAFS). Currently, the Frame Sampling Tools software does not support NAFS but uses basic techniques that apply to both single and double sampling approaches. The following examples describe three cases that formed the experiential foundation for the single sampling procedure implemented under the Frame Sampling Tools software.
The National Agricultural Statistics Service (NASS), US Department of Agriculture
See Area Frame Sampling for Agricultural Surveys, by Jim Cotter and Jack Nealon, August 1987. The USDA has a long-standing practice of performing agricultural surveys using Area Frame Sampling. The USDA sampling frame is refined on an ongoing basis. Stratification materials include satellite imagery, imagery from the National Aerial Photography Program, and an assortment of other aerial photography and thematic maps. Factors considered in establishing strata include the percent of area under cultivation and historical cropland content. The NASS employs a form of double sampling that it calls replicated sampling. Primary sampling units (PSUs) are based on permanent boundaries, such as roads, and in any given state each PSU may be as large as a county. Primary samples are then selected and each is divided into secondary sampling units called segments; for example, land sections defined by one square mile road boundaries. Segments are sampled and a team is deployed to the field to determine the MOIs in each sampled segment. The results are then used to determine the MOI in each primary sample and, in turn, for the area frame. The NASS refines sampling by annually rotating 20 percent of the secondary samples. Such a policy reduces the reporting burden on farmers who are in the sample and allows for identification of changes and trends in the region.
MEDEA China Cultivated Land Study
See China Agriculture: Cultivated Land Estimation Methods, by Richard Cicone, Frank Pont, Chris Chiesa and Michael Goodchild, July 1997. In 1997, MEDEA used a remote sensing based nested area frame sampling approach to determine that China had underreported its cultivated land base by nearly 50 percent. The MEDEA estimate has been validated by recent official reports stating that China's cultivated land area exceeded 130 M Ha versus the 90 M Ha previously reported.
The technique used by MEDEA formed the foundation for much of the method and theory employed in the Frame Sampling Tools software. Figure 1 illustrates the basic elements of the approach. Stratification was based on time-series NOAA AVHRR LAC data for 1992. The primary sampling unit was a 125x125 mile Landsat frame. Scientists overcame a difficult problem: the primary sampling unit, a Landsat frame, was composed of multiple strata. The method developed was incorporated into the Frame Sampling Tools software. A secondary sampling frame based on approximately 10x10 kilometer square, high-resolution optical imagery from national systems was used to correct classification error in the primary sample. A post-sampling stratification process was used to refine the area frame by eliminating the area that did not contain the MOI. MEDEA demonstrated the ability to estimate the statistical characteristics of its result, reporting that application of the estimation procedure resulted in an estimate of 143.4 M Ha circa 1992 to within +/-5.6% accuracy (at the 0.95 level of statistical confidence).
Figure 1. MEDEA Nested Area Frame Sampling. The area frame was defined by using AVHRR. Colors relate to strata derived from time-series data. About 22 Landsat cluster samples were allocated. Data are categorized as cultivated land and other, using unsupervised clustering and image interpretation techniques. Then approximately twenty secondary samples were used to correct error in the categorization of each primary sample.
Estimation of Forest Cover in Loudoun County, Virginia
For further information, contact Earth Satellite Corporation. Area frame sampling using single samples was featured in this investigation of the quantity of forest cover in Loudoun County, Virginia. The area frame was constructed using one Landsat scene, with a county mask that excluded regions of the scene outside the county (see Figure 2). A grid of one-kilometer cells overlaid on the image formed the principal sampling unit. Forest cover was estimated by first computing the area detected in each stratum based on selected samples, and then computing the weighted sum across all strata.
Although one Landsat scene was used here, a study area could be composed of more than one image, each called a tile in the Frame Sampling Tools program. At times it may be necessary to work with several image tiles. For instance, if the images that comprise a study area are collected at different times, it may not be possible to mosaic the tiles into a single image to define the area frame, because too many labeling errors could result. Each image tile would require separate attention to account for changes in the spectral signature of the MOI. The Frame Sampling Tools software permits construction of the area frame from a single Landsat scene, a scene mosaic or several tiles.
Figure 2 panel labels: Sampling units selected and analyzed for forest content; Loudoun Study's Landsat-derived strata.
Figure 2. Estimation of Forested Area in Loudoun County, Virginia. Loudoun County Forest Cover - Frame Sampling Tools Demonstration Project. Landsat imagery was used to stratify the study area into 40 classes. Thirty grid cells (1 km x 1 km) were selected for forest cover content analysis. A Dot Grid approach was utilized to delineate each sample's forest cover, which enabled
Area Frame
The area frame defines the study area. The area frame is the extent of area over which the MOI will be estimated. It may be defined as an arbitrary geographic region (like a rectangle bounded by latitude and longitude), a specific zone defined by natural or political boundaries, or a combination of the above. Considerations as you begin. The Frame Sampling Tools software assumes the area frame to be composed of the set of image tiles defined by the user as the study area. Each image tile can be masked to exclude regions outside the area of interest. If a single image tile is used, for example a mosaic of Landsat scenes comprising a given county, data outside the county boundaries could be set to zero in the image tile to exclude them from the area frame.
Stratum Number   Stratum Area (Hectares)   Stratum Forest Cover (%)   Stratum Forest Cover (Ha)
1                3,406
2                3,359
3                5,352
4                4,351
45,380.68
Sampling Frame
The sampling frame refers to the process of stratification; definition (shape and size) of the principal sampling unit (in the case of NAFS, the principal primary and secondary units); and selection and labeling of samples, all described in the following sections. Two sampling frame definition options are available in the Frame Sampling Tools program. The surveyor may provide definitions of the Principal Sampling Unit (PSU) as a vector polygon. For example, if the area frame is the United States, the surveyor may choose to define the PSU as all the U.S. counties (this would generally be inadvisable unless employing double sampling). Or the surveyor may use the Frame Sampling Tools
assessment of each stratum's forest content. Stratum forest cover hectares were summed to compute the estimate. The following identifies important elements of an Area Frame Sampling project.
Grid Generation Tool to define a uniform grid over the area frame defining the PSU. Considerations as you begin. Determining the sampling frame is critical. An important decision is whether a single or double sampling strategy is required. In agricultural surveys, this decision is based on the feasibility of accurately identifying all parcels in the primary sample. The larger the primary sampling unit, the more difficult the task. Generally, study areas the size of a continent will necessitate double sampling, whereas study areas the size of a county can be analyzed effectively with single sampling techniques. If your area frame falls between these, it is a judgment call. Since the Frame Sampling Tools software supports only single sampling, the surveyor is restricted in choice. If the study area is large, it is recommended that the primary sampling unit be selected to be of manageable size (where manageable is dictated by the surveyor's ability to identify all parcels in the sample as accurately as possible). AFS will provide statistics with the final result that should reflect the accuracy of the process selected.
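To picture the uniform grid of principal sampling units described above, the following is a minimal Python sketch. It is not the Grid Generation Tool itself: the function name `generate_grid`, the bounding-box arguments and the cell tuple layout are all assumptions made for illustration, and coordinates are taken to be in a projected system with units of meters.

```python
import math


def generate_grid(x_min, y_min, x_max, y_max, cell_size):
    """Cover a rectangular extent with square cells of side cell_size.

    Hypothetical helper: returns a list of (x0, y0, x1, y1) tuples, one per
    candidate sampling unit, ordered row by row from the lower-left corner.
    Cells on the upper and right edges are clipped to the extent.
    """
    if cell_size <= 0 or x_max <= x_min or y_max <= y_min:
        raise ValueError("extent and cell_size must be positive")

    n_cols = math.ceil((x_max - x_min) / cell_size)
    n_rows = math.ceil((y_max - y_min) / cell_size)

    cells = []
    for row in range(n_rows):
        for col in range(n_cols):
            x0 = x_min + col * cell_size
            y0 = y_min + row * cell_size
            # Clip so that the grid never extends beyond the area frame extent.
            x1 = min(x0 + cell_size, x_max)
            y1 = min(y0 + cell_size, y_max)
            cells.append((x0, y0, x1, y1))
    return cells
```

Calling `generate_grid(0, 0, 3000, 2000, 1000)` yields six one-kilometer cells. In a real project the cells would additionally be intersected with the masked area frame so that units falling outside the study area are dropped.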
MOI appearing in each stratum, and the determination of strata in which the MOI is a rare occurrence:
A) Stratum definition seeks to clump regions that are everywhere uniform relative to the proportion of MOI. If the proportion of MOI in a stratum is approximately 50 percent, it is desirable that any substratum within that stratum maintain the same proportion. This quality is termed spatial stationarity. The term is borrowed from the statistical expression "stationarity," which in temporal statistics refers to the quality that a population does not change in time relative to the attribute of interest. For example, a voting population's intent can be highly nonstationary prior to an election, as the mood about candidates swings with changing opinion. Lack of stationarity can affect the statistical performance of the estimate. Frame Sampling Tools will reflect the impact of strata non-stationarity in its final statistics; early warning is not available at this time. The surveyor can guard against this problem by examining strata relative to other known information, derived from maps or previous surveys. If the surveyor judges a stratum to be spatially non-stationary, it would be advisable to change that stratum designation. Standard ERDAS IMAGINE tools can assist the surveyor in assessment of strata uniformity.
B) Priors are an estimate of the percent of the MOI expected in any given stratum, before the survey is conducted. Priors may be derived from previous records in the study area, maps, or field reports. Given priors, the Frame Sampling Tools software is able to improve its sample allocation. If priors are unavailable, then the Frame Sampling Tools program assumes that the MOI is equally likely in any stratum when allocating samples. If the surveyor is interested in estimates for more than one MOI, use of a prior associated with one MOI will lead to increased variance in the estimate of other MOIs. It is advised to select a prior that relates to the set of MOIs. For example, if the surveyor's interest is in estimating wheat, barley and corn (all grains), using priors for grains would be a good compromise over using the prior for a specific grain (though not optimal with respect to any one). The variance of the estimate will be affected by the choice, and this will be reflected in Final Analysis.
C) Rare occurrence of the MOI in a stratum is difficult to detect using the sample strategy employed in the Frame Sampling Tools software. It is recommended that any strata (or any region within the study area) where the occurrence of the material of interest is rare (say, less than one percent) be excluded from the area frame. For example, if deserts occur within the area of interest, and no desert cultivation is known to occur, eliminate deserts from the area frame. This may result in a structured bias, as you are making no measurement to estimate the MOI in this region. However, experience has indicated that greater errors may occur as a result of inadequate sampling resulting from normal treatment of such strata. If a rare occurrence event is what is being sought, it is recommended that such regions be
Stratification
Stratification is the next step in defining the sampling frame. Stratification entails a process of grouping homogeneous areas relative to the MOI. The more homogeneous each stratum, the fewer samples will be required to achieve the desired accuracy. Since each sample may entail expensive action, for example visiting each parcel of land in the sample, reducing the number of samples needed will reduce overall cost. A simple, though unrealistic, example will make this clear. If the surveyor is interested in knowing the percent of forestland in an area, the ultimate stratification is to divide the area into two strata - one that contains all the forest and another that contains all else. The surveyor would then need to sample only one parcel of land to determine how much forest is present. Clearly, such stratification would be accidental and unlikely. It is not necessary to stratify if the surveyor can allocate and analyze a sufficient number of samples to achieve the statistical performance desired. However, attention paid to this process most often pays dividends by reducing the number of samples required to achieve a desired level of statistical accuracy.
Considerations as you begin. It is desirable to identify as few strata as possible while retaining a high degree of uniformity in each stratum. Imagery based strata files can be constructed using unsupervised or supervised clustering tools available in ERDAS IMAGINE. Typically, twenty to thirty strata are effective. It is also advisable to maintain strata of similar size. Strata do not have to be contiguous. In fact, one of the advantages of the Frame Sampling Tools software is its ability to manage highly fragmented strata to the surveyor's statistical advantage. Three important considerations are: the definition of strata that are internally homogeneous, the definition of prior expectations for the
separately treated with AFS, and extreme care be given to stratification of that region, or alternate statistical methods be employed altogether.
software randomly allocates samples in proportion to the size of each stratum and relative to the expected proportion of the MOI in that stratum. The Frame Sampling Tools software tends to allocate more samples to larger strata and to strata with equal distributions of the material of interest and other materials, thereby taking advantage of statistical relationships to provide the lowest variance result possible for a given random sample size.
Considerations as you begin. The surveyor may pre-select samples and inform the sample selection tool. This may be desirable if ground observations in certain areas are already in hand. However, the surveyor must exercise caution that no more than half, and preferably fewer, of all samples are so designated; otherwise there is a risk of introducing bias into the sampling procedure. Once the Frame Sampling Tools software takes over the sample allocation, a statistical figure of merit called the Selection Score is reported as samples are allocated, allowing the surveyor to make a judgment about sample size. Even with a set of pre-allocated samples, as long as the sample set is not dominated by pre-selected samples, the sampling tool strives to maintain randomness. This is a prerequisite to ensuring an unbiased AFS design. Generally one will find that with cluster sampling, variance due to sample size alone is rarely a limiting factor. Stratification stationarity and labeling accuracy are greater concerns.
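The allocation behavior described above (more samples to larger strata and to strata where the MOI proportion is near one half) can be illustrated with a Neyman-style weighting. The sketch below is an assumption chosen to reproduce that qualitative behavior, not the formula used by the Frame Sampling Tools software. The function name `allocate_samples`, the weight `size * sqrt(p * (1 - p))`, and the default prior of 0.5 for strata without a prior are all hypothetical.

```python
import math


def allocate_samples(stratum_sizes, priors, total_samples):
    """Split total_samples across strata (hypothetical Neyman-style rule).

    stratum_sizes: {stratum: size, e.g. number of grid cells or hectares}
    priors:        {stratum: expected MOI proportion in [0, 1]}; strata that
                   are missing here get 0.5 (an assumption of this sketch).
    Returns {stratum: number of samples}, summing to total_samples.
    """
    weights = {}
    for stratum, size in stratum_sizes.items():
        p = priors.get(stratum, 0.5)
        # sqrt(p * (1 - p)) peaks at p = 0.5 and is zero at p = 0 or 1.
        weights[stratum] = size * math.sqrt(p * (1.0 - p))

    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("all strata have zero weight; check sizes and priors")

    # Largest-remainder rounding keeps the allocations integral while
    # preserving the requested total.
    raw = {s: total_samples * w / total_weight for s, w in weights.items()}
    alloc = {s: math.floor(r) for s, r in raw.items()}
    leftover = total_samples - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda s: raw[s] - alloc[s], reverse=True)
    for stratum in by_remainder[:leftover]:
        alloc[stratum] += 1
    return alloc
```

This sketch only decides how many samples each stratum receives. Choosing which units to draw within a stratum would still be done at random, and any pre-selected samples would be counted against the stratum totals.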
Sample Identification
Once samples are selected, labeling of the materials present in each sample is the next order of business. The surveyor may choose among a number of methods. The Frame Sampling Tools software provides two mechanisms: one based on dot grid labeling and the other based on polygon labeling. Labels may be generated from field surveys or through imagery interpretation.
Considerations as you begin. The results of the survey are fundamentally limited by the level of accuracy in labels assigned to parcels within each sample. Labeling of parcels is entirely at the surveyor's discretion. Errors in labeling can affect both the bias and the variance of the end result. Random labeling errors will affect the variance of the estimate, while correlated errors will affect the bias. Having analysts cross-check each other's work is a good practice to ensure that the most accurate labels possible have been assigned. The analyst should improve labels throughout the procedure to achieve the best possible label for all parcels within each sample.
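The dot grid mechanism mentioned above rests on simple arithmetic: the share of dots labeled as MOI, multiplied by the sample area, approximates the MOI area in that sample. The sketch below shows only that arithmetic. The function name `dot_grid_area`, the label strings and the 100-hectare sample area (a 1 km x 1 km cell) are illustrative assumptions, not part of the software.

```python
def dot_grid_area(labels, sample_area_ha, moi_label="forest"):
    """Estimate the MOI proportion and area in one sample from dot labels.

    labels:         one label per dot, as assigned by the analyst
    sample_area_ha: area of the sampling unit in hectares
    Returns (proportion of dots labeled MOI, estimated MOI area in hectares).
    """
    if not labels:
        raise ValueError("at least one labeled dot is required")

    hits = sum(1 for label in labels if label == moi_label)
    proportion = hits / len(labels)
    # Multiply before dividing to limit floating-point rounding.
    area_ha = hits * sample_area_ha / len(labels)
    return proportion, area_ha
```

For example, a 1 km x 1 km cell with 100 dots, 30 of them labeled "forest", gives a proportion of 0.3 and an estimated forest area of about 30 hectares. A denser dot grid reduces the sampling noise of this within-sample estimate but requires more labeling effort.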
Sample Selection
Once the primary sampling unit is defined, the Frame Sampling Tools software assists the surveyor in selecting samples. Random sampling is encouraged, but not strictly enforced. If strata and priors are provided, the Frame Sampling Tools
Estimation Methods
Once samples are selected and identified, the Final Analysis computes the proportion of the MOI in the study area and reports statistics associated with that result. The Frame Sampling Tools software uses a direct expansion estimator, meaning that the Final Analysis bases the area of the MOI within each stratum on the parcels from each allocated sample that fall within each stratum. It then computes the proportion of MOI in the study area by summing the weighted contribution of each stratum.
Considerations as you begin. The estimator relies on stratum definitions established to optimize sample allocations. Often, knowledge of the region accrued through the course of this process helps to identify possible stationarity problems in strata after samples are allocated. Modification of the stratification prior to the estimation step, called post-stratification, is at times advisable. The variance of the estimate due to the sample allocation would change, possibly becoming larger. However, this would be offset by the advantage of expanding the estimate across uniform strata, which would tend to reduce the variance. Only variance will be affected; no bias would be introduced (unless a stratum is left altogether without samples). Under the Frame Sampling Tools software, estimates can be generated using any stratum definition. The resulting statistics generated by Final Results can guide the surveyor as to which result would be most dependable.
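The direct expansion idea described above can be written out in a few lines. This is a sketch of the general technique under stated assumptions, not the Final Analysis implementation. The function name `direct_expansion` and the input layout are hypothetical, and the sampled areas per stratum are assumed to have been pooled across all samples beforehand.

```python
def direct_expansion(stratum_areas, sample_obs):
    """Expand sampled MOI proportions to whole strata and sum them.

    stratum_areas: {stratum: total stratum area in hectares}
    sample_obs:    {stratum: (MOI area sampled, total area sampled)}, pooled
                   over every sample parcel that fell in that stratum
    Returns (MOI area in ha, MOI proportion of the frame, per-stratum ha).
    """
    per_stratum = {}
    for stratum, area in stratum_areas.items():
        if stratum not in sample_obs:
            # A stratum with no samples cannot be expanded without bias.
            raise ValueError(f"stratum {stratum!r} has no samples")
        moi_sampled, total_sampled = sample_obs[stratum]
        if total_sampled <= 0:
            raise ValueError(f"stratum {stratum!r} has no sampled area")
        proportion = moi_sampled / total_sampled
        per_stratum[stratum] = proportion * area

    moi_total = sum(per_stratum.values())
    frame_total = sum(stratum_areas.values())
    return moi_total, moi_total / frame_total, per_stratum
```

As a made-up illustration, two strata of 3,406 and 3,359 hectares with sampled MOI proportions of 50 percent and 25 percent would expand to 1,703 and 839.75 hectares, for a total of 2,542.75 hectares. Running the same function with an alternative stratum definition is one way to compare post-stratification choices.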
non-stationarity and labeling error. The Frame Sampling Tools software uses a Monte Carlo simulation technique to estimate this variance using the key elements of the AFS design itself. Note that significant variance can be introduced if strata in which the MOI is a rare occurrence are under-sampled. This leads to the recommendation that such areas be omitted from the area frame. This approach is a double-edged sword - see structured error (B). B) Structured error, or bias, is a result of errors that occur on a systematic basis. Such errors are often a result of the experimental design or of the introduction of bias in analyst processes. Possible sources of bias include omission of regions containing the MOI from the study area, preselection of all or most samples, and correlated labeling error, including unresolvable land features. Area that is omitted from the study region may contain some of the MOI. This is disregarded in the estimate produced by Final Analysis. For example, if one is attempting to determine all the agricultural land in a state and eliminates forestlands using a state map to do so, any reclaimed agricultural land in those forests would be missed. Hence, the estimate for the entire state would include a bias. Labeling can introduce another bias source. If the surveyor in labeling land parcels using image interpretation always mislabels a spectral feature, a bias is introduced. Small features unresolved in a scene (for example, irrigation ditches, roads or pathways within a field) may be regularly missed as well, possibly introducing bias if the presence of such features is correlated to the MOI. (For example, irrigation ditches occurring only in grain-growing regions would result in an overestimate of area planted to grain unless the ditch area were estimated and subtracted from the area of grain.)
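The Monte Carlo approach to random error mentioned above is not specified in detail in this paper. As a stand-in that conveys the general idea, the sketch below uses a stratified bootstrap: sampled units are resampled with replacement within each stratum and the direct expansion estimate is recomputed many times. The function name `bootstrap_estimate_spread`, the equal weighting of sampling units and the default trial count are assumptions of this illustration and should not be read as the software's procedure.

```python
import random
import statistics


def bootstrap_estimate_spread(stratum_areas, sample_props, n_trials=2000, seed=0):
    """Approximate the spread of a stratified area estimate by resampling.

    stratum_areas: {stratum: total stratum area in hectares}
    sample_props:  {stratum: [MOI proportion observed in each sampled unit]},
                   with units assumed to be of equal size within a stratum
    Returns (mean of simulated estimates, their standard deviation), in ha.
    """
    if n_trials < 2:
        raise ValueError("n_trials must be at least 2")

    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    estimates = []
    for _ in range(n_trials):
        total = 0.0
        for stratum, area in stratum_areas.items():
            props = sample_props[stratum]
            # Draw as many units as were originally sampled, with replacement.
            resample = [rng.choice(props) for _ in props]
            total += area * statistics.fmean(resample)
        estimates.append(total)
    return statistics.fmean(estimates), statistics.stdev(estimates)
```

A rough 95 percent interval could then be taken as the mean plus or minus about two standard deviations. Note that this kind of resampling captures only random sampling variability; structured error of the kind described under (B) does not appear in the simulated spread.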
ERDAS IMAGINE/Frame Sampling Tool Resources
Project Manager Tool (PMT): Select from Frame Sampling Tools Menu to manage all project files. Three levels are supported: project level (or root node); area frame level (tile node); and sample frame level (sample node).
Single Sampling Wizard Palette: Select from PMT tool bar to be guided throughout the AFS project management process.
Image Information Dialog: Select tab 5 to display single sampling history of image file.
Project History:
Track activities logged with every image file used in AFS
Area Frame:
Define study area
Project Manager Tool (PMT): Use the PMT to define the image tiles that comprise your study area
See following
Sampling Frame:
Establish strata and principal sampling unit.
Stratification:
Define homogeneous areas to improve the efficiency of sampling.
Strata to be used for Area Frame Sampling can be supplied from any source. For example, strata can be digitized from a map and identified as the stratification file using the PMT. Frame Sampling Tools provides a set of tools that can be helpful in defining strata from imagery.
Unsupervised Clustering with Isodata: Available in ERDAS IMAGINE under the classifier menu to allow statistical clustering of multispectral image data appropriate for definition of AFS strata.
Class Grouping Tool: Facilitates the process of grouping clusters generated from unsupervised methods into strata.
Dendrogram Tool: Implements a fuzzy recode procedure that allows users to group clusters according to various spectral attributes.
Ancillary Data Tool: Aids the user in defining strata by associating ancillary data, like ground truth, with clusters derived using unsupervised methods.
The user can provide a polygon file that identifies the size and shape of each possible sample. The polygon must cover the entire area frame. Alternatively (recommended), the analyst can generate a uniform grid.
Grid Generation Tool: Generates a lattice grid of rectangular polygons over an image tile(s).
Sample Selection:
Sample Selection Tool: guides the selection of samples to be analyzed in estimating the MOI. The user
may identify preferred sites, or rule out certain sites; the tool will select up to the total number of desired samples in a way that reduces the variance of the final estimate due to sampling.
Material Identification:
Label the amount of MOI present in each selected sample.
Identifying MOI in each sample is a critical step. The user can employ imagery interpretation or provide labels based on ground observations. It is important that systematic labeling errors are avoided. The user is provided two options: labeling grid intersections, from which an estimate of the MOI area is computed, or labeling polygons representing land parcels.
Dot Grid Tool: Automatically generates a grid of dots over a sample image, records labels provided by the user, and computes the area of MOI in the sample.
Polygon Analysis Tool: Using heads-up digitizing, the user identifies all parcels in a sample image belonging to the MOI. The tool computes the area of MOI in the sample from the resultant polygons.
Estimation:
Define study area.
Final Analysis: Performs the direct expansion estimation of the area of the MOI over the study area. Final Analysis: Computes the variance of the estimate and confidence associated with the AFS, considering
the possible sources of error, including the number of samples, strata stationarity, and random labeling error. The analysis is deterministic, requiring no further user input. Monte Carlo methods are used to determine the effect of sampling error.
Error Assessment:
Define study area.
Information subject to change without notice. The contributions of Richard Cicone, ISCIENCES, L.L.C., Ann Arbor, MI, USA and R. Peter Kollasch, EarthSat, Rockville, MD, USA to develop this white paper are acknowledged. Copyright 2002-2003 Leica Geosystems GIS & Mapping, LLC. All rights reserved. ERDAS and ERDAS IMAGINE are registered trademarks of Leica Geosystems GIS & Mapping, LLC. Other brand and product names are the properties of their respective owners. Part No. FSTwhitepaper. cc 01/03.
Leica Geosystems GIS & Mapping, LLC. 2801 Buford Hwy, Suite 300 Atlanta, Georgia 30329, USA Phone +1 404 248 9000 Fax +1 404 248 9000
gis.leica-geosystems.com