
MET Online Tutorial for METv4.

Point-Stat Tool: General
Point-Stat Functionality
The Point-Stat tool provides verification statistics for forecasts at observation points, as opposed to over gridded analyses. The Point-Stat tool matches gridded forecasts to point observation locations using several configurable interpolation methods. The tool then computes continuous as well as categorical verification statistics for the matched pairs falling inside the verification masking regions defined by the user. The categorical statistics generally are calculated by applying one or more thresholds to the forecast and observation values. Confidence intervals, which represent a measure of uncertainty, are computed for most of the verification statistics.

Point-Stat Usage
View the usage statement for Point-Stat by simply typing the following:
bin/point_stat

At a minimum, the input gridded fcst_file, the point observation obs_file in NetCDF format, and the configuration config_file must be passed in on the command line. You may use the -point_obs command line argument to specify additional NetCDF observation files to be used. In order to produce anomaly statistics (SAL1L2 and VAL1L2 line types described in the MET Users Guide), you must specify a file containing climatological data using the -climo command line argument. You may also explicitly define the observation valid time window to determine which observations are used in the verification by using the -obs_valid_beg and -obs_valid_end command line arguments.

Point-Stat Tool: Configure


The behavior of Point-Stat is controlled by the contents of the configuration file passed to it on the command line. The default Point-Stat configuration file may be found in the METv4.0/data/config/PointStatConfig_default file. The configuration used by the test script may be found in the METv4.0/scripts/config/PointStatConfig file. Prior to modifying the configuration file, users are advised to make a copy of the default:
cp data/config/PointStatConfig_default tutorial/config/PointStatConfig_tutorial

Open up the tutorial/config/PointStatConfig_tutorial file for editing with your preferred text editor. The configurable items for Point-Stat are used to specify how the verification is to be performed. The configurable items include specifications for the following:

The forecast fields to be verified at the specified vertical levels.
The threshold values to be applied.
The matching time window for point observations.
The type of point observations to be matched to the forecasts.
The areas over which to aggregate statistics - as predefined grids, configurable lat/lon polylines, or individual stations.
The confidence interval methods to be used.
The interpolation methods to be used.
The types of verification methods to be used.

You may find a complete description of the configurable items in the MET Users Guide or in the comments of the configuration file itself. Please take some time to review them. For this tutorial, we'll configure Point-Stat to verify the model temperature at two vertical levels and winds at the surface. However, Point-Stat may be configured to verify as many or as few model variables as you desire. The sample input forecast file is not on the NCEP Grid 212 domain. However, we'll use the NCEP Grid 212 domain to define a masking region for our data. Edit the tutorial/config/PointStatConfig_tutorial file as follows:

In the fcst dictionary, set


field = [
   { name = "TMP";  level = "Z2";       cat_thresh = [ >278, >283, >288 ]; },
   { name = "TMP";  level = "P750-850"; cat_thresh = [ >278 ]; },
   { name = "UGRD"; level = "Z10";      cat_thresh = [ >=5.0 ]; },
   { name = "VGRD"; level = "Z10";      cat_thresh = [ >=5.0 ]; }
];

To verify 2-meter temperature, 10-meter winds, and temperature fields between 750hPa and 850hPa and apply the categorical thresholds listed. TMP is in Kelvin and the U and V components of wind are in m/s.

In the fcst dictionary, set
wind_thresh = [ >0.0, >=1.0, >=5.0, >=8.0 ];
To indicate that we'd like VL1L2 lines computed using these thresholds on the wind speeds.

In the fcst dictionary, set
message_type = [ "ADPUPA", "ADPSFC" ];
To verify using these 2 observation types.

Set
obs = fcst;
To use the settings from the fcst dictionary above.

In the mask dictionary, set
grid = [ "G212" ];
To accumulate statistics over NCEP Grid 212 domain.

In the mask dictionary, set
poly = [ "MET_BASE/data/poly/EAST.poly", "MET_BASE/data/poly/WEST.poly" ];
To accumulate statistics over the regions defined by the EAST and WEST polyline files.

In the interp dictionary, set

type = [ { method = UW_MEAN; width = 1; }, { method = UW_MEAN; width = 5; } ];

To indicate that the forecast values should be interpolated to the observation locations using the nearest neighbor method (an unweighted mean over a box of width 1 is just the nearest grid point) and by averaging the forecast values over the 5 by 5 box surrounding the observation location.
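To make the two interpolation settings concrete, here is a minimal Python sketch of an unweighted mean over a width-by-width box centered on the grid point nearest the observation. The function name uw_mean, the toy grid values, and the handling of grid edges (points outside the grid are simply skipped) are assumptions made for illustration; this is not MET's implementation.

```python
def uw_mean(grid, row, col, width):
    """Unweighted mean of the width x width box centered on (row, col).

    Assumes an odd width. Points outside the grid are skipped, which is
    an assumption of this sketch; MET's edge handling may differ.
    """
    half = width // 2
    values = [
        grid[r][c]
        for r in range(row - half, row + half + 1)
        for c in range(col - half, col + half + 1)
        if 0 <= r < len(grid) and 0 <= c < len(grid[0])
    ]
    return sum(values) / len(values)


# Toy 5x5 forecast grid with made-up values: grid[r][c] = r*r + c.
grid = [[float(r * r + c) for c in range(5)] for r in range(5)]

nearest = uw_mean(grid, 2, 2, width=1)   # 1x1 box: the nearest grid point
box_mean = uw_mean(grid, 2, 2, width=5)  # mean over the full 5x5 box
print(nearest, box_mean)
```

With width = 1 the box contains a single point, which is why the first entry of the type array behaves like nearest neighbor matching.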

Set
output_flag = {
   fho    = NONE;
   ctc    = BOTH;
   cts    = BOTH;
   mctc   = BOTH;
   mcts   = BOTH;
   cnt    = BOTH;
   sl1l2  = BOTH;
   sal1l2 = NONE;
   vl1l2  = BOTH;
   val1l2 = NONE;
   pct    = NONE;
   pstd   = NONE;
   pjc    = NONE;
   prc    = NONE;
   mpr    = NONE;
};

To indicate that the contingency table counts (CTC), contingency table statistics (CTS), multi-category contingency table counts (MCTC), multi-category contingency table statistics (MCTS), continuous statistics (CNT), scalar partial sum (SL1L2), and vector partial sum (VL1L2) line types should be output.

Save and close this file.

Point-Stat Tool: Run


Next, run Point-Stat on the command line using the following command:
bin/point_stat \
   data/sample_fcst/2007033000/nam.t00z.awip1236.tm00.20070330.grb \
   tutorial/out/pb2nc/tutorial_pb.nc \
   tutorial/config/PointStatConfig_tutorial \
   -outdir tutorial/out/point_stat \
   -v 2

Point-Stat is now performing the verification tasks we requested in the configuration file. It should take a minute or two to run.

In this example, Point-Stat accumulates matched forecast/observation pairs into 48 groups based on our configuration file selections. The 48 groups are a result of:
4 fields (TMP at Z2, TMP at P750-850, UGRD at Z10, VGRD at Z10) * 2 observing message types * 3 masking regions * 2 interpolation methods.
However, many of these combinations, such as verifying TMP at Z2 versus upper-air observations (ADPUPA), will result in zero matched pairs being found. As Point-Stat runs, you should see several status messages printed to the screen to indicate progress.

Updates to the configuration file language in METv4.0 allow you greater control of which message types are used for each verification task. Instead of trying all 48 combinations listed above, you could specify which message type(s) should be used for verifying each field. Simply move the message_type setting inside each entry of the field array. Use ADPSFC to verify the surface fields (TMP at Z2, UGRD at Z10, and VGRD at Z10), and use ADPUPA to verify the upper-air field (TMP from 750 to 850mb).

The execution time of Point-Stat can be greatly improved by disabling the computation of bootstrap confidence intervals. To see the effect of this, edit the configuration file tutorial/config/PointStatConfig_tutorial, in the boot dictionary, set n_rep equal to 0, and re-run the previous Point-Stat command. It should run more than twice as fast.

The execution time of Point-Stat can also be improved by disabling the computation of rank correlation statistics. To see the effect of this, edit the configuration file tutorial/config/PointStatConfig_tutorial, set the rank_corr_flag variable equal to FALSE, and re-run the previous Point-Stat command. It should now run even faster.
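The count of 48 groups is simply the size of the cross product of the configured choices. The short Python sketch below reproduces that arithmetic and shows how the count drops when each field is paired with a single message type. The label strings are placeholders used only for counting; they are not read from any MET file.

```python
from itertools import product

fields = ["TMP/Z2", "TMP/P750-850", "UGRD/Z10", "VGRD/Z10"]
message_types = ["ADPUPA", "ADPSFC"]
masks = ["G212", "EAST", "WEST"]
interp_widths = [1, 5]

# Every field is tried against every message type, mask, and interpolation.
all_groups = list(product(fields, message_types, masks, interp_widths))
print(len(all_groups))  # 4 * 2 * 3 * 2 = 48

# Pair each field with one message type, as the tutorial suggests.
type_for_field = {
    "TMP/Z2": ["ADPSFC"],
    "TMP/P750-850": ["ADPUPA"],
    "UGRD/Z10": ["ADPSFC"],
    "VGRD/Z10": ["ADPSFC"],
}
targeted = [
    (field, mtype, mask, width)
    for field in fields
    for mtype in type_for_field[field]
    for mask in masks
    for width in interp_widths
]
print(len(targeted))  # 4 * 1 * 3 * 2 = 24
```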

Point-Stat Tool: Output


The output of Point-Stat is one or more ASCII files containing statistics summarizing the verification performed. In this example, the output is written to the tutorial/out/point_stat directory as we requested on the command line. That output directory should now contain 8 files, one each for the CTC, CTS, MCTC, MCTS, CNT, SL1L2, and VL1L2 line types (.txt), and an eighth one for the STAT file (.stat). The STAT file contains all of the output statistics while the other ASCII files contain the exact same data, but sorted by line type.

Since the lines of data in these ASCII files are so long, we strongly recommend configuring your text editor to NOT use dynamic word wrapping. The files will be much easier to read that way. Open up the tutorial/out/point_stat/point_stat_360000L_20070331_120000V_ctc.txt CTC file using the text editor of your choice and note the following:

This is a simple ASCII file consisting of several rows of data. Each row contains data for a single verification task. The first 21 header columns contain data applicable to all line types, such as timing information, variable and level information, verifying message type, masking region applied, interpolation method applied, and threshold values applied. The twenty-first column, labeled LINE_TYPE, indicates the type of statistics contained in this line. In this file, the LINE_TYPE column contains CTC indicating that the columns to follow contain contingency table counts. The remaining columns after LINE_TYPE are labeled FY_OY, FY_ON, FN_OY, and FN_ON and contain the contingency table counts.
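As an illustration of what the four count columns mean, the following Python sketch tallies a 2x2 contingency table from matched pairs for the event "value > threshold". FY/FN stand for forecast yes/no and OY/ON for observation yes/no. The pairs below are made-up numbers, not values from the tutorial output, and the helper name contingency_counts is hypothetical.

```python
def contingency_counts(pairs, threshold):
    """Tally FY_OY, FY_ON, FN_OY, FN_ON for the event value > threshold."""
    counts = {"FY_OY": 0, "FY_ON": 0, "FN_OY": 0, "FN_ON": 0}
    for fcst, obs in pairs:
        fcst_part = "FY" if fcst > threshold else "FN"
        obs_part = "OY" if obs > threshold else "ON"
        counts[fcst_part + "_" + obs_part] += 1
    return counts


# Made-up (forecast, observation) temperature pairs in Kelvin.
pairs = [
    (279.1, 280.0),
    (277.5, 278.4),
    (281.2, 277.9),
    (276.0, 275.3),
    (284.0, 283.5),
]
counts = contingency_counts(pairs, 278.0)
print(counts)
```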

Close this file, open up the tutorial/out/point_stat/point_stat_360000L_20070331_120000V_cts.txt CTS file, and note the following:

The first 21 columns contain the same type of header data as in the previous file. The LINE_TYPE column is set to CTS which indicates that the columns to follow contain contingency table statistics. Refer to the MET Users Guide for a thorough description of this output line type. Confidence intervals are given for each of these statistics, computed using either one or two methods. The columns ending in _NCL and _NCU give lower and upper confidence limits computed using assumptions of normality. The columns ending in _BCL and _BCU give lower and upper confidence limits computed using bootstrapping. If you re-ran the Point-Stat example with bootstrapping turned off, the _BCL and _BCU columns will contain the missing data value of NA.

Open up the tutorial/out/point_stat/point_stat_360000L_20070331_120000V_mctc.txt MCTC file, and note the following:


This file contains 6 lines of multi-category contingency table counts. These 6 lines are a result of: 3 masking regions * 2 interpolation methods. Since we only provided multiple thresholds for 2-meter temperature, this file only contains MCTC output for that field. Point-Stat used the 3 thresholds we provided to define 4x4 contingency tables. The corresponding statistics are written out in the MCTS file. This functionality is new for METv4.0.
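To see why 3 thresholds lead to a 4x4 table, note that they split the value range into 4 categories. The Python sketch below assigns each value to a category by counting how many of the ">" thresholds it exceeds, then tallies forecast category against observation category. The sample pairs are made up, and the row/column orientation (rows for forecast, columns for observation) is an assumption of this sketch.

```python
from bisect import bisect_left

thresholds = [278.0, 283.0, 288.0]  # the 2-meter TMP thresholds, all ">"


def category(value):
    """Number of thresholds strictly exceeded: 0, 1, 2, or 3."""
    return bisect_left(thresholds, value)


def multi_category_table(pairs):
    """Tally pairs into an (n+1) x (n+1) table for n thresholds."""
    size = len(thresholds) + 1
    table = [[0] * size for _ in range(size)]
    for fcst, obs in pairs:
        table[category(fcst)][category(obs)] += 1
    return table


# Made-up (forecast, observation) pairs in Kelvin.
pairs = [(275.0, 279.0), (280.0, 281.0), (285.0, 284.0), (290.0, 286.0), (278.0, 277.0)]
table = multi_category_table(pairs)
for row in table:
    print(row)
```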

Open up the tutorial/out/point_stat/point_stat_360000L_20070331_120000V_vl1l2.txt VL1L2 file, and note the following:


This file contains 24 lines of VL1L2 partial sums. These 24 lines are a result of: 3 masking regions * 2 interpolation methods * 4 wind speed thresholds. For the VL1L2 line, the contents of the FCST_THRESH and OBS_THRESH header columns indicate the thresholds that were applied to the wind speed values to determine which U/V points would be included in the sum.

The other output text files contain data specific to their individual line types. Refer to tables 4-2 through 4-19 in the MET Users Guide for a description of their contents. Lastly, the tutorial/out/point_stat/point_stat_360000L_20070331_120000V.stat STAT file contains all of the same data we just viewed but in a single file. The Stat-Analysis tool, which we'll use later in this tutorial, only reads the STAT output of the Point-Stat, Grid-Stat, Wavelet-Stat, and Ensemble-Stat tools, not the ASCII (.txt) files.

Grid-Stat Tool: General


Grid-Stat Functionality
The Grid-Stat tool provides verification statistics for a matched forecast and observation grid. All of the forecast gridpoints in the region of interest are matched to observation gridpoints on the same grid. All the matched gridpoints falling inside a verification masking region defined by the user are used to compute the verification statistics. The Grid-Stat tool functions in much the same way as the Point-Stat tool, except that no interpolation is required because the forecasts and observations are on the same grid. However, the interpolation parameters may be used to perform a smoothing operation on the input data prior to verifying it. The output statistics generated by Grid-Stat are largely the same as those generated by Point-Stat.

Grid-Stat Usage
View the usage statement for Grid-Stat by simply typing the following:
bin/grid_stat

At a minimum, the input gridded fcst_file, the input gridded obs_file, and the configuration config_file must be passed in on the command line. The forecast and observation fields must be interpolated to a common grid prior to running Grid-Stat. The copygb utility is recommended for regridding files in GRIB format.

Grid-Stat Tool: Configure


The behavior of Grid-Stat is controlled by the contents of the configuration file passed to it on the command line. The default Grid-Stat configuration file may be found in the METv4.0/data/config/GridStatConfig_default file. The configurations used by the test script may be found in the METv4.0/scripts/config/GridStatConfig* files. Prior to modifying the configuration file, users are advised to make a copy of the default:
cp data/config/GridStatConfig_default tutorial/config/GridStatConfig_APCP_12
cp data/config/GridStatConfig_default tutorial/config/GridStatConfig_POP_12

The configurable items for Grid-Stat are used to specify how the verification is to be performed. The Grid-Stat configuration file should look very similar to the one for Point-Stat. The configurable items include specifications for the following:

The forecast fields to be verified at the specified vertical level or accumulation interval.
The threshold values to be applied.
The areas over which to aggregate statistics - as predefined grids or configurable lat/lon polylines.
The confidence interval methods to be used.
The smoothing methods to be applied (as opposed to interpolation methods).
The types of verification methods to be used.

You may find a complete description of the configurable items in the MET Users Guide or in the comments of the configuration file itself. Please take some time to review them.

For this tutorial, we'll run Grid-Stat twice - once to verify the 12-hour accumulated precipitation output of PCP-Combine and once to apply the probabilistic verification methods to a 12-hour probability of precipitation forecast. In the first run, we'll use NetCDF for both the forecast and observation files. In the second run, we'll use a GRIB forecast file and a NetCDF observation file. While we'll use Grid-Stat to verify only one field at a time, it may be configured to verify more than one field at a time.

Open up the tutorial/config/GridStatConfig_APCP_12 file for editing with your preferred text editor and edit it as follows:

In the fcst dictionary, set


field = [ { name = "APCP_12"; level = "(*,*)"; cat_thresh = [ >0, >=5.0, >=10.0 ]; } ];

To verify the NetCDF variable of that name and apply the 3 thresholds listed. Accumulated precipitation is in millimeters.

Set
obs = fcst;
To use the settings from the fcst dictionary above.

In the mask dictionary, set
grid = [ "G212" ];
To accumulate statistics over NCEP Grid 212 domain.

In the mask dictionary, set
poly = [ "MET_BASE/tutorial/out/gen_poly_mask/CONUS_G212_poly.nc",
         "MET_BASE/data/poly/EAST.poly",
         "MET_BASE/data/poly/WEST.poly" ];

To accumulate statistics over the entire CONUS using the NetCDF output of the Gen-Poly-Mask tool and over the regions defined by the EAST and WEST polyline files.

In the nbrhd dictionary, set
width = [ 3, 5 ];
To select two neighborhood sizes over which to accumulate neighborhood statistics.

In the nbrhd dictionary, set
cov_thresh = [ >=0.5, >=0.75 ];
To define the fractional coverage threshold values of interest.

Set
output_flag = {
   fho    = NONE;
   ctc    = BOTH;
   cts    = BOTH;
   mctc   = NONE;
   mcts   = NONE;
   cnt    = BOTH;
   sl1l2  = BOTH;
   vl1l2  = NONE;
   pct    = NONE;
   pstd   = NONE;
   pjc    = NONE;
   prc    = NONE;
   nbrctc = BOTH;
   nbrcts = BOTH;
   nbrcnt = BOTH;
};

To indicate that contingency table counts (CTC), contingency table statistics (CTS), continuous statistics (CNT), scalar partial sums (SL1L2), neighborhood contingency table counts (NBRCTC), neighborhood contingency table statistics (NBRCTS), and neighborhood continuous statistics (NBRCNT) should be output.

Set
nc_pairs_flag = TRUE;
To indicate that the NetCDF difference field should be output.

Note that we are not requesting multi-category contingency table output (the MCTC and MCTS line types). Although we are specifying multiple thresholds (>0, >=5.0, >=10.0), they are not all of the same type (">" versus ">="), so requesting those line types would cause an error.

Save and close this file. Then open up the tutorial/config/GridStatConfig_POP_12 file for editing with your preferred text editor and edit it as follows:

In the fcst dictionary, set


field = [
   {
      name       = "POP";
      level      = "Z0";
      prob       = TRUE;
      cat_thresh = [ >=0.0, >=0.25, >=0.50, >=0.75, >=1.0 ];
   }
];

To verify the 12-hour probability of precipitation forecast from the input GRIB file and apply the probabilistic thresholds listed.

Set the obs dictionary to


obs = {
   field = [
      { name = "APCP_12"; level = "(*,*)"; cat_thresh = [ >0.0 ]; }
   ];
};

To verify against the NetCDF variable of that name in the observation file and define the probabilistic event as any non-zero precipitation.

In the mask dictionary, set
grid = [ "G212" ];
To accumulate statistics over NCEP Grid 212 domain.

In the mask dictionary, set
poly = [ "MET_BASE/tutorial/out/gen_poly_mask/CONUS_G212_poly.nc",
         "MET_BASE/data/poly/EAST.poly",
         "MET_BASE/data/poly/WEST.poly" ];

To accumulate statistics over the entire CONUS using the NetCDF output of the Gen-Poly-Mask tool and over the regions defined by the EAST and WEST polyline files.

Set
output_flag = {
   fho    = NONE;
   ctc    = NONE;
   cts    = NONE;
   mctc   = NONE;
   mcts   = NONE;
   cnt    = NONE;
   sl1l2  = NONE;
   vl1l2  = NONE;
   pct    = BOTH;
   pstd   = BOTH;
   pjc    = BOTH;
   prc    = BOTH;
   nbrctc = NONE;
   nbrcts = NONE;
   nbrcnt = NONE;
};

To indicate that probability contingency table counts (PCT), probability statistics (PSTD), joint/continuous probabilistic statistics (PJC), and probabilistic ROC curve points (PRC) should be output.

Set
nc_pairs_flag = TRUE;
To indicate that the NetCDF difference field should be output.

Save and close this file.

Grid-Stat Tool: Run


Next, we'll run Grid-Stat twice on the command line using the following two commands:
bin/grid_stat \
   tutorial/out/pcp_combine/sample_fcst_24L_2005080800V_12A.nc \
   tutorial/out/pcp_combine/sample_obs_2005080800V_12A.nc \
   tutorial/config/GridStatConfig_APCP_12 \
   -outdir tutorial/out/grid_stat \
   -v 2

bin/grid_stat \
   data/sample_fcst/2005080312/pop5km_2005080312F096.grib_G212 \
   tutorial/out/pcp_combine/sample_obs_2005080800V_12A.nc \
   tutorial/config/GridStatConfig_POP_12 \
   -outdir tutorial/out/grid_stat \
   -v 2

With the bootstrap confidence intervals turned off (in boot dictionary, n_rep = 0;), these Grid-Stat commands should run very quickly - in a matter of seconds.

In the first command, which verifies a precipitation forecast, Grid-Stat performs 28 verification tasks. The 28 tasks are a result of:
(1 field (APCP at 12-hours) * 4 masking regions) + (1 field * 2 neighborhood sizes * 3 raw thresholds * 4 masking regions)

In the second command, which verifies a probability of precipitation forecast, Grid-Stat performs only 4 verification tasks. The 4 tasks are a result of:
(1 field (12-hour POP) * 4 masking regions)
Note that the neighborhood verification methods are not applied to probability forecasts.

In general, the MET tools check the output flag values to determine which verification methods to apply. Only those methods required to produce the output statistics requested are performed.

The output file names from these two commands are of the form:
For the first command:
ls tutorial/out/grid_stat/grid_stat_240000L_20050808_000000V*
For the second command:
ls tutorial/out/grid_stat/grid_stat_1080000L_20050808_000000V*

Note how similar these output filenames are. Since they're both verified against the same observation file, the valid times listed (20050808_000000V) are the same. The only difference is in the forecast lead times, 24 hours for the first command and 108 hours for the second command. If you'd like to differentiate these output file names in a more descriptive way, you could set the output_prefix parameter in the configuration files. For example, setting the output_prefix parameter to APCP in the first configuration file and POP in the second configuration file would result in the following naming conventions:
grid_stat_APCP_240000L_20050808_000000V* for the first command
grid_stat_POP_1080000L_20050808_000000V* for the second command
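The task counts and the lead times embedded in the file names can both be checked with a few lines of Python. This is a sketch only: the helper lead_hours is hypothetical, and it assumes the token before "L" is the lead time written as hours, minutes, and seconds with a variable number of hour digits, which matches the 24-hour and 108-hour examples above.

```python
import re


def lead_hours(filename):
    """Return the lead time in hours from a token such as '_240000L_'."""
    match = re.search(r"_(\d+)L_", filename)
    if match is None:
        raise ValueError(f"no lead-time token in {filename!r}")
    token = match.group(1)
    # Assumed layout: last four digits are MMSS, the rest are hours.
    hours, minutes, seconds = int(token[:-4]), int(token[-4:-2]), int(token[-2:])
    return hours + minutes / 60 + seconds / 3600


print(lead_hours("grid_stat_240000L_20050808_000000V_cts.txt"))   # first command
print(lead_hours("grid_stat_1080000L_20050808_000000V_pct.txt"))  # second command

# Task counts for the first Grid-Stat command.
masks = ["G212", "CONUS", "EAST", "WEST"]
raw_tasks = 1 * len(masks)                    # 1 field * 4 masking regions
neighborhood_tasks = 1 * 2 * 3 * len(masks)   # field * widths * raw thresholds * masks
print(raw_tasks + neighborhood_tasks)
```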

Grid-Stat Tool: Output


The output of Grid-Stat is one or more ASCII files containing statistics summarizing the verification performed and a NetCDF file containing difference fields. In this example, the output is written to the tutorial/out/grid_stat directory as we requested on the command line. That output directory should now contain 15 files, 9 from the first Grid-Stat command and 6 from the second.

The first command generates CTC, CTS, CNT, SL1L2, NBRCTC, NBRCTS, and NBRCNT ASCII files, a STAT file, and a NetCDF difference fields file. The second command generates PCT, PSTD, PJC, and PRC ASCII files, a STAT file, and a NetCDF difference fields file. The format of the CTC, CTS, CNT, and SL1L2 ASCII files is the same as was described for the Point-Stat tool. What's new for the Grid-Stat tool is the neighborhood method output (NBRCTC, NBRCTS, and NBRCNT) and the probability methods output (PCT, PSTD, PJC, and PRC). While Point-Stat is also able to use the probabilistic verification methods, it is NOT able to use the neighborhood verification methods since the observations are not gridded. Neighborhood verification is only available in Grid-Stat.

For the neighborhood methods, rather than comparing forecast/observation values at individual grid points, areas of forecast values are compared to areas of observation values. At each grid box, a fractional coverage value is computed for each field as the fraction of grid points within the neighborhood (centered on the current grid point) that exceed the specified raw threshold value. The forecast/observation fractional coverage values are then compared rather than the raw values themselves.

For the probability methods, the probabilistic forecast values are thresholded using multiple thresholds between 0 and 1 to define a multi-row contingency table. The observation field is also thresholded to define a binary yes/no field. The pairs of probabilistic forecast values and binary yes/no observation values are used to fill the multi-row contingency table. The output probability counts and statistics are derived from this multi-row contingency table.

Since the lines of data in these ASCII files are so long, we strongly recommend configuring your text editor to NOT use dynamic word wrapping. The files will be much easier to read that way. Open up the tutorial/out/grid_stat/grid_stat_240000L_20050808_000000V_nbrctc.txt NBRCTC file using the text editor of your choice and note the following:

The format of this file is almost identical to that of the CTC file. The INTERP_MTHD column is set to NBRHD, indicating that the neighborhood method was applied. The INTERP_PNTS column is set to 9 or 25, indicating that the neighborhood was defined over a 3-by-3 or 5-by-5 square. The LINE_TYPE column is set to NBRCTC, indicating that the columns to follow contain neighborhood contingency table counts. The COV_THRESH column is set to >=0.500 or >=0.750, indicating the coverage thresholds that were applied to the coverage fields to define these contingency tables.
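The relationship between the raw threshold, the neighborhood width, and the coverage threshold can be sketched in Python as follows. For one grid point, the fractional coverage is the share of points in the surrounding square that meet the raw threshold; that fraction is then compared with the coverage threshold to decide whether the neighborhood event occurred. The grids are made-up values, the function name is hypothetical, and skipping points outside the grid is an assumption of this sketch rather than a statement about MET's behavior.

```python
def fractional_coverage(grid, row, col, width, raw_thresh):
    """Fraction of points in the width x width square that are >= raw_thresh."""
    half = width // 2
    hits = total = 0
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                total += 1
                hits += grid[r][c] >= raw_thresh
    return hits / total


# Made-up 3x3 precipitation fields in millimeters.
fcst = [[0, 6, 7], [5, 5, 2], [1, 0, 9]]
obs = [[0, 0, 8], [0, 6, 0], [0, 0, 0]]

fcst_cov = fractional_coverage(fcst, 1, 1, width=3, raw_thresh=5.0)
obs_cov = fractional_coverage(obs, 1, 1, width=3, raw_thresh=5.0)

# Apply a coverage threshold of >=0.5 to each coverage value.
fcst_event = fcst_cov >= 0.5
obs_event = obs_cov >= 0.5
print(fcst_cov, obs_cov, fcst_event, obs_event)
```

In this toy case the forecast neighborhood event occurs and the observed one does not, so the point would add one to the FY_ON count of the neighborhood contingency table.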

The same types of differences exist between the CTS and the NBRCTS files. Close this file, open up the tutorial/out/grid_stat/grid_stat_240000L_20050808_000000V_nbrcnt.txt NBRCNT file, and note the following:

The format of this file is NOT very similar to that of the CNT files. The two statistics included in this file are the Fractions Skill Score (FSS column) and the Fractions Brier Score (FBS column) and their corresponding confidence intervals. See the MET Users Guide for a description of the neighborhood methods.

Close this file and use the ncview utility (if available on your machine) to view the NetCDF output of Grid-Stat:
ncview tutorial/out/grid_stat/grid_stat_240000L_20050808_000000V_pairs.nc &

Click through the variable names in the ncview window to see plots of the forecast, observation, and difference fields for each masking region. Now dump the header using the ncdump utility (if available on your machine):
ncdump -h tutorial/out/grid_stat/grid_stat_240000L_20050808_000000V_pairs.nc

View the NetCDF header to see how the variable names are defined. Next, open up the tutorial/out/grid_stat/grid_stat_1080000L_20050808_000000V_pct.txt PCT probabilistic output file using the text editor of your choice and note the following:

The LINE_TYPE column is set to PCT, indicating that the columns to follow contain information about the probability contingency table counts. Since the number of forecast thresholds the user may choose is variable, the number of columns in this line (and the other probability lines) is variable. This line contains columns named OY_i, ON_i, and THRESH_i for i = 1 to 5, the probability thresholds chosen.
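The following Python sketch shows one way a multi-row probability contingency table can be filled from pairs of forecast probability and binary observation. It assumes that each row covers the interval between two consecutive thresholds (so the five thresholds used here give four rows) and that a probability of exactly 1.0 falls in the last row; both are assumptions of this sketch, and the MET Users Guide remains the reference for the actual PCT column layout. The pairs are made up.

```python
from bisect import bisect_right


def probability_table(pairs, thresholds):
    """Tally observed yes/no counts per forecast-probability interval."""
    n_rows = len(thresholds) - 1
    rows = [{"THRESH": thresholds[i], "OY": 0, "ON": 0} for i in range(n_rows)]
    for prob, observed in pairs:
        # Index of the interval [thresholds[i], thresholds[i+1]) holding prob.
        i = min(bisect_right(thresholds, prob) - 1, n_rows - 1)
        rows[i]["OY" if observed else "ON"] += 1
    return rows


thresholds = [0.0, 0.25, 0.50, 0.75, 1.0]
# Made-up (forecast probability, event observed?) pairs.
pairs = [
    (0.0, False), (0.1, False), (0.3, False), (0.3, True),
    (0.6, True), (0.9, True), (1.0, True),
]
rows = probability_table(pairs, thresholds)
for row in rows:
    print(row)
```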

Close this file and see the MET Users Guide for a description of the other output probability line types.

MODE Tool: General


MODE Functionality
MODE, the Method for Object-Based Diagnostic Evaluation, provides an object-based verification for comparing gridded forecasts to gridded observations. MODE may be used in a generalized way to compare any two fields containing data from which objects may be well defined. It has most commonly been applied to precipitation fields and radar reflectivity. The steps performed in MODE consist of:

Define objects in the forecast and observation fields based on user-defined parameters.
Compute attributes for each of those objects: such as area, centroid, axis angle, and intensity.
For each forecast/observation object pair, compute differences between their attributes: such as area ratio, centroid distance, angle difference, and intensity ratio.
Use fuzzy logic to compute a total interest value for each forecast/observation object pair based on user-defined weights.
Based on the computed interest values, match objects across fields and merge objects within the same field.
Write output statistics summarizing the characteristics of the single objects, the pairs of objects, and the matched/merged objects.
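Two of the pair attributes named in the steps above, centroid distance and area ratio, are easy to sketch in Python once an object is represented as a set of grid points. The representation, the function names, and the choice of forecast area divided by observation area for the ratio are assumptions made for illustration; MODE's own definitions are given in the MET Users Guide.

```python
from math import hypot


def area(points):
    """Object area measured as the number of grid points it covers."""
    return len(points)


def centroid(points):
    """Mean x and mean y of the grid points in the object."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def pair_attributes(fcst_obj, obs_obj):
    """Centroid distance (grid units) and area ratio for one object pair."""
    (fx, fy), (ox, oy) = centroid(fcst_obj), centroid(obs_obj)
    return {
        "centroid_distance": hypot(fx - ox, fy - oy),
        "area_ratio": area(fcst_obj) / area(obs_obj),  # assumed fcst/obs
    }


# Made-up objects: a 2x2 forecast object and a 2-point observed object.
fcst_obj = {(0, 0), (1, 0), (0, 1), (1, 1)}
obs_obj = {(3, 4), (4, 5)}
attrs = pair_attributes(fcst_obj, obs_obj)
print(attrs)
```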

MODE may be configured to use a few different sets of logic with which to perform matching and merging. In this tutorial, we'll use the most simple approach, but users are encouraged to read the MET Users Guide for a more thorough description of MODE's capabilities.

MODE Usage
View the usage statement for MODE by simply typing the following:
bin/mode

At a minimum, the input gridded fcst_file, the input gridded obs_file, and the configuration config_file must be passed in on the command line. Just as with Grid-Stat and Wavelet-Stat, the forecast and observation fields must be interpolated to a common grid prior to running MODE. The copygb utility is recommended for regridding files in GRIB format.

MODE Tool: Configure


The behavior of MODE is controlled by the contents of the configuration file passed to it on the command line. The default MODE configuration file may be found in the METv4.0/data/config/MODEConfig_default file. The configurations used by the test scripts may be found in the METv4.0/data/config/MODEConfig* files. Prior to modifying the configuration file, users are advised to make a copy of the default:
cp data/config/MODEConfig_default tutorial/config/MODEConfig_tutorial

The configuration items for MODE are used to specify how the object-based verification approach is to be performed. Whereas in Point-Stat and Grid-Stat you may only compare the same type of forecast and observation fields, in MODE you may compare any two fields. When necessary, the items in the configuration file are specified separately for the forecast and observation fields. In most cases though, users will be comparing the same forecast and observation fields. The configurable items include specifications for the following:

The forecast and observation fields and vertical levels or accumulation intervals to be compared.
Options to mask out a portion of or threshold the raw fields.
The forecast and observation object definition parameters.
Options to filter out objects that don't meet a size or intensity criterion.
Flags to control the logic for matching/merging.
Weights to be applied for the fuzzy engine matching/merging algorithm.
Interest functions to be used for the fuzzy engine matching/merging algorithm.
Total interest threshold for matching/merging.
Various plotting options.

While the MODE configuration file contains many options, beginning users will typically only need to modify a few of them. You may find a complete description of the configurable items in the MET Users Guide or in the comments of the configuration file itself. Please take some time to review them.

For this tutorial, we'll configure MODE to verify the same 12-hour accumulated precipitation output of PCP-Combine that we used for Grid-Stat. MODE can read either GRIB files or the NetCDF output of PCP-Combine. Whereas Grid-Stat and Point-Stat may be used to compare multiple fields in one run, MODE compares a single forecast field to a single observation field.

Open up the tutorial/config/MODEConfig_tutorial file for editing with the text editor of your choice and edit it as follows:

Set
grid_res = 40;
To set the nominal grid spacing to 40km for this grid. The grid_res parameter is used further down in the config file in defining interest functions.

In the fcst dictionary, set
field = { name = "APCP_12"; level = "(*,*)"; };

To select the forecast field of 12-hour rainfall total accumulation from the input NetCDF file.

In the fcst dictionary, set
conv_radius = 5;
To specify a convolution smoothing radius of 5 grid units. This parameter may be set explicitly, as we're doing now, or relative to another value, like the grid_res parameter which was set above.

In the fcst dictionary, set
conv_thresh = >=5.0;
To threshold the convolved field and define objects.

In the fcst dictionary, set
merge_flag = NONE;
To disable the additional forecast and observation merging methods.

Set
obs = fcst;
To use the settings from the fcst dictionary above.

Set
match_flag = MERGE_BOTH;
To use the one-step matching/merging method.

Save and close this file.
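The roles of conv_radius and conv_thresh can be illustrated with a small Python sketch: the raw field is first smoothed by averaging over a circular neighborhood, and the smoothed field is then thresholded to mark the points that belong to objects. The function names, the toy field, and the treatment of grid edges (only in-grid points are averaged) are assumptions for illustration, not MODE's actual algorithm.

```python
def convolve(field, radius):
    """Replace each value by the mean of all points within radius grid units."""
    nrows, ncols = len(field), len(field[0])
    smoothed = [[0.0] * ncols for _ in range(nrows)]
    for r in range(nrows):
        for c in range(ncols):
            values = [
                field[rr][cc]
                for rr in range(max(0, r - radius), min(nrows, r + radius + 1))
                for cc in range(max(0, c - radius), min(ncols, c + radius + 1))
                if (rr - r) ** 2 + (cc - c) ** 2 <= radius ** 2
            ]
            smoothed[r][c] = sum(values) / len(values)
    return smoothed


def object_mask(field, radius, thresh):
    """True where the smoothed field is >= thresh."""
    return [[value >= thresh for value in row] for row in convolve(field, radius)]


# Made-up 5x5 rainfall field (mm) with a single isolated 20 mm point.
field = [[0.0] * 5 for _ in range(5)]
field[2][2] = 20.0

points_no_smoothing = sum(map(sum, object_mask(field, 0, 5.0)))
points_radius_one = sum(map(sum, object_mask(field, 1, 5.0)))
print(points_no_smoothing, points_radius_one)
```

The isolated point survives with no smoothing but is averaged below the threshold with a radius of 1, which is the sense in which a larger convolution radius removes small features before objects are defined.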

MODE Tool: Run


Next, run MODE on the command line using the following command:
bin/mode \
   tutorial/out/pcp_combine/sample_fcst_24L_2005080800V_12A.nc \
   tutorial/out/pcp_combine/sample_obs_2005080800V_12A.nc \
   tutorial/config/MODEConfig_tutorial \
   -outdir tutorial/out/mode \
   -v 2

MODE is now performing the verification task we requested in the configuration file. It should take a minute or two to run. MODE's runtime is greatly influenced by the number of gridpoints in the domain, the convolution radius chosen, and the number of objects resolved. The denser the domain, the larger the convolution radius, and the greater the number of objects, the more computations are required. When MODE is finished, it will have created 4 files: 2 ASCII statistics files, a NetCDF object file, and a PostScript summary plot. Open up the PostScript summary plot using the PostScript viewer of your choice, such as gv or Ghostview:
gv tutorial/out/mode/mode_240000L_20050808_000000V_120000A.ps

This PostScript summary plot contains 5 pages. The first page summarizes the application of MODE to this dataset. The second and third pages contain enlargements of the forecast and observation raw and object fields. The fourth page shows the forecast and observation object fields overlaid on top of each other. And the fifth page contains pair-wise differences for the matched clusters of objects. The PostScript summary plot will contain additional pages when additional merging methods are selected. Looking at the first page, note the following:

The valid data in the forecast field extends much further than in the observation field, leading to objects in the forecast field with no match (royal blue = unmatched) in the observation field.
The forecast field contains 5 objects while the observation field contains 6.
Two pairs of objects (colored red and green) are matched across these fields.
Forecast object 4 matches observed object 5 (red).
Forecast object 3 matches observed objects 2 and 6 (green).

Now, let's modify the configuration file and rerun this case. Again, open up the tutorial/config/MODEConfig_tutorial file and edit it as follows:

Set mask_missing_flag = BOTH; to mask out the bad data in both fields with each other.
In the fcst dictionary, set conv_radius = 2; to apply less smoothing before defining objects.

Now, rerun the MODE command listed above, and when it's finished, reload the PostScript plot. Reducing the convolution radius (amount of smoothing) while keeping the convolution threshold fixed should result in a greater number of smaller objects. On the first page of the PostScript plot, note the following:

The valid data in the raw forecast and observation fields now match up nicely.
The forecast field contains 6 objects and the observation field contains 17. These are called simple objects.
Three sets of objects are now matched across the fields. They are colored red, green, and light blue.

Objects that are colored the same color within the same field are called merged. Objects that have the same color across fields are called matched. Each set of colored objects is referred to as a cluster object. A cluster object consists of one or more simple objects. For example, in the observation field, simple object numbers 11 and 13 (both colored green) are merged together and are members of the same cluster object. They match forecast object number 4 which is its own cluster object.

After completing the next page on MODE Output, users are welcome to return to this page, play around with settings in the configuration file, and rerun this case several times. Listed below are some configuration parameters you may want to try modifying:

total_interest_thresh
fcst_conv_radius and obs_conv_radius
fcst_conv_thresh and obs_conv_thresh
fcst_area_thresh and obs_area_thresh
fcst_inten_thresh and obs_inten_thresh
fcst_merge_thresh and obs_merge_thresh with fcst_merge_flag and obs_merge_flag both set to 1

MODE Tool: Output


As mentioned on the previous page, the output of MODE typically consists of 4 files: 2 ASCII statistics files, 1 NetCDF object file, and 1 PostScript summary plot. The output of any of these files may be disabled using the appropriate MODE command line argument. In this example, the output is written to the tutorial/out/mode directory as we requested on the command line.

The MODE output file naming convention is designed to contain the names of the forecast and observation fields and levels, the lead times, valid times, and accumulation times. If you rerun MODE on the same fields but with a slightly different configuration, the new output will overwrite the old output unless you redirect it to a different directory using the -outdir command line argument. The 4 MODE output files are described briefly below:

The PostScript file ends in .ps and was described on the previous page.
The NetCDF object file ends in _obj.nc and contains the raw and cluster object indices and boundary polylines for the simple objects.
The ASCII contingency table statistics file ends in _cts.txt.
The ASCII object statistics file ends in _obj.txt and contains all of the object and object comparison data.
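As a rough illustration of how the file names used in this tutorial are put together, the following Python sketch rebuilds the mode_240000L_20050808_000000V_120000A prefix from a lead time, valid time, and accumulation interval. The pattern is inferred only from the example file names on these pages, and the helper names are hypothetical; consult the MET Users Guide for the authoritative convention.

```python
from datetime import datetime, timedelta

def hhmmss(delta):
    """Format a timedelta as HHMMSS (hours may exceed 23)."""
    total = int(delta.total_seconds())
    return f"{total // 3600:02d}{total % 3600 // 60:02d}{total % 60:02d}"

def mode_prefix(lead, valid, accum):
    """Build the common prefix shared by the four MODE output files (inferred pattern)."""
    return f"mode_{hhmmss(lead)}L_{valid:%Y%m%d_%H%M%S}V_{hhmmss(accum)}A"

prefix = mode_prefix(
    lead=timedelta(hours=24),
    valid=datetime(2005, 8, 8, 0, 0, 0),
    accum=timedelta(hours=12),
)

# Suffixes as listed in this tutorial.
for suffix in (".ps", "_obj.nc", "_cts.txt", "_obj.txt"):
    print(prefix + suffix)
```

Because the prefix depends only on the times involved, two runs on the same data with different configurations produce identical file names, which is why a separate -outdir is needed to keep both sets of output.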

Since we've already seen the PostScript summary plot, we'll skip that one here. Use the ncview utility (if available on your machine) to view the NetCDF object output of MODE:
ncview tutorial/out/mode/mode_240000L_20050808_000000V_120000A_obj.nc&

Click through the variable names in the ncview window to see plots of the four object fields in the file. The fcst_obj_id and obs_obj_id contain the indices for the forecast and observation objects defined by MODE. The fcst_clus_id and obs_clus_id contain indices for the matched cluster objects. Now dump the header:
ncdump -h tutorial/out/mode/mode_240000L_20050808_000000V_120000A_obj.nc

View the NetCDF header to see how the file is structured.

The object colors plotted by ncview will generally not correspond to those in MODE's PostScript output. Next, open up the tutorial/out/mode/mode_240000L_20050808_000000V_120000A_cts.txt contingency table statistics ASCII file using the text editor of your choice. This file is similar to the CTS output of Grid-Stat but much less complete. It contains 4 lines, a header row followed by contingency table statistics computed 3 ways:

The first row contains RAW in the FIELD column. The scores listed in this row are computed from the RAW forecast and observation fields. The raw fields are thresholded using the fcst_conv_thresh and obs_conv_thresh values specified to create 0/1 mask fields. Those mask fields are compared point by point to compute a contingency table. The scores listed in this row are derived from that contingency table.
The second row contains FILTER in the FIELD column. The scores listed in this row are computed from the filtered forecast and observation fields. The filtered fields are just what's left of the raw fields after the fcst_raw_thresh and obs_raw_thresh values have been applied. Those filtered fields are then thresholded using the fcst_conv_thresh and obs_conv_thresh values specified to create a 0/1 mask field. The mask fields are compared point by point, a contingency table is computed, and the corresponding statistics are listed in this row. Since no filtering was performed in this example, the contents of this FILTER row match the contents of the RAW row.
The third row contains OBJECT in the FIELD column. The scores listed in this row are computed from the forecast and observation OBJECT fields. In MODE, after objects have been defined, the field may be thought of as a 0/1 mask field, 1 at grid points contained inside an object and 0 everywhere else. The object mask fields are compared in this way point by point, a contingency table is computed, and the corresponding statistics are listed in this row.
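The point-by-point comparison described for each row can be sketched in a few lines of Python. This is only an illustration with made-up 2 by 3 fields and hypothetical function names, assuming a >= threshold; it shows the general technique rather than MODE's own code, and computes three common scores (probability of detection, false alarm ratio, and critical success index).

```python
def contingency_table(fcst, obs, fcst_thresh, obs_thresh):
    """Threshold both fields into 0/1 masks and count the four outcomes."""
    hits = misses = false_alarms = correct_negatives = 0
    for frow, orow in zip(fcst, obs):
        for f, o in zip(frow, orow):
            f_event = f >= fcst_thresh
            o_event = o >= obs_thresh
            if f_event and o_event:
                hits += 1
            elif o_event:
                misses += 1
            elif f_event:
                false_alarms += 1
            else:
                correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives

def scores(hits, misses, false_alarms):
    """Standard categorical scores derived from the table counts."""
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    csi = hits / (hits + misses + false_alarms)
    return pod, far, csi

# Made-up fields on a tiny 2 x 3 grid.
fcst = [[0, 6, 7],
        [0, 0, 8]]
obs = [[0, 7, 0],
       [5, 0, 9]]

table = contingency_table(fcst, obs, fcst_thresh=5.0, obs_thresh=5.0)
print(table)               # (2, 1, 1, 2)
print(scores(*table[:3]))  # POD = 2/3, FAR = 1/3, CSI = 0.5
```

For the OBJECT row, the same comparison applies with the object masks in place of the thresholded raw fields.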

This file is not meant to replicate or replace the functionality of the Grid-Stat tool which includes many more features and options. It is simply meant to provide a convenient way of seeing how the output of MODE compares to the traditional contingency table statistics that are often computed. Close this file, and open up the tutorial/out/mode/mode_240000L_20050808_000000V_120000A_obj.txt object statistics ASCII file using the text editor of your choice. This file contains all of the object statistics in which most users will be interested. It contains 4 different line types which may be distinguished by the contents of the OBJECT_ID column:

The rows containing FNNN and ONNN in that column give information about the simple forecast and observation objects, respectively. NNN refers to the simple object number.
The rows containing FNNN_ONNN in that column give information about pairs of simple objects.
The rows containing CFNNN and CONNN in that column give information about the cluster forecast and observation objects, respectively. NNN refers to the cluster object number.
The rows containing CFNNN_CONNN in that column give information about pairs of cluster objects.

In the ASCII MODE statistics file, the value of 000 for NNN in the OBJECT_ID column indicates that that object was not matched. Each line in this file contains the same number of columns. However, only certain columns are applicable to certain line types. For example, the CENTROID_X and CENTROID_Y columns contain valid data for simple object lines, but not for pairs of simple object lines. The opposite is true for the CENTROID_DIST column which gives the distance between the centroids of two objects. Columns which are not applicable to a given line type are filled with a value of NA.
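When post-processing the object statistics file in a script, it can help to sort rows by line type first. The following Python sketch classifies an OBJECT_ID string using the four patterns described above; the function name and labels are hypothetical, and reading the file itself is left out because the column layout is documented in the MET Users Guide rather than here.

```python
import re

# OBJECT_ID patterns taken from the line-type descriptions in this tutorial.
LINE_TYPES = [
    ("simple forecast object", re.compile(r"F\d{3}")),
    ("simple observation object", re.compile(r"O\d{3}")),
    ("pair of simple objects", re.compile(r"F\d{3}_O\d{3}")),
    ("cluster forecast object", re.compile(r"CF\d{3}")),
    ("cluster observation object", re.compile(r"CO\d{3}")),
    ("pair of cluster objects", re.compile(r"CF\d{3}_CO\d{3}")),
]

def classify_object_id(object_id):
    """Return a label for the kind of row an OBJECT_ID value identifies."""
    for label, pattern in LINE_TYPES:
        if pattern.fullmatch(object_id):
            return label
    return "unknown"

for object_id in ("F004", "O011", "F004_O011", "CF002", "CO002", "CF002_CO002"):
    print(object_id, "->", classify_object_id(object_id))
```

Using fullmatch keeps a pair such as F004_O011 from being mistaken for a single simple object.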

Please refer to the MET Users Guide for a more thorough description of the MODE output. At this point, feel free to return to the previous page and play around with the MODE configuration settings.

Wavelet-Stat Tool: General


Wavelet-Stat Functionality
The Wavelet-Stat tool provides a wavelet-based intensity-scale decomposition for comparing gridded forecasts to gridded observations. Wavelet-Stat may be used in a generalized way to compare any two fields but has been most commonly applied to precipitation. The steps performed in Wavelet-Stat consist of:

Preprocess the input forecast and observation fields to select one or more tiles of dimension 2^n by 2^n. The wavelet decomposition may only be performed on such fields, known as dyadic.
Threshold the forecast and observation fields to create a 0/1 binary field.
Use a wavelet decomposition approach to decompose the thresholded forecast and observation tiles into separate scales.
Compare the forecast and observation tiles at each scale and compute statistics, such as the mean squared error and intensity skill score.
If multiple tiles were used, aggregate the results across all of the tiles and write out the aggregated statistics as well as the statistics for each tile.

Wavelet-Stat may be configured to use several different types of wavelet decompositions, all of those that are supported by the GNU Scientific Library. Here we'll use the Haar wavelet which is employed in the Intensity-Scale method by Casati et al. See the MET Users Guide for a more thorough description of how to configure the Wavelet-Stat tool.
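The idea behind a Haar scale decomposition can be illustrated with simple block averaging. The Python sketch below is not how Wavelet-Stat is implemented (it relies on the GNU Scientific Library); it is a minimal stand-in, with hypothetical function names and made-up 4 by 4 binary fields, in which each scale component is the difference between averages over successively larger 2^k by 2^k blocks. Because the components are mutually orthogonal, the mean squared error of the difference field splits exactly across the scales.

```python
def block_mean(field, size):
    """Replace every size-by-size block of a square field with the block mean."""
    n = len(field)
    out = [[0.0] * n for _ in range(n)]
    for by in range(0, n, size):
        for bx in range(0, n, size):
            cells = [(y, x) for y in range(by, by + size) for x in range(bx, bx + size)]
            mean = sum(field[y][x] for y, x in cells) / len(cells)
            for y, x in cells:
                out[y][x] = mean
    return out

def subtract(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def haar_scales(field):
    """Split a 2^n by 2^n field into n + 1 components that sum back to the field."""
    n = len(field)
    components = []
    previous = [[float(v) for v in row] for row in field]
    size = 2
    while size <= n:
        current = block_mean(field, size)
        components.append(subtract(previous, current))  # detail lost at this scale
        previous = current
        size *= 2
    components.append(previous)  # largest scale: the tile mean
    return components

def mean_square(field):
    n = len(field)
    return sum(v * v for row in field for v in row) / (n * n)

# Made-up thresholded (0/1) forecast and observation tiles.
fcst = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 1]]
obs = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 1]]

diff = subtract(fcst, obs)
scales = haar_scales(diff)
total_mse = mean_square(diff)
mse_by_scale = [mean_square(component) for component in scales]
print(len(scales), total_mse, sum(mse_by_scale))
```

The per-scale values sum to the mean squared error of the undecomposed binary difference field, which is the additive property the intensity-scale method depends on.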

Wavelet-Stat Usage
View the usage statement for Wavelet-Stat by simply typing the following:
bin/wavelet_stat

At a minimum, the input gridded fcst_file, the input gridded obs_file, and the configuration config_file must be passed in on the command line. Just as with Grid-Stat, the forecast and observation fields must be interpolated to a common grid prior to running Wavelet-Stat. The copygb utility is recommended for regridding files in GRIB format.

Wavelet-Stat Tool: Configure


The behavior of Wavelet-Stat is controlled by the contents of the configuration file passed to it on the command line. The default Wavelet-Stat configuration file may be found in the METv4.0/data/config/WaveletStatConfig_default file. The configuration used by the test script may be found in the METv4.0/data/config/WaveletStatConfig_APCP_12 file. Prior to modifying the configuration file, users are advised to make a copy of the default:
cp data/config/WaveletStatConfig_default tutorial/config/WaveletStatConfig_tutorial

Open up the tutorial/config/WaveletStatConfig_tutorial file for editing with the text editor of your choice. The configuration items for Wavelet-Stat are used to specify how the intensity-scale verification approach is to be performed. In previous versions of MET, Wavelet-Stat was restricted to comparing variables of the same type. In METv4.0, this has been generalized for comparing two different types of variables, if desired. The configurable items include specifications for the following:

The fields and vertical level or accumulation interval to be compared.
An option to mask out a portion of the raw fields.
How one or more tiles of size 2^n by 2^n are extracted from the domain.
Which wavelet family and type are used.
Various plotting options.

While the Wavelet-Stat configuration file contains many options, beginning users will typically only need to modify a few of them. You may find a complete description of the configurable items in the MET Users Guide or in the comments of the configuration file itself. Please take some time to review them. For this tutorial, we'll configure Wavelet-Stat to verify the same 12-hour accumulated precipitation output of PCP-Combine that we used for Grid-Stat. Edit the tutorial/config/WaveletStatConfig_tutorial file as follows:

In the fcst dictionary, set
field = [ { name = "APCP_12"; level = "(*,*)"; cat_thresh = [ >0 ]; } ];
to verify the NetCDF variable of that name and threshold any non-zero precipitation.
Set grid_decomp_flag = TILE; to use the tile we'll manually define below.
Set the tile dictionary to
tile = { width = 64; location = [ { x_ll = 80; y_ll = 25; } ]; };
to define a single tile with a lower-left corner of (80, 25) in the grid and a dyadic width of 64 (2^6) grid boxes.

Save and close this file.

Wavelet-Stat Tool: Run


Next, run Wavelet-Stat on the command line using the following command:
bin/wavelet_stat \
tutorial/out/pcp_combine/sample_fcst_24L_2005080800V_12A.nc \
tutorial/out/pcp_combine/sample_obs_2005080800V_12A.nc \
tutorial/config/WaveletStatConfig_tutorial \
-outdir tutorial/out/wavelet_stat \
-v 2

Wavelet-Stat is now performing the verification task we requested in the configuration file. It should take several seconds to run. Generally, the Wavelet-Stat tool runs pretty quickly. When Wavelet-Stat is finished, it will have created 4 files: 2 ASCII statistics files, a NetCDF scale decomposition file, and a PostScript summary plot. Open up the PostScript summary plot using the PostScript viewer of your choice, gv, or Ghostview, for example:
gv tutorial/out/wavelet_stat/wavelet_stat_240000L_20050808_000000V.ps

This PostScript summary plot contains 5 pages. The first page summarizes the definition of the tile(s) in the domain. The remaining pages show the difference field (f-o) for each decomposed scale and the statistics for each scale. Now, let's modify the configuration file and rerun this case. Again, open up the tutorial/config/WaveletStatConfig_tutorial file and edit it as follows:

Set grid_decomp_flag = AUTO; To let the Wavelet-Stat tool automatically define the largest 2^n by 2^n tile that fits in the center of the domain.

Now, rerun the Wavelet-Stat command listed above, and when it is finished, reload the PostScript plot. On the first page of the PostScript plot, note the following:

A tile of dimension 128 by 128 was chosen in the center of the domain.
Since the dimension increased from 64 (= 2^6) to 128 (= 2^7), the number of scales has increased by 1.
The tile chosen includes a large amount of missing data in the observation field. You should try to avoid including missing data when running the Wavelet-Stat tool as it will cause misleading results. Missing data is replaced with a value of 0 for precipitation fields or the mean value of the valid data for non-precipitation fields.
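The automatic tile choice can be approximated with a short calculation. The Python sketch below picks the largest power-of-two width that fits in a grid and centers it; the grid dimensions (185 by 129), the centering rule, and the function name are all assumptions made for illustration, so the corner MET actually selects may differ.

```python
def largest_centered_dyadic_tile(nx, ny):
    """Return (width, x_ll, y_ll) for the largest 2^n tile centered in an nx-by-ny grid."""
    width = 1
    while width * 2 <= min(nx, ny):
        width *= 2
    x_ll = (nx - width) // 2  # assumed centering rule
    y_ll = (ny - width) // 2
    return width, x_ll, y_ll

# Hypothetical grid dimensions, used only for this example.
width, x_ll, y_ll = largest_centered_dyadic_tile(nx=185, ny=129)
exponent = width.bit_length() - 1  # n such that width == 2^n

print(width, x_ll, y_ll, exponent)  # 128 28 0 7
```

On such a grid, a 128 by 128 tile spans almost the full height of the domain, which is how a centered automatic tile can end up covering areas of missing data that a smaller hand-placed tile avoids.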

Close that PostScript file.

Wavelet-Stat Tool: Output


As mentioned on the previous page, the output of Wavelet-Stat typically consists of 4 files: 2 ASCII statistics files, 1 NetCDF scale decomposition file, and 1 PostScript summary plot. The output of these files may be disabled in the configuration file or using the appropriate command line argument. In this example, the output is written to the tutorial/out/wavelet_stat directory as we requested on the command line.

The Wavelet-Stat output file naming convention is similar to that of the Point-Stat and Grid-Stat tools. The 4 Wavelet-Stat output files are described briefly below:

The PostScript file was described on the previous page.
The NetCDF scale decomposition file contains the raw, thresholded, and decomposed fields for each variable, tile, and threshold used. Note that while the PostScript plot only shows the difference (f-o) fields, the NetCDF file contains the actual forecast, observation, and difference fields decomposed for each scale.
The ASCII ISC file contains the ISC line type with a header row for the column names.
The ASCII STAT file contains only the ISC line type. Currently, the Wavelet-Stat tool only creates one output line type, so the ISC file and the STAT file are almost identical. In future versions of MET, the Wavelet-Stat tool may be enhanced to produce additional line types.

Open up the tutorial/out/wavelet_stat/wavelet_stat_240000L_20050808_000000V_isc.txt ISC file using the text editor of your choice, and note the following:

The header columns are identical to the other ASCII output files from Point-Stat and Grid-Stat.
The LINE_TYPE column is set to ISC, indicating that the columns to follow contain information about the intensity-scale method.
This file contains 9 rows of data. The ISCALE column indicates the scale for that row. The row with ISCALE equal to 0 contains scores for the thresholded binary fields. The rows with ISCALE greater than 0 contain scores for the thresholded binary fields decomposed into separate scales.
Looking carefully you'll see that the columns for MSE, FENERGY2, and OENERGY2 are additive across the scales. The sum of these values in the lines where ISCALE is greater than 0 equals the values in the line where ISCALE equals 0.

Close this file and use the ncdump and ncview utilities (if available on your machine) to view the NetCDF output of Wavelet-Stat:
ncdump -h tutorial/out/wavelet_stat/wavelet_stat_240000L_20050808_000000V.nc
ncview tutorial/out/wavelet_stat/wavelet_stat_240000L_20050808_000000V.nc

When clicking through and displaying each variable, note that some have a dimension for scale. Click through the different scales to see the decompositions of those fields.