Experiences with Applied DSM: Protocol, Availability, Quality and Capacity Building

10.7 Assessing the Quality of Input Data Layers for DSM and of the Resulting Output Maps

training sites that are deemed to be representative of each class of interest, except that here, instead of using the training sites to develop rules, we use successive refinements of heuristic rules to classify entire areas that are then treated as a single large set of training data and are reviewed to see if the resulting patterns correspond with expert expectations.

Shary et al. (2002) and Sharaya and Shary (2004) describe examples of a comprehensive system of classification of surface curvatures based entirely on objective theoretical considerations of expected relationships between curvature classes and anticipated environmental conditions. This approach demonstrates that it is possible to impose a set of theoretical classification rules even without any local, empirical knowledge to guide definition of classes of interest. The resulting maps are anticipated to differentiate portions of the landscape that can be expected to exhibit significant differences in soil processes and in patterns of development of soil properties and soil classes. So, if local expert knowledge of actual patterns of soil distribution is weak or absent, it may still be possible to produce useful maps based on application of theoretical considerations only.

R.A. MacMillan

The quality of predictive maps can be considered to be a function of the accuracy and relevance of the input layers used to produce the maps, the effectiveness of the predictive procedures used to create the maps, and the thematic and positional accuracy of the final resulting maps.

10.7.1 Considerations of the Quality of Input Maps Used to Make Predictions

Consider first issues of quality of the predictor input maps. With respect to the primary inputs derived from a DEM, considerations of quality have more to do with the ability of the DEM to faithfully portray the location, shape, size and pattern of surface features of relevance to the predictions than with any measures of absolute elevation accuracy, such as the commonly used root mean square (RMS) error in DEM elevation values relative to measured elevations at specific locations. The DEM needs to faithfully render relative point-to-point relations that capture and portray surface form at a scale and resolution appropriate for describing the landform entities that are to be mapped. The degree to which a DEM faithfully portrays a surface is related to its horizontal and vertical resolution and accuracy. Shary et al. (2002) showed that local measures of surface gradient and curvature computed within a 3×3 window were strongly influenced by grid resolution, with slope tending towards zero for large grid spacing and towards extremely large values for very small grid spacing. Moran and Bui (2002) noted that local measures of surface form (slope, aspect, curvatures) computed within a 3×3 window became less meaningful and useful as predictors of soil classes or properties as grid dimensions increased, but that more regional measures of context or pattern (such as upslope catchment area) were less sensitive to grid resolution and were more reliable inputs for predictions of patterns of soil associations that used grids with larger horizontal dimensions.

Coarser resolution DEM data sets are therefore best used to compute measures of regional context and texture or variance within a neighbourhood analysis window and are less useful for computing local measures of surface form (slope, aspect, curvatures).
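The sensitivity of local slope to grid spacing can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synthetic surface (200 m hillslopes plus 20 m micro-relief), the function names, and the use of NumPy central differences as a simple stand-in for a 3×3 window slope estimate. It only shows the coarse-grid half of the effect reported by Shary et al. (2002); the inflation of slope at very small spacings depends on noise in real DEMs and is not modelled here.

```python
import numpy as np

def synthetic_surface(cell_size, extent=1000.0):
    """Hypothetical terrain: 200 m hillslopes plus 20 m micro-relief (metres)."""
    coords = np.arange(0.0, extent, cell_size)
    x, y = np.meshgrid(coords, coords)
    hills = 10.0 * np.sin(2 * np.pi * x / 200.0) * np.cos(2 * np.pi * y / 200.0)
    micro = 1.0 * np.sin(2 * np.pi * x / 20.0)
    return hills + micro

def mean_slope_deg(dem, cell_size):
    """Mean slope in degrees from central differences between neighbouring cells."""
    dz_dy, dz_dx = np.gradient(dem, cell_size)
    slope = np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))
    return float(slope.mean())

# Sample the same surface at three grid spacings and compare mean slope.
slopes = {cs: mean_slope_deg(synthetic_surface(cs), cs) for cs in (5.0, 25.0, 100.0)}
for cs, s in slopes.items():
    print(f"{cs:5.0f} m grid: mean slope {s:.2f} degrees")
```

In this contrived case the 100 m grid samples the 200 m hillslopes exactly at their zero crossings, so the computed slope collapses to essentially zero. Real terrain would degrade less abruptly, but in the same direction.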

Within the predictive mapping community, a consensus appears to have emerged that grids with a horizontal resolution of 5–10 m and a relative vertical accuracy of ±0.5 m or better capture and portray meso-scale terrain features at about the level of abstraction at which they are most commonly appreciated by human observers (note, however, that in Chapter 4 Howell et al. did not achieve significant improvements using a 5 m DEM compared to a 25 m DEM). Grids of coarser resolution do not appear to capture and describe the correct location of landform features at the scale of hillslopes or portions of hillslopes that are of most common interest for predictive mapping. Coarser grids (25–100 m) can be thought of as a kind of regular sampling frame or mesh that can provide some relevant information on the approximate vertical range in elevation within a local neighbourhood and on relative values for slope gradient and curvatures. These values will always be underestimates, because not all local terrain maxima and minima will be sampled, but the mesh will provide some relevant information on the size, scale and complexity of landform features that are at least 2 times the horizontal dimensions of the grid. Consequently, when using grids of coarser horizontal resolution (>10 m), DSM practitioners should explicitly acknowledge that any predictions or classifications that are based on analysis of local surface form or context are not likely to be spatially accurate at point locations. The locations of local rises or hollows portrayed by a DEM of 25 m grid dimensions or more are likely to be displaced relative to the actual locations of these features in the real world, and smaller local rises or hollows will be missed altogether. However, the 25 m DEM data can be useful in indicating the frequency of occurrence of features of a particular relative size and scale within any small area.

Evaluations of the quality of input data layers should focus on establishing the size, scale, shape and context of terrain features that can be accurately described by an input layer of a particular resolution. The features to be predicted by analysis of these input layers need to be conceptualized as having horizontal dimensions that are at least 2 times those of the grid postings. Any variation in the predicted classes or attributes that occurs over distances less than 2 times the grid interval cannot be spatially located with any accuracy and can only be described in general terms. Another quality consideration is that many existing coarser resolution DEM sources do not provide a consistent and faithful portrayal of the bare ground surface but instead portray a digital surface model (here DSM in the elevation-data sense, not digital soil mapping) that often describes the top of the forest canopy in densely forested areas, or the tops of man-made features in built-up areas.

Finally, many existing secondary source environmental maps are inadequate and need to be improved.

With respect to soil maps produced by predictive methods, quality can be defined as the ability of a map or product to correctly predict the characteristics of the landscape at particular points or within particular small areas (see also Section 1.5).

10.7.2 Considerations of Quality of Predictive Maps of Individual Soil Properties

The quality of maps that predict individual soil properties can be assessed by obtaining field observations or samples at randomly selected locations and computing RMS error between the predicted and observed values (see also Chapter 11). Uncertainty associated with the predictions of individual soil properties can be conveyed by preparing and presenting maps of the residuals arising from the predictive process. An alternative approach is to produce multiple realizations of each predictive map by varying the values of the input variables randomly within the range of expected accuracy of the input layers at point locations. The variation in predicted values observed in these multiple realizations can provide an illustration of the uncertainty of the predictions at any given location. Multiple realizations can also be achieved by using different predictive equations or techniques to produce each realization so as to illustrate the range of uncertainty in predictions arising from method error, as opposed to data error.
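Both ideas in this subsection, RMS error against field observations and multiple realizations from perturbed inputs, can be sketched in a few lines. This is an illustration under stated assumptions, not a procedure taken from the chapter: the predictive equation and its coefficients are hypothetical, the input layers (slope and a wetness index) are invented 2×2 grids, and input error is assumed to be Gaussian with a known standard deviation.

```python
import numpy as np

def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))

def predict_property(slope, wetness):
    """Hypothetical predictive equation; the coefficients are illustrative only."""
    return 20.0 - 0.5 * slope + 3.0 * wetness

def realizations(slope, wetness, slope_sd, wetness_sd, n=200, seed=0):
    """Stack of n predictions, each made from randomly perturbed input layers.

    Assumes independent Gaussian error in each input layer.
    """
    rng = np.random.default_rng(seed)
    shape = (n,) + np.shape(slope)
    noisy_slope = slope + rng.normal(0.0, slope_sd, size=shape)
    noisy_wetness = wetness + rng.normal(0.0, wetness_sd, size=shape)
    return predict_property(noisy_slope, noisy_wetness)

# Invented input layers on a 2 x 2 grid.
slope = np.array([[2.0, 5.0], [10.0, 20.0]])
wetness = np.array([[8.0, 6.0], [5.0, 3.0]])

stack = realizations(slope, wetness, slope_sd=1.0, wetness_sd=0.5)
uncertainty = stack.std(axis=0)  # per-cell spread across realizations
```

The `uncertainty` grid plays the role of the uncertainty map described above. Swapping `predict_property` for a different equation in each run would instead illustrate method error.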

10.7.3 Considerations of Quality of Predictive Maps of Discrete Soil Classes

The quality of maps that predict discrete soil classes can be assessed in several ways.

Kuhnert et al. (2005) and Wang et al. (2005) recognized a need to distinguish between positional errors and thematic errors in raster maps. Similarly, Walker (2003) assessed accuracy in terms of the ability to predict the correct classes at exact locations, termed locational ability, and the ability to predict proportions of classes within an area without consideration of location, termed predictive quality. The global accuracy or precision of a map is traditionally assessed by preparing a contingency table that compares predicted values at specific sample locations to observed values at those same locations. The degree of similarity (or error) is then assessed by computing a measure such as the Kappa statistic that corrects for agreement that may arise from chance (see Section 19.2).
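As a minimal sketch of the traditional assessment, the following builds a contingency table from paired class labels and computes Cohen's Kappa from it. The sample labels are invented, the function names are hypothetical, and classes are assumed to be coded as integers starting at 0.

```python
import numpy as np

def contingency_table(predicted, observed, n_classes):
    """Counts of sample sites by (predicted class, observed class)."""
    table = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(table, (np.asarray(predicted), np.asarray(observed)), 1)
    return table

def kappa(table):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    p_observed = np.trace(table) / n
    # Chance agreement from the products of row and column marginal totals.
    p_chance = np.sum(table.sum(axis=0) * table.sum(axis=1)) / n ** 2
    return float((p_observed - p_chance) / (1.0 - p_chance))

# Invented labels at eight sample sites, three soil classes.
predicted = [0, 0, 1, 1, 2, 2, 0, 1]
observed = [0, 0, 1, 2, 2, 2, 0, 0]

table = contingency_table(predicted, observed, n_classes=3)
k = kappa(table)
print(table)
print(f"Kappa = {k:.3f}")
```

Here six of the eight sites match exactly (75%), but Kappa is lower, about 0.63, because part of that agreement would be expected by chance given the class frequencies.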

A problem that has been recognized with Kappa and its modifications is that it is entirely based on cell-by-cell comparison statistics (Kuhnert et al., 2005). Maps that have a bias or have similar patterns that are slightly distorted or misregistered may not agree well. The textbook example of this is a comparison of two identical chess boards displaced by exactly one cell in one direction. A point-by-point comparison would conclude that the two maps had zero similarity whereas, in reality, the patterns they predict are identical but the locational accuracy is off by one cell. Increasingly, new methods of assessing accuracy have been proposed that can identify and quantify positional and thematic accuracy errors separately (Pontius, 2000, 2002).

The degree of positional error can be assessed by comparing proportions of classes estimated within search windows of ever-increasing dimensions with proportions of classes on a map taken to represent ground truth. Kuhnert et al. (2005) calculated an index based on the difference between the total numbers of cells in each category in each size of moving window and a reference map. If the proportions of each class in two windows were identical, the index was zero. The more mismatches identified, the larger the index. The window was systematically increased in size and new comparisons were made. As the window size grew, the granularity of the maps was blurred, and eventually they obtained a perfect fit, assuming that the numbers of cells in each category were the same over the whole area for both the predicted and ground truth maps.
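The chess board example and the window-based comparison can be reproduced with a short sketch. It is written in the spirit of the approach described above but is not the index of Kuhnert et al. (2005): as a simplifying assumption it uses non-overlapping square blocks rather than moving windows, and the mismatch measure (the fraction of cells in a block that would have to change class to equalize the counts) is a choice made here for illustration.

```python
import numpy as np

def cell_agreement(a, b):
    """Fraction of cells with exactly the same class in both maps."""
    return float(np.mean(a == b))

def proportion_mismatch(a, b, window, n_classes):
    """Mean class-proportion mismatch over non-overlapping window x window blocks.

    0 means every block holds identical class proportions in both maps;
    1 means no overlap in class composition at all.
    """
    rows = (a.shape[0] // window) * window
    cols = (a.shape[1] // window) * window
    total, blocks = 0.0, 0
    for r in range(0, rows, window):
        for c in range(0, cols, window):
            counts_a = np.bincount(a[r:r + window, c:c + window].ravel(),
                                   minlength=n_classes)
            counts_b = np.bincount(b[r:r + window, c:c + window].ravel(),
                                   minlength=n_classes)
            total += 0.5 * np.abs(counts_a - counts_b).sum() / window ** 2
            blocks += 1
    return total / blocks

# An 8 x 8 chess board and the same board displaced by one cell.
board = np.indices((8, 8)).sum(axis=0) % 2
shifted = np.roll(board, 1, axis=1)

print(cell_agreement(board, shifted))             # cell-by-cell agreement
print(proportion_mismatch(board, shifted, 1, 2))  # window of 1 cell
print(proportion_mismatch(board, shifted, 2, 2))  # window of 2 x 2 cells
```

Cell-by-cell agreement is zero and the mismatch at a window of one cell is 1, yet the mismatch drops to zero as soon as the window reaches 2×2 cells, which is the size at which the one-cell displacement no longer matters.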

The thematic accuracy of categorical soil maps is typically assessed in terms of exact categorical match between specific predicted classes and hard classes observed at reference locations. This approach ignores the fact that soil varies continuously across the landscape and that soils at a particular location may be more or less similar to the central concept used to define any mapped class. This is the underlying assumption of fuzzy methods of soil classification and should be familiar to DSM practitioners by now. Liem et al. (2005) and Metzler and Sadler (2005) have both suggested methods for computing relaxed measures of agreement between hard classes and reference classes by adopting a fuzzy matching definition for a crisp classification which allows for varying levels of set membership for multiple map categories. In this approach, fuzzy sets were used to represent the “relative strengths” of various membership categories for predicted classes relative to observed classes at a mapped pixel level (Metzler and Sadler, 2005).
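One simple way to relax exact matching is to score each predicted/observed pair by a class similarity value instead of a 0 or 1. The sketch below does this with a hypothetical similarity matrix; it is not the specific method of Liem et al. (2005) or Metzler and Sadler (2005), and the similarity values, sample labels and function names are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical similarity between three soil classes
# (1 = same central concept, 0 = entirely dissimilar).
SIMILARITY = np.array([
    [1.0, 0.6, 0.0],
    [0.6, 1.0, 0.3],
    [0.0, 0.3, 1.0],
])

def crisp_accuracy(predicted, observed):
    """Fraction of sites where predicted and observed classes match exactly."""
    return float(np.mean(np.asarray(predicted) == np.asarray(observed)))

def fuzzy_accuracy(predicted, observed, similarity=SIMILARITY):
    """Mean similarity between predicted and observed classes across sites."""
    scores = similarity[np.asarray(predicted), np.asarray(observed)]
    return float(np.mean(scores))

# Invented labels at five reference sites.
predicted = [0, 1, 1, 2, 0]
observed = [0, 0, 1, 1, 2]

crisp = crisp_accuracy(predicted, observed)
fuzzy = fuzzy_accuracy(predicted, observed)
print(f"crisp = {crisp:.2f}, fuzzy = {fuzzy:.2f}")
```

With these invented values the crisp accuracy is 0.40 while the fuzzy score is 0.58, because two of the three mismatches fall between classes judged to be partly similar. In practice the similarity matrix would have to come from expert judgement or from distances between class centroids.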

Many applications of soil maps, especially of smaller scale soil maps, are non site-specific. For these maps, estimates of the proportions of particular soils within small areas equivalent to delineations of some minimum size for which management decisions are to be taken may be all that is required. Exact thematic accuracy at exact point locations is not necessary for these maps to be useful. Consequently, tests of map accuracy that evaluate the degree of exact match between predicted class and observed class at exact locations are neither appropriate nor desirable. We know that hard classifications assigned to reference or test locations in the field by local experts have to reflect decisions about what class best describes each reference location. We know that different experts are highly likely to assign the same location to different classes if the soil at the location is ambiguous and not completely representative of a specific class. These potential differences in classification of reference or test locations by a local expert represent sampling error in evaluation data sets which is not taken into account by any measures of exact categorical match at exact locations. Similarly, consider that many of the coarser resolution DEM data sets used to produce predictive maps do not portray the exact location of terrain features of interest correctly but displace them or only partially capture them. It is clear that this inaccurate depiction of landform features leads to spatial displacement of predicted classes relative to their true locations. Comparisons of thematic map accuracy should therefore assess relative (fuzzy) degrees of thematic correspondence between predicted and observed classes. They should also assess the degree of correspondence of predictions of proportions of classes within areas of different size, as well as just exact correspondence at specific point locations.

In general terms then, assessment of the accuracy of classed soil maps needs to take into account the scale of mapping and the intended use of the maps. The quality of fine resolution (large scale) maps intended for use for site-specific operational management decisions may need to be evaluated for exact thematic match at exact point locations. The quality of coarser resolution (small scale) maps intended to support regional management or operational decision making for management areas of some minimum size that is larger than a single pixel may only need to be evaluated in terms of relative degree of predictions of proportions of classes within some minimum sized area of interest for management decisions. Lagacherie (Section 1.7.2) has recognized a need for improved and more formal protocols for assessing accuracy and error in predictive maps.

In the author’s experience, while the pixelated appearance of raster predictive soil maps has often been criticized by users more accustomed to traditional, cartographically precise, polygonal soil maps, the accuracy of raster predictive maps has uniformly proven to be equal to, or superior to, that provided by manually prepared vector maps for the same areas. Zhu (see http://solim.geography.wisc.edu/projects/index.htm) has cited values for exact spatial and categorical match predictive accuracy of 77–89% for soil maps produced using the SOLIM method. Moon (2005a,b) reported accuracies of estimates of proportions of classes within small areas of 66–71% for predictive maps that compared favourably to accuracies of 55–60% for manually prepared vector maps for the same areas. Explicit evaluation of the accuracy of predictive maps, using a method appropriate to the scale of the maps and their intended use, is likely to document that these maps are at least as accurate as conventionally produced manual soil maps and that they are usually more accurate (see also Sections 1.5 and 7.1).

10.8 Building Capacity for Routine Operational

Source: Digital Soil Mapping with Limited Data (pages 142-147)