
DATA STRUCTURES AND SCALING

From the document Geospatial monitoring and modeling system (pages 45-51)

EXERCISE 1-8 DATA STRUCTURES AND SCALING 44

"WESTLUSE.RST." This is the actual data file for this raster image, which has an ".rst" file extension.

Now change the filter to show all files. Go to the input box below the Files pane and select the pull-down menu. Select the All Files (*.*) option. Now locate WESTLUSE.RST again. Notice, however, that a second file with an ".rdc" extension is also shown. The ".rdc" file is its accompanying metadata file. The term metadata means "data about data," i.e., documentation (which explains the "rdc" extension: it stands for "raster documentation"). The data shown in the Metadata pane come from the ".rdc" files. Vector files also have a documentation file, ".vdc."

Change the filter back again to the default listing. You can do this from the pull-down menu.

D

Now with WESTLUSE highlighted, right-click and choose the Show Structure option. This shows the actual data values behind the upper left-most portion (8 columns and 16 rows) of the raster image. Each of these numbers represents a land use type, and is symbolized by the corresponding palette entry. For example, cells with a number 3 indicate forested land and are symbolized with the third color in the WESTLUSE palette. Use the arrow keys to move around the image. Then close the Show Structure dialog.

E

Make sure that the WESTLUSE raster layer is still highlighted in TerrSet Explorer, and view its metadata which will show us the contents of the "WESTLUSE.RDC" file. This file contains the fundamental information that allows the file to be displayed as a raster image and to be registered with other map data.

The file type is specified as binary, meaning that numeric values are stored in standard IEEE base 2 format. The Show Structure utility in TerrSet Explorer allows us to view these values in the familiar base 10 numeric system. However, they are not directly accessible through other means such as a word processor. TerrSet also provides the ability to convert raster images to an ASCII format (see footnote 2), although this format is only used to facilitate import and export.

The data type is byte. This is a special sub-type of integer. Integer numbers have no fractional parts, increasing only by whole number steps.

The byte data type includes only the whole numbers from 0 to 255. In contrast, files designated as having an integer data type can contain any whole number from -32768 to +32767. The reason that both exist is that byte files require only one byte per cell whereas integer files require 2. Thus, if only a limited integer range is required (as in this case), use of the byte data type can halve the amount of computer storage space required. Raster files can also be stored as real numbers, as will be discussed below.
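The storage saving can be checked with a quick calculation. The sketch below uses Python's standard `struct` module to confirm the per-cell sizes quoted above. The 512 by 512 image size is an arbitrary assumption made for illustration, not a property of WESTLUSE.

```python
import struct

# Per-cell storage for the two integer raster types described above.
# "B" = unsigned 8-bit (byte), "h" = signed 16-bit (integer).
BYTES_PER_CELL = {
    "byte": struct.calcsize("<B"),     # values 0..255
    "integer": struct.calcsize("<h"),  # values -32768..32767
}

def raster_size_bytes(columns, rows, data_type):
    """Size of the cell data for a raster of the given shape and type."""
    return columns * rows * BYTES_PER_CELL[data_type]

# Hypothetical image dimensions, chosen only for illustration.
columns, rows = 512, 512
byte_size = raster_size_bytes(columns, rows, "byte")
integer_size = raster_size_bytes(columns, rows, "integer")
print(byte_size, integer_size)
```

For any image shape, the integer version comes out at exactly twice the size of the byte version.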

The columns and rows indicate the basic raster structure. Note that you cannot change this structure by simply changing these values. Entries in a documentation file simply describe what exists. Changing the structure of a file requires the use of special procedures (which are extensively provided within TerrSet). For example, to change the data type of a file from byte to integer, you would use the module CONVERT.
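To see why editing the column and row entries cannot restructure an image, consider the sketch below. It assumes (this is not stated in the text above) that cell values are stored as one flat, row-by-row sequence, and it uses made-up values rather than real WESTLUSE data. The function name `as_rows` is hypothetical.

```python
# Six byte-valued cells stored as one flat sequence (hypothetical values).
cells = bytes([1, 1, 3, 3, 5, 5])

def as_rows(data, columns, rows):
    """Interpret a flat cell sequence using the documented columns and rows."""
    if columns * rows != len(data):
        raise ValueError("documentation does not match the stored data")
    return [list(data[r * columns:(r + 1) * columns]) for r in range(rows)]

# Correct documentation: 3 columns by 2 rows.
print(as_rows(cells, columns=3, rows=2))   # [[1, 1, 3], [3, 5, 5]]

# Editing the documentation to 2 columns by 3 rows does not resample anything;
# the same stored values are simply cut into rows at different places.
print(as_rows(cells, columns=2, rows=3))   # [[1, 1], [3, 3], [5, 5]]
```

The stored values never change. Only their interpretation does, which is why a genuine change of structure has to go through a module that rewrites the data.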

2 ASCII is the American Standard Code for Information Interchange. It was one of the earliest coding standards for the digital representation of alphabetic characters, numerals and symbols. Each ASCII character takes one byte (8 bits) of memory. Recently, a new system has been introduced to cope with non-US alphabet systems such as Greek, Chinese and Arabic. This is called UNICODE and requires 2 bytes per character. TerrSet accepts UNICODE for its text layers since the software is used worldwide. However, the ASCII format is still very much in use as a means of storing single byte codes (such as Roman numerals), and is a subset of UNICODE.

There are seven fields related to the reference system to indicate where the image exists in space. The Georeferencing chapter in the TerrSet Manual gives extensive details on these entries. However, for now, simply recognize that the reference system is typically the name of a special reference system parameter file (called a REF file in TerrSet) that is stored in the GEOREF sub-folder of the TerrSet program directory. Reference units can be meters, feet, kilometers, miles, degrees or radians (abbreviated m, ft, km, mi, deg, rad). The unit distance multiplier is used to accommodate units of other types (e.g., minutes). Thus, if the units are one of the six recognized unit types, the unit distance will always be 1.0. With other types, the value will be other than 1. For example, units can be expressed in yards if one sets the units to feet and the unit distance to 3.
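As a small worked example of the unit distance multiplier, the sketch below converts a coordinate difference into a ground distance in meters. The function name and the table of conversion factors are illustrative assumptions, not TerrSet identifiers, and only the linear unit types are covered.

```python
# Length of one reference unit in meters (linear units only, for illustration).
METERS_PER_UNIT = {"m": 1.0, "ft": 0.3048, "km": 1000.0, "mi": 1609.344}

def ground_distance_m(coord_delta, ref_units, unit_dist=1.0):
    """Convert a coordinate difference to meters.

    unit_dist is how many reference units make up one coordinate unit,
    e.g. coordinates in yards are declared as ref_units="ft", unit_dist=3.
    """
    return coord_delta * unit_dist * METERS_PER_UNIT[ref_units]

# 100 yards, declared as feet with a unit distance of 3.
yards_in_meters = ground_distance_m(100, "ft", unit_dist=3.0)
print(round(yards_in_meters, 2))
```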

The positional error indicates how close the actual location of a feature is to its mapped position. This is often unknown and may be left blank or may read unknown. The resolution field indicates the size of each pixel (in X) in reference units. It may also be left blank or may read unknown. Both the positional error and resolution fields are informational only (i.e., are not used analytically).

The minimum and maximum value fields express the lowest and highest values that occur in any cell, while the display minimum and display maximum express the limits that are used for scaling (see below). Commonly, the display minimum and display maximum values are the same as the minimum and maximum values.

The value units field indicates the unit of measure used for the attributes, while the value error field indicates either an RMS value for quantitative data or a proportional error value for qualitative data. The value error field can also contain the name of an error map. Both fields may be left blank or read unknown. They are used analytically by only a few modules.

A data flag is any special value. Some TerrSet modules recognize the data flags background or missing data as indicating non-data.

F

Using WESTLUSE we see there are 13 legend categories. Either double-click in the Categories input box or select the ellipsis button to its right to show the legend categories. This Categories dialog box contains interpretations for each of the land use categories. Clearly it was this information that was used to construct the legend for this layer. You can now close the Categories dialog.

G

Now highlight the ETDEM raster layer in the Files tab of TerrSet Explorer and right-click to Show Structure. What you will initially see are the zeros which represent the background area. However, you may use the arrow keys to move farther to the right and down until you reach cells within Ethiopia. Notice how some of the cells contain fractional parts. Then exit from Show Structure and view this file's Metadata.

Notice that the data type of this image is real. Real numbers are numbers that may contain fractional parts. In TerrSet, raster images with real numbers are stored as single precision floating point numbers in standard IEEE format, requiring 4 bytes of storage for each number. They can contain cells with data values from -1 x 10^37 to +1 x 10^37 with up to 7 significant figures. In computer systems, such numbers may be expressed in general format (such as you saw in the Show Structure display) or in scientific format. In the latter case, for example, the number 1624000 would be expressed as 1.624e+006 (i.e., 1.624 x 10^6).
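The 4-byte size and the limit of roughly 7 significant figures can be observed directly. The sketch below round-trips a number through IEEE single precision using Python's standard `struct` module. The helper name and the test value are illustrative assumptions. Note that Python writes the exponent with two digits (e+06) rather than three.

```python
import struct

def as_single_precision(value):
    """Round-trip a Python float through 4-byte IEEE single precision."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

# Each real-valued cell occupies 4 bytes.
bytes_per_real = struct.calcsize("<f")
print(bytes_per_real)

# Digits beyond about the seventh significant figure are not preserved.
original = 1234.56789
stored = as_single_precision(original)
print(original, stored)

# The same number in general and in scientific notation.
general = str(1624000)
scientific = f"{1624000.0:.3e}"
print(general, scientific)
```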

Notice also that the minimum and maximum values range from 0 to 4267.

Now notice the number of legend categories. There is no legend stored for this image. This is logical. In these metadata files, legend entries are simply keys to the interpretation of specific data values, and typically only apply to qualitative data. In this case, any value represents an elevation.

H

Remove everything from the screen except your ETHIOPIA composition. Then use DISPLAY Launcher to display ETDEM, and for variety, use the TerrSet Default Quantitative palette and select 16 as the number of classes. Be sure that the legend option is selected and then click OK. Also, for variety, click the Transparency button on Composer (the one on the far right in Composer).

Notice that this is yet another form of legend.


What should be evident from this is that the manner in which TerrSet renders cell values as well as the nature of the legend depends on a combination of the data type and the number of classes.

When the data type is either byte or integer, and the layer contains only positive values from 0-255 (the range of permissible values for symbol codes), TerrSet will automatically interpret cell values as symbol codes. Thus, a cell value of 3 will be interpreted as palette color 3. In addition, if the metadata contains legend captions, it will display those captions.

If the data type is integer and contains values less than 0 or greater than 255, or if the data type is real, TerrSet will automatically assign cells to symbols using a feature known as autoscaling and it will automatically construct a legend.

Autoscaling divides the data range into as many categories as are included in the Autoscale Min to Autoscale Max range specified in the palette (commonly 0-255, yielding 256 categories). It then assigns cell values to palette colors using this relationship. Thus, for example, an image with values from 1000 to 3000 would assign the value 2000 to palette entry 128.
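The equal-interval relationship described above can be sketched in a few lines. TerrSet's exact rounding rule is not given here, so the version below is an assumption: it bins by truncation, which reproduces the example of 2000 mapping to palette entry 128. The function name is hypothetical.

```python
def autoscale_equal_intervals(value, data_min, data_max, n_colors=256):
    """Map a cell value to a palette index using equal-width intervals."""
    if data_max <= data_min:
        return 0
    fraction = (value - data_min) / (data_max - data_min)
    index = int(fraction * n_colors)
    # The maximum value itself would land one past the end, so clamp it.
    return max(0, min(n_colors - 1, index))

print(autoscale_equal_intervals(1000, 1000, 3000))  # 0
print(autoscale_equal_intervals(2000, 1000, 3000))  # 128
print(autoscale_equal_intervals(3000, 1000, 3000))  # 255
```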

The nature of the scaling and the legend created under autoscaling depends upon the number of classes chosen. In the User Preferences dialog under the File menu, there is an entry for the maximum number of displayable legend categories. By default, it is set at 16. Thus when the number of classes is 16 or less, TerrSet will display them as separate classes and construct a legend showing the range of values assigned to each class.

When there are more than 16 classes, the result depends on the data type. When the data contain real numbers or integers with values less than 0 or greater than 255, it will create a continuous legend with pointers to representative values (such as you see in the ETHIOPIA composition). For cases of positive integer values less than 256, it will use a third form of legend. To appreciate this, use DISPLAY Launcher to examine the SIERRA4 layer using the Greyscale palette. Be sure the legend option is on but that the autoscaling option is set to Off (Direct).

In this case, the image is not autoscaled (cell values all fall within a 0-255 range). However, the range of values for which legend captions are required exceeds the maximum set in User Preferences (see footnote 3), so TerrSet provides a scrollable legend. To understand this effect further, click on the Layer Properties button in Composer. Then, alternately set the autoscaling option to Equal Intervals and None (Direct). Notice how the legend changes.
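The rules for which of the three legend forms appears can be summarized as a small decision function. This is a simplified reading of the description above, not TerrSet's actual logic, and the function names and return strings are hypothetical.

```python
def needs_autoscaling(data_type, value_min, value_max):
    """Real data, or integers outside 0-255, cannot be used as symbol codes."""
    if data_type == "real":
        return True
    return value_min < 0 or value_max > 255

def legend_form(data_type, value_min, value_max, n_classes, max_categories=16):
    """Pick the legend form from the data type, value range and class count."""
    if n_classes <= max_categories:
        return "separate classes"
    if needs_autoscaling(data_type, value_min, value_max):
        return "continuous"
    return "scrollable"

# A real-valued layer with a 0-4267 range, as with ETDEM.
print(legend_form("real", 0, 4267, 16))    # separate classes
print(legend_form("real", 0, 4267, 256))   # continuous
# A byte layer within 0-255 (range assumed here for illustration).
print(legend_form("byte", 0, 255, 256))    # scrollable
```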

I

You will also notice that when the autoscaling is set to Equal Intervals, the contrast of the image is improved. The Display Min and Display Max sliders also become active when autoscaling is active. Set the autoscaling to Equal Intervals and then try sliding these with the mouse. They can also be moved with the keyboard arrow keys (hold down the shift key with the arrows for smaller increments).

Slide the Display Min slider to the far left. Then press the right arrow twice to move the Display Min to 26 (or close to it). Then move the Display Max slider to the far right, followed by three clicks of the left arrow to move the Display Max to 137. Notice the start and end legend categories on the display.

When the Display Min is increased from the actual minimum, all cell values lower than the Display Min are assigned the lowest palette entry (black in this case). Similarly, all cell values higher than the Display Max are assigned the highest palette entry (white in this case). This is a phenomenon called saturation. This can be very effective in improving the visual appearance of autoscaled images, particularly those with very skewed distributions.
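Saturation amounts to clamping values to the Display Min and Display Max before scaling. The sketch below uses the slider values of 26 and 137 from this step. The function name and the truncating rounding rule are assumptions made for illustration.

```python
def scale_with_saturation(value, display_min, display_max, n_colors=256):
    """Scale a value to a palette index, saturating outside the display range."""
    if value <= display_min:
        return 0                      # lowest palette entry (black in greyscale)
    if value >= display_max:
        return n_colors - 1           # highest palette entry (white in greyscale)
    fraction = (value - display_min) / (display_max - display_min)
    return min(n_colors - 1, int(fraction * n_colors))

for cell_value in (10, 26, 81.5, 137, 200):
    print(cell_value, scale_with_saturation(cell_value, 26, 137))
```

Every value at or below 26 collapses to entry 0 and every value at or above 137 collapses to entry 255, so the full palette is spent on the values in between.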

3 The number of displayable legend categories can be increased to a maximum of 48.


J

Use DISPLAY Launcher to display SIERRA2 with the Greyscale palette and without autoscaling. Clearly this image has very poor contrast. Create a histogram display of this image using HISTO from the Display menu (or its toolbar icon). Specify SIERRA2 as the image name and click OK, accepting all defaults.

Notice that the distribution is very skewed (the maximum extends to 96 despite the fact that very few pixels have values greater than 60). Given that the palette ranges from 0-255, the dark appearance of the image is not surprising. Virtually all values are less than 60 and are therefore displayed with the darkest quarter of palette colors.

If the Layer Properties dialog is not visible, be sure that SIERRA2 has focus and click Layer Properties again. Now set autoscaling to use Equal Intervals and click Apply. This provides a big improvement in contrast since the majority of cell values now cover half the color range (which is spread between the minimum of 23 and the maximum of 96). Now slide the Display Max slider to a value around 60. Notice the dramatic improvement! Click the Save button. This saves the new Display Min and Display Max values to the metadata file for that layer. Now whenever you display this image with equal intervals autoscaling, these enhanced settings will be used.

K

You will have noticed that there are two other options for autoscaling -- Quantiles and Standard Scores. Use DISPLAY Launcher to display SIERRA2 using the Greyscale palette and no autoscaling (i.e., Direct). Notice again how little contrast there is. Now go to Layer Properties and select the Quantiles option. Notice how the contrast sliders are now greyed out. Despite this, choose 16 classes and click Apply. As you can see, the Quantiles scheme does not need any contrast enhancement! It is designed to create the maximum degree of contrast possible by rank ordering pixel values and assigning equal numbers to each class.
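The rank-ordering idea behind Quantiles can be sketched as follows. This is a minimal illustration with made-up, skewed values, not TerrSet's implementation. In particular, this simple version can place tied values in different classes when a tie straddles a class boundary.

```python
def quantile_classes(values, n_classes):
    """Assign each value a class 0..n_classes-1 with near-equal counts."""
    # Indices of the values in ascending order of value (rank order).
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, i in enumerate(order):
        classes[i] = rank * n_classes // len(values)
    return classes

# Hypothetical skewed data: most values are low, one is very high.
pixels = [23, 24, 25, 25, 26, 27, 30, 96]
print(quantile_classes(pixels, 4))  # [0, 0, 1, 1, 2, 2, 3, 3]
```

Each class receives two pixels regardless of how unevenly the values are spread, which is why no further contrast adjustment is needed.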

Now use Layer Properties to select the Standard Scores autoscaling option using 6 classes. Click Apply. This scheme creates class boundaries based on standard scores. The first class includes all pixels more than 2 standard deviations below the mean. The next shows all cases between 1 and 2 standard deviations below the mean. The next shows cases from 1 standard deviation below the mean to the mean itself. Similarly, the next class shows cases from the mean to one standard deviation above the mean, and so on.

As with the other end, the last class shows all cases 2 or more standard deviations above the mean. For an appropriate palette, go to the Advanced Palette / Symbol Selection dialog. Choose a Quantitative data relationship and a Bipolar (Low-High-Low) color logic.

Select the third scheme from the top of the four offered, and then set the inflection point to be 37.12 (the mean). Then click on OK. Bipolar palettes seem to be composed of two different color groups -- in this case, the green and orange group, signifying values below and above the mean.
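The six standard-score classes described above can be written as a short classifier. The mean of 37.12 comes from this step. The standard deviation of 8.0 is a made-up value for illustration, and the function name is hypothetical.

```python
def standard_score_class(value, mean, std_dev):
    """Return a class 0..5 based on how many standard deviations from the mean."""
    z = (value - mean) / std_dev
    # Class boundaries in standard deviations; values below -2 fall in class 0
    # and values of +2 or more fall in class 5.
    boundaries = [-2.0, -1.0, 0.0, 1.0, 2.0]
    return sum(z >= b for b in boundaries)

MEAN = 37.12      # from the exercise
STD_DEV = 8.0     # hypothetical, for illustration only

for cell_value in (20, 30, 37.12, 50, 60):
    print(cell_value, standard_score_class(cell_value, MEAN, STD_DEV))
```

Classes 0 to 2 lie below the mean and classes 3 to 5 lie at or above it, which is the split that a bipolar palette with its inflection point at the mean is meant to show.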

L

Remove all images and dialogs from the screen and then display the color composite named SIERRA345. Then click on Layer Properties on Composer. Notice that three sets of sliders are provided—one for each primary color. Also notice that the Display Min and Max values for each are set to values other than the actual minimum and maximum for each band. This was caused by the saturation option in COMPOSITE. They have each been moved in so that 1% of the data values is saturated at each end of the scale for each primary.
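One way to find display limits that saturate a given percentage of values at each end is to sort the values and step in from both tails. How COMPOSITE actually computes its limits is not described here, so the sketch below is an assumption, and the function name is hypothetical.

```python
def saturation_limits(values, percent=1.0):
    """Return (display_min, display_max) saturating `percent` at each end."""
    ordered = sorted(values)
    n = len(ordered)
    # Number of cells to saturate at each tail, never crossing the middle.
    k = min(int(n * percent / 100.0), (n - 1) // 2)
    return ordered[k], ordered[n - 1 - k]

# Hypothetical band with 200 cells holding the values 0..199.
band = list(range(200))
display_min, display_max = saturation_limits(band, percent=1.0)
print(display_min, display_max)  # 2 197
```

With these limits, the two lowest and two highest values (1% of 200 at each end) fall outside the display range. In a color composite the same calculation would be repeated for each of the three bands.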

Experiment with moving the sliders. You probably won't be able to improve on what COMPOSITE calculated. Note also that you can revert to the original image characteristics by clicking either the Revert or Close buttons.

Scaling is a powerful visual tool. In this exercise, we have explored it only in the context of raster layers and palettes. However, the same logic applies to vector layers. Note that when we use the interactive scaling tools, we do not alter the actual data values of the layers. Only their appearance when displayed is changed. When we use these layers analytically the original values will be used (which is what we want).


We have reviewed the important display techniques in TerrSet. With Composer and DISPLAY Launcher you have limitless possibilities for visualizing your data. Note, however, that you can also use TerrSet Explorer to quickly display raster and vector layers. Unlike with DISPLAY Launcher, you will not have control over the initial display, but you can always use Composer to alter the display characteristics afterward.

Displaying files with TerrSet Explorer is meant as a quick look. Also, you can specify some initial parameters for the TerrSet Explorer display in User Preferences under the File menu.

To finish this exercise, we will use TerrSet Explorer a bit further to examine the structure of vector layers.

M

Open TerrSet Explorer and make sure the filter is set to display vector files (.vct). Then right-click the WESTROAD layer and choose Show Structure. As you can see, the output from this module is quite different for vector layers. Indeed, it will even differ between vector layer types.

The WESTROAD file contains a vector line layer. However, what you see here is not the actual way it is stored. Like all TerrSet data files, the true manner of storage is binary. To get a sense of this, close the Show Structure dialog and then right-click on WESTROAD to Show Binary. Clearly this is unintelligible. The Show Structure procedure for vector layers provides an interpreted format known as "Vector Export Format" (see footnote 4). That said, the logical correspondence between what is seen in Show Structure and what is contained in the binary file is very close. The binary version does not contain the interpretation strings on the left, and it encodes numbers in a standard IEEE binary format.

N

Remove any displays related to Show Structure or Show Binary. Then view the Metadata for WESTROAD. As you can see, there is a great deal of similarity between the metadata file structures for raster and vector. The primary difference is related to the data type field, which in this case reads ID type. Vector files always store coordinates as double precision real numbers.

However, the ID field can be either integer (see footnote 5) or real. When it contains a real number, it is assumed that it is a free-standing vector layer, not associated with a database. However, when it is an integer, the value may represent an ID that is linked to a data table, or it may be a free-standing layer. In the first case, the vector feature IDs would match a link field in a database that contains attributes relating to the features. In the second case, the vector feature IDs would be embedded integer attributes such as elevations or land use codes.

O

You may wish to explore some other vector files with the Show Structure option to see the differences in their structure. All are self-evident in their organization, with the exception of polygon files. To appreciate this, find the AWRAJAS2 vector layer in the Files list. Then right-click on Show Structure. The item that may be difficult to interpret is the Number of Parts. Most polygons will have only one part (the polygon itself). However, polygons that contain holes will have more than one part. For example, a polygon with two holes will list three parts—the main polygon, followed by the two holes.
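The Number of Parts entry can be pictured with a small data structure. The class below is a hypothetical illustration of the counting rule only. It does not reproduce the Vector Export Format layout, and the coordinates are made up.

```python
from dataclasses import dataclass, field

@dataclass
class Polygon:
    """A polygon feature: one outer boundary plus zero or more holes."""
    feature_id: int
    outer: list                                  # list of (x, y) vertices
    holes: list = field(default_factory=list)    # each hole is a vertex list

    @property
    def number_of_parts(self):
        # The main polygon counts as one part; each hole adds another.
        return 1 + len(self.holes)

simple = Polygon(1, outer=[(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)])
with_holes = Polygon(
    2,
    outer=[(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)],
    holes=[
        [(1, 1), (2, 1), (2, 2), (1, 1)],
        [(5, 5), (6, 5), (6, 6), (5, 5)],
    ],
)
print(simple.number_of_parts, with_holes.number_of_parts)  # 1 3
```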

4 A vector export format file has a ".vxp" extension and is editable. The CONVERT module can import and export these files. In addition, the content of Show Structure can be saved as a VXP file (simply click on the Save to File button). Furthermore, you can edit within the Show Structure dialog. If you edit a VXP file, be sure to re-import it under a new name using CONVERT. This way your original file will be left intact. The Help System has more details on this process.

5 The integer type is not further broken down into a byte subtype as it is with raster. In fact, the integer format used for vector files is technically a long integer, with a range within roughly +/- 2,000,000,000.
