
Using Predefined Array Scanning Orders with gawk

From the document GAWK: Effective AWK Programming (pages 196-199)

8.1 The Basics of Arrays

8.1.6 Using Predefined Array Scanning Orders with gawk

This subsection describes a feature that is specific to gawk.

By default, when a for loop traverses an array, the order is undefined, meaning that the awk implementation determines the order in which the array is traversed. This order is usually based on the internal implementation of arrays and will vary from one version of awk to the next.

Often, though, you may wish to do something simple, such as “traverse the array by comparing the indices in ascending order,” or “traverse the array by comparing the values in descending order.” gawk provides two mechanisms that give you this control:

• Set PROCINFO["sorted_in"] to one of a set of predefined values. We describe this now.

• Set PROCINFO["sorted_in"] to the name of a user-defined function to use for comparison of array elements. This advanced feature is described later in Section 12.2 [Controlling Array Traversal and Array Sorting], page 318.

The following special values for PROCINFO["sorted_in"] are available:

"@unsorted"

Array elements are processed in arbitrary order, which is the default awk behavior.

"@ind_str_asc"

Order by indices in ascending order compared as strings; this is the most basic sort. (Internally, array indices are always strings, so with ‘a[2*5] = 1’ the index is "10" rather than numeric 10.)

"@ind_num_asc"

Order by indices in ascending order but force them to be treated as numbers in the process. Any index with a non-numeric value will end up positioned as if it were zero.

"@val_type_asc"

Order by element values in ascending order (rather than by indices). Ordering is by the type assigned to the element (see Section 6.3.2 [Variable Typing and Comparison Expressions], page 128). All numeric values come before all string values, which in turn come before all subarrays. (Subarrays have not been described yet; see Section 8.6 [Arrays of Arrays], page 183.)

If you choose to use this feature in traversing FUNCTAB (see Section 7.5.2 [Built-in Variables That Convey Information], page 159), then the order is built-in functions first (see Section 9.1 [Built-in Functions], page 187), then user-defined functions (see Section 9.2 [User-Defined Functions], page 214) next, and finally functions loaded from an extension (see Chapter 17 [Writing Extensions for gawk], page 381).

"@val_str_asc"

Order by element values in ascending order (rather than by indices). Scalar values are compared as strings. If the string values are identical, the index string values are compared instead. When comparing non-scalar values, "@val_type_asc" sort ordering is used, so subarrays, if present, come out last.

"@val_num_asc"

Order by element values in ascending order (rather than by indices). Scalar values are compared as numbers. Non-scalar values are compared using "@val_type_asc" sort ordering, so subarrays, if present, come out last. When numeric values are equal, the string values are used to provide an ordering: this guarantees consistent results across different versions of the C qsort() function (see footnote 2), which gawk uses internally to perform the sorting. If the string values are also identical, the index string values are compared instead.

Footnote 2: When two elements compare as equal, the C qsort() function does not guarantee that they will maintain their original relative order after sorting. Using the string value to provide a unique ordering when the numeric values are equal ensures that gawk behaves consistently across different environments.

"@ind_str_desc"

Like "@ind_str_asc", but the string indices are ordered from high to low.

"@ind_num_desc"

Like "@ind_num_asc", but the numeric indices are ordered from high to low.

"@val_type_desc"

Like "@val_type_asc", but the element values, based on type, are ordered from high to low. Subarrays, if present, come out first.

"@val_str_desc"

Like "@val_str_asc", but the element values, treated as strings, are ordered from high to low. If the string values are identical, the index string values are compared instead. When comparing non-scalar values, "@val_type_desc" sort ordering is used, so subarrays, if present, come out first.

"@val_num_desc"

Like "@val_num_asc", but the element values, treated as numbers, are ordered from high to low. If the numeric values are equal, the string values are compared instead. If they are also identical, the index string values are compared instead. Non-scalar values are compared using "@val_type_desc" sort ordering, so subarrays, if present, come out first.

The array traversal order is determined before the for loop starts to run. Changing PROCINFO["sorted_in"] in the loop body does not affect the loop. For example:

$ gawk '
> BEGIN {
>    a[4] = 4
>    a[3] = 3
>    for (i in a)
>        print i, a[i]
> }'
-| 4 4
-| 3 3
$ gawk '
> BEGIN {
>    PROCINFO["sorted_in"] = "@ind_str_asc"
>    a[4] = 4
>    a[3] = 3
>    for (i in a)
>        print i, a[i]
> }'
-| 3 3
-| 4 4

When sorting an array by element values, if a value happens to be a subarray then it is considered to be greater than any string or numeric value, regardless of what the subarray itself contains, and all subarrays are treated as being equal to each other. Their order relative to each other is determined by their index strings.

Here are some additional things to bear in mind about sorted array traversal:

• The value of PROCINFO["sorted_in"] is global. That is, it affects all array traversal for loops. If you need to change it within your own code, you should see if it’s defined and save and restore the value:

...
if ("sorted_in" in PROCINFO)
    save_sorted = PROCINFO["sorted_in"]    # remember the current setting
PROCINFO["sorted_in"] = "@val_str_desc"    # or whatever
...
if (save_sorted)
    PROCINFO["sorted_in"] = save_sorted

• As already mentioned, the default array traversal order is represented by "@unsorted". You can also get the default behavior by assigning the null string to PROCINFO["sorted_in"] or by just deleting the "sorted_in" element from the PROCINFO array with the delete statement. (The delete statement hasn’t been described yet; see Section 8.4 [The delete Statement], page 180.)

In addition, gawk provides built-in functions for sorting arrays; see Section 12.2.2 [Sorting Array Values and Indices with gawk], page 322.
