2.2 Atmospheric Classification

2.2.2 Supervised Classification

that are not representative of individual circulation patterns (e.g. Huth et al., 2008).

A description of T-mode and S-mode PCA is given in Compagnucci & Richmond (2008). For T-mode PCA the data matrix $X_t$ is of order (t × n), where t is the length of the record and n is the size of the grid. The PCA algorithm solves the formulation $X_t = F A^{T}$, where F is a matrix of the principal components and A relates the components of F to the input variables in $X_t$; for example, A could be the correlation matrix.
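The factorisation $X_t = F A^{T}$ can be illustrated with a minimal sketch. It assumes NumPy and a singular value decomposition of the column-centred data, so A here holds orthonormal loading vectors rather than correlations; the function name `pca_decompose` is hypothetical and the sketch is not specific to either T-mode or S-mode.

```python
import numpy as np

def pca_decompose(X):
    """Return (F, A) such that the column-centred X equals F @ A.T.

    Assumption: X has one row per record and one column per variable.
    """
    Xc = X - X.mean(axis=0)                      # remove the column means
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    F = U * s                                    # principal component scores
    A = Vt.T                                     # loadings (orthonormal columns)
    return F, A
```

Scaling the columns of A by the singular values, or standardising X first, gives loadings that can be read as correlations; which convention a given study uses has to be checked against its own definitions.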

interval [0,1]. The value unity implies that y does belong to X, whereas 0 implies that y does not belong to X.

Therefore this approach can be used within pattern recognition analysis by incorporating a set of fuzzy rules. Bárdossy et al. (1995) define a fuzzy rule as a simple ‘if A then B’ statement. The object A is a fuzzy condition formulated with fuzzy sets and B is the consequence, which is also a fuzzy set. The fuzzy rule attaches a degree of truth to the membership of an object to a class, thus replacing binary logic. In terms of atmospheric classification the fuzzy rules represent the different CP classes. The rules contain spatial arrangements of fuzzy numbers that relate to the spatial arrangement of high and low pressures within the CP class.

The argument for the use of fuzzy logic as a tool in atmospheric classification is simple. Since CP realizations form part of a continuum, it follows that each realization belongs to some extent to all classes (Bárdossy et al., 1995; Huth et al., 2008). Therefore there is a degree of ambiguity when assigning CPs to classes, and fuzzy logic is an attempt to quantify this ambiguity. CPs are assigned to classes based on the degree to which they belong to a class, and a CP is assigned to the class with which it shares the highest degree of fit (DOF). However, Huth et al. (2008) argue that the manner in which CPs are assigned to classes can lead to erroneous classification.

Fuzzy Classification

In Bárdossy et al. (1995) CP classes were defined by spatially distributed fuzzy numbers. The fuzzy numbers attempt to represent different pressure states such that

1. very low values (−∞, −3, 0)T,

2. medium low values (−4, −0.85, 4)T,

3. medium high values (0.25, 0.85, 4)T,

4. very high values (0, 3, +∞)T and

5. any value (−∞, 0, +∞)T.

The fuzzy numbers (n = 1, ..., 5) represent different triangular membership functions.

The functions (denoted by subscript T) assign a membership grade (a value in [0,1]) to the pressure value at each grid point based on its relative strength of association with each fuzzy number. The triangular membership functions are defined as (Bárdossy et al., 2002)

$$
\mu_{(x,y,z)_T}(P) =
\begin{cases}
\dfrac{P - x}{y - x} & \text{if } x \le P \le y \\[1ex]
\dfrac{P - z}{y - z} & \text{if } y < P \le z \\[1ex]
0 & \text{else}
\end{cases}
\tag{2.19}
$$

where P is the pressure value at a location. For example, a very high pressure value that coincides with the location of fuzzy number 4 will be assigned a value of 1, whereas a very low value at the same location will be assigned 0. The fuzzy number n = 5 is used to describe values that have no effect on the CP and ensures spatially compact features.
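The membership function (2.19) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the names `triangular_membership` and `FUZZY_NUMBERS` are hypothetical, and an infinite end point is treated as the limit of the corresponding limb, which gives a grade of 1 on that side (consistent with the example above, where a very high value under fuzzy number 4 receives 1).

```python
import math

# Fuzzy numbers (x, y, z)_T as listed in the text.
FUZZY_NUMBERS = {
    1: (-math.inf, -3.0, 0.0),      # very low values
    2: (-4.0, -0.85, 4.0),          # medium low values
    3: (0.25, 0.85, 4.0),           # medium high values
    4: (0.0, 3.0, math.inf),        # very high values
    5: (-math.inf, 0.0, math.inf),  # any value
}

def triangular_membership(p, x, y, z):
    """Membership grade of pressure value p in the fuzzy number (x, y, z)_T."""
    if x <= p <= y:
        # Rising limb; assumption: an infinite left end gives grade 1.
        return 1.0 if math.isinf(x) else (p - x) / (y - x)
    if y < p <= z:
        # Falling limb; assumption: an infinite right end gives grade 1.
        return 1.0 if math.isinf(z) else (p - z) / (y - z)
    return 0.0
```

For instance, `triangular_membership(1.5, *FUZZY_NUMBERS[4])` lies halfway up the rising limb of the "very high" fuzzy number, while any value passed with `FUZZY_NUMBERS[5]` receives a grade of 1.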

Therefore each CP class c can be defined by the vector $V_c(i)$, $i \in I$, where i are the locations in grid I and the integers $V_c(i)$ are the fuzzy numbers described above.

The classification method can be summarised as follows (Bárdossy et al., 2015):

1. At each time realization t, the membership grades $\mu_{V_c(i)}(P(i,t))$ for the pressure values P(i, t) at each location i are calculated.

2. The membership grades are then combined into an overall DOF value such that

$$
DOF(c,t) = \prod_{n=1}^{4} \left( \frac{1}{N(c, i \mid V_c(i) = n)} \sum_{i=1}^{N(c, i \mid V_c(i) = n)} \mu_{V_c(i)}\big(P(i,t)\big)^{Q_n} \right)^{\frac{1}{Q_n}}
\tag{2.20}
$$

where $N(c, i \mid V_c(i) = n)$ is the number of locations in CP class c that carry fuzzy number n = 1, ..., 4. The power $Q_n$ reflects the relative importance of each fuzzy number.
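The two steps above can be sketched as follows. The code is a minimal illustration, not the published implementation: `membership`, `degree_of_fit` and `classify` are hypothetical names, infinite end points are treated as giving a grade of 1, and a fuzzy number that does not occur in a class is assumed to contribute a neutral factor of 1 to the product.

```python
import math

FUZZY_NUMBERS = {
    1: (-math.inf, -3.0, 0.0),
    2: (-4.0, -0.85, 4.0),
    3: (0.25, 0.85, 4.0),
    4: (0.0, 3.0, math.inf),
}

def membership(p, x, y, z):
    """Triangular membership grade; infinite ends are assumed to give 1."""
    if x <= p <= y:
        return 1.0 if math.isinf(x) else (p - x) / (y - x)
    if y < p <= z:
        return 1.0 if math.isinf(z) else (p - z) / (y - z)
    return 0.0

def degree_of_fit(rule, pressures, powers):
    """DOF of one pressure field for one class.

    rule      : fuzzy number V_c(i) at each location i (5 means "any value")
    pressures : pressure value P(i, t) at each location i
    powers    : mapping n -> Q_n for n = 1..4
    """
    dof = 1.0
    for n in (1, 2, 3, 4):
        grades = [membership(p, *FUZZY_NUMBERS[n])
                  for v, p in zip(rule, pressures) if v == n]
        if not grades:
            continue  # assumption: absent fuzzy numbers do not affect the DOF
        q = powers[n]
        dof *= (sum(g ** q for g in grades) / len(grades)) ** (1.0 / q)
    return dof

def classify(rules, pressures, powers):
    """Return the label of the class with the highest DOF."""
    return max(rules, key=lambda c: degree_of_fit(rules[c], pressures, powers))
```

Locations carrying fuzzy number 5 are skipped because the product in (2.20) runs over n = 1, ..., 4 only, which matches the statement that such values have no effect on the CP.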

The CP classes are derived through an optimization technique that is based on a variable of interest. The variable of interest is incorporated within a set of objective functions that relate to particular properties of the variable the user is interested in.

For example the following objective functions were used in Bárdossy et al. (2002) to link CPs with rainfall:

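$$
O_1(\theta) = \sum_{i=1}^{S} \sqrt{ \frac{1}{T} \sum_{t=1}^{T} \left( p(CP(t))_i - p_i \right)^2 },
\tag{2.21}
$$

and

$$
O_2 = \sum_{i=1}^{S} \frac{1}{T} \sum_{t=1}^{T} \ln\left( \frac{z(CP(t))_i}{z_i} \right)
\tag{2.22}
$$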

where S is the number of rainfall stations and T is the total time period used for the classification. $p(CP(t))_i$ is the probability of the precipitation exceeding a threshold θ at station i for the CP occurring at time t, and $p_i$ is the probability of the precipitation exceeding threshold θ without classification. Similarly, $z(CP(t))_i$ is the mean precipitation at station i for the CP occurring at time t, and $z_i$ is the mean precipitation without classification.

Different thresholds θ were incorporated to delineate between CPs driving high and low rainfall events. The objective functions are linearly combined into an overall objective function $O = \alpha_1 O_1 + \ldots + \alpha_m O_m$, where the weights $(\alpha_1, \ldots, \alpha_m)$ reflect the relative importance of each objective function (Bárdossy, 2010).
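As an illustration of how such an objective could be evaluated, the sketch below computes $O_1(\theta)$ from a sequence of CP labels and a station rainfall record, and forms the weighted sum of several objective values. It assumes NumPy, estimates the probabilities as empirical exceedance frequencies, and uses the hypothetical names `objective_o1` and `combined_objective`.

```python
import numpy as np

def objective_o1(cp, rain, theta):
    """O1(theta) for CP labels cp (length T) and rainfall rain (shape T x S)."""
    cp = np.asarray(cp)
    rain = np.asarray(rain, dtype=float)
    wet = rain > theta                      # exceedance indicator per day and station
    p_all = wet.mean(axis=0)                # p_i: frequency without classification
    p_cond = np.empty(rain.shape)
    for c in np.unique(cp):
        days = cp == c
        # p(CP(t))_i: exceedance frequency on days sharing the CP of day t
        p_cond[days] = wet[days].mean(axis=0)
    rms = np.sqrt(((p_cond - p_all) ** 2).mean(axis=0))
    return float(rms.sum())                 # sum over the S stations

def combined_objective(values, weights):
    """Weighted sum alpha_1 * O_1 + ... + alpha_m * O_m."""
    return float(np.dot(weights, values))
```

A classification whose classes all share the unconditional exceedance frequency yields $O_1 = 0$, so larger values indicate classes that separate wet and dry conditions more clearly.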

The aforementioned objective functions are specific to precipitation. However, any variable of interest can be incorporated, and the objective remains the same: to derive a set of CP classes with statistics of the variable of interest that differ significantly from those in the unclassified case (Bárdossy, 2010). Given the nature of the classification scheme, the set of all possible CP combinations is very large (Bárdossy, 2010). There are 5 fuzzy numbers for each CP class that can be combined in any manner within the specified region I for a specified number of classes. Therefore a simulated annealing technique after Aarts & Korst (1989) is utilized as an optimization procedure.
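A generic simulated annealing loop over the class definitions might look like the sketch below. It is not the procedure of Aarts & Korst (1989) or Bárdossy (2010) in detail: the move (changing one fuzzy number at one random location), the geometric cooling schedule, the default parameters and the name `anneal` are all assumptions made for illustration, and the objective is assumed to be maximised.

```python
import math
import random

def anneal(rules, objective, n_steps=2000, t_start=1.0, cooling=0.995, seed=0):
    """Search for class rules that maximise objective(rules).

    rules     : list of classes, each a list of fuzzy numbers (1..5) per location
    objective : callable taking such a list and returning a float
    """
    rng = random.Random(seed)
    current = [list(r) for r in rules]           # work on a copy of the input
    current_score = objective(current)
    best, best_score = [list(r) for r in current], current_score
    temperature = t_start
    for _ in range(n_steps):
        c = rng.randrange(len(current))          # pick a class
        i = rng.randrange(len(current[c]))       # pick a location
        old = current[c][i]
        current[c][i] = rng.choice([n for n in (1, 2, 3, 4, 5) if n != old])
        new_score = objective(current)
        delta = new_score - current_score
        # Accept improvements always, deteriorations with probability exp(delta / T).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current_score = new_score
            if new_score > best_score:
                best, best_score = [list(r) for r in current], new_score
        else:
            current[c][i] = old                  # reject: undo the move
        temperature *= cooling                   # geometric cooling
    return best, best_score
```

In practice `objective` would classify every day with the candidate rules and evaluate the combined objective O on the variable of interest; occasionally accepting worse candidates early on is what lets the search leave local optima in such a large discrete space.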