
The present method of evaluation allocates weights (or scores) to each sub-criterion for every academic. The scores of all the sub-criteria under a main criterion are then added to attain a score for that criterion. The final score for each academic is acquired by adding the scores of all six criteria.
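The aggregation described above can be sketched as a simple nested sum. The sub-criterion scores and criterion groupings below are hypothetical placeholders, not DUT's actual instrument:

```python
# Sketch of the current DUT aggregation: sub-criterion scores are summed
# per criterion, and the six criterion scores are summed into a final score.
# All names and numbers here are illustrative placeholders.

def final_score(scores_by_criterion):
    """scores_by_criterion maps criterion -> list of sub-criterion scores."""
    criterion_totals = {c: sum(subs) for c, subs in scores_by_criterion.items()}
    return criterion_totals, sum(criterion_totals.values())

# Hypothetical evaluation of one academic:
academic = {
    "Administration": [4, 3],
    "Teaching and Supervision": [5, 4, 4],
    "Research and Innovation": [3, 5],
    "Writing and Publication": [2, 3],
    "Consultancy": [4],
    "Services Rendered and Ext. Engagement": [3, 2],
}

totals, overall = final_score(academic)
print(totals["Teaching and Supervision"], overall)  # → 13 42
```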


This section compares the newly developed fuzzy-based system with the manual weighting system currently adopted at DUT. In order to achieve a valid comparison, the same decision-makers (𝐷1, 𝐷2, 𝐷3 and 𝐷4) were asked to evaluate the performance of the same three academics (𝐴1, 𝐴2 and 𝐴3) using the same criteria and sub-criteria mentioned in section 5.3 (that is, those used for fuzzy AHP). The following approach is adopted in this section:

• The evaluators were asked to rate the six criteria in terms of their importance intensity;

• A reliability score and an intra-class correlation index were calculated to determine whether the scoring patterns for the criteria are acceptable;

• The weights for the criteria were calculated;

• The 4 decision-makers were asked to evaluate the 3 academics using the manual weighting system presently in use at DUT;

• A reliability score and an intra-class correlation index were calculated to determine whether the scoring patterns of the evaluators regarding the evaluation of the 3 academics are acceptable;

• The decision-makers were asked to complete a short open-ended questionnaire in order to elicit their opinions on the inputs of the manual weighting system and the newly developed system. Their opinions are then analysed and discussed; and

• A comparison was made between the manual weighting system and the newly developed system using quantitative techniques, in terms of the objectives mentioned in section 5.4.

6.4.1 Rating the six criteria using absolute values

For fuzzy AHP, a comprehensive pair-wise comparison matrix was used to calculate the weights in order to determine the rankings of the six criteria. The use of such a matrix is not possible with the current method of evaluation at DUT, that is, the manual weighting system, and determining the rankings of the criteria in terms of “importance intensity” is not mandatory with the current evaluation system. However, the 4 evaluators were asked to rank the criteria so that a comparison could be made with the rankings of the criteria under fuzzy AHP. Rating the criteria was done using absolute values to indicate the weights of the criteria as determined by each evaluator. Since there are six criteria, there are six levels of “importance intensity”: the “most important” criterion is indicated by the value 1 and the “least important” criterion by the value 6. After the evaluators made their choices, the results indicated in Table 6-2 were attained.

6.4.2 Calculating the reliability scores for the ratings

Before analysing the absolute value method for rating the criteria, the reliability of the scoring patterns of the evaluators is determined. This is achieved by calculating the mean and standard deviation of the scores of the 4 evaluators for each criterion. The results of the reliability test for the data from Table 6-2 are indicated in Table 6-3.

Criteria                               D1  D2  D3  D4
Administration                          2   2   3   2
Teaching and Supervision                1   1   2   3
Research and Innovation                 4   3   1   1
Writing and Publication                 5   6   4   4
Consultancy                             3   4   5   6
Services Rendered and Ext. Engagement   6   5   6   5

Table 6-2: Ranking the criteria using absolute values

Descriptives

Criterion                              Mean    Std. Error  95% CI Lower  95% CI Upper  5% Trimmed Mean  Median  Variance  Std. Deviation
Administration                         2.2500  .25000       1.4544       3.0456        2.2222           2.0000   .250      .50000
Teaching and Supervision               1.7500  .47871        .2265       3.2735        1.7222           1.5000   .917      .95743
Research and Innovation                2.2500  .75000       -.1368       4.6368        2.2222           2.0000  2.250     1.50000
Writing and Publication                4.7500  .47871       3.2265       6.2735        4.7222           4.5000   .917      .95743
Consultancy                            4.5000  .64550       2.4457       6.5543        4.5000           4.5000  1.667     1.29099
Services Rendered and Ext. Engagement  5.5000  .28868       4.5813       6.4187        5.5000           5.5000   .333      .57735

Table 6-3: Reliability scores for the 4 evaluators

The results in Table 6-3 include a 95% confidence interval for each mean. Since the data sets are not normally distributed, Mann-Whitney tests were used to determine whether there were significant differences between the central values of the distributions. All of the p-values for the different combinations of raters per variable exceed the 0.05 level of significance, which indicates a high degree of acceptability and consistency in the scoring of the 4 evaluators.
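The descriptive figures in Table 6-3 can be reproduced from the Table 6-2 ratings with standard-library Python. The two-sided 95% t quantile for 3 degrees of freedom is hardcoded here (to three decimal places) rather than drawn from a statistics package, so the confidence bounds agree with SPSS only up to rounding:

```python
import statistics

# Ratings of each criterion by the four evaluators (Table 6-2).
ratings = {
    "Administration": [2, 2, 3, 2],
    "Teaching and Supervision": [1, 1, 2, 3],
    "Research and Innovation": [4, 3, 1, 1],
    "Writing and Publication": [5, 6, 4, 4],
    "Consultancy": [3, 4, 5, 6],
    "Services Rendered and Ext. Engagement": [6, 5, 6, 5],
}

T_975_DF3 = 3.182  # two-sided 95% t quantile for n - 1 = 3 degrees of freedom

def describe(xs):
    mean = statistics.mean(xs)
    sd = statistics.stdev(xs)          # sample standard deviation
    se = sd / len(xs) ** 0.5           # standard error of the mean
    return mean, sd, (mean - T_975_DF3 * se, mean + T_975_DF3 * se)

mean, sd, (lo, hi) = describe(ratings["Administration"])
print(round(mean, 4), round(sd, 5), round(lo, 4), round(hi, 4))
# → 2.25 0.5 1.4545 3.0455  (Table 6-3: 2.2500, .50000, 1.4544, 3.0456)
```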

6.4.3 Calculating the intra class correlation index

Intra class correlations are used when quantitative measurements are made on units that are organised into groups; they describe how strongly units in the same group resemble each other. In this case, the 4 decision-makers (evaluators) form such a group: all of them have many years of experience in conducting evaluations, so they resemble each other in terms of expertise. Intra class correlations are generally applied when an assessment of the consistency or reproducibility of quantitative measurements made by different observers measuring the same quantity is required. Refer to Annexure F for an explanation of how intra class correlations are calculated. The intra class correlation coefficient was calculated using SPSS version 21.0, and the results are indicated in Table 6-4.

Intra class Correlation Coefficient

                  Intra class      95% Confidence Interval   F Test with True Value 0
                  Correlation(b)   Lower Bound  Upper Bound  Value    df1  df2  Sig
Single Measures   .808(a)          .499         .966         17.812   5    15   .000
Average Measures  .944(c)          .799         .991         17.812   5    15   .000

Two-way mixed effects model where people effects are random and measures effects are fixed.
a. The estimator is the same, whether the interaction effect is present or not.
b. Type C intra class correlation coefficients using a consistency definition; the between-measure variance is excluded from the denominator variance.
c. This estimate is computed assuming the interaction effect is absent, because it is not estimable otherwise.

Table 6-4: Results of the intra class correlation coefficient

The results from Table 6-4 indicate that the single measures intra class correlation is high and significant (p < 0.05). This means that the evaluators rated the criteria in a similar manner.

Now that the reliability scores and the intra class correlation coefficients have indicated acceptable results, the weights of the six criteria can be computed.
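For a consistency-type ICC, the single- and average-measures coefficients in Table 6-4 follow directly from the reported F statistic and the number of raters (k = 4). The sketch below re-derives them from the SPSS output rather than from the raw ratings:

```python
# Consistency ICCs expressed in terms of the F ratio (F = MS_rows / MS_error)
# and the number of raters k. F and k below are the values behind Table 6-4.

def icc_single(F, k):
    # ICC(C,1): single-measures consistency coefficient
    return (F - 1) / (F + k - 1)

def icc_average(F, k):
    # ICC(C,k): average-measures consistency coefficient
    return (F - 1) / F

F, k = 17.812, 4
print(round(icc_single(F, k), 3), round(icc_average(F, k), 3))  # → 0.808 0.944
```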

6.4.4 Calculating the weights using absolute values

The geometric mean method was used in Saaty’s method to calculate the weights for conventional and fuzzy AHP. In order for the comparison to be valid, the geometric mean method is also used to calculate the weights for the manual evaluation system from Table 6-2. The calculation for the first criterion (Administration) is as follows: [∏_{i=1}^{k} x_i]^{1/k} = (2 × 2 × 3 × 2)^{1/4} = 24^{1/4} = 2.213.

The rest are determined by analogy and the weight for each criterion is indicated in Table 6-5.
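The remaining geometric-mean weights can be reproduced from the Table 6-2 ratings with a few lines of Python:

```python
import math

# Evaluator ratings per criterion (Table 6-2); each weight is the
# geometric mean of the four ratings, [prod x_i]^(1/k) with k = 4.
ratings = {
    "Administration": [2, 2, 3, 2],
    "Teaching and Supervision": [1, 1, 2, 3],
    "Research and Innovation": [4, 3, 1, 1],
    "Writing and Publication": [5, 6, 4, 4],
    "Consultancy": [3, 4, 5, 6],
    "Services Rendered and Ext. Engagement": [6, 5, 6, 5],
}

weights = {c: math.prod(xs) ** (1 / len(xs)) for c, xs in ratings.items()}
for c, w in weights.items():
    print(f"{c}: {w:.3f}")
# e.g. Administration comes to 2.213 and Consultancy to 4.356
```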

For fuzzy AHP, the calculation of criteria weights involving all four evaluators (group decision) is necessary. In other words, ranking the criteria in terms of ‘importance intensity’ by all members of the evaluation panel is required to determine the fuzzy performance matrix, which was used to address the objectives in section 5.4. For the manual weighting system, however, it is not mandatory that the four evaluators collectively rank the criteria, although some departments may wish to do so. The purpose of requesting the evaluators to rank the criteria (for the manual weighting system) was to enable a comparison between the criteria ranking of fuzzy AHP and that of the current manual system. Since the fuzzy weights in terms of the BNP values (Table 5-21) are normalised with values between 0 and 1, the absolute values in Table 6-5 were also converted into values between 0 and 1. The comparisons are indicated in Figure 6-3.

Criteria                               Weight
Administration                         2.213
Teaching and Supervision               1.565
Research and Innovation                1.861
Writing and Publication                4.680
Consultancy                            4.356
Services Rendered and Ext. Engagement  5.477

Table 6-5: Criteria weights using absolute values
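The conversion of the Table 6-5 weights into the [0, 1] range is not spelled out in the text; a simple sum-normalisation is one plausible reading, sketched below (the weights are the geometric means of the Table 6-2 ratings, rounded to three decimals):

```python
# Hypothetical sum-normalisation of the absolute-value weights so that they
# fall between 0 and 1, for comparison with the normalised fuzzy BNP weights.
# The exact normalisation used in the thesis is an assumption here.

weights = {
    "Administration": 2.213,
    "Teaching and Supervision": 1.565,
    "Research and Innovation": 1.861,
    "Writing and Publication": 4.680,
    "Consultancy": 4.356,
    "Services Rendered and Ext. Engagement": 5.477,
}

total = sum(weights.values())
normalised = {c: w / total for c, w in weights.items()}
print({c: round(v, 3) for c, v in normalised.items()})
```

Note that under this scheme the normalised values still follow the “smaller is more important” convention of the absolute ratings, whereas larger BNP values indicate more important criteria.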

6.4.5 A comparison of the criteria weights

In Figure 6-3, the largest BNP values indicate the most important criteria for fuzzy AHP, while the smallest absolute-value weights indicate the most important criteria for the manual system. For both the fuzzy and absolute-value weights, Research and Innovation, Administration and Teaching and Supervision rank among the most important, and in both cases Services Rendered and External Engagement is ranked the lowest. The discrepancy revolves around the order of the “importance intensity” of the first 3 criteria. This is due to the fuzziness in the evaluators’ decisions regarding Teaching and Supervision as well as Research and Innovation when absolute values were used to rate the six criteria. Although there are no major discrepancies in the overall results of the two methods, fuzzy AHP is more reliable (Lee, 2010), because the use of linguistic values limits or eradicates the uncertainty that evaluators experience when using absolute values to rate the criteria.

Although the reliability scores and the intra class correlation coefficients show good ratings, the evaluation system itself is not preferred when the results of the experiment, as well as the opinions of academic staff regarding the manual weighting system (as indicated in Figure 7-7), are taken into consideration. One can therefore conclude that the scoring patterns of evaluators have little or no bearing on the type of system that is used for the evaluation of academic staff.

[Bar chart comparing the normalised absolute-value weights with the normalised fuzzy (BNP) weights for the six criteria; vertical axis from 0.1 to 0.9.]

Figure 6-3: Comparing fuzzy weights with absolute value weights
