The present method of evaluation allocates a weight (or score) to each sub-criterion for every academic. The scores of all the sub-criteria under a main criterion are then summed to obtain a score for that criterion, and the final score for each academic is obtained by summing the scores of all six criteria.
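The two-level summation described above can be sketched as follows; the criterion names, sub-criterion names and scores here are purely illustrative, not DUT's actual evaluation form.

```python
# Illustrative sketch of the manual scoring scheme: sub-criterion scores
# are summed per criterion, and the criterion totals are summed to give
# the academic's final score. (All names and values are hypothetical.)
scores = {
    "Teaching and Supervision": {"lecturing": 8, "postgraduate supervision": 6},
    "Research and Innovation": {"funded projects": 7, "innovation outputs": 3},
}

criterion_totals = {c: sum(subs.values()) for c, subs in scores.items()}
final_score = sum(criterion_totals.values())
print(criterion_totals, final_score)  # per-criterion totals and 24
```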
This section compares the newly developed fuzzy-based system with the manual weighting system that is currently adopted at DUT. In order to achieve a valid comparison, the same decision-makers (𝐷1, 𝐷2, 𝐷3 and 𝐷4) were asked to evaluate the performance of the same three academics (𝐴1, 𝐴2 and 𝐴3) using the same criteria and sub-criteria mentioned in section 5.3 (that is, those that were used for fuzzy AHP). The following approach is adopted in this section:
The evaluators were asked to rate the six criteria in terms of their importance intensity;
A reliability score and an intra-class correlation coefficient were calculated to determine whether the scoring patterns for the criteria are acceptable;
The weights for the criteria were calculated;
The 4 decision-makers were asked to evaluate the 3 academics using the manual weighting system that is presently in use at DUT;
A reliability score and an intra-class correlation coefficient were calculated to determine whether the scoring patterns of the evaluators in the evaluation of the 3 academics are acceptable;
The decision-makers were asked to complete a short open-ended questionnaire to elicit their opinions on the inputs required by the manual weighting system and by the newly developed system. Their opinions are then analysed and discussed; and
A comparison was made between the manual weighting system and the newly developed system, using quantitative techniques, in terms of the objectives mentioned in section 5.4.
6.4.1 Rating the six criteria using absolute values
For fuzzy AHP, a comprehensive pair-wise comparison matrix was used to calculate the weights in order to determine the rankings of the six criteria. A comprehensive pair-wise comparison matrix cannot be used with the current method of evaluation at DUT, that is, the manual weighting system, and ranking the criteria in terms of “importance intensity” is not mandatory under the current evaluation system. However, the 4 evaluators were asked to rank the criteria so that a comparison could be made with the criteria rankings of the fuzzy AHP. The criteria were rated using absolute values to indicate the weight of each criterion as determined by each evaluator. Since there are six criteria, there are six levels of “importance intensity”: the “most important” criterion is indicated by the value 1 and the “least important” criterion by the value 6. After the evaluators made their choices, the results indicated in Table 6-2 were attained.
6.4.2 Calculating the reliability scores for the ratings
Before analysing the absolute value method for rating the criteria, the reliability of the scoring patterns of the evaluators is determined. This is achieved by calculating the mean and standard deviation of the scores of the 4 evaluators for each criterion. The results of the reliability test for the data from Table 6-2 are indicated in Table 6-3.
Criteria                                   D1   D2   D3   D4
Administration                              2    2    3    2
Teaching and Supervision                    1    1    2    3
Research and Innovation                     4    3    1    1
Writing and Publication                     5    6    4    4
Consultancy                                 3    4    5    6
Services Rendered and Ext. Engagement       6    5    6    5

Table 6-2: Ranking the criteria using absolute values

Criterion                                Mean     Std. Error   95% CI for Mean      5% Trimmed Mean   Median   Variance   Std. Deviation
Administration                           2.2500   .25000       (1.4544, 3.0456)    2.2222            2.0000   .250       .50000
Teaching and Supervision                 1.7500   .47871       (.2265, 3.2735)     1.7222            1.5000   .917       .95743
Research and Innovation                  2.2500   .75000       (-.1368, 4.6368)    2.2222            2.0000   2.250      1.50000
Writing and Publication                  4.7500   .47871       (3.2265, 6.2735)    4.7222            4.5000   .917       .95743
Consultancy                              4.5000   .64550       (2.4457, 6.5543)    4.5000            4.5000   1.667      1.29099
Services Rendered and Ext. Engagement    5.5000   .28868       (4.5813, 6.4187)    5.5000            5.5000   .333       .57735

Table 6-3: Reliability scores for the 4 evaluators

The results in Table 6-3 include a 95% confidence interval for each mean. Since the data sets are not normal, Mann-Whitney tests were used to determine whether there was a significant difference in the mean values (using the central value comparison of the distributions). All of the p-values for the different combinations of raters per variable are greater than the 0.05 level of significance. This indicates a high degree of acceptability and consistency in the scoring by the 4 evaluators.
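The reliability figures in Table 6-3 can be reproduced directly from the rankings in Table 6-2. The sketch below recomputes the mean, standard deviation and 95% confidence interval for the first two criteria; the two-tailed t critical value for df = 3 at the 0.05 level (3.1824) is assumed, since the exact value SPSS used is not shown.

```python
# Recompute selected Table 6-3 descriptives from the Table 6-2 rankings.
from statistics import mean, stdev

rankings = {
    "Administration": [2, 2, 3, 2],
    "Teaching and Supervision": [1, 1, 2, 3],
}
T_CRIT = 3.1824  # two-tailed t critical value, df = 3, alpha = 0.05

for criterion, x in rankings.items():
    m = mean(x)
    sd = stdev(x)                 # sample standard deviation
    se = sd / len(x) ** 0.5       # standard error of the mean
    ci = (m - T_CRIT * se, m + T_CRIT * se)
    print(f"{criterion}: mean={m:.4f}, sd={sd:.5f}, "
          f"95% CI=({ci[0]:.4f}, {ci[1]:.4f})")
```

Running this reproduces the Administration row of Table 6-3: mean 2.2500, standard deviation .50000 and a 95% confidence interval of (1.4544, 3.0456).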
6.4.3 Calculating the intra-class correlation coefficient
Intra-class correlations are used when quantitative measurements are made on units that are organised into groups; they describe how strongly units in the same group resemble each other. In this case the group consists of the 4 decision-makers (evaluators), all of whom have many years of experience in conducting evaluations and therefore resemble each other in terms of expertise. Intra-class correlations are generally applied when an assessment of the consistency or reproducibility of quantitative measurements made by different observers measuring the same quantity is required. Refer to Annexure F for an explanation of how intra-class correlations are calculated. The intra-class correlation coefficient was calculated using SPSS version 21.0. The results are indicated in Table 6-4.
Intra-class Correlation Coefficient

                    Intra-class       95% Confidence Interval      F Test with True Value 0
                    Correlation(b)    Lower Bound   Upper Bound    Value     df1   df2   Sig.
Single Measures     .808(a)           .499          .966           17.812    5     15    .000
Average Measures    .944(c)           .799          .991           17.812    5     15    .000

Two-way mixed effects model where people effects are random and measures effects are fixed.
a. The estimator is the same, whether the interaction effect is present or not.
b. Type C intra-class correlation coefficients using a consistency definition; the between-measure variance is excluded from the denominator variance.
c. This estimate is computed assuming the interaction effect is absent, because it is not estimable otherwise.

Table 6-4: Results of the intra class correlation coefficient

The results from Table 6-4 indicate that the single measures intra-class correlation is high and significant (p < 0.05). This means that the evaluators rated the criteria in a similar manner. Now that the reliability scores and the intra-class correlation coefficient have indicated acceptable results, the weights of the six criteria can be computed.
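Under the consistency (Type C) definition used in Table 6-4, the single- and average-measures coefficients follow directly from the reported F statistic and the number of raters; the sketch below reproduces the .808 and .944 values from F = 17.812 and k = 4.

```python
# Relationship between the consistency ICC and the F statistic:
#   ICC(single)  = (MSR - MSE) / (MSR + (k - 1) * MSE) = (F - 1) / (F - 1 + k)
#   ICC(average) = (MSR - MSE) / MSR                   = (F - 1) / F
k = 4          # number of raters (decision-makers)
F = 17.812     # F statistic reported by SPSS in Table 6-4

icc_single = (F - 1) / (F - 1 + k)
icc_average = (F - 1) / F
print(round(icc_single, 3), round(icc_average, 3))  # 0.808 0.944
```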
6.4.4 Calculating the weights using absolute values
The geometric mean method was used in Saaty’s method to calculate the weights for conventional and fuzzy AHP. In order for the results to be valid, the geometric mean method is also used to calculate the weights for the manual evaluation system from Table 6-2. The calculation for the first criterion (Administration) is as follows: [∏ᵢ₌₁ᵏ xᵢ]^(1/k) = (2 × 2 × 3 × 2)^(1/4) = 2.213.
The rest are determined by analogy and the weight for each criterion is indicated in Table 6-5.
For fuzzy AHP, calculating the criteria weights with all four evaluators (a group decision) is necessary. In other words, ranking the criteria in terms of “importance intensity” with all members of the evaluation panel is required to determine the fuzzy performance matrix, which was used to address the objectives in section 5.4. For the manual weighting system, however, it is not mandatory that the four evaluators collectively rank the criteria, although some departments may choose to rank the six criteria. The purpose of requesting the evaluators to rank the criteria (for the manual weighting system) was to enable a comparison between the criteria ranking of fuzzy AHP and that of the current manual system. Since the fuzzy weights in terms of the BNP values (Table 5-21) are normalised to values between 0 and 1, the absolute values in Table 6-5 were also converted into values between 0 and 1. The comparisons are indicated in Figure 6-3.
Criteria                                   Weight
Administration                             2.213
Teaching and Supervision                   1.565
Research and Innovation                    1.861
Writing and Publication                    4.680
Consultancy                                4.356
Services Rendered and Ext. Engagement      5.477

Table 6-5: Criteria weights using absolute values
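The geometric-mean weights above can be recomputed from the Table 6-2 rankings. The sketch below also applies one plausible 0-1 normalisation (dividing each weight by the total); the exact conversion used for Figure 6-3 is not specified in the text, so this normalisation is an assumption.

```python
# Geometric-mean weight per criterion from the four evaluators' rankings
# (Table 6-2), then one plausible scaling to the 0-1 range.
rankings = {
    "Administration": [2, 2, 3, 2],
    "Teaching and Supervision": [1, 1, 2, 3],
    "Research and Innovation": [4, 3, 1, 1],
    "Writing and Publication": [5, 6, 4, 4],
    "Consultancy": [3, 4, 5, 6],
    "Services Rendered and Ext. Engagement": [6, 5, 6, 5],
}

def geometric_mean(xs):
    product = 1.0
    for x in xs:
        product *= x
    return product ** (1 / len(xs))

weights = {c: geometric_mean(r) for c, r in rankings.items()}
total = sum(weights.values())
normalised = {c: w / total for c, w in weights.items()}  # assumed scaling
print(round(weights["Administration"], 3))  # 2.213
```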
6.4.5 A comparison of the criteria weights
In Figure 6-3, the largest BNP values indicate the most important criteria for fuzzy AHP, while the smallest absolute value weights indicate the most important criteria for the manual system. For both the fuzzy and the absolute value weights, Research and Innovation, Administration and Teaching and Supervision rank among the most important, and in both cases Services Rendered and External Engagement is ranked the lowest. The discrepancy concerns the order of “importance intensity” of the first 3 criteria. This is due to the fuzziness in the evaluators’ decisions regarding Teaching and Supervision as well as Research and Innovation when absolute values were used to rate the six criteria. Although there are no major discrepancies in the overall results for the two methods, fuzzy AHP is more reliable (Lee, 2010), because the use of linguistic values limits or eradicates the uncertainty that evaluators experience when using absolute values to rate the criteria.
Although the reliability scores and the intra-class correlation coefficient show good ratings, the evaluation system itself is not preferred when the results of the experiment, as well as the opinions of academic staff regarding the manual weighting system (as indicated in Figure 7-7), are taken into consideration. One can therefore conclude that the scoring patterns of evaluators have little or no bearing on the type of system that is used for the evaluation of academic staff.
[Figure: bar chart comparing the normalised absolute weights with the fuzzy (BNP) weights for each criterion, plotted on a 0-1 scale.]
Figure 6-3: Comparing fuzzy weights with absolute value weights