
6.1 Discussion

This thesis highlights the creative aptitude assessment process for Design-based educational institutes. Specifically, the assessment is examined from two major perspectives: 1) identifying creative questions that instigate creative responses from students, and 2) evaluating the responses to these questions, which appear in multiple patterns of creative responses. This investigation is directed towards systematically identifying the variables of creative questions and evaluating creative responses. Previous studies pointed out the stress factors of pedagogues due to heavy workloads in institutions (Boyle et al., 1995; Naghieh et al., 2015; Prilleltensky et al., 2016; Sanetti et al., 2020; Skaalvik & Skaalvik, 2016, 2017), but hardly any reported literature was found that addressed the frustration of pedagogues generated by large-scale evaluation of creative responses, or the issues of consistency and error during creative question formulation and solution evaluation. This research addressed this gap by developing a digitized system that attempts to automate the assessment process, specifically for large-scale Design entrance examinations.

The proposed systems were validated by comparing their outcomes with human evaluation scores. The reliability estimate between the examiners and the proposed model in identifying creative questions was considerably high (α = 0.96). The MAE for the descriptive, labelled image-based, and annotated image-based patterns of creative responses is 0.085, 0.015, and 0.009, respectively.

These experimental results indicate a negligible difference between the evaluations produced by the proposed systems and the corresponding human evaluation scores.
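For readers who want to reproduce these two figures of merit, the sketch below shows one way to compute them. It is a minimal illustration with hypothetical scores; it does not reproduce the thesis data, and it assumes the raters-as-items formulation of Cronbach's alpha.

```python
import numpy as np

def mean_absolute_error(human, model):
    """MAE between expert scores and model scores for the same responses."""
    human, model = np.asarray(human, float), np.asarray(model, float)
    return float(np.mean(np.abs(human - model)))

def cronbach_alpha(ratings):
    """Cronbach's alpha for a (raters x responses) matrix of scores."""
    ratings = np.asarray(ratings, float)
    k = ratings.shape[0]                          # number of raters
    rater_vars = ratings.var(axis=1, ddof=1)      # variance per rater
    total_var = ratings.sum(axis=0).var(ddof=1)   # variance of summed scores
    return k / (k - 1) * (1 - rater_vars.sum() / total_var)

# Hypothetical scores purely for illustration, not the thesis data.
human = [0.8, 0.6, 0.9, 0.4]
model = [0.7, 0.6, 0.8, 0.5]
print(mean_absolute_error(human, model))   # 0.075
print(cronbach_alpha([human, model]))
```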

Assessment conducted in a real-time environment encounters multiple exceptions, which were identified and handled by the proposed models. For example, relevance between a question and a response is a significant factor in evaluating creative aptitude. In the proposed model, the relevance score between question and response was obtained using a cosine similarity function. However, in some situations a student might intentionally reproduce the question itself and present it as a response. In that case, the relevance score would be extraordinarily high, leading to miscalculation in the computational model. Therefore, rules are essential to verify a very high relevance score: an overwhelming score triggers a comparison of string lengths between the question and its corresponding response, and pattern matching of the strings further supports the verification. This case is illustrated in Figure 6.1, where a question-response pair receives an extraordinary score. A response passed through a rule base comprising string-length measurement and pattern matching may therefore have its relevance score revised.

Figure 6.1: Handling faulty relevance score

Another case was studied, in which a student might submit just a line or a joke as the response to a given question. In that case, embeddings of the question and its corresponding response were matched using a cosine similarity function (Fauzi et al., 2017), which generated a relevance score within the range of 0 to 1. Any unrelated line or joke for a given question generated a value close to 0 and lower than the threshold; responses scoring close to 0 could therefore be considered irrelevant. Cases of these kinds were undertaken to handle exceptions in the proposed models.
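As a concrete illustration of the two exception rules above, the following is a minimal sketch under stated assumptions: TF-IDF vectors stand in for the embeddings used in the thesis, difflib stands in for the pattern-matching step, and the `high` and `low` thresholds are illustrative values rather than the thesis's own.

```python
# A minimal sketch of the relevance exception handling, not the
# thesis's implementation. TF-IDF vectors stand in for embeddings,
# and the thresholds (high=0.95, low=0.10) are assumed values.
from difflib import SequenceMatcher

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def relevance_score(question: str, response: str,
                    high: float = 0.95, low: float = 0.10) -> float:
    """Cosine relevance in [0, 1] with both exception rules applied."""
    vectors = TfidfVectorizer().fit_transform([question, response])
    score = float(cosine_similarity(vectors[0], vectors[1])[0, 0])

    # Rule 1: verify an extraordinarily high score. Closely matching
    # string lengths plus near-verbatim character overlap suggest the
    # student copied the question back as the response.
    if score >= high:
        length_ratio = (min(len(question), len(response))
                        / max(len(question), len(response)))
        overlap = SequenceMatcher(None, question.lower(),
                                  response.lower()).ratio()
        if length_ratio > 0.9 and overlap > 0.9:
            return 0.0  # verbatim reproduction is treated as irrelevant

    # Rule 2: an unrelated line or joke yields a score close to 0,
    # below the threshold, and is flagged as irrelevant.
    return score if score >= low else 0.0

print(relevance_score("Design a chair for astronauts.",
                      "Design a chair for astronauts."))  # -> 0.0
```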

Should educational practitioners wish to replicate this study, the first and foremost criterion is to identify the domain-specific features of subjective evaluation of creative aptitude. A mixed-method approach is essential to capture the evaluation process of the particular field of interest. Any new domain-specific feature for assessing questions or answers needs to be included in the model, and an appropriate scoring function must be defined according to the scoring mechanism followed in the examination. The major points to be considered when replicating the entire study are:

i. These studies are specifically meant for identifying questions that have the potential to instigate creative responses from students, and for the subjective evaluation of novelty in creative responses.

ii. A human-centred design approach is essential to investigate the identification of creative questions and the evaluation of their creative responses.

iii. A scientific and systematic study is essential to establish the features used to identify creative questions and to evaluate creative responses.

iv. The evaluation mechanism must be investigated for any new domain-specific feature.

v. Decision-making is required to formulate a scoring function to measure the intended score.

vi. The proposed model must be updated to incorporate new characteristics.

vii. It is essential to validate the proposed model to establish the trust of pedagogues.

6.1.1 Key findings of the thesis

The salient findings of the thesis work described in different chapters are listed as follows.

i. Presently, subjective evaluation of creative aptitude in India is based on pen-and-paper techniques. This manual evaluation process, conducted on a large scale, leads to inconsistencies and errors in assessment. This thesis therefore focused on identifying features of evaluation to automate the assessment process.

ii. A creative question has the potential to instigate creative responses. While framing creative questions, examiners often self-evaluate, compare, and contrast their ideas before finally phrasing the question. During this process, they remain inquisitive about whether the questions they frame are really creative; more precisely, Design pedagogues often critically examine their framed questions by asking, can these questions really capture creative responses? To avoid human bias, it is essential to identify features of questions that have the potential to instigate creative responses from students. Further, automating the identification of creative questions may support pedagogues in deciding whether a question requires reformulation.

iii. Twenty-two variables were systematically identified in creative questions that have the potential to instigate creative responses among students. These are 'question_verify_intent', 'question_communicational', 'question_expect_short_answer', 'question_seek_fact', 'question_novel_answer', 'question_interest_others', 'question_interest_self', 'question_multi_interpretation', 'question_verify', 'question_seek_opinion', 'question_choice_type', 'question_compare_type', 'question_consequence_action', 'question_definition', 'question_entity', 'question_instructions', 'question_procedure', 'question_seek_reason', 'question_spelling', 'question_well_written', 'question_subjectivity', and 'question_polarity'. These features supported the selection of creative questions from a pool of non-creative questions (a vectorization sketch using these features appears after this list).

iv. A computational model was designed and developed to distinguish creative questions from non-creative questions. This supports pedagogues in deciding whether a question conforms to the features of a creative question and whether reformulation is required.

v. Identifying a question that has the potential to instigate creative responses among students is subjective and usually depends on experts' opinions. Therefore, the inter-rater reliability between the subjective judges and the outcome of the model was measured. The reliability estimate between the subjective judges and the proposed model was satisfactory (α = 0.96). This high alpha value indicates strong agreement between the Design experts and the model, which we believe would increase trust in the proposed model.

vi. While a creative question plays a major role in triggering creative responses, assessment of the responses plays the most important role in ensuring appropriate evaluation of creative aptitude. Novelty is an important parameter that Design pedagogues look for in these types of responses. Five dimensions were systematically identified that explain novelty in the descriptive pattern of creative responses: language processing, grammatical mistakes and misspellings in responses, relevance between questions and responses, coherence in responses, and relative uniqueness among responses. Though language processing revealed only minor deviation from the mean value, it was retained as a parameter for evaluating novelty (Schumann et al., 1996), as it supported pedagogues in perceiving novelty. These dimensions would support pedagogues in a consistent evaluation process during mass examinations (a scoring sketch combining these dimensions appears after this list).

vii. A computational model was designed and developed to evaluate novelty in the descriptive pattern of creative responses. This model would support pedagogues in measuring novelty consistently in mass examinations.

viii. Literature findings suggest seven dimensions for evaluating the image-based pattern of creative responses (Berbague et al., 2021; Camburn et al., 2020; Chaudhuri et al., 2020, 2021b; Demirkan & Afacan, 2012; Schumann et al., 1996; Takai et al., 2015), from which two parameters were systematically identified by Design pedagogues for specifically assessing the labelled image-based pattern of creative responses. These features support finding the relevance between a question and a labelled image-based response, and the uniqueness of a response.

ix. A computational model was designed and developed to evaluate novelty in the labelled image-based pattern of creative responses. This model would support pedagogues in measuring novelty consistently in mass examinations.

x. Literature findings suggest seven dimensions for evaluating the image-based pattern of creative responses (Berbague et al., 2021; Camburn et al., 2020; Chaudhuri et al., 2020, 2021b; Demirkan & Afacan, 2012; Schumann et al., 1996; Takai et al., 2015), while three parameters were systematically identified by Design pedagogues for specifically assessing the annotated image-based pattern of creative responses in mass examinations of Design education. These features support language processing, finding the relevance between a question and an annotated image-based response, and the uniqueness of a response.

xi. A computational model was designed and developed to evaluate novelty in the annotated image-based pattern of creative responses. This model would support pedagogues in measuring novelty consistently in mass examinations of Design education.

xii. The performance metrics of the proposed models for evaluating the descriptive, labelled image-based, and annotated image-based patterns of creative responses were measured. The MAE values of these models were satisfactory, as a negligible difference was found between the outcomes of the models and those of the experts.

xiii. The proposed model comprises multiple self-contained pre-defined models. A comparative study was conducted among the baseline models, and the best-performing model was selected and made part of the proposed model.
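As the forward reference in finding (iii) indicates, the twenty-two variables lend themselves to a compact vectorization sketch. Everything below beyond the feature names is an assumption for illustration: the dict-based ratings, the two invented training examples, and the logistic-regression classifier, which merely stands in for whichever baseline model performed best.

```python
# A hypothetical sketch of turning the 22 question variables into a
# feature vector for a creative/non-creative classifier. Only the
# feature names come from the thesis; the data and model are invented.
from sklearn.linear_model import LogisticRegression

FEATURES = [
    "question_verify_intent", "question_communicational",
    "question_expect_short_answer", "question_seek_fact",
    "question_novel_answer", "question_interest_others",
    "question_interest_self", "question_multi_interpretation",
    "question_verify", "question_seek_opinion", "question_choice_type",
    "question_compare_type", "question_consequence_action",
    "question_definition", "question_entity", "question_instructions",
    "question_procedure", "question_seek_reason", "question_spelling",
    "question_well_written", "question_subjectivity", "question_polarity",
]

def to_vector(ratings: dict) -> list:
    """Order a question's feature ratings into a fixed-length vector."""
    return [ratings.get(name, 0) for name in FEATURES]

# Two invented rated questions, purely to make the sketch runnable:
# label 1 = judged creative, label 0 = judged non-creative.
rated = [
    ({"question_novel_answer": 1, "question_multi_interpretation": 1,
      "question_interest_self": 1}, 1),
    ({"question_seek_fact": 1, "question_expect_short_answer": 1}, 0),
]
X = [to_vector(ratings) for ratings, _ in rated]
y = [label for _, label in rated]
classifier = LogisticRegression(max_iter=1000).fit(X, y)
```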
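Likewise, the scoring sketch promised in finding (vi) shows one way the five novelty dimensions could be combined once each has been normalised to [0, 1]. The weights are purely hypothetical placeholders; the thesis defines its own scoring functions for each dimension.

```python
# A hypothetical weighted combination of the five novelty dimensions
# from finding (vi). The weights are placeholders, not thesis values.
def novelty_score(language_processing: float, grammar_spelling: float,
                  relevance: float, coherence: float,
                  relative_uniqueness: float,
                  weights=(0.10, 0.15, 0.25, 0.20, 0.30)) -> float:
    """Weighted sum of the five dimensions, each assumed in [0, 1]."""
    components = (language_processing, grammar_spelling, relevance,
                  coherence, relative_uniqueness)
    return sum(w * c for w, c in zip(weights, components))

print(novelty_score(0.8, 0.9, 0.7, 0.6, 0.5))  # hypothetical response
```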